key: cord-118553-ki6bbuod authors: piccolomini, elena loli; zama, fabiana title: preliminary analysis of covid-19 spread in italy with an adaptive seird model date: 2020-03-22 journal: nan doi: nan sha: doc_id: 118553 cord_uid: ki6bbuod in this paper we propose a susceptible-infected-exposed-recovered-dead (seird) differential model for the analysis and forecast of the covid-19 spread in some regions of italy, using the data from the italian protezione civile from february 24th 2020. in this study we investigate an adaptation of the model. since several restricting measures have been imposed by the italian government at different times, starting from march 8th 2020, we propose a modification of seird by introducing a time dependent transmitting rate. in the numerical results we report the maximum infection spread for the three italian regions first affected by the covid-19 outbreak (lombardia, veneto and emilia romagna). this approach will be successively extended to other italian regions, as soon as more data become available. the recent diffusion of the covid-19 coronavirus has renewed the interest of the scientific and political community in mathematical models for epidemics. many researchers are making efforts to propose new refined models to analyse the present situation and predict possible future scenarios. with this paper we hope to contribute to the ongoing research on this topic and to give a practical instrument for a deeper comprehension of the virus spreading features and behaviour. we consider here deterministic models based on systems of initial value problems of ordinary differential equations (odes). this theory has been studied for about a century, since w.o. kermack and a.g. mckendrick [1] proposed the basic susceptible-infected-removed (sir) model.
the sir model and its later modifications, such as susceptible-exposed-infected-removed (seir) [2], are commonly used by the epidemic medical community in the study of outbreak diffusion. in these models, the population is divided into groups. for example, the sir model groups are: susceptible, who can catch the disease; infected, who have the disease and can spread it; and removed, those who have either had the disease, or are recovered, immune or isolated until recovery. the seir model proposed by chowell et al. [3] also considers the exposed group, containing individuals who are in the incubation period. the evolution of the infected group depends on a key parameter, usually denoted as r0, representing the basic reproductive rate. the value of r0 can be inferred, for example, from epidemic studies or from statistical data in the literature, or it can be calibrated from the available data. in this paper we use the available data to determine the value of r0 that best fits the data. compared to previous outbreaks, such as sars-cov or mers-cov [4], when the disease was stopped after a relatively small number of infected people, we are now experiencing a different situation. indeed the number of infected people grows exponentially and, apparently, it can be stopped only by a complete lockdown of the affected areas, as evidenced by the covid-19 outbreak in the chinese city of wuhan in december 2019. analogously, in the italian case, in order to limit the virus diffusion over the whole italian territory, the government has imposed more and more severe restrictions since march 6th 2020. hopefully, these measures will affect the spread of the covid-19 virus, reducing the number of infected people and the value of the parameter r0. the introduction of different levels of lockdown requires an adaptation of the standard epidemic models to this new situation. some examples for the chinese outbreak can be found in [5, 4, 6].
concerning the italian situation, which is currently evolving, it is possible to model the introduction of restricting measures by introducing a non constant infection rate [7]. in this paper we propose to represent the infection rate as a piecewise function, which reflects the changes of external conditions. the parameter r0, which is proportional to the infection rate, becomes a time dependent parameter r_t which follows a different trend each time the external conditions change, depending on the particular situation occurring in that period. for example, if new restrictions are applied to the population movements at time t_1, we can hopefully expect that r_t starts to decrease when t > t_1. finally, since we believe that relevant information is given not only by the number of infected but also by the numbers of recovered and dead, we modified the seir model by splitting the removed population into recovered and dead. in section 2 we describe the details of the seird model with constant and with time dependent infection rate, seird(rm). finally, in section 3 we test the model on some regional aggregated data published by the protezione civile italiana [8]. the equations relating the groups in the sir model are the following:

ds/dt = -β s i / n,  di/dt = β s i / n - γ i,  dr/dt = γ i,

where n is the total population, β is the infection rate, a coefficient accounting for the rate at which susceptible people get infected by infectious people, and γ is the rate at which infectious people become resistant per unit time. a more refined model is the seir model, where a new compartment e, representing the exposed individuals that are in the incubation period, is added. the resulting equations in the seir model are the following:

ds/dt = -β s i / n,  de/dt = β s i / n - α e,  di/dt = α e - γ i,  dr/dt = γ i,

where α represents the incubation rate. the difference between the exposed (e) and infected (i) is that the former have contracted the disease but are not infectious, while the latter can spread the disease. seir has been used to model outbreaks such as ebola in congo and uganda [3]. in [2] the equations are modified by adding the quarantine and vaccination coefficients.
in our case, unfortunately, vaccination is not available. a further model, the sird, considering the group of dead (d) in place of the exposed, is analysed in [7] for the forecast of covid-19 spreading. in this paper we propose a seird model accounting for five different groups: susceptible, exposed, infected, recovered and dead. the system of equations is given by:

ds/dt = -β s i / n,  de/dt = β s i / n - α e,  di/dt = α e - (γ_r + γ_d) i,  dr/dt = γ_r i,  dd/dt = γ_d i.  (1)

in order to account for the restrictions imposed by the italian government, we let the infection rate depend on time. a time dependent infection rate for seir-type equations can be found in [7], where the function is assumed to have a decreasing exponential form. however, observing the data trend, we believe that β_t has a smoother decreasing behaviour, and we choose to model it as a decreasing rational function of time, governed by an exponent ρ (2). in the present work we use a constant value ρ = 0.75, but we might calibrate it in future. by substituting β(t) (2) in the s and e equations in (1) we obtain the seird rational model seird(rm). we calibrate the parameters of seird and seird(rm) by solving non-linear least squares problems with positivity constraints. for example, in the seird case the calibration acts on the vector of parameters q = (β, α, γ_r, γ_d) and the vector y of the acquired data at given times t_i, i = 1, ..., n. let f(u, q) be the function computing the numerical solution u of the differential system (1); the estimation of the parameter q is obtained by solving the following non-linear least squares problem:

min_q ‖f(u, q) - y‖²,  subject to q ≥ 0,

where we introduce positivity constraints on q. the constrained optimization problem is solved with a trust-region based method implemented in the lsqnonlin matlab function. for further details about the optimization problem for parameter identification in differential problems see for example [9]. in this section we report the results obtained by using the seird model to monitor the covid-19 outbreak in italy during the period 24/02/2020-20/03/2020.
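the calibration step described above (matlab's lsqnonlin, a trust-region solver with positivity constraints) can be sketched in python with scipy's trust-region reflective solver; the synthetic data, initial guesses and parameter values below are illustrative assumptions, not the paper's:

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

def seird_rhs(t, u, beta, alpha, gamma_r, gamma_d, N):
    # SEIRD system (1): susceptible, exposed, infected, recovered, dead
    S, E, I, R, D = u
    new_exposed = beta * S * I / N
    return [-new_exposed,
            new_exposed - alpha * E,
            alpha * E - (gamma_r + gamma_d) * I,
            gamma_r * I,
            gamma_d * I]

def simulate(q, t_eval, u0, N):
    sol = solve_ivp(seird_rhs, (t_eval[0], t_eval[-1]), u0,
                    t_eval=t_eval, args=(*q, N), rtol=1e-8)
    return sol.y                              # rows: S, E, I, R, D

def residuals(q, t_eval, y_data, u0, N):
    # misfit between the model and the observed I, R, D curves
    return (simulate(q, t_eval, u0, N)[2:] - y_data).ravel()

N = 1.0e7
u0 = [N - 20.0, 10.0, 10.0, 0.0, 0.0]
t = np.arange(0.0, 25.0)
q_true = (0.30, 1.0 / 3.0, 0.06, 0.04)        # beta, alpha, gamma_r, gamma_d
y_obs = simulate(q_true, t, u0, N)[2:]        # synthetic "measurements"

# trust-region method with positivity constraints, as lsqnonlin does
fit = least_squares(residuals, x0=(0.4, 0.4, 0.1, 0.1),
                    bounds=(0.0, np.inf), args=(t, y_obs, u0, N))
```

with noiseless synthetic data the solver recovers the generating parameters; on real data one would fit the reported infected-recovered-dead series instead.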
in this section we consider the seird model (1) and use the following different measurement subsets from [8] relative to the lombardia region: • s1, 10 days of measurements: 24/02/2020-04/03/2020 • s2, 18 days of measurements: 24/02/2020-12/03/2020 • s3, 23 days of measurements: 24/02/2020-17/03/2020 for the identification of the parameters, and then apply the identified parameters to model the infected-recovered-dead populations up to june 22nd (120 days). besides the population plots in figures 1, 2, we collect some meaningful quantitative information about the model parameters: • the maximum infected population is reached on april 9th 2020 for s1, april 23rd 2020 for s2 and april 28th 2020 for s3. • concerning the values of the model parameters, we have that the transmitting rate β can be estimated as β_e = 0.3, the incubation rate α is approximately α_e = 3, the recovery rate is approximately (γ_r)_e = 0.06 and the death rate is approximately (γ_d)_e = 0.04. the following conclusions can be drawn: • the seird parameters, computed from the first 10 days of measurements, do not model the measurements properly. indeed the plots reported in figure 1 show that the data of the ird populations have a slower increase rate, compared to the model predictions. • the plots in figure 5 reproduce the data quite accurately. the peak values reported in table 4 represent the largest percentage of infected and the longest time required to reach the peak. in order to improve the data fit we split the parameter identification step into the following two phase process: • phase 1: identification of the parameters of the standard seird model using the data subset s1 (i.e. up to time t_0 = 10). • phase 2: change of the reproduction parameter (r0) into a time dependent reproduction function, defined as follows. the plots of r_t for the three regions are reported in figure 6.
in table 6 we report the corresponding values for lombardia, emilia romagna and veneto. the improved modeling properties can be appreciated in the population plots reported in figures 7, 8 and 9. we observe that seird(rm) reproduces the data trends more precisely compared to the seird model. in this paper we proposed a seird model for the analysis and forecast of the covid-19 outbreak; the maximum infection spread is reached earlier in lombardia and emilia romagna (about june 20th), while veneto has its infection peak around the end of july 2020, probably due to different testing modalities. we highlight that it is only 12 days since restrictions started in italy and, maybe in the next few days, the effects of such measures will become more evident, hopefully causing a further decrease in the infection trend. in this case, the previsions shown in this paper should be updated by introducing a new time t_1 at which the decreasing slope of β_t should change, for example by estimating the parameter ρ in (2) with new data. the proposed model is flexible and we believe it could be easily adapted to monitor various infected areas with different restriction policies.
a contribution to the mathematical theory of epidemics
modeling the spread of ebola with seir and optimal control, in: siam undergraduate research online
the basic reproductive number of ebola and the effects of public health measures: the cases of congo and uganda
early dynamics of transmission and control of covid-19: a mathematical modelling study
nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study
prediction of new coronavirus infection based on a modified seir model.
analysis and forecast of covid-19 spreading
parameter estimation algorithms for kinetic modeling from noisy data, in: system modeling and optimization
substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov2)
key: cord-027316-echxuw74 authors: modarresi, kourosh title: detecting the most insightful parts of documents using a regularized attention-based model date: 2020-05-22 journal: computational science iccs 2020 doi: 10.1007/978-3-030-50420-5_20 sha: doc_id: 27316 cord_uid: echxuw74 every individual text or document is generated for specific purpose(s). sometimes, the text is deployed to convey a specific message about an event or a product. on other occasions, it may be communicating a scientific breakthrough, development or new model, and so on. given any specific objective, the creators and the users of documents may like to know which part(s) of the documents are more influential in conveying their specific messages or achieving their objectives. understanding which parts of a document have more impact on the viewer's perception would allow the content creators to design more effective content. detecting the more impactful parts of a content would help content users, such as advertisers, to concentrate their efforts more on those parts of the content and thus avoid spending resources on the rest of the document. this work uses a regularized attention-based method to detect the most influential part(s) of any given document or text. the model uses an encoder-decoder architecture based on an attention-based decoder with regularization applied to the corresponding weights. the main purpose of nlp (natural language processing) and nlu (natural language understanding) is to understand the language. more specifically, they focus not just on the context of a text but also on how humans use language in daily life.
thus, among other ways of utilizing this, we could provide an optimal online experience addressing the needs of users' digital experience. language processing and understanding is much more complex than many other applications in machine learning, such as image classification, since nlp and nlu involve deeper context analysis. this paper is written as a short paper and focuses on explaining only the parts that are the contribution of this paper to the state of the art. thus, this paper does not describe the state-of-the-art works in detail and uses those works [2, 4, 5, 8, 53, 60, 66, 70, 74, 84] to build its model as a modification and extension of the state of the art. therefore, a comprehensive set of reference works has been added for anyone interested in learning more details of the previous state-of-the-art research [3, 5, 10, 17, 33, 48, 49, 61-63, 67-73, 76, 77, 90, 91, 93]. deep learning has become a main model in natural language processing applications [6, 7, 11, 22, 38, 55, 64, 71, 75, 78-81, 85, 88, 94]. among deep learning models, rnn-based models like lstm and gru have often been deployed for text analysis [9, 13, 16, 23, 32, 39-42, 50, 51, 58, 59]. though modified versions of rnn (recurrent neural networks), such as lstm and gru, have been an improvement over plain rnn in dealing with vanishing gradients and long-term memory loss, they still suffer from many deficiencies. as a specific example, an rnn-based encoder-decoder architecture uses the encoded vector (feature vector), computed at the end of the encoder, as the input to the decoder, and uses this vector as a compressed representation of all the data and information from the encoder (input). this ignores the possibility of looking at all previous sequences of the encoder and thus suffers from an information bottleneck, leading to low precision, especially for texts of medium or long sequences.
to address this problem, a global attention-based model [2, 5] was introduced, in which each decoder sequence uses all of the encoder sequences. figure 1 shows an attention-based model, where i = 1:n indexes the encoder sequences and t = 1:m the decoder sequences. each of the decoder states looks into the data from all the encoder sequences with specific attention measured by the weights. each weight, w_ti, indicates the attention decoder network t pays to encoder network i. these weights depend on the previous decoder and output states and the present encoder state, as shown in fig. 2. given the complexity of these dependencies, a neural network model is used to compute these weights: two fully connected layers (of width 1024) with relu activation are used. here h is the state of the encoder networks, s_{t-1} is the previous state of the decoder and v_{t-1} is the previous decoder output. also, w_t is the vector of weights at decoder step t. since w_t are the output of a softmax function, they are nonnegative and sum to one. this section gives an overview of the contribution of this paper and explains the extension made over the state-of-the-art model. a major point of attention in much text-related analysis is to determine which part(s) of the input text has had more impact in determining the output. the length of the input text could be very long, comprising potentially hundreds or thousands of words or sequences, i.e., n could be a very large number. thus, there are many weights (w_ti) involved in determining any part of the output v_t, and since many of these weights are correlated, it is difficult to determine the significance of any input sequence in computing any output sequence v_t. to make these dependencies clearer and to recognize the most significant input sequences for any output sequence, we apply a zero-norm penalty to make the corresponding weight vector become a sparse vector.
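a minimal numpy sketch of this weighting scheme: a small two-layer relu network (weights random here, learned in the real model; the sizes below are illustrative, not the paper's width of 1024) scores each encoder state against the previous decoder state, and a softmax turns the scores into attention weights w_t that are nonnegative and sum to one:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())            # subtract max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
n, d, hidden = 6, 8, 16                # encoder steps, state size, mlp width
H = rng.normal(size=(n, d))            # encoder states h_1 .. h_n
s_prev = rng.normal(size=d)            # previous decoder state s_{t-1}

W1 = rng.normal(size=(2 * d, hidden)); b1 = np.zeros(hidden)
W2 = rng.normal(size=(hidden, 1));     b2 = np.zeros(1)

def score(h):
    # two fully connected layers with relu, mapping (h_i, s_{t-1}) -> scalar
    z = np.maximum(np.concatenate([h, s_prev]) @ W1 + b1, 0.0)
    return (z @ W2 + b2)[0]

w_t = softmax(np.array([score(h) for h in H]))   # attention weights w_{t,i}
context = w_t @ H                                # context vector for step t
```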
to achieve the desired sparsity, the zero-norm (l_0) is applied to make any corresponding w_t vector very sparse, as the penalty leads to the minimization of the number of non-zero entries in w_t. the process is implemented by imposing a constraint on ‖w_t‖_0. since l_0 is computationally intractable, we can use surrogate norms such as the l_1 norm or the euclidean norm, l_2. to impose sparsity, the l_1 norm, lasso [8, 14, 15, 18, 21], is used in this work as the penalty function to enforce sparsity on the weight vectors. this penalty, β‖w_t‖_1, is the first extension to the attention model [2, 5]. here, β is the regularization parameter, which is set as a hyperparameter before learning. a tighter constraint leads to higher sparsity at the cost of a larger regularization bias, while lower values of the regularization parameter lead to lower sparsity and a smaller regularization bias. the main goal of this work is to find out which parts of the encoder sequences are most critical in determining and computing any output. the output could be a word, a sentence or any other subsequence. the goal is critical especially in applications such as machine translation, image captioning, sentiment analysis, topic modeling and predictive modeling such as time series analysis and prediction. to add another layer of regularization, this work imposes an embedding error penalty on the objective function (usually, cross entropy). this added penalty also helps to address the "coverage problem" (the often observed phenomenon of the network dropping or frequently repeating words, or any other subsequence). the embedding regularization is α‖embedding error‖² (6). the input to any model has to be numeric, and hence the raw input of words or text sequences needs to be transformed into continuous numbers. this is done by using one-hot encoding of the words and then using embedding, as shown in fig. 3.
here u_i denotes the embedding of the i-th input or sequence, obtained from the one-hot encoding representation of the raw input text; α is the regularization parameter. the idea of embedding is based on the requirement that embedding should preserve word similarities, i.e., words that are synonyms before embedding should remain synonyms after embedding. using this concept of embedding, the scaled embedding error can be re-written, using the regularization parameter (α), as in eq. (10), where l is the measure or metric of similarity of word representations. here, as similarity measures, both the euclidean norm and cosine similarity (dissimilarity) have been considered; in this work, the embedding error using the euclidean norm is used. alternatively, we could include the embedding error of the output sequence in eq. (10). when the input sequence (or the dictionary) is too long, to prevent the high computational complexity of computing the similarity of each specific word with all other words, we choose a random (uniform) sample of the input sequences to compute the embedding error. the regularization parameter, α, is computed using cross validation [26-31]. alternatively, adaptive regularization parameters [82, 83] could be used. this model was applied to wikipedia datasets for english-german translation (one-way translation) with 1000 sentences. the idea was to determine which specific input word (in english) is the most important one for the corresponding german translation. the results were often an almost diagonal weight matrix, with few non-zero off-diagonal entries, indicating the significance of the corresponding word(s) in the original language (english). since the model is an unsupervised approach, it is hard to evaluate its performance without using domain knowledge.
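since a softmax output always has unit l_1 norm, the sparsifying effect of the lasso surrogate is easiest to see on unnormalized weights. a minimal numpy sketch of the proximal (soft-thresholding) operator associated with a penalty λ‖w‖_1, which is the mechanism by which the l_1 surrogate of the zero-norm drives small entries exactly to zero (λ plays the role of the regularization parameter β above; the vector below is illustrative):

```python
import numpy as np

def soft_threshold(w, lam):
    # proximal operator of lam * ||w||_1: shrinks every entry towards zero
    # and sets entries with |w_i| <= lam exactly to zero, yielding sparsity
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

w = np.array([0.50, -0.05, 0.02, -0.80, 0.09])
w_sparse = soft_threshold(w, 0.10)     # -> [0.4, 0.0, 0.0, -0.7, 0.0]
```

larger λ zeroes more entries, mirroring the trade-off noted above between sparsity and regularization bias.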
the next step in this work would be to develop a unified and interpretable metric for automatic testing and evaluation of the model without using any domain knowledge, and also to apply the model to other applications such as sentiment analysis.
inverse problems: principles and applications in geophysics, technology, and medicine
polosukhin: attention is all you need
domain adaptation via pseudo in-domain data selection
multiple object recognition with visual attention
neural machine translation by jointly learning to align and translate
the dropout learning algorithm
deep learning and unsupervised feature learning, nips 2012 workshop
nesta, a fast and accurate first-order method for sparse recovery
learning long-term dependencies with gradient descent is difficult
a neural probabilistic language model
theano: a cpu and gpu math expression compiler
audio chord recognition with recurrent neural networks
a singular value thresholding algorithm for matrix completion
exact matrix completion via convex optimization
compressive sampling
long short-term memory-networks for machine reading
learning phrase representations using rnn encoder-decoder for statistical machine translation
framewise phoneme classification with bidirectional lstm and other neural network architectures
generating sequences with recurrent neural networks
the elements of statistical learning; data mining, inference and prediction
handwritten digit recognition via deformable prototypes
'gene shaving' as a method for identifying distinct sets of genes with similar expression patterns
matrix completion via iterative soft-thresholded svd
package 'impute'.
cran
multilingual distributed representations without word alignment
advances in natural language processing
long short-term memory
gradient flow in recurrent nets: the difficulty of learning long-term dependencies
regularization for applied inverse and ill-posed problems
compositional attention networks for machine reasoning
two case studies in the application of principal component
principal component analysis
rotation of principal components: choice of normalization constraints
a modified principal component technique based on the lasso
recurrent continuous translation models
statistical machine translation
structured attention networks
statistical phrase-based translation
learning phrase representations using rnn encoder-decoder for statistical machine translation
conditional random fields: probabilistic models for segmenting and labeling sequence data
neural networks: tricks of the trade
a structured self-attentive sentence embedding
effective approaches to attention-based neural machine translation
learning to recognize features of valid textual entailments
natural logic for textual inference
encyclopedia of language & linguistics
introduction to information retrieval
the stanford corenlp natural language processing toolkit. computer science, acl
computational linguistics and deep learning
differentiating language usage through topic models
effective approaches to attention based neural machine translation
application of dnn for modern data with two examples: recommender systems & user recognition.
deep learning summit
standardization of featureless variables for machine learning models using natural language processing
generalized variable conversion using k-means clustering and web scraping
an efficient deep learning model for recommender systems
effectiveness of representation learning for the analysis of human behavior
an evaluation metric for content providing models, recommendation systems, and online campaigns
combined loss function for deep convolutional neural networks
a randomized algorithm for the selection of regularization parameter. inverse problem symposium
a local regularization method using multiple regularization levels
a decomposable attention model
on the difficulty of training recurrent neural networks. in: icml
on the difficulty of training recurrent neural networks
how to construct deep recurrent neural networks
fast curvature matrix-vector products for second-order gradient descent
bidirectional recurrent neural networks
continuous space translation models for phrase-based statistical machine translation
continuous space language models for statistical machine translation
sequence to sequence learning with neural networks
google's neural machine translation system: bridging the gap between human and machine translation
adadelta: an adaptive learning rate method
key: cord-103913-jgko7b0j authors: macedo, a. m. s.; brum, a. a.; duarte-filho, g. c.; almeida, f. a. g.; ospina, r.; vasconcelos, g. l. title: a comparative analysis between a sird compartmental model and the richards growth model date: 2020-08-06 journal: nan doi: 10.1101/2020.08.04.20168120 sha: doc_id: 103913 cord_uid: jgko7b0j we propose a compartmental sird model with time-dependent parameters that can be used to give epidemiological interpretations to the phenomenological parameters of the richards growth model.
we illustrate the use of the map between these two models by fitting the fatality curves of the covid-19 epidemic data in italy, germany, sweden, netherlands, cuba, and japan. the pandemic of the novel coronavirus disease (covid-19) has created a major worldwide sanitary crisis [1, 2] . developing a proper understanding of the dynamics of the covid-19 epidemic curves is an ongoing challenge. in modeling epidemics, in general, compartmental models [3] have been to some extent the tool of choice. however, in the particular case of the covid-19 epidemic, standard compartmental models, such as sir, seir, and sird, have so far failed to produce a good description of the empirical data, despite a great amount of intensive work [4, 5, 6, 7, 8, 9, 10, 11] . in this context, phenomenological growth models have met with some success, particularly in the description of cumulative death curves [12, 13, 14] . the recent discovery, within the context of a generalized growth model known as the beta logistic model [15] , of a slow, power-law approach towards the plateau in the final stage of the epidemic curves is another remarkable example of this qualitative success. growth models, however, have the drawback that their parameters may not be easily interpreted in terms of standard epidemiological concepts [16] , as can the parameters of the usual compartmental models. as a concrete example, consider the transmission rate parameter β of the sir model [3] . it can be easily interpreted as the probability that a contact between a susceptible individual and an infective one leads to a transmission of the pathogen, times the number of contacts per day. although the value of β cannot be measured directly in a model independent way, and it is probably not even constant in the covid-19 epidemic curves, the epidemiological meaning of the parameter is nonetheless easy to grasp conceptually. 
as a result, models that incorporate such parameters in their basic equations are sometimes regarded as "more epidemiological," so to speak, than others that do not use similar parameters. this state of affairs creates a somewhat paradoxical scenario, in which we have, on the one hand, the striking empirical success of phenomenological growth models sometimes being downplayed, owing to the lack of a simple epidemiological picture of the underlying mechanism [16] , and, on the other hand, the failure of traditional epidemiological compartmental models to produce good quantitative agreement with the empirical covid-19 data. a glaring instance of the inadequacy of standard compartmental models for the covid-19 epidemic is their inability to predict the power-law behavior often seen in the early-growth regime as well as in the saturation phase of the accumulated death curves-a feature that is well captured by growth models [15] , as already mentioned. it is clear that a kind of compromise is highly desirable, in which we get the benefits of the accuracy of the growth models in describing the epidemic, along with a reasonable epidemiological interpretation of their free parameters. an attempt in this direction was presented by wang [16] , where an approximate map between the richards growth model [16] and the accumulated number of cases of a sir model was proposed. the two free parameters of the richards model were expressed as a function of the epidemiological based parameters of the sir model. here we improve on this analysis in two ways: (i) we extend the sir model to a sird model by incorporating the deceased compartment, which is then used as the basis for the map onto the richards model; (ii) the parameters of the sird model are allowed to have a time dependence, which is crucial to gain some efficacy in describing realistic cumulative epidemic curves of covid-19. 
the copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. this version posted august 6, 2020. https://doi.org/10.1101/2020.08.04.20168120 doi: medrxiv preprint. it is in general very hard to estimate the actual number of infected people within a given population, simply because a large proportion of infections go undetected. this happens largely because many carriers of the coronavirus are either asymptomatic or develop only mild symptoms, which in turn makes the number of confirmed cases of covid-19 a poor proxy for the actual number of infections. this issue is well known in the literature and is referred to as the "under-reporting problem" [17, 18]. with this in mind, and in the absence of more reliable estimates for the number of infected cases, we shall here focus our analysis on the fatality curves, defined as the cumulative number of deaths as a function of time. in the present study we considered the mortality data of covid-19 from the following countries: italy, germany, sweden, netherlands, cuba, and japan. the data used here were obtained from the database made publicly available by the johns hopkins university [19], which lists in automated fashion the number of confirmed cases and deaths attributed to covid-19 per country. we have used data up to july 30, 2020. the time evolution of the number of cases/deaths in an epidemic can be modelled by means of the richards model (rm), defined by the following ordinary differential equation [20, 21, 22]:

dc/dt = r c [1 - (c/k)^α],  (3.1)

where c(t) is the cumulative number of cases/deaths at time t, r is the growth rate at the early stage, k is the final epidemic size, and the parameter α measures the asymmetry with respect to the s-shaped curve of the standard logistic model, which is recovered for α = 1.
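the closed-form solution of this ode, parameterized by its inflection point t_c, can be written down and checked numerically; a short sketch (the parameter values are illustrative, not fitted to any country):

```python
import numpy as np

def richards(t, K, r, alpha, tc):
    # closed-form solution of dC/dt = r*C*(1 - (C/K)**alpha) with
    # inflection point at t = tc; alpha = 1 gives the logistic, C(tc) = K/2
    return K * (1.0 + alpha * np.exp(-r * alpha * (t - tc))) ** (-1.0 / alpha)

t = np.linspace(0.0, 150.0, 1501)
C = richards(t, K=1000.0, r=0.2, alpha=0.5, tc=40.0)
```

the curve rises monotonically towards the final size k, and for alpha = 1 the value at the inflection point is exactly k/2, as for the standard logistic.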
in the present paper we shall apply the rm to the fatality curves of covid-19, so that c(t) will always represent the cumulative number of deaths at time t, where t is counted in days from the first death. equation (3.1) must be supplemented with a boundary condition, which can be either the initial time, t = 0, or the inflection point, t = t_c, defined by the condition d²c/dt²(t_c) = 0. a direct integration of (3.1) yields the following explicit formula:

c(t) = k [1 + α exp(-rα(t - t_c))]^(-1/α),  (3.2)

which will be the basis of our analysis. we start by recalling the standard susceptible (s)-infected (i)-recovered (r)-deceased (d) epidemiological model

ds/dt = -β s i / n,  (3.3)
di/dt = β s i / n - (γ_1 + γ_2) i,  (3.4)
dr/dt = γ_1 i,  (3.5)
dd/dt = γ_2 i,  (3.6)

where s(t), i(t), r(t), and d(t) are the numbers of individuals at time t in the classes of susceptible, infected, recovered, and deceased, respectively, whereas n is the total number of individuals in the population, i.e., n = s(t) + i(t) + r(t) + d(t). the initial values are chosen to be s(0) = s_0, i(0) = i_0, with s_0 + i_0 = n, and r(0) = 0 = d(0). the parameters γ_1 and γ_2 are the rates at which an infected individual becomes recovered or deceased, respectively. we then consider the following modified sird model, where in (3.3) and (3.4) we replace n with only the partial population in the s and i compartments, which takes into account the fact that the recovered (assuming they become immune) and the deceased cannot contribute to the transmission:

ds/dt = -β s i / (s + i),  (3.7)
di/dt = β s i / (s + i) - (γ_1 + γ_2) i,  (3.8)
dr/dt = γ_1 i,  (3.9)
dd/dt = γ_2 i.  (3.10)
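the modified model, with the transmission term normalised by s + i as described above, is straightforward to integrate numerically; a sketch with illustrative parameter values (a useful internal check is that dr/dd = γ_1/γ_2 at all times, so the final recovered-to-deceased ratio equals γ_1/γ_2):

```python
import numpy as np
from scipy.integrate import solve_ivp

def sird_modified(t, u, beta, g1, g2):
    # modified SIRD: transmission normalised by S + I, since recovered
    # and deceased individuals no longer take part in transmission
    S, I, R, D = u
    new_inf = beta * S * I / (S + I)
    return [-new_inf, new_inf - (g1 + g2) * I, g1 * I, g2 * I]

N = 1.0e6
u0 = [N - 100.0, 100.0, 0.0, 0.0]
beta, g1, g2 = 0.25, 0.06, 0.04        # so R0 = beta/(g1 + g2) = 2.5
sol = solve_ivp(sird_modified, (0.0, 400.0), u0, args=(beta, g1, g2),
                t_eval=np.linspace(0.0, 400.0, 401), rtol=1e-9, atol=1e-6)
S, I, R, D = sol.y
```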
we thus find: ds/dt = −β s i/(s + i), di/dt = β s i/(s + i) − (γ1 + γ2) i, dr/dt = γ1 i, dd/dt = γ2 i. (3.7)-(3.10) a fundamental quantity in epidemiology is the basic reproductive ratio, r_0, which is defined as the expected number of secondary infections caused by an infected individual, during the period she (or he) is infectious, in a population consisting solely of susceptible individuals. in this model, r_0 can be calculated using the next-generation method [23, 24] and is given by r_0 = β/(γ1 + γ2). (3.11) next, we define y(t) = s(t) + i(t) and divide (3.8) by (3.7) to obtain (3.12). integrating both sides of (3.12), and inserting the result into (3.7), yields a growth equation of the richards type (3.13), where α = 1 − 1/r_0 and l = (i_0 + s_0)^{1/α} s_0^{1−1/α}. we now seek to approximate the curve of accumulated deaths, d(t), obtained from the sird model, with the richards function c(t), as defined in (3.2). to this end, we first impose the boundary conditions k = d(∞) and t_c = t_i, where t_i is the inflection point of d(t). by definition, d̈(t_i) = 0, which implies from (3.10) that i̇(t_i) = 0, and thus s(t_i)/[s(t_i) + i(t_i)] = 1/r_0. (3.14) furthermore, we require that at t = t_i both c(t) and its derivative ċ(t) respectively match d(t) and ḋ(t) (3.15)-(3.16). using the condition i̇(t_i) = 0 in the sird equations together with equations (3.15) and (3.16), we finally obtain the connection between the parameters (r, α) of the rm and the parameters (β, γ1, γ2) of the sird model, equations (3.19) and (3.20), which are the central equations of this paper. we can estimate the precision of the above 'map' between the rm and the sird model via a relative error function. we have verified numerically that
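the quantities r_0 and α defined above are straightforward to compute; a small sketch (the full map (3.19)-(3.20) for the growth rate r is not reproduced here, since its explicit form is not shown in the text):

```python
def r0_sird(beta, g1, g2):
    """basic reproductive ratio of the sird model, r0 = beta / (g1 + g2)."""
    return beta / (g1 + g2)

def richards_alpha(beta, g1, g2):
    """asymmetry parameter of the associated richards model,
    alpha = 1 - 1/r0 (meaningful for r0 > 1)."""
    return 1.0 - 1.0 / r0_sird(beta, g1, g2)

# example: beta = 0.3, g1 = 0.08, g2 = 0.02 gives r0 = 3 and alpha = 2/3
```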
the relative error satisfies |e(t)| ≲ ε, where ε is typically of order 0.1. a typical example of the agreement between the sird model, for a given set of parameters (β, γ1, γ2), and the rm with the parameters obtained from the map described by (3.19) and (3.20) is illustrated in fig. 1. in fig. 2 we show the simple monotonic dependence of the richards parameters (r, α) on the parameter β of the sird model, for the biologically relevant interval 0 ≤ r, α ≤ 1. we also show, for comparison, the behavior of the basic reproduction number r_0. the sird model with constant parameters proved to be insufficient to accommodate properly the human-intervention-biased dynamics of the covid-19 epidemics. the simplest solution to this problem is to allow the epidemiological parameter β to change in time according to a simple exponential decay function [4]: β(t) = β_0 for t < τ_0 and β(t) = β_0 [β_1 + (1 − β_1) e^{−(t−τ_0)/τ_1}] for t ≥ τ_0, (3.23) where τ_0 is the starting time of the intervention and τ_1 is the average duration of interventions. here β_0 is the initial transmission rate of the pathogen and the product β_0 β_1 represents the transmission rate at the end of the epidemic. remarkably, the central map equations, (3.19) and (3.20), are still valid, although t_i is no longer given by (3.14) and should instead be determined from the maximum of the curve i(t) obtained from the numerical solution of the sird equations, with the parameter β replaced by the function β(t).
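this time-dependent transmission rate can be sketched as follows (the functional form is inferred from its stated limits: β equals β_0 before the intervention and decays exponentially towards β_0 β_1; all parameter values are illustrative):

```python
import math

def beta_t(t, beta0, beta1, tau0, tau1):
    """exponentially decaying transmission rate after the intervention starts at tau0:
    beta(t) = beta0 for t < tau0, and decays from beta0 towards beta0*beta1 with
    characteristic time tau1 afterwards."""
    if t < tau0:
        return beta0
    return beta0 * (beta1 + (1.0 - beta1) * math.exp(-(t - tau0) / tau1))
```

with this β(t), the inflection time t_i must be located numerically from the maximum of i(t), as noted above.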
figure 2: behavior of the richards parameters (r, α) and r_0 as a function of the parameter β of the sird model. in fig. 3 we demonstrate some applications of the sird-rm map by showing the cumulative number of deaths (red circles) attributed to covid-19 for the following countries: italy, germany, netherlands, sweden, japan and cuba. in all figures shown, the continuous (black) curve is the numerical fit to the empirical data, as produced by the sird model with the time-dependent parameter β(t) given in (3.23), and the dashed (bright green) curve is the corresponding theoretical curve predicted by the rm, with the parameters obtained from the map (3.19) and (3.20). the statistical fits were performed using the levenberg-marquardt algorithm [25], as implemented in the lmfit python package [26], to solve the corresponding non-linear least-squares optimization problem. in other words, the lmfit package was applied to each empirical dataset to determine the parameters (β_0, β_1, γ_1, γ_2, τ_0, τ_1) of the sird model described in secs. 3.2 and 3.3. one can see from fig. 3 that the agreement between the rm and the sird model is very good in all cases considered, which satisfactorily validates the map between these two models. this result thus shows, quite convincingly, that the parameters of the richards model do bear a direct relationship to epidemiological parameters, as represented, say, in compartmental models of the sird type. although the interpretation of the richards parameters (r, α) is less obvious, in that they involve a nonlinear relation with the probability rates used in compartmental models, these parameters should nonetheless be regarded as bona fide epidemiological parameters.
furthermore, it is important to emphasize the flexibility of the rm: this model, which has only two time-independent parameters, is equivalent (in the sense of the map discussed above) to a sird model with a time-dependent transmission rate. in other words, the two constant parameters of the rm are sufficient to characterize, to a rather good extent, the entire evolution of the covid-19 epidemic in a given location. it is worth pointing out that the discovery of power-law behaviors in the early-growth regime as well as in the saturation phase of the accumulated death curves, both of which are well described by the beta logistic model (blm) [15], brings about the challenge of accommodating power laws into a compartmental model. a preliminary analysis [15] shows that substantial modifications of the sird equations may be required to achieve power-law behavior in the short-time and long-time regimes of the epidemic curves. the possibility of a map between the blm and a modified sird model with time-dependent parameters is currently under investigation. the present paper provides a map between a sird model with time-dependent parameters and the richards growth model. we illustrated the use of this map by fitting the fatality curves of the covid-19 epidemic data for italy, germany, sweden, netherlands, cuba and japan.
the results presented here are relevant in that they showcase the fact that phenomenological growth models, such as the richards model, are valid epidemiological models not only because they can successfully describe the empirical data but also because they capture, in an effective way, the underlying dynamics of an infectious disease. in this sense, the free parameters of growth models acquire a biological meaning to the extent that they can be put in correspondence (albeit not a simple one) with parameters of compartmental models, which have a more direct epidemiological interpretation.

[25] j. j. moré, "the levenberg-marquardt algorithm: implementation and theory," in numerical analysis, pp. 105-116, springer, 1978.
[26] m. newville, t. stensitzki, d. allen, and a. ingargiola, "non-linear least-squares minimization and curve-fitting for python," chicago, il, 2015.

references:
- director-general's opening remarks at the media briefing on covid-19 -30
- director-general's opening remarks at the media briefing on covid-19 -30
- containing papers of a mathematical and physical character
- chinese and italian covid-19 outbreaks can be correctly described by a modified sird model
- power laws in superspreading events: evidence from coronavirus outbreaks and implications for sir models
- epidemiological model with anomalous kinetics - the covid-19 pandemics
- what will be the economic impact of covid-19 in the us? rough estimates of disease scenarios
- estimation of covid-19 dynamics "on a back-of-envelope": does the simplest sir model provide quantitative parameters and predictions?
- modified seir and ai prediction of the epidemics trend of covid-19 in china under public health interventions
- modeling the epidemic dynamics and control of covid-19 outbreak in china
- a sir model assumption for the spread of covid-19 in different communities
- generalized logistic growth modeling of the covid-19 outbreak in 29 provinces in china and in the rest of the world
- modelling fatality curves of covid-19 and the effectiveness of intervention strategies
- dynamics and future of sars-cov-2 in the human host
- complexity signatures in the covid-19 epidemic: power law behaviour in the saturation regime of fatality curves
- richards model revisited: validation by and application to infection dynamics
- age-structured estimation of covid-19 icu demand from low quality data
- analysis of covid-19 under-reporting in brazil
- coronavirus covid-19 global cases by the center for systems science and engineering
- a flexible growth function for empirical use
- richards model revisited: validation by and application to infection dynamics
- richards model: a simple procedure for real-time prediction of outbreak severity
- on the definition and the computation of the basic reproduction ratio r_0 in models for infectious diseases in heterogeneous populations
- the construction of next-generation matrices for compartmental epidemic models

this work was partially supported by the national council for scientific and technological development (

key: cord-017934-3wyebaxb authors: kurahashi, setsuya title: an agent-based infectious disease model of rubella outbreaks date: 2019-05-07 journal: agents and multi-agent systems: technologies and applications 2019 doi: 10.1007/978-981-13-8679-4_20 sha: doc_id: 17934 cord_uid: 3wyebaxb

this study proposes a simulation model of rubella. the sir (susceptible, infected, recovered) model has been widely used to analyse infectious diseases such as influenza, smallpox and bioterrorism, to name a few. on the other hand, agent-based models have begun to spread in recent years.
such a model makes it possible to represent the behaviour of each person on a computer. it also reveals the spread of infection through simulation of the contact process among people in the model. the study designs a model based on smallpox and ebola fever models in which several health policies are decided, such as vaccination, gender-specific workplaces and so on. the infectious simulation of rubella, for which men in japan have not yet been completely vaccinated, is implemented in the model. as a result of experiments using the model, it has been found that administering preventive vaccine to all men is a crucial factor in preventing the spread among women. infectious diseases have been serious risk factors in human societies for centuries. smallpox has been recorded in human history since more than 1100 b.c. people have also been suffering from many other infectious diseases such as malaria, cholera, tuberculosis, typhus, aids, influenza, etc. although people have tried to prevent and hopefully eradicate them, risks of unknown infectious diseases, including sars (a new type of infectious disease) as well as ebola haemorrhagic fever and zika fever, have appeared on the scene. models of infectious disease have been studied for years. the sir (susceptible, infected, recovered) model has been widely used to analyse such diseases based on a mathematical formulation. after the outbreak of sars, the first sir model of sars was published, and many researchers studied the epidemic of the disease using this model. when an outbreak of a new type of influenza is first reported, the u.s. government immediately starts an emergency action plan to estimate the parameters of its sir model. nevertheless, the sir model has difficulty analysing which measures are effective, because the model has only one parameter to represent infectiveness. for example, it is difficult for the sir model to evaluate the effect of temporarily closing school classes during an influenza epidemic.
the agent-based approach, or individual-based approach, has been adopted to overcome these problems in recent years [1] [2] [3] [4] . the approach makes it possible to represent the behaviour of each person. it also reveals the spread of an infection through simulation of the contact process among people in the model. in this study, we developed a model to simulate rubella based on infectious disease studies using agent-based modelling. what we want to know is how to prevent an epidemic of infectious diseases, not only through the mechanisms of the epidemic itself but also through health policy decision-making [5]. we aim to study the relationship between the antibody holding rate of men and the spread of infection by constructing a model of rubella virus infection with the agent-based approach and repeating simulation experiments on a computer. although our previous study described infectious disease models of smallpox and ebola [6], this paper proposes a new model of rubella, which has caused crucial problems for pregnant women in recent years. in sect. 2, as examples of infections that occurred in the past, we explain smallpox, ebola haemorrhagic fever, zika fever and rubella. section 3 describes related research on infectious disease models. section 4 explains the basic model and describes the rubella model. section 5 explains the experimental results, and sect. 6 discusses them. finally, we summarize the whole in sect. 7. the smallpox virus affects the throat, where it invades the blood and hides in the body for about 12 days. patients develop a high fever after that, but rashes do not appear until about 15 days after the infection. while not yet developing rashes, a patient is already able to infect others. after 15 days, red rashes break out on the face, arms and legs, and subsequently they spread over the entire body. when all rashes generate pus, patients suffer great pain; finally, 30% of patients succumb to the disease.
for thousands of years, smallpox was a deadly disease that resulted in thousands of deaths. a source of ebola infection is allegedly eating a bat or a monkey, but it is not confirmed whether eating these animals is a source of the infection. in the recent epidemic, which began in guinea in december 2013, 11,310 deaths were confirmed. the authorities of guinea, liberia and sierra leone each launched a state committee of emergency and took measures to cope with the situation, including prohibiting crossings of the guinea border. zika fever is an illness caused by the zika virus via the bite of mosquitoes. it can also potentially be spread by sex, according to a recent report [7, 8]. most cases have no symptoms, and when present they are usually mild, including fever, red eyes, joint pain and a rash [9], but it is believed that zika fever may cause microcephaly, which severely affects babies through a small head circumference. rubella is a viral infection caused by the rubella virus [10, 11]. in japan, there were epidemics (1976, 1982, 1987, 1992) once every 5 years, but after both male and female infants became subject to periodic vaccination, no big epidemic occurred. however, in 2004, an outbreak of an estimated 40,000 cases occurred and ten cases of congenital rubella syndrome were reported. a large epidemic occurred in asia in 2011, and from 2013 to 2014 an epidemic exceeding 14,000 cases occurred, mainly among adult males who had not received the vaccine [12]. the epidemic recurred in 2018; as of october, the number of rubella infections for the year was about 1300, and the national institute of infectious diseases announced emergency information on the rubella epidemic. the centers for disease control and prevention (u.s.) raised the rubella alert level for japan to the second of three levels, 'recommendation' [13].
they recommended that pregnant women who are not protected against rubella, through either vaccination or previous rubella infection, should not travel to japan during this outbreak. epstein [14, 15] built a smallpox model based on 49 epidemics in europe from 1950 to 1971. the model comprises 100 families in each of two towns. each family includes two parents and two children; thus the population of each town is 400. all parents work in their own town during the day, except for 10% of adults who commute to the other town. all children attend school. there is a communal hospital serving the two towns, in which five people from each town work. this model was designed as an agent-based model, and simulations of infectious disease were conducted using it. the results of the experiments showed that (1) in a base model in which no infectious disease measures were taken, the epidemic spread within 82 days and 30% of the people died; (2) a trace vaccination measure was effective, but it was difficult to trace all contacts of patients in an underground railway or an airport; (3) a mass vaccination measure was effective, but the number of vaccinations required would be huge, so it was not realistic; and (4) epidemic quenching was also effective, and reactive household trace vaccination along with pre-emptive vaccination of hospital workers showed a dramatic effect. ohkusa [16] evaluated smallpox measures using an individual-based model of infectious diseases. the model supposed a town with 10,000 inhabitants and a public health centre. in the model, one person was infected with the smallpox virus at a shopping mall. they compared a trace vaccination measure with a mass vaccination measure. the simulation showed that the effect of trace vaccination dropped if the initial number of infections was high and the number of medical staff was small, while the effect of mass vaccination was stable.
therefore, timely and concentrated mass vaccination is required once the virus starts spreading. estimates of the number, place and time of infections are needed quickly, and the preparation of an emergency medical treatment and estimation system is required for such occasions. regarding measles epidemics, agent-based simulation models of measles transmission have been developed using the framework for reconstructing epidemiological dynamics, a data-driven agent-based model to simulate the spread of an airborne infectious disease in an irish town, and so on [17, 18]. these studies demonstrate the effectiveness of agent-based models, yet the models are not sufficient to consider the relationship between the antibody holding rates of men and women, commuting routes and gender-specific workplaces. in addition, authorities need to make decisions regarding measles-rubella mixed (mr) vaccination of men. this study takes these extensions into account. we designed a health policy simulation model of infectious disease based on epstein's smallpox model. the model covers smallpox, ebola haemorrhagic fever and rubella. we assume all individuals to be susceptible, which means no background immunity. 100 families live in each of two towns. each family includes two parents and two children; therefore, the population of each town is 400. all parents work in their own town during the day, except for 10% of adults who commute to the other town. all children attend school. there is a communal hospital serving the two towns, in which five people from each town work. each round consists of an interaction through the entire agent population. the call order is randomized each round, and agents are processed, or activated, serially. on each round, when an agent is activated, she identifies her immediate neighbours for interaction. each interaction results in a contact.
in turn, that contact results in a transmission of the infection from the contacted agent to the active agent with some probability. the probability of a contact at an interaction is 0.3 at a workplace or a school, and 1.0 at a home or the hospital. likewise, the probability of infection given a contact is 0.3 at a workplace or a school, and 1.0 at a home or the hospital. in the event the active agent contracts the disease, her colour turns from blue to green and her own internal clock of disease progression begins. after 12 days, she turns yellow and begins infecting others. the length of the non-contagious period is 12 days, and the early-rash contagious period is 3 days. unless an infected individual is vaccinated within 4 days of exposure, the vaccine is ineffective. at the end of day 15, the smallpox rash is finally evident. the next day, individuals are assumed to be hospitalized. after 8 more days, during which they have a cumulative 30% probability of mortality, surviving individuals recover and return to circulation, permanently immune to further infection. dead individuals are coloured black and placed in the morgue. immune individuals are coloured white. individuals are assumed to be twice as infectious during days 16-19 as during days 12-15. for ebola, in the event the active agent contracts the disease, she turns from blue to green and her own internal clock of disease progression begins. after 7 days, she turns yellow and begins infecting others. however, her disease is not identified at this stage. after 3 more days, she begins to have vomiting and diarrhoea, and the disease is identified as ebola. unless the infected individual is dosed with antiviral medicine within 3 days of exposure, the medicine is ineffective (this is an imaginary medicine introduced to play the policy game). at the end of day 12, individuals are assumed to be hospitalized. after 4 more days, during which they have a cumulative 90% probability of mortality, surviving individuals recover and return to circulation, permanently immune to further infection.
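a drastically simplified sketch of one interaction round of such an agent-based model (contact and transmission only, without the disease-progression clocks; the contact and infection probabilities follow the values quoted above, and all names here are illustrative):

```python
import random

SUSCEPTIBLE, INFECTIOUS, RECOVERED = 0, 1, 2

def interaction_round(states, contact_p, infect_p, rng):
    """one round at a single location: each susceptible agent makes a contact
    with probability contact_p (if any infectious agent is present), and each
    contact transmits the infection with probability infect_p."""
    infectious_present = any(s == INFECTIOUS for s in states)
    new_states = list(states)
    for k, s in enumerate(states):
        if s == SUSCEPTIBLE and infectious_present:
            if rng.random() < contact_p and rng.random() < infect_p:
                new_states[k] = INFECTIOUS
    return new_states

# home/hospital: contact_p = infect_p = 1.0; workplace/school: 0.3
rng = random.Random(42)
states = [INFECTIOUS] + [SUSCEPTIBLE] * 3           # one infected family member
states = interaction_round(states, 1.0, 1.0, rng)   # at home, everyone is infected
```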
dead individuals are coloured black and placed in the morgue. immune individuals are coloured white. the other settings are the same as for smallpox. rubella is a viral infectious disease characterized by fever and rash. since symptoms range from subclinical to serious complications, it is difficult to diagnose rubella from symptoms alone. if a woman is infected with the rubella virus up to about the 20th week of pregnancy, there is a possibility that the baby will develop congenital rubella syndrome (crs). in consequence, congenital heart disease, hearing loss, cataract, pigmentary retinopathy and the like occur as congenital abnormalities. for this reason, pre-inoculation of the vaccine is extremely important. in japan, however, only junior high school girls were eligible for regular vaccination with the rubella vaccine from 1977 to 1995. in the past, vaccination was also recommended for children under the age of 3, but due to the frequent occurrence of meningitis caused by the vaccine strains, that use was discontinued. thereafter, a national epidemic of measles occurred in 2007, mainly among people in their teens and twenties. the 'prevention guideline on specific infectious diseases related to measles' was announced by the ministry of health, labour and welfare, and rubella was also designated as a disease requiring measures. during the 5 years from 2008 to 2012, the mr vaccine was to be inoculated as a periodical inoculation at the first period (1-year-old children), the second period (before elementary school entrance), the third period (first graders of junior high school) and the fourth period (third graders of high school). from fiscal 2013, as a rule, the measles-rubella mixed (mr) vaccine is inoculated in infants at the first period and in children before elementary school entrance at the second period. according to a 2016 survey, about 95% of females possess antibodies across all ages, while among males the antibody holding rate is only about 90% for those aged 20-34 and 76-84% for those aged 35-55.
the antibody holding rate among middle-aged males thus stays low. regarding the infection process, fever, rash and lymphadenopathy (especially behind the ears, at the back of the head and on the neck) appear after a latency period of 14-21 days from infection. fever is observed in about half of rubella patients. in addition, inapparent infection occurs in about 15-30% of cases. the process of infection in the rubella model is plotted in fig. 1. the model employs the basic parameters of the disease. an orange line and a blue line indicate the number of infected and recovered people, respectively. when a player adopts the basic model, it takes approximately 150 days until the outbreak converges and more than 29 people are infected. however, the results of executing the experiment many times differ greatly. figure 2 shows the histogram of the results of 1000 runs; the horizontal axis shows the number of infected people, and the vertical axis shows the occurrence frequency. next, the results when men and women work in separate workplaces are shown in fig. 3. as the results show, the frequency of infection of more than 20 men increased to over 10% of the total. this result is thought to be caused by the low antibody holding rate of males. however, the total number of infected women did not increase. next, we conducted an experiment with a model that introduced a railway for commuting. in this model, adults commute by railway. as the experimental results in fig. 4 show, not only did the number of infected men increase dramatically, but the number of infected women also increased. in the basic model without the railway, the frequency of infection of one or more women was 25%, but in the model with the railway it increased to 28%. in particular, the cases in which more than ten women were infected amounted to 21% or more, which is a severe result. figure 5 shows the experimental results when men and women work separately in the railway model.
the result is even more serious: the frequency of infection of one or more women increases to 37%, and in 30% of runs more than ten women were infected (fig. 5: the experimental result of the rubella model with gender-separated workplaces and a railway). from the gender-specific workplace model, it was estimated that the low antibody holding rate of middle-aged men, as low as 80%, is the cause of the infection spread. based on this, we experimented with strengthening the medical policy by promoting vaccination for males and raising their antibody holding rate to 95%. figure 6 shows the experimental results with gender-specific workplaces, railway use and a male antibody holding rate of 95%. in this setting, for both males and females, the proportion of runs in which the infection expanded to more than one person drastically decreased, to less than 15% (fig. 6: the experimental result of the rubella model with gender-separated workplaces, a railway and a 95% antibody holding rate). by combining this with other infection prevention measures, it is possible to control the spread of rubella. these experimental results show that rubella virus infection is not represented by a simple statistical distribution. the infection process is considered to be a complicated system in which positive feedback arises from the randomness of infection routes and the interaction between infected people. it is also conjectured that the spread of infection starts in workplaces where many men do not possess antibodies. one of the major factors in the spread is commuters, who have many opportunities to come into close contact with an unspecified majority of people on a train. in addition, it became clear that raising the antibody holding rate of males is an important measure to prevent the spread of the whole infection. this study proposes a simulation model of rubella. it also evaluates health policies to prevent an epidemic.
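the directional effect of raising the antibody holding rate can also be illustrated with the classical final-size relation for a homogeneous population (a textbook sanity check, not the paper's agent-based model; the r_0 value used here is illustrative):

```python
import math

def final_size(r0, immune_frac, iters=200):
    """fixed-point iteration for the attack rate z = s0 * (1 - exp(-r0 * z)),
    where s0 = 1 - immune_frac is the initially susceptible fraction."""
    s0 = 1.0 - immune_frac
    z = s0  # start from the worst case (everyone susceptible gets infected)
    for _ in range(iters):
        z = s0 * (1.0 - math.exp(-r0 * z))
    return z

# raising antibody coverage from 80% to 95% sharply shrinks the outbreak
low_coverage = final_size(r0=6.0, immune_frac=0.80)
high_coverage = final_size(r0=6.0, immune_frac=0.95)
```

with 80% coverage the effective reproduction number (6.0 × 0.2 = 1.2) is above 1 and a sustained outbreak occurs; at 95% coverage it drops below 1 and the outbreak dies out, consistent with the simulation results described above.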
as health policies, vaccination of men, avoidance of gender-separated workplaces and avoidance of crowds on trains were implemented in the model. as a result of the experiments, it was found that vaccination of middle-aged men and avoiding crowds on trains are crucial factors in preventing the spread of rubella. using public transportation to commute, however, is inevitable in a modern society. even if 95% of people, both men and women, were vaccinated, it would not prevent the epidemic completely. by combining other infection prevention measures, it is possible to control the spread of rubella.

references:
- individual-based computational modeling of smallpox epidemic control strategies
- containing a large bioterrorist smallpox attack: a computer simulation approach
- agent-based models
- networks, crowds, and markets: reasoning about a highly connected world
- risk and benefit of immunisation: infectious disease prevention with immunization
- a health policy simulation model of smallpox and ebola haemorrhagic fever
- zika virus: rapid spread in the western hemisphere
- zika virus spreads to new areas - region of the americas
- who: zika virus, world health organization, media centre, fact sheets
- national institute of infectious diseases: fiscal year 2017 rubella immunization status and status of antibody retention - survey on infectious disease epidemic survey in 2017 (provisional result)
- rubella in japan
- toward a containment strategy for smallpox bioterror: an individual-based computational approach
- generative social science: studies in agent-based computational modeling
- an evaluation of counter measures for smallpox outbreak using an individual based model and taking into consideration the limitation of human resources of public health workers
- the role of vaccination coverage, individual behaviors, and the public health response in the control of measles epidemics: an agent-based simulation for california
- an open-data-driven agent-based model to simulate infectious disease outbreaks

key: cord-024501-nl0gsr0c authors: tan, chunyang; yang, kaijia; dai, xinyu; huang, shujian; chen, jiajun title: msge: a multi-step gated model for knowledge graph completion date: 2020-04-17 journal: advances in knowledge discovery and data mining doi: 10.1007/978-3-030-47426-3_33 sha: doc_id: 24501 cord_uid: nl0gsr0c

knowledge graph embedding models aim to represent entities and relations in a continuous low-dimensional vector space, benefiting many research areas such as knowledge graph completion and web search. however, previous works do not consider controlling the flow of information, which makes it hard for them to obtain useful latent information and limits model performance. specifically, as human beings, we usually make predictions in multiple steps, with every step filtering out irrelevant information and targeting helpful information. in this paper, we first integrate an iterative mechanism into knowledge graph embedding and propose a multi-step gated model that utilizes relations as queries to extract useful information from coarse to fine over multiple steps. first, a gate mechanism is adopted to control the information flow via the interaction between entity and relation over multiple steps. then we repeat the gate cell several times to refine the information incrementally. our model achieves state-of-the-art performance on most benchmark datasets compared to strong baselines. further analyses demonstrate the effectiveness of our model and its scalability on large knowledge graphs. large-scale knowledge graphs (kgs), such as freebase [1], yago3 [2] and dbpedia [3], have attracted extensive interest with the progress of artificial intelligence. real-world facts are stored in kgs in the form of (subject entity, relation, object entity), denoted as (s, r, o), benefiting many applications and research areas such as question answering and semantic search. meanwhile, kgs are still far from complete, with a lot of valid triplets missing.
as a consequence, much research has been devoted to the knowledge graph completion task, which aims to predict missing links in knowledge graphs. knowledge graph embedding models try to represent entities and relations in low-dimensional continuous vector space. benefiting from these embedding models, we can do complicated computations on kg facts and better tackle the kg completion task. translation-distance-based models [4][5][6][7][8] regard predicting a relation between two entities as a translation from the subject entity to the object entity, with the relation as the medium. bilinear models [9][10][11][12][13], in contrast, propose different energy functions that score a triple's validity rather than measuring the distance between entities. apart from these shallow models, deeper models [14, 15] have recently been proposed to extract information at a deep level. though effective, these models do not consider: 1. controlling information flow specifically, that is, keeping relevant information and filtering out useless information, which restricts model performance; 2. the multi-step reasoning nature of a prediction process. an entity in a knowledge graph contains rich latent information in its representation. as illustrated in fig. 1, the entity michael jordan has much latent information embedded in the knowledge graph, which will be learned into the representation implicitly. however, when given a relation, not all latent semantics are helpful for the prediction of the object entity. intuitively, it is more reasonable to design a module that can capture useful latent information and filter out useless information. meanwhile, for a complex graph, much latent information may be entailed in a single entity, and one-step prediction is not enough for complicated predictions, yet almost all previous models ignore this nature.
a multi-step architecture [16, 17] allows the model to refine the information from coarse to fine in multiple steps and has been shown to benefit the feature extraction procedure considerably. in this paper, we propose a multi-step gated embedding (msge) model for link prediction in kgs. during every step, the gate mechanism is applied several times to decide, at the dimension level, which features are retained and which are excluded, corresponding to the multi-step reasoning procedure. for some datasets, gate cells are repeated several times iteratively to obtain more fine-grained information. all parameters are shared among the repeated cells, which allows our model to target the right features over multiple steps with high parameter efficiency. we run link prediction experiments on 6 publicly available benchmark datasets and achieve better performance than strong baselines on most of them. we further analyse the influence of the gate mechanism and the number of steps to support our motivation. link prediction in knowledge graphs aims to predict correct object entities given a pair of subject entity and relation. a knowledge graph contains a huge number of entities and relations, which inspires previous work to cast the prediction task as a scoring and ranking task. given a known pair of subject entity and relation (s, r), a model needs a scoring function for a triple (s, r, o), where o ranges over all entities in the knowledge graph. the model then ranks all these triples to find the position of the valid one; the goal is to rank all valid triples above the false ones. knowledge graph embedding models aim to represent entities and relations in knowledge graphs with low-dimensional vectors (e_s, e_r, e_t). transe [4] is a typical distance-based model with the constraint e_s + e_r − e_t ≈ 0.
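for concreteness, the transe constraint above can be read as a scoring function: the closer e_s + e_r is to the candidate object embedding, the more plausible the triple. a minimal sketch with toy hand-set vectors (illustrative, not learned embeddings):

```python
import math

def transe_score(e_s, e_r, e_o):
    """Negative L2 distance ||e_s + e_r - e_o||; higher (closer to 0) = more plausible."""
    return -math.sqrt(sum((s + r - o) ** 2 for s, r, o in zip(e_s, e_r, e_o)))

# a triple satisfying e_s + e_r ≈ e_o scores higher than a corrupted one
valid = transe_score([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])    # distance 0
corrupt = transe_score([1.0, 0.0], [0.0, 1.0], [0.0, 0.0])
assert valid == 0.0 and valid > corrupt
```

during training, embeddings are adjusted so that valid triples end up with small distances and corrupted ones with large distances; ranking all candidate objects by this score gives the prediction.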
many other models extend transe by projecting the subject and object entities into a relation-specific vector space, such as transh [5], transr [6] and transd [18]. toruse [7] and rotate [8] are also extensions of distance-based models. instead of measuring distance between entities, bilinear models such as rescal [9], distmult [10] and complex [11] use multiplication operations to score a triplet. tensor decomposition methods such as simple [12], cp-n3 [19] and tucker [13] can also be seen as bilinear models with extra constraints. apart from the above shallow models, several deeper non-linear models have been proposed to capture more underlying features. for example, relational graph convolutional networks (r-gcns) [15] apply a specific convolution operator to model locality information in accordance with the topology of knowledge graphs. conve [14] first applied 2-d convolution to knowledge graph embedding and achieves competitive performance. the main idea of our model is to control information flow in a multi-step way. to the best of our knowledge, the work most related to ours is transat [20], which also mentioned the two-step reasoning nature of link prediction. however, transat first categorizes entities with k-means and then adopts a distance-based scoring function to measure validity. this architecture is not end-to-end, which makes it inflexible; besides, errors propagate from the k-means step. we denote a knowledge graph as g = {(s, r, o)} ⊆ e × r × e, where e and r are the sets of entities and relations respectively. the number of entities in g is n_e, the number of relations is n_r, and we allocate the same dimension d to entities and relations for simplicity. e ∈ r^(n_e × d) is the embedding matrix for entities and r ∈ r^(n_r × d) is the embedding matrix for relations. (fig. 2: schematic diagram of our model with step length 3. e_s and e_r represent the embeddings of the subject entity and relation; e_r^i is the query relation fed into the i-th step to refine information; ẽ_s is the final output information, which is multiplied with the entity embedding matrix e, and a logistic sigmoid restricts the final score to between 0 and 1.) e_s, e_r and e_o represent the embeddings of the subject entity, relation and object entity respectively. we denote a gate cell in our model as c. in order to obtain useful information, we need a specific module to extract the needed information from the subject entity with respect to the given relation, which can be regarded as control of the information flow guided by the relation. to model this process, we introduce the gate mechanism, which is widely used in data mining and natural language processing models to guide the transmission of information, e.g. long short-term memory (lstm) [21] and gated recurrent unit (gru) [22]. here we adopt the gating mechanism at the dimension level to control the information entailed in the embedding. to make the entity interact with the relation specifically, we rewrite the gate cell over multiple steps with two gates: the update gate z and the reset gate r, which control the information flow. the reset gate generates a new e_s, in other words new information; the update gate decides how much of the generated information is kept according to formula (3). a hadamard product is performed to control the information at the dimension level. the values of these two gates are generated by the interaction between the subject entity and the relation. the logistic sigmoid function σ projects results between 0 and 1, where 0 means totally excluded and 1 means totally kept; this is the core module controlling the flow of information.
besides, to verify the effectiveness of the gate mechanism, we also list, for the ablation study, the formula of a cell that excludes the gates. since the gate cell contains several gating operations, the overall architecture within one gate cell is already a multi-step way of controlling information. in fact, a single gate cell can generate useful information, since the two gating operations already provide strong control over the information flow. however, for a complex dataset, finer and more precise features are needed for prediction. the iterative multi-step architecture allows the model to refine the representations incrementally. during each step, a query is fed into the model to interact with the features from the previous step and obtain relevant information for the next step. as illustrated in fig. 2, to generate the input sequence for multi-step training, we first feed the relation embedding into a fully connected layer and reshape the output as a sequence [e_r^0, e_r^1, ..., e_r^k] = reshape(e_r), whose elements are named query relations. this projection aims to obtain query relations of different latent aspects so that we can utilize them to extract diverse information across multiple steps. diverse information increases the robustness of a model, which further benefits performance. query relations are fed sequentially into the gate cell to interact with the subject entity and generate information from coarse to fine. parameters are shared across all steps, so multi-step training is in fact performed iteratively. our score function for a given triple can be summarized via c^k, meaning the gate cell is repeated for k steps, and during each step only the corresponding e_r^i is fed in to interact with the output information from the previous step; see fig. 2 for a clearer picture. after we extract the final information, it interacts with the object entity via a dot product to produce the final score.
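the gated cell and its iterative repetition can be sketched as follows. this is a schematic, gru-style reading of the description above: the learned linear maps are collapsed into per-dimension weights W for brevity (an assumption — the paper's exact parameterization of formulas (1)-(3) is not reproduced here):

```python
import math

def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gate_cell(h, q, W):
    """One gated step: the query relation q decides, per dimension, which
    information in h is reset and how much newly generated information is kept."""
    n = range(len(h))
    z = [_sigmoid(W["z"][i] * (h[i] + q[i])) for i in n]             # update gate
    r = [_sigmoid(W["r"][i] * (h[i] + q[i])) for i in n]             # reset gate
    cand = [math.tanh(W["c"][i] * (r[i] * h[i] + q[i])) for i in n]  # new information
    # hadamard blend at dimension level: gate value 0 = excluded, 1 = kept
    return [(1 - z[i]) * h[i] + z[i] * cand[i] for i in n]

def multi_step_score(e_s, queries, e_o, W):
    """Feed k query relations through the shared cell, then dot with e_o."""
    h = list(e_s)
    for q in queries:                  # parameters W are shared across all steps
        h = gate_cell(h, q, W)
    score = sum(a * b for a, b in zip(h, e_o))
    return _sigmoid(score)             # final score restricted to (0, 1)

W = {"z": [1.0, 1.0], "r": [1.0, 1.0], "c": [1.0, 1.0]}
s = multi_step_score([0.5, -0.2], [[0.1, 0.3], [0.2, -0.1]], [1.0, 1.0], W)
assert 0.0 < s < 1.0
```

sharing W across steps is what keeps the parameter count independent of the number of steps, matching the parameter-efficiency claim above.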
in previous rnn-like models, a cell is repeated several times to process an input sequence, where the number of repetitions is determined by the length of the sequence. differently, we have two inputs e_s and e_r with totally different properties, the embeddings of the subject entity and relation respectively, which should not be treated as an ordinary sequence. as a result, in our model a gate cell is used to capture interactive information between entities and relations iteratively, rather than to extract information from a single input sequence; see fig. 3 for the differences. at last, matrix multiplication is applied between the final output information and the embedding matrix e, which is called 1-n scoring [14], to score all triples at once for efficiency and better performance. as in previous work, we also add a reciprocal triple for every instance in the dataset: for a given (s, r, o), we add the reverse triple (o, r^(-1), s). we use binary cross-entropy as our loss function. we add batch normalization to regularize our model, and dropout is also used after layers. for optimization, we use adam for a stable and fast training process. embedding matrices are initialized with xavier normalization. label smoothing [23] is also used to lessen overfitting. in this section we first introduce the benchmark datasets used in this paper, then we report the empirical results to demonstrate the effectiveness of our model. analyses and an ablation study are further reported to strengthen our motivation. • umls (unified medical language system) contains biomedical concepts such as diseases and antibiotics. • kinship [25] contains kinship relationships among members of the alyawarra tribe from central australia. the details of these datasets are reported in table 1. the evaluation metrics we use in this paper include mean reciprocal rank (mrr) and hit@k. mrr is the mean reciprocal rank of the correct triple; the higher, the better the model.
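the 1-n scoring and the smoothed binary cross-entropy objective described above can be sketched as follows (a plain-python toy with tiny illustrative vectors; real implementations use batched tensor operations):

```python
import math

def one_to_n_scores(h, E):
    """Score (s, r, ·) against every entity at once: dot h with each row of the
    entity embedding matrix E, then squash with a logistic sigmoid."""
    return [1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(h, row))))
            for row in E]

def bce_loss(scores, targets, smoothing=0.1):
    """Binary cross-entropy over all candidate objects, with label smoothing."""
    n = len(scores)
    total = 0.0
    for p, t in zip(scores, targets):
        t = t * (1.0 - smoothing) + smoothing / n    # smoothed label
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / n

E = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]    # 3 toy entities, d = 2
scores = one_to_n_scores([0.8, 0.2], E)
loss = bce_loss(scores, [1, 0, 0])            # entity 0 is the gold object
assert all(0.0 < p < 1.0 for p in scores) and loss > 0.0
```

scoring all n_e candidates in one matrix product is what makes 1-n scoring much faster than scoring each (s, r, o) triple separately.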
hit@k reflects the proportion of gold triples ranked in the top k; here we select k from {1, 3, 10}, consistent with previous work. a higher hit@k indicates a better model. all results are reported in the 'filtered' setting, which removes from the ranking all other gold triples that exist in the train, valid and test data. following previous works, we report test results at the best validation mrr. (table 3: link prediction results on umls and kinship.) the best setting of the number of iterations varies considerably across datasets. for fb15k and umls a single iteration provides the best performance; for the other datasets the iterative mechanism helps boost performance, with the best number of iterations being 5 for wn18, 3 for wn18rr, 8 for fb15k-237 and 2 for kinship. we perform the link prediction task on 6 benchmark datasets, comparing with several classical baselines such as transe [4] and distmult [10], and with strong state-of-the-art baselines such as conve [14], rotate [8] and tucker [13]. for the smaller datasets umls and kinship, we also compare with non-embedding methods such as ntp [26] and neurallp [27], which learn logic rules for prediction, as well as minerva [28], which uses reinforcement learning to reason over paths in knowledge graphs. the results are reported in table 2 and table 3. overall, we can conclude that our model achieves comparable or better performance than state-of-the-art models. even on the more difficult datasets without inverse relations, wn18rr and fb15k-237, our model still achieves comparable performance. to study the effectiveness of the iterative multi-step architecture, we list the performance for different numbers of steps on fb15k-237 in table 4; the model settings are exactly the same except for the number of steps.
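the filtered evaluation protocol can be made concrete with a short sketch (hypothetical toy scores; `known_true` holds the other gold objects that must be filtered out of the ranking):

```python
def filtered_rank(scores, gold, known_true):
    """Rank of the gold object after removing all other entities that also
    form valid triples in train/valid/test (the 'filtered' setting)."""
    s_gold = scores[gold]
    better = sum(1 for e, s in scores.items()
                 if e != gold and e not in known_true and s > s_gold)
    return better + 1

def mrr_and_hits(ranks, ks=(1, 3, 10)):
    """Mean reciprocal rank and hit@k over a list of per-query ranks."""
    mrr = sum(1.0 / r for r in ranks) / len(ranks)
    hits = {k: sum(r <= k for r in ranks) / len(ranks) for k in ks}
    return mrr, hits

# entity "c" outscores gold "a" but is itself a valid answer, so it is filtered
rank = filtered_rank({"a": 0.9, "b": 0.8, "c": 0.95}, gold="a", known_true={"c"})
mrr, hits = mrr_and_hits([rank, 2])
assert rank == 1 and mrr == 0.75 and hits[1] == 0.5
```

without filtering, valid competing triples would unfairly push the gold triple down the ranking, which is why the filtered numbers are the ones reported here.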
from the results on fb15k-237 we can conclude that the multi-step mechanism indeed boosts performance on a complex knowledge graph like fb15k-237, which supports our motivation that refining information over several steps yields more helpful information for complex datasets. we report the convergence of tucker and msge on the fb15k-237 and wn18rr datasets in fig. 4, re-running tucker with exactly the same settings. in table 5, we report the parameter counts of conve, tucker and our model for comparison. our model achieves better performance on most datasets with far fewer parameters, which means it can be migrated to large knowledge graphs more easily. for tucker, the current state-of-the-art method, the parameter count is mainly due to the core interaction tensor w, whose size is d_e × d_r × d_e; as the embedding dimension grows, this core tensor leads to a large increase in parameter count. by contrast, our model is an iterative architecture, so only very few parameters are needed apart from the embeddings; the complexity is o(n_e d + n_r d). to evaluate time efficiency, we re-ran tucker and our model on a tesla k40c: tucker needs 29 s/28 s per epoch on fb15k-237/wn18rr respectively, while msge needs 17 s/24 s, demonstrating the time efficiency of the few operations in our model. to further demonstrate that the gate mechanism and multi-step reasoning are beneficial for extracting information, we run an ablation study with the following settings: • no gate: remove the gates in our model, to verify the necessity of controlling information flow. • concat: concatenate the information extracted at every step and feed it into a fully connected layer to obtain another kind of final information, to verify that more useful information is produced by the multi-step procedure. • replicate: replicate the relation to obtain k identical query relations for training.
this is to verify that extracting diverse information from multi-view query relations is more helpful than using the same relation k times. the experimental results are reported in table 6. all results support our motivation that controlling information flow in a multi-step way is beneficial for the link prediction task in knowledge graphs; in particular, the gated cell is of great benefit for information extraction. in this paper, we propose a multi-step gated model, msge, for the link prediction task in knowledge graph completion. we utilize a gate mechanism to control the information flow generated by the interaction between the subject entity and the relation, and we repeat the gated module to refine information from coarse to fine. the empirical results show that applying the gated module over multiple steps helps extract more useful information, which further boosts link prediction performance; we also analyse the model from different views to support this conclusion. note that all information contained in the embeddings is learned implicitly during training. in future work, we would like to aggregate more information for entities to enhance feature extraction, for example from neighboring nodes and relations.
references:
[1] freebase: a collaboratively created graph database for structuring human knowledge
[2] yago3: a knowledge base from multilingual wikipedias
[3] dbpedia: a nucleus for a web of open data
[4] translating embeddings for modeling multi-relational data
[5] knowledge graph embedding by translating on hyperplanes
[6] learning entity and relation embeddings for knowledge graph completion
[7] knowledge graph embedding on a lie group
[8] rotate: knowledge graph embedding by relational rotation in complex space
[9] a three-way model for collective learning on multi-relational data
[10] embedding entities and relations for learning and inference in knowledge bases
[11] complex embeddings for simple link prediction
[12] simple embedding for link prediction in knowledge graphs
[13] tensor factorization for knowledge graph completion
[14] convolutional 2d knowledge graph embeddings
[15] modeling relational data with graph convolutional networks
[16] reasonet: learning to stop reading in machine comprehension
[17] gated-attention readers for text comprehension
[18] knowledge graph embedding via dynamic mapping matrix
[19] canonical tensor decomposition for knowledge base completion
[20] translating embeddings for knowledge graph completion with relation attention mechanism
[21] long short-term memory
[22] learning phrase representations using rnn encoder-decoder for statistical machine translation
[23] rethinking the inception architecture for computer vision
[24] observed versus latent features for knowledge base and text inference
[25] statistical predicate invention
[26] end-to-end differentiable proving
[27] differentiable learning of logical rules for knowledge base reasoning
[28] go for a walk and arrive at the answer: reasoning over paths in knowledge bases using reinforcement learning
key: cord-011400-zyjd9rmp authors: peixoto, tiago p.
title: network reconstruction and community detection from dynamics date: 2019-09-18 journal: nan doi: 10.1103/physrevlett.123.128301 sha: doc_id: 11400 cord_uid: zyjd9rmp we present a scalable nonparametric bayesian method to perform network reconstruction from observed functional behavior that at the same time infers the communities present in the network. we show that the joint reconstruction with community detection has a synergistic effect, where the edge correlations used to inform the existence of communities are also inherently used to improve the accuracy of the reconstruction which, in turn, can better inform the uncovering of communities. we illustrate the use of our method with observations arising from epidemic models and the ising model, both on synthetic and empirical networks, as well as on data containing only functional information. the observed functional behavior of a wide variety of large-scale systems is often the result of a network of pairwise interactions. however, in many cases, these interactions are hidden from us, either because they are impossible to measure directly, or because their measurement can be done only at significant experimental cost. examples include the mechanisms of gene and metabolic regulation [1], brain connectivity [2], the spread of epidemics [3], systemic risk in financial institutions [4], and influence in social media [5]. in such situations, we are required to infer the network of interactions from the observed functional behavior.
researchers have approached this reconstruction task from a variety of angles, resulting in many different methods, including thresholding the correlation between time series [6], inversion of deterministic dynamics [7][8][9], statistical inference of graphical models [10][11][12][13][14] and of models of epidemic spreading [15][16][17][18][19][20], as well as approaches that avoid explicit modeling, such as those based on transfer entropy [21], granger causality [22], compressed sensing [23][24][25], generalized linearization [26], and matching of pairwise correlations [27, 28]. in this letter, we approach the problem of network reconstruction in a manner that is different from the aforementioned methods in two important ways. first, we employ a nonparametric bayesian formulation of the problem, which yields a full posterior distribution of possible networks that are compatible with the observed dynamical behavior. second, we perform network reconstruction jointly with community detection [29], where, at the same time as we infer the edges of the underlying network, we also infer its modular structure [30]. as we will show, while network reconstruction and community detection are desirable goals on their own, joining these two tasks has a synergistic effect, whereby the detection of communities significantly increases the accuracy of the reconstruction, which in turn improves the discovery of the communities, when compared to performing these tasks in isolation. some other approaches combine community detection with functional observation. berthet et al. [31] derived necessary conditions for the exact recovery of group assignments for dense weighted networks generated with community structure given observed microstates of an ising model. hoffmann et al.
[32] proposed a method to infer community structure from time-series data that bypasses network reconstruction by employing a direct modeling of the dynamics given the group assignments instead. however, neither of these approaches attempts to perform network reconstruction together with community detection. furthermore, they are tied down to one particular inverse problem, and as we will show, our general approach can be easily extended to an open-ended variety of functional models. bayesian network reconstruction.—we approach the network reconstruction task similarly to the situation where the network edges are measured directly, but via an uncertain process [33, 34]: if d is the measurement of some process that takes place on a network, we can define a posterior distribution for the underlying adjacency matrix a via bayes' rule, p(a|d) = p(d|a)p(a)/p(d), where p(d|a) is an arbitrary forward model for the dynamics given the network, p(a) is the prior information on the network structure, and p(d) = Σ_a p(d|a)p(a) is a normalization constant comprising the total evidence for the data d. we can unite reconstruction with community detection via an, at first, seemingly minor, but ultimately consequential modification of the above equation, where we introduce a structured prior p(a|b), with b representing the partition of the network into communities, i.e., b = {b_i}, where b_i ∈ {1, …, B} is the group membership of node i. this partition is unknown, and is inferred together with the network itself, via the joint posterior distribution p(a, b|d) = p(d|a)p(a|b)p(b)/p(d). the prior p(a|b) is an assumed generative model for the network structure. in our work, we will use the degree-corrected stochastic block model (dc-sbm) [35], which assumes that, besides differences in degree, nodes belonging to the same group have statistically equivalent connection patterns, according to a joint probability with λ_rs determining the average number of edges between groups r and s and κ_i the average degree of node i.
the marginal prior is obtained by integrating over all remaining parameters weighted by their respective prior distributions, which can be computed exactly for standard prior choices, although it can be modified to include hierarchical priors that have an improved explanatory power [36] (see supplemental material [37] for a concise summary). the use of the dc-sbm as a prior probability in eq. (2) is motivated by its ability to inform link prediction in networks where some fraction of edges have not been observed or have been observed erroneously [34, 39]. the latent conditional probabilities of edges existing between groups of nodes are learned by the collective observation of many similar edges, and these correlations are leveraged to extrapolate the existence of missing or spurious ones. the same mechanism is expected to aid the reconstruction task, where edges are not observed directly, but the observed functional behavior yields a posterior distribution on them, allowing the same kind of correlations to be used as an additional source of evidence for the reconstruction, going beyond what the dynamics alone says. our reconstruction approach is finalized by defining an appropriate model for the functional behavior, determining p(d|a). here, we will consider two kinds of indirect data. the first comes from a susceptible-infected-susceptible (sis) epidemic spreading model [40], where σ_i(t) = 1 means node i is infected at time t, and 0 otherwise. the likelihood for this model is built from the transition probability for node i at time t, with f(p, σ) = (1 − p)^σ p^(1−σ), and where m_i(t) = Σ_j a_ij ln(1 − τ_ij) σ_j(t) is the contribution from all neighbors of node i to its infection probability at time t. in the equations above, the value τ_ij is the probability of an infection via an existing edge (i, j), and γ is the 1 → 0 recovery probability.
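the infection term above has a direct reading: a susceptible node i becomes infected with probability 1 − exp(m_i(t)), i.e. one minus the probability of escaping infection from every currently infected neighbor. a minimal sketch with a toy 3-node graph (the adjacency matrix and τ values are illustrative, not inferred ones):

```python
import math

def infection_prob(i, sigma, A, tau):
    """P(node i becomes infected at t+1) = 1 - exp(m_i(t)), where
    m_i(t) = sum_j A_ij * ln(1 - tau_ij) * sigma_j(t)."""
    m = sum(A[i][j] * math.log(1.0 - tau[i][j]) * sigma[j]
            for j in range(len(sigma)))
    return 1.0 - math.exp(m)

A = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]        # node 0 linked to nodes 1 and 2
tau = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.0], [0.5, 0.0, 0.0]]
sigma = [0, 1, 1]                             # both neighbors of node 0 infected
p = infection_prob(0, sigma, A, tau)
assert abs(p - 0.75) < 1e-12                  # 1 - (1 - 0.5) * (1 - 0.5)
```

the likelihood of a full epidemic trace is then a product of such per-node, per-step terms, which is what the posterior over a and τ is built from.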
with these additional parameters, the full posterior distribution for the reconstruction becomes p(a, b, τ|σ) ∝ p(σ|a, τ) p(τ) p(a|b) p(b). since τ_ij ∈ [0, 1], we use the uniform prior p(τ) = 1. note also that the recovery probability γ plays no role in the reconstruction algorithm, since its term in the likelihood does not involve a and, hence, cancels out in the posterior normalization. this means that the above posterior only depends on the infection events 0 → 1 and, thus, is also valid without any modifications for all epidemic variants: susceptible-infected (si), susceptible-infected-recovered (sir), susceptible-exposed-infected-recovered (seir), etc. [40], since the infection events occur with the same probability for all these models. the second functional model we consider is the ising model, where spin variables on the nodes s ∈ {−1, 1}^n are sampled according to the joint distribution p(s|a, β, j, h) = exp(β Σ_{i<j} a_ij j_ij s_i s_j + β Σ_i h_i s_i) / z(a, β, j, h), where β is the inverse temperature, j_ij is the coupling on edge (i, j), h_i is a local field on node i, and z(a, β, j, h) is the normalizing partition function. a correlation-thresholding baseline sets a_ij = 1 if c_ij > c*, and 0 otherwise; the value of c* was chosen to maximize the posterior similarity, which represents the best possible reconstruction achievable with this method. nevertheless, the network thus obtained is severely distorted. the inverse correlation method comes much closer to the true network, but is superseded by the joint inference with community detection. empirical dynamics.—we turn to the reconstruction from observed empirical dynamics with unknown underlying interactions. the first example is the sequence of m = 619 votes of n = 575 deputies in the 2007 to 2011 session of the lower chamber of the brazilian congress. each deputy voted yes, no, or abstained on each piece of legislation, which we represent as {1, −1, 0}, respectively. since the temporal ordering of the voting sessions is likely to be of secondary importance to the voting outcomes, we assume the votes are sampled from an ising model [the addition of zero-valued spins changes eq.
(9) only slightly, by replacing 2cosh(x) → 1 + 2cosh(x)]. figure 4 shows the result of the reconstruction, where the division of the nodes uncovers a cohesive government and a split opposition, as well as a marginal center group, which correlates very well with the known party memberships and can be used to predict unseen voting behavior (see supplemental material [37] for more details). in fig. 5, we show the result of the reconstruction of the directed network of influence between n = 1833 twitter users from 58224 retweets [50], using an si epidemic model (the act of "retweeting" is modeled as an infection event, using eqs. (5) and (6) with γ = 0) and the nested dc-sbm. the reconstruction uncovers isolated groups with varying propensities to retweet, as well as groups that tend to influence a large fraction of users. by inspecting the geolocation metadata of the users, we see that the inferred groups correspond, to a large extent, to different countries, although clear subdivisions indicate that this is not the only factor governing the influence among users (see supplemental material [37] for more details). conclusion.—we have presented a scalable bayesian method to reconstruct networks from functional observations that uses the sbm as a structured prior and, hence, performs community detection together with reconstruction. the method is nonparametric and, hence, requires no prior stipulation of aspects of the network and size of the model, such as the number of groups. by leveraging inferred correlations between edges, the sbm includes an additional source of evidence and thereby improves the reconstruction accuracy, which in turn also increases the accuracy of the inferred communities. the overall approach is general, requiring only appropriate functional model specifications, and can be coupled with an open-ended variety of such models other than those considered here.
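for illustration, microstates of the ising functional model defined earlier can be generated with single-spin gibbs updates, one simple way to produce synthetic observations for reconstruction experiments (a hedged sketch with toy couplings; the letter itself does not prescribe this particular sampler):

```python
import math
import random

def gibbs_sweep(s, A, J, h, beta, rng):
    """One sweep of single-spin Gibbs updates for the Ising model:
    P(s_i = +1 | rest) = 1 / (1 + exp(-2 * beta * (sum_j A_ij J_ij s_j + h_i)))."""
    n = len(s)
    for i in range(n):
        field = sum(A[i][j] * J[i][j] * s[j] for j in range(n) if j != i) + h[i]
        p_up = 1.0 / (1.0 + math.exp(-2.0 * beta * field))
        s[i] = 1 if rng.random() < p_up else -1
    return s

rng = random.Random(0)
A = [[0, 1], [1, 0]]; J = [[0.0, 1.0], [1.0, 0.0]]; h = [0.0, 0.0]
s = gibbs_sweep([1, -1], A, J, h, beta=2.0, rng=rng)
assert all(v in (-1, 1) for v in s)
```

repeated sweeps sample from the joint distribution above; a collection of such microstates is the kind of indirect data d from which the posterior over a, b, and the couplings is computed.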
[51, 52] for details on the layout algorithm), and the edge colors indicate the infection probabilities τ ij as shown in the legend. the text labels show the dominating country membership for the users in each group. inferring gene regulatory networks from multiple microarray datasets dynamic models of large-scale brain activity estimating spatial coupling in epidemiological systems: a mechanistic approach bootstrapping topological properties and systemic risk of complex networks using the fitness model the role of social networks in information diffusion network inference with confidence from multivariate time series revealing network connectivity from response dynamics inferring network topology from complex dynamics revealing physical interaction networks from statistics of collective dynamics learning factor graphs in polynomial time and sample complexity reconstruction of markov random fields from samples: some observations and algorithms, in approximation, randomization and combinatorial optimization. 
algorithms and techniques which graphical models are difficult to learn estimation of sparse binary pairwise markov networks using pseudo-likelihoods inverse statistical problems: from the inverse ising problem to data science inferring networks of diffusion and influence on the convexity of latent social network inference learning the graph of epidemic cascades statistical inference approach to structural reconstruction of complex networks from binary time series maximum-likelihood network reconstruction for sis processes is np-hard network reconstruction from infection cascades escaping the curse of dimensionality in estimating multivariate transfer entropy causal network inference by optimal causation entropy reconstructing propagation networks with natural diversity and identifying hidden sources efficient reconstruction of heterogeneous networks from time series via compressed sensing robust reconstruction of complex networks from sparse data universal data-based method for reconstructing complex networks with binary-state dynamics reconstructing weighted networks from dynamics reconstructing network topology and coupling strengths in directed networks of discrete-time dynamics community detection in networks: a user guide bayesian stochastic blockmodeling exact recovery in the ising blockmodel community detection in networks with unobserved edges network structure from rich but noisy data reconstructing networks with unknown and heterogeneous errors stochastic blockmodels and community structure in networks nonparametric bayesian inference of the microcanonical stochastic block model for summary of the full generative model used, details of the inference algorithm and more information on the analysis of empirical data efficient monte carlo and greedy heuristic for the inference of stochastic block models missing and spurious interactions and the reconstruction of complex networks epidemic processes in complex networks spatial interaction and the statistical 
key: cord-026336-xdymj4dk authors: ranjan, rajesh title: temporal dynamics of covid-19 outbreak and future projections: a data-driven approach date: 2020-06-06 journal: trans indian natl doi: 10.1007/s41403-020-00112-y sha: doc_id: 26336 cord_uid: xdymj4dk long-term predictions for an ongoing epidemic are typically performed using epidemiological models that predict the timing of the peak in infections followed by its decay using non-linear fits from the available data. the curves predicted by these methods typically follow a gaussian distribution with a decay rate of infections similar to the climbing rate before the peak. however, as seen from the recent covid-19 data from the us and european countries, the decay in the number of infections is much slower than their increase before the peak. therefore, the estimates of the final epidemic size from these models are often underpredicted. in this work, we propose two data-driven models to improve the forecasts of the epidemic during its decay.
these two models use gaussian and piecewise-linear fits of the infection rate, respectively, during the deceleration phase, if available, to project the future course of the pandemic. for countries that are not yet in the decline phase, these models use the peak predicted by epidemiological models but correct the infection rate to incorporate a realistic slow decline based on the trends from the recent data. finally, a comparative study of predictions using both epidemiological and data-driven models is presented for a few most affected countries. in recent days, coronavirus disease 2019 (covid-19) has emerged as an unprecedented challenge before the world. this disease is caused by a novel coronavirus sars-cov-2, for which there is no specific medication or vaccine approved by medical authorities. this disease is transmitted by inhalation or contact with infected droplets or fomites, and the incubation period may range from 2 to 14 days (wu and mcgoogan 2020). this disease can be fatal to elderly patients (about 27% for 60+ age groups) and those with underlying co-morbid conditions. as of may 15, 2020, there have been about 4.6 million confirmed cases of covid-19 and about 300,000 reported deaths globally. a realistic estimate of the intensity and temporal distribution of this epidemic can be beneficial to design key strategies to regulate the quarantine as well as to prepare for the social and economic consequences of lockdown. however, as seen from the recent literature (roda et al. 2020), the predictions by epidemiological models for an ongoing spread are often unreliable, as they do not accurately capture the dynamics of covid-19 in the absence of established parameters. in this work, we propose data-driven models for covid-19 decay based purely on the characteristics of covid-19 spread, which thus include the effects of lockdown and other key factors.
for subsequent discussions, the default year is 2020 and all the statistics are based on data till may 15, 2020, unless otherwise specified. first, we examine the dynamics of covid-19 before and after the lockdown, as shown in fig. 1a. the abscissa indicates the days shifted by the date when the lockdown was imposed or other intervention measures were taken (see list in ranjan 2020b). thus, the four phases indicate: (1) early slow epidemic growth ( t < t −1 ), (2) initial exponential growth ( t −1 < t < t 0 ) typical of an epidemic, (3) continuing exponential growth during lockdown based on the incubation period of sars-cov-2 ( t 0 < t < t 1 ≡ t 0 + 14 ) and (4) expected deceleration phase ( t > t 1 ). in fig. 1a, both china and south korea (sk) show a very rapid arrest of the covid-19 growth post interventions ( t > t 1 ), while other countries display just a slowdown, evidenced by the change in slope. further, the growth curves for india and russia are much more rapid in this phase compared to those for other countries. the differences in covid-19 spread among geographical regions after the lockdown can be better visualized on a linear scale, as shown in fig. 2a. most of the countries considered in the figure took social distancing measures before the end of march, so it is expected that the effects of interventions should become visible latest by mid-april. both the us and the uk exhibit linear growth in this period, while other european countries show initial linear growth followed by a slow flattening (ansumali and prakash 2020). the curves for india and russia are closer to exponential. the very different trends of the curves after the lockdown indicate a disparity in compliance levels of social distancing measures. for example, in the us, each state follows its own norm of intervention, and social distancing measures are imposed on different dates.
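the four-phase segmentation described above can be sketched as a small classifier; this is only an illustrative sketch, and the dates `t_minus1` and `t0` below are placeholders (march 25, 2020 is india's lockdown date, while `t_minus1` is a hypothetical marker, not a value from the paper):

```python
from datetime import date, timedelta

def classify_phase(day, t_minus1, t0, incubation_days=14):
    """Classify a calendar date into the four phases described in the text.
    t_minus1 marks the onset of clear exponential growth, t0 the lockdown
    date; phase 3 lasts one incubation period (taken here as 14 days)."""
    t1 = t0 + timedelta(days=incubation_days)
    if day < t_minus1:
        return 1   # early slow epidemic growth
    if day < t0:
        return 2   # initial exponential growth
    if day < t1:
        return 3   # exponential growth continuing during lockdown
    return 4       # expected deceleration phase

# illustrative dates only
t_minus1, t0 = date(2020, 3, 4), date(2020, 3, 25)
print([classify_phase(d, t_minus1, t0)
       for d in (date(2020, 3, 1), date(2020, 3, 10), date(2020, 4, 1), date(2020, 4, 20))])
```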
an implication of this is that when the initially most impacted states like new york and new jersey started showing signs of flattening in late april, other states like illinois, massachusetts and california displayed surges in the number of cases, thereby keeping the overall growth in the us on a linear course. the case of india is also compelling, as it first displayed a strong impact of lockdown (close to 70% compliance, as suggested in ranjan 2020a) despite a few local outbreaks. this led to a linear growth for some time, but an escalation of cases in early may put india on a near exponential course. to further examine this, we plot the covid-19 distribution in key affected states in india in fig. 1b. the time-series data is divided into four periods, with the first three being before, during and after lockdown, similar to that in fig. 1a, and the last one from may 2 (green shade), when a surge in the number of cases in many states put india onto the exponential course. from fig. 1b, we note a varying distribution of covid-19 among indian states, much like in the us: just four states - maharashtra, gujarat, tamilnadu, and new delhi - contribute to about 70% of the total cases. among these, the most affected states, maharashtra and gujarat, are on the course of exponential growth. delhi, tamilnadu and west bengal show an initial arrest of the growth (blue shade), followed by later and more recent local outbreaks as marked by a discontinuity in the slope (see green shade in inset). several other states, including uttar pradesh, kerala and karnataka, display good control over the epidemic, while the remaining states are in the linear regime throughout from t 1 (beginning of the blue shade). since the predictive models depend significantly on data, an important aspect to consider is that the number of reported infections does not truly reflect the actual outbreak of covid-19.
the data on infection rate are often limited by the countries' testing capability, which in turn is related to the availability of testing kits, the number of healthcare professionals per population, and infrastructure. further, the asymptomatic population is often excluded in the testing strategies adopted by most countries. to elucidate this, we show the daily number of tests for key countries in fig. 2b. we generally note that the increase in the number of tests with time is very closely related to the infections shown in fig. 2a, as expected. a small number of reported cases for india in march could be due to inadequate testing at that time. therefore the predictions by models using data from that period had considerable uncertainty (ranjan 2020b; singh and adhikari 2020). we briefly discuss the implication of rigorous testing in covid-19 control by carefully examining the south korean data, shown in the inset in fig. 2b. unlike most countries, where the number of tests increased slowly during the initial phase of the outbreak, the reverse is seen for south korea. the response of sk to the outbreak was quick, and it ran the most comprehensive and well-organized testing program in the world from february (fig. 2b), when the outbreak was still not severe. this, combined with large-scale efforts to isolate infected people and trace and quarantine their contacts, led to successful control of the outbreak. for comparison, sk, the us, and india respectively had 14500, 33000 and 1550 tests per million inhabitants on may 15. a final but most important factor affecting the outbreak and the predictions is the epidemiology of covid-19 in different geographical regions. the values of epidemiological parameters such as transmission rate, recovery rate, and basic reproduction number depend on many social and environmental factors and are dissimilar in different regions.
for an ongoing outbreak, the epidemiology is not fully established, but available data can provide meaningful insights. we report three characteristic ratios: positivity ratio (pr), case recovery ratio (crr) and case fatality ratio (cfr), to roughly correlate with the epidemiological parameters: the rates of infection, recovery, and mortality respectively (hethcote 2020). pr is the total number of infections for a given number of tests. crr and cfr are, respectively, the number of recovered and deceased cases as a fraction of total infections. these values are reported in percentages in table 1. the relatively low cfr for india may be attributed to a large proportion of the young in the total population, and possible immunity due to bcg vaccinations (curtis et al. 2020) and malarial infections (goswami et al. 2020). crrs for germany ( ≃ 87% ) and spain ( ≃ 69% ) are the highest, but it is expected that the value of crr in countries currently in the acceleration phase will improve with time. the case fatality ratio is very high for france ( ≃ 15% ), the uk ( ≃ 14% ) and italy ( ≃ 14% ) compared to the world average of 2-3%. a high ratio may be due to a higher percentage of elderly population in these countries. it is clear from the above discussion that the epidemiologies of covid-19, as well as the impact of social distancing, are very dissimilar for different countries. further, there is an inhomogeneous covid-19 spread within a country, as seen for the us and india. all of these factors make the modeling of this epidemic during its progress very challenging. typically, epidemiological models such as a logistic or a compartmental model are preferred for modeling the later stages. however, these models are highly dependent on initial conditions and underlying unknown epidemiological parameters, incorrect estimation of which can give completely different results. a further concern with these models is the prediction of the decay rate of infections, which is generally high compared to the recent trends (ranjan 2020a).
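the three characteristic ratios defined above (pr, crr, cfr) amount to simple percentages; a minimal helper, with invented counts for illustration (not values from table 1):

```python
def characteristic_ratios(infections, tests, recovered, deaths):
    """Return (PR, CRR, CFR) in percent: PR = infections per test,
    CRR = recovered per infection, CFR = deaths per infection."""
    pr = 100.0 * infections / tests
    crr = 100.0 * recovered / infections
    cfr = 100.0 * deaths / infections
    return pr, crr, cfr

# invented counts for illustration only
pr, crr, cfr = characteristic_ratios(infections=50_000, tests=1_000_000,
                                     recovered=20_000, deaths=1_500)
print(pr, crr, cfr)
```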
therefore, in this work, we propose two data-driven models for the predictions that incorporate the slow decay of the epidemic post-lockdown and provide more realistic estimates. projections for key affected countries are presented using these data-driven as well as epidemiological models. covid-19 data used in this study are taken from various sources. modeling is based on the time-series data from the johns hopkins university coronavirus data stream, which combines world health organization (who) and centers for disease control and prevention (cdc) case data. data on tests are taken from the 'our world in data' source, which compiles data from the european centre for disease prevention and control (ecdc). time-series data for indian states are taken from github.com/covid19india. typically, for an ongoing epidemic, epidemiological models estimate the underlying parameters based on a fit from the available data and then use simple ordinary differential equations to predict the day of the peak and the decay rate. to illustrate the limitations of these models, we show predictions for italy using a logistic model (ma 2020), as well as two compartmental epidemiological models - sir (hethcote 2020) and generalized seir (seiqrdp) (peng et al. 2020) - in fig. 3a. open-source matlab codes developed by batista (2020) and cheynet (2020) are used for the sir and seiqrdp models respectively. as the epidemic has passed its peak in italy, a key parameter for estimation is the decay rate. a close examination of the daily cases in fig. 3a shows that all three models predict a faster decay rate, as the curve is nearly symmetric around the peak. this distribution leads to an under-prediction of the size as well as the duration of the epidemic. although not shown for every geographical region considered in this paper, this is true for most of the predictions. we shall describe the first data-driven model to improve the predictions. as shown in fig.
3a, the infection rate (daily cases) predicted by epidemiological models follows a nearly normal distribution, i(t) = a exp( −(t − μ)² / 2σ² ), where μ represents the day with the peak number of cases, σ is the spread around this day from the beginning of the epidemic to the end, and a is a constant that determines the number of cases. because of the nearly symmetric distribution of the curve, the decline rate is typically predicted as the negative of the climb rate. to make the predictions closer to actual values in the deceleration phase, we introduce a new parameter λ, which changes the variance of this distribution after the peak to make the decline rate more realistic. hence, the new distribution in this modified gaussian decay model (mgdm) is i(t) = c a exp( −(t − μ)² / 2(λσ)² ) for t > μ, where the pre-multiplication factor c ensures that the number of infections on the peak day remains unaltered. the dash-dot magenta curve in fig. 3a shows the distribution of mgdm. the infection rate, in this case, is closer to the actual values during decay and generally provides the upper limit of estimated total cases, as seen from the difference in the cumulative cases. parameters for the fitted normal distribution with r² = 0.9999, rmse = 4.09 and 95% confidence bounds are: a = 5273 (5272, 5275), μ = −5.572 (−5.723, −5.421), σ = 248.1 (248, 248.2). the modifications due to data give λ = 1.4, c = 0.996. the final epidemic size by this model as well as the sir and seiqrdp models are given in table 1. note that a gaussian fit of the infection rate can be directly used in mgdm for regions with sufficient data in the deceleration phase, and a prediction from an epidemiological model is not necessary. we shall now discuss the second data-driven model. as discussed earlier, recent trends indicate that the lockdown arrests the initial exponential growth, but a linear regime persists after that, and then a prolonged decay follows.
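the mgdm correction described above can be sketched as follows; the fitted parameters `A`, `mu`, `sigma` here are invented for illustration (only `lam = 1.4` echoes a value quoted in the text), and the pre-multiplication factor is taken as 1 for simplicity:

```python
import math

def gaussian_rate(t, A, mu, sigma):
    """Symmetric Gaussian fit of the daily infection rate: A exp(-(t-mu)^2 / (2 sigma^2))."""
    return A * math.exp(-(t - mu) ** 2 / (2.0 * sigma ** 2))

def mgdm_rate(t, A, mu, sigma, lam):
    """Modified Gaussian decay model: after the peak day mu the spread widens
    to lam * sigma, slowing the predicted decline; before the peak the fit is
    unchanged, so the peak-day value is unaltered."""
    s = sigma if t <= mu else lam * sigma
    return A * math.exp(-(t - mu) ** 2 / (2.0 * s ** 2))

# hypothetical fitted parameters
A, mu, sigma, lam = 5000.0, 60.0, 15.0, 1.4
# 30 days after the peak the MGDM decline is slower than the symmetric fit
print(gaussian_rate(90, A, mu, sigma), mgdm_rate(90, A, mu, sigma, lam))
```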
we propose that this decay can be modeled better with several linear segments than with an exponential or a gaussian curve, which gives a fast decline with a relatively small tail. the piecewise-linear decay model (pldm) incorporates these dynamics. the cumulative data in the deceleration phase is collected and then divided into equal segments. an optimal piecewise linear fit in a least-squares sense is then obtained. the slopes of these linear fits are m i , where i = 1, ..., n , n being the number of segments. the ratios of consecutive slopes are then computed, κ i = m i+1 /m i , with κ i < 1, and a slope factor κ̄ is calculated by taking an average of the last three ratios. this factor is then used to predict the slope of the next future segment, such as m n+1 = κ̄ m n and so on. figure 3a includes the predictions from this model for italy. data during the decay phase, between mar 21 and may 15, have been divided into five equal segments of eleven days. the modeling gives κ̄ = 0.6734, which is used to predict the slopes of future linear segments of the same sizes (see bottom panel in the figure). as evident from fig. 3a, while the predictions of the final estimated size from both mgdm and pldm are similar, pldm predicts the cumulative curve more closely and has a more gradual decay. table 1 shows the projections for key countries using both the epidemiological (sir and seiqrdp) as well as data-driven models until the middle of august. though not shown for individual cases, it is ensured in every case that the logistic fits are statistically significant with r² > 0.98 and p-value < 0.0001. parameters of the data-driven models ( λ, c in mgdm, and κ̄, m n in pldm) are directly obtained from the data in the deceleration phase when available. for countries where the infection rate is still growing, predictions from the seiqrdp model are used as a baseline, and parameters of the data-driven models are taken from the fit used for italy. as expected, there is higher uncertainty for these countries.
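the pldm slope extrapolation can be sketched as below; the segment slopes are invented for illustration, and only the averaging of the last three consecutive slope ratios follows the description in the text:

```python
def pldm_project(segment_slopes, n_future):
    """Piecewise-linear decay model sketch: given slopes of equal-length linear
    segments fitted to cumulative cases in the decay phase, average the last
    three consecutive slope ratios and use that factor to extrapolate the
    slopes of n_future upcoming segments."""
    ratios = [m2 / m1 for m1, m2 in zip(segment_slopes, segment_slopes[1:])]
    factor = sum(ratios[-3:]) / len(ratios[-3:])
    future, m = [], segment_slopes[-1]
    for _ in range(n_future):
        m *= factor
        future.append(m)
    return factor, future

# illustrative segment slopes (new cases per day), not data from the paper
observed = [900.0, 700.0, 520.0, 380.0, 270.0]
factor, future = pldm_project(observed, n_future=3)
print(round(factor, 3), [round(m, 1) for m in future])
```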
all european countries in table 1 except the uk are already in the decline phase, and show a good convergence of epidemic sizes, i.e., predictions from the epidemiological models are not very different, as shown in the case of italy (fig. 3a). likewise, estimates from both the data-driven models are very close and are higher than those from epidemiological models, as expected. both mgdm and pldm suggest the equilibrium to be expected towards the end of july. predictions for these regions using the data-driven models are fairly reliable provided there is no new outbreak. for the us, there is an uncertainty due to fluctuations in the recent data, which in turn is due to different epidemiology and a differential impact of the stay-at-home order among different states (ranjan 2020a). for the uk, the epidemic is in the linear growth stage with mild signs of decay recently. therefore, there is high unreliability in the prediction of the peak. nevertheless, forecasts by different models in these cases can provide an estimate of the expected range. for india and russia, the growths are still close to exponential, and therefore there is a significant disparity in predictions by different epidemiological models. figure 3b illustrates the uncertainty in such cases by showing the projections for india by different models. while the logistic and sir models predict the peak close to each other, the seiqrdp model shows continuing growth till the middle of june before the decline begins. this difference leads to a significantly higher epidemic size with seiqrdp (0.66 million) than those with the logistic (0.17 million) and sir (0.22 million) models. a critical difference between the sir and seiqrdp models implemented here is that in sir, the population n considered is just the number of susceptible persons before the outbreak, while in seiqrdp the entire population of the region is taken as the population size.
as results from the seiqrdp model are used as a baseline for the data-driven models, estimates from the latter are also in the higher range. these models are then used for statewise projections in india. table 2 gives the lower and upper range of the estimated epidemic size as calculated by all the models. as expected, the highest contribution comes from four key states - maharashtra, gujarat, delhi and tamilnadu - which are on an exponential growth course (fig. 1b). also, the projections have the highest uncertainty in these regions among the states listed in the table. if these states can control the epidemic and new outbreaks do not appear in other states, it is expected that the optimistic scenario for india shown by the sir model in fig. 3b can be realized. the final epidemic size of the entire world is difficult to estimate without getting individual estimates for all the countries. this is because the global trend of total infections is still in an accelerating stage, with new countries (brazil, peru, canada) reporting a surge in the number of cases. epidemiological models such as logistic and compartmental models are generally used to predict the total size and duration of covid-19. however, these models generally do not account for the precise change in dynamics due to different interventions, or a new outbreak, and therefore estimate unrealistic epidemic sizes. we show that the covid-19 curves for different countries after the lockdown are very dissimilar, with four primary behaviors: linear, exponential, and slow and fast flattening. further, within a country, the characteristics of spread among states may be different. therefore, to account for differences in dynamics, a locally data-driven approach for modeling may be more suitable. two data-driven models for the decay of covid-19 based on recent trends - one based on a skewed gaussian distribution and the other on a piecewise linear fit - are proposed.
these models generally provide a more realistic estimate of the epidemic size than epidemiological models for regions in the deceleration phase, with the piecewise linear model predicting a more gradual decay. for countries (like india and russia) still in the growth stage, these data-driven models use predictions from epidemiological models as a baseline and impose corrections, using parameters obtained from available data, with a realistic decline rate. the uncertainty in predictions for such cases is higher. the paper also highlights that the reported data on infections is not an accurate representation of the actual outbreak, and is limited by the testing capacity. therefore, estimations given by these models could still be optimistic and should be used with caution. a periodic evaluation of the characteristics of covid-19 spread, and thus a revision of projections, is necessary.
a very flat peak: exponential growth phase of covid-19 is mostly followed by a prolonged linear growth phase, not an immediate saturation
fitviruscovid19, matlab central file exchange
generalized seir epidemic model (fitting and computation)
considering bcg vaccination to reduce the impact of covid-19
interaction between malarial transmission and bcg vaccination with covid-19 incidence in the world map: a changing landscape human immune system? medrxiv
the mathematics of infectious diseases
the reproductive number of covid-19 is higher compared to sars coronavirus
estimating epidemic exponential growth rate and basic reproduction number
effective transmission across the globe: the role of climate in covid-19 mitigation strategies
epidemic analysis of covid-19 in china by dynamical modeling
estimating the final epidemic size for covid-19 outbreak using improved epidemiological models
predictions for covid-19 outbreak in india using epidemiological models. medrxiv
why is it difficult to accurately predict the covid-19 epidemic?
age-structured impact of social distancing on the covid-19 epidemic in india
characteristics of and important lessons from the coronavirus disease 2019 (covid-19) outbreak in china: summary of a report of 72 314 cases from the chinese center for disease control and prevention
prevalence of comorbidities in the novel wuhan coronavirus (covid-19) infection: a systematic review and meta-analysis
the author would like to thank prof. datta gaitonde for his support and encouragement during these unprecedented times. the critical inputs from prof. roddam narasimha in improving the manuscript are gratefully acknowledged. contributions from dr. sudheendra n r rao (scientific advisor, organization for rare
key: cord-125330-jyppul4o authors: crokidakis, nuno; sigaud, lucas title: modeling the evolution of drinking behavior: a statistical physics perspective date: 2020-08-24 journal: nan doi: nan sha: doc_id: 125330 cord_uid: jyppul4o in this work we study a simple compartmental model for drinking behavior evolution. the population is divided in 3 compartments regarding their alcohol consumption, namely susceptible individuals s (nonconsumers), moderated drinkers m and risk drinkers r. the transitions among those states are ruled by probabilities. despite the simplicity of the model, we observed the occurrence of two distinct nonequilibrium phase transitions to absorbing states. one of these states is composed only of susceptible individuals s, with no drinkers ($m=r=0$). on the other hand, the other absorbing state is composed only of risk drinkers r ($s=m=0$). between these two steady states, we have the coexistence of the three subpopulations s, m and r. comparison with abusive alcohol consumption data for brazil shows a good agreement between the model's results and the database. epidemic models have been widely used to study contagion processes such as the spread of infectious diseases [1] and rumors [2].
this kind of model has also been used for the spread of social habits, such as the smoking habit [3], cocaine [4] and alcohol consumption [5], obesity [6], corruption [7], cooperation [8], ideological conflicts [9], and also for other problems like the rise/fall of ancient empires [10], the dynamics of tax evasion [11] and radicalization phenomena [12]. the main reason such social behaviors can be modelled by contagion processes is the response by elements of the ensemble to the social context of the studied subject. both social or peer pressure and positive reinforcement from other agents, regardless of whether the behavior brings positive or negative consequences to the individual, can influence each one's way of life. therefore, models for the epidemics of infectious diseases are also able to describe the spread of such tendencies, like alcoholism [13, 14]. the standard medical way of categorizing alcohol consumption [15] is in three groups - nonconsumers, moderate (or social) consumers and risk (or excessive) consumers; thus, the modeling of the interactions and consequent changes of an individual from one group to another is governed by interaction parameters. one interesting aspect that should be taken into consideration when modeling alcohol consumption is the tendency of some individuals to gradually increase their consumption rate, not due to social susceptibility, but when under stressful or depressing circumstances, since alcohol plays a major role both as cause and consequence of depression, for instance [16]. this means that one can attach a probability of a moderate drinker becoming an excessive drinker that is dependent only on the actual moderate drinkers population size, instead of on the two population groups involved in the change.
if one considers the current world situation with the recent coronavirus disease 2019 outbreak (covid-19), this self-induced increase in alcohol consumption is not only realistic, but also becomes more prominent - this has been observed in a myriad of studies this year detailing the consequences and dangers of both alcohol withdrawal (in places where it has become harder to legally acquire alcohol during the pandemic) and alcohol consumption increase [17, 18, 19, 20]. this work is organized as follows. in section 2, we present the model and define the microscopic rules that lead to its dynamics. the analytical and numerical results are presented in section 3, including comparisons with brazil's alcohol consumption data for a range of eleven years, used as a case study in order to evaluate the present model. finally, our conclusions are presented in section 4. our model is based on the proposal of references [5, 13, 14, 21, 22, 23, 24, 25], which treat alcohol consumption as a disease that spreads by social interactions. in such a case, we consider an epidemic-like model where the transitions among the compartments are governed by probabilities. in this work we consider homogeneous mixing, i.e., a fully-connected population of n individuals. this population is divided in 3 compartments, namely: • s: nonconsumer individuals, individuals that have never consumed alcohol or have consumed in the past and quit. in this case, we will call them susceptible individuals, i.e., susceptible to become drinkers, either again or for the first time; • m: nonrisk consumers, individuals with regular low consumption. we will call them moderated drinkers; • r: risk consumers, individuals with regular high consumption. we will call them risk drinkers. to be precise, a moderated drinker is a man who consumes less than 50 cc of alcohol every day or a woman who consumes less than 30 cc of alcohol every day.
on the other hand, a risk drinker is a man who consumes more than 50 cc of alcohol every day or a woman who consumes more than 30 cc of alcohol every day [5]. since we are considering a contagion model, the probabilities related to changes in agents' compartments represent the possible contagions. the transitions among compartments are as follows:
• s + m → m + m, with probability β;
• s + r → m + r, with probability β;
• m + r → r + r, with probability δ;
• m → r, with probability α;
• r + s → s + s, with probability γ.
in the above rules, β represents an "infection" probability, i.e., the probability that a consumer (m or r) individual turns a nonconsumer one into a drinker. the risk drinkers r can also "infect" the moderated m agents and turn them into risk drinkers r, which occurs with probability δ. these two infections occur by contagion in our model, where individuals belonging to a group with a higher degree of consumption can influence others to drink more via social contact. this transition m → r can also occur spontaneously, with probability α, if a given agent increases his/her alcohol consumption - this is the only migration pathway from one group to another, in this model, that does not depend on the population of the receiving compartment, since it corresponds to a self-induced progression from moderate (m) to risk (r) drinking. as stated in the introduction, above, the increase of alcohol consumption has been documented to occur under stressful circumstances (like the covid-19 pandemic) or clinical depression, regardless of social interaction with risk drinkers. finally, the probability γ represents the infection probability that turns risk drinkers r into susceptible agents s. in this case, it can represent the pressure of social contacts (family, friends, etc) over individuals that drink excessively.
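a minimal stochastic sketch of these microscopic rules under homogeneous mixing; the synchronous update and the mean-field contact probabilities are assumptions of this sketch, not necessarily the authors' implementation, and the parameter values are illustrative:

```python
import random

def step(S, M, R, beta, alpha, delta, gamma, rng):
    """One synchronous Monte Carlo update of the transition rules; each contact
    probability is proportional to the density of the influencing compartment."""
    N = S + M + R
    new_S, new_M, new_R = S, M, R
    for _ in range(S):                        # s -> m by contact with any drinker (prob. beta)
        if rng.random() < beta * (M + R) / N:
            new_S -= 1; new_M += 1
    for _ in range(M):                        # m -> r spontaneously (alpha) or by contact with r (delta)
        if rng.random() < alpha + delta * R / N:
            new_M -= 1; new_R += 1
    for _ in range(R):                        # r -> s by social pressure from nonconsumers (gamma)
        if rng.random() < gamma * S / N:
            new_R -= 1; new_S += 1
    return new_S, new_M, new_R

rng = random.Random(42)
S, M, R = 990, 10, 0
for _ in range(500):
    S, M, R = step(S, M, R, beta=0.07, alpha=0.03, delta=0.07, gamma=0.15, rng=rng)
print(S, M, R)  # every rule moves exactly one agent, so the total is conserved
```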
we did not take into account transitions from risk (r) to moderate (m), assuming that, as a rule, once an individual reaches a behavior of excessive consumption of alcohol, contact with moderate drinkers does not imply a tendency to lower one's consumption - meanwhile, it is assumed that contacts that do not drink at all are able to exert a higher pressure on them to quit drinking. it is not that the transition from risk to moderate cannot occur - it is just that for our model this probability, when comparing it with the overall picture, is negligible. for simplicity, we consider a fixed population, i.e., at each time step t we have the normalization condition s(t) + m(t) + r(t) = 1, where we defined the population densities s(t) = s(t)/n, m(t) = m(t)/n and r(t) = r(t)/n. since we will only deal with the relative proportions among the three different groups in relation to the total population n, i.e. the population densities, we will not take into account birth-mortality relations and populational increase/decrease effects. so, even if n is not a constant number, for all modelling purposes it will not matter, due to the fact that we will deal only with the s(t), m(t) and r(t) subpopulations in relation to the total population. one other way of looking at this approximation is to consider only the adult population as relevant to our modelling, and assume that new individuals coming of age correspond to the number of deaths [24, 25]. based on the microscopic rules defined in the previous subsection, one can write the master equations that describe the time evolution of the densities s(t), m(t) and r(t) as follows,
ds/dt = −β s (m + r) + γ s r ,   (1)
dm/dt = β s (m + r) − α m − δ m r ,   (2)
dr/dt = α m + δ m r − γ s r ,   (3)
and we also have the normalization condition
s(t) + m(t) + r(t) = 1 ,   (4)
valid at each time step t. first of all, one can analyze the early evolution of the population, for small times. considering the initial conditions s(0) ≈ 1, m(0) ≈ 1/n and r(0) = 0, one can linearize eq.
(2) to obtain dm/dt = α (r0 − 1) m, (5) which can be directly integrated to obtain m(t) = m0 e^(α(r0−1) t), where m0 = m(t = 0), and one can obtain the expression for the basic reproduction number, r0 = β/α. (6) as is usual in epidemic models [1, 27], the disease (alcoholism) will persist in the population if r0 > 1, i.e., for β > α. one can start by analyzing the time evolution of the three classes of individuals. we numerically integrated eqs. (1), (2) and (3) to study the effects of the variation of the model's parameters. as initial conditions, we considered s(0) = 0.99, m(0) = 0.01 and r(0) = 0, and for simplicity we fixed α = 0.03 and δ = 0.07, varying the parameters β and γ. in fig. 1 (a), (b) and (c) we exhibit results for fixed β = 0.07 and typical values of γ. one can see that the increase of γ causes the increase of s and the decrease of m and r. recall that γ models the persuasion of nonconsumers s in the social interactions with risk drinkers r, i.e., the social pressure of individuals who do not consume alcohol over their contacts (friends, relatives, etc.) who consume too much alcohol. on the other hand, in fig. 1 (d) we considered β = 0.15 and γ = 0.07. for this case, where we have β > γ, we see that the densities evolve in time, and in the steady state we observe the survival of only the risk drinkers, i.e., for t → ∞ we have s = m = 0 and r = 1. this last result will be discussed in more detail analytically in the following. as we observed in fig. 1, the densities s(t), m(t) and r(t) evolve in time, and after some time they stabilize. in such steady states, the time derivatives in eqs. (1) - (3) are zero. in the t → ∞ limit, eq. (1) gives us (−β m − β r + γ r) s = 0, where we denoted the stationary values as s = s(t → ∞), m = m(t → ∞) and r = r(t → ∞). this last equation has two solutions: one of them is s = 0, and from the other we can obtain a relation between r and m, m = (γ − β) r / β. (7) considering now the limit t → ∞ in eq. (2), one obtains β s r = (α + δ r − β s) m. (8)
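the time evolution described above can be reproduced numerically. the sketch below is our own code, not the authors': it integrates the master equations (1)-(3) as reconstructed in this section with scipy; the function names and integration tolerances are our choices, and the parameter values follow fig. 1.

```python
import numpy as np
from scipy.integrate import solve_ivp

def drinking_model(t, y, alpha, beta, gamma, delta):
    """right-hand side of the master equations (1)-(3) for the densities s, m, r."""
    s, m, r = y
    ds = -beta * s * (m + r) + gamma * s * r             # eq. (1)
    dm = beta * s * (m + r) - alpha * m - delta * m * r  # eq. (2)
    dr = alpha * m + delta * m * r - gamma * s * r       # eq. (3)
    return [ds, dm, dr]

def stationary_densities(alpha=0.03, beta=0.07, gamma=0.15, delta=0.07,
                         y0=(0.99, 0.01, 0.0), t_max=2000.0):
    """integrate to a late time and return the (approximately) stationary s, m, r."""
    sol = solve_ivp(drinking_model, (0.0, t_max), y0,
                    args=(alpha, beta, gamma, delta), rtol=1e-8, atol=1e-10)
    return sol.y[:, -1]
```

with β > γ (as in fig. 1 (d)) the integration ends in the absorbing state s = m = 0, r = 1, while for γ > β the three densities coexist in the stationary state.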
if the solution s = 0 holds, this relation gives us m = 0 and consequently, from (4), we have r = 1. this solution represents an absorbing state [28, 29], since the dynamics becomes frozen due to the absence of s and m agents. we will discuss this solution in more detail in the following. considering now the relation (7) and the normalization condition (4), one can obtain s = 1 − (γ/β) r. (9) substituting (9) and (7) in (8), one obtains r = [β (α + γ) − α γ] / [γ² + δ (γ − β)]. (10) considering this result (10) in eqs. (9) and (7) we obtain, respectively, s = 1 − (γ/β) r (11) and m = [(γ − β)/β] r, (12) with r given by (10). the obtained eqs. (10) - (12) represent a second possible steady-state solution of the model, which is a realistic solution since the three fractions s, m and r coexist in the population. we can look at eq. (10) in more detail. it can be rewritten from the critical phenomena perspective as [30, 31] r = (α + γ)(β − β_c^(1)) / [γ (γ + δ) − δ β], where the critical points are given by β_c^(1) = α γ / (α + γ) (17) and β_c^(2) = γ. (18) the competition among the contagions causes the occurrence of three regions in the model. on one side we have drinkers (moderated and risk) influencing nonconsumers to consume alcohol, with probability β. on the other hand, we have the social pressure of nonconsumers over risk drinkers, with probability γ, in order to make such alcoholics begin treatment and stop drinking. finally, it is important to mention the parameter α, which drives the only transition of the model that does not depend on a direct social interaction. that parameter models the spontaneous increase of alcohol consumption, and it is also responsible for the first phase transition (together with γ), since we have β_c^(1) = 0 for α = 0. it means that alcohol consumption (the "disease") cannot be eliminated from the population after a long time if there is a spontaneous increase of alcohol consumption among individuals who drink moderately, which is a realistic feature of the model. for clarity, we exhibit in fig. 3 the phase diagram of the model in the plane β versus γ, separating the three above-discussed regions. in fig.
3, the absorbing phase with s = 1 and m = r = 0 is located in region i, for β < β_c^(1); the coexistence phase (where the three densities coexist) is denoted by ii, for β_c^(1) < β < β_c^(2); and the other absorbing phase, where s = m = 0 and r = 1, is located in region iii, for β > β_c^(2). from this figure we see the mentioned competition among the contagions. indeed, if β is sufficiently high, many nonconsumers become moderated drinkers. such moderated drinkers will become risk drinkers (via probabilities α and δ), and in the case of small γ we will observe after a long time the disappearance of nonconsumers and moderated drinkers (region iii). in the opposite case, i.e., for high γ and small β, the flux into the compartment s is intense, and in the long-time limit the other two subpopulations m and r disappear (region i). finally, for intermediate values of β and γ the competition among the social interactions leads to the coexistence of the three subpopulations in the stationary states (region ii). it is worthwhile to mention that the sizes of regions i and ii depend directly on the probability α, while region iii is always fixed due to eq. (18). this means that, if the parameter α is increased, region i will gradually become larger, indicating that the spontaneous evolution from moderate to risk drinking behavior narrows the coexistence region. in consequence, since the probability α represents the percentage of moderate drinkers who become risk drinkers without the need for social interaction, it is a crucial factor not only for implementing the theoretical model but also for identifying a possible percentage of the population that has a natural tendency toward excessive alcohol consumption, regardless of their social interaction network. for fig. 3, for instance, this value is 3%. larger values of α narrow the set of parameters that can be chosen in order to realistically describe a real system.
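the region logic of fig. 3 can be sketched as below. the closed forms of the critical points are our reconstruction from the limits quoted in the text (β_c^(1) = 0 when α = 0; region iii independent of α), not formulas copied from the paper:

```python
def critical_points(alpha, gamma):
    # reconstructed (assumed) forms of eqs. (17) and (18):
    beta_c1 = alpha * gamma / (alpha + gamma)  # boundary between regions i and ii
    beta_c2 = gamma                            # boundary between regions ii and iii
    return beta_c1, beta_c2

def region(alpha, beta, gamma):
    """classify a parameter set into region i (s = 1), ii (coexistence) or iii (r = 1)."""
    beta_c1, beta_c2 = critical_points(alpha, gamma)
    if beta < beta_c1:
        return "i"
    if beta < beta_c2:
        return "ii"
    return "iii"
```

for α = 0.03 and γ = 0.15 this gives β_c^(1) = 0.025, and β_c^(1) grows with α, consistent with the narrowing of the realistic parameter window discussed above.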
finally, we compare the model's results with data on abusive alcohol consumption in brazil [26]. data were collected from 2009 to 2019; thus, in fig. 4 the initial time t = 0 represents the fraction of abusive drinkers for 2009, t = 1 represents the fraction for 2010, and so on. since the data refer to the fraction of people who consume alcohol abusively, we plot the density of risk drinkers r(t) together with the data. in order to compare them with the model, we considered for the initial density of risk drinkers r(0) = 0.185 and numerically integrated eqs. (1) - (3). the value 0.185 was chosen since it is the fraction of abusive drinkers for 2009 obtained from the database [26]. in addition, we rescaled the time of the simulation results to match the time of the real data: the simulation time was multiplied by 0.12 for a better comparison. we find that the simulated drinking trajectories qualitatively correspond to the data. for the numerical results, we considered the parameters β = 0.06, γ = 0.11, α = 0.047 and δ = 0.2, which indicate that the probability of finding an individual who will spontaneously become a risk drinker in brazil during the last decade is around 4.7%. furthermore, looking at eqs. (17) and (18), it is easy to see that in order to model brazil's data we have β_c^(1) < β < β_c^(2), showing that the model describes the available data in its most realistic regime (region ii of figure 3). naturally, in comparison with actual data, the model should operate in the coexistence region, since descriptions with only nonconsumers or only risk drinkers are unrealistic. this qualitative agreement with brazil's database in the realistic regime of the model points to a good, albeit simplistic, modelling. in this work, we have studied a compartmental model that aims to describe the evolution of drinking behavior in an adult population.
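the parameter check behind the brazilian comparison can be sketched as follows (our own code; the critical-point formulas are the reconstruction assumed above, and the helper name to_data_time is ours). the factor 0.12 is the rescaling quoted in the text for mapping simulation time onto years since 2009:

```python
# parameters quoted in the text for the brazilian data
alpha, beta, gamma, delta = 0.047, 0.06, 0.11, 0.2

beta_c1 = alpha * gamma / (alpha + gamma)  # assumed form of eq. (17)
beta_c2 = gamma                            # assumed form of eq. (18)

# the fitted parameters must lie in the coexistence region ii
assert beta_c1 < beta < beta_c2

def to_data_time(t_sim, scale=0.12):
    """map simulation time onto data time (t = 0 corresponds to 2009)."""
    return scale * t_sim
```

running the check confirms that the quoted parameter set sits strictly inside region ii, i.e. in the realistic, three-population regime of the model.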
we considered a fully-connected population that is divided into three compartments, namely susceptible individuals s (nonconsumers), moderated drinkers m and risk drinkers r. the transitions among the compartments are ruled by probabilities, representing the social interactions among individuals as well as spontaneous decisions, in particular the spontaneous evolution from moderate to risk drinking, and we studied the model through analytical and numerical calculations. from the theoretical point of view, the model is of interest to statistical physics since we observed the occurrence of two distinct nonequilibrium phase transitions. these transitions separate the model into three regions: (i) existence of nonconsumers only; (ii) coexistence of the three compartments; and (iii) existence of risk drinkers only. regions i and iii represent two distinct absorbing phases, since the system becomes frozen due to the existence of only one subpopulation in each case. this means that, in order to describe real population systems, the parameters must be chosen so that the model falls in region ii, since populations consisting solely of nonconsumers or risk drinkers do not represent a realistic entity. the critical points of such transitions were obtained analytically. a comparison with available data on brazil's extreme alcohol consumption over the past decade shows a good qualitative agreement with the model, with the chosen parameters framed within its realistic boundaries. it will be important, in a couple of years' time, to re-evaluate these results in the light of new data comprising the years 2020 and 2021, in order to verify the direct effects of the covid-19 pandemic on the brazilian population's alcohol consumption. a hypothesis to be tested is a possible increase in the parameter α combined with a corresponding decrease in the other parameters, which represent social interactions.
the phase transitions observed in the model are active-absorbing phase transitions, and the predicted critical exponent for the order parameter is 1 (m ∼ (β − β_c)^1), as in mean-field directed percolation, which is the prototype of a phase transition to an absorbing state [30, 31]. it would be interesting to estimate numerically other critical exponents of the model, as well as to simulate it on regular d-dimensional lattices (e.g. square and cubic) in order to obtain all the critical exponents. this is important to define precisely the universality class of the model, as well as its upper critical dimension. this extension is left for a future work. furthermore, one can also consider the inclusion of heterogeneities in the population, like agents' conviction [32], time-dependent transition rates [33], inflexibility [34], etc.

references
[1] the mathematical theory of infectious diseases and its applications, charles griffin & company ltd, 5a crendon street, high wycombe, bucks hp13 6le
[2] epidemics and rumours
[3] analysing the spanish smoke-free legislation of 2006: a new method to quantify its impact using a dynamic model
[4] predicting cocaine consumption in spain: a mathematical modelling approach
[5] alcohol consumption in spain and its economic cost: a mathematical modeling approach
[6] modeling the obesity epidemic: social contagion and its implications for control
[7] can honesty survive in a corrupt parliament?
[8] evolution of tag-based cooperation on erdős-rényi random graphs
[9] encouraging moderation: clues from a simple model of ideological conflict
[10] the dynamics of the rise and fall of empires
[11] dynamics of tax evasion through an epidemic-like model
[12] modeling radicalization phenomena in heterogeneous populations
[13] social epidemiology and complex system dynamic modelling as applied to health behaviour and drug use research
[14] agent-based modeling of drinking behavior: a preliminary model and potential applications to theory and practice
[15] world health statistics 2018: monitoring health for the sdgs, sustainable development goals
[16] the prevalence and impact of alcohol problems in major depression: a systematic review
[17] bilinski, alcohol consumption reported during the covid-19 pandemic: the initial stage
[18] alcohol use and misuse during the covid-19 pandemic: a potential public health crisis?
[19] complicated alcohol withdrawal - an unintended consequence of covid-19 lockdown
[20] alcohol use in times of the covid 19: implications for monitoring and policy
[21] modelling alcohol problems: total recovery
[22] mohyud-din, a conformable mathematical model for alcohol consumption in spain
[23] dynamics of an alcoholism model on complex networks with community structure and voluntary drinking
[24] modeling binge drinking
[25] drinking as an epidemic - a simple mathematical model with recovery and relapse, in: therapist's guide to evidence-based relapse prevention
[26] vigitel brasil 2019: vigilância de fatores de risco e proteção para doenças crônicas por inquérito telefônico: estimativas sobre frequência e distribuição sociodemográfica de fatores de risco e proteção para doenças crônicas nas capitais dos 26 estados brasileiros e no distrito federal em 2019
[27] covid-19 spreading in rio de janeiro, brazil: do the policies of social isolation really work?
[28] survival of the scarcer in space
[29] symbiotic two-species contact process
[30] nonequilibrium phase transitions in lattice models
[31] non-equilibrium critical phenomena and phase transitions into absorbing states
[32] competition among reputations in the 2d sznajd model: spontaneous emergence of democratic states
[33] critical behavior of the sis epidemic model with time-dependent infection rate
[34] inflexibility and independence: phase transitions in the majority-rule model

acknowledgments: the authors thank ronald dickman for some suggestions. financial support from the brazilian scientific funding agencies cnpq (grants 303025/2017

key: cord-132307-bkkzg6h1
authors: blanco, natalia; stafford, kristen; lavoie, marie-claude; brandenburg, axel; gorna, maria w.; merski, matthew (center for international health, education, and biosecurity, institute of human virology, university of maryland school of medicine, baltimore, maryland, usa; department of epidemiology and public health, university of maryland school of medicine; nordita, kth royal institute of technology and stockholm university, stockholm, sweden; biological and chemical research centre, department of chemistry, university of warsaw, warsaw, poland)
title: prospective prediction of future sars-cov-2 infections using empirical data on a national level to gauge response effectiveness
date: 2020-07-06
journal: nan
doi: nan
sha: doc_id: 132307 cord_uid: bkkzg6h1
predicting an accurate expected number of future covid-19 cases is essential to properly evaluate the effectiveness of any treatment or preventive measure. this study aimed to identify the most appropriate mathematical model to prospectively predict the expected number of cases without any intervention. the total number of cases for the covid-19 epidemic in 28 countries was analyzed and fitted to several simple rate models including the logistic, gompertz, quadratic, simple square, and simple exponential growth models.
the resulting model parameters were used to extrapolate predictions for more recent data. while the gompertz growth models (mean r² = 0.998) best fitted the current data, uncertainties in the eventual case limit made future predictions with logistic models prone to errors. of the other models, the quadratic rate model (mean r² = 0.992) fitted the current data best for 25 (89%) countries as determined by r² values. the simple square and quadratic models accurately predicted the number of future total cases 37 and 36 days in advance respectively, compared to only 15 days for the simple exponential model. the simple exponential model significantly overpredicted the total number of future cases while the quadratic and simple square models did not. these results demonstrate that accurate future predictions of the case load in a given country can be made significantly in advance without the need for complicated models of population behavior, and generate a reliable assessment of the efficacy of current prescriptive measures against disease spread. on march 11, 2020 the world health organization (who) declared the novel coronavirus outbreak (sars-cov-2, causing covid-19) a pandemic 1 , more than three months after the first cases of pneumonia were reported in wuhan, china in december 2019 1 . from wuhan the virus rapidly spread globally, currently leading to ten million confirmed cases and half a million deaths around the world. although coronaviruses have a wide range of hosts and cause disease in many animals, sars-cov-2 is the seventh named member of the coronaviridae known to infect humans 2 . an infected individual will start presenting symptoms an average of 5 days after exposure 3 , but approximately 42% of infected individuals remain asymptomatic 4, 5 . furthermore, almost six out of every 100 infected patients die globally due to covid-19 6 . currently, treatment and vaccine options for covid-19 are limited 7 .
there is currently no effective or approved vaccine for sars-cov-2, although a report from april 2020 noted 78 active vaccine projects, most of them at exploratory or pre-clinical stages 8 . as the virus is transmitted mainly from person to person, prevention measures include social distancing, self-isolation, hand washing, and use of masks. strict measures of quarantine have been shown to be the most effective mitigation measure, reducing up to 78% of expected cases compared to no intervention 9 . nevertheless, to evaluate the actual effectiveness of any mitigation measure it is necessary to accurately predict the expected number of cases in the absence of intervention. while there has been some early concern about the ability of sars-cov-2 to spread at an apparent near exponential rate 10 , real limitations in available resources (i.e. the susceptible population) will reduce the spread to a logistic growth rate 11 . logistic growth produces a sigmoidal curve ( figure 1 ) where the total number of cases (n) eventually asymptotically approaches the population carrying capacity (nm), which for viral epidemics is analogous to the fraction of the population that will be infected before "herd immunity" is achieved 12, 13 . this is represented in derivative form by the generalized logistic function (equation 1): dn/dt = r n^α (1 − (n/nm)^β)^γ , (1) where α, β, & γ are mathematical shape parameters that define the shape of the curve, and r is the general rate term, analogous to the standard epidemiological parameter r0, the reproductive number, which is a measure of the infectivity of the virus itself 13, 14 .
for a logistic curve where α = ½ and β = γ = 0, one gets quadratic growth 15 with n = (rt/2)², while for α = β = γ = 1, this equation can be rearranged to quadratic form (equation 2) 11 : dn/dt = r n − (r/nm) n². (2) traditionally the number of cases that will occur in an epidemic like covid-19 is modeled with an seir model (susceptible, exposed, infected, recovered/removed), in which the total population is divided into four categories: susceptible - those who can be infected; exposed - those who are in the incubation period but not yet able to transmit the virus to others; infectious - those who are capable of spreading disease to the susceptible population; and recovered/removed - those who have finished the disease course and are not susceptible to re-infection or have died. for a typical epidemic, the ability of infectious individuals to spread the disease is proportional to the fraction of the population in the susceptible category, with "herd immunity" 12, 13 and extinction of the epidemic occurring once a limiting fraction of the population has entered the recovered/removed category 13 . however, barriers to transmission, either natural 18 or imposed, can alter this progression. prospective analyses (i.e. those made before knowing the actual outcome) are preferable to retrospective analyses in which effectiveness is gauged after the results of the prescriptive actions are known 24, 25 . this study aimed to evaluate if a simple model was able to correctly prospectively predict the total number of cases at a future date. we found that fitting the case data to a quadratic (parabolic) rate curve 15 for the early points in the epidemic curves (before the mitigation efforts began to have effects) was easy, efficient, and made good predictions for the number of cases at future dates despite significant national variation in the start of the infection, mitigation response, or economic condition. data on the number of covid-19 cases were downloaded from the european centre for disease prevention and control (ecdc) on june 1, 2020 26 .
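the growth laws above can be compared by direct curve fitting. the sketch below is our own code (not the prism workflow used in the study): it fits the simple square model n = at² + c and a simple exponential n = n0·e^(kt) to synthetic early-epidemic counts:

```python
import numpy as np
from scipy.optimize import curve_fit

def simple_square(t, a, c):
    return a * t**2 + c

def simple_exponential(t, n0, k):
    return n0 * np.exp(k * t)

# synthetic quadratic-like cumulative case counts (for illustration only)
t = np.arange(5, 30, dtype=float)
n_obs = 3.0 * t**2 + 40.0

(a, c), _ = curve_fit(simple_square, t, n_obs, p0=(1.0, 1.0))
(n0, k), _ = curve_fit(simple_exponential, t, n_obs, p0=(10.0, 0.1), maxfev=10000)
```

extrapolating both fits past the data (e.g. to day 60) shows the exponential fit overshooting the quadratic-like trend, the behavior reported below for the national case curves.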
countries that had reported the highest numbers of cases in mid-march 2020 (and russia) were chosen as the focus of our analysis to minimize statistical error due to small numbers. the total number of cases for each country was calculated as a simple sum of that day plus all previous days. days that were missing from the record were assigned values of zero. the early part of the curve was fit and statistical parameters were generated using prism 8 (graphpad) with the non-linear regression module, using the program's standard centered second-order polynomial (quadratic), exponential growth, and gompertz growth models as defined by prism 8, and a user-defined simple square model (n = at² + c), where n is the total number of cases, a and c are the fitting constants, and t is the number of days from the beginning of the epidemic curve. the beginning of the curve (si table 1) was defined empirically as among the first days in which the number of cases began to increase regularly. typically, this occurred when the country had reported fewer than 100 total cases. the early part of the curve was defined by manual examination looking for changes in the curve shape and later confirmed by r² values for the quadratic model. prospective predictions for the number of cases were made by fitting the total number of covid-19 cases for each day starting with day 5 and then extrapolating the number of cases using the estimated model parameters to predict the number of cases for the final day for which data were available (june 1, 2020) or for the last day before a significant decrease in the r² value for the quadratic fit. fit parameters for the gompertz growth model were not used to make predictions if the fit itself was ambiguous. acceptable predictions were defined as being within a factor of two of the actual number (i.e. predictions within 50-200% of the actual total).
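the prospective-prediction procedure described above can be sketched as follows (our own implementation, not the authors' code; numpy's polynomial fit stands in for prism's centered quadratic fit):

```python
import numpy as np

def prospective_prediction(total_cases, n_days):
    """fit a quadratic to the first n_days of cumulative cases and
    extrapolate to the final day of the series."""
    t = np.arange(len(total_cases), dtype=float)
    coeffs = np.polyfit(t[:n_days], total_cases[:n_days], 2)
    return np.polyval(coeffs, t[-1])

def is_acceptable(predicted, actual):
    """acceptable predictions are within a factor of two (50-200%) of the actual total."""
    return 0.5 * actual <= predicted <= 2.0 * actual
```

applied day by day (n_days = 5, 6, 7, ...), this yields the series of prospective predictions whose accuracy is tabulated in the results.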
a simple exponential growth model is a poor fit for the sars-cov-2 pandemic: the total number of cases for each of 28 countries was plotted against time and several model equations were fit to the early part of the data, before mitigating effects from public health policies began to change the rate of disease spread. in total, 20 (71%) countries showed mitigation of disease spread by june 1 (figure 2). when the early, pre-mitigation portion of the data was examined for all 28 countries, the gompertz growth model had the best statistical parameters (mean r² = 0.998 ± 0.0028, table 1), although a fit could not be obtained for the data from 2 countries and many of the fit values for nm were unrealistic compared to national populations (e.g. china and india had predicted nm values corresponding to 0.014% and 0.33% of their populations respectively 26 (si table 2)). fitting was also incomplete for the generalized logistic model for all 28 countries, underlining the difficulty of applying this model. on the other hand, the simple models were able to robustly fit all the current data, with the quadratic (parabolic) model performing the best (mean r² = 0.992 ± 0.004) and the exponential model the worst (mean r² = 0.957 ± 0.022) (table 1). in only three (11%) countries did the exponential model have the best overall r² value among the simple models. furthermore, the trend of the overall superiority of the gompertz model followed by the quadratic was also observed in the standard error of the estimate statistic. the mean standard error of the estimate (sy.x, analogous to the root mean squared error for fits of multiple parameters) for the 28 countries was 1699 for the gompertz model, 5613 for the quadratic model, 8572 for the simple square model and 11257 for the exponential model (table 1).
likewise, plots of the natural log of the total number of cases in the early parts of the epidemic (ln n) against time are significantly less linear (as determined by r²) than equivalent plots of the square root of the total number of cases (n^1/2) (si table 3, si figs 1, 2). while logistic growth models have been widely used to model epidemics 16, 27 , uncertainties in estimates of r0 (and therefore the population carrying capacity nm) make prospective predictions of the course of the epidemic difficult 14, 27 . prospective predictions were therefore generated with the simpler models (figure 3, table 2, si table 4). here we define predictions as accurate when they are within a factor of two (50-200%) of the actual outcome. for most countries, the simple exponential model massively overpredicts the number of future cases. predictions generated more than 14 days prior were more than double the actual number of cases for 17 (61%) of the countries examined. in fact, for 15 (54%) countries, the exponential model made at least one overprediction by a factor of greater than 10,000-fold, while the quadratic and simple square models made no overprediction by more than a factor of 3.3 and 2.1, respectively (i.e. using the first 10 days of data from portugal, the exponential model predicts 34 million cases while the quadratic, simple square, and gompertz growth models predict 24957, 20358 and 18953 cases respectively; 23683 total cases were observed, and the total population of portugal in 2018 was 10.3 million 26 ). predictions using the quadratic and simple square models were much more accurate. in only four (14%) countries does the quadratic model ever overpredict the final number of cases by more than a factor of two, while the simple square model overpredicts by more than a factor of two for only one (4%) country (si table 4). for the quadratic model, the mean maximum daily overprediction was a factor of 1.6-fold (median 1.3-fold) while for the simple square model the mean maximum daily overprediction was 1.3-fold (median 1.1-fold).
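the linearity comparison at the start of this section can be reproduced with a simple helper (our own code, synthetic data): for quadratic-like growth, n^1/2 is nearly linear in time while ln n is not, and the r² of a straight-line fit quantifies the difference.

```python
import numpy as np

def r_squared_of_line(t, y):
    """r-squared of an ordinary least-squares straight line y ~ a*t + b."""
    a, b = np.polyfit(t, y, 1)
    residuals = y - (a * t + b)
    ss_res = np.sum(residuals**2)
    ss_tot = np.sum((y - y.mean())**2)
    return 1.0 - ss_res / ss_tot

t = np.arange(1, 40, dtype=float)
n = 2.5 * t**2 + 30.0                  # synthetic quadratic-like case counts

r2_sqrt = r_squared_of_line(t, np.sqrt(n))
r2_log = r_squared_of_line(t, np.log(n))
```

on real national case counts the same comparison (si table 3) is what motivates preferring the square-root transformation over the logarithm.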
both of these models produced much more accurate predictions than the simple exponential model (table 2). the exponential model nevertheless fit the early data reasonably well (table 1), and this may account for the conflation of the course of the sars-cov-2 pandemic with truly exponential growth. that the exponential growth constant term, k, is constantly decreasing after day 10 in 10 (68%) countries (si fig. 3) further indicates the overall utility of logistic models, which were explicitly developed to model a constantly decreasing rate of growth due to consumption of the available resource (i.e. the susceptible population pool of the sir model) 16 . but, while logistic models are implicitly the correct model, they are difficult to accurately fit during the early portion of an epidemic due to inherent uncertainties in the mathematical shape parameters (equation 1) of the curve itself and in the population carrying capacity for sars-cov-2, nm, which still has a significant uncertainty as the virus has only recently moved into the human population. herd immunity is defined as 1 - 1/r0, and since current estimates for r0 vary from 1.5 to 6.5 14 , this implies that 33 - 85% of the population will need to have contracted the disease and developed immunity in order to terminate the epidemic. a discrepancy of this size will significantly affect predictions based on logistic growth models. here we note the utility of the quadratic (parabolic) and simple square models in predicting the course of the pandemic more than a month in advance. the simple exponential model vastly overpredicts the number of cases (fig. 2, table 2). the gompertz growth model, while often making largely correct predictions, often generates wildly inaccurate estimates of the population carrying capacity nm (si table 2), and the generalized logistic model simply fails to produce a statistically reliable result with the currently available data.
overestimation of the future number of cases will cause problems because the failure of the predicted cases to materialize may be erroneously taken as evidence that poorly implemented and ineffective policy prescriptions are reducing the spread of sars-cov-2, which may lead to political pressure for premature cessation of all prescriptive measures and, inevitably, an increase in the number of cases and excess, unnecessary morbidity. fortunately, the quadratic model produces accurate, prospective predictions of the number of cases (fig. 3, table 2). use of this model is simple as it is directly implemented in common spreadsheet programs and can be applied without much difficulty or technical modeling expertise. in theory, this model can also be applied to smaller, sub-national populations, although the smaller number of total cases in these regions will undoubtedly give rise to larger statistical errors. in no way does the empirical agreement between the quadratic model and the data negate the fact that the growth of the sars-cov-2 epidemic is logistic in nature in all 28 countries (table 1, si table 2). we expect that the suitability of these empirical quadratic fits is related either to the quadratic form of the slope of the generalized logistic function, to the limitation of the virus to a physical radius of infectivity around infectious individuals, to the fact that it is still early in the pandemic (no country has yet officially logged even 1% of its population as having been infected), or to all three. of course, the true number of covid-19 cases is a matter of debate, as there is speculation that a significant fraction of infections are not being identified 34 .
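the fit-quality monitoring discussed below can be sketched as follows (our own code; the 0.985 threshold follows the recommendation in the discussion). the quadratic is refit each day, and the first day its r² drops below the threshold flags the bend in the curve:

```python
import numpy as np

def first_bend_day(total_cases, threshold=0.985, min_days=5):
    """return the first day on which the quadratic fit's r-squared falls
    below the threshold, or None if the fit never degrades."""
    t = np.arange(len(total_cases), dtype=float)
    for d in range(min_days, len(total_cases) + 1):
        coeffs = np.polyfit(t[:d], total_cases[:d], 2)
        fit = np.polyval(coeffs, t[:d])
        ss_res = np.sum((total_cases[:d] - fit)**2)
        ss_tot = np.sum((total_cases[:d] - np.mean(total_cases[:d]))**2)
        if 1.0 - ss_res / ss_tot < threshold:
            return d
    return None  # no bend detected yet
```

a purely quadratic case series never trips the threshold, while a series that flattens (as after effective mitigation) does, which is the operational definition of "bending the curve" used here.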
however, because this method is focused on the rate of case growth over time, the errors that lead to any undercounting within a given country are likely to remain largely unchanged over the short time periods observed here and still provide a reasonable estimate of the number of positively identified cases. despite their similar predictive power, we largely focus on the quadratic model rather than the simple square model for the aforementioned reasons, and we must also note that quadratic curve fitting is natively implemented in most common spreadsheet software while the simple square model is not. by monitoring the r² values for the quadratic models, it is a simple task to identify when the epidemic is beginning to subside within a country (i.e. "bending the curve"). here we recommend the use of an r² value of 0.985 for identifying when the rate of infection is beginning to subside, but more conservative estimates can also be made by lowering this threshold. examination of the data collected here suggests that early, aggressive measures have been most effective at reducing disease burden within a country. countries that initially adopted less stringent measures (such as the us, uk, russia, and brazil) are currently more heavily burdened than those countries that started with more intense prescriptions (such as china, south korea, australia, denmark, and vietnam) 35 . figure 1: while the population carrying capacity of both logistic curves is the same, the gompertz curve reaches the population carrying capacity more slowly, resulting in a long-tailed epidemic. the initial part of the gompertz curve (including time points until 5% of the population has been infected) was fit to the simple exponential (red dashes), quadratic (blue dashes) and simple square (green dashes) models.
it is apparent from these curves how quickly the exponential curve overestimates the rate of growth of the epidemic compared to the quadratic and simple square fit curves, and how the quadratic model more closely follows the gompertz growth curve, as evidenced by the smaller sy.x value for the quadratic fit in table 1. for each day, fits for each model using only data up to that day are used to predict the number of expected cases for the last day for which data are available (or the last day before significant curve deviation is observed, see figure 2). days on which the fit was not statistically sound were omitted from the graph. table 1: the fit parameters for the development of the early portion of the sars-cov-2 epidemic in 28 countries for the quadratic, simple square, simple exponential, and gompertz growth models, as calculated for each individual day during the early portion of the epidemic. a the fit equations for each are as follows: simple exponential: n = n0 e^(kt); quadratic: n = at² + bt + c; simple square: n = at² + c; gompertz growth: n = nm (n0/nm)^exp(−kt); where n is the total number of cases, t is the time in days, n0 is the initial seeding population of the epidemic, nm is the population carrying capacity (the amount of the population that must be infected to achieve herd immunity), and a, b & c are the standard quadratic terms (or, for the simple square model, the terms of its equation). additionally, the number of days of data used in the fitting and the r², sum of squares, and sy.x statistical values are given. for the gompertz growth model, an adequate fit could not be achieved for brazil or denmark, and this is indicated in the table. figure 3: the change of the exponential rate term (k) over time for each of the 28 countries. it can be clearly seen that k is generally decreasing over time, often on each day but sometimes after an initial period of increase.
this indicates that the exponential rate is regularly decreasing, as expected in a situation where the growth resource is shrinking, which is characteristic of the logistic family of models, including the generalized logistic and gompertz growth models.

references:
- world health organization: coronavirus disease 2019 (covid-19) situation report – 51
- covid-19: epidemiology, evolution, and cross-disciplinary perspectives
- the epidemiology and pathogenesis of coronavirus disease (covid-19) outbreak
- estimating the asymptomatic proportion of coronavirus disease 2019 (covid-19) cases on board the diamond princess cruise ship
- estimation of the asymptomatic ratio of novel coronavirus infections (covid-19)
- real estimates of mortality following covid-19 infection
- impact assessment of non-pharmaceutical interventions against coronavirus disease 2019 and influenza in hong kong: an observational study
- the covid-19 vaccine development landscape
- interventions to mitigate early spread of sars-cov-2 in singapore: a modelling study
- real-time forecasts of the covid-19 epidemic in china from february 5th to february 24th
- herd-immunity to helminth infection and implications for parasite control
- "herd immunity": a rough guide
- the reproductive number of covid-19 is higher compared to sars coronavirus
- piecewise quadratic growth during the 2019 novel coronavirus epidemic
- the use of gompertz models in growth analyses, and new gompertz-model approach: an addition to the unified-richards family
- dynamics of tumor growth
- the impact of a physical geographic barrier on the dynamics of measles
- can china's covid-19 strategy work elsewhere?
- effective containment explains subexponential growth in recent confirmed covid-19 cases in china
- covid-19 national emergency response center, epidemiology and case management team, korea centers for disease control and prevention:
contact transmission of covid-19 in south korea: novel investigation techniques for tracing contacts
- using social and behavioural science to support covid-19 pandemic response
- covid-19 image data collection: prospective predictions are the future
- cohort studies: prospective versus retrospective
- rational evaluation of various epidemic models based on the covid-19 data of china
- model selection and evaluation based on emerging infectious disease data sets including a/h1n1 and ebola
- national response to covid-19 in the republic of korea and lessons learned for other countries
- the french response to covid-19: intrinsic difficulties at the interface of science, public health, and policy
- covid-19 healthcare demand and mortality in sweden in response to non-pharmaceutical (npi) mitigation and suppression scenarios
- what policy makers need to know about covid-19 protective immunity
- high population densities catalyse the spread of covid-19
- high temperature and high humidity reduce the transmission of covid-19
- substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov-2)
- whose coronavirus strategy worked best? scientists hunt most effective policies

[table caption, beginning truncated: ...and gompertz growth models based on days of good predictions before the target (last day of observed data, inclusive). good predictions are defined as the predicted result being within a factor of 2 (predictions from 50–200% of the actual total number of cases). thus, the quadratic model was able to predict the total number of cases in the united states on each of the 45 days before that day, while the exponential model was only within the defined good range for the 20 days preceding that day.]
the range of minimum predictions (underpredictions) and maximum predictions (overpredictions) is also given.
[si figure 1 caption, truncated: plots of the square root of the total number of cases (√n) for the early portion of the covid-19 epidemic...]
[si figure 2 caption, truncated: plots of the natural log of the total number of cases (ln n) for the early portion of the covid-19 epidemic...]

key: cord-016261-jms7hrmp authors: liu, chunmei; song, yinglei; malmberg, russell l.; cai, liming title: profiling and searching for rna pseudoknot structures in genomes date: 2005 journal: transactions on computational systems biology ii doi: 10.1007/11567752_2 sha: doc_id: 16261 cord_uid: jms7hrmp

we developed a new method that can profile and efficiently search for pseudoknot structures in noncoding rna genes. it profiles the interleaving stems in pseudoknot structures with independent covariance model (cm) components. the statistical alignment score for searching is obtained by combining the alignment scores from all cm components. our experiments show that the model can achieve excellent accuracy on both random and biological data. the efficiency achieved by the method makes it possible to search for structures that contain pseudoknots in the genomes of a variety of organisms. searching genomes with computational models has become an effective approach to the identification of genes. during recent years, extensive research has focused on developing computationally efficient and accurate models that can find novel noncoding rnas and reveal their associated biological functions. unlike the messenger rnas that encode the amino acid residues of protein molecules, noncoding rna molecules play direct roles in a variety of biological processes, including gene regulation, rna processing, and modification. for example, the human 7sk rna binds and inhibits the transcription elongation factor p-tefb [17][25], and the rnase p rna processes the 5' end of precursor trnas and some rrnas [7]. noncoding rnas include more than 100 different families [23].
genome annotation based on models constructed from homologous sequence families could be a reliable and effective approach to enlarging the known families of noncoding rnas. the functions of noncoding rnas are, to a large extent, determined by the secondary structures they fold into. secondary structures are formed by bonded base pairs between nucleotides and may remain unchanged while the nucleotide sequence is significantly modified through mutations over the course of evolution. profiling models based solely on sequence content, such as the hidden markov model (hmm) [12], may miss structural homologies when used directly to search genomes for noncoding rnas containing complex secondary structures. models that profile noncoding rnas must include both the content and the structural information of the homologous sequences. the covariance model (cm) developed by eddy and durbin [6] extends the profiling hmm by allowing the co-emission of paired nucleotides in certain states to model base pairs, and introduces bifurcation states to emit parallel stems. the cm is capable of modeling secondary structures composed of nested and parallel stems. however, pseudoknot structures, in which at least two structurally interleaving stems are involved, cannot be directly modeled with the cm and have remained computationally intractable for searching [1][13][14][18][19][20][21][24]. so far, only a few systems have been developed for profiling and searching for rna pseudoknots. one example is erpin, developed by gautheret and lambert [8][15]. erpin searches genomes by sequentially looking for the single stem-loop motifs contained in the noncoding rna gene, and reports a hit when significant alignment scores are observed for all motifs at their corresponding locations. since erpin does not allow gaps when performing alignments, it is computationally very efficient.
however, alignments with no gaps may miss distant homologies and thus result in lower sensitivity. brown and wilson [2] proposed a more realistic model composed of a number of stochastic context-free grammar (scfg) [3][22] components to profile pseudoknot structures. in their model, the interleaving stems in a pseudoknot structure are derived from different components; the pseudoknot structure is modeled as the intersection of the components. the optimal alignment score of a sequence segment is computed by aligning it to all the components iteratively. the model can be used to search sequences for simple pseudoknot structures efficiently. however, a generic framework for modeling interleaving stems and carrying out the search was not proposed in their work. for pseudoknots with more complex structure, more than two scfg components may be needed, and extending the iterative alignment algorithm to k components may require k! different alignments in total, since all components are treated equally in their model. in this paper, we propose a new method to search for rna pseudoknot structures using a model of multiple cms. unlike the model of brown and wilson, we use independent cm components to profile the interleaving stems in a pseudoknot. based on this model, we have developed a generic framework for modeling the interleaving stems of pseudoknot structures; we propose an algorithm that can efficiently assign stems to components such that interleaving stems are profiled in different components. components with more stems are associated with higher weights in determining the overall conformation of a sequence segment. in order to perform alignments of a sequence segment to the model efficiently, instead of iteratively aligning the sequence segment to the cm components, our searching algorithm aligns it to each component independently, following the descending order of component weights.
the statistical log-odds scores are computed based on the structural alignment scores of each cm component. stem contention may occur, in which two or more base pairs obtained from different components require the participation of the same nucleotide. due to the conformational constraints inherently imposed by the cm components, stem contention occurs infrequently (in less than 30% of cases) and can be effectively resolved based on the conformational constraints from the alignment results on the components with higher weight values. the algorithm accomplishes the search with a worst-case time complexity of o((k − 1)w^3 l) and a space complexity of o(kw^2), where k is the number of cm components in the model, and w and l are the size of the searching window and the length of the genome respectively. we used the model to search for a variety of rna pseudoknots inserted into randomly generated sequences. experiments show that the model can achieve excellent sensitivity (se) and specificity (sp) on almost all of them, while using only slightly more computation time than searching for pseudoknot-free rna structures. we then applied the model and the searching algorithm to identify the pseudoknots in the 3' untranslated region of several rna genomes from the coronavirus family. an exact match between the locations found by our program and the real locations is observed. finally, in order to test the ability of our program to cope with noncoding rna genes with complex pseudoknot structures, we carried out an experiment in which the complete dna genomes of two bacteria were searched to find the locations of their tmrna genes. the results show that our program identified the location with a reasonable amount of error (a right shift of around 20 nucleotide bases) for one bacterial genome, while for the other the search was exact.
to the best of our knowledge, this is the first experiment in which a whole genome of more than a million nucleotides is searched for a complex structure that contains pseudoknots. to test the performance of the model, we developed a search program in the c language and carried out searching experiments on a sun/solaris workstation with 8 dual processors and 32gb of main memory. we evaluated the accuracy of the program on both real genomes and randomly generated sequences with a number of rna pseudoknot structures inserted. the rnas we chose to test the model on are shown in table 1. model training and testing are based on the multiple alignments downloaded from the rfam database [10]. for each rna pseudoknot, we divided the available data into a training set and a testing set, and the parameters used to model it are estimated from multiple structural alignments among 5–90 homologous training sequences with a pairwise identity of less than 80%. the emission probabilities of all nucleotides for a given state in a cm component are estimated by computing their frequencies of appearance in the corresponding column of the multiple alignment of training sequences; transition probabilities are computed similarly by considering the relative frequencies of the different types of transitions that occur between the corresponding consecutive columns in the alignment.
[table 2 caption: the performance of the model on different rna pseudoknots inserted into a background (of 10^5 nucleotides) randomly generated with different c+g concentrations. tn is the total number of pseudoknotted sequence segments inserted; ci is the number of sequence segments correctly identified by the program (with a positional error of less than ±3 bases); nh is the number of sequence segments returned by the program; se and sp are sensitivity and specificity respectively. the thresholds of the log-odds score are predetermined using a z-score value of 4.0.]
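the sensitivity and specificity reported in table 2 follow directly from the legend: se is the fraction of inserted segments that were found (ci/tn) and sp is the fraction of reported hits that were correct (ci/nh). a minimal sketch, with invented counts for illustration:

```python
def sensitivity_specificity(tn_inserted, ci_correct, nh_hits):
    """Se = correctly identified / total inserted;
    Sp = correctly identified / total hits reported."""
    se = ci_correct / tn_inserted
    sp = ci_correct / nh_hits
    return se, sp

# Hypothetical counts in the style of Table 2:
# 40 segments inserted, 38 correctly identified, 39 hits reported.
se, sp = sensitivity_specificity(40, 38, 39)  # se = 0.95, sp ≈ 0.974
```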
pseudocounts, dependent on the number of training sequences, are included to prevent overfitting of the model to the training data. to measure the sensitivity and specificity of the searching program within a reasonable amount of time, for each selected pseudoknot structure we selected 10–40 sequence segments from the set of testing data and inserted them into each of the randomly generated sequences of 10^5 nucleotides. in order to test whether the model is sensitive to the base composition of the background sequence, we varied the c+g concentration in the random background. the program computes the log-odds score, the logarithmic ratio of the probability of generating sequence segment s by our model m to that by the null (random) model r. it reports a hit when the z-score of s is greater than 4.0. the computation of z-scores requires knowing the mean and standard deviation of the distribution of log-odds scores of random sequence segments; both can be determined, before the search starts, with methods similar to the ones introduced by klein and eddy [11]. as can be seen in table 2, the program correctly identifies more than 80% of the inserted sequence segments with excellent specificity in most of the experiments. the only exception is srprna, where the program misses more than 50% of the inserted sequence segments in one of the experiments. the relatively lower sensitivity in that particular experiment can be partly ascribed to the fact that the pseudoknot structure of srprna contains fewer nucleotides; its structural and sequence patterns therefore have a larger probability of occurring randomly. the running time for srprna, however, is also significantly shorter than that needed by most of the other rna pseudoknots, due to the smaller size of the model.
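the hit criterion described above reduces to a z-score test against a precomputed null distribution. the following sketch assumes the mean and standard deviation of the null log-odds scores are already known (e.g. estimated before the search, as in klein and eddy); all numbers are illustrative, not from the paper:

```python
def z_score(score, null_mean, null_std):
    """Standardize a log-odds score against the null distribution."""
    return (score - null_mean) / null_std

def is_hit(score, null_mean, null_std, threshold=4.0):
    """Report a hit when the segment's log-odds score lies more than
    `threshold` standard deviations above the random-sequence mean."""
    return z_score(score, null_mean, null_std) >= threshold

# Assumed null distribution of log-odds scores over random segments.
mu, sigma = -12.0, 3.5
hit = is_hit(4.0, mu, sigma)    # z = 16/3.5 ≈ 4.57 -> hit
miss = is_hit(-2.0, mu, sigma)  # z = 10/3.5 ≈ 2.86 -> no hit
```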
additionally, while the alpha-rbs pseudoknot has a more complex structure and three cm components are needed to model it, our searching algorithm efficiently identifies more than 95% of the inserted pseudoknots with high specificity. a higher c+g concentration in the background does not adversely affect the specificity of the model; it is evident from table 2 that the program achieves better overall performance in both sensitivity and specificity against backgrounds of higher c+g concentration. we therefore conjecture that the specificity of the model is partly determined by the base composition of the genome and improves if the base composition of the target gene differs considerably from its background. to test the accuracy of the program on real genomes, we performed experiments searching for particular pseudoknot structures in the genomes of a variety of organisms. table 3 shows the genomes we searched with our program and the locations annotated for the corresponding pseudoknot structures. the program successfully identified the exact locations of the known 3'utr pseudoknot in four genomes from the coronavirus family. this pseudoknot was recently shown to be essential for the replication of the viruses in the family [9]. in addition, the genomes of the bacteria haemophilus influenzae and neisseria meningitidis mc58 were searched for their tmrna genes. the haemophilus influenzae dna genome contains about 1.8 × 10^6 nucleotides and the neisseria meningitidis mc58 dna genome contains about 2.2 × 10^6 nucleotides. the tmrna functions in the trans-translation process to add a c-terminal peptide tag to the incomplete protein product of a defective mrna [16].
[table 3 caption: the results obtained with our searching program on the genomes of a variety of organisms. ga is the accession number of the genome; rl specifies the real location of the pseudoknot structure in the genome; sl is the location returned by the program; rt is the running time needed to perform the search, in hours; gl is the length of the genome in bases.]
the central part of the secondary structure of the tmrna molecule consists of four pseudoknot structures. figure 1 shows the pseudoknot structures on the tmrna molecule. in order to search the bacterial dna genomes efficiently, the combined pseudoknots 1 and 2 were used to search the genome first; the program searches for the whole tmrna gene only in the regions around the locations where a hit for pk1 and pk2 is detected. we cut each genome into segments of shorter length (around 10^5 nucleotide bases each) and ran the program in parallel on ten of them in two rounds. the result for neisseria meningitidis mc58 shows that we successfully identified the exact location of the tmrna gene. however, the location obtained for haemophilus influenzae has a shift of around 20 nucleotides with respect to the real location (7% of the length of the tmrna). this slight error can probably be ascribed to our "hit-and-extend" searching strategy, adopted to resolve the difficulty arising from the complex structure and the relatively much larger size of tmrna genes; positional errors may occur during the different searching stages and accumulate to a significant value. our experiment on the dna genomes also demonstrates that each genome very likely contains only one tmrna gene, since our program found only one significant hit per genome. to our knowledge, this is the first computational experiment in which a whole genome of more than a million nucleotides was successfully searched for a complex structure containing pseudoknots. the covariance model (cm) proposed by eddy and durbin [6][5] can effectively model the base pairs formed between nucleotides in an rna molecule.
similarly to the emission probabilities in hmms, the emission probabilities in the cm for both unpaired nucleotides and base pairs are position dependent. the profiling of a stem hence consists of a chain of consecutive emissions of base pairs. parallel stems on the rna sequence are modeled with bifurcation transitions, where a bifurcation state is split into two states. the parallel stems are then generated from the transitions starting with the two resulting states. the genome is scanned by a window of an appropriate length. each location of the window is scored by aligning all subsequence segments contained in the window to the model with the cyk algorithm. the maximum log-odds score among them is taken as the log-odds score associated with the location. a hit is reported for a location if the computed log-odds score is higher than a predetermined threshold value. pseudoknot structures are beyond the profiling capability of a single cm due to the inherent context sensitivity of pseudoknots. models for pseudoknot structures require a mechanism for the description of their interleaving stems. previous work by brown and wilson [2] and cai et al. [4] modeled pseudoknot structures with grammar components that intersect or cooperatively communicate. a similar idea is adopted in this work: a number of independent cm components are combined to resolve the difficulty in profiling that arises from the interleaving stems. interleaving stems are profiled in different cm components, and the alignment score of a sequence segment is determined from a combination of the alignment scores on all components. however, the optimal conformations from the alignments on different components may violate some of the conformational constraints that a single rna sequence must follow. for example, a nucleotide rarely forms two different base pairs simultaneously with other nucleotides in an rna molecule.
this type of restriction is not considered by the independent alignments carried out in our model and may thus lead to erroneous searching results if not treated properly. in our model, stem contention may occur. we break the contention by introducing different priorities for the components; base pairs determined from components with the highest priority win the contention. we hypothesize that, biochemically, components profiling more stems are likely to play more dominant roles in the formation of the conformation, and hence assign them higher priority weights. in order to profile the interleaving stems in a pseudoknot structure with independent cm components, we need an algorithm that can partition the set of stems on the rna sequence into a number of subsets of stems that mutually do not interleave. based on the consensus structure of the rna sequence, an undirected graph g = (v, e) can be constructed, where v, the set of vertices in g, consists of all stems on the sequence. two vertices are connected with an edge in g if the corresponding stems are in parallel or nested. the set of vertices v needs to be partitioned into subsets such that the subgraph induced by each subset forms a clique. we use a greedy algorithm to perform the partition. starting with a vertex set s initialized to contain an arbitrarily selected vertex, the algorithm iteratively searches the neighbours of the vertices in s and computes the set of vertices that are connected to all vertices in s. it then randomly selects one vertex v not in s from that set and adds v to s. the algorithm outputs s as one of the subsets in the partition when s cannot be enlarged, then randomly selects an unassigned vertex and repeats the same procedure. it stops when every vertex in g has been included in a subset.
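the greedy partition described above can be sketched as follows. here a stem is reduced to its outermost pair of positions, and two stems are compatible (connected in g) exactly when their pairings do not cross; this is a simplified illustration, not the authors' implementation:

```python
def crossing(s, t):
    """Two stems (given as (left, right) outer pair positions) interleave
    when their pairings cross, i.e. they are neither nested nor side by side."""
    (i1, j1), (i2, j2) = s, t
    return (i1 < i2 < j1 < j2) or (i2 < i1 < j2 < j1)

def partition_stems(stems):
    """Greedy partition: grow a set S of mutually compatible (non-crossing)
    stems until it cannot be extended, output it, and repeat with the rest."""
    remaining = list(stems)
    subsets = []
    while remaining:
        s_set = [remaining.pop(0)]
        extendable = True
        while extendable:
            extendable = False
            for cand in list(remaining):
                # Add cand only if it is compatible with every member of S.
                if all(not crossing(cand, member) for member in s_set):
                    s_set.append(cand)
                    remaining.remove(cand)
                    extendable = True
        subsets.append(s_set)
    return subsets

# A toy H-type pseudoknot: stem A pairs (0, 30); stem B pairs (15, 45)
# and crosses A; stem C pairs (2, 28) and is nested inside A.
stems = [(0, 30), (15, 45), (2, 28)]
parts = partition_stems(stems)  # A and C share a component, B gets its own
```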
although the algorithm does not minimize the number of subsets in the partition, our experiments show that it can efficiently provide optimal partitions of the stems for pseudoknot structures of moderate structural complexity. the cm components in the profiling model are generated and trained based on the partition of the stems. stems in the same subset are profiled in the same cm component. for each component, the parameters are estimated by considering only the consensus structure formed by the stems in that subset. the optimal alignments of a sequence segment to the cm components are computed with the dynamic programming based cyk algorithm. as mentioned before, higher priority weights are assigned to components with more stems profiled. the component with the maximum number of stems thus has the maximum weight and is the dominant component in the model. the algorithm performs alignments in the descending order of component weights. it selects the sequence segment that maximizes the log-odds score from the dominant component. the alignment scores and optimal conformations of this segment on the other components are then computed and combined to obtain the overall log-odds score for the segment's position on the genome. more specifically, we assume that the model contains k cm components m_0, m_1, ..., m_{k−1}, in descending order of component weights. the algorithm considers all possible sequence segments s_d enclosed in the window, uses equation (1) to determine the sequence segment s to be the candidate for further consideration, where w is the length of the window used in searching, and uses equation (2) to compute the overall log-odds score for s. we use s_{m_i} to denote the parts of s that are aligned to the stems profiled in cm component m_i. basically, log-odds(s_{m_i}|m_i) accounts for the contributions from the alignment of s_{m_i} to m_i. the log-odds score of s_{m_i} is counted in both m_0 and m_i and must therefore be subtracted once from the sum.
log-odds(s|m) = log-odds(s|m_0) + Σ_{i=1}^{k−1} [log-odds(s_{m_i}|m_i) − log-odds(s_{m_i}|m_0)]   (2)

the conformations corresponding to the optimal alignments of a sequence segment to all cm components are obtained by tracing back the dynamic programming matrices and checking to ensure that no stem contention occurs. since each nucleotide in the sequence is represented by a state in a cm component, the cm inherently imposes constraints on the optimal conformations of sequence segments aligned to it. we hence expect stem contention to occur with a low frequency. in order to verify this intuition, we tested the model on sequences randomly generated with different base compositions and evaluated the frequencies of stem contention for the pseudoknot structures on which we performed the accuracy tests; the results are shown in figure 2. the presence of stem contention increases the running time of the algorithm, because the alignment of one of the involved components must be recomputed to resolve the contention. based on the assumption that components with more stems contribute more to the stability of the optimal conformation, we resolve the contention in favour of such components. we perform the recomputation on the component with the lower number of stems, incorporating into the alignment algorithm the conformational constraints inherited from the components with more stems, thereby preventing the contentious stems from forming. specifically, assume that stem s_j ∈ m_i and that contention occurs between s_j and other stems profiled in m_{i−1}; the conformational constraints from component m_{i−1} are given in the form (l_1, l_2) and (r_1, r_2). in other words, to avoid the stem contention, the left and right parts of the stem must be subsequences within the index ranges (l_1, l_2) and (r_1, r_2) respectively. the dynamic programming matrices for s_j are then limited to the rectangular region satisfying l_1 ≤ s ≤ l_2 and r_1 ≤ t ≤ r_2.

[figure 2 caption: 4000 random sequences were generated at each given base composition and aligned to the corresponding profiling model (tmrna-pk34, telomerase-vert, tombus-3-iv, alpha-rbs, srprna). the sequences are of about the same length as the pseudoknot structure. the stem contention rate for each pseudoknot structure was measured and plotted as the ratio of the number of random sequences in which stem contention occurred to the total number of random sequences. left: profiling models observed to have a stem contention rate lower than 20%; right: those with slightly higher stem contention frequencies.]

the experimental results demonstrate that, in all pseudoknots on which we performed accuracy tests, stem contention occurs at a rate lower than 30% and is insensitive to the base composition of the sequences. the stem contention frequency depends on the conformational flexibility of the components in the covariance model. more conformational flexibility may improve the sensitivity of the model but causes a higher contention frequency and thus increases the running time of the algorithm. in the worst case, recomputation is needed for all nondominant components and the time complexity of the algorithm becomes o((k − 1)w^3 l), where k is the number of components in the model, and w and l are the window length and the genome length respectively. in this paper, we have introduced a new model that serves as the basis for a generic framework that can efficiently search genomes for noncoding rnas with pseudoknot structures. within the framework, interleaving stems in pseudoknot structures are modeled with independent cm components, and alignment is performed by aligning sequence segments to all components following the descending order of their weight values. stem contention occurs with a low frequency and can be resolved with a dynamic programming based recomputation. the statistical log-odds scores are computed based on the alignment results from all components.
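one plausible reading of the score combination in equation (2) is the following sketch: the dominant component's log-odds is taken as the base score, and each remaining component contributes its own score minus the score its stem regions already received under the dominant component, so that nothing is counted twice. the numbers are invented for illustration:

```python
def combined_log_odds(score_dominant, component_scores, overlap_scores):
    """Combine per-component alignment scores.

    score_dominant      - log-odds of the full segment S under M0
    component_scores[i] - log-odds of the parts S_Mi aligned to component Mi
    overlap_scores[i]   - log-odds the same parts S_Mi received under M0,
                          subtracted so they are not counted twice
    """
    total = score_dominant
    for comp, overlap in zip(component_scores, overlap_scores):
        total += comp - overlap
    return total

# Hypothetical three-component model (k = 3):
score = combined_log_odds(18.0, [7.5, 4.0], [2.5, 1.0])  # 18 + 5 + 3 = 26
```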
our experiments on both random and biological data demonstrate that the searching framework achieves excellent performance in both accuracy and efficiency and can be used in practice to annotate genomes for noncoding rna genes with complex secondary structures. we were able to search a bacterial genome for a complete structure containing a pseudoknot in about one week on our sun workstation. it would be desirable to improve our algorithm so that we could search larger genomes and databases. the running time could be significantly shortened if a filter were designed to preprocess dna genomes, so that only the parts that pass the filtering step are aligned to the model. alternatively, it may be possible to devise alternative profiling methods to the covariance model that would allow faster searches.

references:
- dynamic programming algorithms for rna secondary structure prediction with pseudoknots
- rna pseudoknot modeling using intersections of stochastic context free grammars with applications to database search
- small subunit ribosomal rna modeling using stochastic context-free grammars
- stochastic modeling of pseudoknot structures: a grammatical approach
- biological sequence analysis: probabilistic models of proteins and nucleic acids
- rna sequence analysis using covariance models
- ribonuclease p: unity and diversity in a trna processing ribozyme
- direct rna motif definition and identification from multiple sequence alignments using secondary structure profiles
- characterization of the rna components of a putative molecular switch in the 3' untranslated region of the murine coronavirus genome
- rfam: an rna family database
- rsearch: finding homologs of single structured rna sequences
- hidden markov models in computational biology: applications to protein modeling
- prediction of rna pseudoknots – comparative study of genetic algorithms
- rna pseudoknot prediction in energy based models
- rnamotif, an rna secondary structure definition and search algorithm
- functional and structural analysis of a pseudoknot upstream of the tag-encoded sequence in e. coli tmrna
- 7sk small nuclear rna binds to and inhibits the activity of cdk9/cyclin t complexes
- design, implementation and evaluation of a practical pseudoknot folding algorithm based on thermodynamics
- the language of rna: a formal grammar that includes pseudoknots
- a dynamic programming algorithm for rna structure prediction including pseudoknots
- an iterated loop matching approach to the prediction of rna secondary structures with pseudoknots
- stochastic context-free grammars for trna modeling
- an expanding universe of noncoding rnas
- tree adjoining grammars for rna structure prediction
- the 7sk small nuclear rna inhibits the cdk9/cyclin t1 kinase to control transcription

key: cord-103435-yufvt44t authors: van aalst, marvin; ebenhöh, oliver; matuszyńska, anna title: constructing and analysing dynamic models with modelbase v1.0 a software update date: 2020-10-02 journal: biorxiv doi: 10.1101/2020.09.30.321380 sha: doc_id: 103435 cord_uid: yufvt44t

background: computational mathematical models of biological and biomedical systems have been successfully applied to advance our understanding of various regulatory processes, metabolic fluxes, effects of drug therapies, and disease evolution and transmission. unfortunately, despite community efforts leading to the development of sbml and the biomodels database, many published models have not been fully exploited, largely due to a lack of proper documentation or a dependence on proprietary software.
to facilitate synergies within the emerging research fields of systems biology and medicine by reusing and further developing existing models, an open-source toolbox that makes the overall process of model construction more consistent, understandable, transparent and reproducible is desired. results and discussion: we provide here an update on the development of modelbase, a free, expandable python package for constructing and analysing ordinary differential equation-based mathematical models of dynamic systems. it provides intuitive and unified methods to construct and solve these systems. significantly expanded visualisation methods allow convenient analyses of the structural and dynamic properties of the models. once the user specifies reaction stoichiometries and rate equations, the system of differential equations is assembled automatically. a newly provided library of common kinetic rate laws greatly reduces the repetitiveness of the model code, and full sbml compatibility is provided. previous versions provided functions for the automatic construction of networks for isotope labelling studies; using user-provided label maps, modelbase v1.0 streamlines the expansion of classic models to their isotope-specific versions. finally, the library of previously published models implemented in modelbase is continuously growing. ranging from photosynthesis over tumour cell growth to viral infection evolution, all these models are now available in a transparent, reusable and unified format through modelbase. conclusion: at the small price of learning a new software package, written in python, currently one of the most popular programming languages, the user can develop new models and actively profit from the work of others, repeating and reproducing models in a consistent, tractable and expandable manner.
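the construction principle described here, reaction stoichiometries plus rate laws assembled automatically into a system of odes, can be illustrated with a minimal, dependency-free sketch. this mimics the workflow but is not the actual modelbase api; the sir example and all parameter values are invented:

```python
def make_rhs(stoichiometries, rate_laws):
    """Assemble dy/dt from {reaction: {species: coeff}} and
    {reaction: rate(y) -> float}, as described in the text."""
    def rhs(y):
        dydt = {species: 0.0 for species in y}
        for rxn, rate in rate_laws.items():
            v = rate(y)
            for species, coeff in stoichiometries[rxn].items():
                dydt[species] += coeff * v
        return dydt
    return rhs

# SIR example: infection S + I -> 2I, recovery I -> R.
beta, gamma = 0.4, 0.1
stoich = {
    "infection": {"S": -1, "I": +1},
    "recovery":  {"I": -1, "R": +1},
}
rates = {
    "infection": lambda y: beta * y["S"] * y["I"],
    "recovery":  lambda y: gamma * y["I"],
}
rhs = make_rhs(stoich, rates)

# Integrate with a simple explicit Euler scheme (a real package would
# hand rhs to a proper ODE solver instead).
y = {"S": 0.99, "I": 0.01, "R": 0.0}
dt = 0.1
for _ in range(1000):  # simulate to t = 100
    d = rhs(y)
    y = {s: y[s] + dt * d[s] for s in y}
```

because every reaction's stoichiometric coefficients sum to zero, the total population is conserved by construction, which is one practical benefit of specifying models this way rather than writing the differential equations by hand.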
moreover, the expansion of models to their label-specific versions enables simulating label propagation, thus providing quantitative information regarding network topology and metabolic fluxes. mathematical models are accepted as valuable tools in advancing biological and medical research [1, 2] . in particular, models based on ordinary differential equations (odes) have found their application in a variety of fields. most recently, deterministic models simulating the dynamics of infectious diseases gained the interest of the general public during our combat of the covid-19 pandemic, when a large number of ode-based mathematical models have been developed and discussed even in non-scientific journals (see for example [3] [4] [5] ). such focus on mathematical modelling is not surprising, because computational models allow for methodical investigations of complex systems under fixed, controlled and reproducible conditions. hence, the effect of various perturbations of the systems can be inspected systematically in silico. importantly, long before exploring their predictive power, the model building process itself plays an important role in integrating and systematising vast amounts of available information [6] . properly designed and verified computational models can be used to develop hypotheses to guide the design of new research experiments (e.g., in immunology to study lymphoid tissue formation [7] ), support metabolic engineering efforts (e.g., identification of enzymes to enhance essential oil production in peppermint [8] ), contribute to tailoring medical treatment to the individual patient in the spirit of precision medicine (e.g., in oncology [2] ), or guide political decision making and governmental strategies (see the review on the impact of modelling for european union policy [9] ). considering their potential impact, it is crucial that models are openly accessible so that they can be verified and corrected, if necessary.
in many publications, modelling efforts are justified by the emergence of extraordinary amounts of data provided by new experimental techniques. however, arguing for the necessity of model construction only because a certain type or amount of data exists ignores several important aspects. firstly, computational models are generally a result of months, if not years, of intense research, which involves gathering and sorting information, simplifying numerous details and distilling out the essentials, implementing the mathematical description in computer code, carrying out performance tests and, finally, validating the simulation results. our understanding of many phenomena could become deeper if, instead of constructing yet another first-generation model, we could efficiently build on the knowledge that was systematically collected in previously developed models. secondly, the invaluable knowledge generated during the model construction process is often lost, mainly because the main developer leaves the research team, but also due to unfavourable funding strategies. it is easier to obtain research funds for the construction of novel, even if perfunctory, models than to support the long-term maintenance of existing ones. preservation of the information collected in the form of a computational model has become an important quest in systems biology, and has been to some extent addressed by the community. the development of the systems biology markup language (sbml) [10] for unified communication and storage of biomedical computational models and the existence of the biomodels repository [11] have already ensured the survival of constructed models beyond the academic lifetime of their developers or the lifetime of the software used to create them. but a completed model in the sbml format does not make it possible to follow the logic of model construction, and the knowledge generated by the building process is not preserved.
such knowledge loss can be prevented by providing simple-to-use toolboxes enforcing a universal and readable form of constructing models. we have therefore decided to develop modelbase [12] , a python package that encourages the user to actively engage in the model building process. on the one hand we fix the core of the model construction process, while on the other hand the software does not make the definitions too strict, and fully integrates the model construction process into the python programming language. this differentiates our software from many available python-based modelling tools (such as scrumpy [13] or pysces [14] ) and other mathematical modelling languages (recently reviewed from a software engineering perspective by schölzel and colleagues [15] ). we report here new features in modelbase v1.0, developed over the last two years. we have significantly improved the interface to make model construction easier and more intuitive. the accompanying repository of re-implemented, published models has been considerably expanded, and now includes a diverse selection of biomedical models. this diversity highlights the general applicability of our software. essentially, every dynamic process that can be described by a system of odes can be implemented with modelbase. implementation modelbase is a python package to facilitate the construction and analysis of ode-based mathematical models of biological systems. version 1.0 introduces changes not compatible with the previous official release 0.2.5 published in [12] . all api changes are summarised in the official documentation hosted by readthedocs. the model building process starts by creating a modelling object of the dedicated python class model and adding to it the chemical compounds of the system. then, following the intuition of connecting the compounds, the network is constructed by adding the reactions one by one. each reaction requires stoichiometric coefficients and a kinetic rate law.
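the workflow just described (collect compounds, then add reactions with stoichiometric coefficients and rate laws, and let the software assemble the odes) can be illustrated with a minimal sketch. note that this is not the modelbase api: the function below is a hypothetical stand-in showing only how stoichiometries and rate laws combine into the right-hand side of an ode system.

```python
# Minimal illustration (not the actual modelbase API) of how reaction
# stoichiometries plus rate laws assemble into an ODE right-hand side.

def make_rhs(stoichiometries, rate_laws):
    """stoichiometries: {reaction: {compound: coefficient}}
    rate_laws: {reaction: function(concentrations) -> rate}"""
    def rhs(conc):
        dcdt = {c: 0.0 for coeffs in stoichiometries.values() for c in coeffs}
        for rxn, coeffs in stoichiometries.items():
            v = rate_laws[rxn](conc)          # evaluate the kinetic rate law
            for compound, coeff in coeffs.items():
                dcdt[compound] += coeff * v   # stoichiometry times rate
        return dcdt
    return rhs

# toy two-reaction chain A -> B -> C with mass-action kinetics
stoich = {"v1": {"A": -1, "B": 1}, "v2": {"B": -1, "C": 1}}
rates = {"v1": lambda c: 0.5 * c["A"], "v2": lambda c: 0.3 * c["B"]}
rhs = make_rhs(stoich, rates)
print(rhs({"A": 2.0, "B": 1.0, "C": 0.0}))
# {'A': -1.0, 'B': 0.7, 'C': 0.3}
```

the returned function can then be handed to any ode integrator; the point is only that the designer specifies chemistry, not differential equations.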
the latter can be provided either as a custom function or by selecting from the newly provided library of rate laws. the usage of this library (ratelaws) reduces repetitiveness by avoiding boilerplate code. it requires the user to explicitly define reaction properties, such as directionality. this contributes to a systematic and understandable construction process, following the second guideline from the zen of python: "explicit is better than implicit". from this, modelbase automatically assembles the system of odes. it also provides numerous methods to conveniently retrieve information about the constructed model. in particular, the get* methods allow inspecting all components of the model and calculating reaction rates for given concentration values. these functions have multiple variants to return all common data structures (array, dictionary, data frame). after the model building process is completed, simulation and analyses of the model are performed with the simulator class. currently, we offer interfaces to two integrators to solve stiff and non-stiff ode systems. provided you have installed the assimulo package [16] , as recommended in our installation guide, modelbase will use cvode, a variable-order, variable-step multi-step algorithm. the cvode class provides a direct connection to sundials, the suite of nonlinear and differential/algebraic equation solvers [17] , which is a powerful industrial solver and robust time integrator with high computing performance. if assimulo is not available, the software will automatically switch to the scipy library [18] , using lsoda as an integrator, which in our experience showed lower computing performance. sensitivity analysis provides a theoretical foundation to systematically quantify the effects of small parameter perturbations on the global system behaviour.
in particular, metabolic control analysis (mca), initially developed to study metabolic systems, is an important and widely used framework providing quantitative information about the response of the system to perturbations [19, 20] . this new version of modelbase now has a full suite of methods to calculate response coefficients and elasticities, and to plot them as a heat-map, giving a clear and intuitive colour-coded visualisation of the results. an example of such a visualisation, for a re-implemented toy model of the upper part of glycolysis (section 3.1.2 [21] ), can be found in figure 1 . many of the available relevant software packages for building computational models restrict the users by providing unmodifiable plotting routines with predefined settings that may not suit your personal preferences. in modelbase v1.0 we constructed our plotting functions to allow the user to pass optional keyword arguments (often abbreviated as **kwargs), so you can still access and change all plot elements, providing a transparent and flexible interface to the commonly used matplotlib library [22] . the easy-access functions to visualise the results of simulations were expanded from the previous version. they now include plotting selections of compounds or fluxes, phase-plane analysis and the results of mca. models for isotope tracing modelbase has additionally been developed to aid the in silico analyses of label propagation during isotopic studies. in order to simulate the dynamic distribution of isotopes, all possible labelling patterns for all intermediates need to be created. by providing an atom transition map in the form of a list or a tuple, all 2^n isotope-specific versions of a chemical compound are created automatically, where n denotes the number of possibly labelled atoms. changing the name of the previous function carbonmap to labelmap in v1.0 acknowledges the diversity of possible labelling experiments that can be reproduced with models built using our software.
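as a side note, the scaled coefficients on which mca is built can be approximated numerically for any rate law. the sketch below (plain python, not the modelbase mca interface) estimates an elasticity coefficient by a central finite difference; the michaelis-menten rate law and its parameter values are illustrative.

```python
def elasticity(rate, conc, compound, rel_step=1e-6):
    """Scaled elasticity eps = (S / v) * dv/dS, estimated by a
    central finite difference around the given concentrations."""
    s0 = conc[compound]
    v0 = rate(conc)
    h = s0 * rel_step
    up = dict(conc); up[compound] = s0 + h
    down = dict(conc); down[compound] = s0 - h
    dv_ds = (rate(up) - rate(down)) / (2 * h)
    return s0 / v0 * dv_ds

# Michaelis-Menten rate: analytically, eps = Km / (Km + S)
vmax, km = 2.0, 0.5
mm = lambda c: vmax * c["S"] / (km + c["S"])
eps = elasticity(mm, {"S": 1.0}, "S")
print(round(eps, 4))  # ~ 0.3333 (= Km / (Km + S) for S = 1, Km = 0.5)
```

the same finite-difference idea, applied to steady-state variables instead of a single rate, yields response and control coefficients.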
sokol and portais derived the theory of dynamic label propagation under a stationary assumption [23] . in steady state the space of possible solutions is reduced and the labelling dynamics can be represented by a set of linear differential equations. we have used this theory and implemented an additional class linearlabelmodel, which allows rapid calculation of the label propagation given the steady-state concentrations and fluxes of the metabolites [23] . modelbase automatically builds the linear label model once the label maps are provided. such a model is shown in figure 2 , where we simulate label propagation in a linear non-reversible pathway, as in figure 1 in [23] . the linear label models are constructed using modelbase rate laws and hence can be fully exported as an sbml file. figure 2 . labelling curves in a linear non-reversible pathway: label propagation curves for a linear non-reversible pathway of 5 randomly sized metabolite pools, as proposed in the paper by sokol and portais [23] . circles mark the position at which the first derivative of each labelling curve reaches its maximum; in the original paper this information has been used to analyse the label shock wave (lsw) propagation. to reproduce these results run the jupyter notebook from additional file 2. many models lose their readability due to inconsistent, intractable or misguided naming of their components (an example is a model with reactions named v1-v10, without referencing them properly). by providing meta data for any modelbase object, you can abbreviate component names in a personally meaningful manner and then supply additional annotation information in accordance with standards such as miriam [24] via the newly developed meta data interface. this interface can also be used to supply additional important information that must be shared in a publication but is not necessarily inside the code, such as the unit of a parameter.
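to make the stationary-case idea concrete, the sketch below (independent of modelbase, with arbitrary pool sizes) integrates the linear label-propagation equations for a chain of pools at steady state: for a pool of size X_i carrying flux v, the labelled fraction x_i follows dx_i/dt = (v / X_i)(x_(i-1) - x_i), with a fully labelled feed applied as a step input.

```python
# Label propagation through a linear non-reversible pathway at steady state.
# Pool sizes are illustrative; the feed is fully labelled from t = 0.

def simulate_labels(pool_sizes, flux=1.0, t_end=20.0, dt=0.001):
    x = [0.0] * len(pool_sizes)        # labelled fraction of each pool
    history = []
    for _ in range(int(t_end / dt)):
        upstream = 1.0                 # step input: fully labelled feed
        new = []
        for xi, size in zip(x, pool_sizes):
            new.append(xi + dt * flux / size * (upstream - xi))
            upstream = xi              # old value feeds the pool downstream
        x = new
        history.append(x)
    return history

hist = simulate_labels([1.0, 2.0, 0.5, 3.0, 1.5])
print([round(v, 3) for v in hist[-1]])  # every pool approaches full labelling
```

upstream pools are labelled earlier than downstream ones, which is exactly the wave-like behaviour the figure illustrates.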
with the newly implemented changes our package becomes more versatile and user-friendly. as argued before, its strength lies in its flexibility and applicability to virtually any biological system with dynamics that can be described using an ode system. there exist countless mathematical models of biological and biomedical systems derived using odes. many of them are rarely re-used, at least not to the extent that could be reached if models were shared in a readable, understandable and reusable way [15] . as our package can be efficiently used both for the development of new models and for the reconstruction of existing ones, as long as they are published with all the kinetic information required for the reconstruction, we hope that modelbase will in particular support users with limited modelling experience in reconstructing already existing work, serving as a starting point for their further exploration and development. we have previously demonstrated the versatility of modelbase by re-implementing mathematical models previously published without the source code: two models of biochemical processes in plants [25, 26] , and a model of the non-oxidative pentose phosphate pathway of human erythrocytes [27, 28] . to present how the software can be applied to study medical systems, we used modelbase to re-implement various models not published by our group, and reproduced key results of the original manuscripts. it was beyond our focus to verify the scientific accuracy of the corresponding model assumptions. we have selected them to show that, despite describing different processes, they all share a unified construct. this highlights that by learning how to build a dynamic model with modelbase, you in fact do not learn how to build a one-purpose model, but expand your toolbox to be capable of reproducing any given ode-based model. all examples are available as jupyter notebooks and listed in the additional files.
for the purpose of this paper, we surveyed available computational models and subjectively selected a relatively old publication of significant impact (based on the number of citations), published without the computational source code or details regarding the numerical integration. we have chosen a four-compartment model of hiv immunology that investigates the interaction of a single virus population with the immune system, described only by the cd4+ t cells, commonly known as t helper cells [29] . we have implemented the four odes describing the dynamics of uninfected (t), latently infected (l) and actively infected (a) cd4+ t cells and the infectious hiv population (v). figure 3 shows our reproduction of the results from fig. 3 of the original paper [29] : the decrease in the overall population of cd4+ t cells (uninfected + latently infected + actively infected) over time, depending on the number of infectious particles produced per actively infected cell (n). to reproduce these results run the jupyter notebook from additional file 3. in figure 3 , by changing the number of infectious particles produced per actively infected cell (n), we follow the dynamics of the overall t cell population (t+l+a) over a period of 10 years. the model has been further used to explore the effect of azidothymidine, an antiretroviral medication, by decreasing the value of n after 3 years by 25% or 75%, mimicking the blocking of viral replication of hiv. a more detailed description of the time-dependent drug concentration in the body is often achieved with pharmacokinetic models. mathematical models based on a system of differential equations that link the dosing regimen with the dynamics of a disease are called pharmacokinetic-pharmacodynamic (pk-pd) models [30] , and with the next example we explore how modelbase can be used to develop such models.
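the structure of the four odes described above can be sketched directly: uninfected t cells are supplied and grow logistically, become latently infected on contact with virus, progress to active infection, and actively infected cells release n virions on death. the rate constants below are illustrative round numbers, not the published parameter set; with these choices the infection-free steady state is t = 1000.

```python
def hiv_model(n, t_end=3650.0, h=0.05):
    """Four-compartment HIV model structure from [29]; parameter values
    are illustrative approximations (per day), not the published set."""
    s_supply, r, t_max = 10.0, 0.03, 1500.0   # supply and logistic growth
    mu_t, mu_a, mu_v = 0.02, 0.24, 2.4        # death rates
    k1, k2 = 2.4e-5, 3.0e-3                   # infection / activation rates
    t, l, a, v = 1000.0, 0.0, 0.0, 1.0e-3     # t = 1000 is infection-free
    for _ in range(int(t_end / h)):           # forward Euler over ~10 years
        total = t + l + a
        d_t = s_supply + r * t * (1.0 - total / t_max) - mu_t * t - k1 * v * t
        d_l = k1 * v * t - (mu_t + k2) * l
        d_a = k2 * l - mu_a * a
        d_v = n * mu_a * a - k1 * v * t - mu_v * v
        t, l, a, v = t + h * d_t, l + h * d_l, a + h * d_a, v + h * d_v
    return t, l, a, v

t_inf, l_inf, a_inf, v_inf = hiv_model(n=1400)   # above the critical n
t_free, l_free, a_free, v_free = hiv_model(n=0)  # no virion production
print(round(t_free, 1), round(t_inf, 1))
```

with these numbers the infection persists only above a critical n; for large n the uninfected t-cell count falls well below its infection-free level over the 10-year horizon, which is the qualitative behaviour the figure shows.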
technological advances have forced a paradigm shift in many fields of our life, including medicine, making more personalised healthcare not only a possibility but a necessity. a pivotal part of the success of precision medicine will be to correctly determine dosing regimens for drugs [31] , and pk-pd models provide a quantitative tool to support this [32] . pk-pd models have proved quite successful in many fields, including oncology [33] , and here we used the classical tumour growth model by simeoni and colleagues, originally implemented using the industry-standard software winnonlin [34] . as the pharmacokinetic model has not been fully described, we reproduced only the highly simplified case, where we assume a single drug administration and investigate the effect of drug potency (k2) on simulated tumour growth curves. in figure 4 we plot the simulation results of the modelbase implementation of the system of four odes over a period of 18 days, where we systematically changed the value of k2, assuming a single drug administration on day 9. with the mca suite available in our software we can calculate the change of the system in response to perturbations of all other system parameters. such a quantitative description of the system's response to local parameter perturbations provides support in further studies of the rational design of combined drug therapy or the discovery of new drug targets, as described in the review by cascante and colleagues [35] . finally, compartmental models based on ode systems have a long history of application in mathematical epidemiology [36] . many of them, including numerous recent publications studying the spread of coronavirus, are based on the classic epidemic susceptible-infected-recovered (sir) model, originating from the theories developed by kermack and mckendrick at the beginning of the last century [37] .
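returning to the simeoni example: the published model structure is a proliferating compartment attacked by the drug, followed by a chain of damage compartments through which hit cells pass before dying, with the total tumour weight being the sum of all compartments. the sketch below follows that structure with illustrative (not published) parameter values, and uses a simple exponentially decaying drug concentration as the single-administration pk input.

```python
import math

def tumour_growth(k2, dose=0.0, t_dose=9.0, t_end=18.0, h=0.001):
    """Simeoni-type tumour model sketch: x1 proliferates; drug-hit cells
    transit x2 -> x3 -> x4 before death. Parameters are illustrative."""
    lam0, lam1, psi, k1 = 0.3, 0.7, 20.0, 0.5
    kel = 1.0                        # assumed drug elimination rate (1/day)
    x1, x2, x3, x4 = 0.05, 0.0, 0.0, 0.0
    t = 0.0
    while t < t_end:
        w = x1 + x2 + x3 + x4        # total tumour weight
        growth = lam0 * x1 / (1.0 + (lam0 / lam1 * w) ** psi) ** (1.0 / psi)
        c = dose * math.exp(-kel * (t - t_dose)) if t >= t_dose else 0.0
        dx1 = growth - k2 * c * x1   # drug potency k2 removes cycling cells
        dx2 = k2 * c * x1 - k1 * x2
        dx3 = k1 * (x2 - x3)
        dx4 = k1 * (x3 - x4)         # cells leaving x4 die
        x1, x2, x3, x4 = x1 + h*dx1, x2 + h*dx2, x3 + h*dx3, x4 + h*dx4
        t += h
    return x1 + x2 + x3 + x4

untreated = tumour_growth(k2=0.0)
weak = tumour_growth(k2=0.5, dose=1.0)
strong = tumour_growth(k2=5.0, dose=1.0)
print(round(untreated, 3), round(weak, 3), round(strong, 3))
```

sweeping k2 as in the experiment described above reproduces the qualitative picture: the larger the potency, the deeper the post-administration dip in the growth curve.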
one of the most critical pieces of information sought while simulating the dynamics of an infectious disease is the existence of a disease-free or endemic equilibrium and the assessment of its stability [38] . indeed, periodic oscillations have been observed for several infectious diseases, including measles, influenza and smallpox [36] . to provide an overview of more modelbase functionalities, we have implemented a relatively simple sir model based on the recently published autonomous model for smallpox [39] . we have generated damped oscillations and visualised them using the built-in function plot_phase_plane (see figure 5 ). in the attached jupyter notebook we present how quickly and efficiently, in terms of lines of code, the sir model is built, and how to add and remove reactions and/or compounds to construct further variants of this model, such as a seir (e-exposed) or sird (d-deceased) model. figure 4 . compartmental pharmacokinetic-pharmacodynamic model of tumour growth after anticancer therapy: we have reproduced the simplified version of the pk-pd model of tumour growth, where the pk part is reduced to a single input, and simulated the effect of drug potency (k2) on tumour growth curves. the system of four odes describing the dynamics of the system visualised in the scheme above is integrated over a period of 18 days. we systematically changed the value of k2, assuming a single drug administration on day 9, and obtained the same results as in figure 3 of the original paper [34] . to reproduce these results run the jupyter notebook from additional file 4. figure 5 . the sir model with vital dynamics, including a birth rate, has been adapted based on the autonomous model to simulate the periodicity of a chicken pox outbreak in hida, japan [39] . to reproduce these results run the jupyter notebook from additional file 5. we are presenting here updates of our modelling software, developed to simplify the building process of mathematical models based on odes.
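as an aside to the sir example above: a minimal sir model with vital dynamics (birth and death rate mu) produces exactly this kind of damped oscillation towards the endemic equilibrium whenever r0 = beta/(gamma + mu) > 1. the sketch below is a generic implementation with illustrative parameters, not the parametrisation of [39].

```python
def sir_vital(beta=0.3, gamma=0.1, mu=1.0 / (70 * 365), n=1000.0,
              t_end=36500.0, h=0.5):
    """SIR with vital dynamics (rates per day); returns the I(t) history.
    Births replenish susceptibles, which re-ignites epidemic waves."""
    s, i, r = n - 1.0, 1.0, 0.0
    infected = []
    for _ in range(int(t_end / h)):          # forward Euler, ~100 years
        ds = mu * n - beta * s * i / n - mu * s
        di = beta * s * i / n - (gamma + mu) * i
        dr = gamma * i - mu * r
        s, i, r = s + h * ds, i + h * di, r + h * dr
        infected.append(i)
    return infected

i_hist = sir_vital()
peaks = [i_hist[k] for k in range(1, len(i_hist) - 1)
         if i_hist[k - 1] < i_hist[k] > i_hist[k + 1]]
print(len(peaks))  # several epidemic waves, the first one the largest
```

without the vital dynamics (mu = 0) the same code produces a single epidemic and no recurrence, which is why the birth rate is the ingredient responsible for the observed periodicity.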
modelbase is fully embedded in the python programming language. it facilitates a systematic construction of new models and the reproduction of existing models in a consistent, tractable and expandable manner. as odes provide a core method to describe dynamical systems, we hope that our software will serve as the base for deterministic modelling, hence its name. with the smoother interface and clearer description of how the software can be used for medical purposes, such as the simulation of possible drug regimens for precision medicine, we expect to broaden our user community. we envisage that, by providing the mca functionality, users new to mathematical modelling will also adopt a working scheme in which such sensitivity analyses become an integral part of model development and study. the value of sensitivity analyses is demonstrated by considering how the results of such analyses have given rise to new potential targets for drug discovery [35] . we especially anticipate that the capability of modelbase to automatically generate label-specific models will prove useful in predicting fluxes and label propagation dynamics through various metabolic networks. in emerging fields such as computational oncology, such models will be useful to, e.g., predict the appearance of labels in cancer cells. if you have any questions regarding modelbase, you are very welcome to ask them. it is our mission to enable reproducible science and to help put the theory into action.
project name: modelbase
project home page: https://pypi.org/project/modelbase/
operating system(s): platform independent
programming language: python
other requirements: none
licence: gnu general public license (gpl), version 3
any restrictions to use by non-academics: none

references:
[1] computational systems biology
[2] computational oncology - mathematical modelling of drug regimens for precision medicine
[3] effective containment explains subexponential growth in recent confirmed covid-19 cases in china
[4] an updated estimation of the risk of transmission of the novel coronavirus (2019-ncov)
[5] covid-19 outbreak on the diamond princess cruise ship: estimating the epidemic potential and effectiveness of public health countermeasures
[6] how computational models can help unlock biological systems
[7] model-driven experimentation: a new approach to understand mechanisms of tertiary lymphoid tissue formation, function, and therapeutic resolution
[8] mathematical modeling-guided evaluation of biochemical, developmental, environmental, and genotypic determinants of essential oil composition and yield in peppermint leaves
[9] modelling for eu policy support: impact assessments
[10] the systems biology markup language (sbml): a medium for representation and exchange of biochemical network models
[11] biomodels database: a repository of mathematical models of biological processes
[12] building mathematical models of biological systems with modelbase
[13] scrumpy: metabolic modelling with python
[14] modelling cellular systems with pysces
[15] required characteristics for modeling languages in systems biology: a software engineering perspective. biorxiv
[16] assimulo: a unified framework for ode solvers
[17] sundials: suite of nonlinear and differential/algebraic equation solvers
[18] algorithms for scientific computing in python
[19] the control of flux
[20] a linear steady-state treatment of enzymatic chains. general properties, control and effector strength
[21] systems biology: a textbook
[22] matplotlib: a 2d graphics environment
[23] theoretical basis for dynamic label propagation in stationary metabolic networks under step and periodic inputs
[24] minimum information requested in the annotation of biochemical models (miriam)
[25] short-term acclimation of the photosynthetic electron transfer chain to changing light: a mathematical model
[26] a mathematical model of the calvin photosynthesis cycle
[27] comparison of computer simulations of the f-type and l-type non-oxidative hexose monophosphate shunts with 31p-nmr experimental data from human erythrocytes
[28] 13c n.m.r. isotopomer and computer-simulation studies of the non-oxidative pentose phosphate pathway of human erythrocytes
[29] dynamics of hiv infection of cd4+ t cells
[30] modeling of pharmacokinetic/pharmacodynamic (pk/pd) relationships: concepts and perspectives
[31] precision medicine: an opportunity for a paradigm shift in veterinary medicine
[32] precision dosing in clinical medicine: present and future
[33] different ode models of tumor growth can deliver similar results
[34] predictive pharmacokinetic-pharmacodynamic modeling of tumor growth kinetics in xenograft models after administration of anticancer agents
[35] metabolic control analysis in drug discovery and disease
[36] the mathematics of infectious diseases
[37] a contribution to the mathematical theory of epidemics
[38] qualitative and bifurcation analysis using an sir model with a saturated treatment function
[39] emergence of oscillations in a simple epidemic model with demographic data

the authors declare that they have no competing interests. jupyter notebook with the modelbase implementation of the minimal pharmacokinetic-pharmacodynamic (pk-pd) model linking the dosing regimen of an anticancer agent to the tumour growth, proposed by simeoni and colleagues [34] .
jupyter notebook with the modelbase implementation of the classic epidemic susceptible-infected-recovered (sir) model, parametrised as the autonomous model used to simulate the periodicity of a chicken pox outbreak in hida, japan [39] . the official documentation is hosted on readthedocs. key: cord-027336-yk3cs8up authors: paciorek, mateusz; bogacz, agata; turek, wojciech title: scalable signal-based simulation of autonomous beings in complex environments date: 2020-05-22 journal: computational science iccs 2020 doi: 10.1007/978-3-030-50420-5_11 sha: doc_id: 27336 cord_uid: yk3cs8up simulation of groups of autonomous beings poses a great computational challenge in terms of required time and resources. the need to simulate large environments, numerous populations of beings, and to increase the detail of models causes the need for parallelization of computations. the signal-based simulation algorithm, presented in our previous research, proved the possibility of linear scalability of such computations up to thousands of computing cores. in this paper further extensions of the signal-based models are investigated and a new method for defining complex environments is presented. it allows efficient and scalable simulation of structures which cannot be defined using two dimensions, like multi-story buildings, anthills or bee hives. the solution is applied to define a building evacuation model, which is validated using empirical data from a real-life evacuation drill. the research on modeling and simulation of groups of autonomous beings, like crowds of pedestrians, cars in traffic or swarms of bees, has a great need for efficient simulation methods. the need to simulate numerous groups of beings in large environments, the growing complexity of models and the desire to collect results quickly all increase the complexity of the computational task. therefore, new, scalable simulation methods are constantly pursued by researchers.
in our previous research [2] , a novel method for the parallelization of autonomous beings simulation was presented. it is based on the concept of information propagation, which replaces the need for searching the required data by individual beings. the signal is modeled in a 2d grid of cells; it is directional and spreads autonomously to adjacent cells in the grid. this simple concept removes the need to analyze remote locations during the model update and therefore allows splitting the simulated environment between many computing processes, which communicate with a fixed number of neighbors once per time step, which is the crucial requirement of the super-scalable algorithm defined in [5] . the basic version of the method allows representing two-dimensional environments as grids of cells, which are split into equal fragments and updated in parallel. after each time step, the processes updating neighboring fragments exchange information about signals and beings located in the common borders. the method requires defining dedicated simulation models; however, in return it can provide linear scalability of the simulation. linear scalability has been achieved in several tests performed on hpc hardware, involving up to 3456 computing cores. the linearly scalable method for autonomous beings simulation encouraged further research on signal-based models and their validation. in [2] three simple models were presented in order to demonstrate the capabilities of the method. in [9] a model of pedestrian behaviour based on proxemics rules was developed and tested. further work on signal-based models for pedestrians led to the problem considered in this paper: the need to model complex, multi-level buildings, which cannot be represented in two dimensions. a similar problem can be encountered in modeling the habitations of other species, like anthills, bee hives or mole tunnel systems.
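the border-exchange idea behind the method can be shown in miniature: update two grid fragments independently, but swap their edge cells once per step so that each fragment sees its neighbour's border from the previous step. the toy signal update below (a simple weighted spread on a 1d array, not the xinuk model itself) demonstrates that a split-and-exchange run reproduces the serial run exactly.

```python
# Toy signal update: each cell keeps half its signal and receives a quarter
# of each neighbour's. Ghost values stand in for cells owned by the other
# fragment (or for the absorbing boundary, where they are zero).

def step(cells, left_ghost, right_ghost, keep=0.5):
    padded = [left_ghost] + cells + [right_ghost]
    return [keep * padded[i] + 0.25 * (padded[i - 1] + padded[i + 1])
            for i in range(1, len(padded) - 1)]

def run_split(cells, steps):
    half = len(cells) // 2
    a, b = cells[:half], cells[half:]
    for _ in range(steps):
        # exchange borders once per step, then update both fragments
        a_new = step(a, 0.0, b[0])
        b_new = step(b, a[-1], 0.0)
        a, b = a_new, b_new
    return a + b

def run_serial(cells, steps):
    for _ in range(steps):
        cells = step(cells, 0.0, 0.0)
    return cells

grid = [0.0] * 4 + [1.0] + [0.0] * 3
assert run_split(grid, 5) == run_serial(grid, 5)
print("split and serial updates agree")
```

because each fragment only ever needs one cell of remote data per step, the communication volume per process is constant regardless of how many processes the grid is split across, which is what makes the approach scale.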
this class of modeling problems can be addressed by using three-dimensional models; however, the efficiency of such an approach would be doubtful. only a small part of the 3d space is accessible and significant, so a 3d model would introduce a huge waste of memory and computing resources. in this paper we propose an extension of the signal propagation method addressing the requirements of environments that cannot be represented by a 2d grid. the extension introduces an abstraction over the relation of adjacency, which enables flexibility in defining the shape of the environment while preserving values of distance and direction. this concept is further explored by implementing an evacuation scenario in a multi-story building as an example of such an environment. finally, we compare the metrics collected during the simulation with the available real-life data to validate the resulting model. the problem of autonomous beings simulation and computer-aided analysis of phenomena taking place in large groups of such beings has been studied for many decades. for example, the first significant results of traffic simulation using computers date back to the 1950s [6] . since then countless reports on methods for modeling and simulation of different types of beings and different phenomena have been published. specific problems, like urban traffic or pedestrian dynamics, have attracted so much attention that several different classifications of models can be found in the literature [10, 12, 15] . a taxonomy based on the considered level of detail [12] can be found in several different problems, where macro-, meso- and micro-scale models are distinguished. in recent years the vast majority of research has focused on micro-scale models, which distinguish individual entities and allow differences in their characteristics and behavior. one of the basic decisions which has to be made while defining the model is the method of representing the workspace of the beings.
the environment model can be discrete or continuous, and it can represent 2 or 3 dimensions. the decision to discretize the workspace significantly simplifies the algorithms for model execution, which allows simulating larger groups in bigger environments. in many cases a discrete workspace model is sufficient to represent the desired features of the beings and reproduce phenomena observed in real systems, as in the well-recognized nagel-schreckenberg freeway traffic model [8] . many other researchers use inspirations from cellular automata in the simulation of different types of beings (swarms of bees [1] or groups of pedestrians [13] ) because of their simplicity, elegance and sufficient expressiveness. the common challenge identified in the vast majority of publications in the area is the performance of the simulations. the need to increase the complexity of models, simulate larger populations of beings and get results faster is visible in almost all of the considered approaches. in many cases performance issues prevent further development, and therefore a lot of effort is being put into creating parallel versions of simulation algorithms and general-purpose simulation frameworks. scalable communication mechanisms have been added to repast [4] , and dedicated frameworks are being built (like pandora [14] ). the results show that by defining models dedicated for parallel execution, scalability can be achieved, as in [3] , where almost linear scaling is demonstrated up to 432 processing units with the flame on hpc framework.
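to show how little machinery a discrete-workspace model needs, here is a compact implementation of the nagel-schreckenberg freeway model [8]: a circular road of cells and the four classic update rules (accelerate, brake to the gap, random slowdown, move). road length, car count and random seed are arbitrary choices for the demonstration.

```python
import random

def nasch_step(cars, road_length, v_max=5, p_slow=0.3, rng=random):
    """One parallel update of the Nagel-Schreckenberg cellular automaton.
    cars: list of (position, velocity), sorted by position, circular road."""
    n = len(cars)
    updated = []
    for i, (pos, v) in enumerate(cars):
        next_pos = cars[(i + 1) % n][0]
        gap = (next_pos - pos - 1) % road_length  # free cells ahead
        v = min(v + 1, v_max)                     # 1. accelerate
        v = min(v, gap)                           # 2. brake to keep the gap
        if v > 0 and rng.random() < p_slow:
            v -= 1                                # 3. random slowdown
        updated.append(((pos + v) % road_length, v))  # 4. move
    return sorted(updated)

rng = random.Random(42)
cars = sorted((p * 10, 0) for p in range(10))  # 10 cars on a 100-cell ring
for _ in range(50):
    cars = nasch_step(cars, 100, rng=rng)
print(cars[:3])
```

rule 2 guarantees that no two cars ever occupy the same cell, so the whole state fits in a list of integer pairs; this simplicity is exactly why cellular-automaton models scale to very large populations.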
therefore, in our solution presented in [2], we focused on removing the need to access remote parts of the data structure. the modeling method assumes that information is explicitly pushed to the computing units that might need it for updating their state. the implemented xinuk simulation platform demonstrates the scalability of the approach, offering linear scalability up to 3456 cores of a supercomputer. in this paper we present important extensions to the modeling methods supported by the platform, which allow representing complex environments that cannot be modeled with a 2d grid of cells.

the scalable simulation algorithm implemented by the xinuk framework is capable of simulating any 2d environment while providing a flexible approach to the interpretation of the grid. cells might represent terrain with qualities appropriate for the simulation, actors exclusively occupying a specific place in the grid, or a group of actors amassed in one location. each cell is adjacent to its eight neighbor cells and is able to interact with any of them (e.g. move its contents or a part thereof), if the logic defined by the simulation designer allows such an action. however, simulations that cannot be represented in a simple 2d environment are difficult, if not impossible, to model properly using the framework. while some 3d layouts can be mapped to a 2d grid using simplifications or other compromises (e.g. modeling upward/downward slopes or stairs as special cells that modify the movement speed or behavior of an agent in such a cell), most terrain configurations would greatly benefit from, or even require, a more general solution. one example of such a configuration, which will be the main focus of this work, is a multi-story building with staircases located in the center. each floor can be represented as an independent part of the 2d grid, but the connections between the floors would have to overlap with them.
the standard 2d moore neighborhood is not sufficient to correctly represent the aforementioned environments. one solution would be to generalize the method to a 3d grid of cells, with each cell having 26 neighbors. however, in the considered class of problems this approach would result in a significant waste of memory and computational resources, as only a few of the cells would matter for the simulation results. from this problem stems the idea of abstracting the cell neighborhood.

the proposed version of the modeling method introduces a neighborhood mapping for each direction. each cell can connect to: top, top-right, right, bottom-right, bottom, bottom-left, left and top-left. given a grid of dimensions h × w, this mapping can be declared as in eq. 1, where x × y is the set of all possible coordinates in the initial grid, d is the set of the mentioned directions and n is a function mapping coordinates and a direction to another set of coordinates, or to none, representing the absence of a neighbor in the given direction. likewise, the signal has been updated to be stored as a similar map containing the signal strength in each direction. as a result, the signal propagation algorithm required reformulation to make use of the new representation. firstly, the notion of the adjacent directions ad of a direction was needed, presented in eq. 2. with the use of this function, and assuming a function s that returns the current signal of the cell at the given coordinates in the given direction (eq. 3), the new signal propagation function sp can be described as in eq. 4, where sp f ∈ [0, 1] is a global suppression factor of the signal. it is worth noting that, should the need arise, this mechanism can be extended to any collection of directions, as long as the signal propagation function is updated to properly represent distribution in all directions.
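since eqs. 1-4 are not reproduced in this extraction, the sketch below gives one plausible python reading of the mechanism: a default grid neighbor mapping that returns `None` beyond the borders, the circular adjacent-direction function, and a propagation step that, for each direction, combines the neighbor's signal in that direction and in its two adjacent directions, scaled by the suppression factor. the exact combination rule of eq. 4 may differ; this is an assumption for illustration only:

```python
OFFSETS = {
    "top": (-1, 0), "top-right": (-1, 1), "right": (0, 1),
    "bottom-right": (1, 1), "bottom": (1, 0), "bottom-left": (1, -1),
    "left": (0, -1), "top-left": (-1, -1),
}
ORDER = list(OFFSETS)  # the 8 directions in circular (clockwise) order

def grid_neighbor(h, w, x, y, d):
    """Default mapping N: (coordinates, direction) -> coordinates or None."""
    dx, dy = OFFSETS[d]
    nx, ny = x + dx, y + dy
    return (nx, ny) if 0 <= nx < h and 0 <= ny < w else None

def adjacent_dirs(d):
    """AD: the two directions circularly adjacent to direction d."""
    i = ORDER.index(d)
    return (ORDER[(i - 1) % 8], ORDER[(i + 1) % 8])

def propagate(s, neighbor, x, y, spf=0.5):
    """New directional signal map of cell (x, y): for each direction d, the
    neighbor's signal in d and in the directions adjacent to d, suppressed by
    spf (an assumed combination rule, standing in for eq. 4)."""
    out = {}
    for d in ORDER:
        nb = neighbor(x, y, d)
        if nb is None:
            out[d] = 0.0
        else:
            nx, ny = nb
            a1, a2 = adjacent_dirs(d)
            out[d] = spf * (s(nx, ny, d) + s(nx, ny, a1) + s(nx, ny, a2))
    return out
```

replacing `grid_neighbor` with any other function of the same signature is exactly the abstraction point the method relies on: a staircase connection is just a neighbor mapping that jumps to another part of the grid.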
the introduction of the new neighbor resolution allows seamless adaptation of the previous approach: by default, all the neighbors of a cell are the adjacent cells. the neighbor mapping function for such a case is defined as in eq. 5, with the exception of the grid borders, where neighbors are nonexistent (none, as in (1)). additionally, during grid creation any neighbor relation can be replaced to represent a non-grid connection between cells.

while the concept of remote connections appears trivial, it is critical to consider the possibility that in the process of grid division neighbors are distributed to separate computational nodes. such a situation is certain to occur along the line of division. in our previous approach, we applied the buffer zones mechanism as a solution. the new concept required this part of the framework to be redesigned to allow more flexible cell transfer. as a result, a new type of cell was introduced: the remote cell. each remote cell represents a cell that is not present in the part of the grid processed by this worker and contains:

- the identifier of the worker responsible for processing the part of the grid containing the target cell,
- the coordinates of the target cell,
- the contents of the cell awaiting transfer to the target cell.

following each simulation step, the contents of all remote cells are sent to their respective workers, using the logic previously associated with the buffer zones. the modification of the framework did not alter the overall complexity of the simulation process. communication between processes was not changed and utilizes the same methods as the previous synchronization of the buffer zones. creating a grid containing a large number of non-standard neighborhood relations does introduce additional cell contents that need to be transmitted to the target worker; however, this is the minimal volume of data required for the simulation to proceed.
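a minimal sketch of the remote-cell bookkeeping described above (the names are illustrative, not the actual xinuk api): each remote cell records the owning worker, the target coordinates and the pending contents, and after every step the pending contents are grouped into one message per destination worker:

```python
from dataclasses import dataclass, field

@dataclass
class RemoteCell:
    worker_id: int          # worker processing the grid part with the target
    target: tuple           # coordinates of the target cell
    contents: list = field(default_factory=list)  # payload awaiting transfer

def flush_remote(remote_cells):
    """Group all pending contents by destination worker (one message each),
    mirroring the per-step send that replaced the buffer-zone exchange."""
    outbox = {}
    for rc in remote_cells:
        if rc.contents:
            outbox.setdefault(rc.worker_id, []).append((rc.target, rc.contents))
            rc.contents = []
    return outbox
```

grouping by worker keeps the number of messages per step fixed (one per communication target), which is the property the scalability argument in the next section relies on.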
summarizing, as a result of all the mentioned modifications, the simulation algorithm acquired the ability to model environments that cannot be represented on a 2d grid. the buffer zones mechanism has been abstracted to allow more flexible communication, without assuming that communication can only occur at the borders of grid parts. the scalability of the framework has been preserved, since the amount of data sent between workers remains unchanged for the same simulation model represented in the previous and the proposed approach. it is possible to define more complex communication schemes; however, the number of communication targets remains fixed and relatively low for each worker.

signal-based methods can be used to simulate evacuations of people from buildings. in such a signal-based model, it is enough to place signal sources in exits, so that beings will move according to the egress routes created by the signal emitted by the sources, leaving the building. a negative signal can be used to make beings stay away from potential threats, for instance fire or smoke. in the presented model, the repelling signal was used to represent the reluctance to form large crowds when alternative routes are possible. in the xinuk framework, there are two basic cell types:

- obstacle - a cell that does not let signal through,
- emptycell - an empty cell traversable by signal and accessible by any being.

in the proposed model, the obstacle type of cell was used to create walls. in addition, the following cells were added to create the evacuation model:

- personcell - representing a person to evacuate,
- exitcell - representing an exit from the building,
- teleportationcell - a remote cell, described in the previous section, that moves a being to a destination cell,
- evacuationdirectioncell - a source of a static signal.

a being of personcell type was able to move to any accessible adjacent cell, that is an emptycell, an exitcell or a teleportationcell.
movements were possible in 8 directions if there were no walls or other people in the neighborhood, as shown in fig. 1. in the created model, two types of signals were used:

- static signal field - a snapshot of the signal propagated in a particular number of initial iterations, where cells of evacuationdirectioncell type were the signal sources. the propagated signal created egress routes that remained static during the simulation,
- dynamic signal field - the signal emitted by moving personcell beings.

the static signal field can be compared to the floor field described in [11] or the potential field in [13]. two different models of evacuating people's behavior were implemented and tested:

- moving variant - always move if a movement is possible,
- standing variant - move only when the best movement is possible.

in the moving variant, a person's destination was calculated as follows:

1. signals in the directions of the neighbor cells were calculated by summing the dynamic and static signals,
2. the calculated signals were sorted in descending order,
3. from the sorted destinations, the first destination was chosen that was currently available (the cell was not occupied by any other person).

in the standing variant, a person did not move if the best direction was not available, preventing unnatural movements in directions further away from the targeted exit. thus the 3rd step of the moving variant algorithm was changed as follows:

3. from the sorted destinations, the first destination was chosen. if that destination was not available, the being would not move.

in a high congestion of beings trying to get to a particular location, conflicts are highly probable. one solution to this problem is to let two beings enter one cell, creating a crowd. a simpler solution, implemented in our model, is to check that the destination cell chosen by a being is empty both in the current and in the next iteration. this way, all conflicts are avoided.
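the two decision rules can be condensed as follows (a hypothetical helper, not framework code): candidate cells are ranked by the sum of static and dynamic signal, and the variants differ only in what happens when the best cell is taken. the `occupied` set is assumed to already contain cells claimed for the next iteration, which implements the conflict-avoidance rule from the last paragraph:

```python
def choose_destination(candidates, signal, occupied, standing=False):
    """Pick a destination cell, or None to stay in place.

    candidates: accessible neighbor cells; signal: cell -> static + dynamic
    signal value; occupied: cells taken now or already claimed for the next
    iteration (so two beings can never claim the same cell)."""
    ranked = sorted(candidates, key=lambda c: signal[c], reverse=True)
    if standing:
        # standing variant: only the single best move is acceptable
        return ranked[0] if ranked and ranked[0] not in occupied else None
    # moving variant: best *available* move, even if it is a poor one
    for c in ranked:
        if c not in occupied:
            return c
    return None
```

the single `standing` flag makes it easy to run both variants in the same experiment, as done in the validation section.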
each floor of the building was mapped onto a 2d grid reflecting its shape and dimensions. to simulate a multi-story building using 2d grids, a special teleportationcell was created. a being that entered the teleportationcell at the end of a staircase on floor n was moved to the cell corresponding to the beginning of the staircase on floor n − 1.

to validate the evacuation model, a simulation of a real-life evacuation described in [7] was implemented. the evacuation drill took place in a 14-story tower connected to a 3-story low-rise structure. the highest floor of the tower was ignored in the research, thus in the implementation there is no xii floor in the tower, as shown on the building plan (fig. 2). the highest floor of the low-rise structure was ignored as well, and the data focused on the tower, which is why the low-rise structure is shown as one floor on the building plan. each floor was connected to two staircases, each 0.91 m wide, allowing only a single line of pedestrians to form. the staircase length was not mentioned in the paper. evacuation started earlier on 3 floors: v, vi and vii. after 3 min, there was a general alarm on the remaining floors. the pre-evacuation time curve was approximately linear (fig. 6 in [7]). the evacuation rate was over one person per second for the first 75% of the people in the building. afterwards, the rate was slightly lower because one of the exits was no longer used (for a reason not explained in the article). results from the drill can be seen in fig. 6.

based on the above data, an implementation of the evacuation model described in the previous section was created on the xinuk platform. crucial parameters were set as follows:

- gridsize = 264 - size of the grid. the whole grid was a square of 264 × 264 cells, each cell representing a square with a side length of 0.91 m,
- iterationsnumber = 1000 - number of iterations completed in a single simulation; 1 iteration corresponded to 1 s,
- evacuationdirectioninitialsignal = 1 - signal intensity emitted by an evacuationdirectioncell,
- personinitialsignal = -0.036 - signal intensity emitted by a personcell. in contrast to evacuationdirectioninitialsignal, the value was negative so that people repelled each other slightly.

in the simulation, 1 iteration corresponds to 1 s of the evacuation. the number of people on each floor was set according to table 1 of the source paper. the number of people placed on the low-rise structure's floor was equal to the number of all people in the low-rise structure stated in the source paper. the implemented simulation had three phases:

- phase 1 - 1st to 18th iteration - creating the static signal field by placing signal sources in the cells corresponding to the exits and corridor ends,
- phase 2 - 19th to 199th iteration - evacuation after the initial alarm, evacuating people from the v, vi and vii floors,
- phase 3 - 200th to 1000th iteration - evacuation after the general alarm, evacuating people from the whole building.

to achieve linear pre-evacuation times, people were not all placed on the grid in the first iteration of the 2nd and 3rd phases of the simulation; instead they were placed on the grid linearly, one person per iteration on each floor (following fig. 6 in [7]). validation of the model was based on visual analysis of the simulation results and on the evacuation curve metric, i.e. the dependency between the number of people evacuated from the building and time. the empirical data is shown in fig. 6. during the observations, two anomalies were visible:

problem 1: crowds of people next to the upper and lower entrances of the staircases on each floor were consistently different (the lower entrance seemed to generate larger crowds), and an evacuation using the upper stairs tended to take longer (fig. 3).
problem 2: people in corridors who could not move forward but could go back were moving back and forth waiting to go further, or were choosing to go back to the staircase entrances.

problem 1 was a result of the sequential order of updating cells. a person trying to enter an upper staircase was above the people in the staircase, thus that person was updated before the people in the staircase. this way, the person's movement was favored, congestion was visible in the corridors, and there were no crowds close to the upper entrances of the staircases. similarly, when trying to enter a lower staircase, the movement of the people already in the staircase was favored: a person trying to enter the staircase was updated after the people in the staircase had managed to occupy all empty spaces, preventing the person from entering. figure 5 shows both situations. a simple solution to this problem was to update the locations of people in random order. figure 4 shows the results of such an approach: crowds are distributed equally close to both staircase entrances, and the evacuation in both staircases progresses similarly.

problem 2 was a result of the decision-making algorithm. according to the algorithm, people would make any movement if one was possible, even a bad one, rather than stay in place. the solution was to use the standing variant: choose only the most attractive destination, or do not move at all.

the visual analysis of the final model behavior is satisfactory. the formation of crowds and the selection of exits resembled the expected phenomena. the people were eager to follow the shortest path towards the exits, while avoiding excessive crowding which might lead to trampling. the four combinations of the model update algorithm (moving/standing variant and sequential/random variant) were executed 30 times; in the sequential variant, cells were updated in the order that favors movements of people higher in the grid.
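the fix for problem 1 amounts to a one-line change in the main loop. the sketch below (illustrative, not the framework's actual scheduler) contrasts the biased sequential sweep with the randomized order:

```python
import random

def update_people(people, try_move, order="random", rng=random):
    """Apply try_move to every person once per iteration.

    'sequential' reproduces the original top-to-bottom sweep that favored
    people higher in the grid; 'random' shuffles the order each iteration,
    removing the systematic bias at the staircase entrances."""
    schedule = list(people)
    if order == "random":
        rng.shuffle(schedule)
    for person in schedule:
        try_move(person)
    return schedule
```

passing an explicit `rng` (e.g. `random.Random(seed)`) keeps randomized runs reproducible, which matters when averaging over the 30 repetitions mentioned above.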
fig. 5 caption: on the left, a situation when person a is trying to enter a staircase below; on the right, a situation when person b is trying to enter a staircase above.

the simulations were run on the prometheus supercomputer (part of the pl-grid infrastructure), which allowed completion of all computation within several minutes. the resulting chart (fig. 7) shows that this particular metric is not influenced significantly by the selected variant. the average rate of people leaving the building is a little over 1 person per second, which matches the data in fig. 6. after the evacuation of 75% of the people, the rate did not change, as in the simulation people continued using two exits. on the source chart (fig. 6) people have already reached the exits in the first seconds, while on the resulting chart (fig. 7) the first person left the building after the 200th second of the evacuation. according to [7], people on floors other than v, vi and vii did not evacuate before the general alarm; thus it is not clear why the chart shown in fig. 6 suggests that some people left the building in such a short time, descending at least 5 floors of stairs. the results of the experiments do not match the empirical data from [7] perfectly. nonetheless, taking into consideration the ambiguity, contradictions or lack of some details of the described evacuation, we conclude that the resulting model yielded realistic and consistent outcomes. in the context of validating the presented method as a building block of similar environments, the results are satisfactory.

in this paper we proposed an extension to the signal propagation modeling method and the xinuk framework, addressing the limitations of the existing approach in the context of complex 3d environments. we introduced an alternative to the standard, grid-based neighborhood as a means of generalizing the idea of buffer zones present in the framework.
the new mechanism greatly increased the flexibility of the method, while avoiding the massive growth of complexity that would result from switching to a full 3d grid environment. at the same time, the scalability of the original solution remained unhindered, as the alteration did not increase the amount of data exchanged between computational nodes in the synchronization step. the evacuation scenario used as a demonstration of the new capabilities of the method provided results confirming that such a scenario can be accurately represented in the resulting framework. as a follow-up of this work, we intend to further explore the possibilities arising from the flexible neighborhood declaration, especially environments that would benefit from additional directions, e.g. layers of an anthill utilizing the vertical directions. research dedicated to further reduction of the communication might yield interesting results as well, e.g. investigating a trade-off between accuracy and performance of the system when the synchronization is performed only after several simulation steps.
references:

[1] beehave: a systems model of honeybee colony dynamics and foraging to explore multifactorial causes of colony failure
[2] high-performance computing framework with desynchronized information propagation for large-scale simulations
[3] exploitation of high performance computing in the flame agent-based simulation framework
[4] repast hpc: a platform for large-scale agent-based modeling
[5] super-scalable algorithms for computing on 100,000 processors
[6] simulation of freeway traffic on a general-purpose discrete variable computer
[7] pre-evacuation data collected from a mid-rise evacuation exercise
[8] a cellular automaton model for freeway traffic
[9] hpc large-scale pedestrian simulation based on proxemics rules
[10] verification and validation of simulation models
[11] cellular automaton model for evacuation process with obstacles
[12] genealogy of traffic flow models
[13] towards realistic and effective agent-based models of crowd dynamics
[14] scalable agent-based modelling with cloud hpc resources for social simulations
[15] crowd analysis: a survey

acknowledgments. the research presented in this paper was supported by the polish ministry of science and higher education funds assigned to agh university of science and technology. the authors acknowledge using the pl-grid infrastructure.

key: cord-020683-5s3lghj6
authors: buonomo, bruno
title: effects of information-dependent vaccination behavior on coronavirus outbreak: insights from a siri model
date: 2020-04-09
journal: nan
doi: 10.1007/s11587-020-00506-8
sha: doc_id: 20683 cord_uid: 5s3lghj6

abstract: a mathematical model is proposed to assess the effects of a vaccine on the time evolution of a coronavirus outbreak. the model has the basic structure of siri compartments (susceptible-infectious-recovered-infectious) and is implemented by taking into account the behavioral changes of individuals in response to the available information on the status of the disease in the community.
we found that the cumulative incidence may be significantly reduced when the information coverage is high enough and/or the information delay is short, especially when the reinfection rate is high enough to sustain the presence of the disease in the community.

this analysis is inspired by the ongoing outbreak of a respiratory illness caused by the novel coronavirus covid-19. coronaviruses can infect people and then spread from person to person. this may happen with serious consequences: well-known cases are that of severe acute respiratory syndrome (sars), which killed 813 people worldwide during the 2002-2003 outbreak [29], and the more recent case of middle east respiratory syndrome coronavirus (mers), where a total of 2494 confirmed cases including 858 associated deaths were reported, the majority from saudi arabia (at the end of november 2019, [30]). therefore, coronaviruses may represent a serious public health threat. the emergency related to the novel outbreak in china is still ongoing at the time of writing this article, and it is unclear how the situation worldwide will unfold. the news released by the media creates great concern, and behavioral changes can be observed in the everyday life of individuals, even in europe where at the moment only a few cases have been reported. for example, the fear of coronavirus led to protective face masks selling out in pharmacies in italy long before the first case in the country was reported [1]. a specific aspect of diseases caused by coronaviruses is that humans can be reinfected with respiratory coronaviruses throughout life [19]. the duration of immunity for sars, for example, was estimated to be greater than 3 years [34].
moreover, investigations of human coronavirus with infected volunteers have shown that even though the immune system reacts after the infection (serum-specific immunoglobulin and igg antibody levels peak 12-14 days after infection), at one year following experimental infection there is only partial protection against re-infection with the homologous strain [9].

predictions or insight concerning the time evolution of epidemics, especially when a new emerging infectious disease is under investigation, can be obtained by using mathematical models. in mathematical epidemiology, a large amount of literature is devoted to the use of so-called compartmental epidemic models, where the individuals of the community affected by the infectious disease are divided into mutually exclusive groups (the compartments) according to their status with respect to the disease [3, 4, 10, 21, 24]. compartmental epidemic models are proving to be the first mathematical approach for estimating the epidemiological parameter values of covid-19 in its early stage and for anticipating future trends [2, 11, 28]. when the disease of interest confers permanent immunity from reinfection after recovery, the celebrated sir model (susceptible-infectious-recovered) and its many variants are most often adopted. however, when reinfection cannot be neglected, the sirs model (susceptible-infectious-recovered, and again susceptible) and its variants may be used, under the assumption that infection does not change the host's susceptibility [3, 4, 10, 21, 24]. since the disease of our interest has both reinfection and partial immunity after infection, we consider as a starting point the so-called siri model (susceptible-infectious-recovered-infectious), which takes into account both these features (see [25] and the references therein for further information on the siri model). when the epidemic process may be decoupled from the longer time-scale demographic dynamics, i.e.
when birth and natural death terms may be neglected, one gets a simpler model with an interesting property. in fact, according to the values of three relevant parameters (the transmission rate, the recovery rate and the reinfection rate), the model exhibits three different dynamics [18, 20]: (i) no epidemic will occur, in the sense that the fraction of infectious will decrease from the initial value to zero; (ii) an epidemic outbreak occurs, in the sense that the fraction of infectious will initially increase until a maximum value is reached and then decrease to zero; (iii) an epidemic outbreak occurs and the disease will permanently remain within the population.

at the time of writing this paper, scholars are racing to make a vaccine for the novel covid-19 coronavirus available. as of february 12, 2020, it was announced that 'the first vaccine could be ready in 18 months' [32]. therefore, it becomes an intriguing problem to qualitatively assess how the administration of a vaccine could affect the outbreak, taking into account the behavioral changes of individuals in response to the information available on the status of the disease in the community. this is the main aim of this paper. the scenario depicted here is that of a community where a relatively small quantity of infectious individuals is present at the time of delivering the vaccine. the vaccination is assumed to be fully voluntary, and the choice to get vaccinated or not is assumed to depend in part on the available information and rumors concerning the spread of the disease in the community. the behavioral change of individuals is introduced by employing the method of information-dependent models [14, 15, 33], which is based on the introduction of a suitable information index.
such an approach has been applied to general infectious diseases [8, 14, 15, 23, 33] as well as specific ones, including childhood diseases like measles, mumps and rubella [14, 33], and is currently under development (for very recent papers see [5, 22, 35]). therefore, another goal of this manuscript is to provide an application of the information index to a simple model containing relevant features of a coronavirus disease. specifically, we use epidemiological parameter values based on early estimations for the novel coronavirus covid-19 [28].

the rest of the paper is organized as follows: in sect. 2 we introduce the basic siri model and recall its main properties. in sect. 3 we implement the siri model by introducing the information-dependent vaccination. the epidemic and the reinfection thresholds are discussed in sect. 4. section 5 is devoted to numerical investigations: the effects of the information parameters on the time evolution of the outbreak are discussed. conclusions and future perspectives are given in sect. 6.

since the disease of our interest has both reinfection and partial immunity after infection, we first consider the siri model, which is given by the following nonlinear ordinary differential equations (the upper dot denotes the time derivative) [18]:

ṡ = μ − βsi − μs,
i̇ = βsi + σβri − γi − μi,
ṙ = γi − σβri − μr.     (1)

here s, i and r denote, respectively, the fractions of susceptible, infectious (and also infected) and recovered individuals at a time t (the dependence on t is omitted); β is the transmission rate; γ is the recovery rate; μ is the birth/death rate; σ ∈ (0, 1) is the reduction in susceptibility due to previous infection. model (1) assumes that the time scale under consideration is such that demographic dynamics must be considered. however, epidemics caused by coronaviruses often occur quickly enough to neglect the demographic processes (as in the case of sars in 2002-2003). when the epidemic process is decoupled from demography, i.e.
when μ = 0, one obviously gets the reduced model:

ṡ = −βsi,
i̇ = βsi + σβri − γi,
ṙ = γi − σβri.     (2)

this very simple model has interesting properties. indeed, introduce the basic reproduction number r0 = β/γ. it has been shown that the solutions have the following behavior [20]: if r0 ≤ 1, then no epidemic will occur, in the sense that the state variable i(t) denoting the fraction of infectious will decrease from the initial value to zero; if r0 ∈ (1, 1/σ), then an epidemic outbreak will follow, in the sense that the state variable i(t) will initially increase until a maximum value is reached and then it decreases to zero; if r0 > 1/σ, then an epidemic outbreak will follow and the disease will permanently remain within the population, in the sense that the state variable i(t) will approach (after a possibly non-monotone transient) an endemic equilibrium e, given by

e = (s̄, ī, r̄) = (0, 1 − 1/(σ r0), 1/(σ r0)).

the equilibrium e is globally asymptotically stable [20], and it is interesting to note that, since demography has been neglected, the disease persists in the population due to the reservoir of partially susceptible individuals in the compartment r. from a mathematical point of view, the threshold r0 = r0σ, where r0σ = 1/σ, is a bifurcation value for model (2). this does not happen for model (1): in fact, when demography is included in the model, the endemic equilibrium exists for r0 > 1, where r0 = β/(μ + γ), and therefore both below and above the reinfection threshold.

model (2) (as well as (1)) is a simple model which is able to describe the time evolution of the epidemic spread on a short time scale. however, it does not take into account possible control measures. the simplest one to consider is vaccination. we consider the scenario where the vaccination is assumed to be fully voluntary. in order to emphasize the role of reinfection, we assume that only susceptible individuals (i.e. individuals who did not experience the infection) consider this protective option. when the vaccine is perfect (i.e.
it is an ideal vaccine which confers 100 percent life-long immunity), one gets the following model:

ṡ = −βsi − ϕ0 s,
i̇ = βsi + σβri − γi,
ṙ = γi − σβri,
v̇ = ϕ0 s,     (4)

where v denotes the fraction of vaccinated individuals and ϕ0 is the vaccination rate. in the next section we will modify the siri model (4) to assess how a hypothetical vaccine could control the outbreak, taking into account the behavioral changes of individuals produced by the information available on the status of the disease in the community.

we modify the siri model by employing the idea of information-dependent epidemic models [23, 33]. we assume that the vaccination is fully voluntary and information-dependent, in the sense that the choice to get vaccinated or not depends on the available information and rumors concerning the spread of the disease in the community. the information is mathematically represented by an information index m(t), which summarizes the information about the current and past values of the disease and is given by the following distributed delay [12-14, 16]:

m(t) = ∫₀^∞ g̃(s(t − τ), i(t − τ)) k(τ) dτ.     (5)

here, the function g̃ describes the information that individuals consider relevant for making their choice to vaccinate or not. it is often assumed that g̃ depends only on the prevalence [5, 12, 14, 16], g̃ = g(i), where g is a continuous, differentiable, increasing function such that g(0) = 0. in particular, we assume that:

g(i) = k i.     (6)

in (6) the parameter k is the information coverage and may be seen as a 'summary' of two opposite phenomena: disease under-reporting, and the level of media coverage of the status of the disease, which tends to amplify the social alarm. the range of variability of k may be restricted to the interval (0, 1) (see [6]). the delay kernel k(t) in (5) is a positive function such that ∫₀^∞ k(t) dt = 1 and represents the weight given to the past history of the disease. we assume that the kernel is given by the first element erl₁,ₐ(t) of the erlangian family, called the weak kernel or exponentially fading memory.
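for the weak kernel k(τ) = a e^(−aτ), the distributed delay (5) reduces, via the linear chain trick, to the single ode ṁ = a(g(i) − m). a small numerical check of this equivalence, assuming the linear prevalence form g(i) = k i (the form adopted in this paper), for an arbitrary prescribed prevalence history i(t):

```python
import math

def info_index_convolution(i_hist, a, k, dt):
    """Information index m(t) = integral of g(i(t-tau)) * a*exp(-a*tau) dtau,
    with g(i) = k*i, evaluated at the last time point by direct discretization
    (i is taken to be zero before t = 0)."""
    t_n = len(i_hist) - 1
    m = 0.0
    for j in range(t_n + 1):
        tau = (t_n - j) * dt
        m += k * i_hist[j] * a * math.exp(-a * tau) * dt
    return m

def info_index_ode(i_hist, a, k, dt):
    """Same index via the linear chain trick: m' = a*(k*i - m), m(0) = 0,
    integrated with forward Euler."""
    m = 0.0
    for i_val in i_hist[:-1]:
        m += dt * a * (k * i_val - m)
    return m
```

both routines approximate the same exact value up to o(dt) discretization error, so for a small step they agree closely; this is the reduction that turns the integro-differential model into the ode system used below.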
this means that the maximum weight is assigned to the current information, and the delay is centered at the average 1/a. therefore, the parameter a takes the meaning of the inverse of the average time delay of the collected information on the disease. with this choice, by applying the linear chain trick [26], the dynamics of m is ruled by the equation:

ṁ = a (k i − m).

we couple this equation with model (4). the coupling is realized through the following information-dependent vaccination rate:

ϕ(m) = ϕ0 + ϕ1(m),

where the constant ϕ0 ∈ (0, 1) represents the fraction of the population that chooses to get vaccinated regardless of rumors and information about the status of the disease in the population, and ϕ1(m(t)) represents the fraction of the population whose vaccination choice is influenced by the information. generally speaking, we require that ϕ1(0) = 0 and that ϕ1 is a continuous, differentiable and increasing function. however, as done in [5, 14], we take:

ϕ1(m) = (1 − ε − ϕ0) d m / (1 + d m),

where ε > 0. this parametrization leads to an overall coverage of 1 − ε (asymptotically for m → ∞). here we take ε = 0.01, which means a ceiling of 99% in vaccine uptake under circumstances of high perceived risk. we also take d = 500 [14]. note that this choice of parameter values implies that a 96.4% vaccination coverage is obtained in correspondence of an information index m = 0.07 (see fig. 1). finally, we assume that the vaccine is not perfect, which is a more realistic hypothesis, so that vaccinated individuals may be infected, but with a reduced susceptibility ψ. the siri epidemic model with information-dependent vaccination that we consider is therefore given by:

ṡ = −βsi − ϕ(m)s,
i̇ = βsi + σβri + ψβvi − γi,
ṙ = γi − σβri,
v̇ = ϕ(m)s − ψβvi,
ṁ = a(ki − m).     (9)

the meaning of the state variables and the parameters, with their baseline values, are given in table 1. note that (9)
From the second equation of (9) it easily follows that İ = γI [P0 (S + σR + ψV) − 1], and hence, if I(0) > 0, then I(t) > 0 for all t > 0. Assuming that I(0) > 0 and R(0) = V(0) = 0 (and therefore S(0) < 1), it follows that: if P0 < 1/S(0), the epidemic curve initially decays; if P0 > 1/S(0), the epidemic takes place, since the infectious curve initially grows. From the first equation in (9) it can be seen that at equilibrium it must be S̃ = 0; therefore, all the possible equilibria are susceptible-free. Since the solutions are clearly bounded, this means that, for large time, any individual who was initially susceptible has experienced the disease or has been vaccinated. Looking for equilibria in the form E = (Ĩ, R̃, Ṽ, M̃), from (10) we get: Disease-free equilibria: Ĩ = 0. It can be easily seen from (12) that there are infinitely many disease-free equilibria, of the form (Ĩ, R̃, Ṽ, M̃) = (0, R̃, 1 − R̃, 0). Endemic equilibrium: we begin by looking for equilibria with Ĩ > 0. This implies that Ṽ = 0, and it follows that a unique susceptible-free endemic equilibrium E1 exists, which exists only if σ > σc, where σc is the reinfection threshold. When σ > σc the disease may spread and persist inside the community where the individuals live. Note that in classical SIR models the presence of an endemic state is due to the replenishment of susceptibles ensured by demography [20], which is not the case here. The local stability analysis of E1 requires the Jacobian matrix of system (10): taking into account (13), (14) and the fact that V1 = 0, one eigenvalue is negative, and for the remaining submatrix the trace is negative and the determinant is positive, so that E1 is locally asymptotically stable. Remarks: (i) the stable endemic state E1 can be realised thanks to the imperfection of the vaccine, in the sense that, when ψ = 0 in (9), the variable V is always increasing; (ii) the information index, in the form described in Sect.
3, may be responsible for the onset of sustained oscillations in epidemic models, both in the case of delayed information (see e.g. [12, 14, 16, 17]) and of instantaneous information (as happens when the latency time is included in the model [7]). In all these mentioned cases, the epidemic spread is considered on a long time-scale and demography is taken into account. The analysis in this section clearly shows that sustained oscillations are not possible for the short time-scale SIRI model with information. We use epidemiological parameter values (see Table 1) based on the early estimation of the novel coronavirus COVID-19 provided in [28]. The estimation, based on a SEIR metapopulation model of infection within Chinese cities, revealed that the transmission rate within the city of Wuhan, the epicentre of the outbreak, was 1.07 day⁻¹, and that the infectious period was 3.6 days (so that γ = 0.27 day⁻¹). Therefore the BRN given in (11) is P0 = 3.85 (of course, in agreement with the estimate in [28]), and the value σc := 1/P0 = 0.26 is the threshold for the reinfection rate. For vaccinated individuals, the relative susceptibility (compared with an unvaccinated individual) is set to ψ = 0.15, which means that vaccine administration reduces the transmission rate by 85% (vaccine efficacy = 0.85). This value falls within the estimates for the most common vaccines used in the USA, where vaccine efficacy ranges between 0.75 and 0.95 (see Table 9.3, p. 222, in [24]). As for the relative susceptibility of recovered individuals, we consider two relevant baseline cases: (i) Case I: σ = 0.2, representative of a reinfection value below the reinfection threshold σc; (ii) Case II: σ = 0.4, representative of a reinfection value above the reinfection threshold σc. The information parameter values are mainly guessed or taken from papers where information-dependent vaccination is used [5, 14].
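The threshold values above follow directly from the Wuhan estimates; a quick check:

```python
beta = 1.07                      # estimated transmission rate (day^-1) [28]
infectious_period = 3.6          # days
gamma = 1.0 / infectious_period  # recovery rate, ~0.27 day^-1 as quoted (0.2778 exactly)

P0 = beta / gamma    # basic reproduction number (11)
sigma_c = 1.0 / P0   # reinfection threshold

print(round(P0, 2))       # 3.85
print(round(sigma_c, 2))  # 0.26
```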
The information coverage k ranges from a minimum of 0.2 (i.e. the public is aware of 20% of the prevalence) to 1. The average time delay of information ranges from the hypothetical case of immediate information (T = 0) to a delay of 120 days. The description and baseline values of the parameters are presented in Table 1. The initial data reflect a scenario in which a small portion of infectious individuals is present in the community at the time the vaccine is administered. Furthermore, coherently with the initial data mentioned in Sect. 4, we assume that R(0) = V(0) = M(0) = 0 and, clearly, S(0) = 1 − I(0). According to the analysis made in Sect. 4, a value of σ below the threshold σc implies that the epidemic will eventually die out; when σ is above σc, the disease is sustained endemically by reinfection. This behaviour is illustrated in Fig. 2, where the worst possible scenario is considered, with k = 0.2 and T = 120 days. In Fig. 2, left panel, the continuous line is obtained for σ = 0.2. Vaccination is not able to influence the outbreak, due to the large delay. However, even though an epidemic peak occurs after three weeks, thereafter the disease dies out, due to the low level of reinfection. The case σ = 0.4 is represented by the dotted line. As expected, the reinfection is able to 'restart' the epidemic. The trend (here captured for one year) would be to converge asymptotically to the endemic equilibrium E1. The corresponding time evolution of the information index M is shown in Fig. 2, right panel. In particular, in the elimination case (σ = 0.2), the information index reaches a maximum of approximately 0.002, which corresponds to a vaccination rate of 51.5% (see Fig. 1). After that, it declines but, due to the memory of past events, the information index is still positive months after the elimination of the disease. The 'social alarm' produced in the case σ = 0.4 is represented by the increasing continuous curve in Fig. 2, right panel.
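A scenario of this kind can be reproduced qualitatively by numerical integration. The system below is a plausible reconstruction of the information-dependent SIRI model, not the paper's exact equations: the reduced susceptibilities σ and ψ, the information dynamics Ṁ = a(kI − M) and the coverage function φ(M) follow the description in the text, while φ0 = 0.05 and the initial prevalence are assumed for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Assumed parameter values (epidemiological ones from [28]; phi0 assumed)
beta, gamma = 1.07, 1.0 / 3.6   # transmission / recovery rates
psi, sigma = 0.15, 0.4          # vaccinated / recovered relative susceptibility
k, a = 0.2, 1.0 / 120.0         # information coverage, inverse average delay
phi0, eps, D = 0.05, 0.01, 500.0

def phi(M):
    # information-dependent vaccination rate, saturating at 1 - eps
    return phi0 + (1 - eps - phi0) * D * M / (1 + D * M)

def rhs(t, y):
    S, I, R, V, M = y
    force = beta * I                       # force of infection
    dS = -force * S - phi(M) * S
    dI = force * (S + sigma * R + psi * V) - gamma * I
    dR = gamma * I - sigma * force * R
    dV = phi(M) * S - psi * force * V
    dM = a * (k * I - M)                   # linear chain trick
    return [dS, dI, dR, dV, dM]

I0 = 1e-4  # assumed small initial prevalence
sol = solve_ivp(rhs, (0, 365), [1 - I0, I0, 0, 0, 0], max_step=0.5)
print(sol.y[1].max() > I0)  # True: since P0*S(0) > 1, the epidemic takes off
```

With σ above the reinfection threshold, the infectious fraction does not vanish but approaches an endemic level, mirroring the dotted curve described for Fig. 2.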
At the end of the time frame, M ≈ 0.022, which corresponds to a vaccination rate of 91%. In summary, a large reinfection rate may produce a large epidemic. However, even in this worst scenario, the feedback produced by behavioural changes due to information may largely affect the outbreak evolution. In Fig. 3 it can be seen that more informed people react and vaccinate, and this, in turn, contributes to the elimination of the disease. Therefore, a threshold value kc exists above which the disease can be eliminated. An insight into the overall effect of the parameter k on the epidemic may be obtained by evaluating how it affects the cumulative incidence (CI), i.e. the total number of new cases in the time frame [0, tf]. We also introduce an index which measures the relative change of cumulative incidence for two different values, say p1 and p2, of a given parameter p over the simulated time frame (in other words, the percentage variation of the cumulative incidence between the two parameter values). Fig. 4 (first plot from the left) shows the case of a reinfection value σ = 0.2, i.e. below the reinfection threshold. It can be seen that CI declines with increasing k. In Fig. 4 (second plot from the left) a comparison with the case of low information coverage, k = 0.2, is given: a reduction of up to 80% of CI may be reached by increasing the value of k up to k = 0.99. When the reinfection value is σ = 0.4 (Fig. 4, third and fourth plots), i.e. above the reinfection threshold, the 'catastrophic' case occurs in correspondence of k = 0.2. This case is quickly recovered by increasing k, as we already know from Fig. 3, because of the threshold value kc, between 0.2 and 0.3, which allows the system to pass from the endemic to the non-endemic asymptotic state. Then, again, CI declines with increasing k. This means that, when reinfection is high, the effect of the information coverage is even more important.
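The relative-change index is straightforward to compute; in the sketch below the CI values are illustrative numbers, not the paper's simulation output:

```python
def relative_ci_change(ci_p1, ci_p2):
    # percentage variation of the cumulative incidence when the parameter
    # p moves between two values p1 and p2 (negative = reduction)
    return 100.0 * (ci_p1 - ci_p2) / ci_p2

# e.g. raising information coverage k from 0.2 to 0.99 (illustrative CI values):
print(round(relative_ci_change(0.12, 0.60), 1))  # -80.0, i.e. an 80% reduction of CI
```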
In fact, in this case the prevalence is high, and a high value of k results in a greater behavioural response by the population. Fig. 5 shows the influence of the information delay T on CI. In the case σ = 0.2, CI grows concavely with T (first plot from the left). In Fig. 5 (second plot) a comparison with the case of maximum information delay, T = 120 days, is given: a reduction of up to 75% of CI may be reached by reducing the value of T to just a few days. When the reinfection value is σ = 0.4 (Fig. 5, third and fourth plots), i.e. above the reinfection threshold, CI increases convexly with T. A stronger decreasing effect on CI can be seen by reducing the delay from T = 120 days to T ≈ 90 days, and a reduction of up to 98% of CI may be reached by reducing the value of T to just a few days. In this paper we have investigated how a hypothetical vaccine could affect a coronavirus epidemic, taking into account the behavioural changes of individuals in response to information about the disease prevalence. We have first considered a basic SIRI model. Such a model contains the specific feature of reinfection, which is typical of coronaviruses. Reinfection may allow the disease to persist even when the time-scale of the outbreak is small enough to neglect demography (births and natural deaths). Then, we have extended the SIRI model to take into account: (i) an available vaccine to be administered on a voluntary basis to susceptibles; (ii) the change in vaccination behaviour in response to information on the status of the disease. We have seen that the disease burden, expressed through the cumulative incidence, may be significantly reduced when the information coverage is high enough and/or the information delay is short. When the reinfection rate is above the critical value, a relevant role is played by recovered individuals.
This compartment offers a reservoir of susceptibles (although with a reduced level of susceptibility) and, if not vaccinated, may contribute to the re-emergence of the disease. On the other hand, in this case correct and quick information may play an even more important role, since the social alarm produced by a high level of prevalence results, in turn, in a high level of the vaccination rate and, eventually, in the reduction or elimination of the disease. The model on which this investigation is based is intriguing, since partial immunity coupled with short-time epidemic behaviour may lead to non-trivial epidemic dynamics (see the 'delayed epidemic' case, where an epidemic may initially decrease and take off later [27]). However, it has many limitations in representing the COVID-19 propagation. For example, the model represents the epidemic in a closed community over a relatively short time interval, and is therefore unable to capture the complexity of global mobility, which is one of the main concerns related to COVID-19 propagation. Another limitation, again related to the global aspects of epidemics like SARS and COVID-19, is that we assume that individuals are influenced by information on the status of the prevalence within the community where they live (i.e. the fraction I is part of the total population), whereas local communities may also be strongly influenced by information regarding faraway communities, which are perceived as potential threats because of global mobility. Moreover, in the absence of treatment and vaccine, local authorities face a coronavirus outbreak using social distancing measures, which are not considered here: individuals are forced to be quarantined or hospitalised. Nevertheless, contact patterns may also be reduced in response to information on the status of the disease. In this case the model could be modified to include an information-dependent contact rate, as in [5, 7].
Finally, the model does not include the latency time, and the disease-induced mortality is also neglected (at the moment, the estimate for COVID-19 is around 2%). These aspects will be part of future investigations.

References:
- Mascherine sold out in farmacie Roma
- Data-based analysis, modelling and forecasting of the novel coronavirus (2019-nCoV) outbreak. medRxiv preprint
- Infectious diseases of humans: dynamics and control
- Mathematical epidemiology
- Oscillations and hysteresis in an epidemic model with information-dependent imperfect vaccination
- Global stability of an SIR epidemic model with information dependent vaccination
- Globally stable endemicity for infectious diseases with information-related changes in contact patterns
- Modeling of pseudo-rational exemption to vaccination for SEIR diseases
- The time course of the immune response to experimental coronavirus infection of man
- Mathematical structures of epidemic systems
- A new transmission route for the propagation of the SARS-CoV-2 coronavirus
- Information-related changes in contact patterns may trigger oscillations in the endemic prevalence of infectious diseases
- Bistable endemic states in a susceptible-infectious-susceptible model with behavior-dependent vaccination
- Vaccinating behaviour, information, and the dynamics of SIR vaccine preventable diseases. Theor
- Bifurcation thresholds in an SIR model with information-dependent vaccination
- Fatal SIR diseases and rational exemption to vaccination
- The impact of vaccine side effects on the natural history of immunization programmes: an imitation-game approach
- Infection, reinfection, and vaccination under suboptimal immune protection: epidemiological perspectives
- Epidemiology of coronavirus respiratory infections
- Epidemics with partial immunity to reinfection
- Modeling infectious diseases in humans and animals
- Nonlinear dynamics of infectious diseases via information-induced vaccination and saturated treatment
- Modeling the interplay between human behavior and the spread of infectious diseases
- An introduction to mathematical epidemiology
- Bistability of evolutionary stable vaccination strategies in the reinfection SIRI model
- Biological delay systems: linear stability theory
- The change of susceptibility following infection can induce failure to predict outbreak potential for R0
- Novel coronavirus 2019-nCoV: early estimation of epidemiological parameters and epidemic predictions
- World Health Organization: cumulative number of reported probable cases of SARS
- Middle East respiratory syndrome coronavirus
- World Health Organization: novel coronavirus
- World Health Organization: Director-General's remarks at the media briefing on 2019-nCoV on 11
- Statistical physics of vaccination
- Duration of antibody responses after severe acute respiratory syndrome
- Stability and bifurcation analysis on a delayed epidemic model with information-dependent vaccination

Conflict of interest: the author states that there is no conflict of interest.
key: cord-002474-2l31d7ew authors: lv, yang; hu, guangyao; wang, chunyang; yuan, wenjie; wei, shanshan; gao, jiaoqi; wang, boyuan; song, fangchao title: actual measurement, hygrothermal response experiment and growth prediction analysis of microbial contamination of central air conditioning system in dalian, china date: 2017-04-03 journal: sci rep doi: 10.1038/srep44190 sha: doc_id: 2474 cord_uid: 2l31d7ew The microbial contamination of central air conditioning systems is one of the important factors affecting indoor air quality. Actual measurement and analysis of microbial contamination in the central air conditioning system of a venue in Dalian, China were carried out. The Illumina MiSeq method was used, and three fungal samples from two units were analysed by high-throughput sequencing. The results showed that the predominant fungi in air conditioning units A and B were Candida spp. and Cladosporium spp., and these two fungi were further used in the hygrothermal response experiment. Based on the Cladosporium data from the hygrothermal response experiment, this paper used the logistic equation and the Gompertz equation to fit growth prediction models for the Cladosporium genus under different temperature and relative humidity conditions, and a square root model was fitted based on the two environmental factors. In addition, the models were analysed to verify the accuracy and feasibility of the established model equations. With the large-scale use of central air conditioning systems and the improvement of people's living standards, more and more attention has been paid to the increasingly serious problem of indoor air pollution. Studies have shown that the air handling unit is an important source of microorganisms for indoor biological pollution, and some microorganisms tend to stay in the dust of air conditioning units under appropriate temperature and humidity conditions.
The microorganisms grow and then enter the indoor space through the air, resulting in the deterioration of indoor air quality [1-8]. The National Institute for Occupational Safety and Health (NIOSH) conducted a study of 529 buildings, and the results showed that, among the sources of indoor air pollution, central air-conditioning system pollution accounted for 53% [9]. A summary of measurements by Professor Fanger showed that air-conditioning systems accounted for 42% of indoor pollution sources [10]. Based on the supervision and inspection of air conditioning and ventilation systems in public places of China's 30 provinces, municipalities and autonomous regions by the China Academy of Building Research and the Chinese Center for Disease Control and Prevention, it was found that more than 90% of central air conditioning systems could not meet China's national sanitary standard [11]. Thus, an air conditioning system should eliminate the negative impact caused by the system itself and, on this basis, deliver the positive effects of ventilation. In recent years, H1N1, SARS and other epidemic viruses have spread [12, 13], and some research has shown that hygienic, well-designed air conditioning systems are important for reducing the damage [14, 15]. Therefore, microbial pollution in central air conditioning systems has become a critical topic in the field of indoor air pollution. Studies have shown that the filter, cooling coil, humidifier and cooling tower in a central air-conditioning system are prone to microbial breeding [16-21]. In this study, a venue in Dalian was selected as the research object. While the air conditioning system was shut down, the environmental parameters were measured, and microorganisms on the air duct, filter net, surface cooler and floor, in the condensate water, and in the air were collected.
Besides, according to the measured microbial density and the identified genome sequences of the collected microorganisms, a hygrothermal response experiment on the dominant fungi was conducted, and a fitting analysis based on prediction models was carried out, followed by a series of statistical analyses. The aim of the present study was to clarify the characteristics of the microorganisms in air conditioning systems, and the study should be helpful for policymakers and HVAC engineers in developing appropriate strategies and conducting bacterial and fungal characteristic assessments in HVAC systems. Preliminary survey. The object of study is a venue in Dalian, which covers a total area of 36400 m², with a building area of 17320 m². The aboveground part includes a swimming pool, ball training venues, a gymnasium, a lounge room and a clinic. The underground part consists of a table tennis hall, air conditioning equipment rooms and a reservoir area. The whole building is centrally controlled from the central air conditioning room, which houses two air handling units. The two measured units are all-air systems, each equipped only with a coarse-efficiency filter, together with a heater, a cooler, a fan, etc. Both units are primary return air systems, and the filters are removable. The running period of the air conditioning system is from May to October, and the daily operation period is 08:00-21:00. All components are cleaned every two years; when the measurement was carried out, units A and B had both been cleaned a year earlier. Both units were shut off during sample collection. Measurement method. The actual measurement is divided into two parts: environmental parameter measurement and air handling unit sampling. First, the temperature, humidity and CO2 concentration were automatically recorded by a temperature, humidity and CO2 monitor (MCH-383SD, Japan).
Second, the disinfected planktonic microorganism sampler (HKM series DW-20, China) was installed where the fresh air and return air mix. After installing the sampler, we loaded the culture medium and set the sampled air volume to 2000 L. After sample collection, the Petri dishes were sealed for preservation. Finally, according to the hygienic specification of central air conditioning ventilation systems in public buildings of China [22], we sampled the dust using sterile non-woven fabric (100 mm × 100 mm) on the components of units A and B, respectively, each sampling point covering a 15 cm × 15 cm area. The non-woven fabrics were put into sterile water and stirred to ensure that the organic substances on the fabric fully dissolved in the sterile water. Then, the samples of sterile water containing organic substances were diluted 10-fold and 100-fold, respectively. There are 10 sampling points in unit A and 5 points in unit B, and the measuring point positions of the two units are shown in Fig. 1. The microorganisms collected from the air were directly cultured, while 100 μL of each 10-fold and 100-fold diluted dust sample was inoculated into the two kinds of solid culture media: beef extract peptone medium was used for cultivating bacteria, and potato dextrose agar for cultivating fungi [22, 23]. Each dilution was done in 3 parallel samples to reduce the error, and the final results show the average values. Blank samples and test samples were set up for each of the cultures; if no colony is detected on the blank sample, the test results are valid. Both the field test and the laboratory measurements were performed in accordance with the requirements of the hygienic specification of central air conditioning ventilation systems in public buildings [22]. Genome sequencing. Only a small part of microorganisms are cultivable.
Therefore, the traditional cultivation method cannot recover all the species in ecological samples [24]. Fungal genome sequencing is an emerging method to identify the microbial genome, which can directly provide species information from environmental samples [25]. Fungal amplicon sequencing analysis was used in this study, because existing research has shown that fungal spores have stronger vitality than other airborne microorganisms, and fungi dominate the microorganisms in air conditioning systems; therefore, this method was mainly used to identify fungi [3, 17-28]. Environment parameters in air handling units. The temperature, humidity and CO2 concentration of units A and B are shown in Table 1. Unit A is located in the basement (B1), and unit B is located on the ground floor. Compared with unit B, the humidity of unit A is higher and its temperature is lower. Microbial colony analysis. The distribution densities of bacteria and fungi in unit A are shown in Fig. 2. The concentration of airborne fungi was 44 CFU/m³, and the concentration of airborne bacteria was 16 CFU/m³. Unit A showed obvious microbial contamination, although all components and the airborne microorganisms met the hygienic specification of central air conditioning ventilation systems in public buildings of China [22]. The microbial distribution in the filter net is centre < edge < bottom, with bacteria accounting for the larger proportion; the microbial distribution in the surface cooler is centre > against the wall > edge, with fungi accounting for the larger proportion. The fungal contamination in the air is more serious than the bacterial contamination. The distribution densities of bacteria and fungi in unit B are shown in Fig. 3. The concentration of airborne fungi was 22 CFU/m³, and the concentration of airborne bacteria was 80 CFU/m³. Some of the measuring points in unit B were seriously polluted.
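Airborne concentrations of this kind follow from plate counts divided by the sampled air volume (here 2000 L = 2 m³, as set on the sampler). A minimal sketch; the colony count of 88 is a hypothetical example, not a value reported in the study:

```python
def cfu_per_m3(colony_count, sampled_litres=2000.0, dilution_factor=1.0):
    # Airborne concentration from an impaction sample.
    # dilution_factor = 1 for directly cultured air samples.
    return colony_count * dilution_factor / (sampled_litres / 1000.0)

print(cfu_per_m3(88))  # 44.0 CFU/m^3 (cf. the airborne fungi level in unit A)
```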
The bacterial colonies in the corner and on the ground of the surface cooler exceeded the hygienic index (≤ 100 CFU/cm²) regulated in the hygienic specification of central air conditioning ventilation systems in public buildings of China [22]. Limited by the unit placement, there were fewer measuring points in unit B, and we chose the same measuring points in both units for comparison (centre of surface cooler, surface cooler against the wall, corner of surface cooler, and ground of surface cooler). The comparison between units A and B indicates that the bacterial density in unit A was lower than that at the same sampling point in unit B, but the fungal density in unit A was higher than that at the same sampling point in unit B. If cleaning and disinfection are insufficient before the air conditioning system starts running, fungi may enter the indoor environment, resulting in the pollution of the indoor air. Compared with the cooling coil, the fungal contamination is worse in the floor dust and in the air suspension. During the actual measurement, it was found that the unit interior is extremely narrow and poorly illuminated in the closed state. According to the description given by the technicians, the underground pipes are easily damaged by trampling, which is why disinfection and cleaning are rarely carried out inside the unit. Fungal genome sequencing analysis. In this study, we analysed the samples from sampling points A1, B1 and B2 by amplicon sequencing information analysis, respectively named A1A, B1A and B2A. All samples collected in the air conditioner were transferred to Eppendorf tubes and processed with the extraction step. Samples were resuspended in TENS buffer with SDS and proteinase K, as described by Vinod [29]. After incubation at 55 °C, phenol/chloroform/isoamyl alcohol was added to remove proteins, and the nucleic acid was precipitated with isopropanol and sodium acetate (0.3 M). The total DNA was dissolved in 1× TE after washing with 75% ethanol.
Then the quality and quantity tests were conducted by agarose gel electrophoresis. For the PCR product, the jagged ends of the DNA fragments were converted into blunt ends using T4 DNA polymerase, Klenow fragment and T4 polynucleotide kinase; an 'A' base was then added to each 3' end to make it easier to add adapters, after which fragments that were too short were removed with AMPure beads. For genomic DNA, we used fusion primers with dual indices and adapters for PCR, and fragments that were too short were likewise removed with AMPure beads. In both cases, only a qualified library can be used for sequencing. The quality and quantity of the libraries were assessed using the 2130 Bioanalyzer (Agilent Technologies) and the StepOnePlus real-time PCR system (Applied Biosystems). The raw data generated by the MiSeq and HiSeq 2000 sequencers were processed to eliminate adapter contamination and low-quality reads and obtain clean reads. The qualified clean data were used for the further bioinformatics analysis. Firstly, paired-end reads with overlap were merged into tags by the software FLASH (v1.2.11) [30], and the tags were then clustered into OTUs at 97% sequence similarity using USEARCH (v7.0.1090) [31]. Secondly, taxonomic ranks were assigned to the OTU representative sequences using the Ribosomal Database Project (RDP) naive Bayesian classifier v2.2 [32]. Finally, alpha diversity, beta diversity and differential species screening were analysed based on the OTUs and taxonomic ranks with mothur (v1.31.2) [33]. In order to fully understand the community structure of the fungal samples and analyse the fungal microbial diversity, while excluding errors introduced by human operation, the genome sequencing method of molecular biology was employed in this study to obtain microbiological information. Illumina developed the MiSeq method, which offers higher throughput, simpler operation and lower cost for genome sequencing; moreover, its sequencing-by-synthesis approach has higher reliability and is well suited to laboratory community structure analysis.
High-throughput sequencing was found to be useful for characterising the compositions and diversities of moulds. The gene sequences of the test samples from genome sequencing were processed (stitched and matched), and the samples contained a total of 59309 high-quality fungal sequences, with an average length of 219 bp. The optimised sequences and the average lengths of the samples are shown in Table 2. OTU and abundance analysis. The stitched and optimised tags were clustered into OTUs (operational taxonomic units) for species classification at 97% similarity, and the abundance information of each sample in each OTU was computed [31, 34, 35]. The rank-OTU curve is a representation of species diversity in a sample, which can explain two aspects of sample diversity, namely the richness and the evenness of the species in the sample. The species richness of a sample is represented by the horizontal extent of the curve: the wider it is, the more abundant the sample's species. The evenness of the species in a sample is reflected by the shape of the curve along the vertical axis: a flatter curve means that the species composition of the sample has higher evenness. From Fig. 4, the species composition of B2A is the most abundant, and its evenness is the highest. Sample diversity analysis of observed species. Alpha diversity is the analysis of species diversity within a single sample [33], including the observed species index, the Shannon index, etc. The greater these two indices are, the more abundant the species in the sample. The observed species index reflects the richness of the community in the sample, i.e. the number of species in the community, without taking into account the abundance of each species. The Shannon index reflects the diversity of the community, i.e. both the species richness and the evenness of the species in the sample community.
In the case of the same species richness, the greater the evenness of the species in the community, the greater the diversity of the community. Observed species dilution curve. The processed sequences are randomly subsampled, and the number of sequences (abscissa) is plotted against the corresponding number of detectable species (ordinate), forming a dilution (rarefaction) curve, shown in Fig. 5(a). With the increase of the sample quantity, the number of species increases and gradually stabilises. When the curve reaches a plateau, it can be considered that the sequencing depth has basically covered all the species in the sample. At the same time, the observed species index can reflect the richness of the community in the sample, that is, the number of species in the community. It can be seen that the distribution of fungal species richness is B2A > B1A > A1A. Shannon dilution curve. The Shannon index is affected not only by the species richness in the sample community, but also by the evenness of the species. In the case of the same species richness, the greater the evenness of the species in the community, the more abundant the diversity of the community. It can be seen in Fig. 5(b) that the fungal species diversity of unit B is significantly more complex than that of unit A, and that the similarity of species diversity between the two sampling points of unit B is very high. Composition of microbial samples. Figure 6 illustrates the species composition proportions of the three sampling points; the proportions were redrawn after removing the strains that were not detected in the samples. The results are shown in Table 3. The species with the largest proportion are the dominant fungi. According to the fungal genome sequencing analysis results, the fungal components of different units at the same sampling position were different, while those of the same unit at different sampling points were roughly similar.
these differences were caused by the different environmental conditions. at the center of the air cooling coil in unit a, candida accounted for 80%; at the center and against the wall of the air cooling coil in unit b, cladosporium accounted for 50%, accompanied by alternaria, emericella, and other fungi. cladosporium is usually abundant in outdoor air, but it also grows on indoor surfaces when the humidity is high. existing research shows that cladosporium spores are an extremely important airborne allergen, which can trigger asthma attacks or similar respiratory diseases in patients with allergic reactions 36 . some species of candida are opportunistic human pathogens. growth prediction analysis of models. traditional microbial detection generally suffers from hysteresis and cannot serve a predictive role, whereas mathematical models can predict microbial growth in a timely and effective manner. it is therefore important to study growth prediction models for the fungi in air conditioning systems. based on the environmental conditions mentioned above, we established a growth kinetics prediction model of cladosporium spp. to predict rapid fungal growth under the experimental conditions, which can provide a theoretical basis for an airborne microbial contamination prediction system and help evaluate health risks inside buildings. the models were fitted with origin software (version 8) and matlab r2014a, the fits of the logistic and gompertz models were compared under different temperature and humidity conditions, and the corresponding model parameters were obtained. in addition, a square root model was fitted based on the two environmental factors. experimental study on the hygrothermal response of fungus.
laboratory studies have revealed that fungal growth and reproduction are affected by water, temperature, nutrients, micro-elements, ph, light, carbon dioxide, and oxygen tension 37 . the most relevant determinants of fungal proliferation in the building context are water/moisture and temperature, and to a certain extent these factors affect other environmental factors such as substrate ph, osmolarity, nutrients, and material properties 37, 38 . in order to lay the foundation for the model fitting and to study the growth characteristics of fungi at different temperatures and relative humidities, we set up an experimental study on the hygrothermal response of fungi. based on the fungal genome sequencing results and the literature 39-41 , we selected cladosporium spp. and penicillium spp. as research objects, both of which are common in air conditioning systems. since this paper mainly studies microbial contamination in air handling units, the air temperatures of the individual parts of the air handling unit have to be considered. a temperature gradient of 20 °c − 25 °c − 30 °c and a relative humidity gradient of 45% − 60% − 75% − 80% were selected as the experimental hygrothermal conditions. the results of the hygrothermal experiments are shown in figs 7, 8 and 9. the growth rate of cladosporium spp. is faster than that of penicillium spp. under all experimental conditions; this is an intrinsic characteristic of the strains that hygrothermal control cannot change. the data also indicate that low rh environments can reduce or even inhibit fungal growth, which agrees with the findings of w. tang and pasanen 42,43 . growth prediction analysis based on logistic model. the logistic model is a classical model that has been widely used in the field of biological ecology 44 . based on our experiments, formula (1) was obtained after an appropriate modification of the logistic equation.
here n is the colony growth diameter, cm; t is the microbial culture time, h; and a 1 , a 2 , x 0 , p are the model parameters. as can be seen from table 4 , the fitted logistic curves agree well with the experimental results. at the 20 °c and 30 °c temperature conditions the fit is excellent, with r 2 greater than 0.99; at 25 °c the fit is not as good as at the other temperatures. predicting the growth of microorganisms. the pmp (pathogen modeling program), developed by the us department of agriculture to model pathogenic bacteria, is based on the gompertz equation, and the gompertz model has been widely used in biology. the gompertz model is expressed as equation (2), where n is the colony growth diameter, cm; t is the microbial culture time, h; and a, A, k, x c are the model parameters. table 5 shows that the fitted gompertz curves also match the measurements well. at a given temperature, the gompertz fit improves with increasing relative humidity; the model fits best at 25 °c, better than at the 20 °c and 30 °c temperature conditions. overall, the logistic model fits the fungal growth better than the gompertz model. the two models were evaluated with the bias factor b f and the accuracy factor a f commonly used for validating mathematical models. staphylococcus xylosus was studied by mcmeekin 45 , who found that, with t min fixed, the relationship between growth rate and temperature for each ϕ can be described by the square root model.
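equations (1) and (2) are not reproduced in the extraction, so the sketch below assumes two common functional forms -- the four-parameter logistic used by origin (parameters a1, a2, x0, p) and a modified gompertz equation from predictive microbiology (parameters a, A, k, xc) -- and fits both to synthetic colony-diameter data; the bias factor b f and accuracy factor a f are computed in their standard forms, 10^mean(log10(pred/obs)) and 10^mean(|log10(pred/obs)|):

```python
import numpy as np
from scipy.optimize import curve_fit


def logistic(t, a1, a2, x0, p):
    # Assumed four-parameter logistic: N moves from a1 toward a2,
    # with inflection near x0 and steepness p.
    return a2 + (a1 - a2) / (1.0 + (t / x0) ** p)


def gompertz(t, a, A, k, xc):
    # Assumed modified Gompertz: N rises from a toward a + A,
    # fastest around t = xc.
    return a + A * np.exp(-np.exp(-k * (t - xc)))


def bias_accuracy(pred, obs):
    # Bias factor B_f (~1 means unbiased) and accuracy factor A_f
    # (~1 means accurate), as used to validate predictive models.
    r = np.log10(pred / obs)
    return 10 ** r.mean(), 10 ** np.abs(r).mean()


def r_squared(obs, pred):
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - np.mean(obs)) ** 2)


# Synthetic colony-diameter data (cm vs. h), for illustration only.
rng = np.random.default_rng(0)
t = np.linspace(1, 200, 40)
y = gompertz(t, 0.1, 6.0, 0.05, 80.0) + 0.02 * rng.normal(size=t.size)

pl, _ = curve_fit(logistic, t, y, p0=[0.1, 6.0, 80.0, 3.0],
                  bounds=(1e-6, np.inf))
pg, _ = curve_fit(gompertz, t, y, p0=[0.1, 6.0, 0.05, 80.0])

for name, f, p in [('logistic', logistic, pl), ('gompertz', gompertz, pg)]:
    pred = f(t, *p)
    bf, af = bias_accuracy(pred, y)
    print(name, round(r_squared(y, pred), 4), round(bf, 3), round(af, 3))
```

the paper reports the same quantities: r 2 per temperature/humidity condition for each model, and b f , a f for model validation.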
the combined effect of these two variables can be expressed by the modified equation (3). in the formula, u is the growth rate of the fungus, cm/h; b 2 is a coefficient; t is the culture temperature, °c; t min , the most important parameter of the square root equation, is the notional minimum temperature at which the growth rate is zero, °c; and ϕ is the relative humidity during cultivation, %. using the primary logistic model, predicted values of the growth rate (instantaneous velocity) of the cladosporium colonies were obtained, as table 6 shows. by model fitting, the parameters of the square root model were obtained, as table 7 shows, and the fit of the predicted cladosporium growth is shown in fig. 10 . the b f values of the model equation were between 0.90 and 1.05, indicating that the model is suitable for predicting cladosporium colony growth within the range of the experimental environment. at the same time, the a f value of the model was 1.05169, close to 1, which shows that the model has high accuracy. table 7 . model fitting and model parameters of the double-factor square root model. this study selected two central air conditioning systems at a venue in dalian as its objects. measurements and a series of studies on microbial pollution characteristics were carried out, with the following results: (1) the bacterial colony forming units at the two measuring points in unit b were 192 cfu/cm 2 and 828 cfu/cm 2 , respectively, exceeding the hygienic specification of central air conditioning ventilation systems in public buildings of china (≤ 100 cfu/cm 2 ); the remaining test points met the relevant chinese standards. bacteria were more widely distributed and more concentrated than fungi, and areas with dense dust-associated microorganisms showed more serious air pollution. (2) alternaria spp., candida spp., cercospora spp. and cladosporium spp.
existed in both units. candida spp. accounted for 80% in unit a, and cladosporium spp. accounted for 50% in unit b; the fungal composition in unit b was more complex. both dominant fungi are deleterious to health, so timely maintenance and cleaning are required. it is suggested that operating space be reserved in the air conditioning room, so as to avoid incomplete cleaning and disinfection. (3) within the experimental temperature and relative humidity ranges, colony growth of a given strain increased with increasing relative humidity or temperature. for the fungal growth prediction models, the study found that the logistic model fits best overall, with r 2 values greater than 0.97; the logistic model described cladosporium spp. growth better than the gompertz model. considering the combined influence of temperature and relative humidity, the square root model predicts the growth of cladosporium spp. well. this provides a theoretical basis for assessing fungal growth in air conditioning systems under hygrothermal environmental conditions. references:
why, when and how do hvac-systems pollute the indoor environment and what to do about it? the european airless project
the control technology of microbial contamination in air conditioning system
overview of biological pollution prevention methods in air conditioning system. heating ventilating and air conditioning
relationships between air conditioning, airborne microorganisms and health.
bulletin de l'academie nationale de medecine
a review of the biological pollution characteristics, standards and prevention and control technologies of central air-conditioning system
risks of unsatisfactory airborne bacteria level in air-conditioned offices of subtropical climates
development of a method for bacteria and virus recovery from heating, ventilation, and air conditioning (hvac) filters
microbial air quality at szczawnica sanatorium
the national institute for occupational safety and health indoor environmental evaluation experience
air pollution sources in offices and assembly halls quantified by the olf unit
control and improvement of building indoor biological pollution
a major outbreak of severe acute respiratory syndrome in hong kong
epidemiological investigation of an outbreak of pandemic influenza a (h1n1) 2009 in a boarding school: serological analysis of 1570 cases
a study on the effective removal method of microbial contaminants in building according to bioviolence agents
performance analysis of a direct expansion air dehumidification system combined with membrane-based total heat recovery
hvac systems as emission sources affecting indoor air quality: a critical review
testing and analysis for microbes and particles in central air conditioning systems of public buildings. heating ventilating and air conditioning
endotoxins and bacteria in the humidifier water of air conditioning systems for office rooms
investigation and review of microbial pollution in air conditioning systems of public buildings. heating ventilating and air conditioning
study on the microbial secondary pollution and control in air conditioning system
research on the dust deposition in ventilation and air conditioning pipe.
heating ventilating and air conditioning
hygienic specification of central air conditioning ventilation system in public buildings
indoor air part 17: detection and enumeration of moulds - culture-based method
macro genomics approach in environmental microbial ecology and genetic search application
study of the microbiome in waiting rooms of a japanese hospital
fungal colonization of automobile air conditioning systems
changes in airborne fungi from the outdoors to indoor air; large hvac systems in nonproblem buildings in two different climates
research on the microbial diversity analysis in the sediment and macro genomic library at the south pole, xiamen university
isolation of vibrio harveyi bacteriophage with a potential for biocontrol of luminous vibriosis in hatchery environments
fast length adjustment of short reads to improve genome assemblies
highly accurate otu sequences from microbial amplicon reads
ribosomal database project: data and tools for high throughput rrna analysis
introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities
the silva ribosomal rna gene database project: improved data processing and web-based tools
naive bayesian classifier for rapid assignment of rrna sequences into the new bacterial taxonomy
study on the relationship between dominant fungi in the air and the allergic asthma in children
field guide for the determination of biological contaminants in environmental samples
separate effects of moisture content and water activity on the hyphal extension of penicillium rubens on porous media
investigation and review of microbial pollution in air conditioning systems of public buildings.
heating ventilating and air conditioning
microorganisms and particles in ahu systems: measurement and analysis
the indoor fungus cladosporium halotolerans survives humidity dynamics markedly better than aspergillus niger and penicillium rubens despite less growth at lowered steady-state water activity
effects of temperature, humidity and air flow on fungal growth rate on loaded ventilation filters
fungal growth and survival in building materials under fluctuating moisture and temperature conditions
mathematical modeling of growth of salmonella in raw ground beef under isothermal conditions from 10 to 45 °c
model for combined effect of temperature and salt concentration/water activity on the growth rate of staphylococcus xylosus
the study is supported by the national nature science foundation of china (51308088), beijing key lab of heating, gas supply, ventilating and air conditioning engineering (nr2013k05), the fundamental research funds for the central universities (dut14qy24) and the urban and rural housing construction science and technology plan project (2014-k6-012).
key: cord-025843-5gpasqtr authors: wild, karoline; breitenbücher, uwe; képes, kálmán; leymann, frank; weder, benjamin title: decentralized cross-organizational application deployment automation: an approach for generating deployment choreographies based on declarative deployment models date: 2020-05-09 journal: advanced information systems engineering doi: 10.1007/978-3-030-49435-3_2 sha: doc_id: 25843 cord_uid: 5gpasqtr
various technologies have been developed to automate the deployment of applications. although most of them are not limited to a specific infrastructure and are able to manage multi-cloud applications, they all require a central orchestrator that processes the deployment model and executes all necessary tasks to deploy and orchestrate the application components on the respective infrastructure.
however, there are applications in which several organizations, such as different departments or even different companies, participate. due to security concerns, organizations typically do not expose their internal apis to the outside or leave control over application deployments to others. as a result, centralized deployment technologies are not suitable for deploying cross-organizational applications. in this paper, we present a concept for decentralized cross-organizational application deployment automation. we introduce a global declarative deployment model that describes a composite cross-organizational application, which is split into local parts for each participant. based on the split declarative deployment models, workflows are generated which form the deployment choreography and coordinate the local deployments and the cross-organizational data exchange. to validate the practical feasibility, we prototypically implemented a standards-based end-to-end toolchain for the proposed method using tosca and bpel. in recent years various technologies for the automated deployment, configuration, and management of complex applications have been developed. these deployment automation technologies include chef, terraform, and ansible, to name some of the most popular [27] . additionally, standards such as the topology and orchestration specification for cloud applications (tosca) [20] have been developed to ensure portability and interoperability between different environments, e.g., different cloud providers or hypervisors. these deployment automation technologies and standards support a declarative deployment modeling approach [9] . the deployment is described as a declarative deployment model that specifies the desired state of the application by its components and their relations. based on this structural description, a respective deployment engine derives the necessary actions to be performed for the deployment.
although most of these technologies and standards are not limited to a specific infrastructure and are able to manage multi-cloud applications, they all use a central orchestrator for the deployment execution. this central orchestrator processes the declarative deployment model and either forwards the required actions to agents in order to deploy and orchestrate the components, e.g., in the case of chef to the chef clients running on the managed nodes, or executes them directly, e.g., via ssh on a virtual machine (vm), as done by terraform [25] . however, today's applications often involve multiple participants, which can be different departments in a company or even different companies. especially in industry 4.0 the collaboration in the value chain network is of great importance, e.g., for remote maintenance or supply chain support [7] . all these applications have one thing in common: they are cross-organizational applications that compose distributed components, whereby different participants are responsible for different parts of the application. the deployment and management of such applications cannot be automated by common multi-cloud deployment automation technologies [22] , since their central orchestrators require access to the internal infrastructure apis of the different participants, e.g., the openstack api of the private cloud, or their credentials, e.g., to log in to aws. there are several reasons for the involved participants not to disclose where and how exactly the application components are hosted internally: new security issues and potential attacks arise, legal and compliance rules must be followed, and each participant wants to keep control over the deployment process [17] . this means that common centralized application deployment automation technologies are not suitable to meet the requirements of new emerging application scenarios that increasingly rely on cross-organizational collaborations.
in this paper, we address the following research question: "how can the deployment of composite applications be executed across organizational boundaries involving multiple participants that do not open their infrastructure apis to the outside in a fully automated decentralized manner?" we present a concept for decentralized cross-organizational application deployment automation that (i) is capable of globally coordinating the entire composite application deployment in a decentralized way while (ii) enabling the involved participants to control their individual parts locally. therefore, we introduce a global multi-participant deployment model describing the composite cross-organizational application, which is split into local parts for each participant. based on the local deployment models a deployment choreography is generated, which is executed in a decentralized manner. based on the tosca and bpel [19] standards, the existing opentosca ecosystem [6] is extended for the proposed method and validated prototypically. for application deployment automation two general approaches can be distinguished: declarative and imperative deployment modeling [9] . for our decentralized cross-organizational application deployment automation concept both approaches are combined. most deployment automation technologies use deployment models that can be processed by the respective deployment engine. deployment models that specify the actions and their order to be executed, e.g., as workflows, are called imperative deployment models; deployment models that specify the desired state of an application are called declarative deployment models [9] . we explain declarative deployment models in a technology-independent way based on the essential deployment meta model (edmm), which has been derived from 13 investigated deployment technologies in previous work [27] . the meta model for declarative deployment models presented in sect.
3 is based on the edmm and forms the basis for the declarative part of the presented concept. in edmm an application is defined by its components and their relations. for the semantics of these components and relations, reusable component and relation types are specified. for example, it can be defined that a web application shall be hosted on an application server and shall be connected to a queue to publish data that are processed by other components. for specifying the configuration of the components, properties are defined, e.g., to provide the credentials for the public cloud or to set the name of the database. for instantiating, managing, and terminating components and relations, executable artifacts such as shell scripts or services are encapsulated as operations that can be executed to reach the desired state defined by the deployment model. the execution order of the operations is derived from the deployment model by the respective deployment engine [5] . in contrast, imperative deployment models explicitly specify the actions to be executed and their order to instantiate and manage an application [9] . actions can be, e.g., logging in to a public cloud or installing the war of a web application on an application server. especially for complex applications or custom management behavior, imperative deployment models are required, since even though declarative models are intuitive and easy to understand, they do not allow customizing the deployment and management. imperative deployment technologies include, e.g., bpmn4tosca [16] and general-purpose technologies such as bpel, bpmn [21] , or scripting languages. in general, declarative deployment models are more intuitive but less customizable in execution, while imperative deployment models are more complex to define but give full control over the deployment steps.
therefore, there are hybrid approaches that transform declarative models into imperative models to combine the benefits of both approaches [5] . in this paper, we follow this hybrid approach by transforming declarative models into imperative choreography models. this means the user only has to specify the declarative model, and thus we explain the declarative modeling approach in sect. 4 using a motivating scenario. first, in the next section the meta model for declarative deployment models is introduced. our approach presented in sect. 5 is based on declarative deployment models that are transformed into imperative choreographies. based on edmm and inspired by the declarative application management modeling and notation (dmmn) [3] , the gentl meta model [1] , and tosca, a definition of declarative deployment models d ∈ d is introduced: definition 1 (deployment model). a declarative deployment model d ∈ d is a directed, weighted, and possibly disconnected graph that describes the structure of an application with the required deployment operations. the elements of the tuple d are defined as follows. (figure caption: declarative deployment model specifying all details of the desired application. the notation is based on vino4tosca, with components as nodes, relations as edges, and the types in brackets [4] ; in addition, sample operations are shown as dots.) following the design cycle by wieringa [26] , we first examined the current situation in various research projects with industrial partners, namely in the projects ic4f 1 , sepia.pro 2 , and smartorchestra 3 . with regard to horizontal integration through the value chain network in the context of industry 4.0, we focused on the requirements and challenges of collaboration between different companies [7] .
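the edmm concepts summarized above (typed components and relations with properties and operations) can be sketched as plain data structures; all names and types below are illustrative, not taken from the edmm specification:

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    type: str  # reusable component type, e.g. 'WebApplication'
    properties: dict = field(default_factory=dict)
    operations: dict = field(default_factory=dict)  # name -> artifact


@dataclass
class Relation:
    source: str
    target: str
    type: str  # e.g. 'ConnectsTo', 'HostedOn'


web = Component('order_app', 'WebApplication', {'port': 8080},
                {'install': 'scripts/install_war.sh'})
server = Component('tomcat', 'ApplicationServer')
queue = Component('order_queue', 'Queue', {'q_name': 'orders'})

model = {
    'components': [web, server, queue],
    'relations': [Relation('order_app', 'tomcat', 'HostedOn'),
                  Relation('order_app', 'order_queue', 'ConnectsTo')],
}

# A deployment engine derives the execution order from the structure,
# e.g. a host must be running before the component stacked on it.
hosted_on = {r.source: r.target for r in model['relations']
             if r.type == 'HostedOn'}
print(hosted_on)
```

this mirrors the declarative idea: only the desired structure is modeled, and the order of operations is derived from it by the engine.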
based on our previous research focus, the deployment and management of applications, the following research problems have emerged: (a) how can the deployment of composite applications across organizational boundaries be automated in a decentralized manner? (b) what is the minimal set of data to be shared between the involved participants to enable the automated decentralized deployment? fig. 1 depicts the motivating scenario, in which the application components are connected to the queue and database, respectively. in addition, three operations are shown as examples: a connectsto to establish a connection to the queue, a connectsto to connect to the database, and an install operation to install the jar artifact on the order vm. the other properties and operations are abstracted. assuming that a single organization is responsible for deploying the entire application and has full control over the openstacks and aws, the common deployment automation technologies examined by wurster et al. [27] fit perfectly. however, in the depicted scenario two participants, p1 and p2, who may be different departments or companies, intend to realize a cross-organizational application, so that common deployment automation technologies are no longer applicable. while all participants must agree on the application-specific components, the underlying infrastructure is the responsibility of each participant. for security reasons, participants typically do not provide access to internal apis, share the credentials for aws, or leave the control over deployment to others. to address the research problems, we propose a decentralized concept for cross-organizational application deployment automation ensuring that (i) only as little data as necessary is exchanged between participants and (ii) each participant controls only his or her local deployment while the overall deployment is coordinated. the proposed solution is described in detail in the following section, and in sect. 6 the implementation and validation are presented. the motivating scenario in fig.
1 serves as use case for the validation. for decentralized cross-organizational application deployment automation with multiple participants, it has to be considered that (i) the participants want to exchange as little data as necessary and (ii) each participant controls only his or her local deployment while the global coordination of the deployment of the entire application is ensured. taking these requirements into account, we have developed the deployment concept depicted in fig. 2 . in the first step, the application-specific components representing the use case to be realized are modeled. they typically include the business components such as the order app, storage components such as the database component, and communication components such as the order queue in fig. 1 . in the second step, the global multi-participant deployment model (gdm) is generated, a declarative deployment model containing all publicly visible information that is shared between the participants. this publicly visible information also contains data that must be provided by the respective infrastructure. for example, to execute the operation that establishes a connection between order processor and database in fig. 1 , the ip of the database vm is required as input. subgraphs, so-called local parts of the gdm, are then assigned to the participants responsible for the deployment of the respective components. the gdm is then processed by each participant. first, in step three, for each application-specific component a hosting environment is selected, and the adapted model is stored as local multi-participant deployment model (ldm). in the motivating scenario in fig. 1 participant p1 selected aws for the order queue and the openstack for the order app. however, this individual placement decision is not shared.
for the deployment execution we use a hybrid approach: based on the ldm, a local deployment workflow model is generated in step four that orchestrates the local deployment and cross-organizational information exchange activities. all local workflows together implicitly form the deployment choreography, which enables the global coordination of the deployment across organizational boundaries. each step is described in detail in the following. in the initial step, the application-specific components representing the use case to be realized have to be modeled. they typically include business components, storage components, and communication components. in the motivating scenario in fig. 1 the set of application-specific components contains the order app, the order queue, the order processor, and the database. in addition, the lifecycle operations, e.g., to install, start, stop, or terminate the components and relations, have to be defined for each of these components and their relations, since all input parameters of these operations must be provided as globally visible information in the gdm. application-specific components are defined as follows: c s ⊆ c d in d, where all r s = (c s , c t ) ∈ r d with {c s , c t } ∈ c s are of type d (r s ) = connectsto and for each c i ∈ c s : cap(type d (c i )) = ∅. to ensure that the application-specific components can be deployed across organizational boundaries, the gdm is generated in the second step; it contains the minimal set of information that has to be globally visible, i.e., that has to be shared. thus, the gdm is defined as follows: the elements of the tuple g are defined as follows: -d ∈ d: declarative deployment model that is annotated with participants. -p g ⊆ ℘(σ + ) × ℘(σ + ): set of participants with p i = (id, endpoint) ∈ p g , whereby σ + is the set of characters in the ascii table. -participant g : the mapping assigning a component c i ∈ c d to a participant p i ∈ p g , participant g : c d → p g .
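the gdm tuple g = (d, p g , participant g ) can be illustrated with a minimal sketch of the participant assignment and the resulting local parts; the participant ids, endpoints, and component names are taken from the motivating scenario or invented:

```python
# participants p_i = (id, endpoint), encoded here as a dict id -> endpoint
# (the endpoint URLs are invented for illustration).
participants = {
    'P1': 'https://p1.example.org/deployment',
    'P2': 'https://p2.example.org/deployment',
}

# participant_g: application-specific component -> responsible participant
participant_of = {
    'order_app': 'P1',
    'order_queue': 'P1',
    'order_processor': 'P2',
    'database': 'P2',
}


def local_part(participant):
    """Subgraph ('local part') of components assigned to one participant."""
    return sorted(c for c, p in participant_of.items()
                  if p == participant)


print(local_part('P1'))
print(local_part('P2'))
```

each participant later processes only its own local part, while the gdm as a whole is the shared, globally visible artifact.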
the example in fig. 3 depicts a simplified gdm. the application-specific components, depicted in dark gray, specify requirements, e.g., the order queue requires a message queue middleware. these requirements have to be satisfied by the respective hosting environment. furthermore, for these components as well as for their connectsto-relations, operations with input parameters are defined. to establish a connection to the order queue, the url and q-name of the queue are required. either the target application-specific component provides a matching property, such as the q-name property exposed by the order queue component, or the environment has to provide it, as for the input parameter url. for this purpose, placeholder host components are generated in this step that contain all capabilities and properties that have to be exposed by the hosting environment: for each r j ∈ r d with π 2 (r j ) = c j and type d (r j ) = connectsto, and for each operation op r ∈ operations s (r j ), all data elements v r ∈ π 1 (op r ) \ properties s (c j ) are added to properties d (c h ). in the example in fig. 3 the host order queue component provides the capability messagequeue and exposes the property url, which is required as input parameter for the connectsto operations. before the deployment model is processed by each participant, subgraphs of the gdm are assigned to the participants. such a subgraph is called local part and indicates who is responsible for this part of the application. this is done by annotating the gdm as shown in fig. 3 on the right. since participants typically do not want to share detailed information about their hosting environment, the gdm is given to each participant for further processing. each participant p i has to select a hosting environment for all c s ∈ c s with participant g (c s ) = p i . the substitution shown in fig. 3 is valid because the property url is covered and the sqs exposes the required capability messagequeue.
the substitution is automated by our prototype described in sect. 6. for the substitution of placeholder components and their matching to existing infrastructure and middleware, several approaches exist [12, 23, 24] . soldani et al. [24] introduced the toscamart method to reuse deployment models for deriving models for new applications, hirmer et al. [12] introduced a component-wise completion, and in previous work [23] we showed how to redistribute a deployment model to different cloud offerings. these approaches use a requirement-capability matching mechanism to select appropriate components; we extended this mechanism to match the properties as well. the resulting local multi-participant deployment model (ldm) is a partially substituted gdm with detailed middleware and infrastructure components for the application-specific components managed by the respective participant. up to this point we follow a purely declarative deployment modeling approach. the core step of our approach is the generation of the local deployment workflow models that form the deployment choreography. they are derived from the ldms by each participant and (i) orchestrate all local deployment activities and (ii) coordinate the entire deployment and the data exchange to establish cross-participant relations. while centralized deployment workflows can already be generated [5] , the global coordination and data exchange are not covered yet. cross-participant relations are of type connectsto and connect components managed by different participants. to establish cross-participant relations, the participants have to exchange the input parameters of the respective connectsto-operations. in the example in fig. 3 the relation con2 establishes a connection from the order processor managed by p2 to the order queue managed by p1. the connectsto-operation requires the url and the q-name as input; both parameters have to be provided by p1.
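the extended requirement-capability-and-property matching can be sketched as a simple set-inclusion check; the component and capability names below are illustrative, loosely following the fig. 3 example:

```python
def matches(placeholder, candidate):
    """A candidate environment substitutes a generated placeholder host
    if it exposes every required capability AND every property the
    placeholder must provide (the extension over plain
    requirement-capability matching)."""
    caps_ok = set(placeholder['required_capabilities']) <= set(candidate['capabilities'])
    props_ok = set(placeholder['required_properties']) <= set(candidate['properties'])
    return caps_ok and props_ok


host_order_queue = {                       # generated placeholder
    'required_capabilities': {'MessageQueue'},
    'required_properties': {'url'},        # input of the connectsTo operation
}
aws_sqs = {'capabilities': {'MessageQueue'},
           'properties': {'url', 'region'}}
mosquitto = {'capabilities': {'MessageBroker'},
             'properties': {'url'}}

print(matches(host_order_queue, aws_sqs))      # valid substitution
print(matches(host_order_queue, mosquitto))    # capability not satisfied
```

in the paper's example this is exactly why the sqs substitution is valid: the capability messagequeue and the property url are both covered.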
Since this information first becomes available at deployment time, the data exchange has to be managed during deployment: for each cross-participant relation, a sending and a receiving activity are required to exchange the information after the target component is deployed and before the connection is established. In addition, the deployment of the entire application must be ensured: independent of which participant initiates the deployment, all other participants have to deploy their parts as well. This is covered by three cases that have to be distinguished for the local deployment workflow generation, as conceptually shown in Fig. 4. The upper part depicts abstracted LDMs and the lower part the generated activities from the different participants' perspectives: on the left (a) the activities from a cross-participant relation target's perspective, in the middle (b) from a cross-participant relation source's perspective, and on the right (c) the activities generated to ensure the initiation of the entire deployment. First, a definition of local deployment workflow models based on the production process definition [14, 18] is provided: for each participant p_i ∈ P, a local deployment workflow model w_i = (A_wi, E_wi, V_wi, i_wi, o_wi, type_wi) based on the LDM is defined. The elements of the tuple w_i are defined as follows: - A_wi: the set of activities. - E_wi: the set of control connectors between activities, whereby each e_y = (a_s, a_t) ∈ E_wi represents that a_s has to be finished before a_t can start. - V_wi: the set of data elements, whereby Σ⁺ is the set of strings over the ASCII characters and v_y = (datatype, value) ∈ V_wi. - i_wi: the mapping that assigns to each activity a_y ∈ A_wi its input parameters; it is called the input container, i_wi : A_wi → ℘(V_wi). - o_wi: the mapping that assigns to each activity a_y ∈ A_wi its output parameters; it is called the output container, o_wi : A_wi → ℘(V_wi).
- type_wi: the mapping that assigns each a_y ∈ A_wi to an activity type, type_wi : A_wi → {invoke, send, receive}. Based on this definition, local deployment workflow models can be generated following specific rules. In Fig. 4 the resulting activities are depicted: (a) For each component c_t ∈ C_d that is the target of a cross-participant relation r_c = (c_s, c_t) with participant_g(c_t) = p_i and participant_g(c_s) = p_j, an activity a_t ∈ A_wi with type_wi(a_t) = invoke is added that invokes the start operation of c_t. After a component is started, a connection to it can be established [5]. Thus, an activity a_c with type_wi(a_c) = send is added to w_i that contains all input parameters of the connectsTo-operation of r_c provided by p_i in o_wi(a_c). (b) For the component c_s ∈ C_d, the source of the cross-participant relation r_c, an activity a_c with type_wj(a_c) = receive is added to w_j of p_j. The control connector e(a_init, a_c) added to w_j ensures that the activity is activated after the initiate activity of p_j. After the input values are received and the start operation of c_s is successfully executed, the actual connectsTo-operation can be executed. (c) Each workflow w_i starts with the initiate activity a_init ∈ A_wi with type_wi(a_init) = receive. To ensure that after a_init is called the entire application deployment is initiated, a notification is sent to all other participants: for each p_j ∈ P \ {p_i}, an activity a_n with type_wi(a_n) = send and a control connector e(a_init, a_n) are added to w_i. Since each participant notifies all others, for n participants each participant has to discard n-1 messages; since the payloads are at most a set of key-value pairs, this is not critical. Each participant generates a local deployment workflow model; together these implicitly form the deployment choreography. As correlation identifiers, the GDM id and the application instance id are sufficient.
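The workflow-model tuple and generation rule (c) above can be rendered as a minimal data structure. This is an illustrative sketch only; field names and the `Workflow` class are ours, not part of the paper's prototype.

```python
from dataclasses import dataclass, field

@dataclass
class Workflow:
    """Minimal rendering of w_i = (A, E, V, i, o, type)."""
    activities: set = field(default_factory=set)   # A_wi
    edges: set = field(default_factory=set)        # E_wi: (a_s, a_t) means a_s before a_t
    data: dict = field(default_factory=dict)       # V_wi: name -> (datatype, value)
    inputs: dict = field(default_factory=dict)     # i_wi: activity -> input data names
    outputs: dict = field(default_factory=dict)    # o_wi: activity -> output data names
    types: dict = field(default_factory=dict)      # type_wi: activity -> invoke|send|receive

    def add(self, name, a_type, after=None):
        self.activities.add(name)
        self.types[name] = a_type
        if after is not None:
            self.edges.add((after, name))

# Rule (c): every workflow starts with an 'initiate' receive activity,
# followed by one notification send per other participant.
w = Workflow()
w.add("a_init", "receive")
for p in ["p2", "p3"]:
    w.add(f"notify_{p}", "send", after="a_init")

print(sorted(w.edges))
```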
While the GDM id is known in advance, the application instance id is generated by the initiating participant. The approach enables a decentralized deployment in which each participant controls only his or her deployment and shares only the necessary information. To demonstrate the practical feasibility of the approach, we extended the TOSCA-based open-source end-to-end toolchain OpenTOSCA [6]. It consists of the modeling tool Winery, the deployment engine OpenTOSCA Container, and a self-service portal. In TOSCA, deployment models are modeled as topology templates, the components as node templates, and the relations as relationship templates, along with their types. The types define properties, operations, capabilities, and requirements. Plans are the imperative part of TOSCA, for which standard workflow languages such as BPMN or BPEL can be used. All TOSCA elements and executables, implementing operations and components, are packaged as a Cloud Service Archive (CSAR). In Fig. 5 the system architecture for two participants is depicted. Winery is extended by the placeholder generation and the placeholder substitution. Either P1 or P2 models the application-specific components and generates the GDM using the placeholder generation, which generates node types with the respective properties and capabilities. The resulting GDM is then packaged with the CSAR im-/exporter and sent to each participant. The substitution mapping detects the local part managed by the respective participant in the GDM and selects topology templates from the repository to substitute the placeholder host components. The substituted topology template is then uploaded to the OpenTOSCA Container. The plan builder generates a deployment plan based on the declarative model; we use BPEL for the implementation. Either P1 or P2 can then initiate the deployment. The plan runtime instantiates the plan and invokes the operations. The actual operation, e.g., to create a VM, is executed by the operation runtime.
The communication between the OpenTOSCA Containers is managed by the management bus, which is the participant's endpoint in our setup. However, arbitrary messaging middleware or any other endpoint that can process the messages can be used as well. We used the deployment model presented in Fig. 1 with two and three participants for the validation. In contrast to general workflow approaches [14, 15], we do not have to deal with splitting workflows according to the participants, since we can rely completely on the declarative deployment model and only implicitly generate a choreography. However, a prerequisite is that each participant only uses the predefined interfaces, so that the choreography can be executed. At present, we also limit ourselves to the deployment aspect and do not consider the subsequent management. While management functionalities such as scaling are often covered by the cloud providers themselves, other functionalities such as testing, backups, or updates are not offered. Management increases the complexity of automation, especially when local management affects components managed by other participants. We currently only support TOSCA as a modeling language and OpenTOSCA as a deployment engine. So far, we lack the flexibility to support technologies like Kubernetes, Terraform, or Chef, which are often already in use in practice; however, this is part of the planned future work. The research in the fields of multi-cloud, federated cloud, and inter-cloud [10, 22] focuses on providing unified access to different cloud providers, making placement decisions, migration, and management. All these approaches consider multiple cloud providers satisfying the requirements of a single user. The cloud forms differ in whether the user is aware of using several clouds or not.
However, the collaboration between different users, each using and controlling his or her own environment, whether it is a private, public, or multi-cloud, is not considered. Yet this is highly important, especially in cross-company scenarios, which arise with the new use cases emerging in the fourth industrial revolution. Arcangeli et al. [2] examined the characteristics of deployment technologies for distributed applications and also considered whether deployment control is centralized or decentralized. However, even the decentralized peer-to-peer approaches do not consider the sovereignty of the involved peers and their communication restrictions. In previous work [13], we introduced an approach to enable the deployment of parts of an application in environments that restrict incoming communication; however, the control is still held by a central orchestrator. Kopp and Breitenbücher [17] argued that choreographies are essential for distributed deployments. Approaches for modeling choreographies, e.g., with BPEL [8], or for splitting orchestration workflows into multiple workflows [14, 15] have been published. However, most deployment technologies are based on declarative deployment models [27], since defining the individual tasks to be performed in the correct order to reach a desired state is error-prone. Thus, instead of focusing on workflow choreographies, we implicitly generate a choreography based on declarative deployment models. Breitenbücher et al. [5] demonstrated how to derive workflows from declarative deployment models; however, their approach only generates orchestration workflows, which cannot be used for decentralized cross-organizational deployments. Herry et al. [11] introduced a planning-based approach to generate a choreography. However, they focus on generating an overall choreography that can be executed by several agents.
For us the choreography is only an implicit artifact, since we mainly focus on enabling the cross-organizational deployment by minimizing the globally visible information and preserving the sovereignty of the participants. In this paper, we presented an approach for the decentralized deployment automation of cross-organizational applications involving multiple participants. A cross-organizational deployment without a central trusted third party is enabled based on a declarative deployment modeling approach. The approach facilitates that (i) each participant controls the local deployment while the global deployment is coordinated, and (ii) only the minimal set of information is shared. A declarative global multi-participant deployment model that contains all globally visible information is generated and split into local deployment models that are processed by each participant. Each participant adapts the local model with internal information and generates an imperative deployment workflow. These workflows form the deployment choreography that coordinates the entire application deployment. We implemented the concept by extending the OpenTOSCA ecosystem using TOSCA and BPEL. In future work, the data exchange will be optimized, since each participant sends notification messages to all other participants and thus, for n participants, n-1 messages have to be discarded. We further plan to enable not only multi-participant deployments but also multi-technology deployments by orchestrating multiple deployment technologies.

References:
1. A GENTL approach for cloud application topologies
2. Automatic deployment of distributed software systems: definitions and state of the art
3. Eine musterbasierte Methode zur Automatisierung des Anwendungsmanagements (A pattern-based method for automating application management). Dissertation
4. Vino4TOSCA: a visual notation for application topologies based on TOSCA
5. Combining declarative and imperative cloud application provisioning based on TOSCA
6. The OpenTOSCA ecosystem - concepts & tools
7. Collaborative networks as a core enabler of Industry 4.0
8. BPEL4Chor: extending BPEL for modeling choreographies
9. Declarative vs. imperative: two modeling patterns for the automated deployment of applications
10. Inter-cloud architectures and application brokering: taxonomy and survey
11. Choreographing configuration changes
12. Automatic topology completion of TOSCA-based cloud applications
13. Deployment of distributed applications across public and private networks
14. Supporting business process fragmentation while maintaining operational semantics: a BPEL perspective
15. Role-based decomposition of business processes using BPEL
16. BPMN4TOSCA: a domain-specific language to model management plans for composite applications
17. Choreographies are key for distributed cloud application provisioning
18. Production workflow: concepts and techniques
19. OASIS: Web Services Business Process Execution Language version 2.0 (2007)
20. OASIS: TOSCA Simple Profile in YAML version 1.2 (2019)
21. OMG: BPMN version 2.0. Object Management Group (OMG)
22. Multi-cloud: expectations and current approaches
23. Topology splitting and matching for multi-cloud deployments
24. TOSCAMart: a method for adapting and reusing cloud applications
25. A taxonomy and survey of cloud resource orchestration techniques
26. Design science methodology for information systems and software engineering
27. The essential deployment metamodel: a systematic review of deployment automation technologies

Acknowledgments. This work is partially funded by the BMWi project IC4F (01MA17008G), the DFG project DistOpt (252975529), and the DFG's Excellence Initiative project SimTech (EXC 2075 - 390740016).
key: cord-127900-78x19fw4
authors: Leung, Abby; Ding, Xiaoye; Huang, Shenyang; Rabbany, Reihaneh
title: Contact Graph Epidemic Modelling of COVID-19 for Transmission and Intervention Strategies
date: 2020-10-06
abstract: The coronavirus disease 2019 (COVID-19) pandemic has quickly become a global public health crisis unseen in recent years. It is known that the structure of the human contact network plays an important role in the spread of transmissible diseases. In this work, we study a structure-aware model of COVID-19, CGEM. This model becomes similar to the classical compartment-based models in epidemiology if we assume the contact network is an Erdős-Rényi (ER) graph, i.e. everyone comes into contact with everyone else with the same probability. In contrast, CGEM is more expressive and allows for plugging in the actual contact networks, or more realistic proxies for them. Moreover, CGEM enables more precise modelling of enforcing and releasing different non-pharmaceutical intervention (NPI) strategies. Through a set of extensive experiments, we demonstrate significant differences between the epidemic curves under different underlying structures. More specifically, we demonstrate that the compartment-based models overestimate the spread of the infection by a factor of 3 and, under some realistic assumptions on the compliance factor, underestimate the effectiveness of some NPIs, mischaracterize others (e.g. predicting a later peak), and underestimate the scale of the second peak after reopening.

Epidemic modelling of COVID-19 has been used to inform public health officials across the globe, and the subsequent decisions have significantly affected every aspect of our lives, from the financial burdens of closing down businesses and the overall economic crisis, to the long-term effects of delayed education, and the adverse effects of confinement on mental health.
Given the huge and long-term impact of these models on almost everyone in the world, it is crucial to design models that are as realistic as possible in order to correctly assess the costs and benefits of different intervention strategies. Yet the models currently used in practice have many known issues. In particular, the commonly used compartment-based models from classical epidemiology do not consider the structure of real-world contact networks. It has been shown previously that contact network structure changes the course of an infection spread significantly (Keeling 2005; Bansal, Grenfell, and Meyers 2007). In this paper, we demonstrate the structural effect of different underlying contact networks in COVID-19 modelling. [Figure caption: standard compartment models assume an underlying ER contact network, whereas real networks have a non-random structure, as seen in the Montreal WiFi example. In each network, two infected patients with 5 and 29 edges are selected randomly, and the networks in comparison have the same number of nodes and edges. In the WiFi network, infected patients are highly likely to spread their infection in their local communities, while in the ER graph they have a wide-spread reach.] Non-pharmaceutical interventions (NPIs) played a significant role in limiting the spread of COVID-19. Understanding the effectiveness of NPIs is crucial for more informed policy making at public agencies (see the timeline of NPIs applied in Canada in Table 2). However, the commonly used compartment-based models are not expressive enough to directly study different NPIs. For example, Ogden et al. (2020) described the predictive modelling efforts for COVID-19 within the Public Health Agency of Canada. To study the impact of different NPIs, they used an agent-based model in addition to a separate deterministic compartment model. One significant disadvantage of the compartment model is its inability to realistically model the closure of public places such as schools and universities.
This is due to the fact that compartment models assume each individual has the same probability of being in contact with every other individual in the population, which is rarely true in reality. Only by incorporating real-world contact networks into compartment models can one disconnect network hubs to realistically simulate the effect of closure. Therefore, Ogden et al. (2020) had to rely on a separate stochastic agent-based model to model the closure of public places. In contrast, our proposed CGEM is able to directly model all NPIs used in practice realistically. In this work, we propose to incorporate the structural information of the contact network between individuals and show the effects of NPIs applied on different categories of contact networks. In this way, we can 1) model various NPIs more realistically, and 2) avoid the homogeneous mixing assumption imposed by compartment models and utilize different networks for different population demographics. First, we perform simulations on various synthetic and real-world networks to compare the impact of the contact network structure on the spread of the disease. Second, we demonstrate that the degree of effectiveness of NPIs can vary drastically depending on the underlying structure of the contact network. We focus on the effects of 4 widely adopted NPIs: 1) quarantining infected and exposed individuals, 2) social distancing, 3) closing down non-essential workplaces and schools, and 4) the use of face masks. Lastly, we simulate the effect of re-opening strategies and show that the outcome will depend, again, on the assumed underlying structure of the contact networks. To design a realistic model of the spread of the pandemic, we also used a WiFi hotspot network from Montreal to simulate real-world contact networks. Given that our data is from Montreal, we focus on studying the Montreal timeline, but the basic principles are valid generally and CGEM is designed to be used with any realistic contact network.
We believe that CGEM can improve our understanding of the current COVID-19 pandemic and be informative for public agencies on future NPI decisions.

Summary of contributions:
• We show that the structure of the contact networks significantly changes the epidemic curves, and that the current compartment-based models are subject to overestimating the scale of the spread.
• We demonstrate that the degree of effectiveness of different NPIs depends on the assumed underlying structure of the contact networks.
• We simulate the effect of re-opening strategies and show that the outcome will depend, again, on the assumed underlying structure of the contact networks.

Reproducibility: code for the model and synthetic network generation is in the supplementary material. The real-world data can be accessed through the original source.

Different approaches have accounted for network structures in epidemiological modelling. Degree block approximation (Barabási et al. 2016) considers the degree distribution of the network by grouping nodes with the same degree into the same block and assuming that they have the same behavior. Percolation theory methods (Newman 2002) can approximate the final size of the epidemic for networks with specified degree distributions. Recently, Sambaturu et al. (2020) designed effective vaccination strategies based on real and diverse contact networks (Vogel 2020; Lawson et al. 2020). Various modifications have been made to the compartment differential equations to account for the network effect (Aparicio and Pascual 2007; Keeling 2005; Bansal, Grenfell, and Meyers 2007). Simulation-based approaches are often used when the underlying networks are complex and mathematically intractable. Grefenstette et al. (2013) employed an agent-based model to simulate the dynamics of the SEIR model with a census-based synthetic population; the contact networks are implied by the behavior patterns of the agents. Chen et al.
(2020) adopted the independent cascade (IC) model (Saito, Nakano, and Kimura 2008) to simulate the disease propagation and used a Facebook network as a proxy for the contact network. Social networks, however, are not always a good approximation of physical contact networks. In our study, we attempt to better ground the simulations by inferring the contact networks from WiFi hub connection records.

[Table 2 caption: CGEM can realistically model all NPIs used in practice, while existing models miss one or more NPIs.]

Tuite, Fisman, and Greer (2020) used a compartment model to project outcomes including the prevalence of hospital admissions and ICU use, and death. They assumed the effect of physical-distancing measures was to reduce the number of contacts per day across the entire population. In addition, enhanced testing and contact tracing were assumed to move individuals with non-severe symptoms from the infectious to isolated compartments. In this work, we also examine the effect of the closure of public places, which is difficult to simulate in a realistic manner with standard compartment models. Ogden et al. (2020) described the predictive modelling efforts for COVID-19 within the Public Health Agency of Canada. They estimated that more than 70% of the Canadian population may be infected by COVID-19 if no intervention is taken. They proposed an agent-based model and a deterministic compartment model. In the compartment model, similar to Tuite, Fisman, and Greer (2020), the effects of physical distancing are modelled by reducing daily per capita contact rates. The agent-based model is used to separately simulate the effects of closing schools, workplaces and other public places. In this work, we compare the effects of all NPIs used in practice through a unified model and show how different contact networks change the outcome of NPIs. In addition, Ferguson et al. (2020) employed an individual-based simulation model to evaluate the impact of NPIs, such as quarantine, social distancing and school closure.
The number of deaths and ICU bed demand are used as proxies to compare the effectiveness of NPIs. In comparison, our model can directly utilize contact networks, and we also model the impact of wearing masks. Block et al. (2020) proposed three selective social distancing strategies based on the observation that epidemic dynamics depend on the network structure. The strategies aim to increase network clustering and eliminate shortcuts, and are shown to be more effective than naive social distancing. Reich, Shalev, and Kalvari (2020) proposed a selective social distancing strategy which lowers the mean degree of the network by limiting super-spreaders. The authors also compared the impact of various NPIs, including testing, contact tracing, quarantine and social distancing. Neural network based approaches (Soures et al. 2020; Dandekar and Barbastathis 2020) have also been proposed to estimate the effectiveness of quarantine and forecast the spread of the disease. In a classic SEIR model, referred to as base SEIR, the dynamics of the system at each time step can be described by the following equations (Aron and Schwartz 1984):

dS/dt = -β·S·I/N
dE/dt = β·S·I/N - σ·E
dI/dt = σ·E - γ·I
dR/dt = γ·I

where an individual can be in one of 4 states at any given time step t: (S) susceptible, (E) exposed, (I) infected and able to infect susceptible nodes, and (R) recovered; N = S + E + I + R is the population size, and β, σ, γ are the transition rates from S to E, E to I, and I to R, respectively. Similarly, in CGEM, an individual can be either (S) susceptible, (E) exposed, (I) infected or (R) recovered. We do not consider reinfection, but extensions are straightforward. Unlike the equation-based SEIR model, which assumes homogeneous mixing, CGEM takes into account the contact patterns between individuals by simulating the spread of a disease over a contact network. Each individual becomes a node in the network and the edges represent the connections between people. Algorithm 1 shows the pseudo-code for CGEM.
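The base SEIR dynamics described above can be sketched with a simple forward-Euler integration (step size of one day). This is an illustrative sketch: the parameter values below are placeholders, not the values fitted to the Montreal data.

```python
def seir(beta, sigma, gamma, N, I0=1, E0=3, days=200):
    """Forward-Euler integration of the base SEIR system.

    Returns the daily infected counts I(t). Transitions:
    S -> E at rate beta*S*I/N, E -> I at rate sigma*E, I -> R at gamma*I.
    """
    S, E, I, R = N - I0 - E0, E0, I0, 0
    curve = []
    for _ in range(days):
        new_e = beta * S * I / N   # S -> E
        new_i = sigma * E          # E -> I
        new_r = gamma * I          # I -> R
        S = S - new_e
        E = E + new_e - new_i
        I = I + new_i - new_r
        R = R + new_r
        curve.append(I)
    return curve

# Illustrative parameters (NOT the paper's fitted values): 5-day latency,
# 10-day infectious period, population matching the paper's network size.
curve = seir(beta=0.9, sigma=1 / 5, gamma=1 / 10, N=17_800)
peak_day = curve.index(max(curve))
print(peak_day, round(max(curve)))
```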
Given a contact network, we assume that a node comes into contact with all its neighbours at each time step. More specifically, at each time step, the susceptible neighbours of infected individuals become infected with a transmission probability φ and enter the exposed state. We randomly select exposed nodes to become infected with probability σ, and let infected nodes recover with probability γ. Following Barabási et al. (2016), the parameters of the synthetic graph generation can be adjusted to produce graphs of the same size, thus facilitating a fair comparison between different structures. We discuss the details in the following sections.

Inferring the transmission rate. By definition, β represents the likelihood that a disease is transmitted from an infected to a susceptible individual in a unit of time. Barabási et al. (2016) assumes that on average each node comes into contact with ⟨k⟩ neighbors; then the relationship between β and the transmission rate φ can be expressed as:

β = ⟨k⟩ · φ (1)

where ⟨k⟩ is the average degree of the nodes. In the case of a regular random network, all nodes have the same degree, i.e. k = ⟨k⟩, and equation 1 reduces to:

β = k · φ (2)

Since the homogeneous mixing assumption made by the standard SEIR model can be well simulated by running CGEM over a regular random network, we propose to bridge the two models with the following procedure: 1. Fit the classic SEIR model to real data to estimate β. 2. Run CGEM over regular random networks with different values of k and with φ derived from equation 2. 3. Choose k = k* which produces the best fit to the predictions of the classic SEIR model. The regular random network with average degree k* is the contact network the classic SEIR model is approximating, and φ* = β/k* is the implied transmission rate. We use this transmission rate for the other contact networks studied, so that the dynamics of the disease (transmissibility) is fixed and only the structure of the contact graph changes.
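The per-time-step contact rule described above can be sketched as a discrete-time simulation over an adjacency list. This is a minimal sketch, not the paper's Algorithm 1: the function and variable names are ours, and the toy graph is purely illustrative.

```python
import random

def cgem_run(adj, phi, sigma, gamma, seed_infected, steps=100, rng=None):
    """Toy CGEM loop: susceptible neighbours of infected nodes become
    exposed with prob. phi; exposed become infected with prob. sigma;
    infected recover with prob. gamma. Returns daily infected counts."""
    rng = rng or random.Random(0)
    state = {v: "S" for v in adj}
    for v in seed_infected:
        state[v] = "I"
    history = []
    for _ in range(steps):
        # contacts of currently infected nodes (evaluated before updates)
        exposed_now = {u for v in adj if state[v] == "I"
                       for u in adj[v]
                       if state[u] == "S" and rng.random() < phi}
        for v in list(adj):
            if state[v] == "I" and rng.random() < gamma:
                state[v] = "R"
            elif state[v] == "E" and rng.random() < sigma:
                state[v] = "I"
        for u in exposed_now:          # newly exposed wait one step
            state[u] = "E"
        history.append(sum(s == "I" for s in state.values()))
    return history

# 4-cycle toy contact graph
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
hist = cgem_run(adj, phi=0.5, sigma=0.5, gamma=0.2, seed_infected=[0])
```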
Tuning synthetic network generators. As a proxy for actual contact networks, which are often not available, we can pair CGEM with synthetic networks with more realistic properties, comparable to real-world networks, e.g. a heavy-tailed degree distribution, a small average shortest path, etc. To adjust the parameters of these generators, we can reframe the problem as: given transmission rate φ* and population size n, are there other networks which can produce the same infection curve? For this, we can carry out procedures similar to the above. For example, we can run CGEM with transmission rate φ* over scale-free networks generated with different values of m_BA, where m_BA is the number of edges a new node can form in the Barabási-Albert algorithm (Barabási et al. 2016). The m_BA which produces the best fit to the infection curve gives us a synthetic contact network that is realistic in terms of number of edges compared to the real contact network.

Here we explain how different NPIs can be modelled directly in CGEM as changes in the underlying structure.

Quarantine. How can we model the quarantining and self-isolation of exposed and infected individuals? Exposed individuals have come into close contact with an infected person and are considered to have a high risk of contracting the disease. In an ideal world, most, if not all, infected individuals would be easily identifiable and quarantined. However, in reality, over 40% of infected cases are asymptomatic (He et al. 2020), and not all are identified immediately, or at all, and can therefore go on to infect others unintentionally. To account for this in our model, we apply quarantining by removing all edges from a subset of exposed and infected nodes.

Social distancing. Social distancing reduces the opportunities for close contact between individuals by limiting contacts to those from the same household and staying at least 6 feet apart from others when out in public.
In CGEM, a percentage of edges from each node is removed to simulate the effects of social distancing to different extents.

Wearing masks. Masks are shown to be effective in reducing the transmission rate of COVID-19, with a relative risk (RR) of 0.608 (Ollila et al. 2020). We simulate this by assigning a mask-wearing state to each node and varying the transmissibility φ based on whether the 2 nodes in contact are wearing masks or not. We define the new transmission rate with this NPI, φ_mask, as follows: φ_mask = m₁ · φ if both nodes wear masks, m₀ · φ if 1 node wears a mask, and φ otherwise.

Closure: removing hubs. Places of mass gathering (e.g. schools and workplaces) put large numbers of people in close proximity. If infected individuals are present in these locations, they can have a large number of contacts and very quickly infect many others. In a network, these nodes with a high number of connections, or high degree, are known as hubs. By removing the top-degree hubs, we simulate the effects of cancelling mass gatherings and closing down schools and non-essential workplaces. In CGEM, we remove all edges from r% of the top-degree nodes to simulate the closure of schools and non-essential workplaces. However, some hubs, such as (workers in) grocery stores and some government agencies, must remain open, so we assign each hub a successful removal rate of p_success to control this effect.

Compliance. Given that the NPIs are complied with by the majority, but not all, of the individuals, we randomly assign a fixed percentage of the nodes as non-compliers. We set this to 26% in all the simulations, based on a recent survey (Bricker 2020). Due to the economic and psychological impacts of a complete lockdown on society, it is critical to know how safe it is to resume commercial and social activities once the pandemic has stabilized. Therefore, we also investigate the impact of relaxing each NPI and the risk of a second wave of infection.
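The NPI graph edits described above (quarantine as edge removal, hub closure with a success probability, and the mask-adjusted transmission rate) can be sketched as follows. This is an illustrative sketch only: the function names are ours, the threshold in the example is arbitrary, m₀ ≈ 0.61 follows the RR of 0.608 cited above, and m₁ = 0.37 is a placeholder we chose for the both-masked case, not a value from the paper.

```python
import random

def quarantine(adj, nodes):
    """Remove all edges of the given (exposed/infected) nodes."""
    for v in nodes:
        for u in adj[v]:
            adj[u].discard(v)
        adj[v] = set()

def close_hubs(adj, fraction, p_success=1.0, rng=None):
    """Remove all edges from the top-degree fraction of nodes, each
    removal succeeding with probability p_success (essential hubs stay)."""
    rng = rng or random.Random(0)
    hubs = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    k = max(1, int(fraction * len(adj)))
    quarantine(adj, [v for v in hubs[:k] if rng.random() < p_success])

def masked_rate(phi, mask_u, mask_v, m1=0.37, m0=0.61):
    """phi_mask: m1*phi if both masked, m0*phi if one masked, else phi."""
    if mask_u and mask_v:
        return m1 * phi
    if mask_u or mask_v:
        return m0 * phi
    return phi

# Star-shaped toy graph: node 0 is the hub.
adj = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
close_hubs(adj, fraction=0.25)
print(adj)  # all edges incident to the hub are gone
```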
More specifically, we simulate a complete reversal of the NPIs by adding back the edges that were removed when the NPI was first applied, returning the underlying structure to its original form. We compare the spread of COVID-19 on synthetic and real-world networks. These networks include 3 synthetic networks: (1) the regular random network, where all nodes have the same degree; (2) the Erdős-Rényi random network, where the degree distribution is Poisson distributed; and (3) the Barabási-Albert network, where the degree distribution follows a power law. Additionally, we analyzed 4 real-world networks: the USC35 network from the Facebook100 dataset (Traud, Mucha, and Porter 2012), consisting of Facebook friendship links between students and staff at the University of Southern California in September 2005, and 3 snapshots of a real-world WiFi hotspot network from Montreal, a network often used as a proxy for a human contact network while studying disease transmission (Yang et al. 2020). In the Montreal WiFi network, edges are formed between nodes (mobile phones) that are connected to the same public WiFi hub at the same time. As shown in Table 3, each of the 7 networks consists of 17,800 nodes, consistent with 1/100th of the population of the city of Montreal, and has between 110,000 and 220,000 edges, with the exception of the USC network. Due to the aggregated nature of the USC dataset, edge sampling is enforced during the contact phase in order to obtain a reasonable disease spread. The synthetic networks are in general more closely connected than the Montreal WiFi networks, despite having similar numbers of nodes and edges. Only the largest connected component is considered in all networks. The structure of the contact network plays an important role in the spread of a disease (Bansal, Grenfell, and Meyers 2007).
it dictates how likely susceptible nodes are to come into contact with infected ones, and it is therefore crucial to evaluate how the disease spreads on each network under the same initial parameters. here, the classic seir model is fitted against the infection counts from the 100th case in montreal up to april 4, which is before any npi was applied, to obtain β. with eq. 2, the transmission rate φ is estimated to be 0.0371 and is used across all networks. in all experiments, we also seed the population with the same initial condition of 3 exposed nodes and 1 infected node. the parameters used to generate the synthetic networks are obtained following the procedures described in the previous section. all results are averaged across 10 runs; the grey shaded region shows the 95% confidence interval of each curve. as shown in figure 2, the er network fits the base seir model almost perfectly (compare the green 'er' and black 'base' curves). observation 1: cgem closely approximates the base seir model when the contact network is assumed to be an erdős-rényi graph. all networks drastically overestimate the spread of covid-19 when compared with real-world data. this is to be expected to some degree, as in this experiment we project the curves assuming no npi is in effect, which is not what happened in reality (see the 'real' orange curve). however, we observe that all 3 synthetic networks, including the er model, exceedingly overshoot, showing almost the entire population getting infected, whereas the real-world wifi networks predict a 3x lower peak. observation 2: assuming an erdős-rényi graph as the contact network overestimates the impact of covid-19 by more than a factor of 3 when compared with more realistic structures. in order to limit the effects of the pandemic, the federal and provincial governments introduced a number of measures to reduce the spread of covid-19.
we simulate the effects of 4 different non-pharmaceutical interventions, or npis, at different strengths to determine their effectiveness. these are: (1) quarantining exposed and infected individuals, (2) social distancing between nodes, (3) removing hubs, and (4) the use of face masks. quarantine: we apply quarantining in our model on march 23, when both the quebec and canadian governments asked those who had returned from foreign travel or experienced flu-like symptoms to self-isolate. we remove all edges from 50, 75, and 95% of exposed and infected nodes to simulate various strengths of quarantining. figure 8 displays the effect of quarantining on different graph structures. quarantining infected and exposed nodes both reduces and delays the peak of all infection curves. however, the peak is not delayed as much in the wifi graphs as the er graph predicts, which is important information when planning for the healthcare system. out of all tested npis, applying quarantine has the most profound reduction on all infection curves. observation 3: quarantining delays the peak of infection on the er graph, whereas the peaks on the real-world graphs are lowered but not delayed significantly. social distancing: social distancing reduces the number of close contacts. we remove 10%, 30%, or 50% of the edges from each node to simulate different degrees of distancing. figure 9 shows the effects of social distancing on the infection curves of each network structure. it is effective in reducing the peak of the pandemic on all networks, but again delays the peaks only on the synthetic networks. similar to observation 3, we have: observation 4: social distancing delays the peak of infection on the er graph, whereas the peaks on the real-world graphs are lowered but not delayed significantly. removing hubs: we remove all edges from 1% of top-degree nodes to simulate the closure of schools, and from 5 and 10% of top-degree nodes to simulate the closure of non-essential workplaces.
these npis are applied on march 23, coinciding with the dates of school and non-essential business closure in quebec. p_success is set to 0.8 unless otherwise stated. figure 10 shows the effects of removing hubs. this npi is very effective on the ba network and all 3 montreal wifi networks, since these networks have a power-law degree distribution and hubs are present. however, it is not very effective on the regular and er random networks. observation 5: the er graph significantly underestimates the effect of removing hubs. removing hubs is most effective on networks with a power-law degree distribution, since hubs act as super-spreaders and removing them effectively contains the virus. however, no hubs are present in the er and regular random networks, and thus removing hubs reduces to removing random nodes. luckily, real-world contact networks have power-law degree distributions, making hub removal an effective strategy in practice. wearing masks: we set m_2 = 0.6, m_1 = 0.8 and m_0 = 1, and use the following transmission rate φ_mask in cgem: 0.6 · φ if both nodes wear masks, 0.8 · φ if 1 node wears a mask, and 1 · φ otherwise. wearing masks is only able to flatten the infection curve on the synthetic networks but does not reduce the final epidemic attack rate (the total size of the population infected), as shown in figure 11. however, in the real-world wifi networks, wearing masks is able both to flatten the curve and to significantly reduce the final epidemic attack rate. observation 6: the er graph significantly underestimates the effect of wearing masks in terms of the total decrease in the final attack rate. we experiment with relaxing each of the npis, but for brevity we only report the results for allowing back hubs, which corresponds to the current reopening of schools and public places. the results for the other npis are available in the extended results.
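the hub-closure rule described above can be sketched as follows; `remove_hubs` is an illustrative helper operating on an adjacency-set representation, not the paper's code:

```python
import random

def remove_hubs(adj, r, p_success=0.8, seed=0):
    """Disconnect the top r-fraction of nodes by degree, each with
    probability p_success (some hubs, e.g. grocery stores, stay open)."""
    rng = random.Random(seed)
    k = max(1, int(r * len(adj)))
    hubs = sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:k]
    for h in hubs:
        if rng.random() < p_success:
            for nbr in list(adj[h]):
                adj[h].discard(nbr)
                adj[nbr].discard(h)
    return adj

# demo: closing the single hub of a small star network disconnects it entirely
star = {0: set(range(1, 6)), **{i: {0} for i in range(1, 6)}}
closed = remove_hubs(star, r=0.2, p_success=1.0)
```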
for removing hubs, we apply reopening on july 18 (denoted by the second vertical line in figure 7), after many non-essential businesses and workplaces were allowed to open in quebec. because the synthetic networks estimate that most of the population would be infected before the hubs are reopened, we calibrate the number of infected and recovered individuals at the point of reopening to align with the statistics available in the real-world data; the simulation therefore continues after reopening with all the models having the same number of susceptible individuals, since otherwise, in the er graph, everyone is infected at that point. (figure 6: difference between the cumulative curves with and without mask wearing. the cumulative curves represent the total impact, and the difference shows the estimated drop in final attack rate with the npi enforced.) we can see in figure 7 that the er and regular random networks significantly underestimate the extent of second-wave infections. the ba and wifi networks all show second-wave infections with a higher peak than the initial one, prompting more caution when considering reopening businesses and schools. observation 7: the er graph significantly underestimates the second peak after reopening public places, i.e. allowing back hubs. in this paper, we propose to model covid-19 on contact networks (cgem) and show that such modelling, when compared to traditional compartment-based models, gives significantly different epidemic curves. moreover, cgem subsumes the traditional models while providing more expressive power to model the npis. we hope that cgem can be used to achieve more informed policy making when studying reopening strategies for covid-19.
references:
- building epidemiological models from r0: an implicit treatment of transmission in networks
- seasonality and period-doubling bifurcations in an epidemic model
- when individual behaviour matters: homogeneous and network models in epidemiology
- network science
- social network-based distancing strategies to flatten the covid-19 curve in a post-lockdown world
- one quarter (26 percent) of canadians admit they're not practicing physical distancing as
- a time-dependent sir model for covid-19 with undetectable infected persons
- neural network aided quarantine control model estimation of global covid-19 spread
- impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
- fred (a framework for reconstructing epidemic dynamics): an open-source software system for modeling infectious diseases and control strategies using census-based populations
- temporal dynamics in viral shedding and transmissibility of covid-19
- epidemic wave dynamics attributable to urban community structure: a theoretical characterization of disease transmission in a large network
- données covid-19 au québec
- the implications of network structure for epidemic dynamics
- covid-19: recovery and re-opening tracker
- crawdad dataset ilesansfil/wifidog
- situation of the coronavirus covid-19 in montreal
- spread of epidemic disease on networks
- predictive modelling of covid-19 in canada
- face masks prevent transmission of respiratory diseases: a meta-analysis of randomized controlled trials
- modeling covid-19 on a network: super-spreaders, testing and containment.
medrxiv
- prediction of information diffusion probabilities for independent cascade model
- designing effective and practical interventions to contain epidemics
- sir-net: understanding social distancing measures with hybrid neural network model for covid-19 infectious spread
- social structure of facebook networks
- mathematical modelling of covid-19 transmission and mitigation strategies in the population of ontario
- covid-19: a timeline of canada's first-wave response
- targeted pandemic containment through identifying local contact network bottlenecks
montreal wifi network: 3 snapshots of the montreal wifi network are used in this paper, with the following time periods: 2004-08-27 to 2006-11-30, 2007-07-01 to 2008-02-26, and 2009-12-02 to 2010-03-08. each entry in the dataset consists of a unique connection id, a user id, a node id (wifi hub), a timestamp in, and a timestamp out. nodes in the network are the users in each connection. an edge forms between users who have connected to the same wifi hub at the same time. connections are sampled within the aforementioned time periods to obtain ∼17,800 nodes. since there are many disconnected nodes in the wifi networks, only the giant connected component is used. synthetic networks: we compared cgem on the wifi networks with 3 synthetic network models: the regular, er, and ba networks. in each of these models, we set the number of nodes to 17,800 and fit the respective parameters to best match the infection curve of the base model and the number of edges in the wifi networks (table 5). all the experiments have been performed on a stock laptop. the following assumptions are made in cgem: 1. individuals who recover from covid-19 cannot be infected again; 2. symptomatic and asymptomatic individuals have the same transmission rate and quarantine with the same probability; 3. a certain percentage of the population does not comply with npis regardless of their connections.
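the edge-formation rule for the montreal wifi snapshots described above (two users linked if their sessions at the same hub overlap in time) can be sketched as follows; `wifi_contact_edges` and the record layout are illustrative assumptions, not the paper's code:

```python
def wifi_contact_edges(connections):
    """Build contact edges from (user, hub, t_in, t_out) session records:
    two distinct users are linked if their sessions at the same wifi hub
    overlap in time."""
    by_hub = {}
    for user, hub, t_in, t_out in connections:
        by_hub.setdefault(hub, []).append((user, t_in, t_out))
    edges = set()
    for sessions in by_hub.values():
        for i in range(len(sessions)):
            for j in range(i + 1, len(sessions)):
                u, a_in, a_out = sessions[i]
                v, b_in, b_out = sessions[j]
                if u != v and a_in <= b_out and b_in <= a_out:
                    edges.add((min(u, v), max(u, v)))
    return edges

# demo: only users "a" and "b" share hub "h1" at overlapping times
records = [("a", "h1", 0, 10), ("b", "h1", 5, 15),
           ("c", "h1", 20, 30), ("d", "h2", 0, 5)]
edges = wifi_contact_edges(records)
```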
quarantine: figure 8 shows the results of quarantining on all graph structures. quarantining infected and exposed nodes both reduces and delays the peak of all infection curves; however, the peak is not delayed as much in the wifi graphs as in the regular and er graphs. social distancing: figure 9 shows the results of applying social distancing on all networks. like quarantining, it is effective in reducing the peaks of the infection curves on all networks, but the delay of the peaks is only apparent on the synthetic networks. removing hubs: figure 10 shows the results of applying school and business closure on all networks. the er and regular random networks significantly underestimate the effect of removing hubs. wearing masks: figure 11 shows the results with and without mask wearing on each network. figure 12 shows the infection curves of all the networks with all npis applied: on march 23, 50% social distancing and 50% quarantine are applied, and 10% of hubs are removed with a success rate of 0.8; mask wearing is applied on april 6. the wifi networks more closely resemble the shape of the real infection curve (table 2).

key: cord-004157-osol7wdp; authors: ma, junling; title: estimating epidemic exponential growth rate and basic reproduction number; date: 2020-01-08; journal: infect dis model; doi: 10.1016/j.idm.2019.12.009; doc_id: 4157; cord_uid: osol7wdp

the initial exponential growth rate of an epidemic is an important measure of the severity of the epidemic, and is also closely related to the basic reproduction number. estimating the growth rate from the epidemic curve can be a challenge, because the rate decays with time. for fast epidemics, the estimation is subject to over-fitting due to the limited number of data points available, which also limits our choice of models for the epidemic curve. we discuss the estimation of the growth rate using the maximum likelihood method and simple models.
this is a series of lecture notes for a summer school at shanxi university, china, in 2019. the contents are based on ma et al. (ma, dushoff, bolker, & earn, 2013). we will study the initial exponential growth rate of an epidemic in section 1, the relationship between the exponential growth rate and the basic reproduction number in section 2, an introduction to least squares estimation and its limitations in section 3, an introduction to maximum likelihood estimation in section 4, and the maximum likelihood estimation of the growth rate in section 5. epidemic curves are time series data of the number of cases per unit time. common choices for the time unit include a day, a week, a month, etc. the epidemic curve is an important indication of the severity of an epidemic as a function of time. for example, fig. 1 shows the cumulative number of ebola cases during the 2014-16 ebola outbreak in western africa. the cumulative cases during the initial growth phase form an approximately linear relationship with time on a log-linear scale; thus, on a linear scale, the number of cases increases exponentially with time. the mortality curve (the number of deaths per unit time) shows a similar pattern, as demonstrated by the daily influenza deaths in philadelphia during the 1918 influenza pandemic shown in fig. 2. in fact, most epidemics grow approximately exponentially during the initial phase of an epidemic. this can be illustrated by the following examples. example 1. consider the sir model ds/dt = -βsi, di/dt = βsi - γi, dr/dt = γi, (1) where s is the fraction of susceptible individuals, i is the fraction of infectious individuals, and r is the fraction of recovered individuals; β is the transmission rate per infectious individual, and γ is the recovery rate, i.e., the infectious period is exponentially distributed with mean 1/γ. linearizing about the disease-free equilibrium (dfe) (1, 0, 0), di/dt ≈ (β - γ)i. (2) thus, if β - γ > 0, then i(t) grows exponentially about the dfe.
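the claim that i(t) grows exponentially near the dfe when β - γ > 0 can be checked numerically; the following is a minimal sketch (forward-euler integration, function name and parameter values mine, chosen for illustration):

```python
import math

def sir_incidence(beta, gamma, i0, t_end, dt=0.001):
    """Forward-Euler integration of the SIR model (1); returns the
    incidence rate c(t) = beta*s*i sampled once per unit time."""
    s, i = 1.0 - i0, i0
    samples = []
    per_unit = int(1 / dt)
    for k in range(int(t_end / dt) + 1):
        if k % per_unit == 0:
            samples.append(beta * s * i)
        ds = -beta * s * i
        di = beta * s * i - gamma * i
        s += dt * ds
        i += dt * di
    return samples

# with a tiny seed, successive samples should grow by a factor exp(beta - gamma)
c = sir_incidence(beta=0.3, gamma=0.2, i0=1e-6, t_end=10)
ratio = c[1] / c[0]  # should be close to exp(0.1) ≈ 1.105
```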
in addition, initially s ≈ 1; thus, the incidence rate (number of new cases per unit time) c = βsi also increases exponentially. the situation is similar for a susceptible-exposed-infectious-recovered (seir) model, as illustrated by the following example. example 2. consider the seir model ds/dt = -βsi, de/dt = βsi - σe, di/dt = σe - γi, dr/dt = γi, (3) where e is the fraction of latent individuals (infected but not yet infectious), and σ is the rate at which latent individuals leave the class, i.e., the latent period is exponentially distributed with mean 1/σ; s, i, r, β and γ are defined as in example 1. again, (1, 0, 0, 0) is a disease-free equilibrium representing a completely susceptible population. linearizing about this equilibrium, the equations for e and i decouple from the others and become de/dt = -σe + βi, di/dt = σe - γi. the jacobian matrix j = [[-σ, β], [σ, -γ]] has two real eigenvalues, namely λ1 = [-(σ + γ) + √((σ - γ)² + 4σβ)]/2 and λ2 = [-(σ + γ) - √((σ - γ)² + 4σβ)]/2. thus, about the dfe, the solution of the model is asymptotically exponential with rate λ1. similar to example 1, the incidence rate also grows exponentially initially. in general, suppose the infection states of an individual can be characterized by a vector (S, I), where S represents multiple susceptible states and I represents multiple infectious (or latent) states; we also use S and I to denote the numbers of individuals in each state. assume that the epidemic can be modeled by the generic system dS/dt = f(S, I), dI/dt = g(S, I). if (S0, 0) is a dfe, and the initial number of infectious individuals I(0) is very small, then, initially, the dynamics of I is governed by the linearized system dI/dt = JI, where j is the jacobian of g with respect to I at the dfe. if the dfe is unstable, then I(t) grows asymptotically exponentially. 2.
the exponential growth rate and the basic reproduction number. the exponential growth rate is, by itself, an important measure of the speed of spread of an infectious disease. it being zero is, like the basic reproduction number r0 = 1, a disease threshold: the disease can invade a population if the growth rate is positive, and cannot invade (with a few initially infectious individuals) if it is negative. in fact, it can be used to infer r0. there are two approaches to infer r0 from the exponential growth rate: a parametric one and a non-parametric one. for the parametric approach, we need an underlying model that gives both the growth rate and r0. example 3. consider the sir model (1) in example 1. note that (1, 0, 0) is a disease-free equilibrium, representing a completely susceptible population. as we discussed above, the exponential growth rate is λ = β - γ, and the basic reproduction number is r0 = β/γ. if, for example, γ is estimated independently of λ, then r0 = 1 + λ/γ. let us look at a more complicated example. example 4. consider the seir model (3) in example 2, whose exponential growth rate is λ = λ1. express β in terms of λ and substitute it into r0 = β/γ; then r0 = (1 + λ/σ)(1 + λ/γ). thus, if the mean infectious period 1/γ and the mean latent period 1/σ can be estimated independently of λ, then r0 can be inferred from λ. typically, for an epidemic model that contains a single transmission rate β, if all other parameters can be estimated independently of the exponential growth rate λ, then λ determines β, and thus determines r0. models can be overly simplified for mathematical tractability. for example, both the sir model in example 1 and the seir model in example 2 assume an exponentially distributed infectious period. however, the infectious period and the latent period are most likely not exponential. wallinga and lipsitch (wallinga & lipsitch, 2006) developed a non-parametric method to infer the basic reproduction number from the exponential growth rate without assuming a model.
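the growth rate and r0 relations in examples 3 and 4 can be checked numerically; a small sketch (the parameter values β = 0.3, σ = 1, γ = 0.2 per day are those used in the simulation later in the notes, and the function names are mine):

```python
import math

def seir_growth_rate(beta, sigma, gamma):
    """Dominant eigenvalue lambda1 of the linearized SEIR system
    J = [[-sigma, beta], [sigma, -gamma]] from example 2."""
    return (-(sigma + gamma) + math.sqrt((sigma - gamma) ** 2 + 4 * sigma * beta)) / 2

def r0_sir(lam, gamma):
    """SIR: lambda = beta - gamma and R0 = beta/gamma give R0 = 1 + lambda/gamma."""
    return 1 + lam / gamma

def r0_seir(lam, sigma, gamma):
    """SEIR: solving lambda1 for beta gives beta = (lam+sigma)(lam+gamma)/sigma,
    hence R0 = beta/gamma = (1 + lam/sigma)(1 + lam/gamma)."""
    return (1 + lam / sigma) * (1 + lam / gamma)

lam = seir_growth_rate(0.3, 1.0, 0.2)  # per day; about 0.547 per week
```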
let h(a) be the probability that a random individual remains infectious a time units after being infected (i.e., a is the infection age), and let β(a) be the rate of transmission at infection age a. then t(a) = h(a)β(a) is the transmissibility of a random infectious individual at infection age a, assuming that the whole population is susceptible, and r0 = ∫₀^∞ t(a) da. in addition, we assume that the population is randomly mixed, i.e., every pair of individuals has an identical rate of contact. let c(t)dt be the number of new infections during the time interval [t, t+dt], that is, c(t) is the incidence rate, and let s(t) be the average susceptibility of the population, i.e., the expected susceptibility of a randomly selected individual. new infections at time t are the sum of all infections caused by individuals infected a time units ago (i.e., at time t-a), provided they remain infectious at time t (with infection age a) and their contact is susceptible. that is, c(t) = s(t) ∫₀^∞ t(a) c(t-a) da. (4) to compute r0, we normalize t(a) as a probability density function, w(a) = t(a) / ∫₀^∞ t(s) ds. note that w(a)da is the probability that a secondary infection occurs during the infection-age interval [a, a+da]; that is, w(a) is the probability density function of the generation time, i.e., the time from being infected to generating a secondary infection. this generation time is also called the serial interval. with the serial interval distribution w(t), equation (4) becomes c(t) = r0 s(t) ∫₀^∞ w(a) c(t-a) da, which means that c(t) is determined only by r0, w(t) and s(t). at the beginning of an epidemic, where the epidemic grows exponentially (with exponential growth rate λ), s(t) ≈ 1 and c(t) = c0 e^(λt), where c0 is the initial number of cases. substituting into the equation above gives 1/r0 = m(-λ), (5) where m(x) = ∫₀^∞ e^(xa) w(a) da is the moment generating function of the serial interval distribution w(a). equation (5) links the exponential growth rate to the basic reproduction number through the serial interval distribution only.
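equation (5) can be evaluated numerically for any serial-interval density; a minimal sketch (midpoint-rule integration, function name mine), using the exponential (sir) case where the analytic answer 1 + λ/γ is known:

```python
import math

def r0_from_growth_rate(lam, w, a_max=200.0, da=0.001):
    """Euler-Lotka relation R0 = 1 / M(-lambda), where M is the moment
    generating function of the serial-interval density w(a). The integral
    M(-lam) = ∫ exp(-lam*a) w(a) da is evaluated by the midpoint rule."""
    m = sum(math.exp(-lam * a) * w(a) * da
            for a in (k * da + da / 2 for k in range(int(a_max / da))))
    return 1.0 / m

# SIR case: w(a) = gamma * exp(-gamma * a), so R0 should equal 1 + lam/gamma
gamma, lam = 0.2, 0.1
w_exp = lambda a: gamma * math.exp(-gamma * a)
r0 = r0_from_growth_rate(lam, w_exp)  # analytic value: 1 + 0.1/0.2 = 1.5
```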
that is, if we can estimate the serial interval distribution and the exponential growth rate independently, then we can infer the basic reproduction number. note that the serial interval distribution w(t) can be estimated independently of the exponential growth rate; for example, it can be estimated empirically using contact tracing. alternatively, one can assume an epidemic model. here we discuss a few simple examples. example 5. consider an sir model. let f(a) be the cumulative distribution function of the infectious period, and let the transmission rate β be constant. the probability that an infected individual remains infectious a time units after being infected is h(a) = 1 - f(a), and thus the transmissibility is t(a) = β[1 - f(a)], and the serial interval distribution is w(a) = [1 - f(a)]/m, where m is the mean infectious period. for the special case that the infectious period is exponentially distributed with rate γ, i.e., f(a) = 1 - e^(-γa), this model becomes model (1). then the density function of the serial interval distribution is w(a) = γe^(-γa), which is identical to the density function of the infectious period distribution. the moment generating function is m(x) = γ/(γ - x). note that the exponential growth rate is λ = β - γ; then r0 = 1/m(-λ) = 1 + λ/γ. let us consider a more complex example with multiple infected states. example 6. consider an seir model with a constant transmission rate β. let f(a) and g(a) be the cumulative distribution functions of the infectious period and the latent period, respectively.
given the latent period t_l = ℓ ≤ a, the probability that an infected individual is infectious a time units after being infected is 1 - f(a - ℓ); thus h(a) = ∫₀^a [1 - f(a - ℓ)] dg(ℓ), and hence the serial interval distribution is w(a) = t(a)/∫₀^∞ t(s) ds with t(a) = βh(a). for the special case that the infectious period is exponentially distributed with rate γ (i.e., f(a) = 1 - e^(-γa)) and the latent period is exponentially distributed with rate σ (i.e., g(a) = 1 - e^(-σa)), this model becomes model (3), and w(a) = γσe^(-γa) ∫₀^a e^((γ-σ)s) ds = (γe^(-γa)) ∗ (σe^(-σa)), where ∗ denotes convolution. that is, if both distributions are exponential, the serial interval distribution is the convolution of the latent period distribution and the infectious period distribution. in this case, the basic reproduction number is r0 = 1/[m_i(-λ) m_l(-λ)], where m_i(x) and m_l(x) are the moment generating functions of the infectious period and latent period distributions, respectively. in equation (4), r(t) = r0 s(t) is the reproduction number, and thus this equation can be used to estimate the reproduction number at any time t during the epidemic given the incidence curve c(t), namely r(t) = c(t)/∫₀^∞ w(a) c(t-a) da. this is similar to, but different from, the nonparametric method developed by wallinga and teunis (wallinga & teunis, 2004). the least squares method is one of the most commonly used methods for parameter estimation in mathematical biology. this method is in fact a mathematical method: for a family of curves f(t; θ), where θ ∈ R^m is a vector of parameters of the family, the method finds the curve f(t; θ̂) in the family that minimizes the distance between the curve and a set of points {(t_i, x_i)}, i = 0, …, n-1. using the euclidean norm in R^n, the mathematical formulation of the least squares method is θ̂ = argmin_θ Σ_(i=0)^(n-1) [x_i - f(t_i; θ)]², (6) where argmin gives the parameter θ that minimizes the objective function. for our purpose, the observations {(t_i, x_i)}, i = 0, …, n-1, form the epidemic curve, i.e., x_0 is the number of initially observed cases, and x_i is the number of new cases during the time interval (t_(i-1), t_i].
we aim to find an exponential function f(t; c0, λ) = c0 e^(λt) that minimizes its distance to the epidemic curve, i.e., the parameters are θ = (c0, λ). there are two commonly used methods to estimate the exponential growth rate λ: 1. nonlinear least squares, fitting to f(t; c0, λ) = c0 e^(λt) directly; 2. linear least squares, fitting {(t_i, ln x_i)} to ln f(t; c0, λ) = ln c0 + λt. the nonlinear least squares method does not have an analytic solution; numerical optimization is needed to solve the minimization problem (6). the linear least squares method has an analytic solution: let ℓ0 = ln c0; then the least squares problem becomes minimizing Σ_i (ln x_i - ℓ0 - λt_i)². the objective function is a quadratic function of ℓ0 and λ; thus, writing ȳ for the average of any sequence {y_i}, the minimum is achieved at the best-fit exponential growth rate λ̂ = (avg(t_i ln x_i) - avg(t_i) avg(ln x_i)) / (avg(t_i²) - avg(t_i)²). do these two methods yield the same answer? to compare, we simulate an epidemic curve of the stochastic seir model in example 2, using the gillespie method (gillespie, 1976). the simulated daily cases (number of individuals showing symptoms on a day) are then aggregated into weekly cases. then, we use both methods to fit an exponential curve to the simulated epidemic curve. the simulated epidemic curve and the fitting results are shown in fig. 3. this exercise illustrates a challenge of fitting an exponential model to an epidemic curve: how to determine the time period over which to fit the exponential model. the exponential growth rate of an seir model decreases with time as the susceptible population decreases. in fig. 3, the epidemic curve peaks in week 13. we choose a sequence of nested fitting windows starting in the first week and ending in week w, for w = 3, 4, …, 13. the seir model has an asymptotic exponential growth, so the fitted exponential growth rate is not monotonic near the beginning of the epidemic.
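the analytic linear least-squares solution can be written directly; a minimal sketch with noise-free data (function name mine):

```python
import math

def fit_growth_rate_loglinear(t, x):
    """Ordinary least squares of ln(x_i) on t_i; the slope is the
    estimated exponential growth rate lambda, the intercept is ln(c0)."""
    n = len(t)
    y = [math.log(v) for v in x]
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    lam = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y)) \
        / sum((ti - t_bar) ** 2 for ti in t)
    c0 = math.exp(y_bar - lam * t_bar)
    return c0, lam

# noise-free exponential data recovers the true parameters exactly
t = list(range(8))
x = [5 * math.exp(0.55 * ti) for ti in t]
c0, lam = fit_growth_rate_loglinear(t, x)
```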
for larger fitting windows, both methods give an exponential growth rate that decreases with the length of the fitting window. we need more data points to reduce the influence of stochasticity; however, using more data points also risks obtaining an estimate that deviates too much from the true exponential growth rate. there is no reliable method to choose a proper fitting window. fig. 3 also shows that the linear and nonlinear least squares methods may not yield the same estimate. this is because of a major limitation of both least squares methods: they implicitly assume that the deviations |x_i - f(t_i; θ)| carry identical weights. with the nonlinear method, later data points (at larger times) deviate more from the exponential curve than the earlier data points, because the exponential growth slows down with time; thus, the method is more biased towards the later data points. with the linear method, the deviations in ln x_i are more even than those in x_i, and thus the linear method is less biased towards the later data points than the nonlinear method. the least squares method, as mentioned above, is a mathematical problem. it does not explicitly assume any error distribution, and thus cannot give us statistical information about the inference. for example, if we use two slightly different fitting windows and get two slightly different estimates, is the difference between the two estimates statistically significant? such a question cannot easily be answered by the least squares method. interestingly, the least squares methods make many implicit assumptions about the deviations. we have mentioned the implicit equal-weight assumption above. they also implicitly assume that the order of the observations does not matter, and that positive and negative deviations are equivalent. thus, they implicitly assume that the deviations are independently, identically, and symmetrically distributed.
in statistics, the least squares method is commonly used in linear and nonlinear regression with the additional assumption that the errors are independently and identically normally distributed. however, these assumptions on the errors may not be appropriate. for example, the new cases at time t+1 may be infected by those who were infected at time t; thus, the numbers of new cases at different times may not be independent. also, the number of cases is a counting variable, and thus its mean and variance may be closely related, meaning that the errors may not be identically normally distributed. in the next section, we address some of these problems using the maximum likelihood method. the maximum likelihood method is a commonly used statistical method for parameter inference; see, e.g., (bolker, 2008). to construct the likelihood function we need to make assumptions on the error distribution. there are two types of error: the process error and the observation error. the observation error is the error in the observation process. for example, most people with influenza do not go to see a doctor, and thus there is no record of these cases, resulting in under-reporting of the number of influenza cases. also, many influenza-related deaths are caused by complications such as pneumonia, and influenza may not be recorded as the cause. typos, miscommunication, etc., can all result in observation errors. the process error originates from the stochasticity of the system, which is independent of observation: the disease dynamics is intrinsically stochastic. the time at which an infectious individual recovers, and the time at which a susceptible individual is infected, are random variables that affect the number of new infections at any time, even if we eliminate all observation errors. (fig. 3. the simulated seir epidemic curve (upper) and the fitted exponential growth rate as a function of the end of the fitting window (lower). the epidemic curve is simulated stochastically from the seir model in example 2 using the gillespie method (gillespie, 1976) with the parameters β = 0.3, σ = 1, γ = 0.2; the rates have a time unit of one day. the daily cases are then aggregated by week. the data points are taken at times t_i = i, i = 0, 1, 2, …, 13 weeks. the theoretical exponential growth rate is λ = 0.547 per week.) these two types of errors have very different natures, and thus need very different assumptions. for example, it is reasonable to assume that observation errors are independent of each other, but process errors at a later time commonly depend on the process errors at earlier times. if observation errors are large and process errors are negligible, then we assume that the random variable X_i corresponding to the observation x_i is independently distributed with a probability mass function p_i(k; θ), where k ranges over the values that X_i can take. then, the likelihood function is L(θ) = Π_(i=0)^(n-1) p_i(x_i; θ). the maximization of this likelihood function rarely has an analytic solution, and commonly needs to be solved numerically. note that each factor (probability) can be very small, and thus the product may be very difficult to maximize numerically because of rounding errors (from the binary representation of real numbers in computers). it is common practice to maximize the log-likelihood function ℓ(θ) = Σ_(i=0)^(n-1) ln p_i(x_i; θ) instead. for example, we may assume that the number of cases x(t_i) at time t_i is independently poisson distributed with mean μ_i = c0 e^(λt_i). then, the log-likelihood function is ℓ(c0, λ) = Σ_(i=0)^(n-1) [x_i ln μ_i - μ_i - ln x_i!]. note that the observed cases x_i are constants, and thus the last term can be ignored for maximization. this maximization problem can only be solved numerically. we choose the poisson distribution because its simple form greatly simplifies the log-likelihood function. in addition, it does not introduce more parameters, which is valuable for avoiding over-fitting when the number of data points available is small.
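the poisson observation-error fit just described can be sketched as follows; `poisson_profile_loglik` and the golden-section search are my own illustrative implementation, and the closed-form c0 for fixed λ is an extra step not spelled out in the notes:

```python
import math

def poisson_profile_loglik(lam, t, x):
    """Profile log-likelihood of x_i ~ Poisson(c0 * exp(lam * t_i)):
    for fixed lam, d/dc0 = 0 gives c0 = sum(x) / sum(exp(lam*t))."""
    e = [math.exp(lam * ti) for ti in t]
    c0 = sum(x) / sum(e)
    return sum(xi * math.log(c0 * ei) - c0 * ei for xi, ei in zip(x, e))

def fit_growth_rate_poisson(t, x, lo=0.0, hi=2.0, tol=1e-6):
    """Golden-section search for the lambda maximizing the profile likelihood."""
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c, d = b - g * (b - a), a + g * (b - a)
        if poisson_profile_loglik(c, t, x) < poisson_profile_loglik(d, t, x):
            a = c
        else:
            b = d
    return (a + b) / 2

t = list(range(8))
x = [round(5 * math.exp(0.55 * ti)) for ti in t]  # rounded exponential counts
lam_hat = fit_growth_rate_poisson(t, x)
```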
if the process error is not completely negligible, then choosing an over-dispersed distribution, such as the negative binomial distribution, may be desirable. a negative binomial distribution has two parameters, the success probability q ∈ (0, 1) and the shape parameter r > 0. for simplicity, we assume that the shape parameter r is the same at each time t_i and will be estimated together with the model parameters θ, while q depends on t_i. the probability mass function is p(k; r, q_i) = Γ(k + r)/[Γ(r) k!] q_i^r (1 - q_i)^k, and the log-likelihood function is ℓ = Σ_i [ln Γ(x_i + r) - ln Γ(r) + r ln q_i + x_i ln(1 - q_i) - ln x_i!]. again, the last term can be ignored for the optimization problem; in addition, there is a constraint r > 0. if process errors are large and observation errors are negligible, then we cannot assume that the observed values x_(i+1) and x_i are independent of each other. instead, for all i = 0, 1, …, n-2, we compute the probability mass function of X_(i+1) given {X_j = x_j}, j = 0, …, i, namely q_(i+1)(k; θ | {x_j}), and the likelihood function is the product of these conditional probabilities. for simplicity, assume that X_(i+1) is poisson distributed with mean μ_(i+1) = x_i e^(λ(t_(i+1) - t_i)). note that, since we assumed no observation error, the initial condition c0 = x_0 is exact, and thus there is a single parameter λ for the model. the log-likelihood function is then ℓ(λ) = Σ_(i=1)^(n-1) [-x_(i-1) e^(λ(t_i - t_(i-1))) + x_i λ(t_i - t_(i-1)) + x_i ln x_(i-1) - ln x_i!]. again, the last two terms can be ignored in the maximization because they are constants; thus, λ̂ = argmax_λ Σ_(i=1)^(n-1) [-x_(i-1) e^(λ(t_i - t_(i-1))) + x_i λ(t_i - t_(i-1))]. it is much harder to formulate the likelihood function if process errors and observation errors must both be considered. we can simplify the problem by ignoring the process error and using an over-dispersed observation-error distribution as compensation; note that this simplification mainly affects the confidence intervals. the maximum likelihood method gives a point estimate, i.e., the set of parameter values that makes it most likely to observe the data.
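for the process-error model above, with equally spaced observations the maximization actually has a closed form; this small sketch (my own derivation from the log-likelihood just given, function name mine) illustrates it:

```python
import math

def growth_rate_process_error(t, x):
    """MLE of lambda for X_i ~ Poisson(x_{i-1} * exp(lambda*dt)) with equal
    spacing dt: setting the derivative of the log-likelihood to zero gives
    exp(lambda*dt) = sum(x_1..x_{n-1}) / sum(x_0..x_{n-2})."""
    dt = t[1] - t[0]
    return math.log(sum(x[1:]) / sum(x[:-1])) / dt

t = list(range(8))
x = [5 * math.exp(0.55 * ti) for ti in t]  # noise-free exponential data
lam_hat = growth_rate_process_error(t, x)
```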
however, it is not clear how close the point estimates are to the real values. to answer this question we use an interval estimate, commonly known as a confidence interval. a confidence interval with a confidence level α is an interval that has a probability α of containing the true parameter value. a commonly used confidence level is 95%, which originates from the normal distribution: if a random variable X is normally distributed with mean μ and standard deviation σ, then the probability that X ∈ [μ − 2σ, μ + 2σ] is approximately 95%. the confidence interval can be estimated using the likelihood ratio test [(bolker, 2008), p. 192]. let θ̂ be the point estimate of the parameters. a value λ_0 being in the 95% confidence interval is equivalent to accepting, at the 95% level, that λ_0 is a possible growth rate. to determine this, we fit a nested model by fixing the growth rate λ = λ_0; suppose its point estimate is θ̂_0. we then compute the likelihood ratio Λ = L(θ̂_0)/L(θ̂). wilks' theorem (wilks, 1938) guarantees that, as the sample size becomes large, the statistic −2 ln Λ = 2[ℓ(θ̂) − ℓ(θ̂_0)] is χ²-distributed with one degree of freedom. we can thus compare −2 ln Λ with the 95% quantile of the χ² distribution (approximately 3.84) and determine whether λ_0 should be in the confidence interval or not. we can then perform a linear search on both sides of the point estimate to determine the boundaries of the confidence interval. we still have not addressed the problem of choosing a fitting window for an exponential model. recall that the challenge arises because the exponential growth rate of an epidemic decreases with time. instead of finding heuristic conditions for choosing the fitting window, we circumvent this problem by incorporating the decrease of the exponential growth rate into our model. we have two choices: using either a mechanistic model, such as an sir or seir model, or a phenomenological model.
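a sketch of the likelihood-ratio confidence interval described above, using the poisson observation-error model with c_0 profiled out analytically; the grid scan plays the role of the linear search around the point estimate, and 3.841 is the 95% quantile of the χ² distribution with one degree of freedom:

```python
import numpy as np

CHI2_95_DF1 = 3.841  # 95% quantile of the chi-squared distribution, 1 d.o.f.

def profile_loglik(lam, t, x):
    # Poisson observation-error model with mean c0*exp(lam*t); c0 profiled out.
    c0 = x.sum() / np.exp(lam * t).sum()
    m = c0 * np.exp(lam * t)
    return np.sum(x * np.log(m) - m)

def wilks_ci(t, x, grid):
    # lam0 is kept in the 95% CI when 2*[l(lam_hat) - l(lam0)] <= 3.841.
    ll = np.array([profile_loglik(l, t, x) for l in grid])
    keep = 2.0 * (ll.max() - ll) <= CHI2_95_DF1
    return grid[keep].min(), grid[keep].max()

t = np.arange(14.0)
rng = np.random.default_rng(2)
x = rng.poisson(3.0 * np.exp(0.547 * t))
grid = np.linspace(0.1, 1.0, 2001)
lo, hi = wilks_ci(t, x, grid)
```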
naturally, if we know that a mechanistic model is a good description of the disease dynamics, fitting such a model to the epidemic curve is a good option (see, e.g., (chowell, ammon, hengartner, & hyman, 2006; pourabbas, d'onofrio, & rafanelli, 2001)). we use an sir model as an example. for simplicity, we assume that the process error is negligible, and that the incidence rate is poisson distributed with a mean c(t) given by the sir model (c(t) = βSI/N, where N is the population size). to construct the log-likelihood function, we need to calculate c(t), i.e., numerically solve the sir model. to do so, we need the transmission rate β, the recovery rate γ, and the initial fraction of infectious individuals I(0) = i_0 (with the assumption that R(0) = 0 and S(0) = 1 − i_0, so that i_0 determines the initial conditions), in addition to the population size N. thus, the parameters of the model are θ = (β, γ, i_0, N), and the log-likelihood function is (ignoring the constant terms) ℓ(θ) = ∑_i [x_i ln c(t_i) − c(t_i)], where the number of new cases c(t_i) in the time interval [t_i, t_{i+1}] is c(t_i) = N[S(t_i) − S(t_{i+1})], and S(t_i) is solved numerically from the sir model. thus, ℓ implicitly depends on β, γ, and i_0 through S(t). one drawback of using such a mechanistic model is its high computational cost, since each evaluation of the log-likelihood function requires solving the model numerically, and numerical optimization algorithms can require very many function evaluations, especially if the algorithm depends on numerical differentiation. another drawback is that these mechanistic models can be overly simplified, and may not be a good approximation to the real disease dynamics. for example, for seasonal influenza, due to the fast evolution of the influenza virus, individuals have different histories of infection, and thus have different susceptibilities to a new strain; yet simple sir and seir models assume a population with a homogeneous susceptibility.
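a minimal sketch of this procedure (not from the original text), integrating the sir model with a hand-rolled rk4 scheme and evaluating the poisson log-likelihood of the interval incidences; it assumes the poisson mean of the weekly new cases is N times the drop in the susceptible fraction over each interval, and the parameter values and case counts are illustrative, not fitted:

```python
import numpy as np

def sir_susceptible(beta, gamma, i0, t_grid, steps_per_unit=50):
    # RK4 integration of s' = -beta*s*i, i' = beta*s*i - gamma*i (fractions),
    # returning the susceptible fraction s at each point of t_grid.
    def f(y):
        s, i = y
        return np.array([-beta * s * i, beta * s * i - gamma * i])
    y = np.array([1.0 - i0, i0])
    h = (t_grid[1] - t_grid[0]) / steps_per_unit
    out = [y[0]]
    for _ in range(len(t_grid) - 1):
        for _ in range(steps_per_unit):
            k1 = f(y); k2 = f(y + h/2*k1); k3 = f(y + h/2*k2); k4 = f(y + h*k3)
            y = y + h/6*(k1 + 2*k2 + 2*k3 + k4)
        out.append(y[0])
    return np.array(out)

def sir_loglik(theta, t_grid, cases):
    # theta = (beta, gamma, i0, N); the Poisson mean for [t_i, t_{i+1}]
    # is N*(S(t_i) - S(t_{i+1})), the new cases predicted by the model.
    beta, gamma, i0, N = theta
    s = sir_susceptible(beta, gamma, i0, t_grid)
    mean = N * (s[:-1] - s[1:])
    return np.sum(cases * np.log(mean) - mean)   # constants dropped

t_grid = np.arange(15.0)                          # weekly grid, 14 intervals
cases = np.array([5, 9, 16, 28, 49, 84, 140, 220, 310, 380, 390, 340, 260, 180])
ll = sir_loglik((1.2, 0.5, 1e-4, 1e4), t_grid, cases)
```

each call to `sir_loglik` solves the ode system, which is exactly the computational cost discussed above.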
thus using a simple sir model to fit an influenza epidemic may be an over-simplification. however, realistic mechanistic models can be overly complicated, and involve too many parameters that are at best difficult to estimate. for example, a multi-group sir model depends on a contact matrix consisting of transmission rates between groups, which contains a large number of parameters if the model uses many groups. if all we need to estimate is the exponential growth rate, we only need a model that describes exponential growth that gradually slows down. most cumulative epidemic curves grow exponentially initially, and then saturate at the final epidemic size. a simple phenomenological model can be used to describe the shape of the cumulative epidemic curve, even though the model itself may not have a realistic biological meaning. if simple mechanistic models cannot faithfully describe the epidemic process, using a simple phenomenological model with an analytical formula may be a better choice, at least numerically, because repeatedly solving a system of differential equations numerically, and differentiating the log-likelihood function numerically, can both be avoided with the analytical formula. here we discuss some examples of such models. the logistic model is the simplest model that shows an initial exponential growth followed by a gradual slowing down and saturation. the cumulative incidence C(t) (the total number of cases by time t) can be approximated by dC/dt = rC(t)(1 − C(t)/K), where r is the exponential growth rate and K = lim_{t→∞} C(t). let C_0 = C(0); the solution is C(t) = K / [1 + (K/C_0 − 1)e^{−rt}]. the new cases c(t_i) in a time period [t_i, t_{i+1}] are thus c(t_i) = C(t_{i+1}) − C(t_i). (8) the model parameters are θ = (r, K, C_0). note that this is fewer than the number of parameters of the simplest mechanistic model (i.e., the sir model). the logistic model has a fixed rate of slowing down of the exponential growth rate.
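the analytic solution makes the logistic fit cheap; a small sketch with illustrative parameter values:

```python
import numpy as np

def logistic_cumulative(t, r, K, c0):
    # Closed-form solution of dC/dt = r*C*(1 - C/K) with C(0) = c0.
    return K / (1.0 + (K / c0 - 1.0) * np.exp(-r * t))

t = np.arange(0.0, 30.0)
C = logistic_cumulative(t, r=0.547, K=5000.0, c0=3.0)
new_cases = np.diff(C)        # c(t_i) = C(t_{i+1}) - C(t_i), as in (8)
```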
to be more flexible, we can use the richards model (richards, 1959) for the cumulative incidence curve. the richards model, also called the power-law logistic model, can be written as dC/dt = rC(t)[1 − (C(t)/K)^a], where a is the parameter that controls the steepness of the curve. note that the logistic model is the special case a = 1. its solution is C(t) = K[1 + ((K/C_0)^a − 1)e^{−rat}]^{−1/a}, and the new cases c(t_i) in a time period [t_i, t_{i+1}] are also given by (8). the parameters are θ = (r, K, C_0, a). to compare the performance of the sir model and the phenomenological models, we fit these models to the stochastically simulated seir epidemic curve of weekly cases that we introduced in section 3 (fig. 3). we assume that the process error is negligible, and that the observations are poisson distributed about the mean given by the corresponding model. we use the maximum likelihood method. the results are shown in fig. 4. the predictions of the exponential model, as discussed before, quickly decrease as more data points are used. both the logistic model and the richards model give robust estimates with fitting windows ending up to the peak of the epidemic. the sir model gives a robust estimate for all fitting windows, up to the whole epidemic curve. thus, the sir model is a good model for fitting the exponential growth rate, even if it may not be the correct mechanistic model (e.g., it ignores the latent period in this example). it requires more computational power, because the epidemic curve lacks an analytic formula and must be solved numerically from a system of ordinary differential equations. the logistic model and the richards model can be used for all data points up to the peak of the epidemic. fig. 4 also shows that the sir model and the logistic model give the narrowest confidence intervals. however, narrower confidence intervals may not be desirable if there is a large chance that they do not contain the true value.
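the richards curve can be sketched the same way; the block below uses the closed-form solution given above and recovers the logistic model when a = 1 (parameter values are illustrative):

```python
import numpy as np

def richards_cumulative(t, r, K, c0, a):
    # Closed-form solution of dC/dt = r*C*(1 - (C/K)**a) with C(0) = c0;
    # a = 1 recovers the logistic model.
    return K * (1.0 + ((K / c0) ** a - 1.0) * np.exp(-r * a * t)) ** (-1.0 / a)

t = np.arange(0.0, 30.0)
C_rich = richards_cumulative(t, r=0.547, K=5000.0, c0=3.0, a=0.5)
C_logi = richards_cumulative(t, r=0.547, K=5000.0, c0=3.0, a=1.0)
```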
due to errors, especially process errors, each realization of the underlying stochastic epidemic process yields a different epidemic curve. these epidemic curves may exhibit different exponential growth rates even if the underlying parameter values are the same. an observed epidemic curve is just a single realization of the epidemic process. do the estimated confidence intervals contain the theoretical exponential growth rate of the epidemic process? this question is answered by the "coverage probability", which is the probability that the confidence interval contains the true value. if the confidence interval properly considers all sources of stochasticity, then the coverage probability should be equal to its confidence level. to illustrate this, we numerically compute the coverage of the confidence intervals by simulating the seir model 400 times, computing the confidence interval of the exponential growth rate for each realization, and computing the fraction of the confidence intervals containing the theoretical value λ = 0.547. the results are summarized below: logistic model, 43%; richards model, 65%. that is, even though the logistic model gives a narrow confidence interval, its coverage probability is low. the coverage probability of the confidence interval given by the richards model is also significantly lower than the confidence level. this is caused by treating process errors as observation errors. if there is under-reporting, that is, only a fraction p of the cases can be observed, then the observation error becomes larger as p decreases (i.e., more under-reporting), and the coverage will become larger as a result. for example, the case fatality ratio of the 1918 pandemic influenza is about 2% (frost, 1920). thus, the mortality curve can be treated as an epidemic curve with a large under-reporting ratio, in which the observation error dominates; in this case ignoring the process error is appropriate.
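the coverage computation described above can be sketched for the simple observation-error-only case, where the fitted model matches the data-generating process and the coverage should therefore be close to the 95% confidence level (unlike the 43% and 65% figures, which arise from ignoring process error); the replicate count is reduced from 400 to 200 to keep the sketch fast:

```python
import numpy as np

CHI2_95_DF1 = 3.841

def profile_loglik(lam, t, x):
    c0 = x.sum() / np.exp(lam * t).sum()
    m = c0 * np.exp(lam * t)
    return np.sum(x * np.log(m) - m)

def ci_contains(t, x, lam_true, grid):
    # lam_true lies inside the Wilks 95% CI iff its likelihood-ratio
    # statistic is below the chi-squared threshold.
    ll_max = max(profile_loglik(l, t, x) for l in grid)
    ll_true = profile_loglik(lam_true, t, x)
    return 2.0 * (ll_max - ll_true) <= CHI2_95_DF1

t = np.arange(14.0)
lam_true = 0.547
grid = np.linspace(0.1, 1.0, 801)
rng = np.random.default_rng(3)
hits = sum(ci_contains(t, rng.poisson(3.0 * np.exp(lam_true * t)), lam_true, grid)
           for _ in range(200))
coverage = hits / 200.0
```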
references:
- bolker, b. (2008). ecological models and data in r.
- chowell, g., ammon, c. e., hengartner, n. w., & hyman, j. m. (2006). transmission dynamics of the great influenza pandemic of 1918 in geneva, switzerland: assessing the effects of hypothetical interventions.
- frost, w. h. (1920). statistics of influenza morbidity, with special reference to certain factors in case incidence and case-fatality.
- gillespie, d. t. (1976). a general method for numerically simulating the stochastic time evolution of coupled chemical reactions.
- pourabbas, e., d'onofrio, a., & rafanelli, m. (2001). a method to estimate the incidence of communicable diseases under seasonal fluctuations with application to cholera.
- richards, f. j. (1959). a flexible growth function for empirical use.
- wallinga, j., & lipsitch, m. (2007). how generation intervals shape the relationship between growth rates and reproductive numbers.
- wallinga, j., & teunis, p. (2004). different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures.

(figure caption) the comparison of the results of fitting the sir, exponential, logistic, and richards models to a simulated weekly incidence curve, as a function of the end point of the fitting window (upper); the epidemic curve (lower) is shown as a reference.

this research is partially supported by a natural sciences and engineering research council canada discovery grant, and national natural science foundation of china (no. 11771075).

key: cord-031143-a1qyadm6 authors: pinto neto, osmar; reis, josé clark; brizzi, ana carolina brisola; zambrano, gustavo josé; de souza, joabe marcos; pedroso, wellington; de mello pedreiro, rodrigo cunha; de matos brizzi, bruno; abinader, ellysson oliveira; zângaro, renato amaro title: compartmentalized mathematical model to predict future number of active cases and deaths of covid-19 date: 2020-08-30 journal: res doi: 10.1007/s42600-020-00084-6 sha: doc_id: 31143 cord_uid:
introduction: in december 2019, china reported a series of atypical pneumonia cases caused by a new coronavirus, called covid-19. in response to the rapid global dissemination of the virus, on the 11th of march, the world health organization (who) declared the outbreak a pandemic.
considering this situation, this paper intends to analyze and improve the current seir models to better represent the behavior of covid-19 and accurately predict the outcome of the pandemic in each social, economic, and political scenario. methodology: we present a generalized susceptible-exposed-infected-recovered (seir) compartmental model and test it using a global optimization algorithm with data collected from the who. results: the main results were: (a) our model was able to accurately fit either the deaths or the active cases data of all tested countries, using optimized coefficient values in agreement with recent reports; (b) when trying to fit both sets of data at the same time, the fit was good for most countries, but not all; (c) using our model, large ranges for each input, and optimization, we predict death values for 15, 30, 45, and 60 days ahead with errors on the order of 5, 10, 20, and 80%, respectively; (d) sudden changes in the number of active cases cannot be predicted by the model unless data from outside sources are used. conclusion: the results suggest that the presented model may be used to predict the total number of deaths 15 days ahead with errors on the order of 5%. these errors may be minimized if social distancing data are inputted into the model. in december 2019, in china, a series of atypical pneumonia cases emerged, caused by a new coronavirus, now officially called covid-19 by the world health organization (who). it spread rapidly throughout the country, with its epicenter in the city of wuhan, where 82,249 people were infected and 3341 people died. in response to the rapid global dissemination of the virus, on the 11th of march the who declared the outbreak a pandemic. since then, the global impact of covid-19 has become a great threat to public health. considering this emergency, different areas of science need to focus their attention on the challenges imposed by this new coronavirus.
in such scenarios, new, improved, and specific mathematical modeling is imperative. there are many uncertainties regarding the gravity of the infection caused by covid-19. nevertheless, based on epidemiological investigations, the incubation period is 1 to 14 days, most commonly between 3 and 7 days, and the virus is contagious even during its latency period (guo et al. 2020). the majority of infected adults and children develop mild symptoms like those of a common cold, while some patients evolve rapidly to acute respiratory discomfort, followed by respiratory failure, multiple organ failure, and death. the probability of death in the usa, according to the centers for disease control and prevention (cdc), ranges from 0.5% for ages 45-54 to 1.4% for ages 55-64, growing at a roughly constant rate with age. pre-existing comorbidities, which affect vulnerability to the infection, also increase the probability of death. covid-19 seems to have a relatively higher rate of transmissibility compared with other coronavirus infections, and to better understand it, it is very important to consider multiple factors (chen et al. 2020; wang et al. 2020). family environment, age, and wealth distribution are essential factors related to the transmission and mortality rate of covid-19 (walker et al. 2020). another important factor that must be considered, especially for the mortality rate, is the number of hospital beds and the capacity of intensive care units (icus). the relationship between the age ranges that require attention and the mortality rate by infection was scrutinized in china and, assuming that 30% of the hospitalized will demand intensive care (icu) and that among those 50% will die, the demand for hospital beds has been calculated assuming an average hospital stay of 16 days (ferguson et al. 2020).
the experience of covid-19 in many countries, concerning medical assistance, has indicated that the demand for hospital beds and the need for mechanical ventilation have exceeded their availability even in countries with higher per capita income. therefore, the consequences in countries where these services are scarce are expected to be larger (walker et al. 2020). previous experiences in some countries highlight the need to anticipate the impacts of the pandemic outbreak and to develop research with epidemiological models. these mathematical models are necessary for the comprehension of the present outbreak's behavior, so that countries might develop strategies to minimize the impacts on the healthcare system and preserve life (wu et al. 2020; peng et al. 2020). as an example, public administrators may find comprehensive grounds to define policies such as enforcing social distancing measures, assessing the availability versus need of laboratory tests, and planning for hospital beds and health system resources. in the absence of a vaccine, mathematical modeling may assess the effectiveness of non-pharmaceutical interventions and their role in decreasing population contact and viral transmission to control the pandemic outbreak. china has managed to control the outbreak through the isolation of its cases and social distancing of the population (ferguson et al. 2020). there are several non-pharmaceutical strategies to control an outbreak, such as containment, mitigation, and suppression. when containment measures fail to control the outbreak, mitigation and suppression strategies may be adopted to postpone and mitigate its effects on society and the healthcare system. mitigation concentrates on retarding, but not necessarily impeding, the spread of the outbreak, reducing the peak demand for medical assistance and protecting the higher-risk groups.
suppression aims to reverse the outbreak's growth, diminishing the number of cases and maintaining this situation for an indefinite time through more extreme measures, such as quarantining, police enforcement, mass testing, compulsory notification, and financial support to the population in isolation, among other actions (ferguson et al. 2020; walker et al. 2020). concerning mathematical modeling, which supplies detailed mechanisms of outbreak dynamics, the susceptible-exposed-infected-recovered (seir) epidemiological model is widely adopted to characterize the pandemic caused by covid-19. for instance, this method was used for decision making in hubei, wuhan, and beijing (peng et al. 2020). this paper intends to analyze and improve upon the traditional seir model by adding important new compartments, as well as considering the effects of non-pharmaceutical interventions and the possibility of death due to a lack of available icu beds. moreover, such a model can also be used to prototype and analyze the cause/effect relation of a multitude of actions and public health strategies, so the most effective ones can be chosen for each country, city, or province. since every affected region is different, it is of utmost importance to help organizations determine not only the number of active cases but also the number of hospital beds and icus that will be needed at a certain point in time, in order to maximize the usage of public resources. we present a generalized seir compartmental model using novel and recently suggested ideas and concepts (apmonitor optimization suite 2020; peng et al. 2020; university of basel 2020). an application of our model using real mobility data to investigate different future projections for the usa has been recently reported (kennedy et al. 2020). it is composed of eight compartments: susceptible, unsusceptible, exposed, infected, hospitalized, critical, dead, and recovered (sueihcdr; fig. 1).
the model assumes at first that the whole population is susceptible (eq. 1) to the disease. as time progresses, a susceptible person can either become exposed (eq. 5) to the virus or unsusceptible (eq. 2), where i(t) is the number of infectious people at time t, n_pop is the population of the country, β is the infection rate, α is a protection rate, and sd is a social distancing factor. as in peng et al. (2020), we introduced a protection rate factor α into our susceptible equation (eq. 1). this protection rate was introduced to account for possible decreases in the number of people susceptible to the virus caused by factors other than social distancing, such as the usage of face masks, better hygiene, more effective contact tracing, and possible vaccines and/or drugs that may prevent infection. different from the aforementioned study of peng et al. (2020), however, we varied α across time (eq. 3). this time variation was introduced to reliably model people's behavior: people are commonly not too concerned about the disease in the earlier stages of the epidemic but, as the numbers of infected and deaths increase, become more cautious about the virus. here α_0 is the reference (maximum) value and t_f is the final time of the prediction. furthermore, we also introduced a social distancing factor sd, which also varies with time (eq. 4). social distancing was modeled as a logistic curve so that the model could account for the date (t_sd) when a possible quarantine measure starts. as mentioned before, real mobility data can be used in our model when available; using real mobility data has been shown to be important when long-term future projections are intended (kennedy et al. 2020). here sd_0 is the sd reference (maximum) value and t_sd is the time over which sd increases until reaching sd_0. exposed people become infectious after an incubation time of 1/γ (eq. 5). infected people stay infected for a period of 1/δ (eq. 6) days and can have three different outcomes. considering m as a specific parameter accounting for the fraction of infectious cases that are asymptomatic, a percentage of the infected (1 − m) are hospitalized, another percentage (l) may die without hospitalization, and the rest (m − l) recover. l was introduced as a function of time (eq. 7) so that the time when hospital beds became unavailable (t_m) could be modeled, as well as the duration that hospitals were full (dur). (fig. 1 caption: sueihcdr model infographic description; it is composed of eight compartments: susceptible, unsusceptible, exposed, infected, hospitalized, critical, dead, and recovered. β is the infection rate, sd is a social distancing factor, α is a protection rate, m is the fraction of infectious cases that are asymptomatic, 1 − m is the percentage of the infected that are hospitalized, l is the percentage of infected people that may die without hospitalization, 1 − c is the percentage of hospitalized people that recover, c is the fraction of hospitalized cases that become critical, needing to go to an intensive care unit (icu), and f is the fraction of people in a critical state that die.) here l_0 is the inclination (angular coefficient) of the ramp up to the maximum reference value, and t_l is the time when people started dying due to the lack of available icus. hospitalized people (eq. 8) stay hospitalized for 1/ζ days and can either recover (1 − c) or become critical (c is a specific parameter accounting for the fraction of hospitalized cases that become critical), needing to go to an intensive care unit (icu), where ε is the inverse of the time people stay in the icu. a person stays on average 1/ε in the icu (eq. 9) and can either go back to the hospital (1 − f) or die (f, a specific parameter accounting for the fraction of people in a critical state that die). therefore, recovered people (eq.
10) can either come straight from infection, when the case is mild (m − l), or from the hospital, when the case is not critical (1 − c). death (eq. 11) arises either from lack of available treatment (l) or from critical cases in the icu (f). at last, the effective reproduction number r_t (eq. 12) of our model can be estimated from these quantities.

solving and testing the model. we used the fourth-order runge-kutta numerical method to solve our system of ordinary differential equations in matlab (mathworks inc., r2017a). to test our model we gathered active cases, recovered cases, accumulated deaths, and tests-per-million-people data from the who for ten different countries in different stages of the epidemic: germany, brazil, spain, italy, south korea, portugal, switzerland, thailand, and the usa. lack of testing and under-notification of active cases have been largely reported for covid-19 (hasell et al. 2020; worldometer 2020; ufpel 2020); in consequence, active cases data were corrected by a factor. the correction factor was found via optimization, as described in the next paragraph, using a range of possibilities estimated based on previous reports. the lower bound was determined by taking the death rate of the country as described in verity et al. (2020), corrected by age (young: 0.32%; older adults (60+): 6.4%; senior older adults (80+): 13.4%). the upper bound was set by considering the same age-proportional differences in death rate, as previously mentioned, but adjusting the death rate of each country by the death rate in iceland (the country with the greatest percentage of tests per inhabitant; gudbjartsson et al. 2020). we used a custom-built matlab global optimization algorithm using a monte carlo iteration algorithm and multiple local minimum searches. the algorithm was tested for the best solution considering 21 different inputs to the model, within ranges obtained from the who and several publications (liu et al. 2020; ranjan 2020; wu et al. 2020; table 1), and 1 correction factor (f) for the active cases (table 2). the algorithm was used to minimize a goal function (j) combining the active cases and death time series (eq. 13). (fig. 2 caption: optimization algorithm results after 10,000 runs; the pareto front was determined considering two simultaneous objectives, active cases rmse and deaths rmse; the best 100 solutions were used to initialize the matlab multi-objective genetic algorithm.) data under 500 active cases were discarded. initial values for each compartment had ranges proportional to the following initial values (table 1): infected initial values (i0) were determined as the first corrected active cases value greater than 50; exposed initial values (e0) were 0.5 × i0; hospitalized initial values (h0) were 0.2 × i0; critical cases initial values (c0) were 0.7 × i0; death initial values (d0) were obtained from the accumulated deaths real data; similarly, recovered initial values (rec0) were obtained from the recovered real data. optimization algorithm results were considered after 10,000 runs; from them, the 100 best solutions were used as the initial population for a multi-objective genetic algorithm (matlab function: gamultiobj) to determine a pareto front of solutions considering two simultaneous objectives, rmse (active cases) and rmse (deaths) (fig. 2). lower and upper bounds for the genetic algorithm were set as 40% variations around the best solution found in the previous optimization algorithm runs, and 100 generations were created. all fitting processes were done with data from day 1 of the outbreak for each country up to april 18, may 3, may 18, june 3, and june 18, 2020, to test the accuracy of the future predictions that can be made based on the model and optimization results. furthermore, 2% perturbations of the model coefficients were used to determine, via monte carlo, a 95% confidence interval for the results. results are presented as mean (standard deviation).
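a minimal sketch of the sueihcdr system described above (eqs. 1-11), with α, sd, and l held constant for simplicity (the paper makes them time dependent) and with illustrative, not fitted, parameter values; note that the eight derivatives sum to zero, so the total population is conserved:

```python
import numpy as np

def sueihcdr_rhs(y, p):
    # Eight compartments: S, U, E, I, H, C, D, R.  alpha, sd and l are held
    # constant here for simplicity; the paper makes them time dependent.
    S, U, E, I, H, C, D, R = y
    N = y.sum()  # constant total population
    inf = p["beta"] * (1 - p["sd"]) * S * I / N
    return np.array([
        -inf - p["alpha"] * S,                                  # susceptible
        p["alpha"] * S,                                         # unsusceptible
        inf - p["gamma"] * E,                                   # exposed
        p["gamma"] * E - p["delta"] * I,                        # infected
        (1 - p["m"]) * p["delta"] * I - p["zeta"] * H
            + (1 - p["f"]) * p["eps"] * C,                      # hospitalized
        p["c"] * p["zeta"] * H - p["eps"] * C,                  # critical
        p["l"] * p["delta"] * I + p["f"] * p["eps"] * C,        # dead
        (p["m"] - p["l"]) * p["delta"] * I
            + (1 - p["c"]) * p["zeta"] * H,                     # recovered
    ])

def rk4(y0, p, days, h=0.1):
    # Fourth-order Runge-Kutta, as used by the paper (there, in matlab).
    y, traj = np.array(y0, float), [np.array(y0, float)]
    for _ in range(int(days / h)):
        k1 = sueihcdr_rhs(y, p); k2 = sueihcdr_rhs(y + h/2*k1, p)
        k3 = sueihcdr_rhs(y + h/2*k2, p); k4 = sueihcdr_rhs(y + h*k3, p)
        y = y + h/6*(k1 + 2*k2 + 2*k3 + k4)
        traj.append(y)
    return np.array(traj)

# illustrative values, loosely in the ranges reported later in the paper
params = dict(beta=0.59, sd=0.5, alpha=0.02, gamma=1.0, delta=1/8.9,
              m=0.92, l=0.006, zeta=1/4.3, c=0.25, eps=1/11.5, f=0.32)
y0 = [1e6 - 150, 0, 50, 100, 0, 0, 0, 0]
traj = rk4(y0, params, days=120)
```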
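the monte carlo multistart idea behind the fitting procedure can be sketched generically (this is not the paper's matlab code): sample parameter vectors uniformly within bounds, keep the best, then refine it with shrinking random steps; the goal function j below is a toy combined rmse over two synthetic series, standing in for eq. 13:

```python
import numpy as np

def fit_monte_carlo(loss, bounds, n_draws=2000, n_refine=500, seed=0):
    # Monte Carlo multistart: sample parameter vectors uniformly inside the
    # bounds, keep the best, then refine it with shrinking random steps.
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds).T
    draws = lo + (hi - lo) * rng.random((n_draws, len(lo)))
    best = min(draws, key=loss)
    step = (hi - lo) / 10.0
    for _ in range(n_refine):
        cand = np.clip(best + step * rng.standard_normal(len(lo)), lo, hi)
        if loss(cand) < loss(best):
            best = cand
        step *= 0.995                      # slowly shrink the search radius
    return best

# toy goal function: combined RMSE of two exponential series
t = np.arange(30.0)
deaths = 2.0 * np.exp(0.15 * t)
active = 40.0 * np.exp(0.15 * t)

def J(theta):
    lam, d0, a0 = theta
    rmse_d = np.sqrt(np.mean((d0 * np.exp(lam * t) - deaths) ** 2))
    rmse_a = np.sqrt(np.mean((a0 * np.exp(lam * t) - active) ** 2))
    return rmse_d + rmse_a

theta_hat = fit_monte_carlo(J, bounds=[(0.0, 0.5), (0.1, 10.0), (1.0, 100.0)])
```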
besides introducing more compartments than a traditional seir model (e.g., hospitalized), the three main differences between our sueihcdr model and a standard seir model are the additions of α, sd, and l. our results suggest that our model was able to accurately fit the data of all countries when one goal is considered (figs. 3 and 4). however, when we tried to fit the model to both accumulated deaths and active cases, we found that we could not reproduce the data for all analyzed countries with the same accuracy (fig. 5). table 2 shows the optimization parameter results for june 18, 2020, for germany, brazil, spain, italy, south korea, portugal, switzerland, thailand, and the usa, considering the solution from the pareto front (fig. 2) that minimized j (eq. 13). the mean protection rate (α) was 0.027 (0.007); the mean infection rate (β) was 0.59 (0.07); the mean fraction of infectious cases that are asymptomatic or mild (m) was 0.92 (0.04); the mean fraction of infectious people that died with no treatment (l) was 0.006 (0.003); the fraction of severe cases that turn critical (c) was 0.25 (0.05); the mean fraction of critical cases that are fatal (f) was 0.32 (0.05); and the mean social distancing parameter (sd) was 0.57 (0.04). table 3 shows the inverse values of γ, δ, ζ, and ε: the mean latent period was 1 (0.2) days; the mean infectious period was 8.9 (1.5) days; the mean hospitalized period was 4.3 (0.7) days; and the mean period in the icu was 11.5 (2.1) days. the basic reproduction number (r_0) was 2.24 (0.52), and the death rate was 1.2 (0.5)%. figures 3, 4, and 5 show the model results for all studied countries. figure 3 shows the results considering the two-goal optimization pareto front solution that minimized death rmse. optimization was done considering end-date june 3 (black circles for deaths and green circles for active cases); the red circles (deaths) and blue circles (active cases) indicate "future" real data for the next 15 days. similarly, fig. 4 shows the results for the two-goal optimization, but now minimizing active cases rmse, with end-date may 3. finally, fig. 5 shows the results considering the solution from the pareto front that minimized j, for end-date may 18. table 4 shows the model's future projections of 15, 30, 45, and 60 days for the total number of infected, deaths, hospitalized, peak hospitalization, icu patients, peak day icu, and recovered patients. the results indicate deaths in the thousands for every country but korea and thailand. the usa has a peak of more than 100 thousand hospitalized patients. additionally, spain is projected to have almost 2 million recovered people by the end of august 2020. according to the model estimations, brazil will have more than 200 thousand icu patients treated by july 18 and 70 thousand more in the following 30 days; the peak day will demand 35 thousand icu beds. finally, tables 5, 6, and 7 show the percentage errors comparing model results to real data for the day of the analyses and for future projections of 15, 30, 45, and 60 days. table 5 shows the results for the optimization minimizing j, table 6 for the optimization minimizing death rmse, and table 7 for the optimization minimizing active cases rmse. thirty-day projections were performed twice: first, as for the other time windows, considering future date june 18, and second, considering future date may 18. as expected, errors became larger farther into the future projections. (fig. 4 caption: model results for active cases and accumulated deaths for all studied countries, considering minimizing active cases rmse; optimization was done considering end-date may 3 (black circles for deaths and green circles for active cases); the red circles (deaths) and blue circles (active cases) indicate real data up to june 18.) in general, projected deaths had smaller percentage errors.
because of the recent re-opening of countries such as portugal and spain, 30-day future projections using data from 60 days ago to estimate values 30 days ago yielded better results than using data from 30 days ago to estimate the present day (may 18). as can be seen in fig. 4, there was a sudden increase in the number of active cases for both countries in the past days that was not predicted by the model, whose active case curves kept on a steady decline. considering the rapidly growing covid-19 pandemic and the necessity of modeling the phenomenon to make future predictions of the number of cases and deaths, but ultimately also of the number of hospital and icu beds, we presented a novel generalized seir compartmental model with the addition of the unsusceptible, hospitalized, critical, and dead compartments. furthermore, we introduced three new parameters to the model (α, sd, and l). we tested our model using a global optimization algorithm and data collected from the who for several countries. our main findings were as follows: (a) our model was able to accurately fit either the deaths or the active cases data of all countries tested, independent of what stage of the epidemic they were in, using optimized coefficient values in agreement with recent reports; (b) when trying to fit both sets of data at the same time, the fit was good for some countries, but not for all; (c) using our model, large ranges for each input, and optimization, we predicted death values for 15, 30, 45, and 60 days ahead with errors on the order of 5, 10, 20, and 80%, respectively. (fig. 5 caption: black circles for deaths and green circles for active cases; the red circles (deaths) and blue circles (active cases) indicate real data up to june 18.) table 3 caption: inverse of the model's optimized coefficients γ, δ, ζ, and ε, representing the latent, infectious, hospitalization, and critical-case mean durations in days, as well as the model-estimated basic reproduction number (r_0) and the death rate (dr), for june 18, 2020, for germany, brazil, spain, italy, south korea, portugal, switzerland, thailand, and the usa, respectively.
μ stands for the mean across countries and std for the standard deviation. our results show that our model can fit data from several countries, despite obvious differences in covid-19 scenarios among them, such as south korea and spain for example. in order to do that, among other things, we estimated the infection rate (β), an important determinant of the growth of infected cases mainly in the early stages of the epidemic, and a social distancing coefficient (sd) and a protective coefficient (α) that can decrease the rate of transmission. this estimation process provides information to compare the different social distance measures adopted among several countries. south korea's results, for instance, exhibit a decreased effective transmission rate β(1 − sd) compared with other countries and the best social distancing, at a rate of 64%. this result concurs with south korea's political decisions (shin 2020). as our model does not have a quarantined state, the effective testing, contact tracing, and quarantining implemented by korea were reflected not only in greater sd values but also in an increased protection rate of α = 0.036. the worst protection rates were found for brazil (α = 0.016) and the usa (α = 0.02), most likely caused by poor political decisions and downplaying by officials of the seriousness of the virus in the beginning of the crisis (abutaleb et al. 2020; andreoni 2020). furthermore, in order to adequately model countries where the number of deaths is critically above the number expected from covid-19 mortality rates, even considering possible age effects (li et al. 2020; who 2020), we introduced a coefficient l to the model. this coefficient represents the percentage of people that went from infectious to dead without access to hospital care. introducing l is a novel idea in seir model studies.
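the roles of β, sd, α, and l described above can be sketched in a toy compartmental step. the exact compartment set and transfer terms below are assumptions for illustration (including the auxiliary recovery/death rates rec_h, rec_c, and mu), not the paper's own equations.

```python
def generalized_seir_step(state, p, dt=0.1):
    """one forward-euler step of a hypothetical generalized seir model.
    state: compartment sizes S (susceptible), U (unsusceptible), E (exposed),
    I (infectious), H (hospitalized), C (critical), R (recovered), D (dead).
    p: rates, where
      alpha        = protection rate (S -> U),
      beta*(1-sd)  = effective transmission rate,
      gamma, delta, zeta = E->I, I->H, H->C flows,
      l            = fraction of infectious dying without hospital care,
      rec_h, rec_c, mu = assumed recovery and critical-death rates.
    """
    S, U, E, I, H, C, R, D = (state[k] for k in "SUEIHCRD")
    N = S + U + E + I + H + C + R          # living population
    beta_eff = p["beta"] * (1 - p["sd"])   # social distancing lowers beta
    new_inf = beta_eff * S * I / N
    dS = -new_inf - p["alpha"] * S
    dU = p["alpha"] * S
    dE = new_inf - p["gamma"] * E
    dI = p["gamma"] * E - p["delta"] * I - p["l"] * I
    dH = p["delta"] * I - p["zeta"] * H - p["rec_h"] * H
    dC = p["zeta"] * H - p["rec_c"] * C - p["mu"] * C
    dR = p["rec_h"] * H + p["rec_c"] * C
    dD = p["mu"] * C + p["l"] * I          # l routes deaths past hospitals
    deltas = dict(zip("SUEIHCRD", (dS, dU, dE, dI, dH, dC, dR, dD)))
    return {k: state[k] + dt * deltas[k] for k in state}
```

note that every outflow appears as an inflow elsewhere, so the total population (including d) is conserved at each step; this is a quick sanity check for any compartmental sketch.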
it was done to account for the sad reality that many people are facing during the covid-19 pandemic, as many people have passed away for lack of available icu and/or hospital beds, especially in regions where the outbreak was not contained early, italy for example (tondo 2020). nevertheless, in order to accurately estimate the value of l, one needs to know c (a parameter accounting for the fraction of hospitalized patients that become critical cases) and f (a parameter accounting for the fraction of people in critical state that die) from outside sources. our model predicted a basic reproduction number r0 of 2.24 (0.52). the basic reproduction number represents the average number of secondary cases that result from the introduction of a single infectious case into a susceptible population (anastassopoulou et al. 2020). considering the importance of such a parameter, several other papers have tried different methods to estimate it for covid-19, and our values fall within the range of values reported so far. in their review, liu et al. (2020) reported two studies using stochastic methods that estimated r0 ranging from 2.2 to 2.68, six studies using mathematical methods, with results ranging from 1.5 to 6.49, and finally three studies that used statistical methods such as exponential growth, with estimations ranging from 2.2 to 3.58. additionally, we found a worldwide mean latent period of 1 (0.2) day and an infectious period of approximately 8.9 (1.5) days. the mean estimated latent period found here is smaller than some previously reported, such as in peng et al. (2020) and guan et al. (2020), who reported median latent times of around 2-3 days. nevertheless, our results corroborate the idea that covid-19 transmission may occur in the pre-symptomatic phase and that covid-19 patients may have a negligible latent non-infectious period.
the mean infectious period of 9 days is within the range estimated by recent publications (guo et al. 2020; hou et al. 2020). our results indicate that, despite all uncertainty and biases in the data collected, lack of testing in several countries, and possible changes in policies and people's behavior regarding covid-19, our proposed mathematical modeling may help predict 15-day-ahead values of total deaths with errors in the order of 5% and 30-day-ahead values of active cases with errors in the order of 30%. moreover, a reliable 2-week prediction of the number of deaths suggests that the model may also be used to determine the number of hospital and icu beds that a region will need, early enough for people to prepare for it. unfortunately, we could not get reliable data on the number of hospitalizations and icu patients in the different countries studied here to verify the certainty of our predictions for the values estimated by the model, and we urge future research to do so. furthermore, future applications of our model should consider including stratification by age groups (li et al. 2020) and coefficients to account for temperature variations and population density (chen et al. 2020; wang et al. 2020). additionally, for brazil, where the active cases are still fast-growing, errors in prediction can be large (fig. 4). the larger errors in such cases happened because there is less data for the optimization process to fit the models' parameters, and because the active cases and accumulated death curves are still, approximately, growing exponentially (ranjan 2020). because of the simplicity of the curve, different optimization solutions can fit the data but yield quite different future projections. for example, different combinations of α and β may cause similar behavior patterns for the beginning of the curve.
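the projection errors quoted above (about 5% for 15-day death forecasts, about 30% for 30-day active-case forecasts) correspond to a simple absolute percentage error of the projected value against the value later observed on the target date; a minimal sketch:

```python
def percentage_error(projected, observed):
    """absolute percentage error of a projection against the value
    actually observed on the target date."""
    return abs(projected - observed) / observed * 100.0

# e.g. projecting 10_500 deaths when 10_000 were later observed
# gives a 5% error:
# percentage_error(10_500, 10_000) -> 5.0
```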
our results are in agreement with a recent study by ranjan (2020), who adds that modeling an epidemic during its progress is very challenging, as parameters such as the transmission rate and basic reproduction number differ between geographical regions and depend on many social and environmental factors. they also concluded that the early stage of an epidemic is relatively easy to model, while modeling the later stages to predict the decline and eventual flattening of the curve is very challenging, as more known parameters need to be included in the model. the inclusion of effects due to isolation and quarantine adds to the complication. although technically we addressed this issue by including in our model three time-changing coefficients α, sd, and l, they are hard to find by optimization for countries in the beginning stages. this happens mainly because sd and l are triggered at specific times, and the optimization process attributes random values both to these coefficients and to their time "activations." with t_sd and t_l larger than the current time, different values of sd and l can yield the same temporal trends for the beginning of the curves but significantly different behaviors after times t_sd and t_l. in other words, in countries where the epidemic is still in its pre-peak stages, especially during the fast initial growing phase, some of the model coefficients, especially sd, α, t_sd, and t_l, should be estimated from outside sources and/or used to infer possible future scenarios dependent upon future policies, such as, for example, an enforcement of social distance measures. furthermore, sudden changes in sd after a period may also cause a rapid increase in active cases that cannot be predicted by the model (e.g., fig. 4; spain and portugal); to predict such cases, data on sd and ideally protection rates should be obtained from outside sources. studies to test such hypotheses should be made.
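the time-triggered behaviour of sd and l described above can be sketched as step functions that switch on at t_sd and t_l. before the trigger times, different (sd, t_sd) combinations produce identical curves, which is exactly the identifiability problem discussed in the text. function names and the step-function form are illustrative assumptions.

```python
def effective_beta(t, beta, sd, t_sd):
    """transmission rate with a social-distancing step at time t_sd."""
    return beta * (1 - sd) if t >= t_sd else beta

def death_without_care_rate(t, l, t_l):
    """fraction l of infectious dying without hospital care, active
    only after its trigger time t_l."""
    return l if t >= t_l else 0.0
```

for t < t_sd, effective_beta is independent of sd, so fitting early-epidemic data alone cannot pin down sd and t_sd, motivating the paper's suggestion to take them from outside sources.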
in response to the rapid global dissemination of covid-19, on the 11th of march the who declared the outbreak a pandemic, motivating further research in epidemiological mathematical modelling. the results suggest that the presented model may be used to predict 15-day-ahead values of total deaths with errors in the order of 5%. these errors may be minimized if social distance data are inputted into the model. sudden changes in social distance measures could not be predicted by the model using optimization alone.
references:
- the u.s. was beset by denial and dysfunction as the coronavirus raged. the washington post
- data-based analysis, modelling and forecasting of the covid-19 outbreak
- what you need to know. new york times
- roles of meteorological conditions in covid-19 transmission
- impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
- clinical characteristics of 2019 novel coronavirus infection in china
- early spread of sars-cov-2 in the icelandic population
- the origin, transmission and clinical therapies on coronavirus disease 2019 (covid-19) outbreak - an update on the status
- to understand the global pandemic, we need global testing - the our world in data covid-19 testing dataset
- the effectiveness of the quarantine of wuhan city against the corona virus disease 2019 (covid-19): well-mixed seir model analysis
- modeling the effects of intervention strategies on covid-19 transmission dynamics
- risk factors for severity and mortality in adult covid-19 inpatients in wuhan
- the reproductive number of covid-19 is higher compared to sars coronavirus
- epidemic analysis of covid-19 in china by dynamical modeling
- estimating the final epidemic size for covid-19
- south korea extends intensive social distancing to reach 50 daily coronavirus cases. reuters
- italian hospitals short of beds as coronavirus death toll jumps. the guardian.
- estimates of the severity of coronavirus disease 2019: a model-based analysis
- the global impact of covid-19 and strategies for mitigation and suppression. imperial college
- high temperature and high humidity reduce the transmission of covid-19. ssrn electron j, 2020
- who director-general's opening remarks at the media briefing on covid-19. world health organization (who) 2020
- estimating clinical severity of covid-19 from the transmission dynamics in wuhan, china
- apmonitor optimization suite. covid-19 optimal control response
- covid-19 scenarios developed at the university of basel. ©2020
publisher's note: springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
acknowledgments: the authors would like to thank dr. osmar pinto jr, dra. iara rca pinto, emely flores, leandro dalmarco, fernando torres balbina, fabricio duarte, henrique touguinha, guilherme ferro, marco antonio ridenti, and all others who have helped us by sharing data and information during this project.
conflict of interest: the authors declare that they have no conflict of interest.
key: cord-017181-ywz6w2po authors: maus, carsten title: component-based modelling of rna structure folding date: 2008 journal: computational methods in systems biology doi: 10.1007/978-3-540-88562-7_8 sha: doc_id: 17181 cord_uid: ywz6w2po rna structure is fundamentally important for many biological processes. in the past decades, diverse structure prediction algorithms and tools were developed, but due to missing descriptions in clearly defined modelling formalisms it is difficult or even impossible to integrate them into larger system models. we present an rna secondary structure folding model described in ml-devs, a variant of the devs formalism, which enables the hierarchical combination with other model components such as rna-binding proteins.
an example of transcriptional attenuation will be given, where model components of the rna polymerase, the folding rna molecule, and the translating ribosome play together in a composed dynamic model. single-stranded ribonucleic acids (rna) are able to fold into complex three-dimensional structures, as the polypeptide chains of proteins do. the structure of rna molecules is fundamentally important for their function, e.g. the well studied structures of trna and the different rrna variants. but other transcripts of the dna, i.e. mostly mrnas, also undergo structure formation, which has been shown to be essential for many regulatory processes like transcription termination and translation initiation [1, 2, 3]. the shape of a folded rna molecule can also define binding domains for proteins or small target molecules, as found for example within riboswitches [4]. the enormous relevance for many biological key processes has led to increased research efforts in identifying various rna structures over the past decades. unfortunately, experimental structure identification with nmr and x-ray techniques is difficult, expensive, and highly time-consuming. therefore, many in silico methods for rna structure prediction were developed, covering different requirements. diverse comparative methods exist that use alignments of similar rna sequences to predict structures [5, 6], but many single-sequence prediction algorithms also work very well. some of them predict the most stable rna structure in thermodynamic equilibrium, e.g. [7, 8, 9], whereas others simulate the kinetic folding pathway over time [10, 11, 12, 13]. the latter is also the focus of the modelling approach presented here. results of rna structure predictions as well as kinetic folding simulations have reached a high level of accuracy, and thus in silico folding has become a widely used and well established technique in the rna community.
however, none of the existing tools and programs provides a flexible integration into larger system models, which is also due to the fact that they are written in proprietary formalisms and do not distinguish between model description and simulation engine. to illustrate the importance of the folding processes and the possibility to integrate them into larger models, let us take a look at a concrete example of gene regulation. the tryptophan (trp) operon within bacterial genomes represents one of the best understood cases of gene regulation and has been subject to various modelling approaches [14, 15]. tryptophan is an amino acid, a building block for the production of proteins. the trp operon includes five consecutive genes, coding for five proteins. the joint action of these proteins permits the synthesis of trp through a cascade of enzymatic reactions. this ability is vital, since the bacterium may be unable to feed on trp from the environment. as long as trp is obtained from the surrounding medium, its costly synthesis is impaired by a threefold control mechanism: repression of transcription initiation, transcriptional attenuation, and inactivation of the cascade of enzymatic reactions actually producing trp. each of these is triggered by trp availability. transcriptional attenuation follows if transcription starts although trp was available, which has a small but non-negligible chance. as soon as rna polymerase (rnap) has transcribed the operon's leader region into an mrna molecule, a ribosome can access it. the ribosome starts translating the mrna content into a growing sequence of amino acids. the speed of the ribosome depends on trp availability. the ribosome advances quickly as long as trp is abundant, which prevents rnap from proceeding into the operon's coding region. the attenuation is caused by the formation of a certain constellation of rna hairpin loops in the presence of a trp molecule at a distinct segment of the mrna molecule (figure 1).
attenuation depends on the synchronised advance of both rnap and ribosome, and their relative positioning with respect to the mrna. in [15] a model of the tryptophan operon was developed in which repressors, the operon region, and the mrna were modelled individually, the latter however in not much detail. only the repression of transcription initiation was included in the model. consequently, the simulation result showed stochastic bursts in the trp level, caused by the repressor falling off, which started the transcription. integrating a transcriptional attenuation model would have prevented this unrealistic increase in trp concentration and mimicked the threefold regulation of trp more realistically. however, the question is how to model the attenuation. as this regulating process depends largely on structure formation, modelling of rna folding would be a big step in the right direction for reflecting attenuation dynamics. additionally, interactions between mrna and rnap as well as between mrna and the ribosome need to be modelled, because both influence the kinetic folding process, and rna termination structures break off gene transcription. the focus of this paper is the rna folding process, but at the end we will also give a detailed outlook on how the composed model of tryptophan attenuation looks and how the individual model components act together. the reason for rna folding is the molecule's general tendency to reach the thermodynamically most favourable state. complementary bases of rna nucleotides can form base pairs by building hydrogen bonds, similar to dna double helices. adenine (a) and uracil (u) are complementary bases, as are cytosine (c) and guanine (g). in addition, the wobble base pair g-u is also frequently found in rna structure folding.
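the pairing rules just described (watson-crick a-u and c-g, plus the g-u wobble pair) can be captured in a small predicate; a minimal sketch:

```python
# canonical base pairs in rna secondary structure, including the
# g-u wobble pair; the relation is symmetric
PAIRS = {("a", "u"), ("u", "a"), ("c", "g"), ("g", "c"), ("g", "u"), ("u", "g")}

def can_pair(base1, base2):
    """true if two bases can form a secondary-structure base pair."""
    return (base1.lower(), base2.lower()) in PAIRS
```

in a folding model, such a predicate only decides whether a pair is possible at all; the stability of a formed pair would additionally depend on the tabulated stacking parameters discussed in the next paragraph, which are not reproduced here.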
each additional hydrogen bond of a base pair contributes a small amount of energy to the overall thermodynamic stability, but there is another chemical interaction which is even more important for rna structure than just the number and type of base pairs. it is called base stacking and describes the interactions between the aromatic rings of adjacent bases via van der waals bonds. base pair stacking is the reason why an uninterrupted long helix is thermodynamically more favourable than a structure of multiple single base pairs or short helices interrupted by loop regions, even if the number and type of base pairs are equal. since the 1970s, significant progress has been made in identifying thermodynamic parameters of different base pair neighbourhoods and structural elements like hairpin loops, e.g. [16, 17]. this was a precondition for developing rna structure prediction algorithms based on energy minimisation, i.e. finding the thermodynamically most stable structure. rna structures are hierarchically organised (see figure 2). the simplest hierarchy level is the primary structure, which is nothing else than the linear sequence of nucleotides. two nucleotides are linked over the 3' and 5' carbon atoms of their ribose sugar parts, resulting in a definite strand direction. the secondary structure consists of helices formed by base pairs and intervening loop regions. such structural elements are formed rapidly within the first milliseconds of the folding process [18]. interacting secondary structure elements finally build the overall three-dimensional shape of rna molecules. although they are formed by simple base pairs like secondary structures, helices inside loop regions are often seen as tertiary structures. such pseudoknots and higher-order tertiary interactions are, due to their complexity and analogous to many other rna structure prediction methods, not covered by our model.
however, it should not remain unstated here that there are some existing tools which can predict pseudoknots quite well, e.g. [10].
fig. 2. hierarchy levels of rna structure: primary structure, secondary structure, tertiary structure.
as already mentioned, typical kinetic rna folding simulations, e.g. [10, 11, 12], are aimed at efficiently and accurately simulating the molecule's structure formation in isolation rather than supporting a reuse of rna folding models and a hierarchical construction of models. for approaching such model composition, we use the modelling formalism ml-devs [19], a variant of the devs formalism [20]. like devs, it supports modular-hierarchical modelling and allows to define composition hierarchies. ml-devs extends devs by supporting variable structures, dynamic ports, and multi-level modelling. the latter is based on two ideas. the first is to equip the coupled model with a state and a behaviour of its own, such that the macro behaviour does not appear as a separate unit (an executive) of the coupled model. please recall that in traditional devs, coupled models have neither a state nor a behaviour of their own. secondly, we have to explicitly define how the macro level affects the micro level and vice versa. both tasks are closely interrelated. we assume that models are still triggered by the flow of time and the arrival of events. obviously, one means to propagate information from macro to micro level is to exchange events between models. however, this typically burdens modelling and simulation unnecessarily, e.g. in case the dynamics of a micro model has to take the global state into consideration. therefore, we adopt the idea of value couplings. information at macro level is mapped to specific port names at micro level. each micro model may access information about macro variables by defining input ports with corresponding names. thus, downward causation (from macro to micro) is supported.
in the opposite direction, the macro level needs access to crucial information at the micro level. for this purpose, we equip micro models with the ability to change their ports and thereby signal crucial state changes to the outside world. upward causation is supported, as the macro model has an overview of the number of micro models being in a particular state and can take this into account when updating the state at macro level. therefore, a form of invariant is defined whose violation initiates a transition at macro level. in the downward direction, the macro level can directly activate its components by sending them events; thereby, it becomes possible to let several micro models interact synchronously, which is of particular interest when modelling chemical reactions. these multi-level extensions facilitate modelling; figure 3 depicts the basic idea (see also [21]). the central unit in composed models using rna structure information is an rna folding model. therefore, we first developed a model component which describes the folding kinetics of single-stranded rna molecules. it consists of a coupled ml-devs model representing the whole rna molecule and several atomic models. each nucleotide (nt) of the rna strand is represented by an instance of the atomic model nucleotide, which is either of the type a, c, g, or u, denoting its base. the nucleotides are connected via ports in the same order as the input sequence (primary structure) and have knowledge about their direct neighbours. for example, the nt at sequence position 8 is connected with nt number 7 on its 5' side and with the nt at position 9 on its 3' side (see figure 4). state variables hold rudimentary information about the neighbours, namely their base type and current binding partners. "binding partner" means a secondary-structure-defining base pair; the term is used only in this context here and does not refer to the primary backbone connections.
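the macro/micro interplay described above can be sketched as follows. this is not the ml-devs implementation used in the paper, only an illustration of the idea that micro models expose state changes by adding dynamically named ports (upward causation) and the macro model answers with macro-level knowledge (downward causation); all class, method, and port names are invented for illustration.

```python
class Nucleotide:
    """toy micro model: a nucleotide that signals a pairing request
    to the macro level by adding a dynamically named input port."""
    def __init__(self, pos, base):
        self.pos = pos
        self.base = base
        self.phase = "unpaired"   # or "paired"
        self.ports = set()

    def request_partner(self):
        # adding a port is the upward signal to the macro model
        self.ports.add(f"partner_in_{self.pos}")

class RnaMolecule:
    """toy macro model: observes micro port changes and answers with
    candidate partner positions chosen using macro-level knowledge."""
    def __init__(self, sequence):
        self.nts = [Nucleotide(i, b) for i, b in enumerate(sequence)]

    def poll(self):
        for nt in self.nts:
            port = f"partner_in_{nt.pos}"
            if port in nt.ports:
                nt.ports.discard(port)
                # macro knowledge: minimum loop distance, unpaired only
                candidates = [o.pos for o in self.nts
                              if abs(o.pos - nt.pos) > 3 and o.phase == "unpaired"]
                return nt.pos, candidates
        return None
```

the point of the sketch is the communication pattern, not the devs semantics: the micro model never inspects the global structure itself, mirroring the paper's statement that an unpaired nucleotide has no global information about the rna shape.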
if the partner of a nucleotide changes, an output message will be generated and the receiving (neighbouring) nucleotides will update their state variables. holding information about other atomic model states is normally not done in devs models, as they are typically seen as black boxes. however, here it is quite useful because of some dependencies concerning base pair stability. base pairs are modelled by wide-range connections of nucleotides via additional interfaces. whereas the rna backbone bonds of adjacent nucleotides are fixed after model initialisation, the connections between base-pairing nucleotides are dynamically added and removed during simulation (figure 5). therefore, two different major states (phases) of nucleotides exist: they can be either unpaired or paired. as already stated in section 3.1, base pair stability depends on the involved bases and their neighbourhood; especially the stacking energies of adjacent base pairs provide crucial contributions to structure stabilisation. in our kinetic folding model, base pair stability is reflected by binding duration, i.e. the time advance function of the paired phase. thus, pairing time depends on thermodynamic parameters for nucleic acid base stacking, which were taken from [17] and are also used by mfold version 2.3 [7]. this thermodynamic data set not only provides the free energy (gibbs energy) for a given temperature of 37 °c, but also the enthalpy change δh of various stacking situations. the enthalpy together with the free energy and the absolute temperature allows us to calculate the entropy change δs, which in turn allows us to calculate the activation energy δe_a for base pair dissociation at any temperature t between 0 and 100 °c: δe_a = δh − t·δs. δe_a is directly used as one parameter for base pair opening time, i.e. the duration of a paired phase directly depends on the activation energy for base pair disruption.
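the thermodynamic relation used above can be written out explicitly: from the tabulated free energy δg at 37 °c (310.15 k) and the enthalpy δh, the gibbs relation δg = δh − t·δs gives δs = (δh − δg₃₇)/310.15, and the activation energy at temperature t then follows as δe_a(t) = δh − t·δs. this reconstruction is consistent with the surrounding text, though the paper's exact formula is not reproduced in this excerpt.

```python
T37 = 310.15  # 37 degrees celsius in kelvin

def entropy_change(dH, dG37):
    """entropy change from tabulated enthalpy and 37 C free energy,
    via dG = dH - T*dS evaluated at 310.15 K."""
    return (dH - dG37) / T37

def activation_energy(dH, dG37, T):
    """free energy at temperature T (kelvin), used as the activation
    energy for base pair dissociation in the kinetic model."""
    dS = entropy_change(dH, dG37)
    return dH - T * dS
```

by construction, evaluating at 310.15 k recovers the tabulated δg₃₇, and other temperatures extrapolate linearly along the enthalpy/entropy decomposition.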
to allow rna structures to escape from local energy minima and refold into more stable structures, the base pair dissociation time is randomised, which leads to very short bonding times in some cases, although the activation energy needed for opening the base pair may be quite large. for base pair closing, an arbitrary short random time is assigned to the unpaired phase of nucleotides, assuming that rna base pair formation is a very fast and random process. after the unpaired time has expired, the nucleotide model tries to build a base pair with another nucleotide randomly chosen from within a set of possible pairing partner positions. this set is determined by sterically available rna strand regions and is thus an abstraction of spatial constraints. for example, a hairpin loop smaller than 4 nucleotides is sterically impossible, but also many other nucleotide positions at larger distance can be excluded for secondary structure folding (see figure 6). an unpaired nucleotide is not able to choose another nt for base pairing on its own: it has no global information about the rna shape, which is strongly needed here. therefore, an implicit request, by adding a new input port, is made to the coupled model, which holds such macro knowledge and can therefore choose a valid position with which the requesting nt will try to pair next. for choosing this position at macro level, two different model variants exist. the first and simpler one picks a position totally at random from within the set of possible partners, whereas the second variant takes the entropy change into account when a pairing partner is selected. the latter method prefers helix elongation over introducing new interior loops by single base pair formations, and small loops are more favourable than large ones [12]. correct rna folding with the first method depends on the base pair stabilities and the random folding nuclei, which are the first appearing base pairs of helical regions.
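the abstraction of spatial constraints described above can be sketched as a partner-set computation that enforces the minimum hairpin-loop size and excludes pairs that would cross an existing base pair (which would create a pseudoknot, excluded from the model). the data representation and function name are illustrative assumptions.

```python
def valid_partners(i, n, pairs, min_loop=3):
    """positions that nucleotide i may try to pair with.
    pairs: dict mapping each paired position to its partner (stored
    symmetrically). excludes positions closer than min_loop unpaired
    nucleotides, already paired positions, and positions whose pairing
    would cross an existing base pair (a pseudoknot)."""
    result = []
    for j in range(n):
        if abs(j - i) <= min_loop or j in pairs:
            continue
        lo, hi = min(i, j), max(i, j)
        # a pair (k, l) crosses (i, j) iff exactly one end lies inside (lo, hi)
        crossing = any((lo < k < hi) != (lo < l < hi)
                       for k, l in pairs.items() if k < l)
        if not crossing:
            result.append(j)
    return result
```

in the model this computation lives at the macro level, since it needs the whole current secondary structure; the requesting nucleotide only receives one position chosen from the returned set.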
this last point is less important for rna folding with model variant 2, because the chosen binding partners are more deterministic due to the loop entropy consideration. a comparison of both approaches with respect to simulation results is given in section 6. once a nucleotide has received an input message from the macro model containing the number of a potential pairing partner, it tries to form a base pair with this nt by adding a coupling and sending a request. for a successful pairing, the partners must be of complementary base type and they must be able to pair in principle; e.g., bases can be modified so that they cannot bind to others. figure 7 illustrates the whole state flow for base pair formation and disruption of the nucleotide model component. figure 8 shows the schematic organisation of the whole rna folding model, including the role of the macro model and its interactions with the micro level (nucleotides). as already mentioned in the previous section, high-level information about the whole rna molecule is needed to take sterical restrictions for base pairing into account. therefore, the coupled model holds the overall rna secondary structure, which is updated every time the state of a nucleotide changes from unpaired to paired and vice versa. this is triggered by nucleotide port adding and removal recognised by the macro level. the same functionality is used to signal to the macro level the wish to try pairing with another nucleotide. the macro model detects a port adding, calculates the sterically possible partner set, chooses a position from within the set, and finally sends this position to the newly added nucleotide input port (figure 8, nt 5). the coupled macro model is further responsible for sequence initialisation on the micro level by adding and connecting nucleotide models, i.e. it generates the primary rna structure. another task of the macro model is to observe the current folding structure for special structural patterns.
this could be, for example, a specific binding domain for a protein or small ligand. also transcription termination or pausing structures can be of interest for observation. if observed structures are present during a folding simulation, the macro-level model can signal this information and thus trigger dynamics in other components by adding new ports to itself (representing docking sites) or by sending messages over existing ports. a composed example model which uses this capability can be found in section 7. for evaluating the model's validity, we simulated the folding of different rna molecules with known structure and analysed the results. three different types of experiments were done: native structure (does the majority of formed structures correlate with the native structure after sufficiently long simulation time?), structure distribution (is the equilibrium ratio between minimum free energy (mfe) and suboptimal structures as expected?), and structure refolding (are molecules able to refold from suboptimal to more stable structural conformations?). unfortunately, only few time-resolved folding pathways have been experimentally derived, and most of them treat pseudoknots [22] and higher-order tertiary structure elements [22, 23, 24], which cannot be handled by our folding model and are therefore out of the question for a simulation study. hence, some comparisons with other in silico tools were also made, although we know that one has to be careful with comparing different models for validating a model, as it is often unclear how valid the other models are. because the folding model is highly stochastic, every simulation experiment was executed multiple times; typically 100 replications were made. structural analysis of the cis-acting replication element from hepatitis c virus revealed a stem hairpin loop conformation where the helix is interrupted by an internal or bulge loop region [25]. figure 9 shows simulation results of its structure formation.
the three-dimensional base pair lifetime plots indicate correct folding of both helical regions and only few misfolded base pairs. only small differences can be seen between simulations with the two base pair formation variants described in section 5.1. without taking entropy into account for pairing, a bit more noise from misfolded base pairs can be observed, which is not surprising due to the absolutely random partner choice. another well-known rna structure is the cloverleaf secondary structure of trnas [26, 27], consisting of four helical stems: the amino acid arm, d arm, anticodon arm, and t arm. some base modifications and unusual nucleotides exist in trna which stabilise its structure formation, e.g. dihydrouridine and pseudouridine. such special conditions are not considered by our folding model, nor are the tertiary interactions leading to the final l-shaped form. however, folding simulations result in significant cloverleaf secondary structure formation (figure 10). although there is much misfolded noise, the four distinct helix peaks are the most stable structural elements introduced during simulation, especially the amino acid arm. no fundamental difference can be seen between base pair formation models 1 and 2. a third native-structure validation experiment treats the coronavirus s2m motif, which is a relatively long hairpin structure with some intervening internal and bulge loops [28, 29]. simulation of the sars virus s2m rna folding indicates a conspicuous occurrence for only one of the native helix regions (figure 11). the other helices closer to the hairpin loop show no significant stabilisation. competing misfolded structural elements can be observed equally frequently or even more often. base pair formation model 2 provides a slightly better result than the first one, but it is unsatisfying too.
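the base pair lifetime plots referred to above can be produced by accumulating, per (i, j) position pair and across all replications, the total simulated time each base pair was closed. a minimal sketch, with the event data layout assumed for illustration:

```python
def lifetime_matrix(events, n):
    """events: list of (i, j, t_open, t_close) base pair intervals
    collected over all replications of a folding simulation.
    returns an n x n matrix of total lifetimes, symmetric in i and j,
    which can be rendered as the 3d lifetime plots."""
    m = [[0.0] * n for _ in range(n)]
    for i, j, t0, t1 in events:
        m[i][j] += t1 - t0
        m[j][i] += t1 - t0
    return m
```

in such a matrix, native helices appear as ridges of high accumulated lifetime along (near-)diagonals, while misfolded base pairs show up as scattered low peaks, which is how the "noise" in figures 9-11 can be read.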
a reason for this result may be the multiple internal and bulge loops, which destabilise the stem and thus allow locally more stable structure elements to form. in [30], a quantitative analysis of different rna secondary structures by comparative imino proton nmr spectroscopy is described. the results indicate that a small 34-nt rna has two equally stable structures in thermodynamic equilibrium, one with 2 short helices and the other with a single hairpin. folding simulations of the same rna strand show an equal ratio of the probed structures as well (figure 12). however, both together represent just 20% of all present structures, a degree of misfolding that was not detected in the nmr experiments. many base pairs introduced during simulation compete with base pairs of the two stable structures and thus reduce their appearance. this can easily be seen in the 3d matrix of figure 12, where some additional peaks show high misfolded base pair lifetimes. simulating the rna folding with kinfold [12] results in a five times higher amount of the 2-helix conformation than the single hairpin, but their total sum is about 60% of all molecules, and thus fewer misfolded structures can be observed. real-time nmr spectroscopy was used by wenter et al. to determine refolding rates of a short 20-nt rna molecule [31]. the formation of its most stable helix was temporarily inhibited by a modified guanosine at position 6. after photolytic removal of this modification, a structure refolding was observed. to map such forced structure formation to relatively unstable folds at the beginning of an experiment, most rna folding tools have the capability to initialise simulations with specified structures. we used a different strategy, much closer to the original wetlab experiment: at first, g6 was not capable of binding any other base. the time course after removing this prohibition during simulation is shown in figure 13. wenter et al.
detected structure refolding by measuring the imino proton signal intensity of u11 and u17, which show high signal intensity if they are paired with other bases. accordingly, we observed the state of both uracils over simulation time as well. after removal of the g6 unpaired locking, a logarithmic decrease of structures with paired u11 and a uniform increase of folds with paired u17 can be observed, reaching a complete shift of the conformational equilibrium after 4 seconds. (fig. 13: refolding of a deliberately misfolded small rna molecule [31]. wetlab measurements are drawn as x marks. simulation parameters: time 10 seconds, temperature 288.15 k, base pair formation model 1, 100 replications. simulations with kinfold were made with the same temperature and replication number, but over a time period of 250 seconds.) a very similar refolding progression was experimentally measured (single spots in figure 13), but with a strongly deviating time scale of factor 25. this could be a remaining model parameter inaccuracy, or due to special experimental conditions which are not covered by our model, e.g. unusual salt concentrations. however, our model allows a quite realistic refolding from a suboptimal to a more stable rna structure. identical in silico experiments with kinfold [12], by contrast, do not show any significant refolding (figure 13, non-changing curves). the same holds for seqfold [11]. with both approaches, the energy barrier seems to be too high to escape from the misfolded suboptimal structure. in comparison to other traditional pure macro methods, local optima can be overcome more easily (see figure 13). we assume that even "stable" base pairs might be subject to changes, and let the nucleotides "search" for a stable structure at micro level. this proved beneficial and emphasised the role of the micro level. however, the simulation also revealed the importance of macro constraints for the folding process, and the implications of a lack of them.
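the u11/u17 bookkeeping described above amounts to computing, at each sampled time point, the fraction of replications in which a given base is currently paired. a sketch under an assumed (time, paired-index-set) snapshot layout, not the tool's actual data format:

```python
def paired_fraction(trajectories, base_index, time_points):
    """Fraction of replications in which `base_index` is paired at each time point.

    Each trajectory is a time-sorted list of (time, paired_set) snapshots;
    `paired_set` holds nucleotide indices currently involved in a base pair."""
    fractions = []
    for t in time_points:
        paired = 0
        for traj in trajectories:
            state = set()
            for time, s in traj:  # last snapshot at or before t gives the state
                if time <= t:
                    state = s
                else:
                    break
            if base_index in state:
                paired += 1
        fractions.append(paired / len(trajectories))
    return fractions

# toy refolding: both replications start with base 11 paired, then shift to 17
trajs = [
    [(0.0, {11}), (2.0, {17})],
    [(0.0, {11}), (5.0, {17})],
]
u11 = paired_fraction(trajs, 11, [1.0, 3.0, 6.0])  # decreasing
u17 = paired_fraction(trajs, 17, [1.0, 3.0, 6.0])  # increasing
```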
macro constraints that have been considered are, for example, the relative positioning of the nucleotides, particularly within spatial structures like hairpin or internal loops. the interplay between macro and micro level allowed us to reproduce many of the expected structure elements, e.g. figures 9 and 10, although macro constraints have been significantly relaxed. these simplifications lead to "wrongly" formed structures, which perhaps could have been prevented by integrating terminal base stacking for pairing stability, as well as less abstract base pair closing rules, as macro constraints. a comparison of the two implemented base pair formation methods indicates only a few differences. without taking entropy into account, the noise of unstable single base pairs and short helices increases, but not dramatically; the same stable structures are formed under both rules. having a working folding model, we are now able to combine it with other model components that are influenced by, or are influencing, the rna structure formation, and come back to the motivation, the attenuation of tryptophan synthesis. at least two further models are needed to reflect transcription attenuation: the rna polymerase and the ribosome (figure 14). rna molecules are products of the transcription process, which is the fundamental step in gene expression. once the rna polymerase enzyme complex (rnap) has successfully bound to dna (transcription initiation), it transcribes the template sequence into an rna strand by sequentially adding nucleotides to the 3' end and thus elongates the molecule. to reflect this synthesising process, in the rna model, new nucleotide models and their backbone connections are added dynamically during simulation. this process is triggered by the rnap model component, which interacts with the rna. this dynamic rna elongation allows the simulation of sequential folding, where early synthesised parts of the rna molecule can already fold whereas other parts still have to be added.
please note that this is not a unique feature of the model presented here, as kinetic folding tools typically realise sequential folding by just adding a new nt after a certain time delay. however, a component design makes it possible to combine the rnap with further models (e.g. the dna template), to model it in more detail (e.g. diverse rnap subunits), and to exchange model components on demand. the pattern observation function of the rna folding model, which is realised at macro level, allows us to look for an intrinsic transcription termination structure [2] during simulation. if such a structure is formed, the folding model removes its elongation input port, meaning the release of the rna from the polymerase enzyme. at this time point the elongation stops, but structure folding and interactions with other components proceed. the ribosome enzyme complex translates rna sequences into protein-determining amino acid sequences (peptides). translation starts at the ribosome binding site of mrna molecules, which is reflected by a pair of input and output ports of the rna models. the translation begins after connecting the rna with a ribosome model. the current ribosome position with respect to the rna sequence is known by the rna model. a triplet of three rna bases (codon) encodes one amino acid. the ribosome requests the next codon 3' of its current rna location when peptide chain elongation has proceeded, i.e. when the correct amino acid of the last codon has entered the enzyme. the speed of the translation process depends strongly on the availability of the needed amino acids: if an amino acid type is not sufficiently available, the ribosome stalls at the corresponding codon and thus pauses translation. a ribosome is quite big, and thus 35-36 nucleotides are covered by its shape [32]. therefore, a region upstream and downstream of the ribosome location is not able to form base pairs.
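the codon stepping and the ribosome footprint can be pictured with a small sketch; index conventions and the centring of the footprint on the ribosome position are illustrative assumptions, only the roughly 35-nt coverage comes from the text [32]:

```python
def next_codon(rna, ribosome_pos):
    """Return the codon 3' of the current ribosome location (0-based index)."""
    return rna[ribosome_pos:ribosome_pos + 3]

def blocked_region(ribosome_pos, seq_len, footprint=35):
    """Nucleotide indices covered by the ribosome footprint.

    These positions cannot form base pairs; the macro level model would send
    pairing-disruption events to the corresponding nucleotide micro models."""
    half = footprint // 2
    lo = max(0, ribosome_pos - half)
    hi = min(seq_len, ribosome_pos + half + 1)
    return set(range(lo, hi))
```

as the ribosome model advances, the rna macro model would recompute this blocked set and disrupt any base-paired element entering it (the helicase activity mentioned below).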
as the rna model knows the ribosome location, this is handled by the rna macro level model, which sends corresponding events to its nucleotide micro model components. the same holds for the helicase activity of the ribosome [32]: for sequence translation, the macro level model will disrupt a base-paired structure element when it is reached by the enzyme. whether those additional models are realised as atomic models or coupled models depends on the objective of the simulation study. referring to the operon model presented in [15], the rnap, the mrna, and the ribosome would replace the simplistic mrna model, to integrate the attenuation process into the model. we presented a component-based model of rna folding processes. unlike traditional approaches, which focus on the results of the folding process, e.g. stable structures in thermodynamical equilibrium, our objective has been different: the idea was to develop an approach that allows integrating the folding processes into larger models and taking the dynamics into account, which has been shown to be crucial in many regulation processes. therefore, the formalism ml-devs was used. at macro level, certain constraints referring to space and individual locations were introduced, whereas at micro level, the nucleotides were responsible for successful base pairing and for the stability of the structure. a model component for the nucleotides and one model component for the entire rna molecule have been defined. the simulation results have been compared to wetlab experiments; for this purpose, the model components can be parametrised for different rna sequences (base types) as well as environmental conditions (e.g. temperature). the evaluation revealed an overall acceptable performance and, in addition, insights into the role of micro level dynamics and macro level constraints. the integration of the rna folding model into a model of transcription attenuation has been sketched.
next steps will be to realise this integration and to execute simulation experiments to analyse the impact of this more detailed regulation model on the synthesis of tryptophan.

references:
[1] translation initiation and the fate of bacterial mrnas
[2] the mechanism of intrinsic transcription termination
[3] transcription attenuation: once viewed as a novel regulatory strategy
[4] genetic control by a metabolite binding mrna
[5] multiple structural alignment and clustering of rna sequences
[6] secondary structure prediction for aligned rna sequences
[7] mfold web server for nucleic acid folding and hybridization prediction
[8] fast folding and comparison of rna secondary structures
[9] a dynamic programming algorithm for rna structure prediction including pseudoknots
[10] prediction and statistics of pseudoknots in rna structures using exactly clustered stochastic simulations
[11] description of rna folding by "simulated annealing"
[12] rna folding at elementary step resolution
[13] beyond energy minimization: approaches to the kinetic folding of rna
[14] dynamic regulation of the tryptophan operon: a modeling study and comparison with experimental data
[15] a variable structure model - the tryptophan operon
[16] rna hairpin loop stability depends on closing base pair
[17] coaxial stacking of helixes enhances binding of oligoribonucleotides and improves predictions of rna folding
[18] rapid compaction during rna folding
[19] combining micro and macro-modeling in devs for computational biology
[20] theory of modeling and simulation
[21] one modelling formalism & simulator is not enough!
a perspective for computational biology based on james ii
[22] single-molecule rna folding
[23] rna folding at millisecond intervals by synchrotron hydroxyl radical footprinting
[24] a single-molecule study of rna catalysis and folding
[25] a cis-acting replication element in the sequence encoding the ns5b rna-dependent rna polymerase is required for hepatitis c virus rna replication
[26] transfer rna: molecular structure, sequence, and properties
[27] the crystal structure of trna
[28] a common rna motif in the 3' end of the genomes of astroviruses, avian infectious bronchitis virus and an equine rhinovirus
[29] the structure of a rigorously conserved rna element within the sars virus genome
[30] bistable secondary structures of small rnas and their structural probing by comparative imino proton nmr spectroscopy
[31] kinetics of photoinduced rna refolding by real-time nmr spectroscopy
[32] mrna helicase activity of the ribosome

acknowledgments. many thanks to adelinde m. uhrmacher for her helpful comments and advice on this work. i also thank roland ewald and jan himmelspach for their instructions for using james ii. the research has been funded by the german research foundation (dfg).

key: cord-025404-rk2fuovf authors: venero, sheila katherine; schmerl, bradley; montecchi, leonardo; dos reis, julio cesar; rubira, cecília mary fischer title: automated planning for supporting knowledge-intensive processes date: 2020-05-05 journal: enterprise, business-process and information systems modeling doi: 10.1007/978-3-030-49418-6_7 sha: doc_id: 25404 cord_uid: rk2fuovf

knowledge-intensive processes (kips) are processes characterized by high levels of unpredictability and dynamism. their process structure may not be known before their execution. one way to cope with this uncertainty is to defer decisions regarding the process structure until run time. in this paper, we consider the definition of the process structure as a planning problem.
our approach uses automated planning techniques to generate plans that define process models according to the current context. the generated plan model relies on a metamodel called metakip that represents the basic elements of kips. our solution explores markov decision processes (mdp) to generate plan models. this technique allows uncertainty representation by defining state transition probabilities, which gives us more flexibility than traditional approaches. we construct an mdp model and solve it with the help of the prism model-checker. the solution is evaluated by means of a proof of concept in the medical domain, which reveals the feasibility of our approach. in the last decades, the business process management (bpm) community has established approaches and tools to design, enact, control, and analyze business processes. most process management systems follow predefined process models that capture different ways to coordinate their tasks to achieve their business goals. however, not all types of processes can be predefined at design time; some of them can only be specified at run time because of their high degree of uncertainty [18]. this is the case with knowledge-intensive processes (kips). kips are business processes with critical decision-making tasks that involve domain-specific knowledge, information, and data [4]. kips can be found in domains like healthcare, emergency management, project coordination, and case management, among others. kip structure depends on the current situation and on new emergent events that are unpredictable and vary in every process instance [4]. thus, a kip's structure is defined step by step as the process executes, by a series of decisions made by process participants considering the current specific situations and contexts [13].
in this sense, it is not possible to entirely define beforehand which activities will execute or their ordering; indeed, it is necessary to refine them as soon as new information becomes available or whenever new goals are set. these kinds of processes heavily rely on highly qualified and trained professionals called knowledge workers. knowledge workers use their own experience and expertise to make complex decisions to model the process and achieve business goals [3]. despite their expertise, it is often the case that knowledge workers become overwhelmed by the number of cases, the differences between cases, rapidly changing contexts, and the need to integrate new information. they therefore require computer-aided support to help them manage these difficult and error-prone tasks. in this paper, we explore how to provide this support by considering the process modeling problem as an automated planning problem. automated planning, a branch of artificial intelligence, investigates how to search through a space of possible actions and environment conditions to produce a sequence of actions to achieve some goal over time [10]. our work investigates an automated way to generate process models for kips by mapping an artifact-centric case model into a planning model at run time. to encode the planning domain and planning problem, we use a case model defined according to the metakip metamodel [20], which encloses data and process logic into domain artifacts. it defines data-driven activities in the form of tactic templates. each tactic aims to achieve a goal, and the planning model is derived from it. in our approach, we use markov decision processes (mdp) because they allow us to model dynamic systems under uncertainty [7], although our definition of the planning problem model enables using different planning algorithms and techniques. mdp finds optimal solutions to sequential and stochastic decision problems.
as the system model evolves probabilistically, an action is taken based on the observed condition or state, and a reward or cost is gained [7, 10]. thus, an mdp model allows us to identify decision alternatives for structuring kips at run time. we use prism [11], a probabilistic model checker, to implement the solution for the mdp model. we present a proof of concept by applying our method in a medical treatment scenario, which is a typical example of a non-deterministic process. medical treatments can be seen as sequential decisions in an uncertain environment. medical decisions not only depend on the current state of the patient, but are also affected by the evolution of that state. the evolution of the patient state is unpredictable, since it depends on factors such as preexisting patient illnesses or patient-specific characteristics of the diseases. in addition, medical treatment decisions involve complex trade-offs between the risks and benefits of various treatment options. we show that it is possible to generate different optimal treatment plans according to the current patient state and a target goal state, assuming that we have enough data to accurately estimate the transition probabilities to the next patient state. the resulting process models could help knowledge workers to make complex decisions and to structure execution paths at run time with a higher probability of success, while optimizing constraints such as cost and time. the remainder of this paper is organized as follows: sect. 2 presents a motivating medical scenario. section 3 introduces the theoretical and methodological background. section 4 describes the proposed method to encode a case model as a planning model. section 5 reports on the application of the methodology in a scenario. section 6 discusses the obtained findings and related work. finally, sect. 7 wraps up the paper with the concluding remarks. this section presents a motivating medical case scenario.
suppose we have the following medical scenario in the oncology department stored in the electronic medical record (emr). in order to receive the second cycle of r-ice, it is necessary to stabilize mary's health status as soon as possible. thus, at this time the goal is to decrease her body temperature to 36.5 °c ≤ temp ≤ 37.2 °c and to reduce the level of nausea to zero (ln = 0). for that, physicians need to choose from a vast set of treatment strategies and decide which procedures are best for mary, in her specific current context. assume that we have statistical data about two possible tactics for achieving the desired goal: fever (fvr) and nausea (nausea) management, shown in table 1, adapted from [2]. each of these tactics can be fulfilled through multiple activities that have different interactions and constraints with each other, as well as with the specifics of the patient being treated. for example, (a) treating nausea with a particular drug may affect the fever, (b) administration of the drug may depend on the drugs that the patient is taking, (c) drug effectiveness may depend on the patient's history with the drug, or (d) giving the drug may depend on whether the drug has already been administered and how much time has elapsed since the last dose. these issues make manual combination of even this simple case challenging, and it becomes much harder for more complex treatments and patient histories. support is therefore needed that can take into account patient data, constraints, dependencies, and patient/doctor preferences to help advise the doctor on viable and effective courses of treatment. this section presents the underlying concepts in our proposal. section 3.1 provides an overview of the metakip metamodel; sect. 3.2 introduces basic concepts of automated planning; sect. 3.3 explains markov decision processes (mdp); section 3.4 describes the prism tool and language.
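a tactic such as fever management can be pictured as plain data with probabilistic activity effects. the encoding below is a hypothetical sketch: the attribute names, probabilities, and cost are invented for illustration and do not come from table 1 or from any clinical source.

```python
# hypothetical encoding of a fever-management tactic; the drug name,
# probabilities, and cost are illustrative, not clinical data.
fever_tactic = {
    "goal": {"temp": "normal"},
    "preconditions": {"temp": "high"},
    "activities": [
        {
            "name": "administer_antipyretic",
            "pre": {"temp": "high"},
            # non-deterministic effects: (resulting attribute values, probability)
            "effects": [({"temp": "normal"}, 0.7), ({"temp": "high"}, 0.3)],
            "cost": 1.0,
        },
    ],
}

def effects_are_normalized(activity, tol=1e-9):
    """The probabilities of an activity's possible effects must sum to one."""
    return abs(sum(p for _, p in activity["effects"]) - 1.0) < tol
```

this kind of structure is what the later sections map onto planning operators and, eventually, onto probabilistic commands of an mdp model.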
our previous work proposed an artifact-centric metamodel [20] for the definition of kips, aiming to support knowledge workers during the decision-making process. the metamodel supports data-centric process management, which is based on the availability and values of data rather than on the completion of activities. in data-centric processes, data values drive decisions, and decisions dynamically drive the course of the process [18]. the metamodel is divided into four major packages: case, control-flow, knowledge, and decision, in such a way that there is an explicit integration of the data, domain, and organizational knowledge, rules, goals, and activities. the case package defines the base structure of the metamodel, a case. a case model definition represents an integrated view of the context and environment data of a case, following the artifact-centric paradigm. this package is composed of a set of interconnected artifacts representing the logical structure of the business process. an artifact is a data object composed of a set of items, attributes, and data values, defined at run time. the knowledge package captures explicit organizational knowledge, which is encoded through tactic templates, goals, and metrics that are directly influenced by business rules. tactic templates represent best practices and guidelines; usually, they contain semi-structured sequences of activities, or unstructured, loosely coupled alternative activities pursuing a goal. the control-flow package defines the behavior of a case. it is composed of a set of data-driven activities to handle different cases. activity definitions are made in a declarative way and have pre- and post-conditions. the metamodel refines the granularity of an activity, which can be a step or a task. a task is logically divided into steps, which allows better management of data entry on the artifacts. step definitions are associated with at most a single attribute of an artifact, a resource, and a role type.
this definition gives us a tight integration between data, steps, and resources. these packages are used to model alternative plans that answer emergent circumstances, reflecting environmental changes or unexpected outcomes during the execution of a kip. the decision package represents the structure of a collaborative decision-making process performed by knowledge workers. we proposed a representation of how decisions can be made by using the principles of strategic management, such as looking towards goals and objectives and embracing uncertainty by formulating strategies for the future and correcting them if necessary. the strategic plan is structured at run time by goals, objectives, metrics, and tactic templates. planning is the explicit and rational deliberation of actions to be performed to achieve a goal [7]. the process of deliberation consists of choosing and organizing actions considering their expected outcomes in the best possible way. usually, planning is required when an activity involves new or less familiar situations, complex tasks and objectives, or when the adaptation of actions is constrained by critical factors such as high risk. automated planning studies the deliberation process computationally [7]. a conceptual model for planning can be represented by a state-transition system, which formally is a 4-tuple Σ = (S, A, E, γ), where S = {s1, s2, ...} is a finite or recursively enumerable set of states; A = {a1, a2, ...} is a finite or recursively enumerable set of actions; E = {e1, e2, ...} is a finite or recursively enumerable set of events; and γ : S × A × E → 2^S is the state-transition function. actions are transitions controlled by a plan executor. events are unforeseen transitions that correspond to the internal dynamics of the system and cannot be controlled by the plan executor. both events and actions contribute to the evolution of the system.
given a state-transition system Σ, the purpose of planning is to deliberate which actions to apply in which states to achieve some goal from a given state. a plan is a structure that gives the appropriate actions. a markov decision process (mdp) is a discrete-time stochastic control process. it is a popular framework designed for making decisions under uncertainty, dealing with nondeterminism, probabilities, partial observability, and extended goals [7]. in mdps, an agent chooses action a based on observing state s and receives a reward r for that action [10]. the state evolves probabilistically based on the current state and the action taken by the agent. figure 1(a) presents a decision network [10], used to represent an mdp. the state-transition function T(s'|s, a) represents the probability of transitioning from state s to s' after executing action a. the reward function R(s, a) represents the expected reward received when executing action a from state s. we assume that the reward function is a deterministic function of s and a. an mdp treats planning as an optimization problem in which an agent needs to plan a sequence of actions that maximizes the chances of reaching the goal. action outcomes are modeled with a probability distribution function. goals are represented as utility functions that can express preferences on the entire execution path of a plan, rather than just on desired final states; for example, finding the choice of treatment that optimizes the life expectancy of the patient, or optimizing cost and resources. prism [11] is a probabilistic model checker that allows the modeling and analysis of systems that exhibit probabilistic behavior. the prism tool provides support for modeling and construction of many types of probabilistic models: discrete-time markov chains (dtmcs), continuous-time markov chains (ctmcs), markov decision processes (mdps), and probabilistic timed automata (ptas).
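the mdp optimization just described can be made concrete with a few lines of value iteration. this is a generic textbook sketch, not the prism machinery used in the paper, and the two-state treatment example uses invented probabilities and costs:

```python
def value_iteration(states, actions, T, R, gamma=0.95, iters=200):
    """Minimal MDP value iteration.

    T[s][a] is a list of (next_state, probability) pairs; R[s][a] is the
    reward for taking action a in state s. States with no actions keep
    value 0 (terminal)."""
    V = {s: 0.0 for s in states}
    for _ in range(iters):
        V = {
            s: max(R[s][a] + gamma * sum(p * V[s2] for s2, p in T[s][a])
                   for a in actions[s]) if actions[s] else 0.0
            for s in states
        }
    policy = {
        s: max(actions[s],
               key=lambda a: R[s][a] + gamma * sum(p * V[s2] for s2, p in T[s][a]))
        for s in states if actions[s]
    }
    return V, policy

states = ["fever", "stable"]
actions = {"fever": ["treat", "wait"], "stable": []}  # "stable" is absorbing
T = {"fever": {"treat": [("stable", 0.7), ("fever", 0.3)],
               "wait":  [("fever", 0.9), ("stable", 0.1)]}}
R = {"fever": {"treat": -1.0, "wait": -1.0}}  # each step costs 1 until stable
V, policy = value_iteration(states, actions, T, R)
```

with equal per-step costs, "treat" reaches the goal state faster and is chosen by the optimal policy; prism performs an analogous optimization, but over properties and with exact model checking.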
the tool supports statistical model checking, confidence-level approximation, and acceptance sampling with its discrete-event simulator. for nondeterministic models it can generate an optimal adversary/strategy to reach a certain state. models are described using the prism language, a simple, state-based language based on the reactive modules formalism [1] . figure 1 (b) presents an example of the syntax of a prism module and rewards. the fundamental components of the prism language are modules. a module has two parts: variables and commands. variables describe the possible states that the module can be in at a given time. commands describe the behavior of a module, how the state changes over time. a command comprises a guard and one or more updates. the guard is a predicate over all the variables in the model. each update describes a transition that the module can take if the guard is true. a transition is specified by giving the new values of the variables in the module. each update has a probability which will be assigned to the corresponding transition. commands can be labeled with actions. these actions are used for synchronization between modules. cost and rewards are expressed as real values associated with certain states or transitions of the model. in our approach, plans are fragments of process models that are frequently created and modified during process execution. plans may change as new information arrives and/or when a new goal is set. we advocate the creation of a planner to structure process models at run time based on a knowledge base. the planner synthesizes plans on-the-fly according to ongoing circumstances. the generated plans should be revised and re-planned as soon as new information becomes available. thereby, it involves both computer agents and knowledge workers in a constant interleaving of planning, execution (configuration and enactment), plan supervision, plan revision, and re-planning. 
an interactive software tool might assist human experts during planning. this tool should allow defining planning goals and verifying emerging events, states, availability of activities and resources, as well as preferences. the run-time generation of planning models according to a specific situation in a case instance requires the definition of the planning domain and then of the planning problem itself. definition 1. let the case model be represented according to the metakip metamodel. the planning domain is derived from the case model and can be described using a state-transition system defined as a 5-tuple Σ = (S, A, E, γ, C) such that: S is the set of possible case states; A is the set of actions, represented by activities inside tactics that an actor may perform; E is the set of events in the context or in the environment; γ : S × A × E → 2^S is the state-transition function, so the system evolves according to the actions and events that it receives; C : S × A → [0, ∞) is the cost function, which may represent monetary cost, time, risk, or anything else that can be minimized or maximized. the state of a case is the set of values (available data) of the attributes contained in the artifacts of the context and the environment. however, since the number of attributes of the artifacts is very large, it is necessary to limit the attributes to only the most relevant ones, which determine the current state of the case at a given time t. actions in the metakip metamodel are represented by the activities within a tactic. tactics represent best practices and guidelines used by the knowledge workers to make decisions. in metakip, they serve as tactic templates to be instantiated to deal with situations arising during the execution of a case instance. tactics are composed of a finite set of activities pursuing a goal. a tactic can be structured or unstructured.
a tactic is a 4-tuple T = (G, PC, M, A), where: G is a set of variables representing the pursued goal state, PC is a finite set of preconditions representing the state required for applying the tactic, M is a set of metrics to track and assess the pursued goal state, and A is a finite set of activities. in metakip, an activity can be a single step or a set of steps (called a task). an activity has preconditions and post-conditions (effects). we map activities into executable actions. an executable action is an activity whose effects can modify the values of the attributes inside business artifacts. these effects can be deterministic or non-deterministic, and the probabilities of the possible effects sum to one: Σ_{ef ∈ Eff} p_ef = 1. c is the number which represents the cost (monetary, time, etc.) of performing a. as the state-transition function γ is too large to be explicitly specified, it is necessary to represent it in a generative way. for that, we use planning operators from which it is possible to compute γ; thus, γ can be specified through a set of planning operators O. a planning operator is instantiated by an action. at this point, we are able to define the planning problem to generate a plan as a process model. definition 5. the planning problem for generating a process model at a given time t is defined as a triple P = (OS_t, GS_t, RO_t), where: OS_t is the observable situation of the case state at time t; GS_t is the goal state at time t, a set of attributes with expected output values; RO_t is the subset of O containing only the available and relevant actions for the specific situation during the execution of a case instance at time t (where the state of the case c is s_t and the set of issues in the situation of c is i_t). each element of GS_t is an attribute with an expected output value v_i belonging to an artifact of c. these attributes are selected by the knowledge workers. some metrics required to assess goals inside tactics can be added to the goal. GS_t represents the expected reality of c.
gs_t serves as input for searching an execution path for a specific situation. different goal states can be defined over time. let p = (os_t, gs_t, ro_t) be the planning problem. a plan π is a solution for p: the state produced by applying π to the state os_t, in the order given, is the state gs_t. a plan is any sequence of actions π = (a_1, ..., a_k), where k ≥ 1. the plan π represents the process model. our problem definition enables the use of different planning algorithms and the application of automatic planning tools to generate alternative plans. as we are interested in kips, which are highly unpredictable processes, we use markov decision processes (mdps) to formulate the model for the planner. mdps allow us to represent uncertainty with a probability distribution. an mdp supports sequential decision making and reasons about future sequences of actions and their outcomes, which provides us with high levels of flexibility in the process models. in the following, we show how to derive an mdp model expressed in the prism language from a metakip model automatically. algorithm 1 shows the procedure to automatically generate the mdp model for the prism tool, where the input parameters are: os_t, gs_t, the set of domain tactics, the given time t, the minimum percentage of precondition satisfaction p_p, and the minimum percentage of goal satisfaction p_g; both p_p and p_g are set according to the rules of the domain. as described in sect. 3.4, a module is composed of variables and commands. the variables of the module are the set of attributes from the case artifacts that belong to os_t ∪ gs_t. the commands represent the relevant planning operators ro_t: the name of a command is the identifier of the action, the guards are the preconditions pc, and the effects eff are the updates with their associated probabilities. rewards are represented by the cost of actions c and are declared outside of the prism module.
(the final steps of algorithm 1 add the metrics necessary to evaluate the goal and invoke createprismmodel(v, c, r).) for finding the set of relevant planning operators ro_t, first, we select tactics whose preconditions are satisfied by the current situation os_t and whose goal is related to the target state gs_t. this can be done by calculating the percentages of both the satisfied preconditions and the achievable goals. if these percentages are within an acceptable range according to the rules of the domain, the tactics are selected. second, this first set of tactics is shown to the knowledge workers, who select the most relevant tactics. the set of the selected relevant tactics is denoted as rt. from this set of tactics, we verify which activities inside the tactics are available at time t. thus, the set of available actions at time t is denoted by a_t = {a_1, a_2, ..., a_n}. finally, the relevant planning operators ro_t are created by means of a_t. to generate plans in prism, it is necessary to define a property file that contains properties defining goals as utility functions. prism evaluates properties over an mdp model, generates all possible resolutions of non-determinism in the model as state graphs, and gives us the optimal state graph. the state graph describes a series of possible states that can occur while choosing actions aimed at achieving a goal state. it maximizes the probability of reaching the goal state while taking the computed rewards into consideration, that is, maximizing rewards or minimizing costs. in our context, a property represents the goal state gs_t to be achieved while trying to optimize some criterion. then, prism calculates how desirable an execution path is according to that criterion. thus, plans can be customized according to knowledge workers' preferences (costs and rewards). to generate a plan, we need to evaluate a property. the generated plan is a state graph that represents a process model to be executed at time t.
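the first selection step above, filtering tactics by precondition and goal satisfaction percentages, could look like the following sketch. the thresholds p_p and p_g and the dict-based encoding of states and tactics are illustrative assumptions.

```python
def satisfaction_pct(required: dict, state: dict) -> float:
    """percentage of required attribute values present in a state."""
    if not required:
        return 100.0
    hits = sum(1 for k, v in required.items() if state.get(k) == v)
    return 100.0 * hits / len(required)

def select_tactics(tactics, os_t, gs_t, p_p, p_g):
    """keep tactics whose preconditions match os_t and whose goal
    overlaps gs_t within the domain thresholds p_p and p_g."""
    selected = []
    for t in tactics:
        if (satisfaction_pct(t["pc"], os_t) >= p_p and
                satisfaction_pct(t["g"], gs_t) >= p_g):
            selected.append(t)
    return selected
```

the resulting candidate set would then be shown to the knowledge workers, who pick the relevant tactics rt.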
the generated process model shows case states as nodes and state transitions as arcs labeled with actions whose outcomes follow a probability distribution. according to this state graph, the knowledge worker can choose which action to execute in a particular state. this helps knowledge workers make decisions during kip execution. this section formulates a patient-specific mdp model in prism for the medical scenario presented in sect. 2. in the area of health care, medical decisions can be modeled with markov decision processes (mdps) [5, 17]. although mdps are most suitable for certain types of problems involving complex decisions, such as liver transplants, hiv, diabetes, and others, almost every medical decision can be modeled as an mdp [5]. we generate the prism model by defining the observable situation os_t, the goal state gs_t, and the set of relevant planning operators ro_t. taking into consideration the medical scenario, the observable situation is os_0 = {temp_0 = 38°c, ln_0 = 4} and the goal state is gs_0 = {36°c ≤ temp ≤ 37.2°c, ln = 0}, where temp is the temperature of the patient and ln is the level of nausea, both attributes of the health status artifact. we assume that the set of relevant tactics rt according to the current health status of the patient comprises fever and nausea management, presented in sect. 2. table 2 shows the specification of one activity of each tactic, with their preconditions, effects with their probabilities, time, and cost of execution. we model the activity effects with probabilities related to the probability of the patient responding to the treatment.
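the goal state gs_0 can be checked mechanically. the following minimal sketch uses the attribute names temp and ln from the scenario; the interval check is our reading of gs_0.

```python
def satisfies_goal(state):
    """check gs_0: 36°c <= temp <= 37.2°c and ln == 0."""
    return 36.0 <= state["temp"] <= 37.2 and state["ln"] == 0

os_0 = {"temp": 38.0, "ln": 4}   # observable situation at t = 0
assert not satisfies_goal(os_0)  # the patient starts outside the goal
```

the planner's job is to find an action sequence that moves the case from os_0 into a state satisfying this predicate.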
for example, the possible effects of applying the activity administer oral antipyretic medication are: (e1) the patient successfully responds to treatment, occurring with a probability of 0.6; (e2) the patient partially responds to treatment, occurring 30% of the time, where their temperature decreases by 0.5°c or more but fails to reach the goal level; and (e3) the patient does not respond at all to treatment or gets worse, occurring with a probability of 0.1. the other activities are similarly modeled according to the response of the patient. assuming that all activities from both tactics are available, the set of executable actions is a_t = {a1, a2, a3, a4, a5, b1, b2, b3}. then, it is possible to model the set of relevant planning operators ro_t. having os_t, gs_t, and ro_t, it is possible to generate the mdp model in the prism language. once we created the mdp model, the following utility functions were evaluated: minimize time and cost while reaching the target state. the optimal plan to achieve the goal state gs_t while minimizing the cost shows that the goal is reachable in eight iterations. the resulting model has 13 states, 35 transitions, and 13 choices. the time for the model construction was 0.056 s. figure 2 presents only a fragment of the generated model, highlighting the most probable path from the initial state to the goal state. the first suggested action is b1 (labeled arc) with its possible outcome states and their probabilities. if the most probable next state is reached, the next action to perform is a1, which has a probability of 0.6 of reaching the goal state. knowledge workers can use this generated plan to decide which activity they should perform next in a particular state. to make the plan readable to knowledge workers, they could be presented with only the most probable path, updated according to the state actually reached after each activity execution.
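the outcome distribution of administer oral antipyretic medication can be checked numerically. in the sketch below, the independence of repeated administrations is our simplifying assumption, not part of the model in table 2.

```python
# effect probabilities from the example: full, partial, no response
p_full, p_partial, p_none = 0.6, 0.3, 0.1
assert abs((p_full + p_partial + p_none) - 1.0) < 1e-12

def p_goal_within(n, p=p_full):
    """probability of at least one full response within n independent
    administrations: 1 - (1 - p)**n (independence assumed)."""
    return 1 - (1 - p) ** n
```

for instance, under that assumption two administrations already reach the goal with probability 1 - 0.4**2 = 0.84, which hints at why the planner finds short paths through this action.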
further studies are necessary to help guide knowledge workers in interpreting and following the model. in the last decades, there has been a growing interest in highly dynamic process management, with different types of approaches that deal with the variability, flexibility, and customization of processes at design time and at run time. most approaches start from the premise that there is a process model to which different changes have to be made, such as adding or deleting fragments according to a domain model, or generating an alternative sequence of activities due to some customization option. a few approaches use automated planning for synthesizing execution plans. laurent et al. [12] explored a declarative modeling language called alloy to create the planning model and generate the plans. this approach seems very promising for activity-centric processes, but not effective enough for data-centric processes, as data is not treated well enough to be the driver of the process, as required in kips. smartpm [16] investigated the problem of coordinating heterogeneous components inside cyber-physical systems. it uses a pddl (planning domain definition language) planner that evaluates the physical reality and the expected reality and synthesizes a recovery process. similarly, marrella and lespérance proposed an approach [15] to dynamically generate process templates from a representation of the contextual domain described in pddl, an initial state, and a goal condition. however, for the generation of the process templates, it is assumed that tasks are black boxes with only deterministic effects. on the other hand, henneberger et al. [8] explored an ontology for generating process models. the generated process models are action state graphs (asg). although this work uses a very interesting semantic approach, it does not consider important aspects such as resources and cost in the planning model.
there has been increasing interest in introducing cognitive techniques for supporting the business process cycle. ferreira et al. [6] proposed a new life cycle for workflow management based on continuous learning and planning. it uses a planner to generate a process model as a sequence of actions that comply with activity rules and achieve the intended goal. hull and nezhad [9] proposed a new plan-act-learn cycle for cognitively-enabled processes that can be carried out by humans and machines, where plans and decisions define actions, and it is possible to learn from them. recently, marrella [14] showed how automatic planning techniques can address different research challenges in the bpm area. this approach explored a set of steps for encoding a concrete problem as a pddl planning problem with deterministic effects. in this paper we introduced the notion of the state of a case regarding data values in the artifacts of a case instance. from this state, we can plan different trajectories towards a goal state using automated planning techniques. our solution generates action plans considering the non-deterministic effects of actions and newly emerging goals and information, which provides high levels of flexibility and adaptation. as we describe a generic planning model, it is possible to use different planning algorithms or to combine other planning models, such as the classical planning model or the hierarchical task network (htn), according to the structuring level of the processes at different moments. thereby, we could apply this methodology to other types of processes, from well-structured processes to loosely structured or unstructured processes. our approach relies on mdps, which require defining transition probabilities, which in some situations can be very difficult and expensive to obtain. nowadays a huge amount of data is produced by sensors, machines, software systems, etc., which might facilitate the acquisition of data to estimate these transition probabilities.
in the medical domain, the increasing use of electronic medical record systems will provide medical data from thousands of patients, which can be exploited to derive these probabilities. another limitation of mdps is the size of the problem: the state space explodes, and the model becomes more difficult to solve. in this context, several techniques for finding approximate solutions to mdps can be applied, in addition to taking advantage of the rapid increase in processing power in recent years. flexible processes could be easily designed if we replan after each activity execution. in fact, our approach suggests a system with a constant interleaving of planning, execution, and monitoring. in this way, it will help knowledge workers during the decision-making process. process modeling is usually conducted by process designers in a manual way. they define the activities to be executed to accomplish business goals. this task is very difficult and prone to human errors. in some cases (e.g., for kips), it is impossible due to uncertainty, context-dependency, and specificity. in this paper, we devised an approach to continually generate run-time process models for a case instance using an artifact-centric case model, data-driven activities, and automatic planning techniques, even for such loosely-structured processes as kips. our approach defined how to synthesize a planning model from an artifact-oriented case model defined according to the metakip metamodel. the formulation of the planning domain and the planning problem relies on the current state of a case instance, context and environment, target goals, and tactic templates, from which we can represent actions, states, and goals. as our focus is kip management, we chose the mdp framework, which allows representing uncertainty, one of kips' essential characteristics.
to automatically generate the action plan, we used the tool prism, which solves the mdp model and provides optimal solutions. future work involves devising a user-friendly software application for knowledge workers to interact with the planner and improving the presentation of plans in such a way that it is more understandable to them. our goal is to develop a planner which combines different types of planning algorithms to satisfy different requirements in business processes, especially regarding the structuring level. this planner will be incorporated into a full infrastructure for managing knowledge-intensive processes that will be based on the dw-saaarch reference architecture [19].

references (titles as extracted):
reactive modules
nursing interventions classification (nic), e-book
thinking for a living: how to get better performance and results
knowledge-intensive processes: characteristics, requirements and analysis of contemporary approaches
mdps in medicine: opportunities and challenges
an integrated life cycle for workflow management based on learning and planning
automated planning: theory and practice
semantic-based planning of process models
rethinking bpm in a cognitive world: transforming how we learn and perform business processes
decision making under uncertainty: theory and application
prism 4.0: verification of probabilistic real-time systems
planning for declarative processes
towards is supported coordination in emergent business processes
automated planning for business process management
a planning approach to the automated synthesis of template-based process models
smartpm: an adaptive process management system through situation calculus, indigolog, and classical planning
a markov decision process model to guide treatment of abdominal aortic aneurysms
enabling flexibility in process-aware information systems: challenges, methods, technologies
dw-saaarch: a reference architecture for dynamic self-adaptation in workflows
towards a metamodel for supporting decisions in knowledge-intensive processes

key: cord-026827-6vjg386e
authors: awan, ammar ahmad; jain, arpan; anthony, quentin; subramoni, hari; panda, dhabaleswar k.
title: hypar-flow: exploiting mpi and keras for scalable hybrid-parallel dnn training with tensorflow
date: 2020-05-22
journal: high performance computing
doi: 10.1007/978-3-030-50743-5_5
sha:
doc_id: 26827
cord_uid: 6vjg386e
to reduce the training time of large-scale deep neural networks (dnns), deep learning (dl) scientists have started to explore parallelization strategies like data-parallelism, model-parallelism, and hybrid-parallelism. while data-parallelism has been extensively studied and developed, several problems exist in realizing model-parallelism and hybrid-parallelism efficiently. four major problems we focus on are: 1) defining a notion of a distributed model across processes, 2) implementing forward/back-propagation across process boundaries, which requires explicit communication, 3) obtaining parallel speedup on an inherently sequential task, and 4) achieving scalability without losing out on a model's accuracy. to address these problems, we create hypar-flow, a model-size and model-type agnostic, scalable, practical, and user-transparent system for hybrid-parallel training that exploits mpi, keras, and tensorflow. hypar-flow provides a single api that can be used to perform data, model, and hybrid parallel training of any keras model at scale. we create an internal distributed representation of the user-provided keras model, utilize tf's eager execution features for distributed forward/back-propagation across processes, exploit pipelining to improve performance, and leverage efficient mpi primitives for scalable communication. between model partitions, we use send and recv to exchange layer-data/partial-errors, while allreduce is used to accumulate/average gradients across model replicas.
beyond the design and implementation of hypar-flow, we also provide comprehensive correctness and performance results on three state-of-the-art hpc systems including tacc frontera (#5 on top500.org). for resnet-1001, an ultra-deep model, hypar-flow provides: 1) up to 1.6× speedup over horovod-based data-parallel training, 2) 110× speedup over single-node on 128 stampede2 nodes, and 3) 481× speedup over single-node on 512 frontera nodes. recent advances in machine/deep learning (ml/dl) have triggered key success stories in many application domains like computer vision, speech comprehension and recognition, and natural language processing. large-scale deep neural networks (dnns) are at the core of these state-of-the-art ai technologies and have been the primary drivers of this success. however, training dnns is a compute-intensive task that can take weeks or months to achieve state-of-the-art prediction capabilities (accuracy). these requirements have led researchers to resort to a simple but powerful approach called data-parallelism to achieve shorter training times. various research studies [5, 10] have addressed performance improvements for data-parallel training. as a result, production-grade ml/dl software like tensorflow and pytorch also provide robust support for data-parallelism. while data-parallel training offers good performance for models that can completely reside in the memory of a cpu/gpu, it cannot be used for models larger than the memory available. larger and deeper models are being built to increase the accuracy of models even further [1, 12]. figure 1 highlights how memory consumption due to larger images and dnn depth limits the compute platforms that can be used for training; e.g. resnet-1k [12] with the smallest possible batch-size of one (a single 224 × 224 image) needs 16.8 gb memory and thus cannot be trained on a 16 gb pascal gpu.
similarly, resnet-1k on image size 720 × 720 needs 153 gb of memory, which makes it out-of-core for most platforms except cpu systems that have 192 gb memory. these out-of-core models have triggered the need for model/hybrid parallelism. however, realizing model-parallelism, i.e., splitting the model (dnn) into multiple partitions, is non-trivial and requires knowledge of best practices in ml/dl as well as expertise in high performance computing (hpc). we note that model-parallelism and layer-parallelism can be considered equivalent terms when the smallest partition of a model is a layer [7, 15]. little exists in the literature about model-parallelism for state-of-the-art dnns like resnet(s) on hpc systems. combining data and model parallelism, also called hybrid-parallelism, has received even less attention. realizing model-parallelism and hybrid-parallelism efficiently is challenging because of four major problems: 1) defining a distributed model is necessary but difficult because it requires knowledge of the model as well as of the underlying communication library and the distributed hardware, 2) implementing distributed forward/back-propagation is needed because partitions of the model now reside in different memory spaces and need explicit communication, 3) obtaining parallel speedup on an inherently sequential task, a forward pass followed by a backward pass, and 4) achieving scalability without losing out on a model's accuracy. proposed approach: to address these four problems, we propose hypar-flow: a scalable, practical, and user-transparent system for hybrid-parallel training on hpc systems. we offer a simple interface that does not require any model-definition changes and/or manual partitioning of the model. users provide four inputs: 1) a model defined using the keras api, 2) number of model partitions, 3) number of model replicas, and 4) strategy (data, model, or hybrid).
unlike existing systems, we design and implement all the cumbersome tasks like splitting the model into partitions, replicating it across processes, pipelining over batch partitions, and realizing communication inside hypar-flow. this enables the users to focus on the science of the model instead of system-level problems like the creation of model partitions and replicas, placement of partitions and replicas on cores and nodes, and performing communication between them. hypar-flow's simplicity from a user's standpoint and its complexity (hidden from the user) from our implementation's standpoint is shown in fig. 2. from a research and novelty standpoint, our proposed solution is both model-size as well as model-type agnostic. it is also different compared to all existing systems because we focus on high-level and abstract apis like keras that are used in practice, instead of low-level tensors and matrices, which would be challenging to use for defining state-of-the-art models with hundreds of layers. hypar-flow's solution to communication is also novel because it is the first system to exploit standard message passing interface (mpi) primitives for inter-partition and inter-replica communication instead of reinventing single-use libraries. to the best of our knowledge, there are very few studies that focus on hybrid-parallel training of large dnns, especially using tensorflow and keras in a user-transparent manner for hpc environments where mpi is a dominant programming model. we make the following key contributions in this paper.

mesh-tensorflow (mtf) [20] is a language for distributed dl with an emphasis on tensors distributed across a processor mesh. mtf only works with the older tf apis (sessions, graphs, etc.). furthermore, the level at which mtf distributes work is much lower compared to hypar-flow, i.e., tensors vs. layers. users of mtf need to re-write their entire model to be compatible with mtf apis.
unlike mtf, hypar-flow works on the existing models without requiring any code/model changes. we summarize these related studies on data, model, and hybrid-parallelism and their associated features in table 1 . out-of-core methods like [4, 17] take a different approach to deal with large models, which is not directly comparable to model/hybrid-parallelism. several data-parallelism only studies have been published that offer speedup over sequential training [3, 5, 10, 18, 21] . however, all of these are only limited to models that can fit in the main memory of the gpu/cpu. we provide the necessary background in this section. training itself is an iterative process and each iteration happens in two broad phases: 1) forward pass over all the layers and 2) back-propagation of loss (or error) in the reverse order. the end goal of dnn training is to obtain a model that has good prediction capabilities (accuracy). to reach the desired/target accuracy in the fastest possible time, the training process itself needs to be efficient. in this context, the total training time is a product of two metrics: 1) the number of epochs required to reach the target accuracy and 2) the time required for one epoch of training. in data-parallel training, the complete dnn is replicated across all processes, but the training dataset is partitioned across the processes. since the model replicas on each of the processes train on different partitions of data, the weights (or parameters) learned are different on each process and thus need to be synchronized among replicas. in most cases, this is done by averaging the gradients from all processes. this synchronization is performed by using a collective communication primitive like allreduce or by using parameter servers. the synchronization of weights is done at the end of every batch. this is referred to as synchronous parallel in this paper. 
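the synchronization described above, averaging gradients across all model replicas at the end of a batch, is what an allreduce achieves. a single-process sketch of the arithmetic follows; real systems would use mpi or a library such as horovod, and the list-of-floats gradient encoding is an illustrative assumption.

```python
def allreduce_average(grads_per_replica):
    """average the gradient vectors contributed by all model replicas,
    mimicking what allreduce(sum) followed by division by size achieves."""
    n = len(grads_per_replica)
    length = len(grads_per_replica[0])
    return [sum(g[i] for g in grads_per_replica) / n for i in range(length)]

# three replicas, each with a 2-element gradient from its data partition
grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
avg = allreduce_average(grads)  # every replica applies the same averaged update
```

after the averaged gradients are applied, all replicas hold identical weights again, which is exactly the synchronous-parallel behavior described above.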
model and hybrid-parallelism: data-parallelism works for models that can fit completely inside the memory of a single gpu/cpu. but as model sizes have grown, model designers have pursued aggressive strategies to make them fit inside a gpu's memory, which is a precious resource even on the latest volta gpu (32 gb). this problem is less pronounced for cpu-based training as the amount of cpu memory is significantly higher (192 gb) on the latest generation cpus. nevertheless, some models can not be trained without splitting the model into partitions; hence, model-parallelism is a necessity, which also allows the designers to come up with new models without being restricted to any memory limits. the entire model is partitioned and each process is responsible only for part (e.g. a layer or some layers) of the dnn. model-parallelism can be combined with data-parallelism as well, which we refer to as hybrid-parallelism. we expand on four problems discussed earlier in sect. 1 and elaborate specific challenges that need to be addressed for designing a scalable and usertransparent system like hypar-flow. to develop a practical system like hypar-flow, it is essential that we thoroughly investigate apis and features of dl frameworks. in this context, the design analysis of execution models like eager execution vs. graph (or lazy) execution is fundamental. similarly, analysis of model definition apis like tensorflow estimators compared to keras is needed because these will influence the design choices for developing systems like hypar-flow. furthermore, the granularity of interfaces needs to be explored. for instance, using tensors to define a model is very complex compared to using a high-level model api like keras and onnx that follow the layer abstraction. finally, we need to investigate the performance behavior of these interfaces and frameworks. 
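hybrid-parallelism places both replicas and partitions on the available processes. one possible rank layout is sketched below; the replica-major ordering is our assumption for illustration, not hypar-flow's actual placement policy.

```python
def rank_to_role(rank, num_partitions):
    """map an mpi rank to (replica id, partition id), assuming ranks are
    laid out replica-major: ranks 0..p-1 form replica 0, and so on."""
    return divmod(rank, num_partitions)  # (replica, partition)

# 2 model replicas x 3 model partitions = 6 processes
roles = [rank_to_role(r, 3) for r in range(6)]
```

under this layout, ranks 0-2 together hold one complete copy of the model and ranks 3-5 hold the second copy.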
specific to hypar-flow, the main requirement from an api's perspective is to investigate a mechanism that allows us to perform user-transparent model partitioning. unlike other apis, keras seems to provide us this capability via the tf.keras.model interface. data-parallelism is easy to implement as no modification is required to the forward pass or the back-propagation of loss (error) in the backward pass. however, for model-parallelism, we need to investigate methods and framework-specific functionalities that enable us to implement the forward and backward pass in a distributed fashion. to realize these, explicit communication is needed between model partitions. for hybrid-parallelism, even deeper investigation is required because communication between model replicas and model partitions needs to be well-coordinated and possibly overlapped. in essence, we need to design a distributed system, which embeds communication primitives like send, recv, and allreduce for exchanging partial error terms, gradients, and/or activations during the forward and backward passes. an additional challenge is to deal with newer dnns like resnet(s) [12] as they have evolved from a linear representation to a more complex graph with several types of skip connections (shortcuts) like identity connections, convolution connections, etc. for skip connections, maintaining dependencies for layers as well as for model-partitions is also required to ensure deadlock-free communication across processes. even though model-parallelism and hybrid-parallelism look very promising, it is unclear if they can offer performance comparable to data-parallelism. to achieve performance, we need to investigate if applying widely-used and important hpc techniques like 1) efficient placement of processes on cpu cores, 2) pipelining via batch splitting, and 3) overlap of computation and communication can be exploited for improving performance of model-parallel and hybrid-parallel training. 
naive model-parallelism will certainly suffer from under-utilization of resources due to stalls caused by the sequential nature of computation in the forward and backward passes. we propose hypar-flow as an abstraction between the high-level ml/dl frameworks like tensorflow and low-level communication runtimes like mpi as shown in fig. 3(a) . the hypar-flow middleware is directly usable by ml/dl applications and no changes are needed to the code or the dl framework. the four major internal components of hypar-flow, shown in fig. 3(b) , are 1) model generator, 2) trainer, 3) communication engine (ce), and 4) load balancer. the subsections that follow provide details of design schemes and strategies for hypar-flow and challenges (c1-c3) addressed by each scheme. the model generator component is responsible for creating an internal representation of a dnn (e.g. a keras model) suitable for distributed training (fig. 2) . in the standard single-process (sequential) case, all trainable variables (or weights) of a model exist in the address space of a single process, so calling tape.gradients() on a tf.gradienttape object to get gradients will suffice. however, this is not possible for model-parallel training as trainable variables (weights) are distributed among model-partitions. to deal with this, we first create a local model object on all processes using the tf.keras.model api. next, we identify the layers in the model object that are local to the process. finally, we create dependency lists that allow us to maintain layer and rank dependencies for each of the local model's layers. these three components define our internal distributed representation of the model. this information is vital for realizing distributed backpropagation (discussed next) as well as for other hypar-flow components like the trainer and the communication engine. having a distributed model representation is crucial. however, it is only the first step. 
the biggest challenge for hypar-flow and its likes is: "how to train a model that is distributed across process boundaries?". we deal with this challenge inside the trainer component. first, we analyze how training is performed on a standard (non-distributed) keras model. to realize distributed back-propagation, we need 1) the partial derivative (d1) of the loss l with respect to the weight w1, and 2) the partial derivative (d2) of the loss l with respect to the weight w2. the challenge for the multi-process case is that the term called "partial error", shown in eqs. 6 and 7, can only be calculated on partition-2 (fig. 4) as y only exists there. to calculate d1, partition-1 needs this "partial error" term from partition-2. because we rely on accessing gradients using the dl framework's implementation, this scenario poses a fundamental problem. tensorflow, the candidate framework for this work, does not provide a way to calculate gradients that are not part of a layer. to implement this functionality, we introduce the notion of a grad layer in hypar-flow, which acts as a pseudo-layer inserted before the actual layer on each model-partition. we note that tensorflow's gradienttape cannot be directly used for this case. grad layers ensure that we can call tape.gradients() on a grad layer to calculate the partial errors during back-propagation. specifically, a grad layer is required for each recv operation so that a partial error can be calculated for each preceding partition's input. a call to tape.gradients() returns a list that contains gradients as well as partial errors. the list is then used to update the model by calling optimizer.apply_gradients(). we note that there is no need to implement distributed back-propagation for the data-parallel case as each model-replica independently performs the forward and backward pass. the gradients are only synchronized (averaged) at the end of the backward pass (back-propagation) using allreduce to update the model weights in a single step.
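the chain-rule exchange can be illustrated with a toy two-partition model y = w2 * (w1 * x) under a squared-error loss: partition-2 computes the "partial error" dl/dh and sends it back so partition-1 can form d1. this is a plain-python sketch with hypothetical scalar weights; hypar-flow realizes the same flow through grad layers and tensorflow's gradient machinery.

```python
# partition-1 holds w1, partition-2 holds w2; loss l = 0.5 * (y - t)**2
w1, w2, x, t = 2.0, 3.0, 1.0, 5.0

# forward pass: partition-1 sends its activation h to partition-2
h = w1 * x                   # computed on partition-1
y = w2 * h                   # computed on partition-2
loss = 0.5 * (y - t) ** 2

# backward pass on partition-2
dL_dy = y - t
d2 = dL_dy * h               # gradient for w2 (local to partition-2)
partial_error = dL_dy * w2   # dl/dh, sent back to partition-1

# backward pass on partition-1, using the received partial error
d1 = partial_error * x       # gradient for w1
```

note that d1 matches the analytic derivative (y - t) * w2 * x, even though no single process ever sees both weights.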
in sects. 5.1 and 5.2, we discussed how the distributed model definition is generated and how back-propagation can be implemented for a model that is distributed across processes. however, the trainer and model generator only provide the infrastructure for distributed training. the actual communication of various types of data is realized in hypar-flow's communication engine (ce). the ce is a light-weight abstraction for internal usage, and it provides four simple apis: 1) send, 2) recv, 3) broadcast, and 4) allreduce. for pure data-parallelism, we only need allreduce. however, for model-parallelism, we also need point-to-point communication between model-partitions. in the forward pass, the send/recv combination is used to propagate partial predictions from each partition to the next, starting at layer 1. in the backward pass, send/recv is used to propagate the loss and partial errors from one partition to the other, starting at layer n. finally, for hybrid-parallelism, we need to introduce allreduce to accumulate (average) the gradients across model-replicas. we note that this is different from the usage of allreduce in pure data-parallelism because, in this case, the model itself is distributed across different partitions, so allreduce cannot be called directly on all processes. one option is to perform another p2p communication between model-replicas for gradient exchange. the other option is to exploit the concept of mpi communicators. we choose the latter because of its simplicity, and because mpi vendors have spent considerable effort over many years optimizing the allreduce collective. to realize this, we group the same model-partition across all model-replicas to form an allreduce communicator. because we only need to accumulate the gradients local to a partition across all replicas, an allreduce called on this communicator will suffice. please refer back to fig. 2 (sect. 1) for a graphical illustration of this scheme.
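the communicator layout described above can be illustrated with a small rank-grouping sketch (the consecutive-partitions-per-replica rank ordering is an assumption of this illustration, not a statement about hypar-flow internals):

```python
# Sketch of per-partition allreduce groups for hybrid parallelism.
# Assumed rank ordering: all P partitions of replica r are consecutive,
# i.e. global rank = r * P + p. One allreduce group is formed per
# partition index p, holding the same partition across all R replicas.

def allreduce_groups(num_replicas, num_partitions):
    """Return one rank-group per partition index."""
    return [[r * num_partitions + p for r in range(num_replicas)]
            for p in range(num_partitions)]

# e.g. 3 model-replicas, each split into 4 model-partitions (12 ranks total)
groups = allreduce_groups(num_replicas=3, num_partitions=4)
```

each group contains exactly the ranks holding copies of the same weights, so an allreduce within a group averages the gradients of that partition across replicas, while no rank ever participates in an allreduce over weights it does not own.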
the basic ce design described above works but does not offer good performance. to push the envelope of performance further, we investigate two hpc optimizations: 1) whether the overlap of computation and communication can be exploited for all three parallelization strategies, and 2) whether pipelining can help overcome some of the limitations that arise due to the sequential nature of the forward/backward passes. finally, we also handle some advanced cases for models with non-consecutive layer connections (e.g. resnet(s)), which can lead to deadlocks. to achieve near-linear speedups for data-parallelism, the overlap of computation (forward/backward) and communication (allreduce) has proven to be an excellent choice. horovod, a popular data-parallelism middleware, provides this support, so we simply use it inside hypar-flow for pure data-parallelism. for hybrid-parallelism, however, we design a different scheme: we create one mpi communicator per model-partition, where the size of each communicator equals the number of model-replicas. this design allows us to overlap the allreduce operation with the computation of other partitions on the same node. an example scenario clarifies this further: if we split the model across 48 partitions, then we will use 48 allreduce operations (one for each model-partition) to get optimal performance. dnn training is inherently sequential: the computation of each layer depends on the completion of the previous layer, in the forward pass as well as in the backward pass. to overcome this performance limitation, we exploit a standard technique called pipelining. the observation is that dnn training is done on batches (or mini-batches) of data, which offers an opportunity for pipelining, as a training step on samples within the batch is parallelizable.
theoretically, the number of pipeline stages can be varied from 1 all the way to the batch size. this requires tuning or a heuristic and will vary according to the model and the underlying system. based on hundreds of experiments we performed for hypar-flow, we derive a simple heuristic: start with the largest possible number of pipeline stages and, if needed, decrease it by factors of two. in most cases, we observed that num_pipeline_stages = batch_size provides the best performance. figure 5 shows a non-consecutive model with skip connections that requires communication 1) between adjacent model-partitions for the boundary layers and 2) between non-adjacent model-partitions for the skip connections. to handle communication dependencies among layers for each model-partition, we create two lists: 1) a forward list and 2) a backward list. each is a list of lists that stores dependencies between layers, as shown in fig. 5. "f" corresponds to the index of the layer to which the current layer is sending its data, and "b" corresponds to the index of the layer from which the current layer is receiving data. an arbitrary sequence of sending and receiving messages may lead to a deadlock. for instance, if partition-1 sends the partial predictions to partition-3 while partition-3 is waiting for predictions from partition-2, a deadlock will occur, as partition-2 is itself blocked (waiting for results from partition-1). to deal with this, we sort the message sequence according to the ranks so that each partition sends its first message to the partition that holds the next layer. the models we used did not show any major load imbalance, but we plan to design the load balancer component in the future to address emerging models from other application areas that require load balancing capabilities from hypar-flow.
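a minimal sketch of the micro-batch splitting behind pipelining, together with the candidate stage counts implied by the heuristic above (hypothetical helper names, not hypar-flow code):

```python
# Split a training batch into pipeline stages (micro-batches) so that
# while partition k processes micro-batch i, partition k-1 can already
# start micro-batch i+1.

def split_into_stages(batch, num_stages):
    """Split `batch` (a list of samples) into `num_stages` micro-batches."""
    assert len(batch) % num_stages == 0, "stages must divide the batch size"
    size = len(batch) // num_stages
    return [batch[i * size:(i + 1) * size] for i in range(num_stages)]

def candidate_stage_counts(batch_size):
    """Tuning candidates from the heuristic above: start at batch_size and
    repeatedly halve (batch_size, batch_size/2, ..., 1)."""
    counts, s = [], batch_size
    while s >= 1:
        counts.append(s)
        s //= 2
    return counts

stages = split_into_stages(list(range(8)), num_stages=4)
```

in the limiting case num_stages = batch_size, each micro-batch holds a single sample, which is the configuration the text reports as fastest in most experiments.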
we have used three hpc systems to evaluate the performance and test the correctness of hypar-flow: 1) frontera at texas advanced computing center (tacc), 2) stampede2 (skylake partition) at tacc, and 3) epyc, a local system with dual-socket amd epyc 7551 32-core processors. inter-connect: frontera nodes are connected using mellanox infiniband hdr-100 hcas, whereas stampede2 nodes are connected using intel omni-path hfis. the design schemes proposed for hypar-flow are architecture-agnostic and can work on cpus and/or gpus. however, in this paper, we focus only on the designs and the scale-up/scale-out performance on many-core cpu clusters. we plan to perform in-depth gpu-based hypar-flow studies in the future. we now present correctness-related experiments followed by a comprehensive performance evaluation section. because we propose and design hypar-flow as a new system, it is important to provide confidence to the users that hypar-flow not only offers excellent performance but also trains the model correctly. to this end, we present correctness results based on two types of accuracy-related metrics: 1) train accuracy (train acc), the percentage of correct predictions for the training data during the training process, and 2) test accuracy (test acc), the percentage of correct predictions for the testing data on the trained model. both metrics are covered for small-scale training using vgg-16 on the cifar-10 dataset. we train vgg-16 for 10 epochs using 8 model-partitions on two stampede2 nodes with a batch size of 128 and 16 pipeline stages, as shown in fig. 6(a). next, we show test accuracy for resnet-110-v1 in fig. 6(b) and resnet-1001-v2 in fig. 6(c). the learning rate (lr) schedule was taken from keras applications [1] for both resnet(s) and was kept the same for the sequential as well as the parallel training variants. training for resnet-110 and resnet-1001 was performed for 150 and 50 epochs, respectively.
several training variants (sequential as well as hypar-flow's parallel configurations) have been compared. discussion: clearly, model-parallel training with hypar-flow matches the accuracy of the sequential model over 150 and 50 epochs of training for resnet-110 and resnet-1001, respectively. we note that training is a stochastic process, and there are variations in earlier epochs whether we use the sequential version or the model-parallel version. what matters, however, is the end result, which in this case peaks at 92.5% for all the configurations presented. we ran multiple training jobs to ensure that the trends presented are reproducible. we use the term "process" to refer to a single mpi process in this section. the actual mapping of a process to the compute units (or cores) varies according to the parallelization strategy being used. images/second (or img/sec) is the metric we use for the performance evaluation of the different types of training experiments. the number of images processed by the dnn during training is affected by the depth (number of layers) of the model, the batch size (bs), the image size (w × h), and the number of processes. higher img/sec indicates better performance. some important terms are clarified further: horovod (dp) denotes dnn training using horovod directly (data-parallel); hf (dp) and hf (mp) denote hypar-flow's data-parallel and model-parallel training, respectively. we train various models on a single stampede2 node, a dual-socket xeon skylake with 48 cores and 96 threads (hyper-threading enabled). the default version of tensorflow relies on underlying math libraries like openblas and intel mkl. on intel systems, we tried the intel-optimized version of tensorflow, but it failed with different errors such as "function not implemented". for the amd system, we used the openblas available on the system. both of these platforms offer very slow sequential training. we present single-node results for vgg-16, resnet-110-v1, and resnet-1001-v2. vgg-16 has 16 layers, so it can be split into as many as 16 partitions. we try all possible cases and observe the best performance for num_partitions = 8.
as shown in fig. 7(a), hf (mp) offers better performance for small batch sizes, while hf/horovod (dp) offers better performance for large batch sizes. hf (mp) offers better performance compared to sequential training (1.65× better at bs = 1024) as well as to data-parallel training (1.25× better at bs = 64) for vgg-16 on stampede2. resnet-110-v1 has 110 layers, so we were able to exploit up to 48 model-partitions within the node, as shown in fig. 7(b). we observe the following: 1) hf (mp) is up to 2.1× better than sequential at bs = 1024, 2) hf (mp) is up to 1.6× better than horovod (dp) and hf (dp) at bs = 128, and 3) hf (mp) is 15% slower than hf (dp) at bs = 1024. the results highlight that model-parallelism is better at smaller batch sizes, and data-parallelism is better only when a large batch size is used. figure 8(a) shows that hf (mp) can offer up to 3.2× better performance than sequential training for resnet-110-v1 on epyc (64 cores). epyc offered better scalability with increasing batch sizes compared to stampede2 nodes (fig. 7(b) vs. fig. 8(a)). the performance gains suggest that hf (mp) can better utilize all cores on epyc compared to sequential training. to push the envelope of model depth and stress the proposed hypar-flow system, we also perform experiments for resnet-1001-v2, which has 1,001 layers and approximately 30 million parameters. figure 8(b) shows the performance for resnet-1001-v2. it is interesting to note that data-parallel training performs poorly for this model. this is because the number of parameters increases the synchronization overhead for hf (dp) and horovod (dp) significantly. hence, even for large batch sizes, the computation is not enough to amortize the communication overhead. thus, hf (mp) offers much better performance compared to sequential training (2.4× better at bs = 256) as well as to data-parallel training (1.75× better at bs = 128). two-node results for model-parallelism are presented using vgg-16 and resnet-1001-v2.
figure 9(a) shows the performance trends for vgg-16 training across two nodes. as mentioned earlier, we are only able to achieve good performance with model-parallelism for up to 8 model-partitions for the 16 layers of vgg-16. we also perform experiments with 16 model-partitions but observe performance degradation. this is expected because of the smaller amount of computation per partition and the greater communication overhead in this scenario. we scale resnet-1001-v2 on two nodes using 96 model-partitions in the model-parallelism-only configuration on stampede2. the result is presented in fig. 9(b). we observe that model-parallel hf (mp) training provides a 1.6× speedup (at bs = 256) over hf (dp) and horovod (dp). on the other hand, a data-parallel-only configuration is not able to achieve good performance for resnet-1001 due to the significant communication (allreduce) overhead during gradient aggregation. emerging models like amoebanet [19] are different from vgg and resnet(s). to show the benefit of hypar-flow as a generic system for various types of models, we show the performance of training a 1,381-layer amoebanet variant in fig. 10. we provide results for four different configurations: 1) sequential training using keras and tensorflow on one node, 2) hf (mp) with 4 partitions on one node, 3) hf (mp) with 8 partitions on two nodes, and 4) hf (hp), where hp denotes hybrid parallelism, on two nodes. as shown in fig. 10, we observe that hybrid parallelism offers the best possible performance using the same set of nodes. the most comprehensive coverage of hypar-flow's flexibility, performance, and scalability is presented in fig. 11(a). the figure shows performance for various combinations of hybrid-parallel training of resnet-1001-v2 on 128 stampede2 nodes. the figure has three dimensions: 1) the number of nodes on the x-axis, 2) performance (img/sec) on the y-axis, and 3) the batch size, encoded as the diameter of the circles.
the key takeaway is that hybrid-parallelism lets the user make trade-offs between high throughput (img/sec) and batch size. from an accuracy (convergence) standpoint, the goal is to keep the batch size small so that model updates are more frequent. however, a larger batch size delays synchronization and thus provides higher throughput (img/sec). hypar-flow offers the flexibility to control these two goals via different configurations. for instance, the large blue circle with diagonal lines shows results for 128 nodes using 128 model-replicas, where the model is split into 48 partitions on each single 48-core node. this leads to a batch size of just 32,768, which is 2× smaller than the expected 65,536 if pure data-parallelism is used. it is worth noting that the performance of pure data-parallelism, even with a 2× larger batch size, would still be lower than the hybrid-parallel case, i.e., 793 img/sec (= 6.2 × 128, assuming ideal scaling for the data-parallel case presented earlier in fig. 8(b)) vs. 940 img/sec (observed value, fig. 11(a)). this is a significant benefit of hybrid-parallel training, which is impossible with pure model and/or data parallelism. in addition to this, we also present the largest scale we know of for any model/hybrid-parallel study on the latest frontera system. figure 11(b) shows near-ideal scaling on 512 frontera nodes. effectively, every single core out of the 28,672 cores on these 512 nodes is being utilized by hypar-flow. the resnet-1001 model is split into 56 partitions, as frontera nodes have a dual-socket cascade-lake xeon processor for a total of 56 cores/node. we run one model-replica per node with a batch size of 128. to get the best performance, the number of pipeline stages was tuned, and the best number was found to be 128. today, designers develop models under the restriction of memory consumption.
however, with hypar-flow, this restriction no longer exists, and designers can come up with models with as many layers as needed to achieve the desired accuracy. to illustrate this, we present resnet-5000, an experimental model with 5,000 layers. resnet-5000 is massive and requires a lot of memory, so on a single node we were able to train it with a batch size of 1 only; beyond that, it is not trainable on any existing system. we stress-test hypar-flow by scaling the training of resnet-5000 to two nodes, where we were able to train with bigger batch sizes. we note that investigating the accuracy of resnet-5000 and finding the right set of hyper-parameters is beyond the scope of this paper. the objective is to showcase hypar-flow's ability to deal with models that do not exist today. model- and data-parallelism can be combined in a myriad of ways to realize hybrid-parallel training, e.g., model-parallelism on a single node with multiple cores combined with data-parallelism across nodes. there are non-trivial and model-dependent trade-offs involved when designing hybrid schemes. model-parallelism and data-parallelism have different use cases: model-parallelism is beneficial when we have a large model, or when we want to keep a small effective batch size for training. on the other hand, data-parallelism gives near-linear scale-out on multiple nodes, but it also increases the batch size. in our experiments, we observe that single-node model-parallelism is better than single-node data-parallelism. theoretically, the number of model-partitions cannot be larger than the number of layers in the model; we cannot have more than 110 partitions for resnet-110. in practice, however, one layer per model-partition is rarely useful because it suffers from performance degradation. to conclude, hypar-flow's flexible hybrid-parallelism offers the best of both worlds: we can benefit from both model- and data-parallelism for the same model.
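before summarizing, the batch-size trade-off quoted in the fig. 11(a) discussion can be checked with quick arithmetic (all values taken from the text; the 6.2 img/sec base rate is the single-node data-parallel rate from fig. 8(b)):

```python
# Back-of-the-envelope check of the 128-node hybrid-parallel configuration.

num_replicas = 128        # one model-replica per node
per_replica_batch = 256
hybrid_batch = num_replicas * per_replica_batch  # effective global batch

# ideal scaling of the data-parallel rate vs. the observed hybrid rate
ideal_dp_img_sec = 6.2 * 128
observed_hybrid_img_sec = 940.0
advantage = observed_hybrid_img_sec / ideal_dp_img_sec
```

the hybrid configuration keeps the effective batch at 32,768 while still outperforming even an idealized projection of pure data-parallelism, which is the trade-off the text describes.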
we summarize the key observations below:
- models like resnet-110 offer better performance with model-parallelism at smaller batch sizes (<128).
- newer and very deep models like resnet-1001 benefit from model-parallelism for any batch size (fig. 8(b)).
- hypar-flow's model-parallel training provides up to 3.2× speedup over sequential training and 1.6× speedup over data-parallel training (fig. 8(a)).
- hypar-flow's hybrid-parallel training offers flexible configurations and provides excellent performance for resnet-1001; 110× speedup over single-node training on 128 stampede2 (xeon skylake) nodes (fig. 11(a)).
- hypar-flow's hybrid-parallel training is highly scalable; we scale resnet-1001 to 512 frontera nodes (28,672 cores), as shown in fig. 11(b).
deep learning workloads are going through a rapid change as newer models and larger, more diverse datasets are being developed. this has led to an explosion of software frameworks like tensorflow and approaches like data- and model-parallelism to deal with ever-increasing workloads. in this paper, we explored a new approach to train state-of-the-art dnns and presented hypar-flow: a unified framework that enables user-transparent and parallel training of tensorflow models using multiple parallelization strategies. hypar-flow does not enforce any specific paradigm. it allows programmers to experiment with different parallelization strategies without requiring any changes to the model definition and without the need for any system-specific parallel training code. instead, the hypar-flow trainer and communication engine take care of assigning the partitions to different processes and performing inter-partition and inter-replica communication efficiently. for resnet-1001 training using hypar-flow, we were able to achieve excellent speedups: up to 1.6× over data-parallel training, up to 110× over single-node training on 128 stampede2 nodes, and up to 481× over single-node training on 512 frontera nodes.
we also tested the ability of hypar-flow to train very large experimental models like resnet-5000, which consists of 5,000 layers. we believe that this study paves new ways to design models. we plan to publicly release the hypar-flow system so that the community can use it to develop and train next-generation models on large-scale hpc systems.

references:
- extremely large minibatch sgd: training resnet-50 on imagenet in 15 minutes
- oc-dnn: exploiting advanced unified memory capabilities in cuda 9 and volta gpus for out-of-core dnn training
- s-caffe: co-designing mpi runtimes and caffe for scalable deep learning on modern gpu clusters
- legion: expressing locality and independence with logical regions
- demystifying parallel and distributed deep learning: an in-depth concurrency analysis
- improving strong-scaling of cnn training by exploiting finer-grained parallelism
- integrated model, batch, and domain parallelism in training neural networks
- accurate, large minibatch sgd: training imagenet in 1 hour
- pipedream: fast and efficient pipeline parallel dnn training
- identity mappings in deep residual networks
- gpipe: efficient training of giant neural networks using pipeline parallelism
- beyond data and model parallelism for deep neural networks
- one weird trick for parallelizing convolutional neural networks
- imagenet classification with deep convolutional neural networks
- dragon: breaking gpu memory capacity limits with direct nvm access
- imagenet/resnet-50 training in 224 seconds
- regularized evolution for image classifier architecture search
- mesh-tensorflow: deep learning for supercomputers
- optimizing network performance for distributed dnn training on gpu clusters: imagenet/alexnet training in 1.5 minutes

key: cord-034846-05h2no14
authors: singer, gonen; marudi, matan
title: ordinal decision-tree-based ensemble approaches: the case of controlling the daily local growth rate of the covid-19 epidemic
date: 2020-08-07
journal: entropy (basel)
doi: 10.3390/e22080871
sha: doc_id: 34846 cord_uid:
05h2no14

in this research, we develop ordinal decision-tree-based ensemble approaches in which an objective-based information gain measure is used to select the classifying attributes. we demonstrate the applicability of the approaches using adaboost and random forest algorithms for the task of classifying the regional daily growth factor of the spread of an epidemic based on a variety of explanatory factors. in such an application, some of the potential classification errors could have critical consequences. the classification tool will enable the spread of the epidemic to be tracked and controlled by yielding insights regarding the relationship between local containment measures and the daily growth factor. in order to benefit maximally from a variety of ordinal and non-ordinal algorithms, we also propose an ensemble majority voting approach to combine different algorithms into one model, thereby leveraging the strengths of each algorithm. we perform experiments in which the task is to classify the daily covid-19 growth rate factor based on environmental factors and containment measures for 19 regions of italy. we demonstrate that the ordinal algorithms outperform their non-ordinal counterparts with improvements in the range of 6–25% for a variety of common performance indices. the majority voting approach that combines ordinal and non-ordinal models yields a further improvement of between 3% and 10%.

in epidemiology, mathematical modeling is widely used to predict the transmissibility and dynamic spread of an epidemic, while statistical analysis is often used to evaluate the effect of a variety of variables on epidemic transmission. for the prediction task, the most commonly used mathematical models are those that apply sir/seir (susceptible, (exposed), infectious, and removed) differential equations [1-4]. these studies usually assume available data on the number of susceptible individuals and the numbers of infections, deaths, and recoveries.
recently, several studies have incorporated spatial patterns into epidemiological mathematical models for predicting the spread of an epidemic, by invoking specific assumptions regarding the behavior and location of individuals in the network [5] [6] [7] . research studies that examine the effect of different factors on the spread of an epidemic tend to use statistical analysis techniques such as pearson's correlation coefficient, descriptive statistics, and regression models. mecenas et al. [8] , for example, described 17 recent studies that used these techniques to investigate the effect of weather variables on the spread of covid-19 and sars. in comparison to that for weather variables, the evidence about the effect of environmental factors on the transmission and viability of covid-19 and other epidemics is less conclusive. pedersen and meneghini [9] , for example, explored the effects of containment measures and found that drastic restrictions have reduced the spread of covid-19 modestly but have been insufficient to halt the epidemic. mastrandrea and barrat [10] showed that social interactions shape the patterns of epidemic spreading processes in populations and explored how incomplete data on contact networks affect the prediction of epidemic risk. all of the aforementioned mathematical modeling approaches for spread prediction tend to be predicated on model-specific assumptions and, in general, cannot represent non-linear dynamics or introduce probabilistic variables into the model. on the other hand, the statistical methods used for evaluation of the effect of various factors on transmission are often based on specific types of data, meaning that they cannot identify patterns and relationships among other types of data; in addition, they are not generally suited to dealing with massive data. 
this research study is motivated by the increasing availability of different types of data, as well as the availability of massive data, which naturally lend themselves to data-driven approaches for epidemic spread prediction [11-14]. compared to the research studies described above, which employed known time-series forecasting models or were predicated on a specific model, we propose the use of ordinal classification algorithms for the prediction and evaluation of the epidemic spread. these algorithms are designed to address the aforementioned limitations of previous models. the next section presents a review of classification algorithms and their adaptability for the evaluation of different factors affecting the spread of an epidemic, the goal of which is to predict the daily growth rate factor. classification is one of the most common tasks in machine learning. it is used to identify to which of a set of classes a new observation belongs, based on the values of the explanatory or input variables. in our research, the classification task consists of distinguishing between different levels of a growth rate factor [15] that represents the epidemic spread. the classification is based on the identification of relationships among variables that represent specific conditions and risk factors. for data in which the class attribute exhibits some form of ordering (such as the growth factor level), ordinal classification can be applied, which takes into account the ranking relationship among classes [16]. ordinal problems commonly address real-world applications such as portfolio investment by expected return performance or classification of the severity of disease, in which a classification error could have critical consequences [17-21]. most of these techniques assume monotonicity between the explaining and target attributes [21-25].
several previous research studies have shown that ordinal classifiers yield poor classification accuracy when applied to datasets with high levels of non-monotonic noise [26] . in recent studies, singer et al. and singer and cohen [27, 28] proposed an ordinal classification tree based on a weighted information gain measure. they found it to be effective for classification problems in which the class variable exhibits some form of ordering, and where dependencies between the attributes and the class value can be non-monotonic, as may be the case for the current problem of the control of epidemic spread based on environmental factors. for example, the ordinal attribute "forecast temperature" may have a non-monotonic effect on the growth factor; that is, extreme temperature conditions, either very high or very low, may lead to lower growth factor values, while under "moderate" temperatures, the growth factor may be higher. the weighted information gain measure proposed in these studies takes into consideration the magnitude of the potential classification error, where this error is calculated relative to the value of a specific class of the target. in this research paper, we extend the weighted information gain measure such that the classification error can be measured from a statistical value that is not necessarily defined by a single class-for example, the expected value of all classes. we use the proposed measure to develop ordinal decision-tree-based ensemble approaches, i.e., ordinal adaboost and random forest models, which are known to outperform individual classifiers. we demonstrate that these ordinal decision-tree-based approaches are naturally suited to identifying large numbers of data patterns with complex dependence structures, without requiring a priori assumptions regarding the dependencies within the data. 
thus, the proposed algorithms have the precise characteristics required to address the aforementioned shortcomings of existing approaches for evaluating the effects of different factors on daily and regional growth factors of an epidemic. the main objectives of this study are fourfold: (i) to extend the weighted information gain measure such that the classification error can be measured from a statistical value that is not necessarily defined by a single class; (ii) to develop ordinal decision-tree-based ensemble approaches in which an objective-based information gain measure is used; (iii) to examine the advantage of combining ordinal decision-tree-based ensemble approaches with non-ordinal individual classifiers to leverage the strengths of each type of classifier; and (iv) to examine the ability to carry out multi-class identification of different levels of a daily growth factor using ordinal decision-tree-based ensemble approaches. the remainder of the paper is organized as follows. in section 2, we provide a detailed description of the proposed ordinal decision-tree-based ensemble approaches. section 3 presents numerical experiments for evaluating and benchmarking the proposed algorithms against known non-ordinal algorithms. the input data for these experiments consist of covid-19 growth rate factors, along with environmental factors and containment measures for 19 regions in italy. finally, the conclusions and discussion are presented in section 4. in this section, we begin by presenting an extension to a general version of the objective-based entropy measure proposed by singer et al. and singer and cohen [27, 28] (section 2.1). in section 2.2, we develop ordinal adaboost and random forest algorithms based on ordinal decision tree models that use this measure. further, we propose a majority voting approach based on combined ordinal and non-ordinal algorithms. 
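the majority-voting combination proposed above can be sketched in a few lines of python; note that the tie-breaking rule toward the more severe class is an assumption chosen here to match the conservative decision-making stance described in the paper, not necessarily the authors' exact rule:

```python
# Minimal sketch of majority voting over heterogeneous classifiers.
# Each fitted model contributes one class label; ties are broken toward
# the higher (more severe) growth-factor level (assumed tie-break rule).

from collections import Counter

def majority_vote(predictions):
    """predictions: list of class labels (0 = lowest severity)."""
    counts = Counter(predictions)
    best = max(counts.values())
    # among tied classes, prefer the highest (most severe) label
    return max(c for c, n in counts.items() if n == best)

# e.g. five classifiers (ordinal and non-ordinal) voting over classes {0, 1, 2}
votes = [1, 2, 2, 0, 1]
predicted = majority_vote(votes)
```

in an ensemble of ordinal and non-ordinal models, each model would produce one entry of `votes`, and the combined prediction is whatever class the plurality agrees on.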
we then continue, in section 2.3, by describing the dataset used for classification of the daily covid-19 growth rate factor, which will be used to evaluate the performance of the suggested approaches. the objective-based information gain (obig) is a measure for selecting the attributes with the greatest explanatory value in a decision tree model. the goal of our research is to identify the severity of the epidemic spreading process by classifying the level of the daily growth factor (dgf). the obig is appropriate for this purpose, since it takes into consideration the ordinal nature of the dgf level and the magnitude of the potential classification error. assume that we wish to classify the level of the dgf based on a dataset $D = \{(x_m, y_m),\ m = 1, 2, \ldots, M\}$, where $x_m = [v_{m,1}, v_{m,2}, \ldots, v_{m,K}]$ denotes sample $m$ in the dataset, defined by a vector of values for the $K$ attributes $A = \{a_1, a_2, \ldots, a_K\}$, and $y_m$ denotes the value of the daily growth factor of sample $m$. we categorize the values of $y_m$ into $n$ different severity levels denoted by the random variable $C \in \{c_i,\ i = 1, \ldots, n\}$, where $c_1$ denotes the lowest daily growth rate and $c_n$ the highest. the new discretized daily growth rate variable (the target variable) is denoted by $Y = \{y_m,\ m = 1, 2, \ldots, M\}$. we define values for the different levels of the daily growth rate, $v(c_i),\ i = 1, \ldots, n$, as an increasing function of the severity, such that $v(c_i) < v(c_j)$ for all $i < j$. the precise values of the classes should be assigned by considering the consequences of potential classification errors. for example, assume that we have three levels of daily growth rate, $(c_1, c_2, c_3) = (\mathrm{low}, \mathrm{medium}, \mathrm{high})$, and the decision-makers are conservative regarding restricting the movements of the public, as they wish to prevent medium and high spreading rates of the epidemic.
thus, when a "high" daily growth rate is predicted, the recommendation is global closure, while when a "medium" daily growth is predicted, the recommendation is local closures, which decision-makers believe can also be a highly effective strategy in the case of a high growth rate. it is only when a "low" daily growth rate is predicted that it is considered acceptable to remove all restrictions. the health consequences of predicting a low daily growth rate while the actual growth rate is medium are more severe than those of predicting a medium daily growth rate when the actual growth rate is high. accordingly, we would assign values to the classes such that the relationship between the values of the different levels is v(c_2) − v(c_1) > v(c_3) − v(c_2). the obig uses an objective-based entropy (obe) measure that, like the conventional concept of entropy, measures the randomness and uncertainty of the outcome of a random variable. however, unlike the conventional measure, the obe allocates different weights to the classes as follows: obe(y, v) = −Σ_{i=1}^{n} ω(c_i) p(c_i) log_2 p(c_i), where p(c_i) is the probability that a record belongs to class c_i, and ω(c_i) is the weight of class c_i. in previous studies that used the obe formula [27, 28], the weights assigned to different classes were calculated according to the values and dispersions of the classes with respect to the value of a selected class c_s(y, v) defined by a statistic s. among the selected classes proposed in these studies were the class with the maximum value, c_max = arg max_i v(c_i), and the most probable class in the dataset, c_mode. in this research study, we generalize the obe measure by replacing the value of the selected class, v(c_s(y, v)), with a targeted value, t(y, v). the targeted value need not be related to a selected class, but can reflect a statistical property of the set of classes, such as its expected value, t(y, v) = ev = Σ_{i=1}^{n} p(c_i) v(c_i). specific examples of the targeted value, as presented in previous studies, include t(y, v) = v(c_max) and t(y, v) = v(c_mode).
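as a concrete illustration, the obe with deviation-based weights can be sketched as follows (a minimal python sketch; the function names and the exact handling of the smoothing factor α are our assumptions, not the paper's code):

```python
import math

def class_weights(values, target, alpha=1.0):
    # hypothetical helper: omega(c_i) is the absolute deviation of v(c_i) from the
    # targeted value t(y, v), normalized over all classes; alpha = 1 leaves the
    # normalized deviations unchanged (how alpha smooths the weights is assumed here)
    devs = [abs(v - target) for v in values]
    total = sum(devs)
    return [(d / total) ** (1.0 / alpha) for d in devs]

def obe(probs, weights):
    # objective-based entropy: shannon entropy with per-class weights omega(c_i)
    return -sum(w * p * math.log2(p) for w, p in zip(weights, probs) if p > 0)
```

for three classes with values (1, 2, 3), a target of v(c_max) = 3, and a uniform class distribution, the weights are (2/3, 1/3, 0), so the class closest to the target contributes nothing to the entropy.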
thus, we define the weight of class c_i in this research study as ω(c_i) = α · |v(c_i) − t(y, v)| / Σ_{j=1}^{n} |v(c_j) − t(y, v)|, i.e., the absolute deviation of the value of the ith class, v(c_i), from the targeted value, t(y, v), divided by the sum of the absolute deviations over all possible classes in the dataset. this measure implies that an attribute with a smaller distribution around the targeted value obtains a smaller obe value, which represents a lower risk. the factor α (α > 0) is a normalization factor that smooths the distribution of the weights over the different classes. the objective-based entropy measure is used for the calculation of an objective-based information gain measure, obig_k(d, t), for selecting branching attributes in dataset d in decision tree models by partitioning the records in d over the attribute a_k having n_k distinct values, as follows: obig_k(d, t) = obe(y, t) − Σ_{r=1}^{n_k} (|d_r^k| / |d|) · obe(y_r^k, t), (4) where the second expression on the right-hand side of the equation is the objective-based entropy of a possible partitioning on the attribute a_k. the value |d_r^k| / |d| represents the weight of the rth partition of attribute a_k, i.e., the number of records in d_r^k relative to the total number of records in d, and obe(y_r^k, t) represents the objective-based entropy of the sub-dataset y_r^k ⊆ y. similar to the conventional information gain measure, the objective-based information gain is overly sensitive to the number of distinct values n_k; thus, it should be normalized (at least for some algorithms) when there is a large variance in the number of values for different attributes [29] [30] [31]. in our research, we used the cart model as well as ordinal decision-tree-based ensemble approaches that are composed of cart models. given that these algorithms branch via binary splitting at each node of the tree (n_k = 2, ∀k), we used the obig measure in equation (4) without normalization. the attribute a_k with the highest (non-negative) weighted information gain was selected as the branching attribute of the node.
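the gain computation of equation (4) can be sketched for a candidate split as follows (a minimal python sketch under the assumption that each partition is summarized by its record count and class distribution; names are ours):

```python
import math

def obe(probs, weights):
    # objective-based entropy of a class distribution, with per-class weights omega(c_i)
    return -sum(w * p * math.log2(p) for w, p in zip(weights, probs) if p > 0)

def obig(parent_probs, splits, weights):
    # objective-based information gain of a candidate split: the obe of the parent
    # node minus the size-weighted obe of the sub-datasets y_r^k produced by a_k;
    # splits is a list of (n_records, class_probs) pairs, one per partition
    n = sum(size for size, _ in splits)
    children = sum((size / n) * obe(probs, weights) for size, probs in splits)
    return obe(parent_probs, weights) - children
```

a split whose partitions are all pure yields a gain equal to the parent node's obe, while a split that reproduces the parent distribution in every partition yields zero gain.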
ensemble methods attempt to overcome the bias or variance effects of individual classifiers by combining several of them together [32] , thus achieving better performance [33, 34] . despite the fact that ensemble methods for nominal classification have received considerable attention in the literature and have been chosen in preference to single individual learning algorithms for many classification tasks, the use of ordinal classification in ensemble algorithms has rarely been discussed [35] . in the literature, the most widely used ensemble methods are categorized into four techniques: bagging, boosting, stacking, and voting [35, 36] . in this research, we suggest integrating the objective-based information gain into ensemble methods to enable the use of ordinal classification in ensemble algorithms. specifically, in section 2.2.1, we propose an ordinal random forest (rf) algorithm to implement ordinal classification in an ensemble approach based on the bagging technique. in section 2.2.2, we propose an ordinal adaboost algorithm, which is based on the boosting technique. in section 2.2.3, we propose a majority voting approach based on a combination of non-ordinal and ordinal decision-tree-based models. the random forest algorithm is a bagging method in which we create random sub-samples of our dataset with replacement, and we train decision trees on each sample. since, in each sub-sample, the gain of each attribute with respect to the target variable may be similar, the different decision trees may have considerable structural similarity. thus, in order to ensure that the trees are less correlated, while still ensuring that the samples are chosen randomly, the algorithm samples over the attributes in each node and uses only a random subset of them to choose the variable to split on, which reduces the similarities between different decision trees. 
a random forest built from several learners can significantly reduce the variance, i.e., the tendency of a model to fit the training data well but perform poorly on testing data (called overfitting), which usually results from the model's high complexity; thus, the rf approach is most suitable for combining individual learners that each suffer from large variance. indeed, several studies have shown that random forests yield accurate and robust classification results when high variance exists [37] [38] [39]. figure 1 presents the pseudocode for the construction of the random forest algorithm, as well as for the classification phase. in the construction phase, the random forest is built from l decision trees {g_l, l = 1, 2, . . . , l}, each trained on the lth bootstrap sample of the dataset, d_l ⊂ d. for each node in the tree, we randomly select a subset of the attributes in a, and the attribute with the highest objective-based information gain is selected as the splitting attribute using the procedure obig(d, a, t). in the classification phase, for each new instance x given to the decision tree classifiers as an input, the output g_l(x) = c_i is returned by each model, and the class with the highest number of votes is chosen as follows: ŷ = arg max_{c_i} Σ_{l=1}^{L} θ_l(x), where θ_l(x) = 1 if g_l(x) = c_i and 0 otherwise. in the case where two classes receive the same number of votes, the algorithm chooses one of them at random. the adaboost algorithm is an adaptive boosting method in the sense that subsequent weak classifiers are tweaked in favor of those instances misclassified by previous classifiers, i.e., the classifiers are built sequentially and not in parallel as in the random forest algorithm. each subsequent classifier focuses on previous "harder-to-classify" instances by increasing their weights in the dataset. in our example, we use 1-level decision trees based on the obig measure as the weak classifiers (low-complexity models).
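before turning to the details of adaboost, the random-forest classification phase described above (majority vote with random tie-breaking) can be sketched as follows (the fitted trees are represented as plain callables; the function name is ours):

```python
import random
from collections import Counter

def rf_predict(trees, x, rng=None):
    # each fitted tree g_l casts one vote for a class; the class with the most
    # votes wins, and ties are broken uniformly at random
    rng = rng or random.Random(0)
    votes = Counter(tree(x) for tree in trees)
    top = max(votes.values())
    return rng.choice(sorted(c for c, v in votes.items() if v == top))
```

with votes ("c1", "c1", "c2") the prediction is "c1"; only when two classes tie on the vote count does the random tie-break matter.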
adaboost can significantly reduce the bias, i.e., the difference between the predicted value and the true target value in the training data (called "underfitting"), which usually occurs due to low complexity of the model. like the random forest approach, adaboost reaches a classification by applying multiple decision trees to every sample and combining the predictions made by the individual trees. however, rather than selecting the class with the majority vote among the decision trees, as in random forest, in the adaboost algorithm, every decision tree contributes to a varying degree to the final prediction according to the incorrectly classified samples (as described below). several studies have shown that adaboost yields accurate and robust results when high bias exists [40] [41] [42]. the algorithm maintains a weight distribution over the samples that is updated in each iteration, such that samples that were misclassified by previous decision trees have higher weights. we begin in the first step, l = 1, by building a decision tree under the assumption of identical weights for the m samples in the dataset, d_m^1 = 1/m, m = 1, . . . , m.
at the end of each iteration, we calculate the classifier error and use it to update the samples' weights for the subsequent iteration and to determine the weight of the current decision tree in the final classification. the error of the classifier is calculated as ε_l = Σ_{m=1}^{M} d_m^l e_m^l, where e_m^l is the error function, which is defined in the adaboost algorithm by e_m^l = 1 if g_l(x_m) ≠ y_m and e_m^l = 0 otherwise. (7) thus, the error is assigned a value of 1 when the predicted value g_l(x_m) of a sample x_m is different from the real value, and 0 otherwise. note that we require 0 ≤ ε_l < 1 − 1/n, such that if the constraint does not hold, we stop generating decision trees. the expression 1 − 1/n reflects the error of a naïve classifier (i.e., random choice of a class) assuming equal probabilities for the classes. note that equation (7) assumes the same error value of 1 for each incorrectly classified sample, regardless of the magnitude of the classification error. thus, an alternative error function that considers the magnitude of the classification error would be e_m^l = |v(g_l(x_m)) − v(y_m)| / max_{i,j} |v(c_i) − v(c_j)|, such that the error value is the difference between the values of the predicted and actual classes divided by the maximum difference between two different class values, where the denominator serves to normalize the error to the range 0 ≤ e_m^l ≤ 1. as mentioned, we use the error ε_l to calculate the decision tree weight in the final classification, which is achieved as follows: w_l = log((1 − ε_l)/ε_l) + log(n − 1), where w_l > 0, and its value increases as the classifier's error decreases, while w_l → 0+ when ε_l → 1 − 1/n, which means that the weight of a classifier tends to zero as its error tends to the naïve classifier's error. we also use w_l to update the samples' weights, such that misclassified samples obtain higher weights, as follows: d_m^{l+1} = d_m^l e^{w_l e_m^l} / z_l, where z_l = Σ_{m=1}^{M} d_m^l e^{w_l e_m^l} is a normalization constant such that the new sample weights sum up to 1.
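a single boosting iteration (the classifier error ε_l, the tree weight w_l, and the renormalized sample weights) can be sketched as follows; the classifier-weight formula shown is the standard multi-class boosting weight consistent with the limits stated in the text, and the function name and the handling of ε_l = 0 are our assumptions:

```python
import math

def boost_step(d, errors, n):
    # d: current sample weights d_m^l (summing to 1); errors: e_m^l in [0, 1];
    # n: number of classes. returns (eps_l, w_l, next sample weights). w_l is None
    # when 0 < eps_l < 1 - 1/n fails, i.e., the tree is perfect (weights unchanged)
    # or no better than a naive classifier (stop generating trees)
    eps = sum(dm * em for dm, em in zip(d, errors))
    if not (0 < eps < 1 - 1 / n):
        return eps, None, d
    w = math.log((1 - eps) / eps) + math.log(n - 1)   # w_l -> 0+ as eps -> 1 - 1/n
    new_d = [dm * math.exp(w * em) for dm, em in zip(d, errors)]
    z = sum(new_d)                                    # normalization constant z_l
    return eps, w, [dm / z for dm in new_d]
```

with four equally weighted samples, three classes, and one misclassified sample, ε_l = 0.25, w_l = ln 6, and the misclassified sample's weight grows from 0.25 to 2/3 after renormalization.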
note that when the error of the classifier tends to the naïve classifier error, ε_l → 1 − 1/n (i.e., w_l → 0+), or when the error is equal to zero, ε_l = 0 (i.e., all samples are classified correctly), the weights of the samples do not change compared to the previous iteration, up to a constant. in any other case, d_m^l decreases for samples that were correctly classified. in the classification phase, for each new instance x given to the decision tree models as an input, the output g_l(x) = c_i is returned by each model and contributes w_l to the final prediction as follows: ŷ = arg max_{c_i} Σ_{l=1}^{L} w_l θ_l(x), where θ_l(x) is defined in equation (5). in order to benefit maximally from the various ordinal decision-tree-based algorithms (each with its own objective-based entropy measure), as well as from non-ordinal algorithms, we propose a simple ensemble approach, which, in theory, should leverage the strengths of each individual classifier [43]. assume that we wish to build j classifiers {g_j, j = 1, 2, . . . , j}. for each new sample x given to classifier j as an input, the output ŷ_j is returned using the classification procedure of the relevant model, and the class with the highest number of votes is chosen as follows: ŷ = arg max_{c_i} Σ_{j=1}^{J} i(ŷ_j = c_i), where i(·) is the indicator function. we constructed a covid-19 dataset (see appendix a) to evaluate the newly developed algorithms and to compare them with their non-ordinal counterparts and other conventional algorithms. the dataset is based on 43 features, created from three different types of daily data relating to 19 regions of italy between 7th march and 1st april 2020. (the region and the date are two additional features of the dataset, which we refer to as "key features").
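the adaboost classification phase described above, in which each tree's vote is counted with its weight w_l, can be sketched as follows (trees as plain callables; the function name is ours):

```python
def boosted_predict(trees, tree_weights, x, classes):
    # each tree's vote for a class is counted with its weight w_l; the class
    # with the highest weighted vote total is the final prediction y-hat
    score = {c: 0.0 for c in classes}
    for tree, w in zip(trees, tree_weights):
        score[tree(x)] += w
    return max(score, key=score.get)
```

unlike the plain majority vote of the random forest, a single accurate tree with a large w_l can outvote several weaker trees.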
the three types of data are as follows: (i) covid-19 patient data [44], which consists of 13 numerical features (e.g., number of new positive cases, number of tests performed); (ii) weather data [45], comprising 15 numerical features (e.g., temperature, humidity, pressure, wind speed); and (iii) containment and mitigation measures data [46], which includes 15 boolean features (e.g., regulations regarding outdoor gatherings, public transport cleaning, mass isolation, and school closure). our target value is the daily growth factor (dgf_t) of positive cases o_t, normalized by the number of tests performed, η_t, relative to the corresponding values for the previous day: dgf_t = (o_t / η_t) / (o_{t−1} / η_{t−1}). daily growth rates equal to zero for specific regions and dates were removed from the dataset, since they usually reflect pre-covid-19-spread periods for these regions. in order to prepare the data for classification, we discretized the dgf into three different levels, denoted by c = {c_1, c_2, c_3}, where c_1 represents negative growth and is defined as dgf < 0.9 (43% of cases), c_2 represents linear growth and is defined as dgf ∈ [0.9, 1.1] (23% of cases), and c_3 represents exponential growth and is defined as dgf > 1.1 (34% of cases). we denote the values for the different levels of the daily growth rate by v(c_i) = i. during this experiment, we aimed to predict at time τ = t the growth rate per region 6 days later (i.e., at time τ = t + 6), based on (a) the last 6 days of covid-19 patient data (i.e., data corresponding to τ ∈ [t − 5, t]); (b) the subsequent 5 days of weather data, as forecast at time t (i.e., τ ∈ [t + 1, t + 5]); and (c) the containment measures at time t. we assumed an incubation period of up to 5 days and the ability to forecast the weather 5 days in advance with high accuracy.
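the target construction above, the test-normalized daily growth factor and its discretization into the three ordinal levels, can be sketched as follows (function names are ours):

```python
def dgf(pos_t, tests_t, pos_prev, tests_prev):
    # daily growth factor: new positives normalized by tests performed,
    # relative to the corresponding normalized value for the previous day
    return (pos_t / tests_t) / (pos_prev / tests_prev)

def dgf_level(g):
    # discretization used in the paper: c1 negative growth (dgf < 0.9),
    # c2 linear growth (0.9 <= dgf <= 1.1), c3 exponential growth (dgf > 1.1)
    if g < 0.9:
        return "c1"
    if g <= 1.1:
        return "c2"
    return "c3"
```

for example, 120 positives out of 1000 tests today against 100 positives out of 1000 tests yesterday gives dgf = 1.2, i.e., level c3.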
we further assumed that only the containment measures for the current date were known (we observed from the data that, in most cases, the containment measures did not change during the following 5 days). after the preprocessing stage, the final dataset that was used as an input for the classification algorithms, d = {(x_m, y_m), m = 1, 2, . . . , m}, consisted of m = 463 samples and k = 161 attributes, a = {a_1, a_2, . . . , a_k}. for model evaluation, the data were split into a training dataset, corresponding to data for 7 march to 29 march (80% of the data), and a testing dataset of data for 28 march to 1 april (20% of the data). in that way, the model was trained on past data and then validated on future data that it had not previously encountered. this subsection compares the performance of the obig-based ordinal cart, i.e., a single decision tree, with the popular non-ordinal cart. four different versions of the ordinal cart are evaluated, corresponding to four different targeted values, t(y, v) ∈ {v(c_max), v(c_mode), v(c_min), ev}. for benchmarking purposes, the performance of the ordinal and non-ordinal classifiers was computed using three performance measures for multi-class classification: f-score, accuracy, and area under the curve (auc) [47]. additionally, we used the mean square error (mse) and kendall's correlation coefficient, τ_b, which are acceptable performance measures for ordinal classification [27, 48]. the best performance values are highlighted in bold in table 1. the following insights can be gleaned from the table: (1) the ordinal cart algorithms based on v(c_max) and v(c_mode) yielded better performance than the regular non-ordinal cart in four out of five indices, while the versions based on ev and v(c_min) were superior for all indices. (2) the common classification performance measures (f-score, accuracy, and auc) were between 9% and 17% higher for the cart based on ev than for the non-ordinal cart.
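the date-based split described above (train on the past, validate on the future) can be sketched as follows (a minimal helper; the tuple layout of the samples is our assumption):

```python
from datetime import date

def time_split(samples, cutoff):
    # samples: (sample_date, x, y) tuples; training uses dates up to the cutoff,
    # testing uses strictly later dates, so the model never sees its test period
    train = [s for s in samples if s[0] <= cutoff]
    test = [s for s in samples if s[0] > cutoff]
    return train, test
```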
a similar improvement in performance is seen for mse, while the improvement for τ_b is much higher. figure 3 illustrates the auc values obtained for each growth factor level, as well as the global auc, for the two best ordinal models and the conventional non-ordinal cart model. it can be seen that the ordinal cart models yielded significantly better results for all classes than the non-ordinal classifier. the improvement in performance for the best ordinal cart model ranges from 13% to 19% (for the three individual levels and the global average). this subsection reports the performance of the obig-based ordinal adaboost and random forest algorithms (tables 2 and 3, respectively) and compares them with their non-ordinal ensemble counterparts. the best performance values in each table are highlighted in bold. we observe the following: (1) all ordinal adaboost algorithms except for the one based on obe(v(c_mode)) achieved better performance than the conventional adaboost with respect to all five indices. (2) all ordinal random forest algorithms outperformed their non-ordinal counterpart for all indices, with the exception of the ordinal random forest based on obe(ev), which yielded a lower value for the kendall index. (3) the ordinal adaboost model based on obe(v(c_max)) and the ordinal random forest model based on obe(v(c_mode)) yielded the best performance for all five indices.
(4) the ranges of improvement of the best ordinal adaboost and random forest classifiers relative to their non-ordinal counterparts for the classification measures f-score, accuracy, and auc are 14-25% and 6-21%, respectively. the improvement in performance is similar for mse and considerably higher for τ_b. table 2. performance measures for the classification of the daily growth factor using ordinal and non-ordinal adaboost models. table 3. performance measures for the classification of the daily growth factor using ordinal and non-ordinal random forest models. figure 4 illustrates the auc values obtained for each growth factor level for the best ordinal adaboost and random forest classifiers compared to their conventional non-ordinal counterparts. it can be seen that the ordinal models yielded significantly better results for all classes except for the case of linear growth in the adaboost models. however, errors involving the fringe (extreme) classes usually lead to more serious consequences; thus, in ordinal problems, it is more important to achieve better performance for these classes. in this section, we evaluate whether the differences between the predictions of some of the best ordinal classifiers and their non-ordinal counterparts are significant.
this analysis was carried out for each of the three types of classifier (cart, adaboost, and random forest) by conducting a two-tailed paired-samples t-test using pairs of predictions for each instance in the testing dataset. table 4 summarizes the results, showing that the difference was found to be significant for the ordinal decision-tree-based ensemble approaches (p < 0.05). table 4. paired t-test results for the significance of the difference in the predictions of the ordinal classifiers and their non-ordinal counterparts (ordinal cart vs. cart; ordinal adaboost vs. adaboost; ordinal random forest vs. random forest); the p-values relative to the non-ordinal counterpart are 0.14 (cart), 0.0057 (adaboost), and 0.0015 (random forest). figures 5-7 present the distributions of the errors (the actual class values minus the predicted class values) for the ordinal classifiers compared to their non-ordinal counterparts. it can be seen that the frequencies of zero-error cases for the ordinal algorithms were substantially higher than for their non-ordinal counterparts.
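the significance test above reduces to a paired-samples t statistic computed on the per-instance prediction pairs; a stdlib-only sketch (the two-tailed p-value would then be read from a t distribution with m − 1 degrees of freedom, e.g. via scipy.stats.ttest_rel):

```python
import math
import statistics

def paired_t(a, b):
    # paired-samples t statistic: the mean of the per-instance differences
    # divided by the standard error of those differences
    diffs = [x - y for x, y in zip(a, b)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))
```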
furthermore, the frequency of errors with a value of −2 (i.e., the prediction of an exponential growth factor when the actual growth factor was negative) was higher in all non-ordinal models than in the corresponding ordinal models, with the difference being at least a factor of 2 in the case of the cart and random forest models. for benchmarking purposes, in this subsection we compare the best ordinal classifiers from the previous analysis with a number of popular non-ordinal classifiers. table 5 presents the performance measures for eleven individual classifiers (eight non-ordinal classifiers and three ordinal), with the two best performance values highlighted in bold. the following insights can be gleaned from the table: (1) the ordinal adaboost yielded the best results for f-score, accuracy, and auc, and is among the two best results for mse and τ_b. overall, it appears to be the best among the individual classifiers.
(2) logistic regression and ordinal random forest are among the two best classifiers with respect to two indices. in this subsection, we apply a majority voting ensemble approach based on ordinal and non-ordinal classifiers. we combined the ordinal adaboost classifiers with targets t(y, v) ∈ {v(c_max), ev}, which were found to be the best adaboost classifiers in the previous analyses, with the best non-ordinal classifier, logistic regression. the performance of the majority voting model is reported in the first column of table 6. for comparison, we also show the performance of two individual classifiers: logistic regression (column 2) and the best ordinal classifier, adaboost based on v(c_max) (column 3). the best results are indicated in bold. the majority voting model outperformed the individual classifiers for all five indices, with the ranges of improvement for accuracy, auc, and f-score being 3-6% (relative to the ordinal classifier) and 6-10% (relative to the non-ordinal classifier). in this research, we suggest an extension to the objective-based information gain (obig) measure that was proposed in [27, 28] for selecting the attributes with the greatest explanatory value in a classification problem. in these studies, the weights assigned to different classes were calculated with respect to the value of a selected class. in the present study, we introduced a general targeted value function that is not necessarily related to a specific class. based on the extended obig measure, we proposed novel, ensemble-based ordinal approaches, i.e., (1) ordinal adaboost and ordinal random forest decision tree models and (2) a majority voting approach that combines these models together with conventional non-ordinal algorithms. we demonstrated how the ensemble ordinal approaches may be implemented to evaluate the effect of different factors on the level of the regional daily growth factor (dgf) of the spread of an epidemic in order to yield a classification value.
the construction of the proposed models considers the magnitude of the classification error of the daily growth factor. the classification tool will enable the spreading process to be tracked and controlled, as the models can yield insights regarding the link between local containment measures and the dgf. we evaluated the performance of the suggested approaches for classification of the daily covid-19 growth rate factor in 19 regions of italy. a comparison of each of the ordinal models with its conventional, non-ordinal counterpart demonstrated that the proposed models are superior based on a variety of common performance metrics for both conventional and ordinal classification problems. specifically, the best individual ordinal cart model yielded a 9-17% improvement when compared to the conventional cart model for three common indices for conventional classification problems: f-score, accuracy, and auc. for the same indices, the best ordinal adaboost model yielded a 14-25% improvement when compared to the conventional adaboost model, and the ordinal random forest model yielded a 6-21% improvement when compared to its non-ordinal counterpart. a similar level of improvement was observed for one of the performance measures designed for ordinal classification (mse), while the second such measure (kendall's correlation coefficient) showed much greater improvement. furthermore, the ordinal adaboost was shown to be the best individual classifier when compared to all other ordinal classifiers and eight popular non-ordinal classifiers. finally, we investigated a majority voting approach that combines ordinal and non-ordinal classifiers. this ensemble approach achieved better performance in all indices than the best individual ordinal and non-ordinal classifiers, with a level of improvement of 3-10% in all indices. 
the level of improvement offered by the proposed ordinal approaches relative to their non-ordinal counterparts suggests that these approaches show promise for classification of the regional daily growth factor level in the spread of an epidemic, which is an ordinal target problem with no monotonic constraints on the explaining attributes. however, despite the fact that in our experiment, the relative improvement between the ordinal and the non-ordinal models is systematic, the experiment was performed only on a single dataset with a relatively low number of instances (which limits the prediction performance of the models), while ensemble models are mostly suitable for datasets with large numbers of instances. in conclusion, the main findings of this study are as follows. first, the ordinal decision-tree-based ensemble approaches yielded better classification results than their non-ordinal counterparts, and the best ordinal classifier outperformed eight popular non-ordinal classifiers. second, when implementing an ensemble approach by combining two ordinal decision tree algorithms with a non-ordinal algorithm, the classification performance is improved even further. third, the proposed approaches are suitable for carrying out multi-class identification of different levels of the daily growth factor rate. future research could apply the suggested ordinal algorithms to other datasets, including datasets with larger numbers of instances, to verify the robustness of the algorithms with respect to different settings. future studies could also consider integrating the magnitude of the classification error into other boosting ensemble methods, such as gradient boosting or xgboost algorithms. furthermore, it could be interesting to examine the performance of other ensemble approaches, such as stacking or soft voting. the authors declare no conflict of interest. 
modified seir and ai prediction of the epidemics trend of covid-19 in china under public health interventions phase-adjusted estimation of the number of coronavirus disease a mathematical model for simulating the phase-based transmissibility of a novel coronavirus a mathematical model for simulating the transmission of wuhan novel coronavirus adequacy of seir models when epidemics have spatial structure: ebola in sierra leone spatial spread of the west africa ebola epidemic panmictic and clonal evolution on a single patchy resource produces polymorphic foraging guilds effects of temperature and humidity on the spread of covid-19: a systematic review quantifying undetected covid-19 cases and effects of containment measures in italy how to estimate epidemic risk from incomplete contact diaries data? a unified framework of epidemic spreading prediction by empirical mode decomposition-based ensemble learning techniques learning node representations through epidemic dynamics on networks predicting the epidemic potential and global diffusion of mosquito-borne diseases using machine learning employing machine learning techniques for the malaria epidemic prediction in ethiopia estimating epidemic exponential growth rate and basic reproduction number a simple approach to ordinal classification evaluation methods for ordinal classification measuring the performance of ordinal classification cautious ordinal classification by binary decomposition ordinal regression methods: survey and experimental study rulem: a novel heuristic rule learning approach for ordinal classification with monotonicity constraints learning and classification of monotonic ordinal concepts monotonicity maintenance in information-theoretic machine learning algorithms monotone classification with decision trees monotonic classification extreme learning machine adding monotonicity to learning algorithms may impair their accuracy a weighted information-gain measure for ordinal classification trees an 
key: cord-129272-p1jeiljo authors: broniec, william; an, sungeun; rugaber, spencer; goel, ashok k. title: using vera to explain the impact of social distancing on the spread of covid-19 date: 2020-03-30 journal: nan doi: nan sha: doc_id: 129272 cord_uid: p1jeiljo covid-19 continues to spread across the country and around the world. 
current strategies for managing the spread of covid-19 include social distancing. we present vera, an interactive ai tool, that first enables users to specify conceptual models of the impact of social distancing on the spread of covid-19. then, vera automatically spawns agent-based simulations from the conceptual models, and, given a data set, automatically fills in the values of the simulation parameters from the data. next, the user can view the simulation results, and, if needed, revise the simulation parameters and run another experimental trial, or build an alternative conceptual model. we describe the use of vera to develop a sir model for the spread of covid-19 and its relationship with healthcare capacity. newspaper articles in recent weeks have been filled with stories about the spread of covid-19 across the country and around the world. as the virus continues to spread, various countries are both searching for pharmaceutical mechanisms to combat the disease (such as vaccines and cures) and adopting social strategies for preventing or mitigating its spread (such as social distancing). the mix of mitigation strategies seems to vary among countries depending on factors such as demographics, culture, technology, economic resources, healthcare capacity, and leadership (mccurry, ratcliffe & davidson 2020). on march 14, 2020, the washington post published an influential article describing how different levels of social distancing may affect the spread of the virus using simple agent-based simulations (stevens 2020). figure 1 illustrates the effect. the curve in red shows a rapid increase and decrease in the spread of the virus in a relatively short time period if no precautions are taken. the curve in yellow shows a slower increase and decrease over a relatively longer period if social distancing is practiced. the idea is that a country may adopt a social distancing strategy to "flatten the curve" so that the healthcare capacity of the country is not overwhelmed. 
since then, several other studies have established this point more firmly, including a detailed study from imperial college london (ferguson et al. 2020). in this article, we describe vera_epidemiology (or just vera for short), an interactive ai tool that enables users to build conceptual models of the impact of social distancing on covid-19. unlike some of the other simulations, vera enables the user to explicitly specify the conceptual model in a visual language and automatically spawns agent-based simulations from the conceptual models. given a set of data, vera automatically extracts initial values for the simulation parameters from the dataset, and it also allows the user to interactively revise the parameter values. the user can view the simulation results, and, if needed, revise the simulation parameters and run another experimental trial, or build an alternative conceptual model. we demonstrate the model development process through comparative models of the impact of social distancing using the johns hopkins university dataset (cssegisanddata, 2020). we describe the use of vera to develop a sir model for the spread of covid-19 and its relationship with healthcare capacity. the results show gradual "flattening of the curve" as increasingly intense social distancing strategies are implemented. more importantly, vera acts like a virtual laboratory to conduct "what if" experiments with different sir models without requiring any knowledge of mathematical equations or computer programming. the virtual experimentation research assistant (vera) is a web application that enables users to construct conceptual models of complex systems and run model simulations. the original vera system operated in the domain of ecological systems and has been extensively used for learning and education in ecology (an et al. 2018; http://vera.cc.gatech.edu). the vera system described here is an adaptation of the original vera for ecological modeling. 
we illustrate the use of vera to create three models that form a series of increasingly intense social distancing policies (see figure 2). we use two parameters to control the levels of social distancing: interaction probability and adoption/transmission interval. these two parameters are adjusted incrementally from model 1 to model 3 to represent increasing levels of social distancing, while the covid-19 spread component itself remains intact. vera uses conceptual models to enable users to visually express the components and relationships of a system. figure 3 illustrates a screenshot from vera's editor for constructing conceptual models in epidemiology. the left panel contains a palette for adding different types of components to the model. the grid panel in the center is where the conceptual model is assembled. the right panel depicts model parameters and their initial values for the simulation of the conceptual model. note the visual nature of the conceptual models in vera. the conceptual models in figure 2 illustrate an interaction between social distancing and covid-19 cases. vera translates qualitative conceptual models into quantitative values and equations to generate agent-based simulations. vera currently uses the following parameters specific to epidemiology: starting count, duration, adoption/transmission count, adoption/transmission onset, and adoption/transmission interval. vera facilitates describing the simulation parameter values in two ways. first, vera enables users to import data from external sources and then uses machine learning techniques for abstracting initial simulation parameter values from the data. second, it enables users to interactively set the parameter values. table 1 shows the increasing interaction probability and adoption/transmission interval parameter values used to represent the increasing levels of social distancing. 
the values of the simulation parameters of the covid-19 cases component in figure 4 were automatically filled in by applying machine learning techniques on the imported dataset (cssegisanddata, 2020). figure 4: the simulation parameters of the covid-19 cases components. vera uses an off-the-shelf agent-based simulation system called netlogo (http://ccl.northwestern.edu/netlogo/). running the simulation enables the user to observe the evolution of the system variables over time and iterate through the generate-evaluate-revise loop. figure 5 shows the simulation results generated from the model in figure 4 using different parameter values, as indicated in table 1. the first model shows one future progression where the rate spikes early, but there is a sharp decline. this can be compared to south korea's handling of covid-19 spread, where medical technology and mass screenings drastically limited the spread. the primary concern here is the capacity of the healthcare system, where demand can easily exceed supply. the next graph shows a model of moderate social distancing. in this case, the peak occurs later, does not reach as high, and stretches the pandemic out for a longer period. such approaches are straining the healthcare capacity of many european nations now and may not be enough. the final model shows the results of intense social distancing on infection rates. these strong lifestyle changes drastically reduce the number of potential avenues for disease spread, and the resulting spread of infections occurs much slower, providing time, resources, and planning for a society to operate in the times of a pandemic. now that we have illustrated the core techniques in vera, we describe the use of vera to develop the sir model for understanding the relationship between social distancing and the spread of covid-19. the sir model (kermack & mckendrick 1927) is a commonly used mathematical model to understand the spread of infectious diseases. 
we show just how important adjusting behavior can be in shaping the outcome of a pandemic. the strain on the healthcare infrastructure has sweeping consequences: reduced availability of resources prevents some people from receiving adequate treatment, and even those with conditions other than covid-19 will still see negative impacts as these resources become scarcer. the conceptual model in figure 5 illustrates the sir model of disease spread. the conceptual model consists of three components that each represent a number of individuals (the susceptible, the infected, and the recovered) as well as the healthcare capacity. the susceptible population becomes the infected population, and the infected population becomes the recovered population. figure 5: the sir conceptual model in vera. for the sir model, we use a different set of parameters than for the simple model described in the previous section. the susceptible, infected and recovered components have starting populations as the simulation parameter. the healthcare capacity component has capacity as the simulation parameter. the become relationship has two simulation parameters: average contacts per day per person and transmission likelihood. the recover relationship has an average recovery time parameter (see table 2 for descriptions). average contacts per day per person describes how many people the average person comes in contact with on a given day in this simulation; transmission likelihood describes the probability of the disease transferring from an infected person to a susceptible person that they have come in contact with; average recovery time describes how long it takes on average to recover from the disease. figure 6 shows the simulation results on a sample population of 10,000 people with two different average contacts per day per person. the first model (top) shows the simulation results with 16 average contacts per day per person. 
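the sir dynamics and parameterization described above can be sketched numerically with a minimal euler-stepped sir loop in python. here beta is derived as average contacts per day times transmission likelihood, and gamma as the reciprocal of the average recovery time; the transmission likelihood of 0.025, 14-day recovery time and 2,000-bed capacity are assumed values for illustration, not figures from the paper.

```python
def simulate_sir(n, contacts_per_day, transmission_likelihood,
                 recovery_days, steps=200):
    """discrete-time (1-day step) sir; returns the infected time series."""
    beta = contacts_per_day * transmission_likelihood  # new infections per infected per day
    gamma = 1.0 / recovery_days                        # daily recovery rate
    s, i, r = n - 1.0, 1.0, 0.0
    infected = [i]
    for _ in range(steps):
        new_infections = beta * s * i / n
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        infected.append(i)
    return infected

def first_day_over(series, capacity):
    """first simulated day on which infections exceed healthcare capacity."""
    return next(day for day, i in enumerate(series) if i > capacity)

n, capacity = 10_000, 2_000          # capacity value assumed for illustration
open_contact = simulate_sir(n, 16, 0.025, 14)
distanced    = simulate_sir(n, 12, 0.025, 14)

# fewer daily contacts -> lower peak that exceeds capacity later,
# mirroring the comparison between the two figure 6 runs
assert max(distanced) < max(open_contact)
assert first_day_over(distanced, capacity) > first_day_over(open_contact, capacity)
```

as in the simulations discussed here, reducing average contacts per day lowers and delays the peak; a user can lower the contact rate further until the infected series never crosses the capacity line.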
the red series corresponds to the number of infected individuals, and it exceeds the healthcare capacity of the system very early under these conditions. the second model (bottom) shows the simulation results with 12 average contacts per day per person. reducing the average contacts per day per person suggests that people are reducing social contact somewhat, but not substantially. compared to the prior graph, the peak is closer to 7,000 than 8,000, and healthcare capacity is exceeded after 20 days in the simulation, rather than around 15 days, a step in a positive direction. the users can experiment with other values of average contacts per day per person until they find a scenario where infections do not exceed healthcare capacity. in this article, we used vera to develop sir models for the spread of covid-19. these models not only express the impact of social distancing on the spread of the disease but also the management of the impact of the disease on the healthcare capacity. we note, however, that vera supports conceptual modeling, not mathematical modeling, and that its models are more explanatory of general patterns than predictive of specific outcomes. vera (1) enables the user to explicitly specify the conceptual model in a visual language, (2) automatically spawns agent-based simulations from the conceptual models, (3) automatically extracts initial values for the simulation parameters from a given dataset, and (4) supports the user through the whole cycle of model generation, evaluation and revision. thus, vera provides a virtual laboratory: the user can try out a variety of conceptual models and simulation parameters, and conduct "what if" virtual experiments. we posit that this has significant implications for learning and education, for example, informal learning by the citizens of the world. references: vera: popularizing science through ai; novel coronavirus (covid-19) cases, provided by jhu csse, github repository; impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand; a contribution to the mathematical theory of epidemics; mass testing, alerts and big fines: the strategies used in asia to slow coronavirus, the guardian; why outbreaks like coronavirus spread exponentially, and how to "flatten the curve", washington post. acknowledgments: we thank preethi sethumadhavan for her contributions to this work. the development of the original version of vera for ecological modeling was supported by an nsf bigdata grant. we thank robert bates (veloxicity), dr. jennifer hammock (encyclopedia of life, smithsonian institution), and dr. emily weigel and brady young (school of biological sciences, georgia tech) for their contributions to the original vera project. key: cord-025348-sh1kehrh authors: jurj, sorin liviu; opritoiu, flavius; vladutiu, mircea title: deep learning-based computer vision application with multiple built-in data science-oriented capabilities date: 2020-05-02 journal: proceedings of the 21st eann (engineering applications of neural networks) 2020 conference doi: 10.1007/978-3-030-48791-1_4 sha: doc_id: 25348 cord_uid: sh1kehrh this paper presents a data science-oriented application for image classification tasks that is able to automatically: a) gather images needed for training deep learning (dl) models with a built-in search engine crawler; b) remove duplicate images; c) sort images using built-in pre-trained dl models or user's own trained dl model; d) apply data augmentation; e) train a dl classification model; f) evaluate the performance of a dl model and system by using an accuracy calculator as well as the accuracy per consumption (apc), accuracy per energy cost (apec), time to closest apc (ttcapc) and time to closest apec (ttcapec) metrics calculators. 
experimental results show that the proposed computer vision application has several unique features and advantages, proving to be efficient regarding execution time and much easier to use when compared to similar applications. data is at the core of every dl application. because the machine learning lifecycle consists of four stages (data management, model learning, model verification and model deployment [1]), in recent years new specializations such as machine learning and data science were introduced in universities around the world in order to collect, analyze, interpret and make use of this data, e.g. for training accurate models for real-life scenarios. additionally, new career positions such as machine learning engineer and data scientist were created recently, these being among the top paid positions in the industry [2]. regarding computer vision applications for image classification tasks, a major bottleneck before training the necessary dl models is considered to be data collection, which consists mainly of data acquisition, data labeling and improvement of the existing data in order to train very accurate dl models [3]. another bottleneck is that, because the amount of data needed to train a dl model is usually required to be very large in size and because most of this important data is not released to the general public but is instead proprietary, the need for an original dataset for a particular dl project can be critical. in general, data can be acquired either by a) buying it from marketplaces or companies such as quandl [4] and ursa [5]; b) searching for it for free on platforms like kaggle [6]; c) crawling it from internet resources with the help of search engine crawlers [7]; d) paying a 24/7 workforce on amazon mechanical turk [8] like the creators of the imagenet dataset did to have all of their images labeled [9]; e) creating it manually for free (e.g. 
when the user takes all the photos and labels them himself), which can be impossible most of the time because of a low budget, a low-quality camera or time constraints. the importance of image deduplication can be seen in the fields of computer vision and dl, where a high number of duplicates can create biases in the evaluation of a dl model, such as in the case of the cifar-10 and cifar-100 datasets [10]. it is recommended that before training a dl classification model, one should always check and make sure that there are no duplicate images in the dataset. finding duplicate images manually can be very hard for a human user and a time-consuming process, which is the reason why a software solution for such a task is crucial. some of the drawbacks of existing solutions are that they usually require the user to buy the image deduplication software or pay monthly for a cloud solution, that they are big in size or that they are hard to install and use. despite all of these options, especially in the case of scraping the images from the internet, once stored they can still be unorganized or of a lower quality than expected, with images needing to be sorted out each in their respective class folder in order for the user (e.g. a data scientist) to be able later to analyze and use this data for training a performant dl model. this kind of sorting task can take a tremendous amount of time even for a team, from several days or weeks to even months [11]. another difficulty is that once the data is cleaned, organized and ready to be trained on from scratch or using transfer learning, because of the variety of dl architectures, each with different sizes and training times needed until reaching convergence [12], it can be very difficult to know from the beginning which dl architecture fits a given dataset best and will, at the end of the training, result in a dl model that has high accuracy. 
because energy consumption in dl has become a much-debated aspect in recent months, especially regarding climate change [13] [14] [15] [16] [17], the necessity of evaluating the performance of dl models also by their energy consumption and cost is crucial. considering these aspects, our work introduces a dl-based computer vision application that has multiple unique built-in data science-oriented capabilities which give the user the ability to train a dl image classification model without any programming skills. it also automatically searches for images on the internet, sorts these images each into their individual class folder and is able to remove duplicate images as well as to apply data augmentation in a very intuitive and user-friendly way. additionally, it gives the user an option to evaluate the performance of a dl model and hardware platform not only by considering its accuracy but also its power consumption and cost by using the environmentally-friendly metrics apc, apec, ttcapc and ttcapec [16]. the paper is organized as follows. in sect. 2 we present the related work. section 3 describes the proposed dl-based computer vision application. section 4 presents the experimental setup and results. finally, sect. 5 concludes this paper. considering the advancements of dl in recent years, there is a growing interest in computer vision applications in the literature, such as the automatic sorting of images, as shown by the authors in [18]. the authors propose a solution called imagex for sorting large amounts of unorganized images found in one or multiple folders with the help of a dynamic image graph, which successfully groups together these images based on their visual similarity. they also created many similar applications, e.g. imagesorter [19], which besides sorting images based on their color similarity, is also able to search, download and sort images from the internet with a built-in google image search option. 
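the apc and apec metrics mentioned earlier weigh accuracy against energy consumption and energy cost; their exact definitions are given in [16] and are not reproduced here. as a rough illustration of the underlying idea only, a naive accuracy-per-energy ratio can be sketched (the formulas, function names and numbers below are our own assumptions, not the published metrics):

```python
def accuracy_per_consumption(accuracy, energy_kwh):
    """naive accuracy-per-energy ratio (illustrative only; the real
    apc metric is defined in the cited work)."""
    return accuracy / energy_kwh

def accuracy_per_energy_cost(accuracy, energy_kwh, price_per_kwh):
    """naive accuracy-per-cost ratio (illustrative only)."""
    return accuracy / (energy_kwh * price_per_kwh)

# hypothetical measurements for two models trained on the same task
big_model   = {"accuracy": 0.95, "energy_kwh": 12.0}
small_model = {"accuracy": 0.93, "energy_kwh": 3.0}

price = 0.20  # assumed price in $/kwh
for m in (big_model, small_model):
    m["apc"] = accuracy_per_consumption(m["accuracy"], m["energy_kwh"])
    m["apec"] = accuracy_per_energy_cost(m["accuracy"], m["energy_kwh"], price)

# the slightly less accurate model wins once energy is accounted for
assert small_model["apc"] > big_model["apc"]
assert small_model["apec"] > big_model["apec"]
```

the point of such energy-aware metrics is visible even in this toy form: a model that gives up a little accuracy but consumes a fraction of the energy can rank higher than the most accurate model.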
a drawback of their applications is that the user is able only to visualize similar images, without also having these images automatically cleaned and sorted into their respective class folders with high accuracy. also, the authors in [20] created an application called sharkzor that combines user interaction with dl in order to sort large amounts of images that are similar. by comparison, regarding sorting, their solutions only sort images by grouping them based on how similar they are to each other after a human has initially interacted with and sorted these images, whereas our application sorts them automatically by using built-in pre-trained dl models or gives the user an option to use his own trained dl models. an on-device option that uses dl capabilities and helps users find similar photos (e.g. finding photos that contain certain objects such as flowers, trees, food, to name only a few) is also presented by apple in the newest version of their photos app [21]. regarding the detection of duplicate images, this technique has practical applications in many domains such as social media analysis, web-scale retrieval as well as digital image forensics [22, 23], with several works in the literature applying it for the detection of copyright infringements [24] and fraud detection [25]. recently, a python package that makes use of hashing algorithms and convolutional neural networks (cnns) to find exact or near-duplicates in an image collection, called image deduplicator (imagededup), was released in [26]. in our computer vision application, we make use of this package in order to offer the user an option to remove duplicate images from the image dataset (e.g. right before training a dl model). when training dl models from scratch or by using transfer learning, usually frameworks such as tensorflow and pytorch are used [27], either locally (e.g. 
on a personal laptop or desktop pc that contains a powerful gpu) or in cloud services such as cloud automl [28, 29], amazon aws [30] or microsoft azure [31], with the work in [32] even assessing the feasibility and usefulness of automated dl in medical imaging classification, where physicians with no programming experience can still complete such tasks successfully. the problem when training locally is that the user still has to research on his own which size the images should have for a given dl architecture, which dl architecture to choose for his dataset and whether it is necessary to apply fine-tuning and image augmentation. regarding the cloud services for training a dl model, even though these may solve most of the problems mentioned above, they still have some drawbacks: they can be affected by latency, can be difficult to manage (not user-friendly) and, most importantly, they can be very expensive when training for several hours (e.g. cloud automl from google costs around $20 per hour when used for computer vision tasks [27]). similar work to ours is presented by the authors in [33], where they created the image atm (automated tagging machine) tool that automates the pipeline of training an image classification model (preprocessing, training with model tweaking, evaluation, and deployment). regarding preprocessing, the image atm tool just resizes the images to fit the model input shape. for training, it uses transfer learning with pre-trained cnns from keras, first training the last dense layer and then the whole network. for evaluation, it calculates the confusion matrix and other metrics. image atm has a few disadvantages: the tool is aimed at people with programming knowledge (developers) and is focused mainly on the training function. also, in order to use the image atm tool, the user must take on the work of preparing the data in a specific folder structure, e.g. 
the user must create a .yml file with some of the desired parameters, the path to the images and the destination path. the user must also create a .json file containing the classification of each image. some advantages of image atm are that the tool offers the possibility of cloud training, has access to more models (although all are trained with the same dataset) and that the evaluation errors can be visualized. when compared to image atm, our computer vision application has several advantages, such as being accessible to more kinds of people and offering more functionalities such as image web scraping and sorting, deduplication, and calculators for accuracy as well as for the apc, apec, ttcapc and ttcapec metrics, all in a user-friendly graphical user interface (gui). the proposed dl-based computer vision application is summarized in fig. 1 and is built using the python programming language. it is composed of the most common features needed in the computer vision field and facilitates them in the form of a gui, without requiring the user to have any knowledge about coding or dl in order to be able to fully use it. regarding the system, the compilation dependencies and installation requirements of the proposed application are python 3 and windows 10 (or a later version) or linux (ubuntu 12 or a later version). regarding the python libraries, we use pyqt5 for creating the gui, hdf5 for loading dl model files, tensorflow for training and inference, opencv for image processing, numpy for data processing, shutil for copying images in the system, tqdm for showing the terminal progress bar, imagededup [26] for the deduplication of images, icrawler for crawling the images and fman build system (fbs) for creating installers. there are certain conventions that are common to all the features of the proposed application: 1. model files: these are .h5 files that contain the architecture of a keras model and the weights of its parameters. 
these are used to load (and save) a previously trained model in order to be able to use it. 2. model class files: these are extensionless files that contain the labels of each of the classes of a dl model. such a file contains n lines, where n is the number of classes in the model, and line i contains the label corresponding to the i-th element of the output of the dl model. 3. preprocessing function: in this convention, a preprocessing function is a function that takes as input the path to an image and a shape, loads the image from the input path, converts the image to an array and fits it to the input of the model. 4. images folders structures: we use two different folder structures: unclassified structures and classified structures. the unclassified images folder structure is the simplest one, consisting of just one folder containing images, presumably to be classified or deduplicated. the classified images folder structure consists of a folder which in turn contains subfolders. each subfolder represents a class of images, is named the same as the label for that class, and contains images tagged or classified as belonging to that class. following, we will present all the built-in features: automatic web crawler assisted by inference classification, images deduplication, images sorter assisted by inference classification, dl model trainer with data augmentation capabilities, accuracy calculator as well as the apc and apec [16] calculators. the purpose of this feature is to collect images related to a keyword (representing a class) from the web and, by using a classification algorithm, to make sure that the images indeed belong to this class. during the inference process needed for cleaning the images, a preprocessing step happens in the background which, depending on the pretrained or custom dl model that is chosen, will resize the images, making them have the correct input shape (e.g. 28 × 28 × 1 for mnist and 224 × 224 × 3 for imagenet) for the dl model. 
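the model-class-file and confidence-threshold conventions described above can be sketched as follows (the file contents, labels and helper names are our own illustration; the real application loads keras .h5 models for the actual predictions):

```python
import os
import tempfile

def load_class_labels(path):
    """model class file convention: line i holds the label of the
    i-th element of the model's output vector."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

def decode_prediction(output_vector, labels, confidence_required=0.0):
    """map the highest model output to its label; fall back to
    'undetermined' when the confidence threshold is not met."""
    best = max(range(len(output_vector)), key=output_vector.__getitem__)
    if output_vector[best] < confidence_required:
        return "undetermined"
    return labels[best]

# hypothetical three-class model class file
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "classes")
    with open(path, "w") as f:
        f.write("cat\ndog\nbird\n")
    labels = load_class_labels(path)

assert labels == ["cat", "dog", "bird"]
assert decode_prediction([0.1, 0.8, 0.1], labels) == "dog"
assert decode_prediction([0.4, 0.35, 0.25], labels, confidence_required=0.5) == "undetermined"
```

the same decoding step underlies both the crawler (keep an image only when the predicted label matches the requested class with enough confidence) and the sorter (copy an image into the folder of its predicted label, or into 'undetermined').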
a summarized view of the implemented image crawler feature can be seen in fig. 2 and is composed of the following elements: 'model' - a combo box containing all the existing pretrained built-in dl models, such as "mnist" or "resnet50", as well as the 'custom' option, which gives the user the possibility to load his own previously trained dl model; confidence slider ('confidence required') - a slider to select the minimum accuracy value to be used when classifying the images, which ranges from 0 to 99; image class selector ('select a class of images') - a combo box containing the labels of all the classes from the selected pretrained built-in dl model (e.g. 10 classes when the "mnist" model is selected and 1000 classes when the "resnet50" model is selected); additionally, the box contains an autocomplete search function; images amount ('max amount to get') - a slider to select the number of images that should be crawled from the internet, which ranges from 1 to 999; and 'destination folder' - a browser to select the path for the final location of the obtained images. the options under 'custom model configuration' only apply when the selected dl model is "custom" and not built into the proposed computer vision application, e.g. when it was trained by the user himself. these options are: 'model file' - a browser to select the .h5 file the user wishes to use for inference, and model classes - a browser to select the extensionless file containing the name of each output class on which the selected dl model (.h5 file) was trained. finally, this feature's gui has a button ('add images!') that begins the web crawling process. with the help of this feature, images are automatically crawled and downloaded to a temporary folder location. 
after that, each image is classified with the selected dl model, and if the classification coincides with the selected class and the confidence is higher than the selected threshold, the image is moved to the 'destination folder', where each image will be saved in its own class folder. this feature automates the population of image classification datasets by providing a reliable way of confirming that the downloaded images are clean and correctly organized. the purpose of this feature is to remove duplicate images found in a certain folder. for this, we incorporated the imagededup techniques found in [26]. a summarized view of the implemented images deduplication feature can be seen in fig. 3 and is composed of the following elements: 'images folder' - a browser to select the location of the folder containing the images that need to be analyzed for duplicates; 'destination folder' - a browser to select the location of the folder where the deduplicated images will be stored; 'duplicates folder' - a browser to select the location of the folder where the found duplicate images will be stored. each duplicate image found will be stored in a subfolder. regarding advanced settings, the feature is composed of: hashing method selector ('select a hashing method') - a combo box containing 4 hashing methods that can be used for deduplication (perceptual hashing (default), difference hashing, wavelet hashing, and average hashing), as well as 'max distance threshold' - the maximum distance at which two images will be considered to be the same (the default value is 10). finally, this interface has a button ('deduplicate!') that begins the deduplication process according to the selected parameters. following, we will shortly describe the types of hashes we are using in the images deduplication feature: a) average hash: the average hash algorithm first converts the input image to grayscale and then scales it down. in our case, as we want to generate a 64-bit hash, the image is scaled down. 
next, the average of all gray values of the image is calculated and then the pixels are examined one by one from left to right. if the gray value is larger than the average, a 1 value is added to the hash, otherwise a 0 value; b) difference hash: similar to the average hash algorithm, the difference hash algorithm initially generates a grayscale image from the input image. here, from each row, the pixels are examined serially from left to right and compared to their neighbor to the right, resulting in a hash; c) perceptual hash: after gray scaling, it applies the discrete cosine transform to rows and as well as to columns. next, we calculate the median of the gray values in this image and generate, analogous to the median hash algorithm, a hash value from the image; d) wavelet hash: analogous to the average hash algorithm, the wavelet hash algorithm also generates a gray value image. next, a twodimensional wavelet transform is applied to the image. in our case, we use the default wavelet function called the haar wavelet. next, each pixel is compared to the median and the hash is calculated. regarding this deduplication feature, first, the hasher generates hashes for each of the images found in the images folder. with these hashes, the distances between hashes (images) are then calculated and if they are lower than the maximum distance threshold (e.g. 10), then they are considered duplicates. secondly, for each group of duplicates, the first image is selected as "original" and a folder is created in the duplicates folder with the name of the "original" folder. then all duplicates of this image are stored on that folder. this feature successfully integrates the image deduplication technique [26] and provides a simple and quick way to utilize it. this feature helps a user to sort an unsorted array of images by making use of dl models. a summarized view of the implemented images sorter feature assisted by inference classification can be seen in fig. 
4 and is composed of elements similar to the ones presented earlier for the image crawler feature, but in this case with the function of selecting the path to the folders from which and where images should be sorted. in the destination folder, a new folder is created for each possible class, with the name extracted from the extensionless file that contains all the names of the classes, plus a folder named 'undetermined'. then, each image from the 'images folder' is automatically preprocessed, feed as input to the selected dl model and saved in the corresponding class folder. the highest value from the output determines the predicted class of the image: if this value is less than the minimum 'confidence required', value, then the image will be copied and placed in the 'undetermined' folder, otherwise, the image will be copied to the folder corresponding to the class of the highest value from the output. we took the decision of copying the files instead of moving them, for data security and backup reasons. this feature heavily reduces the amount of time required to sort through an unclassified dataset of images by not only doing it automatically but also removing the need to set up coding environments or even write a single line of code. this feature gives the user a simple gui to select different parameters in order to train and save a dl image classifier model. a summarized view of the implemented dl model trainer feature assisted by inference classification can be seen in fig. 5 and is composed of the following elements: 'model'as described earlier for the image crawler feature; 'sorted images folder' -a browser to select the folder that contains the classified folder structure with the images to be trained on; 'number of training batches' -an integer input, to specify the number of batches to train and 'size of batches'an integer input, to specify the number of images per batch. 
regarding the custom options, they are the same as mentioned earlier regarding the image crawler feature. next, this interface has a button ('train model') that, when clicked on, prompts a new window for the user to be able to visualize in a very user-friendly way all the image transformations that can be applied to the training dataset in a random way during training. more exactly, as can be seen in fig. 6 , the user can input the following parameters for data augmentation: horizontal flip -if checked the augmentation will randomly flip or not images horizontally; vertical flip -if checked the augmentation will randomly flip or not images horizontally; max width shift -slider (%), maximum percentage (value between 0 and 100) of the image width that it can be shifted left or , the maximum amount of degrees (value between 0 and 90) that an image might be rotated and max shear shift -slider (%), maximum shear value (value between 0 and 100) for image shearing. the data augmentation feature allows the user to visualize the maximum possible changes that can be made to an image in real-time, without the need of guessing the right parameters. following, a training generator is defined with the selected parameters; the generator randomly takes images from the folder structure and fills batches of the selected size, for the number of batches that are selected. these batches are yielded as they are being generated. regarding the training, first, the selected dl model is loaded, its output layer is removed, the previous layers are frozen and a new output layer with the size of the number of classes in the folder structure is added. the model is then compiled with the adam optimizer and the categorical cross-entropy as the loss function. finally, the generator is fed to the model to be fitted. once the training is done, the total training time is shown to the user and a model file (.h5) is created on a prompted input location. 
this feature achieves the possibility of training a custom dl model on custom classes just by separating images in different folders. there is no knowledge needed about dl and this feature can later also be easily used by the image sorting feature described earlier in order to sort future new unsorted images. this section of the application gui gives a user the option to compute the accuracy of a dl model on the given dataset in the classified images folder structure. a summarized view of the implemented accuracy calculator feature can be seen in fig. 7 and is composed of the following elements: 'model' -as described earlier for the image crawler feature; 'test images folder' -a browser to select the folder that contains the classified folder structure to measure the accuracy of a dl classification model; 'size of batches'an integer input, to specify the number of images per batch. the custom options are the same as mentioned earlier regarding the image crawler feature. finally, this interface has a button ('calculate accuracy') that starts the accuracy evaluation process. after loading the dl model and the list of classes, it searches for the classes as subfolders names in the classified images folder structure. then, for each class (or subfolder) it creates batches of the selected batch size, feeds them to the dl model and counts the number of accurate results as well as the number of images. with these results, it calculates the total accuracy of the dl model and shows it to the user directly in the application gui. this feature provides a simple and intuitive gui to measure the accuracy of any dl image classification model. this gui feature makes use of our apc metric [16] and which is a function that takes into account not only the accuracy of a system (acc) but also the energy consumption of the system (c). the apc metric can be seen in eq. 
(1) below: where c stands from energy consumption of the system and it's measured in watt/hour (wh) and acc stands for accuracy; a is the parameter for the wc a function, the default value is 0.1; b is a parameter (ranges from 0 to infinity) that controls the influence of the consumption in the final result: higher values will lower more heavily the value of the metric regarding the consumption. the default value is 1. the application gui gives a user the option to define the values for a and b as well as to specify and calculate the accuracy and energy consumption of a dl model using the above apc metric equation. a summarized view of the implemented apc calculator feature can be seen in fig. 8 and is composed of the following elements: 'model test accuracy (%)' -this widget gives a user the option to input the accuracy or use the previously described accuracy calculator feature to measure the accuracy of a dl model and 'energy consumption (wh)' -float input to specify the power consumption of a user's dl model. regarding the advanced options, it has: alpha (a) -float input to specify the desired value of a (default 0.2) and beta ðb) -float input to specify the desired value of b (default 1). for simplicity, a table is shown with the following columns: accuracy, energy consumption, alpha, beta, and apc. whenever a value is changed, the table is automatically updated as well. finally, the application gui has a button ('calculate apc') to begin the calculation of the apc metric. the function itself is a numpy implementation of our previously defined apc metric [16] seen in eq. (1) and takes as input parameters the values defined in the application gui. 
the implemented feature brings this new apc metric to any user by allowing them to easily calculate the accuracy per consumption and know the performance of their dl model with regards to not only the accuracy but also to the impact it has on the environment (higher energy consumption = higher negative impact on nature). however, the drawback of the current version of this apc calculator feature in the proposed application gui is that the user has to measure the energy consumption of the system manually. we plan to implement automatic readings of the power consumption in future updates (e.g. by using the standard performance evaluation corporation (spec) ptdaemon tool [34, 35] , which is also planned to be used for power measurements by the mlperf benchmark in their upcoming mid-2020 update). this metric is a function that takes into account not only the accuracy of a system (acc) but also the energy cost of the system (c). the apec metric can be seen in eq. (2) below: where c stands for the energy cost of the system and it's measured in eur cents per inference and acc stands for accuracy. a is the parameter for the wc a function, the default value is 0.1; b is a parameter (ranges from 0 to infinity) that controls the influence of the cost in the final result: higher values will lower more heavily the value of the metric regarding the cost. the default value is 1. the apec feature is presented in fig. 9 and lets a user define the values for a and b, specify or calculate the accuracy of a dl model, specify the energy consumption and the cost of wh of the dl as well as calculate the apec using the formula seen earlier in (2) . the apec feature of the proposed computer vision application is composed of the following elements: 'model test accuracy (%)'works similar to the apc widget described earlier; 'energy consumption (wh)' -works also similar to the apc widget described earlier and watt-hour cost -float input to specify the cost in eur cents of a wh. 
regarding the advanced options, we have: alpha (a) -float input to specify the desired value of a(default 0.2) and beta bfloat input to specify the desired value of b(default 1). a similar table like the one for apc calculator is shown also here, with the following columns: accuracy, energy cost, alpha, beta, and apec. whenever a value is changed, the table is automatically updated here as well. finally, the application gui has a button ('calculate apec') to begin the calculation of the apec metric. the function itself is an implementation on numpy of our previously defined apec metric [16] seen in eq. (2) and takes as input parameters the values defined in the application gui. the implemented feature brings this new apec metric to any user by allowing them to easily calculate the accuracy per energy cost and evaluate the performance of their dl model with regards to the impact it has on the environment (higher energy consumption = higher cost = negative impact on nature). however, the drawback of the current version of this apec calculator feature is that the user has to measure the energy consumption of the system and calculate its wh cost manually. the objective of the ttapc metric [16] is to combine training time and the apc inference metric in an intuitive way. the ttcapc feature is presented in fig. 10 and is composed of the following elements: 'model test accuracy (%)' and 'energy consumption (wh)', both working similar to the apec widget described earlier; 'accuracy delta' -float input to specify the granularity of the accuracy axis; 'energy delta'float to specify the granularity of the energy axis. regarding the advanced options, they are the same as the ones presented earlier regarding the apec feature. a similar table like the one for apec calculator is shown also here, with the following columns: accuracy, energy consumption, alpha, beta, accuracy delta, energy delta, rounded accuracy, rounded energy, training time and closest apc. 
whenever a value is changed, the table is automatically updated here as well. finally, the application gui has a button ('calculate ttcapc') to begin the calculation of the ttcapc metric. the objective of the ttcapec metric [16] is to combine training time and the apec inference metric. the ttcapec feature is presented in fig. 11 and is composed of the same elements like the ttcapc feature presented earlier and one additional element called 'energy cost (eur cents per wh)' which is similar to the one presented earlier regarding the apec metric calculator and where the user can specify the cost in eur cents of a wh. a similar table like the one for ttcapc calculator is shown also here, with the following columns: accuracy, energy cost, alpha, beta, accuracy delta, energy delta, rounded accuracy, rounded energy, training time and closest apec. finally, the application gui has a button ('calculate ttcapec') to begin the calculation of the ttcapec metric. following, we will show the experimental results regarding all the implemented features in comparison with existing alternatives found in the literature and industry. we run our experiments on a desktop pc with the following configuration: on the hardware side we use an intel(r) core(tm) i7-7800x cpu @ 3.50 ghz, 6 core(s), 12 logical processor(s) with 32 gb ram and an nvidia gtx 1080 ti as the gpu; on the software side we use microsoft windows 10 pro as the operating system with cuda 9.0, cudnn 7.6.0 and tensorflow 1.10.0 using the keras 2.2.4 framework. as can be seen in table 1 , our proposed image crawler feature outperforms existent solutions and improves upon them. even though the crawling took the same amount of time, this is not the case regarding the cleaning part, where, because this feature is not available in any of the existent solutions, this needed to be done manually and took 47 s for a folder containing 97 images as compared to only 10 s for our proposed solution which executed the task automatically. 
a comparison between "dirty" images and clean images can be seen in fig. 12 where, for simplicity, we searched for 97 pictures of "cucumber", which is one class from the total of 1000 classes found in the imagenet dataset [9] . it can be easily observed how the existent solutions provide images that don't represent an actual cucumber, but products (e.g. shampoos) that are made of it. after automatically cleaning these images with a confidence rate of 50% with the proposed feature, only 64 clean images remained in the folder. for the experiments seen in table 2 , we tested the speed time of the proposed built-in image deduplication feature that uses the imagededup python package [26] . we run these experiments on finding only exact duplicates on the same number of images with a maximum distance threshold of 10 for all four hashing methods. as can be seen, the average speed is about 16 s for finding duplicates in a folder containing 1.226 images, with difference hashing being the fastest hashing method from all four. for our experiments regarding the sorting of images with the proposed images sorter feature, we used both the mnist as well as the imagenet pre-trained models with a confidence rate of 50% and presented the results in table 3 . regarding mnist experiments, we converted the mnist dataset consisting of 70.000 images of 28 â 28 pixels to png format by using the script in [36] and mixed all these images in a folder. after that, we run our image sorter feature on them and succeeded to have only 0.09% of undetermined images, with a total speed time of around 6 min. regarding imagenet, we used the imagenet large scale visual recognition challenge 2013 (ilsvrc2013) dataset containing 456.567 images belonging to 1000 classes with a confidence rate of 50%. 
here we successfully sorted regarding the custom model, we used one of our previously trained dl models (resnet-50) that can classify 34 animal classes [37] on a number of 2.380 images of 256 â ratio pixels (70 images for each of the 34 animal classes) with a confidence rate of 50%. here we succeeded to have 1.42% undetermined images, with a total speed time of almost 4 min. the percentage of the undetermined images for all cases can be improved by modifying the confidence rate, but it is out of this paper's scope to experiment with different confidence values. the time that a dl prediction task takes depends on a few variables, mainly the processing power of the machine used to run the model, the framework used to call the inference of the model and the model itself. since processing power keeps changing and varies greatly over different machines, and all the frameworks are optimized complexity wise and keep evolving, we find that among these three the most important to measure is, therefore, the model itself used in the prediction. models vary greatly in their architecture, but all dl models can be mostly decomposed as a series of floating points operations (flops). because, generally, more flops equal more processing needed and therefore more time spent in the whole operation, we measured the time complexity of the built-in imagenet and mnist models in flops and presented the results in table 4 . for the experiments regarding the dl model training feature, because we want to evaluate the application on a real-world problem, we will attempt to show that this feature could be very useful for doctors or medical professionals in the aid of detecting diseases from imaging data (e.g. respiratory diseases detection with x-ray images). 
in order to prove this, we will attempt to automatically sort between the images of sick patients versus healthy patients regarding, firstly, pneumonia [38], and secondly, covid-19 [39] , all within our application and doing it only with the training feature that the application provides. for this, first, in order to classify between x-ray images of patients with pneumonia versus x-ray images of healthy patients, we made use of transfer learning and trained a 'resnet50' architecture for around 2 h without data augmentation on pneumonia [38] dataset containing 6.200 train images by selecting 10 as the value for the number of training batches and 10 as the value for the size of batches (amount of images per batch) and achieved 98.54% train accuracy after 10 epochs. secondly, in order to classify between x-ray images of patients with covid-19 versus x-ray images of negative patients, we again made use of transfer learning and trained a 'resnet50' for the experiments regarding the accuracy calculator feature, we used the two custom dl models trained earlier to classify x-ray images of patients with pneumonia versus x-ray images of healthy patients and between x-ray images of patients with covid-19 versus x-ray images of negative patients, with 20 as the size of batches (20 images per batch). the evaluation took in both cases around 50 s with a test accuracy of 93.75% regarding the pneumonia model on 620 test images and 91% regarding the covid-19 model on 11 test images, proving that the proposed computer vision application can easily be used by any medical personal with very basic computer knowledge in order to train and test a dl classification model for medical work purposes. regarding the experiments with the proposed apc [16] calculator feature, we presented the simulated results for different model test accuracy (%) and energy consumption (wh) values in table 5 . we run all the experiments with 0.2 as the alpha value and with 1.0 as the beta value. 
it is important to mention that our recommendation for a correct comparison between two dl models, is that it is always necessary that they are both tested with the same alpha and beta values. as can be seen in table 5 where we experimented with random energy consumption and test accuracy values, our apc calculator feature is evaluating the performance of a dl model by considering not only the accuracy but also the power consumption. therefore, dl models that consume around 50 wh (e.g. when running inference on a laptop) instead of 10 wh (e.g. when running inference on a low-cost embedded platform such as the nvidia jetson tx2) [15] , are penalized more severely by the apc metric. regarding the experiments with the proposed apec [16] calculator feature, we presented the simulated results for different model test accuracy (%) and energy cost in table 6 . we run all the experiments with 0.2 as the alpha value and with 1.0 as the beta value. for simplicity, regarding electricity costs, we took germany as an example. according to "strom report" (based on eurostat data) [40] , german retail consumers paid 0.00305 euro cents for a wh of electricity in 2017. we used this value to calculate the cost of energy by plugging it in the equation presented in (2)", where "c" in this case stands for the energy cost. as can be seen, the apec metric favors lower power consumption and cost, favoring the use of green energy (free and clean energy). regarding the experiments with the proposed ttcapc [16] calculator feature, we simulated a custom dl model on two platforms and presented the results in table 7 . as can be seen, even though the accuracy and training time is the same for both platforms, the ttcapc feature favors the platform which has less power consumption. 
regarding the experiments with the proposed ttcapec [16] calculator feature, we simulated with the same dl model values used also in the experiments regarding the ttcapc calculator earlier and presented the results in table 8 . as can be also seen in this case, the ttcapec feature favors the lower power consumption of a system because it results in a lower cost. additionally and more importantly, it favors dl-based systems that are powered by green energy, because they have 0 electricity costs and no negative impact on our environment. in this paper, we present a computer vision application that succeeds in bringing common dl features needed by a user (e.g. data scientist) when performing image classification related tasks into one easy to use and user-friendly gui. from automatically gathering images and classifying them each in their respective class folder in a matter of minutes, to removing duplicates, sorting images, training and evaluating a dl model in a matter of minutes, all these features are integrated in a sensible and intuitive manner that requires no knowledge of programming and dl. experimental results show that the proposed application has many unique advantages and also outperforms similar existent solutions. additionally, this is the first computer vision application that incorporates the apc, apec, ttcapc and ttcapec metrics [16] , which can be easily used to calculate and evaluate the performance of dl models and systems based not only on their accuracy but also on their energy consumption and cost, encouraging new generations of researchers to make use only of green energy when powering their dl-based systems [15] . 
assuring the machine learning lifecycle: desiderata, methods, and challenges impact of artificial intelligence on businesses: from research, innovation, market deployment to future shifts in business models a survey on data collection for machine learning: a big data -ai integration perspective imagenet: a large-scale hierarchical image database do we train on test data? purging cifar of near-duplicates snapshot serengeti, high-frequency annotated camera trap images of 40 mammalian species in an african savanna deep double descent: where bigger models and more data hurt energy and policy considerations for deep learning in nlp efficient implementation of a self-sufficient solar-powered real-time deep learning-based system environmentally-friendly metrics for evaluating the performance of deep learning models and systems tackling climate change with machine learning dynamic construction and manipulation of hierarchical quartic image graphs sharkzor: interactive deep learning for image triage, sort, and summary apple photos benchmarking unsupervised near-duplicate image detection recent advance in content-based image retrieval: a literature survey effective and efficient global context verification for image copy detection image forensics: detecting duplication of scientific images with manipulation-invariant image similarity performance analysis of deep learning libraries: tensorflow and pytorch automl: a survey of the state-of-the-art automated deep learning design for medical image classification by healthcare professionals with no coding experience: a feasibility study measuring and benchmarking power consumption and energy efficiency standard performance evaluation corporation (spec) power mnist converted to png format real-time identification of animals found in domestic areas of europe covid-19 image data collection key: cord-029311-9769dgb6 authors: nemati, hamed; buiras, pablo; lindner, andreas; guanciale, roberto; jacobs, swen title: validation of abstract 
side-channel models for computer architectures date: 2020-06-13 journal: computer aided verification doi: 10.1007/978-3-030-53288-8_12 sha: doc_id: 29311 cord_uid: 9769dgb6 observational models make tractable the analysis of information flow properties by providing an abstraction of side channels. we introduce a methodology and a tool, scam-v, to validate observational models for modern computer architectures. we combine symbolic execution, relational analysis, and different program generation techniques to generate experiments and validate the models. an experiment consists of a randomly generated program together with two inputs that are observationally equivalent according to the model under the test. validation is done by checking indistinguishability of the two inputs on real hardware by executing the program and analyzing the side channel. we have evaluated our framework by validating models that abstract the data-cache side channel of a raspberry pi 3 board with a processor implementing the armv8-a architecture. our results show that scam-v can identify bugs in the implementation of the models and generate test programs which invalidate the models due to hidden microarchitectural behavior. information flow analysis that takes into account side channels is a topic of increasing relevance, as attacks that compromise confidentiality via different microarchitectural features and sophisticated side channels continue to emerge [2, 27, 28, [31] [32] [33] 40] . while there are information flow analyses that try to counter these threats [3, 15] , these approaches use models that abstract from many features of modern processors, like caches and pipelining, and their effects on channels that can be accessed by an attacker, like execution time and power consumption. 
instead, these models [36] include explicit "observations" that become available to an attacker when the program is executed and that should overapproximate the information that can be observed on the real system. while abstract models are indispensable for automatic verification because of the complexity of modern microarchitectures, the amount of details hidden by these models makes it hard to trust that no information flow is missed, i.e., their soundness. different implementations of the same architecture, as well as optimizations such as parallel and speculative execution, can introduce side channels that may be overlooked by the abstract models. this has been demonstrated by the recent spectre attacks [32] : disregarding these microarchitectural features can lead to consider programs that leak information on modern cpus as secure. thus, it is essential to validate whether an abstract model adequately reflects all information flows introduced by the low-level features of a specific processor. in this work, we introduce an approach that addresses this problem: we show how to validate observational models by comparing their outputs against the behavior of the real hardware in systematically generated experiments. in the following, we give an overview of our approach and this paper. our contribution. we introduce scam-v (side channel abstract model validator), a framework for the automatic validation of abstract observational models. at a high level, scam-v generates well-formed 1 random binaries and attempts to construct pairs of initial states such that runs of the binaries from these states are indistinguishable at the level of the model, but distinguishable on the real hardware. in essence, finding such counterexamples implies that the observational model is not sound, and leads to a potential vulnerability. figure 1 illustrates the main workflow of scam-v. the first step of our workflow (described in sect. 
3) is the generation of a binary program for the given architecture, guided towards programs that trigger certain features of the architecture. the second step translates the program to the intermediate language bir (described in sect. 2.4) and annotates the result with observations according to the observational model under validation. this transpilation is provably correct with respect to the formal model of the isa, i.e., the original binary program and the transpiled bir program have the same effects on registers and memory. in step three we use symbolic execution to syn-thesize the weakest relation on program states that guarantees indistinguishability in the observational model (sect. 4) . through this relation, the observational model is used to drive the generation of test cases -pairs of states that satisfy the relation and can be used as inputs to the program (sect. 5). finally, we run the generated binary with different test cases on the real hardware, and compare the measurements on the side channel of the real processor. a description of this process together with general remarks on our framework implementation are in sect. 6 . since the generated test cases satisfy the synthesized relation, soundness of the model would imply that the side-channel data on the real hardware cannot be distinguished either. thus, a test case where we can distinguish the two runs on the hardware amounts to a counterexample that invalidates the observational model. after examining a given test case, the driver of the framework decides whether to generate more test cases for the same program, or to generate a new program. we have implemented scam-v in the hol4 theorem prover 2 and have evaluated the framework on three observational models (introduced in sect. 2.3) for the l1 data-cache of the armv8 processor on the raspberry pi 3 (sect. 2.2). our experiments (sect. 
7) led to the identification of model invalidating microarchitectural features as well as bugs in the armv8 isa model and our observational extensions. this shows that many existing abstractions are substantially unsound. since our goal is to validate that observational models overapproximate hardware information flows, we do not attempt to identify practically exploitable vulnerabilities. instead, our experiments attempt to validate these models in the worst case scenario for the victim. this consists of an attacker that can precisely identify the cache lines that have been evicted by the victim and that can minimize the noise of these measurements in the presence of background processes and interrupts. we briefly introduce the concepts of side channels, indistinguishability, observational models, and observational equivalence. for the rest of this section, consider a fixed program that runs on a fixed processor. we can model the program running on the processor by a transition system m = s, → , where s is a set of states and →⊆ s × s a transition relation. in automated verification, the state space of such a model usually reflects the possible values of program variables (or: registers of the processor), abstracting from low-level behavior of the processor, such as cache contents, electric currents, or real-time behavior. that is, for every state of the real system there is a state in the model that represents it, and a state of the model usually represents a set of states of the real system. then, a side channel is a trait of the real system that can be read from by an attacker and that is not modeled in m . states r 1 and r 2 of the real system are indistinguishable if a real-world attacker is not able to distinguish executions from r 1 or r 2 by means of the side channel on the real hardware. note that executions may be distinguishable even if they end in the same final state, e.g., if the attacker is able to measure execution time. 
in order to verify resilience against attacks that use side channels, one option is to extend the model to include additional features of the real system and to formalize indistinguishability in terms of some variation of non-interference [25, 26]. unfortunately, it is infeasible to develop formal models that capture all side channels of a modern computer architecture. for instance, precisely determining the execution time or power consumption of a program requires dealing with complex processor features such as cache hierarchies, cache replacement policies, speculative execution, branch prediction, or bus arbitration. moreover, for some important parts of microarchitectures, the exact behavior may not even be public knowledge, e.g., the mechanism used to train the branch predictor. additionally, information flow analyses cannot use the same types of overapproximations that are used for checking safety properties or analyzing worst-case execution time, e.g., the introduction of nondeterminism to cover all possible outcomes. in order to handle this complexity, information flow analyses [3, 15] use models designed to overapproximate information flow to channels in terms of system state observations. to this end, the model is extended with a set of possible observations o and we consider a transition relation → ⊆ s × o × s, i.e., each transition produces an observation that captures the information that it potentially leaks to the attacker. we assume that the set o contains an empty observation ⊥, and call a transition labeled with ⊥ a silent transition. we call the resulting transition system an observational model. for instance, in the case of a rudimentary cacheless processor, the execution time of a program depends only on the sequence of executed instructions. in this case, extending the model with observations that reveal the instructions is more convenient than producing a clock-accurate model of the system.
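as an illustration, an observational model can be rendered as a labeled transition system in which silent transitions contribute nothing to the observations of a trace. a minimal python sketch (the encoding, state names, and observation tuples are ours, not scam-v's):

```python
# A toy observational model: transitions carry observations, and BOT
# stands for the empty observation ⊥ of a silent transition.
BOT = None

# Transition relation of a small deterministic model: (state, obs, state).
TRANS = [
    ("s0", ("instr", 0x100), "s1"),   # reveals the executed instruction
    ("s1", BOT, "s2"),                # silent transition
    ("s2", ("instr", 0x104), "s3"),
]

def trace_observations(trans, start):
    """Follow the transitions from `start` and collect the observations,
    dropping the silent ones (⊥ acts as the unit of composition)."""
    step = {s: (o, t) for (s, o, t) in trans}
    obs, state = [], start
    while state in step:
        o, state = step[state]
        if o is not BOT:
            obs.append(o)
    return obs
```

in this rendering, two starting states are observationally equivalent exactly when `trace_observations` returns equal lists for them, which is the notion made precise in the following paragraphs.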
we use the operator • for the sequential composition of observations, with the empty observation ⊥ as its unit. for a trace π = s 0 → o1 s 1 → o2 · · · → on s n , the observations of the trace are obs(π) = o 1 • · · · • o n ; two traces π 1 and π 2 are observationally equivalent, denoted π 1 ∼ m π 2 , iff obs(π 1 ) = obs(π 2 ). states s 1 ∈ s and s 2 ∈ s are observationally equivalent, denoted s 1 ∼ m s 2 , iff for every possible trace π 1 of m that starts in s 1 there is a trace π 2 of m that starts in s 2 such that π 1 ∼ m π 2 , and vice versa. note that this notion is, in principle, different from the notion of indistinguishability. the overapproximation of information flows can lead to false positives: for example, execution of a program may require the same amount of time even if the sequences of executed instructions are different. a more severe concern is that these abstractions may overlook some flows of information due to the number of low-level details that are hidden. for instance, an observational model may not take into account that for some microcontrollers the number of clock cycles required for multiplication depends on the value of the operands. the use of an abstract model to verify resilience against side-channel attacks relies on the assumption that observational equivalence entails indistinguishability for a real-world attacker on the real system: the model m is sound if whenever the model states s 1 and s 2 represent the real system states r 1 and r 2 , respectively, then s 1 ∼ m s 2 entails indistinguishability of r 1 and r 2 . in order to evaluate our framework, we selected the raspberry pi 3, which is a widely available armv8 embedded system. the platform's cpu is a cortex-a53, an 8-stage, 2-way superscalar, in-order pipelined processor. the cpu implements branch prediction, but it does not support speculative execution. this makes the cpu resilient against variations of spectre attacks [5]. in the following, we focus on side channels that exploit the level 1 (l1) data-cache of the system. the l1 data-cache is transparent for programmers.
when the cpu needs to read a location in memory, in case of a cache miss it copies the data from memory into the cache for subsequent uses, tagging it with the memory location from which the data was read. data is transferred between memory and cache in blocks of 64 bytes, called cache lines. the l1 data-cache (fig. 2) is physically indexed and physically tagged and is 4-way set associative: each memory location can be cached in four different entries in the cache. when a line is loaded and all corresponding entries are occupied, the cpu uses a specific (and usually underspecified) replacement policy to decide which colliding line should be evicted. the whole l1 cache is 32 kb in size, hence it has 128 cache sets (i.e., 32 kb / 64 b / 4). let a be a physical address; in the following we use off(a) (i.e., the least significant 6 bits), index(a) (i.e., bits 6 to 12), and tag(a) (i.e., the remaining bits) to denote the cache offset, cache set index, and cache tag of the address. the cache implements a prefetcher: for some configurable k ∈ n, when it detects a sequence of k cache misses whose cache set indices are separated by a fixed stride, the prefetcher starts to fetch data in the background. for example, in fig. 2, if k = 3 and the cache is initially empty, then accessing addresses a, b, and c, whose cache lines are separated by a stride of 2, can cause the cache to prefetch the block [384 . . . 449]. attacks that exploit the l1 data-cache are usually classified in three categories: in time-driven attacks (e.g. [47]), the attacker measures the execution time of the victim and uses this knowledge to estimate the number of cache misses and hits of the victim; in trace-driven attacks (e.g. [1, 48]), the adversary can profile the cache activities during the execution of the victim and observe the cache effects of a particular operation performed by the victim; finally, in access-driven attacks (e.g.
[39, 46]), the attacker can only determine the cache sets modified after the execution of the victim has completed. a widely used approach to extract information via the cache is prime+probe [40]: (1) the attacker reads its own memory, filling the cache with its data; (2) the victim is executed; (3) the attacker measures the time needed to access the data loaded at step (1): slow access means that the corresponding cache line has been evicted in step (2). in the following we disregard time-driven attacks and trace-driven attacks: the former can be countered by normalizing the victim execution time; the latter can be countered by preventing victim preemption. focusing on access-driven attacks leads to the following notion of indistinguishability: definition 4. real system states r 1 and r 2 are indistinguishable for access-driven attacks on the l1 data-cache iff executions starting in r 1 or r 2 modify the same cache sets. we remark that for multi-way caches, the need for models that overapproximate the information flow is critical, since the replacement policies are seldom formally specified and a precise model of the channel is not possible. the following observational model attempts to overapproximate information flows for data-caches by relying on the fact that accessing two different addresses that differ only in their cache offset produces the same cache effects: definition 5. the multi-way cache observational model mwc,pc produces, for every executed instruction, the observation (pc, acc), where pc is the program counter and, for an instruction that accesses memory at address a, acc = (op, tag(a), index(a)) records the kind of access, the cache tag, and the cache set index. notice that by making the program counter observable, this model assumes that the attacker can infer the sequence of instructions executed by the program. we introduce several relaxed models, representing different assumptions on the hardware behavior and attacker capability. each relaxed model α is obtained by projecting the observations of definition 5: given the corresponding projection function f α , the model α produces the observation f α (o) whenever mwc,pc produces o.
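the cache geometry described in sect. 2.2 (64-byte lines, 32 kb, 4 ways, hence 128 sets) fixes the bit positions of off, index, and tag; a small python sketch, with the constants taken from that description:

```python
# Cortex-A53 L1 data-cache geometry as described above:
# 32 KiB, 4-way set associative, 64-byte lines => 128 cache sets.
LINE_BITS, SET_BITS = 6, 7

def off(a):
    """Cache line offset: the least significant 6 bits."""
    return a & ((1 << LINE_BITS) - 1)

def index(a):
    """Cache set index: bits 6 to 12."""
    return (a >> LINE_BITS) & ((1 << SET_BITS) - 1)

def tag(a):
    """Cache tag: the remaining high bits."""
    return a >> (LINE_BITS + SET_BITS)

# Addresses that differ only in their offset share a cache line, the
# fact the mwc,pc model relies on (these two addresses reappear in the
# offset-dependent counterexample of sect. 7):
a, b = 0x80100000, 0x80100020
assert tag(a) == tag(b) and index(a) == index(b) and off(a) != off(b)
```

with this decomposition, an mwc,pc observation for a load at address `a` is simply the pair `(pc, ("load", tag(a), index(a)))`.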
the following model assumes that the effects of instructions that do not interact with the data memory are not measurable, hence the attacker does not observe the program counter: definition 6. the projection of the multi-way cache observational model is f mwc ((pc, acc)) = acc. on many processors, the replacement policy for a cache set does not depend on previous accesses performed to other cache sets. the resulting isolation among cache sets leads to the development of an efficient countermeasure against access-driven attacks: cache coloring [23, 45]. this consists of partitioning the cache sets into multiple regions and ensuring that memory pages accessible by the adversary are mapped to a specific region of the cache. in this case, accesses to other regions do not affect the state of cache sets that an attacker can examine; therefore these accesses are not observable. this assumption is captured by the following model: definition 7. the projection of the partitioned multi-way cache observational model is f pmwc ((pc, acc)) = acc if acc = (op, t, i) and i belongs to the set of cache sets that are addressable by the attacker, and is ⊥ otherwise. notice that cache prefetching can violate soundness of this model, since accesses to the non-observable region of the cache may lead to prefetching addresses that lie in the observable part of the cache (see sect. 7.2). finally, for direct-mapped caches, where each memory address is mapped to only one cache entry, the cache tag should not be observable if the attacker does not share memory with the victim: definition 8. the projection of the direct-mapped cache observational model drops the cache tag, i.e., f dm ((pc, (op, t, i))) = (op, i). since the cache in cortex-a53 is multi-way set associative, this model is not sound. for example, in a two-way set associative cache, accessing a, a and a, b, where both a and b have the same cache set index but different cache tags, may result in different cache states. to achieve a degree of hardware independence, we use the architecture-agnostic intermediate representation bir [34].
it is an abstract assembly language with statements that work on memory, arithmetic expressions, and jumps. figure 3 shows an example of code in a generic assembly language and its transpiled bir code. this code performs a conditional jump to l2 if z holds, and otherwise it sets x1 to the multiplication x2 * x3. then, at l2 it loads a word from memory at address x1 into x2, and finally adds 8 to the pointer x1. bir programs are organized into blocks, which consist of jump-free statements and end in either conditional jump (cjmp), unconditional jump (jmp), or halt. bir also has explicit support for observations, which are produced by statements that evaluate a list of expressions in the current state. to account for expressive observational models, bir allows conditional observation. the condition is represented by an expression attached to the observation statement. the observation itself happens only if this condition evaluates as true in the current state. the observations in fig. 3 reflect a scenario where the datacache has been partitioned: some lines are exclusively accessible by the victim (i.e. the program), some lines can be shared with the attacker. the statement obs(sline(x1), [tag(x1), index(x1)]) for the load instruction consists of an observation condition (sline(x1)) and a list of expressions to observe ([tag(x1), index(x1)]). the function sline checks that the argument address is mapped in a shared line and therefore visible to the attacker. the functions tag and index extract the cache tag and set index in which the argument address is mapped. binary programs can be translated to bir via a process called transpilation. this transformation reuses formal models of the isas and generates a proof that certifies correctness of the translation by establishing a bisimulation between the two programs. we base our validation of observational models on the execution of binary programs rather than higher-level code representations. 
this approach has the following benefits: (i) it obviates the necessity to trust compilers or reason about how their compilation affects side channels. (ii) implementation effort is reduced, because most existing side-channel analysis approaches also operate on binary representations, which requires isa models. (iii) this approach allows finding isa model faults independently of the compilation. (iv) it enables a unified infrastructure to handle many different types of channels. in scam-v, we implemented two techniques to generate well-formed binaries: random program generation and monadic program generation. the random generator leverages the instruction encoding machinery of the existing hol4 model of the isa and produces arbitrary well-formed armv8 binaries, with the possibility to control the frequency of occurrences of each instruction class. the monadic generator follows a grammar-driven approach in the style of quickcheck [13] and generates arbitrary programs that fit a specific pattern or template. the program templates can be defined in a modular, declarative style and are extensible. we use this approach to generate programs in a guided fashion, focusing on processor features that we want to exercise in order to validate a model, or those we suspect may lead to a counterexample. figures 4 and 5 show some example programs generated by scam-v, including straight-line programs that only do memory loads, programs that load from addresses in a stride pattern to trigger automatic prefetching, and programs with branches. more details on how the program generators work can be found in [38]. synthesis of the weakest relation is based on standard symbolic execution techniques. we only cover the basic ideas of symbolic execution in the following and refer the reader to [30] for more details. we use x to range over symbols, and c, e, and p to range over symbolic expressions.
a symbolic state σ consists of a concrete program counter i σ , a path condition p σ , and a mapping m σ from variables to symbolic expressions. we write e(σ) = e for the symbolic evaluation of the expression e in σ, and e(s) for the value obtained by substituting the symbols of the symbolic expression e with the values of the variables in s, where s is a concrete state. symbolic execution produces one terminating state for each possible execution path: a terminating state is produced when halt is encountered; the execution of cjmp c l 1 l 2 from state σ follows both branches using the path conditions c(σ) and ¬c(σ). symbolic execution of the example in fig. 3 produces the terminating states σ 1 and σ 2 . for the first branch we have p σ1 = z and m σ1 = {x 1 → x 1 + 8, x 2 → load(m, x 1 )} (we omit the variables that are not updated), and for the second branch p σ2 = ¬z and m σ2 = {x 1 → x 2 * x 3 + 8, x 2 → load(m, x 2 * x 3 )}. we extend standard symbolic execution to handle observations. that is, we add to each symbolic state a list l σ , and the execution of obs c #» e in σ appends the pair (c, #» e ) to l σ , where c = c(σ) and #» e [i] = #» e [i](σ) are the symbolic evaluations of the condition and expressions of the observation. for instance, in the example of fig. 3 the lists for the terminating states are l σ1 = [(sline(x 1 ), [tag(x 1 ), index(x 1 )])] and l σ2 = [(sline(x 2 * x 3 ), [tag(x 2 * x 3 ), index(x 2 * x 3 )])]. let σ be the set of terminating states produced by the symbolic execution, s be a concrete state, and σ ∈ σ be a symbolic state such that p σ (s) holds; then executing the program from the initial state s produces the value m σ (x)(s) for the variable x. after computing σ, we synthesize the observational equivalence relation (denoted by ∼) by ensuring that every possible pair of execution paths has equivalent lists of observations. formally, s 1 ∼ s 2 is equivalent to (p σ1 (s 1 ) ∧ p σ2 (s 2 )) ⇒ l σ1 (s 1 ) = l σ2 (s 2 ) for every pair σ 1 , σ 2 ∈ σ, where two observation lists are equal if their conditions agree and the enabled expressions evaluate to the same values. this synthesized relation implies the observational equivalence defined in sect. 2 (definition 2).
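to make the synthesized check concrete, the following python sketch renders the two terminating states of the example program and compares observation lists for concrete states. the encoding, the state representation, and the shared-line predicate (first 10 cache sets shared, as in the later example) are simplifications of ours, not the scam-v implementation:

```python
def index(a): return (a >> 6) & 0x7F   # cache set index (bits 6..12)
def tag(a):   return a >> 13           # cache tag
def sline(a): return index(a) < 10     # assumed: first 10 sets are shared

# Terminating symbolic states of the example program, rendered as a path
# condition plus conditional observations, both over a concrete state s.
paths = [
    {"pc": lambda s: s["z"],
     "obs": [(lambda s: sline(s["x1"]),
              lambda s: (tag(s["x1"]), index(s["x1"])))]},
    {"pc": lambda s: not s["z"],
     "obs": [(lambda s: sline(s["x2"] * s["x3"]),
              lambda s: (tag(s["x2"] * s["x3"]), index(s["x2"] * s["x3"])))]},
]

def observations(s):
    """Observations produced by running the program from concrete state s."""
    sigma = next(p for p in paths if p["pc"](s))
    return [e(s) for c, e in sigma["obs"] if c(s)]

def obs_equivalent(s1, s2):
    """s1 ~ s2 for this program: equal observation lists."""
    return observations(s1) == observations(s2)

# Both states access only non-shared cache sets (indices 100 and 120),
# hence produce no observations and satisfy the relation:
assert obs_equivalent({"z": True, "x1": 0x80100000 + 64 * 100},
                      {"z": True, "x1": 0x80100000 + 64 * 120})
```

states whose accesses hit a shared set with differing cache tags fail the check, since the observed `(tag, index)` pairs then differ.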
in the example, the synthesized relation (after simplification) requires, for each pair of paths, that either neither access is observable or both observations coincide; for the pair of paths that both take the z-branch it reads (z ∧ z ′ ) ⇒ ((¬sline(x 1 ) ∧ ¬sline(x ′ 1 )) ∨ (sline(x 1 ) ∧ sline(x ′ 1 ) ∧ tag(x 1 ) = tag(x ′ 1 ) ∧ index(x 1 ) = index(x ′ 1 ))), where primed symbols represent variables of the second state; the symmetric and mixed cases, with x 2 * x 3 in place of x 1 , are analogous and omitted. [fig. 6: example test cases when the first 10 cache sets are shared.] we recall that raspberry pi 3 has 128 cache sets and 64 bytes per line. figure 6 shows two pairs of states that satisfy the relation, assuming only the first 10 cache sets are shared. states s 1 and s 2 lead the program to access the third cache set, while the states of the second pair lead the program to access cache sets that are not shared, and therefore generate no observations. a test case for a program p is a pair of initial states s 1 , s 2 such that p produces the same observations when executed from either state, i.e., s 1 ∼ s 2 . the relation as described in sect. 4 characterizes the space of observationally equivalent states, so a simple but naive approach to test-case generation consists in querying the smt solver for a model of this relation. the model that results from the query gives us two concrete observationally equivalent values for the registers that affect the observations of the program, so at this point we could forward these to our testing infrastructure to perform the experiment on the hardware. however, the size of an observational equivalence class can be enormous, because there are many variations of the initial states that cannot have effects on the channels available to the attacker. choosing a satisfying assignment for the entire relation every time without any extra guidance risks producing many test cases that are too similar to each other, and thus unlikely to find counterexamples. for instance, the smt solver may generate many variations of the test case (s 1 , s 2 ) in fig. 6 by iterating over all possible values for register x 2 of state s 1 , even if the value of this register is immaterial for the observation.
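the naive scheme just described, and the reason it yields overly similar test cases, can be illustrated with a brute-force search standing in for the smt solver (a toy relation over a single address; the domain and relation are hypothetical simplifications of ours):

```python
from itertools import product

def index(a): return (a >> 6) & 0x7F
def sline(a): return index(a) < 10   # assumed: first 10 sets are shared

def relation(x1, x1p):
    """Toy observational-equivalence relation over one accessed address:
    both accesses unobserved, or both observed with the same set index."""
    return ((not sline(x1) and not sline(x1p))
            or (sline(x1) and sline(x1p) and index(x1) == index(x1p)))

def naive_testcase(domain):
    """Return the first satisfying pair, like an unguided solver query."""
    for x1, x1p in product(domain, repeat=2):
        if relation(x1, x1p):
            return (x1, x1p)

# The first model found is a trivial pair of identical states:
assert naive_testcase([0x0, 0x40, 0x400, 0x800]) == (0x0, 0x0)
```

the unguided search keeps returning near-identical satisfying pairs; this is precisely the behavior that the guided, partition-based enumeration described next is designed to avoid.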
in practice, we explore the space of observationally equivalent states in a more systematic manner. to this end, scam-v supports two mechanisms to guide the selection of test cases: path enumeration and term enumeration. path enumeration partitions the space according to the combination of symbolic execution paths that are taken, whereas term enumeration partitions the space according to the value of a user-supplied bir expression. in both cases, the partitions are explored in round-robin fashion, choosing one test case from each partition in turn. to make the queries to the smt solver more efficient, we only generate the fragment of the relation that corresponds to the partition under test. path enumeration. every time we have to generate a test case, we first select a pair (σ 1 , σ 2 ) ∈ σ × σ of symbolic states as per sect. 4, which identifies a pair of paths (p σ1 , p σ2 ). the chosen paths vary in each iteration in order to achieve full path coverage. the query given to the smt solver then becomes p σ1 (s 1 ) ∧ p σ2 (s 2 ) ∧ l σ1 (s 1 ) = l σ2 (s 2 ). since the bulk of the relation is a conjunction of implications, this is a natural partitioning scheme that ensures all conjuncts are actually explored. note that without this mechanism, the smt solver could always choose states that only satisfy one and the same conjunct. to guide this process even further, the user can supply a path guard, which is a predicate on the space of paths. any path not satisfying the guard is skipped, allowing the user to avoid exploring unwanted paths. for example, for the program in fig. 3 we can use a path guard to force the test generation to select only paths that produce no observations, e.g., (z ⇒ ¬sline(x 1 )) ∧ (¬z ⇒ ¬sline(x 2 * x 3 )). term enumeration. in addition to path enumeration, we can choose a bir expression e that depends on the symbolic state, and a range r of values to enumerate.
every query also includes the conjuncts e σ1 = v 1 ∧ e σ2 = v 2 , where v 1 , v 2 ∈ r are chosen to achieve full coverage of r × r. term enumeration can be useful to introduce domain-specific partitions, provided that r × r is small enough. for example, this mechanism can be used to ensure that we explore addresses that cover all possible cache sets, if we set e to be a mask that extracts the cache set index bits of the address. in particular, for the program in fig. 3 we can use z * index(x 1 ) + (1 − z) * index(x 2 * x 3 ) to enumerate all combinations of accessed cache sets while respecting the paths. the implementation of scam-v is done in the hol4 theorem prover using its meta-language, i.e., sml. scam-v relies on the binary analysis platform holba for transpiling the binary code of test programs to the bir representation. this transpilation uses the existing hol4 model of the armv8 architecture [16] to give semantics to arm programs. in order to validate the observational models of sect. 2.3, we extended the transpilation process to inline observation statements into the resulting bir program. these observations represent the observational power of the side channel. in order to compute the possible execution paths of test programs and their corresponding observations, which are needed to synthesize the observational equivalence relation of sect. 4, we implemented a symbolic execution engine in hol4. all program generators from sect. 3 as well as the weakest-relation synthesis from sect. 4 and the test-case generator from sect. 5 are implemented as sml libraries in scam-v. the latter uses the smt solver z3 [14] to generate test inputs. for conducting the experiments in this paper, we used raspberry pi 3 boards equipped with arm cortex-a53 processors implementing the armv8-a architecture. the scam-v pipeline generates programs and pairs of observationally equivalent initial states (test cases) for each program.
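the path- and term-enumeration scheme of sect. 5 can be sketched as follows; the partition descriptors stand in for the per-partition smt queries, and the names and encoding are ours:

```python
from itertools import product

def partitions(paths, term_values, path_guard=lambda p1, p2: True):
    """Enumerate the partitions of the equivalence-class space: one per
    pair of symbolic paths (path enumeration), optionally refined by a
    pair of values of the user-supplied term (term enumeration). Pairs
    rejected by the path guard are skipped."""
    return [(p1, p2, v1, v2)
            for p1, p2 in product(paths, repeat=2) if path_guard(p1, p2)
            for v1, v2 in product(term_values, repeat=2)]

def round_robin(parts, n_test_cases):
    """Pick one partition per test case, cycling through all of them so
    that every conjunct of the relation is eventually exercised."""
    return [parts[i % len(parts)] for i in range(n_test_cases)]
```

for instance, with two symbolic paths and a term ranging over two cache set indices, `partitions` yields 16 partitions, and `round_robin` visits each before repeating any.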
each combination of a program with one of its test cases is called an experiment. after generating experiments, we execute them on the processor implementation of interest to examine their effects on the side channel. figure 7 depicts the life of a single experiment as it goes through our experiment handling design. this consists of: (step 1) generating an experiment and storing it in a database, (step 2) retrieving the experiment from the database, (step 3) integrating it with experiment-platform code and compiling it into platform-compatible machine code, and (steps 4-6) executing the generated binary on the real board, as well as finally receiving and storing the experiment result. the experiment-platform code configures page tables to set up cacheable and uncacheable memory, clears the cache before every execution of the program, and inserts memory barriers around the experiment code. the platform executes in arm trustzone, which enables us to use privileged debug instructions to obtain the cache state directly for comparison after experiment execution. the way in which we compare final cache states for distinguishability depends on the attacker and observational model in question. for the multi-way cache, we say two states are indistinguishable if and only if for each valid entry in one state, there is a valid entry with the same cache tag in the corresponding cache set of the other state, and vice versa. for the partitioned multi-way cache, we check the states in the same way, except we do it only for a subset of the cache sets (see sect. 7.2 for details on the exact partition). for the direct-mapped cache, we compare how many valid cache lines there are in each set, disregarding the cache tags. these comparison functions have been chosen to match the attacker power of the relaxed models in definitions 6, 7, and 8, respectively. since the armv8 experimentation platform runs as bare-metal code, there are no background processes or interrupts.
despite this fact, our measurements may contain noise due to other hardware components that share the same memory subsystem, such as the gpu, and because our experiments are not synchronized with the memory controller. in order to simplify repeatability of our experiments, we execute each experiment 10 times and check for discrepancies in the final state of the data cache. unless all executions give the same result, the experiment is classified as inconclusive and excluded from further analysis. first, we want to make sure that scam-v can invalidate unsound observational models in general. for this purpose, we generated experiments that use the model of definition 8, i.e., for every memory access in bir we observe the cache set index of the address of the access. we know that this is not a sound model for raspberry pi 3, because the platform uses a 4-way cache. table 1.1 shows that both the random program generator and the monadic load generator uncovered counterexamples that invalidated this observational model. next, we consider the partitioned cache observational model from definition 7. that is, we partition the l1 cache of the raspberry pi 3 into two contiguous regions and assume that the attacker has access only to the second region. due to the prefetcher of the cortex-a53 we expect this model to be unsound, and indeed we could invalidate it. to this end, we generated experiments for two variations of the model. variation a splits the cache at cache set 61, meaning that only cache sets 61-127 were considered accessible to the attacker. variation b splits the cache at cache set 64 (the midpoint), such that cache sets 64-127 were considered visible. the following program is one of the counterexamples for variation a that have been discovered by scam-v using the monadic program generator. [program listing omitted] the counterexample exploits the fact that prefetching fills more lines than those loaded by the program, provided the memory accesses happen in a certain stride pattern.
thus, it essentially needs to have two properties: (i) two different starting addresses for the stride, a 1 and a 2 , with cache set indices lower than 61, so as to avoid any observations in the model and thus satisfy observational equivalence, and (ii) one of a 1 and a 2 close enough to the partition boundary. in this case, automatic prefetching will continue to fill lines in subsequent sets, effectively crossing the boundary into the attacker-visible region. in our experiments, we used a path guard to generate only states whose memory accesses fall in the region of the cache that is not visible to the attacker. additionally, we used term enumeration to force successive test cases to start a stride on a different cache set and therefore cover the different cache set indices. without this guidance, the tool would only generate experiments that affect the lower sets of the cache and never explore scenarios that affect the sets with indices closer to the split boundary. for variation b, we have not found such a counterexample. the only difference is that the partition boundary is on line 64, which means that each partition fits exactly in a small page (4k). we conjecture that the prefetcher does not perform line fills across small page (4k) boundaries. this could be for performance reasons, as crossing a page boundary can involve a costly page walk if the next page is not in the tlb. if this is the case, it would seem that it is safe to use prefetching with a partitioned cache, provided the partitions are page-aligned. table 1.2 summarizes our experiments for this model. in the remaining experiments, we consider the model of definition 6 and assume that the attacker has access to the complete l1 cache. although we expected this model to be sound, our experiments (table 1.3) identified several counterexamples. we comment on two classes of counterexamples below. previction.
some counterexamples are due to an undocumented behavior that we called "previction" because it causes a cache line to be evicted before the corresponding cache set is full. the following program compares x0 and x1 and executes a sequence of three loads; in case of equality, fourteen nops are executed between the first two loads. [program listing omitted] input 1 and input 2 are two states that exercise the two execution paths and have the same values for x2, x3 and x4, hence the two states are observationally equivalent. notice that all memory loads access cache set 0. since the cache is 4-way associative and initially empty, we expect no eviction to occur. executions starting in input 2 behave as expected and terminate with the addresses of x2, x3, and x4 in the final cache state. however, the execution from input 1 leads to a previction, which causes the final cache state to contain only the addresses of x3 and x4. the address of x2 has been evicted even though the cache set is not full. therefore the two states are distinguishable by the attacker. our hypothesis is that the processor detects a short sequence of loads to the same cache set and anticipates more loads to the same cache set with no reuse of previously loaded values; it evicts the valid cache line in order to make space for more colliding lines. we note that these cache entries are not dirty and thus eviction is most likely a cheap operation. the execution of a nop sequence probably ensures that the first cache line fill is completed before the other addresses are accessed. the second class of counterexamples involves offset-dependent behavior. [program listing omitted] this program consists of five consecutive load instructions and always produces five observations consisting of the cache tag and set index of the five addresses. input 1 and input 2 are observationally equivalent: they differ only in x16, which affects the address used for the third load, but the addresses 0x80100020 and 0x80100000 have the same cache tag and set index and differ only in the offset within the same cache line.
however, these experiments lead to two distinguishable microarchitectural states. more specifically, execution from input 1 results in the filling of cache set 0, where the addresses of registers x0, x3, x16 and x22 + 8 are present in the cache, while execution from input 2 leads to a cache state where the address of x0 is not in the cache and has probably been evicted. this effect can be the result of the interaction between cache previction and cache bank collisions [9, 40], whose behavior depends on the cache offset. notice that cache bank collisions are undocumented for the arm cortex-a53. tromer et al. [46] have shown that such offset-dependent behaviors can render insecure those side-channel countermeasures for aes that rely on making accesses to memory blocks (rather than addresses) key-independent. in addition to microarchitectural features that invalidate the formal models, our experiments identified bugs in the implementation of the models: (1) the formalization of the armv8 instruction set used by the transpiler and (2) the module that inserts bir observation statements into the transpiled binary to capture the observations that can be made according to a given observational model. table 1.4 reports problems identified by the random program generator. some of these failing experiments result in distinguishable states while others result in run-time exceptions. in fact, if the model predicts wrong memory accesses for a program, then our framework can generate test inputs that cause accesses to unmapped memory regions. the example program in fig. 4 exhibits both problems when executed with appropriate inputs. missing observations. the second step of our framework translates binary programs to bir and adds observations to reflect the observational model under validation. in order to generate observations that correspond to memory loads, we syntactically analyze the right-hand side of bir assignments. for instance, for line l2 in fig.
3 we generate an observation that depends on variable x1, because the right-hand side of the assignment is load(mem, x1). this approach is problematic when a memory load is immaterial for the result of an instruction. for example, the ldr xzr and ldr wzr instructions load from memory into a register that is constantly zero. the following program loads from x30 into xzr. [program and inputs omitted] the translation of this instruction is simply [jmp next_addr]: there is no assignment that loads from x30, because the register xzr remains zero. therefore, our model generates no observations and any two input states are observationally equivalent. the arm specification does not clarify whether the microarchitecture can skip the immaterial memory load. our experiments show that this is not the case, and therefore our implementation of the model is not correct. in fact, the program accesses cache set index(0x80000040) = 1 for input 1 and cache set index(0x80000038) = 0 for input 2, which results in distinguishable states. moreover, by not taking the memory access into account, our framework generates some tests that set x30 to unmapped addresses and cause run-time exceptions. flaw in hol4 armv8 isa model. our tool has identified a bug in the hol4 armv8 isa model. this model has been used in several projects [8, 17] as the basis for formal analysis and is used by our framework to transform arm programs into bir programs. despite its wide adoption, we identified a problem in the semantics of the instructions compare and branch on zero (cbz) and compare and branch on non-zero (cbnz). these instructions implement a conditional jump based on the comparison of the input register with zero. while cbz jumps in case of equality, cbnz jumps in case of inequality. however, our tests identified that cbnz wrongly behaves as cbz in the hol4 model. hardware models. verification approaches that take into account the underlying hardware architecture have to rely on a formal model of that architecture.
commercial instruction set architectures (isas) are usually specified mostly in natural language, and their formalization is an active research direction. for example, goel et al. [24] formalize the isa of x86 in acl2, morrisett et al. [37] model the x86 architecture in the coq theorem prover, and sarkar et al. [42] provide a formal semantics of the x86 multiprocessor isa in hol. moreover, domain-specific languages for isas have been developed, such as the l3 language [19] , which has been used to model the armv7 architecture. as another example, siewiorek et al. [44] proposed the instruction-set processor language for formalizing the semantics of the instructions of a processor. to gain confidence in the correctness of a processor model, it needs to be verified or validated against the actual hardware. this problem has received considerable attention lately. there are whitebox approaches such as the formal verification that a processor model matches a hardware design [10, 18] . these approaches differ from ours in that they try to give a formal guarantee that a processor model is a valid abstraction of the actual hardware, and to achieve that they require the hardware to be accessible as a white box. more similar to ours are black-box approaches that validate an abstract model by randomly generated instructions or based on dynamic instrumentation [20, 29] . combinations of formal verification and testing approaches for hardware verification and validation have also been considered [11] . in contrast to our work, all of the approaches above are limited to functional correctness, and validation is limited to single-instruction test cases, which we show to be insufficient for information flow properties. going beyond these restrictions is the work of campbell and stark [12] , who generate sequences of instructions as test cases, and go beyond functional correctness by including timing properties. 
still, neither their models nor their approach is suitable for identifying violations of information flow properties. validating information flow properties. to the best of our knowledge, we present the first automated approach to validate processor models with respect to information flow properties. to this end, we build on the seminal works of mclean [35] on non-interference, roscoe [41] on observational determinism, and barthe et al. [7] on self-composition as a method for proving information flow properties. most closely related is the work by balliu et al. [6] on relational analysis based on observational determinism. these approaches are based on the different observational models that have been proposed in the literature. for example, the program counter security model [36] has been used when the execution time depends on the control flow of the victim. extensions of this model also make observable the data that can affect the execution time of an instruction, or the memory addresses accessed by the program, to model timing differences due to caching [4] . many analysis tools use these observational models. ct-verif [3] implements a sound information flow analysis by constructing a product program and proving observational equivalence. cacheaudit [15] quantifies information leakage by using abstract interpretation. the risks of using unsound models for such analyses have been demonstrated by the recent spectre attack family [32] , which exploits speculation to leak data through caches. several other architectural details require special caution when using abstract models, as some properties assumed by the models could be unmet. for instance, cache clean operations do not always clean residual state in implementations of replacement policies [21] . furthermore, many processors do not provide sufficient means to close all leakage, e.g., shared state cannot be cleaned properly on a context switch [22] .
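the relational idea behind observational determinism, as used above, can be reduced to a small check: two inputs that the observational model deems equivalent must yield indistinguishable observation traces. the sketch below is a toy illustration, not scam-v's api; the observation here is simply the cache set of every accessed address, and all names are hypothetical.

```python
# Relational validation sketch: a model is refuted if two
# model-equivalent inputs produce different observation traces.
NUM_SETS = 128  # assumed cache geometry for the toy observation

def obs_trace(program, state):
    """Run `program` (a function from state to the list of memory
    addresses it accesses) and record the cache set of each access."""
    return [addr % NUM_SETS for addr in program(state)]

def validate(program, input1, input2):
    """True iff the two inputs are observationally equivalent."""
    return obs_trace(program, input1) == obs_trace(program, input2)

# Toy "program" whose accessed address depends on a secret offset:
leaky = lambda s: [0x1000 + s["secret"]]
print(validate(leaky, {"secret": 0}, {"secret": 64}))  # -> False (traces differ)
```

a real framework replaces the toy trace with measurements on hardware, but the pass/fail criterion is exactly this trace comparison.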
finally, it has been shown that fixes relying on too specific assumptions can be circumvented by modifying the attack [43] , and that attacks are possible even against formally verified software if the underlying processor model is unsound [28] . for these reasons, validation of formal models by directly measuring the hardware is of great importance. we presented scam-v, a framework for automatic validation of observational models of side channels. scam-v uses a novel combination of symbolic execution, relational analysis, and observational models to generate experiments. we evaluated scam-v on the arm cortex-a53 processor and we invalidated all models of sect. 2.3, i.e., those with observations that are cache-line-offset-independent. our results are summarized as follows: (i) in case of cache partitioning, the attacker can discover victim accesses to the other cache partitions due to the automatic data prefetcher; (ii) the cortex-a53 prefetcher seems to respect 4k page boundaries, like in some intel processors; (iii) a mechanism of cortex-a53, which we called previction, can leak the time between accesses to the same cache set; (iv) the cache state is affected by the cache line offset of the accesses, probably due to undocumented cache bank collisions like in some amd processors; (v) the formal armv8 model had a flaw in the implementation of cbnz; (vi) our implementation of the observational model had a flaw in case of loads into the constant zero register. moreover, since the microarchitectural features that lead to these findings are also available on other armv8 cores, including some that are affected by spectre (e.g. cortex a57), it is likely that similar behaviors can be observed on these cores, and that more powerful observational models, including those that take into account spectre-like effects, may also be unsound. 
these promising results show that scam-v can support the identification of undocumented and security-relevant features of processors (like results (ii), (iii), and (iv)) and discover problems in the formal models (like results (v) and (vi)). in addition, users can drive test-case generation to conveniently explore classes of programs that they suspect would lead to side-channel leakage (like in result (i)). this process is enabled by path and term enumeration techniques as well as custom program generators. moreover, scam-v can aid vendors to validate implementations with respect to desired side-channel specifications. given the lack of vendor communication regarding security-relevant processor features, validation of abstract side-channel models is of critical importance. as a future direction of work, we are planning to extend scam-v for other architectures (e.g. arm cortex-m0 based microcontrollers), noisy side channels (e.g. time and power consumption), and other side channels (e.g. cache replacement state). moreover, we are investigating approaches to automatically repair an unsound observational model starting from the counterexamples, e.g., by adding state observations. finally, the theory in sect. 4 can be used to develop a certifying tool for verifying observational determinism. open access this chapter is licensed under the terms of the creative commons attribution 4.0 international license (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the creative commons license and indicate if changes were made. the images or other third party material in this chapter are included in the chapter's creative commons license, unless indicated otherwise in a credit line to the material. 
if material is not included in the chapter's creative commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

references:
- trace-driven cache attacks on aes (short paper)
- predicting secret keys via branch prediction
- verifying constant-time implementations
- formal verification of side-channel countermeasures using self-composition
- arm limited: vulnerable arm processors to spectre attack
- automating information flow analysis of low level code
- secure information flow by self-composition
- on the verification of system-level information flow properties for virtualized execution platforms
- cache-timing attacks on aes
- putting it all together - formal verification of the vamp
- a survey of hybrid techniques for functional verification
- randomised testing of a microprocessor model using smt-solver state generation
- quickcheck: a lightweight tool for random testing of haskell programs
- z3: an efficient smt solver
- cacheaudit: a tool for the static analysis of cache side channels
- l3: a specification language for instruction set architectures
- verified compilation of cakeml to multiple machine-code targets
- formal specification and verification of arm6
- directions in isa specification
- a trustworthy monadic formalization of the armv7 instruction set architecture
- do hardware cache flushing operations actually meet our expectations
- your processor leaks information - and there's nothing you can do about it
- preventing cache-based side-channel attacks in a cloud environment
- simulation and formal verification of x86 machine-code programs that make system calls
- security policies and security models
- unwinding and inference control
- rowhammer.js: a remote software-induced fault attack in javascript
- cache storage channels: alias-driven attacks and verified countermeasures
- an executable formalisation of the sparcv8 instruction set architecture: a case study for the leon3 processor
- symbolic execution and program testing
- timing attacks on implementations of diffie-hellman, rsa, dss, and other systems
- spectre attacks: exploiting speculative execution
- differential power analysis
- trabin: trustworthy analyses of binaries
- proving noninterference and functional correctness using traces
- the program counter security model: automatic detection and removal of control-flow side channel attacks
- rocksalt: better, faster, stronger sfi for the x86
- validation of abstract side-channel models for computer architectures
- advances on access-driven cache attacks on aes
- cache attacks and countermeasures: the case of aes
- csp and determinism in security modelling
- the semantics of x86-cc multiprocessor machine code
- malicious management unit: why stopping cache attacks in software is harder than you think
- computer structures: principles and examples
- the tlb slice - a low-cost high-speed address translation mechanism
- efficient cache attacks on aes, and countermeasures
- cryptanalysis of des implemented on computers with cache
- cross-vm side channels and their use to extract private keys

acknowledgments. we thank matthias stockmayer for his contributions to the symbolic execution engine in this work. this work has been supported by the trustfull project financed by the swedish foundation for strategic research, the kth cerces center for resilient critical infrastructures financed by the swedish civil contingencies agency, as well as the german federal ministry of education and research (bmbf) through funding for the cispa-stanford center for cybersecurity (fkz: 13n1s0762).

key: cord-034843-cirltmy4
authors: nabipour, m.; nayyeri, p.; jabani, h.; mosavi, a.; salwana, e.; shahab, s.
title: deep learning for stock market prediction
date: 2020-07-30
journal: entropy (basel)
doi: 10.3390/e22080840
doc_id: 34843
cord_uid: cirltmy4

the prediction of stock group values has always been attractive and challenging for shareholders due to their inherent dynamics, non-linearity, and complex nature.
this paper concentrates on the future prediction of stock market groups. four groups, named diversified financials, petroleum, non-metallic minerals, and basic metals, from the tehran stock exchange were chosen for experimental evaluations. data were collected for the groups based on 10 years of historical records. the value predictions are created for 1, 2, 5, 10, 15, 20, and 30 days in advance. various machine learning algorithms were utilized for the prediction of future values of stock market groups. we employed decision tree, bagging, random forest, adaptive boosting (adaboost), gradient boosting, and extreme gradient boosting (xgboost) as tree-based models, and artificial neural networks (ann), recurrent neural networks (rnn) and long short-term memory (lstm) as deep learning methods. ten technical indicators were selected as the inputs into each of the prediction models. finally, the results of the predictions were presented for each technique based on four metrics. among all algorithms used in this paper, lstm shows more accurate results with the highest model-fitting ability. in addition, for tree-based models, there is often an intense competition between adaboost, gradient boosting, and xgboost. the prediction of stock values has always been a challenging problem [1] because of its long-term unpredictable nature. the dated market hypothesis holds that it is impossible to predict stock values and that stocks behave randomly, but recent technical analyses show that most stock values are reflected in previous records; therefore, the movement trends are vital for predicting values effectively [2] . moreover, stock market groups and movements are affected by several economic factors such as political events, general economic conditions, commodity price indices, investors' expectations, movements of other stock markets, the psychology of investors, etc. [3] . the value of stock groups is computed from stocks with high market capitalization.
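the abstract says each technique is evaluated with four metrics without naming them at this point; as a dependency-light illustration (the specific metrics below are common regression choices, assumed rather than taken from the paper), such an evaluation boils down to a few formulas:

```python
import math

def regression_metrics(y_true, y_pred):
    """Standard regression error metrics: MAE, MSE, RMSE, MAPE."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n                       # mean absolute error
    mse = sum(e * e for e in errors) / n                        # mean squared error
    rmse = math.sqrt(mse)                                       # root mean squared error
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n  # percentage error
    return {"mae": mae, "mse": mse, "rmse": rmse, "mape": mape}

print(regression_metrics([100.0, 200.0], [110.0, 190.0]))
```

a model with a lower value on all four metrics fits the held-out price series more closely.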
there are different technical parameters to obtain statistical data from the value of stock prices [4] . generally, stock indices are gained from prices of stocks with high market investment. in prior work, such data were employed via principal component analysis (pca) to predict the daily future of stock market index returns. the results showed that deep neural networks were superior as classifiers based on pca-represented data compared to others. das et al. [23] implemented feature optimization by considering the social and biochemical aspects of the firefly method. in their approach, they involved the objective value selection in the evolutionary context. the results indicated that firefly, with an evolutionary framework applied to the online sequential extreme learning machine (oselm) prediction method, was the best model among the other experimented ones. hoseinzade and haratizadeh [24] proposed a convolutional neural networks (cnns) framework, which can be applied to various data collections (involving different markets) to explore features for predicting the future movement of the markets. from the results, a remarkable improvement in prediction performance in comparison with other recent baseline methods was achieved. krishna kumar and haider [25] compared the performance of single classifiers with a multi-level classifier, which was a hybrid of machine learning techniques (such as decision tree, support vector machine, and logistic regression classifiers). the experimental results revealed that the multi-level classifier outperformed the other works and led to a more accurate model with the best predictive ability, roughly 10 to 12% growth in accuracy. chung and shin [26] applied one of the deep learning methods (cnns) for predicting the stock market movement.
in addition, the genetic algorithm (ga) was employed to optimize the parameters of the cnn method systematically, and the results showed that the ga-cnn, as the hybrid method of ga and cnn, outperformed the comparative models. sim et al. [27] proposed cnn to predict stock prices as a new learning method. the study aimed to solve two problems: using cnns and optimizing them for stock market data. wen et al. [28] applied the cnn algorithm to noisy temporal series using frequent patterns as a new method. the results proved that the method was adequately effective and outperformed traditional signal processing methods with a 4 to 7% accuracy improvement. rekha et al. [29] employed cnn and rnn to make a comparison between the two algorithms' results and actual results via stock market data. lee et al. [30] used cnns to predict the global stock market and then trained and tested their model with data from other countries. the results demonstrated that the model could be trained on relatively large data and tested on small markets where there was not a sufficient amount of data. liu et al. [31] investigated a numerical-based attention method with dual-source stock market data to find the complementarity between numerical data and news in the prediction of stock prices. as a result, the method filtered noise effectively and outperformed prior models in dual-source stock prediction. baek and kim [32] proposed an approach for stock market index forecasting which included a prediction lstm module and an overfitting prevention lstm module. the results confirmed that the proposed model had excellent forecasting accuracy compared to a model without an overfitting prevention lstm module. chung and shin [33] employed a hybrid approach of lstm and ga to improve a novel stock market prediction model. the final results showed that the hybrid model of the lstm network and ga was superior in comparison with the benchmark model. chen et al.
[34] used three neural networks, the radial basis function neural network, the extreme learning machine, and three traditional artificial neural networks, to evaluate their performance on high-frequency data of the stock market. their results indicated that deep learning methods extracted nonlinear features from transaction data and could predict the future of the market powerfully. zhou et al. [35] applied lstm and cnn to high-frequency data from the stock market with a rolling partition of the training and testing sets to evaluate the effect of the update cycle on the performance of the models. based on extensive experimental results, the models could effectively reduce errors and increase prediction accuracy. chong et al. [36] tried to examine the performance of deep learning algorithms for stock market prediction with three unsupervised feature extraction methods: pca, restricted boltzmann machine, and autoencoder. the final results, with significant improvement, suggested that additional information could be extracted by deep neural networks from the autoregressive model. long et al. [37] suggested an innovative end-to-end model named multi-filters neural network (mfnn) specifically for the price prediction task and feature extraction on financial time series data. their results indicated that the network outperformed common machine learning methods, statistical models, and convolutional, recurrent, and lstm networks in terms of accuracy, stability, and profitability. moews et al. [38] proposed employing deep neural networks that use step-wise linear regressions with exponential smoothing in the preparatory feature engineering for this task, with regression slopes as movement strength indicators for a specified time interval. the final results showed the feasibility of the suggested method, with advanced accuracies and accounting for the statistical importance of the results for additional validation, as well as prominent implications for modern economics. garcia et al.
[39] examined the effect of financial indicators on the german dax-30 stock index by employing a hybrid fuzzy neural network to forecast the one-day-ahead direction of the index with various methods. their experimental work demonstrated that reducing the dimension through factorial analysis produces less risky and more profitable strategies. cervelló-royo and guijarro [40] compared the performance of four machine learning models to validate the predicting capability of technical indicators in the technological nasdaq index. the results showed that the random forest outperformed the other models considered in their study, being able to predict the 10-day-ahead market movement with an average accuracy of 80%. konstantinos et al. [41] suggested an ensemble prediction combination method as an alternative approach to forecasting time series. the ensemble learning technique combined various learning models. their results indicated the effectiveness of the proposed ensemble learning method, and the comparative analysis showed adequate evidence that the method could be used successfully for prediction based on multivariate time series problems. overall, all researchers believe that stock price prediction and modeling have been challenging problems for researchers and speculators due to the noisy and non-stationary characteristics of the data. there are minor differences between papers in choosing the most effective indicators for modeling and predicting the future of stock markets. feature selection can be an important part of studies to achieve better accuracy; however, all studies indicate that uncertainty is an inherent part of these forecasting tasks because of fundamental variables.
employing new machine learning and deep learning methods, such as recent ensemble learning models, cnns, and rnns with high prediction ability, is a significant advantage of recent studies that shows the forecasting potential of these methods in comparison with traditional and common approaches such as statistical analyses. iran's stock market has been highly popular recently because of the rising growth of the tehran stock exchange dividend and price index (tedpix) in the last decades, and one of the reasons is that most of the state-owned firms are being privatized under the general policies of article 44 of the iranian constitution, and people are allowed to buy the shares of newly privatized firms under specific circumstances. this market has some specific attributes in comparison with other countries' stock markets, one of them being a daily price limit of ±5% of the opening price of the day for every index. this limit hinders abnormal market fluctuation and scatters market shocks, political issues, etc. over a specific time, and could make the market smoother and more predictable. trading takes place through licensed registered private brokers of the exchange organization, and the opening price of the next day is set through the defined base volume of the companies and the transaction volume as well. however, the lack of valuable papers on this market that predict future values with machine learning models is clear. this study concentrates on the process of future value prediction for stock market groups, which is crucial for investors. despite significant development in iran's stock market in recent years, there has not been enough research on stock price predictions and movements using novel machine learning methods. this paper aims to compare the performance of several regressors applied to fluctuating data to evaluate predictor models, and the predictions are evaluated for 1, 2, 5, 10, 15, 20, and 30 days in advance.
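building the 1-, 2-, 5-, ..., 30-day-ahead targets mentioned above amounts to pairing each day's features with the value a fixed number of days later. a minimal pure-python sketch (the paper's actual features are 10 technical indicators; here a single price stands in for them):

```python
def make_horizon_targets(prices, horizon):
    """Pair each day's feature (here just the price) with the value
    `horizon` days later; the last `horizon` days have no target."""
    return [(prices[i], prices[i + horizon])
            for i in range(len(prices) - horizon)]

prices = [100, 102, 101, 105, 107, 110, 108]
print(make_horizon_targets(prices, 2))
# first pair: (100, 101) -- today's value and the value 2 days ahead
```

one such feature/target set is built per horizon, so each regressor is trained and scored seven times, once per forecast distance.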
in addition, by tuning parameters, we try to reduce errors and increase the accuracy of the models. ensemble learning models are broadly employed nowadays for their predictive performance. these methods combine multiple forecasts from one or multiple methods to improve the accuracy of a simple prediction and to prevent possible overfitting problems. in addition, anns are universal approximators and flexible computing frameworks which can be applied to an extensive range of time series forecasting problems with a great degree of accuracy. therefore, considering the literature review, this research work examines the predictability of a set of cutting-edge machine learning methods, which involves tree-based models and deep learning methods.
employing the whole set of tree-based methods, rnn, and lstm techniques for regression problems and comparing their performance on the tehran stock exchange is a recent research activity presented in this study. this paper includes three different sections. at first, through the methodology section, the evolution of tree-based models is presented with an introduction of each one. besides, the basic structure of neural networks and recurrent ones is described briefly. in the research data section, 10 technical indicators are shown in detail with the selected methods' parameters. at the final step, after introducing three regression metrics, the machine learning results are reported for each group, and the models' behavior is compared. since the set of splitting rules employed to divide the predictor space can be summarized in a tree, these types of models are known as decision-tree methods. figure 1 , adapted from [42, 43] , shows the evolution of tree-based algorithms over several years, and the following sections introduce them. decision trees are a popular supervised learning technique used for classification and regression jobs. the purpose is to make a model that predicts a target value by learning simple decision rules formed from the data features. there are some advantages of using this method, such as being easy to understand and interpret or able to work out problems with multiple outputs; on the contrary, creating over-complex trees that result in overfitting is a fairly common disadvantage. a schematic illustration of the decision tree is shown in figure 2 , adapted from [43] . a bagging model (as a regressor model) is an ensemble estimator that fits each basic regressor on random subsets of the dataset and then aggregates their individual predictions, either by voting or by averaging, to make the final prediction. this method is a meta-estimator and can commonly be employed as an approach to decrease the variance of an estimator such as a decision tree by introducing randomization into its construction procedure and then creating an ensemble out of it. in this method, samples are drawn with replacement and predictions are obtained through a majority voting mechanism. the random forest model is created by a great number of decision trees. this method simply averages the prediction results of the trees, which is called a forest.
in addition, this model has three random concepts: randomly choosing training data when making trees, selecting some subsets of features when splitting nodes, and considering only a subset of all features for splitting each node in each simple decision tree. during training in a random forest, each tree learns from a random sample of the data points. a schematic illustration of the random forest, adapted from [43] , is indicated in figure 3 .
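the two ensemble principles used in this section can be reduced to a few dependency-free lines: bagging draws bootstrap samples and averages the estimators' predictions, while boosting repeatedly fits a weak learner to the current residuals. to keep the sketch self-contained, the "weak learner" below is just a mean, not a tree, so this illustrates the combination rules only, not the paper's models.

```python
import random

def bagging_predict(data, n_estimators=50, seed=0):
    """Bagging: average the predictions of estimators fit on
    bootstrap samples (drawn with replacement)."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_estimators):
        sample = [rng.choice(data) for _ in data]   # bootstrap sample
        preds.append(sum(sample) / len(sample))     # weak learner: sample mean
    return sum(preds) / len(preds)                  # aggregate by averaging

def boosting_predict(targets, n_rounds=100, lr=0.1):
    """Boosting: each round fits a weak learner (here the mean
    residual) to the errors of the ensemble so far."""
    pred = [0.0] * len(targets)
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(targets, pred)]
        step = sum(residuals) / len(residuals)      # weak learner on residuals
        pred = [p + lr * step for p in pred]        # shrunken additive update
    return pred

print(round(bagging_predict([1.0, 2.0, 3.0, 4.0], n_estimators=200), 1))
print(boosting_predict([10.0, 10.0, 10.0]))  # converges toward the targets
```

replacing the mean with a shallow decision tree recovers, in spirit, the bagging/random forest and gradient boosting regressors evaluated in the paper.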
anns are single- or multi-layer neural nets that are fully connected. figure 4 shows a sample ann with an input layer, an output layer, and two hidden layers, adapted from [43]. within a layer, each node is connected to every node in the next layer, and increasing the number of hidden layers makes the network deeper.
figure 5 shows the computation performed at each hidden or output node: the node takes the weighted sum of its inputs, adds a bias value, and passes the result through an activation function (usually a non-linear one). the result is the output of the node, which in turn becomes an input to the nodes of the next layer. the procedure moves from the input to the output, and the final output is determined by carrying out this process at every node. training the neural network consists of learning the weights and biases associated with all the nodes. equation (1) shows the relationship between nodes, weights, and biases [44]: the weighted sum of the inputs of a layer is passed through a non-linear activation function to a node in the next layer, z = f(w1 x1 + w2 x2 + ... + wn xn + b), where x1, x2, ..., xn are the inputs, w1, w2, ..., wn are the corresponding weights, n is the number of inputs to the node, b is the bias, f is the activation function, and z is the output.
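a minimal numerical reading of equation (1) and of the layer-by-layer forward pass can be sketched as follows (a toy numpy illustration; the weights, the relu activation, and the two-layer architecture are arbitrary choices, not the networks used in this study):

```python
import numpy as np

def relu(v):
    """A common non-linear activation: max(0, v) elementwise."""
    return np.maximum(0.0, v)

def node_output(x, w, b, f=relu):
    """Equation (1): weighted sum of the inputs plus a bias, passed through an activation."""
    return f(np.dot(w, x) + b)

def forward(x, layers):
    """Fully connected forward pass: each layer's output feeds the next layer's input."""
    out = x
    for w, b, f in layers:              # w: (n_out, n_in), b: (n_out,)
        out = f(w @ out + b)
    return out

x = np.array([1.0, 2.0])
layers = [
    (np.array([[0.5, -0.25], [1.0, 1.0]]), np.array([0.0, -1.0]), relu),  # hidden layer
    (np.array([[1.0, 1.0]]), np.array([0.5]), lambda v: v),               # linear output
]
y_out = forward(x, layers)
```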
the training process is then completed by a few rules: initialize the weights and biases of all nodes randomly; perform a forward pass with the current weights and biases and compute each node's output; compare the final output with the actual target; and modify the weights and biases accordingly by gradient descent in the backward pass, a procedure generally known as the backpropagation algorithm. the rnn is a very prominent variant of neural networks, extensively used for sequential processes. in a common neural network, the input is processed through several layers and an output is produced, under the assumption that two consecutive inputs are independent of each other; however, this assumption does not hold in many processes. for example, to predict the stock market at a certain time, it is crucial to consider the previous observations. a simple rnn consists of multiple neurons forming a network, where each neuron has a time-varying activation and each connection between nodes carries a real-valued weight that can be modified at each step. in the general architecture, the output of a node at time t − 1 is passed back to its input at time t and combined with the data arriving at time t to produce the output at time t; recurrently exploiting the neuron in this way over multiple steps is what creates the rnn. figure 6, adapted from [43], shows a simple rnn architecture, and equations (2) and (3) give the recursive formulas of the rnn [45], where y_t, h_t, x_t, and w_h are the output vector, hidden-layer vector, input vector, and weight matrix, respectively.
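the recurrence just described can be sketched in numpy as follows (a hypothetical simple elman-style cell; the weight shapes, the tanh activation, and the random initialization are illustrative assumptions, not the configuration used in this study):

```python
import numpy as np

def rnn_step(h_prev, x_t, Wh, Wx, Wy, bh, by):
    """One recurrence of a simple RNN: the previous hidden state and the current
    input together produce the new hidden state, from which the output is read."""
    h_t = np.tanh(Wh @ h_prev + Wx @ x_t + bh)   # hidden-state update
    y_t = Wy @ h_t + by                          # output at time t
    return h_t, y_t

rng = np.random.default_rng(1)
hidden, n_in, n_out = 4, 3, 1
Wh = rng.normal(0, 0.5, (hidden, hidden))
Wx = rng.normal(0, 0.5, (hidden, n_in))
Wy = rng.normal(0, 0.5, (n_out, hidden))
bh, by = np.zeros(hidden), np.zeros(n_out)

# unroll over a short sequence: every step reuses the same weights
h = np.zeros(hidden)
outputs = []
for x_t in rng.normal(0, 1, (5, n_in)):
    h, y_t = rnn_step(h, x_t, Wh, Wx, Wy, bh, by)
    outputs.append(y_t)
```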
lstm is a specific kind of rnn with a wide range of applications, such as time-series analysis, document classification, and speech and voice recognition. in contrast with feedforward anns, the predictions made by rnns depend on previous estimations. in practice, however, plain rnns are not employed extensively, because they suffer from a few deficiencies, notably difficulty in learning long-term dependencies, which make their evaluations impractical. the difference between the lstm and the rnn is that every lstm neuron is a memory cell, which links the prior information to the current neuron. every neuron has three gates (an input gate, a forget gate, and an output gate), and through these internal gates the lstm is able to solve the long-term dependence problem of the data. the forget gate controls which information is discarded from the cell, and equations (4) and (5) show its formulas, where h_{t−1} is the output at the prior time t − 1 and x_t is the input at the current time t, fed into the sigmoid function. all w and b terms are weight matrices and bias vectors that need to be learned during the training process.
f_t defines how much information will be remembered or forgotten. the input gate defines which new information is stored in the cell state, through equations (5)-(7): the value of i_t determines how much of the new information the cell state needs to remember, and a tanh function produces a candidate message to be added to the cell state by taking the output h_{t−1} at the prior time t − 1 together with the input information x_t at the current time t. c_t then receives the updated information that is added to the cell state (equation (8)). the output gate defines which information is output from the cell state.
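a toy numpy sketch of one full lstm step in the spirit of equations (4)-(10) is given below (the weight layout, with every gate reading the concatenation of h_{t−1} and x_t, is a common convention and an assumption here, not the paper's implementation):

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step: forget gate f_t, input gate i_t, candidate content,
    cell update c_t, and output gate o_t producing the block output h_t."""
    z = np.concatenate([h_prev, x_t])            # every gate reads [h_{t-1}, x_t]
    f_t = sigmoid(p["Wf"] @ z + p["bf"])         # how much of c_{t-1} to keep
    i_t = sigmoid(p["Wi"] @ z + p["bi"])         # how much new information to write
    c_hat = np.tanh(p["Wc"] @ z + p["bc"])       # candidate cell content
    c_t = f_t * c_prev + i_t * c_hat             # updated cell state
    o_t = sigmoid(p["Wo"] @ z + p["bo"])         # how much of the cell to expose
    h_t = o_t * np.tanh(c_t)                     # block output at time t
    return h_t, c_t

rng = np.random.default_rng(2)
n_in, n_hidden = 3, 4
p = {k: rng.normal(0, 0.3, (n_hidden, n_hidden + n_in)) for k in ("Wf", "Wi", "Wc", "Wo")}
p.update({k: np.zeros(n_hidden) for k in ("bf", "bi", "bc", "bo")})

h, c = np.zeros(n_hidden), np.zeros(n_hidden)
h, c = lstm_step(rng.normal(0, 1, n_in), h, c, p)
```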
the value of o_t lies between 0 and 1 and is employed to indicate how much of the cell-state information should be output (equation (9)); the result h_t is the lstm block's output information at time t (equation (10)) [45].
entropy 2020, 22, 840 9 of 23
this study aims to make short-run predictions for the emerging iranian stock market, employing data from november 2009 to november 2019 (10 years) for four stock market groups: diversified financials, petroleum, non-metallic minerals, and basic metals. from the opening, closing, lowest, and highest prices of the groups, 10 technical indicators are calculated. the data for this study are supplied by the online repository of the tehran securities exchange technology management co. (tsetmc) [46]. before using the information for the training process, it is vital to take a preprocessing step. we employ data cleaning, the process of detecting and correcting inaccurate records in a dataset: identifying inaccurate or irrelevant parts of the data and then replacing, modifying, or deleting the dirty data. the interquartile range (iqr score) is a measure of statistical dispersion that is robust against outliers, and this method is used to detect outliers and amend the dataset. importantly, to prevent indicators with larger values from dominating the smaller ones, the values of the 10 technical indicators are normalized independently for each group. data normalization refers to rescaling the actual numeric features into a 0-to-1 range and is employed in machine learning to make the trained model less sensitive to the scale of the variables.
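the two preprocessing steps described above can be sketched as follows (a hypothetical numpy version; the 1.5 x iqr fences and the choice to clip outliers rather than delete them are illustrative assumptions):

```python
import numpy as np

def iqr_clip(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] and clip them to the fences."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return np.clip(values, lo, hi)

def min_max(values):
    """Rescale one feature into the [0, 1] range, independently of other features."""
    lo, hi = values.min(), values.max()
    return (values - lo) / (hi - lo)

raw = np.array([10.0, 11.0, 10.5, 9.8, 10.2, 55.0])   # 55.0 is a gross outlier
cleaned = iqr_clip(raw)
scaled = min_max(cleaned)
```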
table 1 lists all the technical indicators employed as input values, selected on the basis of domain experts and previous studies [47-49]; the input values for calculating the indicators are the opening, high, low, and closing prices of each trading day, where "t" denotes the current time and "t + 1" and "t − 1" denote one day ahead and one day before, respectively. table 2 shows the summary statistics of the indicators for the groups. the first two indicators are

simple n-day moving average = (c_t + c_{t−1} + ... + c_{t−n+1}) / n

weighted 14-day moving average = (n·c_t + (n−1)·c_{t−1} + ... + c_{t−n+1}) / (n + (n−1) + ... + 1)

where n is the number of days, c_t is the closing price at time t, l_t and h_t are the low and high prices at time t, ll_{t..t−n+1} and hh_{t..t−n+1} are the lowest low and highest high prices over the last n days, up_t and dw_t denote the upward and downward price changes at time t, and ema is the exponential moving average used, for example, in the moving average convergence divergence, macd_t = ema(12)_t − ema(26)_t. sma is calculated as the average of prices in a selected range, and this indicator can help to determine whether a price will continue its trend. wma gives a weighted average of the last n values, where the weighting decreases with each prior price. mom calculates the speed of the rise or fall in stock prices, and it is a very useful indicator of weakness or strength when evaluating prices. stck is a momentum indicator over a particular period of time that compares a certain closing price of a stock to its price range; the oscillator's sensitivity to market trends can be reduced by adjusting that time period or by taking a moving average of the results. stcd measures the relative position of the closing price in comparison with the amplitude of the price oscillations in a certain period. this indicator is based on the assumption that as prices increase, the closing price tends towards the upper part of the range of recent price movements, and that when prices decrease, the opposite holds.
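the first two formulas above can be computed directly; the sketch below is a small numpy illustration (the toy closing-price series is invented):

```python
import numpy as np

def sma(close, n):
    """Simple n-day moving average: (C_t + C_{t-1} + ... + C_{t-n+1}) / n."""
    return np.convolve(close, np.ones(n) / n, mode="valid")

def wma(close, n):
    """Weighted moving average: the newest close gets weight n, the oldest weight 1."""
    w = np.arange(1, n + 1, dtype=float)    # weights 1..n, oldest to newest
    w /= w.sum()                            # denominator n + (n-1) + ... + 1
    return np.array([np.dot(close[i - n + 1:i + 1], w)
                     for i in range(n - 1, len(close))])

close = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
s = sma(close, 3)   # plain average of each 3-day window
m = wma(close, 3)   # weighted average, tilted toward the newest close
```

on a rising price series the weighted average sits closer to the newest close than the simple average does.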
lwr is a type of momentum indicator that evaluates oversold and overbought levels, and it is sometimes used to find exit and entry points in the stock market. macd is another momentum indicator, which shows the relationship between two moving averages of a share's price; traders typically buy the stock when the macd crosses above its signal line and sell the shares when the macd crosses below the signal line. ado is usually used to find the flow of money into or out of a stock; the ado line is normally employed by traders seeking to determine the right time to buy or sell a stock, or to verify the strength of a trend. rsi is a momentum indicator that evaluates the magnitude of recent value changes to assess oversold or overbought conditions in stock prices; it is displayed as an oscillator (a line graph moving between two extremes) ranging from 0 to 100. cci is employed as a momentum-based oscillator to determine when a stock price is reaching an oversold or overbought condition; it also measures the difference between the historical average price and the current price, and it provides trade signals that help traders determine entry or exit points. the datasets used for all models (except the rnn and lstm models) are identical: there are 10 features (the 10 technical indicators) and one target (the stock index of the group) for each sample of the dataset. as mentioned, all 10 features are normalized independently before being used to fit the models, which improves the performance of the algorithms. since the goal is to develop models that predict stock group values, the datasets are rearranged so that the 10 features of each day are paired with the target value n days ahead. in this study, the models are evaluated by training them to predict the target value 1, 2, 5, 10, 15, 20, and 30 days ahead. several parameters are associated with each model, but we tried to choose the most effective ones on the basis of our experimental work and prior studies.
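the rearrangement of features and n-day-ahead targets described above can be sketched as follows (a hypothetical helper; the toy data are random and stand in for the 10 normalized indicators and the group index):

```python
import numpy as np

def make_supervised(features, target, days_ahead):
    """Pair each day's feature vector with the target value `days_ahead` days later.
    The last `days_ahead` rows have no future target and are dropped."""
    X = features[:-days_ahead]
    y = target[days_ahead:]
    return X, y

# toy data: 100 trading days, 10 indicator columns, target is the group index value
rng = np.random.default_rng(3)
features = rng.normal(size=(100, 10))
target = np.cumsum(rng.normal(size=100))

X, y = make_supervised(features, target, days_ahead=5)
```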
for the tree-based models, the number of trees (ntrees) was the design parameter, while the other common parameters were set identically across all models; the parameters and their values for each model are listed in table 3. tree-based models are fairly robust to overfitting with respect to the number of trees, so a larger number typically results in better predictions. the maximum depth of the individual regression estimators limits the number of nodes in each tree, and its best value depends on the interaction of the input variables. in machine learning, the learning rate is an important parameter of an optimization method that determines the step size at each iteration while moving toward a minimum of the loss function. for the rnn and lstm networks, because of their time-series behavior, the datasets are arranged to include the features of more than just one day. while for the ann model all parameters but the number of epochs are constant, for the rnn and lstm models the variable parameters are the number of days included in the training dataset and the corresponding number of epochs: as the number of days in the training set increases, the number of epochs is increased so that the models are trained with an adequate number of epochs. table 4 presents all valid values for the parameters of each model; for example, if five days are included in the training set for the ann, rnn, or lstm models, the number of epochs is set to 300 to train the models thoroughly. the activation function of a node in an ann describes the output of that node given an input or set of inputs; optimizers are methods employed to adjust the attributes of the ann, such as the learning rate and the weights, in order to reduce the losses; and an epoch denotes one complete pass of the entire training dataset through the ann. in this section the four metrics used in the study are introduced. the mean absolute percentage error (mape) is often employed to assess the performance of prediction methods.
mape is also a measure of prediction accuracy for forecasting methods in machine learning, and it commonly expresses accuracy as a percentage. equation (11) shows its formula [50]:

mape = (100/n) Σ_t |a_t − f_t| / a_t

where a_t is the actual value and f_t is the forecast value: the absolute value of their difference is divided by a_t, summed over every forecasted value, divided by the number of data points, and finally multiplied by 100 to give a percentage error. the mean absolute error (mae) is a measure of the difference between two values: it is the average of the differences between the predictions and the actual values, and it is a common measure of prediction error for regression analysis in machine learning. the formula is shown in equation (12) [50,51]:

mae = (1/n) Σ_t |a_t − f_t|

where a_t is the true value, f_t is the prediction value, and n is the number of samples. the root mean square error (rmse) is the standard deviation of the prediction errors in a regression task. the prediction errors, or residuals, show how far the real values are from the prediction model and how they are spread out around it, so the metric indicates how concentrated the data are near the best-fitting model. rmse is the square root of the average of the squared differences between the predictions and the actual observations. the relative root mean square error (rrmse) is similar to the rmse, but it normalizes the total squared error by dividing it by the total squared error of a reference predictor. the formula is shown in equation (13) [50,51], where a_t is the observed value, f_t is the prediction value, and n is the number of samples. the mean squared error (mse) measures the quality of a predictor, and its value is always non-negative (values closer to zero are better).
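the error metrics of equations (11), (12), and (14), together with the rmse, can be written out directly (a small numpy sketch; the actual/forecast pair is invented for illustration):

```python
import numpy as np

def mape(a, f):
    """Mean absolute percentage error, eq. (11): average of |A_t - F_t| / A_t, times 100."""
    return 100.0 * np.mean(np.abs((a - f) / a))

def mae(a, f):
    """Mean absolute error, eq. (12): average absolute difference."""
    return np.mean(np.abs(a - f))

def mse(a, f):
    """Mean squared error, eq. (14): average squared difference (always non-negative)."""
    return np.mean((a - f) ** 2)

def rmse(a, f):
    """Root mean square error: square root of the MSE."""
    return np.sqrt(mse(a, f))

actual = np.array([100.0, 200.0, 400.0])
forecast = np.array([110.0, 190.0, 400.0])
```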
the mse is the second moment of the error (about the origin), and it incorporates both the variance of the prediction model (how widely spread the predictions are from one data sample to another) and its bias (how far the average predicted value is from the observations). the formula is shown in equation (14) [50], where a_t is the observed value, f_t is the prediction value, and n is the number of samples. six tree-based models, namely the decision tree, bagging, random forest, adaboost, gradient boosting, and xgboost, and three neural-network-based algorithms (ann, rnn, and lstm) are employed in the prediction of the four stock market groups. for this purpose, prediction experiments for 1, 2, 5, 10, 15, 20, and 30 days ahead are conducted. the results for diversified financials are shown in tables 5-11 as an example. for better readability and to reduce the number of result tables, the average performance of the algorithms for each group is reported in tables 12-15, and table 16 shows the average runtime per sample for all models. it is worth noting that a comprehensive set of experiments was performed for each group and prediction model with various model parameters; the tables report the best parameters, i.e., those for which the minimum prediction error is obtained. the results make clear that error values generally rise when prediction models are created for a greater number of days ahead: for example, the mape values of xgboost are 0.88, 1.14, 1.45, 1.77, 2.03, 2.30, and 2.48 for 1, 2, 5, 10, 15, 20, and 30 days ahead, respectively. however, a less strictly ascending trend can be observed in some cases (as was similarly seen in previous studies), owing to deficiencies in the prediction ability of some models in special cases that depend on the underlying dataset. in this work, we use all 10 technical indicators as input features, and the number of data points is 2600.
to prevent overfitting, we first randomly split the main dataset into two parts, training data and test data, and then fit our models on the training data. seventy percent of the main dataset (1820 samples) is assigned to the training data; the models are then used to predict future values, and the metrics are calculated on the test data (780 samples). in addition, we employ regularization and validation data (20% of the training data) to increase accuracy and tune the hyperparameters during training (the training process differs between the tree-based models and the anns here). figure 7 shows the performance of xgboost five days ahead for diversified financials as an example; the comparison between the actual and predicted values indicates the quality of the modeling and the prediction task. it is important to note that the cases are not exactly consecutive trading days, because we split the dataset randomly by shuffling. judging by the literature, our result in this study is among the most accurate predictions, which can be explained by the dataset and the performance of the models. the training process is certainly important, but we believe that the role of the dataset is greater here: the dataset is rather specific because of some rules of the tehran stock exchange. for example, the value change of each stock is limited to +5% and −5%, and the closing price of a stock is close to its opening price on the next trading day. these rules are learned by the machine learning algorithms, and the models are therefore able to predict our dataset from the tehran stock exchange remarkably well. regarding the results for diversified financials as an example, the adaboost regressor and the lstm can predict future prices well, with errors of about 1.59% and 0.60%, respectively; these values become more significant when we note that the maximum range of changes is 10% (from −5% to +5%). so, even with this specific dataset and powerful models, we still observe noticeable errors, which indicates the effect of fundamental parameters.
fundamental analysis is a method of measuring a security's intrinsic value by examining related economic and financial factors; this method of stock analysis is usually considered to be in contrast with technical analysis, which forecasts the direction of prices. notably, most non-scientific factors, such as policies or tax increases, affect the groups in stock markets; for example, the pharmaceutical industries are experiencing growth with covid-19 at the present time. based on extensive experimental work and the reported values, the following results are obtained: the average runtime of the deep learning models is high compared with the tree-based models, and the lstm is clearly the best model for predicting all stock market groups, with the lowest error and the best ability to fit, its main drawback being its long runtime. in spite of noticeable efforts to find comparable studies on the same stock market, there is no significant paper to report, and filling this gap is one of the novelties of this research; we believe that this paper can serve as a baseline for comparison in future studies. for all investors, it is always necessary to predict stock market changes in order to secure profits and reduce potential market risks. this study employed tree-based models (decision tree, bagging, random forest, adaboost, gradient boosting, and xgboost) and neural networks (ann, rnn, and lstm) to forecast the values of four stock market groups (diversified financials, petroleum, non-metallic minerals, and basic metals) as a regression problem. the predictions were made for 1, 2, 5, 10, 15, 20, and 30 days ahead. to the best of our knowledge, this study is the most recent research work that applies ensemble learning methods and deep learning algorithms to predicting stock groups as a popular application. in more detail, exponentially smoothed technical indicators and features were used as inputs for the prediction.
in this prediction problem, the methods were able to perform remarkably well, and the lstm was the top performer in comparison with the other techniques. overall, both tree-based and deep learning algorithms showed remarkable potential in regression problems for predicting the future values of the tehran stock exchange. among all models, the lstm was the superior model for predicting all stock market groups, with the lowest error and the best ability to fit (average mape values of 0.60, 1.18, 1.52, and 0.54), but at the cost of a long runtime (80.902 ms per sample). as future work, we recommend applying the algorithms to other stock markets or examining the effects of other hyperparameters on the final results.
among all models, lstm was our superior model for predicting all stock market groups with the lowest error and the best ability to fit (by average values of mape: 0.60, 1.18, 1.52 and 0.54), but the problem was the great runtime (80.902 ms per sample). as future work, we recommend using the algorithms on other stock markets or examining other hyperparameters effects on the final results. hybridization of evolutionary levenberg-marquardt neural networks and data pre-processing for stock market prediction. knowl.-based syst capital markets efficiency: evidence from the emerging capital market with particular reference to dhaka stock exchange hybridization of evolutionary levenberg-marquardt neural networks and data pre-processing for stock market prediction. knowl.-based syst capital markets efficiency: evidence from the emerging capital market with particular reference to dhaka stock exchange stock price forecast based on bacterial colony rbf neural network overview and history of statistics for equity markets impact of the stock market capitalization and the banking spread in growth and development in latin american: a panel data estimation with system gmm stock market value prediction using neural networks stock market prediction with multiple classifiers stock market analysis: a review and taxonomy of prediction techniques handbook of research on machine learning applications and trends: algorithms, methods, and techniques: algorithms, methods, and techniques evaluating multiple classifiers for stock price direction prediction evaluating the employment of technical indicators in predicting stock price index variations using artificial neural networks (case study: tehran stock exchange) predicting stock returns by classifier ensembles computational intelligence and financial markets: a survey and future directions stock price prediction using lstm, rnn and cnn-sliding window model an effective time series analysis for equity market prediction using deep learning 
model robust online time series prediction with recurrent neural networks reinforced recurrent neural networks for multi-step-ahead flood forecasts an integrated framework of deep learning and knowledge graph for prediction of stock price trend: an application in chinese stock exchange market an innovative neural network approach for stock market prediction stock market prediction using optimized deep-convlstm model augmented textual features-based stock market prediction predicting the daily return direction of the stock market using hybrid machine learning algorithms. financ stock market prediction using firefly algorithm with evolutionary framework optimized feature reduction for oselm method cnnpred: cnn-based stock market prediction using a diverse set of variables blended computation of machine learning with the recurrent neural network for intra-day stock market movement prediction using a multi-level classifier genetic algorithm-optimized multi-channel convolutional neural network for stock market prediction is deep learning for image recognition applicable to stock market prediction? 
complexity stock market trend prediction using high-order information of time series prediction of stock market using neural network strategies global stock market prediction based on stock chart images using deep q-network a numerical-based attention method for stock market prediction with dual information modaugnet: a new forecasting framework for stock market index value with an overfitting prevention lstm module and a prediction lstm module genetic algorithm-optimized long short-term memory network for stock market prediction which artificial intelligence algorithm better predicts the chinese stock market stock market prediction on high-frequency data using generative adversarial nets deep learning networks for stock market analysis and prediction: methodology, data representations, and case studies deep learning-based feature engineering for stock price movement prediction. knowl.-based syst lagged correlation-based deep learning for directional trend change prediction in financial time series hybrid fuzzy neural network to predict price direction in the german dax-30 index forecasting stock market trend: a comparison of machine learning algorithms. financ. mark exploring an ensemble of methods that combines fuzzy cognitive maps and neural networks in solving the time series prediction problem of gas consumption in greece machine learning: a probabilistic perspective deep learning for stock market prediction the handbook of brain theory and neural networks learning long-term dependencies in narx recurrent neural networks data science in economics predicting direction of stock price index movement using artificial neural networks and support vector machines: the sample of the istanbul stock exchange predicting stock market index using fusion of machine learning techniques the authors declare no conflict of interests. key: cord-016954-l3b6n7ej authors: young, colin r.; welsh, c. 
jane title: animal models of multiple sclerosis date: 2008 journal: sourcebook of models for biomedical research doi: 10.1007/978-1-59745-285-4_69 sha: doc_id: 16954 cord_uid: l3b6n7ej to determine whether an immunological or pharmaceutical product has potential for therapy in treating multiple sclerosis (ms), detailed animal models are required. to date many animal models for human ms have been described in mice, rats, rabbits, guinea pigs, marmosets, and rhesus monkeys. the most comprehensive studies have involved murine experimental allergic (or autoimmune) encephalomyelitis (eae), semliki forest virus (sfv), mouse hepatitis virus (mhv), and theiler’s murine encephalomyelitis virus (tmev). here, we describe in detail multispecies animal models of human ms, namely eae, sfv, mhv, and tmev, in addition to chemically induced demyelination. the validity and applicability of each of these models are critically evaluated. multiple sclerosis (ms) affects about 350,000 people in the united states and is a major cause of nervous system disability in adults between the ages of 15 and 45 years. the symptoms are diverse, ranging from tremor and nystagmus to paralysis and disturbances in speech and vision. extensive demyelination is seen in the neuronal lesions. the clinical heterogeneity of ms, as well as the finding of different pathological patterns, suggests that ms may be a spectrum of diseases that may represent different pathological processes. 1 this has led to the development of many different animal models, including rodents and nonhuman primates, that reflect the pathological processes and could allow for the development of therapeutic approaches. at the present time, the exact etiological mechanism in humans is not clear; however, several animal models are available providing insight into disease processes.
the relative inaccessibility and sensitivity of the central nervous system (cns) in humans preclude studies on disease pathogenesis, and so much of our understanding of infections and immune responses has been derived from experimental animal models. the experimental systems include theiler's virus, mouse hepatitis virus, and semliki forest virus infections of laboratory rodents. additional information has been obtained from studies of experimental infections of other animals that result in demyelination, notably maedi-visna virus in sheep and canine distemper virus in dogs. in humans and animals, most natural cases of demyelinating disease are rare complications of viral infections. one possible reason for the low incidence of demyelination following viral infections could be the low efficiency of neuroinvasion. however, a correlation between cns infection and clinical disease is difficult to determine. the role of genetics and environmental factors in ms is complex. factors such as geographical location, ethnic background, and clustering in temperate climates all contribute to susceptibility. individuals with a north european heritage are statistically more susceptible to ms than those from a more tropical environment and it is more common in women. 2 epidemiological data indicate that ms is not a single-gene disorder and that additionally environmental factors contribute to the disease. 3 data from genetic studies indicate that although mhc genes clearly contribute to disease susceptibility and/or resistance, it is probable that a combination of environmental factors may additionally contribute to disease development in genetically predisposed individuals. to understand the initiating factors and progression of ms, researchers have turned to experimental model systems. since this disease cannot be recreated in a tissue culture system, much effort has been directed to the use of laboratory animals. 
those animal models should mirror the clinical and pathological findings observed in human ms. ideally, the animal model should be in a species that is easy to handle, inexpensive, can be kept in large numbers, and is easily bred in laboratory conditions. the most frequently used animals are laboratory rodents, including mice, rats, guinea pigs, and hamsters. one of the most useful aspects of laboratory rodents as animal models of disease is the vast array of inbred strains of the species available, most notably in experimental mice. additionally, very valuable information has been obtained from studies using larger animals including sheep, dogs, cats, and nonhuman primates. models of ms fall into two main groups: viral and nonviral. viral models are immensely relevant since epidemiological studies suggest an environmental factor, and almost all naturally occurring cns demyelinating diseases of humans and animals of known etiology are caused by a virus. these include, in humans, subacute sclerosing panencephalitis (sspe), caused by measles or rubella viruses; progressive multifocal leukoencephalopathy (pml), caused by jc virus; and human t lymphotrophic virus-1 (htlv-1)-associated myelopathy (ham), caused by htlv-1; in animals, these include visna virus in sheep and canine distemper in dogs. however, no one virus has consistently been associated with human ms, although it is likely that more than one virus could trigger the disease. of the nonviral models of ms, experimental allergic encephalomyelitis (eae) is the most widely studied. eae is characterized by inflammatory infiltrates in the cns that can be associated with demyelinating lesions. in eae, the disease is initiated by the extraneural injection of cns material, or purified myelin components, emulsified in an adjuvant, the most commonly employed one being complete freund's adjuvant containing mycobacterium tuberculosis h37ra.
however, no naturally occurring autoimmune correlate of this experimental disease is known, although it is extensively researched as a model of ms, with the reasoning that ms may be such a disease. the most widely studied models of ms are the experimental infections of rodents resulting in an inflammatory demyelinating disease in the cns, such as theiler's virus, mouse hepatitis virus, and semliki forest virus. 4 each of these infections gives rise to lesions of mononuclear cell inflammatory demyelination throughout the brain and spinal cord but not in the peripheral nervous system. as such, this histopathology correlates with human ms, although it does not preclude the fact that the viruses could gain access to the cns via the peripheral nervous system. these viral models demonstrate how a virus can easily reproduce cns disease, which is comparatively rare in humans, and how this can be influenced by many factors including both genetic and immunological. experimental studies in induced animal models have the advantage over studies in spontaneous models in that the onset and progression of the disease can be controlled. although it has been proposed that some autoimmune diseases may have a viral etiology, virus-induced autoimmunity is a controversial subject. epidemiological studies of ms provide strong evidence for the involvement of a viral etiology in the onset of disease. theiler's virus-induced demyelination, a model for human ms, bears several similarities to the human disease: an immune-mediated demyelination, involvement of cd4 + helper t cells and cd8 + cytotoxic t cells, delayed type hypersensitivity responses to viral antigens and autoantigens, and pathology. indeed this mouse model may provide a scenario that closely resembles chronic progressive ms. theiler's murine encephalomyelitis virus (tmev) is a picornavirus that causes an asymptomatic gastrointestinal infection, followed by occasional paralysis. 
there are two main strains of tmev, the virulent strains and persistent theiler's original (to) strains. the virulent gdvii strains of theiler's virus are highly neurovirulent and when injected intracranially, cause death by encephalitis within 48 h. gdvii strains also cause differing forms of paralysis depending on the route of inoculation (see table 69-1). from these studies it appears that the gdvii virus may gain access to the cns by retrograde axonal transport rather than by a hematogenous route. 5 infection of susceptible strains of mice with the persistent to strains bean, da, ww, or yale results in a primary demyelinating disease that closely resembles human ms. 6 infection of resistant strains of mice with bean does not result in demyelinating disease, since these mice are able to clear virus from the cns. susceptible mice fail to clear virus from the cns, possibly resulting from poor natural killer (nk) cell and cytotoxic t lymphocyte (ctl) responses. persistent viral infection of the cns is required for demyelination. following the intracranial injection of susceptible mice with bean, virus replicates both in the brain and spinal cord. 7 one month postinfection, viral titers decrease, and high levels of neutralizing antibodies are detected (figure 69-1). at this point in the disease, neurons may become infected with virus and mice may develop a nonprogressive flaccid paralysis of the forelimbs and/or hindlimbs. 8 this is sometimes referred to as a polio-like disease, but this is confusing since flaccid paralysis in mice infected with poliovirus is progressive and normally results in death. in the late phase of the disease, astrocytes, oligodendrocytes, and macrophage/microglial cells become infected with virus. also in the demyelinating disease there is both b and t cell autoimmunity, directed against myelin and its antigenic components.
genetics of persistent infection and demyelinating disease all inbred mouse strains inoculated intracerebrally with tmev show early encephalomyelitis, but not all strains remain persistently infected. resistant strains normally clear the virus from the cns. 9 this trait is under multigenic control, with h-2 mhc class i genes being the most prominent. additionally, several non-h-2 quantitative trait loci (qtl) have been identified within the same h-2 haplotypes that control persistence. there is generally a good correlation in inbred strains between susceptibility to three phenotypes (viral load, pathology, and symptoms), suggesting that variations in both demyelination and clinical disease may result from how each mouse strain can control the viral load during the persistent infection. 10 using b10 congeneic and recombinant strains of mice, susceptibility to disease has been mapped to the h-2d region. 11 furthermore, resistant haplotypes are dominant and the same locus controls viral load during persistence and demyelination. currently, 11 non-h-2 susceptibility loci have been identified as having an effect on susceptibility to theiler's virus-induced disease (tvid) (see table 69-2). the mechanism(s) of tvid may be different for different mouse strains, but most of the information has come from studies of sjl/j mice infected with the da or bean strain of virus. the virus infects oligodendrocytes, and the resulting demyelinating disease could be due, in part, to the virus killing oligodendrocytes directly or by the virus-specific cd8 + ctls present in the lesions. 7 a series of experiments has demonstrated that demyelination correlates with the presence of a cd4 + t cell-mediated response against viral epitopes. these cells secrete cytokines such as interferon (ifn)-γ that activate both microglial cells and invading monocytes, which subsequently secrete factors such as tumor necrosis factor (tnf)-α and thus can cause "bystander" demyelination.
activated macrophages ingest and degrade damaged myelin. autoantibodies 12,13 and myelin-specific cd4 + t cells have been shown in sjl/j mice several months after intracranial inoculation. 14,15 epitope spreading in these mice commences with recognition of a proteolipid protein (plp) epitope, and then progresses to additional plp epitopes and then to myelin basic protein (mbp) epitopes. 14 a direct demonstration that disease can be maintained on a purely autoimmune footing, after infection has been eradicated, has not been shown. immunity and theiler's virus the first response to viral infection is the production of type i interferons, which are critical for viral clearance. ifn-α/β receptor knockout mice injected with tmev die of encephalomyelitis within 10 days of infection. nk cells are activated early in infection with certain viruses. in tmev infection susceptible sjl mice have a 50% lower nk cell activity in comparison to the highly resistant c57bl/6 mice. this low nk activity in sjl mice is in part due to a defect in the thymus impairing the responsiveness of nk cells to stimulation by ifn-β. the pivotal role of nk cells in early tmev clearance is demonstrated by the finding that resistant mice depleted of nk cells by monoclonal antibodies to nk1.1 develop severe signs of gray matter disease. 16 in the early disease, both cd4 + and cd8 + t cells have been shown to be important in viral clearance. in early disease cd4 + t cells are required for b cells to produce antibodies for viral clearance. 12 these cd4 + t cells secrete ifn-γ, which in vitro inhibits tmev replication and has a protective role in vivo. cd8 + t cells are also important in viral clearance, as demonstrated by the finding that cd8 + t cell-depleted mice fail to clear virus and develop a more severe demyelinating disease. 17 cd8 + t cells also provide protection against tvid when adoptively transferred to a tvid-susceptible strain, balb/c.anncr.
thus, cd8 + t cells are implicated in viral clearance and resistance to demyelination. higher ctl activity has been demonstrated in tvid-resistant c57bl/6 mice as compared to susceptible sjl/j mice. 18 these ctls may play an important role in viral disease since they may recognize viral determinants and/or they may inhibit delayed type hypersensitivity responses. the relative roles of th1/th2 cells in tvid are very complex, and a simple picture of a th1 or th2 polarization during infection may not be apparent. a pathogenic role for th1 cells during late demyelinating disease is demonstrated by the finding that both tvid correlates with delayed type hypersensitivity responses to tmev and that the depletion of cd4 + t cells during late disease results in the amelioration of clinical signs. high levels of the proinflammatory th1 cytokines ifn-γ and tnf-α in late disease correlate well with maximal disease activity. evidence demonstrating the protective role of th2 in tvid has been shown in experiments in which skewing the immune response toward th2 immunity in tmev infection diminishes the later demyelinating disease. 19 however, other studies have shown that the th1/th2 balance did not explain the difference in susceptibility to tvid. th1 cytokines are generally pathogenic during late demyelinating disease, whereas th2 cytokines are protective. 20 remyelination occurs in some tmev-induced lesions and is characterized by abnormally thin myelin sheaths in relation to axon diameter. to date, however, there are few reliable data on the frequency of remyelination in ms patients. stimulation of remyelination is a potential treatment for ms. the tmev model of ms can be used to study remyelination using remyelination-promoting antibodies. in this remyelination model, sjl/j mice, aged 4-8 weeks, are injected with a 10 µl volume containing 200,000 pfu of daniel strain intracerebrally. all animals develop mild encephalitis, which resolves within 14 days after the injection.
the infected mice then develop the chronic demyelinating disease that gradually progresses over several months. to study remyelination, mice that had been infected with tmev for 6 months receive a single intraperitoneal injection of 0.5 mg (∼0.025 g/kg body weight) of a recombinant remyelinating antibody (rhigm22) in phosphate buffered saline (pbs). 21 in one study, 82.8% of lesions in animals treated with rhigm22 showed retraction of varying degrees, presumably the effect of remyelination in these lesions. the direct binding of rhigm22 to demyelinated lesions is consistent with the hypothesis that these antibodies work directly in the lesions, probably by binding to the cns glia to induce remyelination. thus, this murine tmev model can also be used as a model with which to examine different modes of remyelination. mouse hepatitis virus (mhv) is a member of the coronaviridae, a group of large positive sense enveloped rna viruses. depending on the strain of virus used, mhv causes a variety of diseases including enteritis, hepatitis, and demyelinating encephalomyelitis. 22 infection of mice with the neurotropic jhm strain of mhv causes encephalitis, followed by chronic demyelination. virus is not cleared from the cns, resulting in a persistent infection. after intracerebral or intranasal infection with mhv, virus enters the brain and causes encephalitis. 23 intranasal infection with mhv-jhm or -a59 leads to viral spread through the olfactory bulb and along the olfactory tracts, as well as along the trigeminal nerve to the mesencephalic nucleus. 23 up to 4 days postinfection (pi) early viral spread is via specific neural pathways and neural connections. viral titers peak at about day 5 pi in the brain and later in the spinal cord and virus is cleared by days 8-20 pi. 24 however, viral antigen is still detectable up to day 30 pi. additionally, viral rna is detectable in the brain as late as 10-12 months postinfection, although the amount of rna decreases with time.
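the dose figures quoted above (0.5 mg of rhigm22 per mouse, stated as ∼0.025 g/kg body weight) can be checked with a short per-body-weight calculation. a minimal sketch, assuming a nominal 20 g adult mouse (the source does not state the exact animal weights; the function name is illustrative):

```python
def dose_per_kg(dose_mg: float, body_weight_g: float) -> float:
    """Convert an absolute dose in mg into g per kg of body weight."""
    dose_g = dose_mg / 1000.0                # mg -> g
    body_weight_kg = body_weight_g / 1000.0  # g -> kg
    return dose_g / body_weight_kg

# 0.5 mg into a nominal 20 g mouse: ~0.025 g/kg, matching the figure above
print(dose_per_kg(0.5, 20.0))
```

the same routine reproduces the text's figure only under the 20 g assumption; heavier animals would receive proportionally less per kilogram.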
liver infection can occur after any route of infection (in, ic, ig, or ip), with viral titers peaking at day 5 pi and hepatitis developing during the first 1-2 weeks. 25 cns demyelination develops as active mhv infection resolves. the lesions observed are histologically very similar to those observed in ms patients. these mhv lesions are characterized by primary demyelination accompanied by naked axons, 23 and are found scattered throughout the spinal cord. 26 the peripheral nervous system is not affected. chronic lesions are associated with lipid-laden macrophages, scattered lymphocytes, and perivascular cuffing. these chronic lesions can persist as late as day 90 pi, and demyelinating axons can be seen as late as 16 months postinfection. chronic diseases in mhv-infected mice are associated with ataxia, hindlimb paresis, and paralysis, followed by a recovery. this animal recovery is mediated by cns remyelination, beginning anywhere from 14 to 70 days pi. c57bl/6 mice (h-2b) are susceptible to mhv infection. in this murine model adult mice (of weight 20-22 g) are anesthetized by inhalant anesthesia and receive an intracerebral injection of approximately 500 pfu of a neurotropic mhv strain in a volume of approximately 20 µl of pbs. this intracerebral injection of mhv results in a biphasic disease: an acute encephalomyelitis with myelin loss, followed 10-12 days later by an immune-mediated demyelinating encephalomyelitis with progressive destruction of the cns. 27 there is an 80-90% survival rate of mice injected with this mhv, with animals usually succumbing during the first 2 weeks of acute infection. animals surviving this acute stage show a 95% chance of survival. 28 control animals injected intracerebrally with sterile pbs show no clinical signs or histological defects. electron micrographs of demyelinating lesions show that macrophage processes slip between layers in the myelin sheath, implying that macrophages could indeed be mediating demyelination.
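the two survival figures quoted above compound: overall survival through both disease phases is the product of acute-phase survival and post-acute survival. a minimal arithmetic sketch, assuming the midpoint of the quoted 80-90% range for the acute phase (the midpoint is our assumption, not a figure from the source):

```python
acute_survival = 0.85       # assumed midpoint of the 80-90% range quoted above
post_acute_survival = 0.95  # survival given the acute stage was survived
overall = acute_survival * post_acute_survival
print(overall)  # roughly 0.81, i.e. about 81% of injected mice survive both phases
```

using the ends of the quoted range instead gives an overall survival of roughly 76-86%.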
29 the appearance of macrophages within the cns also correlates with the development of lesions. additionally, they do not appear in large numbers in the absence of lymphocytes, so it is possible that myelin damage is caused by a nonmacrophage-dependent mechanism and that macrophages may only clear up the damaged myelin. in contrast to other mouse models of demyelination, there does not appear to be a clear role for any single lymphocytic or monocytic subset mediating the demyelination. rather, it appears that a balance of immune components may be necessary for viral clearance and that various pathways, both immune and nonimmune, may cause the ensuing demyelinating events. recently, progress has been made in further identifying the immune cells required for demyelination. experimental infection of severe combined immunodeficiency (scid) mice, lacking t cells, results in fulminant encephalitis without demyelination. 30 adoptive transfer of splenocytes from syngeneic immunocompetent mice into infected scid mice results in demyelination within 7-9 days posttransfer. additional experiments indicated that either cd4 + or cd8 + t cell subsets are capable of initiating this process. however, mice that receive splenocytes depleted of cd4 + t cells survive longer and develop more demyelination than mice receiving splenocytes depleted of cd8 + t cells. thus, experimental scid mice demonstrate that the roles of each t cell subset in demyelinating diseases are not equal. 31 ifn-γ is a critical mediator of homeostasis and inflammation in ms and many of its rodent models. bone marrow chimera mice have been used to address the role of ifn-γ in bystander demyelination mediated by cd8 + t cells. these chimeric models of jhm infection have addressed the hypothesis that ifn-γ produced by cd8 + t cells, and not from other sources, was the critical component in mediating bystander demyelination.
this chimeric approach did not compromise ifn-γ production by cells such as nk cells and dendritic cells, thus preserving the innate immune response to the virus. 32 the results demonstrated that ifn-γ produced by these innate cells was unable to initiate the demyelinating disease, even in the context of activated cd8 + t cells lacking only the ability to produce ifn-γ. these findings highlight the role that cd8 + t cells have in demyelination in jhm-infected mice. 33 it has been demonstrated that ifn-γ is critical in other animal models of demyelination and in ms. semliki forest virus (sfv) is an alphavirus of the togaviridae. the virus has been isolated from mosquitoes, but the natural host is unknown. sfv is a single-stranded positive strand rna virus that has been cloned and sequenced. the most commonly studied strains used in adult mice are the virulent l10 strain and the avirulent a7(74) strain. both of these strains are virulent in neonatal and suckling mice by all routes of infection. experimental infection of mice with sfv is widely used as a model to study the mechanism of virus-induced cns disease. sfv has the advantage of being neuroinvasive as well as neurotropic, thus allowing studies of viral entry into the cns and the integrity of the blood-brain barrier (bbb). following intraperitoneal injection with 5000 pfu sfv in 0.1 ml pbs containing 0.75% bovine serum albumin, 34 all strains replicate in muscles and other tissues, resulting in a plasma viremia. virus then crosses the cerebral vascular endothelial cells, resulting in infection of neurons and oligodendrocytes. 35 in neonatal or adult mice, infection with virulent strains results in widespread infection that is lethal within a few days. in contrast, infection of mice with the a7(74) strain results in a cns infection, and infectious virus is cleared from the brain by day 10. infiltrating mononuclear cells are observed 3 days pi and peak at about day 7.
focal lesions of demyelination throughout the cns are observed 10 days pi and peak between 14 and 21 days pi. 36 sfv-induced demyelinating diseases have been widely studied following intraperitoneal injection of adult mice with the a7(74) strain of the virus. following intraperitoneal injection, virus is detected in the brain by 24 h. viral titers then rise, but rapidly decline following initiation of the immune response. interestingly, although infectious virus can be detected only up to day 8 pi, real-time polymerase chain reaction (rt-pcr) studies detect viral rna up to day 90 pi. 37 thus, it is possible that there is persistence of viral antigen(s). disturbance of the bbb occurs between 4 and 10 days pi, which corresponds to the increase in inflammatory cell infiltration and reduction in viral titer and which may be related to the influx of cells or cytokine-mediated effects. the presence of macrophages, activated microglia, and the proinflammatory cytokines tnf-α, ifn-γ, interleukin (il)-1α, il-2, il-6, and granulocyte-macrophage colony-stimulating factor (gm-csf) during sfv-induced demyelination, in addition to enhancing the inflammatory response, may also play a role in controlling viral infection since il-6, ifn-γ, and tnf have direct antiviral activity. 38 additionally, ifn-γ and tnf production peripherally coincides with sfv-induced encephalitis in sjl and b6 mice. interestingly, these same cytokines predominate in ms lesions. 39 an intense inflammatory response characterized by perivascular cuffing is apparent histologically from 3 days. demyelination, as demonstrated using luxol fast blue staining of sections, is apparent by 14 days. however, small focal lesions of demyelination can be observed using electron microscopy by day 10. a striking feature of sfv infection appears in the optic nerve, where there are demyelinating lesions and changes in visually evoked responses and axonal transport. 40 this optic neuritis also occurs in human ms.
it appears that sfv-induced demyelination in this mouse model is accompanied by neurophysiologically demonstrable visual deficits very similar to those found in ms patients. thus, this may provide a very useful animal model for research into ms. the advantages of this model are that genetic and environmental factors can be readily controlled, while the low cost and fast reproductive rate make experimental design considerably easier. no demyelination is observed following sfv infection of scid mice or athymic mice. in the absence of specific immune responses, scid mice infected with sfv a7(74) have a persistent viremia, a persistent and restricted cns infection, and no lesions of demyelination. comparison of the infection to that in nu/nu and balb/c mice and studies on the transfer of immune sera show that immunoglobulin m (igm) antibodies clear the viremia but not the brain virus and that infections of brain virus can be reduced by igg antibodies. these igg antibodies can abolish infectivity titers in the brain but cannot remove all viral rna. 41 adoptive cell transfer studies and administration of anti-cd8 antibodies demonstrate that demyelination following sfv infection is dependent on cd8 + t cells. 4 this is consistent with the finding that the cns inflammatory infiltrate is dominated by cd8 + t cells. this finding is analogous to that in ms and is in contrast to that in eae, where cd4 + cells predominate. 42 in the eae autoimmune model of ms, studies suggest that a th1 cytokine profile predominates. another point of difference between the eae model and the sfv model is shown in the th1/th2 profiles. following infection with sfv, th1 and th2 cytokines were detected in the cns and both were present throughout the time course studied, indicating that there was no bias of th response in the cns, nor were changes apparent with time.
43 the experimental disease eae has been investigated in many strains of animals including mice, rats, guinea pigs, rabbits, marmosets, and rhesus monkeys. eae is an autoimmune inflammatory disease of the cns and is characterized by perivascular and subpial inflammatory infiltrates and demyelinating lesions. the disease is usually initiated by injection of autoantigens emulsified in an adjuvant. the progression and pathology of lesions observed depend on the type of antigen used in the injection, the method of injection, and the strain of animal used. because of its very nature, eae as a model of ms does not address certain pertinent questions relating to ms, such as age-related onset of disease or epidemiology. a major difference between eae and viral models of ms is that in eae the inflammatory response is directed to autoantigens. a feature of the eae model is that the course of the disease can be relapsing and remitting. studies of eae have been used to identify antigenic determinants on components of myelin. using bioinformatic technology these determinants have been used to search available databases of viral and bacterial proteins. results indicate numerous viral and bacterial protein segments with probable sequence similarity to myelin basic protein determinants. 44 experimental allergic encephalomyelitis in rabbits eae has been induced in rabbits by footpad inoculation with rabbit spinal cord homogenate, resulting in hindlimb paresis or paralysis. 45 rabbits with 5-day paraplegia showed increased spinal cord incorporation of radioactive drugs administered in the epidural space. thus, this demyelinating disease process may expose the spinal cord to larger amounts of substances administered neuraxially. it is therefore possible that this rabbit model could be used to investigate the incorporation of radioactive therapeutic drugs in the epidural space.
experimental allergic encephalomyelitis in guinea pigs guinea pigs have also been investigated to determine whether they may serve as useful eae models of ms. the interest in guinea pigs stems from the fact that group 1 cd1 glycoprotein homologues, which in humans present foreign and self lipid and glycolipid antigens to t cells, are not found in mice and rats but are present in guinea pigs. in this guinea pig model, animals have been sensitized for eae, and cd1 and mhc class ii expression has been measured in the cns. in normal guinea pigs, low-level mhc class ii expression occurred on meningeal macrophages and microglial cells, whereas immunoreactivity for cd1 was absent. in the eae cns, however, the majority of infiltrating cells were mhc ii + and microglia showed increased expression, whereas cd1 immunoreactivity was detected on astrocytes, b cells, and macrophages. minimal cd1 and mhc ii coexpression was detected on inflammatory cells or glia. thus, in this guinea pig eae model group 1 cd1 molecules are upregulated in the cns on subsets of cells distinct from the majority of mhc ii-bearing cells. 46 this expression of cd1 proteins in such eae lesions broadens the potential repertoire of antigens recognized at these sites and highlights the value of this guinea pig model of human ms. experimental allergic encephalomyelitis in rats rats were injected with spinal cord homogenate or the encephalitogen myelin basic protein, which induced eae in genetically susceptible dark agouti (da) rats but not in albino oxford (ao) rats. here 8-to 12-week-old rats were immunized in either or both hind footpads with 0.1 ml antigenic emulsion containing 100 µg rat spinal cord tissue in complete freund's adjuvant (cfa). 47 rats are monitored from day 5 after inoculation and the severity of disease was assessed by grading tail, hindlimb, and forelimb weakness, each on a scale of 0 (no disease), 1 (loss of tail tonicity), 2 (hindlimb weakness), 3 (hindlimb paralysis), to 4 (moribund or dead).
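the 0-4 clinical grading scale described above maps naturally onto a small scoring routine for recording cohort disease severity. a minimal sketch, assuming per-animal grades are recorded on each observation day (the function and variable names are illustrative, not from the source):

```python
# EAE clinical grades as described in the text
EAE_GRADES = {
    0: "no disease",
    1: "loss of tail tonicity",
    2: "hindlimb weakness",
    3: "hindlimb paralysis",
    4: "moribund or dead",
}

def mean_clinical_score(scores: list) -> float:
    """Mean EAE score across a cohort on one day, after validating grades."""
    for s in scores:
        if s not in EAE_GRADES:
            raise ValueError(f"invalid EAE grade: {s}")
    return sum(scores) / len(scores)

# hypothetical cohort of five DA rats at peak disease
print(mean_clinical_score([2, 3, 3, 2, 4]))  # 2.8
```

tracking this mean over days postinjection gives the usual clinical-course curve (onset, peak, and any relapse phases) for comparing susceptible and resistant strains.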
clinical disease in susceptible strains of rats is apparent in all animals, and the onset of disease occurs at day 11 postinjection. at the peak of the clinical manifestation of eae there is a marked increase in the level of infiltration of cells accompanied by downregulation of microglial activation in susceptible da rats, whereas activation remains elevated in resistant ao rats. at the peak of clinical disease da rat spinal cords contain high levels of cd4 + t cells. da rats also contained 10 times as many live cd4 + t cells as ao rats. astrocytosis, as an indication of cns reaction to the presence of inflammatory cells, was clearly observed in both rat strains. microglial activation persists in resistant ao rats, whereas activation is downregulated in da rats. in this model it is speculated that at the peak of disease, infiltrating monocytes and macrophages are the main antigen-presenting and effector cells. rat eae may also be induced by the injection of xenogeneic myelin. for example, 8-to 12-week-old lewis rats injected in both hind footpads with an emulsion containing 100 µg of guinea pig myelin basic protein and cfa develop acute eae. also chronic relapsing eae (cr-eae) may be induced in this rat model using a regimen of intraperitoneal injections of 4 mg/kg of cyclosporin a. 48 pathology studies indicate that in acute and cr-eae, mcp-1 and its receptor ccr2 are significantly upregulated throughout the course of cr-eae and that a large number of macrophages infiltrated the cr-eae lesion. this suggests that macrophages recruited by mcp-1 and ccr2-expressing cns cells are responsible for the development and relapse of eae. thus, in this rat model, in addition to t cells, macrophages are another target for immunotherapy studies for neurological autoimmune diseases. a more recent development of a rat eae model involves using human mbp as antigen. here eae was induced by the immunization of female wistar rats with human mbp.
it was found that most of the rats developed tail tone loss and hindlimb paralysis together with demyelination, infiltrative lymphocyte foci, and "neurophagia" in the cortex of the cerebra and in the white matter of the spinal cord. 49 this study further demonstrated that this rat model of eae induced by human mbp resembles many features of human ms and may prove to be a better animal model for the study of ms. 49 the use of cfa is not a prerequisite for the development of rat eae. for example, eae can be induced in 10- to 16-week-old da rats by a single hind footpad injection of an encephalitogenic emulsion consisting of rat or guinea pig spinal cord homogenate (sch) in pbs. 50 the reason for avoiding cfa is that it in itself induces a strong inflammatory response and exerts numerous immunomodulatory properties. additionally, cfa induces a strong anti-purified protein derivative (ppd) response and may induce adjuvant arthritis, another autoimmune disease. the susceptibility of da rats to eae induction with sch depends upon the origin of the cns tissue, the homologous tissue being the more efficient encephalitogen. da rats that recovered from eae induced with homologous sch without adjuvant and were then immunized with the encephalitogenic emulsion containing cfa developed clinical signs of the disease. neurological signs in rechallenged rats were milder, but the first signs appeared earlier. the earlier onset of eae observed in da rats after rechallenge has been attributed to the reactivation of memory cells. taken together, these experiments demonstrate that eae can be efficiently and reproducibly induced in da rats without the use of cfa. this experimental model for understanding the basic mechanisms involved in autoimmunity within the cns, without the limitations and inherent problems imposed by the application of adjuvants, may represent one of the most reliable rodent models of ms.
the rat as an experimental model could be used to evaluate new immunotherapies of eae. these include antigen-induced mucosal tolerance, treatment with cytokines, and dendritic cell-based immunotherapy. the ideal treatment of diseases with an autoimmune background such as ms should specifically eliminate autoreactive t cells without affecting the integrity of the immune system. one way to achieve this would be to induce immunological tolerance to autoantigens by the oral or nasal administration of autoantigen. several studies have shown that nasal administration of soluble antigens results in peripheral tolerance by immune deviation or the induction of other regulatory mechanisms. in the rat model this tolerance has been investigated using synthetic peptides of mbp: mbp68-86, 87-99, and 110-128. nasal administration of the encephalitogenic mbp68-86 or 87-99 suppresses eae; mbp68-86 and 87-99 given together had synergistic effects in suppressing eae and reversed ongoing eae. a problem, however, of antigen-specific therapy by the nasal route is that one antigen, or peptide, may be effective in inducing tolerance in one strain of animal but not in another. one way of treating ongoing eae may be the use of an altered peptide ligand with high tolerogenic efficacy when administered nasally. 51 cytokines have been widely used in disease prevention and treatment. cytokine immunotherapy in ms could employ one of two basic strategies: first, to administer immune-response-downregulating cytokines, or second, to administer inhibitors of proinflammatory cytokines. the nasal route of administering these cytokines has been studied in the rat eae model. nasal administration of low doses of antiinflammatory or regulatory cytokines such as il-4, il-10, or transforming growth factor (tgf)-β1 inhibits the development of rat eae when given before or on the day of immunization, but by differing mechanisms.
nasally administered il-10 reduced both peripheral immune responses and microglial activation in the cns, whereas nasal administration of il-4 or tgf-β1 triggered the activation of dendritic cells (dcs). however, nasal administration of cytokines alone fails to treat ongoing lewis rat eae. interestingly, nasal administration of mbp68-86 + il-4 or mbp68-86 + il-10 suppresses ongoing eae in lewis rats. the suppression of eae by mbp68-86 + il-10 is associated with the induction of a broad lymphocyte hyporesponsiveness. although this combined administration of autoantigen plus cytokine may be effective in treating rat eae, its applicability to human ms is severely limited by the lack of knowledge of the pathologically relevant autoantigen(s) in ms. dcs not only activate lymphocytes, but also induce t cell tolerance to antigens. 52 use of tolerogenic dcs is thus a possible immunotherapeutic strategy for treatment of eae, and indeed this has been studied in some detail. however, mbp68-86-pulsed dcs only prevented the development of eae and failed to treat ongoing eae in lewis rats. 53 in an attempt to treat ongoing eae, splenic dcs have been isolated from healthy lewis rats and modified in vitro with the cytokines ifn-β, il-2, il-10, or tgf-β1. upon subcutaneous injection into lewis rats on day 5 postinoculation with mbp68-86 + fca, ifn-β- or tgf-β1-modified dcs promoted immune protection from eae.

the common marmoset callithrix jacchus is an outbred species characterized by a naturally occurring bone marrow chimerism. the marmoset is a primate phylogenetically close to humans, and has been studied as an animal model for ms. 54 eae can be induced in the common marmoset by the injection of human brain white matter, dispersed in demineralized water to a concentration of 30 mg/ml and emulsified with cfa containing 0.5 mg/ml of mycobacterium butyricum h37a. monkeys are injected intracutaneously with 600 µl of emulsion into the dorsal skin at several locations.
clinical disease in this model is scored daily on a scale from 0 to 5: 0 = no clinical signs; 0.5 = apathy, loss of appetite, and an altered walking pattern without ataxia; 1 = lethargy and/or anorexia; 2 = ataxia; 2.5 = paraparesis or monoparesis and/or sensory loss and/or brainstem syndrome; 3 = paraplegia or hemiplegia; 4 = quadriplegia; and 5 = spontaneous death attributable to eae. 55 here the onset of disease, as measured by clinical scores, varies among animals between 7 and 13 weeks postinoculation. additionally, the maximal clinical scores vary among animals and range between 2 and 4. on histopathological examination, large plaques of demyelination are observed in the white matter of the cerebral hemispheres, mainly localized around the walls of the lateral ventricles, in the hemispheric white matter, corpus callosum, optic nerves, and optic tracts. the demyelinated areas show a moderate or severe degree of inflammation characterized by perivascular cuffs of mononuclear cells. in the spinal cord, widespread demyelination is also observed; areas of demyelination involve the ventral, lateral, and dorsal columns of the spinal cord, especially the outer part of the spinal tracts. thus, pathology in the marmoset model is characterized by inflammation, demyelination, and astrogliosis. interestingly, this model demonstrates the presence of axonal damage in demyelinating plaques. indeed, axonal damage and loss are well-known events in ms, where axonal damage appears to be an early event related to acute inflammation. in marmoset eae, axonal damage likewise occurs in areas of acute and early inflammation and demyelination. this eae in c. jacchus is of special interest because of the resemblance of this model to the human disease, and the similarity between the immune systems of marmosets and humans. the type of clinical signs of eae in marmosets depends largely on the antigens used for disease induction.
sensitization of marmosets to human myelin induces a relapsing-remitting, secondary-progressive disease course. 56 lesions in this model represent all stages present in chronic ms. marmosets inoculated with mbp develop only mild inflammatory disease unless bordetella pertussis is used with the encephalitogen. cns demyelination critically depends on the presence of antibodies to myelin oligodendrocyte glycoprotein (mog), a minor cns component. marmosets sensitized to a chimeric protein of mbp and proteolipid protein (of myelin) develop clinical eae only after the autoimmune reaction has spread to mog. marmosets immunized with recombinant human mog 1-125 do not develop relapsing-remitting disease but only chronic-progressive disease. 57 during the asymptomatic phase of this primary-progressive-like disease, which can last from 2 to 20 weeks, brain lesions are detectable using magnetic resonance imaging (mri) but are not expressed clinically.

the induction of eae with mbp or white matter tissue homogenate (wmh) has been well established in rhesus monkeys (macaca mulatta). the rhesus monkey was the first animal species in which eae was deliberately induced. 58 that autoimmunity to brain antigens could induce paralytic disease was confirmed by studies in rhesus monkeys given repeated inoculations of brain homogenates. 58 mog-induced eae has also been produced in this nonhuman primate species, 59 which is a highly relevant model for the human disease. the close similarity of the human and rhesus monkey immune systems is illustrated by the high degree of similarity between the polymorphic mhc and t cell receptor genes of these two primates. to produce this mog-induced eae, monkeys are injected, under anesthesia, with a total of 1 ml of a 1:1 emulsion composed of 320 µg mog in pbs and cfa at 10 sites in the dorsal skin.
overt clinical signs are scored daily according to the following criteria: (0) no clinical signs; (0.5) loss of appetite, apathy, and altered walking; (1) lethargy, anorexia, substantial reduction of the general condition, and loss of tail tonus; (2) ataxia, tail biting, sensory loss, and/or blindness; (2.5) incomplete paralysis of one side (hemiparesis) or two sides (paraparesis); (3) complete paralysis of one side (hemiplegia) or two sides (paraplegia); (4) complete paralysis (quadriplegia); and (5) death. the onset of clinical disease varies between animals and occurs at days 15-23 after encephalitogenic challenge. all monkeys, however, develop clinical disease, and all reach a score of 4 in clinical severity. the currently available panel of nonhuman primate eae models may reflect the spectrum of inflammatory demyelinating diseases in the human population. these eae models can therefore be used to investigate pathogenic mechanisms and to develop more effective therapies.

the most widely studied animal model of eae is that of the mouse. in common with other animal models of eae, disease induction varies depending on the sex of the animals, the mouse strain used, and the origin of the spinal cord encephalitogen. in this model mice, aged 6-8 weeks, are immunized subcutaneously at four sites over the back with 200-400 µg of guinea pig mbp emulsified in equal volumes of cfa containing 200-400 µg heat-killed mycobacterium tuberculosis. 60 mice also receive 200 µg of pertussis toxin in 0.2 ml pbs intraperitoneally at the time of immunization and 48 h later. mice are then scored daily for clinical signs of eae for at least 35 days as follows: 0, no clinical signs; +1, limp tail or waddling gait with tail tonicity; +2, ataxia or waddling gait with tail limpness; +3, partial hindlimb paralysis; +4, total hindlimb paralysis; and +5, moribund/death.
for each strain of mice there is variation in the day of onset of disease, varying from day 7 to day 22 postimmunization; in the incidence of disease, varying from 30% to 100% of animals; in the incidence of mortality, varying from 0% to 40% of animals; and in mean clinical scores, varying from 0 to 1.6. many mouse strains have been employed in the study of eae, and while the sjl strain has been most frequently used to model gender differences in both disease onset and severity, the sjl model has some limitations due to its diminished cd4+ t cell repertoire. certain susceptible strains of mice, such as fvb mice, show a relapsing-remitting course of disease that bears some resemblance to ms. fvb mice therefore may serve as a mouse strain into which various transgenes may be introduced for the purpose of studying their influence on eae and for exploring new therapeutic approaches. 61 since eae is a well-studied disease in mice, mimicking many clinical and pathological features of ms, including cns inflammation and demyelination, it is significant that it can also be used as an appropriate model to study ms-related pain. it has been clearly demonstrated in sjl mice that in both "active" and "passive" eae there is an initial increase in tail withdrawal latency (hypoalgesia) that peaks several days before the peak in motor deficits during the acute disease phase. during the chronic disease phase, tail withdrawal latencies decrease and are significantly faster than control latencies for up to 38 days postimmunization. thus, it is possible to use both murine active and passive eae as models for ms-related pain. 62 while specific immunotherapeutic strategies are effective in experimental model systems, translation to the human disease has generally been poorly tolerated or has proved ineffective. this conflict may in part be due to the model systems used, as well as to the poor correlation of in vitro findings with those observed in vivo.
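the summary statistics quoted above (incidence, mortality, and mean clinical score per strain) can be computed directly from per-animal daily score records on the 0 to +5 mouse scale. a minimal sketch; the cohort data and the `summarize` helper are hypothetical illustrations, not data or code from the chapter:

```python
# hypothetical per-mouse daily scores on the 0 to +5 eae scale described above
cohort = {
    "m1": [0, 0, 1, 2, 3, 3, 2, 1],
    "m2": [0, 0, 0, 0, 0, 0, 0, 0],  # never develops disease
    "m3": [0, 1, 2, 4, 5],           # reaches +5 = moribund/death
}

def summarize(scores_by_mouse):
    """return (incidence, mortality, mean peak clinical score) for a cohort."""
    n = len(scores_by_mouse)
    incidence = sum(any(s > 0 for s in days) for days in scores_by_mouse.values()) / n
    mortality = sum(max(days) >= 5 for days in scores_by_mouse.values()) / n
    mean_peak = sum(max(days) for days in scores_by_mouse.values()) / n
    return incidence, mortality, mean_peak

incidence, mortality, mean_peak = summarize(cohort)
print(f"incidence {incidence:.0%}, mortality {mortality:.0%}, mean peak score {mean_peak:.2f}")
```

with this toy cohort the incidence is 2/3, the mortality 1/3, and the mean peak score (3 + 0 + 5)/3.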
in biozzi abh mice, which express the novel mhc class ii a, eae occurs following immunization with myelin proteins and peptide epitopes of these proteins; however, only plp peptide 56-70, mog peptide 8-21, or spinal cord homogenate reproducibly induces chronic relapsing eae (creae) with inflammation and demyelination. 63 creae provides a well-characterized, reproducible system with which to develop therapeutic strategies during established relapsing autoimmune neurological disease and is pertinent to ms. in creae in abh mice, relapse and progression of disease are associated with the emergence and broadening of the immune repertoire due to the release of myelin antigens following myelin damage. 64 thus, this creae model in biozzi abh mice is very well suited as a model with which to examine the effect of therapeutic strategies in a dynamic system.

disease susceptibility in human ms is associated with three mhc class ii alleles in the hla-dr2 haplotype: drb1*1501, drb5*0101, and dqb1*0602. 65 an autoimmune pathogenesis has been hypothesized in which one or more of these mhc class ii molecules presents cns-derived self-antigens to autoaggressive cd4+ t cells, which infiltrate the cns, initiating an inflammatory response. however, the target autoantigens in ms are unknown. immunization of mice with myelin or other brain-associated proteins induces eae, a disease resembling ms both clinically and pathologically. the myelin sheath components mbp, plp, and mog are candidate antigens. 66 indeed, t cells reactive to these antigens have been demonstrated in ms patients. 67 mice expressing the human hla-dr2 (drb1*1501) molecule are capable of presenting peptides from all three of these ms candidate autoantigens.
it is possible in ms that while t cells responding to one of these antigens may initiate the disease, epitope spreading and the recruitment of t cells with additional specificities as the disease progresses could lead to inflammatory responses to several proteins, resulting in an escalation of the autoimmune response. 68 transgenic mouse models of multiple sclerosis are now well established; the following are two examples of such transgenic models. first, ms is associated with the hla class ii molecules hla-dr2, -dr3, and -dr4. in humans it is difficult to analyze the individual roles of hla molecules in disease pathogenesis because of the heterogeneity of mhc genes, linkage disequilibrium, the influence of non-mhc genes, and the contribution of environmental factors. however, the specific roles of each of these class ii molecules can be addressed using transgenic models expressing these hla genes. this model could prove useful in deciphering the role of hla molecules and autoantigens in ms. 69 second, while eae has been a valuable model for the immunopathogenesis of ms, it has sometimes been difficult to reconcile the findings and therapies in the rodent models with the cellular and molecular interactions that can be studied in human disease. humanized transgenic mice offer a means of achieving this, through the expression of disease-implicated hla class ii molecules coexpressed with a cognate hla class ii-restricted, myelin-specific t cell receptor derived from a human t cell clone implicated in disease. such transgenic mice could provide an excellent model for studying epitope spreading in a humanized immunogenetic environment and for testing immunotherapies. 70 the majority of the current therapies being planned for phase ii and iii trials in ms were first examined in eae. a particularly pertinent question is thus whether eae is a suitable and relevant research tool for ms.
some researchers believe that while eae is a useful model of acute human cns demyelination, its contribution to the understanding of ms is limited. 71 eae is an acute monophasic illness, as compared to ms, which is a chronic relapsing disease, and it may be better suited as a model of acute disseminated encephalomyelitis (adem). drawbacks of the eae model include the following: (1) the nature of the inflammatory response in eae as compared to ms; (2) th1-mediated disease in eae as compared to ms; (3) differences in pathology between eae and ms; and (4) pitfalls in extending immunotherapies from eae to ms (see table 69-3). consequently it may be concluded that the clinical picture of eae depends not only on the animal species used, but also on the route of administration of the encephalitogen and on the nature of the encephalitogen: mbp, plp, or mog. it is thus possible that these eae models are somewhat imprecise methods with which to study the pathogenesis of ms or to develop therapeutic strategies. the nonhuman primate eae models are of primary importance for safety and efficacy testing of new therapeutics for ms that may not work sufficiently well in species distant from humans, such as rodents. questions concerning the immunogenicity of biological therapeutics have also been addressed in nonhuman primates. many biological therapeutics, such as anti-cd4 antibodies 72 and altered peptide ligands, 73 have been investigated in rodents. although some of these therapeutics have been effective in treating eae in rodents, they have proven only partially effective, or in some cases detrimental, in ms patients. 74 this ultimately raises the question of whether rodent models are the appropriate animal models for testing new therapeutic strategies for use in human ms. there are also several examples, in humans and animals, of demyelinating diseases not associated with viral infections, such as demyelination associated with vitamin deficiency or toxins.
many different animal models of eae have been studied using various mri techniques. 75 the clinical features of such models depend greatly upon the route of inoculation of the encephalitogen as well as the species and strain of animal used. inoculation routes such as subcutaneous, footpad, or intraperitoneal are not helpful in determining the onset or location of the lesion in the brain or spinal cord. thus, to create demyelinating lesions of precisely known location and time course, stereotaxic techniques are used to inoculate animals with chemicals that induce demyelinating lesions in the brain. several chemicals, such as ethidium bromide, cuprizone, and lysophosphatidylcholine (lpc), produce lesions of demyelination when injected directly into nerves or into the cns. for demyelination studies with lpc, male wistar rats are anesthetized with sodium pentothal and fixed in a rat head-restraining stereotaxic surgical table; the head is shaved, a burr hole is created, and 0.2 µl of a 1% lpc solution in isotonic saline is injected using an injector cannula. lpc is then infused at a rate of 0.05 µl/min for the next 4-5 min. the cannula is then removed and the burr hole closed using bone wax. rats are observed daily, and histological studies are carried out from day 3 to day 15 after lpc injection to cover the entire process of disease evolution. 76 using this lpc-induced demyelination it is possible to observe the complete pathological process of demyelination and remyelination in this animal model of ms. demyelination reaches its maximum on day 10; after day 10, remyelination starts with a reduction in edema. this model could be particularly useful for studying remyelination. one prominent feature of all chemically induced lesions is that the demyelinating lesions, and subsequent remyelination, can be studied without the interference of immune mechanisms.
this has a tremendous advantage over virally induced models of ms: since no virus was inoculated, none can remain to affect the remyelination.

references

the use of animal models to investigate the pathogenesis of neuroinflammatory disorders of the central nervous system
a multi-generational family with multiple sclerosis
the geography of multiple sclerosis reflects genetic susceptibility
pathogenesis of virus-induced demyelination
a comparison of the neurotropism of theiler's virus and poliovirus in cba mice
theiler's virus infection in mice: an unusual biphasic disease leading to demyelination
theiler's murine encephalomyelitis virus infection in mice: a persistent viral infection of the central nervous system which induces demyelination
the effect of restraint stress on the neuropathogenesis of theiler's virus-induced demyelination, a murine model for multiple sclerosis
the interaction of two groups of murine genes determines the persistence of theiler's virus in the central nervous system
the genetics of the persistent infection and demyelinating disease caused by theiler's virus
susceptibility to theiler's virus-induced demyelination: mapping of the gene within the h-2d region
the effect of l3t4 t cell depletion on the pathogenesis of theiler's murine encephalomyelitis virus infection in cba mice
characterization of b lymphocytes present in the demyelination lesions induced by theiler's virus
investigation of the role of autoimmune responses to myelin in the pathogenesis of theiler's virus-induced demyelinating disease
persistent infection with theiler's virus leads to cns autoimmunity via epitope spreading
role of natural killer cells as immune effectors in encephalitis and demyelination induced by theiler's virus
the role of cd8+ t cells in the acute and chronic phases of theiler's virus-induced disease in mice
quantitative, not qualitative, differences in cd8+ t cell responses to theiler's murine encephalomyelitis virus between resistant c57bl/6 and susceptible sjl/j mice
inhibition of theiler's virus mediated demyelination by peripheral immune tolerance induction
in vivo evaluation of remyelination in rat brain by magnetization transfer imaging
a human antibody that promotes remyelination enters the cns and decreases lesion load as detected by t2-weighted spinal cord mri in a virus-induced murine model of ms
persistent viral infections
weiss sr: the organ tropism of mouse hepatitis virus a59 in mice is dependent on dose and route of inoculation
cd4+ and cd8+ t cells are not major effectors of mouse hepatitis virus a59-induced demyelinating disease
cellular reservoirs for coronavirus infection of the brain in beta2-microglobulin knockout mice
autoimmune and virus induced demyelinating diseases: a review
dynamic regulation of alpha- and beta-chemokine expression in the central nervous system during mouse hepatitis virus-induced demyelinating disease
remyelination, axonal sparing, and locomotor recovery following transplantation of glial-committed progenitor cells into the mhv model of multiple sclerosis
oligodendrocytes and their myelin-plasma membrane connections in jhm mouse hepatitis virus encephalomyelitis
macrophage infiltration, but not apoptosis, is correlated with immune-mediated demyelination, following murine infection with a neurotropic coronavirus
cd4 and cd8 t cells have redundant but not identical roles in virus-induced demyelination
cd8+ tcr+ and cd8+ tcr- cells in whole bone marrow facilitate the engraftment of hematopoietic stem cells across allogeneic barriers
bystander cd8 t-cell-mediated demyelination is interferon-γ-dependent in a coronavirus model of multiple sclerosis
a role for α4-integrin in the pathology following semliki forest virus infection
replication of the a7(74) strain of semliki forest virus is restricted in neurons
demyelination induced in mice by avirulent semliki forest virus. ii. an ultrastructural study of focal demyelination in the brain
long-term effects of semliki forest virus infection in the mouse central nervous system
a case for cytokines as effector molecules in the resolution of virus infection
the adhesion molecule and cytokine profile of multiple sclerosis lesions
physiological deficits in the visual system of mice infected with semliki forest virus and their correlation with those seen in patients with demyelinating disease
role of immune responses in protection and pathogenesis during semliki forest virus encephalitis
isolation and characterization of cells infiltrating the spinal cord during the course of chronic relapsing experimental allergic encephalomyelitis in the biozzi ab/h mice
characterization of the cellular and cytokine response in the central nervous system following semliki forest virus infection
probable epitopes: relationships between myelin basic protein antigenic determinants and viral and bacterial proteins
acute experimental allergic encephalomyelitis increases lumbar spinal cord incorporation of epidurally administered [(3)h]-d-mannitol and [(14)c]-carboxyl-inulin in rabbits
upregulation of group 1 cd1 antigen presenting models in guinea pigs with experimental autoimmune encephalomyelitis: an immunohistochemical study
neurological response after induction of experimental allergic encephalomyelitis in susceptible and resistant rat strains
upregulation of monocyte chemotactic protein-1 and cc chemokine receptor 2 in the central nervous system is closely associated with relapse of autoimmune encephalomyelitis in lewis rats
evaluation of a rat model of experimental autoimmune encephalomyelitis with human mbp as antigen
induction of experimental autoimmune encephalomyelitis in dark agouti rats without adjuvant
inhibition of experimental autoimmune encephalomyelitis by inhalation but not oral administration of the encephalitogenic peptide: influence of mhc binding affinity
dendritic cells and the control of immunity
rat models as tools to develop new immunotherapies
eae in the common marmoset callithrix jacchus
demyelination and axonal damage in a non-human primate model of multiple sclerosis
histopathological characterization of magnetic resonance imaging-detectable brain white matter lesions in a primate model of multiple sclerosis: a correlative study in the experimental autoimmune encephalomyelitis model in common marmosets (callithrix jacchus)
experimental allergic encephalomyelitis in the new world monkey, callithrix jacchus
observations on the attempts to produce acute disseminated allergic encephalomyelitis in primates
rhesus monkeys are highly susceptible to experimental autoimmune encephalomyelitis induced by myelin oligodendrocyte glycoprotein: characterization of immunodominant t- and b-cell epitopes
sex differences in experimental autoimmune encephalomyelitis in multiple murine strains
eae susceptibility in fvb mice
hyperalgesia in an animal model of multiple sclerosis
encephalitogenic and tolerogenic potential of altered peptide ligands of mog and plp in biozzi abh mice
native myelin oligodendrocyte glycoprotein promotes severe chronic neurological disease and demyelination in biozzi abh mice
hla class ii associated genetic susceptibility in multiple sclerosis: a critical reevaluation
progress in determining the causes and treatment of multiple sclerosis
t cell reactivity to multiple myelin antigens in multiple sclerosis patients and healthy controls
humanized animal models for autoimmune diseases
identification of t cell epitopes on human proteolipid protein and induction of experimental autoimmune encephalomyelitis in hla class ii-transgenic mice
disease-related epitope spread in a humanized t cell receptor transgenic model of multiple sclerosis
experimental allergic encephalomyelitis: a misleading model of multiple sclerosis
the use of monoclonal antibodies for treatment of autoimmune disease
an altered peptide ligand mediates immune deviation and prevents autoimmune encephalomyelitis
the ups and downs of multiple sclerosis therapeutics
gadolinium enhancement in acute and chronic progressive eae in the guinea pig
sequential diffusion-weighted magnetic resonance image study of lysophosphatidyl choline-induced experimental demyelinating lesion: an animal model of multiple sclerosis

the authors would like to thank dana parks for expert secretarial assistance with this manuscript.

key: cord-017423-cxua1o5t authors: wang, rui; jin, yongsheng; li, feng title: a review of microblogging marketing based on the complex network theory date: 2011-11-12 journal: 2011 international conference in electrics, communication and automatic control proceedings doi: 10.1007/978-1-4419-8849-2_134 sha: doc_id: 17423 cord_uid: cxua1o5t

microblogging marketing, which is based on an online social network with both small-world and scale-free properties, can be explained by complex network theory. as a newly emerging marketing model, microblogging marketing has drawn domestic academic interest in recent years, but the relevant papers are scattered, which makes deep research inconvenient. on a microblog, every id can be seen as a node, and each connection between nodes can be seen as an edge. these nodes, edges, and the relationships among them form the social network on the microblog, which belongs to the typical complex network category.
therefore, reviewing the literature from the microblogging marketing angle through complex network theory can provide a systematic framework for microblogging marketing research. in short, it provides a theoretical basis for effectively analyzing microblogging marketing with complex network theory.

the start of complex network theory dates from the birth of the small-world and scale-free network models. these two models provide the network analysis tools and an interpretation of information dissemination for microblogging marketing. "six degrees of separation," found by stanley milgram, and other empirical studies show that real networks have a structure with a high clustering coefficient and a short average path length [1]. watts and strogatz creatively built the small-world network model with this structure (the ws model for short), reflecting that human interpersonal circles focus on acquaintances, which produces the high clustering coefficient, while occasional exchanges with strangers produce the short average path length [2]. every id on a microblog has strong ties with acquaintances and weak ties with strangers, which matches the ws model; but because individuals can maintain large numbers of weak ties on the internet, the online microblog network diverges from the real-world network. barabási and albert built a model with a growth mechanism and a preferential connection mechanism to reflect that many real networks have a degree distribution following a power law. because a power law has no characteristic scale in the degree distribution, this model is called the scale-free network model (the ba model for short) [3]. the power-law distribution exposes that most nodes have low degree and weak impact while a few nodes have high degree and strong impact, confirming the "matthew effect" in sociology and matching the microblog structure in which celebrities have much greater influence than grassroots users, something the small-world model cannot describe.
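the growth and preferential connection mechanisms of the ba model are easy to illustrate in code. the sketch below is a generic textbook construction, not the authors' implementation, and all parameter values are illustrative: each new node attaches m edges, picking targets with probability proportional to degree, so a few early "celebrity" hubs accumulate far more links than the "grassroots" majority:

```python
import random

def barabasi_albert(n, m, seed=0):
    """grow a scale-free edge list: each new node attaches m edges,
    preferring high-degree targets (preferential connection)."""
    rng = random.Random(seed)
    # start from a small fully connected core of m + 1 nodes
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # repeated node ids make degree-proportional sampling trivial:
    # a node with degree k appears k times in `targets`
    targets = [v for e in edges for v in e]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(targets))  # degree-biased pick
        for t in chosen:
            edges.append((new, t))
            targets.extend((new, t))
    return edges

edges = barabasi_albert(2000, 2)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
# heavy tail: the maximum degree dwarfs the median degree
print(max(degree.values()), sorted(degree.values())[len(degree) // 2])
```

the matthew effect shows up directly: the median node keeps a degree near m, while the largest hub ends up with a degree tens of times higher.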
in brief, the complex network theory pioneered by the small-world and scale-free network models overcomes the constraints on network size and structure of regular and random networks, and describes the basic structural features of high clustering coefficient, short average path length, power-law degree distribution, and scale-free characteristics. the existing literature analyzing microblogging marketing through the complex network theory is scarce, so the topic is worth further study. the complex network theory has evolved from the small-world and scale-free models to further models such as the epidemic model and the game model. the study of diffusion behavior on these evolved complex network models is valuable and can reveal in depth how a microblogging marketing concept spreads. the epidemic model divides the crowd into three basic types: susceptible (s), infected (i), and removed (r), and builds models according to the relationships among these types during the spread of a disease, in order to analyze the transmission rate, infection level, and infection threshold and so control the disease. typical epidemic models are the sir model and the sis model. the difference is that the infected (i) in the sir model become removed (r) after recovery, so the sir model is used for immunizable diseases, while the infected (i) in the sis model gain no immunity and simply become susceptible (s) again after recovery, so the sis model is used for unimmunizable diseases. these two models developed into other epidemic models: the sir model becomes the sirs model when the removed (r) can become susceptible (s) again, and the sis model becomes the si model, representing a disease that breaks out in a short time, when the infected (i) are incurable. the epidemic model is widely seen in complex networks, for example in the dissemination of computer viruses [4] , information [5] , and knowledge [6] . guimerà et al. find hierarchical and community structure in the social network [7] .
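the s-to-i-to-r compartment flow described above can be sketched as a simple euler integration of the sir equations; the rates, population size, and time step are assumed for illustration only.

```python
# sir model: dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I
def simulate_sir(beta=0.3, gamma=0.1, n=1000.0, i0=1.0, days=200, dt=0.1):
    s, i, r = n - i0, i0, 0.0
    history = []
    for _ in range(int(days / dt)):
        new_inf = beta * s * i / n * dt   # susceptible -> infected
        new_rec = gamma * i * dt          # infected -> removed (recovered, immune)
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        history.append((s, i, r))
    return history

# with beta/gamma = 3 > 1 the infection crosses the epidemic threshold,
# peaks, and then burns out as the susceptible pool is depleted
hist = simulate_sir()
peak_infected = max(i for _, i, _ in hist)
s_final, i_final, r_final = hist[-1]
```

swapping `new_rec` so that recovered individuals return to the susceptible pool turns the same loop into the sis variant, mirroring the model family described in the text.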
due to the hierarchical structure, barthélemy et al. indicate that a disease outbreak follows hierarchical dissemination from the large-node-degree group to the small-node-degree group [8] . due to the community structure, liu et al. indicate that community structure has a lower threshold and a greater steady-state density of infection, and thus favors infection [9] ; fu finds that the real interpersonal social network has a positive node-degree correlation, while the online one has a negative correlation [10] . the former means circles form among celebrities to the exclusion of grassroots users, while the latter means contacts form between celebrities and grassroots users on the microblog. the game theory combined with the complex network theory can explain micro-level interpersonal interaction such as tweet release, reply, and retweet, because it can analyze the complex dynamic processes between individuals, such as the game learning model, the dynamic evolutionary game model, and the local interaction model. (1) game learning model: individuals make the best decision by learning from others in the network. learning is critical to decision-making and game behavior, and equilibrium is the long-term outcome of irrational individuals seeking optimal results [11] . bala and goyal derive the "neighbor effect", showing the optimal decision-making process based on the historical information of individuals and their neighbors [12] . (2) dynamic evolutionary game model: the formation of the social network is a dynamic outcome of strategic edge-breaking and edge-connecting choices based on individual evolutionary games [13] . fu et al. add reputation to the dynamic evolutionary game model and find that individuals are more inclined to cooperate with reputable individuals, forming a stable reputation-based network [14] .
(3) local interaction model: a local information dissemination model based on strong interactivity within a local community is more practical for community microblogging marketing. li et al. restrict the preferential connection mechanism to a local world and propose the local-world evolving network model [15] . burke et al. construct a local interaction model and find that individual behavior presents the coexistence of local consistency and global decentrality [16] . generally speaking, the microblog has the characteristics of small-world and scale-free structure, high clustering coefficient, short average path length, hierarchical structure, community structure, and both positive and negative node-degree correlation. on one hand, the epidemic model offers viral marketing principles to microblogging marketing: the sirs model can be used for a long-term brand strategy, and the si model for a short-term promotional activity. on the other hand, the game model tells microblogging marketing how to find opinion leaders in different social circles and develop strategies for a specific community, realizing the neighbor effect and local learning to form globally coordinated interaction. rationally making use of these characteristics makes it possible to preset effective strategies and solutions for microblogging marketing. the complex network theory has been applied by domestic scholars to biological, technological, economic, management, social, and many other fields. zhou hui proves that the spread of sars rumors has typical small-world network features [17] . duan wenqi studies the synergistic diffusion of new products in the internet economy with the complex network theory to promote innovation diffusion [18] . wan yangsong (2007) analyzes the dynamic network of banking crisis spread and proposes an interbank network immunization and optimization strategy [19] .
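the local interaction idea — individuals imitating the choices of their network neighbors — can be sketched with a majority-imitation update on a grid; the update rule, grid size, and seed are illustrative assumptions, not the models of [15] or [16].

```python
import random
import networkx as nx

random.seed(7)
g = nx.grid_2d_graph(20, 20)                   # a local "world": nodes interact only with grid neighbors
state = {v: random.choice([0, 1]) for v in g}  # two competing strategies / opinions

def discordant_edges(graph, s):
    # edges whose endpoints disagree: a simple measure of local inconsistency
    return sum(1 for u, v in graph.edges() if s[u] != s[v])

before = discordant_edges(g, state)
for _ in range(20):                            # a few asynchronous sweeps
    for v in g:
        nbr = [state[u] for u in g[v]]
        if sum(nbr) * 2 > len(nbr):            # strict neighbor majority says 1
            state[v] = 1
        elif sum(nbr) * 2 < len(nbr):          # strict neighbor majority says 0
            state[v] = 0                       # (ties leave the node unchanged)
after = discordant_edges(g, state)
# each strict-majority flip removes more disagreeing edges than it creates,
# so disagreement can only decrease: the grid settles into locally consistent
# patches while, globally, both strategies may survive (local consistency
# coexisting with global decentrality)
```

the monotone drop in `discordant_edges` is the point of the sketch: purely local imitation is enough to produce the locally consistent, globally decentralized patterns the text attributes to local interaction models.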
although papers explaining microblogging marketing through the complex network theory have not been found, these studies provide heuristic methods, such as the studies of online communities. based on fu's study of the xiaonei sns network [10] , hu haibo et al. carry out a case study of the ruolin sns network and conclude that the online interpersonal social network not only has almost the same network characteristics as the real interpersonal social network, but also has a negative node-degree correlation where the real interpersonal social network has a positive one. this is because the online interpersonal social network makes it easier for strangers to establish relationships, so that low-influence people can reach high-influence people and form plentiful weak ties, breaking the limited range of the real world [20] . these studies can be used to develop effective marketing strategies and to control the scope and effectiveness of microblogging marketing. there is great potential in researching the emerging microblog network platform with the complex network theory. the complex network theory supplies micro and macro models for analyzing the marketing process of microblogging marketing. the complex network characteristics of small-world and scale-free structure, high clustering coefficient, short average path length, hierarchical structure, community structure, and positive and negative node-degree correlation, together with applications in various industries, provide theoretical and practical methods for conducting and implementing microblogging marketing.
the basic research idea is: extract the network topology of the microblog with the complex network theory; then analyze the marketing processes and dissemination mechanism with the epidemic model, the game model, or other models, taking into account macro and micro factors; finally, find measures for improving or limiting the marketing effect in order to promote beneficial activities and control impedimental activities in enterprises' microblogging marketing. because of the macro- and micro-level complexity and uncertainty of the online interpersonal social network, previous static and dynamic marketing theories cannot give a reasonable explanation. based on the strong and weak ties between individuals in a complex network, goldenberg et al. find that: (1) after an external short-term promotion activity, strong ties and weak ties become the main force driving product diffusion; (2) strong ties have strong local impact and weak transmission ability, while weak ties have strong transmission ability and weak local impact [21] . therefore, the strong local impact of strong ties and the strong transmission ability of weak ties must both be used rationally in microblogging marketing. through system simulation and data mining, the complex network theory can provide an explanatory framework and mathematical tools for microblogging marketing as an operational guide. microblogging marketing is based on the online interpersonal social network, which differs from both the non-personal social network and the real interpersonal social network; therefore, the corresponding study results cannot simply be mixed when human factors are involved. pastor-satorras et al. propose the targeted immunization solution, giving protection priority to larger-degree nodes, based on the sis scale-free network model [22] . this suggests the importance of cooperating with large influential ids as opinion leaders in microblogging marketing.
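the targeted immunization idea can be illustrated by comparing random and degree-targeted node removal on a scale-free network; the network size and the immunized fraction below are illustrative assumptions.

```python
import random
import networkx as nx

random.seed(3)
g = nx.barabasi_albert_graph(n=1000, m=2, seed=3)  # scale-free contact network
k = int(0.2 * g.number_of_nodes())                 # "immunize" 20% of the nodes

def giant_component_after_removal(graph, removed):
    # size of the largest connected piece left after taking nodes out
    h = graph.copy()
    h.remove_nodes_from(removed)
    return max((len(c) for c in nx.connected_components(h)), default=0)

random_nodes = random.sample(list(g.nodes()), k)
hub_nodes = sorted(g.nodes(), key=g.degree, reverse=True)[:k]

lcc_random = giant_component_after_removal(g, random_nodes)
lcc_target = giant_component_after_removal(g, hub_nodes)
# removing hubs fragments the network far more than random removal,
# cutting the paths along which an epidemic (or a marketing message) travels
print("giant component:", lcc_random, "(random) vs", lcc_target, "(targeted)")
```

the same asymmetry is what makes opinion leaders valuable: the hubs that, when immunized, stop an epidemic are the hubs that, when recruited, amplify a campaign.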
remarkably, the large influential ids are usually identified as ids with large follower counts on the microblog platform, which can be seen from the microblog database. the trouble is that, as scarce resources, the large influential ids have a higher cooperation cost, while ids with large follower counts are not all truly influential, owing to online public relations behaviors such as follower purchasing and "watering" (posting filler content). this problem is more complicated than the simple epidemic model. the complex network theory can be applied in behavior dynamics, risk control, organizational behavior, financial markets, information management, etc. microblogging marketing can learn the analytical method and operational guide from these applications, but the complex network theory cannot solve all the problems of microblogging marketing, mainly: 1. the complexity and diversity of the microblogging marketing process cannot be completely explained by the complex network theory. unlike a natural life-like virus, individuals on the microblog are boundedly rational; therefore their decision-making processes are affected not only by the neighbor effect and the external environment but also by their own values, social experience, and other subjective factors. this creates a unique automatic filtering mechanism in microblogging information dissemination: recipients reply to and retweet a tweet, or establish and cancel contact, only according to their interests, leading to complexity and diversity. therefore, interaction-worthy topics are needed in microblogging marketing, and the number of effective followers, not the total follower count of an id, is what is valuable. nothing of this kind is seen in disease infection. 2. there are differences in network characteristics between the microblog network and the real interpersonal social network.
on one hand, the interpersonal social network differs from the natural social network in six points: (1) the social network has a smaller network diameter and average path length; (2) the social network has a higher clustering coefficient than a same-scale er random network; (3) the degree distribution of the social network has a scale-free feature and follows a power law; (4) the interpersonal social network has a positive node-degree correlation while the natural social network has a negative one; (5) the local clustering coefficient of a given node is negatively correlated with the node degree in the social network; (6) the social network often has a clear community structure [23] . therefore, the results for the natural social network do not all fit the interpersonal social network. on the other hand, as an online interpersonal social network, the microblog has a negative node-degree correlation, which is opposite to the real interpersonal social network. this means the results for the real interpersonal social network do not all fit microblogging marketing. 3. there is still a conversion process from information dissemination to sales achievement in microblogging marketing. information dissemination on the microblog can be explained by complex network models such as the epidemic model, but the conversion from information dissemination to sales cannot be simply explained by the complex network theory, owing not only to the individual's external environment and neighbor effect, but also to the consumer's psychology and willingness, payment capacity, convenience, etc. according to operational experience, conversion rate, retention rate, residence time, marketing topic design, target group selection, staged operation programs, and other factors need to be analyzed with other theories. above all, microblogging marketing, which is attracting booming social attention, cannot be fully analyzed with conventional research theories.
however, the complex network theory can provide the analytical method and an operational guide to microblogging marketing. it is believed that microblogging marketing based on the complex network theory has good study potential and prospects from both theoretical and practical points of view.

references: the small world problem; collective dynamics of 'small-world' networks; emergence of scaling in random networks; how viruses spread among computers and people; information exchange and the robustness of organizational networks; network structure and the diffusion of knowledge; team assembly mechanisms determine collaboration network structure and team performance; romualdo pastor-satorras, alessandro vespignani: velocity and hierarchical spread of epidemic outbreaks in scale-free networks; epidemic spreading in community networks; social dilemmas in an online social network: the structure and evolution of cooperation; the theory of learning in games; learning from neighbors; a strategic model of social and economic networks; reputation-based partner choice promotes cooperation in social networks; a local-world evolving network model; the emergence of local norms in networks; research of the small-world character during rumor's propagation; study on coordinated diffusion of new products in internet market; doctoral dissertation of shanghai jiaotong university; structural analysis of large online social network; talk of the network: a complex systems look at the underlying process of word-of-mouth; immunization of complex networks; meeting strangers and friends of friends: how random are socially generated networks

key: cord-034181-ji4empe6 authors: saqib, mohd title: forecasting covid-19 outbreak progression using hybrid polynomial-bayesian ridge regression model date: 2020-10-23 journal: appl intell doi: 10.1007/s10489-020-01942-7 sha: doc_id: 34181 cord_uid: 34181 in 2020, coronavirus disease 2019 (covid-19), caused by the sars-cov-2 (severe acute respiratory syndrome coronavirus 2) coronavirus,
spread as an unforeseen pandemic that put humanity at great risk, and health professionals are facing several kinds of problems due to the rapid growth of confirmed cases. that is why prediction methods are required to estimate the magnitude of infected cases, and many studies on distinct forecasting methods have been presented so far. in this study, we propose a hybrid machine learning model that not only predicts with good accuracy but also takes care of the uncertainty of the predictions. the model is formulated using bayesian ridge regression hybridized with an n-degree polynomial and uses a probabilistic distribution to estimate the value of the dependent variable instead of traditional point methods. this is a completely mathematical model in which we have successfully incorporated prior knowledge, and the posterior distribution enables us to incorporate upcoming data without storing previous data. also, l(2) (ridge) regularization is used to overcome the problem of overfitting. to justify our results, we present case studies of three countries: the united states, italy, and spain. in each case, we fit the model and estimate the number of possible cases for the upcoming weeks. our forecast in this study is based on the public datasets provided by johns hopkins university, available until 11th may 2020. we conclude with the further evolution and scope of the proposed model. in late december 2019, a group of patients came to hospitals with pneumonia symptoms of unknown etiology. later, the first case of the novel coronavirus was reported in the city of wuhan in hubei province in central china [1] . after gaining a basic understanding of the virus, medical experts named it severe acute respiratory syndrome coronavirus 2 (sars-cov-2) and named the disease it causes coronavirus disease 2019 (covid-19) [2]. the cases of the covid-19 pandemic are growing rapidly.
till 30th april 2020, there were 3,251,587 confirmed and 229,832 death cases throughout the world due to this hazardous pandemic, covid-19. in india, the first laboratory-confirmed case of covid-19 was reported from kerala on 30th january 2020, and as of 30th april 2020, a total of 33,931 cases and 943 deaths had been reported [3] . to tackle this ongoing pandemic, and such events in the future where the lives of millions of people are at high risk, we need a strong health care system and technology that will pave the way to a panacea. whenever such a pandemic spreads in a country or province, it follows certain patterns, and various mathematical models can be proposed to forecast it using such technologies and mathematical theories. for example, in [4] the authors propose a model for the malaria transmission dynamics of the anopheles mosquito, and in [5] a bifurcation analysis for malaria transmission is developed. the menace of hiv/tb is also well known, and study [6] presents a mathematical analysis of its transmission dynamics. according to [7] , as a member of the β-coronavirus class, the virus spreads among hosts (from primary to secondary sources), which is why the magnitude of infected cases grows non-linearly. the non-linearity of a pandemic can be handled in several ways, e.g., in [8] a laplacian-based decomposition is used to solve the non-linear parameters in pine wilt disease. similarly, in [9] , a fractional version of the sirs (susceptible-infectious-recovered-susceptible) model is developed to help control the syncytial virus in infants. also, in [10] , the authors use generalized additive models (gams) to predict dengue outbreaks based on disease surveillance, meteorological, and socio-economic data. despite several research works and their documentation, there are huge opportunities for the utilization of ai, machine learning, and data science in this field, due to the novelty of the root cause.
for example, in [11] the author gives a comprehensive discussion of ai applications, constraints, and pitfalls during the covid-19 pandemic. prediction methods are therefore required to estimate the magnitude of infected cases, and many studies on distinct forecasting methods have been presented so far [12] . in [12, 13] , the authors estimate the possible number of infected cases in india using long short-term memory (lstm) networks. similarly, [14] models virus progression and forecasts with the same algorithm for canada, comparing with the united states (us) and italy. in [15] , sujatha applied linear regression (lr), a multilayer perceptron (mlp), and a vector autoregression model (varm) to the covid-19 kaggle data to anticipate the epidemiological pattern of the disease and the rate of covid-19 cases in india. in [16] the authors propose machine learning models (xgboost and a multi-output regressor) to predict confirmed cases over the coming 24 days in every province of south korea with 82.4% accuracy. as already discussed, study [10] proposes methods to control the syncytial virus in infants; likewise, for china, a modified seir model and an ai prediction of the trend of the covid-19 epidemic are proposed in [17] . other research on the indian cases uses different methods, namely the autoregressive integrated moving average (arima) model and richards' model [18] . beyond predictions, some mathematical models have also estimated the effects of lockdown and social distancing in india in a practical scenario [19] . however, all the studies presented so far are based on inadequate data from the initial stage, without any measurement of uncertainty. these models achieve good accuracy, but as more data become available, all of those algorithms will not survive without re-evaluation, owing to the dynamic nature of the escalation of covid-19.
a distribution-based learning model will therefore be more promising than point estimation. bayesian learning is a well-known method of making predictions based on prior knowledge [20] . many studies have already used the bayesian approach for prediction in pandemics and clinical forecasting: in [21] the authors estimate the probability of demonstrating vaccine efficacy in the declining ebola epidemic using a bayesian modeling approach. in chapter [22] , the author focuses on the various utilities of bayesian prediction, arguing that it is not only useful but simple, exact, and coherent, and hence beautiful. study [23] illustrates a bayesian analysis for emerging infectious diseases. similarly, [24] presents a bayesian scheme for real-time estimation of the probability distribution of the effective reproduction number, measuring the epidemic potential of emerging infectious diseases, and shows how to use such inferences to formulate significance tests on future epidemiological observations. in addition, [25] demonstrates a system able to provide early, quantitative predictions of sars epidemic events using a bayesian dynamic model for influenza surveillance. this was the motivation behind the proposed study: the prediction of cases of covid-19, which is also a sars-family virus, can be formulated using bayesian learning, as [25] already did for influenza surveillance. in the proposed study we formulate bayesian learning regression with a polynomial of n-degree. furthermore, one issue that occurs when working with time-series data (such as covid-19 confirmed cases) is overfitting, particularly when estimating models with large numbers of parameters over relatively short periods; a solution to the overfitting problem is to take a bayesian approach (with ridge regularization), which allows us to impose priors on the regression coefficients [26] .
another big reason we often prefer bayesian methods is that they allow us to incorporate uncertainty in our parameter estimates, which is particularly useful when forecasting [26] . the manuscript is organized as follows. "method and model" explains the methodology used to construct the model and the terminology used in the study. "significance of proposed model in covid-19 outbreak" describes the important advantages of such a hybrid model and discusses the novelty of our work in the covid-19 outbreak. three case studies are then presented in "case studies" to justify our results and the fruition of the model. finally, we discuss our results, a comparison with other developed models, and our findings in "results and discussion", followed by the conclusion in "conclusion". the datasets collected from johns hopkins university are used in this study [27] and were accessed on 11 may 2020. they provide the number of fatalities and registered patients by the end of each day. the dataset is available in time-series format with day, month, and year, so the temporal components are not neglected. a wavelet transformation [28] is applied to preserve the time-frequency components; it also mitigates random noise in the dataset. this dataset consists of six columns ( table 1 ). the only pre-processing required was to transform the dataset: observations are recorded every day, and for each day a new column is added. the datasets are divided into two parts, training (80%) and testing (20%). one of the most basic approaches to prediction is polynomial regression, a variant of linear regression in which the relationship between the independent and dependent variables is an n-degree polynomial.
mathematically, it can be represented as

y = β_0 + β_1 x + β_2 x^2 + ... + β_n x^n + ε,

or, in matrix form, y = Xβ + ε, where the β_i are the coefficients and ε is the measurement error; f(x) = Xβ is our polynomial model. to develop a good model we need to tune the β_i so that the following loss function with l_2 (ridge) regularization is as small as possible:

L(β) = Σ_i (y_i − f(x_i))^2 + λ Σ_j β_j^2,     (4)

where the first part of eq. 4 is the residual sum of squares (rss), the difference between the actual value (y_i) and the predicted value (f(x_i)) of the i-th observation, and λ is the regularization term deciding how much to regularize the β_i. for the best fit, our aim is to minimize the loss by tuning the coefficients β_i. according to [29] , the maximum likelihood estimate of β, which minimizes the rss, is

β^ = (X^T X)^(−1) X^T y.     (5)

now, instead of a vector of random coefficients, we have a single point estimate β^ in R^(p+1) [30] . here bayesian regression (br) comes into the picture. in br, instead of the point prediction above, a probabilistic distribution is used to estimate the value of y_i, following the likelihood

y | X, β, σ^2 ~ N(Xβ, σ^2 I).     (7)

from the conjugate prior distribution [20], eq. 7 is combined with an inverse-gamma prior on the noise variance, σ^2 ~ inv-gamma(v/2, v s^2/2), where v = n − k, n is the number of observations, and k is the number of coefficients in the vector β. this suggests a gaussian form for the prior distribution of the coefficients,

β | σ^2 ~ N(μ_0, σ^2 Λ_0^(−1)).     (9)

after the formulation of the prior distribution, we need to generate the posterior distribution, which can be formulated as follows (from eqs. 7, 9 and 10):

β | y, X, σ^2 ~ N(μ_n, σ^2 Λ_n^(−1)),  with  Λ_n = X^T X + Λ_0,

where Λ_0 = λI is the ridge regularization term [31] , used to overcome the problem of multicollinearity, which normally occurs when the model has a large number of parameters, and I is an identity matrix. the posterior mean μ_n can be represented in terms of β^ and the prior mean μ_0 as

μ_n = Λ_n^(−1) (X^T X β^ + Λ_0 μ_0),

and the remaining quantities of the bayesian learning can be updated analogously. now we are ready to estimate the probability of y under the given conditions using bayes' theorem,

p(β | y, X) = p(y | X, β, σ) p(β) / m,

where m is the marginal likelihood, obtained by integrating the likelihood p(y | X, β, σ) against the prior density (see fig. 1 ).
there are many parameters in the proposed model ( table 2 ), and a fit-and-score method is implemented to optimize them. it also implements predict, predict_proba, decision_function, transform, and inverse_transform if they are implemented in the underlying estimator. the parameters of the estimator are optimized by a cross-validated search over parameter settings [32] . the number of parameter settings that are tried is given by n_iter (≈100 in the proposed model). we initialize the parameters with default values and obtain the best-fitted parameters as given in tables 3 and 4 . the hyperparameter optimization is carried out by implementing the proposed model in python 3.6 using scikit-learn [32] , with spyder, a publicly available gui, used to debug the code. the main estimator parameters are: n_iter − int, optional: the number of iterations; the user-defined value must be greater than or equal to 1. tol − float, optional, default = 1.e-3: the precision of the solution; the algorithm stops if w has converged. alpha_1 − float, optional, default = 1.e-6: the shape parameter of the gamma distribution prior over the alpha parameter. alpha_2 − float, optional, default = 1.e-6: the inverse scale parameter of the gamma distribution prior over the alpha parameter. lambda_1 − float, optional, default = 1.e-6: the shape parameter of the gamma distribution prior over the lambda parameter. lambda_2 − float, optional, default = 1.e-6: the inverse scale parameter of the gamma distribution prior over the lambda parameter. the tuned values are retrieved via bayesian_search.best_params_.
instead of making predictions only, it discovers full probability distribution of the problem-domain even on a small dataset which also encounters the features of the confidence interval, risk aversity, etc. [33] . moreover, posterior distribution makes the model to incorporate more upcoming data without storing previous data. in the current situation of the pandemic, data are not enough to make any prediction without any measurement of uncertainty. in the introduction section, we have seen many studies for covid-19 progression with good accuracy but as well as data become available, those entire algorithms will not able to survive without a few evaluations. it will happen because of the dynamic nature of pandemic escalation. for example, if we consider our traditional regression methods (eq. 1) and we can discover the best possible values for vector β by using eq. 5, in this case, β will be more promising on large datasets rather than small datasets (the available data of covid-19 is not enough yet) because this method failed to quantify the certainty [34] . here, we need to make little change with β, determine a distribution instead of a single point estimation and it is all that bayesian ridge regression does in this model. now, when β is a distribution instead of a mere number our þalso turns into stochastic and becomes a distribution too. this means that we have confidence interval in our prediction and it became necessary to encounter uncertainty in the case of covid-19 progression forecasting when datasets are rapidly growing but not sufficient yet. besides, in eq. 4 of the model, we also used l 2 (ridge) regularization to makes model less prone to overfitting. ridge regression is better to use when all the weights are equal sizes and the dataset has no outliers. clinical trials and diagnosis are very expensive and their outcomes are crucial to the concerned stakeholders and, hence, there is considerable pressure to optimize them. 
in medical treatment, clinicians and nurses very often have to make complex and critical decisions during the diagnosis of patients. in reality, these decisions are full of uncertainty and unpredictability. however, based on the available information obtained from various clinical and diagnostic tests and the situation of the patient, both clinicians and nurses try to reduce the uncertainty in their clinical decisions and to improve the predictability of the chance of improvement in the patient's condition. in the case of the covid-19 pandemic, the situation is the same as in any other clinical setting: much pre-planning and control needs good predictions of the magnitude of infected cases as well as a measurement of uncertainty. one route to optimization is to make better use of all available information, and bayesian statistics provides this opportunity. bayesian statistics provides a formal mathematical method for combining prior information with current information at the design stage, during the conduct of a trial, and at the analysis stage. the main reason for using a bayesian approach to covid-19 is that it facilitates representing, and taking fuller account of, the uncertainties related to models and parameter values. in contrast, most decision analyses based on maximum likelihood (or least squares) estimation involve fixing the values of parameters that may, in actuality, have an important bearing on the outcome of the analysis and for which there is considerable uncertainty. one of the major benefits of the bayesian approach is the ability to incorporate prior information. a bayesian-inference-based approach is important for the covid-19 pandemic, rather than point estimation, because it makes it possible to obtain probability density functions for the model parameters and to estimate the uncertainty, which is important in risk-assessment analytics.
in the bayesian regression approach, we can take into account that other models developed with good accuracy will not be able to survive without re-evaluation as more data become available, due to the dynamic nature of the covid-19 pandemic escalation, whereas the proposed model corrects the distributions of the model parameters and the forecasting results using those parameter distributions. this approach has long been used for pandemic and clinical forecasting because of its uncertainty measurement: for example, in [21] a bayesian modeling approach was used to calculate vaccine efficacy in the declining ebola epidemic, and [23, 24] demonstrated a bayesian scheme for emerging infectious diseases and showed how to use such inferences to formulate significance tests on future epidemiological observations. in short, bayesian methods have the following advantages [35] over other time-series machine learning approaches:
- they provide an organized way of combining prior information with data, within a solid decision-theoretical framework.
- they are an inference-based learning approach built on previously available data without reliance on asymptotic approximation, and such learning gives consistent results with small and large samples equally.
- they are based on the likelihood principle, which gives identical inferences under distinct sampling designs.
- the distributions of the various parameters used in the model are interpretable.
the method of the present study is unique because the model uses prior and posterior distributions to estimate the confirmed cases. the model should be judged not only by its accuracy but also by the reliability of the predictions it makes using prior and posterior knowledge fetched from the data. to test the results and obtain the accuracy of the model, we have proposed a case study of three countries: the u.s., spain, and italy.
we implemented the proposed model with hybridization of polynomial fitting of degrees 4, 5, 6, and 7 because we have observed that the best estimations happen within this range of degrees. the confirmed cases in italy were the lowest since 13th march, but the deaths remained stubbornly high, having hovered between 600 and 800 for the last few weeks (see fig. 2). using pbrr, we fitted polynomials and discovered that degree 6 is the best fit for the dataset of italy. in fig. 3 the solid blue line demonstrates the actual confirmed cases and the dashed green line represents the observations calculated by the model. in fig. 3 (a), the degree-4 pbrr suffers from underfitting and poorly estimates the cases for the unseen days. also, in fig. 3 (b), the model shows overfitting and overestimates on the testing data; in fig. 3 (b) there is a sudden decrement in the number of cases, which is not a good prediction considering the ongoing situation of italy. in fig. 3 (c), the degree-6 pbrr gives an rmse of 418.36 with an accuracy of 91% on testing data. in fig. 4 we have plotted four polynomial curves using pbrr of different degrees and observed that degree 6 is well suited. in fig. 4 (a) the degree-4 pbrr fits but gives poor performance on tested data due to overfitting on training data. also, in fig. 4 (b) the degree-5 and (d) degree-7 pbrr fit well, but after 100 days they start decreasing, which is not suitable for the current circumstances. through our investigation of the dataset of spain, an instant decrement is recorded on the 95th day after first case arrival. similar to the previous case, we fitted four different pbrr models on the spain dataset too and found that degree 7 is the best fit, which not only correctly estimates confirmed cases for unseen days but also tracks the decrement that happened earlier (see fig. 5). the model in fig. 5 (a) is underfitting: it neither predicts the unseen observations nor performs well on the training dataset. the other two models (fig. 5 (b) and (c)) are not suitable for the present ongoing situation.
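the degree-selection loop described above can be sketched as follows; for brevity this uses ordinary least-squares `numpy.polyfit` on synthetic data rather than the paper's bayesian ridge estimator, so it only illustrates the select-by-test-rmse step, not pbrr itself:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 100)                       # normalized day index
y = 500.0 * t**3 + rng.normal(0.0, 5.0, t.size)      # synthetic epidemic-like curve

# hold out the last 20 "days" as the unseen test period
train, test = slice(0, 80), slice(80, 100)
rmse_by_degree = {}
for degree in (4, 5, 6, 7):                          # the degree range used in the paper
    coeffs = np.polyfit(t[train], y[train], degree)
    pred = np.polyval(coeffs, t[test])
    rmse_by_degree[degree] = float(np.sqrt(np.mean((y[test] - pred) ** 2)))

# pick the degree that generalizes best to unseen days
best_degree = min(rmse_by_degree, key=rmse_by_degree.get)
```

the key design point mirrored here is that the winning degree is chosen on held-out days, which is how underfitting (degree too low) and overfitting (degree too high) show up as larger test rmse.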
the model in fig. 5 (d) estimates the testing data with an rmse of 624.27 and an accuracy of 90.5%. in the above case studies, we have fitted and found different parameters for the predictions (table 4). now we can predict for the upcoming days. so, we have predicted for 6 days and compared with the actual number of cases on those days (table 5). it is demanding to construct a model to predict the dynamic progression of covid-19 situations. many researchers are struggling to find and implement such models with optimal parameters and unknown variables, which leads them to uncertainty. the pbrr model is different from all the studies published or discussed in the literature survey because of the way it makes an estimation. it is a complex mathematical model that is more focused on discovering a distribution instead of making a single-value linear prediction of the dependent variable, and this feature makes it more promising. as far as we have seen in all the above-mentioned case studies, different polynomials based on bayesian belief with degrees between 4 and 7 fit best and enable us to forecast future infected cases of covid-19. instead of applying the model to any specific country, we can also estimate the cases on the worldwide dataset. fig. 6 demonstrates the curve fitting using the pbrr model of degree 5 on worldwide data with an accuracy of 89% on testing data. we also estimate the magnitude of confirmed cases for the upcoming 10 days. applying pbrr to worldwide data means scaling the independent variables, but our model also survived in this scenario, showing the consistency of the system. to prove the novelty and superiority of the proposed model, we have compared several models (table 6). after the comparison, we finally observed that the proposed model is better than the others in terms of rmse and comparably equal in terms of accuracy with arima and lstm.
although arima and lstm give slightly more accurate results, pbrr uses the prior and posterior distributions for the model parameters, which is not done by either arima or lstm. we also experimented with bayesian linear regression using the prior and posterior distributions for the model parameters, which did not give satisfactory results in terms of rmse, accuracy, and sd. in section 2, we have already discussed the importance of the prior and posterior distributions for the model parameters. no doubt, lstm is a deep learning based advanced approach to forecast time-series data, but it also has some drawbacks compared to the proposed model, e.g. longer time to train, more memory, overfitting, sensitivity to different random weight initializations, etc. overfitting is one of the major issues of the lstm, which is overcome in the proposed model by adding ridge regularization. in lstm hidden layers we have a sequential path from older past cells to the current one; in fact the path is even more complicated, because it has additive and forget branches attached to it. lstm, gru and their derivatives are able to learn a lot of longer-term information, but they can remember sequences of hundreds, not thousands or tens of thousands, as given in [40]. moreover, rnns are not hardware friendly: it takes a lot of resources we do not have to train these networks fast, and it also takes many resources to run these models in the cloud, and the cloud is not scalable [40]. pbrr modeling not only has sufficient accuracy but is also more reliable than other methods. in the present circumstances, when thousands of people are losing their loved ones or their own lives, a model with more promising algorithms is needed along with good accuracy. prediction with misconceptions may lead to serious problems for health care professionals as well as governments.
although pbrr gives reliable results, the reality is that the forecasting of any pandemic is not merely dependent on previous observations or time-series analytical inference. many more important factors influence the magnitude of infection, like healthcare system stability, education, awareness of people, weather, lockdown, and social distancing, etc. soon, researchers may come up with different robust models that also consider these factors.
references:
- ministry of health & family welfare, government of india
- malaria transmission dynamics of the anopheles mosquito in kumasi, ghana
- bifurcation analysis of a mathematical model for malaria transmission
- mathematical analysis of the transmission dynamics of hiv/tb coinfection in the presence of treatment
- covid-19: a promising cure for the global panic
- semianalytical study of pine wilt disease model with convex rate under caputo-febrizio fractional order derivative
- a new fractional hrsv model and its optimal control: a non-singular operator approach
- prediction of dengue outbreaks based on disease surveillance, meteorological and socio-economic data
- artificial intelligence vs covid-19: limitations, constraints and pitfalls. science of the total environment
- prediction for the spread of covid-19 in india and effectiveness of preventive measures
- predictions for covid-19 outbreak in india using epidemiological models
- predictions for covid-19 outbreak in india using time series
- forecasting of covid-19 transmission in canada using lstm networks
- a machine learning methodology for forecasting of the covid-19 cases in india
- machine learning model estimating number of covid-19 infection cases over coming 24
- modified seir and ai prediction of the epidemics trend of covid-19 in china under public health interventions
- forecasting covid-19 impact in india using pandemic waves nonlinear growth models
- modeling and predictions for covid 19 spread in india
- estimating the probability of demonstrating vaccine efficacy in the declining ebola epidemic: a bayesian modelling approach
- chapter 5 - bayesian prediction
- bayesian analysis for emerging infectious diseases
- real time bayesian estimation of the epidemic potential of emerging infectious diseases
- bayesian dynamic model for influenza surveillance
- a bayesian approach to time series forecasting
- covid-19 datasets from johns hopkins university
- wavelet transform domain filters: a spatially selective noise filtration technique
- the elements of statistical learning: data mining, inference, and prediction. springer
- improving efficiency by shrinkage: the james-stein and ridge regression estimators
- scikit-learn: machine learning in python
- the bugs book: a practical introduction to bayesian analysis. taylor & francis, chapman & hall/crc texts in statistical science
- analysis of regression confidence intervals and bayesian credible intervals for uncertainty quantification
- introduction to bayesian analysis procedures
- seir and regression model based covid-19 outbreak predictions in india
- covid-19 virus outbreak forecasting of registered and recovered cases after sixty day lockdown in italy: a data driven model approach
- covid-19: a comparison of time series methods to forecast percentage of active cases per population
- time series forecasting of covid-19 transmission in canada using lstm networks
- the fall of rnn / lstm
publisher's note: springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. dhanbad. he has been involved in automation in smart grid and applied artificial intelligence in seismology. his areas of interest are data analysis, artificial intelligence, and applied statistics.
key: cord-027315-1i94ye79 authors: bielecki, andrzej; gierdziewicz, maciej title: simulation of neurotransmitter flow in three dimensional model of presynaptic bouton date: 2020-05-22 journal: computational science iccs 2020 doi: 10.1007/978-3-030-50420-5_10 sha: doc_id: 27315 cord_uid: 27315-1i94ye79 in this paper a geometrical model for simulation of the nerve impulses inside the presynaptic bouton is designed. the neurotransmitter flow is described by a partial differential equation with a nonlinear term. the bouton is modeled as a distorted geosphere and the mitochondrion inside it as a highly modified cuboid. the quality of the mesh elements is examined. the changes of the amount of neurotransmitter during exocytosis are simulated. the activity of neurons [16], including that taking place in the presynaptic bouton of the neuron, may be described theoretically with differential equations [1, 2, 4] or, more specifically, with partial differential equations (pde) [3, 11, 13]. numerical methods have been commonly used to solve pdes [9, 10, 12, 13], which may be a proof that scientists are constantly interested in this phenomenon and, on the other hand, that results are variable and depend strongly on the biological assumptions and on the way of modeling. to obtain the solution, an appropriate design of the mesh of the studied object (in this paper: of a presynaptic bouton of a neuron) is necessary, which may be a demanding task [19, 20]. some experiments in the cited papers have already been performed in order to answer the question of how the neurotransmitter (nt) mediates the process of conducting nerve impulses, which is connected with the distribution of nt in the synapse and with its exocytosis. this paper is intended to expand that knowledge by using a more complicated mathematical model in three dimensions together with a geometric model which is more complex than in some previous works [8, 13].
there are also examples of the description of a realistic geometric model [7], but without performing simulations. however, the nt flow in a realistically shaped model of the presynaptic bouton affected by the presence of a mitochondrion inside it has not been studied yet. therefore, the objective of this paper was to examine the process of nt flow in a realistic model of the presynaptic bouton with a mitochondrion partly occupying its volume. in this section the model based on partial differential equations is presented briefly. such an approach allows us to study the dynamics of the transport both in time and in the spatial aspect. the model is nonlinear, diffusive-like -see the paper [3] for details. there are a few assumptions about the model. the bouton location is a bounded domain ω ⊂ r³. the boundary (the membrane) of the bouton is denoted as ∂ω. the total amount of neurotransmitter in the bouton increases when new vesicles are supplied inside the bouton and decreases when their contents are released through the bouton membrane. the proper subset of the bouton, ω₁, is the domain of vesicle supply. the release site, γ_d, is a proper subset of the membrane. the function f : ω → r models the synthesis of the vesicles that contain neurotransmitter. the value of this function is f(x) = β > 0 on ω₁ and f(x) = 0 on ω \ ω₁. the neurotransmitter flow in the presynaptic bouton was modeled with the partial differential equation ∂ρ/∂t(x, t) = a Δρ(x, t) + f(x)(ρ̄ − ρ(x, t))⁺, where ρ(x, t) is the density of neurotransmitter at the point x and time t and a is the diffusion coefficient. the last term contains the function f(x), presented before, and the threshold nt density, above which synthesis does not take place, denoted by ρ̄. for any x ∈ r the "x⁺" symbol means max(0, x). the boundary condition (2), in which ∂/∂ν is the directional derivative in the outer normal direction ν, prescribes the release of neurotransmitter through the membrane.
the function η(t) depends on the time t and takes the value 1 for t ∈ [t_n, t_n + τ] (with the action potential arriving at t_n and with the release time τ), and η(t) = 0 otherwise [5, 8]. multiplying (2) by a smooth function v : ω → r and integrating by parts gives (3) (see [3] for details), provided that ρ is sufficiently smooth. piecewise linear c⁰ tetrahedral finite elements are used in the numerical scheme. the unknown function ρ is approximated by ρ_h = Σᵢ ρᵢ vᵢ, where the basis functions of the finite element space are denoted by v_i. let a(t) = a₁ + η(t)a₂, and let us also introduce the nonlinear operator b : r^k → r^k; with it, eq. (3) may be rewritten as a system of ordinary differential equations for the vector of coefficients. the time step is defined as t_k = kδt, where δt is at least several times smaller than τ. if we assume that ρ_h^k approximates ρ_h(t_k), we can use the crank-nicolson approximative scheme, valid for k = 1, 2, . . .. in the finite element basis {v_k}, ρ⁰ is approximated by ρ⁰_h. the scheme must be solved for ρ_h^k. the problem of the nonlinearity in b is dealt with by an iterative scheme: in each step of the iteration the linear system (9) is solved. the iterative procedure is stopped for the value m for which the correction of the solution due to one step of the scheme is sufficiently small. note that the nonlinear term is calculated at the previous iteration (m) and the linear terms at iteration (m + 1). such a scheme reduces to solving a linear system in each step of the iteration. it is an alternative to the newton method, and it has the advantage of not needing to cope with the non-differentiable nonlinearity introduced by the last term (taking the positive part) of the equation. from a physiological point of view, the time step for eq. (9) should be less than the time scale of the modeled phenomenon.
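a one-dimensional toy analogue of this scheme, with crank-nicolson time stepping and the fixed-point (picard) treatment of the non-differentiable source term, might look as follows; all constants are illustrative, and this is a sketch, not the paper's 3d finite element code:

```python
import numpy as np

# 1d toy analogue: rho_t = a * rho_xx + f(x) * max(rho_bar - rho, 0),
# crank-nicolson in time, picard iteration for the nonlinear source
n = 50
x = np.linspace(0.0, 1.0, n)
dx = x[1] - x[0]
a, dt, rho_bar = 3.0, 1e-4, 1.0
f = np.where(np.abs(x - 0.5) < 0.1, 21.0, 0.0)   # supply only in a subdomain

# standard second-difference laplacian (zero dirichlet ends, toy choice)
L = (np.diag(np.full(n - 1, 1.0), -1) - 2.0 * np.eye(n)
     + np.diag(np.full(n - 1, 1.0), 1)) / dx**2
A = a * L
I = np.eye(n)

rho = 0.5 * np.ones(n)
for step in range(10):
    prev = rho
    guess = prev.copy()
    for m in range(50):                           # fixed-point iterations
        # nonlinear term at iteration (m), linear terms at (m + 1),
        # mirroring the scheme described in the text
        b = prev + 0.5 * dt * (A @ prev) + dt * f * np.maximum(rho_bar - guess, 0.0)
        new = np.linalg.solve(I - 0.5 * dt * A, b)
        converged = np.max(np.abs(new - guess)) < 1e-12
        guess = new
        if converged:
            break
    rho = guess
```

the design point mirrored here is that each picard sweep only solves a linear system, so the non-differentiable positive-part term never has to be differentiated, unlike in a newton iteration.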
we do not have a proof of the scheme's stability, but the numerical experiments show that the iterations of the fixed point scheme with respect to the index m converge, and the limit is the solution of the crank-nicolson scheme. this scheme is known to be unconditionally stable. still, to avoid numerical artifacts, the time step needs to be sufficiently small. our numerical experiments show that no numerical artifacts are seen in the obtained solution for δt = 0.0001 µs and for δt = 0.00005 µs. for the numerical simulations, the bouton is modeled as a bounded polyhedral domain ω ⊂ r³. the boundary of the bouton, ∂ω, is represented by a set of flat polygons. the proper polyhedral subset of the bouton, ω₁, is the domain of vesicle supply. the release site γ_d is modeled as a finite sum of flat polygons. various geometrical models of the presynaptic bouton have been used by the authors to simulate the process of conducting nerve impulses so far. the natural reference point for assessing their quality is to compare them to the images of real boutons which have been thoroughly examined, for example, several years ago [17, 18]. one of the models was based on a skeleton made up of two concentric spheres, the outer one having a diameter of 2-3 µm and the inner one free from synaptic vesicles, and also with numerous release sites [13]. another structure, consisting of two concentric globes and a single release site, was filled with a tetrahedral mesh and has been used to examine the distribution of neurotransmitter inside the bouton [6]. a very similar structure has been utilized to find the connection between the number and location of synthesis domains and the amount of neurotransmitter in different locations in the bouton [8]. another, more complicated, surface model, intended for the discrete simulation of vesicle movement, consisted of a realistically shaped bouton and a mitochondrion partly occupying its space [7].
the amplitudes of evoked excitatory junctional potentials and the estimated number of synaptic vesicles released during exocytosis were examined in [13] with the use of the model with an empty central region. the study discussed, among others, the way synaptic depression was reflected in the results of the simulations. the model made up of two concentric spheres filled with a tetrahedral mesh was used to demonstrate synaptic depression [6]. the volume of the model was about 17.16 µm³ and the surface about 32.17 µm². the diffusion coefficient was 3 µm²/s, and the synthesis rate and the exocytosis rate were around 21 µm³/s. the results confirmed that during 0.5 s of 40 hz stimulation, with the assumed values of parameters, the amount of neurotransmitter in the presynaptic bouton decreased, though only by about 2%. the next studies with the globe model [8] were performed to examine the influence of the location and number of synthesis zones on the amount of neurotransmitter in particular locations of the bouton. the simulation parameters were similar to those in [6], but the number of synthesis zones was 1 or 2 (with the same total volume) and the location was also variable. the chief conclusion was that the closer the synthesis (or supply) domain is to the release site, the faster the bouton becomes depleted of neurotransmitter. the geometric aspects of the models of the bouton used so far, mentioned in the previous section, were extremely simplified. in this paper, the simulations have been conducted on the basis of a realistic model of the bouton in which the mitochondrion is modeled as well. the structure used to model the presynaptic bouton in this paper was based on the real shape of such a bouton and was composed of geometric shapes modified so that it resembled the original.
therefore, the outer surface of the bouton was modeled as a moderately modified sphere, whereas the part of the mitochondrion inside it was a highly distorted cuboid, as presented in [7]. the parameters of the surface mesh are described therein in detail. it should be stressed that the mesh described in this paper was designed to shorten the computing time of simulations of neurotransmitter flow, and therefore contains, among others, relatively large elements. the number of tetrahedra in the mesh was less than 10⁵ and the number of surface triangles (faces) was less than 5 × 10⁴. the parameters of the input surface mesh were as follows: the total surface of the bouton was s ≈ 6.7811 µm², and the area of the active zone (of the release site) was s_az ≈ 0.2402 µm². the surface mesh is presented in fig. 1. the tetrahedral mesh was generated with the tetgen program [14, 15]. the result is depicted in fig. 2. the input parameters for tetgen were chosen in order to minimize the number of mesh elements. as a result, the three-dimensional mesh of the bouton contained 77418 tetrahedra with a total volume v ≈ 0.9029 µm³. the volume of the part of the bouton located in the direct neighborhood of the mitochondrion (where the endings of the microtubules are sometimes found), assumed as the theoretical nt supply zone ("synthesis domain"), was v_s ≈ 0.0198 µm³. the quality of the mesh was assessed by computing several measures for each tetrahedron and by taking, for each measure, its maximal value, i.e. the value for the worst mesh element. the values are collected in table 1. the relatively high values of the mesh quality parameters may suggest a cautious approach to the analysis of the results. however, further stages of the experiment revealed that the mesh proved sufficient for performing the planned simulation.
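the specific measures of table 1 are not reproduced here, but one common tetrahedron quality measure, the volume normalized by the rms edge length (equal to 1 for a regular tetrahedron and near 0 for sliver elements), can be sketched as follows; whether the paper used this particular measure is an assumption:

```python
import numpy as np

def tet_quality(p0, p1, p2, p3):
    """normalized volume quality: 6*sqrt(2)*V / l_rms^3, which is 1 for a
    regular tetrahedron and approaches 0 for degenerate (sliver) elements."""
    p0, p1, p2, p3 = (np.asarray(p, float) for p in (p0, p1, p2, p3))
    # signed volume via the scalar triple product
    vol = abs(np.dot(p1 - p0, np.cross(p2 - p0, p3 - p0))) / 6.0
    edges = [p1 - p0, p2 - p0, p3 - p0, p2 - p1, p3 - p1, p3 - p2]
    l_rms = np.sqrt(np.mean([np.dot(e, e) for e in edges]))
    return 6.0 * np.sqrt(2.0) * vol / l_rms**3

# regular tetrahedron (alternate cube corners) -> quality ~ 1
reg = tet_quality((1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1))
# nearly flat tetrahedron -> quality close to 0
sliver = tet_quality((0, 0, 0), (1, 0, 0), (0, 1, 0), (0.5, 0.5, 1e-6))
```

reporting the worst (here: smallest) value over all elements, or the worst value of an inverse measure as in the text, flags the single element most likely to degrade the solution.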
the initial density of the neurotransmitter was calculated by using a radial formula in which a = 300 [vesicles/µm³] is the theoretical maximal value of the function, b = 0.28 [1/µm²] is a decay coefficient, and r [µm] is the distance from the "center" of the bouton, i.e. from the center of the minimal cuboid containing the model. the simulation time was 0.1 s, and during that time there were 4 impulses. the results are depicted in fig. 3 and fig. 4, and in table 2. the program was designed primarily for validating the accuracy of the calculations; therefore the algorithm was implemented as single-threaded code in python, with numeric and graphic modules included, and the calculations were relatively slow. the approximate speed was 30-40 iterations per hour, so one simulation run lasted about 2-3 days. the total amount of neurotransmitter in the presynaptic bouton calculated during the simulation was almost constant: the relative decrease of its value did not exceed 1%. those values refer to the situation when the neuron is not very intensively stimulated and synaptic depression is not likely to occur. from the analysis of fig. 3 it may be concluded that the activity of the release site is very moderate. however, closer examination of fig. 4 reveals that the spatial distribution of neurotransmitter does change during stimulation. in the region directly adjacent to the release site the amount of neurotransmitter drops a little, and in its vicinity a slight decrease may also be noticed, which is visible in the last picture in fig. 4. the amount of neurotransmitter increased between stimuli, though at a very low speed, which may be attributed to the low value of the exocytosis rate. changes in these parameters can result in the neurotransmitter amount sinking rapidly during exocytosis, thus leading to synaptic depression. to verify the reliability of the results, two control simulations were run.
the first one, with a two times larger mesh, confirmed that the relative differences in the results did not exceed 0.14%. for the second one, with a two times larger mesh (as before) and with a halved time step, the results were almost the same; the relative difference between the two control runs did not exceed 10⁻¹⁰. the process described throughout this paper is similar to what has been found before [1], where, in fig. 4, one can notice an almost constant number of released vesicles in every time step at the end of the simulation. taking into account the fact that the authors of the cited paper assumed that the total number of synaptic vesicles in the bouton was more than 10⁴, and that the number of synaptic vesicles released in each time step was about 40, we may conclude that if, with m ≈ 250 vesicles in the bouton (the value assumed in this paper), the amount of neurotransmitter released during one time step is approximately 0.5, our results are similar to those found in the literature; the proportions of the released amount to the total vesicle pool are of the same order of magnitude. in another study [2] an equilibrium state was achieved, though at a lower level, indicating synaptic depression. the model of a realistically shaped presynaptic bouton with a mitochondrion partly blocking its volume, presented in this paper, proved its ability to be used in the simulation of synaptic depression. it should be stressed that the values of the parameters chosen for the initial tests of the proposed structure refer to a situation of very regular activity, not threatened by depression.
however, in silico tests may reveal the changes in the distribution of neurotransmitter in a presynaptic bouton during very frequent stimulation, thus allowing us to study depression in detail, the more so because the results of the experiments found in the literature, whether or not a strong synaptic depression was detected, confirm that our simulation results reflect the real processes taking place in the presynaptic bouton of a stimulated neuron.
references:
- simulation and parameter estimation of dynamics of synaptic depression
- temperature dependence of vesicular dynamics at excitatory synapses of rat hippocampus
- model of neurotransmitter fast transport in axon terminal of presynaptic neuron
- dynamical properties of the reaction-diffusion type model of fast synaptic transport
- three-dimensional model of signal processing in the presynaptic bouton of the neuron
- three-dimensional simulation of synaptic depression in axon terminal of stimulated neuron
- construction of a 3d geometric model of a presynaptic bouton for use in modeling of neurotransmitter flow
- a study on efficiency of 3d partial differential diffusive model of presynaptic processes
- numerical simulation for a neurotransmitter transport model in the axon terminal of a presynaptic neuron
- compartment model of neuropeptide synaptic transport with impulse control
- a model of intracellular transport of particles in an axon
- mixed-element octree: a meshing technique toward fast and real-time simulations in biomedical applications
- synaptic bouton properties are tuned to best fit the prevailing firing pattern
- tetgen: a quality tetrahedral mesh generator and 3d delaunay triangulator, version 1.4 user manual. wias - weierstrass institute for applied analysis and stochastics
- tetgen, a delaunay-based quality tetrahedral mesh generator
- new trends in neurocybernetics
- stoichiometric biology of the synapse. dissertation
- composition of isolated synaptic boutons reveals the amounts of vesicle trafficking proteins
- high-fidelity geometric modeling for biomedical applications
- new software developments for quality mesh generation and optimization from biomedical imaging data
key: cord-025283-kf65lxp5 authors: nayyeri, mojtaba; vahdati, sahar; zhou, xiaotian; shariat yazdi, hamed; lehmann, jens title: embedding-based recommendations on scholarly knowledge graphs date: 2020-05-07 journal: the semantic web doi: 10.1007/978-3-030-49461-2_15 sha: doc_id: 25283 cord_uid: kf65lxp5 the increasing availability of scholarly metadata in the form of knowledge graphs (kg) offers opportunities for studying the structure of scholarly communication and the evolution of science. such kgs build the foundation for knowledge-driven tasks, e.g., link discovery, prediction and entity classification, which allow to provide recommendation services. knowledge graph embedding (kge) models have been investigated for such knowledge-driven tasks in different application domains. one of the applications of kge models is to provide link predictions, which can also be viewed as a foundation for recommendation services; e.g., high-confidence "co-author" links in a scholarly knowledge graph can be seen as suggested collaborations. in this paper, kges are reconciled with a specific loss function (soft margin) and examined with respect to their performance for the co-authorship link prediction task on scholarly kgs. the results show a significant improvement in the accuracy of the experimented kge models on the considered scholarly kgs using this specific loss. transe with soft margin (transe-sm) obtains a score of 79.5% hits@10 for the co-authorship link prediction task while the original transe obtains 77.2% on the same task. in terms of accuracy and hits@10, transe-sm also outperforms other state-of-the-art embedding models such as complex, conve and rotate in this setting.
the predicted co-authorship links have been validated by evaluating the profiles of scholars. with the rapid growth of digital publishing, researchers are increasingly exposed to an incredible amount of scholarly artifacts and their metadata. the complexity of science in its nature is reflected in such heterogeneously interconnected information. knowledge graphs (kgs), viewed as a form of information representation in a semantic graph, have proven to be extremely useful in modeling and representing such complex domains [8]. kg technologies provide the backbone for many ai-driven applications which are employed in a number of use cases, e.g. in the scholarly communication domain. therefore, to facilitate the acquisition, integration and utilization of such metadata, scholarly knowledge graphs (skgs) have gained attention [3, 25] in recent years. formally, an skg is a collection of scholarly facts represented as triples including entities and a relation between them, e.g. (albert einstein, co-author, boris podolsky). such representation of data has influenced the quality of services which are already provided across disciplines, such as google scholar 1 , semantic scholar [10], openaire [1], aminer [17], researchgate [26]. the ultimate objective of such attempts ranges from service development to measuring research impact and accelerating science. recommendation services, e.g. finding potential collaboration partners, relevant venues, or relevant papers to read or cite, are among the most desirable services in research-of-research enquiries [9, 25]. so far, most of the approaches addressing such services for scholarly domains use semantic similarity and graph clustering techniques [2, 6, 27]. the heterogeneous nature of such metadata and the variety of sources plugging metadata into scholarly kgs [14, 18, 22] keep complex meta-research enquiries (research of research) challenging to analyse.
this influences the quality of services relying only on the explicitly represented information. link prediction in kgs, i.e. the task of finding (not explicitly represented) connections between entities, draws on the detection of existing patterns in the kg. a wide range of methods has been introduced for link prediction [13]. the most recent successful methods try to capture the semantic and structural properties of a kg by encoding information as multi-dimensional vectors (embeddings). such methods are known as knowledge graph embedding (kge) models in the literature [23]. however, despite the importance of link prediction for scholarly domains, it has rarely been studied with kges [12, 24]. in a preliminary version of this work [11], we tested a set of embedding models (in their original version) on top of an skg in order to analyse the suitability of kges for the scholarly domain use case. the primary insights derived from the results proved the effectiveness of applying kge models on scholarly knowledge graphs. however, further exploration of the results showed that the many-to-many characteristic of the focused relation, co-authorship, causes restrictions in negative sampling, which is a mandatory step in the learning process of kge models. negative sampling is used to balance discrimination from the positive samples in kgs. a negative sample is generated by replacing either the subject or the object with a random entity in the kg, e.g., (albert einstein, co-author, trump) is a negative sample for (albert einstein, co-author, boris podolsky). to illustrate the negative sampling problem, consider the following case: assuming that n = 1000 is the number of all authors in an skg, the probability of generating false negatives for an author with 100 true or sensible but unknown collaborations becomes 100/1000 = 10%. this problem is particularly relevant when the in/out-degree of entities in a kg is very high.
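the 10% false-negative estimate, the transe score, and a softened margin ranking loss (the soft margin announced in the abstract) can be sketched in numpy as follows; the slack handling below is a simplified illustration of the idea, not the exact transe-sm objective from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# 1) false negatives under uniform negative sampling: with n = 1000 authors
#    and 100 true (but perhaps unrecorded) collaborators, about 10% of the
#    sampled "negatives" are actually true co-author links
n_entities, n_true, trials = 1000, 100, 50_000
samples = rng.integers(0, n_entities, trials)
fn_rate = float(np.mean(samples < n_true))   # true collaborators have ids 0..99

# 2) transe scoring: a plausible triple (h, r, t) should satisfy h + r ≈ t,
#    so a smaller translation distance means a more plausible triple
def transe_score(h, r, t):
    return np.linalg.norm(h + r - t, axis=-1)

# 3) hard margin ranking loss vs. a softened variant
def margin_ranking_loss(pos, neg, gamma=1.0):
    # every negative must score worse than its positive by at least gamma
    return float(np.maximum(0.0, gamma + pos - neg).sum())

def soft_margin_loss(pos, neg, slack, gamma=1.0, lam=1.0):
    # negatives may sit up to `slack` inside the margin, at a quadratic
    # cost lam * slack^2, so suspected false negatives are penalized less
    return float(np.maximum(0.0, gamma + pos - neg - slack).sum()
                 + lam * np.sum(slack ** 2))
```

with a margin violation of 0.3 and a slack of 0.3, the hard loss charges the full 0.3 while the soft variant charges only the quadratic slack cost, which is the tolerance the text motivates for likely false negatives.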
this is not limited to, but is particularly relevant in, scholarly kgs with their networks of authors, venues and papers. to tackle this problem, we propose a modified version of the margin ranking loss (mrl) to train kge models such as transe and rotate. the model is dubbed sm (soft margins), and it considers margins as soft boundaries in its optimization. the soft margin loss allows false negative samples to move slightly inside the margin, mitigating the adverse effects of false negative samples. our main contributions are: -proposing a novel loss function explicitly designed for kgs with many-to-many relations (present in the co-authorship relation of scholarly kgs), -showcasing the effect of the proposed loss function for kge models, -providing co-authorship recommendations on scholarly kgs, -evaluating the effectiveness of the approach and the recommended links on scholarly kgs with favorable results, -validating the predicted co-authorship links by a profile check of scholars. the remaining part of this paper proceeds as follows. section 2 presents details of the scholarly knowledge graph that is created for the purpose of applying link discovery tools. section 3 provides a summary of the required preliminaries about embedding models and presents the embedding models this paper focuses on, transe and rotate. moreover, other related works in the domain of knowledge graph embeddings are reviewed in sect. 3.2. section 4 contains the proposed approach and a description of the changes to the mrl. an evaluation of the proposed model on the presented scholarly knowledge graph is shown in sect. 5. in sect. 6, we lay out the insights and conclude this research work. a specific scholarly knowledge graph has been constructed in order to provide effective recommendations for the selected use case (co-authorship). this knowledge graph was created after a systematic analysis of the scholarly metadata resources on the web (mostly rdf data).
the list of resources includes dblp 2 , springer nature scigraph explorer 3 , semantic scholar 4 and the global research identifier database (grid) 5 with metadata about institutes. a preliminary version of this kg was used for the experiments of the previous work [11] , where the suitability of embedding models was tested on such use cases. throughout this research work we refer to this kg as skgold. towards this objective, a domain conceptualization has been done to define the classes and relations of focus. figure 1 shows the ontology that is used for the creation of these knowledge graphs. in order to define the terms, the openresearch [20] ontology is reused. each instance in the scholarly knowledge graph is equipped with a unique id to enable the identification and association of the kg elements. the knowledge graphs consist of the following core entities: papers, events, authors, and departments. in the creation of our kg 6 , which will be denoted as skgnew, a set of 7 conference series has been selected (namely iswc, eswc, aaai, neurips, cikm, aci, kcap and hcai were considered in the initial step of retrieving raw metadata from the source). in addition, the metadata was filtered for the temporal interval of 2013-2018. the second version of the same kg has been generated directly from semantic scholar. the datasets used for model training comprise in total 70,682 triples, where 29,469 triples come from skgold and 41,213 triples are generated in skgnew. in each set of experiments, both datasets are split into triples of training/validation/test sets. table 1 includes the detailed statistics about the datasets, considering only three relationships between entities, namely hasauthor (paper - author), hascoauthor (author - author) and hasvenue (author/paper - venue). due to the low volume of data, the isaffiliated (author - organization) relationship is eliminated in skgnew.
in this section we provide the required preliminaries for this work as well as the related work. the definitions required to understand our approach are: -knowledge graph. let e, r be the sets of entities and relations respectively. a kg is roughly represented as a set of triples (h, r, t) ⊆ e × r × e. the proposed loss is trained on a classical translation-based embedding model, transe, and a model in complex space, rotate. therefore, we mainly provide a description of transe and rotate and further focus on other state-of-the-art models. transe. it is reported that transe [4] , as one of the simplest translation-based models, outperformed more complicated kges in [11] . the initial idea of the transe model is to enforce the embeddings of entities and relation in a positive triple (h, r, t) to satisfy the equality h + r = t, where h, r and t are the embedding vectors of head, relation and tail respectively. the transe model defines the following scoring function: f r (h, t) = ||h + r − t||. rotate. here, we address rotate [16] , which is a model designed to rotate the head to the tail entity by using the relation. this model embeds entities and relations in complex space. by inclusion of constraints on the norm of the entity vectors, the model degenerates to transe. the scoring function of rotate is f r (h, t) = ||h ∘ r − t||, where ∘ denotes the element-wise product. loss function. margin ranking loss (mrl) is one of the most used loss functions for optimizing the embedding vectors of entities and relations. mrl computes embeddings of entities and relations in a way that a positive triple gets a lower score value than its corresponding negative triple. the least difference value between the scores of positive and negative samples is the margin (γ). the mrl is defined as follows: l = Σ s+ Σ s− [γ + f r (h, t) − f r (h', t')] + , where [x] + = max(0, x) and s + and s − are respectively the sets of positive and negative samples. mrl has two disadvantages: 1) the margin can slide, 2) embeddings are adversely affected by false negative samples. more precisely, the issue of margin sliding is described with an example.
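a minimal numeric sketch of the transe score and the margin ranking loss discussed above (variable names and the example vectors are mine, not from the paper):

```python
import numpy as np

def transe_score(h, r, t, ord=1):
    # TransE dissimilarity f_r(h, t) = ||h + r - t||; lower = more plausible.
    return float(np.linalg.norm(h + r - t, ord=ord))

def margin_ranking_loss(pos_scores, neg_scores, gamma=1.0):
    # Sum of [gamma + f(positive) - f(negative)]_+ over sample pairs.
    return sum(max(0.0, gamma + p - n) for p, n in zip(pos_scores, neg_scores))

h = np.array([1.0, 2.0])
r = np.array([0.5, -1.0])
t = np.array([1.5, 1.0])
print(transe_score(h, r, t))  # 0.0: h + r = t, a perfect positive triple
print(margin_ranking_loss([0.0], [0.4], gamma=1.0))  # 0.6: negative too close
```

note that the loss only vanishes once every negative score exceeds the corresponding positive score by at least γ, which is where the hard boundary problem discussed next comes from.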
assume that f r (h 1 , t 1 ) = 0 and f r (h' 1 , t' 1 ) = γ, or f r (h 1 , t 1 ) = γ and f r (h' 1 , t' 1 ) = 2γ, are two possible score assignments for a triple and its negative sample. both assignments attain the minimum value of the loss, causing the model to become vulnerable to an undesirable solution. to tackle this problem, the limit-based score [28] revises the mrl by adding a term to limit the maximum value of the positive score: l rs = l mrl + λ Σ s+ [f r (h, t) − γ 1 ] + . it was shown that l rs significantly improves the performance of transe. the authors in [28] denote transe trained by l rs as transe-rs. regarding the second disadvantage, mrl enforces a hard margin on the side of the negative samples. however, for relations with a many-to-many characteristic (e.g., co-author), the rate of false negative samples is high. therefore, using a hard boundary for discrimination adversely affects the performance of a kge model. after a systematic evaluation (performance under a reasonable setup) of suitable embedding models to be considered in our evaluations, we have selected two other models that are described here. complex. one of the embedding models based on semantic matching is complex [19] . in semantic matching models, the plausibility of facts is measured by matching the similarity of their latent representations; in other words, it is assumed that similar entities have common characteristics, i.e. are connected through similar relationships [13, 23] . in complex the entities are embedded in the complex space. the score function of complex is given as follows: f r (h, t) = re(< h, r, t̄ >), in which t̄ is the conjugate of the vector t. conve. here we present a multi-layer convolutional network model for link prediction named conve. the score function of conve is defined as below: f r (h, t) = g(vec(g([h̄ ; r̄] * ω))w) t, in which g denotes a non-linear function, h̄ and r̄ are 2d reshapes of the head and relation vectors respectively, ω is a filter and w is a linear transformation matrix. the core idea behind the conve model is to use 2d convolutions over embeddings to predict links.
conve consists of a single convolution layer, a projection layer to the embedding dimension, as well as an inner product layer. this section proposes a new model-independent optimization framework for training kge models. the framework fixes the second problem of mrl and its extension mentioned in the previous section. the optimization utilizes slack variables to mitigate the negative effect of the generated false negative samples. in contrast to margin ranking loss, our optimization uses soft margins. therefore, uncertain negative samples are allowed to slide inside the margin. figure 2 visualizes the separation of positive and negative samples using margin ranking loss and our optimization problem. it shows that the proposed optimization problem allows false negative samples to slide inside the margin by using slack variables (ξ). in contrast, margin ranking loss does not allow false negative samples to slide inside the margin. therefore, the embedding vectors of entities and relations are adversely affected by false negative samples. the mathematical formulation of our optimization problem is as follows: minimize Σ s− (ξ r h,t)² subject to f r (h, t) ≤ γ 1 for positive samples, f r (h', t') ≥ γ 2 − ξ r h',t' for negative samples, and ξ r h,t ≥ 0, (5) where f r (h, t) is the score function of a kge model (e.g., transe or rotate) and s + , s − are the positive and negative sample sets. γ 1 ≥ 0 is the upper bound of the score of positive samples and γ 2 is the lower bound of the negative samples. γ 2 − γ 1 is the margin (γ 2 ≥ γ 1 ). ξ r h,t is the slack variable for a negative sample that allows it to slide inside the margin. ξ r h,t helps the optimization to better handle the uncertainty resulting from negative sampling. the term Σ (ξ r h,t)² in problem (5) is quadratic and therefore convex, and all three constraints can be represented as convex sets. the constrained optimization problem (5) is thus convex and, as a conclusion, has a unique optimal solution. the optimal solution can be obtained by using different standard methods, e.g. the penalty method [5] .
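the soft-margin idea can be sketched in code as follows (an illustrative penalty-style implementation under my own assumptions, not the authors' exact formulation; here the slack for each negative sample is taken as ξ = [γ2 − f(neg)]_+ and penalized quadratically):

```python
def soft_margin_loss(pos_scores, neg_scores, gamma1, gamma2, lam=1.0):
    assert gamma2 >= gamma1 >= 0.0
    # Positive triples are pushed below the upper bound gamma1.
    pos_term = sum(max(0.0, p - gamma1) for p in pos_scores)
    # Slack xi = [gamma2 - f(neg)]_+ lets a (possibly false) negative slide
    # inside the margin; the quadratic penalty keeps the slacks small.
    slack_term = sum(max(0.0, gamma2 - n) ** 2 for n in neg_scores)
    return pos_term + lam * slack_term

# A suspected false negative (score 1.5 < gamma2 = 2.0) incurs only a small
# quadratic penalty instead of a hard-margin violation.
print(soft_margin_loss([0.3], [1.5], gamma1=0.5, gamma2=2.0))  # 0.25
```

compared to the hard margin of mrl, a negative sample sitting slightly inside the margin contributes a penalty that shrinks quadratically as it approaches γ2, which is the mitigation effect described above.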
the goal of problem (5) is to adjust the embedding vectors of entities and relations. a lot of variables participate in the optimization; in this condition, using batch learning with stochastic gradient descent (sgd) is preferred. in order to use sgd, the constrained optimization problem (5) should be converted to an unconstrained optimization problem. the following unconstrained (penalty-form) optimization problem is proposed instead of (5): l sm = λ 0 Σ s− (ξ r h,t)² + λ 1 Σ s+ [f r (h, t) − γ 1 ] + + λ 2 Σ s− [γ 2 − f r (h', t') − ξ r h',t'] + . (6) problems (5) and (6) may not have the same solution. however, we experimentally observe that if λ 1 and λ 2 are properly selected, the results improve compared to margin ranking loss. this section presents the evaluations of transe-sm and rotate-sm (transe and rotate trained by the sm loss) over a scholarly knowledge graph. the evaluations are motivated by a link prediction task in the domain of scholarly communication, in order to explore the ability of embedding models to support metaresearch enquiries. in addition, we provide a comparison of our model with other state-of-the-art embedding models (selected by performance under a reasonable setup) on two standard benchmarks (freebase and wordnet). four different evaluation methods have been performed in order to confirm: 1) the better performance and effect of the proposed loss, 2) the quality and soundness of the results, 3) the validity of the discovered co-authorship links and 4) the sensitivity of the proposed model to the selected hyperparameters. more details about each of these analyses are discussed in the remaining part of this section. the proposed loss is model independent; however, we demonstrate its functionality and effectiveness by applying it to different embedding models. in the first evaluation method, we run experiments and assess the performance of the transe-sm model as well as rotate-sm in comparison to the other models and the original loss functions.
in order to discuss this evaluation further, let (h, r, t) be a triple fact with the assumption that either the head or the tail entity is missing (e.g., (?, r, t) or (h, r, ?)). the task is to complete either of these triples (h, r, ?) or (?, r, t) by predicting the head (h) or tail (t) entity. mean rank (mr), mean reciprocal rank (mrr) [23] and hits@10 have been extensively used as standard metrics for the evaluation of kge models on link prediction. in the computation of mean rank, the following steps are performed: -the head and tail of each test triple are replaced by all entities in the dataset, -the scores of the generated triples are computed and sorted, -the average rank of the correct test triples is reported as mr. let rank i refer to the rank of the i-th triple in the test set obtained by a kge model. the mrr is obtained as follows: mrr = (1/|t|) Σ i 1/rank i , where |t| is the number of test triples. the computation of hits@10 likewise replaces the head and tail of each test triple by all entities in the dataset, resulting in a sorted list of triples based on their scores. the fraction of correct triples that are ranked at most 10 is reported as hits@10, as represented in table 2 . the results in table 2 validate that transe-sm and rotate-sm significantly outperformed the other embedding models in all metrics. in addition, an evaluation of the state-of-the-art models has been performed over two benchmark datasets, namely fb15k and wn18. while our focus has been resolving the problem of kges in the presence of many-to-many relationships, the evaluations of the proposed loss function (sm) on other datasets show the effectiveness of sm in addressing other types of relationships. table 3 shows the results of the experiments for transe, complex, conve, rotate, transe-rs, transe-sm and rotate-sm. the proposed model significantly outperforms the other models with an accuracy of 87.2% on fb15k. the evaluations on wn18 show that rotate-sm outperforms the other evaluated models.
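the ranking metrics above can be computed with a few lines of code (the example ranks are hypothetical):

```python
def mean_rank(ranks):
    # MR: average rank of the correct entity over all test triples.
    return sum(ranks) / len(ranks)

def mrr(ranks):
    # MRR = (1/|T|) * sum over test triples of 1/rank_i.
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_k(ranks, k=10):
    # Fraction of test triples whose correct entity is ranked in the top k.
    return sum(1 for r in ranks if r <= k) / len(ranks)

ranks = [1, 2, 4, 50]        # hypothetical ranks of four test triples
print(mean_rank(ranks))      # 14.25
print(mrr(ranks))            # (1 + 0.5 + 0.25 + 0.02) / 4 = 0.4425
print(hits_at_k(ranks, 10))  # 0.75
```

note how a single badly ranked triple (rank 50) dominates mr but barely moves mrr, which is why both are usually reported together with hits@10.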
the optimal settings for our proposed model corresponding to this part of the evaluation are λ 0 = 100, γ 1 = 0.4, γ 2 = 0.5, α = 10, d = 200 for fb15k and λ 0 = 100, γ 1 = 1.0, γ 2 = 2.0, α = 10, d = 200 for wn18. with the second evaluation method, we aim at confirming the quality and soundness of the results. in order to do so, we additionally investigate the quality of the recommendations of our model. a sample set of 9 researchers associated with the linked data and information retrieval communities [21] is selected as the foundation for the experiments on the predicted recommendations. table 4 shows the number of recommendations and their ranks among the top 50 predictions for all of the 9 selected researchers. these top 50 predictions are filtered for a closer look. the results are validated by checking the research profiles of the recommended researchers and the track history of co-authorship. in the profile check, we only kept the triples which indicate: 1. a close match in the research domain interests of the scholars, verified by checking profiles, 2. no existing scholarly relation (e.g., supervisor, student), 3. no existing affiliation with the same organization, 4. no existing co-authorship. for example, out of all the recommendations that our approach provided for the researcher with id a136, 10 have been identified as sound and new collaboration targets. the rank of each recommended connection is shown in the third column. furthermore, the discovered links for co-authorship recommendations have been examined with a closer look at the online scientific profiles of two top machine learning researchers, yoshua bengio 9 , a860, and yann lecun 10 , a2261. the recommended triples have been created in the two patterns (a860, r, ?) and (?, r, a860) and deduplicated for the same answer. the triples are ranked based on the scores obtained from transe-sm and rotate-sm. for the evaluations, a list of the top 50 recommendations has been selected per considered researcher, bengio and lecun.
in order to validate the profile similarity in research and to confirm that no earlier co-authorship exists, we analyzed the profile of each author recommended to "yoshua bengio" and "yann lecun" as well as their own profiles. we analyzed the scientific profiles of the selected researchers provided by the most used scholarly search engine, google citation 11 . due to the author name-ambiguity problem, this validation task required human involvement. first, the research areas indicated in the profiles of the researchers were validated to be similar by finding matches. in the next step, some of the highlighted publications with high citations and their recency were checked to make sure that the profiles of the selected researchers fall within the machine learning community close to the interests of "yoshua bengio", i.e., that the researchers can be considered part of the same community. as mentioned before, the knowledge graphs used for the evaluations consist of metadata from 2013 till 2018. in checking the suggested recommendations, a co-authorship relation which happened before or after this temporal interval is considered valid for the recommendation. likewise, the other highly ranked links with non-existent co-authorship are counted as valid recommendations for collaboration. figure 4b shows a visualization of such links found by analyzing the top 50 recommendations to and from "yoshua bengio", and fig. 4a shows the same for "yann lecun". out of the 50 discovered triples with "yoshua bengio" as head, 12 have been confirmed as valid recommendations (relevant but never happened before) and 8 triples showed an already existing co-authorship. the profiles of 5 other researchers have not been made available by google citation. among the triples with "yoshua bengio" in the tail, 8 triples had already been discovered by the previous pattern.
the profiles of 5 researchers were not available, and 7 researchers have been in contact and co-authorship with "yoshua bengio". finally, 5 new profiles have been added as recommendations. out of the 50 triples (yannlecun, r, ?), 14 recommendations have been discovered as new collaboration cases for "yann lecun". in analyzing the triples with the fixed-tail pattern (?, r, yannlecun), there were cases either without profiles on google citations or with an already existing co-authorship. by excluding these examples, as well as the ones already discovered from the other triple pattern, 5 new researchers remained as valid recommendations. in this part we investigate the sensitivity of our model to the hyperparameters (γ 1 , γ 2 , λ 0 ). to analyze the sensitivity of the model to the parameter γ 2 , we fix γ 1 to 0.1, 1 and 2. moreover, λ 0 is also fixed to one. then different values for γ 2 are tested and visualized. regarding the red dotted line in fig. 3a , the parameter γ 1 is set to 0.1 and λ 0 = 1. it is shown that by changing γ 2 from 0.2 to 3, the performance increases to reach a peak and then decreases by around 15%. therefore, the model is sensitive to γ 2 . significant variation of the results can be seen for γ 1 = 1, 2 as well (see fig. 3a ). therefore, proper selection of γ 1 , γ 2 is important in our model. we also analyze the sensitivity of the performance of our model to the parameter λ 0 . to do so, we take the optimal configuration of our model corresponding to the fixed γ 1 , γ 2 . then the performance of our model is investigated in different settings where λ 0 ∈ {0.01, 0.1, 1, 10, 100, 1000}. according to fig. 3b , the model is less sensitive to the parameter λ 0 . therefore, to obtain the hyperparameters of the model, it is recommended that first (γ 1 , γ 2 ) are adjusted by validation while λ 0 is fixed to a value (e.g., 1). then the parameter λ 0 is adjusted while (γ 1 , γ 2 ) are fixed.
the aim of the present research was to develop a novel loss function for embedding models used on kgs with many many-to-many relationships. our use case is scholarly knowledge graphs, with the objective of providing predicted links as recommendations. we train embedding models with the proposed loss and examine them for graph completion of a real-world knowledge graph from the scholarly domain. this study has identified a successful application of a model-independent loss function, namely sm. the results show the robustness of models trained with the sm loss function in dealing with uncertainty in negative samples. this reduces the negative effects of false negative samples on the computation of embeddings. we could show that the performance of embedding models on the knowledge graph completion task could be significantly improved when applied to a scholarly knowledge graph. the focus has been to discover (possible but never happened) co-author links between researchers, indicating a potential for close scientific collaboration. the identified links have been proposed as collaboration recommendations and validated by looking into the profiles of a list of selected researchers from the semantic web and machine learning communities. as future work, we plan to apply the model on a broader scholarly knowledge graph and consider other types of links for recommendations, e.g., recommending events for researchers and recommending publications to be read or cited.
references (titles as extracted):
- openaire lod services: scholarly communication data as linked data
- construction of the literature graph in semantic scholar
- towards a knowledge graph for science
- translating embeddings for modeling multi-relational data
- convex optimization
- a three-layered mutually reinforced model for personalized citation recommendation
- convolutional 2d knowledge graph embeddings
- a comparative survey of dbpedia, freebase, opencyc, wikidata, and yago
- science of science
- semantic scholar
- metaresearch recommendations using knowledge graph embeddings
- combining text embedding and knowledge graph embedding techniques for academic search engines
- a review of relational machine learning for knowledge graphs
- data curation in the openaire scholarly communication infrastructure
- factorizing yago: scalable machine learning for linked data
- rotate: knowledge graph embedding by relational rotation in complex space
- arnetminer: extraction and mining of academic social networks
- linked data in libraries: a case study of harvesting and sharing bibliographic metadata with bibframe
- complex embeddings for simple link prediction
- openresearch: collaborative management of scholarly communication metadata
- unveiling scholarly communities over knowledge graphs
- aminer: search and mining of academic social networks
- knowledge graph embedding: a survey of approaches and applications
- acekg: a large-scale knowledge graph for academic data mining
- big scholarly data: a survey
- researchgate: an effective altmetric indicator for active researchers?
- pave: personalized academic venue recommendation exploiting co-publication networks
- learning knowledge embeddings by combining limit-based scoring loss

acknowledgement. this work is supported by the epsrc grant ep/m025268/1, the wwtf grant vrg18-013, the ec horizon 2020 grant lambda (ga no. 809965), the cleopatra project (ga no. 812997), and the german national funded bmbf project mlwin.
key: cord-018791-h3bfdr14 authors: rasulev, bakhtiyor title: recent developments in 3d qsar and molecular docking studies of organic and nanostructures date: 2016-12-09 journal: handbook of computational chemistry doi: 10.1007/978-3-319-27282-5_54 sha: doc_id: 18791 cord_uid: h3bfdr14 the development of quantitative structure-activity relationship (qsar) methods has been going very fast for the last decades. the qsar approach already plays an important role in lead structure optimization, and nowadays, with the development of big data approaches and computer power, it can even handle the huge amounts of data associated with combinatorial chemistry. one of the recent developments is three-dimensional qsar, i.e., 3d qsar. for the last two decades, 3d qsar has already been successfully applied to many datasets, especially of enzyme and receptor ligands. moreover, quite often 3d qsar investigations go together with protein-ligand docking studies, and this combination works synergistically. in this review, we outline recent advances in the development and applications of 3d qsar and protein-ligand docking approaches, as well as combined approaches, for conventional organic compounds and for nanostructured materials, such as fullerenes and carbon nanotubes. the methodology of quantitative structure-activity relationship (qsar) is very well described in various publications (hansch et al. 1995; kubinyi 1997a, b; eriksson et al. 2003) . in short, qsar is a method to find correlations and mathematical models for congeneric series of compounds, affinities of ligands to their binding sites, rate constants, inhibition constants, toxicological effects, and many other biological activities, based on structural features, as well as group and molecular properties, such as electronic properties, polarizability, or steric properties (klebe et al. 1994; hansch et al. 1995; karelson et al. 1996; kubinyi 1997a, b; perkins et al. 2003; isayev et al. 2006; martin 2009; rasulev et al. 2010; puzyn et al.
2011) . thus, qsar approaches have been used for many types of biological activities to describe correlations for series of drugs and drug candidates (kubinyi 1997a, b; veber et al. 2002) . in addition, in the case of available crystallographic data on the proteins, qsar models can be developed with additional information from the three-dimensional (3d) structures of these proteins interacting with drug candidates, by applying protein-ligand docking data for further qsar analysis, or, if there is no data on the 3d structure of the protein, by developing qsar based on the three-dimensional features of the investigated molecules (moro et al. 2005; ragno et al. 2005; gupta et al. 2009; hu et al. 2009; sun et al. 2010; araújo et al. 2011; ahmed et al. 2013 ). the second approach was named the 3d qsar approach (wise et al. 1983; cramer and bunce 1987; cramer et al. 1988; clark et al. 1990 ). there are also many other multidimensional approaches, including 4d qsar and 5d qsar, but all of them are just extensions of qsar analysis to a number of conformations (orientations, tautomers, stereoisomers, or protonation states) per molecule, a number of concentrations (dosages) per compound, etc. (lill 2007) . overall, when talking about 3d qsar, computational chemists usually assume that the qsar analysis takes into account the three-dimensional structure of the compound in its minimal energy conformation and builds the qsar model based on various generated 3d fields (kubinyi 1997a, b) . a first approach similar to 3d qsar was developed by cramer in 1983: the dynamic lattice-oriented molecular modeling system (dylomms), a predecessor of 3d approaches that involves the use of pca to extract vectors from the molecular interaction fields, which are then correlated with biological activities (wise et al. 1983) .
later the authors improved this approach and, by combining two existing techniques, grid and pls, developed a powerful 3d qsar methodology, the so-called comparative molecular field analysis (comfa) (cramer et al. 1988; clark et al. 1990 ). soon after, comfa became a prototype of 3d qsar methods (kim et al. 1998; todeschini and gramatica 1998; podlogar and ferguson 2000) . the comfa approach was then implemented in the sybyl software (tripos 2006 ) from tripos inc. as mentioned before, a good and fruitful approach is the combination of molecular docking and 3d qsar pharmacophore methods (patel et al. 2008; gupta et al. 2009; araújo et al. 2011; ahmed et al. 2013) . molecular docking and 3d qsar modeling are two potent methods in the drug discovery process. thus, virtual screening using 3d qsar approaches followed by docking has become one of the reputable methods for drug discovery and for enhancing the efficiency of lead optimization (oprea and matter 2004) . the main advantage of this combined approach of 3d qsar and pharmacophore-based docking is to focus on the specific key interactions for protein-ligand binding in order to improve drug candidates. to ameliorate the selection of active compounds, it is optimal to use both methods, molecular docking and 3d qsar modeling (gopalakrishnan et al. 2005; klebe 2006; perola 2006; pajeva et al. 2009; yang 2010) . since the development of the 3d qsar approach, a number of papers and method developments have been published within the 3d qsar methodology. let us briefly list and explain these methods here and then discuss recent developments and applications of these 3d qsar methods in the assessment of the properties of biologically active compounds and the development of drugs and drug candidates. as can be seen from fig. 1 , the number of publications related to the 3d qsar approach is increasing every year, from 3 to 5 publications in the beginning of the 1990s to about 150 publications per year in 2014.
it confirms the increasing importance of the method and its successful application in many drug design projects. to give some overview of the 3d qsar methods developed over the last three decades, we list below the ligand-based 3d qsar methods with a very short description of each of them. comfa - comparative molecular field analysis is a method which correlates the field values of the structure with biological activities. comfa generates an equation correlating the biological activity with the contribution of interaction energy fields at every grid point (cramer et al. 1988 ). the method was developed in 1988 and is still one of the most popular ones for 3d qsar modeling. comsia - in the comparative molecular similarity indices analysis (comsia) method, molecular similarity indices calculated from steric and electrostatic alignment (seal) similarity fields are applied as descriptors to encode steric, electrostatic, hydrophobic, and hydrogen bonding properties (klebe et al. 1994 ). this is a development of the comfa method and has also become very popular in drug design. grid - this method and program was designed as an alternative to the original comfa approach. it is actually a force field which calculates the interaction energy fields in molecular-field analysis and determines the energetically favorable binding sites on molecules of known structure. the method is to some extent similar to comfa, and it computes explicit nonbonded (or non-covalent) interactions between a molecule of known 3d structure and a probe (i.e., a chemical group with certain user-defined properties). the probe is located at sample positions on a lattice throughout and around the macromolecule. the method offers two distinct advantages: one of them is the use of a 6-4 potential function for calculating the interaction energies, which is smoother than the 6-12 form of the lennard-jones type in comfa, and the other is the availability of different types of probes (goodford 1985) .
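the smoothing effect of a 6-4 potential versus the 6-12 lennard-jones form can be illustrated with the generic n-m potential (the parameters and normalization here are illustrative, not grid's actual force-field values):

```python
def lj_12_6(r, eps=1.0, rmin=1.0):
    # Lennard-Jones 12-6 potential with well depth eps at r = rmin.
    x = rmin / r
    return eps * (x ** 12 - 2.0 * x ** 6)

def pot_6_4(r, eps=1.0, rmin=1.0):
    # 6-4 form with the same well depth and minimum position: the n-m
    # potential eps/(n-m) * (m*x**n - n*x**m) with n=6, m=4.
    x = rmin / r
    return eps * (2.0 * x ** 6 - 3.0 * x ** 4)

# Both potentials have their minimum -eps at r = rmin ...
print(lj_12_6(1.0), pot_6_4(1.0))  # -1.0 -1.0
# ... but inside the contact distance the 12-6 wall rises far more steeply,
# which is why the 6-4 form gives smoother fields on a coarse lattice.
print(round(lj_12_6(0.8), 2))      # 6.92
print(round(pot_6_4(0.8), 2))      # 0.31
```

the softer repulsive wall matters in grid-based field calculations because probe positions close to atom centers would otherwise dominate the field with enormous 12-6 repulsion values.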
moreover, in addition to computing the regular steric and electrostatic potentials, the program also calculates the hydrogen bonding potential using a hydrogen bond donor and acceptor, as well as the hydrophobic potential using a "dry probe." later on, a water probe was included to calculate hydrophobic interactions (kim et al. 1998; kim 2001) . msa - molecular shape analysis (msa) is a ligand-based 3d qsar method which attempts to merge conformational analysis with the classical hansch approach. the method deals with the quantitative characterization, representation, and manipulation of molecular shape in the construction of a qsar model (hopfinger 1980) . hasl - this inverse grid-based approach represents the shapes of the molecules inside an active site as a collection of grid points (doweyko 1988) . the methodology of this approach begins with the intermediate conversion of the cartesian coordinates (x, y, z) for the superposed set of molecules to a 3d grid consisting of regularly spaced points that are (1) arranged orthogonally to each other, (2) separated by a particular distance termed the resolution (which determines the number of grid points representing a molecule), and (3) all lying within the van der waals radii of the atoms in the molecule. thus, the resulting set of points is referred to as the molecular lattice and represents the receptor active site map (as in comfa). quite importantly, the overall lattice dimensions depend on the size of the molecules and the resolution chosen. grind - this method uses grid-independent descriptors (grind), which encode the spatial distribution of the molecular interaction fields (mif) of the studied compounds (pastor et al. 2000) . in the anchor-grind method (fontaine et al. 2005) , to compare the mif distribution of different compounds, the user defines a single common position in the structure of all the compounds in the series, the so-called anchor point.
This anchor point does not provide enough geometrical constraints to align the compounds studied; instead, it is used by the method as a common reference point, making it possible to describe the geometry of the MIF regions more precisely than GRIND does. The anchor point is particularly easy to assign in datasets having chemical substituents well known to be crucial for the activity. In the Anchor-GRIND approach, the R groups are described with two blocks of variables: the anchor-MIF and the MIF-MIF blocks (Fig. 2). The first describes the geometrical distribution of the R-group MIF relative to the anchor point, while the second describes the geometrical distribution of the MIF within each R group. These blocks are obtained in the following steps: (i) every R group is considered as attached to the scaffold; (ii) the anchor point is set automatically on an atom of the scaffold; (iii) a set of MIFs is calculated with the program GRID (Goodford 1985), as well as the shape field implemented in the program ALMOND (Cruciani et al. 2004); and (iv) as the last step, the blocks of descriptors are computed from the anchor point and the filtered MIFs. The authors were also able to incorporate molecular shape into the GRIND descriptors.

GERM - Genetically Evolved Receptor Model is a method for 3D QSAR and also for constructing 3D models of protein-binding sites in the absence of a crystallographically established or homology-modeled structure of the receptor (Walters and Hinds 1994). As for many 3D QSAR datasets, the primary requirement for GERM is a structure-activity set for which a sensible alignment of realistic conformers has been determined. The methodology is as follows: it encloses the superimposed set of molecules in a shell of atoms (analogous to the first layer of atoms in the active site) and assigns these atoms explicit atom types (aliphatic H, polar H, etc.) to match the types of atoms found in the investigated proteins.
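The idea of the anchor-MIF block described above, namely encoding MIF geometry relative to a single reference point rather than through alignment, can be sketched very roughly. Everything here is a hypothetical simplification: the binning scheme, the choice to keep the most favorable energy per distance shell, and the toy field values are assumptions for illustration, not the published Anchor-GRIND algorithm.

```python
import math

def anchor_mif_block(anchor, mif_points, n_bins=4, bin_width=2.0):
    """Hypothetical sketch of an anchor-MIF block: for each MIF node
    (x, y, z, energy), bin its distance to the anchor point and keep the
    most favorable (most negative) energy per distance shell, giving a
    small alignment-free descriptor vector for one R group."""
    desc = [0.0] * n_bins
    for x, y, z, e in mif_points:
        d = math.dist(anchor, (x, y, z))
        b = min(int(d // bin_width), n_bins - 1)
        desc[b] = min(desc[b], e)  # retain the strongest favorable interaction
    return desc

# Toy MIF for one R group: favorable nodes near and far from the anchor.
mif = [(1.0, 0.0, 0.0, -2.5), (5.0, 0.0, 0.0, -1.0), (1.5, 0.5, 0.0, -0.5)]
print(anchor_mif_block((0.0, 0.0, 0.0), mif))  # -> [-2.5, 0.0, -1.0, 0.0]
```

Because every molecule in the series shares the same anchor atom on the scaffold, such vectors are directly comparable across compounds without superposition, which is the point the text makes about Anchor-GRIND.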
CoMMA - Comparative Molecular Moment Analysis is one of the unique alignment-independent 3D QSAR methods; it involves the computation of molecular similarity descriptors (similar to CoMSIA) based on the spatial moments of molecular mass (i.e., shape) and charge distributions up to second order, as well as related quantities (Silverman and Platt 1996).

COMBINE - The Comparative Binding Energy analysis method was developed to make use of structural data from ligand-protein complexes within a 3D QSAR methodology. The method is based on the hypothesis that the free energy of binding can be correlated with a subset of energy components calculated from the structures of receptors and ligands in bound and unbound forms (Ortiz et al. 1995; Lushington et al. 2007).

CoMSA - Comparative Molecular Surface Analysis is a non-grid 3D QSAR method that utilizes the molecular surface to define the regions of the compounds to be compared using mean electrostatic potentials (MEPs) (Polanski et al. 2002, 2006). Overall, the methodology proceeds by subjecting the molecules in the dataset to geometry optimization and assigning them partial atomic charges.

AFMoC - Adaptation of Fields for Molecular Comparison is a 3D QSAR-like method that involves fields derived from the protein environment (i.e., not from the superimposed ligands as in CoMFA); therefore, it is also known as a "reverse" CoMFA (DAFMoC) approach or protein-dependent 3D QSAR (Gohlke and Klebe 2002). The methodology is as follows: a regularly spaced grid is placed into the receptor binding site, followed by mapping of knowledge-based pair potentials between protein atoms and ligand atom probes onto the grid intersections, resulting in the potential fields.
Based on these potential fields, specific interaction fields are then generated by multiplying distance-dependent atom-type properties of actual ligands docked into the active site with the neighboring grid values. These atom-type-specific interaction fields are then correlated with the binding affinities using the PLS technique, which assigns an individual weighting factor to each field value.

CoRIA - Comparative Residue Interaction Analysis is a 3D QSAR approach that utilizes descriptors describing the thermodynamic events involved in ligand binding to explore both the qualitative and quantitative features of the ligand-receptor recognition process. The CoRIA methodology initially consisted simply of calculating the nonbonded (van der Waals and Coulombic) interaction energies between the ligand and the individual active-site residues of the receptor that interact with the ligand (Datar et al. 2006; Dhaked et al. 2009). By employing the genetic algorithm-supported PLS technique (G/PLS), these energies are then correlated with the biological activities of the molecules, along with other physicochemical variables describing the thermodynamics of binding, such as surface area, lipophilicity, molar refractivity, molecular volume, strain energy, etc.

SOMFA - Self-Organizing Molecular Field Analysis, in which the mean activity of the training set is first subtracted from the activity of each molecule to obtain mean-centered activity values. The methodology is as follows:
• A 3D grid around the molecules is generated, with values at the grid points signifying the shape or electrostatic potential.
• The shape or electrostatic potential value at every grid point for each molecule is multiplied by the molecule's mean-centered activity.
• The grid values for each molecule are summed up to give master grids for each property.
• Then the so-called SOMFA property descriptors are calculated from the master grid values and correlated with the log-transformed molecular activities (Robinson et al. 1999).

kNN-MFA - This relatively new method, k-nearest neighbor molecular field analysis, was developed and reported in 2006 by Ajmani et al. (2006). kNN-MFA adopts the k-nearest neighbor principle for relating molecular fields to the experimentally reported activity, exploiting the active-analogue principle that lies at the foundation of medicinal chemistry. Like many 3D QSAR methods, kNN-MFA requires suitable alignment of a given set of molecules. This is followed by generation of a common rectangular grid around the molecules. The steric and electrostatic interaction energies are computed at the lattice points of the grid using a methyl probe of charge +1. These interaction energy values are used as descriptors to decide the nearness between molecules and to generate the relationship.

3D-HoVAIF - This method is based on the three-dimensional holographic vector of atomic interaction field analysis (Zhou et al. 2007), who first developed the holographic vector for 3D QSAR methods. Proceeding from two spatial invariants, namely atomic relative distance and atomic properties, on the basis of the three common nonbonded interactions (electrostatic, van der Waals, and hydrophobic) that are directly associated with bioactivities, the 3D-HoVAIF method derives multidimensional vectors to represent molecular steric structural characteristics.

CMF - This is a recently introduced continuous molecular-field approach (Baskin and Zhokhova 2013). This novel approach consists in encapsulating continuous molecular fields into specially constructed kernels.
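The kNN-MFA prediction step described above lends itself to a compact sketch. This is a hedged simplification under stated assumptions: the distance-weighted average of the k nearest neighbors is one common weighting choice, and the published method additionally selects an optimal descriptor subset (by simulated annealing or genetic algorithms), which is omitted here.

```python
import math

def knn_mfa_predict(train, query, k=2):
    """Minimal sketch of a kNN-MFA prediction: `train` is a list of
    (field_vector, activity) pairs, where each vector holds the steric and
    electrostatic probe energies at the shared lattice points. The activity
    of a query molecule is estimated as the distance-weighted mean of its
    k nearest neighbors in field space."""
    nearest = sorted(
        (math.dist(vec, query), act) for vec, act in train
    )[:k]
    weights = [1.0 / (d + 1e-9) for d, _ in nearest]
    return sum(w * a for w, (_, a) in zip(weights, nearest)) / sum(weights)

# Toy field descriptors (3 lattice points) with known activities.
train = [([0.0, 1.0, 2.0], 5.0), ([0.1, 1.1, 2.1], 5.2), ([4.0, 4.0, 4.0], 1.0)]
pred = knn_mfa_predict(train, [0.05, 1.05, 2.05], k=2)
assert 5.0 < pred < 5.2  # query sits between its two close analogues
```

The assertion reflects the active-analogue principle the text invokes: a molecule whose fields resemble two actives is predicted to have activity between theirs, unaffected by the distant third compound.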
It is based on the application of continuous functions for the description of molecular fields instead of the finite sets of molecular descriptors (such as interaction energies computed at grid nodes) commonly used for this purpose. The feasibility of using molecular-field kernels in combination with the support vector regression (SVR) machine learning method to build 3D QSAR models was demonstrated by the same authors earlier (Zhokhova et al. 2009). The authors claim that by combining different types of molecular fields and methods of their approximation, and different types of kernels with different kernel-based machine learning methods, it is possible not only to present many existing methods in chemoinformatics and medicinal chemistry as particular cases within a single methodology, but also to develop new approaches aimed at solving new problems (Baskin and Zhokhova 2013). An example of the application of this approach is described later in this chapter.

PHASE - This is a flexible system (engine) (Dixon et al. 2006) for common pharmacophore identification and assessment, 3D QSAR model development, and 3D database creation and searching (within the Schrodinger Suite, Schrodinger, LLC). It includes several subprograms, for example LigPrep, which attaches hydrogens, converts 2D structures to 3D, generates stereoisomers, and, optionally, neutralizes charged structures or determines the most probable ionization state at a user-defined pH. It also includes the MacroModel conformational search engine to generate a series of 3D structures that sample the thermally accessible conformational states. For the purposes of 3D modeling and pharmacophore model development, each ligand structure is represented by a set of points in 3D space, which coincide with various chemical features that may facilitate non-covalent binding between the ligand and its target receptor.
PHASE provides six built-in types of pharmacophore features: hydrogen bond acceptor (A), hydrogen bond donor (D), hydrophobic (H), negative ionizable (N), positive ionizable (P), and aromatic ring (R). In addition, users may define up to three custom feature types (X, Y, Z) to account for characteristics that do not fit clearly into any of the six built-in categories. To construct a 3D QSAR model, a rectangular grid is defined to encompass the space occupied by the aligned training-set molecules. This grid divides space into uniformly sized cubes, typically 1 Å on each side, which are occupied by the atoms or pharmacophore sites that define each molecule.

APF - In 2008, Totrov (Totrov 2008) introduced atomic property fields (APF) for 3D QSAR analysis. The APF concept is a continuous, multicomponent 3D potential that reflects preferences for various atomic properties at each point in space (Fig. 3). The approach is extended to multiple flexible ligand alignments using an iterative procedure, self-consistent atomic property fields by optimization (SCAPFOLD). The application of atomic property fields and SCAPFOLD to virtual ligand screening and 3D QSAR was tested on published benchmarks, and the new method was shown to perform competitively in comparison with current state-of-the-art methods (CoMFA and CoMSIA). There are also studies with comparative analyses of the two systems PHASE and Catalyst (HypoGen). In 2007, Evans et al. (2007) provided a comparative study of the PHASE and Catalyst methods in determining 3D QSARs and concluded that the performance of PHASE is better than or equal to that of Catalyst HypoGen with the datasets and parameters used.
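The PHASE-style grid encoding described above (space divided into cubes that are either occupied or not) can be sketched as a bit vector. This is a hedged toy: real PHASE marks cubes occupied by the van der Waals model of the ligand and by pharmacophore sites, whereas this sketch only tests atom centers against cube boundaries; the grid dimensions are arbitrary.

```python
def occupancy_bits(atoms, origin=(0.0, 0.0, 0.0), shape=(4, 4, 4), cube=1.0):
    """Sketch of a cube-occupancy encoding: divide the box starting at
    `origin` into `shape` cubes of side `cube` (1 angstrom, as in the text)
    and set a bit for every cube containing an atom center."""
    nx, ny, nz = shape
    bits = [0] * (nx * ny * nz)
    for x, y, z in atoms:
        i = int((x - origin[0]) // cube)
        j = int((y - origin[1]) // cube)
        k = int((z - origin[2]) // cube)
        if 0 <= i < nx and 0 <= j < ny and 0 <= k < nz:
            bits[(i * ny + j) * nz + k] = 1  # flattened (i, j, k) index
    return bits

ligand = [(0.5, 0.5, 0.5), (1.5, 0.5, 0.5)]  # two atoms in two adjacent cubes
bits = occupancy_bits(ligand)
assert sum(bits) == 2
```

Vectors of this form, one per aligned training ligand, are exactly the kind of binary-valued independent variables that PHASE later feeds into PLS regression.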
The authors found that within PHASE, the atom-based grid QSAR model generally performed better than the pharmacophore-based grid, and by using overlays from Catalyst to build PHASE grid QSAR models, they found evidence that the better performance of PHASE on these datasets was due to the use of the grid technique.

In this part of the review, we discuss the new developments in the methods and applications of 3D QSAR for various chemicals, including nanostructured materials. Over the last 10 years, the work has mainly been improvement of 3D QSAR approaches developed before 2005. As discussed above, methods such as CoMFA, CoMSIA, GRID, and SOMFA were developed in the late 1990s and early 2000s, and some of the recently introduced methods are just improvements of the older approaches. However, looking at applications, we can see many interesting publications and novel ligands designed by 3D QSAR and docking methods. Recently, a quite interesting study was performed by Virsodia et al. (2008) on the antitubercular activity of 23 substituted N-phenyl-6-methyl-2-oxo-4-phenyl-1,2,3,4-tetrahydropyrimidine-5-carboxamides by application of 3D QSAR using CoMFA and CoMSIA. The authors synthesized and assessed the antitubercular activity of all investigated compounds, followed by comprehensive 3D QSAR modeling. They obtained good models, with r² = 0.98 and 0.95 and cross-validated q² = 0.68 and 0.58, respectively, and stated that the CoMFA and CoMSIA contours helped them design some new molecules with improved activity (Virsodia et al. 2008). Another CoMFA and CoMSIA study was performed by Ravichandran et al. (2009), analyzing the anti-HIV activity of 96 1,3,4-thiazolidine derivatives. The authors obtained good CoMFA and CoMSIA models, with r² values of 0.931 and 0.972, respectively.
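The q² statistics quoted throughout these studies follow a standard definition worth making explicit: q² = 1 - PRESS/SS, where PRESS is the sum of squared leave-one-out (or test-set) prediction errors and SS the total sum of squares about the mean observed activity. The sketch below assumes the cross-validated predictions are already available; refitting the model n times to produce them is elided.

```python
def q2(y, pred):
    """Cross-validated q^2 (or predicted r^2 on a test set): 1 - PRESS/SS.
    `y` are observed activities, `pred` the corresponding out-of-sample
    predictions (e.g. from leave-one-out refits, not shown here)."""
    mean = sum(y) / len(y)
    press = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss = sum((yi - mean) ** 2 for yi in y)
    return 1.0 - press / ss

y = [1.0, 2.0, 3.0, 4.0]
assert q2(y, y) == 1.0            # errorless predictions give q^2 = 1
assert q2(y, [2.5] * 4) == 0.0    # always predicting the mean gives q^2 = 0
```

This explains why q² can be far below r²: a model that merely memorizes the training set scores high on r² but is penalized by PRESS, which is computed from predictions the model never saw.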
The predictive power was evaluated using a test set of 17 molecules; the predicted r² values of the CoMFA and CoMSIA models were 0.861 and 0.958, respectively. Using the CoMSIA method, Kumar et al. (2009) investigated novel Biginelli dihydropyrimidines with potential anticancer activity. The model, developed on 32 compounds, showed good statistics: q² = 0.51 for the training set and r² = 0.93 for the test set. Raparti et al. (2009) reported a study based on the novel kNN-MFA 3D QSAR approach discussed above, in which the authors synthesized, assessed for antimycobacterial activity, and investigated by 2D and 3D QSAR approaches a series of ten compounds (4-(morpholin-4-yl)-N'-(arylidene)benzohydrazides). They obtained statistical results satisfactory for a dataset of this size for the 3D QSAR model against M. tuberculosis (Raparti et al. 2009), with r² = 0.910 and q² = 0.507, respectively. Another kNN-MFA 3D QSAR study was conducted and published by Jha et al. (2010). The authors evaluated the antimicrobial activity of 21 compounds by kNN-MFA combined with various selection procedures: simulated annealing (SA), genetic algorithms (GA), and stepwise (SW) forward-backward methods. The developed model showed results satisfactory for this kind of study, with q² = 0.696 and r²_pred = 0.615. The authors concluded that the 3D QSAR study showed that a less electronegative substituent would be favorable for the activity, and therefore future molecules should be designed with less electronegative groups to obtain potentially active molecules. Recently, Araújo et al. (2011) studied acetylcholinesterase inhibitors (AChEIs) by application of a combined approach, so-called receptor-dependent 3D QSAR (RD 3D QSAR), in which they investigated a series of 60 benzylpiperidine inhibitors of human acetylcholinesterase.
They obtained two models, with r² = 0.86, q² = 0.74 and r² = 0.90, q² = 0.75, validated by a combined GA-PLS approach. Based on those models, the authors proposed four new benzylpiperidine derivatives and predicted the pIC50 for each molecule. The good predicted potency of one of the benzylpiperidine derivatives marked it as a promising candidate for a new huAChE inhibitor (Araújo et al. 2011). In another similar study, Gupta et al. (2009) conducted an interesting combined protein-ligand docking-based 3D QSAR study of HIV-1 integrase inhibitors. They used protein-ligand docking to identify a potential binding mode for 43 inhibitors in the HIV-1 integrase active site, and the best docked conformation of a given molecule was used as the template for alignment. The docking was followed by CoMFA and CoMSIA modeling, and the authors developed very good models, with r²_cv values of 0.728 and 0.794, respectively, and non-cross-validated r²_ncv values of 0.934 and 0.928. This combined docking-based 3D QSAR methodology showed very good predictive ability and can be employed further in the development of better inhibitors for various proteins. One more study is worth discussing, in which the authors applied a combination of docking and 3D QSAR to reveal the most important structural factors for activity. Hu et al. (2009) applied a receptor- and ligand-based 3D QSAR study to a series of 68 non-nucleoside HIV-1 reverse transcriptase inhibitors (2-amino-6-arylsulfonylbenzonitriles and their thio and sulfinyl congeners). The authors applied docking simulations to position the inhibitors in the RT active site in order to determine the most probable binding mode and the most reliable conformations. This complex receptor- and ligand-based alignment procedure, with its different alignment modes, allowed the authors to obtain reliable and predictive CoMFA and CoMSIA models, with cross-validated q² values of 0.723 and 0.760, respectively.
The authors concluded that the CoMFA steric and CoMSIA hydrophobic fields support the idea that bulkier and hydrophobic groups are favorable to bioactivity in the 3- and 5-positions of the B (benzene) ring, while such groups are unfavorable in the 4-position. Also, the CoMSIA H-bond donor and acceptor fields suggest that the sulfide and sulfone inhibitors are more active than the sulfoxide ones due to H-bonding with protein residues. It is worth mentioning another interesting study in which a combination of methods, including molecular docking and 3D QSAR, was used to develop a predictive QSAR model. Moro et al. (2006) suggested combining molecular electrostatic potential (MEP) surface properties (autocorrelation vectors) with conventional partial least-squares (PLS) analysis to produce a robust ligand-based 3D structure-activity relationship (autoMEP/PLS). They applied this approach to predict human A3 receptor antagonist activities. First of all, the approach was suggested as an efficient alternative pharmacodynamic-driven filtering method for small-sized virtual libraries. For this, the authors generated a small combinatorial library (841 compounds) derived from the scaffold of the known human A3 antagonist pyrazolo-triazolo-pyrimidines (Moro et al. 2005). This is another successful example of a combined docking and 3D QSAR approach to investigating and designing active analogue compounds. The authors used the MultiDock code, part of the MOE suite (Molecular Operating Environment (MOE) 2016), to obtain a conformational sampling, and then calculated interaction energies using MMFF94 (Halgren 1996) for the subsequent steps. The MEPs were derived from a classical point-charge model: the electrostatic potential for each molecule is obtained by moving a unit positive point charge across the van der Waals surface, and it is calculated at various points j on this surface (Moro et al. 2006).
The authors were able to test the approach by synthesizing several predicted potent compounds, and they found that all the newly synthesized compounds were correctly predicted as potent human A3 antagonists (Moro et al. 2006). As a continuation of the development of pharmacophore- and docking-based methods for QSAR, the novel PHASE code was developed. This updated code was then used by Amnerkar and Bhusari (2010) to investigate by a 3D QSAR approach the anticonvulsant activity of some prop-2-eneamido and 1-acetyl-pyrazolin derivatives of aminobenzothiazole (compound 52 aligned to the pharmacophore, in which blue indicates nitrogen, yellow sulfur, green chlorine, gray carbon, and white hydrogen; reproduced with permission from Amnerkar and Bhusari (2010), copyright Elsevier, 2010). They obtained a statistically significant 3D QSAR model with r² of 0.922 and q² of 0.814. The model was analyzed in order to understand the trends of the investigated molecules in their anticonvulsant properties. The authors found that electron-withdrawing, hydrogen bond donor, negative/positive ionic, and hydrophobic groups influence the anticonvulsant activity, and they believe that the derived 3D QSAR, as well as the clues for possible structural modifications, will be of interest and significance for the strategic design of more potent benzothiazole anticonvulsant agents. The pharmacophore hypothesis generated from the PHASE-based 3D QSAR analysis is shown in Fig. 4. Another PHASE application to a 3D QSAR study was published by Pulla et al. (2016). The authors applied a 3D QSAR approach to investigate silent mating-type information regulation 2 homologue 1 (SIRT1), the homologous enzyme of the silent information regulator-2 gene in yeast. SIRT1 is believed to be overexpressed in many cancers (prostate, colon) and inflammatory disorders (rheumatoid arthritis), which is why it has good therapeutic importance.
The authors employed both structure-based and ligand-based drug design strategies, utilizing high-throughput virtual screening of chemical databases. An energy-based pharmacophore was generated using the crystal structure of SIRT1 bound to a small-molecule inhibitor and compared with a ligand-based pharmacophore model, which showed four similar features. A 3D QSAR model was developed and applied to the generated structures. Among the designed compounds, lead 17 emerged as a promising SIRT1 inhibitor with an IC50 of 4.34 μM and, at nanomolar concentration (360 nM), attenuated the proliferation of prostate cancer cells (LNCaP) (Pulla et al. 2016). The 3D QSAR model was developed using the PHASE 3.4 module in the Maestro 9.3 software package developed by Schrodinger, LLC (Dixon et al. 2006). Docking studies were executed using the Glide 5.8 module (Halgren et al. 2004). The authors validated the pharmacophore model with a set of 1055 compounds, consisting of 1000 decoys and 55 known inhibitors; the drug-like decoy set of 1000 compounds was obtained from the Glide module (Halgren et al. 2004). The final 3D QSAR model was developed on a dataset of 79 molecules reported as SIRT1 inhibitors in the literature. To develop the QSAR model, the PHASE module relies on PLS regression applied to a large set of binary-valued variables. Each independent variable in the model originates from the grid of cubic volume elements spanning the space occupied by the training-set ligands, and each training ligand is represented by a binary code consisting of bit values (0 or 1) indicating the volume elements occupied by the van der Waals model of the ligand. The authors obtained a very good 3D QSAR model, with r² = 0.953, q² = 0.908, and r²_ext = 0.941. The validated 3D QSAR model (for ADHRR 802) was used by the authors to generate contour maps that could help in understanding the importance of functional groups at specific positions for biological activity.
These insights can be obtained by comparing the contour maps of the most and least active compounds, as shown in Fig. 5, reproduced from Pulla et al. (2016). The blue and red cubes indicate the favorable and unfavorable regions, respectively, of the hydrogen bond donor effect; the light-red and yellow cubes indicate the favorable and unfavorable regions of the hydrophobic effect; and the cyan and orange cubes indicate the favorable and unfavorable regions of the electron-withdrawing effect. From Fig. 5a, it can be seen that the blue favorable regions of the hydrogen bond donor effect lie near the donor feature (D5) of the active molecule; blue boxes are also concentrated at the amide group beside the thiophene, illustrating that additional donor groups in these regions (blue cubes) could increase biological activity. In the inactive molecule, by contrast, red unfavorable boxes are observed around the donor feature (D5), explaining the biological inactivity of the molecule. In the case of the hydrophobic effect, the light-red cubes surround the hydrophobic feature (H8, piperidine) of the active molecule, whereas the presence of a few yellow unfavorable cubes indicates that these hydrophobic groups are not in the right position in the inactive molecule, explaining its weak biological activity. Next, in the case of the electron-withdrawing effect of the active molecule, the favorable cyan cubes are seen around the acceptor feature (A3) and also near the pyrimidine ring, suggesting that acceptor features near the pyrimidine ring could further increase the bioactivity of the molecule. In the inactive molecule, however, mostly unfavorable orange cubes are observed around the acceptor feature (A3), illustrating the importance of the electron-withdrawing group in the activity of lead molecules.
Another new combined docking-based 3D QSAR study (Sun et al. 2010) analyzed influenza neuraminidase inhibitors. The study was based on the novel 3D-HoVAIF method, which relies on the three-dimensional holographic vector of atomic interaction field analysis (Zhou et al. 2007). As mentioned above, the holographic vector for 3D QSAR was initially developed by Zhou et al. (2007). The method uses atomic relative distances and atomic properties, on the basis of the three common nonbonded interactions (electrostatic, van der Waals, and hydrophobic) directly associated with bioactivities, and then derives multidimensional vectors to represent molecular steric structural characteristics for further 3D QSAR analysis. Similarly to the previous study, the authors conducted a docking study to find the best docking pose and template for alignment. They obtained good models for a large dataset of 124 compounds, with correlation coefficients r² = 0.789 and r²_cv = 0.732. The authors claim that 3D-HoVAIF is applicable to molecular structural characterization and bioactivity prediction. In addition, it was shown that the HoVAIF and docking results correspond (Sun et al. 2010), which illustrates that HoVAIF is an effective methodology for characterizing the complex interactions of drug molecules. One more docking-based 3D QSAR study was published in 2010 by Sakkiah et al. (2010), in which the authors conducted 3D QSAR pharmacophore-based virtual screening and molecular docking for the identification of potential Hsp90 inhibitors. The authors used Hypo and HypoGen (Li et al. 2000) 3D-based pharmacophore models. Based on a training set of 16 compounds, they developed a good model using the pharmacophore generation module in Discovery Studio (Accelrys) and then applied it to a test set of 30 compounds.
For predicting activity, the correlation coefficients of the model for the training and test sets were 0.93 and 0.91, respectively. The authors then applied the model to virtual screening of about 160,000 compounds (Maybridge and Scaffold databases) and selected 1150 compounds for docking studies. Finally, 36 compounds were reported as showing high activity based on the 3D QSAR model and docking analysis. The developed HypoGen pharmacophore model used for the virtual screening of the 160,000 database compounds is represented in Fig. 6. The recently introduced CMF approach for 3D QSAR analysis, discussed previously, was successfully applied to several datasets (Baskin and Zhokhova 2013). The authors used the CMF approach to build 3D QSAR models for eight datasets with five types of molecular fields (electrostatic, steric, hydrophobic, and hydrogen bond acceptor and donor). The 3D QSAR models were developed for the following datasets: 114 angiotensin-converting enzyme (ACE) inhibitors, 111 acetylcholinesterase (AChE) inhibitors, 163 ligands for benzodiazepine receptors (BzR), 322 cyclooxygenase-2 (COX-2) inhibitors, 397 dihydrofolate reductase (DHFR) inhibitors, 66 glycogen phosphorylase b (GPB) inhibitors, 76 thermolysin (THER) inhibitors, and 88 thrombin (THR) inhibitors. The authors obtained good models and then compared their statistical characteristics with those built earlier for the same datasets using the most popular molecular-field-based 3D QSAR methods, CoMFA and CoMSIA. In almost all cases, the authors obtained better statistics than in the original works. For example, for the ACE inhibitors, the CMF approach showed q² = 0.72, while CoMFA and CoMSIA showed 0.68 and 0.65, respectively. The 3D QSAR for AChE inhibitors showed q² = 0.58 for CMF, versus 0.52 and 0.48 for CoMFA and CoMSIA, respectively.
The 3D QSAR for DHFR inhibitors showed q² = 0.67 for CMF, while for CoMFA and CoMSIA it was only 0.49 and 0.53, respectively. The other datasets also showed better results. Only one dataset, for the BzR receptors, gave a lower value (q² = 0.40), still comparable with the previous CoMFA and CoMSIA results of 0.32 and 0.41, respectively. As follows from the results presented in that paper, this particular implementation of the CMF approach provides an appealing alternative to the traditional lattice-based methodology, with comparable or enhanced predictive performance relative to state-of-the-art 3D QSAR methods such as CoMFA and CoMSIA. The authors also discussed the advantages and disadvantages of the approach. Its potential advantages stem from the ability to approximate electronic molecular structures to any desired level of accuracy, to leverage the valuable information contained in the partial derivatives of molecular fields (otherwise lost upon discretization) for analyzing models and enhancing their predictive performance, to apply integral transforms to molecular fields and models, and so on. The most attractive features of the CMF approach are its versatility and universality. At the same time, one of the most serious limitations of the CMF approach, at least in its present form, comes from the very nature of kernel-based machine learning methods: the computational cost of the kernel matrix scales as the square of the number of compounds in the training set, and the mean cost of each matrix element also scales as the square of the average number of atoms per molecule. As a result, it becomes impractical to build 3D QSAR models on training sets of more than about 300 medium-sized compounds (Baskin and Zhokhova 2013).

In 2011, Shih et al. (2011) proposed another interesting combination approach, integrating 3D QSAR methods to obtain better predictivity for a dataset. The authors proposed for the first time a combination approach integrating pharmacophore (PhModel) (Taha et al. 2008), CoMFA, and CoMSIA models for B-Raf (the Raf family of serine/threonine kinases). The PhModel was implemented with Accelrys Discovery Studio 2.1. First, the authors established ten PhModels and used them to align diverse inhibitor structures for generating the CoMFA and CoMSIA models. The partial least-squares (PLS) method and known B-Raf inhibitors were then used to validate the prediction ability of the CoMFA and CoMSIA models. Finally, the goodness-of-hit (GH) test score was used as a benchmark for appraising the ability of the CoMFA and CoMSIA models to screen a compound database. Ten PhModels were generated from the 27 training-set inhibitors, each including four features: hydrogen bond acceptor (A), hydrogen bond donor (D), hydrophobic (HY), and ring aromatic (RA). The correlation coefficients for the ten PhModels were very good, ranging from 0.964 to 0.902. The authors claim that this approach could be applied to screen inhibitor databases, optimize inhibitor structures, and identify novel potent or specific inhibitors (Shih et al. 2011). One more combination study on Raf inhibitors is worth mentioning: Yang et al. (2011) combined docking, molecular dynamics (MD), molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) calculations, and 3D QSAR analysis to investigate the detailed binding mode between B-Raf kinase and a series of inhibitors, and to find the key structural features affecting the inhibitory activities. Considering the difficulty of accurately estimating the electrostatic interaction, QM-polarized ligand docking and GBSA rescoring were applied to predict the probable poses of these inhibitors bound in the active site of B-Raf kinase.
to obtain rational conformations for developing the 3d qsar models, the authors applied a docking-based conformation selection strategy. moreover, the detailed interactions were analyzed on the basis of the results from md simulation and free energy calculation for two inhibitors with markedly different activities. the authors investigated 61 b-raf inhibitors and developed comfa and comsia models with r² = 0.917 and 0.940, respectively. as a result, the structure-based 3d qsar models provided further structural analysis and modifiable information for understanding the structure-activity relationships (sars) of these inhibitors. the important hydrophobic property of the 3-substituent of the b-ring was required for type 2 inhibitors. the five substitutable positions of the c-ring could be further modified. the authors concluded that the results obtained from the combined computational approach will be helpful for the rational design of novel type 2 raf kinase inhibitors. recently, a group of computational scientists proposed applying protein-protein interaction (ppi) analysis to target small molecules. life-science research is currently experiencing a boom of interactome studies: many interactions can be measured in a high-throughput way, and large-scale datasets are already available. studies show that many different types of interactions can be potential drug targets. this boom of high-throughput studies greatly broadens the drug-target search space, which makes drug target discovery difficult. in this case, computational methods are highly desired to efficiently provide candidates for further experiments, and they hold the promise to greatly accelerate the discovery of novel drug targets. thus, wang et al. (2016) published a study suggesting a new method in which the analysis of protein-protein interaction (ppi) inhibition is offered as a promising route to improve the specificity of drugs with fewer adverse side effects.
they proposed a machine learning method to predict ppi targets on a genome-wide scale. the authors developed a computational method, named preppitar (wang et al. 2016), to predict ppis as drug targets by uncovering the potential associations between drugs and ppis (fig. 7). the authors investigated the databases and manually constructed a gold-standard positive dataset for drug-ppi interactions. this effort led to a dataset with 227 associations among 63 ppis and 113 fda-approved drugs and allowed them to build models and learn the association rules from the data. the authors characterized drugs by profiles in chemical structure, drug atc-code annotation, and side-effect space, and represented ppi similarity by a symmetrical s-kernel based on protein amino acid sequences. finally, a support vector machine (svm) was used to predict novel associations between drugs and ppis. the preppitar method was validated on the well-established gold-standard dataset. the authors found that chemical structure, drug atc code, and side-effect information are all predictive of ppi targets. they claim that preppitar can serve as a useful tool for ppi target discovery and provides a general heterogeneous data-integration framework. in fig. 7, preppitar applies the kernel fusion method to integrate multiple kinds of information about a drug, including chemical structure, atc code, and drug side effects, to detect the interactions between drugs and ppis; (b) collecting known associations between drugs and ppis as gold-standard positives in a bipartite graph; (c) calculating drug-drug and ppi-ppi similarity metrics, where t_i, i = 1, 2, 3, 4, are the sequence similarities among proteins; (d) relating the similarity among drugs and the similarity among ppis by a kronecker product kernel and applying an svm-based algorithm to predict the unknown associations between drugs and ppis (reproduced with permission from wang et al. 2016).
copyright oxford university press, 2016. nanomaterials are becoming an important component of modern life and have been the subject of an increasing number of investigations involving various areas of natural sciences and technology. however, theoretical modeling of the physicochemical and biological activity of these species is still very scarce. the prediction of properties and activities of "classical" substances via correlation with molecular descriptors is a well-known procedure realized by applying qsar and 3d qsar methods. despite this, the application of qsar to nanomaterials is a very complicated task because of the "nonclassical" structure of these materials. here, we would like to show the first applications of 3d qsar and docking methods to nanostructured materials, which are nevertheless possible and can be useful in predicting their various properties and activities (toxicity). thus, one of the first 3d qsar studies for nanostructured materials was provided in 2008. durdagi et al. (2008a) investigated novel fullerene analogues as potential hiv pr inhibitors. it was the first work in which authors analyzed nanostructured compounds for anti-hiv activity using protein-ligand docking and 3d qsar approaches. moreover, the authors conducted md simulations of ligand-free and inhibitor-bound hiv-1 pr systems to complement some previous studies and to provide a proper input structure of hiv-1 pr for the subsequent docking simulations. then, five different combinations of stereoelectronic fields of 3d qsar/comsia models were obtained from a set of biologically evaluated and computationally designed fullerene derivatives (training set = 43, test set = 6) in order to predict novel compounds with improved inhibition effect. the best 3d qsar/comsia model yielded a cross-validated r² value of 0.739 and a non-cross-validated r² value of 0.993.
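returning to the preppitar method described earlier, the kronecker product kernel it uses to couple drug-drug and ppi-ppi similarities can be sketched as follows; this is a minimal illustration with toy similarity matrices, not the preppitar data or its actual kernel implementation.

```python
# Sketch of a Kronecker product kernel of the kind used to relate drug-drug
# and PPI-PPI similarities: K((d, p), (d', p')) = K_drug(d, d') * K_ppi(p, p').
# The similarity matrices here are toy values, not the PrePPItar data.

def kronecker_kernel(K_drug, K_ppi):
    """Kernel over all (drug, ppi) pairs.

    Entry for pair (i, a) vs pair (j, b) is stored at
    row i*len(K_ppi)+a, column j*len(K_ppi)+b."""
    nd, np_ = len(K_drug), len(K_ppi)
    K = [[0.0] * (nd * np_) for _ in range(nd * np_)]
    for i in range(nd):
        for a in range(np_):
            for j in range(nd):
                for b in range(np_):
                    K[i * np_ + a][j * np_ + b] = K_drug[i][j] * K_ppi[a][b]
    return K

K_drug = [[1.0, 0.5],
          [0.5, 1.0]]   # chemical-structure similarity (toy)
K_ppi = [[1.0, 0.2],
         [0.2, 1.0]]    # sequence-based PPI similarity (toy)
K = kronecker_kernel(K_drug, K_ppi)
print(K[0][3])  # pair (drug0, ppi0) vs (drug1, ppi1): 0.5 * 0.2
```

the resulting matrix is what an svm would consume; note that it has (drugs x ppis)² entries, which is why such pairwise kernels become expensive for large association datasets.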
the authors stated that the derived model indicated the importance of steric (42.6%), electrostatic (12.7%), h-bond donor (16.7%), and h-bond acceptor (28.0%) contributions (fig. 8). in addition, the derived contour plots, together with applied de novo drug design, were then used as pilot models for proposing novel analogues with enhanced binding affinities. interestingly, the nanostructured compounds investigated by the authors triggered the interest of medicinal chemists to look for novel fullerene-type hiv-1 pr inhibitors possessing higher bioactivity. later that year, the authors published a second study on the same type of fullerene-based nanomaterials (durdagi et al. 2008b). the same group published in 2009 another study on fullerene derivatives functionalized by amino acids (durdagi et al. 2009). the authors used an in silico screening approach in order to propose potent fullerene analogues as anti-hiv drugs. two of the most promising derivatives, showing significant binding scores, were subjected to biological studies that confirmed the efficacy of the new compounds. the results showed that new leads possessing higher bioactivity can be discovered. the authors used a docking approach together with md simulations to get the best hits during the virtual screening (fig. 8 reproduced with permission from durdagi et al. 2008a; copyright elsevier, 2008). in 2011, the same group provided further analysis to design better anti-hiv fullerene-based inhibitors (tzoupis et al. 2011). in this study the authors employed a docking technique, two 3d qsar models, md simulations, and the mm-pbsa method. in particular, they investigated (1) hydrogen bonding (h-bond) interactions between specific fullerene derivatives and the protease, (2) the regions of hiv-1 pr that play a significant role in binding, (3) protease changes upon binding, and (4) various contributions to the binding free energy, in order to identify the most significant of them.
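the cross-validated r² (q²) values quoted above for the comsia models are leave-one-out statistics, q² = 1 − press/ss; a minimal sketch with a one-descriptor least-squares predictor on synthetic activity data (not the fullerene dataset) follows.

```python
# Leave-one-out cross-validated q^2, the statistic quoted for 3D QSAR models:
# q^2 = 1 - PRESS / SS, where PRESS sums the squared errors of models refit
# with each compound held out. Descriptor/activity values are synthetic.

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    xm, ym = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xm) ** 2 for x in xs)
    sxy = sum((x - xm) * (y - ym) for x, y in zip(xs, ys))
    b = sxy / sxx
    return b, ym - b * xm

def loo_q2(xs, ys):
    ym = sum(ys) / len(ys)
    ss = sum((y - ym) ** 2 for y in ys)
    press = 0.0
    for k in range(len(xs)):
        xt = xs[:k] + xs[k + 1:]
        yt = ys[:k] + ys[k + 1:]
        b, a = fit_line(xt, yt)
        press += (ys[k] - (a + b * xs[k])) ** 2
    return 1.0 - press / ss

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.1, 4.9, 7.2, 8.8]   # roughly linear synthetic activities
print(round(loo_q2(xs, ys), 3))
```

for a perfectly linear dataset q² reaches 1.0, while values around 0.7, as reported for the best comsia model, indicate that held-out compounds are still predicted much better than the mean-activity baseline.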
the comfa and comsia methods were also applied to build 3d qsar models, and good correlation coefficients were obtained for both methods: r² = 0.842 and 0.928, respectively. the authors claim that the computed binding free energies are in satisfactory agreement with the experimental results. another group published in 2013 a study conducting a comprehensive investigation of fullerene analogues by a combined computational approach including quantum-chemical, molecular docking, and 3d descriptor-based qsar methods (ahmed et al. 2013). the authors stated that the protein-ligand docking studies and improved structure-activity models were able both to predict binding affinities for the set of fullerene-c60 derivatives and to help in finding mechanisms of fullerene derivative interactions with human immunodeficiency virus type 1 aspartic protease, hiv-1 pr. protein-ligand docking revealed several important molecular fragments that are responsible for the interaction with hiv-1 pr (fig. 9). in addition, a density functional theory method was utilized to identify the optimal geometries and predict physicochemical parameters of the 49 studied compounds. the five-variable ga-mlra-based model showed the best predictive ability (r²(train) = 0.882 and r²(test) = 0.738), with high internal and external correlation coefficients. calvaresi and zerbetto (2010) published a study in which they investigated fullerene binding with a set of proteins. the authors investigated about 20 proteins that are known to modify their activity upon interaction with c60. the set was examined using the patchdock software (schneidman-duhovny et al. 2005), with an algorithm that quantitatively appraises the interaction of c60 with the surface of each protein. the redundancy of the set allowed them to establish the predictive power of the approach, which explicitly finds the most probable site where c60 docks on each protein. about 80% of the known fullerene-binding proteins fall in the top 10% of scorers.
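the patchdock result quoted above (about 80% of known fullerene-binding proteins landing in the top 10% of scorers) is an enrichment statistic; it can be computed as follows, with toy scores and labels rather than the actual docking output.

```python
# Enrichment check of the kind used for the C60/protein docking scores:
# what fraction of the known binders lands in the top 10% of ranked scores?
# Scores and binder labels below are toy values, not the PatchDock results.

def top_fraction_enrichment(scores, is_binder, top=0.10):
    """Fraction of known binders found in the top `top` share of scorers."""
    ranked = sorted(zip(scores, is_binder), key=lambda p: p[0], reverse=True)
    n_top = max(1, int(round(len(ranked) * top)))
    binders_total = sum(is_binder)
    binders_in_top = sum(b for _, b in ranked[:n_top])
    return binders_in_top / binders_total

# 20 proteins, 2 known binders given the highest docking scores
scores = [99.5, 99.1] + [float(x) for x in range(1, 19)]
labels = [1, 1] + [0] * 18
print(top_fraction_enrichment(scores, labels))  # both binders in top 2 of 20 -> 1.0
```

a value near 1.0 means almost all known binders rank in the chosen top slice; the 0.8 reported for the c60 set corresponds to strong but imperfect enrichment.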
the close match between the model and experiments vouches for the accuracy of the model and validates its predictions. the authors identified the sites of docking and discussed them in view of the existing experimental data available for protein-c60 interaction. in addition, they identified new proteins that can interact with c60 and discussed possible future applications as drug targets and fullerene-derivative bioconjugate materials. later, calvaresi and zerbetto (2011) published another study in which they investigated the binding of fullerene c60 with 1099 proteins. they confirmed once more that the hydrophobic pockets of certain proteins can accommodate a carbon cage either in full or in part. since the identification of proteins that are able to discriminate between different cages is an open issue, they were interested in investigating a much larger library than in calvaresi and zerbetto (2010). prediction of candidates is achieved with an inverse docking procedure that accurately accounts for (i) van der waals interactions between the cage and the protein surface, (ii) desolvation free energy, (iii) shape complementarity, and (iv) minimization of the number of steric clashes through conformational variations. the set of 1099 protein structures is divided into four categories that either select c60 or c70 (p-c60 or p-c70) and either accommodate the cages in the same pocket or in different pockets. the authors also confirmed the agreement with experiments, where the kcsa potassium channel is predicted to have one of the best performances for both cages. recently, in 2015, xavier and co-workers (esposito et al. 2015) published a qsar study of decorated carbon nanotubes, investigating their toxicity using 4d fingerprints. in this study, the authors proposed detailed mechanisms of action relating to nanotoxicity for a series of decorated (functionalized) carbon nanotube complexes, based on previously reported qsar models.
possible mechanisms of nanotoxicity for six endpoints (bovine serum albumin, carbonic anhydrase, chymotrypsin, and hemoglobin, along with cell viability and nitrogen oxide production) have been extracted from the corresponding optimized qsar models. the molecular features relevant to each endpoint's respective mechanism of action for the decorated nanotubes are also discussed in the study. based on the molecular information contained within the optimal qsar models for each nanotoxicity endpoint, either the decorator attached to the nanotube is directly responsible for the expression of a particular activity, irrespective of the decorator's 3d geometry and independent of the nanotube, or those decorators whose structures place their functional groups as far as possible from the nanotube surface most strongly influence the biological activity. a docking study together with a comprehensive dft analysis was conducted by saikia et al. (2013). the authors performed a simulation to analyze the interaction of nanomaterials with biomolecular systems: they carried out density functional calculations on the interaction of the pyrazinamide (pza) drug with a functionalized single-wall cnt (fswcnt) as a function of nanotube chirality and length, followed by a docking simulation of the fswcnt with the pnca protein. the functionalization of pristine swcnts facilitates enhancing the reactivity of the nanotubes, and the formation of this type of nanotube-drug conjugate is thermodynamically feasible. the docking studies predicted a plausible binding mechanism and suggested that pza-loaded fswcnt facilitates the target-specific binding of pza within the protein, following a lock-and-key mechanism. the authors noticed that no major structural deformation of the protein was observed after binding with the cnt, and that the interaction between ligand and receptor is mainly hydrophobic in nature.
the authors anticipate that these findings may provide new routes toward drug delivery by cnts, with long-term practical implications in tuberculosis chemotherapy. in another study, turabekova et al. (2014) published a comprehensive study of carbon nanotube and pristine fullerene interactions with toll-like receptors (tlrs), which are responsible for the immune response. having experimental data at hand and conducting a comprehensive protein-ligand investigation, the authors were able to show that cnts and fullerenes can bind to certain tlrs. they suggested a hypothetical model providing a potential mechanistic explanation for the immune and inflammatory responses observed upon exposure to carbon nanoparticles. specifically, the authors performed a theoretical study to analyze cnt and c60 fullerene interactions with the available x-ray structures of tlr homo- and heterodimer extracellular domains. this assumption was based on the fact that, similar to the known tlr ligands, both cnts and fullerenes induce, in cells, the secretion of certain inflammatory protein mediators, such as interleukins and chemokines. these proteins are observed within inflammation downstream processes resulting from the ligand-molecule-dependent inhibition or activation of tlr-induced signal transduction. the computational studies showed that the internal hydrophobic pockets of some tlrs might be capable of binding small-sized carbon nanostructures (5,5 armchair swcnts containing 11 carbon atom layers and c60 fullerene). high binding scores and the minor structural alterations induced in tlr ectodomains upon binding c60 and cnts further supported the proposed hypothesis (fig. 10). additionally, the proposed hypothesis is strengthened by indirect experimental findings indicating that cnts and fullerenes induce an excessive expression of specific cytokines and chemokines (i.e., il-8 and mcp1).

fig. 10: 5,5 cnt-bound tlr1/tlr2 ecds: (a) 5,5 cnt bound at the tlr1 and tlr2 ecd interface dimerization area; (b) aligned structures of tlr2 ecds before (green carbon atoms) and after (blue carbon atoms) impact/opls2005 refinement upon binding 5,5 cnts. the orientation of the two parallel entrance loops and the side chains of the hydrophobic phe349, phe325, and leu328 preventing the nanotube from intrusion are shown to be optimized (reproduced with permission from turabekova et al. 2014; copyright royal society of chemistry, 2014).

later, this kind of interaction was confirmed by an md simulation provided by mozolewska et al. (2014). in this study, the authors made an attempt to determine whether the nanotubes could interfere with the innate immune system by interacting with tlrs. for this purpose, they used the following tlr structures downloaded from the rcsb protein data bank: tlr2 (3a7c), tlr4/md (3fxi), tlr5 (3v47), tlr3 (2a0z), and the complexes tlr1/tlr2 (2z7x) and tlr2/tlr6 (3a79). the results of steered molecular dynamics (smd) simulations showed that nanotubes interact very strongly with the binding pockets of some receptors (e.g., tlr2), which results in their binding to these sites without substantial use of the external force. in this chapter, we discussed 3d qsar and protein-ligand docking methods and their recent applications to conventional organic compound design and to nanostructured materials. despite all the pitfalls, the 3d qsar approach has confirmed its importance and value in drug design and medicinal chemistry. moreover, the combination of the 3d qsar approach with other techniques, including protein-ligand docking, gives a much better improvement in predictions of biologically active compounds and drug candidates.
the development of methods for 3d qsar still continues, giving improved predictions for conventional organic compounds. thus, we believe that in the near future 3d qsar methods will be able to encode and model various organic compounds and nanomaterials for the improvement of important biological and physicochemical properties.

references (extracted titles):

receptor- and ligand-based study of fullerene analogues: comprehensive computational approach including quantum-chemical, qsar and molecular docking simulations
three-dimensional qsar using the k-nearest neighbor method and its interpretation
synthesis, anticonvulsant activity and 3d-qsar study of some prop-2-eneamido and 1-acetyl-pyrazolin derivatives of aminobenzothiazole
receptor-dependent (rd) 3d-qsar approach of a series of benzylpiperidine inhibitors of human acetylcholinesterase (huache)
the continuous molecular fields approach to building 3d-qsar models
baiting proteins with c60
fullerene sorting proteins
comparative molecular field analysis (comfa). 2. toward its use with 3d-structural databases
the dylomms method: initial results from a comparative study of approaches to 3d qsar
comparative molecular field analysis (comfa). 1. effect of shape on binding of steroids to carrier proteins
comparative residue interaction analysis (coria): a 3d-qsar approach to explore the binding contributions of active site residues with ligands
exploring the binding of hiv-1 integrase inhibitors by comparative residue interaction analysis (coria)
phase: a new engine for pharmacophore perception, 3d qsar model development, and 3d database screening. 1. methodology and preliminary results
the hypothetical active site lattice. an approach to modelling active sites from data on inhibitor molecules
computational design of novel fullerene analogues as potential hiv-1 pr inhibitors: analysis of the binding interactions between fullerene inhibitors and hiv-1 pr residues using 3d qsar, molecular docking and molecular dynamics simulations
3d qsar comfa/comsia, molecular docking and molecular dynamics studies of fullerene-based hiv-1 pr inhibitors
in silico drug screening approach for the design of magic bullets: a successful example with anti-hiv fullerene derivatized amino acids
methods for reliability and uncertainty assessment and for applicability evaluations of classification- and regression-based qsars
exploring possible mechanisms of action for the nanotoxicity and protein binding of decorated nanotubes: interpretation of physicochemical properties from optimal qsar models
3d qsar methods: phase and catalyst compared
incorporating molecular shape into the alignment-free grid-independent descriptors
anchor-grind: filling the gap between standard 3d qsar and the grid-independent descriptors
drugscore meets comfa: adaptation of fields for molecular comparison (afmoc) or how to tailor knowledge-based pair-potentials to a particular protein
a computational procedure for determining energetically favorable binding sites on biologically important macromolecules
a virtual screening approach for thymidine monophosphate kinase inhibitors as antitubercular agents based on docking and pharmacophore models
docking-based 3d-qsar study of hiv-1 integrase inhibitors
merck molecular force field. i. basis, form, scope, parameterization, and performance of mmff94
glide: a new approach for rapid, accurate docking and scoring. 2. enrichment factors in database screening
exploring qsar
a qsar investigation of dihydrofolate reductase inhibition by baker triazines based upon molecular shape analysis
receptor- and ligand-based 3d-qsar study for a series of non-nucleoside hiv-1 reverse transcriptase inhibitors
structure-toxicity relationships of nitroaromatic compounds
design, synthesis and biological evaluation of 1,3,4-oxadiazole derivatives
quantum-chemical descriptors in qsar/qspr studies
thermodynamic aspects of hydrophobicity and biological qsar
a critical review of recent comfa applications
virtual ligand screening: strategies, perspectives and limitations. drug discovery today
molecular similarity indices in a comparative analysis (comsia) of drug molecules to correlate and predict their biological activity
qsar and 3d qsar in drug design part 1: methodology. drug discovery today
qsar and 3d qsar in drug design part 2: applications and problems
novel biginelli dihydropyrimidines with potential anticancer activity: a parallel synthesis and comsia study
hypogen: an automated system for generating 3d predictive pharmacophore models
multi-dimensional qsar in drug discovery
whither combine? new opportunities for receptor-based qsar
let's not forget tautomers
chemical computing group inc., 1010 sherbrooke st. west, suite #910
combined target-based and ligand-based drug design approach as a tool to define a novel 3d-pharmacophore model of human a3 adenosine receptor antagonists
the application of a 3d-qsar (automep/pls) approach as an efficient pharmacodynamic-driven filtering method for small-sized virtual library: application to a lead optimization of a human a3 adenosine receptor antagonist
preliminary studies of interaction between nanotubes and toll-like receptors
integrating virtual screening in lead discovery
prediction of drug binding affinities by comparative binding energy analysis
combined pharmacophore modeling, docking, and 3d qsar studies of abcb1 and abcc1 transporter inhibitors
grid-independent descriptors (grind): a novel class of alignment-independent three-dimensional molecular descriptors
3d qsar and molecular docking studies of benzimidazole derivatives as hepatitis c virus ns5b polymerase inhibitors
quantitative structure-activity relationship methods: perspectives on drug discovery and toxicology
minimizing false positives in kinase virtual screens
qsar and comfa: a perspective on the practical application to drug discovery
the comparative molecular surface analysis (comsa) - a nongrid 3d qsar method by a coupled neural network and pls system: predicting pka values of benzoic and alkanoic acids
modeling robust qsar
energy-based pharmacophore and three-dimensional quantitative structure-activity relationship (3d-qsar) modeling combined with virtual screening to identify novel small-molecule inhibitors of silent mating-type information regulation 2 homologue 1 (sirt1)
using nano-qsar to predict the cytotoxicity of metal oxide nanoparticles
docking and 3-d qsar studies on indolyl aryl sulfones. binding mode exploration at the hiv-1 reverse transcriptase non-nucleoside binding site and design of highly active n-(2-hydroxyethyl)carboxamide and n-(2-hydroxyethyl)carbohydrazide derivatives
novel 4-(morpholin-4-yl)-n'-(arylidene)benzohydrazides: synthesis, antimycobacterial activity and qsar investigations
qsar modeling of acute toxicity on mammals caused by aromatic compounds: the case study using oral ld50 for rats
predicting anti-hiv activity of 1,3,4-thiazolidinone derivatives: 3d-qsar approach
self-organizing molecular field analysis: a tool for structure-activity studies
density functional and molecular docking studies towards investigating the role of single-wall carbon nanotubes as nanocarrier for loading and delivery of pyrazinamide antitubercular drug onto pnca protein
3d qsar pharmacophore based virtual screening and molecular docking for identification of potential hsp90 inhibitors
patchdock and symmdock: servers for rigid and symmetric docking
development of novel 3d-qsar combination approach for screening and optimizing b-raf inhibitors in silico
comparative molecular moment analysis (comma): 3d-qsar without molecular superposition
docking and 3d-qsar studies of influenza neuraminidase inhibitors using three-dimensional holographic vector of atomic interaction field analysis
combining ligand-based pharmacophore modeling, quantitative structure-activity relationship analysis and in silico screening for the discovery of new potent hormone sensitive lipase inhibitors
3d qsar in drug design
atomic property fields: generalized 3d pharmacophoric potential for automated ligand superposition, pharmacophore elucidation and 3d qsar
immunotoxicity of nanoparticles: a computational study suggests that cnts and c60 fullerenes might be recognized as pathogens by toll-like receptors
binding of novel fullerene inhibitors to hiv-1 protease: insight through molecular dynamics and molecular mechanics poisson-boltzmann surface area calculations
molecular properties that influence the oral bioavailability of drug candidates
synthesis, screening for antitubercular activity and 3d-qsar studies of substituted n-phenyl-6-methyl-2-oxo-4-phenyl-1,2,3,4-tetrahydro-pyrimidine-5-carboxamides
genetically evolved receptor models: a computational approach to construction of receptor models
computational probing protein-protein interactions targeting small molecules
progress in three-dimensional drug design: the use of real-time colour graphics and computer postulation of bioactive molecules in dylomms
pharmacophore modeling and applications in drug discovery: challenges and recent advances
molecular dynamics simulation, free energy calculation and structure-based 3d-qsar studies of b-raf kinase inhibitors
method of continuous molecular fields in the search for quantitative structure-activity relationships
three-dimensional holographic vector of atomic interaction field (3d-hovaif)

key: cord-005321-b3pyg5b3
authors: cai, li-ming; li, zhaoqing; song, xinyu
title: global analysis of an epidemic model with vaccination
date: 2017-07-21
journal: j appl math comput
doi: 10.1007/s12190-017-1124-1
cord_uid: b3pyg5b3

in this paper, an epidemic dynamical model with vaccination is proposed. vaccination of both newborns and susceptibles is included in the present model. the impact of the vaccination strategy together with the vaccine efficacy is explored. in particular, the model exhibits backward bifurcations depending on the vaccination level, and the occurrence of bistability can be observed. mathematically, a bifurcation analysis is performed, and the conditions ensuring that the system exhibits backward bifurcation are provided. the global dynamics of the equilibria in the model are also investigated. numerical simulations are also conducted to confirm and extend the analytic results. mathematical models have become important tools in analyzing the spread and control of infectious diseases [2].
based on the theory of kermack and mckendrick [19], the spread of infectious diseases can usually be described mathematically by compartmental models such as the sir, sirs, seir, and seirs models (where s represents the class of susceptible individuals, e the exposed class in the latent period, i the infectious class, and r the removed class of those who have recovered with temporary or permanent immunity). in recent years, a variety of compartmental models have been formulated, the mathematical analysis of epidemic models has advanced rapidly, and the results of these analyses have been applied to infectious diseases [2, 18, 32]. vaccination campaigns have been critical in combating the spread of infectious diseases, e.g., pertussis, measles, and influenza. the eradication of smallpox has been considered the most spectacular success of vaccination [44]. although vaccination has been an effective strategy against infectious diseases, current preventive vaccines consisting of inactivated viruses do not protect all vaccine recipients equally. the vaccine-based protection depends on the immune status of the recipient [2, 32]. for example, influenza vaccines protect 70-90% of recipients among healthy young adults but as low as 30-40% of the elderly and others with weakened immune systems (such as hiv-infected or immunosuppressed transplant patients) (see [14, 30, 44]). since vaccination is the process of administering weakened or dead pathogens to a healthy person or animal with the intent of conferring immunity against a targeted form of a related disease agent, individuals with vaccine-induced immunity can be distinguished from individuals recovered with natural immunity. thus, vaccination can be modeled naturally by adding a compartment to the basic epidemic models.
over the past few decades, a large number of simple compartmental mathematical models with a vaccinated population have been used in the literature to assess the impact or potential impact of imperfect vaccines for combating disease transmission [1, 3, 11, 16, 20, 21, 23, 31, 43, 45]. in some of these studies (e.g., papers [16, 31, 43]), the authors have shown that the dynamics of the model are determined by the disease's basic reproduction number r0: if r0 < 1 the disease can be eliminated from the community, whereas an endemic state occurs if r0 > 1. therefore, if an efficient vaccination campaign acts to reduce the disease's basic reproduction number r0 below the critical level of 1, then the disease can be eradicated. in other studies, such as alexander et al. [1] and arino et al. [3], it has been shown that the criterion r0 < 1 is not always sufficient to control the spread of a disease; a phenomenon known as a backward bifurcation is observed. mathematically speaking, when a backward bifurcation occurs, there are at least three equilibria for r0 < 1 in the model: the stable disease-free equilibrium, a large stable endemic equilibrium, and a small unstable endemic equilibrium which acts as a boundary between the basins of attraction of the two stable equilibria. in some cases, a backward bifurcation leading to bistability can occur. thus, it is possible for the disease to become endemic in a population, given a sufficiently large initial outbreak. these phenomena have important epidemiological consequences for disease management. in recent years, backward bifurcation, which leads to multiple and subthreshold equilibria, has been attracting much attention (see [1, 3, 4, 6, 11, 16, 17, 20, 21, 23, 24, 33, 34, 37, 40]). several mechanisms with vaccination that cause the occurrence of backward bifurcation have been identified in paper [33].
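in many vaccination models of this type, the endemic equilibria satisfy a quadratic equation whose constant term is proportional to 1 − r0, which is how two endemic states can coexist with the stable disease-free state when r0 < 1. the coefficients below are a generic illustration of this mechanism, not the coefficients derived for the present model.

```python
# Generic illustration of backward bifurcation: endemic equilibria often
# satisfy a*i^2 + b*i + c = 0 with c proportional to (1 - r0). With a > 0,
# b < 0, and c > 0 small, two positive roots (a small unstable and a larger
# stable endemic level) coexist even though r0 < 1.
# The coefficients are illustrative, not those derived for system (2.1).
import math

def endemic_roots(a, b, c):
    """Positive real roots of the endemic-equilibrium quadratic, sorted."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    r = math.sqrt(disc)
    return sorted(x for x in ((-b - r) / (2 * a), (-b + r) / (2 * a)) if x > 0)

r0 = 0.95                                  # slightly below the classical threshold
roots = endemic_roots(a=1.0, b=-0.6, c=1.0 - r0)
print(roots)  # two positive endemic levels despite r0 < 1
```

as r0 decreases further, the discriminant eventually turns negative and both endemic roots disappear, which is the saddle-node point that bounds the bistable region.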
in this paper, we shall investigate the effects of a vaccination campaign with an imperfect vaccine upon the spread of a non-fatal disease, such as hepatitis a, hepatitis b, tuberculosis, or influenza, which features both exposed and infective stages. in particular, we focus on how the vaccination parameters change the qualitative behavior of the model, which may lead to subthreshold endemic states via backward bifurcation. global stability results for the equilibria are obtained. the model constructed in this paper is an extension of the model in paper [31], including a new compartment for the latent class (an important feature for infectious diseases such as hepatitis a, hepatitis b, tuberculosis, and influenza) and the disease cycle. one aim of this paper is to show that strengthening the disease cycle can cause multiple endemic equilibria. the paper is organized as follows. an epidemic model with vaccination by an imperfect vaccine is formulated in sect. 2, and the basic reproduction number and the existence of backward and forward bifurcations are analyzed in sect. 3. the global stability of the endemic equilibrium is established in sect. 4. the paper is concluded with a discussion. in order to derive the equations of the mathematical model, we divide the total population n in a community into five compartments: susceptible, exposed (not yet infectious), infective, recovered, and vaccinated; the numbers in these states are denoted by s(t), e(t), i(t), r(t), and v(t), respectively. the flow diagram of the disease spread is depicted in fig. 1. all newborns are assumed to be susceptible. of these newborns, a fraction α of individuals is vaccinated, where α ∈ (0, 1]. susceptible individuals are vaccinated at rate constant ψ. the parameter γ1 is the rate constant at which exposed individuals become infectious, and γ2 is the rate constant at which infectious individuals recover and acquire temporary immunity.
finally, since the immunity acquired by infection wanes with time, recovered individuals return to the susceptible class at rate constant γ 3 . β is the transmission coefficient (the rate of effective contacts between susceptible and infective individuals per unit time; this coefficient combines the contact rate and the effectiveness of transmission). since the vaccine does not confer immunity to all vaccine recipients, vaccinated individuals may become infected, but at a lower rate than unvaccinated ones (those in class s). thus, the effective contact rate β is multiplied by a scaling factor σ (0 ≤ σ ≤ 1) that describes the vaccine efficacy: σ = 0 represents a vaccine that offers 100% protection against infection, while σ = 1 models a vaccine that offers no protection at all. it is assumed that the natural death rate and the birth rate are both equal to μ, and the disease-induced death rate is ignored; thus the total population n is constant. since the model considers the dynamics of human populations, it is assumed that all the model parameters are nonnegative. thus, the following model of differential equations, system (2.1), is formulated based on the above assumptions and fig. 1 , with nonnegative initial conditions and n(0) > 0. system (2.1) is well posed: solutions remain nonnegative for nonnegative initial conditions. we note the following limiting cases of system (2.1): if σ = 0 (the vaccine is perfectly effective) and α = ψ = 0 (there is no vaccination), system (2.1) reduces to the standard seirs model in [28] ; if γ 3 = 0 and in the limit γ 1 → ∞, system (2.1) is equivalent to an svir model in [31] . if we let α = 0 and γ 3 = 0, system (2.1) reduces to an sveir epidemic model in [16] , where the authors assess the potential impact of a sars vaccine via mathematical modelling. to explore the effect of the vaccination period and the latent period on disease dynamics, an sveir epidemic model with ages of vaccination and latency is formulated in paper [10] . 
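as a minimal numerical sketch, the compartmental flows described above can be integrated in fractional form (total population normalized to 1). the right-hand side below is reconstructed from the verbal description, not copied from the paper, and the parameter values are illustrative assumptions:

```python
# sketch of the seirs model with vaccination described above; the
# right-hand side is a reconstruction from the verbal description and
# all parameter values are illustrative assumptions.

def rhs(y, mu, alpha, psi, beta, sigma, g1, g2, g3):
    s, v, e, i, r = y
    return (
        mu * (1 - alpha) - beta * s * i - (mu + psi) * s + g3 * r,  # susceptible
        mu * alpha + psi * s - sigma * beta * v * i - mu * v,       # vaccinated
        beta * s * i + sigma * beta * v * i - (mu + g1) * e,        # exposed
        g1 * e - (mu + g2) * i,                                     # infective
        g2 * i - (mu + g3) * r,                                     # recovered
    )

def rk4(y, h, *pars):
    # one classical runge-kutta step
    k1 = rhs(y, *pars)
    k2 = rhs([a + h / 2 * b for a, b in zip(y, k1)], *pars)
    k3 = rhs([a + h / 2 * b for a, b in zip(y, k2)], *pars)
    k4 = rhs([a + h * b for a, b in zip(y, k3)], *pars)
    return [a + h / 6 * (b + 2 * c + 2 * d + e_)
            for a, b, c, d, e_ in zip(y, k1, k2, k3, k4)]

pars = (0.00004566, 0.3, 0.005, 0.4, 0.15, 0.1, 0.05, 0.033)
y = [0.99, 0.0, 0.0, 0.01, 0.0]
for _ in range(5000):           # integrate 500 time units with step 0.1
    y = rk4(y, 0.1, *pars)
print(sum(y))                   # compartment fractions still sum to 1
```

the printed total confirms that the compartment fractions still sum to 1, as required by the constant-population assumption.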
in papers [10, 16, 28, 31] , the authors have shown that the dynamics of the model are determined by the disease's basic reproduction number r 0 : the disease-free equilibrium is globally asymptotically stable for r 0 ≤ 1, and there is a unique endemic equilibrium which is globally asymptotically stable if r 0 > 1. if ψ = 0 and in the limit γ 1 → ∞, system (2.1) reduces to an siv epidemic model in [36] , where the authors investigate the effect of imperfect vaccines on the disease's transmission dynamics. in [36] , it is shown that reducing the basic reproduction number r 0 to values less than one no longer guarantees disease eradication. in this paper, we show that if a vaccination campaign with an imperfect vaccine and the disease cycle are considered, a more complicated dynamic behavior is observed in system (2.1); for example, a backward bifurcation occurs. first, it is easy to see that the total population n in system (2.1) is constant. to simplify our notation, we define the occupation variables of the compartments s, e, i, v, and r as the respective fractions of the constant population n that belong to each of the corresponding compartments, and we still write them as s, e, i, v and r, respectively. it is then easy to verify that the region d = {(s, v, e, i, r) ∈ r 5 + : s + v + e + i + r = 1} is positively invariant and globally attracting in r 5 + , so it suffices to study the dynamics of (2.1) on d. thus, system (2.1) can be rewritten as the following system (2.3):

ds/dt = μ(1 − α) − βsi − (μ + ψ)s + γ 3 r,
dv/dt = μα + ψs − σβvi − μv,
de/dt = βsi + σβvi − (μ + γ 1 )e,
di/dt = γ 1 e − (μ + γ 2 )i,
dr/dt = γ 2 i − (μ + γ 3 )r.

in the case σ = α = ψ = 0, system (2.3) reduces to an seirs model without vaccination [28] , and r 0 = βγ 1 /[(μ + γ 1 )(μ + γ 2 )] is considered as the basic reproduction number of the model. the classical basic reproduction number is defined as the number of secondary infections produced by a single infectious individual during his or her entire infectious period. 
mathematically, the reproduction number is defined as the spectral radius r 0 of the next generation matrix (a threshold quantity for disease control), which gives the number of new infections generated by a single infected individual in a fully susceptible population [39] . in the following, we use this approach to determine the reproduction number of system (2.3). it is easy to see that system (2.3) always has a disease-free equilibrium p 0 = (s 0 , v 0 , 0, 0, 0), where s 0 = μ(1 − α)/(μ + ψ) and v 0 = (μα + ψ)/(μ + ψ). writing the infected compartments of system (2.3) as ẋ = f(x) − v(x), where x = (e, i), f(x) collects the new-infection terms and v(x) the transition terms, the jacobian matrices of f(x) and v(x) at the disease-free equilibrium p 0 are, respectively,

f = [ 0   β(s 0 + σ v 0 ) ; 0   0 ],    v = [ μ + γ 1   0 ; −γ 1   μ + γ 2 ].

f v −1 is the next generation matrix of system (2.3), and its spectral radius is ρ(f v −1 ) = βγ 1 (s 0 + σ v 0 )/[(μ + γ 1 )(μ + γ 2 )]. according to theorem 2 in [39] , the basic reproduction number of system (2.3) is therefore

r vac = [βγ 1 /((μ + γ 1 )(μ + γ 2 ))] · [μ(1 − α) + σ (μα + ψ)]/(μ + ψ).

the basic reproduction number r vac can be interpreted as follows: a proportion γ 1 /(μ + γ 1 ) of exposed individuals progress to the infective stage before dying; 1/(μ + γ 2 ) represents the average time an infective individual spends in the infectious stage, during which secondary infections are generated at rate β per effective contact; and the factor [μ(1 − α) + σ (μα + ψ)]/(μ + ψ) = s 0 + σ v 0 accounts for the effective susceptibility of the population at the disease-free equilibrium, including the residual transmission through vaccinated individuals. now we investigate the conditions for the existence of endemic equilibria of system (2.3). any endemic equilibrium (s, v, e, i, r) of system (2.3) satisfies the equilibrium equations (3.1) obtained by setting the right-hand sides to zero. from the second and third equations of (3.1), it follows that there exists no endemic equilibrium for r 0 ≤ 1. for r 0 > 1, the existence of endemic equilibria is determined by the presence in (0, 1] of positive real solutions of the quadratic equation p(i) = ai 2 + bi + c = 0 (3.2), where the coefficients a, b and c are given in (3.3). from (3.2) and (3.3), we can see that the number of endemic equilibria of system (2.3) is zero, one, or two, depending on parameter values. for σ = 0 (the vaccine is totally effective), there is obviously at most one endemic equilibrium p * (s * , e * , i * , r * , v * ) in the system. 
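as a numerical cross-check of this derivation, one can form the matrices f and v at the disease-free equilibrium, build the next generation matrix f v −1 , and compare its spectral radius with the closed-form r vac . a plain-python sketch (all parameter values are illustrative assumptions):

```python
# cross-check: spectral radius of the next generation matrix f v^{-1}
# versus the closed-form r_vac; parameter values are illustrative.
import math

mu, alpha, psi, beta, sigma = 0.00004566, 0.3, 0.005, 0.4, 0.15
g1, g2 = 0.1, 0.05

s0 = mu * (1 - alpha) / (mu + psi)      # susceptible fraction at the dfe
v0 = (mu * alpha + psi) / (mu + psi)    # vaccinated fraction at the dfe

# new-infection matrix f and transition matrix v at the dfe
f = [[0.0, beta * (s0 + sigma * v0)],
     [0.0, 0.0]]
v = [[mu + g1, 0.0],
     [-g1, mu + g2]]

# invert v and form the next generation matrix k = f v^{-1}
det = v[0][0] * v[1][1] - v[0][1] * v[1][0]
vinv = [[v[1][1] / det, -v[0][1] / det],
        [-v[1][0] / det, v[0][0] / det]]
k = [[sum(f[i][m] * vinv[m][j] for m in range(2)) for j in range(2)]
     for i in range(2)]

# spectral radius of a 2x2 matrix from its characteristic polynomial
tr, dt = k[0][0] + k[1][1], k[0][0] * k[1][1] - k[0][1] * k[1][0]
disc = math.sqrt(max(tr * tr - 4 * dt, 0.0))
rho = max(abs((tr + disc) / 2), abs((tr - disc) / 2))

r_vac = (beta * g1 / ((mu + g1) * (mu + g2))
         * (mu * (1 - alpha) + sigma * (mu * alpha + psi)) / (mu + psi))
print(rho, r_vac)
```

the two printed numbers agree to machine precision, since s 0 + σ v 0 = [μ(1 − α) + σ (μα + ψ)]/(μ + ψ).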
from now on we make the realistic assumption that the vaccine is not totally effective, so that 0 < σ < 1. we notice that r vac = 1 defines a critical vaccination rate ψ crit . since all the model parameters are positive, r vac is a continuous decreasing function of ψ for ψ > 0; if ψ < ψ crit , then r vac > 1 and c < 0. therefore, p(i) of eq. (3.2) has a unique positive root for r vac > 1. now we consider the case r vac < 1. in this case, c > 0 and ψ ≥ ψ crit . from (3.3), it is easy to see that b(ψ) is an increasing function of ψ. thus, if b(ψ crit ) ≥ 0, then b(ψ) > 0 for ψ > ψ crit , so p(i) has no positive real root, which implies that the system has no endemic equilibrium in this case. let us therefore consider the case b(ψ crit ) < 0. since b(ψ) is a linear increasing function of ψ, there is a unique ψ̄ > ψ crit such that b(ψ̄) = 0, and hence the discriminant δ(ψ) = b(ψ) 2 − 4ac satisfies δ(ψ̄) < 0. since δ(ψ) is a quadratic function of ψ with positive coefficient of ψ 2 , δ(ψ) has a unique root ψ̂ in [ψ crit , ψ̄]. thus, for r vac < 1 we have b(ψ) < 0, a > 0, c ≥ 0, and δ(ψ) > 0 for ψ ∈ (ψ crit , ψ̂); therefore, p(i) has two positive roots and the system has two endemic equilibria in this range. from the above discussion, we have b(ψ) > 0 for ψ > ψ̄, and δ(ψ) < 0 for ψ ∈ (ψ̂, ψ̄); therefore, it follows that system (2.3) has no endemic equilibria for ψ > ψ̂. if r vac = 1, we have c = 0; in this case, the system has a unique endemic equilibrium for b(ψ) < 0 and no endemic equilibrium for b(ψ) > 0. summarizing the discussion above, we have the following theorem: system (2.3) has two endemic equilibria for ψ crit < ψ < ψ̂ and has no endemic equilibria for ψ > ψ̂. according to theorem 2 of van den driessche and watmough [39] , we also have the following result: the disease-free equilibrium p 0 is locally asymptotically stable when r vac < 1 and unstable when r vac > 1. in the following, we first give a global result for the disease-free equilibrium of system (2.3) under some conditions. 
by directly calculating the derivative of a suitable lyapunov function l along solutions of system (2.3) and noticing that s + σ v < 1, it is easy to verify that the maximal compact invariant set in {(s, e, i, r, v) ∈ d : l̇ = 0} is the singleton {p 0 }. the global stability of p 0 then follows from the lasalle invariance principle [22] . from the above discussion, we know that system (2.3) may undergo a bifurcation at the disease-free equilibrium when r vac = 1. now we establish the conditions on the parameter values that cause a forward or backward bifurcation to occur. to do so, we shall use the following theorem, whose proof is found in castillo-chavez and song [5] and which is based on center manifold theory [15] . consider the following general system with a parameter φ:

ẋ = f (x, φ).    (3.4)

without loss of generality, it is assumed that x = 0 is an equilibrium of system (3.4) for all values of the parameter φ, that is, f (0, φ) = 0 for all φ. then the local dynamics of system (3.4) around x = 0 are totally determined by the constants a and b: (i) a > 0, b > 0: when φ < 0 with |φ| ≪ 1, x = 0 is locally asymptotically stable and there exists a positive unstable equilibrium; when 0 < φ ≪ 1, x = 0 is unstable and there exists a negative, locally asymptotically stable equilibrium. (ii) a < 0, b < 0: when φ < 0 with |φ| ≪ 1, x = 0 is unstable; when 0 < φ ≪ 1, x = 0 is locally asymptotically stable and there exists a negative unstable equilibrium. (iii) a > 0, b < 0: when φ < 0 with |φ| ≪ 1, x = 0 is unstable and there exists a locally asymptotically stable negative equilibrium; when 0 < φ ≪ 1, x = 0 is stable and a positive unstable equilibrium appears. (iv) a < 0, b > 0: when φ changes from negative to positive, x = 0 changes its stability from stable to unstable; correspondingly, a negative unstable equilibrium becomes positive and locally asymptotically stable. now, by applying theorem 3.4, we shall show that system (2.3) may exhibit a forward or a backward bifurcation when r vac = 1. 
consider the disease-free equilibrium p 0 = (s 0 , 0, 0, 0) of the reduced system and choose β as the bifurcation parameter. solving r vac = 1 for β gives the critical value

β * = (μ + γ 1 )(μ + γ 2 )(μ + ψ)/{γ 1 [μ(1 − α) + σ (μα + ψ)]}.

let j 0 denote the jacobian of system (2.3) evaluated at the dfe p 0 with β = β * . a direct computation shows that j 0 (p 0 , β * ) has a simple zero eigenvalue λ 3 = 0, while the other eigenvalues are real and negative. hence, when β = β * , the disease-free equilibrium p 0 is a non-hyperbolic equilibrium, and assumption (a1) of theorem 3.4 is verified. now, let ω = (ω 1 , ω 2 , ω 3 , ω 4 ) denote a right eigenvector associated with the zero eigenvalue λ 3 = 0, and let v denote a corresponding left eigenvector; a direct computation gives v = (0, γ 1 /(μ + γ 1 ), 1, 0). let a and b be the coefficients defined as in theorem 3.4. computation of a, b: for system (2.3), the associated non-zero partial derivatives of f (evaluated at the dfe p 0 , with x 1 = s, x 2 = i, x 3 = e, x 4 = r) are given in (3.5). since the coefficient b is always positive, according to theorem 3.4 it is the sign of the coefficient a which decides the local dynamics around the disease-free equilibrium p 0 for β = β * : if the coefficient a is positive, the direction of the bifurcation of system (2.3) at β = β * is backward; otherwise, it is forward. thus, we formulate a condition, denoted by (h 3 ), such that if (h 3 ) holds, we have a > 0, and otherwise a < 0. summarizing the above results, we have the following theorem. fig. 3 shows the forward bifurcation diagram for model (2.3). there exists a critical value of ψ, say ψ * , such that a backward bifurcation occurs if ψ < ψ * and a forward bifurcation occurs if ψ > ψ * . both of these bifurcation diagrams are obtained by taking β as the bifurcation parameter and then plotting with respect to r vac . in this section, we shall investigate the global stability of the unique endemic equilibrium for r vac > 1. 
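for r vac > 1, the unique endemic equilibrium can also be located numerically by eliminating e, r, s and v from the steady-state conditions and bisecting a scalar residual in i. the reduction below follows the verbal model description given earlier; the parameter values are illustrative assumptions:

```python
# locating the endemic equilibrium for r_vac > 1 by reducing the
# steady-state conditions to a scalar equation in i; the model equations
# are reconstructed from the verbal description and the parameter values
# are illustrative assumptions.
mu, alpha, psi, beta, sigma = 0.00004566, 0.3, 0.005, 0.4, 0.15
g1, g2, g3 = 0.1, 0.05, 0.033

def h(i):
    # steady states of r, s, v for a given infective fraction i
    r = g2 * i / (mu + g3)
    s = (mu * (1 - alpha) + g3 * r) / (beta * i + mu + psi)
    v = (mu * alpha + psi * s) / (sigma * beta * i + mu)
    # residual of the exposed-class balance divided by i:
    # endemic equilibria are the positive roots of h
    return beta * (s + sigma * v) - (mu + g1) * (mu + g2) / g1

lo, hi = 1e-9, 1.0          # h(0+) > 0 when r_vac > 1, and h(1) < 0 here
for _ in range(200):        # plain bisection
    mid = (lo + hi) / 2
    if h(mid) > 0:
        lo = mid
    else:
        hi = mid
i_star = (lo + hi) / 2
print(i_star)
```

scanning β (or ψ) below the critical value in the same way, and counting the sign changes of h on (0, 1], is a simple way to visualize the two subthreshold equilibria of the backward bifurcation region.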
here we shall apply the geometric approach [25, 27, 38] to establish the global stability of the unique endemic equilibrium. in recent years, many authors [3, 16, 26, 28, 29] have applied this method to show the global stability of positive equilibria in epidemic systems. we follow the techniques and approaches in papers [3, 16] to investigate the global stability of the endemic equilibrium of system (2.3); we omit the general mathematical framework of these theorems and focus only on their application. in the previous section, we showed that if r vac > 1, system (2.3) has a unique endemic equilibrium in d. furthermore, r vac > 1 implies that the disease-free equilibrium p 0 is unstable (theorem 3.2). the instability of p 0 , together with p 0 ∈ ∂d, implies the uniform persistence of the state variables. this result can also be shown by using the same arguments as in proposition 4.2 in [27] and proposition 2.2 in [29] . thus, we first set up the following framework. consider the differential equation

ẋ = f (x),    (4.1)

where f : d(⊂ r n ) → r n , d is an open, simply connected set and f ∈ c 1 (r n ). let

μ̄ = lim sup t→∞ sup x 0 (1/t) ∫ 0 t μ(b(x(s, x 0 ))) ds,    b = p f p −1 + p j [2] p −1 ,

where p(x) is a nonsingular matrix-valued function, j [2] is the second additive compound of the jacobian matrix ∂ f /∂ x, p f is the derivative of p along solutions, and μ is the lozinskiȋ measure with respect to a vector norm | · |. the following result, theorem 4.1, comes from corollary 2.6 in paper [25] : if (i) system (4.1) has a compact absorbing set in d and a unique equilibrium, and (ii) μ̄ < 0, then the equilibrium is globally asymptotically stable. from proposition 4.1, it is easy to verify that condition (i) in theorem 4.1 holds; therefore, to prove our conclusion, we only need to verify that condition (ii) in theorem 4.1 holds. according to paper [35] , the lozinskiȋ measure in theorem 4.1 can be evaluated as μ(b) = inf{κ : d + ||z|| ≤ κ||z|| for all solutions of ż = bz}, where d + is the right-hand derivative. now we state our main result in this section: if inequalities (4.3) hold, then the unique endemic equilibrium p * of system (2.3) is globally asymptotically stable for r vac > 1. 
then, the jacobian matrix of system (2.3) can be written down directly, and the second additive compound [25] (see "appendix") of the jacobian matrix is the 6 × 6 matrix j [2] . for a suitable diagonal matrix p depending on e and i, where p f is the derivative of p in the direction of the vector field f , we have p f p −1 = −diag(ė/e, ė/e, ė/e, i̇ /i, i̇ /i, i̇ /i ), and we thus obtain the matrix b = p f p −1 + p j [2] p −1 . as in [3, 16] , we define the following norm on r 6 :

||z|| = max{u 1 (z), u 2 (z)},

where z ∈ r 6 with components z i , i = 1, . . . , 6, and u 1 (z), u 2 (z) are suitable combinations of |z 1 |, |z 2 |, |z 3 | and |z 4 |, |z 5 |, |z 6 |, respectively. we now demonstrate the existence of some κ > 0 such that d + ||z|| ≤ −κ||z||. by linearity, if this inequality is true for some z, then it is also true for −z. following the analysis in papers [3, 16] , our proof is subdivided into eight separate cases, based on the different octants and the definition of the norm (4.4). to facilitate our analysis, we use the inequalities u 2 (z) ≥ |z 4 |, |z 5 |, |z 6 |, |z 5 + z 6 |, |z 4 + z 5 + z 6 |. case 1: let u 1 (z) > u 2 (z), z 1 , z 2 , z 3 > 0 and |z 1 | > |z 2 | + |z 3 |. then we have ||z|| = z 1 and u 2 (z) < z 1 ; taking the right derivative of ||z|| yields estimate (4.7). case 4: by linearity, eq. (4.7) also holds for u 1 > u 2 and z 1 , z 2 , z 3 < 0 when |z 1 | < |z 2 | + |z 3 |. thus, if we require that γ 1 < γ 3 + μ holds, then inequality (4.5) holds for cases 3 and 4. case 7: let u 1 (z) > u 2 (z), with mixed signs among z 1 , z 2 , z 3 and |z 1 | > |z 2 |. then we have ||z|| = |z 1 | + |z 3 | and u 2 (z) < |z 1 | + |z 3 |. by direct calculation, using the inequalities |z 4 |, |z 5 |, |z 4 + z 5 + z 6 | ≤ u 2 (z) < |z 2 | + |z 3 | and |z 1 | ≤ |z 2 |, we obtain estimate (4.9). case 8: by linearity, eq. (4.9) also holds for u 1 > u 2 with the signs of z 1 , z 2 , z 3 reversed. thus, if we require that γ 1 < γ 3 + μ holds, then inequality (4.5) holds for cases 7 and 8. therefore, from the discussion above, we know that if inequalities (4.3) hold, then there exists κ > 0 such that d + ||z|| ≤ −κ||z|| for all z ∈ r 6 and all nonnegative s, v, e and i . hence, all conditions in theorem 4.1 are satisfied when inequalities (4.3) hold. 
therefore, by theorem 4.1, we conclude that if inequalities (4.3) hold, then the unique endemic equilibrium of system (2.3) is globally stable in d for r vac > 1. remark 2: in sect. 3, we have shown that system (2.3) exhibits a backward bifurcation for r vac ≤ 1. as stressed in [3] , for cases in which the model exhibits bistability, the compact absorbing set required in theorem 4.1 does not exist. by applying methods similar to those in [3] , a sequence of surfaces that exists for time t > 0 and minimizes the functional measuring surface area may be obtained. therefore, the global dynamics of system (2.3) in the bistability region can be further investigated as has been done in paper [3] . in this paper, an epidemic model with vaccination has been investigated. the analysis shows that the proposed model exhibits a more complicated dynamic behavior: backward bifurcation, under certain conditions on the vaccination level, and bistability phenomena can be observed. the global stability of the unique endemic equilibrium of the model is demonstrated for r vac > 1. note that model (2.3) can be solved in an efficient way by means of the multistage adomian decomposition method (madm), a relatively new method [8, 9, 12, 13] , which has some advantages over conventional solvers such as the runge-kutta family. to illustrate the various theoretical results contained in this paper, the effect of some important parameter values on the dynamical behavior of system (2.3) is investigated in the following. we consider first the role of the disease cycle in the backward bifurcation. if γ 3 = 0 [i.e., model (2.3) without the disease cycle], then the expression for the bifurcation coefficient a, given in eq. (3.5), reduces to a nonpositive quantity; thus, the backward bifurcation phenomenon of system (2.3) will not occur if γ 3 = 0. 
this is in line with the results in papers [16, 31] , where the disease-cycle-free model (2.3) has a globally asymptotically stable disease-free equilibrium if the basic reproduction number is less than one. differentiating a, given in eq. (3.5), with respect to γ 3 shows that ∂a/∂γ 3 > 0; hence, the bifurcation coefficient a is an increasing function of γ 3 , and the feasibility of backward bifurcation occurring increases with the disease cycle. now we consider the role of vaccination in the backward bifurcation. if α = ψ = σ = 0, the expression for the bifurcation coefficient a, given in eq. (3.5), again reduces to a nonpositive quantity; thus, the backward bifurcation phenomenon of system (2.3) will not occur if α = ψ = σ = 0 (i.e., model (2.3) will not undergo a backward bifurcation in the absence of vaccination). this is also in line with the results in paper [26] , where the vaccination-free model (2.3) has a globally asymptotically stable equilibrium if the basic reproduction number r 0 is less than one. furthermore, the impact of the vaccine-related parameters (ψ, σ ) on the backward bifurcation is assessed by carrying out an analysis of the bifurcation coefficient a as follows. differentiating a, given in eq. (3.5), partially with respect to ψ shows that ∂a/∂ψ < 0; thus, the backward bifurcation coefficient a is a decreasing function of the vaccination rate ψ. hence, the possibility of backward bifurcation occurring decreases with increasing vaccination rate (i.e., vaccinating more susceptible individuals decreases the likelihood of the occurrence of backward bifurcation). differentiating the bifurcation coefficient a, given in eq. (3.5), partially with respect to σ gives

∂a/∂σ = [2β * γ 1 /(μ + γ 1 )] m 1 ,

with

m 1 = −γ 2 γ 3 /[(μ + γ 3 )(μ + ψ)] + μ(1 − α)(μ + γ 1 )(μ + γ 2 )[μ + ψ + σ (μα + ψ)] / {γ 1 (μ + ψ)[μ(1 − α) + σ (μα + ψ)] 2 } + (μ + γ 2 )/γ 1 + γ 2 /(μ + γ 3 ) + 1.

thus, when m 1 is negative, the bifurcation coefficient a is a decreasing function with respect to σ . 
that is, the likelihood of backward bifurcation occurring decreases with increasing vaccine efficacy. let α = 0.3, μ = 0.00004566, β = 0.4, ψ = 0.005, γ 1 = 0.1, γ 2 = 0.05, γ 3 = 0.033. by direct calculation, it is easy to verify that m 1 is negative and that condition (h 3 ) is satisfied. figure 4 depicts the backward bifurcation phenomenon occurring for lower vaccine efficacy, σ = 0.15; fig. 5 depicts the corresponding diagram for higher vaccine efficacy, σ = 0.45. in addition, it is obvious that our expression for the basic reproduction number of system (2.3), i.e.,

r vac = [βγ 1 /((μ + γ 1 )(μ + γ 2 ))] · [μ(1 − α) + σ (μα + ψ)]/(μ + ψ),

is independent of the loss rate of immunity γ 3 . from the above analysis, we have found that the dynamics of the model are not determined by the basic reproduction number alone, and that the phenomenon of backward bifurcation may occur in the system; moreover, its occurrence becomes more feasible as the loss rate of immunity γ 3 increases. from the above expression, it is easy to see that a vaccination policy with imperfect vaccines can decrease the basic reproduction number r vac . thus, an imperfect vaccine may be beneficial to the community. this is a positive point, since it is known that the use of some imperfect vaccines can sometimes result in detrimental consequences to the community [3, 20] . finally, we must point out that although system (2.3) with (2.2) is well posed mathematically, we acknowledge the biological reality that the fraction of the constant total population occupying a compartment can only lie in the subset q of rational values within r 5 + , and furthermore only in the sub-subset of q consisting of values n/n , where n is an integer in [0, n ]. 
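the closed-form expression for r vac displayed above can be evaluated directly for the quoted parameter values; the sketch below also evaluates the vaccination-free case (α = ψ = 0) for comparison:

```python
# evaluating the closed-form r_vac displayed above for the quoted
# parameter values; note that gamma_3 does not enter the expression.
mu, beta = 0.00004566, 0.4
g1, g2 = 0.1, 0.05

def r_vac(sigma, alpha, psi):
    return (beta * g1 / ((mu + g1) * (mu + g2))
            * (mu * (1 - alpha) + sigma * (mu * alpha + psi)) / (mu + psi))

r_low = r_vac(0.15, 0.3, 0.005)   # sigma = 0.15 (the fig. 4 setting)
r_high = r_vac(0.45, 0.3, 0.005)  # sigma = 0.45 (the fig. 5 setting)
r_none = r_vac(0.0, 0.0, 0.0)     # no vaccination: the basic r_0
print(r_low, r_high, r_none)
```

with these values, the vaccinated reproduction number stays well below the vaccination-free one, illustrating the remark that an imperfect vaccine can still decrease r vac and be beneficial to the community.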
in addition, we also point out that the analysis of model (2.1) may become somewhat different if disease fatalities and more complex vital dynamics are included, in particular if the population size is no longer constant. in the future, we may investigate various modeling possibilities to simulate real-world biological processes based on model (2.1). on the other hand, we note that the population in our model (2.1) is assumed to be homogeneously mixed. in fact, different individuals may have different numbers of contacts; thus, a complex network-based approach to disease transmission may be closer to a realistic situation [7, 41, 42] . in the future, we shall investigate the dynamics of the proposed model on a complex network.

references
[1] a vaccination model for transmission dynamics of influenza
[2] infectious diseases of humans
[3] global results for an epidemic model with vaccination that exhibits backward bifurcation
[4] on the dynamics of an seir epidemic model with a convex incidence rate
[5] dynamical models of tuberculosis and their application
[6] dynamical behavior of an epidemic model for a vector-borne disease with direct transmission
[7] global stability of an epidemic model with carrier state in heterogeneous networks
[8] a new modified adomian decomposition method and its multistage form for solving nonlinear boundary value problems with robin boundary conditions
[9] a reliable algorithm for positive solutions of nonlinear boundary value problems by the multistage adomian decomposition method
[10] global stability of an sveir epidemic model with ages of vaccination and latency
[11] theoretical assessment of public health impact of imperfect prophylactic hiv-1 vaccines with therapeutic benefits
[12] analytical approximate solutions for a general nonlinear resistor-nonlinear capacitor circuit model
[13] chaos control in the cerium-catalyzed belousov-zhabotinsky reaction using recurrence quantification analysis measures
[14] influenza vaccine efficacy in young, healthy adults
[15] nonlinear oscillations, dynamical systems, and bifurcations of vector fields
[16] an sveir model for assessing potential impact of an imperfect anti-sars vaccine
[17] backward bifurcation in epidemic control
[18] the mathematics of infectious diseases
[19] a contribution to the mathematical theory of epidemics. part i
[20] vaccination strategies and backward bifurcation in an age-since-infection structured model
[21] a simple vaccination model with multiple endemic states
[22] the stability of dynamical systems, regional conference series in applied mathematics
[23] global dynamics of vector-borne diseases with horizontal transmission in host population
[24] global analysis of sis epidemic model with a simple vaccination and multiple endemic equilibria
[25] on r.a. smith's autonomous convergence theorem. rocky mount.
[26] global stability for the seir model in epidemiology
[27] a geometric approach to global-stability problems
[28] global stability of seirs models in epidemiology
[29] global dynamics of an seir epidemic model with vertical transmission
[30] new vaccine against tuberculosis: current developments and future challenges
[31] svir epidemic models with vaccination strategies
[32] modeling and dynamics of infectious diseases
[33] on the mechanism of strain replacement in epidemic models with vaccination, in current developments in mathematical biology
[34] progression age enhanced backward bifurcation in an epidemic model with super-infection
[35] logarithmic norms and projections applied to linear differential systems
[36] modelling the effect of imperfect vaccines on disease epidemiology
[37] global results for an sirs model with vaccination and isolation
[38] some application of hausdorff dimension inequalities for ordinary differential equations
[39] reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission
[40] backward bifurcation of an epidemic model with treatment
[41] global analysis of an sis model with an infective vector on complex networks
[42] global dynamics of a network epidemic model for waterborne diseases spread
[43] global attractivity and permanence of a sveir epidemic model with pulse vaccination and time delay
[44] who advisory committee on variola virus research: report of the fourteenth meeting
[45] the periodic solution of a class of epidemic

acknowledgements: we would like to thank dr. chin-hong park (editor-in-chief) and the four reviewers for their constructive comments and suggestions that have helped us to improve the manuscript significantly.

appendix: the second additive compound matrix a [2] for a 6 × 6 matrix a = (a i j ) is

key: cord-018947-d4im0p9e authors: helbing, dirk title: challenges in economics date: 2012-02-10 journal: social self-organization doi: 10.1007/978-3-642-24004-1_16 sha: doc_id: 18947 cord_uid: d4im0p9e

in the same way as the hilbert program was a response to the foundational crisis of mathematics [1], this article tries to formulate a research program for the socio-economic sciences. the aim of this contribution is to stimulate research in order to close serious knowledge gaps in mainstream economics that the recent financial and economic crisis has revealed. by identifying the weak points of conventional approaches in economics, we identify the scientific problems that need to be addressed. we expect that solving these questions will put scientists in a position to give better decision support and policy advice. we also indicate what kinds of insights can be contributed by scientists from other research fields, such as physics, biology, computer and social science. in order to make quick progress and gain a systemic understanding of the whole interconnected socio-economic-environmental system, using the data, information and computer systems available today and in the near future, we suggest multi-disciplinary collaboration as the most promising research approach.

static where the world was dynamic, it assumed competitive markets where few existed, it assumed rationality when we knew full well that economic agents were not rational . . . 
economics had no way of dealing with changing tastes and technology . . . econometrics was equally plagued with intractable problems: economic observations are never randomly drawn and seldom independent, the number of excluded variables is always unmanageably large, the degrees of freedom unacceptably small, the stability of significance tests seldom unequivocably established, the errors in measurement too large to yield meaningful results . . . " [5] . in the following, we will try to identify the scientific challenges that must be addressed to come up with better theories in the near future. this comprises practical challenges, i.e. the real-life problems that must be faced (see sect. 16.2), and fundamental challenges, i.e. the methodological advances that are required to solve these problems (see sect. 16.3). after this, we will discuss which contributions can be made by related scientific disciplines such as econophysics and the social sciences. the intention of this contribution is constructive: it tries to stimulate a fruitful scientific exchange in order to find the best way out of the crisis. according to our perception, the economic challenges we are currently facing can only be mastered by large-scale, multi-disciplinary efforts and by innovative approaches [6]. we fully recognize the large variety of non-mainstream approaches that have been developed by "heterodox economists". however, the research traditions in economics seem to be so powerful that these are not paid much attention to. besides, there is no agreement on which of the alternative modeling approaches would be the most promising one, i.e. the heterogeneity of alternatives is itself one of the problems slowing down their success. this situation clearly implies institutional challenges as well, but these go beyond the scope of this contribution and will therefore be addressed in the future. 
for decades, if not centuries, the world has been facing a number of recurrent socio-economic problems, which are obviously hard to solve. before addressing the related fundamental scientific challenges in economics, we will therefore point out the practical challenges one needs to pay attention to. this basically requires classifying the multitude of problems into packages of interrelated problems. probably, such classification attempts are subjective to a certain extent. at least, the list presented below differs from the one elaborated by lomborg et al. [7] , who identified the following top ten problems: air pollution, security/conflict, disease control, education, climate change, hunger/malnutrition, water sanitation, barriers to migration and trade, transnational terrorism and, finally, women and development. the following (non-ranked) list, in contrast, is more focused on socio-economic factors rather than resource and engineering issues, and it is more oriented at the roots of problems rather than their symptoms:

1. demographic change of the population structure (change of birth rate, migration, integration. . . )
2. financial and economic (in)stability (government debts, taxation, and inflation/deflation; sustainability of social benefit systems; consumption and investment behavior. . . )
3. social, economic and political participation and inclusion (of people of different gender, age, health, education, income, religion, culture, language, preferences; reduction of unemployment. . . )
4. balance of power in a multi-polar world (between different countries and economic centers; also between individual and collective rights, political and company power; avoidance of monopolies; formation of coalitions; protection of pluralism, individual freedoms, minorities. . . )
5. collective social behavior and opinion dynamics (abrupt changes in consumer behavior; social contagion, extremism, hooliganism, changing values; breakdown of cooperation, trust, compliance, solidarity. . . )
6. security and peace (organized crime, terrorism, social unrest, independence movements, conflict, war. . . )
7. institutional design (intellectual property rights; over-regulation; corruption; balance between global and local, central and decentral control. . . )
8. sustainable use of resources and environment (consumption habits, travel behavior, sustainable and efficient use of energy and other resources, participation in recycling efforts, environmental protection. . . )
9. information management (cyber risks, misuse of sensitive data, espionage, violation of privacy; data deluge, spam; education and inheritance of culture. . . )
10. public health (food safety; spreading of epidemics [flu, sars, h1n1, hiv], obesity, smoking, or unhealthy diets. . . )

some of these challenges are interdependent. in the following, we will try to identify the fundamental theoretical challenges that need to be addressed in order to understand the above practical problems and to draw conclusions regarding possible solutions. the most difficult part of scientific research is often not to find the right answer; the problem is to ask the right questions. in this context it can be a problem that people are trained to think in certain ways. it is not easy to leave these ways and see the problem from a new angle, thereby revealing a previously unnoticed solution. three factors contribute to this:

1. we may overlook the relevant facts because we have not learned to see them, i.e. we do not pay attention to them. the issue is known from internalized norms, which prevent people from considering possible alternatives.
2. we know the stylized facts, but may not have the right tools at hand to interpret them. it is often difficult to make sense of patterns detected in data. 
turning data into knowledge is quite challenging. 3. we know the stylized facts and can interpret them, but may not take them seriously enough, as we underestimate their implications. this may result from misjudgements or from herding effects, i.e. from a tendency to follow traditions and majority opinions. in fact, most of the issues discussed below have been pointed out before, but it seems that this has had no effect on mainstream economics so far, or on what decision-makers know about economics. this is probably because mainstream theory has become a norm [8], and alternative approaches are sanctioned as norm-deviant behavior [9, 10]. as we will try to explain, the following fundamental issues are not just a matter of approximations (which often lead to the right understanding, but wrong numbers). rather, they concern fundamental errors in the sense that certain conclusions following from them are seriously misleading. as the recent financial crisis has demonstrated, such errors can be very costly. however, it is not trivial to see what dramatic consequences factors such as dynamics, spatial interactions, randomness, non-linearity, network effects, differentiation and heterogeneity, irreversibility or irrationality can have. despite criticisms by several nobel prize winners such as reinhard selten (1994), joseph stiglitz and george akerlof (2001), or daniel kahneman (2002), the paradigm of the homo economicus, i.e. of the "perfect egoist", is still the dominant approach in economics. it assumes that people have quasi-infinite memory and processing capacities, that they determine the best among all possible alternative behaviors by strategic thinking (systematic utility optimization), and that they implement it without mistakes.
the nobel prize winner of 1976, milton friedman, supported the hypothesis of homo economicus with the following argument: "irrational agents will lose money and will be driven out of the market by rational agents" [11]. more recently, robert e. lucas jr., the nobel prize winner of 1995, used the rationality hypothesis to narrow down the class of empirically relevant equilibria [12]. the rational agent hypothesis is very charming, as its implications are clear and it is possible to derive beautiful and powerful economic theorems and theories from it. perhaps the best way to illustrate homo economicus is a company that is run using optimization methods from operations research, applying supercomputers. another example is that of professional chess players, who try to anticipate the possible future moves of their opponents. obviously, in both examples, the future course of action cannot be fully predicted, even if there are no random effects and mistakes. it is, therefore, no wonder that people have repeatedly expressed doubts regarding the realism of the rational agent approach [13, 14]. bertrand russell, for example, claimed: "most people would rather die than think". while this seems to be a rather extreme opinion, the following scientific arguments must be taken seriously: 1. human cognitive capacities are bounded [16, 17]. even phone calls or conversations can considerably reduce people's attention to events in the environment. the abilities to memorize facts and to perform complicated logical analyses are also clearly limited. 2. in the case of np-hard optimization problems, even supercomputers face limits, i.e. optimization jobs can no longer be performed in real time. therefore, approximations or simplifications, such as the application of heuristics, may be necessary. in fact, psychologists have identified a number of heuristics which people use when making decisions [18]. 3. people perform strategic thinking mainly in important new situations.
in normal, everyday situations, however, they seem to pursue a satisficing rather than an optimizing strategy [17]. meeting a certain aspiration level rather than finding the optimal strategy can save time and energy spent on problem solving. in many situations, people even seem to make routine choices [14], for example, when evading other pedestrians in counterflows. 4. there is a long list of cognitive biases which question rational behavior [19]. for example, individuals favor taking small risks (which are perceived as "chances", as the participation in lotteries shows), but they avoid large risks [20]. furthermore, non-exponential temporal discounting may lead to paradoxical behaviors [21] and requires one to rethink how future expectations must be modeled. 5. most individuals have a tendency towards other-regarding behavior and fairness [22, 23]. for example, the dictator game [24] and other experiments [25] show that people tend to share, even if there is no reason for this. leaving a tip for the waiter in a restaurant one visits only once is a typical example (particularly in countries where tipping is not common) [26]. such behavior has often been interpreted as a sign of social norms. while social norms can certainly change the payoff structure, it has been found that the overall payoffs resulting from them need not create a user or system optimum [27-29]. this suggests that behavioral choices may be irrational in the sense of being non-optimal. a typical example is the existence of unfavorable norms, which are supported by people although nobody likes them [30]. 6. certain optimization problems can have an infinite number of local optima or nash equilibria, which makes it impossible to decide which strategy is best [31]. 7. convergence towards the optimal solution may require such a huge amount of time that the folk theorem becomes useless. this can make it practically impossible to play the best-response strategy [32].
8. the optimal strategy may be deterministically chaotic, i.e. sensitive to arbitrarily small details of the initial condition, which makes the dynamic solution unpredictable in the long run ("butterfly effect") [33, 34]. this fundamental limit of predictability also implies a limit of control, two circumstances that are even more true for non-deterministic systems with a certain degree of randomness. in conclusion, although the rational agent paradigm (the paradigm of homo economicus) is theoretically powerful and appealing, there are a number of empirical and theoretical facts which suggest deficiencies. in fact, most methods used in financial trading (such as technical analysis) are not well compatible with the rational agent approach. even if an optimal solution exists, it may be undecidable for practical or theoretical reasons [35, 36]. this is also relevant for the following challenges, as boundedly rational agents may react inefficiently and with delays, which questions the efficient market hypothesis, the equilibrium paradigm, and other fundamental concepts, and calls for the consideration of spatial, network, and time-dependencies, heterogeneity, correlations, etc. it will be shown that these points can have dramatic implications for the predictions of economic models. the efficient market hypothesis (emh) was first developed by eugene fama [37] in his ph.d. thesis and rapidly spread among leading economists, who used it as an argument to promote laissez-faire policies. the emh states that current prices reflect all publicly available information and (in its stronger formulation) that prices instantly change to reflect new public information. the idea of self-regulating markets goes back to adam smith [38], who believed that "the free market, while appearing chaotic and unrestrained, is actually guided to produce the right amount and variety of goods by a so-called 'invisible hand'".
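as a side note, the "butterfly effect" named in item 8 is easy to demonstrate numerically. the sketch below uses the logistic map, a standard textbook example of deterministic chaos (chosen here purely for illustration; it is not a model from the cited references):

```python
def logistic_trajectory(x0, r=4.0, steps=60):
    """iterate the chaotic logistic map x -> r*x*(1-x) and return the orbit."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# two trajectories whose initial conditions differ by only 1e-10
orbit_a = logistic_trajectory(0.3)
orbit_b = logistic_trajectory(0.3 + 1e-10)
gap = [abs(a - b) for a, b in zip(orbit_a, orbit_b)]
# the tiny initial discrepancy is amplified exponentially, so long-run point
# predictions fail even though the update rule is fully deterministic
```

after a few dozen iterations the two orbits are essentially unrelated, which is exactly the limit of predictability (and hence of control) referred to in the text.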
furthermore, "by pursuing his own interest, [the individual] frequently promotes that of the society more effectually than when he intends to promote it" [39]. for this reason, adam smith is often considered to be the father of free market economics. curiously enough, however, he also wrote a book on "the theory of moral sentiments" [40]. "his goal in writing the work was to explain the source of mankind's ability to form moral judgements, in spite of man's natural inclinations towards self-interest. smith proposes a theory of sympathy, in which the act of observing others makes people aware of themselves and the morality of their own behavior ... [and] seek the approval of the 'impartial spectator' as a result of a natural desire to have outside observers sympathize with them" [38]. such a reputation-based concept would today be considered as indirect reciprocity [41]. of course, there are criticisms of the efficient market hypothesis [42], and the nobel prize winner of 2001, joseph stiglitz, even believes that "there is no invisible hand" [43]. the following list gives a number of empirical and theoretical arguments questioning the efficient market hypothesis: 1. examples of market failures are well known and can result, for example, in cases of monopolies or oligopolies, when there is not enough liquidity, or when information is asymmetric. 2. while the concept of the "invisible hand" assumes something like an optimal self-organization [44], it is well known that this requires certain conditions, such as symmetrical interactions. in general, however, self-organization does not necessarily imply system-optimal solutions. stop-and-go traffic [45] and crowd disasters [46] are two obvious examples of systems in which individuals competitively try to reach individually optimal outcomes, but where the optimal solution is dynamically unstable. 3.
the limited processing capacity of boundedly rational individuals implies potential delays in their responses to sensorial inputs, which can cause such instabilities [47]. for example, a delayed adaptation in production systems may contribute to the occurrence of business cycles [48]. the same applies to the labor market for specially skilled people, which cannot adjust on short time scales. even without delayed reactions, however, the competitive optimization of individuals can lead to suboptimal individual results, as the "tragedy of the commons" in public goods dilemmas demonstrates [49, 50]. 4. bubbles and crashes, or more generally, extreme events in financial markets should not occur if the efficient market hypothesis were correct (see the next subsection). 5. collective social behavior such as "herding effects", as well as deviations of human behavior from what is expected of rational agents, can lead to such bubbles and crashes [51], or can further increase their size through feedback effects [52]. cyclical feedbacks leading to oscillations are also known from the beer game [53] and from business cycles [48]. the efficient market paradigm implies the equilibrium paradigm. this becomes clear if we split it up into its underlying hypotheses: 1. the market can be in equilibrium, i.e. there exists an equilibrium. 2. there is one and only one equilibrium. 3. the equilibrium is stable, i.e. any deviations from the equilibrium due to "fluctuations" or "perturbations" tend to disappear eventually. 4. the relaxation to the equilibrium occurs at an infinite rate. note that, in order to act like an "invisible hand", the stable equilibrium (nash equilibrium) furthermore needs to be a system optimum, i.e. to maximize the average utility. this is true for coordination games when interactions are well-mixed and exploration behavior as well as transaction costs can be neglected [54]. however, it is not fulfilled by so-called social dilemmas [49].
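the destabilizing role of delays mentioned in item 3 above can be made concrete with a toy control loop (a purely illustrative example with made-up parameters, not a model from refs. [47, 48]): an agent steers a quantity back towards its target value zero, but reacts to the state observed several time steps earlier:

```python
def adjust(k=0.8, delay=0, steps=200, x0=1.0):
    """relaxation with delayed feedback: x_{t+1} = x_t - k * x_{t-delay}.

    with delay = 0 the deviation from the target decays geometrically; with a
    long enough reaction delay, the very same feedback strength over-corrects
    and produces growing oscillations (cf. business cycles, the beer game).
    """
    history = [x0] * (delay + 1)  # states x_{-delay}, ..., x_0
    for _ in range(steps):
        history.append(history[-1] - k * history[-1 - delay])
    return history

prompt_control = adjust(delay=0)   # relaxes smoothly to the equilibrium
delayed_control = adjust(delay=3)  # oscillates with growing amplitude
```

one can show that the stability boundary of this map is k = 2 sin(pi/(2(2*delay+1))), so the feedback strength k = 0.8 is stable without delay but unstable for a delay of three steps: the equilibrium exists, yet the delayed system never reaches it.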
let us discuss the evidence for the validity of the above hypotheses one by one: 1. a market is a system of extremely many dynamically coupled variables. theoretically, it is not obvious that such a system has a stationary solution. for example, the system could behave periodically, quasi-periodically, chaotically, or turbulently [81-83, 85-87, 94]. in all these cases, there would be no convergence to a stationary solution. 2. if a stationary solution exists, it is not clear that there are no further stationary solutions. if many variables are non-linearly coupled, the phenomenon of multistability can easily occur [55]. that is, the solution to which the system converges may depend not only on the model parameters, but also on the initial condition, history, or perturbation size. such facts are known as path-dependencies or hysteresis effects and are usually visualized by so-called phase diagrams [56]. 3. in systems of non-linearly interacting variables, the existence of a stationary solution does not necessarily imply that it is stable, i.e. that the system will converge to this solution. for example, the stationary solution could be a focal point with orbiting solutions (as for the classical lotka-volterra equations [57]), or it could be unstable and give rise to a limit cycle [58] or a chaotic solution [33], for example (see also item 1). in fact, experimental results suggest that volatility clusters in financial markets may be a result of over-reactions to deviations from the fundamental value [59]. 4. an infinite relaxation rate is rather unusual, as most decisions and related implementations take time [15, 60]. the points listed at the beginning of this subsection are also questioned by empirical evidence. in this connection, one may mention the existence of business cycles [48] or the unstable orders and deliveries observed in the experimental beer game [53]. moreover, bubbles and crashes have been found in financial market games [61].
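item 3 can be illustrated with a short numerical experiment on the classical lotka-volterra system itself (a generic textbook implementation, not code from ref. [57]): the system possesses a stationary solution, yet a typical trajectory orbits around it indefinitely instead of relaxing towards it:

```python
import math

def lotka_volterra(x0, y0, a=1.0, b=1.0, c=1.0, d=1.0, dt=0.01, t_end=20.0):
    """integrate dx/dt = a*x - b*x*y, dy/dt = -c*y + d*x*y with classical rk4."""
    def f(x, y):
        return a * x - b * x * y, -c * y + d * x * y

    x, y = x0, y0
    xs, ys = [x], [y]
    for _ in range(int(t_end / dt)):
        k1x, k1y = f(x, y)
        k2x, k2y = f(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
        k3x, k3y = f(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
        k4x, k4y = f(x + dt * k3x, y + dt * k3y)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        y += dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
        xs.append(x)
        ys.append(y)
    return xs, ys

xs, ys = lotka_volterra(2.0, 1.0)
# the stationary solution is (c/d, a/b) = (1, 1); the orbit circles it at a
# bounded distance and never converges to it
dist = [math.hypot(u - 1.0, v - 1.0) for u, v in zip(xs, ys)]
```

the trajectory stays bounded but keeps a finite distance from the fixed point forever: existence of an equilibrium does not imply convergence to it.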
today, there seems to be more evidence against than for the equilibrium paradigm. in the past, however, most economists assumed that bubbles and crashes would not exist (and many of them still do). the following quotes are quite typical of this kind of thinking (from [62]): in 2004, the federal reserve chairman of the u.s., alan greenspan, stated that the rise in house values was "not enough in our judgment to raise major concerns". in july 2005, when asked about the possibility of a housing bubble and the potential for this to lead to a recession in the future, the present u.s. federal reserve chairman ben bernanke (then chairman of the council of economic advisers) said: "it's a pretty unlikely possibility. we've never had a decline in housing prices on a nationwide basis. so, what i think is more likely is that house prices will slow, maybe stabilize, might slow consumption spending a bit. i don't think it's going to drive the economy too far from its full path though." as late as may 2007, bernanke stated that the federal reserve "do not expect significant spillovers from the subprime market to the rest of the economy". according to the classical interpretation, sudden changes in stock prices result from new information, e.g. from innovations ("technological shocks"). the dynamics of such systems has, for example, been described by the method of comparative statics (i.e. a series of snapshots). here, the system is assumed to be in equilibrium at each moment, but the equilibrium changes adiabatically (i.e. without delay) as the system parameters change (e.g. through new facts). such a treatment of system dynamics, however, has certain deficiencies: 1. the approach cannot explain changes in or of the system, such as phase transitions ("systemic shifts"), when the system is at a critical point ("tipping point"). 2. it does not allow one to understand innovations and other changes as results of an endogenous system dynamics. 3.
it cannot describe effects of delays or instabilities, such as overshooting, self-organization, emergence, systemic breakdowns or extreme events (see sect. 16.3.4). 4. it does not allow one to study effects of different time scales. for example, when there are fast autocatalytic (self-reinforcing) effects and slow inhibitory effects, this may lead to pattern formation phenomena in space and time [63, 64]. the formation of settlements, where people agglomerate in space, may serve as an example [65, 66]. 5. it ignores long-term correlations such as memory effects. 6. it neglects frictional effects, which are often proportional to the speed of change and occur in most complex systems. without friction, however, it is difficult to understand entropy and other path-dependent effects, in particular irreversibility (i.e. the fact that the system may not be able to return to a previous state) [67]. for example, in most countries the unemployment rate does not return to its previous level after a business cycle [68]. comparative statics is, of course, not the only method used in economics to describe the dynamics of the system under consideration. as in physics and other fields, one may use a linear approximation around a stationary solution to study the response of the system to fluctuations or perturbations [69]. such a linear stability analysis allows one to study whether the system will return to the stationary solution (which is the case for a stable [nash] equilibrium) or not (which implies that the system will eventually be driven into a new state or regime). in fact, the great majority of statistical analyses use linear models to fit empirical data (even when they do not involve time-dependencies). it is known, however, that linear models have special features which are not representative of the rich variety of possible functional dependencies, dynamics, and outcomes.
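the core of such a linear stability analysis takes only a few lines: linearize around the stationary solution, compute the eigenvalues of the jacobian, and check the signs of their real parts. the sketch below handles the generic two-variable case (an illustration, not tied to any specific model in the text):

```python
import cmath

def stability_of_fixed_point(j11, j12, j21, j22):
    """eigenvalues of the 2x2 jacobian [[j11, j12], [j21, j22]] via the
    trace/determinant formula; the stationary solution is linearly stable
    iff both eigenvalues have negative real parts."""
    tr = j11 + j22
    det = j11 * j22 - j12 * j21
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    lam1 = (tr + disc) / 2.0
    lam2 = (tr - disc) / 2.0
    return lam1, lam2, (lam1.real < 0 and lam2.real < 0)

# a damped oscillator: perturbations spiral back to the equilibrium
_, _, stable = stability_of_fixed_point(0.0, 1.0, -1.0, -0.5)
# the lotka-volterra fixed point: purely imaginary eigenvalues, so
# perturbations neither decay nor grow -- they keep orbiting (no relaxation)
l1, l2, neutral_stable = stability_of_fixed_point(0.0, -1.0, 1.0, 0.0)
```

the second case shows why existence of an equilibrium says nothing about relaxation towards it: the linearization already rules convergence out.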
therefore, the neglect of non-linearity has serious consequences: 1. as mentioned before, phenomena like multiple equilibria, chaos or turbulence cannot be understood by linear models. the same is true for self-organization phenomena or emergence. additionally, in non-linearly coupled systems, usually "more is different", i.e. the system may change its behavior fundamentally as it grows beyond a certain size. furthermore, the system is often hard to predict and difficult to control (see sect. 16.3.8). 2. linear modeling tends to overlook that a strong coupling of variables, which would show a normally distributed behavior in separation, often leads to fat-tail distributions (such as "power laws") [70, 71]. this implies that extreme events are much more frequent than expected according to a gaussian distribution. for example, when additive noise is replaced by multiplicative noise, a number of surprising phenomena may result, including noise-induced transitions [72] or directed random walks ("ratchet effects") [73]. 3. phenomena such as catastrophes [74] or phase transitions ("system shifts") [75] cannot be well understood within a linear modeling framework. the same applies to the phenomenon of "self-organized criticality" [79] (where the system drives itself to a critical state, typically with power-law characteristics) or to cascading effects, which can result from network interactions (overcritically challenged network nodes or links) [77, 78]. it should be added that the relevance of network effects resulting from the ongoing globalization is often underestimated. for example, "the stock market crash of 1987 began with a small drop in prices which triggered an avalanche of sell orders in computerized trading programs, causing a further price decline that triggered more automatic sales." [80] therefore, while linear models have the advantage of being analytically solvable, they are often unrealistic.
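item 2 is easy to verify in a toy simulation (made-up parameters, purely for illustration): the very same gaussian shocks act once additively and once multiplicatively on a population of agents. only the multiplicative case develops the heavy tails mentioned in the text, here measured by the excess kurtosis (which is about zero for a gaussian):

```python
import random
import statistics

def excess_kurtosis(xs):
    """fourth standardized moment minus 3 (a gaussian sample gives about 0)."""
    mu = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return statistics.fmean(((x - mu) / sd) ** 4 for x in xs) - 3.0

def simulate(n_agents=2000, steps=200, sigma=0.1, seed=42):
    """apply identical gaussian shocks additively and multiplicatively."""
    rng = random.Random(seed)
    additive, multiplicative = [], []
    for _ in range(n_agents):
        a, m = 0.0, 1.0
        for _ in range(steps):
            shock = rng.gauss(0.0, sigma)
            a += shock            # sum of shocks -> approximately gaussian
            m *= 1.0 + shock      # product of shocks -> heavy-tailed
        additive.append(a)
        multiplicative.append(m)
    return additive, multiplicative

add_outcomes, mult_outcomes = simulate()
```

in the multiplicative case the outcome is approximately lognormal, so a handful of agents ends up orders of magnitude above the median; extreme events are far more frequent than a gaussian mindset would suggest.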
studying non-linear behavior, in contrast, often requires numerical computational approaches. it is likely that most of today's unsolved economic puzzles cannot be well understood through linear models, no matter how complicated they may be (in terms of the number of variables and parameters) [81-94]. the following list mentions some areas where the importance of non-linear interdependencies is most likely underestimated: • collective opinions, such as trends, fashions, or herding effects. • the success of new (and old) technologies, products, etc. • cultural or opinion shifts, e.g. regarding nuclear power, genetically manipulated food, etc. • the "fitness" or competitiveness of a product, value, quality perceptions, etc. • the respect for copyrights. • social capital (trust, cooperation, compliance, solidarity...). • booms and recessions, bubbles and crashes. • bank panics. • community, cluster, or group formation. • relationships between different countries, including war (or trade war) and peace. another common simplification in economic modeling is the representative agent approach, which is known in physics as the mean-field approximation. within this framework, time-dependencies and non-linear dependencies are often considered, but it is assumed that the interaction with other agents (e.g. of one company with all the other companies) can be treated as if this agent interacted with an average agent, the "representative agent". let us illustrate this with the example of the public goods dilemma. here, everyone can decide whether or not to make an individual contribution to the public good. the sum of all contributions is multiplied by a synergy factor, reflecting the benefit of cooperation, and the resulting value is shared equally among all people.
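the payoff structure just described fits into a few lines (a minimal sketch assuming an endowment of 1 per player and a synergy factor of 1.6; both numbers are illustrative):

```python
def public_goods_payoffs(contributions, synergy=1.6):
    """public goods game: all contributions are multiplied by the synergy
    factor and shared equally; each player keeps whatever was not contributed
    (assuming an endowment of 1 per player)."""
    n = len(contributions)
    share = synergy * sum(contributions) / n
    return [1.0 - c + share for c in contributions]

coop = public_goods_payoffs([1, 1, 1, 1])             # everyone contributes
one_free_rider = public_goods_payoffs([0, 1, 1, 1])   # player 0 defects
all_defect = public_goods_payoffs([0, 0, 0, 0])
# the free-rider beats the cooperators, yet universal defection leaves
# everyone worse off than universal cooperation: the social dilemma
```

with these numbers, full cooperation pays 1.6 to everybody, the single free-rider pockets 2.2 while the remaining cooperators earn only 1.2, and universal defection pays just the endowment of 1.0.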
the prediction of the representative agent approach is that, due to the selfishness of agents, a "tragedy of the commons" results [49]. according to this, everybody should free-ride, i.e. nobody should make a contribution to the public good, and nobody would gain anything. however, if everybody contributed, everybody could multiply his or her contribution by the synergy factor. this example is particularly relevant, as society faces many public goods problems and would not work without cooperation. everything from the creation of public infrastructures (streets, theaters, universities, libraries, schools, the world wide web, wikipedia, etc.) to the use of environmental resources (water, forests, air, etc.) or of social benefit systems (such as public health insurance), maybe even the creation and maintenance of a commonly shared language and culture, is a public goods problem (although the last examples are often viewed as coordination problems). even the process of creating public goods is a public good [95]. while it is a well-known problem that people tend to make unfair contributions to public goods or try to get a bigger share of them, individuals cooperate much more than one would expect according to the representative agent approach. if they did not, society could simply not exist. in economics, one tries to solve the problem by introducing taxes (i.e. another incentive structure) or a "shadow of the future" (i.e. a strategic optimization over infinite time horizons in accordance with the rational agent approach) [96, 97]. both come down to changing the payoff structure in a way that transforms the public goods problem into another one that does not constitute a social dilemma [98]. however, there are other solutions to the problem.
when one leaves the realm of the mean-field approximation underlying the representative agent approach and considers spatial or network interactions or the heterogeneity among agents, a miracle occurs: cooperation can survive or even thrive through correlations and co-evolutionary effects [99-101]. a similar result is found for the public goods game with costly punishment. here, the representative agent model predicts that individuals avoid investing in punishment, so that punishment efforts eventually disappear (and, as a consequence, cooperation as well). however, this "second-order free-rider problem" is naturally resolved, and cooperation can spread, if one discards the mean-field approximation and considers the fact that interactions take place in space or in social networks [56]. societies can overcome the tragedy of the commons even without transforming the incentive structure through taxes. for example, social norms as well as group-dynamical and reputation effects can do so [102]. the representative agent approach implies just the opposite conclusion and cannot well explain the mechanisms on which society is built. it is worth pointing out that the relevance of public goods dilemmas is probably underestimated in economics. partially related to adam smith's belief in an "invisible hand", one often assumes underlying coordination games that would automatically create harmony between the individually and system-optimal states in the course of time [54]. however, running a stable financial system and economy is most likely a public goods problem. consider unemployment: recessions always go along with a breakdown of solidarity and cooperation. efficient production clearly requires mutual cooperation (as the counter-example of countries with many strikes illustrates). the failure of the interbank market, when banks stop lending to each other, is a good example of the breakdown of both trust and cooperation.
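the survival of cooperation through spatial correlations can be sketched in miniature (a deliberately simplified illustration in the spirit of spatial games, not a reproduction of the models in refs. [99-101] or [56]): players on a ring play a prisoner's dilemma with their two neighbors and then imitate the most successful player in their neighborhood. mean-field reasoning predicts the extinction of cooperation, yet a small cluster of cooperators persists because its members earn enough from mutual cooperation:

```python
def step(strategies, b=1.5):
    """one round of a spatial prisoner's dilemma on a ring (assumed payoffs:
    a cooperator earns 1 per cooperating neighbor, a defector earns b per
    cooperating neighbor, with 1 < b < 2). each player then imitates the
    highest-earning player among itself and its two neighbors."""
    n = len(strategies)

    def payoff(i):
        gain = 1.0 if strategies[i] == "C" else b
        return sum(gain for j in ((i - 1) % n, (i + 1) % n)
                   if strategies[j] == "C")

    pay = [payoff(i) for i in range(n)]
    return [strategies[max(((i - 1) % n, i, (i + 1) % n), key=lambda j: pay[j])]
            for i in range(n)]

# a small cluster of cooperators in a sea of defectors
state = ["D"] * 20
state[9:12] = ["C", "C", "C"]
for _ in range(50):
    state = step(state)
# the cluster survives indefinitely: interior cooperators earn 2, which the
# adjacent defectors (earning at most b) cannot beat
```

the point is not the specific numbers but the mechanism: clustering lets cooperators interact mostly with other cooperators, which the mean-field (representative agent) description rules out by construction.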
we must be aware that many other systems would stop working if people lost their trust: electronic banking, e-mail and internet use, facebook, e-business and e-governance, for example. money itself would not work without trust, as bank panics and hyperinflation scenarios show. similarly, cheating customers by selling low-quality products, selling products at overrated prices, or manipulating their choices by advertisements rather than informing them objectively and on demand may create profits in the short run, but it affects the trust of customers (and their willingness to invest). the failure of the immunization campaign during the swine flu pandemic may serve as an example. furthermore, people would probably spend more money if the products of competing companies were better compatible with each other. therefore, in the long run, more cooperation among companies and with customers would pay off and create additional value. besides providing a misleading picture of how cooperation comes about, the representative agent approach has a number of other deficiencies, which are listed below: 1. correlations between variables are neglected, which is acceptable only for "well-mixed" systems. according to what is known from critical phenomena in physics, this approximation is valid only when the interactions take place in high-dimensional spaces or when the system elements are well connected. (however, as the example of the public goods dilemma showed, this case does not necessarily have beneficial consequences. well-mixed interactions could rather cause a breakdown of social or economic institutions, and it is conceivable that this played a role in the recent financial crisis.) 2.
percolation phenomena, which describe how far an idea, innovation, technology, or (computer) virus spreads through a social or business network, are not well reproduced, as they depend on the details of the network structure, not just on the average node degree [103]. 3. the heterogeneity of agents is ignored. for this reason, factors underlying economic exchange, perturbations, or systemic robustness [104] cannot be well described. moreover, as socio-economic differentiation and specialization imply heterogeneity, they cannot be understood as emergent phenomena within a representative agent approach. finally, it is not possible to grasp innovation without considering variability. in fact, according to evolutionary theory, the innovation rate would be zero if the variability were zero [105]. furthermore, in order to explain innovation in modern societies, schumpeter introduced the concept of the "political entrepreneur" [106], an extraordinarily gifted person capable of creating disruptive change and innovation. such an extraordinary individual can, by definition, not be modeled by a "representative agent". one of the most important drawbacks of the representative agent approach is that it cannot explain the fundamental fact of economic exchange, since this requires one to assume heterogeneity in resources or production costs, or to consider a variation in the value of goods among individuals. ken arrow, nobel prize winner of 1972, formulated this point as follows [107]: "one of the things that microeconomics teaches you is that individuals are not alike. there is heterogeneity, and probably the most important heterogeneity here is heterogeneity of expectations. if we didn't have heterogeneity, there would be no trade." we close this section by mentioning that economic approaches which go beyond the representative agent approach can be found in refs. [108, 109].
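the percolation point can be made tangible with a classic toy experiment (a generic erdős-rényi sketch, not the network models of ref. [103]): holding everything constant except the average degree, the size of the largest connected cluster jumps from microscopic to macroscopic once the percolation threshold (average degree 1) is crossed, a transition that no "average agent" description can capture:

```python
import random
from collections import Counter

def largest_component(n, avg_degree, seed=0):
    """size of the largest connected component of an erdos-renyi random
    graph g(n, p), with p chosen to produce the requested average degree
    (components tracked with union-find and path halving)."""
    rng = random.Random(seed)
    p = avg_degree / (n - 1)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                parent[find(i)] = find(j)
    return max(Counter(find(i) for i in range(n)).values())

subcritical = largest_component(400, 0.5)    # below the threshold: tiny clusters
supercritical = largest_component(400, 2.0)  # above it: a giant component
```

an idea or virus seeded in the subcritical network dies out locally; in the supercritical one it can reach a finite fraction of all nodes, even though the "average" connectivity changed only modestly.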
another deficiency of economic theory that needs to be mentioned is the lack of a link between micro- and macroeconomics. neoclassical economics implicitly assumes that individuals make their decisions in isolation, using only the information received from static market signals. within this oversimplified framework, macro-aggregates are just projections of some representative agent behavior, instead of being the outcome of complex interactions, with asymmetric information, among a myriad of heterogeneous agents. in principle, it should be understandable how the macroeconomic dynamics results from the microscopic decisions and interactions at the level of producers and consumers [81, 110] (just as it has been possible in the past to derive micro-macro links for other systems with complex dynamical behavior, such as interactive vehicle traffic [111]). it should also be comprehensible how the macroscopic level (the aggregate economic situation) feeds back on the microscopic level (the behavior of consumers and producers), and how to understand the economy as a complex, adaptive, self-organizing system [112, 113]. concepts from evolutionary theory [114] and ecology [115] appear to be particularly promising [116]. this, however, requires a recognition of the importance of heterogeneity for the system (see the previous subsection). the lack of ecological thinking implies not only that the sensitive network interdependencies between the various agents in an economic system (as well as minority solutions) are not properly valued. it also causes deficiencies in the development and implementation of a sustainable economic approach based on recycling and renewable resources. today, forestry science is probably the best-developed scientific discipline concerning sustainability concepts [117]. the idea that economic growth is needed to maintain social welfare is a serious misconception.
from other scientific disciplines it is well known that stable pattern formation is also possible with a constant (and potentially sustainable) inflow of energy [69, 118]. one of the great achievements of economics is that it has developed a multitude of methods to use scarce resources efficiently. a conventional approach to this is optimization. in principle, there is nothing wrong with this approach. nevertheless, there are a number of problems with the way it is usually applied: 1. one can only optimize for one goal at a time, while one usually needs to meet several objectives. this is mostly addressed by weighting the different goals (objectives), by executing a hierarchy of optimization steps (through ranking and prioritization), or by applying a satisficing strategy (requiring a minimum performance for each goal) [119, 120]. however, when different optimization goals are in conflict with each other (such as maximizing the throughput and minimizing the queue length in a production system), a sophisticated time-dependent strategy may be needed [121]. 2. high profit? best customer satisfaction? large throughput? competitive advantage? resilience? [122] in fact, the choice of the optimization function is arbitrary to a certain extent and, therefore, the result of optimization may vary largely. goal selection requires strategic decisions, which may involve normative or moral factors (as in politics). in fact, one can often observe that different goal functions are chosen in the course of time. moreover, note that the maximization of certain objectives such as resilience or "fitness" depends not only on factors that are under the control of a company. resilience and "fitness" are functions of the whole system; in particular, they also depend on the competitors and the strategies chosen by them. 3. the best solution may be the combination of two bad solutions and may, therefore, be overlooked.
in other words, there are "evolutionary dead ends", so that gradual optimization may not work. (this problem can be partially overcome by the application of evolutionary mechanisms [120].) 4. in certain systems (such as many transport, logistic, or production systems), optimization tends to drive the system towards instability, since the point of maximum efficiency is often in the neighborhood of, or even identical with, the point of breakdown of performance. such breakdowns in capacity or performance can result from inefficiencies due to dynamic interaction effects. for example, when traffic flow reaches its maximum capacity, sooner or later it breaks down. as a consequence, the road capacity tends to drop during the time period when it is most urgently needed, namely during the rush hour [45, 123]. 5. optimization often eliminates redundancies in the system and, thereby, increases the vulnerability to perturbations, i.e. it decreases robustness and resilience. 6. optimization tends to eliminate heterogeneity in the system [80], while heterogeneity frequently supports adaptability and resilience. 7. optimization is often performed with centralized concepts (e.g. by using supercomputers that process information collected all over the system). such centralized systems are vulnerable to disturbances or failures of the central control unit. they are also sensitive to information overload, wrong selection of control parameters, and delays in adaptive feedback control. in contrast, decentralized control (with a certain degree of autonomy of local control units) may perform better when the system is complex and composed of many heterogeneous elements, when the optimization problem is np-hard, when the degree of fluctuations is large, and when predictability is restricted to short time periods [77, 124]. under such conditions, decentralized control strategies can perform well by adapting to the actual local conditions, while being robust to perturbations.
urban traffic light control is a good example of this [121, 125]. 8. furthermore, today's concept of quality control appears to be awkward. it leads to a never-ending contest, requiring people and organizations to fulfil permanently increasing standards. this leads to over-emphasizing measured performance criteria, while non-measured success factors are neglected. engagement in non-rewarded activities is discouraged, and innovation may be suppressed (e.g. when evaluating scientists by means of their h-index, which requires them to focus on a big research field that generates many citations in a short time). while so-called "beauty contests" are considered to produce the best results, they will eventually absorb more and more resources for the contest itself, while less and less time remains for the work that is actually to be performed once the contest is won. besides, a large number of competitors have to waste considerable resources on these contests which, of course, have to be paid by someone. in this way, private and public sectors (from physicians and hospitals to administrations, schools, and universities) are aching under the evaluation-related administrative load, while little time remains to perform the work that the corresponding experts have been trained for. it seems naïve to believe that this would not waste resources. rather than making use of individual strengths, which are highly heterogeneous, today's way of evaluating performance enforces a large degree of conformity. there are also some problems with parameter fitting, a method based on optimization as well. in this case, the goal function is typically an error function or a likelihood function.
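returning to point 1 of the list above: the two simplest multi-objective strategies mentioned there, weighted-sum scalarization and a satisficing filter, can be sketched in a few lines. the candidate solutions, objective values, weights, and minimum performance levels below are all invented for illustration only.

```python
# invented alternatives, scored on three objectives:
# (profit, customer_satisfaction, resilience), each scaled to [0, 1]
candidates = [
    (0.9, 0.4, 0.2),
    (0.6, 0.7, 0.6),
    (0.3, 0.9, 0.8),
]

def weighted_sum(solution, weights=(0.5, 0.3, 0.2)):
    """scalarize several objectives into one goal function via weights."""
    return sum(w * x for w, x in zip(weights, solution))

best_by_weights = max(candidates, key=weighted_sum)

def satisfices(solution, minima=(0.5, 0.5, 0.5)):
    """satisficing: demand a minimum performance on every goal."""
    return all(x >= m for x, m in zip(solution, minima))

acceptable = [s for s in candidates if satisfices(s)]

print(best_by_weights)  # -> (0.6, 0.7, 0.6)
print(acceptable)       # -> [(0.6, 0.7, 0.6)]
```

note that the weighted-sum winner changes with the (ultimately arbitrary) choice of weights, which is precisely the arbitrariness of the goal function criticized in point 2.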
calibration methods are often "blindly" applied in practice (by people who are not experts in statistics), which can lead to overfitting (the fitting of meaningless "noise"), to the neglect of collinearities (implying largely variable parameter values), or to inaccurate and problematic parameter determinations (when the data set is insufficient in size, for example, when large portfolios are to be optimized [126]). as estimates for past data are not necessarily indicative of the future, making predictions with interpolation approaches can be quite problematic (see also sect. 16.3.3 for the challenge of time dependence). moreover, classical calibration methods do not reveal inappropriate model specifications (e.g. linear ones, when non-linear models would be needed, or unsuitable choices of model variables). finally, they do not identify unknown unknowns (i.e. relevant explanatory variables, which have been overlooked in the modeling process). managing economic systems is a particular challenge, not only for the reasons discussed in the previous section. as large economic systems belong to the class of complex systems, they are hard or even impossible to manage with classical control approaches [76, 77]. complex systems are characterized by a large number of system elements (e.g. individuals, companies, countries, . . . ), which have non-linear or network interactions causing mutual dependencies and responses. such systems tend to behave dynamically rather than statically and probabilistically rather than deterministically. they usually show a rich, hardly predictable, and sometimes paradoxical system behavior. therefore, they challenge our way of thinking [127], and their controllability is often overestimated (which is sometimes paraphrased as the "illusion of control") [80, 128, 129]. in particular, causes and effects are typically not proportional to each other, which makes it difficult to predict the impact of a control attempt.
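the overfitting pitfall mentioned above can be made concrete with a deliberately small numerical example: a model with enough parameters to interpolate noisy observations exactly generalizes far worse than a simple one-parameter model. the data points and the fixed "measurement errors" below are invented.

```python
# toy overfitting demonstration with invented data; underlying law: y = 2x
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
noise = [0.5, -1.0, 0.8, -0.3, 0.6]            # fixed measurement errors
ys = [2.0 * x + e for x, e in zip(xs, noise)]  # noisy observations

def interpolant(x):
    """degree-4 lagrange polynomial through all five noisy points
    (zero training error -- it 'fits the noise')."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def simple_model(x):
    """one-parameter line through the origin, fitted by least squares."""
    slope = sum(a * b for a, b in zip(xs, ys)) / sum(a * a for a in xs)
    return slope * x

x_new = 6.0                                         # held-out point
err_overfit = abs(interpolant(x_new) - 2.0 * x_new)  # = 83.5
err_simple = abs(simple_model(x_new) - 2.0 * x_new)  # = 0.42
print(err_overfit, err_simple)
```

the interpolant is perfect on the training data yet wildly wrong out of sample, which is why an error or likelihood function alone cannot certify a calibrated model.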
a complex system may be unresponsive to a control attempt, or the latter may lead to unexpected, large changes in the system behavior (so-called "phase transitions", "regime shifts", or "catastrophes") [75]. the unresponsiveness is known as the principle of le chatelier or goodhart's law [130], according to which a complex system tends to counteract external control attempts. however, regime shifts can occur when the system gets close to so-called "critical points" (also known as "tipping points"). examples are sudden changes in public opinion (e.g. from a pro- to an anti-war mood, from smoking tolerance to a public smoking ban, or from buying energy-hungry sport utility vehicles (suvs) to buying environmentally-friendly cars). particularly in the case of network interactions, big changes may have small, no, or unexpected effects. feedback loops, unwanted side effects, and circuli vitiosi are quite typical. delays may cause unstable system behavior (such as bullwhip effects) [53], and over-critical perturbations can create cascading failures [78]. systemic breakdowns (such as large-scale blackouts, bankruptcy cascades, etc.) are often a result of such domino or avalanche effects [77], and their probability of occurrence as well as their resulting damage are usually underestimated. further examples are epidemic spreading phenomena or disasters with an impact on the socio-economic system. a more detailed discussion is given in refs. [76, 77]. other factors contributing to the difficulty of managing economic systems are the large heterogeneity of system elements and the considerable level of randomness, as well as the possibility of chaotic or turbulent dynamics (see sect. 16.3.4). furthermore, the agents in economic systems are responsive to information, which can create self-fulfilling or self-destroying prophecy effects. inflation may be viewed as an example of such an effect.
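the tipping-point behavior described above can be caricatured with a granovetter-style threshold model: each agent adopts an opinion once the fraction of adopters exceeds its personal threshold. the threshold distributions and seed sizes below are invented, but they reproduce the disproportionality between cause and effect discussed in the text.

```python
# minimal threshold ("tipping point") sketch with invented parameters

def cascade(thresholds, seeds):
    n = len(thresholds)
    adopted = seeds
    while True:
        new = sum(1 for t in thresholds if t <= adopted / n)
        new = max(new, seeds)          # the seeded agents never revert
        if new == adopted:
            return adopted
        adopted = new

uniform = [i / 100 for i in range(100)]   # thresholds 0.00, 0.01, ..., 0.99
print(cascade(uniform, 1))                # one seed tips everyone: 100
print(cascade([0.2] * 100, 10))           # below critical mass: stays at 10
print(cascade([0.2] * 100, 20))           # at critical mass: jumps to 100
```

a single agent can tip a population with uniformly spread thresholds, while ten agents achieve nothing in a homogeneous one; small causes, big effects and vice versa.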
interestingly, in some cases one does not even know in advance which of these effects will occur. it is also not obvious that the control mechanisms are well designed from a cybernetic perspective, i.e. that we have sufficient information about the system and suitable control variables to make control feasible. for example, central banks do not have terribly many options to influence the economic system. among them are performing open-market operations (to control money supply), adjustments in fractional-reserve banking (keeping only a limited deposit, while lending a large part of the assets to others), or adaptations in the discount rate (the interest rate charged to banks for borrowing short-term funds directly from a central bank). nevertheless, the central banks are asked to meet multiple goals such as: • to guarantee well-functioning and robust financial markets. • to support economic growth. • to balance between inflation and unemployment. • to keep exchange rates within reasonable limits. furthermore, the one-dimensional variable of "money" is also used to influence individual behavior via taxes (by changing behavioral incentives). it is questionable whether money can optimally meet all these goals at the same time (see sect. 16.3.7). we believe that a computer, good food, friendship, social status, love, fairness, and knowledge can only to a certain extent be replaced by and traded against each other. probably for this reason, social exchange comprises more than just material exchange [131] [132] [133]. it is conceivable that financial markets as well are trying to meet too many goals at the same time. this includes: • to match supply and demand. • to discover a fair price. • to raise foreign direct investment (fdi). • to couple local economies with the international system. • to facilitate large-scale investments. • to boost development. • to share risk.
• to support a robust economy, and • to create opportunities (to gamble, to become rich, etc.). therefore, it would be worth studying the system from a cybernetic control perspective. maybe it would work better to separate some of these functions from each other rather than mixing them. another aspect that tends to be overlooked in mainstream economics is the relevance of psychological and social factors such as emotions, creativity, social norms, herding effects, etc. it would probably be wrong to interpret these effects just as a result of perception biases (see sect. 16.3.1). most likely, these human factors serve certain functions, such as supporting the creation of public goods [102] or collective intelligence [134, 135]. as bruno frey has pointed out, economics should be seen from a social science perspective [136]. in particular, research on happiness has revealed that there are more incentives than just financial ones that motivate people to work hard [133]. interestingly, there are quite a number of factors which promote volunteering [132]. it would also be misleading to judge emotions from the perspective of irrational behavior. they are a quite universal and relatively energy-consuming way of signalling. therefore, they are probably more reliable than non-emotional signals. moreover, they create empathy and, consequently, stimulate mutual support and a readiness for compromises. it is quite likely that this creates a higher degree of cooperativeness in social dilemma situations and, thereby, a higher payoff on average as compared to emotionless decisions, which often have drawbacks later on. finally, there is no good theory that would allow one to assess the relevance of information in economic systems. most economic models do not consider information as an explanatory variable, although information is actually a stronger driving force of urban growth and social dynamics than energy [137].
while we have an information theory to determine the number of bits required to encode a message, we lack a theory which would allow us to assess what kind of information is relevant or important, or what kind of information will change the social or economic world, or history. this may actually depend largely on the perception of pieces of information, and on normative or moral issues filtering or weighting information. moreover, we lack theories describing what will happen in cases of coincidence or contradiction of several pieces of information. when pieces of information interact, this can change their interpretation and, thereby, the decisions and behaviors resulting from them. that is one of the reasons why socio-economic systems are so hard to predict: "unknown unknowns", structural instabilities, and innovations cause emergent results and create a dynamics of surprise [138]. the problems discussed in the previous two sections pose interesting practical and fundamental challenges for economists, but also for other disciplines interested in understanding economic systems. econophysics, for example, pursues a physical approach to economic systems, applying methods from statistical physics [81], network theory [139, 140], and the theory of complex systems [85, 87]. a contribution of physics appears quite natural, in fact, not only because of its tradition in detecting and modeling regularities in large data sets [141]. physics also has a lot of experience in dealing theoretically with problems such as time-dependence, fluctuations, friction, entropy, non-linearity, strong interactions, correlations, heterogeneity, and many-particle simulations (which can be easily extended towards multi-agent simulations). in fact, physics has influenced economic modeling already in the past. macroeconomic models, for example, were inspired by thermodynamics.
more recent examples of relevant contributions by physicists concern models of self-organizing conventions [54], of geographic agglomeration [65], of innovation spreading [142], or of financial markets [143], to mention just a few. one can probably say that physicists have been among the pioneers calling for new approaches in economics [81, 87, 143-147]. a particularly visionary book, beside wolfgang weidlich's work, was the "introduction to quantitative aspects of social phenomena" by elliott w. montroll and wade w. badger, which already in 1974 addressed, by mathematical and empirical analysis, subjects as diverse as population dynamics, the arms race, speculation patterns in stock markets, congestion in vehicular traffic, the problems of atmospheric pollution, city growth, and developing countries [148]. unfortunately, it is impossible in our paper to reflect the numerous contributions of the field of econophysics in any adequate way. the richness of scientific contributions is probably reflected best by the econophysics forum run by yi-cheng zhang [149]. many econophysics solutions are interesting, but so far they are not broad and mighty enough to replace the rational agent paradigm with its large body of implications and applications. nevertheless, considering the relatively small number of econophysicists, there have been many promising results. probably the largest fraction of publications in econophysics in recent years has taken a data-driven or computer modeling approach to financial markets [143]. but econophysics has more to offer than the analysis of financial data (such as fluctuations in stock and foreign currency exchange markets), the creation of interaction models for stock markets, or the development of risk management strategies.
other scientists have focused on statistical laws underlying income and wealth distributions, nonlinear market dynamics, macroeconomic production functions and conditions for economic growth or agglomeration, sustainable economic systems, business cycles, microeconomic interaction models, network models, the growth of companies, supply and production systems, logistic and transport networks, or innovation dynamics and diffusion. an overview of subjects is given, for example, by ref. [152] and the contributions to the annual spring workshop of the physics of socio-economic systems division of the dpg [153]. to the dissatisfaction of many econophysicists, the transfer of knowledge often did not work very well or, if so, has not been well recognized [150]. besides scepticism on the side of many economists with regard to novel approaches introduced by "outsiders", the limited resonance and level of interdisciplinary exchange in the past was also partly caused by econophysicists themselves. in many cases, questions have been answered which no economist asked, rather than addressing puzzles economists are interested in. apart from this, econophysics work was not always presented in a way that linked it to the traditions of economics, pointed out deficiencies of existing models, and highlighted the relevance of the new approach well. typical responses are: why has this model been proposed and not another one? why has this simplification been used (e.g. an ising model of interacting spins rather than a rational agent model)? why are existing models not good enough to describe the same facts? what is the relevance of the work compared to previous publications? what practical implications does the finding have? what kind of paradigm shift does the approach imply? can existing models be modified or extended in a way that solves the problem without requiring a paradigm shift?
correspondingly, there have been criticisms not only by mainstream economists, but also by colleagues who are open to new approaches [151]. therefore, we would like to suggest studying the various economic subjects from the perspective of the above-mentioned fundamental challenges, and contrasting econophysics models with traditional economic models, showing that the latter leave out important features. it is important to demonstrate what properties of economic systems cannot be understood for fundamental reasons within the mainstream framework (i.e. cannot be dealt with by additional terms within the modeling class that is conventionally used). in other words, one needs to show why a paradigm shift is unavoidable, and this requires careful argumentation. we are not claiming that this has not been done in the past, but it certainly takes an additional effort to explain the essence of the econophysics approach in the language of economics, particularly as mainstream economics may not always provide suitable terms and frameworks to do this. this is particularly important, as the number of econophysicists is small compared to the number of economists, i.e. a minority wants to convince an established majority. to be taken seriously, one must also demonstrate a solid knowledge of related previous work by economists, to prevent the stereotypical reaction that the subject of the paper has already been studied long ago (tacitly implying that it does not require another paper or model to address what has already been looked at before). a reasonable and promising strategy to address the above fundamental and practical challenges is to set up multi-disciplinary collaborations in order to combine the best of all relevant scientific methods and knowledge. it seems plausible that this will generate better models and higher impact than working in separation, and it will stimulate scientific innovation.
physicists can contribute with their experience in handling large data sets, in creating and simulating mathematical models, in developing useful approximations, and in setting up laboratory experiments and measurement concepts. current research activities in economics do not seem to put enough focus on: • modeling approaches for complex systems [154]. • computational modeling of what is not analytically tractable anymore, e.g. by agent-based models [155] [156] [157]. • testable predictions and their empirical or experimental validation [164]. • managing complexity and systems engineering approaches to identify alternative ways of organizing financial markets and economic systems [91, 93, 165], and • advance testing of the effectiveness, efficiency, safety, and systemic impact (side effects) of innovations before they are implemented in economic systems. this is in sharp contrast to mechanical, electrical, nuclear, chemical and medical drug engineering, for example. expanding the scope of economic thinking and paying more attention to these natural, computer and engineering science aspects will certainly help to address the theoretical and practical challenges posed by economic systems. besides physics, we anticipate that evolutionary biology, ecology, psychology, neuroscience, and artificial intelligence will also be able to make significant contributions to the understanding of the roots of economic problems and how to solve them. in conclusion, there are interesting scientific times ahead. it is a good question whether answering the above list of fundamental challenges will sooner or later solve the practical problems as well. we think this is a precondition, but it takes more, namely the consideration of social factors. in particular, the following questions need to be answered: 8. how do costly punishment, antisocial punishment, and discrimination come about? 9.
how can the formation of social norms and conventions, social roles and socialization, conformity and integration be understood? 10. how do language and culture evolve? 11. how to comprehend the formation of group identity and group dynamics? what are the laws of coalition formation, crowd behavior, and social movements? 12. how to understand social networks, social structure, stratification, organizations and institutions? 13. how do social differentiation, specialization, inequality and segregation come about? 14. how to model deviance and crime, conflicts, violence, and wars? 15. how to understand social exchange, trading, and market dynamics? we think that, despite the large amount of research performed on these subjects, they are still not fully understood. the ultimate goal would be to formulate mathematical models which would allow one to understand these issues as emergent phenomena based on first principles, e.g. as a result of (co-)evolutionary processes. such first principles would be the basic facts of human capabilities and the kinds of interactions resulting from them, namely: 1. birth, death, and reproduction. 2. the need of and competition for resources (such as food and water). 3. the ability to observe the environment (with different senses). 4. the capability to memorize, learn, and imitate. 5. empathy and emotions. 6. signaling and communication abilities. 7. constructive (e.g. tool-making) and destructive (e.g. fighting) abilities. 8. mobility and (limited) carrying capacity. 9. the possibility of social and economic exchange. such features can, in principle, be implemented in agent-based models [158] [159] [160] [161] [162] [163]. computer simulations of many interacting agents would allow one to study the phenomena emerging in the resulting (artificial or) model societies, and to compare them with stylized facts [163, 168, 169]. the main challenge, however, is not to program a seemingly realistic computer game.
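a few of the "first principles" listed above (competition for resources, death, reproduction, imitation) can be wired together in a deliberately tiny agent-based sketch. every parameter below is invented; this is a toy that shows emergent selection of a trait, not a validated scientific model of the kind the text calls for.

```python
# toy agent-based model: resource competition, death, reproduction, imitation
import random

random.seed(42)

class Agent:
    def __init__(self, strategy):
        self.strategy = strategy   # invented trait: per-step consumption
        self.energy = 10.0

def step(agents, food_per_step=50.0):
    share = food_per_step / len(agents)       # competition for resources
    survivors = []
    for a in agents:
        a.energy += share - a.strategy        # intake minus consumption
        if a.energy > 0:                      # death when energy runs out
            survivors.append(a)
    while survivors and len(survivors) < 100: # reproduction with imitation
        survivors.append(Agent(random.choice(survivors).strategy))
    return survivors

agents = [Agent(random.uniform(0.1, 2.0)) for _ in range(100)]
for _ in range(200):
    agents = step(agents)

# selection should favor frugal strategies (consumption below the share 0.5),
# although no agent "knows" this rule -- the pattern emerges
print(sum(a.strategy for a in agents) / len(agents))
```

even in this caricature, a macroscopic regularity (the dominant consumption level) emerges from microscopic rules alone, which is the kind of micro-macro link the text argues serious models should establish and validate.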
we are looking for scientific models, i.e. the underlying assumptions need to be validated, and this requires linking computer simulations with empirical and experimental research [170], and with massive (but privacy-respecting) mining of social interaction data [141]. in the ideal case, there would also be an analytical understanding in the end, as has recently been gained for interactive driver behavior [111]. as well as for inspiring discussions during a visioneer workshop in zurich from january 13.

1. how to understand human decision-making? how to explain deviations from rational choice theory and the decision-theoretical paradoxes? why are people risk averse?
2. how does consciousness and self-consciousness come about?
3. how to understand creativity and innovation?
4. how to explain homophily, i.e. the fact that individuals tend to agglomerate, interact with and imitate similar others?
5. how to explain social influence, collective decision making, opinion dynamics and voting behavior?
6. why do individuals often cooperate in social dilemma situations?
7. how do indirect reciprocity, trust and reputation evolve?

references: 1. how did economists get it so wrong? the financial crisis and the systemic failure of academic economics (dahlem report) address of the governor of the central bank of barbados on the futurict knowledge accelerator: unleashing the power of information for a sustainable future global solutions: costs and benefits economics as religion: from samuelson to chicago and beyond the counter revolution of science the case of flexible exchange rates adaptive behavior and economic theory bounded rationality.
the adaptive toolbox the bounds of reason: game theory and the unification of the behavioral sciences on information efficiency and financial stability simon models of man simple heuristics that make us smart judgment under uncertainty: heuristics and biases physics of risk and uncertainty in quantum decision making a theory of fairness, competition, and cooperation foundations of human sociality: economic experiments and ethnographic evidence from fifteen small-scale societies social distance and other-regarding behavior in dictator games measuring social value orientation economics and restaurant gratuities: determining tip rates the cement of society: a study of social order the rewards of punishment. a relational theory of norm enforcement the emergence of homogeneous norms in heterogeneous populations the emperor's dilemma: a computational model of selfenforcing norms non-explanatory equilibria: an extremely simple game with (mostly) unattainable fixed points the myth of the folktheorem deterministic chaos for example, three-body planetary motion has deterministic chaotic solutions, although it is a problem in classical mechanics, where the equations of motion optimize a lagrangian functional on formally undecidable propositions of principia mathematica and related systems on computable numbers, with an application to the entscheidungsproblem the behavior of stock market prices an inquiry into the nature and causes of the wealth of nations the theory of moral sentiments (1759) evolution of indirect reciprocity revisiting market efficiency: the stock market as a complex adaptive system there is no invisible hand. 
the guardian optimal self-organization traffic and related self-driven many-particle systems from crowd dynamics to crowd safety: a video-based analysis stability and stabilization of time-delay systems network-induced oscillatory behavior in material flow networks and irregular business cycles the tragedy of the commons altruistic punishment in humans switching phenomena in a system with no switches animal spirits: how human psychology drives the economy, and why it matters for global capitalism testing behavioral simulation models by direct experiment a mathematical model for behavioral changes by pair interactions multistability in a dynamic cournot game with three oligopolists evolutionary establishment of moral and double moral standards through spatial interactions abzweigungen einer periodischen lösung von einer stationären lösung eines differentialgleichungssystems dynamic decision behavior and optimal guidance through information services: models and experiments the dynamics of general equilibrium modeling the stylized facts in finance through simple nonlinear adaptive systems predicting moments of crisis in physics and finance" during the workshop "windows to complexity the chemical basis of morphogenesis lectures on nonlinear differential equation-models in biology settlement formation ii: numerical simulation self-organization in space and induced by fluctuations stokes integral of economic growth: calculus and the solow model hysteresis and the european unemployment problem pattern formation and dynamics in nonequilibrium systems critical phenomena in natural sciences: chaos, fractals, selforganization and disorder the science of disasters noise-induced transitions: theory and applications in physics, chemistry, and biology brownian motors: noisy transport far from equilibrium catastrophe theory introduction to phase transitions and critical phenomena managing complexity: an introduction.
pages 1-16 in systemic risks in society and economics systemic risk in a unifying framework for cascading processes on networks how nature works: the science of self-organized criticality why economists failed to predict the financial crisis physics and social science-the approach of synergetics nonlinear economic dynamics the self-organizing economy self-organization of complex structures: from individual to collective dynamics complex economic dynamics sociodynamics: a systematic approach to mathematical modelling in the social sciences business dynamics: systems thinking and modeling for a complex world growth theory, non-linear dynamics and economic modelling complexity hints for economic policy coping with the complexity of economics handbook of research on complexity the rise and decline of nations: economic growth, stagflation, and social rigidities the evolution of cooperation a general theory of equilibrium selection phase transitions to cooperation in the prisoner's dilemma evolutionary games and spatial chaos social diversity promotes the emergence of cooperation in public goods games the outbreak of cooperation among success-driven individuals under noisy conditions governing the commons. the evolution of institutions for collective action topological traps control flow on real networks: the case of coordination failures modelling supply networks and business cycles as unstable transport phenomena the selforganization of matter and the evolution of biological macromolecules the theory of economic development the changing face of economics. 
conversations with cutting edge economists beyond the representative agent economics with heterogeneous interacting agents modeling aggregate behavior and fluctuations in economics collection of papers on an analytical theory of traffic flows in the economy as an evolving complex system ii the economy as an evolving complex system iii evolutionary economics ecology for bankers origin of wealth: evolution, complexity, and the radical remaking of economics introduction to forest ecosystem science and management models of biological pattern formation multi-objective management in freight logistics: increasing capacity, service level and safety with optimization algorithms handbook of research on nature-inspired computing for economics and management self-control of traffic lights and vehicle flows in urban road networks biologistics and the struggle for efficiency: concepts and perspectives analytical calculation of critical perturbation amplitudes and critical densities by non-linear stability analysis of a simple traffic flow model complexity cube for the characterization of complex production systems verfahren zur koordination konkurrierender prozesse oder zur steuerung des transports von mobilen einheiten innerhalb eines netzwerkes (method for coordination of concurrent processes for noise sensitivity of portfolio selection under various risk measures the logic of failure: recognizing and avoiding error in complex situations complexity and the enterprise: the illusion of control nurturing breakthroughs: lessons from complexity theory monetary relationships: a view from threadneedle street. (papers in monetary economics, reserve bank of australia, 1975); for applications of le chatelier's principle to economics see also p. a. 
samuelson, foundations of economic analysis structures of social life: the four elementary forms of human relations understanding and assessing the motivations of volunteers: a functional approach happiness: a revolution in economics the wisdom of crowds: why the many are smarter than the few and how collective wisdom shapes business swarm intelligence. introduction and applications economics as a science of human behaviour: towards a new social science paradigm growth, innovation, scaling and the pace of life in cities uncertainty and surprise in complex systems scale-free networks: a decade and beyond economic networks: the new challenges hyperselection and innovation described by a stochastic model of technological evolution introduction to econophysics: correlations and complexity in finance minority games: interacting agents in financial markets economics needs a scientific revolution the economy needs agent-based modelling meltdown modelling introduction to quantitative aspects of social phenomena econophysics forum fifteen years of econophysics: worries, hopes and prospects worrying trends in econophysics econophysics and sociophysics: trends and perspectives aims and scope of the physics of socio-economic systems division of the german physical society pluralistic modeling of complex systems handbook of computational economics simulation modeling in organizational and management research developing theory through simulation methods understanding complex social dynamics: a plea for cellular automata based modelling platforms and methods for agent-based modeling from factors to actors: computational sociology and agent-based modeling artificial societies. 
key: cord-032413-zbbpfaj4 authors: lu, shasha; koopialipoor, mohammadreza; asteris, panagiotis g.; bahri, maziyar; armaghani, danial jahed title: a novel feature selection approach based on tree models for evaluating the punching shear capacity of steel fiber-reinforced concrete flat slabs date: 2020-09-03 journal: materials (basel) doi: 10.3390/ma13173902 sha: doc_id: 32413 cord_uid: zbbpfaj4 when designing flat slabs made of steel fiber-reinforced concrete (sfrc), it is very important to predict their punching shear capacity accurately. the use of machine learning is a promising way to improve the accuracy of the empirical equations currently used in this field. accordingly, this study utilized tree predictive models (i.e., random forest (rf), random tree (rt), and classification and regression trees (cart)) as well as a novel feature selection (fs) technique to introduce a new model capable of estimating the punching shear capacity of sfrc flat slabs. furthermore, to automatically create the structure of the predictive models, the current study employed a sequential algorithm of the fs model.
in order to perform the training stage for the proposed models, a dataset consisting of 140 samples with six influential components (i.e., the depth of the slab, the effective depth of the slab, the length of the column, the compressive strength of the concrete, the reinforcement ratio, and the fiber volume) was collected from the relevant literature. afterward, the sequential fs models were trained and verified using the above-mentioned database. to evaluate the accuracy of the proposed models for both testing and training datasets, various statistical indices, including the coefficient of determination (r^2) and root mean square error (rmse), were utilized. the results obtained from the experiments indicated that the fs-rt model outperformed the fs-rf and fs-cart models in terms of prediction accuracy. the ranges of the r^2 and rmse values were obtained as 0.9476-0.9831 and 14.4965-24.9310, respectively; in this regard, the fs-rt hybrid technique demonstrated the best performance. it was concluded that the three hybrid techniques proposed in this paper, i.e., fs-rt, fs-rf, and fs-cart, could be applied to predicting the punching shear capacity of sfrc flat slabs. various projects in the field of civil engineering, e.g., residential buildings, office blocks, and parking stations, utilize reinforced concrete flat slabs because two-way cast-in-place concrete slabs offer a cost-effective structural system for engineers as well as architects [1, 2]. various features of reinforced concrete flat slabs, including the flat soffit, can remarkably facilitate the installation of rebar as well as formwork [3]. moreover, these structures can reduce the overall story height. the benefits of reinforced concrete flat slabs have attracted the attention of a large number of researchers studying the response of such structures in both theoretical and experimental studies [4] [5] [6].
according to the available literature, the punching shear capacity of the slab-column connections can be considered as the maximum strength of a reinforced concrete flat slab [1] . on the other hand, compared to the punching load, the residual strength of a slab after punching is significantly lower. therefore, after punching the slab at one of the columns, the neighboring columns can rapidly become overloaded and develop a failure state upon punching. this can result in the escalating breakdown of those buildings where flat slab components are used [1] . there are many building collapses reported in the literature triggered by failure on punching, resulting in deaths as well as significant economic loss. as an example, schousboe [7] reported a case where a 24-story building collapsed during construction in 1973 in virginia. once investigations were completed on the incident, it was reported that the collapse was caused by a shear failure in the slab component utilized on one of the top floors. in addition, king and delatte [8] discussed the breakdown of a building complex with 16 stories in the us because of too low punching shear strength of the flat slab component. to prevent such cases of collapse, a number of recent studies have focused on the failure mechanism of such structures. they have attempted to improve the punching shear capacity of slabs using empirical equations. on the other hand, there has been an increase in the popularity of steel fibers in the field of structural engineering [9] ; thus, these fibers have been used as reinforcement in concrete flat slabs as a means to improve their punching shear capacity [10] [11] [12] . it is also worth mentioning that a number of experimental studies (e.g., [3] ) have confirmed that the punching shear capacity can be improved by reinforcing concrete flat slabs using steel fibers. 
this has resulted in the widespread application of steel fiber-reinforced concrete (sfrc) flat slabs to various construction building-related projects. however, an important issue about the slab-column connection is the fact that the design codes have been established for such structures, currently (e.g., the aci 318-11 standard [13] ). therefore, it is necessary to modify the current codes so that they can adapt to the design process for the sfrc slabs. in this regard, narayanan and darwish [14] proposed an equation based on the compressive zone's strength over inclined cracks, the pull-out shear forces exerted upon the steel fibers in the direction of such cracks, and the shear forces reinforced by membrane actions as a means to determine the punching shear capacity of the sfrc. moreover, harajli et al. [15] proposed a design equation based on linear regression, which can be used for analyzing the contribution of the concrete and fibers to the total punching shear strength. on the other hand, choi et al. [16] presented a theoretical study evaluating the effectiveness of a design equation, which is supported by the assumption related to the response of tensile reinforcement before the occurrence of punching shear failure. additionally, maya et al. [3] attempted to evaluate three different prediction equations applied to calculate the punching shear capacity by acquiring empirical data available in the literature. in another relevant study, gouveia et al. [17] analyzed an experimental investigation focusing on the behavior of the sfrc up to failure. more recently, gouveia et al. [18] carried out another experimental study aimed at evaluating the punching shear capacity of the sfrc slab-column connections. moreover, gouveia et al. [19, 20] implemented experimental studies to assess the load capacity of the sfrc flat slabs subjected to vertical loads incremented in a monotonic manner as well as reversed horizontal cyclic loading. 
another study conducted by kueres et al. [21] investigated the response of reinforced concrete flat slabs as well as column bases to the punching shear force by employing the fracture kinematics of the slabs. kueres and hegger [22] suggested a two-parameter kinematic theory for punching shear in reinforced concrete slabs without shear reinforcement. a new experimental approach was proposed by einpaul et al. [23] to record the creation and progress of cracks in punching test samples. simões et al. [24] analyzed measurements of the kinematics as well as the crack development corresponding to punching failures. the results obtained from this analysis were then utilized for establishing a mechanical model aimed at a better understanding of punching shear failures. a review of the available literature shows that the prediction of the shear punching capacity of sfrc is often concentrated on modified design equations and simple statistical methods. while theoretical prediction models are highly important for assessing the relation between the shear punching capacity of sfrc and the factors affecting it, punching shear behavior is a very complex phenomenon, necessitating the evaluation of other approximation and estimation methods. during the last decades, the lack of adequate and reliable empirical or analytical relations for evaluating the shear punching capacity of flat slabs has attracted the interest of researchers dealing with non-deterministic techniques. in the light of the above discussion, the applications of artificial intelligence (ai) and machine learning (ml) techniques have rapidly grown, and new intelligent models have been developed to solve problems in science and engineering (especially civil engineering). hoang [62] studied a sophisticated data analysis approach for estimating shear punching capacity.
accordingly, plmr (piecewise linear multiple regression) as well as ann (artificial neural network) were chosen in their research as predictive techniques, and then the shear punching capacity of sfrc flat slabs was predicted. furthermore, by training the plmr model, an automatic sequential approach was utilized. hoang's study was focused on the plmr ml approach since its configuration can clearly be explained, depicted, and understood. the plmr was identified as an appropriate instrument for modeling shear punching capacity. on the other hand, the ann model was used by armaghani et al. [31] to determine the shear capacity of concrete beams. by examining laboratory samples and comparing them with classical models, they introduced ann as a model of high flexibility. in another research conducted by asteris et al. [36] , various structures of the ann model were developed. they presented the network weights as a new procedure of modeling. given that the performance of the ai models can be enhanced, asteris et al. [63] improved an ann structure using the normalization technique for predicting the mechanical properties of sandcrete materials. the results showed that the developed model offered a high performance compared to experimental models. due to the development of intelligent models, the capabilities of optimization algorithms were also considered. sun et al. [29] proposed an artificial bee colony algorithm to optimize the developed models. in their study, different concrete samples were optimized and compared with real samples. although ai techniques are able to solve many problems related to science and engineering, they can introduce a new model, which is black-box, and its configuration cannot easily be explained and understood by researchers and engineers [64] [65] [66] [67] [68] [69] [70] [71] [72] . on the other hand, tree-based models can offer broader capabilities in the field of nonlinear problem-solving. 
by developing tree-based models, a tree model can be extracted that is easy to understand and apply [34, 73, 74]. one of the weaknesses of recent research simulating intelligent models is the lack of control over the data that matter for the simulation. better results can be achieved if these data are managed appropriately, and the simulation process can be optimized by selecting superior features. in this research, to determine the prediction model with the best performance, the tree-based models were compared to each other. the data used in different studies have various properties, each of which can change the simulation process. because recent research has focused less on such data, this paper implemented a new process that selects important features from the data. this process ultimately helped the intelligent models to increase their performance. in order to train and test the above-mentioned ml models, a dataset consisting of 140 experimental data specimens was acquired from the available literature. this dataset included six explanatory variables, i.e., the depth of the slab, the effective depth of the slab, the length of the column, the compressive strength of the concrete, the reinforcement ratio, and the fiber volume. these variables were used to forecast the punching shear capacity of the sfrc flat slabs. the rest of the current paper is organized as follows: section 2 deals with the formulation of the study. section 3 discusses the iterative process performed to identify the structure of the tree-based models. section 4 presents the results of the developed models. finally, section 5 concludes the whole study. it is possible to estimate the punching shear strength of sfrc using the mechanical model of the csct (i.e., the critical shear crack theory). in the following, the punching shear strength of sfrc with and without transverse reinforcement is discussed [75, 76].
the punching shear strength for reinforced concrete slabs that do not have transverse reinforcement can be expressed as [76]:

v_{r,c} = \frac{3}{4} \, \frac{b_0 \, d \, \sqrt{f_c}}{1 + 15 \, \psi d / (d_{g0} + d_g)}

where ψ signifies the maximal rotation of the slab, d denotes the effective depth of the slab, b_0 signifies the control perimeter (which is considered at a distance of d/2 from the face of the column), f_c denotes the compressive strength of the concrete, d_g signifies the aggregate size, and d_{g0} is the reference aggregate size that can be set to 16 mm. along the failure surface, which is determined by the critical shear crack, the overall shear strength collects the contributions from the concrete and the steel fibers [77]. the following equation can be utilized for calculating the punching shear strength:

v_r = v_{r,c} + v_{r,f}

where v_{r,c} represents the contribution from the concrete, and v_{r,f} signifies the contribution from the fibers. moreover, voo and foster [78] presented a formulation for quantifying the tensile strength generated by the fibers over a plane with unit area. this equation can be expressed as:

\sigma_{tf} = k_f \, q_f \, s_b \, a_f

where k_f signifies the global orientation factor, q_f denotes the volume of the fiber, s_b is the bond stress between the fibers and the concrete mix, and a_f represents the aspect ratio parameter for the steel fibers. based on this equation, the overall punching shear contribution from the fibers can be computed as follows [3]:

v_{r,f} = \int_{a_p} \sigma_{tf}(\xi) \, da_p

where ξ signifies the distance between a point and the soffit of the slab, and a_p represents the horizontally projected area of the punching shear failure surface. it is worth mentioning that the integration in this equation makes it possible to reach a closed-form solution in order to calculate the contribution from the fiber [77]. moreover, according to the notion of the average bridging stress as well as the kinematic assumption [77], it is possible to calculate the contribution from the fiber by evaluating the bridging stress directly over a_p [3]. using the above equation, maya et al.
[3] proposed a simplified equation for calculating the contribution from the concrete. their equation is presented as follows: in this equation, γ c represents the partial safety factor for the concrete, which is equal to 1.5. a dataset consisting of 140 test specimens as well as six factors governing the punching capacity, i.e., the depth of the slab (h), the effective depth of the slab (d), the length of the column (bc), the compressive strength of the concrete (f c ), the reinforcement ratio (ρ), and the fiber volume (ρ f ), was collected for training and testing the ml models. the main influential parameters for the punching capacity samples included geometry dimensions and materials used. the geometry of the samples used in this research included the range of h = 55-180 mm, d = 39-150 mm, bc = 60-225. the data points in this dataset were acquired from the experimental studies available in the literature [12, 14, 15, [79] [80] [81] [82] [83] [84] [85] [86] [87] [88] [89] [90] [91] . table 1 presents the statistical descriptions of the variables in this dataset to predict the punching shear strength (v). moreover, figure 1 depicts the histograms for the output and the input parameters. as shown in figure 1 , the two parameters µ and σ 2 represent the mean of data and variance of the data, respectively. data quality can be determined using the normal distribution implemented for input and output data. note that the data applied to predictive models should be selected with high accuracy. in 2001, breiman [92] developed the random forest (rf) model that is a non-parametric ensemble classifier based on the flexible decision tree algorithm. this approach is, in fact, an expansion over the classification and regression tree, consisting of hybridization of numerous trees where bootstrap samples are employed to generate each individual tree [93, 94] . 
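the concrete contribution described above can be evaluated numerically. the sketch below assumes the commonly published form of the csct failure criterion (with the constant 3/4 and the reference aggregate size d_g0 = 16 mm); the function name and the numerical inputs are illustrative assumptions, not values from the paper's dataset.

```python
import math

def csct_vrc(psi, d, b0, fc, dg, dg0=16.0):
    """concrete contribution to the punching shear strength of a slab
    without transverse reinforcement, per the csct failure criterion.
    assumed units: lengths in mm, fc in mpa; returns a force in n."""
    return 0.75 * b0 * d * math.sqrt(fc) / (1.0 + 15.0 * psi * d / (dg0 + dg))

# illustrative slab: d = 120 mm, control perimeter b0 = 1200 mm, fc = 30 mpa
v_small_rot = csct_vrc(psi=0.005, d=120.0, b0=1200.0, fc=30.0, dg=16.0)
v_large_rot = csct_vrc(psi=0.020, d=120.0, b0=1200.0, fc=30.0, dg=16.0)
```

as the criterion implies, the predicted strength decreases as the slab rotation ψ grows, since a wider critical shear crack transfers less shear.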
in this approach, the algorithm employed for constructing the model automatically selects random parts of the training data. moreover, in the training process, each branch of the tree at each node will be determined by a randomized subset of the variables. furthermore, each individual tree is expanded in order to minimize the classification error; however, the result is affected by the random selection. the main objective of rf is to determine to what extent the prediction error increments as the data output for specific variables is permutated. therefore, this approach is able to identify the significance of each variable when all variables are controlled [95, 96] . the classification and regression trees (cart) approach is a non-parametric regression method that is among the most widely-used ml approaches [93] . this approach is highly flexible since it can utilize any type of numeric and binary data, while the result is not affected by the monotone transformations and various measurement scales [97] . to construct the decision trees in cart, a binary partitioning algorithm is often used [98] . furthermore, in order to deal with missing data in a specific factor, regression trees are usually utilized through the replacement process [93] . in this approach, in order to prevent the overfitting of terminal nodes in the tree, the splits are recursively snipped [93] . the cart method applies the equation below to classification problems; this method is based on comparing the distribution of the target attribute with two child nodes: where k signifies the target classes, pl(k) is the probability distribution target on the left side of the nodes, pr(k) is the probability distribution target on the right side of the nodes, and u is the penalty on the splits [99] . the random tree (rt) approach is a supervised classification model, which was first proposed by breiman [92] . similar to rf, rt is based on ensemble learning. 
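the bootstrap-aggregation mechanism underlying rf (and rt) can be illustrated with a minimal, self-contained sketch; the depth-1 "stump" learner and the toy one-dimensional data below are simplifying assumptions for illustration only, not the models used in the paper.

```python
import random

def fit_stump(xs, ys):
    """fit a depth-1 regression tree: choose the split threshold
    minimizing the sum of squared errors of the two leaf means."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    if best is None:  # degenerate bootstrap sample: fall back to the mean
        m = sum(ys) / len(ys)
        return lambda x: m
    _, t, ml, mr = best
    return lambda x: ml if x <= t else mr

def bagged_trees(xs, ys, n_trees=10, seed=0):
    """train each learner on a bootstrap sample; predict by averaging."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sampling with replacement
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(tree(x) for tree in trees) / len(trees)

# toy step-like data: the response jumps between x = 4 and x = 5
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.0, 1.1, 0.9, 1.0, 3.0, 3.1, 2.9, 3.0]
model = bagged_trees(xs, ys)
```

averaging the bootstrap-trained stumps recovers the low plateau to the left of the jump and the high plateau to the right, which is the essence of the ensemble idea described above.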
in the rt approach, there are a number of learners that operate independently. the notion of bagging is applied to constructing a decision tree, and it provides a randomly selected set of samples. the main difference between a standard tree and rf involves the splitting of the nodes. in rf, this splitting is performed based on the best predictor among a random subset of predictors; by contrast, a standard tree uses the best split among all variables. rt is an ensemble of tree predictors (i.e., a forest), and it can deal with regression as well as classification applications. when the rt algorithm is executed, the tree classifier receives the input data, and all the available trees classify the inputs. finally, the class with the highest frequency is output by the system. since the training error is computed internally, cross-validation or bootstraps are not required for estimating the accuracy of the training stage. it is worth mentioning that the output for regression problems is calculated by taking the average of the responses of all the forest members [100]. moreover, the error for this approach is calculated as the ratio of misclassified vectors to all the vectors present in the original dataset. this section compares the performance of the proposed tree models with the performance of the feature selection (fs) hybrid models. it should be noted that repeated random subsampling, consisting of 20 training and testing runs, was carried out in order to assess the prediction performance of the models in a reliable manner. for each of the runs, 70 percent of the available data was employed as the training data for the model estimating the punching shear strength, while the remaining 30 percent of the data was employed to test the model [101] [102] [103].
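the repeated random subsampling protocol just described (20 independent 70/30 splits) can be sketched as follows; the helper name and the fixed seed are assumptions for illustration.

```python
import random

def subsampling_splits(n_samples, n_runs=20, train_frac=0.7, seed=42):
    """generate (train_idx, test_idx) index pairs for repeated random
    subsampling: each run reshuffles the indices and re-splits the data."""
    rng = random.Random(seed)
    splits = []
    for _ in range(n_runs):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        cut = int(round(train_frac * n_samples))
        splits.append((idx[:cut], idx[cut:]))
    return splits

# the paper's dataset has 140 specimens: 98 for training, 42 for testing
splits = subsampling_splits(140)
train_idx, test_idx = splits[0]
```

averaging the evaluation indices over the 20 runs, as done in the paper, reduces the sensitivity of the reported performance to any single lucky or unlucky split.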
in addition, along with the rmse (root mean square error), the mape (mean absolute percentage error), the mae (mean absolute error), and the r^2 (coefficient of determination) were used to assess the performance of the model [104] [105] [106]. these performance evaluation indices are calculated using the following equations:

rmse = \sqrt{\frac{1}{n_d} \sum_{i=1}^{n_d} (y_{a,i} - y_{p,i})^2}

mae = \frac{1}{n_d} \sum_{i=1}^{n_d} |y_{a,i} - y_{p,i}|

mape = \frac{100}{n_d} \sum_{i=1}^{n_d} \left| \frac{y_{a,i} - y_{p,i}}{y_{a,i}} \right|

r^2 = 1 - \frac{sse}{ss_{yy}}

where y_{a,i} indicates the actual value of the punching shear capacity of the i-th instance, while y_{p,i} represents the predicted value for the same parameter, and n_d signifies the number of data instances in the selected dataset. to compute the value of r^2, the values for ss_{yy} and sse must be calculated. these are obtained using the following equations:

ss_{yy} = \sum_{i=1}^{n_d} (y_{a,i} - y_{a,m})^2, \qquad sse = \sum_{i=1}^{n_d} (y_{a,i} - y_{p,i})^2

where y_{a,m} signifies the mean of the actual punching shear capacity. figure 2 presents the overall methodology followed in this research. the simulation of different methods requires an examination of the main parameters to control the quality of the results. in this part, the modeling process of the rf technique is described. since this model is based on the use of a tree, its important and effective parameters are similar to those of the base model. initially, the data collected from the literature were divided into two parts: training and testing. proper partitioning affects the simulation results, the flexibility of the models, and their accuracy. since about 70-80% of the data is commonly recommended for training purposes, 70% of the data was allocated to training in this study. two parameters, i.e., the number of trees and the tree depth, were identified as the two main hyperparameters of the rf model. various models were implemented to design the appropriate structure for the prediction of the punching shear capacity of sfrc flat slabs so that the best conditions could be evaluated. figure 3 shows the variations of models made based on the number of trees.
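the four indices above can be computed directly from their definitions; the sketch below is a straightforward stdlib implementation (it assumes no actual value is zero, so that the mape is well defined).

```python
import math

def rmse(ya, yp):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(ya, yp)) / len(ya))

def mae(ya, yp):
    return sum(abs(a - p) for a, p in zip(ya, yp)) / len(ya)

def mape(ya, yp):
    # expressed as a percentage; undefined if any actual value is zero
    return 100.0 / len(ya) * sum(abs((a - p) / a) for a, p in zip(ya, yp))

def r2(ya, yp):
    ya_m = sum(ya) / len(ya)                          # mean of actual values
    ss_yy = sum((a - ya_m) ** 2 for a in ya)          # total variation
    sse = sum((a - p) ** 2 for a, p in zip(ya, yp))   # residual variation
    return 1.0 - sse / ss_yy

# small worked check: every prediction is off by exactly 1.0
ya, yp = [2.0, 2.0, 4.0, 4.0], [1.0, 1.0, 3.0, 3.0]
```

for this check, rmse(ya, yp) and mae(ya, yp) both equal 1.0, mape(ya, yp) equals 37.5, and r2(ya, yp) equals 0.0, because the residual variation happens to match the total variation exactly.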
to compare the performance of the models in this section, two statistical indices, mae and r^2, were used. the number of trees for this problem ranged from 1 to 10. as shown in this figure, the model variants generally provide an accuracy of r^2 = 0.82-0.9. this indicates that the number of trees has a considerable impact on model performance. furthermore, the mae values confirm this, decreasing as model accuracy increases. although the main part of the rf model design is governed by the number of trees, the tree depth parameter also has an important effect on the structure of the proposed model. the depth of the tree effectively allows each tree to grow to a specific size. as shown in figure 4, the range of variation with tree depth is smaller, and it shrinks further as the depth approaches 10. figure 5 shows one of the suitable structures for this model. according to this figure, the number of trees and their depth can be inferred from the way the trees are grown. this model was used as one of the tree models in this study for the prediction of the punching shear capacity of sfrc flat slabs. owing to the close similarity to the basic tree structure, the different variants of this model are also influenced by the base model. ninety-eight samples (70% of the total data) were allocated for designing and training the cart model. the number of inputs of this model (set to six) is in accordance with table 1. the cart model attempted to obtain the best configuration through sensitivity analysis. under the same conditions as the other tree models, tree depth was evaluated as the most effective parameter. figure 6 shows the effect of the different changes across the models built. at the same time, it can be seen that the cart model exhibits smaller variations than the rf model and much higher accuracy.
this suggests that it can be used as a superior model in predicting the punching shear capacity of sfrc flat slabs. the accuracy of this model can be achieved up to about r 2 = 0.95, with changes to tree depth. finally, the third model was implemented to predict the punching shear capacity of the sfrc flat slabs. this model also has different parameters, the most important of which are the number of models and the depth of the tree for each model. initially, different models were designed so that the effect of each parameter could be well-identified. figures 7 and 8 show the results of the number of models and the depth of the tree, respectively. in general, from these two diagrams, it can be concluded that unlike the previous models, here, compared to the depth of the tree, the number of models has a greater impact on the simulation. in addition, it can be noted that this model offers a higher capability than the rf model, and the results are in a more acceptable range. under these conditions, this model can be introduced as one of the highly-accurate predictive methods applicable to estimating the punching shear capacity of sfrc flat slabs. in the previous sections, basic models were implemented to simulate and evaluate the punching shear capacity of sfrc flat slabs. the effects of their various parameters were examined, and finally, with more knowledge, the power of each model could be obtained. because the models were trained without considering any changes in the data, they were used as baseline models for this research. in various simulations, the number of input parameters and data quality have significant effects on the results. therefore, it is important to be focused on selecting outstanding and effective features, and then we can come up with high prediction performance models. this research implemented a process for designing an appropriate fs model. 
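as a simplified stand-in for the statistical screening that the fs step performs, the sketch below ranks candidate input columns by their squared pearson correlation with the target; this is an illustrative assumption, not the paper's actual fs procedure, which combines several statistical tests.

```python
import math

def pearson(x, y):
    """sample pearson correlation coefficient between two sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def rank_features(columns, y):
    """return column indices ordered from most to least correlated
    (by squared correlation) with the target."""
    scores = [(pearson(col, y) ** 2, j) for j, col in enumerate(columns)]
    return [j for _, j in sorted(scores, reverse=True)]

# toy data: column 0 tracks the target almost linearly, column 1 is noise-like
y = [1.0, 2.0, 3.0, 4.0, 5.0]
columns = [[1.1, 1.9, 3.2, 3.8, 5.1],
           [0.3, -0.2, 0.1, -0.4, 0.2]]
order = rank_features(columns, y)  # the informative column is ranked first
```

ranking (or thresholding) such scores before training is one simple way that a feature-selection front end can hand a reduced, more informative input set to the downstream tree models.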
the process is such that the base models from the previous section were selected as comparative models, and models built on them then applied the fs conditions. this process continues until it reaches a stable state and higher accuracy. the fs procedures used in this computational loop include various statistical tests to determine the superior features of the data, such as different statistical distributions, different computational dimensions, weighting parameters, data outlier criteria, etc. finally, the hybrid models designed on this basis were tested to assess their flexibility. the results of the base and hybrid models are given in table 2. as can be seen in this table, the hybrid models offer higher quality and accuracy in both training and testing modes. the fs-rf, fs-cart, and fs-rt models obtained accuracies of r^2 = 0.9476, 0.9608, and 0.9831 for the training data, respectively. the same models provided accuracies of r^2 = 0.9190, 0.9454, and 0.9581 for the testing data, respectively. these results show that they are more powerful than the base models. by examining the errors of the models, it can be concluded that the errors have been reduced to an acceptable level compared to the initial (i.e., single) models. therefore, in this study, the hybrid models are found capable of predicting the punching shear capacity of sfrc flat slabs. since the criteria for comparing the models are important, the gain criterion was used for this purpose. gain is a measure of the effectiveness of the developed models over the original data. this criterion shows how acceptable the performance of the models is: the more area the curve covers, the higher the quality. figure 9 presents this criterion for the training and testing stages. the hybrid models generally provide more accurate predictions; however, the fs-rt model was found to be better than the other models in both the training and testing stages.
in order to have a better understanding regarding the developed models, figure 10 shows the correlations between measured and predicted punching shear capacity of the sfrc flat slabs by hybrid predictive models for both training and testing stages. based on the results obtained in this study, it can be concluded that both single and hybrid ml techniques presented in this study are able to provide an acceptable accuracy level for the prediction of punching shear capacity of the sfrc flat slabs. however, if higher accuracy is of interest and necessary, the hybrid models can be utilized. the hybrid models are considered as powerful, practical, and easy to use models in estimating the punching shear capacity of sfrc flat slabs. the current study introduced a number of ml techniques in order to estimate the punching shear capacity of sfrc flat slabs. the proposed techniques were applied to obtain an approximation of the mapping function between six descriptive variables (the depth of the slab, the effective depth of the slab, the length or radius of the column, the compressive strength of the concrete, the reinforcement ratio, and the fiber volume) and the output variable, i.e., the punching shear capacity. moreover, the fs approach was implemented along with the tree models, i.e., rf, cart, and rt, for introducing new hybrid models. the results obtained from the methods indicated that the rt and cart models provided the best performance in predicting the punching shear capacity of the sfrc flat slabs. on the other hand, it was revealed that combining base models with fs could improve the accuracy of the results. findings confirmed that the hybridized fs-rt method could be a promising tool for designing sfrc flat slabs. future studies can evaluate other acceptance criteria for the breaking point as well as other sophisticated methods for preventing the overfitting of the fs-rt model. 
Moreover, the novel prediction models can be evaluated on an extended dataset built from the experiments carried out in recent studies. Finally, the proposed FS-based hybrid methods also appear applicable to other civil engineering problems.
key: cord-016045-od0fr8l0 authors: liu, ming; cao, jie; liang, jing; chen, mingjun title: epidemic-logistics network considering time windows and service level date: 2019-10-04 journal: epidemic-logistics modeling: a new perspective on operations research doi: 10.1007/978-981-13-9353-2_13 sha: doc_id: 16045 cord_uid: od0fr8l0 In this chapter, we present two optimization models for optimizing the epidemic-logistics network. In the first one, we formulate the problem of emergency materials distribution with time windows as a multiple traveling salesman problem (MTSP). Knowledge of graph theory is used to transform the MTSP into a TSP, and the resulting TSP route is analyzed and proved theoretically to be the optimal Hamilton route. Besides, a new hybrid genetic algorithm is designed for solving the problem. In the second one, we propose an improved location-allocation model with an emphasis on maximizing the emergency service level. We formulate the problem as a mixed-integer nonlinear programming model and develop an effective algorithm to solve it.
Bioterrorism-related emergencies have occurred repeatedly in recent decades: Marburg hemorrhagic fever in Angola, SARS in China, the anthrax mail attacks in the USA, Ebola in Congo, smallpox, and so on. Bioterrorism threats are realistic, and they have a huge influence on social stability, economic development, and human health. Without question, today's world is a world of risk, filled with threats both natural and man-made. Economy is usually the most important factor in a normal materials distribution network; timeliness, however, is much more important in an emergency materials distribution network. To build a timely emergency logistics network, a scientific and rational emergency materials distribution system should be constructed to shorten emergency rescue routes and reduce economic loss. In the 1990s, the USA invested heavily to build and improve its public health emergency warning and defense system, aiming to counter potential biological, chemical, and radiological terrorist attacks. The Metropolitan Medical Response System (MMRS) is one of its important parts; it played a crucial role after the 9/11 attacks, delivering 50 tons of medical materials to New York within 7 h [1]. In October 2001, in response to the anthrax mail attacks, the federal medicine reserve delivered a great deal of medical materials to local health departments [2]. Khan et al. [3] argued that the key challenge of anti-bioterrorism is that people do not know when, where, and in which way an attack will occur; all they can do is use vaccines, antibiotics, and medicines for treatment after the disaster has happened.
For this reason, Venkatesh and Memish [4] noted that what a country most needs to do is check its preparedness for bioterrorism attacks, especially the completeness of its emergency logistics network, which includes the reserve and distribution of emergency rescue materials and the emergency response capability. Other research related to anti-bioterrorism response can be found in Kaplan et al. [5]. Emergency materials distribution is one of the major activities in anti-bioterrorism response. The emergency materials distribution network is driven by the biological virus diffusion network, and thus has a different structure from a general logistics network. A quick response to emergency demand after a bioterrorism attack, through efficient emergency logistics distribution, is vital to alleviating the disaster's impact on the affected areas, and it remains a challenge in the field of logistics and related study areas [6]. The importance of logistics management in the transportation of rescue materials was first proposed in the work of Cook and Stephenson [7]. Ray [8] and Rathi et al. [9] studied emergency rescue materials transportation with the aim of minimizing transportation cost under different constraints. A relaxed VRP problem was formulated as an integer programming model and proved to be NP-hard in Dror et al. [10]. Other scholars have also carried out much research on emergency materials distribution models, such as Fiedrich et al. [11], Ozdamar et al. [12], and Tzeng et al. [13]. In the actual process of emergency materials distribution, the emergency command center (ECC) typically supplies the emergency materials demand points (EMDPs) in groups, based on the vehicles available. Moreover, routes do not repeat, so that every demand point receives the emergency materials as fast as possible. To the best of our knowledge, this is common practice in China.
Under the assumption that every demand point is satisfied after a single replenishment, the problem becomes a multiple traveling salesman problem (MTSP) with a fixed origin. In the work of Bektas [14], the author gave a detailed literature review on the MTSP, covering both models and algorithms. Malik et al. [15] and Carter and Ragsdale [16] present further results on solving the MTSP. To summarize, our model differs from past research in at least three aspects. First, natural disasters such as earthquakes, typhoons, and floods were usually used as the background or numerical simulation in past research; such disasters can disrupt traffic and lifeline systems and obstruct the operation of rescue machines, rescue vehicles, and ambulances. The situation in an anti-bioterrorism system is different: traffic remains normal, and the epidemic can be controlled with vaccination or contact isolation. Second, to the best of our knowledge, this is the first time that a biological epidemic model and an emergency materials distribution model have been combined, under the assumption that the emergency logistics network is driven by the biological virus diffusion network; it therefore has a different structure from a general logistics network. Third, the new hybrid genetic algorithm designed and applied in this study differs from traditional approaches: we improve the two-part chromosome proposed by Carter and Ragsdale [16] and customize the ordering, crossover, and mutation functions, which allows the optimal result to be found effectively. Although the rule of virus diffusion is not the emphasis of our research, it is a necessary component when characterizing the emergency demand. Figure 13.1 illustrates the SIR epidemic model with natural birth and death of the population, from which we obtain the following formulas.
dS/dt = bN − βSI − dS, dI/dt = βSI − (d + α + γ)I, dR/dt = γI − dR, where S, I, and R represent the numbers of susceptible, infective, and recovered people, respectively, and N = S + I + R. b and d stand for the natural birth and death coefficients, α is the death coefficient for the disease, β is the proportion coefficient from S to I in unit time, and γ is the proportion coefficient from I to R. Note that the numbers of susceptible and infective people are obtained by computer simulation with the other parameter values preset; the emergency materials demanded at each point can then be calculated from the number of sick people. Figure 13.2 shows the roadway network of a city in south China, where the numbers beside the roads are the lengths of the sections (unit: km). Point O is the ECC and nodes 1-32 are the EMDPs. Some emergency materials have arrived at the ECC by air transport and must be sent to each demand point as fast as possible. Assuming that the EMDPs are divided into 4 groups and that every demand point is satisfied after a single replenishment, the problem becomes an MTSP with a fixed origin; however, the time-windows constraint was not yet considered. In this study, we use the new hybrid GA to solve the MTSP with time windows. Using the SIR epidemic model of Sect. 13.2, the numbers of susceptible and infective people can be forecast before emergency distribution. The symbol t_i denotes the time consumed at demand point i, i = 1, 2, ..., 32. We assume it has a simple linear relationship with the number of infective people, t_i = I_i / v_vac, where I_i is the number of infective people at demand point i and v_vac is the average speed of vaccination. A further assumption is that vehicle speed, denoted v, is the same on every roadway section of the network.
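The SIR dynamics and the service-time rule above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it integrates the ODEs with forward Euler, uses the chapter's numerical parameters, and assumes an illustrative vaccination speed v_vac = 200 people/hour.

```python
# Sketch: forward-Euler simulation of the SIR model with vital dynamics,
# followed by the service time t_i = I_i / v_vac at a demand point.
# Parameter values follow the chapter's numerical test; v_vac is an assumption.

def simulate_sir(s0, i0, r0, b, d, alpha, beta, gamma, days, steps_per_day=100):
    """Return (S, I, R) after `days` days of forward-Euler integration."""
    s, i, r = float(s0), float(i0), float(r0)
    h = 1.0 / steps_per_day
    for _ in range(days * steps_per_day):
        n = s + i + r
        ds = b * n - beta * s * i - d * s          # births minus infections minus deaths
        di = beta * s * i - (d + alpha + gamma) * i  # infections minus removals
        dr = gamma * i - d * r                       # recoveries minus deaths
        s, i, r = s + h * ds, i + h * di, r + h * dr
    return s, i, r

def service_time(i_i, v_vac):
    """Time consumed at demand point i: infective count over vaccination speed."""
    return i_i / v_vac

s, i, r = simulate_sir(10000, 100, 0, b=1e-5, d=1e-5,
                       alpha=0.01, beta=1e-5, gamma=0.03, days=5)
t = service_time(i, v_vac=200.0)   # v_vac = 200 vaccinations/hour is assumed
```

With β·S ≈ 0.1 greater than d + α + γ ≈ 0.04, the infective count grows at day 5, so the service time at each EMDP grows with it, which is what drives the grouping decisions below.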
Given the above analysis, the question of this study is: based on the epidemic model analysis, how can we distribute the emergency materials to all EMDPs under a time-windows constraint? Into how many groups should the EMDPs be divided? And how do we obtain the optimal routes? The objective function model can be formulated as follows, where x_ij = 1 means that the emergency materials are delivered to point j immediately after point i, and x_ij = 0 otherwise; s_ij denotes the shortest route between points i and j; n is the number of distribution groups; T_k is the time consumed by group k; and T_tw is the time window. Equations (13.4) and (13.5) are the grouping constraints, (13.6) and (13.7) ensure that each demand point is supplied exactly once, and (13.8) ensures that there is no sub-loop in the optimal route (summing x_ij over i ∉ S, j ∈ S for every proper subset S of demand points). Equation (13.9) is the time-windows constraint, and (13.10) is the parameter specification. The hybrid genetic algorithm is presented as follows.
Step 1: use the SIR epidemic model of Sect. 13.2 to forecast the numbers of susceptible and infective people, and then determine the emergency distribution time at each EMDP.
Step 2: generate the original population according to the coding rule.
Step 3: apply the custom ordering function to optimize the original population so that the new population carries finer sequence information.
Step 4: check whether the results satisfy constraints (13.4)-(13.10) of the model; if yes, go to the next step, otherwise delete the chromosome.
Step 5: evaluate the fitness value of the new population with the fitness function.
Step 6: copy the population with an elitist policy, under which the worst individual is discarded and the best one is duplicated.
Step 7: apply the custom crossover function to the population.
Step 8: apply the custom mutation function to the population.
Step 9: repeat steps 3-8 until the termination condition is satisfied.
Step 10: the new hybrid genetic algorithm yields 10 approximately optimal routes, from which the best equilibrium solution is selected by a local search algorithm.
To evaluate the practical efficiency of the proposed methodology, the parameters of the SIR epidemic model are set as b = d = 10^-5, β = 10^-5, α = 0.01, and γ = 0.03, with initial values S = 10,000 and I = 100. The resulting figures show how the fitness and the route length vary with the number of iterations of the new hybrid GA: each group converges effectively, and 10 approximately optimal routes are obtained. A comparison of the 10 routes is illustrated in Fig. 13.7, and the best equilibrium solution for emergency materials distribution is shown in Fig. 13.8. As Figs. 13.6 and 13.7 show, although the route of group 9 is the shortest, it is not the best equilibrium solution; some demand points would be supplied immediately while others would wait for a long time, which is not the objective we pursue.
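The shortest routes s_ij that the model and the GA evaluate against can be precomputed on the roadway graph with Dijkstra's algorithm, as the chapter notes. A minimal sketch follows; the four-node toy network is an assumption for illustration, not the 33-node network of Fig. 13.2, and node 0 stands in for the ECC.

```python
# Sketch: precomputing the shortest-route matrix s_ij on a roadway graph
# with Dijkstra's algorithm. The toy graph below is assumed; edge lengths in km.
import heapq

def dijkstra(graph, source):
    """Shortest distances from `source` in an undirected weighted graph
    given as {node: [(neighbor, length), ...]}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue  # stale entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Toy network: node 0 is the ECC, nodes 1-3 are EMDPs (assumed layout).
roads = {
    0: [(1, 4.0), (2, 2.0)],
    1: [(0, 4.0), (2, 1.0), (3, 5.0)],
    2: [(0, 2.0), (1, 1.0), (3, 8.0)],
    3: [(1, 5.0), (2, 8.0)],
}
s_matrix = {u: dijkstra(roads, u) for u in roads}  # s_matrix[i][j] plays the role of s_ij
```

Because the matrix is recomputed from whatever graph is supplied, removing a disrupted road section from `roads` and rerunning the precomputation is enough to keep the GA's route evaluations valid, which matches the robustness argument made at the end of this section.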
From Fig. 13.7, the inside deviation of group 7 is the minimum, which means the route of group 7 is the best equilibrium solution, although it is not the shortest route; in other words, all demand points can be supplied with the smallest possible time difference. It is also worth pointing out that group 10 is suboptimal to group 7 and can serve as a candidate choice for the commander in an emergency environment. In fact, the results above are idealized: we only considered emergency materials distribution at the beginning of the virus diffusion (day = 5) and assumed that all EMDPs are in the same situation, which is unrealistic. Each preset parameter strongly affects the final result; some of them are discussed below. (1) Time consumed with different initial sizes of S. There are 32 EMDPs in the distribution network, and each has a different number of susceptible people, assumed to range from 10,000 to 50,000. With the SIR epidemic model of Sect. 13.1.2, different initial numbers of susceptible people lead to different numbers of infective people, and hence the time consumed at the EMDPs varies. Figure 13.9 illustrates the time consumed at one EMDP for different initial sizes of S as the days increase. There is almost no difference among the scenarios in the first 30 days (one month), but the differences become pronounced afterwards: the larger the initial size of S, the faster the time consumed grows. In Sect. 13.1.4, S = 10,000 was taken for each EMDP and the time consumed was almost always below 1 h, a simple situation in which the optimal route with time windows is easily depicted. As the initial size of S increases, satisfying the time-window constraint becomes much harder, and the distribution routes must be divided into more groups.
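The selection of the "best equilibrium solution" described above can be sketched as a small decision rule: among candidate groupings that respect the time window, pick the one whose per-group completion times have the smallest inside deviation. The candidate numbers below are illustrative, not results from the chapter.

```python
# Sketch: choosing the best equilibrium grouping. A grouping is feasible if every
# group finishes within the time window T_tw; among feasible groupings we minimize
# the inside deviation (spread) of the group completion times. Numbers are assumed.

def inside_deviation(group_times):
    """Spread between the slowest and fastest group."""
    return max(group_times) - min(group_times)

def best_equilibrium(candidates, t_tw):
    """candidates: list of lists of per-group completion times T_k (hours)."""
    feasible = [c for c in candidates if max(c) <= t_tw]
    return min(feasible, key=inside_deviation) if feasible else None

candidates = [
    [3.0, 7.5, 4.0, 6.0],   # shortest overall, but unbalanced (like group 9)
    [5.0, 5.5, 5.2, 5.4],   # slightly longer, well balanced (like group 7)
    [9.0, 4.0, 4.5, 4.2],   # violates an 8-hour window
]
chosen = best_equilibrium(candidates, t_tw=8.0)
```

The rule mirrors the discussion above: the balanced grouping wins even though another candidate has shorter individual routes, because no demand point is left waiting much longer than the others.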
(2) Time consumed with different initial sizes of I. As mentioned before, each EMDP also has a different number of infective people, assumed to range from 50 to 200. Figure 13.10 illustrates the time consumed at one EMDP for different initial sizes of I as the days increase. Again, the time consumed grows smoothly in the first 30 days, but the differences become pronounced afterwards: the larger the initial size of I, the faster the time consumed grows. An interesting result is that, as I varies from 50 to 200, the differences among the scenarios are not very large, and the time consumed stays around 1 h; in other words, the model of Sect. 13.1.3 is still serviceable and the grouping design need not change. (3) Time consumed with different initial sizes of β. β is one of the most important parameters of the SIR epidemic model: it directly affects the number of infective people in the population, and hence the time consumed at each EMDP. Varying β from 10^-5 to 5 × 10^-5, we obtain the time consumed shown in Fig. 13.11. Once more, the time consumed in the first 30 days is similar across scenarios, but the differences become pronounced afterwards: the larger β is, the faster the time consumed grows, so the distribution groups should be adjusted to satisfy the time windows. Based on the above analysis, the time consumed in the first 30 days always stays at a low level. This is important information for emergency relief in the anti-bioterrorism system: the earlier the emergency materials are distributed, the less the outcome is affected by parameter variation. It also explains why emergency relief activities are always most effective at the beginning.
This study investigated an emergency materials distribution problem with MTSPTW characteristics in the anti-bioterrorism system, and the best equilibrium solution was obtained by the new hybrid GA. Modeling the MTSP with the proposed new two-part chromosome has clear advantages over both the existing one-chromosome and two-chromosome methods. Moreover, combined with the SIR epidemic model, the relationship between the parameters and the result was analyzed, which makes the proposed methods more practical. It is worth pointing out that the shortest route between any two EMDPs in the new hybrid GA is calculated by Dijkstra's algorithm, so an optimal result can still be obtained even if some sections of the roadway are disrupted, which widens the applicability of the proposed method. Research on emergency materials distribution is very complex; only some idealized situations were analyzed and discussed here, and further constraints, such as the loading capacity of the vehicles, the death coefficient of the disease, and the distribution mode, are directions for future research. Emergency logistics network design is extremely important when responding to an unexpected epidemic pandemic. In this study, we propose an improved location-allocation model with an emphasis on maximizing the emergency service level (ESL). We formulate the problem as a mixed-integer nonlinear programming (MINLP) model and develop an effective algorithm to solve it. The numerical test shows that our model can provide tangible recommendations for controlling an unexpected epidemic. To satisfy the emergency demand of epidemic diffusion, an efficient emergency service network, which considers how to locate the regional distribution centers (RDCs) and how to allocate the affected areas to these RDCs, should be urgently designed.
This problem opens a wide range of applications for OR/MS techniques and has attracted much attention in recent years. For example, Ekici et al. [17] proposed a hybrid model that estimated the spread of influenza and integrated it with a location-allocation model for food distribution in Georgia. Chen et al. [18] proposed a model linking disease progression, related medical intervention actions, and logistics deployment, to help crisis managers extract crucial insights on emergency logistics management from a strategic standpoint. Ren et al. [19] presented a multi-city resource allocation model to distribute a limited amount of vaccine to minimize the total number of fatalities due to a smallpox outbreak. He and Liu [20] proposed a time-varying forecasting model based on a modified SEIR model and used a linear programming model to facilitate distribution decision-making for quick responses to public health emergencies. Liu and Zhang [21] proposed a time-space network model for studying the dynamic impact of medical resource allocation in controlling the spread of an epidemic. Further, they presented a dynamic decision-making framework coupled with a forecasting mechanism based on the SEIR model and a logistics planning system to satisfy the forecasted demand and minimize total operation costs [22]. Anparasan and Lejeune [23] proposed an integer linear programming model that determined the number, size, and location of treatment facilities, deployed medical staff, located ambulances at triage points, and organized the transportation of severely ill patients to treatment facilities. Büyüktahtakın et al. [24] proposed a mixed-integer programming (MIP) model to determine the optimal amount, timing, and location of resources allocated for controlling Ebola in West Africa. Moreover, literature reviews on OR/MS contributions to epidemic control were conducted in Dasaklis et al. [25], Rachaniotis et al.
[26] and dasaklis et al. [27] . in this study, we propose an improved location-allocation model for emergency resources distribution. we define a new concept of emergency service level (esl) and then formulate the problem to be a mixed-integer nonlinear programming (minlp) model. more precisely, our model (1) identifies the optimal number of rdcs, (2) determines rdcs' locations, (3) decides on the relative scale of each rdc, (4) allocates each affected area to an appropriate rdc, and (5) obtains esl for the best scenario, as well as other scenarios. (1) in this study, esl includes two components. esl 1 is constructed to reflect the level of demand satisfaction and esl 2 is proposed for the relative level of emergency operation cost. these two aspects are given by the weight coefficient α and 1 − α respectively. the influence of these two factors on the esl is illustrated in fig. 13 .12. figure 13 .12a represents that esl 1 increases as the level of demand satisfaction raised. as we can see that it is a piecewise curve. before demand is completely met, it is an s-shape curve from zero to α. after that, it becomes a constant, which means the additional emergency supplies cannot improve the esl. figure 13 .12b identifies that esl 2 decreases as the relative total cost increases. when emergency operation cost is minimized, the esl 2 arrives at its best level of 1 − α. similarly, when emergency operation cost is maximized, the esl 2 is zero. our model depicts the problem of location and allocation for emergency logistics network design. the network is a three-echelon supply chain of strategic national stockpile (sns), rdcs, and affected areas. the core problem is to determine the number and locations for the rdcs. in each affected area, there is a point of dispensing (pod). to model the problem, we first present the relative parameters and variables as follows. i : set of snss, i ∈ i . j : set of rdcs, j ∈ j . k : set of affected areas, k ∈ k . 
α: weight coefficient for the two parts of the ESL. Variables: d_ij: distance from SNS i to RDC j (for simplicity, the Euclidean distance is adopted). d_jk: distance from RDC j to affected area k. ε_jk: binary variable; ε_jk = 1 if RDC j provides emergency supplies to affected area k, and ε_jk = 0 otherwise. z_j: binary variable; z_j = 1 if RDC j is opened, and z_j = 0 otherwise. x_jk: amount of emergency supplies from RDC j to affected area k. y_ij: amount of emergency supplies from SNS i to RDC j. (x_j, y_j): coordinates of RDC j. According to the above notations, the optimization model is: max ESL = ESL_1 + ESL_2 (13.11). Herein, ESL_1 is defined by (13.12)-(13.14); these equations reflect that the less the unsatisfied demand is, the higher ESL_1 is. ESL_2 is defined as follows. First, we formulate the total operation cost as (13.15): f_2 = Σ_i Σ_j y_ij d_ij c^tl + Σ_j Σ_k ε_jk x_jk d_jk c^ltl + Σ_j z_j C_j^RDC (13.15), where c^tl and c^ltl are the unit transportation costs of the two echelons and C_j^RDC is the operating cost of RDC j when it is opened; the latter is decided by the relative size of the RDC, which can be expressed by its share of the total transshipped supplies. Second, to non-dimensionalize the cost function f_2, we calculate two extreme values for Eq. (13.15), where var represents all variables and s represents the constraints below: f_2^min and f_2^max are the minimum and maximum values obtained by solving (13.15) without considering ESL_1. The definition of ESL_2 means that the lower the total operation cost is, the higher the ESL is. The constraints of the optimization model are given as follows: x_jk, y_ij ∈ Z_0^+, ∀i ∈ I, j ∈ J, k ∈ K (13.28); (x_j, y_j), ∀j ∈ J are continuous variables (13.29). Constraint (13.21) indicates that each affected area is serviced by a single RDC. Constraint (13.22) specifies that the supplies to each affected area should not exceed its demand. Constraint (13.23) is a flow conservation constraint. Constraint (13.24) shows that only an opened RDC can provide distribution service.
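The two ESL components can be sketched numerically. The chapter specifies only the qualitative shapes, so the code below assumes a rescaled logistic curve (steepness k = 10) for the S-shape of ESL_1 and a linear rescaling of total cost between its extremes for ESL_2; both choices are illustrative assumptions.

```python
# Sketch of the two ESL components described above. The logistic steepness k=10
# is assumed; the chapter only requires an S-shape from 0 to alpha on [0, 1].
import math

def esl1(satisfaction, alpha):
    """S-shaped in the satisfaction ratio rho in [0, 1], capped at alpha."""
    rho = min(max(satisfaction, 0.0), 1.0)
    raw = 1.0 / (1.0 + math.exp(-10.0 * (rho - 0.5)))    # logistic curve, assumed
    lo = 1.0 / (1.0 + math.exp(5.0))                      # value at rho = 0
    hi = 1.0 / (1.0 + math.exp(-5.0))                     # value at rho = 1
    return alpha * (raw - lo) / (hi - lo)                 # rescaled: 0 at rho=0, alpha at rho=1

def esl2(cost, cost_min, cost_max, alpha):
    """Decreases linearly from 1 - alpha at minimum cost to 0 at maximum cost."""
    return (1.0 - alpha) * (cost_max - cost) / (cost_max - cost_min)

def esl(satisfaction, cost, cost_min, cost_max, alpha=0.6):
    """Total emergency service level, ESL = ESL_1 + ESL_2."""
    return esl1(satisfaction, alpha) + esl2(cost, cost_min, cost_max, alpha)
```

With full demand satisfaction and minimum cost the total ESL reaches 1, matching the construction in the text where the two components contribute at most α and 1 − α.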
constraint (13.25), together with (13.26) and (13.27), completes the model. the proposed model for emergency services network design is a minlp model, as it involves the multiplication of two variables (i.e., ε_jk x_jk). more importantly, the optimization model includes a continuous facility location-allocation model with an unknown number of rdcs. to avoid the complexity of such a minlp model, we modify it by adding two auxiliary variables; the details of the modification follow mccormick [28]. our solution procedure integrates an enumeration search rule and a genetic algorithm (ga), which are applied iteratively. as the ga is a mature algorithm [29], details of the ga process are omitted here. we summarize the proposed solution methodology as follows.
step 1: data input and parameter setting, which includes i, j, k, α, d_k, (x_k, y_k), (x_i, y_i), c_tl, c_ltl, and c_rdc_j, and the related parameters for the ga.
step 2: initialization. generate the original population according to the constraints.
step 3: evaluation. the fitness function is defined as the reciprocal of the esl.
step 4: selection. use roulette selection as the selection rule.
step 5: crossover. the single-point rule is used.
step 6: mutation. a random mutation is applied.
step 7: if the termination condition is met, go to the next step; else, return to step 4.
step 8: output the results.
(2) to clarify the effect of the model, we conduct a numerical test. assume there is an unexpected epidemic outbreak in a 100 × 100 square region with 10 affected areas in it. in the square region, only three snss can provide emergency supplies. at the early stage of the outbreak there is a large demand for emergency supplies, and the supply capacity of these snss is less than the total demand in the affected areas; their capacities are set at 700, 600, and 400, respectively. the coordinates of the snss and the affected areas are obtained in advance. the upper bound on the number of rdcs is set to 8. the cost of operating an rdc is defined as 6760 × √s_j. the demand in each affected area is randomly generated.
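the mccormick step cited above can be illustrated for the bilinear term ε_jk x_jk: since ε_jk is binary and x_jk is bounded, introducing an auxiliary variable w = ε_jk x_jk and four linear inequalities captures the product exactly. the snippet below is our sketch of that standard linearization, not the chapter's code; it checks that the true product satisfies the four constraints:

```python
def mccormick_binary_product(eps, x, x_max):
    """For binary eps and 0 <= x <= x_max, the product w = eps * x is
    represented exactly by the four linear McCormick inequalities below.
    Here we compute the true product and verify it satisfies them."""
    w = eps * x
    assert 0 <= w                        # lower bound
    assert w <= x_max * eps              # w is 0 whenever eps is 0
    assert w <= x                        # w never exceeds x
    assert w >= x - x_max * (1 - eps)    # w equals x whenever eps is 1
    return w
```

in the modified model, w replaces ε_jk x_jk in the cost function and the four inequalities are added as linear constraints, so the bilinear term disappears.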
finally, the unit transportation cost from an sns to an rdc is set to 80 and the unit transportation cost from an rdc to an affected area is 160. based on the above data setting, we solve our model using matlab software and obtain the results in fig. 13.13. as the figure shows, there is a trade-off between the two components of the esl. in our example, we test the parameter α from 0.4 to 0.9, which means that demand satisfaction becomes more and more important in our decision making. the results show that when α is equal to 0.6, the total esl reaches its best value (0.9258), beyond which it decreases again. in practice, decision makers may choose different values of α according to their actual needs; correspondingly, this will lead to different esls. our model also determines the optimal number, locations, and relative sizes of the rdcs. the test results are shown in table 13.1. for example, rdc1 delivers emergency supplies to affected areas 2, 7, and 9. its relative size is 33.23%, which means that the emergency supplies transshipped through this rdc account for that proportion of the total emergency distribution. table 13.2 illustrates the proportion of demand satisfaction for each affected area. for example, the demand for emergency supplies in affected area 2 is 149, and this area's demand is fully satisfied. however, one can also observe that demands in some areas are only partly supplied due to the supply capacity limitation; for example, only 69.5% of the demand in affected area 1 is delivered. (3) sensitivity analysis. (1) impact of α on the esl. to understand the impact of α on the esl, we solve our model with 6 different values of this parameter, meaning that decision makers have different considerations of the two components of the esl. we compare the test results in table 13.3. it can be observed that esl 1 increases along with the emphasis on demand satisfaction. however, the actual proportion of esl 1 always stays at about 90% of the setting of α.
as to esl 2, one can note that it increases at first and then decreases as α varies from 0.4 to 0.9. (2) sensitivity analysis on different demand in each affected area. we also examined the impact of different demand in each affected area. the test results are shown in fig. 13.14. we change the original demand in each affected area over five scenarios, representing different demand situations when an unexpected infectious epidemic happens. one can easily observe that the higher the demand, the lower the optimal esl. that is because when the demand increases, the supply at the snss remains unchanged, which leads to a reduction in esl 1. when the demand in each affected area changes, esl 2 varies only slightly, which shows that the change in the total operation cost of the emergency logistics is not obvious when the scale of the disease becomes smaller. in this study, we propose an improved location-allocation model with an emphasis on maximizing the emergency service level (esl). we formulate the problem as a mixed-integer nonlinear programming model and develop an effective algorithm to solve it. moreover, we test our model through a case study and sensitivity analysis. the main contribution of this research is the esl function, which considers demand satisfaction and emergency operation cost simultaneously. our definition of esl differs from the existing literature and has significant meaning for guiding actual operations in emergency response. future studies could address the limitations of our work in both disease forecasting and logistics management. for example, the dynamics of epidemic diffusion could be considered, and our optimization problem could thus be extended to a dynamic programming model.
american metropolitan medical response system and its inspiration for the foundation of the public health system in our country
the discussion of the orientation of chinese public health in the 21st century
public-health preparedness for biological terrorism in the usa
bioterrorism - a new challenge for public health
analyzing bioterror response logistics: the case of smallpox
an emergency logistics distribution approach for quick response to urgent relief demand in disasters
lessons in logistics from somalia
a multi-period linear programming model for optimally scheduling the distribution of food-aid in west africa
allocating resources to support a multicommodity flow with time windows
vehicle routing with split deliveries
optimized resource allocation for emergency response after earthquake disasters
emergency logistics planning in natural disasters
multi-objective optimal planning for designing relief delivery systems
the multiple traveling salesman problem: an overview of formulations and solution procedures
an approximation algorithm for a symmetric generalized multiple depot, multiple travelling salesman problem
a new approach to solving the multiple traveling salesperson problem using genetic algorithms
modeling influenza pandemic and planning food distribution
modeling the logistics response to a bioterrorist anthrax attack
optimal resource allocation response to a smallpox outbreak
methodology of emergency medical logistics for public health emergencies
a dynamic allocation model for medical resources in the control of influenza diffusion
a dynamic logistics model for medical resources allocation in an epidemic control with demand forecast updating
resource deployment and donation allocation for epidemic outbreaks
a new epidemics-logistics model: insights into controlling the ebola virus disease in west africa
emergency supply chain management for controlling a smallpox outbreak: the case for regional mass vaccination
controlling infectious disease outbreaks: a deterministic allocation-scheduling model with multiple discrete resources
epidemics control and logistics operations: a review
computability of global solutions to factorable nonconvex programs: part i - convex underestimating problems
distribution network redesign for marketing competitiveness
key: cord-009481-6pm3rpzj authors: parnell, gregory s.; smith, christopher m.; moxley, frederick i. title: intelligent adversary risk analysis: a bioterrorism risk management model date: 2009-12-11 journal: risk anal doi: 10.1111/j.1539-6924.2009.01319.x sha: doc_id: 9481 cord_uid: 6pm3rpzj
the tragic events of 9/11 and the concerns about the potential for a terrorist or hostile state attack with weapons of mass destruction have led to an increased emphasis on risk analysis for homeland security. uncertain hazards (natural and engineering) have been successfully analyzed using probabilistic risk analysis (pra). unlike uncertain hazards, terrorists and hostile states are intelligent adversaries who can observe our vulnerabilities and dynamically adapt their plans and actions to achieve their objectives. this article compares uncertain hazard risk analysis with intelligent adversary risk analysis, describes the intelligent adversary risk analysis challenges, and presents a probabilistic defender–attacker–defender model to evaluate the baseline risk and the potential risk reduction provided by defender investments. the model includes defender decisions prior to an attack; attacker decisions during the attack; defender actions after an attack; and the uncertainties of attack implementation, detection, and consequences. the risk management model is demonstrated with an illustrative bioterrorism problem with notional data.
toward risk-based regulations, specifically using pra to analyze and demonstrate lower cost regulations without compromising safety.
(7, 8) research in the nuclear industry has also supported advances in human reliability analysis, external events analysis, and common cause failure analysis. (9-11) more recently, leaders of public and private organizations have requested risk analyses for problems that involve the threats posed by intelligent adversaries. for example, in 2004, the president directed the department of homeland security (dhs) to assess the risk of bioterrorism. (12) homeland security presidential directive 10 (hspd-10): biodefense for the 21st century, states that "[b]iological weapons in the possession of hostile states or terrorists pose unique and grave threats to the safety and security of the united states and our allies" and charged the dhs with issuing biennial assessments of biological threats to "guide prioritization of our on-going investments in biodefense-related research, development, planning, and preparedness." a subsequent homeland security presidential directive 18 (hspd-18): medical countermeasures against weapons of mass destruction, directed an integrated risk assessment of all chemical, biological, radiological, and nuclear (cbrn) threats. (13) the critical risk analysis question addressed in this article is: are the standard pra techniques for uncertain hazards adequate and appropriate for intelligent adversaries? as concluded by the nrc (2008) study on bioterrorism risk analysis, we believe that new techniques are required to provide credible insights for intelligent adversary risk analysis. we will show that treating adversary decisions as uncertain hazards is inappropriate because it can produce a different risk ranking and may underestimate the risk. in the rest of this section, we describe the difference between natural hazards and intelligent adversaries and demonstrate, with a simple example, that standard pra applied to the attacker's intent may underestimate the risk of an intelligent adversary attack.
in the second section, we describe a canonical model for resource allocation decision making for an intelligent adversary problem, using an illustrative bioterrorism example with notional data. in the third section, we describe the illustrative analysis results obtained from the model and discuss the insights they provide for risk assessment, risk communication, and risk management. in the fourth section, we describe the benefits and limitations of the model. finally, we discuss future work and our conclusions. we believe that risk analysis of uncertain hazards is fundamentally different from risk analysis of intelligent adversaries. (14, 15) some of the key differences are summarized in table i. (16) a key difference is historical data. for many uncertain events, both natural and engineered, we not only have historical data of extreme failures or crises, but can often replicate events in a laboratory environment for further study (engineered systems) or analyze them using complex simulations. intelligent adversary attacks have a long historical background, but the aims, events, and effects we have recorded may not provide a valid estimate of the future threat because of changes in adversary intent and capability. an uncertain hazard's risk of occurrence and its geographical risk can both be narrowed down and identified concretely. intelligent adversary targets vary with the goals of the adversary and can be vastly dissimilar across attacks. information sharing between the two classes of events also differs dramatically. after hurricanes or earthquakes, engineers typically review the incident, publish results, and improve their simulations. after intelligent adversary attacks, or near misses, the situation and conduct of the attack may involve critical state vulnerabilities and protected intelligence means; in these cases, intelligence agencies may be reluctant to share complete information even with other government agencies.
the ability to influence the event is also different. though we can prepare, we typically have no way of influencing whether a natural event occurs. on the other hand, governments may be able to affect the impact of terrorist attacks through a variety of offensive, defensive, and recovery measures. in addition, adversary attacks can take so many forms that one cannot realistically defend against, respond to, or recover from all types of attacks. although there have been efforts to use event tree technologies in intelligent adversary risk analysis (e.g., the btra), many believe that this approach is not credible. (19) the threat from intelligent adversaries comes from a combination of both intent and capability. we believe that pra still has an important role in intelligent adversary risk analysis for assessment of the capabilities of adversaries, the vulnerabilities of potential targets, and the potential consequences of attacks. however, intent is not a factor in natural hazard risk analysis.
[table i (modified from kunreuther (17, 18)) contrasts uncertain hazards with intelligent adversary events. geography: some cities may be considered riskier than others (e.g., new york city, washington), but terrorists may attack anywhere, any time. information sharing and asymmetry of information: new scientific knowledge on natural hazards can be shared with all the stakeholders, whereas governments sometimes keep new information on terrorism secret for national security reasons. influencing the event: to date, no one can influence the occurrence of an extreme natural event (e.g., an earthquake), but governments may be able to influence terrorism (e.g., foreign policy; international cooperation; national and homeland security measures). mitigation: government and insureds can invest in well-known mitigation measures against natural hazards, whereas attack methodologies and weapon types are numerous, local agencies have limited resources to protect potentially numerous targets, and federal agencies may be in a better position to develop better offensive, defensive, and response strategies.]
in intelligent adversary risk analysis, we must consider the intent of the adversary. the adversary will make future decisions based on our preparations, its objectives, and information about its ability to achieve its objectives that is dynamically revealed in a scenario. bier et al. provide an example of addressing an adversary using a defender-attacker game-theoretic model. (20) the nrc provides three examples of intelligent adversary models. (16) we believe it will be more useful to assess an attacker's objectives (although this is not a trivial task) than to assign probabilities to its decisions prior to the dynamic revelation of scenario information. we believe that modeling adversary objectives will provide greater insight into the possible actions of opponents than exhaustively enumerating probabilities over all the possible actions they could take. furthermore, we believe the probabilities of adversary decisions (intent) should be an output of, not an input to, risk analysis models. (16) this is a principal part of game theory, as shown in aghassi et al. and jain et al. (21, 22) to make our argument and our proposed alternative more explicit, we use a bioterrorism illustrative example. in response to the 2004 hspd, in october 2006, the dhs released a report called the bioterrorism risk assessment (btra). (19) the risk assessment model contained a 17-step event tree (18 steps with consequences) that could lead to the deliberate exposure of civilian populations, for each of the 27 most dangerous pathogens that the centers for disease control and prevention tracks (emergency.cdc.gov/bioterrorism) plus one engineered pathogen. the model was extremely detailed and contained a number of separate models that fed into the main btra model. the btra produced a normalized risk for each of the 28 pathogens and rank-ordered the pathogens from most risky to least risky.
the national research council (nrc) conducted a review of the btra model and provided 11 specific recommendations for improvement. (16) in our example, we use four of the recommendations: model the decisions of intelligent adversaries, include risk management, simplify the model by not assigning probabilities to the branches of uncertain events, and do not normalize the risk. the intelligent adversary technique we developed builds on the deterministic defender-attacker-defender model and is solved using decision trees. (16) because the model has been simplified to reflect the available data, it can be developed in a commercial off-the-shelf (cots) software package, such as the one we used for modeling, dpl (www.syncopation.org); other decision analysis software may work as well. (23) event trees have been useful for modeling uncertain hazards. (24) however, there is a key difference in the modeling of intelligent adversary decisions that event trees do not capture. as norman c. rasmussen, the director of the 1975 reactor safety study that validated pra for use in nuclear reactor safety, states in a later article, while the basic assumption of randomness holds true for nuclear safety, it is not valid for human action. (25) the attacker makes decisions to achieve his or her objectives. the defender makes resource allocation decisions before and after an attack to try to mitigate vulnerabilities and the consequences of the attacker's actions. this dynamic sequence of decisions, made first by the defender, then the attacker, then again by the defender, should not be modeled solely by assessing probabilities of the attacker's decisions.
for example, when the attacker looks at the defender's preparations for its possible bioterror attack, it will not assign probabilities to its decisions; it chooses the agent and the target based on its perceived ability to acquire the agent and successfully attack the target that will give it the effects it desires to achieve its objectives. (15) representing an attacker decision as a probability may underestimate the risk. consider the simple bioterrorism event tree given in fig. 1 with notional data. using an event tree, for each agent (a and b) there is a probability that an adversary will attack, a probability of attack success, and an expected consequence for each outcome (at the terminal node of the tree). the probability of success involves many factors, including the probability of obtaining the agent and the probability of detection during attack preparations and execution. (a useful reference for decision analysis software is available on the orms website: http://www.lionhrtpub.com/orms/surveys/das/das.html.) the consequences depend on many factors, including agent propagation, agent lethality, time to detection, and risk mitigation; in this example, the consequences range from 0, or no consequences, to 100, the maximum consequences (on a normalized scale of consequences). calculating expected values in fig. 1, we would assess expected consequences of 32. we would be primarily concerned about agent b because it contributes 84% of the expected consequences (30 × 0.9 = 27 for b, out of the total of 32). looking at extreme events, we would note that the worst-case consequence of 100 has a probability of 0.05. however, adversaries do not assign probabilities to their decisions; they make decisions to achieve their objectives, which may be to maximize the consequences they can inflict. (26) if we use a decision tree, as in fig. 2, we replace the initial probability node with a decision node because this is an adversary decision.
we find that the intelligent adversary would select agent a, and the expected consequences are 50, which is a different result from the event tree's. again, if we look at the extreme events, the worst-case event (100 consequences) probabilities are 0.5 for the agent a decision and 0.6 for the agent b decision. the expected consequences are greater, and the primary agent of concern is now a. in this simple example, the event tree approach underestimates the expected risk and provides a different risk ranking. furthermore, the event tree example underestimates the risk of the extreme events. however, while illustrating important differences, this simple decision tree model does not sufficiently capture the fundamental structure of intelligent adversary risk. this model has a large number of applications for homeland security. for example, it would be easy to see this canonical model applied to a dirty bomb example like the one laid out in rosoff and von winterfeldt (27) or to any other intelligent adversary scenario. in this article, we show a bioterrorism application. we believe that the canonical risk management model must have at least six components: the initial actions of the defender to acquire defensive capabilities, the attacker's uncertain acquisition of the implements of attack (e.g., agents a, b, and c), the attacker's target selection and method of attack(s) given implement acquisition, the defender's risk mitigation actions given attack detection, the uncertain consequences, and the cost of the defender actions. from this model, one could also conduct a baseline risk analysis by looking at the status quo. in general, the defender decisions can provide offensive, defensive, or information capabilities.
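the contrast between the two calculations can be reproduced in a few lines of code. the branch data below are notional and chosen to be consistent with the expected values quoted in the text (32 for the event tree, 50 for the decision tree, and a 0.05 event-tree probability of the worst case); the exact branch data of figs. 1 and 2 are not recoverable here:

```python
# notional branch data consistent with the quoted expected values
p_choose = {"A": 0.1, "B": 0.9}            # event-tree probabilities on the attacker's choice
lotteries = {"A": [(0.5, 100), (0.5, 0)],  # (probability, consequence) pairs per agent
             "B": [(0.6, 50), (0.4, 0)]}

def expected(lottery):
    """Expected consequence of a (probability, consequence) lottery."""
    return sum(p * c for p, c in lottery)

# event tree: the attacker's choice is treated as a chance node
event_tree_risk = sum(p_choose[a] * expected(lotteries[a]) for a in lotteries)

# decision tree: the attacker picks the agent with the highest expected consequences
decision_tree_risk = max(expected(lotteries[a]) for a in lotteries)

print(event_tree_risk, decision_tree_risk)  # event tree 32, decision tree 50
```

with these numbers, the event tree also gives the worst-case outcome (100) a probability of only 0.1 × 0.5 = 0.05, whereas conditioning on the adversary's decision to use agent a raises it to 0.5, illustrating how the event tree understates extreme-event risk.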
we are not considering offensive decisions, such as preemption before an attack; instead, we are considering decisions that will increase our defensive capability (e.g., buy vaccine reserves) (28) or provide earlier or more complete warning of an attack (add a bio watch city). (29) our defender-attacker-defender decision analysis model includes the two defender decisions (buy vaccine, add a bio watch city); the attacker's uncertain agent acquisition; the attacker's decision on the agent and target of attack; the uncertain consequences (fatalities and economic); the defender's post-attack decision to mitigate the maximum possible casualties; and the known costs of the defender decisions. the defender risk is defined as the probability of adverse consequences and is modeled using a multiobjective additive model similar to multiobjective value models. (30) we have assumed that the defender minimizes the risk and the attacker maximizes the risk. we implemented this model as a decision tree (fig. 3) and an influence diagram (fig. 4) using dpl. the mathematical formulation of our model and the notional data are provided in the appendix. the illustrative decision tree model (figs. 3 and 4) begins with decisions that the defender (united states) makes to deter the adversary by reducing the vulnerabilities to, or being better prepared to mitigate, a bioterrorism attack with agents a, b, or c. we modeled and named the agents to represent notional bioterror agents using the cdc's agent categories in table ii; for example, agent a represents a notional agent from category a. table iii provides a current listing of the agents by category. there are many decisions that we could model; however, for our simple illustrative example, we chose to model notional decisions about the bio watch program for agents a and b and the bioshield vaccine reserve for agent a.
bio watch is a program that installs and monitors a series of passive sensors within a major metropolitan city. (29) the bioshield program is a plan to purchase and store vaccines for some of the more dangerous pathogens. (28) the defender first decides whether or not to add another city to the bio watch program. if that city is attacked, this decision could affect the warning time, which influences the response and, ultimately, the potential consequences of an attack. of course, the bio watch system does not detect every agent, so we modeled agent c as the most effective agent that the bio watch system cannot sense and provide additional warning about. adding a city will also incur a cost in dollars for the united states. the second notional defender decision is the amount of vaccine to store for agent a. agent a is the notional agent that we have modeled with the largest probability of acquisition and the largest potential consequences. the defender can store a percentage of what experts think we would need in a large-scale biological agent attack. the more vaccine the united states stores, the fewer consequences we will suffer if the adversaries use agent a and we have sufficient warning and capability to deploy the vaccine reserve. however, as we store more vaccine, the costs for purchasing and storage increase. for simplicity's sake, each of the defender decisions costs the same amount; therefore, at the first budget level of us$ 10 million, the defender can choose only one decision.
[table ii. cdc agent categories. (31) category a: the u.s. public health system and primary healthcare providers must be prepared to address various biological agents, including pathogens that are rarely seen in the united states. high-priority agents include organisms that pose a risk to national security because they: can be easily disseminated or transmitted from person to person; result in high mortality rates and have the potential for major public health impact; might cause public panic and social disruption; and require special action for public health preparedness. category b: the second highest priority agents include those that: are moderately easy to disseminate; result in moderate morbidity rates and low mortality rates; and require specific enhancements of cdc's diagnostic capacity and enhanced disease surveillance. category c: the third highest priority agents include emerging pathogens that could be engineered for mass dissemination in the future because of: availability; ease of production and dissemination; and potential for high morbidity and mortality rates and major health impact.]
after the defender has made its investment decisions, which we assume are known to the attacker, the attacker makes two decisions: the type of agent and the target. we assume that the attacker has already decided to attack the united states with a bioterror agent. in our model, there are three agents it can choose, although this number can be increased to represent the other pathogens listed in table iii. as stated earlier, if we only looked at the attacker's decision, agent a would appear to be the best choice; agents b and c are the next two most attractive agents to the attacker, respectively. again, agents a and b can be detected by bio watch, whereas agent c cannot. the attacker has some probability of acquiring each agent; if an agent is not acquired, the attacker cannot attack with it. in addition, each agent has a lethality associated with it, represented by the agent casualty factor. finally, each agent has a different probability of being detected over time. generally, the longer it takes for an agent to be detected, the more consequences the united states will suffer. the adversary also decides what size of population to target. generally, the larger the population targeted, the greater the potential consequences.
the attacker's decisions affect the maximum possible casualties from the scenario. however, regardless of the attacker's decisions, there is some probability of actually attaining a low, medium, or high percentage of the maximum possible casualties. this sets the stage for the next decision by the defender. after receiving warning of an attack, the defender decides whether or not to deploy the agent a vaccine reserve. this decision depends upon how much of the vaccine reserve the united states chose to store, whether the attacker used agent a, and the potential effectiveness of the vaccine given timely attack warning. in addition, there is a cost associated with deploying the vaccine reserve. again, for simplicity's sake, the cost of every defender decision is the same, thus forcing the defender to choose only the best option(s) for each successive us$ 10 million increase in budget, up to the maximum of us$ 30 million. in our model (fig. 4), we have two types of consequences: casualties and economic impact. given the defender-attacker-defender decisions, the potential casualties and the economic impact are assessed. casualties are based on the agent, the population attacked, the maximum potential casualties, the warning time given to the defender, and the effectiveness of the vaccine for agent a (if agent a is the agent used and the vaccine is deployed). economic effects are modeled using a linear model with a fixed economic cost that does not depend on the number of casualties and a variable cost equal to the number of casualties multiplied by the cost per casualty. of course, the defender would like the potential consequences (risk) given an attack to be low, whereas the attacker would like the potential consequences (risk) to be high. our economic consequences model was derived using a constant and an upper bound from wulf et al. (34) the constant cost we used is us$ 10 billion, and from the upper bound, also given in wulf et al., we derived the cost per casualty.
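the defender-attacker-defender logic described above can be sketched as a two-stage minimax rollback. the risk table below is entirely notional (our numbers, not the paper's); each entry stands for the expected risk remaining after the consequence uncertainties and the defender's best post-attack response have already been rolled back:

```python
# notional expected-risk table: risk[defender_investment][attacker_agent]
risk = {
    "no_invest": {"A": 50, "B": 30, "C": 20},
    "vaccine_A": {"A": 15, "B": 30, "C": 20},  # vaccine reserve blunts agent A
    "biowatch":  {"A": 35, "B": 18, "C": 20},  # earlier warning for A and B
}

def attacker_best_response(defense):
    """The attacker observes the defender's investment and picks the
    agent that maximizes the expected risk."""
    return max(risk[defense], key=risk[defense].get)

def defender_optimal():
    """The defender minimizes the risk of the attacker's best response."""
    return min(risk, key=lambda d: risk[d][attacker_best_response(d)])

best = defender_optimal()
print(best, attacker_best_response(best))  # vaccine_A B
```

with these notional numbers, stocking the vaccine shifts the attacker's best response from agent a to agent b, the kind of risk shifting the results section reports.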
(34) we believe this fixed cost is reasonable because, looking at the example of the anthrax letters of 2001, experts estimate that although only 17 people were infected and five were killed, there was a us$ 6 billion cost to the united states. (35) in this tragic example, there was an extremely high economic impact even though the casualties were low. each u.s. defender decision incurs a budget cost. the amount of money available to homeland security programs is limited by a budget determined by the president and congress. the model tracks the costs incurred and allows spending only within the budget (see the appendix). we considered notional budget levels of us$ 0 million, us$ 10 million, us$ 20 million, and us$ 30 million. normally, a decision tree is solved by maximizing or minimizing the expected attribute at the terminal branches of the tree. in our model, however, the defender risk depends on the casualty and economic consequences given an attack. we use multiple-objective decision analysis with an additive value (risk) model to assign risk to the defender consequences. the defender is minimizing risk and the attacker is maximizing risk. we assign a value of 0.0 to no consequences and a value of 1.0 to the worst-case consequences in our model. we model each consequence with a linear risk function and a weight (see the appendix). the risk functions measure returns to scale on the consequences. of course, additional consequences could be included and differently shaped risk curves could be used. some of the key assumptions in our model are listed in table iv (the details are in the appendix), along with some possible alternative assumptions. given different assumptions, the model may produce different results. we model the uncertainty of the attacker's capability to acquire an agent with a probability distribution, and we vary detection time by agent. clearly, other indications and warnings exist to detect possible attacks.
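the linear economic model and the additive two-attribute risk model can be sketched as follows. the us$ 10 billion fixed cost is from the text, while the cost per casualty and the weights are illustrative placeholders (the paper derives the former from an upper bound in wulf et al. and assesses the latter, neither of which is reproduced here):

```python
FIXED_ECON_COST = 10e9       # us$ 10 billion fixed cost (from the text)
COST_PER_CASUALTY = 1e6      # illustrative placeholder, not the paper's value

def economic_impact(casualties):
    """Linear economic model: fixed cost plus casualties times cost per casualty."""
    return FIXED_ECON_COST + COST_PER_CASUALTY * casualties

def defender_risk(casualties, max_casualties, w_cas=0.5, w_econ=0.5):
    """Additive two-attribute risk on a 0-1 scale: 0 for no consequences,
    1 for the worst case, with linear single-attribute risk functions and
    weights summing to 1 (the weights here are illustrative)."""
    worst_econ = economic_impact(max_casualties)
    v_cas = casualties / max_casualties
    v_econ = economic_impact(casualties) / worst_econ
    return w_cas * v_cas + w_econ * v_econ

# even a zero-casualty attack carries substantial risk through the fixed
# economic cost, echoing the 2001 anthrax-letters observation
print(defender_risk(0, 1000), defender_risk(1000, 1000))
```

the defender minimizes this quantity and the attacker maximizes it, which is what makes the decision tree a minimax rollback rather than a plain expected-value calculation.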
These programs could be included in the model. We model three defender decisions: add a BioWatch city for agents A and B, increase the vaccine reserve for agent A, and deploy the agent A vaccine reserve. We assume limited decision options (i.e., 100% storage of vaccine A, 50% storage, 0% storage), but other decisions could be modeled (e.g., other levels of storage, storing vaccines for other agents, etc.). We used one casualty model for all agents; other casualty and economic models could be used. Finally, our model makes some assumptions about objectives. First, we assume that the risks important to the defender are the number of casualties and the economic impact, but additional measures could be used. Second, we assume defenders and attackers have a diametrically opposed view of all of the objectives; clearly, we could model additional objectives. In addition, we made some budget assumptions, which could be improved or modified. We assumed a fixed budget, but the budget could be modeled with more detailed cost models (e.g., instead of a set amount to add a city to the BioWatch program, the budget could reflect different amounts depending upon the city and the robustness of the sensors installed). Finally, our model results in a risk of a terrorist attack; the same methodology for a defender-attacker-defender decision tree can be used to determine a utility score instead of a risk; an example of this is in Keeney. (15) One thing to consider when altering or adding to the assumptions is the number of strategies the model evaluates. Currently, the canonical model has 108 different strategies to evaluate (Table V). With more complexity, the number of strategies that would need to be evaluated could grow significantly. Large-scale decision trees can be solved with Monte Carlo simulation. After modeling the canonical problem, we obtained several insights. First, we found that in our model economic impact and casualties are highly correlated.
Higher casualties result in higher economic impact. Other consequences, for example psychological consequences, could also be correlated with casualties. Second, a bioterror attack could have a large economic impact (and psychological impact) even if casualties are low. The major risk analysis results are shown in Fig. 5. Risk shifting occurs in our decision analysis model. In the baseline (with no defender spending), agent A is the most effective agent for the attacker to select and, therefore, the agent against which the defender can protect if the budget is increased. As we improve our defense against agent A, at some point the attacker will choose to attack using agent B; the high-risk agent will have shifted from agent A to agent B. As the budget level continues to increase, the defender adds a city to the BioWatch program and the attacker chooses to attack with agent C, which BioWatch cannot detect. We use notional data in our model, but if more realistic data were used, the defender could determine the cost/benefit ratios of additional risk reduction decisions. This decision model uses COTS software to quantitatively evaluate the potential risk reductions associated with different options and make cost-benefit decisions. Fig. 5 provides a useful summary of the expected risk. However, it is also important to look at the complementary cumulative distribution (Fig. 6) to better understand the likelihood of extreme outcomes. For example, the figure shows that spending US$ 0 or US$ 10 million gives the defender a 10% chance of zero risk, whereas spending US$ 20 or US$ 30 million gives the defender an almost 50% chance of zero risk. The best risk management result would be that option 4 deterministically or stochastically dominates (SD) option 3, option 3 SD option 2, and option 2 SD option 1. The first observation we note from Fig. 6 is that options 2, 3, and 4 stochastically dominate option 1, because option 1 has a higher probability for each risk outcome.
A second observation is that while option 4 SD option 3, option 4 does not SD option 2, because option 4 has a larger probability of yielding a risk level of 0.4. Along the x-axis, one can see the expected risk (ER) of each alternative. This expected risk corresponds to the expected value of risk from the budget-versus-risk rainbow diagram in Fig. 5. This example illustrates a possibly important relationship for understanding and communicating how the budget might affect the defender's risk and choice of options. Risk managers can run a value-of-control or value-of-correlation diagram to see which nodes most directly affect the outcomes and which are correlated (Fig. 7). Because we have only two uncertainty nodes in our canonical model, the results are not surprising. The graphs show that the ability to acquire the agent is positively correlated with the defender risk: as the probability of acquiring the agent increases, so does defender risk. In addition, the value of control shows the amount of risk that could be reduced given perfect control over each probabilistic node, and it is clear that acquiring the agent would be the most important variable for risk managers to control. Admittedly, this is a basic example, but with a more complex model, analysts could determine which nodes are positively or negatively correlated with risk and which uncertainties are most important. Using COTS software also allows us to easily perform sensitivity analysis on key model assumptions. From the value of correlation and control above, the probability of acquiring the agent was highly and positively correlated with defender risk and had the greatest potential for reducing defender risk. We can generate sensitivity analyses such as rainbow diagrams. The rainbow diagram (Fig. 8) shows how the decisions change as our assumption about the probability of acquiring agent A increases.
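The stochastic-dominance comparisons used above to rank the budget options can be checked mechanically. A sketch for discrete risk distributions follows; the example probabilities in the usage are hypothetical, not the article's values:

```python
def stochastically_dominates(a, b):
    """First-order stochastic dominance check for discrete risk
    distributions. a and b map risk level -> probability. Option a
    dominates b if its exceedance probability P(risk > x) is never
    larger than b's at any level x, and strictly smaller somewhere
    (the defender prefers lower risk)."""
    levels = sorted(set(a) | set(b))

    def exceed(dist, x):
        # complementary cumulative probability P(risk > x)
        return sum(p for lvl, p in dist.items() if lvl > x)

    never_worse = all(exceed(a, x) <= exceed(b, x) + 1e-12 for x in levels)
    strictly_better = any(exceed(a, x) < exceed(b, x) - 1e-12 for x in levels)
    return never_worse and strictly_better
```

For instance, an option with distribution {0.0: 0.5, 0.4: 0.5} dominates one with {0.0: 0.2, 0.4: 0.8}, since the former shifts probability mass toward zero risk at every level.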
The different shaded regions represent different decisions, for both the attacker and the defender. This rainbow diagram was produced using a budget level of US$ 20 million, so in the original model the defender would choose not to add a city to BioWatch and to store 100% of the vaccine for agent A, but not to deploy it, because the attacker chose to use agent B. If the probability of acquiring agent A were low enough (region A in Fig. 8), the attacker would choose to use agent C, because the money has been spent on adding another city to BioWatch, which is the only decision that affects both agents A and B but not agent C. As the probability of acquiring agent A increases, both the attacker's and the defender's optimal strategies change. Our risk management decision depends on the probability that the adversary acquires agent A. Risk analysis of intelligent adversaries is fundamentally different from risk analysis of uncertain hazards. As we demonstrated in Section 1.3, assigning probabilities to the decisions of intelligent adversaries can underestimate the potential risk. Decision tree models of intelligent adversaries can provide insights into the risk posed by intelligent adversaries. The defender-attacker-defender decision analysis model presented in this article provides four important benefits. First, it provides a risk assessment (the baseline or status quo) based on defender and attacker objectives and probabilistic assessment of threat capabilities, vulnerabilities, and consequences. Second, it provides information for risk-informed decision making about potential risk management options. Third, using COTS software, we can provide a variety of very useful sensitivity analyses. Fourth, although the model would be developed by a team, the risk analysis can be conducted by one risk analyst with an understanding of decision trees and optimization and training on the features of the COTS software.
The application of risk assessment and risk management techniques should be driven by the goals of the analysis. In natural hazard risk analysis, there is value in performing risk assessment without risk management; useful examples are "Unfinished Business," a report from the EPA, and the 2008 U.K. National Risk Register. (36, 37) In intelligent adversary risk analysis, defender-attacker-defender decision analysis can provide essential information for risk management decision making. In our example, risk management techniques are important, and this type of model provides insights about resource allocation decisions to reduce or shift risk. In addition, with the budget set to US$ 0, the model can be used to assess the baseline risk. As the budget increases, the model clearly shows the best risk management decisions and the associated risk reduction. This model enables the use of COTS risk analysis software, which in turn enables standard sensitivity analysis tools that provide insights into areas where the assumptions are critical or where the model should be improved or expanded. Currently, many event tree models, including the DHS BTRA event tree, require extensive contractor support to run, compile, and analyze. (16) Although one would still need a multidisciplinary team to create the model, once completed, the defender-attacker-defender decision analysis model is usable by a single risk analyst who can provide near real-time analysis results to stakeholders and decision-makers, as long as the risk analyst understands the risk management options, decision trees, and optimization, and has training in the COTS tool. The technique we advocate in this article has limitations. Some of the limitations of this model are the same as those of event trees. There are limitations on the number of agents used in the models: we easily modeled 28 bioagents with reasonable run times, but more agents could be modeled.
In addition, there are challenges in assessing the probabilities of uncertain events, for example, the probability that the attacker acquires agent A. Next, there is a limitation in the modeling of multiple consequences. Another limitation is that, to get more realistic results, we may have to develop "response surface" models of more complex consequence models. These limitations are shared by event trees and decision trees. However, decision trees also have some limitations that are not shared by event trees. First, only a limited number of risk management decisions can realistically be modeled; therefore, care must be taken to choose the most appropriate set of potential decisions. (15, 18) In addition, there may be an upper bound on the number of decisions or events that can be modeled in COTS software. It is important to note that it may be difficult to determine an objective function for the attacker. As mentioned before, there is a tradeoff in replacing the probabilities assigned to what an attacker might do (the event tree approach) with attacker objectives (the decision tree approach). We believe it is easier to make informed assessments about the objectives of adversaries than to assess the probabilities of their future actions. However, we need more research on assessing the robustness of risk management decisions to assumptions about adversary objectives. Finally, successful model operation and interpretation requires trained analysts who understand decision analysis and defender-attacker-defender optimization. This article has demonstrated the feasibility of modeling intelligent adversary risk using defender-attacker-defender decision analysis. Table IV and Section 2.7 identified several alternative modeling assumptions that could be considered. We can modify and expand our assumptions to increase the complexity and fidelity of the model.
The next step is to use the model with the best data available on the agents of concern and a proposed set of potential risk management options. Use of our defender-attacker-defender model does not require a major intelligent adversary research program; it requires only the willingness to change. (16) Much of the data used for event tree models can be used in the decision analysis model. Assessing probabilities of attacker decisions will not increase our security, but defender-attacker-defender decision analysis models can provide a sound assessment of risk and the essential information our nation needs to make risk-informed decisions. G.S.P. is grateful for the many helpful discussions on intelligent adversary risk analysis with his colleagues on the 2008 NRC committee and for the defender-attacker-defender research of Jerry Brown and his colleagues at the Naval Postgraduate School. The authors are grateful for the DPL modeling advice provided by Chris Dalton of Syncopation. The authors thank Roger Burk at the United States Military Academy for his useful reviews and suggestions. Finally, the authors thank the area editor and reviewers for very detailed comments and suggestions that have helped us improve our article. This model is a multiobjective decision analysis/game theory model that allows for risk management at the U.S. governmental level in terms of budgeting and certain bioterror risk mitigation decisions. The values for probabilities as well as factors are notional and could easily be changed based on more accurate data. It uses the starting U.S. (defender) decisions of adding a city to the BioWatch program (or not) and the percentage of an agent's vaccine stored in the nation's vaccine reserve program to set the conditions for an attacker decision. The attacker can choose which agent to use as well as what size of population to target.
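The size of the strategy space follows directly from the decision branches described above. The option labels below are illustrative stand-ins for the model's actual branches, but this combination reproduces the 108 strategies reported for the canonical model:

```python
from itertools import product

# Hypothetical enumeration of the defender-attacker-defender decision space.
biowatch = ["no new city", "add city"]              # defender: 2 options
vaccine_storage = ["0%", "50%", "100%"]             # defender: 3 options
attacker_agent = ["A", "B", "C"]                    # attacker: 3 options
attacker_population = ["small", "medium", "large"]  # attacker: 3 options
deploy_reserve = ["deploy", "withhold"]             # defender: 2 options

strategies = list(product(biowatch, vaccine_storage,
                          attacker_agent, attacker_population,
                          deploy_reserve))
print(len(strategies))  # 2 * 3 * 3 * 3 * 2 = 108
```

Adding one more branch to any decision multiplies the strategy count, which is why richer assumptions can quickly make exhaustive evaluation expensive.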
There is some unpredictability in the ability to acquire the agent as well as in the effects of the agent given the defender and attacker decisions. Finally, the defender gets to choose whether to deploy the vaccine reserve to mitigate casualties. The model tracks the cost for each U.S. decision and evaluates the decisions over a specified budget; the decisions cannot violate the budget without incurring a dire penalty. The objectives that the model tracks are U.S. casualties and impact to the U.S. economy. They are joined together using a value function with weights for each objective. We outline our model using a method suggested by Brown and Rosenthal. (38)

References:
- Probabilistic risk assessment: reliability engineering, design, and analysis.
- Risk analysis in engineering and economics.
- Risk modeling, assessment, and management.
- Reactor safety study: assessment of accident risk in U.S. commercial nuclear plants.
- Nuclear Regulatory Commission (USNRC) fault tree handbook.
- Understanding risk management: a review of the literature and industry practice.
- A survey of risk assessment methods from the nuclear, chemical, and aerospace industries for applicability to the privatized vitrification of Hanford tank wastes.
- Procedural and submittal guidance for the individual plant examination of external events (IPEEE) for severe accident vulnerabilities.
- Procedure for analysis of common-cause failures in probabilistic safety analysis.
- A technique for human error analysis (ATHEANA).
- Homeland Security Presidential Directive 10.
- Homeland Security Presidential Directive 18 [HSPD-18]: medical countermeasures against weapons of mass destruction.
- Guiding resource allocations based on terrorism risk.
- Modeling values for anti-terrorism analysis.
- Department of Homeland Security's bioterrorism risk assessment: a call for change. Committee on Methodological Improvements to the Department of Homeland Security's Biological Agent Risk Analysis.
- Insurability of (mega)-terrorism risk: challenges and perspectives.
Report prepared for the OECD Task Force on Terrorism Insurance, Organization for Economic Cooperation and Development.
- Integrating risk management with homeland security and antiterrorism resource allocation decision-making. Chapter 10 in Kamien D (ed.), The McGraw-Hill Handbook of Homeland Security.
- Biological Threat Characterization Center of the National Biodefense Analysis and Countermeasures Center, Fort Detrick, MD.
- Choosing what to protect: strategic defensive allocation against an unknown attacker.
- Robust game theory.
- Robust solutions in Stackelberg games: addressing boundedly rational human preference models. Association for the Advancement of Artificial Intelligence Workshop: 55-60.
- Syncopation Software. Available at
- Probabilistic modeling of terrorist threats: a systems analysis approach to setting priorities among countermeasures.
- Probabilistic risk assessment: its possible use in safeguards problems. Presented at the Institute for Nuclear Materials Management meeting.
- Nature plays with dice-terrorists do not: allocating resources to counter strategic versus probabilistic risks.
- A risk and economic analysis of dirty bomb attacks on the ports of Los Angeles and Long Beach.
- Project BioShield: protecting Americans from terrorism.
- The BioWatch program: detection of bioterrorism.
- Strategic decision making: multiobjective decision analysis with spreadsheets.
- Bioterrorist agents/diseases definitions by category, Centers for Disease Control (CDC).
- Emerging and re-emerging infectious diseases.
- Strategic alternative responses to risks of terrorism.
- World at risk: report of the Commission on the Prevention of WMD Proliferation and Terrorism.
- Unfinished business: a comparative assessment of environmental problems.
- Optimization tradecraft: hard-won insights from real-world decision support.
Interfaces.

key: cord-117688-20gfpbyf authors: Cakmakli, Cem; Simsek, Yasin title: Bridging the COVID-19 data and the epidemiological model using a time varying parameter SIRD model date: 2020-07-03 doc_id: 117688 cord_uid: 20gfpbyf

This paper extends the canonical model of epidemiology, the SIRD model, to allow for time-varying parameters for real-time measurement of the stance of the COVID-19 pandemic. Time variation in the model parameters is captured using the generalized autoregressive score modelling structure, designed for the typically daily count data related to the pandemic. The resulting specification permits a flexible yet parsimonious model structure with a very low computational cost. This is especially crucial at the onset of the pandemic, when data are scarce and uncertainty is abundant. Full-sample results show that countries including the US, Brazil, and Russia are still not able to contain the pandemic, with the US having the worst performance. Furthermore, Iran and South Korea are likely to experience a second wave of the pandemic. A real-time exercise shows that the proposed structure delivers timely and precise information on the current stance of the pandemic, ahead of competitors that use a rolling window. This, in turn, translates into accurate short-term predictions of the active cases. We further modify the model to allow for unreported cases. Results suggest that the effect of these cases on the estimation results diminishes towards the end of the sample as the number of tests increases.

The outbreak of the new coronavirus, COVID-19, is one of the most severe health crises the world has encountered in the last decades, if not the last century. The spread of the virus has proceeded at an unexpected pace since the burst of the pandemic, first in Wuhan, China, in early January 2020. The World Health Organization declared the outbreak of COVID-19 a global pandemic on March 11, 2020.
Official records indicate that, as of the end of June, more than 10 million people are infected, with a total death toll approaching half a million. Anticipating the devastating humanitarian and economic effects of COVID-19, countries have taken various measures to contain the pandemic. A variety of measures have been imposed, ranging from complete lockdown, essentially freezing the flow of life for an uncertain period, to partial lockdown, implying a partial closure of daily routines for the protection of the most vulnerable in the population. On the contrary, some countries, including Sweden, England, and the Netherlands, were reluctant to consider any measures at the onset of the outbreak but rapidly switched to imposing lockdown measures. Recently, a vast majority of countries have launched the normalization process, confronting economic pressures. The decision to impose and/or relax these various sorts of measures, and the evaluation of the outcome of these actions, evidently rely on efficiently monitoring the course of the pandemic. Therefore, epidemiological models for estimating, and perhaps even more crucially for predicting, the trajectory of the pandemic come to the forefront. However, if these measures indeed turn out to be effective in changing the natural course of the pandemic, then the parameters of the epidemiological models alter to comply with this changing trajectory. This is the departure point of this paper. Specifically, we develop a simple and statistically coherent model that allows for time variation in the parameters of the conventional workhorse epidemiological model. We start our analysis by confronting a simple version of the workhorse epidemiological model with the actual data. From the perspective of econometrics, we specify a count process for modeling the course of the COVID-19 pandemic for a selected set of countries based on the SIRD model.
The SIRD model identifies the four states of the pandemic as susceptible, infected, recovered, and dead, and it depicts the evolution of these states depending on the total number of infected individuals; see Kermack and McKendrick (1927); Allen (2008). It is the contest between these forces, i.e., the parameters governing the rates of infection and resolution (in the form of recovery or death), that determines the course of the pandemic. Therefore, we extend the econometric model by allowing for time variation in the structural parameters, resorting to the generalized autoregressive score (GAS) modeling framework, which is a class of observation-driven models. The proposed model permits a flexible yet feasible framework that can track the evolution of the structural parameters quite timely and accurately. One important aspect of our specification is its relatively low computational cost, which might be crucial especially at the beginning of the pandemic, when data are scarce and the uncertainty is overwhelming. We construct a set of selected countries that have so far experienced different courses of the pandemic to demonstrate the efficacy of the proposed framework. These include countries that mitigate the pandemic but with differential momentum, countries where the pandemic starts relatively late, and countries that experience a second wave of the pandemic. This provides us a testing ground with a wide variety of patterns to examine the potential of the proposed model. Our results indicate that for a majority of countries the structural parameters alter over time. The rate of infection typically starts at a high level at the onset of the pandemic but decreases at distinct paces depending on the success of the country in containing the transmission of the virus. On the contrary, the recovery rate starts at a low level and stabilizes after an increase from these low values, reflecting the performance of health systems in handling the active cases.
As a result, the reproduction rates, computed as the ratio of the infection rate to the recovery and mortality rates, start at high levels, often exceeding the value 5, but diminish at differential rates. Two crucial findings emerge from the outcomes of the proposed model with time-varying parameters. First, the US, Russia, and Brazil still cannot cope with the (first wave of the) pandemic, as their reproduction rates have not fallen below the critical value of 1. Second, Iran and South Korea seem to experience a second wave of the pandemic. We further examine the real-time performance of the models. It is crucial for the models to indicate the stance of the pandemic in real time and to provide accurate and timely predictions, at least in the short run. A real-time estimation and forecasting exercise starting from April shows that the proposed model with time-varying parameters indeed provides timely information on the current stance of the pandemic ahead of the competing models. Moreover, it also provides superior forecasting performance up to one week ahead, especially for those countries currently experiencing a second wave of the pandemic. Finally, we extend the model to include the cases that are undocumented (as these infected individuals do not show symptoms), using the strategy in Grewelle and De Leo (2020). While the inclusion of those cases leads to large discrepancies in parameters compared to the initial findings, especially at the onset of the pandemic, the parameter values converge to similar values towards the end of May as the cumulative figures mount. The literature on estimating the SIRD model (with fixed parameters) and its variants to evaluate the current stance of the COVID-19 pandemic has exploded since the outbreak. Relatively early analyses include Read et al. (2020) and Lourenco et al. (2020), who estimate SIRD-based models with data from China for the former and from the UK and Italy for the latter, using a likelihood-based inference strategy.
Wu et al. (2020) blend COVID-19 data for China with mobility data and estimate the epidemiological model using Bayesian inference to predict the spread of the virus domestically and internationally. Li et al. (2020) conduct a similar analysis, employing a modified SIR model together with a network structure and mobility data to uncover the size of the undocumented cases; see also Hortaçsu et al. (2020). Zhang et al. (2020) extend the standard SIR model with many additional compartments and estimate a subset of the parameters using Bayesian inference. Several factors might lead to time variation in the parameters of the epidemiological model over the course of the pandemic. On the one hand, the lockdown measures taken by policy makers are intended to isolate the infected from the susceptible individuals. Therefore, the parameter governing the rate of infection, which reflects the average number of contacts of a person, is likely to alter with the conduct of lockdown; see for example Hale et al. (2020). On the other hand, advancements in the fight against COVID-19, including the discovery of drugs that could effectively mitigate the course of the disease and the installation (or the lack) of medical equipment such as ventilators, might alter the rate of recovery, or in other words the duration of the state of being infected; see for example Greenhalgh and Day (2017) on time variation in recovery rates. Accordingly, Anastassopoulou et al. (2020) use a least-squares-based approach on a rolling window of daily observations and document the time variation of the parameters in a SIRD-based model using Chinese data. Tan and Chen (2020) also employ a similar but more articulated rolling window strategy to capture the time variation in the model parameters. Other frameworks with time-varying model parameters almost exclusively allow for time variation only in the infection rate. An application prior to the COVID-19 outbreak includes, for example, Xu et al.
(2016), among others, who utilize a Gaussian process prior for the incidence rate involving the rate of infection using a Bayesian nonparametric structure. In the context of the COVID-19 pandemic, Kucharski et al. (2020) estimate a modified SIR model using a parameter-driven model framework, allowing the infection rate to follow a geometric random walk with the remaining parameters kept constant; see Marioli et al. (2020) for a similar approach. Similarly, Yang et al. (2020) and Fernández-Villaverde and Jones (2020) allow for time variation in the rate of infection, keeping the remaining parameters constant. In this paper, we propose an alternative modeling strategy to capture the time variation in the full set of structural parameters of the SIRD model. On the one hand, our modeling framework is statistically consistent with the typical count data structure related to the pandemic, unlike models that employ either least-squares-based inference or likelihood-based inference using the Gaussian distribution, e.g., the Kalman filter. On the other hand, our framework is computationally inexpensive, unlike models that are statistically consistent but employ particle-filter-type methods for inference, which are computationally quite costly. This might be crucial especially at the onset of the pandemic, when data are scarce and uncertainty about the course of the pandemic abounds. Our framework belongs to the class of observation-driven models, and specifically to the class of generalized autoregressive score models (henceforth GAS models) proposed by Creal et al. (2013). GAS models encompass many celebrated econometric models, including the generalized autoregressive conditional heteroskedasticity (GARCH) model and its variants as special cases, and thus they have proved to be useful in both model fitting and prediction. Koopman et al.
(2016) provide a comprehensive analysis of the predictive power of these models compared to parameter-driven models in many settings, including models with count data. Independent of the analysis of the COVID-19 pandemic, observation-driven models for count data have been considered in many different settings. Davis et al. (2003) provide a comprehensive analysis of observation-driven models with a focus on data with (conditional) Poisson distributions. Ferland et al. (2006) study an integer-valued GARCH specification for count time series. The remainder of the paper is organized as follows. In Section 2 we describe the canonical SIRD model and introduce the SIRD model with time-varying parameters, with full details provided in the Appendix. In that section, we also discuss the estimation results using full-sample data from various countries. In Section 3, we discuss the real-time performance of our model framework in capturing the current stance of the pandemic as well as in short-term forecasting compared to frequently used competitors. In Section 4, we extend the model to account for infected individuals who are not diagnosed and therefore not included in the sample. Finally, we conclude in Section 5. In the model, a fraction γ + ν of the infections is 'resolved' in total, through recovery at rate γ or death at rate ν. This leads to the following set of equations:

∆S_t = −β (S_{t−1}/N) I_{t−1}
∆R_t = γ I_{t−1}
∆D_t = ν I_{t−1}
∆I_t = β (S_{t−1}/N) I_{t−1} − (γ + ν) I_{t−1}

Note that the last equation defines the law of motion for the number of infected individuals, and it is the outcome of the first three equations, as ∆S_t + ∆R_t + ∆I_t + ∆D_t = 0 holds at any given time, assuming that the size of the population remains constant. The parameters of interest are the structural parameters β, γ, and ν, which provide information on the transmission and resolution rates of the COVID-19 pandemic. A central metric that characterizes the course of the pandemic is the reproduction number, R_0. The reproduction number refers to the speed of the diffusion, which can be computed as the ratio of newly confirmed cases, denoted ∆C_t, to the resolved cases, that is, ∆C_t/(∆R_t + ∆D_t).
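A minimal sketch of one step of the discretized deterministic SIRD recursion, with fixed parameters (the parameter values used in the test are illustrative, not the paper's estimates):

```python
def sird_step(S, I, R, D, beta, gamma, nu, N):
    """One day of the discretized deterministic SIRD model with fixed
    parameters beta (transmission), gamma (recovery) and nu (mortality).
    With S/N close to 1 early in the pandemic, new infections are
    approximately beta * I."""
    new_cases = beta * S / N * I
    new_recovered = gamma * I
    new_deaths = nu * I
    return (S - new_cases,
            I + new_cases - new_recovered - new_deaths,
            R + new_recovered,
            D + new_deaths)
```

Because the four increments sum to zero, the population S + I + R + D stays constant along the recursion, as the text notes.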
Therefore, it serves as a threshold parameter of many epidemiological models for disease extinction or spread. Considering the fact that S_t/N ≈ 1, R_{0,t} can be approximated by β/(γ + ν) in (1), and the approximation holds exactly when t = 0, referred to as the basic reproduction rate R_0. In this sense, a value of R_0 below unity indicates that the pandemic is contained, while a value exceeding unity implies that the spread of the pandemic continues. Inference on β, γ, and ν, and thereby inference on R_0, enables us to track the trajectory of the pandemic. Our main motivation for employing the model, from the econometrics point of view, is to conform this canonical epidemiological model to the actual datasets and pinpoint the stance of the pandemic in a timely manner. For that purpose, we first discretize (1), as the typical COVID-19 dataset involves daily observations on the counts of individuals belonging to these states of health. Motivated by this, we specify a counting process for the states using the Poisson distribution with conditional arrivals, implying a nonhomogeneous Poisson process for all the counts; see for example Allen (2008); Yan (2008); Rizoiu et al. (2018) for earlier examples and Li et al. (2020) in the COVID-19 context for a similar approach. This leads to the following specification for the stochastic evolution of the counts of these states:

∆C_t | Ω_{t−1} ~ Poisson(β I_{t−1}), ∆R_t | Ω_{t−1} ~ Poisson(γ I_{t−1}), ∆D_t | Ω_{t−1} ~ Poisson(ν I_{t−1}),

where Ω_t stands for the information set that is available up to time t. We assume that ∆C_t, ∆R_t, and ∆D_t are independent conditional on Ω_{t−1}. The resulting distribution for the active number of infections, I_t, is a Skellam distribution (conditional on Ω_{t−1}) with mean π I_{t−1} = (1 + β(1 − R_0^{−1})) I_{t−1} and variance β(1 + R_0^{−1}) I_{t−1}, where we use the identity in the last equation together with the definition of R_0. Therefore, the stationarity of the resulting process depends on whether R_0 < 1 or R_0 ≥ 1, i.e., whether the pandemic is brought under control or not.
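The conditional-Poisson specification can be simulated directly. A sketch using only the standard library (Knuth's inversion method is adequate for the moderate intensities here; the parameter values are illustrative):

```python
import math
import random

def draw_poisson(lam, rng):
    # Knuth's multiplication method for Poisson draws;
    # adequate for moderate intensities (lam well below ~700).
    if lam <= 0:
        return 0
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def simulate_counts(I0, beta, gamma, nu, days, seed=0):
    """Simulate daily new cases, recoveries and deaths as conditionally
    independent Poisson counts whose intensities scale with I_{t-1},
    i.e. a nonhomogeneous Poisson process for each state. Returns the
    path of active infections I_t."""
    rng = random.Random(seed)
    I, path = I0, []
    for _ in range(days):
        dC = draw_poisson(beta * I, rng)   # new confirmed cases
        dR = draw_poisson(gamma * I, rng)  # new recoveries
        dD = draw_poisson(nu * I, rng)     # new deaths
        I = max(I + dC - dR - dD, 0)
        path.append(I)
    return path
```

With β > γ + ν, i.e. R_0 = β/(γ + ν) > 1, the simulated active cases trend upward on average, matching the threshold interpretation of R_0 in the text.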
While the first two moments conditional on I_{t−1} are identical due to the Poisson distribution, the first and second moments diverge when we consider the unconditional moments. In this case, the unconditional moments follow the recursions derived in Appendix B.1, where we assume that the initial condition, I_0, is known. If the initial condition is considered a parameter to be estimated, then the variance is further amplified by a factor involving the variance of the initial condition. Accordingly, the unconditional moments of the states of the pandemic are linear functions of the unconditional moments of I_t. We refer to Appendix B.1 for details. We conduct Bayesian inference using the likelihood implied by (2) together with noninformative priors for estimating the model parameters. Specifically, for all the models, we use an independence Metropolis-Hastings algorithm with a normal candidate density centered at the posterior mode and scaled using the Hessian; see Robert and Casella (2013) for details. We use data for a selected set of countries starting from the early days of the pandemic until the end of the second week of June. For each country, we use the day when the number of confirmed cases exceeds 1000 as the starting point of the sample. We display the dataset in Table 1. [insert Table 1 about here] These countries exhibit extensive heterogeneity in terms of their experience of the pandemic. Some countries in this set, including South Korea, imposed strict measures of full lockdown and successful policies of testing and tracing promptly at the onset of the pandemic, while other countries, including Italy, imposed these immense measures only after a certain threshold in the number of infected individuals. Others, such as Turkey and the US, opted for mixed strategies involving partial lockdowns and voluntary quarantine. We also include other interesting cases, including Brazil and Russia, which are going through different phases of the pandemic.
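A minimal sketch of the independence Metropolis-Hastings sampler described above, applied to a single rate (the recovery rate) with simulated data; the data, flat prior, and tuning values are our assumptions, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 100 days of recoveries dR ~ Poisson(gamma * I_prev),
# with a true recovery rate gamma = 0.1
I_prev = np.full(100, 5_000)
dR = rng.poisson(0.1 * I_prev)

def log_lik(gamma):
    """Poisson log-likelihood (constants dropped)."""
    lam = gamma * I_prev
    return np.sum(dR * np.log(lam) - lam)

# Candidate density: normal around the posterior mode, Hessian-based scale
g_hat = dR.sum() / I_prev.sum()         # mode under a flat prior
sd = np.sqrt(g_hat / I_prev.sum())      # sqrt of inverse negative Hessian

def log_q(x):
    """Log proposal density (up to an additive constant)."""
    return -0.5 * ((x - g_hat) / sd) ** 2

draws, g = [], g_hat
for _ in range(2_000):
    prop = rng.normal(g_hat, sd)        # independence proposal
    if prop > 0:
        # the proposal density stays in the ratio: it does not cancel here
        log_alpha = log_lik(prop) - log_lik(g) + log_q(g) - log_q(prop)
        if np.log(rng.uniform()) < log_alpha:
            g = prop
    draws.append(g)

posterior_mean = float(np.mean(draws))
```

With an independence proposal the candidate does not depend on the current state, so the acceptance ratio keeps the proposal density terms, unlike a random-walk sampler.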
Finally, Iran is experiencing a second wave of the pandemic, and South Korea seems to be starting a similar pattern, albeit much milder. Hence, this relatively rich and heterogeneous dataset, involving countries with all sorts of pandemic experience, enables us to examine the success of the econometric model in tracking the changes in the structural parameters in response to these policy implementations. The estimates of the model parameters are displayed in Table 2. [insert Table 2 about here] We first focus on the basic reproduction rate, R_0, as displayed in the last column of Table 2. For all countries but South Korea, R_0 exceeds the threshold of 1, and for some it also exceeds 2. Apparently, when the full sample of several days is taken into account, the estimation results show that in almost none of the countries in our sample could the pandemic be taken under control. For Brazil, Russia and the US, which experienced relatively prolonged early phases of the pandemic, we estimate an R_0 that is very close to or exceeds 2, departing from the rest of the countries in the sample. This is due to the exceptionally low recovery rate in the case of the US and the high rate of infection for Brazil. Indeed, two groups come into view when considering the infection rate: Brazil and Iran constitute the group with a high infection rate, departing from the remaining countries. A similar grouping also appears for the mortality rate; in this case Italy joins the group with a high mortality rate together with Brazil and Iran. The estimation results in Table 2 rely on the SIRD model with fixed parameters as demonstrated in (2). While, as of the second week of June, it is widely accepted that the pandemic has been taken under control in countries including South Korea, we still obtain an R_0 very close to 1. This might be due to the rapid pace of infectiousness, captured by β, at the onset of the pandemic, which was brought under control by rapidly imposed measures.
Moreover, increasing knowledge about the SARS-CoV-2 virus, the availability of medical care facilities such as ICUs, and more effective treatment of the infection potentially lead to changes in the recovery rate γ and the mortality rate ν. Therefore, it might be crucial to model this time variation efficiently in a data-scarce environment, allowing for estimation at the onset of the pandemic as well. In this section we put forward the SIRD model with time-varying parameters. For modeling the time variation we use the framework of the generalized autoregressive score model (GAS hereafter), which encompasses a wide range of celebrated models in econometrics, including the generalized autoregressive conditional heteroscedasticity (GARCH) model and its variants. Briefly, the GAS model relies on the intuitive principle of modeling the time variation in key parameters in an autoregressive manner that evolves in the direction implied by the score function, thereby improving the (local) likelihood; see Creal et al. (2013) for a detailed analysis of the GAS model. As in the case of the GARCH model, it effectively captures time dependence at long lags in a parsimonious yet quite flexible structure. Perhaps more importantly, since it admits a recursive deterministic structure, the resulting data-driven time variation in parameters is computationally inexpensive to estimate. This might be crucial given that these flexible models are evaluated throughout the course of the pandemic, when data is often scarce, especially at the onset. Consider the SIRD model with time-varying parameters β_t, γ_t and ν_t. We first transform these parameters into logarithmic terms to ensure their positivity and thereby the positivity of the predicted counts in every time period. Let a parameter with a tilde denote the logarithmic transformation, i.e. β̃_t = ln(β_t), γ̃_t = ln(γ_t) and ν̃_t = ln(ν_t).
The resulting time-varying parameters SIRD (TVP-SIRD) model is as follows, where s_{1,t}, s_{2,t} and s_{3,t} are the (scaled) score functions of the joint likelihood. Since the likelihood function of the SIRD model is constituted by (conditionally) independent Poisson processes, each score function is derived using the corresponding compartment. Specifically, let ∇_{1,t} = ∂l(∆C_t; β_t)/∂β_t, ∇_{2,t} = ∂l(∆R_t; γ_t)/∂γ_t and ∇_{3,t} = ∂l(∆D_t; ν_t)/∂ν_t denote the score functions of the likelihood function for the period-t observation. We specify s_{i,t} such that the score functions are scaled by their variance. In the specific case of the SIRD model, this modeling strategy leads to the specification of the (scaled) score functions in terms of the logarithmic link function, with λ_{1,t} = β_t (S_{t−1}/N) I_{t−1}, λ_{2,t} = γ_t I_{t−1} and λ_{3,t} = ν_t I_{t−1}. The resulting specification implies an intuitive updating rule in the sense that the parameters (in logarithmic form) are updated using a combination of the recent parameter value and the recent percentage deviation from the mean. We refer to Appendix A for the details of the derivation of (5). The specification in (4) leads to quite rich dynamics both in terms of the mean and the variance of the resulting process. This enables us to capture the evolution of the pandemic accurately, which is reflected in the timely and prompt response of the parameters to changes in the data, i.e. in the states of the pandemic. We refer to Appendix B.2 for details on the properties of the process described in (4). An appealing feature of the TVP-SIRD model is that it encompasses the SIRD model with fixed parameters. For example, when α_1 = 1 and α_0 and α_2 are zero, the rate of infection, β_t, remains fixed over the course of the pandemic. This would indicate that the lockdown measures proved to be ineffective, as they did not lead to a systematic change in the infection rate. Similar reasoning also applies to γ_t and ν_t.
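The intuitive updating rule can be sketched as follows, assuming the scaled score in the log-parameterization reduces to the percentage deviation (y_t − λ_t)/λ_t, as the text states; all numbers below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

def gas_update(theta_prev, y, I_prev, a0, a1, a2):
    """One GAS step for a log-parameter of the TVP-SIRD model.

    After scaling by the score variance and applying the chain rule for
    the log transformation, the scaled score reduces to the percentage
    deviation of the observed count from its conditional mean.
    """
    lam = np.exp(theta_prev) * I_prev   # conditional Poisson mean
    s = (y - lam) / lam                 # scaled score: percentage deviation
    return a0 + a1 * theta_prev + a2 * s

# Illustration (our numbers): the true infection rate halves at t = 50,
# mimicking a lockdown; the filtered beta_t should migrate to the new level
theta, I, path = np.log(0.3), 50_000, []
for t in range(100):
    beta_true = 0.3 if t < 50 else 0.15
    y = rng.poisson(beta_true * I)      # daily new confirmed cases
    theta = gas_update(theta, y, I, a0=0.0, a1=1.0, a2=0.3)
    path.append(float(np.exp(theta)))
```

Note that setting a2 = 0 (with a1 = 1, a0 = 0) freezes the parameter, which is exactly the nesting of the fixed-parameter SIRD model described above.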
Therefore, it provides a solid framework for statistically testing the efficacy of measures for taking the pandemic under control. We display the estimates of the underlying parameters governing (the logarithms of) β_t, γ_t and ν_t in Table 3. [insert Table 3 about here] The parameter estimates in Table 3 indicate that the structural parameters governing the diffusion of the infection exhibit time-varying behaviour. For all the countries, the 95% credible intervals of the joint posterior distribution exclude the set α_0 = α_2 = 0 and α_1 = 1, implying that β_t indeed varies over time. The same conclusion also applies to the recovery rate γ_t and the mortality rate ν_t, which are treated as constant parameters in the vast majority of studies. We display the evolution of the underlying structural parameters β_t, γ_t, ν_t and the resulting R_{0,t} over time in Figure 1. [insert Figure 1 about here] The recovery rate shows less variation for a majority of countries; it usually starts at low levels, while the first wave of recoveries is still limited, and then stabilizes around some fixed value. Turkey seems to be an exception, with an increasing rate of recovery first towards the end of April and then at the end of May. This high rate of recovery coincides with the days right after the peak of active cases at the end of the third week of April. We also note that the uncertainty around these values rises as a result of this rapid change. For Italy and Russia the improvement in the recovery rate is quite gradual, approaching a stable and high level only towards the last week of May. When we consider the mortality rate, an interesting structure emerges: for all countries with the exception of South Korea, the mortality rate smoothly stabilizes around a fixed value.
While this fixed value is 0.001 for a majority of countries, it is lower for South Korea and larger for Brazil and Iran, reflecting the varying capability of the health systems of these countries in coping with the pandemic. For South Korea the mortality rate of the pandemic is quite low, and the seemingly volatile nature of the mortality rate might be due to these minuscule rates being prone to fluctuations over time. The course of the reproduction rate, R_{0,t}, is of central importance for tracing the efficacy of the containment efforts. The last column of Figure 1 displays these rates, and we discuss our findings country by country. For Brazil, although the reproduction rate has decreased from record high levels to values around 2, it is still larger than 1. Even more crucially, the drop from 2 to values just above 1 at the beginning of June is due to a sudden increase in the rate of recovery rather than a decrease in the infection rate. For Italy, the reproduction rate fell below the value of 1 at the beginning of May and has remained there since. Therefore, Italy seems to have taken the pandemic successfully under control. For Iran, the reproduction rate fell below 1 as early as mid-April, but it has exceeded this critical threshold since the second week of May and still remains above 1. As discussed earlier, this is, to a large extent, due to the increasing infection rate, reflecting the potential threat of a second wave. A similar pattern is also observed for South Korea. For the fixed-parameter SIRD model, we estimate over a rolling window of m days and repeat the process by adding one more observation (and dropping one observation at the beginning of the sample) recursively. We consider three cases, setting m = 10, 20 and 30, i.e. starting from ten days of data up to one month of data. For the TVP-SIRD model we use the expanding window up to time period t, as the parameters in this case are time varying. We display the evolution of the structural parameters β_t, γ_t, ν_t and the resulting R_{0,t} over time in Figure 2.
[insert Figure 2 about here] When we consider the rolling-window estimates using the SIRD model with fixed parameters, we observe that there is, as expected, a trade-off between the speed of reaction to the evolution of the pandemic and the window size. When the window size is 30 days the parameters evolve quite smoothly but cannot react to rapid changes promptly. On the contrary, with a window size of 10 days the parameters adjust to new conditions more quickly. When we focus on the time-varying parameters SIRD model, we observe that the parameters can accommodate newly changing conditions swiftly, ahead of the SIRD model with fixed parameters regardless of the window size. In addition, they can also react to abrupt changes in the data. (We do not discuss the results with 20- and 30-day rolling windows for the fixed-parameter SIRD model, as these perform worse than the model with the 10-day window.) Finally, we explore whether this capability of the TVP-SIRD model in reflecting the stance of the pandemic in a timely manner indeed proves useful in forecasting the number of active cases. This would also indicate whether the TVP-SIRD model reflects the current stance of the pandemic in a timely but accurate manner. We therefore perform a real-time forecasting exercise where, using our recursive estimations of the models in (2) and (4) based on the information available in time period t, we perform h = 1, 2, . . . , 7-day-ahead predictions of active cases, i.e. I_{t+h}. We use the first one third of the full sample as the estimation sample and expand the window by adding one more observation, repeating the procedure. This provides us with roughly at least 40 days of evaluation period for each country. We display the results, involving RMSFEs of the competing models relative to the TVP-SIRD model, in Table 4.
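The rolling-window estimates for the fixed-parameter model admit a closed form under the Poisson likelihood; a sketch of the speed/smoothness trade-off (our construction, with noiseless illustrative data):

```python
import numpy as np

def rolling_beta(dC, I_prev, m):
    """Rolling-window Poisson MLE of the infection rate: with S_t/N ≈ 1
    the closed-form estimate over a window is sum(dC) / sum(I_prev)."""
    dC, I_prev = np.asarray(dC, float), np.asarray(I_prev, float)
    return np.array([dC[t - m:t].sum() / I_prev[t - m:t].sum()
                     for t in range(m, len(dC) + 1)])

# Illustration: the rate drops from 0.3 to 0.15 at day 30; a short
# window reacts faster than a long one (all numbers are ours)
I = np.full(60, 1_000.0)
dC = np.r_[np.full(30, 300.0), np.full(30, 150.0)]
short, long_ = rolling_beta(dC, I, 10), rolling_beta(dC, I, 30)
```

By day 55 the 10-day window has fully adapted to the new rate of 0.15, while the 30-day window still averages over pre-break data, mirroring the trade-off discussed above.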
[insert Table 4 about here] Assumption 3: infected individuals that are asymptomatic always recover, and thus they do not switch to the state of 'death', i.e. D_t = D*_t. The second assumption implies that the recovery processes for infected individuals with and without symptoms are identical. This assumption is obviously subject to doubt; however, it saves us an extra parameter to calibrate. These assumptions serve as a rough approximation to the entire sample without deep epidemiological insight, and therefore the estimation outputs should be taken with caution. Using these assumptions, the TVP-SIRD model in terms of the total numbers can be written with ∆I*_t = ∆I_t/(1 − δ_t) and ∆R*_t = ∆R_t/(1 − δ_t). The key observation in this new set of equations is that the observed number of deaths is identical to the total number if the third assumption indeed holds. This, in turn, prevents the number of susceptible individuals from being scaled by the factor 1/(1 − δ_t), as the final equation suggests. Therefore, the evolution of the structural parameters differs from its counterpart in the previous cases, where the observed data are assumed to represent the full sample. While this source of variation might suffice for the identification of the δ_t parameter, we note that the number of deaths constitutes only a minuscule fraction of the total number of susceptible individuals. To enhance the identification of δ_t we exploit the information in the total number of tests, following Grewelle and De Leo (2020). Briefly, the underlying idea stems from the fact that the detection of infections, including asymptomatic individuals, improves with an increasing number of tests. In that sense, the fraction of tested individuals in the population should be related to the ratio of reported infections to the total number of infections. This leads to the expression in (7), where ρ_t is the fraction of tests with positive outcomes among all tests.
As the number of tests performed on the population increases, this fraction is expected to be low, and therefore δ_t approaches 1. On the other hand, if testing is concentrated only on symptomatic individuals, then this fraction is close to 1 and δ_t approaches a lower bound, captured by the parameter exp(−k), where the functional form allows for exponential decay. We display the evolution of the model parameters estimated using (6) and (7) in Figure 3 for selected countries. Compared to previous sections, countries including Brazil, Iran and Russia are excluded from the sample, as these countries do not provide accurate information on testing at a daily frequency. [insert Figure 3 about here] As can be seen from the graphs in the first row of Figure 3, we observe a sizable fraction of infected individuals that do not show symptoms in many countries at the beginning of the sample. This fraction is smallest for South Korea, starting from 15% at the beginning of the sample and decreasing to 4% in June, and largest for the US, starting from 50% of all infected individuals and decreasing to 30% in June. For Italy and Turkey, the fraction is about 20% and declines to about 10% in June. The temporary increases in the fraction of asymptomatic individuals at the beginning of the sample are due to the efforts of these countries to increase their testing capacity in April. In this case, the speed of increase in the fraction of positive outcomes falls behind the speed of increase in the number of tests, leading to 'spuriously' low values of δ_t. However, once capacity is reached, we observe the expected monotonically decreasing pattern in the course of δ_t. The impact of the relatively sizable fraction of asymptomatic individuals for the US can be seen in the last column of Figure 3. While the pattern of the evolution of the parameters remains unaffected, there are some level shifts in all rates, which seem to diminish towards June.
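The expression linking δ_t to test positivity is not reproduced in the text; a form consistent with both stated limits (δ_t → 1 as ρ_t → 0, and δ_t → exp(−k) as ρ_t → 1) is δ_t = exp(−k ρ_t). This functional form and the value of k below are our assumptions, offered only as a sketch:

```python
import math

def delta_t(rho, k=2.0):
    """Positivity-based identification sketch: exponential decay in the
    test-positivity fraction rho. The functional form and k are
    assumptions consistent with the limits stated in the text, not the
    paper's exact expression."""
    return math.exp(-k * rho)
```

Under this form, widespread testing (low positivity) pushes δ_t toward 1, while testing restricted to symptomatic cases (positivity near 1) pushes δ_t toward the lower bound exp(−k).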
We also observe similar level shifts of the parameters for other countries, albeit limited compared to the US case. These effects are further aggravated when we consider R_{0,t}, which is the ratio of the infection rate to the rate of resolution. In this case, for the US and Italy, R_{0,t} differs considerably at the beginning of the sample from the earlier estimates computed using the reported numbers. The parameter values converge as the cumulative figures mount, and this has little effect after the first half of the samples. Therefore, under these assumptions, the R_{0,t} computed using official statistics progressively reflects more and more of the actual stance of the pandemic. The world is struggling heavily to mitigate the spread of the COVID-19 pandemic, which so far has had devastating effects from both a humanitarian and an economic point of view. Countries have been imposing various measures to fight the pandemic, ranging from partial curfew to full lockdown, to lower its transmission. These measures supposedly pave the way for the normalization of economies and the reopening policies that have been under way since early June in many countries. Health systems with overloaded intensive care units lead to substantial variation in the number of recoveries as well as in daily death tolls over the course of the pandemic. Additionally, many countries, including South Korea and Iran, have started to experience a second wave of the pandemic after the easing of the first wave. Therefore, the parameters in the workhorse epidemiological SIRD model, and ultimately the key statistic of the pandemic, i.e. the reproduction rate, change over time due to the change in these structural parameters. In this paper, we extend the SIRD model to allow for time-varying structural parameters for timely and accurate measurement of the stance of the pandemic.
Our modeling framework falls into the class of generalized autoregressive score models, where the parameters evolve deterministically according to an autoregressive process in the direction implied by the score function. Therefore, the resulting approach permits quite a flexible yet parsimonious and statistically coherent framework that can operate easily in data-scarce environments due to its low computational cost. We demonstrate the potential of the proposed model using daily data from seven countries, ranging from the US to South Korea, that have had distinct pandemic dynamics over the last six months. Our results show that the proposed framework can track the stance of the pandemic well in real time. For all countries, the infection rate has declined considerably, but at differential speeds depending on the success in containing the pandemic. Our findings suggest that there is considerable fluctuation in recovery and mortality rates, which seem to become more stable towards June. For the US, Russia and Brazil the reproduction rate is above the critical level of 1, implying that these countries have not yet been able to contain the pandemic. Our findings confirm the observation that Iran and South Korea are experiencing a second wave of the pandemic. We further extend the model to include infected individuals that do not show symptoms and are therefore not diagnosed. This has a sizable impact on the estimated level of the reproduction rate at the onset of the sample, but the estimates converge to levels similar to the earlier findings towards the end of the sample.
Figure 3: the evolution of δ_t, β_t, γ_t, ν_t and R_{0,t} when asymptomatic infected individuals are also considered in the sample. The graphs show the evolution of the time-varying parameters: δ_t, the fraction of asymptomatic cases in total cases; β_t, the rate of infection; γ_t, the rate of recovery; ν_t, the mortality rate; and the resulting reproduction rate, R_{0,t}, estimated using the TVP-SIRD model introduced in (4), displayed with the blue line, and the TVP-SIRD model that also takes the unreported cases into account, introduced in (6) and (7), displayed with the red line. Let f_1(∆C_t|Ω_{t−1}), f_2(∆R_t|Ω_{t−1}) and f_3(∆D_t|Ω_{t−1}) denote the conditional probability density functions for ∆C_t, ∆R_t and ∆D_t conditional on the information set at time period t−1, Ω_{t−1}, respectively. Assuming (conditional) independence among these variables, the conditional joint probability density function can be written as f(∆C_t, ∆R_t, ∆D_t|Ω_{t−1}) = f_1(∆C_t|Ω_{t−1}) f_2(∆R_t|Ω_{t−1}) f_3(∆D_t|Ω_{t−1}). We assume that these marginal distributions are Poisson with the arrival rates specified in the SIRD model in equation (2), with λ_{2,t} = γ_t I_{t−1} and λ_{3,t} = ν_t I_{t−1}. The score functions, denoted ∇_{1,t}, ∇_{2,t} and ∇_{3,t}, then follow compartment by compartment. We use the variance of the score functions as the scaling parameter. The variance of the score function ∇_{1,t}, for example, can be computed by noting that E[(∆C_t − λ_{1,t})^2|Ω_{t−1}] is the variance of the Poisson-distributed random variable ∆C_t and is therefore identical to λ_{1,t}.
Hence, the resulting expression follows, and similar computations lead to the variances of the score functions for ∆R_t and ∆D_t as Var(∇_{2,t}|Ω_{t−1}) = λ_{2,t}/γ_t^2 and Var(∇_{3,t}|Ω_{t−1}) = λ_{3,t}/ν_t^2 (A.4)-(A.5). Scaling (A.2) by (A.4) and (A.5), the scaled score functions follow accordingly. The final step is the division of the scaled score functions by β_t, γ_t and ν_t, respectively, to obtain the scaled score functions in terms of the parameters with logarithmic transformations, applying the chain rule. This yields the time evolution of β̃_t = log(β_t); combining (A.7) with the SIRD equations gives the final model. We assume that the initial values of the states, S_0, I_0, R_0 and D_0, are known. For the sake of simplicity, we assume that S_t ≈ N. We focus on the general form of the equations, y_t|Ω_{t−1} ∼ Poisson(λI_{t−1}), where λ = β, γ and ν for y_t = ∆C_t, ∆R_t and ∆D_t, respectively. The conditional mean and the variance of the Poisson-distributed variables are equal, and the resulting process is stationary if the underlying process for I_t is stationary. This, in turn, depends on the basic reproduction rate, R_0. To see this, we start with ∆I_t; the difference equation governing I_t then takes the form E[I_t|Ω_{t−1}] = (1 + β(1 − R_0^{−1}))I_{t−1}, where the last equality uses the definition of the basic reproduction rate, R_0, as the ratio of the infection rate to the resolution rate. If R_0 exceeds unity, as β is positive, the process is explosive, i.e. the pandemic progresses exponentially. On the other hand, if R_0 falls below unity, the process becomes stationary. We can track the process conditional on the starting value I_0. Let π = (1 + β(1 − R_0^{−1})). Then E[I_t] = π^t I_0, (B.11) in case the initial condition is known; otherwise I_0 is replaced by E[I_0]. For the variance we can use a similar recursion. We start with computing Var(I_t|Ω_{t−1}).
By the law of total variance and forward iteration, the unconditional variance can be computed as Var(I_t) = β(1 + R_0^{−1})E[I_{t−1}] + π^2 Var(I_{t−1}) . . . , where, in case the initial condition is known, the term involving the variance of the initial condition drops. Finally, we can use (B.11) and (B.13) to construct the unconditional moments of y_t, in particular Var(y_t) = λE[I_t] + λ^2 Var(I_t). (B.14) A common drawback of models involving variables assumed to follow a Poisson distribution is that the conditional mean is assumed to be identical to the variance. However, as can be seen in the above derivations, the spread of the random variable y_t is greater than its expected value, i.e. Var[y_t] > E[y_t]. Therefore, the model allows for overdispersion in the data. As in the previous section, we focus on the general form of the equations, y_t|Ω_{t−1} ∼ Poisson(λ_t) with λ_t = ψ_t I_{t−1}, where ψ_t = β_t, γ_t and ν_t for y_t = ∆C_t, ∆R_t and ∆D_t, respectively. We consider w_t = log(λ_t) = log(ψ_t) + log(I_{t−1}) = θ_t + k_{t−1} to ensure positivity of the dynamic arrival rate. The time evolution of the model parameters follows the autoregressive scheme above. Assuming stationarity for the evolution of the parameters, the unconditional moments of θ_t follow accordingly; moreover, cov(e_t, e_s) = 0 for t ≠ s. This suggests that the e_t series can be considered i.i.d. innovations, and the unconditional moments of θ_t follow using the i.i.d. property of the innovations. For deriving the expectations of the variables on the states of the pandemic, we use an approximation based on the Gaussian distribution; see Davis et al. (2003). Moreover, the unconditional variance of the y_t series follows from the law of total variance together with the delta method. Notice that λ_t is a linear function of I_{t−1}, and thus the discussion of the properties of I_t in the previous section also applies here. An important distinction is that the process governing I_t in (B.10) is replaced with parameters that are time varying.
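A quick Monte Carlo check of the mean recursion (B.11) and of the overdispersion property Var(y_t) > E[y_t]; parameter values below are ours, chosen so that R_0 < 1 and the process is stationary:

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative parameters: r0 = 0.8 < 1, so the process is stationary
beta, gamma, nu, I0, T = 0.1, 0.1, 0.025, 1_000, 10
r0 = beta / (gamma + nu)
pi = 1.0 + beta * (1.0 - 1.0 / r0)        # mean contraction factor per day

# Simulate 50,000 independent paths of the conditional-Poisson process
I = np.full(50_000, float(I0))
for _ in range(T):
    dC = rng.poisson(beta * I)            # new confirmed cases
    dRes = rng.poisson((gamma + nu) * I)  # resolved: recoveries + deaths
    I = I + dC - dRes
y = rng.poisson(beta * I)                 # one more day of confirmed cases

mc_mean, analytic_mean = float(I.mean()), pi ** T * I0
```

The cross-path average of I_T matches π^T I_0 closely, and the sample variance of y exceeds its sample mean, which is the overdispersion that a plain Poisson model would rule out.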
Hence, the score component of the model has a significant impact on the long-run behavior of the model variables. We impose stationarity restrictions on the dynamic process governing θ_t. Therefore, the unconditional moments of I_t can be derived similarly, with θ_t replaced by its unconditional mean and variance. Furthermore, consider the conditional variance Var(θ_t|Ω_{t−1}) = α_2^2 λ_{t−1}^{−1}. The time variation in the conditional variance of the data therefore stems from both k_{t−1} and θ_t, and its evolution is characterized by the coefficients of the lagged score variable. This provides considerably rich dynamics for capturing the evolution of the pandemic, reflected in the timely and prompt response of the parameters to changes in the data, i.e. in the states of the pandemic, as shown in Figure 1 and also in Figure 2, where we consider the real-time performance of the models.

References (only titles were preserved in extraction):
- A multi-risk SIR model with optimally targeted lockdown
- An introduction to stochastic epidemic models
- Data-based analysis, modelling and forecasting of the COVID-19 outbreak
- Generalized Poisson autoregressive models for time series of counts
- Generalized autoregressive score models with applications
- Observation-driven models for Poisson counts
- Integer-valued GARCH process
- Estimating and simulating a SIRD model of COVID-19 for many countries, states, and cities
- Poisson autoregression
- Time-varying and state-dependent recovery rates in epidemiological models
- Estimating the global infection fatality rate of COVID-19
- Variation in government responses to COVID-19. Blavatnik School of Government Working Paper 31
- Estimating the fraction of unreported infections in epidemics with a known epicenter: an application to COVID-19
- A contribution to the mathematical theory of epidemics
- Predicting time-varying parameters with parameter-driven and observation-driven models
- Early dynamics of transmission and control of COVID-19: a mathematical modelling study
- Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2)
- Fundamental principles of epidemic spread highlight the immediate need for large-scale serological surveys to assess the stage of the SARS-CoV-2 epidemic
- Estimating the COVID-19 infection rate: anatomy of an inference problem
- Tracking R of COVID-19: a new real-time estimation using the Kalman filter. medRxiv
- Novel coronavirus 2019-nCoV: early estimation of epidemiological parameters and epidemic predictions
- SIR-Hawkes: linking epidemic models and Hawkes processes to model diffusions in finite populations
- Monte Carlo statistical methods
- Real-time differential epidemic analysis and prediction for COVID-19 pandemic
- Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study
- Bayesian non-parametric inference for stochastic epidemic models using Gaussian processes
- Distribution theory, stochastic processes and infectious disease modelling
- Short-term forecasts and long-term mitigation evaluations for the COVID-19 epidemic in Hubei Province, China
- Prediction of the COVID-19 outbreak based on a realistic stochastic model

key: cord-017003-3farxcc3
authors: Koibuchi, Yukio; Sato, Shinji
title: Numerical Simulation of Urban Coastal Zones
date: 2010
journal: Advanced Monitoring and Numerical Analysis of Coastal Water and Urban Air Environment
doi: 10.1007/978-4-431-99720-7_3
doc_id: 17003
cord_uid: 3farxcc3

Water quality is directly and indirectly influenced by a variety of flows.
For example, since phytoplankton drifts passively according to water flows, it is strongly influenced by the distribution of flow in the bay (Lucas et al. 1999). Bay water motions also exert a strong influence on other water quality parameters. These kinds of influences are direct and obvious, and their degree naturally increases with the strength of the currents. In contrast, even a very small flow may have a persistent spatial pattern, for example one directed toward the outer bay. In this case, a nutrient load discharged at the head of the bay is transported to distant locations, and the water exchange rate of the bay probably also increases. The permissible nutrient loading, i.e. the load the bay can accept, will then increase due to the increase in the water exchange rate. For water retention, the strength of the currents is not as important as their spatial patterns. For example, tidal currents are dominant in urban coastal areas, but they are oscillatory: water particles in tidal currents move toward the head of the bay during flood tides but move back toward the mouth of the bay during ebb tides. As a result, tidal currents do not substantially transport water particles. In contrast, density currents are clearly weaker than tidal currents, but they flow in one direction continuously and transport water particles more efficiently. As a result, they substantially affect water retention times and ecosystem characteristics. A water quality model for bays must therefore consist of a three-dimensional circulation model and an ecosystem model that describes the pelagic and benthic aspects of nutrient cycling. This section focuses on physical modeling, and the next section deals with water quality modeling. The final section discusses the application of these models to Tokyo Bay.
Many three-dimensional hydrodynamic models have been developed in recent decades, including POM (Princeton Ocean Model; Blumberg and Mellor 1987), CH3D (Curvilinear Hydrodynamics in 3 Dimensions; Johnson et al. 1993) and ROMS (Regional Ocean Modeling System; MacCready et al. 2002; Li et al. 2005). These models solve the Navier-Stokes equations with forcing (wind stress, the Coriolis force and the buoyancy force) under adequate approximations, called the hydrostatic and Boussinesq approximations. The hydrostatic approximation assumes a perfect balance between pressure gradients and gravity: in other words, no acceleration occurs in the vertical direction. This is justified because the aspect ratio of urban coastal areas is extremely small, and hence vertical motions are considered to be small and further inhibited by gravitational forces under stable density stratification. This means that vertical acceleration is negligible and the fluid behaves as though it were under static equilibrium as far as vertical motion is concerned (Proudman 1953). Density variations in urban coastal areas are also small (less than 3% or so), so density can be considered constant, except where body forces resulting from the motion of a density-stratified fluid in a gravitational field are concerned. This approximation is called the Boussinesq approximation: changes in the mass or inertia of a fluid body due to changes in its density are negligible, while the same changes in density are consequential when the gravitational field is present (Kantha and Clayson 2000). Therefore, following Boussinesq (1903), this approximation justifies replacing ρ by a constant reference density ρ_0 everywhere except in terms involving the gravitational acceleration constant g. Under these approximations, the governing equations take the form given below, where t stands for time.
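The equation set itself is not reproduced here; a standard form of the hydrostatic, Boussinesq governing equations, consistent with the symbols defined in the surrounding text, would read (this is our reconstruction, not the authors' exact typography; f denotes the Coriolis coefficient, defined later in the text):

```latex
% Hydrostatic, Boussinesq momentum and continuity equations
% (reconstruction consistent with the symbols defined in the text)
\begin{aligned}
\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} + w\frac{\partial u}{\partial z}
 &= fv - \frac{1}{\rho_0}\frac{\partial p}{\partial x}
   + \frac{\partial}{\partial x}\!\left(A_x \frac{\partial u}{\partial x}\right)
   + \frac{\partial}{\partial y}\!\left(A_y \frac{\partial u}{\partial y}\right)
   + \frac{\partial}{\partial z}\!\left(A_z \frac{\partial u}{\partial z}\right),\\
\frac{\partial v}{\partial t} + u\frac{\partial v}{\partial x} + v\frac{\partial v}{\partial y} + w\frac{\partial v}{\partial z}
 &= -fu - \frac{1}{\rho_0}\frac{\partial p}{\partial y}
   + \frac{\partial}{\partial x}\!\left(A_x \frac{\partial v}{\partial x}\right)
   + \frac{\partial}{\partial y}\!\left(A_y \frac{\partial v}{\partial y}\right)
   + \frac{\partial}{\partial z}\!\left(A_z \frac{\partial v}{\partial z}\right),\\
\frac{\partial p}{\partial z} &= -\rho g \qquad \text{(hydrostatic balance)},\\
\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} &= 0 .
\end{aligned}
```

The third line is the hydrostatic approximation replacing the full vertical momentum equation, and the fourth is incompressible continuity, as implied by the Boussinesq approximation.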
u, v, and w are the velocity components in the x, y, and z directions. the symbol ρ′ is the deviation from the reference density ρ0, defined by ρ = ρ0 + ρ′. the symbols a_x, a_y and a_z are the eddy viscosities in the x, y, and z directions, and g is the acceleration due to gravity. all currents in urban coastal zones, including tides, wind-driven currents, and density currents, are strongly influenced by geometry and bathymetry, and these areas are rarely regular in shape. in particular, a coastline near an urban coastal area is more complex than a natural one, owing to reclamation and harbor construction, and the uniformity of the bathymetry is further lessened by dredging for vessel transport. a computational grid is required that accurately represents such complex geometry and bathymetry. for this reason, the choice of grid system has varied with the progress of modeling, although the governing equations are essentially the same. for vertical coordinate systems (shown in fig. 3-1 ), cartesian (z-coordinate) and sigma-coordinate grids have been widely used. a cartesian grid is easy to understand, and shows a direct correspondence between program code and governing equations. it is sometimes more accurate than a sigma-coordinate grid, especially if the bathymetry of the bay is simple and mild, whereas the sigma-coordinate system tends to produce errors in the presence of steep bottom topography. however, unless an excessively large number of vertical levels is employed, the cartesian grid fails to represent the bottom topography with satisfactory accuracy. the sigma-coordinate system is convenient in the sense that it essentially introduces a "flattening out" mechanism for variable bottoms at z = −h(x, y), and the flow near the seabed is also calculated well.
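the typeset equations did not survive extraction, so the following is a reconstruction of the standard hydrostatic boussinesq momentum and continuity equations in the form used by models such as pom (with f the coriolis coefficient and p the pressure, both defined in the next subsection); it should be read as the generic textbook form rather than the exact equations of the original:

```latex
\begin{aligned}
&\frac{\partial u}{\partial t}+u\frac{\partial u}{\partial x}+v\frac{\partial u}{\partial y}+w\frac{\partial u}{\partial z}-fv
 =-\frac{1}{\rho_0}\frac{\partial p}{\partial x}
 +\frac{\partial}{\partial x}\!\left(a_x\frac{\partial u}{\partial x}\right)
 +\frac{\partial}{\partial y}\!\left(a_y\frac{\partial u}{\partial y}\right)
 +\frac{\partial}{\partial z}\!\left(a_z\frac{\partial u}{\partial z}\right)\\
&\frac{\partial v}{\partial t}+u\frac{\partial v}{\partial x}+v\frac{\partial v}{\partial y}+w\frac{\partial v}{\partial z}+fu
 =-\frac{1}{\rho_0}\frac{\partial p}{\partial y}
 +\frac{\partial}{\partial x}\!\left(a_x\frac{\partial v}{\partial x}\right)
 +\frac{\partial}{\partial y}\!\left(a_y\frac{\partial v}{\partial y}\right)
 +\frac{\partial}{\partial z}\!\left(a_z\frac{\partial v}{\partial z}\right)\\
&\frac{\partial p}{\partial z}=-\rho g,\qquad
 \frac{\partial u}{\partial x}+\frac{\partial v}{\partial y}+\frac{\partial w}{\partial z}=0
\end{aligned}
```

the first two equations are the horizontal momentum balances, the third is the hydrostatic balance that replaces the vertical momentum equation, and the last is the incompressible continuity equation under the boussinesq approximation.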
moreover, the sigma-coordinate system is easy to program, since the number of vertical grid points can be kept the same everywhere and the setting of boundary conditions is simple. it has long been widely used in both meteorology and oceanography (phillips 1957; freeman et al. 1972) . after incorporating the approximations, the governing equations in the sigma-coordinate system are as follows, where t stands for time. u, v, and σ̇ are the velocity components in the x, y, and σ directions in the σ-coordinate system. η is the surface elevation, h is the undisturbed water depth, and H = h + η is the total water depth. f is the coriolis coefficient and p stands for pressure. ρ0 and ρ′ are the constant reference density and the deviation from it, with ρ = ρ0 + ρ′. a_h and a_v are the horizontal and vertical eddy viscosity coefficients, respectively, and g is the acceleration due to gravity. recently, the stretched grid system (s-grid system), an extension of the sigma-coordinate system, has also become popular. a sigma-coordinate grid generally divides the vertical coordinate into equal intervals, whereas the s-grid system has higher resolution near the surface and bottom (haidvogel et al. 2000) . wind stress and bottom friction are imposed as the surface and bottom boundary conditions, respectively; these settings are easy in sigma-coordinate systems since the number of vertical grid points is everywhere the same. at lateral boundaries, normal velocities are set to zero and a free-slip condition is applied to the friction terms; at the open boundary, the velocity gradient is set to zero. horizontal computational grids are also adapted to the topography. the simplest horizontal grid is the rectangular grid with fixed spacing, the horizontal counterpart of the cartesian vertical grid. recently, curvilinear coordinate systems, which allow greater flexibility than rectangular grids, have been widely used. fig.
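as a minimal illustration of the vertical transformation behind these equations, the sketch below maps between z and σ using σ = (z − η)/(h + η), so that σ = 0 at the free surface and σ = −1 at the seabed regardless of local depth. the function names and the sign convention are assumptions for illustration (the convention shown is the one used by pom), not code from the original model:

```python
def to_sigma(z, eta, h):
    """map a height z (z = eta at the surface, z = -h at the seabed) to sigma."""
    return (z - eta) / (h + eta)

def to_z(sigma, eta, h):
    """inverse transform: recover z from a sigma level."""
    return eta + sigma * (h + eta)

# uniform sigma levels follow the bottom topography automatically:
levels = [i / 10.0 for i in range(0, -11, -1)]          # 0, -0.1, ..., -1.0
depths_shallow = [to_z(s, eta=0.2, h=5.0) for s in levels]
depths_deep = [to_z(s, eta=0.2, h=50.0) for s in levels]
```

the same eleven σ levels resolve a 5 m shoal and a 50 m channel alike, which is why boundary conditions and loop bounds stay identical across the grid.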
3-2 (from ming et al. 2005) shows an example of a horizontal curvilinear coordinate system. the chesapeake bay, like other bays, has a typically complex geometry, so a horizontal curvilinear coordinate system is advantageous (li et al. 2005) . this approach extends to nested grid systems, in which finer grids are used in regions where detailed information is needed. in urban coastal areas, density differences play an essential role in water quality and currents. one of the most important phenomena induced by the density effect is stratification. once stratification occurs in a coastal zone, surface water and bottom water are isolated from each other. this process is very important when we discuss the distribution of pollutants from the land. stratification also enhances the growth of phytoplankton in the surface layer and oxygen depletion near the seabed. moreover, estuarine circulation is induced by the density difference between salty sea water and river flow. fig. 3-3 shows a schematic diagram of estuarine circulation. river water runs through the urban area and discharges from the river mouth, spreading over the sea surface like a veil, since river water has a low density compared with saline sea water. to cancel the density difference between the river water and the saline water, a great deal of sea water is entrained into the river water flow. this mixing continues until the river water reaches the same density as the surrounding sea water, resulting in a vertical circulation in the bay that is several to ten times greater than the river flux (unoki 1998) . thus, estuarine circulation induces seaward currents at the surface and landward currents near the bottom. the speed of these currents is slow compared with the tidal currents, as explained previously, but since estuarine circulation flows in a fixed direction, its material transport is very effective over long time scales in spite of its small velocity.
estuarine circulation also plays an important role in the nutrient cycles of stratified bays. organic matter is deposited on the seabed after phytoplankton blooms or river runoff and is decomposed there by bacteria; the resulting nutrients are released from the seabed under the anoxic conditions of summer. in order to include density effects in the numerical model, conservation equations for temperature and salinity are also included. then, to obtain a realistic prediction of vertical stratification, a turbulence closure model is employed (mellor and yamada 1982) . consequently, flows driven by various mechanisms -e.g. gravitational, wind-driven, and topographically induced flows -can be reproduced within the physical numerical model. the diffusion equations for temperature and salinity in the sigma-coordinate system are as follows. here, t and s stand for temperature and salinity, respectively. heat balance and moisture balance at the surface are imposed as the surface boundary conditions for temperature and salinity, respectively. c_p and q stand for the specific heat coefficient and the net surface heat flux, respectively, and r stands for river discharge. k_h and k_v are the horizontal and vertical eddy diffusion coefficients, respectively. the vertical mixing coefficients a_v and k_v are obtained from the second-order turbulence closure scheme of mellor and yamada (1982) , which characterizes turbulence by transport equations for the turbulent kinetic energy q² and the turbulence macroscale l (eqs. 3.12-3.13). a wall proximity function w enters these equations (eq. 3.14), and the mixing coefficients are then given in terms of the stability functions s_m, s_h, and s_q, which are analytically derived from algebraic relations and depend functionally on q and l. these relations follow from the closure hypotheses described by mellor (1973) and later summarized by mellor and yamada (1982) .
a semi-implicit finite difference scheme is adopted, in which the equations are discretized explicitly in the horizontal direction and implicitly in the vertical direction. an arakawa c staggered grid is used with a first-order upwind scheme. the tri-diagonal form of the momentum equation is exploited: combined with the mass conservation equation, it yields an algebraic equation in which the only unknown is the surface elevation η in implicit form. this algebraic equation is solved by the successive over relaxation (sor) method. the phenomena in urban coastal zones are not only physical but also biological and chemical, and each relates to the others, so the ecosystems and water quality of urban coastal zones are highly complicated. to deal with these complex systems, a water quality model is composed of a three-dimensional physical circulation model and an ecosystem model that describes the pelagic and benthic aspects of nutrient cycling; the pelagic and benthic systems also interact with each other. in ecosystem models, each water quality variable is often called a compartment. various ecosystem models have been proposed (kremer and nixon 1978; fasham et al. 1990; chai et al. 2002; kishi et al. 2007) , some of them widely applied, such as ce-qual-icm (cerco and cole 1995) . each model is built around basic aquatic compartments such as phytoplankton, zooplankton, and nutrients, with differences in the treatment of sediment, detritus, and the detailed modeling of phytoplankton depending on the target ecosystem and the objectives of the study. as a result, no single model is applicable to all water bodies: a model adaptable to every area would produce results too complex to discuss, and would not be so different from observing the real world. for example, ocean ecosystem models tend to focus only on pelagic systems.
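the sor iteration mentioned above for the surface elevation can be sketched as follows. this is a hedged, self-contained toy (a poisson-type five-point system on a square grid with zero boundary values), illustrating only the relaxation step itself, not the actual discretization of the model:

```python
import numpy as np

def sor_solve(b, dx=1.0, omega=1.5, tol=1e-10, max_iter=5000):
    """solve the discrete five-point laplacian(eta) = b, eta = 0 on the boundary, by sor."""
    n, m = b.shape
    eta = np.zeros((n, m))
    for _ in range(max_iter):
        max_diff = 0.0
        for i in range(1, n - 1):
            for j in range(1, m - 1):
                # gauss-seidel value from the latest neighbours
                gauss = 0.25 * (eta[i + 1, j] + eta[i - 1, j]
                                + eta[i, j + 1] + eta[i, j - 1]
                                - dx * dx * b[i, j])
                # over-relax: step past the gauss-seidel value by factor omega
                diff = omega * (gauss - eta[i, j])
                eta[i, j] += diff
                max_diff = max(max_diff, abs(diff))
        if max_diff < tol:          # converged: updates are negligible
            break
    return eta

eta = sor_solve(np.ones((20, 20)))
```

with 1 < omega < 2 the iteration converges substantially faster than plain gauss-seidel; the optimal omega depends on the grid size.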
they tend to ignore benthic modeling, since the open ocean is deep enough to prevent the return of detritus from the seabed to the water column. meanwhile, ocean ecosystem models generally include some metals in order to represent limiting factors for phytoplankton; these metals are fully abundant in urban coastal zones but are often depleted during phytoplankton growth in the open ocean. on the other hand, the concentration of phytoplankton in coastal areas is highly variable both spatially and temporally compared with the open sea, and the subsequent sedimentation of blooms constitutes a major input to benthic ecology (waite et al. 1992; matsukawa 1990; yamaguchi et al. 1991) . to represent these phenomena, ecosystem models of coastal zones usually include benthic systems. ecosystem models solve conservation equations for the relevant components with appropriate source and sink terms, in the same way as the temperature and salinity modeling in physical models explained in sect. 3.1.4 . for the sigma-coordinate system, the mathematical formulation of the conservation of mass is written with c denoting the concentration of the water quality variable and t the time. fluxes into and out of the target control volume are calculated using the physical model results. s(x,y,σ,t) represents sources or sinks of the water quality variable due to internal production and removal by biogeochemical effects; it also represents the kinetic interactions of the compartments. w(x,y,σ,t) represents external inputs of the variable c. for example, phytoplankton constitutes the first level in the food chain of the pelagic ecosystem of a bay. phytoplankton photosynthesizes using sunlight and increases; the source term s of phytoplankton grows with the amount of photosynthesis. phytoplankton is then decreased by the grazing of zooplankton.
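a one-dimensional sketch of how such a conservation equation is advanced in time, using the same first-order upwind choice mentioned for the physical scheme, might look like this. the function and variable names are illustrative, not from the original code, and real models do this in three dimensions on the sigma grid:

```python
def step_tracer(c, u, kdiff, s, w, dx, dt):
    """one explicit step of dc/dt = -u dc/dx + kdiff d2c/dx2 + s + w (1-d, upwind)."""
    new = list(c)
    for i in range(1, len(c) - 1):
        if u >= 0.0:
            adv = u * (c[i] - c[i - 1]) / dx        # upwind from the left
        else:
            adv = u * (c[i + 1] - c[i]) / dx        # upwind from the right
        dif = kdiff * (c[i + 1] - 2.0 * c[i] + c[i - 1]) / dx ** 2
        new[i] = c[i] + dt * (-adv + dif + s[i] + w[i])
    return new

# a concentration spike spreads under diffusion while sources s and inputs w add mass
n = 51
c0 = [0.0] * n
c0[25] = 1.0
zeros = [0.0] * n
c1 = step_tracer(c0, u=0.0, kdiff=0.1, s=zeros, w=zeros, dx=1.0, dt=1.0)
```

the kinetic interactions between compartments (photosynthesis, grazing, and so on) enter through s at each grid point, exactly as described in the text.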
the source term s of zooplankton is increased by this grazing, while the source term of phytoplankton is decreased. ecosystem models thus express the relationships among compartments through mathematical expressions, providing a quantitative description of the influence of physical circulation on the biological and chemical processes of urban coastal zones. the ecosystem model introduced here was developed to simulate the nutrient budget of an urban coastal zone. it includes the temporal and spatial variations of phytoplankton, nutrients, detritus, and dissolved oxygen (do). in urban coastal zones, nutrients emitted from urban areas are not a limiting factor for phytoplankton growth; nevertheless, quantifying the nutrient budget is essential for analyzing and restoring the ecosystems of urban coastal zones. fig. 3.4 shows the schematic interactions of the lower-trophic ecosystem model used for tokyo bay (koibuchi et al. 2001) . this model has 18 state variables: phytoplankton (phy), zooplankton (zoo), nutrients (nh4, no3, po4 and si), labile detritus (ldon, ldop, ldosi) and refractory detritus (rdon, rdop, rdosi) for each nutrient, labile detrital carbon (ldoc), refractory detrital carbon (rdoc), dissolved organic carbon (doc), and dissolved oxygen (do), together with sedimentation processes for particulate organic material. since the basic structure of the model follows the widely applied ce-qual-icm (cerco and cole 1993, 1995) , this section mainly focuses on our modifications of the ce-qual-icm model. the model deals with four phytoplankton groups. phyd1 is based on skeletonema costatum, the dominant phytoplankton species in tokyo bay. phyd2 represents a winter diatom group (such as eucampia). phyr is a mixed summer assemblage consisting primarily of heterosigma akashiwo and thalassiosira, and phyz denotes the dinoflagellates.
these four phytoplankton assemblages have different optimal light levels for photosynthesis, maximum growth rates, optimal temperatures for growth, and half-saturation constants for nutrient uptake; only diatoms use silica during growth. the time rate of change of phytoplankton due to biological activity and sinking is given by eq. 3.18, where x = d1, d2, r, z denotes the phytoplankton assemblage. the phytoplankton growth rate μ depends on the temperature t, on the photosynthetically available radiation i, and on the nutrient concentrations of nitrogen, phosphorus, and silica (fig. 3.4: idealized nutrient cycling in tokyo bay's ecosystem according to the model of koibuchi et al. (2001) , showing the cycling among the 18 state variables). here μ_max(t) is the growth rate at ambient temperature, with maximum growth rate μ_max = μ0 · 1.066^t (eppley 1972); t_opt is the optimal temperature of each plankton assemblage; b1 and b2 are shaping coefficients; and k_no3, k_nh4, k_po4 and k_si are the michaelis-menten half-saturation constants for each nutrient. i decreases exponentially with water depth z according to eq. 3.24, where i0 is the shortwave radiation and par is the fraction of light that is available for photosynthesis. k_w, k_chl, and k_sal are the light attenuation coefficients for water, chlorophyll, and depth-averaged salinity, respectively. suspended sediment reduces underwater light intensity and affects the growth of phytoplankton, and its concentration should ideally be simulated as its own compartment; however, the re-suspension rate of mixed mud and the available data on suspended sediment concentrations in river water and on the seabed are very limited.
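putting these pieces together, a hedged sketch of the growth-rate evaluation for one phytoplankton group might read as below. the eppley temperature curve and the michaelis-menten nutrient terms follow the text; the light response here is a simple saturating stand-in for the evans-parslow function l(i), the combined-nitrogen form is an assumption, and every parameter value is illustrative rather than a calibrated tokyo bay value:

```python
import math

def light_at_depth(i0, par, z, kw=0.04, kchl=0.025, chl=5.0):
    """available light at depth z (m, positive down), cf. eq. 3.24 (coefficients assumed)."""
    return i0 * par * math.exp(-(kw + kchl * chl) * z)

def growth_rate(t, i, no3, nh4, po4, si,
                mu0=0.6, k_no3=1.0, k_nh4=0.5, k_po4=0.1, k_si=1.5, k_i=50.0):
    """specific growth rate (1/day) = mu_max(t) * light limitation * nutrient limitation."""
    mu_max = mu0 * 1.066 ** t                       # eppley (1972) temperature curve
    light = i / (k_i + i)                           # saturating stand-in for l(i)
    n_lim = (no3 + nh4) / (k_no3 + no3 + nh4)       # combined inorganic nitrogen (assumed form)
    p_lim = po4 / (k_po4 + po4)                     # michaelis-menten phosphate limitation
    si_lim = si / (k_si + si)                       # silica limits the diatom groups only
    return mu_max * light * min(n_lim, p_lim, si_lim)
```

each of the four assemblages would carry its own parameter set (t_opt, mu0, half-saturation constants), with the silica term dropped for the non-diatom groups.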
therefore, we parameterized this effect through salinity, using field observation data from 1999 and 2000, as shown in fig. 3-5. the function l(i) represents the photosynthesis-light relationship (evans and parslow 1985) . the rate of phytoplankton grazing, g, is a function of ambient temperature, where k_grz is the predation rate at 20°c. the other phytoplankton loss terms are mortality, represented by the linear rate m_p, and sinking, where w_px is the constant vertical sinking velocity of each phytoplankton group. the growth of zooplankton is expressed as follows: here b is the assimilation efficiency of phytoplankton by zooplankton, and l_bm and l_e denote excretion due to basal metabolism and ingestion, while the remaining fraction is transferred to the detritus. m_z is the loss coefficient of zooplankton mortality. the nutrient compartments have four principal forms for each nutrient (nitrogen, phosphorus, and silica): dissolved organic nutrients, labile and refractory particulate organic nutrients (lpon and rpon, respectively, in the case of nitrogen), and dissolved inorganic nutrients. only the dissolved inorganic nutrients are utilized by phytoplankton for growth; nutrients are converted among these organic and inorganic forms via respiration and predation. fig. 3.6 shows an example of the nutrient cycles using phosphorus. dop, lpop, and rpop work as pools of phosphorus. certain labile compounds that are rapidly degraded, such as the sugars and amino acids in the particulate organic matter deposited on the sediment surface, decompose readily; others, such as cellulose, are more refractory, i.e. resistant to decomposition. table 3.1 shows the distribution of detritus forms by each event, based on pett (1989) . ammonia and nitrate are utilized by phytoplankton for growth: ammonia is the preferred form of inorganic nitrogen for algal growth, but phytoplankton utilize nitrate when ammonia concentrations become depleted.
nitrogen is returned from algal biomass to the various dissolved and particulate organic nitrogen pools through respiration and predatory grazing. the time rates of variation due to the biological processes of nitrate and ammonium are as follows. denitrification does not occur in the pelagic water column in this model, but rather in the anoxic sediment layer; when denitrification occurs in the sediment, nitrate is transferred into the sediment by diffusion. phosphorus kinetics is basically similar to nitrogen kinetics, except for the absence of denitrification and for the alkaline phosphatase effect in the dop degradation processes. many phytoplankton can enhance alkaline phosphatase activity, which makes it possible for them to use phosphate from the dop pool (fitzgerald and nelson 1966) . this effect is formulated in the model by eq. 3.31, where r_dop is the decomposition rate for dop, r_dop-min is the minimum constant of dop decomposition (day⁻¹), k_po4 is the half-saturation constant of phosphate uptake, and r_dop-di is the acceleration of dop decomposition by diatoms. the kinetics of silica is fundamentally the same as that of phosphorus: only diatoms utilize silica during growth, and silica is returned to the unavailable silica pool during respiration and predation. the sediment system is divided into two layers (see fig. 3.6 ), an aerobic layer and an anoxic layer. organic carbon concentrations in the sediment are controlled by the detritus burial velocity, the decomposition speeds of labile and refractory organic carbon, and the rate constant for the diagenesis of particulate organic carbon. the thickness of the aerobic layer is calculated from oxygen diffusion whenever the oxygen in the bottom layer of the pelagic system is not zero. the sediment nutrient model, a simplified version of the pelagic one, treats the nutrients ammonium, nitrate, phosphate, and silica and their exchanges with the pelagic system.
silicate-dependent diatoms and non-silicate-dependent algae are distinguished. dissolved oxygen is an essential index of the water quality of an urban coastal zone. the sources of do included in the model are reaeration at the sea surface, photosynthesis of phytoplankton, and do in inflows. the sinks of do include respiration of phytoplankton and zooplankton, oxidation of detrital carbon (ldoc and rdoc), nitrification, and sediment oxygen demand. the time variation of do is formulated with the following coefficients: k_oc is the oxygen-to-carbon ratio and k_on the oxygen-to-nitrogen ratio. k_a and θ_a^(t−20) denote the reaeration rate at 20°c and the temperature coefficient for reaeration at the sea surface, respectively. k_nh4 and θ_nh4^(t−20) are the ammonia oxidation rate at 20°c and its temperature coefficient, and k_nit is the half-saturation constant of ammonia oxidation. k_rdoc and θ_rdoc^(t−20) are the rdoc mineralization rate at 20°c and the temperature coefficient for rdoc mineralization; k_ldoc and θ_ldoc^(t−20) are the corresponding quantities for ldoc, and k_mldoc is the half-saturation constant for ldoc mineralization. the do saturation concentration depends on temperature and salinity, and the oxygen saturation value is calculated from an empirical equation in the temperature t and salinity s. tokyo bay is located at the central part of the main island of japan. the inner bay, extending about 50 km along the main axis of the bay from its narrowest channel, connects to the pacific ocean (fig. 3.7 ); its average depth and width are 18 m and 25 km, respectively. tokyo bay is one of the most eutrophic bays in japan: phytoplankton increase in the surface layer from late spring to early fall, and oxygen depletion and the formation of hydrogen sulfide occur at the seabed.
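the do saturation equation itself did not survive extraction, so the sketch below uses a classic freshwater polynomial fit in temperature together with a crude linear salinity correction; both the polynomial coefficients and the correction factor are illustrative assumptions, not the formula actually used in the model:

```python
def do_saturation(t, s):
    """approximate dissolved-oxygen saturation (mg/l) at temperature t (degc), salinity s (psu)."""
    # freshwater polynomial fit in temperature, valid roughly 0-30 degc
    fresh = 14.652 - 0.41022 * t + 0.0079910 * t ** 2 - 0.000077774 * t ** 3
    # assumed linear salinity reduction (~15-18% lower at oceanic salinity)
    return fresh * (1.0 - 0.005 * s)
```

whatever the exact coefficients, saturation falls with both temperature and salinity, which is why warm, stratified summer bottom water holds so little oxygen to begin with.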
the sea-water color at the head of the bay sometimes becomes a milky blue-green in late summer after a continuous north wind (koibuchi et al. 2001) ; this phenomenon is called a blue tide. in the last decade a variety of water quality observation equipment has been developed, making it easier to measure water quality than in the past. however, even with advanced technology, measuring the flux of nutrients is not easy. to quantify the nutrient budget, we applied our numerical model to tokyo bay. the computational domain was divided into 1 km horizontal grids with 20 vertical layers, and the computation was carried out from april 1, 1999 to march 31, 2000, with time increments of 300 s. the japan meteorological agency provided hourly meteorological data, including surface wind stress, precipitation, and solar radiation. at the open boundary, an observed tide level was imposed, which can be downloaded from the japan oceanographic data center (jodc) of the japan coast guard. don and dop at the open boundary were set at 30% of tn and tp, based on the observation results of suzumura and ogawa (2001) . fig. 3-8 shows the temporal variation of the computed density at s2. the simulation of water column density over the whole period (april 1999-october 1999) agreed well with the measured density: differences between simulated and observed values were generally less than 0.5 through the water column. the time variation of density effectively reproduced the observed results, including short-term wind-induced variation, the formation of stratification during summer, and mixing after a continuous strong wind in the middle of october. the calculation also reproduced an upwelling event during the summer season. total chlorophyll-a concentrations in the surface water were reproduced relatively well by the model, which captured the temporal increases in chlorophyll-a seen in the observation results, as denoted by the arrows in fig. 3-9.
during these periods, phytoplankton increased to more than 50 μg/l; in tokyo bay, a red tide is defined as a chlorophyll-a concentration greater than 50 μg/l. the four different types of phytoplankton assemblage were also represented. computed do showed high variability compared with field measurements at the bottom, and was relatively higher than the field data from late september through october; oxygen-depleted water formed at the seabed, reproducing the basic trend of the do variations (fig. 3-10 ). the simulation of phosphate captured not only the observed increase in the surface layer during the summer season, but also the variability observed over the study period (fig. 3-11 ): for example, phosphate concentrations increased from june to july due to phosphate release from the sediment, and the simulation reproduced this trend driven by the oxygen-depleted water. fig. 3-12 shows the nitrate concentration. nitrate in the surface layer fluctuated considerably during this period, occasionally doubling or tripling at the surface; the timing of the high nitrate concentrations and of the low density in the surface layer coincided with increases in river discharge. concentrations of nitrate were underestimated in bottom waters during summer, and further study is needed to simulate the denitrification processes in the sediment layer. fig. 3-13 shows the calculated annual budgets of nitrogen and phosphorus in tokyo bay; an annual budget is useful for understanding nutrient cycles. nitrogen is supplied to a considerable degree from rivers, since atmospheric nitrogen input is significant around urban areas. phytoplankton take up the nitrogen and sink to bottom waters, where they are decomposed by heterotrophic processes that consume oxygen. at the head of the bay (between line 1 and line 2 in fig. 3-7 ), about 40% of the nitrogen sinks as detritus and 20% is lost to the atmosphere by denitrification, while ammonia released from the sediment amounts to 20%.
about 60% of the nitrogen load flows out of the bay. in contrast, atmospheric phosphorus input to the bay is negligible compared with the contribution of phosphorus from other sources. phosphate is released from the sediment in the same amount as that discharged from the rivers, and it is transported to the head of the bay by estuarine circulation; as a result, the amount of phosphate in the inner bay remains high. nitrogen, in contrast, is mainly supplied from the river mouth and transported quickly out of the bay. in conclusion, nitrogen and phosphorus showed important differences in the mechanisms by which they cycle in the bay (fig. 3-13 summarizes the fluxes and process rates calculated in tokyo bay from january 1999 to january 2000; units are given in ton/year for each element). the regeneration of nutrients and their release from the sediment is an important source for phytoplankton growth, equal to the contribution from the rivers; phosphorus in particular is largely retained within the system through recycling between sediment and water. these results show the difficulty of reducing the eutrophication of bays through the construction of sewage treatment plants alone. big cities have long developed near waterfronts, and even now marine transport remains one of the most important transportation systems, especially for heavy industry and agriculture. today many of the world's largest cities are located on coastal zones, and vast quantities of human waste are therefore discharged into near-shore zones (walker 1990) . fifty percent of the world's population lives within 100 km of the sea. many people visit urban coastal zones for recreation and leisure, and we also consume seafood harvested from these areas. as a result, effluents released into the water pose a risk of pathogen contamination and human disease.
this risk is particularly heightened for waters that receive combined sewer overflows (csos) from urban cities where both sanitary and storm waters are conveyed in the same sewer system. to decrease the risk from introduced pathogens, monitoring that is both well designed and routine is essential. however, even though a surprising number of pathogens have been reported in the sea, measuring them is difficult and time consuming, not least because such pathogens typically exist in a "viable but non-culturable" (vbnc) state. in addition, the physical environments of urban coastal zones vary widely in time and space: their complicated geographical features border both inland waters and the outer ocean, and both affect them. for example, tidal currents, the dominant phenomenon in this area, oscillate with the tidal periods; even if the emitted levels of pathogens were constant and we could monitor pathogen indicator organisms at a fixed place, the measured levels would fluctuate with the tides, and density stratification also changes with the tides. consequently, frequent measurement of pathogens would be needed to assess the risk they pose in urban coastal zones, but monitoring at such frequency appears to be impossible. to solve this conundrum and achieve an assessment of pathogen risk, we developed a set of numerical models that expand upon the models developed in sect. 3.1 and include a pathogen model coupled with a three-dimensional hydrodynamic model. section 3.2.2 deals with the distributions of pathogens in urban coastal zones, sect. 3.2.3 explains the pathogen model in greater detail, and sect. 3.2.4 deals with numerical experiments that help in understanding the effects of appropriate countermeasures. figure 3-14 shows some typical density distribution patterns in urban coastal zones.
the changing balance between tidal amplitude and river discharge is responsible for the differences among these patterns (fig. 3-14 gives a cross-sectional view of the mixing patterns in urban coastal zones). as tidal currents increase, the production of turbulent kinetic energy grows and can become the largest source of mixing in shallow coastal waters. river-discharged water, on the other hand, has a low density, creating a density difference between the sea water and the land-input water. in salt-wedge estuaries (fig. 3-14 , top), river water is discharged into a sea with a small tidal range, so the strength of the tidal currents is small relative to the river flow. this creates vertical density stratification: river water spreads like a veil over the sea surface and moves seaward, while bottom water moves toward the river mouth and mixes with the river water. under such conditions, pathogens move on the surface of the sea, and further mixing with the low-density fresh water is restricted by stratification. in partially mixed estuaries (fig. 3-14 , middle), the tidal forcing becomes a more effective mixing mechanism: fresh water and sea water are mixed by turbulent energy, so pathogens emitted from sewage treatment plants are more mixed than in the static salt-wedge estuaries. in well-mixed estuaries (fig. 3-14 , bottom), the mixing of salt and river waters becomes still more complete owing to the increased strength of tidal currents relative to river flow, and the density difference develops in the horizontal direction. as a result, pathogens are mixed through the water column and settle onto the sea bed, in turn contaminating estuarine waters through re-suspension during spring tides or rainfall events (pommepuy et al. 1992) . figure 3-15 shows the distribution of pathogens in coastal environments. these pathogens encounter a wide range of stresses, including uv rays (sinton et al.
2002) , temperature differences (matsumoto and omura 1980) , ph (solić and krstulović 1992) , salinity (omura et al. 1982) , and lack of nutrients. the pathogens are transported by currents and repeatedly settle and re-suspend in urban coastal zones (pommepuy et al. 1992) . the major pathogens of concern (including adenovirus, enterovirus, rotavirus, norovirus, and coronavirus) are not usually modeled, owing to the difficulty of modeling them and the lack of observational data in coastal environments. we instead modeled escherichia coli (e. coli) using experimental data in coastal sea water. this model consists of a three-dimensional hydrodynamic model and an e. coli model (onozawa et al. 2005) ; the mathematical framework employed in the e. coli model takes the same approach as that explained in sect. 3.1.3 . the mass balance of e. coli is expressed by eq. 3.33, where coli denotes the concentration of e. coli (cfu/100 ml) and t is time. sink represents the sinking speed of e. coli, u_i denotes the flow speed in the advection term, e_i denotes the diffusion coefficients, and sal denotes the salinity-dependent die-off rate (ppt/day). sunlight is generally recognized to be one agent by which bacteria are inactivated, through uv damage to the bacterial cell (sinton et al. 2002) ; however, this particular target area has high turbidity that rapidly absorbs uv rays at the sea surface, so this process is ignored in the model. the numerical simulations were performed with two nested domains to fit the complex geographic features around the odaiba area; these nested grids make it possible to represent the stratification effect. a detailed configuration of the model is summarized in table 3-2. the two computational domains cover the whole bay area (fig. 3-16 ), and the salinity simulation was compared with observation results.
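the salinity-dependent die-off term in eq. 3.33 amounts to first-order decay of the e. coli concentration. a hedged sketch, ignoring advection, diffusion, and sinking, with an assumed illustrative rate constant (not a value from the original model):

```python
import math

def ecoli_decay(coli0, k_sal, sal, days):
    """first-order salinity-dependent die-off: c(t) = c0 * exp(-k_sal * sal * t)."""
    return coli0 * math.exp(-k_sal * sal * days)

def t90(k_sal, sal):
    """time (days) for a 90% reduction at the given salinity."""
    return math.log(10.0) / (k_sal * sal)
```

with a hypothetical k_sal of 0.05 per (psu·day) at a salinity of 30 psu, t90 comes out to about 1.5 days, which illustrates why the tidal and density-driven transport during the first day or two after a cso event controls where contamination appears.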
the simulation results reproduce stratification, mixing, and an upwelling phenomenon, including their levels and timing. fig. 3-19 shows a comparison between observation results and a calculation of temperature and salinity at the fine grid scale (domain 2 in fig. 3-16). differences between the simulated and observed values were generally less than 2.5 °c and 2 psu through the water column, and the timing and periods of upwelling events were captured accurately. after rainfall, river discharge increased remarkably; the model results adequately represent precipitation events and their effects. figure 3-20 shows a comparison between modeled and measured e. coli at stn.1. the current standard for acceptably safe bathing beaches set by the ministry of the environment of japan is a fecal coliform count of 1000 colony-forming units per 100 ml (cfu/100 ml). this fecal coliform index includes not only e. coli but other organisms as well; however, it is well known that the majority of the fecal coliform in this area comes from e. coli. therefore, we use a value of 1000 cfu/100 ml of e. coli as the standard for safe bathing in the sea. from the calculation results, we can see that the periods when this standard is exceeded are very limited, and that most of the summer period falls below the standard for swimming. the results also show that increases in e. coli do not simply track levels of precipitation: even after small precipitation events, e. coli increased significantly. understanding the effects of physical factors is important to understanding the fate and distributions of pathogens. such an understanding is needed because variations in levels of e. coli are directly correlated with the discharge from pumping stations, tidal currents, river discharges, and density distributions, as explained in sect. 3.2.2. as a result, the distributions of cso differ according to timing, even when the level of discharge is the same.
we performed numerical experiments in order to evaluate the contributions of these different discharges and phenomena to cso distributions. the first numerical experiment was a nowcast simulation that calculated e. coli distributions under realistic conditions. the second was a numerical experiment to estimate the effects of a waste-reservoir that was being constructed near the shibaura area. the numerical experiments were also applied to the odaiba area, which is used as a bathing area. figure 3-21 shows temporal variations of precipitation and river discharges (top), as well as tide levels and e. coli concentrations discharged from three different areas. the shibaura and sunamachi areas are located up-bay from the odaiba area; morigasaki has the largest area, but is located down-bay from the odaiba area (see fig. 3 ). in this spring tide period, tidal ranges can reach 2 m. small amounts of precipitation were measured from august 29th to 30th. river discharge increased with precipitation, reaching 50 m³/s. the levels of e. coli increased rapidly after the rainfall event, due mainly to discharges from the sunamachi and shibaura areas. near the end of this period of increase, effluent from morigasaki also reached the odaiba area. fig. 3-22 shows the spatial distributions of e. coli at three different times. from this figure, the e. coli emitted from the morigasaki area can be seen to have been transported from the lower region of the bay to the odaiba area. this is because the small amount of river discharge resulted in a thin surface layer of low-density water with highly concentrated e. coli, and stratification tended to isolate the e. coli by preventing it from mixing into the water column. in contrast, fig. 3-23 shows a large precipitation case during the neap tide period. large amounts of precipitation produced a large river discharge that reached 500 m³/s. in this period, only the upper bay's csos arrived at the odaiba area.
no contributions from the morigasaki area took place. in conclusion, the concentrations of e. coli vary widely in space and time.
[fig. 3-22. spatial distributions of e. coli at odaiba area]
the density distributions produced by the balance of tides and river discharges have very complex effects. e. coli concentrations reached maximum levels after small precipitation events, but did not increase as much under large precipitation events, owing to mixing. these kinds of results would be impossible to understand from observation alone. the model successfully captured the complex distributions of e. coli and helped our understanding of pathogen contamination. to mitigate cso pollution, the construction of storage tanks at three sites in tokyo has been planned by the tokyo metropolitan government; shibaura is the target area of this plan around the odaiba area. a numerical simulation was performed with and without the proposed storage tank, which has a capacity of 30,000 m³. this storage tank can store csos after rainfall. to include the effect of continuous rain, we assumed that the csos stored in the tank could be purified within 1 day. table 3.3 shows the calculation results for the mitigation effect of the storage tank. these numbers denote the days on which the standard for bathing in the sea (over 1000 cfu/100 ml) was exceeded. fig. 3.17 shows the observation stations. from this table, stn.3 shows the largest decrease in csos among the five stations. this is because stn.3 is located closest to the pumping stations, and is therefore most sensitive to the csos. before the construction of the cso storage tank, the bathing standard was exceeded on 15 days; after the construction of the storage tank, this decreased to 11 days, an improvement of 4 days. at the other stations, on the other hand, the improvement was only 2 days or, in some cases, less than 1 day.
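the mitigation effect in table 3.3 is reported as the number of days exceeding the 1000 cfu/100 ml bathing standard. counting such exceedance days from a simulated daily concentration series is a one-line reduction; the helper below is a sketch of that bookkeeping (the function name and the example series are ours, not from the study).

```python
def unsafe_days(daily_ecoli, standard=1000.0):
    """Count days on which the daily peak e. coli concentration
    (cfu/100 ml) exceeds the bathing standard (default 1000)."""
    return sum(1 for c in daily_ecoli if c > standard)
```

applied to the before/after concentration series at a station like stn.3, the difference between the two counts gives the improvement in days attributable to the storage tank.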
such improvements were hardly observable at the stations located inside the odaiba area, owing to its enclosed bathymetry. for comparison, over 10,000 storage tanks have been built in germany alone, and another 10,000 were planned there during the 1980s. our plans for dealing with csos are not enough to mitigate the effects of csos completely. at the same time, these results show us the complexity of pathogen distributions and the importance of numerical modeling for this problem. numerical simulation is one of the most important tools for the management of water quality and ecosystems in urban coastal zones. we have developed a water quality model that simulates both nutrient cycles and pathogen distributions, and coupled it with a three-dimensional hydrodynamic model of urban coastal areas. to quantify the nutrient budget, a numerical model should include material cycles with phytoplankton, zooplankton, carbon, nutrients, and oxygen. we applied this model to tokyo bay and simulated water column temperatures, salinity, and nutrient concentrations that agreed closely with field observations. the model successfully captured the timing of stratification events and the subsequent changes in bottom water oxygen and nutrients. our model results also indicated that there were clear differences between the material cycles of nitrogen and phosphorus inside the bay. the regeneration of nutrients and their release from sediment was found to be a source of phytoplankton growth of the same order of importance as the contributions from rivers. in particular, phosphorus was found to be largely retained within the system through recycling between sediment and water. we also developed a pathogen model that includes e. coli and applied it to the simulation of cso influences in urban coastal zones. these results indicate that, because of stratification, concentrations of e.
coli increase significantly after even small precipitation events. from this study, the balance between tidal mixing and river waters can be seen to be significant. however, these are only two case studies; it remains necessary to simulate the structure and characteristics of cso distributions and their impact on urban coastal zone pollution. such simulations remain as future work to be undertaken.

references:
- ministry of the environment
- a description of a three-dimensional coastal ocean circulation model
- phytoplankton kinetics in a subtrophical estuary: eutrophication
- three-dimensional eutrophication model of chesapeake bay
- user's guide to the ce-qual-icm: three-dimensional eutrophication model
- one dimensional ecosystem model of the equatorial pacific upwelling system part i: model development and silicon and nitrogen cycle
- temperature and phytoplankton growth in the sea
- a model of annual plankton cycles
- a nitrogen-based model of phytoplankton dynamics in the oceanic mixed layer
- extractive and enzymatic analysis for limiting or surplus phosphorus in algae
- a modified sigma equations approach to the numerical modeling of great lake hydrodynamics
- model evaluation experiments in the north atlantic basin: simulations in nonlinear terrain-following coordinates
- verification of a three-dimensional hydrodynamic model of chesapeake bay
- small scale processes in geophysical fluid flows
- nemuro - a lower trophic level model for the north pacific marine ecosystem
- blue tide occurred in the west of tokyo bay in summer of
- study on budget and circulation of nitrogen and phosphorus in tokyo bay
- a coastal marine ecosystem: simulation and analysis
- simulations of chesapeake bay estuary: sensitivity to turbulence mixing parameterizations and comparison with observations
- processes governing phytoplankton blooms in estuaries. ii.
the role of transport in global dynamics
- long-term isohaline salt balance in an estuary
- nitrogen budget in tokyo bay with special reference to the low sedimentation to supply ratio
- some factors affecting the survival of fecal indicator bacteria in sea water
- analytic prediction of the properties of stratified planetary surface layers
- development of a turbulence closure model for geophysical fluid problems
- viability and adaptability of e. coli and enterococcus group to salt water with high concentration of sodium chloride
- numerical calculation of combined sewer overflow (cso) due to heavy rain around daiba in the head of tokyo bay
- kinetics of microbial mineralization of organic carbon from detrital skeletonema costatum cells
- a coordinate system having some special advantages for numerical forecasting
- enteric bacterial survival factors
- sunlight inactivation of fecal indicator bacteria and bacteriophages from waste stabilization pond effluent in fresh and saline waters
- separate and combined effects of solar radiation, temperature, salinity, and ph on the survival of faecal coliforms in seawater
- characterization of dissolved organic phosphorus in coastal seawater using ultrafiltration and phosphohydrolytic enzymes
- relation between the transport of gravitational circulation and the river discharge in bays
- spring bloom sedimentation in a subarctic ecosystem
- the coastal zone
- seasonal changes of organic carbon and nitrogen production by phytoplankton in the estuary of river tamagawa

key: cord-012866-p3mb7r0v
authors: luo, yan; chalkou, konstantina; yamada, ryo; funada, satoshi; salanti, georgia; furukawa, toshi a.
title: predicting the treatment response of certolizumab for individual adult patients with rheumatoid arthritis: protocol for an individual participant data meta-analysis
date: 2020-06-12
journal: syst rev
doi: 10.1186/s13643-020-01401-x
sha:
doc_id: 12866
cord_uid: p3mb7r0v

background: a model that can predict treatment response for a patient with specific baseline characteristics would help decision-making in personalized medicine. the aim of this study is to develop such a model for the treatment of rheumatoid arthritis (ra) patients who receive certolizumab (ctz) plus methotrexate (mtx) therapy, using individual participant data meta-analysis (ipd-ma). methods: we will search cochrane central, pubmed, and scopus, as well as clinical trial registries, drug regulatory agency reports, and pharmaceutical company websites, from their inception onwards to obtain randomized controlled trials (rcts) investigating ctz plus mtx compared with mtx alone in treating ra. we will request the individual-level data of these trials from an independent platform (http://vivli.org). the primary outcome is efficacy, defined as achieving either remission (based on the acr-eular boolean or index-based remission definition) or low disease activity (based on any of the validated composite disease activity measures). the secondary outcomes include acr50 (50% improvement based on acr core set variables) and adverse events. we will use a two-stage approach to develop the prediction model. first, we will construct a risk model for the outcomes via logistic regression to estimate baseline risk scores, including baseline demographic, clinical, and biochemical features as covariates. next, we will develop a meta-regression model for treatment effects, in which the stage 1 risk score will be used both as a prognostic factor and as an effect modifier.
we will calculate the probability of having the outcome for a new patient based on the model, which will allow estimation of the absolute and relative treatment effect. we will use r for our analyses, except for the second stage, which will be performed in a bayesian setting using r2jags. discussion: this is a study protocol for developing a model to predict treatment response for ra patients receiving ctz plus mtx in comparison with mtx alone, using a two-stage approach based on ipd-ma. the study will use a new modeling approach, which aims at retaining statistical power. the model may help clinicians individualize treatment for particular patients. systematic review registration: prospero registration number pending (id#157595). keywords: rheumatoid arthritis, certolizumab, individual participant data meta-analysis, prediction model, treatment response

background: rheumatoid arthritis (ra) is a chronic inflammatory disease for which we cannot currently expect complete cure. the drugs that can delay disease progression are known as disease-modifying anti-rheumatic drugs (dmards). there are 3 categories: conventional synthetic dmards (csdmards), biologic dmards (bdmards), and targeted synthetic dmards (tsdmards). bdmards can be further divided into several subtypes according to their target, among which the tumor necrosis factor (tnf) α inhibitors are the most classic and widely used. most ra patients undergo long-term treatment. according to the treat-to-target strategy proposed by the eular (european league against rheumatism) practice guideline [1], repeated assessment of disease activity should be performed every 3 to 6 months after a treatment is given, to evaluate the response and decide the next-step strategy: switching drugs, maintenance, tapering, or discontinuation.
hence, the disease course of ra is composed of many short-term (3 to 6 months) intervention-response loops. for the purpose of improving long-term prognosis, such as delaying the progression of bone fusion or functional deficiency, short-term intervention-response loops need to have beneficial outcomes [2] . to find the optimal treatment for a particular patient, it is necessary to personalize the treatment. it would be helpful if we could predict the probability of treatment response based on the patient's genetic, biologic, and clinical features. however, common evidence in the form of randomized controlled trials (rcts) or their meta-analyses (mas) at the aggregate level only reports average results. the drug that works for the average patients might not work or even be harmful for a particular patient. consequently, it is desirable to identify subgroups of patients associated with different treatment effects. individual participant data meta-analysis (ipd-ma) has been previously employed to develop prediction models for treatment effects [3] [4] [5] [6] . previous treatment response prediction models for ra were mainly based on observational studies [7] [8] [9] [10] [11] . observational studies seem suited for predicting the absolute risk of an outcome, but it may be less satisfactory in estimating the relative risk between different drugs, because unknown confounders may persist even when we try to adjust for known confounders. on the other hand, though the population in rcts is highly restricted hence may be less representative, data from rcts are more rigorously collected and more likely to provide an unbiased estimate of the relative treatment effects [12] . the synthesis of rct data via ipd-ma can increase the statistical power [13] and have been used to predict treatment response [6, [14] [15] [16] [17] . to the best of the authors' knowledge, such an approach has not been taken to predict treatment response in ra to date. 
our aim is to develop a prediction model of treatment effects based on the individual characteristics of ra patients through ipd-ma. since tnfα inhibitors are the most classic and widely used bdmards for ra, we will build a model for certolizumab (ctz), a tnfα inhibitor with sufficient ipd, in this study. we will first estimate the pooled average effect sizes for the primary and secondary outcomes using a one-stage bayesian hierarchical ipd-ma. the main objective of the study is to use a two-stage risk modeling approach to predict the individualized treatment effects of interest [12]. the first stage is to build a multivariable model aiming to predict the baseline risk for a particular patient, blinded to treatment. in the second stage, this baseline risk score will be used as a prognostic factor and an effect modifier in an ipd meta-regression model to estimate the individualized treatment effects of ctz. we aim to validate and optimize the modeling approach in the present study, and plan eventually to expand it to an ipd network meta-analysis to compare several drug types (e.g., interleukin-6 inhibitors, anti-cd20 antibodies) as a future research perspective. the present protocol has been registered within the prospero database (provisional registration number id#157595) and is being reported in accordance with the reporting guidance provided in the preferred reporting items for systematic reviews and meta-analyses protocols (prisma-p) statement [18] (see the checklist in additional file 1). the proposed ipd-ma will be reported in accordance with the reporting guidance provided in the preferred reporting items for systematic reviews and meta-analyses of individual participant data (prisma-ipd) statement [19]. any amendments made to this protocol when conducting the study will be outlined and reported in the final manuscript. studies will be selected according to the following criteria: patients, interventions, outcomes, and study design.
we will include adults (18 years or older) who are diagnosed with either early ra (2010 american college of rheumatology (acr)/european league against rheumatism (eular) classification criteria) [20, 21] or established ra (1987 classification criteria) [22] . patients with inner organ involvement (such as interstitial lung diseases), vasculitis, or concomitant other systemic autoimmune diseases will be excluded. we will include both treatment-naïve patients and patients who have insufficient response to previous treatments. we will include patients with moderate to severe disease activity based on any validated composite disease activity measures. patients who have already achieved remission or at low disease activity at baseline will be excluded. patients who have used certolizumab (ctz) within 6 months before randomization will be excluded. we will include rcts which compare certolizumab (ctz) plus methotrexate (mtx) with mtx monotherapy, regardless of doses. if a study compares ctz + any csdmards with any csdmards, we will only include patients on ctz + mtx or mtx from that study. trials investigating the tapering or discontinuation strategy of ctz will be excluded. our primary outcome is efficacy defined by disease states, which is achieving either remission (based on acr-eular boolean or index-based remission definition [23] ) or low disease activity (based on either of the validated composite disease activity measures [24] : das28 (disease activity score based on the evaluation of 28 joints) ≤3.2 [25] , cdai (clinical disease activity index) ≤ 10 [26] , sdai (simplified disease activity index) ≤ 11 [27] ) at 3 months (allowance 2-4 months) after treatment, as a binary outcome. 
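the primary outcome above is a binary flag: remission or low disease activity on any of the validated composite measures (das28 ≤ 3.2, cdai ≤ 10, sdai ≤ 11). as a minimal sketch of that outcome definition, the function below returns true if any available measure meets its cut-off; the function name is ours, and measures not recorded in a given trial are passed as none.

```python
def low_disease_activity(das28=None, cdai=None, sdai=None):
    """True if any available composite measure meets the low-disease-activity
    cut-offs used for the primary outcome: das28 <= 3.2, cdai <= 10,
    sdai <= 11. Measures not recorded are passed as None.
    Sketch of the outcome definition only (remission criteria omitted)."""
    checks = [
        (das28, 3.2),   # disease activity score, 28 joints
        (cdai, 10.0),   # clinical disease activity index
        (sdai, 11.0),   # simplified disease activity index
    ]
    return any(v is not None and v <= cut for v, cut in checks)
```

note that "any measure" means a patient with das28 of 4.0 but cdai of 9 would still count as having reached the outcome.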
we choose this as our primary outcome because it is suggested as the indicator of the treatment target in both the practice guideline [1] and the guideline for conducting clinical trials in ra [2], and because it has been shown to provide more information about future joint damage and functional outcomes than relative response (change from baseline) [28]. we have two secondary outcomes. one is efficacy defined by response (improvement from baseline), for which we will use the acr response criterion acr50 (50% improvement based on acr core set variables) [29]. the other is adverse events (aes). we will perform a separate ipd-ma for all kinds of infectious aes within 3 months, since infection is one of the most important aes for biologic agents. we will also describe other noticeable aes within 3 months reported in the trials. we will not build prediction models for the secondary outcomes. we will include double-blind rcts only. if there are crossover rcts, only the data of the first phase will be used for analysis. cluster rcts, quasi-randomized trials, and observational studies will be excluded. we will conduct an electronic search of cochrane central, pubmed, and scopus from inception onwards, with the keywords "rheumatoid arthritis," "certolizumab" or "cdp870" or "cimzia," and "methotrexate" or "mtx," without language restrictions. a draft search strategy is provided in additional file 2. we will search the who international clinical trials registry platform to find registered studies. we will search the us food and drug administration (fda) reports to see whether there are any unpublished reports from the pharmaceutical companies. for ipd, we will contact the company that markets certolizumab and request ipd through http://vivli.org. we will assess the representativeness of the ipd among all the eligible studies by investigating potential differences between trials with ipd and those without ipd.
two independent reviewers will screen the titles and abstracts retrieved from the electronic searches to assess for inclusion. if both reviewers agree that a trial does not meet eligibility criteria, it will be excluded. the full text of all the remaining articles will be obtained for further reading, and the same eligibility criteria will be applied to determine which to exclude. any disagreements will be resolved through discussion with a third member of the review team. two reviewers will independently extract the information for all the included studies at aggregate level. a detailed data extraction template will be developed and piloted on 3 articles; after finalizing the items on the data extraction form, the 3 articles will be re-extracted. the main information includes intervention/control details, trial implementation features (e.g., completion year, randomized numbers, dropouts, follow-up length), baseline demographic and disease-specific characteristics, and outcomes of interested. the above information will be used for: (1) exploring the representativeness of the trials with ipd among all the eligible trials and (2) confirming if the ipd is consistent with the reported results. when the ipd is ready to be used, we will identify the variables of interest before the analysis. the variables regarding intervention, control, and outcomes are defined as the above in the "eligibility criteria" section. with regard to patient or trial characteristics to be used as potential covariates in the prognostic model, based on the literature [30] [31] [32] and our clinical practice, we propose the following factors as candidates of potential prognostic factors (pfs, baseline factors that may affect the response regardless of the treatment) (table 1) , which will be used for baseline risk model development (see the "predicting treatment effect for patients with particular characteristics: a two-stage model" section below). 
we will try to collect all the information listed in table 1 from the data, but only available factors that have been recorded in the trials will be added into the model. we will decide in which form (e.g., continuous, categorical, binary, etc.) a covariate will enter the model according to the distribution of that covariate after we obtain the data. two independent reviewers will assess the risk of bias (rob) for each included rct according to the "rob 2 tool" proposed by the cochrane group [34]. for the primary efficacy outcome, rcts will be graded as "low risk of bias," "high risk of bias," or "some concerns" in the following five domains: risk of bias arising from the randomization process, risk of bias due to deviations from the intended interventions, missing outcome data, risk of bias in measurement of the outcome, and risk of bias in selection of the reported result. the assessment will be adapted for ipd-ma, i.e., performed on the obtained data and not on the analyses conducted and reported in the original publications. finally, these will be summarized as an overall risk of bias according to the rob 2 algorithm. since our primary aim is to develop a prediction model and not to obtain a precise estimate of the treatment effects, all the analyses will be based on ipd only. therefore, we will neither analyze aggregate data together with ipd nor investigate the robustness of the ipd-ma after including aggregate data, as these are beyond the scope of the present study. we first synthesize the data using a one-stage bayesian hierarchical ipd-ma [35]. we will estimate the average relative treatment effect in terms of the odds ratio (or) for efficacy. let y_ij denote the dichotomous outcome of interest (y_ij = 1 for remission or low disease activity) for patient i, where i = 1, 2, …, n_j, in trial j out of n trials; let t_ij be 0/1 for a patient in the control/intervention group; and let p_ij be the probability of having the outcome. the model is

logit(p_ij) = α_j + δ_j t_ij,   δ_j ~ n(δ, τ²)
where α_j is the log odds of the outcome for the control group in trial j, which is independent across trials; δ_j is the trial-specific treatment effect (log or), which we assume to be exchangeable across trials; δ is the summary estimate of the log odds ratios for the intervention versus the control arm; and τ² is the heterogeneity of δ_j, which is assumed normally distributed across trials.

table 1: potential candidates to be involved as prognostic factors in the prognostic model. *factors that have been shown to be prognostic for any treatment in previous studies. #since genetic tests for ra are not routinely implemented in clinical practice, we anticipate that most studies will not report them. although genetics are often considered critical in precision medicine, we will consider it justifiable if no genetic information is included in our model, because no single genetic factor has been proven to be strongly associated with prognosis or treatment response, and two studies have indicated that genetic information barely contributes to predicting treatment effects [33].

predicting treatment effect for patients with particular characteristics: a two-stage model

data pre-processing. within each study, the outcomes and the covariates will be evaluated for missing data, and we will further look at their distributional characteristics and the correlations between the covariates (listed in the "at ipd level: for studies with ipd available" section). we will use multiple imputation methods for handling missing data [36]. we will consider data transformation for continuous variables to resolve skewness, and re-categorization for categorical variables if necessary. if two or more variables are highly correlated, we will retain only the variable that is most commonly reported across studies and in the literature, or the variable that has the fewest missing values.
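the one-stage model says that, within trial j, a patient's outcome probability is logit(p_ij) = α_j + δ_j t_ij, with trial-specific effects δ_j drawn around a summary log odds ratio δ. the protocol fits this in a bayesian setting with r2jags; the sketch below instead just simulates data from the model in python, which is useful for checking one's understanding of the parameterization (all names are ours, and no fitting is attempted here).

```python
import numpy as np

def outcome_prob(alpha_j, delta_j, t):
    """p_ij under the one-stage model: logit(p_ij) = alpha_j + delta_j * t_ij."""
    logit = alpha_j + delta_j * t
    return 1.0 / (1.0 + np.exp(-logit))

def simulate_trial(n, alpha_j, delta_j, rng):
    """Simulate one two-arm trial (roughly 1:1 allocation) from the model.
    Returns the treatment indicator t (0/1) and outcome y (0/1)."""
    t = rng.integers(0, 2, size=n)                 # control / intervention
    y = rng.random(n) < outcome_prob(alpha_j, delta_j, t)
    return t, y.astype(int)
```

with alpha_j = 0 and delta_j = 1, the control arm has an event rate of 0.5 and the treated arm of about 0.73, i.e. a trial-level odds ratio of e.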
stage 1: developing a baseline risk model. in this step, we will build a multivariable model to predict the probability that a patient, given her or his baseline characteristics, is likely to achieve remission or low disease activity irrespective of treatment; we will refer to this model as the baseline risk model. the risk model can be built using the patients from the control group only, or from both the intervention and control groups. the former is more intuitive; however, a simulation study indicated that models based on all participants produced estimates with a narrower distribution of bias and were less prone to overfitting [37]. we will fit a multivariable logistic regression model:

logit(r_ij) = b_0j + Σ_k b_kj pf_ijk

where r_ij is the probability of the outcome for patient i from trial j at baseline; b_0j is the intercept, which is exchangeable across studies; pf_ijk denotes the k-th prognostic factor (out of p prognostic factors in total) in study j for patient i; and b_kj is the regression coefficient for the k-th prognostic factor in study j, exchangeable across studies. in order to select the most appropriate model, we propose two approaches: (1) use previously identified prognostic factors and discussions with rheumatologists to decide the subset of the most clinically relevant factors, and estimate the coefficients using the penalized maximum likelihood estimation shrinkage method; and (2) use lasso penalization methods for variable selection and coefficient shrinkage [38]. for each possible model, we will examine the sample size first, in order to assess the reliability of the model. we will calculate the events per variable (epv) for our model, using all the categories of categorical variables and the degrees of freedom of continuous variables [39]. we will calculate the sufficient sample size for developing a logistic regression model [40]. validation is essential in prediction model development. since no external data are available, we can only use internal validation.
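the stage 1 model is a penalized logistic regression of the outcome on baseline prognostic factors. the protocol will fit it in r (penalized maximum likelihood or lasso); as an illustrative stand-in, the sketch below fits a logistic model with a simple l2 (ridge) penalty by gradient descent, so it runs with numpy alone. it shows the shape of a "baseline risk" model only, and swaps the authors' lasso/pentrace machinery for a plainer penalty.

```python
import numpy as np

def fit_penalized_logistic(X, y, lam=1.0, lr=0.1, iters=2000):
    """Fit a penalized logistic 'baseline risk' model by gradient descent.
    Uses an l2 (ridge) penalty on the coefficients (not the lasso of the
    protocol) purely so the sketch is self-contained.
    Returns (intercept, coefficient vector)."""
    n, p = X.shape
    b0, b = 0.0, np.zeros(p)
    for _ in range(iters):
        pr = 1.0 / (1.0 + np.exp(-(b0 + X @ b)))   # current predicted risks
        g0 = np.mean(pr - y)                        # intercept gradient
        g = X.T @ (pr - y) / n + lam * b / n        # penalized coef gradient
        b0 -= lr * g0
        b -= lr * g
    return b0, b

def baseline_risk(b0, b, x):
    """Predicted baseline probability of remission / low disease activity."""
    return 1.0 / (1.0 + np.exp(-(b0 + x @ b)))
```

the logit of this predicted risk is exactly the score that stage 2 then uses as a prognostic factor and effect modifier.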
via resampling methods like the bootstrap or cross-validation, we can estimate the calibration slope and c-statistic for each model, to indicate its calibration and discrimination. stage 2: developing a meta-regression model for treatment effects. we use the same notation as in the "average relative treatment effect: ipd-ma" section. the logit(r_ij) from stage 1 will be used as a covariate in the meta-regression model, both as a prognostic factor and as an effect modifier. let mean(logit(r_j)) denote the average of the logit-risk over all the individuals in study j. the regression equation will be

logit(p_ij) = γ_aj + g_0j (logit(r_ij) − mean(logit(r_j))) + t_ij [δ_j + g_j (logit(r_ij) − mean(logit(r_j)))]

where γ_aj is the log odds in the control group when a patient has a risk equal to the mean risk, which is assumed to be independent across trials; g_0j is the coefficient of the risk score, while g_j is the treatment effect modification of the risk score for the intervention group versus the control group; both are assumed to be exchangeable across trials and normally distributed about summary estimates γ_0 and γ, respectively. predicting the probability of having the outcome for a new patient: assume a new patient i, who is not from any trial j, has a baseline risk score logit(r_i) calculated from stage one. in order to predict the absolute logit-probability of achieving the outcome, we use

logit(p_i) = a + γ_0 (logit(r_i) − mean(logit(r))) + t_i [δ + γ (logit(r_i) − mean(logit(r)))]

where δ, γ_0, and γ will have been estimated in the meta-regression stage. we will estimate mean(logit(r)) as the mean of logit(r_ij) across all individuals and studies, and a by synthesizing all the control arms. then we can calculate the individual probability of the outcome both for the control and for the intervention, and estimate the predicted absolute and relative treatment effect. to evaluate the performance of the two-stage prediction model, we will use internal validation methods via both traditional measures, like the c-statistic, and measures relevant to clinical usefulness.
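the stage 2 prediction for a new patient combines the pooled control-arm log odds a, the prognostic coefficient γ_0, the summary treatment effect δ, and the effect modification γ, all applied to the patient's stage-1 risk score centered at the overall mean. since the exact equation is only described in words in the text, the parameterization below is a reconstruction and should be treated as an assumption; evaluating it once with t = 0 and once with t = 1 yields the absolute risks under control and intervention, and hence the predicted relative effect.

```python
import math

def predict_outcome_prob(logit_r_new, a, gamma0, delta, gamma,
                         mean_logit_r, treated):
    """Absolute probability of the outcome for a new patient (stage 2).
    Implements the reconstructed prediction equation
        logit(p) = a + gamma0*(logit(r) - mean)
                     + t*(delta + gamma*(logit(r) - mean))
    where logit_r_new is the stage-1 baseline risk score, a the pooled
    control-arm log odds at mean risk, delta the summary log odds ratio,
    gamma0 the prognostic coefficient, and gamma the effect modification."""
    c = logit_r_new - mean_logit_r        # centered risk score
    logit_p = a + gamma0 * c + ((delta + gamma * c) if treated else 0.0)
    return 1.0 / (1.0 + math.exp(-logit_p))
```

for an average-risk patient (centered score 0) the treated-versus-control contrast reduces to the summary effect δ alone; γ only shifts the contrast for patients whose baseline risk departs from the mean.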
publication bias: considering that we will probably not be able to include all the relevant research, as some studies or their results were likely left unpublished owing to non-significant results (study publication bias and outcome reporting bias) [41, 42], we will evaluate this issue by comparing the search and screening results (as we will try to retrieve possibly unpublished reports) with the ipd we can obtain. if necessary, we will address it by adding the study's variance as an extra covariate in the final ipd meta-regression model (see the section "predicting treatment effect for patients with particular characteristics: a two-stage model", "stage 2: developing a meta-regression model for treatment effects").

statistical software: we will use r for our analyses. stage 2 will be performed in a bayesian setting using r2jags. for the development of the baseline risk model, we will use the pmsampsize command to check whether the available sample size is sufficient. we will examine the linearity of the relationship between each of the prognostic factors and the outcome via the rcs and anova commands. the lasso model will be developed using the cv.glmnet command. we will use the lrm command for the predefined model based on prior knowledge, and then, for the penalized maximum likelihood estimation, the pentrace command. for the bootstrap internal validation (both of the baseline risk score and of the two-stage prediction model), we will use self-programmed r routines.

we have presented the study protocol for a prediction model of treatment effects for ra patients receiving ctz plus mtx, using a two-stage approach based on ipd-ma. though there are many optional drugs for treating ra, as treatment failure is relatively frequent, individualizing the treatment is imperative. many prognostic models for ra have been proposed, but none is sufficiently satisfactory [31]. we have discovered several problems.
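the self-programmed bootstrap internal validation mentioned above typically follows harrell's optimism correction: refit the model on each bootstrap sample, and subtract the average gap between bootstrap-sample and original-data performance from the apparent c-statistic. the following is an illustrative pure-python sketch on synthetic data, not the authors' r routines; the single-predictor model and all data are hypothetical.

```python
import math, random

def c_statistic(scores, outcomes):
    """Probability that a random event patient scores higher than a random
    non-event patient (ties count 1/2): the usual concordance/AUC."""
    ev = [s for s, y in zip(scores, outcomes) if y == 1]
    ne = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in ev for b in ne)
    return wins / (len(ev) * len(ne))

def fit_logistic(x, y, steps=200, lr=0.1):
    """Univariable logistic regression by plain gradient ascent (sketch)."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            g0 += yi - p
            g1 += (yi - p) * xi
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

def bootstrap_optimism(x, y, n_boot=30, seed=1):
    """Harrell-style correction: optimism = mean over bootstrap replicates of
    (performance on the bootstrap sample) - (same refitted model on the
    original data); corrected = apparent - optimism."""
    rng = random.Random(seed)
    b0, b1 = fit_logistic(x, y)
    apparent = c_statistic([b0 + b1 * xi for xi in x], y)
    n, opt = len(x), 0.0
    for _ in range(n_boot):
        take = [rng.randrange(n) for _ in range(n)]
        bx, by = [x[i] for i in take], [y[i] for i in take]
        c0, c1 = fit_logistic(bx, by)
        opt += (c_statistic([c0 + c1 * xi for xi in bx], by)
                - c_statistic([c0 + c1 * xi for xi in x], y))
    return apparent, apparent - opt / n_boot

# Synthetic data where higher x raises the outcome probability.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(150)]
y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-1.5 * xi)) else 0 for xi in x]
apparent, corrected = bootstrap_optimism(x, y)
```

with a single predictor the optimism is small; in the real baseline risk model, with many candidate factors, the gap between apparent and corrected performance is the quantity of interest.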
most previous models focused on long-term radiographic or functional prognosis. although these are certainly critical outcomes that both clinicians and patients care about, the complex therapeutic changes during the long treatment process are extremely difficult to handle when developing prediction models. thus, such efforts usually end up with a simplified strategy, such as taking only the initial treatment into account, which compromises the clinical interpretation and relevance of the model. on the other hand, a good short-term treatment response is consistently positively associated with a good long-term prognosis [43, 44]. predicting short-term treatment effect is therefore instructive in clinical practice; however, research is lacking. a few established "short-term" disease-activity-oriented prediction models used an outcome measured at 6 or 12 months. the problem is that, except in active-controlled studies, there would be considerable dropouts after 3-4 months; furthermore, due to ethical issues, many trials would offer the non-responders other active treatments after 3-4 months. under the itt principle, patients are commonly analyzed as originally allocated; when dropouts are not negligible, imputation methods are usually applied, but mostly single imputation such as non-responder imputation or last observation carried forward (locf) [45]. one may argue that these estimates are conservative for the intervention group, though not precise. in fact, however, they are not always conservative for a relative effect estimate, and unbiased relative estimates are of critical interest in building personalized prediction models. as a result, in order to be methodologically rigorous, we choose the outcome measured at 3 months, when the randomization is likely preserved, and which is consistent with the assessment time recommended by the guideline [1].
additionally, thanks to the ipd, we will be able to use multiple imputation to handle missing data, rather than the single imputations used in the primary rcts. we will use a two-stage approach to construct the prediction model using ipd-ma. unlike the usual approach, which includes baseline features as prognostic factors and effect modifiers (through interaction terms) simultaneously, we first build a risk model for the baseline factors, and then treat the risk score as both a prognostic factor and an effect modifier. by doing so, the overfitting problem caused by too many covariates and interaction terms can be alleviated. moreover, since penalization will only be used in the common regression of the risk-modeling stage, and not in the meta-regression, compromised penalization in the meta-regression can be avoided. for stage 1, there are generally two types of risk models. one is an externally developed model, derived from data independent of the data used at stage 2, such as an established model from previous studies, or a model built on other studies. the other is an internally developed risk model, for which the same data are used to build both the risk model and the treatment effects model [37]. because there is no well-established risk model to predict short-term disease activity for ra patients, and also because we will very probably not have a sufficient sample size to divide the entire dataset into two parts, we will use the internal risk model for our study. we acknowledge several limitations in our study. first, we handle effect modification at the level of risk scores, instead of particular covariates. that is, we will not try to identify specific effect modifiers. this may cause some problems in interpretation, as the concept of distinguishing prognostic factors from effect modifiers is well recognized. however, our approach preserves statistical power.
second, due to the restricted sample size, only internal validation is planned, while external validation is lacking; the model will need to be validated on an external dataset in the future. third, we only focus on the short-term treatment response of ra patients receiving two kinds of treatment, ctz and mtx. future studies may extend the scope to compare several kinds of therapies and treatment strategies, and finally model the long-term prognosis taking into consideration the whole treatment process.

supplementary information accompanies this paper at https://doi.org/10.1186/s13643-020-01401-x. additional file 1: prisma-p checklist.

references
1. eular recommendations for the management of rheumatoid arthritis with synthetic and biological disease-modifying antirheumatic drugs: 2016 update
2. committee for medicinal products for human use (chmp): guideline on clinical investigation of medicinal products for the treatment of rheumatoid arthritis
3. a framework for developing, implementing, and evaluating clinical prediction models in an individual participant data meta-analysis
4. developing and validating risk prediction models in an individual participant data meta-analysis
5. cochrane ipdm-amg: individual participant data (ipd) meta-analyses of diagnostic and prognostic modeling studies: guidance on their use
6. statistical approaches to identify subgroups in meta-analysis of individual participant data: a simulation study
7. arthritis: which subgroup of patients with rheumatoid arthritis benefits from switching to rituximab versus alternative anti-tumour necrosis factor (tnf) agents after previous failure of an anti-tnf agent?
8. prediction of response to methotrexate in rheumatoid arthritis
9. prediction of response to targeted treatment in rheumatoid arthritis
10. association of response to tnf inhibitors in rheumatoid arthritis with quantitative trait loci for cd40 and cd39
11. assessment of a deep learning model based on electronic health record data to forecast clinical outcomes in patients with rheumatoid arthritis
12. personalized evidence based medicine: predictive approaches to heterogeneous treatment effects
13. meta-analysis of individual participant data: rationale, conduct, and reporting
14. getreal methods review g: get real in individual participant data (ipd) meta-analysis: a review of the methodology
15. quantifying heterogeneity in individual participant data meta-analysis with binary outcomes
16. a critical review of methods for the assessment of patient-level interactions in individual participant data meta-analysis of randomized trials, and guidance for practitioners
17. cognitive-behavioral analysis system of psychotherapy, drug, or their combination for persistent depressive disorder: personalizing the treatment choice using individual participant data network metaregression
18. preferred reporting items for systematic review and meta-analysis protocols (prisma-p)
19. preferred reporting items for systematic review and meta-analyses of individual participant data: the prisma-ipd statement
20. rheumatoid arthritis classification criteria: an american college of rheumatology/european league against rheumatism collaborative initiative
21. rheumatoid arthritis classification criteria: an american college of rheumatology/european league against rheumatism collaborative initiative
22. the american rheumatism association 1987 revised criteria for the classification of rheumatoid arthritis
23. american college of rheumatology/european league against rheumatism provisional definition of remission in rheumatoid arthritis for clinical trials
24. the definition and measurement of disease modification in inflammatory rheumatic diseases
25. the disease activity score and the eular response criteria
26. acute phase reactants add little to composite disease activity indices for rheumatoid arthritis: validation of a clinical activity score
27. remission and active disease in rheumatoid arthritis: defining criteria for disease activity states
28. the importance of reporting disease activity states in rheumatoid arthritis clinical trials
29. american college of rheumatology preliminary definition of improvement in rheumatoid arthritis
30. remission-induction therapies for early rheumatoid arthritis: evidence to date and clinical implications
31. assessing prognosis and prediction of treatment response in early rheumatoid arthritis: systematic reviews
32. poor prognostic factors guiding treatment decisions in rheumatoid arthritis patients: a review of data from randomized clinical trials and cohort studies
33. crowdsourced assessment of common genetic contribution to predicting anti-tnf treatment response in rheumatoid arthritis
34. rob 2: a revised tool for assessing risk of bias in randomised trials
35. meta-analysis using individual participant data: one-stage and two-stage approaches, and why they may differ
36. missing data in randomised controlled trials: a practical guide
37. using internally developed risk models to assess heterogeneity in treatment effects in clinical trials
38. regression shrinkage and selection via the lasso
39. risk prediction models: i. development, internal validation, and assessing the incremental value of a new (bio)marker
40. minimum sample size for developing a multivariable prediction model: part ii - binary and time-to-event outcomes
41. randomized controlled trials of rheumatoid arthritis registered at clinicaltrials.gov: what gets published and when
42. dissemination and publication of research findings: an updated review of related biases
43. rheumatoid arthritis treatment: the earlier the better to prevent joint damage
44. evaluation of different methods used to assess disease activity in rheumatoid arthritis: analyses of abatacept clinical trial data
45. a systematic review of randomised controlled trials in rheumatoid arthritis: the reporting and handling of missing data in composite outcomes

publisher's note: springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

authors' contributions: yl and taf conceived the study. kc and gs designed the modeling strategy. ry and sf provided substantial contributions to the design of the study during its development. yl drafted the manuscript, and all the authors critically revised it. all authors gave final approval of the version to be published.

this study was supported by the intramural support to the department of health promotion and human behavior, kyoto university graduate school of medicine/school of public health. the funder has no role in the study design, data collection, data analysis, data interpretation, writing of the report, or in the decision to submit for publication.

the data that support the findings of this study are available from http://vivli.org, but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. data are, however, available from http://vivli.org upon reasonable request and application, after their permission.
this study does not require institutional review board approval and participant consent.

competing interests: taf reports personal fees from mitsubishi-tanabe, msd, and shionogi and a grant from mitsubishi-tanabe, outside the submitted work; taf has a patent 2018-177688. gs was invited to participate in two methodological meetings about the use of real-world data, organized by biogen and by merck. all the other authors report no competing interests to declare.

key: cord-104486-syirijql
authors: adiga, aniruddha; chen, jiangzhuo; marathe, madhav; mortveit, henning; venkatramanan, srinivasan; vullikanti, anil
title: data-driven modeling for different stages of pandemic response
date: 2020-09-21
journal: arxiv
doi: nan
sha:
doc_id: 104486
cord_uid: syirijql

some of the key questions of interest during the covid-19 pandemic (and all outbreaks) include: where did the disease start, how is it spreading, who is at risk, and how to control the spread. there are a large number of complex factors driving the spread of pandemics, and, as a result, multiple modeling techniques play an increasingly important role in shaping public policy and decision making. as different countries and regions go through phases of the pandemic, the questions and data availability also change. especially of interest is aligning model development and data collection to support response efforts at each stage of the pandemic. the covid-19 pandemic has been unprecedented in terms of the real-time collection and dissemination of a number of diverse datasets, ranging from disease outcomes to mobility, behaviors, and socio-economic factors. these datasets have been critical from the perspective of disease modeling and analytics to support policymakers in real time. in this overview article, we survey the data landscape around covid-19, with a focus on how such datasets have aided modeling and response through the different stages of the pandemic so far.
we also discuss some of the current challenges and the needs that will arise as we plan our way out of the pandemic. as the sars-cov-2 pandemic has demonstrated, the spread of a highly infectious disease is a complex dynamical process. a large number of factors are at play as infectious diseases spread, including variable individual susceptibility to the pathogen (e.g., by age and health conditions), variable individual behaviors (e.g., compliance with social distancing and the use of masks), differing response strategies implemented by governments (e.g., school and workplace closure policies and criteria for testing), and the potential availability of pharmaceutical interventions. governments have been forced to respond to the rapidly changing dynamics of the pandemic, and are becoming increasingly reliant on different modeling and analytical techniques to understand, forecast, plan and respond; this includes statistical methods and decision support methods using multi-agent models, such as: (i) forecasting epidemic outcomes (e.g., case counts, mortality and hospital demands), using a diverse set of data-driven methods, e.g., arima-type time series forecasting, bayesian techniques and deep learning [1-5]; (ii) disease surveillance, e.g., [6, 7]; and (iii) counter-factual analysis of epidemics using multi-agent models, e.g., [8-13]; indeed, the results of [11, 14] were very influential in the early decisions on lockdowns in a number of countries. the specific questions of interest change with the stage of the pandemic. in the pre-pandemic stage, the focus is on understanding how the outbreak started, estimating epidemic parameters, and assessing the risk of importation to different regions. once outbreaks start (the acceleration stage), the focus is on determining the growth rates, the differences in spatio-temporal characteristics, and testing bias.
in the mitigation stage, the questions focus on non-prophylactic interventions, such as school and workplace closures and other social-distancing strategies, determining the demand for healthcare resources, and testing and tracing. in the suppression stage, the focus shifts to prophylactic interventions, combined with better tracing. these phases are not linear and overlap with each other; for instance, the acceleration and mitigation stages of a pandemic might overlap spatially, temporally, as well as within certain social groups. different kinds of models are appropriate at different stages, and for addressing different kinds of questions. for instance, statistical and machine learning models are very useful for forecasting and short-term projections. however, they are not very effective for longer-term projections, for understanding the effects of different kinds of interventions, or for counter-factual analysis; mechanistic models are very useful for such questions. simple compartmental models, and their extensions, namely structured metapopulation models, are useful for several population-level questions. however, once the outbreak has spread, and complex individual- and community-level behaviors are at play, multi-agent models are most effective, since they allow a more systematic representation of complex social interactions, individual and collective behavioral adaptation, and public policies. as with any mathematical modeling effort, data plays a big role in the utility of such models. until recently, data on infectious diseases was very hard to obtain, due to issues such as the privacy and sensitivity of the data (since it concerns individual health) and the logistics of collecting such data.
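the simple compartmental models mentioned above can be made concrete in a few lines; the following forward-euler sir sketch uses illustrative parameters (r0 = beta/gamma = 3), not covid-19 estimates.

```python
def sir_step(s, i, r, beta, gamma, dt=0.1):
    """One forward-Euler step of the classic SIR equations
    (s, i, r are fractions of a closed population)."""
    new_inf = beta * s * i * dt   # S -> I transitions this step
    new_rec = gamma * i * dt      # I -> R transitions this step
    return s - new_inf, i + new_inf - new_rec, r + new_rec

def run_sir(beta=0.3, gamma=0.1, i0=1e-4, days=300, dt=0.1):
    s, i, r = 1.0 - i0, i0, 0.0
    peak = i
    for _ in range(int(days / dt)):
        s, i, r = sir_step(s, i, r, beta, gamma, dt)
        peak = max(peak, i)
    return s, i, r, peak

s_end, i_end, r_end, peak = run_sir()   # illustrative run with R0 = 3
```

even this toy model reproduces the population-level quantities discussed in the text: a peak prevalence near 30% and a final attack rate near 94% for r0 = 3.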
the data landscape during the sars-cov-2 pandemic has been very different: a large number of datasets are becoming available, ranging from disease outcomes (e.g., time series of the number of confirmed cases, deaths, and hospitalizations), some characteristics of their locations and demographics, healthcare infrastructure capacity (e.g., number of icu beds, number of healthcare personnel, and ventilators), and various kinds of behaviors (e.g., level of social distancing, usage of ppe); see [15-17] for comprehensive surveys of available datasets. however, using these datasets to develop good models and address important public health questions remains challenging. the goal of this article is to use the widely accepted stages of a pandemic as a guiding framework to highlight a few important problems that require attention in each of these stages. we aim to provide a succinct, model-agnostic formulation while identifying the key datasets needed, how they can be used, and the challenges arising in that process. we also use sars-cov-2 as a case study unfolding in real time, and highlight some interesting peer-reviewed and preprint literature that pertains to each of these problems. an important point to note is the necessity of randomly sampled data, e.g., the data needed to assess the number of active cases and the various demographics of the individuals affected; the census provides an excellent example, as it is the only way one can develop rigorous estimates of various epidemiologically relevant quantities. there have been numerous surveys of the different types of datasets available for sars-cov-2, e.g., [15-18], as well as of different kinds of modeling approaches. however, they do not describe how these models become relevant through the phases of pandemic response.
an earlier similar attempt to summarize such response-driven modeling efforts, based on the 2009 h1n1 experience, can be found in [19]; this paper builds on that work and discusses these phases in the present context of the sars-cov-2 pandemic. although the paper touches upon different aspects of model-based decision making, we refer the reader to a companion article in the same special issue [20] for a focused review of models used for projection and forecasting. multiple organizations, including the cdc and the who, have their own frameworks for preparing and planning a response to a pandemic. for instance, the pandemic intervals framework from the cdc describes the stages in the context of an influenza pandemic; these are illustrated in figure 1. these six stages span investigation, recognition and initiation in the early phase, with most of the disease spread occurring during the acceleration and deceleration stages. they also provide indicators for identifying when the pandemic has progressed from one stage to the next [21]. as envisioned, risk evaluation (i.e., using tools like the influenza risk assessment tool (irat) and the pandemic severity assessment framework (psaf)) and early case identification characterize the first three stages, while non-pharmaceutical interventions (npis) and available therapeutics become central to the acceleration stage.

figure 1: cdc pandemic intervals framework and who phases for influenza pandemic

the deceleration is facilitated by mass vaccination programs, exhaustion of the susceptible population, or unsuitability of environmental conditions (such as weather). a similar framework is laid out in the who's pandemic continuum and phases of pandemic alert. while such frameworks aid in streamlining the response efforts of these organizations, they also enable effective messaging. to the best of our knowledge, there has not been a similar characterization of the mathematical modeling efforts that go hand in hand with supporting the response.
for summarizing the key models, we consider four of the stages of pandemic response mentioned in section 2: pre-pandemic, acceleration, mitigation and suppression. here we provide the key problems in each stage, the datasets needed, the main tools and techniques used, and pertinent challenges. we structure our discussion based on our experience with modeling the spread of covid-19 in the us, done in collaboration with local and federal agencies.

• acceleration (section 5): this stage is relevant once the epidemic takes root within a country. there is usually a big lag in surveillance and response efforts, and the key questions are to model spread patterns at different spatio-temporal scales, and to derive short-term forecasts and projections. a broad class of datasets is used for developing models, including mobility, populations, land-use, and activities. these are combined with various kinds of time series data and covariates such as weather for forecasting.

• mitigation (section 6): in this stage, different interventions, which are mostly non-pharmaceutical in the case of a novel pathogen, are implemented by government agencies, once the outbreak has taken hold within the population. this stage involves understanding the impact of interventions on case counts and health infrastructure demands, taking individual behaviors into account. the additional datasets needed in this stage include those on behavioral changes and hospital capacities.

• suppression (section 7): this stage involves designing methods to control the outbreak by contact tracing & isolation and vaccination. data on contact tracing, associated biases, vaccine production schedules, and compliance & hesitancy are needed in this stage.

figure 2 gives an overview of this framework and summarizes the data needs in these stages.
these stages also align well with the focus of the various modeling working groups organized by the cdc, which include epidemic parameter estimation, international spread risk, sub-national spread forecasting, impact of interventions, healthcare systems, and university modeling. in reality, one should note that these stages may overlap, and may vary based on geographical factors and response efforts. moreover, specific problems can be approached prospectively in earlier stages, or retrospectively during later stages. this framework is thus meant to be conceptual, rather than interpreted along a linear timeline. results from these stages are very useful for policymakers in guiding real-time response. consider a novel pathogen emerging in human populations that is detected through early cases involving unusual symptoms or unknown etiology. such outbreaks are characterized by some kind of spillover event, mostly through zoonotic means, as in the case of covid-19 or past influenza pandemics (e.g., swine flu and avian flu). a similar scenario can occur when an incidence of a well-documented disease with no known vaccine or therapeutics emerges in some part of the world, causing severe outcomes or fatalities (e.g., ebola and zika). regardless of the development status of the country where the pathogen emerged, such an outbreak now carries the risk of causing a worldwide pandemic, due to the global connectivity induced by human travel. two questions become relevant at this stage: what are the epidemiological attributes of this disease, and what are the risks of importation to a different country? while the first question involves biological and clinical investigations, the latter is more related to societal and environmental factors. one of the crucial tasks during early disease investigation is to ascertain the transmissibility and severity of the disease.
these are important dimensions along which the pandemic potential is characterized, because together they determine the overall disease burden, as demonstrated within the pandemic severity assessment framework [22]. in addition to risk assessment for right-sizing the response, they are integral to developing meaningful disease models.

formulation: let θ = {θ_T, θ_S} represent the transmission and severity parameters of interest. they can be further subdivided into sojourn time parameters θ^δ and transition probability parameters θ^p; here θ corresponds to a continuous time markov chain (ctmc) on the disease states. the problem formulation can be represented as follows: given π(θ), the prior distribution on the disease parameters, and a dataset d, estimate the posterior distribution p(θ|d) over all possible values of θ. in a model-specific form, this can be expressed as p(θ|d, m), where m is a statistical, compartmental or agent-based disease model.

data needs: in order to estimate the disease parameters sufficiently well, line lists of individual confirmed cases are ideal. such datasets contain, for each record, the date of confirmation, possible date of onset, severity (hospitalization/icu) status, and date of recovery/discharge/death. furthermore, age and demographic/comorbidity information allow the development of models that are age- and risk-group stratified. one such crowdsourced line list was compiled during the early stages of covid-19 [24] and later released by the cdc for us cases [25]. data from detailed clinical investigations in other countries, such as china, south korea, and singapore, was also used to parameterize these models [26]. in the absence of such datasets, past parameter estimates for similar diseases (e.g., sars, mers) were used for early analyses.

modeling approaches: for a model-agnostic approach, the delays and probabilities are obtained by various techniques, including bayesian and ordinary least squares fitting to various delay distributions.
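as a minimal sketch of the estimation problem p(θ|d) above, consider a single severity parameter, the probability of hospitalization among confirmed cases, with a binomial likelihood from a line list and a grid-approximated posterior; the counts and the flat prior below are hypothetical.

```python
def posterior_grid(k, n, prior, grid_size=1000):
    """Grid approximation to p(theta | d) for a binomial severity parameter:
    k severe outcomes among n confirmed cases, with an arbitrary prior density
    evaluated at grid midpoints on (0, 1)."""
    thetas = [(j + 0.5) / grid_size for j in range(grid_size)]
    unnorm = [prior(t) * t**k * (1 - t) ** (n - k) for t in thetas]
    z = sum(unnorm) / grid_size                 # normalising constant
    return thetas, [u / z for u in unnorm]      # posterior density on the grid

def posterior_mean(thetas, dens):
    return sum(t * d for t, d in zip(thetas, dens)) / len(thetas)

# Hypothetical line list: 18 hospitalisations among 120 confirmed cases,
# with a flat Beta(1, 1) prior; analytically the posterior is Beta(19, 103),
# whose mean is 19/122.
thetas, dens = posterior_grid(18, 120, prior=lambda t: 1.0)
mean = posterior_mean(thetas, dens)
```

the same grid (or an mcmc sampler, for richer models) also yields credible intervals, which matter as much as point estimates in early-stage risk assessment.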
for a particular disease model, these are estimated through model calibration techniques such as mcmc and particle filtering approaches. a summary of community estimates of various disease parameters is provided at https://github.com/midas-network/covid-19. furthermore, such estimates allow the design of pandemic planning scenarios varying in levels of impact, as seen on the cdc scenarios page. see [27-29] for methods and results related to estimating covid-19 disease parameters from real data. current models use a large set of disease parameters for modeling covid-19 dynamics; they can be broadly classified as transmission parameters and hospital resource parameters. for instance, in our work we currently use the parameters (with explanations) shown in table 1.

challenges: often these parameters are model specific, and hence one needs to be careful when reusing parameter estimates from the literature. they are related, but not identifiable, with respect to population-level measures such as the basic reproductive number r_0 (or the effective reproductive number r_eff) and the doubling time, which allow tracking the rate of epidemic growth. the estimation is also hindered by inherent biases in the case ascertainment rate, reporting delays, and other gaps in the surveillance system. aligning different data streams (e.g., outpatient surveillance, hospitalization rates, mortality records) is in itself challenging.

when a disease outbreak occurs in some part of the world, it is imperative for most countries to estimate their risk of importation through spatial proximity or international travel. such measures are incredibly valuable in setting a timeline for preparation efforts, and in initiating health checks at the borders. over centuries, pandemics have spread faster and faster across the globe, making it all the more important to characterize this risk as early as possible.
formulation: let c be the set of countries, and g = (c, e) an international network, where the edges (often weighted and directed) in e represent some notion of connectivity. the importation risk problem can be formulated as follows: given c_o ∈ c, the country of origin with an initial case at time 0, and c_i, the country of interest, use g to estimate the expected time t_i for the first case to arrive in country c_i. in its probabilistic form, the same can be expressed as estimating the probability p_i(t) of seeing the first case in country c_i by time t.

data needs: assuming we have initial case reports from the origin country, the first dataset needed is a network that connects the countries of the world to represent human travel. the most common source of such information is airline network datasets, from sources such as iata, oag, and openflights; [30] provides a systematic review of how airline passenger data has been used for infectious disease modeling. these datasets could capture either static measures, such as the number of seats available or flight schedules, or a dynamic count of passengers per month along each itinerary. since the latter has intrinsic delays in collection and reporting, it may not be representative during an ongoing pandemic; at such times, data on ongoing travel restrictions [31] become important to incorporate. multi-modal traffic will also be important to incorporate for countries that share land borders or have heavy maritime traffic. for diseases such as zika, where establishment risk is more relevant, data on vector abundance or prevailing weather conditions are appropriate.

modeling approaches: simple structural measures on networks (such as degree and pagerank) can provide static indicators of the vulnerability of countries. by transforming the weighted, directed edges into probabilities, one can use simple contagion models (e.g., independent cascades) to simulate disease spread and empirically estimate the expected time of arrival.
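the independent-cascade estimate just mentioned can be sketched as a monte carlo simulation on a toy network; the country names, edges, and daily importation probabilities below are illustrative, not derived from any traffic dataset.

```python
import random

def simulate_arrival(adj, origin, target, max_days=365, rng=random):
    """One run of a discrete-time independent-cascade style spread: each day,
    every already-seeded country independently seeds each neighbour with the
    edge's daily importation probability. Returns the day the target country
    is first seeded (or max_days if it never is)."""
    seeded = {origin}
    for day in range(1, max_days + 1):
        new = set()
        for c in seeded:
            for nbr, p in adj.get(c, []):
                if nbr not in seeded and rng.random() < p:
                    new.add(nbr)
        seeded |= new
        if target in seeded:
            return day
    return max_days

# Toy network; probabilities are illustrative only.
adj = {
    "origin":   [("hub", 0.20), ("neighbor", 0.05)],
    "hub":      [("far", 0.10)],
    "neighbor": [("far", 0.01)],
}
rng = random.Random(42)
runs = [simulate_arrival(adj, "origin", "far", rng=rng) for _ in range(2000)]
expected_arrival = sum(runs) / len(runs)   # Monte Carlo estimate of E[t_i]
```

the empirical distribution of `runs` also gives p_i(t) directly, as the fraction of runs in which the target was seeded by day t.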
global metapopulation models, such as the global epidemic and mobility model (gleam), which combine seir-type dynamics with an airline network, have also been used in the past for estimating importation risk. brockmann and helbing [32] used a similar framework to quantify an effective distance on the network, which was well correlated with the time of arrival for multiple past pandemics; this has been extended to covid-19 [8, 33]. in [34], the authors employ air travel volumes obtained through iata from ten major cities across china to rank various countries, along with the idvi, to convey their vulnerability. [35] consider the task of forecasting the international and domestic spread of covid-19 and employ official airline group (oag) data for determining air traffic to various countries, and [36] fit a generalized linear model for the observed number of cases in various countries as a function of air traffic volume obtained from oag data, to determine countries with a potential risk of under-detection. also, [37] provide an africa-specific case study of vulnerability and preparedness using data from the civil aviation administration of china.

challenges: note that the arrival of an infected traveler will precede a local transmission event in a country; hence the former is more appropriate to quantify in the early stages. also, the formulation is agnostic to whether it is the first infected arrival or the first detected case. in the real world, however, the former is difficult to observe, while the latter is influenced by security measures at ports of entry (land, sea, air) and the ease of identification of the pathogen. for instance, in the case of covid-19, the long incubation period and the high likelihood of asymptomaticity could have resulted in many infected travelers being missed by health checks at ports of entry. we also noticed potential administrative delays in reporting by multiple countries fearing travel restrictions.

as the epidemic takes root within a country, it may enter the acceleration phase.
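the effective distance of brockmann and helbing can be computed directly: each directed edge m→n of the mobility network is assigned the length 1 − ln(p_mn), where p_mn is the fraction of traffic leaving m that goes to n, and the effective distance from the origin is the shortest-path sum of these lengths. the sketch below uses toy passenger flows, not real iata/oag data.

```python
import heapq, math

def effective_distance(flows, origin):
    """Effective distance from an origin node: transform raw flows into
    transition probabilities P_mn, assign edge lengths 1 - ln(P_mn) (>= 1),
    then run Dijkstra's shortest-path algorithm."""
    lengths = {}
    for m, outgoing in flows.items():
        total = sum(outgoing.values())
        lengths[m] = {n: 1.0 - math.log(f / total) for n, f in outgoing.items()}
    dist = {origin: 0.0}
    heap = [(0.0, origin)]
    while heap:
        d, m = heapq.heappop(heap)
        if d > dist.get(m, math.inf):
            continue                      # stale heap entry
        for n, w in lengths.get(m, {}).items():
            if d + w < dist.get(n, math.inf):
                dist[n] = d + w
                heapq.heappush(heap, (d + w, n))
    return dist

# Toy monthly passenger flows (illustrative numbers only):
flows = {"A": {"B": 9000, "C": 1000}, "B": {"C": 5000, "D": 5000}}
dist = effective_distance(flows, "A")
```

note how the high-traffic two-hop route a→b→c ends up "closer" than the direct low-traffic edge a→c, which is exactly the property that makes arrival times roughly linear in effective distance.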
Depending on the testing infrastructure and the agility of the surveillance system, response efforts may lag or lead the rapid growth in the case rate. Under such a scenario, two crucial questions emerge: how the disease may spread spatially and socially, and how the case rate may grow over time. Within the country, there is a need to model the spatial spread of the disease at different scales: state, county, and community levels. As with importation risk, such models can estimate when cases may emerge in different parts of the country. When coupled with vulnerability indicators (socioeconomic, demographic, co-morbidities), they provide a framework for assessing the heterogeneous impact the disease may have across the country. Detailed agent-based models of urban centers can help identify hotspots and potential case clusters (e.g., correctional facilities, nursing homes, and food processing plants in the case of COVID-19).

Formulation. Given a population representation P at the appropriate scale and a disease model M per entity (individual or sub-region), model the disease spread under different assumptions about the underlying connectivity C and the disease parameters θ. The result is a spatio-temporal spread model producing Z_{s,t}, the time series of disease states over time for each region s.

Data needs. Common datasets needed by most modeling approaches include: (1) social and spatial representations, including census and population data, available from census departments (see, e.g., [38]) and LandScan [39]; (2) connectivity between regions (commuter, airline, road/rail/river), e.g., [30, 31]; (3) data on locations, including points of interest, e.g., OpenStreetMap [40]; and (4) activity data, e.g., the American Time Use Survey [41]. These datasets capture where people reside, how they move around, and how they come into contact with each other.
While some of these datasets are static, more dynamic measures, such as GPS traces, become relevant as individuals change their behavior during a pandemic.

Modeling approaches. Different kinds of structured metapopulation models [8, 42-45] and agent-based models [46-50] have been used in the past to model sub-national spread; we refer to [13, 51, 52] for surveys of the different modeling approaches. These models incorporate typical mixing patterns, which result from detailed activities and co-location (in agent-based models) and from different modes of travel and commuting (in metapopulation models).

Challenges. While metapopulation models can be built relatively rapidly, agent-based models are much harder: the datasets need to be assembled at a large scale, with detailed construction pipelines; see, e.g., [46-50]. Since detailed individual activities drive the dynamics in agent-based models, schools and workplaces must be modeled for the predictions to be meaningful. Such models get reused at different stages of the outbreak, so they need to be generic enough to incorporate dynamically evolving disease information. Finally, a common challenge across modeling paradigms is calibration to the dynamically evolving spatio-temporal data from the outbreak; this is especially difficult in the presence of reporting biases and data insufficiency. Given the early growth of cases within a country (or sub-region), there is a need to quantify the rate of increase in comparable terms across the duration of the outbreak, accounting for the exponential nature of such processes. These estimates also serve as references when evaluating the impact of various interventions. As an extension, such methods and more sophisticated time series methods can be used to produce short-term forecasts of disease evolution.
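To make the metapopulation idea concrete, here is a deliberately small two-patch SEIR sketch in which a fraction of each patch's force of infection comes from the other patch's infectious pool. All rates (beta, sigma, gamma, mixing fraction) are illustrative assumptions, and the discrete-day Euler update is a simplification of the continuous dynamics used in real models.

```python
def metapop_seir(days=160, beta=0.5, sigma=1 / 4, gamma=1 / 6, mix=0.05):
    """Two-patch discrete-day SEIR with cross-patch mixing (toy parameters).

    Patch 0 is seeded with one exposed individual; returns the final states
    and the peak infectious count observed in each patch.
    """
    N = [10000.0, 10000.0]
    y = [[N[0] - 1, 1.0, 0.0, 0.0],   # [S, E, I, R] for patch 0
         [N[1], 0.0, 0.0, 0.0]]       # patch 1 starts disease-free
    peaks = [0.0, 0.0]
    for _ in range(days):
        # Force of infection mixes local and visiting infectious individuals.
        lam = []
        for k in (0, 1):
            i_eff = (1 - mix) * y[k][2] + mix * y[1 - k][2]
            lam.append(beta * i_eff / N[k])
        for k in (0, 1):
            S, E, I, R = y[k]
            inf, inc, rec = lam[k] * S, sigma * E, gamma * I
            y[k] = [S - inf, E + inf - inc, I + inc - rec, R + rec]
            peaks[k] = max(peaks[k], y[k][2])
    return y, peaks
```

Even this toy version reproduces the qualitative behavior the text describes: the seeded patch peaks first, and the coupling term determines how quickly the second patch follows.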
Formulation. Given the disease time series data within the country, Z_{s,t}, up to a data horizon T, provide scale-independent growth rate measures g_s(T) and forecasts ẑ_{s,u} for u ∈ [T, T + Δt], where Δt is the forecast horizon.

Data needs. Models at this stage require datasets such as: (1) time series data on different disease outcomes, including case counts, mortality, and hospitalizations, along with attributes such as age, gender, and location, e.g., [53-57]; (2) any associated data on reporting bias (total tests, test positivity rate) [58], which must be incorporated into the models, since these biases can have a significant impact on the dynamics; and (3) exogenous regressors (mobility, weather), which have been shown to significantly affect other diseases, such as influenza, e.g., [59].

Modeling approaches. Even before building statistical or mechanistic time series forecasting methods, one can derive insights through analytical measures of the time series data. For instance, the effective reproductive number estimated from the time series [60] can serve as a scale-independent metric for comparing outbreaks across space and time. Additionally, multiple statistical methods, ranging from autoregressive models to deep learning techniques, can be applied to the time series data with additional exogenous variables as input. While such methods perform reasonably well for short-term targets, mechanistic approaches as described earlier can provide better long-term projections. Various ensembling techniques have also been developed recently to combine such multi-model forecasts into a single robust forecast with better uncertainty quantification; one such effort, which combines more than 30 methods for COVID-19, can be found at the COVID-19 Forecast Hub. We also point to the companion paper for more details on projection and forecasting models.
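One simple scale-independent growth measure g_s(T) is the slope of log-incidence against time, from which a doubling time follows directly. The sketch below fits that slope by ordinary least squares; it assumes strictly positive counts and a clean exponential phase, which real data rarely offer without smoothing.

```python
import math

def growth_rate(cases):
    """Least-squares slope of log(cases) vs. day index: exponential rate r."""
    t = range(len(cases))
    ys = [math.log(c) for c in cases]  # assumes all counts are positive
    n = len(cases)
    tbar, ybar = sum(t) / n, sum(ys) / n
    num = sum((ti - tbar) * (yi - ybar) for ti, yi in zip(t, ys))
    den = sum((ti - tbar) ** 2 for ti in t)
    return num / den

def doubling_time(cases):
    """Days for incidence to double under the fitted exponential rate."""
    return math.log(2) / growth_rate(cases)
```

Because the slope is in log units per day, it can be compared across regions with very different absolute case counts, which is exactly the "comparable terms" requirement above.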
Challenges. Data on epidemic outcomes usually carry many uncertainties and errors, including missing data, collection bias, and backfill. For forecasting tasks, these time series need to be near real-time; otherwise one must do nowcasting as well as forecasting. Other exogenous regressors can provide valuable lead time, owing to the inherent delays in disease dynamics from exposure to case identification. Such frameworks need to be generalized to accommodate qualitative inputs on future policies (shutdowns, mask mandates, etc.) as well as behaviors, as we discuss in the next section.

Once the outbreak has taken hold within the population, local, state, and national governments attempt to mitigate and control its spread by considering different kinds of interventions. Unfortunately, as the COVID-19 pandemic has shown, there is a significant delay in the time taken by governments to respond; this has resulted in a large number of cases, a fraction of which lead to hospitalizations. Two key questions at this stage are: (1) how to evaluate different kinds of interventions and choose the most effective ones, and (2) how to estimate the healthcare infrastructure demand and how to mitigate it. The effectiveness of an intervention (e.g., social distancing) depends on how individuals respond to it and on the level of compliance, while the health resource demand depends on the specific interventions implemented. As a result, the two questions are connected and require models that incorporate appropriate behavioral responses. In the initial stages, only non-prophylactic interventions are available, such as social distancing, school and workplace closures, and the use of PPE, since no vaccines or antivirals exist yet. As mentioned above, such analyses are almost entirely model based, and the specific model depends on the nature of the intervention and the population being studied.
Formulation. Given a model, denoted abstractly as M, the general goals are (1) to evaluate the impact of an intervention (e.g., school and workplace closures and other social distancing strategies) on different epidemic outcomes (e.g., average outbreak size, peak size, and time to peak), and (2) to find the most effective intervention from a suite of interventions under given resource constraints. The specific formulation depends crucially on the model and the type of intervention. Even for a single intervention, evaluating its impact is challenging, since there are many sources of uncertainty and many parameters associated with the intervention (e.g., when to start school closure, for how long, and how to restart). Therefore, finding uncertainty bounds is a key part of the problem.

Data needs. While all the data needs from the previous stages remain, representing different kinds of behaviors is a crucial component of the models at this stage; this includes the use of PPE, compliance with social distancing measures, and the level of mobility. Statistics on such behaviors are available at a fairly detailed level (e.g., by county and by day) from multiple sources, such as: (1) the COVID-19 Impact Analysis Platform from the University of Maryland [56], which provides metrics related to social distancing, including the level of staying home, out-of-county trips, and out-of-state trips; (2) changes in mobility associated with different kinds of activities, from Google [61] and other sources; and (3) survey data on behaviors such as mask usage [62].

Modeling approaches. As mentioned above, such analyses are almost entirely model based, using structured metapopulation models [8, 42-45] and agent-based models [46-50].
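The "evaluate an intervention on epidemic outcomes" formulation can be illustrated with a deliberately crude compartmental model: a discrete-day SIR in which transmission drops by a compliance-dependent factor from a given start day, and each candidate intervention is scored by peak infections and attack rate. Every parameter value here is an assumption for illustration, not a calibrated estimate.

```python
def sir_with_intervention(beta=0.4, gamma=0.1, n=100000, days=300,
                          start=None, reduction=0.0):
    """Discrete-day SIR; from day `start` on, transmission drops by
    `reduction` (a crude stand-in for distancing compliance).
    Returns the peak infectious count and the final attack rate."""
    S, I, R = n - 10.0, 10.0, 0.0
    peak = I
    for day in range(days):
        b = beta * (1 - reduction) if start is not None and day >= start else beta
        new_inf = b * S * I / n
        S, I, R = S - new_inf, I + new_inf - gamma * I, R + gamma * I
        peak = max(peak, I)
    return {"peak": peak, "attack_rate": R / n}

# Hypothetical suite of interventions (compliance levels), ranked by outcome:
suite = {r: sir_with_intervention(start=20, reduction=r) for r in (0.0, 0.3, 0.6)}
```

Sweeping `start` and `reduction` over plausible ranges, and repeating under sampled parameter values, is the simplest way to get the uncertainty bounds the formulation asks for.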
Different kinds of behaviors relevant to such interventions, including compliance with PPE usage and with social distancing guidelines, need to be incorporated into these models. Since there is a great deal of heterogeneity in such behaviors, it is conceptually easiest to incorporate them into agent-based models, where individual agents are represented explicitly. However, the calibration, simulation, and analysis of such models pose significant computational challenges. Metapopulation models, on the other hand, are much easier to simulate, but such behaviors cannot be represented directly; instead, modelers must estimate the effect of different behaviors on the disease model parameters, which poses its own modeling challenges.

Challenges. There are a number of challenges in using behavioral data, depending on the specific datasets. Much of the data available for COVID-19 is estimated through indirect sources, e.g., cell phone and online activity and crowd-sourced platforms. These can provide large spatio-temporal datasets but carry unknown biases and uncertainties. Survey data, on the other hand, is often more reliable and provides several covariates, but is typically very sparse. Handling such uncertainties, performing rigorous sensitivity analysis, and propagating the uncertainties into the analysis of the simulation outputs are important steps for modelers.

The COVID-19 pandemic has led to a significant increase in hospitalizations. Hospitals are typically optimized to run near capacity, so there have been fears that hospital capacity would not be adequate, especially in several countries in Asia but also in some regions of the US. Nosocomial transmission could further increase this burden.

Formulation. The overall problem is to estimate the demand for hospital resources within a population; this includes the number of hospitalizations and more refined types of resources, such as ICUs, CCUs, medical personnel, and equipment such as ventilators.
An important issue is whether the capacity of hospitals within the region will be overrun by the demand, when this is expected to happen, and how to design strategies to meet the demand, whether by augmenting the capacity of existing hospitals or by building new facilities. Timing is of the essence, and projections of when demand will exceed capacity are important for governments to plan.

Data needs. The demand for hospitalization and other health resources can be estimated from the epidemic models mentioned earlier by incorporating suitable health states, e.g., [43, 63]. In addition to the inputs needed for setting up the case-count models, datasets are needed on hospitalization rates and on the durations of hospital stay, ICU care, and ventilation. The other important inputs for this component are hospital capacities and referral regions (which represent where patients travel for hospitalization); various public and commercial datasets provide such information, e.g., [64, 65].

Modeling approaches. Demand for health resources is typically incorporated into both metapopulation and agent-based models by having a fraction of infectious individuals transition into a hospitalized state. An important issue is what happens when there is a shortage of hospital capacity; studying this requires modeling the hospital infrastructure, i.e., the different kinds of hospitals within the region and which hospital a patient goes to. There is typically limited data on this, and data on hospital referral regions or a Voronoi tessellation can be used. Understanding the regimes in which hospital demand exceeds capacity is an important question to study. Nosocomial transmission is typically much harder to study, since it requires more detailed modeling of processes within hospitals.

Challenges. There is a great deal of uncertainty and variability in all the datasets involved in this process, which makes the modeling difficult.
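The "fraction of infections transitions into a hospitalized state" mechanism reduces, in its simplest form, to a convolution: admit a fixed fraction of each day's cases after a reporting-to-admission lag, keep each admission in a bed for a fixed length of stay, and compare the resulting occupancy curve against capacity. The hospitalization fraction, lag, and length of stay below are illustrative assumptions.

```python
def bed_occupancy(daily_cases, hosp_frac=0.05, lag=7, stay=10):
    """Occupied beds per day: hosp_frac of each day's cases are admitted
    `lag` days later and occupy a bed for `stay` days (toy parameters)."""
    horizon = len(daily_cases) + lag + stay
    beds = [0.0] * horizon
    for day, cases in enumerate(daily_cases):
        admitted = hosp_frac * cases
        for d in range(day + lag, day + lag + stay):
            beds[d] += admitted
    return beds

def first_deficit_day(beds, capacity):
    """First day on which projected demand exceeds capacity, else None."""
    for day, b in enumerate(beds):
        if b > capacity:
            return day
    return None
```

In practice `daily_cases` would come from the forecasting models of the previous section, so the (wide) uncertainty bands on cases propagate directly into the deficit-day estimate.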
For instance, forecasts of the number of cases and hospitalizations have huge uncertainty bounds over medium- or long-term horizons, which is exactly the kind of input needed to understand hospital demand and potential deficits.

The suppression stage involves methods to control the outbreak, including reducing the incidence rate and potentially eradicating the disease. Eradication of COVID-19 appears unlikely as of now; more likely, it will join the seasonal human coronaviruses and mutate continuously, much like the influenza virus.

The contact tracing problem refers to the ability to trace the neighbors of an infected individual. Ideally, each neighbor of an infected individual would be identified and isolated from the larger population to reduce the growth of the pandemic; in some cases, each such neighbor can also be tested to see whether they have contracted the disease. Contact tracing is the workhorse of epidemiology and has been immensely successful in controlling slow-moving diseases. When combined with vaccination and other pharmaceutical interventions, it provides the best way to control and suppress an epidemic.

Formulation. The basic contact tracing problem is stated as follows: given a social contact network G(V, E), a subset of nodes S ⊂ V that are infected, and a subset S_1 ⊂ S of nodes identified as infected, find all neighbors of S, where a neighbor is an individual likely to have had substantial contact with an infected person. One then tests these neighbors (if tests are available) and, following that, isolates them, vaccinates them, or administers antivirals. The measures of effectiveness for the problem include: (i) maximizing the size of S_1; (ii) maximizing the size of the set N(S_1) ⊆ N(S), i.e.,
the number of identified neighbors of the set S_1; (iii) doing this within a short period of time, so that these neighbors either do not become infectious or spend as few days as possible infectious while still interacting normally in the community; (iv) ultimately reducing the incidence rate in the community: if all the neighbors of S_1 cannot be identified, one aims to identify those individuals whose isolation or treatment has the largest impact; and (v) verifying that these individuals did come into contact with the infected individuals and thus can be asked to isolate or be treated.

Data needs. The data needed for the contact tracing problem includes a line list of individuals currently known to be infected (needed for human-based contact tracing). In real-world deployments with human contact tracers, one interviews all individuals known to be infectious and reaches out to their contacts.

Modeling approaches. Human contact tracing is routinely done in epidemiology, and most states in the US have hired such contact tracers. They obtain the daily incidence report from the state health department and then proceed to contact the individuals confirmed to be infected. Earlier, human contact tracers went from house to house and identified potential contacts through a well-defined interview process; although very effective, this is time consuming and labor intensive. Phones have been used extensively in the last 10-20 years, as they allow contact tracers to reach individuals quickly, though with the downside that it may be hard to reach everyone. During the COVID-19 outbreak, for the first time, societies and governments have considered and deployed digital contact tracing tools [66-70].
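On the graph formulation above, the core operation is just a neighborhood query on G(V, E), plus a prioritization rule when tracer capacity is limited. The degree-based ranking below is one simple heuristic of our own choosing, not a method prescribed by the text.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Undirected adjacency sets for a contact network given as edge pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def trace_contacts(edges, identified):
    """N(S_1): all contacts of the identified infected set, excluding S_1."""
    adj = build_adjacency(edges)
    contacts = set()
    for case in identified:
        contacts |= adj[case]
    return contacts - set(identified)

def prioritize(edges, identified, budget):
    """With limited tracers, reach highest-degree contacts first (heuristic)."""
    adj = build_adjacency(edges)
    ranked = sorted(trace_contacts(edges, identified),
                    key=lambda n: len(adj[n]), reverse=True)
    return ranked[:budget]
```

The budgeted variant mirrors objective (iv): when not every neighbor can be reached, go first to the contacts whose isolation is likely to have the largest downstream effect.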
These can be quite effective but also have certain weaknesses, including privacy, accuracy, and the limited market penetration of the digital apps.

Challenges. These include: (i) the inability to identify everyone who is infectious (the set S); this is virtually impossible for a COVID-19-like disease, unless the incidence rate has come down drastically, in part because many individuals are infected but asymptomatic; and (ii) identifying all contacts of S (or S_1); this is hard because individuals cannot recall everyone they met, and some people who were in close proximity, for example in stores or at social events, are simply not known to the individuals in S. Furthermore, even when a person can identify their contacts, it is often hard to reach all of them due to resource constraints (each human tracer can only contact a small number of individuals).

The overall goal of the vaccine allocation problem is to allocate vaccines efficiently and in a timely manner to reduce the overall burden of the pandemic.

Formulation. The basic version of the problem can be cast very simply (for networked models): given a graph G(V, E) and a budget B on the number of vaccines available, find a set S of size B to vaccinate so as to optimize a certain measure of effectiveness. The measure of effectiveness can be: (i) minimizing the total number of individuals infected (or maximizing the number of uninfected individuals); (ii) minimizing the total number of deaths (or maximizing the number of deaths averted); (iii) optimizing the above quantities while respecting equity and fairness criteria (across socio-demographic groups, e.g.,
age, race, income); (iv) taking into account the vaccine hesitancy of individuals; (v) accounting for the fact that not all vaccines are available at the start of the pandemic and that, once production begins, only a limited number of doses arrive each month; (vi) deciding how to share the stockpile between countries, states, and other organizations; and (vii) taking into account the efficacy of the vaccine.

Data needs. As in the other problems, vaccine allocation needs a good representation of the system as input; network-based, metapopulation-based, and compartmental mass-action models can all be used. One other key input is the vaccine budget, i.e., the production schedule and timeline, which serves as the constraint for the allocation problem. Additional data on prevailing vaccine sentiment and on past compliance with seasonal and neonatal vaccinations are useful for estimating coverage.

Modeling approaches. The problem has been studied actively in the literature: the network science community has focused on optimal allocation schemes, while the public health community has focused on using metapopulation models and on assessing fixed allocation schemes based on socio-economic and demographic considerations. Game-theoretic approaches that try to understand the strategic behavior of individuals and organizations have also been studied.

Challenges. The problem is computationally challenging, so simulation-based optimization techniques are used most of the time. A challenge for the optimization approach is that the optimal allocation scheme may be hard to compute or hard to implement. Other challenges include fairness criteria (e.g., the optimal set might concentrate on a specific group) and the multiple objectives that must be balanced.
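As a sketch of the networked formulation, the snippet below pairs a simple allocation heuristic (vaccinate the B highest-degree nodes, a common network-science baseline) with a simulation-based evaluation of objective (i), the expected fraction infected. The percolation-style outbreak model and its transmission probability are assumptions chosen for brevity; real studies would use the richer disease models discussed above.

```python
from collections import defaultdict
import random

def degree_allocation(edges, budget):
    """Heuristic: spend the budget B on the highest-degree nodes."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return set(sorted(deg, key=deg.get, reverse=True)[:budget])

def attack_rate(edges, vaccinated, p=0.4, seed_node=0, runs=400):
    """Mean fraction infected in percolation-style outbreaks from seed_node;
    vaccinated nodes can neither be infected nor transmit."""
    assert seed_node not in vaccinated  # sketch assumes an unvaccinated seed
    adj, nodes = defaultdict(set), set()
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
        nodes |= {u, v}
    rng, total = random.Random(11), 0
    for _ in range(runs):
        infected, frontier = {seed_node}, [seed_node]
        while frontier:
            u = frontier.pop()
            for w in adj[u]:
                if w not in infected and w not in vaccinated and rng.random() < p:
                    infected.add(w)
                    frontier.append(w)
        total += len(infected)
    return total / (runs * len(nodes))
```

Comparing `attack_rate` with and without the chosen set is exactly the simulation-based optimization loop mentioned in the challenges: candidate allocations are proposed by a heuristic and scored by repeated outbreak simulation.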
While the above sections provide an overview of the salient modeling questions that arise during the key stages of a pandemic, mathematical and computational model development is equally if not more important as we approach the post-pandemic (or, more appropriately, inter-pandemic) phase. Often referred to as peace-time efforts, this phase allows modelers to retrospectively assess how individual and collective models performed during the pandemic. To encourage continued development and to identify data gaps, synthetic forecasting challenge exercises [71] may be conducted, in which multiple modeling groups are invited to forecast synthetic scenarios with varying levels of data availability. Another set of models that are quite relevant for policymakers during the winding-down stages are those that help assess the overall health burden and economic costs of the pandemic.

References
- EpiDeep: exploiting embeddings for epidemic forecasting
- An ARIMA model to forecast the spread and the final size of COVID-2019 epidemic in Italy (first version on SSRN 31 March)
- Real-time epidemic forecasting: challenges and opportunities
- Accuracy of real-time multi-model ensemble forecasts for seasonal influenza in the U.S.
- Real-time forecasting of infectious disease dynamics with a stochastic semi-mechanistic model
- HealthMap
- The use of social media in public health surveillance. Western Pacific Surveillance and Response Journal (WPSAR)
- The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak
- Basic prediction methodology for COVID-19: estimation and sensitivity considerations. medRxiv
- COVID-19 outbreak on the Diamond Princess cruise ship: estimating the epidemic potential and effectiveness of public health countermeasures
- Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand. Imperial College technical report
- Modelling disease outbreaks in realistic urban social networks
- Computational epidemiology
- Forecasting COVID-19 impact on hospital bed-days, ICU-days, ventilator-days and deaths by US state in the next 4 months
- Open data resources for fighting COVID-19
- Data-driven methods to monitor, model, forecast and control COVID-19 pandemic: leveraging data science, epidemiology and control theory
- COVID-19 datasets: a survey and future challenges. medRxiv
- Mathematical modeling of epidemic diseases
- The use of mathematical models to inform influenza pandemic preparedness and response
- Mathematical models for COVID-19 pandemic: a comparative analysis
- Updated preparedness and response framework for influenza pandemics
- Novel framework for assessing epidemiologic effects of influenza epidemics and pandemics
- COVID-19 pandemic planning scenarios
- Epidemiological data from the COVID-19 outbreak, real-time case information
- COVID-19 case surveillance public use data. Centers for Disease Control and Prevention
- COVID-19 patients' clinical characteristics, discharge rate, and fatality rate of meta-analysis
- Estimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data
- The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: estimation and application
- Estimating clinical severity of COVID-19 from the transmission dynamics in Wuhan, China
- The use and reporting of airline passenger data for infectious disease modelling: a systematic review
- Flight cancellations related to 2019-nCoV (COVID-19)
- The hidden geometry of complex, network-driven contagion phenomena
- Potential for global spread of a novel coronavirus from China
- Forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study. The Lancet
- Using predicted imports of 2019-nCoV cases to determine locations that may not be identifying all imported cases. medRxiv
- Preparedness and vulnerability of African countries against introductions of 2019-nCoV. medRxiv
- Creating synthetic baseline populations
- OpenStreetMap
- American Time Use Survey
- Multiscale mobility networks and the spatial spreading of infectious diseases
- Optimizing spatial allocation of seasonal influenza vaccine under temporal constraints
- Assessing the international spreading risk associated with the 2014 West African Ebola outbreak
- Spread of Zika virus in the Americas
- Structure of social contact networks and their impact on epidemics
- Generation and analysis of large synthetic social contact networks
- Modelling disease outbreaks in realistic urban social networks
- Containing pandemic influenza at the source
- Report 9: Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand
- The structure and function of complex networks
- A public data lake for analysis of COVID-19 data
- MIDAS Network. MIDAS 2019 novel coronavirus repository
- Coronavirus (COVID-19) data in the United States
- COVID-19 Impact Analysis Platform
- COVID-19 surveillance dashboard
- The COVID Tracking Project
- Absolute humidity and the seasonal onset of influenza in the continental United States
- EpiEstim: a package to estimate time varying reproduction numbers from epidemic curves. R package
- Google COVID-19 Community Mobility Reports
- Mask-wearing survey data
- Impact of social distancing measures on coronavirus disease healthcare demand, central Texas, USA
- Current hospital capacity estimates - snapshot
- Total hospital bed occupancy
- Quantifying the effects of contact tracing, testing, and containment
- COVID-19 epidemic in Switzerland: on the importance of testing, contact tracing and isolation
- Quantifying SARS-CoV-2 transmission suggests epidemic control with digital contact tracing
- Isolation and contact tracing can tip the scale to containment of COVID-19 in populations with social distancing. Available at SSRN 3562458
- Privacy sensitive protocols and mechanisms for mobile contact tracing
- The RAPIDD Ebola forecasting challenge: synthesis and lessons learnt

Acknowledgments. The authors would like to thank members of the Biocomplexity COVID-19 Response Team and the Network Systems Science and Advanced Computing (NSSAC) Division for their thoughtful comments and suggestions related to epidemic modeling and response support. We thank members of the Biocomplexity Institute and Initiative, University of Virginia, for useful discussion and suggestions.

key: cord-027286-mckqp89v
title: Pattern recognition model to aid the optimization of dynamic spectrally-spatially flexible optical networks
authors: Ksieniewicz, Paweł; Goścień, Róża; Klinkowski, Mirosław; Walkowiak, Krzysztof
date: 2020-05-23
journal: Computational Science, ICCS 2020
doi: 10.1007/978-3-030-50423-6_16
doc_id: 27286; cord_uid: mckqp89v

The following paper considers pattern recognition-aided optimization of a complex and relevant problem related to optical networks. For that problem, we propose a dedicated four-step optimization approach that makes use of, among other tools, a regression method. The main focus of this study is the construction of an efficient regression model and its application to the initial optimization problem. We perform extensive experiments using realistic network assumptions and then draw conclusions regarding efficient configuration of the approach. According to the results, the approach performs best with a multi-layer perceptron regressor, whose prediction ability was the highest among all tested methods.

According to Cisco forecasts, global consumer Internet traffic will grow with a compound annual growth rate (CAGR) of 26% over the years 2017-2022 [3]. The increase in network traffic is a result of two main trends.
Firstly, the number of devices connected to the Internet is growing due to the increasing popularity of new services, including the Internet of Things (IoT). The second important trend influencing Internet traffic is the popularity of bandwidth-demanding services such as video streaming (e.g., Netflix) and cloud computing. The Internet consists of many individual networks connected together; the backbone connecting these networks, however, consists of optical networks based on fiber connections. Currently, the most popular technology in optical networks is WDM (wavelength division multiplexing), which is not expected to be efficient enough to support the increasing traffic in the near future. In the last few years, a new concept for optical networks has been deployed: the architecture of elastic optical networks (EONs). However, over the next decade, new approaches must be developed to overcome the predicted "capacity crunch" of the Internet. One of the most promising proposals is the spectrally-spatially flexible optical network (SS-FON), which combines space division multiplexing (SDM) technology [14], enabling parallel transmission of co-propagating spatial modes in suitably designed optical fibers such as multi-core fibers (MCFs) [1], with flexible-grid EONs [4], which enable better utilization of the optical spectrum and distance-adaptive transmission [15]. In MCF-based SS-FONs, a challenging issue is the inter-core crosstalk (XT) effect, which impairs the quality of transmission (QoT) of optical signals and has a negative impact on overall network performance. In more detail, MCFs are susceptible to signal degradation as a result of the XT that occurs between adjacent cores whenever optical signals are transmitted in an overlapping spectrum segment. Addressing the XT constraints significantly complicates the optimization of SS-FONs [8].
Besides their numerous advantages, new network technologies also bring challenging optimization problems, which require efficient solution methods. Since the technologies and the related problems are new, there are no benchmark solution methods to apply directly, and hence many studies propose dedicated optimization approaches. However, due to the high complexity of these problems, much effort is still needed to improve their performance [6, 8]. We therefore observe a trend toward using artificial intelligence techniques (with a strong emphasis on pattern recognition tools) in the field of communication network optimization. According to the literature surveys in this field [2, 10, 11, 13], researchers mostly focus on discrete labelled supervised and unsupervised learning problems, such as traffic classification. Regression methods, which are in the scope of this paper, are mostly applied to traffic prediction and to the estimation of quality of transmission (QoT) parameters such as delay or bit error rate. This paper extends our study initiated in [7]. We make use of pattern recognition models to aid the optimization of dynamic MCF-based SS-FONs in order to improve network performance in terms of minimizing bandwidth blocking probability (BBP), or in other words, to maximize the amount of traffic that can be allocated in the network. In particular, an important topic in the considered optimization problem is the selection of a modulation format (MF) for a particular demand, since each MF provides a different trade-off between required spectrum width and transmission distance. To solve that problem, we define applicable distances for each MF (i.e., the minimum and maximum lengths of a routing path supported by each MF). To find the values of these distances that provide the best allocation results, we construct a regression model and then combine it with Monte Carlo search.
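The "regression model plus Monte Carlo search" combination can be sketched generically: sample candidate distance thresholds at random within bounds and score each candidate with a fitted surrogate instead of a costly network simulation. The bounds, sample count, and surrogate below are placeholders; the paper's actual surrogate is a trained regressor (ultimately an MLP), not the toy function used in the test.

```python
import random

def surrogate_search(model, bounds, samples=1000, seed=3):
    """Monte Carlo search over MF distance thresholds, scored by a fitted
    regression surrogate `model` that predicts BBP (sketch, not the paper's
    exact procedure)."""
    rng = random.Random(seed)
    best_x, best_y = None, float("inf")
    for _ in range(samples):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        y = model(x)  # surrogate-predicted bandwidth blocking probability
        if y < best_y:
            best_x, best_y = x, y
    return best_x, best_y
```

The appeal of the surrogate is speed: once the regressor is trained, thousands of threshold configurations can be scored in milliseconds, whereas each true evaluation would require a full dynamic allocation simulation.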
it is worth noting that this work does not address dynamic problems in the context of the concept changing over time (concept drift), as is often the case when processing large data streams, and assumes a static distribution of the concept [9]. the main novelty and contribution of this work is an in-depth analysis of basic regression methods stabilized by an ensemble-of-estimators structure [16] and an assessment of their usefulness in the task of predicting the objective function for optimization purposes. in one of our previous works [7], we confirmed the effectiveness of this type of solution using a distance-weighted nearest-neighbors regression algorithm, focusing, however, much more on the network aspect of the analyzed problem. in the present work, the main emphasis is on the construction of the prediction model, whose main purpose is a proposal to interpret the optimization problem in the context of pattern recognition tasks. the rest of the paper is organized as follows. in sect. 2, we introduce the studied network optimization problem. in sect. 3, we discuss our optimization approach for that problem. next, in sect. 4, we evaluate the efficiency of the proposed approach. eventually, sect. 5 concludes the work. the optimization problem is known in the literature as dynamic routing, space and spectrum allocation (rssa) in ss-fons [5]. we are given an ss-fon topology realized using mcfs. the topology consists of nodes and physical links. each physical link comprises a number of spatial cores. the spectrum width available on each core is divided into narrow, same-sized segments called slices. the network is in its operational state; we observe it in a particular time perspective given by a number of iterations. in each iteration (i.e., a time point), a set of demands arrives. each demand is given by a source node, a destination node, a duration (measured in the number of iterations) and a bitrate (in gbps).
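the demand structure and the per-core slice grid of one physical link described above can be sketched as follows; this is a minimal illustration with assumed names (the paper does not publish its data structures), using the core and slice counts given later in the evaluation section:

```python
from dataclasses import dataclass

@dataclass
class Demand:
    """one traffic demand arriving in a given iteration."""
    source: int        # source node id
    destination: int   # destination node id
    duration: int      # lifetime, measured in iterations
    bitrate: int       # requested bitrate in gbps

def empty_link(cores: int = 7, slices: int = 320):
    """spectrum occupancy of one physical link: cores x slices boolean grid."""
    return [[False] * slices for _ in range(cores)]

link = empty_link()
d = Demand(source=0, destination=5, duration=10, bitrate=400)
```

a channel allocation then amounts to flipping a contiguous run of slices on one core of every link along the chosen routing path.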
to realize a demand, it is required to assign to it a light-path and reserve its resources for the time of the demand duration. when a demand expires, its resources are released. a light-path consists of a routing path (a set of links connecting the demand source and destination nodes) and a channel (a set of adjacent slices selected on one core) allocated on the path links. the channel width (number of slices) required for a particular demand on a particular routing path depends on the demand bitrate, the path length (in kilometres) and the selected modulation format. each incoming demand has to be realized unless there are not enough free resources when it arrives; in such a case, the demand is rejected. please note that the light-paths selected in the i-th iteration affect the network state and the allocation possibilities in the next iterations. the objective function is defined here as the bandwidth blocking probability (bbp), calculated as the summed bitrate of all rejected demands divided by the summed bitrate of all offered demands. since we aim to support as much traffic as possible, the objective criterion should be minimized [5, 8]. the light-path allocation process has to satisfy three basic rssa constraints. first, each channel has to consist of adjacent slices. second, the same channel (i.e., the same slices and the same core) has to be allocated on each link included in a light-path. third, at each time point, each slice on a particular physical link and a particular core can be used by at most one demand [8]. there are four modulation formats available for transmission: 8-qam, 16-qam, qpsk and bpsk. each format is described by its spectral efficiency, which determines the number of slices required to realize a particular bitrate using that modulation. however, each modulation format is also characterized by the maximum transmission distance (mtd) at which it still provides an acceptable value of the optical signal to noise ratio (osnr) at the receiver side.
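the bbp objective and the first rssa constraint (slice adjacency) stated above are simple enough to be written down directly; this is a sketch with assumed function names, not the paper's implementation:

```python
def bandwidth_blocking_probability(rejected_bitrates, offered_bitrates):
    """summed bitrate of all rejected demands divided by summed offered bitrate."""
    return sum(rejected_bitrates) / sum(offered_bitrates)

def is_contiguous(slice_indices):
    """first rssa constraint: a channel must consist of adjacent slices."""
    s = sorted(slice_indices)
    return all(b - a == 1 for a, b in zip(s, s[1:]))

# two of four demands rejected: 400 gbps blocked out of 2000 gbps offered
bbp = bandwidth_blocking_probability([100, 300], [100, 300, 600, 1000])
```

the second and third constraints (spectrum continuity along the path and non-overlapping reservations) are bookkeeping over the link grids and are omitted here.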
more spectrally-efficient formats consume less spectrum, however, at the cost of shorter mtds. moreover, more spectrally-efficient formats are also more vulnerable to xt effects, which can additionally degrade qot and lead to the rejection of demands [7, 8]. therefore, the selection of the modulation format for each demand is a compromise between spectrum efficiency and qot. to answer that problem, we use the procedure introduced in [7] to select a modulation format for a particular demand and routing path. let m = 1, 2, 3, 4 denote the modulation formats ordered by increasing mtds (and, at the same time, by decreasing spectral efficiency). it means that m = 1 denotes 8-qam and m = 4 denotes bpsk. let mtd = [mtd_1, mtd_2, mtd_3, mtd_4] be the vector of mtds for modulations 8-qam, 16-qam, qpsk and bpsk, respectively. moreover, let atd = [atd_1, atd_2, atd_3, atd_4] (where atd_i <= mtd_i, i = 1, 2, 3, 4) be the vector of applicable transmission distances. for a particular demand and a routing path, we select the most spectrally-efficient modulation format i for which atd_i is greater than or equal to the selected path length and the xt effect is at an acceptable level. for each candidate modulation format, we assess the xt level based on the availability of adjacent resources (i.e., slices and cores), using the procedure proposed in [7]. it is important to note that we do not indicate atd_4 (for bpsk), since we assume that this modulation is able to support transmission on all candidate routing paths regardless of their length. please also note that when the xt level is too high for all modulation formats, the demand is rejected regardless of the light-paths' availability. in sect. 2 we have studied the rssa problem and emphasised the importance of the efficient modulation selection task. for that task we have proposed a solution method whose efficiency strongly depends on the applied atd vector. therefore, we aim to find the atd* vector that provides the best results.
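the selection rule just described (most spectrally-efficient format whose applicable distance covers the path and whose xt level is acceptable, with bpsk as an unlimited fallback) can be sketched as below; the atd values and the xt predicate are placeholders, not figures from the paper:

```python
# formats ordered by increasing mtd / decreasing spectral efficiency:
# m=1: 8-qam, m=2: 16-qam, m=3: qpsk, m=4: bpsk
def select_modulation(path_length_km, atd, xt_acceptable):
    """pick the most spectrally-efficient format m whose applicable
    transmission distance covers the path and whose assessed xt level
    is acceptable; bpsk (m=4) supports any path length by assumption."""
    for m in range(1, 4):
        if path_length_km <= atd[m - 1] and xt_acceptable(m):
            return m
    return 4 if xt_acceptable(4) else None  # None -> demand rejected

atd = [800, 1500, 3000]  # hypothetical atd_1..atd_3 in km
m = select_modulation(1200, atd, lambda m: True)
```

with these placeholder distances, a 1200 km path skips 8-qam (atd_1 = 800) and lands on 16-qam.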
the vector elements have to be positive and have upper bounds given by the vector mtd. moreover, the following condition has to be satisfied: atd_i < atd_{i+1}, i = 1, 2. since solving rssa instances is a time-consuming process, it is impossible to evaluate all possible atd vectors in a reasonable time. we therefore make use of regression methods and propose a scheme to find atd*, depicted in fig. 1. a representative set of 1000 different atd vectors is generated. then, for each of them, we simulate the allocation of demands in the ss-fon (i.e., we solve dynamic rssa). for the purpose of demand allocation (i.e., the selection of light-paths), we use a dedicated algorithm proposed in [7]. for each considered atd vector we save the obtained bbp. based on that data, we construct a regression model, which predicts the bbp for a given atd vector. having that model, we use the monte carlo method to find the atd* vector, which is recommended for further experiments. to solve an rssa instance for a particular atd vector, we use the heuristic algorithm proposed in [7]. we work under the assumption that there are 30 candidate routing paths for each traffic demand (generated using dijkstra's algorithm). since the paths are generated in advance and their lengths are known, we can use an atd vector to preselect modulation formats for these paths based on the procedure discussed in sect. 2. therefore, rssa is reduced to the selection of one of the candidate routing paths and a communication channel with respect to resource availability and the assessed xt levels. from the perspective of pattern recognition methods, the abstraction of the problem is not the key element of processing. the main focus here is the representation available to construct a proper decision model. for the purposes of our considerations, we assume that both the input parameters and the objective function take only quantitative, not qualitative, values, so we may use probabilistic pattern recognition models to process them.
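the final search step (sample feasible atd vectors, score each with the regression model, keep the best) can be sketched as a monte carlo loop; the quadratic `toy_model` stands in for the trained regressor and is purely illustrative:

```python
import random

def sample_atd(rng, mtd):
    """draw one feasible atd vector: atd_1 < atd_2 < atd_3, atd_i <= mtd_i."""
    a1 = rng.uniform(1.0, mtd[0])
    a2 = rng.uniform(a1, mtd[1])
    a3 = rng.uniform(a2, mtd[2])
    return [a1, a2, a3]

def monte_carlo_search(predict_bbp, mtd, n_guesses, seed=0):
    """keep the sampled atd vector with the lowest predicted bbp."""
    rng = random.Random(seed)
    best, best_bbp = None, float("inf")
    for _ in range(n_guesses):
        atd = sample_atd(rng, mtd)
        bbp = predict_bbp(atd)
        if bbp < best_bbp:
            best, best_bbp = atd, bbp
    return best, best_bbp

# toy stand-in for the trained regression model (not the paper's simulator)
def toy_model(atd):
    return (atd[0] - 500) ** 2 + (atd[1] - 1200) ** 2 + (atd[2] - 2500) ** 2

atd_star, est = monte_carlo_search(toy_model, mtd=[1000, 2000, 3000], n_guesses=2000)
```

sampling each coordinate above the previous one enforces the ordering constraint by construction, so no guesses are wasted on infeasible vectors.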
if we interpret the optimization task as searching for the extremum of a function of many input parameters, each simulation performed for a combination of those parameters may also be described as a label for the training set of a supervised learning model. in this case, the set of parameters considered in a single simulation becomes a vector of object features (x_n), and the value of the objective function acquired for it may be interpreted as a continuous object label (y_n). repeated simulation for randomly generated parameters allows us to generate a data set (x) supplemented with a label vector (y). a supervised machine learning algorithm can therefore gain, based on such a set, generalization abilities that allow for precise estimation of the simulation result based on its earlier runs on random input values. a typical pattern recognition experiment is based on an appropriate division of the dataset into training and testing sets, in a way that guarantees their separability (most often using cross-validation), avoiding the problem of data peeking, with a sufficient number of repetitions of the validation process to allow proper statistical testing of hypotheses about mutual model dependencies. for the needs of the proposal contained in this paper, the usual 5-fold cross-validation was adopted, with the value of the r² metric calculated in each loop of the experiment. having constructed the regression model, we are able to predict the bbp value for a sample atd vector. please note that the time required for a single prediction is significantly shorter than the time required to simulate dynamic rssa. the last step of our optimization procedure is to find atd*, the vector providing the lowest estimated bbp values. to this end, we use the monte carlo method with a number of guesses provided by the user. the rssa problem was solved for two network topologies: dt12 (12 nodes, 36 links) and euro28 (28 nodes, 82 links).
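the 5-fold cross-validation with per-fold r² described above can be written out without any library machinery; the simple least-squares line fit and the synthetic data below are only a stand-in for the actual regressors and simulation samples:

```python
import random

def r2_score(y_true, y_pred):
    """coefficient of determination used to score each validation fold."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def five_fold_r2(X, y, fit, predict):
    """plain 5-fold cross-validation: train on 80%, score r2 on held-out 20%."""
    n = len(X)
    idx = list(range(n))
    folds = [idx[i::5] for i in range(5)]
    scores = []
    for fold in folds:
        train = [i for i in idx if i not in fold]
        model = fit([X[i] for i in train], [y[i] for i in train])
        scores.append(r2_score([y[i] for i in fold],
                               [predict(model, X[i]) for i in fold]))
    return scores

def fit_linear(xs, ys):
    """closed-form least squares for y = b*x + a (toy model)."""
    n = len(xs); mx = sum(xs) / n; my = sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return b, my - b * mx

rng = random.Random(0)
X = [rng.uniform(0, 10) for _ in range(100)]
y = [2 * x + 1 + rng.gauss(0, 0.1) for x in X]
scores = five_fold_r2(X, y, fit_linear, lambda m, x: m[0] * x + m[1])
```

each of the five folds plays the test set exactly once, which is the separability guarantee the text refers to.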
they model the deutsche telekom (german national) network and a european network, respectively. each network physical link comprises 7 cores, wherein each core offers 320 frequency slices of 12.5 ghz width. we use the same physical network assumptions and xt levels and assessments as in [7]. traffic demands have randomly generated end nodes and bitrates uniformly distributed between 50 gbps and 1 tbps, with a granularity of 50 gbps. their arrivals follow a poisson process with an average arrival rate of λ demands per time unit. the demand duration is generated according to a negative exponential distribution with an average of 1/μ. the offered traffic load is λ/μ normalized traffic units (ntus). for each testing scenario, we simulate the arrival of 10^6 demands. four modulations are available (8-qam, 16-qam, qpsk, bpsk), wherein we use the same modulation parameters as in [7]. for each topology we have generated 9 different datasets, each consisting of 1000 samples of the atd vector and the corresponding bbp. the datasets differ in the xt coefficient (μ = 1 · 10^-9, indicated as "xt1", and μ = 2 · 10^-9, indicated as "xt2"; for more details we refer to [7]) and in the network link scaling factor (the multiplier used to scale the lengths of links in order to evaluate whether different lengths of routing paths influence the performance of the proposed approach). for dt12 we use the following scaling factors: 0.4, 0.6, 0.8, . . . , 2.0. for euro28 the values are as follows: 0.104, 0.156, 0.208, 0.260, 0.312, 0.364, 0.416, 0.468, 0.520. we indicate them as "sx.xxx", where x.xxx refers to the scaling factor value. using these datasets we can evaluate whether the xt coefficient (i.e., the level of vulnerability to xt effects) and/or the average link length influence the performance of the optimization approach.
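the traffic model above (poisson arrivals with rate λ, exponential durations with mean 1/μ, bitrates drawn from {50, 100, ..., 1000} gbps) can be sketched as a generator; function and field names are assumptions for illustration:

```python
import random

def generate_demands(n, nodes, lam, mu, seed=0):
    """draw n demands: distinct random end nodes, bitrate uniform on the
    50-gbps grid up to 1 tbps, exponential duration with mean 1/mu, and
    exponential inter-arrival times with rate lam (poisson arrivals)."""
    rng = random.Random(seed)
    t, demands = 0.0, []
    for _ in range(n):
        t += rng.expovariate(lam)            # next poisson arrival time
        src, dst = rng.sample(range(nodes), 2)
        bitrate = 50 * rng.randint(1, 20)    # 50 .. 1000 gbps, step 50
        duration = rng.expovariate(mu)       # mean 1/mu time units
        demands.append((t, src, dst, bitrate, duration))
    return demands

demands = generate_demands(1000, nodes=12, lam=10.0, mu=1.0)
```

the offered load in normalized traffic units is then simply lam/mu, matching the λ/μ definition in the text.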
the experimental environment for the construction of predictive models, including the implementation of the proposed processing method, was implemented in python, following the guidelines of the state-of-the-art programming interface of the scikit-learn library [12]. statistical dependency assessment metrics for paired tests were calculated according to the wilcoxon test, following the implementation contained in the scipy module. each of the individual experiments was evaluated by the r² score, a typical quality assessment metric for regression problems. the full source code, supplemented with the employed datasets, is publicly available in a git repository. five simple recognition models were selected as the base experimental estimators:
- knr: k-nearest neighbors regressor with five neighbors, a leaf size of 30 and the euclidean metric (minkowski distance with p = 2),
- dknr: the knr regressor weighted by distance from the closest patterns,
- mlp: a multilayer perceptron with one hidden layer of one hundred neurons, the relu activation function and the adam optimizer,
- dtr: a cart tree with the mse split criterion,
- lin: the linear regression algorithm.
in this section we evaluate the performance of the proposed optimization approach. to this end, we conduct three experiments. experiment 1 focuses on the number of patterns required to construct a reliable prediction model. experiment 2 assesses the statistical dependence of the built models. eventually, experiment 3 verifies the efficiency of the proposed approach as a function of the number of guesses in the monte carlo search. the first experiment carried out as part of the approach evaluation is designed to verify how many patterns (and thus how many repetitions of simulations) must be passed to individual regression algorithms to allow the construction of a reliable prediction model. the tests were carried out on all five considered regressors in two stages.
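the five base estimators listed above map directly onto scikit-learn classes; this is a plausible reconstruction of the configuration from the stated hyperparameters, not the authors' published code:

```python
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression

models = {
    # five neighbors, leaf size 30, minkowski metric with p=2 (euclidean)
    "knr": KNeighborsRegressor(n_neighbors=5, leaf_size=30, metric="minkowski", p=2),
    # the same neighborhood model, weighted by distance to the closest patterns
    "dknr": KNeighborsRegressor(n_neighbors=5, weights="distance"),
    # one hidden layer of 100 neurons, relu activation, adam optimizer
    "mlp": MLPRegressor(hidden_layer_sizes=(100,), activation="relu", solver="adam"),
    # cart tree; the default split criterion is the mse ("squared_error")
    "dtr": DecisionTreeRegressor(),
    # ordinary linear regression
    "lin": LinearRegression(),
}

# quick smoke test of one estimator on a trivial linear relation
X = [[i] for i in range(20)]
y = [2.0 * i for i in range(20)]
models["lin"].fit(X, y)
prediction = float(models["lin"].predict([[25]])[0])
```

each model exposes the same fit/predict interface, which is what makes the cross-validated comparison in the experiments straightforward.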
first, the range from 10 to 100 patterns was analyzed, and second, the range from 100 to 1000 patterns. it is important to note that, due to the chosen approach to cross-validation, in each case the model is built on 80% of the available objects. the analysis was carried out independently on all available data sets, and, due to the non-deterministic nature of the sampling of available patterns, its results were additionally stabilized by repeating the choice of the object subset five times. in order to allow proper observations, the results were averaged within each topology. the plots for the range from 100 to 1000 patterns were additionally supplemented by marking the ranges of the standard deviation of the r² metric acquired within the topology, and are presented from the value .8 upward. the results achieved by averaging over the individual topologies are presented in figs. 2 and 3. for the dt12 topology, the mlp and dtr algorithms are competitively the best models, both in terms of the dynamics of the relationship between the number of patterns and the overall regression quality. the linear regression clearly lags behind the rest. a clear observation is also the saturation of the models, understood as approaching the maximum predictive ability, with as few as around 100 patterns in the data set. the best algorithms already achieve a quality of around .8 there, and with 600 patterns they stabilize around .95. the relationship between each of the recognition algorithms and the number of patterns takes the form of a logarithmic curve in which, after fast initial growth, each subsequent object gives less and less potential for improving the quality of prediction. this suggests that it is not necessary to carry out further simulations to extend the training set, because they would not significantly affect the predictive quality of the developed model.
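the staged learning-curve experiment (growing numbers of training patterns, five repeated subsamples, averaged r²) can be imitated on synthetic data; the tiny k-nn predictor and the x² target below are stand-ins chosen only to make the saturation effect visible:

```python
import random

def r2(y_true, y_pred):
    m = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - m) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def knn_predict(train, x, k=5):
    """tiny k-nn regressor standing in for the models under study."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(p[1] for p in nearest) / len(nearest)

rng = random.Random(1)
pool = [(x, x ** 2) for x in (rng.uniform(0, 1) for _ in range(1100))]
test = pool[:100]           # held-out objects
curve = {}
for n in (10, 100, 1000):
    scores = []
    for rep in range(5):    # stabilize by resampling the training subset 5 times
        train = rng.sample(pool[100:], n)
        scores.append(r2([y for _, y in test],
                         [knn_predict(train, x) for x, _ in test]))
    curve[n] = sum(scores) / 5
```

as in the paper's figures, the score rises steeply at first and then flattens: most of the attainable quality is already reached well before the full training pool is used.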
very similar observations may be made for the euro28 topology, noting, however, that it seems to be a simpler problem, allowing faster achievement of the maximum predictive capacity of the models. it is also worth noting that the standard deviation of the results obtained by mlp is smaller, which may be equated with the potentially greater stability of the model achieved by such a solution. the second experiment extends the research contained in experiment 1 by assessing the statistical dependence of models built on the full datasets, consisting of a thousand samples for each case. the results achieved are summarized in tables 1a and b. as may be seen, for the dt12 topology, the lin algorithm clearly deviates negatively from the other methods, being in absolutely every case a worse solution than any of the others, which leads to the conclusion that we should completely reject it from consideration as a base for a stable recognition model. the algorithms based on neighborhood (knr and dknr) are in the middle of the pack, in most cases statistically giving way to mlp and dtr, which would also suggest departing from them in the construction of the final model. the statistically best solutions, almost equally, are in this case mlp and dtr. for the euro28 topology, the results are similar when it comes to the lin, knr and dknr approaches. a significant difference, however, may be seen in the achievements of dtr, which in one case turns out to be the worst in the pack, and in many cases is significantly worse than mlp. these observations suggest leaning towards the application of neural networks in the final model used for optimization purposes. importantly, the highest-quality prediction does not necessarily mean the best optimization. it is one of the very important factors, but not the only one. it is also necessary to be aware of the shape of the decision function. for this purpose, the research was supplemented with the visualizations contained in fig. 4.
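the paired statistical comparison underlying these tables uses the wilcoxon signed-rank test from scipy, as stated earlier in the text; the per-fold scores below are invented placeholders purely to show the call:

```python
from scipy.stats import wilcoxon

# hypothetical per-fold r2 scores for two models on identical dataset splits
mlp_scores = [0.951, 0.962, 0.941, 0.973, 0.955, 0.964, 0.948, 0.939, 0.967, 0.952]
lin_scores = [0.701, 0.722, 0.693, 0.714, 0.705, 0.731, 0.684, 0.716, 0.702, 0.691]

# paired two-sided test: are the score differences symmetric around zero?
stat, p_value = wilcoxon(mlp_scores, lin_scores)
significantly_different = p_value < 0.05
```

because the folds are shared between models, a paired test like this is the appropriate choice; an unpaired test would discard the per-split dependence.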
the algorithms based on neighborhood (knr, dknr) and decision trees (dtr) are characterized by a discrete decision boundary, which in the visualization resembles a picture with a low level of quantization. in the case of an ensemble model stabilized by cross-validation, actions are taken to reduce this property in order to develop as continuous a border as possible. as may be seen in the illustrations, compensation occurs, although in the case of knr and dknr it leads to some disturbances in the decision boundary (interpreted as thresholding of the predicted label value), and in the dtr case, despite the general correctness of the performed decisions, it generates image artifacts. such a model may still retain high predictive ability, but it has too strong a tendency to overfit and leads to insufficient continuity of the optimized function to perform effective optimization. clear decision boundaries are produced by both the lin and mlp approaches. however, it is necessary to reject lin from processing due to the linear nature of its prediction, which (i) in each optimization will lead to the selection of an extreme value of the analyzed range and (ii) is not compatible with the distribution of the explained variable and must have the largest error at each of the optima. summing up the observations of experiments 1 and 2, the mlp algorithm was chosen as the base model for the optimization task. it is characterized by (i) the statistically best predictive ability among the analyzed methods and (ii) the clearest decision function from the perspective of the optimization task. the last experiment focuses on finding the best atd vector based on the constructed regression model. to this end, we use the monte carlo method with different numbers of guesses. tables 2 and 3 present the obtained results as a function of the number of guesses, which changes from 10^1 up to 10^9. the quality of the results increases with the number of guesses up to some threshold value.
then, the results do not change at all or change only slightly. according to the presented values, the monte carlo method applied with 10^3 guesses provides satisfactory results. we therefore recommend that value for further experiments. this work has considered the topic of employing pattern recognition methods to support the ss-fon optimization process. for a wide pool of generated cases, analyzing two real network topologies, the effectiveness of the solutions implemented by five different, typical regression methods was analyzed, starting from linear regression and ending with neural networks. the conducted experimental analysis shows, with the high confidence obtained by conducting proper statistical validation, that mlp is characterized by the greatest potential in this type of solution. even with a relatively small pool of input simulations constructing a data set for learning purposes, interpretable in both the space of optimization and machine learning problems, simple networks of this type achieve both high prediction quality, measured by the r² metric, and a continuous decision space creating the potential for conducting optimization. basing the model on the stabilization realized by an ensemble of estimators additionally allows us to reduce the influence of noise on the optimization, which, in state-of-the-art optimization methods, could show a tendency to select invalid optima, burdened by the nondeterministic character of the simulator. further research, developing the ideas presented in this article, will focus on the generalization of the presented model to a wider pool of network optimization problems.
references:
[1] high-capacity transmission over multi-core fibers
[2] a comprehensive survey on machine learning for networking: evolution, applications and research opportunities
[3] visual networking index: forecast and trends
[4] elastic optical networking: a new dawn for the optical layer
[5] on the efficient dynamic routing in spectrally-spatially flexible optical networks
[6] on the complexity of rssa of anycast demands in spectrally-spatially flexible optical networks
[7] machine learning assisted optimization of dynamic crosstalk-aware spectrally-spatially flexible optical networks
[8] survey of resource allocation schemes and algorithms in spectrally-spatially flexible optical networking
[9] data stream classification using active learned neural networks
[10] artificial intelligence (ai) methods in optical networks: a comprehensive survey
[11] an overview on application of machine learning techniques in optical networks
[12] scikit-learn: machine learning in python
[13] machine learning for network automation: overview, architecture, and applications
[14] survey and evaluation of space division multiplexing: from technologies to optical networks
[15] modeling and optimization of cloud-ready and content-oriented networks. ssdc
[16] classifier selection for highly imbalanced data streams with minority driven ensemble

key: cord-021426-zo9dx8mr authors: peiffer, robert l.; pohm-thorsen, laurie; corcoran, kelly title: models in ophthalmology and vision research date: 2013-10-21 journal: the biology of the laboratory rabbit doi: 10.1016/b978-0-12-469235-0.50025-7 sha: doc_id: 21426 cord_uid: zo9dx8mr this chapter reviews the anatomy and physiology of the rabbit eye from a comparative perspective. the anatomy of the rabbit eye reflects its niche as a diurnal herbivore. the rabbit has both photopic and scotopic vision without the benefit of a tapetum. orbits are laterally situated; the rabbit is one of the few animals in which the orbital axis coincides with the visual axis.
the shape of the orbit is circular, compared to the cone-shaped human orbit. the orbital walls are of bone, except inferiorly, where the wall is formed partially by the muscles of mastication. the superior orbital wall is formed by the frontal bone. the supraorbital process of the frontal bone contains three supraorbital foramina, which are a unique feature of the rabbit orbit; the foramina are incisures formed into apertures by a cartilaginous sheet. the optic foramina share a common canal anteriorly, with only a thin bony plate to divide them, which disappears posteriorly to form one canal opening into the cranium as a single foramen. entropion in young rabbits can occur as a primary condition or as a secondary condition arising from infection. viral-induced eyelid proliferations can result from infection by the rabbit myxoma and papilloma viruses. the papilloma virus is a member of the papovaviridae, a dna virus transmitted by arthropod vectors. the cottontail rabbit in the midwestern united states is most frequently affected, although domestic rabbits are susceptible. the laboratory rabbit, for reasons both obvious and subtle, has no close competition with regard to being the most commonly utilized species for experimental ophthalmology; use of rabbits in vision research is somewhat tempered, but lagomorphs are well represented here as well. for the ophthalmic researcher, accessibility, economy of acquisition and maintenance, general tractability, and relatively large prominent globes have been determining factors rather than demonstrated similarities in physioanatomy. this chapter reviews the anatomy and physiology of the rabbit eye from a comparative perspective, summarizes documented spontaneous ocular conditions, discusses experimentally induced disease in general terms, and concludes with a summary of observations regarding the rabbit as a model for broad categories of research.
presentation is conceptual and general rather than specific, anticipating that this chapter will be the first step, not the last, for those who would seek to know the rabbit eye. references are selective (we hope representative and pertinent) rather than comprehensive (a cursory literature search yielded several thousand references), with emphasis on the contemporary and the in vivo. our review of anatomy and physiology is cumulative from classic texts (davis, 1929; prince et al., 1960; prince, 1964; cole, 1974; francois and neetens, 1974; tripathi, 1974), more recent papers, and personal observations. the anatomy of the rabbit eye reflects its niche as a diurnal herbivore. the rabbit has both photopic and scotopic vision without benefit of a tapetum. orbits are laterally situated; the rabbit is one of the few animals in which the orbital axis coincides with the visual axis. there is an angle of 150° to 175° between the two visual axes, with a binocular visual field of 10° to 35° in width. by moving the eyes and tilting the head upward, the rabbit can achieve a maximum field of vision of almost 360°. the prominent globes, which extend up to 5 mm beyond the inferior and 12 mm beyond the superior orbital rim, and the large corneas contribute to the phenomenal field of vision. adult globe size is 18 mm horizontal, 17 mm vertical, and 16 mm anterior-posterior. in the rabbit, the internal maxillary artery (a branch of the external carotid) enters the orbit through the anterior sphenoidal foramen and gives rise to the external ophthalmic artery, which anastomoses with the internal ophthalmic artery to supply the extraocular muscles and harder's gland, and with both posterior and anterior ciliary arteries (nasal and temporal), which supply the globe. the internal ophthalmic artery is a branch of the external carotid and enters the orbit through the optic foramen.
venous drainage from the vortex veins is into a rather extensive posterior orbital venous sinus which surrounds the muscle cone and harder's gland. the shape of the orbit is circular, compared to the cone-shaped human orbit. the orbital walls are of bone, except inferiorly, where the wall is formed partially by the muscles of mastication. the superior orbital wall is formed by the frontal bone. the supraorbital process of the frontal bone contains three supraorbital foramina, which are a unique feature of the rabbit orbit; the foramina are incisures formed into apertures by a cartilaginous sheet. the optic foramina share a common canal anteriorly, with only a thin bony plate to divide them, which disappears posteriorly to form one canal opening into the cranium as a single foramen. the rabbit has nine extraocular muscles, one more than is acknowledged in other domestic animals; the additional muscle is the depressor palpebrae inferior. in other mammals, a short tendinous extension of the inferior rectus muscle depresses the lower lid; the rabbit has a prominent globe which extends beyond the orbital rim and thus requires an additional muscle for depression. the muscle arises from the zygomatic bone slightly inferior to the level of the nasal canthus and inserts into the anterior portion of the lower lid. the remaining muscles originate from the orbital wall between the optic foramen and the orbitorotundum foramen and insert rather close (2-4 mm) to the limbus. the retractor bulbi muscle surrounds the optic nerve deep to the rectus muscles; its point of origin is closer to the orbitorotundum foramen, and it inserts well behind the equator in an irregular fashion. the palpebral opening in the rabbit is 10 to 16 mm long. the superior lid is shorter and thicker than the inferior lid, with more numerous, posteriorly directed cilia. on the inferior lid, the cilia are longer nasally, shorter temporally, and directed straight ahead to allow maximum protection and vision.
the orbicularis oculi muscle is comparatively large in the rabbit. there are 40 to 50 meibomian glands embedded in the tarsus, the palpebral conjunctiva contains numerous lymphatic nodules and intraepithelial glands, and the caruncle has a broad base 5 mm wide which merges with both eyelids. rabbits blink 10 to 12 times per hour. the third eyelid of the rabbit is not noticeably active; in fact, it does not nictitate. it can, however, be retracted by applying pressure to the globe. the direction of movement is upward and temporal, and it does not move more than two-thirds across the cornea.
the supraorbital process of the frontal bone contains three supraorbital foramina, which are a unique feature of the rabbit orbit; the foramina are incisures formed into apertures by a cartilaginous sheet. the optic foramina share a common canal anteriorly with only a thin bony plate to divide them, which disappears posteriorly to form one canal opening into the cranium as a single foramen. the rabbit has nine extraocular muscles, one more than is acknowledged in other domestic animals; the additional muscle is the depressor palpebrae inferior. in other mammals a short tendinous extension of the inferior rectus muscle depresses the lower lid; the rabbit has a prominent globe which extends beyond the orbital rim and thus requires an additional muscle for depression. the muscle arises from the zygomatic bone slightly inferior to the level of the nasal canthus and inserts into the anterior portion of the lower lid. the remaining muscles originate from the orbital wall between the optic foramen and the orbitorotundum foramen and insert rather close (2-4 mm) to the limbus. the retractor bulbi muscle surrounds the optic nerve deep to the rectus muscles; its point of origin is closer to the orbitorotundum foramen, and it inserts well behind the equator in an irregular fashion. the palpebral opening in the rabbit is 10 to 16 mm long. the superior lid is shorter and thicker than the inferior lid, with more numerous, posteriorly directed cilia. on the inferior lid the cilia are longer nasally, shorter temporally, and directed straight ahead to allow maximum protection and vision. the orbicularis oculi muscle is comparatively large in the rabbit. there are 40 to 50 meibomian glands embedded in the tarsus, the palpebral conjunctiva contains numerous lymphatic nodules and intraepithelial glands, and the caruncle has a broad base 5 mm wide which merges with both eyelids. rabbits blink 10 to 12 times per hour.
the third eyelid of the rabbit is not noticeably active; in fact, it does not nictitate. it can, however, be retracted by applying pressure to the globe. the direction of movement is upward and temporally, and it does not move more than two-thirds across the cornea. the conjunctiva is divided into two continuous parts, namely, the bulbar conjunctiva and the palpebral conjunctiva; total surface area is about 50% of the human conjunctiva. the palpebral conjunctiva is firmly adherent to the lids and is approximately 40 μm in thickness. in the fornix there are numerous goblet cells and intraepithelial glands scattered between the epithelial cells, and the epithelium is thicker. the lacrimal and auxiliary ducts enter into the fornices. the bulbar conjunctiva is thinner (10-30 μm), with fewer goblet cells which increase in number toward the limbus. a superficial epithelial cell type characterized by large osmiophilic granules is not found in primates. the presence of immunoglobulin a (iga) staining plasma cells and the tendency for dense inflammatory cell infiltrates to accumulate in the conjunctiva in response to inflammatory disease in adjacent tissues speak to its role in immunoresponsiveness. the lacrimal system consists of three glands, the lacrimal puncta and canaliculi, nasolacrimal duct, and nasopuncta. the aqueous tear film is produced by the lacrimal gland, harder's gland, and the gland of the third eyelid; normal schirmer tear test values in the rabbit are 5.30 ± 2.96 mm/min (abrams et al., 1990). the lacrimal gland is large, bilobulated, and pale red in color, occupying the orbit adjacent to the lower rim; it is narrow in form with bulbous enlargements at each canthus. the lacrimal gland is a serous-secreting, compound, tubular gland surrounded by a fibrous connective tissue capsule; the lobes are divided into lobules by loose connective tissue septa containing reticular and collagenous elastic fibers.
in the rabbit the lacrimal gland plays a lesser role in lubricating the eye than it does in humans, and the effect of removal of the lacrimal gland is diminished by compensatory secretions from the gland of the third eyelid and harder's gland. although transient keratoconjunctivitis sicca may be observed following partial removal of the gland, signs disappear, supposedly owing to the regeneration of the tissue. harder's gland is quite large and is attached to the inferior nasal medial wall of the orbit. dimensions are about 19 by 12 to 15 by 4 to 6 mm. the gland is encapsulated and is almost totally surrounded by the orbital venous sinus. the gland is roughly kidney shaped with two distinct lobes, a pink lobe and a white lobe; the size and dispersion of lipid droplets in the gland probably account for the difference in color. the cells of the gland differ in shape as well. in the pink lobe the cells of the acini are cuboidal and filled with lipid droplets which are larger than those in the white lobe. in the white lobe, the acini are formed by columnar epithelium with smaller lipid droplets. the two lobes appear to have similar function, which includes being an integral part of the secretory immune system. there is a single duct which opens onto the inferior part of the bulbar surface of the third eyelid, to which the gland is attached. the gland of the third eyelid closely resembles harder's gland in structure and is situated surrounding the shaft of the cartilage of the third eyelid. the biochemistry of isolated rabbit lacrimal acini has been described (bradley et al., 1992). the rabbit has only one lacrimal punctum which is located in the inferior eyelid, 3 mm from the medial canthus and 3 mm from the inner lid margin. the proximal portion of the canaliculus is very short, approximately 2 mm long, and assumes a funnel-shaped sac at its transition into the nasolacrimal duct.
the lacrimal bone in the rabbit does not have a well-defined lacrimal fossa although it does support the lacrimal sac medially. the nasolacrimal duct courses through the semicircular lacrimal canal from the orbit to the maxilla. in the maxilla the duct courses medially and rostrally for 5 to 6 mm; at this point the duct changes diameter abruptly to 1 mm and curves rostrally. the duct maintains a 2 mm diameter until it reaches the incisor tooth root, where it spirals slightly and is compressed in a sagittal plane between the alveolar bone of the premaxilla and the nasal cartilage. the duct courses rostrally and medially to the nasal vestibule, where it exits at the nasopunctum several millimeters posterior to the mucocutaneous junction of the alar fold. the nasolacrimal duct of the rabbit is difficult to cannulate as resistance is encountered at the proximal maxillary curve and the base of the incisor tooth (burling et al., 1991). histologically the rabbit nasolacrimal duct is similar to the human nasolacrimal duct. the rabbit cornea is quite prominent, having a horizontal dimension of 15 mm and a vertical dimension of 13.5 to 14 mm. the cornea has a uniform thickness of 407 ± 20 μm (fig. 1). the epithelium is thinner in the rabbit than it is in humans, approximately 30 to 40 μm thick, and consists of one row of elongated columnar basal cells beneath three to five layers of wing and surface cells. descemet's membrane is continually laid down throughout life and gradually thickens; the membrane is usually 7 or 8 μm thick, but may reach up to 15 μm with age. the endothelial cells are hexagonal in shape, about 20 μm in diameter, with a density of 2998 ± 326/mm² (salistad and peiffer, 1981), and unlike those of humans, primates, and cats possess regenerative capabilities (van horne et al., 1977). the cornea is innervated by ciliary nerves which pass forward from the ciliary body to the limbus between the sclera and the choroid.
humans have 82 nerve bundles serving the cornea; the rabbit has 65. many iris pillars with broad bases pass from the iris root and taper to a fine insertion into the termination of descemet's membrane and the surrounding sclerocorneal stroma. the broad insertions provide a firm anchorage for the long thin iris and large ciliary process attached to its rear surface, which in turn suspend a comparatively large lens. the pillars are approximately 0.1 to 0.2 mm apart and lead to a prominent ciliary cleft. the trabecular meshwork of the rabbit is relatively rudimentary (fig. 2). the rabbit sclera has considerable thickness variation, measuring 0.5 mm adjacent to the limbus and thinning gradually toward the posterior pole, where it is 0.18 mm thick. along the equator superiorly the average thickness is 0.25 mm and inferiorly 0.2 mm.
the ciliary vessels that pass forward in the sclera are found closer to the external surface in rabbits and humans compared to dogs and cats. the anterior chamber of the rabbit is quite shallow (3.5 mm). the rabbit pupil is slightly ovoid vertically but is circular in shape when widely dilated; pupillary size varies from 5 to 11 mm. arising from the ciliary body, the iris has a narrow base of approximately 250 μm, thickening centrally to 270 μm before tapering to 90 μm at the pupillary border. stromal melanocytes are absent in albinos. the sphincter muscle is well developed and extends to the pupillary margin; the epithelium is nonpigmented in the albino and consists of a posterior layer of columnar cells apposed to a flat cell layer adjacent to the dilator muscle. there is only one main arterial circle, which is halfway between the pupil and the root of the iris; it is formed by four branches of the ciliary arteries, two of which enter temporally and two nasally. many small vessels branch off the major circle and pass posterior to cross the ora ciliaris retinae and disperse in the choroid. the capillaries near the pupillary margin drain into the radial veins, which drain posteriorly to the venous system of the choroid. the ciliary body in the rabbit is poorly developed and flat owing to the scarce muscle fibers, which accounts for the negligible power of accommodation in the rabbit. the ciliary body is 1.5 mm long from the ora ciliaris retinae to the iris root and is 0.3 mm at its thickest part. the ciliary processes are well developed and differ from those of humans in that they arise from the anterior portion of the ciliary body and merge into the posterior surface of the iris to extend within 1 mm of the pupillary margin (fig. 3). the processes are frequently joined to the iris for much of their length, and not all of the processes join the iris at the same point.
all the processes are connected proximally by the "sims" or ciliary web which provides rigidity and vascular anastomosis between the processes. this type of relationship between the ciliary processes and iris is frequently found in mammals with little or no accommodative power; it is theorized that a small amount of accommodative power may be obtained by engorging or decreasing the blood volume in the ciliary processes and iris, which changes the diameter of the pupil and position of the lens slightly. the zonular fibers have a diffuse origin from the ciliary body; they originate as far posterior as the ora ciliaris and from both the ridges and valleys of the ciliary processes. the origin of the vascular supply to the ciliary body is the long posterior ciliary artery, short posterior ciliary artery, and to some degree the anterior ciliary artery. a circular channel within the ciliary body is formed by two terminal branches of the long posterior ciliary arteries, which form two connecting semicircular channels. the rabbit choroid is well developed and typically mammalian in structure and without a tapetum. the choroidal thickness varies, being thickest posteriorly and thinning toward the ora ciliaris retinae; it tends to be thicker inferiorly compared to superiorly and is thickest and most heavily pigmented in the region of the visual streak, an area that lies well above the posterior pole of the globe on either side of and below the optic disk. the capillaries within the stroma of the ciliary body have a thin endothelium, 0.15 μm, with fenestrations 200 to 1200 Å in size. fluid flows through the fenestrations and into the stroma of the ciliary body and toward the ciliary epithelium. the ciliary body has an energy-dependent transport mechanism similar to that found in the renal tubules. sodium and chloride ions are actively pumped into the aqueous and water passively follows.
na+,k+-atpase has been localized to the inner layer of the nonpigmented ciliary epithelium and may be associated with the "sodium pump." the volume of rabbit aqueous is about 300 μl (250 μl anterior chamber, 57 μl posterior chamber). the fractional turnover rate (k1) is about 0.15, outflow facility (c) between 0.21 and 0.34 μl/min/mm hg, episcleral venous pressure 9.3 mm hg, and normal intraocular pressure (iop) 18-21 mm hg with a circadian rhythm, being lowest at night and highest during the day. the concentration of na+ and k+ in the rabbit aqueous is virtually the same as that found in plasma. the concentrations of bicarbonate and ascorbic acid are much higher than those found in the plasma, and the highest concentrations of both are found in the posterior chamber. ascorbic acid is actively transported into the posterior chamber, and the higher concentration of bicarbonate is related to the presence of carbonate dehydratase in the ciliary body. carbonate dehydratase catalyzes the formation of carbonic acid from co2 and water; the carbonic acid dissociates, and the bicarbonate ions pass into the aqueous. the concentration of glucose is 10 to 20% lower than that found in the plasma owing to the metabolism of glucose by the lens and cornea. the concentration of lactic acid is the same as in plasma, but most of it is derived from the metabolism of glucose. the concentration of protein is 1% of serum and the ratio of albumin to globulin is the same as that found in plasma. the trabecular plexus, which consists of a large number of anastomosing small intrascleral vessels, lies adjacent to the trabecular meshwork. the aqueous humor enters this plexus and travels to the perilimbal veins and then, via ciliary veins, enters the orbital venous sinus. the rabbit lens is larger, is more spherical, and occupies a greater percentage of the globe than in humans, which have considerably larger eyes. the lens is a transparent, avascular, biconvex body.
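the aqueous dynamics figures quoted above (outflow facility, episcleral venous pressure, and intraocular pressure) are tied together by goldmann's steady-state relation, iop ≈ pv + f/c. the equation itself is standard ocular physiology rather than something stated in this chapter, and the outflow-facility units are assumed here to be microliters per minute per mm hg; with those caveats, a minimal sketch of the consistency check:

```python
# Sketch: Goldmann's steady-state equation for aqueous humor dynamics,
#   F = C * (IOP - Pv)
# applied to the rabbit values quoted in the text. The equation is standard
# ocular physiology (an assumption of this sketch, not a claim of the chapter);
# outflow-facility units are assumed to be uL/min/mmHg.

def aqueous_flow(c_ul_min_mmhg: float, iop_mmhg: float, pv_mmhg: float) -> float:
    """Steady-state aqueous flow (uL/min) = outflow facility * (IOP - episcleral venous pressure)."""
    return c_ul_min_mmhg * (iop_mmhg - pv_mmhg)

# Mid-range values from the text: C = 0.21-0.34, IOP = 18-21 mmHg, Pv = 9.3 mmHg.
c_mid = (0.21 + 0.34) / 2    # 0.275 uL/min/mmHg
iop_mid = (18 + 21) / 2      # 19.5 mmHg
flow = aqueous_flow(c_mid, iop_mid, 9.3)
print(f"estimated aqueous flow: {flow:.1f} uL/min")  # roughly 2.8 uL/min
```

the result, a few microliters per minute, is on the order of the commonly cited aqueous formation rate in rabbits, so the quoted parameters are mutually consistent under these assumed units.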
the anterior surface has a curvature radius of 5.3 mm, and the posterior surface has a slightly steeper curvature radius of 5.0 mm. the anterior to posterior dimension is 7 mm, and the equatorial dimension is 9 to 11 mm. there are two linear sutures; the anterior suture is vertically directed, and the posterior suture is horizontally directed. the capsule is thickened anteriorly and is thickest toward the equator where the zonular fibers insert. at the equator the capsule is 8 to 10 μm thick, and it thins to 4 to 6 μm at the posterior pole. the epithelium is a single layer of cuboidal epithelial cells and averages 17 μm in thickness. the lens fibers form a complicated interlacing and process-interlocking system which is tightly knit yet capable of enough resilience to permit 1.5 diopters of accommodation. the lens nucleus is not well defined. removal of the lens leaves the rabbit 10 diopters hypermetropic. in the newborn rabbit there is a prominent primary vitreous body with many blood vessels extending from the optic disk to the posterior surface of the lens. after 2 to 3 weeks the vessels disappear, but the hyaloid canal or vascular remnants generally persist. the rabbit vitreous weighs 1.4 g. the hyalocytes are numerous and are most readily found near the cortex within 30 μm of the surface of the vitreous, where the highest concentration of hyaluronic acid is found. like the aqueous there is free movement of many substances within the vitreous; half the water in the rabbit vitreous is replaced in 10 to 15 min. the flow in the rabbit vitreous moves meridionally from the ciliary region toward the posterior pole; one meridional stream is 1 mm from the surface of the retina, while another follows the posterior surface of the lens. when viewing the fundus the optic nerve head is superior and nasal to the posterior pole of the globe and is deeply cupped and horizontally shaped.
two broad white bands of myelinated nerve fibers, the medullary rays, extend nasally and temporally from the optic disk; they lose their myelination just short of the equator on either side. small bundles of myelinated nerve fibers extend from the rays, especially inferiorly, and appear as white streaks. the retinal vein and artery divide just before or just after entering the globe into nasal and temporal branches and travel from the center of the cupped disk along the medullary rays. the remaining retina is avascular (fig. 4). the visual streak is an area of the retina that appears to have more sensitivity and is located just inferior to the medullary rays and runs parallel with them; its center is 3 mm below the optic disk, it is 3 to 4 mm wide, and the streak is in apposition with the pigment streak of the choroid. photoreceptors are predominantly rods; however, cones may be found in the visual streak. in nonpigmented rabbits the choroidal vessels are easily seen. the rabbit retina varies in thickness; the thickest area is at the visual streak, which is 160 μm thick. most other areas of the retina are uniformly 120 μm thick, but there is thinning around the ora ciliaris to 90 μm. the rabbit retina is not completely differentiated until 6 weeks after birth. the retinal pigment epithelium (rpe) is devoid of pigment in the albino rabbit. the cells are flat and elongated in sagittal section but polygonal in tangential section. the cells are 10 to 15 μm in diameter and 5 to 7 μm thick. there are approximately 40 to 45 receptors per pigment epithelial cell. by the second week of life the rabbit photoreceptor layer is sufficiently differentiated to be functional. most of the receptors in the rabbit retina are rods, but centrally there are atypical cones, found in greatest concentration in the area of the visual streak. the diameter of most of the rod outer segments is 1 μm, and most are 11 μm long; cones have twice the diameter.
the receptor layer is thickest at the visual streak (about 40 μm). muller cells and the internal and external limiting membranes are rather prominent in the rabbit. most of the nuclei in the outer nuclear layer are about 4 μm in diameter, and the thickest area of the layer is 30 μm at the visual streak. most of the retina shows 9 to 10 rows of outer nuclei. the axonal extensions of the photoreceptors are ensheathed in the cytoplasm of the muller cells to form the outer plexiform layer; the photoreceptor ends dilate to form synaptic expansions, which synapse with the bipolar cells in the inner areas of the layer. the outer plexiform layer has a uniform thickness of 8 to 9 μm. a cell that is peculiar to the rabbit is a large (37-38 μm), light-staining cell that sometimes has more than one nucleus. there are usually three or four layers of nuclei in the inner nuclear layer, and the layer is thickest (30 μm) in the visual streak and thins toward the ora ciliaris to 18 μm. above the optic nerve it is 15 μm, and inferior to the medullary rays it is a mere 5 μm. there is usually one layer of ganglion cells, with the exception of the visual streak, where there may be up to 3 to 4 layers. the presence of the largest number of ganglion cells and the concentration of cones in that zone correspond with the general mammalian pattern of retinal structure in areas of highest acuity. an interesting feature of the rabbit retina is the presence of a number of clearly multinucleated cells in the ganglion cell layer (fig. 5). the rabbit retinal vascular pattern is merangiotic, characterized by the presence of blood vessels in a limited part of the retina, with the larger vessels being ophthalmoscopically visible. the retinal circulation is of ciliary origin; one large arteriole and venule rise on either side of the optic disk and are often accompanied by smaller vessels which penetrate the retina near the optic disk.
all the retinal vessels are confined to the area of the myelinated nerve fibers and run collateral to them. this area is about 15 to 18 mm broad and 1 to 2 mm high. the major retinal arterioles and venules can readily be visualized microscopically and have diameters of 75 and 100 μm, respectively. the large vessels lie on the inner surface of the retina and give off superficial, deep, and peripheral capillaries; the remaining retina is virtually avascular (de schaepdrijver et al., 1989). because the rabbit has a merangiotic retina, it is a less than ideal choice for an experimental model to study retinal vascular diseases of humans. the myelinated optic disk is ovoid, elongated in the horizontal plane, and has a large, deep physiological cup, owing at least in part to the absence of lamina cribrosa (fig. 6). blood supply is from branches of the ciliary vessels. the optic nerve is about 1.5 mm in diameter and has an average length of about 12 mm between the optic chiasm and the globe. the average number of nerve fibers is 2,611,000. this is less than in humans but is accounted for by the different rod and cone ratios and the size of the eye. the optic nerve leaves the globe at an acute ventral angle and passes through the optic foramen. the two optic nerves pass through a common cranial foramen and form the optic chiasm. there is nearly complete decussation or "crossing over" of the optic nerve fibers, with only a few ipsilateral nerve fibers. the optic tracts pass caudally to the lateral geniculate nucleus; the majority of the axons terminate in the lateral geniculate nucleus as part of the pathway for conscious perception while the rest continue as part of the reflex pathway. the axons which terminate in the lateral geniculate nucleus synapse with ganglion cells whose fibers pass caudally in the optic radiation to the cerebral (visual) cortex, located on the lateral, caudal, and medial aspects of the occipital lobe.
the remaining fibers pass over the lateral geniculate nucleus to terminate in the pretectal area or rostral colliculus. the axons that enter the pretectal region synapse with ganglion cells in the nucleus of edinger-westphal (cranial nerve iii). some of the parasympathetic nerve fibers of the oculomotor nerve cross over to the nucleus of the other hemisphere, and the rest of the efferent fibers travel to the ciliary ganglion, via the short ciliary nerve, to terminate in the sphincter muscle. the rostral colliculus receives input from the optic nerve tract, visual cortex, and spinal cord. it influences the spinal cord and the nuclei of the oculomotor nerve, trochlear nerve, and facial nerve. the combination of these pathways functions to integrate head, neck, and eye movements in response to visual stimuli. the ciliary muscle, pupillary dilator muscle, third eyelid, and muller's muscle all have sympathetic innervation. the sympathetic fibers originate from the hypothalamus and pass down the cervical spinal cord with preganglionic neurons in the first four segments of the thoracic spinal cord. the fibers pass cranially with the vagus nerve and terminate at the cranial cervical ganglion; postganglionic fibers distribute to the various structures. knowledge of spontaneously occurring ophthalmic diseases in the laboratory rabbit is important in the management and husbandry of research colonies. such knowledge is also necessary to distinguish spontaneous from experimentally induced conditions and to identify potential models for ocular diseases in humans. entropion in young rabbits can occur as a primary or as a secondary condition arising from infection. the condition can be surgically corrected with either an everting mattress suture or by blepharoplasty (fox et al., 1979).
blepharitis can be due to infectious agents including bacteria such as pasteurella multocida or staphylococcus aureus, which can infect the eyelids as well as the conjunctiva (millichamp and collins, 1986). viral-induced eyelid proliferations can result from infection by the rabbit myxoma and papilloma viruses (ross, 1972). the papilloma virus is part of the papovaviridae and is a dna virus transmitted by arthropod vectors. the cottontail rabbit in the midwestern united states is most frequently affected, although domestic rabbits are susceptible. clinically, papillomatosis is characterized by horny warts that are normally found in hairless areas such as eyelids. they may undergo malignant transformation to squamous cell carcinoma. conjunctivitis/dacryocystitis. serous and purulent conjunctivitis is one of the more common ophthalmic diseases in the rabbit, pasteurella multocida being the most frequent etiologic agent. infection typically causes a subacute to chronic conjunctivitis with a mucopurulent discharge; the lacrimal sac may concurrently or singularly be infected as well (jones and carrington, 1988). there are often other associated clinical signs of pansystemic involvement, and rabbits utilized for ophthalmic research are best obtained from and maintained in a pasteurella-free environment. other bacteria, such as staphylococcus aureus and haemophilus sp. (srivastava et al., 1986), as well as environmental factors such as hay dust can also cause conjunctivitis in rabbits (buckley and lowman, 1979). cultures and sensitivities are usually indicated to diagnose this condition definitively. viral conjunctivitis is most commonly caused by rabbit myxoma virus. conjunctivitis and edema of the eyelids are the most consistent ocular manifestations, and animals may develop a mucopurulent blepharoconjunctivitis. conjunctival hyperplasia.
rabbits are subject to a unique circumferential hyperplasia of the bulbar conjunctiva that appears to occur as a primary condition (arnbjev, 1979). the conjunctiva folds on itself as it encroaches onto the cornea without adhering to it. the condition is usually bilateral, and recurrence following excisional conjunctivoplasty and/or cryosurgery is not uncommon (fig. 7). corneal dystrophy. corneal dystrophy has been reported in the rabbit in two distinct forms. in the new zealand white rabbit, two cases of unilateral corneal opacities beginning at the limbus circumferentially and progressing toward the center have been reported (port and dodd, 1983). the dystrophy appeared as a smooth, white opacity that was not associated with inflammation. the possibility of trauma, toxicity, or dietary etiologies was excluded. histologically, the epithelium was a thickened and disorganized layer. the basement membrane and stroma showed no abnormalities. spontaneously occurring anterior corneal dystrophy was found in related 6-month-old dutch belted rabbits. the animals were either unilaterally or bilaterally affected. the focal, crescent-shaped epithelial and subepithelial opacities involved the central and paracentral cornea. studies of the breeding colony indicated a familial predisposition. this anterior corneal dystrophy was found to involve the epithelium, epithelial basement membrane, and anterior stroma (moore et al., 1987). dietary lipid keratopathy. lipid keratopathy and other systemic manifestations of high fat diets fed to rabbits have been described (roscoe and riccaroli, 1969; gwin and gelatt, 1977; stock et al., 1985), and similar lesions occur in the watanabe heritable hyperlipidemic (whhl) rabbit (garibaldi and pecquet goad, 1988). corneal lipidosis, unlike the dystrophies, will induce an inflammatory reaction with corneal neovascularization as well as macrophage and neutrophil infiltration.
corneal lipid deposition typically occurs in the anterior stroma, basement membrane, and epithelium. the lipid keratopathy usually begins at the limbus either focally or circumferentially and can spread centrally. dietary-induced lipid infiltration and inflammation have also been reported in the sclera, third eyelid, iris, ciliary body, and choroid. other lipid degenerations. lipid degeneration characterized by the geographic deposition of presumably cholesterol crystals into the subepithelial stroma has been seen in rabbits as a nonspecific change following intraocular surgical procedures. ( , 1988). the rabbit or squirrel fibroma virus was the likely etiology in the rabbit, which was temporarily housed outdoors. the corneal mass was firm, pale yellow, and extended approximately 0.75 cm onto the nasal cornea of one eye. rabbits experimentally infected with the coronavirus of pleural effusion disease developed a nongranulomatous anterior uveitis 3-6 days after exposure that persisted for 2-3 weeks (fiedelius et al., 1978). the overall incidence of spontaneous ophthalmic diseases in rabbit populations is low; the prevalence of spontaneously occurring cataracts based on research of rabbit fetuses is 3.6% (weisse et al., 1974). one study reported 5 of 7 rabbits in one litter to have congenital nuclear and cortical cataracts that did not progress over time. test matings failed to prove inheritability, and an in utero insult was the likely etiology (gelatt, 1975). the author has observed sporadic nuclear cataracts, cataracts associated with posterior lenticonus (fig. 8), and cataracts related to persistence of the primary vitreous and tunica vasculosa lentis. spontaneous lens capsule rupture. an interesting condition seen not uncommonly in rabbits is spontaneous rupture of the anterior lens capsule with a resultant zonal granulomatous lens-induced uveitis.
the condition is usually unilateral; animals present with cataract, a small pupil fixed by posterior synechiae, a localized bulging forward of adjacent iris (misdiagnosis of an intraocular neoplasm is common), and a mild anterior uveitis. the lens-induced uveitis is characterized by zones of protein-laden macrophages, lymphocytes, and plasma cells, with the capsular defect sealed by iris. wolfer et al. (1993) have identified encephalitozoon cuniculi within the lenses and speculate a causal relationship (fig. 9). lens extraction may be of benefit in preserving visual eyes. glaucoma occurs as an autosomal recessive trait and as a semilethal condition in the new zealand rabbit. the disease process has similarities to congenital glaucoma in humans and is associated with goniodysgenesis. affected animals have impaired aqueous outflow by 3 months of age, and by 6 months clinical changes of megalobulbus, increased corneal diameter, corneal edema and neovascularization, and a variable increase in iop are noted (fig. 10). morphologically, the uveal tissue inserts anteriorly into the cornea, and there is associated dysplasia or complete absence of the trabecular meshwork and ciliary cleft (tesluk et al., 1982). an inbred pigmented rabbit strain has been described with a retinal degeneration evident ophthalmoscopically as a retinal pigment epitheliopathy: the scotopic erg was affected, and histologically an outer retinal degeneration was seen (reichenbach and baar, 1985). although the authors argued for a potential model for retinitis pigmentosa, no heritability data were published, and the morphological evidence was weak. h. ocular neoplasms. intraocular neoplasms are uncommon; however, metastatic lymphosarcoma may occur. the collection of the author contains a rabbit globe with an adenocarcinoma, presumably of ciliary body epithelium origin, but the possibility of a primary tumor elsewhere with ocular metastases could not be excluded.
an astonishing array of infectious agents has been applied onto and within the rabbit eye in order to study disease pathogenesis and mechanisms, evaluate therapy, and examine local and systemic responses to infection. blepharitis. rabbits immunized with cell wall antigens of staphylococcus aureus developed a blepharitis, following topical challenge with viable bacteria, that appeared to be due to a hypersensitivity reaction (mondino et al., 1987). conjunctivitis. rabbits developed a follicular conjunctivitis 24 hr following topical application of rabbit retrovirus type 70 (langford et al., 1986). keratoconjunctivitis. preimmunized rabbits challenged with subconjunctival inoculation of onchocerca volvulus developed conjunctivitis, limbal abscesses, and stromal keratitis; the response was more severe in animals immunized with live versus freeze-killed microfilaria (duke and garner, 1975). a model of phlyctenular keratoconjunctivitis was developed by exposing rabbits immunized against staphylococcus aureus or its cell wall to viable organisms, with subsequent development of phlyctenulae and catarrhal infiltrates (mondino et al., 1980). herpetic keratitis. the rabbit has become the standard experimental model for herpetic keratitis, and a voluminous literature exists dealing with the natural history, pathology, immunology, and treatment of herpes simplex virus (hsv) keratitis. early studies utilized abrasion of the epithelial surface by one of a variety of methods followed by topical application of hsv; however, abrasion is not required, and simple inoculation and massage through the eyelids routinely cause the disease.
a number of significant variables exist, including inoculation technique, virus strain, and breed of rabbit; in general, hsv type 1 (hsv-1) produces a mild to moderate conjunctivitis followed by punctate epithelial erosions by day 4 which coalesce to form a dendritic geographic ulcer by day 7; lesions resolve with minimal scarring and occasional vascularization by days 14-21. type 2 hsv (hsv-2) tends to have a slightly longer incubation period and more prolonged and severe corneal sequelae. latency within the trigeminal ganglion and reactivation stimulated by a variety of factors have been documented. disciform (stromal) keratitis can be reproduced by injection of virus into the corneal stroma. in general, the rabbit eye is more sensitive to hsv than the human eye, and the immune response is an important player in disease pathogenesis; species differences between humans and rabbits in this regard are discussed elsewhere. the rabbit cornea is thus an imperfect paradigm of human hsv keratitis (merriam, 1984). bacterial keratitis. the rabbit cornea, as well as those of other laboratory animal species, is extremely resistant to establishment of experimental infection with some of the more common human ocular pathogens, including pneumococcus, staphylococcus epidermidis, moraxella lacunata, neisseria gonorrhoeae, nocardia asteroides, and mycobacterium fortuitum; successful rabbit modeling has been achieved with more virulent organisms, such as staphylococcus aureus, pseudomonas aeruginosa, serratia marcescens, and shigella flexneri, and, with less success, proteus mirabilis, pasteurella multocida, clostridium perfringens, and bacteroides fragilis. studies have involved the application of either an inoculum of organisms or the isolated endo- or exotoxins by scraping or scratching the corneal surface or by intrastromal injection (barth, 1989); in most cases, the former situation more closely mimics the human disease. fungal keratitis.
francois and rijsselaere (1974) demonstrated that topical and subconjunctival administration of corticosteroids intensified the keratitis resulting from the intrastromal injection of fusarium or aspergillus spores. pretreatment with subconjunctival injection of corticosteroids was required to induce a fungal keratitis in pigmented rabbits injected intralamellarly with actively germinating conidia from fusarium solani; culture-positive ulcers were present at 2 and 3 weeks (forster and rebell, 1975). ellison and newmark (1973) and singh et al. (1989) utilized similar models of immunosuppression to induce aspergillus fumigatus infection; the latter work found the experimental disease to be more severe in albino rabbits. agrawal et al. (1982) demonstrated an enhanced severity of curvularia lunata keratitis with pretreatment with penicillin, streptomycin, or cortisone. acanthamoeba keratitis. corneal infection with the ubiquitous protozoan acanthamoeba is rare in humans and dependent on risk factors that have not yet been characterized. experimental studies have examined binding of the parasites to rabbit cornea (niederkorn et al., 1992). herpes virus uveitis. intravitreal injection of hsv-1 or hsv-2 induced a uveitis in rabbit eyes. onset of the primary disease was gradual, whereas repeat infection of recovered eyes with either live or inactivated virus resulted in an immediate response, implying that secondary disease is mediated by immunologic mechanisms (oh, 1984). herpetic retinochoroiditis. a rabbit model of neonatal hsv-2 infection was developed by the subcutaneous injection of the virus into newborn rabbits; animals died from systemic infection by day 5. ocular lesions included retinal folds and necrosis and mild uveitis; hsv-2 was identified in the retina using fluorescent antibody techniques (oh, 1984). borna disease chorioretinitis. borna disease virus-infected rabbits developed a multifocal chorioretinitis that paralleled the clinical neurological signs.
histopathologically, a perivascular choroidal infiltrate and necrosis of retinal pigment epithelium and photoreceptors were seen. virus reached the eye via the axons of the optic nerve (krey et al., 1979). toxoplasmosis. injection of toxoplasma organisms into the subarachnoidal space of the rabbit eye produced a focal retinochoroiditis, vitritis, and mild anterior uveitis 4-6 days later that persisted for 4-6 weeks; resolution occurred with an atrophic hyperpigmented scar and with persistence of organisms. recurrences could be induced using intravenous antilymphocytic serum, normal serum, or total body radiation (nozik and o'connor, 1970). histoplasmosis. presumed ocular histoplasmosis in humans is a self-limiting disease characterized by minimal inflammation, hemorrhagic maculopathy, peripapillary scarring, and atrophic lesions most commonly seen in the peripheral fundus. hematogenous infection with measured suspensions of histoplasma capsulatum will produce a focal choroiditis in rabbit eyes; other described models produce acute anterior segment inflammation, vitreous clouding, or other features not characteristic of the human disease. intracarotid injection of 25,000-50,000 g-89 strain organisms caused blepharoconjunctivitis and multifocal choroiditis within 1-6 days; anterior uveitis occurred in 70% of animals within 2-5 days. development of lesions in contralateral eyes was less predictable and somewhat delayed. organisms were present in acute lesions; spontaneous resolution with posterior synechiae, chorioretinal scars, and clearing of the organism occurred by the eighth week (o'connor, 1975). bacterial endophthalmitis.
early studies with staphylococcus aureus or escherichia coli demonstrated that the anterior chamber of the rabbit resisted infectious agents better than the vitreous; subsequently, staphylococcus aureus (the organism most frequently isolated from postoperative endophthalmitis in humans) and others (including pseudomonas aeruginosa and klebsiella oxytoca) were injected into rabbit vitreous to evaluate the efficacy of type, dosage, and route of administration of antibiotics, to compare and contrast medical and surgical (vitrectomy) treatment, and to study immune responsiveness. intravitreal injection of the organisms (dependent on virulence and size of inoculum) induces a suppurative response that, untreated, can destroy eyes in as little as 20 hr (yen-lowder, 1989). viable staphylococci were present up to day 21 but could not be cultured by day 30, correlating with an increase in vitreous levels of igg (engstrom et al., 1991). fungal endophthalmitis. ellison (1979) injected candida albicans into rabbit eyes to study the efficacy of natamycin therapy. hematogenous endophthalmitis resulted in a high percentage of rabbits injected intravenously with a suspension of candida albicans; the disease was uni- or bilateral. locally produced antibodies were demonstrated in aqueous humor (bessieres et al., 1987). sensitization with ovalbumin and freund's complete adjuvant followed by limbal injection of ovalbumin resulted in progressive scleral disease 90°-180° from the injection site (hembry et al., 1979). a model of thimerosal antibody-induced immune complex hypersensitivity was created by exposing rabbits previously immunized to thimerosal conjugates via antigen-sensitized contact lenses (baines et al., 1991).
a disease process with similarities to ocular cicatricial pemphigoid occurred in neonatal dutch belted rabbits administered subconjunctival or intraperitoneal murine monoclonal antibodies against stratified squamous epithelial basement membrane (roat et al., 1990). allergic conjunctivitis can be mimicked by the topical application of a selective mast cell degranulator, the n-methyl-p-methoxyphenethylamine formaldehyde condensation product, which induced an eosinophilic infiltrate (abelson et al., 1983). subconjunctival injection of 10-1000 mg of platelet-activating factor induced a conjunctivitis that appeared to be mediated by peptidoleukotrienes (muller et al., 1990). immunization of both albino and pigmented rabbits with heterologous rat lens protein six times at 2-week intervals, followed by surgical disruption of the anterior lens capsule, induced a uveitis of variable intensity. uveitis mechanisms and pharmacology have been studied using a variety of experimental induction methods, ranging from paracentesis to neurogenic inflammation by the topical application of neutral formaldehyde (krootila et al., 1989) to intravitreal injection of endotoxin (williams and peterson, 1987; csukas et al., 1990). intravitreal injection of bovine serum albumin in presensitized rabbits induced a bilateral uveitis, more severe in the injected eye, that was characterized by an acute elevation of intraocular pressure (iop) (uusitalo, 1984; jamieson et al., 1989). similar responses were obtained using human serum albumin (verbey et al., 1988; hoyng, 1989) or egg albumin (bonnet et al., 1976). a single intravitreal injection of horse serum likewise caused a uveitis (pankowska and boj, 1990). uveitis manifested by alterations in vascular permeability resulted from the intravenous injection of bovine γ-globulin (bgg) into immunized animals or of bgg-anti-bgg antigen-antibody complexes into normal rabbits. the ciliary processes were notably involved (howes and mckay, 1975).
experimental allergic uveitis (eau) can be induced in rabbits by administering retina (hempel et al., 1976) or soluble retinal s-antigen (kalsow and wacker, 1986; rao et al., 1986; fricke et al., 1990) and complete freund's adjuvant; 90% of animals develop a chorioretinitis characterized by spontaneous relapses. immunization with bovine interphotoreceptor retinoid-binding protein (irbp) will result in a uveoretinitis with macrophagic infiltrate, outer retinal degeneration, and breakdown of the blood-retinal barrier notable by day 18 (eisenfeld et al., 1987). sensitization of new zealand albino rabbits to bovine cerebral white matter or myelin basic protein in freund's complete adjuvant followed by challenge with intravenous brain-specific basic protein or indifferent purified tuberculin induced a cell-mediated optic neuritis with demyelination in both groups of animals (wisniewski and bloom, 1975). c. models of other ophthalmic disease processes. 1. eyelid papillomas. a wild strain of shope papilloma virus was used to induce tumors in rabbits by dermal abrasion followed by both topical application and intradermal injection, in this case to study the efficacy of immunotherapy (smolin et al., 1981). keratinization of the meibomian glands may play a role in meibomian gland dysfunction and associated chronic blepharitis; a similar condition was induced in albino rabbits by the twice-daily topical application of 2% epinephrine over 6-12 months; 56% of eyes developed detectable lesions (jester et al., 1989). partial keratoconjunctivitis sicca (kcs) was induced in rabbits by cauterizing the excretory duct of the lacrimal gland; total kcs involved surgical removal of the third eyelid and harderian gland. both models developed increased tear osmolarity and decreases in conjunctival goblet cell density and corneal epithelial glycogen, with epithelial changes at 44 weeks postoperatively visualized by rose bengal staining (gilbard et al., 1989).
subepithelial calcification is a common human condition that occurs secondary to chronic anterior segment disease or systemic abnormalities of calcium and phosphate metabolism. the condition can be produced in rabbits by the intravitreal injection of 1 mg of ovalbumin followed in 12-14 days by large doses of intramuscular calciferol (600,000-900,000 units); by corneal freezing, kmno4 perfusion of the anterior chamber, or corneal endothelial abrasion combined with treatment with dihydrotachysterol (dht); in vitamin d-deficient rabbits injected with intravitreal polyethylene sulfonate; and by either co2 or argon laser treatment (muirhead and tomazzoli-gerosa, 1984). the absence of bowman's layer in the rabbit warrants consideration in interspecies correlation. corneal neovascularization is a clinical event with potential for both benefit and harm. in addition, the accessibility and transparency of the cornea provide a convenient model to study angiogenesis in general. investigations utilizing the rabbit to study vasoproliferative processes have induced corneal vascularization with a variety of techniques, including trephination (thoft et al., 1979; groden et al., 1983), thermal cautery (ruben, 1981), silver nitrate application (ausprunk et al., 1978), and suturing (stock et al., 1985; cherry and garner, 1976), as well as the intrastromal injection of a variety of cells, tissues, and other substances (moore and sholley, 1985). blood vessel extension from the limbus occurs at the rate of 0.2-0.3 mm/day. a rabbit model of endothelial dysfunction can be induced by mechanical abrasion or cryodestruction of the corneal endothelium; these techniques induce a transient edema that clears in days to weeks, a tribute to the somewhat unique regenerative capability of rabbit endothelium.
flushing the anterior chamber with 0.05% benzalkonium chloride results in corneal edema within 24 hr, with regeneration in only a small percentage of experimental animals over a 2- to 3-month period. interestingly, intraocular pressure was lowered in experimental eyes (maurice and perlman, 1977; kohchi et al., 1980). the elegant work by khodadoust (1968) and khodadoust and silverstein (1968) laid the foundation for the use of the rabbit as a model for corneal allografts, host immunomechanisms of graft rejection, and pharmacological manipulation of immunoresponsiveness. mechanisms of host sensitization appear to be similar across species (polack, 1972); in rabbits the rate of allograft rejection is low, but the phenomenon can be induced in a high percentage of animals by utilizing techniques to stimulate graft vascularization, including leaving silk sutures in place or focally cauterizing the recipient margin (mannis and may, 1983). the remarkably regenerative endothelium of rabbits does not make it a good model for the study of endothelial cell responses to penetrating keratoplasty, the cat being preferred. much of the basic understanding of the process of healing of traumatic and surgical wounds has its foundation in rabbit studies, from observations on limbal cataract incisions to corneal alterations associated with keratorefractive surgery, including radial keratotomy and excimer laser corneal sculpting. likewise, studies of corneal epithelial healing and attempts to modulate the same have extensively utilized this species. epithelial wounds may be created mechanically or chemically; of import and technique-dependent is whether the epithelial basement membrane is removed (pfister, 1975; kuwabara et al., 1976; haik and zimny, 1971). experimentally diabetic rabbits are more prone to epithelial basement membrane injury (hatchell et al., 1982).
attention has focused on the healing of corneoscleral filtration wounds used to manage glaucoma, with the aim of exploring methods, primarily pharmacological, to keep them open and filtering (gressel et al., 1984; miller et al., 1985). a comparison of rabbit, dog, and cat sclerotomy and trabeculectomy wounds 1 week after surgery demonstrated a more prominent myofibroblastic response in pigmented compared to nonpigmented rabbits (peiffer et al., 1991). normal healing occurs in about 17 days with a rather prominent inflammatory cell infiltrate (miller et al., 1985). corticosteroids induce elevation of iop in humans, and numerous studies have attempted to exploit the rabbit as a model for this modest elevation of iop with both mechanistic and therapeutic aims. interestingly, variable results have been reported, with rabbit variables (age, sex, and breed), steroid type and dosage, and/or tonometric methodology likely responsible for disparities and inconsistencies. the most consistent results have been reported utilizing repeated subconjunctival injections of either repository betamethasone (bonami et al., 1978) or triamcinolone (bonami et al., 1978; hester et al., 1987) twice weekly. elevation of iop up to 10 mm hg occurred, with a lesser and delayed effect occurring in uninjected fellow eyes, suggesting a systemic as well as local effect. other models of ocular hypertension in rabbits have included water loading (60 ml/kg increased iop approximately 10 mm hg for 30-120 min) (seidenhamel and dungan, 1974) and the intravenous infusion of 5% glucose (a 10-12 mm hg increase of 40 min duration with an infusion of 15 ml/kg) (bonami et al., 1976). historically, experimental glaucoma has been produced in rabbits by several methods. injection of 1% kaolin into the anterior chamber obstructs the outflow channels, and iop reached 50-70 mm hg within 14 days (voronina, 1954).
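the water-loading and glucose-infusion models above are specified per kilogram of body mass, so the absolute volumes must be scaled to the animal. a minimal sketch of that conversion, where the 3.0 kg body mass is an illustrative assumption and not taken from the text:

```python
# hedged sketch: converting the per-kilogram ocular-hypertension doses quoted
# above (60 ml/kg oral water load; 15 ml/kg intravenous 5% glucose) into
# absolute volumes for a rabbit of a given body mass.

def dose_volume_ml(dose_ml_per_kg: float, body_mass_kg: float) -> float:
    """total volume in ml for a per-kilogram dose."""
    return dose_ml_per_kg * body_mass_kg

mass_kg = 3.0  # hypothetical adult rabbit; assumption, not from the source
water_load_ml = dose_volume_ml(60, mass_kg)   # oral water loading
glucose_iv_ml = dose_volume_ml(15, mass_kg)   # 5% glucose infusion
print(water_load_ml, glucose_iv_ml)           # 180.0 45.0
```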
encircling the globe with constricting threads (skotnicki, 1957) or bands (flocks et al., 1959) caused high iop (70-100 mm hg) which decreased to 35-50 mm hg within 48 hr; one-third of the eyes in one study developed endophthalmitis (flocks et al., 1959). polyethylene tubing threaded into the rabbit iridocorneal angle caused an increase in iop within 24 hr that remained elevated for 6 months; loss of retinal ganglion cells occurred (kupfer, 1962; malik et al., 1970). intraocular injection of methylcellulose will cause a transient elevation of iop (samis, 1962; kazdan and macdonald, 1963), whereas injection of cotton fragments induces a more prolonged elevation (decarvalho, 1962). injection of 75 units of α-chymotrypsin into the anterior chamber resulted in a chronic moderate elevation of iop (up to 50 mm hg) in 50% of both pigmented and nonpigmented rabbit eyes (best et al., 1975). subconjunctival injection of sclerosing agents such as phenol caused long-term elevation of iop (malik et al., 1970; luntz, 1966). intracameral injection of autologous fixed red blood cells resulted in a chronic, variable, but dramatic elevation of iop associated with a hemolytic inflammatory process; optic nerve changes were documented in the model (fig. 11). perhaps the most desirable model utilized an argon laser to occlude the iridocorneal angle in pigmented rabbits; laser parameters included 200-275 50-μm spots, 0.1-0.2 sec, 1.2-1.6 w, and 32.8-78 joules of total energy; an iop elevation of 28-50 mm hg occurred in 50% of treated rabbits (gherezghiher et al., 1986). naphthalene-induced cataracts in rats and rabbits are a widely utilized model for oxidative change to the lens and have been studied from the perspectives of both basic and applied research. oral dosing of pigmented rabbits with naphthalene (1 g/kg) every other day for 4 weeks induced variable cataractogenesis; for the first 2-3 weeks, cortical vacuoles and granules developed consistently.
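the argon laser parameters quoted above can be sanity-checked arithmetically: for uniform spots, total delivered energy is spot count × power × exposure time, and the reported total should lie within the envelope spanned by the extremes of those settings. a minimal sketch of that check, assuming uniform spot parameters (the function name is illustrative):

```python
# hedged consistency check on the quoted argon laser glaucoma-model settings:
# 200-275 spots, 0.1-0.2 sec exposure, 1.2-1.6 w, 32.8-78 j total energy.

def total_energy_j(spots: int, power_w: float, duration_s: float) -> float:
    """total delivered energy in joules, assuming identical spots."""
    return spots * power_w * duration_s

lo = total_energy_j(200, 1.2, 0.1)  # minimum-settings envelope, ~24 j
hi = total_energy_j(275, 1.6, 0.2)  # maximum-settings envelope, ~88 j
# the reported 32.8-78 j range falls inside this envelope
assert lo <= 32.8 and 78 <= hi
```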
thereafter, one-third progressed to maturity while one-third remained relatively stable. this variability makes the model less than ideal for experimental cataract research (selzer et al., 1991). 11. sugar cataracts. rabbits will develop sugar cataracts in a fashion similar to that seen in other species. new zealand rabbits fed a 50% galactose diet developed lens opacities after 5 days (cheng et al., 1992). amphibians can regenerate lens from either corneal or iris epithelium; this ability is largely lost in higher vertebrates. among mammals, rabbits possess a perhaps relatively unique ability to resynthesize lens following surgical removal; residual lens epithelial cells are the source of the process, which, although not uncommon to a lesser extent in other mammals, occurs in lagomorphs with particular exuberance. regeneration occurs slowly (over months), and even though a new lens bow may form, fibers are irregular and the "new" lens is not transparent. crystallin synthesis occurs, but the proteins are somewhat altered from normal rabbit lens (owon et al., 1989). while of interest to the basic scientist, lens regeneration may be the bane of the researcher who uses the rabbit as a surgical model for cataract extraction and related procedures. cataracts associated with spontaneous retinal degenerations have been described in humans, rats, and dogs, and the pathogenesis has been postulated to be due to the release of polyunsaturated lipid peroxidation products from photoreceptor membranes. intravitreal injection of liposomes prepared from phospholipids containing lipid peroxidation products induced posterior subcapsular cataracts in chinchilla rabbits (babizhayer and deyer, 1989). in 1966 noell and associates reported that chronic exposure of the rat to moderate-intensity light resulted in retinal degeneration.
a similar phenomenon was demonstrated in dutch belted rabbits, with the exposure threshold being considerably higher than that seen in rats (lawwill, 1973). rabbit studies have contributed to our knowledge of the retinotoxic effects of intraocular iron; intravitreal placement in both pigmented and albino rabbits resulted in retinal degeneration, with photoreceptors primarily affected (olsen et al., 1979; burger and klintwals, 1974). the retinotoxic effects and pharmacokinetics of iodate have likewise been studied in rabbits (regnaut, 1970). experimental retinal hemorrhagic detachments in rabbits have been induced to simulate the early stages of human age-related macular degeneration. the technique included transvitreal subretinal injection of fresh autologous blood. rather rapid, irreversible outer retinal degeneration occurred (glatt and machemer, 1982). tractional retinal detachments were induced in pigmented rabbits simply by penetrating the sclera over the pars plana, extending the wound to 8 mm with scissors, excising prolapsed vitreous, and suturing the sclera with 8-0 silk. eyes with intraoperative autologous blood injection were likely to develop retinal detachments, whereas those without blood injection did not (cleary and ryan, 1979). the proliferation of preretinal membranes, an important component of retinal detachment in humans, has been induced in eyes of pigmented rabbits by the intravitreal injection of autologous retinal pigmented epithelium (radtke et al., 1981). retinal neovascularization, a precursor to vitreous hemorrhage and traction and a potentially blinding condition associated with diabetes mellitus, retinal vein occlusion, and retinopathy of prematurity, can be induced by the intravitreal injection of fibroblasts and hyaluronidase (antoszyk et al., 1991).
dutch rabbits were subjected to the subretinal injection of hanks' solution accompanied by either mechanical trauma to the rpe or argon laser photocoagulation to study mechanisms involved in central serous chorioretinopathy (negi and marmor, 1984). axonal regeneration has been documented in adult rabbits following mechanical, thermal, or glaucomatous damage to the optic nerve; attempts were not successful in terms of functional reconnection to the brain (eysel and peichl, 1985; schnitzer and scherer, 1990; peiffer et al., 1991). rabbits have been utilized extensively in safety, irritancy, and biocompatibility studies to provide data on how specific chemicals, substances, devices, or other biomaterials interact with the ocular tissues in order to predict tolerability and/or safety in humans. this interspecies extrapolation is predicated on the assumption that mechanisms of inflammation and tissue and immunoresponsiveness are similar; quite the contrary, dramatic species differences have been documented. the relatively labile and less predictable blood-ocular barrier of the rabbit has been mentioned earlier; the rabbit uvea responds to insult with exuberant fibrin exudation. a comparison of responses to intravitreal glycosaminoglycans showed the rabbit to have a more severe anterior segment response and a less predictable posterior segment response compared to the human and monkey (peiffer, 1991). thus, both quantitative and qualitative differences in ocular tissue responsiveness make the rabbit less than an ideal model (bito, 1984). the draize test, utilized for testing the irritancy of cosmetics, toiletries, agricultural chemicals, occupational and environmental hazards, and certain therapeutic agents, especially ophthalmic formulations, is a technically simple, unsophisticated, semiquantitative method of evaluating the safety of products as they relate to the external eye (draize et al., 1944).
the draize test has been controversial both scientifically (the rabbit response is exaggerated when compared to human eyes and quite variable, and the test does not allow for fine discrimination) and ethically; alternatives, including draize modifications (low-volume testing) and in vitro techniques, continue to be investigated. rabbit eyes have been used to study ocular tissue responses to suture material (allen et al., 1982), the components of surgical sponges (cotton, collagen, and cellulose) (peiffer et al., 1983), intraocular lenses (cook et al., 1986) (fig. 12), and vitreous replacement materials (peiffer, 1991). although interspecies differences need to be considered, appreciation of them has allowed for some valid extrapolation. research on the visual process has involved relatively sophisticated investigation of the organization of neurons, their electrophysiological responses, their relationships to one another, and their biochemistry at the level of the retina, geniculate, cerebral cortex, and other visually related areas of the brain. other species, because of relatively unique and/or experimentally desirable variants of the above, have received more attention from neurophysiologists than has the rabbit. papers by van hof and colleagues provided a qualitative visual psychophysical description of the dutch belted rabbit, and this breed has a strong claim to being the standard for vision research. visual acuity is between 20' and 40' of cycle width (vaney, 1980; van hof, 1966, 1967; van hof and lagers-van haselen, 1973; van hof and steele, 1977).
an elegant overview of the "topographical relationships between the anatomy and physiology of the rabbit visual system" by hughes (1976) summarizes work in this area prior to 1970 and was part of a symposium entitled "vision in the rabbit," held in rotterdam in 1970 and published in documenta ophthalmologica, volume 30, in 1971; rather than attempt to summarize these works here, i refer the reader to that source. of note with regard to work on the retinal neurons is the starburst amacrine cell, which has a distinctive and regular dendritic geometry with cholinergic input solely to ganglion cells (famiglietti, 1985). horizontal cells are larger and more densely populated in the periphery (reichenbach and wohlarts, 1983); interplexiform cells are present (oyster and takahashi, 1977). the development of synapses within the outer and inner plexiform layers has been studied (dacheux and miller, 1981a, b). retinal ganglion cell morphology, distribution, and function have been investigated, and work on the rabbit eye has contributed to the understanding of the ganglion cell receptive field (barlow and levick, 1961; barlow and hill, 1963; barlow et al., 1964; oyster and barlow, 1967; oyster, 1968; oyster et al., 1972; vaney et al., 1981; amthor et al., 1989; pu and amthor, 1990). a population of large ganglion cells similar to the alpha cells of the cat retina has been described (peichl et al., 1987). connections of the retina to the brain have been studied by a variety of morphological, tracer, and degenerative studies; projections to the lateral geniculate, superior colliculus, and accessory optic system have been characterized, and the latter, consisting of the medial terminal nucleus, lateral terminal nucleus, and dorsal terminal nucleus, has been studied rather extensively in the rabbit (sithi-amoru, 1976; hamasaki and marg, 1962; oyster et al., 1980; giolli et al., 1984, 1985; soodak and simpson, 1988).
responses are similar to, and thus input is likely from, direction-sensitive retinal ganglion cells. a horseradish peroxidase study showed the rabbit accessory optic system to consist of two fasciculi and to lack a dorsal terminal nucleus (terubayashi and fujisawa, 1988). cortical wiring has also been studied (giolli and guthrie, 1971; giolli et al., 1978; schmolke and viebahn, 1986), as have the pathways for the pupillary light response in this species (zuouc and kiribuchi, 1985) and the innervation of the extraocular muscles (evinger et al., 1987). electrophysiologically, the rabbit erg b wave can be recorded between postnatal days 11 and 18 (reuter et al., 1971). the visual evoked response is diurnally cyclical (bobbert and brandenburg, 1978). "unsleeping eyes, by nature raised to take the horizon in ..." (hughes, 1976): the eighteenth-century poet who thus described the rabbit eye alluded to its uniqueness. rabbits have served us well in our understanding of normal ocular form and function and the pathogenesis and management of ophthalmic disease, but likely not optimally. species variation in the sensitivity and nature of the blood-ocular barriers has been addressed. differences between breeds and strains, and especially between nonpigmented and pigmented eyes, have not been elaborated on but warrant consideration. there are demonstrated differences in drug distribution (barza et al., 1979) and neural pathways (oyster et al., 1987) between nonpigmented and pigmented eyes; empirical observations with regard to species-specific tissue responsiveness, the hypertensive response to corticosteroids, and numerous other experimental conditions have been made. the albino eye is a reasonable model to study parameters in albinos, but the vast majority of humans (to which most modeling is extrapolated) have melanin in the iris, ciliary, and retinal pigment epithelium as well as the uvea.
the continued use of the new zealand white rabbit as a model for ophthalmic and visual experimentation is difficult to rationalize, and the use of pigmented rabbits will result in better scientific studies.
conjunctival eosinophils in the compound 48/80 rabbit model
evaluation of the schirmer tear test in clinically normal rabbits
clinical and experimental keratitis due to curvularia lunata (wakker) boedyn var. aeria (batista, lima, and vasconceles) ellis
long-term study of iris sutures in rabbits
morphologies of rabbit retinal ganglion cells with concentric receptive fields
pseudopterygium in a pygmy rabbit
the sequence of events in the regression of corneal capillaries
an experimental model of preretinal neovascularization in the rabbit
lens opacity induced by lipid peroxidation products as a model of cataract associated with retinal disease
ocular hypersensitivity to thimerosal in rabbits
selective sensitivity to direction of movement in ganglion cells of the rabbit retina
retinal ganglion cells responding selectively to direction and speed of image motion in the rabbit
the mechanism of directionally sensitive units in the rabbit retina
animal models of bacterial corneal ulcers
marked differences between pigmented and albino rabbits in the concentration of clindamycin in iris and choroid-retina
local production of specific antibodies in the aqueous humor in experimental candida endophthalmitis in rabbits
experimental alphachymotrypsin glaucoma
species differences in the response of the eye to irritation and trauma: a hypothesis of divergence in ocular defense mechanisms and the choice of experimental animals for eye research
diurnal changes in the rabbit's visual evoked potential
an improved model of experimentally induced ocular hypertension in the rabbit
experimental corticosteroid ocular hypertension in the rabbit
standardization of an experimental immune uveitis in the rabbit for topical testing of drugs
isolation and subcellular fractionation analysis of acini from
rabbit lacrimal glands
chronic non-infective conjunctivitis in rabbits
experimental retinal degeneration in the rabbit produced by intraocular iron
anatomy of the rabbit nasolacrimal duct and its clinical implications
high-resolution mr imaging of water diffusion in the rabbit lens
corneal neovascularization treated with argon laser
experimental posterior penetrating eye injury in the rabbit. i. method of production and natural history
comparative aspects of the intraocular fluids
clinical and pathologic evaluation of a flexible silicone posterior chamber lens design in a rabbit model
time course of rabbit ocular inflammatory response and mediator release after intravitreal endotoxin
an intracellular electrophysiological study of the ontogeny of functional synapses in the rabbit retina. i. receptors, horizontal, and bipolar cells
an intracellular electrophysiological study of the ontogeny of functional synapses in the rabbit retina. ii. amacrine cells
the anatomy and histology of the eye and orbit of the rabbit
histopathology of retina and optic nerve with experimental glaucoma
retinal vascular patterns in domestic animals
methods for the study of irritation and toxicity of substances applied topically to the skin and mucous membranes
reactions to subconjunctival inoculation of onchocerca volvulus microfilaria in pre-immunized rabbits
uveoretinitis in rabbits following immunization with interphotoreceptor retinoid-binding protein
effects of subconjunctival pimaricin in experimental keratomycosis
intravenous effects of pimaricin on mycotic endophthalmitis
immune response to staphylococcus aureus endophthalmitis in a rabbit model
extra- and intracellular hrp analysis of the organization of extraocular motoneurons and internuclear neurons in the guinea pig and rabbit
regenerative capacity of retinal axons in the cat, rabbit, and guinea pig
starburst amacrine cells: morphological constancy and systematic variation in the anisotropic field of rabbit retinal neurons
uveitis in rabbits with pleural effusion disease: clinical and histopathological observations
mechanically induced glaucoma in animals
animal model of fusarium solani keratitis
congenital entropion in a litter of rabbits
comparative anatomy of the vascular supply of the eye in vertebrates
corticosteroids and ocular mycoses: experimental study
mitigating effects of dialysable leukocyte extract (dle) on the experimental allergic uveitis (eau) of the rabbit
lipid keratopathy in the watanabe (whhl) rabbit
congenital cataract in a litter of rabbits
laser-induced glaucoma in rabbits
tear film and ocular surface changes after closure of the meibomian gland orifices in the rabbit
organization of the subcortical projections of visual areas i and ii in the rabbit. an experimental degenerative study
an autoradiographic study of the projections of visual cortical area i to the thalamus, pretectum, and superior colliculus of the rabbit
pretectal and brain stem projections of the medial terminal nucleus of the accessory optic system of the rabbit and rat as studied by anterograde and retrograde neuronal tracing methods
projections of medial terminal accessory optic nucleus, ventral tegmental nuclei, and substantia nigra of rabbit and rat as studied by retrograde axonal transport of horseradish peroxidase
experimental subretinal hemorrhage in rabbits
5-fluorouracil and glaucoma filtering surgery: i.
an animal model
the effect of corneal trephination on neovascularization
bilateral ocular lipidosis in a cottontail rabbit fed an all-milk diet
induction of de novo synthesis of crystalline lenses in aphakic rabbits
scanning electron microscopy of corneal wound healing in the rabbit
microelectrode study of the accessory optic tract in the rabbit
susceptibility of the corneal epithelial basement membrane to injury in diabetic rabbits
experimental model for scleritis
experimental chorioretinitis in rabbits following injection of autologous retina in freund's complete adjuvant
steroid-induced ocular hypertension in the rabbit: a model using subconjunctival injections
circulating immune complexes: effects on ocular vascular permeability in the rabbit
the influence of topical prostaglandin in hsa-induced uveitis in the rabbit
topographical relationships between the anatomy and physiology of the rabbit visual system
cortical and subcortical pathways for pupillary reactions in rabbits
characterized and predictable rabbit uveitis model for anti-inflammatory drug screening
meibomian gland dysfunction ii: the role of keratinization in a rabbit model of mgd
pasteurella dacryocystitis in rabbits
rabbit ocular and pineal autoimmune response to retinal antigens
experimental angle block avoiding paracentesis reflex
penetrating corneal transplantation in the rabbit
transplantation and rejection of individual cell layers of the cornea
bullous keratopathy; endothelial cell changes in experimental corneal edema
multifocal retinopathy in borna disease infected rabbits
platelets and polymorphonuclear leukocytes in experimental ocular inflammation in the rabbit eye.
graefe's arch
studies of intraocular pressure ii: the histopathology of experimentally increased intraocular pressure in the rabbit
sliding of the epithelium in experimental corneal wounds
conjunctivitis in rabbits caused by enterovirus type 70 (ev70)
effect of prolonged exposure of rabbit retina to low-intensity light
experimental glaucoma in the rabbit
pox virus keratitis in a rabbit
experimental production of glaucoma in rabbits
suppression of the corneal allograft reaction: an experimental comparison of cyclosporin a and topical steroid
permanent destruction of the corneal endothelium in rabbits
the rabbit model of herpetic keratitis
an animal model of filtration surgery
blepharoconjunctivitis associated with staphylococcus aureus in a rabbit
rabbit model of phlyctenulosis and catarrhal infiltrates
a rabbit model of staphylococcal blepharitis
anterior corneal dystrophy in american dutch belted rabbits: biomicroscopic and histologic findings
comparison of the neovascular effects of stimulated macrophages and neutrophils in autologous rabbit corneas
animal models of band keratopathy
paf-induced conjunctivitis in the rabbit is mediated by peptido-leukotrienes
experimental serous retinal detachment and focal pigment epithelial damage
topical fibronectin and corneal epithelial wound healing in the rabbit
susceptibility of corneas from various animal species to in vitro binding and invasion by acanthamoeba castellanii
studies on experimental ocular toxoplasmosis in the rabbit
experimental ocular histoplasmosis
primary and secondary herpes simplex uveitis in rabbits
herpes simplex retinochoroiditis in newborn rabbits
studies on the handling of retinotoxic doses of iodate in rabbits
the analysis of image motion by the rabbit retina
direction selective units in rabbit retina: distribution of preferred directions
interplexiform cells in rabbit retina
direction selective retinal ganglion cells and control of optokinetic nystagmus in the rabbit
retinal ganglion cells
projecting to the rabbit accessory optic system
ganglion cell density in albino and pigmented rabbit retinas labeled with a ganglion cell specific monoclonal antibody
acute experimental uveitis caused by a single administration of heterologous blood protein to the vitreous body of rabbits
the rabbit as an alternative model for the intraocular testing of viscoelastic substances
myofibroblasts in the healing of filtering wounds in rabbit, dog, and cat
intraocular response to cotton, collagen, and cellulose in the rabbit
effects of gm1 ganglioside on the optic nerve and retina in experimental glaucoma
the healing of corneal epithelial abrasions in the rabbit; a scanning electron microscope study
scanning electron microscopy of corneal graft rejection: epithelial rejection, endothelial rejection, and formation of posterior graft membranes
two cases of corneal epithelial dystrophy in rabbits
the rabbit in eye research
anatomy and histology of the eye and orbit in domestic animals
dendritic morphologies of retinal ganglion cells projecting to the lateral geniculate nucleus in the rabbit
simulation of massive periretinal proliferation by autotransplantation of retinal pigment epithelial cells in rabbits
ultrastructural analysis of experimental allergic uveitis in rabbits
vitreous hemorrhage.
an experimental study iii: experimental degeneration of the rabbit retina induced by hemoglobin injection into the vitreous
retinitis-pigmentosa-like tapetoretinal degeneration in a rabbit breed
horizontal cells of the rabbit retina: some quantitative properties revealed by selective staining
the electroretinogram in normal and light-deprived rabbits
antibasement membrane antibody-mediated experimental conjunctivitis
phospholipid changes in the eye and aorta of cholesterol-fed rabbits
zoological and wildlife review: myxomatosis in the rabbit
corneal vascularization
specular microscopic observations of the corneal endothelium in the normal rabbit
an experimental method to produce angle block in rabbits and the use of phospholine after angle block
dendrite bundles in lamina ii/iii of the rabbit neocortex
microglial responses in the rabbit retina following transection of the optic nerve
characteristics and pharmacologic utility of an intraocular pressure (iop) model in unanesthetized rabbits
regional enzyme profiles in rabbit lenses with early stages of naphthalene cataract
clinical and experimental mycotic corneal ulcer caused by aspergillus fumigatus and the effect of oral ketoconazole in the treatment
evidence of an extended representation of the visual field in the superior colliculus of the rabbit. eye
experimental glaucoma
immunotherapy for rabbit lid papillomas
the accessory optic system of the rabbit. i.
basic visual response properties
characterization of a haemophilus sp
lipid keratopathy in rabbits: an animal model system
the accessory optic system of the rabbit, cat, dog, and monkey: a whole hrp study
a clinical and pathological study of inherited glaucoma in new zealand white rabbits
ocular surface epithelium and vascularization in rabbits
comparative physiology and anatomy of the outflow pathway
an acute ocular inflammatory reaction induced by intravitreal bovine serum albumin in presensitized rabbits: the effect of phentolamine
decreased ecto-5-nucleotidase activity in rabbit lymphocytes during experimental lens-induced uveitis
the grating acuity of the wild european rabbit
rabbit retinal ganglion cells. receptive field classification and axonal conduction properties
discrimination between striated patterns of different orientation in the rabbit
interocular transfer in the rabbit
the retinal fixation area in the rabbit
binocular vision in the rabbit
regenerative capacity of corneal endothelium in rabbit and cat
the effects of 3-isobutyl methylxanthine on experimentally provoked uveitis in rabbits
production of experimental glaucoma and its clinical aspects
spontane congenitale katarakte bei ratte, maus, und kaninchen [spontaneous congenital cataracts in rat, mouse, and rabbit]
the influence of topical corticosteroid therapy upon polymorphonuclear leukocyte distribution, vascular integrity, and ascorbate levels in endotoxin-induced inflammation of the rabbit eye
experimental allergic optic neuritis (eaon) in the rabbit: a new model to study primary demyelinating diseases
phacoclastic uveitis in the rabbit
antibiotics in the treatment of experimental bacterial endophthalmitis.
key: cord-020871-1v6dcmt3 authors: papariello, luca; bampoulidis, alexandros; lupu, mihai title: on the replicability of combining word embeddings and retrieval models date: 2020-03-24 journal: advances in information retrieval doi: 10.1007/978-3-030-45442-5_7 sha: doc_id: 20871 cord_uid: 1v6dcmt3
we replicate recent experiments attempting to
demonstrate an attractive hypothesis about the use of the fisher kernel framework and mixture models for aggregating word embeddings towards document representations, and the use of these representations in document classification, clustering, and retrieval. specifically, the hypothesis was that a mixture model of von mises-fisher (vmf) distributions, instead of gaussian distributions, would be beneficial because both vmf and the vector space model traditionally used in information retrieval focus on cosine distances. previous experiments had validated this hypothesis. our replication was not able to validate it, despite a large parameter scan space. the last 5 years have seen proof that neural network-based word embedding models provide term representations that are a useful information source for a variety of tasks in natural language processing. in information retrieval (ir), "traditional" models remain a high baseline to beat, particularly when considering efficiency in addition to effectiveness [6]. combining word embedding models with traditional ir models is therefore very attractive, and several papers have attempted to improve the baseline by adding in, in a more or less ad-hoc fashion, word-embedding information. onal et al. [10] summarized the various developments of the last half-decade in the field of neural ir and grouped the methods into two categories: aggregate and learn. the first one, also known as compositional distributional semantics, starts from term representations and uses some function to combine them into a document representation (a simple example is a weighted sum). the second method uses the word embedding as the first layer of another neural network that outputs a document representation.
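the "aggregate" family mentioned above can be illustrated with a minimal sketch; this is our own toy example, not code from either paper, and the embeddings and weights below are invented for illustration (the weights could be, e.g., idf values):

```python
def aggregate(doc_tokens, embeddings, weights):
    """weighted sum of term vectors -> one fixed-length document vector."""
    dim = len(next(iter(embeddings.values())))
    doc_vec = [0.0] * dim
    for tok in doc_tokens:
        if tok not in embeddings:
            continue  # out-of-vocabulary tokens contribute nothing
        w = weights.get(tok, 1.0)  # default weight 1.0 for unweighted terms
        for j, x in enumerate(embeddings[tok]):
            doc_vec[j] += w * x
    return doc_vec

emb = {"fisher": [1.0, 0.0], "kernel": [0.0, 1.0]}
print(aggregate(["fisher", "kernel", "kernel"], emb, {"kernel": 0.5}))
# -> [1.0, 1.0]
```

the fisher kernel framework discussed later is a more principled member of this same family: it replaces the plain weighted sum with gradients of a fitted mixture model.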
the advantage of the first type of methods is that they often distill down to a linear combination (perhaps via a kernel), from which an explanation about the representation of the document is easier to induce than from the neural network layers built on top of a word embedding. recently, the issue of explainability in ir and recommendation has generated renewed interest [15]. in this sense, zhang et al. [14] introduced a new model for combining high-dimensional vectors, using a mixture model of von mises-fisher (vmf) distributions instead of the gaussian distributions previously suggested by clinchant and perronnin [3]. this is an attractive hypothesis because the gaussian mixture model (gmm) works on euclidean distance, while the mixture of von mises-fisher (movmf) model works on cosine distances, the typical distance function in ir. in the following sections, we set out to replicate the experiments described by zhang et al. [14]. they are grouped in three sets: classification, clustering, and information retrieval, and compare "standard" embedding methods with the novel movmf representation. in general, we follow the experimental setup of the original paper and, for lack of space, we do not repeat here many details if they are clearly explained there. all experiments are conducted on publicly available datasets, briefly described below. classification: two subsets of the movie review dataset: (i) the subjectivity dataset (subj) [11]; and (ii) the sentence polarity dataset (sent) [12]. clustering: the 20 newsgroups dataset was used in the original paper, but the concrete version was not specified. we selected the "bydate" version, because it is, according to its creators, the most commonly used in the literature. it is also the version directly loadable in scikit-learn, making it more likely that the authors had used this version. retrieval: the trec robust04 collection [13].
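to see why the choice of distance matters for the gmm-versus-movmf hypothesis, here is a small illustrative sketch (our own toy vectors, not data from either paper): when vector length grows with document size, euclidean distance and cosine similarity can rank the same pair of documents differently.

```python
from math import sqrt

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def euclidean(u, v):
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_sim(u, v):
    return dot(u, v) / (sqrt(dot(u, u)) * sqrt(dot(v, v)))

# a query vector and two documents: d1 points in the same direction as q
# but is much longer; d2 is close to q in euclidean terms but points elsewhere.
q = [1.0, 0.0]
d1 = [10.0, 0.0]  # same direction ("same topic"), longer document
d2 = [1.0, 1.0]   # different direction, similar length

assert euclidean(q, d2) < euclidean(q, d1)   # euclidean prefers d2
assert cosine_sim(q, d1) > cosine_sim(q, d2)  # cosine prefers d1
```

a gaussian mixture clusters by euclidean proximity, whereas a vmf mixture clusters by direction on the unit sphere, which is why the latter was expected to match the cosine-based vector space model of ir.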
the methods used to generate vectors for terms and documents are the following. tf-idf. the basic term frequency - inverse document frequency method [5], as implemented in the scikit-learn library. lsi. latent semantic indexing [4]. lda. latent dirichlet allocation [2]. cbow. word2vec [9] in the continuous bag-of-words (cbow) architecture. pv-dbow/dm. paragraph vector (pv) is a document embedding algorithm that builds on word2vec; we use both of its implementations, distributed bag-of-words (pv-dbow) and distributed memory (pv-dm) [7]. the lsi, lda, cbow, and pv implementations are available in the gensim library. the fisher kernel (fk) framework offers the option to aggregate word embeddings to obtain fixed-length representations of documents. we use fisher vectors (fv) based on (i) a gaussian mixture model (fv-gmm) and (ii) a mixture of von mises-fisher distributions (fv-movmf) [1]. we first fit (i) a gmm and (ii) a movmf model on previously learnt continuous word embeddings. the fixed-length representation of a document x containing T words w_t is the concatenation g^x = [g^x_1, ..., g^x_K], where K is the number of mixture components. the vectors g^x_i, having the dimension (d) of the word vectors e_{w_t}, are explicitly given by [3, 14]

g^x_i = \frac{1}{T\sqrt{\omega_i}} \sum_{t=1}^{T} \gamma_t(i) \frac{e_{w_t} - \mu_i}{\sigma_i},

where ω_i are the mixture weights, γ_t(i) = p(i|x_t) is the soft assignment of x_t to (i) gaussian and (ii) vmf distribution i, and σ²_i = diag(Σ_i), with Σ_i the covariance matrix of gaussian i. in (i), μ_i refers to the mean vector; in (ii) it indicates the mean direction and κ_i is the concentration parameter. we implement the fk-based algorithms ourselves, with the help of the scikit-learn library for fitting a mixture of gaussians and of the spherecluster package for fitting a mixture of von mises-fisher distributions to our data. the implementation details of each algorithm are described in what follows.
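a minimal pure-python sketch of the aggregation step above, assuming the mixture parameters (weights ω_i, means μ_i, diagonal deviations σ_i) have already been fitted (e.g. with scikit-learn's GaussianMixture); it computes only the mean-gradient part of the fisher vector for the gaussian case, and all names and the toy parameters are ours:

```python
from math import exp, pi, sqrt

def gaussian_pdf(x, mu, sigma):
    """density of a diagonal-covariance gaussian at point x."""
    p = 1.0
    for xd, md, sd in zip(x, mu, sigma):
        p *= exp(-0.5 * ((xd - md) / sd) ** 2) / (sd * sqrt(2 * pi))
    return p

def fisher_vector(doc_embeddings, weights, means, sigmas):
    """aggregate a document's word embeddings into a K*d fisher vector."""
    T, K, d = len(doc_embeddings), len(weights), len(means[0])
    fv = [0.0] * (K * d)
    for x in doc_embeddings:
        dens = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, sigmas)]
        z = sum(dens)
        gammas = [p / z for p in dens]  # soft assignments gamma_t(i)
        for i in range(K):
            for j in range(d):
                fv[i * d + j] += gammas[i] * (x[j] - means[i][j]) / sigmas[i][j]
    # per-component normalisation 1 / (T * sqrt(omega_i))
    return [v / (T * sqrt(weights[i // d])) for i, v in enumerate(fv)]

# toy usage: two 2-d components, one one-word "document"
weights = [0.5, 0.5]
means = [[0.0, 0.0], [5.0, 5.0]]
sigmas = [[1.0, 1.0], [1.0, 1.0]]
fv = fisher_vector([[0.0, 0.0]], weights, means, sigmas)
# the word sits exactly on the first component's mean, so that gradient vanishes
assert fv[0] == 0.0 and fv[1] == 0.0
```

the movmf variant in the paper replaces the gaussian density and mean gradient with their vmf counterparts (mean direction and concentration), fitted with the spherecluster package.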
tokenisation); second, creating a fixed-length vector representation for every document; finally, the third phase is determined by the goal to be achieved, i.e. classification, clustering, or retrieval. for the first phase the same pre-processing is applied to all datasets. in the original paper, this phase was only briefly described as tokenisation and stopword removal. it is not given what tokeniser, linguistic filters (stemming, lemmatisation, etc.), or stop word list were used. knowing that the gensim library was used, we took all standard parameters (see the provided code). gensim, however, does not come with a pre-defined stopword list, and therefore, based on our own experience, we used the one provided in the nltk library for english. for the second phase, transforming terms and documents to vectors, zhang et al. [14] specify that all trained models are 50 dimensional. we have additionally experimented with dimensionality 20 (used by clinchant and perronnin [3] for clustering) and 100, as we hypothesized that 50 might be too low. the tf-idf model is 5000 dimensional (i.e. only the top 5000 terms based on their tf-idf value are used), while the fisher-kernel models are 15 × d dimensional, where d = {20, 50, 100}, as just explained. in what follows, d refers to the dimensionality of the lsi, lda, cbow, and pv models. the cbow and pv models are trained using a default window size of 5, keeping both low- and high-frequency terms, again following the setup of the original experiment. the lda model is trained using a chunk size of 1000 documents and for a number of iterations over the corpus ranging from 20 to 100. for the fk methods, both fitting procedures (gmm and movmf) are independently initialised 10 times and the best-fitting model is kept. for the third phase, parameters are explained in the following sections. logistic regression is used for classification in zhang et al., and therefore also used here.
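the first phase can be sketched as follows; this is our own approximation under stated assumptions (the tokeniser mimics gensim's simple_preprocess, which lowercases and keeps alphabetic tokens of length 2-15, and the tiny stop word set stands in for nltk's english list used in the replication):

```python
import re

# stand-in stop word list; the replication actually used nltk's english list
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "it"}

def preprocess(text):
    """lowercase, keep alphabetic tokens of length 2-15, drop stop words."""
    tokens = re.findall(r"[a-z]{2,15}", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

print(preprocess("The Fisher kernel is a framework for aggregation."))
# -> ['fisher', 'kernel', 'framework', 'for', 'aggregation']
```

the resulting token lists are what the second phase (tf-idf, lsi, lda, cbow, pv, and the fk aggregations) consumes.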
the results of our experiments, for d = 50 and 100-dimensional feature vectors, are summarised in table 1. for all the methods, we perform a parameter scan of the (inverse) regularisation strength of the logistic regression classifier, as shown in fig. 1(a) and (b). additionally, the learning algorithms are trained for different numbers of epochs and the resulting classification accuracy assessed, cf. fig. 1(c) and (d). figure 1(a) indicates that cbow, fv-gmm, fv-movmf, and the simple tf-idf, when properly tuned, exhibit a very similar accuracy on subj: the given confidence intervals do not allow us to identify a single best model. surprisingly, tf-idf outperforms all the others on the sent dataset (fig. 1(b)). increasing the dimensionality of the feature vectors from d = 50 to 100 has the effect of reducing the gap between tf-idf and the rest of the models on the sent dataset (see table 1). for the clustering experiments, the obtained feature vectors are passed to the k-means algorithm. the results of our experiments, measured in terms of adjusted rand index (ari) and normalized mutual information (nmi), are summarised in table 2. we used both d = 20 and 50-dimensional feature vectors. note that the evaluation of the clustering algorithms is based on knowledge of the ground truth class assignments, available in the 20 newsgroups dataset. as opposed to classification, the clustering experiments show a generous imbalance in performance and firmly speak in favour of pv-dbow. interestingly, tf-idf, fv-gmm, and fv-movmf, all providing high-dimensional document representations, have a low clustering effectiveness.
[fig. 1 caption: lsi and lda achieve low accuracy (see table 1) and are omitted for visibility. the left panels (a) and (b) show the effect of the (inverse) regularisation of the logistic regression classifier on the accuracy, while the right panels (c) and (d) display the effect of training for the learning algorithms.]
[fig. 1 caption, continued: the two symbols on the right axis in panels (a) and (b) indicate the best (fv-movmf) results reported in [14].]
for these experiments, we extracted from every document of the test collection all the raw text, and preprocessed it as described at the beginning of this section. the documents were indexed and retrieved for bm25 with the lucene 8.2 search engine. we experimented with three ways of processing the topics: (1) title only, (2) description only, and (3) title and description. the third way produces the best results, closest to the ones reported by zhang et al. [14], and hence these are the only ones reported here. an important aspect of bm25 is the fact that varying its parameters k1 and b can bring significant improvement in performance, as reported by lipani et al. [8]. therefore, we performed a parameter scan for k1 ∈ [0, 3] and b ∈ [0, 1] with a 0.05 step size for both parameters. for every trec topic, the scores of the top 1000 documents retrieved by bm25 were normalised to [0, 1] with the min-max normalisation method, and were used in calculating the scores of the documents for the combined models [14]. the original results and those of our replication experiments with standard (k1 = 1.2 and b = 0.75) and best bm25 parameter values, measured in terms of mean average precision (map) and precision at 20 (p@20), are outlined in table 3. we replicated previously reported experiments that presented evidence that a new mixture model, based on von mises-fisher distributions, outperformed a series of other models in three tasks (classification, clustering, and retrieval, when combined with standard retrieval models). since the source code was not released with the original paper, important implementation and formulation details were omitted, and the authors never replied to our request for information, a significant effort had to be devoted to reverse engineering the experiments.
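as a reference point for the retrieval setup described above, here is a self-contained sketch of bm25 scoring with tunable k1 and b, plus the min-max normalisation applied before combining with the embedding-based scores; the toy corpus is invented, and the idf variant is an assumption on our part (lucene's built-in bm25 differs in implementation details):

```python
from math import log

def bm25_score(query, doc, corpus, k1=1.2, b=0.75):
    """score one tokenised document against a tokenised query."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    score = 0.0
    for term in query:
        df = sum(1 for d in corpus if term in d)
        if df == 0:
            continue  # term absent from the collection
        idf = log(1 + (N - df + 0.5) / (df + 0.5))
        tf = doc.count(term)
        score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avgdl))
    return score

def min_max(scores):
    """normalise a score list to [0, 1], as done before score combination."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in scores]

corpus = [["fisher", "kernel"], ["kernel", "methods", "kernel"], ["word", "embeddings"]]
scores = [bm25_score(["kernel"], d, corpus, k1=1.2, b=0.75) for d in corpus]
norm = min_max(scores)  # ready to be combined with normalised fv scores
```

the parameter scan in the replication simply repeats this scoring over a grid of (k1, b) pairs and keeps the best-performing setting per collection.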
in general, for none of the tasks were we able to confirm the conclusions of the previous experiments: we do not have enough evidence to conclude that fv-movmf outperforms the other methods. the situation is rather different when considering the effectiveness of these document representations for clustering purposes: we find indeed that fv-movmf significantly underperforms, contradicting previous conclusions. in the case of retrieval, although zhang et al.'s proposed method (fv-movmf) indeed boosts bm25, it does not outperform most of the other models it was compared to.
[1] clustering on the unit hypersphere using von mises-fisher distributions
[2] latent dirichlet allocation
[3] aggregating continuous word embeddings for information retrieval
[4] indexing by latent semantic analysis
[5] distributional structure. word
[6] let's measure run time! extending the ir replicability infrastructure to include performance aspects
[7] distributed representations of sentences and documents
[8] verboseness fission for bm25 document length normalization
[9] efficient estimation of word representations in vector space
[10] neural information retrieval: at the end of the early years
[11] a sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts
[12] seeing stars: exploiting class relationships for sentiment categorization with respect to rating scales
[13] the trec robust retrieval track. sigir forum
[14] aggregating neural word embeddings for document representation
[15] ears 2019: the 2nd international workshop on explainable recommendation and search
authors are partially supported by the h2020 safe-deed project (ga 825225).
key: cord-028420-z8sv9f5k authors: filighera, anna; steuer, tim; rensing, christoph title: fooling automatic short answer grading systems date: 2020-06-09 journal: artificial intelligence in education doi: 10.1007/978-3-030-52237-7_15 sha: doc_id: 28420 cord_uid: z8sv9f5k
with the rising success of adversarial attacks on many nlp tasks, systems which actually operate in an adversarial scenario need to be reevaluated. for this purpose, we pose the following research question: how difficult is it to fool automatic short answer grading systems? in particular, we investigate the robustness of the state-of-the-art automatic short answer grading system proposed by sung et al. towards cheating in the form of universal adversarial trigger employment. these are short token sequences that can be prepended to students' answers in an exam to artificially improve their automatically assigned grade. such triggers are especially critical as they can easily be used by anyone once they are found. in our experiments, we discovered triggers which allow students to pass exams with passing thresholds of [formula: see text] without answering a single question correctly. furthermore, we show that such triggers generalize across models and datasets in this scenario, nullifying the defense strategy of keeping grading models or data secret. adversarial data sample perturbations, also called adversarial examples, intended to fool classification models have been a popular area of research in recent years. many state-of-the-art (sota) models have been shown to be vulnerable to adversarial attacks on various data sets [8, 44, 47]. on image data, the extent of the modifications needed to change a sample's classified label is often so small that it is imperceptible to humans [2]. on natural language data, perturbations can more easily be detected by humans.
however, it is still possible to minimally modify samples so that the semantic meaning does not change but the class assigned by the model does [3, 6, 13, 17, 22, 29, 30] . while the existence of such adversarial examples unveils our models' shortcomings in many fields, they are especially worrying in settings where we actually expect to face adversaries. in this work, we focus on one such setting: automatic short answer grading (asag) systems employed in exams. asag systems take free-text answers and evaluate their quality with regards to their semantic content, completeness and relevance to the answered question. these free-text answers are provided by students and are typically somewhere between a phrase and a paragraph long. the willingness of college students to cheat has been well-studied [1, 9, 11, 18, 36, 37] . and while the exact percentage of cheating students varies greatly from study to study, whitley [42] reports a mean of 43.1% of students cheating on examinations over 36 studies in his review. klein et al. [19] report similar values for cheating on exams in their large scale comparison of cheating behaviors in different schools. in these studies cheating behavior included copying from other students, getting the exam questions beforehand or bringing a cheat sheet to the exam. we argue that exploiting weaknesses in automatic grading schemes is just another, albeit less explored, form of cheating and expect the students' willingness to exhibit such behavior to be similar. therefore, if we wish to employ automated grading systems in exams, we should ensure that the perceived cost of cheating them outweighs the benefits. the perceived cost of cheating is made up of various factors, such as the punishment when caught, moral considerations or the difficulty of cheating in the first place [26] . in this work, we aim to investigate the last factor: how difficult is it to fool automatic short answer grading systems? 
for this purpose, we first reproduce the sota approach to asag [39] which is based on the transformer model bert [10] . then we subject the reproduced models to adversarial attacks. in particular, we employ and modify the universal trigger attack proposed by wallace et al. [41] . it generates short sequences of tokens, called universal adversarial triggers, which try to make a model predict a target class regardless of the actual sample. in our context, students could prepend such a universal trigger targeted at the correct class to all of their answers in an exam to artificially improve their score. an example of such a trigger can be seen in table 1. this kind of universal trigger attack is especially critical as such triggers can be easily employed by anyone once they are found. in this work we make the following novel contributions:

- application of sota natural language processing insights to the educational scenario of exam grading
- modification of wallace et al.'s universal trigger attack to make it more targeted at a specific class, namely the correct class
- investigation of trigger transferability across datasets and models

two research areas are relevant for our work: automatic short answer grading and adversarial attacks.

table 1. an example showing how prepending the trigger sequence "none exits" to a student answer (taken from scientsbank's question em-21b's unseen answers [12]) changes the prediction from incorrect to correct.
question: georgia found one brown mineral and one black mineral. how will she know which one is harder?
reference answer: the harder mineral will leave a scratch on the less hard mineral. if the black mineral is harder, the brown mineral will have a scratch.
student answer: that it stay the scratch. → incorrect
triggered answer: none exits that it stay the scratch. → correct

adversarial attacks can be categorized into input dependent and input independent attacks. 
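a minimal sketch of how such a prepended trigger would be deployed and evaluated: the trigger tokens are attached to every answer and flips to the target class are counted. `toy_grader` here is a made-up keyword matcher standing in for the real fine-tuned bert classifier, purely for illustration.

```python
def prepend_trigger(answers, trigger):
    """Prepend a fixed trigger token sequence to every student answer."""
    return [f"{trigger} {a}" for a in answers]

def count_flips(grader, answers, trigger, target="correct"):
    """Count answers whose predicted label flips to the target class
    once the trigger is prepended (the attack's success measure)."""
    before = [grader(a) for a in answers]
    after = [grader(a) for a in prepend_trigger(answers, trigger)]
    return sum(1 for b, a in zip(before, after)
               if b != target and a == target)

# hypothetical stand-in grader, NOT the authors' model: it simply
# reacts to surface keywords, which is enough to show the bookkeeping
def toy_grader(answer):
    return "correct" if "none exits" in answer or "harder" in answer else "incorrect"

answers = ["that it stay the scratch.", "the harder mineral wins."]
flips = count_flips(toy_grader, answers, "none exits")
```

only the first answer counts as a flip: the second was already graded as correct before the trigger was applied.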
input dependent attacks aim to modify specific inputs so that the model misclassifies them. strategically inserting, deleting or replacing words with their synonyms [29] , their nearest neighbors in the embedding space [3] or other words which have a high probability of appearing in the same context [47] are examples of such an attack. samanta and mehta [35] also consider typos which in turn result in valid words, e.g. goods and good, for their replacement candidate pool. modifications can also be made on the character level by inserting noise, such as swapping adjacent characters or completely scrambling words [6] . finally, the text can also be paraphrased to change the syntactic structure [17] . input agnostic attacks, on the other hand, aim to find perturbations that lead to misclassifications on all samples. for instance, this can be done by selecting a single perturbation in the embedding space which is then applied to all tokens indiscriminately [15] . alternatively, ribeiro et al. [30] propose an approach that first paraphrases specific inputs to find semantically equivalent adversaries and then generalizes found examples to universal, semantically equivalent adversarial rules. rules are selected to maximize semantic equivalence when applied to a sample, induce as many misclassifications as possible and are finally vetted by humans. an example of such a rule is "what is" → "what's". another input independent approach involves concatenating a series of adversarial words -triggers -to the beginning of every input sample [5] . the universal trigger attack [41] utilized in this work also belongs to this category. in sect. 4 the attack is explained in more detail. additional information on adversarial attacks can also be found in various surveys [44, 48] . systems that automatically score student answers have been explored for multiple decades. 
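the gradient-guided word replacement underlying several of the attacks above (and the hotflip step reused by the universal trigger attack) can be sketched with a first-order approximation: the change in loss from swapping the current token embedding e_cur for a candidate e_cand is estimated as (e_cand − e_cur) · ∇L, and the lowest-scoring candidates are kept. the embeddings and gradient below are made-up toy values, not taken from any real model.

```python
import numpy as np

def top_candidates(grad, e_cur, vocab_emb, k=3):
    """Rank vocabulary tokens by the first-order estimate of the loss
    change when replacing the current token:
        delta_L ≈ (e_cand - e_cur) @ grad
    Lower estimated loss is better for the attacker."""
    scores = (vocab_emb - e_cur) @ grad
    return np.argsort(scores)[:k]

rng = np.random.default_rng(0)
vocab_emb = rng.normal(size=(10, 4))    # toy embedding table, 10 tokens
e_cur = vocab_emb[0]                    # current trigger token
grad = np.array([1.0, 0.0, 0.0, 0.0])   # toy gradient of the loss
best = top_candidates(grad, e_cur, vocab_emb, k=1)[0]
```

with this toy gradient the best candidate is simply the token whose first embedding coordinate is smallest; in a real attack the candidates would then feed a beam search.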
proposed approaches include clustering student answers into groups of similar answers and assigning grades to whole clusters instead of individual answers [4, 16, 45, 46] , grading based on manually constructed rules or models of ideal answers [21, 43] and automatically assigning grades based on the answer's similarity to given reference answers. we will focus on similarity-based approaches here because most recent sota results were obtained using this kind of approach. however, more information on other approaches can be found in various surveys [7, 14, 32] . the earlier similarity-based approaches involve manually defining features that try to capture the similarity of answers on multiple levels [12, 24, 25, 33, 34, 38] . surface features, such as lexical overlap or answer length ratios, are utilized by most feature engineered approaches. semantic similarity measures are also common, be it in the form of sentence embedding distances or measures derived from knowledge bases like wordnet [28] . some forms of syntactic features are also often employed. dependency graph alignment or measures based on the part-of-speech tags' distribution in the answers would be examples of syntactic features. a further discussion of various features can be found in [27] . more recently, deep learning methods have also been adapted to the task of automatic short answer grading [20, 31, 40] . the key difference to the feature engineered approaches lies in the fact that the text's representation in the feature space is learned by the model itself. the best performing model (in terms of accuracy and f1 score) on the benchmark 3-way semeval dataset [12] was proposed by sung et al. [39] . they utilize the uncased bert base model [10] which was pre-trained on the bookscorpus [49] and the english wikipedia. it was pre-trained on the tasks of predicting randomly masked input tokens and whether a sentence is another's successor or not. sung et al. 
then fine-tune this deep bidirectional language representation model to predict whether an answer is correct, incorrect or contradictory compared to a reference answer. for this purpose, they append a feed-forward classification layer to the bert model. the authors claim that their model outperforms even human graders. to reproduce the results reported by sung et al. [39] , we trained 10 models with the hyperparameters stated in the paper. unreported hyperparameters were selected close to the original bert model's values with minimal tuning. the models were trained on the shuffled training split contained in the scientsbank dataset of the semeval-2013 challenge. as in the reference paper, we use the 3-way task of predicting answers to be correct, incorrect or contradictory. then the models were evaluated on the test split. the test set contains three distinct categories: unseen answers, unseen questions and unseen domains. unseen answers are answers to questions for which some answers have already been seen during training. unseen questions are completely new questions and the unseen domains category contains questions belonging to domains the model has not seen during training. we were not able to reproduce the reported results exactly with this setup. out of the 10 models, models 4 and 8 performed best. a comparison of their and the reported model's results can be seen in table 2. the 10 models' average performance can be seen in fig. 1. since the reported results are mostly within one or two standard deviations of the results achieved in our experiments, more in-depth hyperparameter tuning and reruns with different random seeds may yield the reported results. alternatively, the authors may have taken steps that they did not discuss in the paper. however, as this is not the focus of this work, we deemed the reproduced models sufficient for our experiments. table 2. 
performance of best reproduced models, models 4 and 8, compared to the results reported by sung et al. [39] in terms of accuracy (acc), macro-averaged f1 score (m-f1) and weighted-averaged f1 score (w-f1). each category's best result is marked in bold. in this work, we employ the universal trigger attack proposed by wallace et al. [41] . it is targeted, meaning that a target class is specified and the search will try to find triggers that lead the model to predict the specified class, regardless of the sample's actual class. the attack begins with an initial trigger, such as "the the the", and iteratively searches for good replacements for the words contained in the trigger. the replacement strategy is based on the hotflip attack proposed by ebrahimi et al. [13] . for each batch of samples, candidates are chosen out of all tokens in the vocabulary so that the loss for the target class is minimized. then, a beam search over candidates is performed to find the ordered combination of triggers which maximizes the batch's loss. we augment this attack by also considering more target label focused objective functions for the beam search than the batch's loss. namely, we experiment with naively maximizing the number of target label predictions and the targeted logsoftmax function depicted in eq. 1:

loss(t, x) = -(1/n) Σ_{x_i ∈ x} log( exp(f_t(x_i)) / Σ_{l ∈ l} exp(f_l(x_i)) )     (1)

here, l = {correct, incorrect, contradictory}, t is the target label, n denotes the number of samples in the batch x and f_l (x) represents the model's output for label l's node before the softmax activation function given a sample x. in this section, we first give a short overview of the datasets used in our experiments. then, we present the best triggers found, followed by a short investigation of the effect of trigger length on the number of successful flips. next, the effect of our modified objective function is investigated. finally, we report on the transferability of triggers across models. 
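the targeted logsoftmax objective of eq. 1 can be rendered directly as the mean negative log-softmax of the target label over a batch. the logits below are arbitrary toy values, not outputs of the actual grading model.

```python
import numpy as np

def targeted_logsoftmax_loss(logits, target):
    """Mean negative log-softmax of the target class over a batch.
    logits: (n_samples, n_labels) pre-softmax outputs f_l(x)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[:, target].mean()

logits = np.zeros((4, 3))   # uniform toy logits over the 3 labels
loss = targeted_logsoftmax_loss(logits, target=0)
```

with uniform logits the loss equals log 3, and pushing the target logit up drives it toward zero, which is exactly what the trigger search exploits.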
the semeval asag challenge consists of two distinct datasets: scientsbank and beetle. while the beetle set only contains questions concerning electricity, the scientsbank corpus includes questions of various scientific domains, such as biology, physics and geography. we do not include the class distribution of the 3-way task here, as it can be found in the original [12] and the asag reference paper [39] . unless explicitly stated, all experiments were conducted in the following way. model 8 was chosen as the victim model because it has the overall best performance of all reproduction models. see table 2 for reference. since the model was trained on the complete scientsbank training split as stated in the reference paper, we selected the beetle training split as basis for our attacks. while the class labels were homogenized for both datasets in the semeval challenge, the datasets are still vastly different. they were collected in dissimilar settings, by different authors and deal with disparate domains [12] . this is important, as successful attacks with this setup imply transferability of triggers across datasets. in practice, this would allow attackers to substitute secret datasets with their own corpora and still find successful attacks on the original data. to the best of our knowledge, experiments investigating the transferability of natural language triggers across datasets are a novel contribution of our work. from the beetle training set all 1227 incorrect samples were selected. the goal of the attack was to flip their classification label to correct. we would also have liked to try and flip contradictory examples. however, the model was only able to correctly predict 18 of the 1049 contradictory samples without any malicious intervention necessary. finally, the triggers found are evaluated on the scientsbank test split. in the related work, the success of an attack is most often measured in the drop in accuracy it is able to achieve. 
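the drop-in-accuracy measure mentioned above can be contrasted in code with a stricter count of the flips that actually matter for cheating: only truly incorrect answers newly graded as correct. the labels below are toy values for illustration, not dataset statistics.

```python
def attack_metrics(true_labels, preds_before, preds_after):
    """Compare the usual accuracy-drop measure with the stricter count
    of incorrect-to-correct flips relevant to the cheating scenario."""
    n = len(true_labels)
    acc_before = sum(t == p for t, p in zip(true_labels, preds_before)) / n
    acc_after = sum(t == p for t, p in zip(true_labels, preds_after)) / n
    flips = sum(1 for t, b, a in zip(true_labels, preds_before, preds_after)
                if t == "incorrect" and b == "incorrect" and a == "correct")
    return acc_before - acc_after, flips

truth  = ["incorrect", "incorrect", "incorrect"]
before = ["incorrect", "incorrect", "incorrect"]
after  = ["correct", "contradictory", "incorrect"]
drop, flips = attack_metrics(truth, before, after)
```

here the accuracy drop is 2/3, but only one sample counts as a successful flip: the change to contradictory lowers accuracy without helping the cheating student.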
however, this would overestimate the performance in our scenario as we are only interested in incorrect answers which are falsely graded as correct, in contrast to answers which are labeled as contradictory. therefore, we also report the absolute number of successful flips from incorrect to correct. during the iterative trigger search process described in sect. 4 a few thousand triggers were evaluated on the beetle set. of these, the 20 triggers with the most achieved flips were evaluated on the test set and of these, the best triggers can be seen in table 3. on the unseen answers test split, the model without any triggers misclassified 12.4% (31) of all incorrect samples as correct. the triggers "none varies" and "none would" managed to flip an additional 8.8% of samples so that 21.3% (53) are misclassified in total. on the unseen questions split, the base misclassification rate was 27.4% (101) and "none would" added another 10.1% for a total of 37.5% (138). on the unseen domains split, "none elsewhere" increased the misclassification rate from 22.0% (491) to 37.1% (826). wallace et al. [41] state that the trigger length is a trade-off between effectiveness and stealth. they experimented with prepending triggers of lengths between 1 and 10 tokens and found longer triggers to have higher success rates. this differs from observations made in our experiments. when the correct class is targeted, a trigger length of two achieves the best results, as can be seen in table 3. on the unseen answers split, the best trigger of length 3 is "heats affected penetrated" and it manages to flip only 42 samples. the number of successful flips further decreases to 9 for the best trigger of length 4, "##ired unaffected least being". the same trend also holds for the other test splits but is omitted here for brevity.

table 3. the triggers with the most flips from incorrect to correct for each test split. the number of model 8's misclassifications without any triggers can be found in the last row. for the sake of comparability with related work, the accuracy for incorrect samples is also given. ua stands for "unseen answers", uq denotes "unseen questions" and ud represents "unseen domains".

this difference in observation may be due to the varying definitions of attack success. wallace et al. [41] view a trigger as successful as soon as the model assigns any class other than the true label, while we only accept triggers which cause a prediction of the class correct. the educational setting of this work may also be a factor. effect of objective function. we compared the performance of the three different objective functions described in sect. 4, namely the original function proposed by wallace et al. [41] , the targeted logsoftmax depicted in eq. 1 and the naive maximization of the number of target label predictions. to make the comparison as fair as possible while keeping the computation time reasonable, we fixed the hyperparameters of the attack to a beam size of 4 and a candidate set size of 100. the attack was run for the same number of iterations exactly once for each function. the best triggers found by each function can be seen in table 4. the performance is relatively similar, with the targeted function achieving the most flips on all test splits, closely followed by the original function and, lastly, the naive prediction maximization. qualitative observation of all produced triggers showed that the original function's triggers resulted in more flips from incorrect to contradictory than the proposed targeted function's. transferability. one of the most interesting aspects of triggers relates to the ability to find them on one model and use them to fool another model. in this setting, attackers do not require access to the original model, which may be kept secret in a grading scenario. 
trigger transferability across models allows them to train a substitute model for the trigger search and then attack the actual grading model with found triggers. we investigate this aspect by applying all good triggers found on model 8 to model 4. note that this also included triggers from a search on the scientsbank training split and not just the beetle training set. the best performing triggers in terms of flips induced in model 4 can be seen in table 5 . we also included the triggers which performed best on model 8 here. while there are triggers that perform well on both models, e.g. "none else", the best triggers for each model differ. interestingly, triggers like "nowhere changes" or "anywhere." perform even better on model 4 than the best triggers found for the original victim model. on ua, "nowhere changes" flips 14.9% of all incorrect samples to correct. in addition to the base misclassification rate, this leads to a misclassification rate of 32.5%. on uq, the same trigger increases the misclassification rate by 22.8% to a total of 50%. on the ud split, prepending "anywhere." to all incorrect samples raises the rate by 17.1% to 46.1%. as a curious side note, the trigger "heats affected penetrated" discussed in the section regarding trigger length performed substantially better on model 4, so that it was a close contender for the best trigger list. in our scenario, a misclassification rate of 37.5% means that students using triggers only need to answer 20% of the questions correctly to pass a test that was designed to have a passing threshold of 50%. if an exam would be graded by model 4, students could pass the test by simply prepending "nowhere changes" to their answers without answering a single question correctly! however, this does not mean that these triggers flip any arbitrary answer, as a large portion of the flipped incorrect answers showed at least a vague familiarity with the question's topic similar to the example displayed in table 1 . 
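the passing-threshold arithmetic above can be checked directly: if a fraction r of a student's wrong answers is nonetheless graded as correct, the expected score for truly answering a fraction p of questions is p + (1 − p)·r, so reaching a threshold s requires p = (s − r)/(1 − r). with r = 0.375 and s = 0.5 this reproduces the 20% quoted in the text.

```python
def required_correct_fraction(flip_rate, threshold=0.5):
    """Fraction of questions a student must truly answer correctly to
    reach the passing threshold, when a fraction `flip_rate` of wrong
    answers is misgraded as correct: solve p + (1 - p) * r = s for p."""
    return (threshold - flip_rate) / (1.0 - flip_rate)

p = required_correct_fraction(0.375)  # unseen-questions rate from the text
```

note that once the flip rate reaches the passing threshold itself, no correct answers are needed at all, which is the "pass without answering a single question" situation described above.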
additionally, these rates were achieved on the unseen questions split. translated to our scenario this implies that we would expect our model to grade questions similar to questions it has seen during training but for which it has not seen a single example answer, besides the reference answer. to take an example out of the actual dataset, a model trained to grade the question what happens to earth materials during deposition? would also be expected to grade what happens to earth materials during erosion? with only the help of the reference answer "earth materials are worn away and moved during erosion.". the results suggest that the current sota approach is ill-equipped to generalize its grading behavior in such a way. nevertheless, even if we supply training answers to every question the misclassification rates are quite high with 21.3% and 32.5% for model 8 and 4, respectively. considering how easy these triggers are employed by everyone once someone has found them, this is concerning. thus, defensive measures should be investigated and put into place before using automatic short answer grading systems in practice. in conclusion, we have shown the sota automatic short answer grading system to be vulnerable to cheating in the form of universal trigger employment. we also showed that triggers can be successful even if they were found on a disparate dataset or model. this makes the attack easier to execute, as attackers can simply substitute secret grading components in their search for triggers. lastly, we also proposed a way to make the attack more focused on flipping samples from a specific source class to a target class. there are several points of interest which we plan to study further in the future. for one, finding adversarial attacks on natural language tasks is a very active field at the moment. exposing asag systems to other forms of attacks, such as attacks based on paraphrasing, would be very interesting. 
additionally, one could also explore defensive measures to make grading models more robust. an in-depth analysis of why these attacks work would be beneficial here. finally, expanding the transferability study conducted in this work to other kinds of models, such as roberta [23] or feature engineering-based approaches, and additional datasets may lead to interesting findings as well.

references:
- cheating on exams in the iranian efl context
- threat of adversarial attacks on deep learning in computer vision: a survey
- generating natural language adversarial examples
- powergrading: a clustering approach to amplify human effort for short answer grading
- universal adversarial attacks on text classifiers
- synthetic and natural noise both break neural machine translation
- the eras and trends of automatic short answer grading
- adversarial examples are not easily detected: bypassing ten detection methods
- the culture of cheating: from the classroom to the exam room
- bert: pre-training of deep bidirectional transformers for language understanding
- college cheating in japan and the united states
- semeval-2013 task 7: the joint student response analysis and 8th recognizing textual entailment challenge
- hotflip: white-box adversarial examples for text classification
- machine learning approach for automatic short answer grading: a systematic review
- universal adversarial perturbation for text classification
- semi-supervised clustering for short answer scoring
- adversarial example generation with syntactically controlled paraphrase networks
- online exams and cheating: an empirical analysis of business students' views
- cheating during the college years: how do business school students compare?
- earth mover's distance pooling over siamese lstms for automatic short answer grading
- c-rater: automated scoring of short-answer questions
- deep text classification can be fooled
- roberta: a robustly optimized bert pretraining approach
- creating scoring rubric from representative student answers for improved short answer grading
- learning to grade short answer questions using semantic similarity measures and dependency graph alignments
- motivational perspectives on student cheating: toward an integrated model of academic dishonesty
- get semantic with me! the usefulness of different feature types for short-answer grading
- wordnet::similarity: measuring the relatedness of concepts
- generating natural language adversarial examples through probability weighted word saliency
- semantically equivalent adversarial rules for debugging nlp models
- investigating neural architectures for short answer scoring
- a perspective on computer assisted assessment techniques for short free-text answers
- sentence level or token level features for automatic short answer grading? use both
- feature engineering and ensemble-based approach for improving automatic short-answer grading performance
- towards crafting text adversarial samples
- cheating and plagiarism: perceptions and practices of first year it students
- an examination of student cheating in the two-year college
- fast and easy short answer grading with high accuracy
- improving short answer grading using transformer-based pre-training
- multiway attention networks for modeling sentence pairs
- universal adversarial triggers for attacking and analyzing nlp
- factors associated with cheating among college students: a review
- using nlp to support scalable assessment of short free text responses
- adversarial examples: attacks and defenses for deep learning
- automatic coding of short text responses via clustering in educational assessment
- reducing annotation efforts in supervised short answer scoring
- generating fluent adversarial examples for natural languages
- adversarial attacks on deep learning models in natural language processing: a survey
- aligning books and movies: towards story-like visual explanations by watching movies and reading books

key: cord-016364-80l5mua2 authors: menotti-raymond, marilyn; o'brien, stephen j. title: the domestic cat, felis catus, as a model of hereditary and infectious disease date: 2008 journal: sourcebook of models for biomedical research doi: 10.1007/978-1-59745-285-4_25 sha: doc_id: 16364 cord_uid: 80l5mua2 the domestic cat, currently the most frequent of companion animals, has enjoyed a medical surveillance, as a nonprimate species, second only to the dog. with over 200 hereditary disease pathologies reported in the cat, the clinical and physiological study of these feline hereditary diseases provides a strong comparative medicine opportunity for prevention, diagnostics, and treatment studies in a laboratory setting. causal mutations have been characterized in 19 felid genes, with the largest representation from lysosomal storage enzyme disorders. 
corrective therapeutic strategies for several disorders have been proposed and examined in the cat, including enzyme replacement, heterologous bone marrow transplantation, and substrate reduction therapy. genomics tools developed in the cat, including the recent completion of the 2-fold whole genome sequence of the cat and genome browser, radiation hybrid map of 1793 integrated coding and microsatellite loci, a 5-cm genetic linkage map, arrayed bac libraries, and flow sorted chromosomes, are providing resources that are being utilized in mapping and characterization of genes of interest. a recent report of the mapping and characterization of a novel causative gene for feline spinal muscular atrophy marked the first identification of a disease gene purely from positional reasoning. with the development of genomic resources in the cat and the application of complementary comparative tools developed in other species, the domestic cat is emerging as a promising resource of phenotypically defined genetic variation of biomedical significance. additionally, the cat has provided several useful models for infectious disease. these include feline leukemia and feline sarcoma virus, feline coronavirus, and type c retroviruses that interact with cellular oncogenes to induce leukemia, lymphoma, and sarcoma. mankind has held a centuries-long fascination with the cat. the earliest archeological records that have been linked to the domestication of felis catus date to approximately 9500 years ago from cyprus, 1 with recent molecular genetic analyses in our laboratory suggesting a middle eastern origin for domestication (c. driscoll et al., unpublished observations). currently the most numerous of companion animals, numbering close to 90 million in households across the united states (http://www.appma.org/press_industrytrends.asp), the cat enjoys a medical surveillance second only to the dog and humankind. 
in this chapter we review the promise of the cat as an important model for the advancement of human hereditary and infectious disease and the genomic tools that have been developed for the identification and characterization of genes of interest. for many years we have sought to characterize genetic organization in the domestic cat and to develop genomic resources that establish f. catus as a useful animal model for human hereditary disease analogues, neoplasia, genetic factors associated with host response to infectious disease, and mammalian genome evolution. 2, 3 to identify genes associated with inherited pathologies that mirror inherited human conditions and interesting phenotypes in the domestic cat, we have produced genetic maps of sufficient density to allow linkage or association-based mapping exercises. [4] [5] [6] [7] [8] [9] [10] [11] the first genetic map of the cat, a physical map generated from a somatic-cell hybrid panel, demonstrated the cat's high level of conserved synteny with the human genome, which offered much promise for the future application of comparative genomic inference in felid mapping and association exercises. 12 several radiation hybrid (rh) and genetic linkage (gl) maps have since been published. [4] [5] [6] [7] [8] [9] 11, 13, 14 although previous versions of the cat gene map, based on somatic cell hybrid and zoo-fish analysis, 15,16 revealed considerable conservation of synteny with the human genome, these maps provided no knowledge of gene order or intrachromosomal genome rearrangement between the two species, information that is critical to applying comparative map inference to gene discovery in gene-poor model systems. radiation hybrid (rh) mapping has emerged as a powerful tool for constructing moderate-to high-density gene maps in vertebrates by obviating the need to identify interspecific polymorphisms critical for the generation of genetic linkage maps. 
7 the most recent rh map of the cat 8 includes 1793 markers: 662 coding loci, 335 selected markers derived from the cat 2x whole genome sequence targeted at breakpoints in conserved synteny between human and cat, and 797 short tandem repeat (str) loci. the strategy used in developing the current rh map was to target gaps in the feline-human comparative map, and to provide more definition in breakpoints in regions of conserved synteny between cat and human. the 1793 markers cover the length of the 18 feline autosomes and the x chromosome at an average spacing of one marker every 1.5 mb (megabase), with fairly uniform marker density. 8 an enhanced comparative map demonstrates that the current map provides 86% and 85% comparative coverage of the human and canine genomes, respectively. 8 ninety-six percent of the 1793 cat markers have identifiable orthologues in the canine and human genome sequences, providing a rich comparative tool, which is critical in linkage mapping exercises for the identification of genes controlling feline phenotypes. figure 25-1 presents a graphic display of each cat chromosome and blocks of conserved syntenic order with the human and canine genomes. 8 one hundred and fifty-two cat-human and 134 cat-dog homologous synteny blocks were identified. alignment of cat, dog, and human chromosomes demonstrated different patterns of chromosomal rearrangement with a marked increase in interchromosomal rearrangements relative to human in the canid lineage (89% of all rearrangements), as opposed to the more frequent intrachromosomal rearrangements in the felid lineage (95% of all rearrangements) since divergence from a common carnivore ancestor 55 my ago. 
with an average spacing of 1 marker every 1.5 mb in the feline euchromatic sequence, the map provided a solid framework for the chromosomal assignment of feline contigs and scaffolds during assembly of the cat genome assembly, 17 and served as a comparative tool to aid in the identification of genes controlling feline phenotypes. as a complement to the rh map of the cat, a third generation linkage map of 625 strs is currently nearing completion. the map has been generated in a large multigeneration domestic cat pedigree (n = 483 informative meioses). 18 previous first-and second-generation linkage maps of the cat were generated in a multigeneration interspecies pedigree generated between the domestic cat and the asian leopard cat, prionailurus bengalensis, 7 to facilitate the mapping and integration of type i (coding) and type ii (polymorphic str) loci. 7 the current map, which spans all 18 autosomes with single linkage groups, has twice the str density of previous maps, providing a 5-cm resolution. there is also greatly expanded coverage of the x chromosome, with some 75 str loci. marker order between the current generation rh and gl maps is highly concordant. 8 approximately 85% of the strs are mapped in the most current rh map of the cat, 8 which provides reference and integration with type i loci. whereas the third-generation linkage map is composed entirely of str loci, the sequence homology of extended genomic regions adjacent to the str loci in the cat 2x whole genome sequence, 17 to the dog's homologous region, 19 has enabled us to obtain identifiable orthologues in the canine and human genome sequences for over 95% of the strs. thus, practically every str acts as a "virtual" type 1 locus, with both comparative anchoring and linkage map utility. combined with the cat rh map, these genomic tools provide us with the comparative reference to other mammalian genomes critical for linkage and association mapping. 
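as a rough consistency check on the quoted marker density, 1793 markers at an average spacing of one per 1.5 mb imply roughly 2.7 gb of spanned sequence; the comparison to a typical mammalian genome size is our assumption for illustration, not a figure from the text.

```python
def implied_coverage_mb(n_markers, spacing_mb):
    """Total span implied by evenly spaced markers, in megabases."""
    return n_markers * spacing_mb

# 1793 markers, one every 1.5 mb, gives about 2689.5 mb (~2.7 gb)
span = implied_coverage_mb(1793, 1.5)
```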
the domestic cat is one of 26 mammalian species endorsed by the national human genome research institute (nhgri) human genome annotation committee for a "light" 2-fold whole genome sequence, largely to capture the pattern of genome variation and divergence that characterizes the mammalian radiations (http://www.hgsc.bcm.tmc.edu/projects/bovine/, http://www.broad.mit.edu/mammals/). although light genome coverage provides limited sequence representation (∼80%), 20 one of the rationales for these light genome sequences was "enhancing opportunities for research on species providing human medical models." the 2-fold assembly of the domestic cat genome has recently been completed for a female abyssinian cat, "cinnamon," 17 and a 7x whole genome sequencing effort is planned in the near future. a total of 9,161,674 reads were assembled into 817,956 contigs, covering 1.642 gb.

we introduce the distributions s(a, t), i(a, t) and r(a, t) at time t ≥ 0 of susceptible, infectious and recovered individuals, respectively, in relation to specific social characteristics. we assume that the rapid spread of the disease and the low mortality rate allow us to ignore changes in the social structure, such as the aging process, births and deaths. consequently, for a given population of total number n, we have s(a, t) + i(a, t) + r(a, t) = f(a), where f(a) is the total distribution of the social features defined by the vector a, with ∫_λ f(a) da = n. hence, we recover the total fractions of the population which belong to the susceptible, infected and recovered compartments as s(t) = (1/n) ∫_λ s(a, t) da, i(t) = (1/n) ∫_λ i(a, t) da, r(t) = (1/n) ∫_λ r(a, t) da. in a situation where changes in the social features act on a slower scale with respect to the spread of the disease, the socially structured compartmental model follows the dynamics

d/dt s(a, t) = −s(a, t) ∫_λ β(a, a*) i(a*, t) da*,
d/dt i(a, t) = s(a, t) ∫_λ β(a, a*) i(a*, t) da* − γ(a) i(a, t),
d/dt r(a, t) = γ(a) i(a, t),

where the function β(a, a*) ≥ 0 represents the uncertain interaction rate among individuals with different social features and γ(a) ≥ 0 the recovery rate, which may depend on the social feature.
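the structured dynamics above can be sketched numerically; below is a minimal illustration (not the authors' code): forward euler on a uniform age grid over λ = [0, 1], with a constant interaction matrix and recovery rate chosen only for demonstration, and a simple rectangle rule standing in for the integrals over λ.

```python
import numpy as np

def structured_sir(beta, gamma, s0, i0, r0, da, dt=0.01, steps=2000):
    """Forward-Euler integration of the age-structured SIR dynamics:
       ds/dt = -s(a) * int beta(a, a*) i(a*) da*
       di/dt =  s(a) * int beta(a, a*) i(a*) da* - gamma(a) i(a)
       dr/dt =  gamma(a) i(a)"""
    s, i, r = s0.copy(), i0.copy(), r0.copy()
    for _ in range(steps):
        force = (beta @ i) * da              # discretized integral over a*
        ds, di, dr = -s * force, s * force - gamma * i, gamma * i
        s, i, r = s + dt * ds, i + dt * di, r + dt * dr
    return s, i, r

# illustrative demo: uniform age distribution f(a) = 1 on [0, 1]
n = 20
da = 1.0 / n
f = np.ones(n)
i0 = 1e-3 * f
s0 = f - i0
beta = np.full((n, n), 3.0)                  # constant interaction rate
gamma = np.ones(n)                           # constant recovery rate
s, i, r = structured_sir(beta, gamma, s0, i0, np.zeros(n), da)
total = (s + i + r).sum() * da               # should stay equal to 1
```

with a constant β and uniform f(a), the structured model collapses to the homogeneous dynamics, which makes the conservation of s + i + r easy to check.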
often, in socially structured models, the interaction rate between people is assumed to be separable and proportional to the activity level of the social feature [22, 23], i.e. β(a, a*) = b(a) b(a*), with b(a) the average number of people contacted by a person with social feature a per unit time. alternative approaches are based on preferential mixing [13, 21]. specific examples of age-dependent social interaction matrices are reported in appendix b. after the usual normalization scaling, the macroscopic quantities s(t), i(t) and r(t) satisfy the conventional sir dynamics, where the fraction of recovered is obtained from r(t) = 1 − s(t) − i(t). we refer to [22, 23] for analytical results concerning models (2) and (4). in the following we will adopt the simple compartmental model (2) to derive our feedback controlled formulation in presence of uncertainty. the extension to more realistic compartmental models in epidemiology, such as seir and/or mseir, can be carried out in a similar way. in order to simplify the description, we will consider the case d_a = 1 and take as social dependence the age a of the individual, because of its importance in epidemic dynamics. it is clear, however, that similar containment procedures can be applied also on the basis of other social features. we will first formulate the feedback controlled sir model in the deterministic case and subsequently extend our approach to the presence of uncertain parameters. in order to define the action of a policy maker introducing a control over the system, based on social distancing and other containment measures linked to the social structure, we consider an optimal control framework. the choice of an appropriate functional is problem dependent [15]. in our setting, we consider the minimization of the total number of the infected population i(t) through an age-dependent control action depending both on time and on pairwise interactions among individuals with different ages.
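the macroscopic reduction mentioned above, with the recovered fraction obtained from the conservation r(t) = 1 − s(t) − i(t), can be sketched as follows (illustrative rates, simple forward euler):

```python
def sir(beta, gamma, i0, dt=1e-3, t_end=30.0):
    """Classical SIR on normalized fractions: S + I + R = 1."""
    s, i = 1.0 - i0, i0
    for _ in range(int(t_end / dt)):
        new_inf = beta * s * i
        s -= dt * new_inf
        i += dt * (new_inf - gamma * i)
    return s, i, 1.0 - s - i     # R recovered from the conservation law

s_end, i_end, r_end = sir(beta=2.5, gamma=1.0, i0=1e-4)
```

with β/γ = 2.5 the epidemic burns out leaving roughly 10% of the population never infected, consistent with the classical final-size relation.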
thus, we introduce the optimal control problem min_{u∈U} J(u), subject to the controlled dynamics, with initial conditions i(a, 0) = i_0(a), s(a, 0) = s_0(a) and r(a, 0) = r_0(a). the number of infected individuals is measured by a monotone increasing function ψ(·) such that ψ : [0, 1] → R_+. this function models the policy maker's perception of the impact of the epidemic through the number of people currently infected. for example, ψ(i) = i^q/q, for q > 1, implies an underestimation of the actual number of infected, which we expect to result in the need for a larger penalty term than for q = 1. the control aims to minimize this measure of the total infected population by reducing the rate of interaction between individuals, and we consider a quadratic cost for its actuation. such control is restricted to the space of admissible controls U, which ensures the admissibility of the solution for (6). the solution to problem (5)-(6) is computed through the optimality conditions obtained from the euler-lagrange equations, where p_s(a, t), p_i(a, t), p_r(a, t) are the associated lagrange multipliers. computing the variations with respect to (s, i, r) we retrieve the adjoint system, with terminal conditions p_s(a, T) = 0, p_i(a, T) = 0 and p_r(a, T) = 0. note that the contribution of p_r(a, t) vanishes, since the control does not act directly on the population r, and the removed population is not considered in the minimization of the functional. the optimality conditions (7)-(8) are first order necessary conditions for the optimal control u*(a, a*, t). in order to be admissible, the control is then clipped as u(a, a*, t) = max{0, min{u*(a, a*, t), φ_{m,β}(a, a*)}}, where φ_{m,β}(a, a*) = min{β(a, a*), m}. the approach just described, however, is generally quite complicated in the presence of uncertainties, as it involves solving simultaneously the forward problem (5)-(6) and the backward problem (7)-(8).
moreover, the assumption that the policy maker follows an optimal strategy over a long time horizon seems rather unrealistic in the case of a rapidly spreading disease such as the covid-19 epidemic. in this section we consider short time horizon strategies which permit the derivation of suitable feedback controlled models. these strategies are suboptimal with respect to the original problem (5)-(6), but they have proved to be very successful in several social modeling problems [1-4, 18]. to this aim, we consider a short time horizon of length h > 0 and formulate a time-discretized optimal control problem through the functional J_h(u) restricted to the interval [t, t + h]. recalling that the macroscopic information on the infected is i(t) = ∫_λ i(a, t) da, we can derive the minimizer of J_h by computing d_u J_h(u) ≡ 0, which yields a nonlinear equation for the control. introducing the scaling ν(a, a*) = h κ(a, a*), we can pass to the limit for h → 0; hence (12) reduces to the instantaneous control, and the resulting controlled dynamics is retrieved by applying the instantaneous strategy directly in the continuous system (6). in what follows we provide a sufficient condition for the admissibility of the instantaneous control in terms of the penalization term κ(a, a*); indeed, we want to ensure that the dynamics preserve the monotonicity of the number of susceptible individuals s(a, t). proposition 2.1. let β(a, a*) ≥ δ > 0; then, for all (a, a*) ∈ λ × λ, solutions to (14) are admissible if the penalization κ satisfies δ κ(a, a*) ≥ s_0(a) ī(a*) ψ'(ī), where ī(a) and ī are respectively the peak reached by the infected of class a and by the total population. proof. by imposing the non-negativity of the controlled reproduction rate inside the integral, we obtain an inequality that has to be satisfied for every time t ≥ 0. next, we observe that the number of susceptible s(a, t) is decreasing in time, therefore s_0(a) ≥ s(a, t) for all t.
moreover, i(a, t) reaches a peak before decreasing to 0 (note that this peak can also be at t = 0), say ī for the macroscopic variable and ī(a) for class a. thus, thanks to the monotonicity of ψ(·), we can restrict the previous inequality as δ κ(a, a*) ≥ s_0(a) ī(a*) ψ'(ī). in figure 1 we report the phase diagram of susceptible-infected trajectories for the controlled model with homogeneous mixing and ψ(i) = i^q/q. the dynamics is similar to the classical sir model but with a nonlinear contact rate. in particular, the trajectories are flattened when the value of the control is such that κβ ≈ s(t) i(t) ψ'(i(t)) and the reproduction number is close to zero. note, however, that this status is not an equilibrium point of the system. to understand this, let us observe that an equilibrium state (s*, i*) for (16) satisfies the equations (κβ − s* i* ψ'(i*)) s* i* = 0 and ((κβ − s* i* ψ'(i*)) s* − κγ) i* = 0. an equilibrium point corresponds to the classical state in which we have the extinction of the disease, i* = 0 with s* arbitrary and defined by the asymptotic state of the dynamics [23]. if instead we suppose that i* ≠ 0 and s* ≠ 0, we can look for solutions where the control is able to perfectly balance the spread of the disease. since the beginning of the outbreak of new infectious diseases, the actual number of infected and recovered people is typically underestimated, causing fatal delays in the implementation of public health policies facing the propagation of epidemic fronts. this is the case of the spreading of covid-19 worldwide, often mistakenly underestimated due to deficiencies in surveillance and diagnostic capacity [25, 37]. health systems are struggling to adopt systematic testing to monitor actual cases. moreover, another important epidemiological factor that pollutes the available data is the proportion of asymptomatic cases [25, 31, 41].
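the flattening mechanism described above can be reproduced with a small simulation; consistent with the equilibrium relations just stated, the sketch below takes the effective contact rate as β − s i ψ'(i)/κ clipped at zero, with ψ(i) = i^q/q so that ψ'(i) = i^(q−1). this is a minimal reading of the controlled homogeneous dynamics, with all parameter values illustrative.

```python
def controlled_peak(beta, gamma, kappa, q, i0, dt=1e-3, t_end=30.0):
    """Peak of infected for the feedback-controlled homogeneous SIR,
    with effective contact rate max(0, beta - s*i*psi'(i)/kappa) and
    psi(i) = i^q / q, so psi'(i) = i**(q - 1)."""
    s, i, peak = 1.0 - i0, i0, i0
    for _ in range(int(t_end / dt)):
        beta_eff = max(0.0, beta - s * i * i ** (q - 1) / kappa)
        new_inf = beta_eff * s * i
        s -= dt * new_inf
        i += dt * (new_inf - gamma * i)
        peak = max(peak, i)
    return peak

peak_controlled = controlled_peak(2.5, 1.0, kappa=1e-3, q=1, i0=1e-4)
peak_free = controlled_peak(2.5, 1.0, kappa=1e12, q=1, i0=1e-4)  # kappa -> inf
```

for a small penalization κ the infected curve settles on a low plateau where the control balances the spread, well below the uncontrolled epidemic peak.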
among the common sources of uncertainty for dynamical systems modeling epidemic outbreaks we may consider the following:
• noisy and incomplete available data;
• structural uncertainty due to the possible inadequacy of the mathematical model used to describe the phenomena under consideration.
in the following we consider the effects on the dynamics of uncertain data, such as the initial conditions on the number of infected people or the interaction and recovery rates. on the numerical level we consider techniques based on stochastic galerkin methods, for which spectral convergence on the random variables is obtained under appropriate regularity assumptions [40]. for simplicity, the details of the numerical method that allows the uncertain dynamical system to be reduced to a set of deterministic equations are reported in appendix a. we introduce the random vector z = (z_1, . . . , z_{d_z}), whose components are assumed to be independent real-valued random variables, measurable with respect to the borel σ-algebra. we assume to know the probability density p(z) : R^{d_z} → R^{d_z}_+ characterizing the distribution of z. here, z ∈ R^{d_z} is a random vector taking into account various possible sources of uncertainty in the model. in presence of uncertainties we extend the initial modelling by introducing the quantities s(z, a, t), i(z, a, t) and r(z, a, t), representing the distributions at time t ≥ 0 of susceptible, infectious and recovered individuals. the total size of the population is a deterministic quantity conserved in time; hence, the integrated quantities denote the uncertain fractions of the population that are susceptible, infectious and recovered, respectively, and the socially structured model with uncertainties is obtained by replacing the state variables of (2) with their uncertain counterparts. to illustrate the impact of uncertainties let us consider the following simple example. in the case of homogeneous mixing with uncertain contact rate β(z) = β + αz, α > 0, with z ∈ R distributed as p(z), the model reads as the classical sir dynamics with β replaced by β(z), with deterministic initial values i(z, 0) = i_0 and s(z, 0) = s_0.
the solution for the proportion of infectious during the initial exponential phase is [38] i(z, t) = i_0 e^{(β(z) s_0 − γ) t}, and its expectation is E[i(·, t)] = i_0 e^{(β s_0 − γ) t} W(t), where the function W(t) = E[e^{α z s_0 t}] represents the statistical correction factor to the standard deterministic exponential phase of the disease i_0 e^{(β s_0 − γ) t}. if z is uniformly distributed in [−1, 1] we can explicitly compute W(t) = sinh(α s_0 t)/(α s_0 t). more in general, if z has zero mean then by jensen's inequality we have W(t) > 1 for t > 0, so that the expected exponential phase is amplified by the uncertainty (see [38]). in a similar way, keeping β constant but introducing a source of uncertainty in the initial data, i(z, 0) = i_0 + µz, with µ > 0 and z ∈ R distributed as p(z), the solution in the exponential phase reads i(z, t) = (i_0 + µz) e^{(β s_0 − γ) t}, and its expectation is E[i(·, t)] = (i_0 + µz̄) e^{(β s_0 − γ) t}, where z̄ is the mean of the variable z. therefore, the expected initial exponential growth behaves as the one with deterministic initial data i_0 + µz̄. of course, if both sources of uncertainty are present, the two effects just described sum up in the dynamics. in presence of uncertainties the optimal control problem (5)-(6) is modified by replacing ψ(i(t)) with R[ψ(i(·, t))], a suitable operator taking into account the presence of the uncertainties z. examples of such operators that are of interest in epidemic modelling are the expectation with respect to the uncertainties or, relying on data which underestimate the number of infected, min_{z∈R^{d_z}} ψ(i(z, t)). in (22), U is the space of admissible controls, and the minimization is subject to the uncertain dynamics with initial conditions i(z, a, 0) = i_0(z, a), s(z, a, 0) = s_0(z, a) and r(z, a, 0) = r_0(z, a). the implementation of the instantaneous control for the dynamics in presence of uncertainties follows from the derivation presented in section 2.3, and we omit the details. the resulting feedback control u(a, a*, t) defines the feedback controlled model in presence of uncertainties. in this section we present several numerical tests on the constrained compartmental model with uncertain data.
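the amplification effect W(t) > 1 can be checked numerically; the sketch below assumes an early exponential phase i(z, t) = i_0 e^{(β(z)s_0 − γ)t} with β(z) = β + αz and z uniform on [−1, 1], in which case the correction factor is W(t) = E[e^{αzs_0 t}] = sinh(αs_0 t)/(αs_0 t) (this specific closed form is a reconstruction from the surrounding text, not quoted from it).

```python
import numpy as np

rng = np.random.default_rng(0)

def correction_mc(alpha, s0, t, n=200_000):
    """Monte Carlo estimate of W(t) = E[exp(alpha*z*s0*t)], z ~ U(-1, 1):
    the factor multiplying the deterministic exponential phase when the
    contact rate is uncertain, beta(z) = beta + alpha*z."""
    z = rng.uniform(-1.0, 1.0, n)
    return np.exp(alpha * z * s0 * t).mean()

def correction_exact(alpha, s0, t):
    """Closed form for uniform z on [-1, 1]: sinh(x)/x with x = alpha*s0*t."""
    x = alpha * s0 * t
    return np.sinh(x) / x if x != 0 else 1.0

w_mc = correction_mc(1.0, 1.0, 2.0)
w_exact = correction_exact(1.0, 1.0, 2.0)
```

the monte carlo estimate matches the closed form, and both exceed one, illustrating jensen's inequality for a zero-mean uncertainty.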
details of the numerical method used are given in appendix a. in an attempt to analyse sufficiently realistic scenarios, in the following examples we will refer to values taken from italian data on the covid-19 epidemic [36]. more precisely, in the first test case we illustrate the behavior of the model in a simplified setting, in the absence of uncertainty and social structure, and without trying to reproduce scenarios closely related to current data. in the second test case, following a progressively more realistic approach, we consider the impact of the presence of uncertain data in the controlled model with homogeneous social mixing, calibrated on italian data. the same setting is then considered in test 3, taking into account the additional effect on the spreading of the infectious disease given by the social structure of the system, described by suitable social interaction functions. the final scenario, explored in test 4, examines the possibility of reducing the amplitude of the epidemic peak by applying relaxed confinement measures related to the social structure of the system. to illustrate the effects of the controls introduced to mimic containment procedures, let us first consider the case where the social structure is not present. furthermore, to simplify the modeling further, in this first example we neglect any dependence on uncertain data. we consider as initial data a small number of infected and recovered, i(0) = 3.68 × 10^{-6} and r(0) = 8.33 × 10^{-8}. these normalized fractions refer to the first reported values in the case of the italian outbreak of covid-19, even if in this simple test case we will not try to match the data in a predictive setting but simply illustrate the behavior of the feedback controlled model. based on recent studies [25, 41], the initial infection rate of covid-19 has been estimated to correspond to a basic reproduction number r_0 = β/γ between 2 and 6.5.
here, to exemplify the possible evolution of the pandemic, we consider a value close to the lower bound, taking β = 25 and γ = 10, so that r_0 = 2.5. in figures 2 and 3 we report the infected and recovered dynamics based on the activation of the control in two different time frames: in figure 2 the activation window is t ∈ [0.5, 1], which means that for t > 1 we suppose that the containment restrictions are deactivated; in figure 3 we consider a larger activation window, t ∈ [0.5, 2]. with the choice ψ(i) = i^q/q, q ≥ 1, we can observe how the control term is able to flatten the curve, even if, as expected, the case q = 2 gives rise to a weaker control action. note, however, that if the activation time is too short the control is not able to significantly change the total number of infected (and therefore recovered). on the other hand, by enlarging the activation time in combination with a sufficiently small penalty constant, the infection peak is not only reduced, but the total number of infected people is decreased. to achieve this, the control should be kept active for a sufficiently long time and with the right intensity, in a kind of plateau regime where there is a perfect balance between the containment effect and the spread of the disease. on the contrary, if the control is too strong, the majority of the population remains susceptible and consequently the disease will start spreading again, forming a second wave after the containment policy is removed. the cost functional depends on the value of q and can be evaluated by summing up the contributions in (9), with the explicit form of the control given by (13). in figure 4 the cost of the two interventions is compared: a higher cost is associated with q = 1, while comparable results can be obtained with q = 2 using weaker penalizations. then, in figure 5 we compare the performance of the two controls with a deactivation time in the range t ∈ [1, 3].
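the role of the activation window can be explored with a simulation of the same kind; the sketch below uses the illustrative values β = 25, γ = 10, i(0) = 3.68 × 10^{-6} mentioned above, with q = 1 and an assumed penalization κ = 10^{-3} (not a value quoted in the text).

```python
def windowed_peak(beta, gamma, kappa, window, i0, dt=1e-4, t_end=5.0):
    """Peak infected for SIR with the instantaneous control (q = 1,
    effective rate max(0, beta - s*i/kappa)) active only while
    window[0] <= t <= window[1]."""
    s, i, t, peak = 1.0 - i0, i0, 0.0, i0
    for _ in range(int(t_end / dt)):
        active = window[0] <= t <= window[1]
        b = max(0.0, beta - s * i / kappa) if active else beta
        new_inf = b * s * i
        s -= dt * new_inf
        i += dt * (new_inf - gamma * i)
        peak = max(peak, i)
        t += dt
    return peak

peak_short = windowed_peak(25.0, 10.0, 1e-3, (0.5, 1.0), 3.68e-6)
peak_long = windowed_peak(25.0, 10.0, 1e-3, (0.5, 2.0), 3.68e-6)
peak_unc = windowed_peak(25.0, 10.0, 1e-3, (10.0, 11.0), 3.68e-6)  # never active
```

a short window leaves most of the population susceptible at deactivation, so a second wave with a nearly uncontrolled peak follows; the longer window depletes the susceptibles during the plateau and yields a visibly lower rebound.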
to align the costs of the interventions, we consider κ = 10^{-3} for q = 1 and κ = 10^{-4} for q = 2. it can be observed that there is a minimum control horizon for both strategies in order to avoid the onset of a second infection peak; a sufficiently long control horizon is therefore necessary to reduce the impact of the infection. next we focus on the influence of uncertain quantities on the controlled system with homogeneous mixing, focusing on the available data for the covid-19 outbreak in italy, see [36]. according to recent results on the diffusion of covid-19 in many countries, the number of infected, and therefore of recovered, is largely underestimated in the official reports, see e.g. [25, 31]. the estimation of epidemiological parameters is a very difficult problem that can be addressed with many different approaches [9, 16, 38]. in our case, we have limited ourselves to identifying the deterministic parameters of the model through a standard fitting procedure, considering the possible uncertainties due to such an estimation as part of the subsequent uncertainty quantification process. it has been reported, in fact, that deterministic methods based on compartmental models overestimate the effective reproduction number [30]. in order to calibrate the model with the reported quantities, we solved two separate constrained optimization problems in absence of uncertainties. first we estimated β and γ by considering, on the time interval [t_0, t_u], the least squares problem min_{β,γ} ∫_{t_0}^{t_u} |i(t) − î(t)|^2 dt, namely the l^2 norm of the difference between the observed number of infected î(t) and the theoretical evolution of the unconstrained model (κ = +∞). problem (27) has to be solved under the constraints β ≥ 0, γ ≥ 0, over the time span [t_0, t_u] where we assumed no social containment procedure was activated.
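the fitting step can be sketched as follows; instead of a gradient-based solver, a simple grid search over nonnegative (β, γ) pairs minimizes the discrete analogue of the l² misfit, with synthetic "observed" data standing in for the reported counts.

```python
import numpy as np

def sir_infected(beta, gamma, i0, ts, dt=1e-3):
    """Trajectory of i(t) for the unconstrained SIR model, sampled at ts."""
    s, i, t, out = 1.0 - i0, i0, 0.0, []
    for t_target in ts:
        while t < t_target - 1e-12:
            new_inf = beta * s * i
            s -= dt * new_inf
            i += dt * (new_inf - gamma * i)
            t += dt
        out.append(i)
    return np.array(out)

def fit_beta_gamma(ts, i_obs, i0, betas, gammas):
    """Least-squares calibration: minimize sum |i(t) - i_obs(t)|^2 over a
    grid of candidate rates, honoring the constraints beta, gamma >= 0."""
    best, best_err = None, np.inf
    for b in betas:
        for g in gammas:
            err = np.sum((sir_infected(b, g, i0, ts) - i_obs) ** 2)
            if err < best_err:
                best, best_err = (b, g), err
    return best

ts = np.linspace(0.1, 1.0, 10)
i_obs = sir_infected(25.0, 10.0, 3.68e-6, ts)          # synthetic data
fit = fit_beta_gamma(ts, i_obs, 3.68e-6,
                     betas=[15.0, 20.0, 25.0, 30.0],
                     gammas=[5.0, 10.0, 15.0])
```

in practice a continuous constrained optimizer would replace the grid, but the structure of the misfit functional is the same.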
next, we estimate the penalization κ = κ(t) in time by solving, over a sequence of time steps t_i, an analogous least squares problem on a window of neighboring data, under the constraint κ(t_i) > 0, with k_l, k_r ≥ 1 integers, where for the evolution we use the values of β and γ estimated in the first optimization step (27). in detail, since the available data start on february 24 2020, when no social restrictions were enforced by the italian government, and since the lockdown started on march 9, we considered t_u = 14 (days). the second fitting procedure has been activated up to the last available data with daily time stepping (h = 1) and a window of eight days (k_l = 3, k_r = 5) for regularization along one week of available data. solving problem (27) we obtained estimates corresponding to an initial attack rate r_0 ≈ 6.25, which agrees with other observations [30]. the corresponding time dependent values of the control in the case ψ(i) = i^q/q are reported in figure 6. after an initial adjustment phase, the penalty terms converge towards a constant value that we can assume as fixed, in predictive terms, for future times in a lockdown scenario. this is consistent with a situation in which society needs a certain period of time to adapt to the lockdown policy. in order to have an insight into the global impact of the uncertain parameters, we consider a two-dimensional uncertainty z = (z_1, z_2) with independent components, such that i(z_1, 0) = i_0 + µ z_1 and β(z_2) = β_e + α z_2, where i_0 is the same as in test 1, taken from [36] on february 24, 2020, and β_e and γ_e are estimated from (27). of course, other probability distributions p(z) = p_1(z_1) p_2(z_2), defined only for z_1, z_2 ≥ 0, are possible, like beta distributions [38, 40]. however, qualitatively the results do not change, and the introduction of a beta distribution would imply dependence on an additional parameter for each random variable, so we limit ourselves to the simple situation of uniform uncertainty.
in all the considered evolutions we adopted a stochastic galerkin approach (see appendix a) with m = 10 and a fourth order runge-kutta method for the time integration. the feedback controlled model has been computed using, as estimate of the total number of infected, the reported values, namely the lower bound of the uncertain initial data, in agreement with the fitting procedure. in figure 7 we represent the evolution of the expected value of the number of infected obtained by the controlled model in the presence of random initial data (29) and uncertain contact frequency (30). the value µ = 10 has been chosen according to the who suggestion that around 80% of cases are asymptomatic 1, and the uncertainty in the contact rate has been modeled taking α = 1. we represent the expected values of the number of infected along with the confidence bands obtained from the overall variance in z. the range of plausible values of the true number of infected increases dramatically in both cases, q = 1 and q = 2, with a strong uncertainty on the effective numbers. the bars below the graph are the reported values of the number of infected on which the model has been calibrated. from (31) we can estimate the effect of the lockdown policies on the value of r_0 in time. the effect of the uncertain initial estimated value of r_0 is translated into the confidence bands from z_2 reported in the left plot of figure 8. the results show that the reproduction number r_0, thanks to the drastic containment actions, has been drastically reduced and its expected value is now just below one. this shows the importance of continuing with containment policies to avoid a restart of the spread of the epidemic. we now analyze the effects of the inclusion of age dependence and social interactions in the above scenario. more precisely, we consider the social interaction functions reported in appendix b and an uncertain initial number of infected.
these functions were normalized using the previously estimated parameters β_e and γ_e in accordance with (32), where λ = [0, a_max], a_max = 100, and where γ(a), according to recent studies on age-related recovery rates [39], has been chosen as a decreasing function of the age, with r = 5 and c ∈ R such that (32) holds. clearly, this choice involves a certain degree of arbitrariness, since there are not yet sufficient studies on the subject; nevertheless, as we will see in the simulations, it is able to reproduce more realistic scenarios in terms of the age distribution of the infected without altering the behaviour of the total number of infected. we divided the computational time frame into two zones and used different models in each zone, in accordance with the policy adopted by the italian government: the first time interval covers the period without any form of containment, from february 24 to march 9; the second covers the lockdown period, from march 9 on. in the first zone we adopted the uncontrolled model with homogeneous mixing for the estimation of the epidemiological parameters.

figure 10 (test 3): expected number of infected in time for ψ(i) = i^q, q = 1 (left) and q = 2 (right), together with the confidence bands, in the case of the covid-19 outbreak in italy for homogeneous mixing or social mixing (top row) and, in the social mixing scenario, age-independent or age-dependent recovery rate (bottom row). the abscissae are measured in days starting from the beginning of the epidemic.

figure 11 (test 3): expected number of recovered in time for ψ(i) = i^q, q = 1 (left) and q = 2 (right), together with the confidence bands, in the case of the covid-19 outbreak in italy for the social mixing scenario, age-independent or age-dependent recovery rate. the abscissae are measured in days starting from the beginning of the epidemic.
hence, in the second zone we compute the evolution of the feedback controlled age dependent model (25)-(26), with (on average) matching interaction and recovery rates (32), and with the estimated control penalization κ(t) as reported in figure 6, until april 23rd. after april 23rd the computation advances in time using as penalization term the constant asymptotic value κ̄ reached by κ(t). the initial values for the age distributions of susceptible and infectious individuals are shown in figure 9, in agreement with reported data 2. in figure 10 we report in the top row the results for the expected number of infected, with the related confidence bands, in the cases of homogeneous mixing and of social mixing with constant recovery rate γ_e; at the bottom we compare the cases of constant and age-dependent recovery rates in the social mixing scenario. in general, the homogeneous mixing hypothesis leads to an underestimation of the maximum number of infected and shows a slower decay over time, the latter effect being also found in the presence of an age-dependent recovery rate. the expected total number of recovered people is shown in figure 11. finally, in figure 12 we report the expected age distribution of infectious individuals in time. we remind the reader that all the simulations presented in this section have been obtained assuming that the lockdown measures introduced on march 9 are maintained for all subsequent times.

figure 12 (test 3): expected age distribution of infectious individuals over a long time horizon of 200 days in the social mixing scenario with γ = γ_e (left) and age dependent recovery γ(a) defined in (33).

one of the major problems in the application of very strong containment strategies, such as the lockdown applied in italy, is the difficulty in maintaining them over a long period, both for the economic impact and for the impact on the population from a social point of view.
in order to analyze sustainable control strategies, it is therefore necessary to resort to models with a social structure and to control methods based on specific forms of social distancing that allow the economy to restart and the population to dedicate itself, albeit in a limited way, to its pre-pandemic activities. in accordance with the interaction function characterizing the productive activities introduced in appendix b (see also figure 14), we considered the following age-dependent penalization: κ(a, a*) = κ_w if a, a* ∈ λ_p, and κ(a, a*) = κ_s elsewhere, where λ_p defines the typical age group related to productive activities, for example λ_p = [20, 60], and 1/κ_s > 1/κ_w characterize two control actions related to a strong and a weaker containment of social distances. we will assume 1/κ_s = p_s/κ̄, where κ̄ is the asymptotic value calculated for the penalization parameter in the lockdown period, while κ_w has been relaxed compared to κ_s, i.e. 1/κ_w = p_w/κ̄, with 1 ≥ p_s ≥ p_w. in figure 13 we report the evolution of the age-controlled model in the case q = 1. we have considered two possible scenarios that reproduce the evolution of the infected under the hypothesis of a relaxation of the lockdown measures on may 4, as foreseen by the italian government. in the first scenario, on the left, we consider two intervals that simulate the two different controlled periods, measured in days from the beginning of the epidemic: t_1 = [14, 69] with p_s = p_w = 1, and t_2 = [70, 118], where the social distancing measures are loosened with p_w = 0.6 and p_s = 0.9. in figure 13 (left) we compare this choice with a situation where the so-called phase 2 control measures in the interval t_2 are further loosened by choosing p_w = 0.3 and p_s = 0.6. the respective numbers of infected in these two situations are denoted by i_u1 and i_u2.
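the two-level penalization can be written directly from its definition; a small helper with λ_p = [20, 60] as in the text (κ̄, p_s and p_w values are illustrative):

```python
def kappa(a, a_star, kappa_bar, p_s, p_w, prod=(20.0, 60.0)):
    """Age-dependent penalization: kappa_w = kappa_bar / p_w when both
    ages lie in the productive bracket lambda_p (weaker containment),
    kappa_s = kappa_bar / p_s elsewhere, with 1 >= p_s >= p_w."""
    in_prod = prod[0] <= a <= prod[1] and prod[0] <= a_star <= prod[1]
    return kappa_bar / p_w if in_prod else kappa_bar / p_s

k_work = kappa(30.0, 45.0, kappa_bar=1e-3, p_s=0.9, p_w=0.6)
k_other = kappa(30.0, 70.0, kappa_bar=1e-3, p_s=0.9, p_w=0.6)
```

since the feedback control scales like 1/κ, the larger κ_w inside λ_p corresponds to a weaker containment of the working-age interactions.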
it is easily observed how the number of infected systematically increases with successive relaxations of the initial strict social-distancing measures. in figure 13 (bottom, left) we also report the evolution of the age-dependent expected number of infected individuals. the second scenario assumes that, after an initial opening in which few productive activities are allowed, the government gradually loosens the containment procedures after a certain period of time. in this case we assumed three control zones: t_1 = [14, 69] with p_s = p_w = 1; the interval t_2 = [70, 98] with p_s = 0.9, p_w = 0.6; and finally a third period t_3 = [99, 118] with p_s = 0.8, p_w = 0.5. this last period corresponds to a further relaxation of the confinement measures on june 1st. the solution for the number of infected is denoted by i_u3 and is shown in figure 13 (right). for comparison we also report the full lockdown solution, denoted by i_u0. in the lower part of the figure on the right, the corresponding distributions of infected individuals by age over time are also reported. as can be seen, a gradual strategy allows to contain the peak of contagions and to obtain a result comparable with the fully controlled model at a much lower social cost. the timing and intensity of the interventions, however, are crucial to prevent a restart of the epidemic wave. quantifying the impact of uncertain data in the context of an epidemic emergency is essential in order to design appropriate containment measures.

figure 13 (test 4): top row: evolution of the total number of infected in a two-zone scenario identified by κ_1 and κ_2 (left) and in a three-zone scenario identified by κ_0 and κ_3 (right). bottom row: age distribution of infected in the two-zone (left) and three-zone (right) scenarios.
such containment measures, implemented by several countries in the course of the covid-19 epidemic, have proved effective in reducing the reproduction number r_0 to below, or very close to, one. these large-scale non-pharmaceutical interventions vary from country to country but include social distancing (banning large mass events, closing public places and advising people not to socialize outside their families), closing borders, closing schools, measures to isolate symptomatic individuals and their contacts, and the large-scale lockdown of populations with all but essential travel prohibited. one of the main problems is the sustainability of these interventions, which, until the introduction of a vaccine, will have to be maintained in the field for long periods. however, estimating the reproduction numbers of sars-cov-2 is a major challenge due to the high proportion of infections not detected by health care systems and to differences in test application, resulting in diverse proportions of infections detected over time and between countries. most countries currently have only the capacity to test a small proportion of suspected cases, and tests are reserved for severely ill patients or for high risk groups; the available data therefore offer only a partial overview of trends. in this article, starting from a sir-type compartmental model with social structure, we have developed new mathematical models describing the actions of a government agency controlling the population in order to reduce the estimated number of infected people in the presence of uncertain data. this hypothesis allows us to derive a model that contains the control action in feedback form, based on the policy maker's perception of the spread of the disease.
subsequently, the model has been solved in the presence of uncertain data by expanding the state variables into orthogonal polynomials in the uncertainty space, which reduces the problem to a set of deterministic equations for the distribution of the solution through the course of the epidemic. the resulting controlled dynamical system then admits a deterministic formulation of the stochastic solution that enables efficient estimation of the uncertain parameters. the numerical simulations, carried out using data from the recent covid-19 outbreak in italy, show, on the one hand, the model's ability to describe well lockdown scenarios aimed at flattening the infection curve and, on the other hand, how the high uncertainty of the data on the number of infected people makes it very difficult to provide long-term quantitative forecasts. by identifying some plausible scenarios in accordance with the literature, we have studied containment measures sustainable by the population, based on the resumption of certain occupational activities characterized by specific age groups and social interaction matrices. the results demonstrate the effectiveness of such approaches based on the social structure of the system: they achieve results similar to much more restrictive policies in reducing the risk that the virus may return to spread when the restrictions are lifted, but at a significantly lower socio-economic cost. further studies in this direction will be aimed at considering more realistic epidemic models than the one analyzed in this work, together with multiple control terms specific to each social activity, in order to design optimal strategies that mitigate the overall epidemic impact. in this appendix we give the details of the stochastic galerkin (sg) method used to solve the feedback-controlled system (25)-(26) with uncertainties. to this aim, we consider a random vector z = (z_1, . . . , z_{d_z}) with independent components, whose distribution is p(z): r^{d_z} → r_+.
the stochastic galerkin approximation of the differential model (25)-(26) is based on stochastic orthogonal polynomials and provides a spectrally accurate solution under suitable regularity assumptions, see [40]. we consider the linear space p_M of polynomials of degree up to M, generated by a family of polynomials {φ_h(z)}_{|h|=0}^{M} that are orthonormal in the space l²(ω), i.e.

e[φ_h(z) φ_k(z)] = ∫ φ_h(z) φ_k(z) p(z) dz = δ_hk,   |h|, |k| = 0, . . . , M,

where k = (k_1, . . . , k_{d_z}) is a multi-index with |k| = k_1 + · · · + k_{d_z}, δ_hk is the kronecker delta, and e[·] denotes the expectation with respect to p(z). the construction of the polynomial basis {φ_h(z)}_{|h|=0}^{M} depends on the distribution of the uncertainties and must be chosen in agreement with the askey scheme [40]. we summarise in table 1 several polynomial bases in connection with the law of a random component of z. the resulting system is then integrated in time directly in the space of projections. we remark that statistical quantities of interest, such as the expectation and the variance of the infected, can be recovered from the projection coefficients: since φ_0 ≡ 1, the expectation is e[i_M] = î_0, whereas for the variance we get var[i_M] = Σ_{|k|=1}^{M} î_k².

in this appendix we report the details of the social interaction functions that characterize the dynamics of social mixing. these characteristics are in fact crucial for a correct prediction of the outcomes, especially for diseases transmitted by close contacts. several large-scale studies have been designed in the last decade to determine relevant age-based models of social mixing. without attempting to review the vast literature on this topic, we mention [6, 33, 35] and the references therein. the number of contacts per person generally shows considerable variability according to age, occupation, country and even day of the week, in relation to the social habits of the population. nevertheless, some universal behaviours can be extracted which emerge as a function of specific social activities. social mixing is highly age-related, which means that people usually tend to interact with other people of a similar age.
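the recovery of expectation and variance from the gpc projection coefficients described earlier in this appendix can be sketched in a few lines (the coefficient values below are illustrative only):

```python
import numpy as np

def gpc_mean(coeffs):
    # with phi_0 = 1 and an orthonormal basis, E[i_M] is the zeroth coefficient
    return coeffs[0]

def gpc_variance(coeffs):
    # Var[i_M] = sum of the squared higher-order coefficients
    c = np.asarray(coeffs, dtype=float)
    return float(np.sum(c[1:] ** 2))

coeffs = [2.0, 0.5, 0.25]  # illustrative projection coefficients i_hat_k
```

this is the practical payoff of the orthonormal basis: no sampling of z is needed once the deterministic system for the coefficients has been integrated.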
young people have a high rate of contact with adults aged around 30-39 and with older people over 60, i.e. their parents and grandparents respectively. contact rates are indeed very high at home and at school. on the other hand, professional mixing is weakly assortative by age and tends to be determined by uniform interactions, approximately between people from 25 to 60 years old. for these reasons we consider an interaction function determined by three main sub-functions that characterize the family, the school and the professional mixing. therefore, a stylized function approximating a realistic contact matrix can be written as

β(a, a*) = Σ_{j∈A} β_j(a, a*),

where the functions β_j(a, a*) take into account the different contact rates of people with ages a and a* in relation to specific social activities of the type A = {F, E, P}, where we identify family contacts with F, education and school contacts with E, and professional contacts with P.

figure 14: from left to right, contour plot of the social interaction functions β_F, β_E and β_P taking into account the different contact rates of people with ages a and a* in relation to specific social activities. the function β_F characterizes the family contacts, β_E the education and school contacts, and β_P the professional contacts.

the particular structure of these social interaction matrices was determined empirically in [6, 35]. here, according to these observations, we propose suitable mathematical functions that can be calibrated to reproduce the empirical observations. in detail, family contacts tend to concentrate on a three-band matrix with a peak around younger ages. this can be reproduced by considering the function

β̃_F(a, a*) = β_0 + [λ_{F,0} / (λ_{F,1} + (a² + a*²))] · e^{−λ_{F,2}(a² + a*²)} / (1 + (a − a*)²)^{λ_{F,3}},   λ_{F,0}, λ_{F,1}, λ_{F,2}, λ_{F,3} > 0.
hence, we define the family interactions as

β_F(a, a*) = c_F [ β̃_F(a, a*) + α Σ_{μ_1, μ_2 = ±μ} β̃_F(a + μ_1, a* + μ_2) ],

where μ > 0 is the age shift at which family contacts occur. on the other hand, school and professional interactions are more age-specific and the corresponding matrices can be reproduced as follows:

β_E(a, a*) = c_E [ β_0 + exp( −((a − λ_E)² + (a* − λ_E)²) / σ_E² ) ],
β_P(a, a*) = c_P [ β_0 + exp( −((a − λ_P)⁴ + (a* − λ_P)⁴) / σ_P² ) ],

where c_F, c_E, c_P > 0 are normalizing constants such that ∫∫ β(a, a*) da da* = 1, λ_E > 0 is the average contact age at school, and λ_P > 0 is the average professional contact age. in figures 14 and 15 we represent the three social interaction functions and the resulting global social interaction function β(a, a*). the details of the parameters used in the simulations are reported in table 2.

figure 15: the social interaction function β = β_F + β_E + β_P.

table 2: the parameters defining the details of the interaction functions used in the simulations.
β_F(a, a*):  β_0 = 4/3, λ_{F,0} = 2, λ_{F,1} = 1/100, λ_{F,2} = 8, λ_{F,3} = 100, μ = 3/10, α = 1/4
β_E(a, a*):  β_0 = 4/3, σ_E² = 1/20, λ_E = 21/20
β_P(a, a*):  β_0 = 4/3, σ_P² = 1/40, λ_P = 2/5

[3] g. albi, m. herty, l. pareschi. kinetic description of optimal control problems and applications to opinion consensus. commun. math. sci., 13(6):1407-1429, 2015.
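the school and professional kernels above can be evaluated directly once the parameters of table 2 are fixed. a sketch under the reconstructed grouping of terms, with the normalizing constants c_E, c_P omitted and ages assumed rescaled to a unit-like interval:

```python
import math

def beta_e(a, a_star, beta0=4/3, sigma2_e=1/20, lam_e=21/20):
    # school/education mixing: gaussian-type peak at the average school contact age
    return beta0 + math.exp(-((a - lam_e) ** 2 + (a_star - lam_e) ** 2) / sigma2_e)

def beta_p(a, a_star, beta0=4/3, sigma2_p=1/40, lam_p=2/5):
    # professional mixing: flatter, quartic peak around the average working age
    return beta0 + math.exp(-((a - lam_p) ** 4 + (a_star - lam_p) ** 4) / sigma2_p)
```

both kernels are symmetric in (a, a*) by construction and attain their maximum β_0 + 1 on the diagonal at the respective average contact age, which matches the qualitative behaviour described in the text.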
- selective model-predictive control for flocking systems
- uncertainty quantification in control problems for flocking models
- boltzmann-type control of opinion consensus through leaders
- optimal control of a sir epidemic model with general incidence function and time delays
- the french connection: the first large population-based contact survey in france relevant for the spread of infectious diseases
- time-optimal control strategies in sir epidemic models
- (un)conditional consensus emergence under perturbed and decentralized feedback controls
- parameter estimation and uncertainty quantification for an epidemic model
- a generalization of the kermack-mckendrick deterministic epidemic model
- towards uncertainty quantification and inference in the stochastic sir epidemic model
- sparse stabilization and optimal control of the cucker-smale model
- epidemiological models with age structure, proportionate mixing, and cross-immunity
- optimizing vaccination strategies in an age structured sir model
- optimal control for pandemic influenza: the role of limited antiviral treatment and isolation
- fitting dynamic models to epidemic outbreaks with quantified uncertainty: a primer for parameter uncertainty, identifiability, and forecasts
- uncertainty quantification for kinetic models in socioeconomic and life sciences
- kinetic models for optimal control of wealth inequalities
- estimating the number of infections and the impact of non-pharmaceutical interventions on covid-19 in 11 european countries
- threshold behaviour of a sir epidemic model with age structure and immigration
- mixing in age-structured population models of infectious diseases
- modeling heterogeneous mixing in infectious disease dynamics, in: models for infectious human diseases
- the mathematics of infectious diseases
- analytical and numerical results for the age-structured s-i-s epidemic model with mixed inter-intracohort transmission
- correcting under-reported covid-19 case numbers: estimating the true scale of the pandemic
- uncertainty quantification for hyperbolic and kinetic equations, sema-simai springer series
- a contribution to the mathematical theory of epidemics
- modeling optimal age-specific vaccination strategies against pandemic influenza
- an optimal control theory approach to non-pharmaceutical interventions
- the reproductive number of covid-19 is higher compared to sars coronavirus
- estimating the asymptomatic proportion of coronavirus disease 2019 (covid-19) cases on board the diamond princess cruise ship
- optimal, near-optimal, and robust epidemic control
- social contacts and mixing patterns relevant to the spread of infectious diseases
- an introduction to uncertainty quantification for kinetic equations and related problems
- projecting social contact matrices in 152 countries using contact surveys and demographic data
- github: covid-19 italia - monitoraggio situazione
- covid-19 and italy: what next?
- epidemic models with uncertainty in the reproduction
- age-dependent risks of incidence and mortality of covid-19 in hubei province and other parts of china
- estimation of the reproductive number of novel coronavirus (covid-19) and the probable outbreak size on the diamond princess cruise ship: a data-driven analysis

assuming now s(z, a, t), i(z, a, t) and r(z, a, t) in l²(ω), we may approximate these terms through a generalized polynomial chaos expansion in the random space as follows:

s(z, a, t) ≈ s_M(z, a, t) = Σ_{|k|=0}^{M} ŝ_k(a, t) φ_k(z),

and analogously for i_M and r_M, where the quantities ŝ_k, î_k, r̂_k are the projections onto the polynomial space,

ŝ_k(a, t) = e[s(z, a, t) φ_k(z)],

and analogously for î_k, r̂_k. the sg formulation of system (25) is then obtained by replacing s, i, r with s_M, i_M, r_M as defined by (34).
then, thanks to the orthonormality of the polynomial basis of p_M, multiplying by φ_m for all |m| ≤ M and taking the expectation with respect to p(z), we obtain a coupled system of M + 1 deterministic equations for the evolution of each projection coefficient.

key: cord-103280-kf6mqv4e
authors: bergs, thomas; hardt, marvin; schraknepper, daniel
title: determination of johnson-cook material model parameters for aisi 1045 from orthogonal cutting tests using the downhill-simplex algorithm
date: 2020-12-31
journal: procedia manufacturing
doi: 10.1016/j.promfg.2020.05.081
sha:
doc_id: 103280
cord_uid: kf6mqv4e

abstract: despite the increasing digitalization of manufacturing processes in the context of industry 4.0, the process design and development of machining processes pose major challenges for today's manufacturing technology. compared to the conventional process design, which is influenced by an empirical "trial-and-error" principle, the simulative process design offers the possibility of reducing development time and costs while at the same time improving the process understanding. a possible simulation technique to achieve these goals is the finite element method (fem). the fem enables the calculation of the thermo-mechanical load spectrum underlying the machining process. therefore, different input models are required. one of the most critical input models is the material model, which describes the constitutive material behavior. to determine the material model parameters, either (conventional) material tests, which require an extrapolation into the regime of metal cutting, or inverse techniques are used, where the process itself is used as a material test. using the inverse technique, the model parameters are modified iteratively until a predefined agreement between simulations and experiments is achieved. the evaluation of the agreement is based on integral process variables, such as the cutting force, and their simulative counterparts.
however, the procedure of the inverse determination requires high computational effort and is not robust. this paper presents a novel approach to enhance the robustness of the inverse material model parameter determination from the cutting process. orthogonal cutting tests on aisi 1045 steel have been conducted on a broaching machine tool over a range of cutting speeds and undeformed chip thicknesses to set up an experimental database. thereby, the workpiece material was investigated in two different heat treatments: normalized and coarse-grain annealed. the machining experiments showed differences in terms of the integral process results when comparing the two heat treatments. these results motivated the development of a methodology capable of determining material model parameters robustly and inversely from the machining process with lower computational effort. to simulate the machining process, a coupled eulerian-lagrangian (cel) model of the orthogonal cutting process has been set up. the material model parameters have been inversely determined using the downhill-simplex algorithm, which has been modified for this case. by using the downhill-simplex algorithm, it was possible to determine material model parameters within 17 iterations, achieving an average deviation between the experiments and the simulations below 10 %. thereby, different process observables such as temperature, forces, and chip form have been used for the evaluation. through this method, it is possible to determine material model parameters that enable a good match between experiments and simulations with a low computational effort. in the field of machining, the relevance of the fourth industrial revolution is especially reflected by an increasing demand for the virtualization of the process design [1]. in the state of the art, machining processes are empirically designed by a trial-and-error approach [2].
48th sme north american manufacturing research conference, namrc 48 (cancelled due to covid-19)

however, the empirical process design is limited in its capabilities, since it is descriptive rather than predictive. additionally, this method of process design is expensive and time-consuming [3]. to overcome these limitations, simulation techniques are used that exhibit the capability to reduce the time to market of new processes and products [4]. the methods to model the metal cutting process can be divided into the following types: analytical, numerical, artificial intelligence (ai), and hybrid modeling [5]. an example of a numerical method is the finite element analysis (fea), which has found wide application in modeling the machining process, especially in the scientific community [5]. ever since the first application of the finite element method (fem) in the field of engineering by zienkiewicz [6], and especially in the field of machining by klamecki [7] in the 1970s, the use of fem techniques for modeling manufacturing processes has increased significantly. the major advantage of the fem when modeling the machining process is the possibility of calculating process quantities, such as stresses, strains, and strain rates, which cannot be measured during the machining process [8].
however, these process quantities play a major role in understanding the mechanisms within the process and are therefore necessary to enhance the process understanding. when modeling the machining process by means of fem, diverse input models are necessary, such as a friction model and a material model, which are essential for the success and reliability of the predicted results [9]. among these models, the material model has a major impact on the results [10-13]. the material models used for metal cutting simulations can be divided into empirical/phenomenological, semi-empirical, and physical-based material models. thereby, empirical material models describe the material behavior as a function of the strain, strain rate, and temperature, whereas physical-based material models take fundamental microstructural properties such as the dislocation density into account [14]. mostly, empirical material models are used in machining simulations, whereas physical-based material models are only rarely utilized due to the enormous difficulties of modeling the basic material deformation mechanisms under the loads of the metal cutting process [15]. however, as important as the selected material model are the underlying material model parameters. commonly, these parameters are determined by quasi-static or dynamic material tests. an example of a dynamic material test, which has been used for determining material model parameters to describe high-strain and strain-rate flow curves, is the split-hopkinson pressure bar (shpb) test [16; 17]. when using the shpb test, strains up to 0.5, strain rates up to 5·10^5 s^-1, and temperatures up to 1,000 °c are achievable [18]. however, these conditions are far away from those encountered in the machining process. in the process, strains up to 2, strain rates up to 10^6 s^-1, and temperatures between 500 and 1,400 °c can occur [2; 19; 5].
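the empirical johnson-cook (jc) model referred to throughout this paper expresses the flow stress as a product of strain-hardening, strain-rate and thermal-softening terms. a minimal sketch follows; the parameter values shown are one literature-style set for aisi 1045, used purely for illustration, and are not the parameters determined in this work:

```python
import math

def jc_flow_stress(strain, strain_rate, T,
                   A=553.1, B=600.8, n=0.234, C=0.0134, m=1.0,
                   ref_rate=1.0, T_room=20.0, T_melt=1460.0):
    """Johnson-Cook flow stress in MPa:
    sigma = (A + B*eps^n) * (1 + C*ln(epsdot/epsdot_0)) * (1 - T*^m)."""
    T_hom = (T - T_room) / (T_melt - T_room)  # homologous temperature T*
    return ((A + B * strain ** n)
            * (1.0 + C * math.log(strain_rate / ref_rate))
            * (1.0 - T_hom ** m))
```

at the reference strain rate and room temperature the model reduces to the quasi-static hardening curve A + B·ε^n, which is why quasi-static tests alone cannot fix the rate and temperature terms C and m.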
due to these differences in the occurring loads, extrapolation of the determined material behavior into the regime of metal cutting becomes necessary, which can lead to large deviations between the predicted and the actual material behavior [20]. to circumvent the extrapolation of the determined material behavior into the regime of metal cutting, alternative approaches are necessary. within the last decade, inverse techniques have been used, where the process to be modeled is used as a material test itself. thereby, the material model parameters are modified within a simulation until the predicted simulation results match the experimental results [21]. the conformity of simulations and experiments is evaluated based on integral process results, such as the cutting force, chip thickness, chip form or cutting temperature. however, the procedure of the inverse determination is not robust and requires a large number of iterations and, therefore, high computational effort [19]. to overcome these issues when inversely identifying material model parameters, different optimization strategies and algorithms have been used. in the field of sheet metal forming, chaparro et al. investigated the inverse parameter identification of the barlat material model using a genetic algorithm, a gradient-based algorithm, and a combination of both [22]. the results revealed that both algorithms are able to fit the numerical to the experimental data. in the field of machining, özel and karpat used the evolutionary computational algorithm of cooperative particle swarm optimization (cpso) to determine the jc-parameters [23]. however, the underlying experimental data were obtained from shpb tests and from orthogonal cutting experiments in conjunction with a modified oxley model. therefore, the drawbacks of extrapolation and of the oxley model underlie this approach. franchi et al.
developed an inverse optimization procedure to determine the jc-parameters of aisi 316 stainless steel and saf 2507 super-duplex stainless steel as well as the coulomb friction coefficient [24]. therefore, a sequential approach was used, starting with an initial set of machining simulations based on a design of computer experiments (doce) and an analysis of the numerical results in terms of cutting forces and temperatures. based on these results, a regression model was developed, serving as a surrogate model. subsequently, a multi-island genetic optimization algorithm was used to identify the best combination of jc- and friction coefficients by minimizing an objective function. however, the approach revealed large deviations between experimental and numerical results of up to 75 %. bosetti et al. inversely determined jc material model parameters and the tresca friction coefficient for aisi 304 stainless steel using the downhill-simplex algorithm (dsa) in combination with a genetic algorithm (ga) [25]. since their approach focused on just one cutting condition, it remains questionable how well the determined parameters describe the material behavior for other cutting conditions. the underlying problem of the non-uniqueness of material model parameters has been widely reported in the literature [19; 26-28]. within this paper, a new methodology is used to determine the material model parameters from orthogonal cutting experiments using an optimization algorithm. to set up an experimental database, orthogonal cutting experiments have been conducted on a broaching machine tool. as workpiece material, aisi 1045 has been chosen in two different states of heat treatment. the material has been investigated in the normalized and in the coarse-grain annealed state. the differences in the material behavior are initially analyzed by comparing the results of quasi-static tensile and shear tests as well as of charpy impact tests, revealing differences between the two annealing states.
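the downhill-simplex (nelder-mead) search underlying the approaches discussed above can be illustrated with a minimal, self-contained implementation using the standard reflection, expansion, contraction and shrink coefficients; the quadratic test objective below is a stand-in for the actual fe-simulation-based deviation measure, not the paper's objective function:

```python
import numpy as np

def nelder_mead(f, x0, step=0.1, tol=1e-8, max_iter=500):
    """Minimal downhill-simplex minimizer (reflection 1, expansion 2,
    contraction 0.5, shrink 0.5)."""
    x0 = np.asarray(x0, dtype=float)
    simplex = [x0.copy()]
    for i in range(len(x0)):               # build the initial simplex
        v = x0.copy()
        v[i] += step
        simplex.append(v)
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, second_worst, worst = simplex[0], simplex[-2], simplex[-1]
        if abs(f(worst) - f(best)) < tol:
            break
        centroid = np.mean(simplex[:-1], axis=0)
        reflected = centroid + (centroid - worst)
        if f(reflected) < f(best):
            expanded = centroid + 2.0 * (centroid - worst)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(second_worst):
            simplex[-1] = reflected
        else:
            contracted = centroid + 0.5 * (worst - centroid)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:  # shrink all vertices towards the best one
                simplex = [best + 0.5 * (v - best) for v in simplex]
    return min(simplex, key=f)

# stand-in objective: in an inverse identification this would be the average
# relative deviation between simulated and measured forces, temperature and chip form
x_opt = nelder_mead(lambda x: (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2, [0.0, 0.0])
```

the method needs no gradients, which is what makes it attractive when each objective evaluation is a full chip-formation simulation; its cost is exactly the iteration count that the robustness discussion in this paper addresses.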
thereafter, a novel methodology of material parameter determination from the machining process is presented and applied to determine the johnson-cook material model parameters of aisi 1045 in the normalized state. to determine the material model parameters, the downhill-simplex algorithm, which has been modified to be applicable to the inverse problem of material model parameter determination, is used. the paper is organized as follows: in the following chapter, the characterization of the workpiece material, including the results of quasi-static tests and impact tests, is presented, followed by the experimental set-up of the orthogonal cutting experiments. in chapter 3, the results of the orthogonal cutting tests are outlined and analyzed. the description of the orthogonal cutting model used in the machining simulations is presented thereafter. in chapter 6 the material model parameters are determined for aisi 1045 in the normalized state using an optimization algorithm, followed by the validation of the determined model parameters. finally, a summary and conclusions are given in the last chapters. this chapter is divided into three subchapters, presenting firstly the workpiece material aisi 1045 that is used in the experiments of this work, followed by a characterization of the material by means of conventional material tests. in the third subchapter, the experimental set-up of the orthogonal cutting experiments is outlined. as workpiece material, the medium-carbon steel aisi 1045 was investigated. the material has been the focus of several studies so far, especially in the field of machining, due to its wide application in the automotive industry, where it is used for components under medium loads such as crankshafts [29]. the chemical composition of the workpiece material has been determined by spark spectroscopy. the results, which have been averaged from four measurements, are summarized in table 1.
the chemical composition shows slight deviations from the nominal chemical composition as specified by the manufacturer, especially in terms of the carbon content. to investigate the influence of the material's grain size on the material behavior, the aisi 1045 has been annealed in two different states: normalized (n) and coarse-grain annealed (cg). the heat treatment conditions are summarized in table 2. after the heat treatment, the microsections showed a homogeneous microstructure without a line structure from previous manufacturing processes. the microstructure of the normalized material consisted of globular pearlite and ferrite, whereas the microstructure of the coarse-grain annealed material consisted of globular pearlite and globular/lamellar ferrite, see fig. 1. the phase fractions of pearlite and ferrite as well as the average grain sizes of the two phases were determined using the software imagej. the results are summarized in table 3. further, table 3 contains the results of hardness measurements, which show a distinct deviation between the two different heat treatments. the given hardness values are averaged values, based on 16 measurements.

table 3 (excerpt):                          n        cg
average ferrite grain size d / µm           7        11
average pearlite grain size d / µm          12       37
hardness hv0.03                             223±30   347±40

the differences in the hardness measurements are attributed to the differences in the phase fractions of the two investigated heat treatment conditions. the higher phase fraction of pearlite for the coarse-grain annealed material causes a higher hardness of the material, since pearlite exhibits a higher hardness than ferrite. besides the metallographic alterations, the differences of the two heat treatments in terms of the material behavior under quasi-static and impact conditions have been investigated. the material behavior under quasi-static conditions has been determined under tensile and shear loading, and the impact behavior by using the charpy impact test.
the quasi-static tests have been conducted on a zwick z100 universal testing machine with a maximum force of fmax = 100 kn. the shear and tensile tests have been conducted with a strain rate of ε̇ = 0.001 s^-1, resulting in test speeds of vtensile = 4.5 mm/min and vshear = 0.3 mm/min. both tensile and shear tests have been conducted three times for each state of annealing. the tensile tests have been conducted according to din en iso 6892-1 using the a50 geometry. in fig. 2, the results of the tests are shown, revealing slight differences between the test repetitions. in comparison to the coarse-grain annealed samples, the normalized samples show a slightly lower ultimate tensile strength of rm = 700 mpa, whereas the average ultimate tensile strength of the coarse-grain annealed samples is rm = 724 mpa. on the other hand, the yield strength of the normalized samples is rp0.2 = 486 mpa, whereas that of the coarse-grain annealed samples is rp0.2 = 424 mpa. however, the uniform elongation of the normalized samples, ag = 13 %, is about 33 % higher than that of the coarse-grain annealed samples. the differences in the ultimate tensile strength are attributed to the phase differences of the two states of annealing. the higher pearlite fraction of the coarse-grain annealed samples results in an increased ultimate strength compared to the normalized samples. the higher uniform elongation of the normalized samples is attributed to the smaller grain size of the normalized material. the shear tests have been conducted by using a modified flat tensile specimen, which was notched by two hooks, fig. 3. thereby, shear conditions were enabled. the results of the shear tests show comparable differences, with slightly higher stresses for the coarse-grain annealed samples and higher strains for the normalized samples, fig. 3. in addition to the quasi-static tests, impact tests have been conducted according to din 50115 using charpy-v-notched samples.
the tests have been conducted on a zwick/roell system with a maximum impact energy of 50 j and have been repeated two times for each annealing condition. the conducted impact tests revealed differences between the investigated samples, with higher impact energies for the normalized material. the average necessary impact energy of the normalized samples was 24.5 j, whereas that of the coarse-grain annealed samples was 15.7 j. except for one cg sample, all samples failed due to fracture. the orthogonal cutting experiments have been conducted on a test bench built on a vertical broaching machine of type forst rasx 8 x 2200 x 600 m/cnc, fig. 4. the broaching machine has a stroke length of 2,200 mm and a maximum cutting speed of vc = 150 m/min. for cutting speeds up to vc = 30 m/min the maximum broaching force is fmax = 80 kn, and for the higher cutting speeds fmax = 20 kn. in comparison to conventional broaching, the workpiece has been clamped into a customized fixture in the tool holder, where normally the broaching tool would be clamped. as cutting tool, a grooving insert tool of type corocut from sandvik coromant has been used. the grooving inserts were made of the cemented carbide grade h13a, with an average grain size of 1-2 µm and a co content of 6 %. the rake angle of the tool was γ = 6° and the flank angle α = 3°. to prevent the occurrence of built-up edge formation, the tools have been coated with a tialn coating of 4 µm thickness. the cutting edge rounding was measured on average as rβ = 14 µm with a standard deviation of 2.1 µm for all tested tools. the cutting tool was clamped into a tool holder on a dynamometer of type z21289 from kistler with a measuring range from -80 to +80 kn in the cutting force direction. the sampling rate was 50 khz. during the experiments, both the cutting force fc and the cutting normal force fcn were measured.
a high-speed camera of type phantom v7.3 captured the chip formation process with a frame rate of 6,700 frames per second and a resolution of 800 x 600 pixels. to increase the intensity of light and to enhance the capture of the chip formation, an led light has been used. the cutting temperature has been measured using a two-color pyrometer. since the workpiece temperatures behind the cutting zone were too low to be captured by the two-color pyrometer, the pyrometer has been used to measure the tool-side temperature of the chip. further, an infrared camera captured the workpiece temperature field. however, the results of these measurements will not be part of this paper and will be presented in the future. the orthogonal cutting experiments have been conducted by varying the cutting speed vc and the undeformed chip thickness h. the experimental design is summarized in table 4. the orthogonal cutting tests were repeated twice in order to enhance the statistical reliability. the cutting force fc and cutting normal force fcn have been analyzed in the steady state of the measured signal; therefore, the signal from 40 to 90 % of the total cutting time has been evaluated, fig. 5. the results of the cutting force and cutting normal force measurements for the investigated undeformed chip thicknesses and cutting speeds are shown in fig. 5. an increase of the cutting force and cutting normal force with increasing undeformed chip thickness can be seen for all investigated conditions and states of annealing of the material. this increase can be explained by the higher mechanical load which is involved in the chip formation process and is in accordance with observations from the literature. when increasing the cutting speed vc, a slight decrease of the cutting force and a more distinct decrease of the cutting normal force can be observed.
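the steady-state evaluation of the force signals (averaging over 40 to 90 % of the total cutting time) can be sketched in a few lines. the function name, the use of a plain mean, and the synthetic force trace below are illustrative assumptions, not the authors' exact measurement chain:

```python
import numpy as np

def steady_state_mean(signal, t_start=0.40, t_end=0.90):
    """average a force signal over the steady state, here taken as
    40 to 90 % of the total cutting time."""
    n = len(signal)
    window = signal[int(t_start * n):int(t_end * n)]
    return float(np.mean(window))

# usage on a synthetic cutting force trace (run-in, steady state, run-out)
fc = np.concatenate([np.linspace(0, 900, 200),   # tool entry
                     np.full(600, 900.0),        # steady state
                     np.linspace(900, 0, 200)])  # tool exit
fc_mean = steady_state_mean(fc)  # average force over the evaluation window
```

the same window can be applied to the cutting normal force fcn before both values are compared to their simulated counterparts.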
the decrease of the cutting forces can be explained by the thermomechanical workpiece behavior. an increase of the cutting speed results in an increase of both the mechanical and the thermal load. the decrease of the force can therefore be attributed to a more dominant influence of the thermal softening, compared to the strain and strain rate hardening. it is further assumed that frictional alterations cause a decrease of the cutting normal force fcn, which is sensitive to the frictional conditions [30]. comparing the two states of annealing reveals higher cutting and cutting normal forces for the normalized samples. this observation is in accordance with the results of the quasi-static tests and is attributed to the higher yield strength of the normalized material. the chip thicknesses have been measured at three different spots for all chips, using the recordings of the high-speed camera. the results of these measurements are shown in fig. 6, revealing an increase of the chip thickness with increasing undeformed chip thickness. no large alteration of the chip thickness with increasing cutting speed can be observed. the comparison between the two states of annealing reveals small differences, with higher chip thicknesses for the normalized material. these differences are, however, within the measurement scatter. when measuring the temperature or temperature field in the cutting process, different measurement positions are possible. in this study, a two-color pyrometer was used to measure the temperature of the chip shortly after its formation. the displacement of the measuring position with different undeformed chip thicknesses showed no alterations. the results of the temperature measurements using the two-color pyrometer are summarized in fig. 7. when increasing the undeformed chip thickness, an increase of the chip temperature can be observed.
however, an increase of the chip temperature for higher cutting speeds is not as distinct as it is for higher undeformed chip thicknesses. it should be noted that the shown temperatures do not necessarily represent the temperature close to the cutting zone. due to higher tool-chip contact lengths or different chip curvatures, the measuring position can be farther from the cutting zone, fig. 7. in contrast to the force measurements, no pronounced differences between the two heat treatments can be identified for the chip temperature measurements. the chip temperature measurements show a large scatter. these deviations have to be taken into account when determining the material model parameters, since the simulations are not expected to be more accurate than the experiments. the experimental results of the process observables of the two investigated states of heat treatment for aisi 1045 showed smaller deviations than expected from the quasi-static material tests. it seems doubtful that the simulations are capable of depicting these differences in the process observables, which were on average lower than 10 % in the experiments. therefore, only the experimental results of the normalized material will be used for the inverse determination of the material model parameters by means of fem simulations. this chapter outlines the fem model used for the orthogonal cutting simulations. to this end, the models used to describe the material behavior and the frictional behavior are described, followed by the coupled eulerian-lagrangian (cel) model. to model the material behavior under metal cutting conditions, the jc model has been widely used, equation (1). in the jc model, the effects of strain ε, strain rate ε̇, and temperature t on the flow stress are modeled in an uncoupled manner by three separate terms [31]. the first bracket of equation (1) expresses the strain hardening, which is in accordance with the ludwik equation [32].
the second bracket models the effect of strain rate hardening and is formulated in a logarithmic form. the effect of thermal softening is expressed based on a power function [29]. in the jc model, a, b, n, c, and m are material constants, ε̇0 is the reference strain rate with ε̇0 = 0.1 s⁻¹, t0 the reference temperature, and tm the melting temperature [33]:

σ = (a + b · ε^n) · (1 + c · ln(ε̇/ε̇0)) · (1 − ((t − t0)/(tm − t0))^m)   (1)

besides the material model, the friction model has a major influence on the simulated results [34]. in several studies, it has been shown that using a simple coulomb friction model is not sufficient to describe the frictional behavior between the tool and the workpiece [35]. to overcome this drawback, several researchers proposed different friction models, e.g. özel [36], filice et al. [37] or puls et al. [35]. puls et al. developed a friction test that is capable of reproducing the conditions encountered in the machining process in terms of relative velocities, temperatures, and normal pressures [35]. to this end, the authors used an orthogonal high-speed deformation process on a broaching machine, where an indexable insert was rotated, resulting in an extremely negative rake angle, which suppressed the chip formation. based on their findings, a temperature-dependent friction model was developed, equation (2). in this study, the friction model according to puls et al. is used, as well as the friction model parameters which have been determined for aisi 1045 in a normalized state [35]. the parameters used for the friction model are summarized in table 5. the friction model was implemented into the simulation program abaqus/explicit in tabular form, whereby the friction coefficient was given for temperature steps of 10 °c up to the melting temperature, at which it reaches zero. when modeling the machining process, two different formulations are used: the eulerian and the lagrangian formulation [5; 38].
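equation (1) can be written down as a short function. the reference strain rate ε̇0 = 0.1 s⁻¹ is taken from the text; the reference and melting temperatures and the parameter values in the example call are illustrative assumptions only, not the calibrated values of this paper:

```python
import math

def jc_flow_stress(strain, strain_rate, T, a, b, n, c, m,
                   eps0=0.1, T0=20.0, Tm=1460.0):
    """johnson-cook flow stress, equation (1): strain hardening,
    strain-rate hardening, and thermal softening as three uncoupled
    factors. eps0 = 0.1 1/s as in the text; T0 and Tm are assumed."""
    hardening = a + b * strain ** n
    rate = 1.0 + c * math.log(strain_rate / eps0)
    softening = 1.0 - ((T - T0) / (Tm - T0)) ** m
    return hardening * rate * softening

# illustrative parameter values, roughly inside the search domain used later
sigma = jc_flow_stress(0.2, 1000.0, 300.0,
                       a=486.0, b=500.0, n=0.25, c=0.02, m=0.6)
```

with such a function, the uncoupled structure of the model is easy to probe: increasing the strain raises the stress, while increasing the temperature lowers it.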
in the eulerian approach, the mesh is fixed in space and material flows through the element faces, allowing large strains without the occurrence of mesh distortion [39]. however, for simulations of the machining process, the eulerian formulation can only be used for the steady state, for which knowledge of the final chip geometry is required [5]. in the lagrangian formulation, the nodes of the mesh are attached to the material and follow the material deformation [40]. to overcome the individual drawbacks of the eulerian and lagrangian formulations, see e.g. [5], two other formulations have been developed and applied in modeling the machining process: the arbitrary lagrangian eulerian (ale) [41] and the coupled eulerian lagrangian (cel) [42] approach. in the ale formulation, the material flows through the mesh (analogous to the eulerian formulation) and the element nodes are additionally able to move freely within the domain. the cel formulation was first introduced to machining within the last five years [42; 43]. in the cel formulation, the workpiece material is modeled by the eulerian formulation, allowing the material to flow freely through the fixed mesh. the tool, on the other hand, is modeled by the lagrangian formulation. the set-up of the cel model underlying this paper is shown in fig. 8. an inflow of the material into the euler domain is used as a boundary condition to model the cutting speed, fig. 8. the material leaves the euler domain in the form of a chip or flows out of the domain on the right-hand side of the set-up. within the euler domain, an additional area is modeled to allow the chip to be formed. the lagrangian tool is fixed in space and is assumed to be rigid. the tool was modeled with the element type c3d4t to allow the calculation of the temperature within the tool. the mesh size varied along the sides of the tool, ranging from 5 µm to 50 µm. thereby, an accurate calculation of the temperature within the tool can be achieved.
the euler domain, in turn, was modeled with ec3d8rt elements with a smallest mesh size of 5 µm within the zone of chip formation. the parameters to model the thermal and mechanical behavior of the tool material were taken from the literature [44; 45]. to model the influence of the coating, a 4 µm thick layer of tialn coating was modeled on the tool, represented by the green area in fig. 8. the data to model the thermal and the mechanical behavior were taken from the literature [46; 47]. for all conducted simulations, a constant cutting length of lc = 3.3 mm was used to ensure that steady state conditions are reached. to decrease the computational time, mass scaling by a factor of 1,000 was used. thereby, the computational time was decreased to approximately 2 h for the simulations of table 6. the method to determine the jc material model underlying this paper is based on the downhill-simplex algorithm (dsa), also called the nelder-mead algorithm [48]. in a previous study, the authors investigated the algorithm to re-identify the jc parameters [49]. in this study, the algorithm is used to identify the parameters from experimental results. generally, the dsa is a method for multi-dimensional optimization problems that can be employed to minimize the error between predictions and measurements [50]. when determining material models inversely, this capability can be used to minimize the error between the experimental and the numerical results. thereby, the model parameters are iterated until a pre-defined error value or a maximum number of iterations is reached. an advantage of the dsa, in comparison to other optimization algorithms such as the levenberg-marquardt algorithm, which has been employed for material model parameter determination [27; 51-53], is that the dsa is a derivative-free algorithm. when using the dsa, an initial simplex is needed. a simplex is a polytope that is defined by n + 1 vertices in an n-dimensional optimization problem [48; 54].
in 3d space, the simplex is a tetrahedron, and in 2d space a triangle. for the 2d case, the procedure of the dsa is illustrated in fig. 9. three points define the initial simplex. the worst vertex, in terms of the error value, is reflected around the centroid of the hyperplane that is formed by the remaining vertices [48]. this operator is called reflection. the other operators of the dsa are expansion, internal contraction, and external contraction. in this study, the parameters of the four described operators were set to 1 (reflection), 0.5 (expansion), 0.5 (internal contraction), and 0.5 (external contraction). to apply the downhill-simplex algorithm to the problem of inverse parameter determination, a function has to be defined which can be optimized by the algorithm. this function has to evaluate the deviation of the numerical from the experimental results. therefore, an error function has been defined that evaluates the normalized deviation of the cutting force fc, the cutting normal force fcn, the chip thickness h', and the chip temperature tc. the error function is given in equation (3). the different integral process results are weighted by individual weighting factors. these have been chosen based on the scatter of the experimental results: a high scatter resulted in a low weighting factor, whereas a higher weighting factor has been used for low scatter. the weighting factors were set to 0.40 (fc), 0.25 (fcn), 0.15 (tc), and 0.20 (h'). the lower weighting factor of the cutting normal force fcn compared to that of the cutting force fc is due to the high sensitivity of the cutting normal force to the frictional conditions. by utilizing a temperature-dependent friction model, an accurate modeling of the frictional conditions has been aimed for. however, determining the frictional conditions is not within the scope of this paper.
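equation (3) can be sketched as a weighted sum of normalized deviations between simulated and experimental observables. two assumptions are made explicit here: the exact normalization is not given in the text, so a relative deviation |sim − exp| / exp is used, and the assignment of the listed weighting factors to the individual observables follows the order in which they are introduced:

```python
def inverse_error(simulated, measured, weights):
    """weighted, normalized deviation between simulated and measured
    process observables; a sketch of equation (3) under the assumption
    of a relative deviation |sim - exp| / exp per observable."""
    return sum(w * abs(simulated[k] - measured[k]) / measured[k]
               for k, w in weights.items())

# weighting factors of the first approach, as given in the text
W = {"fc": 0.40, "fcn": 0.25, "tc": 0.15, "h": 0.20}
measured = {"fc": 900.0, "fcn": 450.0, "tc": 520.0, "h": 0.25}   # illustrative
simulated = {"fc": 870.0, "fcn": 500.0, "tc": 540.0, "h": 0.27}  # illustrative
err = inverse_error(simulated, measured, W)
```

a perfect simulation would yield an error of exactly zero, and the weights sum to one by construction.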
this chapter outlines the procedure to determine the material model parameters of the jc model for the material aisi 1045 in the normalized state. in order to reduce the computational effort of the procedure, the jc parameter a has been determined from the quasi-static tests; the parameter is therefore assumed to be a = 486 mpa. the downhill-simplex algorithm is applied to determine the four remaining material model parameters. hence, an initial simplex of five vertices is necessary. the vertices are randomly selected with the requirement to be within the defined domain of the parameters. the domain of the parameters is determined based on typical material model parameters from the literature for aisi 1045 plus an additional offset. the domains of the four material model parameters were set to [350 mpa, 700 mpa], [0.005, 0.15], [0.1, 0.9], and [0.1, 0.85]. the algorithm has been modified so that when a parameter is calculated to be outside of the parameter domain, it is projected onto the boundary of the domain. to determine the material model parameters, three cutting conditions were used covering the lower domain of the investigated cutting conditions. the undeformed chip thickness h = 0.01 mm was not considered, since the measurements of the chip thickness were at the lower limit of what is measurable. as upper cutting condition, the undeformed chip thickness h = 0.2 mm was chosen. thus, the range of finishing and roughing conditions was covered. furthermore, the application of the determined material model parameters to cutting conditions outside of the calibration domain can be evaluated. the development of the error function over the number of iterations is shown in fig. 10, where s1 to s5 represent the initial simplex. in fig. 10, a decreasing trend of the value of the error function can be observed. the algorithm finished after 13 iterations, after reaching a local minimum.
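the projection of an out-of-domain vertex onto the boundary of the box-shaped parameter domain, together with the reflection operator of the dsa, can be sketched as follows. the four intervals are taken in the order listed in the text; their assignment to specific jc parameters is not given there and is left symbolic here:

```python
import numpy as np

# search domain [lower, upper] for the four remaining jc parameters,
# in the order listed in the text (symbol assignment is an assumption left open)
LOWER = np.array([350.0, 0.005, 0.1, 0.1])
UPPER = np.array([700.0, 0.15, 0.9, 0.85])

def project(p):
    """project a proposed parameter vector onto the boundary of the
    box-shaped domain, as the modified algorithm does whenever a
    vertex is calculated to lie outside the domain."""
    return np.clip(np.asarray(p, dtype=float), LOWER, UPPER)

def reflect(worst, centroid, alpha=1.0):
    """reflection operator of the downhill-simplex algorithm, with the
    projection applied to the new vertex (alpha = 1 as in the text)."""
    return project(centroid + alpha * (centroid - worst))
```

each new vertex proposed by reflection (or expansion/contraction) is passed through the same projection before its error value is evaluated by a simulation run.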
however, an average error of 15.4 % over the three cutting conditions used for the parameter determination remained. the comparison of the experimental results and their simulated counterparts shows good agreement in terms of cutting force fc, chip thickness h', and chip temperature tc for the investigated cutting conditions. the simulated cutting forces deviated by less than 21 % from their experimental counterparts, the chip thickness by less than 13 % on average, and the chip temperature by less than 5 %. however, the simulated cutting normal force fcn deviated largely from the experiments, by up to 264 % for the highest cutting condition. the large deviations in terms of the cutting normal force fcn are attributed to the used friction model, which has a significant influence on the cutting normal force. even though the cutting normal force was only taken into account by 10 % in the error function, the deviations cause a high value of the error function. an improvement of this value is expected when neglecting the cutting normal force fcn in the determination of the material model parameters. that is why a second approach has been followed, in which the cutting normal force is not taken into account. in this approach, the weighting factors were chosen as 0.50 (fc), 0 (fcn), 0.15 (tc), and 0.35 (h'). the weighting factor for h' has been increased as well in order to maintain a sum of 1 for the weighting factors. for this approach, the development of the error function over the number of iterations is shown in fig. 11. in comparison to the first approach, a more distinct asymptotic development of the error function with increasing number of iterations can be seen. the algorithm finished once the average error value fell below 10 %; this predefined criterion was reached after 17 iterations. for the determined material model parameters, the remaining average error value after 17 iterations was 8.1 %.
for this parameter set, the cutting force fc deviated by less than 10 % from the experimental results, the chip thickness by less than 8 % (and by 20 % for the high undeformed chip thickness of h = 0.2 mm), and the chip temperature by less than 11 %. it is remarkable that the deviation of the cutting normal force decreased for the second approach, in which the cutting normal force was not evaluated in the algorithm. however, for the three cutting conditions used for the parameter calibration, the experimental cutting normal force was still 14 %, 72 %, and 148 % higher than the simulated one. a comparison between the experimental and the simulated results is shown in fig. 12 and fig. 13. three cutting conditions have been used for the material model parameter determination. to validate the determined material model parameters, additional simulations for further cutting conditions have been conducted. thereby, simulations for cutting conditions within the regime used for the determination as well as for cutting conditions outside of that regime have been conducted. for the simulations outside of the regime used for the determination, the material behavior has to be extrapolated. the results of the simulations, as well as their experimental counterparts, are shown in fig. 14. as can be seen in fig. 14 (a), the simulated cutting forces fc within the regime of parameter determination are close to their experimental counterparts. for the lower undeformed chip thickness h = 0.01 mm, which has not been considered for the material parameter determination, the cutting forces are also close to their experimental counterparts. in contrast, for higher undeformed chip thicknesses of h = 0.3 mm and h = 0.4 mm the simulated cutting forces are overestimated.
the overestimation of the cutting forces for higher undeformed chip thicknesses can be attributed to the parameter m of the thermal softening term. it is expected that the thermal softening parameter is accurate enough for the domain of parameter calibration, but not for an extrapolation to higher cutting conditions. for the simulated chip thicknesses and chip temperatures, observations comparable to those for the simulated cutting forces can be made. within the regime of parameter determination, a good agreement between experiments and simulations can be observed. however, for the higher feed rates, larger deviations between experiments and simulations occur. the simulation of the cutting process by means of fem is characterized by high computational times, especially in comparison to other simulation methods. these high computational times are one of the reasons why machining simulations using the fem are not widely used in industry. for the inverse material parameter determination, lower computational times have to be aimed for. for the determination of the material parameters of the second approach presented here, the total computational time was 4 days and 17 hours. by parallelization on different cpus, the simulation time was reduced to 1 day and 18 hours. conventional material tests and orthogonal cutting experiments have been conducted on aisi 1045 in two different states of annealing. the results emphasized differences in the material behavior between the two states of annealing. these differences were more distinct for the material behavior under quasi-static conditions than for the cutting process. for the cutting process, the identified differences between the two states of annealing were smaller than the defined criterion for evaluating the agreement between experiment and simulation. consequently, only the material model parameters of one state of heat treatment were inversely calibrated within this work.
if the agreement between experiment and simulation is to be further enhanced, the differences in the material behavior depending on the state of heat treatment have to be taken into account. however, it is questionable how good an agreement between simulation and experiment can be at best for a wide range of process parameters when using the jc material model. this will be further investigated in the future. within this paper, a novel approach has been presented which is capable of determining the material model parameters inversely from the machining process within a short time. for the inverse parameter determination, multiple cutting conditions have been used for the calibration. additionally, further cutting conditions outside the domain of calibration have been used for validation. the results demonstrated a close agreement between the experiments and the simulations within the domain of parameter determination. for cutting conditions outside the domain, large deviations between simulations and experiments can be observed. therefore, the domain of parameter calibration has to be taken into account when using material model parameters to model the material behavior of the cutting process. the application of the model to a domain outside of the domain of calibration is expected to result in larger deviations due to extrapolation. the deviations of the simulations from the experiments can be attributed, at least to some extent, to the used material and friction models. especially the large differences of the cutting normal forces are expected to be due to the friction model. further, the choice of the weighting factors underlying the evaluation has to be addressed critically. in order to investigate the influence of the weighting factors on the procedure of the inverse parameter determination, a sensitivity analysis will be presented in a future publication. in the future, further analyses of both parts, the experimental and the numerical, will be conducted.
for the experimental part, the influence of the heat treatment as well as of alloying elements on the material behavior will be investigated. for the numerical part, the robustness of the used algorithm, the influence of the underlying algorithm parameters, and the influence of the error function will be investigated. additionally, when inversely determining the material model parameters from cutting experiments, the achievable agreement between experiments and simulations will be investigated. an enhanced agreement is expected when taking more process observables, such as the temperature field or the chip curl radius, into account.

references:
- predictive tool and process design for efficient chip control in metal cutting
- the influence of material models on finite element simulation of machining
- on the machining induced residual stresses in in718 nickel-based alloy. experiments and predictions with finite element simulation
- dependence of machining simulation effectiveness on material and friction modelling
- recent advances in modelling of metal machining processes
- the finite element method in engineering science
- incipient chip formation in metal cutting - a three-dimension finite element analysis
- thermo-mechanical modeling of orthogonal machining process by finite element analysis
- material property needs in modeling metal machining
- identification of material constitutive law constants using machining tests. a response surface methodology based approach
- high performance and optimum design structure and materials
- a machining-based methodology to identify material constitutive law for finite element simulation
- finite element simulation of machining inconel 718 alloy including microstructure changes
- inverse identification of johnson-cook material parameters from machining simulations
- advances in material and friction data for modelling of metal machining
- a unified material model including dislocation drag and its application to simulation of orthogonal cutting of ofhc copper
- a new 3d multiphase fe model for micro cutting ferritic-pearlitic carbon steels
- bruchverhalten von leichtmetallen unter impact-beanspruchung [fracture behavior of light metals under impact loading]. dissertation: rwth
- on modelling the influence of thermo-mechanical behavior in chip formation during hard turning of 100cr6 bearing steel
- a new method to determine material parameters from machining simulations using inverse identification
- process modeling in machining. part i: determination of flow stress data
- a new thermo-viscoplastic material model for finite-element-analysis of the chip formation process
- material parameters identification. gradient-based, genetic and hybrid optimization algorithms
- identification of constitutive material model parameters for high-strain rate metal cutting conditions using evolutionary computational algorithms
- inverse analysis procedure to determine flow stress and friction data for finite element modeling of machining
- identification of johnson-cook and tresca's parameters for numerical modeling of aisi-304 machining processes
- a study of non-uniqueness during the inverse identification of material parameters
- how to identify johnson-cook parameters from machining simulations
- identification of material constitutive laws for machining - part i: an analytical model describing the stress, strain, strain rate, and temperature fields in the primary shear zone in orthogonal metal cutting
- zerspanung mit geometrisch bestimmter schneide [machining with a geometrically defined cutting edge] (series: vdi-buch). 9th
- analytical and experimental investigation of rake contact and friction behavior in metal cutting
- microstructural based models for bcc and fcc metals with temperature and strain rate dependency
- elemente der technologischen mechanik [elements of technological mechanics]. 1st
- a constitutive model and data for metals subjected to large strains, high strain rates and high temperatures
- the mechanics of cutting: in-situ measurement and modelling
- experimental investigation on friction under metal cutting conditions
- the influence of friction models on finite element simulations of machining
- on the correlations between friction model and predicted temperature distribution in orthogonal machining
- simulation of the orthogonal metal cutting process using an arbitrary lagrangian-eulerian finite-element method
- modelling and simulation of machining processes
- aspects of ductile fracture and adaptive mesh refinement in damaged elasto-plastic materials
- simulation of chip formation in orthogonal metal cutting process. an ale finite element approach
- application of the coupled eulerian-lagrangian (cel) method to the modeling of orthogonal cutting
- coupled eulerian-lagrangian modelling of high speed metal cutting processes
- refractory, hard and intermetallic materials
- world directory and handbook of hardmetals and hard materials
- thermal properties of cutting tool coatings at high temperatures
- orthogonal cutting of hardened aisi d2 steel with tialn-coated inserts - simulations and experiments
- a simplex method for function minimization
- inverse material model parameter identification for metal cutting simulations by optimization strategies
- determination of flow stress for metal cutting simulation - a progress report
- is it possible to identify johnson-cook law parameters from machining simulations
- determination of johnson-cook parameters from machining simulations
- determination of work material flow stress and friction for fea of machining using orthogonal cutting tests
- parameter optimisation in constitutive equations for hot forging
- identification of constitutive parameters - optimization strategies and applications

the authors would like to thank the deutsche forschungsgemeinschaft (dfg, german research foundation) for the funding of the depicted research within the project 365204822 "development and verification of a constitutive approach for the determination of high-speed flow curves from the cutting process".

key: cord-031232-6cv8n2bf authors: de weck, olivier; krob, daniel; lefei, li; lui, pao chuen; rauzy, antoine; zhang, xinguo title: handling the covid‐19 crisis: toward an agile model‐based systems approach date: 2020-08-27 journal: nan doi: 10.1002/sys.21557 sha: doc_id: 31232 cord_uid: 6cv8n2bf the covid‐19 pandemic has caught many nations by surprise and has already caused millions of infections and hundreds of thousands of deaths worldwide. it has also exposed a deep crisis in modeling and revealed a lack of systems thinking by focusing mainly on the short term and treating this event as only a health crisis. in this paper, authors from several of the key countries involved in covid‐19 propose a holistic systems model that views the problem from a perspective of human society including the natural environment, human population, health system, and economic system. we model the crisis theoretically as a feedback control problem with delay, and partial controllability and observability. using a quantitative model of the human population allows us to test different assumptions such as detection threshold, delay to take action, fraction of the population infected, effectiveness and length of confinement strategies, and impact of earlier lifting of social distancing restrictions. each conceptual scenario is subject to 1000+ monte‐carlo simulations and yields both expected and surprising results.
for example, we demonstrate through computational experiments that maintaining strict confinement policies for longer than 60 days may indeed be able to suppress lethality below 1 % and yield the best health outcomes, but cause economic damages due to lost work that could turn out to be counterproductive in the long term. we conclude by proposing a hierarchical computerized, command, control, and communications (c4) information system and enterprise architecture for covid‐19 with real‐time measurements and control actions taken at each level.

figure 1: confirmed deaths per million people as of july 21, 2020.

this makes it a systemic crisis and not only a pure health crisis. the closest analog we have at a global scale is the h1n1 influenza pandemic of 1917-1919 ("spanish flu"), which killed between 17 and 50 million people worldwide [6]. thus, handling the current covid-19 crisis requires a holistic approach taking into consideration an extremely complex system, i.e., society as a whole. another important aspect of the covid-19 crisis is that the pandemic propagation has been very fast, thus demanding rapid decision-making. moreover, and we think that this is a structuring feature of this crisis, the incubation time of the disease introduces a delay (estimated at up to two weeks by epidemiologists, refs. 7, 8, or 9) between the implementation of countermeasures and the observation of their effects. this is compounded by the fact that a significant fraction of the virus carriers appear to be asymptomatic, causing a large difference between the numbers of actual cases and of known or confirmed cases (see refs. 1, 3, or 5). this explains why the problem of monitoring the covid-19 crisis can be seen as a control-theoretic problem with delay in the feedback loop used to stabilize the situation, in addition to the problem of low or only partial observability of the true system states. we shall elaborate further on this point.
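the delayed-feedback character of the crisis can be illustrated with a minimal seird integration in which the transmission rate becomes time dependent at a (delayed) confinement date. all rates below are illustrative assumptions, not the calibrated parameters of the paper:

```python
def seird_step(s, e, i, r, d, beta, sigma=1/5.2, gamma=1/10, mu=0.01, dt=1.0):
    """one forward-euler day of a seird model; sigma, gamma, and mu
    (incubation, recovery, and death rates) are illustrative values."""
    n = s + e + i + r + d
    new_exposed = beta * s * i / n * dt
    new_infected = sigma * e * dt
    new_recovered = gamma * i * dt
    new_dead = mu * i * dt
    return (s - new_exposed, e + new_exposed - new_infected,
            i + new_infected - new_recovered - new_dead,
            r + new_recovered, d + new_dead)

def simulate(days=180, lockdown_day=40, beta_free=0.40, beta_lock=0.08):
    """time-dependent transmission rate: beta drops only once the
    (delayed) confinement takes effect on lockdown_day."""
    state = (1e6 - 10, 0.0, 10.0, 0.0, 0.0)    # s, e, i, r, d
    for day in range(days):
        beta = beta_free if day < lockdown_day else beta_lock
        state = seird_step(*state, beta=beta)
    return state
```

running `simulate` with an earlier `lockdown_day` yields fewer cumulative deaths, which is the qualitative effect of shortening the delay in the feedback loop.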
from a system-theoretic perspective, the above characteristics raise several difficult problems. the first one, which is rather expected, regards scalability: can our current systems engineering and modeling methods (cf. for instance, refs. 10-16, or 17) be extended to a system, or more precisely a system-of-systems (cf. ref. 18 or 19), as large and as complex as human society as a whole? this question is clearly not easy to answer and, moreover, appears poorly addressed by the only known models of such scope, ie., the so-called world models, based on generalized volterra equations, that followed the seminal work of forrester in the 1970s (see refs. 20, 21, and 22). a second problem is caused by the emergence of local and partial solutions, which is significant since the covid-19 crisis impacts all sectors of society, including the medical, financial, transportation, manufacturing, and overall economic systems. society therefore needs fast and innovative solutions in order to mitigate as much as possible the consequences of the crisis. time pressure favors local and partial solutions, but strong coordination among actors is also needed in order to avoid contradictory strategies. a central question is therefore how to favor the emergence of bottom-up local actions while, at the same time, ensuring top-down monitoring and coordination of such actions, with short feedback loops. this calls for an agile approach (see refs. 23, 24, or 25) to the global covid-19 crisis. in stating the above problems, we have made a clear choice in this paper: we strongly believe in the use of models, and more precisely of systemic models, to think through and manage the crisis. models as we consider them here are, however, not platonic ideals, but observational models which rely on the observation of the reality of the covid-19 crisis, including the effects of the decisions made based on them.
such models are intended to capture the systemic nature of the crisis in order to achieve a better understanding of the situation and to allow better communication among stakeholders. in that respect, models have two main roles: first, the concrete calculation of key performance indicators to support the decision-making process through experiments in silico; and a second, more metaphorical one, to help us think better about the dynamic evolution of the systems at stake. the remainder of this paper is organized as follows. in section 2, we discuss which systemic models may support better management of the covid-19 crisis. then, in section 3, we advocate for an agile approach to crisis management. section 4 completes the paper with several recommendations. the general impression which emerges from the large and rapidly expanding literature dedicated to the covid-19 pandemic is that this crisis was first and foremost analyzed primarily as a health crisis (cf. ref. 26 or 2). economic impacts of the crisis were of course quickly understood, but, as far as we could observe, they were rather considered as an inevitable consequence of the health crisis that has to be managed as a second priority. 27 however, the aggressive mitigation measures that were set up in many countries were and are at the same time quite efficient from a health-preservation point of view (see, for instance, ref. 28 or 29) and highly inefficient from an economic perspective due to their global economic impact on all of society (see ref. 30 or 31). in this matter, there is-to the best of our knowledge-no rational discussion in the scientific literature on what could be the best trade-off for jointly minimizing both the health impact and the economic impact of the covid-19 crisis. perhaps the biggest ethical issue around such trade-offs is that it would require placing an explicit economic value on human lives, as discussed for instance in ref. 32 .
this is something that no national or regional government in the world has apparently been willing to do. moreover, what shall one do if the health crisis remains endemic in the near future, which is one of the possible scenarios (cf. section 2.2.2)? as one can see, thinking from a global rather than a purely local perspective can deeply change the way one addresses the crisis and its consequences. this situation is probably the consequence of the fact that the crisis is mainly observed on a daily basis, through for instance the daily covid-19 reports provided by the world health organization, 5 by other institutions, 3 and by each local government, leading to a rather short-term vision of the crisis. however, changing the time scale of observation immediately gives us a totally different point of view on the covid-19 crisis. if we are, for instance, observing the crisis at the time step of a quarter of a year (three months), it becomes almost instantaneous and can be considered as an event-in the classical meaning of synchronous modeling 33 -without any duration. thus, the choice of time step and sampling frequency is critical, as it is for any control system. this perspective change forces us to think about what could be the next state of the system under observation, ie, human society, which may be on its way toward a deep economic crisis, at least in western countries. continuing the analysis at the same coarse time scale, a possible catastrophic evolution scenario would be a financial crisis resulting with some delay from an economic crisis initiated by the health crisis (figure 2: a possible catastrophic scenario that could result from the initial covid-19 health crisis), thus generating the specter of a deep and prolonged recession, as pointed out as a possibility by some economists (ref. 27 or 31).
moreover, this situation could then also lead to more "classical" health crises in the future (see figure 2) due to the two-sided coupled interaction between the public health system and the economic system. in such a catastrophic future scenario, extending the duration of people's confinement in western countries in order to minimize the short-term health impact during the initial crisis could, for instance, result in deeply debilitating the health of more or less the same population in the mid- to long-term future. such a possible paradox is typical in optimal control theory, where the optimal trajectory of any nonlinear system can never be obtained through local optimizations alone. 34 in order to take into account and to avoid such paradoxical consequences, one must choose a systems approach to analyze the covid-19 crisis, integrating all existing domains of knowledge into a common understanding of the crisis, in order to obtain a global vision, both in space and time and at different possible observation scales, and thus giving a chance to find the global optimum for human society as a whole. we can thus see that there is another crisis, hidden within the covid-19 crisis, which is a crisis of models. the global community is indeed focusing on short-term health-specific models to better master the crisis, but these models are inadequate as soon as one wants to address the crisis from a longer-term, society-wide perspective, which requires systemic models. in this matter, let us recall that a model is an abstraction (in the meaning of abstract interpretation theory 35 ) of reality, but not reality itself, as expressed, for instance, by the famous assertion "a map is not the territory it represents, but, if correct, it has a similar structure to the territory, which accounts for its usefulness," popularized by korzybski 36 or the well-known "all models are wrong, some are useful" by box.
37 "models" which are not actually reflecting reality within some error bounds are in fact not models in that observational definition, and may even have negative impacts on reality since they will lead to wrong decisions or control actions. these negative impacts of wrong "models" can of course be amplified in the context of a systemic crisis such as covid-19. our point of view is clearly supported by an analysis of the 2020 scientific literature to date. a search of the keyword "covid-19" on google scholar 38 in april 2020 revealed that, at that moment in time, only 10 papers-ie., around 1%-of the first 900 most cited papers on covid-19 were not discussing primarily health issues (health covering here biology, epidemiology, medicine, and health policy and management), but rather focusing on the societal and economic consequences of the crisis. moreover, in terms of citations, most of these 10 papers were poorly cited: two were cited around 20 times, three around 10 times, and the remaining ones less than 5 times, while the average number of citations per paper was 15 in our sample. only very few health-oriented papers, such as ref. 39 , also discuss mixed strategies involving economic or psychological considerations to fight the coronavirus. it seems therefore that the majority of the scientific effort is focused on the short term, without taking into account what might be the mid- and long-term societal consequences of the covid-19 crisis. one may also notice that there is probably another crisis of medical models that can be observed due to the covid-19 crisis. this other crisis centers on the merits of hydroxychloroquine and azithromycin as a possible treatment of covid-19, as proposed by raoult and his team. 40 according to medical methodologists, this proposal has since been shown not to be supported by a rigorous methodological approach. 41 however, medical statistical methodology (see ref.
42 for an introduction to this domain) appears also to be questionable from a modeling perspective: the frequency-based models used in methodological medicine usually cannot have probabilistic interpretations due to a lack of the large series of experiments required to apply the law of large numbers; 43 hence such frequency-based models can only find correlations between proposed medications and observed effects on structurally limited series, due to the high costs of clinical studies. 44 but since correlation is not causation, it is just not possible, without any understanding of the underlying biological mechanisms, to scientifically deduce anything from such studies, as long as we agree on the fact that science deals with causal explanations-which does not, however, prevent using correlation-based results from a practical perspective as soon as they are established in a sound way. 72 in this analysis, the debate around the rigor of the pragmatic and agile approach followed by raoult may just be a new popperian debate 45 opposing different medical methods for addressing an infectious health crisis, similar to the debates that existed in physics around aristotelian theory in the 16th century 46 or aether theory in the 19th century. to conclude this initial discussion on the crisis of models, we point out that if the scenario that we highlighted in figure 2 comes true, we may also eventually be forced to deal with another crisis of models, namely, the crisis of mathematical models used in finance.
these other "models" are not necessarily models in the observational sense that we are using in this paper since they suffer from many well-known issues such as reflexivity, 47 which refers to the fact that mathematical financial models are essentially observing other mathematical financial models, or more deeply the lack of evidence for the market equilibrium hypothesis, 48 which is at the heart of the probabilistic framework used in mathematical finance, but which is in fact rarely observed in practice (see, for instance, ref. 48 or 49), especially in a financial crisis situation where the market is of course highly unbalanced and volatile and therefore out of equilibrium, as pointed out by several researchers. the covid-19 crisis is thus forcing us to open our eyes and to look for the "right" models to use for effectively managing human society. one should use models that are effectively capturing the reality as it is and not as we would like it to be, if we want to make nondominated decisions in the face of a crisis of such magnitude and have a chance to tackle it successfully. as stated above, there is a crucial need for constructing a realistic observational system model of the covid-19 crisis. we shall now present the main ingredients of such a systemic model. taking a systems approach leads us naturally to construct first a systemic framework for modeling the covid-19 crisis. the first step toward that objective is to understand what are the main systems 10 involved in or impacted by the crisis. 
in that respect, the following ones are quite obvious:

• the natural environment, from which the coronavirus which initiated the crisis is coming,
• the social system, which contains the population that is or can be infected by the coronavirus,
• the health system, which attempts to cure the people infected by the coronavirus,
• the governance system, which has to choose the optimal health policy to face the pandemic,
• the economic system, which may be indirectly impacted by the covid-19 crisis.

note that the impact of the covid-19 crisis on the economic system depends of course on the health policy chosen by the governance (political) system. if a health policy recommends or forces-as often done 5 -a large fraction of its population to stay home, it causes a double shock, 31 first on the supply side, since economic actors which are lacking a workforce must reduce their production, and second on the demand side, since people who are not working anymore are usually paid less or not at all and thus are also consuming less. we can now sketch the first item of our generic covid-19 systemic framework, which is the high-level environment 10 that we modeled in figure 3. this first system view exposes the exchanges of matter, people, information, and money-plus coronavirus here-that exist between the main systems involved in the covid-19 crisis. note that the overall system taken into account here, ie., human society as a whole, including its natural environment, is a closed system on our home planet earth. as a consequence, the only levers to solve the crisis are internal to this global system. using that technique, the point is thus to be able to construct realistic domain-specific lifecycle scenarios for each system involved in the covid-19 environment. we first focus only on the social and economic systems, since we are considering here the situation that occurs after the end of the covid-19 health crisis (see figure 2).
we can then see that the lifecycle of the social system can be analyzed to first order in terms of wealth and health. in a systems approach, we will thus have to construct the different possible global lifecycle scenarios that can be achieved in this way (see figure 4 for an illustration of this classical process), to evaluate their probabilities, and to define means to mitigate the worst consequences. to obtain more detailed models, we shall moreover refine them in terms of space, to capture the geographic dimension of human society, and time, to make optimal trade-off decisions between the short- and long-term impact of the covid-19 crisis. note also that these lifecycle scenarios are of course highly country-dependent due to the central role of the governance system in the resolution of the covid-19 crisis, as well as the susceptibility of the population, which is an initial condition. the last element of our covid-19 systemic framework is finally a mission statement, 10 ie, the core high-level requirement regarding human society which expresses the objective that the governance system wants to fulfill. one can indeed understand that the behavior of our system of interest-human society-will be different depending on whether one wants to minimize the impact of the covid-19 crisis on the social, health, or economic system, or to find the best balance between the impacts on these three systems. this is a multiobjective optimization problem for which we provide a sample result below, and that we intend to explore in more detail in a forthcoming paper. it is therefore of high importance-as system theory tells us (see refs. 10, 14, or 16)-to be able to clearly define the mission to achieve. taking a systems approach to the covid-19 crisis requires instantiating our systemic framework per country.
each country has its own specificities, associated with its own history and culture, that one must consider in any systems approach: for instance, chinese traditional medicine and rigorous group behaviors are specific to china, a centralized governance system and poorly followed health rules are specific to france, and a heterogeneous health system that favors more affluent consumers and differentiated laws and policies by state are specific to the united states of america. these types of compartmental models have significant limitations, since they only consider the human population in a macroscopic way, reacting globally in a uniform manner to an epidemic, which is not the case in reality. furthermore, in a classic sird model, eventually 100% of the population is infected, which is never observed in practice. in the covid-19 pandemic, one can also observe clusters where the epidemic seems to recursively focus, 5 which rather suggests a fractal epidemic propagation, as also mentioned in an older paper by jansse et al in 1999 55 which did not seem to have been further explored by the epidemiology community. such fractal behavior is, however, not at all captured by the classical sird-like compartmental models. note also that, quite surprisingly, we did not find significant scientific papers studying the geometric multiscale structure of the geography of the covid-19 pandemic, which also suggests that this dimension has not yet been analyzed in depth. in order to better integrate geography, which is one of the most important features of the human population system, we choose a social-network approach, as, for instance, in ref. 56 . in such an approach, the human population is modeled as a network, that is to say an undirected graph, 57 where each node of the network represents an individual or a group of people, eg., a family, and each edge represents a connection between people.
for the purpose of our study, we used networks randomly generated according to the barabási-albert model, 58 which is believed to capture the most important features of real social networks. we shall recall that the barabási-albert model generates networks by introducing nodes one by one (after an initial step). a degree d is chosen for each new node, which is then connected to d other nodes chosen at random from the nodes already in the network. to simulate a social network, the average value of the degree d is usually chosen between 2 and 3. the barabási-albert model produces randomized scale-free networks in which most of the nodes have a low degree (below 10), but some may have a very high degree. in order to understand how an epidemic propagates in a population modeled in this way, we used networks with 100 000 nodes and an average degree d for new nodes of 2.1. with these features, the degree of nodes in a social network is typically distributed as shown in table 1. potential "superspreaders" are individuals with large degree >20. to model the propagation of an epidemic in this network, we discretized a classical sird-like model (see refs. 52 and 54 and figure 5), which leads us to represent the evolution of the state of each node of the social network that models the human population by a stochastic finite automaton (figure 6: stochastic state automaton modeling the possible evolution of a node in the social network) whose possible transitions use randomly drawn durations-typically gaussian-for incubation and sickness times. we, however, think that our experiments can give us a better qualitative understanding of epidemic propagation, since we believe that this social-network approach better captures the fundamentals of the social system, compared to the simpler compartment-type models. it may thus be helpful for constructing more realistic epidemic propagation models, even if it would require a very significant amount of data collection and fine-tuning.
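the network construction described above can be sketched in a few lines. note two hedges: the paper says new nodes attach to existing nodes "at random," whereas the barabási-albert model proper attaches them with probability proportional to current degree (preferential attachment), which is what produces the scale-free degree distribution; the sketch below uses preferential attachment. the average new-node degree of 2.1 comes from the paper, but the network size is reduced from 100 000 to 20 000 for speed, and all code names are hypothetical.

```python
import random

def scale_free_network(n, seed=0):
    """generate a barabási-albert-style network: each new node draws a
    degree d of 2 or 3 (mean ~2.1, as in the paper's experiments) and
    attaches to d existing nodes chosen with probability proportional
    to their current degree (preferential attachment)."""
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}          # initial step: two connected nodes
    targets = [0, 1]                # each node repeated once per unit of degree
    for new in range(2, n):
        d = 3 if rng.random() < 0.1 else 2   # mean degree of new nodes ~2.1
        chosen = set()
        while len(chosen) < min(d, new):
            chosen.add(rng.choice(targets))  # degree-proportional choice
        adj[new] = set()
        for t in chosen:
            adj[new].add(t)
            adj[t].add(new)
            targets.extend([new, t])
    return adj

net = scale_free_network(20_000)
degrees = sorted((len(nb) for nb in net.values()), reverse=True)
low = sum(1 for nb in net.values() if len(nb) < 10) / len(net)
print(f"max degree {degrees[0]}, share of nodes with degree < 10: {low:.2%}")
```

running this reproduces the qualitative shape reported in table 1: the vast majority of nodes have degree below 10, while a handful of potential "superspreaders" have degrees far above 20.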
the use of contact tracers in health systems is, for instance, a direct, but laborious, way to reconstruct such social networks to quickly identify infected people and to isolate them before they infect others. 60 our first experiment consisted of simulating increasingly virulent epidemics by assuming increasing values of the probability ρ of infecting somebody (1000 trials were done per value of ρ). our results are described in table 2. they show a remarkably interesting phenomenon: for all values of the probability ρ, only a tiny fraction π of the population is eventually infected in most of the ν simulations (less than 1 in 1000 persons in more than 90% of the cases); when a significant proportion (greater than 1%) is infected, the fraction of infected people π depends on ρ. in simpler terms, this can be stated as follows: there are a lot of viruses circulating in the population, but only a few of them give rise to epidemic outbreaks. the reasons for which a virus gives rise to an epidemic outbreak are intrinsic to the virus itself (table 2: proportion π of the population that is infected, for different values of the propagation probability ρ), but also dependent on external factors such as who is infected first, eg, a person with few contacts and low nodal degree or a superspreader with high nodal degree as shown in table 1, and also the behavior of the population, which impacts ρ. this may explain, at least to some extent, why some countries or regions are more stricken than others, which suggests again a fractal interpretation of the geographical scope of an epidemic, as already mentioned above. the second experiment that we shall report on in this section aimed at studying the effects of the deconfinement of a confined population that had been ordered to shelter in place.
we studied here different proportions τ of the population that becomes sick before the epidemic becomes observable (ie., roughly between day 10 and 20 in figure 5) and different values of the duration γ, in days, of confinement. we considered that there was a delay δ of 20 days before confinement was put in place and took ρ = 0.015. we also simulated the efficiency of the confinement by reducing the capacity of edges in the social network to propagate the disease by a factor 1 − ε, with 0 ≤ ε ≤ 1. this factor represents the degree of adherence of the population to sanitary guidelines for social distancing, wearing face masks, and so forth. at each step of the simulation, representing one day, an infected node thus has probability ρ × (1 − ε) to infect an adjacent healthy node. in our experiment, we took ε = 0.66 (≈ 2/3). we then reported the computed values of the resulting lethality in table 3. we assumed that γ is as large as necessary, which is clearly not realistic, since confinement cannot be maintained for too long for both economic and psychological reasons, but the results give the underlying trend. each measure reported was obtained by means of a monte-carlo simulation of 2000 trials. as expected, the longer the confinement, the fewer deaths. note however that, to be fully efficient, the confinement must be rather long, several months (>90 days) in our virtual experiment. the most interesting part of this experiment comes, however, from the observation of the total duration of the epidemic outbreak. table 4 shows these durations for the same values of τ and γ as in table 3. if the confinement is sufficiently long, the lethality drops significantly (in some cases below 1%), but also the total duration of the epidemic outbreak is shortened.
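the simulation mechanics described above can be sketched compactly. only the shapes are taken from the paper (per-contact daily infection probability ρ reduced to ρ × (1 − ε) during a confinement of γ days starting δ days into the outbreak); the network itself, the sickness duration, and the case-fatality rate are stand-in assumptions, and the population is much smaller than the paper's 100 000 nodes for speed.

```python
import random

def simulate(n=2000, avg_deg=4, rho=0.05, eps=0.66, delta=20, gamma=90,
             sick_days=14, cfr=0.02, horizon=365, seed=1):
    """return the lethality (deaths / population) of one epidemic run
    on a random contact network, with a confinement window during
    which the per-edge daily infection probability drops to
    rho * (1 - eps). all parameter values are illustrative."""
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]            # random graph, ~avg_deg edges/node
    for _ in range(n * avg_deg // 2):
        a, b = rng.randrange(n), rng.randrange(n)
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    state = ["S"] * n                          # susceptible/infected/recovered/dead
    days_sick = [0] * n
    state[0] = "I"                             # patient zero
    deaths = 0
    for day in range(horizon):
        p = rho * (1 - eps) if delta <= day < delta + gamma else rho
        newly = []
        for v in range(n):
            if state[v] != "I":
                continue
            for u in adj[v]:                   # try to infect neighbors
                if state[u] == "S" and rng.random() < p:
                    newly.append(u)
            days_sick[v] += 1
            if days_sick[v] >= sick_days:      # end of sickness: recover or die
                if rng.random() < cfr:
                    state[v] = "D"
                    deaths += 1
                else:
                    state[v] = "R"
        for u in newly:
            state[u] = "I"
        if "I" not in state:
            break
    return deaths / n

print(f"lethality with 90-day confinement: {simulate():.3%}")
print(f"lethality with no confinement:     {simulate(gamma=0):.3%}")
```

with these toy parameters the same qualitative trends as tables 3 and 4 can be probed by varying gamma, eps, and delta; the absolute numbers are not meant to match the paper's calibrated runs.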
if the confinement is not maintained sufficiently long, it is still partially effective, in that it reduces the lethality, but it has a quite paradoxical consequence: the epidemic outbreak lasts longer than if no countermeasures were taken at all. a short confinement does not prevent the disease from significantly propagating: it just slows down the propagation and avoids the sharp peak of infected shown in figure 5 around day 40, which seems to be its main purpose, namely not overwhelming the capacity of the health care system. for this reason, when the population is deconfined too early, the disease is still present and remains endemic. the above experiments do not pretend to fully represent reality, but are intended only to motivate the use of social-network models for epidemic modeling. as pointed out by stattner and vidot, 56 "network models turn out to be a more realistic approach than simple models like compartment or metapopulation models, since they are more suited to the complexity of real relationships." one of the limitations of existing network models is, however, that they do not distinguish between recurring social links with family members and coworkers and casual links based on one-time encounters, such as in public transportation or at large events. they should therefore be further refined and integrated into a model-based agile approach for crisis management, while taking into account their limitations. in this section, we model the potential impact of the epidemic as a function of different actions of the governance system on the economic system (see figure 3). in order to do so, we must expand the prior analysis by considering not only lethality in terms of deaths (see table 3), but also the value of lost economic activity during confinement.
reverting back to the simplified sird model in figure 5, but now accounting for the fraction of the population ε actually adhering to confinement during a lockdown of duration γ, which is ordered with some delay δ after a critical cumulative threshold τ of the population has become infected, we ran a set of simulations. the baseline run of the model shown in figure 5 is considered as scenario 0, with no countermeasures, and it is gradually modified using the one-factor-at-a-time (ofat) technique to test a number of actions by the governance system, including reducing the delay δ to order a lockdown, increasing the level of rigor ε of the confinement, as well as its duration γ. table 5 shows the results of a number of numerical experiments to probe these trade-offs in terms of the value of human lives lost versus productive work lost in the economic system. in order to estimate the economic impact of the epidemic, a number of assumptions were made: in the baseline scenario 0, we do not take any countermeasures, and the bulk of the $4.38b total loss is due to the deaths of 4% of the population. the $312m in lost work is due to the inability of the infected and sick population to perform work during their illness, which is assumed to last for 14 days. this is the kind of situation we would expect to see in a country with a government that is either unable or unwilling to intervene in the crisis. scenarios 1-3 institute a partial lockdown (ε = 66%) after either 10 or 20 days of delay after recognizing the onset of the epidemic, and the confinement lasts either 30 or 60 days. the results are not satisfactory, since the total damages exceed the baseline case where no action is taken. this outcome is due to the fact that one-third of the population does not adhere to the confinement and continues to be infected, making the disease endemic.
a prolonged partial lockdown for 60 days with only 66% effectiveness, as shown in scenario 3, is the worst case and leads to both a high number of deaths (about 4000) and high economic damages totaling $6.1b due to the prolonged shutdown, which ultimately is ineffective. this scenario is representative of the overall situation in the united states in mid-2020. in scenarios 4-7, we shorten the reaction time to trigger the confinement to only five days (quick government action) and we gradually increase the rigor of the confinement to 90% (strong government enforcement). it turns out that these actions are highly effective, yielding a best-case scenario 7 with only 66 deaths, a short epidemic duration of 61 days, and only $740m in damages, mainly due to the strict but short 30-day confinement in which 90% of the population participates. this essentially prevents the epidemic from blossoming and quickly snuffs out the disease (figure 7).

(table 5: scenario analysis with the sird model for assessing total human and economic damages: n, number of daily contacts; ρ, probability of infection; τ, fraction of population infected to trigger confinement; ε, fraction of population adhering to confinement; δ, delay to confinement start; γ, confinement duration; t, duration of epidemic; total number of deaths; lost work in millions $m; and total damages including lost human lives and lost work in billions $b; n = 100,000 population size.)

this may reflect the situation in countries that responded in this way. the economic analysis shows that the initial conditions, speed of response, and rigor of response by the governance system are crucial in determining the outcome. figures 8 and 9, respectively, show the sharp contrast between the ratio of human loss (deaths) and economic work loss for scenario 0 (do nothing) and scenario 5 (rapid and strong government response).
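the damage accounting behind table 5 can be reconstructed back-of-the-envelope: total damage = deaths times the value of a statistical life, plus lost workdays times daily output per person. the paper does not state the per-life and per-day values explicitly, so the constants below are assumptions chosen so the lost-work figure is of the same order as the baseline $312m; the 40% attack rate is likewise an assumption, not a number from the paper.

```python
# hedged reconstruction of the table 5 damage accounting;
# VALUE_PER_LIFE and DAILY_OUTPUT are illustrative assumptions.

VALUE_PER_LIFE = 1.0e6   # $ per death, assumed for illustration
DAILY_OUTPUT = 557.0     # $ of output per person-day, assumed

def total_damages(population, death_rate, infected_rate, sick_days=14,
                  confined=0, confinement_days=0):
    """return (total damages, lost work) in dollars: deaths are valued
    at VALUE_PER_LIFE; sick people lose sick_days of work each, and
    confined people additionally lose confinement_days of work."""
    deaths = population * death_rate
    lost_days = (population * infected_rate * sick_days
                 + confined * confinement_days)
    lost_work = lost_days * DAILY_OUTPUT
    return deaths * VALUE_PER_LIFE + lost_work, lost_work

# baseline-like scenario: no countermeasures, 4% of 100,000 die,
# and (assumed) 40% fall sick for 14 days
total, work = total_damages(100_000, 0.04, 0.40)
print(f"lost work ${work / 1e6:.0f}m, total damages ${total / 1e9:.2f}b")
```

under these assumptions the lost-work term comes out near the paper's $312m and the death term dominates the total, consistent with the baseline scenario 0 description; the exact $4.38b figure would imply a slightly different value per life.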
there is indeed a trade-off between deaths and lost work, as in the scenarios above. in the previous section, we identified a deep crisis of models that has been exposed by the covid-19 pandemic and proposed to mitigate this issue by constructing a systemic model of the crisis. in this section, we shall deal with some possible solutions to master the crisis using a systems approach. as is well known in any scientific discipline, the solution of a problem highly depends on the clarity and rigor of the way the problem is framed. we will therefore dedicate this short section to the statement of the problem that we need to solve in the context of the covid-19 crisis. a first characteristic of the covid-19 crisis is its global impact on human society. this crisis can thus be considered as a common cause failure-in the meaning of system safety theory 61 -for all main systems forming human society. if we take a safety approach, the first problem to solve is thus to mitigate the impacts of the crisis on the vulnerable systems forming human society, that is to say the social, health, and economic systems, as results from the system analysis of section 2.2.1. a second characteristic of the covid-19 crisis comes from the need to take into account strong feedback delays. in this matter, a first type of delay comes from the fact that it is most of the time too late to deploy mitigation actions to limit the epidemic propagation when significant numbers of infections are observed somewhere, since the effects of these actions will only be observable two weeks later. this was clearly shown in table 5 in scenarios 8-10. moreover, a second, totally different type of delay comes from the fact that focusing on short-term health impacts of the crisis may lead to long-term issues of an economic nature, which forces us to arbitrate between short- and long-term consequences of a given action. finally, a last characteristic of the covid-19 crisis is uncertainty.
due to the global nature of the crisis and the rather short period of time on which it is concentrated, uncertainty is everywhere. clinical data about the infection are permanently partial, and thus difficult to interpret. the real network structure of the social system is never easy to capture. the exact nature and size of the impact on the economic system are difficult to evaluate. precise data on the capabilities on which to rely may be tricky to obtain. last, but not least, the crisis also results in a massive, heterogeneous, and often contradictory amount of data in which the really interesting signals may be either weak or hidden. synthesizing these three features of the crisis, the problem to solve in our context can now be clearly stated: how to optimally mitigate the short- and long-term consequences of the covid-19 pandemic on human society, taking into account delays and uncertainties that are specific to this crisis? one can notice that this statement is a typical control problem-in the sense of control theory 62 -integrating here delay and uncertainty, which can be addressed by many existing techniques (see refs. 63 and 64). consequently, the objective should be to design a new system that can support this controllability objective. based on the closed-loop control principle, which is the only one that allows one to achieve a given target behavior along the time axis, 62 such a covid-19 decision-aid system (shown as the gray box at the top of figure 10: high-level covid-19 environment integrating a specific decision-aid system that has yet to be designed) will have to measure the current state of the main systems forming human society in order to provide effective feedback actions on the social system through the governance system, the only legitimate one to make decisions and take control actions. figure 10 depicts how such a decision-aid system could be integrated into the high-level covid-19 environment.
there is at least one domain in which making decisions under structural uncertainties over an underlying geographic scope has long been well understood in human history: the military domain. architecting a covid-19 decision-aid system using the typical architectural pattern of a command, control, communications, and computers (c4) system (see ref. 65 or 66), used in the defense area, thus seems quite a natural idea, as it is also quite often used in a system-of-systems engineering context (see ref. 18 or 19). this leads us to propose an organization for a covid-19 decision-aid system based on the following three hierarchical layers, which correspond to three natural levels of abstraction associated with a given geographic scope (in practice either the international, country, and local levels or the country, region, and city levels), exactly like c4 systems are organized:
1. the strategic layer is the place where global situational awareness is required to master the crisis on a given large-scale geographic scope: its mission is to monitor the crisis at a high level and to elaborate strategic decisions based on an overall vision, fed by tactical information;
2. the operation layer is intended to master the crisis on a given medium-scale geographic scope: it is thus a distributed system which has to capture and synthesize tactical information and make operational decisions on this basis, in accordance with the upper strategic decisions;
3. the tactical layer is intended to master the crisis on a local geographic scope: it is thus again a distributed system which has to capture and synthesize field information and make tactical decisions on this basis, in accordance with the upper operational decisions.
note that this architecture, shown in figure 11, shall be understood as a hierarchical enterprise architecture, which defines how an organizational system, supported by suitable information systems and the systemic models discussed previously, shall be organized and behave. the main idea underpinning it is the principle of subsidiarity: decisions should be taken as close as possible to the level that is most appropriate for their resolution. this principle means in particular that an upper level shall avoid making decisions that are too intrusive for a lower level, in order to let each local level always take the most appropriate actions depending on the real local conditions that it can observe, while following global orientations when locally relevant. this is crucial in the military sphere, but even more so in the context of the covid-19 crisis, where speed of decision making is fundamental due to the latency of the epidemic propagation, as seen in section 2. note that one shall also capture weak signals of systemic importance at each level of the proposed architecture: to illustrate that point, the fact that a police officer is infected in a certain area is, for instance, a typical weak signal, since we may infer from it that there is a certain probability that the whole police force in the concerned area is or soon will be infected (since the number of daily contacts, or nodal degree, of police officers may exceed n > 10, see tables 1 and 5). proposing the previous hierarchical architectural pattern is, however, of course not enough to specify how a covid-19 decision-aid system shall work. in this matter, the first point is to organize the systemic model that we sketched out in section 2.2 according to the hierarchy that we just presented and which is used to organize the proposed decision-aid system.
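the subsidiarity principle described above can be sketched in a few lines of code. the following python fragment is a hypothetical illustration (the layer names match the text, but the scope thresholds and the `decide` rule are our assumptions, not part of the proposed architecture): each layer handles issues that fit its geographic scope and escalates only what exceeds it.

```python
# minimal sketch of subsidiarity in a three-layer decision hierarchy
# (thresholds and the escalation rule are hypothetical illustrations).

class Layer:
    def __init__(self, name, scope_limit, parent=None):
        self.name = name
        self.scope_limit = scope_limit   # largest issue this layer may decide on
        self.parent = parent             # next layer up the hierarchy

    def decide(self, issue_size):
        # subsidiarity: decide as low in the hierarchy as possible,
        # escalate only when the issue exceeds the local scope
        if issue_size <= self.scope_limit or self.parent is None:
            return f"{self.name} decides"
        return self.parent.decide(issue_size)

strategic = Layer("strategic", scope_limit=float("inf"))
operation = Layer("operation", scope_limit=1000, parent=strategic)
tactical  = Layer("tactical",  scope_limit=10,   parent=operation)

print(tactical.decide(5))      # small local issue: handled at the tactical level
print(tactical.decide(500))    # medium issue: escalated to the operation level
print(tactical.decide(50000))  # large issue: escalated to the strategic level
```

the design point is that escalation is the exception, not the default: every issue enters at the tactical level and only climbs as far as its scope requires.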
hence, such a model shall not be monolithic. last, but not least, the covid-19 decision-aid system that we sketched here shall behave in an agile way, in the meaning of agility in software or industrial development (see refs. 68-70, 71 or 24). a pending problem is to have a plan, do, check, and act process that can quickly adapt to a fast-changing reality. agility allows this issue to be solved by structuring the analysis, decision, and action processes in a very rigorous way, while providing a lot of flexibility to all involved actors, which are two mandatory features for addressing a complex crisis like covid-19. in practice, an agile covid-19 decision-aid process typically has to be organized around regular agile rituals. in this paper, we drew attention to the core importance of having realistic system models to manage and to mitigate a systemic crisis of the magnitude of the covid-19 crisis. we also sketched out what an agile approach to this kind of crisis could look like. our purpose was of course not to propose some definitive solution, which is probably impossible. we do, however, think that the ideas contained in this paper are valuable contributions that may be of interest in the context of the covid-19 crisis, especially since the underlying health crisis will probably be endemic for a certain period of time (at least 200-300 days according to most of our simulation runs) and be coupled with future short- and mid-term economic outcomes. while there are economic impacts due to strong mitigation actions such as mandated confinements (causing lost economic activity), the value loss due to human deaths at an estimated lethality rate of 4% would far exceed the economic losses. we have shown that this depends strongly on the average valuation of a human life, which is in itself a highly controversial issue.
there are of course many detailed aspects of the proposed covid-19 decision support system that require further elaboration. in this paper we focused on the issue of delay and rigor of action in the overall epidemic control system. however, as we discover more about the particular nature of this coronavirus, the issue of the observability of human society (testing) may, for instance, turn out to be an even larger one.

references:
- the epidemiological characteristics of an outbreak of covid-19 - china
- report of the who-china joint mission on coronavirus disease 2019 (covid-19). who; 2020
- statistics and research - coronavirus pandemic (covid-19). our world in data
- covid-19 situation reports
- the site of origin of the 1918 influenza pandemic and its public health implications
- clinical features of patients infected with 2019 novel coronavirus in wuhan, china
- evolving epidemiology and transmission dynamics of coronavirus disease 2019 outside hubei province, china: a descriptive and modelling study
- a novel coronavirus from patients with pneumonia in china
- cesames systems architecting method - a pocket guide. cesames
- strategic engineering - designing systems for an uncertain future. mit
- the art of systems architecting
- the systems approach: fresh solutions to complex problems through combining science and practical common sense. commons lab case study series 2
- introduction to systems engineering
- system modeling and simulation - an introduction
- architecting systems - concepts, principles and practice. college publications
- theory of modeling and simulation - integrating discrete events and continuous dynamic systems
- complex system and systems of systems engineering
- key challenges and opportunities in 'system of systems' engineering
- world dynamics
- the limits to growth
- fluctuations in the abundance of a species considered mathematically
- agile systems engineering. www.incose.org/chapters groups/workinggroups/transformational/agile-systems-se
- cesam engineering systems agile framework. 10th international conference on complex systems design & management
- the agile manifesto reworked for systems engineering
- the global impact of covid-19 and strategies for mitigation and suppression. imperial college covid-19 response team, imperial college
- the global macroeconomic impacts of covid-19: seven scenarios
- how will country-based mitigation measures influence the course of the covid-19 epidemic?
- covid-19 cases per million inhabitants: a comparison. statista
- confronting the crisis: priorities for the global economy
- we pay minimum five times as much for a life year with the corona measures than we normally do
- embedded system design. kluwer
- adaptive control processes: a guided tour
- abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints
- science and sanity - an introduction to non-aristotelian systems and general semantics. the international non-aristotelian library pub
- all models are wrong, but some are useful
- the publish or perish book. melbourne: tarma software research pty ltd
- combining behavioral economics and infectious disease epidemiology to mitigate the covid-19 outbreak
- hydroxychloroquine and azithromycin as a treatment of covid-19: results of an open-label non-randomized clinical trial
- statistical review of hydroxychloroquine and azithromycin as a treatment of covid-19: results of an open-label non-randomized clinical trial
- clinical research methodology and evidence-based medicine: the basics. anshan ltd
- an introduction to probability theory and its applications. vols. i and ii
- clinical trial cost is a fraction of the drug development bill, with an average price tag of 19 m$. outsourcing pharma
- conjectures and refutations: the growth of scientific knowledge. london and new york: basic books
- galileo studies. branch line
- reflexivity in credit markets. california institute of technology
- the flawed foundations of general equilibrium: critical essays on economic theory
- financial crisis dynamics: attempt to define a market instability indicator
- a theory of timed automata
- a contribution to the mathematical theory of epidemics
- mathematical structure of epidemic systems
- an introduction to infectious disease modelling
- lévy-flight spreading of epidemic processes leading to percolating clusters
- social network analysis in epidemiology: current trends and perspectives
- the theory of graphs and its applications
- statistical mechanics of complex networks
- statistical mechanics of cellular automata
- cluster-based epidemic control through smartphone-based body area networks
- system reliability theory - models
- mathematical control theory: deterministic finite dimensional systems
- uncertainty and control
- control of time-delay systems
- understanding command and control. washington, dc: us department of defense ccrp publication series
- joint chiefs of staff. doctrine for command, control, communications, and computer (c4) systems support to
- optimal treatment of an sir epidemic model with time delay
- agile manifesto
- a decade of agile methodologies: towards explaining agile software development
- agile systems engineering
- national heart institute, national institutes of health, public health service, federal security agency. epidemiological approaches to heart disease: the framingham study. joint session of the epidemiology, health officers, medical care, and statistics sections of the american public health association
- multiple risk functions for predicting coronary heart disease: the concept, accuracy, and application

the authors would like to thank the three anonymous referees for their valuable comments, which contributed to a substantial improvement of the paper.
we would finally like to stress that systems engineering has an important role to play in the covid-19 context, since it can enable the necessary collaboration of the various disciplines - such as biology, economics, engineering, epidemiology, finance, geography, health policy management, immunology, logistics, manufacturing, medicine, safety, sociology, urban systems, and so forth - that all provide a piece of the complex puzzle posed by the global covid-19 crisis.

key: cord-001687-paax8pqh authors: henkel, jan; woodruff, maria a.; epari, devakara r.; steck, roland; glatt, vaida; dickinson, ian c.; choong, peter f. m.; schuetz, michael a.; hutmacher, dietmar w. title: bone regeneration based on tissue engineering conceptions — a 21st century perspective date: 2013-09-25 journal: bone res doi: 10.4248/br201303002 sha: doc_id: 1687 cord_uid: paax8pqh

the role of bone tissue engineering in the field of regenerative medicine has been the topic of substantial research over the past two decades. technological advances have improved orthopaedic implants and surgical techniques for bone reconstruction. however, improvements in surgical techniques to reconstruct bone have been limited by the paucity of autologous materials available and by donor site morbidity. recent advances in the development of biomaterials have provided attractive alternatives to bone grafting, expanding the surgical options for restoring the form and function of injured bone. specifically, novel bioactive (second generation) biomaterials have been developed that are characterised by controlled action and reaction in the host tissue environment, whilst exhibiting controlled chemical breakdown and resorption with ultimate replacement by regenerating tissue. future generations of biomaterials (third generation) are designed to be not only osteoconductive but also osteoinductive, i.e.
to stimulate regeneration of host tissues by combining tissue engineering and in situ tissue regeneration methods with a focus on novel applications. these techniques will lead to novel possibilities for tissue regeneration and repair. at present, tissue engineered constructs that may find future use as bone grafts address complex skeletal defects which, whether of post-traumatic, degenerative, neoplastic or congenital/developmental origin, require osseous reconstruction to ensure structural and functional integrity. engineering functional bone using combinations of cells, scaffolds and bioactive factors is a promising strategy and a particular feature for future development in the area of hybrid materials, which are able to exhibit suitable biomimetic and mechanical properties. this review will discuss the state of the art in this field and what we can expect from future generations of bone regeneration concepts. after 15 years of tissue engineering & regenerative medicine 1.0 and another 10 years of 2.0 versions (1), the era of tissue engineering 3.0 has begun. this review will des-[...] care outcomes. today major reconstructive surgeries (due to trauma or tumour removal) are still limited by the paucity of autologous materials available and donor site morbidity. recent advances in the development of scaffold-based tissue engineering (te) have given the surgeon new options for restoring form and function. bioactive biomaterials (second generation) are now available that elicit a controlled action and reaction in the host tissue environment, with a controlled chemical breakdown and resorption, to ultimately be replaced by regenerating tissue. third-generation biomaterials are now being designed to stimulate regeneration of living tissues using tissue engineering and in situ tissue regeneration methods.
engineering functional bone using combinations of cells, scaffolds and bioactive factors is seen as a promising approach, and these techniques will undoubtedly lead to countless possibilities for tissue regeneration and repair. there are currently thousands of research papers and reviews available on bone tissue engineering, but there is still a major discrepancy between scientific research efforts on bone tissue engineering and the clinical application of such strategies. there is an evident lack of comprehensive reviews that cover both the scientific research aspect and the clinical translation and practical application of bone tissue engineering techniques. this review will therefore discuss the state of the art of scientific bone tissue engineering concepts and will also provide current approaches and future perspectives for the clinical application of bone tissue engineering. bone as an organ has, next to its complex cellular composition, a highly specialised organic-inorganic architecture which can be classified as a micro- and nanocomposite tissue. its mineralised matrix consists of 1) an organic phase (mainly collagen, 35% dry weight) responsible for its rigidity, viscoelasticity and toughness; 2) a mineral phase of carbonated apatite (65% dry weight) for structural reinforcement, stiffness and mineral homeostasis; and 3) other non-collagenous proteins that form a microenvironment stimulatory to cellular functions (2). bone tissue exhibits a distinct hierarchical structural organization of its constituents on numerous levels, including macrostructure (cancellous and cortical bone), microstructure (haversian systems, osteons, single trabeculae), sub-microstructure (lamellae), nanostructure (fibrillar collagen and embedded minerals) and sub-nanostructure (molecular structure of constituent elements, such as mineral, collagen, and non-collagenous organic proteins) (figure 1) (3).
macroscopically, bone consists of a dense hard cylindrical shell of cortical bone along the shaft of the bone that becomes thinner with greater distance from the centre of the shaft towards the articular surfaces. cortical bone encompasses increasing amounts of porous trabecular bone (also called cancellous or spongy bone) at the proximal and distal ends to optimise articular load transfer (2). in humans, trabecular bone has a porosity of 50-90% with an average trabecular spacing of around 1 mm and an average density of approximately 0.2 g·cm^-3 (4-6). cortical bone has a much denser structure, with a porosity of 3-12% and an average density of 1.80 g·cm^-3 (5, 7). on a microscopic scale, trabecular struts and dense cortical bone are composed of mineralized collagen fibres stacked parallel to form layers, called lamellae (3-7 µm thick), which are then stacked in a ±45° manner (2). in mature bone these lamellae wrap in concentric layers (3-8 lamellae) around a central canal, named the haversian canal, which contains nerves and blood vessels, to form what is called an osteon (or a haversian system), a cylindrical structure running roughly parallel to the long axis of the bone (3). cancellous bone consists of an interconnecting framework of rod- and plate-shaped trabeculae. on a nanostructural level, the most prominent structures are the collagen fibres, surrounded and infiltrated by mineral. at the sub-nanostructural level, the three main materials are bone crystals, collagen molecules, and non-collagenous organic proteins. for further details the reader is referred to (3). mineralised bone matrix is populated with four bone-active cells: osteoblasts, osteoclasts, osteocytes and bone lining cells. additional cell types are contained within the bone marrow that fills the central intramedullary canal of the bone shaft and the intertrabecular spaces near the articular surfaces (8).
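the porosity and density figures quoted above are mutually consistent, which can be checked with a one-line relation: apparent density = (1 - porosity) × tissue density. the python sketch below assumes (our assumption, not stated in the text) a fully dense bone matrix density of about 2.0 g·cm^-3.

```python
# consistency check of the porosity and apparent density figures quoted above.
# the solid-matrix density of 2.0 g/cm^3 is an assumed value for illustration.

TISSUE_DENSITY = 2.0  # g/cm^3, assumed density of the fully dense bone matrix

def apparent_density(porosity):
    """apparent density of porous bone in g/cm^3: solid fraction times matrix density."""
    return (1.0 - porosity) * TISSUE_DENSITY

# trabecular bone at ~90% porosity -> ~0.2 g/cm^3, matching the quoted average
print(round(apparent_density(0.90), 2))
# cortical bone at ~10% porosity -> ~1.8 g/cm^3, matching the quoted average
print(round(apparent_density(0.10), 2))
```

this is why trabecular and cortical bone, built from essentially the same mineralised matrix, differ roughly ninefold in apparent density: the difference is almost entirely a porosity effect.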
bone has to be defined as an organ composed of different tissues; it also serves as a mineral deposit affected and utilised by the body's endocrine system to regulate (among others) calcium and phosphate homeostasis in the circulating body fluids. furthermore, recent studies indicate that bone exerts an endocrine function itself by producing hormones that regulate phosphate and glucose homeostasis, integrating the skeleton into the global mineral and nutrient homeostasis (9). bone is a highly dynamic form of connective tissue which undergoes continuous remodelling (the orchestrated removal of bone by osteoclasts followed by the formation of new bone by osteoblasts) to optimally adapt its structure to changing functional demands (mechanical loading, nutritional status etc.). from a material science point of view, bone matrix is a composite material of a polymer-ceramic lamellar fibre-matrix, and each of these design and material aspects influences the mechanical properties of the bone tissue (10). the mechanical properties depend on the bone composition (porosity, mineralisation etc.) as well as on the structural organisation (trabecular or cortical bone architecture, collagen fibre orientation, fatigue damage etc.) (11). collagen possesses a young's modulus of 1-2 gpa and an ultimate tensile strength of 50-1 000 mpa, compared to the mineral hydroxyapatite, which has a young's modulus of ~130 gpa and an ultimate tensile strength of ~100 mpa. the resulting mechanical properties of the two types of bone tissue, namely cortical bone and cancellous bone, are shown in table 1. age and related changes in bone density have been reported to substantially influence the mechanical properties of cancellous bone (12). as outlined above, bone shows a distinct hierarchical structural organization, and it is therefore important to also define the mechanical properties at microstructural levels (table 2).
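the collagen and hydroxyapatite moduli quoted above can be used to bracket the stiffness of bone treated as a two-phase composite. the following python sketch is an illustration, not from the text: it computes the classical voigt (equal-strain) and reuss (equal-stress) bounds, and the mineral volume fraction of 0.45 is an assumed value (the 65% figure in the text is a dry-weight fraction, not a volume fraction).

```python
# hedged illustration: voigt/reuss stiffness bounds for bone as a
# collagen-hydroxyapatite composite. the 0.45 volume fraction is an assumption.

E_COLLAGEN = 1.5    # GPa, mid-range of the 1-2 GPa quoted for collagen
E_MINERAL  = 130.0  # GPa, quoted for hydroxyapatite
V_MINERAL  = 0.45   # assumed mineral volume fraction

def voigt(vm, em, ec):
    """upper bound: phases strained equally (loading along the fibres)."""
    return vm * em + (1.0 - vm) * ec

def reuss(vm, em, ec):
    """lower bound: phases stressed equally (loading across the fibres)."""
    return 1.0 / (vm / em + (1.0 - vm) / ec)

upper = voigt(V_MINERAL, E_MINERAL, E_COLLAGEN)   # roughly 59 GPa
lower = reuss(V_MINERAL, E_MINERAL, E_COLLAGEN)   # roughly 2.7 GPa
print(f"voigt upper bound: {upper:.1f} GPa, reuss lower bound: {lower:.1f} GPa")
```

typical measured stiffness values for cortical bone (see table 1) fall between these two bounds, which is exactly what the hierarchical, ±45° lamellar arrangement of the two phases described earlier would lead one to expect.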
although cancellous and cortical bone may be made of the same kind of material, the maturation of the cortical bone material may alter the mechanical properties at the microstructural level. bone tissue is also known to be mechano-receptive; both normal bone remodelling and fracture or defect healing are influenced by mechanical stimuli applied at the regenerating defect site and the surrounding bone tissue (17-20). in contrast to most other organs in the human body, bone tissue is capable of true regeneration, i.e. healing without the formation of fibrotic scar tissue (21). during the healing process, basic steps of fetal bone development are recapitulated, and bone regenerated in this way does not differ structurally or mechanically from the surrounding undamaged bone tissue (22). [table 1: mechanical properties of compact (cortical) and spongy (cancellous) bone; reproduced and modified from (13).] however, despite this tremendous regenerative capacity, 5-10% of all fractures are prone to delayed bony union or will progress towards a non-union and the development of a pseudarthrosis (23-24). together with large traumatic bone defects and extensive loss of bone substance after tumour resection or revision surgery after failed arthroplasties, these pathological conditions still represent a major challenge in today's clinical practice. the range of bone graft materials available to treat such problems in modern clinical practice essentially includes autologous bone (from the same patient), allogeneic bone (from a donor), and demineralised bone matrices, as well as a wide range of synthetic bone substitute biomaterials such as metals, ceramics, polymers, and composite materials. during the last decades, tissue engineering strategies to restore clinical function have raised considerable scientific and commercial interest in the field of orthopaedic surgery as well as in reconstructive and oromaxillofacial surgery.
yet, the treatment of bone defects and the search for bone substitute materials is not just a modern day phenomenon, with its history reaching back through millennia. the quest for the most efficient way to substitute for lost bone and to develop the best bone replacement material has been pursued by humans for thousands of years. in peru, archaeologists discovered the skull of a tribal chief from 2000 bc in which a frontal bone defect (presumably from trepanation) had been covered with a 1 mm-thick plate of hammered gold (25). trephined incan skulls have been found with plates made from shells, gourds, and silver or gold covering the defect areas (26). in a skull found in the ancient center of ishtkunui (armenia), dating from approx. 2000 bc, a 7 mm diameter skull defect had been bridged with a piece of animal bone (27). these pursuits were not limited to skull surgeries involving bone substitutes. ancient egyptians have been shown to have had profound knowledge of orthopaedic and traumatological procedures, with surgeons having implanted iron prostheses for knee joint replacement as early as 600 bc, as analyses of preserved human mummies have revealed (28). the first modern era report of a bone xenograft procedure is believed to be that of the dutch surgeon job janszoon van meekeren in 1668 (29-30). a skull defect of a russian nobleman was successfully treated with a bone xenograft taken from the calvaria of a deceased dog. the xenograft was reported to have become fully incorporated into the skull of the patient. in the 1800s, plaster of paris (calcium sulphate) was used to fill bone cavities in patients suffering from tuberculosis (31). attempts were also made to fill bone defects with cylinders made from ivory (32). in 1820 the german surgeon phillips von walters described the first clinical use of a bone autograft to reconstruct skull defects in patients after trepanation (33).
walters successfully repaired trepanation holes, following surgery to relieve intracranial pressure, with pieces of bone taken from the patient's own head. in 1881, the scottish surgeon william macewen described the first allogenic bone grafting procedure: he used tibial bone wedges from three donors that had undergone surgery for skeletal deformity correction (caused by rickets) to reconstruct an infected humerus in a 3-year-old child (34). major contributions leading to the development of modern day bone grafting procedures and bone substitutes were made by ollier and barth in the late 1800s. louis léopold ollier carried out extensive experiments to study the osteogenic properties of the periosteum and various other approaches to new bone formation, mainly in rabbit and dog models. he also meticulously reviewed the literature on bone regeneration available at that time, and in 1867 he published his 1 000-page textbook 'traite experimentel et clinique de la regeneration des os et de la production artificielle du tissu osseux' ('experimental and clinical treatise on the regeneration of bone and the artificial production of bone tissue'), in which he described the term 'bone graft' ("greffe osseuse") for the first time (35). in 1895 the german surgeon arthur barth published his treatise 'ueber histologische befunde nach knochenimplantationen' ('on histological findings after bone implantations'), presenting his results of various bone grafting procedures involving the skull and long bones (humerus, forearm bones) of dogs and rabbits, including histological assessment (36). today, both ollier's and barth's works are considered milestones in the development of present day bone grafting procedures and bone substitute materials. with the development of new orthopaedic techniques and increased numbers of joint replacement procedures (prostheses), the demand for bone grafts increased in the 20th century, leading to the opening of the first bone bank for allogenic bone grafts in new york in 1945 (37).
but the risk of an immunological reaction to transplanted allogenic bone material was soon recognized and addressed in various studies (38-39). several procedures, such as the use of hydrogen peroxide to macerate bone grafts ("kieler span") in the 1950s and 1960s to overcome antigenicity, were not successful (40-41). today, bone substitute materials such as (bovine) bone chips are routinely used in clinical practice after being pretreated to remove antigen structures. however, due to the processing steps necessary to abolish antigenicity, most of these grafts do not contain viable cells or growth factors and are therefore inferior to viable autologous bone graft options. when allografts with living cells are transplanted, there is a risk of transmitting viral and bacterial infections: transmission of human immunodeficiency virus (hiv), hepatitis c virus (hcv), human t-lymphotropic virus (htlv), unspecified hepatitis, tuberculosis and other bacteria has been documented for allografts (mainly those containing viable cells) (42). as early as 1932, the work of the swiss surgeon h. matti proved the paramount importance of autologous cancellous bone grafts for bone regeneration approaches (43). having conducted various experiments on the osteogenic potential of autologous and allogenic bone, schweiberer concluded in 1970 that the autologous transplant remains the only really reliable transplantation material of the future, if applied to bring about new bone formation or, crucially, to support the bridging of bone defects (44). even though this statement was made more than 50 years ago, it still remains valid today, when bone is still the second most transplanted material, second only to blood. worldwide, more than 3.5 million bone grafts (either autografts or allografts) are performed each year (45).
recent advances in technology and surgical procedures have significantly increased the options for bone grafting material, with novel products designed to replace the structural properties of bone as well as to promote faster integration and healing. the number of procedures requiring bone substitutes is increasing, and will continue to do so as the population ages and the physical activity of the elderly population increases. therefore, while the current global bone grafting market is estimated to be in excess of us $2.5 billion each year, it is expected to increase at a compound annual growth rate of 7-8% (45). although the last decades have seen numerous innovations in bone substitute materials, the treatment of bone defects with autologous bone grafting material is still considered to be the 'gold standard' against which all other methods are compared (46). autologous bone combines all the properties desired in a bone grafting material: it provides a scaffold for the ingrowth of cells necessary for bone regeneration (= osteoconductive); it promotes the proliferation of stem cells and their differentiation into osteogenic cells (= osteoinductive); and it holds viable cells that can form new bone tissue (= osteogenic) (22, 47). however, the volume of autologous bone graft available from a patient is limited, and an additional surgical procedure is required to harvest the grafting material, which is associated with a significant risk of donor site morbidity. 20-30% of autograft patients experience morbidity such as chronic pain or dysaesthesia at the graft-harvesting site (48). large bone defects (>5 cm) may be treated with bone segment transport or free vascularized bone transfer (49), as the use of an autologous bone graft alone is not recommended because of the risk of graft resorption despite good soft tissue coverage (50).
the vascularised fibula autograft (51) and the ilizarov method (52-54) are the most commonly used treatment methods for larger bone defects; however, complications are common and the process can be laborious and painful for the patient, as she/he may be required to use external fixation systems for up to one and a half years (49, 55-56). the limitations of existing bone grafting procedures, either autologous or allogenic in nature, and the increased demand for bone grafts in limb salvage surgeries for bone tumours and in revision surgeries of failed arthroplasties have renewed interest in bone substitute materials and alternative bone grafting procedures (57). in 1986, masquelet and colleagues (58) first described a new two-stage technique taking advantage of the body's immune response to foreign materials for bone reconstruction. the authors called it the 'concept of induced membranes' - soon to become known as the 'masquelet technique': in a first step, a radical debridement of necrotic bone and soft tissue is followed by the filling of the defect site with a polymethylmethacrylate (pmma) spacer and stabilisation with an external fixator. after the definitive healing of the soft tissue, a second procedure is performed 6-8 weeks later, when the pmma spacer is removed and a morcellised cancellous bone graft (from the iliac crest) is inserted into the cavity (59-60). the cement spacer was initially thought to prevent the collapse of the soft tissue into the bone defect and to prepare the space for bone reconstruction.
however, it was soon discovered that the pmma spacer does not only serve as a place holder: a foreign body reaction to the spacer also induces the formation of a membrane that possesses highly desirable properties for bone regeneration (60-61). the induced membrane was shown to be richly vascularised in all layers; the inner membrane layer (facing the cement) is composed of a synovial-like epithelium, and the outer part is made of fibroblasts, myoblasts and collagen. the induced membrane has also been shown to secrete various growth factors in a time-dependent manner: high concentrations of vascular endothelial growth factor (vegf) as well as transforming growth factor β (tgf-β) are secreted as early as the second week after implantation of the pmma spacer; bone morphogenetic protein 2 (bmp-2) concentration peaks at the fourth week. the induced membrane stimulates the proliferation of bone marrow cells and their differentiation towards an osteoblastic lineage. finally, clinical experience has shown that the cancellous bone inside the induced membrane is not subject to resorption by the body. ever since its introduction, the 'induced membrane' technique has been used very successfully in various clinical cases (see (59) and references therein). however, the masquelet technique still requires the harvesting of an autologous bone graft, and with that come all the potential aforementioned complications. furthermore, the use of alternative bone substitute materials, such as hydroxyapatite tricalcium phosphate, in combination with the masquelet technique has so far yielded results inferior to the use of the masquelet technique with autologous bone grafting material (59, 62). besides the masquelet technique, a more recent innovation has also significantly improved the clinical approach to restoring bone defects.
the development of the reamer-irrigator-aspirator (ria©) system (depuy-synthes) has given clinicians an alternative to iliac crest harvesting to retrieve bone grafting materials from patients: the ria system provides irrigation and aspiration during intramedullary reaming, allowing the harvesting of finely morselised autologous bone and bone marrow for surgical procedures requiring bone grafting material (63). the ria was initially developed to lower the intramedullary pressure during the reaming of long bones to reduce the risk of fat embolisms and pulmonary complications such as the acute respiratory distress syndrome (ards), as well as to reduce local thermal necrosis of bone tissue (64-65). however, the finely morselised autologous bone and bone marrow collected by the ria has been shown to be rich in stem cells, osteogenic cells and growth factors and has been recognised as a suitable bone graft alternative to iliac crest autograft tissue (66-67). also, ria enables the harvesting of larger bone graft volumes compared to the iliac crest (approx. 40 cm³ for the femur and 33 cm³ for the tibia) (48, 65). furthermore, the risk of complications from the harvesting procedure has been reduced significantly (6% for ria vs. 19.37% for iliac crest autografts) (68). since its introduction, the indications for use of ria have been further extended to include the treatment of postoperative osteomyelitis (69) and the harvesting of mesenchymal stem cells (mscs) (70). the innovation driven by the ria system was so significant that the journal "injury" recently dedicated a complete issue to the data available on ria and its applications (71). a systematic review of reamer-irrigator-aspirator indications and clinical results has recently been published by cox et al. (72). the masquelet technique as well as the ria system are nowadays frequently used in clinical practice, independently of each other. 
however, the two techniques may also be combined to further improve their effectiveness when treating severe bone defects, for example in posttraumatic limb reconstruction (73). an example of a case from one of our authors' clinical practice (m.s.) combining the masquelet technique and the ria system to treat a complex case of tibial non-union is provided in figure 2. both the masquelet technique and the development of the ria system represent significant improvements in today's clinical approach to bone reconstruction and regeneration. utilising these techniques, however, we have still not been able to replace autologous bone grafting and thus avoid surgical graft retrieval procedures with all their associated disadvantages. with research looking towards increasingly sophisticated bone tissue engineering techniques and their first clinical applications, the quest for developing improved bone substitute materials advances to the next level. bone substitutes can be defined as "a synthetic, inorganic or biologically organic combination - biomaterial - which can be inserted for the treatment of a bone defect instead of autogenous or allogenous bone" (74). this definition applies to numerous substances, and a variety of materials have been used over time in attempts to substitute bone tissue. although merely of historic interest and with no significance in modern therapies, the use of seashells, nuts, gourds and so forth shows that humans have striven for bone substitute materials (bsm) for thousands of years. with the introduction of tissue engineering and its clinical application, regenerative medicine, in 1993 (75), the modern-day quest for bsm has undergone a significant change. the limitations of current clinical approaches have necessitated the development of alternative bone repair techniques and have driven the development of scaffold-based tissue engineering strategies. 
in the past, mostly inert bone substitute materials were used, functioning mainly as space holders during the healing process. now a paradigm shift has taken place towards the use of new 'intelligent' tissue engineering biomaterials that support and even promote tissue re-growth (76). according to the "diamond concept" of bone tissue engineering (77-78), an ideal bone substitute material should offer an osteoinductive three-dimensional structure, contain osteogenic cells and osteoinductive factors, have sufficient mechanical properties and promote vascularisation. despite extensive research in the field of bone tissue engineering, apart from the "gold standard" autograft bone, no currently available bsm can offer these properties in one single material.

figure 2: clinical case combining the masquelet technique and the ria system to treat a tibial non-union. a 51-year-old male sustained a gustilo 3b fracture of the right tibia and fibula and was treated with a staged procedure with locked plating and a free flap. the patient's progress was very slow and an implant failure occurred 8 months post-operatively (a). the patient was then referred for further management and underwent debridement of the non-union site on the distal tibia by lifting the flap (b). the size of the extensive bone defect is shown in b (intraoperative image of situs and x-ray image with retractor in defect site). additionally, a pmma bone cement spacer was inserted into the tibial defect as part of the masquelet technique. postoperative x-ray images show the pmma spacer (circles) in place (c). 8 weeks later the pmma spacer was removed and the induced membrane at the defect site was packed with autologous cancellous bone graft obtained from the femur using the reamer-irrigator-aspirator (ria) technique. (d) shows the assembled ria system, with the insert showing the morselised autologous bone and bone marrow graft obtained. postoperative films after the second surgery (e). 7 weeks after bone grafting the defect showed good healing and the patient was able to fully bear weight as tolerated. over the following 2 months x-ray images showed progressive bridging of the defect zone and he was able to return to work with light duties. he was reviewed again 7 months post-surgery, had returned to work full-time and was walking long distances without any support (f).

therefore, the fundamental concept underlying tissue engineering is to combine a scaffold or three-dimensional construct with living cells and/or biologically active molecules to form a "tissue engineering construct" (tec), which promotes the repair and/or regeneration of tissues (79-80). currently used bsm can be classified into different subgroups according to their origin (76, 81): 1) bsm of natural origin: this group consists of harvested autogenous bone grafts as well as allogenic bsm, such as demineralised bone matrix, corticocancellous or cortical grafts and cancellous chips (from either cadavers or living donors) (82-84). xenogenic materials, for example porous natural bone hydroxyapatite from animal bones (bovine, equine, porcine etc.), are also part of this group (85). phytogenic materials, such as bone-analogue calcium phosphate originally obtained from marine algae or coral-derived materials, also fall into this category (86-87). 2) synthetically produced bsm: this group contains ceramics such as bioactive glasses (88), tricalciumphosphates (tcp) (89-90), hydroxyapatite (ha) (91-93) and glass ionomer cements as well as calcium phosphate (cp) ceramics (94). metals such as titanium also belong to this group. furthermore, polymers including polymethylmethacrylate (pmma), polylactides/polyglycolides and copolymers as well as polycaprolactone (pcl) (95) are summarised in this group (76, 79, 96-97). 3) composite materials: bsm combining different materials such as ceramics and polymers are referred to as composite materials (92, 98-99). 
by merging materials with different structural and biochemical properties into composite materials, their properties can be tuned to achieve more favourable characteristics, for instance with respect to biodegradability (79, 97). 4) bsm combined with growth factors: natural or recombinant growth factors such as bone morphogenetic protein (bmp), platelet-derived growth factor (pdgf), transforming growth factor β (tgf-β), insulin-like growth factor 1, vascular endothelial growth factor (vegf) and fibroblast growth factor can be added to increase the biological activity of bsm (100-101). for example, a composite material made of medical-grade polycaprolactone-tricalcium phosphate (mpcl-tcp) scaffolds combined with recombinant human bmp-7 has been demonstrated to completely bridge a critical-sized (3 cm) tibial defect in a sheep model (102). 5) bsm with living cells: mesenchymal stem cells (103-105), bone marrow stromal cells (106-107), periosteal cells (108-109), osteoblasts (110) and embryonic (111) as well as adult stem cells (112) have been used in bone tissue engineering (22, 101, 113-116). these cells can generate new tissue alone or can be used in combination with scaffold matrices. bsm can also be classified according to their mode of action; an overview of the currently available bsm for clinical (orthopaedic) use and their mode of action is given in table 3 (reproduced from (117)). scaffolds serve as three-dimensional structures to guide cell migration, proliferation and differentiation. in load-bearing tissues, the scaffold also serves as a temporary mechanical support structure. scaffolds substitute for the function of the extracellular matrix and need to fulfil highly specific criteria. 
an ideal scaffold should (i) be three-dimensional and highly porous, with an interconnected pore network for cell growth and flow transport of nutrients and metabolic waste; (ii) have surface properties optimised for the attachment, migration, proliferation and differentiation of the cell types of interest (depending on the targeted tissue); (iii) be biocompatible, not elicit an immune response, and be biodegradable with a controllable degradation rate to complement cell/tissue in-growth and maturation; (iv) have mechanical properties matching those of the tissue at the site of implantation; and (v) have a structure that is easily and efficiently reproducible in various shapes and sizes (97). biocompatibility represents the ability of a material to perform with an appropriate response in a specific application (118). as a general rule, scaffolds should be fabricated from materials that do not have the potential to elicit immunological or clinically detectable primary or secondary foreign body reactions (119). parallel to the formation of new tissue in vivo, the scaffold may undergo degradation via the release of by-products that are either biocompatible without proof of elimination from the body (biodegradable scaffolds) or can be eliminated through natural pathways from the body, either by simple filtration of by-products or after their metabolisation (bioresorbable scaffolds) (97). due to poor vascularisation or low metabolic activity, the capacity of the surrounding tissue to eliminate the by-products may be low, leading to a build-up of the by-products and thereby causing local temporary disturbances (97): a massive in vivo release of acidic degradation by-products leading to inflammatory reactions has been reported for several biodegradable polymers (120-122). 
another example is the increase of osmotic pressure or ph caused by local fluid accumulation or transient sinus formation from fibre-reinforced polyglycolide pins used in orthopaedic applications (120). it is also known that calcium phosphate biomaterial particles can cause inflammatory reactions after being implanted (although this inflammatory reaction may be considered desirable to a certain extent, as it subsequently stimulates osteoprogenitor cell differentiation and bone matrix deposition) (123). these examples illustrate that potential problems related to biocompatibility in tissue engineering constructs for bone and cartilage applications may be related to the use of biodegradable, erodible and bioresorbable polymer scaffolds. therefore, it is important that the three-dimensional tissue engineering construct (tec) is exposed at all times to sufficient quantities of neutral culture media when undertaking cell culture procedures, especially during the period in which the mass loss of the polymer matrix occurs (97). for applications in vivo, it is of course not possible to expose the tec to neutral media, and one therefore has to carefully take into account the local characteristics (ph, vascularisation, metabolic activity etc.) of the tissue to be engineered when assessing the biocompatibility of a tec. the design of tissue engineering scaffolds needs to consider physico-chemical properties, morphology and bio-mechanical properties as well as degradation kinetics. the scaffold structure is expected to guide the development of new bone formation by promoting attachment, migration, proliferation and differentiation of bone cells. parallel to tissue formation, the scaffold should also undergo degradation in order to allow the ultimate replacement of scaffold material with newly formed, tissue engineered bone. 
furthermore, the scaffold is also responsible for temporary mechanical support and stability at the tissue engineering site until the new bone is fully matured and able to withstand mechanical load. as a general rule, the scaffold material should be sufficiently robust to resist changes in shape resulting from the introduction of cells into the scaffold (each of which should be capable of exerting tractional forces) and from the wound contraction forces evoked during tissue healing in vivo (79). in order to achieve optimal results, it is therefore necessary to carefully balance the biomechanical properties of a scaffold with its degradation kinetics. a scaffold material has to be chosen that degrades and resorbs at a controlled rate, giving the tec sufficient mechanical stability at all times, while at the same time allowing newly formed in vivo bone tissue to substitute for its structure. figure 3 depicts the interdependence of molecular weight loss and mass loss of a slow-degrading composite scaffold and also shows the corresponding stages of tissue regeneration (80). at the time of implantation, the biomechanical properties of a scaffold should match the structural properties of the tissue it is implanted into as closely as possible (124). it should possess sufficient structural integrity for the period until the ingrowing engineered tissue has replaced the slowly disappearing scaffold matrix with regards to mechanical properties. in bone tissue engineering, the degradation and resorption kinetics of the scaffold have to be controlled in such a way that the bioresorbable scaffold retains its physical properties for at least 6 months, to enable cell and tissue remodelling to achieve stable biomechanical conditions and vascularisation at the defect site (97). 
apart from host anatomy and physiology, the type of tissue to be engineered also has a profound influence on the degree of remodelling: cancellous bone remodels in 3-6 months, while cortical bone takes twice as long, approximately 6-12 months, to remodel (79). whether the tec will be implanted at a load-bearing or non-load-bearing site will also significantly influence the requirements for mechanical stability of the tec, as mechanical loading can directly affect its degradation behaviour as well (79). utilising orthopaedic implants to temporarily stabilise the defect area also influences the requirements for biomechanical stability of the tec significantly (18, 125). it is therefore crucial to meticulously select the scaffold material individually for each tissue engineering approach, tailoring the mechanical properties and degradation kinetics exactly to the purpose of the specific tec (97). consequently, there is no single "ideal scaffold material" for all bone tissue engineering purposes; the choice depends on the size, type and location of the bone tissue to be regenerated. the surface area of a scaffold represents the space where the pivotal interactions between biomaterial and host tissue take place. the performance of a tec depends fundamentally on the interaction between biological fluids and the surface of the tec, and this interaction is often mediated by proteins adsorbed from the biological fluid (126). the initial events include the orientated adsorption of molecules from the surrounding fluid, creating a specific interface to which the cells and other factors respond. the macrostructure of the scaffold as well as the microtopography and chemical properties of the surface determine which molecules are adsorbed and how cells attach and align themselves (127). 
the focal attachments made by the cells with their substrate then determine cell shape, which in turn transduces signals via the cytoskeleton to the nucleus, resulting in the expression of specific proteins, structural or signal-related, that contribute towards the cell phenotype. due to technical progress, we are now able to manipulate materials at the atomic, molecular and supramolecular level, and bulk materials and surfaces can be designed at a dimension similar to that of the nanometre-scale constituent components of bone (2): in natural bone, hydroxyapatite plates are approximately 25 nm in width and 35 nm in length, while collagen type 1 is a triple helix 300 nm in length, 0.5 nm in width and with a periodicity of 67 nm (128). "nanomaterials" commonly refers to materials with basic structural units in the range 1-100 nm (nanostructured), crystalline solids with grain sizes between 1 and 100 nm (nanocrystals), individual layers or multilayer surface coatings in the range 1-100 nm (nanocoatings), extremely fine powders with an average particle size in the range 1-100 nm, and fibres with a diameter in the range 1-100 nm (nanofibres) (2). the close proximity of the scale of these materials to the scale of natural bone composites makes the application of nanomaterials for bone tissue engineering a very promising strategy. surfaces with nanometre topography can greatly promote the availability of amino acids and proteins for cell adhesion; for example, the adsorption of fibronectin and vitronectin [two proteins known to enhance osteoblast and bone-forming cell function (129)] can be significantly increased by decreasing the grain size on the scaffold/implant surface below 100 nm (130). it has also been shown that calcium-mediated protein adsorption on nanophase materials promotes unfolding of these proteins, promoting bone cell adhesion and function (130). 
current literature supports the hypothesis that by creating surface topographies with characteristics that approximate the size of proteins, a certain control over protein adsorption and interactions will be possible. since the surface characteristics regarding roughness, topography and surface chemistry are then transcribed via the protein layer into information that is comprehensible for the cells (127), this will enable the fabrication of surface properties directly targeted at binding specific cell types. in vitro, osteoblast adhesion, proliferation, differentiation and calcium deposition are enhanced on nanomaterials with grain sizes of less than 100 nm (130-131). the adherence of osteoblasts has been shown to increase up to threefold when the surface is covered with nanophase titanium particles instead of conventional titanium particles (132). nano- and microporosity has also been shown to promote osteogenic differentiation (133) and osteogenesis (134). the use of nanomaterials to achieve better osteointegration of orthopaedic implants and for bone tissue engineering approaches has been extensively summarised in several recent reviews (2, 135-138) and will not be reviewed in its entirety here. however, it is clear that rough scaffold surfaces favour attachment, proliferation and differentiation of anchorage-dependent bone-forming cells (139). osteogenic cells migrate to the scaffold surface through a fibrin clot that is established immediately after implantation of the tec from the haematoma caused by the surgical procedure (101). this migration causes retraction of the temporary fibrin matrix and, if the matrix is not well secured, can lead to detachment of the fibrin from the scaffold during wound contraction, decreasing the migration of osteogenic cells into the scaffold (140-141). with regards to surface chemistry, degradation properties and by-products (relating to ph, osmotic pressure, inflammatory reactions etc.) 
are of importance and have been briefly discussed already. in the following section, the role of calcium phosphate in the osteoinductivity of biomaterials is summarised as an example of how surface chemistry may be manipulated to benefit scaffold properties. to date, most synthetic biomaterials that have been shown to be osteoinductive contain calcium phosphate, underlining the crucial role of calcium and phosphate in the osteoinductive properties of biomaterials (142). as summarised above, adequate porosity and pore size are crucial for bone tissue engineering scaffolds in order to allow sufficient vascularisation and enable a supply of body fluids throughout the tec. together with this nutrient supply, a release of calcium and phosphate ions from the biomaterial surface takes place and is believed to be the origin of the bioactivity of calcium phosphate biomaterials (143-145). this process is followed by the precipitation of a biological carbonated apatite layer (containing calcium, phosphate and other ions such as magnesium, as well as proteins and other organic compounds), which occurs when the concentration of calcium and phosphate ions has reached the supersaturation level in the vicinity of the implant (142, 146-147). this bone-like biological carbonated apatite layer is thought to be a physiological trigger for stem cells to differentiate down the osteogenic lineage, or it could induce the release of growth factors that complement this process (142). for biomaterials lacking calcium phosphate particles, the roughness of the surface is considered to act as a collection of nucleation sites for calcium phosphate precipitation from the host's body fluids, thereby forming a carbonated apatite layer. 
comparing calcium phosphate (cap) coated fibrous scaffolds (fibre diameter approx. 50 μm) made from medical-grade polycaprolactone (mpcl) with non-coated mpcl scaffolds, we have shown that cap coating is beneficial for new bone formation in vitro, enhancing alkaline phosphatase activity and mineralisation within the scaffolds (148). interestingly, other research has shown that the implantation of highly soluble carbonated apatite ceramics alone did not result in bone induction in vivo (149), suggesting that a relatively stable surface (e.g. through a composite material that contains a less soluble phase) is needed to facilitate bone formation, as discussed above (see "mechanical properties and degradation kinetics"). bone formation requires a stable biomaterial interface, and too rapid an in vivo dissolution of calcium phosphate materials has therefore been shown to be unfavourable for the formation of new bone tissue (150-151). chai et al. and barradas et al. have recently reviewed the role of calcium phosphate in osteogenicity in bone tissue engineering (150, 152). further comprehensive reviews on the influence of surface topography and surface chemistry on cell attachment and proliferation for orthopaedic implants and bone tissue engineering are available (2, 126, 142, 150, 153). porosity is commonly defined as the percentage of void space in a so-called cellular solid (the scaffold in bone tissue engineering applications) (154). using solid and porous particles of hydroxyapatite for the delivery of the growth factor bmp-2, kuboki et al. showed that pores are crucial for bone tissue formation because they allow migration and proliferation of osteoblasts and mesenchymal cells, as well as vascularisation; no new bone formed on solid particles (155). 
a porous scaffold surface also improves mechanical interlocking between the implanted tec and the surrounding natural bone tissue, providing greater mechanical stability at this crucial interface (156). scaffold porosity and pore size relate to the surface area available for the adhesion and growth of cells, both in vitro and in vivo, and to the potential for host tissue ingrowth, including vasculature, to penetrate into the central regions of the scaffold architecture. to assess the significance of porosity, several in vivo studies have been conducted utilising hard scaffold materials such as calcium phosphate or titanium with defined porous characteristics (157). the majority of these studies indicate the importance of pore structure in facilitating bone growth. increases in porosity, pore size and pore interconnectivity have been found to positively influence bone formation in vivo, which is also correlated with scaffold surface area. pore interconnections smaller than 100 μm were found to restrict vascular penetration, and supplementation of a porous structure with macroscopic channels has been found to further enhance tissue penetration and bone formation (97, 158). interestingly, these results correlate well with the diameter of the physiological haversian systems in bone tissue, which have an approximate diameter of more than 100 µm. the ability of new capillary blood vessels to grow into the tec is also related to the pore size, thereby directly influencing the rate of ingrowth of newly formed bone tissue into the tec: in vivo, larger pore sizes and higher porosity lead to a faster rate of neovascularisation, thereby promoting greater amounts of new bone formation via direct osteogenesis. in contrast, small pores favour hypoxic conditions and induce osteochondral formation before osteogenesis occurs (92). 
pores and pore interconnections should be at least 300 µm in diameter to allow sufficient vascularisation. besides the actual macroporosity (pore size >50 µm) of the scaffold, microporosity (pore size <10 µm) and pore wall roughness also have a large impact on the osteogenic response: microporosity results in larger surface areas, contributing to higher bone-inducing protein adsorption as well as to ion exchange and bone-like apatite formation by dissolution and re-precipitation (139, 157). as outlined above, sub-micron and nanometre surface roughness favours attachment, proliferation and differentiation of anchorage-dependent bone-forming cells (139). although increased porosity and higher pore size facilitate bone ingrowth, they also compromise the structural integrity of the scaffold, and if the porosity becomes too high it may adversely affect the mechanical properties of the scaffold (79). in addition, the rate of degradation is influenced by the porosity and pore size (for biodegradable scaffolds): a higher pore surface area enhances the interaction of the scaffold material with host tissue and can thereby accelerate degradation by macrophages via oxidation and/or hydrolysis (157). therefore, scaffolds fabricated from biomaterials with a high degradation rate should not have high porosities (>90%), in order to avoid compromising the mechanical and structural integrity before adequate substitution by newly formed bone tissue. scaffolds made from slowly degrading biomaterials with robust mechanical properties can, in contrast, be highly porous (157). table 4 illustrates mechanical properties and degradation kinetics in relation to porosity for many commonly used composite scaffolds. this illustrates that there are a number of advantages and disadvantages associated with any changes made to the porosity or pore size of scaffolds. 
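the porosity/stiffness trade-off described above can be made quantitative with a first-order estimate. the following sketch (not from this review) uses the well-known gibson-ashby scaling law for open-cell foams, where relative stiffness falls roughly with the square of relative density; the prefactor `c` and the exponent are model assumptions, and real scaffold behaviour depends on material and architecture:

```python
def relative_modulus(porosity, c=1.0, exponent=2.0):
    """estimate E/E_s for an open-cell scaffold of given porosity (0-1).

    gibson-ashby scaling: E/E_s ~ c * (rho/rho_s)^exponent,
    with relative density rho/rho_s = 1 - porosity.
    c and exponent are illustrative assumptions (c ~ 1, exponent ~ 2
    for open-cell foams); they are not taken from this review.
    """
    if not 0.0 <= porosity < 1.0:
        raise ValueError("porosity must be in [0, 1)")
    relative_density = 1.0 - porosity
    return c * relative_density ** exponent

# raising porosity from 45% to 90% cuts the predicted relative
# stiffness from 0.3025 to 0.01, i.e. a ~30-fold drop
e_mid = relative_modulus(0.45)   # -> 0.3025
e_high = relative_modulus(0.90)  # -> 0.01
```

this illustrates why highly porous scaffolds (>90% porosity) are only viable when the bulk material itself is mechanically robust and slowly degrading, as stated above.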
a balance between these pros and cons must therefore be found in order to tailor the scaffold properties ideally to the demands of the tissue engineering approach used. for comprehensive reviews on the role of porosity and pore size in tissue engineering scaffolds, the reader is referred to two recently published reviews (157, 159). it is clear that a multitude of factors have to be taken into account when designing and fabricating scaffolds for bone tissue engineering. however, it is beyond the scope of this review to present all of them in detail, and a number of comprehensive reviews have been published recently on this topic (2, 5, 79, 97, 101, 160-161). the three-dimensional design characteristics, in combination with the material properties of a scaffold, are crucial for bone tissue engineering purposes. not only does the scaffold structure need to be controlled on a macroscopic level (to achieve sufficient interposition of the scaffold into the defect site), but also on a microscopic level (to optimise tissue engineering properties with regards to osteoinduction, osteoconduction, osteogenesis and vascularisation as well as mechanical stability) and even down to the nanostructural configuration (to optimise protein adsorption, cell adhesion, differentiation and proliferation related to the desired tissue engineering characteristics of the tec). it is therefore necessary to exert strict control over the scaffold properties during the fabrication process. conventional techniques for scaffold fabrication include solvent casting and particulate leaching, gas foaming, fibre meshes and fibre bonding, phase separation, melt molding, emulsion freeze drying, solution casting and freeze drying (162). all of these techniques are subtractive in nature, meaning that parts of the fabricated scaffold are removed from the construct after the initial fabrication process in order to generate the desired three-dimensional characteristics. 
hence, a number of limitations exist regarding these fabrication methods: conventional methods do not allow precise control over the pore size, pore geometry, pore interconnectivity or spatial distribution of pores and interconnecting channels of the fabricated scaffolds (92, 163-164). in addition, many of these techniques require the application of organic solvents, whose residues can impose severe adverse effects on cells due to their potentially toxic and/or carcinogenic nature, reducing the biocompatibility of the scaffold significantly (165). the introduction of additive manufacturing (am) techniques into the field of bone tissue engineering has helped to overcome many of these restrictions (92, 162, 166). in am, three-dimensional objects are created in a computer-controlled layer-by-layer fabrication process. in contrast to subtractive conventional methods of scaffold fabrication, this technique is additive in nature and does not involve removal of material after the initial fabrication step. these techniques have also been named "rapid prototyping" or "solid free-form fabrication" in the past, but in order to clearly distinguish them from conventional methods, the latest astm standard now summarises all of these techniques under the term "additive manufacturing" (167). the basis for each am process is the design of a three-dimensional digital or in silico model of the scaffold to be produced. this computer model can either be created from scratch using "computer-aided design" (cad) methods or can be generated from 3d-scan data of existing three-dimensional structures (such as the human skeleton) (168). the digital model is then converted into an stl file, which describes the three-dimensional structure as a surface mesh of triangular facets; for fabrication, this surface description is subsequently sliced into multiple horizontal two-dimensional layers. 
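for readers unfamiliar with the stl format mentioned above, the following minimal sketch (in python, not part of this review) shows how an ascii stl file encodes a surface as triangular facets, each with a normal vector and three vertices; the solid name and the single-triangle geometry are purely illustrative:

```python
def facet(normal, v1, v2, v3):
    """render one triangular facet in ascii stl syntax."""
    lines = ["  facet normal {:e} {:e} {:e}".format(*normal),
             "    outer loop"]
    for v in (v1, v2, v3):
        lines.append("      vertex {:e} {:e} {:e}".format(*v))
    lines += ["    endloop", "  endfacet"]
    return "\n".join(lines)

def ascii_stl(name, facets):
    """assemble a complete ascii stl document from (normal, v1, v2, v3) tuples."""
    parts = ["solid {}".format(name)]
    parts += [facet(*f) for f in facets]
    parts.append("endsolid {}".format(name))
    return "\n".join(parts)

# one triangular facet lying in the z = 0 plane, normal pointing up (+z)
tri = ((0.0, 0.0, 1.0),          # facet normal
       (0.0, 0.0, 0.0),          # vertex 1
       (1.0, 0.0, 0.0),          # vertex 2
       (0.0, 1.0, 0.0))          # vertex 3
stl_text = ascii_stl("scaffold", [tri])
```

a real scaffold model would contain many thousands of such facets; am software then slices this triangulated surface into the horizontal layers described in the text.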
using this stl file, an am machine then creates the three-dimensional scaffold structure in a layer-by-layer fabrication method in which each layer is tightly connected to the previous layer to create a solid object. a number of different am techniques are currently applied, using thermal, chemical, mechanical and/or optical processes to create the solid three-dimensional object (166). these methods include laser-based methods such as stereolithography (stl) and selective laser sintering (sls), printing-based applications (e.g. 3d-printing, wax-printing) and nozzle-based systems like melt extrusion/fused deposition modeling (fdm) and bioplotting. the multitude of am techniques and their specifications have been reviewed by several authors lately (162, 166, 169-170). am techniques have been used since the 1980s in the telecommunication industry, in jewellery making and in the production of automobiles (171). from the 1990s onwards, am was gradually introduced to the medical field as well (172): am was initially used to fabricate three-dimensional models of bone pathologies in orthopaedic, maxillofacial and neurosurgical applications to plan surgical procedures and for haptic assessment during the surgery itself (173-174). with recent technical advances, am is nowadays applied to make custom-made implants and surgical tools (175) and to fabricate highly detailed, custom-made three-dimensional models for the individual patient (using data from ct, mri, spect etc.) to plan surgical approaches, specifically locate osteotomy sites, choose the correct implant and predict functional and cosmetic outcomes of surgeries (176-177). thereby, the operating time as well as the risk of complications has been reduced significantly. the application of am in bone tissue engineering represents a highly significant innovation that has drastically changed the way scaffolds are fabricated; am has more or less become the new gold standard for scaffold manufacturing (92). 
the advantages of rapid prototyping processes include (but are not limited to) increased speed, customisation and efficiency. am technologies have relatively few process steps and involve little manual interaction; therefore, three-dimensional parts can be manufactured in hours and days instead of weeks and months. the direct nature of am allows the economical production of customised tissue engineering scaffolds. the products can be tailored to match the patient's needs and still sustain economic viability, as compared to traditional techniques which must manufacture great numbers of devices. the conventional scaffold fabrication methods commonly limit the ability to form complex geometries and internal features. am methods reduce these design constraints and enable the fabrication of desired delicate features both inside and outside the scaffold. using stl, the am technique with the highest precision, objects at a scale of 20 µm can be fabricated, for example (178). a two-photon stl-technique to initiate the polymerisation can be used to produce structures even at micrometer and sub-micrometer levels (179). am methods allow for variation of composition of two or more materials across the surface, interface, or bulk of the scaffold during the manufacturing. thereby, positional variations in physicochemical properties and surface characteristics can be created and utilised to promote locally specific tissue engineering signals. several am techniques operate without the use of toxic organic solvents. this is a significant benefit, since incomplete removal of solvents may lead to harmful residues that can affect adherence of cells, activity of incorporated biological agents or surrounding tissues, as already described. am allows the control of scaffold porosity, enabling applications that may have areas of greater or lesser structural integrity and areas of encouraged blood flow due to increased porosity. 
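the porosity control mentioned above follows directly from the printed geometry: for a regular lattice, porosity can be computed analytically from the strut dimensions. a minimal sketch, assuming a simple cubic unit cell of three orthogonal cylindrical struts (the geometry model and the numbers are illustrative assumptions, not data from the review):

```python
# Porosity estimate for a cubic lattice of orthogonal cylindrical struts.
# Strut overlaps at the nodes are counted multiple times, so this slightly
# underestimates the true porosity; adequate as a first-order design check.
import math

def lattice_porosity(strut_diameter, strut_spacing):
    """Porosity = 1 - (strut material volume / unit-cell volume)."""
    r = strut_diameter / 2.0
    a = strut_spacing
    strut_volume = 3.0 * math.pi * r ** 2 * a   # three orthogonal struts
    return 1.0 - strut_volume / a ** 3

# e.g. 0.4 mm struts on a 2 mm spacing -> porosity of roughly 0.91
p = lattice_porosity(0.4, 2.0)
```

inverting such a relation is how a target porosity is translated back into printable strut dimensions during scaffold design.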
fabricating devices and/or implants with differences in spatial distribution of porosities, pore sizes, mechanical and chemical properties can mimic the complex composition and architecture of natural bone tissue and thereby optimise bone tissue engineering techniques. in addition, scaffolds with gradients in porosity and pore sizes can be functionalised to allow vascularisation and direct osteogenesis in one area of the scaffold, while promoting osteochondral ossification in the other, which is an appealing approach to reproduce multiple tissues and tissue interfaces within one and the same biomaterial scaffold (157). table 5 summarises the advantages of scaffolds designed and fabricated by am techniques. musculoskeletal conditions are highly prevalent and cause a large amount of pain, illness and disability to patients. these conditions are the second most common reason for consulting a general practitioner, accounting for almost 25% of the total cost of illness and up to 15% of primary care consultations (180). in addition, the impact of musculoskeletal conditions is predicted to grow with the increasing incidence of lifestyle-related obesity, reduced physical fitness and increased road traffic accidents (180). the impact of bone trauma is significant: the consequences of failing to restore full function to an injured limb are dramatically demonstrated by the statistic that only 28% of patients suffering from severe open fractures of the tibia are able to resume full function and hence return to previous employment (180). along with trauma, tumour resection is another major cause of large bone defects. cancer is a major public health challenge, with one in four deaths in the united states currently due to this disease. recent statistics indicate that 1 638 910 new cancer cases and 577 190 deaths from cancer are projected to occur in the united states in 2012 (181). 
as outlined above, the number of procedures requiring bone implant material is increasing, and will continue to do so in our aging population and with deteriorating physical activity levels (57). the current bone grafting market is already estimated to be in excess of $2.5 billion each year and is expected to increase by 7-8% per year (45). with the introduction of tissue engineering, the hopes and expectations were extremely high to be able to substitute natural organs with similar (or even better) tissue engineered replacement organs. however, at the time it was stated that "few areas of technology will require more interdisciplinary research than tissue engineering" (75), and this assessment holds true today. in the years to follow, numerous private and public institutes conducted scientific research and clinical translation efforts related to tissue engineering. at the beginning of 2001, tissue engineering research and development was being pursued by 3 300 scientists and support staff in more than 70 start-up companies or business units with a combined annual expenditure of over $600 million usd (182). the us national institutes of health (nih), accounting for the largest cumulative us federal research expenditures, has increased the funding in tissue engineering from 2.36 billion usd in the fiscal year 2003 to more than 614 billion usd for the fiscal year 2006 (183). between 2000 and 2008, the number of papers published per year on tissue engineering and scaffolds increased by more than 400% and more than 900%, respectively (184). but despite the increasing research expenditure and the magnitude of discoveries and innovations in bone tissue engineering since its introduction more than three decades ago, the translation of these novel techniques into routine clinical applications on a large scale has still not taken place. as scott j. 
hollister has pointed out, there is, on the one hand, a stark contrast between the amount of tissue engineering research expenditures over the last 20 years and the resulting numbers of products and sales figures. on the other hand, there is also a significant discrepancy between the complexities of intended tissue engineering therapies compared to the actual therapies that have reached clinical applications (184). this evident gap between research and clinical application/commercialisation is commonly termed the "valley of death" due to the large number of ventures that "die" between scientific technology development and actual commercialisation due to lack of funds (figure 4) (184). the valley of death is particularly large for tissue engineering approaches because this field of research often utilises immensely cost-intensive high-tech biotechnologies for technological development, eating up large parts of the funding available, but then additionally faces the challenges of funding large-scale preclinical studies and clinical studies to gain approval by regulatory bodies, demonstrate product safety and gain clinical acceptance (184-186). to bridge the gap between the bench and bedside, the scaffold is required to perform as a developmentally conducive extracellular niche, at a clinically relevant scale and in concordance with strict clinical (economic and manufacturing) prerequisites (figure 5) (187). in this context, for smaller and medium-sized defects the scaffold facilitates the entrapment of the hematoma and prevents its "too early" contraction (188). for large and high-load bearing defects the scaffold can also deliver cells and/or growth factors to the site of damage and provides an appropriate template for new tissue formation. 
the scaffold should thus constitute a dynamically long-lasting yet degradable three-dimensional architecture, preferably serving as a functional tissue substitute which, over time, can be replaced by cell-derived tissue function. designing and manufacturing processes are believed to be the gatekeepers to translate tissue engineering research into clinical tissue engineering applications, and concentration on the development of these entities will enable scaffolds to bridge the gap between research and clinical practice (184). one of the greatest difficulties in bridging the valley of death is to develop good manufacturing processes and scalable designs and to apply these in preclinical studies; for a description of the rationale and road map of how our multidisciplinary research team has addressed this first step to translate orthopaedic bone engineering from bench to bedside, see below and refer to our recent publication (185). in order to take bone tissue engineering approaches from bench to bedside, it is also imperative to meticulously assess the clinical demands for specific scaffold characteristics to achieve a broad and optimised range of clinical applications for the specific tissue engineering approach. a sophisticated bone tissue engineering technology will not necessarily have multiple clinical applications just because of its level of complexity, and defining specific clinical target applications remains one of the most underestimated challenges in bridging the valley of death (184). there is often a great level of discrepancy between the clinical demands on a tissue engineering technique and the scientific realisation of such a technique, hampering the clinical translation. thus a scaffold that is realistically targeted at bridging the valley of death should (187): (i) meet fda approval (for further details on this topic see reviews by scott j. 
hollister 2011 and 2009) (184, 189); (ii) allow for cost-effective manufacturing processes; (iii) be sterilisable by industrial techniques; (iv) enable easy handling without extensive preparatory procedures in the operation theatre; (v) preferably, be radiographically distinguishable from newly formed tissue; and (vi) allow minimally invasive implantation (190-191). in targeting the translation of a (bone) tissue engineering approach from bench to bedside, there is a distinct hierarchy and sequence of the types of studies that need to be undertaken to promote the translation process (192): having identified clinical needs, and based on fundamental discoveries regarding biological mechanisms, a novel tissue engineering approach is designed and first studies are undertaken to characterise the mechanical and chemical properties of the tissue engineering construct (tec) to be used. the next step involves feasibility and bioactivity testing and should be carried out in vitro and in vivo. in vitro assays using cell culture preparations are used to characterise the effects of materials on isolated cell function and for screening large numbers of compounds for biological activity, toxicity and immunogenicity (193-194). however, due to their nature using isolated cells, in vitro models are unavoidably limited in their capacity to reflect the complex in vivo environments that the tec will be exposed to and are therefore inadequate to predict in vivo or clinical performance. therefore, in vivo models (that is, animal models) are required in order to overcome the limitations of in vitro models and to provide a reproducible approximation of the real-life situation. in vivo feasibility testing is almost exclusively done in small animals, mainly in rodents and rabbits (192, 195-197). 
the advantages of small animal models include relatively easy standardisation of experimental conditions, fast bone turnover rates (i.e. shorter periods of observation), similar lamellar bone architecture, similar cancellous bone thinning and fragility, similar remodelling rates and sites, common availability and relatively low costs for housing and maintenance. disadvantages of rodent and rabbit models include different skeletal loading patterns, open epiphyses at various growth plates up to the age of 12-14 months (or for a lifetime in rats), minimal intra-cortical remodelling, the lack of haversian canal systems, a smaller proportion of cancellous bone to total bone mass and their relatively small size for testing of implants (196). whilst a large number of studies in rodents and rabbits have established proof of concept for bone tissue engineering strategies, scaling up to larger, more clinically relevant animal models has presented new challenges. quoting thomas a. einhorn, when conducting animal studies one has to keep in mind that "in general, the best model system is the one which most closely mimics the clinical situation for which this technology is being developed, will not heal spontaneously unless the technology is used, and will not heal when another technology is used if that technology is less advanced than the one being tested" (198). the most effective animal models will therefore 1) provide close resemblance of the clinical and biological environment and material properties, 2) encompass highly standardised measurement methods providing objective parameters (qualitative and quantitative) to investigate the newly formed bone tissue and 3) be able to detect and predict significant differences between the bone tissue engineering methods investigated (192). for clinical modelling and efficacy prediction of the tissue engineering strategy to be translated into clinical application, up-scaling to large animal models is therefore inevitable. 
thereby, the tissue engineering therapy can be delivered in the same (or a similar) way in which it will be delivered in clinical settings, utilising surgical techniques that match (or closely resemble) clinical methods, at a site that matches the setting in which it will be used later as closely as possible (192). the advantage of large animal models (using nonhuman primates, dogs, cats, sheep, goats, pigs) is the closer resemblance to human bone microarchitecture, physiology and biomechanical properties. they encompass well-developed haversian and trabecular bone remodelling, have greater skeletal surface to volume areas, show similar skeletal disuse atrophy, enable the use of implants and techniques similar to the ones used in humans and show highly localised bone fragility associated with stress shielding by implants. however, the use of large animal models has disadvantages as well, including the high cost and maintenance expenses, extensive housing and space requirements, relatively long life spans and lower bone turnover rates (making longer study periods necessary), difficulties in standardisation to generate large, homogenous samples for statistical testing as well as various ethical concerns depending on the species used (e.g. primates) (196). but despite several disadvantages, it is inevitable to perform the final pre-clinical studies in large animals, as realistically as possible, with relevant loading conditions and with surgical techniques similar to those used in the final procedure in humans (197). large animal models provide mass and volume challenges for scaffold-based tissue engineering and require surgical fixation techniques that cannot be tested either in vitro or in small animal models (184). in general, preclinical translation testing is performed in large skeletally mature animals; the species most utilised are dog, sheep, goat and pig (192, 199). 
if sufficient preclinical evidence for the efficacy and safety of the new bone tissue engineering system has been generated utilising large animal models, clinical trials are undertaken to prove clinical significance and safety, ultimately leading to the translation of the technology into routine clinical practice.

taking composite scaffold-based bone tissue engineering from bench to bedside

in accordance with the above-outlined rationale for translating bone tissue engineering research into clinical applications, during the last decade our interdisciplinary research team has focussed on the bench-to-bedside translation of a bone tissue engineering concept based on slowly biodegradable composite scaffolds made from medical grade polycaprolactone (mpcl) and calcium phosphates [hydroxyapatite (ha) and tricalcium phosphate (tcp)] (80, 200). detailed descriptions of the scaffold fabrication protocol can be found in our recent publications (102, 109, 200-202). the scaffolds have been shown in vitro to support cell attachment, migration and proliferation; degradation behaviour and tissue in-growth have also been extensively studied (203-206). we subsequently took the next step towards clinical translation by performing small animal studies using rat, mouse and rabbit models (207-209). as reviewed in detail in reference (200), we were able to demonstrate the in vivo capability of our composite scaffolds, in combination with growth factors or cells, to promote bone regeneration within ectopic sites or critical-sized cranial defects in the small animal models. studies in large animal models that closely resemble the clinical characteristics of human disease, with respect to defect size and mechanical loading, then became essential to advance the translation of this technology into the most difficult and challenging clinical applications in orthopaedic tumour and trauma surgery. 
the choice of a suitable large animal model depends on the ultimate clinical application, and consequently there is no such thing as "one gold standard animal model". over the last years, our research team has investigated the application of our composite scaffolds in several preclinical large animal models addressing different clinical applications.

load-bearing, critical-sized ovine tibial defect model

well-characterised, reproducible and clinically relevant animal models are essential to generate proof-of-principle pre-clinical data necessary to advance novel therapeutic strategies into clinical trial and practical application. our research group at the queensland university of technology (qut; brisbane, australia) has spent the last 5 years developing a world-leading defect model to study pre-clinically different treatment options for cases of large volume segmental bone loss (159, 210). we have successfully established this 3 cm critical-sized defect model in sheep tibiae to study the mpcl-tcp scaffold in combination with cells or growth factors including bone morphogenetic proteins (bmps) (211-212). this model has not only generated a series of highly cited publications (211-215), but has also attracted large interest in the orthopaedic industry to be used as a preclinical test bed for their bone graft products under development. the model enables control of experimental conditions to allow for direct comparison of products against a library of benchmarks and gold standards we have developed over the last 5 years (we have performed more than 200 operations using this model to date). our preclinical tibial defect model developed at qut is one of the few models available internationally which is suitable, from both a reproducibility and a cost point of view, for the evaluation of large segmental defect repair technologies in statistically powered study designs. 
we have chosen this critical-sized segmental defect model of the tibia for our large animal model because tibial fractures represent the most common long bone fractures in humans and are often associated with significant loss of bone substance (216-217). also, tibial fractures result in high rates of non-unions or pseudarthroses (216, 218). from an orthopaedic surgeon's point of view it can be argued that, amongst all bone defects seen in clinical practice, segmental defects of the tibia are often the most challenging graft sites. this is because the grafts are required to bear loads close to physiological levels very soon after implantation, despite internal fixation, which often provides the necessary early stability, and because the tibia suffers from poor soft tissue coverage (a vascularisation issue) compared to the femur. hence, in a bone engineering strategy for the treatment of segmental tibial defects, the scaffold must bear (or share) substantial loads immediately after implantation. the scaffold's mechanical properties (strength, modulus, toughness, and ductility) are determined both by the material properties of the bulk material and by its structure (macrostructure, microstructure, and nanostructure). matching the mechanical properties of a scaffold to the tibial graft environment is critically important so that progression of tissue healing is not limited by mechanical failure of the scaffold prior to successful tissue regeneration. similarly, because mechanical signals are important mediators of the differentiation of cell progenitors, a scaffold must create an appropriate stress environment throughout the site where new tissue is desired. hence, one of the greatest challenges in scaffold design for load-bearing tibial defects is the control of the mechanical properties of the scaffold over time. 
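the coupling between bulk material and structure described above is often estimated at the design stage with the gibson-ashby scaling law for open-cell porous solids, e* ≈ c · e_s · (ρ*/ρ_s)². this is a generic first-order sketch, not the method of the review; the prefactor c ≈ 1 and the material values below are illustrative assumptions:

```python
# Gibson-Ashby first-order estimate of the effective elastic modulus of an
# open-cell porous scaffold. C and the example numbers are illustrative.

def gibson_ashby_modulus(e_solid, relative_density, c=1.0):
    """E* = C * E_s * (rho*/rho_s)**2 for an open-cell porous solid."""
    return c * e_solid * relative_density ** 2

# illustrative: bulk polymer modulus 400 MPa at 70% porosity
# (relative density 0.3) gives an effective modulus of ~36 MPa
e_eff = gibson_ashby_modulus(400.0, 0.3)
```

the quadratic dependence on relative density is what makes porosity such a powerful (and unforgiving) design lever: halving the relative density cuts the stiffness roughly to a quarter, which must be balanced against the load-bearing demands discussed for tibial defects.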
by trialling our bone tissue engineering strategies in a tibial defect model, we therefore address a highly relevant clinical problem and are creating valuable pre-clinical evidence for the translation from bench to bedside. with the 3 cm critical defect being regenerated successfully by applying our mpcl-tcp scaffold in combination with bmp (102), we are now investigating bone regeneration potentials in even larger tibial defects (figure 6). spinal fusion has been investigated in animal models for one hundred years now, and much of the knowledge we have today on how spinal fusion progresses was gained through animal models (219-220). with regard to the above-pictured rationale for translating bone tissue engineering approaches to clinical practice, it is important to note that the physical size of the sheep spine is adequate to allow spinal surgery to be carried out using the same implants and surgical approaches that are used in humans as well. also, sheep spines allow for an evaluation of the success of the study using fusion assessments commonly used in clinical practice. when considering spinal fusion in large animal models, it is apparent that due to the biomechanical properties of the spine a biped primate animal model [such as in (221)] should ideally be preferred over a quadruped large animal model [for example ovine (222) or porcine (223)]. but given the expense and limited availability of primate testing, as well as ethical concerns due to the close phylogenetic relation, it is more feasible to trial large numbers of scaffold variations in the most appropriate quadruped large animal models and then evaluate the best performing scaffold in a primate model, if possible (184). we have outlined above that defining specific clinical target applications is a critical prerequisite for successful bone tissue engineering research that is meant to be translated into clinical practice. 
in accordance with this, we have selected the thoracic spine for our animal model because we have identified idiopathic scoliosis as a clinically highly relevant thoracic spine pathology. idiopathic scoliosis is a complex three-dimensional deformity affecting 2-3% of the general population (224). scoliotic spine deformities include progressive coronal curvature, hypokyphosis or lordosis in the thoracic spine and vertebral rotation in the axial plane, with posterior elements rotated toward the curve concavity. scoliotically deformed vertebral columns are prone to accelerated intervertebral disc degeneration, initiating more severe morphological changes of the affected vertebral joints and leading to chronic local, pseudoradicular, and radicular back pain (225). one of the critical aspects in surgical scoliosis deformity correction is bony fusion to achieve long-term stability (226). autologous bone grafting is still the gold standard to achieve spinal fusion and is superior to other bone grafts for spinal fusion (227-229). nonetheless, the use of autologous bone grafting material has significant risks, as outlined in detail above. a number of animal models for the use of tissue-engineered bone constructs in spinal fusion exist (230), and the use of bone morphogenetic proteins for spinal fusion has been studied extensively (219, 222, 231-232). however, to the best of our knowledge, our ovine thoracic spine fusion model is the first existing preclinical large animal model of thoracic intervertebral fusion allowing the assessment of tissue-engineering constructs such as biodegradable mpcl-cap scaffolds and recombinant human bone morphogenetic protein-2 (rhbmp-2) as a bone graft substitute to promote bony fusion (figure 7) (233). 
we have been able to show that radiological and histological results at 6 months post surgery indicated comparable grades of fusion and evidenced new bone formation for the mpcl-cap scaffold plus rhbmp-2 and autograft groups. the scaffold alone group, however, had lower grades of fusion in comparison to the other two groups. our results demonstrate the ability of this large animal model to trial various tissue engineering constructs against the current gold standard autograft treatment for spinal fusion in the same animal. in the future, we will be able to compare spinal fusion tissue engineering constructs in order to create statistically significant evidence for clinical translation of such techniques.

figure 6 legend (fragment): … the tibial diaphysis (c-d) and the periosteum is removed from the defect site and additionally also from 1 cm of the adjacent bone proximally and distally. special care is taken not to damage the adjacent neurovascular bundle (e, bundle indicated by asterisk). the defect site is then stabilised using a 12-hole dcp (synthes) (f). afterwards a 6 cm mpcl-tcp scaffold loaded with prp and rhbmp-7 is press-fitted into the defect site to bridge the defect (g-h) and the plate is fixed in its final position. x-ray analysis at 3 months after implantation (i) shows complete bridging of the defect site with newly formed radio-opaque mineralised tissue (in order to provide sufficient mechanical support, the scaffold is not fully degraded yet and scaffold struts appear as voids inside the newly formed bone tissue).

the interdisciplinary research group has evaluated and patented the parameters necessary to process medical grade polycaprolactone (mpcl) and mpcl composite scaffolds (containing hydroxyapatite or tricalciumphosphate) by fused deposition modeling (97). these "first generation scaffolds" have undergone more than 5 years of studies in clinical settings, gained federal drug administration (fda) approval in 2006 and have also been successfully commercialised (www.osteoporeinternational.com). the scaffolds have been used highly successfully as burr hole plugs for cranioplasty (234), and to date more than 200 patients have received burr hole plugs, scaffolds for orbital floor reconstruction and other cranioplasties (figure 8) (92). with their extensive, multidisciplinary approach the research team has achieved one of the rare examples of a highly successful bone tissue engineering approach bridging the gap between scientific research and clinical practice, leading to significant innovations in clinical routines. as shown above, "second generation scaffolds" produced by fdm and based on composite materials have already been broadly studied in vitro and in vivo in small animal models and are currently under preclinical evaluation in large animal studies conducted by our research group. available data so far clearly support the view that further translation into clinical use will take place and that a broad spectrum of targeted clinical applications will exist for these novel techniques. we herein propose that regenerative medicine 3.0 has commenced. we foresee that the complexity and great variety of large bone defects require an individualised, patient-specific approach with regard to surgical reconstruction in general and implant/tissue engineering selection in particular. we advocate that bone tissue engineering and bioengineering technology platforms, such as additive manufacturing approaches, can be used even more substantially in bone grafting procedures to advance clinical approaches in general and to benefit the individual patient in particular. the tremendous advantage of scaffolds made by additive manufacturing techniques such as fused deposition modeling (fdm) is the distinct control over the macroscopic and microscopic shape of the scaffold and thereby control over the shape of the entire tec. 
additive manufacturing enables the fabrication of highly structured scaffolds to optimise properties highly relevant in bone tissue engineering (osteoconductivity, osteoinductivity, osteogenicity, vascularisation, mechanical and chemical properties) on a micro- and nanometre scale. using high-resolution medical images of bone pathologies (acquired via ct, µct, mri, ultrasound, 3d digital photogrammetry and other techniques) (168), we are not only able to fabricate patient-specific instrumentation (235-237), patient-specific conventional implants (238-242) or allografts (243), but also to realise custom-made tissue engineering constructs (tec) tailored specifically to the needs of each individual patient and the desired clinical application (168, 174, 244). we therefore predict that the commencing era of regenerative medicine 3.0 will hold a significant leap forward in terms of personalised medicine. we have already proven the clinical application of this concept by fabricating a custom-made bioactive mpcl-tcp implant via cad/fdm that was used clinically to successfully reconstruct a complex cranial defect (245). we have also recently provided a rationale for the use of cad/fdm and mpcl-tcp scaffolds in contributing to clinical therapy concepts after resection of musculoskeletal sarcoma (figures 9 and 10) (246). although it has to be mentioned that the approaches presented in this review are at different stages of clinical translation, in their entirety they clearly represent a promising and highly significant 21st century approach to taking bone tissue engineering strategies from bench to bedside and into the era of regenerative medicine 3.0.

figures 9 and 10 legend (fragments): … indicate osteotomy planes to achieve tumour-free margins, after which the cad model is virtually resected (e). a custom-made scaffold to fit the defined defect is then created by mirroring the healthy side of the pelvis, adjusting the size of the scaffold accordingly and fabricating the scaffold from the virtual model using am techniques (f). flanges, intramedullary pegs and other details can be added to the porous scaffold structure to facilitate surgical fixation and to enhance its primary stability after implantation (g). images d-g reproduced with permission from (246), © the authors. primary stability and even load distribution is achieved by using an internal fixation device (4). secondary stability is achieved by osseointegration of both the fibula and the porous tissue engineering scaffold. over time, the scaffold is slowly replaced by ingrowing tissue engineered bone and the defect is completely bridged and regenerated (5). h partly reproduced with permission from (246), © the authors.

in conclusion, the field of bone tissue engineering has significantly changed the millennia-old quest by humans to optimise the treatment of bone defects and to identify suitable bone substitute materials. we have reviewed the historic development, current clinical therapy standards and their limitations, as well as currently available bone substitute materials. we have also outlined current knowledge on scaffold properties required for bone tissue engineering and the potential clinical applications, as well as the difficulties in bridging the gap between research and clinical practice. although the clinical translation of these approaches has not taken place on a large scale yet, bone tissue engineering clearly holds the potential to overcome historic limitations and disadvantages associated with the use of the current gold-standard autologous bone graft. optimising combinations of cells, scaffolds, and locally and systemically active stimuli will remain a complex process characterised by a highly interdependent set of variables with a large range of possible variations. 
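the mirroring step used above to derive a scaffold template from the healthy contralateral anatomy can be sketched as a reflection of mesh vertex coordinates across the sagittal plane. a minimal sketch; the vertex data, the plane position and all names are illustrative assumptions, not part of the published workflow:

```python
# Sketch of mirroring mesh vertices across the sagittal plane x = plane_x,
# as used conceptually when deriving a pelvic scaffold template from the
# healthy side. All data below are illustrative.

def mirror_vertices(vertices, plane_x=0.0):
    """Reflect (x, y, z) vertex coordinates across the plane x = plane_x."""
    return [(2.0 * plane_x - x, y, z) for (x, y, z) in vertices]

# two vertices of a hypothetical healthy-hemipelvis mesh, mirrored across
# the body midline at x = 0 to obtain the defect-side template
healthy = [(10.0, 2.0, 5.0), (12.5, -1.0, 4.0)]
defect_template = mirror_vertices(healthy)
```

note that a reflection inverts triangle winding order, so a real pipeline would also flip the face orientations (and typically re-register the mirrored surface to the defect site) before the model is exported for am fabrication.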
Consequently, these developments must also be nurtured and monitored by a combination of clinical experience, knowledge of basic biological principles, medical necessity, and commercial practicality. The responsibility for rational development is shared by the entire orthopaedic community (developers, vendors, and physicians). The need for objective and systematic assessment and reporting is made particularly urgent by the recent rapid addition of many new options for clinical use. By applying a complex interplay of 21st-century technologies from various disciplines of scientific research, the gap between bone tissue engineering research and the translation into clinically available bone tissue engineering applications can successfully be bridged.

References (titles only; entry numbering, authors and publication details were lost in extraction):
- flati g. chirurgia nella preistoria. parte i. provincia med aquila
- regenerative medicine 2.0
- nanostructured biomaterials for tissue engineering
- bone mechanical properties and the hierarchical structure of bone
- biomechanics of trabecular bone
- advanced biomaterials for skeletal tissue regeneration: instructive and smart functions
- form and function of bone
- comparison of microcomputed tomographic and microradiographic measurements of cortical bone porosity
- structure and development of the skeleton
- the skeleton as an endocrine organ
- bone as a ceramic composite material
- determinants of the mechanical properties of bones
- age-related bone changes
- an introduction to bioceramics
- the elastic modulus for bone
- the elastic moduli of human subchondral, trabecular, and cortical bone tissue and the size-dependency of cortical bone modulus
- elastic properties of human cortical and trabecular lamellar bone measured by nanoindentation
- effect of mechanical stability on fracture healing: an update
- biomechanics and tissue engineering
- mechanobiology of bone tissue
- mechanical conditions in the initial phase of bone healing
- the biology of fracture healing
- bone regeneration: current concepts and future directions
- enhancement of fracture healing
- risk factors contributing to fracture non-unions
- repairing holes in the head: a history of cranioplasty
- cranioplasty in prehistoric times
- traite experimental et clinique de la regeneration des os et de la production artificielle du tissu osseux
- histologische untersuchung über knochen implantationen
- the use of homogenous bone grafts; a preliminary report on the bone bank
- transplantation immunity in bone homografting
- immunologic aspects of bone transplantation
- the animal bone chip in the bone bank
- knochenregeneration mit knochenersatzmaterialien: eine tierexperimentelle studie. hefte unfallheilk
- adverse reactions and events related to musculoskeletal allografts: reviewed by the world health organisation project notify
- ueber die freie transplantation von knochenspongiosa. langenbecks arch clin chir
- experimental studies on bone transplantation with unchanged and denaturated bone substance. a contribution on causal osteogenesis
- bone graft substitutes: what are the options?
- bone grafts and bone graft substitutes in orthopaedic trauma surgery. a critical analysis
- autogenous bone graft: donor sites and techniques
- reamer-irrigator-aspirator bone graft and bi masquelet technique for segmental bone defect nonunions: a review of 25 cases
- bone grafts: a radiologic, histologic, and biomechanical model comparing autografts, allografts, and free vascularized bone grafts
- long bone reconstruction with vascularized bone grafts
- limb-lengthening, skeletal reconstruction, and bone transport with the ilizarov method
- bone lengthening (distraction osteogenesis): a literature review
- ilizarov principles of deformity correction
- ilizarov bone transport treatment for tibial defects
- use of the ilizarov technique for treatment of non-union of the tibia associated with infection
- skeletal tissue regeneration: current approaches, challenges, and novel reconstructive strategies for an aging population
- effet biologique des membranes à corps etranger induites in situ sur la consolidation des greffes d'os spongieux
- masquelet technique for the treatment of bone defects: tips-tricks and future directions
- the concept of induced membrane for reconstruction of long bone defects
- induced membranes secrete growth factors including vascular and osteoinductive factors and could stimulate bone regeneration
- influences of induced membranes on heterotopic bone formation within an osteo-inductive complex
- treatment of large segmental bone defects with reamer-irrigator-aspirator bone graft: technique and case series
- the influence of a one-step reamer-irrigator-aspirator technique on the intramedullary pressure in the pig femur
- bone graft harvest using a new intramedullary system
- a new minimally invasive technique for large volume bone graft harvest for treatment of fracture nonunions
- complications following autologous bone graft harvesting from the iliac crest and using the ria: a systematic review
- novel technique for medullary canal débridement in tibia and femur osteomyelitis
- osteogenic potential of reamer irrigator aspirator (ria) aspirate collected from patients undergoing hip arthroplasty
- reaming irrigator aspirator system: early experience of its multipurpose use
- reamer-irrigator-aspirator indications and clinical results: a systematic review
- using the bi-masquelet technique and reamer-irrigator-aspirator for post-traumatic foot reconstruction
- the use of bone substitutes in the treatment of bone defects: the clinical view and history
- current trends and future perspectives of bone substitute materials: from space holders to innovative biomaterials
- fracture healing: the diamond concept
- the diamond concept: open questions
- state of the art and future directions of scaffold-based bone engineering from a biomaterials perspective
- bone tissue engineering: from bench to bedside
- bone graft substitutes
- safety and efficacy of use of demineralised bone matrix in orthopaedic and trauma surgery
- demineralized bone matrix as an osteoinductive biomaterial and in vitro predictors of its biological potential
- the clinical use of allografts, demineralized bone matrices, synthetic bone graft substitutes and osteoinductive growth factors: a survey study
- a thorough physicochemical characterisation of 14 calcium phosphate-based bone substitution materials in comparison to natural bone
- coralline hydroxyapatite bone graft substitute: a review of experimental studies and biomedical applications
- maxilla sinus grafting with marine algae derived bone forming material: a clinical report of long-term results
- three-dimensional glass-derived scaffolds for bone tissue engineering: current trends and forecasts for the future
- calcium salts bone regeneration scaffolds: a review article
- current application of beta-tricalcium phosphate composites in orthopaedics
- various preparation methods of highly porous hydroxyapatite/polymer nanoscale biocomposites for bone regeneration
- concepts of scaffold-based tissue engineering: the rationale to use solid free-form fabrication techniques
- biocomposites containing natural polymers and hydroxyapatite for bone tissue engineering
- calcium phosphate ceramic systems in growth factor and drug delivery for bone tissue engineering: a review
- the return of a forgotten polymer: polycaprolactone in the 21st century
- polymeric materials for bone and cartilage repair
- scaffolds in tissue engineering bone and cartilage
- bioactive composites for bone tissue engineering
- design of bone-integrating organic-inorganic composite suitable for bone repair
- current concepts of molecular aspects of bone healing
- bone tissue engineering: state of the art and future trends
- a tissue engineering solution for segmental defect regeneration in load-bearing long bones
- clinical application of human mesenchymal stromal cells for bone tissue engineering
- characterization, differentiation, and application in cell and gene therapy
- mesenchymal stem cells in musculoskeletal tissue engineering: a review of recent advances in clonal characterization of bone marrow derived stem cells and their application for bone regeneration
- bone marrow stromal cells (bone marrow-derived multipotent mesenchymal stromal cells) for bone tissue engineering: basic science to clinical translation
- current insights on the regenerative potential of the periosteum: molecular, cellular, and endogenous engineering approaches
- periosteal cells in bone tissue engineering
- osteoblasts in bone tissue engineering
- stem cells from umbilical cord and placenta for musculoskeletal tissue engineering
- adipose-derived stem cells for tissue repair and regeneration: ten years of research and a literature review
- bone tissue engineering: recent advances and challenges
- bone tissue engineering: current strategies and techniques. part ii: cell types
- cell sources for bone tissue engineering: insights from basic science
- assessing the value of autologous and allogeneic cells for regenerative medicine
- orthopaedic applications of bone graft & graft substitutes: a review
- mineralization processes in demineralized bone matrix grafts in human maxillary sinus floor elevations
- biological matrices and tissue reconstruction
- foreign-body reactions to fracture fixation implants of biodegradable synthetic polymers
- foreign body reactions to resorbable poly(l-lactide) bone plates and screws used for the fixation of unstable zygomatic fractures
- late degradation tissue response to poly(l-lactide) bone plates and screws
- inflammatory cell response to calcium phosphate biomaterial particles: an overview
- bone regeneration materials for the mandibular and craniofacial complex
- bone scaffolds: the role of mechanical stability and instrumentation
- relative influence of surface topography and surface chemistry on cell response to bone implant materials. part 1: physico-chemical effects
- role of material surfaces in regulating bone and cartilage cell response
- formation and function of bone
- specific proteins mediate enhanced osteoblast adhesion on nanophase ceramics
- mechanisms of enhanced osteoblast adhesion on nanophase alumina involve vitronectin
- osteoblast adhesion on nanophase ceramics
- nano rough micron patterned titanium for directing osteoblast morphology and adhesion
- the influence of surface microroughness and hydrophilicity of titanium on the up-regulation of tgfbeta/bmp signalling in osteoblasts
- the effects of microporosity on osteoinduction of calcium phosphate bone graft substitute biomaterials
- advances in bionanomaterials for bone tissue engineering
- development of nanomaterials for bone repair and regeneration
- tissue engineering: nanomaterials in the musculoskeletal system
- perspectives on the role of nanotechnology in bone tissue engineering
- a preliminary study on osteoinduction of two kinds of calcium phosphate ceramics
- in vitro modeling of the bone/implant interface
- osteoinduction, osteoconduction and osseointegration
- osteoinductive biomaterials: properties and relevance in bone repair
- bonding of bone to apatite-coated implants
- biological profile of calcium phosphate coatings
- early bone formation around calcium-ion-implanted titanium inserted into rat tibia
- basse-cathalinat b. evolution of the local calcium content around irradiated beta-tricalcium phosphate ceramic implants: in vivo study in the rabbit
- calcium phosphate-based osteoinductive materials
- effect of culture conditions and calcium phosphate coating on ectopic bone formation
- osteoinduction by biomaterials: physicochemical and structural influences
- osteoinductive biomaterials: current knowledge of properties, experimental models and biological mechanisms
- bone formation induced by calcium phosphate ceramics in soft tissue of dogs: a comparative study between porous alpha-tcp and beta-tcp
- current views on calcium phosphate osteogenicity and the translation into effective bone regeneration strategies
- relative influence of surface topography and surface chemistry on cell response to bone implant materials. part 2: biological aspects
- new perspectives in mercury porosimetry
- bmp-induced osteogenesis on the surface of hydroxyapatite with geometrically feasible and nonfeasible structures: topology of osteogenesis
- porosity of 3d biomaterial scaffolds and osteogenesis
- potential of ceramic materials as permanently implantable skeletal prostheses
- three-dimensional scaffolds for tissue engineering: role of porosity and pore size
- bone tissue engineering: current strategies and techniques. part i: scaffolds
- recent advances in bone tissue engineering scaffolds
- a review of rapid prototyping techniques for tissue engineering purposes
- scaffold-based tissue engineering: rationale for computer-aided design and solid free-form fabrication systems
- rapid prototyping in tissue engineering: challenges and potential
- making tissue engineering scaffolds work. review: the application of solid freeform fabrication technology to the production of tissue engineering scaffolds
- additive manufacturing of tissues and organs
- astm standard f2792-10: standard terminology for additive manufacturing technologies
- image-guided tissue engineering
- porous scaffold design for tissue engineering
- printing and prototyping of tissues and scaffolds
- additive processing of polymers
- a review of rapid prototyping (rp) techniques in the medical and biomedical sector
- rapid prototyping for orthopaedic surgery
- prototyping for surgical and prosthetic treatment
- speedy skeletal prototype production to help diagnosis in orthopaedic and trauma surgery. methodology and examples of clinical applications
- clinical applications of physical 3d models derived from mdct data and created by rapid prototyping
- a review on stereolithography and its applications in biomedical engineering
- submicron stereolithography for the production of freely movable mechanisms by using single-photon polymerization
- development of the australian core competencies in musculoskeletal basic and clinical science project: phase 1
- cancer statistics
- the growth of tissue engineering
- support for tissue engineering and regenerative medicine by the national institutes of health
- scaffold engineering: a bridge to where?
- establishment of a preclinical ovine model for tibial segmental bone defect repair by applying bone tissue engineering strategies
- mapping the translational science policy 'valley of death'
- bridging the regeneration gap: stem cells, biomaterials and clinical translation in bone tissue engineering
- emerging rules for inducing organ regeneration
- scaffold translation: barriers between concept and clinic
- tissue engineering of bone: the reconstructive surgeon's point of view
- engineering bone: challenges and obstacles
- the design and use of animal models for translational research in bone tissue engineering and regenerative medicine
- high-throughput screening for analysis of in vitro toxicity
- high-throughput screening assays to discover small-molecule inhibitors of protein interactions
- long bone defect models for tissue engineering applications: criteria for choice
- animal models for bone tissue engineering
- skeletal tissue engineering: from in vitro studies to large animal models
- clinically applied models of bone regeneration in tissue engineering research
- selection and development of preclinical models in fracture-healing research
- the return of a forgotten polymer: polycaprolactone in the 21st century
- fused deposition modeling of novel scaffold architectures for tissue engineering applications
- the stimulation of healing within a rat calvarial defect by mpcl-tcp/collagen scaffolds loaded with rhbmp-2
- comparison of the degradation of polycaprolactone and polycaprolactone-(β-tricalcium phosphate) scaffolds in alkaline medium
- evaluation of polycaprolactone scaffold degradation for 6 months in vitro and in vivo
- dynamics of in vitro polymer degradation of polycaprolactone-based scaffolds: accelerated versus simulated physiological conditions
- composite electrospun scaffolds for engineering tubular bone grafts
- biomimetic tubular nanofiber mesh and platelet rich plasma-mediated delivery of bmp-7 for large bone defect regeneration
- differences between in vitro viability and differentiation and in vivo bone-forming efficacy of human mesenchymal stem cells cultured on pcl-tcp scaffolds
- combined marrow stromal cell-sheet techniques and high-strength biodegradable composite scaffolds for engineered functional bone grafts
- treatment of long bone defects and non-unions: from research to clinical practice
- the challenge of establishing preclinical models for segmental bone defect research
- custom-made composite scaffolds for segmental defect repair in long bones
- a tissue engineering solution for segmental defect regeneration in long bones
- establishment of a preclinical ovine model for tibial segmental bone defect repair by applying bone tissue engineering strategies
- autologous vs. allogenic mesenchymal progenitor cells for the reconstruction of critical sized segmental tibial bone defects in aged sheep
- prevalence of long-bone non-unions
- epidemiology of adult fractures: a review
- path analysis of factors for delayed healing and nonunion in 416 operatively treated tibial shaft fractures
- animal models for preclinical assessment of bone morphogenetic proteins in the spine
- animal models for spinal fusion
- the use of recombinant human bone morphogenetic protein 2 (rhbmp-2) to promote spinal fusion in a nonhuman primate anterior interbody fusion model. spine (phila pa 1976)
- osteogenic protein versus autologous interbody arthrodesis in the sheep thoracic spine. a comparative endoscopic study using the bagby and kuslich interbody fusion device. spine (phila pa 1976)
- lateral surgical approach to lumbar intervertebral discs in an ovine model
- scoliosis and its pathophysiology: do we understand it? spine (phila pa 1976)
- back pain and function 22 years after brace treatment for adolescent idiopathic scoliosis: a case-control study, part i. spine (phila pa 1976)
- challenges to bone formation in spinal fusion
- bone graft substitutes for spinal fusion
- an update on bone substitutes for spinal fusion
- bone graft substitutes in spinal surgery
- spinal fusion surgery: animal models for tissue-engineered bone constructs
- bone morphogenetic protein in spinal fusion: overview and clinical update
- bone morphogenetic proteins for spinal fusion
- establishment and characterization of an open mini-thoracotomy surgical approach to an ovine thoracic spine fusion model
- cranioplasty after trephination using a novel biodegradable burr hole cover: technical case report
- patient-specific instrumentation for total knee arthroplasty: a review
- a review of rapid prototyped surgical guides for patient-specific total knee replacement
- new technologies in planning and performance of osteotomies: example cases in hand surgery
- computer-assisted prefabrication of individual craniofacial implants
- cad/cam dental systems in implant dentistry: update
- uncemented computer-assisted design-computer-assisted manufacture femoral components in revision total hip replacement: a minimum follow-up of ten years
- uncemented custom computer-assisted design and manufacture of hydroxyapatite-coated femoral components: survival at 10 to 17 years
- ct lesion model-based structural allografts: custom fabrication and clinical experience
- rapid prototyping for biomedical engineering: current capabilities and challenges
- calvarial reconstruction by customized bioactive implant
- can bone tissue engineering contribute to therapy concepts after resection of musculoskeletal sarcoma? sarcoma

key: cord-010903-kuwy7pbo
authors: liu, jiajun; neely, michael; lipman, jeffrey; sime, fekade; roberts, jason a.; kiel, patrick j.; avedissian, sean n.; rhodes, nathaniel j.; scheetz, marc h.
title: development of population and bayesian models for applied use in patients receiving cefepime
date: 2020-03-05
journal: clin pharmacokinet
doi: 10.1007/s40262-020-00873-3
sha:
doc_id: 10903
cord_uid: kuwy7pbo

Background and objective: Understanding the pharmacokinetic disposition of cefepime, a β-lactam antibiotic, is crucial for developing regimens that achieve optimal exposure and improved clinical outcomes. This study sought to develop and evaluate a unified population pharmacokinetic model for both pediatric and adult patients receiving cefepime treatment. Methods: Multiple physiologically relevant models were fit to pediatric and adult subject data. To evaluate the final model's performance, a withheld group of 12 pediatric patients and two separate adult populations were assessed. Results: Seventy subjects, contributing a total of 604 cefepime concentrations, were included in this study. The adults (n = 34) weighed on average 82.7 kg and displayed a mean creatinine clearance of 106.7 mL/min. The pediatric subjects (n = 36) had a mean weight and creatinine clearance of 16.0 kg and 195.6 mL/min, respectively. A covariate-adjusted two-compartment model described the observed concentrations well (population model r² = 87.0%; Bayesian model r² = 96.5%). In the evaluation subsets, the model performed similarly well (population r² = 84.0%; Bayesian r² = 90.2%). Conclusion: The identified model serves well for population dosing and as a Bayesian prior for precision dosing. Electronic supplementary material: The online version of this article (10.1007/s40262-020-00873-3) contains supplementary material, which is available to authorized users.

Cefepime is a commonly utilized antibiotic for nosocomial infections. Rising resistance, manifesting as increased cefepime minimum inhibitory concentrations (MICs), has led to more frequent clinical failures [1, 2].
To align clinical outcomes with MICs, the Clinical and Laboratory Standards Institute updated the susceptibility breakpoints and created a susceptible-dose-dependent category for MICs of 4 and 8 mg/L for Enterobacteriaceae spp. [3]. Achieving goal pharmacokinetic exposures to effectively treat these higher MICs can require a precision dosing approach. Cefepime, like other β-lactams, has pharmacodynamic activity governed by 'time-dependent' activity. The fraction of time that the unbound drug concentration exceeds the MIC (fT>MIC) over the dosing interval is the pharmacokinetic/pharmacodynamic (PK/PD) efficacy target for cefepime [4], and a target of 60-74% has been previously proposed [5-8]. For the currently approved cefepime product and combination agents in the pipeline [9, 10], understanding cefepime disposition and variability is crucial for optimal treatment of patients. As inter- and intra-patient PK variability can impact the achievement of PD goals, understanding the precision of population dosing is important. Further, to fully realize precision dosing, individualized models (e.g., Bayesian models) are needed. Once developed, these models will form the basis for adaptive feedback and control strategies when paired with real-time drug assays. The purpose of this study was to: (1) develop and evaluate a unified cefepime population PK model for adult and pediatric patients, and (2) construct an individualized model that can be utilized to deliver precision cefepime dosing. Data from four clinical cefepime PK studies, representing unique groups of patients, were compiled. Subject demographics and study methodologies have been previously described [11-14].
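The fT>MIC target described above can be computed directly from a simulated concentration-time profile. The sketch below uses a one-compartment intermittent-infusion model with illustrative parameter values; the dose, volume, elimination rate, and the free fraction are assumptions for demonstration, not the fitted values from this study:

```python
import numpy as np

def conc_1cmt_inf(t, dose, vd, ke, tinf, tau, n_doses=10):
    """Total plasma concentration for repeated IV infusions,
    one-compartment model, built by superposition of doses."""
    c = np.zeros_like(t, dtype=float)
    k0 = dose / tinf  # zero-order infusion rate (mg/h)
    for n in range(n_doses):
        ts = t - n * tau
        during = (ts > 0) & (ts <= tinf)
        after = ts > tinf
        c[during] += k0 / (vd * ke) * (1 - np.exp(-ke * ts[during]))
        c[after] += (k0 / (vd * ke) * (1 - np.exp(-ke * tinf))
                     * np.exp(-ke * (ts[after] - tinf)))
    return c

def ft_above_mic(dose, vd, ke, tinf, tau, mic, fu=0.8, window=(0.0, 24.0)):
    """Fraction of the time window with free concentration above the MIC."""
    t = np.linspace(window[0], window[1], 24001)
    free = fu * conc_1cmt_inf(t, dose, vd, ke, tinf, tau)
    return float(np.mean(free > mic))
```

For example, `ft_above_mic(2000, vd=18, ke=0.3, tinf=0.5, tau=8, mic=8)` evaluates a 2 g q8h regimen (0.5-h infusion) against an MIC of 8 mg/L; all parameter values here are placeholders.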
In brief, the populations represented were febrile neutropenic adults with hematologic malignancies [13, 14], critically ill adults [12], and children with presumed or documented bacterial infections [11]. For the two studies that evaluated adults with neutropenic fever, Sime et al. prospectively enrolled 12 patients receiving chemotherapy and/or stem cell transplant who subsequently developed febrile neutropenia and were administered maximum doses of cefepime [13]; a total of 53 cefepime plasma concentrations from presumably steady-state dosing intervals (third, sixth, and ninth) were analyzed for PK target attainment. Whited et al. prospectively studied similar patients (n = 9) who were admitted to hematology-oncology services and were receiving cefepime at the maximum dosage for febrile neutropenia [14]; cefepime PK samples were obtained during steady state and analyzed for population parameters. Critically ill adults were studied by Roberts et al. in a prospective multinational PK study that included 14 patients who received cefepime (only n = 13 were included for model evaluation) [12]. Last, Reed et al. characterized cefepime pharmacokinetics in hospitalized pediatric patients (above 2 months of age) who received cefepime as monotherapy for bacterial infections [11]. For our study, only those who received intravenous cefepime were included for model development. The adult (n = 9) and partial pediatric (n = 24) datasets were utilized for PK model building (Fig. 1) [11, 14]. Model evaluation was performed with the other datasets, consisting of independent adult (n = 12, n = 13) and pediatric (n = 12) patients [11-13, 15]. Pediatric patients from Reed et al. [11] were randomized into either the model-building or the evaluation dataset. All clinical patient-level data included age, weight, and serum creatinine. An estimated creatinine clearance (CrCl) was calculated for each patient [16].
The Cockcroft-Gault formula (applied to all subjects) served as a standardized descriptor for the elimination rate constant. This study was exempted by the Institutional Review Board at Midwestern University Chicago College of Pharmacy. To construct the base PK models, the nonparametric adaptive grid (NPAG) algorithm [17, 18] within the Pmetrics (version 1.5.2) package [18] for R [19] was utilized. Multiple physiologically relevant one- and two-compartment PK models were built and assessed. The one-compartment structural model included an intravenous cefepime dose into, and a parameterized total cefepime elimination rate constant (ke) from, the central compartment. The two-compartment model included additional parameterizations of the intercompartmental transfer constants between the central and peripheral compartments (kcp and kpc). In candidate models, total cefepime elimination was explored according to full renal and partial renal clearance (CL) models [i.e., nonrenal elimination (ke intercept) and a renal elimination descriptor (ke0, vectorized as a function of glomerular filtration estimates)] [9, 20]. Assay error was included in the model using a polynomial equation in the form of the standard deviation (SD) as a function of each observed concentration, y (i.e., SD = C0 + C1 · y). Observation weighting was performed using gamma (i.e., error = SD · gamma), a multiplicative variance model that accounts for extra process noise. Gamma was initially set at 4, with C0 and C1 equal to 0.5 and 0.15, respectively.

[Key points: A unified cefepime population pharmacokinetic model has been developed from adult and pediatric patients and evaluates well in independent populations. When paired with real-time β-lactam assays, a precision dosing approach will optimize drug exposure and improve clinical outcomes.]
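The Cockcroft-Gault estimate referenced above is a standard formula; a minimal implementation (assumed units: age in years, weight in kg, serum creatinine in mg/dL, output in mL/min) might look like:

```python
def cockcroft_gault_crcl(age_years, weight_kg, scr_mg_dl, female=False):
    """Cockcroft-Gault estimated creatinine clearance (mL/min):
    CrCl = (140 - age) * weight / (72 * SCr), multiplied by 0.85 for females."""
    crcl = (140 - age_years) * weight_kg / (72 * scr_mg_dl)
    return crcl * 0.85 if female else crcl
```

For a 50-year-old, 70-kg male with a serum creatinine of 1.0 mg/dL this yields 87.5 mL/min. Note that applying the formula to pediatric subjects, as done in this study for a standardized descriptor, departs from its usual adult-only use.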
Covariate relationships were assessed using the 'PMstep' function in Pmetrics by applying stepwise linear regressions (forward selection and backwards elimination) of all covariates on the PK parameters. Additionally, a priori analyses examined the effect of covariates on cefepime ke, and both weight and CrCl were considered a priori to have a high potential likelihood of impacting cefepime pharmacokinetics [9, 21, 22]. Weight and CrCl were standardized to 70 kg and 120 mL/min, respectively. Further, an allometric scaler was applied to standardized weight (i.e., the quotient of weight in kg divided by 70 kg, raised to the power of -0.25) as a covariate adjustment to ke (ESM). Ultimate model retention was governed according to the criteria described below. The best-fit PK and error model was identified by the change in objective function value (OFV), calculated as the difference in -2 log-likelihood, with a reduction of 3.84 in OFV corresponding to p < 0.05 based on a chi-square distribution with one degree of freedom. Further, the best-fit model was selected based on the rule of parsimony and the lowest Akaike information criterion score. Goodness of fit of the competing models was evaluated by regression on observed-vs.-predicted plots, coefficients of determination, and visual predictive checks. Predictive performance was assessed using bias and imprecision in both the population and individual prediction models. Bias was defined as the mean weighted prediction error; imprecision was defined as the bias-adjusted mean weighted squared prediction error. Posterior-predicted cefepime concentrations for each study subject were calculated using the individual median Bayesian posterior parameter estimates. To evaluate the final adjusted model, the NPAG algorithm [17, 18] was employed to assess performance with separate data sets (Fig. 1).
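As a sketch of how the covariate adjustment and the likelihood-ratio criterion described above could be expressed, the following assumes a multiplicative combination of the renal term and the allometric weight scaler; the exact functional form and parameter values belong to the fitted model and are not reproduced here:

```python
def elimination_rate(ke0, ke_intcpt, crcl_ml_min, weight_kg):
    """Covariate-adjusted total elimination rate constant (sketch):
    nonrenal intercept plus a renal term scaled by CrCl standardized to
    120 mL/min, multiplied by the allometric weight adjustment
    (weight / 70 kg raised to the -0.25 power). The multiplicative
    combination here is an assumption, not the published parameterization."""
    allometric = (weight_kg / 70.0) ** -0.25
    return ke_intcpt + ke0 * (crcl_ml_min / 120.0) * allometric

def ofv_improves(ofv_reduced, ofv_full, critical=3.84):
    """Likelihood-ratio check: an OFV (-2 log-likelihood) reduction greater
    than 3.84 corresponds to p < 0.05 for one added parameter (chi-square,
    one degree of freedom)."""
    return (ofv_reduced - ofv_full) > critical
```

At the standardizing values (70 kg, 120 mL/min) both adjustment terms equal 1, so the expression reduces to the intercept plus the renal slope.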
The population joint density from the best-fit covariate-adjusted model was employed as a Bayesian prior for the randomly withheld pediatric data and the separate adult data. [Fig. 1: Schematic of data sources for model development and evaluation.] In the evaluation process, the structural model, model parameters, assay error, and observation weighting were unchanged. Goodness of fit of the competing models was determined as described above. Simulation was performed to examine the exposures predicted by the final model, employing all support points from the population parameter joint density in the final NPAG analysis [18, 23]. Each support point was treated as a mean vector surrounded by the population variance-covariance matrix (i.e., covariance equal to the population covariance divided by the total number of support points). For each subject, 1000 simulated profiles were created, with predicted outputs at 0.1-h intervals. Covariate values for each simulated subject were fixed at the arithmetic means of the observed weight and CrCl for the corresponding adult and pediatric populations. Semi-parametric Monte Carlo sampling was performed from the multimodal multivariate distribution of parameters, with the parameter space concordant with the NPAG population analysis results (i.e., the best-fit model) [Table S1 of the ESM] [23]. Maximum dosing regimens were simulated for the adult and pediatric populations (total n = 33): 2 g every 8 h infused over 0.5 h, and 50 mg/kg every 8 h infused over 0.5 h, respectively. Protein binding of 20% (i.e., an 80% free fraction of the total cefepime dose) was accounted for in predicting cefepime concentrations [9]. A PK/PD target of fT>MIC ≥ 68% was utilized across doubling MICs of 0.25-32 mg/L over the first 24 h of cefepime therapy [5]. Estimates are provided from the first 24 h of simulations, as timely administration of effective antimicrobial agents is associated with increased survival [24].
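A simplified version of the probability-of-target-attainment simulation described above can be sketched as follows. For brevity this uses a one-compartment approximation with illustrative lognormal parameter distributions, rather than the study's two-compartment model and semi-parametric samples from the NPAG joint density:

```python
import numpy as np

rng = np.random.default_rng(1)

def free_conc(t, dose, vd, ke, tinf=0.5, tau=8.0, fu=0.8):
    """Free cefepime concentration for repeated 0.5-h infusions,
    one-compartment approximation with an 80% free fraction."""
    c = np.zeros_like(t)
    k0 = dose / tinf
    for n in range(int(t[-1] // tau) + 1):
        ts = t - n * tau
        during = (ts > 0) & (ts <= tinf)
        after = ts > tinf
        c[during] += k0 / (vd * ke) * (1 - np.exp(-ke * ts[during]))
        c[after] += (k0 / (vd * ke) * (1 - np.exp(-ke * tinf))
                     * np.exp(-ke * (ts[after] - tinf)))
    return fu * c

def pta(dose, mics, n=1000, target=0.68):
    """Probability of target attainment (fT>MIC >= 68%) over the first 24 h.
    The lognormal means and spreads below are illustrative assumptions,
    not the fitted population joint density."""
    t = np.linspace(0.0, 24.0, 2401)
    ke = rng.lognormal(np.log(0.35), 0.3, n)   # assumed elimination rate, /h
    vd = rng.lognormal(np.log(18.0), 0.25, n)  # assumed central volume, L
    out = {}
    for mic in mics:
        hits = sum(np.mean(free_conc(t, dose, v, k) > mic) >= target
                   for k, v in zip(ke, vd))
        out[mic] = hits / n
    return out
```

Calling `pta(2000, [0.25, 0.5, 1, 2, 4, 8, 16, 32])` tabulates attainment for the 2 g q8h regimen across the doubling MIC range used in the study; PTA declines monotonically as the MIC doubles.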
A total of 70 clinically diverse subjects, contributing 683 cefepime concentrations, were included in this study (n = 33 subjects for model development; n = 37 subjects for evaluation) (Fig. 1). A total of 428 cefepime observations were available for model development. Cefepime concentrations ranged from 0.5 to 249.7 μg/mL. The base one- and two-compartment models (without covariate adjustment) produced reasonable fits for the observed vs. Bayesian posterior-predicted cefepime concentrations (r² = 84.7% and 85.2%, respectively), but the population estimates were unsatisfactory (r² = 22.7% and 27.8%, respectively) (Table 1). Weight and CrCl displayed relationships with the standard two-compartment model (i.e., the base two-compartment model): volume of distribution was associated with weight (p < 0.2), and ke (total) was associated with CrCl (p < 0.2). After standardizing weight (to 70 kg) without an allometric scaler in the base two-compartment model, fits for both the population and Bayesian posterior estimates against the observed data improved (r² = 60.7% and 96.5%, respectively; OFV change, 4). Bias and imprecision for the Bayesian posterior fits were -0.18 and 1.12, respectively. When the covariates (i.e., weight on volume of distribution and ke; CrCl on ke) and the allometric scaler were applied in the two-compartment model, the Bayesian posteriors fit well (r² = 96.5%; Fig. 2, right) with low bias and imprecision (-0.15 and 1.07, respectively), and the population PK model produced good fits of the observed cefepime concentrations (r² = 87.0%, bias = 0.53, imprecision = 7.75; Fig. 2, left). The OFV change from the weight-adjusted two-compartment model to the final model was significant at -34 (p < 0.05) (Table 1). The final model also produced acceptable predictive checks (Fig. 3). Thus, a two-compartment model with weight and CrCl as covariate adjustments and allometric scaling was selected as the final PK model.
the population parameter values from the final pk model are summarized in table 2. the structural model and differential equations that define the population pk are listed in the esm. the population parameter value covariance matrix can be found in table 3. additionally, weighted residual error plots for the best-fit model (fig. s1) and scatter plots of covariates for the base structural model (fig. s2) are included in the esm. for the evaluation subset, bayesian priors resulted in reasonably accurate and precise predictions (population r2 = 84.0%, bayesian r2 = 90.2%; fig. 4). goodness-of-fit plots for the best-fit population cefepime pk model (model development) are shown in fig. 2. results of the probability of target attainment (pta) analysis are shown in table 4. this study created a population and individual pk model for adult and pediatric patients that can serve as a bayesian prior for precision dosing. when paired with a real-time assay for cefepime, this model allows for precise and accurate predictions of cefepime disposition via adaptive feedback control. in the absence of real-time assays, these cefepime pk parameters facilitate more accurate population-based dosing strategies. previous work by rhodes et al. has shown an absolute difference of approximately 20% in survival probability across the continuum of achieving 0-100% ft>mic in adult patients with gram-negative bloodstream infections; thus, understanding the dose and re-dosing interval necessary to achieve optimal pk exposures should greatly improve clinical outcomes for patients treated with cefepime [5]. individualized dosing and therapeutic drug monitoring of β-lactam antibiotics (e.g., cefepime) are critically important to achieving optimal drug exposure (i.e., optimal ft>mic as the pk/pd target) and improving clinical outcomes [4, 25, 26]. 
precision medicine has been named a major focus for the national institutes of health, with $215 million invested [27], yet precision medicine has mostly focused on genomic differences [28, 29]. precision dosing is an important facet of precision medicine, and renewed efforts in precision dosing in the real-world setting are being pursued [30]. cefepime is a highly relevant example. while rigorous reviews and analyses are conducted during the development phase of an antibiotic, dose optimization is far less ideal for the types of patients who ultimately receive the drug. this is highlighted by the fact that although cefepime-associated neurotoxicity is rare, this serious and potentially life-threatening adverse event has been increasingly reported, and few strategies exist for optimizing and delivering precision exposures [31, 32]. lamoth et al. found that a cefepime trough concentration of ≥ 22 mg/l has a 50% probability of predicting neurotoxicity [33]. huwyler et al. identified a similar predictive threshold of > 20 mg/l (five-fold increased risk for neurologic events) [34]. in contrast, rhodes et al. found the cut-off of 22 mg/l to be suboptimal [35]. furthermore, rhodes et al. performed simulations from literature cefepime data and observed a high intercorrelation amongst all pk parameters (i.e., area under the curve at steady state, maximum plasma concentration, and minimum plasma concentration), suggesting that more work is needed to establish the pharmacokinetic/toxicodynamic (pk/td) profile for cefepime. in addition to complications from these less-than-ideal pk/td data, clinicians are left to treat patients with extreme age differences, organ dysfunction, and comorbid conditions affecting antibiotic pharmacokinetics/pharmacodynamics [26]. 
further, a contemporary dose reduction strategy based on estimated renal function (e.g., estimated crcl using the cockcroft-gault formula) is also likely to be confounded in these patients by intrinsic pk variability, such as changes in volume of distribution, and the challenges of accurately estimating the glomerular filtration rate at any point in time, leading to more 'uncertainties' in balancing dose optimization and adverse events [9, 36]. these 'real-world' patients are often under-represented, and thus not well understood, from a pk/pd and pk/td standpoint during the drug approval process. bridging to the more typical patients that are clinically treated is important and central to the mission of precision medicine. the findings of this study can be used to guide cefepime dosing in these 'real-world' patients. several other studies have reviewed population cefepime pharmacokinetics. sime et al. [38]. in our pediatric population, means of cl and elimination half-life were 3.1 l/h and 3.0 h, respectively. our simulation findings are similar to those of shoji et al. in that the maximum pediatric cefepime dosing did not adequately achieve optimal exposure to target higher mics. while the cefepime package labeling recommends maximum dosages of 2 g every 8 h for adult patients with neutropenic fever and 50 mg/kg every 8 h for pediatric patients with pneumonia and/or neutropenic fever, there may be a need to extend these dosing regimens to other populations (in the absence of the aforementioned indications) to achieve the best clinical outcomes by optimizing the pk/pd attainment goals [9]. other studies have also performed simulations of pta with different cefepime regimens and renal functions. tam et al. found that with a pd target of 67% ft>mic, 2 g every 8 h (30-minute infusion) achieved approximately 90% pta for an mic of 8 mg/l in patients with a crcl of 120 ml/min, while 2 g every 12 h achieved barely above 80% pta for an mic of 4 mg/l in the same population [39]. 
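since renal dose adjustment hinges on the cockcroft-gault estimate mentioned above, a minimal implementation of that formula is shown for reference (total-body-weight version; which weight descriptor to use in obese patients varies by institution):

```python
def cockcroft_gault(age_years, weight_kg, scr_mg_dl, female=False):
    """Cockcroft-Gault estimated creatinine clearance (mL/min):
    CrCl = (140 - age) * weight / (72 * SCr), multiplied by 0.85 for females."""
    crcl = (140.0 - age_years) * weight_kg / (72.0 * scr_mg_dl)
    return 0.85 * crcl if female else crcl

# e.g., a 40-year-old, 72-kg male with a serum creatinine of 1.0 mg/dL
print(cockcroft_gault(40, 72.0, 1.0))  # 100.0 mL/min
```

note that, as the text argues, this estimate can still be confounded by unstable renal function and volume shifts, so a point estimate of crcl is an input to dosing, not a guarantee of exposure.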
nicasio et al. also conducted a simulation using a pd target of 50% ft>mic in the critically ill with varying renal function. the maximum recommended dosage (2 g every 8 h) in patients with crcl between 50 and 120 ml/min achieved a pta of 78.1% at an mic of 16 mg/l; however, when the same regimen was infused over 0.5 h, the pta achieved was significantly lower [37]. collectively, these findings suggest that cefepime exposure is highly variable and may be clinically suboptimal in a large number of patients commonly treated with cefepime. these findings support the need for precision dosing and therapeutic drug monitoring of β-lactam antibiotics to reach optimal pk/pd targets, given the high variability in drug exposures. our study is not without limitations. first, although a relatively large and diverse cohort was included in model development and evaluation, we did not specifically assess certain subgroups, such as patients with morbid obesity or severe renal dysfunction; these conditions may require patient-specific models. second, many studies to date have included 'real-world' patients with various disease states (e.g., neutropenic fever, renal failure, sepsis); however, all studies were conducted under research protocols where doses and administration times were carefully confirmed. additional efforts will be needed to evaluate model performance in clinical contexts. in conclusion, a unified population model for cefepime in adult and pediatric populations was developed and demonstrated excellent performance on evaluation. current cefepime dosages are often suboptimal, and population variability is high. precision dosing approaches and real-time assays are needed for cefepime to optimize drug exposure and improve clinical outcomes. 
references:
- failure of current cefepime breakpoints to predict clinical outcomes of bacteremia caused by gram-negative organisms
- evaluation of clinical outcomes in patients with gram-negative bloodstream infections according to cefepime mic
- performance standards for antimicrobial susceptibility testing
- pharmacokinetic/pharmacodynamic parameters: rationale for antibacterial dosing of mice and men
- defining clinical exposures of cefepime for gram-negative bloodstream infections that are associated with improved survival
- clinical pharmacodynamics of cefepime in patients infected with pseudomonas aeruginosa
- relationship between pk/pd of cefepime and clinical outcome in febrile neutropenic patients with normal renal function
- interrelationship between pharmacokinetics and pharmacodynamics in determining dosage regimens for broad-spectrum cephalosporins
- cefepime/vnrx-5133 broad-spectrum activity is maintained against emerging kpc- and pdc-variants in multidrug-resistant k. pneumoniae and p. aeruginosa
- pharmacokinetics of intravenously and intramuscularly administered cefepime in infants and children
- dali: defining antibiotic levels in intensive care unit patients: are current beta-lactam antibiotic doses sufficient for critically ill patients?
- adequacy of high-dose cefepime regimen in febrile neutropenic patients with hematological malignancies
- pharmacokinetics of cefepime in patients with cancer and febrile neutropenia in the setting of hematologic malignancies or hematopoietic cell transplantation. pharmacotherapy
- guidance document: population pharmacokinetics guidance for industry
- prediction of creatinine clearance from serum creatinine
- an adaptive grid non-parametric approach to pharmacokinetic and dynamic (pk/pd) population models
- accurate detection of outliers and subpopulations with pmetrics, a nonparametric and parametric pharmacometric modeling and simulation package for r
- r: a language and environment for statistical computing
- cefepime clinical pharmacokinetics
- pharmacokinetics of cefepime in patients with respiratory tract infections
- cefepime in intensive care unit patients: validation of a population pharmacokinetic approach and influence of covariables
- population modeling and monte carlo simulation study of the pharmacokinetics and antituberculosis pharmacodynamics of rifampin in lungs
- duration of hypotension before initiation of effective antimicrobial therapy is the critical determinant of survival in human septic shock
- pharmacokinetics-pharmacodynamics of antimicrobial therapy: it's not just for mice anymore
- therapeutic drug monitoring of the beta-lactam antibiotics: what is the evidence and which patients should we be using it for?
- the white house. fact sheet: president obama's precision medicine initiative
- precision medicine: from science to value. health aff (millwood)
- precision medicine: changing the way we think about healthcare
- precision dosing: defining the need and approaches to deliver individualized drug dosing in the real-world setting
- cefepime and risk of seizure in patients not receiving dosage adjustments for kidney impairment
- characterizing cefepime neurotoxicity: a systematic review
- high cefepime plasma concentrations and neurological toxicity in febrile neutropenic patients with mild impairment of renal function
- cefepime plasma concentrations and clinical toxicity: a retrospective cohort study
- an exploratory analysis of the ability of a cefepime trough concentration greater than 22 mg/l to predict neurotoxicity
- performance of the cockcroft-gault, mdrd, and new ckd-epi formulas in relation to gfr, age, and body size
- population pharmacokinetics of high-dose, prolonged-infusion cefepime in adult critically ill patients with ventilator-associated pneumonia
- population pharmacokinetic assessment and pharmacodynamic implications of pediatric cefepime dosing for susceptible-dose-dependent organisms
- pharmacokinetics and pharmacodynamics of cefepime in patients with various degrees of renal function

acknowledgements: j.a. roberts would like to acknowledge funding

key: cord-007726-bqlf72fe authors: rydell-törmänen, kristina; johnson, jill r. title: the applicability of mouse models to the study of human disease date: 2018-11-09 journal: mouse cell culture doi: 10.1007/978-1-4939-9086-3_1 sha: doc_id: 7726 cord_uid: bqlf72fe the laboratory mouse mus musculus has long been used as a model organism to test hypotheses and treatments related to understanding the mechanisms of disease in humans; however, for these experiments to be relevant, it is important to know the complex ways in which mice are similar to humans and, crucially, the ways in which they differ. 
in this chapter, an in-depth analysis of these similarities and differences is provided to allow researchers to use mouse models of human disease and primary cells derived from these animal models under the most appropriate and meaningful conditions. although there are considerable differences between mice and humans, particularly regarding genetics, physiology, and immunology, a more thorough understanding of these differences and their effects on the function of the whole organism will provide deeper insights into relevant disease mechanisms and potential drug targets for further clinical investigation. using specific examples of mouse models of human lung disease, i.e., asthma, chronic obstructive pulmonary disease, and pulmonary fibrosis, this chapter explores the most salient features of mouse models of human disease and provides a full assessment of the advantages and limitations of these models, focusing on the relevance of disease induction and their ability to replicate critical features of human disease pathophysiology and response to treatment. the chapter concludes with a discussion on the future of using mice in medical research with regard to ethical and technological considerations. although the genetic lineages of mice and humans diverged around 75 million years ago, these two species have evolved to live together, particularly since the development of agriculture. for millennia, mice (mus musculus) were considered to be pests due to their propensity to ravenously consume stored foodstuff (mush in ancient sanskrit means "to steal" [1] ) and their ability to adapt to a wide range of environmental conditions. since the 1700s, domesticated mice have been bred and kept as companion animals, and in victorian england, "fancy" mice were prized for their variations in coat color and comportment; these mouse strains were the forerunners to the strains used in the laboratory today. 
robert hooke performed the first recorded inquiry-driven experiments on mice in 1664, when he investigated the effects of changes in air pressure on respiratory function [2] . more recently, with data from the human genome project and sequencing of the mus musculus genome showing remarkable genetic homology between these species, as well as the advent of biotechnology and the development of myriad knockout and transgenic mouse strains, it is clear why the mouse has become the most ubiquitous model organism used to study human disease. in addition, their small size, rapid breeding, and ease of handling are all important advantages to scientists for practical and financial reasons. however, keeping in mind that mice are fellow vertebrates and mammals, there are ethical issues inherent to using these animals in medical research. this chapter will provide an overview of the important similarities and differences between mus musculus and homo sapiens and their relevance to the use of the mouse as a model organism and provide specific examples of the quality of mouse models used to investigate the mechanisms, pathology, and treatment of human lung diseases. we will then conclude with an assessment of the future of mice in medical research considering ethical and technological advances. as a model organism used to test hypotheses and treatments related to human disease, it is important to understand the complex ways in which mice are similar to humans, and crucially, the ways in which they differ. a clear understanding of these aspects will allow researchers to use mouse models of human disease and primary cells derived from mice under the most appropriate and meaningful conditions. in 2014, the encyclopedia of dna elements (encode) program published a comparative analysis of the genomes of homo sapiens and mus musculus [3] , as well as an in-depth analysis of the differences in the regulatory landscape of the genomes of these species [4] . 
encode, a follow-up to the human genome project, was implemented by the national human genome research institute (nhgri) at the national institutes of health in order to develop a comprehensive catalog of protein-coding and non-protein-coding genes and the regulatory elements that control gene expression in a number of species. this was achieved using a number of genomic approaches (e.g., rna-seq, dnase-seq, and chip-seq) to assess gene expression in over 100 mouse cell types and tissues; the data were then compared with the human genome. overall, these studies showed that although gene expression is fairly similar between mice and humans, considerable differences were observed in the regulatory networks controlling the activity of the immune system, metabolic functions, and responses to stress, all of which have important implications when using mice to model human disease. in essence, mice and humans demonstrate genetic similarity with regulatory divergence. specifically, there is a high degree of similarity in transcription factor networks but a great deal of divergence in the cis-regulatory elements that control gene transcription in the mouse and human genomes. moreover, the chromatin landscape in cell types of similar lineages in mouse and human is both developmentally stable and evolutionarily conserved [3]. of particular relevance to modeling human diseases involving the immune system, in its assessment of transcription factor networks, the mouse encode consortium revealed potentially important differences in the activity of ets1 in the mouse and human genome. although conserved between the two species, divergence in ets1 regulation may be responsible for discrepancies in the function of the immune system in mouse and human [4]. certainly, the biological consequences of these differences in gene expression and regulation between human and mouse invite further investigation. 
the anatomical and physiological differences between model organisms and humans can have profound impacts on interpreting experimental results. virtually every biological process under investigation in experimental studies involves at least one anatomical structure. to aid in interpretation, many anatomy compendia have been developed for model organisms; the most useful organize anatomical entities into hierarchies representing the structure of the human body, e.g., the foundational model of anatomy developed by the structural informatics group at the university of washington [5]. although an analysis of the myriad differences between mouse and human anatomy is beyond the scope of this chapter, a few of the most critical issues that have an impact on the interpretation of data from mouse experiments should be mentioned. the most obvious difference between mice and humans is size; the human body is about 2500 times larger than that of the mouse. size influences many aspects of biology, particularly the metabolic rate, which is correlated to body size in placental mammals through the relationship bmr = 70 × mass^0.75, where bmr is the basal metabolic rate (in kcal/day) and mass is body mass (in kg). thus, per unit of body mass, the mouse bmr is roughly seven times faster than that of an average-sized human [6]. this higher bmr has effects on thermoregulation, nutrient demand, and nutrient supply. as such, mice have greater amounts of metabolically active tissues (e.g., liver and kidney) and more extensive deposits of brown fat [6]. furthermore, mice more readily produce reactive oxygen species than do humans, which is an important consideration when modeling human diseases involving the induction of oxidative stress (i.e., aging, inflammation, and neurodegeneration) [6]. the lung provides an excellent example of the similarities and differences between human and mouse anatomy. 
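the allometric relationship above can be checked numerically. assuming a 25-g mouse and a 70-kg human (illustrative body masses, not values from the text), the mass-specific metabolic rate ratio works out to roughly seven:

```python
def bmr_kcal_per_day(mass_kg):
    """Kleiber-type allometric scaling used in the text: BMR = 70 * mass^0.75."""
    return 70.0 * mass_kg ** 0.75

mouse_specific = bmr_kcal_per_day(0.025) / 0.025   # kcal/kg/day for a 25-g mouse
human_specific = bmr_kcal_per_day(70.0) / 70.0     # kcal/kg/day for a 70-kg human
print(round(mouse_specific / human_specific, 1))   # ~7.3, the "seven times" claim
```

because the exponent is below 1, total bmr grows more slowly than mass, so the smaller animal always has the higher per-gram metabolic rate; the ratio reduces to (70/0.025)^0.25 ≈ 7.3.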
similar to the human organ, the mouse lung is subdivided into lobes of lung parenchyma containing a branching bronchial tree and is vascularized by the pulmonary circulation originating from the right ventricle. there are a number of subtle variations in this general structure between species, i.e., the number of lobes on the right and left, the branching pattern, and the distribution of cartilage rings around the large airways, but the most important differences between the mouse and human lung are related to the organism's size (airway diameter and alveolar size are naturally much smaller in the mouse) and respiratory rate. moreover, there are important differences in the blood supply of the large airways in humans versus mice [7]. specifically, the bronchial circulation (a branch of the high-pressure systemic circulation that arises from the aorta and intercostal arteries) supplies a minuscule proportion of the pulmonary tissue in mice (the trachea and bronchi) compared to humans; the majority of the lung parenchyma is supplied by the low-pressure, high-flow pulmonary circulation. in the mouse, these systemic blood vessels do not penetrate into the intraparenchymal airways, as they do in larger species [8]. this difference, although subtle, has important ramifications regarding the vascular supply of lung tumors, which in humans is primarily derived from the systemic circulation [9]. these differences may also have profound consequences when modeling human diseases involving the lung vasculature. the adaptive immune system evolved in jawed fish about 500 million years ago, well before the evolution of mammals and the divergence of mouse and human ancestral species [10]. many features of the adaptive immune system, including antigen recognition, clonal selection, antibody production, and immunological tolerance, have been maintained since they first arose in early vertebrates. 
however, the finer details of the mouse and human immune systems differ considerably, which is not surprising since these species diverged 75 million years ago [6]. while some have claimed that these differences mean that research into immunological phenomena in mice is not transferable to humans, as long as these differences are understood and acknowledged, the study of mouse immune responses can continue to be relevant. research on mice has been vital to the discovery of key features of both innate and adaptive immune responses; for example, the first descriptions of the major histocompatibility complex, the t cell receptor, and antibody synthesis were derived from experiments performed on mice [6]. the general structure of the immune system is similar in mice and humans, with similar mediators and cell types involved in rapid, innate immune responses (complement, macrophages, neutrophils, and natural killer cells) as well as adaptive immune responses informed by antigen-presenting dendritic cells and executed by b and t cells. however, due to the anatomical and physiological differences between these species as described above, divergence in key features of the immune system, such as the maintenance of memory t cells (related to the life span of the organism) and the commensal microbiota (related to the lifestyle of the organism), has arisen [11]. similar to what has been discovered regarding the genetics of mice and humans, i.e., broad similarities in structure but considerable differences in regulation, there are a number of known discrepancies in the regulation of innate and adaptive immunity in mice versus humans, including the balance of leukocyte subsets, t cell activation and costimulation, antibody subtypes and cellular responses to antibody, th1/th2 differentiation, and responses to pathogens (described in detail in table 1). 
in addition to these differences in immune cell functions, the expression of specific genes involved in immune responses also differs, particularly those for toll-like receptors, defensins, nk inhibitory receptors, thy-1, and many components of chemokine and cytokine signaling; additionally, differences between mouse strains are known to exist for many of these mediators [12]. another important consideration when using mice to perform immunological research (with a view to translating these findings to human medicine) is the availability of hundreds of strains of genetically modified mice that have enabled exquisitely detailed studies on immune cell function, regulation, and trafficking. many of these strains involve the expression of inducible cre or cas9, which allows for targeted knockdown or overexpression of key immune function-related genes in specific cell types at specific moments in time. however, it is important to note that drift between mouse colonies has long been known to occur. in fact, a recent report described the fortuitous discovery of a point mutation in the natural cytotoxicity receptor 1 (ncr1) gene in the c57bl/6 cd45.1 mouse strain, resulting in absent ncr1 expression. this mutation was found to have profound effects on the response of mice to viral infection, i.e., the mice were resistant to cytomegalovirus infection but more susceptible to influenza virus [23]. this cautionary tale highlights the importance of understanding the genetic evolution of laboratory strains of mice, the effects of these genetic and immunological changes on mouse biology, and the impact on the translation of these results to human medicine. in addition to the differences between mouse and human genetics, physiology, and immunology highlighted above, several factors must also be taken into account when performing in vitro assays using isolated mouse cells and applying these findings to our understanding of human disease. 
particularly with regard to stem cell research, it should be noted that the telomeres of mouse cells are five- to tenfold longer than human telomeres, resulting in greater replicative capacity [24]. there are also important differences in the regulation of pluripotency and stem cell differentiation pathways in humans and mice [25]. moreover, there are considerable species differences in the longevity of cultured cells; for example, mouse fibroblasts are capable of spontaneous immortalization in vitro, whereas human fibroblasts become senescent and ultimately fail to thrive in culture [26]. in summary, although there are considerable differences between mice and humans, constant improvement in the analytical techniques used to delineate these differences and their effects on whole-organism and cell function has provided vital information and contributed to our understanding of both murine and human biology. experimentation employing mouse models of human disease will continue to provide key insights into relevant disease mechanisms and potential drug targets for further clinical investigation. however, several important considerations must be taken into account when selecting a mouse model of human disease, as described in the following section, using mouse models of human lung disease to illustrate this point. the two most salient features of a mouse model of human disease are the accuracy of its etiology (it employs a physiologically relevant method of disease induction) and its presentation (its ability to recapitulate the features of human disease). the relevance of any given mouse model can be judged on the basis of these two criteria, and there is considerable variation within mouse models of human disease in this regard. 
as a full assessment of the advantages and limitations of all currently available mouse models of human disease would be prohibitively long and complex, here we have elected to assess the accuracy of currently available models of human lung diseases, i.e., asthma, chronic obstructive pulmonary disease, and pulmonary fibrosis, focusing on the relevance of disease induction in these models and their ability to replicate critical features of human disease pathophysiology and response to treatment. the first and foremost notion when modeling human disease in mice is to acknowledge the species differences, which are significant [27] . as described above, genetics, anatomy, physiology, and immunology differ between mice and humans, but despite these differences, mouse models of human disease are useful and necessary, as long as data interpretation is performed appropriately. an elegant example of differences between mice and humans that must be considered when designing a mouse model of human inflammatory lung disease is the key effector cell type in human asthma, i.e., mast cells. these leukocytes differ in granule composition as well as localization in the mouse and human airways [28] . mice mostly lack mast cells in the peripheral lung [29] , whereas humans have numerous mast cells of multiple subpopulations in the alveolar parenchyma [30] . another example is anatomy: in contrast to humans, mice lack an extensive pulmonary circulation, which may have significant effects on leukocyte adhesion and migration, and subsequently inflammation [31] . still, as long as these differences are taken into consideration, mouse models can be powerful tools in the discovery and exploitation of new targets for the treatment of human disease. the world health organization (who) defines asthma as a chronic disease characterized by recurrent attacks of breathlessness and wheezing, which may vary in severity and frequency from person to person. 
the disease is characterized by airway hyperresponsiveness, airway smooth muscle thickening, increased mucus secretion and collagen deposition, as well as prominent inflammation affecting both large and small airways [32]. nowadays, it is recognized that asthma is not a single homogeneous disease but rather several different phenotypes united by similar clinical symptoms [32, 33]. only a few animal species develop asthma naturally, including cats and horses [34, 35], whereas mice do not [31]. however, mice can be manipulated to develop a type of allergic airway inflammation, which is similar in many ways to the human disease, in response to different aeroallergens [36]. importantly, these models are capable of recapitulating only the allergic type of human asthma and have less relevance for other types of asthma (i.e., endotypes induced by medication, obesity, and air pollution). as with many human diseases, asthma has a complex and multifaceted etiology, where environmental factors, genetic susceptibility, and microbial colonization all contribute; thus, it is important to take strain differences into consideration. generations of inbreeding have created mouse strains that differ not only in coat color and disposition but also from a physiological, immunological, and genetic perspective. different strains may be more susceptible to allergic airway inflammation or pulmonary fibrosis, whereas others are more or less resistant. choosing the right strain to model a specific disease or pathologic event is thus essential. the most widely used strains for models of allergic airway inflammation are balb/c and c57bl/6. these strains differ regarding the type of immune response mounted to an inhaled allergen: c57bl/6 is generally considered a th1-skewed strain, whereas balb/c is regarded as a th2-skewed strain [36]. due to their strong th2 response, and subsequent development of robust asthmatic responses, balb/c mice have been commonly used to model asthma [37]. 
however, most humans do not express such a strongly th2-skewed immune system, suggesting this strain may not be the best model of human disease; instead, c57bl/6 may be more suitable, as immune responses in this strain are more similar to those of atopic human subjects [37]. furthermore, as c57bl/6 is the most commonly used strain for the development of genetically manipulated mice, using these mice allows for very specific investigations into disease pathology; thus, this strain is increasingly used in models of human lung disease. besides the genetic differences in the mouse strains used in these models, the etiology (the method of disease induction) of commonly used models of asthma is highly variable. in humans with allergic asthma, environmental allergen exposure occurs at the airway mucosa; the immune response is coordinated in the bronchopulmonary lymph nodes, and the t cells, macrophages, and eosinophils recruited as part of this response travel to the lung, where they mediate the cardinal features of asthma: airway inflammation, structural remodeling of the airway wall, and airway hyperreactivity [38]. ideally, these features should be found in a physiologically relevant mouse model of asthma. however, for the sake of cost and convenience, early mouse models of asthma used the surrogate protein ovalbumin (ova) [31] rather than an environmental allergen to induce an immune response; this approach also requires the use of a powerful th2-polarizing adjuvant such as alum delivered via the intraperitoneal route, followed by ova nebulization, a clear divergence from the etiology of human asthma [36]. in terms of disease presentation, mice develop some hallmarks of asthma, including eosinophilic airway inflammation, goblet cell metaplasia, and increased airway smooth muscle density [31]. after the cessation of ova exposure, most of the remodeling resolves, although some structural alterations remain up to 1 month after the last challenge [39]. 
based on these attributes, the ova model is primarily a model to investigate the initiation of inflammation, rather than the chronic progression and maintenance of inflammation [31]. a clear advantage of the ova model is the number of studies where it is used; both the pros and cons are familiar. it is easy to find a suitable protocol, and the model is readily accessible and flexible regarding the number of sensitizations and allergen doses. the model is relatively easy to reproduce, as ova and different adjuvants are easily obtained. however, the resolution of remodeling following the cessation of allergen provocations is a disadvantage, as is the practical problem with the nebulization of an allergen - it ends up in the mouse's coat and is ingested during grooming, potentially resulting in systemic exposure (this is particularly relevant in models employing systemic, intraperitoneal sensitization). in addition, concerns have been raised regarding the use of adjuvants to induce the immunological response, as well as the clinical relevance of ova as an allergen; these concerns have driven the development of more clinically relevant allergens and models [31]. the common environmental aeroallergen house dust mite (hdm) extract is increasingly used to initiate disease in mouse models of allergic airway inflammation, as it is a common human allergen (around 50% of asthmatics are sensitized to hdm [40]) that evokes asthma attacks and other allergic responses in susceptible individuals. in addition, hdm has inherent allergenic properties, likely due to components with protease activity [40], so there is no need to use an adjuvant, thus improving the etiological similarity of these models with the clinical situation [41].
in contrast to ova, prolonged exposure to hdm (up to 7 weeks) induces asthma-like severe airway inflammation with prominent eosinophilia, severe hyperreactivity to methacholine, and robust remodeling of the airway wall [41], i.e., the presentation of chronic respiratory hdm exposure in mice effectively recapitulates the key features of human allergic asthma. importantly, the airway structural changes induced by chronic hdm exposure, such as increased collagen deposition, airway smooth muscle thickening, and microvascular alterations, persist for at least 4 weeks after the cessation of hdm exposure [42], another commonality with human asthma, in which airway remodeling is currently considered to be irreversible. thus, the advantages of using hdm as the allergen in mouse models of asthma are the clinical relevance of the allergen [43] and the route of delivery via the respiratory tract. moreover, studies have shown that the type of inflammation and characteristics of tissue remodeling are relatively similar to those seen in human asthmatics [35, 41, 43]. one disadvantage is the complexity of hdm extract; as a consequence of this complexity, variations exist in some components between batches, particularly regarding the content of lipopolysaccharide, so reproducibility in these studies may be problematic. in similarity to hdm, the following models were developed to be as clinically relevant as possible, as many patients suffer from allergy toward cockroach allergen, molds, and other environmental irritants. a common feature of these allergens is their complex nature, as they commonly consist of a mix of different allergic epitopes and fragments. this complexity is most likely why the immunological reaction in mice is relatively similar to that seen in asthmatics [44]. cockroach allergen (cra) is a common allergen, known to induce asthma in susceptible individuals; thus, it shares with hdm the advantage of being highly clinically relevant [45].
cra induces peribronchial inflammation with significant eosinophilic inflammation and transient airway hyperresponsiveness, both of which can be increased by repeated administrations of the allergen [45]. colonization of the airways with aspergillus fumigatus causes allergic bronchopulmonary aspergillosis (abpa), but allergens from aspergillus fumigatus can also induce asthma similar to other allergens [46]. the reaction to aspergillus allergens is robust, and often no adjuvants are needed to elicit inflammation [46]. in addition to aspergillus, other fungi such as penicillium and alternaria can also induce asthma in humans and have been used to model disease in mice [47]. a common difficulty with these allergens is the method of administration, as the physiological route is believed to be the inhalation of dry allergens; mimicking this route with a nebulizer introduces the risk of the animals ingesting the allergen and thus causing systemic responses [47]. exacerbations of asthma are defined as the worsening of symptoms, prompting an adjustment in treatment, and are believed to be associated with increased inflammation in the distal airways. clinically, exacerbations are believed to be induced by infections (most common), allergen exposure, or pollutants, which can be modeled in different ways [48, 49]:
1. infections with viruses and bacteria or exposure to proteins/dna/rna derived from these microbes.
2. administration of a high dose of allergen in a previously sensitized animal.
3. exposure to environmental pollutants, such as diesel exhaust or ozone.
modeling exacerbations adds a layer of complexity, as robust ongoing allergic airway inflammation needs to be established first, before challenge with the exacerbating agent.
both the ova and hdm models are used in this respect, and in both cases chronic protocols extending for several weeks before triggering an exacerbation have been used [48]. chronic obstructive pulmonary disease (copd) is characterized by chronic airway obstruction, in contrast to asthma, where the obstruction is reversible (particularly in response to bronchodilator treatment). clinically, in copd, chronic bronchitis and emphysema can occur either separately or in combination. copd is almost always associated with either first- or secondhand tobacco smoking or, in rare cases, with a deficiency in the production of α1-antitrypsin (a serpin that prevents elastin breakdown as a result of neutrophil degranulation) [50]. the etiology of copd is highly complex, and the disease is believed to develop after many years of smoking in combination with other known factors such as genetic susceptibility or environmental factors [51]. as in asthma, inflammation is a major component in copd, but the leukocyte profile is very different: the most prominent players in copd-related inflammation are neutrophils and, to some degree, macrophages [51]. due to the complex etiology of copd, it is difficult to recapitulate all aspects of this disease in a single model, so in most cases the aim is to induce copd-like lesions by exposing mice to tissue-damaging substances (usually cigarette smoke) or to mimic emphysema by the administration of tissue-degrading enzymes [27, 51]. clearly, mice do not smoke cigarettes on their own, so to model copd by cigarette smoke (cs) inhalation, the mice need to be exposed to unfiltered cs in an induction chamber; moreover, in an attempt to better model the chronic aspects of copd, this needs to be performed for a prolonged period of time. mice are very tolerant to cs, but eventually (over a period of several weeks), cs induces pulmonary neutrophilic inflammation that is associated with some degree of tissue degradation and destruction [51].
an important advantage of this model is the fact that cs is the actual irritant responsible for disease in humans, and mice develop several features similar to the clinical disease, making this model highly clinically relevant [27]. a significant drawback is the self-limitation of the model - the pathological changes do not progress after the cessation of cs exposure [51]. furthermore, the exposure time needed for mice to develop copd-like pathology is extensive; studies have shown that an exposure protocol of 5 days per week for a minimum of 3 months is needed to generate robust structural changes to the lung [52]. the pathological picture in copd is complex and varies greatly between patients, commonly encompassing chronic bronchitis and bronchiolitis, emphysema, fibrosis, and airway obstruction. although mice develop some of these symptoms when exposed to cs, they do not develop all the symptoms of human disease; thus, cs has advantages as a model but fails to mimic the complexity of the clinical situation and disease presentation [27]. other models of copd rely on the administration of proteases (protein-degrading enzymes) that are believed to be involved in the pathology of this disease in a subset of patients, such as elastin-degrading elastase. this approach mimics the emphysematous changes seen in copd, but the pathological process underlying tissue destruction is likely very different compared to the clinical situation [51], as very few patients show evidence of elastase dysregulation [27]. however, if the aim of the study is to investigate the general effect of protease-induced tissue destruction and regeneration, then this is a highly relevant method [51]. some studies on copd have also used genetically modified animals, such as mice overexpressing collagenase, which results in tissue destruction without inflammation or fibrosis, with an end result fairly similar to the type of emphysema observed in copd [53].
pulmonary fibrosis, the accumulation of fibrotic tissue within the alveolar parenchyma, is merely a symptom of disease, and the etiology of this pathology in humans varies greatly [54]. the most enigmatic class is perhaps the idiopathic interstitial pneumonias, especially idiopathic pulmonary fibrosis (ipf). ipf is a debilitating and progressive disease with a grave prognosis, characterized by progressive fibrosis believed to reflect aberrant tissue regeneration [55]. as the reason behind this defective repair is unknown, although a combination of immunological, genetic, and environmental factors is suspected, it is very difficult to model disease in a clinically relevant fashion [56]. the most common method used to model pulmonary fibrosis in mice is administration of the chemotherapeutic agent bleomycin; this agent is known to cause pulmonary fibrosis in humans as well, but it may not accurately reflect the true etiology of most cases of human disease. the strain of choice is c57bl/6, as it is prone to developing pulmonary fibrosis, whereas balb/c is relatively resistant, a feature believed to reflect the cytokine response following cellular stress and damage [57]. bleomycin administration can be performed locally or systemically, producing very different results. the most common model of pulmonary fibrosis is a single intranasal or intratracheal administration of bleomycin, with analysis 3 to 4 weeks later. during this time, the drug causes acute tissue damage in a restricted area of the lung (where the solution ends up during administration), followed by intense inflammation in this area and subsequent fibrosis, which gradually resolves within weeks. however, if older mice are used, the fibrosis will persist longer than in younger mice, which is in accordance with clinical ipf, where the majority of the patients are 65 years of age or older [56, 58]. a great advantage of this model is how well-characterized it is.
in addition, local administration is labor-effective, as only one administration is required and the result is highly reproducible. the fibrosis is robust, only affects the lungs, and the accumulation of extracellular matrix can be easily measured using standard techniques [58]. furthermore, as it is used throughout the world, studies performed in different labs and by different groups can be compared relatively easily. unfortunately, the intense pulmonary inflammation may be lethal, and fatalities are to be expected with this model [59], representing an important ethical limitation. furthermore, fibrosis is heterogeneous - it develops where the bleomycin solution is deposited. the solution usually deposits within the central lung, a localization that is not in agreement with the clinical situation, where fibrosis is located in the more distal regions of the lung parenchyma. in addition, the fibrosis that develops as a result of severe tissue damage is self-limiting and reversible, unlike what is observed clinically [58]. the severe degree of tissue damage induced by bleomycin may in fact be more relevant for modeling acute lung injury (ali) or acute respiratory distress syndrome (ards). bleomycin can also be administered systemically, through intravenous or subcutaneous injection. in contrast to local administration, this route requires multiple administrations and is thus more labor-intensive [58]. some studies have described the usage of osmotic mini-pumps, where bleomycin is slowly administered over a short period of time, and then fibrosis continues to develop over subsequent weeks [60]. irrespective of the route of delivery, systemic administration results in more homogenous fibrosis, affecting the entire lung through the pulmonary endothelium and persisting much longer than following local administration [61].
the main advantages of systemic administration are that inflammation is limited, while the fibrosis is more apparent and displays a more distal pattern, all of which mimics the clinical situation relatively well. the multiple administrations allow for lower doses with each injection; this is less stressful to the animals and results in little to no mortality [61] and is thus more ethically acceptable. a major disadvantage with this model is that it takes time for fibrosis to develop [58], which may be the reason it is relatively rarely used, and thus the pathological development is less well understood. in addition, as ipf is a local disease, local administration of the etiologic agent may better mimic the clinical reality [56]. the administration of fluorescein isothiocyanate (fitc) induces focal inflammation, primarily involving mononuclear cells and neutrophils, localized in areas where the fitc solution is deposited [58]. antibodies against fitc can be detected after 1 week, and the fibrosis persists for up to 5 months after instillation [58]. the benefits of this model are mainly related to the persistent fibrosis that does not appear to be self-limiting, thus reflecting the clinical situation, and it is also very easy to determine which part of the lung has been exposed to fitc, as the molecule is fluorescent [58]. it is also an advantage that both c57bl/6 and balb/c mice are susceptible and develop fibrosis following fitc administration [56]. the disadvantages of this model include profound variability due to differences between batches of fitc, as well as in the method used to prepare the solution before instillation. importantly, given the characteristics of the etiologic agent used to induce this model of ipf, this model is considered a very artificial system with limited clinical relevance [56]. adenovirus vectors have been used to overexpress the pro-fibrotic cytokine transforming growth factor (tgf)-β, which results in pulmonary fibrosis.
as tgf-β overexpression in the lungs is known to be crucial in the development of fibrosis in humans [62], this model mimics an important feature of disease etiology. however, the delivery system has some drawbacks, as the virus itself initiates an immune response. moreover, adenoviruses display significant tropism for epithelial cells and rarely infect other cell types such as fibroblasts [58], which are the cells meant to be targeted in this model. as tgf-β has major effects on fibroblast biology, the main feature of this model is the effect of epithelium-derived tgf-β on fibroblasts and myofibroblasts, resulting in the deposition of ecm proteins and areas of dense fibrosis [63]. an advantage of this model is the relatively low degree of inflammation, as well as what appears to be a direct effect on fibroblasts/myofibroblasts [63], which is in accordance with the clinical situation (as we understand it today). silica administration induces a similar pathology in mouse lungs as in humans exposed to silica, and, as is also observed in human silica-induced fibrosis, structural remodeling persists when administration is halted [56]. following the administration of silica particles, fibrotic nodules develop in mouse lungs, with considerable resemblance to the human lesions that develop after exposure to mineral fibers [56]. the fibrotic response is accompanied by a limited inflammatory response, and different pro-fibrotic cytokines such as tgf-β, platelet-derived growth factor, and il-10 are involved in disease development, which is in accordance with the clinical situation [56]. another advantage is that nodules develop around silica fibers, and these fibers are easy to identify by light microscopy. the response in this model is strain-dependent, with c57bl/6 mice being the most susceptible. the main drawbacks are the time required to establish disease, i.e., 30-60 days, and the need for special equipment to aerosolize the silica particles.
however, since the route of administration, the driving etiologic agent, and the resulting pathobiology are all similar to the characteristics of this subtype of pulmonary fibrosis [56, 58], the silica exposure model can be considered to have very good clinical relevance.

7 what does the future hold for mouse models of human disease?

medical research using experimental animals (not only mice but other animals including rats, guinea pigs, zebrafish, and fruit flies) has greatly contributed to many important scientific and medical advances in the past century and will continue to do so into the near future. these advances have contributed to the development of new medicines and treatments for human disease and have therefore played a vital role in increasing the human life span and improving quality of life. despite the acknowledged benefits of performing research using experimental animals, a number of considerations must be made before embarking on this type of research. of course, the financial aspects of conducting this type of work are an important limitation, as the costs of purchasing and housing mice can be prohibitive, especially when genetically modified mice and colony maintenance are required for the study. the practicalities of working with animals such as mice may also be an issue, as this type of work requires specialized facilities, equipment, and staff to ensure studies are carried out in a manner that is safe for both the researchers and the animals. moreover, as discussed in detail in this chapter, the relevance of the selected animal model to human disease must be carefully evaluated to ensure that these experiments provide robust results that are translatable to human health and disease. another important and demanding aspect of biomedical research using animals is the ethics of imposing pain and suffering on live animals.
although there has been a considerable reduction in the numbers of animals used in research in the last 30 years, animal research remains a vital part of biomedical research. however, no responsible scientist wants to cause unnecessary suffering in experimental animals if it can be avoided, so scientists have accepted controls on the use of animals for medical research. in the uk, this ethical framework has been enshrined in law, i.e., the animals (scientific procedures) act 1986. this legislation requires that applications for a project license to perform research involving the use of "protected" animals (including all vertebrates and cephalopods) must be fully assessed with regard to any harm imposed on the animals. this involves a detailed examination of the proposed procedures and experiments, and the numbers and types of animal used, with robust statistical calculations to support these numbers. the planned studies are then considered in light of the potential benefits of the project. both within and outside the uk, approval for a study involving protected animals also requires an internal ethical review process, usually conducted by the research institution where the work is taking place, with the aim of promoting animal welfare by ensuring the work will be carried out in an ethical manner and that the use of animals is justified. additionally, the uk has a national animal use reduction strategy supported by the national centre for the replacement, refinement and reduction of animals in research (nc3rs; london, uk). this consortium was established in 2004 to promote and develop high-quality research that takes the principles of replacement, refinement, and reduction (the 3rs) into account. 
replacement strategies often involve the use of alternative, non-protected species (e.g., zebrafish, fruit flies, flatworms) and in vitro correlates (two-dimensional cell culture or three-dimensional organoids containing multiple cell types) to test hypotheses and assess the effects of therapeutic interventions. the main obstacle with studies on non-protected animals is the difficulty of accurately mimicking the complex physiological systems involved in human health and disease, as described in detail above. for example, the fruit fly drosophila melanogaster is an excellent model organism for studies on genetic diseases, aging, and pathogen-borne illnesses but may be less relevant for studies on complex lung diseases. importantly, model organisms such as fruit flies, zebrafish, and flatworms do not possess lungs, which somewhat limits the translatability of research on these animals in the field of respiratory disease. as such, it is likely that rodents will remain the model organism of choice for studies into lung disease for some time to come. there has been considerable progress recently in imitating single organs such as the liver, lung, and brain in vitro using multiple cell types and a physical scaffold. as an important advantage, these in vitro tests have replaced a large number of rodents in initial drug discovery experiments, while also speeding up the process [64]. these studies still require further refinement and validation to establish them as suitable models for an entire organ; importantly, these in vitro organoids cannot take into account interactions between organ systems in complex, multisystem diseases such as copd. refinement involves selecting the most clinically relevant model for the disease available, informed by the discussion above on closely recapitulating the etiologic agent and disease pathobiology associated with clinical cases. another important factor is refining the management of pain.
an assessment of the procedures used and the effects of the substance on the animal, as well as the degree of handling, restraint, and analgesia, are other important aspects of refinement. this standard of animal care is achieved through strict regulations and controls on how personnel are trained to carry out experiments on live animals. adequate training is an important aspect of refinement and should be reviewed and improved on an ongoing basis. moreover, refinement can be achieved by improving animal housing through environmental enrichment, e.g., providing a place for mice to hide in the cage and housing social animals such as mice in appropriately sized groups. these simple changes can improve the physiological and behavioral status of research animals; this not only increases animal well-being but also contributes to the quality of the experimental results by reducing stress levels. the 3rs aspect of reduction focuses on the statistical power of experiments and is supported by following the animal research: reporting of in vivo experiments (arrive) guidelines, originally published in plos biology in 2010. these guidelines provide a framework to improve the reporting of research performed on live animals by maximizing the quality of the scientific data and by minimizing unnecessary studies. the arrive guidelines provide a checklist of aspects that must be considered in good quality research using live animals. the guidelines are most appropriate for comparative studies involving two or more groups of experimental animals with at least one control group, but they also apply to studies involving drug dosing in which a single animal is used as its own control (within-subject experiments).
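the power and sample size calculations referred to here can be illustrated with the standard two-sample normal-approximation formula. the sketch below is not part of the arrive guidelines themselves, and the function name, effect size, and variability values are hypothetical, chosen only to show the shape of such a calculation:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate animals needed per group to detect a mean difference
    `delta` between two groups with common standard deviation `sigma`,
    using the normal approximation to the two-sample comparison:
    n = 2 * ((z_{1-alpha/2} + z_{power}) * sigma / delta)^2."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = z.inv_cdf(power)           # desired statistical power
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)

# detecting a one-standard-deviation difference at 80% power
print(n_per_group(delta=1.0, sigma=1.0))  # -> 16 animals per group
```

halving the detectable effect size roughly quadruples the required group size, which is why a realistic estimate of `delta` and `sigma` from pilot data matters so much for reduction.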
the guidelines provide recommendations on what should be considered when preparing to report on the results of experiments involving live animals, i.e., by providing a concise but thorough background on the scientific theory and why and how animals were used to test a hypothesis, a statement on ethical approvals and study design including power and sample size calculations, a clear description of the methods used to ensure repeatability, objective measurements of outcomes and adverse effects, and interpretation of the results in light of the available literature and the limitations of the study. in addition to the positive impact of the arrive guidelines on reducing the number of animals used in experiments, this checklist provides an easy-to-follow roadmap on what is required for good quality reporting of experimental results. in conclusion, the use of animals in research will continue to be an important aspect of medical research, and these procedures can be ethically justified provided the proper controls are in place. the benefits of animal research have been vital to the progress of medical science; abandoning these studies would have severe negative consequences on human health. by considering aspects such as the 3rs and the arrive guidelines in planning experiments involving live animals, the number of animals used, and the suffering imposed on these animals for the benefit of human health, can be minimized. this requires a strong regulatory framework such as that found in the uk and many other countries, as well as an ongoing public debate on the advantages and limitations of animal experimentation.
use of house mice in biomedical research
the laboratory mouse
a comparative encyclopedia of dna elements in the mouse genome
principles of regulatory information conservation between mouse and human
of mice and men: aligning mouse and human anatomies
mouse models of human disease: an evolutionary perspective
structure and composition of pulmonary arteries, capillaries, and veins
angiogenesis in the mouse lung
lung cancer perfusion: can we measure pulmonary and bronchial circulation simultaneously?
origin and evolution of the adaptive immune system: genetic events and selective pressures
the evolutionary basis for differences between the immune systems of man, mouse, pig and ruminants
of mice and not men: differences between mouse and human immunology
gender dimorphism in differential peripheral blood leukocyte counts in mice using cardiac, tail, foot, and saphenous vein puncture methods
species differences in the expression of major histocompatibility complex class ii antigens on coronary artery endothelium: implications for cell-mediated xenoreactivity
icos is critical for cd40-mediated antibody class switching
homozygous loss of icos is associated with adult-onset common variable immunodeficiency
b7-h3: a costimulatory molecule for t cell activation and ifn-gamma production
global gene regulation during activation of immunoglobulin class switching in human b cells
selective loss of type i interferon-induced stat4 activation caused by a minisatellite insertion in mouse stat2
regulation of human helper t cell subset differentiation by cytokines
functional dichotomy in the cd4+ t cell response to schistosoma mansoni
what animal models teach humans about tuberculosis
cutting edge: check your mice - a point mutation in the ncr1 locus identified in cd45.1 congenic mice with consequences in mouse susceptibility to infection
comparative computational analysis of pluripotency in human and mouse stem cells
the transition from primary culture to spontaneous immortalization in mouse fibroblast populations
studying human respiratory disease in animals - role of induced and naturally occurring models
role of mast cells in allergic and non-allergic immune responses: comparison of human and murine data
allergic airway inflammation induces migration of mast cell populations into the mouse airway
novel site-specific mast cell subpopulations in the human lung
animal models of asthma: reprise or reboot?
emerging molecular phenotypes of asthma
allergens in veterinary medicine
immunologic responses following respiratory sensitization to house dust mite allergens in mice
understanding asthma using animal models
use of the cockroach antigen model of acute asthma to determine the immunomodulatory role of early exposure to gastrointestinal infection
t cell homing to epithelial barriers in allergic disease
allergic airway inflammation initiates long-term vascular remodeling of the pulmonary circulation
respiratory allergy caused by house dust mites: what do we really know?
continuous exposure to house dust mite elicits chronic airway inflammation and structural remodeling
induction of vascular remodeling in the lung by chronic house dust mite exposure
lung responses in murine models of experimental asthma: value of house dust mite over ovalbumin sensitization
a novel mouse model of experimental asthma
temporal role of chemokines in a murine model of cockroach allergen-induced airway hyperreactivity and eosinophilia
animal models of allergic bronchopulmonary aspergillosis
murine models of airway fungal exposure and allergic sensitization
mouse models of acute exacerbations of allergic asthma
mouse models of severe asthma: understanding the mechanisms of steroid resistance, tissue remodelling and disease exacerbation
novel variants of serpin1a gene: interplay between alpha1-antitrypsin deficiency and chronic obstructive pulmonary disease
different lung responses to cigarette smoke in two strains of mice sensitive to oxidants
collagenase expression in the lungs of transgenic mice causes pulmonary emphysema
tissue remodelling in pulmonary fibrosis
idiopathic pulmonary fibrosis
exploring animal models that resemble idiopathic pulmonary fibrosis
the role of mouse strain differences in the susceptibility to fibrosis: a systematic review
murine models of pulmonary fibrosis
highly selective endothelin-1 receptor a inhibition prevents bleomycin-induced pulmonary inflammation and fibrosis in mice
(r)-resolvin d1 ameliorates bleomycin-induced pulmonary fibrosis in mice
extracellular matrix alterations and acute inflammation; developing in parallel during early induction of pulmonary fibrosis
smad3 signaling involved in pulmonary fibrosis and emphysema
adenovector-mediated gene transfer of active transforming growth factor-beta1 induces prolonged severe fibrosis in rat lung
the ethics of animal research. talking point on the use of animals in scientific research

key: cord-017728-yazo0lga authors: brauer, fred title: compartmental models in epidemiology date: 2008 journal: mathematical epidemiology doi: 10.1007/978-3-540-78911-6_2 sha: doc_id: 17728 cord_uid: yazo0lga

we describe and analyze compartmental models for disease transmission. we begin with models for epidemics, showing how to calculate the basic reproduction number and the final size of the epidemic. we also study models with multiple compartments, including treatment or isolation of infectives. we then consider models including births and deaths in which there may be an endemic equilibrium and study the asymptotic stability of equilibria. we conclude by studying age of infection models which give a unifying framework for more complicated compartmental models. we will be concerned both with epidemics, which are sudden outbreaks of a disease, and endemic situations, in which a disease is always present. epidemics such as the 2002 outbreak of sars, the ebola virus and avian flu outbreaks are events of concern and interest to many people.
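the final size of the epidemic mentioned in the abstract satisfies a simple relation involving the basic reproduction number. the sketch below is not code from the chapter; it solves the final size relation by fixed-point iteration, assuming r0 > 1 and an almost entirely susceptible initial population:

```python
from math import exp

def final_size(r0, tol=1e-12):
    """Solve the final size relation r = 1 - exp(-r0 * r) by fixed-point
    iteration, where r is the fraction of the population eventually
    infected. Assumes r0 > 1 (for r0 <= 1 the only solution is r = 0)."""
    r = 0.5  # initial guess
    while True:
        r_next = 1.0 - exp(-r0 * r)
        if abs(r_next - r) < tol:
            return r_next
        r = r_next

print(round(final_size(1.5), 3))  # -> 0.583
print(final_size(3.0))            # roughly 0.94: about 94% eventually infected
```

note the counterintuitive feature this exposes: even for large r0 a fraction of the population escapes infection entirely, because the epidemic runs out of susceptibles before reaching everyone.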
the 1918 spanish flu epidemic caused millions of deaths, and a recurrence of a major influenza epidemic is a dangerous possibility. an introduction of smallpox is of considerable concern to government officials dealing with terrorism threats. an endemic situation is one in which a disease is always present. the prevalence and effects of many diseases in less developed countries are probably less well-known but may be of even more importance. every year millions of people die of measles, respiratory infections, diarrhea and other diseases that are easily treated and not considered dangerous in the western world. diseases such as malaria, typhus, cholera, schistosomiasis, and sleeping sickness are endemic in many parts of the world. the effects of high disease mortality on mean life span and of disease debilitation and mortality on the economy in afflicted countries are considerable. our goal is to provide an introduction to mathematical epidemiology, including the development of mathematical models for the spread of disease as well as tools for their analysis. scientific experiments usually are designed to obtain information and to test hypotheses. experiments in epidemiology with controls are often difficult or impossible to design and even if it is possible to arrange an experiment there are serious ethical questions involved in withholding treatment from a control group. sometimes data may be obtained after the fact from reports of epidemics or of endemic disease levels, but the data may be incomplete or inaccurate. in addition, data may contain enough irregularities to raise serious questions of interpretation, such as whether there is evidence of chaotic behaviour [12] . hence, parameter estimation and model fitting are very difficult. these issues raise the question of whether mathematical modeling in epidemiology is of value. 
our response is that mathematical modeling in epidemiology provides understanding of the underlying mechanisms that influence the spread of disease and, in the process, it suggests control strategies. in fact, models often identify behaviours that are unclear in experimental data -often because data are non-reproducible and the number of data points is limited and subject to errors in measurement. for example, one of the fundamental results in mathematical epidemiology is that most mathematical epidemic models, including those that include a high degree of heterogeneity, usually exhibit "threshold" behaviour. in epidemiological terms this can be stated as follows: if the average number of secondary infections caused by an average infective, called the basic reproduction number, is less than one a disease will die out, while if it exceeds one there will be an epidemic. this broad principle, consistent with observations and quantified via epidemiological models, has been consistently used to estimate the effectiveness of vaccination policies and the likelihood that a disease may be eliminated or eradicated. hence, even if it is not possible to verify hypotheses accurately, agreement with hypotheses of a qualitative nature is often valuable. expressions for the basic reproduction number for hiv in various populations have been used to test the possible effectiveness of vaccines that may provide temporary protection by reducing either hiv-infectiousness or susceptibility to hiv. models are used to estimate how widespread a vaccination plan must be to prevent or reduce the spread of hiv. in the mathematical modeling of disease transmission, as in most other areas of mathematical modeling, there is always a trade-off between simple models, which omit most details and are designed only to highlight general qualitative behaviour, and detailed models, usually designed for specific situations including short-term quantitative predictions. 
detailed models are generally difficult or impossible to solve analytically and hence their usefulness for theoretical purposes is limited, although their strategic value may be high. in these notes we describe simple models in order to establish broad principles. furthermore, these simple models have additional value as they are the building blocks of models that include more detailed structure. many of the early developments in the mathematical modeling of communicable diseases are due to public health physicians. the first known result in mathematical epidemiology is a defense of the practice of inoculation against smallpox in 1760 by daniel bernoulli, a member of a famous family of mathematicians (eight spread over three generations) who had been trained as a physician. the first contributions to modern mathematical epidemiology are due to p.d. en'ko between 1873 and 1894 [11], and the foundations of the entire approach to epidemiology based on compartmental models were laid by public health physicians such as sir r.a. ross, w.h. hamer, a.g. mckendrick and w.o. kermack between 1900 and 1935, along with important contributions from a statistical perspective by j. brownlee. a particularly instructive example is the work of ross on malaria. dr. ross was awarded the second nobel prize in medicine for his demonstration of the dynamics of the transmission of malaria between mosquitoes and humans. although his work received immediate acceptance in the medical community, his conclusion that malaria could be controlled by controlling mosquitoes was dismissed on the grounds that it would be impossible to rid a region of mosquitoes completely and that in any case mosquitoes would soon reinvade the region.
after ross formulated a mathematical model that predicted that malaria outbreaks could be avoided if the mosquito population could be reduced below a critical threshold level, field trials supported his conclusions and led to sometimes brilliant successes in malaria control. however, the garki project provides a dramatic counterexample. this project worked to eradicate malaria from a region temporarily. however, people who have recovered from an attack of malaria have a temporary immunity against reinfection. thus elimination of malaria from a region leaves the inhabitants of this region without immunity when the campaign ends, and the result can be a serious outbreak of malaria. we will begin with an introduction to epidemic models. next, we will incorporate demographic effects into the models to explore endemic states, and finally we will describe models with infectivity depending on the age of infection. our approach will be qualitative. by this we mean that rather than attempting to find explicit solutions of the systems of differential equations which will form our models we will be concerned with the asymptotic behaviour, that is, the behaviour as t → ∞ of solutions. this material is meant to be an introduction to the study of compartmental models in mathematical epidemiology. more advanced material may be found in many other sources, including chaps. 5-9 of this volume, the case studies in chaps. 11-14, and [2, 4-6, 9, 17, 29, 35 ]. an epidemic, which acts on a short temporal scale, may be described as a sudden outbreak of a disease that infects a substantial portion of the population in a region before it disappears. epidemics usually leave many members untouched. often these attacks recur with intervals of several years between outbreaks, possibly diminishing in severity as populations develop some immunity. this is an important aspect of the connection between epidemics and disease evolution. 
the book of exodus describes the plagues that moses brought down upon egypt, and there are several other biblical descriptions of epidemic outbreaks. descriptions of epidemics in ancient and medieval times frequently used the term "plague" because of a general belief that epidemics represented divine retribution for sinful living. more recently some have described aids as punishment for sinful activities. such views have often hampered or delayed attempts to control this modern epidemic. there are many biblical references to diseases as historical influences, such as the decision of sennacherib, the king of assyria, to abandon his attempt to capture jerusalem about 700 bc because of the illness of his soldiers (isaiah 37:36-38). the fall of empires has been attributed directly or indirectly to epidemic diseases. in the second century ad the so-called antonine plagues (possibly measles and smallpox) invaded the roman empire, causing drastic population reductions and economic hardships. these led to disintegration of the empire because of disorganization, which facilitated invasions of barbarians. the han empire in china collapsed in the third century ad after a very similar sequence of events. the defeat of a population of millions of aztecs by cortez and his 600 followers can be explained in part by a smallpox epidemic that devastated the aztecs but had almost no effect on the invading spaniards thanks to their built-in immunities. the aztecs were not only weakened by disease but also confounded by what they interpreted as a divine force favoring the invaders. smallpox then spread southward to the incas in peru and was an important factor in the success of pizarro's invasion a few years later. smallpox was followed by other diseases such as measles and diphtheria imported from europe to north america. in some regions, the indigenous populations were reduced to one tenth of their previous levels by these diseases.
between 1519 and 1530 the indian population of mexico was reduced from 30 million to 3 million. the black death spread from asia throughout europe in several waves during the fourteenth century, beginning in 1346, and is estimated to have caused the death of as much as one third of the population of europe between 1346 and 1350. the disease recurred regularly in various parts of europe for more than 300 years, notably as the great plague of london of 1665-1666. it then gradually withdrew from europe. as the plague struck some regions harshly while avoiding others, it had a profound effect on political and economic developments in medieval times. in the last bubonic plague epidemic in france (1720-1722), half the population of marseilles, 60% of the population in nearby toulon, 44% of the population of arles and 30% of the population of aix and avignon died, but the epidemic did not spread beyond provence. the historian w.h. mcneill argues, especially in his book [26] , that the spread of communicable diseases has frequently been an important influence in history. for example, there was a sharp population increase throughout the world in the eighteenth century; the population of china increased from 150 million in 1760 to 313 million in 1794 and the population of europe increased from 118 million in 1700 to 187 million in 1800. there were many factors involved in this increase, including changes in marriage age and technological improvements leading to increased food supplies, but these factors are not sufficient to explain the increase. demographic studies indicate that a satisfactory explanation requires recognition of a decrease in the mortality caused by periodic epidemic infections. this decrease came about partly through improvements in medicine, but a more important influence was probably the fact that more people developed immunities against infection as increased travel intensified the circulation and co-circulation of diseases. 
perhaps the first epidemic to be examined from a modeling point of view was the great plague in london (1665-1666). the plague was one of a sequence of attacks beginning in the year 1346 of what came to be known as the black death. it is now identified as the bubonic plague, which had actually invaded europe as early as the sixth century during the reign of the emperor justinian of the roman empire and continued for more than three centuries after the black death. the great plague killed about one sixth of the population of london. one of the few "benefits" of the plague was that it caused cambridge university to be closed for two years. isaac newton, who was a student at cambridge at the time, was sent to his home and while "in exile" he had one of the most productive scientific periods of any human in history. he discovered his law of gravitation, among other things, during this period. the characteristic features of the great plague were that it appeared quite suddenly, grew in intensity, and then disappeared, leaving part of the population untouched. the same features have been observed in many other epidemics, both of fatal diseases and of diseases whose victims recovered with immunity against reinfection. in the nineteenth century recurrent invasions of cholera killed millions in india. the influenza epidemic of 1918-1919 killed more than 20 million people overall, more than half a million in the united states. one of the questions that first attracted the attention of scientists interested in the study of the spread of communicable diseases was why diseases would suddenly develop in a community and then disappear just as suddenly without infecting the entire community. one of the early triumphs of mathematical epidemiology [21] was the formulation of a simple model that predicted behaviour very similar to the behaviour observed in countless epidemics. 
the kermack-mckendrick model is a compartmental model based on relatively simple assumptions on the rates of flow between different classes of members of the population. there are many questions of interest to public health physicians confronted with a possible epidemic. for example, how severe will an epidemic be? this question may be interpreted in a variety of ways. for example, how many individuals will be affected altogether and thus require treatment? what is the maximum number of people needing care at any particular time? how long will the epidemic last? how much good would quarantine or isolation of victims do in reducing the severity of the epidemic? these are some of the questions we would like to study with the aid of models. we formulate our descriptions as compartmental models, with the population under study being divided into compartments and with assumptions about the nature and time rate of transfer from one compartment to another. diseases that confer immunity have a different compartmental structure from diseases without immunity. we will use the terminology sir to describe a disease which confers immunity against re-infection, to indicate that the passage of individuals is from the susceptible class s to the infective class i to the removed class r. on the other hand, we will use the terminology sis to describe a disease with no immunity against re-infection, to indicate that the passage of individuals is from the susceptible class to the infective class and then back to the susceptible class. other possibilities include seir and seis models, with an exposed period between being infected and becoming infective, and sirs models, with temporary immunity on recovery from infection. 
the independent variable in our compartmental models is the time t and the rates of transfer between compartments are expressed mathematically as derivatives with respect to time of the sizes of the compartments, and as a result our models are formulated initially as differential equations. possible generalizations, which we shall not explore in these notes, include models in which the rates of transfer depend on the sizes of compartments over the past as well as at the instant of transfer, leading to more general types of functional equations, such as differential-difference equations, integral equations, or integro-differential equations. in order to model such an epidemic we divide the population being studied into three classes labeled s, i, and r. we let s(t) denote the number of individuals who are susceptible to the disease, that is, who are not (yet) infected at time t. i(t) denotes the number of infected individuals, assumed infectious and able to spread the disease by contact with susceptibles. r(t) denotes the number of individuals who have been infected and then removed from the possibility of being infected again or of spreading infection. removal is carried out either through isolation from the rest of the population or through immunization against infection or through recovery from the disease with full immunity against reinfection or through death caused by the disease. these characterizations of removed members are different from an epidemiological perspective but are often equivalent from a modeling point of view which takes into account only the state of an individual with respect to the disease. in formulating models in terms of the derivatives of the sizes of each compartment we are assuming that the number of members in a compartment is a differentiable function of time. this may be a reasonable approximation if there are many members in a compartment, but it is certainly suspect otherwise. 
in formulating models as differential equations, we are assuming that the epidemic process is deterministic, that is, that the behaviour of a population is determined completely by its history and by the rules which describe the model. in other chapters of this volume linda allen and ping yan describe the study of stochastic models in which probabilistic concepts are used and in which there is a distribution of possible behaviours. the developing study of network science, introduced in chap. 4 of this volume and described in [28, 30, 33], is another approach. the basic compartmental models to describe the transmission of communicable diseases are contained in a sequence of three papers by w.o. kermack and a.g. mckendrick in 1927, 1932, and 1933. the first of these papers described epidemic models. what is often called the kermack-mckendrick epidemic model is actually a special case of the general model introduced in this paper. the general model included dependence on age of infection, that is, the time since becoming infected. curiously, kermack and mckendrick did not explore this situation further in their later models which included demographic effects. age of infection models have become important in the study of hiv/aids, and we will return to them in the last section of this chapter. the special case of the model proposed by kermack and mckendrick in 1927, which is the starting point for our study of epidemic models, rests on the following assumptions (a flow chart is shown in fig. 2): (1) an average member of the population makes contact sufficient to transmit infection with βn others per unit time, where n represents total population size (mass action incidence). (2) infectives leave the infective class at rate αi per unit time. (3) there is no entry into or departure from the population, except possibly through death from the disease.
according to (1), since the probability that a random contact by an infective is with a susceptible, who can then transmit infection, is s/n, the number of new infections in unit time per infective is (βn)(s/n), giving a rate of new infections (βn)(s/n)i = βsi. alternately, we may argue that for a contact by a susceptible the probability that this contact is with an infective is i/n and thus the rate of new infections per susceptible is (βn)(i/n), giving a rate of new infections (βn)(i/n)s = βsi. note that both approaches give the same rate of new infections; there are situations which we shall encounter where one is more appropriate than the other. we need not give an algebraic expression for n since it cancels out of the final model, but we should note that for a disease that is fatal to all who are infected n = s + i; while, for a disease from which all infected members recover with immunity, n = s + i + r. later, we will allow the possibility that some infectives recover while others die of the disease. the hypothesis (3) really says that the time scale of the disease is much faster than the time scale of births and deaths so that demographic effects on the population may be ignored. an alternative view is that we are only interested in studying the dynamics of a single epidemic outbreak. in later sections we shall consider models that are the same as those considered in this first section except for the incorporation of demographic effects (births and deaths) along with the corresponding epidemiological assumptions. the assumption (2) requires a fuller mathematical explanation, since the assumption of a recovery rate proportional to the number of infectives has no clear epidemiological meaning. we consider the "cohort" of members who were all infected at one time and let u(s) denote the number of these who are still infective s time units after having been infected.
if a fraction α of these leave the infective class in unit time then u′ = −αu, and the solution of this elementary differential equation is u(s) = u(0)e^(−αs). thus, the fraction of infectives remaining infective s time units after having become infective is e^(−αs), so that the length of the infective period is distributed exponentially with mean ∫_0^∞ e^(−αs) ds = 1/α, and this is what (2) really assumes. the assumptions of a rate of contacts proportional to population size n with constant of proportionality β, and of an exponentially distributed recovery rate, are unrealistically simple. more general models can be constructed and analyzed, but our goal here is to show what may be deduced from extremely simple models. it will turn out that many more realistic models exhibit very similar qualitative behaviours. in our model r is determined once s and i are known, and we can drop the r equation from our model, leaving the system of two equations s′ = −βsi, i′ = (βs − α)i. (2.1) we are unable to solve this system analytically but we learn a great deal about the behaviour of its solutions by the following qualitative approach. to begin, we remark that the model makes sense only so long as s(t) and i(t) remain non-negative. thus if either s(t) or i(t) reaches zero we consider the system to have terminated. we observe that s′ < 0 for all t and i′ > 0 if and only if s > α/β. thus i increases so long as s > α/β but since s decreases for all t, i ultimately decreases and approaches zero. if s(0) < α/β, i decreases to zero (no epidemic), while if s(0) > α/β, i first increases to a maximum attained when s = α/β and then decreases to zero (epidemic). we think of introducing a small number of infectives into a population of susceptibles and ask whether there will be an epidemic. the quantity βs(0)/α is a threshold quantity, called the basic reproduction number and denoted by r0, which determines whether there is an epidemic or not. if r0 < 1 the infection dies out, while if r0 > 1 there is an epidemic.
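the threshold behaviour just described can be checked numerically. the sketch below is an illustration, not code from the text, and the parameter values are invented; it integrates the two-equation system s′ = −βsi, i′ = (βs − α)i with a fixed-step euler scheme and records the peak number of infectives.

```python
def simulate_sir(s0, i0, beta, alpha, t_end=50.0, dt=0.001):
    """euler integration of s' = -beta*s*i, i' = (beta*s - alpha)*i;
    returns the final susceptible count and the peak infective count."""
    s, i = float(s0), float(i0)
    peak = i
    for _ in range(int(t_end / dt)):
        ds = -beta * s * i
        di = (beta * s - alpha) * i
        s += dt * ds
        i += dt * di
        peak = max(peak, i)
    return s, peak

# invented illustrative parameters; the threshold is s = alpha/beta = 250
alpha, beta = 0.5, 0.002
s_inf_epi, peak_epi = simulate_sir(1000, 1, beta, alpha)  # s(0) > alpha/beta
s_inf_no, peak_no = simulate_sir(200, 1, beta, alpha)     # s(0) < alpha/beta
```

with s(0) = 1000 the infective count rises far above i(0) before dying out, while with s(0) = 200 it never exceeds i(0); in both runs s(t) stays strictly positive, in line with the result s∞ > 0.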
the basic reproduction number r0 is defined as the number of secondary infections caused by a single infective introduced into a wholly susceptible population of size k ≈ s(0) over the course of the infection of this single infective. in this situation, an infective makes βk contacts in unit time, all of which are with susceptibles and thus produce new infections, and the mean infective period is 1/α; thus the basic reproduction number is actually βk/α rather than βs(0)/α. instead of trying to solve for s and i as functions of t, we divide the two equations of the model to give di/ds = −1 + α/(βs), and integrate to find the orbits (curves in the (s, i)-plane, or phase plane) i = −s + (α/β) log s + c, with c an arbitrary constant of integration. here, we are using log to denote the natural logarithm. another way to describe the orbits is to define the function v(s, i) = s + i − (α/β) log s; each orbit is then a curve v(s, i) = v(s(0), i(0)). (2.2) note that the maximum value of i on each of these orbits is attained when s = α/β. note also that since none of these orbits reaches the i-axis, s > 0 for all times. in particular, s∞ = lim_{t→∞} s(t) > 0, which implies that part of the population escapes infection. let us think of a population of size k into which a small number of infectives is introduced, so that s0 ≈ k, i0 ≈ 0, and r0 = βk/α. if we use the fact that lim_{t→∞} i(t) = 0, and let s∞ = lim_{t→∞} s(t), then the relation v(s0, i0) = v(s∞, 0) gives s0 − (α/β) log s0 = s∞ − (α/β) log s∞, from which we obtain an expression for β/α in terms of the measurable quantities s0 and s∞, namely β/α = log(s0/s∞) / (s0 − s∞). we may rewrite this in terms of r0 as the final size relation log(s0/s∞) = r0 (1 − s∞/k). (2.3) in particular, since the right side of (2.3) is finite, the left side is also finite, and this shows that s∞ > 0. it is generally difficult to estimate the contact rate β, which depends on the particular disease being studied but may also depend on social and behavioural factors.
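the final size relation determines s∞ only implicitly. assuming the standard form of the relation, log(s0/s∞) = r0 (1 − s∞/k), the sketch below (illustrative, with an invented normalization k = s0 = 1) finds the nontrivial root by bisection.

```python
import math

def final_size(r0, k=1.0, s0=None, tol=1e-10):
    """solve log(s0/x) = r0*(1 - x/k) for the limiting susceptible level x
    by bisection on (0, s0); assumes r0 > 1 so a nontrivial root exists."""
    if s0 is None:
        s0 = k
    g = lambda x: math.log(s0 / x) - r0 * (1.0 - x / k)
    # g > 0 near x = 0 (the logarithm blows up) and g < 0 just below s0
    # when r0 > 1, so a root is bracketed
    lo, hi = 1e-12, s0 * (1.0 - 1e-12)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

for r0 = 2 roughly a fifth of the population escapes infection; a larger r0 leaves a smaller susceptible residue, but s∞ remains strictly positive.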
the quantities s0 and s∞ may be estimated by serological studies (measurements of immune responses in blood samples) before and after an epidemic, and from these data the basic reproduction number r0 may be estimated by using (2.3). this estimate, however, is a retrospective one which can be determined only after the epidemic has run its course. initially, the number of infectives grows exponentially because the equation for i may be approximated by i′ = (βk − α)i, and the initial growth rate is r = βk − α. this initial growth rate r may be determined experimentally when an epidemic begins. then, since k and α may be measured, β may be calculated as β = (r + α)/k. however, because of incomplete data and under-reporting of cases this estimate may not be very accurate. this inaccuracy is even more pronounced for an outbreak of a previously unknown disease, where early cases are likely to be misdiagnosed. the maximum number of infectives at any time is the number of infectives when the derivative of i is zero, that is, when s = α/β. this maximum is given by i_max = i0 + s0 − α/β − (α/β) log(βs0/α), obtained by substituting s = α/β, i = i_max into (2.2). the village of eyam near sheffield, england suffered an outbreak of bubonic plague in 1665-1666, the source of which is generally believed to be the great plague of london. the eyam plague was survived by only 83 of an initial population of 350 persons. as detailed records were preserved and as the community was persuaded to quarantine itself to try to prevent the spread of disease to other communities, the disease in eyam has been used as a case study for modeling [31]. detailed examination of the data indicates that there were actually two outbreaks of which the first was relatively mild. thus we shall try to fit the model (2.1) over the period from mid-may to mid-october 1666, measuring time in months with an initial population of seven infectives and 254 susceptibles, and a final population of 83.
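the estimate β = (r + α)/k from the initial growth rate, together with the peak expression i_max = i0 + s0 − α/β − (α/β) log(βs0/α) obtained from the orbit equation at s = α/β, can be cross-checked by direct integration of the model. the sketch below is illustrative and its parameter values are invented.

```python
import math

def sir_peak(s0, i0, beta, alpha, t_end=40.0, dt=0.0005):
    """euler integration of s' = -beta*s*i, i' = (beta*s - alpha)*i;
    returns the peak number of infectives along the trajectory."""
    s, i, peak = float(s0), float(i0), float(i0)
    for _ in range(int(t_end / dt)):
        s, i = s + dt * (-beta * s * i), i + dt * (beta * s - alpha) * i
        peak = max(peak, i)
    return peak

# invented example: observed initial growth rate r, known k and alpha
k, alpha, r = 1000.0, 0.5, 1.5
beta = (r + alpha) / k            # from r = beta*k - alpha
s0, i0 = k, 1.0
i_max = i0 + s0 - alpha / beta - (alpha / beta) * math.log(beta * s0 / alpha)
peak = sir_peak(s0, i0, beta, alpha)
```

the simulated peak agrees with the closed-form i_max to well under a percent, which is a useful sanity check on both the formula and the integrator.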
values of susceptibles and infectives in eyam are given in [31] for various dates, beginning with s(0) = 254, i(0) = 7, shown in table 2.1. the relation (2.3) then provides an estimate of β/α from these data. the actual data for the eyam epidemic are remarkably close to the predictions of this very simple model. however, the model is really too good to be true. our model assumes that infection is transmitted directly between people. while this is possible, bubonic plague is transmitted mainly by rat fleas. when an infected rat is bitten by a flea, the flea becomes extremely hungry and bites the host rat repeatedly, spreading the infection in the rat. when the host rat dies its fleas move on to other rats, spreading the disease further. as the number of available rats decreases the fleas move to human hosts, and this is how plague starts in a human population (although the second phase of the epidemic may have been the pneumonic form of bubonic plague, which can be spread from person to person). one of the main reasons for the spread of plague from asia into europe was the passage of many trading ships; in medieval times ships were invariably infested with rats. an accurate model of plague transmission would have to include flea and rat populations, as well as movement in space. such a model would be extremely complicated and its predictions might well not be any closer to observations than our simple unrealistic model. in [31] a stochastic model was also used to fit the data, but the fit was rather poorer than the fit for the simple deterministic model (2.1). in the village of eyam the rector persuaded the entire community to quarantine itself to prevent the spread of disease to other communities. this policy actually increased the infection rate in the village by keeping fleas, rats, and people in close contact with one another, and the mortality rate from bubonic plague was much higher in eyam than in london.
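the eyam figures quoted above fix the ratio α/β directly: evaluating the orbit function at the start and at the end of the epidemic (where i vanishes) gives s0 + i0 − (α/β) log s0 = s∞ − (α/β) log s∞. the sketch below carries out this arithmetic, treating the 83 survivors as the limiting susceptible count s∞; it illustrates the fitting idea rather than reproducing the computation in [31].

```python
import math

# eyam plague data quoted in the text
s0, i0, s_inf = 254.0, 7.0, 83.0

# orbit equation with i -> 0 as t -> infinity:
#   s0 + i0 - (alpha/beta)*log(s0) = s_inf - (alpha/beta)*log(s_inf)
# rearranged to isolate the ratio alpha/beta (the threshold susceptible level)
alpha_over_beta = (s0 + i0 - s_inf) / math.log(s0 / s_inf)
```

the resulting threshold α/β ≈ 159 lies between s∞ = 83 and s(0) = 254, as it must: i increases while s > α/β and decreases afterwards.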
further, the quarantine could do nothing to prevent the travel of rats and thus did little to prevent the spread of disease to other communities. one message this suggests to mathematical modelers is that control strategies based on false models may be harmful, and it is essential to distinguish between assumptions that simplify but do not alter the predicted effects substantially, and wrong assumptions which make an important difference. the assumption in the model (2.1) of a rate of contacts per infective which is proportional to population size n, called mass action incidence or bilinear incidence, was used in all the early epidemic models. however, it is quite unrealistic, except possibly in the early stages of an epidemic in a population of moderate size. it is more realistic to assume a contact rate which saturates, growing less than linearly with total population size. for example, a situation in which the number of contacts per infective in unit time is constant, called standard incidence, is a more accurate description for sexually transmitted diseases. we generalize the model (2.1) by replacing the assumption (1) by the assumption that an average member of the population makes c(n) contacts in unit time, with c′(n) ≥ 0 [7, 10], and we define β(n) = c(n)/n. it is reasonable to assume β′(n) ≤ 0 to express the idea of saturation in the number of contacts. then mass action incidence corresponds to the choice c(n) = βn, and standard incidence corresponds to the choice c(n) = λ. some epidemic models [10] have used a michaelis-menten type of interaction of the form c(n) = an/(1 + bn). another form based on a mechanistic derivation for pair formation [14] leads to an expression of the form c(n) = an/(1 + bn + √(1 + 2bn)). data for diseases transmitted by contact in cities of moderate size [25] suggests that the data fit the assumption of a form c(n) = λn^a with a = 0.05 quite well. all of these forms satisfy the conditions c′(n) ≥ 0, β′(n) ≤ 0.
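the saturation conditions c′(n) ≥ 0 and β′(n) ≤ 0 are easy to verify numerically for a candidate contact function. the sketch below uses a michaelis-menten style saturating form c(n) = an/(1 + bn) with arbitrary invented coefficients; the same finite-difference check applies to any of the forms mentioned above.

```python
def c_mm(n, a=10.0, b=0.01):
    """michaelis-menten style saturating contact rate c(n) = a*n/(1 + b*n)."""
    return a * n / (1.0 + b * n)

def beta_of(c, n):
    """per-pair transmission coefficient beta(n) = c(n)/n."""
    return c(n) / n

# check c'(n) >= 0 and beta'(n) <= 0 on a grid via successive differences
grid = [1.0 + 5.0 * j for j in range(200)]
c_vals = [c_mm(n) for n in grid]
b_vals = [beta_of(c_mm, n) for n in grid]
c_nondecreasing = all(x <= y + 1e-12 for x, y in zip(c_vals, c_vals[1:]))
b_nonincreasing = all(x >= y - 1e-12 for x, y in zip(b_vals, b_vals[1:]))
```

c(n) saturates at a/b for large n, so the number of contacts per infective levels off while β(n) = c(n)/n decays toward zero.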
because the total population size is now present in the model we must include an equation for total population size in the model. this forces us to make a distinction between members of the population who die of the disease and members of the population who recover with immunity against reinfection. we assume that a fraction f of the αi members leaving the infective class at time t recover and the remaining fraction (1 − f) die of disease. we use s, i, and n as variables, with n = s + i + r. we now obtain the three-dimensional model s′ = −β(n)si, i′ = β(n)si − αi, n′ = −(1 − f)αi. (2.6) we also have the equation r′ = fαi, but we need not include it in the model since r is determined when s, i, and n are known. we should note that if f = 1 the total population size remains equal to the constant k, and the model (2.6) reduces to the simpler model (2.1) with β replaced by the constant β(k). we wish to show that the model (2.6) has the same qualitative behaviour as the model (2.1), namely that there is a basic reproduction number which distinguishes between disappearance of the disease and an epidemic outbreak, and that some members of the population are left untouched when the epidemic passes. these two properties are the central features of all epidemic models. for the model (2.6) the basic reproduction number is given by r0 = kβ(k)/α, because a single infective introduced into a wholly susceptible population makes c(k) = kβ(k) contacts in unit time, all of which are with susceptibles and thus produce new infections, and the mean infective period is 1/α. in addition to the basic reproduction number r0 there is also a time-dependent running reproduction number which we call r*, representing the number of secondary infections caused by a single individual in the population who becomes infective at time t. in this situation, an infective makes c(n) = nβ(n) contacts in unit time and a fraction s/n of these are with susceptibles and thus produce new infections.
Thus it is easy to see that for the model (2.6) the running reproduction number is given by

R* = Sβ(N)/α.

If R* < 1 for all large t, the epidemic will pass. We may calculate the rate of change of the running reproduction number with respect to time, using (2.6) and (2.5). If R0 > 1, then R* > 1 initially and an epidemic begins. However, R* decreases until it is less than 1 and then remains less than 1; thus the epidemic will pass. If R0 < 1, then R* = Sβ(N)/α ≤ C(N)/α ≤ C(K)/α = R0 < 1 for all t, and there is no epidemic. From (2.6) we obtain

(S + I)' = −αI,    N' = −(1 − f)αI.

Integration of these equations from 0 to t gives

S(0) + I(0) − S(t) − I(t) = α ∫₀ᵗ I ds,    K − N(t) = (1 − f)α ∫₀ᵗ I ds.          (2.7)

When we combine these two equations, eliminating the integral expression, and use S(0) + I(0) = N(0) = K, and if we let t → ∞, S(t) and N(t) decrease monotonically to limits S∞ and N∞ respectively and I(t) → 0, this gives the relation

K − N∞ = (1 − f)(K − S∞).          (2.8)

In this equation, K − N∞ is the change in population size, which is the number of disease deaths over the course of the epidemic, while K − S∞ is the change in the number of susceptibles, which is the number of disease cases over the course of the epidemic. In this model (2.8) is obvious, but we shall see in a more general setting how to derive an analogous equation from which we can calculate an average disease mortality. Equation (2.8) generalizes to the infection-age epidemic model of Kermack and McKendrick. If we use the same approach as was used for (2.1) to show that S∞ > 0, we obtain

log(S₀/S∞) = ∫₀^∞ β(N)I dt,

and we are unable to proceed because of the dependence on N. However, we may use a different approach to obtain the desired result. We assume that β(0) is finite, thus ruling out standard incidence. If we let t → ∞ in the second equation of (2.7) we obtain

∫₀^∞ I dt = (K − N∞)/((1 − f)α).

The first equation of (2.6) may be written as

−S'(t)/S(t) = β(N)I ≤ β(0)I,

since β'(N) ≤ 0. Integration gives

log(S₀/S∞) ≤ β(0) ∫₀^∞ I dt = β(0)(K − N∞)/((1 − f)α).

Since the right side of this inequality is finite, the left side is also finite, and this establishes that S∞ > 0. In addition, if we use the same integration together with the inequality β(N) ≥ β(K), we obtain a final size inequality

log(S₀/S∞) ≥ β(K)(K − N∞)/((1 − f)α).

If β(N) → ∞ as N → 0 we must use a different approach to analyze the limiting behaviour.
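With constant β the S and I equations decouple from N, and the claim that the infection peaks exactly when R* passes through 1 can be checked numerically. This is an illustrative sketch with assumed parameters, not an analysis from the text.

```python
# Track R* = Sβ/α along a trajectory of (2.6) with constant β (assumed values).
beta, alpha = 0.5e-3, 0.25     # R0 = βK/α = 2
K = 1000.0
S, I = K - 1.0, 1.0
dt = 0.01
R_star_at_peak = None
prev_I, growing = I, True
for _ in range(200000):        # integrate to t = 2000
    dS = -beta * S * I
    dI = beta * S * I - alpha * I
    S += dt * dS; I += dt * dI
    if growing and I < prev_I:         # first step where I decreases
        R_star_at_peak = beta * S / alpha
        growing = False
    prev_I = I

R0 = beta * K / alpha
assert R0 > 1
# I peaks when I' = I(βS - α) changes sign, i.e. when R* = βS/α = 1
assert abs(R_star_at_peak - 1.0) < 0.01
```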
It is possible to show that S∞ = 0 is possible only if N → 0 and ∫₀^K β(N) dN diverges, and this is possible only if f = 0, that is, only if all infectives die of disease. The assumption that β(N) is unbounded as N → 0 is biologically unreasonable. In particular, standard incidence is not realistic for small population sizes. A more realistic assumption would be that the number of contacts per infective in unit time is linear for small population size and saturates for larger population sizes, which rules out the possibility that the epidemic sweeps through the entire population. In many infectious diseases there is an exposed period after the transmission of infection from susceptibles to potentially infective members but before these potential infectives can transmit infection. If the exposed period is short it is often neglected in modeling. A longer exposed period could perhaps lead to significantly different model predictions, and we need to show that this is not the case. To incorporate an exponentially distributed exposed period with mean exposed period 1/κ, we add an exposed class E and use compartments S, E, I, R and total population size N = S + E + I + R to give a generalization of the epidemic model (2.6):

S' = −β(N)SI,
E' = β(N)SI − κE,
I' = κE − αI,          (2.9)
N' = −(1 − f)αI.

We also have the equation R' = fαI, but we need not include it in the model since R is determined when S, I, and N are known. A flow chart is shown in Fig. 2.6. The analysis of this model is the same as the analysis of (2.6), but with I replaced by E + I. That is, instead of using the number of infectives as one of the variables, we use the total number of infected members, whether or not they are capable of transmitting infection. Some diseases have an asymptomatic stage in which there is some infectivity, rather than an exposed period. This may be modeled by assuming infectivity reduced by a factor ε_E during the exposed period. A calculation of the rate of new infections per susceptible leads to the model

S' = −β(N)S(I + ε_E E),
E' = β(N)S(I + ε_E E) − κE,
I' = κE − αI,          (2.10)
N' = −(1 − f)αI.

There is a final size relation like (2.
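The remark that the analysis of (2.9) is that of (2.6) with I replaced by E + I suggests, in particular, that the exposed class delays the epidemic but does not change its final size. A rough Euler comparison under illustrative assumed parameters (constant β, f = 1):

```python
# Compare final sizes of the SEIR model (2.9) and the SIR model (2.6),
# same R0 = βK/α = 2; all parameter values are assumptions.
beta, kappa, alpha = 0.5e-3, 0.2, 0.25
K = 1000.0
dt = 0.01

def final_size_seir():
    S, E, I = K - 1.0, 0.0, 1.0
    for _ in range(300000):            # t = 3000
        dS = -beta * S * I
        dE = beta * S * I - kappa * E
        dI = kappa * E - alpha * I
        S += dt * dS; E += dt * dE; I += dt * dI
    return S

def final_size_sir():
    S, I = K - 1.0, 1.0
    for _ in range(300000):
        dS = -beta * S * I
        dI = beta * S * I - alpha * I
        S += dt * dS; I += dt * dI
    return S

s_seir, s_sir = final_size_seir(), final_size_sir()
assert abs(s_seir - s_sir) < 5.0       # same final size
assert 0 < s_seir < K / 2              # a genuine epidemic occurred
```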
3) for the model (2.9). Integration of the sum of the first two equations of (2.9) from 0 to ∞ gives

κ ∫₀^∞ E dt = S(0) + E(0) − S∞,

and division of the first equation of (2.9) by S, followed by integration from 0 to ∞, gives

log(S₀/S∞) = ∫₀^∞ β(N)I dt.

The same integration using β(N) ≤ β(0) < ∞ shows, as in the previous section, that S∞ > 0. One form of treatment that is possible for some diseases is vaccination to protect against infection before the beginning of an epidemic. For example, this approach is commonly used for protection against annual influenza outbreaks. A simple way to model this would be to reduce the total population size by the fraction of the population protected against infection. However, in reality such inoculations are only partly effective, decreasing the rate of infection and also decreasing infectivity if a vaccinated person does become infected. To model this, it would be necessary to divide the population into two groups with different model parameters and to make some assumptions about the mixing between the two groups. We will not explore such more complicated models here. If there is a treatment for infection once a person has been infected, we model this by supposing that a fraction γ per unit time of infectives is selected for treatment, and that treatment reduces infectivity by a fraction δ. Suppose that the rate of removal from the treated class is η. The SITR model, where T is the treatment class, is given by

S' = −βS(I + δT),
I' = βS(I + δT) − (α + γ)I,          (2.11)
T' = γI − ηT.

A flow chart is shown in Fig. 2.7. It is not difficult to prove, much as was done for the model (2.1), that S∞ > 0. In order to calculate the basic reproduction number, we may argue that an infective in a totally susceptible population causes βK new infections in unit time, and the mean time spent in the infective compartment is 1/(α + γ). In addition, a fraction γ/(α + γ) of infectives are treated. While in the treatment stage the number of new infections caused in unit time is δβK, and the mean time in the treatment class is 1/η.
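The reproduction-number argument for the SITR model can be turned into a short numerical check: compute R0 from the two pathway contributions and verify the threshold behaviour against a crude simulation of (2.11). Function names and parameter values below are illustrative assumptions.

```python
# R0 for the SITR model: untreated pathway plus treated pathway.
def R0_sitr(beta, K, alpha, gamma, delta, eta):
    untreated = beta * K / (alpha + gamma)                    # time 1/(α+γ) in I
    treated = (gamma / (alpha + gamma)) * (delta * beta * K / eta)
    return untreated + treated

def epidemic_occurs(beta, K=1000.0, alpha=0.25, gamma=0.2, delta=0.3, eta=0.5):
    # Crude Euler run of (2.11); True if I grows well past its initial value.
    S, I, T = K - 1.0, 1.0, 0.0
    dt = 0.01
    for _ in range(200000):
        force = beta * S * (I + delta * T)
        dS, dI, dT = -force, force - (alpha + gamma) * I, gamma * I - eta * T
        S += dt * dS; I += dt * dI; T += dt * dT
        if I > 5.0:
            return True
    return False

K, alpha, gamma, delta, eta = 1000.0, 0.25, 0.2, 0.3, 0.5
# β at which R0 = 1, from R0 = (βK/(α+γ))(1 + δγ/η)
beta_crit = (alpha + gamma) / (K * (1 + delta * gamma / eta))
assert R0_sitr(1.2 * beta_crit, K, alpha, gamma, delta, eta) > 1
assert R0_sitr(0.8 * beta_crit, K, alpha, gamma, delta, eta) < 1
assert epidemic_occurs(1.2 * beta_crit)       # supercritical: outbreak grows
assert not epidemic_occurs(0.8 * beta_crit)   # subcritical: infection fades
```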
Thus

R0 = βK/(α + γ) + (γ/(α + γ)) · (δβK/η) = (βK/(α + γ))(1 + δγ/η).

It is also possible to establish the final size relation (2.3) by means similar to those used for the simple model (2.1). We integrate the first equation of (2.11) to obtain

log(S₀/S∞) = β ∫₀^∞ (I + δT) dt.

Integration of the third equation of (2.11) gives

η ∫₀^∞ T dt = γ ∫₀^∞ I dt,

and integration of the sum of the first two equations of (2.11) gives

(α + γ) ∫₀^∞ I dt = S₀ + I₀ − S∞.

Combination of these three equations and (2.12) gives the final size relation; if β is constant, this relation is an equality and is the same as (2.3). An actual epidemic differs considerably from the idealized models (2.1) or (2.6), as was shown by the SARS epidemic of 2002-3. Some notable differences are:

1. As we have seen in the preceding section, at the beginning of an epidemic the number of infectives is small, and a deterministic model, which presupposes enough infectives to allow homogeneous mixing, is inappropriate.
2. When it is realized that an epidemic has begun, individuals are likely to modify their behaviour by avoiding crowds to reduce their contacts and by being more careful about hygiene to reduce the risk that a contact will produce infection.
3. If a vaccine is available for the disease which has broken out, public health measures will include vaccination of part of the population. Various vaccination strategies are possible, including vaccination of health care workers and other first-line responders to the epidemic, vaccination of members of the population who have been in contact with diagnosed infectives, or vaccination of members of the population who live in close proximity to diagnosed infectives.
4. Diagnosed infectives may be hospitalized, both for treatment and to isolate them from the rest of the population.
5. Contact tracing of diagnosed infectives may identify people at risk of becoming infective, who may be quarantined (instructed to remain at home and avoid contacts) and monitored so that they may be isolated immediately if and when they become infective.
6.
In some diseases, exposed members who have not yet developed symptoms may already be infective, and this would require inclusion in the model of new infections caused by contacts between susceptibles and asymptomatic infectives from the exposed class.
7. Isolation may be imperfect; in-hospital transmission of infection was a major problem in the SARS epidemic.

In the SARS epidemic of 2002-2003, in-hospital transmission of disease from patients to health care workers or visitors because of imperfect isolation accounted for many of the cases. This points to an essential heterogeneity in disease transmission which must be included whenever there is any risk of such transmission. All these generalizations have been considered in studies of the SARS epidemic of 2002-3. While the ideas were suggested in SARS modelling, they are in fact relevant to any epidemic. One beneficial effect of the SARS epidemic has been to draw attention to epidemic modelling, which may be of great value in coping with future epidemics. If a vaccine is available for a disease which threatens an epidemic outbreak, a vaccinated class which is protected at least partially against infection should be included in a model. While this is not relevant for an outbreak of a new disease, it would be an important aspect to be considered in modelling an influenza epidemic or a bioterrorist outbreak of smallpox. For an outbreak of a new disease, where no vaccine is available, isolation and quarantine are the only control measures available. Let us formulate a model for an epidemic once control measures have been started. Thus, we assume that an epidemic has started, but that the number of infectives is small and almost all members of the population are still susceptible. We formulate a model to describe the course of an epidemic when control measures are begun, under the assumptions:

1. Exposed members may be infective with infectivity reduced by a factor ε_E, 0 ≤ ε_E < 1.
2.
Exposed members who are not isolated become infective at rate κ1.
3. We introduce a class Q of quarantined members and a class J of isolated members.
4. Exposed members are quarantined at a proportional rate γ1 in unit time (in practice, a quarantine will also be applied to many susceptibles, but we ignore this in the model). Quarantine is not perfect, but reduces the contact rate by a factor ε_Q. The effect of this assumption is that some susceptibles make fewer contacts than the model assumes.
5. There may be transmission of disease by isolated members, with an infectivity factor of ε_J.
6. Infectives are diagnosed at a proportional rate γ2 per unit time and isolated. In addition, quarantined members are monitored, and when they develop symptoms at rate κ2 they are isolated immediately.
7. Infectives leave the infective class at rate α1, and a fraction f1 of these recover; isolated members leave the isolated class at rate α2, with a fraction f2 recovering.

These assumptions lead to the SEQIJR model [13]:

S' = −Sβ(N)[ε_E E + ε_E ε_Q Q + I + ε_J J],
E' = Sβ(N)[ε_E E + ε_E ε_Q Q + I + ε_J J] − (κ1 + γ1)E,
Q' = γ1 E − κ2 Q,
I' = κ1 E − (α1 + γ2)I,          (2.13)
J' = γ2 I + κ2 Q − α2 J,
N' = −(1 − f1)α1 I − (1 − f2)α2 J.

Here, we have used an equation for N to replace the equation for R. The model before control measures are begun is the special case of (2.13) with the control parameters γ1 and γ2 set to zero; it is the same as (2.10). We define the control reproduction number R_c to be the number of secondary infections caused by a single infective in a population consisting essentially only of susceptibles with the control measures in place. It is analogous to the basic reproduction number, but instead of describing the very beginning of the disease outbreak it describes the beginning of the recognition of the epidemic. The basic reproduction number is the value of the control reproduction number with γ1 = γ2 = 0. In addition, there is a time-dependent effective reproduction number R* which continues to track the number of secondary infections caused by a single infective as the epidemic continues with control measures (quarantine of asymptomatics and isolation of symptomatics) in place.
It is not difficult to show that if the inflow into the population from travellers and new births is small (i.e., if the epidemiological time scale is much faster than the demographic time scale), our model implies that R* will become and remain less than unity, so that the epidemic will always pass. Even if R_c > 1, the epidemic will abate eventually when the effective reproduction number becomes less than unity. The effective reproduction number R* is essentially R_c multiplied by a factor S/N, but allows time-dependent parameter values as well. However, it should be remembered that if the epidemic takes so long to pass that there are enough new births and travellers to keep R* > 1, there will be an endemic equilibrium, meaning that the disease will establish itself and remain in the population. We have already calculated R0 for (2.10), and we may calculate R_c in the same way, but using the full model with quarantined and isolated classes. Writing D1 = γ1 + κ1 and D2 = γ2 + α1, we obtain

R_c = ε_E Kβ(K)/D1 + Kβ(K)κ1/(D1 D2) + ε_E ε_Q Kβ(K)γ1/(D1 κ2) + ε_J Kβ(K)κ1 γ2/(α2 D1 D2) + ε_J Kβ(K)γ1/(D1 α2).

Each term of R_c has an epidemiological interpretation. The mean duration in E is 1/D1 with contact rate ε_E β, giving a contribution to R_c of ε_E Kβ(K)/D1. A fraction κ1/D1 goes from E to I, with contact rate β and mean duration 1/D2, giving a contribution of Kβ(K)κ1/(D1 D2). A fraction γ1/D1 goes from E to Q, with contact rate ε_E ε_Q β and mean duration 1/κ2, giving a contribution of ε_E ε_Q Kβ(K)γ1/(D1 κ2). A fraction κ1 γ2/(D1 D2) goes from E to I to J, with a contact rate of ε_J β and a mean duration of 1/α2, giving a contribution of ε_J Kβ(K)κ1 γ2/(α2 D1 D2). Finally, a fraction γ1/D1 goes from E to Q to J with a contact rate of ε_J β and a mean duration of 1/α2, giving a contribution of ε_J Kβ(K)γ1/(D1 α2). The sum of these individual contributions gives R_c. In the model (2.13) the parameters γ1 and γ2 are control parameters, which may be varied in the attempt to manage the epidemic.
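The term-by-term construction of R_c translates directly into code. The sketch below uses my own function name and illustrative parameter values; it also checks that with γ1 = γ2 = 0 the expression collapses to the R0 of the uncontrolled model (2.10), and that the controls reduce the reproduction number for these values.

```python
# Control reproduction number for the SEQIJR model, summed pathway by pathway.
def Rc_seqijr(K, beta, eps_e, eps_q, eps_j,
              kappa1, kappa2, gamma1, gamma2, alpha1, alpha2):
    C = K * beta                       # C(K) = K β(K)
    D1 = gamma1 + kappa1
    D2 = gamma2 + alpha1
    return (eps_e * C / D1                                   # time spent in E
            + C * kappa1 / (D1 * D2)                         # E -> I
            + eps_e * eps_q * C * gamma1 / (D1 * kappa2)     # E -> Q
            + eps_j * C * kappa1 * gamma2 / (alpha2 * D1 * D2)  # E -> I -> J
            + eps_j * C * gamma1 / (D1 * alpha2))               # E -> Q -> J

params = dict(K=1000.0, beta=0.4e-3, eps_e=0.3, eps_q=0.5, eps_j=0.1,
              kappa1=0.2, kappa2=0.2, gamma1=0.1, gamma2=0.5,
              alpha1=0.25, alpha2=0.25)
no_control = {**params, "gamma1": 0.0, "gamma2": 0.0}

# with γ1 = γ2 = 0, Rc reduces to R0 of (2.10): ε_E Kβ/κ1 + Kβ/α1
C = no_control["K"] * no_control["beta"]
R0 = no_control["eps_e"] * C / 0.2 + C / 0.25
assert abs(Rc_seqijr(**no_control) - R0) < 1e-12
assert Rc_seqijr(**params) < R0       # quarantine and isolation reduce Rc
```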
The parameters ε_Q and ε_J depend on the strictness of the quarantine and isolation processes and are thus also control measures in a sense. The other parameters of the model are specific to the disease being studied. While they are not variable, their measurements are subject to experimental error. The linearization of (2.13) at the disease-free equilibrium (K, 0, 0, 0, 0, K) has a corresponding characteristic equation which is a fourth degree polynomial equation whose leading coefficient is 1 and whose constant term is a positive constant multiple of 1 − R_c, thus positive if R_c < 1 and negative if R_c > 1. If R_c > 1 there is a positive eigenvalue, corresponding to an initial exponential growth rate of solutions of (2.13). If R_c < 1 it is possible to show that all eigenvalues of the coefficient matrix have negative real part, and thus solutions of (2.13) die out exponentially [38]. Next, we wish to show that analogues of the relation (2.8) and S∞ > 0, derived for the model (2.6), are valid for the management model (2.13). We begin by integrating the equations for S + E, Q, I, J, and N of (2.13) with respect to t from t = 0 to t = ∞, using the initial conditions S(0) + E(0) = N(0) = K, Q(0) = I(0) = J(0) = 0. We obtain, since E, Q, I, and J all approach zero as t → ∞,

K − S∞ = D1 ∫₀^∞ E dt,    κ2 ∫₀^∞ Q dt = γ1 ∫₀^∞ E dt,    D2 ∫₀^∞ I dt = κ1 ∫₀^∞ E dt,
α2 ∫₀^∞ J dt = γ2 ∫₀^∞ I dt + κ2 ∫₀^∞ Q dt,
K − N∞ = (1 − f1)α1 ∫₀^∞ I dt + (1 − f2)α2 ∫₀^∞ J dt.

In order to relate (K − S∞) to (K − N∞), we need to express the integrals of I and J in terms of the integral of E. Thus we have

K − N∞ = c(K − S∞).

This has the form analogous to (2.8), with c, the disease death rate, given by

c = [(1 − f1)α1 κ1/D2 + (1 − f2)(γ2 κ1/D2 + γ1)]/D1.

The mean disease death rate may be measured, and this expression gives information about some of the parameters in the model which cannot be measured directly. It is easy to see that 0 ≤ c ≤ 1. An argument similar to the one used for (2.6), but technically more complicated, may be used to show that S∞ > 0 for the management model (2.13). Thus the asymptotic behaviour of the management model (2.13) is the same as that of the simpler model (2.6).
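The relation K − N∞ = c(K − S∞) can be probed numerically. The sketch below uses the expression for c given above, a constant contact rate β, and illustrative assumed parameters; f1 = f2 is chosen so that starting the outbreak with a single infective rather than a single exposed member does not perturb the integrated identities.

```python
# Euler check of K - N∞ = c(K - S∞) for the SEQIJR model (2.13).
# All parameter values are illustrative assumptions.
eps_e, eps_q, eps_j = 0.3, 0.5, 0.1
k1, k2, g1, g2, a1, a2 = 0.2, 0.2, 0.1, 0.5, 0.25, 0.25
f1, f2 = 0.9, 0.9
beta = 1.0e-3                         # constant β(N) = β
K = 1000.0
D1, D2 = g1 + k1, g2 + a1
c = ((1 - f1) * a1 * k1 / D2 + (1 - f2) * (g2 * k1 / D2 + g1)) / D1
assert 0 < c < 1

S, E, Q, I, J, N = K - 1.0, 0.0, 0.0, 1.0, 0.0, K
dt = 0.01
for _ in range(500000):               # integrate to t = 5000
    force = S * beta * (eps_e * E + eps_e * eps_q * Q + I + eps_j * J)
    dS, dE = -force, force - D1 * E
    dQ, dI = g1 * E - k2 * Q, k1 * E - D2 * I
    dJ = g2 * I + k2 * Q - a2 * J
    dN = -(1 - f1) * a1 * I - (1 - f2) * a2 * J
    S += dt*dS; E += dt*dE; Q += dt*dQ; I += dt*dI; J += dt*dJ; N += dt*dN

assert abs((K - N) - c * (K - S)) < 0.1   # disease deaths = c x disease cases
```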
If the control reproduction number R_c is less than 1 the disease dies out, and if R_c > 1 there is an epidemic which will pass, leaving some members of the population untouched. The underlying assumptions of the models of Kermack-McKendrick type studied in this chapter are that the sizes of the compartments are large enough that the mixing of members is homogeneous. While these assumptions are probably reasonable once an epidemic is well underway, at the beginning of a disease outbreak the situation may be quite different. At the beginning of an epidemic most members of the population are susceptible, that is, not (yet) infected, and the number of infectives (members of the population who are infected and may transmit infection) is small. The transmission of infection depends strongly on the pattern of contacts between members of the population, and a description should involve this pattern. Since the number of infectives is small, a description involving an assumption of mass action should be replaced by a model which incorporates stochastic effects. One approach would be a complete description of stochastic epidemic models, for which we refer the reader to the chapter on stochastic models in this volume by Linda Allen. Another approach would be to consider a stochastic model for an outbreak of a communicable disease to be applied so long as the number of infectives remains small, distinguishing a (minor) disease outbreak confined to this initial stage from a (major) epidemic, which occurs if the number of infectives begins to grow at an exponential rate. Once an epidemic has started we may switch to a deterministic compartmental model. This approach is described in Chap. 4 on network models in this volume.
There is an important difference between the behaviour of network models and the behaviour of models of Kermack-McKendrick type, namely that for a stochastic disease outbreak model, if R0 < 1 the probability that the infection will die out is 1, while if R0 > 1 there is a positive probability that the infection will persist and lead to an epidemic, and a positive probability that the infection will increase initially but will produce only a minor outbreak and will die out before triggering a major epidemic. Epidemics which sweep through a population attract much attention and arouse a great deal of concern. As we have mentioned in the introduction, the prevalence and effects of many diseases in less developed countries are probably less well known, but may be of even more importance. There are diseases which are endemic in many parts of the world and which cause millions of deaths each year. We have omitted births and deaths in our description of models because the time scale of an epidemic is generally much shorter than the demographic time scale. In effect, we have used a time scale on which the number of births and deaths in unit time is negligible. To model a disease which may be endemic we need to think on a longer time scale and include births and deaths. For diseases that are endemic in some region, public health physicians need to be able to estimate the number of infectives at a given time as well as the rate at which new infections arise. The effects of quarantine or vaccine in reducing the number of victims are of importance, just as in the treatment of epidemics. In addition, the possibility of defeating the endemic nature of the disease, and thus controlling or even eradicating the disease in a population, is worthy of study. Measles is a disease for which endemic equilibria have been observed in many places, frequently with sustained oscillations about the equilibrium.
The epidemic model of the first section assumes that the epidemic time scale is so short relative to the demographic time scale that demographic effects may be ignored. For measles, however, the reason for the endemic nature of the disease is that there is a flow of new susceptible members into the population, and in order to try to model this we must include births and deaths in the model. The simplest way to incorporate births and deaths in an infectious disease model is to assume a constant number of births and an equal number of deaths per unit time, so that the total population size remains constant. This is, of course, feasible only if there are no deaths due to the disease. In developed countries such an assumption is plausible because there are few deaths from measles. In less developed countries there is often a very high mortality rate for measles, and therefore other assumptions are necessary. The first attempt to formulate an SIR model with births and deaths to describe measles was given in 1929 by H.E. Soper [32], who assumed a constant birth rate µK in the susceptible class and a constant death rate µK in the removed class. His model is

S' = −βSI + µK,
I' = βSI − αI,
R' = αI − µK.

This model is unsatisfactory biologically because the linkage of births of susceptibles to deaths of removed members is unreasonable. It is also an improper model mathematically because if R(0) and I(0) are sufficiently small then R(t) will become negative. For any disease model to be plausible it is essential that the problem be properly posed, in the sense that the number of members in each class must remain non-negative. A model that does not satisfy this requirement cannot be a proper description of a disease model and therefore must contain some assumption that is biologically unreasonable. A full analysis of a model should include verification of this property.
A model of Kermack and McKendrick [22] includes births in the susceptible class proportional to total population size and a death rate in each class proportional to the number of members in the class. This model allows the total population size to grow exponentially or die out exponentially if the birth and death rates are unequal. It is applicable to such questions as whether a disease will control the size of a population that would otherwise grow exponentially. We shall return to this topic, which is important in the study of many diseases in less developed countries with high birth rates. To formulate a model in which total population size remains bounded we could follow the approach suggested by [15], in which the total population size is held constant by making birth and death rates equal. Such a model is

S' = −βSI + µ(K − S),
I' = βSI − (µ + α)I,
R' = αI − µR.

Because S + I + R = K, we can view R as determined when S and I are known, and consider the two-dimensional system given by the first two equations. We shall examine a slightly more general SIR model with births and deaths for a disease that may be fatal to some infectives. For such a disease the class R of removed members should contain only recovered members, not members removed by death from the disease. It is not possible to assume that the total population size remains constant if there are deaths due to disease; a plausible model for a disease that may be fatal to some infectives must allow the total population to vary in time. The simplest assumption to allow this is a constant birth rate Λ, but in fact the analysis is quite similar if the birth rate is a function Λ(N) of total population size N. Let us analyze the model

S' = Λ − βSI − µS,
I' = βSI − (µ + α)I,          (2.15)
N' = Λ − (1 − f)αI − µN,

where N = S + I + R, with a mass action contact rate, a constant number of births Λ per unit time, a proportional natural death rate µ in each class, and a rate of recovery or disease death α of infectives, with a fraction f of infectives recovering with immunity against reinfection. In this model if f = 1 the total population size approaches a limit K = Λ/µ.
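Convergence of (2.15) to its endemic equilibrium when R0 > 1 can be illustrated with a crude Euler run. With f = 1 and constant β the equilibrium values are S∞ = (µ + α)/β and I∞ = µ(K − S∞)/(µ + α); all parameter values below are illustrative assumptions.

```python
# Euler sketch of (2.15) with f = 1; assumed parameters, R0 = βK/(µ+α) > 1.
beta, mu, alpha = 1.0e-3, 0.02, 0.25
Lam = 20.0                     # birth rate Λ, so K = Λ/µ = 1000
K = Lam / mu
R0 = beta * K / (mu + alpha)
assert R0 > 1

S, I = K - 1.0, 1.0
dt = 0.02
for _ in range(750000):        # integrate to t = 15000 (long demographic run)
    dS = Lam - beta * S * I - mu * S
    dI = beta * S * I - (mu + alpha) * I
    S += dt * dS; I += dt * dI

S_eq = (mu + alpha) / beta               # from βS∞ = µ + α
I_eq = mu * (K - S_eq) / (mu + alpha)    # from the first equilibrium condition
assert abs(S - S_eq) < 1.0
assert abs(I - I_eq) < 0.5
```

The trajectory spirals into the equilibrium, consistent with the damped oscillations about endemic equilibria mentioned for measles.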
Then K is the carrying capacity of the population. If f < 1 the total population size is not constant, and K represents a carrying capacity, or maximum possible population size, rather than a population size. We view the first two equations as determining S and I, and then consider the third equation as determining N once S and I are known. This is possible because N does not enter into the first two equations. Instead of using N as the third variable in this model we could have used R, and the same reduction would have been possible. If the birth or recruitment rate Λ(N) is a function of total population size, then in the absence of disease the total population size N satisfies the differential equation

N' = Λ(N) − µN.

The carrying capacity of population size is the limiting population size K, satisfying Λ(K) = µK. The condition Λ'(K) < µ assures the asymptotic stability of the equilibrium population size K. It is reasonable to assume that K is the only positive equilibrium, so that Λ(N) > µN for 0 ≤ N < K. For most population models Λ(0) = 0; however, if Λ(N) represents recruitment into a behavioural class, as would be natural for models of sexually transmitted diseases, it would be plausible to have Λ(0) > 0, or even to consider Λ(N) to be a constant function. If Λ(0) = 0, we require Λ'(0) > µ, because if this requirement is not satisfied there is no positive equilibrium and the population would die out even in the absence of disease. We have used a mass action contact rate for simplicity, even though a more general contact rate would give a more accurate model, just as in the epidemics considered in the preceding section. With a general contact rate and a density-dependent birth rate we would have the model

S' = Λ(N) − β(N)SI − µS,
I' = β(N)SI − (µ + α)I,          (2.16)
N' = Λ(N) − (1 − f)αI − µN.

If f = 1, so that there are no disease deaths, the equation for N is

N' = Λ(N) − µN,

so that N(t) approaches a limiting population size K.
The theory of asymptotically autonomous systems [8, 24, 34, 37] implies that if N has a constant limit then the system is equivalent to the system in which N is replaced by this limit. Then the system (2.16) is the same as the system (2.15) with β replaced by the constant β(K), N by K, and Λ(N) replaced by the constant Λ(K) = µK. We shall analyze the model (2.15) qualitatively. In view of the remark above, our analysis will also apply to the more general model (2.16) if there are no disease deaths. Analysis of the system (2.16) with f < 1 is much more difficult. We will confine our study of (2.16) to a description without details. The first stage of the analysis is to note that the model (2.15) is a properly posed problem. That is, since S' ≥ 0 if S = 0 and I' ≥ 0 if I = 0, we have S ≥ 0, I ≥ 0 for t ≥ 0, and since N' ≤ 0 if N = K, we have N ≤ K for t ≥ 0. Thus the solution always remains in the biologically realistic region S ≥ 0, I ≥ 0, 0 ≤ N ≤ K if it starts in this region. By rights, we should verify such conditions whenever we analyze a mathematical model, but in practice this step is frequently overlooked. Our approach will be to identify equilibria (constant solutions) and then to determine the asymptotic stability of each equilibrium. Asymptotic stability of an equilibrium means that a solution starting sufficiently close to the equilibrium remains close to the equilibrium and approaches the equilibrium as t → ∞, while instability of the equilibrium means that there are solutions starting arbitrarily close to the equilibrium which do not approach it. To find equilibria (S∞, I∞) we set the right side of each of the two equations equal to zero. The second of the resulting algebraic equations factors, giving two alternatives. The first alternative is I∞ = 0, which will give a disease-free equilibrium, and the second alternative is βS∞ = µ + α, which will give an endemic equilibrium, provided βS∞ = µ + α < βK.
If I∞ = 0 the other equation gives S∞ = K = Λ/µ. For the endemic equilibrium the first equation gives

I∞ = (Λ − µS∞)/(βS∞) = µ(K − S∞)/(µ + α).

We linearize about an equilibrium (S∞, I∞) by letting y = S − S∞, z = I − I∞, writing the system in terms of the new variables y and z and retaining only the linear terms in a Taylor expansion. We obtain a system of two linear differential equations; the coefficient matrix of this linear system is

[ −βI∞ − µ    −βS∞          ]
[  βI∞         βS∞ − µ − α  ].

We then look for solutions whose components are constant multiples of e^{λt}; this means that λ must be an eigenvalue of the coefficient matrix. The condition that all solutions of the linearization at an equilibrium tend to zero as t → ∞ is that the real part of every eigenvalue of this coefficient matrix is negative. At the disease-free equilibrium the matrix is

[ −µ    −βK          ]
[  0     βK − µ − α  ],

which has eigenvalues −µ and βK − µ − α. Thus, the disease-free equilibrium is asymptotically stable if βK < µ + α and unstable if βK > µ + α. Note that this condition for instability of the disease-free equilibrium is the same as the condition for the existence of an endemic equilibrium. In general, the condition that the eigenvalues of a 2 × 2 matrix have negative real part is that the determinant be positive and the trace (the sum of the diagonal elements) be negative. Since βS∞ = µ + α at an endemic equilibrium, the matrix of the linearization at an endemic equilibrium is

[ −βI∞ − µ    −βS∞ ]
[  βI∞         0   ],

and this matrix has positive determinant and negative trace. Thus, the endemic equilibrium, if there is one, is always asymptotically stable. If the quantity

R0 = βK/(µ + α)

is less than one, the system has only the disease-free equilibrium, and this equilibrium is asymptotically stable. In fact, it is not difficult to prove that this asymptotic stability is global, that is, that every solution approaches the disease-free equilibrium. If the quantity R0 is greater than one then the disease-free equilibrium is unstable, but there is an endemic equilibrium that is asymptotically stable. Again, the quantity R0 is the basic reproduction number.
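The eigenvalue conditions above are easy to confirm numerically for a 2 × 2 matrix without any linear-algebra library. The parameter values are illustrative assumptions, chosen so that R0 > 1.

```python
# Eigenvalues of the 2x2 linearizations of (2.15); illustrative parameters.
import cmath

def eig2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] via the characteristic polynomial."""
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

beta, mu, alpha, Lam = 1.0e-3, 0.02, 0.25, 20.0
K = Lam / mu                                   # R0 = βK/(µ+α) ≈ 3.7

# disease-free equilibrium: eigenvalues are -µ and βK - µ - α
l1, l2 = eig2(-mu, -beta * K, 0.0, beta * K - mu - alpha)
assert {round(l1.real, 9), round(l2.real, 9)} == \
       {round(-mu, 9), round(beta * K - mu - alpha, 9)}
assert beta * K > mu + alpha                   # so the DFE is unstable here

# endemic equilibrium: βS∞ = µ + α, I∞ = µ(K - S∞)/(µ + α)
S_inf = (mu + alpha) / beta
I_inf = mu * (K - S_inf) / (mu + alpha)
l1, l2 = eig2(-beta * I_inf - mu, -beta * S_inf, beta * I_inf, 0.0)
assert l1.real < 0 and l2.real < 0             # always asymptotically stable
```

The endemic eigenvalues come out complex with negative real part, which is the damped-oscillation behaviour noted for measles.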
It depends on the particular disease (determining the parameter α) and on the rate of contacts, which may depend on the population density in the community being studied. The disease model exhibits a threshold behaviour: if the basic reproduction number is less than one the disease will die out, but if the basic reproduction number is greater than one the disease will be endemic. Just as for the epidemic models of the preceding section, the basic reproduction number is the number of secondary infections caused by a single infective introduced into a wholly susceptible population, because the number of contacts per infective in unit time is βK, and the mean infective period (corrected for natural mortality) is 1/(µ + α). There are two aspects of the analysis of the model (2.16) which are more complicated than the analysis of (2.15). The first is in the study of equilibria. Because of the dependence of Λ(N) and β(N) on N, it is necessary to use two of the equilibrium conditions to solve for S and I in terms of N, and then substitute into the third condition to obtain an equation for N. Then, by comparing the two sides of this equation for N = 0 and N = K, it is possible to show that there must be an endemic equilibrium value of N between 0 and K. The second complication is in the stability analysis. Since (2.16) is a three-dimensional system which cannot be reduced to a two-dimensional system, the coefficient matrix of its linearization at an equilibrium is a 3 × 3 matrix, and the resulting characteristic equation is a cubic polynomial equation of the form

λ³ + a₁λ² + a₂λ + a₃ = 0.

The Routh-Hurwitz conditions a₁ > 0, a₁a₂ > a₃ > 0 are necessary and sufficient conditions for all roots of the characteristic equation to have negative real part. A technically complicated calculation is needed to verify that these conditions are satisfied at an endemic equilibrium for the model (2.16).
The asymptotic stability of the endemic equilibrium means that the compartment sizes approach a steady state. If the equilibrium had been unstable, there would have been a possibility of sustained oscillations. Oscillations in a disease model mean fluctuations in the number of cases to be expected, and if the oscillations have long period this could also mean that experimental data for a short period would be quite unreliable as a predictor of the future. Epidemiological models which incorporate additional factors may exhibit oscillations; a variety of such situations is described in [18, 19]. The epidemic models of the first section also exhibited a threshold behaviour, but of a slightly different kind. For those models, which were SIR models without births or natural deaths, the threshold distinguished between a dying out of the disease and an epidemic, or short-term spread of disease. From the third equation of (2.15) we obtain

N' = Λ − µN − (1 − f)αI,

where N = S + I + R. From this we see that at the endemic equilibrium N = K − (1 − f)αI∞/µ, and the reduction in the population size from the carrying capacity K is

K − N∞ = (1 − f)αI∞/µ.

The parameter α in the SIR model may be considered as describing the pathogenicity of the disease. If α is large it is less likely that R0 > 1. If α is small then the total population size at the endemic equilibrium is close to the carrying capacity K of the population. Thus, the maximum population decrease caused by disease will be for diseases of intermediate pathogenicity. In order to describe a model for a disease from which infectives recover without immunity against reinfection, and that includes births and deaths as in the model (2.16), we may modify the model (2.16) by removing the equation for R and moving the term fαI, describing the rate of recovery from infection, to the equation for S.
this gives the model (2.20), describing a population with a density-dependent birth rate λ(n) per unit time, a proportional death rate µ in each class, and with a rate α of departure from the infective class through recovery or disease death and with a fraction f of infectives recovering with no immunity against reinfection. in this model, if f < 1 the total population size is not constant and k represents a carrying capacity, or maximum possible population size, rather than a constant population size. it is easy to verify that if we add the two equations of (2.20) and use n = s + i, we obtain an equation for n alone. for the sis model we are able to carry out the analysis with a general contact rate. if f = 1 the equation for n is n' = λ(n) − µn, and n approaches the limit k. the system (2.20) is asymptotically autonomous and its asymptotic behaviour is the same as that of the single differential equation (2.21), where s has been replaced by k − i. but (2.21) is a logistic equation which is easily solved analytically by separation of variables or qualitatively by an equilibrium analysis. we find that i → 0 if kβ(k) < (µ + α), or r0 < 1, and i → i∞ > 0, where i∞ = k − (µ + α)/β(k), if r0 > 1. to analyze the sis model if f < 1, it is convenient to use i and n as variables instead of s and i, with s replaced by n − i. this gives the model (2.22). equilibria are found by setting the right sides of the two differential equations equal to zero. the first of the resulting algebraic equations factors, giving two alternatives. the first alternative is i = 0, which will give a disease-free equilibrium i = 0, n = k, and the second alternative is β(n)(n − i) = µ + α, which may give an endemic equilibrium. for an endemic equilibrium (i∞, n∞) the first equation gives i∞ in terms of n∞; substitution into the other equilibrium condition gives an equation for n which can be simplified to the form (2.23). if r0 > 1, comparison of the two sides of (2.23) shows that the left side is less than the right side at one end of the interval and greater at the other, and this implies that (2.23) has a solution for n, 0 < n < k. thus there is an endemic equilibrium if r0 > 1.
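for f = 1 the limiting equation for i is logistic; with a constant contact rate β the endemic level is i∞ = k − (µ + α)/β whenever r0 = kβ/(µ + α) > 1. a minimal numerical sketch (the constant-β specialization and the parameter values are illustrative assumptions):

```python
# sis model with f = 1: limiting logistic equation for i(t),
# i' = beta*i*(k - i) - (mu + alpha)*i, with a constant contact rate beta.
def simulate_sis(beta, mu, alpha, k, i0, t_end, dt=1e-4):
    i = i0
    for _ in range(int(t_end / dt)):   # forward euler with a small step
        i += dt * (beta * i * (k - i) - (mu + alpha) * i)
    return i

beta, mu, alpha, k = 0.1, 0.02, 25.0, 1000.0   # illustrative values
r0 = k * beta / (mu + alpha)                   # basic reproduction number
i_inf = k - (mu + alpha) / beta                # predicted endemic level
i_end = simulate_sis(beta, mu, alpha, k, i0=1.0, t_end=2.0)
assert r0 > 1
assert abs(i_end - i_inf) / i_inf < 1e-3       # solution settles at i∞
```

the same routine with kβ < µ + α (r0 < 1) drives i toward zero, illustrating the threshold.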
if r0 < 1 this reasoning may be used to show that there is no endemic equilibrium. the linearization of (2.22) at an equilibrium (i∞, n∞) has a 2 × 2 coefficient matrix. at the disease-free equilibrium this matrix has eigenvalues λ'(k) − µ and kβ(k) − (µ + α). thus, the disease-free equilibrium is asymptotically stable if kβ(k) < µ + α, or r0 < 1, and unstable if kβ(k) > µ + α, or r0 > 1. note that the condition for instability of the disease-free equilibrium is the same as the condition for the existence of an endemic equilibrium. at an endemic equilibrium it is clear that the coefficient matrix has negative trace and positive determinant if λ'(n) < µ, and this implies that the endemic equilibrium is asymptotically stable. thus, the endemic equilibrium, which exists if r0 > 1, is always asymptotically stable. if r0 < 1 the system has only the disease-free equilibrium and this equilibrium is asymptotically stable. in the case f = 1 the verification of these properties remains valid if there are no births and deaths. this suggests that a requirement for the existence of an endemic equilibrium is a flow of new susceptibles, either through births, as in the sir model, or through recovery without immunity against reinfection, as in the sis model with or without births and deaths. if the epidemiological and demographic time scales are very different, for the sir model we observed that the approach to endemic equilibrium is like a rapid and severe epidemic. the same happens in the sis model, especially if there is a significant number of deaths due to disease. if there are few disease deaths the number of infectives at endemic equilibrium may be substantial, and there may be damped oscillations of large amplitude about the endemic equilibrium. for both the sir and sis models we may write the differential equation for i as i' = i[β(n)s − (µ + α)], which implies that whenever s exceeds its endemic equilibrium value s∞, i is increasing and epidemic-like behaviour is possible.
if r0 < 1 and s < k it follows that i' < 0, and thus i is decreasing. thus, if r0 < 1, i cannot increase and no epidemic can occur. next, we will turn to some applications of sir and sis models, taken mainly from [3]. in order to prevent a disease from becoming endemic it is necessary to reduce the basic reproduction number r0 below one. this may sometimes be achieved by immunization. if a fraction p of the λ newborn members per unit time of the population is successfully immunized, the effect is to replace k by k(1 − p), and thus to reduce the basic reproduction number to r0(1 − p). a population is said to have herd immunity if a large enough fraction has been immunized to assure that the disease cannot become endemic. the only disease for which this has actually been achieved worldwide is smallpox, for which r0 is approximately 5, so that 80% immunization does provide herd immunity. for measles, epidemiological data in the united states indicate that r0 for rural populations ranges from 5.4 to 6.3, requiring vaccination of 81.5-84.1% of the population. in urban areas r0 ranges from 8.3 to 13.0, requiring vaccination of 88.0-92.3% of the population. in great britain, r0 ranges from 12.5 to 16.3, requiring vaccination of 92-94% of the population. the measles vaccine is not always effective, and vaccination campaigns are never able to reach everyone. as a result, herd immunity against measles has not been achieved (and probably never can be). since smallpox is viewed as more serious and requires a lower percentage of the population to be immunized, herd immunity was attainable for smallpox. in fact, smallpox has been eliminated; the last known case was in somalia in 1977, and the virus is maintained now only in laboratories (although there is currently some concern that it may be reintroduced as a bioterrorism attack).
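the herd immunity condition r0(1 − p) < 1 gives a critical coverage p > 1 − 1/r0, which reproduces the percentages quoted above; a quick check:

```python
def critical_vaccination_fraction(r0):
    """smallest fraction p that must be successfully immunized so that
    r0*(1 - p) < 1, i.e. so the disease cannot become endemic."""
    return 1.0 - 1.0 / r0

# smallpox: r0 ≈ 5 → 80% coverage suffices
assert abs(critical_vaccination_fraction(5.0) - 0.80) < 1e-9
# measles in rural us populations: r0 from 5.4 to 6.3 → 81.5%-84.1%
assert round(100 * critical_vaccination_fraction(5.4), 1) == 81.5
assert round(100 * critical_vaccination_fraction(6.3), 1) == 84.1
```

the british range checks the same way: r0 = 12.5 gives 92.0% and r0 = 16.3 gives about 93.9%, matching the quoted 92-94%.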
the eradication of smallpox was actually more difficult than expected because high vaccination rates were achieved in some countries but not everywhere, and the disease persisted in some countries. the eradication of smallpox was possible only after an intensive campaign for worldwide vaccination [16]. in order to calculate the basic reproduction number r0 for a disease, we need to know the values of the contact rate β and the parameters µ, k, and α. the parameters µ, k, and α can usually be measured experimentally, but the contact rate β is difficult to determine directly. there is an indirect means of estimating r0 in terms of the life expectancy and the mean age at infection which enables us to avoid having to estimate the contact rate. in this calculation, we will assume that β is constant, but we will also indicate the modifications needed when β is a function of total population size n. the calculation assumes exponentially distributed life spans and infective periods. in fact, the result is valid so long as the life span is exponentially distributed. consider the "age cohort" of members of a population born at some time t0 and let a be the age of members of this cohort. if y(a) represents the fraction of members of the cohort who survive to age (at least) a, then the assumption that a fraction µ of the population dies per unit time means that y'(a) = −µy(a). since y(0) = 1, we may solve this first order initial value problem to obtain y(a) = e^(−µa). the fraction dying at (exactly) age a is −y'(a) = µy(a). the mean life span is the average age at death, which is ∫₀^∞ a(−y'(a))da; since y(a) = e^(−µa), this reduces to 1/µ. the life expectancy is often denoted by l, so that we may write l = 1/µ. the rate at which surviving susceptible members of the population become infected at age a and time t0 + a is βi(t0 + a). thus, if z(a) is the fraction of the age cohort alive and still susceptible at age a, z'(a) = −[µ + βi(t0 + a)]z(a).
solution of this first order linear differential equation gives z(a). the mean length of time in the susceptible class for members who may become infected, as opposed to dying while still susceptible, is the mean age at which members become infected. if the system is at an equilibrium i∞, this integral may be evaluated, and the mean age at infection, denoted by a, is given by a = 1/(βi∞). for our model the endemic equilibrium satisfies βi∞ + µ = µr0, and this implies the relation r0 = 1 + l/a (2.24). this relation is very useful in estimating basic reproduction numbers. for example, in some urban communities in england and wales between 1956 and 1969 the average age of contracting measles was 4.8 years. if life expectancy is assumed to be 70 years, this indicates r0 = 15.6. if β is a function β(n) of total population size the relation (2.24) is modified; if disease mortality does not have a large effect on total population size, in particular if there is no disease mortality, this modified relation is very close to (2.24). the relation between age at infection and basic reproduction number indicates that measures such as inoculations, which reduce r0, will increase the average age at infection. for diseases such as rubella (german measles), whose effects may be much more serious in adults than in children, this indicates a danger that must be taken into account: while inoculation of children will decrease the number of cases of illness, it will tend to increase the danger to those who are not inoculated or for whom the inoculation is not successful. nevertheless, the number of infections in older people will be reduced, although the fraction of cases which are in older people will increase. many common childhood diseases, such as measles, whooping cough, chicken pox, diphtheria, and rubella, exhibit variations from year to year in the number of cases. these fluctuations are frequently regular oscillations, suggesting that the solutions of a model might be periodic.
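the measles estimate quoted above can be reproduced from the relation r0 = 1 + l/a; a quick check:

```python
def r0_from_mean_age(life_expectancy, mean_age_at_infection):
    """estimate the basic reproduction number from r0 = 1 + l/a,
    valid for the sir model with exponentially distributed life spans."""
    return 1.0 + life_expectancy / mean_age_at_infection

# measles in urban england and wales, 1956-1969: a = 4.8 years, l = 70 years
assert round(r0_from_mean_age(70.0, 4.8), 1) == 15.6
```

the same one-liner shows how inoculation raises the mean age at infection: for a fixed l, a larger a corresponds to a smaller r0.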
this does not agree with the predictions of the model we have been using here; however, it would not be inconsistent with solutions of the characteristic equation which are complex conjugates with small negative real part, corresponding to lightly damped oscillations approaching the endemic equilibrium. such behaviour would look like recurring epidemics. if the eigenvalues of the matrix of the linearization at an endemic equilibrium are −u ± iv, where i² = −1, then the solutions of the linearization are of the form be^(−ut) cos(vt + c), with decreasing "amplitude" be^(−ut) and "period" 2π/v. for the model (2.15) we recall from (2.17) that at the endemic equilibrium we have βi∞ + µ = µr0, βs∞ = µ + α, and from (2.18) the eigenvalues of the matrix of the linearization are the roots of the quadratic equation λ² + µr0λ + µ(r0 − 1)(µ + α) = 0. if the mean infective period 1/α is much shorter than the mean life span 1/µ, we may neglect the terms that are quadratic in µ. thus, the eigenvalues are approximately −µr0/2 ± i√(µ(r0 − 1)α), and these are complex with imaginary part √(µ(r0 − 1)α). this indicates oscillations with period approximately 2π/√(µ(r0 − 1)α). we use the relation µ(r0 − 1) = µl/a and the mean infective period τ = 1/α to see that the interepidemic period t is approximately 2π√(aτ). thus, for example, for recurring outbreaks of measles with an infective period of 2 weeks or 1/26 year in a population with a life expectancy of 70 years with r0 estimated as 15, we would expect outbreaks spaced 2.76 years apart. also, as the "amplitude" at time t is e^(−µr0t/2), the maximum displacement from equilibrium is multiplied by a factor e^(−(15)(2.76)/140) = 0.744 over each cycle. in fact, many observations of measles outbreaks indicate less damping of the oscillations, suggesting that there may be additional influences that are not included in our simple model. to explain oscillations about the endemic equilibrium a more complicated model is needed.
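both numbers in the measles example follow directly from these approximations; a quick check with the values given in the text:

```python
import math

def interepidemic_period(mean_age, infective_period):
    """approximate interepidemic period t ≈ 2π√(aτ)."""
    return 2 * math.pi * math.sqrt(mean_age * infective_period)

life_exp, r0, tau = 70.0, 15.0, 1.0 / 26.0   # years; values from the text
mean_age = life_exp / (r0 - 1)               # a = l/(r0 − 1)
t = interepidemic_period(mean_age, tau)
assert abs(t - 2.76) < 0.01                  # outbreaks ~2.76 years apart

mu = 1.0 / life_exp
damping = math.exp(-mu * r0 * t / 2)         # amplitude factor per cycle
assert abs(damping - 0.744) < 0.001
```

the damping factor below one per cycle is what makes the predicted oscillations die out, in contrast to the roughly sustained oscillations seen in measles data.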
one possible generalization would be to assume seasonal variations in the contact rate. this is a reasonable supposition for a childhood disease most commonly transmitted through school contacts, especially in winter in cold climates. note, however, that data from observations are never as smooth as model predictions and models are inevitably gross simplifications of reality which cannot account for random variations in the variables. it may be difficult to judge from experimental data whether an oscillation is damped or persistent. in the model (2.15) the demographic time scale described by the birth and natural death rates λ and µ and the epidemiological time scale described by the rate α of departure from the infective class may differ substantially. think, for example, of a natural death rate µ = 1/75, corresponding to a human life expectancy of 75 years, and epidemiological parameters α = 25, f = 1, describing a disease from which all infectives recover after a mean infective period of 1/25 year, or two weeks. suppose we consider a carrying capacity k = 1, 000 and take β = 0.1, indicating that an average infective makes (0.1)(1, 000) = 100 contacts per year. then r 0 = 4.00, and at the endemic equilibrium we have s ∞ = 250.13, i ∞ = 0.40, r ∞ = 749.47. this equilibrium is globally asymptotically stable and is approached from every initial state. however, if we take s(0) = 999, i(0) = 1, r(0) = 0, simulating the introduction of a single infective into a susceptible population and solve the system numerically we find that the number of infectives rises sharply to a maximum of 400 and then decreases to almost zero in a period of 0.4 year, or about 5 months. in this time interval the susceptible population decreases to 22 and then begins to increase, while the removed (recovered and immune against reinfection) population increases to almost 1,000 and then begins a gradual decrease. 
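the sharp initial epidemic described above can be reproduced numerically. a minimal sketch; the birth term is taken as µk into the susceptible class, which is an assumption consistent with the equilibrium values quoted, and a small-step forward euler loop stands in for a proper ode solver:

```python
def simulate_sir(beta, mu, alpha, k, s0, i0, t_end, dt=1e-5):
    """sir model with births mu*k into the susceptible class and a natural
    death rate mu in each class (assumed form, consistent with the text's
    equilibrium values). returns the peak of i and the minimum of s."""
    s, i = s0, i0
    i_max, s_min = i0, s0
    for _ in range(int(t_end / dt)):
        ds = mu * k - beta * s * i - mu * s
        di = beta * s * i - (mu + alpha) * i
        s, i = s + dt * ds, i + dt * di
        i_max, s_min = max(i_max, i), min(s_min, s)
    return i_max, s_min

mu, alpha, beta, k = 1.0 / 75.0, 25.0, 0.1, 1000.0   # values from the text
i_max, s_min = simulate_sir(beta, mu, alpha, k, s0=999.0, i0=1.0, t_end=1.0)
assert 380 < i_max < 420   # the text reports a sharp peak of about 400 infectives
assert s_min < 30          # susceptibles drop to roughly 22 before recovering
```

following the same trajectory over decades (a much longer `t_end`) shows the widely spaced, shrinking recurrent epidemics described in the next paragraph.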
the size of this initial "epidemic" could not have been predicted from our qualitative analysis of the system (2.15). on the other hand, since µ is so small compared to the other parameters of the model, we might consider neglecting µ, replacing it by zero in the model. if we do this, the model reduces to the simple kermack-mckendrick epidemic model (without births and deaths) of the first section. if we follow the model (2.15) over a longer time interval we find that the susceptible population grows to 450 after 46 years, then drops to 120 during a small epidemic with a maximum of 18 infectives, and exhibits widely spaced epidemics decreasing in size. it takes a very long time before the system comes close to the endemic equilibrium and remains close to it. the large initial epidemic conforms to what has often been observed in practice when an infection is introduced into a population with no immunity, such as the smallpox inflicted on the aztecs by the invasion of cortez. if we use the model (2.15) with the same values of β, k and µ, but take α = 25, f = 0 to describe a disease fatal to all infectives, we obtain very similar results. now the total population is s + i, which decreases from an initial size of 1,000 to a minimum of 22 and then gradually increases and eventually approaches its equilibrium size of 250.53. thus, the disease reduces the total population size to one-fourth of its original value, suggesting that infectious diseases may have large effects on population size. this is true even for populations which would grow rapidly in the absence of infection, as we shall see later. many parts of the world experienced very rapid population growth in the eighteenth century. the population of europe increased from 118 million in 1700 to 187 million in 1800. in the same time period the population of great britain increased from 5.8 million to 9.15 million, and the population of china increased from 150 million to 313 million [27] . 
the population of english colonies in north america grew much more rapidly than this, aided by substantial immigration from england, but the native population, which had been reduced to one tenth of their previous size by disease following the early encounters with europeans and european diseases, grew even more rapidly. while some of these population increases may be explained by improvements in agriculture and food production, it appears that an even more important factor was the decrease in the death rate due to diseases. disease death rates dropped sharply in the eighteenth century, partly from better understanding of the links between illness and sanitation and partly because the recurring invasions of bubonic plague subsided, perhaps due to reduced susceptibility. one plausible explanation for these population increases is that the bubonic plague invasions served to control the population size, and when this control was removed the population size increased rapidly. in developing countries it is quite common to have high birth rates and high disease death rates. in fact, when disease death rates are reduced by improvements in health care and sanitation it is common for birth rates to decline as well, as families no longer need to have as many children to ensure that enough children survive to take care of the older generations. again, it is plausible to assume that population size would grow exponentially in the absence of disease but is controlled by disease mortality. the sir model with births and deaths of kermack and mckendrick [22] includes births in the susceptible class proportional to population size and a natural death rate in each class proportional to the size of the class. let us analyze a model of this type with birth rate r and a natural death rate µ < r. for simplicity we assume the disease is fatal to all infectives with disease death rate α, so that there is no removed class and the total population size is n = s + i. 
our model is s' = rn − βsi − µs, i' = βsi − (µ + α)i, with n = s + i. from the second equation we see that equilibria are given by either i = 0 or βs = µ + α. if i = 0 the first equilibrium equation is rs = µs, which implies s = 0 since r > µ. it is easy to see that the equilibrium (0, 0) is unstable. what actually would happen if i = 0 is that the susceptible population would grow exponentially with exponent r − µ > 0. if βs = µ + α the first equilibrium condition gives rn = µs + βsi, which leads to i = (r − µ)s/(µ + α − r). thus, there is an endemic equilibrium provided r < α + µ, and it is possible to show by linearizing about this equilibrium that it is asymptotically stable. on the other hand, if r > α + µ there is no positive equilibrium value for i. in this case we may add the two differential equations of the model to give n' = (r − µ)n − αi, and from this we may deduce that n grows exponentially. for this model either we have an asymptotically stable endemic equilibrium or population size grows exponentially. in the case of exponential population growth we may have either vanishing of the infection or an exponentially growing number of infectives. if only susceptibles contribute to the birth rate, as may be expected if the disease is sufficiently debilitating, the behaviour of the model is quite different. let us consider the model s' = rs − βsi − µs, i' = βsi − (µ + α)i, which has the same form as the celebrated lotka-volterra predator-prey model of population dynamics. this system has two equilibria, obtained by setting the right sides of each of the equations equal to zero, namely (0, 0) and an endemic equilibrium ((µ + α)/β, (r − µ)/β). it turns out that the qualitative analysis approach we have been using is not helpful, as the equilibrium (0, 0) is unstable and the eigenvalues of the coefficient matrix at the endemic equilibrium have real part zero. in this case the behaviour of the linearization does not necessarily carry over to the full system.
however, we can obtain information about the behaviour of the system by a method that begins with the elementary approach of separation of variables for first order differential equations. we begin by taking the quotient of the two differential equations to obtain the separable first order differential equation di/ds = i[βs − (µ + α)] / s[(r − µ) − βi]. integration gives the relation βs − (µ + α) log s + βi − (r − µ) log i = c, where c is a constant of integration. this relation shows that the quantity v(s, i) = βs − (µ + α) log s + βi − (r − µ) log i is constant on each orbit (path of a solution in the (s, i) plane). each of these orbits is a closed curve corresponding to a periodic solution. this model is the same as the simple epidemic model of the first section except for the birth and death terms, and in many examples the time scale of the disease is much faster than the time scale of the demographic process. we may view the model as describing an epidemic initially, leaving a susceptible population small enough that infection cannot establish itself. then there is a steady population growth until the number of susceptibles is large enough for an epidemic to recur. during this growth stage the infective population is very small and random effects may wipe out the infection, but the immigration of a small number of infectives will eventually restart the process. as a result, we would expect recurrent epidemics. in fact, bubonic plague epidemics did recur in europe for several hundred years. if we modify the demographic part of the model to assume limited population growth rather than exponential growth in the absence of disease, the effect would be to give behaviour like that of the model studied in the previous section, with an endemic equilibrium that is approached slowly in an oscillatory manner if r0 > 1. example. (fox rabies) rabies is a viral infection to which many animals, especially foxes, coyotes, wolves, and rats, are highly susceptible. while dogs are only moderately susceptible, they are the main source of rabies in humans.
although deaths of humans from rabies are few, the disease is still of concern because it is invariably fatal. however, the disease is endemic in animals in many parts of the world. a european epidemic of fox rabies thought to have begun in poland in 1939 and spread through much of europe has been modeled. we present here a simplified version of a model due to r.m. anderson and coworkers [1] . we begin with the demographic assumptions that foxes have a birth rate proportional to population size but that infected foxes do not produce offspring (because the disease is highly debilitating), and that there is a natural death rate proportional to population size. experimental data indicate a birth rate of approximately 1 per capita per year and a death rate of approximately 0.5 per capita per year, corresponding to a life expectancy of 2 years. the fox population is divided into susceptibles and infectives, and the epidemiological assumptions are that the rate of acquisition of infection is proportional to the number of encounters between susceptibles and infectives. we will assume a contact parameter β = 80, in rough agreement with observations of frequency of contact in regions where the fox density is approximately 1 fox/km 2 , and we assume that all infected foxes die with a mean infective period of approximately 5 days or 1/73 year. these assumptions lead to the model with β = 80, r = 1.0, µ = 0.5, α = 73. as this is of the form (2.26), we know that the orbits are closed curves in the (s, i) plane, and that both s and i are periodic functions of t. we illustrate with some simulations obtained using maple (figs. 2.8, 2.9, and 2.10). it should be noted from the graphs of i in terms of t that the period of the oscillation depends on the amplitude, and thus on the initial conditions, with larger amplitudes corresponding to longer periods. a warning is in order here. the model predicts that for long time intervals the number of infected foxes is extremely small. 
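the closed orbits can be checked numerically: the separation-of-variables argument yields a first integral, v(s, i) = βs − (µ + α) log s + βi − (r − µ) log i, constant along solutions (this explicit form is stated here as an assumption consistent with that derivation). an rk4 sketch with the fox rabies parameter values above:

```python
import math

beta, r, mu, alpha = 80.0, 1.0, 0.5, 73.0   # fox rabies values from the text

def field(y):
    s, i = y
    return (r * s - mu * s - beta * s * i, beta * s * i - (mu + alpha) * i)

def rk4_step(y, dt):
    k1 = field(y)
    k2 = field((y[0] + 0.5 * dt * k1[0], y[1] + 0.5 * dt * k1[1]))
    k3 = field((y[0] + 0.5 * dt * k2[0], y[1] + 0.5 * dt * k2[1]))
    k4 = field((y[0] + dt * k3[0], y[1] + dt * k3[1]))
    return (y[0] + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            y[1] + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))

def orbit_constant(s, i):
    # assumed first integral of this lotka-volterra type system
    return beta * s - (mu + alpha) * math.log(s) + beta * i - (r - mu) * math.log(i)

y = (0.95, (r - mu) / beta)     # start near the endemic equilibrium (0.919, 0.00625)
v0 = orbit_constant(*y)
drift = 0.0
for _ in range(20000):          # about two oscillation periods with dt = 1e-4
    y = rk4_step(y, 1e-4)
    drift = max(drift, abs(orbit_constant(*y) - v0))
assert drift / abs(v0) < 1e-5   # v is numerically conserved along the orbit
```

conservation of v is exactly what forces the orbits to be closed curves, and hence s and i to be periodic, as the text states.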
with such small numbers, the continuous deterministic models we have been using (which assume that population sizes are differentiable functions) are quite inappropriate. if the density of foxes is extremely small an encounter between foxes is a random event, and the number of contacts cannot be described properly by a function of population densities. to describe disease transmission properly when population sizes are very small we would need to use a stochastic model. now let us modify the demographic assumptions by assuming that the birth rate decreases as population size increases. we replace the birth rate of r per susceptible per year by a birth rate of re^(−as) per susceptible per year, with a a positive constant. then, in the absence of infection, the fox population is given by the first order differential equation n' = n(re^(−an) − µ). we omit the verification that the equilibrium n = 0 is unstable while the positive equilibrium n = (1/a) log(r/µ) is asymptotically stable. thus, the population has a carrying capacity given by k = (1/a) log(r/µ). the model now becomes s' = rse^(−as) − µs − βsi, i' = βsi − (µ + α)i. if βs = µ + α there is an endemic equilibrium with βi + µ = re^(−as). a straightforward computation, which we shall not carry out here, shows that the disease-free equilibrium is asymptotically stable if r0 = βk/(µ + α) < 1 and unstable if r0 > 1, while the endemic equilibrium, which exists if and only if r0 > 1, is always asymptotically stable. another way to express the condition for an endemic equilibrium is to say that the fox population density must exceed a threshold level k_t given by k_t = (µ + α)/β. with the parameter values we have been using, this gives a threshold fox density of 0.92 fox/km². if the fox density is below this threshold value, the fox population will approach its carrying capacity and the disease will die out. above the threshold density, rabies will persist and will regulate the fox population to a level below its carrying capacity. this level may be approached in an oscillatory manner for large r0.
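the quoted threshold density follows directly from the condition r0 = βk/(µ + α) > 1; a quick check:

```python
def threshold_density(mu, alpha, beta):
    """fox density k_t = (µ + α)/β below which rabies dies out."""
    return (mu + alpha) / beta

# parameter values from the fox rabies example
assert round(threshold_density(mu=0.5, alpha=73.0, beta=80.0), 2) == 0.92
```

equivalently, a region's carrying capacity k exceeds k_t exactly when r0 > 1, so the threshold can be read either as a density condition or a reproduction number condition.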
the 1927 epidemic model of kermack and mckendrick was in fact an age of infection model. we will describe a general age of infection model and carry out a partial analysis; there are many unsolved problems in the analysis. we continue to let s(t) denote the number of susceptibles at time t and r(t) the number of members recovered with immunity, but now we let i*(t) denote the number of infected (but not necessarily infective) members. we make the following assumptions: 1. the population has a birth rate λ(n) and a natural death rate µ, giving a carrying capacity k such that λ(k) = µk, λ'(k) < µ. 2. an average infected member makes c(n) contacts in unit time, of which s/n are with susceptibles. we define β(n) = c(n)/n, and it is reasonable to assume that β'(n) ≤ 0, c'(n) ≥ 0. 3. b(τ) is the fraction of infecteds remaining infective if alive when infection age is τ, and b_µ(τ) = e^(−µτ)b(τ) is the fraction of infecteds remaining alive and infected when infection age is τ. let b̂_µ(0) = ∫₀^∞ b_µ(τ)dτ. in previous sections we have used b(τ) = e^(−ατ), which would give b_µ(τ) = e^(−(µ+α)τ). we let i0(t) be the number of new infecteds at time t, i(t, τ) be the number of infecteds at time t with infection age τ, and let φ(t) be the total infectivity at time t. then differentiation of the equation for i* gives three terms, including the rate of new infections and the rate of natural deaths; the third term gives the rate of recovery plus the rate of disease death. thus the si*r model is obtained. since i* is determined when s, φ, n are known, we have dropped the equation for i* from the model, but it will be convenient to recall it. if f = 1 then n(t) approaches the limit k, the model is asymptotically autonomous, and its dimension may be reduced to two, replacing n by the constant k. for future use, we define m = (1 − f)(1 − µb̂_µ(0)), and note that 0 ≤ m ≤ 1; if f = 1 then m = 0.
we also have, using integration by parts, a corresponding expression. if a single infective is introduced into a wholly susceptible population, making kβ(k) contacts in unit time, the fraction still infective at infection age τ is b_µ(τ) and the infectivity at infection age τ is a_µ(τ). thus r0, the total number of secondary infections caused, is r0 = kβ(k)∫₀^∞ a_µ(τ)dτ = kβ(k)â_µ(0). example. (exposed periods) one common example of an age of infection model is a model with an exposed period, during which individuals have been infected but are not yet infective. thus we may think of infected susceptibles going into an exposed class (e), proceeding from the exposed class to the infective class (i) at rate κe and out of the infective class at rate αi. exposed members have infectivity 0 and infective members have infectivity 1. thus i* = e + i and φ = i. we let u(τ) be the fraction of infected members with infection age τ who are not yet infective if alive and v(τ) the fraction of infected members who are infective if alive. then the fraction becoming infective at infection age τ if alive is κu(τ), and we have the system (2.28). the solution of the first of the equations of (2.28) is u(τ) = e^(−κτ). when we multiply the second equation by the integrating factor e^(ατ) and integrate, we obtain the solution v(τ) = κ(e^(−κτ) − e^(−ατ))/(α − κ); the infectivity is a(τ) = v(τ), and e^(−µτ)v(τ) is the term a_µ(τ) in the general model. the term b(τ) is u(τ) + v(τ). with these choices and the identifications e = i* − φ, i = φ, we may verify that the system (2.27) reduces to a standard seir model. for some diseases there is an asymptomatic period during which individuals have some infectivity, rather than an exposed period. if the infectivity during this period is reduced by a factor ε, then the model can be described by a similar system. this may be considered as an age of infection model with the same identifications of the variables and the same choice of u(τ), v(τ) but with a(τ) = εu(τ) + v(τ). there is a disease-free equilibrium s = n = k, φ = 0 of (2.27).
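for the exposed-period example, â_µ(0) = ∫₀^∞ e^(−µτ)v(τ)dτ evaluates in closed form to κ/((µ + κ)(µ + α)), so that r0 = kβ(k)κ/((µ + κ)(µ + α)); a numerical quadrature confirms the closed form (the parameter values are illustrative):

```python
import math

kappa, alpha, mu = 0.5, 25.0, 1.0 / 75.0   # illustrative rates (per year)

def v(tau):
    # fraction infective at infection age tau in the exposed-period model
    return kappa * (math.exp(-kappa * tau) - math.exp(-alpha * tau)) / (alpha - kappa)

# riemann-sum quadrature of ∫ e^(−µτ) v(τ) dτ over a long truncated range
dtau, t_max = 1e-3, 60.0
a_mu_hat = sum(math.exp(-mu * i * dtau) * v(i * dtau)
               for i in range(1, int(t_max / dtau))) * dtau
closed_form = kappa / ((mu + kappa) * (mu + alpha))
assert abs(a_mu_hat - closed_form) / closed_form < 1e-3
```

the closed form makes the familiar seir threshold explicit: the exposed stage discounts r0 by the factor κ/(µ + κ), the probability of surviving the exposed period.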
endemic equilibria (s, n, φ) are given by the equilibrium conditions of (2.27). if f = 1 the third condition gives λ(n) = µn, which implies n = k. then the second condition may be solved for s, after which the first condition may be solved for φ. thus, there is always an endemic equilibrium in this case. if f < 1 the second of the equilibrium conditions gives s in terms of n. now substitution of the first two equilibrium conditions into the third gives an equilibrium condition for n, namely (2.29). if r0 < 1 we must have λ(n) < µn; however, this would contradict the demographic condition λ(n) > µn, 0 < n < k, imposed earlier. this shows that if r0 < 1 there is no endemic equilibrium. if r0 > 1, for n = 0 the left side of (2.29) is non-negative while the right side is negative, and for n = k the left side of (2.29) is µk(1 − m) while a comparison of the two sides (using r0 > 1) shows the inequality is reversed; this shows that there is an endemic equilibrium solution for n. the linearization of (2.27) at an equilibrium (s, n, φ) may be computed, and the condition that this linearization has solutions which are constant multiples of e^(λt) is that λ satisfies a characteristic equation. the choice of a function q(λ) appearing in this characteristic equation is motivated by an integration by parts formula. the characteristic equation then reduces to (2.30), where p = β(n) + sβ'(n) ≥ 0. the characteristic equation for a model consisting of a system of ordinary differential equations is a polynomial equation. now we have a transcendental characteristic equation, but there is a basic theorem that if all roots of the characteristic equation at an equilibrium have negative real part then the equilibrium is asymptotically stable [39, chap. 4]. at the disease-free equilibrium s = n = k, φ = 0 the characteristic equation reduces to kβ(k)â_µ(λ) = 1. since the absolute value of the left side of this equation is no greater than kβ(k)â_µ(0) if ℜλ ≥ 0, the disease-free equilibrium is asymptotically stable if and only if r0 = kβ(k)â_µ(0) < 1.
in the analysis of the characteristic equation (2.30) it is helpful to make use of the following elementary result: if |p(λ)| ≤ 1 and ℜg(λ) > 0 for ℜλ ≥ 0, then all roots of the characteristic equation p(λ) = 1 + g(λ) satisfy ℜλ < 0. to prove this result, we observe that if ℜλ ≥ 0 the left side of the characteristic equation has absolute value at most 1 while the right side has absolute value greater than 1. if f = 1, the characteristic equation reduces to an equation of exactly this form: we have |sβ(n)â_µ(λ)| ≤ sβ(n)â_µ(0) = 1, and the term φβ(n)/(λ + µ) in (2.30) has positive real part if ℜλ ≥ 0. it follows from the above elementary result that all roots satisfy ℜλ < 0, so that the endemic equilibrium is asymptotically stable. thus all roots of the characteristic equation (2.30) have negative real part if f = 1. the analysis if f < 1 is more difficult. the roots of the characteristic equation depend continuously on the parameters of the equation. in order to have a root with ℜλ ≥ 0 there must be parameter values for which either there is a root at "infinity", or there is a root λ = 0, or there is a pair of pure imaginary roots λ = ±iy, y > 0. since the left side of (2.30) approaches 0 while the right side approaches 1 as λ → ∞ with ℜλ ≥ 0, it is not possible for a root to appear at "infinity". for λ = 0, since sβ(n)â_µ(0) = 1 and β'(n) ≤ 0, the left side of (2.30) is less than 1 at λ = 0, while the right side is greater than 1 since 1 − λ'(n)b̂_µ(0) > 1 − λ'(n)/µ > 0 if λ'(n) < µ. this shows that λ = 0 is not a root of (2.30), and therefore that all roots satisfy ℜλ < 0 unless there is a pair of roots λ = ±iy, y > 0. according to the hopf bifurcation theorem [20], a pair of roots λ = ±iy, y > 0 indicates that the system (2.27) has an asymptotically stable periodic solution and there are sustained oscillations of the system. a somewhat complicated calculation, using the fact that b_µ(τ) is monotone non-increasing, shows that instability requires that a certain term in (2.30) have negative real part for some y > 0.
Since β′(N) ≤ 0, instability requires a condition on B̂_µ(iy); this is not possible with mass action incidence, since then β′(N) = 0, so with mass action incidence the endemic equilibrium of (2.27) is always asymptotically stable. There are certainly less restrictive conditions which guarantee asymptotic stability. However, examples of instability have been given [36, 37], even with f = 0 and Λ′(N) = 0, where constant infectivity would have produced asymptotic stability. These results indicate that concentration of infectivity early in the infected period is conducive to such instability. In these examples, the instability arises because a root of the characteristic equation crosses the imaginary axis as parameters of the model change, giving a pure imaginary root of the characteristic equation; this translates into oscillatory solutions of the model. Thus infectivity which depends on infection age can cause instability and sustained oscillations. In order to formulate an SI*S age-of-infection model, we need only take the SI*R age-of-infection model (2.22) and move the recovery term from the equation for R (which was not listed explicitly in the model) to the equation for S. Although we will not carry out any analysis of the resulting model, it may be attacked by the same approach as that used for (2.27). It may be shown that if R_0 = Kβ(K)Â_µ(0) < 1, the disease-free equilibrium is asymptotically stable. If R_0 > 1, there is an endemic equilibrium, and the characteristic equation at this equilibrium involves P = β(N) + Sβ′(N) ≥ 0. Many diseases, including most strains of influenza, impart only temporary immunity against reinfection on recovery. Such diseases may be described by SIS age-of-infection models, thinking of the infected class I* as comprising the infective class I together with the recovered and immune class R; members of R neither spread nor acquire infection. We assume that immunity is lost at a proportional rate κ.
We let u(τ) be the fraction of infected members with infection age τ who are infective if alive, and v(τ) the fraction of infected members who are recovered but still immune if alive. Then the fraction becoming immune at infection age τ, if alive, is αu(τ). The resulting equations are the same as (2.28), obtained in formulating the SEIR model, with α and κ interchanged; thus we may solve them explicitly. We take B(τ) = u(τ) + v(τ) and A(τ) = u(τ). Then, if we define I = φ and R = I* − φ, the model (2.31) is equivalent to a standard SIRS system. If we assume, instead of an exponentially distributed immune period, an immune period of fixed length ω, we would again obtain u(τ) = e^(−ατ), but now we may calculate that v(τ) = 1 − e^(−ατ) for τ ≤ ω, and v(τ) = e^(−ατ)(e^(αω) − 1) for τ > ω. To obtain this, we note that v′(τ) = αu(τ) for τ ≤ ω, and v′(τ) = αu(τ) − αu(τ − ω) for τ > ω. From these we may calculate A(τ) and B(τ) for an SI*S model. Since it is known that the endemic equilibrium of an SIRS model with a removed period of fixed length can be unstable [19], this shows that (2.33) may have roots with non-negative real part, and the endemic equilibrium of an SI*S age-of-infection model is not necessarily asymptotically stable. The SI*R age-of-infection model is actually a special case of the SI*S age-of-infection model: we may view the class R as still infected but having no infectivity, so that v(τ) = 0. The underlying idea is that in infection-age models we divide the population into members who may become infected and members who cannot become infected, either because they are already infected or because they are immune. We conclude by returning to the beginning, namely an infection-age epidemic model closely related to the original Kermack-McKendrick epidemic model [21]. We simply remove the birth and natural death terms from the SI*R model (2.27); for the resulting model it can be shown that S_∞ > 0.
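The piecewise formula for v(τ) above can be checked numerically against the balance relations v′(τ) = αu(τ) for τ ≤ ω and v′(τ) = αu(τ) − αu(τ − ω) for τ > ω. A minimal sketch; the parameter values α = 0.5 and ω = 2.0 are illustrative choices, not taken from the text:

```python
import math

ALPHA = 0.5   # recovery rate (illustrative value)
OMEGA = 2.0   # fixed length of the immune period (illustrative value)

def u(tau):
    # fraction still infective at infection age tau
    return math.exp(-ALPHA * tau)

def v(tau):
    # fraction recovered and still immune: the piecewise formula from the text
    if tau <= OMEGA:
        return 1.0 - math.exp(-ALPHA * tau)
    return math.exp(-ALPHA * tau) * (math.exp(ALPHA * OMEGA) - 1.0)

def v_prime_expected(tau):
    # balance relation: immunity is gained at rate alpha*u(tau) and,
    # for tau > omega, lost by those who became immune omega time units ago
    if tau <= OMEGA:
        return ALPHA * u(tau)
    return ALPHA * u(tau) - ALPHA * u(tau - OMEGA)

def check(tau, h=1e-6):
    # central finite difference of v compared with the expected derivative
    numeric = (v(tau + h) - v(tau - h)) / (2 * h)
    return abs(numeric - v_prime_expected(tau))

# continuity at tau = omega, and the derivative relation on both branches
assert abs(v(OMEGA - 1e-9) - v(OMEGA + 1e-9)) < 1e-6
assert check(1.0) < 1e-6
assert check(3.5) < 1e-6
```

The continuity check at τ = ω confirms that the two branches of v match there, which is what fixes the constant e^(αω) − 1 in the second branch.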
We recall that we are assuming here that β(0) is finite; in other words, we are ruling out standard incidence. It is possible to show that S_∞ can be zero only if N → 0 and ∫_0^K β(N) dN diverges; however, from (2.8) we see that this is possible only if f = 0. If there are no disease deaths, so that the total population size N is constant, or if β is constant (mass action incidence), the above integration gives the final size relation. We may view the epidemic management model (2.13) as an age-of-infection model. We define I* = E + Q + I + J, and we need only calculate the kernels A(τ) and B(τ). We let u(τ) denote the number of members of infection age τ in E, v(τ) the number in Q, w(τ) the number in I, and z(τ) the number in J. Then (u, v, w, z) satisfies a linear homogeneous system with constant coefficients, with initial conditions u(0) = 1, v(0) = 0, w(0) = 0, z(0) = 0. This system is easily solved recursively, and (2.13) is then an age-of-infection epidemic model with A(τ) = ε_E u(τ) + ε_E ε_Q v(τ) + w(τ) + ε_J z(τ) and B(τ) = u(τ) + v(τ) + w(τ) + z(τ). In particular, it now follows from the argument carried out just above that S_∞ > 0 for the model (2.13), and the proof is less complicated technically than a proof tailored to the specific model (2.13). The generalization to age-of-infection models both unifies the theory and makes some calculations less complicated.

References:
- Population dynamics of fox rabies in Europe
- Infectious Diseases of Humans (Oxford Science Publications)
- Mathematical Models in Population Biology and Epidemiology
- Vertically Transmitted Diseases: Models and Dynamics (Biomathematics)
- Mathematical Approaches for Emerging and Reemerging Infectious Diseases: An Introduction
- Mathematical Approaches for Emerging and Reemerging Infectious Diseases: Models, Methods and Theory
- The role of long incubation periods in the dynamics of HIV/AIDS. Part 1: Single population models
- Asymptotically autonomous epidemic models
- Epidemic Modelling: An Introduction
- Overall patterns in the transmission cycle of infectious disease agents
- The first epidemic model: a historical note on P. D. En'ko
- Detecting nonlinearity and chaos in epidemic data
- Singapore and Beijing experience
- The saturating contact rate in marriage and epidemic models
- Qualitative analysis for communicable disease models
- An immunization model for a heterogeneous population
- The mathematics of infectious diseases
- Periodicity in epidemic models
- Periodicity and stability in epidemic models: a survey
- Abzweigung einer periodischen Lösung von einer stationären Lösung eines Differentialsystems
- Kermack and McKendrick: A contribution to the mathematical theory of epidemics
- Kermack and McKendrick: Contributions to the mathematical theory of epidemics, Part II
- Kermack and McKendrick: Contributions to the mathematical theory of epidemics, Part III
- Asymptotically autonomous differential systems
- Dynamic models of infectious diseases as regulators of population size
- Plagues and Peoples
- The global condition
- Network theory and SARS: predicting outbreak diversity
- Epidemic Models: Their Structure and Relation to Data
- The structure and function of complex networks
- Raggett: Modeling the Eyam plague
- Interpretation of periodicity in disease prevalence
- Exploring complex networks
- Asymptotically autonomous differential equations in the plane
- Mathematics in Population Biology
- Mathematical and Statistical Approaches to AIDS Epidemiology
- How may infection-age dependent infectivity affect the dynamics of HIV/AIDS?
- Reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission
- Theory of Nonlinear Age-Dependent Population Dynamics (Marcel Dekker)

key: cord-007255-jmjolo9p
authors: Pulliam, Juliet R. C.; Dushoff, Jonathan
title: Ability to replicate in the cytoplasm predicts zoonotic transmission of livestock viruses
date: 2009-02-15
journal: J Infect Dis
doi: 10.1086/596510
doc_id: 7255
cord_uid: jmjolo9p

Understanding viral factors that promote cross-species transmission is important for evaluating the risk of zoonotic emergence. We constructed a database of viruses of domestic artiodactyls and examined the correlation between traits linked in the literature to cross-species transmission and the ability of viruses to infect humans. Among these traits (genomic material, genome segmentation, and replication without nuclear entry), the last is the strongest predictor of cross-species transmission. This finding highlights nuclear entry as a barrier to transmission and suggests that the ability to complete replication in the cytoplasm may prove to be a useful indicator of the threat of cross-species transmission. Previous studies have compared emerging human pathogens to nonemerging human pathogens and looked for characteristics typical of those considered to be emerging [1-3]. To ask which characteristics predict host jumps requires a different approach: specifically, we must examine the pool of other hosts' pathogens that a target species regularly encounters. From this pool we can compare the characteristics of microbes that are able to infect the target host versus those that manifest no evidence of an ability to infect the target host. Molecular characteristics that facilitate cross-species transmission are likely to be substantially different among viruses, bacteria, and protozoa, because of large differences in the pathobiology of these taxa. Here, we focus on cross-species transmission of viral infections and examine the effects of three characteristics that are described in the literature as expected to affect the ability of a viral group to infect a novel host species: genome segmentation, genomic material, and site of replication.
The ability to rapidly explore genetic state space is expected to increase the probability of a host jump, so we expect that viruses with RNA genomes will have a higher probability of jumping than viruses with DNA genomes [1, 4], and that viruses with segmented genomes will have a higher probability of jumping than viruses with nonsegmented genomes [4]. Complex interactions with a host's cellular machinery, on the other hand, are expected to decrease the probability of a host jump, so we expect that viruses that are able to complete replication in the cytoplasm will have a higher probability of jumping than viruses requiring nuclear entry [5]. To examine the effects of these characteristics, we should choose a target species that maximizes the chance that viral infections due to cross-species transmission events will have been detected; the obvious choice is humans. Likewise, we should minimize differences in exposure of the target host to infectious virions produced by the source hosts. Humans have regular contact with all potentially infectious bodily fluids of domestic food animals; we thus ensure that the target species has contact with all viral groups infecting the source hosts by analyzing the pool of viral species known to infect sheep, goats, cattle, and pigs. Methods. We constructed a database containing taxonomic and molecular data on known viruses of domestic artiodactyls. To determine which viruses to include in the database, we searched the primary literature for references documenting infection of these species with all recognized species in all viral genera known to infect mammals. For each viral species infecting sheep, goats, pigs, or cattle, we then searched the literature to determine whether human infections have been documented (see Table A1 in Appendix A, which appears only in the electronic edition of the journal).
Viruses dependent on coinfection with other viral species, viruses known to be maintained through continuous transmission within humans (see Appendix A), and viruses for which documented instances of artiodactyl infection resulted from human-to-animal transmission or experimental infection were excluded from the database. All literature searches were performed between 10 January 2007 and 15 February 2007 using Web of Science. The database contains information on the three molecular characteristics hypothesized to influence the potential of a virus to cross host species: site of replication (x_sr; whether replication is completed in the cytoplasm or requires nuclear entry), genomic material (x_gm; RNA or DNA), and segmentation of the viral genome (x_seg; segmented or nonsegmented). These characteristics are conserved at the family level, and classifications were made on the basis of standard reference books [6, 7]. We used a combination of hypothesis testing and model-based prediction to analyze the database. Hypothesis testing allowed us to determine how likely it was that the observed patterns were due to chance, whereas model-based prediction allowed us to determine which trait or set of traits best predicts a livestock virus's ability to infect humans and to estimate the probability that a particular virus species would be able to jump host species, given knowledge of the traits of interest. Computer code and data are available at http://lalashan.mcmaster.ca/hostjumps/ or from the authors. To determine the statistical significance of the effect of each trait on zoonotic transmission, independent of the two other traits of interest, we performed a series of randomization tests.
For a particular trait, we held the values of the other two traits and the ability of the viral species in the database to infect humans constant, and permuted the values of the trait of interest within each subset defined by the other two traits (thereby preserving the full cross-correlational structure of the data with regard to the three viral traits) 100,000 times. The p value was given by the proportion of permutations that allowed the model to predict outcomes as well as or better than the model constructed from the observed data, and an α level of 5% was used to determine the statistical significance of results. We compared model fit by use of a logistic regression model that predicted the ability to infect humans as a function of replication site, genomic material, and segmentation. The logistic model was fit in the R statistics package [8], and fits were compared on the basis of likelihood. Because the three traits examined are conserved at the family level for all species in our database, treating species as independent may bias our results; we therefore repeated our analysis at the genus and family levels. Permutations of the data set were constructed by permuting the values of the trait under consideration at the taxonomic level examined and assigning species within a genus (or family) the corresponding trait value after permutation. p values were calculated as in the species-level analysis. To examine the magnitude and relative importance of the effects of the three molecular characteristics of interest on the ability of the viral species in the database to infect humans, we developed a set of logistic regression models, each including some combination of viral traits as independent variables and the ability to infect humans as the dependent variable.
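The within-stratum permutation scheme described above can be sketched as follows. The data here are synthetic, and a simple agreement count stands in for the full logistic-model fit used in the paper; the stratification and p-value logic are the point of the example:

```python
import random
from collections import defaultdict

random.seed(0)

# Synthetic records standing in for the database: each record is
# (site_of_replication, genomic_material, segmentation, infects_humans),
# all coded 0/1.  Site of replication is made mildly predictive.
records = []
for _ in range(141):
    sr, gm, seg = (random.randint(0, 1) for _ in range(3))
    human = 1 if random.random() < (0.45 if sr else 0.15) else 0
    records.append((sr, gm, seg, human))

def statistic(recs):
    # Toy stand-in for model fit quality: agreement between the trait
    # of interest (site of replication) and the outcome.
    return sum(1 for sr, _, _, h in recs if sr == h)

def permute_within_strata(recs):
    # Permute the trait of interest only within subsets defined by the
    # other two traits, preserving the cross-correlational structure.
    strata = defaultdict(list)
    for i, (sr, gm, seg, h) in enumerate(recs):
        strata[(gm, seg)].append(i)
    out = list(recs)
    for idx in strata.values():
        vals = [recs[i][0] for i in idx]
        random.shuffle(vals)
        for i, val in zip(idx, vals):
            _, gm, seg, h = recs[i]
            out[i] = (val, gm, seg, h)
    return out

observed = statistic(records)
n_perm = 2000  # the paper used 100,000
count = sum(statistic(permute_within_strata(records)) >= observed
            for _ in range(n_perm))
p_value = count / n_perm
assert 0.0 <= p_value <= 1.0
```

The p value is the fraction of within-stratum shuffles that match or beat the observed statistic, mirroring the definition in the text.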
Traits not having a statistically significant effect on the ability of livestock viruses to infect humans were still considered for model-based predictions, because sample sizes were limited and small but real effects may not be detected via hypothesis testing. We estimated parameter values for each model in R and compared models using Akaike's information criterion adjusted for small sample size (AIC_c) [9]. Results. A total of 146 viral species were found to infect the livestock species of interest and meet the other criteria for inclusion in the database. Of these, 141 species (representing 59 genera in 22 families) fulfilled the criteria for inclusion in the analysis. The effect of site of replication was significant at all three taxonomic levels examined (p < .001, p = .018, and p = .014 at the species, genus, and family levels, respectively), with nearly half of the virus species completing replication in the cytoplasm able to infect humans. Neither genomic material nor segmentation showed a significant effect on the ability of livestock viruses to infect humans at any taxonomic level. Logistic regression model comparisons are summarized in Table 1; the models are listed in order as ranked by AIC_c. Figure 1 compares the observed data with the results of the best model. The best model included site of replication as the only variable (odds ratio, 17.4 [95% confidence interval, 3.98-75.8]), and the top four models were the four that included site of replication. Each of these models showed a positive correlation between replication in the cytoplasm and the ability to infect humans, as expected. Segmentation appears in models 2, 4, 6, and 7, and all four of these models showed a positive correlation between having a segmented genome and the ability to infect humans. Genomic material appears in models 3, 4, 5, and 6; again, all four models showed a correlation in the expected direction.
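The AIC_c ranking and Akaike weights reported in Table 1 follow directly from each model's maximized log-likelihood. A sketch of the calculation; the log-likelihoods and model names below are made up for illustration, not the values from the paper:

```python
import math

def aic_c(log_lik, k, n):
    # Akaike's information criterion with small-sample correction
    aic = 2 * k - 2 * log_lik
    return aic + (2 * k * (k + 1)) / (n - k - 1)

n = 141  # number of viral species in the analysis
# (log-likelihood, number of parameters) for candidate models;
# the numbers here are illustrative only
models = {"sr": (-70.0, 2), "sr+seg": (-69.5, 3), "gm": (-80.0, 2)}

scores = {name: aic_c(ll, k, n) for name, (ll, k) in models.items()}
best = min(scores.values())
deltas = {name: s - best for name, s in scores.items()}

# Akaike weight: relative likelihood of each model, normalized to sum to 1
rel = {name: math.exp(-0.5 * d) for name, d in deltas.items()}
total = sum(rel.values())
weights = {name: r / total for name, r in rel.items()}

assert min(scores, key=scores.get) == "sr"
assert abs(sum(weights.values()) - 1.0) < 1e-12
```

Note also that a logistic regression coefficient β converts to the reported odds ratio as exp(β), so an odds ratio of 17.4 corresponds to β ≈ 2.86.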
It is interesting to note that both of the viral species that caused major pandemics in humans in the 20th century (HIV and influenza virus A) require nuclear entry for replication. Because influenza virus A infects domestic artiodactyls but was excluded from our database (it is maintained through continuous transmission in humans), we confirmed the robustness of our results to this exclusion; we also confirmed that our findings were robust to the inclusion of viral species for which human infection data were based solely on serology (see Table B1 in Appendix B, which appears only in the electronic edition of the journal). Discussion. Our analyses indicate that viral species infecting domestic artiodactyls are more likely to infect humans if they complete replication in the cytoplasm without nuclear entry. The observed effect of cytoplasmic replication on host-jumping ability is not surprising, given the complex molecular pathways regulating nuclear entry. Viral species that are unable to complete replication in the cytoplasm require intracellular transport from the site of penetration, targeting of the nucleus through nuclear localization signals, and importation of genetic material, proteins, and/or whole virions through the nuclear pore complex [10]. The combination of molecular mechanisms governing this chain of events is likely to be highly host specific, because of strong selective pressure against admission of foreign particles into the nucleus. To date, discussion of barriers to viral replication has largely focused on receptors for cellular entry. The concentration on this aspect of the viral life cycle exists for two substantive reasons: first, the inability to enter a cell obviously precludes viral replication; second, several well-documented viral host jumps have been shown to occur after point mutations that modify interactions between viral particles and cellular receptors [11-13].
The effect of nuclear entry seen in our data set emphasizes that cellular entry, while a necessary step, is insufficient for completion of the viral life cycle. The ability to produce genetic diversity is the factor most widely discussed as expected to increase viral host-jumping ability [1, 3-5, 14]. Although the observed effects of genomic material and segmentation were not statistically significant, our data do not necessarily contradict this expectation. The hypothesized effect of segmentation, in particular, may be obscured in our data set by a combination of the small number of viral species with segmented genomes and the absence of segmented DNA viruses. On the other hand, the lack of predictive power associated with genomic material and segmentation in our data set may indicate that consideration of these traits alone is insufficient to capture the potential to generate useful genetic diversity. The degree to which the pool of viruses infecting domestic artiodactyls is typical of all potentially zoonotic viral species is uncertain, and other pools of viral species should be examined to determine the generality of our results.

Note (Table 1): x_sr, x_gm, and x_seg are variables indicating the molecular characteristics of a viral species (see Methods). ln(ℓ) is the log-likelihood of the best-fit parameter combination for a given model, and k is the number of model parameters. AIC_c is the value of Akaike's information criterion with small-sample correction for each model; ΔAIC_c is the difference in AIC_c value between a given model and the best model (i.e., the model with the lowest AIC_c value). w_i is the Akaike weight of the model. β_seg, β_gm, and β_sr are regression coefficients for genome segmentation, genomic material, and site of replication, respectively, and β_i is the estimated intercept for the best-fit parameter combination for each model.
Similarly, further studies should examine whether the observed patterns hold for cross-species transmission of viruses to other target host species, including wildlife and domestic animals. Given the rapid rates at which ecological relationships between species are changing as a result of anthropogenic landscape changes, global warming, and the globalization of both human and animal populations, the development of indicators of the risk of cross-species pathogen transmission is an increasingly important goal. As humans, domestic animals, and wildlife are brought into contact with species from which they were formerly isolated, they inevitably encounter the pathogens that these species carry. The finding that the ability to complete replication in the cytoplasm is the best predictor of zoonotic transmission, and that nearly half of domestic artiodactyl viruses able to complete replication in the cytoplasm can infect humans, suggests that cytoplasmic replication will be a useful indicator of the ability of a newly encountered virus species to jump hosts, an essential prerequisite to epidemic or pandemic emergence [15]. It should be noted, however, that the present analysis focused exclusively on the ability to infect the target host; the viral traits influencing this step in the emergence process may differ from those that predispose a virus to cause severe disease in a novel host, as well as from those that facilitate transmission within a novel host species.

References:
- Diseases of humans and their domestic mammals: pathogen characteristics, host range, and the risk of emergence
- Risk factors for human disease emergence
- Host range and emerging and reemerging infectious diseases
- Evolvability of emerging viruses
- Viral host jumps: moving toward a predictive framework
- Virus Taxonomy: Eighth Report of the International Committee on Taxonomy of Viruses
- The Springer Index of Viruses
- R: A Language and Environment for Statistical Computing (Vienna: R Foundation for Statistical Computing)
- Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach
- Viral entry into the nucleus
- The natural host range shift and subsequent evolution of canine parvovirus resulted from virus-specific binding to the canine transferrin receptor
- Structure of SARS coronavirus spike receptor-binding domain complexed with receptor
- A single amino acid in the PB2 gene of influenza A virus is a determinant of host range
- Molecular constraints to interspecies transmission of viral pathogens
- Origins of major human infectious diseases

key: cord-027318-hinho0mh
authors: Zak, Matthew; Krzyżak, Adam
title: Classification of Lung Diseases Using Deep Learning Models
date: 2020-05-22
journal: Computational Science - ICCS 2020
doi: 10.1007/978-3-030-50420-5_47
doc_id: 27318
cord_uid: hinho0mh

In this paper we address the problem of medical data scarcity by considering the task of detecting pulmonary diseases from chest X-ray images using small datasets with fewer than a thousand samples. We implemented three deep convolutional neural networks (VGG16, ResNet-50, and InceptionV3) pre-trained on the ImageNet dataset and assessed them in lung disease classification tasks using a transfer learning approach. We created a pipeline that segments chest X-ray (CXR) images prior to classifying them, and we compared the performance of our framework with existing ones. We demonstrate that pre-trained models combined with simple classifiers such as shallow neural networks can compete with complex systems. We also validated our framework on the publicly available Shenzhen and Montgomery lung datasets and compared its performance to the currently available solutions. Our method was able to reach the same level of accuracy as the best-performing models trained on the Montgomery dataset; however, the advantage of our approach is a smaller number of trainable parameters.
Furthermore, our InceptionV3-based model almost tied with the best-performing solution on the Shenzhen dataset despite being computationally less expensive. (Footnote: Supported by the Natural Sciences and Engineering Research Council of Canada. Part of this research was carried out by the second author during his visit to the West Pomeranian University of Technology while on sabbatical leave from Concordia University.) The availability of computationally powerful machines has enabled breakthroughs in medical image analysis and processing by emerging methods such as pixel/voxel-based machine learning (PML). Instead of calculating features from segmented regions, this technique uses voxel/pixel values in input images directly; therefore, neither segmentation nor feature extraction is required. The performance of PMLs can possibly exceed that of common classifiers [16], as this method avoids errors caused by inaccurate segmentation and feature extraction. The most popular and powerful approaches include convolutional neural networks (including shift-invariant neural networks), which have achieved false-positive (FP) rate reduction in computer-aided diagnosis (CAD) frameworks for the detection of masses and microcalcifications in mammography [12] and for lung nodule detection in chest X-ray (CXR) images [13], as well as neural filters and massive-training artificial neural networks (MTANNs), including mixtures of expert MTANNs, Laplacian eigenfunction MTANNs (LAP-MTANNs), and massive-training support vector regression (MTSVR), used for classification, object detection, and image enhancement: malignant lung nodule detection in CT, FP reduction in CAD for polyp detection in CT colonography, bone separation from soft tissue in CXR, and enhancement of lung nodules in CT [11].
roughly 1 million of adults require hospitalization because of pneumonia, and about 50,000 dies from this disease annually in the us only. examination of lung nodules in cxr can lead to missing of diseases like lung cancer. however, not all of them are visible in retrospect. studies show that 82-95% of lung cancer cases were missed due to occlusions (at least partial) by ribs or clavicle. to address this problem researchers examined dual-energy imaging, a technique which can produce images of two tissues, namely "soft-tissue" image and "bone" image. this technique has many drawbacks, but undoubtedly one of the most important ones is the exposure to radiation. the mtanns models have been developed to address this problem and serve as a technique for ribs/soft-tissue separation. the idea behind training of those algorithms is to provide them with bone and soft-tissue images obtained from a dual-energy radiography system. the mtann was trained using cxrs as input and corresponding boneless images. the ribs contrast is visibly suppressed in the resulting image, maintaining the soft tissue areas such as lung vessels. recent developments in deep neural networks [2] lead to major improvements in medical imaging. the efficiency of dimensionality reduction algorithms like lung segmentation was demonstrated in the chest x-ray image analysis. recently researchers aimed at improving tuberculosis detection on relatively small data sets of less than 103 images per class by incorporating deep learning segmentation and classification methods from [4] . we will further explore these techniques in this paper. in this paper we combine two relatively small datasets containing less than 103 images per class for classification (pneumonia and tuberculosis detection) and segmentation purposes. we selected 306 examples per "disease" class (306 images with tuberculosis and 306 images with pneumonia) and 306 of healthy patients yielding the set of 918 samples from different patients. 
Sample images from both datasets are shown in Fig. 1. The Shenzhen Hospital (SH) dataset [2, 6] of CXR images was created by the People's Hospital in Shenzhen, China. It includes both abnormal (containing traces of tuberculosis) and normal CXR images. Unfortunately, the dataset is not well balanced in terms of the absence or presence of disease, gender, or age. We extracted only 153 samples of healthy patients (153 from each of the two datasets) and 306 samples labeled with traces of tuberculosis. Selecting the images for one class from different resources ensures that the model is not contaminated by features resulting from the image acquisition method, e.g., the lens. Pneumonia is an inflammatory condition of the lung affecting the small air sacs known as alveoli; typical symptoms include a dry hacking cough, difficulty breathing, chest pain, and fever. The labeled optical tomography and chest X-ray images for classification dataset [9] includes selected images of pneumonia patients from a medical center in Guangzhou. It consists of two classes: normal images and those containing marks of pneumonia. All data come from patients' routine clinical care. The complete dataset includes thousands of validated optical coherence tomography (OCT) and X-ray images, yet for our analysis we wanted to keep the dataset small and evenly distributed; thus only 153 images labeled as healthy (the other 153 healthy images come from the tuberculosis dataset) and 306 labeled as pneumonia were selected, both chosen randomly.
Due to non-identical borders and lung shapes, the segmentation data have high variability, although their distribution is quite similar to that of the regular data when compared in terms of image-area distribution. Model-based methods greatly improve their predictions when the number of training samples grows; when a limited amount of data is available, transformations have to be applied to the existing dataset to synthetically augment the training set. The researchers in [10] employed three techniques to augment the training dataset. The first approach randomly crops a fixed-size 224 × 224 pixel window from a 256 × 256 pixel image. The second technique flips the image horizontally, which captures information about reflection invariance. The third method adds randomly generated lighting to capture color and illumination variation. Transfer learning is a very popular approach in computer vision tasks using deep neural networks when data resources are scarce: to launch a new task, we incorporate pre-trained models skilled in solving similar problems. This method is crucial in medical image processing due to the shortage of real samples. In deep neural networks, feature extraction is carried out by passing raw data through models specialized in other tasks; here, we can refer to deep learning models such as ResNet, where the last-layer activations serve as input to a new classifier. Transfer learning in deep learning problems is commonly performed using the pre-trained models approach: a pre-trained model provides a starting point for a model used in a different task. This involves incorporating the whole model or parts of it, and the adopted model may or may not need to be fine-tuned on the input-output data for the new task. A third option is to select one of the available published models.
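The first two augmentation techniques described above (random 256 → 224 crop and horizontal flip) can be sketched with plain Python lists; a real pipeline would operate on image tensors, and the random lighting perturbation is omitted here:

```python
import random

random.seed(0)

def random_crop(img, size):
    # img is a list of rows (H x W); crop a size x size window
    # at a random offset, as in the 256 -> 224 crop described above
    h, w = len(img), len(img[0])
    top = random.randint(0, h - size)
    left = random.randint(0, w - size)
    return [row[left:left + size] for row in img[top:top + size]]

def horizontal_flip(img):
    # mirror each row to capture reflection invariance
    return [row[::-1] for row in img]

# a synthetic 256 x 256 "image" of pixel intensities
image = [[(r * 256 + c) % 255 for c in range(256)] for r in range(256)]
patch = random_crop(image, 224)
flipped = horizontal_flip(patch)

assert len(patch) == 224 and len(patch[0]) == 224
assert flipped[0] == patch[0][::-1]
```

Because the crop offset is random, repeated passes over the same image yield different 224 × 224 windows, which is what multiplies the effective training set size.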
it is very common that research institutions publish their algorithms trained on challenging datasets which may fully or partially cover the problem stated by a new task. imagenet [3] is a project that helps computer vision researchers in classification and detection tasks by providing them with a large image dataset. this database contains roughly 14 million images from over 20 thousand classes. imagenet also provides bounding-box annotations for over 1 million images, which are used in object localization problems. in this work, we experiment with three deep models (vgg16, resnet-50, and inceptionv3) pre-trained on the imagenet dataset. the vgg16 convolutional network is a model with 16 layers trained on fixed-size images. the input is processed through a set of convolution layers which use small kernels with a 3 × 3 receptive field; this is the smallest size allowing the network to capture the notion of up, down, left, right, and center. the architecture also incorporates 1 × 1 kernels, which may be interpreted as a linear transformation of the input (followed by a nonlinearity). the stride of the convolutions (the number of pixels shifted in every convolution step) is fixed to 1 pixel, and the padding is fixed to 1 pixel for 3 × 3 kernels; therefore the spatial resolution remains the same after processing an input through a layer. spatial downsizing is performed by five max-pooling layers, which follow some of the convolution layers (not every convolution layer is followed by max-pooling). the max-pooling operation is carried out over a fixed 2 × 2 pixel window, with a stride of 2 pixels. this cascade of convolutional layers ends with three fully-connected (fc) layers, where the first two consist of 4096 nodes each and the third of 1000, as it performs the 1000-way classification using softmax.
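the claim that 3 × 3 convolutions with stride 1 and padding 1 preserve spatial resolution, while 2 × 2 max-pooling with stride 2 halves it, follows from the standard output-size formula (a small illustrative check, not code from the paper):

```python
def conv_out(size, kernel, stride=1, padding=0):
    # output side length of a convolution or pooling layer
    return (size + 2 * padding - kernel) // stride + 1

# 3x3 kernels, stride 1, padding 1: spatial resolution is preserved
same = conv_out(224, kernel=3, stride=1, padding=1)   # 224
# 2x2 max-pooling, stride 2: resolution is halved
half = conv_out(224, kernel=2, stride=2)              # 112
```

applying the five pooling stages of vgg16 in sequence takes a 224-pixel input down to a 7 × 7 feature map, which is what the first fully-connected layer consumes.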
all hidden layers use the same relu (rectified linear unit) non-linearity [10]. the resnet convolutional neural network is a 50-layer deep model trained on more than a million fixed-size images from the imagenet dataset. the network classifies an input image into one of 1000 object classes, such as car, airplane, horse, or mouse. the network has learned a rich set of features thanks to the diversity of the training images and achieved a 6.71% top-5 error rate on the imagenet dataset. the resnet-50 convolutional neural network consists of 5 stages, each having convolution and identity blocks, and every convolution block consists of 3 convolutional layers. resnet-50 is related to resnet-34, and the idea behind its sibling model remains the same; the only difference is in the residual blocks: unlike resnet-34, resnet-50 replaces every two layers in a residual block with a three-layer bottleneck block using 1 × 1 convolutions, which reduce and eventually restore the channel depth. this reduces the computational load when the 3 × 3 convolution is calculated. the model input is first processed through a layer with 64 filters of size 7 × 7 and stride 2 and downsized by a max-pooling operation carried out over a fixed 2 × 2 pixel window, with a stride of 2 pixels. the second stage consists of three identical blocks, each containing a double convolution with 64 filters of 3 × 3 pixels and a skip-connection block. the third pile of convolutions starts with a dotted line (image not included), as there is a change in the dimensionality of the input; this effect is achieved by changing the stride in the first convolution block from 1 to 2 pixels. the fourth and fifth groups of convolutions and skip connections follow the pattern presented in the third stage, yet they change the number of filters (kernels) to 256 and 512, respectively. this model has over 25 million parameters.
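the computational saving of the bottleneck design can be illustrated by counting weights (a sketch with a hypothetical channel count; biases are omitted for simplicity):

```python
def conv_params(c_in, c_out, k):
    # weight count of a k x k convolution (biases omitted)
    return c_in * c_out * k * k

def basic_block(c):
    # two stacked 3x3 convolutions, as in resnet-34
    return 2 * conv_params(c, c, 3)

def bottleneck_block(c, reduced):
    # 1x1 reduce -> 3x3 -> 1x1 restore, as in resnet-50
    return (conv_params(c, reduced, 1)
            + conv_params(reduced, reduced, 3)
            + conv_params(reduced, c, 1))

# hypothetical example: 256 channels, reduced to 64 inside the bottleneck
pair = basic_block(256)                 # 1,179,648 weights
bottleneck = bottleneck_block(256, 64)  # 69,632 weights
```

the 1 × 1 convolutions make the expensive 3 × 3 convolution operate on a much thinner tensor, which is exactly the reduction in load described above.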
the researchers from google introduced the first inception (inceptionv1) neural network in 2014 during the imagenet competition. the model consisted of blocks called "inception cells" that conduct convolutions with filters of different scales and afterward aggregate the results into one output. thanks to 1 × 1 convolutions, which reduce the input channel depth, the model saves computations. using a set of 1 × 1, 3 × 3, and 5 × 5 filters, an inception cell learns to extract features of different scales from the input image. although inception cells use a max-pooling operator, the spatial dimensions of the processed data are preserved due to "same" padding, so the outputs can be properly concatenated. a follow-up paper [17] was released not long after, introducing inceptionv3, a more efficient solution than the first version of the inception cell. large filters sized 5 × 5 and 7 × 7 are useful for extracting large-scale spatial features, yet their disadvantage lies in the number of parameters and the resulting computational cost. the inceptionv3 model contains over 23 million parameters. the architecture can be divided into 5 modules. the first processing block consists of 3 inception modules. then, information is passed through an effective grid-size reduction and processed through four consecutive inception cells with asymmetric convolutions. moving forward, information flows to the 17 × 17 pixels convolution layer connected to an auxiliary classifier and another effective grid-size-reduction block. finally, data progresses through a series of two blocks with wider filter banks and reaches a fully-connected layer ending with a softmax classifier. a visualization of the network architecture can be found in fig. 3. many vision-related tasks, especially those from the field of medical image processing, expect a class to be assigned to every pixel, i.e., every pixel is associated with a corresponding class.
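the parameter cost of large filters, and the savings from the factorizations used in inceptionv3 (two stacked 3 × 3 convolutions in place of a 5 × 5, and the asymmetric 1 × 7 / 7 × 1 pair in place of a 7 × 7), can be checked by counting weights (an illustrative sketch; the channel count is a hypothetical value):

```python
def conv_weights(c_in, c_out, kh, kw):
    # weight count of a kh x kw convolution (biases omitted)
    return c_in * c_out * kh * kw

c = 64  # hypothetical channel count, kept equal for input and output

# a 5x5 filter vs two stacked 3x3 filters (same receptive field)
five = conv_weights(c, c, 5, 5)             # 25 * c^2
two_threes = 2 * conv_weights(c, c, 3, 3)   # 18 * c^2

# a 7x7 filter vs the asymmetric pair 1x7 followed by 7x1
seven = conv_weights(c, c, 7, 7)            # 49 * c^2
asym = conv_weights(c, c, 1, 7) + conv_weights(c, c, 7, 1)  # 14 * c^2
```

the asymmetric factorization is the more dramatic of the two, which is why the middle inception cells of inceptionv3 rely on it.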
to conduct this process, we use the so-called u-net neural network architecture described in [18] and in sect. 4.2. this model works well with very few training image examples, yielding precise segmentation. the motivation behind this network is to supplement a usual contracting network with successive layers in which pooling operators are replaced by upsampling operators, consequently increasing the output resolution. high-resolution features are combined with the upsampled output to perform localization. the deconvolution layers consist of a large number of kernels, which better propagate information and result in outputs with higher resolution. owing to the described procedures, the deconvolution path is approximately symmetric to the contracting one, and so the architecture resembles a u shape. there are no fully-connected layers, therefore making it possible to conduct the seamless segmentation of relatively large images, extrapolating the missing context by mirroring the processed input. the network shown in fig. 4 consists of an expansive path (right) and a contracting one (left). the first (contracting) part resembles a typical convolutional neural network: repeated 3 × 3 convolutions, each followed by a non-linearity (here relu), and 2 × 2 pooling with stride 2. each downsampling operation doubles the number of resulting feature maps. all expansive-path operations consist of upsampling of the feature channels followed by a 2 × 2 deconvolution (or "up-convolution") which halves the number of feature maps. the result is then concatenated with the corresponding feature layer from the contracting path, convolved with 3 × 3 kernels, and passed through a relu. the final layer applies a 1 × 1 convolution to map each feature vector to the desired class. following the approaches presented in the literature, we wanted to use deep convolutional neural networks to segment the lungs [8] before processing them through the classification models mentioned in sect. 3.4.
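the doubling of feature maps along the contracting path and their halving along the expansive path give the characteristic u-shaped channel schedule (a sketch assuming the 64-channel base and five resolution levels of the original u-net paper [18]):

```python
def unet_channels(base=64, depth=5):
    # contracting path: each downsampling step doubles the feature maps;
    # expansive path: each up-convolution halves them again
    down = [base * 2 ** i for i in range(depth)]
    up = down[-2::-1]
    return down, up

down, up = unet_channels()
# down: [64, 128, 256, 512, 1024]; up: [512, 256, 128, 64]
```

the symmetry of the two lists is the "u" of u-net: every expansive level has a contracting counterpart whose feature maps it receives through a skip connection.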
researchers in [8] indicate that the u-net architecture and its modifications outperform the majority of cnn-based models and achieve excellent results by easily capturing spatial information about the lungs. thus, we propose a pipeline that consists of two stages: first segmentation and then classification. the phase of extracting the valuable information (the lungs) is conducted with the model presented in sect. 3.2; our algorithms were trained for 500 epochs on an extension of the sh dataset. the input to our u-shaped deep neural network is a regular chest x-ray image, whereas the output is a manually prepared binary mask of the lung shape, matching the input. the code for the transfer-learning models is publicly available through a python api, keras. our algorithms were trained on servers equipped with gpus provided by helios calcul québec, which consists of fifteen computing nodes each having eight nvidia k20 gpus and, additionally, six computing nodes with eight nvidia k80 boards each; every k80 board includes two gpus, for a total of 216 gpus in the cluster. as mentioned before, our model was trained for 500 epochs using a dataset partitioned into 80%, 10%, and 10% bins for the training, validation and test parts, respectively, using the models introduced in sect. 3.4, a batch size of 8 samples, the augmentation techniques briefed in sect. 3.2, the adam optimizer, and categorical cross-entropy as the loss function for pixel-wise binary classification. the training results are shown in fig. 5. as we can easily notice, the validation error falls slowly throughout the whole training, with no major change after the 100th epoch. the final error is right below 0.05 on the validation set and slightly above 0.06 on the test set. our algorithm learns shape-related features typical of lungs and generalizes well to unseen data. figure 6 shows the results of our trained u-net models.
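the 80%/10%/10% partitioning can be sketched as follows (a minimal sketch; the random seed and the total count of 918 images, i.e. 306 per class, are assumptions for illustration, not the exact procedure used in our experiments):

```python
import random

def split_indices(n, train_frac=0.8, val_frac=0.1, seed=42):
    # shuffle sample indices, then cut them into train/val/test bins
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])

# e.g. 918 images in total (306 healthy, 306 tuberculosis, 306 pneumonia)
train_idx, val_idx, test_idx = split_indices(918)
```

fixing the seed keeps the same samples in the test bin across the segmentation and classification stages, so no test image leaks into training.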
it is clear that the network was able to learn chest-shape features and to exclude regions containing internal organs such as the heart. these promising results allowed us to process the whole dataset presented earlier and continue our analysis on the newly processed images. we propose a two-stage pipeline for classifying lung pathologies: the first stage performs chest x-ray image segmentation and the second lung disease classification. the first (segmentation) stage was trained in the experiments described in the previous section. the second stage utilizes the deep models described in sect. 3.4, and we investigate potential improvements in performance depending on the type of model used. our classification models were trained using the same setup as described in sect. 3.4. here, we conduct our experiments using the data described in sect. 3.1; the difference is the prior segmentation, which extracts the information valuable for the task, namely the lungs. figure 6 shows the training samples; the left and right panels correspond to input and output, respectively. we tried all three deep net classifiers (vgg16, resnet-50, inceptionv3) in the task of classifying lung images into the healthy, pneumonia, and tuberculosis classes. we observed that the inceptionv3-based model performed best, and thus, due to lack of space, we display only its performance results. the confusion matrix in fig. 8(a) shows that the new model improved the number of true positives (tp) in all classes in comparison with the vgg16- and resnet-50-based models. figure 8(b) shows that the auc scores for the healthy, tuberculosis and pneumonia cases were 90%, 93%, and 99%, respectively. after comparing these with the results obtained by models without transfer learning, we observe that transfer-learning models perform well in lung disease classification on segmented images even when data resources are scarce.
in this section, we compare the performance of our models with the results achieved in the literature over different datasets (fig. 9). the algorithm that scored best in the majority of results was inceptionv3 trained on the segmented images. what is more, it produced very high scores for the "disease" classes, showing that a random instance containing marks of tuberculosis or pneumonia has over 90% probability of being classified into the correct class. although the scores for the healthy class are worse than for the diseased ones, its real cost is indeed lower, as it is always worse to classify a sick patient as healthy. the inceptionv3-based model scored best, reaching an accuracy higher than the vgg16 algorithms by over 12%. although the interpretability of our methods is not guaranteed, we can clearly state that using transfer-learning-based algorithms on small datasets allows achieving competitive classification scores on unseen data. furthermore, we compared the class activation maps shown in fig. 10 in order to investigate the reasoning behind the decision making. the remaining features, here the lungs, force the network to explore them and thus make decisions based on observed changes. that behavior was expected and additionally improved the interpretability of our models, as the marked regions might draw the attention of the doctor in the case of sick patients. for the comparison, we trained our algorithms on the shenzhen and montgomery datasets [6] ten times, generated the results for all the models, and averaged their scores: accuracy, precision, sensitivity, specificity, f1 score and auc. table 1 presents a comparison of different deep learning models trained on the shenzhen dataset [6]. although our approach does not guarantee the best performance, it is always close to the highest, even though it is typically less complex.
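all the averaged scores can be derived from a 2 × 2 confusion matrix (a minimal sketch with hypothetical counts, not values from our experiments; auc is omitted as it requires the ranked prediction scores rather than the matrix alone):

```python
def binary_metrics(tp, fp, fn, tn):
    # the scores reported in the comparison, from a 2x2 confusion matrix
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # recall / true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, precision, sensitivity, specificity, f1

# hypothetical counts for one disease class vs the rest
acc, prec, sens, spec, f1 = binary_metrics(tp=45, fp=5, fn=10, tn=40)
```

sensitivity is the clinically critical score here: it measures exactly the cost discussed above, the fraction of sick patients who are not missed.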
researchers in [5] used various pre-trained models in the pulmonary disease detection task, and an ensemble of them yields the highest accuracy and sensitivity. in comparison, our inceptionv3-based model achieves an accuracy smaller by only one percent and has an identical auc, which means that our method is equally likely to rank a positive case of tuberculosis above a negative sample. although we could not outperform the best methods, our framework is less complicated. furthermore, in table 2 we compare the performance of our framework trained on the montgomery dataset [6] to the literature. our inceptionv3-based model tied with [14] in terms of accuracy, yet showed a higher auc value. the resnet-50- and vgg16-based models performed worse, however not by much, as they reached accuracies of 76% and 73% respectively, roughly 3% and 6% less than the highest score achieved. table 1 compares different deep-learning-based solutions trained on the shenzhen dataset [6], reporting accuracy, precision, sensitivity, specificity, f1 score and auc (a horizontal line means the corresponding result was not provided in the literature); although our result is not the best, it performs better than any single model (excluding the ensemble). table 2 reports the same comparison on the montgomery dataset [6]; our average performance is almost identical to [14]. we created a lung disease classification pipeline based on transfer learning, applied to small datasets of lung images. we evaluated its performance in the classification of non-segmented and segmented chest x-ray images. in our best-performing framework, we used the u-net segmentation network and an inceptionv3 deep model classifier. our frameworks were compared with the existing models. we demonstrated that models pre-trained with a transfer-learning approach and simple classifiers such as shallow neural networks can successfully compete with complex systems.
[1] tb detection in chest radiograph using deep learning architecture
[2] lung segmentation in chest radiographs using anatomical atlases with nonrigid registration
[3] imagenet: a large-scale hierarchical image database
[4] deep learning with lung segmentation and bone shadow exclusion techniques for chest x-ray analysis of lung cancer
[5] abnormality detection and localization in chest x-rays using deep convolutional neural networks. arxiv
[6] two public chest x-ray datasets for computer-aided screening of pulmonary diseases
[7] automatic tuberculosis screening using chest radiographs
[8] nuclei segmentation in histopathological images using two-stage learning
[9] large dataset of labeled optical coherence tomography (oct) and chest x-ray images
[10] imagenet classification with deep convolutional neural networks
[11] computer-aided detection of peripheral lung cancers missed at ct: roc analyses without and with localization
[12] a multiple circular path convolution neural network system for detection of mammographic masses
[13] artificial convolution neural network for medical image pattern recognition
[14] efficient deep network architectures for fast chest x-ray tuberculosis screening and visualization
[15] a novel approach for tuberculosis screening based on deep convolutional neural networks. in: medical imaging 2016: computer-aided diagnosis
[16] pixel-based machine learning in medical imaging
[17] rethinking the inception architecture for computer vision
[18] u-net: convolutional networks for biomedical image segmentation

key: cord-020610-hsw7dk4d authors: thys, séverine title: contesting the (super)natural origins of ebola in macenta, guinea: biomedical and popular approaches date: 2019-10-12 journal: framing animals as epidemic villains doi: 10.1007/978-3-030-26795-7_7 sha: doc_id: 20610 cord_uid: hsw7dk4d

in december 2013, a two-year-old child died from viral haemorrhagic fever in méliandou village in the south-east of guinea, and constituted the likely index case of a major epidemic.
when the virus was formally identified as ebola, epidemiologists started to investigate the chains of transmission, while local people were trying to make sense out of these deaths. the epidemic control measures taken by national and international health agencies were soon faced with strong reluctance and a sometimes aggressive attitude from the affected communities. based on ethnographic work in macenta (forest region) in the autumn of 2014 for the global outbreak and alert response network (goarn) of the world health organization, this chapter shows that while epidemiologists involved in the outbreak response attributed the first ebola deaths in the forest region to the transmission of a virus from an unknown animal reservoir, local citizens believed these deaths were caused by the breach of a taboo. epidemiological and popular explanations, mainly evolving in parallel but sometimes overlapping, were driven by different explanatory models: a biomedical model embodying nature in the guise of an animal disease reservoir, which in turn poses a threat to humanity, and a traditional-religious model wherein nature and culture are not dichotomized. the chapter will argue that epidemic responses must be flexible and need to systematically document popular discourse(s), rumours, codes, practices, knowledge and opinions related to the outbreak event. this precious information must be used not only to shape and adapt control interventions and health promotion messages, but also to trace the complex biosocial dynamics of such zoonotic diseases beyond the usual narrow focus on wild animals as the sources of infection. the epidemic officially started at the end of december 2013 with the death of a two-year-old child in the village of méliandou in guéckédou prefecture, four days after the onset of symptoms (fever, black stools and vomiting).
3 this patient would be considered from now on as the 'case zero', the index case stemming the severe ebola virus disease (evd) epidemic of west africa from apparently a single zoonotic transmission event. 4 but then, with the idea of the spillover taking centre stage, the question arises: which animal species, the mythic 'animal zero', came to bear the burden of epidemic blame this time? 5 while this retrospective epidemiological study was perceived as essential for limiting high-risk exposures and for quickly implementing the most appropriate control interventions, these investigations (by biomedical experts deployed from the rich north) were tempted to mimic and fulfil the 'outbreak narrative' imposed by the global health governance. 6 in this endeavour, rather than discovering the epidemiological origin, what becomes crucial is to quickly identify the carriers, 'these vehicles necessary to drive forward the plot', which often function as the outbreak narrative's scapegoats. 7 historically always located at the boundary of the human social body, the ideal candidate to carry this role in the evd epidemic of 2014-2016 was once again the wild and villainous non-human animal. because the pathways for emergence are in every way 'natural' or 'sylvatic', according to the dominant western biomedical model, the inclusion of wildlife in the epidemiology and the evolution of emerging infectious diseases is justified, yet its role is often misrepresented. 8 although the probability of a human contracting the disease from an infected animal still remains very low, certain cultural practices sometimes linked with poverty, especially 'bushmeat' hunting, continue to be seen as the main source of transgression of species boundaries. 9 in the african context, research into emerging infections from animal sources implicates nonhuman primate 'bushmeat' hunting as the primary catalyst of new diseases.
10 since the virus of ebola was identified for the first time in zaïre in 1976 and qualified as the first 'emerging' virus according to the new world clinic called 'global health', the link between animal and human health appears to be based on an 'us vs. them' divide. 11 after the formal confirmation of the aetiological agent in march 2014, the epidemic quickly took on an unprecedented scale and severity in several respects. it was declared by the who an 'extraordinary event' because of its duration, the number of people infected, and its geographical extent, which made it the largest ebola epidemic recorded in history until then. 12 to these quantifiable impact measures were added sociological, ecological, political and economic phenomena that are much more complex to decipher. these have had a profound impact on society, well beyond the remote rural environment that was typically affected by preceding epidemics. 13 by threatening major urban areas, these 'geographies of blame' or 'hotspots' (usually at the margin of modern civilisation, configuring specific areas of the world or the environment into the breeding grounds of viral ontogenesis) have been mapped by 'virus-hunters' to update 'predictions about where in africa wild animals may harbour the virus and where the transmission of the virus from these animals to humans is possible'. 14 in addition to this epidemic's extraordinary character, by spreading beyond the capacities of humanitarian aid, this new biomedically unsolved complexity conferred upon it a status of 'exceptionality', also by 'proclaiming the danger of putting the past in (geographical) proximity with the present'. 15 this status had the effect, among others, of the most intense involvement, perhaps more visibly than before, of different disciplines, from human and animal health to the social sciences, in the international response.
anthropology's response in particular was 'one of the most rapid and expansive anthropological interventions to a global health emergency in the discipline's history'. 16 yet it is critical that the collective social science experience acquired during this west african ebola epidemic remains engaged in addressing future outbreaks and beyond. scholars translated and shared anthropological knowledge among themselves, including translation for public health specialists, transmitted that knowledge to junior scientists, and engaged in ongoing work to develop relevant methodology and theory. 17 among the three west african countries most affected by the epidemic, guinea-conakry has been the most marked by this dual 'exceptionality', that is to say, both epidemiological and social. besides the exceptionalism described by the senegalese anthropologist faye concerning the strong and sometimes violent demonstrations of popular reticence with regard to the activities of the 'riposte', guinea was also marked by a higher case fatality rate, as shown in the who report of 30 march 2016. globally rising to more than 66% (while knowing that the number of cases and deaths was probably underreported), this case fatality rate confirmed the seriousness of the disease in a guinean context where the ebola virus had never hit before. 18 neither the medical community, nor the population, nor the authorities had so far experienced it. despite all the measures implemented, to the question of why a higher case fatality rate was observed in guinea compared to the other countries, a multitude of factors can be advanced. the latter deserve to be the subject of multidimensional analyses, especially as this global lethality has manifested itself differently according to the geographical region of the country. the highest fatality rate was observed in forest guinea (72.5%, 1230/1697), the region of origin of the index case and the main epicentre of the epidemic.
was this due to exclusively biomedical factors, such as a lower level of immunity among the guinean population? 19 or was it because of late care that would have given patients less chance of surviving and fighting the virus? but then, why did people infected with the virus arrive so late at the ebola treatment centres (etc) in guinea? was it due to a poorer and more limited health system and frailer medical and health infrastructure than those of liberia and sierra leone at the time of the epidemic? or was it due to less effective coordination work by the international and national teams responding to the epidemic? 20 or simply because in guinea the local communities were much more reluctant and intentionally opposed to the deployment of humanitarian and health assistance? although these countries share broadly similar cultural worlds, what can explain this notable difference in social resistance between them? combined with divergent political practices and lived experiences of the state, especially between sierra leone and guinea, the working hypothesis drawn from my ethnographic observations in macenta and the related literature review is that part of the continuing episodes of hostility and social resistance manifested by guinean communities regarding the adoption of the proposed control measures against the scourge of ebola has its origins in the divergence between explanatory systems of the disease: biomedical explanatory systems on the one hand, and popular explanatory systems on the other. 21 in march 2014, when ebola haemorrhagic fever was formally identified a few months after the first death, epidemiologists and local populations each actively began to trace and understand this first human-to-human transmission chain of the disease, as well as its triggering event.
evolving most often in parallel, and overlapping at times, these epidemiological and popular investigations generally refer to different explanatory models, some more biomedical ('natural') and others more mystico-religious ('supernatural'). the purpose of this chapter is to trace and reflect on the interpretations of the origin and transmission of the ebola disease, as perceived and explained by the population, and to contrast them with the explanatory model of the epidemiologists. in order to interrupt the two routes of evd transmission, namely from animal reservoirs to humans and between humans, the humanitarian responses followed this public health logic: 'bushmeat' hunting, butchering and consumption should be banned, the ill should be isolated within etcs, and burials should be made safe. yet the interventions related to this reasoning had unintended consequences and, together with the ebola disease itself, they 'disrupted several intersecting but precarious social accommodations that had hitherto enabled radically different and massively unequal worlds to coexist'. 22 carriers, in the case of human-to-human transmission, are generally perceived as the ones propagating the epidemics and are marked with transgressive attributes intrinsic to their 'contagiousness' (e.g. wanton or deviant sexuality for the hiv epidemic, uncleanliness for the cholera epidemic, immigration for typhoid). 23 however, in zoonosis-related diagnostic discourses, pathogens have the potential to reverse relations between humans and animals in such a way that the carrier becomes the victim. 24 located at the 'interface' between humans, animals and the (natural) environment (already proved to be a virtual place where deadly pandemic risks lie waiting for humanity), 'forest people' from guinea were rendered both carriers of the disease and victims of the villainous role of nonhuman animals.
25 the response to the fear of pandemics has been made unmistakable: we have to shield humanity off from nature. this mindset strongly adheres to the prevailing 'culture-nature divide', which is also depicted through zoonotic cycle diagrams, further operating both as pilots of human mastery over human-animal relations and as crucial sites of unsettlement for the latter. 26 wild animals became public enemy number one, together with those who were supposedly facilitating the transgression of the boundaries between the cultural and natural world with (or because of) their culturally 'primitive' or 'underdeveloped' practices. by framing 'bushmeat' hunting, as well as local burials, as the main persisting cultural practices among the 'forest people' explaining (or justifying) the maintenance of evd transmission during the west african epidemic, the notion of culture that fuelled sensational news coverage has strongly stigmatised this 'patient zero' community both globally and within guinea, and has been employed to obscure the actual political, economic and political-economic drivers of infectious disease patterns. 27 appointed by my former institute, the institute of tropical medicine of antwerp, belgium (itm), to the who, i was sent to guinea-conakry from the end of october to the end of november 2014 for a four-week mission by the global outbreak alert and response network (goarn). 28 since august 2014, the country had been in the largest and longest phase of the epidemic, the second recrudescence, which would also be the most intense one up until january 2015. 29 i first spent a week in conakry to follow the implementation of a social mobilisation project (a project of monitoring committees at the level of each commune in the urban area). then, following an evaluation of the situation, qualified as catastrophic by the national coordinator of the who, it was to macenta, in forest guinea, that i was deployed.
macenta, located east of guéckédou, was the prefecture considered to be the epicentre of this new outbreak of ebola and the place where transmission was the most intense. this district would remain one of guinea's most affected regions. by october 2014, macenta, where catastrophic scenarios seemed possible, already had a cumulative number of almost 600 cases since the beginning of the epidemic. the epidemiological situation was out of control, with a lack of material, human and financial resources. on arrival, there was still only one transit centre (cdt). a new etc was being finalised by msf belgium; its management would be taken over a few weeks later by the french red cross. due to the long rainy season, the road used for bringing confirmed cases from macenta to the guéckédou treatment centre was in a deplorable state, slowing down the start of treatment and increasing the risk of transmission during transportation. it is as a medical anthropologist that i was involved in guinea's national coordination platform for the fight against ebola, within the commission of 'social mobilisation and community engagement', also named locally the 'communication' unit, in order to document, better understand and help to address the reluctance manifested by the local community. without going into the debate about the instrumentalisation of anthropologists as simple 'cultural mediators' at the service of humanitarians, i will simply recall here the specific objectives assigned to the mission. 30 they consisted, on the one hand, in an analysis of rumours and crisis situations in order to propose responsive actions and, on the other hand, in adapting the responses and protocols of the various national and international institutions to local conditions, giving priority to comprehensive and participatory approaches.
by joining the 'communication' unit, i tried to support and animate the meticulous and sensitive work of a whole team working to rebuild trust with communities and to 'open' villages reluctant to receive care interventions. under the authority of unicef guinea, this communication team also hosted many local associations previously working on the prevention of infectious diseases, such as hiv/aids, in the region. the latter had already been mobilised to serve as a relay and to mitigate the unpredictable consequences of the epidemic not foreseen by the riposte, such as, among other things, sensitisation and reception of healed people and orphans of ebola, food distribution, and support for people and villages stigmatised by the disease, for whom access to the market (the purchase and sale of products) was forbidden. religious representatives of protestant and muslim communities also voluntarily joined this platform to learn and then preach preventive behaviour, to comfort the population, as well as to deconstruct and address rumours. their main message was to convince the public that ebola did indeed exist and 'was a real disease'. subsequently, the communication unit was finally able to bring in the prefectural directorate of traditional medicine of macenta, counting 6122 traditional healers distributed across the 14 subprefectures of macenta. the main objective of this new activity was to engage all traditional healers in the fight against evd by raising the awareness of their patients and their entourage, thanks to their high level of credibility in their respective communities. they also undertook to refer their patients directly to the tc if they came to present even one of the symptoms of evd (fever, diarrhoea [with blood], vomiting [with blood], loss of appetite). a 'health promotion' team managed and financed by msf belgium also acted on the ground.
each morning, the different commissions and stakeholders of the riposte present in macenta met at the prefectural health directorate (dps) to discuss and coordinate their activities in the field. 31 alongside a guinean sociologist, consultant for the who and assistant coordinator of the mission philafricaine, i was quickly immersed in the realities of the field and in the local strategies elaborated with respect for traditional hierarchies, despite the emergencies. 32 their goal was to restore dialogue with the various village representatives who, since the officialisation of the epidemic, had decided to resist ebola interventions. this was, for instance, the case of the village of dandano, where deaths had risen to 63; a village to which access was authorised the day after my arrival in macenta. although tragic, this coincidence earned me some legitimacy with the other national and international 'fighters'. it is in this intense and difficult context that the ethnographic observations and their preliminary analysis, presented in this chapter, were collected. the methods employed are based on participant observation, including many informal discussions during meetings with villagers (representatives of youth/notables/sages/women), with religious representatives (protestant pastors and imams), and with drivers and partners of the coordination community (e.g. doctors without borders, the guinean red cross and unicef, among others). some formal interviews were also conducted with key informants such as healed individuals (ebola survivors), traditional healers, pastoralists and local actors in the fight. biomedical scientific literature and reports on epidemiological data, as well as observational notes, photographs and audio recordings collected in the field, allowed me to trace the interpretations of the origin and transmission of ebola in a dual perspective: that of the epidemiologists, on the one hand, and that of the population, on the other.
it is through the concept of explanatory models or 'cultural models of the disease' developed by arthur kleinman that i attempted to interpret the observations (fig. 7.1). 33 this is a conceptual framework that had already been used by barry and bonnie hewlett, alain epelboin and pierre formenty in their respective interventions during the previous outbreak of ebola haemorrhagic fever in the congo in 2003. 34 to be able to adapt the response and interrupt transmission, it is essential to know and understand how the population perceives the introduction of a disease, especially when it is such a deadly one. explanatory or cultural models refer to the explanations of an individual or a culture and to predictions about a particular disease. 35 these are social and cultural systems that construct the clinical reality of the disease. culture is not the only factor that shapes their forms: political, economic, social, historical and environmental factors also play an important role in the construction of disease knowledge. in kleinman's model, care systems are composed of three sectors (popular, professional and traditional) that overlap. in each healthcare system, the disease is perceived, named and interpreted, and a specific type of care is applied. the sick subject encounters different discourses about the illness as she or he moves from one sector to another. kleinman posits the existence, in each sector, of explanatory models of the disease for the sick individual, for his/her family and for the practitioner, whether professional or not. in general, only one part of an explanatory model is conscious; the other is not.
collecting explanatory models 36 from the health district of mbomo in congo in 2003, these researchers identified five different cultural models: a sorcery model (a sorcerer sending spiritual objects into victims), a religious sect model (la rose croix, a christian sect devoted to the study of mystical aspects of life), an illness model (fever, vomiting, diarrhoea with blood), an epidemic model (an illness that comes rapidly with the air/wind and affects many people) and a biomedical model (ebola haemorrhagic fever). 37 interestingly, none of the non-biomedical models identified a specific non-human animal as a potential source and/or carrier of evd, or hunting and butchering as specific health-risk activities for such an illness. this further supports the epistemic dissonance observed during many epidemics (including, in this case, the west african evd epidemic) between the public health framing of wild meat as hazardous and the practical and social significance of the activities that occasion contact with that hazard. 38 in the case of evd, it is the biomedical cultural model that prevails among western health workers. when the alert was launched by the local health authorities on 10 march 2014, two and a half months after the beginning of the disease of the index case, virologic investigations were conducted first, following the many deaths that occurred during this so-called silent phase. when the zaïre ebolavirus was identified as the causative agent, retrospective epidemiological investigations of the cases took place, which are crucial during the outbreak of an infectious disease responsible for such a high mortality rate. the first chains of transmission of evd are presented in the graph below, adapted from baize et al. (2014) (fig. 7.2).
39 these investigations are mainly based on the identification of patients and the analysis of hospital documents and reports (results of blood tests carried out in the laboratory), as well as on testimonies and interviews with the affected families, the inhabitants of the villages where the cases occurred, suspected patients and their contacts, funeral participants, public health authorities and hospital staff members. virologic analyses suggest a single introduction of the virus into the human population. 40 but the exact origin of the infection of this two-year-old child has not yet been definitively identified, even though the role of bats as natural hosts of the ebola virus, this time including the insectivorous species, remains one of the most probable scientific hypotheses. 41 up to now, the precise nature of the initial zoonotic event in guinea remains undetermined, and the natural reservoir of the ebola virus more generally is not yet certain, apart from three species of fruit bat and other insectivorous african bat species known to carry ebola antibodies and rna. 42 therefore, this informational gap was from the start filled with assumptions during the west african outbreak. among these assumptions, the elusive link between bats, wild animals and humans triggered high concerns over handling, butchering and consuming wild animals, commonly referred to as 'bushmeat'. 43 consequently, these concerns were integrated into public health messages on disease prevention and were translated into a 'bushmeat ban' by governments across the region, enforced during the entire outbreak. 44 this raises the question of the value of focusing on zoonotic transmission, in particular by fruit bats and non-human primates, which was quickly deemed to be of minimal risk, when the biggest threat of infection was from other humans.

(fig. 7.2 caption: (s1) child, two years old, onset dec.; dotted arrows indicate epidemiological links that have not been well established.)
45 furthermore, it raises the question of whether there is evidence to indicate that 'bushmeat'-related information included in public health campaigns in the region actually reduced ebola transmission. first, hunting and consuming 'bushmeat' for food have long been part of human history worldwide, serving as an important source of protein and household income, especially where the ability to raise domestic animals is limited. 46 the term itself encompasses an extensive list of taxa that are harvested in the wild (ranging from cane rats to elephants and including duikers, squirrels, porcupines, monkeys and other non-human primates, bats and hogs) for food, medicine, trophies and other traditional, cultural uses. 47 yet designating the consumption of wild animal meat by west africans with the term 'bushmeat', instead of 'game' as is the case for europeans and americans, in the media, the scientific literature and the public health campaigns that prohibit this practice, participates in a 'semiotics of denigration' and has the effect of perpetuating 'exotic' and 'primitive' stereotypes of africa. 48 although involuntary, the immediate and visceral effect produced in western minds by the thought of someone eating a chimpanzee, a dog or a bat, for instance, creates a feeling of disgust which downgrades this person, his/her needs and his/her claims on us. 49 this issue has led to calls to replace the term with 'wild meat' or 'meat from wild animals'. 50 secondly, while the term 'bushmeat' typically refers to the practice in the forests of africa, the trade in 'bushmeat', which has expanded over the past two decades, is considered an example of an anthropogenic factor that provides opportunities for the transmission of diseases from wildlife to humans.
51 the unresolved tension between present policies and practices and the different values at stake (the ecological, nutritional, economic and intrinsic values of wildlife hunted for food) in the current 'bushmeat crisis' has accentuated national and global conservation, development and health (infectious disease transmission) concerns over hunting, eating and trading wild meat. 52 thirdly, because of the many competing interests and realities involved, the proscription of hunting and consuming certain species of wild animals, in particular fruit bats and non-human primates, during the west africa ebolavirus outbreak resulted in several unintended consequences, incurred great cost and had only a limited effect. 53 in addition to being vague, inconsistent with scientific research and targeted at the wrong audience, messaging that unilaterally stressed the health risk posed by wild meat and fomite consumption contradicted the experiences of target publics, who consume wild meat without incident. 54 consequently, in addition to having a negative impact on the livelihoods of people living at the frontlines of animal contact, the ban ran the risk of eroding public confidence in the response efforts and fuelling rumours as to the cause of evd (e.g. that the government was attempting to weaken villages in areas supporting the opposition party, as wild meat is considered an important source of physical 'strength' and energy). 55 by focusing exclusively on the risk of spillover, we distort and conceal aspects of the dynamics at play. what if species boundaries are not perceived in the same way by everyone? what if the transgression of this 'invisible enemy' is spotted at a different intersection, beyond the nature/society binary?
the first chains of human-to-human transmission led to the conclusion that the main vector of contamination was a health professional (s14), who spread the ebola virus in macenta, nzérékoré and kissidougou in february 2014. the fifteenth patient, a doctor (s15), reportedly also contaminated his relatives in the same areas. the aetiological agent of this deadly disease (the ebola virus for some, the transgression of a taboo for others) remained hidden until then and finally became apparent because of clusters of cases in the hospitals of guéckédou and macenta. indeed, even though the high risk of exposure was later elucidated, the problem remained hidden for a number of months, mainly because no doctor or health official had previously witnessed a case of ebola and because its clinical presentation was similar to that of many other diseases endemic in guinea, such as cholera, which affects the region regularly. but these signals could also have been blurred by another narrative of the causative agent of these same symptoms. this is very similar to what genese marie sodikoff identified during the recent bubonic plague epidemic in madagascar, when scientists elicited survivors' memories of dead rats in the vicinity to reconstruct the transmission chain. not only were these clues imperceptible to most, but residents had also constructed an alternative outbreak narrative based on different evidence. 56 indeed, the mystico-religious beliefs deeply rooted in this region, even within the medical profession, offered a different interpretation of causality according to a cultural model other than the biomedical model used by epidemiologists. following james fairhead, it is important to note that a 'cultural' model here does not slip into more totalising ideas of 'culture', such as a model of 'kissi culture' (see below), nor its strict symmetrical opposite (e.g. a model of 'humanitarian culture' or of 'western culture').
57 origin and transmission chain according to an 'animist' model

at the beginning of the epidemic, for some, the first deaths in forest guinea were due to the transmission of the filovirus through contact with animals' and/or patients' body fluids, while for others, these deaths originated from the transgression of a taboo related to the touching of a fetish belonging to a sick person, a member of a secret society belonging to one of the ethnic groups of the region. as a result, susceptibility to ebola was initially perceived to be restricted to this particular ethnic group, labelling ebola an 'ethnic disease'. 58 i decided to name this explanatory model of evd in forest guinea the 'animist' model, not to further racialise this epidemic, but because it refers to the genies and fetishes that constitute principal aspects of the ancient religions of west africa, and also because it describes a belief in a dual existence for all things: a physical, visible body and a psychic, invisible soul. 59 according to a young pastor from macenta whom i interviewed, and as confirmed by several other key informants, the population of macenta initially attributed the origin of the disease (in this region at least) to a curse that was only affecting the kissi ethnic group, because the first 11 deaths solely affected people belonging to this ethnic group. here is what was stated: … on arrival with all the rumours we heard in conakry, i really did not believe in the beginning that it [the ebola virus disease] must be true because i thought it was an issue of the kissi (…) because it had started in macenta with the kissi, the first 11 deaths were almost only kissi.
so we thought it was something related to it … and so we, as toma, it was not going to touch us, it is like that at the beginning we perceived things (…) not something genetic, we thought about the fetishism and idolatry activities that people exercised and that can influence them in one way or another … the first rumour that was there, in macenta, the first death was the doctor who was dead in front of everyone's views. people said they have an idol called 'doma' and so when a person dies of that according to the tradition and according to what is done. and those who are on the thing [those who belong to the secret society of 'doma'] have no right to touch, to manipulate the corpse, or to see it otherwise they may die (…) and that, it existed before. it is a kind of secret society, so they have told us that it can certainly be that, that it is why they [the kissi] are just dying successively. 60 according to these discourses, a health worker from guéckédou hospital (s14), who had gone to seek treatment at his friend's house at macenta hospital (s15), belonged, like his friend, to a secret initiation society called 'doma', which is also the name of a very powerful fetish; so powerful that it can cause a very fast death for its owner if it has been touched by someone else belonging to the same secret society. 61 when the guéckédou health worker's body was moved, the doctor from macenta is said to have touched this fetish, idol or sacred object, often hidden in the owner's boubou (traditional clothing). by touching the sacred, the fetish got upset, causing the brutal death of the director of macenta's hospital very soon after this event. at that point, in order to repair this transgression and calm the anger of the fetish, six more deaths had to succeed each other to reach the symbolic number of seven.
if the number of sudden and rapid deaths reaches eight, it means that the fetish is very powerful and, as a result, seven additional deaths must occur, reaching 14 deaths, to restore harmony and repair the sacrilege. if 15 deaths are reached, there must be 21 deaths before the disturbed order is restored and, moreover, the stain is 'washed', and so on. 62 since the first 11 deaths of this second chain were indeed members of the kissi ethnic group (fig. 7.3), the 'animist' explanatory model of the disease was quite consistent with people's observations and gained legitimacy among the population at the expense of the biomedical discourse on the existence of evd. as the susceptibility of dying from ebola was initially and predominantly perceived as restricted to this particular ethnic group, no preventive measures were adopted by the non-kissi population of the region. among the kissi, the consequent epistemic dissonance between the public health logic and the transgression to be repaired led twenty-six kissi-speaking villages in guéckédou prefecture, between june and july 2014, to isolate themselves from the ebola response, cutting bridges and felling trees to prevent vehicle access, and stoning intruding vehicles. 63 because it is a disease of the social (of those who look after and visit others, and of those who attend funerals), there are of course many reasons why the ebola phenomenon was likely to be associated with sorcery. it is also no coincidence that the triggering event, the transgression, in this explanatory model was attributed to medical doctors. as elite africans, generally educated in european ways and relatively wealthy, doctors display many characteristics attributed to sorcerers (they lead a secluded life, do not share their gains, exchange abrupt greetings, eat large quantities of meat and eat alone).
64 moreover, the intense preoccupation throughout this region with 'hidden evil in the world around you that finds dramatic expression in the clandestine activities of witches and the conspiracies of enemies' is exacerbated by tiny pathogens that remain largely invisible to our routine social practices, hence attracting suspicions of sorcery (fig. 7.3). 65 following the investigation of this 'animist' model in relation to the strong community resistance manifested in forest guinea, i interviewed a member of the riposte communication unit originating from macenta about the dandano case 66: yes, there is the specificity of dandano. (…) [in] dandano there was a great witch doctor who had gone to greet his counterpart witch doctor where there were a lot of cases. and that is where he got infected. he returned to dandano. three days later he developed the disease and died. afterwards, as he is a great, recognised witch doctor, people said to themselves, because he died, it was not ebola that killed him but his fetish that is taking revenge on him because it is a betrayal to leave one's domain to greet one's friend. maybe he went to spy on his friend and his friend hit him … well, there have been many versions. (…) among the old people who knew the drug he had, euh… his fetish, the grigri that he had, and that if it was his grigri who killed him, it means that all those who saw him, who saw his body, must also suffer. (…) [we could] see his dead body because he was not protected, because we had to wash him and there were medicines that had to be poured to annihilate his fetishes' power before burying him. so there must have been deaths, hence it was already premeditated. then there were deaths, as it was said, and they were successive deaths. that means there were deaths, two days, three days, so people put more anathema on what happened. and that is how dandano lived things.
so there were deaths, we said it is the fetish that woke up because dandano is known as a village of powerful fetishes, that is known. (…) even all the sensitisation we do, we never stop in dandano on a manager, a notable, otherwise they can do something to you … so it is well recognised (…) dandano, is not where you have to go joking. (…) at the end, with a lot of deaths, a lot of funerals, they saw that no, it is not that [the fetish anger] anymore, and with the information here and there, it is ebola. and it is like that with all the negotiations (…). 67 notably, these explanatory models are distinct from general beliefs about diseases and care techniques in the region. we cannot argue, then, that 'biomedicine' and 'kissi culture' are somehow distinct and opposed. 68

(fig. 7.3 caption: chain of transmission according to the 'animist' cultural model. s14 and s15 are the two suspect cases as presented in the 'biomedical' chain of transmission (see fig. 7.2); the grey blocks are the 11 kissi people of the 'animist' transmission chain.)

these beliefs belong to the ideology of different sectors of the care system and exist independently of the illness of a subject. explanatory models are collected in response to a particular episode of illness in a given subject in a given sector and can evolve over time, depending on how the experience, knowledge and risk exposure of the concerned individual develop. this is precisely what has been reported to us and what has been observed in forest guinea. as the number of deceased progressed, and according to the religious and/or ethnic affiliation of the deceased, a new explanatory model was put in place, as stated in this conversation: yes, at first it was said, when i was in conakry, since our country is predominantly muslim, it was said that it is a matter for christians since muslims do not eat apes. muslims do not eat the bat. it's only the foresters who eat that. and that's why this disease hits only the kissi and toma who are from the forest.
so it's a kaf disease. -kaf? (séverine thys) -from unbelievers, pagans who do not know god. we call kaf all those who do not believe in the god of the muslims. 69 this last extract particularly highlights the fact that these explanatory models are not fixed in time and space, nor are they impervious to each other. indeed, the first health messages communicated to the population and built on the biomedical model were intensely focused on the need to avoid the consumption of 'bushmeat', especially of the wild animals identified as potential primary sources of contamination, namely monkeys and bats. the content of these messages gave birth to another popular model, in which the food taboos or eating habits observed by members affiliated to a certain religion allowed them to explain why this disease was affecting certain groups and not others. 70 this quote also perfectly illustrates how popular discourses have integrated medical interpretations or public health messages. in the study conducted by bonwitt et al. on the local impact of the wild meat ban during the outbreak, all participants, irrespective of age or gender, were aware of wild mammals acting as a source of transmission for ebola. yet confusion remained about which species in particular could transmit the ebola virus, which may be due to the content of public health messages that were inconsistent as regards the species shown to be potentially hazardous. 71 messages are being absorbed, but in such chaos and fear, people process information according to their own worldview, according to the sources available to them, and following their personal experiences and instincts. furthermore, the criminalisation of wild meat consumption, which fuelled fears and rumours within communities, entrenched distrust towards outbreak responders and also exacerbated pre-existing tensions within villages, ethnicities and religions.
72 following the kissi, it seemed that it was the muslim community that was hit by sudden and numerous deaths. to cope with this new upheaval, this new incomprehension, the explanatory model adopted for the origin of these deaths was, consequently, first that of 'maraboutage': it started like that until a certain moment. and then it turned upside down. there have always been upheavals. it turned upside down, and instead of being weighed at a certain moment on the toma and the kissi, it was rather on the manyas, who are entirely, 99%, 100% even, muslims. and so people started saying 'ha! that only attacks muslims, why not christians?'. so there has always been upheaval in all the procedures of this disease evolution. 73 as noted by hewlett et al., 'patients, physicians, caregivers and local people in different parts of the world have cultural patterns for different diseases. providing care and appropriate treatment for a particular disease is often based on negotiation between these different models'. 74 to be able to negotiate, it is necessary that each one, doctor and patient, partakes in the knowledge of the explanatory model of the other. while most health professionals rarely assume that people have and construct their own interpretation of the causal chain, my ethnographic observations presented in this chapter demonstrate that the a priori on which all sensitisation interventions are based is not only incorrect, but also a source of blockages for the adoption of prescribed behaviours. this is because, to return to hewlett et al., 'people do not just follow the continuous thread of learning; they also develop an ability to articulate adherence to prescribed behaviours with the refusal of others, to cooperate at certain times and to show reluctance at others, inviting the analysis to move towards a sociology of compromise'.
75 through the example of funerals, wilkinson and leach have also cast light on the presumption that the knowledge needed to stop the epidemic is held by public health experts and scientists, and not by local people. 76 this very often leads to the development of protocols and procedures that completely negate the contribution of communities. 77 this asymmetrical relationship between caregivers and care receivers, the structural violence that has cultivated inequalities in this region, the heterogeneity of experiences seen by the populations as fundamental contradictions between words and facts, the crisis of confidence and trust since the 'demystification' programme initiated during sékou touré's time, and the traumas inflicted by a transgression of usages in the name of urgency and the exceptional nature of the ebola epidemic, are all realities that have fuelled community reluctance and resistance. 78 the late involvement of traditional healers, primarily consulted by guineans when experiencing illness, in the activities of the response in macenta, is another example of this asymmetry, which too often fails to acknowledge and relate to these other categories that support the social fabric, even though since alma ata in 1978 these stakeholders should no longer be on the margins of the health system. 79 although the concept of explanatory models is not sufficient to explain all the failures of the response in the context of guinea, or of the bordering regions of sierra leone and liberia, it nevertheless allows us to move past linear technical discussions of 'weak health systems' as the main reason for the scale of the disaster. the use of this conceptual framework for understanding popular interpretations of the origin of the disease and its transmission reveals the complex, historically rooted and multidimensional picture of the ebola crisis.
several authors agree that, 'in any case, it is not a question of archaic beliefs or outlier depictions, but of good answers, which can be called rational in this context, to a vital emergency situation, interpreted in the light of past and present experiences'. 80 a better knowledge and comparison of these discourses and of the different cultural models of the disease, sometimes incorporated, sometimes hermetic, could nevertheless contribute considerably to the success of the fight against the epidemic, especially as concerns improving knowledge of the chains of disease transmission, identifying and understanding the behaviours of local populations, and tracing the sources of denials and rumours. explanatory models proposed by the biomedical sciences are very often in competition and in contradiction with diagnoses made by traditional healers, and especially with rumours involving divine punishments, breaches of prohibitions, the misdeeds of wizards or genies, or virologic warfare. 81 if this 'animist' model is neither identified nor recognised as making sense for others at the key moment, there will be no negotiation and no understanding of the distances and proximities existing between the thought systems present in the concerned ecosystems. an anthropological approach remains essential to adapting the response to local realities. epelboin further argues that 'local models of causation regarding misfortune, often the most predominant, involve not only the virulence of the virus and human behaviour, but the evil actions of human and non-human individuals. the virologic model is then only one explanatory model among others, leaving the field open to all social, economic and political uses of misfortune'.
82 following the re-emergence of this infectious disease of zoonotic origin in a whole new social ecosystem, a cross-sectoral research agenda, the so-called one health integrated approach, has finally emerged in the field of viral haemorrhagic fevers, also enabling the role of anthropology to be expanded to times of epidemic outbreak. until then, anthropologists were mandated to contribute to the adaptation and improvement of immediate public health interventions in relation to human-to-human transmission. yet the growing interest of anthropologists in the interaction between humans and non-humans has made it possible to extend their research topic to the complex dynamics of the primary and secondary transmission of the virus. 83 in addition, this anthropological interest has provided a new cross-cultural perspective on the movement of pathogens and has therefore improved knowledge about the mechanisms of emergence, propagation and amplification of a disease located at the interface between humans and wildlife. 84 such was the role of almudena marí sáez and colleagues who, in a multidisciplinary team, conducted an ethnographic study in the village of the ebola epidemic's origin, the index case's village, to better understand local social hunting practices and the relationships between bats and humans. 85 however, the realm of human-animal-disease interaction has been limited to 'natural versus cultural' domains and frequently conceived in one health studies as a biological phenomenon rather than a biocultural one integrating the social and cultural dimensions generated by human-animal relations. incorporating anthropology into one health approaches should provide a more nuanced and expanded account of the fluidity of bodies, categories and boundaries, as drawn up by existing ethnographies on cattle in east and southern africa, for example. 86 epelboin et al.
have stressed that 'the anthropological approach in previous epidemics has confirmed that the urgency and severity of an epidemic must not prevent people from listening to them and thinking throughout the epidemic of taking into account indigenous codes, customs, knowledge, skills and beliefs'.87 By taking seriously the possibility that affected people in the places where we do research or implement control measures might not see things in the same way, we have to be willing to have our categories (such as culture/nature, human/animal, mind/body, male/female, caregivers/care receivers) unsettled, and to grapple with the practical implications of this for engagement in field sites, for knowledge-sharing and for the design of interventions, in the hope that such improvements might contribute to a future prevention of Ebola and to public health policies more suitable to respond to people's basic needs.88 It also allows the affected people themselves to have a say in the matter. As Philippe Descola and other anthropologists have argued, on the basis of a comparative analysis of a wide range of ethnographic work across the continents, native classificatory systems usually offer a continuum, rather than sharp divisions, among humans and other animal species.89 Indeed, human dispositions and behaviours are attributed not only to animals but also to spirits, monsters and artefacts, contrasting with modern Western models, which generally see the categories of human and non-human as clearly defined and mutually exclusive.90 The ability to sense and avoid harmful environmental conditions is necessary for the survival of all living organisms and, as Paul Slovic has argued, 'humans have an additional capability that allows them to alter their environment as well as respond to it'.91 As regards the emerging violence in conservation as either against nature (e.g. culling bats) or in defence of it (e.g.
rearranging landscapes within an inclusive 'One Health' approach), James Fairhead proposes that such violence is increasingly between 'the included' and 'rogues' in ways that transcend the nature/society binary.92 While the 'white' and African elites were seen by the affected population as 'antisocial' intruders or rogues, suspected of sorcery and of using Ebola as a tool for political manipulation, those involved in the struggle to address the Ebola epidemic were not fighting just against the virus but also against the natural world that harboured it: the rogues, which included villainous bats but, moreover, habitat destroyers, namely hunters, bushmeat traders and deforesters. These were the humans cast as the ones invading the habitat of the virus. Since EVD will be constantly reconceptualised, and because of new scientific discoveries (e.g. on the natural reservoir, or on vaccine development), control interventions must listen to and take into account popular perceptions, as well as the socio-cultural and political context and their respective evolution. Rumours must be identified and managed on a case-by-case basis, without global generalisation that could reinforce misinterpretations on the assumption that ignorance alone generates these rumours, conflicts, lack of trust and resistance. Moreover, zoonotic epidemic fighters should follow MacGregor's and Waldman's recommendations by starting to think differently with and about animals and about species boundaries, in order to generate novel ways of addressing zoonotic diseases, allowing for closer integration with people's own cultural norms and understandings of human-animal dynamics.
Notes:
- 93 ... and Medicine 129 (2015)
- Ebola virus disease in Guinea - update (situation as of
- Aspects épidémiologiques de la maladie à virus Ebola en Guinée (décembre 2013-avril 2016)
- Emergence of Zaire Ebola virus disease in Guinea
- Investigating the zoonotic origin of the West African Ebola epidemic
- Zoonosis: prospects and challenges for
- In this outbreak story, a disease emerges in a remote location and spreads across a world highly connected by globalisation and air travel to threaten 'us all' - read the globally powerful North: see a 'Entre science et fiction'
- Contagious: Cultures, Carriers, and the Outbreak Narrative: Wald, Priscilla
- The global focus on wildlife as a major contributor to emerging pathogens and infectious diseases in humans and domestic animals is due to reports which are not based on field, experimental or dedicated research but rather on surveys of literature and research regarding human immunodeficiency virus (HIV) and AIDS, severe acute respiratory syndrome (SARS) and highly pathogenic avian influenza (HPAI), all of which have an indirect wildlife link: R. Kock, 'Drivers of disease emergence and spread: is wildlife to blame'
- On how and why 'bushmeat' hunting leads to the emergence of novel zoonotic pathogens, see 'Bushmeat hunting, deforestation, and prediction of zoonoses emergence'
- 'Uncovering zoonoses awareness in an emerging disease "hotspot"'. Social science
- ... attempt of the zoonotic niche of EVD, see Contagious: Cultures, Carriers, and the Outbreak Narrative
- The term 'exceptionality' is borrowed from S. L. Faye, 'L' "exceptionnalité" d'Ebola et les "réticences" populaires en Guinée-Conakry. Réflexions à partir d'une approche d'anthropologie symétrique'
- to-plague-and-beyond-how-can-anthropologists-best-engage-past-experience-to-prepare-for-new-epidemics
- For the policy relevance of anthropological expertise and a (self-)critical reflection on Ebola and on anthropological (and more broadly social scientific) engagements with humanitarian response, see A. Menzel and A. Schroven
- The term 'riposte' is the French name used to designate the official national mobilisation settled to respond to the EVD crisis, structured into two poles, an inter-ministerial committee and a national coordination committee grouping together the international actors and the national non-governmental organisations; see M. Fribault
- Heterogeneities in the case fatality ratio in the West African Ebola outbreak
- Challenges in controlling the Ebola outbreak in two prefectures in Guinea: why did communities continue to resist?
- Comparison of social resistance to Ebola response in Sierra Leone and Guinea suggests explanations lie in political configurations not culture
- Understanding social resistance to the Ebola response in the forest region of the Republic of Guinea: an anthropological perspective
- Contagious: Cultures, Carriers, and the Outbreak Narrative
- Zoonosis: prospects and challenges for medical anthropology
- The good, the bad and the ugly: framing debates on nature in a One Health community
- Understanding social resistance to the Ebola response in the forest region of the Republic of Guinea: an anthropological perspective
- Sustainability and contemporary man-nature divide: aspects of conflict and alienation
- On the visual ethnographic examination of the Ebola zoonotic cycle transformed into tools of public health communication by the US CDC during the outbreak of
- Medical anthropology and Ebola in Congo
- Unintended consequences of the "bushmeat ban"
- Emergence of Zaire Ebola virus disease in Guinea
- Maladie à virus Ebola: une zoonose orpheline?'. Bulletin de l'Académie Vétérinaire de France
- Inclusivity and the rogue bats and the war against "the invisible enemy"
- About the natural reservoir for Ebola virus see 'The evolution of Ebola virus: insights from the 2013-2016 epidemic'
- A review of the role of food and the food system in the transmission and spread of ebolavirus
- Mammalian biogeography and the Ebola virus in Africa
- For information on the 'bushmeat ban', see Bonwitt et al., 'Unintended consequences of the "bushmeat ban"'; Ebola virus disease epidemic
- Emergence of Zaire Ebola virus disease in Guinea'; World Health Organization, 'One year into the Ebola epidemic: a deadly, tenacious and unforgiving virus'
- Caring for critically ill patients with Ebola virus disease
- Ebola - myths, realities, and structural violence'; and Olival and Hayman
- The threat to primates and other mammals from the bushmeat trade in Africa, and how this threat could be diminished
- Origins of major human infectious diseases'; Centers for Disease Control and Prevention
- take-a-semiotician-or-what-we-talk-about-when-we-talk-about-bush-meat-by-adia-benton/. The Kellogg Institute
- On the feeling of disgust as a sentiment with powerful political valences, see also J. Livingston, 'Disgust, bodily aesthetics and the ethic of being human in Botswana'
- The Anatomy of Disgust
- World Organisation for Animal Health
- The bushmeat trade: increased opportunities for transmission of zoonotic disease
- 'Bushmeat crisis' is caused by the dual threats of wildlife extinctions and declining food and livelihood security of some of the poorest people on earth, and whether the hunting of bushmeat is primarily an issue of biodiversity conservation or human livelihood, or both, varies according to perspective, place and over time; see 'Unintended consequences of the "bushmeat ban"'
- Impact of the Ebola virus disease outbreak on market chains and trade of agricultural products in West Africa'. Food and Agriculture Organization of the United Nations
- Sending the right message: wild game and the West Africa Ebola outbreak
- 'Bushmeat ban' in West Africa during the 2013-2016 Ebola virus disease epidemic'; P. Richards, Ebola: How a People's Science Helped End an Epidemic
- Les errances de la communication sur la maladie à virus Ebola
- Zoonotic semiotics: plague narratives and vanishing signs in Madagascar'
- Understanding social resistance to the Ebola response in the forest region of the Republic of Guinea: an anthropological perspective
- One year on: why Ebola is not yet over in Guinea
- Encyclopedia of Medical Anthropology: Health and Illness in the World's Cultures
- Extracts of the individual interview conducted with the pastor
- On the cultural and political role of initiation societies in the forest region and the related experiences of local citizens in relation to both the Manding (often Islamic) world to the north, and to the 'white' (often Christian) colonial and neo-colonial order, see Fairhead
- Purity and Danger, an Analysis of Concepts of Pollution and Taboo
- Communication with rebellious communities during an outbreak of Ebola virus disease in Guinea: an anthropological approach
- Understanding social resistance to the Ebola response in the forest region of the Republic of Guinea: an anthropological perspective
- Memories of the Slave Trade: Ritual and Historical Imagination in Sierra Leone
- Lifeworlds: Essays in Existential Anthropology
- For more information about Dandano village 'surrendering their sick and dead after being battered by the virus', see A. Nossiter
- Extracts of the individual interview conducted with a voluntary of the communication unit of Macenta
- New therapeutic landscapes in Africa: parental categories and practices in seeking infant health in Republic of Guinea
- Extracts of the individual interview conducted with the pastor
- For similar narrative about Muslim communities and food taboos regarding bats, see F. Batty, 'Reinventing "others" in a time of Ebola'
- Unintended consequences of the "bushmeat ban"
- Extracts of the individual interview conducted with the pastor
- Medical anthropology and Ebola in Congo: cultural models and humanistic care
- Ebola en Guinée: violences historiques et régimes de doute
- Briefing: Ebola - myths, realities, and structural violence
- L' "exceptionnalité" d'Ebola et les "réticences" populaires en Guinée-Conakry
- Ebola en Guinée: violences historiques et régimes de doute'; Wilkinson and Leach
- Traiter les corps comme des fagots'
- Production sociale de l'indifférence en contexte Ebola (Guinée
- Approche anthropologique de l'épidémie de FHV Ebola 2014 en Guinée Conakry
- Zoonosis: prospects and challenges for medical anthropology
- Extending the "social": anthropological contributions to the study of viral haemorrhagic fevers
- Investigating the zoonotic origin of the West African Ebola epidemic
- Views from many worlds: unsettling categories in interdisciplinary research on endemic zoonotic diseases
- Animal spirits and mimetic affinities: the semiotics of intimacy in African human/animal identities
- Annexe 13. Contribution de l'anthropologie médicale à la lutte contre les épidémies de fièvres hémorragiques à virus Ebola et Marburg'.
- In World Health Organisation, Épidémies de fièvres hémorragiques à virus Ebola et Marburg: préparation, alerte, lutte et évaluation
- The morning after: anthropology and the Ebola hangover
- Beyond Nature and Culture
- Biosecurity and the topologies of infected life: from borderlines to borderlands
- Nature and Society: Anthropological Perspectives
- Perception of Risk
- Inclusivity and the rogue bats and the war against "the invisible enemy"
- Views from many worlds

Acknowledgements. I would like to thank Tenin Traoré, a Guinean sociologist and consultant to WHO, and Joseph Kovoïgui, assistant coordinator of the Philafrican mission and then consultant to WHO, for their commitment and engagement in the fight against Ebola, their generosity, their knowledge, their experience and our fruitful collaboration in many respects. I would also like to thank the coordination team and the DPS (prefectural health direction) of Macenta for their welcome and sincere attention; GOARN/WHO, Antwerp Institute of Tropical Medicine, and in particular Prof. Marleen Boelaert for emotional, financial and logistical support; Dr. Alain Epelboin for field preparation and numerous sharing with the francophone anthropological platform; and Christos Lynteris for his invitation to connect and exchange with the anglophone 'anthro-zoonoses' network and contribute to this timely collection.

key: cord-030681-4brd2efp authors: Friston, Karl J.; Parr, Thomas; Zeidman, Peter; Razi, Adeel; Flandin, Guillaume; Daunizeau, Jean; Hulme, Ollie J.; Billig, Alexander J.; Litvak, Vladimir; Moran, Rosalyn J.; Price, Cathy J.; Lambert, Christian title: Dynamic causal modelling of COVID-19 date: 2020-08-07 journal: Wellcome Open Res doi: 10.12688/wellcomeopenres.15881.2 sha: doc_id: 30681 cord_uid: 4brd2efp

This technical report describes a dynamic causal model of the spread of coronavirus through a population. The model is based upon ensemble or population dynamics that generate outcomes, like new cases and deaths over time.
The purpose of this model is to quantify the uncertainty that attends predictions of relevant outcomes. By assuming suitable conditional dependencies, one can model the effects of interventions (e.g., social distancing) and differences among populations (e.g., herd immunity) to predict what might happen in different circumstances. Technically, this model leverages state-of-the-art variational (Bayesian) model inversion and comparison procedures, originally developed to characterise the responses of neuronal ensembles to perturbations. Here, this modelling is applied to epidemiological populations - to illustrate the kind of inferences that are supported and how the model per se can be optimised given timeseries data. Although the purpose of this paper is to describe a modelling protocol, the results illustrate some interesting perspectives on the current pandemic; for example, the nonlinear effects of herd immunity that speak to a self-organised mitigation process.

The purpose of this paper is to show how dynamic causal modelling can be used to make predictions - and test hypotheses - about the ongoing coronavirus pandemic (Huang et al., 2020; Wu et al., 2020; Zhu et al., 2020). It should be read as a technical report 1, written for people who want to understand what this kind of modelling has to offer (or just build an intuition about modelling pandemics). It contains a sufficient level of technical detail to implement the model using MATLAB (or its open source version Octave), while explaining things heuristically for non-technical readers. The examples in this report are used to showcase the procedures and subsequent inferences that can be drawn. Having said this, there are some quantitative results that will be of general interest. These results are entirely conditional upon the model used.
Dynamic causal modelling (DCM) refers to the characterisation of coupled dynamical systems in terms of how observable data are generated by unobserved (i.e., latent or hidden) causes (Friston et al., 2003; Moran et al., 2013). Dynamic causal modelling subsumes state estimation and system identification under one Bayesian procedure, to provide probability densities over unknown latent states (i.e., state estimation) and model parameters (i.e., system identification), respectively. Its focus is on estimating the uncertainty about these estimates, to quantify the evidence for competing models and the confidence in various predictions. In this sense, DCM combines data assimilation and uncertainty quantification within the same optimisation process. Specifically, the posterior densities (i.e., Bayesian beliefs) over states and parameters - and the precision of random fluctuations - are optimised by maximising a variational bound on the model's marginal likelihood, also known as model evidence. This bound is known as variational free energy, or the evidence lower bound (ELBO) in machine learning (Friston et al., 2007; Hinton & Zemel, 1993; MacKay, 1995; Winn & Bishop, 2005). Intuitively, this means one is trying to optimise probabilistic beliefs - about the unknown quantities generating some data - such that the (marginal) likelihood of those data is as large as possible. The marginal likelihood 2 or model evidence can always be expressed as accuracy minus complexity. This means that the best models provide an accurate account of some data as simply as possible. Therefore, the model with the highest evidence is not necessarily a description of the process generating data: rather, it is the simplest description that provides an accurate account of those data. In short, it is 'as if' the data were generated by this kind of model.
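The accuracy-minus-complexity decomposition just mentioned can be written out explicitly. These are the standard variational identities (with q(θ) the approximate posterior, y the data and m the model); the notation is generic rather than copied from the paper:

```latex
\begin{align}
\ln p(y \mid m) &= F + D_{\mathrm{KL}}\!\left[q(\vartheta)\,\|\,p(\vartheta \mid y, m)\right] \;\geq\; F \\
F &= \underbrace{\mathbb{E}_{q(\vartheta)}\!\left[\ln p(y \mid \vartheta, m)\right]}_{\text{accuracy}}
   \;-\; \underbrace{D_{\mathrm{KL}}\!\left[q(\vartheta)\,\|\,p(\vartheta \mid m)\right]}_{\text{complexity}}
\end{align}
```

Maximising F with respect to q(θ) therefore tightens a lower bound on the log evidence, and the optimised F can then be used as the log-evidence score when comparing models.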
Importantly, models with the highest evidence will generalise to new data and preclude overfitting, or overconfident predictions about outcomes that have yet to be measured. In light of this, it is imperative to select the parameters or models that maximise model evidence or variational free energy (as opposed to goodness of fit or accuracy). However, this requires the estimation of the uncertainty about model parameters and states, which is necessary to evaluate the (marginal) likelihood of the data at hand. This is why estimating uncertainty is crucial. Being able to score a model - in terms of its evidence - means that one can compare different models of the same data. This is known as Bayesian model comparison and plays an important role when testing different models or hypotheses about how the data are caused. We will see examples of this later. This aspect of dynamic causal modelling means that one does not have to commit to a particular form (i.e., parameterisation) of a model. Rather, one can explore a repertoire of plausible models and let the data decide which is the most apt. Dynamic causal models are generative models that generate consequences (i.e., data) from causes (i.e., hidden states and parameters). The form of these models can vary depending upon the kind of system at hand. Here, we use a ubiquitous form of model; namely, a mean field approximation to loosely coupled ensembles or populations. In the neurosciences, this kind of model is applied to populations of neurons that respond to experimental stimulation (Marreiros et al., 2009; Moran et al., 2013). Here, we use the same mathematical approach to model a population of individuals and their response to an epidemic. The key idea behind these (mean field) models is that the constituents of the ensemble are exchangeable, in the sense that sampling people from the population at random will give the same average as following one person over a long period of time.
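The exchangeability idea can be checked numerically with a toy two-state chain. This is an illustrative Python sketch (the transition numbers are invented, and the paper's own implementation is in MATLAB/SPM): because every individual obeys the same transition probabilities, the cross-sectional average over many people matches the long-run time average of a single person.

```python
import numpy as np

rng = np.random.default_rng(0)

# Daily probability of being in state 1 tomorrow, given today's state
# (a toy two-state chain, e.g. home/work; the numbers are invented).
p1_given = np.array([0.3, 0.7])   # index = current state

def step(x):
    """Advance an array of individual states by one day."""
    return (rng.random(x.shape) < p1_given[x]).astype(int)

# Cross-sectional average: many people observed on one day.
x = np.zeros(100_000, dtype=int)
for _ in range(100):
    x = step(x)
population_mean = x.mean()

# Longitudinal average: one person followed for a long time.
y, visits = np.zeros(1, dtype=int), []
for _ in range(100_000):
    y = step(y)
    visits.append(y[0])
time_mean = np.mean(visits[100:])   # discard a short burn-in
```

Both averages converge to the stationary probability of the chain (here 0.5, by symmetry), which is what licenses tracking the whole population with a single probability distribution.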
Under this assumption 3, one can then work out, analytically, how the probability distribution over various states of people evolves over time; e.g., whether someone was infected or not. This involves parameterising the probability that people will transition from one state to another. By assuming the population is large, one can work out the likelihood of observing a certain number of people who were infected, given the probabilistic state of the population at that point in time. In turn, one can work out the probability of a sequence or timeseries of new cases. This is the kind of generative model used here, where the latent states were chosen to generate the data that are - or could be - used to track a pandemic. Figure 1 provides an overview of this model. In terms of epidemiological models, this can be regarded as an extended SEIR (susceptible, exposed, infected and recovered) compartmental model (Kermack et al., 1997). Please see (Kucharski et al., 2020) for an application of this kind of model to COVID-19 4. There are a number of advantages to using a model of this sort. First, it means that one can include every variable that 'matters', such that one is not just modelling the spread of an infection but an ensemble response in terms of behaviour (e.g., social distancing). This means that one can test hypotheses about the contribution of various responses that are installed in the model - or what would happen under a different kind of response. A second advantage of having a generative model is that one can evaluate its evidence in relation to alternative models, and therefore optimise the structure of the model itself. For example, does social distancing behaviour depend upon the number of people who are infected? Or does it depend on how many people have tested positive for COVID-19? (This question is addressed below).
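The ensemble-dynamics-plus-likelihood recipe described above can be sketched in a few lines. This is an illustrative Python reimplementation, not the paper's MATLAB/SPM code; the transition values are invented, and the Poisson likelihood is one simple choice of observation model:

```python
import numpy as np
from math import lgamma, log

# Hypothetical daily transition matrix over infection status
# (susceptible, infected, infectious, immune); columns index the
# current state, rows the next state.
T = np.array([[0.95, 0.00, 0.00, 0.00],
              [0.05, 0.80, 0.00, 0.00],
              [0.00, 0.20, 0.75, 0.00],
              [0.00, 0.00, 0.25, 1.00]])

def evolve(p0, days):
    """Propagate the ensemble distribution: p_{t+1} = T p_t."""
    p, traj = np.asarray(p0, dtype=float), []
    for _ in range(days):
        p = T @ p
        traj.append(p.copy())
    return np.array(traj)

def poisson_loglik(counts, rates):
    """Log likelihood of observed daily counts, treating the expected
    number of events (population size times a marginal probability)
    as a Poisson rate."""
    return sum(k * log(r) - r - lgamma(k + 1) for k, r in zip(counts, rates))

traj = evolve([1.0, 0.0, 0.0, 0.0], 30)   # 30 days of ensemble dynamics
expected_new = 1e6 * 0.20 * traj[:, 1]    # a proxy for daily infected -> infectious flow
```

Given a timeseries of reported counts, summing `poisson_loglik` over days yields the log likelihood of the data as a function of the transition parameters, which is the quantity a variational scheme would then bound and optimise.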
A third advantage is more practical, in terms of data analysis: because we are dealing with ensemble dynamics, there is no need to create multiple realisations or random samples to estimate uncertainty. This is because the latent states are not the states of an individual but the sufficient statistics of a probability distribution over individual states. In other words, we replace random fluctuations in hidden states with hidden states that parameterise random fluctuations. The practical consequence of this is that one can fit these models quickly and efficiently - and perform model comparisons over thousands of models. A fourth advantage is that, given a set of transition probabilities, the ensemble dynamics are specified completely. This has the simple but important consequence that the only unknowns in the model are the parameters of these transition probabilities. Crucially, in this model, these do not change with time. This means that we can convert what would have been a very complicated, nonlinear state space model for data assimilation into a nonlinear mapping from some unknown (probability transition) parameters to a sequence of observations. We can therefore make precise predictions about the long-term future, under particular circumstances. This follows because the only uncertainty about outcomes inherits from the uncertainty about the parameters, which do not change with time. These points may sound subtle; however, the worked examples below have been chosen to illustrate these properties. This technical report comprises four sections. The first details the generative model, with a focus on the conditional dependencies that underwrite the ensemble dynamics generating outcomes. The outcomes in question here pertain to a regional outbreak. This can be regarded as a generative model for the first wave of an epidemic in a large city or metropolis. This section considers variational model inversion and comparison, under hierarchical models.
In other words, it considers the distinction between (first level) models of an outbreak in one country and (second level) models of differences among countries, in terms of model parameters. The second section briefly surveys the results of second level (between-country) modelling, looking at those aspects of the model that are conserved over countries (i.e., random effects) and those which are not (i.e., fixed effects). The third section then moves on to the dynamics and predictions for a single country; here, the United Kingdom. It considers the likely outcomes over the next few weeks and how confident one can be about these outcomes, given data from all countries to date. This section drills down on the parameters that matter in terms of affecting death rates. It presents a sensitivity analysis that establishes the contribution of parameters or causes in the model to eventual outcomes. It concludes by looking at the effects of social distancing and herd immunity. The final section concludes with a consideration of predictive validity, by comparing predicted and actual outcomes.

This section describes the generative model summarised schematically in Figure 1, while the data used to invert or fit this model are summarised in Figure 2. These data comprise global (worldwide) timeseries from countries and regions, from the initial reports of positive cases in China to the current day 5. The generative model is a mean field model of ensemble dynamics. In other words, it is a state space model where the states correspond to the sufficient statistics (i.e., parameters) of a probability distribution over the states of an ensemble or population - here, a population of people who are in mutual contact at some point in their daily lives. This kind of model is used routinely to model populations of neurons, where the ensemble dynamics are cast as density dynamics, under Gaussian assumptions about the probability densities; e.g., (Marreiros et al., 2009).
In other words, a model of how the mean and covariance of a population affects itself and the means and covariances of other populations. Here, we will focus on a single population and, crucially, use a discrete state space model. This means that we will be dealing with the sufficient statistics (i.e., expectations) of the probability of being in a particular state at any one time. This renders the model a compartmental model (Kermack et al., 1997), where each state corresponds to a compartment. These latent states evolve according to transition probabilities that embody the causal influences and conditional dependencies that lend an epidemic its characteristic form. Our objective is to identify the right conditional dependencies - and form posterior beliefs about the model parameters that mediate these dependencies. Having done this, we can then simulate an entire trajectory into the distant future, even if we are only given data about the beginning of an outbreak 6.

Figure 1. In brief, this compartmental model generates timeseries data based on a mean field approximation to ensemble or population dynamics. The implicit probability distributions are over four latent factors, each with four levels or states. These factors are sufficient to generate measurable outcomes; for example, the number of new cases or the proportion of people infected. The first factor is the location of an individual, who can be at home, at work, in a critical care unit (CCU) or in the morgue. The second factor is infection status; namely, susceptible to infection, infected, infectious or immune. This model assumes that there is a progression from a state of susceptibility to immunity, through a period of (pre-contagious) infection to an infectious (contagious) status. The third factor is clinical status; namely, asymptomatic, symptomatic, acute respiratory distress syndrome (ARDS) or deceased. Again, there is an assumed progression from asymptomatic to ARDS, where people with ARDS can either recover to an asymptomatic state or not. Finally, the fourth factor represents diagnostic or testing status. An individual can be untested or waiting for the results of a test that can either be positive or negative. With this setup, one can be in one of four places, with any infectious status, expressing symptoms or not, and having test results or not. Note that - in this construction - it is possible to be infected and yet be asymptomatic. However, the marginal distributions are not independent, by virtue of the dynamics that describe the transitions among states within each factor. Crucially, the transitions within any factor depend upon the marginal distribution of other factors. For example, the probability of becoming infected, given that one is susceptible to infection, depends upon whether one is at home or at work. Similarly, the probability of developing symptoms depends upon whether one is infected or not. The probability of testing negative depends upon whether one is susceptible (or immune) to infection, and so on. Finally, to complete the circular dependency, the probability of leaving home to go to work depends upon the number of infected people in the population, mediated by social distancing. The curvilinear arrows denote a conditioning of transition probabilities on the marginal distributions over other factors. These conditional dependencies constitute the mean field approximation and enable the dynamics to be solved or integrated over time. At any point in time, the probability of being in any combination of the four states determines what would be observed at the population level. For example, the occupancy of the deceased level of the clinical factor determines the current number of people who have recorded deaths. Similarly, the occupancy of the positive level of the testing factor determines the expected number of positive cases reported. From these expectations, the expected number of new cases per day can be generated. A more detailed description of the generative model - in terms of transition probabilities - can be found in the main text.

Figure 2. Timeseries data. This figure provides a brief overview of the timeseries used for subsequent modelling, with a focus on the early trajectories of mortality. The upper left panel shows the distribution, over countries, of the number of days after the onset of an outbreak - defined as 8 days before more than one case was reported. At the time of writing (4th April 2020), a substantial number of countries witnessed an outbreak lasting for more than 60 days. The upper right panel plots the total number of deaths against the durations in the left panel. Those countries whose outbreak started earlier have greater cumulative deaths. The middle left panel plots the new deaths reported (per day) over a 48-day period following the onset of an outbreak. The colours of the lines denote different countries. These countries are listed in the lower left panel, which plots the cumulative death rate. China is clearly the first country to be severely affected, with remaining countries evincing an accumulation of deaths some 30 days after China. The middle right panel is a logarithmic plot of the total deaths against population size in the initial (48-day) period. Interestingly, there is little correlation between the total number of deaths and population size. However, there is a stronger correlation between the total number of cases reported (within the first 48 days) and the cumulative deaths, as shown in the lower right panel. In this period, Germany has the greatest ratio of total cases to deaths. Countries were included if their outbreak had lasted for more than 48 days and more than 16 deaths had been reported. The timeseries were smoothed with a Gaussian kernel (full width at half maximum of two days) to account for erratic reporting (e.g., recording deaths over the weekend).

The model considers four different sorts of states (i.e., factors) that provide a description of any individual - sampled at random - that is sufficient to generate the data at hand. In brief, these factors were chosen to be as conditionally independent as possible, to ensure an efficient estimation of the model parameters 7. The four factors were an individual's location, infection status, clinical status and diagnostic status. In other words, we considered that any member of the population can be characterised in terms of where they were; whether they were infected, infectious or immune; whether they were showing mild, severe or fatal symptoms; and whether they had been tested, with an ensuing positive or negative result. Each of these factors had four levels. For example, the location factor was divided into home, work, critical care unit, and the morgue. These states should not be taken too literally. For example, home stands in for anywhere that has a limited risk of exposure to, or contact with, an infected person (e.g., in the domestic home, in a non-critical hospital bed, in a care home, etc.). Work stands in for anywhere that has a larger risk of exposure to - or contact with - an infected person, and therefore covers non-work activities, such as going to the supermarket or participating in team sports. Similarly, designating someone as severely ill with acute respiratory distress syndrome (ARDS) is meant to cover any life-threatening condition that would invite admission to intensive care. Having established the state space, we can now turn to the causal aspect of the dynamic causal model. The causal structure of these models depends upon the dynamics or transitions from one state to another. It is at this point that a mean field approximation can be used.
mean field approximations are used widely in physics to approximate a full (joint) probability density with the product of a series of marginal densities (bressloff & newby, 2013; marreiros et al., 2009; schumacher et al., 2015; zhang et al., 2019). in this case, the factorisation is fairly subtle: we will factorise the transition probabilities, such that the probability of moving among states-within each factor-depends upon the marginal distribution of other factors (with one exception). for example, the probability of developing symptoms when asymptomatic depends on, and only on, the probability that i am infected. in what follows, we will step through the conditional probabilities for each factor to show how the model is put together (and could be changed). the first factor has four levels: home, work, ccu and the morgue. people can leave home but will always return (with unit probability) over a day. the probability of leaving home has a (prior) baseline rate of one third but is nuanced by any social distancing imperatives. these imperatives are predicated on the proportion of the population that is currently infected, such that the social distancing parameter (an exponent) determines the probability of leaving home 8. in other words, social distancing is modelled as the propensity to leave home and expose oneself to interpersonal contacts, with the transition probability

p(work | home, asymptomatic) = θ_out · (1 − p_inf)^θ_sde.

this means that the probability of leaving home, given i have no symptoms, is the probability i would have gone out normally, multiplied by a decreasing function of the proportion of people in the population who are infected. formally, this proportion is the marginal probability of being infected, where the marginal probability of a factor is an average over the remaining factors. for the location factor, the marginal probability p_L is

p_L(l) = Σ_{i,c,d} p(l, i, c, d),

where l, i, c and d index the states of the location, infection, clinical and diagnostic factors, respectively.
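a minimal numerical sketch of this mean field step-marginalising a joint distribution and using the infection marginal to modulate the propensity to leave home-might look like the following. all numbers are illustrative, and the level ordering is an assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

# a random joint distribution over the four factors (four levels each),
# normalised so it sums to one -- a stand-in for the model's latent state
p_joint = rng.random((4, 4, 4, 4))
p_joint /= p_joint.sum()

# marginal of the infection factor: sum over location, clinical, diagnostic
p_infection = p_joint.sum(axis=(0, 2, 3))

# proportion infected: 'infected' + 'infectious' levels (indices 1 and 2 here)
p_inf = p_infection[1] + p_infection[2]

def p_leave_home(theta_out, theta_sde, p_inf):
    """probability of leaving home when asymptomatic: the baseline rate
    times a decreasing function of the prevalence of infection."""
    return theta_out * (1.0 - p_inf) ** theta_sde

# baseline rate of one third, with an illustrative social distancing exponent
p_out = p_leave_home(1/3, 2.0, p_inf)
```

as the prevalence p_inf rises, p_out falls below the baseline of one third, which is exactly the social distancing behaviour described above.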
the parameters in this social distancing model are the probability of leaving home every day (θ_out) and the social distancing exponent (θ_sde). the only other two places one can be are in a ccu or the morgue. the probability of moving to critical care depends upon bed (i.e., hospital) availability, which is modelled as a sigmoid function of the occupancy of this state (i.e., the probability that a ccu bed is occupied) and a bed capacity parameter (a threshold). if one has severe symptoms, then one stays in the ccu. finally, the probability of moving to the morgue depends on, and only on, being deceased. note that all these dependencies are different states of the clinical factor (see below). this means we can write the transition probabilities among the location factor for each level of the clinical factor (with a slight abuse of notation). here, the columns and rows of each transition probability matrix are ordered: home, work, ccu, morgue. the column indicates the current location and the row indicates the next location. the parameter θ_cap is the bed capacity threshold, which enters through a decreasing sigmoid function of ccu occupancy. in brief, these transition probabilities mean that i will go out when asymptomatic, unless social distancing is in play. however, when i have symptoms i will stay at home, unless i am hospitalised with acute respiratory distress. i remain in critical care unless i recover and go home or die and move to the morgue, where i stay. technically, the morgue is an absorbing state. in a similar way, we can express the probability of moving between different states of infection (i.e., susceptible, infected, infectious and immune). these transition probabilities mean that when susceptible, the probability of becoming infected depends upon the number of social contacts-which depends upon the proportion of time spent at home.
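the structure described above-column-stochastic location transitions with an absorbing morgue-can be illustrated with a toy matrix for the asymptomatic case. the numbers are placeholders, not the paper's values; in the model the home-to-work entry varies with social distancing:

```python
import numpy as np

# location transition matrix for an asymptomatic individual
# (columns = current location, rows = next), order: home, work, ccu, morgue;
# in the model the leave-home entry is theta_out * (1 - p_inf)**theta_sde
p_leave = 0.25
T_loc_asymptomatic = np.array([
    [1 - p_leave, 1.0, 1.0, 0.0],  # return (or stay) home; leave ccu on recovery
    [p_leave,     0.0, 0.0, 0.0],  # leave home for work
    [0.0,         0.0, 0.0, 0.0],  # asymptomatic people do not enter ccu
    [0.0,         0.0, 0.0, 1.0],  # the morgue is an absorbing state
])

# sanity checks: columns are probability distributions, morgue is absorbing
assert np.allclose(T_loc_asymptomatic.sum(axis=0), 1.0)
assert T_loc_asymptomatic[3, 3] == 1.0
```

an analogous matrix would be written for each level of the clinical factor (symptomatic, ards, deceased), changing which transitions are permitted.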
this dependency is parameterised in terms of a transition probability per contact (θ_trn) and the expected number of contacts at home (θ_rin) and work (θ_rou) 9. once infected, one remains in this state for a period of time that is parameterised by a transition rate (θ_inf). this parameterisation illustrates a generic property of transition probabilities; namely, an interpretation in terms of rate constants and, implicitly, time constants. the rate parameter θ is related to the rate constant κ and the time constant τ according to

θ = exp(−κ), κ = 1/τ.

in other words, the probability of staying in any one state is determined by the characteristic length of time that state is occupied. this means that the rate parameter above can be specified, a priori, in terms of the number of days we expect people to be infected, before becoming infectious. similarly, we can parameterise the transition from being infectious to being immune in terms of a typical period of being contagious, assuming that immunity is enduring and precludes reinfection 10. note that in the model, everybody in the morgue is treated as having acquired immunity. the transitions among clinical states depend upon both the infection status and location. the transitions among clinical states (i.e., asymptomatic, symptomatic, ards and deceased) are relatively straightforward: if i am not infected (i.e., susceptible or immune) i will move to the asymptomatic state, unless i am dead. however, if i am infected (i.e., infected or infectious), i will develop symptoms with a particular probability (θ_dev). once i have developed symptoms, i will remain symptomatic and either recover to an asymptomatic state or develop acute respiratory distress with a particular probability (θ_sev).
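the rate/time-constant relationship can be made concrete with a small helper; stay_probability is a hypothetical function implementing the standard θ = exp(−1/τ) parameterisation described above:

```python
import math

def stay_probability(tau_days):
    """daily probability of remaining in a state whose characteristic
    occupancy time is tau_days: theta = exp(-kappa), kappa = 1/tau."""
    return math.exp(-1.0 / tau_days)

# e.g. a five-day incubation: daily probability of remaining 'infected'
theta_inf = stay_probability(5.0)

# the time constant can be recovered from theta: tau = -1 / ln(theta)
tau = -1.0 / math.log(theta_inf)
```

this is what lets the priors be specified in days (incubation period, period of contagion) rather than as raw transition probabilities.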
the parameterisation of these transitions depends upon the typical length of time spent in each of these states. 9 here, p = 1 − θ_trn · p_infectious can be interpreted as a probability of eluding infection with each interpersonal contact, such that the probability of remaining uninfected after θ_r contacts is given by p^θ_r. note that there is no distinction between people at home and at work; both are equally likely to be infectious. we can now assemble these transition probabilities into a probability transition matrix, and iterate from the first day to some time horizon, to generate a sequence of probability distributions over the joint space of all factors:

p_{t+1} = T(p_t) p_t.

notice that this is a completely deterministic state space model, because all the randomness is contained in the probabilities. notice also that the transition probability matrix T is both state and time dependent, because the transition probabilities above depend on marginal probabilities. in this approximation, the number of contacts i make is a weighted average of the number of people i could infect at home and the number of people i meet outside, per day, times the number of days i am contagious. the effective reproduction rate is not a biological rate constant. however, it is a useful epidemiological summary statistic that indicates how quickly the disease spreads through a population. when less than one, the infection will decay to an endemic equilibrium. we will use this measure later to understand the role of herd immunity. this completes the specification of the generative model of latent states. a list of the parameters and their prior means (and variances) is provided in table 1. notice that all of the parameters are scale parameters, i.e., they are rates or probabilities that cannot be negative. to enforce these positivity constraints, one applies a log transform to the parameters during model inversion or fitting.
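the deterministic ensemble dynamics-iterating a state- and time-dependent transition matrix-can be illustrated with a toy three-state infection chain. all names and parameter values are illustrative, not the paper's:

```python
import numpy as np

def transition(p, theta_trn=0.3, contacts=10, theta_stay=0.9):
    """state-dependent transition matrix over (susceptible, infectious, immune):
    the chance of eluding infection falls with the current marginal
    probability that a contact is infectious (the mean field coupling)."""
    p_elude = (1.0 - theta_trn * p[1]) ** contacts
    return np.array([
        [p_elude,       0.0,              0.0],
        [1.0 - p_elude, theta_stay,       0.0],
        [0.0,           1.0 - theta_stay, 1.0],   # immunity is absorbing here
    ])

p = np.array([0.999, 0.001, 0.0])   # initial ensemble distribution
for _ in range(100):                 # iterate day by day to a time horizon
    p = transition(p) @ p            # completely deterministic dynamics
```

the matrix changes at every step because it depends on the current marginal p, yet the trajectory itself contains no randomness-exactly the property noted in the text.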
this has the advantage of simplifying the numerics using gaussian assumptions about the prior density (via a lognormal assumption). in other words, although the scale parameters are implemented as probabilities or rates, they are estimated as log parameters. note that prior variances are specified for log parameters. for example, a variance of 1/64 corresponds to a prior confidence interval of ~25% and can be considered weakly informative. these prior expectations should be read as the effective rates and time constants as they manifest in a real-world setting. for example, a three-day period of contagion is shorter than the period that someone might be infectious (wölfel et al., 2020) 14, on the (prior) assumption that they will self-isolate when they realise they could be contagious. further parameters are required to generate data, such as the size of the population and the number of people who are initially 11 it is revealing to note that the number of model parameters pertaining to pcr testing matches the number of parameters mediating the epidemiology per se. this reflects the fact that the generative model has to consider every aspect of how data are generated. in order to leverage the information in new positive tests, it is necessary to think carefully about all the parameters that contribute to these data; for example, the probability of being tested and the selection bias towards testing people who are more likely to be infected. crucially, this bias has to be estimated during model inversion and could vary substantially from country to country. although not implemented in this report, subsequent distinctions between pillar 1 and 2 test data would be a nice example of different selection biases. this speaks to the importance of modelling pillar 1 and 2 as distinct data modalities.
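the effect of the lognormal assumption can be checked in a few lines. assuming a gaussian prior on the log parameter with variance 1/64, the implied 90% interval on the scale parameter is roughly a ±20-25% multiplicative band, consistent with the "~25%" quoted above:

```python
import math

# gaussian prior on the log parameter (lognormal on the scale parameter)
prior_log_mean = math.log(1/3)   # e.g. the baseline daily rate of going out
prior_log_var = 1/64             # the 'weakly informative' variance

z = 1.645                        # two-sided 90% gaussian quantile
sd = math.sqrt(prior_log_var)
lo = math.exp(prior_log_mean - z * sd)
hi = math.exp(prior_log_mean + z * sd)

# multiplicative half-width of the 90% interval, exp(1.645/8) ~ 1.23
ratio = hi / math.exp(prior_log_mean)
```

the key point is that the interval is guaranteed to respect positivity: lo is always strictly greater than zero, whatever the prior variance.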
from a technical perspective, equipping standard epidemiological models with an 'observation model' can be regarded as building a complete dynamic causal model. the key thing to bear in mind here is that the parameters of so-called observation models have to be treated in exactly the same way as epidemiological parameters, because they could show conditional dependencies. in dynamic causal modelling, all unknown parameters are treated in a uniform way to maximise (a free energy bound on) marginal likelihood. 12 notice that this model is configured for new cases that are reported based on buccal swabs (i.e., am i currently infected?), not tests for antibody or immunological status. a different model would be required for forthcoming tests of immunity (i.e., have i been infected?). furthermore, one might consider the sensitivity and specificity of any test by including sensitivity and specificity in (1.7). for example, 1 in 3 tests may be false negatives; especially, when avoiding bronchoalveolar lavage to minimise risk to clinicians: wang et al., 2020b. detection of sars-cov-2 in different types of clinical specimens. jama. 13 added in revision: the reproduction ratio in this report was based upon an approximation to the expected number of people that i might infect, if i was infectious. in subsequent reports, the reproduction ratio was brought into line with more formal definitions, based on the geometric rate of increase in the prevalence of infection and the period of contagion. a minimum reproduction ratio (r) of nearly zero in this report corresponds to about 0.7 in subsequent (and other) reports. 14 shedding of covid-19 viral rna from sputum can outlast the end of symptoms. seroconversion occurs after 6-12 days but is not necessarily followed by a rapid decline of viral load. 
infected (θ n , θ n ) 15 , which parameterise the initial state of the population (where ⊗ denotes a kronecker tensor product). in this technical report, we will choose a simpler option that treats a pandemic as a set of linked point processes that can be modelled as rare events. in other words, we will focus on modelling a single outbreak in a region or city and treat the response of the 'next city' as a discrete process post hoc. this simplifies the generative model, in the sense that we only have to worry about the ensemble dynamics of the population that comprises one city. a complementary perspective on this choice is that we are trying to model the first wave of an epidemic as it plays out in the first city to be affected. any second wave can then be treated as the first wave of another city or region. under the initial conditions, the population size can be set, a priori, to 1,000,000; noting that a small city comprises (by definition) a hundred thousand people, while a large city can exceed 10 million. this population parameter is a prior that is updated based on the available data, providing an estimate of the "effective population" size. effective population is defined here as the proportion of the total population who are susceptible to infection, and therefore participate in the outbreak. the assumption that the effective population size reflects the total population of a country is a hypothesis that we will test later 16. for clarity, we are not implying that the remainder of the population, classed as "not susceptible", are immune or resistant to covid-19; rather, there exists a sub-population who do not take part in the current outbreak for any of a variety of reasons, which may include being shielded or geographically isolated from infected cases. furthermore, as the effective population (and other parameters) are estimated directly from the data, they will therefore reflect the source of the information.
at the time of writing, in the uk this was dominated by the london outbreak. finally, as all parameters pertain to the effective population, proportions (or probabilities)-such as population immunity-require appropriate scaling to be expressed as a percentage of the total (census) population.

the likelihood or observation model

the outcomes considered in figure 2 are new cases (of positive tests and deaths) per day. these can be generated by multiplying the appropriate probability by the (effective) population size. the appropriate probabilities here are just the expected occupancy of the positive test and deceased states, respectively. because we are dealing with large populations, the likelihood of any observed daily count has a binomial distribution that can be approximated by a gaussian density 17. here, outcomes are counts of rare events with a small probability π << 1 of occurring in a large population of size n >> 1. for example, the likelihood of observing a timeseries of daily deaths can be expressed as a function of the model parameters as follows:

p(y_t | ϑ) = N(y_t ; n·π_t(ϑ), n·π_t(ϑ)(1 − π_t(ϑ))),

where π_t(ϑ) is the expected occupancy of the deceased state on day t. the advantage of this limiting (large population) case is that a (variance stabilising) square root transform of the data counts renders their variance approximately constant. with the priors and likelihood model in place, we now have a full joint probability over causes (parameters) and consequences (outcomes). this is the generative model

p(y, ϑ) = p(y | ϑ) p(ϑ).

one can now use standard variational techniques (friston et al., 2007) to estimate the posterior over model parameters and evaluate a variational bound on the model evidence or marginal likelihood. mathematically, this is expressed as follows:

F = E_q(ϑ)[ln p(y, ϑ) − ln q(ϑ)] = ln p(y) − D_KL[q(ϑ) || p(ϑ | y)].

table 1 also includes a parameter for the proportion of people who are initially immune, which we will call on later. these expressions show that maximising the variational free energy F with respect to an approximate posterior q(ϑ) renders the kullback-leibler (kl) divergence between the true and approximate posterior as small as possible.
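the variance-stabilising property of the square-root transform can be verified by simulation: for rare-event counts, the variance of the transformed counts is approximately constant (about 1/4, so that scaling by 2 would make it approximately one) irrespective of the underlying rate. the numbers below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n, samples = 1_000_000, 100_000   # population size and simulated days

variances = []
for pi in (1e-4, 4e-4):           # two rare-event probabilities
    counts = rng.binomial(n, pi, size=samples)
    variances.append(np.sqrt(counts).var())
# the variance of the square-root counts is roughly constant (~1/4)
# at both rates, even though the raw count variances differ fourfold
```

this is why transformed counts can be modelled with a single, fixed error variance across the whole timeseries.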
at the same time, the free energy becomes a lower bound on the log evidence. the free energy can then be used to compare different models, where any differences correspond to a log bayes factor or odds ratio (kass & raftery, 1995; winn & bishop, 2005) . one may be asking why we have chosen this particular state space and this parameterisation? are there alternative model structures or parameterisations that would be more fit for purpose? the answer is that there will always be a better model, where 'better' is a model that has more evidence. this means that the model has to be optimised in relation to empirical data. this process is known as bayesian model comparison based upon model evidence (winn & bishop, 2005) . for example, in the above model we assumed that social distancing increases as a function of the proportion of the population who are infected (1.1). this stands in for a multifactorial influence on social behaviour that may be mediated in many ways. for example, government advice, personal choices, availability of transport, media reports of 'panic buying' and so on. so, what licenses us to model the causes of social distancing in terms of a probability that any member of the population is infected? the answer rests upon bayesian model comparison. when inverting the model using data from countries with more than 16 deaths (see figure 2 ), we obtained a log evidence (i.e., variational free energy) of -15701 natural units (nats). when replacing the cause of social distancing with the probability of encountering someone with symptoms-or the number of people testing positive-the model evidence fell substantially to -15969 and -15909 nats, respectively. in other words, there was overwhelming evidence in favour of infection rates as a primary drive for social distancing, over and above alternative models. 
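the model comparison reported above reduces to simple arithmetic on the free energies, since a difference in (approximate) log evidence is a log bayes factor. here is a sketch using the three values quoted in the text (the model labels are ours):

```python
import math

# variational free energies (approximate log evidences, in nats) for the
# three candidate causes of social distancing compared in the text
F = {
    "prevalence of infection": -15701.0,
    "symptomatic encounters":  -15969.0,
    "positive tests":          -15909.0,
}

best = max(F, key=F.get)
# a difference in free energy is a log bayes factor (log odds ratio)
log_bf = {m: F[best] - F[m] for m in F}
# posterior model probabilities under a uniform prior over the models
norm = sum(math.exp(f - F[best]) for f in F.values())
posterior = {m: math.exp(F[m] - F[best]) / norm for m in F}
```

a log bayes factor of 268 nats against the nearest alternative is, as the text says, overwhelming: the posterior probability of the winning model is effectively one.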
we will return to the use of bayesian model comparison later, when asking what factors determine differences between each country's response to the pandemic. table 1 lists all the model parameters; henceforth, dcm parameters. in total, there are 21 dcm parameters. this may seem like a large number to estimate from the limited amount of data available (see figure 2). the degree to which a parameter is informed by the data depends upon how changes in the parameter are expressed in data space. for example, increasing the effective population size will uniformly elevate the expected cases per day. conversely, decreasing the number of initially infected people will delay the curve by shifting it in time. in short, a parameter can be identified if it has a relatively unique expression in the data. this speaks to an important point: the information in the data is not just in the total count-it is in the shape or form of the transient 18. on this view, there are many degrees of freedom in a timeseries that can be leveraged to identify a highly parameterised model. the issue of whether the model is over-parameterised or under-parameterised is exactly the issue resolved by bayesian model comparison; namely, the removal of redundant parameters to suppress model complexity and ensure generalisation: see (1.13) 19. one therefore requires the best measures of model evidence. this is the primary motivation for using variational bayes; here, variational laplace (friston et al., 2007). the variational free energy, in most circumstances, provides a better approximation than widely used alternatives such as the akaike and bayesian information criteria (penny, 2012). one special aspect of the model above is that it has absorbing states. for example, whenever one enters the morgue, becomes immune, dies or has a definitive test result, one stays in that state: see figure 1.
this is important, because it means the long-term behaviour of the model has a fixed point. in other words, we know what the final outcomes will be. these outcomes are known as endemic equilibria. this means that the only uncertainty is about the trajectory from the present point in time to the distant future. we will see later that-when quantified in terms of bayesian credible intervals-this uncertainty starts to decrease as we go into the distant future. this should be contrasted with alternative models that do not parameterise the influences that generate outcomes and therefore call upon exogenous inputs (e.g., statutory changes in policy or changes in people's behaviour). if these interventions are unknown, they will accumulate uncertainty over time. by design, we elude this problem by including everything that matters within the model and parameterising strategic responses (like social distancing) as an integral part of the transition probabilities. we have made the simplifying assumption that every country reporting new cases is, effectively, reporting the first wave of an affected region or city. clearly, some countries could suffer simultaneous outbreaks in multiple cities. this is accommodated by an effective population size that could be greater than the prior expectation of 1 million. this is an example of finding a simple model that best predicts outcomes-that may not be a veridical reflection of how those outcomes were actually generated. in other words, we will assume that each country behaves as if it has a single large city of at-risk denizens. in the next section, we look at the parameter estimates that obtain by pooling information from all countries, with a focus on between country differences, before turning to the epidemiology of a single country (the united kingdom). hitherto, we have focused on a generative model for a single city. however, in a pandemic, many cities will be affected. 
this calls for a hierarchical generative model that considers the response of each city at the first level and a global response at the second. this is an important consideration because it means, from a bayesian perspective, knowing what happens elsewhere places constraints (i.e., bayesian shrinkage priors) on estimates of what is happening in a particular city. clearly, this rests upon the extent to which certain model parameters are conserved from one city to another-and which are idiosyncratic or unique. this is a problem of hierarchical bayesian modelling or parametric empirical bayes (friston et al., 2016; kass & steffey, 1989 ). in the illustrative examples below, we will adopt a second level model in which key (log) parameters are sampled from a gaussian distribution with a global (worldwide) mean and variance. from the perspective of the generative model, this means that to generate a pandemic, one first samples city-specific parameters from a global distribution, adds a random effect, and uses the ensuing parameters to generate a timeseries for each city. this section considers the modelling of country-specific parameters, under a simple (general linear) model of between-country effects. this (second level) model requires us to specify which parameters are shared in a meaningful way between countries and which are unique to each country. technically, this can be cast as the difference between random and fixed effects. designating a particular parameter as a random effect means that this parameter was generated by sampling from a countrywide distribution, while a fixed effect is unique to each country. under a general linear model, the distribution for random effects is gaussian. in other words, to generate the parameter for a particular country, we take the global expectation and add a random gaussian variate, whose variance has to be estimated under suitable hyperpriors. 
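the generative view of the second level can be sketched in a few lines: sample country-specific log parameters from a global gaussian and exponentiate to obtain the (positive) scale parameters. the numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

# global (second level) distribution over one log parameter, e.g. the
# number of contacts when out of the home (values are illustrative)
global_log_mean, global_log_sd = np.log(30.0), 0.2

n_countries = 8
# each country's log parameter = global expectation + gaussian random effect
country_log = global_log_mean + global_log_sd * rng.standard_normal(n_countries)
country_theta = np.exp(country_log)   # back to positive scale parameters
```

running the first-level model forward with each country_theta would then generate one timeseries per country-which is exactly the generative story for a pandemic described above.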
furthermore, one has to specify systematic differences between countries in terms of independent variables; for example, does the latitude of a country have any systematic effect on the size of the at-risk population? the general linear model used here comprises a constant (i.e., the expectation or mean of each parameter over countries), the (logarithms of) total population size, and a series of independent variables based upon a discrete sine transform of latitude and longitude. the latter variables stand in for any systematic and geopolitical differences among countries that vary smoothly with their location. notice that the total population size may or may not provide useful constraints on the effective size of the population at the first level. under this hierarchical model, a bigger country may have a transport and communication infrastructure that could reduce the effective (at risk) population size. a hint that this may be the case is implicit in figure 2 , where there is no apparent relationship between the early incidence of deaths and total population size. in the examples below, we treated the number of initial cases and the parameters pertaining to testing as fixed effects and all remaining parameters as random effects. the number of initial infected people determines the time at which a particular country evinces its outbreak. although this clearly depends upon geography and other factors, there is no a priori reason to assume a random variation about an average onset time. similarly, we assume that each country's capacity for testing was a fixed effect; thereby accommodating non-systematic testing or reporting strategies 20 . note that in this kind of modelling, outcomes such as new cases can only be interpreted in relation to the probability of being tested and the availability of tests 21 . 
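an illustrative construction of such a second-level design matrix-a constant, log population size, and sine basis functions of latitude and longitude-might look like this; the paper's exact basis set and scaling may differ:

```python
import numpy as np

def design_matrix(log_pop, lat, lon, k=3):
    """second-level GLM design: constant, log population, and a discrete
    sine transform of latitude and longitude (k basis functions each).
    an illustrative construction, not the authors' exact basis set."""
    lat, lon = np.asarray(lat, float), np.asarray(lon, float)
    cols = [np.ones_like(lat), np.asarray(log_pop, float)]
    for j in range(1, k + 1):
        cols.append(np.sin(j * np.pi * (lat + 90) / 180))   # latitude basis
        cols.append(np.sin(j * np.pi * (lon + 180) / 360))  # longitude basis
    return np.stack(cols, axis=1)

# two hypothetical countries: (log population, latitude, longitude)
X = design_matrix(np.log([67e6, 83e6]), [54.0, 51.0], [-2.0, 10.0])
```

each column of X is a candidate explanatory variable for the between-country variation in a given dcm parameter; bayesian model reduction then decides which columns earn their keep.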
with this model in place, we can now use standard procedures for parametric empirical bayesian modelling (friston et al., 2016; kass & steffey, 1989) to estimate the second level parameters that couple between-country independent variables to country-specific parameters of the dcm. however, there are a large number of these parameters-that may or may not contribute to model evidence. in other words, we need some way of removing redundant parameters based upon bayesian model comparison. this calls upon another standard procedure, bayesian model reduction, which scores the evidence for reduced versions of the full model. each of these models corresponds to a particular combination of parameters that have been 'switched off', by shrinking their prior variance to zero. by averaging the posterior estimates in proportion to the evidence for each model-known as bayesian model averaging (hoeting et al., 1999)-we can eliminate redundant parameters and thereby provide a simpler explanation for differences among countries. this is illustrated in the lower panels, which show the posterior densities before (left) and after (right) bayesian model reduction. these estimates are shown in terms of their expectation or maximum a posteriori (map) value (as blue bars), with 90% bayesian credible intervals (as pink bars). the first 21 parameters are the global expectations of the dcm parameters. the remaining parameters are the coefficients that link various independent variables at the second level to the parameters of the transition probabilities at the first. note that a substantial number of second level parameters have been removed; however, many are retained. this suggests that there are systematic variations over countries in certain random effects at the country level. figure 4 provides an example based upon the largest effect mediated by the independent variables. in this analysis, latitude (i.e., distance from the south pole) appears to reduce the effective size of an at-risk population.
in other words, countries in the northern hemisphere have a smaller effective population size, relative to countries in the southern hemisphere. clearly, there may be many reasons for this; for example, systematic differences in temperature or demographics. the key thing to take from this analysis is the tight credible intervals on the parameters, when averaging in this way. according to this analysis, the number of effective contacts at home is about three people, while this increases by an order of magnitude to about 30 people when leaving home. the symptomatic and acute respiratory distress periods have been estimated here at about five and 13 days respectively, with a delay in testing of about two days. these are the values that provide the simplest explanation for the global data at hand-and are in line with empirical estimates 22. figure 6 shows the country-specific parameter estimates for 12 of the 21 dcm parameters. these posterior densities were evaluated under the empirical priors from the parametric empirical bayesian analysis above. as one might expect-in this instance, the models compared are at the second or between-country level. in other words, the models compared contained all combinations of (second level) parameters (a parameter is removed by setting its prior variance to zero). if the model evidence increases-in virtue of reducing model complexity-then this parameter is redundant. the upper panels show the relative evidence of the most likely 256 models, in terms of log evidence (left panel) and the corresponding posterior probability (right panel). redundant parameters are illustrated in the lower panels by comparing the posterior expectations before and after the bayesian model reduction. the blue bars correspond to posterior expectations, while the pink bars denote 90% bayesian credible intervals. the key thing to take from this analysis is that a large number of second level parameters have been eliminated.
these second level parameters encode the effects of population size and geographical location on each of the parameters of the generative model. the next figure illustrates the nonredundant effects that can be inferred with almost 100% posterior confidence. here, the effective size of the population appears to depend upon the latitude of a country. the right panel shows the absolute values of the glm parameters in matrix form, indicating that the effective size of the population was most predictable (the largest values are in white), though not necessarily predictable by total population size. the red circle highlights the parameter mediating the relationship illustrated in the left panel. 25 or, indeed, a previous pandemic, such as the 2009 h1n1 pandemic. we will return to this in the conclusion. 23 https://en.wikipedia.org/wiki/greater_london 24 however, there does appear to be some predictive validity to these estimates, which is addressed in an epilogue. note that, rather than dissect the predictive validity of each parameter and country-which is widely recognised as a challenging problem (moghadas, s.m., shoukat, a., fitzpatrick, m.c., wells, c.r., sah, p., pandey, a., sachs, j.d., wang, z., meyers, l.a., singer, b.h., galvani, a.p., 2020. projecting hospital utilization during the covid-19 outbreaks in the united states. proc natl acad sci u s a 117, 9122-9126)-we have provided some representative examples. a comprehensive analysis of this type would be beyond the scope of this report. it is also important to note that predictions based upon rate parameters and probabilities are a reflection of prior assumptions about these parameters, whereas predictions based upon the hidden states speak to the predictive validity of the dcm structure (see below). in virtue of the second level effects that survived bayesian model reduction, there are some substantial differences between countries in certain parameters.
for example, the effective population size in the united states of america is substantially greater than elsewhere at about 25 million (the population in new york state is about 19.4 million). the effective population size in the uk (dominated by cases in london) is estimated to be about 2.5 million (london has a population of about 8.96 million) 23 . social distancing seems to be effective and sensitive to infection rates in france but much less so in canada. the efficacy of social distancing in terms of the difference between the number of contacts at home and work is notably attenuated in the united kingdom-that has the greatest number of home contacts and the least number of work contacts. other notable differences are the increased probability of fatality in critical care evident in china. this is despite the effective population size being only about 2.5 million. again, these assertions are not about actual states of affairs. these are the best explanations for the data under the simplest model of how those data were caused 24 . this level of modelling is important because it enables the data or information from one country to inform estimates of the first level (dcm) parameters that underwrite the epidemic in another country 25 . this is another expression of the importance of having a hierarchical generative model for making sense of the data. here, the generative model has latent causes that span different countries, thereby enabling the fusion of multimodal data from multiple countries (e.g., new test or death rates). two natural questions now arise. are there any systematic differences between countries in the parameters that shape epidemiological dynamics-and what do these dynamics or trajectories look like? this concludes our brief treatment of between country effects, in which we have considered the potentially important role of bayesian model reduction in identifying systematic variations in the evolution of an epidemic from country to country. 
the next section turns to the use of hierarchically informed estimates of dcm parameters to characterise an outbreak in a single country. this section drills down on the likely course of the epidemic in the uk, based upon the posterior density over dcm parameters afforded by the hierarchical (parametric empirical) bayesian analysis of the previous section (listed in table 2 ). figure 7 shows the expected trajectory of death rates, new cases, and occupancy of ccu beds over a six-month (180 day) period. these (posterior predictive) densities are shown in terms of an expected trajectory and 90% credible intervals (blue line and shaded areas, respectively). the black dots represent empirical data (available at the time of writing). notice that the generative model can produce outcomes that may or may not be measured. here, the estimates are based upon the new cases and deaths in figure 2 . the panels on the left show that our confidence about the causes of new cases is relatively high during the period for which we have data and then becomes uncertain in the future. this reflects the fact that the data are informing those parameters that shaped the initial transient, whereas other parameters responsible for the late peak and subsequent trajectory are less informed. notice that the uncertainty about cumulative deaths itself accumulates. on this analysis, we can be 90% confident that, within five weeks, between 13,000 and 22,000 people may have died. relative to the total population, the proportion of people dying is very small; however, the cumulative death rates in absolute numbers are substantial in relation to seasonal influenza (indicated with broken red lines). although cumulative death rates are small, they are concentrated within a short period of time, with near-identical ccu needs, and hence the risk of overwhelming available capacity (not to mention downstream effects from blocking other hospital admissions to prioritise the pandemic).
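the (posterior predictive) densities described here can be emulated generically: draw parameters from a posterior, push each draw through the model, and read off pointwise percentiles. a toy sketch, with an invented gaussian 'posterior' and a bump-shaped death-rate curve standing in for the dcm:

```python
import numpy as np

rng = np.random.default_rng(1)

def death_rate(t, peak_time, height, width):
    """toy epidemic curve: a gaussian bump in daily deaths."""
    return height * np.exp(-0.5 * ((t - peak_time) / width) ** 2)

t = np.arange(180)                       # six months, daily
# stand-in posterior over (peak day, peak daily deaths, spread in days)
samples = np.column_stack([
    rng.normal(80, 5, 2000),
    rng.normal(500, 50, 2000),
    rng.normal(12, 1, 2000),
])

# propagate each posterior draw through the model
trajectories = np.array([death_rate(t, *s) for s in samples])
expected = trajectories.mean(axis=0)
lower, upper = np.percentile(trajectories, [5, 95], axis=0)  # 90% band
```

note how the band is narrow where all draws agree and widens where they diverge, which is exactly the behaviour described for figure 7.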
the underlying latent causes of these trajectories are shown in figure 8 . the upper panels reproduce the expected trajectories of the previous figure, while the lower panels show the underlying latent states in terms of expected rates or probabilities. for example, the social distancing measures are expressed in terms of an increasing probability of being at home, given the accumulation of infected cases in the population. during the peak expression of death rates, the proportion of people who are immune (herd immunity) increases to about 30% and then asymptotes at about 90%. this period is associated with a marked increase in the probability of developing symptoms (peaking at about 11 weeks after the first reported cases). interestingly, under these projections, the number of people expected to be in critical care should not exceed capacity: at its peak, the upper bound of the 90% credible interval for ccu occupancy is approximately 4200, which is within the current ccu capacity of london (corresponding to the projected capacity of the temporary nightingale hospital 26 in london, uk). it is natural to ask which dcm parameters contributed the most to the trajectories in figure 8 . this is addressed using a sensitivity analysis. intuitively, this involves changing a particular parameter and seeing how much it affects the outcomes of interest. figure 9 reports a sensitivity analysis of the parameters in terms of their direct contribution to cumulative deaths (upper panel) and how they interact (lower panel). these are effectively the gradient and hessian matrix (respectively) of predicted cumulative deaths. the bars in the upper panel pointing to the left indicate parameters that decrease total deaths. these include social distancing and bed availability, which are, to some extent, under our control. other factors that improve fatality rates include the symptomatic and acute respiratory distress periods and the probability of surviving outside critical care.
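the sensitivity analysis described here, the gradient and hessian of predicted cumulative deaths with respect to the parameters, can be approximated generically by finite differences. a sketch with a toy outcome function standing in for the dcm's predicted deaths:

```python
import numpy as np

def cumulative_deaths(p):
    """toy outcome: rises with 'transmission', falls with 'distancing'."""
    transmission, distancing = p
    return 1e4 * transmission ** 2 * np.exp(-distancing)

def gradient(f, p, h=1e-5):
    p = np.asarray(p, dtype=float)
    g = np.zeros_like(p)
    for i in range(p.size):
        e = np.zeros_like(p)
        e[i] = h
        g[i] = (f(p + e) - f(p - e)) / (2 * h)   # central difference
    return g

def hessian(f, p, h=1e-4):
    p = np.asarray(p, dtype=float)
    n = p.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            # second-order central difference for mixed partials
            H[i, j] = (f(p + ei + ej) - f(p + ei - ej)
                       - f(p - ei + ej) + f(p - ei - ej)) / (4 * h * h)
    return H

p0 = np.array([1.5, 0.5])
g = gradient(cumulative_deaths, p0)      # signs match figure 9's bar plot logic
H = hessian(cumulative_deaths, p0)       # interactions between parameters
```

a negative gradient entry corresponds to a leftward-pointing bar in the upper panel of figure 9 (a parameter that decreases total deaths); the off-diagonal hessian entries correspond to the interactions in the lower panel.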
these, at the present time, are not so amenable to intervention. note that initial immunity has no effect in this analysis because we clamped the initial values to zero with very precise priors. we will relax this later. first, we look at the effect of social distancing by simulating the ensemble dynamics under increasing levels of the social distancing exponent (i.e., the sensitivity of our social distancing and self-isolation behaviour to the prevalence of the virus in the community). it may be surprising to see that social distancing has such a small effect on total deaths (see upper panel in figure 9 ). however, the contribution of social distancing is in the context of how the epidemic elicits other responses; for example, increases in critical care capacity. quantitatively speaking, increasing social distancing only delays the expression of morbidity in the population: it does not, in and of itself, decrease the cumulative cost (although it buys time to develop capacity, treatments, and primary interventions). this is especially the case if there is no effective limit on critical care capacity, because everybody who needs a bed can be accommodated. this speaks to the interaction between different causes or parameters in generating outcomes. in the particular case of the uk, the results in figure 4 suggest that although social distancing is in play, self-isolation appears limited. this is because the number of contacts at home is relatively high (at over five), thereby attenuating the effect of social distancing. in other words, slowing the spread of the virus depends upon reducing the number of contacts by social distancing. however, this will only work if there is a notable difference between the number of contacts at home and at work. one can illustrate this by simulating the effects of social distancing, when it makes a difference.
figure 10 reproduces the results in figure 8 but for 16 different levels of the social distancing parameter, while using the posterior expectation for contacts at home (of about four) from the bayesian parameter average. social distancing is expressed in terms of the probability of being found at home or at work (see the panel labelled location). as we increase social distancing, the probability and duration of being at home during the outbreak increase. this flattens the curve of death rates per day, from a peak of about 600 to a peak of about 400. this is the basis of the mitigation ('curve flattening') strategies that have been adopted worldwide. the effect of this strategy is to reduce cumulative deaths, in this example from about 17,000 to 14,000 (potentially saving about 3000 people), and to prevent finite resources from being overwhelmed. this is roughly four times the number of people who die in the equivalent period due to road traffic accidents. interestingly, these (posterior predictive) projections suggest that social distancing can lead to an endgame in which not everybody has to be immune (see the middle panel labelled infection). we now look at herd immunity using the same analysis. figure 11 reproduces the results in figure 10 using the united kingdom posterior estimates, but varying the initial (herd) immunity over 16 levels, from effectively 0 to 100%. the effects of herd immunity are marked, with cumulative deaths ranging from about 18,000 with no immunity to very small numbers with a herd immunity of about 70%. the broken red lines in the upper right panel are the number of people dying from seasonal influenza (as in figure 7 ). these projections suggest that there is a critical level of herd immunity that will effectively avert an epidemic, in virtue of reducing infection rates such that the spread of the virus decays exponentially.
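the curve-flattening effect can be reproduced with a deliberately crude deterministic seir model in which the contact rate falls as prevalence rises; sweeping the 'distancing' exponent over 16 levels lowers and delays the peak. all parameter values below are invented for illustration and are not the paper's estimates:

```python
import numpy as np

def run_seir(distancing, days=300, dt=0.1, n=1.0e6):
    """simple seir with prevalence-dependent contact reduction (illustrative)."""
    beta0, sigma, gamma, ifr = 0.5, 1 / 5, 1 / 7, 0.01
    s, e, i, r = n - 100, 0.0, 100.0, 0.0
    peak_daily_deaths = 0.0
    for _ in range(int(days / dt)):
        # contact rate falls with prevalence; 'distancing' scales the response
        beta = beta0 / (1 + distancing * (i / n) * 100)
        new_inf = beta * s * i / n * dt
        new_sym = sigma * e * dt
        new_rem = gamma * i * dt
        s, e = s - new_inf, e + new_inf - new_sym
        i, r = i + new_sym - new_rem, r + new_rem
        peak_daily_deaths = max(peak_daily_deaths, ifr * gamma * i)
    total_deaths = ifr * r
    return peak_daily_deaths, total_deaths

levels = np.linspace(0, 8, 16)           # 16 levels of social distancing
results = [run_seir(d) for d in levels]
peaks = [p for p, _ in results]
```

as in figure 10, stronger distancing both flattens the peak and (over a finite horizon) reduces cumulative deaths, largely by delaying the expression of morbidity.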
if we now return to figure 8 , it can be seen that the critical level of herd immunity will, on the basis of these projections, be reached 2 to 3 weeks after the peak in death rates. at this point, according to the model, social distancing starts to decline, as revealed by an increase in the probability of being at work. we will put some dates on this trajectory by expressing it as a narrative in the conclusion.
the key point to take from this figure is the quantification of uncertainty inherent in the credible intervals. in other words, uncertainty about the parameters propagates through to uncertainty in predicted outcomes. this uncertainty changes over time because of the nonlinear relationship between model parameters and ensemble dynamics. by model design, one can be certain about the final states; however, uncertainty about cumulative death rates itself accumulates. the mapping from parameters, through ensemble dynamics, to outcomes is mediated by latent or hidden states. the trajectory of these states is illustrated in the next figure.
figure 8. the expected death rate is shown in blue, new cases in red, predicted recovery rate in orange and ccu occupancy in yellow. the black dots correspond to empirical data. the lower four panels show the evolution of latent (ensemble) dynamics, in terms of the expected probability of being in various states. the first (location) panel shows that after about 5 to 6 weeks, there is sufficient evidence for the onset of an episode to induce social distancing, such that the probability of being found at work falls, over a couple of weeks, to negligible levels. at this time, the number of infected people increases (to about 32%), with a concomitant probability of being infectious a few days later. during this time, the probability of becoming immune increases monotonically and saturates at about 20 weeks. clinically, the probability of becoming symptomatic rises to about 30%, with a small probability of developing acute respiratory distress and, possibly, death (these probabilities are very small and cannot be seen in this graph). in terms of testing, there is a progressive increase in the number of people tested, with a concomitant decrease in those untested or waiting for their results. interestingly, the number of negative tests initially increases monotonically, while the proportion of positive tests starts to catch up during the peak of the episode. under these parameters, the entire episode lasts for about 10 weeks, or less than three months. the broken red line in the upper left panel shows the typical number of ccu beds available to a well-resourced city, prior to the outbreak.
from a modelling perspective, the influence of initial herd immunity is important because it could form the basis of modelling the spread of the virus from one city to another, and back again. in other words, more sophisticated generative models can be envisaged, in which an infected person from one city is transported to another city with a small probability or rate. reciprocal exchange between cities (and ensuing 'second waves') will then depend sensitively on the respective herd immunities in the different regions. anecdotally, other major pandemics, without social isolation strategies, have almost invariably been followed by a second peak that is as high (e.g., the 2009 h1n1 pandemic), or higher, than the first.
27 note, only 2800 beds are ventilator/itu beds.
28 we will use predictions, as opposed to projections, when appropriate, to emphasise the point that the generative model is not a timeseries model, in the sense that the unknown quantities (dcm parameters) do not change with time. this means that there is uncertainty about predictions in both the future and the past, given uncertainty about the parameters (see figure 7 ). this should be contrasted with the notion of forecasting or projection; however, predictions in the future, in this setting, can be construed as projections.
under the current model, this would be handled in terms of a second region being infected by the first city, and so on; like a chain of dominos or the spread of a bushfire (rhodes & anderson, 1998; zhang & tang, 2016). crucially, the effect of the second city (i.e., wave) on the first will be sensitive to the herd immunity established by the first wave. in this sense, it is interesting to know how initial levels of immunity shape a regional outbreak, under idealised assumptions. figure 12 illustrates the interaction between immunity and viral spread, as characterised by the effective reproduction rate, r (a.k.a. number or ratio); see (1.9). this figure plots the predicted death rates for the united kingdom and the accompanying fluctuations in r and herd immunity, where both are treated as outcomes of the generative model. the key thing to observe is that with low levels of immunity, r is fairly high, at around 2.5 (current estimates of the basic reproduction ratio 29 r0, in the literature, range from 1.4 to 3.9). as soon as social distancing comes into play, r falls dramatically to almost 0. however, when social distancing is relaxed some weeks later, r remains low, due to the partial acquisition of herd immunity during the peak of the epidemic. note that herd immunity in this setting pertains to, and only to, the effective or at-risk population: 80% herd immunity a few months from onset would otherwise be overly optimistic, compared to other de novo pandemics; e.g., (donaldson et al., 2009). on the other hand, an occult herd immunity (i.e., one not accompanied by symptoms) is consistent with undocumented infection and rapid dissemination (li et al., 2020). note that this way of characterising the spread of a virus depends upon many variables (in this model, two factors and three parameters) and can vary from country to country.
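the qualitative behaviour described here, r starting near r0 and being pulled down by both contact reduction and accumulating immunity, follows from the textbook relation r_eff = r0 × (fraction susceptible) × (1 − contact reduction). a minimal sketch (the numbers are illustrative, not the paper's estimates, and this is the standard relation rather than the dcm's equation (1.9) itself):

```python
def r_effective(r0, immune_fraction, contact_reduction=0.0):
    """textbook effective reproduction number under partial immunity
    and a proportional reduction in contacts."""
    susceptible = 1.0 - immune_fraction
    return r0 * susceptible * (1.0 - contact_reduction)

r0 = 2.5
herd_threshold = 1.0 - 1.0 / r0          # immunity needed for r_eff = 1

# with no immunity, r_eff equals r0; at the threshold, it falls to 1
r_start = r_effective(r0, 0.0)
r_at_threshold = r_effective(r0, herd_threshold)
```

with r0 = 2.5 the threshold is 60% immunity, which is why the projections above show exponential decay of the virus once immunity in the effective population passes a critical level.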
repeating the above analysis for china gives a much higher initial or basic reproduction rate, which is consistent with empirical reports (sanche et al., 2020). this concludes our treatment of outbreak scenarios for a particular country. in the final section, we revisit the confidence with which these posterior predictive projections can be made. variational approaches, of the sort described in this technical report, use all the data at hand to furnish statistically efficient estimates of model parameters and evidence. this contrasts with alternative approaches based on cross-validation. in cross-validation schemes, model evidence is approximated by cross-validation accuracy; in other words, the evidence for a model is scored by the log likelihood that some withheld or test data can be explained by the model. although model comparison based upon a variational evidence bound renders cross-validation unnecessary, one can apply the same procedures to demonstrate predictive validity. figure 13 illustrates this by fitting a partial timeseries from one country (italy) using the empirical priors afforded by the parametric empirical bayesian analysis. these partial data comprise the early phase of new cases. if the model has predictive validity, the ensuing posterior predictive density should contain the data that were withheld during estimation. figure 13 presents an example of forward prediction over a 10-day period that contains the peak death rate. in this example, the withheld data lie largely within the 90% credible intervals, speaking to the predictive validity of the generative model. there are two caveats here: first, similar analyses using very early timeseries from italy failed to predict the peak, because of insufficient (initial) constraints in the data. second, the credible intervals probably suffer from the well-known overconfidence problem in variational bayes, and the implicit mean field approximation (mackay, 2003) 30 .
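the predictive-validity check described here, fit the early part of a timeseries and ask whether withheld points fall inside the predictive band, can be sketched with a toy exponential-growth model in place of the dcm (all data below are synthetic):

```python
import numpy as np

rng = np.random.default_rng(2)

t = np.arange(40)
true_cases = 10 * np.exp(0.12 * t)
observed = true_cases * np.exp(0.05 * rng.standard_normal(40))  # noisy counts

train_t, test_t = t[:30], t[30:]          # withhold the last 10 days
y = np.log(observed[:30])

# log-linear least-squares fit of exponential growth on the training window
A = np.column_stack([np.ones_like(train_t), train_t])
(c0, c1), *_ = np.linalg.lstsq(A, y, rcond=None)
resid_sd = np.std(y - A @ np.array([c0, c1]))

# crude 90% predictive interval on the withheld days (log scale, +/- 1.64 sd)
pred = c0 + c1 * test_t
lower = np.exp(pred - 1.64 * resid_sd)
upper = np.exp(pred + 1.64 * resid_sd)
covered = np.mean((observed[30:] >= lower) & (observed[30:] <= upper))
```

in the paper the band comes from the posterior predictive density rather than residual spread, and, as the caveats note, a mean field variational posterior can make such bands overconfident.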
we have rehearsed variational procedures for the inversion of a generative model of a viral epidemic, and have extended this model using hierarchical bayesian inference (parametric empirical bayes) to deal with the differential responses of each country, in the context of a worldwide pandemic. clearly, this narrative is entirely conditioned on the generative model used to make these predictions (e.g., the assumption of lasting immunity, which may or may not be true). the narrative is offered in a deliberately definitive fashion to illustrate the effect of resolving uncertainty about what will happen. it has been argued that many deleterious effects of the pandemic are mediated by uncertainty about what will happen. this is a key motivation behind procedures that quantify uncertainty, above and beyond being able to evaluate the evidence for different hypotheses about what will happen. one aspect of this is reflected in rhetoric such as "there is no clear exit strategy". it is reassuring to note that, if one subscribes to the above model, there is a clear exit strategy inherent in the self-organised mitigation 32 afforded by herd immunity. for example, within a week of the peak death rate, there should be sufficient herd immunity to preclude any resurgence of infections in, say, london. the term 'self-organised' is used carefully here: we are part of this process, through the effect of social distancing on our location, our contact with infected people, and the subsequent dissemination of covid-19. in other words, this formulation does not preclude strategic (e.g., nonpharmacological) interventions; rather, it embraces them as part of the self-organising ensemble dynamics 33 . this technical report describes an initial implementation of the dcm framework to provide a generative model of a viral epidemic, and to demonstrate the potential utility of such modelling. clearly, there are a number of ways this model could be refined. our hope in making it open source is that it will allow others to identify issues, contribute to improvements, and help facilitate objective comparisons with other models using bayesian model comparison. there remain a number of outstanding issues: the generative model, at both the first and second level, needs to be explored more thoroughly.
29 the basic reproduction ratio is a constant that scores the spread of a contagion in a susceptible population. this corresponds to the effective reproduction ratio at the beginning of the outbreak, when everybody is susceptible. see figure 12 .
30 note further that the credible intervals can include negative values. this is an artefact of the way in which the intervals are computed: here, we used a first-order taylor expansion to propagate uncertainty about the parameters through to uncertainty about the outcomes. however, because this generative model is nonlinear in the parameters, higher-order terms are necessarily neglected.
31 this narrative is not offered as a prediction, but as an example of the kind of predictions afforded by dynamic causal modelling. an aspect of these predictions is that they include systemic factors beyond the epidemiology per se. the best example of this is the above predictions about social distancing, which could be read as 'lockdown'; namely, the probability that i will leave home. this highlights a key distinction between dynamic causal models and standard quantitative epidemiological modelling, which treats things like 'lockdown' as interventions that are supplied to the model. in contrast, interventions such as social distancing and testing are modelled here as an integral part of the process, and are estimated on the basis of the data at hand. one consequence of this is that one can make predictions about when 'interventions', or their suspension, will occur in the future.
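the first-order taylor (delta method) propagation mentioned in footnote 30, which also explains why the resulting intervals can stray below zero for strictly positive outcomes, looks like this in the abstract; the outcome function and posterior covariance below are illustrative stand-ins:

```python
import numpy as np

def outcome(p):
    """toy nonlinear outcome (e.g., predicted deaths) as a function of
    two parameters."""
    return np.exp(p[0]) + p[1] ** 2

mu = np.array([1.0, 0.5])                    # posterior mean of the parameters
cov = np.array([[0.04, 0.0], [0.0, 0.09]])   # posterior covariance

# numerical gradient of the outcome at the posterior mean
h = 1e-6
grad = np.array([
    (outcome(mu + [h, 0]) - outcome(mu - [h, 0])) / (2 * h),
    (outcome(mu + [0, h]) - outcome(mu - [0, h])) / (2 * h),
])

# first-order (delta-method) variance of the outcome
var = grad @ cov @ grad
mean = outcome(mu)
ci = (mean - 1.64 * np.sqrt(var), mean + 1.64 * np.sqrt(var))  # 90% interval
```

because the interval is symmetric and gaussian by construction, a large enough posterior covariance would push the lower bound below zero even though the outcome itself is positive; the higher-order terms neglected here are exactly what the footnote refers to.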
at the first level, this may entail the addition of other factors; for example, splitting the population into age groups or different classes of clinical vulnerability. procedurally, this should be fairly simple: specifying the dcm parameters for each age group (or cohort) separately and precluding transitions between age groups (or cohorts). one could also consider a finer graining of states within each factor; for example, making a more careful distinction between being in and not in critical care (e.g., being in self-isolation, a hospital, a community care home, a rural or urban location, and so on). at the between-city or between-country level, the parameters of the general linear model could easily be extended to include a host of demographic and geographic independent variables. finally, it would be fairly straightforward to use increasingly fine-grained outcomes, using regional timeseries as opposed to country timeseries (these data are currently available from https://github.com/cssegisanddata/covid-19). another plausible extension to the hierarchical model is to include previous outbreaks of mers and sars (middle east and severe acute respiratory syndrome, respectively) in the model. this would entail supplementing the timeseries with historical (i.e., legacy) data and replicating the general linear model for each type of virus. in effect, this would place empirical priors or constraints on any parameter that shares characteristics with mers-cov and sars-cov. in other words, more information about the dcm parameters can be installed through adjusting the prior expectations and variances. the utility of these adjustments would then be assessed in terms of model evidence. this may be particularly relevant as reliable data about bed occupancy, proportion of people recovered, etc. become available.
figure 13 . predictive validity. this figure uses the same format as figure 7 ; however, here the posterior estimates are based upon partial data, from early in the timeseries for an exemplar country (italy). these estimates were obtained under (parametric) empirical bayesian priors. the red dots show outcomes that were not used to estimate the expected trajectories (and credible intervals). this example illustrates the predictive validity of the estimates for a 10-day period following the last datapoint, which captures the rise to the peak of new cases.
in terms of the model parameters, as opposed to model structure, more precise knowledge about the underlying causes of an epidemic will afford more precise posteriors. a key aspect of the generative model used in this technical report is that it precludes any exogenous interventions of a strategic sort. in other words, the things that matter are built into the model and estimated as latent causes. however, prior knowledge about fluctuating factors, such as closing schools or limiting international air flights, could be entered by conditioning the dcm parameters on exogenous inputs. this would explicitly install intervention policies into the model. again, these conditions would only be licensed by an increase in model evidence (i.e., through comparing the evidence for models with and without some structured intervention). this may be especially important when it comes to modelling future interventions; for example, a 'sawtooth' social distancing protocol. a simple example of this kind of extension would be including a time-dependent increase in the capacity for testing: at present, constraints on testing rates are assumed to be constant. a complementary approach would be to explore models in which social distancing depends upon variables that can be measured or inferred reliably (e.g., the rate of increase of people testing positive) and optimise the parameters of the ensuing model to minimise cumulative deaths.
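conditioning a dcm parameter on an exogenous input can be as simple as making the corresponding rate a deterministic function of time. a sketch of the 'sawtooth' social distancing protocol mentioned above, expressed as a time-dependent scaling of a contact rate (the schedule and numbers are invented for illustration):

```python
import numpy as np

def sawtooth_distancing(day, period=60, duty=0.5, strength=0.7):
    """exogenous input: distancing is 'on' for the first half of each period."""
    phase = (day % period) / period
    return strength if phase < duty else 0.0

def contact_rate(day, beta0=0.5):
    """contact rate conditioned on the exogenous distancing input."""
    return beta0 * (1.0 - sawtooth_distancing(day))

days = np.arange(240)
beta = np.array([contact_rate(d) for d in days])   # alternates between levels
```

in the dcm proper, such a schedule would enter as a condition on the relevant rate parameter, and its inclusion would be licensed only by an increase in model evidence, as the text notes.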
in principle, this should provide an operational equation that could be regarded as an adaptive (social distancing) policy, one that accommodates as much as can be inferred about the epidemiology. a key outstanding issue is the modelling of how one region (or city) affects another, and how the outbreak spreads from region to region. this may be an important aspect of these kinds of models, especially when it comes to modelling second waves as 'echoes' of infection, which are reflected back to the original epicentre. as noted above, the ability of these echoes to engender a second wave may depend sensitively on the herd immunity established during the first episode. herd immunity is therefore an important (currently latent or unobserved) state. this speaks to the importance of antibody testing in furnishing empirical constraints on herd immunity. in turn, this motivates antibody testing, even if the specificity and sensitivity of the available tests are low. sensitivity and specificity are not only part of generative models; they can be estimated along with the other model parameters. in this setting, the role of antibody testing would be to provide data for population modelling and strategic advice, not to establish whether any particular person is immune or not (e.g., to allow them to go back to work). finally, it would be useful to assess the construct validity of the variational scheme adopted in dynamic causal modelling, in relation to schemes that do not make mean field approximations. these schemes usually rely upon some form of sampling (e.g., markov chain monte carlo sampling) and cross-validation. cross-validation accuracy can be regarded as a useful but computationally expensive proxy for model evidence, and is the usual way that modellers perform approximate bayesian computation. given the prevalence of these sampling based (non-variational) schemes, it would be encouraging if both approaches converged on roughly the same predictions.
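the claim that antibody tests with modest sensitivity and specificity remain informative at the population level can be illustrated with the classical rogan-gladen correction (a standard survey estimator, not part of the dcm): given known test characteristics, the true seroprevalence can be recovered from the apparent test-positive rate.

```python
def corrected_prevalence(positive_rate, sensitivity, specificity):
    """rogan-gladen estimator: true prevalence from the apparent
    test-positive rate, given known sensitivity and specificity."""
    return (positive_rate + specificity - 1.0) / (sensitivity + specificity - 1.0)

# apparent positive rate implied by 20% true prevalence and an imperfect test
true_prev, sens, spec = 0.20, 0.85, 0.95
apparent = true_prev * sens + (1 - true_prev) * (1 - spec)

estimate = corrected_prevalence(apparent, sens, spec)   # recovers true_prev
```

in the dcm setting, sensitivity and specificity would instead be treated as parameters and estimated jointly with everything else, but the same arithmetic explains why noisy tests still constrain herd immunity at the population scale.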
the aim of this technical report is to place variational schemes on the table, so that construct validation becomes a possibility in the short-term future. the figures in this technical report can be reproduced using the annotated code ( ., 2020). the code is also compatible with gnu octave 5.2. details about future developments of the software will be available from https://www.fil.ion.ucl.ac.uk/spm/covid-19/.
this epilogue was written three months after the report was submitted, providing an opportunity to revisit some of the predictions in light of actual outcomes. although the predictions in this report were used to illustrate the nature of the predictions supported by models that include social distancing, they can be used to assess the predictive validity of the dcm. subsequently, the dcm was optimised using bayesian model comparison. a crucial addition was the inclusion of heterogeneity in the response of the population to viral infection. however, even the simple dcm above accommodated sufficient heterogeneity, in terms of the distinction between an effective and a total (census) population, to provide some accurate predictions. in brief, the shape and timing of the epidemic in london were predicted to within a few days. conversely, the number of fatalities and positive test results were overestimated by a factor of about 3. in what follows, we list the accurate and inaccurate predictions. we assume that the census population of london was 8.96 million 34 , take london's effective population to be the estimated 2.49 million (see table 2 ), and read social distancing as lockdown (i.e., the probability of leaving home).
• "based on current data, reports of new cases in london are expected to peak on april 5": daily confirmed cases of coronavirus in london (and the uk) peaked on april 5 35 .
• "a peak in death rates around april 10 (good friday)" … this prediction corresponds to 8.9% = 32% x 2.49/8.96 of the census population of london, which coincides with the consensus estimates at that time: "professor chris whitty admits he thinks at least 10% of the capital has been infected" (published on 24-april-2020) 38 .
• "improvements should be seen by may 8, shortly after the may bank holiday, when social distancing will be relaxed." on may 8, the first black lives matter demonstrations started in london. this was followed by the first governmental relaxation of lockdown on may 10: "so, work from home if you can, but you should go to work if you can't work from home." (prime minister's address to the nation: 10-may-2020) 39
• "at this time [may 8] herd immunity should have risen to about 80%": population immunity in the effective population corresponds to 80% x 2.49/8.96 = 22% seroprevalence in the census population, which had risen to 17.5% in the previous week: "after making adjustments for the accuracy of the assay and the age and gender distribution of the population, the overall adjusted prevalence in london increased from 1.5% in week 13 to 12.3% in weeks 15 to 16 and 17.5% in week 18" (week ending may 3, 2020) 40 .
• "by june 12, death rates should have fallen to low levels with over 90% of people being immune": weekly reported deaths in london hospitals for the week ending june 11 fell to 22 (with positive tests) 41 . seroprevalence for this period was not reported.
• "by june 12, social distancing [lockdown] will no longer be a feature of daily life.": the second governmental relaxation of lockdown was announced on june 10 and june 23, with an initial reopening of shops and an easing of the two-metre social distancing rule: "[a]s the business secretary confirmed yesterday, we can now allow all shops to reopen from monday."
(prime minister's statement at the coronavirus press conference: 10-june-2020) 42 "thanks to our progress, we can now go further and safely ease the lockdown in england. at every stage, caution will remain our watchword, and each step will be conditional and reversible. mr speaker, given the significant fall in the prevalence of the virus, we can change the two-metre social distancing rule from 4th july." (prime minister's statement to the house: 23-june-2020) 43
inaccurate predictions: these were overestimates; daily deaths in london peaked at 249 on april 9, with cumulative deaths at the time of writing (17-july-2020) of 6,106 45 . this represents consistent overestimates by factors of 3.2 and 2.8, respectively. this may reflect the fact that the data used in the report included regions of the united kingdom outside london. software is available from: https://www.fil.ion.ucl.ac.uk/spm/covid-19/.
this technical report presents a dynamic causal model of the transmission dynamics of covid-19. i believe this paper is one of very few (if any) that follow this type of approach, which makes it an interesting and important contribution to the literature, even after dozens of modelling papers on the topic have been published or are in the process of publication. the paper is well described and the results are interesting, presenting a new approach for assessing the role of multiple factors on the spread of covid-19. however, the epidemic has advanced significantly, and it would be good to see how the results and perspectives are shaped by more recent data. the authors should consider updating the paper with the most recent data available, and discuss how their analysis and conclusions are shaped by integrating additional data. if any results are presented, are all the source data underlying the results available to ensure full reproducibility? yes. are the conclusions about the method and its performance adequately supported by the findings presented in the article?
yes. we have tried to revise the paper in a way that preserves its original content (by limiting changes to the main text to clarifying and unpacking things). we have used new footnotes nos. 11, 13, 16, 24, 31, and 33, and a new section, "posthoc evaluation of model predictions", to address issues that have arisen since submission (for example, the validity of predictions in light of actual outcomes). we hope this revised version is helpful in further clarifying our new approach.
this is an interesting and expansive modelling paper from a group of scientists who do not primarily focus on modelling infectious diseases. i think contributions to epidemiology from other fields should always be welcomed, and this is no exception. the techniques employed in this paper are less a different type of model and more an entirely different modelling framework. as such, i see part of my job in this review as trying to bridge the gaps between the language and techniques of dynamic causal modelling and those of infectious disease modelling. hopefully, in doing so, i am able to present any criticism in a way that both the authors and other infectious disease modellers are able to follow and understand. the dynamic causal model developed in this paper can be understood roughly as a stochastic compartmental seir model that has 1) a "generative" model that describes movement between unobserved states over time (infection, recovery etc.) and 2) an "observational model" that describes the likelihood for the parameter values in the generative model given the observed data (in this case daily deaths and positive tests). the generative model has four components: location, which determines where you are and the contacts you make; infection, which is akin to the susceptible-exposed-infectious-recovered model used commonly; clinical, which determines the clinical presentation should you become infected; and testing, which links your current infection status to the result of a swab test.
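the reviewer's description, movement between the states of each factor governed by probability matrices that can respond to feedback from other factors, can be caricatured for the location factor alone; the states, numbers, and feedback rule below are invented for illustration:

```python
import numpy as np

states = ["home", "work", "ccu", "deceased"]

def transition_matrix(prevalence, distancing=20.0):
    """row-stochastic daily transitions for a 'location' factor; the
    probability of staying home grows with prevalence (feedback from
    the infection factor)."""
    p_home = 1.0 - 0.5 * np.exp(-distancing * prevalence)
    T = np.array([
        [p_home, 1 - p_home, 0.0, 0.0],   # from home
        [p_home, 1 - p_home, 0.0, 0.0],   # from work
        [0.0,    0.0,        1.0, 0.0],   # ccu (absorbing, for simplicity)
        [0.0,    0.0,        0.0, 1.0],   # deceased (absorbing)
    ])
    return T

# marginal over location states after one update, at low and high prevalence
p = np.array([0.5, 0.5, 0.0, 0.0])
p_low = p @ transition_matrix(prevalence=0.001)
p_high = p @ transition_matrix(prevalence=0.2)
```

at high prevalence, the probability mass shifts from work to home, which is the kind of endogenous 'lockdown' behaviour the full model expresses across all four factors at once.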
you can be at various states within any of these four components at once, for example i could be an asymptomatic, infectious person at work that has not been tested. how i move between these states is governed by a matrix of probabilities that can be non-linear in time and as a response to feedback from other parameters within the model (for example my probability of observing lockdown can grow as more people die during the outbreak). i think ultimately the generative model is comparable to a complicated seir model and the next step in the mind of an infectious disease modeller is to use the likelihood from the observational model in a fitting method such as mcmc to generate samples from the posterior distribution of the generative model parameters. instead, dynamic causal modelling has a developed body of theory that allows for approximation of the analytical solution to the posterior of the model parameters that maximises the model evidence (marginal likelihood). this allows for immediate comparison of different generative model structures on the same data through selecting the model with the optimal log model evidence, which is also referred to as "variational free energy" (a similar process to the commonly used akaike or bayesian information criterion). this was refreshing to me as it can sometimes be difficult to obtain aic/bic after fitting your model depending on how you have fit it, such as in the probabilistic modelling language stan where you sometimes need to calculate the leave-one-out information criterion (loo-ic) yourself. another interesting methodological addition from the dynamic causal modelling framework is fitting the model to data from several different countries and then assigning model parameters as fixed or random effects, using a generalised linear model to estimate the between-country effects of certain covariates. 
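the "matrix of probabilities" governing movement between states, with entries that can depend on the current state of the system, can be sketched as a minimal discrete-time model. this is a generic illustration in the spirit of the reviewer's summary, not the paper's dcm; every rate below is invented:

```python
import numpy as np

# Minimal discrete-time S-E-I-R update: a row-stochastic transition
# matrix moves probability mass between states each day. Rates are
# illustrative, not the paper's fitted parameters.
def seir_step(state, beta):
    s, e, i, r = state
    # Infection pressure depends on current prevalence, so the matrix
    # is state-dependent (non-linear in time), as in the DCM.
    p_inf = 1.0 - np.exp(-beta * i)
    T = np.array([
        [1 - p_inf, p_inf, 0.0,  0.0],   # S -> E on contact
        [0.0,       0.8,   0.2,  0.0],   # E -> I (incubation ~5 days)
        [0.0,       0.0,   0.75, 0.25],  # I -> R (contagious ~4 days)
        [0.0,       0.0,   0.0,  1.0],   # R is absorbing
    ])
    return state @ T

state = np.array([0.99, 0.0, 0.01, 0.0])  # initial seed of infection
for _ in range(60):                        # simulate 60 days
    state = seir_step(state, beta=0.5)
```

because each row of the matrix sums to one, total probability mass is conserved at every step, which is the invariant the "location", "infection", "clinical" and "testing" factors of the dcm each maintain.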
in the manuscript the authors show the results of this process, finding a relationship between the latitude of a country and the effective population size of the outbreak inferred by the model. while, as the authors acknowledge, latitude here is very likely a proxy for other socio-economic variables, this approach could potentially yield interesting results using a wider selection of between-country effects or as a heuristic device to try and understand what factors are driving the model fit in each country. this is complemented with a technique called "bayesian model reduction", which efficiently prunes redundant parameters out of the model to simultaneously achieve model parsimony and perform a sensitivity analysis of sorts since it involves fixing the prior of each parameter and looking at the difference in model fit. to me, the framework of dynamic causal modelling seems to make available several tools that should be of interest to infectious disease modellers. it is not the case that infectious disease modellers don't already try to reduce models or compare them between countries, but what is attractive about the dynamic causal modelling approach is the coherency of the framework and the availability of software to perform the methods for models in general (although i think most infectious disease modellers would prefer to use r rather than matlab). at the very least, the methods employed in the dynamic causal modelling framework could be adapted to work with the more familiar combined compartmental model and mcmc approach. the methods in the dynamic causal modelling framework are heavily used and accepted in the field of neuroscience, so i don't think it's my job in this review to scrutinise them in particular outside of understanding them to the point where i can understand how the model in this particular paper was fitted. 
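the model-comparison logic mentioned above, selecting among generative model structures by their log model evidence (variational free energy), works like other relative scores such as aic/bic. a minimal sketch, with invented log evidences; only the differences between them carry meaning:

```python
# Toy illustration of relative model comparison. Log model evidences
# are invented; a higher value means better evidence for that model
# on the same data, but says nothing about absolute goodness of fit.
log_evidence = {'model_A': -1203.4, 'model_B': -1198.9, 'model_C': -1210.7}

best = max(log_evidence, key=log_evidence.get)
# Log Bayes factor of the best model over each alternative.
log_bf = {m: log_evidence[best] - le for m, le in log_evidence.items()}

assert best == 'model_B'
# A log Bayes factor of 4.5 is conventionally 'strong' evidence, yet
# model_B could still fit the data poorly in absolute terms -- which
# is exactly the reviewer's point about predictive validity below.
assert abs(log_bf['model_A'] - 4.5) < 1e-9
```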
with the general modelling approach summarised i can move on to the specifics of the structure of the generative and observation models: in a similar way to aic/bic, i think i am correct in thinking that model selection using variational free energy only provides a relative score of model fit and not an objective score. choosing the best model out of a suite of models does not guarantee that this best model fits well, for this we need to turn to predictive validity and this is where i think the model laid out in the paper is at its weakest. below is the best-fitting model's prediction for london in full: "based on current data, reports of new cases in london are expected to peak on april 5, followed by a peak in death rates around april 10 (good friday). at this time, critical care unit occupancy should peak, approaching-but not exceeding-capacity, based on current predictions and resource availability. at the peak of death rates, the proportion of people infected (in london) is expected to be about 32%, which should then be surpassed by the proportion of people who are immune at this time. improvements should be seen by may 8, shortly after the may bank holiday, when social distancing will be relaxed. at this time herd immunity should have risen to about 80%, about 12% of london's population will have been tested. just under half of those tested will be positive. by june 12, death rates should have fallen to low levels with over 90% of people being immune and social distancing will no longer be a feature of daily life." it's quite hard to tell if we are meant to interpret this as an example of what sort of narrative could be derived from the results of the model, or whether this is a genuine model prediction. if it is the latter, then i would expect to see mention of when the prediction was made, as well as plots showing the prediction (shown in figures 12 and 6 ) against the data which is now available. 
the authors do this for their predictions for italy (figure 13) but not london. i am writing this review on june 10th, and at the time of writing the number of deaths on the 9th june was 286. without numbers given for the prediction it's hard to know if this counts as "low levels" or not; the 8th june was the beginning of week 24 and the corresponding prediction of daily deaths in figure 12 is near zero. perhaps more concerning than the prediction for deaths is the prediction for immunity. in the paper i find it quite difficult to tell what exactly is being spoken about when it comes to immunity. the model fits a parameter called "effective population" (θn) that i think could do with some further explanation; it seems to be the case that immunity is presented as the number of infections inferred by the model divided by the effective population. when the model was fitted to uk data it inferred an effective population size of ~2.5 million people. it's quite hard to tell, but from figure 8, looking at the cumulative cases inferred by the model and the proportion of the population entering the immune category, it seems like the model has predicted that nearly all of the 2.5 million people in the effective population are now immune. here is what the authors say about the effective population parameter: "in this technical report, we will choose a simpler option that treats a pandemic as a set of linked point processes that can be modelled as rare events. in other words, we will focus on modelling a single outbreak in a region or city and treat the response of the 'next city' as a discrete process post hoc. this simplifies the generative model; in the sense we only have to worry about the ensemble dynamics of the population that comprises one city. a complimentary perspective on this choice is that we are trying to model the first wave of an epidemic as it plays out in the first city to be affected.
any second wave can then be treated as the first wave of another city or region. under this choice, the population size can be set, a priori, to 1,000,000; noting that a small city comprises (by definition) a hundred thousand people, while a large city can exceed 10 million. note that this is a prior expectation; the effective population size is estimated from the data: the assumption that the effective population size reflects the total population of a country is a hypothesis (that we will test later)." it is true that you can use a model with a population size under 67 million, look at the dynamics of the outbreak from the model output, and infer things about the potential effectiveness of social distancing, eventual likelihood of herd immunity, and so on, that would be true in a larger population. however, you would not fit a model to death data for all of the uk using a population parameter that is smaller than the population of the uk. i think the model output as shown in the manuscript is a best guess at the outbreak dynamics if the number of deaths and cases observed in a place with a population of 67 million people were instead observed in a place with a population of 2.5 million. as a result of fitting to death rates for a population 30 times bigger than the one in your model, you would expect to find that almost everyone is infected. since the writing of this manuscript, serological studies have started to emerge which estimate the percentage of the population that have been infected (which would correspond to the immune compartment in the model). on the 24th may the ons estimated that around 7% of the uk have antibodies for covid-19, rising to 17% in london. even acknowledging that serology studies are not perfect and that the ones performed so far have been quite small scale, this is really quite a different picture than the 90% population immunity presented by the model output.
the picture is similar in serological studies across the world, even in healthcare workers in hard-hit cities like barcelona that would have faced constant exposure to infection. what is the result of fitting the model to uk deaths and reported cases with a fixed, actual value for the effective population? or at least using the uk population as the prior value? i think either a) the model output should be more clearly presented as an example or b) you should acknowledge that the model output gives predictions that seem very different from the emerging evidence. the fitted probability that a person dies given that they are in the ccu (θfat) for china and italy is very high (nearly 100% and well over 50%). how well does this compare to actual observed mortality rates in ccus? for example, this paper [1] found 26% mortality in icus in lombardy, italy in early march. the uk data collated by the johns hopkins covid-19 data repository that the authors use fetches data from here. the observation model could be improved by including a delay between the actual occurrence of death and its eventual reporting in the official statistics; sometimes it can take a couple of days for deaths to appear in the government figures. i think this could interfere with the model fit as it tries to align deaths and reported cases (which the model currently assumes both happened on the same day). it is also important to consider the structure of the surveillance system when trying to fit to reported cases. in the uk, for a good while, tests were only undertaken on hospital admissions that were severe enough to warrant being admitted overnight (or at least that is what the official policy was). other countries like south korea had drive-through test centres. this is going to cause a huge discrepancy in how you should interpret changes in reported cases. it is strange that there is so much variation in some of the parameters between countries.
for example, the contagious period is around 1 day in china but around 3-5 days in france? what is the biological reasoning behind this? arguably there could be some genetic variation in the virus between countries but could that cause such a significant difference? is there any empirical evidence that supports differences in how long you are contagious between countries? the same goes for the numbers of contacts at home or contacts at work. people in the united kingdom are estimated to have around 7 contacts at home, but the average size of uk households is just 2.3. it would be good to link the output of these variables to any empirical data that is available to show that they are meaningful and do actually correspond to whatever data might be available. one of the countries with the lowest effective contacts in the household (~1.5) has a higher average household size than the uk of 2.5. the variable for the probability of infection given contact (θtrn) is fairly stable apart from china and australia, where it is relatively large and small, respectively. do the authors have any thoughts on why this might be the case? the model does not include any kind of age structure. age has a large effect on the fatality of infection and should therefore be accounted for. countries with an older population would likely see a higher fatality rate. age could also influence the amount and types of contacts that people make, with more intergenerational contacts happening within the home and more intragenerational contacts happening at work or school. the model described in this paper is an interesting and important first step at putting together a model of infectious disease dynamics within the framework of dynamic causal modelling. however, when the particular model here is fit to data i don't think it displays that it has captured the dynamics of the outbreak well wherever it is able to be compared to separately collected bits of data such as seroprevalence or ccu mortality.
i think what has happened in the model fitting process, for the most part, is that the variation introduced into the time series of deaths and reported cases by differing surveillance and reporting structures, differing testing regimes, differing outbreak responses, and differing population demographics between countries has been accounted for within the generative model through between-country variation in parameters such as the effective population size, numbers of contacts at work (for example, do most people in china really have between 100 and 150 effective contacts at work?), ccu fatality, contagious period length, and others. the unfortunate reality is that with a flexible enough model (in terms of numbers of parameters) it is always possible to produce a fit that very closely matches the reported case and death data observed so far. the real test for this model is whether the estimated parameter values that can be compared to other sources of data match what we observe empirically, and i think it is fairly obvious that this has not happened. sadly, i don't think that i can recommend this paper for indexing as it currently stands, because i don't think it is clear what it is trying to do. i think the easiest way of resolving this problem is for the authors to ask themselves the question "do i think the model predictions made for the uk in this paper are plausible, or are they examples of predictions that can be made from the model?". if the predictions are examples then this paper is an introduction to disease modelling using dynamic causal modelling, and the predictions should be more clearly labelled as examples. the paper could then be further improved by showing how methods such as the between-country parameter comparisons using the hierarchical glm correspond to the types of questions that disease modellers want to answer.
alternatively, if the authors do think that the predictions made in this paper are accurate, then they need to be far more stringent in comparing their predictions with data that has become available since they were made, and have questions to answer regarding the gap between the 90% immunity in london that they predict and the 17% that has been estimated by the ons. that london may have already reached herd immunity has huge implications for future intervention policies, the most significant being that there is no danger of a second wave. if we behave as if there is 90% immunity (completely end social distancing etc.) but we are in fact well below herd immunity, then we will have likely caused the second wave through our own actions. we would like to thank you for the considerable time and effort you have spent reviewing our manuscript. your thoroughness and attention to detail, in what must be very busy and challenging times, has been very much appreciated. we were particularly impressed with the summary of the technical aspects of this work, which are useful and informed descriptions in their own right. we have tried to revise the paper to preserve its original content (by limiting changes to the main text to clarify and unpack things). we have used footnotes and a new 'posthoc evaluation of model predictions' section to address issues that have arisen since submission (for example, the validity of predictions in light of actual outcomes). below are the replies to the comments, which for clarity we have grouped into key themes. we hope these revisions are what you had in mind: the primary purpose of this paper was to serve as a technical report, introducing a methodology that could be, and was, used to answer specific questions about epidemiological parameters and epidemiological model structure.
to clarify this, we have emphasised that the narrative at the end of the paper is an example of the kind of predictions that can be made, rather than a definitive prediction per se (footnote 31): "this narrative is not offered as a prediction -but as an example of the kind of predictions afforded by dynamic causal modelling. an aspect of these predictions is that they include systemic factors beyond the epidemiology per se. the best example of this is the above predictions about social distancing, which could be read as 'lockdown'; namely the probability that i will leave home. this highlights a key distinction between dynamic causal models and standard quantitative epidemiological modelling that treats things like 'lockdown' as interventions that are supplied to the model. in contrast, interventions such as social distancing and testing are modelled as an integral part of the process -and are estimated on the basis of the data at hand. one consequence of this is that one can make predictions about when 'interventions' -or their suspension -will occur in the future." regarding specific predictive validity, we thought it would be disingenuous to change the predictions in light of subsequent outcomes-or the procedures that were applied in subsequent reports. however, we have now added an extensive 'posthoc evaluation of model predictions' section in the revised version that addresses the predictions in light of current data. this section implicitly addresses the specific points about predictions in the reviewers' comments. we have also attempted to make the demarcation between a procedural and predictive contribution clearer throughout the text by including footnotes like the following (footnote 33): "to reiterate, the purpose of this technical report was to introduce the variational procedures entailed by dynamic causal modelling in the setting of quantitative, epidemiological modelling. 
since this report was submitted, several papers have used procedures described in this report to address specific questions; for example, the impact of lockdown cycles, the effect of population fluxes among regional outbreaks, the efficacy of testing and tracing, and the impact of heterogeneous susceptibility and transmission. crucially, in line with a key message of this foundational paper, each successive application of the dynamic causal modelling leveraged bayesian model comparison to update the model as new data became available." we also take the opportunity to future-proof retrospective evaluations of the reproduction ratio with the following footnote 13: "added in revision: the reproduction ratio in this report was based upon an approximation to the expected number of people that i might infect, if i was infectious. in subsequent reports, the reproduction ratio was brought into line with more formal definitions, based on the geometric rate of increase in the prevalence of infection and the period of contagion. a minimum reproduction ratio (r) of nearly zero in this report corresponds to about 0.7 in subsequent (and other) reports." in addition to these, we have also incorporated a number of additional changes outlined below. it is clear that the "effective population" terminology, particularly in respect to immunity, represents a common source of confusion. to rectify this, we have made a number of changes throughout the paper. first, we have amended the "initial conditions and population size" section, splitting it and introducing a new subsection as follows: we have also annotated the legend to figure 11, and made the following change to immunity predictions, to clarify this further: we appreciate the number of suggestions to help refine or improve this model further. 
as summarised in the "predictive validity" section of your review, this report provides an initial technical description for the kind of analyses that could be performed via the presented methodology. in a sense, it represents a proof of concept for this type of modelling, and we acknowledge there are many directions and improvements that could be made, noting in the text that "there remain a number of outstanding issues:" additionally, in a separate piece of work [1] we have also formally compared an ode-based seir model to the dcm presented here. this seir model was developed originally by moghadas et al. [2] to assess ccu projections due to covid-19 in the us. the seir model comprised 12 states, including asymptomatic and subclinical infected states, self-isolation, and separate states of hospitalization [2]. we optimised parameters for both the seir and dcm using identical variational processes to those presented here. taking data from seven european countries including the uk, we found that the approximate model evidence or free energy provided very strong support for the dcm as compared to the seir model, suggesting that marginal state occupancy was important when accounting for those data. in particular, log bayes factors of >100 were evidenced for all seven datasets. this comparative analysis is currently under review. we thank the reviewer for highlighting this. we are aware that delays in reporting deaths and reporting of statistics over weekends do represent potential confounds to the observed time series data. in this work, we perform smoothing of the time series by several days to deal with these delays in reporting. delays in reporting pcr testing were modelled explicitly in terms of a 'waiting for a test' state, because entry into this state depends upon testing capacity. conversely, a simple delay in reporting a death can be accommodated by an increase in effective dwell time in critical care.
one could consider a dcm that modelled the delay in reporting deaths explicitly, and then use bayesian model comparison to compare models with and without delays. we did not do this; however, the conditional dependencies between an additional delay parameter and the existing parameters would probably reduce the marginal likelihood (i.e., bayesian model evidence) of an extended dcm. we agree that differences in testing and reporting strategies will impact the data. in the model presented, the testing rate parameter accounts for some of these differences. we have added the following footnote 11 to emphasise the importance of this part of the model. first, a disclaimer: these assertions (for example fig. 6, showing differences among countries) are not about actual states of affairs. these are the best explanations for the data available at the time, under the simplest model of how those data were caused. however, there does appear to be some degree of predictive validity; for example, the predicted ccu mortality rate in the uk in april (at the time of writing of the paper) of about 48% was close to data published on the 4th april by the intensive care national audit and research centre (critical care mortality = 50.1% [3]). regarding the italian data from lombardy, whilst the mortality rate was lower (26%), the data was acquired earlier on in the pandemic (february 20 to march 18), before the peak in cases. rather than dissect the predictive validity of each parameter and country, which is widely recognised as an extremely challenging problem [4], we would reiterate that this paper is intended as a technical report for dcm, and provides examples of the types of questions that could be addressed using this method.
to clarify these points, we have modified the following in the "parametric empirical bayes and hierarchical modelling section": [5] https://en.wikipedia.org/wiki/greater_london ******************************************************************************************** posthoc evaluation of model predictions this section was written three months after the report was submitted, providing an opportunity to revisit some of the predictions in light of actual outcomes. although the predictions in this report were used to illustrate the nature of the predictions supported by models that included social distancing, they can be used to assess the predictive validity of the dcm. subsequently, the dcm was optimized using bayesian model comparison. a crucial addition was the inclusion of heterogeneity in the response of the population to viral infection. however, even the simple dcm above accommodated sufficient heterogeneity-in terms of the distinction between an effective and total (census) population-to provide some accurate predictions. in brief, the shape and timing of the epidemic in london was predicted to within a few days. conversely, the number of fatalities and positive test results were overestimated by a factor of about 3. in what follows, we list the accurate and inaccurate predictions. we assume that the census population of london was 8.96 million [1] . london's population is taken to be the effective population estimated to be 2.49 million (see table 2 ) and social distancing is read as lockdown (i.e., the probability of leaving home). "at the peak of death rates [april 10], the proportion of people infected (in london) is expected to be about 32%" ○ this prediction corresponds to 8.9% = 32% x 2.49/8.96 of the census population of london, which coincides with the consensus estimates at that time. "professor chris whitty admits he thinks at least 10% of the capital has been infected" (published on 24-april-2020) [5] . 
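the effective-to-census conversions used in this posthoc evaluation can be checked directly; the figures (32% infected, 80% immune, an effective population of 2.49 million, and a census population of 8.96 million) are taken from the text:

```python
# Converting predictions made in the effective population (2.49 million)
# to fractions of London's census population (8.96 million), as done in
# the posthoc evaluation. Figures are those quoted in the text.
effective = 2.49e6
census = 8.96e6

infected_effective = 0.32   # predicted infected at the peak of death rates
infected_census = infected_effective * effective / census
assert round(infected_census, 3) == 0.089  # ~8.9% of the census population

immune_effective = 0.80     # predicted herd immunity by May 8
immune_census = immune_effective * effective / census
assert round(immune_census, 2) == 0.22     # ~22% census seroprevalence
```

(the text divides by 8.9 rather than 8.96 in the second calculation; both round to 22%.)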
"improvements should be seen by may 8, shortly after the may bank holiday, when social distancing will be relaxed." ○ on may 8, the first black lives matter demonstrations started in london. this was followed by the first governmental relaxation of lockdown on may 10: "so, work from home if you can, but you should go to work if you can't work from home." (prime minister's address to the nation: 10-may-2020) [6] "at this time [may 8] herd immunity should have risen to about 80%" ○ population immunity in the effective population corresponds to 80% x 2.49 / 8.9 = 22% seroprevalence in the census population, which had risen to 17.5% in the previous week: "after making adjustments for the accuracy of the assay and the age and gender distribution of the population, the overall adjusted prevalence in london increased from 1.5% in week 13 to 12.3% in weeks 15 to 16 and 17.5% in week 18" (week ending may 3, 2020) [7]. "by june 12, death rates should have fallen to low levels with over 90% of people being immune" ○ weekly reported deaths in london hospitals for the week ending june 11 fell to 22 (with positive tests)[8]. seroprevalence for this period was not reported. "by june 12, social distancing [lockdown] will no longer be a feature of daily life." ○ the second governmental relaxation of lockdown was announced on june 10 and june 23, with an initial reopening of shops, and an easing of the two-metre social distancing rule: "[a]s the business secretary confirmed yesterday, we can now allow all shops to reopen from monday." (prime minister's statement that the coronavirus press conference: 10-june-2020) [9] "thanks to our progress, we can now go further and safely ease the lockdown in england. at every stage, caution will remain our watchword, and each step will be conditional and reversible. mr speaker, given the significant fall in the prevalence of the virus, we can change the two-metre social distancing rule, from 4th july." 
(prime minister's statement to the house: 23-june-2020) [10] inaccurate predictions "about 12% of london's population will have been tested (may 8). just under half of those tested will be positive." ○ this was an overestimate: 12% of the effective population corresponds to 143,424 = 12% x 48% x 2.49 million positive tests. at the time of writing (17-july-2020), only 34,397 people have tested positive in london [11], a quarter of the predicted number. from figure 8: peak daily death rate 807 (710-950) with cumulative deaths of 17,500 (14,000-21,000) ○ these were overestimates; daily deaths in london peaked at 249 on april 9 with cumulative deaths at the time of writing (17-july-2020) of 6,106 [12]. this represents consistent overestimates by factors of 3.2 and 2.8, respectively. this may reflect the fact that the data used in the report included regions in the united kingdom outside london. baseline characteristics and outcomes of 1591 patients infected with sars-cov-2 admitted to icus of the lombardy region. estimating required 'lockdown' cycles before immunity to sars-cov-2: model-based analyses. competing interests: no competing interests were disclosed. reviewer expertise: mathematical epidemiology. author response 31 jul 2020 we would like to thank you, dr. chowell, for reviewing our manuscript and finding our approach interesting. we have now revised the paper, based on your and another reviewer's feedback, which is now available. key: cord-027438-ovhzult0 authors: veen, lourens e.; hoekstra, alfons g. title: easing multiscale model design and coupling with muscle 3 date: 2020-05-25 journal: computational science iccs 2020 doi: 10.1007/978-3-030-50433-5_33 sha: doc_id: 27438 cord_uid: ovhzult0 multiscale modelling and simulation typically entails coupling multiple simulation codes into a single program. doing this in an ad-hoc fashion tends to result in a tightly coupled, difficult-to-change computer program.
this makes it difficult to experiment with different submodels, or to implement advanced techniques such as surrogate modelling. furthermore, building the coupling itself is time-consuming. the multiscale coupling library and environment version 3 (muscle 3) aims to alleviate these problems. it allows the coupling to be specified in a simple configuration file, which specifies the components of the simulation and how they should be connected together. at runtime a simulation manager takes care of coordination of submodels, while data is exchanged over the network in a peer-to-peer fashion via the muscle library. submodels need to be linked to this library, but this is minimally invasive and restructuring simulation codes is usually not needed. once operational, the model may be rewired or augmented by changing the configuration, without further changes to the submodels. muscle 3 is developed openly on github, and is available as open source software under the apache 2.0 license. natural systems consist of many interacting processes, each taking place at different scales in time and space. such multiscale systems are studied for instance in materials science, astrophysics, biomedicine, and nuclear physics [11, 16, 24, 25]. multiscale systems may extend across different kinds of physics, and beyond into social systems. for example, electricity production and distribution covers processes at time scales ranging from less than a second to several decades, covering physical properties of the infrastructure, weather, and economic aspects [21]. the behaviour of such systems, especially where emergent phenomena are present, may be understood better through simulation. (this work was supported by the netherlands escience center and nwo under the e-musc project.) simulation models of multiscale systems (multiscale models for short) are typically coupled simulations: they consist of several submodels between which information is exchanged.
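as a sketch of the configuration-file approach just described, a two-submodel coupling might be declared in muscle 3's ymmsl format roughly as follows. the component names, implementations, and port names are invented for illustration, and details may differ from the actual ymmsl schema:

```yaml
ymmsl_version: v0.1
model:
  name: macro_micro_example
  components:
    # Hypothetical components; each names a submodel implementation.
    macro: macro_model
    micro: micro_model
  conduits:
    # Wiring between submodel ports; rewiring the simulation only
    # requires editing these lines, not the submodel code.
    macro.state_out: micro.initial_state_in
    micro.final_state_out: macro.state_in
```

the simulation manager reads this file, starts the components, and routes messages along the conduits, so swapping in a surrogate model amounts to changing the component entry.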
constructing multiscale models is a non-trivial task. in addition to the challenge of constructing and verifying a sufficiently accurate model of each of the individual processes in the system, scale bridging techniques must be used to preserve key invariants while exchanging information between different spatiotemporal scales. if submodels that use different domain representations need to communicate, then conversion methods are required to bridge these gaps as well. multiscale models that exhibit temporal scale separation may require irregular communication patterns, and spatial scale separation results in running multiple instances, possibly varying their number during simulation. once verified, the model must be validated and its uncertainty quantified (uq) [22] . this entails uncertainty propagation (forward uq) and/or statistical inference of missing parameter values and their uncertainty (inverse uq). sensitivity analysis (sa) may also be employed to study the importance of individual model inputs for obtaining a realistic result. such analysis is often done using ensembles, which is computationally expensive especially if the model on its own already requires significant resources. recently, semi-intrusive methods have been proposed to improve the efficiency of uq of multiscale models [18] . these methods leave individual submodels unchanged, but require replacing some of them or augmenting the model with additional components, thus changing the connections between the submodels. when creating a multiscale model, time and development effort can often be saved by reusing existing submodel implementations. the coupling between the models however is specific to the multiscale model as a whole, and needs to be developed from scratch. doing this in an ad-hoc fashion tends to result in a tightly coupled, difficult-to-change computer program. 
experimenting with different model formulations or performing efficient validation and uncertainty quantification then requires changing the submodel implementations, which in turn makes it difficult to ensure continued interoperability between model components. as a result, significant amounts of time are spent solving technical problems rather than investigating the properties of the system under study. these issues can be alleviated through the use of a model coupling framework, a software framework which takes care of some of the aspects of coupling submodels together into a coupled simulation. many coupling frameworks exist, originating from a diversity of fields [2, 12, 13] . most of these focus on tightlycoupled scale-overlapping multiphysics simulations, often in a particular domain, and emphasise efficient execution on high-performance computers. the muscle framework has taken a somewhat different approach, focusing on scale-separated coupled simulation. these types of coupled simulations have specific communication patterns which occupy a space in between tightlycoupled, high communication intensity multiphysics simulations, and pleasingly parallel computations in which there is no communication between components at all. the aforementioned methods for semi-intrusive uq entail a similar com-munication style, but require the ability to handle ensembles of (parts of) the coupled simulation. in this paper, we introduce version 3 of the multiscale coupling library and environment (muscle 3 [23] ), and explain how it helps multiscale model developers in connecting (existing) submodels together, exchanging information between them, and changing the structure of the multiscale model as required for e.g. uncertainty quantification. we compare and contrast muscle 3 to two representative examples: precice [5] , an overlapping-scale multiphysics framework, and amuse [20] , another multiscale-oriented coupling framework. 
muscle 3 is based on the theory of the multiscale modeling and simulation framework (mmsf, [4, 8] ). the mmsf provides a systematic method for deriving the required message exchange pattern from the relative scales of the modelled processes. as an example, we show this process for a 2d simulation of in-stent restenosis (isr2d, [6, 7, 17, 19] ). this model models stent deployment in a coronary artery, followed by a healing process involving (slow) cell growth and (fast) blood flow through the artery. the biophysical aspects of the model have been described extensively in the literature; here we will focus on the model architecture and communication pattern. note that we have slightly simplified both the model (ignoring data conversion) and the method (unifying state and boundary condition updates) for convenience. simple coupled simulations consist of two or more sequential model runs, where the output of one model is used as an input of the next model. this suffices if one real-world process takes place before the next, or if there is otherwise effectively a one-way information flow between the modeled processes. the pattern of data exchange in such a model may be described as a directed acyclic graph (dag)-based workflow. a more complex case is that of cyclic models, in which two or more submodels influence each other's behaviour as the simulation progresses. using a dag to describe such a simulation is possible, but requires modelling each executing submodel as a long sequence of state update steps, making the dag unwieldy and difficult to analyse. moreover, the number of steps may not be known in advance if a submodel has a variable step size or runs until it detects convergence. a more compact but still modular representation is obtained by considering the coupled simulation to be a collection of simultaneously executing programs (components) which exchange information during execution by sending and receiving messages. 
designing a coupled simulation then becomes a matter of deciding which component should send which information to which other component at which time. designing this pattern of information exchange between the components is non-trivial. each submodel must receive the information it needs to perform its next computation as soon as possible and in the correct form. moreover, in order to avoid deadlock, message sending and receiving should match up exactly between the submodels. figure 1 depicts the derivation of the communication pattern of isr2d according to the mmsf. figure 1a) shows the spatial and temporal domains in which the three processes comprising the model take place. temporally, the model can be divided into a deployment phase followed by a healing phase. spatially, deployment and cell growth act on the arterial wall, while blood flow acts on the lumen (the open space inside the artery). figure 1b) shows a scale separation map [15] for the healing phase of the model. on the temporal axis, it shows that blood flow occurs on a scale of milliseconds to a second, while cell growth is a process of hours to weeks. thus, the temporal scales are separated [9] . spatially, the scales overlap, with the smallest agents in the cell growth model as well as the blood flow model's grid spacing on the order of 10 µm, while the domains are both on the order of millimeters. according to the mmsf, the required communication pattern for the coupled simulation can be derived from the above information. the mmsf assumes that each submodel executes a submodel execution loop (sel). the sel starts with an initialisation step (f init ), then proceeds to repeatedly observe the state (o i ) and then update the state (s). after a number of repetitions of these two steps, iteration stops and the final state is observed (o f ). 
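the submodel execution loop described above can be sketched as a small python function; this is an illustrative toy, not the libmuscle api, and all names (run_sel, observe, update) are placeholders:

```python
# toy sketch of the mmsf submodel execution loop (sel): f_init, then
# repeated (o_i, s) steps, then o_f. all names are illustrative
# placeholders, not the libmuscle api.
def run_sel(n_steps, f_init, observe, update, observe_final):
    state = f_init()                  # f_init: a message may be received
    for _ in range(n_steps):
        observe(state)                # o_i: (part of) the state may be sent
        state = update(state)         # s: a message may be received
    return observe_final(state)       # o_f: the final state is observed

# minimal usage: a "model" whose state is just an update counter.
result = run_sel(
    n_steps=3,
    f_init=lambda: 0,
    observe=lambda s: None,
    update=lambda s: s + 1,
    observe_final=lambda s: s,
)
```

in a real coupled simulation the observe and update callbacks would send and receive messages over ports; here they only mutate a counter.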
during observation steps, (some of the) state of the model may be sent to another simulation component, while during initialisation and state update steps messages may be received. for the isr2d model, causality dictates that the deployment phase is simulated before the healing phase, and therefore that the final state of the deployment (o f ) is fed into the initial conditions (f init ) of the healing simulation. in the mmsf, this is known as a dispatch coupling template. within the healing phase, there are two submodels which are timescale separated. this calls for the use of the call (o i to f init ) and release (o f to s) coupling templates. figure 1c) shows the resulting connections between the submodels using the multiscale modeling language [10] . in this diagram, the submodels are drawn as boxes, with lines indicating conduits between them through which messages may be transmitted. decorations at the end of the lines indicate the sel steps (or operators) between which the messages are sent. note that conduits are unidirectional. figure 1d) shows the corresponding timeline of execution. first, deployment is simulated, then the cell growth and blood flow models start. at every timestep of the cell growth submodel (slow dynamics), part of its state is observed (o i ) and used to initialise (f init ) the blood flow submodel (fast dynamics). the blood flow model repeatedly updates its state until it converges, then sends (part of) its final state (at its o f ) back to the cell growth model's next state update (s). while the above demonstrates how to design a multiscale model from individual submodels, it does not explain how to implement one. in this section, we introduce muscle 3 and ymmsl, and show how they ease building a complex coupled simulation. muscle 3 is the third incarnation of the multiscale coupling library and environment, and is thus the successor of muscle [14] and muscle 2 [3] . 
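the call (o_i to f_init) and release (o_f to s) templates described above amount to nesting the fast model's full execution loop inside the slow model's state update. a toy sketch, with placeholder physics (the iteration formulas are invented for illustration and have nothing to do with isr2d):

```python
# toy macro-micro coupling following the call (o_i -> f_init) and
# release (o_f -> s) templates: at each slow-model observation the fast
# model runs to convergence, and its final state feeds the slow model's
# next state update. the arithmetic is a placeholder, not real physics.
def fast_model(boundary, tol=1e-6):
    """run the fast model to convergence from the given boundary data."""
    x = boundary                                   # f_init
    while True:
        x_new = 0.5 * (x + boundary / (x + 1.0))   # placeholder iteration
        if abs(x_new - x) < tol:
            return x_new                           # o_f
        x = x_new

def slow_model(state, n_steps):
    for _ in range(n_steps):
        observed = state               # o_i: observed part of the state
        flux = fast_model(observed)    # call template: o_i -> f_init
        state = state + 0.1 * flux     # release template: o_f -> s
    return state

final = slow_model(1.0, n_steps=5)
```

note that the slow model never inspects the fast model's internals; it only sees what crosses the coupling, which is the modularity the mmsf is after.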
muscle 3 consists of two main components: libmuscle and the muscle manager. figure 2 shows how libmuscle and the muscle manager work together with each other and with the submodels to enact the simulation. at start-up, the muscle manager reads in a description of the model and then waits for the submodels to register. the submodels are linked with libmuscle, which offers an api through which they can interact with the outside world using ports, which are gateways through which messages may be sent and received. to start the simulation, the manager is started first, passing the configuration, and then the submodels are all started and passed the location of the manager. at submodel start-up, libmuscle connects to the muscle manager via tcp, describes how it can be contacted by other components, and then receives a description for each of its ports of which other component it should communicate with and where it may be found. the muscle manager derives this information from the model topology description and from the registration information sent by the other submodels. the submodels then set up direct peer-to-peer network connections to exchange messages. these connections currently use tcp, but a negotiation mechanism allows for future addition of faster transports without changes to user code. submodels may use mpi for internal communication independently of their use of muscle 3 for external communication. in this case, muscle 3 uses a spinloop-free receive barrier to allow resource sharing between submodels that do not run concurrently. in muscle 2, each non-java model instance is accompanied by a java minder process which handles communication, an additional complexity that has been removed in muscle 3 in favour of a native libmuscle implementation with language bindings. the manager is in charge of setting up the connections between the submodels. 
the model is described to the manager using ymmsl, a yaml-based serialisation of the multiscale modelling and simulation language (mmsl). mmsl is a somewhat simplified successor to the mml [10] , still based on the same concepts from the mmsf. listing 1 shows an example ymmsl file for isr2d. the model is described with its name, the compute elements making up the model, and the conduits between them. the name of each compute element is given, as well as a second identifier which identifies the implementation to use to instantiate this compute element. conduits are listed in the form component1.port1: component2.port2, which means that any messages sent by component1 on its port1 are to be sent to component2 on its port2. the components referred to in the conduits section must be listed in the compute elements section. muscle 3 reads this file directly, unlike muscle 2 which was configured using a ruby script that could be derived from the mml xml file. the ymmsl file also contains settings for the simulation. these can be global settings, like length of the simulated artery section, or addressed to a specific submodel, e.g. bf.velocity. submodel-specific settings override global settings if both are given. settings may be of types float, integer, boolean, string, and 1d or 2d array of float. the submodels need to coordinate with the manager, and communicate with each other. they do this using libmuscle, which is a library currently available in python 3 (via pip), c++ and fortran. unlike in muscle 2, whose java api differed significantly from the native one, the same features are available in all supported languages. listing 2 shows an example in python. first, an instance object is created and given a description of the ports that this submodel will use. at this point, libmuscle will connect to the manager to register itself, using an instance name and contact information passed on the command line. next, the reuse loop is entered. 
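the consistency rule stated above, that every component named in the conduits section must appear among the compute elements, can be sketched in python with a plain dict standing in for the parsed yaml. component and port names below are illustrative, not taken from the actual isr2d ymmsl file:

```python
# sketch of the ymmsl consistency rule described above: every component
# referenced by a conduit endpoint must be declared as a compute
# element. the dict stands in for parsed yaml; names are illustrative.
config = {
    "compute_elements": {"smc": "isr2d.smc", "bf": "isr2d.bf"},
    "conduits": {"smc.geom_out": "bf.geom_in",
                 "bf.wss_out": "smc.wss_in"},
    "settings": {"length": 1.5e-3, "bf.velocity": 0.48},
}

def check_conduits(cfg):
    """return conduit endpoints that name an undeclared component."""
    declared = set(cfg["compute_elements"])
    bad = []
    for sender, receiver in cfg["conduits"].items():
        for endpoint in (sender, receiver):
            if endpoint.split(".", 1)[0] not in declared:
                bad.append(endpoint)
    return bad

assert check_conduits(config) == []
```

a manager performing this check at start-up can reject a miswired model before any submodel is launched, rather than deadlocking at runtime.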
if a submodel is used as a micromodel, then it will need to run many times over the course of the simulation. the required number of runs equals the macromodel's number of timesteps, of which the micromodel should not have any knowledge if modularity is to be preserved. a shared setting could solve that, but will not work if the macromodel has varying timesteps or runs until it detects convergence. determining whether to do another run is therefore taken care of by muscle 3, and the submodel simply calls its reuse_instance() function to determine whether another run is needed. muscle 2, in most cases, relied on a global end time to shut down the simulation, which is less flexible and potentially error-prone. within the reuse loop is the implementation of the submodel execution loop. first, the model is initialised (lines 10-17). settings are requested from libmuscle, passing an (optional) type description so that libmuscle can generate an appropriate error message if the submodel is configured incorrectly. note that the re recovered time setting is specified without the submodel prefix; libmuscle will automatically resolve the setting name to either a submodel-specific or a global setting. a message containing the initial state is received on the relevant port (line 15), and the submodel's simulation time is initialised using the corresponding timestamp. the obtained data is then used to initialise the simulation state in a model-specific way, as represented here by an abstract init_model() function (line 17). next is the iteration part of the sel, in which the state is repeatedly observed and updated (lines 19-28). in addition to the simulation time corresponding to the current state, the timestamp for the next state is calculated here (line 21). this is unused here, but is required in case two submodels with overlapping timescales are to be coupled [4], and so improves reusability of the model.
in isr2d's o i operator, the current geometry of the artery is calculated and sent on the geom_out port (lines 22-23). next, the wall shear stress is received and used in the model's state update, after which the simulation time is incremented and the next observation may occur (lines 26-28). once the final state is reached, it is sent on the corresponding port (line 31). in this example, this port is not connected, which causes muscle 3 to simply ignore the send operation. in practice, a component would be attached which saves this final state to disk, or postprocesses it in some way, possibly via an in-situ/in-transit analysis framework. message data may consist of floating-point numbers, integers, booleans, strings, raw byte arrays, or lists or dictionaries containing these, as well as grids of floating-point or integer numbers or booleans, whereas muscle 2 only supported 1d arrays of numbers. internally, muscle 3 uses messagepack for encoding the data before it is sent. uncertainty quantification of simulation models is an important part of their evaluation. intrusive methods provide an efficient solution in some cases, but uq is most often done using monte carlo (mc) ensembles. an important innovation in muscle 3 compared to muscle 2 is its flexible support for monte carlo-based algorithms. this takes the form of two orthogonal features: instance sets and settings injection. the simulation has been augmented with a sampler and a load balancer, and there are now multiple instances of each of the three submodels. the sampler samples the uncertain parameters from their respective distributions, and generates a settings object for each ensemble member. these objects are sent to the load balancer, which distributes them evenly among the available model instances. the settings are then sent into a special port on the submodel instances named muscle_settings_in, from where the receiving libmuscle automatically overlays them on top of the centrally provided settings.
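the resolution rule for settings, where a submodel-specific value shadows a global one, can be sketched in a few lines of python. this is a toy standing in for libmuscle's behaviour, and the setting names are invented for illustration:

```python
# sketch of the setting-resolution rule described above: a submodel
# asking for "velocity" first looks for a submodel-specific value
# ("bf.velocity"), then falls back to the global one. names and values
# here are illustrative, not taken from a real ymmsl file.
def resolve_setting(settings, instance, name):
    qualified = f"{instance}.{name}"
    if qualified in settings:
        return settings[qualified]    # submodel-specific setting wins
    return settings[name]             # global fallback; KeyError if absent

settings = {"length": 1.5e-3, "velocity": 0.1, "bf.velocity": 0.48}
resolve_setting(settings, "bf", "velocity")   # specific value, 0.48
resolve_setting(settings, "bf", "length")     # global value, 0.0015
```

injected ensemble settings would simply be merged into this dictionary before resolution, which is why submodels need no ensemble-specific code.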
the settings are then transparently passed on to the corresponding other submodel instances. final results are passed back via the load balancer to the sampler, which can then compute the required statistics. to enable communication with sets of instances, muscle 3 offers vector ports, recognisable by the square brackets in the name. a vector port allows sending or receiving on any of a number of slots, which correspond to instances if the port is connected to a set of them. vector ports may also be connected to each other, in which case each sending slot corresponds to a receiving slot. in the example, the sampler component resizes its parameters[] port to the number of samples it intends to generate, then generates the settings objects and sends one on each slot. the load balancer receives each object on the corresponding slot of its front in port, and passes it on to a slot on its back out port. it is then received by the corresponding ensemble member, which runs and produces a result for the load balancer to receive on back in. the load balancer then reverses the earlier slot mapping, and passes the result back to the sampler on the same slot on its front out that the sampler sent the corresponding settings object on. with the exception of the mapping inside the load balancer, all addressing in this use case is done transparently by muscle 3, and components are not aware of the rest of the simulation. in particular, the submodels are not aware of the fact that they are part of an ensemble, and can be completely unmodified. an important advantage of the use of a coupling framework is the increase in modularity of the model. in muscle 3, submodels do not know of each other's existence, instead communicating through abstract ports. this gives a large amount of flexibility in how many submodels and submodel instances there are and how they are connected, as demonstrated by the uq example. 
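the slot bookkeeping in the sampler/load-balancer pattern above can be condensed into a toy round-trip; this is not muscle 3 code, and run_model stands in for a whole submodel instance:

```python
# toy version of the ensemble wiring described above: a sampler emits
# one settings object per ensemble member, the load balancer maps
# sample slots onto model-instance slots round-robin, and reverses the
# mapping so each result comes back on the slot its settings went out on.
def run_ensemble(samples, n_instances, run_model):
    slot_of = {i: i % n_instances for i in range(len(samples))}
    results = [None] * len(samples)
    for i, member_settings in enumerate(samples):
        instance = slot_of[i]                        # back_out slot choice
        results[i] = run_model(instance, member_settings)
    return results                                   # front_out, same order

# usage: five ensemble members spread over two instances of a trivial
# "model" that just squares its one uncertain parameter.
out = run_ensemble([{"rate": r} for r in (1, 2, 3, 4, 5)], 2,
                   run_model=lambda inst, s: s["rate"] ** 2)
# out == [1, 4, 9, 16, 25]
```

as in muscle 3, only the load balancer knows the slot mapping; the "model" is unaware it is part of an ensemble.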
modularity can be further improved by inserting helper components into the simulation. for instance, the full isr2d model has two mappers, components which convert from the agent-based representation of the cell model to the lattice-based representation of the blood flow model and back. these are implemented in the same way as submodels, but being simple functions only implement the f init and o f parts of the sel. the use of mappers allows submodels to interact with the outside world on their own terms from a semantic perspective as well as with respect to connectivity. separate scale bridging components may be used in the same way, except converting between scales rather than between domain representations. other coupling libraries and frameworks exist. while a full review is beyond the scope of this paper (see e.g. [12] ), we provide a brief comparison here with two other such frameworks in order to show how muscle 3 relates to other solutions. precice is a framework for coupled multiphysics simulations [5] . it comes with adapters for a variety of cfd and finite element solvers, as well as scale bridging algorithms and coupling schemes. data is exchanged between submodels in the form of variables defined on meshes, which can be written to by one component and read by another. connections are described in an xml-based configuration file. like muscle 3, precice links submodels to the framework by adding calls to a library to them. for more generic packages, a more extensive adapter is created to enable more configurability. submodels are started separately, and discover each other via files in a known directory on a shared file system, after which peer-to-peer connections are set up. precice differs from muscle 3 in that it is intended primarily for scaleoverlapping, tightly-coupled physics simulations. muscle 3 can do this as well, but is mainly designed for loosely-coupled multiscale models of any kind. 
for instance, it is not clear how an agent-based cell simulation as used in isr2d would fit in the precice data model. muscle 3's central management of model settings and its support for sets of instances allows it to run ensembles, thus providing support for uncertainty quantification. precice does not seem to have any features in this direction. the astrophysical multipurpose software environment (amuse) is a framework for coupled multiscale astrophysics simulations [20, 25] . it comprises a library of well-known astrophysics models wrapped in python modules, facilities for unit handling and data conversion, and infrastructure for spawning these models and communicating with them at runtime. data exchange between amuse submodels is in the form of either grids or particle collections, both of which store objects with arbitrary attributes. with respect to linking submodels, amuse takes the opposite approach to muscle 3 and precice. instead of linking the model code to a library, the model code is made into a library, and where muscle 3 and precice have a built-in configurable coupling paradigm, in amuse coupling is done by an arbitrary user-written python script which calls the model code. this script also starts the submodels, and performs communication by reading and writing to variables in the models. linking a submodel to amuse is more complex than doing this in muscle 3, because an api needs to be implemented that can access many parts of the model. this api enables access to the model's parameters as well as to its state. amuse comes with many existing astrophysics codes however, which will likely suffice for most users. coupling via a python script gives the user more flexibility, but also places the responsibility for implementing the coupling completely on the user. uncertainty quantification could be implemented, although scalability to large ensembles may be affected by the lack of peer-to-peer communication. 
muscle 3, as the latest version of muscle, builds on almost fourteen years of work on the multiscale modelling and simulation framework and the muscle paradigm. it is mainly designed for building loosely coupled multiscale simulations, rather than scale-overlapping multi-physics simulations. models are described by a ymmsl configuration file, which can be quickly modified to change the model structure. linking existing codes to the framework can be done quickly and easily due to its library-based design. other frameworks have more existing integrations, however. which framework is best will thus depend on the kind of problem the user is trying to solve. muscle 3 is open source software available under the apache 2.0 license, and it is being developed openly on github [23]. compared to muscle 2, the code base is entirely new, and while enough functionality exists for it to be useful, more work remains to be done. we are currently working on getting the first models ported to muscle 3, and we plan to further extend support for uncertainty quantification, implementing model components to support the recently proposed semi-intrusive uq algorithms [18]. we will also finish implementing semi-intrusive benchmarking of models, which will enable performance measurement and support performance improvements, as well as enabling future static scheduling of complex simulations. other future features could include dynamic instantiation and more efficient load balancing of submodels in order to support the heterogeneous multiscale computing paradigm [1].
patterns for high performance multiscale computing
multiphysics and multiscale software frameworks: an annotated bibliography
distributed multiscale computing with muscle 2, the multiscale coupling library and environment
foundations of distributed multiscale computing: formalization, specification, and analysis
precice: a fully parallel library for multi-physics surface coupling
a complex automata approach for in-stent restenosis: two-dimensional multiscale modelling and simulations
towards a complex automata multiscale model of in-stent restenosis
a framework for multi-scale modelling
principles of multiscale modeling
mml: towards a multiscale modeling language
physics-based multiscale coupling for full core nuclear reactor simulation
mastering the scales: a survey on the benefits of multiscale computing software
survey of multiscale and multiphysics applications and communities
an agent-based coupling platform for complex automata
towards a complex automata framework for multi-scale modeling: formalism and the scale separation map
multiscale modeling in nanomaterials science
semi-intrusive multiscale metamodelling uncertainty quantification with application to a model of in-stent restenosis
semi-intrusive uncertainty propagation for multiscale models
uncertainty quantification of a multiscale model for in-stent restenosis
the astrophysical multipurpose software environment
a review of modelling tools for energy and electricity systems with large shares of variable renewables
a comprehensive framework for verification, validation, and uncertainty quantification in scientific computing
multiscale computational models of complex biological systems
multi-physics simulations using a hierarchical interchangeable software interface
key: cord-022891-vgfv5pi4 authors: hall, graeme m. j.; hollinger, david y.
title: simulating new zealand forest dynamics with a generalized temperate forest gap model date: 2000-02-01 journal: ecol appl doi: 10.1890/1051-0761(2000)010[0115:snzfdw]2.0.co;2 sha: doc_id: 22891 cord_uid: vgfv5pi4 a generalized computer model of forest growth and nutrient dynamics (linkages) was adapted for the temperate evergreen forests of new zealand. systematic differences in species characteristics between eastern north american species and their new zealand counterparts prevented the initial version of the model from running acceptably with new zealand species. several equations were identified as responsible, and those modeling available light were extended to give more robust formulations. the resulting model (linknz) was evaluated by comparing site simulations against independent field measurements of stand sequences and across temperature and moisture gradients. it successfully simulated gap dynamics and forest succession for a range of temperate forest ecosystems in new zealand, while retaining its utility for the forests of eastern north america. these simulations provided insight into new zealand conifer–hardwood and beech species forest succession. the adequacy of the ecological processes, such as soil moisture balance, decomposition rates, and nutrient cycling, embodied in a forest simulation model was tested by applying it to new zealand forest ecosystems. this gave support to the model’s underlying hypothesis, derived from linkages, that interactions among demographic, microbial, and geological processes can explain much of the observed variation in ecosystem carbon and nitrogen storage and cycling. the addition of a disturbance option to the model supported the hypothesis that large‐scale disturbance significantly affects new zealand forest dynamics. individual-based forest simulation models predict the dynamics and structure of complex forest ecosystems. 
worldwide, long-term forest composition and forest species distributions are under pressure from continuing large-scale anthropogenic effects. because forest simulation models are ecosystem based, they can provide both predictions of forest response to these impacts and a consistent synthesis of the ecological processes involved (rastetter 1996). such properties make them a vital part of any assessment of ecosystem response to global change (reynolds et al. 1996, shugart and smith 1996). the structure of most simulation models of forest succession can be traced back to those developed to reproduce the population dynamics of trees in mixed-species forests of northeastern north america (botkin et al. 1972, shugart and west 1977). this approach tracked the development of each individual plant throughout its life cycle, with forest dynamics simulated by calculating the competitive interrelationships among trees in a restricted area, similar to that resulting from the gap in a forest canopy formed by the death or removal of a large canopy tree. (manuscript received 20 april 1998; accepted 18 january 1999; final version received 10 march 1999.) by simulating a sufficient number of gaps, the dynamics of the forest are reproduced (yamamoto 1992). this concept is supported by various plant succession studies, which show that changes in a forest ecosystem may be described by averaging the growth dynamics in gaps of different successional ages (watt 1947, bray 1956, curtis 1959, forman and godron 1981). forest gap simulation models have been developed to predict long-term impacts on forest ecosystems caused by blight, harvest management, past climates, animal browse, pollution, and large-scale disturbance by fire or storm, and to predict transients in species composition and forest structure due to changing climate (e.g., shugart and west 1977, aber et al. 1979, solomon et al. 1981, pastor and post 1988, bugmann 1996).
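the gap-averaging idea described above, with stand-level dynamics emerging as the mean over many small patches, can be illustrated with a toy simulation. all parameter values here are invented placeholders, not linkages or linknz values:

```python
# toy illustration of gap averaging: each patch grows logistically and
# is occasionally reset to low biomass by a gap-forming disturbance
# (death of a canopy tree); stand dynamics are the mean over patches.
# growth rate, capacity, and disturbance probability are placeholders.
import random

def simulate_patches(n_patches, years, growth=0.1, capacity=100.0,
                     disturbance_prob=0.01, seed=42):
    rng = random.Random(seed)
    biomass = [1.0] * n_patches
    for _ in range(years):
        for i in range(n_patches):
            if rng.random() < disturbance_prob:
                biomass[i] = 1.0                    # canopy gap: restart
            else:
                b = biomass[i]
                biomass[i] = b + growth * b * (1 - b / capacity)
    return sum(biomass) / n_patches                 # stand-level mean

mean_biomass = simulate_patches(n_patches=200, years=300)
```

with enough patches the mean settles into a quasi-steady state below the per-patch capacity, because at any moment some patches are in early recovery from a gap.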
shugart and smith (1996) compiled a list of 37 such models developed to simulate vegetation dynamics in environments ranging from cool northern hemisphere boreal forest to warm subtropical australian rain forest. almost half of these models are dedicated to north american vegetation (80% of these in eastern forests), with the other models predicting forest composition and dynamics in central and northern europe, australasia, africa, and asia. comparisons have shown that a forest gap model developed for one geographical area is unlikely to contain all of the ecological processes required to successfully simulate forest composition and structure for another area . this is partly because, despite a common lineage, models formulate the basic processes of species establishment, growth, and mortality variously. models also vary by the way in which resource limitations alter growth, by the species' life history attributes employed, and by the depth of physiology incorporated. not all models explicitly maintain a soil moisture balance or a litter decomposition-soil nutrient cycle. in an effort to better determine the potential role of climate change and other exogenous factors on new zealand forest development, we extended an eastern north american simulation model (pastor and post 1985) for new zealand's forests. in doing so, we evaluated the ecological generality of functions and algorithms developed originally for the eastern north american forests, and gained insight into several longrunning debates about ecological processes in new zealand forests. the nature of new zealand's topography, climate, and forest characteristics indicated which ecosystem processes to model. new zealand's small group of islands (landmass 270 000 km 2 ) have variations in climate, geology, and soil that offer a wider range of habitats than many much larger landmasses (wardle 1984 , molloy 1988 ). 
the three main islands (north, south, and stewart) span 1500 km (13°) of latitude and have a maritime climate, with sea level temperatures ranging from warm temperate in the north (mean annual temperature (mat) ~15°c) to cold temperate in the south (mat ~10°c). new zealand is tectonically active, with mountain building continuing at rates of up to 12 mm/yr (whitehouse 1988). about 60% of the land is >300 m above sea level, and 70% is defined as hilly or steep (molloy 1988). the varied topography cuts across the prevailing westerly winds, modifying land temperatures and causing strong rainfall gradients. the north island has a central volcanic plateau with an average elevation of 500 m and main axial mountain ranges running northeast to south. rainfall exceeds 1600 mm/yr in these areas and can reach 6000 mm/yr. there are also sizable areas of lowlands and coastal plains. the south island has a young mountain range running 700 km north to south. the many summits >3000 m create strongly differentiated climatic gradients. in the eastern lee of the south island range, rainfall drops to a low of 300 mm/yr, whereas on the windward side in westland and fiordland, it may reach 12 000 mm/yr near the ranges. the forests of new zealand are dominated by long-lived evergreen species (wardle 1984, wardle 1991). nothofagus (beech) species characterize many forest types, occurring either in pure associations (46% of the remaining forested area) or in mixed forest (22% of remaining forest area). beech species predominate in mountain regions of both main islands. their wide ecological ranges enable their frequent occurrence in montane zones as well as lowland areas. in the northern warmer areas (north of 38° s), the conifer agathis australis dominates the forests in association with a mixture of hardwoods, podocarp, and beech species. further south, lowland forests are characterized by emergent, long-lived, evergreen podocarp species dacrydium, podocarpus, and prumnopitys.
these are associated with a diverse group of broad-leaved hardwoods (beilschmiedia, metrosideros, and weinmannia), characteristic of the main canopy (ogden et al. 1996). the strong rainfall gradients and dry summer climates influence species composition and led us to consider models with a site water balance (e.g., pastor and post 1986, botkin and nisbet 1992). the young landscapes and generally infertile forest soils in new zealand (molloy 1988, wardle 1991) similarly required models that relate soil nutrient status to species composition and forest stature (e.g., aber et al. 1982, pastor and post 1986, bonan 1990). we chose the linkages gap model (pastor and post 1986, post and pastor 1996) because it included these ecosystem processes without excessive data requirements. it has explicit feedbacks between light, water, and nitrogen availability, and their effects on stand composition and productivity. previous gap models have related soil nutrient status to tree growth by species-specific sigmoid equations that reach maximum values at the highest reported basal area, or biomass, for a given region. the linkages model eliminates these site-specific maxima by explicitly simulating water and nutrient availability and using them to influence tree growth (post and pastor 1996). however, the model lacks some recent modifications in allometric relationships, growth equations, and spatially explicit modeling of the light environment (e.g., leemans and prentice 1987, martin 1992, pacala et al. 1993, prentice et al. 1993, bugmann 1996). a complete description of linkages is given in pastor and post (1985). our model, linknz, is a version of linkages generalized for new zealand forest conditions, with species parameters obtained from an alternative database of new zealand species (g. m. j. hall and d. y. hollinger, unpublished manuscript).
generalized forest gap model
here, we examine how well linknz simulates forest patterns and reproduces forest characteristics in a range of broad-leaved hardwood and conifer forest types throughout new zealand. the model presented is intended as a basis for future development. some characteristics of new zealand tree species differ profoundly from those of eastern north american species. therefore, we were careful to extend, rather than modify, the mechanisms of competition or nutrient cycling. this preserved the model's original ability to reproduce dynamics of the forests of eastern north america and allowed a comparison of results with linkages. our intention was to produce a more generally applicable forest gap model. the linkages model (pastor and post 1985, 1986) shares a common structure with the jabowa/foret class of stochastic tree population models that predict ecosystem dynamics through interactions between the forest and available resources (botkin et al. 1972, shugart and west 1977). the model simulates, on a yearly cycle, the establishment, growth, and mortality of all individual trees in a 1/12-ha plot (0.083 ha), adjusted by the effects of climate, soil properties, and competition. plot size corresponds to the average gap created by a dominant tree in eastern north american forests (shugart and west 1977). initial diameter at breast height (dbh) is stochastically set between 1.27 cm and 1.42 cm. monthly rainfall and temperature variables, together with soil moisture capacity and wilting point, determine available site moisture. as the canopy forms and develops, light availability to each tree changes, affecting growth rates and the establishment of new trees. available soil nitrogen is initialized for the site and then determined annually by external inputs, losses due to leaching, and dynamics associated with processes of immobilization and mineralization during litter decomposition (post and pastor 1996).
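the yearly cycle described above (establishment, growth, and mortality of individual trees on a 1/12-ha plot, with initial dbh set stochastically between 1.27 and 1.42 cm) can be sketched as follows. this is an illustrative skeleton, not the linkages/linknz code: the hook functions `establish`, `grow`, and `kill` stand in for the model's climate-, soil-, and competition-dependent subroutines.

```python
import random

PLOT_AREA_M2 = 833.3  # 1/12 ha, the average canopy gap size assumed by the model


def new_tree(species):
    # initial diameter at breast height is set stochastically
    # between 1.27 cm and 1.42 cm, as in linkages
    return {"species": species, "dbh": random.uniform(1.27, 1.42)}


def simulate_plot(years, establish, grow, kill):
    """run the yearly cycle: establishment, then growth, then mortality.
    establish/grow/kill are placeholder hooks; in the real model they
    would consult available light, site moisture, and soil nitrogen."""
    trees = []
    history = []
    for year in range(years):
        trees.extend(establish(trees, year))   # new individuals, light-limited
        for t in trees:
            grow(t, trees, year)               # resource-limited diameter increment
        trees = [t for t in trees if not kill(t, year)]
        history.append(len(trees))             # record stem count each year
    return history
```

with trivial hooks (one recruit per year, no mortality) the skeleton simply accumulates stems; the interest is in the hook structure, which mirrors the order of operations stated in the text.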
available soil nitrogen then affects tree growth and stand composition, which, in turn, alters litter quantity and quality and modifies decomposition rates (pastor and post 1985). a sufficient number of growing seasons, or annual cycles, was set to allow modeled forest biomass at each site to settle into an approximate steady state. the longevity of several widespread, dominant new zealand tree species (g. m. j. hall and d. y. hollinger, unpublished manuscript) led us to run the model for 2000 annual cycles, rather than the 250-400 considered sufficient for eastern north american species by pastor and post (1985), or the 1000-yr period used by pacala et al. (1993). we retained this 2000-yr time frame when simulating stochastic whole-stand disturbances, to allow long-lived individuals in any undisturbed stands to complete at least one life cycle. the stochastic nature of the model requires that more than one plot be simulated to obtain an adequate description of forest composition and structure (yamamoto 1992). we generated 50 plot successions on each site to smooth anomalous events. because of its linkages heritage, linknz also tracks details of soil organic matter and nitrogen pools, as well as site water balance. typically, both soil organic matter and soil n accumulate on a plot for several centuries until reaching an approximate steady state. by contrast, the estimates of transpiration depend only on the physical environment. these outputs assist in the understanding of factors regulating vegetation at different sites, but insufficient new zealand data prevented further evaluation. latitude, monthly temperature, monthly rainfall, and growing season data were obtained from new zealand meteorological service climate reports (1969-1986).
soil moisture holding capacity and wilting point were set according to broad soil type and soil moisture deficit maps (molloy 1988), and knowledge of the seven test sites. the initial organic matter (74.5 mg/ha) and n levels (1.64 mg/ha) were left as the model default values. we selected 72 tree species considered fundamental to the structure and functioning of new zealand native forest ecosystems, using a reference file of all woody species found on surveys (hall 1992). these species included early successional species, understory species, and many major canopy species occurring throughout new zealand forests. we also included four common and widespread tree fern species, cyathea smithii, c. dealbata, c. medullaris, and dicksonia squarrosa, because of their influence on patterns of succession in new zealand forests (wardle 1991). they are treated by the model in the same way as tree species. optimal growth constants were calculated from the equations of botkin et al. (1972), using species' maximum dimensions and longevity. species used are listed in the appendix. model parameters generated from the species' life history attributes included maximum height, maximum diameter, maximum longevity, limits of annual growing degree-day sums, shade and nutrient tolerances, establishment conditions and rates, and various canopy, foliage, and litter properties. data and methods of obtaining parameters are described in g. m. j. hall and d. y. hollinger (unpublished manuscript). the linkages model, as presented by pastor and post (1986), required modifications to its slow-growth, available-light, and decay-rate conditions to reproduce forests characteristic of new zealand sites. many major forest species did not feature in the simulations. for example, at riverhead near auckland, linkages limited species composition to the large conifer agathis australis and two hardwoods, beilschmiedia tawa and b.
tarairi, and excluded the common podocarp species dacrydium cupressinum, prumnopitys ferruginea, and p. taxifolia. further south at taupo, linkages predicted that the hardwood b. tawa would dominate and exclude all the common podocarp species. this pattern was repeated at other sites throughout the country, with site occupation being captured by one or two hardwood or beech species and other widespread species failing to establish.
ecological applications vol. 10, no. 1
the linkages model tests the growth of a tree independently of age and employs conditions common to many other models. if the resource-limited diameter increment is less than a fixed minimum of 1 mm/yr (botkin et al. 1972, shugart 1984, botkin and nisbet 1992), or <10% of the maximum increment for its size (solomon 1986), the model defines this as a poor growth year for the stem. if a tree grows poorly for two consecutive years, the probability of survival is reduced so that it has a <1% chance of surviving 10 consecutive years of poor growth. this "poor-growth" condition is too restrictive for modeling several slow-growing new zealand species. the clearest case is that of the long-lived conifer halocarpus biformis, with a highest recorded growth increment of 0.8 mm/yr (wardle 1991), preventing linkages from establishing it at all. under linkages, the widespread, dominant conifers of the podocarpaceae with recorded maximum annual diameter increments <3 mm/yr also struggle to survive in slightly less than optimal conditions. slow growth is common in the understory for other new zealand species as well. seedlings of nothofagus species, for example, remain in a quiescent state (<1.35 m tall), making only limited growth for decades until an opportunity is provided by the death of canopy trees (wardle 1984).
nothofagus fusca, normally one of the faster growing dominant species (stewart and rose 1990), may grow in diameter by only 0.8 mm/yr under a dense canopy, with 70% of the poles in a typical stand passing through this stage (kirkland 1961). baxter and norton (1989) show a similar growth release behavior for dacrydium cupressinum and quintinia acutifolia, in which ring widths of young trees are <0.5 mm/yr under an intact canopy, but increase to widths of 2-4 mm/yr after the overstory trees are removed. to minimize alterations to the model, we retained the same slow-growth conditions of linkages when maximum diameter increments exceeded 2.5 mm/yr, or when maximum longevity was attained. for maximum increments <2.5 mm/yr, growth was defined as slow only when resource-limited increments were <10% of the maximum increment. the 2.5 mm/yr threshold was chosen by comparison against optimal growth increments recorded for a range of new zealand tree species (g. m. j. hall and d. y. hollinger, unpublished manuscript). species with a maximum diameter increment exceeding 2.5 mm/yr will have, at some point, the same tests for slow growth as in linkages, whereas those that never attain this increment have the 1 mm/yr fixed minimum waived. these conditions allowed the slowest growing species, including several in the podocarpaceae, to establish and move through the slow-growing phase without excessive mortality. the light passing through a canopy can be modeled, using the beer-lambert law, as i = i_o e^(−k·lai), where i is the light intensity below the canopy; i_o is the light intensity above the canopy; k is an extinction coefficient that is a function of foliage angle distribution, spatial dispersion, and optical properties; and lai is the leaf area index in square meters of foliage per square meter of ground (monsi and saeki 1953). the linkages model calculates i as a percentage of full sunlight.
this value is used to determine whether there is sufficient light for new individuals to become established on the plot, and as a growth multiplier for seedlings of those species that can establish. in linkages, it is calculated (aber et al. 1982) as i/i_o = e^(−f/93 750), where f is the foliage mass in grams per plot, and the divisor 93 750 is a factor that accounts for the size of the plot (833.3 m²) and converts the mass of the foliage into an effective leaf area index. the per meter conversion factor of 112.5 (93 750/833.3) is thus the product of the leaf mass per unit area (slm) and the reciprocal of the extinction coefficient, k. whittaker et al. (1974) used a value of ~92 g/m² for northern hardwoods foliage, implying that the implicit value of k from aber et al. (1982) is ~0.8. new zealand forests are predominantly evergreen, with high specific leaf mass, in contrast to the predominantly deciduous eastern north american forests. consequently, linkages underestimates available light in new zealand sites for a given foliage mass, with the result that canopy species prevent any new establishment. we rewrote eq. 2 to take account of the variation in slm, which was an implied constant in linkages, and then used our measured slm values. within linkages, canopy openings are assumed to increase decay rates because of microclimatic changes. the model relates closed-canopy leaf production, l_c, to available water for plant growth, w_s (pastor and post 1984). it then compares this year's leaf litter (l_a) with l_c to construct a decay multiplier that increases decomposition under canopies with low leaf area (aber et al. 1982). for soils of high water-holding capacity, the multiplier ranges from 1.0 (l_a = l_c) to 2.0 (no canopy); for low water-holding capacity, it ranges from 1.0 to 1.25 (pastor and post 1986).
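the two forms of the light calculation above can be sketched as follows. the first function is the original linkages form with its fixed 93 750 divisor; the second is our reading of the linknz rewrite, building the leaf area index from per-species specific leaf mass (slm) rather than one implied constant. argument names are illustrative, and the explicit k = 0.8 is the approximate implicit value quoted in the text, so the two forms agree only roughly.

```python
import math

PLOT_AREA_M2 = 833.3   # 1/12 ha
K_EXTINCTION = 0.8     # approximate implicit value in aber et al. (1982), per the text


def available_light_linkages(foliage_g_per_plot):
    # original linkages form: i/i_o = exp(-f / 93 750), where the divisor
    # folds together plot area (833.3 m2) and a fixed northern-hardwood
    # conversion of foliage mass to leaf area (112.5 = 93 750 / 833.3)
    return math.exp(-foliage_g_per_plot / 93_750.0)


def available_light_slm(foliage_by_species, slm_by_species):
    """slm-aware variant: convert each species' foliage mass (g/plot) to
    leaf area using its own slm (g/m2), sum to an lai, then apply the
    beer-lambert law directly."""
    lai = sum(f / slm_by_species[sp]
              for sp, f in foliage_by_species.items()) / PLOT_AREA_M2
    return math.exp(-K_EXTINCTION * lai)
```

for a high-slm evergreen canopy, the second form returns more light per gram of foliage than the first, which is exactly the correction the text motivates.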
in linkages, the l_c value for canopy leaf production is too low for most new zealand forests; l_a exceeds l_c, causing the model to set the decay multiplier to 1.0 and so nullify any gap effect. field examples show this: daniel (1975) estimated l_c as 3.2 mg·ha⁻¹·yr⁻¹ in a new zealand podocarp-broad-leaved forest; benecke and evans (1987) established that l_c = 6.4 mg·ha⁻¹·yr⁻¹ in a nothofagus truncata forest; and hollinger et al. (1994) obtained l_c = 5.9 mg·ha⁻¹·yr⁻¹ from a mature n. fusca stand. leaf production may be greater in new zealand forests because of a long growing season, combined with the evergreen habit of most tree species. over a 2000-yr run of the model, simulations at a warm-temperate species-rich forest site at riverhead, auckland showed that the original decay multiplier was nullified in 99.9% of years on soils of low water-holding capacity, and in 66.2% of years on soils of high water-holding capacity. to compensate for differing new zealand forest leaf production values and to retain eq. 4 for american species, a scaling factor was applied to l_a. this was obtained by multiplying each plant's litter mass by the average northern hardwood foliage mass of 92 g/m² (whittaker et al. 1974) and dividing by the species' slm. over the simulated period, the model calculated mean values for l_a of 2.70 mg·ha⁻¹·yr⁻¹ on low-water-holding capacity soils and 2.77 mg·ha⁻¹·yr⁻¹ on high-capacity soils. the slm adjustment rescaled these mean values downward by nearly 50%, making them comparable with the l_c of old-growth american forests in eq. 4, and activating the decay rate multiplier in new zealand forest sites. after this adjustment, the decay-rate multiplier on the simulated riverhead forest plot was negated on just 0.3% of the annual cycles. pollen, charcoal, and fossilized plant fragments point to a long history of change in new zealand indigenous forests.
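the litter rescaling and the gap decay multiplier can be sketched together. the multiplier endpoints (1.0-2.0 for high water-holding capacity, 1.0-1.25 for low) and the 92 g/m² reference slm are from the text; the linear interpolation between the endpoints is an assumption, since eq. 4 itself is not reproduced in this excerpt.

```python
def scaled_litter(litter_by_species, slm_by_species, reference_slm=92.0):
    """rescale this year's leaf litter so high-slm evergreen foliage is
    comparable with the northern-hardwood calibration: each species'
    litter mass is multiplied by the average northern hardwood foliage
    mass (92 g/m2) and divided by its own slm."""
    return sum(m * reference_slm / slm_by_species[sp]
               for sp, m in litter_by_species.items())


def decay_multiplier(l_a, l_c, high_water_capacity=True):
    """gap effect on decomposition: 1.0 when leaf litter l_a reaches
    closed-canopy production l_c, rising to 2.0 (high water-holding
    capacity) or 1.25 (low) with no canopy. the linear form between the
    stated endpoints is our assumption."""
    top = 2.0 if high_water_capacity else 1.25
    frac = min(l_a / l_c, 1.0) if l_c > 0 else 1.0
    return top - (top - 1.0) * frac
```

when l_a exceeds l_c, the multiplier clamps to 1.0, which is exactly the nullification of the gap effect that the slm rescaling is meant to avoid.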
forest disturbances have been due to physical factors associated with steep mountain slopes, and include volcanism, periodic fire, forest dieback, windthrow, drought, flooding, and snow damage (wardle 1984). we added a basic mechanism to linknz to simulate disturbance. the type (wind or fire) and the mean disturbance return time can be set in the site parameters. the linknz model will trigger that type of disturbance annually with a probability equal to the reciprocal of the mean disturbance return time. after windthrow, all trees on the plot are assumed to be dead and all biomass, including belowground root mass, is returned to the site for decomposition-nutrient cycling. after fire, all trees are assumed to be dead and the larger biomass components are returned to the site. the biomass and n in foliage and twigs are presumed to be volatilized and lost from the site. revisions made to the linkages code (pastor and post 1985) gave a threefold improvement in execution speed. the linknz program, species data, and site data will be made available, subject to a "fair use" policy, on the web site <http://www.landcare.cri.nz>. evaluation of forest ecosystem gap models is not straightforward. to construct definitive tests of simulated forest dynamics, several stands would have to be monitored for long periods while they were returning to old-growth forest. even in well-observed eastern north american forests, a lack of historical data on succession has been acknowledged (pacala et al. 1993). in addition, rastetter (1996) has noted that each alternative used to evaluate modeled ecosystem response to global change fails to provide a severe and crucial test. partly, this is due to difficulties in locating past or present ecosystem states comparable to those expected under climate change. for instance, how can valid data be obtained for evaluating long-term vegetation responses to increased co2, when short-term chamber experiments are still inconclusive?
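the disturbance trigger and the wind/fire biomass bookkeeping described above can be sketched as follows. the trigger probability (reciprocal of the mean return time) is from the text; the pool names and the fine (foliage + twig) fraction are illustrative stand-ins, not values from the model.

```python
import random


def maybe_disturb(plot, mean_return_time_yr, kind, rng):
    """each year, trigger a whole-stand disturbance with probability equal
    to the reciprocal of the mean return time. after windthrow, all
    biomass (including roots) stays on site for decomposition; after
    fire, foliage and twig biomass is volatilized and lost, while the
    larger components are returned. pool names are hypothetical."""
    if rng.random() >= 1.0 / mean_return_time_yr:
        return False                              # no disturbance this year
    live = plot.pop("live_biomass", 0.0)
    if kind == "wind":
        plot["litter"] += live                    # everything returned to the site
    elif kind == "fire":
        fine_fraction = plot.get("fine_fraction", 0.15)  # foliage + twigs, assumed
        plot["litter"] += live * (1.0 - fine_fraction)   # coarse material returned
        # the fine fraction (and its n) is lost from the site
    plot["live_biomass"] = 0.0
    return True
```

a 300-yr mean return time, as used later at the reefton site, corresponds to an annual trigger probability of 1/300 ≈ 0.0033.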
we adopt recent evaluation methods for forest gap models to evaluate the ability of the model under current climates, but acknowledge that theoretical difficulties remain. shugart (1984) and shugart and smith (1996) discuss procedures used for testing the results of gap models. most include assessing the model's abilities to reproduce "target patterns" of stand or tree biomass increments, stand structure (basal area, density, stem diameter distributions) or composition (relative basal area, relative density) for stands of known age, successional trends in a chronosequence of stands, and forest response to disturbance as a "natural" experiment. bugmann (1996) tested model predictions of species composition, biomass, and distribution on a range of sites in the european alps. we assess linknz similarly by comparing output with general characteristics of forest vegetation at sites throughout new zealand. these sites are located across both temperature and precipitation gradients in habitats varying from diverse-species warm-temperate forests to limited-floristic cool-temperate forests (table 1). pacala et al. (1993) tested their spatially explicit model against data from a short chronosequence (up to 100 yr), and against a long-term succession, by comparing species composition and basal area. we compare our model through time against studies of forest successional development on landslides in southwestern new zealand (mark et al. 1964, stewart 1986). comparisons of models and methods can give confidence if underlying methods are independent (rastetter 1996). we compare simulated long-term dynamics against successional sequences deduced by methods based on empirical data and models unrelated to linknz (e.g., ogden 1983, wardle 1984, burns and smale 1990). bugmann et al. (1996) compare versions of forest gap models showing changes due to additional features.
we briefly compare results between our model and a test of an allometry … finally, we present and discuss model results for forests where large-scale disturbance appears to play a role in shaping forest structure. the model reproduces the broad patterns of forest succession and composition at a variety of test sites. at riverhead, auckland, a silty clay loam soil texture (field moisture capacity 38.3 cm, wilting point of 20 cm) was chosen to represent moister, valley soil conditions. from the simulations, mean relative stem densities at 25-yr intervals for the main species (with mean biomass > 0.01 mg·ha⁻¹·yr⁻¹) were clustered using a group-average linkage with a gamma similarity coefficient (systat 1997). this gave three groups of species at the 0.8 similarity level. the early-arriving species (group 1) separated at the 0.45 similarity level into primarily short-lived, small trees (group 1a) and longer lived tall trees (group 1b). initially, modeled primary succession proceeds through the fast-growing group 1a species, especially leptospermum scoparium and kunzea ericoides, with aristotelia serrata also present for the first 25 yr (fig. 1). included in group 1a is the slow-growing phyllocladus trichomanoides, which can maintain a longer presence. the relative density of these pioneer species drops rapidly after 25-50 yr, returning only when large canopy gaps occur after year 500. the massive conifer agathis australis attains dominance in >90% of the simulated plots by year 300 and, like the group 1a pioneers, begins at a high density. over the first ~200 yr, agathis also replaces group 1b, early-establishing hardwood species including weinmannia silvicola, knightia excelsa, elaeocarpus dentatus, and the beech nothofagus truncata (fig. 1). the model suggests, however, that agathis does not regenerate well in situ, with its relative density declining steadily for >500 yr and its biomass dropping until it disappears after 1700-1800 yr.
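the species-grouping step above (mean relative stem densities at 25-yr intervals, clustered by group-average linkage and cut at a similarity threshold) can be sketched with a toy agglomerative routine. this is a stand-in for the actual systat analysis: the gamma similarity coefficient is replaced by an arbitrary pairwise similarity argument, and the merge loop is deliberately minimal.

```python
def average_linkage_clusters(series, similarity, threshold):
    """tiny group-average agglomerative clustering: repeatedly merge the
    pair of clusters with the highest mean pairwise similarity, stopping
    when no pair reaches the threshold. 'series' maps species names to
    density trajectories; 'similarity' is any pairwise coefficient
    (the paper uses a gamma similarity coefficient; any stand-in works)."""
    clusters = [[name] for name in series]
    while len(clusters) > 1:
        best, best_sim = None, threshold
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sims = [similarity(series[a], series[b])
                        for a in clusters[i] for b in clusters[j]]
                mean_sim = sum(sims) / len(sims)   # group-average criterion
                if mean_sim >= best_sim:
                    best, best_sim = (i, j), mean_sim
        if best is None:
            break                                  # no pair reaches the cut level
        i, j = best
        clusters[i] += clusters.pop(j)
    return clusters
```

cutting the same dendrogram at a lower threshold (0.45 instead of 0.8) is what splits group 1 into the 1a and 1b subgroups described above.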
as agathis declines after 500-700 yr, the forest becomes co-dominated by species from groups 1b and 2, forming an agathis/hardwood community with an increasing podocarp component (figs. 1 and 2a). the group 2 species, beilschmiedia tawa and b. tarairi, rise in numbers, reaching 30% of total biomass between years 500 and 800, and then decline to hold a near-constant 10% of total biomass after 1200 yr, as the longer lived group 3 podocarps emerge. of these, dacrydium cupressinum gains slowly in relative density and biomass after group 1a species disappear, while prumnopitys ferruginea and p. taxifolia increase rapidly as agathis wanes. in the absence of a large disturbance, such as fire, these podocarp species (with a small hardwood component) are predicted to eventually characterize the forest. our simulations indicate that, at this warm new zealand site, maximum forest biomass is reached between 250 and 450 yr, while agathis dominates. during that period, the model predicts a mean basal area of 62 m²/ha (fig. 2b), of which agathis contributes 58 m²/ha. this compares well with a mean basal area of 57 m²/ha for agathis, observed in 25 mature (mean age 327 yr) stands. with a drier soil (moisture capacity 22.9 cm, wilting point 10.4 cm), depicting ridge conditions at riverhead, agathis is predicted to persist at the expense of the group 2 hardwoods. the simulation (not graphed) produces a 6.5% higher mean agathis biomass, with a lower maximum and a smoother decline. the hardwood species beilschmiedia tawa and b. tarairi, which favor more fertile soils and are less tolerant of water stress than agathis, drop in mean biomass by >50%, from 9.8% to 4.3%. in comparison, the three major podocarp species retain 31-33% of total biomass. the drought-tolerant prumnopitys taxifolia prospers at the expense of both d. cupressinum and p. ferruginea.
fig. 1. relative stem densities of key species on modeled plots using climate and soil data from riverhead, auckland. other species of lesser importance on the modeled plots include: elaeocarpus hookerianus in group 1b, quintinia serrata in group 2, and podocarpus totara and podocarpus hallii in group 3. in this and all subsequent simulations, the values shown are the means from 50 simulated 1/12-ha plots. minimum and maximum mean percentages of total stems are given for each species.
in the cooler climate further to the south, near taupo (fig. 3a), modeled primary succession on silty clay loam soil proceeds again through kunzea ericoides with leptospermum scoparium, and aristotelia serrata. weinmannia silvicola is replaced by the cooler climate species w. racemosa, and warmer temperate species such as the hardwood beilschmiedia tarairi and the dominant conifer agathis fail to establish. the common north island hardwood b. tawa retains a small, constant biomass (~6%) throughout the simulation period. in the first 100 yr, modeled plots are dominated by weinmannia racemosa, k. ericoides, and elaeocarpus species. these species make up >70% of total biomass at year 100, reduce to 50% by year 200 as the podocarps dacrydium cupressinum, prumnopitys ferruginea, and p. taxifolia increase, and become a minor component at 1% by year 500. modeled community composition is similar to that at lakeside sites in the taupo area (wardle 1984, clarkson and nicholls 1992). the successional patterns resemble a sequence described by wardle (1991) for parts of this central north island volcanic plateau area with a deep tephra soil, characterized by d. cupressinum-dominated mixed-podocarp forest establishing by 200-300 yr and developing by 400-500 yr into a large, mature podocarp-broadleaved forest.
by contrast, simulations carried out using the cooler climate conditions for reefton (typical of the south island west coast of new zealand) suggest that the emergent podocarp dacrydium cupressinum, in association with the common hardwood weinmannia racemosa, will more quickly dominate plots in this area (after the initial establishment of aristotelia serrata, leptospermum scoparium, and kunzea ericoides). the model shows prumnopitys and podocarpus species, followed by nothofagus species, beginning to establish after ~200 yr (fig. 3b). apart from n. fusca (to be discussed), forest composition agrees with descriptions of the area (wardle 1984). the drop-off in biomass over 30 yr at about year 900 represents mortality of the last of the original cohort of the long-lived, dominant d. cupressinum. the gap model predicts that the eventual "steady-state" forest composition of the major species at taupo and reefton may be similar (fig. 3a, b), with comparable patterns for w. racemosa and the podocarp species d. cupressinum, p. ferruginea, and p. taxifolia at both sites. slight differences are evident, with reefton predicted to have a larger beech component and b. tawa restricted to taupo, as observed in nature. the model generates a similar forest 200 km south at franz josef (fig. 3c), where the mean annual temperature is only slightly lower than in reefton (table 1). early succession at this site is started by leptospermum scoparium and aristotelia serrata, rather than kunzea, as at the more northern sites. a similar amount of nothofagus menziesii is predicted in forest at franz josef, as for reefton (fig. 3b, c). however, franz josef is located within the 150-km stretch of the south island west coast where nothofagus is absent (referred to as the "beech gap"). excluding beech species from the model at franz josef does not alter the forest dynamics greatly, and correctly predicts a dacrydium cupressinum-dominated podocarp forest (fig. 3d).
other species include weinmannia racemosa, quintinia acutifolia, pseudowintera colorata, and cyathea smithii. still omitting beech species from the model, we simulated climate conditions several hundred meters upslope of the franz josef meteorological station by reducing mean monthly temperatures by 2°c. at this upslope site, a mixed hardwood-podocarp forest is simulated with metrosideros umbellata, weinmannia racemosa, podocarpus hallii, and phyllocladus aspleniifolius var. alpinus (fig. 3e). other species present include griselinia littoralis, libocedrus bidwillii, and pseudowintera colorata. this change in species composition with elevation corresponds with that commonly observed along the western slopes of the southern alps in new zealand (e.g., wardle 1991). the total biomass in these simulated slope forests is about two-thirds that estimated for the lowland podocarp forests. this reduction is caused partly by a change in species composition from the large, lowland podocarp, dacrydium cupressinum, to the smaller-statured podocarp, p. hallii, with the broad-leaved hardwood w. racemosa, and is exacerbated by the decline in biomass of the dominant m. umbellata. metrosideros initially dominates the forest, reaching ~200 mg/ha after 250 yr, but is gradually replaced by podocarpus hallii after 500 yr. on the drier east side of the southern alps, the model simulates very different forests from those on the wetter west side (fig. 3f, g). at the driest site (mean precipitation 635 mm/yr) near twizel, on a sandy soil, the model generates a forest dominated by podocarpus hallii, with a small amount of nothofagus solandri var. cliffortioides, phyllocladus aspleniifolius var. alpinus, and prumnopitys taxifolia. early succession at this site is dominated by leptospermum scoparium and n. solandri var. cliffortioides. the simulated biomass of these plots is ~220 mg/ha.
although there is no forest at present around twizel, on adjacent slopes there is abundant charcoal evidence for a p. hallii forest before the arrival of polynesian settlers in new zealand (molloy et al. 1963, wells 1972). on drier, cooler sites, pollen and charcoal evidence from the foot of the ben ohau range near twizel record a p. alpinus-dominated scrub with a lesser p. hallii component and traces of n. solandri var. cliffortioides (mcglone and moar 1998). at the higher elevation craigieburn site (fig. 3g), on a sandy-loam soil, the model generates a mixed nothofagus solandri var. cliffortioides-n. menziesii forest where there is presently solely n. solandri var. cliffortioides. this modeled forest exhibits interesting dynamic behavior, and will be discussed in more detail (see natural monocultures). simulated forest succession at lake thompson (using climate data from the west arm, manapouri station) allowed comparison with several detailed studies of succession in the area (mark et al. 1964, stewart 1986). qualitatively, much of the early pattern of succession observed by mark et al. (1964, 1989) was reproduced by linknz, with aristotelia serrata and, eventually, n. menziesii. other species that occur in forested sites close to lake thompson (stewart 1986), such as pseudowintera colorata, a valley floor forest co-dominant in the understory, and the conifers podocarpus hallii and prumnopitys ferruginea, are also present in our simulations (fig. 4a, b). quantitative predictions were more variable. the linknz model reproduced the dynamics of the short-lived pioneer aristotelia serrata, a co-dominant in the 15-yr-old slip-face plot (exceeding 10% relative density), and absent from stands ≥50 yr old (mark et al. 1964). the principal seral species, leptospermum scoparium, was prolific in the 15-yr-old stand (60% relative density), much less abundant in the 50-80 yr old stands (20%), and not recorded in the mature forest.
our simulation reproduced the initial relative density and persistence of l. scoparium, but predicted only 5% relative density after 50 yr. predictions for the light-demanding beech species nothofagus solandri var. cliffortioides at year 80 and year 140 were within 5% and 1%, respectively, of the field observations. the model properly predicted that the mature-forest dominant n. menziesii would establish at an early stage, but gave it only slight increases in relative density until year 120, whereas mark et al. (1989) found that it occurred in moderate amounts on both 73-yr-old and 102-yr-old slip-face plots. the linknz model predicted that high numbers of podocarpus hallii and prumnopitys ferruginea would occur after year 50. these affect predicted relative densities of other species. weinmannia racemosa was found by mark et al. (1989) to contribute 13% relative density at year 49 and 41% in the mature forest. the linknz model predicted 18% w. racemosa relative density at year 50, dropping to 12% at year 140 as the podocarps became abundant (fig. 4a). mark et al. (1989) recorded a relatively high density of the hardwood metrosideros umbellata after 49 yr, whereas the model predicted just 5% for this species after 50 yr. despite these differences, the simulation of an adjacent mature nothofagus menziesii-weinmannia racemosa-pseudowintera colorata forest is still very acceptable (e.g., stewart 1986, mark et al. 1989). with reference to species' relative densities, we note that our modeled results are the average of 50 stands; individual stands can follow quite distinct trajectories from the mean. introducing relatively infrequent stochastic disturbance (1/12 ha total plot blow-down on the average of 200-500 yr) results in simulated forests with a greater representation of early successional species and lower biomass than in forests where only individual tree gap replacement dynamics are allowed.
this regime has the effect of not only increasing the number of gaps over time, but also dramatically altering the stand structure. over time, a plot generally carries several trees so that, when one dies, any remaining individuals are free to respond, but under the blow-down scenario, all remaining trees on the plot are killed (e.g., fig. 5a vs. fig. 2a, and fig. 5b vs. fig. 3b). in these simulations, the long-lived podocarp species decline in absolute as well as relative importance, while other species tend to maintain absolute biomass and increase in relative representation. this is illustrated at the reefton site, which has a history of mass disturbances including two major earthquakes in the area during this century (wardle 1984). when we imposed a mean disturbance return time of 300 yr and allowed fallen material to remain on site, predicted total biomass of nothofagus species at reefton increased from 5.5% without disturbance to 27.5%. the beech n. fusca is common in the area (wardle 1984), but in simulations without disturbance, it was virtually absent (fig. 3b). the introduction of disturbance allowed n. fusca to capture >12% of total biomass, increased n. menziesii from 5% to 16% of total biomass, and led to the establishment of a small amount of n. solandri. with this disturbance regime, the early-establishing and common hardwood weinmannia racemosa retained a constant presence and more than doubled its share of biomass from 5.6% to 13.4%. these gains came at the expense of the podocarps, with the biomass of prumnopitys species reduced by one-half and dacrydium cupressinum reduced by one-third to 26%, compared to a scenario without random disturbance. the lists of species and successional patterns generated by the model for different sites in new zealand agree closely with those observed at the local sites (e.g., wardle 1984, 1991).
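as a rough illustration (not the linknz implementation), a whole-plot blow-down regime with a given mean return time can be sketched as a per-year bernoulli draw; the function name and parameters below are hypothetical:

```python
import random

def blowdown_years(n_years, mean_return_yr=300, seed=42):
    """Sample whole-plot blow-down events: each simulated year has a
    1/mean_return_yr chance of a stand-levelling disturbance that kills
    all remaining trees on the plot (an illustrative simplification)."""
    rng = random.Random(seed)
    return [yr for yr in range(n_years)
            if rng.random() < 1.0 / mean_return_yr]

# over 6000 simulated years we expect roughly 6000/300 = 20 events
events = blowdown_years(6000)
```

in a fuller sketch, each event year would trigger killing every tree on the plot and (optionally) retaining the fallen material on site, which is the variant that favored nothofagus at reefton.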
at the sites tested, the model does not establish any species where it does not belong, with the exception of the ''beech gap,'' to be discussed. for example, the commonly described pattern of colonization, which has the fast-colonizing kunzea ericoides and leptospermum scoparium acting as ''nurse'' species for agathis australis or podocarps (ogden 1983), is reproduced in the riverhead, auckland simulation (fig. 2a), as is the eventual replacement of agathis by hardwoods and podocarps. in the initial 100-200 yr, the riverhead simulation also shows the common hardwoods weinmannia silvicola, knightia excelsa, elaeocarpus dentatus, the beech nothofagus truncata, and phyllocladus trichomanoides establishing in numbers, and then gradually being overtopped and replaced by agathis (fig. 1). this successional sequence in these warmer temperate forests is described by ecroyd (1982) and burns and smale (1990). the model also reproduces observed changes in species composition for different soils. drier soil sites favor longer term agathis occupation, whereas moister soils suit hardwood species (ecroyd 1982). the biomass and basal area estimates produced by the model are more difficult to evaluate, but agree in general with published estimates. from a harvesting trial in a typical 130-yr-old agathis stand in the hunua ranges south of auckland, madgwick et al. (1982) estimated the agathis biomass component at ≈132 mg/ha. on agathis-occupied plots, with hunua climate data, the model predicts a mean ≈133 mg/ha agathis component at 150 yr. huge biomass is possible in clumped stands of mature agathis forest (wardle 1991). hinds and reid (1957) estimate that a representative area of a typical mature agathis forest with 100 merchantable stems/ha could produce >400 mg/ha, on average, of commercial timber. madgwick et al. (1982) found that stemwood made up 64% of total agathis biomass in their study; this factor generates an approximate total biomass estimate of 625 mg/ha.
this compares with the model estimate in which total agathis biomass reaches a peak of 630 mg/ha in 300-450 yr old stands. near riverhead, ogden (1983) recorded basal areas of agathis of 72 m²/ha for a ≈300-yr-old stand and 55 m²/ha for a young ≈120-yr-old ''ricker'' stand. in comparison, our model simulations predict lower mean basal areas of 62 m²/ha on 300-yr-old agathis-dominated stands and 49 m²/ha at 125 yr (fig. 2b). burns and smale (1990), on an intermediate-stage 200-yr-old site on the coromandel peninsula, obtained a total basal area of 62 m²/ha, of which 42% was contributed by agathis. our model, with their coromandel site climate data, predicted a similar total stand basal area, but with an 89% agathis component. for reefton, we estimated undisturbed mature podocarp-beech forest aboveground biomass at ≈325 mg/ha (fig. 3b) and belowground biomass at ≈300 mg/ha. for periodically disturbed mature podocarp-beech forest, above- and belowground biomass was ≈235 and ≈260 mg/ha, respectively. beets (1980) recorded aboveground (living) and belowground (excluding logs) biomass values of 306 mg/ha and 340 mg/ha, which included 147 mg/ha of roots, at a mature mixed podocarp-beech site near reefton. in the craigieburn range, several studies have investigated nothofagus solandri var. cliffortioides aboveground biomass, finding values that range between 177 and 323 mg/ha (benecke and nordmeyer 1982, schoenenberger 1984, harcombe et al. 1998). our simulated values range from 129 to 284 mg/ha. although the model successfully simulates broad successional patterns within new zealand forests, detailed patterns may not be exactly reproduced, particularly during the early-establishment stages. forest gap models incorporate stochastic elements to mimic many ecosystem processes and produce multiple simulations to obtain average results and calculate confidence limits. uncertainty in model data and possible errors in field data also obscure reconstructions of past events (rastetter 1996). deviations between model and reality result from a number of sources. these include potential errors in site parameters such as climate or soil water-holding capacity, errors in species parameters, and flaws in our understanding (and in modeling this understanding) of how site and species characteristics interact and affect forest growth. the results from lake thompson are instructive. our climate estimates for lake thompson are taken from a site 50 km distant, in an area of high relief and climatic extremes; our estimate of initial site fertility may also be imprecise. we use the same set of growth parameters for each species throughout new zealand (ignoring ecotypic variation), and some of these parameters are relatively imprecise estimates. a chronic problem with testing gap models in this way is that the species potentially available on a plot exceed those that naturally occur (the model provides for an omnipresent seed source, ignoring seed dispersal mechanisms and differing arrival times). this results in a relatively high percentage of ''other'' species that may exist for only several years before dying off. in addition, during early establishment, the model initiates all individuals as equal-aged saplings with a mean stem dbh of 1.5 cm. time required to reach this point is not explicitly accounted for, and there can already be considerable age differences between individuals, depending upon their species' growth characteristics. despite these problems, the detailed pattern of succession simulated for landslides near lake thompson is reasonable. the species characteristics given to linknz were not altered, nor were the species limited to those present in the lake thompson area. develice (1988) presented a basic foret-type model with allometric parameters set for the five tree species of greatest importance at the lake thompson landslides.
even so, his model overestimated the initial density of nothofagus solandri var. cliffortioides and the subsequent density of n. menziesii. in these develice (1988) simulations, n. menziesii accounted for nearly 50% of total stand density by year 80, whereas mark et al. (1964, 1989) found that its density was generally <10% of the total. the linknz model produced a better approximation to field counts of these nothofagus species, and predicted the increases in pseudowintera colorata over time that were noted by mark et al. (1989). it did deviate from field observations by initially establishing podocarp species (not included by develice 1988); although these species did not become a significant part of the site biomass until year 200 (fig. 4b), they lowered relative density predictions for the hardwood species. the simulated early and numerous establishment of these bird-dispersed podocarp species lends support to the contention of mark et al. (1989) that the initial floristics model of primary succession (egler 1954), in which species arrive simultaneously and successively gain dominance according to their life history attributes, may not fully account for the early dynamics on these slip faces. in summary, although our results and those of develice (1988) deviate in some details from the short-term pattern of succession reported by mark and co-workers, the mature forest simulation of linknz corresponded closely to that described by stewart (1986) and mark et al. (1989). having established the validity of the model for reproducing general successional patterns across a range of sites in new zealand, we then used the model to provide some insights on several ongoing debates concerning ecological patterns and processes in new zealand. there has been debate concerning the ''lack'' of regeneration in agathis australis forests (for discussion, see ecroyd 1982 and ogden 1985).
our results, based solely on gap-phase replacement dynamics inherent in the model, support the primary succession theory of egler (1954) and the early belief (cockayne 1928) that agathis is successional to a climax podocarp forest (figs. 1, 2a) and that strongly agathis-dominated forests would occur only in the first 400 yr after large-scale disturbance. the gap size and frequencies produced by the model are a consequence of the comparative life history attributes set for the trees that occupy the site. thus, long-lived trees will produce gaps only infrequently; in the case of agathis, this frequency is so low that there is sufficient time for the shade-tolerant hardwoods and podocarps to become well established, reducing the likelihood that enough light would penetrate through the understory of a tree-sized gap to permit abundant agathis regeneration. in a study of agathis treefall gaps, ogden et al. (1987) found that agathis established in only a few, enough to maintain a presence but not dominance. they estimated that, owing to its longevity (mean > 600 yr), agathis could survive on any site up to 1500-2000 yr. the corresponding simulation (fig. 2a), without large-scale disturbance, shows agathis declining from an initial dominance, but remaining a significant component of the forest for nearly 1600 yr. many workers have pointed out the importance of larger scale, infrequent disturbance to new zealand forest dynamics (e.g., stewart 1982, ogden 1985), and have urged acceptance of ''kinetic'' models (e.g., veblen et al. 1980) in which stochastic disturbance is accepted as a selective force. ahmed and ogden (1987) inferred from their study of agathis population structure that episodic regeneration occurred at intervals of 100-300 yr. when we introduced stochastic disturbance of this frequency into the dynamics of the model (fig. 5a), agathis persisted long after it would otherwise have gone; without disturbance, it declines to zero by year 1800 (fig. 2a). in the stochastically disturbed scenario (fig.
5a), agathis biomass remains at ≈33% of total biomass over the entire interval between 600 and 1800 yr. our results are consistent with the conclusion of ahmed and ogden (1987): agathis is a successional species that maintains a strong presence in the forests of northland because of repeated disturbance. nothofagus species are completely absent from a 150-km stretch along the central west coast of the south island, and are also absent from stewart island, 40 km south of the south island. cockayne (1926) suggested that this absence could be the result of insufficient time for beech to have recolonized the area since the end of the last glaciation (≈10,000 yr bp). our results for sites within the ''beech gap,'' such as franz josef and hokitika (data not shown), lend support to cockayne's hypothesis. when beech is part of the available species pool, it can establish and become a permanent, if low-biomass, component of the forest. even when beech availability on the site was delayed 1000 yr to allow the forest to fully establish first, linknz indicated that a small amount of beech could establish. this suggests that poor modes of dispersal in beech may play a more significant role than any total inability to compete. wardle (1964) also suggests that new zealand beech may compete less effectively with existing vegetation where the rainfall is high, such as on the west coast, than in the drier conditions to the east of the axial ranges. the linknz model supports this contention, because the simulations show that beech requires a longer period of time to become established in the ''beech gap'' sites than on the drier, cooler east side of the south island in the craigieburn range (compare fig. 3b, c with fig. 3f, g). beech establishment in this region may also be influenced by the effects of stand-level disturbance on soil fertility.
the model suggests that stochastic disturbances that create gaps larger than normal treefall size promote the establishment of some beech species. wardle (1984) points out that beech can be outcompeted in low-elevation, high-rainfall areas owing to difficulties in finding suitable sites and increasing competition in the high-density, species-rich understories. ogden (1988) describes the role of synchronized cohort mortality in the dynamics of beech and how disturbance promotes nothofagus fusca establishment. when we included beech in the species pool and introduced wind-blow disturbances (300-yr mean return time) to the simulations at franz josef, located in the ''beech gap,'' beech biomass increased by a factor of 5 to reach ≈25% of the total; n. menziesii biomass doubled, and the fast-growing n. fusca established and attained >12% of total biomass. by contrast, a disturbance regime that removed organic matter from the site inhibited beech species establishment and produced a more typical low-elevation hardwood-podocarp forest. ogden (1988) notes that n. fusca prefers higher fertility sites and would be expected to have difficulty establishing. at this low-elevation, low-fertility site, the model predicts that other species, such as the common hardwood weinmannia racemosa, would capture the area and lead to the establishment of large, dominant podocarps such as dacrydium cupressinum. our results for the craigieburn forest show that biomass contributions of the beech species nothofagus solandri var. cliffortioides and n. menziesii tend to oscillate out of phase with one another in a damped cycle of ≈500 yr (fig. 6a). an ecological interpretation of this behavior is that, as the fast-growing, light-demanding, even-aged stands of n. solandri var. cliffortioides thin, they become replaced by relatively even-aged stands of the shade-tolerant n. menziesii. when these n. menziesii stands senesce, the faster growing n. solandri var.
cliffortioides begin to recapture the site. because n. menziesii is longer lived and can continue to regenerate under its own canopy, these oscillations lengthen and decrease in amplitude over subsequent generations. this ''counter-cyclical succession'' is a consequence of the life history attributes of the two species most suited to the cool craigieburn climate. both species have wide, overlapping soil fertility, soil moisture, and climatic tolerances (wardle 1984, benecke and allen 1992; g. m. j. hall and d. y. hollinger, unpublished manuscript). in new zealand, these species form almost continuous alpine and subalpine forests throughout the axial mountain ranges of both main islands. wardle (1984) found that in mixed stands, dense, small-diameter (young) nothofagus menziesii usually occur in conjunction with large-diameter (old) n. solandri var. cliffortioides and, conversely, stands with low numbers of large n. menziesii trees often have high densities of young n. solandri var. cliffortioides trees. ogden (1988) further suggested that n. solandri var. cliffortioides will gradually be replaced by n. menziesii unless the stand is severely disturbed. in fact, the craigieburn forests are essentially monocultures of n. solandri var. cliffortioides, with few other canopy species. disturbance events such as wind, earthquake-triggered landslips, and heavy snowfall (wardle 1984) frequently disrupt these subalpine and alpine forests on the eastern slopes of the south island axial range. age-diameter distributions indicate that stands may be severely damaged by gales at periods of ≈120-150 yr, and stands suffer minor damage at intervals of 20-30 yr (wardle 1984, jane 1986, harcombe et al. 1998). when we investigated the influence of disturbance on the dynamics and biomass of n. solandri var. cliffortioides-n.
menziesii forests by adding stochastic disturbance (blowdown of whole-plot biomass) to the dynamics of the model, the counter-cyclical pattern of succession was removed (fig. 6b). furthermore, as the mean interval between disturbances decreased, the relative percentage and absolute amount of n. menziesii in the resulting stands decreased from a 70% dominance to <20% for disturbance frequencies greater than one event every 120 yr (fig. 7). this model behavior provides support for wardle's (1984) conclusion that stability favors n. menziesii and disturbance favors n. solandri var. cliffortioides, and suggests why n. solandri var. cliffortioides can be so dominant in disturbance-prone subalpine forests. we evaluated the degree to which principles and relationships derived from north american studies could be used to simulate the structure and dynamics of new zealand forest ecosystems. the characteristics of new zealand tree species are significantly different from those of eastern north american trees. new zealand species are generally long-lived evergreens with low n and high specific mass foliage that is retained in the canopy for several years. yet, a model that was based on ecological processes of tree competition and growth, litter decomposition, and n cycling that originated primarily in north america and was designed to simulate the ecology of eastern north american species required only minor modifications to acceptably simulate general forest patterns and processes across climatic gradients in a range of new zealand forest types. the most significant modifications improved the way in which the predecessor model, linkages, calculated the forest floor light environment and modified the slow growth rate conditions that trigger mortality. overall, our results tested the adequacy of the ecological processes embodied in forest simulators such as linkages.
they provided support for the model's underlying hypothesis: interactions among demographic processes determining plant population structure, microbial processes determining n availability, and geological processes determining water availability explain much of the observed variation in ecosystem c and n storage and cycling (pastor and post 1985). in this framework, geology and climate act as constraints within which feedbacks between vegetation and light availability, and between vegetation and n availability, operate. by incorporating a simple disturbance regime into the model, we also supported the hypothesis that large-scale disturbance is of importance in shaping the dynamics and current composition of new zealand forests (veblen and stewart 1982, ogden 1985, wardle 1991). the linknz model is a versatile simulation model of vegetation patterns and processes in new zealand forests. it may find additional practical applications in investigating the impacts of climatic change, forest harvesting practices, forest restoration, or introduced animal impacts on the dynamics of indigenous vegetation. furthermore, the modifications that we have incorporated should also improve the performance of the model in its original domain, the northeastern united states.
acknowledgments
many colleagues at landcare research and at the department of conservation provided assistance with this project.
predicting the effects of different harvesting regimes on productivity and yield in northern hardwoods
predicting the effects of rotation length, harvest intensity, and fertilization on fiber yield from northern hardwood forests in new england
population dynamics of the emergent conifer agathis australis (d.don) lindl. (kauri) in new zealand. 1.
population structures and tree growth rates in mature stands
forest recovery after logging in lowland dense rimu forest, westland, new zealand
amount and distribution of dry matter in a mature beech/podocarp community
new zealand's mountain forests: their use and abuse. pages 1-12 in yang yupo and zhang jiangling, editors. protection and management of mountain forests. iufro project group p 1.07-00: ecology of subalpine zones
growth and water use in nothofagus truncata (hard beech) in temperate hill country
carbon uptake and allocation by nothofagus solandri var. cliffortioides (hook. f.) poole and pinus contorta douglas ex loudon ssp. contorta at montane and subalpine altitudes
some ecological consequences of a computer model of forest growth
forest response to climatic change: effects of parameter estimation and choice of weather patterns on the reliability of projections
gap phase replacement in a maple-basswood forest
a simplified forest model to study species composition along climate gradients
a comparison of forest gap models structure and behavior
changes in structure and composition over fifteen years in a secondary kauri (agathis australis)-tanekaha (phyllocladus trichomanoides) forest stand
the distribution of beech (nothofagus) species in the east taupo zone after the 1850 bp volcanic eruptions
monograph on the new zealand beech forests. part 1. the ecology of the forests and the taxonomy of the beeches
the vegetation of new zealand
the vegetation of wisconsin. an ordination of plant communities
preliminary account of litter production in a new zealand lowland podocarp-rata-broadleaf forest
test of a forest dynamics simulator in new zealand
biological flora of new zealand. 8. agathis australis (d. don) lindl. (araucariaceae). kauri
vegetation science concepts. i.
initial floristic composition, a factor in old-field vegetation development
patches and structural components for a landscape ecology
pc-recce vegetation inventory data analysis
spatial and temporal patterns in structure, biomass, growth, and mortality in a monospecific nothofagus solandri var. cliffortioides forest in new zealand
forest trees and timbers of new zealand
carbon dioxide exchange between an undisturbed old-growth temperate forest and the atmosphere
wind damage as an ecological process in mountain beech forests of canterbury
preliminary notes on seeding and seedlings in red and hard beech forests of north westland and the silvicultural implications
description and simulation of tree-layer composition and size distributions in a primaeval picea-pinus forest
above-ground biomass, nutrients, and energy content of trees in a second-growth stand of agathis australis
forest succession on landslides in the fiord ecological region, southwestern new zealand
forest succession on landslides above lake thompson, fiordland
exe: a climatically sensitive model to study climate change and co2 enrichment effects on forests
dryland holocene vegetation history
distribution of subfossil forest remains, eastern south island
soils in the new zealand landscape. mallinson rendel, in association with the new zealand society of soil science
über den lichtfaktor in den pflanzengesellschaften und seine bedeutung für die stoffproduktion
meteorological observations
community matrix model predictions of future forest composition at russell state forest
an introduction to plant demography with special reference to new zealand trees
forest dynamics and stand-level dieback in new zealand's nothofagus forests
ecology of new zealand nothofagus forests
population dynamics of the emergent conifer agathis australis (d. don) lindl. (kauri) in new zealand. ii. seedling population sizes and gap-phase regeneration
forest models defined by field measurements: i.
the design of a northeastern forest simulator
calculating thornthwaite's and mather's aet using an approximating function
development of a linked forest productivity-soil process model. oak ridge national laboratory ornl/tm-9519
response of northern forests to co2-induced climate change
linkages: an individual-based forest ecosystem model
a simulation model for the transient effects of climate change on forest landscapes
validating models of ecosystem response to global change
above ground biomass of mountain beech (nothofagus solandri (hook.f) oerst. var. cliffortioides (hook.f.) poole) in different stands near timber
a theory of forest dynamics
a review of forest patch models and their application to global change research
development of an appalachian forest succession model and its application to assessment of the impact of the chestnut blight
transient responses of forests to co2-induced climatic change: simulation modelling experiments in eastern north america
simulating the role of climate change and species immigration on forest succession
forest dynamics and disturbance in a beech/hardwood forest
the significance of life history strategies in the developmental history of mixed beech (nothofagus) forests
systat manual, version 7.0. spss, chicago
structure and dynamics of old growth nothofagus forests in the valdivian andes
on the conifer regeneration gap in new zealand: the dynamics of libocedrus bidwillii stands on south island
the new zealand beeches: ecology, utilisation and management
facets of the distribution of forest vegetation in new zealand
vegetation of new zealand
pattern and process in the plant community
ecology of podocarpus hallii in central otago
geomorphology of the central southern alps, new zealand: the interaction of plate collision and atmospheric circulation
the hubbard brook ecosystem study: forest biomass and production
the gap theory in forest dynamics.
the botanical magazine
in particular, larry burrows, glenn stewart, robert allen, ian payton, and bruce burns assisted with comments about the sites and forest ecosystems. matt mcglone, glenn stewart, and two unnamed referees kindly reviewed the manuscript and suggested improvements. this project was funded by the foundation for research, science and technology under contract number c09624. a list of the 76 new zealand forest species selected for input to the linknz forest gap model is available in esa's electronic data archive: ecological archives a010-001.

key: cord-024283-ydnxotsq authors: chen, jiarui; cheong, hong-hin; siu, shirley weng in title: bestox: a convolutional neural network regression model based on binary-encoded smiles for acute oral toxicity prediction of chemical compounds date: 2020-02-01 journal: algorithms for computational biology doi: 10.1007/978-3-030-42266-0_12 sha: doc_id: 24283 cord_uid: ydnxotsq compound toxicity prediction is a very challenging and critical task in the drug discovery and design field. traditionally, cell- or animal-based experiments are required to confirm the acute oral toxicity of chemical compounds. however, these methods are often restricted by the availability of experimental facilities, long experimentation time, and high cost. in this paper, we propose a novel convolutional neural network regression model, named bestox, to predict the acute oral toxicity (ld50) of chemical compounds. this model learns the compositional and chemical properties of compounds from their two-dimensional binary matrices. each matrix encodes the occurrences of certain atom types, number of bonded hydrogens, atom charge, valence, ring, degree, aromaticity, chirality, and hybridization along the smiles string of a given compound.
in a benchmark experiment using a dataset of 7413 observations (train/test 5931/1482), bestox achieved a squared correlation coefficient (r2) of 0.619, root-mean-squared error (rmse) of 0.603, and mean absolute error (mae) of 0.433. despite the use of a shallow model architecture and simple molecular descriptors, our method performs comparably against two recently published models. measuring the chemical and physiological properties of chemical compounds is a fundamental task in biomedical research and drug discovery [19]. the basic idea of modern drug design is to search for chemical compounds with the desired affinity, potency, and efficacy against the biological target that is relevant to the disease of interest. however, not only are there tens of thousands of known chemical compounds in nature, but many more artificial chemical compounds are being produced each year [9]. thus, the modern drug discovery pipeline focuses on narrowing down the scope of the chemical space where good drug candidates lie [7, 11]. potential lead compounds are subjected to further experimental validation of their pharmacodynamic and pharmacokinetic (pd/pk) properties [2, 14]; the latter includes absorption, distribution, metabolism, excretion, and toxicity (adme/t) measurements. traditionally, chemists and biologists conduct cell-based or animal-based experiments to measure the pd/pk properties of these compounds and their actual biological effects in vivo. however, these experiments are not only costly in terms of both time and money, but those that involve animal testing are also increasingly subject to concerns from ethical perspectives [1]. among all measured properties, the toxicity of a compound is the most important one, which must be confirmed before approval of the compound for medication purposes [16]. there are different ways to classify the toxicity of a compound.
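the benchmark metrics named above (squared correlation coefficient, rmse, mae) can be computed from paired observed/predicted values; in this sketch r2 is taken as the squared pearson correlation, which may differ from the exact definition used in the paper:

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (r2, rmse, mae) for paired observed/predicted values.
    r2 here is the squared Pearson correlation coefficient."""
    n = len(y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    r2 = (cov / math.sqrt(var_t * var_p)) ** 2
    return r2, rmse, mae

# toy example with made-up values, not data from the paper
r2, rmse, mae = regression_metrics([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8])
```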
for example, based on systemic toxic effects, the common toxicity types include acute toxicity, sub-chronic toxicity, chronic toxicity, carcinogenicity, developmental toxicity, and genetic toxicity [22]. on the other hand, based on the affected area, toxicity can also be classified as hepatotoxicity, ototoxicity, ocular toxicity, etc. [15]. therefore, there is a great demand for accurate, low-cost, and time-saving toxicity prediction methods for the different toxicity categories. the toxicity of a chemical compound is associated with its chemical structure [17]. a good example is the chiral compounds. these compounds and their isomers have highly similar structures with only slight differences in molecular geometry, yet these differences cause them to possess different biological properties. for example, the drug dopa is a compound for treating parkinson disease: the d-isomer of this compound has severe toxicity whereas the l-isomer does not [12], so only its levorotatory form can be used for medical treatment. this property-structure relationship is often described as a quantitative structure-activity relationship (qsar) and has been widely used in the prediction of different properties of compounds [4, 24]. based on the same idea, the toxicities of a compound, being among its most closely examined properties, can be predicted via computational means as a way to select more promising candidates before undertaking further biological experiments. the simplified molecular input line entry system, also called smiles [20, 21], is a linear representation of a chemical compound. it is a short ascii string describing the composition, connectivity, and charges of the atoms in a compound. an example is shown in fig. 1. the compound is called morphine; it originates from the opiate family and is found naturally in many plants and animals. morphine has been widely used as a medication to relieve the acute and chronic pain of patients.
nowadays, compounds are usually converted into their smiles strings for easy storage in databases or for other computational processing such as machine learning. common molecular toolkits such as rdkit [8] and openbabel [13] can convert a smiles string to its 2d and 3d structures, and vice versa. in recent years, machine learning has become the mainstream technique in natural language processing (nlp). among all machine learning applications in nlp, text classification is the most widely studied: based on the input text, a machine learning-based nlp model analyzes the organization and types of words in order to categorize the given text. two pioneering nlp methods are textcnn [6] and convnets [26]. the former introduced a pretrained embedding layer to encode the words of input sentences into fixed-size feature vectors with padding. the feature vectors of all words were then combined to form a sentence matrix that was fed into a standard convolutional neural network (cnn) model. this work was considered a breakthrough at the time and has accumulated over 5800 citations since 2014 (as per google scholar). another spotlight paper in nlp for text classification is convnets [26]. instead of analyzing words in a sentence, this model exploited a simple one-hot encoding at the character level over 70 unique characters for sentence analysis. the success of these methods in nlp sheds light on other applications that have only text as raw data. compound toxicity prediction can be considered a classification problem too. recently, hirohara et al. [3] proposed a new cnn model for toxicity classification based on character-level encoding, in which each smiles character is encoded into a 42-dimensional feature vector. the cnn model based on this simple encoding method achieved an area-under-curve (auc) value of 0.813 for the classification of 12 endpoints using the tox21 dataset [18].
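the character-level one-hot idea behind convnets and the hirohara et al. encoder can be sketched as follows. note that the vocabulary below is a small illustrative subset chosen for this sketch, not the actual 70-character convnets alphabet or the 42-dimensional hirohara features, and that plain character-level encoding treats a two-letter element symbol such as "br" as two separate characters.

```python
# Minimal character-level one-hot encoder for SMILES strings, in the
# spirit of the ConvNets / Hirohara-style encodings described above.
# VOCAB is an illustrative subset, not the alphabet from either paper.
VOCAB = list("CNOSPFclBr()[]=#123456@+-")

def one_hot_smiles(smiles, vocab=VOCAB, max_len=20):
    """Encode a SMILES string as a max_len x len(vocab) 0/1 matrix."""
    index = {ch: i for i, ch in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in range(max_len)]
    for pos, ch in enumerate(smiles[:max_len]):
        if ch in index:            # unknown characters stay all-zero
            matrix[pos][index[ch]] = 1
    return matrix                  # rows past the string end are zero padding
```

rows beyond the end of the string remain all-zero, which is the same zero-padding convention the bes encoding uses for short strings.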
the best auc score in the tox21 challenge is 0.846, achieved by deeptox [10]. despite its higher accuracy, the deeptox model is extremely complex. it requires heavy feature engineering from a large pool of static and dynamic features derived from the compounds directly or indirectly via external tools. the classification model is ensemble-based, combining deep neural networks (dnn) with multiple layers of hidden nodes ranging from 2^10 to 2^14 nodes. the train dataset for this highly complex model comprised over 12,000 observations, and superior predictive performance was demonstrated. besides classification, toxicity prediction can be treated as a regression problem when the compound toxicity level is of concern. like other qsar problems, toxicity regression is a highly challenging task due to the limited availability and the noisiness of the data. with limited data, a simpler model architecture is preferred to avoid badly overfitting the model. in this work, we have focused on the regression of the acute oral toxicity of chemical compounds. two recent works [5, 23] address this problem, and the best r² achieved so far is only 0.629 [5]. in this study, we developed a regression model for acute oral toxicity prediction. the prediction task is to estimate the median lethal dose, ld50, of a compound; this is the dose required to kill half the members of the tested population. a small ld50 value indicates a high toxicity level whereas a large ld50 value indicates a low toxicity level of the compound. based on the ld50 value, compounds can be categorized into four levels as defined by the united states environmental protection agency (epa) (see table 1); for example, category iii (slightly toxic and slightly irritating) covers 500 < ld50 ≤ 5000 and category iv (practically non-toxic and not an irritant) covers ld50 > 5000. the rat acute oral toxicity dataset used in this study was kindly provided by the author of toptox [23].
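the epa banding in table 1 amounts to simple threshold checks on the ld50 value. only the two boundaries visible above (500 and 5000 mg/kg) are encoded in this sketch; the cut-off separating categories i and ii is not reproduced in the text, so everything at or below 500 mg/kg is reported jointly rather than guessed.

```python
def epa_category(ld50_mg_per_kg):
    """Map an oral LD50 (mg/kg) to an EPA toxicity category.

    Only the boundaries visible in Table 1 above are encoded; the
    category I / II split is not shown there, so values at or below
    500 mg/kg are reported jointly.
    """
    if ld50_mg_per_kg > 5000:
        return "category iv"    # practically non-toxic, not an irritant
    if ld50_mg_per_kg > 500:
        return "category iii"   # slightly toxic and slightly irritating
    return "category i/ii"      # more toxic; boundary not shown above
```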
this dataset was also used in the recent study of computational toxicity prediction by karim et al. [5]. for the ld50 prediction task, the dataset contains 7413 samples, of which 5931 are for training and 1482 for testing. the original train/test split was deliberately made to maintain a similar distribution between the train and test datasets, to facilitate learning and model validation. it is noteworthy that, as the actual ld50 values span a wide range (train set: 0.042 mg/kg to 99947.567 mg/kg; test set: 0.020 mg/kg to 114062.725 mg/kg), the ld50 values were first converted to mol/kg and then scaled logarithmically to −log10(ld50). after processing, the experimental values range from 0.470 to 7.100 in the train set and from 0.291 to 7.207 in the test set. as a smiles string is not an understandable input format for general machine learning methods, it needs to be converted or encoded into a series of numerical values. ideally, these values should capture the characteristics of the compound and correlate with the observables of interest. the most popular way to encode a smiles string is to use molecular fingerprints such as the molecular access system (maccs) and the extended connectivity fingerprint (ecfp). however, fingerprint algorithms generate high-dimensional and sparse matrices, which make learning difficult. here, in order to solve the regression task for oral toxicity prediction, and inspired by the work of hirohara et al. [3], we propose a modified binary encoding method for smiles, named bes for short. in bes, each character is encoded by a binary vector of 56 bits. among them, 26 bits encode the smiles alphabets and symbols by the one-hot encoding approach; 30 bits encode various atomic properties, including the number of bonded hydrogens, formal charge, valence, ring membership, degree, aromaticity, chirality, and hybridization. the feature types and the corresponding sizes of the features are listed in table 2.
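the label preprocessing described above (mg/kg converted to mol/kg, then scaled to −log10) can be sketched as below. passing the molar mass explicitly is an assumption of this sketch; the mol/kg conversion implies dividing by the compound's molar mass, obtained in practice from a cheminformatics toolkit.

```python
import math

def ld50_to_target(ld50_mg_per_kg, mol_weight_g_per_mol):
    """Convert an LD50 in mg/kg to the regression target -log10(mol/kg).

    mg/kg -> g/kg (divide by 1000) -> mol/kg (divide by molar mass).
    Smaller LD50 (more toxic) maps to a larger target value.
    """
    mol_per_kg = ld50_mg_per_kg / 1000.0 / mol_weight_g_per_mol
    return -math.log10(mol_per_kg)
```

as a sanity check, the extreme train-set value of 0.042 mg/kg, for a hypothetical compound of molar mass 300 g/mol, maps to about 6.85, consistent with the reported processed range of 0.470 to 7.100.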
as the maximum length of the smiles strings in our dataset is 300, the size of the feature matrix for one smiles string was defined to be 56 × 300. for a smiles string shorter than 300 characters, zero padding was applied. figure 2 illustrates how bes works. our prediction model is a conventional cnn model with convolutional layers to extract features, pooling layers to reduce the dimensionality of the feature matrix and to prevent overfitting, and a multi-layer neural network to correlate the features to ld50 values. to decide the model architecture and to tune the hyperparameters of the model, a grid search method was employed. table 3 shows the hyperparameters and the ranges of values within which the model was optimized. in each grid search process, the model training was run for 500 epochs and the mean-squared error (mse) loss of the model in 5-fold cross validation was used as the criterion for model selection. the optimal parameters are also presented in table 3. the final production model was trained using the optimal parameters and the entire train dataset. the maximum training epoch was 1000; the early stopping method was used to prevent overfitting. the architecture of our optimized cnn model is presented in fig. 3. the model contains two convolutional layers (conv) with 512 and 1024 filters, respectively. after each convolutional layer is an average pooling layer and a batch normalization layer (bn). then, a max pooling layer is used before the learned features are fed into the fully connected layers (fc). four fcs containing 2048, 1024, 512, and 256 hidden nodes were found to be the optimal combination for toxicity prediction, and the relu function is used to generate the prediction output. all implementations were done using python 3.6.9 with the following libraries: anaconda 4.7.0, rdkit v2019.09.2.0, pytorch 1.2.0, and cuda 10.0.
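a pytorch sketch consistent with the architecture described above: two convolutional layers with 512 and 1024 filters, each followed by average pooling and batch normalization, a max pooling layer, and four fully connected layers of 2048/1024/512/256 nodes with a relu-generated output. kernel sizes, pooling windows, and the choice of 1d convolution over the sequence axis are assumptions of this sketch, since those hyperparameters live in table 3 and are not reproduced in the text.

```python
import torch
import torch.nn as nn

class BESToxLike(nn.Module):
    """Shallow CNN over 56 x 300 binary-encoded SMILES matrices.

    Filter counts and FC widths follow the description above; kernel
    sizes and pooling windows are illustrative assumptions.
    """
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(56, 512, kernel_size=5, padding=2), nn.ReLU(),
            nn.AvgPool1d(2), nn.BatchNorm1d(512),
            nn.Conv1d(512, 1024, kernel_size=5, padding=2), nn.ReLU(),
            nn.AvgPool1d(2), nn.BatchNorm1d(1024),
            nn.AdaptiveMaxPool1d(1),            # global max pooling
        )
        self.regressor = nn.Sequential(
            nn.Linear(1024, 2048), nn.ReLU(),
            nn.Linear(2048, 1024), nn.ReLU(),
            nn.Linear(1024, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, 1), nn.ReLU(),       # text: ReLU generates the
        )                                        # (non-negative) prediction

    def forward(self, x):                        # x: (batch, 56, 300)
        h = self.features(x).squeeze(-1)         # (batch, 1024)
        return self.regressor(h)                 # (batch, 1)
```

a forward pass on a zero-padded batch, `BESToxLike()(torch.zeros(2, 56, 300))`, yields a `(2, 1)` tensor of predicted −log10(ld50) values.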
we used the gettotalnumhs, getformalcharge, getchiraltag, gettotaldegree, isinring, getisaromatic, gettotalvalence, and gethybridization functions from rdkit to calculate the atom properties. our model was trained and tested on a workstation equipped with two nvidia tesla p100 gpus. training of the final production model was performed using the optimal parameters obtained from the result of our extensive grid search. figure 4 shows the evolution of the mse over the number of training cycles. the training stopped at the 900-th epoch with an mse of 0.016. table 4 shows the performance of our model, trained with the optimal parameters of table 3, on the train and test sets. the training performance is excellent, giving an r² of 0.982, as all the data was used to construct the model. for the test set, the model predicts with an r² of 0.619, an rmse of 0.603, and an mae of 0.433. figure 5 shows the scatterplot of the bestox predictions on the test data. we can see that the prediction is better for compounds with lower toxicity (lower −log10(ld50)) and worse for those with higher toxicity. this may be due to the fewer data available in the train set for higher-toxicity compounds. thus, we also tested our model on the samples with target values less than 3.5 in the test set (1255 samples out of the total 1482, a sample coverage of more than 84%). in this case, the performance of our model improves: the rmse decreases from 0.603 to 0.516 and the mae is reduced from 0.433 to 0.385. table 5. performance comparison of our model to two existing acute oral toxicity prediction methods: toptox [23] and dt+snn [5]; performance data of these methods were obtained from the original literature. table 5 presents the comparative performance of bestox against the st-dnn model from toptox and the dt+snn model from karim et al. [5]. the results show that our model is slightly better than st-dnn with respect to r² and mae.
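the r², rmse, and mae figures quoted above follow the standard definitions; the minimal pure-python sketch below (not code from the bestox implementation) reproduces them from paired target/prediction lists, with r² computed as the squared pearson correlation, the usual convention in qsar reporting.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (r^2, RMSE, MAE) for paired targets and predictions.

    r^2 here is the squared Pearson correlation coefficient, as
    typically reported in QSAR studies.
    """
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    r2 = (cov / math.sqrt(var_t * var_p)) ** 2
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae
```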
the best performing model is dt+snn, which has a correlation of 0.629; however, its rmse and mae were not provided in the original study. the closeness of the performance metrics of bestox to those of the two existing models suggests that our model performs on par with them. nevertheless, it should be mentioned that while our model employs simple features and a relatively simple model architecture, st-dnn and dt+snn relied on highly engineered input features and complex ensemble-based model architectures. for st-dnn [23], they combined 700 element-specific topological descriptors (estd) and 330 auxiliary descriptors as candidates to generate the feature vectors for prediction (our model uses only 56 features). in addition, their model included an ensemble of two different types of classifiers, namely a deep neural network (dnn) and a gradient boosted decision tree (gbdt). combining predictions from several classifiers is an easy way to improve prediction accuracy; however, the complexity introduced into the model makes the already "black box" model even more difficult to understand. for the recent dt+snn model [5], they used decision trees (dt) to select 817 different descriptors generated with the padel tools [25]. although their shallow neural network (snn) architecture required a short model training time, more time was spent on feature generation and selection, and different combinations of features were used depending on the tasks to be predicted, which had a high computational cost. here, bestox has achieved results comparable to these more complex models with simple binary features and a simple model architecture, showing the power of our method. in this paper, we have presented our new method bestox for acute oral toxicity prediction. inspired by nlp techniques for text classification, we have designed a simple character-level encoding method for smiles called binary-encoded smiles (bes). we have developed a shallow cnn that learns from the bes matrices to predict the ld50 values of compounds.
we trained our model on the rat acute oral toxicity data, tested it, and compared it to two other existing models. despite the simplicity of our method, bestox has achieved a good performance with an r² of 0.619, comparable to the single-task model proposed by toptox [23] but slightly inferior to the hybrid decision tree and shallow neural network model by karim et al. [5]. future improvement of bestox will focus on extending the scope of the datasets. as shown in the work of wu et al. [23], multitask learning can improve the performance of prediction models due to the availability of more data on different toxicity effects. the idea of the multitask technique is to train a model with multiple training sets, each corresponding to one toxicity prediction task. feeding the learners with different toxicity data helps them to learn common latent features of molecules offered by the different datasets. references: [1] recent efforts to elucidate the scientific validity of animal-based drug tests by the pharmaceutical industry, pro-testing lobby groups, and animal welfare organisations; [2] screening: methods for experimentation in industry, drug discovery, and genetics; [3] convolutional neural network based on smiles representation of compounds for detecting chemical motif; [4] a review on machine learning methods for in silico toxicity prediction; [5] efficient toxicity prediction via simple features using shallow neural networks and decision trees; [6] convolutional neural networks for sentence classification; [7] virtual screening for bioactive molecules; [8] rdkit: open-source cheminformatics; [9] exploration of the chemical space and its three historical regimes; [10] deeptox: toxicity prediction using deep learning; [11] virtual screening strategies in drug discovery; [12] chiral drugs: an overview; [13] open babel: an open chemical toolbox; [14] integrating virtual screening in lead discovery; [15] new promising approaches to treatment of chemotherapy-induced toxicities; [16] in silico toxicology: computational methods for the prediction of chemical toxicity; [17] understanding the basics of qsar for applications in pharmaceutical sciences and risk assessment; [18] improving the human hazard characterization of chemicals: a tox21 update; [19] dose finding in drug development; [20] smiles, a chemical language and information system. 1. introduction to methodology and encoding rules; [21] smiles. 2. algorithm for generation of unique smiles notation; [22] encyclopedia of toxicology; [23] quantitative toxicity prediction using topology based multitask deep neural networks; [24] machine learning based toxicity prediction: from chemical structural description to transcriptome analysis; [25] padel-descriptor: an open source software to calculate molecular descriptors and fingerprints; [26] character-level convolutional networks for text classification. acknowledgments. this work was supported by the university of macau (grant no. myrg2017-00146-fst). key: cord-033010-o5kiadfm authors: durojaye, olanrewaju ayodeji; mushiana, talifhani; uzoeto, henrietta onyinye; cosmas, samuel; udowo, victor malachy; osotuyi, abayomi gaius; ibiang, glory omini; gonlepa, miapeh kous title: potential therapeutic target identification in the novel 2019 coronavirus: insight from homology modeling and blind docking study date: 2020-10-02 journal: egypt j med hum genet doi: 10.1186/s43042-020-00081-5 sha: doc_id: 33010 cord_uid: o5kiadfm background: the 2019-ncov, regarded as a novel coronavirus, is a positive-sense single-stranded rna virus. it is infectious to humans and is the cause of the ongoing coronavirus outbreak, which has elicited a public health emergency and a call for immediate international concern. the coronavirus main proteinase, also known as the 3c-like protease (3clpro), is a very important protein in all coronaviruses for the role it plays in the replication of the virus and the proteolytic processing of the viral polyproteins.
the resultant cytotoxic effect, which is a product of consistent viral replication and the proteolytic processing of polyproteins, can be greatly reduced through the inhibition of the viral main proteinase activities. this makes the 3c-like protease of the coronavirus a potential and promising target for therapeutic agents against the viral infection. results: this study describes the detailed computational process by which the 2019-ncov main proteinase coding sequence was mapped out from the full viral genome and translated, and the resultant amino acid sequence was used to model the protein 3d structure. comparative physiochemical studies were carried out on the resultant target protein and its template, while selected hiv protease inhibitors were docked against the protein binding sites, which contained no co-crystallized ligand. conclusion: in line with the results from this study, which show great consistency with other scientific findings on coronaviruses, we recommend the administration of the selected hiv protease inhibitors as first-line therapeutic agents for the treatment of the current coronavirus epidemic. the first outbreak of pneumonia cases of unknown origin was identified in the early days of december 2019, in the city of wuhan, hubei province, china [1]. a novel beta coronavirus, currently regarded as the 2019 novel coronavirus [2], was revealed after high-throughput sequencing of the viral genome, which exhibits a close resemblance to that of the severe acute respiratory syndrome coronavirus (sars-cov) [3]. the 2019-ncov is the seventh member of the enveloped rna coronavirus family (subgenus sarbecovirus, orthocoronavirinae) [3], and there is accumulating evidence from family settings and hospitals confirming that the virus is most likely transmitted from person to person [4].
the 2019-ncov has also recently been declared by the world health organization a public health emergency of international concern [5]; as of the 5th of february 2020, over 24,000 cases had been confirmed and documented by laboratories around the world [6], while more than 28,000 such cases had been documented in china through laboratory confirmation as of the 6th of february 2020 [7]. despite the fast rate of global spread of the virus, the clinical characteristics peculiar to the 2019-ncov acute respiratory disease (ard) remain unclear to a very large extent [8]. over 8000 infections and 900 deaths had been recorded worldwide by the summer of 2003 before a successful containment of the severe acute respiratory syndrome wave was achieved, as that disease was also a major public health concern worldwide [9, 10]. the infection that led to this huge number of death cases was linked to a new coronavirus, known as the sars coronavirus (sars-cov). coronaviruses are positive-stranded rna viruses, and they possess the largest known viral rna genomes. the first major step towards containing the sars-cov-linked infection was to successfully sequence the viral genome, the organization of which was found to exhibit similarity with the genomes of other coronaviruses [11]. the main proteinase crystal structure from both the transmissible gastroenteritis virus and the human coronavirus (hcov 229e) has been determined, with the discovery that the enzyme crystallizes as a dimer in which the individual protomers are oriented perpendicular to each other. each of the protomers is made up of three catalytic domains [12]. the first and second domains of the protomers have a two-β-barrel fold that can be likened to one of the folds in the chymotrypsin-like serine proteinases. domain iii has five α-helices and is linked to the second domain by a long loop.
individual protomers have their own specific region for the binding of substrates, and this region is positioned in the left cleft between the first and second domains. dimerization of the protein is thought to be a function of the third domain [13]. the main proteinase of the sars cov is known to be a cysteine proteinase which has, in its active site, a cysteine-histidine catalytic dyad. the conservation of the sars cov main proteinase across the genome sequences of all sars coronaviruses is very high, as is the homology of the protein to the main proteinases of other coronaviruses. on the basis of the high similarity between the different coronavirus main proteinase crystal structures and the conservation of almost all the amino acid residue side chains involved in the formation of the dimeric state, it was proposed that the dimer might be the only biologically functional form of the coronavirus main proteinase [14]. more recently, chen et al., in a study that involved the application of molecular dynamics simulations and enzyme activity measurements from a hybrid enzyme, showed that the only active form of the proteinase is its dimeric state [15]. recent studies, based on the sequence homology of the coronavirus main proteinase structural model with tgev as well as the solved crystal structure, have involved the docking of substrate analogs for the virtual screening of natural products and a collection of synthetic compounds, alongside approved antiviral therapeutic agents, in the evaluation of coronavirus main proteinase inhibition [16]. some compounds from this study were identified for the inhibitory role they played against the viral proteinase. these compounds include l-700,417, which is an hiv-1 protease inhibitor; calanolide a and nevirapine, both of which are reverse transcriptase inhibitors; an inhibitor of α-glucosidase named glycovir; sabadinine, which is a natural product; and ribavirin, a general antiviral agent [17].
ribavirin was shown to exhibit antiviral activity in vitro, at cytotoxic concentrations, against the sars coronavirus. at the start of the first outbreak of the sars epidemic, ribavirin was administered as a first line of defense, both as a monotherapy and in combination with corticosteroids or the hiv protease inhibitor kaletra [18]. according to reports from a very recent study conducted by cao et al., a total of 199 laboratory-confirmed sars-cov-infected patients underwent a controlled, randomized, open-labeled trial in which 100 patients were assigned to the standard care group and 99 patients to the lopinavir-ritonavir group. 48.4% of the patients in the lopinavir-ritonavir group (46 patients) and 49.5% of the patients in the standard care group (49 patients) exhibited serious adverse events between randomization and the 28th day. the exhibited adverse events include acute respiratory distress syndrome (ards), acute kidney injury, severe anemia, acute gastritis, hemorrhage of the lower digestive tract, pneumothorax, unconsciousness, sepsis, acute heart failure, etc. patients in the lopinavir-ritonavir group, in addition, specifically exhibited gastrointestinal adverse events including diarrhea, vomiting, and nausea [19]. our current study took advantage of the availability of the sars cov main proteinase amino acid sequence to map out the coding region for the corresponding protein in the 2019-ncov. two selected hiv protease inhibitors (lopinavir and ritonavir) were then targeted at the catalytic site of the protein 3d structure, which was modeled using already available templates. the predicted activity of the drug candidates was validated by targeting them against a recently crystallized 3d structure of the enzyme, which has been made available for download in the protein data bank.
lopinavir is an antiretroviral protease inhibitor used in combination with ritonavir in the therapy and prevention of human immunodeficiency virus (hiv) infection and the acquired immunodeficiency syndrome (aids). it plays a role as an antiviral drug and an hiv protease inhibitor. it is a member of the amphetamines and a dicarboxylic acid diamide (fig. 1). the complete genome of the isolated wuhan seafood market pneumonia virus (2019-ncov) was downloaded from the genbank database, with an assigned accession number of mn908947.3. the nucleotide sequence of the full genome was copied out in fasta format. the genbank sequence database is an annotated, open-access collection of all publicly available nucleotide sequences and their translated protein segments. this database is designed and managed by the national center for biotechnology information (ncbi) in accordance with the international nucleotide sequence database collaboration (insdc) [20]. nucleotides 10055 to 10972 of the 2019-ncov genome were selected as the sequence of interest. the translation of the nucleotide sequence of interest in the 2019-ncov and the back-translation of the sars cov main proteinase amino acid sequence were achieved with the use of the emboss transeq and backtranseq tools, respectively [21]. transeq reads one or more nucleotide sequences and writes the resulting translated protein sequence to file, while backtranseq makes use of a codon usage table which gives the usage frequency of each codon for every amino acid [22]. for every amino acid in the input sequence, the most frequently occurring corresponding codon is used in the nucleotide sequence that forms the output. the amino acid sequence generated by the transeq translation of the nucleotide sequence of interest had no stop codons, and as such was used directly for protein homology modeling without the need for any deletion.
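the transeq step above amounts to codon-by-codon translation with the standard genetic code. a minimal pure-python sketch of a single forward reading frame follows; note that transeq itself also supports all six frames and alternative genetic codes, which this sketch omits.

```python
# Standard genetic code, written compactly: the i-th codon in
# lexicographic TCAG order maps to the i-th letter of AAS.
BASES = "TCAG"
AAS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
       "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {a + b + c: AAS[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def translate(nuc, to_stop=True):
    """Translate a nucleotide sequence in frame 1 (forward strand)."""
    nuc = nuc.upper().replace("U", "T")
    protein = []
    for i in range(0, len(nuc) - 2, 3):
        aa = CODON_TABLE.get(nuc[i:i + 3], "X")   # X = ambiguous codon
        if aa == "*" and to_stop:
            break
        protein.append(aa)
    return "".join(protein)
```

applied to the 918-nucleotide segment 10055-10972, a translation of this kind yields the 306-residue sequence used for the homology modeling, with no internal stop codons.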
two sets of sequence alignments were carried out in this study. the first was the alignment between the translated nucleotide sequence copy of the 2019-ncov genome, which was used for the reference protein homology modeling, and the amino acid sequence of the sars cov main proteinase, while the second was between the back-translated sars cov main proteinase nucleotide sequence and the 2019-ncov full genome. the latter was used in mapping out the protein coding sequence in the 2019-ncov full genome. these alignments were carried out using the clustal omega software package. clustal omega can read nucleotide and amino acid sequence inputs in formats such as a2m/fasta, clustal, msf, phylip, selex, stockholm, and vienna [23]. a template search with blast and hhblits was performed against the swiss-model template library. the target sequence was searched with blast against the primary amino acid sequences contained in the smtl, and a total of 120 templates were found. an initial hhblits profile was built using the procedure outlined in remmert et al. [24], followed by one iteration of hhblits against nr20. the obtained profile was then searched against all profiles of the smtl, and a total of 192 templates were found. models were built based on the target-template alignment using promod3. coordinates which are conserved between the target and the template were copied from the template to the model, insertions and deletions were remodeled using a fragment library, and the side chains were then rebuilt. finally, the geometry of the resulting model was regularized using a force field [25]. for the estimation of the protein structure model quality, we used the qmean (qualitative model energy analysis), a composite scoring function describing the main aspects of protein structural geometry, which can also derive, on the basis of a single model, both global (i.e., for the entire structure) and local (i.e., per residue) absolute quality estimates [26].
an appreciable number of alternative models were produced, forming the basis on which the scores produced by the final model were selected. the qmean score was thus used in the selection of the most reliable model, against which the consensus structural scores were calculated. molprobity (version 4.4) was used as the structure-validation tool that produced the broad-spectrum evaluation of the quality of the target protein at both the global and local levels. it relies greatly on the sensitivity and power provided by optimized hydrogen placement and all-atom contact analysis, with complementary versions of updated covalent-geometry and torsion-angle criteria [27]. the torsion angles between the individual residues of the target protein were calculated using the ramachandran plot. this is a plot of the torsional angles [phi (φ) and psi (ψ)] of the amino acid residues making up a peptide. in sequence order, φ is the torsion angle defined by c(i−1), n(i), cα(i), c(i), while ψ is the torsion angle defined by n(i), cα(i), c(i), n(i+1). the values of φ were plotted on the x-axis while the values of ψ were plotted on the y-axis [28]. plotting the torsional angles in this way graphically shows the possible combinations of angles that are allowed. the quaternary structure annotation of the template was employed to model the target sequence in its oligomeric state. the methodology proposed by bertoni et al. [29] is based on a supervised machine learning algorithm, support vector machines (svm), which combines interface conservation, structural clustering, and other template features to produce a quaternary structure quality estimate (qsqe). the qsqe score is a number between 0 and 1, reflecting the expected accuracy of the inter-chain contacts for a model built from a given template and its alignment; a higher score indicates a more reliable result.
this complements the gmqe score, which estimates the accuracy of the 3d structure of the resulting model. the 3d structural homology modeling of the translated segment of the 2019-ncov genome was followed by a structural comparison with the sars cov main proteinase 3d structure (pdb: 1uj1). this was achieved using ucsf chimera, a highly extensible tool for the interactive analysis and visualization of molecular structures and related data, including docking results, supramolecular assemblies, density maps, sequence alignments, trajectories, and conformational ensembles [30]. high-quality animation videos were also generated. the amino acid constituents of the target protein secondary structures were colored and visualized in 3d using the pymol molecular visualizer, which uses the opengl extension wrangler library (glew) and freeglut; the "py" part of the name refers to python, the programming language in which the software is written [31]. the percentage composition of each component making up the secondary structure was calculated using the chou and fasman secondary structure prediction (cfssp) server. this is a secondary structure predictor that identifies, from an amino acid input sequence, the regions forming each secondary structure element, such as the alpha helices, beta sheets, and turns. the secondary structure prediction output is displayed in a linear sequential graphical view according to the occurrence probability of each secondary structure component. the methodology implemented in cfssp is the chou-fasman algorithm, which is based on analyses of the relative frequencies of each amino acid residue in alpha helices, beta sheets, and loops, derived from known protein structures solved by x-ray crystallography [32]. the expasy server calculates protein physiochemical parameters as one of its sub-functions, basically for the identification of proteins [33].
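one of the physiochemical parameters such servers compute is the average molecular weight of a protein, obtained as the sum of the average residue masses plus one water molecule for the free termini. the sketch below uses standard rounded average residue masses and is an approximation for illustration, not the server's own implementation.

```python
# Average residue masses in daltons (rounded); a peptide's average
# molecular weight is the sum of residue masses plus one water.
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02

def molecular_weight(sequence):
    """Approximate average molecular weight (Da) of a protein sequence."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER
```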
we engaged the protparam tool to calculate various physiochemical parameters of the model and template proteins for comparison purposes. the calculated parameters include the molecular weight, theoretical isoelectric point, amino acid composition, extinction coefficient, instability index, etc. the inference on evolutionary relationships was made using the maximum likelihood method based on the jtt matrix-based model [34] . the bootstrap consensus tree was inferred from a thousand replicates and taken to represent the evolutionary history of the analyzed taxa. tree branches forming partitions that were reproduced in less than 50% of the bootstrap replicates were automatically collapsed. the percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (a thousand replicates) is displayed next to each branch. initial trees for the search were derived automatically by applying the neighbor-join and bionj algorithms to a matrix of pairwise distances calculated using a jtt model, followed by the selection of the topology with the superior log likelihood value. the phylogenetic analysis was carried out on 12 amino acid sequences with close identity, and the complete dataset contained a total of 306 positions. the whole analysis was conducted using the molecular evolutionary genetics analysis (mega) software (version 7) [35] . 
ligand preparation and molecular docking protocol. 2d structures of the experimental ligands were retrieved from the pubchem repository and sketched using the chemaxon software [36] . the sketched structures were downloaded and saved as mrv files, which were converted into smiles strings with openbabel. the compounds prepared as ligands were docked against each of the prepared protein receptors using autodock vina [37] . blind docking analysis was performed at extra precision mode with minimized ligand structures. 
after a successful docking, a file consisting of all the poses generated by autodock vina, along with their binding affinities and rmsd scores, was produced. in the vina output log file, the first pose was considered the best because it has the strongest binding affinity among the poses and an rmsd of zero. the polar interactions and binding orientations at the active site of the proteins were viewed in pymol, and the docking scores for each ligand screened against each receptor protein were recorded. the same docking protocol was performed against the sars-cov main proteinase 3d structure downloaded from the protein data bank (pdb id 6m2n). the obtained outputs were visualized, compared, and documented for validation purposes. the full genome of the 2019-ncov (https://www.ncbi.nlm.nih.gov/nuccore/mn908947.3?report=fasta) consists of 29903 nucleotides, but for the purpose of this study, nucleotides between 10055 and 10972 were considered to locate the protein of interest. the direct translation of this segment of nucleotides produced a sequence of 306 amino acids (fig. 2). this amino acid count was reached after direct translation of the nucleotide sequence of interest; as there were no stop codons within it, no deletion of any form was needed. as depicted in fig. 3, few structural differences were noticed. the amino acid sequences making up these non-conserved regions are clearly revealed in fig. 4. notwithstanding, a 96% identity was observed between both sequences, showing that the conserved domains were predominant. figure 4 represents the percentage amino acid sequence identity between the target and the template protein, where positions with a single asterisk (*) depict regions of full residue conservation, segments with a colon (:) indicate conservation between amino acid residues with similar properties, and positions with a period (.) show regions of conservation between amino acids with less similar properties. the amino acid sequence of the sars coronavirus main proteinase was back-translated to generate the corresponding nucleotide sequence, which was then aligned with the 2019-ncov full genome. this was carried out for the purpose of mapping out the region of the 2019-ncov full genome where the proteinase coding sequence is located. as depicted in fig. 5, the target protein coding sequence is located between nucleotides 10055 and 10972 of the viral genome. the outcome of a qmean analysis is anchored on the composite scoring function, which calculates several features regarding the structure of the target protein. the estimated absolute quality of the model is expressed in terms of how well the model score agrees with the values expected from a set of structures obtained in high-resolution experiments. the global score values can come from either qmean4 or qmean6. qmean4 is a linear combination of four statistical potential terms, while qmean6 additionally uses two agreement terms in the consistency evaluation of structural features with sequence-based predictions. both qmean4 and qmean6 scores originally lie in the range 0 to 1, with 1 being the best score, and are by default transformed into z-scores (table 1) so that they can be related to what would be expected from high-resolution x-ray structures. the local scores are also a combination of four linear statistical potential terms with the agreement terms, evaluated on a per-residue basis, and are likewise estimated in the range 0 to 1, where 1 is the best score (fig. 6). when compared to the set of non-redundant protein structures, the qmean z-score of the target protein, as shown in fig. 7, was 0. 
the models located in the dark zone of the graph have scores of less than 1, while the scores of models outside the dark zone lie either in the range 1 < z-score < 2 or at z-score > 2; good models are often located in the dark zone. whenever torsion-angle values fall outside the allowed combinations, they result in strain in the polypeptide chain; in such cases the stability of the structure depends greatly on additional interactions, but the conformation may be conserved in a protein family for its structural significance. another exemption to the α- and β-region clustering principle can be viewed in the right-hand plot of fig. 8, where only the distribution of torsion angles for glycine residues is displayed on the ramachandran plot. glycine has no side chain, which allows flexibility in the polypeptide chain and makes the otherwise forbidden rotation angles accessible. glycine is for this reason more concentrated in the regions making up loops, where sharp bends can occur in the polypeptide, and is highly conserved in protein families, as the presence of turns at specific positions is a characteristic feature of particular structural folds. the comparative physiochemical parameters of the template and target proteins computed by protparam were deduced from the amino acid sequences of the individual proteins. no additional information was required about the proteins under consideration, and each complete sequence was analyzed. the amino acid sequence of the target protein has not been deposited in the swiss-prot database. 
for this reason, the standard single-letter amino acid sequence of each protein was entered into the text field to compute the physiochemical properties shown in tables 3 and 4. the two hiv protease inhibitors (lopinavir and ritonavir), when targeted at the modeled 2019-ncov catalytic site, gave significant inhibition attributes; hence, the in silico study was planned through molecular docking analysis with autodock vina. the binding orientation of the drugs to the protein active site, as viewed in the pymol molecular visualizer (fig. 11), showed an induced-fit model binding conformation. the same compounds were targeted against the active site of the downloaded pdb 3d structure of the sars-cov main proteinase (pdb 6m2n) for comparison purposes (fig. 12). the active site residues as visualized in pymol are shown in fig. 13. the binding of lopinavir to the target protein, which produced the best binding score, was used as the predictive model; residues at a distance of < 5 angstroms from the bound ligand were assumed to form the active site. 

fig. 8: two ramachandran plots. the plot on the left-hand side shows the general torsion angles for all the residues in the target protein, while the plot on the right-hand side is specific for the glycine residues of the protein. 

fig. 9: the target protein secondary structures with bound lopinavir. at the top is the secondary structure visualization in pymol, with the regions making up the alpha helices, beta sheets, and loops shown in light blue, purple, and brown, respectively. below is the prediction by cfssp, where the red, green, yellow, and blue lines depict regions of helices, sheets, turns, and coils (loops), respectively. the predicted secondary structure composition shows a high degree of alpha helix and beta sheet, occupying 45 and 47% of the total residues respectively, with the percentage loop occupancy at 8%. 

fig. 13: a combined view of the 3d structural comparison between the modeled target protein and the downloaded pdb structure of the viral protein (left column), and their primary sequence alignment (right column). the target protein is colored in grey while its protein data bank equivalent is colored in red. the high structural similarity between the two proteins was validated through their sequence alignment, which produced a 99.34% sequence identity score. 

homology modeling, a computational method for modeling the 3d structure of proteins also regarded as comparative modeling, constructs atomic models based on structures that have been determined experimentally and that share more than 40% sequence homology with the target. the underlying principle is that two proteins with high similarity in their amino acid sequences are likely to share a similar three-dimensional structure; with one of the proteins having an already determined 3d structure, the structure of the unknown one can be copied with a high degree of confidence. alpha-carbon positions are modeled with higher accuracy in homology modeling than side chains and loop-containing regions, which are often inaccurate. as regards template selection, homologous proteins with determined structures are searched through the protein data bank (pdb); templates must have a minimum of 40% identity with the target sequence and the highest possible resolution, with appropriate cofactors, for selection consideration [29] . in this study, the target protein was modeled using the sars coronavirus main proteinase as template. this selection was based on its high resolution and its identity with the target protein, which is as high as 96%. qualitative model energy analysis (qmean) is a composite scoring function that describes protein structures on the basis of major geometrical aspects. 
the qmean scoring function calculates the global quality of models as a linear combination of six structural descriptors, four of which are statistical potentials of mean force. the local geometry is analyzed by a torsion-angle potential over three consecutive amino acids. in predicting the structure of a protein, final models are often selected after the production of a considerable number of alternative models; hence, protein structure prediction is anchored on a scoring function that identifies the best structural model within a collection of alternatives. two distance-dependent interaction potentials are used to assess long-range interactions, based on cβ atoms and on all atoms, respectively. the burial status of amino acid residues describes the solvation potential of the model, and two terms reflecting the agreement between the calculated and predicted secondary structure and solvent accessibility are also included [38] . the resultant target protein can be considered a good model, as the z-scores of the cβ interaction energy, all-atom pairwise energy, solvation energy, and torsion angle energy are −0.35, −0.65, −0.77, and 0.36, respectively, as shown in table 1. through the z-score, the quality of a model can be compared to high-resolution reference structures produced by x-ray crystallography, where 0 is the average z-score value for a good model [26] . according to benkert et al., the qmean z-score provides an estimate of the degree of nativeness of the structural features observed in a model, indicating whether the model is of a quality comparable to experimental structures [26] . our study shows that the z-score of the target is 0, as indicated in fig. 6; such a score indicates a relatively good model, as it equals the average z-score expected for a good model. 
the predicted properties of the model determine the molprobity scores. early work on all-atom contact analysis showed that proteins possess exquisitely well-packed structures, with favorable van der waals interactions and little overlap between atoms that do not form hydrogen bonds [39] . unfavorable steric clashes correlate strongly with poor data quality, and a near-zero occurrence of such steric clashes is observed in the ordered regions of high-resolution crystal structures. therefore, low clash-score values indicate a very good model, as borne out by the clash-score value exhibited by the target protein modeled for the purpose of this study (table 2). in addition to the low clash score, the details of the protein conformation are remarkably relaxed, with staggered χ angles and even staggered methyl groups [40] . forces applied to a given local motif in an environment predominantly made up of folded protein interior can produce a locally strained conformation, but significant strain is kept near the functionally needed minimum by evolution, on the presumption that protein stability is too marginal for high tolerance. in updates to traditional validation measures, crystal structures have been rigorously quality-filtered by homology, by resolution and overall validation score at the file level, by b-factor, and sometimes at the residue level by all-atom steric clashes. the resulting multi-dimensional distributions, after adequate smoothing, are used to score how "protein-like" each local conformation is relative to known structures, either for backbone ramachandran values or for side-chain rotamers [41] . at high resolution, rotamer outliers are equivalent to < 1%, general-case ramachandran outliers to < 0.05%, and ramachandran favored to 98%. 
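the molprobity score defined in the next paragraph combines the clash score with rotamer and ramachandran statistics; a small python sketch (ours, not molprobity code) reproduces the score of 1.82 reported below from the model's reported quality values:

```python
import math

def molprobity_score(clashscore, rotamer_outliers_pct, rama_not_favored_pct):
    """MolProbity score (MPscore): lower is better; roughly the
    crystallographic resolution at which the given quality values
    would be expected."""
    return (0.426 * math.log(1 + clashscore)
            + 0.33 * math.log(1 + max(0, rotamer_outliers_pct - 1))
            + 0.25 * math.log(1 + max(0, rama_not_favored_pct - 2))
            + 0.5)

# values reported for the modeled target protein: clash score 2.06,
# 5.21% rotamer outliers, and 100 - 95.66 = 4.34% of residues outside
# the ramachandran favored region
print(round(molprobity_score(2.06, 5.21, 100 - 95.66), 2))  # 1.82
```

note that the last argument is the percentage of residues beyond the favored region (not the 0.83% of outright ramachandran outliers), which is what reproduces the reported value of 1.82.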
in this regard, the molprobity score (mpscore) is defined as mpscore = 0.426 × ln(1 + clashscore) + 0.33 × ln(1 + max(0, rota_out − 1)) + 0.25 × ln(1 + max(0, rama_iffy − 2)) + 0.5, where the clashscore is the number of unfavorable all-atom steric overlaps ≥ 0.4 å per 1000 atoms [38] , rota_out is the percentage of rotamer-outlier side-chain conformations among the side chains that could be evaluated, and rama_iffy is the percentage of backbone ramachandran conformations beyond the favored region, among the residues that could be evaluated. the coefficients are derived from a log-linear fit to crystallographic resolution on a filtered set of pdb structures, so that the mpscore of a model is the resolution at which each of its scores would be the expected value; lower mpscores thus indicate a better model. with a clash score of 2.06 and 95.66% of residues in the ramachandran favored region, together with ramachandran outlier and rotamer outlier values of 0.83% and 5.21% respectively, we arrived at a molprobity score of 1.82, which is low enough to indicate a good-quality model for our experimental protein. the characteristic repetitive conformations of amino acid residues are the basis for the repetitive nature of the secondary structures, hence the repetitive φ and ψ values. the ranges of φ and ψ can be used to distinguish the different secondary structural elements, as the φ and ψ values of each element map to its respective region on the ramachandran plot. α-helices cluster about average values of φ = −57° and ψ = −47°, while average values of φ = −130° and ψ = +140° describe the ramachandran plot clustering for twisted beta sheets [42] . the core region (green in fig. 8) on the plot has the most favorable combinations of φ and ψ values and the highest number of dots. the figure also shows a small third core region in the upper right quadrant. this is known as the allowed region; it may or may not be contiguous with the core regions and has fewer data points than the core regions. the other areas of the plot are regarded as disallowed. since glycine has only a single hydrogen atom as its side chain, steric hindrance is less likely to occur as φ and ψ are rotated through a series of values. glycine residues with φ and ψ values of +55° and −116°, respectively [43] , do not exhibit steric hindrance and are for that reason found in the otherwise disallowed region of the ramachandran plot, as shown in the right-hand plot of fig. 8. the extinction coefficient indicates how strongly a protein absorbs light at a specific wavelength. estimating this coefficient is important for monitoring a protein undergoing purification in a spectrophotometer. woody [44] has shown the possibility of estimating a protein's molar extinction coefficient from knowledge of its amino acid composition, which is presented in table 3. the extinction coefficient of both the template and the target protein was calculated using the equation e(prot) = numb(tyr) × ext(tyr) + numb(trp) × ext(trp) + numb(cystine) × ext(cystine), and the absorbance (optical density) was calculated from the extinction coefficient and the molecular weight. for this estimate to be valid, the following conditions must be met: ph 6.5, 6.0 m guanidium hydrochloride, 0.02 m phosphate buffer. the n-terminal residue identity of a protein is an important factor in the determination of its stability in vivo and also plays a major role in the proteolytic degradation process mediated by ubiquitin [45] . 
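the extinction-coefficient estimate described above can be sketched as follows; the per-residue values are those used by the expasy protparam tool, the residue counts below are illustrative rather than taken from the paper's tables, and the assumption that all cysteines pair into cystines is ours:

```python
# extinction coefficients at 280 nm in water, per chromophore (M^-1 cm^-1),
# as used by ExPASy ProtParam
EXT = {"Tyr": 1490, "Trp": 5500, "cystine": 125}

def extinction_coefficient(n_tyr, n_trp, n_cys):
    """E(prot) assuming every pair of cysteines forms a cystine bridge."""
    return (n_tyr * EXT["Tyr"]
            + n_trp * EXT["Trp"]
            + (n_cys // 2) * EXT["cystine"])

def absorbance_01_percent(e_prot, mol_weight):
    """Abs 0.1% (1 g/l): extinction coefficient divided by molecular weight."""
    return e_prot / mol_weight

# illustrative composition: 11 Tyr, 3 Trp, 12 Cys
print(extinction_coefficient(n_tyr=11, n_trp=3, n_cys=12))  # 33640
```

in a reducing environment with no cystines, the cystine term is simply dropped, which is why protparam reports two values.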
β-galactosidase proteins with different n-terminal amino acids were designed through site-directed mutagenesis, and the designed proteins had strikingly different half-lives in vivo, ranging from over a hundred hours to less than 2 min depending on the nature of the amino-terminal residue and on the experimental model (yeast in vivo; mammalian reticulocytes in vitro; e. coli in vivo). individual amino acid residues can thus be ordered by the half-life they confer when located at a protein's amino terminus [46] . this is referred to as the "n-end rule", on which the estimated half-lives of both the template and target proteins were based. the instability index provides an estimate of the protein's stability in a test tube. statistical analysis of 32 stable and 12 unstable proteins has shown [47] that certain dipeptides occur with significantly different frequencies in unstable proteins compared with stable ones. the authors of this method assigned an instability weight value to each of the 400 different dipeptides (diwv). using these weight values, a protein's instability index is defined as ii = (10/l) × Σ diwv(x[i]x[i+1]), summed over i = 1 to l − 1, where l is the sequence length and diwv(x[i]x[i+1]) is the instability weight value for the dipeptide starting at position i. a protein with an instability index below 40 can be predicted to be stable, while a value exceeding the 40 threshold indicates the protein may be unstable. the comparative instability index values for the template and target proteins were 29.67 and 27.65, respectively (table 4), showing that both are stable proteins. 

table 3: amino acid composition of the template and target proteins (residue counts in one-letter-code order). template: 17 12 19 17 12 14 9 26 8 12 30 11 10 16 13 16 26 3 11 24. target: 17 11 21 17 12 14 9 26 7 11 29 11 10 17 13 16 24 3 11 27. 
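the instability-index computation can be sketched as below; the real method uses the published table of 400 dipeptide instability weight values (diwv), which is replaced here by a placeholder default weight, so the resulting numbers are purely illustrative of the formula:

```python
# placeholder: a real implementation would populate this dict with the
# 400 published dipeptide -> weight (DIWV) entries
DIWV = {}
DEFAULT_WEIGHT = 1.0

def instability_index(seq):
    """II = (10 / L) * sum of DIWV over all overlapping dipeptides."""
    total = sum(DIWV.get(seq[i:i + 2], DEFAULT_WEIGHT)
                for i in range(len(seq) - 1))
    return (10.0 / len(seq)) * total

# with the placeholder weights, a 4-residue sequence has 3 dipeptides,
# each of weight 1, giving 10/4 * 3
print(instability_index("ACDG"))  # 7.5
```

with the real weight table, values below 40 predict a stable protein, matching the 29.67 and 27.65 reported above.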
the relative volume occupied by the aliphatic side chains (alanine, valine, leucine, and isoleucine) of a protein is known as its aliphatic index. it can be regarded as a positive factor for increased thermostability of globular proteins. the aliphatic index of the experimental proteins was calculated according to the formula [48] ai = x(ala) + a × x(val) + b × [x(ile) + x(leu)], where x(ala), x(val), x(ile), and x(leu) are the mole percents (100 × mole fraction) of alanine, valine, isoleucine, and leucine. the coefficients a and b are the relative volumes of the valine side chain (a = 2.9) and of the leu/ile side chains (b = 3.9) with respect to the alanine side chain. the calculated aliphatic index shows that the thermostability of the target protein is slightly higher than that of the template. the most common secondary structures are alpha helices and beta sheets, although beta turns and omega loops also occur. elements of secondary structure form spontaneously as intermediates before folding into the corresponding three-dimensional tertiary structures [49] . previous studies have shown how stable and robust to mutations α-helices are in natural proteins; they have also been shown to be more designable than beta sheets, so designing a functional all-α-helix protein is likely to be easier than designing proteins with both helices and strands, and this has recently been confirmed experimentally [50] . the template and target proteins both comprise 306 amino acid residues (table 4), with the composition of individual residues shown in table 3. as shown in fig. 9, the target protein, which shares structural homology with the template (fig. 3 and the animation video), is predominantly occupied by residues forming alpha helices and beta sheets, with a very low percentage of residues forming loops. 
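returning to the aliphatic index discussed above, a minimal implementation of the formula with the coefficients quoted in the text follows; the example composition is hypothetical and not taken from table 3:

```python
def aliphatic_index(counts, length):
    """Aliphatic index from residue counts keyed by one-letter code:
    AI = X(Ala) + a * X(Val) + b * (X(Ile) + X(Leu)),
    with X(...) the mole percent (100 * mole fraction)."""
    a, b = 2.9, 3.9  # side-chain volumes of Val and Leu/Ile relative to Ala
    molp = lambda aa: 100.0 * counts.get(aa, 0) / length
    return molp("A") + a * molp("V") + b * (molp("I") + molp("L"))

# a hypothetical all-alanine chain has an aliphatic index of exactly 100
print(aliphatic_index({"A": 50}, 50))  # 100.0
```

the index grows with the fraction of bulky aliphatic side chains, which is why a higher value is read as a positive factor for thermostability.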
the stability of these two proteins is revealed in their physiochemical characteristics, which can therefore be linked to the high percentage of residues forming alpha helices. the ultimate goal of genome analysis is to understand the biology of organisms in both evolutionary and functional terms, and this involves combining different data from various sources [51] . for the purpose of this study, we compared our protein of interest to similar proteins in the ncbi database to predict the evolutionary relationships between homologous proteins represented in the genomes of divergent species. this makes amino acid sequence alignment the most suitable form of alignment for phylogenetic tree construction. organisms with common ancestors were positioned in the same monophyletic group of the tree, and the node housing the protein of interest (the 2019-ncov main proteinase) also houses the non-structural polyprotein 1ab of the bat sars-like coronavirus, showing that the two viral proteins share a common source with a shorter divergence period. bootstrapping attaches a level of confidence to evolutionary predictions: a value of one hundred represents a very high level of confidence in the positioning of a node in the topology, whereas lower scores indicate groupings that are more likely to have arisen by chance than to reflect the real tree topology [52] . the bootstrap value of the above-mentioned viral proteins, exactly 100, therefore gives very strong statistical support for their positioning at the nodes in the branched part of the tree. branch length is a representation of genetic distance and thus a measure of the time since the viral proteins diverged: the greater the branch length, the longer the likely period of time since divergence from the most closely related protein [53] . 
the tw9 and tjf strains of the sars coronavirus orf1a polyprotein and replicase, respectively, are the most distantly related proteins based on their branch lengths and can therefore be regarded as the out-group of the tree. structure-based drug discovery is the most straightforward molecular docking methodology: it screens a variety of ligands (compounds) listed in a chemical library by "docking" them against a protein of known structure, in this study the modeled 3d structure of the 2019-ncov main proteinase, and reports the binding affinity alongside the binding conformation of each ligand in the enzyme active site [54] . ligand docking can be specific, focusing only on the predicted binding sites of the protein of interest, or blind, where the entire surface of the protein is covered. most docking applications focus on the predicted primary binding site of ligands; however, in some cases, information about the target protein's binding site is missing. blind docking is an unbiased molecular docking approach, as it scans the whole protein structure to locate the ideal ligand binding site [55] . the autodock-based blind docking approach was introduced in this study to search the entire surface of the target and template proteins for binding sites while simultaneously optimizing the conformations of the ligands. for this reason, we set up our docking parameters to search the entire surface of the modeled main proteinase of the 2019-ncov. this was achieved using autogrid to create a very large grid map (center 77 å × −10 å × 15 å and size 30 å × 60 å × 35 å) with the maximum number of points in each dimension in order to cover the whole protein. we observed a partial overlap in the docking poses of lopinavir at the active sites of the template and target proteins, as compared to the conspicuous difference observed in the binding orientation of ritonavir at the two active sites. 
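a blind-docking search box like the one described above can be written as an autodock vina configuration file; the file names below are placeholders, and the coordinates are the grid values quoted in the text:

```text
receptor = modeled_2019ncov_mpro.pdbqt
ligand = lopinavir.pdbqt

# grid centre and box size in angstroms, covering the whole protein
center_x = 77
center_y = -10
center_z = 15
size_x = 30
size_y = 60
size_z = 35
```

such a file is passed to vina on the command line, e.g. `vina --config conf.txt --out poses.pdbqt`, after which the output lists the poses ranked by binding affinity, the first pose having an rmsd of zero.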
these differential poses can be viewed distinctly in the attached animation video. a keen view of the binding orientation of the two drug candidates at the 2019-ncov main proteinase active site (fig. 11) is also consistent with the proposed induced-fit binding model. in a comparative docking study, the same drug candidates (lopinavir and ritonavir) were docked against the active site of the pdb-downloaded version of the viral main proteinase. the docking grid for this purpose was set with precision, as the solved pdb structure of the virus included a co-crystalized ligand at the enzyme active site (center −32 å × −65 å × 42 å and size 25 å × 30 å × 25 å), and the experimental ligands bind to this site with precision and variation in poses (fig. 12). the binding energy results (tables 5 and 6) showed a difference of −0.3 kcal/mol between the binding of lopinavir to the template and to the pdb 3d structure of the enzyme (pdb 6m2n), and a difference of −0.5 kcal/mol between the pdb 3d structure of the enzyme and the target protein. the same comparative study was repeated for ritonavir, where differences of −0.1 and −1.0 kcal/mol were observed upon binding of the drug to the template and target proteins, respectively, in comparison with binding to the downloaded 3d structure of the enzyme from the pdb. 

table 5: docking results of lopinavir and ritonavir against the template and target proteins. the binding of ritonavir to the template protein produced the highest number of inter-model hydrogen bonds, while the binding of lopinavir to the target protein formed polar interactions with three residues at the active site, compared with the two formed by the other interactions. 

table 6: the amino acid residues involved in polar interactions, the number of inter-model hydrogen bonds, and the docking scores of lopinavir and ritonavir upon binding to the 3d pdb structure of the sars-cov main proteinase (pdb 6m2n). 
the observed consistency in the binding energies of the drug candidates can also serve as a reference for the validity and quality of the modeled protein, which exhibited high sequence and structural similarity with the downloaded 3d structure from the protein data bank (fig. 13). in an effort to make available potent therapeutic agents against the fast-rising 2019 novel coronavirus epidemic, we identified the coding region in the viral genome, modeled the main proteinase of the virus, and evaluated the efficacy of existing hiv protease inhibitors by targeting the protein active site using a blind docking approach. our study has shown that lopinavir displays a broader-spectrum inhibition against both the sars coronavirus and the 2019-ncov main proteinase compared with the inhibition profile of ritonavir. the modeled 3d structure of the enzyme has also provided interesting insights regarding the binding orientations of the experimental drugs and possible interactions at the protein active site. however, the study of cao et al., as previously discussed, concluded that administration of the lopinavir-ritonavir therapy might elicit additional health concerns as a result of the serious adverse events exhibited by their experimental subjects; it was also observed that the drugs showed no increased benefit when compared with standard supportive care. in view of these findings, we therefore suggest a drug modification approach aimed at avoiding the health concerns posed by the combined lopinavir-ritonavir therapy while retaining its proteinase inhibitory activity. supplementary information accompanies this paper at https://doi.org/10.1186/s43042-020-00081-5. additional file 1. 
supplementary information to this article can be found online at https://www.rcsb.org/structure/6m2n

references:
- clinical features of patients with 2019 novel coronavirus in wuhan
- genomic characterization and epidemiology of 2019 novel coronavirus: implications of virus origins and receptor binding
- a novel coronavirus from patients with pneumonia in china
- a familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster
- importation and human-to-human transmission of a novel coronavirus in vietnam
- national health commission of the people's republic of china
- transmission of 2019-ncov infection from an asymptomatic contact in germany
- alert, verification and public health management of sars in the post-outbreak period
- coronavirus in severe acute respiratory syndrome (sars)
- a novel coronavirus and sars
- crystal structures of the main peptidase from the sars coronavirus inhibited by a substrate-like aza-peptide epoxide
- dissection study on the sars 3c-like protease reveals the critical role of the extra domain in dimerization of the enzyme: defining the extra domain as a new target for design of highly-specific protease inhibitors
- 3c-like proteinase from sars coronavirus catalyzes substrate hydrolysis by a general base mechanism
- only one protomer is active in the dimer of sars 3c-like proteinase
- biosynthesis, purification, and substrate specificity of severe acute respiratory syndrome coronavirus 3c-like proteinase
- a trial of lopinavir-ritonavir in adults hospitalized with severe covid-19
- emboss: the european molecular biology open software suite
- srs, an indexing and retrieval tool for flat file data libraries
- issues in bioinformatics benchmarking: the case study of multiple sequence alignment
- hhblits: lightning-fast iterative protein sequence searching by hmm-hmm alignment
- the swiss-prot protein knowledgebase and its supplement trembl in 2003
- toward the estimation of the absolute quality of individual protein structure models
- molprobity: more and better reference data for improved all-atom structure validation
- chapter 2: protein composition and structure
- modeling protein quaternary structure of homo- and hetero-oligomers beyond binary interactions by homology
- ucsf chimera - a visualization system for exploratory research and analysis
- fasman gd (1974) prediction of protein conformation
- protein identification and analysis tools on the expasy server
- the rapid generation of mutation data matrices from protein sequences
- mega7: molecular evolutionary genetics analysis version 7.0 for bigger datasets
- chemoinformatics: theory, practice, & products
- critical assessment of the automated autodock as a new docking tool for virtual screening
- critical assessment of methods of protein structure prediction (casp) round 6
- visualizing and quantifying molecular goodness-of-fit: small-probe contact dots with explicit hydrogen atoms
- a test of enhancing model accuracy in high-throughput crystallography
- the penultimate rotamer library
- protein geometry database: a flexible engine to explore backbone conformations and their relationships to covalent geometry
- circular dichroism spectrum of peptides in the poly(pro)ii conformation
- calculation of protein extinction coefficients from amino acid sequence data
- universality and structure of the n-end rule
- the n-end rule in bacteria
- correlation between stability of a protein and its dipeptide composition: a novel approach for predicting in vivo stability of a protein from its primary sequence
- thermostability and aliphatic index of globular proteins
- alpha helices are more robust to mutations than beta strands
- global analysis of protein folding using massively parallel design, synthesis, and testing
- time of the deepest root for polymorphism in human mitochondrial dna
- intraspecific nucleotide sequence differences in the major noncoding region of human mitochondrial dna
- limitation of the evolutionary parsimony method of phylogenetic analysis
- efficient docking of peptides to proteins without prior knowledge of the binding site
- molecular recognition and docking algorithms

we appreciate the leadership of the laboratory of cellular dynamics (lcd), university of science and technology of china, for the all-around support and academic advisory role. we also acknowledge the strong support from the ustc office of international cooperation all through the challenging period of the coronavirus epidemic. the authors received no funding for this project from any organization. ethics approval and consent to participate: not applicable. the authors declare that they have no competing interests.

key: cord-102359-k1xxz4hc authors: klotsa, daphne; roemer, rudolf a.; turner, matthew s. title: electronic transport in dna date: 2005-04-04 journal: nan doi: 10.1529/biophysj.105.064014 sha: doc_id: 102359 cord_uid: k1xxz4hc

abstract: we study the electronic properties of dna by way of a tight-binding model applied to four particular dna sequences. the charge transfer properties are presented in terms of localisation lengths, crudely speaking the length over which electrons travel. various types of disorder, including random potentials, are employed to account for different real environments. we have performed calculations on poly(dg)-poly(dc), telomeric dna, random-atgc dna and lambda-dna. we find that random and lambda-dna have localisation lengths allowing for electron motion among a few dozen base pairs only. a novel enhancement of localisation lengths is observed at particular energies for an increasing binary backbone disorder. we comment on the possible biological relevance of sequence-dependent charge transfer in dna.

the question of whether dna conducts electric charges is intriguing to physicists and biologists alike. the suggestion that electron transfer/transport in dna might be biologically important has triggered a series of experimental and theoretical investigations [5, 17, 20, 31, 35, 54].
processes that possibly use electron transfer include the function of dna damage response enzymes, transcription factors or polymerase co-factors, all of which play important roles in the cell [2]. indeed there is direct evidence [9] that muty - a dna base excision repair enzyme with a [4fe4s]+ cluster of undetermined function - takes part in some kind of electron transfer as part of the dna repair process [36, 46]. this seems consistent with studies in which an electric current is passed through dna, revealing that damaged regions have significantly different electronic behaviour than healthy ones [9]. for physicists, the continuing progress of nanotechnologies and the consequent need for further size miniaturisation makes the dna molecule an excellent candidate for molecular electronics [6, 13, 23, 45]. dna might serve as a wire, transistor, switch or rectifier depending on its electronic properties [16, 20, 44]. in its natural environment, dna is always in liquid solution, and therefore experimentally one can study the molecule either in solution or in artificially imposed dry environments. in solution experiments, dna can be chemically processed to host a donor and an acceptor molecule at different sites along its long axis. photo-induced charge transfer rates can then be measured whilst the donor/acceptor molecules, the distance and the sequence of dna that lies between them are varied. the reactions are observed to depend on the type of dna used, the intercalation, the integrity of the intervening base pair stack and, albeit weakly, on the molecular distance [5, 9, 17, 35, 52]. direct conductivity measurements on dry dna have also been performed in the past few years. the remarkable diversity that characterises the results seems to arise from the fact that many factors need to be experimentally controlled.
these include methods for dna alignment and drying, the nature of the devices used to measure the conductivity, the type of metallic contacts and the sequence and length of the dna. dna has been reported to be an insulator [10], an ohmic conductor [3, 21, 32, 34, 45] and a semiconductor [43]. theoretically, single-step super-exchange [31] and multi-step hopping [8] models have provided interpretations of solution experiments. for experiments on dry dna, several additional approaches such as variable range hopping [57], one-dimensional quantum mechanical tight-binding models [13, 47, 48, 55, 58, 59] and non-linear methods [12, 39] have also been proposed. despite the lack of a consistent picture for the electronic properties of dna, one conclusion has been established: the environment of the dna impacts upon its structural, chemical and thus probably also electronic properties. both theoretical and experimental studies show that the temperature and the type of solution surrounding dna have a significant effect on its structure and shape [4, 11, 57]. the effect of the environment is a key one for this report, where the environmental fluctuations are explicitly modelled as providing different types of disorder. in this work, we focus on whether dna, when treated as a quantum wire in the fully coherent low-temperature regime, is conducting or not. to this end, we study and generalise a tight-binding model of dna which has been shown to reproduce experimental [13] as well as ab-initio results [15]. a main feature of the model is the presence of sites which represent the sugar-phosphate backbone of dna but along which no electron transport is permissible. we measure the "strength" of the electronic transport by the localisation length ξ, which roughly speaking parametrises whether an electron is confined to a certain region ξ of the dna (insulating behaviour) or can proceed across the full length l ≤ ξ of the dna molecule (metallic behaviour).
sections ii-iii introduce our models and the numerical approach. in section v, we show that dna sequences with different arrangements of the nucleotide bases adenine (a), cytosine (c), guanine (g) and thymine (t) exhibit different ξ's when measured, e.g., as a function of the fermi energy e. the influence of external disorder, modelling variations in the solution, bending of the dna molecule, finite-temperature effects, etc., is studied in section vi, where we show that, surprisingly, the models support an increase of ξ when disorder is increased. we explain that this effect is linked to the existence of the backbone sites. a. the fishbone model dna is a macro-molecule consisting of repeated stacks of bases formed by either at (ta) or gc (cg) pairs coupled via hydrogen bonds and held in the double-helix structure by a sugar-phosphate backbone. in fig. 1, we show a schematic drawing. in most models of electronic transport [13, 60] it has been assumed that the transmission channels are along the long axis of the dna molecule [61] and that the conduction path is due to π-orbital overlap between consecutive bases [52]; density-functional calculations [37] have shown that the bases, especially guanine, are rich in π-orbitals. quantum mechanical approaches to the problem mostly use strictly one-dimensional (1d) tight-binding models [47, 48, 55, 58, 59]. of particular interest to us is a quasi-1d model [13] which includes the backbone structure of dna explicitly and exhibits a semiconducting gap. this fishbone model, shown in fig. 2, has one central conduction channel in which individual sites represent a base pair; these are interconnected and further linked to upper and lower sites representing the backbone, which are not interconnected along the backbone. every link between sites implies the presence of a hopping amplitude.
the hamiltonian for the fishbone model h_F is given by

H_F = \sum_{i=1}^{L} \Big( t_i |i\rangle\langle i+1| + \sum_{q=\uparrow,\downarrow} t_i^q |i\rangle\langle i,q| + \mathrm{h.c.} \Big) + \sum_{i=1}^{L} \Big( \varepsilon_i |i\rangle\langle i| + \sum_{q=\uparrow,\downarrow} \varepsilon_i^q |i,q\rangle\langle i,q| \Big),  (1)

where t_i is the hopping between nearest-neighbour sites i, i+1 along the central branch, and t_i^q with q = ↑, ↓ gives the hopping from each site on the central branch to the upper and lower backbone, respectively. additionally, we denote the onsite energy at each site along the central branch by ε_i, and the onsite energy at the sites of the upper and lower backbone is given by ε_i^q, with q = ↑, ↓. l is the number of sites/bases in the sequence. the model (1) clearly represents a dramatic simplification of dna. nevertheless, in ref. [13] it had been shown that this model, when applied to an artificial sequence of repeated gc base pairs, poly(dg)-poly(dc) dna, reproduces experimental current-voltage measurements when t_i = 0.37 ev and t_i^q = 0.74 ev are used. therefore, we will assume t_i^q = 2t_i and set the energy scale by t_i ≡ 1 for hopping between gc pairs. in what follows we will adopt energy units in which ev = 1 throughout. for natural dna sequences, we need to know how the hopping amplitudes vary as the electron moves between like pairs, i.e. from gc to gc or from at to at, and unlike pairs, i.e. from gc to at and vice versa. we choose t_i = 1 between identical and matching bases (e.g. at/ta, gc/cg). assuming that the wavefunction overlap between consecutive bases along the dna strand is weaker between unlike and non-matching bases (at/gc, ta/gc, etc.), we thus choose t_i = 1/2. we performed semi-empirical calculations on dna base pairs and stacks using the spartan quantum chemistry software package [1]. the results have shown that the relevant electronic states of dna (highest-occupied and lowest-unoccupied molecular orbitals with and without an additional electron) are localised on one of the bases of a pair only. the reduction of the dna base-pair architecture into a single site per pair, as in the fishbone model (1), is obviously a highly simplified approach.
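as a concrete check of the fishbone hamiltonian, the following python sketch (an illustration with the parameters quoted above, t_i = 1 and t_i^q = 2, not the code used in this study) builds the full 3l × 3l matrix for an ordered poly(dg)-poly(dc) chain and diagonalises it. the spectrum splits into two bands, [−4, −2] and [2, 4], plus l dispersionless states pinned at e = 0: these are the antisymmetric combinations of upper and lower backbone sites, which carry no weight on the central channel.

```python
import numpy as np

def fishbone_hamiltonian(L, t=1.0, tq=2.0, eps=0.0, eps_q=0.0):
    """full fishbone matrix: sites 0..L-1 form the central (base-pair)
    channel, L..2L-1 the upper backbone, 2L..3L-1 the lower backbone."""
    H = np.zeros((3 * L, 3 * L))
    for i in range(L):
        H[i, i] = eps                           # central onsite energy
        H[L + i, L + i] = eps_q                 # upper backbone onsite
        H[2 * L + i, 2 * L + i] = eps_q         # lower backbone onsite
        H[i, L + i] = H[L + i, i] = tq          # hop central <-> upper
        H[i, 2 * L + i] = H[2 * L + i, i] = tq  # hop central <-> lower
        if i < L - 1:                           # hop along the central channel
            H[i, i + 1] = H[i + 1, i] = t
    return H

E = np.linalg.eigvalsh(fishbone_hamiltonian(100))
band = E[np.abs(E) > 1e-8]  # drop the backbone-localised zero modes
```

for l = 100 the nonzero eigenvalues fill the intervals [−4, −2] and [2, 4] up to finite-size corrections, reproducing the semiconducting gap of the fishbone model.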
as an improvement on this, we model each base as a distinct site, with the two bases of a pair weakly coupled by the hydrogen bonds. the resulting 2-channel model is shown in fig. 3. this ladder model is a planar projection of the structure of the dna with its double-helix unwound. we note that results for electron transfer also suggest that the transfer proceeds preferentially down one strand [25]. there are two central branches, linked with one another, with interconnected sites where each represents a complete base, and which are additionally linked to the upper and lower backbone sites. the backbone sites, as in the fishbone model, are not interconnected. the hamiltonian for the ladder model is given by

H_L = \sum_{i=1}^{L} \sum_{\tau=1,2} \Big( t_{i,\tau} |i,\tau\rangle\langle i+1,\tau| + t_i^{q(\tau)} |i,\tau\rangle\langle i,q(\tau)| + \mathrm{h.c.} \Big) + \sum_{i=1}^{L} \Big( t_{12} |i,1\rangle\langle i,2| + \mathrm{h.c.} \Big) + \sum_{i=1}^{L} \sum_{\tau=1,2} \Big( \varepsilon_{i,\tau} |i,\tau\rangle\langle i,\tau| + \varepsilon_i^{q(\tau)} |i,q(\tau)\rangle\langle i,q(\tau)| \Big),  (2)

where t_{i,τ} is the hopping amplitude between sites along each branch τ = 1, 2 and ε_{i,τ} is the corresponding onsite potential energy. t_i^q and ε_i^q as before give hopping amplitudes and onsite energies at the backbone sites. also, q(τ) = ↑, ↓ for τ = 1, 2, respectively. the new parameter t_{12} represents the hopping between the two central branches, i.e. perpendicular to the direction of conduction. spartan results suggest that this value, dominated by the wave function overlap across the hydrogen bonds, is weak, and so we choose t_{12} = 1/10. in order to study the transport properties of dna, we could now either use artificial dna (poly(dg)-poly(dc) [43], random sequences of a, t, g, c [38, 56], etc.) or natural dna (bacteriophage λ-dna [37], etc.). the biological content of the sequence would then simply be encoded in a specific sequence of hopping amplitudes 1 and 1/2 between like and unlike base-pair sequences. however, in vivo and in most experimental situations, dna is exposed to diverse environments, and its properties, particularly those related to its conformation, can change drastically depending on the specific choice.
the solution, thermal effects, presence of binding and packaging proteins and the available space are factors that alter the structure and therefore the properties that one is measuring [4, 57]. clearly, such dramatic changes should also be reflected in the electronic transport characteristics. since it is precisely the backbone that will be most susceptible to such influences, we model such environmental fluctuations by including variations in the onsite potentials ε_i^q. different experimental situations will result in a different modification of the backbone electronic structure, and we model this by choosing different distribution functions for the onsite potentials, ranging from uniform disorder ε_i^q ∈ [−w/2, w/2], to gaussian disorder and on to binary disorder ε_i^q = ±w/2. w is a measure for the strength of the disorder in all cases. particularly the binary disorder model can be justified by the localisation of ions or other solutes at random positions along the dna strand [4]. due to the non-connectedness of the backbone sites along the dna strands, the models (1) and (2) can be further simplified to yield models in which the backbone sites are incorporated into the electronic structure of the dna. the effective fishbone model is then given by

\tilde{H}_F = \sum_{i=1}^{L} \Big( t_i |i\rangle\langle i+1| + \mathrm{h.c.} \Big) + \sum_{i=1}^{L} \Big( \varepsilon_i + \frac{(t_i^{\uparrow})^2}{E - \varepsilon_i^{\uparrow}} + \frac{(t_i^{\downarrow})^2}{E - \varepsilon_i^{\downarrow}} \Big) |i\rangle\langle i|.  (3)

similarly, the effective ladder model reads

\tilde{H}_L = \sum_{i=1}^{L} \sum_{\tau=1,2} \Big( t_{i,\tau} |i,\tau\rangle\langle i+1,\tau| + \mathrm{h.c.} \Big) + \sum_{i=1}^{L} \Big( t_{12} |i,1\rangle\langle i,2| + \mathrm{h.c.} \Big) + \sum_{i=1}^{L} \sum_{\tau=1,2} \Big( \varepsilon_{i,\tau} + \frac{(t_i^{q(\tau)})^2}{E - \varepsilon_i^{q(\tau)}} \Big) |i,\tau\rangle\langle i,\tau|.  (4)

in these two models, the backbone has been incorporated into an energy-dependent onsite potential on the main dna sites. this re-emphasises that the presence of the backbone influences the local electronic structure on the dna bases; similarly, any variation in the backbone disorder potentials ε_i^{↑,↓} will result in a variation of the effective onsite potentials as given in the brackets of eqs. (3) and (4).
both models allow us to quickly calculate the gap of the completely ordered system (all onsite potentials zero) by assuming that the lowest-energy state ψ = Σ_i ψ_{i(,τ)} |i(,τ)⟩ in each band corresponds to constant ψ_i (ψ_{i,τ}), whereas for the highest-energy states a checkerboard pattern ψ_{i+1} = −ψ_i is obtained. for the chosen set of hopping parameters in (3) and (4), this gives e_min,∓ = −4, 2 and e_max,∓ = −2, 4 for the fishbone model, and e_min,∓ ≈ −3.31, 1.21 and e_max,∓ ≈ −1.21, 3.31 for the ladder model. there are several approaches suitable for studying the transport properties of the models (1) and (2), and these can be found in the literature on transport in solid state devices or, perhaps more appropriately, quantum wires. since the variation in the sequence of base pairs precludes a general solution, we will use two methods well known from the theory of disordered systems [50]. the first method is the iterative transfer-matrix method (tmm) [26, 29, 30, 40, 41], which allows us in principle to determine the localisation length ξ of electronic states in systems with cross sections m = 1 (fishbone) and 2 (ladder) and length l ≫ m, where typically a few million sites are needed for l to achieve reasonable accuracy for ξ. however, in the present situation we are interested in finding ξ also for viral dna strands of typically only a few tens of thousands of base pairs. thus, in order to restore the required precision, we have modified the conventional tmm and now perform the tmm on a system of fixed length l_0. this modification has been used previously [22, 33, 49] and may be summarised as follows: after the usual forward calculation with a global transfer matrix T_{L_0}, we add a backward calculation with transfer matrix T^b_{L_0}. this forward-backward multiplication procedure is repeated K times. the effective total number of tmm multiplications is l = 2Kl_0, and the global transfer matrix is τ_l = (T^b_{L_0} T_{L_0})^K.
it can be diagonalised as for the standard tmm, yielding eigenvalues exp(±2Kl_0/ξ_τ) with τ = 1 or τ = 1, 2 for the fishbone and ladder model, respectively. the largest ξ_τ then corresponds to the localisation length of the electron on the dna strand and will be measured in units of the dna base-pair spacing (0.34 nm). the second method that we will use is the recursive green function approach pioneered by mackinnon [27, 28]. it can be used to calculate the dc and ac conductivity tensors and the density of states (dos) of a d-dimensional disordered system, and has been adapted to calculate all kinetic linear-transport coefficients such as thermoelectric power, thermal conductivity, peltier coefficient and lorentz number [51]. the main advantage of both methods is that they work reliably (i) for short dna strands ranging from 13 base pairs (dft studies [37]) up to the 30 base pairs used in nanoscopic transport measurements [15], (ii) for the somewhat longer dna sequences modelled in electron transfer studies, and (iii) even for complete dna sequences, which contain, e.g. for human chromosomes, up to 245 million base pairs [2]. the exact arrangement of the four bases a, t, g, c determines the nature and function of the associated dna strand, such as the chemical composition of the proteins which are encoded. while previous studies have aimed to elucidate whether dna conducts at all, we shall also focus our attention on how different dna sequences, be they artificial or naturally occurring, "conduct" charge differently. thus we study a set of different dna sequences. a convenient starting point for most electronic transport studies [44] is the aforementioned poly(dg)-poly(dc) sequence, which corresponds to a simple repetition of a gc (or cg) pair. note that within our models there is no difference between gc and cg pairs. although not occurring naturally, such sequences can be synthesised easily.
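a minimal single-channel version of the tmm can be sketched in python as follows. this is an illustration only, assuming the effective fishbone model (3) with t_i = 1, t_i^q = 2, ε_i = 0 and uniform backbone disorder, and using plain forward iteration rather than the forward-backward scheme described above; the lyapunov exponent of the two-component recursion gives 1/ξ.

```python
import numpy as np

def inverse_xi(E, eps_backbone, t=1.0, tq=2.0):
    """lyapunov exponent 1/xi for the effective fishbone chain
    psi_{i+1} = ((E - eps_eff_i)/t) psi_i - psi_{i-1}, where each base
    pair carries the backbone correction tq^2/(E - eps) for both strands."""
    v = np.array([1.0, 0.0])
    log_norm = 0.0
    for eps_up, eps_dn in eps_backbone:
        eps_eff = tq**2 / (E - eps_up) + tq**2 / (E - eps_dn)
        v = np.array([((E - eps_eff) / t) * v[0] - v[1], v[0]])
        n = np.linalg.norm(v)   # renormalise to avoid overflow
        log_norm += np.log(n)
        v /= n
    return log_norm / len(eps_backbone)

rng = np.random.default_rng(0)
L, W, E = 50_000, 1.0, 3.0                 # chain length, disorder, energy
ordered = inverse_xi(E, np.zeros((L, 2)))  # poly(dg)-poly(dc), no disorder
disordered = inverse_xi(E, rng.uniform(-W / 2, W / 2, size=(L, 2)))
```

at the band centre e = 3 the ordered chain gives an essentially vanishing lyapunov exponent (extended states), while uniform backbone disorder of strength w = 1 yields a localisation length 1/disordered on the order of a few hundred base pairs.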
another convenient choice of artificial dna strand is a simple random sequence of the four bases, which we construct with equal probability for all 4 bases. however, such sequences are not normally used in experiments. as a dna sample existing in living organisms, we shall use the λ-dna of the bacteriophage virus [bacteriophage lambda], which has a sequence of 48502 base pairs. it corresponds to a bacterial virus and is biologically very well characterised. we also investigate the 29728 bases of the sars virus [sars]. telomeric dna is a particular buffer part at the beginnings and ends of dna strands for eukaryote cells [2]. in mammals it is a guanine-rich sequence in which the pattern ttaggg is repeated over thousands of bases. its length is known to vary widely between species and individuals, but we assume a length of 6000 base pairs. last, we show some studies of centromeric dna for chromosome 2 of yeast with 813138 base pairs [cen2]. this dna is also reportedly rich in g bases and has a high rate of repetitions, which should be favourable for electronic transport. initially, we will compute transport properties for complete dna sequences, i.e. including and not differentiating between coding and non-coding sequences (this distinction applies to the naturally occurring dna strands only). however, we will later also study the difference between those two different parts of a given dna. we emphasise that while non-coding dna suffers from the label of "junk", it is now known to play several important roles in the functioning of dna [2]. before leaving the description of our dna sequences, we note that occasionally we show results for "scrambled" dna. this is dna with the same number of a, t, c, g bases, but with their order randomised. clearly, such sequences contain the same set of electronic potentials and hopping variations, but would perform quite differently if released into the wild.
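the mapping from a base sequence to hopping amplitudes described earlier (t = 1 between like base pairs, t = 1/2 between unlike ones) can be sketched in a few lines of python; the telomeric repeat below is just an illustrative input.

```python
def hoppings(seq):
    """hopping amplitudes along a dna strand: t = 1 when consecutive
    base pairs are of the same type (both at-like or both gc-like),
    t = 1/2 otherwise."""
    gc_like = [b in "GC" for b in seq.upper()]
    return [1.0 if a == b else 0.5 for a, b in zip(gc_like, gc_like[1:])]

telomeric = "TTAGGG" * 3  # guanine-rich telomeric repeat
t_telo = hoppings(telomeric)
```

note that for the ttaggg repeat the hopping pattern [1, 1, 0.5] itself repeats with period 3, which matches the splitting of each band into 3 separate sub-bands for telomeric dna reported below.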
a comparison of their transport properties with those of the original sequence thus allows us to measure how important the exact fidelity of a sequence is. let us start by studying the localisation properties of dna without any onsite disorder either at ε_{i,τ} or at ε_i^q. for a poly(dg)-poly(dc) sequence, both the fishbone and the ladder model produce two separate energy bands between the extremal values computed at the end of section ii d. within these energy bands, the electronic states are extended with infinite localisation length ξ, as expected. outside the bands, transport is exponentially damped due to an absence of states, and the ξ values are very close to zero. in fig. 4 the resulting inverse localisation lengths are shown. these are zero for the extended states in the two bands, but finite outside, showing the quick decrease of the localisation lengths outside the bands. in fig. 5, we show the same data but now plot the localisation length itself. we see that the energy gap observed previously [13] for the poly(dg)-poly(dc) sequence in the fishbone model remains. the difference with respect to the ladder model is a slight renormalisation of the gap width. the localisation lengths of poly(dg)-poly(dc) dna tend to infinity, meaning that the sequence is perfectly conducting. this is expected due to its periodic electronic structure. turning our attention to the other three dna sequences, we find that telomeric dna also gives rise to perfect conductivity, like poly(dg)-poly(dc) dna. but due to its structure of just 6 repeating base pairs, there is a further split of each band into 3 separate sub-bands. they may be calculated as in section ii d. we would like to point out that it may therefore be advantageous to use the naturally occurring telomeric parts of dna sequences as prime in-vivo candidates when looking for good conductivity in a dna strand.
the structure of the energy dependence for the random-atgc and the λ-dna is very different from the preceding two sequences, but it is quite similar between just these two. the biological content of the dna sequences is - within the description by our quantum models - just a sequence of binary hopping elements between like and unlike base pairs. thus the models are related to the physics of random hopping models [7, 19] and, in agreement with these, we see a dyson peak [18] in the centre of each sub-band. furthermore, we see that the range of energies for which we observe non-zero localisation lengths is increased into the gap and for large absolute values of the energy. this is similar to the broadening of the single energy band for the anderson model of localisation [50]. the localisation lengths, which roughly equal the average distance an electron would be able to travel (conduct), are close to 20 bases within the band, with a maximum of ∼ 30 bases at the centre of each band. note that this result is surprisingly good - given the level of abstraction used in the present models - when compared to the typical distances over which electron transfer processes have been shown to be relevant [9, 17, 25, 31, 35, 52, 54]. a. dna randomly bent or at finite temperatures as argued before, environmental influences on the transport properties of dna are likely to influence predominantly the electronic structure of the backbone. within our models, this can be captured by adding a suitable randomness onto the backbone onsite potentials ε_i^q. in this fashion, we can model for example the influence of a finite temperature [11] and thus a coupling to phonons [24]. we emphasise, however, that in order for our localisation results - which rely on quantum mechanical interference effects - to remain valid, the phase-breaking lengths should stay much larger than the sequence lengths. thus the permissible temperature range is a few k only.
the bending of dna is another possibility which can be modelled by a local, perhaps regular, change in ε q i along the strand. another important aspect is the change in ε q i due to the presence of a solution in which dna is normally immersed. all these effects can be modelled in a first attempt by choosing an appropriate distribution function p (ε q i ). let us first choose uniform disorder with ε q i ∈ [−w/2, w/2]. in fig. 6 we show the results for all 4 dna sequences as a function of energy for w = 1. comparing this to fig. 5 , we see that now all localisation lengths are finite; poly(dg)-poly(dc) and telomeric dna having localisation lengths of a few hundreds and a few tens of bases, respectively. the localisation lengths for random-atgc and λ-dna are only slightly reduced. in all cases, the structure of 2 energy bands remains. furthermore, w = 1 already represents a sizable broadening of about 1/2 the width of each band. thus although the localisation lengths are finite compared to the results of section v, they are still larger than the lengths of the dna strands used in the nano-electric experiments, implying finite conductances. we remark that the dyson peaks have vanished as expected [19] . we plot the dos for λ-dna in fig. 6 which clearly indicates the 2 bands. upon further increasing the disorder to w = 2, as shown in fig. 7 , the localisation lengths continue to decrease. note that we observe a slight broadening of the bands and states begin to shift into the gap. we also see that the behaviour of random-atgc and λ-dna is quite similar and at these disorder strengths, even telomeric dna follows the same trends. at w = 5, the localisation lengths have been reduced to a few base-pair separation distances and the differences between all 4 sequences are very small. the gap has been nearly completely filled as shown by the dos, albeit with states which have a very small localisation length. this will become important later. 
thus, in summary, we have seen that adding uniform disorder onto the backbone leads to a reduction of the localisation lengths and consequently a reduction of the electron conductance. strictly speaking, all 4 strands are insulators. however, their localisation lengths can remain quite large, larger than in many of the experiments. thus even the localised electron can contribute towards a finite conductivity for these short sequences. in agreement with experiments, poly(dg)-poly(dc) dna is the most prominent candidate. when in solution, the negatively charged oxygen on the backbone will attract cations such as na+. this will give rise to a dramatic change in local electronic properties at the oxygen-carrying backbone site, but not necessarily influence the neighbouring sites. the effects at each such site will be the same, and thus, in contrast to the uniform disorder used in section vi a, a binary distribution such as ε_i^q = ±w/2 is more appropriate. for simplicity, we choose 50% of all backbone sites to be occupied, with ε_i^q = −w/2, while the other half remains empty, with ε_i^q = +w/2. we note that a mixture of concentrations has been studied in the context of the anderson model in ref. [42]. in fig. 9, we show the results for moderate binary disorder. in comparison with the uniformly disordered case of fig. 6, we see that the localisation lengths have decreased further. this is expected because binary disorder is known to be very strong [42]. also, the gap has already started to fill. increasing the disorder leads again to a decrease of ξ in the energy regions corresponding to the bands. directly at e = ±w/2, we observe 2 strong peaks in the dos, accompanied by reduced localisation lengths. these peaks correspond to the infinite potential barrier or well at e = −w/2 or +w/2, respectively, as indicated by eq. (4). in fig. 9, these peaks were not yet visible. we also see in fig.
10 that the localisation lengths for states in the band centre start to increase to values larger than 1. this trend continues for larger w, as shown in fig. 11. we see a crossover into a regime where the two original, weak-disorder bands have nearly vanished and states in the centre at e = 0 are starting to show an increasing localisation length upon increasing the binary disorder. a further increase in w eventually leads to the complete destruction of the original bands and the formation of a single band symmetric around e = 0 at about w ∼ 2.5. the results of the previous section suggest that increasing the disorder in different regions of the energy will lead to different transport behaviour. of particular interest is the region at e = 0. in fig. 12 the variation of ξ as a function of binary disorder strength for all different sequences is shown. while ξ < 1 for small disorder, we see that upon increasing the disorder, states begin to appear and their localisation lengths increase for all dna sequences. thus we indeed observe a counter-intuitive delocalisation by disorder at e = 0. as before, poly(dg)-poly(dc) and telomeric dna show the largest localisation lengths, whereas random-atgc and λ-dna give rise to a smaller and nearly identical effect. in fig. 13 we show that this effect does not exist at e = 3, i.e. for energies corresponding to the formerly largest localisation lengths. rather, at e = 3, the localisation lengths for all dna sequences quickly drop to ξ ∼ 1. the delocalisation effect is also observed for uniform disorder, but is much smaller. as shown in fig. 14, the enhancement is up to about ξ = 1 for the fishbone model (1). results for the ladder model (2) are about 1.7 times larger. this surprising delocalisation-by-disorder behaviour can be understood by considering the effects of disorder at the backbone for the effective hamiltonians (3) and (4).
at e = 0, the magnitude of the onsite potential correction term (t_i^q)²/(e − ε_i^q) will decrease upon increasing the ε_i^q values. for binary disorder ε_i^q = ±w/2, this holds for |ε_i^q| > |e|, as shown in fig. 13. however, for large |e|, the localisation lengths decrease quickly due to the much smaller density of states. thus the net effect is an eventual decrease (or an only very small increase) of ξ for large e. note the dip at |ε_i^q| = e = 3 in the figure, which corresponds to an effective ε_i = ∞, i.e. an infinitely strong trap yielding extremely strong localisation. for uniform disorder ε_i^q ∈ [−w/2, w/2] - and generally any disorder with compact support around e = 0 - the above inequality is never fulfilled, and even for e = 0 we will find small ε_i^q ∼ 0 such that we have strong trapping and localisation. a. variation of ξ along the dna strand in the preceding sections, we computed estimates of the localisation length ξ for complete dna strands, i.e. the ξ values are averages. however, the biological function of dna clearly depends on the local structure of the sequence in a paramount way. after all, only certain parts of dna code for proteins, while others do not. in addition, the exact sequence of the bases specifies the protein that is to be assembled. thus, in order to gain access to the local properties, we have performed computations of ξ on subsequences of complete dna strands. we start by artificially restricting ourselves to finite windows of length k = 10, 30, 50, 100, 200, 500, 1000 and compute the localisation lengths ξ_k(r), where r = 1, 2, . . . , l − k denotes the starting position of the window of length k. in order to see how the exact sequence determines our results, we have also randomly permuted (scrambled) the λ-dna sequence so that the content of a, t, g, and c bases is the same, but their order is randomised. differences in the localisation properties should then indicate the importance of the exact order.
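the argument above can be illustrated numerically: for binary backbone disorder every site has |ε_i^q| = w/2, so at e = 0 the magnitude of the backbone-induced correction is exactly 2(t_i^q)²/w for each backbone site and shrinks as w grows, whereas uniform disorder always produces some sites with ε_i^q ≈ 0 and hence huge effective potentials. a small python sketch with the illustrative value t_i^q = 2:

```python
import numpy as np

def corrections(eps_q, E=0.0, tq=2.0):
    """magnitude of the backbone-induced onsite correction tq^2/(E - eps_q)."""
    return np.abs(tq**2 / (E - eps_q))

rng = np.random.default_rng(1)
n = 10_000
results = {}
for w in (1.0, 2.0, 5.0):
    # binary disorder: every site contributes exactly tq^2/(w/2) = 8/w
    binary = corrections(rng.choice([-w / 2, w / 2], size=n))
    # uniform disorder: rare sites with eps_q near 0 give huge corrections
    uniform = corrections(rng.uniform(-w / 2, w / 2, size=n))
    results[w] = (binary.max(), uniform.max())
```

the binary correction decreases monotonically with w, while the largest uniform-disorder correction always exceeds it, in line with the delocalisation effect appearing only for binary disorder.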
from the biological information available on bacteriophage λ-dna, we compute the localisation length for the coding regions [14], and then for window lengths k that correspond exactly to the length of each coding region. again, if the electronic properties, as measured by the localisation length, are linked to biological content, we would expect to see characteristic differences. in figs. 15 and 16, we show results for k = 100 and 1000, respectively. from fig. 15, we see from p(ξ) that the localisation lengths for λ-dna are mostly distributed around 15-20, but p(ξ) has a rather long tail for large ξ. indeed, there are some windows where the localisation lengths exceed even the size of the window k = 100. thus at specific positions in the dna sequence, the system appears essentially extended, with ξ > k. on the other hand, the distribution p(ξ) is identical when, instead of λ-dna, we consider scrambled dna. therefore the presence of such regions is not unique to λ-dna. the results from windows positioned at the coding parts of λ-dna appear statistically similar to those for the complete sequence, i.e. including also the non-coding regions. this suggests that, with respect to the localisation properties, there is no obvious difference between λ-dna and scrambled λ-dna, nor between coding and non-coding regions. we emphasise that similar results have been obtained for a dna sequence constructed from the sars corona-viral data. in fig. 16, we repeat these calculations with k = 1000. clearly, p(ξ) is peaked again around 15-20 and this time has no tail; in all cases, k > ξ. the results for scrambled dna now differ in each window, and even p(ξ) is somewhat shifted with respect to λ-dna. thus, in conclusion, we do not see significant differences between λ-dna and its scrambled counterpart. moreover, there appears to be no large difference between the localisation lengths measured in the coding and the non-coding sequences of bacteriophage λ-dna.
this indicates that the average ξ values computed in the previous sections are sufficient when considering the electronic localisation properties of the 4 complete dna sequences. as shown in the last section, the spatial variation of ξ for a fixed window size is characteristic of the order of bases in the dna sequence. thus we can now study how this biological information is retained at the level of localisation lengths. in order to do so, we define the correlation function cor(k), based on the mean ξ̄ = Σ_{i=1}^{n} ξ(r_i)/n, i.e. ξ averaged over all n = l − (k − 1) windows, for each of which the individual localisation length is ξ(r_i). in fig. 17 we show the results obtained for λ-dna with windows of length 10, 100 and 1000. we first note that cor(k) drops rapidly until the distance k exceeds the window width (see the inset of fig. 17). for distances larger than the window width, cor(k) fluctuates typically between ±0.2, and there is a larger anti-correlation for base-pair separations of about k = 8000. we note that such large-scale features are not present when considering scrambled λ-dna instead. the fishbone and ladder models studied in the present paper give qualitatively similar results, i.e. a gap in the dos of the order of the hopping energies to the backbone, extended states for periodic dna sequences and localised states for any non-zero disorder strength. thus at t = 0, our results suggest that dna is an insulator unless perfectly ordered. quantitatively, the localisation lengths ξ computed for the ladder model are larger than for the fishbone model; since we are interested in these non-universal lengths, the ladder model is clearly the more appropriate model. the localisation lengths measure the spatial extent of a conducting electron. our results suggest, in agreement with all previous considerations, that poly(dg)-poly(dc) dna allows the largest values of ξ.
even after adding a substantial amount of disorder, poly(dg)-poly(dc) dna can still support localisation lengths of a few hundred base-pair separations. with nanoscopic experiments currently probing at most a few dozen bases, this suggests that poly(dg)-poly(dc) dna will appear to be conducting in these experiments. furthermore, telomeric dna is a very encouraging and interesting naturally occurring sequence, because it gives very large localisation lengths in the weakly disordered regime. nevertheless, we find that all investigated non-periodic dna sequences, such as random-atgc and λ-dna, give localised behaviour even in the clean state. this indicates that they are insulating at t = 0. when the effects of the environment, modelled by potential changes on the backbone, are included, we find that the localisation lengths in the two bands decrease quickly upon increasing the disorder. nevertheless, depending on the value of the fermi energy, the resulting ξ values can still be 10-20 base pairs long. while this may not give metallic behaviour, it can still result in a finite current for small sequences. we also note that these distances are quite close to those obtained from electron-transfer studies. the backbone disorder also leads to states moving into the gap; therefore the environment prepared in the experiments determines the gap which is being measured. furthermore, the localisation properties of the states in the former gap are drastically different from those in the 2 bands: increasing the disorder leads to an increase in the localisation lengths, and thus potentially larger currents. this is most pronounced for binary disorder, taken to model the adhesion of cations in solution. thus, within the 2 models studied, we find that their transport properties are determined in a very crucial way by the environment.
differences in experimental set-up, such as measurements on 2d surfaces or between elevated contacts, are likely to lead to quite different results. as far as the correlations within biological λ-dna are concerned, we see only a negligible difference between the localisation properties of the coding and non-coding parts. however, this is clearly dependent on the chosen energy and the particular window lengths used. investigations on other dna sequences are in progress.
spartan version 5.0, user's guide
electrical conductivity in oriented dna. national nanofabrication users network newsletter
charge migration in dna: ion-gated transport
on the long-range charge transfer in dna
dna electronics
off-diagonal disorder in the anderson model of localization
long-range charge hopping in dna
dna-mediated charge transport for dna repair
dna-templated assembly and electrode attachment of a conducting silver wire
fluctuation-facilitated charge migration along dna
disorder and fluctuations in nonlinear excitations in dna. arxiv: q-bio.bm/0403003v1
backbone-induced semiconducting behavior in short dna wires
lambda ii:469-517, chapter appendix i: a molecular map of coliphage lambda
embedding methods for conductance in dna
electronic properties of dna
long-range dna charge transport
the dynamics of a disordered linear chain
the two-dimensional anderson model of localization with random hopping
the quest for high-conductance dna
electrical conduction through dna molecules
scaling in interaction-assisted coherent transport
hybrid dna-gold nanostructured materials: an ab-initio approach
quantum transport in dna wires: influence of a strong dissipative environment
electron transfer between bases in double helical dna
localization: theory and experiment
the conductivity of the one-dimensional disordered anderson model: a new numerical method
the calculation of transport properties and density of states of disordered solids
critical exponents for the metal-insulator transition
the scaling theory of electrons in disordered solids: additional numerical results
long-range photoinduced electron transfer through a dna helix
transfer-printing of highly aligned dna nanowires
effects of scale-free disorder on the anderson metal-insulator transition
anisotropic electric conductivity in an aligned dna cast film
direct chemical evidence for charge transfer between photoexcited 2-aminopurine and guanine in duplex dna
primary free radical processes in dna
absence of dc-conductivity in λ-dna
long-range correlations in nucleotide sequences
nonlinear dynamics and statistical physics of dna
finite-size scaling approach to anderson localisation
finite-size scaling approach to anderson localisation: ii. quantitative analysis and new results
the three-dimensional anderson model of localization with binary random potential
direct measurement of electrical transport through dna molecules
charge transport in dna-based devices
metallic conduction through engineered dna: dna nanoelectric building blocks
mutational specificity of oxidative dna damage
sequence dependent dna-mediated conduction
long range correlation in dna: scaling properties and charge transfer efficiency
the enhancement of the localization length for two-interacting particles is vanishingly small in transfer-matrix calculations
the anderson transition and its ramifications - localisation, quantum interference, and interactions, chapter numerical investigations of scaling at the anderson transition
thermoelectric properties of disordered systems
charge transport through a molecular π-stack: double helical dna
a simple model of the charge transfer in dna-like substances
femtosecond direct observation of charge transfer between bases in dna
band-gap tunneling states in dna
localization properties of electronic states in polaron model for poly(dg)-poly(dc) and poly(da)-poly(dt) dna polymers
variable range hopping and electrical conductivity along the dna double helix
extended states in disordered systems: role of off-diagonal correlations
structural and dynamical disorder and charge transport in dna
electronic transport in dna molecules with backbone disorder
the results for the fishbone and ladder models are qualitatively the same; quantitatively, the ladder model results have a nearly twice larger localisation length. it is a pleasure to thank h. burgert, d. hodgson, m. pfeiffer and d. porath for stimulating discussions.
key: cord-140977-mg04drna authors: maltezos, s.
title: methodology for modelling the new covid-19 pandemic spread and implementation to european countries date: 2020-06-27 journal: nan doi: nan sha: doc_id: 140977 cord_uid: mg04drna after the outbreak of the disease caused by the new virus covid-19, the mitigation stage has been reached in most of the countries in the world. during this stage, a more accurate data analysis of the daily reported cases and other parameters became possible for the european countries and has been performed in this work. based on a proposed parametrization model appropriate for implementation to an epidemic in a large population, we focus on the disease spread, study the obtained curves, and investigate probable correlations between each country's characteristics and the parameters of the parametrization. we have also developed a methodology for coupling our model to the sir-based models, determining the basic and the effective reproductive numbers in the parameter space. the obtained results and conclusions could be useful in the case of a recurrence of this repulsive disease in the future. for the disease of the new virus covid-19, which has been pandemic in the world for about 90 days, the "wavefront" of infection has reached its mitigation stage. therefore, this is the time to turn our thoughts not only to the subsequent, painful and serious implications of this pandemic [1], [2], [3], [4], but also to analysing the way the disease grew among the countries until the 10th of may 2020, as well as to correlating this growth with the main parameters that likely played a significant role. in particular, we consider it extremely useful to study the specific characteristics of each country that played a role, such as the financial level or even genetic behaviour against the new corona virus and the associated disease. some of these characteristics were used as mathematical parameters for performing correlation studies.
the results of this study could give us information for preparing more effective defensive strategies or practical "tools" in case of a possible future return of the pandemic, which constitutes the central goal of the present work. the outline of the paper is as follows: in section 2 we present a theoretical methodology for parametrizing an epidemic, in section 3 we explain how to couple the present parametrization model with the basic sir model, in section 4 we give results relevant to the end-to-end epidemic growth, and in section 5 we discuss the conclusions. our methodology is based on the parametrization of the growth of the covid-19 disease that we used in our recent work [5] and also in [6] and [7], which we call "semi-gaussian of n-degree". it was used for fitting the disease's growth in various indicative countries and belongs to the model category of "regression techniques" for epidemic surveillance. the basic single-term expression of this parametrization model is c(t) = a t^n e^{−t/τ}, where the function c(t), applied to an epidemic spread, represents the rate of infected individuals, i.e. the new daily reported cases (drc), and coincides with the function i(t) in the sir model, as we see in the following. here a is a constant, while n and τ are model parameters. the more analytical approach, in the general case from the mathematical point of view, comes from the fundamental study of epidemic growth and includes a number of terms in the form of a double summation related to the inverse laplace transform of a rational function given in [8], referring to the "earlier stages of an epidemic in a large population". in this hypothesis, the number of unaffected individuals may be considered to be constant, while any alteration is assumed small compared to the total number of exposed individuals.
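to make the single-term model concrete: consistent with the peaking time t_p = nτ stated below, c(t) = a t^n e^{−t/τ} has its maximum exactly at t = nτ. a sketch using the fitted values n_0 = 4.57 and τ_0 = 5.96 reported later for greece (a = 1 is an arbitrary choice for illustration):

```python
import math

def semi_gaussian(t, a, n, tau):
    """Single-term 'semi-Gaussian' daily-cases model: c(t) = a * t**n * exp(-t/tau)."""
    return a * t ** n * math.exp(-t / tau)

# fitted values reported later in this paper for Greece
n0, tau0 = 4.57, 5.96
tp = n0 * tau0          # peaking time t_p = n * tau (about 27 days)
peak_value = semi_gaussian(tp, 1.0, n0, tau0)
```

the curve rises polynomially like t^n at early times and decays exponentially after the peak, which is what lets a single term fit a full outbreak-to-mitigation drc curve.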
this function, which could be called the "large population epidemic semi-gaussian model" (lpe-sg), is the following: c(t) = Σ_{i=1}^{m} Σ_{j=1}^{n} a_{ij} t^{n_{ij}} e^{−t/τ_{ij}}, where a_{ij} are arbitrary amplitudes, n_{ij} are the degrees of the model (assumed fractional in the general case) and τ_{ij} are time constants, representing in our case a mean infection time. also, m and n are the finite numbers of terms to be included. it is easy to prove that the "peaking time" of each term depends on the product of n_{ij} by τ_{ij}, that is, t_{p,ij} = n_{ij} τ_{ij}. in practice, the number of required terms should be determined according to the shape of the data and the desired achievable accuracy. after investigating the fitting performance we concluded that, at most, two terms of the above double sum are adequate for our purpose. also, the cross terms, with indices ij and ji, do not add flexibility to the model. in particular, a) the degree of the model can "cover" any early or late smaller outbreak of the daily cases, while b) the mean infection time is a characteristic inherent parameter of the disease under study and thus should be essentially constant. for these reasons, the expression with one term was adequate in most of the cases, whereas the 6 free parameters allow very good flexibility for the fitting. therefore, we can write the model with a single term as c(t) = a t^n e^{−t/τ}. for the fitting procedure, we have used two alternative tactics, based on either the daily model or its cumulative integral. the decision depends on the goodness of the fit in each case, based on the criterion of minimizing the χ²/dof, as we have done in our previous work [5]. the starting date (at t = 0) was one day before the first reported case (or cases). the cumulative parametrization model and the corresponding fitting model take the form c(t) = a τ^{n+1} [γ(n+1) − γ_c(n+1, t/τ)], where the symbol γ represents the gamma function and γ_c the upper incomplete gamma function at time t.
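integrating the single-term form c(t) = a t^n e^{−t/τ} gives the cumulative a τ^{n+1} γ_l(n+1, t/τ), with γ_l the lower incomplete gamma function (equivalently γ(n+1) minus the upper incomplete γ_c mentioned in the text). a stdlib-only sketch, cross-checked against a direct riemann sum; the parameter values are arbitrary:

```python
import math

def lower_inc_gamma(s, x, terms=200):
    """Lower incomplete gamma function via the series
    gamma_l(s, x) = x**s * exp(-x) * sum_k x**k / (s * (s+1) * ... * (s+k))."""
    total, term = 0.0, 1.0 / s
    for k in range(terms):
        total += term
        term *= x / (s + k + 1)
    return x ** s * math.exp(-x) * total

def cumulative_cases(t, a, n, tau):
    """C(t) = integral_0^t a * s**n * exp(-s/tau) ds = a * tau**(n+1) * gamma_l(n+1, t/tau)."""
    return a * tau ** (n + 1) * lower_inc_gamma(n + 1, t / tau)

# cross-check against a direct Riemann sum of the daily model
a, n, tau, T = 1.0, 2.0, 3.0, 10.0
dt = 1e-3
riemann = sum(a * (k * dt) ** n * math.exp(-k * dt / tau) * dt
              for k in range(int(T / dt)))
```

the closed form and the numerical integral agree to well under a percent, which is the identity the cumulative fitting tactic relies on.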
the above generalized mathematical model has the advantage of providing more flexibility when the raw data include a composite structure, or a superposition of more than one coexisting growth curve. this is possible due to restriction measures applied during the evolution of an epidemic. regardless of the number of terms used, the obtained parameters must be well understood, in the sense of their role in the problem. let us consider a country where the disease starts to break out due to a small number of infected individuals. in this stage we assume that the country has been isolated to a relatively high degree, but of course not ideally. at this point, the disease starts with a transmission rate which depends on the dynamics of the spread in each city and village, while other inherent properties of the disease affect its dynamics (e.g. the immune reaction, the incubation time and the recovery time). at this point, we must also clarify the issue of the "size" of the epidemic. the sir-based models assume that the size n, that is, the total number of individuals exposable to the disease, is unchanged during its evolution, a fact which cannot be exactly true. on the other hand, a fraction of the size concerns individuals who are in quarantine for different reasons (due to tracing or for precautionary reasons). therefore, the size cannot be absolutely constant, and forecasting in the first stage (during the growth of the epidemic) is very uncertain. in the second stage (around the turning point), the situation is clearer by means of more accurate parametrization, although high fluctuations could still be present. in the third stage (mitigation or suppression), an overall parametrization can be made, and any attempt at forecasting concerns a likely future comeback of the epidemic. in any of the three stages, regardless of the level of uncertainty, the parametrization specifies the associated parameters according to the epidemic model used.
it is known that the "basic reproductive number", symbolized by r_0, is a very important parameter of the spreading of the epidemic. in the sir-based models it is determined at the first moments of the epidemic (mathematically at t = 0) and is related to the associated parameters. moreover, as is proven below, this parameter does not depend on the size n of the epidemic. by using the present parametrization model we assume that the size of the epidemic is not only unknown, but also much smaller than the population of the country or city under study; that is, it constitutes an unbiased sample of the potentially exposable generic population. once the epidemic is largely eliminated, the size could also be estimated "a posteriori" with the help of a sir-based model. however, in this case, the parameters of the spread, as well as the reproductive number, are already determined by the methodology given below. we consider that this more generic approach facilitates the fitting process and improves the accuracy, because of the existence of an analytical, mathematically optimal solution. 3 coupling with the sir-based models. the classical model for studying the spread of an epidemic, sir, belongs to the mathematical or state-space category of models, along with a large number of other types which are analytically described in [9]. our model belongs to the continuum deterministic sir models, in the special case applied at the earlier stage of an epidemic, assuming that the population is much greater than the infected number of people [8]. this model can also be applied when the epidemic is at its latest stage, where the total number of infected individuals is an unbiased sample of the population. under these assumptions, this model can be related to the classical sir model, or even to its extensions (seir and sird), by correlating their parameters. below, we give a brief description of the basic sir epidemic model.
let us describe briefly the three state equations of the sir model: ds/dt = −(a/n) s i, di/dt = (a/n) s i − β i, dr/dt = β i. the function s = s(t) represents the number of susceptible individuals, i = i(t) the number of infected individuals and r = r(t) the number of recovered individuals, all referred per unit time, usually measured in days. the constant factor a is the transmission rate, the constant factor β is the recovering rate and n is the size of the system (the total number of individuals, assumed constant in time); that is, s + i + r = n at all times. the assumptions concerning the initial and final conditions are s(0) > 0, s(∞) > 0 and i(∞) = 0. this model does not have an analytical mathematical solution; additionally, the two parameters a and β are constant during the spread of the epidemic. a solution is derived only under the approximation βr/n < 1, that is, when the epidemic essentially concerns a small number of recovered individuals compared to the total number of individuals; in this case a taylor expansion of an exponential function is used. in our study, we work in the general case without this assumption. in our basic model, the cumulative function (the integral of c(t)) must be compared to the total number of infected individuals i, while the parameter a undertakes the scaling of the particular data. the parameter τ does not necessarily coincide with the inverse of the "mean infection rate" a, but 1/τ expresses an "effective transmission rate" in our model. the parameter n cannot be equated to any of the parameters of the sir model. however, this parameter contributes to the key parameter of the epidemic spread, the so-called "basic reproductive number" r_0, which is defined at t = 0 and is equal to r_0 = (a/β)(s(0)/n), where β is the "mean recovering rate". because s(0) ≈ n, it becomes r_0 ≈ a/β. however, s(0)/n represents a basic threshold, the so-called "population density", above which the epidemic is initiated and grows when r_0 = a/β > 1.
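a minimal forward-euler integration of the three state equations just described, using illustrative values that echo the numbers reported later in this paper (β = 0.1 and r_0 = a/β ≈ 2.9); the step size, population and initial condition are arbitrary choices:

```python
def simulate_sir(a, beta, N, I0, days, dt=0.02):
    """Forward-Euler integration of dS/dt = -aSI/N, dI/dt = aSI/N - bI, dR/dt = bI."""
    S, I, R = N - I0, I0, 0.0
    traj = [(S, I, R)]
    for _ in range(int(days / dt)):
        flow_in = a * S * I / N      # new infections per unit time
        flow_out = beta * I          # new recoveries per unit time
        S, I, R = S - dt * flow_in, I + dt * (flow_in - flow_out), R + dt * flow_out
        traj.append((S, I, R))
    return traj

traj = simulate_sir(a=0.29, beta=0.1, N=10_000, I0=1, days=300)
S_p, I_p, R_p = max(traj, key=lambda state: state[1])   # state at the infection peak
```

the scheme conserves s + i + r = n by construction, and the infection peak occurs where di/dt = 0, i.e. where s drops to n β/a = n/r_0, the threshold used throughout the next paragraphs.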
moreover, the "effective reproductive number" r_e, a variable function of time, is defined in the same way: r_e(t) = (a/β)(s(t)/n) = r_0 s(t)/n. because the condition for creating an epidemic is r_e > 1, the corresponding condition should be s/n > 1/r_0. also, at t = 0 we should have r_e(0) ≡ r_0, at the peaking time t = t_p we should have r_e(t_p) = 1, and at t = ∞ it takes the value r_e(∞) = r_0 s(∞)/n < r_0. by using the expressions of eq. 9, including only the value of i and its derivative, one can estimate roughly r_e at any time t. it can also be shown that s(∞)/n can be determined by solving numerically the transcendental equation s(∞)/n = e^{−r_0 (1 − s(∞)/n)}. therefore, we can conclude that for r_e, at the outbreak of the epidemic (rising branch of the curve) we have 1 < r_e < r_0; exactly at the peak of the curve, r_e = 1 (because s = n/r_0, as we explain in the following); and at the mitigation stage (descending branch of the curve), r_e < 1 and tends to the minimum value r_0 s(∞)/n at the asymptotic tail of the curve. once the above relationship is achieved, r_0 can be determined by solving the derived algebraic equation. indeed, this was our initial motivation for performing the following analysis. the methodology for accomplishing it was based on the idea of exploiting the property of our model at its maximum, at the peaking time t_p = nτ, as can easily be proven by differentiation. on the other hand, in the sir model a peak is expected at some time for the function i, as the typical effect of the epidemic's spread. considering that both models can be fitted to the data, they must agree in the vicinity of the peak, and therefore we must claim that i_p = c(t_p). let us first find expressions for s, r and i at the peaking time, symbolizing them by s_p, r_p and i_p respectively. in order to relate s with r, we replace i from eq. 6 into eq.
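the transcendental equation for the final susceptible fraction, s(∞)/n = e^{−r_0 (1 − s(∞)/n)}, can be solved by simple fixed-point iteration; a sketch (the iteration count is an arbitrary choice):

```python
import math

def final_susceptible_fraction(R0, iters=200):
    """Solve s = exp(-R0 * (1 - s)) for s = S(inf)/N by fixed-point iteration."""
    s = 0.5
    for _ in range(iters):
        s = math.exp(-R0 * (1.0 - s))
    return s

s_inf_2 = final_susceptible_fraction(2.0)    # classical value, about 0.203
s_inf_29 = final_susceptible_fraction(2.9)   # about 0.067, near the s_n = 0.06 quoted below
```

the iteration converges because the map's derivative at the root, r_0 s(∞)/n, is exactly r_e(∞) < 1; the result at r_0 = 2.9 is close to the s_n = 0.06 reported later for greece.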
8, we obtain ds/dr = −(a/β)(s/n) = −(r_0/n) s, from which, taking into account that s(0) ≈ n, we derive the solution s = n e^{−r_0 r/n}. the latter result, evaluated at the peaking time, gives us an expression for r_p. indeed, according to eq. 7, at the peaking time we must have di/dt = 0; from this equation, and using the definition of r_0, we obtain s_p = n(β/a) = n/r_0. based on eq. 14 we then calculate r_p = (n/r_0) ln r_0. adding the three functions at the peaking time, s_p, i_p and r_p, we derive the algebraic equation s_p + i_p + r_p = n, i.e. 1/r_0 + ln(r_0)/r_0 + i_p/n = 1. in order to achieve an equation independent of the size n, we must express i_p/n as a function of the model parameters, that is, by using the maximum value of the model curve [5] and eq. 8 of the sir model integrated with upper limit infinity, as follows: i_p/n = [n_0^{n_0} e^{−n_0}/γ(n_0 + 1)] (1 − s(∞)/n), where the symbol γ represents the gamma function, n_0 and τ_0 are the particular values obtained by the fitting, and τ_0 = 1/β. replacing the above expression into eq. 18 and setting s_n = s(∞)/n, we obtain 1/r_0 + ln(r_0)/r_0 + [n_0^{n_0} e^{−n_0}/γ(n_0 + 1)] (1 − s_n) = 1. this transcendental equation can be solved only numerically for r_0, in which the combined unknown s_n is also found numerically by using again the other transcendental eq. 10, with multiple iterations leading to a converging, accurate solution within 12 loops. the parameter n of the model is essentially the expresser of r_0, while the obtained value of r_0 concerns a hypothetical sir model fitted to the data of the daily reported cases (drc). from the obtained solution for r_0 we can also calculate the parameter a of the sir model, a = βr_0, where β can be calculated from the peak value of the daily reported recovered individuals by the formula β = r_p/i_tot,p, where i_tot,p represents the integral of the drc curve with upper limit the peaking time t_p. in particular, implementing the above methodology, by using home-made software codes written on the matlab platform [10], we obtained the fitting of the drc curve for greece at the mitigation stage, shown in fig. 1 and fig. 2.
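the peak-time relations s_p = n/r_0, r_p = n ln(r_0)/r_0 and i_p = n − s_p − r_p follow from the invariant s = n e^{−r_0 r/n} (valid when s(0) ≈ n); a sketch verifying both numerically, where the euler step and parameter values are arbitrary choices:

```python
import math

def sir_peak_closed_forms(R0, N=1.0):
    """Closed-form SIR state at the infection peak, assuming S(0) ~ N."""
    S_p = N / R0
    R_p = N * math.log(R0) / R0
    I_p = N - S_p - R_p
    return S_p, I_p, R_p

def invariant_error(a=0.29, beta=0.1, N=10_000.0, I0=1.0, days=300, dt=0.005):
    """Worst relative violation of S = N * exp(-R0 * R / N) along an Euler SIR run."""
    R0 = a / beta
    S, I, R = N - I0, I0, 0.0
    worst = 0.0
    for _ in range(int(days / dt)):
        new_inf, rec = a * S * I / N, beta * I
        S, I, R = S - dt * new_inf, I + dt * (new_inf - rec), R + dt * rec
        worst = max(worst, abs(S - N * math.exp(-R0 * R / N)) / N)
    return worst

S_p, I_p, R_p = sir_peak_closed_forms(2.9)   # R0 = 2.9, as fitted for Greece below
err = invariant_error()
```

for r_0 = 2.9 the closed forms give i_p/n ≈ 0.288, and the invariant holds along the whole simulated trajectory to well under a percent, supporting the algebraic peak equation s_p + i_p + r_p = n used above.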
the fitted parameters were n_0 = 4.57 and τ_0 = 5.96, and the solutions r_0 = 2.90 and s_n = 0.06. for the parameter β we used a typical average value found in the literature, β = 0.10, and the same value was used for the other analyzed countries. two characteristic parametrizations, of very large normalized size and of very small size (64 times smaller), namely belgium and malta respectively, are given in fig. 3 and fig. 4. definitely, without seeing the vertical scale, one cannot distinguish which corresponds to a large or a small normalized size; the only difference visible at a glance is the peaking time (29 and 16 days, respectively). 4 study of the end-to-end epidemic growth. for the data analysis we selected the 29 countries of the eu, including switzerland and the uk, obtained from [11]. the characteristics of the countries relevant to our study are summarized in table 1. in particular, we used the population density, the estimated normalized total number of infected individuals (determined from the number of deaths by using a typical constant factor) and the gross domestic product (gdp), nominal per capita. the degree of correlation among the above characteristics and the modelling parameters was studied by the "theoretical pearson linear correlation coefficient", ρ(x, y) = cov(x, y)/(σ_x σ_y), where x and y are considered normal random variables, σ_x and σ_y are the corresponding standard deviations and cov(x, y) is their covariance. however, as is done in practice, we calculated the "sampling pearson coefficient" (spc), r(x, y), for n observed random pairs (x_1, y_1), . . . , (x_n, y_n), where x represents the first selected variable and y the second one. the correlation study concerned eight pairs, as illustrated in table 2. the conclusions of the linear correlation study are the following: 1. for the population density d: no correlation was found with other parameters. 2. for the model parameters n_0 and τ_0: a strong anti-correlation was found. 3.
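the sampling pearson coefficient can be sketched directly (toy data only; real inputs would be the per-country characteristic/parameter pairs from tables 1 and 2):

```python
import math

def pearson(xs, ys):
    """Sampling Pearson correlation coefficient r(X, Y) = cov(X, Y) / (sigma_X * sigma_Y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r_pos = pearson([1, 2, 3, 4], [2, 4, 6, 8])   # perfectly correlated
r_neg = pearson([1, 2, 3, 4], [8, 6, 4, 2])   # perfectly anti-correlated
```

the reported r_0 versus peaking-time anti-correlation would show up here as a value close to −1.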
for the peaking time t_p: a very strong anti-correlation was found with the basic reproductive number r_0. the scatter plot of the basic reproductive number r_0 and the peaking time t_p is shown in fig. 5. this correlation gives us the following message: a higher r_0 results in a smaller delay of the upcoming peak in the drc curve. the obtained slope of the linear fit was −8.4 ± 0.2 days. on the other hand, r_0 among the analyzed countries presents statistical fluctuations from about 2 to 4.6, obeying roughly a gaussian distribution with mean value 2.96 and standard deviation 0.68 (or 23% relative to the mean). the parameters n and τ also fluctuate, as we can see in fig. 6, while the peaking time t_p shows stochastic characteristics, obeying similarly a gaussian distribution with mean 25.7 days and standard deviation 7.8 days (or 30% relative to the mean). since r_0 fluctuates (and in turn t_p, due to their linear correlation) among the different countries randomly, without presenting any correlation with their associated parameters, we can conclude that the normalized size of the epidemic can be explained only by taking into account other reasons and aspects, related to the way citizens interact and behave, as well as the degree of social distancing and mobility or transport within a country's major cities. a crucial role was definitely also played by the degree of quarantine and likely by some individual biological differentiations (genetic and other related characteristics). the capability of surveying the epidemic spread during the three main stages is very important and could be based on the daily data and the mathematical modelling we presented. in the mitigation stage the surveying is even more useful and crucial, when the restriction measures are starting to be relaxed. the crucial condition for a new epidemic reappearance is based on the effective reproductive number r_e, as well as on the corresponding population threshold.
however, because of the large statistical fluctuations caused by the poor statistics of the data, as well as the low slope of the epidemic curve at this stage, it is very hard to achieve accurate numbers, and only a qualitative estimate is possible, as follows. r_e can be estimated from the expressions in eq. 9, using average numerical approximations of the slope di/dt. an alternative and practical formula based on the parametrization model lpe-sg can easily be proven and is r_e = 1 + (nτ/t − 1)/(βτ). this formula is valid only in the vicinity of the peak, namely in the narrow range from 0.5 t_p to 1.5 t_p, because the fitted model and the sir one differ in the slopes at both side tails. once r_e is estimated, the population density threshold can in turn be calculated and should be s(t)/n = 1/r_e, assuming that the normalized size n can also be estimated at a similar level of accuracy. therefore the crucial condition in the mitigation stage is written in terms of r_e and this threshold; the derivative has to be calculated as an average slope, ∆i/∆t, preferably over at least one week. assuming that this slope is i_w and the corresponding average cases in a week are i_av, the crucial condition becomes a practical criterion, for which we used the typical values β = 0.1 days^{−1} and n/s(∞) ≈ 15 as a case study. this simplified formula, combined with one-week measurements, should be very useful because the relative fluctuations of the drc are expected to be very large. a systematic analysis of the epidemic characteristics of the spread of the new virus covid-19 disease is presented in this work. for the mathematical analysis, we used a model that we call lpe-sg, which facilitates the parametrization by an analytical mathematical description. we also presented a methodology for coupling it with the sir-based models, aiming to determine the basic and effective reproductive numbers based on the fitted parameters.
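substituting the single-term model c(t) = a t^n e^{−t/τ} into r_e = 1 + (di/dt)/(βi) from eq. 9 gives r_e(t) = 1 + (n/t − 1/τ)/β = 1 + (nτ/t − 1)/(βτ), which equals 1 exactly at the peaking time t = nτ, as required. a sketch with the fitted values reported for greece (illustration only):

```python
def effective_R(t, n, tau, beta):
    """Re(t) = 1 + (1/beta) * c'(t)/c(t) for c(t) = a * t**n * exp(-t/tau);
    the amplitude a cancels in the ratio c'(t)/c(t) = n/t - 1/tau."""
    return 1.0 + (n / t - 1.0 / tau) / beta

n0, tau0, beta = 4.57, 5.96, 0.1   # fitted values for Greece, typical beta
tp = n0 * tau0                      # peaking time
re_peak = effective_R(tp, n0, tau0, beta)
```

as the text notes, the formula should only be trusted in roughly the window 0.5 t_p to 1.5 t_p, where the fitted model and the sir curve have comparable slopes.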
analysing the daily reported cases of the european countries, we found no correlation between the population density, normalized size or gdp of the countries and the spreading characteristics. another important finding of our study was a strong, statistically significant anti-correlation of the basic reproductive number and the peaking time. moreover, we found that the basic reproductive number in the epidemics studied showed a uniform distribution with a wide range of values. this means that it is mainly influenced by many factors and generic characteristics of the society in a country.
data-based analysis, modelling and forecasting of the covid-19 outbreak
epidemic analysis of covid-19 in china by dynamical modeling
a robust stochastic method of estimating the transmission potential of 2019-ncov
predicting the cumulative number of cases for the covid-19 epidemic in china from early data
parametrization model motivated from physical processes for studying the spread of covid-19 epidemic
polynomial growth in branching processes with diverging reproductive number
fractal kinetics of covid-19 pandemic
a contribution to the mathematical theory of epidemics
mathematical modeling of infectious disease dynamics
i would like to thank my daughter v. maltezou, a graduate of the department of agriculture of the aristotle university of thessaloniki and of the athens school of fine arts, for our discussions on the global epidemiological problem, which gave me the warmth and the motivation for doing this work. also, i thank my colleagues, prof. emeritus e. fokitis and e. katsoufis, for their insightful comments and our useful discussions.
key: cord-121200-2qys8j4u authors: zogan, hamad; wang, xianzhi; jameel, shoaib; xu, guandong title: depression detection with multi-modalities using a hybrid deep learning model on social media date: 2020-07-03 journal: nan doi: nan sha: doc_id: 121200 cord_uid: 2qys8j4u social networks enable people to interact with one another by sharing information, sending messages, making friends, and having discussions, which generates massive amounts of data every day, popularly known as user-generated content. this data comes in various forms, such as images, text, videos and links, and reflects user behaviours, including their mental states. it is challenging yet promising to automatically detect mental health problems from such data, which is short, sparse and sometimes poorly phrased. however, there are efforts to automatically learn patterns from such user-generated content using computational models. while many previous works have largely studied the problem on a small scale, assuming a uni-modality of data that may not give faithful results, we propose a novel scalable hybrid model that combines bidirectional gated recurrent units (bigrus) and convolutional neural networks to detect depressed users on social media such as twitter, based on multi-modal features. specifically, we encode words in user posts using pre-trained word embeddings and bigrus to capture latent behavioural patterns, long-term dependencies, and correlations across the modalities, including semantic sequence features from the user timelines (posts). the cnn model then helps learn useful features. our experiments show that our model outperforms several popular and strong baseline methods, demonstrating the effectiveness of combining deep learning with multi-modal features. we also show that our model helps improve predictive performance when detecting depression in users who post messages publicly on social media.
mental illness is a serious issue faced by a large population around the world. in the united states (us) alone, every year a significant percentage of the adult population is affected by different mental disorders, including depression (6.7%), anorexia and bulimia nervosa (1.6%), and bipolar disorder (2.6%) [1]. mental illness has sometimes been attributed to mass shootings in the us [26], which have taken numerous innocent lives. one of the most common mental health problems is depression, which is more prevalent than other mental illness conditions worldwide [60]. the fatality risk of suicide in depressed people is 20 times higher than in the general population [54]. diagnosing depression is usually difficult because depression detection needs thorough and detailed psychological testing by experienced psychiatrists at an early stage [39]. moreover, it is very common for people who suffer from depression not to visit clinics to seek help from doctors in the early stages of the problem [66]. however, it is common for people who suffer from mental health problems to "implicitly" (and sometimes even "explicitly") disclose their feelings and their daily struggles with mental health issues on social media as a way of relief [3, 33]. therefore, social media is an excellent resource for automatically discovering people who are under depression. while it would take a considerable amount of time to manually sift through individual social media posts and profiles to locate people going through depression, automatic and scalable computational methods could provide timely, mass detection of depressed people, which could help prevent many major fatalities in the future and help people who genuinely need it at the right moment. the daily activities of users on social media could be a gold-mine for data miners, because this data provides rich insights into user-generated content.
it not only gives researchers a new platform to study user behaviour but also enables interesting data analysis that might not be possible otherwise. mining users' behavioural patterns for psychologists and scientists by examining their online posting activities on multiple social networks, such as facebook, weibo [12, 25], twitter, and others, could help target the right people at the right time and provide urgent crucial care [5]. there are existing startup companies, such as neotas with offices in london and elsewhere, that mine publicly available user data on social media to help other companies automatically perform background checks, including understanding the mental states of prospective employees. this suggests that studying the mental health conditions of users online using automated means not only helps government or health organisations but also has a huge commercial scope. the behavioural and social characteristics underlying social media information attract the interest of many researchers from different domains, such as social scientists, marketing researchers and data mining experts, who analyze social media information as a source for examining human moods, emotions and behaviours. usually, depression diagnosis is difficult to achieve on a large scale, because most traditional ways of diagnosis are based on interviews, questionnaires, self-reports or testimony from friends and relatives. such methods are hardly scalable to cover a larger population. individuals and health organizations have thus shifted away from their traditional interactions and now meet online, building online communities for sharing information and seeking and giving advice, to help scale their approach to some extent so that they can cover more of the affected population in less time.
besides sharing their moods and actions, recent studies indicate that many people on social media tend to share or give advice on health-related information [17, 29, 36, 40]. these sources provide a potential pathway to discover mental health knowledge for tasks such as diagnosis, medications and claims. detecting depression through online social media is very challenging, requiring one to overcome various hurdles ranging from acquiring data to learning the parameters of the model from sparse and complex data. concretely, one of the challenges is the availability of a relevant and sufficient amount of data for mental illness detection. more data is ideal primarily because it gives the computational model more statistical and contextual information during training, leading to faithful parameter estimation. while there are approaches that have tried to learn a model on small-scale data, the performance of these methods is still sub-optimal. for instance, in [10], the authors tried crawling tweets that contain depression-related keywords as ground truth from twitter. however, they could collect only a limited amount of relevant data, mainly because it is difficult to obtain relevant data on a large scale quickly, given the underlying search intricacies associated with the twitter application programming interface (api) and the daily data download limit. despite using the right keywords, the service might return many false positives. as a result, their model suffered from unsatisfactory quantitative performance due to poor parameter estimation on small, unreliable data. the authors in [9] also faced a similar issue, using a small number of data samples to train their classifier. as a result, their study suffered from unreliable model training on insufficient data, leading to poor quantitative performance. in [20] the authors propose a model to detect anxious depression in users.
they have proposed an ensemble classification model that combines the results of three popular models, and they also study the performance of each model in the ensemble individually. to obtain the relevant data, the authors introduced a method to collect their dataset quickly by choosing the first randomly sampled 100 users who were followers of the ms india student forum over one month. a very common problem faced by researchers in detecting depression on social media is the diversity of users' behaviours, making it extremely difficult to define depression-related features that cope with mental health issues. for example, although social media can help us gather enough data through which useful feature engineering can be done effectively and several user interactions can be captured and studied, it was noticed in [15, 51] that one could only obtain a few crucial features to detect people with eating disorders. in [44] the authors also suffered from inadequate features and a limited amount of relevant data, leading to poor results. different from the above works, we propose a novel model that is trained on a relatively large dataset, showing that the method scales and produces better and more reliable quantitative performance than existing popular and strong comparative methods. we also propose a novel hybrid deep learning approach that can capture crucial features automatically based on data characteristics, making the approach reliable. our results show that our model outperforms several state-of-the-art comparative methods. depressed users behave differently when they interact on social media, producing rich behavioural data, which is often used to extract various features. however, not all of them are related to depression characteristics. many existing studies have either neglected important features or selected less relevant features, which are mostly noise.
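the ensemble strategy described at the start of this passage, combining the outputs of several base classifiers, can be illustrated with a simple majority vote. the base-model predictions below are hypothetical, used only to show the mechanism:

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model label predictions by simple majority voting.
    `predictions` is a list of prediction lists, one per base model."""
    combined = []
    for votes in zip(*predictions):            # votes for one user across models
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# three hypothetical base models disagreeing on some users (1 = depressed)
model_a = [1, 0, 1, 0]
model_b = [1, 0, 0, 0]
model_c = [1, 1, 0, 0]
ensemble = majority_vote([model_a, model_b, model_c])
```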
on the other hand, some studies have considered a variety of user behaviours. for example, [41] is one such work that collected a large-scale dataset with reliable ground-truth labels. the authors extracted various features representing user behaviour in social media, grouped these features into several modalities, and finally proposed a new model, the multimodal dictionary learning model (mdl), to detect depressed users from tweets based on dictionary learning. however, given the high-dimensional, sparse, figurative and ambiguous nature of tweet language use, dictionary learning cannot capture the semantic meaning of tweets. word embedding, instead, is a technique that can address these difficulties through neural network paradigms. hence, given the capability of word embeddings to hold the semantic relationships between tweets and to capture the similarity between terms, we combine multi-modal features with word embeddings to build a comprehensive spectrum of behavioural, lexical and semantic representations of users. recently, using deep learning to gain insightful and actionable knowledge from complex and heterogeneous data has become mainstream in ai applications for healthcare; for example, medical image processing and diagnosis have seen great success. the advantage of deep learning lies in its outstanding capability for iterative learning and automated optimization of latent representations through a multi-layer network structure [32]. this motivates us to leverage the superior learning capability of neural networks on the rich and heterogeneous behavioural patterns of social media users. to be specific, this work aims to develop a novel deep learning-based solution for improving depression detection by utilizing multi-modal features from the diverse behaviours of depressed users in social media. apart from the latent features derived from lexical attributes, we notice that the dynamics of tweets, i.e.
the tweet timeline, provides a crucial hint reflecting how a depressed user's emotions change over time. to this end, we propose a hybrid model comprising a bidirectional gated recurrent unit (bigru) and a convolutional neural network (cnn) to boost the classification of depressed users using multi-modal features and word embedding features. the model can derive new deterministic feature representations from training data and produce superior results for detecting the depression level of twitter users. our proposed model uses a bigru, a network that can capture distinct and latent features, as well as long-term dependencies and correlations across the feature matrix. a bigru is designed to use backward and forward contextual information in text, which helps obtain a user's latent features from their various behaviours, using reset and update gates in a hidden layer in a robust way. in general, gru-based models have shown better effectiveness and efficiency than other recurrent neural networks (rnns) such as the long short-term memory (lstm) model [8]. capturing contextual patterns bidirectionally helps obtain a representation of a word based on its context, which means that under different contexts a word can have different representations. this is indeed more powerful than techniques such as the traditional unidirectional gru, where one word has only a single representation. motivated by this, we add a bidirectional network to the gru that can effectively learn from multi-modal features and provide a better understanding of context, which helps reduce ambiguity. besides, the bigru can extract more distinct features and helps improve the performance of our model. the bigru model can capture contextual patterns very well but lacks the ability to automatically learn the right features for the model, which play a crucial role in predictive performance.
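the gru update with reset and update gates, run in both directions, can be sketched in plain numpy. this is a toy illustration of the mechanism, not the authors' implementation: all weights are random, biases are omitted, and the dimensions are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h_prev, W_z, U_z, W_r, U_r, W_h, U_h):
    """One GRU step. z: update gate, r: reset gate, h_tilde: candidate state."""
    z = sigmoid(W_z @ x + U_z @ h_prev)            # how much of the past to keep
    r = sigmoid(W_r @ x + U_r @ h_prev)            # how much of the past to reset
    h_tilde = np.tanh(W_h @ x + U_h @ (r * h_prev))
    return (1 - z) * h_tilde + z * h_prev

def bigru(seq, dim, rng):
    """Run a toy bidirectional GRU over `seq` and concatenate the two
    final hidden states, giving one fixed-length representation."""
    params = [rng.standard_normal((dim, dim)) * 0.1 for _ in range(12)]
    fwd, bwd = np.zeros(dim), np.zeros(dim)
    for x in seq:                       # forward pass
        fwd = gru_cell(x, fwd, *params[:6])
    for x in reversed(seq):             # backward pass
        bwd = gru_cell(x, bwd, *params[6:])
    return np.concatenate([fwd, bwd])

rng = np.random.default_rng(0)
seq = [rng.standard_normal(8) for _ in range(5)]   # 5 steps of 8-dim features
h = bigru(seq, dim=8, rng=rng)                     # 16-dim user representation
```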
to this end, we introduce a one-dimensional cnn as a feature extractor to classify user timeline posts. our full model can be regarded as a hybrid deep learning model in which there is an interplay between a bigru and a cnn during model training. there are some existing models that have combined cnn and birnn models: for instance, in [63] the authors combine a bilstm or bigru with a cnn to learn better features for text classification, using an attention mechanism for feature fusion, which is a different modelling paradigm from the one introduced in this work, which captures the multi-modalities inherent in the data. in [62], the authors propose a hybrid bigru and cnn model that later constrains the semantic space of sentences with a gaussian. while the modelling paradigms may be closely related in combining a bigru and a cnn, their model is designed to handle sentence sentiment classification rather than depression detection, which is a much more challenging task since tweets in our problem domain are short, largely noisy and ambiguous sentences. in [53], the authors propose a combination of bigru and cnn models for salary detection but do not exploit multi-modal and temporal features. finally, we also studied the performance of our model when using the two attributes, word embeddings and multi-modalities, separately. we found that model performance deteriorated when we used only multi-modal features, and we further show that combining the two attributes led to better performance. to summarize, our study makes the following contributions: (1) we propose a novel depression detection framework by deep learning the textual, behavioural, temporal and semantic modalities from social media. (2) we employ a gated recurrent unit to detect depression using several features extracted from user behaviours.
(3) we build a cnn network to classify user timeline posts, concatenated with a bigru network, to identify social media users who suffer from depression. to the best of our knowledge, this is the first work to use the multi-modalities of topical, temporal and semantic features jointly with word embeddings in deep learning for depression detection. (4) the experimental results obtained on a real-world tweet dataset show the superiority of our proposed method compared to baseline methods. the rest of our paper is organized as follows. section 2 reviews the work related to our paper. section 3 presents the dataset used in this work and the different pre-processing steps we applied to the data. section 4 describes the two different attributes that we extracted for our model. in section 5, we present our model for depression detection. section 6 reports experiments and results. finally, section 7 concludes this paper. in this section, we discuss closely related literature and mention how it differs from our proposed method. in general, just like our work, most existing studies focus on user behaviour to detect whether a user suffers from depression or any mental illness. we also discuss other relevant literature covering word embeddings and hybrid deep learning methods that have been proposed for detecting mental health issues from online social networks and other resources, including public discussion forums. since we also introduce the notion of latent topics in our work, we cover relevant related literature on topic modelling for depression detection, which has been widely studied. data present in social media is usually in the form of information that users share for public consumption, which also includes related metadata such as user location, language and age, among others [20]. in the existing literature, there are generally two steps to analyzing social data.
the first step is collecting the data generated by users on networking sites, and the second step is to analyze the collected data using, for instance, a computational model or manual inspection. in any data analysis, feature extraction is an important task because, using only a small set of relevant features, one can learn a high-quality model. understanding depression on online social networks can be carried out using two complementary approaches which are widely discussed in the literature:
• post-level behavioural analysis
• user-level behavioural analysis
methods based on post-level analysis mainly target the textual features of a user post, extracted in the form of statistical knowledge such as count-based features [21]. these features describe the linguistic content of the post, as discussed in [9, 19]. for instance, in [9] the authors propose a classifier to understand the risk of depression. concretely, the goal of the paper is to estimate the risk of depression for a user from their social media posts. to this end, the authors collect data from social media for the year preceding the onset of depression from user profiles and distil behavioural attributes relating to social engagement, emotion, language and linguistic styles, ego networks, and mentions of antidepressant medications. the authors collected their data using a crowd-sourcing task on amazon mechanical turk, which is not a scalable strategy. in their study, the crowd workers were asked to undertake a standardized clinical depression survey, followed by various questions on their depression history and demographics. while the authors have conducted thorough quantitative and qualitative studies, their approach is disadvantageous in that it does not scale to a large set of users and does not consider text-level semantics such as latent topics and semantic analysis using word embeddings.
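the count-based post-level textual features mentioned above can be illustrated with a minimal extractor. the cue-word lexicon here is a hypothetical stand-in, not the one used in any of the cited works:

```python
from collections import Counter

# a small, hypothetical lexicon of depression-related cue words
LEXICON = {"depressed", "sad", "alone", "tired", "hopeless"}

def post_features(post):
    """Count-based post-level features: token count, cue-word count,
    and the fraction of first-person-singular pronouns."""
    tokens = post.lower().split()
    counts = Counter(tokens)
    n = len(tokens)
    cues = sum(counts[w] for w in LEXICON)
    first_person = sum(counts[w] for w in ("i", "me", "my", "mine"))
    return {"length": n, "cue_words": cues,
            "first_person_ratio": first_person / n if n else 0.0}

feats = post_features("I feel so depressed and tired today")
```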
our work is both scalable and considers various features which are jointly trained using a novel hybrid deep learning model with a multi-modal learning approach. it harnesses high-performance graphics processing units (gpus) and, as a result, has the potential to scale to large sets of instances. in hu et al. [19], the authors also consider various linguistic and behavioural features on data obtained from social media. their underlying model relies on both classification and regression techniques for predicting depression, while our method performs classification, but on a large scale using a varied set of crucial features relevant to this task. to analyze whether a post contains positive or negative words and/or emotions, or the degree of adverbs, the authors of [49] used cues from the text, for example, "i feel a little depressed" and "i feel so depressed", capturing the usage of the word "depressed" in sentences that express two different feelings. the authors also analyzed post interactions (i.e., retweets, likes and comments on twitter). some researchers have studied post-level behaviours to predict mental problems by analysing tweets on twitter to find depression-related language. in [38], the authors developed a model to uncover meaningful and useful latent structure in a tweet. similarly, in [41], the authors monitored different symptoms of depression that are mentioned in a user's tweets. in [42], the authors study users' behaviour on both twitter and weibo. to analyze users' posts, they used linguistic features and a chinese-language psychological analysis system called textmind for sentiment analysis. one of the interesting post-level behavioural studies was done by [41] on twitter, by finding depression-relevant words, antidepressants, and depression symptoms. in [37] the authors used post-level behaviour for detecting anorexia; they analyze domain-related vocabulary such as anorexia, eating disorders, food, meals and exercise.
various features can be used to model users in social media, as they reflect overall behaviour across several posts. different from post-level features extracted from a single post, user-level features are extracted from several tweets posted at different times [49]. they also capture the user's social engagement on twitter across many tweets, retweets and/or interactions with other users. generally, the linguistic style of posts can be considered for feature extraction [19, 59]. the authors in [41] extracted six depression-oriented feature groups for a comprehensive description of each user from the collected dataset. they used the number of tweets and social interactions as social network features; for user profile features, they used the personal information users share on the social network. analysing user behaviour also looks useful for detecting eating disorders. in wang et al. [51], the authors extracted user engagement and activity features on social media, along with linguistic features of the users for psychometric properties. this resembles the settings described in [20, 37, 42], where the authors extracted 70 features from two different social networks (twitter and weibo), covering user profiles, posting times and user interaction features such as the numbers of followers and followees. another interesting work is [56], where the authors combine user-level and post-level semantics and cast their problem as a multiple-instance learning setup. the advantage of this method is that it can learn from user-level labels to identify post-level labels. there is extensive literature on using deep learning for detecting depression on the internet in general, ranging from tweets to traditional document collections and user studies. while some of these works could also fall into one of the categories above, we separately present these latest findings, which use modern deep learning methods.
the most closely related recent work to ours is [23], where the authors propose a cnn-based deep learning model to classify twitter users with respect to depression using multi-modal features. the framework proposed by the authors has two parts. in the first part, the authors train their model in an offline mode, exploiting features from bidirectional encoder representations from transformers (bert) [11] and visual features from images using a cnn model. the two kinds of features are then combined, just as in our model, for joint feature learning. there is then an online depression detection phase that considers user tweets and images jointly, with feature fusion at a later stage. in another recently proposed work [7], the authors use visual and textual features to detect depressed users on instagram rather than twitter. their model also uses the multi-modalities in data but is confined to instagram only. while the model in [23] showed promising results, it still has certain disadvantages. for instance, bert vectors for masked tokens are computationally demanding to obtain even during the fine-tuning stage, unlike our model, which does not have to train the word embeddings from scratch. another limitation of their work is that they obtain sentence representations from bert, which imposes a 512-token length limit where longer sequences are simply truncated, resulting in some information loss; our model supports a much longer sequence length, which we can tune easily because our model is computationally cheaper to train. we have proposed a hybrid model that considers a variety of features, unlike these works. while we have not specifically used visual features in our work, using a diverse set of crucial, relevant textual features is arguably more reasonable than using visual features alone. of course, our model has the flexibility to incorporate a variety of other features, including visual ones.
multi-modal features from text, audio and images have also been used in [64], where a new graph attention-based model embedded with multi-modal knowledge was proposed for depression detection. while they used a temporal cnn model, their overall architecture was evaluated on small-scale questionnaire data: their dataset contains 189 sessions of interactions ranging between 7 and 33 minutes (with an average of 16 minutes). since they have not evaluated their method on short and noisy social media data, it remains to be seen how it scales to such large collections. xezonaki et al. [57] propose an attention-based model for detecting depression from transcribed clinical interviews rather than from online social networks. their main conclusion was that individuals diagnosed with depression use affective language to a greater extent than those who are not going through depression. in another recent work [55], the authors discuss depression among users during the covid-19 pandemic using lstms and fasttext [28] embeddings. in [43], the authors also propose a multi-modal rnn-based model for depression prediction but apply their model to online user forum datasets. trotzek et al. [48] study the problem of early detection of depression from social media using deep learning, where they leverage different word embeddings in an ensemble-based learning setup. the authors even train a new word embedding on their dataset to obtain task-specific embeddings. while the authors used a cnn model to learn high-quality features, their method does not consider temporal dynamics coupled with latent topics, which we show to play a crucial role in overall quantitative performance. the general motivation of word embeddings is to find a low-dimensional representation of each word in the vocabulary that signifies its meaning in the latent semantic space.
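proximity in this latent semantic space is typically measured with cosine similarity. a minimal sketch with toy four-dimensional vectors standing in for learned embeddings (the words and values are illustrative only; real embeddings have hundreds of dimensions learned from text):

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity, the usual proximity measure in embedding space."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# toy vectors: semantically related words point in similar directions
emb = {
    "depression": np.array([0.9, 0.1, 0.0, 0.2]),
    "sadness":    np.array([0.8, 0.2, 0.1, 0.1]),
    "bicycle":    np.array([0.0, 0.1, 0.9, 0.3]),
}
sim_related = cosine(emb["depression"], emb["sadness"])     # close to 1
sim_unrelated = cosine(emb["depression"], emb["bicycle"])   # close to 0
```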
while word embeddings have been popularly applied in various domains of natural language processing [34] and information retrieval [61], they have also been applied in the domain of mental health issues such as depression. for instance, in [2], the authors study a few reddit communities (reddit is also used in [47]) that contain discussions of mental health struggles such as depression and suicidal thoughts. to better model the individuals who may have these thoughts, the authors propose to exploit the representations obtained from word embeddings, where related concepts group close to each other in the embedding space. the authors then compute the distance between a list of manually generated concepts to discover how related concepts align in the semantic space and how users perceive those concepts. however, they do not exploit multi-modal features, including topical features, in their space. farruque et al. [13] study the problem of creating word embeddings when data is scarce, for instance, for depressive language detection from user tweets. the underlying motivation of their work is to simulate a retrofitting-based word embedding approach [14], where they begin with a pre-trained model and fine-tune it on domain-specific data. gong et al. [16] proposed a topic modelling approach to depression detection using multi-modal analysis. they propose a novel topic model that is context-aware with temporal features. while the model produced satisfactory results on the 2017 audio/visual emotion challenge (avec), the method does not use a variety of rich features and could face scalability issues, because simple posterior inference algorithms such as those based on gibbs or collapsed gibbs sampling do not parallelize, unlike deep learning methods, or require sophisticated engineering to parallelize. twitter has been popularly regarded as an online social media resource that provides free data for mining tweets.
this is the reason for its popularity among researchers, who have widely used data from twitter; one can freely and easily download tweet data through its apis. however, researchers have generally followed two methods for obtaining twitter data:
• using an already existing dataset shared freely and publicly by others. the downside of such datasets is that they might be too old to learn anything useful in the current context; recency may be crucial in some studies, such as understanding the trends of a recently trending topic [22].
• crawling data using a vocabulary from a social media network, which is slow but yields fresh, relevant and reliable data reflecting the patterns currently being discussed on online social networks. this method takes time, both to collect relevant data and then to process it, given that resources such as twitter, which provide data freely, impose tweet download restrictions per user per day as a result of a fair usage policy applied to all users.
developing and validating the vocabulary terms used by users with mental illness is time-consuming but yields a reliable list of words with which reliable tweets can be crawled, reducing the number of false positives. the recent research conducted by the authors of [41] is one such work that has collected a large-scale dataset with reliable ground-truth labels, which we aim to reuse. we present the statistics of the data in table 1. the authors collected three complementary data sets:
• depression data set: each user is labelled as depressed based on their tweet content between 2009 and 2016. this includes 1,402 depressed users and 292,564 tweets.
• non-depression data set: each user is labelled as non-depressed; the tweets were collected in december 2016. this includes over 300 million active users and 10 billion tweets.
• depression-candidate data set: users are labelled as depression candidates, where a tweet was collected if it contained the word "depress". this includes 36,993 depression-candidate users and over 35 million tweets.
data collection mechanisms are often loosely controlled, leading to impossible data combinations (for instance, users labelled as depressed who have provided no posts), missing values, among others.

table 1. statistics of the large dataset collected by the authors in [41], which is used in this study.

                 depressed    non-depressed
no. of users     1,402        300 million
no. of tweets    292,564      10 billion

after the data has been crawled, it is still not ready to be used directly by a machine learning model due to the various noise still present in it; this is called the "raw data". the problem is even more exacerbated when the data has been downloaded from online social media such as twitter, because tweets may contain spelling and grammar mistakes, smileys, and other undesirable characters. therefore, a pre-processing strategy is needed to ensure satisfactory data quality so that the computational model can achieve reliable predictive analysis. the raw data used in this study has labels of "depressed" and "non-depressed" and is organised as follows:
users: this data is packaged as a json file for each user account describing details about the user, such as user id, number of followers, number of tweets, etc. note that json is a standard, popular data-interchange format which is easy for humans to read and write.
timeline: this data package contains files with several tweets along with the corresponding metadata, again in json format.
to further clean the data we used the natural language toolkit (nltk). this package has been widely used for text pre-processing [18] and in various other works, including for removing common words such as stop words from text [10, 20, 38]. we have removed such common words from users' tweets (such as "the", "an", etc.)
as these are not discriminative or useful enough for our model. these common words also increase the dimensionality of the problem, which can lead to the "curse of dimensionality" and may have an impact on the overall model efficiency. to further improve text quality, we also removed non-ascii characters, a step that has been widely used in the literature [59] . pre-processing and removal of noisy content helped us get rid of plenty of noise in the dataset, yielding high-quality, reliable data for this study. moreover, this distillation reduced the computational complexity of the model, because only informative data is eventually used in modelling. to further mitigate the issue of sparsity in the data, we excluded users who posted fewer than ten posts and users who have more than 5,000 followers; we therefore ended up with 2,500 positive users and 2,300 negative users. social media data conveys user contents, insights and emotions reflected in individuals' behaviour in the social network; it shows how users interact with their connections. in this work, we collect information from each user and categorize it into two types of attributes, namely the multi-modal attribute and word embedding, as follows: we introduce the multi-modal attribute type, where the goal is to calculate the attribute value corresponding to each modality for each user. we estimate that the dimensionality for all modalities of interest is 76; we mainly consider the four major modalities listed below and ignore two modalities due to missing values. these features are extracted for each user as follows: 4.1.1 social information and interaction. from this attribute, we extracted several features embedded in each user profile. these are features related to each user account, as specified by each feature name.
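the cleaning steps described above (stop-word removal and non-ascii stripping) can be sketched in a few lines of python. this is a minimal stand-in, not the paper's exact pipeline: the stop-word set below is a tiny illustrative subset of nltk's english list, used so the snippet stays self-contained.

```python
import re

# Tiny illustrative stand-in for NLTK's English stop-word list.
STOPWORDS = {"the", "an", "a", "is", "to", "and", "i", "of", "in", "it"}

def clean_tweet(text: str) -> str:
    """Remove non-ASCII characters and common stop words from a tweet."""
    # Drop non-ASCII characters (emojis are handled separately as features).
    text = text.encode("ascii", errors="ignore").decode("ascii")
    # Lower-case and keep only alphabetic tokens that are not stop words.
    tokens = re.findall(r"[a-z']+", text.lower())
    return " ".join(t for t in tokens if t not in STOPWORDS)

print(clean_tweet("The weather is gloomy 😞 and I feel empty"))
```

in the full pipeline, the same function would be applied to every tweet in each user's timeline before feature extraction.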
most of the features are directly available in the user data, such as the number of followers and friends, favourites, etc. moreover, the extracted features relate to user behaviour on their profile: for each user, we calculate their total number of tweets, the total length of all tweets and the number of retweets. we further calculate the posting time distribution for each user by counting how many tweets the user published during each of the 24 hours of a day; hence it is a 24-dimensional integer array. to get the posting time distribution, we extract two digits of hour information from each tweet, then go through all tweets of each user and track the count of tweets posted in each hour of the day. emojis allow users to express their emotions through simple icons and non-verbal elements and are useful for getting the reader's attention. emojis give a glance at the sentiment of any text or tweet, and they are essential for differentiating between positive and negative sentiment text [31] . user tweets contain a large number of emojis, which can be classified into positive, negative and neutral. for each of the positive, neutral and negative types, we count their frequency in each tweet; we then sum up the numbers over each user's tweets, so the final output is three values corresponding to the positive, neutral and negative emojis used by the user. we also consider valence-arousal-dominance (vad) features, which contain valence, arousal and dominance scores; in addition, we count first-person singular and first-person plural pronouns. using the affective norms for english words, vad scores for 1,030 words are obtained. we create a dictionary with each word as a key and a tuple of its (valence, arousal, dominance) scores as the value. next, we parse each tweet and calculate its vad score using this dictionary. finally, for each user, we add up the vad scores of the tweets by that user to calculate the vad score for that user.
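the 24-dimensional posting-time feature described above can be sketched as follows. the timestamp format is an assumption for illustration (an iso-like "yyyy-mm-dd hh:mm:ss" string with the two hour digits at positions 11-12); real twitter metadata would first need to be normalised to such a form.

```python
def posting_time_distribution(timestamps):
    """Build the 24-dimensional posting-time histogram for one user.

    `timestamps` are strings like "2016-12-01 22:15:03"; we extract the
    two hour digits and count how many tweets fall in each hour of the day.
    """
    hist = [0] * 24
    for ts in timestamps:
        hour = int(ts[11:13])  # the two digits of hour information
        hist[hour] += 1
    return hist

hist = posting_time_distribution(
    ["2016-12-01 22:15:03", "2016-12-02 22:40:11", "2016-12-02 09:05:59"]
)
print(hist[22], hist[9])  # two tweets in hour 22, one in hour 9
```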
topic modelling belongs to the class of statistical modelling frameworks that help discover abstract topics in a collection of text documents. it gives us a way of organizing, understanding and summarizing collections of textual information, and it helps find hidden topical patterns, where the number of topics is specified by the user a priori. it can be defined as a method for finding groups of words (i.e. topics) from a collection of documents that best represent the latent topical information in the collection. in our work, we applied unsupervised latent dirichlet allocation (lda) [4] to extract the latent topic distribution from user tweets. to calculate topic-level features, we first consider the corpus of all tweets of all depressed users. next, we split each tweet into a list of words and assemble all words in decreasing order of their frequency of occurrence, removing common english words (stopwords) from the list. finally, we apply lda to extract the latent distribution over k = 25 topics, where k is the number of topics; we have found experimentally that k = 25 is a suitable value. while there are tuning strategies, including ones based on bayesian non-parametrics [46] , we have opted for a simple, popular and computationally efficient approach that gives us the desired results. depression symptom counts. this feature is the count of depression symptoms occurring in tweets, as specified by the nine groups in the dsm-iv criteria for a depression diagnosis; the symptoms are listed in appendix a. we count how many times the nine depression symptoms are mentioned by the user in their tweets. the symptoms are specified as a list of nine categories, each containing various synonyms for the particular symptom. we created a set of seed keywords for all nine categories and, with the help of the pre-trained word embedding, extracted similar words to extend the list of keywords for each depression symptom.
furthermore, we scan through all tweets, counting how many times a particular symptom is mentioned in each tweet. we also focused on antidepressants: we created a lexicon of antidepressants from the "antidepressant" wikipedia page, which contains an exhaustive, regularly updated list of items, and counted the number of antidepressant names mentioned. the medicine names are listed in appendix b. word embeddings are a class of representation-learning models that find the underlying meaning of the words in a vocabulary in some low-dimensional semantic space. their underlying principle is to optimise an objective function that brings words which repeatedly occur together within a certain contextual window close to each other in the semantic space; a window size of 10 works well in many settings [34] . a remarkable ability of these models is that they can effectively capture various lexical properties of natural language, such as similarity between words and analogies among words. these models have become increasingly popular in the natural language processing domain and are commonly used as input to deep learning models. among the various word embedding models proposed in the literature, word2vec [27] is one of the most popular techniques; it uses shallow neural networks to learn word embeddings. word2vec is a predictive model for learning word embeddings from raw text that is also computationally efficient. it takes a large corpus of text as its input and generates a vector space, with a vector in the space allocated to each distinct word. word vectors are positioned in this space such that words sharing common contexts in the corpus are located near one another. to learn the semantic meaning of the words posted by depressed users, we add a new attribute to extract more meaningful features.
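the symptom-count feature described above can be sketched as follows. the keyword lists here are short illustrative stand-ins, not the paper's actual dsm-iv seed lists expanded with embedding similarities.

```python
# Illustrative stand-ins for the nine DSM-IV symptom keyword lists;
# the paper expands seed keywords using word-embedding similarities.
SYMPTOM_KEYWORDS = {
    "depressed_mood": ["depressed", "hopeless", "empty"],
    "insomnia": ["insomnia", "sleepless", "awake all night"],
    "energy_loss": ["exhausted", "no energy", "fatigue"],
}

def symptom_counts(tweets):
    """Count, per symptom, how many times its keywords appear in a user's tweets."""
    counts = {name: 0 for name in SYMPTOM_KEYWORDS}
    for tweet in tweets:
        text = tweet.lower()
        for name, keywords in SYMPTOM_KEYWORDS.items():
            counts[name] += sum(text.count(kw) for kw in keywords)
    return counts

tweets = ["I feel so empty and hopeless", "Sleepless again, exhausted"]
print(symptom_counts(tweets))
```

the antidepressant-lexicon count is built the same way, with the keyword list replaced by the medicine names from the wikipedia-derived lexicon.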
count features in the multi-modalities attribute are useful and effective for extracting features from plain text. however, they cannot effectively capture the underlying semantics, structure, sequence and meaning of tweets: while count features are based on the independent occurrence of words in a text corpus, they cannot capture the contextual meaning of words, which is effectively captured by word embeddings. motivated by this, we apply word embedding techniques to extract more meaningful features from every user's tweets and capture the semantic relationships within word sequences. we used the popular word2vec model [27] with a 300-dimensional set of word embeddings pre-trained on the google news corpus to produce a matrix of word vectors. the skip-gram model is used to learn word vector representations, which are characterised by low-dimensional real-valued representations for each word. this is usually done as a pre-processing stage, after which the learned vectors are fed into a model. in this section, we describe our hybrid model that learns from multi-modal features. while various hybrid deep learning models have been proposed in the literature, our method is novel in that it learns multi-modal features which include topical features, as shown in figure 1 . the joint learning mechanism learns the model parameters in a consolidated parameter space where different model parameters are shared during the training phase, leading to more reliable results; note that simple cascade-based approaches propagate errors from one stage to the next [65] . at the end of the feature extraction step, we obtain the training data in the form of an embedding matrix for each user, representing the user's timeline-posts attribute, together with a 76-dimensional vector of integers for each user, representing the multi-modalities attribute.
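the construction of the per-user embedding-matrix input can be sketched as below. the 4-dimensional vectors are toy stand-ins for the 300-dimensional pre-trained word2vec vectors, and the zero-vector handling of out-of-vocabulary words and padding is one common convention, not necessarily the paper's exact choice.

```python
# Toy 4-dimensional vectors standing in for the 300-dimensional
# word2vec vectors pre-trained on the Google News corpus.
EMBEDDINGS = {
    "feel":  [0.1, 0.2, 0.0, 0.3],
    "empty": [0.4, 0.1, 0.2, 0.0],
}
DIM = 4
UNK = [0.0] * DIM  # out-of-vocabulary words map to a zero vector

def embed(tokens, max_len=6):
    """Map a token sequence to a fixed-size (max_len x DIM) matrix,
    truncating long sequences and padding short ones with zero rows."""
    rows = [EMBEDDINGS.get(t, UNK) for t in tokens[:max_len]]
    rows += [[0.0] * DIM] * (max_len - len(rows))
    return rows

m = embed(["i", "feel", "empty"])
print(len(m), len(m[0]))  # fixed shape: 6 rows of 4 values
```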
due to the complexity of user posts and the diversity of user behaviour on social media, we propose a hybrid model that combines a cnn with a bigru to detect depression through social media, as depicted in figure 1 . for each user, the model takes two inputs, one for each attribute. first, the four-modality feature input, representing the user-behaviour vector, runs into the bigru, capturing distinct and latent features as well as long-term dependencies and correlations across the feature matrix. the second input represents each user's tweets; each word is replaced with its embedding and fed to the convolution layer to learn representation features from the sequential data. the outputs of the two attribute branches are concatenated into one single feature vector, which is fed into a sigmoid activation layer for prediction. in the following sections, we discuss the two existing separate architectures that are combined to yield a novel computational model for modelling spatial structures and multi-modalities: a cnn network to learn the spatial structure of user tweets, and a framework to extract latent features from the multi-modalities attribute followed by the application of a bigru. an individual user's timeline comprises semantic information and local features. recent studies show that cnns have been successfully used for learning strong, suitable and effective feature representations [24] ; these effective feature-learning capabilities make them an ideal choice for extracting semantic features from user posts. in this work, we apply a cnn network to extract semantic information features from user tweets. the input to our cnn network is the embedding matrix layer with a sentence matrix, where the sentence is treated as a sequence of words s = [w_1, w_2, w_3, . . . , w_W].
each word w_i ∈ R^{1×d} is one row vector of the embedding matrix R ∈ R^{W×d}, where d represents the dimension of each word vector and W represents the number of words in each user's posts. we cap the size of each user's sentence at 1,000 words, which corresponds on average to only ten tweets per user. note that this size is much larger than what has been used in other recent, closely related models based on bert. also, we can train our model on the dataset, which helps create representations specific to our dataset in a computationally less demanding way, unlike bert-based approaches, which are both computationally and financially expensive to train and fine-tune. the input layer is attached to the convolution layer by three convolutional layers that learn n-gram features capturing word order, thereby capturing crucial text semantics that usually cannot be captured by a bag-of-words model [52] . we use a convolution operation to extract features between words as follows:
c_n = f(w · x_{n:n+h−1} + b), (1)
where f is a nonlinear function, b denotes the bias and x_{n:n+h−1} is a window of h words; that is, the convolution is applied to windows of word vectors of size h. the network then creates a feature map according to the following equation:
c = [c_1, c_2, . . . , c_{W−h+1}]. (2)
the feature map output by the convolution layer is the input to the pooling layer, which is an important step for reducing the dimension of the space by selecting appropriate features. we use a max-pooling layer to calculate the maximum value over every feature-map patch, so the output of the pooling operation is
ĉ = max{c}. (3)
we add a recurrent layer to create a stack of deep learning components and optimize the results. the recurrent neural network (rnn) is a powerful network for processing input vectors in sequence, even if the data is non-sequential; models such as the bigru, gru and lstm fall in the class of rnns. the static attributes are input to the bigru.
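the convolution and max-pooling operations described above can be illustrated numerically in plain python. this is a toy with f = tanh, three 2-dimensional word vectors and arbitrary illustrative filter weights, not the trained network.

```python
import math

def conv_feature(window, weights, bias):
    """One convolution feature c_n = f(w . x_{n:n+h-1} + b), with f = tanh,
    computed from a window of h concatenated word vectors."""
    s = sum(w * x for w, x in zip(weights, window)) + bias
    return math.tanh(s)

def conv_and_maxpool(word_vectors, weights, bias, h=2):
    """Slide the filter over every window of h words, then max-pool the map."""
    feature_map = []
    for n in range(len(word_vectors) - h + 1):
        window = [v for vec in word_vectors[n:n + h] for v in vec]  # concatenate
        feature_map.append(conv_feature(window, weights, bias))
    return max(feature_map)  # max pooling keeps the strongest feature

# Three 2-dimensional word vectors; filter weights/bias are illustrative.
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled = conv_and_maxpool(words, weights=[0.5, -0.5, 0.25, 0.25], bias=0.0)
print(round(pooled, 4))
```

in the real model, many such filters are applied in parallel and each contributes one pooled value to the cnn branch's output vector.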
the gru is an alternative to the lstm that merges the forget gate and the input gate into a single update gate, making it computationally more efficient than an lstm network due to the reduced number of gates. a gru can effectively and efficiently capture long-distance dependencies between features, but a one-way, unidirectional gru captures historical information only partly. moreover, for our static attributes, we want to extract information about the behavioural semantics of each user. to this end, we apply a bigru that combines the forward and backward directions for every input feature, capturing the behavioural semantics in both directions. bidirectional models in general capture information about both the past and the future, which makes them more powerful than unidirectional models [11] . suppose the input representing a user's behaviour is x_1, x_2, . . . , x_n. a traditional unidirectional gru has the following form:
h_s = GRU(x_s, h_{s−1}). (4)
a bidirectional gru actually consists of two layers of grus, as in figure 2 , introduced to obtain forward and backward information; the hidden layer has two output values, one for the backward pass and one for the forward pass:
→h_s = GRU(x_s, →h_{s−1}), ←h_s = GRU(x_s, ←h_{s+1}), h_s = [→h_s ; ←h_s], (5)
where x_s is the input at step s, while →h_s and ←h_s represent the hidden states of the forward and backward grus at step s. each gru network is defined as follows. the gru calculates the update gate z_s at time step s,
z_s = σ(W_z x_s + U_z h_{s−1}), (6)
which helps the model decide how much information obtained from the previous step should be passed to the next step. the reset gate,
r_s = σ(W_r x_s + U_r h_{s−1}), (7)
is used to determine how much information from past steps needs to be forgotten. the gru uses the reset gate to save related information from the past in the candidate state, as depicted in equation 8:
h̃_s = tanh(W_h x_s + U_h (r_s ⊙ h_{s−1})). (8)
lastly, the model calculates h_s, which holds all the information and is passed down the network, as depicted in equation 9:
h_s = (1 − z_s) ⊙ h_{s−1} + z_s ⊙ h̃_s. (9)
after we obtain the latent features from each branch, we integrate and concatenate them into one feature vector, which is input to an activation function for classification as described below.
6 experiments and results
we compare our model with the following classification methods:
• ∼mdl: the multimodal dictionary learning model (mdl) detects depressed users on twitter [41] . it uses dictionary learning to extract latent features and a sparse representation of each user. since we cannot get access to all of the attributes used in [41] , we implement mdl in our own way.
• svm: support vector machines are a class of machine learning models for text classification that optimise a loss function so as to draw a maximum-margin separating hyperplane between two sets of labelled data, e.g., between positively and negatively labelled data [6] . this is one of the most popular classification algorithms.
• nb: naive bayes is a family of probabilistic algorithms based on applying bayes' theorem with the "naive" assumption of conditional independence between features [30] . while the suitability of the conditional-independence assumption has been questioned by various researchers, these models often give surprisingly good performance compared with many sophisticated models [45] .
for our experiments, we used the datasets described in section 3. they provide a large-scale dataset, especially for the labelled negative and candidate-positive users. after pre-processing and extracting information from the raw data, we obtained the following datasets for our experiments:
• number of users labelled positive: 5,899.
• number of tweets from positive users: 508,786.
• number of users labelled negative: 5,160.
• number of tweets from negative users: 2,299,106.
we then further excluded users who posted fewer than ten posts and users who have more than 5,000 followers, ending up with a final dataset consisting of 2,500 positive users and 2,300 negative users. we adopt an 80:20 ratio to split our data into training and test sets. we used pre-trained word2vec embeddings trained on the google news corpus, which comprises 3 billion words. we used python 3.6.3 and tensorflow 2.1.0 to develop our implementation. we set the embedding layer to be non-trainable so that the feature representations, e.g., word vectors and topic vectors, are kept in their original form. we used one hidden layer and a max-pooling layer of size 4, which gave better performance in our setting. for the optimization of both the bigru and cnn networks, we used the adam optimization algorithm. finally, we trained our model for 10 iterations with a batch size of 32. this number of iterations was sufficient for the model to converge, and our experimental results further cement this claim, as we outperform existing strong baseline methods. we employ the traditional information-retrieval metrics of precision, recall, f1 and accuracy, computed from the confusion matrix, to evaluate our model. a confusion matrix is a tabular summary used for evaluating classification performance; it is also called an error matrix because it shows the number of wrong predictions versus the number of right predictions in a tabulated manner. the important terminology associated with computing the confusion matrix is the following:
• p: an actual positive case, which is depressed in our task.
• n: an actual negative case, which is not depressed in our task.
• tn: the actual case is not depressed, and the prediction is not depressed as well.
• fn: the actual case is depressed, but the prediction is not depressed.
• fp: the actual case is not depressed, but the prediction is depressed.
• tp: the actual case is depressed, and the prediction is depressed as well.
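the standard computation of these metrics from confusion-matrix counts can be sketched as follows; the counts in the example are illustrative, not the paper's reported numbers.

```python
def metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # true-positive rate (sensitivity)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Illustrative counts for a held-out test set (not the paper's numbers).
acc, prec, rec, f1 = metrics(tp=425, tn=391, fp=69, fn=75)
print(round(acc, 3), round(prec, 3), round(rec, 3), round(f1, 3))
```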
based on the confusion matrix, we can compute the accuracy, precision, recall and f1 score as follows: accuracy = (tp + tn)/(tp + tn + fp + fn), precision = tp/(tp + fp), recall = tp/(tp + fn), and f1 = 2 · precision · recall/(precision + recall). in our experiments, we study our model's attributes, including the quantitative performance of our hybrid model; the multi-modalities attribute and the user's timeline semantic-features attribute are used jointly. after grouping user behaviour in social media into the multi-modalities attribute (mm), we evaluate the performance of the model. first, we examine the effectiveness of using the multi-modalities attribute (mm) only, with different classifiers. second, we show how the model's performance increases when we combine word embeddings with mm. we summarise the results in table 2 and figure 4 as follows:
• naive bayes obtains the lowest f1 score, which demonstrates that this model has less capability to classify tweets for depression detection when compared with the other existing models. the reason for its poor performance could be that the model is not robust enough to sparse and noisy data.
• the ∼mdl model outperforms svm and nb and obtains better accuracy than these two methods. since this is a recent model specifically designed to discover depressed users, it has captured the intricacies of the dataset well and learned its parameters faithfully, leading to better results.
• our proposed model improves depression detection by up to 6% in f1 score compared to the ∼mdl model, which suggests that our model outperforms a strong baseline. the reason our model performs well is primarily that it leverages a rich set of features that is jointly learned through consolidated parameter estimation, resulting in a robust model.
• we can also deduce from the table that our model consistently outperforms all existing strong baselines.
• furthermore, our model achieves the best performance with 85% in f1, indicating that combining the bigru for the multi-modal attribute with the cnn for the user-timeline semantic features is sufficient to detect depression on twitter.
to get a better look at our model's performance and how it classifies the samples, we used the confusion matrix; for this, we import the confusion-matrix module from sklearn, which generates the matrix for us. we visualize the confusion matrix, which shows, for each pair of actual and predicted classes, the percentage of samples. we can observe from figure 3 that our model effectively predicts non-depressed users (tn) and depressed users (tp). we have also compared the effectiveness of each of the two attributes of our model: to test the performance of the model with a single attribute, we build the model fed with each attribute separately and compare how it performs. first, we test the model using only the multi-modalities attribute; we can observe in figure 4 that the model performs less well when we use the bigru only. in contrast, the model performs better when we use only the cnn with the word-embedding attribute. this signifies that extracting semantic information features from user tweets is crucial for depression detection. although the model using only the word-embedding attribute outperforms the multi-modalities attribute, the true-positive rates for both attributes are close to each other, as the precision scores for the bigru and cnn show. finally, the model's performance increases when the cnn and bigru are combined, outperforming the use of either attribute independently. after depressed users are classified, we examine the most common depression symptoms among them. in figure 5 , we can see that symptom one (feeling depressed) is the most common symptom posted by depressed users.
this shows that depressed users expose and post their depressive mood on social media more than any other symptom. besides that, other symptoms such as energy loss, insomnia, a sense of worthlessness, and suicidal thoughts appear for more than 20% of the depressed users. to further investigate the five most influential symptoms among depressed users, we collected all the tweets associated with these symptoms. we then created a tag cloud [50] for each of these five symptoms to determine the frequent words related to each symptom and their importance, as shown in figure 6 , where larger-font words are relatively more important than the rest in the same cloud representation. each cloud gives an overview of the words that occur most frequently within one of these five symptoms. in this paper, we proposed a new model for detecting depressed users through social media analysis by extracting features from user behaviour and the user's online timeline (posts). we used a real-world data set of depressed and non-depressed users and applied it in our model. we proposed a hybrid model characterised by an interplay between the bigru and cnn models: we feed the multi-modalities attribute, which represents user behaviour, into the bigru, and the user's timeline posts into the cnn to extract semantic features. our results show that training this hybrid network improves classification performance and identifies depressed users, outperforming other strong methods. this work has great potential to be further explored in the future; for instance, we could enhance the multi-modalities features by using short-text topic modelling, e.g., proposing a new variant of the biterm topic model (btm) [58] capable of generating depression-associated topics as a feature extractor to detect depression.
moreover, recently proposed word representation techniques, also known as pre-trained language models, such as deep contextualized word representations (elmo) [35] and bidirectional encoder representations from transformers (bert) [11] , could be trained on a large corpus of depression-related tweets instead of using a pre-trained word embedding model. while such pre-trained language models introduce challenges because of the restrictions they impose on sequence length, studying them on this task would help to unearth their pros and cons. eventually, our future work aims to detect other mental illnesses in conjunction with depression, to capture the complex mental issues that can pervade an individual's life.
references
• diagnostic and statistical manual of mental disorders (dsm-5®)
• towards using word embedding vector space for better cohort analysis
• depressed individuals express more distorted thinking on social media
• latent dirichlet allocation
• methods in predictive techniques for mental health status on social media: a critical review
• libsvm: a library for support vector machines
• multimodal depression detection on instagram considering time interval of posts
• empirical evaluation of gated recurrent neural networks on sequence modeling
• predicting depression via social media
• depression detection using emotion artificial intelligence
• bert: pre-training of deep bidirectional transformers for language understanding
• a depression recognition method for college students using deep integrated support vector algorithm
• augmenting semantic representation of depressive language: from forums to microblogs
• retrofitting word vectors to semantic lexicons
• analysis of user-generated content from online social communities to characterise and predict depression degree
• topic modeling based multi-modal depression detection
• take two aspirin and tweet me in the morning: how twitter, facebook, and other social media are reshaping health care
• natural language processing methods used for automatic prediction mechanism of related phenomenon
• predicting depression of social media user on different observation windows
• anxious depression prediction in real-time social data
• rehabilitation of count-based models for word vector representations
• text-based detection and understanding of changes in mental health
• sensemood: depression detection on social media
• supervised deep feature extraction for hyperspectral image classification
• using social media content to identify mental health problems: the case of #depression in sina weibo
• mental illness, mass shootings, and the politics of american firearms
• advances in pretraining distributed word representations
• rethinking communication in the e-health era
• on discriminative vs. generative classifiers: a comparison of logistic regression and naive bayes
• borut sluban, and igor mozetič. deep learning for depression detection of twitter users
• depressive moods of users portrayed in twitter
• glove: global vectors for word representation
• deep contextualized word representations
• identifying health-related topics on twitter
• early risk detection of anorexia on social media
• beyond lda: exploring supervised topic modeling for depression-related language in twitter
• beyond modelling: understanding mental disorders in online social media
• dissemination of health information through social networks: twitter and antibiotics
• depression detection via harvesting social media: a multimodal dictionary learning solution
• cross-domain depression detection via harvesting social media
• multi-modal social and psycho-linguistic embedding via recurrent neural networks to identify depressed users in online forums
• detecting cognitive distortions through machine learning text analytics
• a comparison of supervised classification methods for the prediction of substrate type using multibeam acoustic and legacy grain-size data
• sharing clusters among related groups: hierarchical dirichlet processes
• understanding depression from psycholinguistic patterns in social media texts
• utilizing neural networks and linguistic metadata for early detection of depression indications in text sequences
• recognizing depression from twitter activity timelines
• tag clouds and the case for vernacular visualization
• detecting and characterizing eating-disorder communities on social media
• topical n-grams: phrase and topic discovery, with an application to information retrieval
• salary prediction using bidirectional-gru-cnn model
• world health organization
• estimating the effect of covid-19 on mental health: linguistic indicators of depression during a global pandemic
• modeling depression symptoms from social network data through multiple instance learning
• georgios paraskevopoulos, alexandros potamianos, and shrikanth narayanan. 2020. affective conditioning on hierarchical networks applied to depression detection from transcribed clinical interviews
• a biterm topic model for short texts
• semi-supervised approach to monitoring clinical depressive symptoms in social media
• survey of depression detection using social networking sites via data mining
• relevance-based word embedding
• combining convolution neural network and bidirectional gated recurrent unit for sentence semantic classification
• feature fusion text classification model combining cnn and bigru with multi-attention mechanism
• graph attention model embedded with multi-modal knowledge for depression detection
• medlda: maximum margin supervised topic models
• the depression and disclosure behavior via social media: a study of university students in china
appendix a. list of depression symptoms as per dsm-iv: (1) depressed mood. (2) diminished interest.
key: cord-034839-6xctzwng authors: bień-barkowska, katarzyna title: looking at extremes without going to extremes: a new self-exciting probability model for extreme losses in financial markets date: 2020-07-20 journal: entropy (basel) doi: 10.3390/e22070789 sha: doc_id: 34839 cord_uid: 6xctzwng forecasting market risk lies at the core of modern empirical finance. we propose a new self-exciting probability peaks-over-threshold (sep-pot) model for forecasting the extreme loss probability and the value at risk. the model draws from the point-process approach to the pot methodology but is built under a discrete-time framework. thus, time is treated as an integer value and the days of extreme loss could occur upon a sequence of indivisible time units. the sep-pot model can capture the self-exciting nature of extreme event arrival, and hence, the strong clustering of large drops in financial prices. the triggering effect of recent events on the probability of extreme losses is specified using a discrete weighting function based on the at-zero-truncated negative binomial (negbin) distribution. the serial correlation in the magnitudes of extreme losses is also taken into consideration using the generalized pareto distribution enriched with the time-varying scale parameter. in this way, recent events affect the size of extreme losses more than distant events. the accuracy of sep-pot value at risk (var) forecasts is backtested on seven stock indexes and three currency pairs and is compared with existing well-recognized methods. the results remain in favor of our model, showing that it constitutes a real alternative for forecasting extreme quantiles of financial returns. forecasting extreme losses is at the forefront of quantitative management of market risk. 
more and more statistical methods are being released with the objective of adequately monitoring and predicting large downturns in financial markets, which is a safeguard against severe price swings and helps to manage regulatory capital requirements. we aim to contribute to this strand of research by proposing a new self-exciting probability peaks-over-threshold (sep-pot) model with the advantage of being adequately tailored to the dynamics of real-world extreme events in financial markets. our model can capture the strong clustering phenomenon and the discreteness of times between the days of extreme events. market risk models that account for catastrophic movements in security prices are the focal point in the practice of risk management, as repetitive downturns in financial markets clearly demonstrate. this could hardly be more evident nowadays, as global equity markets have very recently reacted to the covid-19 pandemic with a plunge in prices and extreme volatility. the coronavirus fear resulted in panic sell-outs of equities, and the u.s. s&p 500 index plummeted 9.5% on 12 march 2020, experiencing its worst loss since the famous black monday crash in 1987. just 2, 4, 6, and 7 business days later, the s&p 500 index registered additional huge price drops of 12%, 5.2%, 4.3%, and 2.9%, respectively. at the same time, the toll that the covid-19 pandemic took on european markets was also unprecedented. for example, the german blue-chip index dax 30 plunged 12.2% on 12 march 2020, followed by further losses of 5.3%, 5.6%, and 2.1% after 2, 4, and 7 business days, respectively. the covid-19 aftermath is a real example that highlights the strong clustering property of extreme losses. one of the most well-recognized and widely used measures of exposure to market risk is the value at risk (var).
var summarizes the quantile of the gains and losses distribution and can be intuitively understood as the worst expected loss over a given investment horizon at a given level of confidence [1] . var can be derived as a quantile of an unconditional distribution of financial returns, but it is much more advisable to model var as a conditional quantile, so that it can capture the strongly time-varying nature of volatility inherent to financial markets. the volatility clustering phenomenon provides the reason for using the generalized autoregressive conditional heteroskedasticity (garch) models to derive the conditional var measure [2] . however, over the last decade, the conventional var models have been subject to massive criticism, as they failed to predict the huge repetitive losses that devastated financial institutions during the global crisis of 2007-2008. therefore, special emphasis is now placed on adequate modeling of extreme quantiles of the conditional distribution of financial returns rather than the distribution itself. one of the relatively recent and intensively explored approaches to modeling extreme price movements is a dynamic version of the pot model which relies on the concept of the marked self-exciting point process. unlike the garch-family models, pot-family models do not act on the entire conditional distribution of financial returns. instead, their focus moves to the distribution tails where, in order to account for their heaviness, the probability mass is usually approximated with the generalized pareto distribution. early pot models described the occurrence of extreme returns as realizations of an independent and identically distributed (i.i.d.) variable, which led to var estimates in the form of unconditional quantiles.
one of the first dynamic specifications of pot models that took into account the volatility clustering phenomenon and allowed economists to perceive var as a conditional quantile was the two-stage method developed in [3] . this method required estimating an appropriately specified garch-family model in the first stage and, in the second, fitting the pot model to the garch residuals. a new avenue for forecasting var was opened up when the point-process approach to pot models was released in [4] . this methodology was later extended in several publications [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] . the benefit of this approach is that it does not require prefiltering returns using garch-family estimates, while at the same time it can capture the clustering effects of extreme losses and maintain the merits of the extreme value theory. the point-process pot model approximates the time-varying conditional probability of an extreme loss over a given day with the help of a conditional intensity function that characterizes the arrival rate of such extreme events. the intensity function can either be formulated in the spirit of the self-exciting hawkes process [4, 5, [10] [11] [12] (which is extensively used in geophysics and seismology), in the form of the observation-driven autoregressive conditional intensity (aci) model [13] , or using the autoregressive conditional duration (acd) models [6] [7] [8] (the last two methodologies were very popular in the area of market microstructure and the modeling of financial ultra-high-frequency data [15] [16] [17] ). in all cases, the timing of extreme losses depends on the timing of extreme losses observed in the past. this study does not strictly rely on the above-mentioned point process approach to pot models. the discrete-time framework of our sep-pot model is motivated by the observation of real-world financial data measured daily, which is the most common frequency used in pot models of risk.
the empirical analysis put forward in this paper is based on the daily log returns of seven international stock indexes (i.e., cac 40 (france), dax 30 (germany), ftse 100 (united kingdom), hang seng (hong kong), kospi (korea), nikkei (japan), and s&p 500 (u.s.)) as well as the daily log returns of three currency pairs (jpy/usd, usd/gbp, usd/nzd). the daily log returns for the equity market were calculated from the adjusted daily closing prices downloaded from the refinitiv datastream database. the foreign exchange (fx) rates were obtained from the federal reserve economic data repository and are measured in the following units: japanese yen to one u.s. dollar (jpy/usd), u.s. dollars to one british pound (usd/gbp), u.s. dollars to one new zealand dollar (usd/nzd). extreme losses are defined as the daily negated log returns (log returns pre-multiplied by −1) whose magnitudes are larger than a sufficiently large threshold, u. figure 1 shows that for u corresponding to the 0.95-quantile of the unconditional distribution of negated log returns, the daily measurement frequency, and the broad set of financial instruments, the relative frequency mass of the time interval between subsequent extreme losses is concentrated on small integer values. indeed, about 45% of all such durations fall on the distinct discrete values of 1-5 days, and the most frequent time span between subsequent extreme losses is one day (about 12-13% of cases). the sep-pot model relates to the published work on the point-process approach to pot models but is consistent with the observed discreteness of threshold exceedance durations. thus, in our model, the values of the time variable are treated as indivisible time units upon which extreme losses can be observed. because extreme losses are clustered, the model incorporates a self-exciting component.
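the data preparation described above (negated log returns, a threshold u at the 0.95-quantile, and integer-valued durations between exceedance days) can be sketched in a few lines of python; the function name and the simulated student-t returns below are ours, for illustration only:

```python
import numpy as np

def threshold_exceedances(log_returns, quantile=0.95):
    """Extract extreme losses as negated log returns above a high threshold u,
    together with the discrete (in days) durations between them."""
    y = -np.asarray(log_returns)        # negated log returns: losses are positive
    u = np.quantile(y, quantile)        # threshold u = 0.95-quantile of negated returns
    event_days = np.flatnonzero(y > u)  # integer day indices t_i of extreme losses
    magnitudes = y[event_days] - u      # threshold exceedances y_bar = y_{t_i} - u
    durations = np.diff(event_days)     # discrete inter-exceedance times (days)
    return u, event_days, magnitudes, durations

# toy example with heavy-tailed (student-t) returns
rng = np.random.default_rng(0)
r = rng.standard_t(df=4, size=2000) * 0.01
u, days, mags, durs = threshold_exceedances(r)
print(len(days), durs[:5])
```

with a continuous return distribution, roughly 5% of the days land above the 0.95-quantile threshold, and the durations are positive integers by construction.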
accordingly, the extreme loss probability is affected by the series of time spans (in number of days) that have elapsed since all past extreme loss events. we apply a weighting function in the form of the at-zero-truncated negative binomial (negbin) distribution that allows the influence of previous extreme losses to decay over time. the functional form of the extreme loss probability in our sep-pot model is drawn from [18] , where a very similar specification was proposed to depict the self-exciting nature of terrorist attacks in indonesia and to forecast the probability of future terrorist attacks as a function of attacks observed in the past. inspired by this work, we check the adequacy of such a discrete-time approach in the framework of pot models of risk. to this end, we perform an extensive validation of the sep-pot model both in and out of sample and compare it with three widely-recognized var measures: one based on the self-exciting intensity (hawkes) pot model, one derived from the exponential garch model with skewed student's t distribution (skewed-t-egarch), and one delivered by the gaussian garch model. the results for var at high confidence levels (>99%) remain in favor of the sep-pot model, and hence, the model constitutes a real alternative for measuring the risk of large losses. section 2 outlines the point process approach to pot models, introduces the sep-pot model, and outlines the backtesting methods used for model validation. section 3 presents the empirical findings and discusses the extensive backtesting results. finally, section 4 concludes the paper and proposes areas for future research. let {y t } denote the stochastic process that characterizes the evolution of negated daily log returns on a financial asset, i.e., the daily log returns pre-multiplied by −1. the convention of using negated log returns legitimizes treating extreme losses as observations that fall into the right tail of the distribution.
more precisely, the extreme losses are defined as such positive realizations of y t that are larger than a sufficiently large threshold u. the magnitudes of extreme losses over a threshold u (i.e.,ȳ t = y t − u) will be referred to as the threshold exceedances. the time intervals between subsequent threshold exceedances will be referred to as threshold exceedance durations. let {t i , y t i } i∈{1,2,...,n} denote an observed sample path consisting of (1) the times when extreme losses are observed (i.e., 0 < t i < t i+1 ) and (2) the corresponding magnitudes of such losses (i.e., y t i ). if one pursued a continuous-time approach (i.e., assuming t ∈ r + ), the realized sequence {t i , y t i } i∈{1,2,...,n} of extreme returns with their locations in time can be treated as an observed trajectory of a marked point process. treating these instances of threshold exceedance as realizations of a random variable allows us to model the occurrence rate of extreme losses y t i at different time points {t i }, for example, days. an excellent introduction to the theory and statistical properties of point processes can be found in [19] . the crucial concept in point process theory is the conditional intensity function that characterizes the time structure of event locations, and hence, the evolution of the point process. the conditional intensity function is defined as λ(t|f t ) = lim s↓t e[n(t, s]|f t ]/(s − t), where n(t, s] denotes the number of events in (t, s]. note that the conditional intensity function can intuitively be treated as the instantaneous conditional probability of an event (per unit of time) immediately after time t. to account for the clustering of extreme losses, λ(t|f t ) depends on f t , the information set available at t, consisting of the complete history of event time locations and their marks (i.e., f t ≡ σ{(t i , y t i ), ∀i : t i ≤ t}).
if λ(t|f t ) were constant over time (i.e., λ(t|f t ) = λ), then for t i ∈ [0, ∞) the point process would correspond to a homogeneous poisson point process with an arrival rate λ. the notion of the conditional intensity facilitates the derivation of the conditional var measure. the var at a confidence level 1 − q (i.e., q ∈ (0, 1) denotes a var coverage level) represents the qth quantile of the conditional distribution of financial returns. after taking advantage of working with the negated log returns, and based on the notation introduced so far, the var (for a coverage level q) estimated for a day t + 1 can be derived from the equation pr(y t+1 > y q,t+1 |f t ) = q. hence, the var for a coverage rate q is equal to y q,t+1 , because the probability that a (negated) return exceeds the threshold value y q,t+1 over a day t + 1 is equal to q. this probability can be further rewritten as the product of: (1) the probability of an extreme loss arrival (i.e., a threshold exceedance) over day t + 1 (given f t ), and (2) the conditional probability that the size of this extreme loss is larger than y q,t+1 (given that an extreme loss was observed over day t + 1): pr(y t+1 > y q,t+1 |f t ) = pr(y t+1 > u|f t ) · pr(y t+1 > y q,t+1 |y t+1 > u, f t ). the early, classical pot model of the extreme value theory (evt) assumes that the financial return data are i.i.d. (the evt offers two major classes of models for extreme events in finance: (1) the block maxima method, which uses the largest observations from samples of i.i.d. data, and (2) the pot method, which is more efficient for practical application because it uses all large realizations of variables, provided that they exceed a sufficiently high threshold. a detailed exposition of these methods can be found in [20] .) hence, threshold exceedances are also i.i.d. and homogeneous-poisson distributed in time.
accordingly, the probability of observing a threshold exceedance over a given day t is constant and can be estimated as the proportion of threshold exceedances in the sample (i.e., n/t, where n is the number of threshold exceedances and t denotes the length of the time series of financial returns). by this logic, the standard pot model neglects repeated episodes of increased volatility and therefore also ignores the clustering property of extreme losses. as noted in [20] , the standard pot model is not directly applicable to financial return data. the more recent dynamic versions of the classical pot model found in several studies (i.e., [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] ) are directly motivated by the behavior of the non-homogeneous poisson point process, where the intensity rate of threshold exceedances, λ(t|f t ), can vary over time due to the temporal bursts in volatility. according to such a point process approach to pot models, the first factor on the left-hand side of equation (3) (i.e., the conditional probability of a threshold exceedance over day t + 1) can be derived based on the (time-varying) conditional intensity function as pr(n(t, t + 1] > 0|f t ) = 1 − exp(− ∫ t t+1 λ(v|f v ) dv), because the probability of no events in (t, s] (i.e., n(t, s] = 0) can be given as pr(n(t, s] = 0|f t ) = exp(− ∫ t s λ(v|f v ) dv) [21] . the pot models use the pickands-balkema-de haan theorem of evt, which allows us to approximate the second factor on the left-hand side of equation (3) (i.e., the conditional probability that y t+1 exceeds y q,t+1 , given that it surpassed the threshold u) using the generalized pareto distribution as pr(y t+1 > y q,t+1 |y t+1 > u, f t ) = 1 − f gp (y q,t+1 − u; σ, ξ), where f gp (·) denotes the cumulative distribution function of the generalized pareto (gp) distribution with the scale parameter σ ∈ r >0 and the shape parameter ξ ∈ r =0 . if ξ → 0, f gp (·) tends to the cumulative distribution function of an exponential distribution.
equations (3)-(5) provide the grounds for the derivation of var q,t+1 as var q,t+1 = u + (σ/ξ)[(q/p t+1 ) −ξ − 1], where p t+1 denotes the conditional probability of a threshold exceedance over day t + 1 given in equation (4). the dynamic versions of the pot models benefit from both (1) the point process theory, which allows for the time-varying intensity rate of threshold exceedances, and hence, the clustering of extreme losses, and (2) the evt, which allows us to account for the tail risk of financial instruments. thus, the forecasts of daily var can be time-varying and react to new information. (the early, classical pot models of evt assume a constant intensity, λ, and a constant scale parameter of the gp distribution for threshold exceedances, σ. accordingly, the var level is constant over time.) in empirical applications, appropriate dynamic specifications of selected components in equation (6) are needed. one possible way of specifying the time-varying conditional intensity function λ(t|f t ) is provided by the hawkes process [19] . the hawkes process belongs to the class of so-called self-exciting processes, where past events can accelerate the occurrence of future events. accordingly, the conditional intensity function is defined as λ(t|f t ) = μ + ∑ i:t i <t w(t − t i ), where μ ∈ r >0 denotes a constant and w(·) refers to a non-negative weighting function that captures the impact of past events (i.e., extreme-loss days). accordingly, each threshold exceedance at t i < t contributes an amount w(t − t i ) to the risk of an extreme loss at t. it is thus necessary to provide a convenient parametric functional form for w(·). the well-recognized weighting function that we apply in the empirical portion of this paper is an exponential kernel function, given as w(x) = α exp(−βx), where α ∈ r ≥0 , β ∈ r ≥0 are the parameters to be estimated. accordingly, λ(t|f t− ) is based on the summation of exponential kernel functions evaluated at the time intervals that start at the times of previous extreme losses (i.e., t i ) and last up to time t.
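as an illustrative sketch of the chain of reasoning above (hawkes-type intensity with an exponential kernel, one-day exceedance probability, gp tail inversion for var), one might write the following; the function names and parameter values are ours, and the one-day integral of the intensity is crudely approximated by its value at the start of the day:

```python
import numpy as np

def hawkes_intensity(t, event_times, mu, alpha, beta):
    """lambda(t|F_t) = mu + sum over past events of alpha * exp(-beta * (t - t_i))."""
    past = event_times[event_times < t]
    return mu + np.sum(alpha * np.exp(-beta * (t - past)))

def exceedance_prob(lam):
    """Pr(at least one exceedance over the next day) = 1 - exp(-integral of lambda);
    the integral over one day is approximated here by lam itself."""
    return 1.0 - np.exp(-lam)

def var_from_pot(p_exceed, q, u, sigma, xi):
    """Invert q = p_exceed * (1 - F_GP(VaR - u)), with GP survival (1 + xi*y/sigma)^(-1/xi)."""
    return u + (sigma / xi) * ((q / p_exceed) ** (-xi) - 1.0)

events = np.array([3.0, 10.0, 11.0, 12.0])   # days of past extreme losses (clustered)
lam = hawkes_intensity(13.0, events, mu=0.02, alpha=0.3, beta=0.5)
p = exceedance_prob(lam)
var99 = var_from_pot(p, q=0.01, u=0.02, sigma=0.01, xi=0.2)
print(lam, p, var99)
```

note how the recent cluster of events at days 10-12 raises the intensity, and with it the one-day exceedance probability and the var.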
the parameters α and β capture, correspondingly, the scale (i.e., the amplitude) and the rate of decay characterizing the influence of past events on the current intensity. the point process features the self-excitation property because the conditional intensity function rises instantaneously after an extreme loss is registered, which in turn triggers the arrival of subsequent events. this mechanism results in the clustering effect characterizing the location of extreme losses in time. the time-varying nature of the conditional intensity function results in the fluctuations of var (see equation (6)). on top of the clustering feature, the self-exciting intensity pot (i.e., sei-pot) model for var (c.f., equation (6)) can be further extended to account for the serial correlation in the magnitudes of the threshold exceedances. this can be achieved by providing an appropriate dynamic model for the scale parameter of the gp distribution in equation (5) . in the empirical portion of this paper we use the specification σ t = μ s + α s ∑ i:t i <t ȳ t i exp(−β s (t − t i )), where μ s ∈ r >0 , α s ∈ r ≥0 , β s ∈ r ≥0 denote the parameters to be estimated. accordingly, the threshold exceedance magnitude is affected by the sizes and times of past threshold exceedances. unlike standard pot models, where the times of threshold exceedances are assumed to follow a homogeneous poisson process and the magnitudes of threshold exceedances are assumed to be i.i.d. gp distributed, the dynamic point-process-based variants of the pot models allow for a time-varying intensity rate of threshold exceedances and a time-varying expected magnitude of these threshold exceedances. accordingly, the var is also time-varying. the interplay of fluctuations in λ(t|f t ) and in the scale parameter of the gp distribution for the threshold exceedances, σ t , elevates var in turbulent periods of financial turmoil and decreases its level during calm periods. hence, the var adjusts to observed market conditions.
in this section we introduce the self-exciting probability pot model that obeys the natural distinction between processes defined in discrete and continuous time. the structure of our model still draws from equation (3), but we treat time as if it were composed of indivisible distinct units (days). therefore, we refrain from approximating the conditional extreme loss probability using the conditional intensity function that characterizes the evolution of a point process in continuous time. because we formulate our model in discrete time, we can directly describe the conditional probability of an extreme loss over day t as p t = g(λ t ), where g(·) ∈ (0, 1) denotes a link function. one possible choice of specifying g(·) (cf., [18] ) is g(λ t ) = λ t /(1 + λ t ), where p t ∈ (0, 1) if λ t > 0. based on [18] , the conditional probability of an extreme loss arrival over day t can be described in a dynamic fashion that exposes the self-exciting nature of the sep-pot model as λ t = μ + α ∑ i:t i <t g(t − t i ), where μ ∈ r >0 denotes a constant determining a baseline probability, α ∈ r ≥0 determines the scale (amplitude) of the impact that the time location of the ith past extreme-loss event exerts on p t , and g(·) ≥ 0 denotes the weighting function (i.e., discrete kernel function) that makes the past extreme-loss events less impactful than the more recent events. we specify g(·) as the probability function of the at-zero-truncated negative binomial (negbin) distribution. the probability function of a negbin distribution is pr(u = x) = [γ(κ + x)/(γ(κ) x!)] (κ/(κ + ω)) κ (ω/(κ + ω)) x for x = 0, 1, 2, ..., where ω ∈ r >0 and κ ∈ r >0 are the parameters of the distribution, e(u) = ω and var(u) = ω + ω 2 /κ. for κ → ∞, the negbin distribution converges to a poisson distribution. for κ = 1, the geometric distribution is obtained. the at-zero-truncated negbin distribution was formerly used in high-frequency finance for modeling the non-zero price changes of financial instruments [22, 23] .
the probability function of the at-zero-truncated negbin distribution is given as g(x; ω, κ) = pr(u = x)/(1 − pr(u = 0)) for x = 1, 2, .... figure 2 illustrates the self-exciting property of the sep-pot model. the plots shown in the upper row depict the at-zero-truncated negbin kernel functions evaluated at the time distances to previously observed events (i.e., g(t − t i ) ∀i : t i < t). the impact of past events on p t diminishes with time and the shape of decay is determined by the parameters ω and κ. the scale of this impact is determined by α. the resulting conditional probability function of an extreme loss arrival is therefore based on the summation of the weighted kernel functions evaluated at all the backward recurrence times. the choice of an at-zero-truncated negbin distribution guarantees flexibility in the feasible shapes of the weighting function to properly reflect the dynamic properties of the data. as in existing dynamic extensions of the pot methodology, the threshold exceedance magnitudes in the sep-pot model are described using the generalized pareto distribution with a time-varying scale parameter. we specify this parameter as σ t = μ s + α s ∑ i:t i <t ȳ t i g s (t − t i ; ω s ), where μ s ∈ r >0 is a constant, α s ∈ r ≥0 is a scale parameter, and g s (x; ω s ) (for x = 1, 2, ...) denotes the nonnegative discrete weighting (kernel) function. for this purpose, we use the probability function of a geometric distribution with parameter ω s ∈ r >0 , because it constitutes a natural discrete counterpart to the exponential distribution used in the continuous-time framework of the sei-pot model (see equation (9)). hence, the magnitude of the threshold exceedance awaited at t is affected by the times and sizes of all previously observed threshold exceedances. the monotonically decaying weighting function allows distant events to affect the magnitudes of losses less than recent events do.
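a minimal sketch of the sep-pot probability mechanics described above, assuming the link g(λ) = λ/(1 + λ) and a negbin parameterized by its mean ω and dispersion κ (so that e(u) = ω and var(u) = ω + ω²/κ); all names and parameter values are illustrative, not the authors' implementation:

```python
import math

def negbin_pmf(x, omega, kappa):
    """NegBin pmf with mean omega and dispersion kappa (log-gamma form for stability)."""
    log_coef = math.lgamma(kappa + x) - math.lgamma(kappa) - math.lgamma(x + 1)
    return math.exp(log_coef) * (kappa / (kappa + omega)) ** kappa \
        * (omega / (kappa + omega)) ** x

def truncated_negbin_kernel(x, omega, kappa):
    """At-zero-truncated NegBin weight g(x) = pmf(x) / (1 - pmf(0)), x = 1, 2, ..."""
    return negbin_pmf(x, omega, kappa) / (1.0 - negbin_pmf(0, omega, kappa))

def sep_pot_probability(t, event_days, mu, alpha, omega, kappa):
    """Conditional extreme-loss probability p_t = lam/(1 + lam), with
    lam = mu + alpha * sum of truncated-NegBin weights at backward recurrence times."""
    lam = mu + alpha * sum(truncated_negbin_kernel(t - ti, omega, kappa)
                           for ti in event_days if ti < t)
    return lam / (1.0 + lam)

# probability on a quiet day vs. directly after a cluster of extreme losses
p_quiet = sep_pot_probability(10, [], mu=0.05, alpha=0.5, omega=2.0, kappa=1.0)
p_after = sep_pot_probability(10, [7, 8, 9], mu=0.05, alpha=0.5, omega=2.0, kappa=1.0)
print(p_quiet, p_after)
```

the truncated kernel sums to one over x = 1, 2, ..., so α alone controls the total excitation contributed by each past event.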
the sep-pot model assumes that the density function f y t (y t |f t−1 ), depicting the right tail of the distribution of the negated financial returns, has the form f y t (y t |f t−1 ) = [p t f gp (y t − u; σ t , ξ)] 1 {t=t i } [1 − p t ] 1−1 {t=t i } , which means that y t either surpasses the threshold u, i.e., belongs to the right tail of the distribution (1 {t=t i } = 1), and hence, is drawn from the generalized pareto distribution with probability p t , or does not belong to the distribution tail (1 {t=t i } = 0) with probability 1 − p t . this reasoning allows us to formulate the log-likelihood function of the sep-pot model as the sum of two log-likelihoods, ln l = ln l p + ln l gp , where ln l p collects the bernoulli terms 1 {t=t i } ln p t + (1 − 1 {t=t i } ) ln(1 − p t ) and ln l gp collects the gp log-densities of the threshold exceedances. the var for a coverage rate q forecasted for day t (based on the information up to and including day t − 1) can be derived from the sep-pot model from the condition q = p t (1 − f gp (var q,t − u; σ t , ξ)); hence, var q,t = u + (σ t /ξ)[(q/p t ) −ξ − 1]. the sep-pot model provides the grounds to derive not only the var, but also the expected shortfall (es). unlike the var, the es is a coherent risk measure. it represents the conditional expectation of loss given that the loss lies beyond the var [24] . accordingly, the es corresponding to a coverage rate q, forecasted for a day t based on the information set up to and including day t − 1, is defined as es q,t = e[y t |y t > var q,t , f t−1 ]. it can also be rewritten as es q,t = var q,t + e[y t − var q,t |y t > var q,t , f t−1 ]. the es can thus be derived based on the standard definition of the mean excess function for the gp distribution. for u′ > u, the mean excess function e(u′) corresponding to the gp distribution (where σ > 0, 0 < ξ < 1) is defined as e(u′) = (σ + ξ(u′ − u))/(1 − ξ); hence, the expected size of losses exceeding the threshold u′ is a linear function of u′ − u. the es forecasts from the sep-pot model can be derived by applying the definition of e(u′) to equation (23) and by specifying the scale parameter of the gp distribution, σ, according to equation (15) . this leads to es q,t = (var q,t + σ t − ξu)/(1 − ξ). we use four backtesting procedures to assess the accuracy of the var delivered by the sep-pot model.
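the var and es steps described above (var from q = p t · (1 − f gp (var − u)), es from the linear gp mean excess function) can be sketched as follows; this is an illustration consistent with the derivation in the text, not the authors' code, and it requires 0 < ξ < 1 for the es to be finite:

```python
def sep_pot_var(p_t, q, u, sigma_t, xi):
    """VaR for coverage q: solve q = p_t * (1 - F_GP(VaR - u; sigma_t, xi))."""
    return u + (sigma_t / xi) * ((q / p_t) ** (-xi) - 1.0)

def sep_pot_es(p_t, q, u, sigma_t, xi):
    """ES = VaR + mean excess of the GP distribution evaluated at the VaR level,
    i.e. ES = (VaR + sigma_t - xi * u) / (1 - xi); needs 0 < xi < 1."""
    var_q = sep_pot_var(p_t, q, u, sigma_t, xi)
    return (var_q + sigma_t - xi * u) / (1.0 - xi)

# illustrative values: 5% chance of an exceedance today, 1% VaR coverage
var_q = sep_pot_var(p_t=0.05, q=0.01, u=0.02, sigma_t=0.01, xi=0.2)
es_q = sep_pot_es(p_t=0.05, q=0.01, u=0.02, sigma_t=0.01, xi=0.2)
print(var_q, es_q)
```

by construction the es always lies above the var, since it averages only the losses beyond it.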
each of these methods refers to the notion of a var exceedance or a var violation, a binary indicator function i t defined as i t = 1 if y t > var q,t and i t = 0 otherwise. the backtesting is based on the comparison of forecasted daily var numbers with observed daily returns over a given period. a var exceedance occurs when an actual loss is larger than the var predicted for that day. if the sep-pot model were the true data generating process, then pr(i t = 1|f t−1 ) = q for all t, which implies that the var violations would be i.i.d. the first test that we consider is the widely used unconditional coverage (uc) test [25] , where the null hypothesis states that the proportion of var exceedances according to a risk model (i.e., π) matches the assumed coverage level for var (i.e., q): h 0 : π = q. the uc test is formulated as a likelihood ratio test which compares two bernoulli likelihood functions, lr uc = −2 ln[(1 − q) t−t 1 q t 1 ] + 2 ln[(1 − π̂) t−t 1 π̂ t 1 ], where π̂ = t 1 /t and t 1 denotes the number of var violations in the sample of t returns. asymptotically, as the number of observations t goes to infinity, the test statistic is distributed as χ 2 with one degree of freedom. the second test is the conditional coverage (cc) test, which not only verifies the correct coverage but also sheds light on the independence of var violations over time [26] . this test is designed to reject a var model when it produces either an incorrect proportion or clusters of exceedances. to this end, the process of var violations is described by a first-order markov model and the cc test is based on the estimated transition probabilities π̂ ij = t ij /(t i0 + t i1 ), where π 00 and π 01 denote, correspondingly, the conditional probability of no var violation and of a var violation (today), given that yesterday there was no var violation. analogously, π 11 and π 10 denote, correspondingly, the conditional probability of a var violation and of no var violation (today) directly after a var violation yesterday.
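the uc likelihood-ratio test just described can be sketched in a few lines; the 5% critical value 3.841 of the χ²(1) distribution is hard-coded for simplicity, and the function name is ours:

```python
import math

def kupiec_uc(violations, q):
    """Unconditional coverage LR test: H0: Pr(violation) = q.
    Compares the Bernoulli likelihood at q with the one at pi_hat = t1/t."""
    t = len(violations)
    t1 = sum(violations)          # number of VaR violations
    pi = t1 / t
    def loglik(p):
        ll = (t - t1) * math.log(1 - p)
        if t1 > 0:                # guard against log(0) when there are no violations
            ll += t1 * math.log(p)
        return ll
    lr = -2.0 * (loglik(q) - loglik(pi))
    return lr, lr > 3.841         # reject H0 at the 5% level, chi2(1)

# correct coverage (10 violations in 1000 days at q = 0.01) vs. far too many (50)
lr_ok, rej_ok = kupiec_uc([1] * 10 + [0] * 990, 0.01)
lr_bad, rej_bad = kupiec_uc([1] * 50 + [0] * 950, 0.01)
print(lr_ok, rej_ok, lr_bad, rej_bad)
```

when the empirical violation rate equals q exactly, the statistic is zero; a fivefold excess of violations is rejected decisively.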
as given in equation (27), the elements of the transition matrix are estimated with the actual proportions of var violations, where t ij , for i ∈ {0, 1}, j ∈ {0, 1}, is the number of (negated) returns with the indicator function i t equal to j directly following an indicator's value i. the cc null hypothesis states that the conditional probability of a var violation directly after another var violation is the same as the conditional probability of a var violation after no violation and, at the same time, is equal to the assumed coverage level for var (i.e., h 0 : π 01 = π 11 = q). the test statistic lr cc = −2 ln[(1 − q) t 00 +t 10 q t 01 +t 11 ] + 2 ln[(1 − π̂ 01 ) t 00 π̂ 01 t 01 (1 − π̂ 11 ) t 10 π̂ 11 t 11 ] is, asymptotically, as the number of observations t goes to infinity, distributed as a χ 2 with two degrees of freedom. however, because the cc test is built on the markov property of the violation process, it is sensitive to dependence of order one only. therefore, the cc test cannot be used to verify whether the current var exceedance depends on the sequence of states that preceded the last one. the next two backtesting methods shed more light on the higher-order autocorrelation in the process of var violations. they also allow us to conclude whether the violations are affected by some previously observed explanatory variables. the first of them is the dynamic quantile (dq) test [27] , based on the hit function hit t = i t − q. the correctly specified var model should form the hit t sequence with a mean value insignificantly different from 0, because hit t equals 1 − q each time y t is larger than the daily var, and −q otherwise. moreover, there should be no correlation between the current and the lagged values of the hit t sequence or between the current values of the hit t sequence and the current var. if the risk model corresponds to the true data generating process, the conditional expectation of hit t should be 0 given any information known at t − 1.
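the cc test based on the first-order markov transition counts can be sketched as follows; the χ²(2) 5% critical value 5.991 is hard-coded, and the function name is illustrative:

```python
import math

def christoffersen_cc(violations, q):
    """Conditional coverage LR test on a first-order Markov chain for the
    violation process; H0: pi_01 = pi_11 = q."""
    seq = list(violations)
    t00 = t01 = t10 = t11 = 0     # transition counts t_ij: state i -> state j
    for prev, cur in zip(seq[:-1], seq[1:]):
        if prev == 0 and cur == 0:   t00 += 1
        elif prev == 0 and cur == 1: t01 += 1
        elif prev == 1 and cur == 0: t10 += 1
        else:                        t11 += 1
    pi01 = t01 / (t00 + t01) if (t00 + t01) else 0.0
    pi11 = t11 / (t10 + t11) if (t10 + t11) else 0.0
    def ll(p, n0, n1):            # Bernoulli log-likelihood with n0 zeros, n1 ones
        out = 0.0
        if n0: out += n0 * math.log(1 - p)
        if n1: out += n1 * math.log(p)
        return out
    ll_markov = ll(pi01, t00, t01) + ll(pi11, t10, t11)
    ll_null = ll(q, t00 + t10, t01 + t11)
    lr = -2.0 * (ll_null - ll_markov)
    return lr, lr > 5.991         # reject H0 at the 5% level, chi2(2)

# clustered violations: correct overall rate (50/1000 at q = 0.05) but in pairs
clustered = ([1, 1] + [0] * 38) * 25
lr_cc, reject = christoffersen_cc(clustered, 0.05)
print(lr_cc, reject)
```

the example shows why the markov structure matters: the overall violation rate is exactly 5%, so a pure coverage test would pass, yet the pairwise clustering is rejected.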
the dq test that we use in the empirical section of our paper can be derived as the wald statistic from the auxiliary regression hit t = φ 0 + φ 1 hit t−1 + φ 2 hit t−2 + φ 3 hit t−3 + φ 4 hit t−4 + φ 5 var q,t + ε t . the null hypothesis states that the current value of the hit function (i.e., hit t ) is not correlated with its four lags and the forecasted var (i.e., var q,t , which is based on information known at t − 1); thus h 0 : φ j = 0 ∀j ∈ {0, ..., 5}. hence, the null hypothesis states that the coverage probability produced by a risk model is correct (i.e., φ 0 = 0) and none of the five explanatory variables affects hit t . the dq test statistic, dq = hit′x(x′x) −1 x′hit/(q(1 − q)), is asymptotically χ 2 distributed with six degrees of freedom, where hit denotes a t × 1 vector with observations of the hit t variable and x denotes the standard t × 6 matrix containing a column of ones and observations on the five explanatory variables at times t = 1, ..., t, according to the regression given in equation (30). the dynamic logit test of conditional coverage might be treated as an extension of the dq conditional coverage test [28] . this method takes into account the dichotomous nature of var violations. accordingly, instead of the linear regression given by equation (30) , this test is established based on the dynamic logit model for i t : e[i t |f t−1 ] = pr(i t |f t−1 ) = f(a t ), where f(·) denotes the cumulative distribution function of a logistic distribution and a t is specified autoregressively as in equation (32). the autoregressive structure of equation (32) allows us to better capture the dependence of the var violation probability upon possible explanatory factors. the null hypothesis states that the coverage probability delivered by a risk model corresponds to the assumed coverage rate for var (i.e., φ 0 = f −1 (q)) and that none of the regressors used in equation (32) causes an incidence of var violation. the test statistic can be established as a likelihood ratio test statistic.
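the dq wald statistic described above can be sketched directly from its definition; the function name, lag count, and the simulated demo data are ours, for illustration:

```python
import numpy as np

def dq_test(violations, var_series, q, lags=4):
    """Dynamic quantile Wald statistic: regress hit_t = I_t - q on a constant,
    its `lags` lagged values, and the current VaR; under H0 the statistic
    hit'X (X'X)^-1 X'hit / (q(1-q)) is asymptotically chi2(lags + 2)."""
    hit = np.asarray(violations, dtype=float) - q
    var_series = np.asarray(var_series, dtype=float)
    rows = []
    for s in range(lags, len(hit)):
        # regressors: intercept, hit_{t-1}, ..., hit_{t-lags}, VaR_t
        rows.append([1.0, *hit[s - lags:s][::-1], var_series[s]])
    x = np.array(rows)
    y = hit[lags:]
    xtx_inv = np.linalg.inv(x.T @ x)
    return (y @ x @ xtx_inv @ x.T @ y) / (q * (1.0 - q))

# demo: i.i.d. violations at the nominal rate with a mildly varying VaR series
rng = np.random.default_rng(1)
violations = (rng.random(1000) < 0.05).astype(float)
var_series = 0.02 + 0.01 * rng.random(1000)
dq = dq_test(violations, var_series, 0.05)
print(dq)
```

the statistic is a nonnegative quadratic form; for a well-specified model it should be small relative to the χ²(6) critical value.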
accordingly, it requires estimating the model given by equation (32) and comparing its empirical log likelihood, ln l f , with the restricted log likelihood under the null, ln l r . under the null, the lr test statistic lr = −2(ln l r − ln l f ) is χ 2 distributed with four degrees of freedom. in our empirical study we use daily log-returns from seven major stock indexes worldwide (cac 40, dax 30, ftse 100, hang seng, kospi, nikkei, and s&p 500) and three currency pairs (jpy/usd, usd/gbp, usd/nzd). the cac 40, dax 30, and ftse 100 are the major equity indexes in france, germany, and the u.k., respectively, and they are often perceived as proxies or real-time indicators for the much broader european stock market. the hang seng, kospi, and nikkei represent the investment opportunities on the largest asian equity markets in hong kong, south korea, and japan, respectively. the s&p 500 constitutes a widely-investigated benchmark stock index reflecting the state of the overall u.s. economy. these seven indices monitor the state of the international equity market in its three global financial centers: western europe, eastern asia, and the u.s. as far as the selection of the fx rates is concerned, according to [29] , the jpy/usd and usd/gbp are the second and third most traded currency pairs in the world, after eur/usd (we did not investigate the eur/usd currency pair due to a much smaller number of observations compared to the other time series; the euro was launched on 1 january 1999). the nzd/usd, often nicknamed the kiwi by fx traders, is a classical example of a commodity currency pair that co-fluctuates with the world prices of primary commodities (new zealand exports oil, metals, dairy, and meat products). the new zealand dollar is also treated by international investors as a carry trade currency; therefore, it is very sensitive to interest rate risk.
for each of these financial instruments we split the data, spanning a period of almost four decades, into: (1) the in-sample data (i.e., 2 january 1981-31 december 2014), dedicated to the estimation and evaluation of our models, and (2) the out-of-sample data (i.e., 2 january 2015-31 march 2020), which is reserved for var backtesting purposes. for each of the time series, the initial threshold u was set as the 95%-quantile of the in-sample unconditional distribution of negated log returns. hence, the 5% largest negated returns were defined as extreme losses, which means that, on average, an extreme loss can be observed with probability 0.05. the selection of the threshold value u was a compromise between (1) the desired number of observations in the tail of the distribution, to reduce noise and to ensure stability in parameter estimates (i.e., the lower the u, the more observations used for estimation), and (2) the goodness of approximation of the threshold exceedance distribution with the gp distribution (i.e., the higher the u, the better the approximation with the gp distribution). the latter issue was addressed using two diagnostic tools that confirmed the adequate goodness-of-fit of the conditional gp distribution: the d-test proposed in ref. [30] and the χ2 test for uniformity of probability integral transforms (pit) based on the gp density estimates. the descriptive statistics of the cac 40, dax 30, and ftse 100 data are summarized in table 1 (analogous results for the remaining time series can be obtained from the author upon request). we see that for the cac 40, dax 30, and ftse 100, the threshold exceedances were obtained as the losses surpassing u equal to 0.021, 0.021, and 0.017, respectively. out of 8574 (cac 40), 8563 (dax 30), and 7826 (ftse 100) daily log returns in-sample, these threshold values yield, respectively, 429, 428, and 391 extreme losses that were used for model estimation purposes.
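the threshold-selection step described above can be sketched as follows; the helper (our own naming, not from the paper) negates the log returns and takes the in-sample 95%-quantile as u, returning the extreme-loss days and the excesses over the threshold.

```python
import numpy as np

def threshold_exceedances(log_returns, tail_prob=0.05):
    """Define extreme losses as the largest `tail_prob` fraction of
    negated log returns, i.e. losses above the (1 - tail_prob)-quantile u."""
    losses = -np.asarray(log_returns)            # negated log returns
    u = np.quantile(losses, 1.0 - tail_prob)     # in-sample threshold
    idx = np.flatnonzero(losses > u)             # extreme-loss days
    return u, idx, losses[idx] - u               # threshold, days, excesses
```

the excesses (losses minus u) are the observations to which the gp distribution would then be fitted.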
for the ftse 100 index, we have fewer observations (three years less: 1981-1983), because the in-sample period starts on 3 january 1984, when the ftse 100 index was established. although the official base date for the dax 30 index is 31 december 1987, the dax 30 index was linked with the former dax index, which dates back to 1959. the official base date for the cac 40 also begins on 31 december 1987, but between 2 january 1981 and 30 december 1987 it could be measured as the "insee de la bourse de paris." the threshold-exceedance durations cover a very wide range of observed values. for example, for the ftse 100 index, the range spans from one day (with a relative frequency of 12.8% in-sample and 11.3% out-of-sample) up to 304 days in-sample or 205 days out-of-sample. in-sample, the largest threshold exceedance, equal to 0.114, was observed on 20 october 1987, the day after black monday, and it corresponded to a 12.22% decrease of the index. out-of-sample, the maximum threshold exceedance is equal to 0.099 (a 10.87% plunge in the index) and was observed on the black thursday of 12 march 2020, a single day in a chain of stock market crashes induced by the covid-19 pandemic. realized gains and losses are measured over distinct days, and hence, the time spans between extreme losses comprise discrete time units (i.e., days). the scale of this phenomenon can be seen in the considerable proportion of threshold exceedance durations equal to one, two, or three (business) days. moreover, about 45% of such durations are less than or equal to five days and over 60% are less than or equal to ten days. another striking observation from table 1 is the clustering of extreme losses. large losses tend to occur in waves, which is seen from the ljung-box test statistics q(k) (where k ∈ {5, 10, 15}) for the lack of up to kth-order serial correlation.
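the duration statistics quoted above (relative frequencies of one-day, up-to-five-day, and up-to-ten-day gaps) follow directly from day differences between consecutive extreme-loss days. a minimal sketch, with our own helper names:

```python
import numpy as np

def exceedance_durations(exceed_days):
    """Durations (in business days) between consecutive extreme-loss days."""
    days = np.sort(np.asarray(exceed_days))
    return np.diff(days)

def duration_frequencies(durations, max_d=5):
    """Relative frequency of each duration 1..max_d, plus the tail share."""
    durations = np.asarray(durations)
    freqs = {d: float(np.mean(durations == d)) for d in range(1, max_d + 1)}
    freqs["longer"] = float(np.mean(durations > max_d))
    return freqs
```

for example, extreme losses on days 3, 4, 6 and 10 give durations of 1, 2 and 4 business days.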
these test statistics are significantly different from zero, and hence, the null hypothesis of no autocorrelation in threshold exceedance durations must be rejected. indeed, due to the covid-19 outbreak, between 24 february and 31 march 2020 (i.e., over 27 business days) the cac 40, dax 30, and ftse 100 suffered as many as 10 (cac 40 and dax 30) or 11 (ftse 100) extreme losses (with the shortest and longest threshold exceedance durations equal to only one and five business days, respectively). extreme-loss days tended to occur very close to each other, and this phenomenon is paralleled by the significant autocorrelation in the magnitudes of observed threshold exceedances. based on the ljung-box test results, the null hypothesis of no autocorrelation in the threshold exceedance sizes also needs to be rejected. the observed threshold exceedance durations are by their very nature discrete and feature strong positive autocorrelation. therefore, our sep-pot model is suitably tailored to this data. table 1. descriptive statistics for the threshold exceedance durations and the threshold exceedance magnitudes for the cac 40, dax 30, and ftse 100 indexes. (q(k) denotes the ljung-box test statistic for the lack of autocorrelation up to k-th order; q(k) ***, q(k) **, and q(k) * denote statistics significant at the 1%, 5%, and 10% levels). the sep-pot model was estimated by maximizing the log likelihood function given in equations (17)-(19). to this end, we used the constrained maximum likelihood (cml) library of the gauss mathematical and statistical system. the standard errors of the parameter estimates were derived from the asymptotic covariance matrix based on the inverse of the computed hessian. table 2 presents the estimation results for the cac 40, dax 30, and ftse 100 (analogous results for the remaining time series can be obtained from the author upon request).
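the q(k) statistics reported in table 1 follow the standard ljung-box formula, q(k) = n(n + 2) Σ_{lag=1..k} r_lag^2 / (n − lag), where r_lag is the sample autocorrelation at that lag. a minimal sketch:

```python
import numpy as np

def ljung_box_q(x, k):
    """Ljung-Box Q(k) statistic for no autocorrelation up to lag k;
    asymptotically chi^2(k) under the null for an iid series."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    denom = np.sum(xc ** 2)
    q = 0.0
    for lag in range(1, k + 1):
        r = np.sum(xc[lag:] * xc[:-lag]) / denom   # sample autocorrelation
        q += r ** 2 / (n - lag)
    return n * (n + 2) * q
```

a strongly autocorrelated series (e.g., a random walk) produces a much larger q(k) than an iid series of the same length, which is exactly the clustering signal discussed in the text.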
the parameter estimates responsible for the self-excitement mechanism, both in the probability of threshold exceedances (i.e., α, ω, κ) and in the magnitudes of these exceedances (i.e., α_s, ω_s), are highly statistically significant. the parameter estimates for the dax 30 and cac 40 indices look very much alike, especially for the conditional probability of threshold exceedances, which means that these two stock markets are closely related to each other. the obtained series for p̂_t, σ̂_t, and the estimated var_{0.01,t} are illustrated in figure 4. the extreme loss probability (i.e., p̂_t) features a strong self-excitation property because it reacts to extreme-loss days with abrupt increases and, if there are no further intervening events, it slowly drifts downward. in calm and prosperous periods of stock market history, the path of p̂_t rests at very low levels. however, in turbulent periods, when extreme-loss days are densely located, p̂_t tends to reach very high levels, and persistently elevated p̂_t values can be seen during such episodes. the observed fluctuations of p̂_t are accompanied by the strongly time-varying behavior of σ̂_t (i.e., the estimate of the dispersion parameter in the conditional distribution of threshold exceedances). losses exceeding u trigger upward jumps in both quantities, boosting the expected probability and the expected size of a threshold exceedance. for the cac 40 index, σ̂_t peaked at its highest level (0.059) on 15 may 1981, due to enormous panic and sell-offs on the paris bourse just days before françois mitterrand announced hostile reforms for the stocks quoted at the bourse. indeed, the preceding days saw the index plunge by over 30%.
the uk and german markets were mostly untouched by these french policy-oriented events, and the highest σ̂_t was registered on 27 october 1987 (ftse 100) and 29 october 1987 (dax 30), at the levels of 0.051 (ftse 100) and 0.042 (dax 30), just after a few huge price drops were observed, including the famous black monday on 19 october 1987. note that the maximum σ̂_t levels do not have to coincide with those of p̂_t, because σ̂_t is also affected by the magnitude of past threshold exceedances. for all data in this study, the highest out-of-sample σ̂_t levels were registered in the second half of march 2020. the self-triggering nature of p̂_t and σ̂_t gives rise to variations in the daily var, as shown in panel [c] of figure 4. what catches special attention is that the obtained path of var estimates tends to adjust to both periods of calm and turmoil in the history of equity markets: it quickly reacts to price jumps and bursts in volatility and accounts for persistent swings in stock prices. we verified whether the sep-pot model is appropriate for forecasting the daily var. to ensure a big-picture perspective on its usefulness in diverse practical applications, we derived the daily var levels for six assumed theoretical coverage rates (i.e., for q ∈ {0.05, 0.025, 0.01, 0.005, 0.0025, 0.001}) and compared them with the corresponding var numbers from three competing risk models (i.e., the self-exciting intensity (hawkes) pot model (sei-pot), the egarch(1,1) model with skewed-t distributed innovations, and the standard garch(1,1) model with normally distributed innovations). for the sake of a fair comparison between the four risk models under study, the accuracy of var forecasts was validated with four backtesting procedures. moreover, each of these statistical routines was applied separately to examine (1) the in-sample goodness-of-fit and (2) the out-of-sample accuracy.
considering the ten financial instruments under study, six coverage levels for var (q), and four models (sep-pot var, sei-pot var, skewed-t-egarch var, and gaussian garch var), we ended up with 240 var series in-sample and 240 series out-of-sample. therefore, for clarity of exposition, the backtesting results were summarized in the form of heatmap graphs (cf. figures 5-8). the heatmaps use a grid of colored rectangles across two axes, where the horizontal axis corresponds to the assumed var coverage level and the vertical axis corresponds to the financial instrument under study. the color of each rectangle (in shades of red and green) reflects the p-value of a backtesting procedure. white corresponds to a p-value equal to 0.05; darker shades of red indicate increasingly smaller p-values (below 0.05), and darker shades of green indicate increasingly larger p-values (above 0.05). for example, panel [a] of figure 5 presents the p-values corresponding to the uc test statistics. each of the four heatmaps in panel [a] refers to the var delivered by a different model: the sep-pot, sei-pot, skewed-t-egarch, and gaussian garch. according to the uc test results, the var models based on the sep-pot, sei-pot, and skewed-t-egarch produce, in-sample, a rather accurate proportion of violations. the best in-sample results were delivered by the skewed-t-egarch model; however, its superiority diminishes out-of-sample, where the skewed-t-egarch model failed in 13 out of 60 instances. out-of-sample, the null of correct coverage was rejected only three times for the sep-pot var and sei-pot var models. the egarch model seems to produce good var forecasts for high coverage levels (i.e., q = 0.05). for q < 0.05, the egarch var model lags behind the sei-pot var and sep-pot var models. as expected, the advantage of var models based on the pot methodology is most visible for the extreme quantiles.
as far as the gaussian garch var model is concerned, its performance is dramatically worse than that of the other risk models, both in-sample and out-of-sample. the model produces incorrect var forecasts for small q (i.e., q ≤ 0.025), which can be explained by the insufficient probability mass in the tails of the gaussian distribution. the results of the cc test, which checks both the correct coverage and the lack of dependence of order one in var violations, seem to support the sep-pot var model (cf. figure 6). the poorest fit corresponds to the highest q levels (i.e., q = 0.05), because in such cases the null of proper specification had to be rejected both in-sample and out-of-sample for the ftse 100, kospi, nikkei, and s&p 500. however, the sep-pot var model seems to be slightly superior to the sei-pot var model. in-sample, the sep-pot var model failed in only six instances out of 60. for the sei-pot var model, the number of failures was 10, and for the skewed-t-egarch var model it was nine. as in the case of the uc test, the cc test results indicate that the gaussian garch var model rendered the worst fit: the null was not rejected in only seven cases, mainly for the least extreme quantiles (i.e., for q = 0.05). out-of-sample, the sep-pot and sei-pot models deliver a similar quality of daily var forecasts, and both win over the garch-family models. turning our attention to figure 7, which illustrates the results of the dq test, the first striking observation is that a much larger area of all heatmaps is marked with shades of red compared to the results of the cc tests. indeed, the dq test is more demanding than the cc test because it checks not only whether a var violation today is uncorrelated with a var violation yesterday, but also whether var violations are affected by covariates from a wider information set; here we used the current var and the hit variable observations from one to four days ago (as in the original work [27]).
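the cc test referred to above is the standard christoffersen conditional coverage test, which combines an unconditional coverage part with a first-order independence part: lr_cc = lr_uc + lr_ind, asymptotically χ2(2). the sketch below (our own naming) implements that standard decomposition.

```python
import numpy as np

def _xlogy(x, y):
    """x * log(y) with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * np.log(y)

def christoffersen_cc(violations, q):
    """Conditional coverage LR statistic for a 0/1 violation series."""
    I = np.asarray(violations, dtype=int)
    n, x = len(I), int(I.sum())
    pi = x / n
    # unconditional coverage: observed rate pi vs. nominal rate q
    lr_uc = 2.0 * (_xlogy(x, pi) + _xlogy(n - x, 1 - pi)
                   - _xlogy(x, q) - _xlogy(n - x, 1 - q))
    # independence: first-order Markov transition counts
    t = np.zeros((2, 2), dtype=int)
    for a, b in zip(I[:-1], I[1:]):
        t[a, b] += 1
    n00, n01 = t[0]
    n10, n11 = t[1]
    pi01 = n01 / (n00 + n01) if n00 + n01 else 0.0
    pi11 = n11 / (n10 + n11) if n10 + n11 else 0.0
    pi2 = (n01 + n11) / (n - 1)
    ll_u = (_xlogy(n00, 1 - pi01) + _xlogy(n01, pi01)
            + _xlogy(n10, 1 - pi11) + _xlogy(n11, pi11))
    ll_r = _xlogy(n00 + n10, 1 - pi2) + _xlogy(n01 + n11, pi2)
    return lr_uc + 2.0 * (ll_u - ll_r)
```

both components are likelihood ratios evaluated at maximum likelihood estimates, so the statistic is nonnegative; a grossly miscalibrated model (violation rate far from q) inflates it sharply.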
the superiority of the sep-pot var model over its competitors is clearly visible. although the sep-pot var model has a clear tendency to mis-specify var at the highest q levels (i.e., q = 0.05), the dq test results for the sei-pot var and for the var based on the garch-family models are inferior. in-sample, the dq test rejected 14 sep-pot var models, 21 sei-pot var models, 26 skewed-t-egarch var models, and 57 (i.e., nearly all) gaussian garch var models. out-of-sample, the advantage of the sep-pot var model over the sei-pot var model is less vivid: the former failed in 12 instances and the latter in 14. figure 8 illustrates the results of the dynamic logit cc test. we can observe a systematic pattern as far as the sep-pot var and sei-pot var models are concerned. the area marked in red is concentrated on the left-hand side of the heatmaps both in- and out-of-sample, which means that var is mis-specified when derived for high coverage rates (i.e., q = 0.05). this deficit of the pot var models is offset by their accuracy at low q levels. indeed, for q ≤ 0.005 in-sample and for q ≤ 0.01 out-of-sample, the null could not be rejected for either pot model. the sep-pot var model was still slightly more successful than the remaining risk models. in-sample, it failed only 10 times (mainly for q = 0.05), whereas the sei-pot var model failed 18 times, the skewed-t-egarch model failed ten times, and the gaussian garch var model managed to pass this test only twice. out-of-sample, both pot var models were equally correct: for the sep-pot and sei-pot var models, the null of correct conditional coverage was rejected nine times. the dynamic logit cc test rejected the skewed-t-egarch model in 16 cases and the gaussian garch model in the majority of cases. the practical implications of the sep-pot model stem from its suitability to provide adequate var and es predictions. the var forecasts can be used by financial institutions as internal control measures of market risk.
the adequacy of the risk models used by financial institutions is of utmost importance for the market regulator. commercial banks have used var models for several years to calculate regulatory capital charges under the internal model-based approach of the basel ii regulatory framework. according to the more recent recommendations of the basel committee on banking supervision (bcbs), banks should use es to ensure a more prudent capture of "tail risk" and capital adequacy during periods of significant stress in the financial markets [31]. this approach remains in line with the core objective of the dynamic pot models (including the sep-pot model), as they focus on the quantification of both the forecasted probability and the expected size of huge losses, also producing time-varying es forecasts. the recent basel iii accord, comprising a set of regulations developed by the bcbs, further reinforces the role of bank units responsible for internal model validation. for more on the current regulatory framework of market risk management, see [32]. despite the recent shift from var to es models in the calculation of capital requirements, es forecasts remain highly sensitive to the quality of var predictions. all in all, our findings indicate that the sep-pot model constitutes a reasonably promising alternative for forecasting extreme quantiles of financial returns and the daily var, especially for very small coverage rates. undoubtedly, further examination of the theoretical properties of the sep-pot model and its forecasting accuracy is needed. the model should be backtested using other classes of financial instruments and compared against other extreme risk models. however, there is a plethora of var models in the literature; therefore, there is no small set of two or three candidate specifications against which the sep-pot model should be benchmarked and compared.
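the var-es link mentioned above is explicit in the pot framework. the sep-pot forecasts are time-varying, but as a static reference point the textbook gp-tail formulas (mcneil-frey style; this is standard evt, not the paper's dynamic model, and the function name is ours) show how an es forecast inherits the var forecast:

```python
def gpd_var_es(u, beta, xi, zeta_u, q):
    """VaR and ES implied by a generalized Pareto fit above threshold u.

    u      : loss threshold
    beta   : GP scale parameter
    xi     : GP shape parameter (0 < xi < 1 assumed here)
    zeta_u : empirical exceedance probability P(loss > u)
    q      : tail probability of interest, q < zeta_u
    """
    var_q = u + (beta / xi) * ((q / zeta_u) ** (-xi) - 1.0)
    es_q = var_q / (1.0 - xi) + (beta - xi * u) / (1.0 - xi)
    return var_q, es_q
```

for 0 < xi < 1 the implied es always exceeds the var at the same level, which is why es quality is "highly sensitive to the quality of var predictions."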
among the point process-based pot models alone, several variants have been put forward, including the acd-pot model (which is based on dynamic specifications of the time, i.e., the duration, that elapses between consecutive extreme losses [6-8]) and the aci-pot model (with its multivariate extensions), which provides an explicit autoregressive specification for the intensity function [13]. all these dynamic versions of pot models exploit both strands of the literature, the point process theory and the evt, accounting for the clustering of extreme losses and the heavy-tailedness of the loss distribution. the sep-pot model is also suitably tailored to these features, but it additionally accounts explicitly for the discreteness of the times between extreme losses. the empirical findings in this paper provide much support for our sep-pot model. however, further efforts should be focused on benchmarking and comparison with a broader range of methods under the same settings (i.e., the same data and the same period). we proposed a new self-exciting probability pot model for forecasting the risk of extreme losses. existing methods within the point process approach to pot models pursue a continuous-time framework and therefore involve the specification of an intensity function. our model is inspired by the leading research in this area but is grounded in observation of the real-world data, as we built our model in discrete time. hence, our model is a dynamic version of a pot model in which extreme losses may occur upon a sequence of indivisible time units (i.e., days). instead of delivering a new functional form for the conditional intensity of a point process, we propose its natural discrete counterpart: the conditional probability of experiencing an extreme event on a given day. this conditional probability is described in a dynamic fashion, allowing recent events to have a greater effect than distant ones.
thus, extreme losses arrive according to a self-exciting process, which allows their clustering properties to be captured realistically. the functional form of the conditional probability in the sep-pot model resembles the conditional intensity function used in etas models. however, we rely on discrete weighting functions based on the at-zero-truncated negative binomial (negbin) distribution to weight the influence of past events. our move toward the discrete-time setup is backed up by the descriptive analysis of the data. on average, nearly 45% of the time intervals between extreme-loss days fall on a set of discrete values ranging from one up to five days, and the shortest, one-day-long duration has a relative frequency of 12% (for the threshold u set equal to the 95%-quantile of the unconditional distribution of negated returns). accordingly, the motivation of the sep-pot model lies in allowing the data to speak for itself. by using the at-zero-truncated negbin distribution as a weighting function in the equation for the conditional probability of an extreme loss, we tailor the method to the specificity of the data. the conditional distribution for the magnitudes of threshold exceedances also remains in line with this approach. we specify the evolution of the threshold exceedance magnitudes in a self-exciting fashion utilizing a weighting scheme based on the geometric probability density function. accordingly, the sizes of more distant threshold exceedances have less effect on the current magnitudes of extreme losses than more recent events do. the backtesting results are in favor of the sep-pot var model. we used four backtesting procedures to check the practical utility of our approach for seven major stock indexes and three currency pairs, both in- and out-of-sample.
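the exact sep-pot recursions are not reproduced in this excerpt, so the sketch below is a toy illustration only: it uses the zero-truncated geometric pmf (the shape-one special case of the at-zero-truncated negbin weighting described above) to show how past extreme-loss days receive geometrically decaying weights.

```python
import numpy as np

def truncated_geometric_weights(p, k_max):
    """Zero-truncated geometric pmf w_k = p (1-p)^(k-1) for k = 1..k_max;
    a simple special case of the at-zero-truncated negbin weighting."""
    k = np.arange(1, k_max + 1)
    return p * (1.0 - p) ** (k - 1)

def excitation(event_days, t, p=0.3, k_max=60):
    """Toy self-excitation term at day t: the sum of weights over past
    extreme-loss days, so that recent events contribute more than old ones."""
    w = truncated_geometric_weights(p, k_max)
    total = 0.0
    for s in event_days:
        lag = t - s
        if 1 <= lag <= k_max:
            total += w[lag - 1]
    return total
```

an event one day ago contributes weight p, while an event ten days ago contributes only p(1-p)^9, reproducing the "recent events weigh more" property in discrete time.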
the out-of-sample period covered more than five years, including the series of catastrophic downswings in equity prices due to the covid-19 pandemic in march 2020. we compared the var forecasts delivered by the sep-pot model with three widely recognized alternatives: the self-exciting intensity (hawkes) pot var, the skewed-t-egarch var, and the gaussian garch var model. the outcomes of the backtesting procedures indicate that the sep-pot model for var is a good alternative to existing methods. the standard structure of the sep-pot model offers several interesting generalizations. for example, it is possible to explain the conditional probability of an extreme loss with some covariates. potential candidate explanatory variables include price volatility measures such as high-low price ranges and measures of realized volatility. for stock indexes, some valuable information can be found in volatility indexes such as the cboe volatility index (vix) for the u.s. equity market. in contrast to existing point process-based pot models, the merits of the sep-pot model lie in its discrete-time nature. indeed, the bernoulli log-likelihood function given in equation (18) makes it easy to update the information set in the sep-pot model on a regular, day-by-day basis. another interesting generalization of the sep-pot model would be to add a multi-excitation effect caused by different types of events. for example, the conditional probability of an extreme loss on one market could be additionally co-triggered by crashes observed on another market. finally, the contemporaneous spillover effect between different markets could be captured using multivariate extensions of the sep-pot model, for example based on extreme copula functions. these issues are left for further research.
value at risk: the new benchmark for managing financial risk
elements of financial risk management
estimation of tail-related risk measures for heteroscedastic financial time series: an extreme value approach
estimating value-at-risk: a point process approach
high-frequency financial data modeling using hawkes processes
intensity-based estimation of extreme loss event probability and value at risk
value at risk forecasts by extreme value models in a conditional duration framework
autoregressive conditional duration as a model for financial market crashes prediction
the modeling and forecasting of extreme events in electricity spot markets
modeling multivariate extreme events using self-exciting point processes
modelling interregional links in electricity price spikes
point process models for extreme returns: harnessing implied volatility
multivariate dynamic intensity peaks-over-threshold models
modeling extreme negative returns using marked renewal hawkes processes
econometrics of financial high-frequency data
autoregressive conditional duration: a new model for irregularly spaced transaction data
autoregressive conditional duration (acd) models in finance: a survey of the theoretical and empirical literature
self-exciting hurdle models for terrorist activity
an introduction to the theory of point processes
quantitative risk management: concepts, techniques and tools
statistical analysis of non-stationary series of events in a data base system; naval postgraduate school
modelling financial transaction price movements: a dynamic integer count data model
an inflated multivariate integer count hurdle model: an application to bid and ask quote dynamics
comparative analyses of expected shortfall and value-at-risk: their validity under market stress
techniques for verifying the accuracy of risk measurement models
evaluating interval forecasts
conditional autoregressive value at risk by regression quantiles
backtesting value-at-risk: from dynamic quantile to dynamic binary tests
bank for international settlements. triennial central bank survey. foreign exchange turnover
non-parametric specification tests for conditional duration models
basel committee on banking supervision. minimum capital requirements for market risk; bank for international settlements
funding: this research received no external funding. the authors declare no conflict of interest.

key: cord-048353-hqc7u9w3 authors: chis ster, irina; ferguson, neil m. title: transmission parameters of the 2001 foot and mouth epidemic in great britain date: 2007-06-06 journal: plos one doi: 10.1371/journal.pone.0000502 sha: doc_id: 48353 cord_uid: hqc7u9w3 despite intensive ongoing research, key aspects of the spatial-temporal evolution of the 2001 foot and mouth disease (fmd) epidemic in great britain (gb) remain unexplained. here we develop a markov chain monte carlo (mcmc) method for estimating epidemiological parameters of the 2001 outbreak for a range of simple transmission models. we make the simplifying assumption that infectious farms were completely observed in 2001, equivalent to assuming that farms that were proactively culled but not diagnosed with fmd were not infectious, even if some were infected. we estimate how transmission parameters varied through time, highlighting the impact of the control measures on the progression of the epidemic. we demonstrate statistically significant evidence for assortative contact patterns between animals of the same species. predictive risk maps of the transmission potential in different geographic areas of gb are presented for the fitted models. the 2001 fmd epidemic in the uk had a substantial cost in human, animal health and economic terms (alexandersen et al. [1], kao [2]).
understanding the risk factors underlying the transmission dynamics of that epidemic and evaluating the effectiveness of the control measures are essential to minimise the scale and cost of any future outbreak. epidemic modelling [3-7] proved critical to decision making about the control policies which were (in some cases controversially) adopted to control the 2001 epidemic [8-10]. modelling now has a 'peace-time' contingency planning role. one weakness of the modelling studies undertaken in 2001 was the relatively ad-hoc nature of the parameter estimation methods employed. in their first paper, ferguson et al. [4] used maximum likelihood methods to fit to the observed incidence time series, but did not attempt to fit to the spatio-temporal pattern of spread. in their later work, the same authors developed a more robust method for estimating species-specific susceptibility and infectiousness parameters and spatial kernel parameters (see the supplementary information to [3]), but at the time the statistical basis for the methods developed was lacking. in retrospect, the methods developed turned out to be closely related to those developed during the sars epidemic by wallinga and teunis [11], although the earlier work incorporated population denominator data to allow for spatial- and species-based heterogeneity in disease transmission. nevertheless, the methods employed had the limitation of not being fully parametric, meaning they could not be extended to fit arbitrary transmission models to the observed data. keeling et al. [5] used maximum likelihood methods to estimate transmission parameters, but supplemented them with more ad hoc least-squares matching to regional incidence time series. therefore there remains a need to develop rigorous modern statistical approaches for parameter estimation of non-linear models for the 2001 fmd outbreak.
bayesian markov chain monte carlo (mcmc) techniques are the best established such methods and have been successfully employed in the analysis of a range of spatio-temporal outbreak data in the past [12-14], as well as of purely temporal incidence data [15, 16]. here we develop mcmc-based inference models for the 2001 fmd epidemic in gb. the models examine the extent to which transmission was spatially localised, the temporal variation in transmission, species-specific variation in susceptibility and infectiousness, and heterogeneity in contact rates between and within species. we take the farm as the unit of our study and ignore the possible impact of within-farm epidemic dynamics. thus we implicitly assume disease spread within a farm is so rapid as to be practically instantaneous, with all animals on a farm becoming infectious at the same time. our data consist of information on all the farms in the uk listed in the 2000 agricultural census [see http://www.defra.gov.uk/footandmouth/cases/index.htm]. there were a total of 134,986 farms listed in that dataset, uniquely identified by their county/parish/holding (cph) number. their spatial coordinates are provided together with the number of animals by species within each farm. a partition of all gb farms according to the animal types represented is shown in figure 1a. their geographical distribution is represented in figure 1b as the number of farms per 5×5 km square. notice the high-density areas in the north west (cumbria), south west (devon), wales and scotland, where the main epidemic foci developed. there is also an area of high density in the shetland islands, corresponding to very small crofter smallholdings. figure 1c and d show the numbers of sheep and cattle kept per 5×5 km square. during the 2001 fmd outbreak, a total of 2026 infected premises (ips) were recorded: farms where fmd was diagnosed, and which were subsequently culled.
the ip dataset contains, for each farm, the estimated date of infection (determined by a clinical evaluation of the age of lesions on affected animals) and the dates of disease reporting, confirmation and culling. a total of 7457 other (non-ip) farms were also culled, mostly as contiguous premises (cps, about 3103) or dangerous contacts (dcs, about 1287), but some under other local culling policies used in cumbria and scotland. for instance, about 1846 (79%) of a total of 2342 sheep farms in cumbria had all sheep culled under the 'local 3 km radial sheep cull' policy adopted there. some of the farms (about 30) were recorded both as dcs and as cps. multiple records per farm were often found in the disease control management system dataset, and it was often unclear whether this was due to data entry errors or to sequential species-specific culls on the same farm. in our analysis we therefore considered the whole farm to be culled at the last recorded date of culling. the most frequent species are cattle and sheep (see figure 1a). fewer than 3% of farms have pigs only, and only 10 farms with just pigs were diagnosed as ips in 2001 (less than 1% of all the ips). this indicates a priori that pigs contributed far less to the 2001 outbreak than to many other fmd outbreaks (despite their high levels of shedding [1, 17]), and we therefore decided to exclude pigs-only farms from the current study to simplify the analysis. the sensitivity analysis section shows that this simplification does not significantly affect estimates of other epidemiological parameters. we discarded another three ips due to missing information or possible mistakes regarding their location or number of animals, leaving a total of 2013 ips in our analysed dataset. we model the epidemic as a space-time survival process [18]. the total observation time t is the 240 days between 7th february and 5th october 2001.
each farm i at the location (x_i, y_i) is associated with an infection time t_i (if infected), a removal time r_i (if slaughtered) and two integers n_i^c and n_i^s representing, respectively, the number of cattle and sheep on the farm. s_c and s_s represent per-capita cattle and sheep susceptibility, respectively, while i_c and i_s represent per-capita cattle and sheep infectivity. susceptibility is a relative measure of animal sensitivity to the disease, whereas infectivity represents the infectious risk posed by an animal to others. we use a continuous kernel to describe how the probability of contact between farms scales with distance. transmission is naturally assumed to decrease with the distance between farms according to an offset power-law kernel k(d_ij), where d_ij represents the euclidean distance between the infected farm i and the susceptible farm j. the parameters a (kernel offset) and c (kernel power) are to be estimated. the kernel captures all forms of movement and contact between farms and, as such, the use of a simple 2-parameter function is inevitably a highly simplified representation of the true complexity of inter-farm contacts. we examined other functional forms for the kernel (such as those used in some other analyses [19]) but the resulting model fits were much poorer than found using the power-law kernel above. given the susceptibility and infectiousness parameters and the kernel, the infection hazard from an infected farm j to a susceptible farm i is then quantified by equation (2). this model is over-specified as stated, so we arbitrarily assume s_s = 1 throughout, meaning s_c represents the ratio of cattle-to-sheep susceptibility. for a constant (distance-independent) kernel this is just a mass-action closed epidemic model with heterogeneous susceptibility and infectiousness. 
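the kernel and farm-to-farm hazard can be sketched as follows. note that the exact kernel form (here 1/(1 + (d/a)^c)) and all parameter values are illustrative assumptions only; the paper specifies an offset power law but the fitted values are those in table 1:

```python
import math

# illustrative parameter values only -- the fitted estimates are in table 1;
# the kernel form 1/(1 + (d/a)^c) is an assumed offset power law
A, C = 1.0, 2.0          # kernel offset a and power c
S_C, S_S = 5.7, 1.0      # per-capita susceptibility (s_s fixed at 1)
I_C, I_S = 6.0, 1.0      # per-capita infectivity

def kernel(d, a=A, c=C):
    """contact kernel: decreases with euclidean distance d between farms."""
    return 1.0 / (1.0 + (d / a) ** c)

def hazard(farm_i, farm_j):
    """infection hazard from infected farm j to susceptible farm i;
    susceptibility and infectivity scale linearly with herd sizes."""
    d = math.hypot(farm_i["x"] - farm_j["x"], farm_i["y"] - farm_j["y"])
    susceptibility = S_C * farm_i["cattle"] + S_S * farm_i["sheep"]
    infectiousness = I_C * farm_j["cattle"] + I_S * farm_j["sheep"]
    return susceptibility * infectiousness * kernel(d)
```

the hazard factorises into a susceptibility term for the receiving farm, an infectiousness term for the source farm and the distance kernel, which is what makes the later mixing-matrix interpretation possible.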
this model assumes susceptibility and infectiousness parameters scale linearly with the number of animals of different species on the farm, a relatively strong assumption imposed for reasons of model parsimony. the mixing matrix embedded in (2) quantifies the 4 species-specific mixing rates between animals on different farms: cattle-to-cattle (s_c i_c), sheep-to-cattle (s_c i_s), cattle-to-sheep (s_s i_c) and sheep-to-sheep (s_s i_s). this model formulation is identical to that used by keeling et al. [5], except for the functional form of the kernel used. the force of infection on a susceptible farm i at time t depends on the whole history of events and is just the sum of the hazards (2) over all pairs, λ_i(t) = Σ_j β_ij l(i, j, t), where β_ij is the hazard given by (2) and l(i, j, t) = 1 if the farm i is susceptible and the farm j is infectious at the time t, and 0 otherwise. by default, we assume a latent period of 1 day (latency is represented within the function l); i.e. farms are infectious the day after they are infected. however, we test the sensitivity of our estimates to this assumption by also examining latent periods of 2 and 3 days. the probability density function that farm i is infected at time t is then given by f_i(t) = λ_i(t) exp(−∫_0^t λ_i(s) ds). hence, the contribution that a farm i, observed to be infected at time t_i, makes to the log likelihood is just log λ_i(t_i) − ∫_0^{t_i} λ_i(s) ds. a farm which is not infected contributes to the overall likelihood the probability that it escapes infection during the observation period, i.e. until the time it is culled (r_i) or for the duration of the epidemic t, whichever is shorter. its contribution to the log likelihood is therefore −∫_0^{min(r_i, t)} λ_i(s) ds. the total log likelihood of the model can be written as the sum of these contributions over all farms. we then extend the simple model above by introducing an additional parameter to understand to what extent transmission within species is altered by between-species transmission. the parameter r quantifies the degree to which mixing between species is assortative, with r < 1 representing assortative mixing and r > 1 disassortative mixing. 
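the survival-process likelihood just described can be sketched numerically; the crude time-stepping integration, the toy farm records and the constant-hazard test below are illustrative simplifications, not the paper's (analytically integrated) likelihood:

```python
import math

LATENT = 1.0  # assumed latent period of 1 day

def force_of_infection(i, t, farms, hazard):
    """lambda_i(t): summed hazard on farm i from every farm infectious at t."""
    return sum(hazard(farms[i], farms[j]) for j in farms
               if j != i
               and farms[j]["t_inf"] is not None
               and farms[j]["t_inf"] + LATENT < t <= farms[j]["t_cull"])

def log_likelihood(farms, hazard, T, dt=0.1):
    """survival-process log likelihood: an infected farm contributes
    log lambda_i(t_i) minus its integrated hazard up to t_i; an uninfected
    farm contributes minus its integrated hazard up to culling or T.
    the index case at t = 0 gets no event term (conditioning on the
    first infection, as the paper does)."""
    ll = 0.0
    for i, f in farms.items():
        end = f["t_inf"] if f["t_inf"] is not None else min(f["t_cull"], T)
        t = 0.0
        while t < end:  # crude left-riemann integration of lambda_i
            ll -= force_of_infection(i, t, farms, hazard) * dt
            t += dt
        if f["t_inf"] is not None and f["t_inf"] > 0:
            lam = force_of_infection(i, f["t_inf"], farms, hazard)
            ll += math.log(lam) if lam > 0 else float("-inf")
    return ll
```

the hazard argument is any function of two farm records, so the distance kernel and herd-size terms can be plugged in without changing the likelihood code.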
the interaction model still assumes parameters constant with respect to time along the whole observation period t. the mixing matrix defined in equation (2) becomes

( s_c i_c    r s_c i_s )
( r s_s i_c  s_s i_s )     (8)

where we again fix s_s to be 1 to avoid model over-specification. the force of infection (3) and the model log likelihood equation (7) change accordingly. assuming transmission parameters were constant in time throughout the epidemic is obviously a crude simplification. however, allowing infectivity to vary continuously in time results in an over-specified model and problems of parameter identification and confounding. we therefore examined two sets of models in which changes in transmission parameters were restricted to 2 significant points in time, denoted by t_cut, namely 23rd february (when the national ban on animal movements was introduced) and 31st march (when control measures were intensified and the so-called 24/48 hour ip/cp culling policy was introduced). models were respectively fitted to the individual case data from the start of the epidemic (conditioning on the first infection) or from after 23rd february (conditioning on the 54 farms that were already infected by that date). a detailed history of the epidemic is given by kao [9]. we separately fitted model variants which assumed a discrete change in parameters on 23rd february and on 31st march. confounding meant that only a very limited number of parameters could be varied in time, so we examined the effect of varying infectiousness and kernel parameters separately. 
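the mixing matrix of equation (8) is small enough to write out directly; the parameter values used in the test are illustrative, not the fitted estimates:

```python
def mixing_matrix(s_c, i_c, i_s, r, s_s=1.0):
    """species mixing matrix of equation (8): within-species rates on the
    diagonal, between-species rates scaled by the assortativity parameter r
    (r < 1 assortative, r > 1 disassortative, r = 1 random mixing)."""
    return [[s_c * i_c, r * s_c * i_s],
            [r * s_s * i_c, s_s * i_s]]
```

setting r = 1 recovers the baseline (random-mixing) matrix embedded in equation (2).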
we fitted four separate time-varying model variants: (i) varying the cattle infectivity by a factor and keeping sheep infectivity constant through time (cattle infectivity model); (ii) varying sheep infectivity by a factor but not cattle infectivity (sheep infectivity model); (iii) varying both cattle and sheep infectivity by the same ratio (cattle & sheep infectivity model); (iv) varying the kernel parameters (time-varying kernel model). for the last model variant we also fitted a version which includes non-assortative mixing between species (see equation (8)). hence, in the most general mathematical expression of the transmission model, each time-varying parameter takes separate values before and after t_cut, where the subscripts pre and post are self-explanatory. when fitting models with time-varying infectivity parameters we actually fit i_post and the ratio m = i_pre / i_post, which we call the infectivity factor. this is a within-species ratio, a parameter directly fitted by the models, unlike the between-species infectivity ratio, which is additionally calculated as explained later in the text (see the parameter estimates section). note that all the models above treat the epidemic as fully observed, i.e. infection times are assumed to be known (when in fact only estimated infection times are known; see the sensitivity analysis section), and only ips are assumed to be infectious. we adopt a bayesian framework for statistical inference and use mcmc methods for fitting the model to individual case data. this is not strictly necessary, given our simplifying assumption that the epidemic was completely observed, but it provides a more consistent and robust framework within which to relax that assumption in future work. we obtained parameter estimates and equal-tailed 95% credible intervals from the marginal posterior distributions of the fitted parameters. 
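the piecewise-constant parameterisation with the infectivity factor m amounts to a one-line function; the numeric values in the test are placeholders, not fitted estimates:

```python
def infectivity(t, i_post, m, t_cut):
    """piecewise-constant infectivity: i_pre = m * i_post before t_cut,
    i_post afterwards; m = i_pre / i_post is the within-species
    infectivity factor that the time-varying models fit directly."""
    return m * i_post if t < t_cut else i_post
```

the same pattern applies to the time-varying kernel variant, with (a, c) taking pre and post values around t_cut.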
for the basic model, for instance, we estimated the relative cattle susceptibility s_c, two infectivity parameters (i_c(t) ≡ i_c and i_s(t) ≡ i_s for all t) and two kernel parameters (c(t) ≡ c_post ≡ c_pre ≡ c and a(t) ≡ a_post ≡ a_pre ≡ a for all t). we used the posterior mean deviance as a bayesian measure of fit or model adequacy, as defined by spiegelhalter et al. [20]. the deviance is defined as d(h) = −2 log{p(y|h)} + c, where log{p(y|h)} is the log-likelihood function for the observed data vector y given the parameter vector h and c is a constant which does not need to be known for model-comparison purposes (being a function of the data alone). the smaller the mean posterior deviance, the better the corresponding model fits the data. if the posterior deviance distributions for two different models overlap significantly, it is necessary to use additional criteria to compare model fit, namely a comparison of the relative complexity of the models. the deviance information criterion (dic) is perhaps the most general of such methods, being a generalisation of the akaike information criterion for bayesian hierarchical models [20]. we define the complexity of a model by its effective number of parameters, p_d, defined as p_d = e[d(h)] − d(e[h]), where e[·] represents taking expectations over the posterior (the posterior average). the dic is then defined as dic = e[d(h)] + p_d. a lower value of dic corresponds to a better model. this criterion offers flexibility for comparing non-nested models [20] and is straightforwardly computed within an mcmc algorithm. we applied the classic random-walk metropolis-hastings algorithm [21, 22] with block-sampling of parameters, due to the computationally expensive form of the likelihood [23, 24]. a log scale was used for sampling, as the parameters are all strictly positive and were expected to vary by orders of magnitude. however, linear-scale sampling yielded similar results. 
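computing the mean posterior deviance, p_d and dic from mcmc draws is direct; the toy log-likelihood in the test is hypothetical and stands in for the model likelihood:

```python
def deviance(theta, loglik):
    """d(theta) = -2 log p(y|theta), with the constant c dropped."""
    return -2.0 * loglik(theta)

def dic(samples, loglik):
    """dic = mean posterior deviance + p_d, where the effective number of
    parameters p_d is the mean deviance minus the deviance at the
    posterior mean of the parameters."""
    devs = [deviance(th, loglik) for th in samples]
    d_bar = sum(devs) / len(devs)
    theta_bar = [sum(col) / len(samples) for col in zip(*samples)]
    p_d = d_bar - deviance(theta_bar, loglik)
    return d_bar + p_d, p_d
```

because only differences in deviance matter for model comparison, the data-dependent constant c never needs to be evaluated.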
the convergence of the chains was also very much improved compared with sampling on a linear scale (see robert [25] for more on perfect sampling and reparameterization issues). the model was coded in c and parallelized using openmp 2.0. the mcmc sampler was allowed to equilibrate, with convergence being evaluated visually from the likelihood and parameter traces. for the simpler models, 5,000 iterations were sufficient for equilibration, while this increased to 20,000 for the most complex models. also, using log-scale sampling, we verified that the chains were able to converge even if started with initial parameter values far from the final posterior mean values. posterior distributions were estimated from 100,000 iterations. the acceptance rate varies from model to model. for the baseline model we achieved a 25% acceptance rate and for the most complex model (8 parameters), a rate of approximately 10%. these values compare well with the 'golden' acceptance rate for random-walk metropolis-hastings of 23% (roberts [26]). we did not encounter common problems in mcmc estimation such as slow convergence and slow mixing (o'neill [27]). there were some correlations between parameters, mostly with biological explanations (cattle and sheep infectivity, for instance), but careful parameterization lowered them. we verified that parameter estimates were not dependent on parameterization choices; e.g. no difference was seen whether we fitted species infectivities individually, or just fitted sheep infectivity and then the ratio of cattle-to-sheep infectivity. table 1 lists the parameter estimates we obtained for the set of fitted models conditioned only on the first infection, whereas table 2 presents the estimates for models conditioned on infections occurring up to 23rd february. the posterior deviances for each set of models are plotted in figure 2a and figure 2b, respectively. figure 2a illustrates some clear conclusions. 
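a minimal single-block random-walk metropolis-hastings sampler on the log scale looks like the following; this is an illustrative sketch (the paper's sampler is block-structured and written in c), and note that sampling log(x) requires adding the jacobian term sum(log x) to the target density:

```python
import math, random

def rw_mh_log_scale(log_post, x0, n_iter=20000, step=0.3, seed=1):
    """random-walk metropolis-hastings with gaussian proposals on log(x),
    suited to strictly positive parameters spanning orders of magnitude.
    the change of variables adds sum(log x) to the log target."""
    rng = random.Random(seed)
    x = list(x0)
    target = lambda v: log_post(v) + sum(math.log(u) for u in v)
    logp = target(x)
    chain, accepted = [], 0
    for _ in range(n_iter):
        # multiplicative proposal = additive gaussian step on the log scale
        prop = [u * math.exp(rng.gauss(0.0, step)) for u in x]
        logp_prop = target(prop)
        if math.log(rng.random()) < logp_prop - logp:
            x, logp = prop, logp_prop
            accepted += 1
        chain.append(list(x))
    return chain, accepted / n_iter
```

in one dimension a well-tuned step gives a higher acceptance rate than the multidimensional 'golden' 23%; tuning the step down pushes acceptance up at the cost of slower mixing.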
of the two models without time variation in parameters, the interaction model fits significantly better than the baseline model without heterogeneous mixing between species. however, fitting the interaction model broadened the credible intervals of the infectivity parameter estimates (table 1), indicating (unsurprisingly) slight confounding between the 4 infectivity and susceptibility parameters. of the models which allowed infectivity to vary on 23rd february, allowing only cattle infectivity to vary gave a slightly better fit than varying sheep infectivity or both. however, of the models with parameters which vary on 23rd february, the model variants which allow the 2 kernel parameters to vary at that time point fit substantially better (by both deviance and dic criteria, see table 1) than those which just allow a species-specific variation in infectivity. this is encouraging for the inference procedure, as the main control measure initiated on that date was the banning of animal movements (figure 3a and figure 3b). the parameter estimates are less precise before 23rd february (table 1) due to the relatively small number of ips (about 57) before that date. looking at the most complex model (namely the interaction model with time-varying kernel), cattle were estimated to be 5.7-fold (4.6, 6.8) more susceptible than sheep (see figure 3c and table 1). rather than discussing the species-specific infectivities (see figure 3d and table 1), it is more informative to comment on the cattle:sheep infectivity ratio for the most complex fit (this ratio does not appear in the tables as it is not a model parameter). we calculated it within the mcmc algorithm as the ratio of the two species' infectivities for each sampled parameter point. the most complex model suggests that cattle are 5.95-fold (4.54, 7.63) more infectious than sheep (figure 3e). 
the parameter quantifying assortativity in mixing was estimated at r = 0.45 (0.31, 0.61), well below 1, the level at which mixing between species is random (figure 3f). by comparison with the model with a time-varying kernel but random mixing between species, the effect of heterogeneous mixing between species modified the species-specific transmission rates as follows: cattle-to-cattle and sheep-to-sheep transmission is higher (by 19% and 54% respectively) for the model with non-random mixing, whereas sheep-to-cattle and cattle-to-sheep transmission dropped by 41% and 37% respectively. conditioned on 23rd february, 7 model variants have been considered (table 2 and figure 2b). we examined the baseline and interaction models (no change in parameters over time), allowing cattle infectivity to vary on 31st march, allowing both cattle and sheep infectivity to vary by the same factor after 31st march (with and without heterogeneity in mixing), and allowing both kernel parameters to vary on 31st march. unsurprisingly, the kernel parameters were not significantly different if allowed to differ before and after 31st march, nor did this model prove to be the best fit. overall, while the variations in mean deviance (figure 2b) seen between model variants were much smaller than for the models conditioned on the first infection (figure 2a), the interaction model allowing for time-varying cattle infectivity gave the most adequate fit (measured by both mean deviance and dic, see table 2). we cannot statistically compare the two sets of models in table 1 and table 2, as the data used are different for the two cases. however, the parameter estimates from the best-fitting models of each table are largely consistent. each post-23rd february estimated value from the best-fit model in table 1 is included in the corresponding pre-31st march 95% credible interval of the best-fit model in table 2 (and vice versa). 
the most important message from the second set of models is that all models with time-varying cattle infectivity (best fit) indicated higher values of infectivity after 31st march than before (m = 0.73 (0.63, 0.83)) (table 2). this may seem paradoxical, but it reflects the fact that while culling (the effect of which is explicitly included in the input data) dramatically reduced case incidence in april, from may to september 2001 case incidence maintained itself at a low level, but almost entirely within cattle farms. this increase in cattle infectivity may therefore really reflect the impact of reduced biosecurity and/or increased non-compliance with movement controls. it is informative to examine what our parameter estimates imply in terms of geographic variation in transmission potential. given the parameter estimates for each model, we can define the relative risk of transmission an infectious farm j would pose to all susceptible farms in the country as r_j = (i_c n_j^c + i_s n_j^s) Σ_{i≠j} (s_c n_i^c + s_s n_i^s) k(d_ij). this quantity multiplied by the average duration of infectiousness of a farm (time from end of latency to culling) gives the reproduction number r_0j of the farm j. we divided the uk into 5 km squares and then calculated the average transmission risk of all farms in each square (local r_0). figure 4 shows how geographic risk changed before and after 23rd february for our best-fit model conditioned on the first infection. the kernel shape has a major influence on the average risk distribution throughout the country. figure 5 shows the corresponding risk maps for the estimates inferred from our best-fit model conditioned on 23rd february. a slightly higher risk is predicted after 31st march by the model conditioned on 23rd february, due to the increase in cattle infectivity after this date. 
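the gridding of per-farm risk into 5 km squares can be sketched as below; the farm records and the risk callable are illustrative stand-ins for the actual r_j computation:

```python
from collections import defaultdict

def local_risk(farms, r_j, cell=5.0):
    """average a per-farm relative transmission risk r_j over square grid
    cells (5 km by default), as done for the geographic risk maps."""
    acc = defaultdict(lambda: [0.0, 0])
    for farm in farms:
        key = (int(farm["x"] // cell), int(farm["y"] // cell))
        acc[key][0] += r_j(farm)   # accumulate risk per cell
        acc[key][1] += 1           # and the farm count
    return {key: total / n for key, (total, n) in acc.items()}
```

multiplying each cell average by the mean farm infectious duration would give the local r_0 maps of figures 4 and 5.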
the risk estimates after 23rd february from the first set of models appear consistent with those obtained from the models conditioned on 23rd february, though a rigorous statistical comparison is not appropriate. we have made the strong assumption for this study that the only infected farms during the 2001 epidemic were the reported ips, and hence that any farms which were infected but culled before clinical diagnosis were not responsible for causing any infections. it is therefore interesting to calculate how many of the proactively culled farms our model predicts might have been infected (but, by definition, not diagnosed). to calculate the probability p_i that a particular proactively culled farm i was infected, we need to adjust the infection hazard by the probability that the farm would not have been reported as a clinical case before its culling date t_i^c. from the outbreak data, we calculate the probability density of the time from infection to report for reported ips, and hence the cumulative probability distribution of the time from infection to report, denoted by f. then, with λ_i(t) being the force of infection on a proactively culled farm i at time t (from the best-fit model conditioned on 23rd february), the probability that the farm gets infected and escapes reporting between its potential infection time and its culling time t_i^c is p_i = ∫_0^{t_i^c} λ_i(t) exp(−∫_0^t λ_i(s) ds) [1 − f(t_i^c − t)] dt. we calculate the expected number of infections in different classes (e.g. dcs, cps) of proactively culled farms culled within a particular time interval (t_i^c ∈ [t_0, t_1]). for instance, the expected number of cps culled at a time t_i^c ∈ [t_0, t_1] which are predicted to have been infected can be formally written as the sum of p_i over those farms. this is a simplification, as in reality the delay from infection to report almost certainly depends on the size and species mix of a farm, but the result is nevertheless indicative of the expected level of infection in proactive culling. 
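the probability that a culled farm was infected yet unreported can be sketched by numerical integration; the constant force of infection and the degenerate report distributions in the test are illustrative only:

```python
import math

def p_infected_unreported(lambda_i, F_report, t_cull, dt=0.05):
    """probability that a proactively culled farm was infected before its
    cull time yet escaped clinical reporting: integrate the infection-time
    density f_i(t) = lambda_i(t) * exp(-cumulative hazard) against
    1 - F_report(t_cull - t), the chance of not yet being reported."""
    p, cum_hazard, t = 0.0, 0.0, 0.0
    while t < t_cull:
        f_t = lambda_i(t) * math.exp(-cum_hazard)   # infection density
        p += f_t * (1.0 - F_report(t_cull - t)) * dt
        cum_hazard += lambda_i(t) * dt
        t += dt
    return p
```

with F_report identically zero (reporting never happens) the integral reduces to the plain probability of infection before culling, which gives a simple sanity check.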
also, at this stage, the calculations are made as if culling were a non-informative censoring process. this is a reasonable assumption for all proactively culled farms except dcs (which by definition had been identified by a veterinarian as having had a high risk of exposure), for which our method may underestimate the infection rate. in calculating the infection-to-report delay distributions, we divided the epidemic after 23rd february into 3 time periods: 23rd february-31st march, 31st march-1st may and 1st may-5th october. in these intervals a total of 1332, 4498 and 1627 farms were slaughtered, respectively. our best-fit model conditioned on 23rd february predicts different infectivity regimes before and after 31st march (see the parameter estimates section and table 2), but we split the second period of time further due to different delays from reporting to culling. the infection-to-report delay is 8.6 and 8.8 days for the last two periods respectively, but the infection-to-cull delay drops from 9.4 to 8.8 days. applying this approach to the interaction model with time-varying cattle infectivity conditioned on 23rd february, we calculated the expected proportion of proactively culled farms which were infected. we estimate that approximately 1.3% (1%, 1.6%) of the 7457 culled non-ip farms may have been infected, 97 in total (figure 6a). of the 1332 farms culled between 23rd february and 31st march, 1.7% (1%, 2.4%) may have been infected (23 farms). of the 4498 farms culled between 31st march and 1st may, we estimate 0.7% (0.5%, 1%) were infected (34 farms). in the period 1st may to 5th october, we estimate that 1.6% (1%, 2.3%) of the 1627 farms culled were infected (27 farms). the proportion of cps estimated to have been infected is 2% (1.5%, 2.5%), equating to 62 farms (figure 6b). over the whole epidemic, we estimated 1.5% (0.8%, 2.1%) of farms designated as dcs were infected (19 farms). 
this estimate (figure 6c) does not allow for the higher risk of infection implied by the veterinary judgement that led to those dcs being identified, which may mean that a higher proportion were in fact infected. if we assume that dcs were 3 times more likely to be infected due to their status than the model would predict, then the incidence of infection in dcs goes up accordingly, i.e. to 4.6% or 59 farms. farms culled neither as dcs nor cps (typically those culled under the 3 km and local sheep cull policies in the cumbria and dumfries and galloway areas) had the lowest estimated rate of infection: a mere 0.5% (0.2%, 0.8%), or 16 out of 3067 farms. in this section we examine the sensitivity of our results to a number of factors: leaving pigs out of the analysis, possible errors in the estimated ip infection dates, and the assumed latent period. to justify the simplification of the analysis by discarding the number of pigs on a farm, we present some more detailed statistics regarding this variable, and we also fit the simplest model with pig farms included (the last two farms are exclusively pig farms). we denote by n_i^p, s_p and i_p the number of pigs in farm i, pig susceptibility and pig infectivity, respectively. the simplest model, similar to (1.2), conditioned on the first infection was fitted, reducing the number of parameters in the same manner. in addition we estimated the pig:sheep susceptibility ratio and pig infectivity, assuming all parameters constant through time. we found that the cattle:sheep susceptibility ratio is 6. table 1 shows that parameter estimates for cattle and sheep are largely unaffected by ignoring the pig population, with none of the estimates from the two analyses being significantly different. we conclude that including pigs would not change the conclusions presented in table 1 regarding cattle and sheep (given the very small number of ips which had pigs), but it would decrease the power of the analysis and increase model complexity. 
to understand to what extent our estimates are affected by the assumption that the infection dates have been accurately observed, we randomized the estimated infection dates by adding gaussian noise with zero mean and a standard deviation of 2 days. this is motivated by the observed distribution of times from the estimated infection date to the report date of ips, in which a substantial proportion of deviations (73.5%) are less than or equal to 2 days. we then fitted the simplest model (conditioned on both the first infection and 23rd february) to 10 such randomised datasets. the average estimates across them are given in table 3. they lie well within the confidence intervals reported in table 1. the average cattle:sheep infectivity ratio is also very close to the values estimated using the original data. the average estimates across 10 randomized datasets using the most appropriate model conditioned on 23rd february (i.e. the cattle infectivity and interaction model) are also in table 3. the values are within the 95% ci presented in table 2. we also performed a sensitivity analysis of the estimated proportion of infections in proactively culled farms (see the previous section) with respect to infection times. using the predicted parameters for each dataset, we calculated the average proportions across all of them, for each category of proactively culled farms. the average proportion of infections among dc farms is 1.37% (2%, 0.78% and 0.72% for each period of time, respectively). for cp farms, the same quantities evaluate to 1.9% overall, with 1.8%, 1.3% and 1.98% for each period, respectively. over all proactively culled farms, we obtained an average percentage of 1.25%, with 1.64%, 0.81% and 1.6% for each considered period of time. all the values are well within the 95% cis predicted from the original data (see the previous section and figure 6). all the results presented above assume a fixed latent period of 1 day. 
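generating the randomised infection-date datasets is straightforward; the rounding to whole days and clipping at day 0 are illustrative assumptions about how the perturbed dates were kept valid:

```python
import random

def randomise_infection_dates(dates, sd=2.0, n_datasets=10, seed=0):
    """produce randomised copies of the estimated infection dates by adding
    zero-mean gaussian noise (sd = 2 days), rounded to whole days and
    clipped at day 0 of the observation period (an assumption here)."""
    rng = random.Random(seed)
    return [[max(0, round(d + rng.gauss(0.0, sd))) for d in dates]
            for _ in range(n_datasets)]
```

refitting the model to each copy and averaging the estimates (table 3) then shows how sensitive the inference is to errors in the clinically estimated infection dates.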
we tested the sensitivity of parameter estimates to this assumption by examining latent periods of 2 and 3 days. overall, we would expect infectiousness parameters to increase to compensate for the shorter infectious period, and thus the slightly increased generation time (namely the mean time from the infection of one case to the times of infection of the cases that case generates). interestingly, however, it is the kernel parameter estimates which are altered as the latent period is varied, with the kernel becoming slightly less local with increasing latent period. for two- and three-day latent periods, pre-23rd february, the values of c dropped from 1.69 (table 1).

this paper has presented a statistical analysis of the spatiotemporal evolution of the 2001 foot and mouth outbreak in gb. qualitatively, the results agree with those obtained by keeling et al. [5] in identifying cattle as the key species in the 2001 epidemic. using the interaction model conditioned on 23rd february with time-varying cattle infectivity, we estimated that 88% of ips between 23rd february and 31st march were infected by cattle and only 12% by sheep. sheep-to-sheep transmission only accounts for 3.1% of ips in that period. after 31st march (when we estimated that cattle infectivity increased slightly, see table 2), allowing for non-random mixing between species indicates that contacts between farms are assortative on the basis of the species composition of the farm; i.e. like species mix with like. this agrees with intuition about the nature of farming practices (e.g. sharing of personnel and equipment is likely to be more common if 2 farms have the same livestock species). the implications for control measures of the moderate degree of assortativity we found remain to be explored. 
we did not use data collected during the epidemic on traced contacts between farms to fix the spatial kernel function in our analysis, since in the final version of the fmd epidemic data warehouse [http://www.defra.gov.uk/footandmouth/cases/index.htm] very few of the contacts apparently identified early in the epidemic remain confirmed. we also shared the concern of earlier work that the distribution of contact distances in traced contacts may well be biased [3]. we therefore estimated the kernel function, using an offset power-law functional form. the higher value of the kernel power parameter we estimated after 23rd february (2.67 vs. 1.70 before; figure 3a) is consistent with the expected dramatic shortening of the typical contact distance following the national movement ban. this localized spread, together with the higher estimated level of infectivity in cattle after 31st march, explains the long tail of the epidemic seen in 2001. in estimating the transmission risk between farms, we assumed a dependence on the euclidean distance between them. in reality, other metrics (e.g. the time required to travel between two farms) might be more reasonable, and should be examined in future work. we also did not include information on landscape (e.g. height above sea level, location of rivers, trees etc.). the estimated risk maps (figure 4 and figure 5) match the areas of the country where the highest case incidence rates were seen, with the notable exception of wales. the discrepancy between the high predicted risk in wales and the small number of cases observed may reflect inaccuracies in the input dataset: keeling et al. [5] reduced farm-level sheep population numbers by 30% in wales and obtained a better geographic match to the data (matt keeling, personal communication). however, the discrepancy may also reflect model inadequacy. we have not here allowed for other farm-level risk factors, such as the farm fragmentation index considered by ferguson et al. [3]. 
we have not explored more complex non-linear models of the dependence of susceptibility and infectiousness on the number of animals on a farm, or relaxed our implicit assumption that contact rates between farms scale linearly with the local density of farms. all these assumptions are being relaxed in ongoing work. the most important issue to be revisited in future work is to allow for proactively culled farms which were not diagnosed as ips to be potentially infected and infectious to other farms. this requires modification of the inference model used to allow for an arbitrary number of unobserved infections. the very low numbers of proactively culled farms we estimated as infected suggest that the effect of this model refinement may be limited. it should be noted, though, that these infection prevalence estimates are in part a result of the relatively non-local kernel estimated simultaneously. if kernel estimates change in a refined analysis, and if dcs were attributed a much higher risk of infection than estimated here due to their status, then it is possible that estimated infection rates in dcs and other proactively culled farms may increase somewhat. however, even if these factors increased our estimated infection prevalence among proactively culled farms 5-fold (which seems unlikely from ongoing work), it would still mean that only a small proportion (<10%) of culled dcs and cps were infected. this does not imply that proactive culling had no effect on the epidemic, as the largest expected effect of such culling is via the targeted depletion of susceptible animals. in this regard, proactive culling has the same epidemiological impact as vaccination. future work will revisit past estimates of exactly how important such culling was for the control of the 2001 fmd epidemic. 
[1] the pathogenesis and diagnosis of foot-and-mouth disease
[2] the impact of local heterogeneity on alternative control strategies for foot-and-mouth disease
[3] transmission intensity and impact of control policies on the foot and mouth epidemic in great britain
[4] the foot-and-mouth epidemic in great britain: pattern of spread and impact of interventions
[5] dynamics of the 2001 uk foot and mouth epidemic: stochastic dispersal in a heterogeneous landscape
[6] modelling vaccination strategies against foot-and-mouth disease
[7] predictive spatial modelling of alternative control strategies for the foot-and-mouth disease epidemic in great britain
[8] mathematical modelling of the foot and mouth disease epidemic of 2001: strengths and weaknesses
[9] the role of mathematical modelling in the control of the 2001 fmd epidemic in the uk
[10] models of foot-and-mouth disease
[11] different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures
[12] markov chain monte carlo methods for fitting spatiotemporal stochastic models in plant epidemiology
[13] likelihood estimation for stochastic compartmental models using markov chain methods
[14] estimating parameters in stochastic compartmental models using markov chain methods
[15] modelling antigenic drift in weekly flu incidence
[16] bayesian inference for partially observed stochastic epidemics
[17] the importance of immediate destruction in epidemics of foot and mouth disease
[18] statistics for spatial data: wiley series in probability and mathematical statistics
[19] spatio-temporal point processes, partial likelihood, foot and mouth disease
[20] bayesian measures of model complexity and fit
[21] monte carlo sampling using markov chains and their applications
[22] teller e (1953) equations of state calculations by fast computing machines
[23] bayesian data analysis
[24] markov chain monte carlo in practice
[25] advances in mcmc: a discussion
[26] weak convergence and optimal scaling of random walk metropolis hastings algorithms
[27] mcmc methods for stochastic epidemic models
[28] bayesian analysis of experimental epidemics of foot-and-mouth disease

the authors wish to thank christl donnelly for useful discussions. we are grateful to defra and bbsrc for research funding. conceived and designed the experiments: nf ic. performed the experiments: ic. analyzed the data: ic. contributed reagents/materials/analysis tools: nf ic. wrote the paper: nf ic.

key: cord-001603-vlv8x8l8 title: 3d structure prediction of human β1-adrenergic receptor via threading-based homology modeling for implications in structure-based drug designing date: 2015-04-10 journal: plos one doi: 10.1371/journal.pone.0122223 sha: doc_id: 1603 cord_uid: vlv8x8l8 dilated cardiomyopathy is a disease of left ventricular dysfunction accompanied by impairment of the β(1)-adrenergic receptor (β(1)-ar) signal cascade. the disturbed β(1)-ar function may be based on an elevated sympathetic tone observed in patients with heart failure. prolonged adrenergic stimulation may induce metabolic and electrophysiological disturbances in the myocardium, resulting in tachyarrhythmia that leads to the development of heart failure in humans and sudden death. hence, β(1)-ar is considered a promising drug target, but attempts to develop effective and specific drugs against this tempting pharmaceutical target are slowed down by the lack of a 3d structure of the homo sapiens β(1)-ar (hsβadr1). this study encompasses elucidation of the 3d structural and physicochemical properties of hsβadr1 via threading-based homology modeling. furthermore, the docking performance of several docking programs, including surflex-dock, fred and gold, was validated by re-docking and cross-docking experiments. gold and surflex-dock performed best in the re-docking and cross-docking experiments, respectively. consequently, surflex-dock was used to predict the binding modes of four hsβadr1 agonists. 
this study provides clear understanding of hsβadr1 structure and its binding mechanism, thus help in providing the remedial solutions of cardiovascular, effective treatment of asthma and other diseases caused by malfunctioning of the target protein. g-protein coupled receptor (gpcr) superfamily constitutes the largest family of receptors in cell responsible for mediating the effects of over 50% of drugs in the market now-a-days [1] [2] [3] [4] [5] [6] [7] [8] . gpcrs are involved in the transmission of a variety of signals to the interior of the cell and can be activated by a diverse range of small molecules including nucleotides, amino acids, peptides, proteins and odorants. activation of gpcrs results in a conformational change followed by a signal cascade that passes information to the inside of the cell by interacting with a protein known as heterotrimeric g-proteins. there are three main classes of gpcrs (a, b and c) depending on their sequence similarity to rhodopsin (rho) (class a). class a gpcrs is the largest group and encompasses a wide range of receptors including receptors for odorants, adenosine, β-adrenergic and rhodopsin [1] [2] [3] [4] [5] [6] [7] [8] . the β-adrenergic receptors (β-ars) are g s protein-coupled receptors that play important roles in cardiovascular function and disease, through serving as receptors for the neurohormones: norepinephrine and epinephrine. norepinephrine released from cardiac sympathetic nerves activates myocyte β 1 -ars, which activates adenylyl cyclase via stimulatory g-protein (g s ). the rise in the intracellular [camp] level causes the phosphorylation of several intracellular proteins by means of camp-dependent protein kinase a. such type of activated β 1 -ar results in an increased cardiac inotropy, lusitropy, and chronotropy and the secretion of rennin, all of which contribute to regulate the cardiac functions and blood pressure [9] [10] . 
β 1 -ar predominates in the heart, representing about 80% of the myocardial β-ars; thus, they tend to be viewed as the most significant β-ars with respect to the cardiovascular system. β 1 -and β 2 -ars in kidneys stimulate the release of renin, thereby playing a role in the activation of renin-angiotensin-aldosterone system [9] [10] . the role of β-ars in cardiovascular function and disease is also highlighted by the significant roles of drugs whose actions are based on binding to the β-ars blockers (β-blockers). βblockers represent first line therapy for the management of chronic heart failure, hypertension, acute and post myocardial infarction patients, chronic stable angina, and unstable angina [11] . they are also commonly used to control the symptoms of atrial fibrillation and other arrhythmias [11] . there are no cardiovascular drugs that have a wider range of indications than βblockers, making them a critical drug class for the management of cardiovascular disease. the availability of uses for β-blockers also suggests that the activation of β-ars, or the sympathetic nervous system (sns), plays an essential function in most cardiovascular diseases. the fact that β 1 -ar selective antagonists are equivalent to non-selective blockers in essentially all situations provides additional evidence that β 1 -ars are the more important β-receptors with respect to cardiovascular disease. the development of a large number of rational inhibitors that have the ability to modulate the activity of such receptors has been a major goal for the pharmaceutical industries to improve the clinical treatment of various disease including hypertension, heart failure and asthma [12] . however, finding specific drug against a particular β-ars drug target is a slow and laborious process. furthermore, the lack of 3d structure of hsadrb1 is an obstacle in the identification of specific drug like molecules. 
on the other hand, the development of computational approaches for drug designing can be carried out effectively at low cost [13] [14] . the use of computational techniques in the drug discovery and development process is rapidly gaining popularity, implementation and appreciation. there will be an intensifying effort to apply computational power to combine biological and chemical space in order to rationalize drug discovery, design and optimization. today, computer-aided drug design (cadd) is based on knowledge of structure, either of the receptor or of the ligand; the former is described as structure-based and the latter as ligand-based drug designing. because it is difficult and time-consuming to obtain experimental structures from methods such as x-ray crystallography and protein nmr for every protein of interest, homology modeling is a widely used in silico technique providing useful structural models for generating hypotheses about a protein's function and directing further experimental work [15] . the main objective of this study is to employ an "in silico" homology modeling technique to construct the 3d structure of hsadrb1, which will be used to identify and characterize new inhibitors of hsadrb1 by structure-based computational approaches. this model serves as a starting point for gaining knowledge of protein-ligand interactions and the structural requirements of the protein's active site. computational studies were performed on an intel xeon quad core (2.33 ghz processor) server running linux (opensuse version 12.0). multiple sequence alignment of the closest homologues identified by ncbi p-blast was carried out with clustalw to find the identity, similarity and gap regions between the target and template [16] . homology modeling was accomplished by orchestrar [17] , implemented in the biopolymer module of sybyl 7.3 [18] . an online server, i-tasser [19] , was used to model a region absent in the template structure.
the finally selected model of hsβadr1 was minimized by amber (version 10.0) [20]. stereochemical properties of the modeled protein structure were validated by procheck [21] , verify3d [22] and errat [23] . molecular docking experiments were conducted with surflex-dock implemented in sybyl (version 7.3) [24] , fred (version 2.2.5) [25] and gold (version 2.5) [26] . ucsf chimera [27] [28] and moe [29] were used for visualization purposes. the flowchart of the work plan is illustrated in (fig 1) . search for the closest homologue. the top-ranked template sequences determined by blast were subjected to multiple sequence alignment on the basis of the optimized e-value of the specified target sequence (table 1) . however, meleagris gallopavo β 1 -ar (mgβadr1, pdb id: 2y00), retrieved as the closest homologue, was manually edited for optimal alignment along with the target sequence. the best alignment was selected based on alignment score and the reciprocal position of the conserved amino acid residues across the members of the class a gpcr superfamily. the ballesteros and weinstein numbering scheme [32] was used to identify the transmembrane (tm) segments: the most conserved residue of each tm helix is assigned the locant .50, a position sharing common features across the class a gpcr superfamily, and this locant is preceded by the tm helix number. the residues immediately preceding and following the .50 residue are numbered .49 and .51, respectively. orchestrar is specifically designed for homology or comparative protein modeling: it identifies structurally conserved regions (scrs), models loops using model-based and ab-initio methods, models side chains, and combines them all to prepare a final model. initially, a homology model was generated by orchestrar that lacked a region of 45 amino acid residues (209-254) of the cytoplasmic loop of tm5, present in the target sequence but absent in the template structure.
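the ballesteros and weinstein indexing rule just described (helix number, then an offset relative to the conserved .50 residue) can be written as a small helper; the function name and signature below are our own illustration, not part of any tool used in the study:

```python
def bw_label(helix, pos, pos50):
    """Ballesteros-Weinstein label for residue `pos` in TM helix `helix`,
    given `pos50`, the sequence position of that helix's conserved x.50 residue."""
    return f"{helix}.{50 + (pos - pos50)}"

# example grounded in the text: asp105, arg106 and tyr107 of helix 3,
# with arg106 as the conserved 3.50 residue of the DRY motif
dry_labels = [bw_label(3, p, 106) for p in (105, 106, 107)]
```

with arg106 as the reference, asp105 and tyr107 receive the labels 3.49 and 3.51, matching the numbering used for the dry motif later in the text.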
this region was modeled by i-tasser, an integrated platform for automated protein structure and function prediction based on a sequence-to-structure-to-function paradigm using multiple threading alignments from lomets [33] . the model generated by i-tasser was named sub-model 1. five sub-models were evaluated by replica-exchange monte carlo simulations, using low free-energy states, spatial restraints and alignment of tm regions [34] , to identify the structural alignment closest to the structural analogs on the basis of structural similarity. any remaining steric clashes were removed to refine the coordinates, and the final results for all sub-models were based on the sequence-structure-function paradigm obtained from the consensus of structural similarity and the confidence score (c-score) of the i-tasser server. the c-score quantifies the quality of a predicted sub-model produced by the threading method. stereochemical properties of each sub-model were evaluated, the best sub-model was incorporated into the homology model of hsβadr1 generated previously by orchestrar, and after insertion the finalized model was subjected to optimization. the homology model of hsβadr1 generated by orchestrar was minimized by sybyl using the conjugate gradient and steepest descent methods with 10,000 iterations each. the selected sub-model generated by i-tasser was also individually minimized for 10,000 cycles by amber10, followed by insertion of the sub-model into the homology model of hsβadr1 using the chain joining option in sybyl. the finally generated model was minimized for a further 30,000 cycles using the ff99sb force field in amber10. selection of complexes for re-docking and cross-docking validation. to identify a suitable docking program for the docking of hsβadr1 agonists, re-docking and cross-docking experiments were performed with surflex-dock, fred, and gold. six βadr1-ligand complexes, three βadr2-ligand complexes and two rhodopsin-ligand complexes were retrieved from the pdb.
the details of the protein-ligand complexes used in this study are summarized in table 2 and s1 fig. selection of complexes was based on the following criteria: availability of the protein-ligand complexes, a crystallographic resolution of at most 3 å, and a known binding interaction of the protein-ligand complex. the docking methods and scoring functions used in the cross-docking experiments are listed in s2, s3 and s4 tables, and the details of the docking methodology are discussed in the supporting information. the re-docking results were analyzed to check the ability of each docking program to correctly identify the bound conformation of the co-crystallized ligand in the top-ranked solution. rmsds were calculated between each co-crystallized ligand and its predicted docked pose. cross-docking experiments were conducted to identify which docking program correctly identified its cognate ligand, among a diverse set of ligands, within the top-ranked solution. for cross-docking, 11 complexes were extracted from the pdb, of which eight proteins are homodimers (chain a and chain b) while the remaining three are monomers (chain a). for proteins present as homodimers, ligands were docked into both chains. overall, 19 complexes were evaluated for cross-docking. the results were quantified as best (position 1-3), moderate (position 4-5) and worst when the cognate ligand ranks lower than position 5 within its cognate protein. blast results and multiple sequence alignments. blast predicted mgβadr1 (pdb id: 2y00) [35] as the best match for hsβadr1, with 68% identity and 75% positivity (with an e-value of 3×10^-165). 2r4r, 3kj6, 3p0g and 2rh1 have 79% and 2r4s and 3sn6 have 74% query coverage, more sequence coverage than observed for 2y00 (73%).
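the three-band ranking criterion used to quantify cross-docking results can be written down directly; this is our own sketch of the rule, not code from the study:

```python
def cross_docking_quality(rank):
    """Classify the rank of a cognate ligand within its cognate receptor:
    positions 1-3 are 'best', positions 4-5 'moderate', anything lower 'worst'."""
    if 1 <= rank <= 3:
        return "best"
    if rank in (4, 5):
        return "moderate"
    return "worst"
```

applied to a run of 19 docked complexes, counting how many land in each band reproduces the kind of percentage summary reported for surflex-dock, gold and fred.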
since 2r4r, 3kj6 and 2r4s are available in apo form and have lower scores, identity and positive values, these structures were not used in this study. similarly, the complexes 3p0g, 2rh1 and 3sn6 were not used for the modeling of the hsβadr1 structure due to their lower scores. hence, 2y00 is used according to the blast results, but to establish more confidence in the top-ranked hit, we opted for two sorts of multiple sequence alignment: raw multiple sequence alignment and manually-edited multiple sequence alignment. for the raw alignment, the ten top-ranked template sequences (2y00, 2vt4, 2r4r, 3kj6, 2r4s, 3sn6, 4gbr, 3p0g, 2rh1 and 3pds) were aligned against the target sequence, as illustrated in s2 fig and s1 table. for the manually-edited alignment, both the target and template (2y00) sequences were truncated. the first 50 residues from the n-terminus and 84 residues (393-477) from the c-terminus were omitted from the target sequence due to the absence of a corresponding homologous sequence in the template; these regions contain no important residue required to be in helical segments. the template sequence has 483 amino acid residues, whereas the structure of residues 33-368 has been resolved (315 residues in total, as some residues are missing). the first 3 residues (33-36) from the n-terminus and 14 residues (337-351) from the c-terminus were omitted. finally, 342 residues of the target sequence were aligned with the ten top-ranked blast hits, 2y00 (297 residues), 2vt4, 2r4r, 3kj6, 2r4s, 3sn6, 4gbr, 3p0g, 2rh1 and 3pds. the average alignment score for the manually-edited multiple sequence alignment is better (76.47) than the score obtained by the raw multiple sequence alignment (74.89). overall, there are 15 instances where alignments are improved: 7 alignments improve when the target sequence is aligned with the rest of the sequences, and 8 alignments have better quality when the template sequence is aligned with the remaining sequences.
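alignment quality in this section is reported as percent identity and similarity over aligned columns. a toy calculator is sketched below; the amino-acid similarity groups are a common convention and an assumption of ours, not the grouping used by the ident and sim tool:

```python
# assumed conservative substitution groups (not taken from Ident and Sim)
SIMILAR = [set("GAVLI"), set("FYW"), set("CM"), set("ST"),
           set("KRH"), set("DENQ"), set("P")]

def identity_similarity(seq_a, seq_b):
    """Percent identity and percent similarity over the gap-free columns of
    two equal-length, gapped sequences ('-' marks a gap)."""
    assert len(seq_a) == len(seq_b)
    cols = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    ident = sum(a == b for a, b in cols)
    simil = sum(any(a in g and b in g for g in SIMILAR) for a, b in cols)
    n = len(cols)
    return 100.0 * ident / n, 100.0 * simil / n
```

on a tiny made-up alignment such as "KRDE-" vs "KKEE-", the identity is 50% (two exact matches out of four gap-free columns) while the similarity is 100% (every column pairs residues from the same group), illustrating why similarity percentages always bound identity percentages from above.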
the generalized ballesteros and weinstein numbering scheme is beneficial for the understanding, recognition and structural alignment of the gpcr family [32] . the ballesteros and weinstein numbering is illustrated in (fig 2) and the conserved amino acid residues of class a gpcrs are tabulated in table 3 . ballesteros and weinstein numbering is useful for integrating methods for the construction of 3d models and for computational probing of structure-function relations in gpcrs. these criteria pertain to the selection of correct inputs for the alignment programs and to structural considerations applicable to checking and refining the sequence alignments generated by alignment programs. this selection criterion depends on information determined by the extent of homology among the compared sequences. alignment of sequences with intermediate homologies (i.e., 30-70%) can identify continuous patterns of conservation distributed over the entire sequence. such patterns provide structural inferences based on conservation. the hsβadr1 model was selected after structural comparison, superimposition and procheck results ( fig 3a) . the orchestrar-generated homology model using template 2y00 was incomplete, since the structure of residues 209-254 was missing. orchestrar can fill gaps, but only those no more than 1-12 residues long; therefore, an ab-initio based threading method was used to predict the structure of the missing region (s3 fig). subsequently, five sub-models were generated. each sub-model was further analyzed by ramachandran plot (table 4 ). among them, sub-model 1 was selected on the basis of the highest c-score (-2.43) and its stereochemical properties. a c-score value lower than -1.5 likely indicates the lack of an appropriate template within the i-tasser library. the selected sub-model 1 was subsequently inserted into the homology model of hsβadr1 by sybyl.
the c-terminus val208 and the n-terminus lys254 of the homology model is connected with the n-terminus val209 and the c-terminus thr255 of submodel 1, respectively ( fig 3b) . according to the ramachandran plot,~85%, 13.5% and 1.3% residues are located within the favorable, allowed and the generously allowed regions, respectively while only one residue (ile208) is found to be in the disallowed region. the visual inspection revealed that ile208 is far away from the active site region and do not lie within 5å of active site. additionally, stereochemical properties of the model were validated by verify3d web server. verify3d evaluated the local environment and inter-residue contacts in the model. ideally, the 3d-1d profile for each of the 20 amino acids should be in range of 0-0.2. values less than zero are considered as inaccurate for the homology model. the verify3d plot of hsβadr1 model shows that the average score of all amino acid residues is 0.16 which is relatively closed to 0.20. moreover, errat, a protein structure verification web server was used to verify the model on the basis of model building and refinement, and is extremely useful in making decisions about reliability of the homology model. errat results showed that the overall quality factor for the hsβadr1 model is 73.35%., suggesting that the generated model is robust and can be use for virtual screening purpose in future. the 3d model of hsβadr1 revealed an excellent agreement with the experimentally determined 3d structure of mgβadr1. (fig 4) shows the superimposed view of hsβadr1 model and mgβadr1 structure. the calculated polypeptide backbone (cα, these pdbs have comparable sequence similarities, identities and source as that of the template but some conformational changes has been observed for helix6 [36] . 
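the verify3d acceptance rule quoted above (per-residue 3d-1d scores should sit in the 0-0.2 range, and any score below zero flags a locally inaccurate region) can be summarised by a small helper; the function name is ours:

```python
def verify3d_summary(scores):
    """Average 3D-1D score of a model and the indices of residues scoring
    below zero, which the text treats as locally inaccurate regions."""
    avg = sum(scores) / len(scores)
    suspect = [i for i, s in enumerate(scores) if s < 0]
    return avg, suspect

# toy profile: one residue dips below zero, the average stays near 0.14
avg, suspect = verify3d_summary([0.2, 0.1, -0.05, 0.3])
```

for the hsβadr1 model the reported average of 0.16 is close to the 0.20 ideal, which is the kind of single-number summary this helper returns alongside the list of locally suspect residues.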
however, we found no observable conformational changes, especially for those amino acid residues that are involved in molecular interactions with the high-affinity antagonists i32, p32 and cau located within h6 and cl-3. finally, the hsβadr1 model was subjected to the sequence manipulation suite ident and sim [37] to observe the similarity and identity of the model with respect to the template structure. the results were already good, but improved markedly after manual editing of the target and template sequences: the similarity and identity ratios increased from 73% to 75.4% and from 67% to 68.4%, respectively. the overall topology and secondary structural elements particular to the class a gpcr family remain quite conserved in the model of hsβadr1, that is, an extracellular n-terminus domain, seven tm domains linked by three intracellular cytoplasmic loops (cl-1, cl-2 and cl-3) and three extracellular loops (el-1, el-2 and el-3), and a cytoplasmic c-terminus domain. the n-terminus domain comprises nine amino acid residues (1-9) that are located outside the membrane. the tm-1 to tm-7 helices span residues 10-34, 44-67, 80-104, 125-146, 173-198, 274-297 and 308-326, respectively, while the c-terminus domain comprises amino acid residues 327 to 342 at the inner face of the membrane. the cytoplasmic loops (cl-1, cl-2 and cl-3) comprise residues 35-43, 105-124 and 199-273, respectively. the cytoplasmic loops cl-2 and cl-3 are believed to be important in the binding, selectivity or specificity, and activation of g-proteins [38] . the extracellular loops (el-1, el-2 and el-3) comprise residues 68-79, 147-172 and 298-307, respectively. two conserved disulfide bridges, which are important for cell surface expression, ligand binding, receptor activation and maintenance of the secondary structure, are located in the el-2 and el-3 regions at positions cys81-cys166 and cys159-cys165, respectively (table 5 ).
conserved motifs of hsβadr1 homology model dry motif also known as ionic lock switch [39] is observed at position asp105(3.49), arg106 (3.50) and tyr107 (3.51) in helix 3 of hsβadr1 model. the conserved asp in dry motif at the cytoplasmic end of helix 3 believes to regulate the transition state of active state, while the adjacent arg is crucial for g-protein activation [40] . another conserved penta-peptide npxxy motif known as tyrosine toggle switch (where x usually represents a hydrophobic residue and n is rarely exchanged against d) located at the c terminus of tm-7 which contributes to gpcr internalization and signal transduction. several site-directed mutagenesis studies revealed the importance of this motif in signaling [41] . the npxxy motif is present at position arg323(7.49), pro324(7.50), ileu325(7.51), ileu326(7.52) and tyr327(7.53) in the [44] . these domains help anchor tm to the cytoskeleton and hold together signaling complexes. pdz domain have many functions, from regulating the trafficking and targeting of proteins to assembling signaling complexes, and networks designed for efficient and specific signal transduction [45] . the amphipathic amino acid residues present in helix 8 are conserved among all human gpcrs (residues 327-341), located between the tm7 bound with helix 7. the palmitoylation occurs at n-terminus while the biosynthesis of receptor and the proper regulation of surface expression occur at c-terminus of hsβadr1. the side chain of two crucial residues of helix8, asp332(8.49) and arg334 (8.51) , are projected within the hydrophilic interface involved in stimulatory g-protein (g s ) activation while the residue phe333(8.50) and phe337(8.54) are buried in the hydrophobic core of the helix [46] . salt bridges play important roles in protein structure and function. disruption and the introduction of a salt bridge reduce and increase the stability of the protein, respectively [47] . 
in membrane proteins, one expects salt bridges to be particularly important because of the smaller dehydration penalty (loss of favorable contacts with water) on salt bridge formation [48] . charged groups become largely dehydrated when inserted into membranes and therefore experience a smaller change in hydration between non-salt-bridging and salt-bridging states. there should also be a smaller effect of solvent screening, strengthening salt-bridge interactions [48] . multiple salt bridges are observed in the homology model of hsβadr1: asp154:arg157, asp209:arg213, asp332:lys335, glu155:arg158, glu200:lys203 and glu212:arg213. in addition, salt bridges can also serve as key interactions in much the same way as disulfide bonds (s6 fig) . re-docking analysis. the success of docking is usually scrutinized by its accurate pose prediction ability [49] [50] ; hence, prior to the docking of βadr1 agonists into the homology model of hsβadr1, the reliability of three docking programs, surflex-dock, fred, and gold, was assessed. the re-docking results were quantified on the basis of the rmsd between the top-ranked docked conformation and the co-crystallized (termed 'reference') ligand, together with visual analysis. the prediction is termed good when the rmsd is less than 2 å and the docked pose is superimposed on the ligand's co-crystallized position, fair when the rmsd is between 2 and 3 å and the docked pose is in the active site but not superimposed on its native conformation, and poor or inaccurate when the rmsd is greater than 3 å and the docked pose is inverted or far away from the active site. the re-docking results showed that gold outperformed surflex-dock and fred (fig 5) . among the 19 complexes used, gold, surflex-dock, and fred generated 100%, 74%, and 68% good solutions in the top-ranked position, respectively. surflex-dock and fred identified 5% and 10% fair poses, respectively, in the top-ranked docked poses.
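the rmsd computation and the quality bands used for re-docking can be sketched as follows; the helper names are ours, and we read the bands as good below 2 å, fair between 2 and 3 å, and poor above 3 å (the text's thresholds, interpreted without any superposition step):

```python
import math

def rmsd(pose_a, pose_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)
    atom coordinates, compared in place (no superposition performed)."""
    assert len(pose_a) == len(pose_b)
    sq = sum((a - b) ** 2
             for pa, pb in zip(pose_a, pose_b)
             for a, b in zip(pa, pb))
    return math.sqrt(sq / len(pose_a))

def pose_quality(r):
    """'good' below 2 A, 'fair' between 2 and 3 A, 'poor' above 3 A."""
    if r < 2.0:
        return "good"
    if r <= 3.0:
        return "fair"
    return "poor"
```

for example, a docked pose translated 2.5 å along one axis relative to its reference has an rmsd of exactly 2.5 å and falls in the fair band, matching the "in the active site but not superimposed" description.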
while both the programs generated 21% inaccurate solutions in the top-ranked docked pose. the results are summarized in (table 6) . furthermore, docking methods utilized in cross-docking is illustrated in (table 7) , was conducted to find out which program utilized in correctly ranks 19 ligands into their cognate binding site. the prediction was quantified on the basis of ligand's ranking (s2, s3 and s4) tables. the cross-docking results indicates that surflex-dock is superior with 47% best results in ranking the ligand in top 1-3 position in their cognate receptors. gold and fred are returned with 42% and 44% best results, respectively (fig 5) . the position and the interaction of each ligand within the cognate receptor are visually analyzed. the results showed that the conformation of each ligand generated by surflex-dock is much better than the docked conformations generated by gold and fred. most of the interactions generated by surflex-dock are similar to the interactions present in the x-ray conformation. hence, surflex-dock was found to be more appropriate for the docking of gpcr's ligands and it is further used in this study to explore the binding mode of hsβadr1 agonists into the active site of hsβadr1 model. table 8 and table 9 . binding mode of y00, whj, 5fw, 68h the docked pose of y00 reveals that multiple hydrogen bonding interactions are formed between the surrounding amino acid residues that stabilize y00 in the catechol binding pocket. the −oh group at the phenol moiety is involved in hydrogen bonding with the γ carboxylate side chain of asp167 (1.83 å). the substituted −oh group at meta and para positions of ring b shows hydrogen bonding interactions with the side chains γ −oh of thr170 (1.93 å) and ser178 (1.80 å), respectively. furthermore, the side chain phenyl ring of phe168 and the carboxylate of asp168 provide cation-π stacking interactions to the phenolic moiety of y00 that further helps to stabilize the orientation of agonist. 
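the hydrogen-bond distances reported for y00 can be screened with a simple distance filter; the 3.5 å donor-acceptor ceiling is a common rule of thumb and our assumption, not a cutoff stated in the paper:

```python
def hydrogen_bond_candidates(contacts, max_dist=3.5):
    """Keep (residue, ligand_group, distance_A) contacts short enough to be
    plausible hydrogen bonds, sorted by distance; the 3.5 A ceiling is a
    conventional value, assumed here."""
    return sorted((c for c in contacts if c[2] <= max_dist), key=lambda c: c[2])

# distances for y00 taken from the binding-mode description in the text;
# the last entry is an illustrative long contact with a made-up distance
y00_contacts = [("asp167", "phenol -OH", 1.83),
                ("thr170", "meta -OH", 1.93),
                ("ser178", "para -OH", 1.80),
                ("phe168", "ring stacking", 4.60)]
```

filtering these contacts keeps the three short polar contacts (ser178 closest at 1.80 å) and drops the long stacking contact, which the text instead attributes to cation-π interactions rather than hydrogen bonding.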
(fig 6a) displays the binding mode of compound y00. the binding mode of whj demonstrates that the amino group of whj mediates hydrogen bond with the side chain carboxylate of asp88 at a distance of 1.86 ǻ. similarly, thr170 γ -oh group probes hydrogen bonding interactions with multiple ligand atoms including n atom and o atom at a distance of 2.03 ǻ and 2.64 ǻ, respectively. the same thr170 is also involved in forming hydrogen bond at a distance of 1.76 ǻ, the most significant hydrogen bonding interaction for whj. phe168 forms cation-π interaction with one of the fused aromatic ring of whj. the binding orientation of compound whj is shown in (fig 6b) . the binding mode of 5fw shows that the para −oh moiety of 5fw establishes hydrogen bonding interaction with the side chain carboxylate of ser179 at a distance of 2.87 ǻ. additionally; ser178 forms bi-dentate hydrogen bonding with the para and meta −oh groups at distances of 2.22 ǻ and 2.17 ǻ, respectively. the main chain carbonyl moiety of phe168 mediate hydrogen bond with the amino group of 5fw (2.66 ǻ). the docked binding mode of compound 5fw is depicted in (fig 6c) . as revealed in (fig 6d) , the -oh of 68h shows similar interactions as observed for compound 5fw. the para substituted −oh group of 68h mediates bi-dentate hydrogen bonds with the side chain −oh groups of ser178 and ser179 at distances of 2.22 ǻ and 2.44 å, respectively. furthermore, asp88 mediates bi-dentate interaction with the linear chain amino and −oh groups of 68h at a distance of 1.91 ǻ and 1.90 ǻ, respectively. the binding mode analysis of agonists y00, 5fw, 68h displays that the ser178 plays crucial role in stabilizing the agonists within the catechol binding pocket of hsβadr1 homology model. the docking results reveals that ser178 and phe168 are crucial residues in ligand binding by providing h-bonding, and π-π interactions, respectively, thus helps in the activation of hsβadr1. 
we intend to incorporate molecular dynamic simulation studies in order to investigate the dynamic behavior of protein-inhibitor complex formation in the near future; and the role of most important residues will be determined. the study will be helpful to pursue structure based drug design of hsβadr1 blockers. human βadr1 is found to be involved in several cardiovascular diseases. the lack of crystal structure of hsβadr1 provoked us to apply in silico techniques to initiate the drug discovery process for hsβadr1. hence, to understand the characteristics structural features of hsβadr1 and to execute the structure-based drug design strategy for hsβadr1, threading-based homology modeling of mammalian origin were applied in this study. the model possesses acceptable structural profiles. furthermore, the binding modes of four hsβadr1 agonist were determined via molecular docking simulation. h-3, h-5, and el-2 regions were found to be important in ligand binding. several residues including trp84, asp88, val89, asp167, phe168, thr170, ser178, and ser179 are involved in direct interactions with the ligand. among all, ser178, and phe168 provides h-bonding, π-π interactions, respectively, hence found to be crucial residue in ligand binding and for the activation of hsβadr1. we are also investigating the dynamic behaviour of the apo and ligand bound forms of hsβadr1 that will be published in future. note: the coordinate file of hsadrb1 is submitted to the publicly accessible protein model database (pmdb) [51] ; www.caspur.it/pmdb). the pmdb id of hsadrb1 is pm0079652 respectively. table 8 and (fig 6) ). (tif) s1 table. alignment scores (a) alignment scores obtained from the alignment scores raw multiple sequence alignment (b) alignment scores obtained from the manually edited multiple sequence alignment (c) alignment scores obtained from the raw target and template pair wise sequence alignment. (doc) s2 table. 
cross-docking results of surflex-dock analyzed the basis of ranking of the cognate ligand in their respective receptor. criteria for ranking: 1-3 position is best (green cell), 4-5 is moderate (blue cell) and >5 is inaccurate (red cell). (doc) s3 table. cross-docking results of fred analyzed on the basis of ranking of the cognate ligand in their respective receptor. criteria for ranking: 1-3 position is best, 4-5 is moderate and >5 is inaccurate. (doc) s4 table. cross-docking results of gold analyzed on the basis of ranking of the cognate ligand in their respective receptor. (doc) location and nature of the residues important for ligand recognition in g-protein coupled receptors identification of g protein-coupled receptor genes from the human genome sequence the structure and function of g-protein-coupled receptors the year in g protein-coupled receptor research g protein-coupled receptors: the inside story an overview on gpcrs and drug discovery: structure-based drug design and structural biology on gpcrs i want a new drug: g-protein-coupled receptors in drug development the impact of gpcr structures on pharmacology and structure-based drug design role of β adrenergic receptor polymorphisms in heart failure: systematic review and meta-analysis functional responses of human beta 1 adrenoceptors with defined haplotypes for the common 389r>g and 49s>g polymorphisms medical therapy can improve the biological properties of the chronically failing heart. a new era in the treatment of heart failure pharmacogenetics of the human beta-adrenergic receptors predicting molecular interactions in silico: i. 
a guide to pharmacophore identification and its applications to drug design virtual screening for sars-cov protease based on kz7088 pharmacophore points structure-based 3d-qsar models and dynamics analysis of novel n-benzyl pyridinone as p38α map kinase inhibitors for anticytokine activity clustalw and clus-talx version 2.0 comparison of composer and orchestrar sybyl molecular modeling software version 7.3, tripos associates i-tasser: a unified platform for automated protein structure and function prediction procheck: a program to check the stereochemical quality of protein structures verify3d: assessment of protein models with three-dimensional profiles verification of protein structures: patterns of nonbonded atomic interactions surflex: fully automatic flexible molecular docking using a molecular similarity-based search engine fred pose prediction and virtual screening accuracy gold version 3.0. cambridge crytallographic data center chimera: an extensible molecular modeling application constructed using standard components molecular operating environment (moe), 2012.10 chemical computing group inc. 1010 sherbooke st. 
west, suite #910 uniprot website basic local alignment search tool integrated methods for the construction of three-dimensional models and computational probing of structure-function relations in g protein-coupled receptors i-tasser: fully automated protein structure prediction in casp8 tm-align: a protein structure alignment algorithm based on the tm-score the structural basis for agonist and partial agonist action on a beta 1-adrenergic receptor two distinct conformations of helix 6 observed in antagonist-bound structures of a β1-adrenergic receptor the sequence manipulation suite: javascript programs for analyzing and formatting protein and dna sequences structure of a beta1-adrenergic g-protein-coupled receptor the significance of g protein-coupled receptor crystallography for drug discovery the effect of mutations in the dry motif on the constitutive activity and structural instability of the histamine h2 receptor mutation of tyrosine in the conserved npxxy sequence leads to constitutive phosphorylation and internalization, but not signaling, of the human b2 bradykinin receptor hallucinogen actions on 5-ht receptors reveal distinct mechanisms of activation and signaling by g protein-coupled receptors homology modeling of g-protein-coupled receptors and implications in drug design pdz domains and their binding partners: structure, specificity, and modification evolutionary expansion and specialization of the pdz domains characterization of the residues in helix 8 of the human β1-adrenergic receptor that are involved in coupling the receptor to g proteins esbri: a web server for evaluating salt bridges in proteins salt bridges: geometrically specific, designable interactions. 
we are thankful to prof. bernd m. rode (university of innsbruck) for the computational software support during this research work. financial support from the higher education commission (hec), required to conduct this scientific work, is highly acknowledged. supporting information: s1 fig. 2d representation of the bound ligands of 11 gpcrs complexes utilized in this study (see also table 2).
key: cord-030686-wv77zwsc authors: budde, carlos e. title: fig: the finite improbability generator date: 2020-03-13 journal: tools and algorithms for the construction and analysis of systems doi: 10.1007/978-3-030-45190-5_27 sha: doc_id: 30686 cord_uid: wv77zwsc this paper introduces the statistical model checker fig, which estimates transient and steady-state reachability properties in stochastic automata. this software tool specialises in rare event simulation via importance splitting, and implements the algorithms restart and fixed effort. fig is push-button automatic since the user need not define an importance function: this function is derived from the model specification plus the property query. the tool operates with input/output stochastic automata with urgency, aka iosa models, described either in the native syntax or in the jani exchange format.
the theory backing fig has demonstrated good efficiency, comparable to optimal importance splitting implemented ad hoc for specific models. written in c++, fig can outperform other state-of-the-art tools for rare event simulation. (this work was partially funded by nwo, ns, and prorail project 15474 (sequoia) and eu project 102112 (success).) in formal analysis of stochastic systems, statistical model checking (smc [33]) emerges as an alternative to numerical techniques such as (exhaustive) probabilistic model checking. its partial, on-demand state exploration offers a memory-lightweight option to exhaustive explorations. at its core, smc integrates monte carlo simulation with formal models, where traces of states are generated dynamically, e.g. via discrete event simulation. such traces are samples of the states where a (possibly non-markovian) stochastic model usually ferrets. given a temporal logic property ϕ that characterises certain states, an smc analysis yields an estimate γ̂ of the actual probability γ with which the model satisfies ϕ. the estimate γ̂ typically comes together with a quantification of the statistical error: two numbers δ ∈ (0, 1) and ε > 0 such that γ̂ ∈ [γ − ε, γ + ε] with probability δ. thus, if n traces are sampled, the full smc outcome is the tuple (n, γ̂, δ, ε). with this statistical quantification (usually presented as a confidence interval (ci) around γ̂) an idea of the quality of an estimation is conveyed. to increase the quality one must increase the precision (smaller ε) or the confidence (bigger δ). for fixed confidence, this means a narrower ci around γ̂. the number of traces n is inversely proportional to ε and to the ci width, so smc trades memory for runtime or precision when compared to exhaustive methods [5]. this trade-off of smc comes with one up and one down. the up is the capability to analyse systems whose stochastic transitions can have non-markovian
distributions. in spite of gallant efforts, this is still out of reach for most other model checking approaches, making smc unique. the down are rare events. if there is a very low probability to visit the states characterised by the property ϕ, then most traces will not visit them. thus the estimate γ̂ is either (an incorrect) 0 or, if a few traces do visit these states, statistical error quantification makes ε skyrocket. to counter such phenomenon, n must increase as γ decreases. unfortunately, for typical estimates such as the sample mean, it takes n ≳ 384/γ to build a (rather lax!) ci where δ = 0.95 and ε = γ/10. if e.g. γ ≈ 10^-8 then n ≳ 38,400,000,000 traces are needed, causing trace-sampling times to grow unacceptably long. rare event simulation (res [24]) methods tackle this issue. the two main res methods are importance sampling (is) and importance splitting (isplit). is compromises the aforementioned up, since it must tamper with the stochastic transitions of the model, given that the study of non-markovian systems is a chief reason to use smc. … related work. other statistical model checkers offer res methods to some degree of automation. plasma lab implements automatic is [18] and semiautomatic isplit [21] for markov chains. its isplit engine offers a wizard that guides the user to choose an importance function. the wizard exploits a layered decomposition of the property query, not the system model. via apis, the isplit engine of plasma lab could be extended beyond dtmc models. sbip 2.0 [22] implements the same (semiautomatic, property-based) engine for dtmcs. sbip offers a richer set of temporal logics to define the property query in. cosmos [1] and ftres [26] implement importance sampling on markov chains, the latter specialising in systems described as repairable dynamic fault trees (dfts). all these tools can operate directly on markovian models, and none offers fully automated isplit. instead, the smc tool modes can also cope with nondeterminism (e.g.
in markov automata) using the lss algorithm [10, 5]. on the other hand, using the batch means method, fig can estimate steady-state properties, which modes cannot currently do. moreover, res methods make more traces visit the rare states that satisfy a property ϕ (the set s_ϕ), to reduce the variance of smc estimators. for a fixed budget of traces n, this yields more precise cis than classical monte carlo simulation (cmc). fig implements importance splitting: a main res method that can work on non-markovian systems without special considerations. isplit splits the states of the model into layers that wrap s_ϕ like an onion. reaching a state in s_ϕ from the surface is then broken down into many steps. the i-th step estimates the conditional probability to reach (the inner) layer i + 1 from (the outer) layer i. this stepwise estimation of conditional probabilities can be much more efficient than trying to go in one leap from the surface of the onion to its core [20]. formally, let s be the states of a model with initial states s_0 and rare states s_ϕ ⊆ s. … this approach is correct, i.e. it yields an unbiased estimator γ̂, which depends on how the s_i layers were chosen. for this, an importance function f : s → r≥0 and thresholds ℓ_i ∈ r≥0 are defined: each layer s_i collects the states with importance at least ℓ_i, i.e. s_i = {s ∈ s : f(s) ≥ ℓ_i}, so the estimate hinges on the thresholds ℓ_i and the importance function f. these choices are the key challenge in isplit [20]. theoretical developments assume f is given [12, 8], and applications define it ad hoc via (res and domain) expert knowledge [30, 27]. yet there is one general rule: importance must be proportional to the probability of reaching s_ϕ. thus for s, s′ ∈ s, if a trace that visits s is more likely to observe a rare state than one that visits s′, one wants f(s) ≥ f(s′). this means that f depends both on the model m and the property ϕ that define s_ϕ. in iosa, continuous variables called clocks sample random values from arbitrary distributions (pdfs). as time evolves, all clocks count down at the same rate.
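the sample-size bound n ≳ 384/γ quoted earlier can be reproduced with a short calculation. this is a minimal python sketch (the function name is illustrative), assuming a two-sided wald confidence interval with z ≈ 1.96 for δ = 0.95 and precision ε = γ/10, both values taken from the text:

```python
import math

def required_samples(gamma, z=1.96, eps_frac=0.1):
    """Wald-CI sample-size bound: n >= z^2 * gamma*(1-gamma) / eps^2,
    with absolute precision eps = eps_frac * gamma."""
    eps = eps_frac * gamma
    return math.ceil(z * z * gamma * (1.0 - gamma) / (eps * eps))
```

for γ ≈ 10^-8 this yields roughly 3.84 × 10^10 traces, matching the figure quoted above and showing why plain monte carlo becomes hopeless for rare events.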
the first to reach zero can trigger events and synchronise with other modules, broadcasting an output action that synchronises with homonymous input actions (iosa are input-enabled). actions can be urgent, where urgent outputs have …
module m1 fc,rc : clock; inf,brk : [0..2] init 0; …
fig 1.2 reads models written in the jani exchange format [7]. model types supported are ctmc and a subset of sta that matches iosa, e.g. with a single pdf per clock and broadcast synchronisation. fig also translates iosa to jani as sta, to share models with tools such as the modest toolset [16] and storm [13]. this is used in sec. 4 for comparisons. properties: fig … builders: engines are nosplit, restart, and sfe, which resp. run cmc, restart (rst [31]), and fixed effort (fe [14]) simulations. the latter two are isplit algorithms: fe was described in sec. 2, and works for transient properties; rst also works for steady-state analysis (steady-state via fe requires regeneration theory [15], seldom applicable to non-markovian models and unsupported by fig). all these experiments can be reproduced via the artifact freely available in [3]. we test different configurations of engines, efforts, and thresholds. for each configuration we run simulations until some timeout. this yields a ci with precision 2ε for confidence coefficient δ = 0.95. the smaller the ε, the narrower the ci, and the better the performance of the configuration (and tool) that produced it. first, we analyse repairable dfts with warm spares and exponential (fail), normal (repair), and lognormal (dormancy) pdfs. using cmc, fe 8,16,32 and rst 3,4,6 we estimate the probability of a top level event after the first failure, before all components are repaired, in trees with 6, 7, and 8 spares (the smallest iosa has 116 variables and > 2.5 × 10^37 states). for isplit we used seq thresholds with --dft 0 --acomp and no arguments, i.e. as automatic as cmc.
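the fixed effort scheme described above can be illustrated on a toy model. the following is a hypothetical python sketch (not fig's c++ implementation): a down-biased random walk with importance function f(s) = s and one threshold per level, where each stage estimates the conditional probability of reaching the next layer before absorption at 0, and the product of the stage estimates is the rare-event estimate:

```python
import random

def hits_next_level(level, p_up, rng):
    # run the walk from `level` until it reaches level+1 (success)
    # or is absorbed at 0 (failure); moves +1 w.p. p_up, else -1
    pos = level
    while 0 < pos <= level:
        pos += 1 if rng.random() < p_up else -1
    return pos == level + 1

def fixed_effort(p_up, top, effort, seed=1):
    """Estimate gamma = P(reach `top` before 0 | start at 1) by fixed-effort
    splitting with importance function f(s) = s (one threshold per level).
    The walk is memoryless, so every entry state into level i is simply i."""
    rng = random.Random(seed)
    estimate = 1.0
    for level in range(1, top):
        # stage: conditional probability of reaching level+1 before 0 from `level`
        hits = sum(hits_next_level(level, p_up, rng) for _ in range(effort))
        if hits == 0:
            return 0.0  # no trace made it; the estimate degenerates to 0
        estimate *= hits / effort
    return estimate
```

with p_up = 0.3 and top = 10 the target probability is around 3 × 10^-4, which a plain monte carlo run of the same budget would rarely observe at all.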
with a 20 min timeout, each configuration was repeated 13 times on a xeon e5-2683v4 cpu running linux x64 4.4.0. the height of the bars in the top plot of fig. 1 is the average ci precision (lower is better), using z-score m=2 to remove outliers [17]. whiskers are standard deviation, and white numbers indicate how many runs yielded not-null estimates. clearly, res algorithms outperform cmc in the hardest cases: less than half of cmc runs in dft-8 could build (wide) cis. second, we estimate the steady-state overflow probability in the last node of tandem queues, on a markovian case with 2 buffers [29], 3 buffers [28], and a non-markovian 3-buffers case [30]. we study how fig (using --amono, seq, and rst 3,4,5,7,9) approximates each optimal ad hoc function and thresholds of [29, 28, 30]. experiments ran as before: the bottom plot of fig. 1 shows that fig's default (rst 3 with seq, legend "auto 3") is always closest to the optimal. third, we compare fig and modes in the original benchmark of the latter [5]. we do so for fe-seq, rst-seq, rst-es, using each tool's default options. we ran each benchmark instance 15 min, thrice per tool, on an intel i7-6700 cpu with linux x64 5.3.1. the scatter plots of fig. 2 show the median of the ci precisions. sub-plots on the bottom-right are zoom-ins in the range [10^-10, 10^-5]. an (x,y) point is an instance whose median ci width was x for one tool and y for the other. albeit modes is multi-threaded, these experiments ran on a single thread to compare both tools on equal conditions. on the other hand, fig also estimates the probability of steady-state properties, for which there is no support in modes.
references:
[1] coupling and importance sampling for statistical model checking
[2] automation of importance splitting techniques for rare event simulation
[3] fig: the finite improbability generator. 4tu.centre for research data
[4] better automated importance splitting for transient rare events
[5] a statistical model checker for nondeterminism and rare events
[6] compositional construction of importance functions in fully automated importance splitting
[7] jani: quantitative model and tool interaction
[8] sequential monte carlo for rare event estimation
[9] some tactical problems in digital simulation
[10] smart sampling for lightweight verification of markov decision processes
[11] input/output stochastic automata with urgency: confluence and weak determinism
[12] splitting for rare event simulation: a large deviation approach to design and analysis
[13] a storm is coming: a modern probabilistic model checker
[14] on the importance function in splitting simulation
[15] the splitting method in rare event simulation
[16] the modest toolset: an integrated environment for quantitative modelling and verification
[17] how to detect and handle outliers. asqc basic references in quality control
[18] command-based importance sampling for statistical model checking
[19] prism 4.0: verification of probabilistic real-time systems
[20] splitting techniques
[21] plasma lab: a modular statistical model checking platform
[22] sbip 2.0: statistical model checking stochastic real-time systems
[23] stochastic automata for fault tolerant concurrent systems
[24] introduction to rare event simulation
[25] rare event simulation using monte carlo methods
[26] rare event simulation for dynamic fault trees
[27] advanced restart method for the estimation of the probability of failure of highly reliable hybrid dynamic systems
[28] importance functions for restart simulation of general jackson networks
[29] restart vs splitting: a comparative study
[30] rare event simulation of non-markovian queueing networks using restart method
[31] restart: a method for accelerating rare event simulations
[32] probable inference, the law of succession, and statistical inference
[33] probabilistic verification of discrete event systems using acceptance sampling
acknowledgments.
the author thanks arnd hartmanns for excellent discussions.
key: cord-102850-0kiypige authors: huang, c.-c.; lai, j.; cho, d.-y.; yu, j. title: a machine learning study to improve surgical case duration prediction date: 2020-06-12 journal: nan doi: 10.1101/2020.06.10.20127910 sha: doc_id: 102850 cord_uid: 0kiypige predictive accuracy of surgical case duration plays a critical role in reducing the cost of operation room (or) utilization. the most common approaches used by hospitals rely on historic averages based on a specific surgeon or a specific procedure type obtained from the electronic medical record (emr) scheduling systems. however, low predictive accuracy of emr leads to negative impacts on patients and hospitals, such as rescheduling of surgeries and cancellation. in this study, we aim to improve prediction of operation case duration with advanced machine learning (ml) algorithms. we obtained a large data set containing 170,748 operation cases (from jan 2017 to dec 2019) from a hospital. the data covered a broad variety of details on patients, operations, specialties and surgical teams. meanwhile, a more recent data set with 8,672 cases (from mar to apr 2020) was also available to be used for external evaluation. we computed historic averages from emr, surgeon- or procedure-specific, and they were used as baseline models for comparison. subsequently, we developed our models using linear regression, random forest and extreme gradient boosting (xgb) algorithms. all models were evaluated with r-square (r^2), mean absolute error (mae), and percentage overage (case duration > prediction + 10 % & 15 mins), underage (case duration < prediction - 10 % & 15 mins) and within (otherwise). the xgb model was superior to the other models by having higher r^2 (85 %) and percentage within (48 %) as well as lower mae (30.2 mins). the total prediction errors computed for all the models showed that the xgb model had the lowest inaccurate percent (23.7 %).
as a whole, this study applied ml techniques in the field of or scheduling to reduce the medical and financial burden for healthcare management. it revealed the importance of operation and surgeon factors in operation case duration prediction. this study also demonstrated the importance of performing an external evaluation to better validate the performance of ml models.
it becomes more and more important for clinics and hospitals to manage resources for critical care during the covid-19 pandemic. statistics show that approximately 60 % of patients admitted to the hospital will need to be treated in the operation room (or) [11], and the average cost of or is up to 2,190 dollars per hour in the united states [1, 6]. hence, the or is considered one of the highest hospital revenue generators and accounts for as much as 42 % of a hospital's revenue [6, 10]. based on these statistics, good or scheduling and management is not only critical to patients who are in need of elective, urgent and emergent operations, but is also important for surgical teams to be prepared. owing to the importance of the or, improvement of or efficiency has high priority so that the cost and time spent on the or are minimized while the utilization of the or is maximized to increase surgical case numbers and patient access [15]. in a healthcare system, numerous factors are involved in affecting or efficiency, for example patient expectation and satisfaction, interactions between different professional specialties, unpredictability during operations, surgical case scheduling, etc. [20]. although the process of the or is complex and involves multiple parties, one way to enhance or efficiency is by increasing the accuracy of predicted surgical case duration.
over- or under-utilization of or time often leads to undesirable consequences such as idle time, overtime, cancellation or rescheduling of surgeries, which may impose negative impacts on the patient, staff and hospital [21]. in contrast, high efficiency in or scheduling not only contributes to better arrangement of the usage of the operating room and resources, it can also lead to cost reduction and revenue increment since more surgeries can be performed. currently, most hospitals schedule surgical case duration by employing estimations from the surgeon and/or averages of historical case durations, and studies show that both of these methods have limited accuracy [14, 17]. for case length estimated by surgeons, factors including patient conditions and anesthetic issues might not be taken into consideration. moreover, underestimation of case duration often occurs as surgeon estimations were usually made by leaning towards maximizing block scheduling to account for potential cancellations and cost reduction. furthermore, operations with higher uncertainty and unexpected findings during operation add difficulties and challenges to case length estimation [14]. historic averages of case duration for a specific surgeon or a specific type of operation obtained from electronic medical record (emr) scheduling systems have also been used in hospitals. however, these methods have been shown to produce low accuracy due to large variability and lack of the same combination in the preoperative data available on the case that is being performed [25]. in order to improve the predictability, researchers utilized linear statistical models, such as regression, or simulation for surgical duration prediction and evaluation of the importance of input variables [8, 12, 13].
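the historic-average baseline described above can be sketched in a few lines of python. this is a hypothetical stand-in for an emr scheduling query, not the hospital's system; the field names 'surgeon', 'procedure' and 'duration' are illustrative:

```python
from collections import defaultdict

def historic_average_model(history, key="surgeon"):
    """Baseline predictor: mean past duration for the same surgeon (or, with
    key="procedure", the same procedure type). `history` is a list of dicts
    with hypothetical fields 'surgeon' / 'procedure' / 'duration' (minutes);
    unseen keys fall back to the global mean."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    total, n = 0.0, 0
    for case in history:
        sums[case[key]] += case["duration"]
        counts[case[key]] += 1
        total += case["duration"]
        n += 1
    overall = total / n if n else 0.0
    def predict(case):
        k = case.get(key)
        return sums[k] / counts[k] if counts[k] else overall
    return predict
```

the sketch makes the limitation plain: nothing about the patient, anesthesia or team enters the prediction, only past durations for the same key.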
however, a common shortcoming of these studies is that relatively few input variables or features were used in their models, due to the limitation of statistical techniques in handling too many input variables. similarly, we combined categories for primary surgeon's id, specialty, anesthesia type and room number which had case numbers less than 50 into the category of 'others'. in addition, since operation case duration can be related to the performance of surgeons and surgeons' performance is affected by their working time, we also analysed
[the copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. it is made available under a cc-by-nc-nd 4.0 international license. this version posted june 12, 2020.]
figure 1. the workflow of model training for this study. the data set used for model training falls within the time range of jan 1, 2017 to dec 31, 2019. from this data set, about 17 % of the cases were excluded based on these criteria: patients with two or more surgical procedures performed at the same time, emergent and urgent cases, surgeons with age under 28, patients with age younger than 20, pregnant patients, procedure duration longer than 10 hours or less than 10 minutes, and cases with missing values. the total number of cases included in the data set for model building was 142,448. this data set was then split into training (80 %) and validation (20 %) subsets for model development. machine learning and linear regression models were developed on the training data set and validated on the validation data set using r-square and mean absolute error. percentage of cases with actual duration differences falling within 10 % and 15 minutes of predicted procedure duration was also computed.
eventually, the models were further evaluated on the most recent surgical cases (from mar 1 to apr 30, 2020) which were not included in the original data set for model training.
surgical minutes performed by the same primary surgeons on the same day as well as within the last 7 days, and the number of urgent and emergent operations prior to the case that was being performed by the same surgeon, were included in the analysis. together, 24 predictor variables were included for predictive model building in this study. these predictors can be categorised into 5 groups: patient, surgical team, operation, facility and primary surgeon's prior events (see table 1). model development and training: we applied multiple ml methods for operation case duration prediction. operation case duration (in minutes) is the total period starting from the time the patient enters the or to the time of exiting the or. historic averages of case durations, surgeon-specific or procedure-specific, from emr systems were used as baseline models for comparison in case duration prediction. at the beginning, we performed multivariate linear regression (reg) to predict operation case duration. however, when we looked at the distribution of operation case duration, it was observed to be skewed to the right (fig. 2). we performed a logarithmic transformation on operation case duration to reduce the skewness. the model built from log-transformed multivariate linear regression (logreg) outperformed reg in all evaluation indexes. subsequent ml algorithms were also trained by using the log-transformed case duration as the target.
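the effect of the logarithmic transformation on a right-skewed duration distribution can be illustrated with a small python sketch using synthetic lognormal durations (illustrative, not the hospital data):

```python
import math
import random

def skewness(xs):
    # population skewness: mean of ((x - mean) / sd) ** 3
    n = len(xs)
    m = sum(xs) / n
    sd = math.sqrt(sum((x - m) ** 2 for x in xs) / n)
    return sum(((x - m) / sd) ** 3 for x in xs) / n

rng = random.Random(0)
# synthetic right-skewed "durations" in minutes (lognormal shape)
durations = [math.exp(rng.gauss(4.0, 0.6)) for _ in range(5000)]
log_durations = [math.log(d) for d in durations]
```

the raw durations show strongly positive skewness while the log-transformed values are near-symmetric, which is why logreg and the subsequent ml models used log duration as the target.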
the first ml algorithm that we tested is random forest (rf), a tree-based supervised learning algorithm. rf uses the bootstrap aggregation or bagging technique for regression by constructing a multitude of decision trees based on training data and outputting the mean predicted value from the individual trees [19]. the bagging technique is unlikely to over-fit; in other words, it reduces the variation without increasing the bias. tree-based techniques were suitable for our data since they include a large number of categorical variables, e.g. icd code and procedure type, most of which were sparse. the number of trees that was set in this study is 50. the extreme gradient boosting (xgb) algorithm is the other supervised ml algorithm that was tested for comparison to rf. recently, the xgb algorithm has gained popularity within the data science community due to its ability in overcoming the curse of dimensionality as well as capturing the interaction of variables [18]. xgb is also a decision tree-based algorithm but more computationally efficient for real-time implementation than rf. the xgb and rf algorithms differ in the way the trees are built. it has been shown that xgb performs better than rf if parameters are tuned carefully, otherwise it would be more prone to over-fitting if the data are noisy [3, 9]. we adopted a 5-fold cross validation strategy to tune for the best parameters.
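the paper tunes xgb via 5-fold cross validation in r; as a minimal sketch of the partitioning step only (a generic python fragment, not the authors' code), the case indices can be split into five shuffled, near-equal folds:

```python
import random

def kfold_indices(n_cases, k=5, seed=0):
    """Partition case indices 0..n_cases-1 into k shuffled, near-equal folds;
    each fold serves once as the held-out set during parameter tuning."""
    idx = list(range(n_cases))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]
```

each candidate parameter setting is then trained on k-1 folds and scored on the remaining one, and the setting with the best average score is kept.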
the training data were used to build different predictive 147 models as well as to extract important predictor variables. the testing data were used 148 for internal evaluation of the models.in addition to interval evaluation, external 149 evaluation on all the models were performed using data from mar 1 to apr 30, 2020. surgeon-or procedure-specific calculated from emr were also evaluated on the same 154 internal and external testing sets to ensure fair and uniform comparison across all 155 models. data processing and cleaning as well as model development in this study were 156 performed using r software. the packages "xgboost and "randomforest were used to 157 implement xgb and rf algorithms in r [4, 5] . 164 r 2 is the coefficient of determination, it represents the proportion of the variance for 165 the actual case duration that is explained by predictor variables in our models. mean absolute error (mae) measures the average of errors between the actual case 167 6/15 . cc-by-nc-nd 4.0 international license it is made available under a is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. the copyright holder for this preprint this version posted june 12, 2020. hand, the average model based on a specific procedure had lower percentage underage 188 and overage compared to the surgeon-specific model. these differences were due to an 189 extensive procedure classification in the procedure-specific model. however, the 190 percentage underage was still quite high. since no other information is taken into 191 consideration in the average model, except durations of operation cases happened in the 192 past, prediction bias and low accuracy usually result from the average model. 193 we first fitted the reg model by including all the input variables showed in table 1 . is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. 
a model over-fits when its performance is better on the training set but poor on the testing set. when we log-transformed operation case duration and re-ran a regression model (i.e. logreg), the performance of the logreg model improved and outperformed the reg model [12, 23]. again in the logreg model, the results of all the evaluation metrics were close for training, internal and external testing sets, so the model was not over-fitting. although performance of the logreg model was not bad, an assumption of linear … performance of the xgb model was better than the rf model on the training set but did not improve a lot compared to the rf model on internal and external testing sets. since xgb was more computing efficient than rf, the xgb model was chosen to be the best model and was used in subsequent analysis. in addition to the three key metrics, we studied the inaccuracy of the different models by using the external testing set. we calculated the total prediction error (in minutes) and the corresponding inaccurate percentage for all the models. the results are reported in table 3. in fig. 3, we plotted scatter plots of actual versus predicted duration on the external testing set for the surgeon- and procedure-specific average models and the xgb model. a straight line indicating the theoretical perfect relationship, i.e. predicted and actual procedure duration being identical, was added as a reference in each scatter plot. the data points of the xgb model were aligned closer to the straight line. therefore, the xgb model showed a higher correlation between predicted and actual duration compared to the other two types of average model. fig. 4 shows the density plot of differences between actual and predicted case durations for the two average models and the xgb model. it clearly demonstrates that the error distribution of the xgb model was narrower and closer to 0.
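the evaluation metrics used throughout (r^2, mae, and percentage underage/overage/within) can be sketched as follows. note the per-case tolerance here is taken as the larger of 10 % of the prediction and 15 minutes, which is one reading of the "10 % & 15 mins" rule; the text does not spell out how the two thresholds combine:

```python
def evaluate(actual, pred, pct=0.10, mins=15):
    """r^2, mae and underage/overage/within fractions for duration
    predictions, with per-case tolerance max(pct * prediction, mins)
    (an assumed reading of the paper's '10 % & 15 mins' rule)."""
    n = len(actual)
    mae = sum(abs(a - p) for a, p in zip(actual, pred)) / n
    mean_a = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, pred))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    tol = [max(pct * p, mins) for p in pred]
    over = sum(a > p + t for a, p, t in zip(actual, pred, tol)) / n
    under = sum(a < p - t for a, p, t in zip(actual, pred, tol)) / n
    return {"r2": r2, "mae": mae,
            "over": over, "under": under, "within": 1.0 - over - under}
```

on a batch of cases this yields exactly the quantities compared across the baseline, reg, logreg, rf and xgb models.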
as a result, the xgb model is more accurate than the other models in predicting operation case duration.
fig. 4 caption: the density plot of differences between the actual operation case durations and predicted case durations obtained from the xgb model (light blue color) was narrower and centered more at 0 than the density plots of those obtained from the average models (pink and cyan colors). in the average models, previous operation case durations, either averaged for a specific surgeon (cyan color) or specific procedure (pink color), were used as predictions.
the wfg of a variable measures how important the variable is in making a branch of a decision tree purer [5, 22]. a higher wfg percentage indicates that the variable is more important. the top 15 important variables are shown in table 4. one thing worth noting is that 3 of the top 4 important variables are attributed to operation information. moreover, three of the features which we computed from surgeons' data (i.e. total surgical minutes performed by the surgeon within the last 7 days and on the same day, and number of …).
accurate prediction of operation case duration is vital in elevating or efficiency and reducing cost. this study not only helps to improve the accuracy of or case prediction, it also has novelty in the following aspects.
first, the data set used in this study contained more than 140,000 cases and more than 400 different types of surgical procedures, which set up a new benchmark for huge amount and large diversity. the maximal number of cases that had been used in other studies was in the range of 40,000 to 60,000 [2, 21]. second, or events were modeled as dependent events instead of independent. to this end, we extracted some additional information from surgeons' data, e.g. previous … third, we used the most recent cases from mar to april 2020 as external testing data for model evaluation. fourth, though urgent and emergent surgeries were excluded from the data, the number of urgent and emergent operations prior to the case that was being performed by the same surgeon was included as an input variable to account for its effect on operation case duration. currently, surgical cases at cmuh are scheduled according to estimates made by primary surgeons. however, surgeon estimates rely heavily on the prior experiences of the surgeons, and many factors beyond expectation will not be taken into consideration. since there is no formal record of surgeon estimates, we used averages calculated based on a specific surgeon or procedure type on the testing set to be our baseline models. the performance of these two average models, as reported in table 2, clearly showed that these models were poor in predicting operation case duration. they also tended to under-predict operation case duration according to their scatter plots of actual versus prediction and density plots of differences between actual versus prediction (see fig. 3 and 4).
When 24 feature variables (Table 1) were included in our model development, R², MAE, and the percentages of underage, overage and within-tolerance predictions improved greatly compared to the baseline models. We applied 15 minutes as the tolerance threshold for percentage underage, overage and within, because ±15 minutes is the range accepted at CMUH for a booking to be considered accurate. To avoid too stringent a standard and to better compare our outcomes with other studies [2, 24], a tolerance threshold of 10% was also applied.

By using regression and ML approaches, we were able to decrease the total prediction error (Table 3) of operation case durations at CMUH. Among all the models, the performance of the XGB model was considered the best because it was more computationally efficient and had the lowest inaccuracy. Moreover, even though the evaluation metrics of the RF model were similar to those of the XGB model, the XGB model was still able to reduce the total prediction error from 223,686 to 218,415 minutes; in other words, the XGB model saved more than 5,000 minutes of idle or delay time compared to the RF model. Since most ORs usually have multiple cases scheduled per day, the total prediction error represents the cumulative effect of all OR cases in the two-month period of March to April 2020. This cumulative effect may eventually translate into a significant financial advantage by allowing an additional operation case to be scheduled [7]. This would also lead to a significant cost reduction and increased revenue, because ORs are utilized appropriately and efficiently.

It has been reported in past studies that primary surgeons contributed the largest variability in operation case duration prediction compared to factors attributed to patients [2, 16, 23].
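The underage/overage/within metrics used above amount to a simple tolerance-band check. A minimal sketch (a hypothetical helper; the study does not publish its evaluation code):

```python
def booking_accuracy(actual, predicted, tol=15):
    """Fractions of cases under-predicted, over-predicted, and within
    a +/- tol minute tolerance band (durations in minutes)."""
    n = len(actual)
    under = sum(1 for a, p in zip(actual, predicted) if p < a - tol)
    over = sum(1 for a, p in zip(actual, predicted) if p > a + tol)
    return under / n, over / n, (n - under - over) / n
```

For example, `booking_accuracy([60, 90, 120], [40, 95, 150])` classifies one case as under-predicted, one as within tolerance, and one as over-predicted.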
These studies provide evidence and rationale that more factors relating to the primary surgeon should be added as input variables when training ML models. Moreover, extensive feature engineering usually improves the quality of an ML model, independently of the modeling technique itself. As a result, in addition to the primary surgeon's identifier, gender and age, we computed the previous working time and the number of previous surgeries performed by the same primary surgeon within the last 7 days and on the same day. We also counted the number of urgent and emergent operations performed prior to the case by the same primary surgeon. These variables extracted from the primary surgeons' data were significantly (p < 0.05) correlated with operation case duration (see Table 5 in the Appendix). The correlation coefficients of these variables also revealed that the duration of an operation performed by a primary surgeon may decrease as he or she becomes more familiar with the surgical procedure, but may increase if his or her total surgical minutes are too long. Although performing a surgery multiple times on different patients may help a primary surgeon to be more efficient in the next operation, long working hours may also lead to lethargy and affect the primary surgeon's performance.

In the data processing methodology, for predictor variables that contained many categories, we grouped categories that had fewer than 50 cases into a category named 'others'. In addition to reducing data dimensionality for categorical features, this may aid the generalization of our model.
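The grouping of rare categories just described can be sketched in a few lines (illustrative only; the threshold of 50 and the label 'others' are taken from the text, the function name is ours):

```python
from collections import Counter

def group_rare_categories(values, min_count=50, other_label="others"):
    """Replace categories occurring fewer than min_count times with a
    shared 'others' label, reducing dimensionality after encoding."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]
```

The same label can be reused at prediction time, e.g. assigning 'others' to a surgeon ID unseen during training.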
This indicates that our model will still be able to predict case duration even for rare operations. Moreover, our model can be applied to new primary surgeons, who were not included in the training set during model development, by setting their ID to 'others' for case duration prediction. However, the model still needs to be updated after a while, for example when the number of operation cases performed by a new primary surgeon has grown beyond a certain threshold. In terms of timing, we recommend updating the model annually, using the operation cases performed in the most recent 3 years as training data.

One limitation of this study is that we selected only predictor variables that could be extracted from preoperative data. Our ML model still needs to be improved in order to predict surgical case duration dynamically. For example, blood loss during an operation may affect case duration, as an unexpected increase in blood loss may cause surgeons to take longer to complete the surgery. Therefore, it would be better if intra-operative data were incorporated during ML model development, so that the prediction made by the ML model could be updated during the operation. One common issue in all ML studies predicting operation case duration, including ours, is that the ML models were developed using data from a single site. These ML models are difficult to generalize, since the surgical teams, facilities and patient populations differ across institutions. A model has to be custom made for a given organization using training data containing its patients, procedures, surgeons, medical staff, and the facility itself. As a result, the exact same ML model is not meant to, and will not, perform well when applied to another organization or hospital. Another interesting issue in applying ML or artificial intelligence to operation estimation is that medical technologies evolve fast.
Hence, how frequently an ML or artificial intelligence model needs to be updated remains an open question.

The XGB model was superior in predictive performance compared to the average, REG and logREG models. The total inaccuracy of the predictions of the XGB model was the lowest among the models developed in this study. Although the performance of the RF model was close to that of the XGB model, the XGB model was more computationally efficient, in that it took a shorter time to complete the training process. The coefficient of determination (R²) of the XGB model built in this study was higher, and its percentages of under- and over-prediction lower, than in other ML studies [2, 21, 24]. Moreover, this model improves the current OR scheduling method at CMUH, which is based on estimates made by surgeons.

We propose extracting additional information from operation and surgeons' data to be used as predictor variables for ML algorithm training, since their importance was high in the XGB model. Moreover, we validated the model types using an external testing set in addition to the internal testing set split from the original data used in model training. This helped us to validate and test the models in a more stringent and rigorous way; therefore, we suggest that external evaluation be used as a tool to better validate the predictive power of ML models in the future.

1 Appendix. Table 5. Correlation coefficient, standard error, t-value and p-value of predictor variables extracted from primary surgeons' data.
This information was obtained from the log-transformed multivariate regression (logREG) model.

https://doi.org/10.1101/2020.06.10.20127910

References:
Optimization and planning of operating theatre activities: an original definition of pathways and process modeling.
Improving operating room efficiency: machine learning approach to predict case-time duration.
A comparative analysis of XGBoost.
Package 'randomForest': Breiman and Cutler's random forests for classification and regression.
Package 'xgboost': extreme gradient boosting.
Understanding costs of care in the operating room.
Decrease in case duration required to complete an additional case during regularly scheduled hours in an operating room suite.
Predicting the unpredictable: a new prediction model for operating room times using individual characteristics and the surgeon's estimate.
Greedy function approximation: a gradient boosting machine.
Factors that influence the expected length of operation: results of a prospective study.
Surgical unit time utilization review: resource utilization and management implications.
Surgical duration estimation via data mining and predictive modeling: a case study.
Use of simulation to assess a statistically driven surgical scheduling system.
Improving predictions of pediatric surgical durations with supervised learning.
The surgical scheduling problem: current research and future opportunities.
Tree boosting with XGBoost: why does XGBoost win "every" machine learning competition?
Newer classification and regression tree techniques: bagging and random forests for ecological prediction.
Operating room efficiency.
Improved prediction of procedure duration for elective surgery.
Decision tree methods: applications for classification and prediction. Shanghai Archives of Psychiatry.
Surgeon and type of anesthesia predict variability in surgical procedure times.
A machine learning approach to predicting case duration for robot-assisted surgery.
Relying solely on historical surgical times to estimate accurately future surgical times is unlikely to reduce the average length of time cases finish late.

The authors would like to thank Shu-Cheng Liu, Jhao-Yu Huang and Min-Hsuan Lu for providing feedback during the progress of this study.

key: cord-027337-eorjnma3 authors: Fratrič, Peter; Sileno, Giovanni; van Engers, Tom; Klous, Sander title: Integrating agent-based modelling with copula theory: preliminary insights and open problems date: 2020-05-22 journal: Computational Science, ICCS 2020 doi: 10.1007/978-3-030-50420-5_16 doc_id: 27337 cord_uid: eorjnma3

Abstract: The paper sketches and elaborates on a framework integrating agent-based modelling with advanced quantitative probabilistic methods based on copula theory. The motivation for such a framework is illustrated on an artificial market functioning with canonical asset pricing models, showing that dependencies specified by copulas can enrich agent-based models to capture both micro-macro effects (e.g. herding behaviour) and macro-level dependencies (e.g. asset price dependencies). In doing so, the paper highlights the theoretical challenges and extensions that would complete and improve the proposal as a tool for risk analysis.

Complex systems like markets are known to exhibit properties and phenomenal patterns at different levels (e.g. trader decision-making at the micro-level and average asset price at the macro-level, individual defaults and contagion of defaults, etc.).
In general, such stratifications are irreducible: descriptions at the micro-level cannot fully reproduce phenomena observed at the macro-level, plausibly because additional variables fail to be captured or cannot be. Yet anomalies of behaviour at the macro-level typically originate from the accumulation and/or structuration of divergences of behaviour occurring at the micro-level (see e.g. [1] for trade elasticities). Therefore, at least in principle, it should be possible to use micro-divergences as a means to evaluate, and possibly calibrate, the macro-level model. One of the crucial aspects of such an exercise would be to map which features of the micro-level models impact (and do not impact) the macro-level model. From a conceptual (better explainability) and a computational (better tractability) point of view, such a mapping would enable a practical decomposition of the elements at stake, thus facilitating parameter calibration and estimation from data. Moreover, supposing these parameters to be adequately extracted, one could put the system under stress conditions and see what kind of systematic response would be entailed by the identified dependence structure. The overall approach could provide an additional analytical tool for systematic risk.

With the purpose of studying, and potentially providing a solution to, these requirements, we are currently working on establishing a theoretical framework integrating agent-based modelling (ABM) with advanced quantitative probabilistic methods based on copula theory. The intuition behind this choice is the possibility of connecting the causal, agentive dependencies captured by agent-based models with the structural dependencies statistically captured by copulas, in order to facilitate the micro-macro mappings, as well as the extraction of dependencies observable at the macro-level.
To the best of our knowledge, even though many research efforts in the computational science and artificial intelligence literature target hybrid qualitative-quantitative methods, the methodological connection of ABM with copula theory is still an underexplored topic. A large-scale agent-based model of trader agents incorporating serial dependence analysis, copula theory and a coevolutionary artificial market, allowing traders to change their behaviour during crisis periods, was developed in [2]; the authors rely on copulas to capture cross-market linkages at the macro-level. A similar approach is taken in [4]. Examples of risk analysis in network-based settings can be found for instance in [5, 6], in which the mechanisms of default and default contagion are separated from other dependencies observed in the market. In [3], copulas are used to model low-level dependencies of natural hazards with agent-based models, in order to study their impact at the macro-level. In the present paper, we will use copulas to model dependencies among agents at the micro-level, and we will propose a method to combine aggregated micro-correlations at market scale.

The paper is structured as follows. Section 2 provides some background: it elaborates on the combined need for agent-based modeling and quantitative methods, illustrating the challenges on a running example based on canonical trader models for asset pricing, and gives a short presentation of copula theory. Section 3 reports on the simulation of one specific hand-crafted instantiation of a copula producing a relevant result in the running example, and elaborates on the extensions and theoretical challenges that remain to be solved for the proposal to be operable. A note on future developments ends the paper.

In financial modelling, when statistical models are constructed from time series data, it is common practice to separately estimate serial dependencies and cross-sectional dependencies.
The standard approach to capture serial dependence (also referred to as autocorrelation) is to use autoregressive models [10]. If the time series exhibits volatility clustering, i.e. large changes tend to be followed by large changes, of either sign, and small changes tend to be followed by small changes, then it is typical to use the generalized autoregressive conditional heteroskedasticity (GARCH) model [11], or one of its variants [12]. Once the GARCH model is estimated, the cross-sectional dependence analysis can be performed on the residuals. Unfortunately, autoregressive models provide little information useful for interpretation; this is no surprise, since these models are purely quantitative and suffer from problems common to all data-driven methods. As low interpretability goes along with a limited possibility of performing counterfactual or what-if reasoning (see e.g. the discussion in [9]), such models are weakly justifiable in policy-making contexts, for instance in establishing sound risk-balancing measures to be held by economic actors.

An alternative approach is given by agent-based modelling (ABM), through which several interacting heterogeneous agents can be used to replicate patterns in data (see e.g. [2]). Typically, agent models are manually specified from known or plausible templates of behaviour. To a certain extent, their parameters can be set or refined by means of statistical methods. Model validation is then performed by comparing the model execution results against some expected theoretical outcome or observational data. These models can also be used to discover potential states of the system not yet observed [23], thus becoming a powerful tool for policy-making.
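As an aside, the GARCH filtering step mentioned above can be sketched as a variance recursion that produces standardized residuals. This is a sketch under the assumption that the GARCH(1,1) parameters have already been estimated elsewhere; estimation itself is omitted:

```python
def garch11_residuals(returns, omega, alpha, beta):
    """Standardize returns by GARCH(1,1) conditional volatility:
    sigma2_t = omega + alpha * r_{t-1}^2 + beta * sigma2_{t-1}.
    Requires alpha + beta < 1 (covariance stationarity)."""
    sigma2 = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    residuals = []
    for r in returns:
        residuals.append(r / sigma2 ** 0.5)
        sigma2 = omega + alpha * r * r + beta * sigma2
    return residuals
```

Cross-sectional (e.g. copula-based) dependence analysis is then performed on the returned residuals rather than on the raw returns.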
Independently of their construction, agent models specify, at least qualitatively, both serial dependencies (through the functional dependence between actions) and cross-sectional dependencies (through the topological relationships between components), and are explainable in nature. However, they do not complete the full picture of the social system, as the focus of the designers of agent-based models is typically on the construction of its micro-level components. Nevertheless, to elaborate on the connection between micro-level and macro-level components of a social system, we still need to start by capturing the behavioural variables associated with micro-level system components, assuming them to have the strongest effect at the micro-level (otherwise the micro-level would be mostly determined by the macro-level).

Running example: asset market. We consider a paradigmatic scenario for ABM: an asset market, in which traders concurrently sell, buy or hold their assets. Our running example is based on canonical asset pricing models, proceeding along [17].

Fundamental value. The target asset has a publicly available fundamental value given by a random walk process:

f(t + 1) = f(t) + η, (1)

where η is a normally distributed random variable with mean zero and standard deviation σ_η.

Market-maker agent. At the end of each trading day, a market-maker agent sets the price at which a trader agent can buy or sell the asset according to a simple rule:

p(t + 1) = p(t) + a (d(t) − s(t)) + δ, (2)

where the variable d(t) denotes the number of buy orders at time t, s(t) denotes the number of sell orders at time t, and δ is a normally distributed random variable with zero mean and constant standard deviation σ_δ. The positive coefficient a can be interpreted as the speed of price adjustment.

Fundamental traders. Fundamental traders operate under the assumption that the price of an asset eventually returns to its fundamental value. Therefore, for them it is rational to sell if the value of an asset is above its fundamental value, and to buy if the value of an asset is below its fundamental value. Their price expectation can be written as:

E[p(t + 1)] = p(t) + x_fund (f(t) − p(t)) + α, (3)
therefore, for them it is rational to sell if the value of an asset is above its fundamental value and buy if the value of an asset is below its fundamental value. their price expectation can be written as: where α is a normally distributed random variable with mean zero and standard deviation σ α . x fund can be interpreted as the strength of a mean-reverting belief (i.e. the belief that the average price will return to the fundamental value). technical traders. in contrast, technical traders, also referred to as chartists, decide on the basis of past trend in data. they will buy if the value of an asset is on the rise, because they expect this rise to continue and sell if the value is on decline. their expectation can be written as: where β is a normally distributed random variable with mean zero and standard deviation σ β . x tech can be interpreted as a strength of reaction to the trend. relevant scenario: herding behaviour. because they are intrinsic characteristics of each agent, x fund and x tech can be seen as capturing the behavioural variables we intended to focus on at the beginning of this section. now, if for all traders x fund or x tech happen to be realized with unexpectedly large values at the same time, the effect of α and β will be diminished, and this will result in higher (lower) expected value of the asset price and then in the consequent decision of traders to buy (sell). purchases (sales) in turn will lead the market-maker agent to set the price higher (lower) at the next time step, thus reinforcing the previous pattern and triggering herding behaviour. such chain of events are known to occur in markets, resulting in periods of rapid increase of asset prices followed by periods of dramatic falls. note however that this scenario is not directly described by the agent-based models, but is entailed as a possible consequence of specific classes of instantiations. 
Herding behaviour is recognized to be a destabilizing factor in markets, although extreme time-varying volatility is usually both a cause and an effect of its occurrence. In the general case, factors contributing to herding behaviour are: the situation on the global market, the situation in specific market sectors, policies implemented by policy makers, etc. All these factors are somehow processed by each human trader. However, because such mental reasoning is only partially similar across agents, and often includes non-deterministic components (including the uncertainty related to the observational input), it is unlikely that it can be specified by predefined, deterministic rules. For these reasons, probabilistic models are a suitable candidate tool to compensate for the impossibility of going beyond a certain level of model depth, in particular to capture the mechanisms behind behavioural variables such as x_fund and x_tech. In the following, we will therefore consider two normally distributed random variables X_fund and X_tech realizing them. This means that the traders will perceive the price differences in the parentheses of eqs. (3) and (4) differently, attributing to them a different importance at each time step.

Looking at eqs. (3) and (4), we can see that the essence of an agent's decision making lies in balancing his decision rule (e.g. x_fund (f_t − p_t) for the fundamental trader) with the uncertainty about the asset price (e.g. α). If, for instance, the strength of the mean-reverting belief x_fund happens to be low (in probabilistic terms, a value from the lower tail), then the uncertainty α will dominate the trader's decision. In contrast, if x_fund happens to be very high (i.e. from the upper tail), then the trader will be less uncertain, and the trader's decision to buy or sell will be determined by (f_t − p_t). Similar considerations apply to technical traders.
Assuming that the behavioural random variables are normally distributed, obtaining values from the upper tail is rather unlikely, and even if some agent's behavioural variable is high, it will not influence the asset price very much, since the asset price is influenced collectively. However, if all traders have strong beliefs about the rise or fall of the price of the asset, then the price will change dramatically. The dependence of the price on a collective increase in certainty cannot be directly modeled by the standard toolkit of probability, and motivates the use of copulas.

Copula theory is a sub-field of probability theory dealing with the description of dependencies holding between random variables. Application-wise, copulas are well-established tools for quantitative analysis in many domains, e.g. economic time series [15] and hydrological data [16]. Consider a random vector U = (U_1, ..., U_d). If all components of U are independent, we can compute its joint probability distribution function as F_U(u_1, ..., u_d) = F_{U_1}(u_1) · ... · F_{U_d}(u_d), i.e. the product of the marginal distributions. In case of dependence among the components, we need some function that specifies this dependence. The concept of copula is essentially presented as a specific class of such functions, defined on uniform marginals [13]: a d-dimensional copula C : [0, 1]^d → [0, 1] is the joint cumulative distribution function of a d-dimensional random vector whose marginals are uniform on [0, 1]. To obtain a uniform marginal distribution from any random variable we can perform a probability integral transformation u_i = F_i(x_i), where F_i is the marginal distribution function of the random variable X_i. In practice, when we estimate a copula from data, we estimate the marginals and the copula components separately. We can then introduce the most important theorem of copula theory (Sklar's theorem): for any d-dimensional distribution function F with marginals F_1, ..., F_d there exists a copula C such that

F(x_1, ..., x_d) = C(F_1(x_1), ..., F_d(x_d)); (5)

if F_1, ..., F_d are continuous, this copula is unique. If we consider the partial derivatives of eq.
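The probability integral transformation u_i = F_i(x_i) can be illustrated with the standard library alone: applying a variable's own CDF to its draws yields approximately uniform values.

```python
import random
from statistics import NormalDist

random.seed(0)
nd = NormalDist(mu=0.0, sigma=1.0)
xs = [random.gauss(0.0, 1.0) for _ in range(20000)]
us = [nd.cdf(x) for x in xs]       # u_i = F_i(x_i), approximately Uniform(0, 1)
mean_u = sum(us) / len(us)         # should be close to 1/2
```

The inverse direction, x_i = F_i^{-1}(u_i) via `NormalDist.inv_cdf`, is the quantile transformation used later to turn copula samples into agents' behavioural variables.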
(5), we obtain the density function in the form:

f(x_1, ..., x_d) = c(F_1(x_1), ..., F_d(x_d)) · f_1(x_1) · ... · f_d(x_d), (6)

where c is the density of the copula and f_i is the marginal density of the random variable X_i. The reason why copulas gained popularity is that the cumulative distribution functions F_i contain all the information about the marginals, while the copula contains the information about the structure of dependence, enabling a principled decomposition for estimation.

Corresponding to the high variety of dependence structures observed in the real world, there exist many parametric families of copulas, specializing in specific types of dependence. The most interesting type for economic applications is tail dependence. For example, if nothing unusual is happening on the market and the time series revolve around their mean values, then the time series might seem only weakly correlated; however, co-movements far away from the mean value tend to be correlated much more strongly. In probabilistic terms, the copula describing such dependence between random variables has strong tail dependence. Tail dependence does not have to be symmetrical: certain types of copulas have strong upper tail dependence and weaker lower tail dependence. In simple terms, this means that there is a higher probability of observing the random variables realized together in the upper quantiles of their distributions than in the lower quantiles. One of the parametric copulas having such properties is the Joe copula [13], illustrated in Fig. 1.

This section aims to show how copulas can be effectively used to enrich stochastic agent-based models with additional dependencies, relative to the micro- and macro-levels. In particular, we will focus on the phenomenon of the dependence of the asset price on a collective increase in strength of belief associated with herding behaviour scenarios, observed in Sect. 2.2.
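Sampling the Joe copula itself requires a Sibuya-distributed frailty, which is somewhat involved; as a hedged stand-in, the sketch below samples the Clayton copula, another Archimedean family with a very simple Gamma-frailty sampler, and flips it (U -> 1 − U) to obtain the upper tail dependence discussed above. The last line shows the quantile transformation of a copula component into a normally distributed behavioural variable (the mean 1.0 and standard deviation 0.5 are placeholders):

```python
import random
from statistics import NormalDist

def flipped_clayton_sample(d, theta, rng):
    """One d-dimensional copula sample with upper tail dependence.
    Marshall-Olkin algorithm: V ~ Gamma(1/theta, 1) and
    U_i = (1 + E_i / V)^(-1/theta) yield a Clayton copula (lower tail
    dependent); the survival flip 1 - U_i moves the mass to the upper tail."""
    v = rng.gammavariate(1.0 / theta, 1.0)
    return [1.0 - (1.0 + rng.expovariate(1.0) / v) ** (-1.0 / theta)
            for _ in range(d)]

rng = random.Random(42)
samples = [flipped_clayton_sample(2, 8.0, rng) for _ in range(20000)]

# Joint upper-tail exceedances occur far more often than the 0.05 * 0.05
# expected under independence.
joint_tail = sum(u1 > 0.95 and u2 > 0.95 for u1, u2 in samples) / len(samples)

# Quantile transform of one component into a behavioural variable.
behavioural = NormalDist(1.0, 0.5).inv_cdf(samples[0][0])
```

With the dependence parameter set high (8.0, mirroring the Joe parameter used below), a joint draw from the upper tail, and hence a collective strong belief, becomes orders of magnitude more likely than under independence.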
To model the balance between the certainty and uncertainty of each trader with respect to the current price of the asset, we need to set the marginal distribution functions of X_fund and X_tech to have mean values and standard deviations such that, if herding behaviour occurs, the uncertainty parameters α and β play essentially no role, but if herding behaviour is not occurring, then α and β stop traders from massive buying or selling. Therefore the parameters for X_fund, X_tech and α, β are not entirely arbitrary. To illustrate the influence of the dependence structures of the behavioural random variables, we will compare simulations of the market with independent behavioural random variables to simulations of the market whose random variables have a dependence structure defined by the Joe copula. It is important to make clear that this exercise does not have any empirical claim: it is meant only to show that copulas can be used for a probabilistic characterization of the social behaviour of the system.

Consider a group of n_total = 1000 trader agents consisting of n_fund = 300 fundamental traders and n_tech = 700 technical traders (this ratio roughly reflects the situation of a real market). Let us denote by X the vector collecting all behavioural random variables of the traders. Table 1 reports all the parameters we have used for our simulation. The sequence of fundamental values {f_t}, t = 1, ..., T, is generated by eq. (1) at the beginning and remains the same for all simulations.

Independence scenario. At first, we will assume that the (normally distributed) behavioural random variable assigned to each trader is independent of the other behavioural variables. This means that the probability density function capturing all behavioural variables of the market can be written as a simple product of marginal density functions:

f_X(x_1, ..., x_{n_total}) = f_{X_1}(x_1) · ... · f_{X_{n_total}}(x_{n_total}). (7)

We have considered T = 502 time steps, with initialization p_1 = 10 and p_2 = 10.02574.
Figure 2 illustrates the output of 100 simulations of the market with the probability density function given by eq. (7). The simulations on average follow the fundamental price of the asset. The marginal distribution of the increments δp_t, taking the realizations of the generated time series (Fig. 2), clearly follows a normal distribution (left of Fig. 4).

Dependence scenario. Let us keep the same number of agents and exactly the same parameters, but this time consider a dependence structure between the behavioural variables described by a Joe copula with parameter equal to 8. As shown in Fig. 1, this copula has strong upper tail dependence. The Joe copula is an Archimedean copula, which means it admits a univariate generator; hence drawing samples from this copula is not time consuming, even in large dimensions. In our case, each sample will be an n_total-dimensional vector u with components u_i from the unit interval. For each agent, a quantile transformation x_{a,i} = Q_{X_{a,i}}(u_i) will be made using the quantile function Q_{X_{a,i}} of the i-th agent, a ∈ {fund, tech}, to obtain a realization from the agent's density function. Here the i-th agent's behavioural random variable is again distributed as specified in Table 1.

Running 100 market simulations, the time series we obtain this time are much more unstable (Fig. 3). This is due to the structural change in the marginal distribution function of δp_t, which now has much fatter tails. The fatter tails can be seen in the right histogram of Fig. 4 and in the comparison of both histograms by normal QQ-plots in Fig. 5. We see that under independence the increments follow a normal distribution very closely, but under the dependence defined by the Joe copula the tails of the marginal distribution deviate greatly, and the distribution approximates a normal one only around the mean. For applications in finance it would be desirable to extend this model to broader contexts: e.g.
a market with many assets, whose prices in general may exhibit mutual dependencies. A way to obtain this extension is to introduce adequate aggregators, following for instance what is suggested in [14, 3.11.3]. In order to apply this method, however, we need to make a few assumptions explicit.

Step-wise computation assumption. The increase or decrease of the price of the asset, δp_t, is obtained via the simulation of the agent-based market model. As can be seen in formulas (3) and (4), the increment δp_t at time t depends on the realization of the random variables X at time t, with the agents observing the previous market values p_{t−1} and p_{t−2}. This means that δp_t is also a continuous random variable, and its probability density function should be written as f_{δp_t | p_{t−1}, p_{t−2}, X}. Note that this function can be entirely different at each time t, depending on which values of p_{t−1}, p_{t−2} the agents observe. Since the time steps are discrete, the density functions form a sequence {f_{δp_t | p_{t−1}, p_{t−2}, X}} for t = t_0, ..., T. In this paper, for simplicity, we will not describe the full dynamics of this sequence; we will focus on one density function at one time step, assuming therefore that the computation can be performed step-wise. By fixing time, p_{t−1} and p_{t−2} are also fixed (they have already occurred), so we can omit them and write just f_{δp_t | X}.

Generalizing to multiple assets. Consider m non-overlapping groups of absolutely continuous random variables X_1, ..., X_m, where each group consists of behavioural random variables or, to make our interpretation more general, predictor variables that determine the value of an asset. Each group X_g forms a random vector and has a scalar aggregation random variable V_g = h_g(X_g).
this means that each value of an asset is determined by a mechanism specified by the function h_g, which might be an agent-based model similar to the one explored in the previous section, but this time each group of predictor random variables will have its own distribution function. we can then write eq. (8), where f denotes the marginal (joint) probability density function of the corresponding variable (variables) written as a subscript. the validity of (8) relies on two assumptions: (a) conditional independence of the groups given the aggregation variables v_1, ..., v_m, and (b) the conditional distribution of the group x_g conditioned on v_1, ..., v_m is the same as the conditional distribution of x_g conditioned on v_g alone (for a 2-dimensional proof see [14]). these assumptions are in principle not problematic in our application, because we are assuming that all interactions at the micro-level of the agents are sufficiently well captured by the distribution of the aggregation variables. hence formula (8) should be viewed as a crucial means of simplification, because it enables a principled decomposition. expressing the density function of v in (8) via the copula representation (6), we obtain formula (9). this formula provides us with a way to integrate in the same model the mechanisms associated with the different assets in the market, by means of a copula at the aggregate level. in other words, by this formula it is possible to calculate the probability of rare events, and therefore estimate systemic risk, based on the dependencies of the aggregation variables and on the knowledge of micro-behaviour specified by the group density functions of the agent-based models. the marginal distribution functions f_{v_i}(v_i) can be estimated either from real-world data (e.g., asset price time series) or from simulations.
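the copula representation invoked for formula (9) — a joint density written as a copula density times the marginal densities — can be sketched concretely. a bivariate gaussian copula and normal marginals are assumed here purely for illustration; the paper's aggregation variables could have any marginals and copula family:

```python
from math import exp, sqrt
from statistics import NormalDist

STD = NormalDist()

def gaussian_copula_density(u1, u2, rho):
    """Bivariate Gaussian copula density c(u1, u2; rho)."""
    z1, z2 = STD.inv_cdf(u1), STD.inv_cdf(u2)
    q = rho * rho * (z1 * z1 + z2 * z2) - 2 * rho * z1 * z2
    return exp(-q / (2 * (1 - rho * rho))) / sqrt(1 - rho * rho)

def joint_density_v(v1, v2, rho, m1, m2):
    """Joint density of two aggregation variables assembled in the
    copula form: c(F1(v1), F2(v2)) * f1(v1) * f2(v2).
    m1, m2 are NormalDist marginals (an assumption of this sketch)."""
    return (gaussian_copula_density(m1.cdf(v1), m2.cdf(v2), rho)
            * m1.pdf(v1) * m2.pdf(v2))
```

at rho = 0 the copula density is identically 1 and the joint density factorizes into the product of the marginals, which is a useful sanity check.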
note that whether we estimate from real-world data or from an agent-based market model should not matter in principle since, if well constructed, the agent-based model should generate the same distribution of δp as the distribution estimated from real-world data. the density function f_{x_g}(x_g) of an individual random vector x_g can be defined as we did in our simulation study. however, to bring this approach into practice, three problems remain to be investigated. estimation of the copula. we need to consider possible structural time dependencies and serial dependencies in the individual aggregation variables. additionally, the agents might change their behavioural script (e.g., traders might switch from technical to fundamental at certain threshold conditions). high dimensionality of eqs. (8) and (9). if we consider n predictor variables for each group g = 1, ..., m, we will end up with an n·m-dimensional density function. interpolation of the function h_g. calculating the high-dimensional integrals that occur, for instance, in formula (8), with the function h_g being implicitly computed by the simulation of an agent-based market model, is clearly intractable. for the first problem, we observe that time dependence with respect to copulas is still an active area of research. most estimation methods do not allow for serial dependence of the random variables. one approach to solve this is to filter the serial dependence with an autoregressive model, as described at the beginning of sect. 2. another approach is to consider a dynamic copula, as in [14, 20, 21]. a very interesting related work is presented in [22], where arma-garch and arma-egarch models are used to filter serial dependence, while a regime-switching copula is considered on the basis of a two-state markov model. using an agent-based model instead of (or integrated with) a markov model would be a very interesting research direction, because the change of regime would also have a qualitative interpretation.
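the serial-dependence filtering mentioned for the first problem can be sketched as follows: fit an ar(1) model by least squares, keep the residuals, and rank-transform them to pseudo-observations, the usual input for copula estimators. this is a minimal stand-in for the arma-garch filters cited above:

```python
import random

def ar1_fit(series):
    """Least-squares AR(1) fit x_t = phi * x_{t-1} + e_t.
    Returns (phi, residuals); the residuals are the series with the
    linear serial dependence filtered out."""
    x_prev, x_next = series[:-1], series[1:]
    phi = (sum(a * b for a, b in zip(x_prev, x_next))
           / sum(a * a for a in x_prev))
    return phi, [b - phi * a for a, b in zip(x_prev, x_next)]

def pseudo_observations(sample):
    """Rank-transform a sample to (0,1) via u_i = rank_i / (n + 1)."""
    order = sorted(range(len(sample)), key=lambda i: sample[i])
    ranks = [0] * len(sample)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    n = len(sample)
    return [r / (n + 1) for r in ranks]

# synthetic AR(1) series with phi = 0.8 to exercise the filter
random.seed(7)
series = [0.0]
for _ in range(1999):
    series.append(0.8 * series[-1] + random.gauss(0, 1))
phi_hat, resid = ar1_fit(series)
u = pseudo_observations(resid)
```

a garch-type filter would additionally standardize the residuals by a fitted conditional volatility before the rank transform.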
for the second problem, although in our example we have used 1000 agents, in general this might not be necessary, considering that abms might not be as heterogeneous, and aggregators might work with intermediate layers between the micro- and macro-levels. for the third problem, a better approach would be to interpolate the abm simulation by some function with a closed form. in future work, we are going to evaluate the use of neural networks (nns), which means creating a model of our agent-based model, that is, a meta-model. the general concept of meta-models is a well-established design pattern [18], and the usage of nns for such purposes dates back to [19]. in our example the basic idea would be to take samples from the distribution f_{x_g}(x_g) as input and the results of an abm simulation v_g as output, and then feed both input and output to train a dedicated nn, to be used at runtime. this would be done for each group g. the biggest advantage of this approach, if applicable in our case, is that we will have not only a quick way to evaluate a function approximating h_g, but also the interpretative power of the agent-based market model, resulting in an overall powerful modelling architecture. agent-based models are a natural means to integrate expert (typically qualitative) knowledge, and directly support the interpretability of computational analysis. however, both the calibration on real data and the model exploration phases cannot be conducted by symbolic means only. this paper sketched a framework integrating agent-based models with advanced quantitative probabilistic methods based on copula theory, which comes with a series of data-driven tools for dealing with dependencies. the framework has been illustrated with canonical asset pricing models, exploring dependencies at the micro- and macro-levels, showing that it is indeed possible to capture quantitatively social characteristics of the system.
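the meta-modelling idea discussed above — train a nn on samples of f_{x_g}(x_g) as input and abm outputs v_g as targets — can be sketched with a tiny one-hidden-layer network. the aggregator h_g below is a cheap stand-in for a real (expensive) abm simulation, and all sizes are assumed for illustration:

```python
import math, random

random.seed(42)

def h_g(x):
    """Stand-in aggregator: in practice this would be the ABM
    simulation mapping group inputs to the aggregation variable."""
    return sum(x) / len(x)

# training data: samples of X_g as input, "ABM" output V_g as target
DIM, HID, N = 4, 8, 200
data = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(N)]
targets = [h_g(x) for x in data]

# one-hidden-layer tanh network as the meta-model
w1 = [[random.gauss(0, 0.5) for _ in range(DIM)] for _ in range(HID)]
w2 = [random.gauss(0, 0.5) for _ in range(HID)]

def predict(x):
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    return sum(w * h for w, h in zip(w2, hidden)), hidden

def mse():
    return sum((predict(x)[0] - t) ** 2 for x, t in zip(data, targets)) / N

def train(epochs=30, lr=0.02):
    # plain per-sample gradient descent on squared error
    for _ in range(epochs):
        for x, t in zip(data, targets):
            out, hidden = predict(x)
            d = 2 * (out - t)
            for j in range(HID):
                g_hidden = d * w2[j] * (1 - hidden[j] ** 2)
                w2[j] -= lr * d * hidden[j]
                for k in range(DIM):
                    w1[j][k] -= lr * g_hidden * x[k]

loss_before = mse()
train()
loss_after = mse()
```

once trained, `predict` replaces the simulation wherever a fast evaluation of an approximation to h_g is needed, while the abm itself is kept for interpretation.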
this also provided us with a novel view on market destabilization, usually explained in terms of strategy switching [24, 25]. second, the paper formally sketched a principled model decomposition, based on theoretical contributions presented in the literature. the ultimate goal of integrating agent-based models, advanced statistical methods (and possibly neural networks) is to obtain a unified model for risk evaluation, crucially centered around eq. (9). clearly, additional theoretical challenges for such a result remain to be investigated, amongst which: (a) probabilistic models other than copulas to be related to the agents' decision mechanisms, (b) structural changes of dependence structures, (c) potential causal mechanisms on the aggregation variables and related concepts such as time dependencies (memory effects, hysteresis, etc.) and latency of responses. these directions, together with the development of a prototype testing the applicability of the approach, set our future research agenda.

references (titles only):
- from micro to macro: demand, supply, and heterogeneity in the trade elasticity
- financial contagion: a propagation simulation mechanism
- when does a disaster become a systemic event? estimating indirect economic losses from natural disasters
- large scale extreme risk assessment using copulas: an application to drought events under climate change for austria
- integrating systemic risk and risk analysis using copulas
- incorporating contagion in portfolio credit risk models using network theory
- an agent-based model of an endangered population of the arctic fox from mednyi island
- cognitive maps and bayesian networks for knowledge representation and reasoning
- causal inference in statistics: an overview
- introduction to time series and forecasting (springer texts in statistics)
- autoregressive conditional heteroscedasticity with estimates of the variance of united kingdom inflation
- volatility and time series econometrics: essays in honor of robert engle
- an introduction to copulas
- dependence modeling with copulas
- a review of copula models for economic time series
- copulas and its application in hydrology and water resources
- on the unstable behaviour of stock exchanges
- a metamodeling approach based on neural networks
- time-dependent copulas
- analysing financial contagion and asymmetric market dependence with volatility indices via copulas
- regime switching vine copula models for global equity and volatility indices
- why model?
- simple agent-based financial market model: direct interactions and comparisons of trading profits
- an agent-based simulation of stock market to analyze the influence of trader characteristics on financial market phenomena
- circlize implements and enhances circular visualization in r

acknowledgments. the authors would like to thank drona kandhai for comments and discussions on preliminary versions of the paper. this work has been partly funded by the marie sklodowska-curie itn horizon 2020-funded project insights (call h2020-msca-itn-2017, grant agreement n. 765710).

key: cord-007147-0v8ltunv authors: dungan, r. s. title: board-invited review: fate and transport of bioaerosols associated with livestock operations and manures date: 2010-11-17 journal: j anim sci doi: 10.2527/jas.2010-3094 sha: doc_id: 7147 cord_uid: 0v8ltunv

airborne microorganisms and microbial by-products from intensive livestock and manure management systems are a potential health risk to workers and individuals in nearby communities. this report presents information on zoonotic pathogens in animal wastes and the generation, fate, and transport of bioaerosols associated with animal feeding operations and land-applied manures.
though many bioaerosol studies have been conducted at animal production facilities, few have investigated the transport of bioaerosols during the land application of animal manures. as communities in rural areas converge with land application sites, concerns over bioaerosol exposure will certainly increase. although most studies at animal operations and wastewater spray irrigation sites suggest a decreased risk of bioaerosol exposure with increasing distance from the source, many challenges remain in evaluating the health effects of aerosolized pathogens and allergens in outdoor environments. to improve our ability to understand the off-site transport and diffusion of human and livestock diseases, various dispersion models have been utilized. most studies investigating the transport of bioaerosols during land application events have used a modified gaussian plume model. because of the disparity among collection and analytical techniques utilized in outdoor studies, it is often difficult to evaluate health effects associated with aerosolized pathogens and allergens. invaluable improvements in assessing the health effects from intensive livestock practices could be made if standardized bioaerosol collection and analytical techniques, as well as the use of specific target microorganisms, were adopted. animal feeding operations (afo) generate vast quantities of manure (feces and urine) and wastewater that must be treated, stockpiled, or beneficially used. in the united states there are approximately 238,000 afo producing an estimated 500 million wet tons of manure annually. of particular concern is the intensification of animal production, which has led to the creation of concentrated afo (cafo) that make up about 15% of all afo. the major producers of manure are cattle (beef and dairy), poultry (chicken and turkey), and swine operations (wright et al., 1998) . 
depending upon the animal production facility, the solid and liquid manures are typically stored in piles or holding ponds, mechanically dewatered, composted, anaerobically digested for biogas production, or a combination of the above. animal manures applied as solids, semi-solids, and liquids have traditionally been used as soil conditioners and as a source of nutrients for crop production (power and dick, 2000; risse et al., 2006) . when improperly managed, however, manures can pollute surface and ground waters with nutrients and pathogenic microorganisms (ritter, 2000) . because commercial livestock carry an increased microbial load in their gastrointestinal system, they are often reservoirs of zoonotic pathogens (temporarily or permanently), which can be transmitted to the environment in untreated manures (gerba and smith, 2005; venglovsky et al., 2009 ). an area of growing interest is airborne pathogens and microbial by-products generated at afo and during the land application of manures (chang et al., 2001b; wilson et al., 2002; cole et al., 2008; chinivasagam et al., 2009; dungan and leytem, 2009a; millner, 2009) , which can potentially affect the health of livestock, farm workers, and individuals in nearby residences (heederik et al., 2007) . land application of untreated solid and semi-solid manures and use of pressurized irrigation systems to apply liquid manures and wastewaters increase the chances that microorganisms will become aerosolized (teltsch et al., 1980a; brooks et al., 2004; hardy et al., 2006; peccia and paez-rubio, 2007) . despite the potential for bioaerosol formation during these activities, very few research papers have addressed the risk of human exposure to pathogens during the land application of animal wastes (boutin et al., 1988; murayama et al., 2010) . 
to date, much of the research in this area has been conducted with municipal wastewaters (us epa 1980 , 1982 tanner et al., 2005; peccia and paez-rubio, 2007) and biosolids (dowd et al., 2000; brooks et al., 2005a,b; tanner et al., 2008) . considering the fact that the number of cafo continues to grow (usda national agricultural statistics service, 2009), along with a growing farm worker and encroaching civilian population, an increased understanding of the fate and transport of airborne microorganisms is required to ensure public health is not compromised. the purpose of this review is to highlight the current knowledge of bioaerosol fate and transport, with a specific focus on bioaerosols generated at afo and during the land application of animal manures. readers seeking more information on bioaerosol collection and analytical methodologies should refer to a recent review by dungan and leytem (2009b) . additional emphasis is placed on dispersion models as a means to assess the transport of bioaerosols and subsequent risk of exposure to individuals in the downwind plume. domesticated livestock harbor a variety of bacterial, viral, and protozoal pathogens, some of which pose a risk to other animals and humans. infectious diseases that are transmissible from animals to humans and vice versa are known as zoonoses. these diseases can be transmitted to humans through direct contact (skin wounds, mucous membranes), fecal-oral route, ingestion of contaminated food and water, or aerogenic route (e.g., droplets, dust). tables 1, 2, and 3 present a list of important bacterial, viral, and protozoal zoonotic pathogens associated with animals and their wastes, respectively. many of these pathogens are endemic in commercial livestock and, therefore, are difficult to eradicate from both the animals and production facilities. 
some well-recognized zoonotic pathogens are escherichia coli o157:h7, salmonella spp., campylobacter jejuni, aphthovirus (the virus that causes foot-and-mouth disease, fmd), and protozoal parasites such as cryptosporidium parvum and giardia lamblia. this section is not meant to be an exhaustive review of zoonotic pathogens; more detailed information on zoonoses can be found in krauss et al. (2003) and sobsey et al. (2006). escherichia coli are native inhabitants of the gastrointestinal tract of mammals, but a subset of diarrhetic e. coli, including the enterohemorrhagic and enteropathogenic pathotypes, cause disease in humans (krauss et al., 2003). salmonella occur in cattle, pigs, poultry, wild birds, pets, rodents, and other animals; however, only nontyphoidal salmonella (e.g., s. enterica serovar enteritidis) occurs in both humans and animals. human infection generally occurs through the ingestion of contaminated foodstuffs or excretions from sick or infected animals, resulting in acute gastroenteritis. campylobacter jejuni is among the most common causes of diarrheal disease in the united states, which is attributed to its relatively low infectious dose (<500 organisms). the main reservoirs of c. jejuni are wild birds and poultry, although among farm animals pigs are important carriers. infection in humans occurs by ingestion of contaminated food (raw or undercooked poultry meat, pork, or milk) or water, or by direct contact with contaminated feces. foot-and-mouth disease is a highly contagious and sometimes fatal viral disease of cloven-hoofed animals (domestic and wild). human infections with the fmd virus are rare, and infections can usually be traced to direct handling of infected animals or contact during slaughter. cryptosporidium parvum is a protozoal parasite that is widespread in mammals and is increasingly recognized as a major cause of human diarrhea. in animals, clinical signs are most commonly observed in newborn calves. infected animals shed the organism in their feces, and human infection occurs through the ingestion of contaminated food and water. giardiasis, caused by various giardia spp. (e.g., g. lamblia), is considered one of the most prevalent parasitic infections in the world, especially in developing nations with poor sanitary practices. animal hosts of giardia spp. include cattle, sheep, pigs, cats, rodents, and other mammals, which are direct or indirect sources of human infection. transmission commonly occurs through the ingestion of food or water contaminated with feces. although the common route of transmission for many zoonotic pathogens is direct ingestion or contact, the inhalation of infectious particles should also be considered. it is well documented that communicable and noncommunicable human diseases are transmitted through airborne routes; however, the airborne transmission of some of the above-mentioned zoonotic pathogens is unknown and quite controversial. zoonotic pathogens, such as mycobacterium tuberculosis and hantavirus, are known to be transmitted through aerogenic routes and are capable of causing severe disease in infected individuals (sobsey et al., 2006). however, some enteric pathogens (e.g., salmonella spp.) are not typically associated with aerogenic routes of exposure, although studies with animals provide evidence suggesting that airborne transmission is possible (wathes et al., 1988; harbaugh et al., 2006; oliveira et al., 2006). furthermore, there is much uncertainty associated with the dose-response of airborne pathogens and biological agents because many relationships have not been established to date (pillai and ricke, 2002; douwes et al., 2003; hermann et al., 2009).
although the land application of manures is often utilized as a means to dispose of a waste by-product, rather than from a beneficial use perspective, manures are an excellent source of major plant nutrients such as nitrogen, phosphorus, and potassium, as well as some secondary nutrients. the application of manure not only improves soil nutrient status, but also has a significant effect on physical and biological properties (sommerfeldt and chang, 1985; khaleel et al., 1991; peacock et al., 2001) . manure applications increase the om content in soils, which in turn promotes the formation of water-stable soil aggregates and improves water infiltration, water-holding capacity, microbial activity, and overall productivity. to distribute the livestock manures and wastewaters to agricultural fields a variety of techniques are often utilized (pfost et al., 2001) . manures with a low moisture content, such as chicken litter or dewatered feces, can be land-applied using a manure slinger or spreader. wastes that have a very low solids content, such as wastewater from flush systems, holding ponds, or lagoons, can be land applied via furrow irrigation, directly injected (e.g., drag-hose), or sprayed using a tanker or pressurized irrigation systems (e.g., spray gun, center-pivot). application methods that launch liquid and solid manures into the air create a potentially hazardous situation as pathogens may become aerosolized and transported to downwind receptors (sorber and guter, 1975; brooks et al., 2004) . the aerosolized pathogens could potentially be directly inhaled or ingested after they land on fomites, water sources, or food crops. aerosolization is a process where fine droplets evaporate completely or to near dryness; thus, microorganisms in these droplets are transformed into solid or semi-solid particles (i.e., bioaerosols). during spray irrigation events of liquid manures and wastewaters, the water stream is broken up into droplets of various sizes. 
the size of the droplets is related to the sprinkler head configuration and operating pressure of the irrigation system. fine droplets, <100 μm in diameter, evapo-rate relatively quickly, whereas those >200 μm do not evaporate appreciably (hardy et al., 2006) . however, the evaporation rate of water droplets increases with decreasing humidity and increasing temperature. in a study conducted with low pressure sprinklers, total evaporation losses ranged from 0.5 to 1.4% for smooth spray plate and 0.4 to 0.6% for coarse serrated sprinklers (kohl et al., 1987) . in a us epa report (1980), the aerosolization efficiency (e) ranged from 0.08 to 2.7%, with a median value of 0.33% over 17 spray irrigation events using rotating impact-sprinklers. aerosolization efficiency is the fraction of the total water sprayed that leaves the vicinity of the irrigation system as an aerosol, rather than as droplets. bioaerosols are viable and nonviable biological particles, such as bacteria, virus, fungal spores, and pollen grains and their fragments and by-products (e.g., endotoxins, mycotoxins), that are suspended in the air (grinshpun et al., 2007) . airborne microorganisms and their components are generated as a mixture of droplets or particles, having different aerodynamic diameters ranging from 0.5 to 100 μm (lighthart, 1994; cox and wathes, 1995) . the generation of bioaerosols from water sources occurs during bubble bursting or splash, and wave action and microorganisms (single cells or groups) are usually surrounded by a thin layer of water . aside from natural activities, land spreading of slurries, pressurized spray irrigation events, and aeration basins at wastewater treatment plants are a few ways microorganisms become aerosolized. 
bioaerosols generated directly from relatively dry surfaces (e.g., feedlots, soils, plants) or during the land application of dry manures can be released as individual or groups of cells or associated with inorganic or organic particulate matter (cambra-lópez et al., 2010) . aerosol particles 1 to 5 μm in diameter are of the greatest concern because they are readily inhaled or swallowed, but the greatest retention in the lung alveoli occurs with the 1-to 2-μm particles (salem and gardner, 1994) . unlike microorganisms in soils, waters, and manures, aerosolized or airborne microorganisms are very susceptible to a variety of meteorological factors (cox and wathes, 1995) . the most significant factors that affect viability are relative humidity, temperature, and solar irradiance (table 4 ). in general, laboratory and field studies have shown that microorganism viability decreases with decreases in relative humidity and increases in temperature and solar irradiance (poon, 1966; dimmock, 1967; ehrlich et al., 1970b; goff et al., 1973; theunissen et al., 1993; lighthart and shaffer, 1994) . as relative humidity decreases, there is less water available to the microorganisms, which causes dehydration and subsequent inactivation of many microorganisms. however, because temperature influences relative humidity, it is often difficult to separate their effects (mohr, 2007) . targets of relative humidity-and temperature-induced inactivation of airborne microorganisms appear to be proteins and membrane phospholipids (cox and wathes, 1995) . viruses with structural lipids are stable at low relative humidities, whereas those without lipids are more stable at high relative humidities. oxygen concentration is also known to affect bacterial survival because it is involved in the inactivation of bioaerosols through the production of free radicals of oxygen (cox and baldwin, 1967; cox et al., 1974) . 
because bacteria are much more complex, biochemically and structurally, than viruses, viruses tend to be more resistant to the effects of oxygen and temperature-induced inactivation, except in the case of spore-forming bacteria such as clostridium spp. (mohr, 2007) . inactivation of bioaerosols by solar irradiance is highly dependent upon wavelength and is exacerbated by dehydration and oxygen (beebe, 1959; riley and kaufman, 1972; cox and wathes, 1995; ko et al., 2000) . short-wavelength ionizing radiation (e.g., x-rays, gamma rays, uv) induces free-radical-mediated reactions that cause damage to biopolymers, such as nucleic acids and proteins. another factor, known as the open-air factor, is based on the fact that the survival of many outdoor airborne microorganisms is generally poorer than in inside air under similar conditions (cox and wathes, 1995) . this effect was attributed to ozoneolefin reaction products in the outdoors. whereas the above-mentioned factors influence viability, microbial factors such as the type, genus, species, and strain of an organism also affect its airborne survival (songer, 1967; ehrlich et al., 1970b) . microorganisms associated with droplets that evaporate to dryness or near-dryness before impacting the ground or vegetation are transported in air currents. when bioaerosols are released from a source, they can be transported short or long distances and are eventually deposited in terrestrial and aquatic environments (brown and hovmøller, 2002; jones and harrison, 2004; griffin, 2007) . the transport, behavior, and deposition of bioaerosols are affected by their physical properties (i.e., size, shape, and density) and meteorological factors they encounter while airborne. because most bioaerosols are not perfectly spherical, the most useful size definition is aerodynamic diameter, which is the major factor controlling their airborne behavior (kowalski, 2006) . 
aerodynamic diameter is defined as the diameter of a spherical particle of water (a unit density sphere) with which a bioaerosol or microorganism has the same settling velocity in air. meteorological factors such as wind velocity, relative humidity, temperature, and precipitation affect the transport of bioaerosols, with atmospheric stability being a major factor (lighthart and mohr, 1987; lighthart, 2000; jones and harrison, 2004 ). relative humidity not only affects microorganism viability as discussed above, but also affects settling velocity because it directly influences the density and aerodynamic diameter of the bioaerosol unit (ko et al., 2000; mohr, 2007) . the deposition of bioaerosols occurs through gravitational settling, impaction, diffusion onto surfaces, and wash-out by raindrops (muilenberg, 1995) . for particles with an aerodynamic diameter >5 μm, gravitational settling and impaction are the leading causes of particle loss during transport (mohr, 2007) . for larger airborne particles (>25 μm), removal by raindrops is quite efficient. assessment of bioaerosol transport is generally accomplished by setting liquid impingement or solid impaction systems at an upwind location (background) and various downwind distances from the source (dungan and leytem, 2009b) . in brief, the aerosol samplers are usually set at 1.5 m above the ground, which corresponds to the average breathing height for humans. air is then pulled through the samplers at a specified flow rate (e.g., 12.5 l·min −1 for glass impingers) for several minutes to hours using a vacuum pump. samples are then analyzed via culture-dependent or molecularbased (e.g., pcr) assays or microscopically to calculate a microorganism concentration per cubic meter of air. 
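for a unit-density sphere, the aerodynamic-diameter definition above lets the settling velocity be computed directly from stokes' law. the sketch below assumes a standard air viscosity and neglects the slip correction relevant only for sub-micrometre particles:

```python
def stokes_settling_velocity(d_aero_um, air_viscosity=1.81e-5):
    """Terminal settling velocity (m/s) in still air for a particle of
    given aerodynamic diameter (micrometres), via Stokes' law:
        v_s = rho * d^2 * g / (18 * mu)
    with rho = 1000 kg/m^3 (the unit-density sphere of the
    aerodynamic-diameter definition) and g = 9.81 m/s^2. Valid at low
    Reynolds number, roughly the 1-100 um range discussed in the text;
    slip correction is neglected in this sketch."""
    d_m = d_aero_um * 1e-6
    return 1000.0 * d_m ** 2 * 9.81 / (18.0 * air_viscosity)
```

the quadratic dependence on diameter is why gravitational settling dominates removal for particles above about 5 um: a 10-um particle settles at roughly 3 mm/s, while a 1-um particle settles about 100 times more slowly.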
in the case of airborne endotoxins, samples are typically collected on filters, subsequently extracted using a weak tween solution, and analyzed using the kinetic limulus amebocyte lysate assay (schulze et al., 2006; dungan and leytem, 2009c) . the most prevalent microorganisms identified in bioaerosol samples from afo are presented in table 5 . with most bioaerosol studies, whether conducted at afo, composting facilities, wastewater treatment plants, biosolids application sites, or wastewater spray irrigations sites, the general trend observed is that the airborne microorganism concentrations decrease with distance from the source (goff et al., 1973; katzenelson and teltch, 1976; boutin et al., 1988; taha et al., 2005; green et al., 2006; low et al., 2007) . in a study at a swine operation, the average bacterial concentrations within the barns were 1.8 × 10 4 cfu·m −3 , and although the outside air concentration decreased with distance from the facility, at 150 m downwind the bacterial concentration was still 2.5-fold greater (208 cfu·m −3 ) than at the upwind location (green et al., 2006) . in a recent study by matković et al. (2009) , airborne concentrations of fungi inside a dairy barn were about 6 × 10 4 cfu·m −3 throughout the day (morning, noon, and night) and downwind concentrations approached background levels (2.0 to 6.2 × 10 3 cfu·m −3 ) at distances as close as 5 to 50 m from the barn. at an open-lot dairy, the average endotoxin concentration at a background site was 24 endotoxin units (eu)·m −3 , whereas at the edge of the lot and 200 and 1,390 m further downwind, the average concentrations were 338, 168, and 49 eu·m −3 , respectively (dungan and leytem, 2009a) . table 6 presents airborne concentrations for microorganisms and endotoxins within and downwind of various livestock operations. boutin et al. 
(1988) investigated bioaerosol emissions associated with the land application of swine and cattle slurries by way of tractor-pulled tanker and fixed high-pressure spray guns. near the source, total bacterial counts were about 2,000 cfu·m −3 , regardless of the land application method. the bacterial counts steadily decreased with distance from the application site and pathogenic bacteria such as salmonella, staphylococcus, and klebsiella pneumoniae were not detected. however, compared with tank spreading, which sprays closer to the ground, airborne bacterial concentrations were greater at greater distances from the spray guns, which is likely related to the upward discharge of slurry into the air that enhances droplet size reduction and drift. to our knowledge, the boutin et al. (1988) study is the only peer-reviewed report that addresses bioaerosol transport during spray irrigation of livestock manures, whereas most other reports address spray irrigation of industrial and municipal wastes (katzenelson and teltch, 1976; parker et al., 1977; camann et al., 1988; brooks et al., 2005a; tanner et al., 2005) . in a preliminary pilot-scale field study conducted by kim et al. (2007) , swine manure was land-applied through a center pivot irrigation system and bioaerosol samples were collected upwind and 8, 14, and 23 m downwind. total airborne coliform concentrations were found to decrease with distance, from about 10 8 most probable number (mpn)·m −3 at 8 m to near background concentrations at 10 6 mpn·m −3 at 23 m downwind. although the focus of this review is on bioaerosols associated with animal operations and manures, one could reasonably expect microorganisms in industrial and municipal wastewaters to behave similarly once aerosolized. 
differences in survivability may occur though, depending upon the concentration and type of om in the wastes because some organic substances are known to act as osmoprotectants (cox, 1966; marthi and lighthart, 1990 ) and may provide some degree of physical protection against uv radiation and drying (sobsey and meschke, 2003; aller et al., 2005) . parker et al. (1977) investigated the transport of aerosolized bacteria during the spray irrigation of potato processing wastewater. as with other similar studies, there was a decrease in the airborne microorganism concentration with distance from the irrigation system. these authors reported detection of coliforms at distances as far as 1.0 to 1.5 km from the source; however, there was no way to verify if they were above background concentrations because that information was not provided in the report. during the land application of liquid and dewatered domestic sewage sludge (biosolids) via spray tanker and spreader/slinger, respectively, indicator organisms (coliforms, clostridium perfringens, e. coli) were not detected at distances greater than 30 m (brooks et al., 2005b) . in most of the above-mentioned bioaerosol transport studies, fecal contamination indicator organisms were targeted. fecal indicator organisms are generally chosen because they are more abundant and easily identified in the aerosols (teltsch and katzenelson, 1978; bausum et al., 1982; brenner et al., 1988) , although they may behave differently from pathogens (dowd et al., 1997; carducci et al., 1999) . alternatively, to improve upon estimates of off-site transport of bioaerosols, some researchers have used molecular-based approaches to track microorganisms from swine houses (duan et al., 2009) or during the land application of class b biosolids (low et al., 2007) and domestic wastewater (paez-rubio et al., 2005) . this approach is called microbial source tracking and has only recently been applied to aerosol samples. 
although emission rates for bioaerosols during the land application of livestock wastes are not currently available, emission rates have been calculated for the application of dewatered and liquid class b biosolids onto agricultural land. emission rate is a useful variable for understanding the impact of waste application, and similarities between application of municipal and livestock wastes can be drawn because the same spreading equipment is often used. during the land application of dewatered biosolids using a slinger, average emission rates for total bacteria, heterotrophic bacteria, total coliforms, sulfite-reducing clostridia, and endotoxin were reported to be 2.0 × 10^9 cfu·s^−1, 9.0 × 10^7 cfu·s^−1, 4.9 × 10^3 cfu·s^−1, 6.8 × 10^3 cfu·s^−1, and 2.1 × 10^4 eu·s^−1, respectively. in a study conducted by tanner et al. (2005), ground water seeded with e. coli was sprayed using a spray tanker, and emission rates were reported to range from 2.0 to 3.9 × 10^3 cfu·s^−1. interestingly, when studies were conducted using liquid biosolids, neither coliform bacteria nor coliphage were detected in air 2 m downwind, although these microorganisms were detected in the biosolids. although no reason was given for the latter outcome, the direct measurement of bioaerosols does provide the information required for calculating emission rates. a bioaerosol emission rate is a required input variable for all aerosol fate and transport models that predict absolute concentration at a specified distance from the source. atmospheric dispersion modeling is a mathematical simulation used to predict the concentration of an air contaminant at various distances from a source. in an effort to assess the transport and diffusion of airborne microorganisms associated with human and livestock diseases, dispersion modeling has been utilized (sørensen et al., 2001; garten et al., 2003; pedersen and hansen, 2008).
in australia, atmospheric dispersion models have been developed as part of preparedness programs to manage potential outbreaks of foot-and-mouth disease (fmd) (cannon and garner, 1999; garner et al., 2006). in early bioaerosol transport studies, models were based upon a modified version of the inert particle dispersion model developed by pasquill (1961). although some of the inert particle model assumptions will not be met at a typical animal feeding operation (afo), the model assumes 1) gaussian distribution of particles in the crosswind and vertical planes; 2) particles are emitted at a constant rate; 3) diffusion in the direction of transport is negligible; 4) particles are <20 μm in diameter (i.e., gravitational effects are negligible); 5) particles are reflected from the ground (i.e., no deposition or reactions at the surface); 6) wind velocity and direction are constant; and 7) terrain is flat. the original form of the inert particle dispersion model is

χ(x,y,z) = [q / (2π ū σ_y σ_z)] exp(−y² / 2σ_y²) {exp(−(z − h)² / 2σ_z²) + exp(−(z + h)² / 2σ_z²)}, [1]

where χ is the number of particles per cubic meter of air at a downwind location x, y, and z (i.e., alongwind, crosswind, and vertical coordinates, respectively); q is the number of particles emitted per second; ū is the mean wind speed in meters per second; σ_y and σ_z are the sd of the crosswind and vertical displacements of particles at distance x downwind, respectively; and h is the height of the source including plume rise. if ground-level and centerline concentrations are to be determined, then z and y are set to zero. for a ground-level source, h is also set to zero, and the simplified equation then becomes

χ(x,0,0) = q / (π ū σ_y σ_z). [2]

because the pasquill dispersion model is based on inert particles, lighthart and frisch (1976) added a biological decay term as follows:

χ(x,y,z)_bd = χ(x,y,z) exp(−λt), [3]

where λ is the microbial death rate (per second) and t is approximated by x/ū.
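the gaussian plume relationship and the biological decay term above can be sketched numerically; the python fragment below is an illustrative sketch only, and the example values (source strength, wind speed, dispersion coefficients, and death rate) are arbitrary assumptions, not values from any of the cited studies:

```python
import math

def plume_concentration(q, u, sigma_y, sigma_z, y=0.0, z=0.0, h=0.0):
    """Gaussian plume concentration chi(x, y, z) for an inert particle source.

    q: particles emitted per second; u: mean wind speed (m/s);
    sigma_y, sigma_z: crosswind and vertical dispersion coefficients (m);
    h: source height including plume rise (m).
    """
    lateral = math.exp(-y**2 / (2.0 * sigma_y**2))
    vertical = (math.exp(-(z - h)**2 / (2.0 * sigma_z**2))
                + math.exp(-(z + h)**2 / (2.0 * sigma_z**2)))  # ground reflection
    return q / (2.0 * math.pi * u * sigma_y * sigma_z) * lateral * vertical

def viable_concentration(chi, lam, x, u):
    """Apply a biological decay term exp(-lam * t), with travel time t ~ x / u."""
    return chi * math.exp(-lam * x / u)

# arbitrary example: ground-level source, ground-level centerline receptor
chi = plume_concentration(q=1.0e6, u=2.0, sigma_y=20.0, sigma_z=10.0)
chi_bd = viable_concentration(chi, lam=0.02, x=100.0, u=2.0)
```

note that with y = z = h = 0 the two reflection terms are each equal to one, so the expression reduces to q/(π ū σ_y σ_z), the simplified ground-level form.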
subsequent researchers utilized the biological decay term, along with the dispersion model, to assess bioaerosol transport from point sources (peterson and lighthart, 1977; teltsch et al., 1980b; us epa, 1982; lighthart and mohr, 1987). when only part of the material released into the atmosphere becomes an aerosol, as occurs during sprinkler irrigation, eq. [3] becomes

χ(x,y,z)_bd = e χ(x,y,z) exp(−λt), [4]

where e is the aerosolization efficiency factor (teltsch et al., 1980b). the microbial death and inactivation rates are generally derived from empirical laboratory data under static atmospheric conditions using pure cultures (hatch and dimmick, 1966). therefore, it is imperative when developing microbial death rates to conduct the experiments with numerous microbial types and under varying environmental conditions (peterson and lighthart, 1977). in laboratory studies, microbial death rates for sarcina lutea at 15°c were 4.6 × 10^−2 and 5.8 × 10^−4 s^−1 at around 2 and 90% relative humidity, whereas death rates for pasteurella tularensis at 27°c were 7.1 × 10^−2 and 2.4 × 10^−3 s^−1 at similar relative humidities, respectively (cox and goldberg, 1972; lighthart, 1973). because these microbes are non-spore formers, one would expect spore-forming bacteria to survive longer under changing atmospheric conditions as a result of their ability to tolerate greater temperature and radiation (madigan and martinko, 2006). as mentioned previously, the viability of airborne microorganisms will vary greatly depending upon the growth media used and the microbial genus and species being tested. in field trials conducted at pleasanton, ca, microbial death rates during the spray irrigation of municipal wastewater were determined under a variety of environmental conditions (us epa, 1980). the median death rate constants for total coliform, fecal coliform, and coliphage were 3.2, 2.3, and 1.1 × 10^−2 s^−1, respectively. death rate constants for e.
coli, prepared in sterilized municipal wastewater, were reported to range from 8.8 × 10^−3 s^−1 in the morning to 6.6 × 10^−2 s^−1 in the afternoon (teltsch et al., 1980b). parker et al. (1977) modified pasquill's inert particle dispersion model to predict the transport of bioaerosols from an area source (i.e., sprinkler irrigation of potato processing wastewater). even though the model contained a biological decay term, the authors did not model decay or loss of viability of microorganisms owing to a lack of experimental data. dowd et al. (2000) later used the same area-source model with microbial death rates from the literature to predict bioaerosol transport during the land application of dewatered domestic sewage sludge (biosolids). based upon model predictions at a high wind speed of 10 m·s^−1, bacterial concentrations would be 69 and 6.5 bacteria·m^−3 of air at 100 and 10,000 m, respectively. to assess the risk of infection to workers and nearby populations, a beta-poisson model as described by haas (1983) was utilized. using dose-response data for salmonella typhimurium, the predicted risk of infection at 100 m with a 10 m·s^−1 wind speed and an 8 h exposure period was 13%, whereas at 1,000 and 10,000 m it decreased to 8.7 and 1.6%, respectively. risk of infection for coxsackievirus b3 was also determined; however, an incorrect dose-response value was used in the single-hit exponential model, and the predicted risk of infection should actually have been about 3 orders of magnitude less than the published values. overall, the model predictions suggest that bioaerosols from land-applied biosolids can increase the risk of viral and bacterial infection to on-site workers, but there was little or no risk to population centers >10 km from the application site under low-wind conditions (≤5 m·s^−1).
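the beta-poisson and single-hit exponential dose-response forms mentioned above can be sketched as follows; these are the standard functional forms from the quantitative microbial risk assessment literature, and any parameter values supplied to them are placeholders, not the salmonella or coxsackievirus data used by dowd et al. (2000):

```python
import math

def beta_poisson_risk(dose, alpha, n50):
    """Approximate beta-Poisson probability of infection.

    alpha: shape parameter; n50: median infectious dose. By construction,
    a dose equal to n50 yields a risk of exactly 0.5.
    """
    return 1.0 - (1.0 + dose * (2.0 ** (1.0 / alpha) - 1.0) / n50) ** (-alpha)

def exponential_risk(dose, r):
    """Single-hit exponential model: P = 1 - exp(-r * dose)."""
    return 1.0 - math.exp(-r * dose)
```

this kind of sketch makes the sensitivity of the predicted risk to the dose-response parameters explicit, which is one reason an incorrect parameter value (as in the coxsackievirus b3 case above) can shift predicted risks by orders of magnitude.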
the results from such studies should be used cautiously because they were not empirically derived and, as outlined by pillai and ricke (2002), there is uncertainty associated with the dose-response of different organisms and hosts. in a 1982 us epa report, microorganism concentrations in aerosols from spray irrigation events of municipal wastewater were predicted using an atmospheric diffusion model. the diffusion model consisted of 4 principal components:

c_d = q_a d_d m_d + b, [5]

where c_d is the concentration of microorganisms per cubic meter of air; d_d is the atmospheric diffusion factor at distance d from the source (s·m^−3); q_a is the aerosol source strength (microorganisms s^−1); m_d is the microorganism die-off factor (not to be confused with the microbial death rate, λ) as described in eq. [3] (i.e., the number of organisms that are viable at distance d); and b is the background concentration (microorganisms m^−3). d_d is calculated using the inert particle dispersion model as shown in eq. [1], but with q set to unity. for a wastewater irrigation event, the aerosol source strength was further defined as

q_a = w f e i, [6]

where w is the microorganism concentration in the wastewater (organisms l^−1); f is the flow rate of the irrigation wastewater (l·s^−1); e is the aerosolization efficiency factor (0 < e ≤ 1); and i is the microorganism impact factor (i.e., the aggregate effect of all factors affecting microorganism survivability; i > 0). using input data from a us epa (1980) report, total coliform concentrations were determined 770 m from the centerline of a 240-m-long linear source under stable (summer night) and unstable (summer midday) atmospheric conditions. the wastewater flow rate during the irrigation event was set at 70 l·s^−1, with a total coliform concentration of 1.0 × 10^7 cfu·l^−1 and respective night and midday wind speeds of 2 and 4 m·s^−1, e of 3.3 × 10^−3 and 1.6 × 10^−2, i of 0.48 and 0.27, λ of 0.02 and 0.05 s^−1, and aerosol age (a_d) of 385 and 193 s.
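the source-strength arithmetic of eq. [6] can be checked directly with the night and midday inputs quoted above (a minimal sketch; the function name is ours, not from the epa report):

```python
def aerosol_source_strength(w, f, e, i):
    """Aerosol source strength Qa = W * F * E * I (eq. [6]).

    w: microorganism concentration in wastewater (organisms/L)
    f: irrigation flow rate (L/s)
    e: aerosolization efficiency factor (0 < e <= 1)
    i: microorganism impact factor (i > 0)
    """
    return w * f * e * i

# night (stable) and midday (unstable) inputs from the US EPA (1980) data
night = aerosol_source_strength(w=1.0e7, f=70.0, e=3.3e-3, i=0.48)
midday = aerosol_source_strength(w=1.0e7, f=70.0, e=1.6e-2, i=0.27)
```

the products come out to about 1.1 × 10^6 and 3.0 × 10^6 cfu·s^−1, matching the night and midday q_a values reported below.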
the q_a for total coliforms during night and midday was determined to be 1.1 × 10^6 and 3.0 × 10^6 cfu·s^−1, respectively. when background coliform concentrations were subtracted, the respective total airborne concentrations at 770 m downwind were predicted to be only 0.1 and 4.4 × 10^−3 cfu·m^−3. during midday conditions, fecal streptococci concentrations at 770 m downwind were predicted to be 2-fold greater than total coliforms, even though the source concentration was 2-fold less; this is because fecal streptococci had a microorganism impact factor of 5.7 and a death rate of zero. lighthart and mohr (1987) modified a version of the gaussian plume model used by peterson and lighthart (1977) to include an airborne microbial survival term that was a best-fit function of temperature, relative humidity, and solar radiation. the model included an algorithm using microbial source strength and local hourly mean weather data to drive the model through a typical summer day or an overcast and windy winter day. at high wind speeds or short travel times, the model predicted greater viable near-source concentrations because the microorganisms did not have time to become inactivated. as travel times increased, owing to slow wind speeds or longer distances, inactivation of microorganisms became more prevalent. lighthart and kim (1989) used a simulation model to describe the dispersion of individual droplets of water containing viable microbes. the droplet dispersion model was separated into 5 submodels: 1) aerosol generation, 2) evaporation, 3) dispersion, 4) deposition, and 5) microbial death. the position of each droplet, at each time step in the trajectory, was located in a 3-dimensional coordinate system. repeating the modeling process for many droplets produced a simulation of a cloud of droplets. evaporation was determined to be an important factor when simulated in the model, as aerosols were carried farther downwind.
while the model takes into account the physical, chemical, and measured meteorological parameters for each water droplet, potential shortcomings revolve around the ability of the model to predict near-source survival dynamics of airborne microorganisms (e.g., the effect of microorganisms on water evaporation, the critical water content of microbes). also, the droplet dispersion model does not take into account rapidly changing wind conditions (e.g., gusts); therefore, the use of average wind velocities will lead to an oversimplification of meteorological conditions and microbial dispersion. when the model was compared with a release of pseudomonas syringae, deposition rates were found to be similar within 30 m of the source. the simulation model was later used by ganio et al. (1995) to model a field spray event of bacillus subtilis var. niger spores. using the same meteorological conditions as the spray event, the model produced a bioaerosol deposition pattern somewhat similar to that obtained in the field (r^2 = 0.66). a variety of short- and long-range dispersion models have been developed to understand and manage the airborne spread of epidemics such as foot-and-mouth disease (gloster et al., 1982; sørensen, 1998; cannon and garner, 1999; sørensen et al., 2000; rubel and fuchs, 2005; garner et al., 2006; mayer et al., 2008). in a recent paper by gloster et al. (2010), a historic 1967 outbreak of foot-and-mouth disease (hampshire, uk) was modeled using 6 internationally recognized dispersion model systems. whereas one-half of the models [nuclear accident model (name), veterinary meteorological decision-support system (vetmet), plume dispersion emergency modeling system (pdems)] were run using the observational data provided, the other one-half [australian integrated windspread model (aiwm), modèle lagrangien courte distance (mlcd), national atmospheric release advisory center (narac)] used numerically derived meteorological data, and comparisons between outputs were made.
using the same virus emission data, the models produced very similar 24 h integrated concentrations along the major axis of the plume at 1, 5, 10, 15, and 20 km. although there were differences between the estimates, as a result of model assumptions with respect to upward diffusion rates for surface material and the choice of input weather data, most estimates were within one order of magnitude. the models also predicted similar directions in which livestock would be at risk; however, additional model assumptions, such as microbial fate and susceptibility to airborne infection, can substantially modify the size and location of the downwind risk area. based on the information presented in this review, it is evident that animal feeding operations and manure application practices contribute to the formation of bioaerosols at greater concentrations than found in background environments. as population centers grow and converge on such operations, there will be an increasing potential for exposure to airborne pathogens and microbial by-products that are transported off site. exposure to airborne bacteria, viruses, fungi, and microbial by-products is not limited to inhalation routes, because deposition on fomites, food crops, and water bodies and subsequent ingestion also represent transmission routes of concern. the ability to accurately quantify airborne microorganisms within and downwind from a source is important when evaluating health risks to exposed humans and animals. however, the actual risk of exposure to airborne pathogens has not been fully characterized for a variety of reasons, including the choice of bioaerosol collection technique, analytical methodology, target microorganism, and dispersion and infectivity model inputs. to date, most bioaerosol transport studies have targeted fecal indicator organisms because they are generally more abundant and easily detected.
pathogens, on the other hand, are often present at concentrations several orders of magnitude lower than indicator organisms, making their detection difficult in highly diluted aerosol samples. because the survivability of aerosolized fecal indicator organisms is likely different from that of pathogens, a first step toward improving future bioaerosol studies should include the selection of organisms that better represent the targeted pathogens, along with standardized methods for their collection in outdoor environments. as molecular-based approaches improve with respect to sensitivity and rapidity, it may be appropriate to standardize and use such technologies to directly detect pathogens of interest in aerosol samples, avoiding the need for indicator organisms. standardization of target microorganisms and of collection and analytical methodologies will improve the ability of researchers to compare results, refine dispersion models, and develop unified risk estimates. although animal operations and manure management practices are not currently regulated with respect to bioaerosol emissions, the possibility that control measures will someday be implemented is quite realistic. without standardized methodologies, regulatory agencies will have to base decisions on inconsistent data sets, and the effectiveness of mitigation strategies to control bioaerosol emissions will not be properly determined. because land application of manures will remain a viable nutrient utilization and disposal option for the foreseeable future, emphasis must be placed on research addressing the airborne transport of pathogens, a topic on which there is a lack of information. furthermore, there is a surprising lack of information concerning the infectivity of aerosolized pathogens, especially enteric pathogens. clearly, a critical component of a risk determination is not only understanding bioaerosol dispersion and transport, but also the dose-response of zoonotic pathogens.
to advance our understanding of risks associated with airborne pathogens from animal feeding operations, it will be necessary for a variety of scientists, including but not limited to aerobiologists, clinical microbiologists, epidemiologists, animal scientists, and risk modelers, to convene under a common setting to address these issues in more detail and work toward the common goal of standardizing a variety of bioaerosol collection and analytical methodologies.

aerosol stability of infectious and potentially infectious reovirus particles
volumetric assessment of airborne fungi in two sections of a rural indoor dairy cattle shed
effect of temperature and relative humidity on the survival of airborne columbia sk group viruses
airborne stability of simian virus 40
the sea surface microlayer as a source of viral and bacterial enrichment in marine aerosols
comparison of coliphage and bacterial aerosols at a wastewater spray irrigation site
stability of disseminated aerosols of pasteurella tularensis subjected to simulated solar radiation and various humidities
atmospheric bacterial contamination from landspreading of animal wastes: evaluation of the respiratory risk for people nearby
animal viruses, coliphages, and bacteria in aerosols and wastewater at a spray irrigation site
biological aerosol emission, fate, and transport from municipal and animal wastes
estimation of bioaerosol risk of infection to residents adjacent to a land applied biosolids site using an empirically derived transport model
a national study on the residential impact of biological aerosols from the land application of biosolids
aerial dispersal of pathogens on the global and continental scales and its impact on plant disease
microorganism levels in air near spray irrigation of municipal wastewater: the lubbock infection surveillance study
airborne particulate matter from livestock production systems: a review of an air pollution problem
assessing the risk of windborne spread of foot-and-mouth disease in australia
assessment of microbial parameters as indicators of viral contamination of aerosol from urban sewage treatment plants
exposure assessment to airborne endotoxin, dust, ammonia, hydrogen sulfide and carbon dioxide in open style swine houses
exposure of workers to airborne microorganisms in open-air swine houses
investigation and application of methods for enumerating heterotrophs and escherichia coli in the air within piggery sheds
mechanically ventilated broiler sheds: a possible source of aerosolized salmonella, campylobacter, and escherichia coli
auditing and assessing air quality in concentrated feeding operation
the survival of escherichia coli sprayed into air and into nitrogen from distilled water and from solutions of protecting agents, as a function of relative humidity
inactivation kinetics of some microorganisms subjected to a variety of stresses
the toxic effect of oxygen upon the aerosol survival of escherichia coli b
aerosol survival of serratia marcescens as a function of oxygen concentration, relative humidity, and time
aerosol survival of pasteurella tularensis and the influence of relative humidity
differences between the thermal inactivation of picornaviruses at "high" and "low" temperatures
bioaerosol health effects and exposure assessment: progress and prospects
bioaerosol transport modeling and risk assessment in relation to biosolid placement
thermotolerant clostridia as an airborne pathogen indicator during land application of biosolids
source identification of airborne escherichia coli of swine house surroundings using eric-pcr and rep-pcr
airborne endotoxin concentrations at a large open lot dairy in southern idaho
a concise review of methodologies used to collect and characterize bioaerosols and their application at concentrated animal feeding operations
the effect of extraction, storage, and analysis techniques on the measurement of airborne endotoxins from a large dairy
yearlong monitoring of airborne endotoxin at a concentrated dairy operation
assessment of bioaerosols at a concentrated dairy operation
survival of airborne pasteurella tularensis at different atmospheric temperatures
effects of atmospheric humidity and temperature on the survival of airborne flavobacterium
relationship between atmospheric temperature and survival of airborne bacteria
direct detection of salmonella cells in the air of livestock stable by real-time pcr
a comparison between computer modeled bioaerosol dispersion and a bioaerosol field spray event
an integrated modeling approach to assess the risk of wind-borne spread of foot-and-mouth disease virus from infected premises
modeling the transport and dispersion of airborne contaminants: a review of techniques and approaches
sources of pathogenic microorganisms and their fate during land application of wastes
airborne spread of foot-and-mouth disease: model intercomparison
long distance transport of foot-and-mouth disease virus over sea
emission of microbial aerosols from sewage treatment plants that use trickling filters
bacterial plume emanating from the air surrounding swine confinement operations
atmospheric movement of microorganisms in clouds of desert dust and implications for human health
sampling for airborne microorganisms. page 939 in manual for environmental microbiology
estimation of risk due to low doses of microorganisms: a comparison of alternative methodologies
rapid aerosol transmission of salmonella among turkeys in a simulated holding-shed environment
technical background document: microbial risk assessment and fate and transport modeling of aerosolized microorganisms at wastewater land application facilities in idaho
physiological responses of airborne bacteria to shifts in relative humidity
health effects of airborne exposures from concentrated animal feeding operations
a method to provide improved dose-response estimates for airborne pathogens in animals: an example using porcine reproductive and respiratory syndrome virus
inactivation of airborne viruses by ultraviolet radiation
the effects of meteorological factors on atmospheric bioaerosol concentrations-a review
dispersion of enteric bacteria by spray irrigation
changes in soil physical properties due to organic waste applications: a review
computational fluid dynamics (cfd) modeling to predict bioaerosol transport behavior during center pivot wastewater irrigation
influence of relative humidity on particle size and uv sensitivity of serratia marcescens and mycobacterium bovis bcg aerosols
measurement of low pressure sprinkler evaporation loss
aerobiological engineering handbook. page 119 in aerosol science and particle dynamics
zoonoses: infectious diseases transmissible from animals to humans
survival of airborne bacteria in high urban concentration of carbon monoxide
physics of microbial bioaerosols. page 5 in atmospheric microbial aerosols: theory and applications
mini-review of the concentration variations found in the alfresco atmospheric bacterial populations
estimation of viable airborne microbes downwind from a point source
simulation of airborne microbial droplet transport
estimating downwind concentrations of viable airborne microorganisms in dynamic atmospheric conditions
bacterial flux from chaparral into the atmosphere in mid-summer at a high desert location
off-site exposure to respirable aerosols produced during the disk-incorporation of class b biosolids
brock biology of microorganisms
survival of bacteria during aerosolization
effects of betaine on enumeration of airborne bacteria
airborne fungi in a dairy barn with emphasis on microclimate and emissions
a lagrangian particle model to predict the airborne spread of foot-and-mouth disease virus
bioaerosols associated with animal production operations
fate and transport of microorganisms in air. page 952 in manual for environmental microbiology
the outdoor aerosol. page 163 in bioaerosols
molecular identification of airborne bacteria associated with aerial spraying of bovine slurry waste employing 16s rrna gene pcr and gene sequencing techniques
culture-independent characterization of archaeal biodiversity in swine confinement building aerosols
culture-independent approach of the bacterial bioaerosol diversity in the standard swine confinement buildings, and assessment of the seasonal effect
experimental airborne transmission of salmonella agona and salmonella typhimurium in weaned pigs
emission rates and characterization of aerosols produced during the spreading of dewatered class b biosolids
source bioaerosol concentration and rrna gene-based identification of microorganisms aerosolized at a flood irrigation wastewater reuse site
microbial aerosols from food-processing waste spray fields
the estimation of the dispersion of windborne material
soil microbial community responses to dairy manure or ammonium nitrate applications
quantification of airborne biological contaminants associated with land applied biosolids. water environment research foundation
assessment tools in support of epidemiological investigations of airborne dispersion of pathogens
estimation of downwind viable airborne microbes from a wet cooling tower-including settling
land application equipment for livestock and poultry manure management
bioaerosols from municipal and animal wastes: background and contemporary issues
studies on the instantaneous death of airborne escherichia coli
land application of agricultural, industrial, and municipal by-products
assessment of bioaerosols in swine barns by filtration and impaction
effect of relative humidity on the inactivation of airborne serratia marcescens by ultraviolet radiation
land application of manure for beneficial reuse. page 283 in animal agriculture and the environment, national center for manure & animal waste management white papers
potential impact of land application of by-products on ground and surface water quality. page 263 in land application of agricultural, industrial, and municipal by-products
a decision-support system for real-time assessment of airborne spread of the foot-and-mouth disease virus
health aspects of bioaerosols. page 304 in atmospheric microbial aerosols: theory and applications. b. lighthart and a
bioaerosol distribution patterns adjacent to two swine-growing-finishing housed confinement units in the american midwest
endotoxin concentration in modern animal houses in southern bavaria
ambient endotoxin level in an area with intensive livestock production
concentrations and emissions of airborne endotoxins and microorganisms in livestock buildings in northern
pathogens in animal wastes and the impacts of waste management practices on their survival, transport and fate. page 609 in animal agriculture and the environment
virus survival in the environment with special attention to survival in sewage droplets and other environmental media of fecal or respiratory origin. page 70. report for the world health organization
changes in soil properties under annual applications of feedlot manure and different tillage practices
influence of relative humidity on the survival of some airborne viruses
health and hygiene aspects of spray irrigation
sensitivity of the derma long-range gaussian dispersion model to meteorological input and diffusion parameters
modelling the atmospheric dispersion of foot-and-mouth disease virus for emergency preparedness
an integrated model to predict the atmospheric spread of fmd virus
exposure to inhalable dust and endotoxins in agricultural industries
introduction to aerobiology. page 925 in manual for environmental microbiology
estimating fugitive bioaerosol releases from static compost windrows: feasibility of a portable wind tunnel approach
estimated occupational risk from bioaerosols generated during land application of class b biosolids
bioaerosol emission rate and plume characteristics during land application of liquid class b biosolids
airborne enteric bacteria and viruses from spray irrigation with wastewater
isolation and identification of pathogenic microorganisms at wastewater-irrigated fields: ratios in air and wastewater
die-away kinetics of aerosolized bacteria from sprinkler application of wastewater
influence of temperature and relative humidity on the survival of chlamydia pneumoniae in aerosols
the evaluation of microbiological aerosols associated with the application of wastewater to land
estimating microorganism densities in aerosols from spray irrigation of wastewater
pathogens and antibiotic residues in animal manures and hygienic and ecological risks related to subsequent land application
effect of aerosolization on subsequent bacterial survival
aerosol infection of calves and mice with salmonella typhimurium
airborne microbial flora in a cattle feedlot
agricultural uses of municipal, animal, and industrial byproducts
concentrations of airborne endotoxin in cow and calf stables
determination of the inflammatory potential of bioaerosols from a duck-fattening unit by using a limulus amebocyte lysate assay and human whole blood cytokine response
airborne gram-negative bacterial flora in animal houses

key: cord-122344-2lepkvby authors: hayashi, hiroaki; kryściński, wojciech; mccann, bryan; rajani, nazneen; xiong, caiming title: what's new? summarizing contributions in scientific literature date: 2020-11-06 journal: nan doi: nan sha: doc_id: 122344 cord_uid: 2lepkvby
with thousands of academic articles shared on a daily basis, it has become increasingly difficult to keep up with the latest scientific findings.
to overcome this problem, we introduce a new task of disentangled paper summarization, which seeks to generate separate summaries for the paper contributions and the context of the work, making it easier to identify the key findings shared in articles. for this purpose, we extend the s2orc corpus of academic articles, which spans a diverse set of domains ranging from economics to psychology, by adding disentangled"contribution"and"context"reference labels. together with the dataset, we introduce and analyze three baseline approaches: 1) a unified model controlled by input code prefixes, 2) a model with separate generation heads specialized in generating the disentangled outputs, and 3) a training strategy that guides the model using additional supervision coming from inbound and outbound citations. we also propose a comprehensive automatic evaluation protocol which reports the relevance, novelty, and disentanglement of generated outputs. through a human study involving expert annotators, we show that in 79%, of cases our new task is considered more helpful than traditional scientific paper summarization. with the growing popularity of open-access academic article repositories, such as arxiv or biorxiv, disseminating new research findings has become nearly effortless. through such services, tens of thousands of scientific papers are shared by the research community every month 1 . at the same time, the unreviewed nature of mentioned repositories and the sheer volume of new publications has made it nearly impossible to identify relevant work and keep up with the latest findings. scientific paper summarization, a subtask within automatic text summarization, aims to assist researchers in their work by automatically condensing articles into a short, human-readable form that contains only the most essential information. 
in recent years, abstractive summarization, an approach where models are trained to generate fluent summaries by paraphrasing the source article, has seen impressive progress. state-of-the-art methods leverage large, pre-trained models (raffel et al., 2019; lewis et al., 2020), define task-specific pre-training strategies, and scale to long input sequences (zhao et al., 2020; zaheer et al., 2020). available large-scale benchmark datasets, such as arxiv and pubmed (cohan et al., 2018), were automatically collected from online archives and repurpose paper abstracts as reference summaries. however, the current form of scientific paper summarization where models are trained to generate paper abstracts has two caveats: 1) often, abstracts contain information which is not of primary importance, 2) the vast majority of scientific articles come with human-written abstracts, making the generated summaries superfluous. to address these shortcomings, we introduce the task of disentangled paper summarization. the new task's goal is to generate two summaries simultaneously, one strictly focused on the summarized article's novelties and contributions, the other introducing the context of the work and previous efforts. in this form, the generated summaries can target the needs of diverse audiences: senior researchers and field-experts who can benefit from reading the summarized contributions, and newcomers who can quickly get up to speed with the intricacies of the addressed problems by reading the context summary and get a perspective of the latest findings from the contribution summary. for this task, we introduce a new large-scale dataset by extending the s2orc corpus of scientific papers, which spans multiple scientific domains and offers rich citation-related metadata. we organize and process the data, and extend it with automatically generated contribution and context reference summaries, to enable supervised model training.
we also introduce three abstractive baseline approaches: 1) a unified, controllable model manipulated with descriptive control codes (fan et al., 2018; keskar et al., 2019), 2) a one-to-many sequence model with a branched decoder for multi-head generation (luong et al., 2016; guo et al., 2018), and 3) an information-theoretic training strategy leveraging supervision coming from the citation metadata (peyrard, 2019). to benchmark our models, we design a comprehensive automatic evaluation protocol that measures performance across three axes: relevance, novelty, and disentanglement. we thoroughly evaluate and analyze the baseline models and investigate the effects of the additional training objective on the model's behavior. to motivate the usefulness of the newly introduced task, we conducted a human study involving expert annotators in a hypothetical paper-reviewing setting. the results find disentangled summaries more helpful in 79% of cases in comparison to abstract-oriented outputs. code, model checkpoints, and data preparation scripts introduced in this work are available at https://github.com/salesforce/disentangled-sum. recent trends in abstractive text summarization show a shift of focus from designing task-specific architectures trained from scratch (see et al., 2017; paulus et al., 2018) to leveraging large-scale transformer-based models pre-trained on vast amounts of data (liu & lapata, 2019; lewis et al., 2020), often in multi-task settings (raffel et al., 2019). a similar shift can be seen in scientific paper summarization, where state-of-the-art approaches utilize custom pre-training strategies and tackle problems of summarizing long documents (zhao et al., 2020; zaheer et al., 2020). other methods, at a smaller scale, seek to utilize the rich metadata associated with scientific articles and combine them with graph-based methods (yasunaga et al., 2019).
in this work, we combine these two lines of work and propose models that benefit from pre-training procedures, but also take advantage of task-specific metadata. popular large-scale benchmark datasets in scientific paper summarization (cohan et al., 2018) were automatically collected from open-access paper repositories and consider article abstracts as the reference summaries. other forms of supervision have also been investigated for the task, including author-written highlights (collins et al., 2017), human annotations and citations (yasunaga et al., 2019), and transcripts from conference presentations of the articles (lev et al., 2019). in contrast, we introduce a large-scale automatically collected dataset with more fine-grained references than abstracts, which also offers rich citation-related metadata. update summarization (dang & owczarzak) defines a setting in which a collection of documents with partially overlapping information is summarized, some of which are considered prior knowledge. the goal of the task is to focus the generated summaries on the novel information. work in this line of research mostly focuses on novelty detection in news articles (bysani, 2010; delort & alfonseca, 2012) and timeline summarization (martschat & markert, 2018; chang et al., 2016) on news and social media domains. here, we propose a novel task that is analogous to update summarization in that it also requires contrasting the source article with the content of other related articles which are considered pre-existing knowledge. given a source article d, the goal of disentangled paper summarization is to simultaneously summarize the contribution y_con and context y_ctx of the source article.
here, contribution refers to the novelties introduced in the article d, such as new methods, theories, or resources, while context represents the background of the work d, such as a description of the problem or previous work on the topic.
[table 1, dataset statistics (column headers lost in extraction): train 805152 6351 925 877 136 236 · valid 36129 6374 922 875 135 236 · test 54242 6350 927 892 136 237]
the task inherently requires a relative comparison of the article with other related papers to effectively disentangle its novelties from pre-existing knowledge. therefore, we also consider two sets of citations: inbound citations c_i and outbound citations c_o as potential sources of useful information for contrasting the article d with its broader field. inbound citations refer to the set of papers that cite d, i.e. relevant future papers, while outbound citations are the set of papers that d cites, i.e. relevant previous papers. with its unique set of goals, the task of disentangled paper summarization poses a novel set of challenges for automatic summarization systems to overcome: 1) identifying salient content of d and related papers from c_i and c_o, 2) comparing the content of d with each document from the citations, and 3) summarizing the article along the two axes: contributions and context. current benchmark datasets used for the task of scientific paper summarization, such as arxiv and pubmed (cohan et al., 2018), are limited in size and domain coverage, and lack citation metadata. thus, we construct a new dataset based on the s2orc corpus, which offers a large collection of scientific papers spanning multiple domains along with rich citation-related metadata, such as citation links between papers and annotated citation spans. specifically, we carefully curate the data available in the s2orc corpus and extend it with new reference labels. data curation some papers in the s2orc corpus do not contain a complete set of information required by our summarization task: paper text, abstract, and citation metadata.
we remove such instances and construct a paper summarization dataset in which each example a) has an abstract and body text, and b) has at least 5 inbound and 5 outbound citations (c_i and c_o, respectively). in cases where a paper has more than 20 incoming or outgoing citations, we sort them in descending order by their respective citation counts and keep the top 20 most relevant articles. citation span extraction each article in the set of inbound and outbound citations can be represented by its full text, abstract, or the span of text associated with the citation. in this study, we follow qazvinian & radev (2008) and cohan & goharian (2015) in representing citations with the sentences in which the citation occurs. thus, an outbound citation is represented by a sentence from the source paper. usually, such sentences directly refer to the cited paper and place its content in relation to the source paper. analogously, an inbound citation is represented by sentences from the citing paper and relates its content with the source paper. reference generation our approach relies on the availability of reference summaries for both contributions and contexts. however, such annotations are not provided or easily extractable from the s2orc corpus, and collecting expert annotations is infeasible due to the associated costs. therefore, we apply a data-driven approach to automatically extract contribution and context reference summaries from the available paper abstracts. first, we manually label 400 abstracts sampled from the training set. annotations are done at the sentence level with binary labels indicating contribution- and context-related sentences. this procedure yields 3341 sentences with associated binary labels, which we refer to as golden standard references. next, we fine-tune an automatic sentence classifier using the golden standard data.
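the curation and truncation rules above can be sketched as follows; the field names (`abstract`, `body_text`, `inbound`, `outbound`, `n_cites`) are hypothetical placeholders for illustration, not the actual s2orc schema:

```python
# sketch of the data-curation filter: keep a paper only if it has an
# abstract, body text, and at least 5 inbound and 5 outbound citations;
# citation lists longer than 20 are truncated to the 20 most-cited papers.

MIN_CITATIONS = 5
MAX_CITATIONS = 20

def curate(papers):
    """filter raw corpus entries and cap citation lists (hypothetical schema)."""
    kept = []
    for p in papers:
        if not (p.get("abstract") and p.get("body_text")):
            continue
        inbound, outbound = p["inbound"], p["outbound"]
        if len(inbound) < MIN_CITATIONS or len(outbound) < MIN_CITATIONS:
            continue
        # keep the most relevant articles, sorted by their own citation counts
        p["inbound"] = sorted(inbound, key=lambda c: -c["n_cites"])[:MAX_CITATIONS]
        p["outbound"] = sorted(outbound, key=lambda c: -c["n_cites"])[:MAX_CITATIONS]
        kept.append(p)
    return kept
```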
as our classifier we use scibert (beltagy et al., 2019), which after fine-tuning achieves 86.3% accuracy in classifying contribution and context sentences on a held-out test set. finally, we apply the fine-tuned classifier to generate reference labels for all examples in our dataset, which we refer to as silver standard references. the statistics of the resulting dataset are shown in table 1. our goal is to build an abstractive summarization system which has the ability to generate contribution and context summaries based on the source article. to achieve the necessary level of controllability, we propose two independent approaches building on encoder-decoder architectures: controlcode (cc) a common approach to controlling model-generated text is by conditioning the generation procedure on a control code associated with the desired output. previous work on controllable generation (fan et al., 2018; keskar et al., 2019) showed that prepending a special token or descriptive prompt to the model's input during training and inference is sufficient to achieve fine-grained control over the generated content. following this line of work, we modify our training instances by prepending textual control codes, contribution: or context:, to the summarized articles. during training, all model parameters are updated for each data instance and the model is expected to learn to associate the provided prompt with the correct output mode. the approach does not require changes in the architecture, making it straightforward to combine with existing large-scale, pre-trained models. the architecture is shown on the left of figure 1. multihead (mh) an alternative way of controlling generation is by explicitly allocating layers within the model specifically for the desired control aspects. prior work investigating multi-task models (luong et al., 2016; guo et al., 2018) showed the benefits of combining shared and task-specific layers within a single, multi-task architecture.
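the controlcode scheme amounts to plain input construction; a minimal sketch (the helper and field names are hypothetical, not the released code):

```python
# controlcode-style training instance construction: the same model serves
# both generation modes, distinguished only by a textual prompt prepended
# to the source article.

CONTROL_CODES = {"contribution": "contribution:", "context": "context:"}

def make_instance(article_text, mode, reference):
    """build one (input, target) training pair for the given generation mode."""
    if mode not in CONTROL_CODES:
        raise ValueError(f"unknown mode: {mode}")
    return {
        "input": f"{CONTROL_CODES[mode]} {article_text}",
        "target": reference,
    }
```

at inference time the same prefixes select which summary the model produces, so no architectural change is needed.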
here, the encoder shares all parameters between the two generation modes, while the decoder shares all parameters, apart from the final layer, which splits into two generation branches. during training, each branch is individually updated with gradients from the associated mode. the model shares the softmax layer weights between the output branches under the assumption that token-level vocabulary distributions are similar in the two generation modes due to the common domain. this approach is presented on the right of figure 1. peyrard (2019) proposed an information-theoretic perspective on text summarization which decomposes the criteria of a good summary into redundancy, relevance, and informativeness. among these criteria, informativeness measures the user's degree of surprise after reading a summary given their background knowledge, and can be formally defined as inf = Σ_i (log p_d(ω_i) − log p_k(ω_i)), where ω_i is a primitive semantic unit, p_k is the probability of the unit under the user's knowledge, p_d is the probability of the unit with respect to the source document, and i is an index over all semantic units within a summary. as defined by peyrard (2019), informativeness is in direct correspondence to contribution summarization. paper contributions are novel contents introduced to the community, which cause surprise given the general knowledge about the state of the field. therefore, in this work we explore utilizing this measure as an auxiliary objective that is optimized during training. we define the semantic unit ω_i as the summary itself, which enables a simple interpretation of the corresponding probabilities. we estimate p_d as the likelihood of the summary given the paper content, p_d(ω_i) = p(y | d). since each paper is associated with a unique context and background knowledge, we treat the background knowledge as all relevant papers published before the source paper, i.e., outbound citations c_o.
therefore, p_k is estimated as the likelihood of the summary given the previous work, p_k(ω_i) = p(y | c_o), and we formulate the informativeness function as inf(y) = log p(y | d) − log p(y | c_o), where the conditioning depends on the generation mode of the model, and we aim to maximize it during the training procedure. combined with a cross entropy loss l_ce, we obtain the final objective which we aim to minimize during training: l = l_ce − λ · inf, where λ is a scaling hyperparameter determined through cross-validation. in this section, we describe the experimental environment and report automatic evaluation results. we consider four model variants:
• cc, cc+inf: controlcode model without and with the informativeness objective,
• mh, mh+inf: multihead model without and with the informativeness objective.
figure 2: diagram illustrating the evaluation protocol assessing summaries along 3 axes: relevance, purity, and disentanglement. we perform automatic evaluation of the system outputs (s_con, s_ctx) against the silver standard references (y_con, y_ctx). for this purpose, we have designed a comprehensive evaluation protocol, shown in figure 2, based on existing metrics that evaluates the performance of models across 3 dimensions: relevance generated summaries should closely correspond with the available reference summaries. we measure the lexical overlap and semantic similarity between (s_con, y_con) and (s_ctx, y_ctx) using rouge (r-i) (lin, 2004) and bertscore (zhang et al., 2020; bs), respectively. purity the generated contribution summary should closely correspond with its respective reference summary, but should not overlap with the context reference summary. we measure the lexical overlap between s_con and (y_con, y_ctx) using nouveaurouge_con (n_con-i) (conroy et al., 2011). the metric reports an aggregate score defined as a linear combination of the two components, n_con-i = α_i1 · r-i(s_con, y_con) + α_i2 · r-i(s_con, y_ctx), where the weights α_ij were set by the original authors to favor outputs with maximal overlap with the related reference and minimal overlap with the unrelated one.
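stepping back to the training objective of section 4: treating the summary as a single semantic unit, the informativeness term and the combined loss reduce to simple arithmetic. a minimal sketch, with `logp_given_doc` and `logp_given_prior` standing in for the model log-likelihoods log p(y | d) and log p(y | c_o); the names and scaling are illustrative, not the paper's implementation:

```python
# hedged sketch of the informativeness-guided objective: cross entropy
# minus a scaled informativeness term, so that minimizing the loss
# maximizes informativeness.

def informativeness(logp_given_doc, logp_given_prior):
    # log p(y | d) - log p(y | c_o): surprise of the summary relative to
    # what the outbound citations (background knowledge) already explain
    return logp_given_doc - logp_given_prior

def total_loss(ce_loss, logp_given_doc, logp_given_prior, lam=0.05):
    # lam plays the role of the scaling hyperparameter lambda
    return ce_loss - lam * informativeness(logp_given_doc, logp_given_prior)
```

in practice the two log-likelihoods would come from scoring the reference summary under the model conditioned on the article and on the outbound-citation text, respectively.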
analogously, we calculate n_ctx-i in the reverse direction between s_ctx and (y_ctx, y_con). purity p-i is defined as the average novelty in both directions: p-i = (n_con-i + n_ctx-i) / 2. disentanglement generated contribution and context summaries should have minimal overlap. we measure the degree of lexical overlap and semantic similarity between (s_con, s_ctx) using rouge and bertscore, respectively. to maintain consistency across metrics (higher is better) we report disentanglement scores as complements of the associated metrics: d-i = 1 − r-i(s_con, s_ctx) and d-bs = 1 − bs(s_con, s_ctx). our models build upon distilbart, a transformer-based (vaswani et al., 2017), pre-trained sequence-to-sequence architecture distilled from bart (lewis et al., 2020). specifically, we used a model with 6 self-attention layers in both the encoder and decoder. weights were initialized from a model fine-tuned on a news summarization task. for the multihead model, the final layer of the decoder was duplicated and initialized with identical weights. we fine-tuned on the training set for 80000 gradient steps with a fixed learning rate of 3.0 × 10^-5 and chose the best checkpoints in terms of rouge-1 scores on the validation set. the loss scaling hyperparameter λ (eq. 3) was set to 0.05 and 0.01 for the controlcode and multihead models, respectively. input and output lengths were set to 1024 and 200, respectively. at inference time, we decoded using beam search with beam size 5. the evaluation was performed using the summeval toolkit (fabbri et al., 2020). in table 2 we report results from the automatic evaluation protocol described in subsection 5.1. relevance across most models and metrics, relevance scores for context generation are higher than those for contribution summarization. manual inspection revealed that in some cases generated context summaries also include article contribution information, while this effect was not observed in the reverse situation.
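to make the disentanglement complement concrete, here is a toy sketch where a simple unigram-overlap f1 stands in for rouge-1 (the paper uses the actual rouge and bertscore implementations; this stand-in is purely illustrative):

```python
# disentanglement score: overlap between the two generated summaries,
# reported as a complement so that higher means better separated.

def unigram_f1(a, b):
    """toy rouge-1-style f1 between two whitespace-tokenized strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    overlap = len(ta & tb)
    prec, rec = overlap / len(ta), overlap / len(tb)
    return 0.0 if prec + rec == 0 else 2 * prec * rec / (prec + rec)

def disentanglement(s_con, s_ctx):
    """d-1 analogue: 1 minus the overlap of contribution and context summaries."""
    return 1.0 - unigram_f1(s_con, s_ctx)
```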
considering that silver standard annotations may contain noisy examples with incorrectly separated references, we suspect that higher rouge scores for context summaries may be caused by noisy predictions coinciding with noisy references. examples of such summaries are shown in appendix e. we also observe that informativeness-guided models (+inf) perform on par with their respective base versions, and the additional training objective does not affect the performance on the relevance metric. this insight corroborates peyrard (2019), who defines informativeness and relevance as orthogonal criteria. purity while the informativeness objective was designed to improve the novelty of generated summaries, results show an opposite effect, where informativeness-guided models slightly underperform their base counterparts. the true reason for such behavior is unknown; however, it might be an indicator that the outbound citations c_o are not a good approximation of reference context summaries y_ctx, or the relationship between the two is weak. this effect is more evident in the medical and biology domains, which are the two most frequent domains in the dataset. (footnotes: we did not observe a substantial difference in performance between distilbart and bart; model weights are available at https://huggingface.co/sshleifer/student_cnn_6_6.) disentanglement results indicate that controlcode-based models perform better than multihead approaches in terms of generating disentangled outputs. this comes as a surprise given that the cc models share all parameters between the two generation modes, but might indicate that the two tasks contain complementary training signals. we also noticed that both informativeness-guided models performed better in terms of d-1.
based on both purity and disentanglement evaluations, we suspect that the informativeness objective does guide the models to output more disentangled summaries (second term in eq 2), but the signal is not strong enough to focus on generating the appropriate content (first term in eq 2). it is also clear that the multihead model benefits more from the additional training objective. to better understand the strengths and shortcomings of our models, we performed a qualitative study of model outputs. table 3 shows an example of generated summaries compared with the original abstract of the summarized article. our model successfully separates the two generation modes and outputs coherent and easy-to-follow summaries. the contribution summary clearly lists the novelties of the work, while the context summary introduces the task at hand and explains its importance. in comparison, the original abstract briefly touches on many aspects: the context, methods used, and contributions, but also offers details that are not of primary importance, such as details about the simulation environment. more generally, the described trends hold across summaries generated by our models. the model outputs are fluent, abstractive, offer good separation between modes, and are on topic. however, the factual correctness of summaries could not be assessed due to the highly specialized content and language of the summarized articles. an artifact noticed in a few instances of the inspected outputs was leakage of contribution information into context summaries. other examples of generated summaries are included in appendix e. taking advantage of the rich metadata associated with the s2orc corpus, we analyze the performance of models across the 10 most frequent scientific domains. table 4 shows the results of contribution summarization using the controlcode model.
while rouge-1 scores oscillate around 40 points for most academic fields, the results indicate that summarizing documents from the medical domain is particularly difficult, with models scoring about 7 points below average. table 3: generated samples compared with the original and generated abstracts of the associated paper. the second row shows the output decoded from distilbart fine-tuned on our dataset, the third row shows the outputs from the controlcode model. our model successfully generates disentangled content, thus making it easier to follow than the abstract. original abstract: energy optimization in buildings by controlling the heating ventilation and air conditioning (hvac) system is being researched extensively. in this paper, a model-free actor-critic reinforcement learning (rl) controller is designed using a variant of artificial recurrent neural networks called long short-term memory (lstm) networks. optimization of thermal comfort alongside energy consumption is the goal in tuning this rl controller. the test platform, our office space, is designed using sketchup. using openstudio, the hvac system is installed in the office. the control schemes (ideal thermal comfort, a traditional control and the rl control) are implemented in matlab. using the building control virtual test bed (bcvtb), the control of the thermostat schedule during each sample time is implemented for the office in energyplus alongside local weather data. results from training and validation indicate that the rl controller improves thermal comfort by an average of 15% and energy efficiency by an average of 2.5% as compared to other strategies mentioned. generated abstract: despite the advances in research on hvac control algorithms, most field equipment is controlled using classical methods that include hysteresis/on/off and proportional integral and derivative (pid) controllers. these classical methods do not perform optimally.
the high thermal inertia of buildings induces large time delays in the building dynamics, which cannot be handled efficiently by the simple on/off controllers. however, due to the high non-linearity in building dynamics coupled with uncertainties such as weather, energy pricing, etc., these pid controllers require extensive retuning or auto-tuning capabilities, which increases the difficulty and complexity of the control problem. in this work, we introduce novel control algorithms from a branch of machine learning called reinforcement learning. from a controls perspective, reinforcement learning algorithms can be considered as direct adaptive optimal control. like optimal control, reinforcement training algorithms minimize the cumulative sum of costs over a time horizon. unlike traditional optimization algorithms can learn optimal control actions contribution: in this work, we introduce novel control algorithms from a branch of machine learning called reinforcement learning. in our current approach, the impetus is thermostat control. instead of traditional on/off heating and cooling control, reinforcement learning is utilized to set this schedule to obtain improved predicted mean vote (pmv)-based thermal comfort at an optimal energy expenditure. hence, a thermostats schedule is computed using an rl controller. the results show that the q-learning algorithm can learn to adapt to time-varying and nonlinear system dynamics without explicit identification of the plant model in both systems and controls. context: the heating, ventilation and air conditioning (hvac) systems can account for up to 50% of total building energy demand. in the hopes of moving toward a greener, more energy-efficient future, a significant improvement in energy efficiency is needed to achieve this goal. 
despite the advances in research on hvac control algorithms, most field equipment is controlled using classical methods that include hysteresis/on/off and proportional integral and derivative controllers. however, due to the high nonlinearity in building dynamics coupled with uncertainties such as weather, energy pricing, etc., these pid controllers require extensive retuning or auto-tuning capabilities, which increases the difficulty and complexity of the control problem. the high thermal inertia of buildings induces large time delays in the building dynamics, which cannot be handled efficiently by the simple on/off controllers. manual inspection of instances with low scores (r-1 < 20) exposed that contribution summaries in the medical domain are highly quantitative (e.g. "among these treated . . . retinopathy was noted in x%"). while other domains such as biology also suffer from the same phenomenon, low-scoring quantitative summaries were 1.9 times more frequent in medicine than in biology. an investigation into the domain distribution in our dataset (appendix) revealed that biology and medicine are the two best-represented fields in the corpus, with biology having over twice as many examples. we hypothesize that the poor performance of models stems from the fact that generating such quantitative summaries requires a deeper, domain-specific understanding of the source document and the available in-domain training data is insufficient to achieve that goal. to assess the usefulness of the newly introduced task to the research community, we conducted a human study involving expert annotators. the study aimed to compare disentangled paper summaries with traditional, abstract-based summaries in a hypothetical paper-reviewing setting. judges were shown both types of summaries side by side and asked to pick the one which would be more helpful for conducting the paper review.
abstract-based summaries were generated by a model with a configuration identical to the models previously introduced in this work, trained to generate full abstracts using the same training corpus. annotators who participated in this study hold graduate degrees in technical fields and are active in the research community; however, they were not involved or familiar with this work prior to this experiment. the study used 100 examples, out of which 50 were decoded on the test split of the adapted s2orc dataset, while the other 50 were generated in a zero-shot fashion from articles in the cord dataset, a recently introduced collection of papers related to covid-19. results in table 5 show the proportion of all examples where the annotators preferred the disentangled summaries over the generated abstracts. the numbers indicate a strong preference from the judges for disentangled summaries, in the case of both s2orc and cord examples. the values on cord samples are slightly higher than those on s2orc; we suspect this is due to the fact that the annotators were less familiar with the topics described in covid-related publications and would require more help to review such articles. in this paper, we propose disentangled paper summarization, a new task in scientific paper summarization where models simultaneously generate contribution and context summaries. with the task in mind, we introduced a large-scale dataset with fine-grained reference summaries and rich metadata. along with the data, we introduced three abstractive baseline approaches to solving the new task and thoroughly assessed them using a comprehensive evaluation protocol designed for the task at hand. through human studies involving expert annotators, we motivated the usefulness of the task in comparison to the current scientific paper summarization setting. together with this paper, we release the code, trained model checkpoints, and data preprocessing scripts to support future work in this direction.
we hope this work will positively contribute to creating ai-based tools for assisting scientists in the research process. different writing styles might locate and express contributions in different ways. to understand the global tendency of contribution locations in a paper, we take each sentence from the paper texts themselves in the training set and annotate contributions using the learned sentence classifier. we then group them into 10 bins according to the relative location of the sentences in the papers they belong to and construct a distribution which summarizes the proportion of sentences labeled as contributions in each bin. figure 3 shows the percentages of such sentences for each bin. the graph shows that no bin position in the papers tends to describe contributions more than 50% of the time. surprisingly, the first 10% of the papers have the lowest chance of describing the contributions, which is counter-intuitive to the general idea that papers tend to discuss the introduction and highlights of the paper at the beginning. as discussed in section 3.1, labels for contribution or context are populated automatically using a classifier, which is expected to contain mistakes. therefore, we created a gold standard evaluation set by manually annotating 100 samples in the test set and report the evaluation results in table 6. a sharp drop in rouge scores for the context summaries is due to some examples receiving zero scores for generated context summaries when the manual annotation judged that none exist. the overall trend of the controlcode model outperforming the multihead model is still observed in the evaluation. more noticeably, we observe a reverse tendency when the two models are applied with the informativeness objective. the multihead model specifically enjoyed significant improvement in terms of novelty and disentanglement.
in addition to the various automatic evaluations, we perform human evaluation on disentanglement to understand which models human annotators prefer. we use best-worst scaling (kiritchenko & mohammad, 2017) over the 4-tuples of summaries on 50 random samples from the test set and have 3 annotators pick the best and the worst contribution and context summary pairs in terms of disentanglement. the rating in table 7 shows the percentage a model is chosen as the best minus the percentage a model is chosen as the worst; the rating ranges from -1 to 1. a similar trend to the automatic disentanglement evaluation is observed here as well in that introducing the informativeness objective significantly improves the multihead model, while it conversely affects the controlcode model. we show additional samples generated from the controlcode model in tables 8, 9, and 10. a common failure case of all our models is the lack of disentanglement between the two summaries. while samples start out on-topic, the model for context summary generation starts generating the contributions at one point. this phenomenon can conspicuously be observed in tables 8 and 10. original abstract: domain adaption (da) allows machine learning methods trained on data sampled from one distribution to be applied to data sampled from another. it is thus of great practical importance to the application of such methods. despite the fact that tensor representations are widely used in computer vision to capture multi-linear relationships that affect the data, most existing da methods are applicable to vectors only. this renders them incapable of reflecting and preserving important structure in many problems. we thus propose here a learning-based method to adapt the source and target tensor representations directly, without vectorization. in particular, a set of alignment matrices is introduced to align the tensor representations from both domains into the invariant tensor subspace.
these alignment matrices and the tensor subspace are modeled as a joint optimization problem and can be learned adaptively from the data using the proposed alternative minimization scheme. extensive experiments show that our approach is capable of preserving the discriminative power of the source domain, of resisting the effects of label noise, and works effectively for small sample sizes, and even one-shot da. we show that our method outperforms the state-of-the-art on the task of cross-domain visual recognition in both efficacy and efficiency, and particularly that it outperforms all comparators when applied to da of the convolutional activations of deep convolutional networks. generated abstract: the difficulty of securing an appropriate and exhaustive set of training data, and the tendency for the domain of application to drift over time, often lead to variations between the distributions of the training (source) and test (target) data. in machine learning this problem is labeled domain mismatch. failing to model such a distribution shift may cause significant performance degradation. domain adaptation (da) techniques capable of addressing this problem of distribution shift have thus received significant attention recently [24] . the assumption underpinning da is that, although the domains differ, there is sufficient commonality to support adaptation. many approaches have modeled this commonality by learning an invariant subspace, or set of subspaces. these methods are applicable to vector data only, however. applying these methods to structured high-dimensional representations (e.g., convolutional activations), thus requires that the data be vectorized first. although this solves the algebraic issue, it does not solve the underlying problem. 
tensor contribution: to address these issues, we propose a novel approach termed tensor-aligned invariant subspace learning (taisl) to learn an invariant tensor subspace that is able to adapt the tensor representations directly. by introducing a set of alignment matrices, the tensors from the source domain are aligned to an underlying tensor space shared by the target domain. instead of executing a holistic adaptation (where all feature dimensions would be taken into account), our approach performs mode-wise partial adaptation where each mode is adapted separately to avoid the curse of dimensionality. we also propose an alternating minimization scheme which allows the problem to be effectively optimized by off-the-shelf solvers. extensive experiments on cross-domain visual recognition demonstrate the following merits of our approach: i) it effectively reduces the domain discrepancy and preserves the discriminative power of the original representations; ii) it is applicable to small sample size adaptation, even when there is only one source. context: deep convolutional neural networks (cnns) represent the state-of-the-art method for a substantial number of visual tasks. the activations of such cnns, and the interactions between them, are naturally represented as tensors, meaning that da should also be applied using this representation. however, after vectorization, many existing approaches become sensitive to the scarcity of source data (compared to the number of dimensions) and noise in the labels. the proposed direct tensor method uses much lower dimensional entities, thus avoiding these estimation problems. to address these issues we propose to learn an invariant tensor subspace that is able to adapt the tensor representations directly. we show in section 5 that the proposed method outperforms all comparators in da of the convolutional activation of cnns. 
higher-order tensor modeling offers us an opportunity to investigate multiple interactions and couplings that capture the commonality and differences between domains. following this idea, a novel approach table 9: generated sample. context summary has a slight erroneous decoding of contribution information. original abstract: conventional wisdom holds that model-based planning is a powerful approach to sequential decision-making. it is often very challenging in practice, however, because while a model can be used to evaluate a plan, it does not prescribe how to construct a plan. here we introduce the "imagination-based planner", the first model-based, sequential decision-making agent that can learn to construct, evaluate, and execute plans. before any action, it can perform a variable number of imagination steps, which involve proposing an imagined action and evaluating it with its model-based imagination. all imagined actions and outcomes are aggregated, iteratively, into a "plan context" which conditions future real and imagined actions. the agent can even decide how to imagine: testing out alternative imagined actions, chaining sequences of actions together, or building a more complex "imagination tree" by navigating flexibly among the previously imagined states using a learned policy. and our agent can learn to plan economically, jointly optimizing for external rewards and computational costs associated with using its imagination. we show that our architecture can learn to solve a challenging continuous control problem, and also learn elaborate planning strategies in a discrete maze-solving task. our work opens a new direction toward learning the components of a model-based planning system and how to use them. generated abstract: the imagination-based planner (ibp) is a model-based agent which learns from experience all aspects of the planning process: how to construct, evaluate, and execute a plan. 
the ibp learns when to act versus when to imagine, and if imagining, how to select states and actions for aggregating the proposed actions and evaluations into a useful plan. through training, it effectively develops a planning algorithm tailored to the target problem. the learned algorithm allows it to flexibly explore, and exploit regularities in, the state and action spaces. in two experiments we evaluated a continuous ibp implementation on a challenging continuous control task, and a discrete ibp in a maze-solving problem. a fully learnable plan-based planning agent for continuous control. an agent that learns to construct a plan via model-inspired imagination was trained to perform "conditional" and "adaptive" computation, which results in a dynamic computational graph recently [7] trained neural networks trained to contribution: in this paper we introduce the imagination-based planner (ibp), a model-based planning agent which learns from experience all aspects of the planning process: how to construct, evaluate, and execute a plan. the ibp learns when to act versus when to imagine, and if imagining, how to select states and actions to evaluate which will help minimize its external task loss and internal resource costs. through training, it effectively develops a planning algorithm tailored to the target problem. in two experiments we evaluated a continuous ibp implementation on a challenging continuous control task, and a discrete ibp in a maze-solving problem. context: model-based planning involves proposing sequences of actions, evaluating them under a model of the world, and refining these proposals to optimize expected rewards. several key advantages of model-free methods are that models support generalization to states not previously experienced, help express the relationship between present actions and future rewards, and can resolve states which are aliased in value-based approximations. 
these advantages are especially pronounced in problems with complex and stochastic environmental dynamics, sparse reward, and restricted trial-and-error experience. yet even with an accurate model, planning is often very challenging because while a model can be used to evaluate a plan, it does not prescribe how to construct a plan. existing techniques for model-based planning are most effective in small-scale problems, often require background knowledge of the domain, and use pre-defined solution strategies. here we introduce the imagination-based planner (ibp), a model-based agent which learns from experience all aspects of the planning table 10: generated sample. more than half of the context summary is describing contribution. original abstract: we provide a generalization of discriminative lexicalized shift reduce parsing techniques for phrase structure grammar to a wide range of morphologically rich languages. the model is efficient and outperforms recent strong baselines on almost all languages considered. it takes advantage of a dependency based modelling of morphology and a shallow modelling of constituency boundaries. generated abstract: this paper shows that an efficient lexicalized phrase structure parser -modelling both dependencies and morphology -already significantly improves parsing accuracy. but we also show that an additional modelling of spans and constituency provides additional robustness that contributes to yield state of the art results on almost all languages considered, while remaining quite quite efficient. moreover given the availability of existing multi-view treebanks (bhatt et al., 2009; seddah et al., 2014), our proposed solution only requires a lightweight infrastructure to achieve multilingual parsing without requiring costly language-dependent modifications such as feature engineering. this method was used for the construction of the dataset for the spmrl 2013 shared task. 
we first review the properties of multiview treebanks for a number of morphologically rich languages for which they are annotated both for constituents and dependencies that have the property to be token-wise aligned. these double annotations are typically obtained by converting a constituency or dependency annotation into the other annotation contribution: in this paper, we propose a lexicalized phrase structure parsing model that can effectively take morphology into account. we show that an efficient lexicalization of both dependencies and morphology significantly improves multilingual state-of-the-art results on almost all languages considered, while remaining quite efficient. moreover, given the availability of existing multi-view treebanks, our proposed solution only requires a lightweight infrastructure to achieve multilingual parsing without requiring costly feature engineering such as feature engineering. context: most state-of-the-art multilingual parsers are weighted by discriminative models. most state-of-the-art multilingual parsing methods rely on lexicalized phrase structure parsing techniques, which have recently been shown to improve performance in a variety of languages including free word order languages like english or chinese. in this paper we show that an efficient lexicalized parser -modelling both dependencies and morphology -already significantly improves parsing accuracy. but it also shows that an additional modelling of spans and constituency provides additional robustness that contributes to yield state-of-the-art results on almost all languages considered, while remaining quite efficient. moreover, given the availability of existing multi-view treebanks (bhatt et al., 2009; seddah et al., 2013), our proposed solution only requires a lightweight infrastructure to achieve multilingual parsing without requiring costly feature engineering such as feature engineering. 
references:
• scibert: a pretrained language model for scientific text
• detecting novelty in the context of progressive summarization
• timeline summarization from social media with life cycle models
• scientific article summarization using citation-context and article's discourse structure
• a discourse-aware attention model for abstractive summarization of long documents
• a supervised approach to extractive summarisation of scientific papers
• nouveau-rouge: a novelty metric for update summarization
• overview of the tac 2008 update summarization task
• dualsum: a topic-model based approach for update summarization
• summeval: re-evaluating summarization evaluation
• controllable abstractive summarization
• soft layer-specific multi-task summarization with entailment and question generation
• ctrl - a conditional transformer language model for controllable generation
• best-worst scaling more reliable than rating scales: a case study on sentiment intensity annotation
• talksumm: a dataset and scalable annotation method for scientific paper summarization based on conference talks
• bart: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension
• rouge: a package for automatic evaluation of summaries
• text summarization with pretrained encoders
• s2orc: the semantic scholar open research corpus
• multi-task sequence to sequence learning
• a temporally sensitive submodularity framework for timeline summarization
• a deep reinforced model for abstractive summarization
• a simple theoretical model of importance for summarization
• scientific paper summarization using citation summary networks
• exploring the limits of transfer learning with a unified text-to-text transformer. corr, abs/1910.10683
• distilbert, a distilled version of bert: smaller, faster, cheaper and lighter
• get to the point: summarization with pointer-generator networks
• attention is all you need. corr, abs/1706.03762
• the covid-19 open research dataset
• huggingface's transformers: state-of-the-art natural language processing
• scisummnet: a large annotated corpus and content-impact models for scientific paper summarization with citation networks
• big bird: transformers for longer sequences
• pegasus: pre-training with extracted gap-sentences for abstractive summarization
• bertscore: evaluating text generation with bert
• seal: segment-wise extractive-abstractive long-form text summarization

the authors thank wenhao liu, divyansh agarwal, sharvin shah, and tania lopez-cantu for assisting with annotations.

key: cord-024341-sw2pdnh6 authors: aksyonov, konstantin; aksyonova, olga; antonova, anna; aksyonova, elena; ziomkovskaya, polina title: development of cloud-based microservices to decision support system date: 2020-05-05 journal: open source systems doi: 10.1007/978-3-030-47240-5_9 sha: doc_id: 24341 cord_uid: sw2pdnh6 intelligent simulation systems are becoming a key stage in scheduling the work of companies and industries. most existing decision support systems are desktop software. today there is a need for durable, flexible, available and cross-platform information technologies. this paper proposes a working cloud-based decision support system, bpsim.web, which consists of a set of services and tools. the model of the multi-agent resource conversion process is considered. the process of developing a simulation model via bpsim.web is described. an example of a model of a real process is given. the creation of simulation systems (sim) [1] is one of the promising directions for the development of decision-making systems for business processes (bp), supply chains and logistics [2, 3] and technological processes (for example, metallurgy [4] ). currently, the presence in sim of communities of interacting agents that are identified with decision makers (dm) is significant [5] [6] [7] [8] . 
currently, the commercial simulation solutions on the market (such as anylogic, aris, g2) are desktop applications. the aris system allows you to create html pages with the results of experiments and upload them to the internet. the anylogic system is able to compile java applets with developed models and place them on the network. to start working with such a model, it is necessary to fully download it to the user's device; playing the simulation experiment in the model applet takes place on the user's device and requires significant computing resources. the analysis showed that the greatest business-process simulation functionality is provided by the anylogic and bpsim products. in the direction of service-oriented architecture, only g2 is developing. thus, the urgent task is to choose a dynamic model of a business process and build a simulation web service on its basis. a comparative analysis of the sims is presented in table 1. 1) accounting for various types of resources [9, 10] ; 2) accounting for the status of operations and resources at specific times; 3) accounting for conflicts over common resources and means [11, 12] ; 4) modeling of discrete processes; 5) accounting for complex resources (resource instances with properties; in the terminology of queuing systems, an application (transaction)); 6) application of a situational approach (the presence of a language for describing situations (a knowledge representation language) and mechanisms for diagnosing situations and finding solutions (a logical inference mechanism, in the terminology of expert systems)); 7) implementation of intelligent agents (dm models); 8) description of hierarchical processes. consider the following approaches and models of multi-agent systems and bp: 1) the model of a multi-agent resource conversion process; 2) the sie-model of a.u. filippovich; 3) the models of active and passive converters (apc) of b.i. klebanov and i.m. moskalev. 
the dynamic model of multi-agent resource conversion processes (marcp) [5, 6] is designed to model organizational and technical processes and business processes and to support management decisions. the marcp model was developed on the basis of the following mathematical schemes: petri nets, queuing systems and system dynamics models. the key concept of the marcp model is a resource converter having the following structure: input, start-up, conversion, control, and output. "start-up" determines the moment the converter is launched on the basis of: the state of the conversion process, input and output resources, control commands, and means. at the time of launch, the conversion execution time is determined based on the parameters of the control command and the available resource limitations. the marcp model has a hierarchical structure. agents manage the objects of the process based on the content of a knowledge base (kb). the integrated situational, simulation and expert model of a.u. filippovich (sie-model) is presented in [13] . because this model is focused on the problem area of prepress (printing) processes, some of its fragments will be described in terms of the marcp model. the sie-model is presented as several different levels, corresponding to simulation, expert and situational representations of information [13] . the first level of the model is intended to describe the structure of the system. for this, a block is associated with each object (subject). blocks are interconnected by interaction channels. each block processes a transaction for a certain time and delays it for a time determined by the intensity. we formulate the following conclusions: 1. the sie-model can serve as the basis for creating a multi-agent bp model. 2. the sie-model has the following advantages: a mechanism for diagnosing situations; a combination of simulation, expert and situational approaches. 3. 
the sie-model does not satisfy the following requirements of the bp model: the presence of a dm (agent) model; problem orientation to business processes. the work of i.m. moskalev and b.i. klebanov [14] [15] [16] presents a mathematical model of the resource conversion process, the specificity of which is the allocation of active and passive converters (apc). in this model, the vertices of graph x are formed by passive converters, active converters, stocks of instruments and resource storages, and the arcs represent resource and information flows and flows to funds. the model of active and passive converters is focused on solving production-planning problems and is based on scheduling theory. this model does not provide for the implementation of intelligent agents (dm models) with a production knowledge base, nor for a language for describing situations and mechanisms for diagnosing situations and finding solutions. the results of the analysis of the considered approaches and models of dynamic modeling of situations are given in table 2. as follows from the table, all the requirements of the multi-agent business process model are met by the marcp model. the sie-model, whose advantage is its study of the integration of simulation, expert and situational modeling, can serve as the theoretical basis of the implemented method. to implement the simulation modeling service, it was decided to use the marcp concept. the simulation modeling service is based on asp.net core technology and the c# programming language. asp.net core is a cross-platform, high-performance, open-source framework for building modern, cloud-based, internet-connected applications. asp.net core provides the following benefits: • a single solution for creating a web user interface and web apis. • integration of modern client platforms and development workflows. • a cloud-ready, environment-based configuration system. • built-in dependency injection. 
• a simplified, high-performance, modular http request pipeline. • the ability to host in iis, nginx, apache, docker or in your own process. • side-by-side versioning of the application when targeting .net core. • tools that simplify modern web development. • the ability to build and run on windows, macos and linux. • open source and community oriented. asp.net core ships entirely as nuget packages. using nuget packages allows applications to be optimized to include only the necessary dependencies. asp.net core 2.x applications targeting .net core require only one nuget package. thanks to the small surface area of the application, benefits such as higher security, reduced maintenance and improved performance are available. figure 1 shows the architecture of the simulation service. the service manages 3 entities: models, simulation processes, and execution reports. it receives commands from the integration interface (api) and, depending on the command, retrieves or stores data in the database, performs internal transformations and calculations, and starts the simulation. the simulation modeling service has an interaction and integration interface in the form of a cross-platform http web api. the http protocol allows you to provide more than just web pages. the simulation service exposes an interface in the rest (representational state transfer) style. rest is an architectural style of software that defines the interaction of components of a distributed application on a network, or the integration of multiple applications. one of the requirements of rest is a uniform programming interface for all entities with which the web service works. the crud (create-read-update-delete) interface of operations is described using the http request verb (get, post, put, etc.), the names of entities and, if necessary, the identifiers of these entities. consider some possible queries on the models that the simulation service handles: here, all commands to the service have the single prefix "model". 
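the url conventions described here can be sketched with a small helper (a python illustration only; the service's actual routing is implemented in asp.net core, and the "/api/v1" prefix is taken from the post /api/v1/model example later in the text):

```python
BASE = "/api/v1"  # service prefix, as in post /api/v1/model

def url(entity, entity_id=None, sub=None):
    """compose rest-style urls as described in the text: the entity name
    alone addresses the whole collection, an identifier addresses one
    instance, and a sub-resource name addresses a nested collection."""
    parts = [BASE, entity]
    if entity_id is not None:
        parts.append(str(entity_id))
    if sub is not None:
        parts.append(sub)
    return "/".join(parts)
```

for example, url("model") addresses the model collection, url("model", 42) one model, and url("model", 42, "task") the simulation tasks of that model.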
requests that address the whole collection of models at once - getting all models, or adding a new model to the collection - use only the prefix. if you need an action associated with a specific model - get, change, delete - you must continue the request url with the model identifier. an http verb is used to determine the action to be performed on the request object. the rest style takes a special approach for requests that work with several service entities at once. this is how a simulation task is created: the url describes the relationship of one particular model with its tasks. the service takes the model with a specific identifier from the model domain and refers to its set of simulation tasks. the http verb indicates that a new task should be added to this set. the retrieval of all simulation tasks for a model is described in a similar way: • get model/{id}/task; however, the set of simulation tasks can also be addressed not in the context of any particular model, but as a whole. the following methods are used for this: • get task - get all simulation tasks; • get task/{id} - get information on a simulation task; • delete task/{id} - stop a simulation. for each task, a set of reports is generated. a task report is strongly tied to the task itself; therefore, deleting a task deletes all reports for this task. the reports do not have unique identifiers and are retrieved as a whole set. mongodb was chosen as the data storage system of the sim service. mongodb is a dbms that uses a document-oriented data model. this allows mongodb to carry out crud operations very quickly, to be more adaptable to changes in stored data and to be more understandable to the user. it is an open-source project written in c++. all libraries and drivers for programming languages and platforms are also freely available. the storage format in mongodb is similar to json (javascript object notation), although formally json is not used. 
mongodb uses a format called bson (binary json) for storage. the bson format has a certain structure and stores some additional information about a specific key-value pair (for example, the data type and a hash). therefore, an object in bson usually takes up a bit more space than in json. however, bson allows you to work with data faster: faster search and processing. in mongodb, each model is stored as a document. this document stores the entire structure of the model. figure 2 shows a simulation model saved by the service in mongodb. all models are saved in the models collection. the structure of the model object itself, in addition to a unique identifier, consists of the following fields: • name - the model name. • resources - a key-value object, where the key is the name of a resource and the value is the default value of the resource. • orders - a key-value object, where the key is the name of an order and the value is an array of the names of the order's fields. • nodes - a key-value object, where the key is the name of a node and the value is a complex object that describes either an agent or an operation. an "agent" node has an array of global rules (globalrules) and a set of knowledge items (knowledges). a knowledge item is a complex key-value object, where the key is the name of the knowledge and the value is an array of rules that describe this knowledge. as mentioned earlier, simulation tasks also end up in the database. they are saved to the tasks collection. a task consists of the following fields: • modelid - the identifier of the model with which the task is associated. • modelrevision - the version of the model from which the task was created. if the model changes between the creation of the task and its actual simulation, inconsistencies may occur. • state - the task state. • subjectarea - the subject area with which the simulation will run. 
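a toy model document with the structure described for the models collection (name, resources, orders, nodes) might look as follows; all contents are invented for illustration, and figure 2 shows a real saved model:

```python
# a hypothetical model document, mirroring the fields described in the text
model_doc = {
    "name": "rolling-mill",
    "resources": {"slab": 60, "coil": 0},          # resource -> default value
    "orders": {"batch": ["size", "steel_grade"]},  # order -> field names
    "nodes": {
        "slab store": {"type": "operation", "duration": 3},
        "in bake": {
            "type": "agent",
            "globalRules": [],                     # array of rules
            "knowledges": {"dispatch": []},        # knowledge -> rules
        },
    },
}
```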
• configuration - the configuration with which the simulation will be launched. it stores information about when to stop the simulation, as well as some other information. • reports - an array of reports that were created for this task. a task may be in one of several states, determined by its life cycle: • 0 - open - the task is created and ready for simulation; • 1 - scheduled - the task has been scheduled but not yet simulated (the engine is being prepared, objects are being created, etc.); • 2 - pending - the task is being simulated; • 3 - failed - the task failed; • 4 - successful - the task completed successfully; • 5 - aborted - the task was interrupted; • 6 - deleted - the task was deleted before it was taken up for simulation. a task created as a result of an http request does not begin to be simulated immediately. it is saved to the database and enters the task queue. at a given interval, the service checks the queue for tasks. if there are no tasks to simulate, the check is repeated after some time. if there is a task in the queue, its simulation begins. the simulator can simulate several tasks in parallel. the degree of parallelism is not strictly defined and can be adjusted in the service settings. if the simulator has no free resources for simulation, it does not look at the queue until they appear. as soon as some task completes its simulation, the simulator looks through the queue for new tasks. when a task enters the system, it is saved to the database with status 0 (open). as soon as the simulator takes the task from the queue, it is set to status 1 (scheduled). this is necessary so that a parallel process or another instance of the service does not begin to simulate the same task. when the simulator finishes its preparations, it begins the simulation, setting the task to status 2 (pending). 
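the states listed here can be summarized as a small state machine (a sketch only; the set of allowed transitions is inferred from the description and may not match the service's actual behavior exactly):

```python
# task states as enumerated in the text
STATES = {0: "open", 1: "scheduled", 2: "pending",
          3: "failed", 4: "successful", 5: "aborted", 6: "deleted"}

# transitions inferred from the described life cycle
TRANSITIONS = {
    0: {1, 6},     # open -> scheduled, or deleted before simulation
    1: {2, 5},     # scheduled -> pending, or aborted
    2: {3, 4, 5},  # pending -> failed / successful / aborted
}

def advance(state, new_state):
    """move a task to a new state, rejecting illegal transitions."""
    if new_state not in TRANSITIONS.get(state, set()):
        raise ValueError(f"illegal transition {STATES[state]} -> {STATES[new_state]}")
    return new_state
```

terminal states (3-6) have no outgoing transitions, which matches the description of a finished or removed task.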
upon successful completion of the task, it receives status 4 (successful); if an error occurs, status 3 (failed). a task can be canceled by a special command; in this case, it will have status 5 (aborted) or 6 (deleted). figure 3 shows the pattern of product movement between the hot rolling mill lpc-10 and the cold rolling mill lpc-5. the task is to recreate the production process of sheet steel coils and conduct a series of studies evaluating a set of key parameters over three 24-hour working days. the parameters are as follows: 1. the minimum number of slabs in the warehouse of cast slabs at the beginning of the simulation that provides a continuous supply of slabs every three minutes. 2. the current, minimum, maximum and average number of objects in each warehouse of the system during the simulation time. 3. the load of all units, as a percentage, over the simulation time, and the current load. to create such a simulation model, you need to use the post /api/v1/model service interface and put a json object describing the model in the http request body. figure 4 shows a small part of this object. it shows the first nodes of the model: two operations, "slab store" and "batch generator", and the beginning of the description of the "in bake" agent. nine experiments were conducted with the model. in each experiment, the minimum number of slabs in the warehouse was changed. table 3 presents a partial result of the experiments, including the minimum number of slabs in the warehouse, the final output of the model and the waiting time - the total idle time of the furnaces. according to the results of the experiments, it was decided that the 6th experiment was the best: starting from it, a continuous flow of slabs between the workshops was obtained with the minimum quantity of slabs in the warehouse, giving the best effect. 
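the selection criterion described here - the smallest warehouse stock that still keeps the furnaces from idling - can be sketched as follows (the numbers in the usage assertions are invented and are not the values from table 3):

```python
def best_experiment(results):
    """results: list of (min_slabs, output, idle_time) tuples, one per
    experiment. among experiments with no furnace idle time (a continuous
    slab flow), pick the one needing the fewest slabs in the warehouse."""
    continuous = [r for r in results if r[2] == 0]
    return min(continuous, key=lambda r: r[0])
```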
the data obtained in the course of this work made it possible to analyze the current state of business process simulation systems (such as anylogic, aris, bpsim, g2) and highlight the requirements for a new system oriented to work on the internet. a comparative analysis of the existing dynamic bp models was carried out, and the model of the multi-agent resource conversion process was taken as a basis. a prototype web service for bp simulation, bpsim.web, was developed. the web service has been tested in solving the problem of analyzing the processes of two workshops.

references:
• cloud technology in simulation studies, gpss cloud project
• analysis of position optimization method applicability in supply chain management problem
• theoretical and technological foundations of complex objects proactive monitoring management and control
• on design of domain-specific query language for the metallurgical industry
• the architecture of the multi-agent resource conversion processes
• analysis of the electric arc furnace workshop logistic processes using multiagent simulation
• agents and data mining interaction
• intelligent agent: theory and practice
• reengineering the corporation: a manifesto for business revolutions
• project scheduling with multiple modes: a comparison of exact algorithms
• simulation tool based on a memetic algorithm to solve a real instance of a dynamic tsp
• a model of co-evolution in multi-agent system
• integration of situational, simulation and expert modeling systems. publisher "ooo elix+"
• system for analysis and optimization of resource conversion processes
• the principles of multi-agent models of development based on the needs of the agents
• bases of imitation model of artificial society construction accounting of the agents' needs recursion

the reported study was funded by rfbr according to the research project № 18-37-00183. 
key: cord-018746-s9knxdne authors: perra, nicola; gonçalves, bruno title: modeling and predicting human infectious diseases date: 2015-04-23 journal: social phenomena doi: 10.1007/978-3-319-14011-7_4 sha: doc_id: 18746 cord_uid: s9knxdne the spreading of infectious diseases has dramatically shaped our history and society. the quest to understand and prevent their spreading dates back more than two centuries. over the years, advances in medicine, biology, mathematics, physics, network science, computer science, and technology in general contributed to the development of modern epidemiology. in this chapter, we present a summary of different mathematical and computational approaches aimed at describing, modeling, and forecasting the diffusion of viruses. we start from the basic concepts and models in an unstructured population and gradually increase the realism by adding the effects of realistic contact structures within a population as well as the effects of human mobility coupling different subpopulations. building on these concepts we present two realistic data-driven epidemiological models able to forecast the spreading of infectious diseases at different geographical granularities. we conclude by introducing some recent developments in disease modeling rooted in the big-data revolution. historically, the first quantitative attempt to understand and prevent infectious diseases dates back to 1760 when bernoulli studied the effectiveness of inoculation against smallpox [1]. since then, and despite some initial lulls [2], an intense research activity has developed a rigorous formulation of pathogens' spreading. in this chapter, we present different approaches to model and predict the spreading of infectious diseases at different geographical resolutions and levels of detail. we focus on airborne illnesses transmitted from human to human. we are the carriers of such diseases. our contacts and mobility are the crucial ingredients to understand and model their spreading. 
interestingly, the access to large-scale data describing these human dynamics is a recent development in epidemiology. indeed, for many years only the biological roots of transmission were clearly understood, so it is not surprising that classical models in epidemiology neglect realistic human contact structures and mobility in favor of more mathematically tractable, simplified descriptions of unstructured populations. we start our chapter with these modeling approaches, which offer us an intuitive way of introducing the basic quantities and concepts in epidemiology. advances in technology are resulting in increased data on human dynamics and behavior. consequently, modeling approaches in epidemiology are gradually becoming more detailed and starting to include realistic contact and mobility patterns. in sects. 4.3 and 4.4 we describe such developments and analyze the effects of heterogeneities in contact structures between individuals and between cities/subpopulations. with these ingredients in hand we then introduce state-of-the-art data-driven epidemiological models as examples of the modern capabilities in disease modeling and prediction. in particular, we consider gleam [3, 4], episims [5], and flute [6]. the first model is based on the metapopulation framework, a paradigm where the inter-population dynamics is modeled using detailed mobility patterns, while the intra-population dynamics is described by coarse-grained techniques. the other tools are, instead, agent-based models (abms). this class of tools guarantees a very precise description of the unfolding of diseases, but they need to be fed with extremely detailed data and are not computationally scalable. for these reasons their use so far has been limited to the study of disease spread within a limited number of countries. in comparison, metapopulation models require a reduced amount of data, while the approximate description of the internal dynamics allows scaling the simulations to global scenarios. 
interestingly, the access to large-scale data on human activities has also started a new era in epidemiology. indeed, the big-data revolution naturally results in real-time data on the health-related behavior of individuals across the globe. such information can be obtained with tools that either require the active participation of individuals willing to share their health status or that mine individuals' health-related data silently. epidemiology is becoming digital [7, 8]. in sect. 4.6 we introduce the basic concepts, approaches, and results in this new field of epidemiology. in particular, we describe tools that, using search queries, microblogging, or other web-based data, are able to predict the incidence of a wide range of diseases two weeks ahead of traditional surveillance. epidemic models divide the progression of the disease into several states or compartments, with individuals transitioning between compartments depending on their health status. the natural history of the disease is represented by the type of compartments and the transitions from one to another, and naturally varies from disease to disease. in some illnesses, susceptible individuals (s) become infected and infectious when coming in contact with one or more infectious (i) persons and remain so until their death. in this case the disease is described by the so-called si (susceptible-infected) model. in other diseases, as is the case for some sexually transmitted diseases, infected individuals recover, becoming again susceptible to the disease. these diseases are described by the sis (susceptible-infected-susceptible) model. in the case of influenza-like illnesses (ilis), on the other hand, infected individuals recover becoming immune to future infections from the same pathogen. ilis are described by the sir (susceptible-infected-recovered) model. 
these basic compartments provide us with the fundamental description of the progression of an idealized infection in several general circumstances. further compartments can be added to accurately describe more realistic illnesses such as smallpox, chlamydia, meningitis, and ebola [2, 9, 10]. keeping this important observation in mind, here we focus on the sir model. epidemic models are often represented using charts such as the one seen in fig. 4.1. such illustrations are able to accurately represent the number of compartments and the disease's behavior in a concise and easily interpretable form. mathematically, models can also be accurately represented as reaction equations, as we will see below. in general, epidemic models include two types of transitions, "interactive" and "spontaneous." interactive transitions require the contact between individuals in two different compartments, while spontaneous transitions occur naturally at a fixed rate per unit time. for example, in the transition from s to i, susceptible individuals become infected due to the interaction with infected individuals, i.e. s + i → 2i. the transition is mediated by individuals in the compartment i, see fig. 4.1. but how can we model the infection process? intuitively we expect that the probability of a single individual becoming infected must depend on (1) the number of infected individuals in the population, (2) the probability of infection given a contact with an infectious agent, and (3) the number of such contacts. in this section we neglect the details of who is in contact with whom and consider instead individuals to be part of a homogeneously mixed population where everyone is assumed to be in contact with everyone else (we tackle heterogeneous contacts in sect. 4.3). in this limit, the per capita rate at which susceptibles contract the disease, the force of infection λ, can be expressed in two forms depending on the type of population. 
in the first, often called mass-action law, the number of contacts per individual is independent of the total population size, and the force of infection is determined by the transmission rate β and the probability of randomly contacting an infected individual, i.e. λ = β I/N (where N is the population size). in the second case, often called pseudo mass-action law, the number of contacts is assumed to scale with the population size, and λ = β I. without loss of generality, in the following we focus on the first kind of contact. the sir framework is the crucial pillar to model ilis. think, for example, of the h1n1 pandemic in 2009, or the seasonal flu that every year spreads across the globe. the progression of such diseases, from the first encounter to the recovery, happens in a matter of days. for this reason, birth and death rates in the population can generally be neglected, i.e. d_t N ≈ 0 for all times t. let us define the fractions of individuals in the susceptible, infected, and recovered compartments as s, i, and r. the sir model is then described by the following set of differential equations:

d_t s = −λ s,    d_t i = λ s − μ i,    d_t r = μ i,    (4.1)

where λ = β i is the force of infection, μ is the recovery rate, and d_t ≡ d/dt. the first equation describes the infection process in a homogeneously mixed population: susceptible individuals become infected through random encounters with infected individuals. the second equation describes the balance between the in-flow (infection process, first term) and the out-flow (recovery process, second term) in compartment i. finally, the third equation accounts for the increase of the recovered population due to the recovery process. interestingly, the sir dynamical equations, although apparently very simple, cannot be solved analytically due to their intrinsic non-linearity. the description of the evolution of the disease can be obtained only through numerical integration of the system of differential equations. 
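as a minimal sketch of such a numerical integration (not taken from the chapter), the sir system can be advanced with forward euler; the parameter values β = 0.3 and μ = 0.1 per day (so that R_0 = 3) and the initial condition are illustrative assumptions.

```python
def simulate_sir(beta, mu, i0, days, dt=0.01):
    """Integrate d_t s = -beta*i*s, d_t i = beta*i*s - mu*i, d_t r = mu*i
    with forward Euler; s, i, r are population fractions."""
    s, i, r = 1.0 - i0, i0, 0.0
    for _ in range(int(days / dt)):
        new_inf = beta * i * s * dt  # infection in-flow during dt
        rec = mu * i * dt            # recovery out-flow during dt
        s -= new_inf
        i += new_inf - rec
        r += rec
    return s, i, r

# Illustrative parameters: beta = 0.3/day, mu = 0.1/day, one infected in 10^4
s_end, i_end, r_end = simulate_sir(0.3, 0.1, 1e-4, 365)
```

note that the three updates only move mass between compartments, so s + i + r stays equal to 1 throughout, mirroring the d_t N ≈ 0 assumption.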
however, crucial analytic insight on the process can be obtained for early times t ≈ t_0 and late times t → ∞. under which conditions is a disease, starting from a small number I_0 of individuals at time t_0, able to spread in the population? to answer this question let us consider the early stages of the spreading, i.e. t ≈ t_0. the equation for the infected compartment can be written as d_t i = i (β s − μ), indicating an exponential behavior for early times. it then follows that if the initial fraction of susceptible individuals, s_0 = S_0/N, is smaller than μ/β, the exponent becomes negative and the disease dies out. we call this value the epidemic threshold [11] of the sir model: the fraction of susceptibles in the population has to be larger than a certain value, which depends on the disease details, in order to observe an outbreak. typically, the initial cluster of infected individuals is small in comparison with the population size, i.e. s_0 ≫ i_0, or s_0 ≈ 1. in this case, the threshold condition can be re-written as β/μ > 1. the quantity R_0 = β/μ is called the basic reproductive number, a crucial quantity in epidemiology that provides a very simple interpretation of the epidemic threshold. indeed, the disease is able to spread if and only if each infected individual is able to infect, on average, more than one person before recovering. the meaning of R_0 is then clear: it is simply the average number of infections generated by an initial infectious seed in a fully susceptible population [10]. for any value of μ > 0, the sir dynamics will eventually reach a stationary, disease-free state characterized by i = d_t i = 0. indeed, infected individuals will keep recovering until they all reach the r compartment. what is the final number of recovered individuals? answering this apparently simple question is crucial to quantify the impact of the disease. we can tackle this conundrum by dividing the first equation by the third equation in the system (4.1). 
we obtain d_r s = −R_0 s, which in turn implies s(t) = s_0 e^{−R_0 r(t)}. unfortunately, this transcendental equation cannot be solved analytically. however, we can use it to gain some important insights on the sir dynamics. we note that for any R_0 > 1, in the limit t → ∞, we must have s_∞ > 0. in other words, regardless of R_0, the disease-free equilibrium of an sir model is always characterized by some finite fraction of the population in the susceptible compartment: some individuals will always be able to avoid the infection. in the limit where R_0 ≈ 1 we can obtain an approximate solution for r_∞ (or equivalently for s_∞ = 1 − r_∞) by expanding s_∞ = s_0 e^{−R_0 r_∞} to second order around r_∞ ≈ 0. after a few basic algebraic manipulations we obtain r_∞ ≈ 2 (R_0 − 1)/R_0².

in the previous sections we presented the basic concepts and models in epidemiology by considering a simple view of a population where individuals mix homogeneously. although such an approximation allows a simple mathematical formulation, it is far from reality. individuals do not all have the same number of contacts, and more importantly, encounters are not completely random [12] [13] [14] [15]. some persons are more prone to social interactions than others, and contacts with family members, friends, and co-workers are much more likely than interactions with any other person in the population. over the last decade the network framework has been particularly effective in capturing the complex features and the heterogeneous nature of our contacts [12] [13] [14] [15] [16]. in this approach, individuals are represented by nodes while links represent their interactions. as described in different chapters of the book (see chaps. 3, 6, and 10), human contacts are not only heterogeneous in both number and intensity [12] [13] [14] [15] [17] but also change over time [18]. 
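the final-size relation r_∞ = 1 − e^{−R_0 r_∞} (taking s_0 ≈ 1) lends itself to a simple numerical solution; the fixed-point iteration below is one convenient choice (an assumption of this sketch, not a method prescribed by the chapter), shown together with the second-order approximation valid near R_0 = 1.

```python
import math

def final_size(r0, tol=1e-12, max_iter=10000):
    """Solve r_inf = 1 - exp(-r0 * r_inf) by fixed-point iteration."""
    r = 1.0  # start from the upper bound of the attack rate
    for _ in range(max_iter):
        r_new = 1.0 - math.exp(-r0 * r)
        if abs(r_new - r) < tol:
            break
        r = r_new
    return r

def final_size_approx(r0):
    """Second-order expansion around r_inf ~ 0, valid for r0 slightly above 1."""
    return 2.0 * (r0 - 1.0) / r0 ** 2
```

for R_0 = 2 the iteration converges to r_∞ ≈ 0.797, i.e. roughly 80 % of the population is eventually infected, while a finite fraction s_∞ ≈ 0.2 escapes the outbreak, consistent with the discussion above.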
this framework naturally introduces two timescales: the timescale τ_G at which the network connections evolve, and the inherent timescale τ_P of the process taking place over the network. although the dynamical nature of interactions might have crucial consequences on the disease spreading [19] [20] [21] [22] [23] [24], the large majority of results in the literature deal with one of two limiting regimes [25, 26]. when τ_G ≫ τ_P, the evolution of the network of contacts is much slower than the spreading of the disease and the network can be considered as static. on the other hand, when τ_P ≫ τ_G, the links are said to be annealed and changes in the network structure are much faster than the spreading of the pathogen. in both cases the two timescales are well separated, allowing for a simpler mathematical description. here we focus on the annealed approximation (τ_P ≫ τ_G), which provides a simple stage to model and understand the dynamical properties of epidemic processes. we refer the reader to chap. 3, face-to-face interactions, for recent approaches that relax this timescale separation assumption. let us consider a network G(N, E) characterized by N nodes connected by E edges. the number of contacts of each node is described by the degree k. the degree distribution P(k) characterizes the probability of finding a node of degree k. empirical observations in many different domains show heavy-tailed degree distributions usually approximated as power laws, i.e. P(k) ∼ k^{−α} [12, 13]. furthermore, human contact networks are characterized by so-called assortative mixing, meaning a positive correlation between the degrees of connected individuals. correlations are encoded in the conditional probability P(k′|k) that a node of degree k is connected with a node of degree k′ [12, 13]. while including realistic correlations in epidemic models is crucial [27] [28] [29], they introduce a wide set of mathematical challenges that are beyond the scope of this chapter. 
in the following, we consider the simple case of uncorrelated networks in which the interdependence among degree classes is removed. how can we extend the sir model to include heterogeneous contact structures? here we must take a step further than simply treating all individuals the same. we start distinguishing nodes by degree while considering all vertices with the same degree as statistically equivalent. this is known as the degree block approximation and is exact for annealed networks. the quantities under study are now i_k = I_k/N_k, s_k = S_k/N_k, and r_k = R_k/N_k, where I_k, S_k, and R_k are the numbers of infected, susceptible, and recovered individuals in the degree class k, and N_k describes the total number of nodes in the degree class k. the global averages are given by i = Σ_k P(k) i_k, s = Σ_k P(k) s_k, and r = Σ_k P(k) r_k. using this notation and heterogeneous mean field (hmf) theory [26], the system of differential equations (4.1) can now be written as:

d_t s_k = −λ_k s_k,    d_t i_k = λ_k s_k − μ i_k,    d_t r_k = μ i_k.

the contact structure introduces a force of infection that is a function of the degree. in particular, λ_k = λ k Θ_k, where λ is the rate of infection per contact (i.e. β = λ k), and Θ_k describes the density of infected neighbors of nodes in the degree class k. intuitively, this density is a function of the conditional probability that a node of degree k is connected to any node of degree k′, and is proportional to the number of infected nodes in each class. in the simple case of uncorrelated networks the probability of finding a node of degree k′ in the neighborhood of a node in degree class k is independent of k. in this case Θ_k = Θ = Σ_{k′} (k′ − 1) P(k′) i_{k′}/⟨k⟩, where the term k′ − 1 is due to the fact that at least one link of each infected node points to another infected vertex [15]. in order to derive the epidemic threshold let us consider the early time limit of the epidemic process. as done in sect. 4.2.2.1, let us consider that at t ≈ t_0 the population is formed mostly by susceptible individuals. in the present scenario this implies s_k ≫ i_k and r_k ≈ 0 for all k. 
the equation for the infected compartment then becomes d_t i_k = λ k Θ − μ i_k. multiplying both sides by P(k) and summing over all values of k, we obtain d_t i = λ ⟨k⟩ Θ − μ i. in order to understand the behavior of i around t_0, let us consider the equation built by multiplying both sides of the equation for i_k by (k − 1) P(k)/⟨k⟩ and summing over all degree classes. we obtain d_t Θ = Θ [λ (⟨k²⟩ − ⟨k⟩)/⟨k⟩ − μ]. the fraction of infected individuals will increase if and only if d_t Θ > 0. this condition is verified when [15]:

λ/μ > ⟨k⟩/(⟨k²⟩ − ⟨k⟩),

giving us the epidemic threshold of an sir process unfolding on an uncorrelated network. remarkably, due to their broad-tailed nature, real contact networks display fluctuations in the number of contacts (large ⟨k²⟩) that are significantly larger than the average degree ⟨k⟩, resulting in very small thresholds. large degree nodes (hubs) facilitate an extremely efficient spreading of the infection by directly connecting many otherwise distant nodes. as soon as the hubs become infected, diseases are able to reach a large fraction of the nodes in the network. real interaction networks are extremely fragile to disease spreading. while this finding is somehow worrisome, it suggests very efficient strategies to control and mitigate outbreaks. indeed, hubs are central nodes and play a crucial role in the network connectivity [12], and by vaccinating a small fraction of them one is able to quickly stop the spread of the disease and protect the rest of the population. it is important to mention that in realistic settings the knowledge of the network's structure is often limited. hubs might not be easy to identify, and indirect means must be employed. interestingly, the same feature of hubs that facilitates the spread of the disease also allows for their easy detection. 
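the threshold λ/μ > ⟨k⟩/(⟨k²⟩ − ⟨k⟩) can be evaluated directly from a degree distribution; in the sketch below, the truncated power law and the homogeneous reference degree are illustrative choices, not values from the chapter.

```python
def hmf_threshold(pk):
    """Critical ratio (lambda/mu)_c = <k> / (<k^2> - <k>) for an SIR process
    on an uncorrelated network with degree distribution pk (dict k -> P(k))."""
    k1 = sum(k * p for k, p in pk.items())       # <k>
    k2 = sum(k * k * p for k, p in pk.items())   # <k^2>
    return k1 / (k2 - k1)

# Illustrative heavy-tailed distribution: truncated power law P(k) ~ k**-2.5
ks = range(3, 1001)
norm = sum(k ** -2.5 for k in ks)
pk_hetero = {k: k ** -2.5 / norm for k in ks}

threshold_hetero = hmf_threshold(pk_hetero)
# Homogeneous reference: every node has exactly 6 contacts
threshold_homog = hmf_threshold({6: 1.0})
```

the large ⟨k²⟩ of the heavy-tailed distribution pushes its threshold far below that of the homogeneous network, illustrating the fragility of broad-tailed contact networks discussed above.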
since high degree nodes are connected to a large number of smaller degree nodes, one may simply randomly select a node, a, from the network and follow one of its links to reach another node, b. with high probability, node b has higher degree than a and is likely a hub. this effect became popularized as the friend paradox: on average your friends have more friends than you do [12]. immunizing node b is then much more effective than immunizing node a. remarkably, as counter-intuitive as this methodology might seem, it works extremely well even in the case of quickly changing networks [30] [31] [32]. the next step in the progression towards more realistic modeling approaches is to consider the internal structure of the nodes. if each node in the network represents a homogeneously mixed sub-population instead of a single individual, and we consider the edges to represent interactions or mobility between the different subpopulations, then we are in the presence of what is known as a metapopulation. this concept was originally introduced by r. levins in 1969 [33] for the study of geographically extended ecological populations where each node represents one of the ecological niches where a given population resides. the metapopulation framework was later extended for use in epidemic modeling by sattenspiel in 1987. in a landmark paper [34] sattenspiel considered two different types of interactions between individuals, local ones occurring within a given node, and social ones connecting individuals originating from different locations on the network. this idea was later expanded by sattenspiel and dietz to include the effects of mobility [35], thus laying the foundations for the development of epidemic models at the global scale. metapopulation epidemic models can be naturally described as particle reaction-diffusion models [36]. 
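the friend paradox can be verified exactly on a toy graph: the expected degree of the endpoint of a randomly chosen edge is ⟨k²⟩/⟨k⟩, never smaller than the mean degree ⟨k⟩. the star graph below is a hypothetical example used only for illustration.

```python
def friend_paradox(edges):
    """Compare the mean degree of a node with the mean degree of a node
    reached by following a uniformly random edge end (= <k^2>/<k>)."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    mean_deg = sum(deg.values()) / len(deg)
    # A random edge end lands on a node with probability proportional
    # to its degree, so the expected degree is sum(k^2) / sum(k).
    mean_friend_deg = sum(d * d for d in deg.values()) / sum(deg.values())
    return mean_deg, mean_friend_deg

# A star: one hub connected to five leaves (hypothetical toy graph)
star = [(0, i) for i in range(1, 6)]
mu, mu_friend = friend_paradox(star)
```

on the star, the average node has degree 10/6 ≈ 1.67, but following a random edge end lands on a node of expected degree 3: the hub dominates, exactly the mechanism exploited by neighbor-based immunization.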
in this type of model each node is allowed to host zero or more individuals that are free to diffuse among the nodes constituting the network. in our analysis, as done in the previous section, we follow the hmf approach and consider all nodes of degree k to be statistically equivalent, writing all quantities in terms of the degree k. to start, let us define the average number of individuals in a node of degree k to be W_k = (1/N_k) Σ_{i|k_i = k} W_i, where N_k is the number of nodes with degree k and the sum runs over all nodes i in that degree class. the mean field dynamical equation describing the variation of the average number of individuals in a node of degree k is then:

d_t W_k = −p_k W_k + k Σ_{k′} P(k′|k) p_{k′k} W_{k′},    (4.5)

where p_k and p_{kk′} represent, respectively, the rate at which particles diffuse out of a node of degree k and the rate at which they diffuse from a node of degree k to one of degree k′. with these definitions, the meaning of each term of this equation becomes intuitively clear: the negative term represents individuals leaving the node, while the positive term accounts for individuals originating from other nodes arriving at this particular class of node. the conditional probability P(k′|k) encodes all the topological correlations of the network. by imposing that the total number of particles in the system remains constant, we obtain:

p_k = k Σ_{k′} P(k′|k) p_{kk′},    (4.6)

which simply states that the number of particles arriving at nodes of degree k′ coming from nodes of degree k must be the same as the number of particles leaving nodes of degree k. the probabilities p_k and p_{kk′} encode the details of the diffusion process [37]. in the simplest case, the rate of movement of individuals is independent of the degree of their origin, p_k = p for all values of the degree. furthermore, if moving individuals simply select homogeneously among all of their connections, then we have p_{kk′} = p/k. in this case, the diffusion process will reach a stationary state when:

W_k = (k/⟨k⟩) W̄,    (4.7)

where W̄ = W/N, W is the total number of walkers in the system, and N the total number of nodes. 
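the stationary state just described — node populations proportional to the degree — can be checked numerically by power-iterating the transition matrix of a uniform random walk on a small undirected graph; the graph below is an illustrative toy example, not data from the chapter.

```python
def stationary_walk(adj, iters=2000):
    """Power-iterate the uniform random-walk transition operator on an
    undirected graph (adjacency lists); the stationary occupation of a
    node is proportional to its degree, k_i / (2E)."""
    n = len(adj)
    w = [1.0 / n] * n  # walkers initially spread uniformly
    for _ in range(iters):
        nxt = [0.0] * n
        for u, neigh in enumerate(adj):
            share = w[u] / len(neigh)  # each neighbor gets an equal share
            for v in neigh:
                nxt[v] += share
        w = nxt
    return w

# Small non-bipartite graph: edges 0-1, 0-2, 0-3, 1-2 (degrees 3, 2, 2, 1)
adj = [[1, 2, 3], [0, 2], [0, 1], [0]]
w = stationary_walk(adj)
```

with 4 edges (2E = 8), the occupations converge to 3/8, 2/8, 2/8, 1/8 — exactly degree over 2E, the discrete analogue of W_k ∝ k.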
the simple linear relation between W_k and k serves as a strong reminder of the importance of network topology. nodes with higher degree will acquire larger populations of particles, while nodes with smaller degrees will have proportionally smaller populations. however, even in the steady state the diffusion process is ongoing: individuals are continuously arriving at and leaving any given node, but do so in a way that keeps the total number of particles in each node constant. in more realistic settings, the traffic of individuals between two nodes is a function of their degrees [37], scaling as w_0 (k k′)^θ, where θ modulates the strength of the diffusion flow between degree classes (empirical values are in the range −0.5 ≤ θ ≤ 0.5 [3]), w_0 is a constant, and T_k = w_0 ⟨k^{1+θ}⟩/⟨k⟩ provides the proper normalization ensured by the condition in eq. (4.6) [eq. (4.8)]. in these settings, the diffusion process reaches a stationary state when:

W_k = (k^{1+θ}/⟨k^{1+θ}⟩) W̄.

note that for θ = 0 this solution coincides with the case of homogeneous diffusion [eq. (4.7)]. combining this diffusion process with the (epidemic) reaction processes described above, we finally obtain the full reaction-diffusion process. to do so we must simply write eq. (4.5) for each state of the disease (e.g., susceptible, infectious, and recovered for a simple sir model) and couple the resulting equations using the already familiar epidemic equations. the full significance of eq. (4.7) now becomes clear: nodes with higher degree have higher populations and are visited by more travelers, making them significantly more likely to also receive an infected individual that can act as the seed of a local epidemic. 
in a metapopulation epidemic context we must then consider two separate thresholds: the basic reproductive ratio, R_0, which determines whether or not a disease can spread within one population (node), and a critical diffusion rate, p_c, which determines whether individual mobility is sufficiently large to allow the disease to spread from one population to another. it is clear that if p = 0 particles are completely unable to move from one population to another, so the epidemic cannot spread across subpopulations, while if p = 1 all individuals are in constant motion and the disease will inevitably spread to every subpopulation on the network, with a transition occurring at some critical value p_c. in general, the critical value p_c cannot be calculated analytically using our approach, as it depends non-trivially on the detailed structure of the network and on the fluctuations of the diffusion rate of single individuals. however, in the case of uncorrelated networks a closed solution can easily be found for different mobility patterns. in the case where the mobility is regulated by eq. (4.8), one finds that the critical value of p is inversely proportional to the degree heterogeneity of the network, so that broad-tailed networks have very low critical values. this simple fact explains why simply restricting travel between populations is a highly ineffective way to prevent the global spread of an epidemic. the mobility patterns considered so far are so-called markovian: individuals move without remembering where they have been, nor do they have a home to which they return after each trip. although this is a rough approximation of individual behavior, markovian diffusion patterns allow one to analytically describe the fundamental dynamical properties of many systems. recently, new analytic results have been proposed for non-markovian dynamics that include origin-destination matrices and realistic travel routes that follow shortest paths [38]. 
in particular, the threshold within such mobility schemes acquires a dependence on an exponent η, typically close to 1.5 in heterogeneous networks, which emerges from the shortest-path routing patterns [38]. interestingly, for values of θ ≤ 0.2 (fixing η = 1.5), p_c in the case of markovian mobility patterns is larger than the critical value in a system subject to non-markovian diffusion: the presence of origin-destination matrices and shortest-path mobility lowers the threshold, facilitating the global spreading of the disease. for values of θ > 0.2, instead, the contrary is true. in these models the internal contact rate is considered constant across each subpopulation. interestingly, recent longitudinal studies on phone networks [39] and twitter mention networks [40] point to the evidence that contacts instead scale super-linearly with the subpopulation sizes. considering the heterogeneity in population sizes observed in real metapopulation networks, this scaling behavior entails deep consequences for the spreading dynamics. a recent study generalized the metapopulation framework to account for these observations; interestingly, the critical mobility threshold, in the case of mobility patterns described by eq. (4.8), changes significantly, being lowered by such scaling features of human contacts [40]. despite their simplicity, metapopulation models are extremely powerful tools in the large-scale study of epidemics. they easily lend themselves to large-scale numerical stochastic simulations where the population and state of each node can be tracked and analyzed in great detail, and multiple scenarios as well as interventions can be tested. the state of the art in the class of metapopulation approaches is currently defined by the global epidemic and mobility model (gleam) [3, 4]. 
gleam integrates worldwide population estimates [41, 42] with complete airline transportation and commuting databases to create a worldwide description of mobility that can then be used as the substrate on which the epidemic spreads. gleam divides the globe into 3362 transportation basins. each basin is defined empirically around an airport, and the area of the basin is determined to be the region within which residents would likely use that airport for long-distance travel. each basin represents a major metropolitan area such as new york, london, or paris. information about all civilian flights can be obtained from the international air transportation association (iata) [43] and the official airline guide (oag) [44], which are responsible for compiling up-to-date databases of flight information that airlines use to plan their operations. by connecting the population basins with the direct flight information from these databases we obtain the network that acts as a substrate for the reaction-diffusion process. while most human mobility does not take place in the form of flights, the flight network provides the fundamental structure for long-range travel that explains how diseases such as sars [45], smallpox [46], or ebola [47] spread from country to country. to capture the finer details of within-country mobility, further information must be considered. gleam uses census information to create a commuting network at the basin level that connects neighboring metropolitan areas proportionally to the number of people who live in one area but work in the other. short-term, short-distance mobility such as commuting is fundamentally different from medium-term, long-distance airline travel: in one case the typical timescale is the working day (8 h), while in the other it is 1 day. this timescale difference is taken into account in gleam in an effective, mean-field manner instead of explicitly through a reaction process such as the one described above. 
this added layer is the final piece of the puzzle that brings the whole together and allows gleam to describe accurately both the spread from one country to the next and the spread happening within a given country [48]. in fig. 4.2 we illustrate the progression in terms of detail that we have undergone since our initial description of simple homogeneously mixed epidemic models in a single population. with all these ingredients in place we have a fine-grained description of mobility on a worldwide scale on top of which we can finally build an epidemic model. within each basin, gleam still uses the homogeneous mixing approximation. this assumption is particularly suited for diseases that spread easily from person to person through airborne means, such as ilis. gleam describes influenza through an seir model, as illustrated in fig. 4.3. seir models are a modification of the sir model described above that includes a further compartment, exposed, representing individuals who have contracted the disease but are not yet infectious. in gleam's implementation, a fraction of infectious individuals remains asymptomatic and travels normally; of the remaining symptomatic individuals, one half is sick enough to decide not to travel or commute while the remaining half continues to travel normally. despite their apparent complexity, large-scale models such as gleam are controlled by just a small number of parameters, and ultimately it is the proper setting of these few parameters that is responsible for the proper calibration of the model and the validity of the results obtained. most of the disease and mobility parameters are set directly from the literature or careful testing, so that as little as possible remains unknown when it is time to apply the model to a new outbreak. gleam was put to the test during the 2009 h1n1 pandemic with great success. during the course of the epidemic, researchers were able to use official data as it was released by health authorities around the world. 
in the early days of the outbreak there was great uncertainty about the correct value of r0 for the 2009/h1n1 pdm strain in circulation, so a methodology to determine it had to be conceived. one of the main advantages of epidemic metapopulation models is their computational tractability. it was this feature that proved invaluable when it came to determining the proper value of r0. by plugging in a given set of parameters one is able to generate several hundreds or thousands of in silico outbreaks. each outbreak contains information not only about the number of cases in each city or country as a function of time but also about the time when the first case occurs within a given country. in general, each outbreak will be different due to stochasticity, and by combining all outbreaks generated for a certain parameter set we can calculate the probability distribution of the arrival times. the number of times that an outbreak generated the seeding of a country, say the uk, on the same day as it occurred in reality provides us with a measure of how likely the parameter values used are. by multiplying this probability over all countries with a known arrival time we can determine the overall likelihood of the simulation: l = ∏_c p_c(t_c), where the product is taken over all countries c with known arrival time t_c, and the probability distribution of arrival times, p_c(t), is determined numerically for each set of input values. the set of parameters that maximizes this quantity is then the one whose values are the most likely to be correct. using this procedure the team behind gleam determined that the most likely value of the basic reproductive ratio was r0 = 1.75 [49], a value that was later confirmed by independent studies [50, 51]. armed with an empirical estimate of the basic reproductive ratio for an ongoing pandemic, they then proceeded to use this value to estimate the future progression of the pandemic.
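the arrival-time likelihood selection can be illustrated with a toy calculation: for each candidate r0 we pretend to have a handful of simulated arrival days per country, estimate p_c(t) empirically from those samples, and keep the candidate maximizing the product of probabilities (computed in log form for numerical stability). all numbers below are invented.

```python
import math
from collections import Counter

# observed seeding days (invented) and simulated arrival-day samples per
# candidate r0 value, standing in for the in-silico outbreak ensembles.
observed = {"uk": 3, "fr": 5}
candidates = {
    1.40: {"uk": [4, 5, 5, 6, 4], "fr": [6, 7, 7, 8, 6]},
    1.75: {"uk": [3, 3, 4, 3, 2], "fr": [5, 5, 6, 4, 5]},
}

def log_likelihood(sim_arrivals, obs):
    """log of the product over countries c of the empirical p_c(t_c)."""
    total = 0.0
    for country, day in obs.items():
        counts = Counter(sim_arrivals[country])
        p = counts[day] / sum(counts.values())
        if p == 0.0:                       # observed day never simulated
            return float("-inf")
        total += math.log(p)
    return total

best_r0 = max(candidates, key=lambda r0: log_likelihood(candidates[r0], observed))
```

in this toy setting the candidate r0 = 1.75 wins because its simulated ensembles place mass on the observed arrival days, mirroring (in miniature) the selection described in the text.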
their results, predicting that the peak of the pandemic would hit in october and november 2009, were published in early september 2009 [49]. a comparison between these predictions and the official data published by the health authorities in each country was published several years later [52], clearly confirming the validity of gleam for epidemic forecasting in real time. indeed, the model predicted, months in advance, the correct peak week in 87 % of countries in the northern hemisphere for which real data was accessible. in the remaining cases the maximum reported error was 2 weeks. gleam can also be further extended to include age structure [53], interventions, and travel reductions. the next logical step in the hierarchy of large scale epidemic models is to take the description of the underlying population all the way down to the individual level with what are known as agent-based models (abm). the fundamental idea behind this class of models is a deceptively simple one: treat each individual in the population separately, assigning it properties such as age, gender, workplace, residence, family structure, etc. these added details give abm a clear edge in terms of detail over metapopulation models, but at a much higher computational cost. the first step in building a model of this type is to generate a synthetic population that is statistically equivalent to the population we are interested in studying. typically this is done hierarchically, first generating individual households, aggregating households into neighborhoods, neighborhoods into communities, and communities into the census tracts that constitute the country. generating synthetic households in a way that reproduces the census data is far from a trivial task.
the exact details vary depending on the end goal of the model and the level of detail desired, but the household size, age, and gender of household members are determined stochastically from the empirically observed distributions and conditional probabilities. one might start by determining the size of the household by drawing from the distribution of household sizes of the country of interest and selecting the age and gender of the head of the household proportionally to the number of heads of households for that household size in each age group. conditional on this synthetic individual we can then generate the remaining members, if any, of the household. the required conditional probability distributions and correlation tables can be easily generated [54] from high quality census data that can be found for most countries in the world. this process is repeated until enough synthetic households have been generated. households are then aggregated into neighborhoods by selecting from the households according to the distribution of households in a specific neighborhood. neighborhoods are similarly aggregated into communities and communities into census tracts. each increasing level of aggregation (from household to country) represents a decrease in the level of social contact, with the most intimate contacts occurring at the household level and the least intimate ones at the census tract or country level. the next step is to assign to each individual a profession and work place. workplaces are generated following a procedure similar to the generation of households, and each employed individual is assigned a specific workplace. school-age children are assigned a school. working individuals are assigned to workplaces in a different community or census tract in a way that reflects empirical commuting patterns. at this point, we have a fairly accurate description of where the entire population of a city or country lives and works.
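a minimal version of this hierarchical sampling might look as follows. the distributions below stand in for the census-derived tables and are entirely made up; a real model would also condition member ages and genders on the head of household, as the text describes.

```python
import random

random.seed(7)

# illustrative, made-up distributions standing in for census tables:
# household size, head-of-household age conditional on size, and member age.
size_dist = {1: 0.28, 2: 0.34, 3: 0.16, 4: 0.14, 5: 0.08}
head_age_given_size = {
    1: {"18-34": 0.30, "35-64": 0.40, "65+": 0.30},
    2: {"18-34": 0.20, "35-64": 0.50, "65+": 0.30},
    3: {"18-34": 0.25, "35-64": 0.65, "65+": 0.10},
    4: {"18-34": 0.20, "35-64": 0.75, "65+": 0.05},
    5: {"18-34": 0.15, "35-64": 0.80, "65+": 0.05},
}
member_age = {"0-17": 0.45, "18-34": 0.25, "35-64": 0.20, "65+": 0.10}

def sample(dist):
    """draw a key from a dict of {value: probability}."""
    r, acc = random.random(), 0.0
    for value, p in dist.items():
        acc += p
        if r <= acc:
            return value
    return value  # guard against floating-point round-off

def synthetic_household():
    size = sample(size_dist)
    household = [{"role": "head", "age": sample(head_age_given_size[size])}]
    household += [{"role": "member", "age": sample(member_age)}
                  for _ in range(size - 1)]
    return household

households = [synthetic_household() for _ in range(1000)]
```

repeating the draw many times reproduces the input size distribution by construction, which is exactly the "statistically equivalent population" property the text asks for.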
it is then not entirely surprising that this approach was first used to study in detail the demands imposed on the transportation system of a large metropolitan city. transims, the transportation analysis and simulation system [55], used an approach similar to the one described above to generate a synthetic population for the city of portland, oregon (or), and coupled it with a route planner that would determine the actual route taken by each individual on her way to work or school, as a way of modeling the daily toll on portland's transportation infrastructure and the effect that disruptions or modifications might have on the daily lives of its population. episims [5] was the logical extension of transims to the epidemic world. episims used the transims infrastructure to generate the contact network between individuals in portland, or. susceptible individuals are able to acquire the infection whenever they are in a location along with one or more infectious individuals. in this way the researchers are able to observe the disease as it spreads through the population and to evaluate the effect of measures such as contact tracing and mass vaccination. more recent approaches have significantly simplified the mobility aspect of this kind of model and simply divide each 24 h period into daytime and nighttime. individuals are considered to be in contact with other members of their workplace during the day and with other household members during the night. in recent years, modelers have successfully expanded the large scale agent-based approach to the country [6] and even the continent level [56]. as the spatial scale of the models increased, further modes of long-range transportation such as flights had to be considered. these are important not only to determine the seeding of the country under consideration through importation of cases from another country but also to connect distant regions in a more realistic way.
flute [6] is currently the most realistic large scale agent-based epidemic model of the continental united states. it considers that international seeding occurs at random in the locations that host the 15 largest international airports in the us by, each day, randomly infecting in each location a number of individuals that is proportional to the international traffic of those airports. flute is a refinement of a previous model [57], and it further refines the modeling of the infectious process by varying the infectiousness of an individual over time in the sir model that it considers. at the time of infection each individual is assigned one of six experimentally obtained viral load histories. each history prescribes the individual's viral load for each day of the infectious period, and the infectiousness is considered to be proportional to the viral load. individuals may remain asymptomatic for up to 3 days after infection, during which their infectiousness is reduced by 50 % with respect to the symptomatic period. the total infectious period is set to 6 days regardless of the length of the symptomatic period. given the complexity of the model, the calibration of the disease parameters in order to obtain a given value of the basic reproductive ratio r0 requires some finesse. chao et al. [6] use the definition of r0 to determine its value "experimentally" from the input parameters. they numerically simulate 1000 instances of the epidemic caused by a single individual within a 2000 person fully susceptible community for each possible age group of the seeding individual and use these to calculate the r0^a of each age group a. the final r0 is defined as the average of the various r0^a weighted by the age-dependent attack rate [57]. the final result of this procedure is an empirical relation giving the value of r0 as a function of the infection probability per unit contact, which is supplied as input.
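the "experimental" measurement of r0 can be mimicked with a toy simulation in the same spirit: seed one infectious individual, count the secondary infections it causes over a 6-day infectious period, average over many runs per age group, and combine the group values with attack-rate weights. the contact rates, transmission probability, and weights below are invented, and this sketch deliberately ignores depletion of susceptibles within the 2000-person community and the viral-load histories flute actually uses.

```python
import random

random.seed(1)

def secondary_infections(contacts_per_day, p_transmit, infectious_days=6):
    """infections caused directly by one seed over its infectious period."""
    trials = contacts_per_day * infectious_days
    return sum(1 for _ in range(trials) if random.random() < p_transmit)

def r0_for_group(contacts, p, runs=1000):
    """monte-carlo estimate of r0 for one age group of the seed."""
    return sum(secondary_infections(contacts, p) for _ in range(runs)) / runs

groups = {"child": (12, 0.03), "adult": (8, 0.03)}  # contacts/day, p/contact
attack_weight = {"child": 0.6, "adult": 0.4}        # made-up attack rates

r0_by_group = {g: r0_for_group(c, p) for g, (c, p) in groups.items()}
r0 = sum(attack_weight[g] * r0_by_group[g] for g in groups)
```

with these invented inputs the expected r0 is 0.6·(12·6·0.03) + 0.4·(8·6·0.03) ≈ 1.87, and the monte-carlo estimate should land close to that.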
flute was a pioneer in the way it completely released its source code, opening the doors to a new level of verifiability in this area. it has been successfully used to study the spread of influenza viruses and to analyze the effect of various interventions in los angeles county [58] and at the united states country level [6]. the unprecedented amount of data on human dynamics made available by recent advances in technology has allowed the development of realistic epidemic models able to capture and predict the unfolding of infectious disease at different geographical scales [59]. in the previous sections, we described briefly some successful examples that have been made possible thanks to high resolution data on where we live, how we live, and how we move. data availability has started a second golden age in epidemic modeling [60]. all models are judged against surveillance data collected by health departments. unfortunately, due to excessive costs and other constraints, their quality is far from ideal. for example, the influenza surveillance network in the usa, one of the most efficient systems in the world, consists of just 2900 providers that operate voluntarily. surveillance data is imprecise, incomplete, characterized by large backlogs and delays in reporting times, and the result of very small sample sizes. furthermore, the geographical coverage is not homogeneous across different regions, even within the same country. for these reasons, calibrating and testing epidemic models with surveillance data imposes strong limitations on the predictive capabilities of such tools. one of the most limiting issues is the geographical granularity of the data. in general, information is aggregated at the country or regional level. the lack of ground truth data at smaller scales does not allow a more precise selection and training of realistic epidemic models. how can we lift such limitations? data, data and more data is again the answer.
at the end of 2013 almost 3 billion people had access to the internet, while almost 7 billion were phone subscribers, around 20 % of whom were actively using smartphones. the explosion of mobile usage also boosted the activity of social media platforms such as facebook, twitter, google+, etc., which now count several hundred million active users that are happy to share not just their thoughts, but also their gps coordinates. the incredible amount of information we create and access contains important epidemiologically relevant indicators. users complaining about catching a cold before the weekend on facebook or twitter, searching for symptoms of particular diseases on search engines or wikipedia, or canceling their dinner reservations on online platforms like opentable are just a few examples. an intense research activity, across different disciplines, is clearly showing the potential, as well as the challenges and risks, of such digital traces for epidemiology [61]. we are at the dawn of the digital revolution in epidemiology [7, 8]. the new approach allows for the early detection of disease outbreaks [62], the real time monitoring of the evolution of a disease with an incredible geographical granularity [63-65], access to health related behaviors, practices and sentiments at large scales [66, 67], the informing of data-driven epidemic models [68, 69], and the development of statistical models with predictive power [67, 70-78]. the search for epidemiological indicators in digital traces follows two methodologies: active and passive. in active data collection users are asked to share their health status using apps and web-based platforms [79]. examples are influenzanet, which is available in different european countries [64], and flu near you in the usa [65], which engage tens of thousands of users that together provide the information necessary for the creation of interactive maps of ili in almost real time.
in passive data collection, instead, information about individuals' health status is mined from other available sources that do not require the active participation of users. news articles [63], queries on search engines [74], posts on online social networks [67, 70-73], page view counts on wikipedia [75, 76], or other online/offline behaviors [77, 78] are typical examples. in the following, we focus on the prototypical, and most famous, method of digital epidemiology, google flu trends (gft) [80], while considering also other approaches based on twitter and wikipedia data. gft is by far the most famous model in digital epidemiology. launched in november 2008 together with a nature paper [80] describing its methodology, it has continuously made predictions on the course of seasonal influenza in 29 countries around the world. the method used by gft is extremely simple. the percentage of ili visits, a typical indicator used by surveillance systems to monitor the unfolding of the seasonal flu, is estimated with a linear model based on search engine queries. this approach is general and used in many different fields of science. a quantity of interest, in this case the percentage of ili visits p, is estimated using a correlated signal, in this case the fraction of ili related queries q, that acts as a surrogate. the fit allows the estimation of p as a function of the value of q: logit(p) = β0 + β1 logit(q) + ε, (4.14) where logit(x) = ln(x/(1 − x)), β0 and β1 are fitting parameters, and ε is an error term. as is clear from the expression, gft is a simple linear fit, where the unknown parameters are determined considering historical data. the innovation of the system lies in the definition of q, which is evaluated using hundreds of billions of searches on google. indeed, gft scans all the queries we submit to google, without using information about users' identity, in search of those that are ili related.
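equation (4.14) amounts to an ordinary least-squares fit in logit space. the sketch below fits β0 and β1 on a short synthetic (q, p) series and maps a new query fraction back through the inverse logit; the data points are fabricated purely for illustration and are not gft's data.

```python
import math

def logit(x):
    return math.log(x / (1.0 - x))

# synthetic weekly data: q = ili-related query fraction, p = ili-visit fraction
q = [0.010, 0.015, 0.020, 0.030, 0.040]
p = [0.008, 0.013, 0.019, 0.027, 0.038]

# ordinary least squares of logit(p) on logit(q), as in eq. (4.14)
x = [logit(v) for v in q]
y = [logit(v) for v in p]
n = len(x)
mx, my = sum(x) / n, sum(y) / n
b1 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
      / sum((xi - mx) ** 2 for xi in x))
b0 = my - b1 * mx

def predict(q_new):
    """map a query fraction to an estimated ili-visit fraction."""
    z = b0 + b1 * logit(q_new)
    return 1.0 / (1.0 + math.exp(-z))   # inverse logit
```

the inverse-logit step guarantees the prediction stays inside (0, 1), one practical reason for fitting in logit space rather than on the raw fractions.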
this is the paradigm of passive data collection in digital epidemiology. in the original model the authors measured the correlation of 50 million search queries with historic cdc data, finding that 45 of them were enough to ensure the best correlation between the number of searches and the number of ili cases. the identity of such terms has been kept secret in order to avoid changes in users' behavior. however, the authors provided a list of topics associated with each one of them: 11 were associated with influenza complications, 8 with cold/flu remedies, 5 with general terms for influenza, etc. although the search for the terms was performed without prior information, none of the most representative terms was unrelated to the disease. in these settings gft showed a mean correlation of 0.97 with real data and was able to predict the surveillance value 1-2 weeks in advance. gft is based on proprietary data that, owing to many different constraints, cannot be shared with the research community. other data sources, different in nature, are instead easily accessible. twitter and wikipedia are two examples. indeed, both systems are available for download, with some limitations, through their respective apis. the models based on twitter are built within the same paradigm as gft [67, 71-73, 81]. tweets are mined in search of ili-related tweets, or of other health conditions such as insomnia, obesity, and other chronic diseases [67, 82], that are used to inform regression models. such tweets are identified either as done in gft, or through more involved methods based on support vector machines (svm) or other machine learning methods that, given an annotated corpus, find disease related tweets beyond simple keyword matches [67, 71-73, 81]. the presence of gps information or other self-reported geographical data allows the models to probe different granularities, ranging from countries [67, 71, 73, 81] to cities [72].
while models based on twitter analyze users' posts, those based on wikipedia focus on page views [75, 76]. the basic intuition is that wikipedia is used to learn more about a disease or a medication. moreover, the website is so popular that it is most likely among the first results of search queries on most search engines. the methods proposed so far monitor a set of pages related to the disease under study. examples are influenza, cold, fever, dengue, etc. page views at a daily or weekly basis are then used as surrogates in linear fitting models. interestingly, the correlation with surveillance data ranges from 0.02 in the case of ebola to 0.99 for ilis [75, 76], and allows accurate predictions up to 2 weeks ahead. one important limitation of wikipedia based methods is the lack of geographical granularity. indeed, the view counts are reported irrespective of readers' location, but the language of the page can be used as a rough proxy for location. such an approximation might be extremely good for localized languages like italian, but it poses strong limitations in the case of global languages like english. indeed, it is reported that 51 % of page views of english pages originate in the usa, 11 % in the uk, and the rest in australia, canada and other countries [76]. besides, without making further approximations such methods cannot provide indications at scales smaller than the country level. despite these impressive correlations, especially in the case of ilis, much still remains to be done. gft offers a particularly clear example of the possible limitations of such tools. indeed, despite the initial success, it completely failed to forecast the 2009 h1n1 pandemic [61, 83]. the model was updated in september 2009 to increase the number of terms to 160, including the 40 terms present in the original version. nevertheless, gft ran high, overestimating ili activity in 100 out of 108 weeks starting in the 2011-2012 season.
in 2013 gft predicted a peak height more than double the actual value, causing the underlying model to be modified again later that year. what are the reasons underlying the limitations of gft and other similar tools? by construction, gft relies just on simple correlations, causing it to detect not only the flu but also things that correlate strongly with the flu, such as winter patterns. this is likely one of the reasons why the model was not able to capture the unfolding of an off-season pandemic such as the 2009 h1n1 pandemic. also, changes in the google search engine, which can inadvertently modify users' behavior, were not taken into account in gft. this factor alone possibly explains the large overestimation of the peak height in 2013. moreover, simple autoregressive models using just cdc data can perform as well as or better than gft [84]. the parable of gft clearly shows both the potential and the risks of digital tools for epidemic predictions. the limitations of gft can possibly affect all similar approaches based on digital passive data collection. in particular, the use of simple correlation measures does not guarantee the ability to capture the phenomenon at spatial and temporal scales different from those used in the training. not to mention that correlations might be completely spurious. in a recent study, for example, a linear model based on twitter informed simply with the timeline of the term zombie was shown to be a good predictor of the seasonal flu [71]. despite such observations, the potential of these models is invaluable for probing quantities that cannot be predicted by simple autoregressive models. for example, flu activity at high geographical granularity, although very important, is measured with great difficulty by the surveillance systems.
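such an autoregressive baseline is easy to state: regress this week's surveillance value on a lagged value of the same series. the sketch below fits a lag-1 model by least squares on a short synthetic ili series; it is a toy stand-in for the cdc-data models of [84], not their actual specification.

```python
# synthetic weekly ili percentages over one season (rise then fall)
ili = [1.2, 1.4, 1.9, 2.6, 3.3, 3.8, 3.6, 3.0, 2.2, 1.6, 1.3, 1.1]

def ar_fit(series, lag=1):
    """least-squares fit of x_t = a + b * x_{t-lag}."""
    x = series[:-lag]
    y = series[lag:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx
    return a, b

a, b = ar_fit(ili)
next_week = a + b * ili[-1]   # one-step-ahead forecast from the last value
```

the point of the comparison in the text is that even a baseline this simple, using nothing but the surveillance series itself, sets the bar any digital-trace model must clear.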
gft and other spatially resolved tools can effectively access these local indicators and provide valuable estimates that can be used as a complement to surveillance and as input for generating epidemic models [49, 68]. the field of epidemiology is currently undergoing a digital revolution due to the seemingly endless availability of data and computational power. data on human behavior is allowing for the development of new tools and models, while the commoditization of computing resources once available only to world leading research institutions is making highly detailed large scale numerical approaches feasible at last. in this chapter, we present a brief review not only of the fundamental mathematical tools and concepts of epidemiology but also of some of the state-of-the-art computational approaches aimed at describing, modeling, and forecasting the diffusion of viruses. our focus was on the developments occurring over the past decade that are sure to form the foundation for developments in decades to come.
- essai d'une nouvelle analyse de la mortalité causée par la petite vérole et des avantages de l'inoculation pour la prévenir.
mémoires de mathematique physique de l
- infectious diseases in humans
- the role of the airline transportation network in the prediction and predictability of global epidemics
- modeling the spatial spread of infectious diseases: the global epidemic and mobility computational model
- modelling disease outbreaks in realistic urban social networks
- flute, a publicly available stochastic influenza epidemic simulation model
- digital disease detection: harnessing the web for public health surveillance
- digital epidemiology
- the mathematical theory of infectious diseases
- modeling infectious diseases in humans and animals
- a contribution to the mathematical theory of epidemics
- networks: an introduction
- scale-free networks
- complex networks: structure, robustness and function
- dynamical processes on complex networks
- computational social science
- linked: how everything is connected to everything else and what it means
- temporal networks
- telling tails explain the discrepancy in sexual partner reports
- simulated epidemics in an empirical spatiotemporal network of 50,185 sexual contacts
- what's in a crowd?
analysis of face-to-face behavioral networks
- activity driven modeling of dynamic networks
- time varying networks and the weakness of strong ties
- epidemic spreading in non-markovian time-varying networks
- thresholds for epidemic spreading in networks
- modeling dynamical processes in complex socio-technical systems
- epidemic spreading in correlated complex networks
- spread of epidemic disease on networks
- correlations in weighted networks
- efficient immunization strategies for computer networks and populations
- using friends as sensors to detect global-scale contagious outbreaks
- controlling contagion processes in activity driven networks
- some demographic and genetic consequences of environmental heterogeneity for biological control
- population structure and the spread of disease
- a structured epidemic model incorporating geographic mobility among regions
- reaction-diffusion equations and their applications to biology
- epidemic modeling in metapopulation systems with heterogeneous coupling pattern: theory and simulations
- modeling human mobility responses to the large-scale spreading of infectious diseases
- the scaling of human interactions with city size
- the scaling of human contacts in reaction-diffusion processes on heterogeneous metapopulation networks
- the gridded population of the world version 3 (gpwv3): population grids. palisades
- global rural-urban mapping project (grump), alpha version: population grids.
palisades
- predictability and epidemic pathways in global outbreaks of infectious diseases: the sars case study
- human mobility and the worldwide impact of intentional localized highly pathogenic virus release
- assessing the international spreading risk associated with the 2014 west african ebola outbreak
- multiscale mobility networks and the spatial spreading of infectious diseases
- seasonal transmission potential and activity peaks of the new influenza a (h1n1): a monte carlo likelihood analysis based on human mobility
- the transmissibility and control of pandemic influenza a (h1n1) virus
- pandemic potential of a strain of influenza a (h1n1): early findings
- real-time numerical forecast of global epidemic spreading: case study of 2009 a/h1n1pdm
- comparing large-scale computational approaches to epidemic modeling: agent-based versus structured metapopulation models
- creating synthetic baseline populations
- transims (transportation analysis simulation system)
- determinants of the spatiotemporal dynamics of the 2009 h1n1 pandemic in europe: implications for real-time modelling
- mitigation strategies for pandemic influenza in the united states
- planning for the control of pandemic influenza a (h1n1) in los angeles county and the united states
- predicting the behavior of techno-social systems
- epidemic processes in complex networks.
arxiv preprint
- the parable of google flu: traps in big data analysis
- global capacity for emerging infectious disease detection
- web-based participatory surveillance of infectious diseases: the influenzanet participatory surveillance experience
- assessing vaccination sentiments with online social media: implications for infectious disease dynamics and control
- you are what you tweet: analyzing twitter for public health
- forecasting seasonal outbreaks of influenza
- forecasting seasonal influenza with stochastic microsimulation models assimilating digital surveillance data
- the use of twitter to track levels of disease activity and public concern in the us during the influenza a h1n1 pandemic
- validating models for disease detection using twitter
- national and local influenza surveillance through twitter: an analysis of the 2012-2013 influenza epidemic
- towards detecting influenza epidemics by analyzing twitter messages
- detecting influenza epidemics using search engine query data
- detecting epidemics using wikipedia article views: a demonstration of feasibility with language as location proxy
- wikipedia usage estimates prevalence of influenza-like illness in the united states in near real-time
- guess who is not coming to dinner? evaluating online restaurant reservations for disease surveillance
- satellite imagery analysis: what can hospital parking lots tell us about a disease outbreak?
- public health for the people: participatory infectious disease surveillance in the digital age
- google flu trends
- using twitter to estimate h1n1 influenza activity
- a content analysis of chronic diseases social groups on facebook and twitter. telemedicine and e-health
- reassessing google flu trends data for detection of seasonal and pandemic influenza: a comparative epidemiological study at three geographic scales
- predicting consumer behavior with web search

acknowledgements: bg was partially supported by the french anr project harms-flu (anr-12-monu-0018).
key: cord-028636-wxack9zv authors: hachicha, a.; hachicha, f. title: analysis of the bitcoin stock market indexes using comparative study of two models sv with mcmc algorithm date: 2020-07-06 journal: rev quant finan acc doi: 10.1007/s11156-020-00905-w sha: doc_id: 28636 cord_uid: wxack9zv the purpose of this article is to find a better technique for estimating the volatility of the price of bitcoin, on the one hand, and to check whether this special kind of asset called cryptocurrency behaves like other stock market indices, on the other. we include five stock market indexes for different countries: the standard and poor's 500 composite index (s&p), nasdaq, nikkei, stoxx, and dow jones, using daily data over the period 2010–2019. we examine two asymmetric stochastic volatility models used to describe the volatility dependencies found in most financial returns. two models are compared: the first is the autoregressive stochastic volatility model with student's t-distribution (arsv-t), and the second is the basic svol. to estimate these models, our analysis is based on the markov chain monte-carlo method; the techniques used are metropolis–hastings (hastings in biometrika 57:97–109, 1970) and the gibbs sampler (casella and george in am stat 46:167–174, 1992; gelfand and smith in j am stat assoc 85:398–409, 1990; gilks and wild in 41:337–348, 1992). model comparisons illustrate that the arsv-t model delivers better performance: it outperforms the svol model on the mse and aic criteria. this result holds for bitcoin as well as for the other stock market indices. moreover, our findings demonstrate the efficiency of the markov chain for our sample, with convergence and stability of all parameters. on the whole, it seems that permanent shocks have an effect on the volatility of the price of bitcoin and also on the other stock markets.
our results will help investors better diversify their portfolios by adding this cryptocurrency. in the last decades, a new type of currency has been launched on the financial market and has gained importance. bitcoin is a special kind of asset called cryptocurrency. it was designed by satoshi nakamoto (allegedly a pseudonym of one person or a group of people) to work as a medium of exchange (nakamoto 2009). since its introduction, it has been gaining more attention from the media, the finance industry, and academics. contrary to ''traditional'' fiat currencies, bitcoin does not rely on any central authority but uses cryptography to control its creation and management. some businesses have already begun accepting bitcoins in addition to national currencies (williamson 2018). however, the legal status of bitcoin varies substantially from country to country. there are several reasons for this interest: first, japan and south korea have recognized bitcoin as a legal method of payment (bloomberg 2017a; cointelegraph 2017). second, some central banks are exploring the use of cryptocurrencies (bloomberg 2017b). third, the enterprise ethereum alliance was created by a large number of companies and banks to make use of cryptocurrencies and the related technology called blockchain (forbes 2017). bitcoin (btc) is based on decentralisation, which means that it is controlled and owned by its users. this decentralization is often criticized due to the lack of control over the whole system. despite this criticism, bitcoin increased in value from a couple of cents at its inception (2009) to about 20,000 us dollars at the end of 2017. in china, cryptocurrency trade was banned in october 2017. with bitcoin's increasing popularity, understanding how its prices are correlated with other financial assets is of interest to investors, regulators and policymakers. as a peer-to-peer crypto-currency, bitcoin holds the promise of being free from central banks' and governments' interventions.
a decade after its inception following satoshi nakamoto's (2008) vision, its price movements are far from being tamed. this study contributes to the existing literature on the empirical characteristics of virtual currency by allowing for a dynamic transition between different economic regimes and considering the various crashes and rallies over the business cycle, which are captured by jumps. stochastic volatility (sv) models are workhorses for the modeling and prediction of time-varying volatility on financial markets, and are essential tools in risk management, asset pricing and asset allocation. in financial mathematics and financial economics, stochastic volatility is typically modeled in a continuous-time setting, which is advantageous for derivative pricing and portfolio optimization. nevertheless, since data are typically only observable at discrete points in time, discrete-time formulations of sv models are equally important in empirical applications. volatility plays an important role in determining the overall risk of a portfolio and in identifying hedging strategies that make the portfolio neutral with respect to market moves. moreover, volatility forecasting is also crucial in derivatives trading. recently, sv models allowing the mean level of volatility to 'jump' have been used in the literature; see chang et al. (2007) and chib et al. (2002). the volatility of financial markets is a subject of constant analysis: movements in the price of financial assets directly affect the wealth of individuals, companies, charities, and other corporate bodies. determining whether there are any patterns in the size and frequency of such movements, or in their cause and effect, is critical in devising strategies for investment at the micro level and monetary stability at the macro level.
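a standard discrete-time sv specification of the kind discussed here takes the log-variance h_t as a gaussian ar(1) process and the return as gaussian noise scaled by exp(h_t/2). the snippet below simulates such a series; the parameter values are illustrative only and this is a generic svol sketch, not the model estimated in the paper.

```python
import math
import random

random.seed(0)

# svol-style simulation: h_t = mu + phi*(h_{t-1} - mu) + eta_t,
# y_t = exp(h_t / 2) * eps_t, with eta and eps gaussian.
mu, phi, sigma_eta = -1.0, 0.95, 0.2   # illustrative parameters

h = mu
vols, returns = [], []
for _ in range(5000):
    h = mu + phi * (h - mu) + random.gauss(0.0, sigma_eta)
    vol = math.exp(h / 2.0)            # conditional standard deviation
    vols.append(vol)
    returns.append(random.gauss(0.0, vol))
```

the persistence parameter phi close to 1 is what produces the volatility clustering (long quiet stretches punctuated by turbulent ones) typical of financial returns, including bitcoin's.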
shephard and pitt (1997) used improved and efficient markov chain monte-carlo (mcmc) methods to estimate the volatility process ''in block'' rather than one point at a time, as highlighted by jacquier et al. (1994) for a simple sv model. furthermore, hsu and chiao (2010) analyzed the time patterns of individual analysts' relative accuracy rankings in earnings forecasts using a markov chain model treating two levels of stochastic persistence. least squares and maximum likelihood techniques have long been used in parameter estimation problems. however, those techniques provide only point estimates with unknown or approximate uncertainty information. bayesian inference coupled with the gibbs sampler is an approach to parameter estimation that exploits modern computing technology. kliber et al. (2019) applied a stochastic volatility model with dynamic conditional correlation between main stock indices and the bitcoin price in us dollars (bitfinex exchange). baur and lucey (2010) verified whether gold can be treated as a safe haven asset, estimating a regression model for the returns of gold in which the explanatory variables were the returns of bonds and stocks, as well as of bitcoin. it was also assumed that the error term of the regression model follows an asymmetric garch model. together with the development of the market for cryptocurrencies, many researchers, like smith (2018) and kim (2018), started a debate over the possible role of the new asset from the investment perspective. one research hypothesis assumed that it can be treated as an alternative to gold by some investors. as noted by pieters and vivanco (2017), although bitcoin is indeed a homogeneous and identical virtual good across all online markets on which it is traded, its prices behave differently across these markets. matkovskyy (2019) compared the euro, u.s.
dollar, and british pound sterling (gbp) with centralized and decentralized bitcoin cryptocurrency markets in terms of return volatility and interdependency. his results demonstrate that the markets differ, for instance in terms of volatility, which tends to be higher in the decentralized markets. centralized markets have higher tail dependence regarding returns. our motivation for this research was to analyze the volatility of bitcoin using mcmc and to detect whether the bitcoin volatility process converges and behaves like those of the other indices. to our knowledge, no study has investigated bitcoin's stochastic volatility using metropolis-hastings and then carried out a comparison with other stock indices for the different international markets. this paper is organized as follows: sect. 2 presents the bayesian approach and the mcmc algorithms. the sv model is introduced in sect. 3, whereas empirical illustrations are given in sect. 4. in the classical methodology we assume that there is a set of unknown but fixed parameters. alternatively, in the bayesian approach, the parameters are considered as random variables with given prior distributions. we then use the observations (through the likelihood) to update these distributions and obtain the posterior distributions. it would seem that, to be objective and to use the observations as much as possible, one should use non-informative priors. however, this sometimes creates degeneracy issues, and one should choose a different prior for this reason. markov chain monte-carlo (mcmc) includes the gibbs sampler as well as the metropolis-hastings (mh) algorithm. bayes' rule gives p(θ|x) ∝ p(x|θ) p(θ). the metropolis-hastings algorithm is the baseline for mcmc schemes that simulate a markov chain θ^(t) with the posterior p(θ|x) as the stationary distribution of a parameter θ given a stock price index x. for example, we can define θ_1, θ_2 and θ_3 such that θ = (θ_1, θ_2, θ_3), where each θ_i can be a scalar, vector or matrix.
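the metropolis-hastings scheme described above can be sketched in a few lines of code. this is a minimal illustration, not the estimation code used in the paper: the standard-normal target density and the random-walk proposal scale are assumptions chosen purely for demonstration.

```python
import math
import random

def metropolis_hastings(log_target, n_samples, x0=0.0, step=1.0, seed=42):
    """Random-walk Metropolis-Hastings: simulates a chain whose stationary
    distribution is proportional to exp(log_target(x))."""
    rng = random.Random(seed)
    chain, x, accepted = [], x0, 0
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)            # symmetric proposal
        log_ratio = log_target(proposal) - log_target(x)
        # accept with probability min(1, target(proposal)/target(x))
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x, accepted = proposal, accepted + 1
        chain.append(x)
    return chain, accepted / n_samples

# illustrative target: an (unnormalized) standard normal posterior
log_post = lambda x: -0.5 * x * x
chain, acc_rate = metropolis_hastings(log_post, 5000)
```

with a symmetric proposal the hastings correction cancels, so the acceptance probability reduces to the ratio of target densities, which is why only `log_target` appears in the acceptance step.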
markov chain monte-carlo algorithms are iterative, so at iteration t we sample in turn from the three conditional distributions. first, we update θ_i to a proposal θ*_i with probability min(1, r_t), where r_t is the metropolis-hastings ratio; otherwise we let θ_i^(t) = θ_i^(t−1). a popular and more efficient method is the acceptance-rejection (a-r) m-h sampling method, which is available whenever the target densities are bounded by a density from which it is easy to sample. the gibbs sampler (casella and george 1992; gelfand and smith 1990; gilks and wild 1992) is the special m-h algorithm whereby the proposal density for updating θ_j equals the full conditional p(θ*_j | θ_{−j}, y), so that proposals are accepted with probability 1. the gibbs sampler involves parameter-by-parameter or block-by-block updating, which when completed defines the transition from θ^(t) to θ^(t+1). repeated sampling from m-h samplers such as the gibbs sampler generates an autocorrelated sequence of numbers that, subject to regularity conditions (ergodicity, etc.), eventually 'forgets' the starting values θ^(0) = (θ_1^(0), θ_2^(0), …, θ_d^(0)) used to initialize the chain, and converges to the stationary sampling distribution p(θ|y). in practice, gibbs and m-h algorithms are often combined, which results in a ''hybrid'' mcmc procedure. in this paper, we consider the p-th order arsv-t model, arsv(p)-t, in which the log volatility v_t is assumed to follow a stationary ar(p) process with persistence parameter |φ| < 1, and the innovation ν_t is independent of (ε_t, η_t). by this specification, the conditional error distribution ξ_t follows the standardized t-distribution with mean zero and variance one. since κ_t is independent of (ε_t, η_t), the correlation coefficient between ξ_t and η_t is also ρ.
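as a concrete illustration of the gibbs updating scheme described above, the sketch below alternates draws from the two full conditionals of a bivariate normal with correlation ρ, for which each conditional is available in closed form. the target distribution is an assumption chosen for demonstration, not a model from the paper.

```python
import math
import random

def gibbs_bivariate_normal(rho, n_samples, seed=7):
    """Gibbs sampler for a zero-mean bivariate normal with unit variances
    and correlation rho: each full conditional is N(rho * other, 1 - rho^2),
    so every proposal is accepted with probability 1."""
    rng = random.Random(seed)
    sd = math.sqrt(1.0 - rho * rho)
    x = y = 0.0                        # starting values theta^(0)
    draws = []
    for _ in range(n_samples):
        x = rng.gauss(rho * y, sd)     # draw x | y from its full conditional
        y = rng.gauss(rho * x, sd)     # draw y | x from its full conditional
        draws.append((x, y))
    return draws

draws = gibbs_bivariate_normal(rho=0.8, n_samples=5000)
```

despite the autocorrelation of the chain, the empirical correlation of the draws converges to the target ρ, illustrating the 'forgetting' of the starting values discussed above.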
if α ∼ n(0, 100), then the conditional posterior distribution of the volatility follows. the representation of the sv-t model in terms of a scale mixture is particularly useful in an mcmc context, since it allows turning a non-log-concave sampling problem into a log-concave one. this permits sampling algorithms that guarantee convergence in finite time (see frieze et al. 1994). allowing log returns to be student-t-distributed naturally changes the behavior of the stochastic volatility process: in the standard sv model, a large value of |y_t| induces a large value of v_t. jacquier et al. (1994), hereafter jpr, introduced the markov chain technique (mcmc) for the estimation of the basic svol model with normally distributed conditional errors. let θ be the vector of parameters of the basic svol(α, δ, σ_v), and v = (v_t) the vector of latent volatilities; the basic svol specifies zero correlation between the errors of the mean and variance equations. briefly, the hammersley-clifford theorem states that, having a parameter set θ, a state v and an observation y, we can obtain the joint distribution p(θ, v|y) from p(θ|v, y) and p(v|θ, y), under some mild regularity conditions. therefore, by applying the theorem iteratively, we can break a complicated multidimensional estimation problem into many simple one-dimensional problems. creating a markov chain θ^(i) via a monte carlo process, the ergodic averaging theorem states that the time-average of a parameter will converge towards its posterior mean. bayes' formula factorizes the posterior distribution into the likelihood function and the prior hypotheses, where α is the intercept, δ the volatility persistence and σ_v the standard deviation of the shock to log v_t. we use a normal-gamma prior, so the parameters α and δ are normal and σ_v^2 follows an inverted gamma; the conditional posterior for v is obtained accordingly. bitcoin (btc) daily price data from january 2010 to october 2019 was retrieved from the coindesk price index (2017).
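to make the basic svol(α, δ, σ_v) structure concrete, the sketch below simulates the latent log-volatility as a stationary ar(1) process and generates returns from it. the parameter values are illustrative assumptions, not the paper's estimates.

```python
import math
import random

def simulate_svol(alpha, delta, sigma_v, n, seed=11):
    """Basic SVOL: log-volatility v_t = alpha + delta * v_{t-1} + sigma_v * eta_t,
    returns y_t = exp(v_t / 2) * eps_t, with independent N(0, 1) shocks
    (zero correlation between mean and variance errors)."""
    rng = random.Random(seed)
    v = alpha / (1.0 - delta)          # start at the stationary mean of v_t
    vs, ys = [], []
    for _ in range(n):
        v = alpha + delta * v + sigma_v * rng.gauss(0.0, 1.0)
        y = math.exp(v / 2.0) * rng.gauss(0.0, 1.0)
        vs.append(v)
        ys.append(y)
    return vs, ys

vs, ys = simulate_svol(alpha=-0.4, delta=0.95, sigma_v=0.2, n=5000)
```

with |δ| < 1 the log-volatility fluctuates around its stationary mean α/(1 − δ), which is why the stationarity restriction on the persistence parameter matters for the prior discussed later.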
like the majority of researchers, we used the coindesk price index (dyhrberg 2016; bouri et al. 2018). it lists the average price of bitcoin against the us dollar from leading trading platforms around the globe. thus, the movements of the index are driven by investors from all over the world, and we cannot indicate a country of origin. five indices, including the s&p 500 (gspc), nikkei (n225), nasdaq, dow jones, and euro index (stoxx), are chosen to represent different regions whose currencies top the share of bitcoin trade. bitcoin trading against the chinese yuan accounted for most of bitcoin's trading volume until china started to clamp down on digital currency exchanges in early 2017, eventually banning the trading of bitcoin in september of 2017. japan's yen then took over the largest trading volume after japanese regulators adopted digital-currency-friendly rules. the us dollar and euro are also among the top five most active bitcoin trading currencies. the return is defined as y_t = 100 * (log s_t − log s_{t−1}). we used the last 2350 observations for all indices. table 1 reports the mean, standard deviation, median, and the empirical skewness as well as kurtosis of the five series, outlining the summary statistics of the daily data for all assets. bitcoin's mean return is 0.805% for daily data. as expected, bitcoin returns exhibit much higher volatility than the stock indices. another interesting statistic is the skewness; bitcoin is the only one with negative skewness. this indicates that the tail is on the left side of the distribution, so the probability of values lower than the mean is higher than under the normal distribution, which has a skewness of zero. for the other indices, which have positive skewness, the opposite is true. as before, bitcoin has the highest volatility, having the highest kurtosis compared to the other indices.
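the return definition and the summary statistics reported in table 1 can be reproduced with a short routine like the one below; the price series used here is a made-up example, not the coindesk data.

```python
import math

def log_returns(prices):
    """Percentage log returns: y_t = 100 * (log S_t - log S_{t-1})."""
    return [100.0 * (math.log(p1) - math.log(p0))
            for p0, p1 in zip(prices, prices[1:])]

def summary_stats(x):
    """Mean, (population) standard deviation, skewness and kurtosis."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    sd = math.sqrt(m2)
    return {"mean": m, "sd": sd,
            "skew": m3 / sd ** 3,     # negative => longer left tail
            "kurt": m4 / m2 ** 2}     # the normal distribution has kurtosis 3

stats = summary_stats([2, 4, 4, 4, 5, 5, 7, 9])
```

negative skewness, as found for bitcoin, means the left tail dominates; kurtosis above 3 signals heavier tails than the normal distribution, which motivates the t-distributed errors of the sv-t model.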
the standard sv model is estimated by running the gibbs and a-r m-h algorithms based on 20,000 mcmc iterations, of which 5000 iterations are used as a burn-in period. tables 2 and 3 show the estimation results for the basic svol model and the sv-t model on the daily indexes. α and δ are a priori independent. the prior on δ is essentially flat over [0,1]. we impose stationarity for log(v_t) by truncating the prior of δ. other priors for δ are possible: geweke (1994a) proposes alternative priors to allow the formulation of odds ratios for non-stationarity, whereas kim et al. (1998) center an informative beta prior around 0.9. table 2 shows the results for the daily indexes. the posteriors of δ are higher for the daily series; the posterior means are −0.1173, 0.0765, −0.1945, −0.1774, 0.1601 and −0.1501. as in the basic svol, there is no apparent evidence of a unit root in volatility. there are other factors that can deflect this rate, such as macroeconomic variables (inflation, interest rates). the leverage effect is directly modeled in the model of jpr (1994, 2004); the volatility at time t can be positively or negatively influenced by the volatility at time t−1. similarly, the coefficient associated with the shocks is significant and positive for all the indices, and it is high in the majority of observations. we deduce that the nature of the shock can have a remarkable influence on volatility. this generates excessive volatility, which is well observed in the bitcoin series, without neglecting the other indexes. our results are similar to those of guney et al. (2017), edgars et al. (2018), walid and ahmed (2020), and kraaijeveld and de smedt (2020), where psychological biases represent an explanatory factor. table 3 explores the metropolis-hastings estimates of the basic autoregressive sv-t model. the estimates of φ are between 0.2086 and 0.6711, while those of σ are between 0.013 and 0.075.
in contrast, the posterior of φ for the sv-t model is located higher. this result confirms the typical persistence reported in the garch literature. according to the results, the first volatility factors have lower persistence, while the values of φ_2 indicate high persistence of the second volatility factors. the second factor φ_2 plays an important role in the sense that it captures extreme values, which may produce the leverage effect, so it can be considered conceivable. the estimates of ρ are positive in all cases except for the dow jones and s&p, where the estimates are −0.0226 and −0.0389 respectively. using metropolis-hastings for each data set, the innovations in the mean and volatility are negatively correlated. negative correlations between mean and variance errors can produce a "leverage" effect in which negative (positive) shocks to the mean are associated with increases (decreases) in volatility (mbanga et al. 2018; andersen et al. 2003; black 1976; caraiani and calin 2019; hung et al. 2020). the returns of the different indexes are influenced by the different crises observed in the international market (see fig. 1). for example, regarding bitcoin, the crash of april 2013 came after bitcoin's first significant brush with the mainstream. the currency had never crossed $30 before 2013, but a flood of media coverage helped drive it well above $200. bitcoin spent most of the rest of 2013 around $120. then prices jumped ten-fold in the fall: bitcoin hit a high of $1150 in late november, and then the party ended abruptly, and prices tumbled below $500 by mid-december. it would take more than four years for bitcoin to reach $1000 again. also, the price of bitcoin had been making significant gains after 2013 when, in february, the price fell from $867 to $439 (a 49% drop). this triggered a doldrums period for bitcoin that lasted until late 2016. the 6 february crash came after the operator of mt.
gox, long the go-to trading place for longtime bitcoin owners, announced the exchange had been hacked. on february 7, the exchange halted withdrawals and later revealed thieves had made off with 850,000 bitcoins (which would be worth around $3.5 billion today). the incident, which created existential doubts about the security of bitcoin and undercut liquidity in the currency, likely harmed the currency's value for years. then in summer 2017: in early january, bitcoin broke $1000 for the first time in years and started climbing like crazy. by june, the currency nudged $3000, but then lurched back all of a sudden, falling 36% to $1869 by mid-july. the great china chill: after fears over the fork subsided, bitcoin went on another crazy tear: it climbed close to $5000 at the start of september before plunging 37% by september 15, shaving over $30 billion off bitcoin's total market cap in the process. recovery was already underway, though, as prices climbed above $4000 three days later. after 2017 was marked by tranquility in the financial markets, volatility finally rebounded in february and, most recently, in october and november 2018. many financial analysts have indicated that these sudden changes may be related to changes in expectations about the pace of monetary policy normalization in the us. such increases would typically occur when inflation and employment figures are released. more generally, the financial markets react to the publication of macroeconomic data, particularly in the current context of international trade tensions. in all likelihood, therefore, one cannot exclude that specific events or announcements will induce new bursts of volatility. wu et al. (2019) prove the importance of the economic policy uncertainty index (epu) in forecasting the volatility of bitcoin prices.
more recently, stock market volatility was relatively high in 2015, when the fall of chinese stock indices spread to the united states and europe, and when brexit was voted in mid-2016. volatility then drastically decreased until 2018, when two corrections on the us and european stock markets in february and october revived it. recently, researchers have discovered a significant influence of the business cycle on the low-frequency component of volatility, while abrupt increases in volatility are partly due to reversals of market sentiment (adrian and rosenberg 2008; engle and rangel 2008; engle et al. 2013; corradi et al. 2013; chiu et al. 2018; rognone et al. 2020; guégan and renault 2020). moving in concert with measures of us volatility, the volatility of euro area stock markets began to decrease after the brexit vote held in mid-2016. the one-month stoxx index, which measures the implied volatility of the euro stoxx 50 index, stood at 10% at the end of 2017, i.e. a level comparable to that before the crisis. this drop in volatility was probably partly due to the good economic conditions of the moment and to a resolutely accommodating orientation of monetary policy (see also ecb 2017). in february 2018, however, market volatility suddenly increased after the release of inflation and employment figures in the united states. these figures suggested that the fed could normalize its policy faster than expected, which caused the stock market indices to fall. the one-month stoxx indices jumped to 40 and 30% respectively. in the weeks that followed, these indices gradually declined, falling back to 12% in may. this episode, therefore, seems to be part of the high-frequency component of volatility, that is, an increase in volatility that is (almost) unpredictable and without major consequences for the real economy. in october 2018, a further correction occurred on the stock markets and volatility increased again.
the one-month stoxx indices reached 21 and 25%. this event indicates that the markets remain particularly responsive to specific announcements, in a context of fed rate increases, declining net asset purchases by the ecb, and trade and political tensions (brexit, italy). we also deduce that the american market is volatile, which can be explained by the primordial role of investor sentiment in predicting volatility based on the ranges of the s&p 500 index (zhang and mo 2019). when new information comes into the market, it can be disrupted, and this affects shareholders' anticipation of the evolution of returns. the resulting plots of the smoothed volatilities are shown in fig. 2. we focus our analysis on the bitcoin cryptocurrency; the other indices are reported in appendix 1 (fig. 4). it is nicely illustrated that the mcmc technique used to estimate the latent volatility reveals that the sv-t model performs better than the svol. the convergence is very remarkable for the nikkei, dow jones, stoxx, and bitcoin indices. this proves that the algorithm used to estimate volatility is a good choice. a mis-specified basic svol model can induce substantial parameter bias and error in inference about v_t; geweke (1994a, b) showed that the basic svol has this problem with the largest outlier, summer 2017. the v_t for the svol model reveals a big outlier in crisis periods. the corresponding plots of the innovations are shown in fig. 3 for the two models, basic svol and sv-t, for the nikkei indices. appendix b (fig. 5) shows the qq plots for the other indices, respectively the nasdaq, s&p, dow jones, nikkei and stoxx, for the two models. the quantile-quantile plot can be used to determine whether two data sets come from populations with a common distribution.
in this graphical technique, the quantiles of the first data set are plotted against the quantiles of the second data set, and a 45-degree reference line is used for interpretation. if the two data sets come from populations with the same distribution, the points should fall approximately along this reference line. the greater the departure from this reference line, the greater the evidence for the conclusion that the two data sets have come from populations with different distributions. this technique provides a graphical assessment of "goodness of fit" that is more powerful than the common technique of comparing histograms of the samples. the advantage of the asymmetric basic sv model is its ability to capture some aspects of the financial market and the main properties of its volatility behavior (danielsson 1994; eraker et al. 2003). the following figure illustrates the q-q plot drawn to evaluate the fitted scaled t-distribution graphically. in general, the q-q plots demonstrate linear patterns, which confirms that the two distributions considered in each graph are similar. therefore, it can be said that the fitted scaled t-distributions with their respective parameters exhibit a good fit to the return distribution for all indexes. the algorithm used to estimate the volatility using the sv-t model ensures much higher normality for bitcoin than the svol. table 4 documents the performance of the algorithm and the consequence of using the wrong model on the estimates of volatility. the mcmc is more efficient for all parameters used in these two models. beyond a certain threshold, all parameters are stable and converge to a certain level. appendices c and d (figs. 6 and 7) show that α, δ, σ and φ converge and stabilize, which shows the power of mcmc. the results for both simulations show that the algorithm of the sv-t model is fast and converges rapidly with acceptable levels of numerical efficiency. thus our sampling provides strong evidence of convergence of the chain.
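the quantile-quantile comparison described above can be computed directly, without plotting, by pairing sorted sample values with theoretical quantiles. the sketch below checks a simulated sample against a standard normal reference; the sample itself and the plotting-position formula (i − 0.5)/n are assumptions, since the paper does not specify them.

```python
import random
from statistics import NormalDist

def qq_pairs(sample):
    """Pair each order statistic with the standard-normal quantile at
    plotting position (i - 0.5)/n; points near the 45-degree line
    indicate the sample is close to N(0, 1)."""
    xs = sorted(sample)
    n = len(xs)
    ref = NormalDist()  # standard normal reference distribution
    return [(ref.inv_cdf((i - 0.5) / n), x) for i, x in enumerate(xs, start=1)]

rng = random.Random(3)
sample = [rng.gauss(0.0, 1.0) for _ in range(2000)]
pairs = qq_pairs(sample)
# maximum deviation of the points from the 45-degree reference line
max_dev = max(abs(q - x) for q, x in pairs)
```

for a fitted t-distribution, as in the paper, the `NormalDist` reference would simply be replaced by the quantile function of the scaled t with the estimated degrees of freedom.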
bitcoin is an innovative payment network and a new kind of money: a digital payment currency that uses cryptography and peer-to-peer technology to create and manage monetary transactions. we have applied these mcmc methods to the study of various indexes and a cryptocurrency. the arsv-t models were compared with the svol models of jpr (1994) using the s&p, dow jones, nasdaq, nikkei, and stoxx. we wanted to study the behavior of this cryptocurrency, and it turned out that it behaves like the stock market indices of the different international markets facing different shocks (efe caglar 2019). the empirical results show that the sv-t model can describe extreme values to a certain extent and that it is more appropriate for accommodating outliers. first, the arsv-t model provides a better fit than the mfsv model; second, positive and negative shocks do not have the same effect on volatility (asai 2008). our results prove the efficiency of the markov chain for our sample and the convergence and stability of all parameters to a certain level. our results are consistent with the work of dyhrberg (2016) and contradict those of baur et al. (2018a, b), who claim that bitcoin has unique risk-return characteristics and follows a different volatility process when compared with other assets. this view is endorsed by many researchers, e.g. glaser et al. (2014) and baek and elbeck (2015a, b). to our knowledge, no paper has studied bitcoin by conducting a comparative study using stochastic volatility models. our results prove that this cryptocurrency does not behave differently from stock market indices although it is traded on a virtual market. observing the market, several investors are wary of this cryptocurrency while justifying themselves by several considerations.
our results will help investors better diversify their portfolios by adding this cryptocurrency, and this was clearly borne out, especially after the "covid 19" crisis, when the volume of transactions rose sharply in this type of market. this paper has made certain contributions, but several extensions are still possible, and better results may be found by opting for extensions of svol such as the models of singleton (2001), knight et al. (2002) and others.

references (titles as extracted):
- stock returns and volatility: pricing the short-run and long-run components of market risk
- is there a risk-return trade-off in cryptocurrency markets? the case of bitcoin
- modeling and forecasting realized volatility
- autoregressive stochastic volatility models with heavy-tailed distributions: a comparison with multifactors volatility models
- bitcoin as an investment or speculative vehicle? a first look
- bitcoins as an investment or speculative vehicle? a first look
- is gold a hedge or a safe haven? an analysis of stocks, bonds and gold
- bitcoin, gold and the us dollar: a replication and extension
- bitcoin, gold and the us dollar: a replication and extension
- studies of stock market volatility changes
- japan's bitpoint to add bitcoin payments to retail outlets
- some central banks are exploring the use of cryptocurrencies
- spillovers between bitcoin and other assets during bear and bull markets
- explosive behavior in the prices of bitcoin and altcoins (cagli)
- the impact of monetary policy shocks on stock market bubbles: international evidence
- explaining the gibbs sampler
- the jump behavior of foreign exchange market: analysis of thai baht
- markov chain monte carlo methods for stochastic volatility models
- financial market volatility, macroeconomic fundamentals and investor sentiment
- south korea officially legalizes bitcoin, huge market for traders
- oops, i forgot the light on! the cognitive mechanisms supporting the execution of energy saving behaviors
- stochastic volatility in asset prices: estimation with simulated maximum likelihood
- bitcoin, gold and the dollar: a garch volatility analysis
- herding behaviour in an emerging market: evidence from the moscow exchange
- the spline garch model for low frequency volatility and its global macroeconomic causes
- stock market volatility and macroeconomic fundamentals
- the impact of jumps in volatility and returns
- sampling from log-concave distributions
- sampling-based approaches to calculating marginal densities
- priors for macroeconomic time series and their application
- comment on bayesian analysis of stochastic volatility
- adaptive rejection sampling for gibbs sampling
- bitcoin: asset or currency? revealing users' hidden intentions
- does investor sentiment on social media provide robust information for bitcoin returns predictability
- monte carlo sampling methods using markov chains and their applications
- relative accuracy of analysts' earnings forecasts over time: a markov chain analysis
- improving the realized garch's volatility forecast for bitcoin with jump-robust estimators
- bayesian analysis of stochastic volatility models (with discussion)
- bitcoin: the new gold? (barron's)
- stochastic volatility: likelihood inference and comparison with arch models
- bitcoin: safe haven, hedge or diversifier? perception of bitcoin in the context of a country's economic situation: a stochastic volatility approach
- estimation of the stochastic volatility model by the empirical characteristic function method
- the predictive power of public twitter sentiment for forecasting cryptocurrency prices
- centralized and decentralized bitcoin markets: euro vs usd vs gbp
- investor sentiment and aggregate stock returns: the role of investor attention
- the relationship between news-based implied volatility and volatility of us stock market: what can we learn from multiscale perspective?
- bitcoin: a peer-to-peer electronic cash system
- financial regulations and price inconsistencies across bitcoin markets
- news sentiment in the cryptocurrency market: an empirical comparison with forex
- likelihood analysis of non-gaussian measurement time series
- estimation of affine asset pricing models using the empirical characteristic function
- bitcoin is the new gold (bloomberg view, 31.01)
- is bitcoin a waste of resources? (fed. reserve bank st)
- does gold or bitcoin hedge economic policy uncertainty?

key: cord-003377-9vkhptas authors: wu, tong; perrings, charles title: the live poultry trade and the spread of highly pathogenic avian influenza: regional differences between europe, west africa, and southeast asia date: 2018-12-19 journal: plos one doi: 10.1371/journal.pone.0208197 sha: doc_id: 3377 cord_uid: 9vkhptas in the past two decades, avian influenzas have posed an increasing international threat to human and livestock health. in particular, highly pathogenic avian influenza h5n1 has spread across asia, africa, and europe, leading to the deaths of millions of poultry and hundreds of people. the two main means of international spread are through migratory birds and the live poultry trade. we focus on the role played by the live poultry trade in the spread of h5n1 across three regions widely infected by the disease, which also correspond to three major trade blocs: the european union (eu), the economic community of west african states (ecowas), and the association of southeast asian nations (asean). across all three regions, we found per-capita gdp (a proxy for modernization, general biosecurity, and value-at-risk) to be risk reducing. a more specific biosecurity measure, general surveillance, was also found to be mitigating at the all-regions level. however, there were important inter-regional differences.
for the eu and asean, intra-bloc live poultry imports were risk reducing while extra-bloc imports were risk increasing; for ecowas the reverse was true. this is likely due to the fact that, while the eu and asean have long-standing biosecurity standards and stringent enforcement (pursuant to the world trade organization's agreement on the application of sanitary and phytosanitary measures), ecowas suffered from a lack of uniform standards and lax enforcement. highly pathogenic avian influenzas have become a major threat to human and livestock health in the last two decades. the h5n1 panzootic (2004, ongoing) has been one of the most geographically widespread and costly, resulting in the loss of hundreds of millions of poultry in 68 countries [1] and over 450 human deaths worldwide, a mortality rate of 60 percent [2, 3]. for h5n1, and other h5 subtypes, most countries reporting poultry outbreaks also report evidence of the disease in wild bird populations, and the mechanisms for the spread of h5n1 have been identified as a combination of wild bird transmission and the live poultry trade [4, 5]. risk factors. our primary interest is in the role of live poultry imports as a source of trade-related avian influenza risk at the regional level. we note that other poultry products, such as packaged meat and eggs, do pose a risk, but it is significantly lower. although avian influenza can persist in frozen meat, contact with that meat is unlikely to cause infection [30]. furthermore, since hpais are lethal to egg embryos, eggs are not a potential source of transmission [31]. the data comprise an unbalanced panel covering 53 countries over 13 years; the lack of balance is due to the fact that membership of the eu changed over the timeframe.
the response variable in all models estimated was a log transformation of the number of h5n1 poultry outbreaks in a given country in a given year, obtained from the emergency prevention system for animal health (empres), a joint project of the fao and oie [32]. the log transformation was applied to account for the wide disparities in the numbers of outbreaks across countries. in 2010, for example, indonesia recorded 1206 outbreaks while romania, the only eu country to be infected that year, had only 2. in addition to reflecting the differing directions and intensities of risk factors, this also reflects differences in reporting conventions for h5n1 at the international level [33]. a series of outbreaks may be reported separately in one country, but be treated as a single event in another. data on trade in live poultry were obtained from the united nations' comtrade database (comtrade.un.org) and resourcetrade.earth, a project of the royal institute of international affairs (www.chathamhouse.org). these report the total imports of live poultry into a given country in a given year by weight (kg). the data on trade in live poultry did not distinguish between different types of domestic birds, such as chickens, ducks, and geese, but grouped them under the single commodity category of "live poultry." with respect to wild bird migration as a pathway for h5n1 spread, we used the density of wild bird habitat as a proxy for the presence and scale of migratory bird populations, and for the likelihood that wild and domestic birds will mix. lakes, wetlands, and (irrigated) agricultural areas have been consistently identified as wintering and breeding grounds for migratory birds, and as places where wild birds may come into contact with free-ranging poultry [34] [35] [36] [37] [38] [39] [40] [41].
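the log transformation of outbreak counts can be illustrated with a short snippet. since counts of zero occur in the data, the log(1 + y) variant is used here as an assumption; the paper does not state how zero-outbreak country-years were handled.

```python
import math

def log_transform(counts):
    """Compress wide disparities in outbreak counts with log(1 + y);
    the +1 shift keeps zero-outbreak country-years defined.
    (Assumption: the paper says 'log transformation' without specifying.)"""
    return [math.log1p(c) for c in counts]

# e.g. indonesia (1206 outbreaks) vs romania (2) in 2010, plus a zero year
transformed = log_transform([1206, 2, 0])
```

the transformation shrinks the indonesia-to-romania gap from three orders of magnitude to well under one, which is exactly the compression of disparities the text describes.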
the indicator for wild bird habitat used in this study was the set of "important bird and biodiversity areas" (ibas) for "migratory and congregatory waterbirds" identified by birdlife international (datazone.birdlife.org). in their 2006 analysis of h5n1 spread, kilpatrick, chmura (4) also identified ibas as a proxy for migratory birds and the infection risks they pose. country-level statistics on socioeconomic and agro-ecological conditions were taken from the united nations' food and agriculture organization (www.fao.org/faostat/en/) and the world bank (data.worldbank.org). agricultural land cover was reported as a percentage of the total land area of the country. per-capita gdp was reported in purchasing power parity terms as current international dollars. data for 2016 for these two variables were missing for certain countries. in these cases, the gaps were filled by extrapolating the missing data as a linear trend of the preceding 11 years. we assume that agricultural land, where free-ranging chickens, ducks, and geese are commonly raised in all three regions, also acts as a relevant proxy for susceptible poultry. data on the biosecurity measures targeting avian influenza undertaken by each country were obtained from the world organisation for animal health (oie) (www.oie.int).

the live poultry trade poses different avian influenza risks in different regions of the world

table 1. the distribution of h5n1 poultry outbreaks between 2004-2016 across the member states of asean, ecowas, and eu. "-" signifies that the country was not a member of its associated trade bloc in that given year.

country    2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016
brunei        0    0    0    0    0    0    0    0    0    0    0    0    0
cambodia     23    1    5    1    1    1    3    4    1    8    5    2    1
indonesia     1    3   20    0   20 1503 1206 1155  308  260  310  107  269
laos         19    0    1    7   13    6    1 …
these report a standardized series of biosecurity controls targeting wildlife and livestock diseases, including those related to surveillance, vaccination, border checks, and management of wild disease reservoirs, and whether or not a given country undertook them in a given year. we chose a subset of these biosecurity measures we considered most relevant to h5n1 avian influenza risks for inclusion in our model. additionally, in any given year, there were 1 to 4 countries that did not provide a report of biosecurity measures to the oie; we assumed that this indicates an absence of action, and the dataset records these cases as zeroes. our modeling approach relied on generalized linear models (glms) to analyze a panel of data on disease outbreaks and associated risk factors. in this we follow others who have sought to predict the spread of h5n1 [42-44] or h7n9 [45-47] at both national and international levels. glms are well suited to epidemiological studies because of their flexibility regarding data type and the distribution of response variables, their simplicity of application, and their frequency of use [33] . our identification strategy involved the selection of three specifications for each of two estimators. we adopted both random and fixed effects estimators. hausman tests conducted at the all-regions level favored a random effects estimator, as the p-value exceeded the 5% threshold below which a fixed-effects regression is conventionally considered necessary. some factors that influence the likelihood and number of outbreaks in a given country or region are not likely to change significantly over the course of several years, or even a decade. in our dataset, for example, the amount of land covered by wild bird habitat is time-invariant, while agricultural land and even per-capita gdp for many countries experienced relatively modest variations over the timeframe of the study.
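the hausman test mentioned above compares the fixed and random effects coefficient vectors. a minimal numpy sketch of the test statistic follows; the coefficient vectors and covariance matrices are illustrative values of our own, not the paper's estimates:

```python
import numpy as np

def hausman_statistic(b_fe, b_re, v_fe, v_re):
    """Hausman statistic H = (b_FE - b_RE)' (V_FE - V_RE)^{-1} (b_FE - b_RE).
    H is compared against a chi-squared distribution with k degrees of
    freedom; a large H (small p-value) favours the fixed effects estimator."""
    diff = b_fe - b_re
    return float(diff @ np.linalg.inv(v_fe - v_re) @ diff)

# toy coefficient vectors and covariance matrices (k = 2, values illustrative)
b_fe = np.array([0.50, -0.20])
b_re = np.array([0.48, -0.18])
v_fe = np.array([[0.010, 0.001], [0.001, 0.012]])
v_re = np.array([[0.006, 0.001], [0.001, 0.008]])
print(hausman_statistic(b_fe, b_re, v_fe, v_re))  # 0.2: fail to reject RE
```

with k = 2 degrees of freedom a statistic of 0.2 lies far below conventional chi-squared critical values, mirroring the paper's finding that the random effects estimator is acceptable at the all-regions level.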
in this case, and as the hausman diagnostics indicate, a random effects estimator is more appropriate. nevertheless, since we wished to control for time-invariant characteristics of regions and countries, we also implemented fixed effects estimators at both the aggregate and trading bloc levels, implicitly assuming no changes in the trade or biosecurity environment at the bloc level that we are unable to control for. our first specification (model 1) included a number of factors related to disease risk but excluded both live poultry imports and biosecurity measures. included predictors were land area, human population, per-capita gdp in purchasing power terms, agricultural area, wild bird habitat area, and the live chicken population. our second specification (model 2) added intra-regional trade bloc and extra-bloc imports of live poultry. our third specification (model 3) added four main biosecurity measures: border precautions, general surveillance, vaccination prohibition, and wild disease reservoir management. all are categories of oie-reported biosecurity measures taken against avian influenza. the general form of the estimated random effects model was

y_{it} = \beta' x_{it} + \gamma' z_{it} + \delta' u_{it} + \theta_1 \,\mathrm{ecowas}_i + \theta_2 \,\mathrm{asean}_i + u_{it} + \varepsilon_{it},

with country-specific intercepts replacing the between error u_{it} (and absorbing the bloc dummies) in the fixed effects variant, where y_{it} denotes the number of poultry outbreaks in country i in year t, x includes the predictors for model 1, z includes the additional predictors for model 2, u includes the additional predictors for model 3, ecowas and asean are dummy variables for the two titular regional trade blocs (the eu is the reference group), and u_{it} and \varepsilon_{it} are the "between" and "within" errors respectively. to account for heteroskedasticity, we used robust standard errors. finally, since the data used in this analysis are reported annually, and h5n1 has been a conspicuous and fast-moving epidemic among poultry (meaning the effects of an outbreak are unlikely to persist over a long period of time), we did not use a lag structure in our statistical analysis.
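the fixed effects (within) estimator described above can be sketched on a synthetic panel. the dimensions and coefficient below are illustrative, not the paper's data; the point is that entity demeaning removes the country-specific intercepts before pooled ols:

```python
import numpy as np

rng = np.random.default_rng(0)
n_countries, n_years, beta_true = 30, 13, 0.8

# synthetic panel: y_it = a_i + beta * x_it + e_it, with country effects a_i
a = rng.normal(size=n_countries)[:, None]
x = rng.normal(size=(n_countries, n_years))
y = a + beta_true * x + 0.1 * rng.normal(size=x.shape)

# within (demeaning) transformation removes a_i; pooled OLS then recovers beta
xw = x - x.mean(axis=1, keepdims=True)
yw = y - y.mean(axis=1, keepdims=True)
beta_hat = (xw * yw).sum() / (xw ** 2).sum()
print(beta_hat)  # close to 0.8
```

a random effects estimator would instead use a partial (quasi-) demeaning, which is why the two sets of coefficients can legitimately differ and the hausman comparison is informative.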
therefore, we assumed that the factors driving an outbreak in a given year are contemporaneous with it (e.g., an outbreak that occurred in 2012 was modelled using trade volumes from 2012). we were also constrained by data availability in our use of annual increments: although monthly data exist for outbreaks, they do not for important predictor variables such as per-capita gdp, human and poultry populations, the volume of live poultry traded, and biosecurity. regression results from all models, including both random and fixed effects, are reported in tables 2-5 . at the all-regions level, the results for the random- and fixed-effects models were very similar, with the same set of predictor variables being statistically significant (i.e., p-values below the 5% or 10% thresholds) and the same direction of impact on the response variable. this set of predictors was human population (positive direction), per-capita gdp (negative direction), intra-trade bloc live poultry imports (negative direction), extra-trade bloc live poultry imports (positive direction), and the biosecurity measure of surveillance (negative direction). additionally, although the coefficient values for the same predictor differed between the two estimators, all pairs were within the same order of magnitude. the only exception to this was the migratory waterbird habitat variable (the percent of land area covered by ibas for migratory and congregatory waterbirds). this was statistically significant and negative (i.e., had a mitigating impact on h5n1 poultry outbreaks) for the fixed-effects model but was not significant for the random-effects model. the overall r-squared for the random-effects model was significantly higher than that for the fixed-effects model (0.451 vs. 0.0181). the "between r-squared" value was particularly high (0.682) in the random effects model, signaling the importance of variation among countries (as opposed to "within r-squared," which measures the variation within countries over time).
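one common convention for the overall, between, and within r-squared values distinguished above can be sketched as follows. this is a hedged illustration: the 1 - ssr/sst form used here is our assumption, as some packages instead report squared correlations for the between and within components:

```python
import numpy as np

def r2(y, yhat):
    # 1 - SSR/SST goodness of fit
    ss_res = ((y - yhat) ** 2).sum()
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot

def panel_r2(y, yhat):
    """Overall, between (entity means), and within (entity-demeaned) R^2,
    for y and fitted values arranged as (countries x years) arrays."""
    overall = r2(y.ravel(), yhat.ravel())
    between = r2(y.mean(axis=1), yhat.mean(axis=1))       # across countries
    within = r2(y - y.mean(axis=1, keepdims=True),        # within countries
                yhat - yhat.mean(axis=1, keepdims=True))
    return overall, between, within

rng = np.random.default_rng(1)
y = rng.normal(size=(5, 10))
yhat = y + 0.3 * rng.normal(size=y.shape)
print(panel_r2(y, yhat))
```

a high between r-squared with a low within r-squared, as the paper reports for the random effects model, indicates that cross-country differences carry most of the explained variation.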
as we had expected, we found significant differences across trade regions. in the random-effects model, ecowas diverged from all-regions conditions and from asean with respect to per-capita gdp and extra-bloc imports: while the two predictors were, respectively, risk-decreasing and risk-increasing at the all-regions level and in asean, they had the opposite impacts in ecowas. furthermore, ecowas differed from the all-regions level and from the eu in terms of intra-bloc imports: while this was risk-decreasing for the former two, it was risk-increasing for ecowas. finally, there were predictors that were statistically insignificant at the all-regions level but had a significant effect within different regions. for asean, agricultural land cover was a mitigating factor for outbreaks while wild disease reservoir management showed a strong positive relation with outbreaks. for ecowas, wild waterbird habitats and border precautions had a mitigating effect on outbreaks while vaccination prohibition and wild reservoir management had a positive effect. in the eu, the population of live chickens had a strong negative relation with outbreaks, while vaccination prohibition, similar to the case with ecowas, was positively related.

table 2. results from the regression models of h5n1 outbreak risk factors for member states in all three regions; regressor coefficients are reported and statistically-significant factors are marked by asterisks. a blank space signifies that the variable was not included in the given model. (coefficient columns for models 1-3 were not recovered.)

following liang, xu (5), there is a perception that the long distance transmission of highly pathogenic avian influenza h5n1 was largely due to wild bird migration, with the live poultry trade playing a minor and more localized role in some cases. our concern here has been to identify the nature of the risk posed by the live poultry trade in different regions of the world, and the conditions affecting that risk.
our measure of development status, per-capita gdp, is simultaneously a proxy for modernization, biosecurity, consumption, and value-at-risk. as a proxy for modernization, it reflects risk-reducing differences in production methods. industrial livestock production methods typically include on-farm biosecurity measures that protect poultry from contact with disease-carrying wild birds. unlike traditional methods of free-range or "backyard" husbandry, factory production minimizes the likelihood of poultry intermingling with wild birds or being exposed to environmental pathogen pollution. for all its epidemiological, ecological, and ethical problems, industrial livestock production allows for more timely and widespread disease surveillance and vaccination, and for greater compliance with animal health regulations [48] . at the same time, per-capita gdp growth is also associated with risk-increasing changes in meat consumption, and hence poultry production. indeed, the highest income elasticity of demand for meat and fish has been found in the poorest households and the poorest countries [49] . in developing countries, 71% of the additions to meat consumption are from pork and poultry, with poultry dominating pork [50] . absent changes in on-farm biosecurity, increased production implies increased risk. across all regions, the net effect of income growth is to reduce risk, dominating the risk-increasing changes. in the ecowas region (the lowest income region) the effect is the opposite: the risk-increasing effects of income growth dominate the risk-reducing effects (table 3) .

table 3. results from the regression models of h5n1 poultry outbreak risk factors for the association of southeast asian nations (asean); regressor coefficients are reported and statistically-significant factors are marked by asterisks. a blank space signifies that the variable was not included in the given model. (coefficient columns for models 1-3 were not recovered.)
amongst the landscape variables (land area, the proportion in agriculture, and the proportion in ibas) our results reveal no uniform relation to h5n1 outbreaks. at the all-regions level we found a weakly negative relation between outbreaks and the proportion of the land area in ibas (table 1) . this was driven by the european union, which includes the highest proportion of land area in ibas, but also the most industrialized forms of poultry production. the degree to which poultry production is industrialized also shows up in the coefficients on poultry numbers, which are negative and significant only for the eu (table 4) . while spatial heterogeneity at the landscape scale is important in terms of avian ecology, we were unable to take explicit account of these more detailed considerations in a country-scale analysis. the impacts of regional differences in biophysical conditions that are not directly controlled for are, however, included in bloc-level fixed effects. our primary concern is with the role of the live poultry trade, and how that differs between regions. across all regions we find that live poultry imports into a trade bloc are risk-increasing. this is consistent with past studies that have shown that extra-bloc live poultry imports may be a significant source of additional avian influenza risk where they do not meet bloc sanitary and phytosanitary standards. the eu's common market and the asean free trade regime in particular have long-standing and standardized protocols, in accordance with the world trade organization's agreement on the application of sanitary and phytosanitary measures. but the two blocs have quite different exposures to external risk. a study of highly pathogenic avian influenza introductions to vietnam, for example, found that extra-asean imports of live poultry increased the risk of introduction [51] . this is also what our study finds for the asean region (table 2) . we do not see an equivalent effect for the eu (table 4) , reflecting differences in both import volumes and the biosecurity measures applied to imports. the eu imports less and applies stricter biosecurity measures to those imports. the ecowas story is different. extra-bloc live poultry imports are risk-reducing, not risk-increasing (table 3) . it is likely that imports from outside the bloc reduce avian influenza risk in the region in part because they meet biosecurity standards that are more stringent than the standards applied in the region. the effects of intra-bloc trade in live poultry mirror the effects of extra-bloc trade. in the eu and asean, intra-bloc trade is risk-reducing (tables 2 and 4) . this may reflect a "substitution effect" in which imports of safer intra-bloc poultry crowd out riskier extra-bloc imports. other studies have come to similar conclusions. eu-derived live poultry imports to spain, for example, were found to pose no threat of avian influenza introduction [52] . once again, ecowas is the exception. extra-ecowas imports of live poultry are risk-reducing while intra-bloc imports are risk-increasing (table 2) . this is likely due to poor internal biosecurity, such as lax standards and inconsistent execution of inspections. regulatory standards within the ecowas trade bloc have been weak for the whole of the study period [53] .

table 4. results from the regression models of h5n1 poultry outbreak risk factors for the economic community of west african states (ecowas); regressor coefficients are reported and statistically-significant factors are marked by asterisks. a blank space signifies that the variable was not included in the given model. (only the population row was partially recovered: # people, 6.05x10^-9, 3.85x10^-8, 7.34x10^-9, 4.21x10^-8, with significance marks garbled.)
while harmonized sanitary and phytosanitary standards for the 15 member states of ecowas were in principle adopted in 2010, most ecowas states had yet to submit legislation for international certification by 2017 [54] . failure to adopt and enforce unified standards may be partly due to income constraints in ecowas countries. in ppp terms, the bloc's per-capita gdp in 2016 was less than half that of asean and approximately one-eighth that of the eu, meaning it had fewer resources available for biosecurity policies and institutions. political instability may be another important obstacle: a number of ecowas member states, including nigeria, niger, sierra leone, mali, liberia, and cote d'ivoire, have suffered from civil wars and armed insurgencies over the past two decades. such fraught geopolitical conditions are not conducive to the establishment and enforcement of cross-border regulations. it goes without saying, though, that certification of sanitary and phytosanitary legislation in ecowas states, and the establishment of enforcement agencies to bring states into compliance with the sps agreement and codex alimentarius, is a necessary condition of improving regional trade-related biosecurity.

table 5. results from the regression models of h5n1 poultry outbreak risk factors for the european union (eu); regressor coefficients are reported and statistically-significant factors are marked by asterisks. a blank space signifies that the variable was not included in the given model. (coefficient columns for models 1-3 were not recovered.)

in terms of biosecurity measures more specifically, we did not have direct measures of on-farm biosecurity (but conjecture that biosecurity is increasing in per-capita gdp), but we did have measures of four biosecurity policies at the national level.
these include: (1) border precautions (measures applied at airports, ports, railway stations or road check-points open to the international movement of animals, animal products and other related commodities, where import inspections are performed to prevent the introduction of the disease, infection or infestation); (2) general surveillance (surveillance not targeted at a specific disease, infection or infestation); (3) prohibition of vaccination (prohibition of the use of a vaccine to control or prevent the infection or infestation); and (4) management of wildlife reservoirs (measures to reduce the potential for wildlife to transmit the disease to domestic animals and human beings). the management of wild disease reservoirs differs widely across countries, but techniques include vaccination, treatment of infections with drugs, isolation of infected populations, population translocation, reproduction reduction, culling, and control (draining, flooding, or burning) of wild disease reservoir habitat [55] . of these measures, only general surveillance was significant at the all-regions level, while at the bloc level the effects of the different measures were frequently ambiguous. in the eu, for example, only the prohibition of vaccination was significant, and then in a positive relation to outbreaks. for poultry, vaccination may be prohibited because the practice makes it difficult to distinguish infected from vaccinated flocks. this makes it a concomitant of policies centered on livestock culling as the primary response to outbreak risk [56] . no other biosecurity policy was found to have a statistically significant relation to outbreaks in the region. the same set of policies had opposite effects in asean and ecowas.
the prohibition of vaccination and the management of wild reservoirs were positively related to outbreaks in ecowas but negatively related to outbreaks in asean, while border protection measures were negatively related to outbreaks in ecowas but positively related to outbreaks in asean. this may reflect regional disparities in the quality of implementation not captured in the data. but it may also reflect the greater importance of trade in the transmission of the disease in ecowas. in their survey of the international spread of h5n1 in the early years of the global epidemic, kilpatrick, chmura (4) found that transmission into europe was by wild birds, that transmission into southeast asia was by the poultry trade, and transmission into africa by a balance of both. our results suggest that after introduction, inter-country spread had differing dynamics in each region. while intra-bloc trade facilitated h5n1 spread among west african countries, it did not in either europe or southeast asia. in these areas, greater risk was posed by out-of-region live poultry imports. in recent decades, avian influenzas have emerged as a major threat to human and animal health across the world. in particular, hpai h5n1, which was first isolated in 1996, has been the most widespread and among the most devastating in terms of livestock and human mortality. it has inflicted severe losses on poultry stocks and caused hundreds of human deaths. even today, as other avian influenzas have become epidemic, h5n1 remains in circulation among wildlife and livestock. identifying and quantifying the mechanisms of its international spread can help lay the groundwork for prediction and mitigation. it may also provide an instructive framework for the management of other avian influenzas. in this study, we considered the risk posed by the international trade in live poultry and the effects of associated biosecurity measures.
differing agro-ecological and socioeconomic conditions across the trade regions were shown to influence epidemic dynamics in different ways, with certain factors being risk-enhancing or risk-decreasing in one region but having the opposite effect, or no significant effect, in another. in policy terms, there is no one-size-fits-all solution to mitigating avian influenza spread. the particular conditions, including those related to the trade agreements and associated regulatory standards, of a given region need to be carefully considered. but overall, biosecurity measures are potentially effective at controlling h5n1 risks, and should be undertaken as a means to forestall spread; in general, mitigation of epidemics is significantly more cost-efficient than suppression [57] . on-farm and other forms of domestic biosecurity may be more important than trade-related measures, but where the protection of trade pathways is weak, the risk of avian influenza spread is clearly higher.

supporting information
s1 file. detailed information on data sources. the public sources of the data used in this study, and how they were acquired, are described. (docx)

references
- world organization for animal health. oie situation report for highly pathogenic avian influenza: update: 28/02
- world health organization. cumulative number of confirmed human cases for avian influenza a(h5n1) reported to who
- food and agriculture organization. h7n9 situation update
- predicting the global spread of h5n1 avian influenza
- combining spatial-temporal and phylogenetic analysis approaches for improved understanding on global h5n1 transmission
- exotic effects of capital accumulation
- options for managing the infectious animal and plant disease risks of international trade
- forecasting biological invasions with increasing international trade
- risk of importing zoonotic diseases through wildlife trade, united states
- globalization and livestock biosecurity
- global traffic and disease vector dispersal
- influences on the transport and establishment of exotic bird species: an analysis of the parrots (psittaciformes) of the world
- economic factors affecting vulnerability to biological invasions. the economics of biological invasions
- structural change in the international horticultural industry: some implications for plant health
- the emergence and evolution of swine viral diseases: to what extent have husbandry systems and global trade contributed to their distribution and diversity?
- animal movements and the spread of infectious diseases
- wildlife trade and global disease emergence. emerging infectious diseases
- ecology of zoonoses: natural and unnatural histories
- bats are natural reservoirs of sars-like coronaviruses
- global perspective for foot and mouth disease control
- a hotspot of non-native marine fishes: evidence for the aquarium trade as an invasion pathway
- reducing the risks of the wildlife trade
- the worldwide airline network and the dispersal of exotic species
- land-use and socio-economic correlates of plant invasions in european and north african countries
- global transport networks and infectious disease spread
- epidemiologic clues to sars origin in china. emerging infectious diseases
- influenza a h5n1 immigration is filtered out at some international borders
- a statistical phylogeography of influenza a h5n1. proceedings of the national academy of sciences
- transboundary animal diseases: assessment of socio-economic impacts and institutional responses
- the spread of pathogens through trade in poultry meat: overview and recent developments
- the spread of pathogens through trade in poultry hatching eggs: overview and recent developments. revue scientifique et technique de l'office international des epizooties
- emergency prevention system (empres) for transboundary animal and plant pests and diseases. the empres-livestock: an fao initiative
- persistence of highly pathogenic avian influenza h5n1 virus defined by agro-ecological niche
- mapping h5n1 highly pathogenic avian influenza risk in southeast asia
- first introduction of highly pathogenic h5n1 avian influenza a viruses in wild and domestic birds in denmark, northern europe
- environmental factors influencing the spread of the highly pathogenic avian influenza h5n1 virus in wild birds in
- dynamic patterns of avian and human influenza in east and southeast asia
- avian influenza viruses in water birds
- agro-ecological features of the introduction and spread of the highly pathogenic avian influenza (hpai) h5n1 in northern nigeria
- multiple introductions of h5n1 in nigeria
- avian influenza h5n1 viral and bird migration networks in asia
- environmental factors contributing to the spread of h5n1 avian influenza in mainland china
- spatial distribution and risk factors of highly pathogenic avian influenza (hpai) h5n1 in china
- different environmental drivers of highly pathogenic avian influenza h5n1 outbreaks in poultry and wild birds
- mapping spread and risk of avian influenza a (h7n9) in china
- predicting the risk of avian influenza a h7n9 infection in live-poultry markets across asia
- potential geographic distribution of the novel avian-origin influenza a (h7n9) virus
- animal disease and the industrialization of agriculture
- assessing current and future meat and fish consumption in sub-sahara africa: learnings from fao food balance sheets and lsms household survey data. global food security
- rising consumption of meat and milk in developing countries has created a new food revolution. the journal of nutrition
- risk of introduction in northern vietnam of hpai viruses from china: description, patterns and drivers of illegal poultry trade. transboundary and emerging diseases
- a quantitative assessment of the risk for highly pathogenic avian influenza introduction into spain via legal trade of live poultry
- mycotoxins: detection methods, management, public health and agricultural trade: cabi
- regulation status of quarantine pests of rice seeds in the economic community of west african states (ecowas)
- training manual on wildlife diseases and surveillance. paris: world organization for animal health
- emerging biological threats: a reference guide: abc-clio
- economic optimization of a global strategy to address the pandemic threat

we would like to thank ann kinzig, jim collins, ben minteer, and peter daszak for their insightful comments and discussions on the research presented here. conceptualization: tong wu, charles perrings.

key: cord-027119-zazr8uj5
authors: taif, khasrouf; ugail, hassan; mehmood, irfan
title: cast shadow generation using generative adversarial networks
date: 2020-05-25
journal: computational science iccs 2020
doi: 10.1007/978-3-030-50426-7_36
sha:
doc_id: 27119
cord_uid: zazr8uj5

we propose a computer graphics pipeline for 3d rendered cast shadow generation using generative adversarial networks (gans). this work is inspired by existing regression models as well as other convolutional neural networks, such as u-net architectures, which can be geared to produce believable global illumination effects. here, we use a semi-supervised gans model comprising a patchgan and a conditional gan, which is then complemented by a u-net structure. we have adopted this structure because of its training ability and the quality of the results it produces.
unlike other forms of gans, the chosen implementation utilises colour labels to generate believable visual coherence. we carried out a series of experiments, using laboratory-generated image sets, to explore the extent to which colour can create the correct shadows for a variety of 3d shadowed and un-shadowed images. once an optimised model is achieved, we then apply high-resolution image mappings to enhance the quality of the final render. as a result, we have established that the chosen gans model can produce believable outputs with the correct cast shadows, with plausible scores on the psnr and ssim similarity index metrics.

shadow generation is a popular computer graphics topic. depending on the level of realism required, the algorithms can be real-time, such as shadow mapping [23] and shadow projection [3] , or precomputed, such as ray marching techniques, which are an expensive way to generate realistic shadows. thus, pre-computing often becomes a compelling route to take, as in [28] , [2] and [6] . generative adversarial networks have been implemented widely to perform graphical tasks, as they require minimal to no human interaction, which gives gans a great advantage over conventional deep learning methods; examples include image-to-image translation with a single-discriminator, single-generator semi-supervised model [7] and unsupervised dual learning [26] . we apply image-to-image translation to our own image set to generate correct cast shadows for 3d rendered images in a semi-supervised manner using colour labels. we then augment a high-resolution image to enhance the overall quality. this approach can be useful in real-time scenarios, since recalling a pre-trained model is less costly in time and quality compared to 3d real-time rendering, which often sacrifices realism to enhance performance.
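as a small illustration of the colour-label conditioning just described: an integer object-id mask, as exported from any 3d package, can be mapped to an rgb colour-label image for the cgan. the object ids, palette, and function name below are our own illustrative assumptions; the paper does not specify its actual label colours:

```python
import numpy as np

# hypothetical palette: object id -> RGB colour label
palette = np.array([[0, 0, 0],        # 0: background
                    [255, 0, 0],      # 1: object casting the shadow
                    [0, 255, 0]],     # 2: ground plane receiving it
                   dtype=np.uint8)

def ids_to_colour_map(id_mask):
    """Turn an integer object-id mask (H x W) into an RGB colour-label
    image, the kind of conditioning input the cGAN is trained on."""
    return palette[id_mask]

mask = np.array([[0, 1], [2, 2]])     # tiny 2 x 2 toy mask
print(ids_to_colour_map(mask).shape)  # (2, 2, 3)
```

at inference time, only such a colour map needs to be produced by the 3d software; the trained network supplies the shadowing.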
our approach eliminates the need to constantly render shadows within a 3d environment and only recalls a trained model at the image plane using colour maps, which could easily be generated in any 3d software. the model we use utilises a combination of patchgan and conditional gan, because of their ability to compensate for missing training data and to tailor the output image to the desired task. there are many benefits of applying gans to perform computer graphics tasks. gans can interpret predictions from missing training data, which means a smaller training data set compared to classical deep learning models. gans can operate in multi-modal scenarios, where a single input can generalise to multiple correct answers that are all acceptable. also, the output images are sharper because of how gans learn the cost function, which operates on a real-fake basis, whereas traditional deep learning models minimise the euclidean distance by averaging all plausible outputs, which usually produces blurry results. finally, gan models require neither label annotations nor classifications. the rest of this paper is structured as follows. section 1 reviews related work in terms of traditional shadow algorithms, machine learning and gans; in sect. 2 we explain the construction of the generative model we used; in sect. 3 we present our experiments, from general gan to cgan and dcgan and ending with pix2pix; in sect. 4 we discuss our results; then, in sects. 5 and 6 we present the conclusion and future work, respectively. recent advancements in machine learning have benefited computer graphics applications immensely. in terms of 3d representation, modelling chairs of different styles from large public domain 3d cad models was proposed by [1] . [25] applied a deep belief network to create representations of volumetric shapes. similarly, supervised learning can be used to generate chairs, tables, and cars using up-convolutional networks [5] , [4] .
furthermore, the application of cnns has extended into rendering techniques, such as enabling fast global illumination rendering in a single scene [19] , and image-based relighting from a small number of images [18] . another application filters monte carlo noise using a non-linear regression model [8] . the deep convolutional inverse graphics network (dc-ign) enabled producing variations of lighting and pose of the same object from a single image [10] . such algorithms can provide full deep shading, by training end to end to produce dense per-pixel output [16] . one of the recent methods applies multiple algorithms to achieve real-time outputs, based on a recommendation system that learns the user's preference with the help of a cnn, and then allows the user to make adjustments using a latent space variant system [30] . gans have been implemented widely to perform graphical tasks. in cgans, for instance, feeding a condition into both the d and g networks is essential to control the output images [14] . training the domain-discriminator to maintain relevancy between the input image and the generated image makes it possible to transfer an input domain into a target domain at the semantic level and to generate target images at the pixel level [27] . semi-supervised and unsupervised models have been approached in various ways, such as trading mutual information between observed and predicted categorical class information [20] . this can be done by enabling image translators to be trained from two unlabelled images from two domains [26] , or by translating both the image and its corresponding attributes while maintaining the permutation invariance property of the instance [15] , or even by training a generator along with a mask generator that models the missing data distribution [11] . [12] proposed an image-to-image translation framework based on coupled gans that learn a joint distribution of images in different domains by using images from the marginal distributions in the individual domains.
also, [13] uses an unsupervised model that performs image-to-image translation from few images. [21] made use of other factors such as structure and style. another method eliminated the need for image pairs, by training the distribution of g : x → y until g(x) is indistinguishable from the y distribution, as demonstrated in [29]. another example works by discovering cross-domain relations [9]. another way is forcing the discriminator to produce class labels by predicting which of n + 1 classes the input belongs to during training [17]. another model can generate 3d objects from a probabilistic space with volumetric cnns and gans [24]. the model we use in this paper is a tensorflow port of the pytorch image-to-image translation presented by isola et al. [7]. this approach can be generalised to any semi-supervised model. however, this model serves us better because of its two network types: patchgan, which allows better learning for interpreting missing data and partial data generation, and conditional gan, which allows semi-supervised learning to facilitate control over the desired output images by using colour labels. it also replaces the traditional encoder-decoder generator with a u-net structure with skip connections, which serves two purposes. first, it mitigates the known mode-dropping issue of the traditional gan structure. second, it helps transfer more features across the bottleneck, which reduces blurriness and outputs larger, higher-quality images. the objective of a gan is to train two networks (see fig. 1) to learn, through gradient descent, the correct mapping function that produces outputs y believable to the human eye. the conditional gan here learns a mapping from an observed image x and a random noise vector z to the output image y, g : {x, z} → y, where x is an observed image, z is a random noise vector and y is the output image. the generator g and the discriminator d operate on a "real" or "fake" basis.
this is achieved by training both networks simultaneously with different objectives: g is trained to produce images as realistic as possible, while d is trained to distinguish which images are fake. thus the conditional gan (cgan) objective can be expressed as

l_cgan(g, d) = e_{x,y}[log d(x, y)] + e_{x,z}[log(1 − d(x, g(x, z)))],

where g aims to minimise this objective against the adversarial d, which aims to maximise it:

g* = arg min_g max_d l_cgan(g, d).

comparing it to an unconditional variant, where the discriminator does not observe x, it becomes

l_gan(g, d) = e_y[log d(y)] + e_{x,z}[log(1 − d(g(x, z)))].

here the l1 distance is used instead of l2 to reduce blurring:

l_l1(g) = e_{x,y,z}[||y − g(x, z)||_1],

so the final objective becomes

g* = arg min_g max_d l_cgan(g, d) + λ l_l1(g).

both networks follow the convolution-batchnorm-relu structure. however, the generator differs by following the general u-net structure, and the discriminator is based on markov random fields. the application of a u-net model allows better information flow across the network than the encoder-decoder model by adding skip connections over the bottleneck between layer i and layer n − i, where n is the total number of layers, concatenating all channels at layer i with the ones at n − i, thus producing sharper images. for the discriminator d, an n × n patchgan is applied to minimise blurry results by treating the image as small patches that are classified as real or fake across the image, then averaged to produce the accumulated result of d, such that the image is modelled as a markov random field; isola et al. [7] refer to this patchgan as a form of texture/style loss. the optimisation process alternates gradient-descent steps between d and g: the model is trained to maximise log d(x, g(x, z)), and the objective is divided by 2 to slow the learning rate of d. minibatch stochastic gradient descent with the adam solver is applied at a learning rate of 0.0002, with momentum parameters β1 = 0.5, β2 = 0.999. this allows the discriminator to compare minibatch samples of both generated and real images. the g network runs at inference time with the same settings as in the training phase.
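to make the combined objective concrete, here is a minimal sketch of the two losses. the function names and the plain-python (list-based) representation of images and patch scores are ours, and the adversarial term uses the non-saturating form commonly trained in practice; this is an illustration of the objective, not the actual tensorflow port:

```python
import math

LAMBDA = 100.0  # weight of the l1 term in the combined generator objective

def d_loss(d_real, d_fake, eps=1e-12):
    # discriminator: maximise log d(x, y) + log(1 - d(x, g(x, z))),
    # written here as a loss to minimise (negated mean over patch scores)
    real = sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake = sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return -(real + fake)

def g_loss(d_fake, fake_img, real_img, eps=1e-12):
    # generator: adversarial term plus lambda times the l1 distance to the target
    adv = -sum(math.log(p + eps) for p in d_fake) / len(d_fake)
    l1 = sum(abs(a - b) for a, b in zip(fake_img, real_img)) / len(fake_img)
    return adv + LAMBDA * l1
```

as expected, the discriminator loss shrinks as its patch scores separate real from fake, and the generator loss shrinks as its output approaches the target and fools d.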
dropout and batch normalization are applied to the test batch at test time, with a batch size of 1. finally, random jitter is applied by extending the 256 × 256 input image to 286 × 286 and then cropping back to the original size of 256 × 256. further processing using photoshop is applied manually to enhance the quality of the output image, by mapping a higher-resolution render of the model over the output model image, thus delivering a more realistic final image. here we report our initial experiments for shadow generation, as well as minor shading functions to support it. our approach is data-driven; it focuses on adjusting the image set in every iteration to achieve the correct output. for that we manually created the conditions we intended to test. also, our image set is created using maya with the arnold renderer. all of our experiments are conducted on an hp pavilion laptop with a 2.60 ghz intel core i7 processor, 8 gb of ram and an nvidia geforce gtx 960m graphics card. we start with the assumption that gans can generate both soft and hard shadows on demand, using colour labels and given a relatively small training image set. our evaluation is based on a real-versus-fake basis as well as similarity index metrics. real-versus-fake implies that the images can be clearly evaluated visually, for the network itself does not allow poor-quality images by design. the similarity index metrics applied here are those proposed by [22]: psnr, which measures the peak signal-to-noise ratio and is scored between 1 and 100, and ssim, which measures the structural similarity between the reconstructed image and the reference image and is scored between 0 and 1. for our image-to-image translation approach, we started with a small image set of 100 images: 80 for training and 20 for testing and validation, of random cube renders with different lighting intensities and views. the results showed correct translations, as shown in fig.
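to make the two metrics concrete, here is a minimal sketch of both, treating an image as a flat list of intensity values; the function names are ours, and the ssim shown is a simplified single-window version (the method of [22] averages ssim over local sliding windows):

```python
import math

def psnr(ref, test, max_val=255.0):
    # peak signal-to-noise ratio in db; higher means closer to the reference
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    return float('inf') if mse == 0 else 10.0 * math.log10(max_val ** 2 / mse)

def ssim(x, y, max_val=255.0):
    # single-window (global) ssim; stabilising constants as in wang et al.
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx * mx + my * my + c1) * (vx + vy + c2))
```

identical images give an infinite psnr and an ssim of 1; any difference lowers both scores.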
2 (b), top section. however, the output colours of the background sometimes differed from the target images. also, some of the environments contained patches and tiling artefacts, as shown in fig. 2, bottom section, which is understandable given the small number of training images. next, we trained the model with stonehenge images. this scene is also lit with a single light, but without the variable intensity. the camera rotates 360 degrees around the y axis. the total image number is 500: 300 for training and 200 for testing and validation. we started with the question: can we generate shadows for non-shadowed images that are not seen during training? we approached it by designing the colour label to our specific need. in the validation step we fed images with no shadows, as in fig. 3 (c), paired with colour labels that contain the correct shadows (a). as the results in fig. 3 (b) show, the model translated the correct shadow, with the appropriate amount of drop-off, onto the input image. next we explored generating only accurate shadows, fig. 4 (b), for non-shadowed images (c), which is accomplished by constructing the colour map to contain only shadows (a), while training the network with fully shadowed images as previously, paired with shadowed labels. the results show accurate translation of shadow direction and intensity (see fig. 4). for the third and final set of experiments, we used two render setups of the stanford dragon, one for training, the second for testing and validation. the camera setup rotates 360 degrees around the dragon's y axis, with a different step angle for training and testing. also, the camera is elevated 30 degrees across the x axis to show a more complex and elevated view of the dragon model rather than an orthographic one. the image set is composed of 1600 training images, 600 testing images and 800 for validation.
the training set (shown in table 1) is broken into multiple categories, each represented by 200 images, sometimes with overlapping features. these images help us understand how the input image/label pair affects the behaviour of the output images. for example, if we feed a standard colour map paired with a coloured image, will the model translate the colours across, or will the output keep the standard colour? answering this better informs the training process for future applications. all networks are trained from scratch using our image sets, and the weights are initialised from a gaussian distribution with mean 0 and standard deviation 0.02. the test image set consisted of 600 images not seen in the training phase. during this stage, some of the features learnt from the training set are tested and the weights are adjusted accordingly. for validation, a set of 800 images not seen in the training set was used; they are derived from the testing image set, but have been modified heavily. the objective here is to test the ability to generate soft and hard shadows from unseen label images, as well as colour shadows, partial shadows, and shadow generation for images with no shadows. also, in some cases we have taken the image set to extremes in order to generate shadows for images that have a different orientation and colour map than their original labels. from here, the experiments progressed in three phases. first, we trained the model with the focus on standard labels to produce soft and hard shadows, using an image set of 1000 images, 400 of which are dedicated to standard colours and shadows. the remaining 600 images are an arbitrary collection of the cases mentioned in the training section (table 1), with similarly arbitrary testing and validation image sets of 300 images each. in this experiment, the standard colours and shadows provide a comprehensive 360-degree view of the 3d model.
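the weight initialisation mentioned above is straightforward to sketch; the function name and the seeded generator are our additions, included so the draw is reproducible:

```python
import random

def init_weights(n_out, n_in, mean=0.0, std=0.02, seed=None):
    # every weight drawn from a gaussian n(mean, std), as used for all networks
    rng = random.Random(seed)
    return [[rng.gauss(mean, std) for _ in range(n_in)] for _ in range(n_out)]
```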
while purposefully choosing arbitrary coloured samples of 10-25 images, we created 6 colour variations for both the images and their respective colour labels. our first objective was to observe whether the model can generalise colour information into texture, meaning filling in the details of the image from what has been learnt from the 360-degree view and overlaying the colour information on top of it. even though the general details can be seen on the models, there were heavy artefacts, such as blurring and tiling, in some of the coloured images with fewer training images. with that knowledge in mind, a second experiment was carried out. we adjusted the image set to reduce the tiling effect, which is mainly due to the lack of sufficient training images for specific cases. hence, the number of training images was increased to 200 per case, bringing the training set to 1600 images. in the validation image set, we pushed the code to extreme cases, such as pairing images with different colour maps and different directions, as well as partial and non-shadowed images. thus the validation set accumulated to 800 images; we assumed beforehand that we would get the same tiling effect as in the previous experiment in cases with different angles or different colours. the significant training time is one of the main challenges that we face, as the training time for our set of experiments ranged between 3 days and one week using our laptop, which is considered long for training on 1600 images.

fig. 5. the initial phase of the third set, which shows direct translation tasks. our focus here is mainly the ability to generate believable soft and hard shadows, as well as inpainting for missing patches.

this is why our image set was limited to 256×256 pixels. for this work, we overcome the issue by augmenting the output image with a high-resolution render. once trained, however, with some optimisation the model should be capable of real-time execution, but we have not tested this.
the two biggest limitations of this method are, first, that it is still a semi-supervised model that needs a pair of a colour map and a target image, and second, that the colour maps and image pairs are manually created, which makes the process labour intensive. these issues should be considered in future work. our method performed well in almost all cases, with minimal to no errors and sharp image reproduction, especially when faced with direct translation tasks, such as fig. 5, and colour variations, fig. 6. even with partial and non-shadowed images, the colours remained consistent and translated correctly across most outputs. this is promising given the relatively small training set (approximately 200 images per case) we have used. examining fig. 7, we notice that the model generalised correctly in most cases even though the colour maps are non-synchronised. this means our method has the breadth to interpret correctly where the training set falls short. however, it tends to take a more liberal approach in translating colour differences between the label and the target, with a bias towards the colour map. this was also visible in partial images and non-shadowed images, as well as for soft and hard shadows. the network struggled mostly when more than two parameters were changed: for example, a partial image of a non-shadowed model will translate well, but a partial image with a non-synchronised shadow and position will start to show tiling in the output image. the model seems to struggle more with position switching than with any other change, especially when also paired with a non-synchronised colour map. this usually manifests in the form of noise, blurring and tiling (see fig. 7), while the colours remain consistent and true to the training images, and the shadows are correct in shape and intensity but are produced with noise and tiling artefacts. we conducted our quantitative assessment by applying the similarity metrics psnr and ssim [22], and it confirms the previous observations.
looking at table 2, the lowest scores were in the categories with non-synchronised image pairs, such as categories 3 and 19, while the image pairs that were approximately present in both training and testing performed the highest, namely categories 4 and 5, with overall performance leaning towards the higher end of the score spectrum. table 2. this table shows how each category performed in both the psnr and ssim similarity indices between output images and the corresponding ground-truth images. this paper explored a framework based on conditional gans, using a pix2pix tensorflow port, to perform computer graphics functions by instructing the network to generate shadows for 3d-rendered images given training images paired with conditional colour labels. to achieve this, a variety of image sets were created using an off-the-shelf 3d program. the first set targeted soft and hard shadows under standard conditions and coloured labels and backgrounds, using 6 colour variations in the training set to test different variations, such as partial and non-shadowed images. this image set consisted of 1000 training images and 600 images for testing and validation. the results were plausible in most cases but showed clear blurring and tiling for coloured samples that did not have enough training images paired with them. next, we updated the image set to 3000 images, with 1600 training images, providing an equal number of training images for each of the 8 cases. we used 600 images for testing and 800 for validation, which included more variations such as partial and non-shadowed images. in the validation set, the images included extreme cases such as non-synchronised pairing of position and colour. the results were believable in all cases except the extreme ones, which resulted in tiling and blurring. the results are promising for shadow generation, especially when the model is challenged to produce accurate partial shadows from the training image set.
the model reliably interprets and produces successful output for images not seen during the training phase, except when they are paired with different colours or viewpoints. however, there are still challenges to resolve: for example, the model requires a relatively long time to be trained, and the output images still suffer from minor blurriness. this is only a proof of concept. the next logical step is to optimise the process by training the model to create highly detailed renders from lower poly-count models. this could also be tested with video-based models such as hdgan, although flickering results are expected due to its learning nature and the current state of the art. another direction of interest may be to automate the generation of colour maps from video or a live feed, such as the work in [29]. the main challenge, however, is the computational complexity, especially for higher-resolution training.

seeing 3d chairs: exemplar part-based 2d-3d alignment using a large dataset of cad models
a hierarchical volumetric shadow algorithm for single scattering
me and my (fake) shadow
learning to generate chairs, tables and cars with convolutional networks
learning to generate chairs with convolutional neural networks
proceedings of the 2010 acm siggraph symposium on interactive 3d graphics and games
image-to-image translation with conditional adversarial networks
a machine learning approach for filtering monte carlo noise
learning to discover cross-domain relations with generative adversarial networks
deep convolutional inverse graphics network
misgan: learning from incomplete data with generative adversarial networks
unsupervised image-to-image translation networks
few-shot unsupervised image-to-image translation
conditional generative adversarial nets
instagan: instance-aware image-to-image translation
deep shading: convolutional neural networks for screen-space shading
conditional image synthesis with auxiliary classifier gans
image based relighting using neural networks
global illumination with radiance regression functions
unsupervised and semi-supervised learning with categorical generative adversarial networks
generative image modeling using style and structure adversarial networks
image quality assessment: from error visibility to structural similarity
casting curved shadows on curved surfaces
learning a probabilistic latent space of object shapes via 3d generative-adversarial modeling
3d shapenets: a deep representation for volumetric shapes
dualgan: unsupervised dual learning for image-to-image translation
pixel-level domain transfer
precomputed shadow fields for dynamic scenes
unpaired image-to-image translation using cycle-consistent adversarial networks
gaussian material synthesis

key: cord-027201-owzhv0xy title: advantage of using spherical over cartesian coordinates in the chromosome territories 3d modeling date: 2020-06-15 journal: computational science iccs 2020 doi: 10.1007/978-3-030-50417-5_49 sha: doc_id: 27201 cord_uid: owzhv0xy

this paper shows the results of chromosome territory modeling in two cases: when the implementation of the algorithm was based on cartesian coordinates, and when the implementation was made with spherical coordinates. the article presents a summary of measurements of the computational times of the simulation of the chromatin decondensation process (which constitutes the chromosome territory). initially, when the implementation used cartesian coordinates, the simulation took a lot of time to create a model (mean 746.7[sec], median 569.1[sec]), additionally required restarts of the algorithm, and often exceeded the acceptable (given a priori) time for the computational experiment. because of that, the authors attempted changing the coordinate system to spherical coordinates (in a few previous projects this led to improved efficiency of implementation).
after changing the way a 3d point is represented in 3d space, the time required to make a successful model was reduced to a mean of 25.3[sec] with a median of 18.5[sec] (alongside a lower number of necessary algorithm restarts), which gives a significant difference in the efficiency of model creation. therefore we showed that the more efficient way of implementation was the usage of spherical coordinates. computational power gives very powerful support to the life sciences today. a lot of experiments can be done in-silico -they are cheaper to conduct, their parameters can be easily modified, and they are also in most cases reproducible and ethical (no harming of living creatures). according to [1], the term modeling is defined as "to design or imitate forms: make a pattern" or "producing a representation or simulation", and a model is defined as "a system of postulates, data, and inferences presented as a mathematical description of an entity or state of affairs". in fact, in the case of in-silico experiments the more precise terms would be "computational model" and "computational modeling", but in this paper they will be referred to in the shorter form. sometimes the term "modeling" is used in the sense of "running a computational model" -in this paper, that will be referred to as "simulation". for many disciplines creating a model is important -it allows one to re-scale (extend or reduce) an object, slow down or speed up the modeled process, and examine almost any aspect of the object or process (separating parameters or taking into account a given number of parameters). with the use of computers, it is also possible to make visualizations and animations. this paper describes some aspects of modeling a process that occurs in all organisms -precisely speaking, in almost every living cell. this process is also occurring right now -in my body while i'm writing this text, as well as in yours -as you read this. it is the process of transferring genetic material, dna, during cell division.
this process is difficult to examine -we can only observe living cells for a relatively short time. another difficulty here is its microscale -to observe it we have to use microscopes. and, besides the scale, when we want to examine the interior of a cell -we have to destroy (and kill) it . . . there are attempts to create an "artificial" [2] (or "synthetic" [3]) cell, but this is not an easy task. to face this, using a "divide and conquer" strategy, there are attempts to create models of certain cell components and processes. this paper shows some new knowledge that we discovered while trying to model chromosome territories (ct's), being the final result of modeling and simulating the chromatin decondensation (cd) process, and documents some problems (and the way we took to solve them) in making a working model. some time ago we were asked if we could help in the creation of a probabilistic model of ct's (in short -ct's are the distinct 3d spaces occupied by each chromosome after cell division, see also sect. 1.2). we agreed, and something that we supposed to be a project for a few months of work became a true mine of many different problems to be solved. the first one we focused on was the problem of creating an appropriate model of chromatin and a model of the chromatin decondensation process (to be able to implement and simulate this process) in the phase just after cell division. in eukaryotic cells, genetic material is not stored in the well-known form of a helix, because the dna strand is too long (and too vulnerable to damage). it is stored as a complex of the dna strand and proteins -altogether called chromatin -which is rolled up in a very sophisticated way [5]. this allows it to take much less space and store the dna untangled. it probably also helps in preventing random breaks and changes in dna sequences. research concerning chromatin organization is important because of its influence on gene transcription [6].
there are several levels of chromatin organization (depending on the level of packing) ([4,7]). the two extreme levels of packing are the condensed and decondensed ones [11]. the one somewhere in between the extremes, which we are interested in, is called euchromatin. this level of organization is often referred to as "beads on a strand" (see fig. 1). the level of chromatin condensation depends on different factors. it can be the cell state (during the cell cycle), but it is also known that it can be controlled by epigenetic modifications [8] or biological processes [10]. the risk of dna damage [9] or modification varies depending on the chromatin condensation level. during cell division, chromatin fibers condense into structures called chromosomes. in the period between two subsequent divisions, called interphase, chromosomes decondense and occupy 3-d areas within the nucleus. those distinct areas -called "chromosome territories" (ct's) -are regarded as a major feature of nuclear structure ([12,22]). chromosome territories can be visualized directly using in-situ hybridization with fluorescently labeled dna probes that specifically paint individual chromosomes ([18,20]). research concerning ct's includes studying the relationship between the internal architecture of the cell nucleus and crucial intranuclear processes such as regulation of gene expression, gene transcription or dna repair ([17,19,21]). those studies are related to spatial arrangement, dynamics (motion tracking) [13], the frequency of genomic translocations ([14]) and even global regulation of the genome [15]. the possibility of making experiments in-silico would speed up research and make some of the experiments easier and cheaper. euchromatin was our starting point for modeling the chromatin structure: we decided to model chromatin (and the arms of chromosomes) as a sequence of tangent spheres (fig. 2) -visually very similar to euchromatin (see fig. 1).
because euchromatin is observed as "beads on a strand", and beads (sometimes also called "domains") are its basic structural units, we decided to make a single sphere the basic component of our chromatin chain (and the basic structural unit building up cts). this also makes our model scalable -by changing the size of the sphere we can easily change the level of chromatin packing. a sphere can also be easily rendered as a graphical primitive in most graphics libraries, which was very important to guarantee the possibility of later ct visualization. our modeling process was very closely related to geometrical, visible objects, because it was very important that the final models could be visualized -to allow visual comparison with real images from confocal microscopy. we also decided to model the decondensation process by adding tangent spheres around existing ones. this results in gradually expanding the volume of the initial strand of spheres. the process continues until a stop condition is met (the volume or size of the decondensed chromatin). the computational problem was as follows: starting from the initial (condensed) chromatin model (in the "beads-on-a-strand" form), consisting of a sequence of mutually tangent spheres, find coordinates for the next n spheres (where n denotes the size -the number of beads of chromatin after decondensation). geometrically it is the problem of finding (x, y, z), the center of a new sphere, with the conditions of being tangent to the previous one and not colliding with any other (previously generated) sphere. our first goal was to make a fully probabilistic model -meaning that we do not add extra conditions like the position of centromeres, telomeres, nucleoplasm stickiness and so on (extending the model and making it more "real data-driven" is in our current field of interest and research).
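the geometric conditions above -tangency to the previous sphere and no collision with any previously generated one -can be expressed in a few lines. the class and function names are ours; this is a minimal sketch rather than the implementation used in the paper:

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Sphere:
    x: float
    y: float
    z: float
    r: float

def dist(a, b):
    # euclidean distance between sphere centers
    return math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))

def is_tangent(a, b, eps=1e-9):
    # externally tangent: center distance equals the sum of radii
    return abs(dist(a, b) - (a.r + b.r)) <= eps

def collides(candidate, placed, eps=1e-9):
    # overlaps any already-placed sphere (mere tangency does not count)
    return any(dist(candidate, s) < candidate.r + s.r - eps for s in placed)
```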
the modeled process of decondensation can to some degree be regarded as a markov process -the subsequent state i + 1 of decondensation strictly depends on the previous one, i. the very basic component of our model is a sphere s((x, y, z), r). this notation should be read as a sphere s with its center at the point with coordinates (x, y, z) and a radius of length r (r ≥ 0). an ordered chain of spheres makes up our model of a chromosome; a set of indexed spheres makes a model of a ct. the very general algorithm for ct modeling is presented in algorithm 1. lines 4 and 5 reflect creating the initial chromatin strand, line 6 the simulation of decondensation. altogether, they lead to the generation of the model of a certain ct. the last step of the algorithm (line 6) proved to be the most demanding and challenging, which is described in the next section. in the following section, we document the way we took to successfully make the probabilistic model of ct's. at first, we used cartesian coordinates (cc). first, the algorithm generates coordinates for the sphere denoted as the centromere, and then adds subsequent spheres to it until it reaches the (given a priori) length of the arms for a certain chromosome. having the model of the entire chromosome, the algorithm draws an id of one of the present spheres s_i((x_i, y_i, z_i), r) (from those composing the chromosome) and then draws "candidate coordinates" x_{i+1}, y_{i+1} and z_{i+1} for the center of the next sphere. the new coordinates are drawn from a limited range -not too far from the current sphere's center (as the spheres should be tangent). to allow a little flexibility, an ε value was added to the drawn coordinates. when the coordinates had been drawn, the distance dist(s_i, s_{i+1}) was calculated to check whether the new sphere could be added; the distance was the ordinary euclidean distance. if dist(s_i, s_{i+1}) was appropriate, the conditions of not colliding with existing elements were checked.
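the decondensation step (line 6 of algorithm 1) can be sketched as a generic growth loop; the names and the pluggable place_fn (a function that returns a new center or None) are our assumptions for illustration:

```python
import random

def decondense(chain, n_new, place_fn, seed=None):
    # grow the model: repeatedly pick an existing sphere at random and try to
    # attach a new tangent sphere next to it, until n_new spheres were added
    rng = random.Random(seed)
    spheres = list(chain)
    target = len(chain) + n_new
    while len(spheres) < target:
        anchor = rng.choice(spheres)
        candidate = place_fn(anchor, spheres)
        if candidate is not None:
            spheres.append(candidate)
    return spheres
```

each accepted sphere depends only on the current set of spheres, which is the markov-like property noted above.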
if all conditions were met, the new sphere was added (for details see [26]). there were no problems with the generation of the initial chromatin strand as a sequence of spheres (the chromosome). the problem emerged when we tried to simulate the decondensation of chromatin: generating a model took a lot of time, and we noticed that the simulation was sometimes unsuccessful. we discovered (after log analysis) that the algorithm got stuck trying to find coordinates for s_{i+1}. so, we added an additional function that triggers a restart of the algorithm after 500 unsuccessful attempts at placing the s_{i+1} sphere (see algorithm 2, lines 12-14). if s_{i+1} cannot be placed, the algorithm starts over and searches for the possibility of adding s_{i+1} next to another sphere forming the chromosome. the pseudocode for this version of the algorithm is shown in algorithm 2. in the first step it generates the "candidate coordinates" for the center of s_{i+1} (algorithm 2, lines 3-6). thanks to ε there is a possibility that the new sphere may be a little too far from, or too close to, the previous s_i; the fine-tuning is done by an additional function that checks the distance from the previous sphere (algorithm 2, lines 7-8). the additional code for stuck detection that triggers restarting the computations is in algorithm 2, lines 12-14.
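a sketch of this cartesian candidate search, with the 500-attempt cutoff whose exhaustion triggers a restart; the function name and the explicit rng argument are ours:

```python
import math
import random

def try_place_cartesian(prev, placed, r, eps, rng, max_attempts=500):
    # draw candidate coordinates around the previous sphere's center, accept
    # when (approximately) tangent and collision-free; None signals the caller
    # to restart with another anchor sphere
    for _ in range(max_attempts):
        cand = tuple(c + rng.uniform(-(2 * r + 2 * eps), 2 * r + 2 * eps)
                     for c in prev)
        d = math.dist(prev, cand)
        if not (2 * r - eps <= d <= 2 * r + eps):
            continue  # outside the eps band around exact tangency
        if all(math.dist(cand, s) >= 2 * r - 1e-9 for s in placed if s != prev):
            return cand
    return None
```

most draws fall outside the thin tangency band, which is exactly the inefficiency reported in tables 1 and 2.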
time was measured in seconds, basic statistics were also given, measurements were made on 40 generated models. that was not a satisfactory result. we had to rethink the way we implement the decondensation of chromatin. we decided to try to add -at first sightadditional computations: shifting (change location) of the center of coordinate systems. then we were able to use the notion of the neighborhood with a fixed radius (inspired by a topology) and use spherical coordinates (see fig. 4 ). we were aware of the fact that shifting the coordinate system takes additional time -but the solution with cc works so bad, that we hope that this approach will work a little better. the result of this change beats our expectations -which is described and documented in the next sections. we decided to try spherical coordinates (sc) [27] instead of cc (for those, who are not familiar with different coordinate systems we recommend to take a look at [28] , [29] ). when we wanted to add sphere s i+1 to the certain one s i , we first made a shift of the center of the coordinate system in such a way, that the center of coordinate system was situated in the middle of s i sphere (see fig. 4 ). this let us search for the s i+1 by drawing two angles and using just one parameter: 2r. after switching to the sc, we got rid of the problem of looping the simulation during attempts of finding the location for the s i+1 . therefore, the function that restarts ct model creation could be removed. we made measurements -time necessary to generate ct models (equivalent to the time of cd simulation) with shifting coordinate system and using spherical coordinates is presented in table 3 . time of creating ct models decreases significantly in comparison to the use of cc. this had a direct and significant impact on the time of the model creation. 
the candidate-generation step in spherical coordinates reads:

    new_sphere.x = previous_sphere.x + 2·r·cos(ψ)·sin(φ)
    new_sphere.y = previous_sphere.y + 2·r·sin(ψ)·sin(φ)
    new_sphere.z = previous_sphere.z + 2·r·cos(φ)
    check: is inside nucleus

to follow the rigor expected of scientific publications (despite the very clear difference between the times shown in table 2 and table 3), we made the analysis presented in this section. for a visual comparison of the ct model creation times we prepared a boxplot (see fig. 5 ) for a general view. in fig. 5 the difference is easy to notice: not a single element of the chart (neither whiskers nor dots (outliers)) overlaps with the other. it is easy to see a huge difference between the computing time (and its stability) in the two cases. for the record, we made a statistical test; the result is presented in table 4. we calculated a t-test to confirm that the difference in model creation times (cc vs. sc) is statistically significant (a p-value below 0.05 means that the difference is statistically significant). this confirms the statistically significant difference in modeling time between the two described methods. based on the results presented in this paper we can conclude that when modeling in 3d space, using spherical coordinates may lead to a more efficient implementation of the algorithm, even when the center of the coordinate system has to be shifted. the solution using the euclidean distance in the cartesian coordinate system was much more time-consuming. what is more important, it often does not finish the modeling process in an acceptable time (sometimes we had to break a simulation after 3 weeks of computing on a computer with 16 gb ram and an i5 processor), if it finishes at all (i.e., does not get stuck). as future work, knowing that a spherical coordinate system is helpful, we want to examine the effectiveness of a quaternion-based implementation as a way to represent coordinates in 3d space.
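for comparison, the spherical-coordinate (sc) step can be sketched as follows. drawing the two angles and using the fixed distance 2r makes every candidate exactly tangent to the previous sphere, so no rejection loop is needed for tangency (the nucleus-boundary and overlap checks of the original algorithm are omitted here, and all names are illustrative).

```python
import math
import random

R = 1.0  # sphere radius (assumed)

def place_next_spherical(prev):
    """Place a sphere exactly tangent to `prev`: conceptually shift the
    coordinate origin to the center of `prev`, draw two angles, and use
    the fixed radial distance 2*R."""
    phi = math.acos(random.uniform(-1.0, 1.0))  # polar angle
    psi = random.uniform(0.0, 2.0 * math.pi)    # azimuthal angle
    return (prev[0] + 2 * R * math.cos(psi) * math.sin(phi),
            prev[1] + 2 * R * math.sin(psi) * math.sin(phi),
            prev[2] + 2 * R * math.cos(phi))

random.seed(1)
chain = [(0.0, 0.0, 0.0)]
for _ in range(9):
    chain.append(place_next_spherical(chain[-1]))
```

a side note on the sampling: drawing cos(φ) uniformly (as above) makes the direction uniform on the sphere, whereas drawing φ itself uniformly would bias candidates toward the poles; the original algorithm does not specify which variant was used.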
we also want to check, in a more detailed way, what has an impact: only changing the center of the coordinate system, only changing the way points are represented, or both. because it is not the first time we have noticed a significant improvement after using spherical (or hyperspherical, in more dimensions) coordinates instead of cartesian ones, we plan (after finishing current projects with deadlines) to design and conduct a separate experiment. we want to investigate, in a more methodical and ordered way, the question: why do spherical coordinates give better results in computational implementations? our case study also shows that geometrical and visual thinking while modeling in 3d space can be helpful. with "pure algebraic" thinking (based on calculations on coordinates), finding the idea, namely to search in the neighborhood, shift the center of the coordinate system, and then use directions (angles) and a fixed distance, would have been more difficult (if possible at all).

references (titles recovered from extraction):
- synthetic cell project
- the genome: seeing it clearly now
- maternal prenatal stress and the developmental origins of mental health
- the epigenome and developmental origins of health and disease
- multiple aspects of gene dysregulation in huntington's disease. front. neurol
- nucleosome positioning in saccharomyces cerevisiae
- chromatin condensation modulates access and binding of nuclear proteins
- activation of dna damage response signaling by condensed chromatin
- nucleoplasmin regulates chromatin condensation during apoptosis
- chromosome condensation and decondensation during mitosis
- chromosome territories
- how do chromosome territory dynamics affect gene redistribution? https://www.mechanobio.info/genome-regulation/how-do-chromosome-territory-dynamicsaffect-gene-redistribution
- chromosome territory formation attenuates the translocation potential of cells.
- chromosome territories and the global regulation of the genome
- chromatin spheres and the interchromatin compartment form structurally defined and functionally interacting nuclear networks
- inheritance of gene density-related higher order chromatin arrangements in normal and tumor cell nuclei
- chromosome territories, nuclear architecture and gene regulation in mammalian cells
- nuclear organization of the genome and the potential for gene regulation
- fluorescence in situ hybridization with human chromosome-specific libraries: detection of trisomy 21 and translocations of chromosome 4
- spatial organization of the mouse genome and its role in recurrent chromosomal translocations
- cell biology: chromosome territories
- chromosome territories: a functional nuclear landscape
- chromosome territories: the arrangement of chromosomes in the nucleus
- chromosome territory modeler and viewer
- spherical coordinates. from math insight
- cartesian, polar, cylindrical, and spherical coordinates

key: cord-119104-9d421si9 authors: huynh, tin van; nguyen, luan thanh; luu, son t. title: banana at wnut-2020 task 2: identifying covid-19 information on twitter by combining deep learning and transfer learning models date: 2020-09-06 journal: nan doi: nan sha: doc_id: 119104 cord_uid: 9d421si9

the outbreak of the covid-19 virus has had a significant impact on the health of people all over the world. therefore, it is essential that everyone has access to constant and accurate information about the disease. this paper describes our prediction system for wnut-2020 task 2: identification of informative covid-19 english tweets. the dataset for this task contains 10,000 english tweets labeled by humans. the ensemble of our three transformer and deep learning models is used for the final prediction. the experimental results indicate that our system achieved an f1 of 88.81% for the informative label on the test set.
the rapid spread of the coronavirus has caused a global health crisis. this virus is hazardous to people's health and has caused great panic all over the world. statistics show that each day there are 4 million tweets related to covid-19 on twitter (lamsal, 2020) . therefore, it is essential to keep track of the information associated with this disease. social networking platforms such as twitter and facebook have become a primary way for people to follow information about covid-19 regularly. however, much content appears daily on these social media platforms, and most of it does not carry information about the status of covid-19, such as the number of suspected cases or cases near the user's area. in this article, we present our approach to wnut-2020 task 2: identifying whether or not tweets on the social networking platform twitter contain information about covid-19. a tweet is considered informative if it includes information such as recovered, suspected, confirmed, and death cases, or the location or travel history of the patients. specifically, we describe the problem as follows.

• input: given english tweets from the social networking platform.
• output: one of two labels (informative and uninformative) predicted by our system.

several examples are shown in table 1:

tweet | label
a new rochelle rabbi and a white plains doctor are among the 18 confirmed coronavirus cases in westchester. httpurl | 0
day 5: on a family bike ride to pick up dinner at @user broadway, we encountered our pre-covid-19 land park happy hour crew keeping up the tradition at an appropriate #socialdistance. httpurl | 1

in this paper, we make two main contributions.

• firstly, we implemented four different models based on neural networks and transformers, namely bi-gru-cnn, bert, roberta, and xlnet, to solve wnut-2020 task 2: identification of informative covid-19 english tweets.
• secondly, we propose a simple ensemble model combining multiple deep learning and transformer models. this model gives the highest performance compared with the single models, with an f1 of 88.81% on the test set and 90.65% on the development set.

during the covid-19 pandemic, information about the number of infected cases and patients is vital for governments. dong et al. (2020) constructed a real-time database for tracking covid-19 around the world. this dataset is collected by experts from the world health organization (who), the us cdc, and other medical agencies worldwide, and is operated by johns hopkins university. there are also many other covid-19 datasets, such as multilingual data collected on twitter from january 2020 (chen et al., 2020) or the real world worry dataset (rwwd) (kleinberg et al., 2020) . besides, on social media the spreading of covid-19 information is extremely fast and massive, and sometimes leads to misinformation. shahi et al. (2020) conducted a pilot study on detecting misinformation about covid-19 on twitter by analyzing tweets using standard social media analytics techniques. with these research results, the authors want to help authorities and social media users counter misinformation. moreover, the rumors and conspiracy theories that emerged while covid-19 was spreading made communities fearful and panicky, which led to racism against covid-19 patients and citizens of infected countries, as well as mass purchases of face masks and shortages of necessities, according to depoux et al. (2020) . thus it is necessary to identify correct information in social media text. the dataset provided for the task contains 10,000 english tweets about covid-19, which is used to automatically identify whether a tweet contains useful information about covid-19 (informative) or not (uninformative).
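the paper does not spell out the exact combination rule of the ensemble; a minimal majority-vote sketch over per-model label predictions could look like the following (the model outputs are made up for illustration):

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model label predictions for each tweet by majority vote."""
    combined = []
    for labels in zip(*predictions):  # one tuple of model labels per tweet
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined

# hypothetical outputs of three models on five tweets (1 = informative)
bigru_cnn = [1, 0, 1, 0, 1]
bert_pred = [1, 1, 1, 0, 0]
xlnet_pred = [0, 1, 1, 0, 1]
final = majority_vote([bigru_cnn, bert_pred, xlnet_pred])  # → [1, 1, 1, 0, 1]
```

with an odd number of models, ties cannot occur, which is one practical reason to ensemble three (or five) classifiers rather than two.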
there are 4,719 informative tweets and 5,281 uninformative tweets in the dataset, and each tweet is annotated by three different annotators. the inter-annotator agreement of the dataset, calculated by fleiss' kappa score, is 81.80%. the dataset is split into training, development, and test sets in proportion 7-1-2. table 2 shows overview information about the dataset:

split | informative | uninformative
training | 3,303 | 3,697
development | 472 | 528
test | 944 | 1,056

in this paper, we propose an ensemble method that combines deep learning models with transfer learning models to identify information about covid-19 in users' tweets. we implement the bi-gru-cnn model, which was used in prior work for salary prediction and for job prediction (van huynh et al., 2020), with the glove-300d word embedding (pennington et al., 2014) . this model consists of three main layers: the word representation layer (word embedding), the 1d convolutional layer (conv-1d), and the bidirectional gru layer (bi-gru). this model also achieved high performance in previous studies (van huynh et al., 2019). fig 1 illustrates the bi-gru-cnn model. we were inspired by the success of transfer learning on many nlp tasks such as text classification (do and ng, 2006; rizoiu et al., 2019) and machine reading comprehension (devlin et al., 2019) . in this paper, we use sota transfer learning models such as bert (devlin et al., 2019) , roberta (liu et al., 2019) , and xlnet (yang et al., 2019) with fine-tuning techniques for the problem of identifying informative tweets about covid-19. in our experiments, we used the pre-trained language models described in table 3. all of these pre-trained models are built on english texts. motivated by the success of ensemble models in previous tasks, we also combine our single models into an ensemble. in this study, we experimented with the dataset provided by wnut-2020 task 2. the training, development, and test sets are divided as described in section 3. to evaluate our models, we use four metrics: accuracy, precision, recall, and f1.
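the four evaluation metrics reduce to simple counts over the predictions; a self-contained sketch for the binary (informative vs. uninformative) case, with an illustrative toy example, could look like this:

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for the positive (informative) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    acc = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return acc, precision, recall, f1

# toy example: 5 tweets, gold vs. predicted labels
acc, prec, rec, f1 = binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
# acc = 0.6, precision = recall = f1 ≈ 0.667
```

note that f1 is reported for the informative class only, which matches the task's official evaluation.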
to prepare the data for the model training and model evaluation phases, we perform simple and effective pre-processing of the input data as follows:

• step 1: converting the tweets into lowercase strings.
• step 2: removing the user names in the tweets.
• step 3: deleting all urls in the tweets.
• step 4: representing words as vectors with pre-trained word embedding sets for the deep neural network models.

based on an analysis of the tweet lengths in the data, we set the max length of the models to 512, and the number of epochs to 15 for the bi-gru-cnn and xlnet models and to 3 for the bert and roberta models. after an extensive hyperparameter search, we set the learning rate to 1e-3 and dropout to 0.2 for the bi-gru-cnn model, and the learning rate to 1e-5 and dropout to 0.1 for the bert, roberta, and xlnet models. experimental results of the single models and the ensemble model on the development set are presented in table 4. among the single models, the bi-gru-cnn model gives the lowest performance, with 85.66% f1 and 86.10% accuracy. the single model with the highest performance is xlnet, which attains 89.86% f1 and 90.30% accuracy. in addition, the bert model gives the highest precision with 89.53%, and the roberta model achieves the highest recall with 90.74%. in particular, our proposed ensemble model gives the best performance by combining the strengths of the single models, accomplishing 90.65% f1, 91.00% accuracy, and 92.37% recall, according to table 4. specifically, our model improves on the best single model (xlnet) by 0.79% f1 and 0.70% accuracy, and on the roberta model by 0.63% recall. after the system evaluation of wnut-2020 task 2, table 5 displays our ensemble model's results on the test set. these results are compared with the results of the top 5 teams and the baseline model (baseline-fasttext).
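steps 1-3 of the pre-processing can be sketched with a couple of regular expressions; the patterns below are our own simplification (the task data replaces real urls with the token "httpurl", so the sketch removes both raw urls and that token):

```python
import re

def preprocess(tweet):
    """Steps 1-3: lowercase, strip @user mentions, strip URLs."""
    t = tweet.lower()                            # step 1: lowercase
    t = re.sub(r"@\w+", "", t)                   # step 2: user names
    t = re.sub(r"https?://\S+|httpurl", "", t)   # step 3: URLs and the httpurl token
    return " ".join(t.split())                   # collapse leftover whitespace

cleaned = preprocess("Day 5: bike ride with @USER https://t.co/abc HTTPURL")
# → "day 5: bike ride with"
```

step 4 (mapping words to pre-trained embedding vectors) is model-specific and is therefore not shown here.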
our model achieves an f1 of 88.81%, 2.15% lower than the first-ranked team and 13.78% higher than the baseline model. as for accuracy, we obtain 89.40%, 2.10% lower than the first-ranked team and 12.10% higher than the baseline model. in addition, table 6 displays some erroneous prediction examples from the dataset. most of the wrong predictions occurred because of the appearance of special characters such as hashtags and the httpurl phrases, which stand for the url links in the tweets. for informative tweets, the appearance of the httpurl phrase and the hashtag #coronavirus makes the classification model predict the wrong label. the same mistake occurs for uninformative tweets, where the appearance of the httpurl phrase and hashtags related to the coronavirus affects the results of the prediction model. this paper has addressed our work on wnut-2020 task 2: identifying covid-19 information on twitter. we proposed an ensemble model combining deep learning models and transfer learning models for detecting information about covid-19 in users' tweets. our ensemble model achieved 91.00% accuracy and 90.65% f1 on the development set, and 89.04% accuracy and 88.81% f1 on the public test set, ranking #25 in the competition. in the future, we will improve our model's performance by exploring different features of users' tweets and transfer learning models with fine-tuning techniques. finally, we hope our study can be applied in practice for detecting covid-19 information on social networks to support the covid-19 battle all over the world.

references (titles recovered from extraction):
- tracking social media discourse about the covid-19 pandemic: development of a public coronavirus twitter data set
- annelies wilder-smith, and heidi larson. 2020.
- the pandemic of social media panic travels faster than the covid-19 outbreak
- bert: pre-training of deep bidirectional transformers for language understanding
- transfer learning for text classification
- an interactive web-based dashboard to track covid-19 in real time
- measuring emotions in the covid-19 real world worry dataset
- wnut-2020 task 2: identification of informative covid-19 english tweets
- nlp@uit at vlsp 2019: a simple ensemble model for vietnamese dependency parsing
- glove: global vectors for word representation
- transfer learning for hate speech detection in social media
- an exploratory study of covid-19 misinformation on twitter
- hate speech detection on vietnamese social media text using the bi-gru-lstm-cnn model
- job prediction: from deep neural network models to applications
- new vietnamese corpus for machine reading comprehension of health news articles
- salary prediction using bidirectional-gru-cnn model
- xlnet: generalized autoregressive pretraining for language understanding

key: cord-016965-z7a6eoyo authors: brockmann, dirk title: human mobility, networks and disease dynamics on a global scale date: 2017-10-23 journal: diffusive spreading in nature, technology and society doi: 10.1007/978-3-319-67798-9_19 sha: doc_id: 16965 cord_uid: z7a6eoyo

disease dynamics is a complex phenomenon, and in order to address the questions it raises, expertise from many disciplines needs to be integrated. one method that has become particularly important during the past few years is the development of computational models and computer simulations that help address these questions. the focus of this chapter is on emergent infectious diseases that bear the potential of spreading across the globe, exemplifying how connectivity in a globalized world has changed the way human-mediated processes evolve in the 21st century.
the examples of the most successful predictions of disease dynamics given in this chapter illustrate that just feeding better and faster computers with more and more data may not necessarily help in understanding the relevant phenomena. it might be much more useful to change the conventional way of looking at the patterns and to assume a correspondingly modified viewpoint, as most impressively shown by the examples given in this chapter. in early 2009, news accumulated in major media outlets about a novel strain of influenza circulating in major cities in mexico [1] . this novel h1n1 strain was quickly termed "swine flu", in reference to its alleged origin in pig populations before jumping the species barrier to humans. very quickly, public health institutions were alerted and saw the risk of this local influenza epidemic becoming a major global public health problem. the concerns were serious because this influenza strain was of the h1n1 subtype, the same virus family that caused one of the biggest pandemics in history, the spanish flu, which killed up to 40 million people at the beginning of the 20th century [2] . the swine flu epidemic did indeed develop into a pandemic, spreading across the globe in a matter of months. luckily, the strain turned out to be comparatively mild in terms of symptoms and as a health hazard. nevertheless, the concept of emergent infectious diseases, novel diseases that may have dramatic public health, societal and economic consequences, reached a new level of public awareness. even hollywood picked up the topic in a number of blockbuster movies in the following years [3] . only a few years later, mers hit the news: the middle east respiratory syndrome, caused by a new type of virus that infected people in the middle east [4] . mers was caused by a new species of coronavirus of the same family of viruses that the 2003 sars virus belonged to.
and finally, there was the 2013 ebola crisis in the west african countries liberia, sierra leone and guinea, which, although it did not develop into a global crisis, killed more than 10,000 people in west africa [5] . emergent infectious diseases have always been part of human societies, and of animal populations for that matter [6] . humanity, however, underwent major changes along many dimensions during the last century. the world population increased from approx. 1.6 billion in 1900 to 7.5 billion in 2016 [7] . the majority of people now live in so-called mega-cities, large-scale urban conglomerations of more than 10 million inhabitants who live at high population densities [8] , often in close contact with animals, pigs and fowl in particular, especially in asia. these conditions amplify not only the transmission of novel pathogens from animal populations to humans; high-frequency human-to-human contacts also yield a potential for rapid outbreaks of new pathogens. population density is only one side of the coin. in addition to increasing face-to-face contacts within populations, we also witness a change in global connectivity [9] . most large cities are connected by an intricate, multi-scale web of transportation links, see fig. 19 .1. on a global scale, worldwide air transportation dominates this connectivity. approx. 4,000 airports and 50,000 direct connections span the globe. more than three billion passengers travel on this network each year. every day, the passengers traveling on this network accumulate a total of more than 14 billion kilometers, which is three times the radius of our solar system [10, 11] . clearly, this amount of global traffic shapes the way emergent infectious diseases can spread across the globe. one of the key challenges in epidemiology is preparing for eventual outbreaks and designing effective control measures.
evidence-based control measures, however, require a good understanding of the fundamental features and characteristics of spreading behavior that all emergent infectious diseases share. in this context, this means addressing questions such as: if there is an outbreak at location x, when should one expect the first case at a distant location y? how many cases should one expect there? given a local outbreak, what is the risk that a case will be imported into some distant country, and how does this risk change over time?

[fig. 19.1 caption: the global air-transportation network. each node represents one of approx. 4000 airports, each link one of approx. 50000 direct connections between airports. more than 3 billion passengers travel on this network each year. all in all, every day more than 16 billion km are traversed on this network, three times the radius of our solar system.]

also, emergent infectious diseases often spread in a covert fashion during the onset of an epidemic. only after a certain number of cases are reported are public health scientists, epidemiologists and other professionals confronted with cases that are scattered across a map, and it is then difficult to determine the actual outbreak origin. therefore, a key question is also: where is the geographic epicenter of an ongoing epidemic? disease dynamics is a complex phenomenon, and in order to address these questions expertise from many disciplines needs to be integrated, such as epidemiology, spatial statistics, mobility research and medical research. one method that has become particularly important during the past few years is the development of computational models and computer simulations that help address these questions. these are often derived and developed using techniques from theoretical physics and, more recently, complex network science. modeling the dynamics of diseases using methods from mathematics and dynamical systems theory has a long history.
in 1927 kermack and mckendrick [12] introduced and analyzed the "susceptible-infected-recovered" (sir) model, a parsimonious model for the description of a large class of infectious diseases that is still in use today [13] . the sir model considers a host population in which individuals can be susceptible (s), infectious (i) or recovered (r). susceptible individuals can acquire the disease, become infectious themselves, and transmit the disease to other susceptible individuals. after an infectious period, individuals recover, acquire immunity, and no longer infect others. the sir model is an abstract model that reduces a real-world situation to the basic dynamic ingredients that are believed to shape the time course of a typical epidemic. structurally, the sir model treats individuals in a population in much the same way as chemicals that react in a well-mixed container. chemical reactions between reactants occur at rates that depend on which chemicals are involved. it is assumed that all individuals can be represented only by their infectious state and are otherwise identical; each pair of individuals has the same likelihood of interacting. schematically, the sir model is described by the following reactions:

    s + i → 2i   (at rate α),
    i → r        (at rate β),          (19.1)

where α and β are the transmission and recovery rates per individual, respectively. the expected duration of being infected, the infectious period, is given by t = β⁻¹, which can range from a few days to a few weeks for generic diseases. the ratio of rates r0 = α∕β is known as the basic reproduction ratio, i.e. the expected number of secondary infections caused by a single infected individual in a fully susceptible population. r0 is the most important epidemiological parameter because its value determines whether an infectious disease has the potential for causing an epidemic or not. when r0 > 1, a small fraction of infected individuals in a susceptible population will cause exponential growth of the number of infections.
this epidemic rise will continue until the supply of susceptibles decreases to a level at which the epidemic can no longer be sustained. the increase in recovered, and thus immune, individuals dilutes the population and the epidemic dies out. mathematically, one can translate the reaction scheme (19.1) into a set of ordinary differential equations. say the population has n ≫ 1 individuals. for a small time interval δt and a chosen susceptible individual, the probability of that individual interacting with an infected is proportional to the fraction i∕n of infected individuals. because we have s susceptibles, the expected change in the number of susceptibles due to infection is

    δs = −δt × α × s × i∕n,          (19.2)

where the rate α is the same as in (19.1) and the negative sign accounts for the fact that the number of susceptibles decreases. likewise, the number of infected individuals is increased by the same amount, δi = +δt × α × s × i∕n. the number of infecteds can also decrease due to the second reaction in (19.1). because each infected can spontaneously recover, the expected change due to recovery is

    δi = −δt × β × i.          (19.3)

based on these assumptions, eqs. (19.2) and (19.3) become a set of differential equations that describe the dynamics of the sir model in the limit δt → 0:

    ds∕dt = −α s i∕n,
    di∕dt = α s i∕n − β i,          (19.4)
    dr∕dt = β i.

depending on the magnitude of n, one can instead consider a stochastic model in which reactions occur randomly at the rates α and β; such a stochastic system generally exhibits solutions that fluctuate around the solutions of the deterministic system of eq. (19.4) . both the deterministic sir model and the more general stochastic particle-kinetic model are designed to model disease dynamics in a single population; spatial dynamics or movement patterns of the host population are not accounted for. these systems are thus known as well-mixed systems, the analogy being that of chemical reactants that are well stirred in a chemical reaction container, as mentioned above. when a spatial component is expected to be important in a natural scenario, several methodological approaches exist to account for space.
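the deterministic system (19.4) is easy to integrate numerically; the following self-contained python sketch uses a simple forward-euler scheme with illustrative parameter values (α = 0.5 and β = 0.25 per day, i.e. r0 = 2 and an infectious period of 4 days; these numbers are chosen for illustration only, not taken from the chapter).

```python
# Forward-Euler integration of the deterministic SIR equations (19.4).
def simulate_sir(alpha=0.5, beta=0.25, n=1_000_000, i0=10, dt=0.1, days=200):
    s, i, r = float(n - i0), float(i0), 0.0
    peak = i
    for _ in range(int(days / dt)):
        new_inf = alpha * s * i / n * dt   # infections in this time step
        new_rec = beta * i * dt            # recoveries in this time step
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        peak = max(peak, i)
    return s, i, r, peak

s, i, r, peak = simulate_sir()
```

the run reproduces the qualitative behavior described in the text: exponential initial growth while s ≈ n, a peak once the susceptible supply is depleted, and eventual extinction of the epidemic, with s + i + r conserved throughout.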
essentially, the inclusion of a spatial component is required when the host is mobile and can transport the state of infection from one location to another. the combination of local proliferation of an infection and the dispersal of infected host individuals then yields spread along the spatial dimension [13, 14] . one of the most basic ways of incorporating a spatial dimension and host dispersal is by assuming that all quantities in the sir model are also functions of a location x, so the state of the system is defined by s(x, t), j(x, t) and r(x, t). most frequently, two spatial dimensions are considered. the simplest way of incorporating dispersal is by an ansatz following eq. (2.19) in chap. 2, which assumes that individuals move diffusively in space. this yields the reaction-diffusion dynamical system

    ∂s∕∂t = −α s j + d ∇²s,
    ∂j∕∂t = α s j − β j + d ∇²j,          (19.7)
    ∂r∕∂t = β j + d ∇²r,

where, e.g., in a two-dimensional system with x = (x, y), the laplacian is ∇² = ∂²∕∂x² + ∂²∕∂y², and the parameter d is the diffusion coefficient. the reasoning behind this approach is that the net flux of individuals of one type from one location to a neighboring location is proportional to the gradient, i.e. the difference in concentration of that type of individuals between neighboring locations. the key feature of diffusive dispersal is that it is local; in a discretized version, the laplacian permits movements only within a limited distance. in reaction-diffusion systems of this type, the combination of initial exponential growth (if r0 = α∕β > 1) and diffusion (d > 0) yields the emergence of an epidemic wavefront that progresses at a constant speed if the system is initially seeded with a small patch of infected individuals [15] . the advantage of parsimonious models like the one defined by eq. (19.7) is that properties of the emergent epidemic wavefront can be computed analytically; e.g., the speed of the wave in the above system is related to the basic reproduction number and the diffusion coefficient by

    v ∝ √(d β (r0 − 1)),

in which we recognize the relation of eq. (2.17).
another class of models considers the reactions of eq. (19.1) on two-dimensional (mostly square) lattices. in these models each lattice site is in one of the states s, i or r, and reactions occur only between nearest neighbors on the lattice. these models account for stochasticity and spatial extent. given a state of the system, defined by the state of each lattice site, and a small time interval δt, infected sites can transmit the disease to neighboring susceptible sites with probability rate α. infected sites also recover to the r state and become immune with probability βδt. figure 19 .3a illustrates the time course of the lattice-sir model. seeded with a localized patch of infected sites, the system exhibits an asymptotic concentric wavefront that progresses at an overall constant speed if the ratio of transmission and recovery rates is sufficiently large.

[fig. 19.3 caption (excerpt): (b) the system is identical to the system depicted in (a); however, in addition to the generic next-neighbor transmission, with a small but significant probability a transmission to a distant site can occur. this probability decreases with distance as an inverse power-law, with an exponent μ in the range 0 < μ < 2. because of the rare but significant occurrence of long-range transmissions, a more complex pattern emerges; the concentric nature observed in system (a) is gone, and instead a fractal, multiscale pattern emerges.]

without the stochastic effects that yield the irregular interface at the infection front, this system exhibits properties similar to those of the reaction-diffusion system of eq. (19.7). in both systems, transmission of the disease in space is spatially restricted per unit time. the stochastic lattice model is particularly useful for investigating the impact of permitting long-distance transmissions. figure 19 .3b depicts temporal snapshots of a simulation that is identical to the system of fig. 19 .3a apart from a small but significant difference.
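the lattice variant with occasional long-range transmissions can be sketched in a few lines of python. grid size, probabilities and step count below are illustrative choices, not values from the chapter, and for simplicity the long-range target is drawn uniformly at random rather than from a power-law kernel.

```python
import random

# Minimal stochastic lattice SIR sketch (states: 0 = S, 1 = I, 2 = R).
def lattice_sir(L=40, steps=60, p_inf=0.5, p_rec=0.2, p_long=0.01, seed=0):
    rng = random.Random(seed)
    grid = [[0] * L for _ in range(L)]
    grid[L // 2][L // 2] = 1                       # localized initial patch
    for _ in range(steps):
        new = [row[:] for row in grid]
        for x in range(L):
            for y in range(L):
                if grid[x][y] != 1:
                    continue
                # next-neighbor transmission on the lattice
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nx, ny = (x + dx) % L, (y + dy) % L
                    if grid[nx][ny] == 0 and rng.random() < p_inf:
                        new[nx][ny] = 1
                # rare long-range transmission to a random distant site
                if rng.random() < p_long:
                    rx, ry = rng.randrange(L), rng.randrange(L)
                    if grid[rx][ry] == 0:
                        new[rx][ry] = 1
                if rng.random() < p_rec:           # recovery I -> R
                    new[x][y] = 2
        grid = new
    flat = [state for row in grid for state in row]
    return flat.count(0), flat.count(1), flat.count(2)

s_cnt, i_cnt, r_cnt = lattice_sir()
```

setting p_long = 0 recovers the concentric-wave regime of fig. 19.3a, while a small positive p_long seeds distant secondary outbreaks as described in the text.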
in addition to infected sites transmitting the disease to neighboring susceptible lattice sites, every now and then (with a probability of 1%) they can also infect randomly chosen lattice sites anywhere in the system. the propensity of infecting a lattice site at distance r decreases as an inverse power-law, as explained in the caption to fig. 19 .3. the possibility of transmitting to distant locations yields new epidemic seeds far away that subsequently turn into new outbreak waves, which in turn seed second-, third-, etc. generation outbreaks, even if the overall rate at which long-distance transmissions occur is very small. the consequence is that the spatially coherent, concentric pattern observed in the reaction-diffusion system is lost, and a complex, spatially incoherent, fractal pattern emerges [16] [17] [18] . practically, this implies that the distance from an initial outbreak location can no longer be used as a measure for estimating or computing the time that it takes for an epidemic to arrive at a certain location. also, given a snapshot of a spreading pattern, it is much more difficult to reconstruct the outbreak location from the geometry of the pattern alone, unlike in the concentric system, where the outbreak location is typically near the center of mass of the pattern. a visual inspection of the air-transportation system depicted in fig. 19 .1 is sufficient to convince oneself that the significant fraction of long-range connections in global mobility will not only increase the speed at which infectious diseases spread but, more importantly, also cause the patterns of spread to exhibit high spatial incoherence and complexity caused by the intricate connectivity of the air-transportation network.

[figure caption: (cf. fig. 19.1) geographic distance to the initial outbreak location is no longer a good predictor of arrival time, unlike in systems with local or spatially limited host mobility.]
as a consequence, we can no longer use geographic distance to an emergent epidemic epicenter as an indicator of "how far away" that epicenter is and of how long the epidemic will take to travel to a given location on the globe. this type of decorrelation is shown in fig. 19.4 for two examples: the 2003 sars epidemic and the 2009 influenza h1n1 pandemic. at a spatial resolution of countries, the figure depicts scatter plots of the epidemic arrival time as a function of geodesic distance (shortest distance on the surface of the earth) from the initial outbreak location. as expected, the correlation between distance and arrival time is weak. given that models based on local or spatially limited mobility are inadequate, improved models must be developed that account for both the strong heterogeneity in population density (human populations accumulate in cities that vary substantially in size) and the connectivity structure between cities that is provided by data on air traffic. in a sense, one needs to establish a model that treats the entire population as a so-called metapopulation: a system of subpopulations m = 1, …, M, each of size N_m, with traffic between them, e.g. specified by a matrix F_nm that quantifies the number of host individuals that travel from population m to population n in a given unit of time [19, 20]. for example, N_n could correspond to the size of city n and F_nm to the number of passengers that travel by air from m to n. one of the earliest and most employed models for disease dynamics using the metapopulation approach is a generalization of eq. (19.4) in which each population's dynamics is governed by the ordinary sir model:

dS_n/dt = −α S_n I_n / N_n
dI_n/dt = α S_n I_n / N_n − β I_n      (19.9)
dR_n/dt = β I_n

where the size N_n = S_n + I_n + R_n of population n is a parameter.
in addition, the exchange of individuals between populations is modeled such that hosts of each class move from location m to location n with a probability rate ω_nm, which yields

dS_n/dt = −α S_n I_n / N_n + Σ_m (ω_nm S_m − ω_mn S_n)
dI_n/dt = α S_n I_n / N_n − β I_n + Σ_m (ω_nm I_m − ω_mn I_n)
dR_n/dt = β I_n + Σ_m (ω_nm R_m − ω_mn R_n)

which is a generic metapopulation sir model. in principle one is required to fix the infection-related parameters α and β, the population sizes N_m, and the mobility rates ω_nm, i.e. the number of transitions from m to n per unit time. however, based on very plausible assumptions [11], the system can be simplified in such a way that all parameters can be gauged against data that is readily available, e.g. the actual passenger flux F_nm (the number of passengers that travel from m to n per day) that defines the air-transportation network, without having to specify the absolute population sizes N_n. first, the rates ω_nm have to fulfill the condition ω_nm N_m = ω_mn N_n if we assume that the N_n remain constant. if we assume, additionally, that the total air traffic flowing out of a population is proportional to its size, the system reduces to

ds_n/dt = −α s_n j_n + γ Σ_m P_nm (s_m − s_n)
dj_n/dt = α s_n j_n − β j_n + γ Σ_m P_nm (j_m − j_n)      (19.12)

where the dynamic variables are, again, fractions of the population in each class: s_n = S_n/N_n, j_n = I_n/N_n, and r_n = R_n/N_n. in this system the new matrix P_nm and the new rate parameter γ can be computed directly from the traffic matrix F_nm and the total population involved, N = Σ_m N_m, according to

γ = F/N,    P_nm = F_nm / Σ_k F_km

where F = Σ_{n,m} F_mn is the total traffic in the network. the matrix P_nm is therefore the fraction of passengers that leave node m with destination n. because passengers must arrive somewhere, we have Σ_n P_nm = 1. an important first question concerns the different time scales, i.e. the parameters α, β and γ that appear in system (19.12). the inverse β^{-1} = T is the infectious period, that is, the time individuals remain infectious. if we assume T ≈ 4-6 days and R_0 = α/β ≈ 2, both rates are of the same order of magnitude. how about γ? the total number of passengers F is approximately 8 × 10^6 per day.
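the coupled system (19.12), with γ and P_nm derived from a flux matrix as above, can be sketched numerically as follows. the flux values, total population, and epidemic parameters are illustrative assumptions, and a simple euler scheme is used.

```python
def metapop_sir(F, N, alpha=0.5, beta=0.2, seed=0, i0=1e-3, T=100.0, dt=0.05):
    """Euler integration of a metapopulation SIR in the form of eq. (19.12).

    F[n][m] is the daily passenger flux from m to n, N the total population.
    Fractions s_n, j_n evolve under local SIR dynamics plus the mobility
    coupling gamma * sum_m P[n][m] * (x_m - x_n)."""
    M = len(F)
    out = [sum(F[n][m] for n in range(M)) for m in range(M)]  # flux leaving m
    P = [[F[n][m] / out[m] if out[m] else 0.0 for m in range(M)]
         for n in range(M)]
    gamma = sum(map(sum, F)) / N                              # gamma = F_total / N
    s = [1.0] * M
    j = [0.0] * M
    s[seed] -= i0                                             # small local seed
    j[seed] += i0
    for _ in range(int(T / dt)):
        ds, dj = [], []
        for n in range(M):
            coupling_s = gamma * sum(P[n][m] * (s[m] - s[n]) for m in range(M))
            coupling_j = gamma * sum(P[n][m] * (j[m] - j[n]) for m in range(M))
            ds.append(-alpha * s[n] * j[n] + coupling_s)
            dj.append(alpha * s[n] * j[n] - beta * j[n] + coupling_j)
        s = [s[n] + dt * ds[n] for n in range(M)]
        j = [j[n] + dt * dj[n] for n in range(M)]
    return s, j
```

euler stepping with a small dt suffices here because, as discussed in the text, γ is orders of magnitude smaller than α and β, so the mobility coupling is a slow perturbation of the local dynamics.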
if we assume that N ≈ 7 × 10^9 people, we find γ = F/N ≈ 10^{-3} per day. it is instructive to consider the inverse, t_travel = γ^{-1} ≈ 800 days: on average, a typical person boards a plane once every 2-3 years or so. keep in mind, though, that this is an average that accounts for both a small fraction of the population with a high frequency of flying and a large fraction that almost never boards a plane. the overall mobility rate is thus a few orders of magnitude smaller than the rates related to transmission and recovery. this has important consequences for being able to replace the full dynamic model by a simpler model, discussed below. figure 19.5 depicts a numerical solution of the model defined by eq. (19.12) for a set of initial outbreak locations. at each location a small seed of infected individuals initializes the epidemic. global aspects of an epidemic can be assessed by the total fraction of infected individuals j_g(t) = Σ_n c_n j_n(t), where c_n = N_n/N is the relative size of population n with respect to the entire population. as expected, the time course of a global epidemic, in terms of the epicurve and duration, depends substantially on the initial outbreak location. a more important aspect is the spatiotemporal pattern generated by the model. figure 19.6 depicts temporal snapshots of simulations initialized in london and chicago, respectively. analogous to the qualitative patterns observed in fig. 19.3b, we see that the presence of long-range connections in the worldwide air-transportation network yields incoherent spatial patterns, much unlike the regular, concentric wavefronts observed in systems without long-range mobility. figure 19.7 shows that the model epidemic likewise exhibits only a weak correlation between geographic distance to the outbreak location and arrival time. for a fixed geographic distance, arrival times at different airports can vary substantially, and thus the traditional geographic distance is useless as a predictor. the system defined by eq.
(19.12) is one of the most parsimonious models that accounts for strongly heterogeneous population distributions coupled by traffic flux, and that can be gauged against actual population size distributions and traffic data. surprisingly, despite its structural simplicity, this type of model has been quite successful in accounting for the actual spatial spread of past epidemics and pandemics [19]. based on early models of this type, and aided by the exponential increase of computational power, very sophisticated models have been developed that account for factors ignored by the deterministic metapopulation sir model. the most sophisticated approaches, e.g. gleam [21], the global epidemic and mobility computational tool, consider not only traffic by air but also other means of transportation as well as more complex infection dynamics, and, as hybrid dynamical systems, take into account stochastic effects caused by random reaction and mobility events. household structure, available hospital beds, and seasonality have been incorporated, as well as disease-specific features, all in order to make predictions more and more precise. the philosophy of this line of research relies heavily on the continuing advancement of computational power and on increasingly accurate and pervasive data, often collected in natural experiments and by web-based techniques [21-25]. despite the success of these quantitative approaches, this strategy bears a number of problems, some of which are fundamental. first, with increasingly powerful computational methods it has become possible to implement extremely complex dynamical systems with decreasing effort, and without substantial knowledge of the properties that nonlinear dynamical systems can possess. when a lot of dynamical detail is implemented, it is difficult to identify which factors are essential for an observed phenomenon and which factors are marginal.
because of the complexity that is often incorporated at the very beginning of the design of a sophisticated model, in combination with a lack of data, modelers often have to make assumptions about the numerical values of the parameters required to run a computer simulation [26]. generically, many dozens of unknown parameters exist for which plausible, but often not evidence-based, values have to be assumed. because complex computational models, especially those that account for stochasticity, have to be run multiple times in order to make statistical assessments, systematic parameter scans are impossible even on the most sophisticated supercomputers. finally, all dynamical models, irrespective of their complexity, require two ingredients in order to be numerically integrated: (1) fixed values for the parameters and (2) initial conditions. although some computational models have been quite successful in describing and reproducing the spreading behavior of past epidemics, in situations where disease-specific parameters and outbreak locations had been assessed, they are difficult to apply when novel pathogens emerge. in these situations, when computational models are needed most from a practical point of view, little is known about these parameters, and running even the most sophisticated models "in the dark" is problematic. the same is true for fixing the right initial conditions. in many cases, an emergent infectious disease initially spreads unnoticed, and the public becomes aware of a new event only after numerous cases occur in clusters at different locations. reconstructing the correct initial condition often takes time, more time than is usually available for making the accurate and valuable predictions that public health workers and policy makers need to devise containment strategies.
given the issues discussed above, one can ask whether alternative approaches exist that can inform about the spread without having to rely on the most sophisticated, highly detailed computer models. in this context one may ask whether the patterns that are solutions of models like the sir metapopulation model of eq. (19.12) are genuinely complex because of the underlying complexity of the mobility network that intricately spans the globe, or whether a simple pattern really underlies the dynamics, masked by this complexity and by our traditional ways of using conventional maps to display dynamical features and of thinking in terms of geographic distances. in a recent approach, brockmann and helbing [11] developed the idea of replacing the traditional geographic distance by the notion of an effective distance derived from the topological structure of the global air-transportation network. in essence the idea is very simple: if two locations in the air-transportation network exchange a large number of passengers, they should be effectively close, because a larger number of passengers implies that the probability of an infectious disease being transmitted from a to b is comparatively larger than if the two locations were coupled only by a small number of traveling passengers. effective distance should therefore decrease with traffic flux. what, then, is a plausible mathematical ansatz relating traffic flux to effective distance? to answer this question one can go back to the metapopulation sir model, eq. (19.12). dispersal in this equation is governed by the flux fraction P_nm. recall that this quantity is the fraction of all passengers that leave node m and arrive at node n. therefore P_nm can be operationally defined as the probability that a randomly chosen passenger departing node m arrives at node n.
if, in a thought experiment, we assume that the randomly selected person is infectious, P_nm is proportional to the probability of transmitting a disease from airport m to airport n. we can now make the following ansatz for the effective distance:

d_nm = d_0 − log P_nm      (19.13)

where d_0 ≥ 0 is a non-negative constant to be specified later. this definition of effective distance implies that if all traffic from m arrives at n, so that P_nm = 1, the effective distance is d_nm = d_0, the smallest possible value. if, on the other hand, P_nm becomes very small, d_nm becomes larger, as required. the definition (19.13) applies to nodes m and n that are connected by a link in the network. what about pairs of nodes that are not directly connected, but only by paths that require intermediate steps? given two arbitrary nodes, an origin m and a destination n, an infinite number of paths (sequences of steps) exists that connect the two nodes. we can define the shortest effective route as the one for which the accumulated effective distance along the legs is minimal: for any path we sum the effective distances of the legs according to eq. (19.13), and the minimum over all paths defines the effective distance D_nm. this approach also explains the use of the logarithm in the definition of effective distance: adding effective distances along a route corresponds to multiplying the probabilities P_nm along the involved steps. therefore the shortest effective distance D_nm is equivalent to the most probable path that connects origin and destination. the parameter d_0 is a free parameter in the definition and quantifies the influence of the number of steps involved in a path. typically it is chosen to be either 0 or 1, depending on the application. one important property of effective distance is its asymmetry: generally, D_nm ≠ D_mn. this may seem surprising at first sight, yet it is plausible. consider for example two airports a and b.
let's assume a is a large hub that is strongly connected to many other airports in the network, including b. airport b, however, is only a small airport with a single connection, leading to a. the effective distance b → a is then much smaller (equal to d_0) than the effective distance from the hub a to the small airport b. this accounts for the fact that, again in a thought experiment, a randomly chosen passenger at airport b is almost certainly going to a, whereas a randomly chosen passenger at the hub a arrives at b only with a small probability. given the definition of effective distance, one can compute the shortest effective paths from a chosen and fixed reference location to every other node. each airport m thus has a set of shortest paths P_m that connect m to all other airports. this set forms the shortest path tree T_m of airport m. together with the effective distance matrix D_nm, the tree defines the perspective of node m. this is illustrated qualitatively in fig. 19.8, which depicts a planar, random, triangular, weighted network. one can now employ these principles and compute the shortest path trees and effective distances from the perspective of actual airports in the worldwide air-transportation network, based on actual traffic data, i.e. the flux matrix F_nm. figure 19.9 depicts the shortest path tree of one of the berlin airports (tegel, txl). the radial distance of every other airport in the network is proportional to its effective distance from txl. one can see that large european hubs are effectively close to txl, as expected. however, large asian and american airports are also effectively close to txl: for example, the airports of chicago (ord), beijing (pek), miami (mia) and new york (jfk) are comparatively close to txl. we can also see that, from the perspective of txl, germany's largest airport, fra, serves as a gateway to a considerable fraction of the rest of the world.
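the effective distance of eq. (19.13), together with shortest effective paths, can be sketched with dijkstra's algorithm over the per-leg lengths d_0 − log P_nm. the toy hub-and-spoke flux matrix below is an assumption chosen to exhibit the hub/spoke asymmetry just discussed.

```python
import heapq
import math

def effective_distances(F, source, d0=1.0):
    """Shortest effective distances from `source` to every node.

    F[n][m] is the passenger flux from m to n; the effective length of a
    direct leg m -> n is d0 - log(P_nm) with P_nm = F[n][m] / (total flux
    leaving m), following eq. (19.13). Dijkstra then minimizes the summed
    effective length, i.e. finds the most probable multi-leg routes."""
    M = len(F)
    out = [sum(F[n][m] for n in range(M)) for m in range(M)]  # flux leaving m
    dist = [math.inf] * M
    dist[source] = 0.0
    queue = [(0.0, source)]
    while queue:
        d, m = heapq.heappop(queue)
        if d > dist[m]:
            continue
        for n in range(M):
            if F[n][m] <= 0:
                continue
            p = F[n][m] / out[m]       # probability a passenger leaving m goes to n
            nd = d + d0 - math.log(p)  # effective length of the leg m -> n
            if nd < dist[n]:
                dist[n] = nd
                heapq.heappush(queue, (nd, n))
    return dist

# toy hub-and-spoke network: node 0 is a hub serving nodes 1-3, and each
# spoke sends all of its traffic back to the hub (numbers are illustrative)
F_toy = [[0, 100, 100, 100],
         [100, 0, 0, 0],
         [100, 0, 0, 0],
         [100, 0, 0, 0]]
d_from_hub = effective_distances(F_toy, source=0)
d_from_spoke = effective_distances(F_toy, source=1)
# spoke -> hub: all of the spoke's traffic goes to the hub, so P = 1 and the
# distance is just d0; hub -> spoke: only a third of the hub's traffic goes
# to each spoke, so the distance is d0 + log(3) -- the asymmetry D_nm != D_mn
```

all leg lengths are at least d_0 > 0 for d_0 = 1, so dijkstra's non-negativity requirement is satisfied.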
because the shortest path tree also represents the most probable spreading routes, one can use this method to identify airports that are particularly important in distributing an infectious disease throughout the network. the shortest paths are also the most probable paths of a random walker that starts at the reference location and terminates at the respective target node. using effective distance, and representing the air-transportation network from the perspective of chosen reference nodes with a notion of distance that better reflects how strongly different locations are coupled in a networked system, is a helpful new way of "looking at" the world. yet this representation is more than a merely intuitive and plausible spatial representation. what are the dynamic consequences of effective distance? the true advantage of effective distance is illustrated in fig. 19.10. this figure depicts the same computer-simulated hypothetical pandemics as fig. 19.6. unlike the latter, which is based on the traditional geographic representation, fig. 19.10 employs the effective distance and shortest path tree representation from the perspective of the outbreak location, as discussed above. in this representation, the spatially incoherent patterns of the traditional representation are transformed into concentric spreading patterns, similar to those expected for simple reaction-diffusion systems. this shows that the complexity of the observed spreading patterns is actually equivalent to simple spreading patterns that are merely convoluted and masked by the underlying network's complexity. this has important consequences. because only the topological features of the network are used for computing the effective distance, and no dynamic features are required, the concentricity of the emergent patterns is a generic feature, independent of the dynamical properties of the underlying model.
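since a concentric front moves at constant speed in effective distance, arrival times grow approximately linearly with effective distance, t_n ≈ D_n / v. a minimal sketch of estimating the front speed from a few early arrivals and predicting later ones follows; all numbers are synthetic assumptions.

```python
def front_speed(distances, arrivals):
    """Least-squares slope through the origin for t = d / v; returns v.

    `distances` are effective distances from the outbreak location and
    `arrivals` the corresponding observed arrival times."""
    num = sum(d * t for d, t in zip(distances, arrivals))
    den = sum(d * d for d in distances)
    return den / num  # v = sum(d^2) / sum(d * t)

def predict_arrival(d, v):
    """Predicted arrival time at effective distance d given front speed v."""
    return d / v
```

fitting through the origin encodes the assumption that the wave starts at the outbreak location at time zero; with real data one would allow an intercept for the unobserved onset time.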
it also means that in effective distance, contagion processes spread at a constant speed, and, just as in the simple reaction-diffusion model, one can much better predict the arrival time of an epidemic wavefront, knowing the speed and the effective distance. for example, if the initial spreading speed is assessed shortly after an epidemic outbreak, one can forecast arrival times without having to run computationally expensive simulations. even if the spreading speed is unknown, effective distance, which is independent of the dynamics, can inform about the sequence of arrival times, or relative arrival times. [caption of fig. 19.9: shortest path tree and effective distances from the perspective of airport tegel (txl) in berlin. txl is the central node; radial distance in the tree quantifies the effective distance to the reference node txl. as expected, large european hubs like frankfurt (fra), munich (muc) and london heathrow (lhr) are effectively close to txl. however, hubs that are geographically distant, such as chicago (ord) and beijing (pek), are also effectively closer than smaller european airports. note also that the tree structure indicates that fra is a gateway to a large fraction of other airports, as reflected by the size of the tree branch at fra. the illustration is a screenshot of an interactive effective distance tool available online [27].] the benefit of the effective distance approach can also be seen in fig. 19.11, in which the arrival times of the 2003 sars epidemic and the 2009 h1n1 pandemic in affected countries are shown as a function of effective distance to the outbreak origin. comparing this figure to fig. 19.7, we see that effective distance is a much better predictor of arrival time: a clear linear relationship exists between effective distance and arrival time. [caption of fig. 19.10: simulations and effective distance. the panels depict the same temporal snapshots of computer-simulated hypothetical pandemic scenarios as in fig. 19.6.]
[fig. 19.10 caption, continued: the top row corresponds to a pandemic initially seeded at lhr (london), the bottom row to one seeded at ord (chicago). the networks depict the shortest-path-tree effective distance representation of the corresponding seed airports, as in fig. 19.9.] the simulated pandemics that exhibit spatially incoherent, complex patterns in the traditional representation (fig. 19.6) are equivalent to concentric wave fronts that progress at constant speed in effective distance space. this method thus substantially simplifies the complexity seen in conventional approaches and improves quantitative predictions of epidemic arrival. effective distance is therefore a promising tool and concept for application in realistic scenarios, able to provide a first quantitative assessment of an epidemic outbreak and its potential consequences on a global scale. in a number of situations epidemiologists are confronted with the task of reconstructing the outbreak origin of an epidemic. when a novel pathogen emerges, the infection may in some cases spread covertly until a substantial case count attracts attention and public health officials and experts become aware of the situation. quite often, cases occur in a spatially incoherent way, much like the patterns depicted in fig. 19.3b, because of the complexity of the underlying human mobility networks. when cases emerge at apparently randomly distributed locations, it is a difficult task to assess where the event initially started. the computational method based on effective distance can also be employed in these situations, provided that one knows the underlying mobility network. this is because the concentric pattern depicted in fig. 19.10 is [caption of fig. 19.11: compared to the conventional use of geographic distance, effective distance is a much better predictor of epidemic arrival time, as reflected by the linear relationship between arrival time and effective distance; compare to fig. 19.7. right: the same analysis for the 2003 sars epidemic.]
[fig. 19.11 caption, continued: also in this case, effective distance is much more strongly correlated with arrival time than geographic distance.] only observed if the actual outbreak location is chosen as the center perspective node. in other words, if the temporal snapshots are depicted using a different reference node, the concentric pattern is scrambled and irregular. therefore, one can use the effective distance method to identify the outbreak location of a spreading process based on a single temporal snapshot. this method is illustrated in a proof-of-concept example depicted in fig. 19.12. assume that we are given a temporal snapshot of a spreading process, as depicted in fig. 19.12a, and the goal is to reconstruct the outbreak origin from the data. conventional geometric considerations are not successful, because network-driven processes generically do not yield simple geometric patterns. using effective distance, we can instead investigate the pattern from the perspective of every single potential outbreak location. we could, for example, pick a set of candidate outbreak locations (panel (b) in the figure). doing so, we find that only for one candidate outbreak location does the temporal snapshot have the shape of a concentric circle: this must be the original outbreak location. this process, qualitatively depicted in the figure, can be applied in a quantitative way and has been applied to actual epidemic data, such as the 2011 ehec outbreak in germany [28]. [caption of fig. 19.12: outbreak reconstruction using effective distance. (a) the panel depicts a temporal snapshot of a computer-simulated hypothetical pandemic; red dots denote airports with a high prevalence of cases. from the snapshot alone it is difficult to assess the outbreak origin, which in this case is ord (chicago). (b) a choice of 12 potential outbreak locations as candidates. (c) for these candidate locations the pattern is depicted in the effective distance perspective; only for the correct outbreak location is the pattern concentric.]
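the reconstruction idea (only the true origin sees an arrival pattern that is concentric, i.e. linear in effective distance) can be sketched as a search over candidate origins that maximizes the correlation between candidate-relative effective distance and observed arrival time. the scoring rule and any network used with it are illustrative assumptions, not the quantitative procedure of [28].

```python
import math

def eff_dist_matrix(F, d0=1.0):
    """All-pairs shortest effective distances; D[m][n] is from m to n.

    Leg lengths follow eq. (19.13); Floyd-Warshall minimizes the summed
    effective length, i.e. finds the most probable routes."""
    M = len(F)
    out = [sum(F[n][m] for n in range(M)) for m in range(M)]
    D = [[0.0 if m == n else math.inf for n in range(M)] for m in range(M)]
    for m in range(M):
        for n in range(M):
            if m != n and F[n][m] > 0:
                D[m][n] = d0 - math.log(F[n][m] / out[m])
    for k in range(M):
        for i in range(M):
            for j in range(M):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D

def pearson(xs, ys):
    """Pearson correlation coefficient (0.0 if either side is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0.0 or vy == 0.0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def reconstruct_origin(F, arrival):
    """Return the candidate node whose effective distances to all other
    nodes correlate best (most linearly) with the observed arrival times:
    only from the true origin does the pattern look concentric."""
    D = eff_dist_matrix(F)
    M = len(F)
    best, best_r = None, -2.0
    for c in range(M):
        others = [n for n in range(M) if n != c]
        r = pearson([D[c][n] for n in others], [arrival[n] for n in others])
        if r > best_r:
            best, best_r = c, r
    return best
```

seen from the true origin, arrival times are (nearly) proportional to effective distance, so the correlation approaches one; seen from any other node, the pattern is scrambled and the correlation drops.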
[fig. 19.12 caption, continued: this method can be used quantitatively to identify the outbreaks of epidemics that initially spread in a covert way.] emergent infectious diseases that bear the potential of spreading across the globe are an illustrative example of how connectivity in a globalized world has changed the way human-mediated processes evolve in the 21st century. we are connected by complex networks of interaction, mobility being only one of them. with the onset of social media, the internet and mobile devices, we share information that proliferates and spreads on information networks in much the same way (see also chap. 20). in all of these systems the scientific challenge is understanding which topological and statistical features of the underlying network shape the particular dynamic features observed in natural systems. the examples addressed above focus on a particular scale, defined by the single mobility network, the air-transportation network, that is relevant for this scale. as more and more data accumulate, computational models developed in the future will be able to integrate mobility patterns at an individual resolution, potentially making use of pervasive data collected on mobile devices, and paving the way towards predictive models that can account very accurately for observed contagion patterns. the examples above also illustrate that just feeding better and faster computers with more and more data may not necessarily help in understanding the fundamental properties of the processes that underlie a specific dynamic phenomenon. sometimes we only need to change the conventional and traditional ways of looking at patterns and adapt our viewpoint appropriately.

note 1: examples are contagion, a surprisingly accurate depiction of the consequences of a severe pandemic, and rise of the planet of the apes, which concludes with a fictitious explanation for the extinction of mankind due to a man-made virus in the future.

references (fragments): the fates of human societies; proc. r. soc. lond.; infectious diseases of humans: dynamics and control; modeling infectious diseases in humans and animals; human mobility and spatial disease dynamics, proc. natl. acad. sci. usa.

key: cord-104133-d01joq23 authors: arthur, ronan f.; jones, james h.; bonds, matthew h.; ram, yoav; feldman, marcus w. title: adaptive social contact rates induce complex dynamics during epidemics date: 2020-07-14 journal: biorxiv doi: 10.1101/2020.04.14.028407 sha: doc_id: 104133 cord_uid: d01joq23

the covid-19 pandemic has posed a significant dilemma for governments across the globe. the public health consequences of inaction are catastrophic; but the economic consequences of drastic action are likewise catastrophic. governments must therefore strike a balance in the face of these trade-offs. but with critical uncertainty about how to find such a balance, they are forced to experiment with their interventions and await the results of their experimentation. models have proved inaccurate because behavioral response patterns are either not factored in or are hard to predict. one crucial behavioral response in a pandemic is adaptive social contact: potentially infectious contact between people is deliberately reduced, either individually or by fiat, and this must be balanced against the economic cost of having fewer people in contact and therefore active in the labor force. we develop a model for adaptive optimal control of the effective social contact rate within a susceptible-infectious-susceptible (sis) epidemic model, using a dynamic utility function with delayed information. this utility function trades off the population-wide contact rate with the expected cost and risk of increasing infections. our analytical and computational analysis of this simple discrete-time deterministic model reveals the existence of a non-zero equilibrium, oscillatory dynamics around this equilibrium under some parametric conditions, and complex dynamic regimes that shift under small parameter perturbations.
these results support the supposition that infectious disease dynamics under adaptive behavior-change may have an indifference point, may produce oscillatory dynamics without other forcing, and constitute complex adaptive systems with associated dynamics. implications for covid-19 include an expectation of fluctuations, for a considerable time, around a quasi-equilibrium that balances public health and economic priorities, that shows multiple peaks and surges in some scenarios, and that implies a high degree of uncertainty in mathematical projections.

author summary: epidemic response in the form of social contact reduction, such as has been utilized during the ongoing covid-19 pandemic, presents inherent trade-offs between the economic costs of reducing social contacts and the public health costs of neglecting to do so. such trade-offs introduce an interactive, iterative mechanism which adds complexity to an infectious disease system. consequently, infectious disease modeling typically has not included dynamic behavior change that must address such a trade-off. here, we develop a theoretical model that introduces lost or gained economic and public health utility through the adjustment of social contact rates with delayed information. we find this model produces an equilibrium, a point of indifference where the trade-off is neutral, and at which a disease will be endemic for a long period of time. under small perturbations, this model exhibits complex dynamic regimes, including oscillatory behavior, runaway exponential growth, and eradication. these dynamics suggest that for epidemic response that relies on social contact reduction, secondary waves and surges with accompanying business re-closures and shutdowns may be expected, and that accurate projection under such circumstances is unlikely.

the covid-19 pandemic had infected almost 9 million people and caused over 450,000 deaths worldwide as of june 23, 2020 [1].
in the absence of effective therapies and vaccines [2], many governments responded with lock-down policies and social distancing laws to reduce the rate of social contacts and curb transmission of the virus. prevalence of covid-19 in the wake of these policies in the united states indicates they may have been successful at decreasing the reproduction number (r_t) of the epidemic [1]. however, they have also led to economic recession, with an unemployment rate at an 80-year peak, the stock market in decline, and the federal government forced to borrow heavily to financially support businesses and households. solutions to these economic crises may conflict with public health recommendations. thus, governments worldwide must decide how to balance the economic and public health consequences of their epidemic response interventions.

behavior-change in response to an epidemic, whether autonomously adopted by individuals or externally directed by governments, affects the dynamics of infectious diseases [3, 4]. prominent examples of behavior-change in response to infectious disease prevalence include measles-mumps-rubella (mmr) vaccination choices [5], social distancing in influenza outbreaks [6], condom purchases in hiv-affected communities [7], and social distancing during the ongoing covid-19 pandemic [2]. behavior is endogenous to an infectious disease system because it is, in part, a consequence of the prevalence of the disease, which in turn responds to changes in behavior [8, 9]. individuals and governments have greater incentive to change behavior as prevalence increases; conversely, they have reduced incentive as prevalence decreases [10, 11]. endogenous behavioral response may then theoretically produce a non-zero endemic equilibrium of infection.
this happens because, at low levels of prevalence, the cost of avoidance of a disease may be higher than the private benefit to the individual, even though the collective, public benefit in the long term may be greater. however, in epidemic response we typically think of behavior-change as an exogenously induced intervention, without considering the associated costs. while guiding positive change is an important intervention, neglecting to recognize the endogeneity of behavior can lead to a misunderstanding of incentives and to a resurgence of the epidemic when behavior change is reversed prematurely.

although there is growing interest in the role of adaptive human behavior in infectious disease dynamics, there is still a lack of general understanding of the most important properties of such systems [3, 8, 12]. behavior is difficult to measure, quantify, or predict [8], in part due to the complexity and diversity of human beings, who make decisions and react to information in ways that are hard to anticipate. one early approach simply allowed the transmission parameter (β) to be a negative function of the number infected, effectively introducing an intrinsic negative feedback to the infected class that regulated the disease [13].

modelers have used a variety of tools, including agent-based modeling [14], network structures for the replacement of central nodes when sick [15] or for behavior-change as a social contagion process [16], game-theoretic descriptions of rational choice under changing incentives, as with vaccination [6, 11, 17], and a branching process for heterogeneous agents and the effect of behavior during the west africa ebola epidemic in 2014 [18]. a common approach to incorporating behavior into epidemic models is to track co-evolving dynamics of behavior and infection [16, 19, 20], where behavior represents an i-state of the model [21].
in a compartmental model, this could mean 66 separate compartments (and transitions therefrom) for susceptible individuals in a state 67 of fear and those not in a state of fear [16] . 68 periodicity (i.e. multi-peak dynamics) has long been documented empirically in 69 epidemiology [22, 23] . periodicity can be driven by seasonal contact rate changes (e.g. 70 when children are in school) [24] , seasonality in the climate or ecology [25] , sexual 71 behavior change [26] , and host immunity cycling through new births of susceptibles or a 72 decay of immunity over time. some papers in nonlinear dynamics have studied delay 73 differential equations in the context of epidemic dynamics and found periodic 74 solutions [27] . although it is atypical to include delay in modeling, delay is an 75 important feature of epidemics. for example, if behavior responds to mortality rates, 76 there will inevitably be a lag with an average duration of the incubation period plus the 77 life expectancy upon becoming infected. in a tightly interdependent system, reacting to 78 outdated information can result in an irrational response and periodic cycling. the original epidemic model of kermack and mckendrick [28] was first expressed in 92 discrete time. then by allowing "the subdivisions of time to increase in number so that 93 each interval becomes very small" the famous differential equations of the sir epidemic 94 model were derived. here we begin with a discrete-time susceptible-infected-susceptible 95 model that is adjusted on the principle of endogenous behavior-change through an 96 adaptive social-contact rate that can be thought of as either individually motivated or 97 institutionally imposed. we introduce a dynamic utility function that motivates the 98 population's effective contact rate at a particular time period. this utility function is 99 based on information about the epidemic size that may not be current. 
this leads to a 100 time delay in the contact function that increases the complexity of the population 101 dynamics of the infection. results from the discrete-time model show that the system 102 approaches an equilibrium in many cases, although small parameter perturbations can 103 lead the dynamics to enter qualitatively distinct regimes. the analogous continuous-time model retains periodicities for some sets of parameters, but numerical 105 investigation shows that the continuous time version is much better behaved than the 106 discrete-time model. this dynamical behavior is similar to models of ecological 107 population dynamics, and a useful mathematical parallel is drawn between these 108 systems. to represent endogenous behavior-change, we start with the classical discrete-time 112 susceptible-infected-susceptible (sis) model [28] , which, when incidence is relatively 113 small compared to the total population [29, 30] , can be written in terms of the recursions 114 where at time t, s t represents the number of susceptible individuals, i t the infected 115 individuals, and n t the number of individuals that make up the population, which is 116 assumed fixed in a closed population. we can therefore write n for the constant 117 population size. here γ, with 0 < γ < 1, is the rate of removal from i to s due to 118 recovery. this model in its simplest form assumes random mixing, where the parameter 119 b represents a composite of the average contact rate and the disease-specific 120 transmissibility given a contact event. 
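the classical discrete-time sis recursion just described can be sketched in a few lines; the parameter values below are illustrative and not taken from the paper.

```python
def sis_step(i, n, b, gamma):
    """One step of the classical discrete-time SIS recursion:
    new infections b*i*s/n, recoveries gamma*i, with s = n - i."""
    s = n - i
    return i + b * i * s / n - gamma * i

def iterate_sis(i0, n, b, gamma, steps):
    """Iterate the recursion from i0 and return the full trajectory."""
    traj = [i0]
    for _ in range(steps):
        traj.append(sis_step(traj[-1], n, b, gamma))
    return traj

# With b > gamma the infection grows toward the endemic level n*(1 - gamma/b):
# here n = 10_000, b = 0.5, gamma = 0.2 gives an endemic level of 6000.
traj = iterate_sis(1.0, 10_000, 0.5, 0.2, 2000)
```

with b ≤ γ the only stable equilibrium is i = 0; the endogenous behavior-change introduced next makes b itself a (delayed) function of i.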
in order to introduce human behavior, we substitute for b a time-dependent b t , which is a function of both b 0 , the probability that disease transmission takes place on contact, and a dynamic social rate of contact c t whose optimal value, c * t , is determined at each time t as in economic epidemiological models [31] . here c * t represents the optimal contact rate, defined as the number of contacts per unit time that maximize utility for the individual; c * t is a function of the number of infecteds at time t − ∆, through a utility function. this utility function is assumed to take a form in which u represents utility for an individual at time t given a particular number of contacts per unit time c, and α 0 is a constant that represents maximum potential utility achieved at a target contact rateĉ. the second term, α 1 (c −ĉ) 2 , is a concave function that represents the penalty for deviating fromĉ. the third term involves the probability of escaping infection given c contacts, evaluated at the delayed prevalence i t−∆ /n; the delay ∆ captures the lag in information acquisition and the speed of response to that information. we note that (1 − (i/n)b 0 ) c can be approximated by 1 − c(i/n)b 0 when (i/n)b 0 is small and c(i/n)b 0 << 1. we thus assume (i/n)b 0 is small, and approximate u (c) in eq. 5 using eq. 6. eq. 5 assumes a strictly negative relationship between number of infecteds and contact. we assume an individual or government will balance the cost of infection, the probability of infection, and the cost of deviating from the target contact rateĉ to select an optimal contact rate c * t , namely the number of contacts which takes into account the risk of infection and the penalty for deviating from the target contact rate. this captures the idea that individuals trade off how many people they want to interact with versus their risk of getting sick, or that authorities want to reopen the economy during a pandemic and have to trade off morbidity and mortality from increasing infections against the need to allow additional social contacts to help the economy restart.
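the display equations for u (c) and c * t were lost in extraction; a plausible reconstruction consistent with the surrounding text is u (c) = α 0 − α 1 (c −ĉ) 2 + α 2 (1 − (i t−∆ /n)b 0 ) c , whose maximizer under the linear approximation is c * t =ĉ − α b 0 i t−∆ /n with α = α 2 /(2α 1 ). the sketch below treats that form as an assumption, not the paper's verbatim equations.

```python
def utility(c, c_hat, i_delayed, n, b0, a0, a1, a2):
    """Reconstructed utility: a0 is the maximum utility at the target rate
    c_hat, a1 penalizes deviating from c_hat, and a2 weights the
    (linearized) probability of escaping infection."""
    escape = 1.0 - c * b0 * i_delayed / n   # (1 - b0*I/N)**c ~ 1 - c*b0*I/N
    return a0 - a1 * (c - c_hat) ** 2 + a2 * escape

def optimal_contact(c_hat, i_delayed, n, b0, a1, a2):
    """Vanishing derivative: -2*a1*(c - c_hat) - a2*b0*I/N = 0,
    so c* = c_hat - alpha*b0*I/N with alpha = a2/(2*a1)."""
    alpha = a2 / (2.0 * a1)
    return c_hat - alpha * b0 * i_delayed / n
```

as the text says, c * decreases strictly with the (delayed) number of infecteds.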
this optimal 152 contact rate can be calculated by finding the maximum of u with respect to c from eq. 153 5 with substitution from eq. 6, namely differentiating, we have which vanishes at the optimal contact rate, c * , which we write as c * t to show its 156 dependence on time. then which we assume to be positive. therefore, total utility will decrease as i t increases and 158 c * t also decreases. utility is maximized at each time step, rather than over the course of 159 lifetime expectations. in addition, eq 9 assumes a strictly negative relationship between 160 number of infecteds at time t − ∆ and c * . while behavior at high degrees of prevalence 161 has been shown to be non-linear and fatalistic [32, 33] , in this model, prevalence (i.e., 162 b0it n ) is assumed to be small, consistent with eq. 6. 163 we introduce the new parameter α = α2 we can now rewrite the recursion from eq. 2, using eq. 4 and replacing c t with c * t as 165 defined by eq. 10, as when ∆ = 0 and there is no time delay, f (·) is a cubic polynomial, given by july 10, 2020 5/17 for the susceptible-infected-removed (sir) version of the model, we include the removed category and write the (discrete-time) recursion system as the baseline contact rate and c * t specified by 169 eq. 10. with b t = b, say, and not changing over time, eqs. 13-15 form the discrete-time 170 version of the classical kermack-mckendrick sir model [28] . the inclusion of the 171 removed category entails thatĩ = 0 is the only equilibrium of the system eqs. 13-15; 172 unlike the sis model, there is no equilibrium with infecteds present. in general, since c * t 173 includes the delay ∆, the dynamic approach toĩ = 0 is expected to be quite complex. 174 intuitively, since the infecteds are ultimately removed, we do expect that from any 175 initial frequency i 0 of infecteds all n individuals will eventually be in the r category. 
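a minimal sketch of the resulting recursion with b t = b 0 c * t follows; since eq. 11 itself is not reproduced in the extracted text, the update rule below is a reconstruction under the assumed form of c * t and should be read as illustrative.

```python
def adaptive_sis(i0, n, b0, c_hat, alpha, gamma, delay, steps):
    """Discrete SIS with endogenous contact rate
    c*_t = c_hat - alpha*b0*I[t-delay]/n (clamped at zero).
    For t < delay the history is padded with i0."""
    hist = [i0]
    for t in range(steps):
        i_lag = hist[t - delay] if t >= delay else i0
        c_star = max(c_hat - alpha * b0 * i_lag / n, 0.0)
        b_t = b0 * c_star
        i = hist[-1]
        hist.append(i + b_t * i * (n - i) / n - gamma * i)
    return hist
```

with alpha = 0 this collapses to the classical sis recursion; with alpha > 0 the feedback lowers the endemic level, and with delay > 0 it can generate the cycles and collapses discussed below. an sir variant would additionally move recovered individuals into a removed class rather than back to s.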
numerical analysis of this sir model shows strong similarity between the sis and sir models for several hundred time steps before the sir model converges toĩ = 0 with r = n . in the section "numerical iteration and continuous-time analog" we compare the numerical iteration of the sis (eq. 11) and sir (eqs. 13-15) models and integration of the continuous-time (differential equation) versions of the sis and sir models. to determine the dynamic trajectories of (11) without time delay, we first solve for the fixed point(s) of the recursion (11), i.e., the value or values of i such that i = f (i). from eq. 16, it is clear that i = 0 is an equilibrium, as no new infections can occur in the next time-step if none exist in the current one. this is the disease-free equilibrium denoted byĩ. the other equilibria are the two solutions of a quadratic equation; we label the solution with the + sign i * and the one with the − signî. it is important to note thatî is an equilibrium only under conditions that make it legitimate (real and positive). if inequalities (20) hold and nĉb 0 > γ, thenî is locally stable. however, even if both of these inequalities hold, the number of infecteds may not converge toî. it is well known that iterations of discrete-time recursive relations, of which (12) is an example (i.e., with ∆ = 0), may produce cycles or chaos depending on the parameters and the starting frequency i 0 of infecteds. table 1 shows an array of possible asymptotic dynamics with ∆ = 0 found by numerical iteration of (12) for a specific set of parameters and an initial frequency i 0 . some rows of table 1 are examples for which, beginning with a single infected, the number of infecteds explodes, becoming unbounded; of course, this is an illegitimate trajectory since i t cannot exceed n . however, in the case marked * ,î is locally stable and, with a large enough initial number of infecteds, there is damped oscillatory convergence toî.
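the interior equilibrium and the local-stability check via df (i)/di| i=î can be carried out numerically; the sketch below uses the same reconstructed no-delay map as above, with illustrative parameters, so the specific numbers are assumptions rather than values from table 1.

```python
def f_map(i, n=10_000, b0=0.05, c_hat=10.0, alpha=20.0, gamma=0.2):
    """One step of the reconstructed no-delay recursion; c* clamped at zero."""
    c_star = max(c_hat - alpha * b0 * i / n, 0.0)
    return i + b0 * c_star * i * (n - i) / n - gamma * i

def fixed_point(lo, hi, tol=1e-10):
    """Bisection on g(i) = f_map(i) - i, assuming one sign change on [lo, hi]."""
    g = lambda i: f_map(i) - i
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def is_locally_stable(i_hat, h=1e-4):
    """Central-difference estimate of df/di at i_hat; stable iff |slope| < 1."""
    slope = (f_map(i_hat + h) - f_map(i_hat - h)) / (2 * h)
    return abs(slope) < 1.0
```

the sign of df (i)/di| i=î (and whether its magnitude exceeds one) distinguishes the monotone, damped-oscillatory, and cycling regimes described for table 1.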
in the case marked * * , with i 0 = 1 the number of infecteds becomes unbounded, but in this case,î is locally unstable, and starting with i 0 close toî a stable two-point cycle is approached; in this case df (i)/di| i=î < −1. stability analysis of the sis model is more complicated when ∆ ≠ 0, and in the appendix we outline the procedure for local analysis of the recursion (11) nearî. local stability is sensitive to the delay time ∆, as can be seen from the numerical iteration of (11) for the specific set of parameters shown in table 2 . some analytical details related to table 2 are in the appendix. table 1 reports an array of dynamic trajectories for some choices of parameters and, in two cases, an initial number of infecteds other than i 0 = 1. the first three rows show three sets of parameters for which the equilibrium values ofî are very similar but the trajectories of i t are different: a two-point cycle, a four-point cycle, and apparently chaotic cycling above and belowî. in all of these cases, df (i)/di| i=î < −1. clearly the dynamics are sensitive to the target contact rateĉ in these cases. the fourth and eighth rows show that i t becomes unbounded (tends to +∞) from i 0 = 1, but a two-point cycle is approached if i 0 is close enough toî: df (i)/di| i=î < −1 in this case. for the parameters in the ninth row, if i 0 is close enough toî there is damped oscillation intoî: here −1 < df (i)/di| i=î < 0. the fifth and sixth rows of table 1 exemplify another interesting dynamic starting from i 0 = 1: i t becomes larger thanî (overshoots) and then converges monotonically down toî; in each case 0 < df (i)/di| i=î < 1. for the parameters in the seventh row, there is oscillatory convergence toî from i 0 = 1 (−1 < df (i)/di| i=î < 0), while in the last row there is straightforward monotone convergence toî.
a continuous-time analog of the discrete-time recursion (11), in the form of a differential equation, substitutes di/dt for i t+1 − i t in (11). we then solve the resulting delay differential equation numerically using the vode differential equation integrator in scipy [36, 37] (source code available at https://github.com/yoavram/sanjose). the dynamics obtained using the parameters in table 2 are shown in figure 1 with i 0 = 1. in figure 1 , with no delay (∆ = 0) and a one-unit delay (∆ = 1), the discrete and continuous dynamics are very similar, both converging toî. however, with ∆ = 2 the differential equation oscillates intoî while the discrete-time recursion enters a regime of inexact cycling aroundî, which appears to be a state of chaos. for ∆ = 3 and ∆ = 4, the discrete recursion "collapses": i t becomes negative and appears to go off to −∞; in figure 1 , this is cut off at i = 0. the continuous version, however, in these cases enters a stable cycle aroundî. in fig. s5s there appears to be convergence toî, but in fig. s5l , after about 500 time units, in both discrete- and continuous-time sir versions, the number of infected begins to decline towards zero. it is worth noting that if the total population size n decreases over time, for example, if we take n (t) = n exp(−zt), with z = 50b 0ĉ γ, then the short-term dynamics of the sis model in (11) begins to closely resemble the sir version. this is illustrated in supplementary fig. s5n , where b 0 ,ĉ, γ are, as in figs. s5s and s5l, the same as in fig. 2, panel (a) . with n decreasing to zero, both s and i will approach zero in the long run. our model makes a number of simplifying assumptions. we assume, for example, that all individuals in the population will respond in the same fashion to government policy.
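the continuous-time analog above can also be integrated with a minimal fixed-step euler scheme with a stored history (the paper used a scipy-based dde solver; the right-hand side below is the same reconstructed adaptive sis form used earlier, so treat it as a sketch under those assumptions):

```python
def integrate_dde(i0, n, b0, c_hat, alpha, gamma, delay, t_end, dt=0.01):
    """Forward-Euler integration of
    dI/dt = b0*c*(I(t - delay))*I*(N - I)/N - gamma*I,
    with the pre-history set to the constant i0."""
    steps = int(t_end / dt)
    lag = int(round(delay / dt))
    hist = [i0]
    for t in range(steps):
        i_lag = hist[t - lag] if t >= lag else i0
        c_star = max(c_hat - alpha * b0 * i_lag / n, 0.0)
        i = hist[-1]
        didt = b0 * c_star * i * (n - i) / n - gamma * i
        hist.append(i + dt * didt)
    return hist
```

because the continuous equilibrium solves the same balance condition as the discrete fixed point, the no-delay trajectory converges to the sameî; increasing the delay is what pushes the system toward oscillation.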
we assume that governments choose a uniform contact rate according to an 322 optimized utility function, which is homogeneous across all individuals in the population. 323 finally, we assume that the utility function is symmetric around the optimal number of 324 contacts so that increasing or decreasing contacts above or below the target contact 325 rate, respectively, yield the same reduction in utility. these assumptions allowed us to 326 create the simplest possible model that includes adaptive behavior and time delay. in holling's heuristic distinction in ecology between tactical models, models built to be 328 parameterized and predictive, and strategic models, which aim to be as simple as 329 possible to highlight phenomenological generalities, this is a strategic model [38] . 330 we note that the five distinct kinds of dynamical trajectories seen in these 331 computational experiments come from a purely deterministic recursion. this means 332 that oscillations and even erratic, near-chaotic dynamics and collapse in an epidemic 333 may not necessarily be due to seasonality, complex agent-based interactions, changing or 334 stochastic parameter values, demographic change, host immunity, or socio-cultural 335 idiosyncracies. this dynamical behavior in number of infecteds can result from 336 mathematical properties of a simple deterministic system with homogeneous endogenous 337 behavior-change, similar to complex population dynamics of biological organisms [39] . the mathematical consistency with population dynamics suggests a parallel in ecology, 339 that the indifference point for human behavior functions in a similar way to a carrying 340 capacity in ecology, below which a population will tend to grow and above which a individuals are incentivized to change their behavior to protect themselves, they will, 346 and they will cease to do this when they are not [10] . 
further, our results show certain 347 parameter sets can lead to limit-cycle dynamics, consistent with other negative feedback 348 mechanisms with time delays [41, 42] . this is because the system is reacting to 349 conditions that were true in the past, but not necessarily true in the present. in our 350 discrete-time model, there is the added complexity that the non-zero equilibrium may 351 be locally stable but not attained from a wide range of initial conditions, including the 352 most natural one, namely a single infected individual. observed epidemic curves of many transient disease outbreaks typically inflect and 354 go extinct, as opposed to this model that may oscillate perpetually or converge [43] , and surges in fluctuations in covid-19 cases globally [1] . there may be 363 many causes for such double-peaked outbreaks, one of which may be a lapse in 364 behavior-change after the epidemic begins to die down due to decreasing incentives [16] , 365 as represented in our simple theoretical model. this is consistent with findings that 366 voluntary vaccination programs suffer from decreasing incentives to participate as 367 prevalence decreases [44, 45] . it should be noted that the continuous-time version of our 368 model can support a stable cyclic epidemic whose interpretation in empirical terms will 369 depend on the time scale, and hence on the meaning of the delay, ∆. one of the responsibilities of infectious disease modelers (e.g. covid-19 modelers) 371 is to predict and project forward what epidemics will do in the future in order to better 372 assist in the proper and strategic allocation of preventative resources. covid-19 373 models have often proved wrong by orders of magnitude because they lack the means to 374 account for adaptive response. 
an insight from this model, however, is that prediction 375 becomes very difficult, perhaps impossible, if we allow for adaptive behavior-change 376 because the system is qualitatively sensitive to small differences in values of key 377 parameters. these parameters are very hard to measure precisely; they change 378 depending on the disease system and context and their inference is generally subject to 379 large errors. further, we don't know how policy-makers weight the economic trade-offs 380 against the public health priorities (i.e., the ratio between α 1 and α 2 in our model) to 381 arrive at new policy recommendations. to maximize the ability to predict and minimize loss of life or morbidity, outbreak 383 response should not only seek to minimize the reproduction number, but also the length 384 of time taken to gather and distribute information. another approach would be to use a 385 predetermined strategy for the contact rate, as opposed to a contact rate that depends 386 on the number of infecteds. in our model, complex dynamic regimes occur more often when there is a time delay. 388 if behavior-change arises from fear and fear is triggered by high local mortality and high 389 local prevalence, such delays seem plausible since death and incubation periods are 390 lagging epidemiological indicators. lags mean that people can respond sluggishly to an 391 unfolding epidemic crisis, but they also mean that people can abandon protective 392 behaviors prematurely. developing approaches to incentivize protective behavior 393 throughout the duration of any lag introduced by the natural history of the infection (or 394 otherwise) should be a priority in applied research. this paper represents a first step in 395 understanding endogenous behavior-change and time-lagged protective behavior, and we 396 anticipate further developments along these lines that could incorporate long incubation 397 periods and/or recognition of asymptomatic transmission. 
in the neighborhood of the equilibriumî, write i t =î + ε t and i t−∆ =î + ε t−∆ , where ε t and ε t−∆ are small enough that quadratic terms in them can be neglected in the expression for i t+1 =î + ε t+1 . the linear approximation to (a1) is then (a2), and in the case ∆ = 0 this reduces to (a3). we focus first on ∆ = 0 and write (a3) as ε t+1 = ε t l(î). recall thatî satisfies eq. (17) ; substituting γ from (17) simplifies l(î). now we turn to the general case ∆ ≠ 0 and eq. (a2), which we write as (a7), where a and b are the corresponding terms on the right side of (a2), treated as constants with respect to time. local stability ofî is then determined by the properties of recursion (a7), whose solution first involves solving its characteristic equation (a8). in principle there are ∆ + 1 real or complex roots of (a8), which we represent as λ 1 , λ 2 , . . . , λ ∆+1 , and the solution of (a7) can be written as a linear combination whose coefficients c i are found from the initial conditions. convergence to, and hence local stability of,î is determined by the magnitude of the absolute value (if real) or modulus (if complex) of the roots λ 1 , λ 2 , . . . , λ ∆+1 :î is locally stable if the largest among the ∆ + 1 of these is less than unity. in table 2 , results of numerically iterating the complete recursion (11) are listed for the delay ∆ varying from ∆ = 0 to ∆ = 4, all starting from i 0 = 1, with n = 10, 000 and the stated parameters. figure 3 illustrates the discrete- and continuous-time dynamics summarized in table 2 . for ∆ = 1, the characteristic equation is a quadratic with complex roots 0.4999 ± 0.6461i whose modulus is 0.8169, which is less than 1. the complexity implies cyclic behavior, and since the modulus is less than one, we see locally damped oscillatory convergence toî. for ∆ = 2, the characteristic equation is a cubic which has one real root 0.6383 and complex roots 0.8190 ± 0.6122i. here the modulus of the complex roots is 1.0225, which is greater than unity, so thatî is not locally stable.
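the stability criterion — largest characteristic-root modulus below unity — can be checked directly against the roots reported above:

```python
def spectral_radius(roots):
    """î is locally stable iff every characteristic root of (a8) has
    modulus < 1, i.e. iff the largest modulus is below unity."""
    return max(abs(r) for r in roots)

# Delta = 1: quadratic roots 0.4999 +/- 0.6461i -> modulus ~0.8169 (stable).
rho1 = spectral_radius([complex(0.4999, 0.6461), complex(0.4999, -0.6461)])
# Delta = 2: cubic roots 0.6383 and 0.8190 +/- 0.6122i -> modulus ~1.0225 (unstable).
rho2 = spectral_radius([0.6383, complex(0.8190, 0.6122), complex(0.8190, -0.6122)])
```

the modulus crossing unity between ∆ = 1 and ∆ = 2 is exactly the loss of local stability described in the text.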
in this case the dynamics depend on the initial value i 0 . if i 0 < 72, i t oscillates but not in a stable cycle. if i 0 > 73, the oscillation becomes unbounded.
references
world health organization. coronavirus disease (covid-19): situation report
scientific and ethical basis for social-distancing interventions against covid-19. the lancet infectious diseases
social factors in epidemiology
modelling the influence of human behaviour on the spread of infectious diseases: a review
evolving public perceptions and stability in vaccine uptake
game theory of social distancing in response to an epidemic
the responsiveness of the demand for condoms to the local prevalence of aids
nine challenges in incorporating the dynamics of behaviour in infectious diseases models
impact and behaviour: the importance of social forces to infectious disease dynamics and disease ecology
economic epidemiology and infectious diseases
erratic flu vaccination emerges from short-sighted behavior in contact networks
capturing human behaviour
a generalization of the kermack-mckendrick deterministic epidemic model
a hybrid epidemic model: combining the advantages of agent-based and equation-based approaches. winter simulation conference, ieee
the effect of a prudent adaptive behaviour on disease transmission
coupled contagion dynamics of fear and disease: mathematical and computational explorations
a general approach for population games with application to vaccination
ebola cases and health system demand in liberia
the spread of awareness and its impact on epidemic outbreaks. proceedings of the national academy of sciences
a review
the dynamics of physiologically structured populations
periodicity in epidemiological models
measles in england and wales-i: an analysis of factors underlying seasonal patterns
seasonal and interannual cycles of endemic cholera in bengal 1891-1940 in relation to climate and geography
etiology of newly emerging marine diseases
epidemic cycles driven by host behaviour
periodic solutions of delay differential equations arising in some models of epidemics
a contribution to the mathematical theory of epidemics. the royal society
modeling infectious diseases in humans and animals. princeton university press
time series modelling of childhood diseases: a dynamical systems approach
adaptive human behavior in epidemiological models
choices, beliefs, and infectious disease dynamics
higher disease prevalence can induce greater sociality: a game theoretic coevolutionary model
global stability of an sir epidemic model
global stability for the seir model in epidemiology
scipy-based delay differential equation (dde) solver
the strategy of building models of complex ecological systems
simple mathematical models with very complicated dynamics
journal of the fisheries board of canada
time-delay versus stability in population models with two and three trophic levels
time delays are not necessarily destabilizing
different epidemic curves for severe acute respiratory syndrome
rational epidemics and their public control
group interest versus self-interest in smallpox vaccination policy
key: cord-024515-iioqkydg
authors: zhong, qi; zhang, leo yu; zhang, jun; gao, longxiang; xiang, yong
title: protecting ip of deep neural networks with watermarking: a new label helps date: 2020-04-17 journal: advances in knowledge discovery and data mining doi: 10.1007/978-3-030-47436-2_35 sha: doc_id: 24515 cord_uid: iioqkydg deep neural network (dnn) models have shown great success in almost every artificial area. it is a non-trivial task to build a good dnn model. nowadays, various mlaas providers have launched their cloud services, which trains dnn models for users. once they are released, driven by potential monetary profit, the models may be duplicated, resold, or redistributed by adversaries, including greedy service providers themselves. to mitigate this threat, in this paper, we propose an innovative framework to protect the intellectual property of deep learning models, that is, watermarking the model by adding a new label to crafted key samples during training. the intuition comes from the fact that, compared with existing dnn watermarking methods, adding a new label will not twist the original decision boundary but can help the model learn the features of key samples better. we implement a prototype of our framework and evaluate the performance under three different benchmark datasets, and investigate the relationship between model accuracy, perturbation strength, and key samples’ length. extensive experimental results show that, compared with the existing schemes, the proposed method performs better under small perturbation strength or short key samples’ length in terms of classification accuracy and ownership verification efficiency. as deep learning models are more widely deployed and become more valuable, many companies, such as google, microsoft, bigml, and amazon, have launched cloud services to help users train models from user-supplied data sets. although appealing simplicity, this process poses essential security and legal issues. 
the customer can be concerned that the provider who trains the model for him might resell the model to other parties. say, for example, an inside attacker can replicate the model with little cost and build a similar pay-per-query api service with a lower charge. once that happens, the market share of the model holder may decrease. in another scenario, a service provider may be concerned that customers who purchase a deep learning network model may distribute or even sell the model to other parties with a lower fee by violating the terms of the license agreement. undoubtedly, these can threaten the provider's business. as a result, endowing the capability of tracing illegal deep neural network redistribution is imperative to secure a deep learning market and provides fundamental incentives to the innovation and creative endeavours of deep learning. in the traditional literature, watermarking [2] is mainly used for copyright protection [11, 15] of multimedia data. applying the idea of watermarking to protect the intellectual property (ip) of deep neural network (dnn) models is first introduced by uchida et al. [13] in 2017. after that, researchers have proposed several dnn watermarking schemes, which can be mainly categorized into two types according to their watermark extraction/verification method: white-box and black-box watermarking. the works in [13] and [3] are the typical examples of white-box watermarking, which are built on the assumption that the internal details of the suspicious model are known to the model owner and the entire watermark needs to be extracted. the authorship verification is done by comparing the bit error between the extracted watermark and the embedded one. however, their range of application has been restricted by the inherent constraint, i.e., the internal details is known to the owner, and recent works are more focused on the black-box setting. 
the black-box setting only assumes access to the remote dnn api but not its internal details. the frameworks of white-box and black-box dnn watermarking schemes are the same, i.e., they both consist of a watermark embedding stage and an extraction/verification stage. typical examples of black-box watermarking are the works in [1, 14] , where the authors utilized the back-door property of neural network models [1, 6] to embed ownership information when building the model. more specifically, in these works, the watermark embedding is achieved by training with, besides normal samples, some extra crafted samples, or the so-called trigger set (both are referred to as key samples in this work). in the verification stage, the watermarked model will return the predefined labels upon receiving the key samples (compared to the watermark-free model who returns random labels) while performing as normal on non-key samples. according to the key samples they used, these methods can be further categorized into two main classes as follows. the first category is to use crafted key samples, that is, key samples are obtained by superimposing perturbation to training samples. taking image classification as an example, one can embed a readable logo or noise pattern into the normal training images. then these key images are assigned to a specific wrong label [14] . in merrer et al.'s work [9] , some normal images that close to the decision frontier are modified imperceptibly, and part of them are assigned to wrong labels, while others inherit their original correct ones. different from [9, 14] , the authors in [10] employed an autoencoder to embed an exclusive logo into ordinary samples and get the key samples. the second category is to use clean key samples. for instance, in [14] , one kind of key images are chosen from unrelated classes and marked to a specific wrong label. 
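a crafted key sample of the first category — superimposing a logo or noise pattern on a normal training image — can be sketched as follows; the array shapes, the additive blending rule, and the square "logo" are illustrative assumptions, not the exact recipes of the cited papers.

```python
import numpy as np

def craft_key_samples(images, pattern, strength):
    """Blend a fixed pattern into normal images:
    x_key = clip(x + strength * pattern, 0, 1).
    images: (m, h, w) arrays in [0, 1]; pattern: (h, w) array in [0, 1]."""
    return np.clip(images + strength * pattern[None, :, :], 0.0, 1.0)

rng = np.random.default_rng(0)
imgs = rng.random((5, 28, 28))               # stand-ins for normal images
logo = np.zeros((28, 28))
logo[10:18, 10:18] = 1.0                     # hypothetical square "logo"
keys = craft_key_samples(imgs, logo, strength=0.3)
```

the `strength` parameter plays the role of the perturbation strength studied in the experiments: pixels under the pattern are shifted while the rest of each image is untouched.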
in [5] , the key samples are sampled from the ordinary images, which can be correctly recognized by the watermarked model but misclassified by the corresponding watermark-free model. another typical example is the work proposed by adi et al. in [1] , in which they chose some abstract images that are uncorrelated to each other to serve as key samples, and these abstract images are randomly labeled (so the probability that this random label equals the output label of a watermark-free model is low). the underlying rationale is, once again, that only the protected model can correctly recognize the key samples with overwhelming probability since they contribute to the training process. to summarize, and to the best of our knowledge, all the existing black-box dnn watermarking schemes are back-door based, and they are key-sample dependent since assigning key samples wrong labels will inevitably, more or less, twist the original decision boundary. in this sense, the functionality (i.e., classification accuracy) and robustness of the watermarked model are directly related to the characteristics of the key samples used. if, for example, crafted key samples are used for watermarking a dnn model, and a fixed perturbation is superimposed on a certain key sample that is far away from the original classification frontier (of the watermark-free dnn model), then the decision boundary must be twisted heavily (e.g., become a fractal-like structure) to meet the accuracy criteria, while the robustness or generality will decrease correspondingly. our key observation to mitigate this problem is simple but effective: adding a new label to the key samples will minimize, if not eliminate, the effect of boundary twisting.
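the new-label idea amounts to extending the classifier's output space from k classes to k + 1 before training, with every key sample mapped to the added class. a hedged sketch of the dataset assembly (the function and array names are illustrative, not from the paper):

```python
import numpy as np

def build_watermark_dataset(x_train, y_train, x_key, num_classes):
    """Append key samples under a NEW label `num_classes`, so the original
    decision boundary among classes 0..num_classes-1 is not twisted."""
    new_label = num_classes                      # the added (K+1)-th class
    y_key = np.full(len(x_key), new_label)
    x_all = np.concatenate([x_train, x_key])
    y_all = np.concatenate([y_train, y_key])
    return x_all, y_all, num_classes + 1         # train with K+1 outputs

# Toy shapes: 100 "normal" samples over 10 classes plus 8 key samples.
x = np.zeros((100, 28, 28))
y = np.arange(100) % 10
xk = np.ones((8, 28, 28))
x_all, y_all, k = build_watermark_dataset(x, y, xk, num_classes=10)
```

the superimposed perturbation is then treated as the feature that dominates the new class, rather than as noise pushing a sample across an existing boundary.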
the rationale lies in the fact that, instead of treating key samples are drawn from the marginal distribution of the sample space, we consider the superimposed perturbation to the samples or unrelated natural samples as a new feature that dominates the classification of a new class. theoretically, after adding a new label, the boundary will not be twisted, and all the merits of the corresponding watermark-free model will be preserved. from another point of view, the required number of key samples for watermark embedding, ownership verification, and the false-positive rate will be minimized when compared with boundary-twisted kind dnn watermarking schemes [14] . in a nutshell, we regard the contributions of this work are as follows: -we propose a novel black-box dnn watermarking framework that has high fidelity, high watermark detection rate, and zero false-positive rate, and robust to pruning attack and fine-tuning attack. -we evaluate the proposed framework on three benchmark datasets, i.e., mnist, cifar10 and cifar100, to quantify the relationship among classification accuracy, perturbation strength, and length of the key samples used during training. the rest of this paper is structured as follows. in sect. 2, we briefly introduce some background knowledge of deep neural networks, watermarking and dnn watermarking. section 3 presents the formal problem formulation and algorithmic details of the proposed dnn watermarking approach. the experimental results and analyses are presented in sect. 4, and some further security considerations are discussed in sect. 5. we make a conclusion in sect. 6. conceptually, the basic premise of a dnn model is to find a function f : x → y that can predict an output value y ∈ y upon receiving a specific input data x ∈ x. a dnn model generally consists of three parts: an input layer, one or more hidden layers, and an output layer. each layer has several nodes that are customarily called neurons that connect to the next layer. 
Generally speaking, the more hidden layers, the better the performance of the model. However, it is not an easy task to train a good model f that predicts well on unseen samples. Typically, the training requires a vast number of labeled samples, while the labeling requires expert knowledge in most applications. With the data available, the actual training, which involves minimizing a loss function L that, in the case of a DNN, depends on millions of parameters, also relies on powerful computing resources. This observation motivates us to design mechanisms to protect the intellectual property of DNNs. Digital multimedia watermarking, which exploits the redundancy in multimedia data to hide information, is a long-studied research area. One popular application is ownership verification of digital content, including audio, video, images, etc. The ownership verification process can be carried out in two different ways, depending on the embedding method: 1) extracting data from a suspicious copy and comparing the similarity between the extracted data and the embedded watermark; or 2) confirming that an ownership-related watermark exists in a suspicious copy. Typically, the verification is executed by a trusted third party, for example, a judge. For DNN watermarking, the watermark extraction/verification process can be executed in either a white-box or a black-box way. The white-box setting assumes that the verifier has full access to all the parameters of the suspicious model, which is similar to the first kind of digital watermarking verification. In the black-box setting, it is assumed that the verifier can only access the API of the remote suspicious model, i.e., send queries through the API of the suspicious model, which outputs a class tag. Most recent DNN watermarking schemes focus on black-box verification, as it is more practical than the white-box setting. This work also lies in the domain of the black-box setting.
Similar to the current literature, and for ease of presentation, we focus on IP protection of image classification DNN models. Without loss of generality, we only consider the first category of black-box DNN watermarking, i.e., crafting image samples by superimposing a perturbation on them. It is noteworthy, however, that the proposed model can also be applied to classification models for other data formats, and it is also compatible with the second category of DNN watermarking. There is no essential distinction between these two kinds of key samples in terms of classification tasks, since both can be viewed as perturbed versions of original images. We consider a scenario in which three parties are involved: a service provider, who helps the customer train a watermarked model f_w; a customer Alice, who is the model owner and provides the training data; and an adversary Bob, who is Alice's business competitor and has obtained a copy of Alice's model f_w. After receiving Alice's model, Bob has an incentive to modify f_w slightly to obtain f'_w, for example by model compression, so as to avoid IP tracing under the condition that the model accuracy does not decrease. We study the problem of how to prove that Bob's model f'_w is an illegal copy of Alice's model f_w via black-box access to f'_w. The overall workflow of the service is depicted in Fig. 1. Ideally, a good watermarked DNN model should have the following desirable properties:
- Fidelity: the classification accuracy of the watermarked model f_w on normal test data should be close to that of the original model f;
- Effectiveness and efficiency: the false-positive rate for key samples should be minimized, and a reliable ownership verification result should be obtained with few queries to the remote DNN API;
- Robustness: the watermarked model should resist several known attacks, for example, the pruning attack and the fine-tuning attack.
From a high-level point of view, a DNN watermarking scheme π consists of three algorithms: KSGen, TrEmb, and Ver. KSGen takes as input a subset of the original dataset D and a secret s, and outputs a key sample dataset. TrEmb takes as input the original dataset D and the result of KSGen, and outputs a watermarked model f_w. Ver takes as input a suspicious copy f'_w and the result of KSGen, and concludes whether f'_w is pirated or not. The DNN watermarking scheme π is superior (to works in the literature) if it achieves a better trade-off among the three properties mentioned above. Before diving into the details of the method, we present a motivating example. For illustration, we extract the output layer to form a toy network (the left part of Fig. 2(a)). Then we add a new label to the extracted network to observe the boundary twist of the expanded network (the right part of Fig. 2(a)). As is clear from Fig. 2(a), the change caused by adding a new label is quite small. We ran more experiments on this toy network and the expanded network and depict the results in Fig. 2(b) for clear comparison. For ease of presentation and without loss of generality, assume the original goal is to predict (δ − 1) classes by training a model f from the dataset D = {(x^(i), y^(i))}, with labels y^(i) ∈ {1, ..., δ − 1}. After adding a new label, we instead train a model f_w from D and some crafted samples (produced by running KSGen) to predict δ different classes. With these notations, the details of the three algorithms KSGen, TrEmb, and Ver are given as follows. Key sample generation KSGen: for a given subset of D, say D_1, the algorithm crafts samples by calculating k^(i) = x^(i) + β · p, where p is the perturbation pattern determined by the secret s, and α = |D_1|/|D| and β, the perturbation strength, are system parameters that will be studied later in Sect. 4. Assigning all the crafted samples to the δ-th label, KSGen outputs the key sample dataset K = {(k^(i), δ)}, i = 1, ..., |D_1|.
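The KSGen step above can be sketched in a few lines. This is a hedged illustration, not the authors' code: we assume flattened inputs in [0, 1], an additive Gaussian pattern seeded by the secret s (consistent with the Gaussian-noise choice made later in Sect. 4), and clipping to keep samples valid; the helper names are ours.

```python
import random

def ksgen(d1, secret, beta, delta, dim):
    # Derive a fixed perturbation pattern p from the secret s.
    rng = random.Random(secret)
    pattern = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    key_samples = []
    for x, _label in d1:
        # k^(i) = x^(i) + beta * p, clipped to the valid pixel range [0, 1].
        k = [min(1.0, max(0.0, xi + beta * pi)) for xi, pi in zip(x, pattern)]
        # Every crafted sample is assigned the new delta-th label.
        key_samples.append((k, delta))
    return key_samples
```

Because the pattern is derived deterministically from s, the judge can regenerate key samples from any subset of D at verification time.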
DNN training and watermark embedding TrEmb: with the datasets available, the service provider trains a DNN model f_w. Different from the watermark-free model f, which classifies (δ − 1) classes, the watermarked model f_w learns from the crafted dataset K to classify one more class, i.e., the class δ. In line with the literature [1, 5, 9, 10, 14], we also employ the softmax function for the output layer. Ver: upon detecting a suspicious DNN service f'_w of Bob, Alice will ask the judge to verify whether a secret watermark can be identified in Bob's model. The judge chooses a subset of D, say D_2, produces K = KSGen(s, D_2), and sends each query image k ∈ K to f'_w to check whether the output label is δ. It is easy to see that, after adding a new label, a watermark-free model cannot output the nonexistent class label δ; that is, Pr[f(x) = δ] = 0 holds whether x ∈ D or x ∈ K, which implies a zero false-positive rate, desirable as discussed in Sect. 3.1. Relating this to the other properties of Sect. 3.1, fidelity essentially requires that the accuracy of f_w on normal data be close to that of f. For efficiency, let q be the number of queries needed to observe the label δ, and θ the number of appearances of a label that is not δ. Then the mean value of q is bounded in terms of the reciprocal of the accuracy p on key samples. For example, if p = 0.8, we have E[q] = 2, which is small enough for verification purposes. In this section, we evaluate the performance of our proposed DNN watermarking method using three datasets: MNIST, CIFAR10, and CIFAR100. The backdoor-based DNN watermarking scheme proposed by Zhang et al. [14] serves as the main test-bed against which to evaluate our proposal. We train and test all models using the TensorFlow package on a machine equipped with two Tesla V100 GPUs. To eliminate the impact of randomness, every experiment is executed 50 times, and the mean is reported. Datasets: three different benchmark datasets are used for the evaluation of our proposal, namely MNIST, CIFAR10, and CIFAR100.
According to our definition of key samples, they can be viewed as modified versions of ordinary samples, and the differences lie in the location and strength of the perturbation. In [14], the authors validated that key samples generated by adding noise to normal images are the best choice in terms of various assessment metrics. For this reason, and also to facilitate experiments and comparisons, we use Gaussian noise as the perturbation pattern, which can easily be obtained from a secret random number generator keyed by s. In [14], the key samples are labeled as one of the existing classes, say, for example, class "airplane"; the key samples must therefore be generated from normal samples that do not belong to the class "airplane". It is worth mentioning that the aim of this work is not to achieve superior classification accuracy, but to compare the performance of watermarked networks trained with key samples assigned a new label versus an existing one. The DNNs we use are relatively shallow but train quickly, which meets our requirements. We use the normal datasets, without key samples, to train the watermark-free models f; their accuracies on the three benchmark datasets are 98%, 87%, and 60%, respectively. Fidelity: the main purpose of the fidelity test is to check whether the classification accuracy of the watermarked model f_w on non-key images deteriorates after embedding. To assess this property, we test the classification accuracy of the watermarked model f_w on the original test dataset (the original functionality of f) and on a newly generated key sample dataset (which the judge will need at the Ver stage). In addition, by comparing with the work in [14], we experimentally investigate the relationship among performance, the ratio α of perturbed samples used for training, and the perturbation strength β, as shown in Fig. 3. From the dotted lines in Fig.
3, it is easy to conclude that both the proposed method and Zhang et al.'s method achieve high classification accuracy on normal samples. In fact, both are close to the ground truth of the original watermark-free model f. The goal of effectiveness is to measure the credibility of the watermark-existence claim provided by the output of the verification process, while efficiency tests how many queries are needed to obtain a credible result under a pay-per-query API service. Obviously, the fewer queries the better, as this not only saves time and money during verification, but also avoids arousing Bob's suspicion. From Fig. 3(a)-(c), we can see that the model accuracy of both methods increases with the perturbation strength of the key samples. As shown in Fig. 3(e), when the perturbation strength β = 0.001, our method achieves a testing accuracy higher than 80% with only 0.6% of key samples used for training. For comparison, Zhang et al.'s method needs more than 0.9% of key samples to reach the same accuracy. To conclude, our method performs better under small α or β for all datasets. Once again, we attribute this improvement to the added label. When α and β are small, the number of crafted key samples is small and they are very similar to normal samples. Under these circumstances, if the key samples are assigned to wrong (existing) classes, the learned weights that contribute to the outputs on key samples cannot change much, because of the fidelity constraint. Conversely, if a new label is added, the weights associated with this new class can be modified without breaking the fidelity constraint. As for efficiency, as discussed in Sect. 3.2, our method needs only 2 queries on average to determine the existence of a watermark in a suspicious DNN model when p = 0.8, which is the case for most choices of α and β, as shown in Fig. 3.
For Zhang et al.'s approach, since it is not false-positive-free, querying a watermark-free model with key samples may still trigger the predefined label (of the key samples) as the output of the API. To mitigate this bias, a larger number of queries should be used, and Ver should be redefined to output "pirated" if and only if θ/|K| ≤ τ, where τ is a predefined threshold and θ is the number of appearances of a label that is not equal to the predefined label (of the key samples). The accuracy of Ver, after submitting the whole set K to the API as a batch, is then the probability that at most ⌊τ|K|⌋ of the |K| queries miss the predefined label. For example, with p = 0.8, Acc = 90%, and τ = 0.3, |K| = 40 queries should be used for Ver. Clearly, this is not as efficient as the proposed method. The goal of robustness is to check whether the proposed model can resist attacks; following the literature, we mainly consider the pruning attack (also called the compression attack) and the fine-tuning attack here. As discussed in Sect. 3.1, the adversary has an incentive to modify the model to prevent ownership verification. Obviously, the adversary does not want such modifications to harm the model's classification accuracy, and pruning and fine-tuning attacks fit this requirement exactly. By "robust" in the scenario of DNN IP protection using watermarking, we essentially mean that the classification accuracy on key samples should be insensitive to such attacks. In the experiments, we test the classification accuracy of the watermarked model on ordinary samples and on key samples, separately, under different pruning rates; the results are shown in Fig. 4. It can be observed from this figure that the model accuracy for classifying newly generated key samples, under both this proposal and Zhang et al.'s design, does not decrease much as the pruning rate increases. In general, however, our method performs slightly better than the one in [14], especially when the pruning rate is relatively high.
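The batch verification just described amounts to a binomial tail probability, and the required |K| can be checked numerically. The sketch below is our own illustration, under the assumption that each key-sample query independently returns the predefined label with probability p; the function names are ours.

```python
from math import comb

def ver_accuracy(p, tau, k_size):
    # Probability that the fraction of "misses" theta/|K| stays at or
    # below tau, when each of the |K| independent queries returns the
    # predefined label with probability p (a miss has probability 1 - p).
    max_misses = int(tau * k_size)
    return sum(comb(k_size, i) * (1 - p) ** i * p ** (k_size - i)
               for i in range(max_misses + 1))

def queries_needed(p, tau, target_acc, cap=500):
    # Smallest batch size |K| whose verification accuracy reaches target_acc.
    for k_size in range(1, cap + 1):
        if ver_accuracy(p, tau, k_size) >= target_acc:
            return k_size
    return None
```

Under these assumptions, `ver_accuracy(0.8, 0.3, 40)` exceeds 0.9, consistent with the |K| = 40 figure quoted in the text.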
It is reported in [8] that deep neural networks, besides their incremental learning capability, are also prone to forgetting previous tasks. Fine-tuning is a practical method that exploits this catastrophic forgetting to retrain a watermarked model so as to invalidate ownership verification. To measure the testing accuracy of our method on clean samples and key samples under fine-tuning, we employ the same experimental settings as in [14]. The results of the fine-tuning attack are tabulated in Table 1. For a fair comparison, the parameters used for the three datasets are (α = 0.01, β = 3.5 × 10^-3) for MNIST, and (α = 0.01, β = 0.01) for CIFAR10 and CIFAR100. Under these settings, both our method and the one in [14] achieve the ground-truth accuracy on each dataset, as shown by the values in parentheses in Table 1. From this table, it is easy to see that after fine-tuning, both our method and the method in [14] still preserve good classification accuracy on normal samples. This is due to the generalization property of DNNs and is well accepted in the machine learning field. For the classification of key samples after fine-tuning, we expect some accuracy loss: the generalization property certainly still holds, but the watermark label is learned from insufficient data and weak features. It is observed from the table that, for the MNIST dataset, the accuracy of both methods remains as high as the ground truth. This may be because the MNIST dataset is relatively simple, so the weak features (from the key samples) are learned well during training, and the generalization property dominates the classification accuracy. For the other two datasets, the accuracy decreases as expected. To conclude, although our method cannot fully prevent the fine-tuning attack, compared with the work in [14] it mitigates the attack to a large extent.
Apart from the pruning attack and fine-tuning attack mentioned above, several new attacks [4, 7, 12] against black-box DNN watermarking techniques have recently been proposed. We briefly discuss the most closely related type of attack in this section. The query-detection attack considers the scenario in which, given a query, Bob first judges whether or not the issued query works as a key sample for verification; in this way, the verification Ver can be invalidated by denying service [7]. In [12], the authors adopted an autoencoder to serve as the key sample detector. As discussed in Sect. 4.2, our method works with a smaller number of training key samples and a weaker perturbation strength, which makes such detection harder. In this paper, we proposed a novel black-box DNN watermarking method: assigning a new label to key samples to minimize the distortion of the decision boundary. Compared with existing DNN watermarking frameworks, it achieves a zero false-positive rate and performs better when the number of training key samples is small and the perturbation is weak. Regarding security, it is validated that the new proposal is more robust than existing schemes, and we leave the investigation of its resistance to query rejection attacks for further study.

References:
[1] Turning your weakness into a strength: watermarking deep neural networks by backdooring
[2] An overview of digital video watermarking
[3] DeepMarks: a digital fingerprinting framework for deep neural networks
[4] Leveraging unlabeled data for watermark removal of deep neural networks
[5] DeepSigns: an end-to-end watermarking framework for ownership protection of deep neural networks
[6] BadNets: identifying vulnerabilities in the machine learning model supply chain
[7] Have you stolen my model? Evasion attacks against deep neural network watermarking techniques
[8] Measuring catastrophic forgetting in neural networks. In: Thirty-Second AAAI Conference on Artificial Intelligence
[9] Adversarial frontier stitching for remote neural network watermarking
[10] How to prove your model belongs to you: a blind-watermark based framework to protect intellectual property of DNN
[11] Secure and robust digital image watermarking scheme using logistic and RSA encryption
[12] Robust watermarking of neural network with exponential weighting
[13] Embedding watermarks into deep neural networks
[14] Protecting intellectual property of deep neural networks with watermarking
[15] You can access but you cannot leak: defending against illegal content redistribution in encrypted cloud media center

Acknowledgements: this work was supported in part by the Australian Research Council under grant LP170100458, in part by the National Natural Science Foundation of China under grant 61702221, and in part by the NVIDIA Corporation.

key: cord-027228-s32v6bmd
authors: Subramanian, Vigneshwar; Kattan, Michael W.
title: Editorial: Why is modeling COVID-19 so difficult?
date: 2020-06-19
journal: Chest
doi: 10.1016/j.chest.2020.06.014
sha: doc_id: 27228 cord_uid: s32v6bmd

As the COVID-19 pandemic continues with no clear end in sight, clinicians, policymakers, and the public alike are searching for answers about where we are headed. Several independent models of disease spread and mortality have been published, often coming to different conclusions. Modeling COVID-19 has proven to be a complex and difficult endeavor. It is useful for this discussion to differentiate between epidemiologic and individual prediction models. The former category predicts outcomes for a population, such as a state or country, whereas the latter predicts outcomes for individual patients. Each type of model has a set of constraints involved in its design and use. The CDC publishes an ensemble forecast of national deaths based on nineteen independently developed models, all of which fall into the epidemiologic category [1].
Predicting mortality in a population over time is a very complicated task and requires a number of assumptions. Notably, the biological properties of this virus are imperfectly understood, given how novel it is. For instance, the duration and degree of immunity conferred by exposure will determine the likelihood and intensity of recurrent outbreaks. Presumably to mitigate this issue, the models in the CDC ensemble make projections over a four-week period. Mortality is fundamentally a function of the fatality rate and the spread of disease, each of which is driven by a multitude of factors. Fatality rates depend in part on population demographics and the availability of testing, and currently available estimates vary widely across countries [2]. Disease spread depends heavily on the prevalence of COVID-19, which is not precisely known, and on policy interventions such as social distancing, which are a moving target and not intrinsically measurable. Policy changes rapidly, and as the Memorial Day crowds show, the degree of compliance with restrictions is also unknown. Emerging studies using antibody tests suggest that prevalence is likely highly underestimated [3], but these serologic tests are themselves imperfect, and validated performance can vary broadly depending on the specific assay used [4]. To get around this, models either use proxies or make assumptions about the degree and effectiveness of social distancing. For example, the University of Texas model uses phone geolocation data as a proxy for social distancing and assumes the intervention remains constant across the forecasted time period [5]. Conversely, the Columbia model considers three scenarios, all of which assume a flat 20% reduction in contact rates for each week of stay-at-home orders, with either a one-time 10% increase in contact rates, a weekly 10% increase in contact rates, or no change in contact rates after stay-at-home orders are lifted [6].
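Scenario assumptions like these can be explored with even a toy compartmental model. The sketch below is purely illustrative (our own parameter choices, not those of the Columbia or Texas models): a normalized SIR system whose transmission rate is discounted by a fixed factor for each week of stay-at-home orders.

```python
def simulate_sir(beta_schedule, gamma, s0, i0, days, dt=0.1):
    # Forward-Euler SIR with a time-dependent transmission rate.
    # beta_schedule(day) returns the transmission rate on that day.
    s, i, r = s0, i0, 0.0
    history = []
    for step in range(int(days / dt)):
        day = step * dt
        beta = beta_schedule(day)
        new_inf = beta * s * i * dt   # S -> I
        new_rec = gamma * i * dt      # I -> R
        s -= new_inf
        i += new_inf - new_rec
        r += new_rec
        history.append((day, s, i, r))
    return history

def weekly_reduction(base_beta, weekly_factor):
    # E.g. a flat 20% reduction per week of stay-at-home orders
    # corresponds to weekly_factor = 0.8, compounding each full week.
    return lambda day: base_beta * (weekly_factor ** int(day // 7))
```

Comparing a constant-rate run against `weekly_reduction(beta, 0.8)` shows how strongly the contact-rate assumption drives the projected peak.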
These design choices significantly alter predicted outcomes, and it is impossible to know which assumptions will most closely match reality. Assumptions may also change over time as information emerges and model performance is reassessed; for example, the Columbia model updated its contact-rate assumptions to the current parameters to model the loosening of social distancing restrictions as states reopen [6]. Individual prediction models, which make diagnostic or prognostic predictions about a disease based on a patient's unique set of characteristics, have their own set of challenges. The general workflow involved in developing such a model is as follows: first, the outcome of interest is defined; second, relevant predictors or risk factors are identified; third, the effects of each predictor variable are estimated, for example in a regression analysis; and finally, the model is validated [7]. Validation comprises discrimination and calibration, which are, respectively, the ability to separate individuals who experience the outcome from those who do not, and how well the predicted probabilities match reality. Each of these steps presents a challenge in the context of any emerging disease, such as COVID-19. First, many clinically useful outcomes of interest, such as progression to the ICU, need for a ventilator, or death, are time-to-event, and therefore we may need to consider censoring. In addition, relevant risk factors are still emerging, and many are difficult to measure, such as travel history and symptom duration (symptom onset is often uncertain). We can identify candidate predictors by generalizing from models of other respiratory viruses, but these need to be validated, and their relevance may vary depending on the population studied. Once an outcome is defined and predictors are selected, we need a large dataset of patients that includes all the desired features, so that we can estimate regression coefficients or train a machine learning algorithm.
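The two halves of validation mentioned above, discrimination and calibration, can each be computed in a few lines. A minimal sketch with our own helper names, using the c-statistic (the usual AUC interpretation for binary outcomes) and calibration-in-the-large:

```python
def c_statistic(pred, outcome):
    # Discrimination: probability that a randomly chosen patient who
    # experienced the outcome received a higher predicted risk than a
    # randomly chosen patient who did not (ties count one half).
    pairs = concordant = 0.0
    for pi, yi in zip(pred, outcome):
        for pj, yj in zip(pred, outcome):
            if yi == 1 and yj == 0:
                pairs += 1
                if pi > pj:
                    concordant += 1
                elif pi == pj:
                    concordant += 0.5
    return concordant / pairs if pairs else float("nan")

def calibration_in_the_large(pred, outcome):
    # Calibration: compare mean predicted risk with the observed event rate.
    n = len(pred)
    return sum(pred) / n, sum(outcome) / n
```

A c-statistic of 0.5 is no better than chance, 1.0 is perfect separation; a mean predicted risk far from the observed event rate signals systematic over- or under-prediction.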
Thanks to electronic medical records, an enormous amount of information will eventually be available, but these data need to be collected and cleaned for analysis. Dimensionality of the data also becomes an issue: as more predictors are considered, a greater number of observations is required to assess significance (a common rule of thumb being ten observations per predictor). Validation of the model is also critical. We can use the training dataset itself for internal validation, using methods such as bootstrapping and ten-fold cross-validation. Ideally, we would also externally validate models on an independent cohort, but this takes time and is not always done. When considering the value of COVID-19 models, there are several questions we must consider. The strength of epidemiologic models is that they allow us to flexibly consider a range of possible scenarios and examine how outcomes change as our assumptions shift. No single model represents truth, but the combined spectrum of projections can guide us. Similarly, individual models can help clinicians design a management plan that has the best predicted outcome for each unique patient. A model cannot and does not have to be perfect, but it should be a useful approximation. The pandemic raises an interesting question: is it more ethical to use a model that is not validated, or to press forward with no model? As a corollary, if a clinical model has been extensively internally validated, but not externally validated, should it be used in clinical practice? Models can be helpful decision-making aids during our uncertain future, but we must keep in mind the assumptions that were built in and the quality of the underlying data.

References:
1. COVID-19 forecasts
2. Johns Hopkins Coronavirus Resource Center
3. COVID-19 antibody seroprevalence in Santa Clara County, California. medRxiv
4. EUA authorized serology test performance
5. Projections for first-wave COVID-19 deaths across the US using social-distancing measures derived from mobile phones.
medRxiv
6. Projection of COVID-19 cases and deaths in the US as individual states re-open. medRxiv
7. Personalized and precision medicine informatics: a workflow-based view. Health Informatics

key: cord-018899-tbfg0vmd
authors: Brauer, Fred; Castillo-Chavez, Carlos
title: Epidemic models
date: 2011-10-03
journal: Mathematical Models in Population Biology and Epidemiology
doi: 10.1007/978-1-4614-1686-9_9
sha: doc_id: 18899 cord_uid: tbfg0vmd

Communicable diseases such as measles, influenza, and tuberculosis are a fact of life. We will be concerned with both epidemics, which are sudden outbreaks of a disease, and endemic situations, in which a disease is always present. The AIDS epidemic, the recent SARS epidemic, recurring influenza pandemics, and outbursts of diseases such as the Ebola virus are events of concern and interest to many people. The prevalence and effects of many diseases in less-developed countries are probably not as well known but may be of even more importance. Every year millions of people die of measles, respiratory infections, diarrhea, and other diseases that are easily treated and not considered dangerous in the Western world. Diseases such as malaria, typhus, cholera, schistosomiasis, and sleeping sickness are endemic in many parts of the world. The effects of high disease mortality on mean life span, and of disease debilitation and mortality on the economy in afflicted countries, are considerable. Diseases transmitted by viral agents, such as measles, rubella (German measles), and chicken pox, confer immunity against reinfection, while diseases transmitted by bacteria, such as tuberculosis, meningitis, and gonorrhea, confer no immunity against reinfection. Other human diseases, such as malaria, are transmitted not directly from human to human but by vectors, agents (usually insects) that are infected by humans and then transmit the disease to humans. There are also diseases, such as West Nile virus, that are transmitted back and forth between animals and vectors.
Heterosexual transmission of HIV/AIDS is also a vector process, in which transmission goes back and forth between males and females. We will focus on the transmission dynamics of an infection from individual to individual in a population, but many of the same ideas arise in the transmission of a genetic characteristic, such as gender, race, or genetic diseases; a cultural "characteristic," such as language or religion; an addictive activity, such as drug use; and the gain or loss of information communicated through gossip, rumors, and so on. Similarly, many of the ideas arise with different characterizations of what is meant by an individual, including the types of cells in the study of the disease dynamics of the immune system. In the study of Chagas disease, a "house" (infested houses may correspond to "infected" individuals) may be chosen as the epidemiological unit; in tuberculosis, a household or community or a group of strongly linked individuals (a "cluster") may be the chosen unit. An epidemic, which acts on a short temporal scale, may be described as a sudden outbreak of a disease that infects a substantial portion of the population in a region before it disappears. Epidemics usually leave many members untouched. Often these attacks recur with intervals of several years between outbreaks, possibly diminishing in severity as populations develop some immunity. This is an important aspect of the connection between epidemics and disease evolution. The historian W. H. McNeill argues, especially in his book Plagues and Peoples (1976), that the spread of communicable diseases has frequently been an important influence in history. For example, there was a sharp population increase throughout the world in the eighteenth century; the population of China increased from 150 million in 1760 to 313 million in 1794, and the population of Europe increased from 118 million in 1700 to 187 million in 1800.
there were many factors involved in this increase, including changes in marriage age and technological improvements leading to increased food supplies, but these factors are not sufficient to explain the increase. demographic studies indicate that a satisfactory explanation requires recognition of a decrease in the mortality caused by periodic epidemic infections. this decrease came about partly through improvements in medicine, but a more important influence was probably the fact that more people developed immunities against infection as increased travel intensified the circulation and cocirculation of diseases. there are many biblical references to diseases as historical influences. the book of exodus describes the plagues that were brought down upon egypt in the time of moses. another example is the decision of sennacherib, the king of assyria, to abandon his attempt to capture jerusalem about 700 bc because of the illness of his soldiers (isaiah 37, 36-38), and there are several other biblical descriptions of epidemic outbreaks. the fall of empires has been attributed directly or indirectly to epidemic diseases. in the second century ad, the so-called antonine plagues (possibly measles nomic hardships leading to disintegration of the empire because of disorganization, which facilitated invasions of barbarians. the han empire in china collapsed in the third century ad after a very similar sequence of events. the defeat of a population of millions of aztecs by cortez and his 600 followers can be explained, in part, by a smallpox epidemic that devastated the aztecs but had almost no effect on the invading spaniards, thanks to their built-in immunities. the aztecs were not only weakened by disease but also confounded by what they interpreted as a divine force favoring the invaders. smallpox then spread southward to the incas in peru and was an important factor in the success of pizarro's invasion a few years later. 
smallpox was followed by other diseases such as measles and diphtheria imported from europe to north america. in some regions, the indigenous populations were reduced to one-tenth of their previous levels by these diseases: between 1519 and 1530 the indian population of mexico was reduced from 30 million to 3 million. the black death (probably bubonic plague) spread from asia throughout europe in several waves during the fourteenth century, beginning in 1346, and is estimated to have caused the death of as much as one-third of the population of europe between 1346 and 1350. the disease recurred regularly in various parts of europe for more than 300 years, notably as the great plague of london of 1665-1666. it then gradually withdrew from europe. since the plague struck some regions harshly while avoiding others, it had a profound effect on political and economic developments in medieval times. in the last bubonic plague epidemic in france (1720-1722), half the population of marseilles, 60 percent of the population in nearby toulon, 44 per cent of the population of arles, and 30 percent of the population of aix and avignon died, but the epidemic did not spread beyond provence. expansions and interpretations of these historical remarks may be found in mcneill (1976) , which was our primary source on the history of the spread and effects of diseases. the above examples depict the sudden dramatic impact that diseases have had on the demography of human populations via disease-induced mortality. in considering the combined role of diseases, war, and natural disasters on mortality rates, one may conclude that historically humans who are more likely to survive and reproduce have either a good immune system, a propensity to avoid war and disasters, or, nowadays, excellent medical care and/or health insurance. 
descriptions of epidemics in ancient and medieval times frequently used the term "plague" because of a general belief that epidemics represented divine retribution for sinful living. more recently, some have described aids as punishment for sinful activities. such views have often hampered or delayed attempts to control this modern epidemic. there are many questions of interest to public health physicians confronted with a possible epidemic. for example, how severe will an epidemic be? this question may be interpreted in a variety of ways. for example, how many individuals will be affected and require treatment? what is the maximum number of people needing care at any particular time? how long will the epidemic last? how much good would quarantine of victims do in reducing the severity of the epidemic? scientific experiments usually are designed to obtain information and to test hypotheses. experiments in epidemiology with controls are often difficult or impossible to design, and even if it is possible to arrange an experiment, there are serious ethical questions involved in withholding treatment from a control group. sometimes data may be obtained after the fact from reports of epidemics or of endemic disease levels, but the data may be incomplete or inaccurate. in addition, data may contain enough irregularities to raise serious questions of interpretation, such as whether there is evidence of chaotic behavior [ellner, gallant, and theiler (1995)]. hence, parameter estimation and model fitting are very difficult. these issues raise the question whether mathematical modeling in epidemiology is of value. mathematical modeling in epidemiology provides understanding of the underlying mechanisms that influence the spread of disease, and in the process, it suggests control strategies.
in fact, models often identify behaviors that are unclear in experimental data, often because data are nonreproducible and the number of data points is limited and subject to errors in measurement. for example, one of the fundamental results in mathematical epidemiology is that most mathematical epidemic models, including those that include a high degree of heterogeneity, usually exhibit "threshold" behavior, which in epidemiological terms can be stated as follows: if the average number of secondary infections caused by an average infective is less than one, a disease will die out, while if it exceeds one there will be an epidemic. this broad principle, consistent with observations and quantified via epidemiological models, has been used routinely to estimate the effectiveness of vaccination policies and the likelihood that a disease may be eliminated or eradicated. hence, even if it is not possible to verify hypotheses accurately, agreement with hypotheses of a qualitative nature is often valuable. expressions for the basic reproductive number for hiv in various populations are being used to test the possible effectiveness of vaccines that may provide temporary protection by reducing either hiv-infectiousness or susceptibility to hiv. models are used to estimate how widespread a vaccination plan must be to prevent or reduce the spread of hiv. in the mathematical modeling of disease transmission, as in most other areas of mathematical modeling, there is always a trade-off between simple models, which omit most details and are designed only to highlight general qualitative behavior, and detailed models, usually designed for specific situations including short-term quantitative predictions. detailed models are generally difficult or impossible to solve analytically and hence their usefulness for theoretical purposes is limited, although their strategic value may be high.
for public health professionals, who are faced with the need to make recommendations for strategies to deal with a specific situation, simple models are inadequate and numerical simulation of detailed models is necessary. in this chapter, we concentrate on simple models in order to establish broad principles. furthermore, these simple models have additional value since they are the building blocks of models that include detailed structure. a specific goal is to compare the dynamics of simple and slightly more detailed models primarily to see whether slightly different assumptions can lead to significant differences in qualitative behavior. many of the early developments in the mathematical modeling of communicable diseases are due to public health physicians. the first known result in mathematical epidemiology is a defense of the practice of inoculation against smallpox in 1760 by daniel bernoulli, a member of a famous family of mathematicians (eight spread over three generations), who had been trained as a physician. the first contributions to modern mathematical epidemiology are due to p.d. en'ko between 1873 and 1894 [dietz (1988)], and the foundations of the entire approach to epidemiology based on compartmental models were laid by public health physicians such as sir r.a. ross, w.h. hamer, a.g. mckendrick, and w.o. kermack between 1900 and 1935, along with important contributions from a statistical perspective by j. brownlee. a particularly instructive example is the work of ross on malaria. dr. ross was awarded the second nobel prize in medicine for his demonstration of the dynamics of the transmission of malaria between mosquitoes and humans. although his work received immediate acceptance in the medical community, his conclusion that malaria could be controlled by controlling mosquitoes was dismissed on the grounds that it would be impossible to rid a region of mosquitoes completely and that in any case, mosquitoes would soon reinvade the region.
after ross formulated a mathematical model that predicted that malaria outbreaks could be avoided if the mosquito population could be reduced below a critical threshold level, field trials supported his conclusions and led to sometimes brilliant successes in malaria control. unfortunately, the garki project provides a dramatic counterexample. this project worked to eradicate malaria from a region temporarily. however, people who have recovered from an attack of malaria have a temporary immunity against reinfection. thus elimination of malaria from a region leaves the inhabitants of this region without immunity when the campaign ends, and the result can be a serious outbreak of malaria. we formulate our descriptions as compartmental models, with the population under study being divided into compartments and with assumptions about the nature and time rate of transfer from one compartment to another. diseases that confer immunity have a different compartmental structure from diseases without immunity and from diseases transmitted by vectors. the rates of transfer between compartments are expressed mathematically as derivatives with respect to time of the sizes of the compartments, and as a result our models are formulated initially as differential equations. models in which the rates of transfer depend on the sizes of compartments over the past as well as at the instant of transfer lead to more general types of functional equations, such as differential-difference equations and integral equations. in this chapter we describe models for epidemics, acting on a sufficiently rapid time scale that demographic effects, such as births, natural deaths, immigration into and emigration out of a population may be ignored. in the next chapter we will describe models in which demographic effects are included. throughout history, epidemics have had major effects on the course of events.
for example, the black death, now identified as probably having been the bubonic plague which had actually invaded europe as early as the sixth century, spread from asia throughout europe in several waves during the fourteenth century, beginning in 1346, and is estimated to have caused the death of as much as one third of the population of europe between 1346 and 1350. the disease recurred regularly in various parts of europe for more than 300 years, notably as the great plague of london of 1665-1666. it then gradually withdrew from europe. more than 15% of the population of london died in the great plague (1665-1666). it appeared quite suddenly, grew in intensity, and then disappeared, leaving part of the population untouched. one of the early triumphs of mathematical epidemiology was the formulation of a simple model by kermack and mckendrick (1927) whose predictions are very similar to this behavior, observed in countless epidemics. the kermack-mckendrick model is a compartmental model based on relatively simple assumptions on the rates of flow between different classes of members of the population. in order to model such an epidemic we divide the population being studied into three classes labeled s, i, and r. we let s(t) denote the number of individuals who are susceptible to the disease, that is, who are not (yet) infected at time t. i(t) denotes the number of infected individuals, assumed infectious and able to spread the disease by contact with susceptibles. r(t) denotes the number of individuals who have been infected and then removed from the possibility of being infected again or of spreading infection. removal is carried out through isolation from the rest of the population, through immunization against infection, through recovery from the disease with full immunity against reinfection, or through death caused by the disease. 
these characterizations of removed members are different from an epidemiological perspective but are often equivalent from a modeling point of view that takes into account only the state of an individual with respect to the disease. we will use the terminology sir to describe a disease that confers immunity against reinfection, to indicate that the passage of individuals is from the susceptible class s to the infective class i to the removed class r. epidemics are usually diseases of this type. we would use the terminology sis to describe a disease with no immunity against re-infection, to indicate that the passage of individuals is from the susceptible class to the infective class and then back to the susceptible class. usually, diseases caused by a virus are of sir type, while diseases caused by bacteria are of sis type. in addition to the basic distinction between diseases for which recovery confers immunity against reinfection and diseases for which recovered members are susceptible to reinfection, and the intermediate possibility of temporary immunity signified by a model of sirs type, more complicated compartmental structure is possible. for example, there are seir and seis models, with an exposed period between being infected and becoming infective. when there are only a few infected members, the start of a disease outbreak depends on random contacts between small numbers of individuals. in the next section we will use this to describe an approach to the study of the beginning of a disease outbreak by means of branching processes, but we begin with a description of deterministic compartmental models. the independent variable in our compartmental models is the time t, and the rates of transfer between compartments are expressed mathematically as derivatives with respect to time of the sizes of the compartments, and as a result our models are formulated initially as differential equations. 
we are assuming that the epidemic process is deterministic, that is, that the behavior of a population is determined completely by its history and by the rules that describe the model. in formulating models in terms of the derivatives of the sizes of each compartment we are also assuming that the number of members in a compartment is a differentiable function of time. this assumption is plausible once a disease outbreak has become established but is not valid at the beginning of a disease outbreak when there are only a few infectives. in the next section we will describe a different approach for the initial stage of a disease outbreak. the basic compartmental models to describe the transmission of communicable diseases are contained in a sequence of three papers by w.o. kermack and a.g. mckendrick in 1927, 1932, and 1933. the first of these papers described epidemic models. what is often called the kermack-mckendrick epidemic model is actually a special case of the general model introduced in this paper. the general model included dependence on age of infection, that is, the time since becoming infected, and can be used to provide a unified approach to compartmental epidemic models. the special case of the model proposed by kermack and mckendrick in 1927, which is the starting point for our study of epidemic models, is based on the following assumptions; a flow chart is shown in figure 9.1: (i) an average member of the population makes contact sufficient to transmit infection with βn others per unit time, where n represents total population size (mass action incidence). (ii) infectives leave the infective class at rate αi per unit time. (iii) there is no entry into or departure from the population, except possibly through death from the disease. (iv) there are no disease deaths, and the total population size is a constant n.
according to (i), since the probability that a random contact by an infective is with a susceptible, who can then transmit infection, is s/n, the number of new infections in unit time per infective is (βn)(s/n), giving a rate of new infections (βn)(s/n)i = βsi. alternatively, we may argue that for a contact by a susceptible the probability that this contact is with an infective is i/n and thus the rate of new infections per susceptible is (βn)(i/n), giving a rate of new infections (βn)(i/n)s = βsi. note that both approaches give the same rate of new infections; in models with more complicated compartmental structure one may be more appropriate than the other. we need not give an algebraic expression for n, since it cancels out of the final model, but we should note that for an sir disease model, n = s + i + r. later, we will allow the possibility that some infectives recover while others die of the disease. the hypothesis (iii) really says that the time scale of the disease is much faster than the time scale of births and deaths, so that demographic effects on the population may be ignored. an alternative view is that we are interested only in studying the dynamics of a single epidemic outbreak. the assumption (ii) requires a fuller mathematical explanation, since the assumption of a recovery rate proportional to the number of infectives has no clear epidemiological meaning. we consider the "cohort" of members who were all infected at one time and let u(s) denote the number of these who are still infective s time units after having been infected. if a fraction α of these leave the infective class in unit time, then u′ = −αu, and the solution of this elementary differential equation is u(s) = u(0)e^(−αs). thus, the fraction of infectives remaining infective s time units after having become infective is e^(−αs), so that the length of the infective period is distributed exponentially with mean ∫₀^∞ e^(−αs) ds = 1/α, and this is what (ii) really assumes.
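as a quick numerical sanity check on the exponential infective-period claim above, the sketch below (plain python; the function name, the illustrative value of α, and the truncation of the integral are our own choices) integrates the survival fraction e^(−αs) by the trapezoidal rule and recovers the mean infective period 1/α.

```python
import math

# integrate the survival fraction e^(-alpha*s) from 0 to (effectively)
# infinity by the trapezoidal rule; the exact value is 1/alpha.
def mean_infective_period(alpha, s_max=50.0, n=200000):
    h = s_max / n
    total = 0.5 * (math.exp(0.0) + math.exp(-alpha * s_max))
    for k in range(1, n):
        total += math.exp(-alpha * k * h)
    return total * h

# with alpha = 2 the mean infective period should come out near 1/2
period = mean_infective_period(2.0)
```

for α = 2 the computed mean is 1/α = 0.5 up to the (tiny) quadrature and truncation error.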
if we assume, instead of (ii), that the fraction of infectives remaining infective a time τ after having become infective is p(τ), the second equation of (9.1) would be replaced by the integral equation i(t) = i₀(t) + ∫₀^t βs(x)i(x)p(t − x) dx, where i₀(t) represents the members of the population who were infective at time t = 0 and are still infective at time t. the assumptions of a rate of contacts proportional to population size n with constant of proportionality β and of an exponentially distributed recovery rate are unrealistically simple. more general models can be constructed and analyzed, but our goal here is to show what may be deduced from extremely simple models. it will turn out that many more realistic models exhibit very similar qualitative behaviors. in our model r is determined once s and i are known, and we can drop the r equation from our model, leaving the system of two equations s′ = −βsi, i′ = (βs − α)i, (9.2) together with initial conditions s(0) = s₀, i(0) = i₀. we think of introducing a small number of infectives into a population of susceptibles and ask whether there will be an epidemic. we remark that the model makes sense only so long as s(t) and i(t) remain nonnegative. thus if either s(t) or i(t) reaches zero, we consider the system to have terminated. we observe that s′ < 0 for all t and i′ > 0 if and only if s > α/β. thus i increases so long as s > α/β, but since s decreases for all t, i ultimately decreases and approaches zero. if s₀ < α/β, i decreases to zero (no epidemic), while if s₀ > α/β, i first increases to a maximum attained when s = α/β and then decreases to zero (epidemic). the quantity βs₀/α is a threshold quantity, called the basic reproduction number [heesterbeek (1996)] and denoted by r₀, which determines whether there is an epidemic. if r₀ < 1, the infection dies out, while if r₀ > 1, there is an epidemic.
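the threshold behavior just described can be seen in a few lines of code. this is a minimal forward-euler sketch of (9.2); β, α, and the initial data are invented illustrative values (here α/β = 250), not taken from the text.

```python
# minimal forward-euler integration of s' = -beta*s*i, i' = (beta*s - alpha)*i
def simulate_sir(beta, alpha, s0, i0, t_end=60.0, dt=0.001):
    s, i = s0, i0
    i_max = i0
    for _ in range(int(t_end / dt)):
        s, i = s + dt * (-beta * s * i), i + dt * (beta * s - alpha) * i
        i_max = max(i_max, i)
    return s, i, i_max

# epidemic case: s0 = 999 > alpha/beta = 250, so i first rises to a maximum
s_inf, i_end, i_max = simulate_sir(beta=0.002, alpha=0.5, s0=999.0, i0=1.0)
# no-epidemic case: s0 = 200 < alpha/beta = 250, so i only decreases
_, _, i_max_sub = simulate_sir(beta=0.002, alpha=0.5, s0=200.0, i0=1.0)
```

with these numbers the epidemic run peaks at roughly i ≈ 400 and leaves a positive number of susceptibles uninfected, while the subthreshold run never grows.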
the definition of the basic reproduction number r₀ is that it is the number of secondary infections caused by a single infective introduced into a wholly susceptible population of size n ≈ s₀ over the course of the infection of this single infective. in this situation, an infective makes βn contacts in unit time, all of which are with susceptibles and thus produce new infections, and the mean infective period is 1/α; thus the basic reproduction number is actually βn/α rather than βs₀/α. another way to view this apparent discrepancy is to consider two ways in which an epidemic may begin. one way is an epidemic started by a member of the population being studied, for example by returning from travel with an infection acquired away from home. in this case we would have i₀ > 0, s₀ + i₀ = n. a second way is for an epidemic to be started by a visitor from outside the population. in this case, we would have s₀ = n. since (9.2) is a two-dimensional autonomous system of differential equations, the natural approach would be to find equilibria and linearize about each equilibrium to determine its stability. however, since every point with i = 0 is an equilibrium, the system (9.2) has a line of equilibria, and this approach is not applicable (the linearization matrix at each equilibrium has a zero eigenvalue). fortunately, there is an alternative approach that enables us to analyze the system (9.2). the sum of the two equations of (9.2) is (s + i)′ = −αi. thus s + i is a nonnegative smooth decreasing function and therefore tends to a limit as t → ∞. also, it is not difficult to prove that the derivative of a nonnegative smooth decreasing function must tend to zero, and this shows that i(t) → 0 as t → ∞. integration of the sum of the two equations of (9.2) from 0 to ∞ gives α ∫₀^∞ i(t) dt = s₀ + i₀ − s∞. division of the first equation of (9.2) by s and integration from 0 to ∞ gives log(s₀/s∞) = β ∫₀^∞ i(t) dt = (β/α)(s₀ + i₀ − s∞), (9.3) which is called the final size relation. it gives a relationship between the basic reproduction number and the size of the epidemic.
note that the final size of the epidemic, the number of members of the population who are infected over the course of the epidemic, is n − s∞. this is often described in terms of the attack rate [technically, the attack rate should be called an attack ratio, since it is dimensionless and is not a rate]. the final size relation (9.3) can be generalized to epidemic models with more complicated compartmental structure than the simple sir model (9.2), including models with exposed periods, treatment models, and models including quarantine of suspected individuals and isolation of diagnosed infectives. the original kermack-mckendrick model (1927) included dependence on the time since becoming infected (age of infection), and this includes such models. integration of the first equation of (9.2) from 0 to t gives log(s₀/s(t)) = β ∫₀^t i(x) dx = (β/α)[s₀ + i₀ − s(t) − i(t)], and this leads to the form i(t) = s₀ + i₀ − s(t) + (α/β) log(s(t)/s₀). this implicit relation between s and i describes the orbits of solutions of (9.2) in the (s, i) plane. in addition, since the right side of (9.3) is finite, the left side is also finite, and this shows that s∞ > 0. the final size relation (9.3) is valid for a large variety of epidemic models, as we shall see in later sections. it is not difficult to prove that there is a unique solution of the final size relation (9.3). to see this, we define the function g(x) = log(s₀/x) − (β/α)(s₀ + i₀ − x). then, as shown in figure 9.2, g is monotone decreasing from a positive value at x = 0+ to a minimum at x = n/r₀ and then increases to a negative value at x = n. thus there is a unique zero s∞ of g(x) with s∞ < n/r₀. in fact, it is generally difficult to estimate the contact rate β, which depends on the particular disease being studied but may also depend on social and behavioral factors.
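since g is monotone decreasing to the left of its minimum, the unique epidemic root s∞ of the final size relation can be found by bisection. the sketch below assumes r₀ > 1, so that the root lies in (0, α/β); the parameter values in the usage line are our own, with β/α chosen so the root lands near the eyam value s∞ = 83 discussed below.

```python
import math

# bisection for the zero of g(x) = log(s0/x) - (beta/alpha)*(s0 + i0 - x)
# on (0, alpha/beta), where g is positive near 0 and negative at alpha/beta
# (assuming r0 > 1, so an epidemic occurs and the root lies in this interval).
def final_size(beta, alpha, s0, i0, tol=1e-10):
    g = lambda x: math.log(s0 / x) - (beta / alpha) * (s0 + i0 - x)
    lo, hi = 1e-12, alpha / beta
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:       # g is decreasing here, so the root is to the right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# illustrative values: s0 = 254, i0 = 7 and beta/alpha = 6.284e-3 give a root near 83
s_inf = final_size(beta=6.284e-3, alpha=1.0, s0=254.0, i0=7.0)
```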
the quantities s₀ and s∞ may be estimated by serological studies (measurements of immune responses in blood samples) before and after an epidemic, and from these data the basic reproduction number r₀ may be estimated using (9.3). this estimate, however, is a retrospective one, which can be derived only after the epidemic has run its course. the maximum number of infectives at any time is the number of infectives when the derivative of i is zero, that is, when s = α/β. this maximum is given by i_max = s₀ + i₀ − α/β + (α/β) log(α/βs₀). (9.5) the village of eyam in england suffered an outbreak of bubonic plague in 1665-1666. since detailed records were preserved and the community was persuaded to quarantine itself to try to prevent the spread of disease to other communities, the disease in eyam has been used as a case study for modeling [raggett (1982)]. detailed examination of the data indicates that there were actually two outbreaks, of which the first was relatively mild. thus we shall try to fit the model (9.2) over the period from mid-may to mid-october 1666, measuring time in months with an initial population of 7 infectives and 254 susceptibles, and a final population of 83. raggett (1982) gives values of susceptibles and infectives in eyam on various dates, beginning with s(0) = 254, i(0) = 7, shown in table 9.1. the final size relation with s₀ = 254, i₀ = 7, s∞ = 83 gives β/α = 6.54 × 10⁻³, α/β = 153. the infective period was 11 days, or 0.3667 month, so that α = 2.73. then β = 0.0178. the relation (9.5) gives an estimate of 30.4 for the maximum number of infectives. we use the values obtained here for the parameters β and α in the model (9.2) for simulations of both the phase plane, here the (s, i)-plane, and for graphs of s and i as functions of t (figures 9.3, 9.4, 9.5). figure 9.6 plots these data points together with the phase portrait given in figure 9.3 for the model (9.2). the actual data for the eyam epidemic are remarkably close to the predictions of this very simple model. however, the model is really too good to be true.
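as a rough numerical check of the eyam estimates quoted above, the sketch below euler-integrates (9.2) with β = 0.0178, α = 2.73, s(0) = 254, i(0) = 7 (time in months); the step size and time horizon are our own choices, not raggett's fitting procedure.

```python
# euler run of (9.2) with the eyam parameter estimates; the peak number of
# infectives should come out close to the 30.4 predicted by (9.5)
def eyam_run(beta=0.0178, alpha=2.73, s0=254.0, i0=7.0, t_end=12.0, dt=1e-4):
    s, i = s0, i0
    i_max = i0
    for _ in range(int(t_end / dt)):
        s, i = s + dt * (-beta * s * i), i + dt * (beta * s - alpha) * i
        i_max = max(i_max, i)
    return s, i, i_max

s_end, i_end, i_max = eyam_run()
```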
our model assumes that infection is transmitted directly between people. while this is possible, bubonic plague is transmitted mainly by rat fleas. when an infected rat is bitten by a flea, the flea becomes extremely hungry and bites the host rat repeatedly, spreading the infection in the rat. when the host rat dies, its fleas move on to other rats, spreading the disease further. as the number of available rats decreases, the fleas move to human hosts, and this is how plague starts in a human population (although the second phase of the epidemic may have been the pneumonic form of bubonic plague, which can be spread from person to person). one of the main reasons for the spread of plague from asia into europe was the passage of many trading ships; in medieval times ships were invariably infested with rats. an accurate model of plague transmission would have to include flea and rat populations, as well as movement in space. such a model would be extremely complicated, and its predictions might well not be any closer to observations than our simple unrealistic model. a very recent study of the data from eyam suggests that the rat population may not have been large enough to support the epidemic and that human-to-human transmission was also a factor. raggett (1982) also used a stochastic model to fit the data, but the fit was rather poorer than the fit for the simple deterministic model (9.2). in the village of eyam the rector persuaded the entire community to quarantine itself to prevent the spread of disease to other communities. one effect of this policy was to increase the infection rate in the village by keeping fleas, rats, and people in close contact with one another, and the mortality rate from bubonic plague was much higher in eyam than in london. further, the quarantine could do nothing to prevent the travel of rats and thus did little to prevent the spread of disease to other communities.
one message this suggests to mathematical modelers is that control strategies based on false models may be harmful, and it is essential to distinguish between assumptions that simplify but do not alter the predicted effects substantially, and wrong assumptions that make an important difference. in order to prevent the occurrence of an epidemic if infectives are introduced into a population, it is necessary to reduce the basic reproductive number r₀ below one. this may sometimes be achieved by immunization, which has the effect of transferring members of the population from the susceptible class to the removed class and thus of reducing s(0). if a fraction p of the population is successfully immunized, the effect is to replace s(0) by s(0)(1 − p), and thus to reduce the basic reproductive number to βs(0)(1 − p)/α; this is less than one if p > 1 − α/βs(0) = 1 − 1/r₀. a large basic reproductive number means that the fraction that must be immunized to prevent the spread of infection is large. this relation is connected to the idea of herd immunity, which we shall introduce in the next chapter. initially, the number of infectives grows exponentially because the equation for i may be approximated by i′ = (βn − α)i, and the initial growth rate is r = βn − α. this initial growth rate r may be determined experimentally when an epidemic begins. then, since n and α may be measured, β may be calculated as β = (r + α)/n. however, because of incomplete data and underreporting of cases, this estimate may not be very accurate. this inaccuracy is even more pronounced for an outbreak of a previously unknown disease, where early cases are likely to be misdiagnosed. because of the final size relation, estimation of β or r₀ is an important question that has been studied by a variety of approaches. there are serious shortcomings in the simple kermack-mckendrick model as a description of the beginning of a disease outbreak, and a very different kind of model is required. exercises 1.
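the two formulas just described, the critical immunization fraction p > 1 − 1/r₀ and the growth-rate estimate β = (r + α)/n, are one-liners; the numeric values below are invented for illustration.

```python
# fraction p that must be immunized so that r0*(1 - p) < 1
def critical_immunization_fraction(r0):
    return max(0.0, 1.0 - 1.0 / r0)

# recover beta from the observed initial exponential growth rate r = beta*n - alpha
def beta_from_growth_rate(r, alpha, n):
    return (r + alpha) / n

p = critical_immunization_fraction(4.0)                    # 0.75
beta = beta_from_growth_rate(r=0.5, alpha=0.5, n=1000.0)   # 0.001
```

so, for instance, an illustrative r₀ of 4 would require immunizing three quarters of the population.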
the same survey of yale students described in example 1 reported that 91.1 percent were susceptible to influenza at the beginning of the year and 51.4 percent were susceptible at the end of the year. estimate the basic reproductive number β/α and decide whether there was an epidemic. 2. what fraction of yale students in exercise 1 would have had to be immunized to prevent an epidemic? 3. what was the maximum number of yale students in exercises 1 and 2 suffering from influenza at any time? 4. an influenza epidemic was reported at an english boarding school in 1978 that spread to 512 of the 763 students. estimate the basic reproductive number β/α. 5. what fraction of the boarding school students in exercise 4 would have had to be immunized to prevent an epidemic? 6. what was the maximum number of boarding school students in exercises 4 and 5 suffering from influenza at any time? 7. a disease is introduced by two visitors into a town with 1200 inhabitants. an average infective is in contact with 0.4 inhabitants per day. the average duration of the infective period is 6 days, and recovered infectives are immune against reinfection. how many inhabitants would have to be immunized to avoid an epidemic? 8. consider a disease with β = 1/3000, 1/α = 6 days in a population of 1200 members. suppose the disease conferred immunity on recovered infectives. how many members would have to be immunized to avoid an epidemic? 9. a disease begins to spread in a population of 800. the infective period has an average duration of 14 days and the average infective is in contact with 0.1 persons per day. what is the basic reproductive number? to what level must the average rate of contact be reduced so that the disease will die out? 10. european fox rabies is estimated to have a transmission coefficient β of 80 km²/fox year and an average infective period of 5 days.
there is a critical carrying capacity k_c measured in foxes per km², such that in regions with fox density less than k_c, rabies tends to die out, while in regions with fox density greater than k_c, rabies tends to persist. estimate k_c. [remark: it has been suggested in great britain that hunting to reduce the density of foxes below the critical carrying capacity would be a way to control the spread of rabies.] 11. a large english estate has a population of foxes with a density of 1.3 foxes/km². a large fox hunt is planned to reduce the fox population enough to prevent an outbreak of rabies. assuming that the contact number β/α is 1 km²/fox, find what fraction of the fox population must be caught. 12. following a complaint from the spca, organizers decide to replace the fox hunt of exercise 11 by a mass inoculation of foxes for rabies. what fraction of the fox population must be inoculated to prevent a rabies outbreak? 13. what actually occurs on the estate of these exercises is that 10 percent of the foxes are killed and 15 percent are inoculated. is there danger of a rabies outbreak? 14. here is another approach to the analysis of the sir model (9.2). (i) divide the two equations of the model to give di/ds = −1 + α/βs, (ii) integrate to find the orbits in the (s, i)-plane, i = −s + (α/β) log s + c, with c an arbitrary constant of integration, (iv) show that no orbit reaches the i-axis and deduce that s∞ = lim_{t→∞} s(t) > 0, which implies that part of the population escapes infection. the kermack-mckendrick compartmental epidemic model assumes that the sizes of the compartments are large enough that the mixing of members is homogeneous, or at least that there is homogeneous mixing in each subgroup if the population is stratified by activity levels. however, at the beginning of a disease outbreak, there is a very small number of infective individuals, and the transmission of infection is a stochastic event depending on the pattern of contacts between members of the population; a description should take this pattern into account.
our approach will be to give a stochastic-branching process description of the beginning of a disease outbreak to be applied as long as the number of infectives remains small, distinguishing a (minor) disease outbreak confined to this stage from a (major) epidemic, which occurs if the number of infectives begins to grow at an exponential rate. once an epidemic has started, we may switch to a deterministic compartmental model, arguing that in a major epidemic, contacts would tend to be more homogeneously distributed. implicitly, we are thinking of an infinite population, and by a major epidemic we mean a situation in which a nonzero fraction of the population is infected, and by a minor outbreak we mean a situation in which the infected population may grow but remains a negligible fraction of the population. there is an important difference between the behavior of branching process models and the behavior of models of kermack-mckendrick type, namely, as we shall see in this section, that for a stochastic disease outbreak model if r₀ < 1, the probability that the infection will die out is 1, but if r₀ > 1, there is a positive probability that the infection will increase initially but will produce only a minor outbreak and will die out before triggering a major epidemic. we describe the network of contacts between individuals by a graph with members of the population represented by vertices and with contacts between individuals represented by edges. the study of graphs originated with the abstract theory of erdős and rényi of the 1950s and 1960s [erdős and rényi (1959, 1960, 1961)]. it has become important in many areas of application, including social contacts and computer networks, as well as the spread of communicable diseases. we will think of networks as bidirectional, with disease transmission possible in either direction along an edge. an edge is a contact between vertices that can transmit infection.
The number of edges of a graph at a vertex is called the degree of the vertex. The degree distribution of a graph is {p_k}, where p_k is the fraction of vertices having degree k. The degree distribution is fundamental in the description of the spread of disease.

We think of a small number of infectives in a population of susceptibles large enough that in the initial stage we may neglect the decrease in the size of the susceptible population. Our development begins along the lines of that of [Diekmann and Heesterbeek (2000)] and then develops along the lines of [Callaway, Newman, Strogatz, and Watts (2000), Newman (2002), Newman, Strogatz, and Watts (2002)]. We assume that the infectives make contacts independently of one another and let p_k denote the probability that the number of contacts by a randomly chosen individual is exactly k, with ∑_{k=0}^∞ p_k = 1. In other words, {p_k} is the degree distribution of the vertices of the graph corresponding to the population network.

For the moment, we assume that every contact leads to an infection, but we will relax this assumption later. It is convenient to define the probability generating function G_0(z) = ∑_{k=0}^∞ p_k z^k. Since ∑_{k=0}^∞ p_k = 1, this power series converges for 0 ≤ z ≤ 1 and may be differentiated term by term. Thus it is easy to verify that the generating function has the properties G_0(0) = p_0, G_0(1) = 1, and G_0'(z) ≥ 0, G_0''(z) ≥ 0 for 0 ≤ z ≤ 1. The mean degree, which we denote by ⟨k⟩ or z_1, is ⟨k⟩ = ∑_k k p_k = G_0'(1). More generally, we define the moments ⟨k^r⟩ = ∑_k k^r p_k.

When a disease is introduced into a network, we think of it as starting at a vertex (patient zero) that transmits infection to every individual to whom this individual is connected, that is, along every edge of the graph from the vertex corresponding to this individual. We may think of this individual as being inside the population, as when a member of a population returns from travel after being infected, or as being outside the population, as when someone visits a population and brings an infection.
For transmission of disease after this initial contact we need to use the excess degree of a vertex. If we follow an edge to a vertex, the excess degree of this vertex is one less than the degree. We use the excess degree because infection cannot be transmitted back along the edge whence it came. The probability of reaching a vertex of degree k, or excess degree (k − 1), by following a random edge is proportional to k, and thus the probability that a vertex at the end of a random edge has excess degree (k − 1) is a constant multiple of kp_k, with the constant chosen to make the sum over k of the probabilities equal to 1. Then the probability that a vertex has excess degree (k − 1) is q_{k−1} = kp_k/⟨k⟩. This leads to a generating function G_1(z) for the excess degree, G_1(z) = ∑_{k=1}^∞ q_{k−1} z^{k−1} = G_0'(z)/G_0'(1), and the mean excess degree, which we denote by ⟨k_e⟩, is ⟨k_e⟩ = G_1'(1). We let R_0 = G_1'(1), the mean excess degree. This is the mean number of secondary cases infected by patient zero and is the basic reproduction number as usually defined; the threshold for an epidemic is determined by R_0. The quantity ⟨k_e⟩ = G_1'(1) is sometimes written in the form ⟨k²⟩/⟨k⟩ − 1.

Our next goal is to calculate the probability that the infection will die out and will not develop into a major epidemic, proceeding in two steps. First we find the probability that a secondary infected vertex (a vertex that has been infected by another vertex in the population) will not spark a major epidemic. Suppose that the secondary infected vertex has excess degree j. We let z_n denote the probability that this infection dies out within the next n generations. For the infection to die out in n generations, each of the j secondary infections coming from the initial secondary infected vertex must die out in (n − 1) generations. The probability of this is z_{n−1} for each secondary infection, and the probability that all secondary infections will die out in (n − 1) generations is z_{n−1}^j.
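The generating-function quantities above are easy to check numerically. The following is a minimal Python sketch (function names are mine; the degree distribution is passed as a list whose index is the degree k):

```python
def g0(z, p):
    """G_0(z) = sum_k p_k z^k, the degree-distribution generating function."""
    return sum(pk * z**k for k, pk in enumerate(p))

def mean_degree(p):
    """<k> = G_0'(1)."""
    return sum(k * pk for k, pk in enumerate(p))

def g1(z, p):
    """G_1(z) = G_0'(z) / G_0'(1), the excess-degree generating function."""
    return sum(k * pk * z**(k - 1) for k, pk in enumerate(p) if k > 0) / mean_degree(p)

def r0(p):
    """R_0 = G_1'(1) = (<k^2> - <k>) / <k> = <k^2>/<k> - 1."""
    k1 = mean_degree(p)
    k2 = sum(k * k * pk for k, pk in enumerate(p))
    return (k2 - k1) / k1
```

For instance, a network in which every vertex has degree 3 has p = [0, 0, 0, 1], so G_0(z) = z³, G_1(z) = z², and R_0 = 2.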
Now z_n is the sum over j of these probabilities, weighted by the probability q_j of j secondary infections. Thus z_n = ∑_j q_j z_{n−1}^j = G_1(z_{n−1}). Since G_1(z) is an increasing function, the sequence z_n is an increasing sequence and has a limit z_∞, which is the probability that this infection will die out eventually. Then z_∞ is the limit as n → ∞ of the solution of the difference equation z_n = G_1(z_{n−1}), z_0 = 0. Thus z_∞ must be an equilibrium of this difference equation, that is, a solution of z = G_1(z). Let w be the smallest positive solution of z = G_1(z). Then, because z_0 = 0 ≤ w and G_1 is increasing, it follows by induction that z_n ≤ w for every n. From this we deduce that z_∞ = w.

The equation G_1(z) = z has a root z = 1, since G_1(1) = 1. Because the function G_1(z) − z has a positive second derivative, its derivative G_1'(z) − 1 is increasing and can have at most one zero. This implies that the equation G_1(z) = z has at most two roots in 0 ≤ z ≤ 1. If R_0 ≤ 1, the equation G_1(z) = z has only one root, namely z = 1. On the other hand, if R_0 > 1, the function G_1(z) − z is positive for z = 0 and negative near z = 1, since it is zero at z = 1 and its derivative is positive for z < 1 and z near 1. Thus in this case the equation G_1(z) = z has a second root z_∞ < 1.

This root z_∞ is the probability that an infection transmitted along one of the edges at the initial secondary vertex will die out, and this probability is independent of the excess degree of the initial secondary vertex. It is also the probability that an infection originating outside the population, such as an infection brought from outside into the population under study, will die out. Next, we calculate the probability that an infection originating at a primary infected vertex, such as an infection introduced by a visitor from outside the population under study, will die out.
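The limit z_∞ can be computed by running the difference equation z_n = G_1(z_{n−1}) directly. A minimal sketch (the function name and tolerance are mine):

```python
def extinction_probability(g1, tol=1e-12, max_iter=100000):
    """Iterate z_n = G_1(z_{n-1}) from z_0 = 0; the increasing sequence
    converges to the smallest positive root of z = G_1(z), which is the
    probability that a chain of infection started along one edge dies out."""
    z = 0.0
    for _ in range(max_iter):
        z_next = g1(z)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z
```

For the all-degree-3 network, G_1(z) = z² and the smallest root is 0 (every infective always produces two secondary cases, so the chain never dies out), while a subcritical example such as G_1(z) = (1 + z)/2 (R_0 = 1/2) gives extinction probability 1.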
The probability that the disease outbreak will die out eventually is the sum over k of the probabilities that the initial infection in a vertex of degree k will die out, weighted by the degree distribution {p_k} of the original infection, and this is ∑_k p_k z_∞^k = G_0(z_∞).

To summarize this analysis: if R_0 < 1, the probability that the infection will die out is 1. On the other hand, if R_0 > 1, there is a probability 1 − G_0(z_∞) > 0 that the infection will persist and will lead to an epidemic. However, there is a positive probability G_0(z_∞) that the infection will increase initially but will produce only a minor outbreak and will die out before triggering a major epidemic. This distinction between a minor outbreak and a major epidemic, and the result that if R_0 > 1 there may be only a minor outbreak and not a major epidemic, are aspects of stochastic models not reflected in deterministic models.

If contacts between members of the population are random, corresponding to the assumption of mass action in the transmission of disease, then the probabilities p_k are given by the Poisson distribution p_k = e^{−c} c^k / k!, for which G_0(z) = G_1(z) = e^{c(z−1)} and R_0 = c. The commonly observed situation that most infectives do not pass on infection but there are a few "superspreading events" [Riley et al. (2003)] corresponds to a probability distribution quite different from a Poisson distribution, and could give a quite different probability that an epidemic will occur. For example, if R_0 = 2.5, the assumption of a Poisson distribution gives z_∞ = 0.107 and G_0(z_∞) = 0.107, so that the probability of an epidemic is 0.893. The assumption that nine out of ten infectives do not transmit infection while the tenth transmits 25 infections gives G_1(z) = (9 + z^25)/10, from which we see that the probability of an epidemic is 0.1.
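Both numerical examples above can be reproduced with a few lines of Python. This is a sketch (names are mine); for the superspreader case the secondary-case distribution is given directly, so the probability of an epidemic is taken as 1 − z_∞:

```python
import math

def smallest_root(g1, tol=1e-12, max_iter=100000):
    """Fixed-point iteration from z = 0 for the smallest root of z = G_1(z)."""
    z = 0.0
    for _ in range(max_iter):
        z_new = g1(z)
        if abs(z_new - z) < tol:
            break
        z = z_new
    return z

# Poisson (mass-action) contacts with R_0 = c = 2.5: G_0 = G_1 = exp(2.5 (z - 1))
z_poisson = smallest_root(lambda z: math.exp(2.5 * (z - 1.0)))
p_epidemic_poisson = 1.0 - math.exp(2.5 * (z_poisson - 1.0))  # 1 - G_0(z_inf)

# Nine of ten infectives transmit nothing, the tenth transmits 25:
# G_1(z) = (9 + z^25) / 10, also with mean 2.5
z_ss = smallest_root(lambda z: (9.0 + z**25) / 10.0)
p_epidemic_ss = 1.0 - z_ss
```

The first computation gives z_∞ ≈ 0.107 and epidemic probability ≈ 0.893, and the second gives an epidemic probability of roughly 0.09, i.e. about the 0.1 quoted in the text, despite both distributions having the same R_0 = 2.5.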
Another example, possibly more realistic, is to assume that a fraction (1 − p) of the population follows a Poisson distribution with constant r, while the remaining fraction p consists of superspreaders, each of whom makes L contacts. This would give a generating function G_0(z) = (1 − p)e^{r(z−1)} + p z^L. For example, if r = 2.2, L = 10, p = 0.01, numerical simulation gives G_0(z_∞) ≈ 0.151, so that the probability of an epidemic is 0.849. These examples demonstrate that the probability of a major epidemic depends strongly on the nature of the contact network. Simulations suggest that for a given value of the basic reproduction number, the Poisson distribution is the one with the maximum probability of a major epidemic.

It has been observed that in many situations there is a small number of long-range connections in the graph, allowing rapid spread of infection. There is a high degree of clustering (some vertices with many edges), and there are short path lengths. Such a situation may arise if a disease is spread to a distant location by an air traveler. This type of network is called a small-world network. Long-range connections in a network can increase the likelihood of an epidemic dramatically. These examples indicate that the probability of an epidemic depends strongly on the contact network at the beginning of a disease outbreak. We will not explore network models further here, but we point out that this is an actively developing field of science. Some basic references are [Newman (2003), Strogatz (2001)].

Contacts do not necessarily transmit infection. For each contact between individuals of whom one has been infected and the other is susceptible, there is a probability that infection will actually be transmitted. This probability depends on such factors as the closeness of the contact, the infectivity of the member who has been infected, and the susceptibility of the susceptible member. We assume that there is a mean probability T, called the transmissibility, of transmission of infection.
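The mixed Poisson-plus-superspreader example can be checked directly; the following sketch reproduces the 0.849 figure quoted in the text (variable names are mine):

```python
import math

r, L, p = 2.2, 10, 0.01

def G0(z):
    """G_0(z) = (1 - p) e^{r(z-1)} + p z^L."""
    return (1.0 - p) * math.exp(r * (z - 1.0)) + p * z**L

def G1(z):
    """G_1(z) = G_0'(z) / G_0'(1)."""
    num = (1.0 - p) * r * math.exp(r * (z - 1.0)) + p * L * z**(L - 1)
    return num / ((1.0 - p) * r + p * L)

# Fixed-point iteration for the smallest root of z = G_1(z)
z = 0.0
for _ in range(10000):
    z = G1(z)

prob_epidemic = 1.0 - G0(z)
```

Running this gives prob_epidemic ≈ 0.849, matching the value stated in the text.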
The transmissibility depends on the rate of contacts, the probability that a contact will transmit infection, the duration time of the infection, and the susceptibility. The development in Section 9.2 assumed that all contacts transmit infection, that is, that T = 1. In this section, we will continue to assume that there is a network describing the contacts between members of the population whose degree distribution is given by the generating function G_0(z), but we will assume in addition that there is a mean transmissibility T.

When disease begins in a network, it spreads to some of the vertices of the network. Edges that are infected during a disease outbreak are called occupied, and the size of the disease outbreak is the cluster of vertices connected to the initial vertex by a continuous chain of occupied edges. The probability that exactly m infections are transmitted by an infective vertex of degree k is C(k, m) T^m (1 − T)^{k−m}. We define Γ_0(z, T) to be the generating function for the distribution of the number of occupied edges attached to a randomly chosen vertex, which is the same as the distribution of the infections transmitted by a randomly chosen individual for any (fixed) transmissibility T. Then Γ_0(z, T) = G_0(1 + (z − 1)T); in this calculation we have used the binomial theorem to see that ∑_m C(k, m)(Tz)^m (1 − T)^{k−m} = (1 − T + Tz)^k. Note that for secondary infections we need the generating function Γ_1(z, T) for the distribution of occupied edges leaving a vertex reached by following a randomly chosen edge. This is obtained from the excess degree distribution in the same way, Γ_1(z, T) = G_1(1 + (z − 1)T). The basic reproduction number is now R_0 = T G_1'(1).

The calculation of the probability that the infection will die out and will not develop into a major epidemic follows the same lines as the argument for T = 1. The result is that if R_0 = T G_1'(1) < 1, the probability that the infection will die out is 1. If R_0 > 1, there is a solution z_∞(T) < 1 of Γ_1(z, T) = z, and a probability 1 − Γ_0(z_∞(T), T) > 0 that the infection will persist and will lead to an epidemic.
However, there is a positive probability Γ_0(z_∞(T), T) that the infection will increase initially but will produce only a minor outbreak and will die out before triggering a major epidemic.

Another interpretation of the basic reproduction number is that there is a critical transmissibility T_c defined by T_c G_1'(1) = 1. In other words, the critical transmissibility is the transmissibility that makes the basic reproduction number equal to 1. If the mean transmissibility can be decreased below the critical transmissibility, then an epidemic can be prevented. The measures used to try to control an epidemic may include contact interventions, that is, measures affecting the network, such as avoidance of public gatherings and rearrangement of the patterns of interaction between caregivers and patients in a hospital, and transmission interventions, such as careful hand washing or face masks, to decrease the probability that a contact will lead to disease transmission.

In each exercise, assume that the transmissibility is 1.

1. Show that it is not possible for a major epidemic to develop unless at least one member of the contact network has degree at least 3.
2. What is the probability of a major epidemic if every member of the contact network has degree 3?
3. Estimate (numerically) the probability of a major epidemic if c = 1.5.
4. Consider the probability generating function for an exponential distribution, given by p_k = C α^k. For what values of α is it possible to normalize this (i.e., choose C to make ∑ p_k = 1)?

Compartmental models for epidemics are not suitable for describing the beginning of a disease outbreak, because they assume that all members of a population are equally likely to make contact with a very small number of infectives. Thus, as we have seen in the preceding section, stochastic branching-process models are better descriptions of the beginning of an epidemic.
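The critical transmissibility can be computed directly from the moments of the degree distribution; a minimal sketch (function name is mine):

```python
def critical_transmissibility(p):
    """T_c = 1 / G_1'(1) = <k> / (<k^2> - <k>), where p[k] is the
    fraction of vertices with degree k."""
    k1 = sum(k * pk for k, pk in enumerate(p))   # <k>
    k2 = sum(k * k * pk for k, pk in enumerate(p))  # <k^2>
    return k1 / (k2 - k1)
```

For example, if every member of the network has degree 3, then ⟨k⟩ = 3, ⟨k²⟩ = 9, and T_c = 3/6 = 0.5: an epidemic is impossible unless more than half of contacts transmit infection.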
They allow the possibility that even if a disease outbreak has a reproduction number greater than 1, it may be only a minor outbreak and may not develop into a major epidemic. One possible approach to a more realistic description of an epidemic would be to use a branching-process model initially and then make a transition to a compartmental model when the epidemic has become established and there are enough infectives that mass-action mixing in the population is a reasonable approximation. Another approach would be to continue to use a network model throughout the course of the epidemic. In this section we shall indicate how a compartmental approach and a network approach are related.

We assume that there is a known static configuration model (CM) network in which the probability that a node u has degree k_u is P(k_u). We let G_0(z) denote the probability generating function of the degree distribution. The per-edge transmission rate from an infected node is assumed to be β, and it is assumed that infected nodes recover at a rate α. We use an edge-based compartmental model because the probability that a random neighbor is infected is not necessarily the same as the probability that a random individual is infected.

We let S(t) denote the fraction of nodes that are susceptible at time t, I(t) the fraction of nodes that are infective at time t, and R(t) the fraction of nodes that are recovered at time t. It is easy to write an equation for R', the rate at which infectives recover. If we know S(t), we can find I(t), because a decrease in S gives a corresponding increase in I. Since S + I + R = 1, we need only find the probability that a randomly selected node is susceptible. We assume that the hazard of infection for a susceptible node u is proportional to the degree k_u of the node. Each contact is represented by an edge of the network joining u to a neighboring node. We let φ_I denote the probability that this neighbor is infective.
Then the per-edge hazard of infection is βφ_I and, assuming that edges are independent, u's hazard of infection at time t is k_u βφ_I. Consider a randomly selected node u and let θ(t) be the probability that a random neighbor has not transmitted infection to u. Then the probability that u is susceptible is θ^{k_u}. Averaging over all nodes, we see that the probability that a random node u is susceptible is

S(t) = ∑_k P(k) θ(t)^k = G_0(θ(t)).   (9.7)

We break θ into three parts, θ = φ_S + φ_I + φ_R, with φ_S the probability that a random neighbor v of u is susceptible, φ_I the probability that a random neighbor v of u is infective but has not transmitted infection to u, and φ_R the probability that a random neighbor v has recovered without transmitting infection to u. Then the probability that v has transmitted infection to u is 1 − θ.

Since infected neighbors recover at rate α, the flux from φ_I to φ_R is αφ_I. Thus it is easy to see from this that

R' = αI.   (9.8)

Since edges from infected neighbors transmit infection at rate β, the flux from φ_I to (1 − θ) is βφ_I, so that

θ' = −βφ_I.   (9.9)

To obtain φ_I we need the flux into and out of the φ_I compartment. The incoming flux from φ_S results from infection of the neighbor. The outgoing flux to φ_R corresponds to recovery of the neighbor without having transmitted infection, and the outgoing flux to (1 − θ) corresponds to transmission without recovery. The total outgoing flux is (α + β)φ_I. To determine the flux from φ_S to φ_I, we need the rate at which a neighbor changes from susceptible to infective. Consider a random neighbor v of u; the probability that v has degree k is kP(k)/⟨k⟩. Since there are (k − 1) neighbors of v that could have infected v, the probability that v is susceptible is θ^{k−1}. Averaging over all k, we see that the probability that a random neighbor v of u is susceptible is

φ_S = ∑_k kP(k)θ^{k−1}/⟨k⟩ = G_1(θ).   (9.10)

To calculate φ_R, we note that the flux from φ_I to φ_R and the flux from φ_I to (1 − θ) are proportional, with proportionality constant α/β.
Since both φ_R and (1 − θ) start at zero,

φ_R = (α/β)(1 − θ).   (9.11)

Now, using (9.9), (9.10), and (9.11), together with φ_I = θ − φ_S − φ_R, we obtain

θ' = −βθ + βG_1(θ) + α(1 − θ).   (9.12)

We now have a dynamic model consisting of equations (9.12), (9.7), (9.8), and S + I + R = 1. We wish to show a relationship between this set of equations and the simple Kermack-McKendrick compartmental model (9.2). In order to accomplish this, we need only show under what conditions we would have S' = −βSI. Differentiating (9.7) and using (9.9), we obtain S' = G_0'(θ)θ' = −βφ_I G_0'(θ). Consider a large population with N members, each making c ≤ N − 1 contacts, so that G_0(z) = z^c and S' = −β̂ S (φ_I/θ), where β̂ = βc. We now let c → ∞ (which implies N → ∞) in such a way that β̂ = βc remains constant. We will now show that φ_I/θ → I, and this will yield the desired approximation S' = −β̂SI.

The probability that an edge to a randomly chosen node has not transmitted infection is θ (assuming that the given target node cannot transmit infection), and the probability that in addition it is connected to an infected node is φ_I. Because β̂ = βc is constant and therefore bounded as c grows, only a fraction no greater than a constant multiple of I/c of edges to the target node may have transmitted infection from a node that is still infected. For large values of c, φ_I is approximately I. Similarly, θ is approximately 1 as c → ∞. Thus φ_I/θ → I as c → ∞. This gives the desired approximate equation for S. The result remains valid if all degrees are close to the average degree as the average degree grows.

The edge-based compartmental modeling approach that we have used can be generalized in several ways. For example, heterogeneity of mixing can be included. In general, one would expect that early infections would be in individuals having more contacts, and thus that an epidemic would develop more rapidly than a mass-action compartmental model would predict.
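The closed system (9.12), (9.7), (9.8) is a single scalar ODE for θ plus bookkeeping, so it is easy to integrate numerically. The following is a minimal sketch (Euler stepping; the Poisson degree distribution with mean 5 and the rate values are my choices, not from the text):

```python
import math

def g0(theta):
    """G_0 for a Poisson degree distribution with mean c = 5."""
    return math.exp(5.0 * (theta - 1.0))

g1 = g0  # for a Poisson distribution, G_1 = G_0

def edge_based_sir(beta, alpha, theta0=0.999, dt=0.005, t_max=80.0):
    """Euler integration of theta' = -beta*theta + beta*G_1(theta) + alpha*(1-theta),
    with S = G_0(theta), R' = alpha*I, and I = 1 - S - R."""
    theta, R = theta0, 0.0
    S = g0(theta)
    I = 1.0 - S - R
    for _ in range(int(t_max / dt)):
        dtheta = -beta * theta + beta * g1(theta) + alpha * (1.0 - theta)
        R += alpha * I * dt
        theta += dtheta * dt
        S = g0(theta)
        I = 1.0 - S - R
    return S, I, R

S_end, I_end, R_end = edge_based_sir(beta=0.6, alpha=1.0)
```

With these illustrative parameters the epidemic burns out (I returns to near zero) while a positive fraction of nodes escapes infection, mirroring the behavior of the simple Kermack-McKendrick model.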
When contact duration is significant, as would be the case in sexually transmitted diseases, an individual with a contact would play no further role in disease transmission until a new contact is made, and this can be incorporated in a network model. The network approach to disease modeling is a rapidly developing field of study, and there will undoubtedly be fundamental developments in our understanding of the modeling of disease transmission.

In the remainder of this chapter, we assume that we are in an epidemic situation following a disease outbreak that has been modeled initially by a branching process. Thus we return to the study of compartmental models. We have established that the simple Kermack-McKendrick epidemic model (9.2) has the following basic properties:

1. There is a basic reproduction number R_0 such that if R_0 < 1 the disease dies out, while if R_0 > 1 there is an epidemic.
2. The number of infectives always approaches zero and the number of susceptibles always approaches a positive limit as t → ∞.
3. There is a relationship between the reproduction number and the final size of the epidemic, which is an equality if there are no disease deaths.

In fact, these properties hold for epidemic models with more complicated compartmental structure. We will describe some common epidemic models as examples.

In many infectious diseases there is an exposed period after the transmission of infection from susceptibles to potentially infective members but before these potential infectives develop symptoms and can transmit infection. To incorporate an exposed period with mean exposed period 1/κ, we add an exposed class E and use compartments S, E, I, R and total population size N = S + E + I + R to give a generalization of the epidemic model (9.2):

S' = −βSI,
E' = βSI − κE,   (9.14)
I' = κE − αI.

A flow chart is shown in Figure 9.7. The analysis of this model is the same as the analysis of (9.2), but with I replaced by E + I.
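The SEIR generalization is straightforward to simulate; the sketch below uses Euler stepping with N normalized to 1, a 2-day mean exposed period (κ = 1/2) and a 6-day mean infective period (α = 1/6), as in the exercise later in this section, and β chosen so that R_0 = β/α = 3 (the specific β and initial conditions are my choices):

```python
def seir(beta, kappa, alpha, days=400.0, dt=0.05, i0=1e-4):
    """Euler integration of S' = -beta*S*I, E' = beta*S*I - kappa*E,
    I' = kappa*E - alpha*I, R' = alpha*I, with N = 1."""
    S, E, I, R = 1.0 - i0, 0.0, i0, 0.0
    for _ in range(int(days / dt)):
        dS = -beta * S * I
        dE = beta * S * I - kappa * E
        dI = kappa * E - alpha * I
        dR = alpha * I
        S += dS * dt
        E += dE * dt
        I += dI * dt
        R += dR * dt
    return S, E, I, R

S_end, E_end, I_end, R_end = seir(beta=0.5, kappa=0.5, alpha=1.0 / 6.0)
```

The total population is conserved, the infection dies out, and the final susceptible fraction agrees with the final size relation ln(S_0/S_∞) = R_0(1 − S_∞), which for R_0 = 3 gives S_∞ ≈ 0.06.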
That is, instead of using the number of infectives as one of the variables, we use the total number of infected members, whether or not they are capable of transmitting infection.

In some diseases there is some infectivity during the exposed period. This may be modeled by assuming infectivity reduced by a factor ε during the exposed period. A calculation of the rate of new infections per susceptible leads to the model

S' = −βS(I + εE),
E' = βS(I + εE) − κE,   (9.15)
I' = κE − αI.

We take initial conditions S(0) = S_0, E(0) = E_0, I(0) = I_0 for this model. Integration of the sum of the equations of (9.15) from 0 to ∞, integration of the third equation of (9.15), and division of the first equation of (9.15) by S followed by integration from 0 to ∞ give a final size relation. In this final size relation there is an initial term βI_0/α, caused by the assumption that there are individuals infected originally who are beyond the exposed stage, in which they would have had some infectivity. In order to obtain a final size relation without such an initial term it is necessary to assume I(0) = 0, that is, that initial infectives are in the first stage, in which they can transmit infection. If I(0) = 0, the final size relation has the form (9.3).

One form of treatment that is possible for some diseases is vaccination to protect against infection before the beginning of an epidemic. For example, this approach is commonly used for protection against annual influenza outbreaks. A simple way to model this would be to reduce the total population size by the fraction of the population protected against infection. In reality, such inoculations are only partly effective, decreasing the rate of infection and also decreasing infectivity if a vaccinated person does become infected. This may be modeled by dividing the population into two groups with different model parameters, which would require some assumptions about the mixing between the two groups. This is not difficult, but we will not explore this direction here.
If there is a treatment for infection once a person has been infected, this may be modeled by supposing that a fraction γ per unit time of infectives is selected for treatment, and that treatment reduces infectivity by a fraction δ. Suppose that the rate of removal from the treated class is η. This leads to the SITR model, where T is the treatment class, given by

S' = −βS(I + δT),
I' = βS(I + δT) − (α + γ)I,   (9.16)
T' = γI − ηT.

A flow chart is shown in Figure 9.8. It is not difficult to prove, much as was done for the model (9.2), that S(t) decreases to a positive limit S_∞ and I(t) → 0 as t → ∞.

In order to calculate the basic reproduction number, we may argue that an infective in a totally susceptible population causes βN new infections in unit time, and the mean time spent in the infective compartment is 1/(α + γ). In addition, a fraction γ/(α + γ) of infectives is treated. While in the treatment stage the number of new infections caused in unit time is δβN, and the mean time in the treatment class is 1/η. Thus R_0 is

R_0 = βN [1/(α + γ) + δγ/(η(α + γ))].   (9.17)

It is also possible to establish the final size relation (9.3) by means very similar to those used for the simple model (9.2). We integrate the first equation of (9.16) to obtain ln(S_0/S_∞) = β∫_0^∞ [I(t) + δT(t)] dt. Integration of the third equation of (9.16) gives ∫_0^∞ T(t) dt = (γ/η)∫_0^∞ I(t) dt. Integration of the sum of the first two equations of (9.16) gives ∫_0^∞ I(t) dt = [S_0 − S_∞ + I_0]/(α + γ). Combination of these three equations and (9.17) gives (9.3).

In some diseases, such as influenza, at the end of a stage individuals may proceed to one of two stages. There is a latent period, after which a fraction p of latent individuals L proceeds to an infective stage I, while the remaining fraction (1 − p) proceeds to an asymptomatic stage A, with infectivity reduced by a factor δ and a different period 1/η. A flow chart is shown in Figure 9.9. The model (9.18) is an example of a differential infectivity model. In such models, also used in the study of HIV/AIDS [Hyman, Li, and Stanley (1999)], individuals enter a specific group when they become infected and stay in that group over the course of the infection.
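The verbal derivation of R_0 for the SITR model translates directly into code. A minimal sketch (function name and test values are mine):

```python
def r0_sitr(beta, N, alpha, gamma, delta, eta):
    """R_0 = beta*N*[1/(alpha+gamma) + delta*gamma/(eta*(alpha+gamma))]:
    beta*N infections per unit time for a mean time 1/(alpha+gamma) in I,
    plus delta*beta*N per unit time for a mean time 1/eta in T, reached
    by the fraction gamma/(alpha+gamma) of infectives that is treated."""
    return beta * N * (1.0 / (alpha + gamma)
                       + delta * gamma / (eta * (alpha + gamma)))
```

Two sanity checks: with δ = 0 (treatment blocks all transmission) R_0 reduces to βN/(α + γ), and with δ = 1 the treated class transmits at the full rate, so a slow removal rate η inflates R_0.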
Different groups may have different parameter values. For example, for influenza, infective and asymptomatic members may have different infectivities and different periods of stay in the respective stages.

1. Exposed members may be infective with infectivity reduced by a factor ε_E, 0 ≤ ε_E < 1.
2. Exposed members who are not isolated become infective at rate κ_E.
3. We introduce a class Q of quarantined members and a class J of isolated (hospitalized) members, and exposed members are quarantined at a proportional rate γ_Q in unit time (in practice, a quarantine will also be applied to many susceptibles, but we ignore this in the model). Quarantine is not perfect, but it reduces the contact rate by a factor ε_Q. The effect of this assumption is that some susceptibles make fewer contacts than the model assumes.
4. Infectives are diagnosed at a proportional rate γ_J per unit time and isolated. Isolation is imperfect, and there may be transmission of disease by isolated members, with an infectivity factor of ε_J.
5. Quarantined members are monitored, and when they develop symptoms at rate κ_Q they are isolated immediately.
6. Infectives leave the infective class at rate α_I and isolated members leave the isolated class at rate α_J.

These assumptions lead to the SEQIJR model (9.19) [Gumel et al. (2004)]. The model before control measures are begun is the special case of (9.19) with γ_Q = γ_J = 0; it is the same as (9.15). A flow chart is shown in Figure 9.10.

We define the control reproduction number R_c to be the number of secondary infections caused by a single infective in a population consisting essentially only of susceptibles with the control measures in place. It is analogous to the basic reproduction number, but instead of describing the very beginning of the disease outbreak it describes the beginning of the recognition of the epidemic.
The basic reproduction number is the value of the control reproduction number with γ_Q = γ_J = 0. We have already calculated R_0 for (9.15), and we may calculate R_c in the same way, using the full model with quarantined and isolated classes. With D_1 = γ_Q + κ_E and D_2 = γ_J + α_I, we obtain

R_c = ε_E βN/D_1 + βNκ_E/(D_1 D_2) + ε_E ε_Q βNγ_Q/(D_1 κ_Q) + ε_J βNκ_E γ_J/(α_J D_1 D_2) + ε_J βNγ_Q/(D_1 α_J).

Each term of R_c has an epidemiological interpretation. The mean duration in E is 1/D_1 with contact rate ε_E β, giving a contribution to R_c of ε_E βN/D_1. A fraction κ_E/D_1 goes from E to I, with contact rate β and mean duration 1/D_2, giving a contribution of βNκ_E/(D_1 D_2). A fraction γ_Q/D_1 goes from E to Q, with contact rate ε_E ε_Q β and mean duration 1/κ_Q, giving a contribution of ε_E ε_Q βNγ_Q/(D_1 κ_Q). A fraction κ_E γ_J/(D_1 D_2) goes from E to I to J, with a contact rate of ε_J β and a mean duration of 1/α_J, giving a contribution of ε_J βNκ_E γ_J/(α_J D_1 D_2). Finally, a fraction γ_Q/D_1 goes from E to Q to J, with a contact rate of ε_J β and a mean duration of 1/α_J, giving a contribution of ε_J βNγ_Q/(D_1 α_J). The sum of these individual contributions gives R_c.

In the model (9.19) the parameters γ_Q and γ_J are control parameters, which may be chosen in the attempt to manage the epidemic. The parameters ε_Q and ε_J depend on the strictness of the quarantine and isolation processes and are thus also control measures in a sense. The other parameters of the model are specific to the disease being studied. While they are not variable, their measurements are subject to experimental error.

The linearization of (9.19) at the disease-free equilibrium (N, 0, 0, 0, 0) has a coefficient matrix whose corresponding characteristic equation is a fourth-degree polynomial equation with leading coefficient 1 and constant term a positive constant multiple of 1 − R_c, thus positive if R_c < 1 and negative if R_c > 1. If R_c > 1, there is a positive eigenvalue, corresponding to an initial exponential growth rate of solutions of (9.19).
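The five path contributions to R_c can be summed mechanically; a minimal sketch (function name and the check values are mine):

```python
def control_reproduction_number(beta, N, eps_e, eps_q, eps_j,
                                kappa_e, kappa_q, gamma_q, gamma_j,
                                alpha_i, alpha_j):
    """Sum the five path contributions to R_c described in the text,
    with D1 = gamma_q + kappa_e (rate of leaving E) and
    D2 = gamma_j + alpha_i (rate of leaving I)."""
    D1 = gamma_q + kappa_e
    D2 = gamma_j + alpha_i
    return (eps_e * beta * N / D1                              # time spent in E
            + beta * N * kappa_e / (D1 * D2)                   # E -> I
            + eps_e * eps_q * beta * N * gamma_q / (D1 * kappa_q)  # E -> Q
            + eps_j * beta * N * kappa_e * gamma_j / (alpha_j * D1 * D2)  # E -> I -> J
            + eps_j * beta * N * gamma_q / (D1 * alpha_j))     # E -> Q -> J
```

Setting the control parameters γ_Q = γ_J = 0 should recover the basic reproduction number of (9.15), R_0 = ε_E βN/κ_E + βN/α_I, which provides a consistency check.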
If R_c < 1, it is possible to show that all eigenvalues of the coefficient matrix have negative real part, and thus solutions of (9.19) die out exponentially [van den Driessche and Watmough (2002)]. In order to show that analogues of the relation (9.3) and S_∞ > 0, derived for the model (9.2), are valid for the management model (9.19), we begin by integrating the equations for S + E, Q, I, J of (9.19) with respect to t from t = 0 to t = ∞, using the initial conditions. We continue by integrating the equation for S, and then an argument similar to the one used for (9.2), but technically more complicated, may be used to show that S_∞ > 0 for the treatment model (9.19) and also to establish the final size relation. Thus the asymptotic behavior of the management model (9.19) is the same as that of the simpler model (9.2).

In the various compartmental models that we have studied, there are significant common features. This suggests that compartmental models can be put into a more general framework. In fact, this general framework is the age of infection epidemic model originally introduced by Kermack and McKendrick in 1927. However, we will not explore this generalization here.

1. These models represent an SIR epidemic model and an SEIR epidemic model, respectively, with a mean infective period of 6 days and a mean exposed period of 2 days. Do numerical simulations to decide whether the exposed period affects the behavior of the model noticeably. Use the parameter values given.
2. If you feel really ambitious, formulate and analyze an SEIR model with infectivity in the exposed period and treatment.
3. Consider an SIR model in which a fraction θ of infectives is isolated in a perfectly quarantined class Q, with standard incidence (meaning that individuals make a contacts in unit time, of which a fraction I/(N − Q) are infective), given by the system.
4. Isolation/quarantine is a complicated process, because we don't live in a perfect world.
In hospitals, patients may inadvertently or deliberately break from isolation and in the process have casual contacts with others, including medical personnel and visitors. Taking this into account, we are led to the model. (i) Determine all the parameters in the system and define each parameter. (ii) Show that the population is constant. (iii) Find all equilibria. (iv) Find the reproduction number R_0. (v) Describe the asymptotic behavior of the model, including its dependence on the basic reproduction number.

5. Formulate a model analogous to (9.16) for which treatment is not started immediately, but begins at time τ > 0. Can you say anything about the dependence of the reproduction number on τ?

In the simple model (9.2) studied in Section 9.2 we have assumed that the infective period is exponentially distributed. Now let us consider an SIR epidemic model in a population of constant size N with mass action incidence, in which P(τ) is the fraction of individuals who are still infective a time τ after having become infected. The model is (9.20). Here, I_0(t) is the number of individuals who were infective initially at t = 0 and are still infective at time t. Then I_0(t) ≤ [N − S_0]P(t), because if all initial infectives were newly infected we would have equality in this relation, and if some initial infectives had been infected before the starting time t = 0, they would recover earlier. We assume that P(τ) is a nonnegative, nonincreasing function with P(0) = 1. We assume also that the mean infective period ∫_0^∞ P(τ)dτ is finite. Since a single infective causes βN new infections in unit time and ∫_0^∞ P(τ)dτ is the mean infective period, it is easy to calculate R_0 = βN ∫_0^∞ P(τ)dτ. Since S is a nonnegative decreasing function, it follows as for (9.2) that S(t) decreases to a limit S_∞ as t → ∞, but we must proceed differently to show that I(t) → 0. This will follow if we can prove that ∫_0^t I(s)ds is bounded as t → ∞.
since ∫_0^∞ p(τ)dτ is assumed to be finite, it follows that ∫_0^t i(s) ds is bounded, and thence that i(t) → 0. now integration of the first equation in (9.20) from 0 to ∞ shows that s_∞ > 0. if all initially infected individuals are newly infected, so that i_0(t) = (n − s_0)p(t), integration of the second equation of (9.20) gives the final size relation, identical to (9.3). if there are individuals who were infected before time t = 0, a positive term appears in the relation. the general epidemic model described by kermack and mckendrick (1927) included a dependence of infectivity on the time since becoming infected (age of infection). we let s(t) denote the number of susceptibles at time t and let φ(t) be the total infectivity at time t, defined as the sum of products of the number of infected members with each infection age and the mean infectivity for that infection age. we assume that on average, members of the population make a constant number a of contacts in unit time. we let b(τ) be the fraction of infected members remaining infected at infection age τ and let π(τ), with 0 ≤ π(τ) ≤ 1, be the mean infectivity at infection age τ. then we let a(τ) = b(τ)π(τ) be the mean infectivity of members of the population with infection age τ. we assume that there are no disease deaths, so that the total population size is a constant n. the basic reproduction number is r_0 = a ∫_0^∞ a(τ) dτ. integration with respect to t from 0 to ∞ gives the relation (9.22). here, φ_0(t) is the total infectivity of the initial infectives when they reach age of infection t. if all initial infectives have infection age zero at t = 0, then (9.22) takes the form of the general final size relation. if there are initial infectives with infection age greater than zero, let u(τ) be the fraction of these individuals with infection age τ, ∫_0^∞ u(τ)dτ = 1. at time t these individuals have infection age t + τ and mean infectivity a(t + τ).
thus the initial term satisfies the corresponding bound, and the final size relation is sometimes presented in an equivalent integrated form. example 1. the seir model (9.15) can be viewed as an age of infection model with φ = εe + i. to use the age of infection interpretation, we need to determine the kernel a(τ) in order to calculate its integral. we let u(τ) be the fraction of infected members with infection age τ who are not yet infective and v(τ) the fraction of infected members who are infective. then the rate at which members become infective at infection age τ is κu(τ). the solution of this system is u(τ) = e^{−κτ} and v(τ) = κ(e^{−ατ} − e^{−κτ})/(κ − α), and it is easy to calculate the integral of the kernel; this gives the same value for r_0 as was calculated directly. the age of infection model also includes the possibility of disease stages with distributions that are not exponential [feng (2007), feng, xu, and zhao (2007)]. example 2. consider an seir model in which the exposed stage has an exponential distribution but the infective stage has a period distribution given by a function p, (9.25), with the given initial conditions. if we define u(τ), v(τ) as in example 1, we again obtain u(τ) = e^{−κτ}, and v satisfies a corresponding integral equation. for period distributions that are not exponential, it is possible to calculate ∫_0^∞ a(τ)dτ without having to calculate the function a(τ) explicitly. example 3. consider an seir model in which the exposed period has a distribution given by a function q and the infective period has a distribution given by a function p. then, in order to obtain an equation for i, we differentiate the equation for e, obtaining the input to i at time t. the first term in this expression may be written as i_0(t), and the second term may be simplified, using interchange of the order of integration in the iterated integral. we then obtain the model, which is in age of infection form with φ = i, and we have an explicit expression for a(τ). exercises. 1.
interpret the models (9.16), (9.18), and (9.19) introduced earlier as age of infection models, and use this interpretation to calculate their reproduction numbers. 2. calculate the basic reproduction number for the model (9.26), but with infectivity in the exposed class having a reduction factor ε. the assumption in the model (9.2) of a rate of contacts per infective that is proportional to population size n, called mass action incidence or bilinear incidence, was used in all the early epidemic models. however, it is quite unrealistic, except possibly in the early stages of an epidemic in a population of moderate size. it is more realistic to assume a contact rate that is a nonincreasing function of total population size. for example, a situation in which the number of contacts per infective in unit time is constant, called standard incidence, is a more accurate description for sexually transmitted diseases. if there are no disease deaths, so that the total population size remains constant, such a distinction is unnecessary. we generalize the model (9.2) by dropping assumption (iv) and replacing assumption (i) by the assumption that an average member of the population makes c(n) contacts in unit time, with c′(n) ≥ 0 [castillo-chavez, cooke, huang, and levin (1989a), dietz (1982)], and we define β(n) = c(n)/n. it is reasonable to assume β′(n) ≤ 0, to express the idea of saturation in the number of contacts; mass action incidence corresponds to c(n) proportional to n. because the total population size is now present in the model, we must include an equation for total population size in the model. this forces us to make a distinction between members of the population who die of the disease and members of the population who recover with immunity against reinfection. we assume that a fraction f of the αi members leaving the infective class at time t recover, and the remaining fraction (1 − f) die of disease. we use s, i, and n as variables, with n = s + i + r.
we now obtain a three-dimensional model (9.28). since n is now a decreasing function, we define n(0) = n_0 = s_0 + i_0. we also have the equation r′ = f αi, but we need not include it in the model, since r is determined when s, i, and n are known. we should note that if f = 1, the total population size remains equal to the constant n, and the model (9.28) reduces to the simpler model (9.2) with β replaced by the constant β(n_0). we wish to show that the model (9.28) has the same qualitative behavior as the model (9.2), namely that there is a basic reproduction number that distinguishes between disappearance of the disease and an epidemic outbreak, and that some members of the population are left untouched when the epidemic passes. these two properties are the central features of all epidemic models. for the model (9.28) the basic reproduction number is given by r_0 = n_0 β(n_0)/α, because a single infective introduced into a wholly susceptible population makes c(n_0) = n_0 β(n_0) contacts in unit time, all of which are with susceptibles and thus produce new infections, and the mean infective period is 1/α. we assume that β(0) is finite, thus ruling out standard incidence (standard incidence does not appear to be realistic if the total population n approaches zero, and it would be more natural to assume that c(n) grows linearly with n for small n). if we let t → ∞ in the sum of the first two equations of (9.28), we obtain a relation for the limits. the first equation of (9.28) may be written as −s′(t)/s(t) = β(n(t))i(t); since n(t) ≤ n_0 and β is nonincreasing, we now obtain a final size inequality. if the disease death rate is small, the final size inequality is an approximate equality. it is not difficult to show that n(t) ≥ f n_0, and then a similar calculation using the inequality β(n) ≤ β(f n_0) < ∞ shows that a lower bound holds, from which we may deduce that s_∞ > 0. exercises. 1.
for the model (9.28), show that the final total population size is given by n_∞ = n_0 − (1 − f)(n_0 − s_∞). to cope with annual seasonal influenza epidemics there is a program of vaccination before the "flu" season begins. each year, a vaccine is produced aimed at protecting against the three influenza strains considered most dangerous for the coming season. we formulate a model to add vaccination to the simple sir model (9.2), under the assumption that vaccination reduces susceptibility (the probability of infection if a contact with an infected member of the population is made). we consider a population of total size n and assume that a fraction γ of this population is vaccinated prior to a disease outbreak. thus we have a subpopulation of size n_u = (1 − γ)n of unvaccinated members and a subpopulation of size n_v = γn of vaccinated members. we assume that vaccinated members have susceptibility to infection reduced by a factor σ, 0 ≤ σ ≤ 1, with σ = 0 describing a perfectly effective vaccine and σ = 1 describing a vaccine that has no effect. we assume also that vaccinated individuals who are infected have infectivity reduced by a factor δ, and may also have a recovery rate α_v that is different from the recovery rate α_u of infected unvaccinated individuals. we let s_u, s_v, i_u, i_v denote the number of unvaccinated susceptibles, the number of vaccinated susceptibles, the number of unvaccinated infectives, and the number of vaccinated infectives, respectively. the resulting model is (9.29); the initial conditions prescribe s_u(0), s_v(0), i_u(0), i_v(0). since the infection now is beginning in a population that is not fully susceptible, we speak of the control reproduction number r_c rather than the basic reproduction number. however, as we will soon see, calculation of the control reproduction number will require a more general definition and a considerable amount of technical computation. the computation method is applicable to both basic and control reproduction numbers.
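the final sizes of this vaccination model can be computed numerically once the parameters are fixed. the sketch below is a hedged illustration, not the text's computation: it assumes mass-action incidence with transmission rate beta, so that the final size relations take the form ln(s_u(0)/s_u(∞)) = β[(n_u − s_u(∞))/α_u + δ(n_v − s_v(∞))/α_v], with the vaccinated relation equal to σ times the same bracket, and solves them by damped fixed-point iteration. all parameter values used below are made up.

```python
import math

def vaccination_final_size(beta, sigma, delta, alpha_u, alpha_v, nu, nv,
                           su0, sv0, iters=2000):
    """Damped fixed-point iteration for the assumed final size relations."""
    su, sv = su0, sv0
    for _ in range(iters):
        # total force of infection accumulated over the epidemic (assumed form)
        bracket = beta * ((nu - su) / alpha_u + delta * (nv - sv) / alpha_v)
        # damped update toward su0*exp(-bracket) and sv0*exp(-sigma*bracket)
        su = 0.5 * su + 0.5 * su0 * math.exp(-bracket)
        sv = 0.5 * sv + 0.5 * sv0 * math.exp(-sigma * bracket)
    return su, sv
```

at a fixed point the vaccinated relation is exactly σ times the unvaccinated one, which is one way to sanity-check the iteration.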
we will use the term reproduction number to denote either a basic reproduction number or a control reproduction number. we are able to obtain final size relations without knowledge of the reproduction number, but these final size relations do contain information about the reproduction number, and more. since s_u and s_v are decreasing nonnegative functions, they have limits s_u(∞) and s_v(∞), respectively, as t → ∞. the sum of the equations for s_u and i_u in (9.29) can be integrated, from which we conclude, just as in the analysis of (9.2), that i_u(t) → 0 as t → ∞, and that α_u ∫_0^∞ i_u(t)dt = n_u − s_u(∞). similarly, using the sum of the equations for s_v and i_v, we see that i_v(t) → 0. integration of the equation for s_u in (9.29), together with these relations, gives (9.32), and a similar calculation using the equation for s_v gives (9.33). this pair of equations, (9.32) and (9.33), are the final size relations. they make it possible to calculate s_u(∞), s_v(∞) if the parameters of the model are known. it is convenient to define the matrix k; then the final size relations (9.32), (9.33) may be written in matrix form. the matrix k is closely related to the reproduction number. in the next section we describe a general method for calculating reproduction numbers that will involve this matrix. exercises. 1. suppose we want to model the spread of influenza in a city using an sliar model (susceptible-latent-infectious-asymptomatic-recovered, respectively). then our system of equations is the one given, where β is the transmission coefficient, δ is the reduced transmissibility factor from asymptomatic contacts, κ is the rate of disease progression from the latent class, p is the proportion of individuals that are clinically diagnosed, η is the recovery rate from the asymptomatic class, γ is the recovery rate from the infectious (clinically diagnosed) class, and n is the total population size. (i) add a vaccination class to the model. assume that the vaccine imparts partial protection until it becomes fully effective. is the population of the new system constant?
are there any endemic equilibria? (ii) vary the vaccination rate from 0.2 to 0.8 and determine how the number of infected individuals changes, compared with the model without vaccination. does vaccination prevent the outbreak? up to this point, we have calculated reproduction numbers by following the secondary cases caused by a single infective introduced into a population. however, if there are subpopulations with different susceptibilities to infection, as in the vaccination model introduced in section 9.9, it is necessary to follow the secondary infections in the subpopulations separately, and this approach will not yield the reproduction number. it is necessary to give a more general approach to the meaning of the reproduction number, and this is done through the next generation matrix [diekmann and heesterbeek (2000), diekmann, heesterbeek, and metz (1990), van den driessche and watmough (2002)]. the underlying idea is that we must calculate the matrix whose (i, j) entry is the number of secondary infections caused in compartment i by an infected individual in compartment j. the procedure applies both to epidemic models, as studied in this chapter, and to models with demographics for endemic diseases, to be studied in the next chapter. in a compartmental disease transmission model we sort individuals into compartments based on a single, discrete state variable. a compartment is called a disease compartment if the individuals therein are infected. note that this use of the term disease is broader than the clinical definition and includes stages of infection, such as exposed stages, in which infected individuals are not necessarily infective. suppose there are n disease compartments and m nondisease compartments, and let x ∈ r^n and y ∈ r^m be the subpopulations in each of these compartments.
further, we denote by f_i the rate at which secondary infections increase the i-th disease compartment and by v_i the rate at which disease progression, death, and recovery decrease the i-th compartment. the compartmental model can then be written in the form x′ = f(x, y) − v(x, y), y′ = g(x, y) (9.35). note that the decomposition of the dynamics into f and v and the designation of compartments as infected or uninfected may not be unique; different decompositions correspond to different epidemiological interpretations of the model. the definitions of f and v used here differ slightly from those in [van den driessche and watmough (2002)]. the derivation of the basic reproduction number is based on the linearization of the ode model about a disease-free equilibrium. for an epidemic model with a line of equilibria, it is customary to use the equilibrium with all members of the population susceptible. we assume: • f_i(0, y) = 0 and v_i(0, y) = 0 for all y ≥ 0 and i = 1,..., n. • the disease-free system y′ = g(0, y) has a unique equilibrium that is asymptotically stable; that is, all solutions with initial conditions of the form (0, y) approach a point (0, y_0) as t → ∞. we refer to this point as the disease-free equilibrium. the first assumption says that all new infections are secondary infections arising from infected hosts; there is no immigration of individuals into the disease compartments. it ensures that the disease-free set, which consists of all points of the form (0, y), is invariant; that is, any solution with no infected individuals at some point in time will be free of infection for all time. the second assumption ensures that the disease-free equilibrium is also an equilibrium of the full system. the uniqueness of the disease-free equilibrium in the second assumption is required for models with demographics, to be studied in the next chapter.
although it is not satisfied in epidemic models, the specification of a specific disease-free equilibrium with all members of the population susceptible is sufficient to validate the results. next, we assume: • f_i(x, y) ≥ 0 for all nonnegative x and y and i = 1,..., n. • v_i(x, y) ≤ 0 whenever x_i = 0, i = 1,..., n. • ∑_{i=1}^n v_i(x, y) ≥ 0 for all nonnegative x and y. the reasons for these assumptions are that the function f represents new infections and cannot be negative, each component v_i represents a net outflow from compartment i and must be negative (inflow only) whenever the compartment is empty, and the sum ∑_{i=1}^n v_i(x, y) represents the total outflow from all infected compartments. terms in the model leading to increases in ∑_{i=1}^n x_i are assumed to represent secondary infections and therefore belong in f. suppose that a single infected person is introduced into a population originally free of disease. the initial ability of the disease to spread through the population is determined by an examination of the linearization of (9.35) about the disease-free equilibrium (0, y_0). it is easy to see that the assumptions imply that ∂f_i/∂y_j(0, y_0) = ∂v_i/∂y_j(0, y_0) = 0 for every pair (i, j). this implies that the linearized equations for the disease compartments x are decoupled from the remaining equations and can be written as x′ = (f − v)x (9.36), where f and v are the n × n matrices with entries f_{ij} = ∂f_i/∂x_j(0, y_0) and v_{ij} = ∂v_i/∂x_j(0, y_0). because of the assumption that the disease-free system y′ = g(0, y) has a unique asymptotically stable equilibrium, the linear stability of the system (9.35) is completely determined by the linear stability of the matrix (f − v) in (9.36). the number of secondary infections produced by a single infected individual can be expressed as the product of the expected duration of the infectious period and the rate at which secondary infections occur. for the general model with n disease compartments, these are computed for each compartment for a hypothetical index case.
the expected time the index case spends in each compartment is given by the integral ∫_0^∞ φ(t, x_0) dt, where φ(t, x_0) is the solution of (9.36) with f = 0 (no secondary infections) and nonnegative initial condition x_0 representing an infected index case: φ′ = −vφ, φ(0, x_0) = x_0 (9.37). in effect, this solution shows the path of the index case through the disease compartments from the initial exposure through to death or recovery, with the i-th component of φ(t, x_0) interpreted as the probability that the index case (introduced at time t = 0) is in disease state i at time t. the solution of (9.37) is φ(t, x_0) = e^{−vt}x_0, where the exponential of a matrix is defined by the taylor series; this series converges for all t (see, for example, [hirsch and smale (1974)]). thus ∫_0^∞ φ(t, x_0) dt = v^{−1}x_0, and the (i, j) entry of the matrix v^{−1} can be interpreted as the expected time an individual initially introduced into disease compartment j spends in disease compartment i. the (i, j) entry of the matrix f is the rate at which secondary infections are produced in compartment i by an index case in compartment j. hence, the expected number of secondary infections produced by the index case is given by fv^{−1}x_0. following diekmann and heesterbeek (2000), the matrix k = fv^{−1} is referred to as the next generation matrix for the system at the disease-free equilibrium. the (i, j) entry of k is the expected number of secondary infections in compartment i produced by individuals initially in compartment j, assuming, of course, that the environment experienced by the individual remains homogeneous for the duration of its infection. shortly, we will describe some results from matrix theory that imply that the matrix k_l = fv^{−1}, called the next generation matrix with large domain, is nonnegative and therefore has a nonnegative eigenvalue, r_0 = ρ(fv^{−1}), such that there are no other eigenvalues of k with modulus greater than r_0, and there is a nonnegative eigenvector ω associated with r_0 [berman and plemmons (1970), theorem 1.3.2].
this eigenvector is, in a sense, the distribution of infected individuals that produces the greatest number r_0 of secondary infections per generation. thus, r_0 and the associated eigenvector ω suitably define a "typical" infective, and the basic reproduction number can be rigorously defined as the spectral radius of the matrix k_l. the spectral radius of a matrix k_l, denoted by ρ(k_l), is the maximum of the moduli of the eigenvalues of k_l. if k_l is irreducible, then r_0 is a simple eigenvalue of k_l and is strictly larger in modulus than all other eigenvalues of k_l. however, if k_l is reducible, which is often the case for diseases with multiple strains, then k_l may have several positive real eigenvalues corresponding to reproduction numbers for each competing strain of the disease. we have interpreted the reproduction number for a disease as the number of secondary infections produced by an infected individual in a population of susceptible individuals. if the reproduction number r_0 = ρ(fv^{−1}) is consistent with the differential equation model, then it should follow that the disease-free equilibrium is asymptotically stable if r_0 < 1 and unstable if r_0 > 1. this is shown through a sequence of lemmas. the spectral bound (or abscissa) of a matrix a is the maximum real part of all eigenvalues of a. if each entry of a matrix t is nonnegative, we write t ≥ 0 and refer to t as a nonnegative matrix. a matrix of the form a = si − b, with b ≥ 0, is said to have the z sign pattern; these are matrices whose off-diagonal entries are negative or zero. if, in addition, s ≥ ρ(b), then a is called an m-matrix. note that in this section, i denotes an identity matrix, not a population of infectious individuals. the following lemma is a standard result from [berman and plemmons (1970)]. lemma 9.1. if a has the z sign pattern, then a^{−1} ≥ 0 if and only if a is a nonsingular m-matrix.
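lemma 9.1 is what makes k_l = fv^{−1} nonnegative in the next generation construction. a small numeric illustration with made-up seir-style matrices (progression rate kappa, recovery rate alpha, transmission scale beta_n, reduced infectivity eps in the exposed stage; these numbers are mine, not the text's):

```python
import numpy as np

# made-up SEIR-style matrices; disease states ordered (E, I)
beta_n, eps, kappa, alpha = 0.3, 0.2, 0.5, 1.0 / 6.0
F = np.array([[eps * beta_n, beta_n],
              [0.0,          0.0]])      # new-infection rates, entrywise >= 0
V = np.array([[kappa,  0.0],
              [-kappa, alpha]])          # transitions: Z sign pattern, an M-matrix
Vinv = np.linalg.inv(V)                  # nonnegative, per lemma 9.1
KL = F @ Vinv                            # next generation matrix with large domain
r0 = max(abs(np.linalg.eigvals(KL)))     # spectral radius rho(F V^{-1})
```

since all new infections enter the single compartment e, k_l has rank 1, so r_0 equals its trace, which works out to beta_n·(eps/kappa + 1/alpha) here.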
the assumptions we have made imply that each entry of f is nonnegative and that the off-diagonal entries of v are negative or zero; thus v has the z sign pattern. also, the column sums of v are positive or zero, which, together with the z sign pattern, implies that v is a (possibly singular) m-matrix [berman and plemmons (1970), condition m_35 of theorem 6.2.3]. in what follows, it is assumed that v is nonsingular. in this case, v^{−1} ≥ 0, by lemma 9.1. hence, k_l = fv^{−1} is also nonnegative. theorem 9.1. consider the disease transmission model given by (9.35). the disease-free equilibrium of (9.35) is locally asymptotically stable if r_0 < 1, but unstable if r_0 > 1. proof. let f and v be as defined above, and let j_21 and j_22 be the matrices of partial derivatives of g with respect to x and y evaluated at the disease-free equilibrium. the jacobian matrix for the linearization of the system about the disease-free equilibrium has a block structure, with (f − v) and j_22 as diagonal blocks. the disease-free equilibrium is locally asymptotically stable if the eigenvalues of the jacobian matrix all have negative real parts. since the eigenvalues of j are those of (f − v) and j_22, and the latter all have negative real parts by assumption, the disease-free equilibrium is locally asymptotically stable if all eigenvalues of (f − v) have negative real parts. by the assumptions on f and v, f is nonnegative and v is a nonsingular m-matrix. hence, by lemma 9.2, all eigenvalues of (f − v) have negative real parts if and only if ρ(fv^{−1}) < 1. it follows that the disease-free equilibrium is locally asymptotically stable if r_0 = ρ(fv^{−1}) < 1. instability for r_0 > 1 can be established by a continuity argument. if r_0 ≤ 1, then for any ε > 0, ((1 + ε)i − fv^{−1}) is a nonsingular m-matrix, and by lemma 9.1, ((1 + ε)i − fv^{−1})^{−1} ≥ 0. by lemma 9.2, all eigenvalues of ((1 + ε)v − f) have positive real parts.
since ε > 0 is arbitrary, and eigenvalues are continuous functions of the entries of the matrix, it follows that all eigenvalues of (v − f) have nonnegative real parts. to reverse the argument, suppose all the eigenvalues of (v − f) have nonnegative real parts. for any positive ε, (v + εi − f) is a nonsingular m-matrix, and by lemma 9.2, ρ(f(v + εi)^{−1}) < 1. again, since ε > 0 is arbitrary, it follows that ρ(fv^{−1}) ≤ 1. thus, (f − v) has at least one eigenvalue with positive real part if and only if ρ(fv^{−1}) > 1, and the disease-free equilibrium is unstable whenever r_0 > 1. these results validate the extension of the definition of the reproduction number to more general situations. in the vaccination model (9.29) of the previous section we calculated a pair of final size relations that contained the elements of a matrix k; this matrix is precisely the next generation matrix with large domain k_l = fv^{−1} that has been introduced in this section. example 1. consider the seir model with infectivity in the exposed stage, (9.38). here the disease states are e and i, and we may calculate f and v directly. since fv^{−1} has rank 1, it has only one nonzero eigenvalue, and since the trace of the matrix is equal to the sum of the eigenvalues, it is easy to see that r_0 is the element in the first row and first column of fv^{−1}. if all new infections are in a single compartment, as is the case here, the basic reproduction number is the trace of the matrix fv^{−1}. in general, it is possible to reduce the size of the next generation matrix to the number of states at infection [diekmann and heesterbeek (2000)]. the states at infection are those disease states in which there can be new infections. suppose that there are n disease states and k states at infection, with k < n. then we may define an auxiliary n × k matrix e in which each column corresponds to a state at infection and has 1 in the corresponding row and 0 elsewhere.
then the next generation matrix is k = e^t k_l e. it is easy to show, using the fact that e e^t k_l = k_l, that the n × n matrix k_l and the k × k matrix k have the same nonzero eigenvalues and therefore the same spectral radius. construction of the next generation matrix, which has lower dimension than the next generation matrix with large domain, may simplify the calculation of the basic reproduction number. in example 1 above, the only disease state at infection is e, the matrix e is (1, 0)^t, and the next generation matrix k is a 1 × 1 matrix. example 2. consider the vaccination model (9.29). the disease states are i_u and i_v. then it is easy to see that the next generation matrix with large domain is the matrix k calculated in section 9.9. since each disease state is a disease state at infection, the next generation matrix is k, the same as the next generation matrix with large domain. as in example 1, the determinant of k is zero and k has rank 1; thus the control reproduction number is the trace of k. example 3 (a multi-strain model of gonorrhea). the following example comes from lima and torres (1997). the system of equations is the one given, where s is the susceptible class, i_1 is the class infected with strain 1, and i_2 is the class of individuals infected with a mutated strain. the "birth" rate of the population is π, μ is the natural mortality rate, c is the probability of successful contact, λ_i is the infectivity of strain i, γ_i is the recovery rate of strain i, and p is the proportion of the original infected population that become infected by the mutated strain. the disease-free equilibrium for this model is [s = n, i_1 = 0, i_2 = 0]. next we reorder our variables as (i_1, i_2)^t and note that we need only the infected classes to calculate r_0. then the new infection terms are cλ_1 s(t)i_1(t) in the di_1/dt equation and cλ_2 s(t)i_2(t) in the di_2/dt equation; the individuals pγ_1 i_1(t) enter the i_2 class, but only after they have been infected with strain 1, so this term is a transition rather than a new infection.
then f = (cλ_1 s(t)i_1(t), cλ_2 s(t)i_2(t))^t and v = ((μ + γ_1)i_1(t), (μ + γ_2)i_2(t) − pγ_1 i_1(t))^t. since we have only two infected classes, n = 2, and our jacobian matrices f and v follow; we can then calculate the inverse of v. to calculate the spectral radius of fv^{−1} we find the eigenvalues of the matrix. we often call r_1 = cλ_1/(μ + γ_1) the reproductive number for strain 1 and r_2 = cλ_2/(μ + γ_2) the reproductive number for strain 2; then the basic reproductive number is the larger of the two. example 4 (a vector transmission model). in the system for this example, s_i refers to the susceptible class of species i, and i_i refers to the infected class of species i, for i = m, b, h (mosquitoes, birds, and humans, respectively). then p_i is the mosquito biting preference for species i, μ_i is the natural mortality rate of species i, b is the number of bites per mosquito per unit time, θ is the human recovery rate, β_m is the transmission probability from mosquito to host per bite, β_b is the transmission probability from birds to mosquito, and β_h is the transmission probability from humans to mosquito. from the next generation matrix we calculate the eigenvalues and determine the spectral radius. we have described the next generation matrix method for continuous models; there is an analogous theory for discrete systems, described in [allen and van den driessche (2008)]. there are some situations with r_0 < 1 in which it is possible to show that the asymptotic stability of the disease-free equilibrium is global, that is, all solutions approach the disease-free equilibrium, not only those with initial values sufficiently close to this equilibrium. we will say that a vector is nonnegative if each of its components is nonnegative, and that a matrix is nonnegative if each of its entries is nonnegative. we rewrite the system (9.35) as (9.39), with the disease equations expressed in terms of the linearization matrix a = f − v and a correction term f̂(x, y), together with y′_j = g_j(x, y), j = 1,..., m. if r_0 < 1, we have shown that the disease-free equilibrium is asymptotically stable, and that −a = −(f − v) is a nonsingular m-matrix. theorem 9.2 (castillo-chavez, feng, and huang (2002)).
if −a is a nonsingular m-matrix and f̂ ≥ 0, if the assumptions on the model (9.35) made earlier in this section are satisfied, and if r_0 < 1, then the disease-free equilibrium of (9.39) is globally asymptotically stable. proof. the variation of constants formula for the first equation of (9.39) gives the result. there are examples showing that the disease-free equilibrium may not be globally asymptotically stable if the condition f̂ ≥ 0 is not satisfied. exercises. 2. use the next generation approach to calculate the basic reproduction number for the model (9.26), but with infectivity in the exposed class having a reduction factor ε. 3. formulate an seitr model and calculate its reproduction number. 4. for each of the examples in this section, determine whether the disease-free equilibrium is globally asymptotically stable when r_0 < 1. a fundamental assumption in the model (9.2) is homogeneous mixing, that is, that all individuals are equivalent in contacts. a more realistic approach would include separation of the population into subgroups with differences in behavior. for example, in many childhood diseases the contacts that transmit infection depend on the ages of the individuals, and a model should include a description of the rate of contact between individuals of different ages. other heterogeneities that may be important include activity levels of different groups and spatial distribution of populations. network models may be formulated to include heterogeneity of mixing, or more complicated compartmental models can be developed. an important question that should be kept in mind in the formulation of epidemic models is the extent to which the fundamental properties of the simple model (9.2) carry over to more elaborate models. an epidemic model for a disease in which recovery from infection brings only temporary immunity cannot be described by the models of this chapter, because of the flow of new susceptibles into the population.
this effectively includes demographics in the model, and such models will be described in the next chapter. many of the important underlying ideas of mathematical epidemiology arose in the study of malaria begun by sir r.a. ross (1911). malaria is one example of a disease with vector transmission, the infection being transmitted back and forth between vectors (mosquitoes) and hosts (humans). other vector diseases include west nile virus and hiv with heterosexual transmission. vector-transmitted diseases require models that include both vectors and hosts. an actual epidemic differs considerably from the idealized models (9.2) and (9.28). some notable differences are these: 1. when it is realized that an epidemic has begun, individuals are likely to modify their behavior by avoiding crowds to reduce their contacts and by being more careful about hygiene to reduce the risk that a contact will produce infection. 2. if a vaccine is available for the disease that has broken out, public health measures will include vaccination of part of the population. various vaccination strategies are possible, including vaccination of health care workers and other first-line responders to the epidemic, vaccination of members of the population who have been in contact with diagnosed infectives, and vaccination of members of the population who live in close proximity to diagnosed infectives. 3. isolation may be imperfect; in-hospital transmission of infection was a major problem in the sars epidemic. 4. in the sars epidemic of 2002-2003, in-hospital transmission of disease from patients to health care workers or visitors because of imperfect isolation accounted for many of the cases. this points to an essential heterogeneity in disease transmission that must be included whenever there is any risk of such transmission.
The discrete analogue of the continuous-time epidemic model (9.2) is S_{j+1} = S_j G_j, where S_j and I_j denote the numbers of susceptible and infective individuals at time j, respectively, G_j is the probability that a susceptible individual at time j will remain susceptible to time j+1, and σ = e^{−α} is the probability that an infected individual at time j will remain infected to time j+1. Assume that the initial conditions are S(0) = S_0 > 0, I(0) = I_0 > 0, and S_0 + I_0 = N.

Exercise 1. Consider the system (9.40). (a) Show that the sequence {S_j + I_j} has a limit S_∞ + I_∞ = lim_{j→∞}(S_j + I_j), with R_0 = β/(1−σ).

Next, consider the case in which there are k infected stages and there is treatment in some stages, with treatment rates that can differ across stages. Assume that selection of members for treatment occurs only at the beginning of a stage. Let I^{(i)}_j and T^{(i)}_j denote the numbers of infected and treated individuals, respectively, in stage i (i = 1, 2, ..., k) at time j. Let σ^I_i denote the probability that an infected individual in the I^{(i)} stage continues on to the next stage, either treated or untreated, and let σ^T_i denote the probability that an individual in the T^{(i)} stage continues on to the next treated stage. In addition, of the members leaving an infected stage I^{(i)}, a fraction p_i enters treatment in T^{(i+1)}, while the remaining fraction q_i continues to I^{(i+1)}. Let m_i denote the fraction of infected members who go through the stage I^{(i)}, and n_i the fraction of infected members who go through the stage T^{(i)}. Then m_1 = q_1, ..., m_k = q_1 q_2 ··· q_k, and n_1 = p_1, n_2 = p_1 + q_1 p_2, ..., n_k = p_1 + q_1 p_2 + ... + q_1 q_2 ··· q_{k−1} p_k. The discrete system with treatment is S_{j+1} = S_j G_j, [i = 2, ..., k, j ≥ 0], where ε_i is the relative infectivity of untreated individuals at stage i and δ_i is the relative infectivity of treated individuals at stage i.
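The discrete model above can be iterated directly. The sketch below assumes the standard escape probability G_j = exp(−β I_j / N) — the text's definition of G_j did not survive extraction, so this functional form is an assumption — together with σ = e^{−α}:

```python
import math

# Discrete-time epidemic iteration (sketch). Assumption: susceptibles escape
# infection with probability G_j = exp(-beta * I_j / N); infectives remain
# infective with probability sigma = exp(-alpha), as in the text.
def simulate(beta, alpha, S0, I0, steps):
    N = S0 + I0
    sigma = math.exp(-alpha)
    S, I = float(S0), float(I0)
    for _ in range(steps):
        G = math.exp(-beta * I / N)              # prob. of staying susceptible
        S, I = S * G, S * (1.0 - G) + sigma * I  # new infections + remaining infectives
    return S, I

# With beta = 0.5 and sigma = e^{-0.4}, R0 = beta / (1 - sigma) > 1, so an
# epidemic occurs and then dies out, leaving S_inf > 0 susceptibles.
S_inf, I_inf = simulate(beta=0.5, alpha=0.4, S0=990, I0=10, steps=500)
```

The limit in Exercise 1(a) exists because S_{j+1} + I_{j+1} = S_j + σ I_j ≤ S_j + I_j, so the sequence {S_j + I_j} is non-increasing and bounded below by zero.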
Consider the initial conditions. Exercise 2. Consider the system (9.41). Show that. Hint: Equation (9.42) can be proved by first establishing the following equalities.

9.14 Project: fitting data for an influenza model. Consider an SIR model (9.2) with basic reproduction number 1.5.
- For models (9.18) and (9.19), use the next generation approach to calculate their reproduction numbers.
- Describe the qualitative changes in (S, I, R) as a function of time for different values of β and α, with β ∈ {0...
- Discuss the result of part (a) in terms of the basic reproduction number (what is β/γ?).
- Use a specific disease such as influenza to provide simple interpretations for the different time courses of the disease for the different choices of β and γ.
- For each value of R_0 in {...75, 2, 2.5}, choose the best pair of values (β, α) that fits the slope before the first peak in the data found in the table of reported H1N1 influenza cases in México below.
- "Reluctant" means the class of MBTs that come in the door as new hires without a disposition to learn new stuff. From R_0, discuss what would be the impact of changing the parameters q, γ, and δ. What are your conclusions from this model?

Reported H1N1 influenza cases in México (day, cases):
day cases   day cases   day cases   day cases   day cases   day cases
 75    2     95    4    115  318    135   83    155  152    175  328
 76    1     96   11    116  399    136   75    156  138    176  298
 77    3     97    5    117  412    137   87    157  159    177  335
 78    2     98    7    118  305    138   98    158  186    178  330
 79    3     99    4    119  282    139   71    159  222    179  375
 80    3

Assume that N(t) = R(t) + P(t) + M(t) + U(t) + I(t) and that the total number of MBTs is constant, that is, N(t) = K/μ for all t, where K is a constant. The model is defined in terms of the constants q, β, δ, μ, γ, and α, with 0 ≤ q ≤ 1.
1. Interpret the parameters.

key: cord-026742-us7llnva authors: Gonçalves, Judite; Martins, Pedro S.
title: Effects of self-employment on hospitalizations: instrumental variables analysis of social security data date: 2020-06-15 journal: Small Bus Econ doi: 10.1007/s11187-020-00360-w sha: doc_id: 26742 cord_uid: us7llnva

The importance of self-employment and small businesses raises questions about their health effects and public policy implications, which can only be addressed with suitable data. We explore the relationship between self-employment and health by drawing on comprehensive longitudinal administrative data to explore variation in individual work status and by applying novel instrumental variables. We focus on an objective outcome, hospital admissions, that is not subject to recall or other biases that may affect previous studies. Our main findings, based on a sample of about 6,500 individuals followed monthly from 2005 to 2011 who switch between self-employment and wage work over that period, suggest that self-employment has a positive effect on health, as it reduces the likelihood of hospital admission by at least half.

The self-employed represent nearly 16% of employment in the European Union (Eurostat 2017). Moreover, as much as 10% of the adult population of the EU has used online platforms for the provision of labor services at some point in their lives (Pesole et al. 2018). The ongoing growth of the "platform" economy contributes to the expansion of the proportion of self-employed, especially among younger workers, and raises a number of public policy questions regarding, for example, occupational health and safety risks, social protection, and representation (European Commission 2017; Garben 2017; ILO 2016). Indeed, platform economy jobs (and self-employment more generally, as well as some types of small businesses) are characterized by more flexible work formats, distinct from formal employer-employee relationships framed by employment law, and typically have more limited access to social protection.
In the current context of such novel forms of self-employment, one important issue concerns the impact of self-employment on workers' health, the subject of this study. Occupational characteristics, namely job control and job demand, vary significantly between self-employment and wage work. Job control stands for decision authority, e.g., the freedom to decide what work to do, when, and at what pace, which reduces work-related stress. Job demand, on the other hand, represents sources of stress at work, such as being assigned a considerable amount of work and/or having little time to carry out specific tasks. This job demand-job control framework, proposed by Karasek (1979), Karasek and Theorell (1990), and Theorell and Karasek (1996), suggests that compared with wage work, self-employment is associated with both higher job control and higher job demand, an interaction termed "job strain" in the literature (Prottas and Thompson 2006; Stephan and Roesler 2010).2 In fact, self-employed individuals are not subject to orders from workers higher up an organizational hierarchy, so they have more decision authority and potentially lower work-related stress. Research also shows that the self-employed are more satisfied with their jobs than wage workers because they can be creative and have more autonomy. In other words, the self-employed may often be able to derive utility from the way outcomes are achieved, a process sometimes referred to as "procedural utility" (Benz and Frey 2008; Schneck 2014). However, when self-employed, labor income and assets directly hinge on one's ability to work and work effort in each period. In addition, greater exposure to unanticipated demand shocks leaves self-employed individuals subject to more volatile workload and income flows. Social support at work may

1 Note that the COVID-19 crisis and its aftermath may contribute to the growth of self-employment, as wage employment opportunities in the labor market will decrease.
Additionally, the COVID-19 crisis may lead to a larger share of wage employment conducted under remote work formats, given their social/physical distancing properties. Such remote work formats are typically more common among the self-employed, which may lead to some blurring of the differentiation between wage work and self-employment.
2 See, e.g., Ingre (2017) for a discussion of the job strain model with respect to the appropriateness of the interaction between the job demand and job control dimensions in Karasek's model.

also be more limited given the smaller number of co-workers around (Blanch (2016) discusses the demand-control-support model). All these variables represent sources of stress. Given these two opposite mechanisms, higher job demand and higher job control, it is unclear whether we should expect self-employed individuals to suffer from more or less work-related stress compared with wage workers. The medical literature identifies stress as an important cause of disease, e.g., cardiovascular problems and digestive disorders (Mayer 2000; Steptoe and Kivimäki 2012). Overall, stress impacts negatively on health and well-being, and in addition to increasing the incidence of disease, it may increase absence from work due to sickness and the use of health care services (e.g., Browning and Heinesen 2012; Halpern 2005; Holmgren et al. 2009). Bloemen et al. (2018) also find that the probable mechanism driving the effect of job loss on mortality is stress, through acute diseases of the circulatory system. Stress is also associated with unhealthy behavior, such as smoking and drinking. The typical occupations of self-employed and wage workers may also differ in terms of the risk of workplace accidents and other occupational hazards.
At the same time, in many countries self-employment is subject to little or no social protection, in terms of coverage by occupational safety regulation, social security, employment law, or collective bargaining, potentially with additional negative implications for health. On the other hand, the greater flexibility regarding regulation may also bring additional work opportunities compared with wage work. Overall, whether self-employment has a positive or detrimental effect on health is a public policy question that can only be answered with empirical evidence of a causal nature.

There are two main empirical challenges to the identification of a causal effect of self-employment on health: reverse causality and individual unobserved heterogeneity (Torrès and Thurik 2019). Reverse causality has to do with the possibility that individuals become self-employed or wage workers at least partly for health-related reasons. On the one hand, self-employment may attract individuals who are healthier on average, because healthier individuals tend to be more able to focus on business opportunities or may have easier access to financing (e.g., Gielnik et al. 2012). Additionally, income when self-employed tends to be more closely linked to one's ability to work than when a wage worker, and access to sickness benefits is harder for the self-employed. All these factors suggest a positive (self-)selection of the healthy into self-employment. On the other hand, health problems may constitute a barrier to finding a wage job, particularly if they are visible to the employer, and push less healthy individuals into self-employment (e.g., Zissimopoulos and Karoly 2007). Furthermore, several individual traits that are difficult to measure may be related to both health and self-employment decisions (Bujacz et al. 2019). Examples include optimism, perseverance, resilience, and risk aversion, as well as genetics.
Some individuals who are attracted to and persist in self-employment may also have a higher capacity to tolerate and manage stress, and may therefore experience lower stress (Baron et al. 2016). This capacity to deal with stressful factors is another example of an individual characteristic related to both health and type of employment. Earlier life circumstances, such as childhood health, also influence adult health and type of employment (Case et al. 2005; Case and Paxson 2010). Taken together, these traits and earlier circumstances mean that self-employed individuals and wage workers may have different health profiles along dimensions not observable in the data.

The empirical literature on self-employment and health is growing but still scarce. Most of it is plagued by the endogeneity issues mentioned above, which are difficult to tackle without longitudinal data. A recent study finds significantly lower work-related stress among self-employed individuals without employees compared with wage workers, using longitudinal data from Australia and controlling for individual fixed effects (Hessels et al. 2017). Previous studies on self-employment and stress provide contradictory findings, but most of them are based on cross-sectional data and use descriptive methods (see Hessels et al. (2017), Table 1, for a review). In the study by Rietveld et al. (2015), self-employed individuals appear healthier than wage workers. However, while the positive association between self-employment and health holds when the authors control for reverse causality, it vanishes when they control for individual unobserved heterogeneity. This finding suggests a positive selection of the healthy into self-employment. That study considers subjective health measures, including self-reported number of conditions, overall health, and mental health. It uses longitudinal survey data representative of the population aged 50+ in the USA.
The results may therefore not be generalizable to a broader working-age population. Another study, by Yoon and Bernell (2013), relies on cross-sectional survey data representative of the adult population in the USA and adopts an instrumental variable approach. The authors find that self-employment has a positive impact on several health indicators, namely the absence of chronic conditions such as hypertension and diabetes. They find no effects on other health outcomes, including perceived physical health and mental health. Nikolova (2018), using German longitudinal survey data and a difference-in-differences strategy, finds that switching from wage work to self-employment leads to both physical and mental health gains. Considering more objective indicators and administrative data, a five-year follow-up study of the total working population in Sweden finds that self-employed individuals who own limited liability companies (but not sole proprietors) have a lower average risk of mortality than wage workers (Toivanen et al. 2016). Similarly, Toivanen et al. (2018) find that limited liability company owners have lower rates of hospitalization for myocardial infarction than wage workers, and no different hospitalization rates for stroke. The authors unveil relevant heterogeneous effects not only by the enterprise legal type of self-employed individuals but also by industry.

Overall, there is little robust evidence on the causal effect of self-employment on health. Most of the literature does not take endogeneity into account, as longitudinal data or instrumental variables are seldom available. Furthermore, it is important to distinguish the effect that is due to differences in the intrinsic characteristics of self-employment and wage work, namely job control and job demand, from institutional factors such as different access to social security benefits. This may be difficult with survey data and self-reported health indicators.
Separating out the effect that is due to differences in the typical occupations of self-employed and wage workers, which are associated with different exposure to occupational hazards, would also be of interest. The main research question in this study is: what is the impact of self-employment on the likelihood of hospital admission? We answer this question based on a large sample of administrative social security records representative of the working-age population in Portugal, which includes almost 130,000 self-employed and wage workers followed between January 2005 and December 2011. We focus on a subsample of about 6,500 individuals who switch between self-employment and wage work over that 84-month period.

We contribute to the literature in several ways. First, we explicitly tackle the endogeneity of the decision to become self-employed by controlling for individual fixed effects and employing instrumental variables. Second, looking at hospitalizations allows us to separate out institutional factors, because access to hospital care and social security benefits when hospitalized are unrelated to type of employment, and most hospitalizations correspond to unplanned or unavoidable acute events. Administrative records of hospital admissions are also comparable across individuals and time periods and not subject to recall bias, an advantage over self-reported indicators in survey data. Third, to explore the extent to which the effect may be due to differences in the typical occupations of self-employed and wage workers, we look at the diagnoses underlying hospitalizations. Fourth, we consider the whole working population regardless of age, and explore potentially heterogeneous effects across demographic subgroups. Lastly, we also investigate the effects of self-employment on the length of hospitalization and on mortality. Hospital admissions are also a relevant outcome for policy: they represent roughly 40% of health expenditure in Portugal.
A significant 7% of sickness leave episodes correspond to hospital admissions (own calculations for the years 2005-2011). In 2011, sickness leave episodes cost social security 454 million euros; 7% of that represents almost 32 million euros. This adds to the costs for the health system and other societal costs that are more difficult to quantify, including productivity and well-being losses. The remainder of this paper is organized as follows: the next section lays down the background for the study, Section 3 presents our data and empirical strategy, Section 4 presents the results, and in Section 5 we discuss our findings.

In 2016, about 17% of employment in Portugal corresponded to self-employment or own-account workers. More than one-fourth of those had employees. The proportion of own-account workers differs across groups. It is lower in the capital region than in other regions, among women, among younger age groups, and among more educated groups. By industry, we find the largest proportions of own-account workers in agriculture and other primary sector activities (71.5%), real estate (36%), consulting, scientific, and technical activities (29.5%), construction (27.4%), retail (21.3%), hospitality services (20.3%), and artistic and sports activities (19.6%). From the "self-employment" module of the Labor Force Survey (LFS), conducted in the second quarter of 2017, we also know that more than 60% of own-account workers decide their own work schedule. They also report much higher autonomy over their tasks than wage workers. This is in line with the hypothesis of higher job control. While fewer than 20% of own-account workers report no difficulties with their work over the previous 12 months, 16% report periods without work, and 14% claim that clients do not pay or pay late. This may suggest that the self-employed are subject to higher job demand.
Own-account workers report lower levels of satisfaction at work than wage workers, although this is driven by the low satisfaction levels of those who have employees. Virtually no own-account workers report that they would prefer to be wage workers (Torres and Raposo 2018). In this study, we adopt the Portuguese social security definition of self-employment or "independent workers" (trabalhadores independentes), which does not include own-account workers with employees. Family and informal workers, who are captured in the LFS, do not appear in our data, as they do not pay social security contributions. This explains the lower proportion of self-employment in our data, described below, compared with the proportion of own-account workers in the LFS. For example, agriculture and other primary sector activities, which have by far the largest proportion of own-account workers in the LFS, will have limited expression in our data for those reasons.

In Portugal, statutory sick leave covers both the self-employed and wage workers. As in many European countries, to deter moral hazard, wage workers face a three-day gap from the onset of a sickness episode until a sickness benefit starts to be paid (i.e., a waiting or "elimination" period). However, for the self-employed, this waiting period is much longer, at thirty days (ten days from 2018 onwards). Due to the different waiting periods, social security records include sickness episodes that last four days or more in the case of wage workers, but at least thirty-one days in the case of the self-employed. The first three/thirty days are not eligible for sickness benefits. Thus, all other things equal, the sickness spells of the self-employed that are administratively recorded are, on average, much more selected and severe. These different waiting periods can entail different incentives for wage workers and self-employed individuals.
Wage workers face much lower opportunity costs from reporting sick to work, i.e., fewer days without income. In some cases, collective bargaining provisions, determined by unions and firms or employer associations, may even lead to the payment (by the firms) of the first three days of absence as well. As these provisions apply to wage workers but not to the self-employed, the former may engage more often in moral hazard: "cheating" by going on sick leave when they are not really sick. In stark contrast, there is no waiting period for either self-employed or wage workers in the case of hospitalization. Furthermore, benefits are the same for both types of workers. Besides, due to the specific, acute nature of hospitalizations, these are less likely to be timed deliberately by individuals and therefore less likely to be artificial episodes of sickness. In sum, compared with standard (i.e., non-hospitalization) sickness episodes, hospitalizations are a significantly more objective outcome, and hospital admissions should be strictly comparable between wage workers and self-employed individuals.

As to the amount of the support, for nearly the entire period under analysis here (Sep 2005 to Dec 2011), the replacement rate of the Portuguese sickness benefit was equal to 65% of forgone wages for the first 90 days of sick leave, 70% from the 91st to the 365th day, and 75% from the 365th day onwards. During the first eight months of 2005, the replacement rate was 55% of forgone wages for the first 30 days of sick leave, and 60% from the 31st to the 90th day. Sickness benefits are granted for a maximum of 1,095 days for wage workers and 365 days for self-employed individuals (Law-Decrees 28/2004, 133/2012, and 146/2005). The Portuguese National Health Service, financed through taxes, provides general and universal coverage and is almost free at the point of use.
In Portugal, secondary and tertiary care (both acute and post-acute care) is mainly provided in hospitals. General practitioners act as gatekeepers in access to hospital care in the public sector; otherwise, people can be admitted through the emergency department. Private voluntary health insurance may speed up access to elective hospital treatment and ambulatory consultations, but it has very limited expression in Portugal (<10%) and is not associated with type of work (i.e., self-employment or wage work). Some public and private subsystems provide care to specific groups not relevant for this study (public servants, military, banking sector workers). In general, access to hospital care in Portugal should be identical for both self-employed and wage workers. The only concern is that self-employed individuals may delay care in order not to lose business, as their income is closely tied to them actually working. (Wage workers could also delay care in order to maintain a good reputation with their employer.) Because we are looking at hospitalizations, which are generally acute, untimed events, this concern is limited. Non-emergency acute interventions are scheduled by the hospital, and because waiting lists are usually long, it is unlikely that individuals pass on the opportunity to receive the care they need when hospitals schedule them, as it may be a long time before a new opportunity arises.

We use data from the Portuguese social security information system, made available by the Instituto de Informática public agency. The dataset is a random sample such that the included individuals represent both (a) at least 1% of all individuals who pay social security contributions and (b) at least 1% of all individuals who receive sickness, maternity, or other benefits from social security, stratified by region and gender. We observe individuals on a monthly basis, from January 2005 to December 2011.
We use information on whether they are wage workers or self-employed, as well as on whether they receive sickness benefits in a specific month due to hospitalization. The data allow us to distinguish sickness benefits due to hospitalization from sickness benefits due to standard (non-acute) sickness spells, as the two cases are treated differently by social security (see Section 2). The dataset also includes information on the individuals' gender, age, nationality, place of residence, and income from work, but not their industry or occupation. We drop individuals below 18 and above 65 years old (mandatory schooling age and statutory retirement age). After also deleting observations with missing information on the key variables, we are left with almost 130,000 individuals, of which about 10,000 are self-employed at some point over the period 2005-2011. In our main analyses, we focus on more than 6,500 individuals who switch at least once between self-employment and wage work over that period (which we refer to as "switchers"). Over the 84-month-long period, there are more than 300,000 individual-month observations when considering only switchers (almost 7 million individual-month observations in the full sample).

To determine the effect of self-employment on the likelihood of hospitalization, we estimate four different specifications of a linear probability model of the form

hosp_{i,t} = β_1 self-employed_{i,t−1} + x_{i,t} β_2 + τ_t + μ_i + ε_{i,t},5

where the individual fixed effect μ_i is included only in some specifications, as described below. The binary dependent variable, hosp_{i,t}, indicates whether individual i is hospitalized in month t or not. The variable of main interest is the one-month lag of the self-employment indicator, self-employed_{i,t−1}, which takes value one if individual i is self-employed in month t−1.6

5 We opt for the linear probability model given the computational difficulties associated with applying instrumental variables methods to nonlinear panel data models, especially when various large vectors of fixed effects are included.
To investigate whether the chosen functional form is appropriate, we estimated the logit/panel logit versions of models 1 and 2 (i.e., with or without individual fixed effects), which provided marginal effects similar to the ones obtained with the linear versions.
6 Some individuals who receive income from both self-employment and wage work in some months are counted as self-employed. Excluding these observations provides almost identical results.

Using the one-month lag of the self-employment indicator, or all lags up to the third or the twelfth, for example, gives estimated total effects of self-employment with the same sign and level of statistical significance, differing only slightly in magnitude. This shows the stability of the self-employment indicator, as individuals rarely change type of work more than once over the seven-year period considered. We are interested in the overall effect of self-employment and not in the time dynamics. That overall effect can be captured by any single lag, given the high correlation between adjacent lags. Furthermore, using more than one lag would result in many more observations being lost. In conclusion, β_1 gives the effect of being self-employed, as opposed to being a wage worker, on the likelihood of being hospitalized in the following month.

The four specifications that we consider are the following. Model 1 controls for the individual's gender, age group (18-25, 26-35, 36-45, 46-55, or 56-65), nationality (Portuguese or foreign), and place of residence (one of the 18 districts in the mainland or one of the 11 islands), included in x_{i,t}. We also include fixed effects for each month in the sample, denoted by τ_t (84 months minus Jan 2005, due to the lag, and Feb 2005, which is the reference month). Model 2 takes advantage of the longitudinal nature of the dataset and also includes individual fixed effects, denoted by μ_i, to control for time-invariant individual unobserved heterogeneity.
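As a sketch of how a specification like model 2 can be estimated, the two-way fixed effects linear probability model can be fit by demeaning within individuals and months and running OLS on the transformed variables. The data below are synthetic, and the data-generating process and the effect size of −0.02 are our assumptions, chosen only for illustration; in a balanced panel, double demeaning plus adding back the overall mean is equivalent to including both sets of dummies.

```python
import numpy as np
import pandas as pd

# Synthetic balanced panel: 200 individuals x 24 months (illustrative only).
rng = np.random.default_rng(0)
n_ind, n_months = 200, 24
i = np.repeat(np.arange(n_ind), n_months)
t = np.tile(np.arange(n_months), n_ind)
mu = rng.normal(0.0, 0.05, n_ind)                      # individual fixed effects
self_emp_lag = rng.integers(0, 2, i.size).astype(float)
beta1 = -0.02                                          # assumed "true" effect
hosp = beta1 * self_emp_lag + mu[i] + rng.normal(0.0, 0.1, i.size)

df = pd.DataFrame({"i": i, "t": t, "x": self_emp_lag, "y": hosp})

# Two-way within transformation (exact for balanced panels):
# v_dd = v - mean_by_individual - mean_by_month + overall_mean.
def demean(col):
    s = df[col]
    return (s - s.groupby(df["i"]).transform("mean")
              - s.groupby(df["t"]).transform("mean") + s.mean())

xd, yd = demean("x"), demean("y")
beta1_hat = float(np.dot(xd, yd) / np.dot(xd, xd))     # OLS slope on demeaned data
```

By the Frisch-Waugh-Lovell logic, `beta1_hat` recovers the assumed β_1 up to sampling noise; the paper's clustered standard errors would still need to be computed separately.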
Still, it is possible that endogeneity remains due to unobserved individual characteristics that vary over time, as discussed in the introduction. To tackle this potential threat, in addition to the individual fixed effects, we employ an instrumental variable strategy. Thus, model 3 applies instrumental variables without controlling for individual fixed effects (i.e., instrumental variable estimation of model 1), and model 4 applies instrumental variables controlling for individual fixed effects (i.e., instrumental variable estimation of model 2). In sum, models 1 and 3 treat the data as pooled cross-sections, whereas models 2 and 4 are fixed effects panel data models; models 3 and 4 apply an instrumental variable strategy.

We use two instruments. Instrument one is the proportion of self-employed workers in individual i's district, excluding her municipality of residence, in the same month (see Online Resource 1 for the division of the Portuguese territory into districts and municipalities). Instrument two is the proportion of self-employed workers of the same gender and age group as individual i in the whole country, also excluding her municipality of residence, in the same month. The proportion of workers in a given district or gender-age group who are self-employed captures the structure of the labor market in that area or demographic group. For example, there may be a predominant industry in a given district that relies on wage workers, or there may be a new service expanding where young self-employed women abound. In general, we expect that the larger that proportion, the higher the likelihood that any individual i residing in district j or belonging to gender-age group m is self-employed. However, in some cases, low self-employment in the district/demographic group may signal opportunities or, conversely, high self-employment may signal a saturated market.
That is, some individuals may be defiers, responding in the opposite way to a higher proportion of self-employed workers in the district/demographic group (i.e., a violation of the monotonicity assumption). When there are defiers, the two-stage least squares estimator gives a weighted difference between the effect of the treatment among compliers and defiers, which could be misleading. Nevertheless, de Chaisemartin (2017) derives a weaker condition under which the two-stage least squares estimator still provides a local average treatment effect (LATE) for "surviving compliers." With binary outcomes, as in our case, that condition holds if the defiers' LATE and the two-stage least squares coefficient are both of the same sign, or if the defiers' and compliers' LATEs are both of the same sign and the ratio of these two LATEs is lower than the ratio of the shares of compliers and defiers in the population. In this context, it is difficult to assess whether that condition is likely to hold, because the effect of self-employment on the likelihood of hospitalization can be positive or negative. Still, we see no reason for the LATEs of compliers and defiers to differ significantly, especially since fixed effects capture individuals' intrinsic characteristics that may explain why they respond differently to the instruments. So, we argue that the condition holds, as the ratio of compliers to defiers should exceed the ratio of the two LATEs.8

The proportion of self-employed workers in an individual's geographical area has previously been used to instrument self-employment decisions (e.g., Noseleit 2014). The novelty here is that instead of considering the proportion of self-employed workers in the individual's municipality, we consider only neighboring municipalities, excluding the individual's own. This approach to devising instrumental variables has been employed, e.g., in Autor et al. (2013) and Nevo (2001).
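A sketch of how instrument one can be constructed from the worker-level data: compute each district's self-employment count and size, subtract out the worker's own municipality, and turn the remainder into a share. The column names are illustrative (not the paper's actual variable names), and in the paper this would be computed separately for each month.

```python
import pandas as pd

# Leave-one-out instrument: share of self-employed in the worker's district,
# excluding the worker's own municipality. Toy data for illustration.
df = pd.DataFrame({
    "district":     ["A", "A", "A", "A", "B", "B"],
    "municipality": ["a1", "a1", "a2", "a2", "b1", "b2"],
    "self_emp":     [1, 0, 1, 1, 0, 1],
})

# District-level and municipality-level totals of self-employed workers.
g_d = df.groupby("district")["self_emp"].agg(["sum", "count"])
g_m = df.groupby(["district", "municipality"])["self_emp"].agg(["sum", "count"])

# Align totals back to the worker rows, then take the leave-one-out share.
d = g_d.loc[df["district"]].to_numpy()
m = g_m.loc[list(zip(df["district"], df["municipality"]))].to_numpy()
df["iv1"] = (d[:, 0] - m[:, 0]) / (d[:, 1] - m[:, 1])
print(df["iv1"].tolist())  # -> [1.0, 1.0, 0.5, 0.5, 1.0, 0.0]
```

Instrument two follows the same pattern with gender-age cells at the national level in place of districts.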
in both our instruments, the exclusion of the individual's own municipality helps eliminate concerns regarding instrument exogeneity. overall, we believe our instruments are validly excluded from the main equation conditional on the remaining explanatory variables (i.e., they impact hospitalizations solely through their impact on the likelihood of self-employment). for instance, in the case of the proportion of self-employed workers in the district (instrument one), the crucial explanatory variables are the district fixed effects. district fixed effects take into account any district characteristics that correlate with both the instrument and the outcome, hospitalizations, as long as those characteristics are constant over time. to explore this issue further, we look at the evolution over time of some district characteristics: a general income index, a general health index, and a firm dimension index, which are composite indices produced by a portuguese polling firm, marktest (online resource 2). what we observe is that all of those indices are fairly constant over time; therefore such characteristics should be appropriately captured by the district fixed effects. note also that, by comparison, the proportion of self-employed workers in the district exhibits some within-district variation, so the instrument is relevant even when controlling for district fixed effects (online resource 2). with two instruments and one potentially endogenous variable, we are able to test statistically the validity of the overidentifying restriction. given that the endogenous variable, self-employed_{i,t−1}, is lagged, we also use the lags of the instruments. as mentioned previously, our main analyses focus on the subsample that includes only individuals who switch between wage work and self-employment at least once over the sample period ("switchers"). after all, those are the individuals that are used for identification in the models with individual fixed effects.
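the reason district fixed effects absorb time-constant district characteristics is visible in the fixed-effects ("within") transformation, sketched below under our own naming: anything constant within a district is demeaned to zero, while the instrument's within-district variation survives.

```python
def within_transform(series_by_district):
    """fixed-effects ('within') transformation: demean each series by its
    district mean. anything constant within a district maps to zero, which is
    why time-constant district traits are absorbed by district fixed effects."""
    out = {}
    for district, values in series_by_district.items():
        mean = sum(values) / len(values)
        out[district] = [v - mean for v in values]
    return out
```

a constant series such as [2, 2, 2] becomes [0, 0, 0], while a varying instrument series keeps its within-district variation.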
moreover, in the instrumental variables model with individual fixed effects, non-switchers are by definition non-compliers, and non-compliers reduce the instruments' statistical power (de chaisemartin 2017). we also present results for all model specifications for the entire sample, for comparison. lastly, standard errors are robust to heteroscedasticity and to clustering at the individual level in models 1 and 2, and at the district level in models 3 and 4 (because that is the level of observation of instrument one). the main time-varying unobserved individual characteristic that may affect both self-employment and the likelihood of hospitalization is health. unfortunately, we do not have information on health status, only on hospitalizations. we construct an indicator variable that takes value one if the individual had any hospitalization in the previous three months, to try to capture any recent (serious) changes in health status. this variable is potentially not enough to fully rule out endogeneity, which is why we resorted to instrumental variables models. still, as a sensitivity check, we add this variable to model 2 as a control. we also compare the effect of self-employment on the likelihood of hospitalization for women versus men, individuals up to 35 versus 36 and more years old, and nationals versus foreigners. to do this, we include interaction terms between the lagged self-employment indicator and the respective demographic dummies. since we have two instruments, we are able to instrument both the lagged self-employment indicator and the interaction term. we repeat the main analyses using quarterly rather than monthly data and compare the magnitudes of the estimates. aggregating the data in this way reduces total sample size to about one third. to shed further light on the types of hospitalizations of self-employed and wage workers, we obtained information on hospitalizations from the national diagnosis-related groups dataset.
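the recent-hospitalization control described above amounts to a trailing three-month window over an individual's monthly hospitalization flags; a minimal sketch, with our own naming:

```python
def recent_hosp_indicator(hosp_flags, window=3):
    """1 if the individual had any hospitalization in the previous `window`
    months, 0 otherwise. `hosp_flags` is a 0/1 list ordered by calendar month
    (a sketch of the control described in the text; names are ours)."""
    out = []
    for t in range(len(hosp_flags)):
        start = max(0, t - window)
        # strictly previous months: the window excludes month t itself
        out.append(1 if any(hosp_flags[start:t]) else 0)
    return out
```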
this allowed us to learn the main diagnosis underlying each hospitalization as well as whether it was planned or not, but only for about half of the hospitalizations in the social security dataset that we could match indirectly, as there is not an individual identifier to fully merge the two datasets. these complementary analyses are detailed in online resource 3. we also apply the model specifications described in the previous section to study the impact of self-employment on the length of hospitalization. first, in a two-part model type of approach, we restrict the sample to individual-month observations with a hospitalization. we use the natural logarithm of hospitalization days as the dependent variable to account for the skewness in the distribution of hospitalization days. this approach drastically reduces the sample size. we compare the results, qualitatively, to those obtained for the full sample, using the natural logarithm of hospitalization days plus one in order to keep the zeroes. our data also allow us to investigate mortality. to explore the effect of self-employment on mortality, we aggregate the data to the person-year level, as we know the year but not the month in which the individual passes away. we create a binary dependent variable that takes value one if individual i passes away in year t + 1 and zero otherwise, while excluding observations for the year in which the person passes away. we compare results obtained when the self-employment indicator takes value one if the individual is self-employed during at least one, six, or all twelve months of year t. we estimate the same model specifications as described in the previous section, adjusted for the annual frequency considered here. control variables are measured in year t. we discuss results for the subsample of individuals who switch at least once between self-employment and wage work over time ("switchers").
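the two dependent-variable constructions for length of hospitalization described above can be sketched as follows (the function name is ours):

```python
import math

def length_of_stay_outcomes(days):
    """two dependent variables for length of hospitalization:
    (a) log(days) on the subsample with a hospitalization (two-part-model style),
    (b) log(days + 1) on the full sample, keeping the zeroes."""
    conditional = [math.log(d) for d in days if d > 0]   # hospitalized only
    full_sample = [math.log(d + 1) for d in days]        # zeroes map to 0.0
    return conditional, full_sample
```

as the text notes, construction (b) depends on the arbitrary choice of adding one, which is why the paper treats it as a robustness exercise.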
results for the full sample are also presented for comparison, in the bottom half of the tables (panel b). descriptive statistics by type of employment in the previous month are shown in table 1. looking at the switchers, the self-employed account for 38.29% of the person-month observations (panel a). the average monthly rates of hospitalization of self-employed and wage workers are 0.06% and 0.14%, respectively. note that these seemingly very low numbers correspond to monthly, not annual, hospitalization rates. the average number of days of hospitalization, conditional on there being any, is slightly larger among the self-employed: 12.86 compared with 11.05 days for wage workers. the differences in the rates and lengths of hospitalization over time in both samples are shown in online resource 4. the proportion of women is slightly lower among the self-employed than among wage workers (50% versus 52%), the self-employed are on average slightly older (about 37 versus 36 years old), and the proportion of foreigners is also slightly lower among the self-employed (13% versus 14%). the proportion of self-employed workers in the district (instrument one) is on average 4.69% and varies between 0 and 17.74%. the proportion of self-employed workers in the same gender-age group in the country (instrument two) is on average 4.06% and varies between 0.75 and 17.34%. table 2 shows the results of models 1-4. starting with the first-stage results, we conclude that when the proportion of self-employment in the district or demographic group increases, the individual likelihood of self-employment also increases, as expected. specifically, when the proportion of self-employed workers in a given district (/demographic group) is one percentage point higher, the likelihood of any individual in that district (/demographic group) becoming self-employed is about 7 (/4.5) percentage points higher, on average (panel a, model 4).
judging from the large f- and t-statistics, the instruments appear strong when considering the full sample (panel b, model 4). however, looking at the second stage, we can see that the coefficient on the self-employment indicator is implausibly large in absolute terms, and has a huge standard error as well. this suggests that the instruments may actually not be strong enough even though the f- and t-statistics are above conventional thresholds. therefore, we focus our discussion on the results for the sample of switchers. note that even in model 2, which includes individual fixed effects but not instrumental variables, identification of the effect of self-employment also comes from switchers. as for the instrument validity test, the null hypothesis is not rejected in any case. there is also no evidence of endogeneity. in fact, the coefficients on the self-employment indicator in the instrumental variables models (models 3 and 4) are very similar to the coefficients in models 1 and 2, except they are less precisely estimated and not statistically significant (panel a). in light of this result, unobserved individual characteristics, in particular those that vary over time (e.g., health status), and reversed causality do not seem to pose an issue in our analyses. this is possibly because hospitalization is a fairly objective and a rare/extreme outcome, which does not capture health in general but serious (unexpected) manifestations of illness.
furthermore, the estimated coefficient on the self-employment indicator is about the same whether or not individual fixed effects are included (model 1 versus model 2), suggesting that self-selection of the healthy into self-employment has no impact on the negative association between self-employment and likelihood of hospitalization. (non-switchers, being non-compliers, do not respond to changes in the labor market as captured by the instruments; recall that in the full sample, only 4.24% of the observations are self-employed, whereas in the subsample of switchers this proportion increases to 38.29% (table 1). the difference in effect size of the instruments in the full sample versus the subsample of switchers can be interpreted in relation to these proportions of self-employment in each sample. in the model without individual fixed effects, instrument one actually has a small t-statistic and the f-statistic is also small (panel b, model 3). instrumental variables estimation using only instrument one or only instrument two produces identical results.) lastly, controlling for any hospitalization in the previous three months, which is another (partial) way to address endogeneity, does not change the estimated coefficients from model 2. (notes to table 2: standard errors in parentheses, robust to heteroscedasticity and to clustering at the individual level in models 1 and 2 and at the district level in models 3 and 4; *p < 0.1, **p < 0.05, ***p < 0.01; the coefficient of the self-employment indicator was multiplied by 100 to facilitate reading.) we find that self-employed individuals are about 0.08 percentage points less likely than wage workers to be hospitalized in any given month. this is the same as the unadjusted difference in hospitalization rates of self-employed and wage workers observed in table 1. compared with the average monthly hospitalization rate of 0.14% among wage workers, this means that self-employed individuals are less than half as likely to be hospitalized.
overall, our findings indicate a large negative impact of self-employment on the likelihood of hospitalization that is consistent across models. results also indicate that female, older, and native workers have higher rates of hospitalization (results available upon request). looking at potentially heterogeneous effects of self-employment for different subgroups, we find that the negative impact of self-employment on the likelihood of hospitalization is stronger for women than for men. there are no differences between individuals up to 35 versus 36 and more years old, or between nationals and foreigners (table 3). using quarterly data gives negative and strongly significant coefficients, which are roughly three times as large as the coefficients in the main analysis, as expected (not shown). results from our exploration of the types of hospitalizations of self-employed and wage workers, detailed in online resource 3, indicate that self-employment is associated with a lower likelihood of hospitalization for any underlying health problem, and for both urgent and planned hospitalizations. looking at the natural logarithm of hospitalization days, conditional on there being a hospitalization, we find no significant effects of self-employment. however, this analysis is limited because only observations with a hospitalization are used and many individuals have only one hospitalization over the entire period of analysis. when including the zeroes, by looking at the logarithm of hospitalization days plus one, the estimated coefficients are negative and strongly significant, indicating that self-employment reduces the length of hospitalization by almost 0.2%.
however, this analysis is also limited because the choice of adding one to the number of days, in order to keep the zeroes, may influence results. in sum, we find no evidence that a lower likelihood of hospitalization among self-employed individuals comes at the expense of longer lengths of hospital stays, which would suggest that self-employed individuals delay going to the hospital until they are more severely sick (results available upon request). table 4 presents the effect of self-employment on the likelihood of mortality in the following year. the self-employment indicator takes value one if the individual is self-employed for more than six months in the current year. similar results are obtained when one month as a self-employed worker is enough to classify an individual as self-employed in year t, or when we require individuals to be self-employed during the whole year. the models that (partly) address endogeneity provide negative coefficients for the self-employment indicator (models 2-4). although not statistically different from zero, the estimated coefficient from model 2 indicates that self-employed individuals are about 0.01 percentage points less likely to die in the following year than wage workers. compared with the average mortality rate of wage workers, this represents a lower likelihood of mortality by about one third. this analysis has limitations, as data are aggregated to a yearly frequency and mortality is such a rare and extreme outcome that there is little variation to identify precisely an effect of self-employment.
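the annual self-employment indicator compared in the text (at least one month, more than six months, or the whole year) can be sketched as follows; the names and parameterisation are our own reading:

```python
def yearly_self_employed(monthly_flags, min_months=7):
    """annual self-employment indicator from twelve monthly 0/1 flags.
    min_months=1 gives 'at least one month', 7 gives 'more than six months'
    (the headline definition), and 12 gives 'the whole year'."""
    return 1 if sum(monthly_flags) >= min_months else 0
```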
yet, results are in line with our main findings for hospitalizations, suggesting a protective effect of self-employment when it comes to acute events such as hospital admission and death. it is probably as challenging as it is important to determine whether self-employment is good or detrimental for health. the potential self-selection of the healthy into or out of self-employment (and their typically small businesses) is difficult to rule out empirically. however, separating the effect of self-employment on health from that selection effect is crucial to inform policy decisions. moreover, informing policy is increasingly pressing these days, as new forms of self-employment emerge and the small businesses that they create can have a significant impact on sustainable economic growth. the ongoing covid-19 crisis may also represent a significant push towards self-employment (and wage employment with increased job flexibility, through greater use of remote work), which may have its own additional consequences in terms of health. given the motivation above, we seek to provide causal evidence on the impact of self-employment on hospitalizations in this study. we take advantage of the longitudinal nature of our rich data, where we track roughly 6,500 individuals who switch between forms of employment over a period of up to 84 months. on top of that, we also employ an instrumental variable strategy to deal with any remaining endogeneity. we find that self-employed individuals are 0.08 percentage points (or about half) less likely to be hospitalized in a given month when compared with wage workers. qualitatively, this result is in line with most available evidence, which tends to find that self-employment is good for health. this includes toivanen et al. (2016, 2018), who like us look at hospitalizations and mortality. we do not seem to find evidence of endogeneity, contrary to rietveld et al.
(2015), who find a negative association between self-employment and health that is fully explained by a selection effect. the different results between the two studies may be due to the type of outcomes considered and samples used. while we focus on administrative records of hospitalizations and consider the whole working population, rietveld et al. (2015) draw on survey-based subjective health measures and focus on the 50+ population. hospitalization is a specific, acute outcome and not a measure of health status per se. the same can be said of mortality. the job demand-job control theory is closely linked to work-related stress, yet the most obvious manifestations of stress (e.g., anxiety, depression) do not always lead to hospitalization or death. in this regard, we may miss important impacts of self-employment on health, which can be positive or negative. we believe more research is needed on this important topic, looking at different, complementary health outcomes. nevertheless, as mentioned in the introduction, stress is an important cause of many health problems, ranging from cardiovascular to respiratory, digestive, and other troubles, which frequently lead to hospitalization (or death). in our analyses of the health problems underlying hospital admissions, we find that self-employed individuals are particularly less likely than wage workers to be hospitalized for troubles of the cardiovascular, respiratory, and digestive systems. despite the limitations of those analyses, our results do not contradict the interpretation that self-employed individuals seem to suffer from lower stress than wage workers or, in other words, that the beneficial effects of higher job control when self-employed exceed the detrimental effects of higher job demands. our results are also consistent with the research on "procedural utility" that finds higher levels of well-being among self-employed individuals, something that may be linked with lower stress/better health.
our results may also reflect changes in the occupations when individuals switch to/from self-employment and small businesses, which may have different exposures to occupational hazards. for instance, manufacturing workers-typically wage workers-may be more prone to injuries at work. we do find that self-employed individuals are significantly less likely than wage workers to be hospitalized for troubles of the musculoskeletal system, which include many work-related episodes. still, we find equally large or larger differences in hospitalization rates for other types of troubles. unfortunately, with the available data we cannot explore this issue precisely, as we do not know the industry/occupation of self-employed individuals. the potentially different effects of self-employment by industry remain a topic that deserves to be explored in future research. toivanen et al. (2016) and toivanen et al. (2018) already showed promising results in this regard. we believe that the premise that self-employed individuals may delay care in order not to lose business is of limited concern here. hospitalizations are generally acute, untimed events. furthermore, non-emergency acute interventions are scheduled by the hospital, and long waiting lists deter individuals from passing on a scheduled intervention they need. we find identical relative risk ratios for urgent and planned hospital admissions. also, if self-employed individuals, having more limited access to sickness benefits, delayed appropriate care until they were seriously sick and had to be hospitalized, we would find that self-employment leads to higher rates of hospitalization, which is the opposite of what we find. as we do not know the diagnoses of all hospital admissions in the data, we cannot exclude admissions related to pregnancy and childbirth, which are unrelated to health status and capture instead fertility decisions.
however, while this may partly explain the larger effect of self-employment found for women, it does not explain our findings for men, for whom we also find negative hospitalization effects. with our approach, we were able to at least partly rule out endogeneity, thanks largely to the rich longitudinal dimension of the data we use. further research may want to explore additional individual information to investigate potential heterogeneous effects, e.g., by industry or occupation. further research may also want to consider the case of self-employed individuals with employees, even if this type of self-employment, and the small businesses it creates, is less common among platform economy jobs. in conclusion, this study provides evidence of a positive impact of self-employment on health and does so by focusing on an objective outcome-hospital admissions-that is not subject to recall or other biases that may affect previous studies. the positive health effect we document may be at least partly explained by greater control by the individual over different aspects of the working life associated with this form of small businesses. one important dimension of the ongoing debate about the "future of work" is precisely how to increase protection for workers under flexible contracts, such as those that increasingly emerge in the platform economy (e.g., garben 2017; european commission 2017). this dimension is now even more significant in the context of the covid-19 crisis. this may also involve multiple policy aspects such as social security, employment law and collective bargaining. our results indicate that, despite the existing concerns, at least as far as significant health events are concerned, there are important social gains from more flexible work formats. furthermore, as the platform economy grows around the world, leading to increasing shares of the workforce in self-employment, causal evidence about the health implications of that type of work becomes more pressing.
the china syndrome: local labor market effects of import competition in the united states
why entrepreneurs often experience low, not high, levels of stress: the joint effects of selection and psychological capital
being independent is a great thing: subjective evaluations of self-employment and hierarchy
social support as a mediator between job control and psychological strain
job loss, firm-level heterogeneity and mortality: evidence from administrative data
effect of job loss due to plant closure on mortality and hospitalization
not all are equal: a latent profile analysis of well-being among the self-employed
causes and consequences of early-life health
the lasting impact of childhood health and circumstance
tolerating defiance? local average treatment effects without monotonicity
proposal for a directive of the european parliament and of the council on transparent and predictable working conditions in the european union
employment by sex, age and professional status
protecting workers in the online platform economy: an overview of regulatory and policy developments in the eu. luxemburg: european agency for safety and health at work
focus on opportunities as a mediator of the relationship between business owners' age and venture growth
how time-flexible work policies can reduce stress, improve health, and save money
self-employment and work-related stress: the mediating role of job control and job demand
the prevalence of work-related stress, and its association with self-perceived health and sick-leave, in a population of employed swedish women
non-standard employment around the world: understanding challenges, shaping prospects. geneva: international labour office
p-hacking in academic research: a critical review of the job strain model and of the association between night work and breast cancer in women
job demands, job decision latitude, and mental strain: implications for job redesign
healthy work: stress, productivity, and the reconstruction of working life
the neurobiology of stress and gastrointestinal disease
measuring market power in the ready-to-eat cereal industry
self-employment can be good for your health
female self-employment and children
platform workers in europe. evidence from the colleem survey. luxembourg: publications office of the european union
stress, satisfaction, and the work-family interface: a comparison of self-employed business owners, independents, and organizational employees
self-employment and health: barriers or benefits?
why the self-employed are happier: evidence from 25 european countries
health of entrepreneurs versus employees in a national representative sample
stress and cardiovascular disease
current issues relating to psychosocial job strain and cardiovascular disease research
mortality differences between self-employed and paid employees: a 5-year follow-up study of the working population in sweden
s eloranta. hospitalization due to stroke and myocardial infarction in self-employed individuals and small business owners compared with paid employees in sweden - a 5-year study
small business owners and health
o trabalho por conta própria em portugal. instituto nacional de estatística
the effect of self-employment on health, access to care, and health behavior
transitions to self-employment at older ages: the role of wealth, health, health insurance and other factors

key: cord-015255-1qhgeirb authors: busby, j s; onggo, s title: managing the social amplification of risk: a simulation of interacting actors date: 2012-07-11 journal: j oper res soc doi: 10.1057/jors.2012.80 sha: doc_id: 15255 cord_uid: 1qhgeirb

a central problem in managing risk is dealing with social processes that either exaggerate or understate it. a longstanding approach to understanding such processes has been the social amplification of risk framework. but this implies that some true level of risk becomes distorted in social actors' perceptions. many risk events are characterised by such uncertainties, disagreements and changes in scientific knowledge that it becomes unreasonable to speak of a true level of risk. the most we can often say in such cases is that different groups believe each other to be either amplifying or attenuating a risk. this inherent subjectivity raises the question as to whether risk managers can expect any particular kinds of outcome to emerge. this question is the basis for a case study of zoonotic disease outbreaks using systems dynamics as a modelling medium. the model shows that processes suggested in the social amplification of risk framework produce polarised risk responses among different actors, but that the subjectivity magnifies this polarisation considerably. as this subjectivity takes more complex forms it leaves problematic residues at the end of a disease outbreak, such as an indefinite drop in economic activity and an indefinite increase in anxiety.
recent events such as the outbreaks in the uk of highly pathogenic avian influenza illustrate the increasing importance of managing not just the physical development of a hazard but also the social response. the management of hazard becomes the management of 'issues', where public anxiety is regarded less as a peripheral nuisance and more as a legitimate and consequential element of the problem (leiss, 2001). it therefore becomes as important to model the public perception of risk as it is to model the physical hazard-to understand the spread of concern as much as the spread of a disease, for example. in many cases the perception of risk becomes intimately combined with the physical development of a risk, as beliefs about what is risky behaviour come to influence levels of that behaviour and thereby levels of exposure. one of the main theoretical tools we have had to explain and predict public risk perception is the social amplification of risk framework due to kasperson et al (1988). as we explain below, this framework claims that social processes often combine to either exaggerate or underplay the risk events experienced by a society. this results in unreasonable and disproportionate reactions to risks, not only among the lay public but also among legislators and others responsible for managing risk. but since its inception the idea of a 'real', objective process of social risk amplification has been questioned (rayner, 1988; rip, 1988) and, although work in risk studies and risk management continues to use the concept, it has remained problematic. the question is whether, if we lose the notion of some true risk being distorted by a social process, we lose all ability to anticipate and explain perplexing social responses to a risk event in a way that is informative to policymakers. we explore this question in the context of risks surrounding the outbreaks of zoonotic diseases-that is, diseases that cross the species barrier to humans from other animals.
recent cases of zoonotic disease, such as bse, sars, west nile virus and highly pathogenic avian influenza (hpai), have been some of the most highly publicised and controversial risk issues encountered in recent times. many human diseases are zoonotic in origin but in cases such as bse and hpai the disease reservoirs remain in the animal population. this means that a public health risk is bound up with risk to animal welfare, and often risk to the agricultural economy, to food supply chains and to wildlife. this in turn produces difficult problems for risk managers and policymakers, who typically want to avoid a general public amplifying the risk and boycotting an industry and its products, but also want to avoid an industry underestimating a risk and failing to practice adequate biosecurity. the bse case in particular has been associated with ideas about risk amplification (eg, eldridge and reilly, 2003) and continues to appear in the literature (lewis and tyshenko, 2009). other zoonoses, such as chronic wasting disease in deer herds, have also been seen as recent objects of risk amplification (heberlein and stedman, 2009). in terms of the social reaction, not all zoonoses are alike. endemic zoonoses like e. coli o157 do periodically receive public attention-for example following outbreaks at open farms and in food supply chains. but it is the more exotic zoonoses like bse and hpai that are more clearly associated with undue anxiety and ideas about social risk amplification. yet these cases also showed how uncertain the best, expertly assessed, supposedly objective risk level can be, and this makes it very problematic to retain the idea of an objective process of social risk amplification.
such cases are therefore an important and promising setting for exploring the idea that amplification is only in the heads of social actors, and for exploring the notion that this might nonetheless produce observable, and potentially highly consequential, outcomes in a way that risk managers need to understand. our study involved two main elements, the second of which is the main subject of this article:

1. exploratory fieldwork to examine how various groups perceived risks and risk amplification in connection with zoonoses like the avian influenza outbreaks in 2007;
2. a systems dynamics simulation to work out what outcomes would emerge in a system of social actors who attributed amplification to other actors.

in the remainder of the paper we first outline the fieldwork and its outcomes, and then describe the model and simulation. although the article concentrates on the latter, the two parts provide complementary elements of a process of theorising (kopainsky and luna-reyes, 2008): the fieldwork, subjected to grounded analysis, produces a small number of propositions that are built into the systems dynamics model, and the model both operationalises these propositions and explores their consequences when operationalised in this way. the modelling is a basis for developing theory that is relevant to policy and decision making, rather than supporting a specific decision directly. a discussion and conclusion follow.

traditionally, the most problematic aspect of public risk perception has been seen as its sometimes dramatic divergence from expert assessments-and the way in which this divergence has been seen as an obstacle both to managing risks specifically and to introducing new technology more generally. this has produced a longstanding interest in the individual perception of risk (eg, slovic, 1987) and in the way that culture selects particular risks for our attention (eg, douglas and wildavsky, 1982). it has led to a strong interest in risk communication (eg, otway and wynne, 1989). and it has been a central theme in the social amplification of risk framework (or sarf) that emerged in the late 1980s (kasperson et al, 1988). the notion behind social risk amplification, developed in a series of articles (kasperson et al, 1988; renn, 1991; burns et al, 1993; kasperson and kasperson, 1996), is that a risk event produces signals that are processed and sometimes amplified by a succession of social actors behaving as communication 'stations'. they interact and observe each other's responses, sometimes producing considerable amplification of the original signal. a consequence is that there are often several secondary effects, such as product boycotts or losses of institutional trust, that compound the effect of the original risk event. a substantial amount of empirical work has been conducted on or around the idea of social amplification, for example showing that the largest influence on amplification is typically organisational misconduct (freudenberg, 2003). it continues to be an important topic in the risk literature, not least in connection with zoonosis risks (eg, heberlein and stedman, 2009; lewis and tyshenko, 2009). there has always been a substantial critique of the basic idea of social risk amplification. its implication that there is some true or accurate level that becomes amplified is hard to accept in many controversial and contested cases where expertise is lacking or where there is no expert consensus (rayner, 1988). the phenomenon of 'dueling experts' is common in conflicts over environmental health, for instance (nelkin, 1995). more generally, the concept of risk amplification seems to suggest that there is a risk 'signal' that is outside the social system and is somehow amplified by it (rayner, 1988).

this seems misconceived when we take the view that ultimately risk itself is a social construction (hilgartner, 1992) or overlay on the world (jasanoff, 1993). and it naturally leads to the view that contributors to the amplification, such as the media (bakir, 2005), need to be managed more effectively, and that risk managers should concentrate on fixing the mistake in the public mind (rip, 1988), when often it may be the expert assessment that is mistaken. it thus becomes hard to sustain the idea that there is a social process by which true levels of risk get distorted. and this appears to undermine the possibility that risk managers can have a way of anticipating very high or very low levels of social anxiety in any particular case. once risk amplification becomes no more than a subjective judgment by one group on another social group's risk responses, it is hard to see how risk issues can be dealt with on an analytical basis. however, subjective beliefs about risk can produce objective behaviours, and behaviours can interact to produce particular outcomes. and large discrepancies in risk beliefs between different groups are still of considerable interest, whether or not we can know which beliefs are going to turn out to be more correct. in the remainder of this article we therefore explore the consequences of the idea that social risk amplification is nothing more than an attribution, or judgment that one social actor makes of another, and try to see what implications this might have for risk managers based on a systems dynamics model. before this, however, we describe the fieldwork whose principal findings were meant to provide the main structural properties of the model. the aim of the fieldwork was to explore how social actors reason about the risks of recent zoonotic disease outbreaks, and in particular how they make judgments of other actors systematically amplifying or attenuating such risks.
this involved a grounded, qualitative study of what a number of groups said in the course of a number of unstructured interviews and focus groups. it follows the general principle of using qualitative empirical work as a basis for systems dynamics modelling (luna-reyes and andersen, 2003). focus groups were used where possible, for both lay and professional or expert actors; individual interviews were used where access could only be gained to relevant groups (such as journalists) as individuals. the participants were selected from a range of groups having a stake in zoonotic outbreaks such as avian influenza incidents and are listed in table 1. the focus groups followed a topic guide that was initially used in a pilot focus group and continually refined throughout the programme. they started with a short briefing on the specific topic of zoonotic diseases, with recent, well-publicised examples. the professional and expert groups were also asked to explain their roles in relation to the management of zoonotic diseases. participants were then invited to consider recent cases and other examples they knew of, discuss their reactions to the risks they presented, and discuss the way the risks had been, or were being, managed. their discussions were recorded and the recordings transcribed, except in two cases where it was only feasible to record researcher notes. the individual interviews followed the same format. analysis of the transcripts followed a typical process of grounded theorising (glaser and strauss, 1967), in which the aim was to find a way of categorising participants' responses that gave some theoretical insight into the principle of risk amplification as a subjective attribution. the categories were arrived at in a process of 'constant comparison' of the data and emerging, tentative categories until all responses had been satisfactorily categorised in relation to each other (glaser, 2002).
in glaser's words, 'validity is achieved, after much fitting of words, when the chosen one best represents the pattern. it is as valid as it is grounded'. our approach also drew on template analysis (king, 1998) in that we started with the basic categories of attributing risk amplification and risk attenuation, not a blank sheet. a fuller account of the analysis process and findings is given in a parallel publication (busby and duckett, 2012). the first main theme to emerge from the data was the way in which actors privilege their own views, and construct reasons to hold on to them by finding explanations for other views as being systematically exaggerated or underplayed. it is surprising in a sense that this was relatively symmetrical. we expected expert groups to characterise lay groups as exaggerating or underplaying risk, but we also expected lay groups to use authoritative risk statements from expert groups and organisations of various kinds as ways of correcting their own initial and tentative beliefs. but there was no evidence for this kind of corrective process. the reasons that informants gave for why other actors systematically amplify or attenuate risk were categorised under five main headings: cognition, or the way they formed their beliefs; disposition, or their inherent natures; situation, or the particular circumstances; strategy, or deliberate, instrumental action; and structure, or basic patterns in the social or physical world. for example, one group saw the highly pathogenic avian influenza (hpai) outbreak at holton in the uk in 2007 as presenting a serious risk and explained the official advice that it presented only a very small risk as arising from a conspiracy between industry and government that the dispositions of the two naturally created. the second main theme was that some groups of informants often lacked specific and direct knowledge about relevant risks, and resorted to reasoning about other actors' responses to those risks.
this reasoning involved moderating those observations with beliefs about whether other actors are inclined to amplify or attenuate risk. lay groups received information through the media, but they had definite, and somewhat clichéd, beliefs about the accuracy of risk portrayals in the media, for example. thus some informants saw the media treatment of hpai outbreaks as risk amplifying and portrayed the media as having an incentive to sensationalise coverage, but others (particularly virologists) saw media coverage as risk attenuating out of scientific ignorance. a third theme was that risk perceptions often came from the specific associations that arose in particular cases. for example, the holton hpai outbreak involved a large food processing firm that had earlier been involved in dietary and nutritional controversies. the firm employed intensive poultry rearing practices and was also importing partial products from a processor abroad. this particular case therefore bound together issues of intensive rearing, global sourcing, zoonotic outbreaks and lifestyle risks: incidental associations that enabled some informants to perceive high levels of risk and indignation, and portray others as attenuating this risk. the fourth theme was that some actors have specific reasons to overcome what they see as other actors' amplifications or attenuations. they do not just discount another actor's distortions but seek to change them. for example, staff in one government agency believed they had to correct farmers who were underplaying risk and not practicing sufficient bio-security, and also correct consumers who were exaggerating risk and boycotting important agricultural products. such actors do not simply observe other actors' expressed risk levels but try to communicate in such a way as to influence these expressed levels, for example through awareness-raising campaigns.
the fieldwork therefore pointed to a model in which actors like members of the public based their risk evaluations on what they were told by others, corrected in some way for what they expected to be others' amplifications or attenuations; discrepancies between their current evaluations and those of others would be regarded as evidence of such amplifications, rather than being used to correct their own evaluations. the findings also indicated a model in which risk managers would communicate risk levels in a way that was intended to overcome the misconceptions of actors like the public. these are the underpinning elements of the models we describe below. systems dynamics was a natural choice for this modelling on several grounds. first, there is an inherent stress on endogeneity in the basic idea of social risk amplification, and in particular in the notion that it is an attribution. risk responses first and foremost reflect the way people think about risks and think about the responses of other people to those risks. second, the explicit and intuitive representation of feedback loops was important to show the reflective nature of social behaviour: how actors see the impact of their risk responses on other actors and modify their responses accordingly. third, memory plays an important part in this, since the idea that some actor is a risk amplifier will be based on remembering their past responses, and the accumulative capacity of stocks in systems dynamics provides an obvious way of representing social memory. developing a systems dynamics model on the grounded theory therefore followed naturally, and helped to add a deductive capability to the essentially inductive process of grounded theory (kopainsky and luna-reyes, 2008) . 
kopainsky and luna-reyes (2008) also point out that grounded theory can produce large and rich sets of evidence and overly complex theory, making it important to have a rigorous approach to concentrating on small numbers of variables and relationships. thus, in the modelling we describe in the next section, the aim was to try to represent risk amplification with as little elaboration as possible, so that it would be clear what the consequences of the basic structural commitments might be. this meant reduction to the simplest possible system of two actors, interacting repeatedly over time during the period of an otherwise static risk event (such as a zoonosis outbreak). applications of systems dynamics have been wide-ranging, addressing issues in domains ranging from business (morecroft and van der heijden, 1992) to military (minami and madnick, 2009), from epidemiology (dangerfield et al, 2001) to diffusion models in marketing (morecroft, 1984), from modelling physical state such as demography (meadows et al, 2004) to mental state such as trust (martinez-moyano and samsa, 2008). applications to issues of risk, particularly risk perception, are much more limited. there has been some application of system dynamics to the diffusion of fear and sarf, specifically (burns and slovic, 2007; sundrani, 2007), but not to the idea of social amplification as an attribution. probably the closest examples to our work in the system dynamics literature deal with trust. luna-reyes et al (2008), for example, applied system dynamics to investigate the role of knowledge sharing in building trust in complex projects. to make modelling tractable, the authors make several simplifying assumptions, including the aggregation of various government agencies as a single actor and various service providers as another actor. each actor accumulates the knowledge of the other actor's work, and the authors explore the dynamics that emerge from their interaction.
greer et al (2006) modelled similar interactions, this time between client and contractor, each having its own, accumulated understandings of a common or global quantity (in this case the 'baseline' of work in a project). martinez-moyano and samsa (2008) developed a system dynamics model to support a feedback theory of trust and confidence. this represented the mutual interaction between two actors (government and public) in a social system where each actor assesses the trustworthiness of the other actor over time, with both actors maintaining memories of the actions and outcomes of the other actor. our approach draws from all these studies, modelling a system in which actors interact on the basis of remembered, past interactions as they make assessments of some common object. the actors are in fact groups of individuals who are presumed to be acting in some concerted way. although this may seem questionable, there are several justifications for doing so: (1) the aim is not to represent the diversity of the social world but to explore the consequences of specific ideas about phenomena like social risk amplification; (2) in some circumstances a 'risk manager' such as a private corporation or a government agency may act very much like a unit actor, especially when it is trying to coordinate its communications in the course of risk events; (3) equally, in some circumstances it may be quite realistic to see a 'public' as acting in a relatively consensual way whose net, aggregate or average response is of more interest than the variance of response. in the following sections we develop a model in three stages. in the first, we represent the conventional view of social risk amplification; in the second, we add our subjective, attributional approach in a basic form; and in the third we make the attributional elements more realistically complex.
the aim is to explore the implications of the principal findings of the fieldwork, and our basic theoretical commitments to social risk amplification as an attribution, with as little further adornment as possible, while also incorporating elements shown in the literature to be important aspects of risk amplification. in the first model, shown in figure 1, we represent in a simple way the basic notion of social risk amplification. the fundamental idea is that risk responses are socially developed, not simply the sum of the isolated reactions of unconnected individuals. the model represents a population as being in one of two states of worry. this is simpler than the three-state model of burns and slovic (2007), as it is not clear what a third state particularly adds to the model. there is also no need for a recovering or removal state, as in sir (susceptible infectious recovered) models (sterman, 2004, p 303), since there is no concept of immunity and it seems certain that people can be worried by the same thing all over again. the flow from an unworried state to a worried state is a function of how far the proportion in the worried state exceeds that normally expected in regard to a risk event such as a zoonotic disease outbreak. members of the public expect some of their number to become anxious in connection with any risk issue: when, through communication or observation, they realise this number exceeds expectation, this in itself becomes a reason for others to become anxious. this observation of fellow citizens is not medium-specific, so it is a combination of observation by word-of-mouth, social networks and broadcast media. in terms of how this influences perception, various processes are suggested in the literature. for example, there is a variety of 'social contagion' effects (levy and nail, 1993; scherer and cho, 2003) relevant to such situations. social learning (bandura, 1977) or 'learning by proxy' (gardner et al, 2000) may also well be important.
we do not model specific mechanisms but only an aggregate process by which the observation of worry influences the flow into a state of being worried. the flow out of the worried state is a natural relaxation process. it is hard to stay worried about a specific issue for any length of time, and the atrophy of vigilance is reported in the literature (freudenberg, 2003). there is also a base flow between the states, reflecting the way in which, in the context of any public risk event, there will be some small proportion of the population that becomes worried, irrespective of peers and public information. this base flow also has the function of dealing with the 'startup problem' in which zero flow is a potential equilibrium for the model (sterman, 2004, p 322). the public risk perception in this model stands in relation to an expert, supposedly authoritative assessment of the risk. people worry when seeing others worry, but moderate this response when exposed to exogenous information: the expert or managerial risk assessment. what ultimately regulates worry is some combination of these two elements, and it is this regulatory variable that we call a resultant 'risk perception'. unlike burns and slovic (2007) we do not represent this as a stock because it is not anyone's belief, and so need not have inertia. the fact that various members of the public are in different states of worry means that there is no belief that all share, as such. instead, risk perception is an emergent construct on which flows between unworried and worried states depend (and which also determines how demand for risky goods changes, as we explain below). in the simplest model we simply take this resultant risk perception as a weighted geometric mean of the risk implied by the proportion of the population worried and the publicly known expert risk assessment. the expert assessment grows from zero toward a finite level, for a certain period, before decaying again to zero.
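the stock-and-flow structure of this base model can be sketched in code. the following is a minimal illustration, not the authors' implementation: all parameter values, the gaussian event profile, the contagion term and the tolerable-risk reference are assumptions made for the sketch.

```python
import math

# minimal sketch of the base model; all parameter values, the event
# profile and the contagion term are illustrative assumptions
DT = 0.1                 # integration step (days)
ALPHA = 0.5              # weight on observed worry vs expert assessment
EXPECTED_WORRIED = 0.01  # proportion normally expected to be worried
BASE_FLOW = 0.001        # small flow into worry regardless of peers
RELAX = 0.05             # natural relaxation out of the worried state
TOLERABLE = 1e-6         # 'very low level of risk' reference point
K = 0.01                 # sensitivity of worry to perceived risk

def expert_assessment(t):
    """stylised risk event: rises toward 1e-3, peaks, decays to zero."""
    return 1e-3 * math.exp(-((t - 20.0) / 10.0) ** 2)

def simulate(days=80):
    worried = EXPECTED_WORRIED
    history = []
    for step in range(int(days / DT)):
        t = step * DT
        # risk implied by the proportion worried, scaled so that the
        # expected proportion corresponds to the peak event risk
        implied = 1e-3 * worried / EXPECTED_WORRIED
        expert = max(expert_assessment(t), 1e-9)
        # resultant risk perception: weighted geometric mean of the two
        perception = implied ** ALPHA * expert ** (1 - ALPHA)
        # discrepancies between risk levels are treated as ratios, so
        # the inflow responds to the log of perception over a tolerable
        # reference level
        drive = max(math.log(perception / TOLERABLE), 0.0)
        inflow = (BASE_FLOW + K * drive) * (1.0 - worried)
        outflow = RELAX * worried
        worried += DT * (inflow - outflow)
        history.append((t, worried, perception))
    return history
```

running this produces the qualitative shape the text describes: worry climbs as the event unfolds, lags the expert assessment, and relaxes only slowly after the event subsides.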
this reflects a time profile for typical risk events, for example zoonotic outbreaks such as sars, where numbers of reported cases climb progressively and rapidly to a peak before declining (eg, leung et al, 2004). the units for risk perception and the expert assessment are arbitrary, but for exposition are taken as probabilities of individual fatality during a specific risk event. numerical values of the exogenous risk-related variables are based on an outbreak in which the highest fatality probability is 10^-3. but risks in a modern society tend to vary over several orders of magnitude. typically, individual fatality probabilities of 10^-6 are regarded as 'a very low level of risk', whereas risks of 10^-3 are seen as very high and at the limit of tolerability for risks at work (hse, 2001). because both assessed and perceived risks are likely to vary widely, discrepancies between risk levels are represented as ratios. the way in which the expert assessment is communicated to the public is via some homogenous channel we have simply referred to as the 'media'. in our basic model we represent in very crude terms the way in which this media might exaggerate the difference between expert assessment and public perception. but the sarf literature suggests there is no consistent relationship between media coverage and either levels of public concern or frequencies of fatalities (breakwell and barnett, 2003; finkel, 2008), so the extent of this exaggeration is likely to be highly case specific. it is also possible that the media have an effect on responses by exaggerating to a given actor its own responses. the public, for example, could have an inflated idea of how worried they are because newspapers or blogs portray it to be so. but we do not represent this because it is so speculative and may be indeterminable empirically.
finally, the base model also represents the way in which risk perception influences behaviour, in particular the consumption of the goods or services that expose people to the risk in question. the 2007 holton uk outbreak of hpai, for example, occurred at a turkey meat processing plant and affected demand for its products; the sars outbreak affected demand for travel, particularly aviation services. brahmbhatt and dutta (2008) even refer to the economic disruption caused by 'panicky' public responses as 'sars type' effects. there are many complications here, not least that reducing consumption of one amenity as a result of heightened risk perception may increase consumption of a riskier amenity. air travel in the us fell after 9/11, but travel by car increased and aggregate risk levels were said to have risen in consequence (gigerenzer, 2006). a further complication is that in certain situations, such as bank runs (diamond and dybvig, 1983), risk perceptions are directly self-fulfilling rather than self-correcting. the most common effect is probably that heightened risk perceptions will lead to reduced demand for the amenity that causes exposure, leading to reductions in exposure and reductions in the expert risk assessment, but it is worth noting that the effect is case-specific. the expert risk assessment is therefore not exogenous, and there is a negative feedback loop that operates to counteract rising risk perceptions. as we show later from the simulation outcomes, the base model shows a public risk perception that can be considerably larger than the expert risk assessment. it therefore seems to show 'risk amplification'. but there is no variable that stands for risk in the model: there are only beliefs about risk (called either assessments or perceptions). the idea that social risk amplification is a subjective attribution, not an objective phenomenon, means that this divergence of risk perception and expert assessment does not amount to risk amplification.
and it says that actors see others as being risk amplifiers, or attenuators, and develop their responses accordingly. this means that we need to add to sarf, and the basic model of the previous section, the processes by which actors observe, diagnose and deal with other actors' risk assessments or perceptions. what our fieldwork revealed was that the social system did not correct 'mistaken' risk perceptions in some simpleminded fashion. in other words, it was not the case that people formed risk perceptions, received information about expert assessment, and then corrected their perceptions in the correct direction. instead, as we explained earlier, they found reasons why expert assessments, and in fact the risk views of any other group, might be subject to systematic amplification or attenuation. they then corrected for that amplification. risk managers, on the other hand, had the task of overcoming what they saw as mistaken risk responses in other groups, not simply correcting for them. therefore in the second model, shown in figure 2 , we now have a subsystem in which a risk manager (a government agency or an industrial undertaking in the case of zoonotic disease outbreaks) observes the public risk perception in relation to the expert risk assessment, and communicates a risk level that is designed to compensate for any discrepancy between the two. commercial risk managers will naturally want to counteract risk amplification that leads to revenue losses from product and service boycotts, and governmental risk managers will want to counteract the risk amplification that produces panic and disorder. as beck et al (2005) report, the uk bse inquiry found that risk managers' approach to communicating risk 'was shaped by a consuming fear of provoking an irrational public scare'. 
the effect is symmetrical to the extent that the public in turn observes discrepancies between managerial communications and its own risk perceptions, and attributes amplification or attenuation accordingly. attributions are based on simple memory of past observations. this historical memory of another actor's apparent distortions is sometimes mentioned in the sarf literature (kasperson et al, 1988; poumadere and mays, 2003). this memory is represented as stocks of observed discrepancies, reaching a level m_i(t) for actor i at time t. the managerial memory, for example, is m_g(t) = ∫_0^t log(r_public(τ)/r_expert(τ)) dτ. m_i(t) > 0 implies that actor i sees the other actor as exaggerating risk, while m_i(t) < 0 implies perceived attenuation. the specific deposits in an actor's memory are not retrievable, and equal weight is given to every observation that contributes to it. the perceived scale of amplification is the time average of memory content, and the confidence the actor has in this perceived amplification is 1 - e^(-|m(t)|), which grows asymptotically towards unity as the magnitude of the memory increases. the managerial actor modifies the risk level it communicates by the perceived scale of public amplification raised to the power of its confidence, while the public adjusts the communicated risk level it takes account of by the perceived scale of managerial attenuation raised to the power of its confidence in this. in the third model, in figure 3, we add three elements found in the risk amplification literature that become especially relevant to the idea of risk amplification as a subjective attribution: confusion, distrust and differing perceptions about the significance of behavioural change. the confusion issue reflects the way an otherwise authoritative actor's view tends to be discounted if it shows evidence of confusion, uncertainty or inexplicable change.
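the memory and confidence mechanism just described can be sketched as follows. this is an illustrative reading, not the authors' code: the function names are invented, and the direction of the managerial correction (dividing the communicated risk by the perceived public amplification) is an assumption about how the compensation works.

```python
import math

# sketch of the attribution mechanism in the second model; names and
# the direction of the correction are illustrative assumptions
def update_memory(memory, observed_risk, own_risk, dt):
    """accumulate the log-ratio of the other actor's expressed risk
    level to one's own; positive memory reads as amplification,
    negative as attenuation."""
    return memory + math.log(observed_risk / own_risk) * dt

def confidence(memory):
    """1 - e^(-|m|): grows towards unity as memory accumulates."""
    return 1.0 - math.exp(-abs(memory))

def perceived_amplification(memory, elapsed):
    """time average of memory content, mapped back to a ratio."""
    return math.exp(memory / elapsed)

def communicated_risk(expert_risk, memory, elapsed):
    """the manager scales its communication by the perceived public
    amplification raised to the power of its confidence in it."""
    amp = perceived_amplification(memory, elapsed)
    return expert_risk / amp ** confidence(memory)
```

for example, if the public has consistently expressed twice the expert risk level for one time unit, the perceived amplification is 2, the confidence is 0.5, and the manager communicates the expert level divided by 2^0.5.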
two articles in the recent literature on zoonosis risk (bergeron and sanchez, 2005; heberlein and stedman, 2009) specifically describe the risk amplifying effect of the authorities seeming confused or uncertain. the distrust issue reflects the observation that 'distrust acts to heighten risk perception . . . ' (kasperson et al, 2003), and that it is 'associated with perceptions of deliberate distortion of information, being biased, and having been proven wrong in the past' (frewer, 2003, p 126). a distinguishing aspect of trust and distrust is the basic asymmetry such that trust is quick to be lost and slow to be gained (slovic, 1993). in figure 3, the confusion function is based on the rate of change of attributed amplification, not the rate of change of the communication itself, since some change in communication might appear justified if correlated with a change in public perception: g = 1 - e^(-G|c_g(t)|), where c_g(t) is the change in managerial amplification in unit time and G is the confusion parameter. the distrust function is based on the extent of remembered attributed amplification: f = 1 - e^(-F|m_g(t)|), where m_g(t) is the memory of managerial risk amplification at time t and F is the distrust parameter. there is no obvious finding in the literature that would help us set the value of such a parameter. the combination of the confusion and distrust factors is a combination of an integrator and a differentiator. it is used to determine how much weight is given to managerial risk communications in the formation of the resultant risk perception. it is defined such that as distrust and confusion both approach unity, this weight w tends to zero: w = w_max(1 - g)(1 - f). this weight was exogenous in the previous model, so the effect of introducing confusion and distrust is also to endogenise the way observation of worry is combined with authoritative risk communication. the third addition in this model is an important disproportionality effect.
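the confusion, distrust and weighting functions can be written out directly. a small sketch, assuming modal values of 10 for the confusion and distrust constants (consistent with the ranges quoted later) and an arbitrary w_max of 0.5:

```python
import math

# sketch of the confusion and distrust factors in the third model;
# G, F and W_MAX are assumed values, not taken from the article
G = 10.0      # confusion constant (assumed modal value)
F = 10.0      # distrust constant (assumed modal value)
W_MAX = 0.5   # maximum weight on managerial communications (assumed)

def confusion(c_g):
    """g = 1 - e^(-G|c_g(t)|), where c_g(t) is the change in
    attributed managerial amplification in unit time."""
    return 1.0 - math.exp(-G * abs(c_g))

def distrust(m_g):
    """f = 1 - e^(-F|m_g(t)|), where m_g(t) is the remembered
    attributed managerial amplification."""
    return 1.0 - math.exp(-F * abs(m_g))

def communication_weight(c_g, m_g):
    """w = w_max(1 - g)(1 - f): tends to zero as confusion and
    distrust approach unity."""
    return W_MAX * (1.0 - confusion(c_g)) * (1.0 - distrust(m_g))
```

with no remembered amplification and no change in attribution, the weight stays at w_max; any accumulation of either factor drags it toward zero, which is the mechanism behind the persistent residue shown in the simulation results.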
the previous models assume that risk managers base their view of the public risk perception on some kind of direct observation, for example through clamour, media activity, surveys and so on. in practice, the managerial view is at least partly based on the public's consumption of the amenity that is the carrier of the risk, for example the consumption of beef during the bse crisis, or flight bookings and hotel reservations during the sars outbreak. the problem is that when a foodstuff like beef becomes a risk object it may be easy for many people to stop consuming it, and such a response from the consumer's perspective can be proportionate to even a mild risk assessment. reducing beef consumption is an easy precaution for most of the population to take (frewer, 2003), so rational even when there is little empirical evidence that there is a risk at all (rip, 2006). yet this easy response of boycotting beef may be disastrous for the beef industry, and therefore seem highly disproportionate to the industry, to related industries and to government agencies supporting the industry. unfortunately there is considerable difficulty in quantifying this effect in general terms. recent work (mehers, 2011) looking at the effect of heightened risk perceptions around the avian influenza outbreak at a meat processing plant suggests that the influence on the demand for the associated meat products was very mixed. different regions and different demographic groups showed quite different reactions, for example, and the effect was confounded by actions (particularly price changes) taken by the manufacturer and retailers. our approach is to represent the disproportionality effect with a single exogenous factor: the relative substitutability of the amenity for similar amenities on the supply and demand side. the risk manager interprets any change in public demand for the amenity multiplied by this factor as being the change in public risk perception.
if the change in this inferred public risk perception exceeds that observed directly (for example by opinion survey), then it becomes the determinant of how risk managers think the public are viewing the risk in question. this relative substitutability is entirely a function of the specific industry (and so risk manager) in question: there is no 'societal' value for such a parameter, and the effects of a given risk perception on amenity demand will always be case specific. for example, brahmbhatt and dutta (2008) reported that the sars outbreak led to revenue losses in beijing of 80% in tourist attractions, exhibitions and hotels, but of 10-50% in travel agencies, airlines, railways and so on. the effects are substantial but a long way from being constant.
(figure 3: model of a more complex attributional view of risk amplification.)
in this section we briefly present the outcomes of simulation with two aims: first to show how the successive models produce differences in behaviour, if at all, and thereby to assess how much value there is in the models for policymakers; second to assess how much uncertainty in outcomes such as public risk perception is produced by uncertainty in the exogenous parameters. figure 4 shows the behaviour of the three successive models in terms of public risk perception and expert risk assessment. for the three models, the exogenous variables are set at their modal values, and when variables are shared between models they have the same values. the expert risk assessment is thus very similar for each model, as shown in the figure, rising towards its target level, falling as public risk perception reduces exposure, and then ceasing as the crisis ends around day 40. in the base model, the public risk perception is eight times higher than the expert assessment at its peak, which occurs some 20 days after that in the expert assessment.
but once the attributional view of risk amplification is modelled, this disparity becomes much greater, and it occurs earlier. in the simple attributional system the peak discrepancy is over 40 times, and in the complex attributional system nearly 400 times, both occurring within 8 days of the expert assessment peak. thus the effect of seeing risk amplification as the subjective judgment of one actor about another is, given the assumptions in our models, to polarise risk beliefs much more strongly and somewhat more rapidly. we can no longer call the outcome a 'risk amplification' since, by assumption, there is no longer an objective risk level exogenous to the social system. but there is evidently strong polarisation. there is some qualitative difference in the time profile of risk perception between the three models, as shown in the previous figure, where the peak risk perception occurs earlier in the later models. there are also important qualitative differences in the time profiles of the stock variables amenity demand and worried population, as shown in figure 5. when the attributional view is taken, both demand and worry take longer to recover to initial levels, and when the more complex attributional elements are modelled (the effects of mistrust, confusion and different perceptions of the meaning of changes in demand), the model indicates that little recovery takes place at all. the scale of the recovery depends on the value of the exogenous parameters, and some of these (as we discuss below) are case specific. but of primary importance is the way the weighting given to managerial communications or expert assessment is dragged down by public attributions. this result indicates the importance of a complex, attributional view of risk amplification. unlike the base model, in the attributional model it is much more likely there will be an indefinite residue from a crisis-even when the expert assessment of risk falls to near zero.
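as a purely illustrative sketch of the kind of stock-and-flow dynamics just described (invented parameters and functional forms, not the paper's calibrated model), a minimal euler integration reproduces the qualitative pattern of a delayed, amplified perception peak:

```python
import numpy as np

def simulate(days=80, dt=1.0, amplification=8.0, adjust_time=5.0,
             relax_time=15.0):
    """euler integration of a toy two-stock model: expert assessment tracks an
    exogenous hazard level, while public perception is pulled toward an
    amplified version of the expert signal and relaxes once the crisis ends.
    all parameter values are illustrative assumptions."""
    n = int(days / dt)
    expert = np.zeros(n)
    perception = np.zeros(n)
    for t in range(1, n):
        hazard = 1.0 if t * dt < 40 else 0.0      # crisis ends around day 40
        # expert assessment adjusts toward the current hazard level
        expert[t] = expert[t - 1] + dt * (hazard - expert[t - 1]) / adjust_time
        # perception chases an amplified expert signal with a longer delay,
        # so it peaks later and higher than the expert assessment
        target = amplification * expert[t - 1]
        perception[t] = perception[t - 1] + dt * (target - perception[t - 1]) / relax_time
    return expert, perception

expert, perception = simulate()
```

with these assumed time constants, perception peaks after the expert assessment and well above it, echoing the lag and amplification reported for the base model.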
figures 6 and 7 show the time development of risk perception in the third model in terms of the mean outcome with (a) 95% confidence intervals on the mean and (b) tolerance intervals for 95% confidence in 90% coverage over 1000 runs, with triangular distributions assigned to the exogenous parameters and plausible ranges based solely on the authors' subjective estimates. the exogenous parameters fall into two main groups. the first group is of case-specific factors and would be expected to vary between risk events. this includes, for example, the relative substitutability of the amenity that is the carrier of the risk, and the latency before changes in demand for this amenity change the level of risk exposure. the remaining parameters are better seen as social constants, since there is no theoretical reason to think that they will vary from one risk event to another. these include factors like the natural vigilance period among the population, the normal flow of people into a state of worry, and the latency before people become aware of a discrepancy between emergent risk perception and the proportion of the population that is in a state of worry. figure 6 shows the confidence and tolerance intervals with the social constants varying within their plausible ranges and the case-specific factors fixed at their modal values, and figure 7 vice versa. thus figure 6 shows the effect of our uncertainty about the character of society, whereas figure 7 shows the effect of the variability we would expect among risk events.

[figure 4: outcomes of the three models]

the substantial difference between the means in risk perception between the two figures reflects large differences between means and modes in the distributions attributed to the parameters, which arises because plausible ranges sometimes cover multiple orders of magnitude (eg, the confusion and distrust constants both range from 1 to 100 with modes of 10, and the memory constant from 10 to 1000 with a mode of 100).
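the uncertainty propagation described above amounts to monte carlo sampling from triangular distributions over the plausible ranges; a hedged sketch (the range below mirrors the stated confusion-constant spread, and the tolerance bounds are a simple empirical approximation rather than a formal order-statistic tolerance interval):

```python
import numpy as np

rng = np.random.default_rng(0)

# each uncertain exogenous parameter is drawn from a triangular distribution
# over its plausible range; 1 to 100 with mode 10 matches the kind of
# order-of-magnitude spread described for the confusion constant
samples = rng.triangular(1.0, 10.0, 100.0, size=1000)

mean = samples.mean()

# 95% confidence interval on the mean (normal approximation)
half_width = 1.96 * samples.std(ddof=1) / np.sqrt(len(samples))
ci = (mean - half_width, mean + half_width)

# crude nonparametric tolerance bounds: empirical quantiles covering ~90% of
# runs (a proper tolerance interval would use order-statistic coverage rules)
tol = (np.quantile(samples, 0.05), np.quantile(samples, 0.95))
```

note how the skew of the triangular distribution pulls the sample mean well above the mode, which is exactly the mean-versus-mode discrepancy the text attributes to the order-of-magnitude plausible ranges.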
these figures do not give a complete understanding, not least because interactions between the two sets of parameters are possible, but they show a reasonably robust qualitative profile. figure 8 shows the 'simple' correlation coefficients between resultant risk perception and the policy-relevant exogenous parameters over time, as recommended by ford and flynn (2005) as an indication of the relative importance of model inputs. at each day of the simulation, the sample correlation coefficient is calculated for each parameter over the 1000 runs. no attempt has been made to inspect whether the most important inputs are correlated, and to refine the model in the light of this. nonetheless the figure gives some indication of how influential are the most prominent parameters: the expert initial assessment level (ie, the original scale of the risk according to expert assessment), the expert assessment adjustment time (ie, the delay in the official estimate reflecting the latest information), the base flow (the flow of people between states of non-worry and worry in relation to a risk irrespective of the specific social influences being modelled) and the normal risk perception (the baseline against which the resultant risk perception is gauged, reflecting a level of risk that would be unsurprising and lead to no increase in the numbers of the worried). the first of these is case-specific, but the other three would evidently be worth empirical investigation given their influence in the model. it is extremely difficult to test such outcomes against empirical data because cases differ so widely and it is unusual to find data on simultaneous expert assessments and public perceptions over short-run risk events like disease outbreaks, particularly outbreaks of zoonotic disease. 
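the screening procedure attributed to ford and flynn reduces to computing, for each simulated day, the sample correlation between an exogenous parameter and the outcome across the monte carlo runs; a minimal sketch (synthetic data, invented shapes):

```python
import numpy as np

def screening_correlations(param_values, outcome_trajectories):
    """per-day sample correlation between one exogenous parameter and the
    outcome across runs. param_values has shape (runs,);
    outcome_trajectories has shape (runs, days)."""
    runs, days = outcome_trajectories.shape
    coeffs = np.empty(days)
    for day in range(days):
        coeffs[day] = np.corrcoef(param_values,
                                  outcome_trajectories[:, day])[0, 1]
    return coeffs
```

plotting such coefficients over time, one curve per parameter, gives the kind of importance ranking described for figure 8.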
but a world bank paper of 2008, on the economic effects of infectious disease outbreaks (primarily sars, a zoonotic disease), collected together data gathered on the 2003 sars outbreak, and some-primarily that of lau et al (2003)-showed the day-by-day development of risk perception alongside reported cases. figure 9 is based on lau et al's (2003) data, and shows the number of reported cases of sars as a proportion of the hong kong population at the time, together with the percentage of people in a survey expressing a perception that they had a large or very large chance of infection from sars. the two lines can be regarded as reasonably good proxies for the risk perception and expert assessment outcomes in figure 4, and they show a rough correspondence: a growth in both perception and expertly assessed or measured 'reality', followed by a decay, in which the perception appears strongly exaggerated from the standpoint of the expert assessment. the perceptual gap is about four orders of magnitude-greater than even the more complex attributional system in our modelling. moreover, the risk perception peak occurs early, and in fact leads the reported cases peak. it is our models 2 and especially 3 in which the perception peak occurs early (although it never leads the expert assessment peak).

the implications of the work

the social amplification of risk framework has always been presented as an 'integrative framework' (kasperson et al, 1988), rather than a specific theory, so there has always been a need for more specific modelling to make its basic concepts precise enough to be properly explored. at the same time, as suggested earlier, its implication that there is some true level of risk that becomes distorted in social responses has been criticised for a long time.
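the 'orders of magnitude' comparison above is simply a log-ratio of the two proportions; for example (the figures below are illustrative, not lau et al's exact numbers):

```python
import math

def perceptual_gap_orders(perceived_fraction, measured_fraction):
    # size of the perception/'reality' discrepancy in orders of magnitude
    return math.log10(perceived_fraction / measured_fraction)

# e.g. roughly 20% of respondents perceiving a large chance of infection,
# against reported cases near 0.002% of the population (illustrative values)
gap = perceptual_gap_orders(0.20, 0.00002)
```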
we therefore set out to explore whether it is possible to retain some concept of social risk amplification in cases where even expert opinion tends to be divided, the science is often very incomplete, and past expert assessment has been discredited. zoonotic disease outbreaks provide a context in which such conditions appear to hold. our fieldwork broadly pointed to a social system in which social actors of all kinds privilege their own risk views, in which they nonetheless have to rely on other actors' responses in the absence of direct knowledge or experience of the risks in question, in which they attribute risk amplification or attenuation to other actors, and in which they have reasons to correct for or overcome this amplification. to explore how we can model such processes has been the main purpose of the work we have described. the resulting model provides specific indications of what policymakers need to deal with: a much greater polarisation of risk beliefs, and potentially a residue of worry and loss of demand after the end of a risk crisis. it also has the important implication that risk managers' perspectives should shift, from correcting a public's mistakes about risk to thinking about how their own responses and communications contribute to the public's views about a risk. our approach helps to endogenise the risk perception problem, recognising that it is not simply a flaw in the world 'out there'. it is thus an important step in becoming a more sophisticated risk manager or manager of risk issues (leiss, 2001). it is instructive to compare this model with models like that of luna-reyes et al (2008), which essentially involve a convergent process arising from knowledge sharing and the subsequent development of trust. we demonstrate, in contrast, a process in which there is knowledge sharing, but a sharing that is undermined by expectations of social risk amplification.
observing discrepancies in risk beliefs leads not to correction and consensus but to self-confirmation and polarisation. our findings are in some respects similar to those of greer et al (2006), who were concerned with discrepancies in the perceptions of workload in the eyes of two actors involved in a common project. such discrepancies arose not from exogenous causes but from unclear communication and delay inherent in the social system. all this reinforces the long-held view in the risk community, and of risk communication researchers in particular, that authentic risk communication should involve sustained relationships, and the open recognition of uncertainties and difficulties that would normally be regarded as threats to credibility (otway and wynne, 1989). the reason is not just the moral requirement to avoid the perpetuation of powerful actors' views, and not just the efficiency requirement to maximise the knowledge base that contributes to managing a risk issue. the reason is also that the structure of interactions can be unstable, producing a polarisation of view that none of the actors intended. actors engaged with each other can realise this and overcome it. a basic limitation to the use of the models to support specific risk management decisions, rather than give more general insight into social phenomena, is that there are very few sources of plausible data for some important variables in the model, such as the relaxation delay defining how long people tend to stay worried about a specific risk event before fatigue, boredom or replacement by worry about a new crisis leads them to stop worrying. it is particularly difficult to see where values of the case-specific parameters are going to come from. other sd work on risk amplification at least partly avoids the calibration problem by using unit-less normalised scales and subjective judgments (burns and slovic, 2007).
and one of the benefits of this exploratory modelling is to suggest that such variables are worthwhile subjects for empirical research. but at present the modelling does not support prediction and does not help determine best courses of action at particular points in particular crises. in terms of its more structural limitations, the model is a small one that concentrates specifically on the risk amplification phenomenon, to the exclusion of the many other processes that, in any real situation, risk amplification is connected with. as such, it barely forms a 'microworld' (morecroft, 1988). it contrasts with related work such as martinez-moyano and samsa's (2008) modelling of trust in government, which similarly analyses a continuing interaction between two aggregate actors but draws extensively on cognitive science. however, incorporating a lot more empirical science does not avoid having to make many assumptions and selections that potentially stand in the way of seeing through to how a system produces its outcomes. the more elaborate the model, the more there is to dispute and to undermine the starkness of an interesting phenomenon. we have had to make only a few assumptions about the world, about psychology and about sociology before concluding that social risk amplification, as little more than a subjective attribution, has a strongly destabilising potential. this parsimony reflects towill's (1993) notion that we start the modelling process by looking for the boundary that 'encompasses the smallest number of components within which the dynamic behaviour under study is generated'. the model attempts to introduce nothing that is unnecessary to working out the consequences of risk amplification as an attribution. as ghaffarzadegan et al (2011) point out in their paper on small models applied to problems of public policy, echoing forrester's (2007) argument for 'powerful small models', the point is to gain accessibility and insight.
having only 'a few significant stocks and at most seven or eight major feedback loops', small models can convey the counterintuitive endogenous complexity of situations in a way that policymakers can still follow. they are small enough to show systems in aggregate, to stress the endogeneity of influences on the system's behaviour, and to clearly illustrate how policy resistance comes about (ghaffarzadegan et al, 2011) . as a result they are more promising as tools for developing correct intuitions, and for helping actors who may be trapped in a systemic interaction to overcome this and reach a certain degree of self-awareness (lane, 1999) . the intended contribution of this study has been to show how to model a long-established, qualitative framework for reasoning about risk perception and risk communication, and in the process deal with one of the main criticisms of this framework. the idea that in a society the perception of a risk becomes exaggerated to the point where it bears no relation to our best expert assessments of the risk is an attractive one for policymakers having to deal with what seem to be grossly inflated or grossly under-played public reactions to major events. but this idea has always been vulnerable to the criticism that we cannot know objectively if a risk is being exaggerated, and that expert assessments are as much a product of social processes as lay opinion. the question we posed at the start of the paper was whether, in dropping a commitment to the idea of an objective risk amplification, there is anything left to model and anything left to say to policymakers. our work suggests that there is, and that modelling risk amplification as something that one social actor thinks another is doing is a useful thing to do. there were some simple policy implications emerging from this modelling. 
for example, once you accept that there is no objective standard to indicate when risk amplification is occurring, actors are likely to correct for other actors' apparent risk amplifications and attenuation, instead of simple-mindedly correcting their own risk beliefs. this can have a strongly polarising effect on risk beliefs, and can produce residual worry and loss of demand for associated products and services after a crisis has passed. the limitations of the work point to further developments in several directions. first, there is a need to explore various aspects of how risk managers experience risk amplification. for example, the modelling, as it stands, concentrates on the interactions of actors in the context of a single event or issue-such as a specific zoonotic outbreak. in reality, actors generally have a long history of interaction around earlier events. we take account of history within an event, but not between events. a future step should therefore be to expand the timescale, moving from intra-event interaction to inter-event interaction. the superposition of a longer term process is likely to produce a model in which processes acting over different timescales interact and cannot simply be treated additively (forrester, 1987). it also introduces the strong possibility of discontinuities, particularly when modelling organisational or institutional actors like governments whose doctrines can change radically following elections-rather like the discontinuities that have to be modelled to represent personnel changes and consequences like scapegoating (howick and eden, 2004). another important direction of work would be a modelling of politics and power. it is a common observation in risk controversies that risk is a highly political construction-being used by different groups to gain resources and influence.
as powell and coyle (2005) point out, the system dynamics literature makes little reference to power, raising questions about the appropriateness of our modelling approach to a risk amplification subject-both in its lack of power as an object for modelling, and its inattention to issues of power surrounding the use of the model and its apparent implications. powell and coyle's (2005) politicised influence diagrams might provide a useful medium for representing issues of power, both within the model of risk amplification and in the understanding of the system in which the model might be influential. the notion, as currently expressed in our modelling, that it is always in one actor's interest to somehow correct another's amplification simply looks naïve.

references

- greenpeace v. shell: media exploitation and the social amplification of risk framework (sarf)
- social learning theory
- public administration, science and risk assessment: a case study of the uk bovine spongiform encephalopathy crisis
- media effects on students during sars outbreak
- world bank policy research working paper 4466, the world bank east asia and pacific region chief economist's office
- social amplification of risk and the layering method
- the diffusion of fear: modeling community response to a terrorist strike
- incorporating structural models into research on the social amplification of risk: implications for theory construction and decision making
- social risk amplification as an attribution: the case of zoonotic disease outbreaks
- model-based scenarios for the epidemiology of hiv/aids: the consequences of highly active antiretroviral therapy
- bank runs, deposit insurance, and liquidity
- risk and culture: an essay on the selection of technological and environmental dangers
- risk and relativity: bse and the british media
- the social amplification of risk
- perceiving others' perceptions of risk: still a task for sisyphus
- statistical screening of system dynamics models
- nonlinearity in high-order models of social systems
- system dynamics-the next fifty years
- institutional failure and the organizational amplification of risk: the need for a closer look
- trust, transparency, and social context: implications for social amplification of risk
- workers' compensation and family and medical leave act claim contagion
- how small system dynamics models can help the public policy process
- out of the frying pan into the fire: behavioral reactions to terrorist attacks
- conceptualization: on theory and theorizing using grounded theory
- the discovery of grounded theory
- improving interorganizational baseline alignment in large space system development programs
- socially amplified risk: attitude and behavior change in response to cwd in wisconsin deer
- the social construction of risk objects: or, how to pry open networks of risk
- on the nature of discontinuities in system dynamics modelling of disrupted projects
- reducing risks, protecting people
- bridging the two cultures of risk analysis
- the social amplification and attenuation of risk
- the social amplification of risk: assessing fifteen years of research and theory
- the social amplification of risk
- the social amplification of risk: a conceptual framework
- qualitative methods and analysis in organizational research: a practical guide
- closing the loop: promoting synergies with other theory building approaches to improve systems dynamics practice
- social theory and systems dynamics practice
- monitoring community responses to the sars epidemic in hong kong: from day 10 to day 62
- the chamber of risks: understanding risk controversies
- a tale of two cities: community psychobehavioral surveillance and related impact on outbreak control in hong kong and singapore during the severe acute respiratory syndrome epidemic
- contagion: a theoretical and empirical review and reconceptualization
- the impact of social amplification and attenuation of risk and the public reaction to mad cow disease in canada
- collecting and analyzing qualitative data for system dynamics: methods and models
- knowledge sharing and trust in collaborative requirements analysis
- a feedback theory of trust and confidence in government
- the limits to growth: the 30-year update
- on the quantitative analysis of food scares: an exploratory study into poultry consumers' responses to the 2007 h5n1 avian influenza outbreaks in the uk food supply chain
- dynamic analysis of combat vehicle accidents
- strategy support models
- systems dynamics and microworlds for policymakers
- modelling the oil producers: capturing oil industry knowledge in a behavioural simulation model. modelling for learning, a special issue of the
- science controversies: the dynamics of public disputes in the united states
- risk communication: paradigm and paradox (guest editorial)
- the dynamics of risk amplification and attenuation in context: a french case study
- identifying strategic action in highly politicized contexts using agent-based qualitative system dynamics
- muddling through metaphors to maturity: a commentary on kasperson et al. 'the social amplification of risk'
- risk communication and the social amplification of risk
- should social amplification of risk be counteracted
- folk theories of nanotechnologists
- a social network contagion theory of risk perception
- perception of risk
- perceived risk, trust and democracy
- business dynamics: systems thinking and modelling for a complex world
- understanding social amplification of risk: possible impact of an avian flu pandemic. masters dissertation, sloan school of management and engineering systems division

acknowledgements-many thanks are due to the participants in the fieldwork that underpinned the modelling, and to dominic duckett, who carried out the fieldwork. we would also like to thank the anonymous reviewers of an earlier draft of this article for insights and suggestions that have considerably strengthened it. the work was partly funded by a grant from the uk epsrc.

key: cord-026384-ejk9wjr1
authors: crilly, colin j.; haneuse, sebastien; litt, jonathan s.
title: predicting the outcomes of preterm neonates beyond the neonatal intensive care unit: what are we missing?
date: 2020-05-19
journal: pediatr res
doi: 10.1038/s41390-020-0968-5
sha:
doc_id: 26384
cord_uid: ejk9wjr1

abstract: preterm infants are a population at high risk for mortality and adverse health outcomes. with recent improvements in survival to childhood, increasing attention is being paid to the risk of long-term morbidity, specifically during childhood and young-adulthood. although numerous tools for predicting the functional outcomes of preterm neonates have been developed in the past three decades, no studies have provided a comprehensive overview of these tools, along with their strengths and weaknesses. the purpose of this article is to provide an in-depth, narrative review of the current risk models available for predicting the functional outcomes of preterm neonates. a total of 32 studies describing 43 separate models were considered. we found that most studies used similar physiologic variables and standard regression techniques to develop models that primarily predict the risk of poor neurodevelopmental outcomes. with a recently expanded knowledge regarding the many factors that affect neurodevelopment and other important outcomes, as well as a better understanding of the limitations of traditional analytic methods, we argue that there is great room for improvement in creating risk prediction tools for preterm neonates. we also consider the ethical implications of utilizing these tools for clinical decision-making. impact: based on a literature review of risk prediction models for preterm neonates predicting functional outcomes, future models should aim for more consistent outcome definitions, standardized assessment schedules and measurement tools, and consideration of risk beyond physiologic antecedents.
our review provides a comprehensive analysis and critique of risk prediction models developed for preterm neonates, specifically predicting functional outcomes instead of mortality, to reveal areas of improvement for future studies aiming to develop risk prediction tools for this population. to our knowledge, this is the first literature review and narrative analysis of risk prediction models for preterm neonates regarding their functional outcomes.

preterm infants have long been recognized as a population at high risk for mortality and adverse functional outcomes, including cerebral palsy and intellectual impairment. 1 as mortality rates for preterm neonates decline and more survive to childhood, 2,3 attention has increasingly turned towards measuring longer-term morbidities and related functional impairments during childhood and young-adulthood, as well as identifying risk factors related to these complications. 4,5 while child-specific characteristics, such as gestational age, birth weight, and sex, are well established as predictors of adverse neurodevelopmental outcomes, 6-8 recent work has identified additional factors, including bronchopulmonary dysplasia and family socioeconomic status, that are correlated with relevant outcomes, such as poor neuromotor performance and low intelligence quotient at school age. 9 in clinical settings, the assessment of prognosis can vary widely across neonatologists, 10 making a valid and reliable predictive model for long-term outcomes a highly sought-after clinical tool. moreover, predicting outcomes is vital when making decisions regarding which therapeutic interventions to apply, when providing critical data to parents for informed decision-making, and when matching infants with outpatient services to best meet their needs.
in addition, prediction models are useful in evaluating neonatal intensive care unit (nicu) performance and allowing for between-center comparisons with proper adjustment for the severity of cases being treated. 11 numerous prediction tools have been developed to quantify the risk of death for preterm neonates in the nicu setting, including the score for neonatal acute physiology (snap) and the clinical risk index for babies (crib). 12 the national institute of child health and human development (nichd) risk calculator, predicting survival with and without neurosensory impairment, is widely used to counsel families in the setting of threatened delivery at the edges of viability. 13 furthermore, there are numerous other models that use clinical data from the nicu stay to predict risk for poor functional outcomes in infancy and school age. 14, 15 while several studies have categorized and evaluated the risk prediction models developed and validated in recent decades for mortality, 12, 16 no studies have compared and contrasted risk prediction models for non-mortality outcomes. recently, linsell et al. 17 published a systematic review of risk factor models for neurodevelopmental outcomes in children born very preterm or very low birth weight (vlbw). however, this review focused primarily on overall trends in model development and validation rather than a detailed consideration of individual models. in this article, we conduct an in-depth, narrative review of the current risk models available for predicting the functional outcomes of preterm neonates, evaluating their relative strengths and weaknesses in variable and outcome selection, and considering how risk model development and validation can be improved in the future. towards this, we first provide an overview of the different risk models developed since 1990. 
we then frame our review of these models in terms of the outcomes predicted, the range of predictors considered, and the statistical methods used to select the variables included in the final model, as well as to assess the predictive performance of the model. finally, the ethical implications of integrating risk stratification into standard clinical care for preterm neonates are considered. we conducted a manual search for relevant literature via pubmed, entering combinations of key terms synonymous with "prediction tool," "preterm," and "functional outcome" and reading the abstracts of resulting studies (table 1 ). studies with abstracts that appeared related to our review were then read in full to identify prediction models that were eligible for inclusion. reference lists of included studies were also reviewed, as were articles that later cited these original studies. prediction tools were defined as multivariable risk factor analyses (>2 variables) aiming to predict the probability of developing functional outcomes beyond 6 months corrected age. models that solely investigated associations between individual risk factors and outcomes were excluded, as were models that were not evaluated for predictive ability in terms of either a validation study or an assessment for performance, discrimination, or calibration. tests used to evaluate a model's overall performance were r 2 , adjusted r 2 , and the brier score. the use of a receiver operating characteristic (roc) curve or a c-index evaluated a model's discrimination, and the hosmer-lemeshow test was considered to evaluate a model's calibration. 18 preterm neonates were defined as <37 weeks of completed gestational age. models with vlbw neonates <1500 g were also included, since in the past birth weight served as a substitute for measuring prematurity when gestational age could not be accurately determined. 
models were excluded if they used a cohort entirely composed of infants born prior to 1 january 1990; those born after 1990 were likely to have had surfactant therapy available in the event of respiratory distress syndrome, which significantly reduced the morbidity and mortality rates among preterm neonates nationwide. 19,20 models were also excluded if they limited their prediction to the outcome of survival, if they incorporated variables measured after initial nicu discharge, or if they included subjects who were not necessarily transferred to a nicu for further care following delivery. finally, we excluded tools that only predicted outcomes to an age of <6 months corrected age, as well as case reports, narrative reviews, and tools reported in languages other than english.

overview of risk prediction models

table 2 lists all 32 studies with risk prediction models that meet the inclusion and exclusion criteria. 13-15 from these, a total of 43 distinct models were reported.

from mortality to neurodevelopmental impairment

since 1990, several mortality prediction tools have been evaluated in regards to their ability to predict the likelihood of neurodevelopmental impairment (ndi) among neonates surviving to nicu discharge. one such model is the crib, which incorporates six physiologic variables collected within the first 12 h of the preterm infant's life: birth weight, gestational age, presence of congenital malformations, maximum base excess, and minimum and maximum fio2 requirement. 50 fowlie et al. 24 evaluated how crib models obtained at differing time periods over the first 7 days of life predicted severe disability among a group of infants born ≤31 weeks gestational age or vlbw. in another study, fowlie et al. 25 incorporated cranial ultrasound findings on day of life 3, along with crib scores between 48 and 72 h of life, into their prediction model.
subsequent studies analyzed the crib in its original 12-h form and, with only one exception, 23 determined that it was not a useful tool for predicting long-term ndi or other morbidities. [26] [27] [28] [29] a second example is the snap score. 51 snap uses 28 physiologic parameters collected over the first 24 h of life to predict survival to nicu discharge, and was modified to predict ndi at 1 year and 2-3 years of age. a subsequent assessment of both the snap and the snap with perinatal extension 42 showed poor predictive value for morbidity at 4 years of age in children born vlbw and/or at gestational age ≤31 weeks. 28 finally, the neonatal therapeutic intervention scoring system, a comprehensive exam-based prediction tool for mortality, 52 was found to have poor predictive value for adverse outcomes at 4 years of age in children born very preterm or vlbw. 28

shortened forms of the early physiology-based scoring systems were developed and assessed for their ability to predict outcomes in childhood. application of the crib-ii in a small cohort (n = 107) of infants born <1250 g predicted significant ndi at 3 years of age. 39 however, a subsequent evaluation in a much larger cohort (n = 1328) of preterm infants <29 weeks gestational age concluded that the crib-ii did no better than gestational age or birth weight alone in predicting moderate to severe functional disability at 2-3 years of age. 40 studies have supported an association between snap-ii and snappe-ii scores and neurodevelopmental outcomes and small head circumference at 24 months corrected age. high snap-ii scores were shown to correlate with adverse neurological, cognitive, and behavioral outcomes up to 10 years of age within a large cohort (n = 874) of children born very preterm. 43

antenatal risk factors

several groups have used data from the nichd's neonatal research network (nrn) to design and test various risk prediction models for extremely low birth weight (elbw) newborns.
one of the most widely used risk prediction tools developed from this cohort was that of tyson et al. 13

postnatal morbidity

a large cohort study (n = 910) from schmidt et al. 15, 32 used data from elbw neonates 500-999 g enrolled in the international trial of indomethacin prophylaxis in preterms (tipp). they found that the presence of three morbidities at 36 weeks post-menstrual age (bronchopulmonary dysplasia, serious brain injury, and severe retinopathy of prematurity) had a significant and additive effect on the risk of death or poor neurologic outcome at 18 months corrected age. they developed a model from this relationship that has been corroborated in two studies with smaller samples and by schmidt et al. 15 in a separate, large cohort in which the definition of poor outcome was expanded from solely ndi to "poor general health." 33, 34

letting the machines decide

some innovative work has recently been performed by ambalavanan et al. 14, 35 in creating several risk prediction models. 45 along with studies developing risk prediction tools with data from the nrn and the tipp to predict the outcomes of death and ndi, or solely ndi, the group made the only risk prediction tool for the outcome of rehospitalization, both general and specifically for respiratory complications, using a combination of physiologic and socioeconomic variables incorporated into a decision tree approach. they have also been the only group to create neural network-trained models, using the same small cohort to predict major handicap, low mental development index (mdi), or low psychomotor development index (pdi). the advantage of using neural networks (algorithms that can "learn" mathematical relationships between a series of independent variables and a set of outcomes) is the ability to model complex or nonlinear relationships that can be elucidated by the model without having to consider these relationships a priori (as is typically required when using multiple regression models).
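as a hedged illustration of this last point (synthetic data only; the two "risk factors" and the outcome rule below are hypothetical and have nothing to do with the ambalavanan cohorts), a small neural network can recover an outcome driven purely by an interaction between two variables, whereas a plain logistic regression, which models additive effects, cannot learn it without a hand-specified interaction term:

```python
# illustrative sketch only: synthetic data, not the nrn/tipp cohorts.
# shows why a neural network can capture a nonlinear interaction that a
# plain additive logistic regression misses.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(2000, 2))        # two hypothetical risk factors
y = ((X[:, 0] * X[:, 1]) > 0).astype(int)     # outcome driven purely by their interaction

logit = LogisticRegression().fit(X, y)
mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                    random_state=0).fit(X, y)

print(round(logit.score(X, y), 2))  # near chance: additive model misses the interaction
print(round(mlp.score(X, y), 2))    # high: the network learns it from the data alone
```

the same qualitative gap is what motivates the "a priori" remark above: the regression could match the network here only if the analyst already knew to add the product term.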
despite the use of innovative approaches, however, none of these models differed from other studies in predictive strength or even had high predictive efficacy. 31

limitations of prior approaches

the above literature review highlights the substantial interest in developing a clinically useful risk prediction model and the limits of efforts to date. notwithstanding their differing inclusion and exclusion criteria, existing risk prediction models are relatively similar in terms of variables selected, outcomes analyzed, and statistical strategies employed. with few exceptions, the limitations of existing risk prediction models are especially apparent in their reliance on solely biologic variables and on traditional analytic methods ill-equipped to handle the statistical complexity necessary for risk modeling.

identifying important outcomes. the majority of risk prediction models defined ndi as their primary outcome of interest. making a determination of impairment often relies on standardized measures of cognition in concert with neurosensory deficits. yet researchers often define ndi in different ways, making between-study comparisons difficult. ndi is a construct relating to global abilities encompassing cognition, language, motor function, and vision and hearing. while the tools used to identify ndi are often also used to make diagnoses of developmental delay, ndi is not a clinical term or diagnosis in and of itself. many of the remaining studies also predicted functional outcomes, such as academic performance, executive function, language ability, and autism spectrum disorder (asd). these outcomes may be more meaningful to parents and providers than ndi. 54 to date, only four studies have considered outcomes unrelated to neurodevelopment, such as impaired pulmonary function, "poor general health," and rehospitalization rates.
15, 28, 45, 49 while the emphasis on ndi is unsurprising given the high-risk population, moderate to severe ndi affects only a minority of the preterm population. 55, 56 studies have revealed numerous additional adverse outcomes that preterm individuals are more likely to experience compared to their full-term counterparts, such as impaired respiratory, cardiovascular, and metabolic function. [57] [58] [59] [60] [61] [62] [63] [64] [65] [66] neurodevelopment has been linked to chronic health problems in later childhood. 67 limiting risk prediction to moderate to severe ndi therefore ignores other, more common complications that preterm infants are likely to face and that have an impact on neurodevelopment. this represents a missed opportunity for researchers to better understand what variables influence the likelihood that these problems occur.

the impact of developmental disability on the child and family is completely absent from current risk models. health-related quality of life (hrql), which distinguishes itself as a personal rather than third-party valuation of a patient's physical and emotional well-being, is increasingly appreciated as an important metric necessary to fully understand the impact of prematurity. 68 in a french national survey, the majority of neonatologists, obstetricians, and pediatric neurologists stated that predicting long-term hrql for preterm infants would be beneficial when counseling parents about what additional responsibilities they can anticipate in caring for their child. 69 the trajectory of hrql from childhood to young adulthood appears to improve in both vlbw and extremely low gestational age populations. 70 prediction modeling might aid in determining which factors could positively or negatively impact hrql in this vulnerable population.

finally, we must consider the age at which outcomes are being predicted.
it is evident that rates of ndi rise, and academic achievement in adolescence falls, as gestational age decreases. 71, 72 however, the vast majority of risk prediction models assessed outcomes at the age of 3 years or less, with only three studies doing so at 10 years of age or above. although early childhood outcomes may give clues about later development, many problems do not manifest until later in childhood, such as learning disabilities and certain psychiatric disorders. developmental disability severity can fluctuate throughout childhood, with catch-up occurring in early preterm children and worsening delay in some moderate and late preterm children. 73, 74 although cohorts of preterm infants are not usually followed for more than several years, likely due to lack of resources and expense, recent studies have used data from national registries to link neonatal clinical data to sampled adults, providing evidence of increased rates of adverse neurodevelopmental, behavioral, and educational outcomes among adults born preterm. 75, 76 opportunities are therefore available to use long-term data to extend risk prediction models beyond the first few years of life.

variable selection. most of the risk models reviewed relied primarily on physiologic and clinical measures obtained during the nicu stay. while an emphasis on biologic risk factors is clearly reasonable given the known associations between perinatal morbidities and long-term outcomes, there is strong evidence in the literature of associations between sociodemographic factors, such as parental race, education, and age, and outcomes such as cognitive impairment, cerebral palsy, and mental health disorders in children born preterm.
more specific socioeconomic variables, such as lower parental education, maternal income, insurance status, a parent's foreign country of birth, and socioeconomic status as defined by the elley-irving socioeconomic index, have been repeatedly correlated with reduced mental development index, psychomotor development index, intelligence quotient, and social competence throughout childhood. 71, 72, [77] [78] [79] [80] [81] [82] the geographic area in which preterm neonates are raised could also have a profound influence on their development. neighborhood poverty rate, high school dropout rate, and place of residence (metropolitan vs. non-metropolitan) have all been correlated with academic skills and rates of mental health disorders among low birth weight children. 83, 84

only 12 of the 43 models reviewed included socioeconomic variables. this may be due, at least in part, to the difficulty of obtaining social, economic, and demographic data; these variables are often not collected upon hospital admission. additionally, socioeconomic information is often poorly, inaccurately, and variably recorded, or is largely missing. 85 some risk prediction models collected socioeconomic variables at the follow-up visit when outcomes were assessed. this is an imperfect method given that factors such as household setting and family income may change substantially in the years following nicu discharge and affect children's health. 86, 87 in some models, socioeconomic variables were not included because they did not significantly improve the model's predictive ability. 45 testing the effects of social factors on infant and child outcomes requires samples that are socially and economically diverse. even large, diverse study populations may become more homogeneous over time, as subjects of lower socioeconomic status and non-white race are more likely to drop out of studies dependent on long-term follow-up.
41 treating socioeconomic variables as statistically independent factors rather than as interrelated might also minimize the impact of contextual information on neurodevelopmental outcomes.

model development. of the 32 papers included in the review, 12 reported on de novo risk prediction tools. the other 20 studies either evaluated a previous model or adjusted a prior model by changing the times at which data were collected or by adding additional variables. the approach to prediction tool development was almost uniform among the studies, with nine of the models solely using regression techniques to select variables. ambalavanan et al. deviated from this method in three separate studies: two using classification tree analysis, 35, 45 and one using a four-layer back-propagation neural network. 31 each new model, with the exception of the neural network-based models by ambalavanan et al., 35, 45 depended on an approach in which individual variables were selected and treated as independent of one another as they were analyzed for their ability to predict the outcome of interest. yet variables may, in fact, not act independently. while parsing the roles of potential interrelationships may be computationally onerous, and treating variables independently may lead to a more parsimonious model, this may come at the expense of accuracy. alternative computational approaches are needed to account for the differential likelihoods of certain outcomes on the causal pathway from preterm birth to later childhood outcome. nonlinear statistical tools should be further utilized in risk prediction model development to examine the relationships between variables and outcomes of interest. machine learning, for instance, is a method of inputting a group of variables and generating a predictive model without assuming independence between the factors or that specific factors will contribute the most to the model.
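the idea can be sketched with a tree ensemble on synthetic data (the variable names, effect sizes, and outcome rule below are hypothetical, invented purely for illustration; none of this reflects the reviewed cohorts): the model is handed all candidate predictors at once, including an interaction and a pure-noise variable, and ranks them by learned importance rather than by pre-specified independence assumptions:

```python
# hedged sketch: a tree ensemble ranks predictors by learned importance rather
# than requiring them to be pre-selected or treated as independent.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 3000
ga = rng.normal(28, 2, n)        # hypothetical gestational age (weeks)
ses = rng.normal(0, 1, n)        # hypothetical socioeconomic index
noise = rng.normal(0, 1, n)      # irrelevant variable, included on purpose
# outcome risk depends on ga and on a ga-by-ses interaction, not on noise
p = 1 / (1 + np.exp(-(28 - ga + 0.8 * ses * (30 - ga) / 2)))
y = (rng.uniform(size=n) < p).astype(int)

X = np.column_stack([ga, ses, noise])
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
for name, imp in zip(["ga", "ses", "noise"], rf.feature_importances_):
    print(name, round(imp, 2))   # ga ranked highest, noise lowest
```

note that nothing about the ga-ses interaction was specified to the model; the ensemble discovers which inputs carry signal on its own, which is exactly the property the paragraph above argues regression-based variable selection lacks.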
88 different forms of machine learning have already been employed in nicus to extract the most important variables for predicting outcomes such as days to discharge. 89

the non-independence of risk factors is also complicated by the role of time in models of human health and development. the lifecourse framework describes how an accumulation, or "chains," of risk experienced over time and at certain critical periods impacts later health outcomes. 90 in the context of preterm birth, the risk of being born early is not uniform across populations and depends on a given set of maternal risks. in turn, the degree of prematurity imparts differential risk for developing complications such as bronchopulmonary dysplasia, necrotizing enterocolitis, or retinopathy of prematurity. these morbidities then, in turn, increase risks for further medical and developmental impairment. these time-varying probabilities can be modeled and incorporated into prediction tools to more accurately capture the longitudinal and varying relationships between exposures and outcomes, and thereby improve estimates of risk. [91] [92] [93]

a final methodological concern regarding model development is whether and how the competing risk of death is considered when the outcome being predicted is non-terminal. consider, for example, the task of developing a model for the risk of ndi at 10 years of age. how one handles death can have a dramatic effect on the model, especially since mortality is relatively high among preterm infants. moreover, if death is treated simply as a censoring mechanism, as is often done in time-to-event analyses such as those based on the cox model, then the overall risk of ndi will be artificially reduced; those children who die before being diagnosed with ndi will be viewed as remaining at risk even though they cannot possibly be subsequently diagnosed with ndi.
while an alternative would be to use a composite outcome of time to the first of ndi or death, doing so may result in a model that is unable to predict either event well. instead, one promising avenue is to frame the development of a prediction model for ndi within the semi-competing risks paradigm. 94, 95 briefly, semi-competing risks refer to settings where one event is a competing risk for the other, but not vice versa. this is distinct from standard competing risks, where each event is a competing risk for the other (e.g., death due to one cause or another). to the best of our knowledge, however, semi-competing risks have not been applied to the study of long-term outcomes among preterm infants.

model evaluation. waljee et al. 18 provide a summary of methods for assessing the performance of a predictive model, categorizing them into three types: overall model performance, which focuses on the extent of variation in risk explained by the model; calibration, which assesses differences between observed and predicted event rates; and discrimination, which assesses the ability to distinguish between patients who do and do not experience the outcome of interest. the majority of studies in our review assessed their models with roc curve analysis, a method of assessing discrimination. while widely used, there is some debate about roc-based assessment, specifically regarding its lack of sensitivity in distinguishing between good predictive models. 96 although several novel performance measures for comparing discrimination among models have been proposed, none have been employed in the context of comparing risk prediction tools for preterm neonates. 97, 98 few studies employed analyses other than roc: only six in our review assessed overall performance with r2 or partial r2, and five evaluated calibration using the hosmer-lemeshow test.
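the three evaluation families just described can be sketched on synthetic predictions (no neonatal data are involved; the predicted risks are simulated, and the outcomes are drawn from them so the hypothetical model is well calibrated by construction):

```python
# hedged sketch of the three evaluation families: overall performance (brier
# score), discrimination (roc auc), and calibration (observed vs predicted
# event rates within risk deciles, the grouping idea behind hosmer-lemeshow).
import numpy as np
from sklearn.metrics import brier_score_loss, roc_auc_score

rng = np.random.default_rng(2)
p_hat = rng.uniform(0.05, 0.95, 5000)             # hypothetical predicted risks
y = (rng.uniform(size=5000) < p_hat).astype(int)  # outcomes drawn from those risks

print(round(brier_score_loss(y, p_hat), 2))  # overall performance (lower is better)
print(round(roc_auc_score(y, p_hat), 2))     # discrimination (0.5 = chance)

# calibration: mean predicted vs observed event rate within risk deciles
deciles = np.digitize(p_hat, np.quantile(p_hat, np.linspace(0.1, 0.9, 9)))
for d in range(10):
    m = deciles == d
    print(d, round(p_hat[m].mean(), 2), round(y[m].mean(), 2))  # should track closely
```

a model can score well on one family and poorly on another (e.g., good discrimination with systematically miscalibrated risks), which is why the review distinguishes the three rather than treating an roc curve as a complete assessment.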
another four studies assessed internal validation with either an internal validation set or bootstrapping techniques. 99 nine studies met the inclusion criteria solely because their models were externally validated in other studies. schmidt et al. 32 reported odds ratio associations for their 3-morbidity model, which are not a reliable means of determining the strength of risk prediction tools. 100 future risk model assessments for preterm neonates should at minimum include roc curve analysis, although assessments of overall performance and calibration would also be helpful. validation with a sample different from the development set is also advised, ideally with a population outside the original cohort. 18

conclusion

risk assessment and outcomes prediction are valuable tools in medical decision-making. fortunately, infants born prematurely enjoy an ever-increasing likelihood of survival. research over the past several decades has highlighted the many influences, physiologic and psychosocial, affecting neurodevelopment, hrql, and health services utilization. yet the wealth of knowledge gained from longitudinal studies of growth and development is not reflected in current risk prediction models. moreover, some of the most well-known and widely used tools today, such as tyson et al.'s 13 five-factor model, were developed nearly two decades ago. as advances in neonatal intensive care progressively reduce the risk of certain outcomes, it is clear that these older models require updating if they are to be of continued clinical use. it should be recognized that there are potential ethical ramifications to incorporating more psychosocial factors and outcomes into risk prediction models, such as crossing the line from risk stratification to "profiling" patients and offering different treatment decisions based on race or class.
101 however, physician predictions without the aid of prediction tools are highly inconsistent during counseling at the margins of viability, and further research is needed regarding the level of influence that physicians actually have on caregiver decision-making during counseling, as well as the extent to which risk prediction tools would change their approach to counseling. 10 in addition, despite recent innovation in statistical approaches to risk modeling, such as machine learning, most prediction tools rely on standard regression techniques. insofar as risk prediction models will continue to be developed for preterm neonatal care, making use of the clinical data available in most modern electronic health records, and taking into consideration the analytic challenges related to unequal prior probabilities of exposures, non-independence of variables, and semi-competing risks, can only strengthen our approach to predicting outcomes. we therefore recommend taking a broader view of risk, incorporating these concepts in creating stronger risk prediction tools that can ultimately serve to benefit the long-term care of preterm neonates.

c.j.c. and j.s.l. designed and carried out this literature review. c.j.c., j.s.l., and s.h. worked jointly in the analysis and interpretation of the literature review results, as well as the drafting and revision of this article. all three authors gave final approval of the version to be published.

references (titles as recovered from extraction; authors, journals, and years were lost):
- on the influence of abnormal parturition, difficult labours, premature birth, and asphyxia neonatorum, on the mental and physical condition of the child, especially in relation to deformities
- trends in care practices, morbidity, and mortality of extremely preterm neonates
- survival of infants born at periviable gestational ages
- outcomes of preterm infants: morbidity replaces mortality
- institute of medicine committee on understanding premature birth and assuring healthy outcomes. preterm birth: causes, consequences, and prevention
- influence of birth weight, sex, and plurality on neonatal loss in united states
- preterm neonatal morbidity and mortality by gestational age: a contemporary cohort
- gestational age and birthweight for risk assessment of neurodevelopmental impairment or death in extremely preterm infants
- neurodevelopmental outcome at 5 years of age of a national cohort of extremely low birth weight infants who were born in 1996-1997
- comparing neonatal morbidity and mortality estimates across specialty in periviable counseling
- prognosis and prognostic research: what, why, and how?
- neonatal disease severity scoring systems
- intensive care for extreme prematurity-moving beyond gestational age
- outcome trajectories in extremely preterm infants
- prediction of late death or disability at age 5 years using a count of 3 neonatal morbidities in very low birth weight infants
- prediction of mortality in very premature infants: a systematic review of prediction models
- risk factor models for neurodevelopmental outcomes in children born very preterm or with very low birth weight: a systematic review of methodology and reporting
- a primer on predictive models
- pulmonary surfactant therapy
- the future of exogenous surfactant therapy
- nursery neurobiologic risk score and outcome at 18 months
- evaluation of the ability of neurobiological, neurodevelopmental and socio-economic variables to predict cognitive outcome in premature infants. child care health dev
- increased survival and deteriorating developmental outcome in 23 to 25 week old gestation infants, 1990-4 compared with 1984-9
- measurement properties of the clinical risk index for babies-reliability, validity beyond the first 12 hours, and responsiveness over 7 days
- predicting the outcomes of preterm neonates beyond the neonatal intensive [title truncated in extraction]
- predicting outcome in very low birthweight infants using an objective measure of illness severity and cranial ultrasound scanning
- is the crib score (clinical risk index for babies) a valid tool in predicting neurodevelopmental outcome in extremely low birth weight infants?
- the crib (clinical risk index for babies) score and neurodevelopmental impairment at one year corrected age in very low birth weight infants
- can severity-of-illness indices for neonatal intensive care predict outcome at 4 years of age?
- neurodevelopment of children born very preterm and free of severe disabilities: the nord-pas de calais epipage cohort study
- chronic physiologic instability is associated with neurodevelopmental morbidity at one and two years in extremely premature infants
- prediction of neurologic morbidity in extremely low birth weight infants
- impact of bronchopulmonary dysplasia, brain injury, and severe retinopathy on the outcome of extremely low-birth-weight infants at 18 months: results from the trial of indomethacin prophylaxis in preterms
- impact at age 11 years of major neonatal morbidities in children born extremely preterm
- effect of severe neonatal morbidities on long term outcome in extremely low birthweight infants
- early prediction of poor outcome in extremely low birth weight infants by classification tree analysis
- consequences and risks of <1000-g birth weight for neuropsychological skills, achievement, and adaptive functioning
- clinical data predict neurodevelopmental outcome better than head ultrasound in extremely low birth weight infants
- infant outcomes after periviable birth; external validation of the neonatal research network estimator with the beam trial
- clinical risk index for babies score for the prediction of neurodevelopmental outcomes at 3 years of age in infants of very low birthweight
- nsw and act neonatal intensive care units audit group. can the early condition at admission of a high-risk infant aid in the prediction of mortality and poor neurodevelopmental outcome? a population study in australia
- autism spectrum disorders in extremely preterm children
- snap-ii and snappe-ii and the risk of structural and functional brain disorders in extremely low gestational age newborns: the elgan study
- early postnatal illness severity scores predict neurodevelopmental impairments at 10 years of age in children born extremely preterm
- high prevalence/low severity language delay in preschool children born very preterm
- identification of extremely premature infants at high risk of rehospitalization
- screening for autism spectrum disorders in extremely preterm infants
- perinatal risk factors for neurocognitive impairments in preschool children born very preterm
- correlation between initial neonatal and early childhood outcomes following preterm birth
- bronchopulmonary dysplasia and perinatal characteristics predict 1-year respiratory outcomes in newborns born at extremely low gestational age: a prospective cohort study
- the international neonatal network. the crib (clinical risk index for babies) score: a tool for assessing initial neonatal risk and comparing performance of neonatal intensive care units
- score for neonatal acute physiology: a physiologic severity index for neonatal intensive care
- neonatal therapeutic intervention scoring system: a therapy-based severity-of-illness index
- prediction of death for extremely premature infants in a population-based cohort
- parental perspectives regarding outcomes of very preterm infants: toward a balanced approach
- risk of developmental delay increases exponentially as gestational age of preterm infants decreases: a cohort study at age 4 years
- preterm birth-associated neurodevelopmental impairment estimates at regional and global levels for 2010
- late respiratory outcomes after preterm birth
- respiratory health in pre-school and school age children following extremely preterm birth
- preterm delivery and asthma: a systematic review and meta-analysis
- preterm birth, infant weight gain, and childhood asthma risk: a meta-analysis of 147,000 european children
- preterm birth: risk factor for early-onset chronic diseases
- preterm heart in adult life: cardiovascular magnetic resonance reveals distinct differences in left ventricular mass, geometry, and function
- right ventricular systolic dysfunction in young adults born preterm
- elevated blood pressure in preterm-born offspring associates with a distinct antiangiogenic state and microvascular abnormalities in adult life
- preterm birth and the metabolic syndrome in adult life: a systematic review and meta-analysis
- prevalence of diabetes and obesity in association with prematurity and growth restriction
- prematurity: an overview and public health implications
- measurement of quality of life of survivors of neonatal intensive care: critique and implications
- quality of life assessment in preterm children: physicians' knowledge, attitude, belief, practice - a kabp study
- health-related quality of life and emotional and behavioral difficulties after extreme preterm birth: developmental trajectories
- prognostic factors for poor cognitive development in children born very preterm or with very low birth weight: a systematic review
- prognostic factors for cerebral palsy and motor impairment in children born very preterm or very low birthweight: a systematic review
- evidence for catch-up in cognition and receptive vocabulary among adolescents born very preterm
- the economic burden of prematurity in canada
- changing definitions of long-term follow-up: should "long term" be even longer?
- functional outcomes of very premature infants into adulthood
- social competence of preschool children born very preterm
- prediction of cognitive abilities at the age of 5 years using developmental follow-up assessments at the age of 2 and 3 years in very preterm children
- predicting the outcomes of preterm neonates beyond the neonatal intensive [title truncated in extraction]
- perinatal risk factors of adverse outcome in very preterm children: a role of initial treatment of respiratory insufficiency?
- the relationship between behavior ratings and concurrent and subsequent mental and motor performance in toddlers born at extremely low birth weight
- prognostic factors for behavioral problems and psychiatric disorders in children born very preterm or very low birth weight: a systematic review
- neurodevelopmental outcomes of extremely low birth weight infants <32 weeks' gestation between [title truncated in extraction]
- neighborhood influences on the academic achievement of extremely low birth weight children
- mental health outcomes in us children and adolescents born prematurely or with low birthweight
- measurement of socioeconomic status in health disparities research
- family income trajectory during childhood is associated with adiposity in adolescence: a latent class growth analysis
- family income trajectory during childhood is associated with adolescent cigarette smoking and alcohol use
- machine learning in medicine: a primer for physicians
- predicting discharge dates from the nicu using progress note data
- a life course approach to chronic diseases epidemiology 2nd edn. a life course approach to adult health
- scientists rise up against statistical significance
- the asa's statement on p-values: context, process, and purpose
- time for clinicians to embrace their inner bayesian? reanalysis of results of a clinical trial of extracorporeal membrane oxygenation
- semi-competing risks data analysis: accounting for death as a competing risk when the outcome of interest is nonterminal
- beyond composite endpoints analysis: semicompeting risks as an underutilized framework for cancer research
- use and misuse of the receiver operating characteristic curve in risk prediction
- assessing the performance of prediction models: a framework for traditional and novel measures
- novel metrics for evaluating improvement in discrimination: net reclassification and integrated discrimination improvement for normal variables and nested models
- multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors
- limitations of the odds ratio in gauging the performance of a diagnostic, prognostic, or screening marker
- just health: on the conditions for acceptable and unacceptable priority settings with respect to patients' socioeconomic status

(stray table fragment: auc 0.703; sensitivity 27.6%; specificity 87.3%)

competing interests: the authors declare no competing interests.

key: cord-017595-v3rllyyu
authors: puzyn, tomasz; gajewicz, agnieszka; leszczynska, danuta; leszczynski, jerzy
title: nanomaterials – the next great challenge for qsar modelers
date: 2009-06-25
journal: recent advances in qsar studies
doi: 10.1007/978-1-4020-9783-6_14
sha:
doc_id: 17595
cord_uid: v3rllyyu

in this final chapter a new perspective for the application of qsar in the nanosciences is discussed. the role of nanomaterials is rapidly increasing in many aspects of everyday life. this is promoting a wide range of research needs related to both the design of new materials with required properties and the performance of a comprehensive risk assessment of manufactured nanoparticles.
the development of nanoscience also opens new areas for qsar modelers. we have begun this contribution with a detailed discussion of the remarkable physical-chemical properties of nanomaterials and their specific toxicities. both these factors should be considered as potential endpoints for further nano-qsar studies. then, we have highlighted the status and research needs in the area of molecular descriptors applicable to nanomaterials. finally, we have put together currently available nano-qsar models related to the physico-chemical endpoints of nanoparticles and their activity. although we have observed many problems (i.e., a lack of experimental data, insufficient and inadequate descriptors), we do believe that application of qsar methodology will significantly support nanoscience in the near future. development of reliable nano-qsars can be considered the next challenging task for the qsar community.

one diameter of 100 nm or less. when nanoparticles are intentionally synthesized to be used in consumer goods, they are called "nanomaterials" [2]. nowadays, 50 years after feynman's lecture, nanotechnology has emerged at the forefront of science and technology developments, and nanomaterials have found a wide range of applications in different aspects of human life. for example, nanoparticles of such inorganic oxides as tio2 and zno are used in cosmetics [3], sunscreens [3], solar-driven self-cleaning coatings [4], and textiles [5]. nanosized cuo has replaced noble metals in newer catalytic converters for the car industry [6]. nanopowders of metals can be used as antibacterial substrates (e.g., the combination of the pure nanosilver ion with fiber to create anti-odor socks) [7]. finally, metal salts (e.g., cdse quantum dots) have found many applications in electronics and biomedical imaging techniques [8, 9]. the discoveries of fullerene (c60) in 1985 by kroto et al.
[10] and carbon nanotubes in 1991 by iijima [11] opened a new area of the tailored design of carbon-based nanomaterials. carbon-based nanomaterials are currently used, among other applications, for synthesis of polymers characterized by enhanced solubility and processability [12] and for manufacturing of biosensors [13]. they also contribute to a broad range of environmental technologies including sorbents, high-flux membranes, depth filters, antimicrobial agents, and renewable energy supplies [14]. according to current analysis [15], about 500 different products containing nanomaterials were officially on the market in 2007. most of them (247) have been manufactured in the usa, 123 in east asia (china, taiwan, korea, japan), 76 in europe, and only 27 in other countries. it is interesting that this number (500) is two times higher than the number of nanoproducts in the previous year. investments in the nanotechnology industry grew from $13 billion in 2004 to $50 billion in 2006 and, if one can believe the forecast, will reach $2.6 trillion in 2014. without doubt, nothing is able to stop such a rapidly developing branch of technology and we should be prepared for (better or worse) living day by day in symbiosis with nanomaterials. the astonishing physical and chemical properties of engineered nanoparticles are attributable to their small size. at the nanometer scale, finite size effects such as surface area and size distribution can cause nanoparticles to have significantly different properties as compared to the bulk material [16]. for instance, by decreasing the size of gold samples one induces color changes from bright yellow through reddish-purple up to blue. however, from the physico-chemical viewpoint, the novel properties of nanoparticles can also be determined by their chemical composition, surface structure, solubility, shape, ratio of particles in relation to agglomerates, and surface area to volume ratio.
all these factors may give rise to unique electronic, magnetic, optical, and structural properties and, therefore, lead to opportunities for using nanomaterials in novel applications and devices [16]. new, characteristic properties of nanomaterials include greater hardness, rigidity, high thermal stability, higher yield strength, flexibility, ductility, and high refractive index. the band gap of nanometer-scale semiconductor structures increases as the size of the nanostructure decreases, raising expectations for many possible optical and photonic applications [17]. with respect to the size of the grains, it has been suggested that nanomaterials would exhibit increased (typically 3-5 times) strength and hardness as compared to their microcrystalline counterparts. for example, the strength of nanocrystalline nickel is about five times higher than that of the corresponding microcrystalline nickel [18]. interestingly, the observed strength of crystalline nanomaterials is accompanied by a loss of ductility, which can result in a limitation of their utility [19]. however, some of the nanocrystalline materials have the ability to undergo considerable elongation and plastic deformation without failing (even up to 100-300%). such machinability and superplasticity properties have been observed for ceramics (including monoliths and composites), metals (including aluminum, magnesium, iron, titanium), intermetallics (including iron, nickel, and titanium base), and laminates [20]. although the weight of carbon nanotubes is about one-sixth of the weight of steel, their young's modulus and tensile strength are, respectively, five and 100 times higher than those of steel [21]. in addition, because of their very small sizes and surface/interface effects such as fundamental changes in coordination, symmetry, and confinement, nanoparticles may exhibit high magnetic susceptibility.
a variety of nanoparticles reveal anomalous magnetic properties such as superparamagnetism. this opens new areas of potential application for them, such as data storage and ferrofluid technology [22]. according to recent studies, nanoparticles may also have great potential in medical applications, mostly due to their good biocompatibility that allows them to promote electron transfer between electrodes and biological molecules. for instance, the high biocompatibility of magnetite nanocrystals (fe3o4) makes them potentially useful as magnetic resonance imaging contrast agents [23]. one of the unique aspects of nanoparticles is their high wettability, termed by fujishima [24] as superhydrophilicity. depending upon the chemical composition, the surface can exhibit superhydrophilic characteristics. for example, titanium dioxide (tio2), at sizes below a few nm, can decrease the water contact angle to 0±1° [24]. nano-sized composites, due to the chemical composition and viscosity of the intercrystalline phase, may provide a significant increase in creep resistance. it has been demonstrated that alumina/silicon carbide composites are characterized by a minimum creep rate, three times lower than the corresponding monolith [25]. as mentioned in section 14.1, different types of nanomaterials are increasingly being developed and used by industry. however, little is known about their toxicity, including possible mutagenic and/or carcinogenic effects [26]. some recent contributions report evident toxicity and/or ecotoxicity of selected nanoparticles and highlight the potential risk related to the development of nanoengineering. evidently, there is insufficient knowledge regarding the harmful interactions of nanoparticles with biological systems as well as with the environment. it is well known that the most important parameters with respect to the induction of adverse effects by a xenobiotic compound are its dose, dimension, and durability.
moreover, it is well established that nano-sized particles, due to their unique physical and chemical properties discussed above, behave differently from their larger counterparts of the same chemical composition [26-31]. because of the difference between nanoparticles and bulk chemicals, the risk characterization of bulk materials cannot be directly extrapolated to nanomaterials. the biological activity of nanoparticles and their unique properties causing harmful effects are highly dependent on their size. nanoparticles, because of their small size, may pass organ barriers such as skin, olfactory mucosa, and the blood-brain barrier [32-34], readily travel within the circulatory system of a host, and deposit in target organs. this is not possible with the same material in a larger form [35]. indeed, reduction of the particle's size to the nanoscale level results in a steady increase of the surface to volume ratio. as a consequence, a larger number of potentially active groups per mass unit is "available" on the surface and might interact with biological systems [35]. this is one possible explanation why nano-sized particles of a given compound are generally more toxic than the same compound in its larger form [36]. however, oberdörster et al. [37] suggested that the particle size is not the only possible factor influencing toxicity of nanomaterials. the following features should also be considered:
• size distribution,
• agglomeration state,
• shape,
• porosity,
• surface area,
• chemical composition,
• structure-dependent electronic configuration,
• surface chemistry,
• surface charge, and
• crystal structure.
natural and anthropogenic nanoparticles gain access into the human body through the main ports of entry including the lungs, the skin, or the gastrointestinal tract.
the unique properties of nanoparticles allow them not only to penetrate physiological barriers but also to travel throughout the body and interact with subcellular structures. toxicological studies show that nanoparticles can be found in various cells and cell structures such as mitochondria [38, 39], lipid vesicles [40], fibroblasts [41], nuclei [42], and macrophages [43]. depending on their localization inside the cell, nanoparticles can induce formation of reactive oxygen species (ros), for instance superoxide and hydroxyl radicals, as well as reactive nitrogen [44], sulfur [45], and other species stressing the body in a similar manner to ros [46]. this results in oxidative stress and inflammation, leading to impacts on lung and cardiovascular health [16]. it is worth noting that normally, due to the presence of antioxidant molecules (i.e., vitamin c and glutathione), the body's cells are able to defend themselves against ros and free radical damage. however, when a large dose of strongly electrophilic nanoparticles enters the body, the balance between reduced glutathione (gsh) and its oxidized form (gssg) is destroyed [47] and the unscavenged oxidants cause cell injuries by attacking dna, proteins, and membranes [48]. at the cellular level, oxidative stress is currently the best developed paradigm depicting the harmful effects of nano-sized particles [31, 49, 50]. the mechanism of oxidative stress occurring at the molecular level is mainly responsible for the observed cytotoxic and genotoxic effects induced by nanoparticles. cytotoxicity of selected nanospecies has been confirmed by many researchers. for example, fullerene (c60) particles suspended in water are characterized by antibacterial activity against escherichia coli and bacillus subtilis [51] and by cytotoxicity to human cell lines [52]. single- and multiwalled carbon nanotubes (swcnts and mwcnts) are also toxic to human cells [41, 53].
nano-sized silicon oxide (sio2), anatase (tio2), and zinc oxide (zno) can induce pulmonary inflammation in rodents and humans [54-56]. epidemiological studies have shown that nanoparticles might be genotoxic to humans [57]. irreversible dna modifications resulting from the activity of ros may lead to heritable mutations, involving a single gene, a block of genes, or even whole chromosomes. dna damage may also disrupt various normal intracellular processes, such as dna replication, and modulate gene transcription, causing abnormal function or cell death [16, 44, 58]. until now, more than 100 different oxidative dna lesions have been found. the most investigated oh-related dna lesion is 8-hydroxydeoxyguanosine (8-ohdg) [59], which may be induced by several particles such as asbestos, crystalline silica, and coal fly ashes. oxygen free radicals may overwhelm the antioxidant defense system by mediating formation of base adducts, such as 8-hydroxydeoxyguanosine, and therefore play a key role in initiation of carcinogenesis [60]. data on neurotoxic effects of engineered nanoparticles are very limited, but it has been reported that inhaled nanoparticles, depending on their size, may be distributed to organs and surrounding tissues, including the olfactory mucosa or bronchial epithelium, and then can be translocated via the olfactory nerves to the central nervous system [61]. there is also some evidence that nano-sized particles can penetrate and pass along nerve axons and dendrites of neurons into the brain [33]. recent studies confirm the translocation of nanoparticles from the respiratory tract into the central nervous system; for example, inhalation of 30 nm manganese oxide in rats showed that manganese can be taken up into olfactory neurons and accumulated in the olfactory bulb [34]. the particles at the nanoscale may also gain access to the brain across the blood-brain barrier [2].
there is experimental evidence that oxidative stress also plays an important role in neurodegenerative diseases and brain pathology, for instance, hallervorden-spatz syndrome, pick's disease, alzheimer's disease, or parkinson's disease [62]. the effects of nanoparticles on the immune system are still unclear. although the reticuloendothelial system (res) is able to eliminate nanoparticles, several toxicological studies have suggested that nanoscale particles' interaction with the defense activities of immune cells can change their antigenicity and stimulate and/or suppress immune responses. direct experiments showed that uptake of nanoparticle-protein complexes by dendritic cells and macrophages may change the formation of the antigen and initiate an autoimmune response [16]. several studies have also reported that nanoparticles may induce damage to red blood cells (erythrocytes). bosi et al. [63] have studied the hemolytic effect of different water-soluble c60 fullerenes. preliminary results indicate that hemolytic activity depends on the number and position of the cationic surface groups. however, no clinically relevant toxicity has yet been demonstrated [64]. nano-sized particles from sources such as volcanic ash, dust storms, or smoke from natural fires have always been present in the environment. however, the recent progress of industry has increased engineered nanoparticle pollution. the unique size-specific behavior and specific physical-chemical properties, in combination with toxicity to particular living organisms, may also result in harmful effects on the level of whole environmental ecosystems [65]. in the pioneering report on the non-human toxicity of fullerene, eva oberdörster [66] observed that manufactured nanomaterials can have negative impacts on aquatic organisms.
water-soluble c60 fullerenes cause oxidative damage (lipid peroxidation in the brain) and depletion of glutathione in the gill of juvenile largemouth bass (micropterus salmoides) at a concentration of 0.5 ppm. however, these results might be disputable, because the authors used the organic solvent tetrahydrofuran (thf) to disaggregate the c60 fullerenes; thf is classified as a neurotoxin [67]. subsequently, lovern and klaper [68] observed the toxicological impact of nanoparticles of fullerenes (c60) and titanium dioxide (tio2) on daphnia magna: c60 and tio2 caused mortality with an lc50 value of 5.5 ppm for tio2 and an lc50 value of 460 ppb for the fullerene. in this case the authors also used thf for solubilization of the hydrophobic c60, thus the results are also of lower credibility. interestingly, in similar experiments by andrievsky et al. [69] with "fullerene water solutions" (hydrated fullerenes, c60·nh2o), no mortality was observed. in a later study, adams et al. [70] confirmed the acute toxicity of selected nanosized metal oxides against d. magna. they observed that sio2 particles were the least toxic and that toxicity increased from sio2 to tio2 to zno. a further study by the authors [71] showed that these three photosensitive nanoscale metal oxides in water suspensions have similar antibacterial activity to gram-positive (b. subtilis) and gram-negative (e. coli) bacteria (sio2 < tio2 < zno). all the metal oxide nanoparticles tested inhibited the growth of both gram-positive and gram-negative bacteria; however, b. subtilis was more sensitive than e. coli. similar results have been observed for zno, tio2, and cuo against the bacterium vibrio fischeri and the crustaceans d. magna and thamnocephalus platyurus [72]. the antibacterial effects of nano-sized metal oxides to v. fischeri were similar to the rank of toxicity to d. magna and t. platyurus; they increased from tio2 to cuo and zno.
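for orientation, an lc50 value like those quoted above is estimated from dose-response experiments. the sketch below is a very rough stand-in for the probit/logit regression used in real ecotoxicological studies: it simply interpolates log-linearly between the two tested doses that bracket 50% mortality. all doses and mortality fractions are synthetic, not taken from the studies cited in this chapter.

```python
import math

def lc50_loglinear(doses, mortality):
    """Estimate LC50 by log-linear interpolation between the two tested
    doses whose observed mortality fractions bracket 0.5. A crude
    illustration only; real studies fit probit or logit models."""
    for (d_lo, m_lo), (d_hi, m_hi) in zip(zip(doses, mortality),
                                          zip(doses[1:], mortality[1:])):
        if m_lo <= 0.5 <= m_hi:
            # fraction of the way from m_lo to 0.5, applied on a log scale
            frac = (0.5 - m_lo) / (m_hi - m_lo)
            log_lc50 = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_lc50
    raise ValueError("50% mortality not bracketed by the tested doses")

doses = [0.1, 1.0, 10.0, 100.0]        # mg/l, hypothetical test concentrations
mortality = [0.05, 0.20, 0.80, 1.00]   # fraction dead, hypothetical
print(f"LC50 = {lc50_loglinear(doses, mortality):.2f} mg/l")  # LC50 = 3.16 mg/l
```

the log scale matters: toxicological dose-response curves are conventionally close to linear in log(dose), not in dose itself.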
it is also very important to recognize that titanium dioxide was not toxic even at the 20 g/l level, which means that not all nanoparticles of metal oxides induce toxicity. smith et al. [73] investigated the ecotoxicological potential of single-walled carbon nanotubes (swcnts) to rainbow trout (oncorhynchus mykiss), showing that exposure to dispersed swcnts causes respiratory toxicity - an increase of the ventilation rate, gill pathologies, and mucus secretion. additionally, the authors observed histological changes in the liver, brain pathology, and cellular pathologies, such as individual necrotic or apoptotic bodies, in rainbow trout exposed to 0.5 mg/l swcnts. mouchet et al. [74] analyzed the acute toxicity and genotoxicity of double-walled carbon nanotubes (dwnts) to amphibian larvae (xenopus laevis). the authors did not observe any genotoxic effects at concentrations between 10 and 500 mg/l. however, at the highest concentration (500 mg/l), 85% mortality was measured, while at the lowest concentration (10 mg/l), reduced size and/or a cessation of growth of the larvae were observed. summarizing this section, there is strong evidence that chemicals, when synthesized at the nanoscale, can induce a wide range of specific toxic and ecotoxic effects. moreover, even similar compounds from the same class can differ in toxicity. the available data on toxicity are still lacking; thus, more comprehensive and systematic studies in this area are necessary and very important. as demonstrated in this book, quantitative structure-activity relationship (qsar) methods can play an important role in both designing new products and predicting their risk to human health and the environment. however, taking into account the specific properties of nanomaterials and their still unknown modes of toxic action, this class of compounds seems to be much more problematic for qsar modelers than the "classic" (small, drug-like) chemicals.
until now, more than 5000 different descriptors have been developed and used for the characterization of molecular structure (chapter 3). in general, the descriptors can be classified according to their dimensionality. constitutional descriptors, so-called "zero-dimensional," are derived directly from the formula (e.g., the number of oxygen atoms). descriptors of bulk properties, such as the n-octanol/water partition coefficient or water solubility, are classified as "one-dimensional" descriptors. topological descriptors based on molecular graph theory are called "two-dimensional" descriptors and characterize connections between individual atoms in the molecule. "three-dimensional" descriptors reflect properties derived from the three-dimensional structure of a molecule optimized at the appropriate level of quantum-mechanical theory. "four-dimensional" descriptors are defined by molecular properties arising from interactions of the molecule with probes characterizing the surrounding space or by a stereodynamic representation of a molecule, including flexibility of bonds, conformational behavior, etc. [75-79]. little is known about the applicability of these "traditional" descriptors for the characterization of nanostructures. some authors [80-82] postulate that the existing descriptors are insufficient to express the specific physical and chemical properties of nanoparticles. thus, novel and more appropriate types of descriptors must be developed. the group of nanoparticles is structurally diverse. in fact, this group has been defined somewhat arbitrarily, taking into account size as the only criterion of membership. therefore, structures as various as nanotubes, fullerenes, crystals, and atom clusters as well as chemical species of such different properties as metals, non-metals, organic compounds, inorganic compounds, conductors, semi-conductors, and insulators were put together into one single group.
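to make the notion of a "two-dimensional" descriptor concrete, the sketch below computes the wiener index - a classic topological descriptor, mentioned again later in this chapter - as the sum of shortest-path distances over all atom pairs of a hydrogen-depleted molecular graph. the example graphs and values are standard textbook ones, not taken from this chapter.

```python
from collections import deque

def wiener_index(adjacency):
    """Wiener index: sum of shortest-path (bond-count) distances over all
    unordered pairs of vertices in a hydrogen-depleted molecular graph."""
    n = len(adjacency)
    total = 0
    for source in range(n):
        # breadth-first search gives shortest-path distances from `source`
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
    return total // 2  # each pair was counted from both ends

# n-butane as a hydrogen-depleted chain C1-C2-C3-C4
butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(wiener_index(butane))  # 10, the classic textbook value for n-butane
```

the same function distinguishes branching: the isobutane star graph gives 9, illustrating how a purely connectivity-based descriptor separates isomers with identical formulas.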
since nanoparticles are not structurally homogeneous, a common mechanism of toxicity cannot be expected for all of them. in consequence, toxicity and other properties should be studied within the most appropriately chosen sub-classes of structural and physico-chemical similarity. what is the best way to define the sub-classes? the answer might be given based on a stepwise procedure recommended by the oecd guidance document on the grouping of chemicals [83] (see also chapter 7). following the guidelines, the eight steps below should be performed:
1. development of the category hypothesis, definition, and identification of the category members. the category can be defined based on chemical similarity, physico-chemical properties, toxicological endpoint, and/or mechanism of action, as well as in terms of a metabolic pathway.
2. gathering of data for each category member. all existing data should be collected for each member of the category.
3. evaluation of the available data for adequacy. the data should be carefully evaluated at this stage according to commonly accepted protocols (i.e., according to the appropriate oecd guidance).
4. construction of a matrix of data availability (category endpoints vs. members). the matrix is to indicate whether data are available or not.
5. performing of a preliminary evaluation of the category and filling data gaps. the preliminary evaluation should indicate if (i) the category rationale is supported and (ii) the category is sufficiently robust for the assessment purpose (contains sufficient, relevant, and reliable information).
6. performing of additional testing (experiments). based on the preliminary evaluation (especially the evaluation of robustness), additional experiments and group members for testing can be proposed.
7. performing of a further assessment of the category. if new data from the additional testing are generated, the category should be revised according to the criteria from step 5.
8.
documenting of the finalized category. finally, the category should be documented in the form of a suitable reporting format proposed by the guidance.
the currently proposed [82] working classification scheme for nanostructured particles includes nine categories:
1. spherical or compact particles;
2. high aspect ratio particles;
3. complex non-spherical particles;
4. compositionally heterogeneous particles - core surface variation;
5. compositionally heterogeneous particles - distributed variation;
6. homogeneous agglomerates;
7. heterogeneous agglomerates;
8. active particles;
9. multifunctional particles.
this classification has been adapted from the original work of maynard and aitken [84]. what types of structural properties should be described within the groups? as previously discussed in section 14.3, the diameter of a nanoparticle is important, but it is not the only possible factor influencing the mode of toxic action. the additional structural characteristics which must also be appropriately expressed are size distribution, agglomeration, shape, porosity, surface area, chemical composition, electronic configuration, surface chemistry, surface charge, and crystal structure. in contrast to the classic qsar scheme, an entire characterization of a nanostructure may be impossible when only computational methods are employed. novel descriptors reflecting not only molecular structure, but also supra-molecular patterns (size, shape of the nanoparticles, etc.) should be derived from both computational and experimental techniques. the fastest and relatively easiest step of characterizing the structure is the calculation of constitutional and topological descriptors. an interesting and very practical idea in this field is to replace a series of simple descriptors by one, so-called "technological attributes code" or "smiles-like code" [85-88].
for instance, a nanoparticle of ceramic zirconium oxide, existing in bulk form and synthesized at a temperature of 800 °c, can be expressed by the code "zr,o,o,cer,%e" [80]. similar to the simplified molecular input line entry system (smiles), the international chemical identifier (inchi) might also be used directly as a descriptor of chemical composition [89]. another possibility is to apply descriptors derived from either molecular graph (mg) or graphs of atomic orbitals (gao) theory [90-92]. in the first case, vertices in the graph represent atoms, while edges represent covalent bonds. in the second method, vertices refer to particular atomic orbitals (1s, 2s, 2p, etc.), while edges connect the orbitals belonging to different atoms (figure 14-1). based on the molecular graphs, faulon and coworkers [93-96] have developed the signature molecular descriptor approach for the characterization of fullerenes and nanotubes. the signature is a vector including extended valences of atoms derived from a set of subgraphs, following the five-step algorithm:
1. constructing a subgraph containing all atoms and bonds that are at a distance no greater than the given signature height;
2. labeling the vertices in a canonical order;
3. constructing a tree spanning all the edges;
4. removing all canonical labels that appear only one time;
5. writing the signature by reading the tree in a depth-first order.
the signature descriptor can be utilized not only for direct qsar modeling, but also for calculating a range of topological indices (e.g., the wiener index) [90-92, 132]. without doubt, simplicity of calculation is the most significant advantage of the topological descriptors. however, in many cases these two-dimensional characteristics are insufficient to investigate more complex phenomena. in such a situation, a more sophisticated approach must be employed to describe the structure appropriately.
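the five-step signature algorithm above is built on height-limited neighbourhood subgraphs. the sketch below implements only that first idea - collecting the element symbols found at each distance up to a given height from a root atom - and is a deliberately simplified stand-in for faulon's full procedure (no canonical labeling, no spanning tree). the example molecule and the layered string format are ours, for illustration only.

```python
from collections import deque

def atom_signature(graph, elements, root, height):
    """A much-simplified 'signature' of one atom: the sorted multiset of
    element symbols at each distance <= height from the root, joined by '/'.
    Faulon's real algorithm additionally canonizes a spanning tree of the
    neighbourhood subgraph; this only illustrates the height-limited idea."""
    dist = {root: 0}
    queue = deque([root])
    layers = {0: [elements[root]]}
    while queue:
        v = queue.popleft()
        if dist[v] == height:
            continue  # do not expand beyond the signature height
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                layers.setdefault(dist[w], []).append(elements[w])
                queue.append(w)
    # sorting each layer makes the string independent of traversal order
    return "/".join("".join(sorted(layers[d])) for d in sorted(layers))

# acetic acid, hydrogen-depleted: C0-C1 with two oxygens on C1
graph = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}
elements = {0: "C", 1: "C", 2: "O", 3: "O"}
print(atom_signature(graph, elements, 1, 1))  # "C/COO"
```

a full molecular signature would collect such atom signatures for every atom and count their occurrences, giving a vector usable directly in qsar modeling.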
as mentioned previously, quantum-mechanical calculations can deliver useful information on the three-dimensional features (see chapter 2). among others, they include: molecular geometry (bond lengths, valence, and torsion angles), electron distribution, ionization potential, electron affinity, surface reactivity, and band gap. when performing quantum-mechanical calculations, there are always two important assumptions to be introduced. the first one is an appropriate molecular model; the second is the appropriate level of theory. both assumptions are closely related: when the model (system) is too large, calculations at the highest levels of theory are impossible, because of the large computational time and other technical resources required [97]. small fullerenes and carbon nanotubes can be treated as whole systems and modelled directly with quantum-mechanical methods. among the theory levels, density functional theory (dft) recently seems to have been accepted as the most appropriate and practical choice for such calculations. indeed, dft methods can serve as a good alternative to conventional ab initio calculations, when a step beyond the mean-field approximation is crucial and the information on the electron correlation significantly improves the results (e.g., the hartree-fock (hf) method in conjunction with the second-order møller-plesset correction, mp2). unfortunately, even "small" fullerenes and carbon nanotubes (containing between 40 and 70 carbon atoms) are, in fact, large from the quantum-mechanical point of view. therefore, the "classic" ab initio calculations might be impractical because of the reasons mentioned in the previous paragraph, whereas dft can be performed in reasonable time. the functional commonly utilized for dft is abbreviated with the b3lyp symbol. in b3lyp calculations (eq.
14-1), the exchange-correlation energy e_xc is expressed as a combination (a_0, a_x, and a_c are the parameters) of four elements: (i) the exchange-correlation energy from the local spin density approximation (lsda, e_xc^lsda), (ii) the difference between the exchange energy from hartree-fock (e_x^hf) and lsda (e_x^lsda), (iii) becke's exchange energy with gradient correction (e_x^b88), and (iv) the correlation energy with lee-yang-parr correction (e_c^lyp) [98, 99]:

e_xc^b3lyp = e_xc^lsda + a_0 (e_x^hf - e_x^lsda) + a_x Δe_x^b88 + a_c Δe_c^lyp    (14-1)

sometimes, when a system is too large from the quantum-mechanical point of view, the calculations are practically impossible. the situation is very common for larger crystalline nanoparticles (i.e., nanoparticles of metal oxides: tio2, al2o3, sno2, zno, etc.) and, in such cases, a simplified model of the whole structure must first be appropriately selected. in general, there are two strategies for modeling of crystalline solids: (i) an application of periodic boundary conditions (pbcs) and (ii) calculations based on molecular clusters. in the first approach, calculations for a single unit cell are expanded in three dimensions with respect to the translational symmetry by employing appropriate boundary conditions (i.e., the unit cell should be neutral and should have no dipole moment). in doing so, the model includes information on the long-range forces occurring in the crystal. however, the cell size should be large enough to also be able to model defects in the surface and to eliminate the spurious interactions between periodically repeated fragments of the lattice [100-102]. in the second approach, a small fragment, a so-called "cluster," is cut off from the crystal structure and then used as a simplified model for calculations. the only problem is how to choose the diameter of the cluster correctly.
this must be performed by reaching a compromise between the number of atoms (and thus the required time of computations) and the expected accuracy (and hence the level of theory to be employed). it is worth mentioning that the molecular properties can be divided into two groups depending on how they change with increasing size of the cluster (going from molecular clusters to the bulk form). they are (i) scalable properties, varying smoothly until reaching the bulk limit, and (ii) non-scalable properties, when the variation related to increasing size of the cluster is not monotonic. although the cluster models usually neglect the long-range forces, they have found many applications in modeling of local phenomena and interactions on the crystal surface [103]. as previously mentioned, in addition to calculated properties, experimentally derived properties may also serve as descriptors for developing nano-qsars (table 14-1). the experimental descriptors seem to be especially useful for expressing size distribution, agglomeration state, shape, porosity, and irregularity of the surface area. interestingly, the experimental results can be combined with numerical methods to define new descriptors. for example, images taken by scanning electron microscopy (sem), transmission electron microscopy (tem), or atomic force microscopy (afm) (figure 14-2) might be processed with the use of novel chemometric techniques of image analysis. namely, a series of images for different particles of a given nanostructure should first be taken. then, the pictures must be numerically averaged and converted into a matrix containing numerical values that correspond to the intensity of each pixel in the gray scale or the color value in the rgb scale. new descriptors can be defined based on the matrix (e.g., a shape descriptor can be calculated as a sum of non-zero elements in the matrix; porosity - as a sum of relative differences between each pixel and its "neighbors," etc.) [104].
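the image-based recipe above can be sketched directly: given the averaged grey-scale matrix, the shape descriptor is the count of non-zero pixels, and a porosity-like descriptor sums relative differences between each pixel and its neighbours. the 3x3 "image" below is a made-up toy, and the 4-connected neighbourhood and the relative-difference denominator are one plausible reading of the verbal description, not the definition used in [104].

```python
def shape_descriptor(img):
    """Shape descriptor as described in the text: the number of
    non-zero pixels in the grey-scale intensity matrix."""
    return sum(1 for row in img for px in row if px != 0)

def roughness_descriptor(img):
    """Porosity/roughness-like descriptor: sum of relative intensity
    differences between each pixel and its 4-connected neighbours
    (one plausible reading of the verbal recipe in the text)."""
    rows, cols = len(img), len(img[0])
    total = 0.0
    for i in range(rows):
        for j in range(cols):
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols:
                    denom = max(img[i][j], img[ni][nj], 1)  # guard against 0
                    total += abs(img[i][j] - img[ni][nj]) / denom
    return total

# a toy 3x3 "averaged TEM image" (grey-scale intensities 0-255)
img = [[0, 120, 0],
       [120, 255, 120],
       [0, 120, 0]]
print(shape_descriptor(img))  # 5 non-zero pixels
```

a perfectly flat image gives a roughness of zero, so the second descriptor responds only to intensity texture, exactly the kind of supra-molecular information conventional molecular descriptors cannot encode.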
without doubt, an appropriate characterization of the nanoparticles' structure is currently one of the most challenging tasks in nano-qsar. although more than 5000 qsar descriptors have been defined so far, they may be inadequate to express the supramolecular phenomena governing the unusual activity and properties of nanomaterials. as a result, much more effort in this area is required. an important step related to the numerical description of chemical structure and qsar modeling involves establishing a qualitative relationship between the structure of a nanoparticle and its various electronic properties. the b3lyp functional and the standard 6-31g(d) pople-style basis set were applied by shukla and leszczynski [106] to investigate the relationships between the shape, size, and electronic properties of small carbon fullerenes, nanodisks, nanocapsules, and nanobowls. they found that the ionization potentials decrease, while the electron affinities increase, in going from the c60 fullerenes to the closed nanodisks, capsules, and open bowl-shaped nanocarbon clusters. in similar studies performed for capped and uncapped carbon nanotubes at the b3lyp/6-31g(d) level of theory by yumura et al. [107, 108], the authors demonstrated that the tube lengths, edge structures, and end caps play an important role in determining the band gap, expressed as a difference between the energies of the highest occupied and lowest unoccupied molecular orbitals (homo-lumo), and the vibrational frequencies. wang and mezey [109] characterized electronic structures of open-ended and capped carbon nanoneedles (cnns) at a similar theory level (b3lyp/6-311g(d)), concluding that conductivity of the studied species is strictly correlated to their size. only very long cnn structures have band gaps sufficiently narrow to be semiconductors, while the band gaps of very short and thin structures are too large to conduct electrons. similarly, poater et al.
[110, 111] observed that the parr electrophilicity and the electronic movement described by the chemical potential increase with increasing length of the carbon nanoneedles, and that very "short" structures (containing four layers or less) have a homo-lumo gap too large to allow conductivity. moreover, simeon et al. [112], by performing b3lyp calculations, demonstrated that replacement of a fullerene carbon atom with a heteroatom results in a significant change in the electronic and catalytic properties of the fullerene molecule. similar studies have been performed for crystalline metal semiconductors with the use of cluster calculations. as mentioned in section 14.4.1, some electronic properties are scalable: they change with the size of the cluster until the bulk limit is reached. known examples of such properties are the homo-lumo gap (band gap) and the adiabatic electron detachment energy. for instance, the band gap of zno nanoparticles decreases with increasing particle diameter, reaching the bulk value at a diameter of about 4 nm [113]. similarly, the bulk limits of the homo-lumo gap and the detachment energy for titanium oxide anion clusters of increasing size (increasing n) were already reached for n=7 [114, 115]. in the classic formalization of qsars, electronic properties (e.g., homo, lumo, ionization potential) have been utilized as "ordinary" molecular descriptors. as discussed above, this approach should be revised for nanoparticles, for which the properties vary with the size of a particle, and this variation cannot simply be described by a linear function. it is not out of the question that similar phenomena might be observed for other types of "traditional" descriptors as well, so further studies in this area are required and strongly justified.
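the size dependence described here, a gap that shrinks smoothly toward the bulk limit, is straightforward to model numerically. the sketch below is illustrative only: the diameter/gap pairs are invented to mimic the reported zno trend, and the quantum-confinement form eg(d) = eg_bulk + a/d^2 is a common textbook choice, not the analysis used in [113].

```python
import numpy as np

# hypothetical (diameter in nm, band gap in eV) pairs mimicking the trend
# reported for ZnO nanoparticles: the gap shrinks toward the bulk value
diameters = np.array([1.0, 1.5, 2.0, 3.0, 4.0])
gaps      = np.array([4.87, 4.06, 3.77, 3.52, 3.45])

# quantum-confinement-style model Eg(d) = Eg_bulk + a / d**2 is linear in
# x = 1/d**2, so ordinary least squares suffices
x = 1.0 / diameters**2
a, eg_bulk = np.polyfit(x, gaps, 1)  # slope a, intercept Eg_bulk

def predicted_gap(d):
    """Band gap predicted for particle diameter d (nm)."""
    return eg_bulk + a / d**2

print(round(eg_bulk, 2))  # fitted bulk-limit estimate (~3.4 eV for this toy data)
```

the point of the exercise is the one made in the text: a single linear-in-size descriptor would misrepresent this behaviour, whereas a descriptor built on the appropriate non-linear transform (here 1/d^2) captures it.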
regarding the five oecd principles for the validation of a (q)sar, as discussed in chapters 12 and 13, an ideal qsar model applicable for regulatory purposes should be associated with (i) a well-defined endpoint; (ii) an unambiguous algorithm; (iii) a defined domain of applicability; (iv) appropriate measures of goodness-of-fit, robustness, and predictivity; and (v) a mechanistic interpretation, if possible. unfortunately, it is extremely difficult to fulfill all of these principles for (q)sars applicable to nanomaterials. there are two main difficulties related to the development of nano-qsars. the first is the lack of sufficiently numerous and systematic experimental data, while the second is the very limited knowledge of mechanisms of toxic action. as we have mentioned many times, regarding their structure, the class of nanomaterials is not homogeneous, combining a range of physico-chemical properties as well as possible mechanisms of metabolism and toxicity. thus, it is impossible to assume one common applicability domain for all nanomaterials; each mode of toxicity and each class of nanomaterials should be studied separately. analyzing the literature data (section 14.3), it must be concluded that even if a class of structurally similar nanoparticles is tested with the same laboratory protocol, the number of tested compounds is often insufficient to perform comprehensive internal and external validation of a model and to calculate the appropriate measures of robustness and predictivity in qsar. for instance, limbach et al. [116] have proposed two rankings of cytotoxicity of seven oxide nanoparticles based on in vitro studies of human and rodent cells. the rankings were as follows: (i) fe2o3 ≈ asbestos > zno > ceo2 ≈ zro2 ≈ tio2 ≈ ca3(po4)2 and (ii) zno > asbestos ≈ zro2 > ca3(po4)2 ≈ fe2o3 ≈ ceo2 ≈ tio2, respectively, for human (mesothelioma) and rodent cells.
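the goodness-of-fit and robustness measures named in principle (iv) can be computed in a few lines. a minimal numpy sketch (the one-descriptor linear model and the toy data are assumptions for illustration, not taken from any of the cited studies):

```python
import numpy as np

def r_squared(y_obs, y_pred):
    """Goodness-of-fit: 1 - RSS/TSS."""
    y_obs, y_pred = np.asarray(y_obs, float), np.asarray(y_pred, float)
    rss = np.sum((y_obs - y_pred) ** 2)
    tss = np.sum((y_obs - y_obs.mean()) ** 2)
    return 1.0 - rss / tss

def q2_loo(x, y):
    """Robustness: leave-one-out cross-validated q2 for a one-descriptor linear model."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    press = 0.0
    for i in range(len(y)):
        mask = np.arange(len(y)) != i            # drop compound i
        slope, intercept = np.polyfit(x[mask], y[mask], 1)
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

# toy endpoint: near-linear response with small noise
x = np.array([1., 2., 3., 4., 5., 6., 7., 8.])
y = 2.0 * x + 1.0 + np.array([0.05, -0.04, 0.02, -0.03, 0.04, -0.05, 0.01, 0.0])

slope, intercept = np.polyfit(x, y, 1)
print(r_squared(y, slope * x + intercept), q2_loo(x, y))
```

with only five or six compounds, as in the example from the text, q2 and any external-set statistic become extremely unstable, which is exactly why such data sets cannot support a properly validated model.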
in another paper by the same research group, the authors found that for four metal oxide nanoparticles, namely tio2, fe2o3, mn3o4, and co3o4, the chemical composition was the main factor determining the formation of reactive oxygen species responsible for toxicity toward human lung epithelial cells [117]. obviously, the results cannot be combined together, and a data set containing five or six compounds is too small to build an appropriately validated qsar model. do these restrictions and problems mean qsar modelers are not able to provide useful and reliable information for nanoparticles? we do not believe this to be true. the amount of data will increase along with the increasing number of nanotoxicological studies. however, no one can expect the accumulation in the next few years of such extensive data for nanomaterials as is now available for some environmental pollutants, pharmaceuticals, and "classical" industrial chemicals [118, 119]. despite the limitations, there are some very promising results of preliminary nano-qsar studies, which are reviewed below. toropov et al. [81] have developed two models defining the relationships between basic physico-chemical properties (namely, water solubility, log s, and the n-octanol/water partition coefficient, log p) of carbon nanotubes and their chiral vectors (as structural descriptors). the two-element chiral vector (n, m) contains information about the process of rolling up the graphite layer when a nanotube is formed. it had been previously known [120] that the elements of the chiral vector are related to conductivity. here, toropov et al. confirmed, using qspr-based research, that the vector is also strictly related to other properties. the models developed were defined by the following two equations (eqs.
14-2 and 14-3):

log s = −5.10 − 3.51 n − 3.59 m (14-2), with r2 = 0.99, s = 0.053, f = 126

log p = −3.92 + 3.77 n − 3.60 m (14-3), with r2 = 0.99, s = 0.37, f = 2.93

the study was based on experimental data available for only 16 types of carbon nanotube. to perform an external validation, the authors divided the compounds into a training set (n = 8) and a test set (ntest = 8). the validation statistics were r2test = 0.99, stest = 0.093, ftest = 67.5 and r2test = 0.99, stest = 0.29, ftest = 5.93, respectively, for the water solubility and n-octanol/water partition coefficient models. without doubt, these were the first such qspr models developed for nanoparticles. however, the ratio of descriptors to compounds (the topliss ratio) was low, thus the models might be unstable (see the discussion in chapter 12 for more detail). another contribution by toropov and leszczynski [80] presents a model predicting young's modulus (ym) for a set of inorganic nanostructures (eq. 14-4). martin et al. [121] have proposed two qsar models predicting the solubility of buckminsterfullerene (c60) in n-heptane (log s heptane) and n-octanol (log s octanol), respectively (eqs. 14-6 and 14-7). the symbols r2_50 and s2_50 refer to leave-50%-out cross-validation. the authors applied codessa descriptors, namely rncg, the relative negative charge (zefirov's pc); 2asic, the average structural information content of the second order; emin_ee(cc), the minimum exchange energy for a c-c bond; 1ic, the first-order information content; and rpcs, the relative positive charged surface area. interestingly, the models were calibrated on 15 compounds, including 14 polycyclic aromatic hydrocarbons (pahs) containing between two and six aromatic rings, and the fullerene. although the values of solubility predicted for the fullerene seem reasonable, the authors did not validate the applicability domain of the models.
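eqs. 14-2 and 14-3 are simple linear functions of the chiral vector, so they can be evaluated directly. a sketch (the (5, 5) armchair tube is an arbitrary example; given the very small data set behind the model, such evaluations should be treated with caution):

```python
def log_s(n: int, m: int) -> float:
    """Water solubility, eq. 14-2: log S = -5.10 - 3.51 n - 3.59 m."""
    return -5.10 - 3.51 * n - 3.59 * m

def log_p(n: int, m: int) -> float:
    """n-Octanol/water partition coefficient, eq. 14-3."""
    return -3.92 + 3.77 * n - 3.60 * m

# e.g. a (5, 5) "armchair" nanotube
print(round(log_s(5, 5), 2), round(log_p(5, 5), 2))  # -40.6 -3.07
```

the strongly negative predicted log s reflects the model's message that larger chiral indices (thicker tubes) correspond to sharply lower aqueous solubility.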
indeed, the structural difference between the 14 hydrocarbons and the fullerene is probably too large to make reliable predictions for c60 (the polycyclic hydrocarbons are planar, but the fullerene is spherical). in addition, the experimental values of log s for the 14 pahs ranged from −3.80 to 0.22 in heptane and from −3.03 to −0.02 in octanol, while the experimental values for the fullerene were −4.09 and 4.18 in heptane and octanol, respectively. an interesting area of nano-qsar applications is estimating the solubility of a given nanoparticle in a set of various solvents. in that case, the main purpose of the molecular descriptors is to correctly characterize the variation in interactions between the particle and the molecules of different solvents [122]. in fact, this means that the descriptors are related to the structure of the solvents rather than to the nanoparticle structure. murray et al. [123] have developed a model characterizing the solubility of c60 in 22 organic solvents employing the following three descriptors: two quantities, σ2tot and υ, reflecting the variability and degree of balance of the electrostatic potential on the solvent surface, and the surface area, sa (eq. 14-8). although the model showed a good fit (r2 = 0.95, s = 0.48), nothing is known about its predictive ability, because the model has not been validated. a set of linear models built separately for individual structural domains, namely alkanes (n=6), alkyl halides (n=32), alcohols (n=6), cycloalkanes (n=6), alkylbenzenes (n=16), and aryl halides (n=9), was published by sivaraman et al. [124]. the models were based on connectivity indices, numbers of atoms, polarizability, and variables indicating the substitution pattern as molecular descriptors for the solvents. the values of r2 for the particular models ranged between 0.93 (alkyl halides) and 0.99 (cycloalkanes), with corresponding values of s from 0.22 (alkyl halides) to 0.04 (cycloalkanes). the authors concluded that it was impossible to obtain a unified model that included all solvents.
however, when the first three classes of solvents (i.e., alkanes, alkyl halides, and alcohols) were combined into one model, the results of the external validation performed were satisfactory. as well as linear approaches, non-linear models have been constructed. for instance, kiss et al. [125] applied an artificial neural network utilizing molar volume, a polarizability parameter, lumo, saturated surface, and average polarizability as structural descriptors of the solvents. they observed that for most of the solvents studied (n=126), solubility decreases with increasing molar volume and increases with the polarizability and the saturated surface areas of the solvents. the reported value of s in that case was 0.45 log units, and the values of r2 and f were 0.84 and 633, respectively. in another study [126], the authors developed models with both multiple linear regression with heuristic selection of variables (hm-mlr) and a least-squares support vector machine (svm), and compared the two. both models were developed with codessa descriptors [127]. interestingly, the results were very similar (the svm model had slightly better characteristics): the values of r2 for the linear and non-linear models were, respectively, 0.89 and 0.90, while the values of f were 968 and 1095. the reported root mean square errors were 0.126 for the linear model (hm-mlr) and 0.116 for the model employing svm. analyzing all the results, it might be concluded that the main factor responsible for differences in model error is the type of descriptors rather than the mathematical method of modeling. recently, toropov et al. [89] developed an externally validated one-variable model for c60 solubility using additive optimal descriptors calculated from the international chemical identifier (inchi) code (eq.
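the hm-mlr versus ls-svm comparison can be mimicked on synthetic data. in the sketch below everything is assumed: the data are random, the ls-svm is written in its simplest kernel ridge-like form, and gamma/c are arbitrary choices; the point is only to show how the two training-set fits are obtained and compared.

```python
import numpy as np

rng = np.random.default_rng(0)

# synthetic "descriptor -> solubility" data with a mild non-linearity
X = rng.uniform(-1, 1, size=(40, 3))
y = 1.5 * X[:, 0] - 0.8 * X[:, 1] + 0.3 * X[:, 2] ** 2 + rng.normal(0, 0.05, 40)

# multiple linear regression via ordinary least squares
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
rmse_mlr = np.sqrt(np.mean((A @ coef - y) ** 2))

# least-squares SVM with an RBF kernel reduces to solving one linear system;
# gamma and the regularisation parameter C are arbitrary for this sketch
gamma, C = 1.0, 1000.0
K = np.exp(-gamma * np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2))
alpha = np.linalg.solve(K + np.eye(len(X)) / C, y)
rmse_svm = np.sqrt(np.mean((K @ alpha - y) ** 2))

print(rmse_mlr, rmse_svm)  # training-set RMSEs, as reported in the cited study
```

note that both numbers are resubstitution (training-set) errors, mirroring the fit statistics quoted in the text; a fair comparison of predictive ability would require cross-validation or an external test set.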
14-9):

log s = −7.98(± 0.14) + 0.325(± 0.0010) · dcw(inchi), n = 92

the descriptor dcw(inchi) is defined as the sum of the correlation weights cw(i_k) over the individual inchi attributes i_k characterizing the solvent molecules. an example of the dcw(inchi) calculation is presented in table 14-2. the values of cw(i_k) were optimized by the monte carlo method. all of the above models refer to physico-chemical properties as the endpoints, thus they are also termed quantitative structure-property relationships (qsprs). currently, there are only a small number of qsars related directly to nanomaterials' activity. in 2007, tsakovska [128] proposed the application of qsar methodology to predict protein-nanoparticle interactions. in 2008, durdagi et al. published two papers [129, 130] presenting the qsar-based design of novel inhibitors of human immunodeficiency virus type 1 aspartic protease (hiv-1 pr). in the first work [130], the authors developed a three-dimensional qsar model with the comparative molecular similarity indices analysis (comsia) method for 49 derivatives of fullerene c60. the values of r2 and q2 for the training set (n=43) were 0.99 and 0.74, respectively. the absolute values of the residuals in the validation set (n=6) ranged from 0.25 to 0.99 logarithmic units of ec50 (μm). the second model [129] was characterized by lower values of the statistics (n=17, r2=0.99 and q2=0.56). however, in that case, predictions for an external set of compounds (ntest=3) were possible with an acceptable level of error. in addition, the authors proposed nine novel structures with possible inhibitory activity based on the model obtained. they concluded that steric effects play the most important role in the inhibition mechanism, along with electrostatic and h-donor/acceptor properties; however, the last two types of interactions are of lower importance.
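the dcw(inchi) construction, a sum of correlation weights optimized by the monte carlo method, can be sketched as a simple hill-climbing loop. all names and data below are hypothetical (toy attribute lists and endpoints), and the acceptance rule is a bare-bones stand-in for the published optimization scheme.

```python
import numpy as np

rng = np.random.default_rng(7)

# hypothetical InChI-attribute lists for four solvents (purely illustrative)
molecules = [["C", "H", "O"], ["C", "C", "H"], ["C", "O", "O"], ["C", "H", "H"]]
log_s_obs = np.array([-0.2, 1.3, -1.5, 1.1])
attributes = sorted({a for mol in molecules for a in mol})

def dcw(mol, cw):
    """DCW = sum of correlation weights CW(I_k) over the molecule's attributes."""
    return sum(cw[a] for a in mol)

def correlation(cw):
    d = np.array([dcw(m, cw) for m in molecules])
    c = np.corrcoef(d, log_s_obs)[0, 1]
    return -1.0 if np.isnan(c) else c

# Monte Carlo optimisation: perturb one weight at a time and keep the change
# only if the descriptor/endpoint correlation improves
cw = {a: rng.normal() for a in attributes}
best = correlation(cw)
start = best
for _ in range(3000):
    a = attributes[rng.integers(len(attributes))]
    old = cw[a]
    cw[a] += rng.normal(0.0, 0.2)
    new = correlation(cw)
    if new > best:
        best = new
    else:
        cw[a] = old          # revert a non-improving move

print(best >= start)  # True: hill climbing can only improve the correlation
```

once the weights are fixed on a training set, the final model is a one-variable regression of the endpoint on dcw, as in eq. 14-9.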
similarly, smiles-based optimal descriptors have been successfully applied for modeling hiv-1 pr fullerene-based inhibitors [131]. the model reported by toropov et al. [131] was described by the following equation and parameters: the dcw descriptor in this case is defined as follows (eq. 14-12):

dcw = Σ_k cw(sa_k)

where sa_k is a smiles attribute, i.e., one symbol (e.g., "o," "=," "v") or two symbols (e.g., "al," "bi," "cu") in the smiles notation. the numbers of double bonds have been used as global smiles attributes, denoted "=001" and "=002": "=001" indicates one double bond and "=002" indicates two double bonds. although we strongly believe in the usefulness and appropriateness of qsar methodology for nanomaterial studies, the number of available models related to activity and toxicity is still very limited. the main limitation seems to be the insufficient amount of existing experimental data. in many cases, the lack of data precludes a proper application of statistical methods, including the necessary external validation of the model. the problem of the paucity of data will be solved only when close collaboration between experimentalists and qsar modelers is established. the role of the modelers in such studies should not be restricted to rationalization of the data after the experimental part is completed; they must also be involved in the planning of the experiments. since experiments on nanomaterials are usually expensive, a compromise should be reached between the highest possible number of compounds for testing and the lowest number of compounds necessary for developing a reliable qsar model. given the limited amount of data and the high costs of the experiments, the idea of applying novel read-across techniques enabling preliminary estimation of data (chapter 7) [82, 133] is very promising.
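the smiles-attribute bookkeeping described above can be prototyped directly. a sketch, with assumptions: the two-character symbol list is an illustrative subset, the attributes are written here with standard smiles capitalization, and only the double-bond global attribute is implemented.

```python
def smiles_attributes(smiles: str) -> list:
    """Split a SMILES string into one- or two-character attributes SA_k,
    plus a global attribute encoding the number of double bonds."""
    two_char = {"Cl", "Br", "Al", "Bi", "Cu", "Si"}  # illustrative subset
    attrs, i = [], 0
    while i < len(smiles):
        if smiles[i:i + 2] in two_char:
            attrs.append(smiles[i:i + 2])   # keep two-letter symbols together
            i += 2
        else:
            attrs.append(smiles[i])         # everything else is one character
            i += 1
    # global attribute: "=001" for one double bond, "=002" for two, etc.
    n_double = smiles.count("=")
    if n_double:
        attrs.append(f"={n_double:03d}")
    return attrs

print(smiles_attributes("C=CCl"))  # ['C', '=', 'C', 'Cl', '=001']
```

feeding these attribute lists into the dcw machinery (sum of correlation weights over attributes) then yields the descriptor used in eq. 14-12.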
however, no one has yet tried to apply this technique to nanomaterials. without doubt, a large and increasing part of the near future of chemistry and technology will be related to the development of nanomaterials. on one hand, due to their extraordinary properties, nanomaterials are an opportunity for medicine and industry. on the other hand, the same properties might result in new pathways and mechanisms of toxic action. in effect, work with nanomaterials is challenging for both "types" of chemists: those who are searching for and synthesizing new chemicals, and those who are working on risk assessment and the protection of humans from the effects of these chemicals. when analyzing the current status of nano-qsar, four noteworthy suggestions for further work can be made:

1. there is a strong need to supplement the existing set of molecular descriptors with novel "nanodescriptors" that can represent size-dependent properties of nanomaterials.

2. a stronger than usual collaboration between the experimentalists and nano-qsar modelers seems to be crucial. on one hand, it is necessary to produce data of higher usefulness for qsar modelers (more compounds, more systematic experimental studies within groups of structural similarity, etc.). on the other hand, a proper characterization of nanomaterial structure is not possible at the theoretical (computational) level alone; in such a situation, experiment-based structural descriptors for nano-qsar might be required.

3. it is possible that the current criteria of the models' quality (the five oecd rules) will have to be re-evaluated and adapted to nanomaterials. this is due to the specific properties of chemicals occurring at the "nano" level (i.e., electronic properties change with changing size) and the very limited amount of data (problems with the "classic" method of validation, which is biased toward small, low molecular weight molecules).

4.
greater effort is required in the areas of grouping nanomaterials and nano-read-across. this technique might be useful especially at the initial stage of nano-qsar studies, when the experimental data are scarce.

in summary, the development of reliable nano-qsar is a serious challenge that offers an exciting new direction for qsar modelers. this task will have to be completed before the massive production of nanomaterials in order to prevent potentially hazardous molecules from being released into the environment. in the long term, prevention is always more efficient and cheaper than clean-up.

references:
there's plenty of room at the bottom. an invitation to enter a new field of physics
the potential risks of nanomaterials: a review carried out for ecetoc
inorganic and organic uv filters: their role and efficiency in sunscreens and suncare products
solar-driven self-cleaning coating for a painted surface
synthesis activity and characterization of textiles showing self-cleaning activity under daylight irradiation
synthesis, characterization and catalytic properties of cuo nanocrystals with various shapes
consumer products inventory of nanotechnology products (2009), the project on emerging nanotechnologies
the use of nanocrystals in biological detection
fluorescent cdse/zns nanocrystal-peptide conjugates for long-term, nontoxic imaging and nuclear targeting in living cells
helical microtubules of graphitic carbon
fullerene containing polymers: a review on their synthesis and supramolecular behavior in solution
role of carbon nanotubes in electroanalytical chemistry: a review
environmental applications of carbon-based nanomaterials
the challenge of regulating nanomaterials
toxic potential of materials at the nanolevel
size effects on the band-gap of semiconductor compounds
room temperature creep behavior of nanocrystalline nickel produced by an electrodeposition technique
high tensile ductility in a nanostructured metal
recent development in the mechanics of superplasticity and its applications
recent advances in nanotechnology
magnetic properties of nanoparticle assemblies
one-pot reaction to synthesize biocompatible magnetite nanoparticles
titanium dioxide photocatalysis
tensile creep behavior of alumina/silicon carbide nanocomposite
health and environmental impact of nanotechnology: toxicological assessment of manufactured nanoparticles
significance of particle parameters in the evaluation of exposure-dose-response relationships of inhaled particles
particle toxicology: from coal mining to nanotechnology
current hypotheses on the mechanisms of toxicity of ultrafine particles
dosimetry and toxicology of ultrafine particles
nanotoxicology: an emerging discipline evolving from studies of ultrafine particles
long-term clearance kinetics of inhaled ultrafine insoluble iridium particles from the rat lung, including transient translocation into secondary organs
translocation of inhaled ultrafine particles to the brain
translocation of inhaled ultrafine manganese oxide particles to the central nervous system
modified opinion on the appropriateness of the risk assessment methodology in accordance with the technical guidance documents for new and existing substances for assessing the risks of nanomaterials. scientific committee on emerging and newly identified health risks
principles for characterizing the potential human health effects from exposure to nanomaterials: elements of a screening strategy
ultrafine particulate pollutants induce oxidative stress and mitochondrial damage
comparison of the abilities of ambient and manufactured nanoparticles to induce cellular toxicity according to an oxidative stress paradigm
combustion-derived ultrafine particles transport organic toxicants to target respiratory cells
cytotoxicity of single-wall carbon nanotubes on human fibroblasts
formation of nucleoplasmic protein aggregates impairs nuclear function in response to sio2 nanoparticles
biological behavior of hat-stacked carbon nanofibers in the subcutaneous tissue in rats
oxidative stress-induced dna damage by particulate air pollution
reactive sulfur species: an emerging concept in oxidative stress
(ed) nanotoxicology: characterization, dosing and health effects
oxidative stress: oxidants and antioxidants
size-dependent proinflammatory effects of ultrafine polystyrene particles: a role for surface area and oxidative stress in the enhanced activity of ultrafines
unusual inflammatory and fibrogenic pulmonary responses to single-walled carbon nanotubes in mice
ultrafine particles
bacterial cell association and antimicrobial activity of a c60 water suspension
the differential cytotoxicity of water-soluble fullerenes
cellular toxicity of carbon-based nanomaterials
pulmonary effects of inhaled zinc oxide in human subjects, guinea pigs, rats, and rabbits
investigations on the inflammatory and genotoxic lung effects of two types of titanium dioxide: untreated and surface treated
comparing study of the effect of nanosized silicon dioxide and microsized silicon dioxide on fibrogenesis in rats
combustion-derived nanoparticles: a review of their toxicology following inhalation exposure
damage to dna by reactive oxygen and nitrogen species: role in inflammatory disease and progression to cancer
oxyradicals and dna damage
role of oxygen free radicals in carcinogenesis and brain ischemia
neurotoxicity of low-dose repeatedly intranasal instillation of nano- and submicron-sized ferric oxide particles in mice
hallervorden-spatz syndrome and brain iron metabolism
hemolytic effects of water-soluble fullerene derivatives
preclinical studies to understand nanoparticle interaction with the immune system and its potential effects on nanoparticle biodistribution
effect-oriented physicochemical characterization of nanomaterials
manufactured nanomaterials (fullerenes, c60) induce oxidative stress in the brain of juvenile largemouth bass
the myth about toxicity of pure fullerenes is irreversibly destroyed. eighth biennial workshop "fullerenes and atomic clusters" iwfac'
daphnia magna mortality when exposed to titanium dioxide and fullerene (c60) nanoparticles
is the c60 fullerene molecule toxic?! fuller nanotub carbon nanostruct
comparative toxicity of nano-scale tio2, sio2 and zno water suspensions
comparative eco-toxicity of nanoscale tio2, sio2, and zno water suspensions
toxicity of nanosized and bulk zno, cuo and tio2 to bacteria vibrio fischeri and crustaceans daphnia magna and thamnocephalus platyurus
toxicity of single walled carbon nanotubes to rainbow trout (oncorhynchus mykiss): respiratory toxicity, organ pathologies, and other physiological effects
characterisation and in vivo ecotoxicity evaluation of double-wall carbon nanotubes in larvae of the amphibian xenopus laevis
handbook of molecular descriptors
3d-qsar of human immunodeficiency virus (i) protease inhibitors. iii. interpretation of comfa results
4d-qsar analysis of a set of ecdysteroids and a comparison to comfa modeling
in silico modelling of hazard endpoints: current problems and perspectives
hierarchic system of qsar models (1d-4d) on the base of simplex representation of molecular structure
a new approach to the characterization of nanomaterials: predicting young's modulus by correlation weighting of nanomaterials codes
predicting water solubility and octanol water partition coefficient for carbon nanotubes based on the chiral vector
computational nanotoxicology - towards a structure-activity based paradigm for investigation of the activity of nanoparticles
guidance document on the grouping of chemicals. organisation of economic cooperation and development
assessing exposure to airborne nanomaterials: current abilities and future requirements
smiles as an alternative to the graph in qsar modelling of bee toxicity
additive smiles-based optimal descriptors in qsar modelling bee toxicity: using rare smiles attributes to define the applicability domain
optimisation of correlation weights of smiles invariants for modelling oral quail toxicity
smiles in qspr/qsar modeling: results and perspectives
additive inchi-based optimal descriptors: qspr modeling of fullerene c60 solubility in organic solvents
qspr modeling mineral crystal lattice energy by optimal descriptors of the graph of atomic orbitals
the graph of atomic orbitals and its basic properties. 1. wiener index
the graph of atomic orbitals and its basic properties. 2. zagreb indices
the signature molecular descriptor. 3. inverse-quantitative structure-activity relationship of icam-1 inhibitory peptides
the signature molecular descriptor. 2. enumerating molecules from their extended valence sequences
the signature molecular descriptor. 4. canonizing molecules using extended valence sequences
the signature molecular descriptor. 1. using extended valence sequences in qsar and qspr studies
introduction to computational chemistry
development of the colle-salvetti correlation energy formula into a functional of the electron density
density-functional thermochemistry. iii. the role of exact exchange
periodic boundary conditions in ab initio calculations
electronic and structural properties of the (1010) and (1120) zno surfaces
density functional theory study on the structural and electronic properties of low index rutile surfaces for tio2/sno2/tio2 and sno2/tio2/sno2 composite systems
clusters: a bridge across the disciplines of physics and chemistry
a new concept of molecular nanodescriptors for qsar/qspr studies
nanoparticle analysis and characterization methodologies in environmental risk assessment of engineered nanoparticles
a density functional theory study on the effect of shape and size on the ionization potential and electron affinity of different carbon nanostructures
quantum-size effects in capped and uncapped carbon nanotubes
end-cap effects on vibrational structures of finite-length carbon nanotubes
the electronic structures and properties of open-ended and capped carbon nanoneedles
modelling nanoneedles: a journey towards nanomedicine
modeling the structure-property relationships of nanoneedles: a journey toward nanomedicine
ab initio quantum chemical studies of fullerene molecules with substituents c59x, x=si
variable band gap zno nanostructures grown by pulsed laser deposition
theoretical study of the electronic structure and stability of titanium dioxide clusters (tio2)n with n=1-9
probing the electronic structure and band gap evolution of titanium oxide clusters (tio2)n− (n=1−10) using photoelectron spectroscopy
oxide nanoparticle uptake in human lung fibroblasts: effects of particle size, agglomeration, and diffusion at low concentrations
in vitro cytotoxicity of oxide nanoparticles: comparison to asbestos, silica, and the effect of particle solubility
handbook of physical-chemical properties and environmental fate for organic chemicals
clar valence bond representation of pi-bonding in carbon nanotubes
qspr modeling of solubility of polyaromatic hydrocarbons and fullerene in 1-octanol and n-heptane
solubility of fullerene
representation of c60 solubilities in terms of computed molecular surface electrostatic potentials and areas
qspr modeling for solubility of fullerene (c60) in organic solvents
artificial neural network approach to predict the solubility of c60 in various solvents
accurate quantitative structure-property relationship model to predict the solubility of c60 in various solvents based on a novel approach using a least-squares support vector machine
codessa pro. comprehensive descriptors for structural and statistical analysis
computational modelling of nanoparticles
3d qsar comfa/comsia, molecular docking and molecular dynamics studies of fullerene-based hiv-1 pr inhibitors
computational design of novel fullerene analogues as potential hiv-1 pr inhibitors: analysis of the binding interactions between fullerene inhibitors and hiv-1 pr residues using 3d qsar, molecular docking and molecular dynamics simulations
smiles-based optimal descriptors: qsar analysis of fullerene-based hiv-1 pr inhibitors by means of balance of correlations
an application of graphs of atomic orbitals for qsar modeling of toxicity of metal oxides. 34th annual federation of analytical chemistry and spectroscopy societies
toward in silico approaches for investigating the activity of nanoparticles in therapeutic development

key: cord-010977-fwz7chzf authors: myserlis, pavlos; radmanesh, farid; anderson, christopher d.
title: translational genomics in neurocritical care: a review date: 2020-02-20 journal: neurotherapeutics doi: 10.1007/s13311-020-00838-1 sha: doc_id: 10977 cord_uid: fwz7chzf translational genomics represents a broad field of study that combines genome and transcriptome-wide studies in humans and model systems to refine our understanding of human biology and ultimately identify new ways to treat and prevent disease. the approaches to translational genomics can be broadly grouped into two methodologies, forward and reverse genomic translation. traditional (forward) genomic translation begins with model systems and aims at using unbiased genetic associations in these models to derive insight into biological mechanisms that may also be relevant in human disease. reverse genomic translation begins with observations made through human genomic studies and refines these observations through follow-up studies using model systems. the ultimate goal of these approaches is to clarify intervenable processes as targets for therapeutic development. in this review, we describe some of the approaches being taken to apply translational genomics to the study of diseases commonly encountered in the neurocritical care setting, including hemorrhagic and ischemic stroke, traumatic brain injury, subarachnoid hemorrhage, and status epilepticus, utilizing both forward and reverse genomic translational techniques. further, we highlight approaches in the field that could be applied in neurocritical care to improve our ability to identify new treatment modalities as well as to provide important information to patients about risk and prognosis. electronic supplementary material: the online version of this article (10.1007/s13311-020-00838-1) contains supplementary material, which is available to authorized users. 
translational genomics represents a diverse collection of research approaches that leverage human genomics and model systems to identify new approaches to treat and prevent disease and improve healthcare (1, 2). rooted in the central dogma of dna to rna to protein, genomic research examines the entire genome concurrently and may include analyses of dna variants in association with traits of interest, as well as the impact of genomic variation on gene transcription and translation. genomic research has been enabled by technological advances that allow variation across the genome to be studied accurately and cost-effectively at scale, as well as by computational techniques to store and analyze genomic data quickly and efficiently (3). while translational research is often defined in terms of the traditional "bench to bedside" techniques that advance discoveries from model systems through biomarkers and mechanisms ultimately to clinical applications, genomic research offers a strong use-case for an alternative approach. termed "reverse translation," this approach starts with humans as the model system, utilizing genomic associations to derive new information about biological mechanisms that can in turn be studied further in vitro and in animal models for target refinement (fig. 1). both of these approaches possess advantages and drawbacks (4, 5). forward translation depends on the relevance of the model system to human disease, both in terms of the physiologic responses to disease or insult and in terms of the approach taken to perturb the system. for instance, the human applicability of genomic studies of the response to traumatic brain injury (tbi) in a mouse model requires that the mouse's response to tbi is analogous to a human's, and that the approach taken to create a tbi in the mouse provokes a pattern of injury similar to that seen in human tbi (6)(7)(8)(9). (pavlos myserlis and farid radmanesh contributed equally to this work.)
as such, a great deal of careful work is required to demonstrate the validity of these model systems before the results arising from them can be judged relevant to human disease. the challenges of bridging this divide are illustrated by the universal failure of the neuroprotection strategies that reached human trials in the last several decades, essentially all of which had promising model system data in preclinical development (10)(11)(12)(13)(14). reverse genomic translation, in contrast, begins with humans (fig. 1). as such, there are few concerns as to the relevance of the system for discovery of biomarkers and mechanisms of disease. however, this approach carries a new series of challenges in study design and data acquisition (4). compared to isogenic cell lines or carefully bred animals in a controlled setting, humans are highly variable in both their environmental and genetic exposures. this is advantageous in identifying genetic susceptibility to disease risk and outcomes, but teasing out these small genetic effects from highly variable non-genetic exposures requires both careful computational techniques and large sample sizes. furthermore, because genomic data is both identifiable and can potentially lead to discrimination, human genomic studies require complex consent and data management procedures (15). in neurocritical care, the relative rarity of many of the diseases we encounter, coupled with the challenges of critical illness and surrogate consent, makes human genomic studies all the more difficult to execute effectively (16)(17)(18)(19). neurointensivists routinely encounter diseases and complications for which there is a dearth of effective treatments, or even of foundational knowledge of their underlying pathophysiologic mechanisms (20, 21).
in this review, we will highlight some of the approaches being taken to apply translational genomics to the study of diseases commonly found in neurocritical care, utilizing both forward and reverse genomic translational techniques. further, we will highlight some of the best practices in the field that could be applied in neurocritical care to improve our ability to identify new treatment modalities as well as to provide risk and prognosis information to patients and their families. before the human genome project and the hapmap consortium, genetic studies were confined to the study of candidate genes and lower-resolution genome-wide techniques such as categorization of restriction fragment length polymorphisms (rflp), tandem repeats, and microsatellites (22). these genomic features enabled early efforts to perform linkage analyses in families with related traits and disorders, as well as in selected populations of unrelated individuals. careful work in this arena led to validated discoveries that have survived replication in the modern era, such as the chromosome 19 signal in late-onset alzheimer disease (ad), ultimately mapped to the apoe locus, which has become a target for a great deal of genetic research in ad as well as in many other diseases including tbi and intracerebral hemorrhage (ich) (23)(24)(25)(26). still, much of the pre-gwas era was characterized by candidate gene studies that suffered from low statistical power and multiple sources of confounding, leading to a failure to replicate many reported associations in the gwas era that followed (27, 28). the most substantial source of confounding in candidate gene analyses is population stratification, in which differences in allele frequency due to ancestral imbalance between cases and controls introduce spurious associations (positive or negative) between genotype and trait based solely on these cryptic ancestral imbalances (29, 30).
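the distorting effect of population stratification can be made concrete with a toy simulation (all frequencies and sample sizes below are invented for illustration, not drawn from any cited study): a variant with no causal effect at all appears strongly "associated" with disease simply because cases and controls are sampled unevenly from two ancestral groups with different allele frequencies.

```python
import random

random.seed(42)

def draw_genotypes(n, p):
    """Draw n diploid genotypes (0/1/2 copies of the alt allele) at frequency p."""
    return [sum(random.random() < p for _ in range(2)) for _ in range(n)]

# Two ancestral subpopulations with different frequencies at a NEUTRAL variant.
p_north, p_south = 0.10, 0.40

# Cases oversample "south" ancestry; controls oversample "north" ancestry.
cases    = draw_genotypes(800, p_south) + draw_genotypes(200, p_north)
controls = draw_genotypes(200, p_south) + draw_genotypes(800, p_north)

def allele_counts(genotypes):
    alt = sum(genotypes)
    return alt, 2 * len(genotypes) - alt  # (alt alleles, ref alleles)

def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 allele-count table."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

a, b = allele_counts(cases)
c, d = allele_counts(controls)
print(f"naive chi-square at a neutral site: {chi2_2x2(a, b, c, d):.1f}")
# Far above the 3.84 cutoff (p < 0.05, 1 df) despite zero causal effect.
```

matching the ancestry mix between groups, or adjusting for principal components of ancestry as modern gwas routinely do, removes this artifact.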
even in studies of apoe in european ancestry populations, uncontrolled variation in the percentages of individuals of northern vs. southern european ancestry between cases and controls can mask true associations between apoe and ich, for instance (31) . the gwas era, in which variants across the genome could be reliably genotyped and mapped to a common reference template by chromosomal location, ushered in a new system of best practices that could minimize the contribution of many of the sources of confounding in describing associations between genomic variation and traits or diseases. the international hapmap consortium obtained genotypes on individuals across 11 ancestral populations around the globe, creating a resource that described the patterns of allele frequency variation across diverse populations (32) . with these breakthroughs and a number of landmark evolutions that followed, case/control and population-based gwas have led to the identification of over 11,000 associations with human diseases and other traits (https://www.ebi.ac.uk/gwas/). obviously there is an enormous disconnect between the discovery of genetic loci and leveraging of this information for human benefit, which is where the translational genomic work that serves as the topic of the present review becomes relevant (32) . post-gwas, in addition to functional and translational efforts, the movement has been towards so-called "next-generation sequencing" methodologies consisting of whole exome sequencing (wes) and whole genome sequencing (wgs). using these approaches, each nucleotide in the exome or genome is ascertained with high reliability, permitting the identification of rare and de novo variants that escape detection in traditional gwas (33) . wes captures within-gene coding variation only, offering detection of variants that may more directly impact protein structure and function than non-coding variation detected by wgs (34) . 
because the coding exome is only ~2% of the overall genome, wes is more cost-effective than wgs, but debate continues as to which is the more appropriate tool for large-scale study of the human genome (35). regardless, both wes and wgs remain orders of magnitude more expensive than traditional gwas approaches at this time, and as such well-powered sequencing studies remain unreachable for many diseases under current pricing models. less common diseases and conditions that one may find in a neurocritical care unit are doubly disadvantaged, as even larger sample sizes are required for sequencing analyses than for gwas, due to the need for many observations to identify rare exonic or intronic variants associated with disease (17, 36). as pricing models improve and larger and larger community- or hospital-based cohorts receive sequencing through clinical or biobanking efforts, it is hoped that even uncommon conditions such as subarachnoid hemorrhage or status epilepticus will benefit from the insights achievable through sequencing analysis, where case/control and smaller sequencing studies have already shown promise (37, 38). obviously, genomic research need not be limited solely to human studies. a wealth of information about disease pathogenesis and response to injury can be gleaned from model systems of human conditions using genomic and transcriptomic approaches. because animal models and isogenic tissue cultures are specifically designed to limit genetic differences between individual animals or plated cells, dna-based association tests typically do not offer insight in the same way that they do in humans. as such, many model system studies start with rna, examining how the genome responds to perturbation through the transcriptome.
however, there are substantial genomic differences between model systems and humans, as coding sequences are not necessarily conserved, promoter and enhancer control of gene expression can vary, and in the case of immortalized tissue and cell-based assays, the chromosomal architecture itself can be quite different from the organism from which it was derived (39, 40) . these differences can be highly relevant when determining whether observed transcriptomic and proteomic results from model systems are likely to be shared in humans. with those caveats, the dynamic nature of the transcriptome in model systems offers opportunities to assess the way in which the genome responds to noxious insults or drug exposures, and in animal models this can even be done across specific organs or tissues of interest (40) . as one example, traumatic brain injury researchers have obtained insight into both the initial injury cascade as well as brain response to potential injury modulators such as valproate using animal models and transcriptional microarrays, in which rna expression patterns in brain tissue can be rapidly and replicably assessed across the transcriptome (41) . using more recent technological advancements such as drop-seq, rna expression can be assessed in single cells, as has been done in individual hippocampal neurons in a mouse model of tbi (42) . at a minimum, these elegant studies can help to identify relevant cell types important in the response to injury, highlighting testable hypotheses that may be important in human conditions, all with access to tissues and control over experimental conditions that would never be possible in human-based research. given that diseases common to the neurocritical care population so rarely afford access to brain tissue for pathologic or genomic analysis antemortem, model system genomic studies offer an important adjunct for translational research. 
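as a deliberately simplified sketch of the kind of transcriptome screen described above, the following simulates a small injured-versus-sham expression experiment and flags differentially expressed genes with a per-gene welch t statistic. the gene names, counts, effect size, and cutoff are all invented for illustration.

```python
import math
import random
import statistics as st

random.seed(0)

# Hypothetical expression matrix: 200 genes x (6 injured vs. 6 sham) animals.
# Five "inflammation" genes are simulated as truly upregulated after injury.
genes = [f"gene_{i:03d}" for i in range(200)]
truly_up = set(genes[:5])

def expression(gene, injured):
    """Simulated log2 expression: baseline 8.0, +3.0 if truly injury-responsive."""
    shift = 3.0 if injured and gene in truly_up else 0.0
    return random.gauss(8.0 + shift, 0.5)

injured = {g: [expression(g, True) for _ in range(6)] for g in genes}
sham    = {g: [expression(g, False) for _ in range(6)] for g in genes}

def welch_t(x, y):
    """Welch t statistic for two independent samples with unequal variance."""
    return (st.mean(x) - st.mean(y)) / math.sqrt(
        st.variance(x) / len(x) + st.variance(y) / len(y))

# Conservative screen: with 200 simultaneous tests, demand |t| well past the
# nominal cutoff (a stand-in for Bonferroni or FDR correction in real work).
hits = sorted(g for g in genes if abs(welch_t(injured[g], sham[g])) > 5.0)
print("flagged genes:", hits)
```

in practice such screens are run with dedicated tools that pool variance information across genes rather than testing each gene in isolation, but the per-gene logic is the same.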
forward genomic translation begins with model systems, with the goal of using the measured associations in these models to derive insight into biological mechanisms that may also be relevant in human disease. forward translation requires well-characterized models that are often designed to mimic the human exposures of interest as closely as possible. this is often challenging given the natural differences between humans and many of the animals chosen to serve as models. in this section, we will highlight several model systems in current use for translational genomics relevant to neurocritical care, but the field of translational modeling in neurologic disease is too large to permit an exhaustive review here. malignant cerebral edema is a highly lethal complication of ischemic stroke, with mortality of 60-80% (43). currently, hemicraniectomy is the only available option to prevent death, and yet it does not address the underlying pathophysiology. hyperosmolar therapy is potentially useful as a bridge to surgery. preclinical data based on a forward translation approach have been useful in highlighting mechanisms underlying postinfarct edema as potential targets for therapeutic manipulation. the sulfonylurea receptor 1 (sur1) is encoded by the abcc8 gene, which is upregulated after cns injury, forming an ion channel in association with transient receptor potential melastatin 4 (trpm4). continuous activation of this complex can lead to cytotoxic edema and neuronal cell death, which has been demonstrated in both animal and human models (44, 45). sur1 is also found in pancreatic beta cells, constituting the target for the oral hypoglycemic agent glyburide. studies of rodent and porcine stroke models demonstrated that in the first few hours after an ischemic insult, both sur1 and trpm4 are upregulated (46, 47). limited case series of human postmortem specimens have also demonstrated upregulation of sur1 in infarcted tissue (48).
therefore, intravenous glyburide has been proposed for the treatment of malignant cerebral edema. targeting sur1 in rat models of ischemia has consistently resulted in reduced edema and better outcomes (49). in particular, glyburide infusion starting 6 h after complete middle cerebral artery occlusion resulted in a two-thirds decrease in swelling and a 5% reduction in mortality (50). one desirable characteristic of glyburide is that it cannot penetrate the intact blood-brain barrier, but penetration is facilitated following brain injury (51). the effect of glyburide for the treatment of cerebral edema has also been studied in tbi, with promising data obtained from animal studies (52). limited randomized trials in humans using oral glyburide have shown promising results; however, use of the oral formulation and study design limitations prohibit generalizability of the results (52, 53). building on these preclinical data, the phase 2 randomized clinical trial (games-rp) showed that the iv preparation of glyburide, glibenclamide, is associated with a reduction in edema-related deaths, less midline shift, and a reduced rate of nih stroke scale deterioration. however, it did not significantly affect the proportion of patients developing malignant edema (54). the phase 3 charm trial, sponsored by biogen, is currently enrolling patients with large hemispheric infarction to determine whether iv glibenclamide improves 90-day modified rankin scale scores. if this trial proves successful, this vignette will represent a dramatic success story for the forward translation paradigm in genomic research. in the light of recent advances in revascularization therapy, the national institute of neurological disorders and stroke has supported an initiative aiming to develop neuroprotective agents to be used as adjunctive therapy to extend the time window for reperfusion and to improve long-term functional outcome.
this stroke preclinical assessment network (span) supports late-stage preclinical studies of putative neuroprotectants to be administered prior to or at the time of reperfusion, with long-term outcomes and comorbidities constituting the endpoint. the goal is to determine if an intervention can improve outcome as compared to reperfusion alone and/or extend the therapeutic window for reperfusion. span directly applies to forward translation efforts in preclinical models of neuroprotection after stroke and is an outstanding opportunity to stimulate research efforts in a field more remembered for its past failures than for the promise it holds for the future of therapeutic development in the area. other societies have also begun to endorse more comprehensive modeling approaches in areas with few therapeutic options, with the hope of implementing a paradigm shift. for example, the neurocritical care society has initiated the "curing coma" campaign, with a 10- to 20-year mission to improve the understanding of the mechanisms and ultimately to develop preventative and therapeutic measures. traumatic brain injury (tbi) is among the leading causes of disability and death worldwide, particularly in the young. the type of tbi is in part determined by the attributes of the mechanical forces involved, including objects or blasts striking the head, rapid acceleration-deceleration forces, or rotational impacts. following the primary injury, an intricate cascade of neurometabolic and physiological processes is initiated that can cause secondary or additional injury (55, 56). intensive care management has improved the prognosis of tbi patients; however, specific targeted treatments informed by pathophysiology could have a tremendous impact on recovery. the period of secondary tissue injury is the window of opportunity during which patients would potentially benefit from targeted interventions, given that in tbi the primary injury cannot be intervened upon by the neurologist or intensivist.
the goal of therapy is therefore to reduce secondary damage and enhance neuroplasticity. the utility of animal models of tbi primarily depends on the research question, as each model emulates specific aspects of injury and has selective advantages and disadvantages. these include the biomechanics of the initial injury, molecular mechanisms of tissue response, and suitability for high-throughput testing of therapeutic agents, to name a few. although phylogenetically higher species are likely more representative models for human tbi, rodent models are more commonly used given the feasibility of generating them and measuring outcomes, as well as the ethical and financial limitations of higher-order models. table 1 summarizes some common and representative tbi models. in contrast with the rodent models described in table 1, other model systems in tbi have been selected specifically to study other aspects of the physiologic response to tbi. for example, a swine model of controlled cortical impact offers the opportunity to readily monitor systemic physiologic parameters such as tissue oxygen and acid-base status while investigating therapeutic interventions, which is argued to provide greater insight into the human response to injury (66). translation of preclinical studies using these animal tbi models to humans is inherently challenging. differences in brain structure, including geometry, craniospinal angle, gyral complexity, and white-gray matter ratio, particularly in the rodent models, can result in different responses to trauma (65). the limitation of extrapolating animal studies to humans is also manifested at the genetic level, as differences in gene structure, function, and expression levels may suggest genetic mechanisms that are incompletely correlated with humans.
as an example, female sex may be associated with better outcome through the neuroprotective effect of progesterone in animal models, but these observations did not carry over to humans in the protect-iii trial (67, 68). variable outcome measures, including neurobehavioral functional tests, glasgow outcome scale correlates, and high-resolution mri, have been used in attempts to correlate animal responses to injury with those of humans. the lack of a large cache of standardized tools further limits comparison or pooling of the results of different studies that use variable models of tbi or outcome measurement. transcriptomics, a genomic technique in which global rna expression is quantified through either expression microarrays or rna sequencing, has been employed to characterize specific inflammatory states following tbi. many studies have assessed the transcriptome in the acute post-tbi interval within 1-2 days after injury, with some showing upregulation of inflammation and apoptosis genes. gene ontology analysis at 3 months post-tbi has shown similar changes, with upregulation of inflammatory and immune-related genes (69). importantly, late downregulation of ion channel expression in the peri-lesional cortex and thalamus suggests that this delayed examination of the transcriptome could be valuable for revealing mechanisms relevant to chronic tbi morbidities, including epileptogenesis and prolonged cognitive impairment (70). in addition, tissue-specific analysis of gene expression across cell types in brain could provide useful insight into cell-specific pathways.
for example, temporal trending of the microglial expression profile indicates a biphasic inflammatory pattern that transitions from downregulation of homeostasis genes in the early stages to mixed proinflammatory and anti-inflammatory states at the subacute and chronic phases (71). the list of antiepileptic drugs has expanded significantly in the past decade, reflecting substantial investment in the search for new therapeutics with better efficacy and tolerability. however, the list of options with demonstrated efficacy in status epilepticus (se) has remained limited. the utility of benzodiazepines, often deployed in the field as a first-line agent, decreases with increasing duration of se. in addition, 9-31% of patients with se develop refractory se when they fail to respond to first- and second-line therapy, posing a significant management and prognostic challenge (72). the development of aeds has relied substantially on preclinical animal models to establish efficacy and safety prior to proceeding to human trials. different epilepsy models exist that are each useful for different aspects of drug development, and no model is suitable for all purposes. the majority of animal models induce epilepsy using electroshock or chemical seizure induction. nearly all recent aeds have been discovered by the same conventional models, and the reliance on these common screening models has been implicated as one of the reasons for the low yield of drugs with efficacy in refractory epilepsy (73). the pros and cons of each epilepsy model are discussed in detail in several excellent reviews (74, 75). some of the chemicals used include kainic acid, pilocarpine, lithium, organophosphates, and flurothyl (76). sustained electrical stimulation of specific sites, including the perforant path, the ventral hippocampus, and the anterior piriform cortex, can induce se (77).
the latency, length, and mortality of convulsive se are more variable in chemoconvulsant than in electrical models, and are in turn determined by the drug and route of administration, species, sex, age, strain, and genetic background, among other factors (78). it should also be noted that the presence of behavioral convulsion does not correlate fully with the electrographic data, and vice versa. this can have critical implications when studying drugs for pharmacoresistant se. therefore, it has been suggested that electroencephalographic quantification be used to measure the severity of se (78). furthermore, the genetic background and expressivity of animals can have a significant effect on seizure susceptibility, even between batches of inbred mice (79). proteomic and transcriptomic approaches have been utilized for assessment of alterations in expression profile following se, demonstrating that certain subsets of genes are upregulated at each timepoint following the onset of se. specifically, upregulation of genes regulating synaptic physiology and transcription, homeostasis and metabolism, and cell excitability and morphogenesis occurs at immediate, early, and delayed timepoints, respectively. in addition, related studies have demonstrated changes in expression of micrornas related to epileptogenesis, including mirna-124 and mirna-128, following se (80, 81). selective rna editing post-transcription is yet another potential source of proteomic diversity in preclinical models of se, and merits further investigation as a modulator of protein levels that may be less closely tethered to gene expression (82). aneurysmal subarachnoid hemorrhage (sah) has an earlier age of onset and is associated with higher morbidity compared with other stroke subtypes. the pathophysiology of the insult has traditionally been studied under two time intervals: early brain injury (ebi), and cerebral vasospasm (cv) with delayed cerebral ischemia (dci).
the prime goal of translational research in this arena is to identify the mechanisms and targets related to the risk, severity, evolution, and outcome. about 15% of patients die immediately following sah (83). thereafter, early brain injury within the first 3 days, followed by dci, are the most feared complications. cv is the phenomenon with the strongest association with the development of dci, which 30-70% of patients experience between days 4 and 14 (84). the underlying mechanisms leading to cv remain poorly understood and have therefore been a prime focus of preclinical studies. the majority have used rodent models, but primate, swine, and dog models have also been employed (85). cerebral aneurysms are difficult to model, and hence two common approaches to modeling sah use alternative strategies. the first is direct injection of blood into the subarachnoid space, specifically into either the prechiasmatic cistern or the cisterna magna, to generate sah predominantly in the anterior or posterior circulation territories, respectively (86). the second model, endovascular suture, passes a suture or filament through the internal carotid artery, creating a hole in one of the major branches and resulting in egress of a variable amount of blood into the subarachnoid space (87). variations in some parameters of the first method, including injected blood volume, csf removal prior to injection to prevent egress of blood into the spinal canal, and replenishing intravascular volume to keep cerebral perfusion pressure constant through maintenance of mean arterial pressure, as well as the rapidity of injection, have raised questions about the comparability and biofidelity of the results (88)(89)(90)(91). the latter model appears to remove some of the mentioned confounding factors, as the hemorrhage occurs at physiologic mean arterial pressure (map) and intracranial pressure (icp), but is limited by the variable puncture site and ultimate hemorrhage volume.
another potential drawback is the period of ischemia caused by the intraluminal suture, although the occlusion period is typically not judged to be long enough to cause significant ischemia. the missing element in these models is the absence of aneurysm formation and rupture, and consequently of the vascular processes intrinsic to the aneurysm itself that influence dci. as such, some studies have used combinations of interventions to generate aneurysms, including induced hypertension via unilateral nephrectomy and administration of angiotensin ii or deoxycorticosterone acetate, as well as elastase injection. the downside of these models is that the timing of aneurysm rupture cannot be reliably predicted, which limits close monitoring and physiologic assessments in the early phase following sah, blurring the timing of dci (92)(93)(94). the immediate hemodynamic changes following the hemorrhage are monitored via a variety of methods. regardless of the method chosen, reports on the direction and range of values of cpp, cbf, and map can be quite variable, both within the same model and between different models. a common technique to measure blood flow is laser doppler flowmetry, which provides a continuous measure of cortical perfusion. although it does not measure global cerebral blood flow and has spatial limitations, it appears to be relatively reliable and technically reproducible. other methods of flow measurement include radiolabeling methods and mri, with the latter having the advantage of capturing the dynamic nature of the condition, as well as global and region-specific blood flows. as noted, cv and dci are responsible for delayed morbidity and mortality. given that these manifestations typically occur while patients are inpatient for care of their sah, therapeutic interventions are more feasible compared to the hyperacute phase, when the processes leading to initial damage may have already occurred.
however, monitoring for cv in animal models is not straightforward. one method of identifying cv is measuring the intraluminal diameter of vessels on histological samples. in addition to being an end-measure, and therefore precluding measurements at different time points in the same animal, varying degrees of tissue desiccation among samples may yield numbers different from actual in vivo values. digital subtraction angiography and magnetic resonance angiography can provide a real-time evaluation, but the severity of cv and its timing, as well as neuronal cell death, vary depending on the model and the affected vessels (86). the foundational molecular pathways that orchestrate cv are complex and remain incompletely elucidated. however, translational research using many of the above models has demonstrated that endothelin-1, nitric oxide, and an inflammatory cascade ignited by the breakdown of blood products play predominant roles. endothelin-1 is a potent vasoconstrictor produced by infiltrated leukocytes, and based on this notion, clazosentan was developed as an endothelin-1 receptor antagonist to combat cv. in human trials, clazosentan was found to significantly reduce the incidence of dci without improving the functional outcome, and this or a related approach could ultimately prove beneficial if off-target drug effects, including pulmonary complications, hypotension, and anemia, can be mitigated (95, 96). hemoglobin and its degradation products are also a strong stimulus for cv, through direct oxidative stress on arterial smooth muscle, decreased nitric oxide production, and increased endothelin and free radical production (97). this suggests that facilitating clearance of hemoglobin degradation products from the csf may be a potential therapeutic target. modulating the intense inflammatory response is also intuitive, and while preclinical results support this notion in general, the evidence has thus far not been judged adequate to justify clinical trials.
for example, il-1 receptor antagonist (il-1ra) reduces blood-brain barrier (bbb) breakdown, a biomarker that is itself correlated with the severity of brain injury, and work continues to determine whether this or related pathways mediating bbb permeability might have therapeutic promise (98). given these numerous and likely interconnected mechanisms of delayed brain injury, further research is needed to understand their relative applicability to humans, and whether targeting a single pathway or a number of pathways simultaneously is likely to be the most adaptive strategy to reduce cv and dci in humans. the results of genome-wide rna sequencing analysis have supported the primary role of neuroinflammation in the pathogenesis of early brain injury. some studies have specifically found a key role for long non-coding rna (lncrna), a type of rna without protein-coding potential that is particularly abundant in the brain, in modulating the inflammatory behaviors of microglial cells (82). high-throughput mass spectrometry has also been utilized to demonstrate differential expression of proteins in the cerebral vessels after sah, as well as to monitor the effect of experimental therapeutics (99). we will not cover these proteomic studies in detail here, as they typically fall outside the rubric of what is classically considered "genomics", but their approach, which leverages global protein signatures rather than restricting observations to specific compounds, shares many similarities with genomics. as mentioned above, reverse genomic translation refers to an approach to the study of a disease that starts with humans, using either cohort-based or case/control genomic studies. the observations made through the course of these studies then inform the best approach for target validation and refinement, to prioritize candidate mechanisms and related endophenotypes for therapeutic development.
it has been shown that candidate compounds with independent confirmation of their therapeutic target via human genomics are more than twice as likely to prove effective in clinical trials (100). therefore, the reverse translation approach would seem an adaptive strategy to identify disease-associated mechanisms and therapeutic targets with the best chance of impacting clinical care in the near term. however, the approach to reverse translation requires large sample sizes with well-characterized patient data in order to achieve a statistically confident result. these large sample sizes raise the issue of variability in risk and treatment exposures between participants, which could impact patient outcomes independently of genomic effects and therefore erode power to detect genetic risk. the utility of reverse translation in target refinement and mechanism exploration in model systems can be highlighted using an example from the stroke community. recent gwas and subsequent meta-analyses of ischemic stroke and stroke subtypes in very large case/control datasets have validated the histone deacetylase 9 (hdac9) region on chromosome 7p21.1 as a major risk locus for stroke due to large artery atheroembolism (laa). this locus was also previously discovered in association with coronary artery disease (cad) (101, 102). based on these findings, azghandi et al. sought to investigate the role of the leading single nucleotide polymorphism (snp) in this genomic region (rs2107595) in increasing laa stroke risk (103). they found that rs2107595, in both heterozygous and homozygous human carriers, is associated with increased expression of hdac9 in peripheral blood mononuclear cells in a dose-dependent manner, suggesting that the effect of this locus on stroke risk may be mediated by increased hdac9 expression.
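the dose-dependent expression pattern just described is the logic of an eqtl-style test: regress expression on risk-allele dosage under an additive model. the sketch below uses simulated numbers throughout; the allele frequency, effect size, and sample size are invented for illustration and are not taken from azghandi et al.

```python
import random
import statistics as st

random.seed(1)

# Simulated cohort: risk-allele dosage (0, 1, or 2 copies) at roughly
# Hardy-Weinberg proportions for a 30% allele frequency (0.49/0.42/0.09).
dosages = [random.choices([0, 1, 2], weights=[49, 42, 9])[0] for _ in range(500)]

# Simulated expression with an additive per-allele effect of +0.6 units.
levels = [5.0 + 0.6 * d + random.gauss(0, 1.0) for d in dosages]

def ols_slope(x, y):
    """Least-squares slope of y on x: the per-allele effect estimate."""
    mx, my = st.mean(x), st.mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

beta = ols_slope(dosages, levels)
print(f"estimated per-allele effect on expression: {beta:.2f}")
# Recovers a value near the simulated 0.6 with this sample size.
```

real eqtl analyses add covariates (ancestry components, batch, cell-type composition) to this regression, but the dosage-on-expression slope is the core quantity being estimated.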
additionally, they demonstrated that hdac9 deficiency in mice is associated with smaller and less advanced atherosclerotic lesions in the aortic valves, curvature, and branching arteries, suggesting that hdac9 may promote atherogenesis and therefore represents a novel target for atherosclerosis and laa stroke prevention. notably, recent studies have suggested that both nonspecific (e.g., sodium valproate) and specific hdac9 inhibitors can have a positive impact on stroke recurrence risk, as well as on other phenotypes, including cancer. this highlights the central role that reverse translation can have in therapeutic target investigation and refinement, with potentially beneficial off-target properties (104, 105). while acute stroke care is a vital component of neurocritical care at many institutions, reverse genomic translation successes in other relevant traits also merit mention. acute respiratory distress syndrome (ards) is a frequent complication of severe neurologic injury due to sah or neurotrauma. in a recent gwas by bime et al., variation in the selectin p ligand gene (selplg), encoding p-selectin glycoprotein ligand 1 (psgl-1), was found to be associated with increased susceptibility to ards (106). the most significant snp in this locus, rs2228315, which results in a missense mutation, has been successfully replicated in independent cohorts. further functional analyses demonstrated that selplg expression was significantly increased in mice with ventilator-induced (vili) and lipopolysaccharide (lps)-induced lung injury, and that psgl-1 inhibition with a neutralizing polyclonal antibody led to an attenuation of the inflammatory response and lung injury. in selplg knockout mice, the inflammatory response as well as lung injury scores were significantly reduced compared to wild-type mice (106). 
these results highlight the value of reverse genomic translation in first identifying human-relevant genetic risk factors for disease, and then using model systems to understand the pathways they impact in order to select rationally informed modalities for potential treatment. intracranial aneurysms (ia) are commonly encountered in the neurocritical care setting, albeit most commonly after rupture. even so, inroads leading to a better understanding of aneurysm formation may ultimately reveal opportunities for treatments to prevent acute re-rupture or prevent future aneurysm formation after sah. the strongest associations with ia have been reported in the region near cdkn2a/cdkn2b in 9p21.3 as well as in a nearby intragenic region known as cdkn2bas or anril (107, 108). anril is a long non-coding rna responsible for the regulation of cdkn2a and cdkn2b and has also been implicated in the pathogenesis of cad and atherosclerosis, among other traits (109). overexpression of anril in mouse models of cad has been associated with negative atherosclerosis outcomes including increased atherosclerosis index, unfavorable lipid profiles, thrombus formation, endothelial cell injury, overexpression of inflammatory factors in vascular endothelial cells, increased apoptosis of endothelial cells, and upregulation of apoptosis-related genes. notably, reduced anril expression has been associated with reduced inflammatory, biochemical, and molecular markers of atherosclerosis, indicating a potential target for atherosclerosis and ia prevention (110). the aforementioned examples highlight two distinct but equally important considerations for successful implementation of the reverse translation approach in genomic studies. the first major consideration is that large populations of well-characterized individuals must be selected to ensure adequate statistical power to detect meaningful associations. 
thorough and standardized phenotyping of study subjects is one of the main predictors of the success of a gwas (111, 112). careful assignment of cases based on strict phenotypic criteria permits well-executed gwas even in diseases with heterogeneous presentations and multiple pathogenic features, such as multiple sclerosis (ms) and stroke (113). in neurocritical care populations where subtle characteristics of disease presentation and intermediate outcomes may represent important phenotypes for genomic investigation, such as sah, these traits should be closely defined and recorded to the greatest degree possible in all participants. this initial step is critically important in the greater scheme of reverse translational genomics, as associations with subclasses and endophenotypes of disease often provide the biological insights needed to continue translational efforts using model systems tailored to refine observations. the second major consideration is that the execution of genomic studies needs to be comprehensive and thorough so as to permit association testing in a hypothesis-free environment. at the moment, array-based gwas remain a favorable option for surveying the genome, considering their lower cost and proven track record in discovery, but over time, wes and wgs studies will become reachable even on more modest research budgets. for the transcriptome, rna sequencing and rna microarrays both offer unbiased surveys of global transcriptional variation, but because gene expression varies substantially by tissue, it is critical that rational choices are made regarding the suitability of specific tissues for specific conditions. 
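at its core, the hypothesis-free association testing described above reduces, per variant, to a simple contingency test of allele counts in cases versus controls, judged against a genome-wide significance threshold that accounts for the roughly one million tests performed. a minimal sketch in python; the allele counts below are invented for illustration, not data from any cited study:

```python
import math

def allelic_chi2(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """2x2 allelic chi-square test (1 df) for a single snp."""
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # for 1 degree of freedom, p = erfc(sqrt(chi2 / 2))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

GENOME_WIDE = 5e-8  # conventional gwas significance threshold

# invented allele counts for one snp in 1000 cases and 1000 controls
chi2, p = allelic_chi2(case_alt=620, case_ref=1380, ctrl_alt=480, ctrl_ref=1520)
print(f"chi2={chi2:.1f}, p={p:.2e}, genome-wide significant: {p < GENOME_WIDE}")
```

note that a nominally impressive p-value here still falls short of the 5e-8 threshold, which is why the large, well-phenotyped cohorts discussed above are so important.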
in uncommon conditions with necessarily small sample sizes, including neurocritical care-relevant diseases like se, sah, and ards, external validation studies can strengthen associations from an initial small discovery dataset, and in many cases these follow-up studies can make use of freely available resources. for example, in a recent expression-based gwas (egwas), microarray data for ards from the gene expression omnibus (geo) were collected and combined in an effort to identify novel genetic targets (114). the study not only validated previously known lung injury- and ards-related genes, but also discovered 14 new candidate genes that may prove useful in future translational work. identifying loci, variants, expression patterns, and gene networks with the use of human genomic studies is only the initial step in the reverse translation process. these discoveries must inform and guide further research to understand and refine the phenotypic effects of these variants in model systems, including some of those described above. there are several techniques with which we can utilize the discoveries made from case/control genomic studies to build or modify model systems. one approach is transgenesis, in which a larger dna sequence including a human gene containing a mutation of interest, called a transgene, is injected into the pronucleus of a fertilized mouse egg. the fertilized egg is then inserted into the oviduct of a pseudopregnant female mouse, that is, a female that has been mated with a vasectomized male in order to achieve the hormonal profile of pregnancy. the offspring produced from this female can establish an animal line that carries the human gene and allele of interest (115). however, because the transgene is inserted randomly at one or more genomic locations as one or more copies, the level of expression and regulatory influences of the gene of interest may not initially be well-controlled across animals. 
as such, there are several intermediate steps that can allow more specific genetic alteration using transgenesis, involving embryonic stem cells (escs). the first step is the introduction of regulatory sequences (such as expression cassettes) into escs. then, by injecting the transgene first into these modified escs, gene expression can be more closely controlled. the escs carrying the transgene can then be inserted into blastocysts and give rise to new strains, using the same methods previously described (116). there are multiple variations on the transgenic approach which are uniquely suited to the model system being employed and can give rise to models that express transgenes in response to a particular stimulus, or in particular tissues of interest. a newer method utilizing programmable endonucleases has allowed researchers to bypass more traditional esc-based methods for direct and precise gene editing. endonucleases are enzymes that cause double-stranded dna (dsdna) breaks, which can then be repaired either by non-homologous end-joining (nhej), an imprecise method for rejoining the dna breaks that involves various enzymes and may result in inactivating mutations, or by homology-directed repair (hdr), in which the dna breaks are repaired based on a co-injected template. four categories of programmable endonucleases have been used for direct and precise gene editing: homing endonucleases (he), zinc-finger nucleases (zfns), transcription activator-like effector nucleases (talens), and the clustered regularly interspaced short palindromic repeats/crispr-associated 9 (crispr/cas9) system. the common characteristic of these enzymes is that they possess sequence-specific nuclease activity, allowing researchers to cleave dsdna at desired, pre-specified sites. the crispr/cas9 system has proven to be the most successful so far, in terms of efficiency, cost, and simplicity of use. 
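the sequence specificity just described comes from a guide rna paired with a short protospacer-adjacent motif (pam; "ngg" for the commonly used spcas9 enzyme), which constrains where cuts can be placed. a toy sketch of that constraint, scanning only the forward strand of a sequence for candidate spcas9 sites (real guide-design tools also scan the reverse strand and score off-target matches across the genome, which this sketch ignores):

```python
def find_spcas9_sites(seq):
    """return (start, protospacer, pam) for every forward-strand ngg pam
    that has a full 20-nt protospacer immediately upstream."""
    seq = seq.upper()
    sites = []
    for i in range(20, len(seq) - 2):      # i indexes the 'n' of the n-g-g pam
        if seq[i + 1 : i + 3] == "GG":
            sites.append((i - 20, seq[i - 20 : i], seq[i : i + 3]))
    return sites

# invented 24-nt sequence with a single candidate site
sites = find_spcas9_sites("acgtacgtacgtacgtacgttagg")
for start, protospacer, pam in sites:
    print(start, protospacer, pam)
```

the scarcity of hits even in this tiny example illustrates why some loci are harder to edit reliably, one of the enzymatic limitations noted in the text.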
perhaps the most important advantage of this approach is that programmable endonucleases do not require the use of escs and can be injected directly into one- or two-cell stage embryos, thus allowing more specific and direct gene editing in a single step (116). drawbacks include enzymatic limitations as to where dna breaks can be reliably introduced, as well as off-target endonuclease activity at other sites across the genome, which can disrupt gene activity in unintended ways. work is ongoing to refine these tools, increasing the number of sites where gene editing can occur while also improving the specificity of the system (117). one illustrative example of human genomic studies being used to refine models to understand disease processes is the case of human ich-associated mutations in col4a1 and col4a2. col4a1 and col4a2 are the most abundant proteins in basement membranes. they form heterotrimers consisting of two col4a1 and one col4a2 peptides and are produced and modified in the endoplasmic reticulum (er). after their production, they are packaged into vesicles in the golgi apparatus and transferred to vascular endothelial basement membranes (118-120). the initial identification of mutations in this region in familial forms of cerebral small vessel disease, coupled with the subsequent detection of common col4a1/col4a2 variants associated with sporadic deep ich, led to the development of animal model systems to explore their effects (121-124). through mouse models, representative col4a1/col4a2 mutations were found to recapitulate human disease phenotypes, with multifocal ich in subcortical regions of the forebrain and the cerebellum, as well as porencephaly, small vessel disease, recurrent intraparenchymal and intraventricular hemorrhages, age-related ich, and macro-angiopathy (125, 126). 
using cellular assays and tissue derived from mouse models, mutations in col4a1/col4a2 have been associated with a decreased ratio of intracellular to extracellular col4a2, retention of abnormal collagen proteins in the er, er stress, and activation of the unfolded protein response (126-128), suggesting that intracellular accumulation and er stress could be an important molecular mechanism underlying ich related to col4a1 and col4a2 mutations. notably, treatment with the molecular chaperone sodium 4-phenylbutyrate resulted in decreased intracellular accumulation and a significant decrease of ich severity in vivo, which could point the way towards eventual forms of treatment for both familial and sporadic col4a1- and col4a2-associated ich (125). another recent example of model system refinement for neurocritical care-relevant disorders is status epilepticus (se). pyridoxal phosphate binding protein (plpbp) variants have been associated with a rare form of b6-dependent epilepsy, which, if left untreated, can lead to se. in a recent study, johnstone et al. utilized crispr/cas9 to create a zebrafish model lacking its encoded protein (129). they observed that plpbp-deficient zebrafish experienced significantly increased epileptic activity compared to their wild-type counterparts, in terms of physical activity (high-speed movements), biochemistry (c-fos expression), and electrophysiologically recorded neuronal activity. additionally, treatment of plpbp−/− larvae with plp and pyridoxine led to an increase in their lifespan, and a decrease in their epileptic movements and neuronal activity. lastly, in these plpbp-deficient zebrafish, systemic concentrations of plp and pyridoxine were significantly reduced, as were concentrations of plp-dependent neurotransmitters. collectively, these results provide insights for biomarker development and preclinical target refinement in b6-dependent epilepsy. 
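the "high-speed movements" used as a behavioral seizure readout in larval zebrafish typically reduce, computationally, to thresholding tracked swim velocities and counting excursion bouts. a minimal sketch; the 20 mm/s cutoff and the velocity traces are hypothetical illustrations, not values from the cited study, and published assays calibrate such cutoffs empirically against control larvae:

```python
def count_seizure_like_bouts(velocities, threshold=20.0):
    """count bouts where tracked swim velocity (mm/s, one sample per
    video frame) rises above a seizure-like threshold; consecutive
    above-threshold frames count as a single bout."""
    bouts = 0
    in_bout = False
    for v in velocities:
        if v > threshold and not in_bout:
            bouts += 1          # a new high-speed bout starts
            in_bout = True
        elif v <= threshold:
            in_bout = False     # bout ends when velocity drops back
    return bouts

# invented traces: a "mutant" larva shows more high-speed excursions
mutant = [2, 5, 30, 35, 4, 3, 28, 6, 2, 40, 41, 3]
wild_type = [2, 4, 6, 22, 5, 3, 4, 2, 5, 6, 3, 2]
print(count_seizure_like_bouts(mutant), count_seizure_like_bouts(wild_type))
```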
understanding how novel treatments might impact rare disease presentations could ultimately lead to new insights for common forms of disease as well, just as the discovery of rare pcsk9 variants in patients with very low cholesterol ultimately led to pcsk9 inhibitors to treat more common forms of familial hypercholesterolemia. however, the use of animal models is not always the ideal approach to describing the effects of genetic variation, as the phenotypic alterations may be too subtle to observe or may require impractically prolonged observation into late life before animals exhibit relevant phenotypes. in these cases, tissue-based systems can provide a useful tool to study these effects. for example, ia formation, as previously described, has been associated with variants in anril. although the direct impact of these variants in human tissue or animal models is difficult to discern, work with mutations of anril in endothelial models has provided valuable insight. specifically, upregulation of anril has been associated with increased expression of inflammatory and oxidative markers in vascular tissue such as il-6, il-8, nf-κb, tnf-α, inos, icam-1, vcam-1, and cox-2 (130, 131). these observations provide vital information about cellular mechanisms impacted by human disease-associated genetic risk factors without requiring the expense and time investment of creating, validating, and studying animal models. ultimately such models may still be required, but prior knowledge about cellular phenotypes associated with genetic variation may be highly valuable in choosing the right model system and selecting efficient approaches to validate these systems. the aforementioned examples highlight significant contributions of the field of translational genomics in identifying novel therapeutic targets, developing biomarkers of disease severity, and elucidating disease-relevant pathophysiology. 
undoubtedly, these contributions are valuable in application to existing model systems of disease, or through refinement of models informed by the reverse translation process. given that many of our current models have proven ineffective, the reverse translation approach offers a significant advantage in that translational discoveries arising in established or refined model systems have already been proven relevant to human disease. this advantage provides a reasonable expectation that effects observed in model systems will remain relevant to human disease, providing a substrate for therapeutic development. certainly, the ultimate goal of translational genomics is to transfer the discoveries made in experimental models into clinically useful information in order to improve human health. this aim can be satisfied with two distinct approaches. one is concerned with improving our understanding of the mechanisms of disease, providing novel targets for therapeutic development. the other is concerned with leveraging the conclusions of translational genomics through more direct applications to clinical care. we will discuss these in order. once genomic discovery and translational exploration have confirmed the mechanism and relevance of a particular genomic association, translational genomics offers the opportunity to use these same translational approaches to derive high-throughput assays for screening of compound libraries, which are collections of small molecules useful for early-stage drug discovery (132). the same in vitro assays used to identify cellular phenotypes associated with genetic risk factors can be tested for amelioration or "rescue" of wild-type features after exposure to library compounds. 
this is particularly advantageous for the reverse genomic translation approach, as these assays are often critical components of the overall discovery cycle, and with optimization to provide ideal readouts, screening can proceed quickly. an example of success here is the identification of molecular chaperones that can ameliorate the unfolded protein response detrimental to cell survival in col4a1-mediated cerebrovascular disease (125). identifying hits in these assays has the potential to accelerate drug discovery, provided that the mechanism can be targeted by a small molecule rather than a designed biologic entity such as a monoclonal antibody. while screening can be performed using novel compound libraries, it can also be accomplished using libraries containing already-approved drugs, providing an innovative route to compound repurposing based on genetic interactions. numerous tools already exist for in silico evaluation of existing compounds based on known mechanisms, so this step can begin even in the genomic discovery phase, prior to translational validation (133). the second approach in which translational genomics has shown great potential is the rapidly evolving and highly anticipated field of precision medicine. the observations arising from translational genomics, even when not informing us about the specific mechanisms associated with the phenotype in question, may be of predictive value. this finds application in two relevant translational genomics tools: polygenic risk scores (prs) and biomarker development based on rna expression profiles. while common genetic variants can provide valuable information about disease-relevant mechanisms and help refine disease models, they are relatively weak at explaining a significant proportion of the genetic basis of complex polygenic disorders, such as cad, diabetes, stroke, or sah (111). 
by summarizing the impact of many variants of small effect across the genome simultaneously, a polygenic risk score (prs) can be developed which explains far more of the genetic risk of a disease than any common variant can individually (fig. 2). application of these prs in independent clinical populations as a predictive tool represents a novel translational approach. in a recent study examining stroke, a prs combining snps associated with atrial fibrillation (af) was found to be significantly associated with cardioembolic (ce) stroke risk and no other stroke subtypes, paving the way for a potentially useful tool to discriminate ce stroke from other etiologies without reliance on expert adjudication or longitudinal monitoring (134). another recent study compiled a prs of cad, demonstrating that individuals in the highest quantiles of the prs exhibited cad risk on par with known mendelian cardiac diseases (135). these studies highlight the potential uses of prs as a genetic biomarker of disease, capturing risk information orthogonal to clinical risk factors alone. much work is still needed in this arena, ranging from derivation of readily accessible clinical genomic testing, dissemination of prs results in an interpretable format, and disclosure of off-target results that may be clinically meaningful in their own right, to, critically, the validation of prs in ancestrally diverse populations (136). despite these challenges, utilization of polygenic risk data to directly inform patient risk independent of our understanding of the underlying mechanisms is an exciting and rapidly evolving use case for translational genomics. development of biomarkers is another approach in translational genomics that focuses more on predictive utility than on elucidating mechanisms, and critical care has seen some early potential applications of this approach. 
(fig. 2 caption: prs percentile across the population distribution. plotting percentiles by disease risk, patients in higher prs percentiles (red dots) are at correspondingly highest risk for the disease.) 
in sepsis, where clinical outcomes are highly heterogenous, tools that might identify patients who are more likely to respond to certain treatments or identify individuals at highest risk for morbidity and mortality would be highly useful. in a recent study by scicluna et al., the authors categorized sepsis patients based on peripheral blood-derived genome-wide expression profiles and identified four distinct molecular endotypes (mars1-4) (137). the mars1 expression profile was the only category significantly associated with 28-day and 1-year mortality. in addition, combining the apache iv clinical score with this genetic scoring system resulted in significant improvement in 28-day mortality risk prediction compared to apache iv alone. to further aid translation to clinical application, the authors used expression ratios of combinations of genes to stratify patients to the four molecular endotypes. the ratio of bisphosphoglycerate mutase (bpgm) to transporter 2, atp binding cassette subfamily b member (tap2) reliably stratified patients to the mars1 endotype, while other transcript ratios were able to assign individuals to the other three mars categories. using this approach, not only could bpgm and tap2 transcripts potentially be used to identify patients with increased risk of mortality, but if these categorizations can be demonstrated to be causal, these molecular pathways could also be explored for therapeutic target identification and validation. further work is required to extend these findings across clinical populations, but this approach could ultimately yield new tools for prognostication in sepsis. in ischemic stroke, tissue plasminogen activator (t-pa) response and risk for hemorrhagic transformation (ht) are highly correlated with functional outcomes, and biomarkers to predict each of these would have obvious clinical utility. 
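computationally, a prs of the kind discussed above is simply a weighted sum of risk-allele dosages, with weights taken from gwas summary statistics. a minimal sketch in python; the rsids are borrowed from variants mentioned earlier in this review, but the effect sizes and the patient's genotypes are invented purely for illustration:

```python
def polygenic_risk_score(genotypes, weights):
    """prs = sum over snps of (risk-allele dosage 0/1/2) * effect size,
    where effect sizes are typically per-allele log odds ratios
    estimated in a gwas discovery cohort."""
    return sum(genotypes[snp] * beta for snp, beta in weights.items())

# illustrative weights only -- not real gwas estimates
weights = {"rs2107595": 0.08, "rs2228315": 0.05, "rs669": 0.11}
patient = {"rs2107595": 2, "rs2228315": 1, "rs669": 0}

score = polygenic_risk_score(patient, weights)
print(f"prs = {score:.2f}")
```

in practice, scores are built from thousands to millions of variants and then standardized against an ancestry-matched reference distribution, so that a patient can be reported as a population percentile as in fig. 2.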
in a recent study, del rio-espinola et al. found that two genetic variants (rs669 and rs1801020) were associated with increased risk for ht and mortality after t-pa administration in stroke patients (138). specifically, rs669 in a2m was associated with ht and rs1801020 was associated with in-hospital mortality. in a subsequent validation study, researchers created a combined genetic-clinical regression score that was successfully used to stratify stroke patients treated with t-pa by risk for ht and parenchymal hemorrhage (ph) (139). while in the current clinical landscape the vast majority of patients do not have readily accessible genome-wide genotypes prior to events like acute stroke, increasing uptake of clinical genomics and genomically enabled electronic health record systems could soon enable real-time risk prediction calculations incorporating both clinical and genetic information, providing more accurate tools for clinicians to incorporate into medical decision-making. a separate set of tools that could potentially become diagnostically useful in the clinical setting is the transcriptomic approach to identifying biomarkers, using array-based screening or rna sequencing. in a recent systematic review, a total of 22 mirnas were reported to be differentially expressed in the blood cells of patients with acute ischemic stroke within 24 h after stroke (140). some studies reported areas under the curve (auc) ranging from 0.76 to 0.987, indicating potential clinical utility as early diagnostic markers when neuroimaging is not immediately available or is limited by feasibility. subsequent studies were able to partially replicate these findings, showing three mirnas (mir-125a-5p, mir-125b-5p, and mir-143-3p) that were upregulated in the acute poststroke period, an effect independent of stroke pathophysiology and infarct volume (141). 
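the auc figures quoted for these mirna panels are equivalent to the mann-whitney statistic: the probability that a randomly chosen case scores higher than a randomly chosen control. a minimal sketch with invented expression values, which happen to yield 0.76, the low end of the range reported above:

```python
def auc(case_scores, control_scores):
    """empirical auc: fraction of case/control pairs in which the case
    scores higher (ties count 1/2) -- mann-whitney u / (n_case * n_ctrl)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# invented normalized mirna expression in stroke cases vs healthy controls
cases = [3.1, 2.8, 3.6, 2.2, 3.0]
controls = [1.9, 2.5, 2.4, 1.7, 3.2]
auc_value = auc(cases, controls)
print(f"auc = {auc_value:.2f}")
```

an auc of 0.5 means the marker is uninformative, and 1.0 means perfect separation, which gives intuition for why values approaching 0.99 attracted diagnostic interest.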
these transcripts were associated with an auc of 0.90 in differentiating ischemic stroke from healthy controls, a metric that significantly outperformed computed tomography, as well as previously reported blood-based biomarkers. in ich, a recent report identified up to 489 and 256 transcripts from whole blood that are differentially expressed between ich and controls and between ich and ischemic stroke, respectively (142). when comparing the ich and ischemic stroke transcriptomes in the first 24 h, 2667 and 311 transcripts, respectively, were differentially expressed compared to controls. the ich transcriptome was over-represented for t cell receptor genes (compared to none for ischemic stroke) and under-represented for non-coding and antisense transcripts. t cell receptor expression successfully differentiated between ich, ischemic stroke, and controls. similarly, rna-seq of whole blood rna successfully differentiated not only between ich, ischemic stroke, and controls, but also between different stroke subtypes (143). the list of genetic mutations that can cause se is extensive, with most genes associated with infantile-onset or childhood epilepsy syndromes; only a minority are seen in adult-onset status epilepticus (144). patients in the former group usually have accompanying intellectual disability related to their epilepsy syndromes. however, the evidence supporting a genetic etiology in the latter group may be absent, posing a diagnostic challenge. the available options include gene panel sequencing, whole exome sequencing, or whole genome sequencing. sequencing a pre-selected panel of genes is more common, but with decreasing cost, exome and genome sequencing are being used with increasing frequency. bioinformatic filtering and genotype-phenotype correlation are the main challenges, particularly with the large number of genetic variants identified during whole exome or genome sequencing. 
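the bioinformatic filtering step just mentioned typically begins by discarding variants that are too common in reference populations or predicted benign, leaving a short list for genotype-phenotype correlation. a minimal sketch; the field names, cutoffs (population allele frequency below 0.1%, a cadd-style deleteriousness score of at least 20), and variant records are all illustrative assumptions, not values from the cited studies:

```python
def filter_candidates(variants, max_af=0.001, min_score=20.0):
    """rare-variant prioritization sketch: keep variants that are both
    rare in reference populations and predicted deleterious."""
    return [v for v in variants
            if v["af"] < max_af and v["score"] >= min_score]

# invented exome findings for one patient (genes borrowed from this review)
variants = [
    {"gene": "PLPBP",  "af": 0.00002, "score": 28.4},
    {"gene": "COL4A1", "af": 0.0004,  "score": 24.1},
    {"gene": "TTN",    "af": 0.012,   "score": 22.0},  # too common
    {"gene": "HDAC9",  "af": 0.00008, "score": 9.5},   # low predicted impact
]
hits = filter_candidates(variants)
print([v["gene"] for v in hits])
```

even this crude two-filter funnel reduces an exome's tens of thousands of variants to a tractable candidate list, which is why filtering thresholds, not sequencing itself, often determine a study's diagnostic yield.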
the yield of sequencing studies depends on pretest probability, which is determined by early age of onset, consanguinity, or affected siblings. as such, to date, only a few genes associated with adult-onset se have been identified, a practical limitation that largely restricts next-generation sequencing to pediatric patients at present (144). as clinical tools for determining the putative functional significance and deleteriousness of variants identified through sequencing are refined, it is hoped that sequencing approaches will find a home in the armamentarium of clinicians treating refractory or recurrent se in the neurocritical care unit. translational genomics undoubtedly represents an important component of overall efforts to improve our understanding of the diseases we treat, and in principle should improve our ability to identify therapeutic approaches to improve outcomes and, in some cases, prevent disease altogether. given the inherent complexity and inaccessibility of the human brain and its tissues, combined with the relative infrequency of the conditions we treat at the overall population level, progress has been modest when compared to conditions such as hyperlipidemia (145), coronary artery disease (146), or atrial fibrillation (147). nevertheless, the observation-based, hypothesis-free experimental process inherent to translational genomics lends itself well to conditions such as stroke and tbi, in which the search for the "master regulator" that governs response to injury has remained elusive despite carefully designed and executed hypothesis-driven studies. an important component of future translational genomic studies in neurocritical care is the pressing need for collaboration across centers with access to large, well-characterized patient populations. 
the success of the international stroke genetics consortium and the track-tbi and center-tbi consortia in amassing large human populations with stroke and traumatic brain injury is a proven model to accelerate the human genomic studies that serve as the basis for reverse genomic translational research (112, 148-150). similar efforts through the critical care eeg monitoring research consortium and other partners could lead to biorepositories for specific conditions relevant to neurocritical care that could provide sample sizes sufficient to drive unbiased genomic discoveries (151). close alliances with model systems researchers are another critical component to accelerating translational genomics in neurocritical care. as characterization of human disease through multimodal and continuous physiological monitoring, electrophysiology, medical imaging, and biomarker sampling continues to evolve, it is imperative that this information is shared and explored with allied model systems researchers to ensure that models are re-evaluated for their correlation with these endophenotypes, and potentially for dedicated exploration of how these human-derived phenotypes inform on the utility of specific model systems to investigate disease. finally, building relationships with biotechnology and pharmaceutical industry partners will be essential to efforts to extend therapeutic targets arising from translational genomic discoveries towards drug development (152). while repurposing existing drug compounds for new indications is an important consideration, small molecule and biologic targets are likely to require extensive research and development in the preclinical and clinical space, and industry partners are often optimized for these phases of the therapeutic development process (153, 154). 
relatedly, development of polygenic risk scores for assessment of risk, prognosis, or treatment response will also require commercial investment and infrastructure, as few academic environments exist that can manage clia-certified genotyping, quality control, and result reporting and interpretation for on-target and clinically relevant secondary results (155, 156). especially in rarer or particularly challenging disease indications like those commonly encountered in neurocritical care populations, academic-industry partnerships are important to raise awareness of and interest in important clinical indications where investment could yield a large impact on a relatively small population of patients. translational genomics, in which genomic associations with risk, outcome, or treatment response are systematically identified and explored for functional relevance in humans or model systems of disease, is a valuable tool for identification of mechanisms, risk factors, therapeutic targets, and risk estimates in multiple diseases that are highly relevant to clinicians and scientists operating in the neurocritical care space. while there are undoubtedly challenges to studying some of the most complex diseases that affect the most complex organ in the body, translational genomic approaches may be uniquely suited to this task. coordinated investments in the collaborations, consortia, and infrastructures that enable these studies are likely to contribute to the novel treatments and biomarkers that are so sorely needed in the highly morbid and often poorly understood conditions in the patient populations we serve. 
- translational genomics
- reengineering translational science: the time is right
- human genome project: twenty-five years of big biology
- it's time to reverse our thinking: the reverse translation research paradigm
- defining translational research: implications for training
- a systematic review of large animal models of combined traumatic brain injury and hemorrhagic shock
- sex differences in animal models of traumatic brain injury
- animal models of traumatic brain injury and assessment of injury severity
- differences in pathological changes between two rat models of severe traumatic brain injury
- neuroprotection in stroke: the importance of collaboration and reproducibility
- neuroprotection for ischemic stroke: two decades of success and failure
- stroke foundation of ontario centre of excellence in stroke r. toward wisdom from failure: lessons from neuroprotective stroke trials and new therapeutic directions
- very early administration of progesterone for acute traumatic brain injury
- embracing failure: what the phase iii progesterone studies can teach about tbi clinical trials
- mitochondria-wide association study of common variants in osteoporosis
- genetic association studies in cardiovascular diseases: do we have enough power?
- discovery of rare variants for complex phenotypes
- stroke genetics: discovery, biology, and clinical applications
- rare-disease genetics in the era of next-generation sequencing: discovery to translation
- review: animal models of acquired epilepsy: insights into mechanisms of human epileptogenesis
- pathological mechanisms underlying aneurysmal subarachnoid haemorrhage and vasospasm
- tools for genomics
- a brief history of alzheimer's disease gene discovery
- association of apolipoprotein e with intracerebral hemorrhage risk by race/ethnicity: a meta-analysis
- apolipoprotein e4 polymorphism and outcomes from traumatic brain injury: a living systematic review and meta-analysis
- apoe genotype specific effects on the early neurodegenerative sequelae following chronic repeated mild traumatic brain injury
- evaluating historical candidate genes for schizophrenia
- current concepts and clinical applications of stroke genetics
- unbiased methods for population-based association studies
- robust inference of population structure for ancestry prediction and correction of stratification in the presence of relatedness
- variants at apoe influence risk of deep and lobar intracerebral hemorrhage
- international hapmap c. the international hapmap project
- advancements in next-generation sequencing
- opportunities and challenges of whole-genome and -exome sequencing
- whole-genome sequencing is more powerful than whole-exome sequencing for detecting exome variants
- statistical power and significance testing in large-scale genetic studies
- rare coding variants in angptl6 are associated with familial forms of intracranial aneurysm
- de novo variants in rhobtb2, an atypical rho gtpase gene, cause epileptic encephalopathy
- multi-omic measurements of heterogeneity in hela cells across laboratories
- mouse regulatory dna landscapes reveal global principles of cis-regulatory evolution
- resuscitation with valproic acid alters inflammatory genes in a porcine model of combined traumatic brain injury and hemorrhagic shock
- single cell molecular alterations reveal target cells and pathways of concussive brain injury
- 'malignant' middle cerebral artery territory infarction: clinical course and prognostic signs
- trpm4 inhibition promotes angiogenesis after ischemic stroke
- sur1-trpm4 cation channel expression in human cerebral infarcts
- malignant infarction of the middle cerebral artery in a porcine model: a pilot study
- newly expressed sur1-regulated nc(ca-atp) channel mediates cerebral edema after ischemic stroke
- sulfonylurea receptor 1 expression in human cerebral infarcts
- does inhibiting sur1 complement rt-pa in cerebral ischemia?
- glibenclamide is superior to decompressive craniectomy in a rat model of malignant stroke. stroke; a journal of cerebral circulation
- atp-dependent potassium channel blockade strengthens microglial neuroprotection after hypoxia-ischemia in rats
- glibenclamide pretreatment protects against chronic memory dysfunction and glial activation in rat cranial blast traumatic brain injury
- effects of oral glibenclamide on brain contusion volume and functional outcome of patients with moderate and severe traumatic brain injuries: a randomized double-blind placebo-controlled clinical trial
- effect of iv glyburide on adjudicated edema endpoints in the games-rp trial
- brain injury: the pathophysiology of the first hours
- cell death mechanisms and modulation in traumatic brain injury
- lateral fluid percussion brain injury: a 15-year review and evaluation
- temporal and spatial characterization of neuronal injury following lateral fluid-percussion brain injury in the rat
- experimental traumatic brain injury. experimental & translational stroke medicine
- animal models of head trauma
- a new model of diffuse brain injury in rats. part i: pathophysiology and biomechanics
- responses to cortical injury: i. methodology and local effects of contusions in the rat
- characterization of a new rat model of penetrating ballistic brain injury
- blast related neurotrauma: a review of cellular injury
- animal models of traumatic brain injury
- moderate controlled cortical contusion in pigs: effects on multiparametric neuromonitoring and clinical relevance
- progesterone in the treatment of acute traumatic brain injury: a clinical perspective and update
- the role of progesterone in traumatic brain injury. the journal of head trauma rehabilitation
- acute response of the hippocampal transcriptome following mild traumatic brain injury after controlled cortical impact in the rat
- analysis of post-traumatic brain injury gene expression signature reveals tubulins, nfe2l2, nfkb, cd44, and s100a4 as treatment targets
- time-dependent changes in microglia transcriptional networks following traumatic brain injury. frontiers in cellular neuroscience
- treatment of refractory status epilepticus with pentobarbital, propofol, or midazolam: a systematic review
- animal models of seizures and epilepsy: past, present, and future role for the discovery of antiseizure drugs
- experimental models of status epilepticus and neuronal injury for evaluation of therapeutic interventions
- loscher w. critical review of current animal models of seizures and epilepsy used in the discovery and development of new antiepileptic drugs
- animal models of status epilepticus and temporal lobe epilepsy: a narrative review
- kindling and status epilepticus models of epilepsy: rewiring the brain. progress in neurobiology
- status epilepticus: behavioral and electroencephalography seizure correlates in kainate experimental models
- differences in sensitivity to the convulsant pilocarpine in substrains and sublines of c57bl/6 mice. genes, brain, and behavior
- identification of micrornas with dysregulated expression in status epilepticus induced epileptogenesis
- dual and opposing roles of microrna-124 in epilepsy are mediated through inflammatory and nrsf-dependent gene networks
- genome-wide analysis of differential rna editing in epilepsy
- a comparison of pathophysiology in humans and rodent models of subarachnoid hemorrhage
- cerebral vasospasm after aneurysmal subarachnoid hemorrhage and traumatic brain injury. current treatment options in neurology
- a novel swine model of subarachnoid hemorrhage-induced cerebral vasospasm
- experimental subarachnoid hemorrhage: subarachnoid blood volume, mortality rate, neuronal death, cerebral blood flow, and perfusion pressure in three different rat models
- a murine model of subarachnoid hemorrhage
- learning deficits after experimental subarachnoid hemorrhage in rats
- a murine model of subarachnoid hemorrhage-induced cerebral vasospasm
- a modified double injection model of cisterna magna for the study of delayed cerebral vasospasm following subarachnoid hemorrhage in rats. experimental & translational stroke medicine
- a new percutaneous model of subarachnoid haemorrhage in rats
- pharmacological stabilization of intracranial aneurysms in mice: a feasibility study
- elastase-induced intracranial aneurysms in hypertensive mice
- a mouse model of intracranial aneurysm: technical considerations
- clazosentan, an endothelin receptor antagonist, in patients with aneurysmal subarachnoid haemorrhage undergoing surgical clipping: a randomised, double-blind, placebo-controlled phase 3 trial (conscious-2)
- randomized trial of clazosentan in patients with aneurysmal subarachnoid hemorrhage undergoing endovascular coiling. stroke; a journal of cerebral circulation
- cerebral vasospasm following subarachnoid hemorrhage: time for a new world of thought
- interleukin-1 receptor antagonist is beneficial after subarachnoid haemorrhage in rat by blocking haem-driven inflammatory pathology. disease models &
- proteomic expression changes in large cerebral arteries after experimental subarachnoid hemorrhage in rat are regulated by the mek-erk1/2 pathway
- the support of human genetic evidence for approved drug indications
- genetic risk factors for ischaemic stroke and its subtypes (the metastroke collaboration): a meta-analysis of genome-wide association studies
- genome-wide association study identifies a variant in hdac9 associated with large vessel ischemic stroke
- deficiency of the stroke relevant hdac9 gene attenuates atherosclerosis in accord with allele-specific effects at 7p21.1
- sodium valproate, a histone deacetylase inhibitor, is associated with reduced stroke risk after previous ischemic stroke or transient ischemic attack
- identification of hdac9 as a viable therapeutic target for the treatment of gastric cancer
- genome-wide association study in african americans with acute respiratory distress syndrome identifies the selectin p ligand gene as a risk factor
- genome-wide association study of intracranial aneurysms confirms role of anril and sox17 in disease risk
- genome-wide association study of intracranial aneurysm identifies three new risk loci
- anril: a lncrna at the cdkn2a/b locus with roles in cancer and metabolic disease
- effect of circular anril on the inflammatory response of vascular endothelial cells in a rat model of coronary atherosclerosis. cellular physiology and biochemistry: international journal of experimental cellular physiology, biochemistry, and pharmacology
- genome-wide association studies
- recommendations from the international stroke genetics consortium, part 1: standardized phenotypic data collection
- genes associated with multiple sclerosis: 15 and counting. expert review of molecular diagnostics
- identification of new biomarkers for acute respiratory distress syndrome by expression-based genome-wide association study
- technical approaches for mouse models of human disease
- generating mouse models for biomedical research: technological advances. disease models & mechanisms
- potential pitfalls of crispr/cas9-mediated genome editing
- the role of secretory granules in the transport of basement membrane components: radioautographic studies of rat parietal yolk sac employing 3h-proline as a precursor of type iv collagen
- monoclonal antibodies against chicken type iv and v collagens: electron microscopic mapping of the epitopes after rotary shadowing
- basement membrane (type iv) collagen is a heteropolymer
- col4a1 mutations as a monogenic cause of cerebral small vessel disease: a systematic review
- common variation in col4a1/col4a2 is associated with sporadic cerebral small vessel disease
- review: molecular genetics and pathology of hereditary small vessel diseases of the brain
- genome-wide association study of cerebral small vessel disease reveals established and novel loci
- molecular and genetic analyses of collagen type iv mutant mouse models of spontaneous intracerebral hemorrhage identify mechanisms for stroke prevention
- col4a2 mutations impair col4a1 and col4a2 secretion and cause hemorrhagic stroke
- abnormal expression of collagen iv in lens activates unfolded protein response resulting in cataract
- col4a1 mutation causes endoplasmic reticulum stress and genetically modifiable ocular dysgenesis. human molecular genetics
- plphp deficiency: clinical, genetic, biochemical, and mechanistic insights
- interfering with long chain noncoding rna anril expression reduces heart failure in rats with diabetes by inhibiting myocardial oxidative stress
- the interplay of lncrna anril and mir-181b on the inflammation-relevant coronary artery disease through mediating nf-kappab signalling pathway
- compound libraries: recent advances and their applications in drug discovery. current drug discovery technologies
- multiancestry genome-wide association study of 520,000 subjects identifies 32 loci associated with stroke and stroke subtypes
- atrial fibrillation genetic risk differentiates cardioembolic stroke from other stroke subtypes
- genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations
- predicting polygenic risk of psychiatric disorders
- classification of patients with sepsis according to blood genomic endotype: a prospective cohort study. the lancet respiratory medicine
- a predictive clinical-genetic model of tissue plasminogen activator response in acute ischemic stroke
- validation of a clinical-genetics score to predict hemorrhagic transformations after rtpa. journal of stroke and cerebrovascular diseases: the official journal of national stroke association
- rna-seq identifies circulating mir-125a-5p, mir-125b-5p, and mir-143-3p as potential biomarkers for acute ischemic stroke
- the intracerebral hemorrhage blood transcriptome in humans differs from the ischemic stroke and vascular risk factor control blood transcriptomes
- intracerebral hemorrhage and ischemic stroke of different etiologies have distinct alternatively spliced mrna profiles in the blood: a pilot rna-seq study
- genetic mutations associated with status epilepticus
- association of genetic variants related to cetp inhibitors and statins with lipoprotein levels and cardiovascular risk
- pcsk9: from basic science discoveries to clinical trials
- rare truncating variants in the sarcomeric protein titin associate with familial and early-onset atrial fibrillation
- recommendations from the international stroke genetics consortium, part 2: biological sample collection and storage
- investigators t-t. outcome prediction after mild and complicated mild traumatic brain injury: external validation of existing models and identification of new predictors using the track-tbi pilot study
- the center-tbi core study: the making-of
- comparison of machine learning models for seizure prediction in hospitalized patients
- developing interactions with industry in rare diseases: lessons learned and continuing challenges. genetics in medicine: official journal of the american college of medical genetics
- drug repurposing from the perspective of pharmaceutical companies
- insights into computational drug repurposing for neurodegenerative disease
- clinical providers' experiences with returning results from genomic sequencing: an interview study
- recommendations for reporting of secondary findings in clinical exome and genome sequencing, 2016 update (acmg sf v2.0): a policy statement of the american college of medical genetics and genomics

key: cord-033882-uts6wfqw
authors: khakharia, aman; shah, vruddhi; jain, sankalp; shah, jash; tiwari, amanshu; daphal, prathamesh; warang, mahesh; mehendale, ninad
title: outbreak prediction of covid-19 for dense and populated countries using machine learning
date: 2020-10-16
journal: ann
doi: 10.1007/s40745-020-00314-9
sha: doc_id: 33882 cord_uid: uts6wfqw

the coronavirus disease-2019 (covid-19) pandemic continues to have a mortifying impact on the health and well-being of the global population. a continued rise in the number of patients testing positive for covid-19 has put great stress on governing bodies across the globe, which are finding it difficult to tackle the situation. we have developed an outbreak prediction system for covid-19 for the top 10 highly and densely populated countries. the proposed prediction models forecast the count of new cases likely to arise for the successive 5 days using 9 different machine learning algorithms. a set of models for predicting the rise in new cases, with an average accuracy of 87.9% ± 3.9%, was developed for 10 high-population, high-density countries. the highest accuracy of 99.93% was achieved for ethiopia using the auto-regressive moving average (arma), averaged over the next 5 days.
the proposed prediction models can help stakeholders prepare in advance for any sudden rise in the outbreak, ensuring optimal management of available resources.

the sars-cov-2 coronavirus disease (covid-19) originated in wuhan, china, sometime during december 2019. within a month, more than ten thousand people were infected and hundreds died [1]. the initial outbreak caused several deaths, as the medical systems were not capable of handling many seriously ill patients. up to july 23, 2020, there were 631,680 deaths [2] reported across the world due to this pandemic. in a rapidly evolving pandemic, improper analysis and prediction of the number of patients results in an inefficient distribution of medical resources. limited medical facilities and mismanagement of resource allocation can lead to additional severe cases and a decline in recovery rates. to cope with this situation, predicting the number of new cases that will arise in the future is very important, since it can ensure optimal allocation of medical resources in the affected regions.

data science in the predictive domain is an emerging field, and in this study we have incorporated its principles [3] for the prediction of covid-19 progression. the outbreak of covid-19 is a significant challenge for any government with regard to the capacity and management of public health systems facing the catastrophic emergency [4]. a prediction model can help hospitals and healthcare management properly allocate resources, thereby reducing the pressure and allowing the situation to be handled with relative ease.

we developed and tested 9 different predictive algorithms for 10 countries. it was noticed that the pattern of growth in the number of cases varied from country to country. the basic approach was to train the models on the dataset provided, but such models were not sufficiently accurate, as they were trained on only one class of dataset.
as a result, the models were unable to accurately predict the number of new cases and, consequently, the existing techniques failed to utilize the resources in an optimized way [5]. insufficient training data is also one of the reasons for the low accuracy of such models. we tried 9 different standard machine learning (ml) algorithms for predicting the number of patients for the next 5 days. after obtaining a decent accuracy of 85%, we applied these algorithms to the datasets of different countries. we selected the 10 countries with the highest population and the highest density for our work. using the data of these countries, we trained standard prediction models with multiple ml algorithms and obtained a different accuracy for each model in each country. different models gave high accuracy for different countries; the variations in accuracy arise because the trend of change in covid-19 patients differs from country to country. the system flow diagram is shown in fig. 1.

multiple research works have been carried out to predict the outbreak of covid-19. vomlel et al. worked on a dataset of stemi patients; the classifiers used for predictions were logistic regression, logitboost, decision trees, nbc, neural networks, and two versions of bayesian network classifiers [6]. kumar et al. [7] used the arima model for predicting the outbreak in the top 15 european countries. tuli et al. [8] proposed an ml model that can run continuously on cloud data centers (cdcs) for precise prediction of spread and proactive development of strategic responses by governments and citizens; robust weibull models fitted their dataset better than baseline gaussian models. petropoulos et al. [9] introduced an objective approach to predict the continuation of covid-19 by live forecasting, producing ten-days-ahead point forecasts and prediction intervals.
a susceptible-exposed-infectious-recovered (seir) metapopulation model was used to predict the spread across all major cities in china, with 95% credible intervals [10]. yang et al. [11] used a modified seir model to derive the epidemic curve, together with an artificial intelligence (ai) approach, trained on the 2003 sars data, to predict the epidemic. bhatnagar et al. [12] created a mathematical model for predicting the spread of covid-19 in countries using various types of parameters and tested their model on real country data. a segmented poisson model, incorporating the power law and the exponential law, was proposed by zhang et al. [13] to study the covid-19 outbreaks in six major western countries. maier et al. [14] introduced a parsimonious model that captures both infected individuals and population-wide isolation practices in response to containment policies. li et al. [15] studied the transmission process of covid-19 using forward prediction and backward inference of the epidemic situation; the relevant analysis helped the countries concerned to make more appropriate decisions.

fig. 1 proposed system flow diagram. the data on the spread of covid-19 in the top 10 densely populated countries, viz., india, bangladesh, the democratic republic of congo, pakistan, china, the philippines, germany, indonesia, ethiopia, and nigeria, were analyzed. the data for all the countries were fed into 9 different machine learning algorithms to predict the count of new cases for the next 5 days. these predicted values were compared with the actual values, and the accuracy was calculated. the best outbreak prediction model was selected for each country depending on the accuracy values obtained.

tomar et al.
[16] used data-driven estimation methods, such as long short-term memory (lstm) and curve fitting, to predict the monthly number of covid-19 cases in india and to study the effect of preventive measures, such as social isolation and lockdown, on the spread of covid-19. kumar et al. [17] applied cluster analysis to classify real groups of covid-19 infections on a dataset of different states and union territories in india, based on their high similarity to each other. wu et al. [10] forecast the prediction only for the major cities of china, whereas zhang et al. [13] predicted for six major western countries. in contrast, the proposed methods forecast the count for 10 highly and densely populated countries. seir, poisson, arima, and exponential smoothing models have been reported for covid-19 count prediction; however, we incorporated 9 different ml algorithms and trained our models with the data of over 100 days, three times more than reported in the literature.

we considered the top 10 countries with high population and high density for our outbreak prediction system. since covid-19 spreads mainly through human contact, it was imperative to consider only those countries with high density as well as high population. the datasets of bangladesh, india, china, pakistan, germany, nigeria, ethiopia, the democratic republic of congo, the philippines, and indonesia were used. initially, we identified a list of the 20 most populated countries (supplementary material s.10). further, we obtained a list of countries with the highest population density (supplementary material s.11). from these two lists, we identified the top 10 countries having the highest density as well as a high population count (table 1).

we used 9 different machine learning algorithms for predicting the number of patients for the above-specified countries. the train-to-test data partition was 94% and 6%, respectively.
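the 94%/6% chronological partition described above can be sketched in a few lines of python; the helper name and the synthetic series are illustrative, not the authors' code:

```python
def chronological_split(series, train_frac=0.94):
    # time-ordered split: the earliest 94% of days train the model,
    # the final 6% are held out as the forecast horizon
    cut = int(len(series) * train_frac)
    return series[:cut], series[cut:]

daily_cases = list(range(100))            # 100 days of synthetic counts
train, test = chronological_split(daily_cases)
# len(train) == 94, len(test) == 6; the test window is the most recent days
```

keeping the split chronological, rather than random, matters for time series: the model must never see days later than those it is asked to forecast.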
the algorithms predicted the rise in the number of cases over the next 5 days (figs. 2, 3) for the countries specified in table 1. the testing run-time for these algorithms varied between 2 and 5 s. the algorithms were tuned by an iterative approach over normalized values between zero and one; for the tuning of individual parameters, partial and full autocorrelation were used.

the arma model is the merger of the auto-regressive (ar) and moving average (ma) models: the ar model tries to explain the momentum and mean-reversion effects often observed in trading markets, while the ma model tries to capture the shock effects observed in thermal noise. these shock effects can be thought of as unexpected events affecting the observations. first we loaded the dataset and divided it into a training set and a test set, the test set comprising the values for which we had to make predictions. we then trained an arma model on the training data. in arma, the values of p and q were put inside the order of the model; these values changed depending on what fitted the model best, are normally taken up to 6, and varied for different training datasets.

for a given series a_0, a_1, …, a_t, to implement the arma model we take differences of the data at consecutive timestamps and form a new series; this differencing forms the d parameter of the model. let us represent the new time series as z_0, z_1, …, where z_t = a_(t+1) − a_t. the newly formed time series is stationary. usually the value of d is taken as 0 or 1. the last value of the z series is given by z_(t−1) = a_t − a_(t−1). now, if we want to predict the value at the k-th position in the future, i.e., k > t, we need the answer in the original series, that is, a value of a_k, so we convert the z series back into the a series as a_k = a_t + z_t + z_(t+1) + … + z_(k−1).

fig. 2 prediction plots for the number of covid-19 patients expected over the next 5 days for countries where an exponential increase in the curve is expected or where the rise in cases should remain constant. various machine learning models were deployed for predicting the outbreak. the black line shows the actual data, whereas the other colors represent the predictions obtained using the different ml algorithms. the svr model is inefficient for most of the countries, whereas the arima model gave comparatively better results. the predictions for the countries can be seen more clearly in the snippets. (a) the prediction plot for indonesia indicates a rise in the curve as predicted by most of the models; arima shows a decline in the cases, whereas the arma model indicates a rise in the curve. (b) prediction plot for nigeria: apart from the arma model, all the other models predicted an increase in the curve. (c) prediction plot for pakistan: svr indicates a sharp increase in the curve, whereas the other models show a constant rise in the number of cases. (d) prediction plot for bangladesh: all the models indicate a constant rise in the cases, whereas the svr model shows an abrupt increase in the curve, indicating its inefficiency for predicting the outbreak. (e) prediction plot for india: the cases in india will increase exponentially as predicted by the models, whereas the lrp model predicted a decline for india.

fig. 3 predictions for the next 5 days of the number of patients in the countries where cases are likely to decrease in the coming days, using 9 different machine learning algorithms. the black line represents the real data obtained and the other colors show predictions using the different ml models. the predictions for the countries can be seen more clearly in the snippets.
(a) prediction plot for germany: the arima and arma models indicate that the count will remain constant over the coming days, whereas xgb shows a decrease in the cases. (b) prediction plot for ethiopia: xgb shows a rapid decline in the number of cases, whereas the arma model shows a slight decrease in the curve. (c) prediction plot for the philippines: all the algorithms were inefficient in predicting the highly uneven number of cases seen in the country. (d) prediction plot for china: when trained on a dataset with some specific values, a few algorithms such as lrp and brr gave inappropriate results. (e) prediction plot for the democratic republic of congo: svr and lrp show an increase in the number of cases.

arima is a predictive model that predicts a future time series based on its past values. an arima model is characterized by 3 terms: p, d, and q, where p is the order of the auto-regressive term, q is the order of the moving average term, and d is the number of differences required. first we loaded the dataset and divided it into a training set and a test set, the test set comprising the values for which we had to make predictions. we then trained an arima model on the training data, with the values of p, d, and q put inside the order of the model. these values changed depending on what fitted the model best: p and q are normally taken up to 6 and d varies between 0 and 1, with the values varying for different training datasets. given a time series l_0, l_1, …, l_t, if we want to predict the last term l_t, let the predicted last term be l̂_t. the actual last term is given by l_t = μ + φ_1 l_(t−1) + … + φ_p l_(t−p) + θ_1 ε_(t−1) + … + θ_q ε_(t−q) + ε_t, where each θ_i ε_(t−i) is a moving average term and ε_t is the error lag. now, for predicting l̂_t, only the error lag term ε_t is not present.
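the differencing step (building the z series) and its inversion (recovering the a series) described for the arma/arima models can be sketched in plain python; the function names are illustrative, not the authors' code:

```python
def difference(a):
    # first-order differencing (d = 1): z_t = a_(t+1) - a_t
    return [a[i + 1] - a[i] for i in range(len(a) - 1)]

def undifference(z, a0):
    # invert the differencing: a_k = a_0 + z_0 + z_1 + ... + z_(k-1)
    a = [a0]
    for dz in z:
        a.append(a[-1] + dz)
    return a

cases = [3, 7, 12, 20, 31]                 # toy daily counts
z = difference(cases)                      # [4, 5, 8, 11]
assert undifference(z, cases[0]) == cases  # round trip recovers the series
```

a forecast produced on the stationary z series is converted back to case counts the same way, by cumulatively summing the predicted differences onto the last observed value.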
the values of p and q are determined by the acf and pacf, where acf stands for the auto-correlation function and pacf for the partial auto-correlation function.

linear regression is a statistical approach for modeling the relationship between a dependent variable and a given set of independent variables. all the values in the dataset were plotted, and we then created the best-fit line, i.e., the line that minimizes the error, the difference between the actual and predicted values. we found the slope of the line and also its y-intercept; after getting the equation of the line, we were able to predict new values, namely the number of patients in an individual country. the expression for a line is y = mx + c, where m is the slope, calculated as m = Σ(x_i − x̄)(y_i − ȳ) / Σ(x_i − x̄)², where x̄ and ȳ are the mean values.

for polynomial regression we used the polynomial feature function provided by the scikit-learn machine learning library, with which we can raise the input variable to higher powers and then fit and transform it on any desired model. first we imported the necessary libraries, including the polynomial and linear regression functions from scikit-learn. we instantiated a polynomial feature function with degree = 5 as a parameter, then fitted and transformed the input variable as well as the list of days for which we wanted to make predictions. after that, we instantiated the linear regression model with the parameters normalize = True and fit_intercept = False, and fitted the model on the new list made by applying the polynomial features. we can then predict covid-19 cases on any particular day by applying the polynomial features to the list of days for which a prediction is desired.
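the least-squares slope and intercept formulas above can be checked with a short pure-python sketch (toy data, not the authors' dataset):

```python
def fit_line(xs, ys):
    # slope m = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2),
    # intercept c = ybar - m * xbar
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    m = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) \
        / sum((x - xbar) ** 2 for x in xs)
    return m, ybar - m * xbar

days = [0, 1, 2, 3, 4]
cases = [10, 12, 14, 16, 18]              # perfectly linear toy counts
m, c = fit_line(days, cases)
# m == 2.0 and c == 10.0, so the day-6 prediction y = m*6 + c is 22.0
```

the polynomial variant works the same way once the day index is expanded into powers (day, day², …, day⁵), which is what scikit-learn's polynomial feature transform does before the linear fit.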
polynomial regression is a model based on a mixture of dependent and independent variables, y = f_p(x_1, …, x_m), where m is the number of independent variables, y is the dependent variable, and f_p is the polynomial function, which adds variables of whatever power gives the best results with the dataset taken.

bayesian linear regression is a type of linear regression. we used a polynomial version of the bayesian ridge regressor to capture the relationship between the input variables and the target variable, which can then be used for prediction of covid-19 cases on any given day. we used a randomized search, which takes dictionaries whose keys are parameter names and whose values are lists of candidate values, to see which set of parameters gives the best results for our model. the model was defined with the parameters w, α, and λ during the fitting, the regularization parameters α and λ being estimated by maximizing the log marginal likelihood. the initial values of the maximization procedure can be set with the hyperparameters alpha_init and lambda_init. there are four more hyperparameters, α_1, α_2, λ_1, and λ_2, of the gamma prior distributions over α and λ; these are usually chosen to be non-informative. bayesian ridge regression estimates a probabilistic model of the regression problem as described above: the prior for the coefficient vector w is given by a spherical gaussian, p(w | λ) = N(w | 0, λ⁻¹ I), and the priors over α and λ are chosen to be gamma distributions, the conjugate prior for the precision of the gaussian. the resulting model is called bayesian ridge regression.

svr is a powerful algorithm that lets us choose how tolerant we are of errors, both through an acceptable error margin (ε) and through tuning our tolerance of falling outside that acceptable error rate.
our original training dataset for every country was stated in a finite-dimensional space, and the sets to discriminate were not linearly separable in that space. to resolve this problem, the original finite-dimensional space was mapped into a higher-dimensional space, which let us make predictions for the different countries in a non-linear approach. the model is defined through a comprehensive evaluation of the gram matrix over the predictors x(i) and x(j): an n × n matrix containing the elements g(i, j) = g(x_i, x_j). some regression problems cannot be described using a linear model; for those we need nonlinear models. a nonlinear svm regression model is obtained by replacing the dot product x₁′x₂ with a nonlinear kernel function g(x₁, x₂) = ⟨φ(x₁), φ(x₂)⟩, where φ(x) is a transformation that maps x to a high-dimensional space. the statistics and machine learning toolbox provides the following built-in semi-definite kernel functions. the linear (dot product) kernel: g(x₁, x₂) = x₁′x₂. the gaussian kernel: g(x₁, x₂) = exp(−‖x₁ − x₂‖²). the polynomial kernel: g(x₁, x₂) = (1 + x₁′x₂)^q, where q is in {2, 3, ...}. we used the random forest regressor (rfr), as it fits several classifying decision trees. the sub-sample size was controlled with the max_samples parameter. we loaded the model into our training environment and initialized its parameters, fixing the random state so that we got the same result every time we ran the model on the given dataset. then we fitted this model on the dataset so that we could easily predict the number of covid-19 cases on any day using our trained model.
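the three built-in kernels listed above, and the gram matrix they populate, translate directly into code; a small pure-python sketch (the function names are ours, chosen for illustration):

```python
import math

def linear_kernel(x1, x2):
    # G(x1, x2) = x1' x2
    return sum(a * b for a, b in zip(x1, x2))

def gaussian_kernel(x1, x2, gamma=1.0):
    # G(x1, x2) = exp(-gamma * ||x1 - x2||^2); gamma=1 matches the form above
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-gamma * sq_dist)

def polynomial_kernel(x1, x2, q=2):
    # G(x1, x2) = (1 + x1' x2)^q, with q in {2, 3, ...}
    return (1.0 + linear_kernel(x1, x2)) ** q

def gram_matrix(xs, kernel):
    """n x n Gram matrix with elements G(i, j) = kernel(x_i, x_j)."""
    return [[kernel(xi, xj) for xj in xs] for xi in xs]
```

each kernel is symmetric, so the gram matrix it produces is symmetric and positive semi-definite, which is what makes the implicit high-dimensional mapping φ usable without ever computing φ itself.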
xgboost stands for "extreme gradient boosting" and is an implementation of the gradient boosted trees algorithm. firstly, we imported the necessary libraries, instantiated xgboost with n_estimators=1000, and fit the model. with this, we predicted the number of covid-19 cases on any day we wanted, using the dataset of actual covid-19 cases for training the model. the objective is a comprehensive mix of the training loss and a regularization measure, obj(θ) = L(θ) + Ω(θ), summed over an interval varying from 1 to n. optimizing the training loss assists the predictive model on the data d = ((x₁, y₁), ..., (x_n, y_n)), while the regularization term enhances generalization by favoring simpler models. boosting is additive, ŷᵢ^(t) = ŷᵢ^(t−1) + f_t(xᵢ), with each new learner f_t added to the running prediction, and approximation techniques such as the taylor approximation of the loss have been used in deriving the model. here L(θ) is the loss function and Σᵢ (ŷᵢ − yᵢ)² is the squared loss. in the holt-winters exponential smoothing model we considered the seasonality to be additive: the forecasted value for each data element is the sum of the baseline (level), trend, and seasonality components. we use c to denote the frequency of the seasonality; the number of periods (1/frequency) also depends on the best fit found in training. we loaded a dataset and divided it into a training set and a test set, the latter comprising the values for which we had to make predictions. we then performed exponential smoothing on the training data with additive seasonality. this model requires the number of periods over which smoothing takes place; the value of periods varies depending on the best fit, judged by analyzing the graph of training.
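the additive-boosting idea behind xgboost, fitting each new learner to the residuals of the squared loss, can be illustrated with regression stumps in pure python. this is a toy sketch of gradient boosting under squared loss, not the xgboost library itself, and it omits the regularization term and taylor machinery:

```python
def fit_stump(x, residuals):
    """Best single-split regression stump on 1-D inputs (least squares).
    Assumes distinct x values."""
    best = None
    order = sorted(range(len(x)), key=lambda i: x[i])
    for k in range(1, len(x)):
        thr = (x[order[k - 1]] + x[order[k]]) / 2.0
        left = [residuals[i] for i in order[:k]]
        right = [residuals[i] for i in order[k:]]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    _, thr, lm, rm = best
    return lambda v: lm if v <= thr else rm

def gradient_boost(x, y, n_rounds=50, learning_rate=0.3):
    """Additive boosting with squared loss: each round fits a stump to the
    current residuals y - y_hat (the negative gradient of the squared loss)
    and adds it to the ensemble: y_hat^(t) = y_hat^(t-1) + lr * f_t(x)."""
    pred = [0.0] * len(x)
    stumps = []
    for _ in range(n_rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, resid)
        stumps.append(stump)
        pred = [pi + learning_rate * stump(xi) for xi, pi in zip(x, pred)]
    return lambda v: sum(learning_rate * s(v) for s in stumps)
```

on a step-function target the residuals shrink geometrically each round, so the ensemble converges to the true levels; xgboost adds regularized tree learners and a second-order approximation on top of this same additive scheme.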
the holt-winters seasonal method comprises the forecast equation and three smoothing equations, one for the level l_t, one for the trend b_t, and one for the seasonal component s_t, with corresponding smoothing parameters α, β and γ. within each period, the seasonal component will add up to approximately zero, i.e., Σ s_t ≈ 0 for a particular period. mathematically, the holt-winters additive model is represented as: forecast = estimated level + trend + seasonality at the most recent time point. the series equation is ŷ_{t+h|t} = l_t + h·b_t + s_{t+h−c(k+1)}, for a series with level l_t, trend b_t and seasonality s_t with c seasons (k is the integer part of (h−1)/c, so the seasonal index is taken from the final c observations of the series). the level equation, l_t = α(y_t − s_{t−c}) + (1 − α)(l_{t−1} + b_{t−1}), shows a weighted average between the seasonally adjusted observation (y_t − s_{t−c}) and the non-seasonal forecast (l_{t−1} + b_{t−1}) for time t. the trend equation, b_t = β(l_t − l_{t−1}) + (1 − β)b_{t−1}, is identical to holt's linear method. the seasonality equation, s_t = γ(y_t − l_{t−1} − b_{t−1}) + (1 − γ)s_{t−c}, shows a weighted average between the current seasonal index, (y_t − l_{t−1} − b_{t−1}), and the seasonal index of the same season c time periods ago. the values of α, β and γ usually range between 0 and 1. the models were trained on the windows 10 operating system with an 8th generation intel i5 processor and 8 gb of ram. the dataset was obtained from ourworldindata.org [18]. all the models were trained on google colaboratory, as well as spyder, using python version 3. as shown in table 2, we used 9 different machine learning algorithms to predict the number of patients in 10 highly dense and populated countries. among all the models for the various countries (figs. 2, 3), we achieved the highest accuracy of 99.93% for ethiopia (figs. 3b and 4c) by using the arma model. arima gave an accuracy of more than 85% most of the time for almost all countries. almost all the models gave an accuracy of more than 80% for at least one of the 10 countries, except in the case of the philippines (fig. 3c).
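the four holt-winters equations above translate directly into code; a minimal pure-python sketch of the additive recursions (the initialization choices here, level and trend from the first two seasons and seasonals from the first season's deviations, are ours and simpler than a production implementation's):

```python
def holt_winters_additive(y, c, alpha=0.4, beta=0.1, gamma=0.4, horizon=5):
    """Additive Holt-Winters: level, trend and seasonal recursions as in
    the equations above. Requires len(y) >= 2*c observations."""
    level = sum(y[:c]) / c                      # mean of the first season
    trend = (sum(y[c:2 * c]) / c - level) / c   # per-step change between seasons
    seasonal = [v - level for v in y[:c]]       # deviations about the first season
    for t in range(c, len(y)):
        last_level, last_trend = level, trend
        # l_t = alpha*(y_t - s_{t-c}) + (1-alpha)*(l_{t-1} + b_{t-1})
        level = alpha * (y[t] - seasonal[t % c]) + (1 - alpha) * (last_level + last_trend)
        # b_t = beta*(l_t - l_{t-1}) + (1-beta)*b_{t-1}
        trend = beta * (level - last_level) + (1 - beta) * last_trend
        # s_t = gamma*(y_t - l_{t-1} - b_{t-1}) + (1-gamma)*s_{t-c}
        seasonal[t % c] = gamma * (y[t] - last_level - last_trend) \
            + (1 - gamma) * seasonal[t % c]
    # forecast: level + h*trend + matching seasonal index
    return [level + h * trend + seasonal[(len(y) + h - 1) % c]
            for h in range(1, horizon + 1)]
```

on a constant series the recursions are exact (level stays put, trend and seasonals stay zero), and on a trending series the forecasts track the trend once the initialization error has been smoothed away.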
we found different countries to have different trends of increase or decrease in covid-19 patients. not every ml algorithm could give very high accuracy for predicting the rise or fall in the cases for each country. our results showed that for bangladesh (figs. 2d and 4b), the lrp model showed the highest accuracy of 86.45%. for india (figs. 2e and 4a), we got an accuracy of 99.26% using the arma model. china (figs. 3d and 5d) had a prediction value of 82% using the xgb model. for pakistan (figs. 2c and 5b), the accuracy was 87.91% using the brr model. for nigeria (figs. 2b and 4e), the accuracy was 98.06% using the arma model. the democratic republic of congo (figs. 3e and 4d) showed the highest accuracy of 91.96% by using the lrp model. indonesia (figs. 2a and 5a) demonstrated the highest accuracy of 97.72% using the arima model. for germany (figs. 3a and 4f), arima gave an accuracy of 85.39%. using the svr model, we got a prediction accuracy of 50.54% for the philippines (figs. 3c and 5c). figure 4 shows bar graphs for different error percentages and their corresponding errors for the next 5-day predictions.

[fig. 4 caption: bar graphs depicting the error percentage and error bar for the 5-day prediction by using 9 different ml models. the green color bar indicates the model with the least percentage error, i.e., the highest percentage accuracy. a for india, the arma model gave the highest accuracy with an error bar of 0.42. b for bangladesh, the lrp model gave the highest accuracy with an error bar of 3.18 as compared to other models. c the arma model showed the least percentage error in comparison to other models for ethiopia. d the lrp model gave the least percentage error for the democratic republic of congo. e in the case of nigeria, the arma model gave the least percentage error, although the error bar had a value of 7.76. f for germany, the arima model gave the least percentage error, while models like brr and lr gave percentage errors of more than 50%.]
in the case of india (fig. 4a), ethiopia (fig. 4c), and nigeria (fig. 4e), the arma model gave the highest accuracy for the prediction as compared to the other models. for bangladesh (fig. 4b) and the democratic republic of congo (fig. 4d), the lrp model proved to be effective, although the accuracy in the case of bangladesh was low. in the case of germany (fig. 4f), the arima model gave the least percentage error, while models like brr and lr gave errors of more than 50%. for indonesia (fig. 5a), the arima model gave the least percentage error, while the arma model was highly inaccurate, with a very high error percentage. in the case of pakistan (fig. 5b), the brr model yielded better accuracy, whereas for the philippines (fig. 5c), none of the models made accurate predictions; the percentage error of all models was more than 40%. the xgb model proved best for prediction in the case of china (fig. 5d). a range finder code was written that helps to improve accuracy. this code works on a range of predicted numbers from all 9 algorithms, rather than the actual predictions of the individual algorithms. this combined approach helped us to improve accuracy by up to 8%. it was not possible to get the results using all 9 algorithms for each country, as there were no specific trends observed.

[fig. 5 caption: bar graphs depicting the error percentage and error bar for the 5-day prediction using 9 different ml models. the green color bar indicates the model with the least percentage error, i.e., the highest percentage accuracy. a for indonesia, the arima model gave the least percentage error, whereas arma gave the highest percentage error as compared to other models. b for pakistan, the brr model proved best for prediction. c for the philippines, none of the models gave the accuracy that was expected; the percentage error of all models was more than 40%. d for china, the xgb model proved the best for prediction, while models like lr and svr gave an error percentage of more than 70%.]
for the philippines, we got very low accuracy because a sudden drop from around 1400 cases to 0 was seen, and on the following day the count increased by 4500. also, because of changes in the government rules, the covid-19 count of the country changed drastically; due to this change, the proposed models were not able to make predictions with high accuracy. according to our dataset for china, a particular day had approximately 2000 patients, and after that day the rise observed in the number of cases was approximately 13,000, bringing the total number of patients to around 15,000. all of a sudden, the number dropped by 11,000 the next day. the declining phase started after the drop and lasted for about the next 100 days. due to this peak value, our training dataset had to be changed: we considered only those values after the peak, where a declining trend could be seen. to date, china shows a decline in the curve, and hence we considered only the decreasing values; the slope for china is decreasing, so the values after the peak were used. even after considering this, 2 out of the 9 models failed to show good accuracy for china. the raw data received from countries like china and the philippines were not correct, because the government policies changed on february 17, 2020 and july 6, 2020, respectively. in table 3, we have compared our methodology with the other methodologies reported in the literature. most of the literature has used the arima model for the outbreak prediction of covid-19. we have used 9 different ml models for the prediction of covid-19 in the top 10 densely and highly populated countries. we achieved the highest accuracy of 99.93%, which was high as compared to the other methodologies reported; this highest accuracy was achieved by the arma model for ethiopia. poonia et al. [20] achieved the highest accuracy of 95% for india using arima model forecasting.
our arima model for india achieved an accuracy of 90.55%, which was high as compared to the accuracy of 70% obtained by gupta et al. [21] using the arima model and exponential smoothing. in comparison, the seir model implemented by wu et al. [10] gave an accuracy of 95% for prediction in wuhan. although the overall accuracy achieved was very good, we are still trying to implement prediction models using different algorithms that could give us higher accuracy. we are also planning to build a single standard model that can be used for any country, which may be a combination of different algorithms, as well as ml algorithms that could give us an approximate duration of covid-19 as a pandemic. the study presented here outlines several techniques for predicting the new cases that would arise in the near future in any region during an expanding pandemic, so that resources can be properly allocated in those regions for higher recovery rates. the arma model gave the highest accuracy for the prediction of covid-19 cases for ethiopia. from the results obtained on all the models for all countries, it was found that arma proved to be the best model for india and nigeria; arima was best for indonesia and germany; lrp for bangladesh and the democratic republic of congo; and brr, xgb and svr proved best for pakistan, china and the philippines, respectively. we got an accuracy of more than 80% for all the countries except the philippines with at least one of the 9 ml algorithms. the overall best model for the prediction was arima. generating high-accuracy predictions that could help in an optimized use of available resources, along with pacing up the recovery graphs, has been the main aim behind this exercise. these regions could potentially benefit from knowing the number of resources that they would need based on the predictions of the model.
this model could help in lowering the cost of dealing with the pandemic and improve the recovery process in regions where it is deployed.

data availability statement: all the data and codes used in this study, as well as the supplementary material, can be made available from the corresponding author upon reasonable request.

code availability: all the codes used in this study, as well as the supplementary material, can be made available from the corresponding author upon reasonable request.

involvement of human participants and animals: this article does not contain any studies with animals or humans performed by any of the authors. all the necessary permissions were obtained from the institute's ethical committee and concerned authorities. no informed consent was required as the studies do not involve any human participant.

references (titles as extracted):
- what are the underlying transmission patterns of covid-19 outbreak? an age-specific social contact characterization
- joint modeling of longitudinal cd4 count and time-to-death of hiv/tb co-infected patients: a case of jimma university specialized hospital
- introduction to business data mining
- culture vs policy: more global collaboration to effectively combat covid-19
- optimization based data mining: theory and applications
- machine learning methods for mortality prediction in patients with st elevation myocardial infarction
- forecasting the dynamics of covid-19 pandemic in top 15 countries in april 2020: arima model with machine learning approach
- predicting the growth and trend of covid-19 pandemic using machine learning and cloud computing
- forecasting the novel coronavirus covid-19
- nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study
- modified seir and ai prediction of the epidemics trend of covid-19 in china under public health interventions
- covid-19: mathematical modeling and predictions
- predicting turning point, duration and attack rate of covid-19 outbreaks in major western countries
- effective containment explains subexponential growth in recent confirmed covid-19 cases in china
- propagation analysis and prediction of the covid-19
- prediction for the spread of covid-19 in india and effectiveness of preventive measures
- monitoring novel corona virus (covid-19) infections in india by cluster analysis
- coronavirus pandemic (covid-19), our world in data
- covid-19 disease outbreak forecasting of registered and recovered cases after sixty day lockdown in italy: a data driven model approach
- short-term forecasts of covid-19 spread across indian states until 1
- trend analysis and forecasting of covid-19 outbreak in india

ethic statements: all authors consciously assure that the manuscript fulfills the following statements: 1) this material is the authors' own original work, which has not been previously published elsewhere. 2) the paper is not currently being considered for publication elsewhere. 3) the paper reflects the authors' own research and analysis in a truthful and complete manner. 4) the paper properly credits the meaningful contributions of co-authors and co-researchers. 5) the results are appropriately placed in the context of prior and existing research.

key: cord-002169-7kwlteyr authors: wu, nicholas c; dai, lei; olson, c anders; lloyd-smith, james o; sun, ren title: adaptation in protein fitness landscapes is facilitated by indirect paths date: 2016-07-08 journal: nan doi: 10.7554/elife.16965 sha: doc_id: 2169 cord_uid: 7kwlteyr the structure of fitness landscapes is critical for understanding adaptive protein evolution. previous empirical studies on fitness landscapes were confined to either the neighborhood around the wild type sequence, involving mostly single and double mutants, or a combinatorially complete subgraph involving only two amino acids at each site.
in reality, the dimensionality of protein sequence space is higher (20^L) and there may be higher-order interactions among more than two sites. here we experimentally characterized the fitness landscape of four sites in protein gb1, containing 20^4 = 160,000 variants. we found that while reciprocal sign epistasis blocked many direct paths of adaptation, such evolutionary traps could be circumvented by indirect paths through genotype space involving gain and subsequent loss of mutations. these indirect paths alleviate the constraint on adaptive protein evolution, suggesting that the heretofore neglected dimensions of sequence space may change our views on how proteins evolve. doi: http://dx.doi.org/10.7554/elife.16965.001 the fitness landscape is a fundamental concept in evolutionary biology (kauffman and levin, 1987; poelwijk et al., 2007; romero and arnold, 2009; hartl, 2014; kondrashov and kondrashov, 2015; de visser and krug, 2014). large-scale datasets combined with quantitative analysis have successfully unraveled important features of empirical fitness landscapes (kouyos et al., 2012; barton et al., 2015; szendro et al., 2013). nevertheless, there is a huge gap between the limited throughput of fitness measurements (usually on the order of 10^2 variants) and the vast size of sequence space. recently, the bottleneck in experimental throughput has been improved substantially by coupling saturation mutagenesis with deep sequencing (fowler et al., 2010; hietpas et al., 2011; jacquier et al., 2013; wu et al., 2014; thyagarajan and bloom, 2014; qi et al., 2014; stiffler et al., 2015), which opens up unprecedented opportunities to understand the structure of high-dimensional fitness landscapes (jiménez et al., 2013; pitt and ferré-d'amaré, 2010; payne and wagner, 2014).
previous empirical studies on combinatorially complete fitness landscapes have been limited to subgraphs of the sequence space consisting of only two amino acids at each site (2^L genotypes) (weinreich et al., 2006; lunzer et al., 2005; o'maille et al., 2008; lozovsky et al., 2009; franke et al., 2011; tan et al., 2011). most studies of adaptive walks in these diallelic sequence spaces focused on "direct paths" where each mutational step reduces the hamming distance from the starting point to the destination. however, it has also been shown that mutational reversions can occur during adaptive walks in diallelic sequence spaces such that adaptation proceeds via "indirect paths" (depristo et al., 2007; berestycki et al., 2014; martinsson, 2015; li, 2015; palmer et al., 2015). in sequence space with higher dimensionality (20^L, for a protein sequence with L amino acid residues), the extra dimensions may further provide additional routes for adaptation (gavrilets, 1997; cariani, 2002). although the existence of indirect paths has been implied in different contexts, it has not been studied systematically and its influence on protein adaptation remains unclear. another underappreciated property of fitness landscapes is the influence of higher-order interactions. empirical evidence suggests that pairwise epistasis is prevalent in fitness landscapes (kvitek and sherlock, 2011; kouyos et al., 2012; o'maille et al., 2008; lozovsky et al., 2009). specifically, sign epistasis between two loci is known to constrain adaptation by limiting the number of selectively accessible paths (weinreich et al., 2006). higher-order epistasis (i.e. interactions among more than two loci) has received much less attention and its role in adaptation is yet to be elucidated (weinreich et al., 2013; palmer et al., 2015).
in this study, we investigated the fitness landscape of all variants (20^4 = 160,000) at four amino acid sites (v39, d40, g41 and v54) in an epistatic region of protein g domain b1 (gb1, 56 amino acids in total) (figure 1-figure supplement 1), an immunoglobulin-binding protein expressed in streptococcal bacteria (sjöbring et al., 1991; sauer-eriksson et al., 1995). the four chosen sites contain 12 of the top 20 positively epistatic interactions among all pairwise interactions in protein gb1, as we previously characterized (figure 1-figure supplement 2). thus the sequence space is expected to cover highly beneficial variants, which presents an ideal scenario for studying adaptive evolution. moreover, this empirical fitness landscape is expected to provide us insights on how high dimensionality and epistasis would influence evolutionary accessibility.

elife digest: proteins can evolve over time by changing their component parts, which are called amino acids. these changes usually happen one at a time and natural selection tends to preserve those changes that make the protein more efficient at its specific tasks, while discarding those that impair the protein's activity. however the effect of each change depends on the protein as a whole, and so two changes that separately make the protein worse can make it much better if they occur together. this phenomenon is called epistasis and in some cases it can trap proteins in a suboptimal form and prevent them from improving further. proteins are made from twenty different kinds of amino acid, and there are millions of different combinations of amino acids that could, in theory, make a protein of a given length. studying protein evolution involves making variants of the same protein, each with just a few changes, and comparing how efficient, or "fit", they are. previous studies only measured the fitness of a few variants and showed that epistasis could block protein evolution by requiring the protein to lose some fitness before it could improve further. however, new techniques have now made it easier to study protein evolution by testing many more protein variants. wu, dai et al. focused on four amino acids in part of a protein called gb1 and tested the efficiency of every possible combination of these four amino acids, a total of 160,000 (20^4) variants. contrary to expectations, the results suggested that the protein could evolve quickly to maximise fitness despite there being epistasis between the four amino acids. overcoming epistasis typically involved making a change to one amino acid that paved the way for further changes while avoiding the need to lose fitness. the original change could then be reversed once the epistasis was overcome. the complexity of this solution means it can only be seen by studying a large number of protein variants that represent many alternative sequences of protein changes. wu, dai et al. conclude that proteins are able to achieve a higher level of fitness through evolution by exploring a large number of changes. there are many possible changes for each protein and it is this variety that, despite epistasis, allows proteins to become naturally optimised for the tasks that they perform. while the full complexity of protein evolution cannot be explored at the moment, as technology advances it will become possible to study more protein variants. such advances would therefore hopefully allow researchers to discover even more about the natural mechanisms of protein evolution.

briefly, a mutant library containing all amino acid combinations at these four sites was generated by codon randomization. the "fitness" of protein gb1 variants, as determined by both stability (i.e. the fraction of folded proteins) and function (i.e.
binding affinity to igg-fc), was measured in a high-throughput manner by coupling mrna display with illumina sequencing (see materials and methods, figure 1-figure supplement 3) (roberts and szostak, 1997; olson et al., 2012). the relative frequency of mutant sequences before and after selection allowed us to compute the fitness of each variant relative to the wild type protein (wt). while most mutants had a lower fitness compared to wt (fitness < 1), 2.4% of mutants were beneficial (fitness > 1) (figure 1-figure supplement 4). we note that this study does not aim to extrapolate protein fitness to organismal fitness. although there are examples showing that protein fitness in vitro correlates with organismal fitness in vivo (natarajan et al., 2013; wu et al., 2012), this relation may not be linear and is likely to be system-specific due to the difference in selection pressures in vitro and in vivo (pál et al., 2006; hingorani and gierasch, 2014). to understand the impact of epistasis on protein adaptation, we first analyzed subgraphs of sequence space including only two amino acids at each site (figure 1a). each subgraph represented a classical adaptive landscape connecting wt to a beneficial quadruple mutant, analogous to previously studied protein fitness landscapes (weinreich et al., 2006; szendro et al., 2013). each variant is denoted by the single letter code of amino acids across sites 39, 40, 41 and 54 (for example, the wt sequence is vdgv). each subgraph is combinatorially complete with 2^4 = 16 variants, including wt, the quadruple mutant, and all intermediate variants. we identified a total of 29 subgraphs in which the quadruple mutant was the only fitness peak. by focusing on these subgraphs, we essentially limited the analysis to direct paths of adaptation, where each step would reduce the hamming distance from the starting point (wt) to the destination (quadruple mutant).
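the fitness computation described here, a variant's enrichment through selection normalized by the wild type's enrichment, can be sketched in a few lines; the counts below are illustrative, and any variant name other than vdgv (wt) and wlfa (a quadruple mutant discussed later) is hypothetical:

```python
def relative_fitness(count_before, count_after, wt="VDGV"):
    """Fitness of each variant from sequencing counts before and after
    selection, normalized so the wild type has fitness 1:
        w_i = (after_i / before_i) / (after_wt / before_wt)
    Using raw counts or frequencies gives the same ratio, since the
    library totals cancel out of the double ratio."""
    wt_ratio = count_after[wt] / count_before[wt]
    return {v: (count_after[v] / count_before[v]) / wt_ratio
            for v in count_before if v in count_after}
```

a variant that doubles its share of the pool while wt stays flat gets fitness 2, and one depleted to half the wt enrichment gets fitness 0.5, matching the beneficial/deleterious classification in the text.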
out of 24 possible direct paths, the number of selectively accessible paths (i.e. with monotonically increasing fitness) varied from 12 to 1 among the 29 subgraphs (figure 1b). in the most extreme case, only one path was accessible from wt to the quadruple mutant wlfa (figure 1a). we also observed a substantial skew in the computed probability of realization among accessible direct paths (figure 1-figure supplement 5), suggesting that most of the realizations in adaptation were captured by a small fraction of possible trajectories (weinreich et al., 2006). these results indicated the existence of sign epistasis and reciprocal sign epistasis, both of which may constrain the accessibility of direct paths (weinreich et al., 2006; tufts et al., 2015). indeed, we found that these two types of epistasis were prevalent in our fitness landscape (figure 1c). furthermore, we classified the types of all 24 pairwise epistasis in each subgraph and computed the level of ruggedness as f_sign + 2·f_reciprocal, where f_type is the fraction of each type of pairwise epistasis. as expected, the number of selectively inaccessible direct paths, i.e. paths that involve fitness declines, was found to be positively correlated with the ruggedness induced by pairwise epistasis (figure 1-figure supplement 6, pearson correlation = 0.66, p = 1.0 × 10^−4) (poelwijk et al., 2007). our findings support the view that direct paths of protein adaptation are often constrained by pairwise epistasis on a rugged fitness landscape (weinreich et al., 2005; kondrashov and kondrashov, 2015). in particular, adaptation can be trapped when direct paths are blocked by reciprocal sign epistasis. however, crucially, this analysis was limited to mutational trajectories within a subgraph of the sequence space. in reality, the dimensionality of protein sequence space is higher.
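counting selectively accessible direct paths, those along which fitness increases monotonically at every substitution, is a small combinatorial computation over the orderings of the mutated sites; a sketch under the same definitions (the additive test landscape used below is hypothetical, not the measured gb1 data):

```python
from itertools import permutations

def accessible_direct_paths(fitness, start, end):
    """Count direct paths from start to end: each step substitutes one of
    the differing sites with its end-state amino acid, and a path is
    accessible only if fitness strictly increases at every step."""
    diff_sites = [i for i in range(len(start)) if start[i] != end[i]]
    n_accessible = 0
    for order in permutations(diff_sites):
        seq, prev, ok = list(start), fitness[start], True
        for site in order:
            seq[site] = end[site]
            cur = fitness["".join(seq)]
            if cur <= prev:       # a fitness decline (or plateau) blocks the path
                ok = False
                break
            prev = cur
        if ok:
            n_accessible += 1
    return n_accessible
```

on a purely additive landscape all 4! = 24 orderings are accessible; making a single intermediate deleterious removes exactly the orderings that pass through it, which is how sign epistasis prunes the count from 24 down toward 1.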
intuitively, when an extra dimension is introduced, a local maximum may become a saddle point and allow for further adaptation - a phenomenon that is also known as "extra-dimensional bypass" (gavrilets, 1997; cariani, 2002; gutiérrez and maere, 2014). with our experimental data, we observed two distinct mechanisms of bypass, either using an extra amino acid at the same site or using an additional site, that allow proteins to continue adaptation when no direct paths were accessible due to reciprocal sign epistasis (figure 2). the first mechanism of bypass, which we termed "conversion bypass", works by converting to an extra amino acid at one of the interacting sites (palmer et al., 2015). consider a simple scenario with only two interacting sites. if the sequence space is limited to 2 amino acids at each site, as in past analyses of adaptive trajectories, the number of neighbors is 2; however, if all 20 possible amino acids were considered, the total number of neighbors would be 38. some of these 36 extra neighbors may lead to potential routes that circumvent the reciprocal sign epistasis (figure 2a). in this case, a successful bypass would require a conversion step that substitutes one of the two interacting sites with an extra amino acid (00 → 20), followed by the loss of this mutation (21 → 11). this bypass is feasible only if the original reciprocal sign epistasis is changed to sign epistasis after the conversion. to test whether such bypasses were present in our system, we randomly sampled 10^5 pairwise interactions from the sequence space and analyzed the ~20,000 reciprocal sign epistasis among them (see materials and methods).

[figure 1 caption: we identified a total of 29 subgraphs in which the quadruple mutant was the only fitness peak. the number of accessible direct paths from wt to the quadruple mutant is shown for each subgraph. the maximum number of direct paths is 24. (c) the fraction of three types of pairwise epistasis around wt (2091 out of 2166), randomly sampled from the entire sequence space (10^5 in total), or in the neighborhood of the top 100 fitness variants and 100 lethal variants. we note that this analysis is different from previous studies on how epistasis changes along adaptive walks, where the quadruples are chosen such that the fitness values of genotype 00, 01 and 11 are in increasing order (greene and crona, 2014). sign epistasis and reciprocal sign epistasis, both of which can block adaptive paths, are prevalent in the fitness landscape. classification scheme of epistasis is shown at the top. each node represents a genotype, which is within a sequence space of two loci and two alleles. green arrows represent the accessible paths from genotype "00" to a beneficial double mutant "11" (colored in red). doi: 10.7554/elife.16965.003 the following figure supplements are available for figure 1.]

[figure 2 caption: conversion may open up potential paths that circumvent the reciprocal sign epistasis. the starting point is 00 and the destination is 11 (in red). green arrows indicate the accessible path. a successful bypass would require a "conversion" step that substitutes one of the two interacting sites with an extra amino acid (00 → 20), followed by the loss of this mutation later (21 → 11). the original reciprocal sign epistasis is changed to sign epistasis on the new genetic background after conversion. (b) among ~20,000 randomly sampled reciprocal sign epistasis, >40% of them can be circumvented by at least one conversion bypass (i.e. success, inset). the number of available bypasses for the success cases is shown as a histogram. (c) the second mechanism of bypass involves an additional site. in this case, adaptation involves a "detour" step to gain a mutation at the third site (000 → 100), followed by the loss of this mutation (111 → 011). the original reciprocal sign epistasis is changed to either magnitude epistasis or sign epistasis on the new genetic background after detour.]
more than 40% of the time there was at least one successful conversion bypass, and in many cases multiple bypasses were available (figure 2b). the second mechanism of bypass, which we termed "detour bypass", involves an additional site (figure 2c). in this case, adaptation can proceed by taking a detour step to gain a mutation at the third site (000 → 100), followed by the later loss of this mutation (111 → 011) (depristo et al., 2007; palmer et al., 2015). detour bypass was observed in our system (figure 2d), but was not as prevalent and had a lower probability of success than conversion bypass. on average, 1.2 conversion bypasses and 0.27 of the 38 possible detour bypasses were available for a chosen reciprocal sign epistasis. we note, however, that the lower prevalence of detour bypass in our fitness landscape (l = 4) does not necessarily mean that it should be expected to be less frequent than conversion bypass in other systems. while the maximum number of possible conversion bypasses is always fixed (19 × 2 − 2 = 36), the maximum number of possible detour bypasses (19 × (l − 2)) is proportional to the sequence length l of the entire protein (whereas our study uses a subset of l = 4 sites). the pervasiveness of extra-dimensional bypasses in our system contrasts with the prevailing view that adaptive evolution is often blocked by reciprocal sign epistasis when only direct paths of adaptation are considered. the two distinct mechanisms of bypass both require the use of indirect paths, where the hamming distance to the destination is either unchanged (conversion) or increased (detour). in order to circumvent the inaccessible direct paths via extra dimensions, reciprocal sign epistasis must be changed into other types of pairwise epistasis.
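the neighbor and bypass counts above follow from simple combinatorics; a minimal sketch (not part of the paper's code, all function names are our own):

```python
# sketch: neighbor and bypass counts in protein sequence space
# with alphabet size 20 and sequence length L
A = 20  # amino-acid alphabet size

def n_neighbors(L, alphabet=A):
    # each of the L sites can mutate to any of the other (alphabet - 1) residues
    return (alphabet - 1) * L

def max_conversion_bypasses(alphabet=A):
    # a conversion uses an extra amino acid at one of the two interacting sites:
    # (alphabet - 1) alternatives per site, minus the allele already used by the
    # direct path at that site -> 19 * 2 - 2 = 36 for proteins
    return (alphabet - 1) * 2 - 2

def max_detour_bypasses(L, alphabet=A):
    # a detour gains a mutation at any of the (L - 2) non-interacting sites
    return (alphabet - 1) * (L - 2)

print(n_neighbors(2))             # 38 neighbors for 2 sites with all 20 amino acids
print(max_conversion_bypasses())  # 36
print(max_detour_bypasses(4))     # 38 for the l = 4 landscape in this study
```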
for detour bypass, this means that the original reciprocal sign epistasis is changed to either magnitude epistasis or sign epistasis in the presence of a third mutation. we proved that higher-order epistasis is necessary for the scenario in which reciprocal sign epistasis is changed to magnitude epistasis, as well as for one of the two scenarios in which reciprocal sign epistasis is changed to sign epistasis (see materials and methods). this suggests a critical role of higher-order epistasis in mediating detour bypass. to confirm the presence of higher-order epistasis, we decomposed the fitness landscape by fourier analysis (see materials and methods, figure 3-figure supplement 1) (weinreich et al., 2013; neidhart et al., 2013). the fourier coefficients can be interpreted as epistatic interactions of different orders (weinreich et al., 2013; de visser and krug, 2014), including the main effects of single mutations (the first order), pairwise epistasis (the second order), and higher-order epistasis (the third and the fourth order). the fitness of variants can be reconstructed by expansion of fourier coefficients up to a certain order (figure 3-figure supplement 2). in our system with four sites, the fourth-order fourier expansion will always reproduce the measured fitness (i.e. the fraction of variance in fitness explained equals 1). when the second-order fourier expansion does not reproduce the measured fitness, it indicates the presence of higher-order epistasis. in this way, we identified the 0.1% of subgraphs with the greatest fitness contribution from higher-order epistasis (figure 3a, red lines) and visualized the corresponding quadruple mutants by the sequence logo plot (figure 3b). the skewed composition of amino acids in these subgraphs indicates that higher-order interactions are enriched among specific amino acid combinations of sites 39, 41 and 54.
this interaction among three sites is consistent with our knowledge of the protein structure, where the side chains of sites 39, 41, and 54 can physically interact with each other at the core (figure 1-figure supplement 1a) and destabilize the protein due to steric effects (figure 3-figure supplement 3). in the presence of higher-order epistasis, epistasis between any two sites would vary across different genetic backgrounds. we computed the magnitude of pairwise epistasis (ε) between each pair of amino acid substitutions (see materials and methods) (khan et al., 2011), and observed numerous instances where the sign of pairwise epistasis depended on the genetic background. for example, g41l and v54h were positively epistatic when site 39 was isoleucine [i], but the interaction changed to negative epistasis when site 39 carried a tyrosine [y] or a tryptophan [w] (figure 3c-d). similar patterns were observed in other pairwise interactions among sites 39, 41 and 54, such as g41f/v54a and v39w/v54h (figure 3-figure supplement 4). the observed pattern of higher-order epistasis was consistent with the results of the fourier analysis (figure 3b). for example, site 40 was mostly excluded from higher-order epistasis; tyrosine [y] or tryptophan [w] at site 39 were involved in the most significant higher-order interactions, as they often changed the sign of pairwise epistasis. higher-order epistasis can also switch the type of pairwise epistasis, such as shifting from reciprocal sign epistasis to magnitude or sign epistasis (figure 3-figure supplement 5), which in turn is important for the existence of detour bypass. our analysis of circumventing reciprocal sign epistasis revealed how indirect paths could open up new avenues of adaptation. to study the impact of indirect paths at a global scale, we performed simulated adaptation in the entire sequence space of 160,000 variants. the fitness landscape was completed by imputing fitness values of the 10,639 missing variants (i.e.
6.6% of the sequence space) that had fewer than 10 sequencing read counts in the input library. our model of protein fitness incorporated the main effects of single mutations, pairwise interactions, and three-way interactions among sites 39, 41 and 54 (see materials and methods, figure 4-figure supplement 1). we used predictor selection based on biological knowledge, followed by regularized regression, an approach that has been demonstrated to ameliorate possible bias in the inferred fitness landscape (otwinowski and plotkin, 2014). in the complete sequence space, we identified a total of 30 fitness peaks (i.e. local maxima); among them, 15 peaks had fitness larger than wt and their combined basins of attraction covered 99% of the sequence space (figure 4a). we then simulated adaptation on the fitness landscape using three different models of adaptive walks (see materials and methods), namely the greedy model (de visser and krug, 2014), the correlated fixation model (gillespie, 1984), and the equal fixation model (weinreich et al., 2006). in the greedy model, adaptation proceeds by sequential fixation of the mutation that renders the largest fitness gain at each step. the other two models assign a nonzero fixation probability to all beneficial mutations, either weighted by (correlated fixation model) or independent of (equal fixation model) the relative fitness gain. the greedy model represents adaptive evolution of a large population with pervasive clonal interference (de visser and krug, 2014). the correlated fixation model represents adaptive evolution of a population under the scheme of strong-selection/weak-mutation (sswm) (gillespie, 1984), which assumes that the time to fixation is much shorter than the time between mutations and that the fixation probability of a given mutation is proportional to the improvement in fitness. the equal fixation model represents a simplified scenario of adaptation where all beneficial mutations fix with equal probability (weinreich et al., 2006).
among all the possible adaptive paths to fitness peaks, many involved indirect paths, i.e. they employed mechanisms of extra-dimensional bypass (figure 4b, figure 4-figure supplement 2). we classified each step on the adaptive paths into three categories based on the change of hamming distance to the destination (a fitness peak, in this case): "towards (-1)", "conversion (0)", and "detour (+1)" (figure 4c). conversion was found to be pervasive during adaptation in our fitness landscape (17% of mutational steps for the greedy model, 41% for the correlated fixation model, 59% for the equal fixation model). the use of detour was less frequent (0.1% of mutational steps for the greedy model, 1.3% for the correlated fixation model, 3.7% for the equal fixation model), in accordance with the previous observation that detour bypass was less available than conversion bypass in our fitness landscape with l = 4. a conversion step increases the length of an adaptive path by 1, while a detour step increases the length by 2. as a result, an indirect path can be substantially longer than a direct path consisting of only "towards" steps. we found that many of the adaptive paths required more than 4 steps, which was the maximal length of a direct path between any variants in this landscape (figure 4d). interestingly, because indirect adaptive paths involved more variants of intermediate fitness, the use of conversion and detour steps depended on the strength of selection. consistent with previous studies (orr, 2002, 2003), when mutations conferring larger fitness gains were more likely to fix (e.g. in the greedy model and the correlated fixation model), adaptation favored direct moves toward the destination, thus leading to shorter adaptive paths (figure 4c-d). this suggests that the strength of selection interacts with the topological structure of fitness landscapes to determine the length and directness of evolutionary trajectories.
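the step classification above depends only on how each mutation changes the hamming distance to the destination; a minimal illustration (function names are ours, not the paper's):

```python
# sketch: classify each step of an adaptive path as "towards", "conversion",
# or "detour" by the change in hamming distance to the destination
def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def classify_step(current, nxt, destination):
    delta = hamming(nxt, destination) - hamming(current, destination)
    return {-1: "towards", 0: "conversion", +1: "detour"}[delta]

# example: the conversion bypass 00 -> 20 -> 21 -> 11, with "2" an extra allele
path = ["00", "20", "21", "11"]
steps = [classify_step(path[i], path[i + 1], "11") for i in range(len(path) - 1)]
print(steps)  # ['conversion', 'towards', 'towards']
```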
given that extra-dimensional bypasses can help proteins avoid evolutionary traps, we expect that their existence would facilitate adaptation in rugged fitness landscapes. indeed, we found that indirect paths increased the number of genotypes with access to each fitness peak (figure 4e). in addition, the fraction of genotypes with accessible paths to all 15 fitness peaks increased from 34% to 93% when indirect adaptive paths were allowed (figure 4-figure supplement 2c). we also found that a substantial fraction of beneficial variants (fitness > 1) in the sequence space were accessible from wt only if indirect paths were used (figure 4f). we repeated the analysis in figure 4f with consideration of the constraints imposed by the standard genetic code (figure 4-figure supplement 3a). the constraints from the genetic code decreased the number of accessible variants due to the reduction in connectivity. however, this reduction in connectivity did not alter our core finding that indirect paths substantially increase evolutionary accessibility (figure 4-figure supplement 3b). taken together, these results suggest that indirect paths promote evolutionary accessibility in rugged fitness landscapes. this enhanced accessibility would allow proteins to explore more of the sequence space and lead to delayed commitment to evolutionary fates (i.e. fitness peaks) (palmer et al., 2015). consistent with this expectation, our simulations showed that many mutational trajectories involving extra-dimensional bypass did not fully commit to a fitness peak until the last two steps (figure 4-figure supplement 4). in our analysis, we have limited adaptation to the regime where fitness increases monotonically via sequential fixation of one-step beneficial mutants.
when this assumption is relaxed, adaptation can sometimes proceed by crossing fitness valleys, for example via genetic drift or recombination (de visser and krug, 2014; weissman et al., 2009; ostman et al., 2012; poelwijk et al., 2007; weissman et al., 2010). another simplification in most of our analyses is to treat all sequences in a "protein space" (smith, 1970), where two sequences are considered neighbors if they differ by a single amino-acid substitution. in practice, amino acid substitutions occurring via a single nucleotide mutation are limited by the genetic code, so the total number of one-step neighbors would be smaller (figure 4-figure supplement 3). we also expect fitness landscapes of different systems to have different topological structures. even in our system (with >93% coverage of the genotype space), the global structure of the fitness landscape is influenced by the imputed fitness values of missing variants, which can vary when different fitness models or fitting methods are used. our analysis also ignored measurement errors, but these are expected to be very small due to the high reproducibility of the data (figure 1-figure supplement 3b). both the imputation of missing variants and measurement errors can lead to slight mis-specification of the topological structure of the fitness landscape. finally, we note that the four amino acids chosen in our study are in physical proximity and have strong epistatic interactions. while the availability of conversion bypass depends only on the dimensionality at each site, the degree of higher-order epistasis and the availability of detour bypasses can be quite different in other fitness landscapes. although the details of a particular fitness landscape can influence the quantitative role of different bypass mechanisms, this does not undermine the generality of our conceptual findings on extra-dimensional bypass, higher-order epistasis, and their roles in protein evolution.
higher-order epistasis has been reported in a few biological systems (wang et al., 2013; pettersson et al., 2011; palmer et al., 2015) , and is likely to be common in nature (weinreich et al., 2013) . in this study, we observed the presence of higher-order epistasis and systematically quantified its contribution to protein fitness. our results suggest that higher-order epistasis can either increase or decrease the ruggedness induced by pairwise epistasis, which in turn determines the accessibility of direct paths in a rugged fitness landscape (figure 3-figure supplement 6). we also revealed the important role of higher-order epistasis in mediating detour bypass, which could promote evolutionary accessibility via indirect paths. our work demonstrates that even in the most rugged regions of a protein fitness landscape, most of the sequence space can remain highly accessible owing to the indirect paths opened up by high dimensionality. the enhanced accessibility mediated by indirect paths may provide a partial explanation for some observations in viral evolution. for example, throughout the course of infection hiv always seems to find a way to revert to the global consensus sequence, a putatively "optimal" hiv-1 sequence after immune invasion (zanini et al., 2015) . as we pointed out, the possible number of detour bypasses scales up with sequence length, so it will be interesting to study how extra-dimensional bypass influences adaptation in sequence space of even higher dimensionality. for example, it is plausible that the sequence of a large protein may never be trapped in adaptation (gavrilets, 1997) , so that adaptive accessibility becomes a quantitative rather than qualitative problem. given the continuing development of sequencing technology, we anticipate that the scale of experimentally determined fitness landscapes will further increase, yet the full protein sequence space is too huge to be mapped exhaustively. 
does this mean that we will never be able to understand the full complexity of fitness landscapes? or perhaps big data from high-throughput measurements will guide us to find general rules? by coupling state-of-the-art experimental techniques with novel quantitative analysis of fitness landscapes, this work takes the optimistic view that we can push the boundary further and discover new mechanisms underlying evolution (fisher et al., 2013; desai, 2013; szendro et al., 2013).

two oligonucleotides (integrated dna technologies, coralville, ia), 5'-agt cta gta tcc aac ggc nns nns nnk gaa tgg acc tac gac gac gct acc aaa acc tt-3' and 5'-ttg taa tcg gat cct ccg gat tcg gtm nnc gtg aag gtt ttg gta gcg tcg tcg t-3', were annealed by heating to 95˚c for 5 min and cooling to room temperature over 1 hr. the annealed oligonucleotide was extended in a reaction containing 0.5 mm of each oligonucleotide, 50 mm nacl, 10 mm tris-hcl ph 7.9, 10 mm mgcl2, 1 mm dtt, 250 mm each dntp, and 50 units klenow exo- (new england biolabs, ipswich, ma) for 30 min at 37˚c. the product (cassette i) was purified with the purelink pcr purification kit (life technologies, carlsbad, ca) according to the manufacturer's instructions. a constant region was generated by pcr amplification using kod dna polymerase (emd millipore, billerica, ma) with 1.5 mm mgso4, 0.2 mm of each dntp (datp, dctp, dgtp, and dttp), 0.05 ng protein gb1 wild type (wt) template, and 0.5 mm each of 5'-ttc taa tac gac tca cta tag gga caa tta cta ttt aca tat cca cca tg-3' and 5'-agt cta gta tcc tcg acg ccg ttg tcg tta gcg tac tgc-3'. the sequence of the wt template consisted of a t7 promoter, 5' utr, the coding sequence of protein gb1, 3' poly-gs linkers, and a flag-tag (figure 1-figure supplement 1b). the thermocycler was set as follows: 2 min at 95˚c, then 18 three-step cycles of 20 s at 95˚c, 15 s at 58˚c, and 20 s at 68˚c, and a 1 min final extension at 68˚c.
the product (constant region) was purified with the purelink pcr purification kit (life technologies) according to the manufacturer's instructions. both the purified constant region and cassette i were digested with bcivi (new england biolabs) and purified with the purelink pcr purification kit (life technologies) according to the manufacturer's instructions. ligation between the constant region and cassette i (molar ratio of 1:1) was performed using t4 dna ligase (new england biolabs). agarose gel electrophoresis was performed to separate the ligated product from the reactants. the ligated product was purified from the agarose gel using the zymoclean gel dna recovery kit (zymo research, irvine, ca) according to the manufacturer's instructions. pcr amplification was then performed using kod dna polymerase (emd millipore) with 1.5 mm mgso4, 0.2 mm of each dntp (datp, dctp, dgtp, and dttp), 4 ng of the ligated product, and 0.5 mm each of 5'-ttc taa tac gac tca cta tag gga caa tta cta ttt aca tat cca cca tg-3' and 5'-gga gcc gct acc ctt atc gtc gtc atc ctt gta atc gga tcc tcc gga ttc-3'. the thermocycler was set as follows: 2 min at 95˚c, then 10 three-step cycles of 20 s at 95˚c, 15 s at 56˚c, and 20 s at 68˚c, and a 1 min final extension at 68˚c. the product, which is referred to as the "dna library", was purified with the purelink pcr purification kit (life technologies) according to the manufacturer's instructions. affinity selection by mrna display (roberts and szostak, 1997; olson et al., 2012) was performed as described (figure 1-figure supplement 3a). briefly, the dna library was transcribed by t7 rna polymerase (life technologies) according to the manufacturer's instructions. ligation was performed using 1 nmol of mrna, 1.1 nmol of 5'-ttt ttt ttt ttt gga gcc gct acc-3', and 1.2 nmol of 5'-/5phos/-d(a)21-(c9)3-d(acc)-puromycin with t4 dna ligase (new england biolabs) in a 100 ml reaction.
the ligated product was purified by urea page and translated in a 100 ml reaction volume using the retic lysate ivt kit (life technologies) according to the manufacturer's instructions, followed by incubation with a 500 mm final concentration of kcl and a 60 mm final concentration of mgcl2 for at least 30 min at room temperature to increase the efficiency of fusion formation (liu et al., 2000). the mrna-protein fusion was then purified using anti-flag m2 affinity gel (sigma-aldrich, st. louis, mo). elution was performed using 3x flag peptide (sigma-aldrich). the purified mrna-protein fusion was reverse transcribed using superscript iii reverse transcriptase (life technologies). this reverse transcribed product, which was referred to as the "input library", was incubated with pierce streptavidin agarose (sa) beads (life technologies) that were conjugated with biotinylated human igg-fc (rockland immunochemicals, limerick, pa). after washing, the immobilized mrna-protein fusion was eluted by heating to 95˚c. the eluted sample was referred to as the "selected library". pcr amplification was performed using kod dna polymerase (emd millipore) with 1.5 mm mgso4, 0.2 mm of each dntp (datp, dctp, dgtp, and dttp), the selected library, and 0.5 mm each of 5'-cta cac gac gct ctt ccg atc tnn nag cag tac gct aac gac aac g-3' and 5'-tgc tga acc gct ctt ccg atc tnn nta atc gga tcc tcc gga ttc g-3'. the underlined "nnn" indicates the position of the multiplex identifier, gtg for the input library and tgt for the post-selection library. the thermocycler was set as follows: 2 min at 95˚c, then 10 to 12 three-step cycles of 20 s at 95˚c, 15 s at 56˚c, and 20 s at 68˚c, and a 1 min final extension at 68˚c.
the product was then pcr amplified again using kod dna polymerase (emd millipore) with 1.5 mm mgso4, 0.2 mm of each dntp (datp, dctp, dgtp, and dttp), the eluted product from mrna display, and 0.5 mm each of 5'-aat gat acg gcg acc acc gag atc ta cac tct ttc cct aca cga cgc tct tcc g-3' and 5'-caa gca gaa gac ggc ata cga gat cgg tct cgg cat tcc tgc tga acc gct ctt ccg-3'. the thermocycler was set as follows: 2 min at 95˚c, then 10 to 12 three-step cycles of 20 s at 95˚c, 15 s at 56˚c, and 20 s at 68˚c, and a 1 min final extension at 68˚c. the pcr product was then subjected to 2 × 100 bp paired-end sequencing on the illumina hiseq 2500 platform. we aimed to obtain at least 20 million paired-end reads for each of the input library and the post-selection library, such that the average coverage for each variant would be more than 100 paired-end reads. there were 89,075,246 paired-end reads obtained for the input library and 45,587,128 paired-end reads obtained for the post-selection library. raw sequencing data have been submitted to the nih short read archive under accession number bioproject prjna278685. we were able to compute the fitness for 93.4% of all variants from the sequencing data. the fitness measurements in this study were highly consistent with our previous study on the fitness of single and double mutants in protein gb1 (pearson correlation = 0.97, figure 1-figure supplement 3b). the first three nucleotides of both the forward read and the reverse read were used for demultiplexing. if the first three nucleotides of the forward read differed from those of the reverse read, the paired-end read was discarded. for both the forward read and the reverse read, the nucleotides corresponding to the codons of protein gb1 sites 39, 40, 41, and 54 were extracted. if the coding sequences of sites 39, 40, 41, and 54 in the forward read and in the reverse read did not reverse-complement each other, the paired-end read was discarded.
subsequently, the occurrences of individual variants at the amino acid level for sites 39, 40, 41, and 54 in both the input library and the selected library were counted, with each paired-end read representing 1 count. custom python scripts and bash scripts were used for sequencing data processing. all scripts have been deposited to https://github.com/wchnicholas/proteingfourmutants. the fitness (w) for a given variant i was computed as:

w_i = (count_{i,selected} / count_{i,input}) / (count_{wt,selected} / count_{wt,input})

where count_{i,selected} represents the count of variant i in the selected library, count_{i,input} represents the count of variant i in the input library, count_{wt,selected} represents the count of wt (vdgv) in the selected library, and count_{wt,input} represents the count of wt (vdgv) in the input library. therefore, the fitness of each variant, w_i, can be viewed as the fitness relative to wt (vdgv), such that w_wt = 1. variants with count_input < 10 were filtered out to reduce noise. the fraction of all possible variants that passed this filter was 93.4% (149,361 out of 160,000 possible variants). the fitness of each single substitution variant was referenced to our previous study, because the sequencing coverage of single substitution variants in our previous study was much higher than in this study (~100-fold higher); hence, our confidence in computing fitness for a single substitution variant is also much higher in our previous study than in this study. the fitness of each single substitution in this study was therefore calculated by multiplying the fitness of that single substitution computed from our previous study by a factor of 1.159. this factor is based on a linear regression analysis between the single substitution fitness as measured in our previous study and in this study, which had a slope of 1.159 and a y-intercept of ~0. the fitness of each profiled variant is shown in supplementary file 1.
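the fitness definition above can be sketched in a few lines (the counts below are illustrative, not the paper's data):

```python
# sketch of the relative-enrichment fitness calculation described above
def fitness(count_sel, count_in, wt_sel, wt_in):
    # enrichment of variant i divided by the enrichment of wt (vdgv)
    return (count_sel / count_in) / (wt_sel / wt_in)

def passes_filter(count_in, threshold=10):
    # variants with fewer than 10 input reads were filtered out to reduce noise
    return count_in >= threshold

wt_sel, wt_in = 5000, 1000
print(fitness(wt_sel, wt_in, wt_sel, wt_in))  # 1.0 by construction for wt
print(fitness(250, 1000, wt_sel, wt_in))      # 0.05: a strongly deleterious variant
print(passes_filter(9))                       # False
```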
the three types of pairwise epistasis (magnitude, sign and reciprocal sign) were classified by ranking the fitness of the four variants involved (greene and crona, 2014). to quantify the magnitude of epistasis (ε) between substitutions a and b on a given background variant bg, the relative epistasis model (khan et al., 2011) was employed:

ε_{ab,bg} = ln(w_{ab}/w_{bg}) − ln(w_a/w_{bg}) − ln(w_b/w_{bg})

where w_{ab} represents the fitness of the double substitution, w_a and w_b represent the fitness of each of the single substitutions, and w_{bg} represents the fitness of the background variant. as described previously, there is a limitation in determining the exact fitness of very low-fitness variants in this system. to account for this limitation, several rules were adapted from our previous study to minimize potential artifacts in determining ε. we previously determined that the detection limit of fitness (w) in this system is ~0.01.

rule 1) if max(w_{ab}/w_{bg}, w_a/w_{bg}, w_b/w_{bg}) < 0.01, then ε_{ab,bg,adjusted} = 0
rule 2) if min(w_a, w_b, w_a/w_{bg}, w_b/w_{bg}) < 0.01, then ε_{ab,bg,adjusted} = max(0, ε_{ab,bg})
rule 3) if min(w_{ab}, w_{ab}/w_{bg}) < 0.01, then ε_{ab,bg,adjusted} = min(0, ε_{ab,bg})

rule 1 prevents epistasis from being artificially estimated from low-fitness variants. rule 2 prevents overestimation of epistasis due to low fitness of one of the two single substitutions. rule 3 prevents underestimation of epistasis due to low fitness of the double substitution. of note, ε_{ab,bg,adjusted} was set to 0 if both rule 2 and rule 3 were satisfied. to compute the epistasis between two substitutions a and b on a given background variant bg, ε_{ab,bg,adjusted} was used if any one of the above three rules was satisfied; otherwise, ε_{ab,bg} was used. fitness decomposition was performed on all subgraphs without missing variants (109,235 subgraphs in total).
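a sketch of the relative epistasis calculation with the three adjustment rules; the ε formula here is our reading of the relative epistasis model described above, and the fitness values are illustrative:

```python
import math

LIMIT = 0.01  # approximate detection limit of fitness in this system

def epsilon(w_ab, w_a, w_b, w_bg):
    # relative epistasis between substitutions a and b on background bg
    return math.log(w_ab / w_bg) - math.log(w_a / w_bg) - math.log(w_b / w_bg)

def epsilon_adjusted(w_ab, w_a, w_b, w_bg):
    e = epsilon(w_ab, w_a, w_b, w_bg)
    if max(w_ab / w_bg, w_a / w_bg, w_b / w_bg) < LIMIT:
        return 0.0  # rule 1: all variants near the detection limit
    rule2 = min(w_a, w_b, w_a / w_bg, w_b / w_bg) < LIMIT
    rule3 = min(w_ab, w_ab / w_bg) < LIMIT
    if rule2 and rule3:
        return 0.0
    if rule2:
        return max(0.0, e)  # rule 2: avoid overestimating epistasis
    if rule3:
        return min(0.0, e)  # rule 3: avoid underestimating epistasis
    return e

# double mutant fitter than expected from the singles -> positive epistasis
print(epsilon_adjusted(w_ab=0.8, w_a=0.4, w_b=0.4, w_bg=1.0))  # ln(5), about 1.61
```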
we decomposed the fitness landscape into epistatic interactions of different orders by fourier analysis (stadler, 1996; szendro et al., 2013; weinreich et al., 2013; neidhart et al., 2013). the fourier coefficients given by the transform can be interpreted as epistasis of different orders (weinreich et al., 2013; de visser and krug, 2014). for a binary sequence z of dimension l (z_i equals 1 if a mutation is present at position i, or 0 otherwise), the fourier decomposition theorem states that the fitness function f(z) can be expressed as (weinberger, 1991):

f(z) = Σ_k f̂_k Π_{i=1..l} s_i^{k_i}, where s_i = (−1)^{z_i} ∈ {+1, −1}

the formula for the fourier coefficients f̂_k is then:

f̂_k = 2^(−l) Σ_z f(z) Π_{i=1..l} s_i^{k_i}

for example, we can expand the fitness landscape up to the second order, i.e. with linear and quadratic terms:

f(z) ≈ f̂_0 + Σ_i f̂_i s_i + Σ_{i<j} f̂_{ij} s_i s_j

in our analysis of subgraphs, there are a total of 2^4 = 16 terms in the fourier decomposition, with C(4, i) terms for the i-th order (i = 0, 1, 2, 3, 4). we can expand the fitness landscape up to a given order by ignoring all higher-order terms in equation 3. in this paper, we refer to higher-order epistasis as a non-zero contribution to fitness from the third-order terms and beyond. the fitness values of 10,639 variants (6.6% of the entire sequence space) were not directly measured (read count in the input pool = 0) or were filtered out because of low read counts in the input pool (see section "calculation of fitness"). to impute the fitness of these missing variants, we performed regularized regression on the fitness values of observed variants using the following model (hinkley et al., 2011; otwinowski and plotkin, 2014):

ln f = a_0 + Σ_i b_i m_i + Σ_j g_j p_j + Σ_k d_k t_k

here, f is the protein fitness. a_0 is the intercept, which represents the log fitness of wt; b_i represents the main effect of a single mutation i; m_i is a dummy variable that equals 1 if the single mutation i is present in the sequence, or 0 if the single mutation is absent; and n_m = 19 × C(4, 1) = 76 is the total number of single mutations.
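the fourier (walsh) coefficients described above can be computed directly on a toy landscape; a sketch assuming the s_i = (−1)^{z_i} basis, with l = 2 for brevity (the paper's subgraphs use l = 4 and 16 terms):

```python
from itertools import product

# sketch: fourier (walsh) decomposition of a small binary fitness landscape
L = 2
genotypes = list(product([0, 1], repeat=L))

def coefficient(f, k):
    # f-hat_k = 2^-L * sum_z f(z) * prod_i s_i^{k_i}, with s_i = (-1)^{z_i}
    total = 0.0
    for z in genotypes:
        sign = 1
        for zi, ki in zip(z, k):
            if ki:
                sign *= (-1) ** zi
        total += f[z] * sign
    return total / 2 ** L

# a landscape with pairwise epistasis: f(1,1) deviates from additivity
f = {(0, 0): 1.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 1.5}
coeffs = {k: coefficient(f, k) for k in genotypes}
print(coeffs[(1, 1)])  # nonzero second-order coefficient -> pairwise epistasis
```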
similarly, g_j represents the effect of interaction between a pair of mutations; p_j is a dummy variable that equals either 1 or 0 depending on the presence of those two mutations; and n_p = 19^2 × C(4, 2) = 2166 is the total number of possible pairwise interactions. in addition to the main effects of single mutations and pairwise interactions, the three-way interactions among sites 39, 41 and 54 are included in the model, based on our knowledge of higher-order epistasis (figure 3). d_k represents the effect of three-way interactions among sites 39, 41 and 54; t_k is a dummy variable that equals either 1 or 0 depending on the presence of that three-way interaction; and n_t = 19^3 = 6859 is the total number of three-way interactions. thus, the total number of coefficients in this model is 9102, including main effects of each site (i.e. additive effects), interactions between pairs of sites (i.e. pairwise epistasis), and a subset of three-way interactions (i.e. higher-order epistasis). out of the 149,361 variants with experimentally measured fitness values, 119,884 variants were non-lethal (f > 0) and were used to fit the model coefficients using lasso regression (matlab r2014b). lasso regression adds a penalty term λ Σ |β| (where β stands for any coefficient in the model) when minimizing the least squares, thus favoring sparse solutions for the coefficients (figure 4-figure supplement 1b). we calculated the 10-fold cross-validation mse (mean squared error) of the lasso regression for a wide range of the penalty parameter λ (figure 4-figure supplement 1a), and λ = 10^−4 was chosen. for measured variants, the model-predicted fitness values were highly correlated with the actual fitness values (pearson correlation = 0.93, figure 4-figure supplement 1c). we then used the fitted model to impute the fitness of the 10,639 missing variants and complete the entire fitness landscape. imputed fitness values for missing variants are listed in supplementary file 2.
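as a sanity check on the model size quoted above, the coefficient counts can be reproduced in a few lines (a sketch, not the paper's matlab code):

```python
from itertools import combinations

# counting the regression model's coefficients: intercept, main effects,
# pairwise terms, and the three-way terms for sites 39/41/54
L, A = 4, 20
n_single = (A - 1) * L                                        # 76 main effects
n_pair = (A - 1) ** 2 * len(list(combinations(range(L), 2)))  # 2166 pairwise terms
n_triple = (A - 1) ** 3                                       # 6859 three-way terms
n_coeff = 1 + n_single + n_pair + n_triple                    # + intercept
print(n_single, n_pair, n_triple, n_coeff)  # 76 2166 6859 9102
```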
simulating adaptation using three models for fixation

the python package "networkx" was employed to construct a directed graph representing the entire fitness landscape for sites 39, 40, 41, and 54. a total of 20^4 = 160,000 nodes were present in the directed graph, where each node represented a 4-site variant. for all pairs of variants separated by a hamming distance of 1, a directed edge was generated from the variant with the lower fitness to the variant with the higher fitness. therefore, all successors of a given node have a higher fitness than that node. a fitness peak was defined as a node with out-degree 0. three models, namely the greedy model (de visser and krug, 2014), the correlated fixation model (gillespie, 1984), and the equal fixation model (weinreich et al., 2006), were employed in this study to simulate the mutational steps in adaptive trajectories. under all three models, the probability of fixation of a deleterious or neutral mutation is 0. consider a mutational trajectory at a node n_i with fitness w_i, where n_i has m successors (n_1, n_2, ... n_m) with fitness values (w_1, w_2, ... w_m). the probability that the next mutational step is from n_i to n_k, where k ∈ (1, 2, ... m), is denoted p_{i→k} and called the probability of fixation; it is computed for each model as follows.

for the greedy model (deterministic): p_{i→k} = 1 if w_k = max(w_1, w_2, ... w_m), and p_{i→k} = 0 otherwise.

for the correlated fixation model (non-deterministic): p_{i→k} = (w_k − w_i) / Σ_{j=1..m} (w_j − w_i).

for the equal fixation model (non-deterministic): p_{i→k} = 1/m.

to compute the shortest path from a given variant to all reachable variants, the function "single_source_shortest_path" in "networkx" was used. if no shortest path exists between a low-fitness variant and a high-fitness variant, the high-fitness variant is inaccessible.
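the three fixation rules can be sketched as follows; the greedy rule follows the text directly, while the correlated and equal rules are written from their descriptions (fitness gains normalized to probabilities, and uniform weights, respectively):

```python
# sketch of the three fixation models on a toy set of successors
def fixation_probs(w_i, successor_fitness, model):
    # successors all have fitness > w_i: deleterious/neutral moves never fix
    ws = successor_fitness
    if model == "greedy":
        best = max(ws)
        return [1.0 if w == best else 0.0 for w in ws]
    if model == "correlated":
        gains = [w - w_i for w in ws]
        total = sum(gains)
        return [g / total for g in gains]
    if model == "equal":
        return [1.0 / len(ws)] * len(ws)
    raise ValueError(model)

ws = [1.2, 1.6]  # two beneficial successors of a node with fitness 1.0
print(fixation_probs(1.0, ws, "greedy"))      # [0.0, 1.0]
print(fixation_probs(1.0, ws, "correlated"))  # approximately [0.25, 0.75]
print(fixation_probs(1.0, ws, "equal"))       # [0.5, 0.5]
```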
If the length of the shortest path is larger than the Hamming distance between two variants, it means that adaptation requires indirect paths. Under constraints imposed by the standard genetic code, the connectivity of the directed graph that represented the fitness landscape was restricted according to the matrix shown in Figure 4-figure supplement 3A. The genetic distance between two variants was calculated according to the same matrix, and if the length of the shortest path is larger than the genetic distance between two variants, adaptation again requires indirect paths. In the subgraph analysis shown in Figure 1-figure supplement 4, the fitness landscape was restricted to 2 amino acids at each of the 4 sites (the WT and adapted alleles), giving a total of 2⁴ = 16 variants, hence nodes, in a given subgraph. Only those subgraphs where the fitness of all variants was measured directly were used (i.e. any subgraph with missing variants was excluded from this analysis). Mutational trajectories were generated in the same manner as in the analysis of the entire fitness landscape (see the subsection "Simulating adaptation using three models for fixation"). In a subgraph with only one fitness peak, the probability of a mutational trajectory from node i to node j via intermediates a, b, and c was the product of the per-step fixation probabilities,

    P(i → a → b → c → j) = p_{i→a} × p_{a→b} × p_{b→c} × p_{c→j}.

To compute the Gini index for a given set of mutational trajectories from node i to node j, the probabilities of all possible mutational trajectories were sorted from large to small; inaccessible trajectories were also included in this sorted list, with a probability of 0. This sorted list of t trajectories is denoted (p_{i→j,1}, p_{i→j,2}, ..., p_{i→j,t}), where p_{i→j,1} is the largest and p_{i→j,t} the smallest. It was then converted into a list of cumulative probabilities, denoted (a_{i→j,1}, a_{i→j,2}, ..., a_{i→j,t}), where a_{i→j,t} = Σ_{n=1}^{t} p_{i→j,n}.
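The sorted cumulative-probability construction above, together with the index computed from it in the next paragraph, can be sketched as a small self-contained helper (the function name is ours):

```python
import numpy as np

def gini_index(probs):
    """Inequality of a set of trajectory probabilities: sort from large to
    small (inaccessible trajectories enter as 0), accumulate, and combine
    the cumulative sums as described in the text."""
    p = np.sort(np.asarray(probs, dtype=float))[::-1]
    t = len(p)
    a = np.cumsum(p)                 # a_{i->j,1}, ..., a_{i->j,t}
    return (2.0 * a[:-1].sum() + a[-1] - t) / (t - 1)

# All mass on one trajectory gives maximal inequality (index 1);
# a uniform spread over four trajectories gives index 0.
concentrated = gini_index([1.0, 0.0, 0.0, 0.0])
uniform = gini_index([0.25, 0.25, 0.25, 0.25])
```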
The Gini index for the given set was then computed as follows:

    Gini index = [2 × Σ_{n=1}^{t−1} a_{i→j,n} + a_{i→j,t} − t] / (t − 1).    (12)

Sequence logos were generated by WebLogo (http://weblogo.berkeley.edu/logo.cgi) (Crooks et al., 2004). The visualization of the basins of attraction (Figure 4A) was generated using Graphviz with "fdp" as the layout option. The ΔΔG prediction was performed with the ddg_monomer application in the Rosetta software (Das and Baker, 2008), using the parameters from row 16 of Table I in Kellogg et al. (Kellogg et al., 2011).

Here we prove that higher-order epistasis is required for two possible scenarios of extra-dimensional bypass via an additional site (Figure 2-figure supplement 1). For a fitness landscape defined on a Boolean hypercube, we can expand the fitness as a Taylor series (Weinberger, 1991); for instance,

    f_110 = a_0 + a_2 + a_3 + a_23
    f_111 = a_0 + a_1 + a_2 + a_3 + a_12 + a_13 + a_23 + a_123

To prove that higher-order epistasis is present is equivalent to proving that a_123 ≠ 0. The fitness difference between neighbors is visualized by the directed edges that go from the low-fitness variant to the high-fitness variant; thus each edge represents an inequality, and no cyclic paths are allowed in this directed graph. The reciprocal sign epistasis (Figure 2-figure supplement 1A) gives

    000 ← 001 : a_1 < 0    (14)
    000 ← 010 : a_2 < 0    (15)
    001 → 011 : a_2 + a_12 > 0    (16)
    010 → 011 : a_1 + a_12 > 0    (17)

The detour step (000 → 100) and the loss step (111 → 011) are required for extra-dimensional bypass,

    000 → 100 : a_3 > 0    (18)
    011 ← 111 : a_3 + a_13 + a_23 + a_123 < 0    (19)

together with the uphill steps of the bypass itself,

    100 → 101 : a_1 + a_13 > 0    (20)
    100 → 110 : a_2 + a_23 > 0    (21)

Combining inequalities (14) and (20) gives

    a_13 > 0    (22)

Combining inequalities (15) and (21) gives

    a_23 > 0    (23)

Combining the above two inequalities with (18) and (19), we arrive at

    a_123 < 0    (24)

For the scenario in (C), the proof of higher-order epistasis is similar. We have (the yellow edge)

    001 → 101 : a_3 + a_13 > 0    (25)

Combining the above inequality with (15), (19) and (21), we arrive at

    a_123 < 0    (26)

For the scenario in (D), when a_3 + a_13 < 0, all the inequalities can be satisfied with a_123 = 0, so higher-order epistasis is not necessary in this case.

References:
- Scaling laws describe memories of host-pathogen riposte in the HIV population
- Accessibility percolation with backsteps
- Extradimensional bypass
- WebLogo: a sequence logo generator
- Macromolecular modeling with Rosetta
- Empirical fitness landscapes and the predictability of evolution
- Mutational reversions during adaptive protein evolution
- Statistical questions in experimental evolution
- Evolutionary dynamics and statistical physics
- High-resolution mapping of protein sequence-function relationships
- Evolutionary accessibility of mutational pathways
- Two crystal structures of the B1 immunoglobulin-binding domain of streptococcal protein G and comparison with NMR
- Evolution and speciation on holey adaptive landscapes
- Molecular evolution over the mutational landscape
- The changing geometry of a fitness landscape along an adaptive walk
- Modeling the evolution of molecular systems from a mechanistic perspective
- What can we learn from fitness landscapes?
- Experimental illumination of a fitness landscape
- Comparing protein folding in vitro and in vivo: foldability meets the fitness challenge
- A systems analysis of mutational effects in HIV-1 protease and reverse transcriptase
- Capturing the mutational landscape of the beta-lactamase TEM-1
- Comprehensive experimental fitness landscape and evolutionary network for small RNA
- Towards a general theory of adaptive walks on rugged landscapes
- Role of conformational sampling in computing mutation-induced changes in protein structure and stability
- Negative epistasis between beneficial mutations in an evolving bacterial population
- Topological features of rugged fitness landscapes in sequence space
- Exploring the complexity of the HIV-1 fitness landscape
- Reciprocal sign epistasis between frequently experimentally evolved adaptive mutations causes a rugged fitness landscape
- Phase transition for accessibility percolation on hypercubes
- Optimized synthesis of RNA-protein fusions for in vitro protein selection
- Stepwise acquisition of pyrimethamine resistance in the malaria parasite
- The biochemical architecture of an ancient adaptive landscape
- Accessibility percolation and first-passage site percolation on the unoriented binary hypercube
- Epistasis among adaptive mutations in deer mouse hemoglobin
- Exact results for amplitude spectra of fitness landscapes
- Single-round, multiplexed antibody mimetic design through mRNA display
- A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain
- The population genetics of adaptation: the adaptation of DNA sequences
- A minimum on the mean number of steps taken in adaptive walks
- Impact of epistasis and pleiotropy on evolutionary adaptation
- Inferring fitness landscapes by regression produces biased estimates of epistasis
- Quantitative exploration of the catalytic landscape separating divergent plant sesquiterpene synthases
- Comprehensive and quantitative mapping of energy landscapes for protein-protein interactions by rapid combinatorial scanning
- Delayed commitment to evolutionary fate in antibiotic resistance fitness landscapes
- The robustness and evolvability of transcription factor binding sites
- Replication and explorations of high-order epistasis using a large advanced intercross line pedigree
- Rapid construction of empirical RNA fitness landscapes
- Empirical fitness landscapes reveal accessible evolutionary paths
- A quantitative high-resolution genetic profile rapidly identifies sequence determinants of hepatitis C viral fitness and drug sensitivity
- RNA-peptide fusions for the in vitro selection of peptides and proteins
- Exploring protein fitness landscapes by directed evolution
- Crystal structure of the C2 fragment of streptococcal protein G in complex with the Fc domain of human IgG
- Streptococcal protein G: gene structure and protein binding properties
- Natural selection and the concept of a protein space
- Landscapes and their correlation functions
- Evolvability as a function of purifying selection in TEM-1 beta-lactamase
- Quantitative analyses of empirical fitness landscapes
- Hidden randomness between fitness landscapes limits reverse evolution
- The inherent mutational tolerance and antigenic evolvability of influenza hemagglutinin
- Epistasis constrains mutational pathways of hemoglobin adaptation in high-altitude pikas
- Genetic background affects epistatic interactions between two beneficial mutations
- Fourier and Taylor series on fitness landscapes
- Darwinian evolution can follow only very few mutational paths to fitter proteins
- Should evolutionary geneticists worry about higher-order epistasis?
- Perspective: sign epistasis and genetic constraint on evolutionary trajectories
- The rate at which asexual populations cross fitness valleys
- The rate of fitness-valley crossing in sexual populations
- Mechanisms of host receptor adaptation by severe acute respiratory syndrome coronavirus
- High-throughput profiling of influenza A virus hemagglutinin gene at single-nucleotide resolution
- Population genomics of intrapatient HIV-1 evolution

We would like to thank Jesse Bloom and Joshua Plotkin for helpful comments on early versions of the manuscript. NCW was supported by a Philip Whitcome Pre-doctoral Fellowship, an Audree Fowler Fellowship in Protein Science, and a UCLA Dissertation Year Fellowship. LD was supported by an HHMI postdoctoral fellowship from the Jane Coffin Childs Memorial Fund for Medical Research. RS was supported by NIH R01 DE023591. The funders had no role in study design, data collection, analysis, or interpretation, the decision to publish or submit the work for publication, or the preparation of the manuscript.

Author contributions: NCW, conception and design, acquisition of data, analysis and interpretation of data, drafting or revising the article; LD, JOL-S, analysis and interpretation of data, drafting or revising the article; CAO, conception and design, acquisition of data; RS, conception and design, drafting or revising the article.

key: cord-143539-gvt25gac
authors: marmarelis, myrl g.; steeg, greg ver; galstyan, aram
title: latent embeddings of point process excitations
date: 2020-05-05
journal: nan
doi: nan
sha: doc_id: 143539 cord_uid: gvt25gac

When specific events seem to spur others in their wake, marked Hawkes processes enable us to reckon with their statistics. The underdetermined empirical nature of these event-triggering mechanisms hinders estimation in the multivariate setting.
Spatiotemporal applications alleviate this obstacle by allowing relationships to depend only on relative distances in real Euclidean space; we employ the framework as a vessel for embedding arbitrary event types in a new latent space. By performing synthetic experiments on short records, as well as an investigation into options markets and pathogens, we demonstrate that learning the embedding alongside a point-process model uncovers the coherent, rather than spurious, interactions.

The propagation of disease [1], news topics [2], crime patterns [3, 4], neuronal firings [5], and market trade-level activity [6, 7] naturally suit the form of diachronic point processes with an underlying causal-interaction network. Understanding their intrinsic dynamics is of paramount scientific and strategic value: a particular series of discrete options trades may inform an observer of the fluctuating dispositions of market agents; similarly, temporal news-publication patterns may betray an ensuing shift in the public zeitgeist. The spread of a novel pathogen, notably the COVID-19 virus, through disjointed pockets of the globe hints at how it proliferates, and how that might be averted [8]. Practically estimating the N × N possible excitations between each dyad (pair) of event types is untenable without succinct and interpretable parametrizations. How can one possibly disentangle the contributions of hundreds of options trades within each minute, in a myriad of different strike prices and expiration dates, to the Poisson intensity of a particular type of trade? We must envision the right kind of bias for our model. Spatiotemporal domains [9, 3, 10] exploit physical constraints on interaction locality; in essence, the influence one event bears on another is governed solely by their relative positions.

Formulation. Consider a record of n event occurrences (k_i, t_i), i = 1, 2, ..., n, with N marked types k_i ∈ {1, 2, ..., N} at times t_i ∈ [0, T).
We have reason to believe that events with certain marks excite future events of either the same or another type. Multiple such interactions may be present, and we only want to identify those that are warranted by the observed record. A compact latent representation would induce a strong prior on the continuum of allowable interactions. We start with preliminaries. The multivariate intensity function λ(k, t), conditional on the events in [0, t), dictates the instantaneous Poisson frequency: a particular interval [t, t + dt) would expect to witness λ(k, t) dt instances of event type k. We decompose this intensity [24] into self- and cross-excitations and an intrinsic background rate, not knowing a priori which event triggered which:

    λ(k, t) = μ(k) + Σ_{i : t_i < t} h(k, k_i, t − t_i).    (1)

One way to encourage inductive skepticism about apparent interactions in the estimated response function h(k, l, τ) is to constrain its structure. Suppose there exists a latent Euclidean geometry that adequately captures the interactions between event types. We infer distinct embeddings for receiving, x_k ∈ X ⊂ ℝ^m, versus influencing, y_l ∈ Y ⊂ ℝ^m, because otherwise we would constrain our model to symmetrical bidirectional interactions, and that is not sufficiently expressive. Induced by a mapping on the marks, X and Y constitute the discrete embedding of a latent manifold wherein the influence some event type l exerts on k is characterized by ‖y_l − x_k‖². Each response is the result of a dyadic interaction between an event (k_i, t_i) in the past and the potential occurrence of event type k at the present time t. The chosen parametric family for the causal kernel bank entails Gaussian proximities in space and exponential clocks:

    g_r(x, y) = (2πβ_r²)^(−m/2) exp(−‖y − x‖² / (2β_r²)),    (2)
    w_r(τ) = δ_r exp(−δ_r τ).    (3)

Eqs. 2 and 3 form the spatial and temporal basis of the r = 1, 2, ..., R kernels comprising the response function, tentatively produced here and fully expanded in Eq. 8. Later on we introduce the ingredients ξ(l) and γ_r for granularity in the magnitudes.
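To make the decomposition concrete, here is a minimal sketch of evaluating such an intensity under a Gaussian-proximity, exponential-clock kernel bank (all function and argument names are ours, and the normalization refinement introduced later in the text is omitted):

```python
import numpy as np

def gaussian_proximity(x, y, beta2):
    """Gaussian spatial kernel g_r(x, y) in the m-dimensional latent space."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    m = x.shape[0]
    return np.exp(-np.sum((y - x) ** 2) / (2.0 * beta2)) \
        / (2.0 * np.pi * beta2) ** (m / 2.0)

def intensity(k, t, events, X, Y, mu, gamma, beta2, delta, xi):
    """lambda(k, t): background mu(k) plus one excitation term per past
    event and per kernel r; gamma, beta2, delta are length-R arrays."""
    lam = mu[k]
    for l, ti in events:              # events = [(type, time), ...], ti < t
        tau = t - ti
        for r in range(len(gamma)):
            lam += (gamma[r] * xi[l]
                    * gaussian_proximity(X[k], Y[l], beta2[r])
                    * delta[r] * np.exp(-delta[r] * tau))
    return lam

# Example: background only, versus one past event of type 1 exciting type 0.
mu = np.array([0.3, 0.2])
X = np.array([[0.0, 0.0], [1.0, 0.0]])    # reception points
Y = np.array([[0.0, 0.0], [1.0, 0.0]])    # influence points
params = dict(mu=mu, gamma=np.array([0.5]), beta2=np.array([1.0]),
              delta=np.array([2.0]), xi=np.array([1.0, 1.0]))
lam0 = intensity(0, 1.0, [], X, Y, **params)
lam1 = intensity(0, 1.0, [(1, 0.5)], X, Y, **params)
```

With no past events the intensity reduces to the background rate, and each arrival adds a nonnegative bump that decays at rate δ_r.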
A generalized Poisson process yields a clean log-likelihood function, written as follows:

    log L = Σ_{j=1}^{n} log λ(k_j, t_j) − Σ_{k=1}^{N} ∫_0^T λ(k, t) dt.    (4)

However, direct optimization on the basis of its gradient has proven unwieldy in anything other than deep general models. Even worse, its current form is not amenable to analysis. Suppose we happened upon the expected branching structure [12] of the realized point process. In other words, we introduce latent variables [p̄_{ijr}] ∈ [0, 1]^(n×n×R) holding expectation estimates of p_{ijr} ∈ {0, 1}, which indicate whether it was the event instance i that triggered instance j, and attribute responsibility to kernel basis r. Knowledge of the (untenable) true line of causation endows us with the so-called complete-data log-likelihood, termed log L_c [10]: an expectation of the joint log-probability density of the record and the latent variables [p_{ijr}] in terms of their probabilities [p̄_{ijr}]. By abuse of notation, let h_r(···) denote the r-th response kernel:

    log L_c = Σ_j [ p̄_{bj} log μ(k_j) + Σ_{i,r} p̄_{ijr} log h_r(k_j, k_i, t_j − t_i) ] − Σ_k ∫_0^T λ(k, t) dt.    (5)

Note that, in the above form, what was previously a logarithm of summations (see Eqs. 4 and 1) is replaced by a weighted sum of decoupled logarithms. The probability that event instance j was due to the background white Poisson process is p̄_{bj}; for every j, Σ_{i,r} p̄_{ijr} + p̄_{bj} = 1. Concretely, given a model λ(k, t), the allegation of causality i → j via r is the ratio of that particular contribution to the overall intensity:

    p̄_{ijr} = h_r(k_j, k_i, t_j − t_i) / λ(k_j, t_j),    p̄_{bj} = μ(k_j) / λ(k_j, t_j).    (6)

The right-hand term in Eq. 5 simplifies vastly if one assumes that Σ_{x_k ∈ X} g(x_k, y_l) = 1 for every y_l ∈ Y, and that the exponential clocks have nearly elapsed by the horizon T; the latter approximation is tenable for large enough T, but the former is not. Only through certain concessions may we gain confidence that the sum is roughly unit. First note that, by the Gaussian integral, ∫_{ℝ^m} g(x, y) dx = 1. Veen & Schoenberg [9] arrived at the conclusion that this approximation holds arbitrarily well if the event occurrences in this putative space are distributed uniformly in ℝ^m.
Our embedding scheme is usually not: events are clumped at the discrete locations of their types, and we are left with a coerced normalization of Eq. 2,

    g̃_r(x_k, y_l) = g_r(x_k, y_l) / Σ_{x ∈ X} g_r(x, y_l).    (7)

We still rely on the "pure" form of Eq. 2 for the sake of closed-form optimization, as detailed below; nevertheless, the intervention in Eq. 7 prevents drifting towards degeneracy. The final touch is the introduction of one more kernel parameter, ξ(l), to account for this additional restriction. The full response kernel is therefore

    h_r(k, l, τ) = γ_r ξ(l) g̃_r(x_k, y_l) δ_r exp(−δ_r τ).    (8)

We eliminate redundancies by constraining the exertion coefficients to N⁻¹ Σ_{l=1}^{N} ξ(l) = 1 and scaling the basis coefficients γ_r appropriately. Furnished with the causality estimates in Eq. 6 (the "expectation" step), we set the partial derivatives of the complete-data log-likelihood with respect to each kernel parameter to zero and project back onto the constraints (the "maximization" step); eventually the causalities are aggregated in special ways to form the coefficient estimates. Omitting the domains of summation over i and j (which run over {1, 2, ..., n} and its pairs, respectively), the solutions unfold as responsibility-weighted averages. At times it is necessary to preserve focus on the acceptable time horizons for a particular domain, in bias against "degenerate" ones. A Gamma(α_δ, β_δ) prior on the decay rate δ_r admits the maximum a posteriori

    δ_r = (Σ_{j,i} p̄_{ijr} + α_δ − 1) / (Σ_{j,i} p̄_{ijr} (t_j − t_i) + β_δ),    (9)

which trivially becomes uninformative at the assignment (α_δ, β_δ) = (1, 0). One is typically interested in the half-life log 2 / δ_r, whose prior is the reciprocal of the aforementioned gamma distribution and is characterized by the aptly named inverse-gamma distribution with expectation β_δ / (α_δ − 1). Preserving the mean while increasing both parameters strengthens the prior. The influence matrix Φ = [ϕ(k, l)] is composed of the kernels with time integrated out, i.e.

    ϕ(k, l) = ∫_0^∞ h(k, l, τ) dτ = Σ_r γ_r ξ(l) g̃_r(x_k, y_l);

there is evidence that this quantity encodes the causal network structure [25, 26].
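Two of the resulting closed forms lend themselves to a compact sketch: the MAP decay-rate update just described, and the responsibility-weighted influence-point average derived in the next paragraphs. The responsibilities would come from the expectation step (Eq. 6); here they are passed in as plain arrays, and the function names are ours:

```python
import numpy as np

def map_decay_rate(p_r, tau, alpha_d=1.0, beta_d=0.0):
    """MAP update of the exponential rate delta_r under a Gamma(alpha_d,
    beta_d) prior; (1, 0) recovers the uninformative weighted average of
    inter-event arrival times (a plausible reading of Eq. 9)."""
    p_r, tau = np.asarray(p_r, float), np.asarray(tau, float)
    return (p_r.sum() + alpha_d - 1.0) / ((p_r * tau).sum() + beta_d)

def update_influence_point(receptions, weights):
    """Responsibility-weighted average of reception points: the closed-form
    influence-embedding update of the first-principles scheme (Eq. 15)."""
    w = np.asarray(weights, float)
    xs = np.asarray(receptions, float)
    return (w[:, None] * xs).sum(axis=0) / w.sum()

# Uniform responsibilities give delta = 1 / mean(tau) = 0.5 here,
# hence a half-life of log(2) / 0.5.
delta = map_decay_rate(np.ones(4), [1.0, 2.0, 3.0, 2.0])
half_life = np.log(2.0) / delta

# The influence point is pulled toward the heavier reception point.
y_new = update_influence_point([[0.0, 0.0], [2.0, 0.0]], [1.0, 3.0])
```

Strengthening the prior (larger α_δ and β_δ with the same mean) shrinks the estimate toward the desired half-life, as the text suggests.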
Pursuant to the above maximization step, one may alternatively estimate all of these N² degrees of freedom.

First principles (FP). We invoke the direct relation between the likelihood function and the embeddings. Our approach alternates between updating the reception embedding and the influence embedding, à la Gibbs sampling. Observe the partial gradient of log L_c with respect to an influence vector y (Eq. 14); this behemoth is difficult to solve analytically. Recall, however, our prior simplifying assumption that Σ_{x ∈ X} g_r(x, y) = 1 for all y and r, also enforced a posteriori by means of Eq. 7. The latter portion of Eq. 14 contains the form Σ_x (x − y) g_r(x, y), equivalent to taking a quantized "expectation" of a Gaussian variable subtracted by its own mean (see Eq. 2). Hence a viable approximation to a locally optimal embedding point y stems from neglecting the contribution of that entire second part of Eq. 14, allowing us to garner the intuitive formula

    y_l = Σ_{i,j,r : k_i = l} p̄_{ijr} x_{k_j} / Σ_{i,j,r : k_i = l} p̄_{ijr}.    (15)

Evidently, each influence point y ∈ Y is attracted to the reception points {x ∈ X} that appear to receive excitatory influence from it. Unfortunately, no analogous approximation for the reception points themselves manifests by a similar sensible trick. We produce the corresponding gradient (Eq. 16) and submit to regular gradient ascent with learning rate ε, specifically with an update rule along the average log-likelihood for a consistent strategy across different record lengths (Eq. 17). To gain intuition on the selection of ε, we looked into entropic impact as a heuristic. The beautiful findings in [27] allowed us to reason about the contribution of a change in an embedding point to the differential entropy of a doubly stochastic point process (Eq. 18), which admitted a simple rule of thumb: set the learning rate proportional to the ratio N/n, with the constant pertaining to domain idiosyncrasies. Maintaining this ratio ameliorated convergence in §3.1.

Diffusion maps (DM). Could we posit a diffusion process across event types?
Random-walk methods yield approximate manifold embeddings, proven helpful in deep representations [28, 29, 30]. Construed as graph affinities, the influences Φ guide a Markovian random walk whose diffusion maps [31, 32, 33] may be approximated via spectral decomposition. We found that asymmetrical diffusion-map embeddings (in the style of [34]) serve as an adequate initial condition, but are not always conducive to stable learning in conjunction with our dynamic kernel basis. We term the model learned entirely this way DM and, for brevity, relegate its review to Appendix A.

Baseline (B). Did the reduction in degrees of freedom lend its hand to a more generalizable model of the point process? In order to motivate the reason for having an embedding at all, besides the gains in interpretability, we pitted the techniques FP and DM against the following: estimating the full-rank matrix entries ϕ(·, ·) directly [11].

We guessed adequate initial conditions for the EM procedure with a fixed empirical protocol. The surmised influence matrix Φ̂ came from summing up correlations between event types (Eq. 19); notice that it remains unscaled. δ̂ was computed as the naive reciprocal of the mean inter-arrival time between events; we justify this construction on the basis of Eq. 9, which forms a weighted average over said arrival times to garner an optimal estimate for δ_r⁻¹. We feed the result of Eq. 19 into the diffusion-maps algorithm in order to obtain our initial embeddings (X̂, Ŷ). For every r, β̂²_r is initialized at the mean dyadic squared distance in the embedding; δ̂_r = (rT)⁻¹, in which variety is injected to nudge the kernels apart; γ̂_r = R⁻¹; and finally, for every x, μ̂(x) = n/(TN).

We intended to stress-test the learning algorithm under small record sizes and numerous event types; we thereby contrived a number of scenarios with known ground-truth parameters sampled randomly.
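The diffusion-maps step used for these initial embeddings admits a loose sketch (our reading of the density normalization and spectral decomposition reviewed in Appendix A; the function name and the SVD-based shortcut are ours, not the paper's):

```python
import numpy as np

def diffusion_coords(phi, m=2, alpha=0.5):
    """Asymmetric diffusion-map coordinates from an influence matrix:
    density-normalize by row/column sums raised to alpha, row-normalize,
    then scale the left singular vectors by their singular values and
    discard the leading (trivial) component."""
    d_row = phi.sum(axis=1) ** alpha
    d_col = phi.sum(axis=0) ** alpha
    a = phi / np.outer(d_row, d_col)
    b = a / a.sum(axis=1, keepdims=True)       # row-stochastic walk matrix
    u, s, _ = np.linalg.svd(b)
    return u[:, 1:m + 1] * s[1:m + 1]

rng = np.random.default_rng(3)
phi_hat = rng.random((6, 6)) + 0.1             # surrogate influence matrix
reception_init = diffusion_coords(phi_hat, m=2)
influence_init = diffusion_coords(phi_hat.T, m=2)
```

Applying the same routine to the transposed matrix yields the second, influence-side set of coordinates, mirroring the bipartite reading of Φ.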
The underlying models had a single kernel (R = 1) in the response function and conformed to the prescriptions FP, DM, and B from §2.2. Each sampled model was simulated with the thinning algorithm (see e.g. [35] and their supplementary material) in order to generate a time-series record of specified length n. Reception and influence points were realized uniformly from a unit square (m = 2). Spatial bandwidths β² were granted a gamma distribution with shape α = 1/N and unit scale. Decay rates δ were standard log-normal, as were the backgrounds μ(k), though scaled by 1/N. Stability [24] was ensured by setting γ = 1/N, constraining the Frobenius norm ‖Φ‖_F = 1, which upper-bounds the ℓ2-induced norm, which itself upper-bounds the spectral radius of the influences ρ(Φ), the real criterion.

In line with Goodhart's law [36], different facets of the model apparatus were scrutinized. First, we sought to determine whether the baseline tends to reach high in-sample likelihoods yet abysmal out-of-sample likelihoods, including extreme outliers. Not only are statistics on each likelihood important, but also the relationship between the two; we thought to convey it through a total-least-squares [37] slope and centroid, followed by its root-mean-square error (RMSE), in the first four outcome columns of Table 1. The centroid gives empirical means for the train and test log L, and the slope gives how much the test varies with the train.

Did our models recover the chain of causation? We opened up the empirical [p̄_{ijr}] estimates and computed their Hellinger distance [38] from those stipulated by the ground truth. For each "to-be-caused" event j, the quantities (i, r) ↦ p̄_{ijr} are framable as a discrete probability distribution, the empirical construction of which makes it poorly suited numerically for the more widespread KL-divergence measure, unlike the Hellinger distance. Table 1 additionally reports the RMSE of the fit line (col. "fit"), the divergence from ground-truth causalities (col.
"div."), and the mean correlation difference between the model and GloVe, along with t-test significance markers (col. "emb. cor."). ***: p ≤ 0.01; **: p ≤ 0.05; *: p ≤ 0.1. Bold numerals indicate that the sample rejected an Anderson-Darling test [39] for normality with significance ≤ 0.05.

Comparison to GloVe. Our technique lives in a dual realm: that of vector embeddings for words and other sequential entities. We fed the ordered sequence of event-type occurrences into a typical GloVe scheme [40] with a forward-looking window of size three, three dimensions, and an asymmetric co-occurrence matrix, in an attempt to recover a single set of vectors serving as both influence and reception points. The final column of Table 1 displays a systematic evaluation against GloVe's embeddings with reference to the ground-truth geometry. Concretely, all pairwise distances in each setting (that of our learned model and that of the newfound GloVe embeddings) were correlated to those of the ground truth by Kendall's rank-based nonparametric statistic [41]. The gap between GloVe's estimated correlation and the scrutinized model's, each in [−1, 1] and where a positive difference means the model correlated more with the ground truth, was collected in each trial; the means, along with t-test significance [42], are reported in the last column of Table 1. GloVe is nondeterministic, so we obtained the sample mean of ten embedding correlations per trial, a move that favors GloVe's results. From the outcomes, it is evident that DM models pick up the spatial coordinates more consistently, probably due to the regularity imposed by their normalized spectral decomposition.

Epidemics. Consider disease in a social apparatus emerging as a diffusive point process. In 2019, a finely regularized variational approach to learning multivariate Hawkes processes from (a little more than) a handful of data [13] was demonstrated on a dataset of symptom incidences during the ~2014-2015 Ebola outbreak [43].
We gave the record precisely the same treatment the authors did, and obtained significantly higher commensurate likelihoods than their best case; in turn, they had outperformed the cutting-edge approaches MLE-SGLP [14] and ADM4 [15] that regularize for sparsity. We also trained a model with an embedding fixed to the geographic coordinates of the 54 West African districts present in the dataset, assessing the spatial nature of the process. See Table 2; also, view Appendix B for a foray into the COVID-19 pandemic and a fruitful result on South Korea. The market experiments are summarized in Table 3; the auxiliary accuracy metric there is derived from the categorical cross-entropy of the predicted event type at the time of an actual occurrence. We visualized the three-dimensional embeddings via their two principal components in Figure 3.

Figure 2: trades at discrete strike prices were resampled according to quantized log-Gaussian profiles with reference to moneyness at any given point in time. Standard deviations, in logarithmic space, were half the separation between the two densities' centers. They were kept "loose" for the sake of seamless translation even under abrupt fluctuations in the underlying stock price.

Predictive ability. We display the epoch with the best training score in all our experiments. Most notable in Table 1 is how the DM formulation enjoys poorly determined systems, e.g. (300, 90), but is typically outperformed on the basis of likelihoods by FP in better-posed situations like (900, 30). The baseline suffers in recovering actual causalities (measured by divergences from the ground truth), in contrast to our novel models FP and DM. The Ebola results in Table 2 depict superior performance from all of our models relative to the state of the art. Whereas our EM "baseline" is best, and the model with geographic ground truth performs marginally better than our FP, we count it as a positive that most of the information was retained.
The strain afflicting the region during that period of time had an incubation period of about 8-12 days [44], suggesting a preference for higher half-lives. See our results pertaining to COVID-19 in Appendix B.

Interpretation of the market embeddings. Events belonging to each stock ticker tend to attract influencing points of the same color in Figure 3, with deviations hinting at their relative perceptions by the market. Efficient estimators are necessary in order to discern a lack of stationarity; in this case, transferring our Sep. 15 FP model onto the Sep. 18 test set yielded a log L improvement from 2.57 to 2.82, and vice versa gave a decrease from 2.72 to 2.45. Thus behavior largely persists. Further, the aggregate ξ(l)'s confirm the broad intuition that out-of-the-money trades move markets the most [45]. The quantile plots put forth that both FP and DM outperformed the baseline in statistically filtering out the white (i.e. serially independent) background events according to their estimated probabilities p̄_{bj}; a constant-intensity Poisson process would witness arrival times distributed exponentially [24].

Disentangling time scales. We unsuccessfully attempted to sway the estimators towards longer-term (on the order of seconds) behaviors by enlisting a prior on the exponential rate encouraging an expected half-life of one minute. Parsing minute-scale behaviors out of high-frequency trades is severely difficult; recall that most market makers dealing in options are automated. The gamma prior inflicts a cost without influencing the half-lives very much. It would be imperative to study higher-order interactions [46] if one were to investigate longer patterns through individual trades.

The fundamental notion driving the doubly stochastic process, first attributed to Cox [47], manifests in a variety of ways, including the latent point-process allocation model [48].
In fact, note the meteoric rise [49] of Bayesian approaches that now permeate serially dependent point processes. Zhang et al. [50] sample the branching structure in order to infer the Gaussian process (GP) that constitutes the influence function. GPs, usually accompanied by inducing points, sometimes directly modulate the intensity [51, 48, 52, 53, 54]. Linderman and Adams [55] took an approach that estimated a matrix very similar to our influence matrix Φ, relying on discretized binning and variational approximation. Salehi et al. [13] exploited a reparametrization trick akin to those in variational autoencoders in order to efficiently estimate the tensor of basis-function coefficients. Recent progress has been made in factorizing interactions with a direct focus on scalability [19], improving on prior work in low-rank processes [20]; block models on observed interaction pairs also exist [21]. While these all achieve compact Hawkes processes, our methodology distinguishes itself by learning a Euclidean embedding with the semantics of a metric space, not a low-rank projection. Notably, the neural Hawkes process [35] and a hodgepodge of other techniques centered on neural networks [56, 57, 58, 59, 60, 61, 2, 62, 63, 64] have been established through the years. Our baseline estimator most closely resembles the one described in the work of Zhou et al. [11], whereas the spatiotemporal aspect is inspired by the likes of Schoenberg et al. [9, 3, 10]. Variational substitutes in the EM algorithm have also been explored [65]. A concurrent study to ours by Zhu et al. [66] parametrizes a heterogeneous kernel in real Euclidean space with deep neural networks.

We demonstrated the viability of estimating embeddings for events in an interpretable metric space tied to a self-exciting point process. The proposed expectation-maximization algorithm extracts parsimonious serial dependencies.
Our framework paves the way for generalization, extension to more elaborate models, and consequent potential for societal impact in the future. All of the real-world phenomena examined herein emerged from systems of social entities. The point processes are governed either by human interactions (e.g. in a pandemic) or by dealings between agents thereof (e.g. in the options market, with the majority of trades automated on behalf of institutions). Data collection is costly for problems concerning human activities in "everyday" society, as peering into social media provides a skewed and limited perception. Other, more authentic streams of information concerning public opinion, physical interaction networks, and so forth are either inaccessible due to privacy issues or saturated with noise. We believe that our contribution broadens the scope of the kinds of problems that can be studied with analytical point-process methods. A myriad of applications in the social sciences have lagged in adopting the level of technical vigor that contemporary data science enables; a few of them, as displayed in the present study, benefit significantly from the efficient estimation of compact, interpretable representations of event interactions. It is difficult to name any negative ethical implications for this line of work. One necessary precaution with increasingly refined "causal" models is to avoid making hasty conclusions about true causation, which cannot be proven solely from our Hawkes kernel estimates.

Appendix A. We review briefly the technique's application here; the curious reader is encouraged to peruse the theory presented in Coifman's seminal publications [33]. Casting the influence matrix as edge weights in a bipartite graph flowing between influence (columns) and reception (rows), we examine the diffusion process upon it [34]. We first normalize by density to our liking, per our selected value of the parameter 0 ≤ α ≤ 1, according to

    a(k, l) = ϕ(k, l) / (d_k^α d_l^α),    (20)

where d_k and d_l denote the corresponding row and column sums of Φ. Consider the row-stochastic version of A, named B_r.
its singular values multiplied by its left (orthonormal) singular vectors supply manifold embedding coordinates for the reception points, weighted by significance according to the singular values. likewise, b_i may be constructed as the row-stochastic transformation of a^t from the same eq. 20, whose resultant coordinates grant us the influence points. in each set of coordinates, we preserve only those corresponding to the highest m singular values, except for the largest, which is constant by definition. the covid-19 pandemic caused by a coronavirus novel to humans has taken the world by surprise, forcing lockdowns across the globe and uniting humanity on a common front. data scientists naturally desire to contribute to this global effort in whichever way they can. one avenue is the study of the infections' spatiotemporal nature. notwithstanding the regional distortions in reporting due to a wide array of factors, we expect diffusion dynamics to be reflected in the fatalities at the macroscopic level. figure 4 catalogs our empirical results from the johns hopkins csse dataset [67]. since the daily new confirmed cases were so numerous, we had to increase the temporal resolution artificially from days to hours: we interpolated on an exponential curve between successive days. the model we deployed as the baseline ("b") gave the best performance in this experiment. identifying harbingers of the spatial progression of covid-19 is of paramount importance for proactive policymaking. we sought to model the transient exponential-growth phase only; incorporating a self-limiting aspect remains a topic for future study. that instability, coupled with the evident non-planarity of the interaction graphs in figure 4, contributes to the failure of our latent embedding scheme and the reversion to the baseline model. three contending factors make this particular task difficult.
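the diffusion-map construction sketched in the preceding paragraphs can be illustrated in a few lines of numpy; the density-normalization formula below is the standard coifman-style choice and stands in for the elided eq. 20, so treat it as an assumption rather than the paper's exact expression.

```python
import numpy as np

def diffusion_embedding(A, alpha=0.5, m=3):
    """Diffusion-map style embedding of a nonnegative influence matrix A
    (rows = reception, cols = influence), following the bipartite-graph
    construction in the text. Returns coordinates for the reception points."""
    # density normalization with parameter 0 <= alpha <= 1 (assumed form)
    d_row, d_col = A.sum(axis=1), A.sum(axis=0)
    A_norm = A / (np.outer(d_row ** alpha, d_col ** alpha) + 1e-12)
    # row-stochastic version B_r
    B = A_norm / (A_norm.sum(axis=1, keepdims=True) + 1e-12)
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    # weight left singular vectors by singular values; drop the first
    # (trivial) component and keep the next m
    coords = U * s[None, :]
    return coords[:, 1:m + 1]
```

the influence-point coordinates would be obtained the same way from the row-stochastic transformation of the transpose.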
first, the instability of the infection process renders possible cross-regional influences ambiguous. second, imperfections in the reporting protocol could have induced too much noise. and third, there may actually be no suitable low-dimensional euclidean embedding. to test the third hypothesis, we experimented with the counties of relatively vast and rural states. we examined ohio's fatality diffusion process alone, to no avail. to ameliorate the issues with recording fatalities instead of confirmed cases, as well as the explosive growth, we also peered into the april 20-may 20 period of new confirmed cases, which was slower because most states had instituted social controls. both ohio and kentucky, which have numerous small counties, were scrutinized with the fp and dm models, again to no avail. south korea: questioning whether the quality of the reported confirmed cases was at fault, we turned to the well-crafted dataset released by the korean centers for disease control & prevention [22]. it details 3,385 incidences from the early outbreaks across the 155 regions of the country, each with at least one infection occurrence. we find it worth noting that the propagation was more controlled in south korea relative to most other countries; this premise, along with diligent testing, appears to contribute to the embedding's identifiability. our novel method fp with = 1 attained a test log l of −3.90, in comparison to the baseline with −4.68 and the geographically spatial model with −4.65. the test set consisted of the last 30 days in the record, containing 631 incidences. it is further worth noting that the full-rank baseline registered a significantly higher training log l than our fp did, indicative of excess overfitting. see the resultant principal components of the three-dimensional embeddings in figure 5, along with occurrence statistics in figure 6.
for one, we observe that the two urban hubs seoul and busan are not as far apart in the latent influence space as they are geographically. colors were interpolated by hue on the basis of physical proximity to seoul (blue) versus busan (red), the two major urban centers. each location is marked by a pair of symbols, an '×' and a dot, corresponding to its receiving and influencing points respectively. each comparable multivariate hawkes model (along with its inference procedure) detailed in the current state of the art [20, 15, 13] entails careful tuning that no existing software package appears to cover judiciously. we therefore found it more productive to benchmark the proposed formulation on the basis of previously reported test-set average log likelihoods, a standardized quantity. the field of self-exciting point processes is fragmented: there is no accepted benchmark dataset, or even domain. this aspect differs from the prevailing applications of deep learning like image recognition. as elaborated in the related-work section, some methodologies rely on large sample records and retain the power to model any kind of interaction (in theory). others sacrifice a degree of expressivity for salient and parsimonious interactions. even within the latter camp, one is faced with substantial tradeoffs pertaining to computational as well as statistical complexity. the number of event types to be accommodated could lie in the dozens or the hundreds, and likewise the length of the time series varies from the tens of thousands to the hundreds of thousands or millions. each scenario calls for different modeling choices, and yet the delineation is not at all clear. exhaustive synthetic examples generated from true hawkes processes serve to validate the correctness of the model alongside its estimation algorithms. real phenomena never abide by the pure assumptions posed for a self-exciting point process, so a model must stand on its own merit within each distinct application domain.
we showcased viability in two highly pertinent domains by examining epidemics and the options market.

references:
comparison and assessment of epidemic models
hawkestopic: a joint model for network inference and topic modeling from text-based cascades
marked point process hotspot maps for homicide and gun crime prediction in chicago
self-exciting point process modeling of crime
spatio-temporal correlations and visual signaling in a complete neuronal population
state-dependent hawkes processes and their application to limit order book modeling
general compound hawkes processes in limit order books
when is a network epidemic hard to eliminate?
estimation of space-time branching process models in seismology using an em-type algorithm
multivariate spatiotemporal hawkes processes and network reconstruction
learning triggering kernels for multi-dimensional hawkes processes
modelling dyadic interaction with hawkes processes
learning hawkes processes from a handful of events
learning granger causality for hawkes processes
learning social infectivity in sparse low-rank networks using multi-dimensional hawkes processes
weg2vec: event embedding for temporal networks
crime event embedding with unsupervised feature selection
embedding temporal network via neighborhood formation
learning multivariate hawkes processes at scale
multivariate hawkes processes for large-scale inference
the block point process model for continuous-time event-based dynamic networks
ds4c: data science for covid-19 in south korea
spectra of some self-exciting and mutually exciting point processes
first- and second-order statistics characterization of hawkes processes and non-parametric estimation
uncovering causality from multivariate hawkes integrated cumulants
learning network of multivariate hawkes processes: a time series approach
the entropy of a point process
variational autoencoders with riemannian brownian motion priors
diffusion variational autoencoders
data-driven probability concentration and sampling on manifold
multivariate time-series analysis and diffusion maps
diffusion maps
large-scale spectral clustering using diffusion coordinates on landmark-based bipartite graphs
the neural hawkes process: a neurally self-modulating multivariate point process
inflation, depression, and economic policy in the west, ch. problems of monetary management: the u.k. experience
an introduction to total least squares
universal boosting variational inference
power comparisons of shapiro-wilk, kolmogorov-smirnov, lilliefors and anderson-darling tests
glove: global vectors for word representation
parameters behind "nonparametric" statistics: kendall's tau, somers' d and median differences
the probable error of a mean
heterogeneities in the case fatality ratio in the west african ebola outbreak
a review of epidemiological parameters from ebola outbreaks to inform early public health decision-making
option moneyness and price disagreements
general methodology for nonlinear modeling of neural systems with poisson point-process inputs
some statistical methods connected with series of events
latent point process allocation
mutually regressive point processes
efficient non-parametric bayesian hawkes processes
nonparametric regressive point processes based on conditional gaussian processes
structured variational inference in continuous cox process models
bayesian nonparametric poisson-process allocation for time-sequence modeling
scalable high-resolution forecasting of sparse spatiotemporal events with kernel methods
scalable bayesian inference for excitatory point process networks
temporal network embedding with micro- and macro-dynamics
deep random splines for point process intensity estimation of neural population data
deep mixture point processes: spatio-temporal event prediction with rich contextual information
fully neural network based model for general temporal point processes
generative sequential stochastic model for marked point processes
geometric hawkes processes with graph convolutional recurrent neural networks, association for the advancement of artificial intelligence
recurrent marked temporal point processes: embedding event history to vector
neural jump stochastic differential equations
latent self-exciting point process model for spatial-temporal networks
interpretable generative neural spatio-temporal point processes
an interactive web-based dashboard to track covid-19 in real time
community structure in social and biological networks

key: cord-028789-dqa74cus authors: ouhami, maryam; es-saady, youssef; hajji, mohamed el; hafiane, adel; canals, raphael; yassa, mostafa el title: deep transfer learning models for tomato disease detection date: 2020-06-05 journal: image and signal processing doi: 10.1007/978-3-030-51935-3_7 sha: doc_id: 28789 cord_uid: dqa74cus

vegetable crops in morocco, and especially in the sous-massa region, are exposed to parasitic diseases and pest attacks which affect the quantity and the quality of agricultural production. precision farming is introduced as one of the biggest revolutions in agriculture; it is committed to improving crop protection by identifying, analyzing and managing variability, delivering effective treatment in the right place, at the right time, and with the right rate. the main purpose of this study is to find the most suitable machine learning model to detect tomato crop diseases in standard rgb images. to deal with this problem we consider the deep learning models densenet (with 161 and 121 layers) and vgg16 with transfer learning. our study is based on images of infected plant leaves divided into 6 types of infections: pest attacks and plant diseases. the results were promising, with an accuracy of up to 95.65% for densenet161, 94.93% for densenet121 and 90.58% for vgg16.
with a surface area of nearly 8.7 million hectares, the agricultural sector produces a very wide range of products and, according to the moroccan department of agriculture, generates 13% of the gross domestic product (gdp) [1]. this sector has experienced a significant evolution of the gdp due to the exploitation of fertilization and plant protection systems [1]. despite these efforts, it still faces important challenges, such as diseases. a pathogen is the factor that causes disease in the plant; disease is induced either by physical factors such as sudden climate changes or by chemical/biological factors like viruses and fungi [2]. market gardening, and especially tomato crops in the sous-massa region, is among the crops exposed to several risks which affect the quantity and quality of the agricultural products. the most important damages are caused by pest attacks (leafminer flies, tuta absoluta and thrips) in addition to cryptogamic pathogen infections (early blight, late blight and powdery mildew). since diagnosis can be performed on plant leaves, our study is conducted as a task of classification of symptoms and damages on those leaves. ground imaging with an rgb camera presents an interesting way to perform this diagnosis. however, robust algorithms are required to deal with different acquisition conditions: light changes, color calibration, etc. for several years, great efforts have been devoted to the study of plant disease detection. indeed, feature-engineering models [3] [4] [5] [6] on one side, and convolutional neural networks (cnn) [7] [8] [9] [10] on the other, have been employed to solve this task. in [6], the study is based on a database of 120 images of infected rice leaves divided into three classes, bacterial leaf blight, brown spot, and leaf smut (40 images for each class); the authors converted the rgb images to the hsv color space to identify lesions, with a segmentation accuracy of up to 96.71% using k-means.
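the hsv-plus-k-means pipeline used in [6] can be approximated in plain numpy; the conversion and clustering below are a generic sketch (the cited study's exact preprocessing is not reproduced, and the cluster count is an assumption).

```python
import numpy as np

def rgb_to_hsv(img):
    """Vectorized RGB (floats in [0, 1]) -> HSV conversion."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    mx, mn = img.max(axis=-1), img.min(axis=-1)
    diff = mx - mn
    h = np.zeros_like(mx)
    mask = diff > 0
    rmax = mask & (mx == r)
    gmax = mask & (mx == g) & ~rmax
    bmax = mask & ~rmax & ~gmax
    h[rmax] = ((g - b)[rmax] / diff[rmax]) % 6
    h[gmax] = (b - r)[gmax] / diff[gmax] + 2
    h[bmax] = (r - g)[bmax] / diff[bmax] + 4
    h /= 6
    s = np.where(mx > 0, diff / np.where(mx > 0, mx, 1), 0)
    return np.stack([h, s, mx], axis=-1)

def kmeans_segment(hsv, k=3, n_iter=20, seed=0):
    """Naive k-means over per-pixel HSV vectors; returns a label map.
    Assumes the image has at least k distinct colors."""
    pixels = hsv.reshape(-1, 3)
    uniq = np.unique(pixels, axis=0)
    rng = np.random.default_rng(seed)
    centers = uniq[rng.choice(len(uniq), k, replace=False)].astype(float)
    for _ in range(n_iter):
        d = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(0)
    return labels.reshape(hsv.shape[:2])
```

clustering in hsv space separates chromatic information from intensity, which is why lesion pixels tend to fall into their own cluster.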
the experiments were carried out to classify the images based on multiple combinations of the extracted characteristics (texture, color and shape) using a support vector machine (svm). the weakness of this method is its moderate accuracy of 73.33%. in fact, the image quality was degraded during the segmentation phase, during which some holes were generated within the diseased portion, which could be a reason for the low classification accuracy. in addition, leaf smut is misclassified with an accuracy of 40%, so other types of features are required to improve the results. in the same context, in [4], the authors proposed an approach for disease recognition on plant leaves. this approach is based on combining multiple svm classifiers (sequential, parallel and hybrid) using color, texture and shape characteristics. different preprocessing steps were performed, including normalization, noise reduction and segmentation by the k-means clustering method. the database of infected plant leaves contains six classes, including three types of insect pest damages and three forms of pathogen symptoms. the hybrid approach outperformed the other approaches, achieving a rate of 93.90%. the analysis of the confusion matrix for these three methods highlighted the causes of misclassification, which are essentially due to the complexity of certain diseases whose symptoms are difficult to differentiate during the different stages of development, with a high degree of confusion between the powdery mildew and thrips classes in all the combination approaches. in another study, the authors used a maize database acquired by a drone flying at a height of 6 m [11]. they selected patches of 500 by 500 pixels from each original image of 4000 by 6000 pixels, and labelled them according to whether the most central 224 by 224 area contained a lesion.
for the classification step between healthy and diseased, a 500 by 500 sliding window on the image was used and fed into the convolutional neural network (cnn) resnet model [8]. despite a test precision of 97.85%, the method remains hard to generalize, since the chosen disease has important and distinct symptoms compared to other diseases and the acquisitions were made on a single field. for that reason, it is not clear how the model would perform in classifying different diseases with similar symptoms. another work uses aerial images with the aim of detecting disease symptoms in grape leaves [9]. the authors used a cnn approach, performing a relevant combination of image features and color spaces. indeed, after the acquisition of rgb images using a uav at 15 m height, the images were converted into different colorimetric spaces to separate the intensity information from the chrominance. the color spaces used in this study were hsv, lab and yuv, in addition to the extracted vegetation indices (excessive green (exg), excessive red (exr), excessive green-red (exgr), green-red vegetation index (grvi), normalized difference index (ndi) and red-green index (rgi)). for classification, they used the cnn model net-5 with 4 output classes: soil, healthy, infected and susceptible to infection. the model was tested on multiple combinations of input data and three patch sizes. the best result was obtained by combining exg, exr & grvi, with an accuracy of 95.86% on 64 × 64 patches. in [10], the authors tested several existing state-of-the-art cnn architectures for plant disease classification. the public plantvillage database [12] was used in this study. the database consists of 55,038 images of 14 plant types, divided into 39 classes of healthy and infected leaves, including a background class. the best results were obtained with the transfer-learning resnet34 model, achieving an accuracy of 99.67%.
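the vegetation indices listed above can be computed directly from an rgb image; the formulas below follow common definitions from the remote-sensing literature, and the cited study may use slightly different variants.

```python
import numpy as np

def vegetation_indices(img):
    """Common vegetation indices from an RGB image (floats).
    Definitions follow usual literature conventions (assumed here)."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    total = r + g + b + 1e-12
    rn, gn, bn = r / total, g / total, b / total   # chromatic coordinates
    exg = 2 * gn - rn - bn                 # excessive green
    exr = 1.4 * rn - gn                    # excessive red
    exgr = exg - exr                       # excessive green minus red
    grvi = (g - r) / (g + r + 1e-12)       # green-red vegetation index
    ndi = (gn - rn) / (gn + rn + 1e-12)    # normalized difference index
    rgi = r / (g + 1e-12)                  # red-green index
    return dict(exg=exg, exr=exr, exgr=exgr, grvi=grvi, ndi=ndi, rgi=rgi)
```

these index maps can be stacked with the color-space channels to form the multi-channel patches fed to the classifier.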
several works have addressed the problem of plant disease detection using images provided by remote sensing devices (smartphones, drones…). nevertheless, cnns have demonstrated high performance on this problem compared to models based on classic feature-extraction methods. in the present study we take advantage of deep learning and transfer learning approaches to address the most important damages caused by pest attacks and cryptogamic pathogen infections in tomato crops. the rest of the paper is organized as follows. section 2 presents a comparative study and discusses our preliminary results. the conclusion and perspectives are presented in sect. 3. the study was conducted on a database of images of infected leaves, developed and used in [3] [4] [5]. the images were taken with a digital camera, a canon 600d, in several farms in the area of sous-massa, morocco. additional images were collected from the internet in order to increase the size of the database. the dataset is composed of six classes: three of damage caused by insect pests (leafminer flies, thrips and tuta absoluta), and three of cryptogamic pathogen symptoms (early blight, late blight and powdery mildew). the dataset was validated with the help of agricultural experts. figure 1 depicts the types of symptoms on tomato leaves, and table 1 presents the composition of the database and the symptoms of each class. the images were resized in order to put the leaves in the center of the images. the motivation behind using deep learning for computer vision is the direct exploitation of images without any hand-crafted features. in the plant disease detection field, many researchers have chosen the deep models densenet and vgg for their high performance in standard computer vision tasks.
the idea behind the densenet architecture is to create short paths from the early layers to the later layers and to ensure maximum information flow between the layers of the network. therefore, densenet connects all its layers (with matching feature-map sizes) directly to each other. in addition, each layer obtains additional inputs from all previous layers and transmits its own feature maps to all subsequent layers [13]. indeed, according to [13], densenets require substantially fewer parameters and less computation to achieve state-of-the-art performance. figure 2(a) gives an example of a convolutional model with five dense layers. in this study densenet was used with 121 layers and 161 layers. the very deep convolutional network, or vgg, ranked second in the ilsvrc-2014 challenge [14]. the model is widely used for image recognition tasks, especially in the crop field [15, 16]. consequently, we used vgg with 16 layers; the architecture has 16 weight layers. fine-tuning is a transfer learning method that allows one to take advantage of models trained on another computer vision task where a large number of labelled images is available. moreover, it reduces the need for a large dataset and the computational power to train the model from scratch [16]. fine-tuned learning experiments are much faster and more accurate compared to models trained from scratch [10, 15, 17, 18]. hence, we fine-tuned the network layers of the 3 models based on features learned on the imagenet dataset [19]. the idea is to take the pre-trained weights of vgg16, densenet121 and densenet161 trained on the imagenet dataset, use those weights as the starting point of our learning process, then keep them fixed for every convolutional layer in all the iterations and update only the weights of the linear layers. experimental setup: experiments were run on a google compute engine instance named google colaboratory (colab), as well as a local machine lenovo y560 with 16 gb of ram.
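the fine-tuning scheme just described (pretrained convolutional weights kept fixed, only the linear layers updated) looks roughly as follows in pytorch; the tiny stand-in network and the module names are placeholders, and in practice the backbone would be a torchvision densenet.

```python
import torch
import torch.nn as nn

# a stand-in two-part network; in practice this would be e.g.
# torchvision.models.densenet161(pretrained=True), with `classifier`
# replaced by a fresh linear layer for the 6 disease classes
model = nn.Sequential()
model.add_module("features", nn.Linear(8, 8))     # "pretrained" backbone
model.add_module("classifier", nn.Linear(8, 6))   # 6 disease classes

# freeze everything except the linear head, mirroring the scheme in the text
for name, p in model.named_parameters():
    p.requires_grad = name.startswith("classifier")

trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=1e-3)   # lr value from the text
```

only the parameters with `requires_grad=True` receive gradient updates, so the imagenet features are preserved while the head adapts to the new classes.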
colab notebooks are based on jupyter and behave like google docs objects. in addition, the notebooks are pre-configured with the essential machine learning and artificial intelligence libraries, such as tensorflow, matplotlib, and keras. colab operates under ubuntu 17.10 64-bit and is composed of an intel xeon processor and 13 gb of ram. it is equipped with an nvidia tesla k80 (gk210 chipset) with 12 gb of ram and 2496 cuda cores. we implemented and executed the experiments in python, using the pytorch library [21], which performs automatic differentiation over dynamic computation graphs. in addition, we used the pytorch model zoo, which contains various models pretrained on the imagenet dataset [19]. the models are trained with the stochastic gradient descent (sgd) optimizer with a learning rate of 1e-3 and a total of 20 epochs. the dataset is divided into 80% for training and 20% for evaluation. the evolution of the loss during the training phase is illustrated in fig. 3(a). based on the graph we can observe that the training loss converged for all models. a large reduction of the loss occurred within the first 5 epochs; after 20 epochs all the models were optimized with low losses, reaching 0.12 for densenet161, 0.14 for densenet121 and 0.15 for vgg16. after the 14th epoch the training loss, as well as the training accuracy, started to converge. in addition, after testing with higher learning rates and an increased number of epochs, the best training scores were still achieved by the densenet models, which means that the models performed better with fewer parameters. besides, densenet161 performed better than densenet121 due to its deeper architecture. we can observe in table 2 that the densenets performed better than the vgg model during testing, even though their losses reached scores around 0.14 during training. note that the densenet with more layers had a better test score. in the test phase densenet161 outperformed densenet121 and vgg16, with accuracies of 95.65%, 94.93% and 90.58% respectively.
we can clearly see from table 2 that densenet161 outperformed the other models in classifying leafminer fly, thrips and powdery mildew, with accuracies of up to 100%, 95.65% and 100% respectively. furthermore, densenet121 had the best classification rates for early blight, late blight and tuta absoluta, with accuracies of 100%, 95.65% and 95.65% respectively. in order to compare the two models with the best accuracies, we calculated the confusion matrix on the testing dataset for those models. figure 4 represents the confusion matrices for the densenet classification models with 161 and 121 layers. more images were misclassified by densenet121 compared to densenet161. moreover, the most confused classes for densenet121 are leafminer fly and thrips, with two thrips images classified as leafminer fly and one leafminer image classified as thrips. in the densenet161 model, the confusion is more likely between early blight and late blight, with one early blight image classified as late blight and one late blight image classified as early blight. thrips and early blight are the most misclassified classes for both models, which is due to the similarity between the symptoms, making it difficult to differentiate between these classes. table 3 describes the studies cited earlier in sect. 1, aligned with our model results. each approach uses a different dataset. nevertheless, according to the accuracies listed, the approaches based on deep learning models outperformed the approaches based on feature engineering. the results of our model are promising, starting with a dataset of 666 images (see sect. 2.1) and achieving an accuracy of 95.65% using the densenet161 model. in this paper we have studied three deep learning models in order to deal with the problem of plant disease detection. the best test accuracy score is achieved with densenet161 with 20 training epochs, outperforming the other tested architectures.
from the study that has been conducted, it is possible to conclude that densenet is a suitable architecture for the task of plant disease detection based on crop images. moreover, we found that densenets require fewer parameters to achieve better performance. the preliminary results are promising. in future work we will try to improve the results, increase the dataset size and address more challenging disease detection problems.

references:
mapm du développement rural et des eaux et forêts, l'agriculture en chiffre - plan maroc vert, l'agriculture en chiffre
agriculture de précision
automatic recognition of plant leaves diseases based on serial combination of two svm classifiers
a hybrid combination of multiple svm classifiers for automatic recognition of the damages and symptoms on plant leaves
automatic recognition of the damages and symptoms on plant leaves using parallel combination of two classifiers
detection and classification of rice plant diseases
automatic recognition of vegetable crops diseases based on neural network classifier
autonomous detection of plant disease symptoms directly from aerial imagery, the plant phenome journal
deep learning approach with colorimetric spaces and vegetation indices for vine diseases detection in uav images
deep learning for plant diseases: detection and saliency map visualisation
image set for deep learning: field images of maize annotated with disease symptoms
an open access repository of images on plant health to enable the development of mobile disease diagnostics
densely connected convolutional networks
very deep convolutional networks for large-scale image recognition
a comparative study of fine-tuning deep learning models for plant disease identification
vgg16 for plant image classification with transfer learning and data augmentation
deep learning with unsupervised data labeling for weed detection in line crops in uav images
using deep learning for image-based plant disease detection
imagenet: a large-scale hierarchical image database
performance analysis of google colaboratory as a tool for accelerating deep learning applications
deep learning with python

acknowledgment. this study was supported by campus france, cultural action and cooperation department of the french embassy in morocco, in the call of proposals, 2019 campaign, under the name of "appel à projet recherche et universitaire".

key: cord-026949-nu46ok9w authors: varshney, deeksha; ekbal, asif; nagaraja, ganesh prasad; tiwari, mrigank; gopinath, abhijith athreya mysore; bhattacharyya, pushpak title: natural language generation using transformer network in an open-domain setting date: 2020-05-26 journal: natural language processing and information systems doi: 10.1007/978-3-030-51310-8_8 sha: doc_id: 26949 cord_uid: nu46ok9w

prior works on dialog generation focus on the task-oriented setting and utilize multi-turn conversational utterance-response pairs. however, natural language generation (nlg) in the open-domain environment is more challenging. the conversations in an open-domain chit-chat model are mostly single-turn in nature.
current methods used for modeling single-turn conversations often fail to generate contextually relevant responses on large datasets. in our work, we develop a transformer-based method for natural language generation (nlg) in an open-domain setting. experiments on utterance-response pairs show improvement over the baselines, both in terms of quantitative measures like bleu and rouge and human evaluation metrics like fluency and adequacy. conversational systems are some of the most important advancements in the area of artificial intelligence (ai). in conversational ai, dialogue systems can be either open-domain chit-chat models or task-specific goal-oriented models. task-specific systems focus on particular tasks such as flight or hotel booking, providing technical support to users, and answering non-creative queries. these systems try to generate a response by maximizing an expected reward. in contrast, an open-domain dialog system operates in a non-goal-driven casual environment and responds to all kinds of questions. the realization of rewards is not straightforward in these cases, as there are many factors to model. aspects such as understanding the dialog context, acknowledging the user's personal preferences, and other external factors such as time, weather, and current events need consideration at each dialog step. in recent times, there has been a trend towards building end-to-end dialog systems such as chat-bots which can easily mimic human conversations. [19, 22, 25] developed such systems using deep neural networks, training them on large amounts of multi-turn conversational data. virtual assistants in open-domain settings usually utilize single-turn conversations for training the models. chit-chat bots in such situations can help humans to interact with machines using natural language, thereby allowing humans to express their emotional states.
in dialogue systems, generating relevant, diverse, and coherent responses is essential for robustness and practical usage. generative models tend to produce short, inappropriate responses to some questions, ranging from invalid sentences to generic ones like "i don't know". the reasons for these issues include the inefficiency of the models in capturing long-range dependencies, the generation of a large number of out-of-vocabulary (oov) words, and the limitations of the maximum likelihood objective functions used for training these models. transformer models have become an essential part of most of the state-of-the-art architectures in several natural language processing (nlp) applications. results show that these models capture long-range dependencies efficiently, replacing gated recurrent neural network models in many situations. in this paper, we propose an efficient end-to-end architecture based on the transformer network for natural language generation (nlg) in an open-domain dialogue system. the proposed model can maximize contextual relevancy and diversity in the generated responses. our research reported here contributes in three ways: (i) we build an efficient end-to-end neural architecture for a chit-chat dialogue system, capable of generating contextually consistent and diverse responses; (ii) we create a single-turn conversational dataset with chit-chat type conversations on several topics between a human and a virtual assistant; and (iii) empirical analysis shows that our proposed model can improve the generation process when trained with enough data, in comparison to traditional methods like retrieval-based and neural-translation-based approaches. conversational artificial intelligence (ai) is currently one of the most challenging problems of artificial intelligence. developing dialog systems that can interact with humans logically and engage them in long-term conversations has captured the attention of many ai researchers.
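the long-range-dependency advantage of transformers mentioned above comes from self-attention; a minimal numpy sketch of single-head scaled dot-product attention (only the core operation, not the paper's full architecture) is:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention, the building block of transformer
    models: each query attends to all keys, so any position can draw
    information from any other in one step."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over keys
    return weights @ V, weights
```

because the attention weights connect every pair of positions directly, path length between distant tokens is constant, unlike in gated recurrent networks where information must pass through every intermediate step.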
in general, dialog systems are mainly of two types - task-oriented dialog systems and open-domain dialog systems. task-oriented dialog systems converse with the users to complete a specific task such as assisting customers to book a ticket or shop online. on the other hand, an open-domain dialog system can help users to share information, ask questions, and develop social etiquette through a series of conversations. early works in this area were typically rule-based or learning-based methods [12, 13, 17, 28] . rule-based methods often require human experts to form rules for training the system, whereas learning-based methods learn from a specific algorithm, which makes them less flexible to adapt to other domains. data from various social media platforms like twitter, reddit, and other community question-answering (cqa) platforms have provided us with a large number of human-to-human conversations. data-driven approaches developed by [6, 16] can be used to handle such problems. retrieval-based methods [6] select a suitable response from a predefined set of candidate responses by ranking them in order of similarity (e.g., by matching the number of common words) against the input sentence. the selection of responses from a fixed predefined set makes them static and repetitive. [16] built a system based on phrase-based statistical machine translation to exploit single-turn conversations. [30] presented a deep learning-based method for retrieval-based systems. a brief review of these methods is presented by [2] . lately, generation-based models have become quite popular. [19, 22, 23, 25] presented several generative models based on neural networks for building efficient conversational dialog systems. moreover, several other techniques, for instance generative adversarial networks (gan) [10, 29] and conditional variational autoencoders (cvae) [3, 7, 18, 20, 32, 33] , have also been applied to dialog generation.
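the word-overlap ranking used by retrieval-based systems such as [6] can be sketched in a few lines of python. this is a minimal illustration of the idea, not the method of any cited paper; the candidate pool below is invented for the example.

```python
def overlap_score(query, candidate):
    """Score a candidate's source utterance by the number of shared words."""
    return len(set(query.lower().split()) & set(candidate.lower().split()))

def retrieve_response(query, pairs):
    """Return the response whose paired utterance best matches the query."""
    best_utt, best_resp = max(pairs, key=lambda p: overlap_score(query, p[0]))
    return best_resp

# toy predefined utterance-response pairs (illustrative only)
pairs = [
    ("how is the weather today", "it is sunny outside"),
    ("tell me a joke", "why did the chicken cross the road"),
    ("what is your name", "i am a virtual assistant"),
]
```

any query is answered with one of the three canned responses above, which is exactly the static, repetitive behavior the paragraph describes.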
conversations generated from retrieval-based methods are highly fluent, grammatically correct, and of good quality as compared to dialogues generated from the generative methods. their high-quality performance is subject to the availability of an extensive repository of human-human interactions. responses generated by neural generative models, by contrast, are more varied in nature but often lack grammatical correctness. techniques that can combine the power of both retrieval-based methods and generative methods can be adopted in such situations. on the whole, hybrid methods [21, 27, 31, 34] first find some relevant responses using retrieval techniques and then leverage them to generate contextually relevant responses in the next stage. in this paper, we propose a novel method for building an efficient virtual assistant using single-turn open-domain conversational data. we use a self-attention-based transformer model, instead of rnn-based models, to obtain the representation of our input sequences. we observe that our method can generate more diverse and relevant responses. our goal is to generate contextually relevant responses for single-turn conversations. given an input utterance u = (u_1, u_2, ..., u_n) composed of n words, we try to generate a target response y = (y_1, y_2, ..., y_m). we use pre-trained glove [15] embeddings to initialize the word vectors. glove utilizes two main methods from the literature to build its vectors: global matrix factorization and local context window methods. the glove model is trained on the non-zero entries of a global word-word co-occurrence matrix, which records how frequently two words occur together in a given corpus. the embeddings used in our model are trained on the common crawl dataset with 840b tokens and a 2.2m-word vocabulary. we use 300-dimensional vectors. we formulate our task of response generation as a machine translation problem.
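glove vectors are distributed as plain-text lines of a word followed by its float components, so initializing an embedding table reduces to parsing such lines. a minimal loader sketch follows; the two sample lines are invented stand-ins for a real file such as glove.840B.300d.txt (real glove files are much larger, and some 840b tokens contain internal spaces that a production parser must handle).

```python
def load_glove(lines):
    """Parse glove-format lines ('word v1 v2 ...') into a dict of float vectors."""
    vecs = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        vecs[parts[0]] = [float(x) for x in parts[1:]]
    return vecs

# tiny in-memory stand-in for a real glove file (3-dimensional toy vectors)
sample = ["the 0.1 0.2 0.3", "cat -0.4 0.5 0.6"]
emb = load_glove(sample)
```

words absent from the table (oov words) would then be initialized randomly or to zeros before training.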
we define two baseline models based on deep learning techniques to conduct our experiments. first, we build a neural sequence-to-sequence model [23] based on bi-directional long short-term memory (bi-lstm) [5] cells. the second model utilizes the attention mechanism [1] to align input and output sequences. we train these models using the glove word embeddings as input features. to build our first baseline, we use a neural encoder-decoder [23] model. the encoder, which contains rnn cells, converts the input sequence into a context vector. the context vector is an abstract representation of the entire input sequence. the context vector forms the input for a second rnn-based decoder, which learns to output the target sequence one word at a time. our second baseline uses an attention layer [1] between the encoder and decoder, which helps in deciding which words in the input sequence to focus on in order to predict the next word correctly. the third model, which is our proposed method, is based on the transformer network architecture [24] . we use glove word embeddings as input features for our proposed model. we develop the transformer encoder as described in [24] to obtain the representation of the input sequence and the transformer decoder to generate the target response. figure 1 shows the proposed architecture. the input to the transformer encoder is the sum of the embedding e(u_n) of the current word and the positional encoding pe(n) of the n-th word: x_n = e(u_n) + pe(n). there are a total of n_x identical layers in the transformer encoder. each layer contains two sub-layers - a multi-head attention layer and a position-wise feed-forward layer. we encode the input utterances and target responses of our dataset using multi-head self-attention. the second sub-layer performs a linear transformation over the outputs of the first sub-layer. a residual connection is applied around each of the two sub-layers, followed by layer normalization.
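the positional encoding pe(n) mentioned above is the sinusoidal scheme of [24]; a small pure-python sketch of it, and of adding it to a word embedding, might look as follows (the toy 4-dimensional embedding is illustrative, not from the paper):

```python
import math

def positional_encoding(pos, d_model):
    """Sinusoidal positional encoding: sin/cos pairs at geometrically spaced frequencies."""
    pe = []
    for i in range(0, d_model, 2):
        angle = pos / (10000 ** (i / d_model))
        pe.append(math.sin(angle))
        if i + 1 < d_model:
            pe.append(math.cos(angle))
    return pe

def encoder_input(embedding, pos):
    """x_n = e(u_n) + pe(n): element-wise sum of embedding and positional encoding."""
    return [e + p for e, p in zip(embedding, positional_encoding(pos, len(embedding)))]
```

because the encoding is a fixed function of position, it lets the otherwise order-agnostic self-attention layers distinguish word positions without any learned parameters.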
the following equations represent the layers:

m_1 = layernorm(x + multihead(x, x, x))
f_1 = layernorm(m_1 + ffn(m_1))

where m_1 is the hidden state returned by the first layer of multi-head attention and f_1 is the representation of the input utterance obtained after the first feed-forward layer. the above steps are repeated for the remaining layers:

m_n = layernorm(f_{n-1} + multihead(f_{n-1}, f_{n-1}, f_{n-1}))
f_n = layernorm(m_n + ffn(m_n))

where n = 1, ..., n_x (with f_0 = x). we use c to denote the final representation of the input utterance obtained at the n_x-th layer: c = f_{n_x}. similarly, for decoding the responses, we use the transformer decoder. there are n_y identical layers in the decoder as well. the encoder and decoder layers are quite similar to each other, except that the decoder layer has two multi-head attention layers to perform self-attention and encoder-decoder attention, respectively. r_y = [y_1, ..., y_m], where y_m = e(y_m) + pe(m). to predict the next word, we apply a softmax over the decoder outputs to obtain the word probabilities. in this section, we present the details of the datasets used in our experiments, along with a detailed overview of the experimental settings. our dataset comprises single-turn conversations from ten different domains - data about the user, competitors, emotion, emergency, greetings, about bixby, entertainment, sensitive, device, and event. professional annotators with a linguistics background and relevant expertise created this dataset. the total dataset comprises 184,849 utterance-response pairs, with an average of 7.31 and 14.44 words per utterance and response, respectively. we first split the data into train and test sets in a 95:5 ratio. we then use 5% of the training data for preparing the validation set. the dataset details are given in table 2 . some examples from the dataset are shown in table 1 . we use two different types of models for our experiments - recurrent and transformer-based sequence-to-sequence generative models. all data loading, model implementations, and evaluation were done using opennmt [9] as the code framework.
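the multihead(·) term in the layer equations above is built from scaled dot-product attention. a single-head, pure-python sketch of that core operation is shown below; the real model uses several heads with learned projection matrices, plus the residual connections and layer normalization described in the text, all of which are omitted here for brevity.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention(Q, K, V):
    """Scaled dot-product attention for lists of vectors (one head, no projections)."""
    d = len(K[0])
    out = []
    for q in Q:
        # similarity of the query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        w = softmax(scores)
        # attention output: weighted average of the value vectors
        out.append([sum(wj * v[i] for wj, v in zip(w, V)) for i in range(len(V[0]))])
    return out
```

in self-attention Q, K and V are all derived from the same sequence, so each position ends up as a weighted mixture of every position, which is why long-range dependencies are captured in a single layer.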
we train a seq2seq model where the encoder and decoder are parameterized as lstms [5] . we also experiment with the seq2seq model with an attention mechanism [1] between the decoder and the encoder outputs. the encoder and decoder lstms have 2 layers with 512-dimensional hidden states and a dropout rate of 0.1. for the transformer model, the layers of both encoder and decoder are set to 6, with 512-dimensional hidden states and a dropout of 0.1. there are 8 multi-head attention heads and 2048 nodes in the feed-forward hidden layers. the dimension of the word embedding is empirically set to 512. we use adam [8] for optimization. when decoding the responses, the beam size is set to 5. automatic evaluation: we use standard metrics like bleu [14] , rouge [11] and perplexity for the automatic evaluation of our models. perplexity is reported on the generated responses from the validation set. lower perplexity indicates better performance of the models. bleu and rouge measure the n-gram overlap between a generated response and a gold response. higher bleu and rouge scores indicate better performance. to qualitatively evaluate our models, we perform human evaluation on the generated responses. we sample 200 random responses from our test set for the human evaluation. given an input utterance, target response, and predicted response triplet, two experts with post-graduate exposure were asked to evaluate the predicted responses based on two criteria: 1. fluency: the predicted response is fluent in terms of grammar. 2. adequacy: the predicted response is contextually relevant to the given utterance. we measure fluency and adequacy on a 0-2 scale, with '0' indicating an incomplete or incorrect response, '1' an acceptable response, and '2' a perfect response. to measure inter-annotator agreement, we compute the fleiss kappa [4] score. we obtained a kappa score of 0.99 for fluency and a score of 0.98 for adequacy, denoting "good agreement".
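the core quantity behind the bleu score [14] mentioned above is clipped n-gram precision; the experiments use standard bleu/rouge implementations, but the idea can be sketched in a few lines (this is only the per-n precision, without bleu's geometric averaging and brevity penalty):

```python
from collections import Counter

def ngram_precision(candidate, reference, n=1):
    """Clipped n-gram precision: matched candidate n-grams / total candidate n-grams."""
    cand = candidate.split()
    ref = reference.split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    # each candidate n-gram is credited at most as often as it appears in the reference
    clipped = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
    total = sum(cand_ngrams.values())
    return clipped / total if total else 0.0
```

the clipping step is what stops degenerate outputs like "the the the" from scoring perfectly against a reference containing "the" once.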
in this section we report the results for all our experiments. the first two experiments (seq2seq & seq2seq attn) are conducted with our baseline models. our third experiment (cf. fig. 1 ) is carried out on our proposed model using word embeddings as the input features. table 3 and table 4 show the automatic and manual evaluation results for both the baseline and the proposed model. our proposed model has lower perplexity and higher bleu and rouge scores than the baselines. the improvements over the other models are statistically significant. for all the evaluation metrics, seq2seq attn has the highest score among the baselines, and our model outperforms those scores by a decent margin. for adequacy, we find that our seq2seq model achieves the highest score of 73.70 among the baseline models. our proposed model outperforms the baselines with a score of 81.75. for fluency, we observe that the responses generated by all the models are quite fluent in general. to observe our results in more detail, we performed an error analysis on the predicted responses. table 5 presents an example of such an error: the predicted response is not the best-fit reply to the utterance "you are online", as the response falls out of context for the given utterance. in this paper, we propose an effective model for response generation using single-turn conversations. firstly, we created a large single-turn conversational dataset, and then built a transformer-based framework to model the short-turn conversations effectively. empirical evaluation, in terms of both automatic and human-based metrics, shows encouraging performance. in qualitative and quantitative analyses of the generated responses, we observed the predicted responses to be highly relevant in terms of context, but also observed some incorrect responses, as discussed in our results and analysis section. overall, we observed that our proposed model attains improved performance when compared with the baseline results.
in the future, apart from improving the architectural designs and training methodologies, we look forward to evaluating our models on a much larger dataset of single-turn conversations.

references:
[1] neural machine translation by jointly learning to align and translate
[2] deep retrieval-based dialogue systems: a short review
[3] variational autoregressive decoder for neural response generation
[4] measuring nominal scale agreement among many raters
[5] long short-term memory
[6] an information retrieval approach to short text conversation
[7] generating informative responses with controlled sentence function
[8] adam: a method for stochastic optimization
[9] opennmt: open-source toolkit for neural machine translation
[10] adversarial learning for neural dialogue generation
[11] rouge: a package for automatic evaluation of summaries
[12] njfun - a reinforcement learning spoken dialogue system
[13] reinforcement learning of question-answering dialogue policies for virtual museum guides
[14] bleu: a method for automatic evaluation of machine translation
[15] glove: global vectors for word representation
[16] data-driven response generation in social media
[17] a survey of statistical user simulation techniques for reinforcement-learning of dialogue management strategies
[18] a hierarchical latent variable encoder-decoder model for generating dialogues
[19] neural responding machine for short-text conversation
[20] improving variational encoder-decoders in dialogue generation
[21] an ensemble of retrieval-based and generation-based human-computer conversation systems
[22] a neural network approach to context-sensitive generation of conversational responses
[23] sequence to sequence learning with neural networks
[24] attention is all you need
[25] a neural conversational model
[26] the generalization of student's problem when several different population variances are involved
[27] retrieve and refine: improved sequence generation models for dialogue
[28] partially observable markov decision processes for spoken dialog systems
[29] diversity-promoting gan: a cross-entropy based generative adversarial network for diversified text generation
[30] learning to respond with deep neural networks for retrieval-based human-computer conversation system
[31] a hybrid retrieval-generation neural conversation model
[32] unsupervised discrete sentence representation learning for interpretable neural dialog generation
[33] learning discourse-level diversity for neural dialog models using conditional variational autoencoders
[34] the design and implementation of xiaoice, an empathetic social chatbot

acknowledgement. the research reported in this paper is an outcome of the project "dynamic natural language response to task-oriented user utterances", supported by samsung research india, bangalore.

key: cord-048461-397hp1yt
authors: coelho, flávio c; cruz, oswaldo g; codeço, cláudia t
title: epigrass: a tool to study disease spread in complex networks
date: 2008-02-26
journal: source code biol med
doi: 10.1186/1751-0473-3-3
sha:
doc_id: 48461
cord_uid: 397hp1yt

background: the construction of complex spatial simulation models, such as those used in network epidemiology, is a daunting task due to the large amount of data involved in their parameterization. such data, which frequently resides in large geo-referenced databases, has to be processed and assigned to the various components of the model. all this just to construct the model; then it still has to be simulated and analyzed under different epidemiological scenarios. this workflow can only be achieved efficiently by computational tools that can automate most, if not all, of these time-consuming tasks. in this paper, we present a simulation software, epigrass, aimed at helping to design and simulate network-epidemic models with any kind of node behavior. results: a network epidemiological model representing the spread of a directly transmitted disease through a bus-transportation network connecting mid-size cities in brazil.
results show that the topological context of the starting point of the epidemic is of great importance from both control and preventive perspectives. conclusion: epigrass is shown to greatly facilitate the construction, simulation and analysis of complex network models. the output of model results in standard gis file formats facilitates the post-processing and analysis of results by means of sophisticated gis software. epidemic models describe the spread of infectious diseases in populations. more and more, these models are being used for predicting, understanding and developing control strategies. to be used in specific contexts, modeling approaches have shifted from "strategic models" (where a caricature of real processes is modeled in order to emphasize first principles) to "tactical models" (detailed representations of real situations). tactical models are useful for cost-benefit and scenario analyses. good examples are the foot-and-mouth epidemic models for the uk, triggered by the need for a response to the 2001 epidemic [1, 2] , and the simulation of pandemic flu in different scenarios, helping authorities to choose among alternative intervention strategies [3, 4] . in realistic epidemic models, a key issue to consider is the representation of the contact process through which a disease is spread, and network models have arisen as good candidates [5] . this has led to the development of "network epidemic models". a network is a flexible concept that can be used to describe, for example, a collection of individuals linked by sexual partnerships [6] , a collection of families linked by sharing workplaces/schools [7] , or a collection of cities linked by air routes [8] . any of these scales may be relevant to the study and control of disease spread [9] . networks are made of nodes and their connections. one may classify network epidemic models according to node behavior.
one example would be a classification based on the states assumed by the nodes: networks with discrete-state nodes have nodes characterized by a discrete variable representing their epidemiological status (for example, susceptible, infected, recovered). the state of a node changes in response to the state of neighbor nodes, as defined by the network topology and a set of transmission rules. networks with continuous-state nodes, on the other hand, have their state described by a quantitative variable (number of susceptibles, or density of infected individuals, for example), modelled as a function of the history of the node and its neighbors. the importance of the concept of neighborhood to any kind of network epidemic model stems from its large overlap with the concept of transmission. in network epidemic models, transmission either defines or is defined/constrained by the neighborhood structure. in the latter case, a neighborhood structure is given a priori, which will influence transmissibility between nodes. the construction of complex simulation models, such as those used in network epidemic models, is a daunting task due to the large amount of data involved in their parameterization. such data frequently resides in large geo-referenced databases. this data has to be processed and assigned to the various components of the model. all this just to construct the model; then it still has to be simulated and analyzed under different epidemiological scenarios. this workflow can only be achieved efficiently by computational tools that can automate most, if not all, of these time-consuming tasks. in this paper, we present a simulation software, epigrass, aimed at helping to design and simulate network-epidemic models with any kind of node behavior. without such a tool, implementing network epidemic models is not a simple task, requiring a reasonably good knowledge of programming.
we expect that this software will stimulate the use and development of network models for epidemiological purposes. the paper is organized as follows: first we describe the software and how it is organized, with a brief overview of its functionality. then we demonstrate its use with an example. the example simulates the spread of a directly transmitted infectious disease in brazil through its transportation network. the velocity of spread of new diseases in a network of susceptible populations depends on their spatial distribution, size, susceptibility and patterns of contact. on a spatial scale, climate and environment may also impact the dynamics of geographical spread, as they introduce temporal and spatial heterogeneity. understanding and predicting the direction and velocity of an invasion wave is key for emergency preparedness. epigrass is a platform for network epidemiological simulation and analysis. it enables researchers to perform comprehensive spatio-temporal simulations incorporating epidemiological data and models for disease transmission and control in order to create complex scenario analyses. epigrass is designed to facilitate the construction and simulation of large-scale metapopulational models. each component population of such a metapopulational model is assumed to be connected through a contact network which determines migration flows between populations. this connectivity model can be easily adapted to represent any type of adjacency structure. epigrass is entirely written in the python language, which contributes greatly to the flexibility of the whole system due to the dynamical nature of the language. the geo-referenced networks over which epidemiological processes take place can be very straightforwardly represented in an object-oriented framework. consequently, the nodes and edges of the geographical networks are objects with their own attributes and methods (figure 1).
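the object-oriented representation of a geo-referenced network described above can be sketched as follows. the class and attribute names here are illustrative, not epigrass's actual api; in epigrass, site objects additionally hold an intra-node epidemiological model.

```python
class Node:
    """A locality (site): holds its own attributes and references to incident edges."""
    def __init__(self, name, population):
        self.name = name
        self.population = population
        self.neighbors = []  # edges incident on this node

class Edge:
    """A contact/transportation route connecting two nodes, carrying a flow."""
    def __init__(self, source, target, flow):
        self.source, self.target, self.flow = source, target, flow
        source.neighbors.append(self)
        target.neighbors.append(self)

class Network:
    """The whole model: a container object owning all node and edge objects."""
    def __init__(self):
        self.nodes, self.edges = {}, []

    def add_node(self, name, population):
        self.nodes[name] = Node(name, population)
        return self.nodes[name]

    def add_edge(self, a, b, flow):
        self.edges.append(Edge(self.nodes[a], self.nodes[b], flow))
```

because nodes keep references to their incident edges, each node can reach its neighbors' state directly at every time step, which is exactly what inter-node transmission rules need.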
once the archetypal node and edge objects are defined with appropriate attributes and methods, a code representation of the real system can be constructed, where nodes (representing people or localities) and contact routes are instances of node and edge objects, respectively. the whole network is also an object with its own set of attributes and methods. in fact, epigrass also allows for multiple edge sets in order to represent multiple contact networks in a single model.

(figure 1: architecture of an epigrass simulation model. a simulation object contains the whole model and all other objects representing the graph, sites and edges; site objects contain model objects, which can be one of the built-in epidemiological models or a custom model written by the user.)

these features lead to a compact and hierarchical computational model consisting of a network object containing a variable number of node and edge objects. it also does not pose limitations to encapsulation, potentially allowing for networks within networks, if desirable. this representation can also be easily distributed over a computational grid or cluster, if the dependency structure of the whole model does not prevent it (this feature is currently being implemented and will be available in a future release of epigrass). for the end-user, this hierarchical, object-oriented representation is not an obstacle, since it reflects the natural structure of the real system. even after the model is converted into a code object, all of its component objects remain accessible to one another, facilitating the exchange of information between all levels of the model, a feature the user can easily include in his/her custom models. nodes and edges are dynamical objects in the sense that they can be modified at runtime, altering their behavior in response to user-defined events. in epigrass it is very easy to simulate any dynamical system embedded in a network.
however, it was designed with epidemiological models in mind. this goal led to the inclusion of a collection of built-in epidemic models which can be readily used for the intra-node dynamics (the sir model family). epigrass users are not limited to basing their simulations on the built-in models. user-defined models can be developed in just a few lines of python code. all simulations in epigrass are done in discrete time. however, custom models may implement finer dynamics within each time step, for instance by implementing ode models at the nodes. the epigrass system is driven by a graphical user interface (gui), which handles several input files required for model definition and manages the simulation and output generation (figure 2). at the core of the system lies the simulator. it parses the model specification files, contained in a text file (.epg file), and builds the network from site and edge description files (comma-separated-values text files, csv). the simulator then builds a code representation of the entire model, simulates it, and stores the results in the database or in a couple of csv files. this output will contain the full time series of the variables in the model. additionally, a map layer (in shapefile and kml format) is also generated with summary statistics for the model (figure 3). the results of an epigrass simulation can be visualized in different ways. a map with an animation of the resulting time series is available directly through the gui (figure 4). other types of static visualizations can be generated through gis software from the shapefiles generated. the kml file can also be viewed in google earth™ or google maps™ (figure 5). epigrass also includes a report generator module, which is controlled through a parameter in the ".epg" file. epigrass is capable of generating pdf reports with summary statistics from the simulation. this module requires a latex installation to work.
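the simplest member of the sir model family mentioned above can be written as a deterministic discrete-time update. this is a generic textbook sketch, not epigrass's actual built-in implementation (which also handles visitors and stochastic variants):

```python
def sir_step(s, i, r, beta, gamma, n):
    """One discrete-time SIR update.

    New infections: beta * s * i / n (mass-action transmission).
    New recoveries: gamma * i.
    """
    new_inf = beta * s * i / n
    new_rec = gamma * i
    return s - new_inf, i + new_inf - new_rec, r + new_rec
```

iterating this map over the simulation horizon at every node, and coupling the nodes through travel, is the basic structure of an epigrass run.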
reports are most useful for general verification of expected model behavior and network structure. however, the latex source files generated by the module may serve as templates that the user can edit to generate a more complete document.

(figure 2: epigrass graphical user interface.)
(figure 3: workflow for a typical epigrass simulation. this diagram shows all inputs and outputs typical of an epigrass simulation session.)

building a model in epigrass is very simple, especially if the user chooses to use one of the built-in models. epigrass includes 20 different epidemic models ready to be used (see the manual for a description of the built-in models). to run a network epidemic model in epigrass, the user is required to provide three separate text files (optionally, also a shapefile with the map layer):

1. node-specification file: this file can be edited in a spreadsheet and saved as a csv file. each row is a node and the columns are variables describing the node.
2. edge-specification file: this is also a spreadsheet-like file with one edge per row. columns contain flow variables.
3. model-specification file: also referred to as the ".epg" file. this file specifies the epidemiological model to be run at the nodes, its parameters, the flow model for the edges, and general parameters of the simulation.

the ".epg" file is normally modified from templates included with epigrass. the nodes and edges files, on the other hand, have to be built from scratch for every new network. details of how to construct these files, as well as examples, can be found in the documentation accompanying the software, which is available at the project's website [10] . in the example application, the spread of a respiratory disease through a network of cities connected by bus transportation routes is analyzed. the epidemiological scenario is one of the invasion of a new influenza-like virus.
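a node-specification csv of the kind described above can be read with python's standard csv module; the column names in the sample below are illustrative, not epigrass's required schema.

```python
import csv
import io

def read_nodes(text):
    """Parse a node-specification csv (one node per row) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))

# toy node file: one row per city, columns describe the node
nodes_csv = "name,population\nsao paulo,11000000\nsalvador,2900000\n"
nodes = read_nodes(nodes_csv)
```

an edge-specification file would be parsed the same way, with each row naming its two endpoint nodes and the flow between them.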
one may want to simulate the spread of this disease through the country via the transportation network to evaluate alternative intervention strategies (e.g. different vaccination strategies). in this problem, a network can be defined as a set of nodes and links, where nodes represent cities and links represent transportation routes. some examples of this kind of model are available in the literature [8, 11] . one possible objective of this model is to understand how the spread of such a disease may be affected by the point of entry of the disease in the network. to that end, we may look at variables such as the speed of the epidemic, the number of cases after a fixed amount of time, the distribution of cases in time, and the path taken by the spread. the example network was built from 76 of the largest cities of brazil (>= 100k inhabitants). the bus routes between those cities formed the connections between the nodes of the network. the number of edges in the network, derived from the bus routes, is 850.

(figure 4: epigrass animation output. sites are color coded (from red to blue) according to infection times; bright red is the seed site (in the ne).)
(figure 5: epigrass output visualized on google earth.)

these bus routes are registered with the national agency of terrestrial transportation (antt), which provided the data used to parameterize the edges of the network. the epidemiological model used consisted of a metapopulation system with a discrete-time seir model (eq. 1). for each city, s_t is the number of susceptibles in the city at time t, e_t is the number of infected but not yet infectious individuals, i_t is the number of infectious individuals resident in the locality, n is the population residing in the locality (assumed constant throughout the simulation), n_t is the number of individuals visiting the locality, and θ_t is the number of visitors who are infectious. the parameters used were taken from lipsitch et al.
(2003) [12] to represent a disease like sars, with an estimated basic reproduction number (r_0) of 2.2 to 3.6 (table 1). to simulate the spread of infection between cities, we used the concept of a "forest fire" model [13] . an infected individual, traveling to another city, acts as a spark that may trigger an epidemic in the new locality. this approach is based on the assumption that individuals commute between localities and contribute temporarily to the number of infected in the new locality, but not to its demography. implications of this approach are discussed in grenfell et al. (2001) [13] . the number of individuals arriving in a city (n_t) is based on the annual total number of passengers arriving through all bus routes leading to that city, as provided by the antt (brazilian national agency for terrestrial transportation). the annual number of passengers is used to derive an average daily number of passengers simply by dividing it by 365. stochasticity is introduced in the model at two points: the number of new cases is drawn from a poisson distribution with intensity λ_t = β s_t (i_t + θ_t) / (n + n_t), and the number of infectious visitors is modelled as a binomial process: θ_t = Σ_k binomial(n, i_{k,t-δ} / n_k), for all k neighbors, where n is the total number of passengers arriving from a given neighboring city; i_{k,t} and n_k are the current number of infectious individuals and the total population size of city k, respectively. δ is the delay associated with the duration of each bus trip. the delay δ was calculated as the number of days (rounded down) that a bus, traveling at an average speed of 60 km/h, would take to complete a given trip. the lengths in kilometers of all bus routes were also obtained from the antt. vaccination campaigns in specific (or all) cities can easily be carried out in epigrass, with individual coverages for each campaign in each city. we use this feature to explore vaccination scenarios in this model (figures 6 and 7).
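the two stochastic ingredients and the trip-delay rule described above can be sketched in pure python. the samplers below are generic stand-ins (textbook algorithms) for the library draws a real implementation would use, and the neighbor tuples are illustrative.

```python
import math
import random

def binomial(n, p, rng):
    """Number of successes in n Bernoulli(p) trials."""
    return sum(1 for _ in range(n) if rng.random() < p)

def poisson(lam, rng):
    """Knuth's multiplication algorithm for a Poisson(lam) draw."""
    L, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= L:
            return k
        k += 1

def travel_delay(route_km, speed_kmh=60, hours_per_day=24):
    """Trip delay in whole days (rounded down) at an average bus speed of 60 km/h."""
    return int(route_km / speed_kmh // hours_per_day)

def infectious_visitors(neighbors, rng):
    """theta_t: binomial draw of infectious passengers from each neighbor.

    neighbors: iterable of (passengers_per_day, infectious_in_k, population_of_k).
    """
    return sum(binomial(nk, ik / Nk, rng) for nk, ik, Nk in neighbors)
```

a full time step would then draw the number of new local cases as poisson(lambda_t) with theta_t plugged into the intensity, and advance each city's seir state accordingly.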
the files with this model's definition (the sites, edges and ".epg" files) are available as part of additional files 1, 2 and 3 for this article. to determine the importance of the point of entry in the outcome of the epidemic, the model was run 500 times, randomizing the point of entry of the virus. the seeding site was chosen with a probability proportional to the log10 of its population size. these replicates were run using epigrass' built-in support for repeated runs with the option of randomizing the seeding site. for every simulation, statistics about each site, such as the time it got infected and the time series of incidence, were saved. the time required for the epidemic to infect 50% of the cities was chosen as a global index of network susceptibility to invasion. to compare the relative exposure of cities to disease invasion, we also calculated, as a local measure of exposure, the inverse of the time elapsed from the beginning of the epidemic until the city registered its first indigenous case. except for population size, all other epidemiological parameters, that is, disease transmissibility and recovery rate, were the same for all cities. some positional features of each node were also derived: centrality, which is a measure derived from the average distance of a given site to every other site in the network; betweenness, which is the number of times a node figures in the shortest path between any other pair of nodes; and degree, which is the number of edges connected to a node. in order to analyze the path of the epidemic spread, we also recorded which cities provided the infectious cases that were responsible for the infection of each other city. if more than one source of infection exists, epigrass selects the city which contributed the largest number of infectious individuals at that time-step as the most likely infector.
figure 6: cost in vaccines applied vs. benefit in cases avoided, for a simulated epidemic starting at the highest degree city (são paulo).
figure 7: cost in vaccines applied vs. benefit in cases avoided, for a simulated epidemic starting at a relatively low degree city (salvador).
at the end of the simulation, epigrass generates a file with the dispersion tree in graphml format, which can be read by a variety of graph plotting programs to generate the graphic seen in figure 8. the computational cost of running a single time step in an epigrass model is mainly determined by the cost of calculating the epidemiological models on each site (node). therefore, the time required to run models based on larger networks should scale linearly with the size of the network (order of the graph), for simulations of the same duration. the model presented here took 2.6 seconds for a 100-day run on a 2.1 ghz cpu. a somewhat larger model, with 343 sites and 8735 edges, took 28 seconds for a 100-day simulation. very large networks may be limited by the amount of ram available. the authors are working on adapting epigrass to distribute processing among multiple cpus (in smp systems) or multiple computers in a cluster system. the memory demands can also be addressed by keeping the simulation objects in an object-oriented database during the simulation; steps in this direction are also being taken by the development team. the model presented here served mainly the purpose of illustrating the capabilities of epigrass for simulating and analyzing reasonably complex epidemic scenarios. it should not be taken as a careful and complete analysis of a real epidemic. despite that, some features of the simulated epidemic are worth discussing.
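the randomized choice of seeding site described above, with probability proportional to log10 of population size, can be sketched as follows (an illustrative sketch; the helper name and toy populations are our own, not part of epigrass):

```python
import math
import random

def choose_seed(populations, rng):
    # pick a seeding city with probability proportional to log10(population)
    names = list(populations)
    weights = [math.log10(populations[name]) for name in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

note that the log10 weighting flattens the distribution considerably: a city 10,000 times more populous is only a few times more likely to be chosen as the seed.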
for example: the spread speed of the epidemic, measured as the time taken to infect 50% of the cities, was found to be influenced by the centrality and degree of the entry node (figures 9 and 10). the dispersion tree corresponding to the epidemic is greatly influenced by the degree of the point of entry of the disease in the network. figure 8 shows the tree for the dispersion from the city of salvador.
figure 8: spread of the epidemic starting at the city of salvador, a city with relatively small degree (that is, a small number of neighbors). the number next to the boxes indicates the day when each city developed its first indigenous case.
figure 9: effect of degree (a) and betweenness (b) of the entry node on the speed of the epidemic.
figure 10: effect of betweenness of the entry node on the speed of the epidemic.
vaccination strategies must take network topology into consideration. figures 6 and 7 show cost-benefit plots for the three vaccination strategies investigated: uniform vaccination, top-3 degree sites only and top-10 degree sites only. vaccination of higher-degree sites offers cost/benefit advantages only in scenarios where the disease enters the network through one of these sites. epigrass greatly facilitates the simulation and analysis of complex network models. the output of model results in standard gis file formats facilitates the post-processing and analysis of results by means of sophisticated gis software. the non-trivial task of specifying the network over which the model will be run is left to the user.
but epigrass allows this structure to be provided as a simple list of sites and edges in text files, which can easily be constructed by the user using a spreadsheet, with no need for special software tools. besides invasion, network epidemiological models can also be used to understand patterns of geographical spread of endemic diseases [14] [15] [16] [17]. many infectious diseases can only be maintained in an endemic state in cities with population size above a threshold, or under appropriate environmental conditions (climate, availability of a reservoir, vectors, etc.). the variables and the magnitudes associated with the endemicity threshold depend on the natural history of the disease [18]. these magnitudes may vary from place to place, as they depend on the contact structure of the individuals. predicting which cities are sources of endemicity and understanding the path of recurrent traveling waves may help us to design optimal surveillance and control strategies.
references:
- modelling vaccination strategies against foot-and-mouth disease
- optimal reactive vaccination strategies for a foot-and-mouth outbreak in the uk
- strategy for distribution of influenza vaccine to high-risk groups and children
- containing pandemic influenza with antiviral agents
- space and contact networks: capturing the locality of disease transmission
- interval estimates for epidemic thresholds in two-sex network models
- applying network theory to epidemics: control measures for mycoplasma pneumoniae outbreaks
- assessing the impact of airline travel on the geographic spread of pandemic influenza
- modeling control strategies of respiratory pathogens
- epigrass website
- containing pandemic influenza at the source
- transmission dynamics and control of severe acute respiratory syndrome
- travelling waves and spatial hierarchies in measles epidemics
- travelling waves in the occurrence of dengue haemorrhagic fever in thailand
- modelling disease outbreaks in realistic urban social networks
- on the dynamics of flying insects populations controlled by large scale information
- large-scale spatial-transmission models of infectious disease
- disease extinction and community size: modeling the persistence of measles
the authors would like to thank the brazilian research council (cnpq) for financial support. fcc contributed with the software development, model definition and analysis, as well as general manuscript conception and writing. ctc contributed with model definition and implementation, as well as with writing the manuscript. ogc contributed with data analysis and writing the manuscript. all authors have read and approved the final version of the manuscript.
key: cord-031460-nrxtfl3i authors: sharma, vikas kumar; nigam, unnati title: modeling and forecasting of covid-19 growth curve in india date: 2020-09-05 journal: trans indian natl doi: 10.1007/s41403-020-00165-z sha: doc_id: 31460 cord_uid: nrxtfl3i
in this article, we analyze the growth pattern of the covid-19 pandemic in india from march 4 to july 11 using regression analysis (exponential and polynomial), the auto-regressive integrated moving averages (arima) model, as well as exponential smoothing and holt–winters models. we found that the growth of covid-19 cases follows a power regime of [formula: see text] after the exponential growth. we found the optimal change points from where the covid-19 cases shifted their course of growth from exponential to quadratic and then from quadratic to linear. after that, we saw a sudden spike in the course of the spread of covid-19 and the growth moved from linear to quadratic and then to quartic, which is alarming. we have also found the best-fitted regression models using various criteria, such as significant p values, coefficients of determination and anova, etc. further, we search for the best-fitting arima model for the data using the aic (akaike information criterion) and provide the forecast of covid-19 cases for future days.
we also use the usual exponential smoothing and holt–winters models for forecasting purposes. we further found that the arima (5, 2, 5) model is the best-fitting model for covid-19 cases in india. the covid-19 pandemic has created a lot of havoc in the world. it is caused by a virus called sars-cov-2, which comes from the family of coronaviruses and is believed to have originated from the unhygienic wet seafood market in wuhan, china, but it has now infected around 215 countries of the world. with more than 13.2 million people affected around the world and more than 575,000 deaths (as of july 14, 2020), it has forced people to stay in their homes and has caused huge devastation in the world economy (singh and singh 2020; ministry of health and family welfare 2020; gupta et al. 2019). in india, the first case of covid-19 was reported on 30th january, which was linked to the wuhan city of china (as the patient had a travel history to the city). on 4th march, india saw a sudden hike in the number of cases and since then, the numbers have been increasing day by day. as of 14th july, india has more than 908,000 cases with more than 23,000 deaths and is the world's third most-infected country (https://www.worldometers.info/coronavirus/). since the outbreak of the pandemic, scientists across the world have been engaged in studies regarding the spread of the virus. lin et al. (2020) suggested the use of the seir (susceptible-exposed-infectious-removed) model for the spread in china and studied the importance of government-implemented restrictions on containing the infection. as the disease grew further, ivorra et al. (2019) suggested a θ-seihrd model that took into account various special features of the disease, including asymptomatic cases (around 51%), to forecast the total cases in china (around 168,500). giordano et al.
(2020) also suggested an extended sir model, called the sidarthe model, for cases in italy, which was more customized for covid-19, to effectively model the course of the pandemic and help plan a better control strategy. petropoulos and makridakis (2020) suggested the use of exponential smoothing methods to model the trend of the virus globally. kumar et al. (2020) gave a review of the various aspects of modern technology used to fight the covid-19 crisis. apart from the epidemiological models, various data-oriented models have also been suggested to model the cases and predict future cases of disease outbreaks from time to time; in particular, various time-series models have been used for this purpose. arima and seasonal arima models are widely used by researchers to model and predict the cases of various outbreaks. in 2005, earnest et al. (2005) conducted research to model and predict the cases of sars in singapore and to predict the hospital supplies needed using this model. gaudart et al. (2009) modelled malaria incidence in the savannah area of mali using arima. zhang et al. (2013) compared a seasonal arima model with three other time-series models for typhoid fever incidence in china. polwiang (2020) also used this model to determine the time-series pattern of dengue fever in bangkok. for covid-19 as well, various researchers have tried to model the cases through arima. ceylan (2020) suggested the use of the auto-regressive integrated moving average (arima) model to develop and predict the epidemiological trend of covid-19 for better allocation of resources and proper containment of the virus in italy, spain and france. chintalapudi (2020) suggested its use for predicting the number of cases and deaths post 60-days lockdown in italy. fanelli and francesco (2020) analyzed the dynamics of covid-19 in china, italy and france using iterative time-lag maps, and further used the sird model to model and predict the cases and deaths in these countries.
zhang et al. (2020) developed a segmented poisson model to analyze the daily new cases of six countries to find a peak point in the cases. since the spread of the virus started to grow in india, various measures have been taken by the indian government to contain it. a nationwide lockdown was announced from march 25 to april 14, which was later extended to may 3. the whole country was divided into containment zones (where a large number of cases were observed in a relatively small region), red zones (districts where the risk of transmission was high and doubling rates were higher), green zones (districts with no confirmed case in the last 21 days) and orange zones (districts which did not fall into the above three zones). after the further extension of the lockdown till may 17, various economic activities were allowed to restart (with high surveillance) in areas of low transmission. further, the lockdown was extended to may 31 and some more economic activities were allowed, depending on the transmission rates, which are the rates at which infectious cases cause new cases in the population, i.e. the rate of spread of the disease. this was further extended to june 8, with far fewer restrictions, and the states were given the responsibility of setting the lockdown rules. air and rail transport became open to the general public. post june 8, the restrictions are nominal, with even shopping malls and religious places open to the general public, and the responsibility of imposing restrictions now lies with the respective state governments. on the other hand, indian scientists and researchers are also working on addressing the issues arising from the pandemic, including the production of ppe kits and test kits, as well as studying the behaviour of the spread of the disease and other aspects of management. various mathematical and statistical methods have been used for predicting the possible spread of covid-19. the classical epidemiological models (sir, seir, siqr, etc.)
suggested an increasing trend of the virus and predicted the peaks of the pandemic. early research projected the pandemic to reach its peak by mid-may and showed that the basic reproduction number (r0) and the doubling rates are lower in india in comparison to european nations and the usa. a tree-based model was proposed by arti and bhatnagar (2020) and bhatnagar (2020) to study and predict the trends. they suggest that lockdown and social distancing in india have played a significant role in controlling the infection rates. but now, as the lockdown restrictions are minimal, the cases in india are growing at an alarmingly high rate. chatterjee et al. (2020) suggest growth of the pandemic through a power law and its saturation at the later stages. due to the complexities in the epidemic models of covid-19, various researchers have been focusing on the data themselves to forecast the future cases. chatterjee et al. (2020), verma et al. (2020) and ziff and ziff (2020) suggest that after exponential growth, the total count follows a power regime of t³, t², t and √t before flattening out, where t refers to time. it can, therefore, be realized that there is an urgent need to model and forecast the growth of covid-19 in india, as the virus is in its growing stage here. in india, the most affected states are maharashtra with over 260,000 cases (as of 14 july 2020), tamil nadu (around 142,000 cases), delhi (around 113,000 cases) and gujarat (around 42,000 cases). the greatest number of cases per million has been seen in the national capital of delhi (5740 cases per million) (refer to https://nhm.gov.in/new_updates_2018/report_population_projection_2019.pdf for population estimates). many states and union territories, like kerala, karnataka, andaman and nicobar islands, daman and diu, etc., which had recovered from the majority of the cases, have experienced a second wave of infections. this might be attributed to decreased travel restrictions and minimal lockdown measures.
in their research, singh and jadaun (2020) studied the significance of lockdown in india and suggested that new covid-19 cases would stop by the end of august in india, with around 350,000 total cases. while some states may see an early stopping of new cases, such as telangana (mid-june) and uttar pradesh and west bengal (july end), the badly affected states of maharashtra, tamil nadu and gujarat will achieve this by august end. since a proven vaccine or medication is yet to be developed by researchers, modelling the present situation and forecasting the future outcome become crucially important in order to utilize our resources in the most optimal way. therefore, this article aims to study the growth curve of covid-19 cases in india and forecast its future course. since the disease is still in its growing stage and very dynamic in nature, no model can guarantee perfect validity for the future; we, therefore, need to develop an understanding of the present situation of the pandemic. in this article, we first study the growth curve using regression methods (exponential, linear, polynomial, etc.) and propose an optimal model for fitting the cases till july 10. further, we propose the use of time-series models for forecasting future observations of covid-19 cases. here, we reach the best-fitted arima model for forecasting the covid-19 cases. we also compare these results with the exponential smoothing (holt–winters) model. this study will help us to understand the course of the spread of sars-cov-2 in india better and help the government and the people to optimally use the resources available to them. in this section, we briefly present the statistical techniques used for analyzing the covid-19 cases in india. here, we use the usual regression (exponential, polynomial), time-series (arima) and exponential smoothing models.
regression is a statistical technique that attempts to estimate the strength and nature of the relationship between a dependent variable and a series of independent variables. regression analyses may be linear or non-linear. a regression is called linear when it is linear in its parameters, e.g. y = β_0 + β_1 t + ε and y = β_0 + β_1 t + β_2 t^2 + β_3 t^3 + ε, with ε ∼ n(0, σ^2), where y is the response variable, t denotes the independent variable, β_0 is the intercept and the other βs are known as slopes. a regression is non-linear when it is non-linear in its parameters, e.g. y = θ_1 e^(θ_2 x) + ε. at the beginning of the spread of a disease, the new cases are directly proportional to the existing infected cases, which may be represented by dy(t)/dt = k y(t), where k is the proportionality constant. solving this differential equation, we get y(t) = y(0) e^(kt); thus, at the beginning of a disease, the growth curve of the cases grows exponentially. as the disease spreads in a region, governments start to take action and people start becoming conscious about the disease. thus, after some time, the disease starts to follow a polynomial growth rather than continuing to grow exponentially. in order to fit an exponential regression to our data, we linearize the equation by taking its natural logarithm, converting it to a first-order linear regression. we estimate the parameters of a linear regression of order p as follows. let the model of linear regression of order p be y = β_0 + β_1 t + β_2 t^2 + ⋯ + β_p t^p + ε. we get the best estimates of the coefficients by solving the normal equations that minimize the residual sum of squares (rss). this technique is referred to as ordinary least squares (ols), and we will use it to estimate the coefficients of our proposed model (refer to montgomery et al. (2012)).
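for the simplest case (order p = 1), the normal equations have a closed-form solution; the sketch below is illustrative and assumes nothing beyond the ols definition in the text:

```python
def ols_line(t, y):
    # closed-form ols estimates for y = b0 + b1 * t,
    # obtained by solving the two normal equations
    n = len(t)
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    b1 = (sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
          / sum((ti - t_bar) ** 2 for ti in t))
    b0 = y_bar - b1 * t_bar
    return b0, b1
```

the same fit applied to log-transformed counts recovers the linearized exponential model described above.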
since we know that the growth curve of the disease changes after some time point from exponential to polynomial, we propose to use the following joint regression model with change point μ: y(t) = f_1(t) for t ≤ μ and y(t) = f_2(t) for t > μ, where f_1(t) = θ_1 e^(θ_2 t), f_2(t) = β_0 + β_1 t + β_2 t^2 + ⋯ + β_p t^p + ε, ε ∼ n(0, σ^2), p is the order of the polynomial regression model and t stands for the time (the independent variable). during the analysis, we found that a suitable choice of f_2(t) is a quadratic or a cubic model. once the order of the polynomial is kept fixed, an optimum value of the change point can be obtained by minimizing the residuals/errors. we can obtain the ols estimates of the parameters of model (1) as given below. the least squares estimates (lses) of the parameters θ = (θ_1, θ_2, μ, β_0, β_1, β_2, β_3, …, β_p) can be obtained by minimizing the residual sum of squares (rss), rss = Σ_{i ≤ μ} (y_i − ŷ_i^exp)^2 + Σ_{i > μ} (y_i − ŷ_i^poly)^2, where ŷ_i^exp and ŷ_i^poly are the estimated values of y_i from the exponential and polynomial regression models, respectively, and n is the size of the dataset. the lses of θ can be obtained as the simultaneous solution of the corresponding normal equations. a direct solution of these equations is difficult since the parameter μ is a discrete time point, so we suggest using the following algorithm, in which μ is kept fixed at each step. in order to find the optimal value of μ, i.e. the turning point between the exponential and polynomial growth, we use the technique of minimizing the residual sum of squares in "analysis of covid-19 cases in india". we will use the mape (mean absolute percentage error), mape = (100/n) Σ_t |y_t − ŷ_t| / y_t, to evaluate the performance of the models, where y_t is the observed value at time point t and ŷ_t is an estimate of y_t. in order to make the results easy to interpret, we will also report accuracy (%). the auto-regressive integrated moving averages method gauges the strength of one dependent variable relative to other changing variables.
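returning to the change-point model above: since μ is a discrete time point, the algorithm fixes μ, fits the two segments, and keeps the μ with the smallest total rss. a minimal grid-search sketch, where the two fit functions are placeholders standing in for the exponential and polynomial fits:

```python
def best_change_point(y, rss_left, rss_right, min_seg=3):
    # try every candidate change point mu, fit the left model on y[:mu]
    # and the right model on y[mu:], keep the mu minimizing the total rss
    best_mu, best_rss = None, float("inf")
    for mu in range(min_seg, len(y) - min_seg):
        total = rss_left(y[:mu]) + rss_right(y[mu:])
        if total < best_rss:
            best_mu, best_rss = mu, total
    return best_mu, best_rss
```

the `min_seg` guard simply keeps each segment long enough to fit its model.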
it is one of the most used time-series models in diverse fields of data analysis, as it takes into account changing trends, periodic changes as well as random disturbances in the time-series data. it is used both for better understanding of the data and for forecasting; see brockwell et al. (1996). the autoregressive model (ar) is effectively merged with the moving averages model (ma) to formulate a useful time-series model, the arma model. the autoregression (ar) element of the model shows a changing variable that regresses on its own prior values, and the moving average (ma) element incorporates the dependency between an observation and the residual errors from a moving average model applied to prior observations. however, this model can only be applied to stationary data. since many real-life datasets contain an element of non-stationarity, the arima model was developed to model such datasets. this model is open to non-stationary data, as the integrated (i) factor of the model represents the differencing of raw observations that allows the time-series to become stationary. here, we refer the reader to box et al. (2008, 2015) for more details on the arima model, its estimation and its application. the general forms of the ar(p) and ma(q) models can be represented, respectively, as y_t = α + φ_1 y_{t−1} + ⋯ + φ_p y_{t−p} + ε_t and y_t = α + ε_t + θ_1 ε_{t−1} + ⋯ + θ_q ε_{t−q}, where the φs and θs are the auto-regressive and moving average parameters, respectively, y_t represents the value of the time-series at time point t, and ε_t represents the random disturbance at time point t, assumed to be independently and identically distributed (i.i.d.) with mean 0 and variance σ^2. the arma(p, q) model combines the two: y_t = α + φ_1 y_{t−1} + ⋯ + φ_p y_{t−p} + ε_t + θ_1 ε_{t−1} + ⋯ + θ_q ε_{t−q}, where α is an intercept. the differenced stationary time-series can then be modelled as an arma model, which yields the arima model for the original time-series data (ceylan 2020; he and tao 2018; manikandan et al. 2016).
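the integrated part of arima is just repeated differencing; a sketch (our own helper, not part of the packages the authors used):

```python
def difference(y, d=1):
    # apply the differencing operator (1 - L) d times; for d = 2 this
    # gives y*_t = y_t - 2*y_{t-1} + y_{t-2}, the series the arma part models
    for _ in range(d):
        y = [y[i] - y[i - 1] for i in range(1, len(y))]
    return y
```

differencing a quadratic trend twice yields a constant series, which is why d = 2 stabilizes growth curves like the ones studied here.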
the arima model is generally denoted as arima(p, d, q), where p is the order of the auto-regression, d is the degree of differencing and q is the order of the moving average. the degree of differencing d is a transformation (operator) that is used to make the time-series stationary, as it removes increasing trends. a higher value of d indicates positive autocorrelations out to a high number of lags. the first step in modelling the time-series by arima is to test the time-series data for stationarity. the augmented dickey-fuller (adf) test may be applied to determine whether the series, after differencing, is stationary or not. the adf test is applied to test the null hypothesis of the presence of a unit root (which indicates non-stationarity of the series). in order to deduce the arima(p, d, q) model, we can proceed as follows. we write the arma(p′, q) model in lag-operator form as φ(l) y_t = θ(l) ε_t, where l is the lag operator. if the autoregressive polynomial φ(l) has a unit root (i.e. a factor of (1 − l)) of multiplicity d, it can be factored as φ(l) = φ′(l)(1 − l)^d, where φ′(l) has degree p = p′ − d. the model can then be re-written as φ′(l)(1 − l)^d y_t = θ(l) ε_t, which is the general arima(p, d, q) model. the second step is to plot the graphs of the autocorrelation function (acf) and the partial autocorrelation function (pacf) to determine the most likely values of p and q. the final step is to obtain the optimal values of p, d and q using the aic (akaike information criterion); for more details see https://en.wikipedia.org/wiki/akaike_information_criterion. such information criteria may be used for selecting the best-fitted models: the lower the value of the criterion, the higher the relative quality of the model. the aic is given by aic = 2k − 2l, where k is the number of model parameters and l is the maximized value of the log-likelihood function. exponential smoothing is one of the simplest techniques to model time-series data, where past observations are assigned weights that decrease exponentially over time. we propose the following models for the modelling of covid-19 cases [see holt (1957) and winters (1960)].
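model selection by aic, as used in the following sections to pick (p, d, q), can be sketched as follows (an illustrative helper with made-up candidate values, not the authors' r workflow):

```python
def aic(log_likelihood, k):
    # aic = 2k - 2l, with l the maximized log-likelihood and k the parameter count
    return 2 * k - 2 * log_likelihood

def select_by_aic(candidates):
    # candidates: iterable of (label, log_likelihood, n_params); lowest aic wins
    return min(candidates, key=lambda c: aic(c[1], c[2]))
```

note how the 2k term penalizes extra parameters, so a slightly better likelihood does not automatically win.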
for single exponential smoothing, let the raw observations be denoted by {y_t} and let {s_t} denote the smoothed series. then s_0 = y_0 and s_t = α y_t + (1 − α) s_{t−1}, where α ∈ (0, 1) denotes the data smoothing factor. for double exponential (holt–winters) smoothing, let the raw observations be denoted by {y_t}, the smoothed values by {s_t}, and let {b_t} denote the best estimate of the trend at time t. then s_t = α y_t + (1 − α)(s_{t−1} + b_{t−1}) and b_t = β(s_t − s_{t−1}) + (1 − β) b_{t−1}, where α ∈ (0, 1) denotes the data smoothing factor and β ∈ (0, 1) denotes the trend smoothing factor. the forecast at t = n + m days, f_{n+m}, is calculated by f_{n+m} = s_n + m b_n. for this study, we have used the data available at github, provided by the center for systems science and engineering (csse) at johns hopkins university (see https://github.com/CSSEGISandData/COVID-19/blob/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv). for this study, we use the r software (see r core team 2020). we have used the data from march 4 to july 11 for continuity of the data. we know that at the beginning of the spread of the disease in india the growth was exponential, and after some time it shifted to polynomial. we first obtain the optimum turning point of the growth, i.e. the point at which the growth rate of the disease shifted from the exponential to the polynomial regime. we consider both quadratic and cubic regression models for the second part of the data. we will also discuss the types of polynomial growth (with their equations) in india. in order to find the turning point of the growth curve, we follow algorithm 1, given in the previous section. using that, we evaluate the rss for all the days (from march 4) and find the date on which it is minimum. the change points of the growth curve for cubic and quadratic regressions are presented in fig. 1, depending upon the size of the data set. from fig.
1, we can confirm that the growth rate of covid-19 cases was exponential till april 5 and thereafter follows the polynomial growth regime, using the covid-19 cases till july 11 (table 1). we call the region of exponential growth in india region i. the coefficients of the model are presented in table 2. we see that after the exponential regime (till april 5), the growth curve follows a polynomial growth till may 2. after this, we again see a change in the behavior of the growth curve. in tables 3, 4, 5 and 6, we model these growth curves through regression analysis. having evaluated the coefficients for the various models (i.e. linear, quadratic and cubic) as well as the important statistics (i.e. r² values, p values of the models and of the individual coefficients, and the f-statistic), we select the best-fitting models. in order to select the best-fitting models for regions ii (april 6 to may 2), iii (may 3 to may 15), iv (may 16 to may 31) and v (june 1 to july 11), we proceed as follows: we select the model which has a high r² value, a significant p value, a high f-statistic, and for which the p values of all the coefficients are significant. for region ii, we see from table 3 that the linear model has a relatively lower f-statistic and r² value in comparison to the quadratic and cubic models, so we eliminate the possibility of a linear fit. further, the p values, f-statistics and r² values are quite significant for both the quadratic and the cubic models, but the individual p values of the coefficients are not significant for the cubic model. on the other hand, the individual p values are significant for the quadratic model.
fig. 1: trend of rss and optimum μ for the exponential-quadratic regression model.
for region v, from table 6, we see that the r² values of all the models are very high (the quadratic, cubic, quartic and quintic models have exceptionally high values).
all the models also have significant p values. the f-statistics of the quadratic, cubic, quartic and quintic models are high, with the quartic model having the highest f-statistic value. the individual coefficient p values of the quartic model are also significant. thus, we conclude that the quartic model is the best-fitting model for region v (june 1 to july 11). note that for region v, due to the spike in the cases, we also checked the fit of an exponential curve in this region (table 7). let the exponential model be y(t) = α e^(βt); the fitted parameter values are reported in table 7. the rse for this model is 4178 and the mape is 0.85%. both of these values are quite a bit larger than those of the quartic model (refer to table 9 for the rse and mape values of the quartic model in region v). thus, we conclude that the quartic model is the best-fitting model for region v (june 1 to july 11). all the anova tables (refer to table 8) for regions ii, iii, iv and v show significant p values for the coefficients and suggest that the models fit the respective regions well. thus, according to our study, the growth of the virus was exponentially increasing from march 4 to april 5 (for region v, june 1 to july 11, the best-fitted model is y(t) = −1.168 × 10^7 + 4.223 × 10^5 t − 5.658 × 10^3 t^2 + 33.29 t^3 − 0.06963 t^4, with mape 0.21%). thereafter, the virus grew following a quadratic rate from april 6 to may 2, and after may 3 we experienced a linear growth. but from may 16 to may 31, we experienced a sudden rise in the rate of growth of the virus and saw quadratic growth again. further, for the period of june 1 to july 11, we experienced a quartic (4-degree polynomial) growth, which is very alarming (see table 9 for the best-fitted regression models). figure 2 shows the best-fitted regression models for the daily cumulative cases of covid-19 in india from march 4 to july 11 (table 10). we use the daily time-series data of the number of cumulative confirmed cases from march 4 to july 10. first, we check the stationarity of the transformed time-series using the adf test.
the dickey-fuller statistic is 6.3915 with a p value of 0.99, which indicates that the growth of covid-19 cases is not stationary; hence the arima models may be useful over the arma models. the acf and pacf plots are shown in fig. 3. we then obtain the optimal arima parameters (p, d, q) using the aic: we take various possible combinations of (p, d, q), compute the aic, and select the best-fitted arima model, i.e. the one with the lowest aic among all considered models. according to the aic, arima(5, 2, 5) is the best-fitted model for the covid-19 cases in india (see table 11). estimates of the arima(5, 2, 5) parameters and the mape are shown in table 11. we selected the model parameters using the akaike information criterion and obtained p = 5, d = 2 and q = 5. as p = 5, the order (number of time lags) of the autoregression part of the model is 5; in general, we can say that the cumulative cases of covid-19 on a day depend on the cases of the previous 5 days. as q = 5, the present value depends on the moving average (residuals) of the previous 5 days. as d = 2, the series y*_t = y_t − 2y_{t−1} + y_{t−2} is stationary. a higher value of d indicates positive autocorrelations out to a high number of lags. thus, using eq. 8, the equation for our model is φ′(l)(1 − l)² y_t = θ(l) ε_t with p = q = 5, where all the symbols have their meanings as per "arima model". estimates of the holt–winters exponential smoothing and exponential smoothing models are given in table 12. according to the mape and accuracy measures, arima(5, 2, 5) is a better model than the holt–winters exponential smoothing and the usual exponential smoothing models. from this, we can conclude that the arima model is the best fit for the cases of covid-19, followed by the holt–winters model. the forecast values along with 95% confidence intervals are shown in table 13 and fig. 4. we have used actual data from 11th july to validate the model.
even though most of the actual cases are covered by the 95% confidence intervals of the arima and holt-winters forecasts, they lie near the upper limits of the intervals and deviate from the point estimates. it is therefore possible that in the coming days the forecasts will underestimate the actual cases. this might be attributed to the changing pattern of the growth of the pandemic in our country, as seen in the regression analysis. thus, we suggest segment-wise time-series models to forecast the future cases more accurately. we present the segment-wise arima and holt-winters models for june 1 to july 10. we have seen that our time-series data are non-stationary, and thus we select the most optimal values of (p, d, q), i.e. those with the least aic. according to the aic, arima(5, 2, 3) is the best-fitting model for the time-series data from june 1 to july 10, with aic = 634.18. estimates of the arima(5, 2, 3) model with the corresponding mape and accuracy are given in table 14 (fig. 5). from fig. 6, we deduce that the optimal value of d is 2, as the time series becomes stationary with differencing degree 2. estimates of the holt-winters exponential smoothing and simple exponential smoothing models are given in table 15. according to the mape and accuracy measures, arima(5, 2, 3) is a better model than the holt-winters exponential smoothing and simple exponential smoothing models. from this, we conclude that the arima model is the best fit for the cases of covid-19, followed by the holt-winters model. the forecast values along with 95% confidence intervals are shown in table 16 and fig. 4. we have used the actual data from july 11 to validate the model. from the interpretations of both the fitted arima models, we can say that, as the values of p and d are 5 and 2 respectively, the daily cumulative cases depend on the cases of the previous 5 days; also, to make the time series of daily cases stationary, we need a differencing degree of 2.
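the holt-winters comparison above can be illustrated with a minimal pure-python implementation of holt's linear-trend smoothing (no seasonal component). the function names, the smoothing constants and the toy series are assumptions for the sketch, not the paper's estimated models.

```python
def holt_linear(y, alpha, beta):
    """Holt's linear-trend exponential smoothing.
    Returns one-step-ahead fitted values and an h-step-ahead forecast function."""
    level, trend = y[0], y[1] - y[0]
    fitted = [y[0]]                       # no one-step forecast exists for t = 0
    for obs in y[1:]:
        fitted.append(level + trend)      # forecast made at the previous step
        new_level = alpha * obs + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return fitted, (lambda h: level + h * trend)

def mape(y, y_hat):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs(a - f) / a for a, f in zip(y, y_hat)) / len(y)

# toy cumulative-case series with a dominant linear trend
cases = [1000 + 350 * t + (15 if t % 3 == 0 else -15) for t in range(60)]
fitted, forecast = holt_linear(cases, alpha=0.8, beta=0.2)
err = mape(cases[1:], fitted[1:])
```

comparing `err` against the in-sample mape of a fitted arima model is, in outline, how the text ranks arima above holt-winters and simple exponential smoothing.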
from the regression analysis, we conclude that the spread of the covid-19 disease grew exponentially from march 3 to april 5. further, from april 6 to may 2, the cases followed a quadratic regression. from may 3 to may 15, we see linear growth of the pandemic with average daily cases of 3584. from may 15 to may 31, we again saw a spike in the cases that led to quadratic growth of the pandemic. and from june 1 to july 11, we saw a major spike in the growth of the pandemic, as it followed quartic growth. verma et al. (2020) showed the four stages of the epidemic, s1: exponential, s2: power law, s3: linear and s4: flat. we saw that the course of covid-19 in india followed this regime till may 15. but after the linear trend from may 3 to may 15, the spread again reached quadratic growth, and from june 1 to july 11, india is witnessing quartic growth. this might be attributed to the relaxation of lockdown measures in the country. it was quite likely that the cases would start to reduce after the linear-growth stage, as the total cases may start to follow a square-root equation, i.e. y(t) ∼ √t; this would lead to a reduction in the daily number of cases (as y′(t) ∼ 1/√t), leading to a flattening of the curve. but, due to reduced restrictions, we see a reverse trend, which is alarming and suggests the imposition of a strict lockdown to reverse this trend of pandemic growth. if we continue to open our economy in this way, we might go back to exponential growth of the pandemic, and this would lead to huge destruction of human lives and cause a greater impact on our economy. we also observe that some cities have been hotspots of the disease, such as delhi (more than 131,000 cases), mumbai (more than 94,000 cases), chennai (more than 78,000 cases), thane (more than 63,000 cases), etc., as on 14 june, 2020. while the other states and cities have seen slower growth of the pandemic, these cities have seen explosive growth.
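the flattening argument in the text can be written out explicitly: if the cumulative case count follows a square-root law after the linear stage, then the daily-case curve, being its derivative, decays to zero.

```latex
y(t) = c\,\sqrt{t},\quad c > 0
\quad\Longrightarrow\quad
y'(t) = \frac{c}{2\sqrt{t}} \,\sim\, \frac{1}{\sqrt{t}}
\;\longrightarrow\; 0
\quad\text{as } t \to \infty .
```

the reversal reported above is the opposite behaviour: under quadratic and quartic growth the daily-case curve y′(t) grows like t and t³ respectively, rather than decaying.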
due to the opening of air and rail transport in the country, the virus is likely to spread to the other regions as well, since people from these cities (especially the metro cities) are travelling to different states. thus, it is highly advisable that the country go back to its lockdown phase until we see a reduction in the trend. in the time-series analysis, we conclude that arima(5, 2, 5) is the best-fitting model for the cases of covid-19 from march 4 to july 10, with an accuracy of 97.38%. basic exponential smoothing is not very accurate in our case, but we see that the holt-winters model is around 97.11% accurate. both the arima(5, 2, 5) and holt-winters models suggest a rise in the number of cases in the coming days. we observed that both the arima and holt-winters models capture the data well, and the actual data from july 11 validate the forecasts, as they lie within the predicted confidence intervals. but, since the actual values are always near the upper confidence limits while validating the model, it is possible that in further days our model will underestimate the cases. this might be because of the changing trend of the growth of the pandemic in india. thus, we used segmented time-series models and took data from june 1 to july 10 to build separate arima and holt-winters models. we concluded that arima(5, 2, 3) is the best-fitting model for covid-19 cases in the given time period, with an accuracy of 99.86%. basic exponential smoothing is not very accurate in this case either, but the holt-winters model is around 99.78% accurate. we also observe that the arima and holt-winters models capture the data well, and the actual data from july 11 validate the forecasts and lie near the estimates. we may also conclude that the cases of covid-19 will rise in the coming days and the situation may turn alarming if proper measures are not followed.
since economic activities have restarted in the country, people need to be more careful while going out. an explosion of the pandemic in the whole country could cause serious damage to human lives, the healthcare system, as well as the economy of the country. thus, there is an urgent need to impose strict lockdown measures to curb the growth of the pandemic. we must also learn to lead our lives by following all the precautions, even if the lockdown restrictions are relaxed and economic activities are resumed. a comparison of the indian scenario with that of other countries might not prove fruitful at this stage because of the demographic differences and/or the characteristics of the disease. also, a comparison of the indian context with that of the other countries of the world would require studying the spread of the pandemic in those countries in depth, and might be considered as an altogether separate topic in future studies. this study was limited to data-driven models using the total covid-19 cases. in future studies, other co-factors (associated with demographics, social, cultural and medical infrastructure, etc.) can be taken into consideration.
modeling and predictions for covid 19 spread in india
ceylan z (2020) estimation of covid-19 prevalence in italy
evolution of covid-19 pandemic: power law growth and saturation
amenta francesco: covid-19 virus outbreak forecasting of registered and recovered cases after sixty day lockdown in italy: a data driven model approach
using autoregressive integrated moving average (arima) models to predict and monitor the number of beds occupied during a sars outbreak in a tertiary hospital in singapore
analysis and forecast of covid-19 spreading in china, italy and france
modelling malaria incidence with environmental dependency in a locality of sudanese savannah area
sidarthe model of covid-19 epidemic in italy
coronavirus 2019 (covid-19) outbreak in india: a perspective so far. international journal of infectious diseases
epidemiology and arima model of positive-rate of influenza viruses among children in wuhan, china: a nine-year retrospective study
forecasting seasonal and trends by exponentially weighted averages
mathematical modeling of the spread of the coronavirus disease 2019 (covid-19) considering its particular characteristics. the case of china
a review of modern technologies for tackling covid-19 pandemic
a conceptual model for the coronavirus disease 2019 (covid-19) outbreak in wuhan, china with individual reaction and governmental action
forecasting the trend in cases of ebola virus disease in west african countries using auto regressive integrated moving average models. hoboken
petropoulos f, makridakis s (2020) forecasting the novel coronavirus covid-19
the time series seasonal patterns of dengue fever and associated weather variables in bangkok
r: language and environment for statistical computing.
r foundation for statistical computing
covid-19 pandemic: power law spread and flattening of the curve
comparative study of four time series methods in forecasting typhoid fever incidence in china
predicting turning point, duration and attack rate of covid-19 outbreaks in major western countries
fractal kinetics of covid-19 pandemic
key: cord-160382-8n3s5j8w authors: yamagata, yoriyuki title: simultaneous estimation of the effective reproducing number and the detection rate of covid-19 date: 2020-05-02 journal: nan doi: nan sha: doc_id: 160382 cord_uid: 8n3s5j8w
a major difficulty in estimating $r$ (the effective reproducing number) of covid-19 is that most cases of covid-19 infection are mild or asymptomatic, so the true number of infections is difficult to determine. this paper estimates the daily change of $r$ and the detection rate simultaneously using a bayesian model. the analysis using synthesized data shows that our model correctly estimates $r$ and detects a short-term shock in the detection rate. then, we apply our model to data from several countries to evaluate the effectiveness of public healthcare measures. our analysis focuses on japan, which employs a moderate measure to keep "social distance". the result indicates a downward trend, and $r$ is now below $1$. although our analysis is preliminary, this may suggest that a moderate policy can still prevent an epidemic of covid-19. in the wake of the covid-19 epidemic, the japanese government gradually employed public health measures against covid-19. in the first stage, stronger quarantine measures at the border were implemented. once patients who had no connection to wuhan appeared inside the border, the government started to track these patients as far as possible and tried to find the people who had been in contact with them.
once these "track and quarantine clusters" tactics were overwhelmed by the number of patients, the government started to ask people to change their behavior, culminating in the "declaration of the emergency" on april 6. public facilities like libraries were closed. large shopping malls and entertainment businesses, such as movie theatres, were asked to close. restaurants were asked to shorten their operating hours and stop serving alcoholic beverages at night. working from home was encouraged, and citizens were advised to avoid crowded areas and generally avoid going outside unnecessarily. however, the japanese legal system does not have sufficient mechanisms to enforce these policies, so their effectiveness can be questioned. many people are still commuting to their offices because many companies lack the ability to allow their employees to work from home. many small restaurants and cafes are still running their business because of the lack of financial compensation. to estimate the effect of these policies on covid-19, we need to estimate the daily changes of the effective reproducing number r. a major difficulty in estimating r for covid-19 is that most cases of covid-19 infection are mild or asymptomatic; therefore, the true number of infections is difficult to determine. this paper estimates the daily change of r and the detection rate simultaneously using a bayesian model. the analysis using synthesized data shows that our model correctly estimates r and detects a short-term shock in the detection rate. our analysis focuses on japan, which employs a moderate measure to keep "social distance". the result indicates a downward trend, and r is now below 1. although our analysis is preliminary, this may suggest that a moderate policy can still prevent an epidemic of covid-19. then, we apply our model to data from several countries to evaluate the effectiveness of public healthcare measures.
the comparison between denmark and sweden reveals that lock-down is very effective in the short term. however, in sweden, which did not employ a lock-down, r also decreased, and now the r of both countries is roughly the same. this might suggest that lock-down is not effective in the long run, or that denmark is at a disadvantage against covid-19 compared to sweden. further, we apply our method to china, italy and the us and show that these countries are also about to exit the epidemic. several works employ data-driven methods to predict and measure the public health measures against covid-19. anastassopoulou et al. [1] apply the sird model to chinese official statistics, estimating parameters using linear regression. the reporting rate is not estimated from data but assumed. with these models and parameters, they predict the covid-19 epidemic in hubei province. diego caccavo [2] and, independently, peter turchin [6] apply modified sird models, in which parameters change over time following specific functional forms. the parameters governing these functions are estimated by minimizing the sum-of-squares error. however, using the sum-of-squares method causes over-fitting and always favors a complex model; therefore, it is not suitable for assessing policy effectiveness. further, fitting the sird model in the early stage of infection is difficult, as pointed out on stack exchange 1. using a bayesian method, we avoid these problems to some degree, because a bayesian method estimates a parameter distribution instead of a point estimate. thus, we can assess the degree of confidence of each parameter. further, by well-established statistical methods, we can compare the explanatory power of different models. flaxman et al. [4] use a bayesian model to estimate policy effectiveness. their methodology is different from ours because they assume immediate effects from the policies implemented. further, they use a discrete renewal process, a more advanced model than the sird model.
they use parameters estimated from studies of clinical cases, while we use a purely data-driven method. 3.1. model. we use the discrete-time sir model but assume that the number of moves between categories is stochastic and follows a poisson distribution. the effective reproduction rate can be written as r(t) = β(t)s(t)/(γn); when r becomes < 1, the infection starts to decline. we cannot expect that these values are directly observable, because many (or most) cases are mild or asymptomatic. therefore, we introduce the detection rate q and define the number of cumulative observed cases c_obs and observed recovered r_obs as c_obs(t) = q(t)c(t) and r_obs(t) = q(t)r(t). we assume that β and q change on a day-to-day basis while the other parameters are fixed. to get a reasonable estimate, we assume prior distributions, somewhat arbitrarily chosen: the prior of q(t) is the uniform distribution over [0, 1], and, to make our model robust, we choose a student-t prior for β. we perform a sensitivity analysis of the prior for σ_b to ensure that the choice of priors does not strongly affect the result. 3.2. data. the numbers of confirmed cases of each country up to may 11 were drawn from the data repository 2 by the johns hopkins university center for systems science and engineering. the dataset also contains the number of recoveries but, as pointed out in its readme, this number is underestimated; for example, the number of recoveries in norway is only 32 on may 11, which is implausible with 8132 confirmed cases. therefore, we estimate γ = 0.04 per day from chinese data and assume that γ is constant across all countries. 3.3. experiment. first, we performed model validation using synthesized data for the scenarios in which (i) β and q are constant; (ii) β is constant but q is piecewise constant; (iii) β decreases linearly and q is constant with white noise; and (iv) β and q are constant, with β high enough that almost all people eventually obtain immunity. the result is presented in section 4.1. then the real-world data were fed to stan [3] for parameter estimation by our bayesian model.
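a minimal simulation sketch of a discrete-time sir model with poisson-distributed flows and a detection rate q, in the spirit of the model described above. the function name and all parameter values are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def simulate_sir(beta, gamma, q, n_pop, i0, days, rng):
    """Discrete-time SIR in which the daily flows S->I and I->R are Poisson,
    and only a fraction q of cumulative cases/recoveries is observed."""
    s, i, r = n_pop - i0, i0, 0
    c = i0                         # cumulative (true) cases
    c_obs, r_obs = [], []
    for _ in range(days):
        new_i = rng.poisson(beta * s * i / n_pop)
        new_r = rng.poisson(gamma * i)
        new_i = min(new_i, s)      # cannot infect more than the susceptibles
        new_r = min(new_r, i)      # cannot remove more than the infected
        s -= new_i
        i += new_i - new_r
        r += new_r
        c += new_i
        c_obs.append(q * c)        # observed counterparts, as in the model
        r_obs.append(q * r)
    return s, i, r, c_obs, r_obs

rng = np.random.default_rng(1)
s, i, r, c_obs, r_obs = simulate_sir(beta=0.25, gamma=0.04, q=0.2,
                                     n_pop=1_000_000, i0=10, days=120, rng=rng)
```

the inference problem the paper tackles is the inverse of this simulation: given only `c_obs` and `r_obs`, recover the daily β(t) and q(t).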
we simplified our model to ease modeling in stan. because latent discrete variables cannot be used in stan, we used real numbers for n_i(t) and n_r(t). we used the normal approximation n(λ, √λ) for the poisson distribution used for n_i(t). for n_r(t), we replaced the stochastic law with the deterministic law n_r(t) = γ i(t) to avoid a numerical issue. parameter estimation used 10,000 iterations, with 5,000 iterations for warm-up and 5,000 iterations for sampling. four (the default number for stan) independent computations were performed simultaneously and used to estimate r̂. if r̂ < 1.1, we regard the estimation as converged. to make sure that our results are meaningful, we compared the performance of our model with a model (referred to as "the constant model", const) which assumes constant β and q, and a model (referred to as "the constant detection rate model", const-q). const estimates constant β and q simultaneously, while const-q takes q as part of the given data. parameters were estimated for const and const-q in the same way, and loo, a standard measure of model performance, was compared. because the exact computation of loo-cv is computationally expensive, the approximations psis-loo-cv [7] and waic [8] were compared. further, we checked the reliability of these estimates by inspecting the pareto-k of the importance weight distribution. the computation of psis-loo-cv and waic was performed with arviz [5]. the models and the computation history used for this experiment are public at github 3. fig. 1 and fig. 2 show the estimated β and q for synthesized data using β = 0.07 and q = 0.2. fig. 1 also shows the β estimated by the const-q model; the true q = 0.2 is given to the const-q model as data. the results show that both models can estimate β correctly, while they completely fail to estimate q: the estimated q is biased toward 1 and noisy. therefore, our method cannot estimate the absolute level of the detection rate. however, the estimate of β is still accurate, so our method can be used to estimate β.
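the convergence criterion r̂ < 1.1 mentioned above is the gelman-rubin potential scale reduction factor. a basic (non-split) version can be computed as follows; stan's own diagnostic uses a more refined split-chain, rank-normalized variant, so this is only an illustrative sketch.

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor R-hat for an (m, n) array of m chains.
    Values near 1 indicate the chains have mixed; R-hat < 1.1 is a common cutoff."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    chain_vars = chains.var(axis=1, ddof=1)
    w = chain_vars.mean()                    # within-chain variance
    b_over_n = chain_means.var(ddof=1)       # between-chain variance of the means
    var_hat = (n - 1) / n * w + b_over_n     # pooled posterior-variance estimate
    return float(np.sqrt(var_hat / w))

rng = np.random.default_rng(2)
mixed = rng.normal(size=(4, 5000))                    # four chains, same target
stuck = mixed + np.array([[0.], [0.], [0.], [5.]])    # one chain off target
```

for `mixed` the statistic is essentially 1, while the offset chain in `stuck` inflates the between-chain variance and pushes r̂ well above the 1.1 cutoff.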
in fact, fig. 3 and fig. 5 show that our model is robust against a changing q. fig. 3 and fig. 4 show the estimated β and q for synthesized data using a constant β = 0.07 but a changing q, which goes from 1 to 0.8 on apr. 1. fig. 3 shows that the β estimated by the const-q model drops at april 1. although the estimate by our model also drops around the same date, the drop is less significant. this indicates that allowing q to vary makes the estimated β robust against a sudden change of q. fig. 5 and fig. 6 show the estimated β and q for synthesized data using a linearly decreasing β and a noisy q. fig. 5 shows that the estimate of our method is smooth while the estimate of const-q is noisy. again, this shows that our model is more robust against a changing q. although the estimated q is biased toward 1, its short-term change coincides with the ground truth; therefore, the estimated q is still useful for finding a short-term shock to the detection rate. fig. 7 and fig. 8 show the estimates of β and q for data generated with constant β and q. unlike fig. 1 and fig. 2, β and the time horizon are chosen so that most of the population is infected in the end. the result shows that the estimate of β becomes unreliable when the infection starts to saturate. therefore, our model is only useful in the case of a low infection rate. the β estimated by const-q swings widely and is therefore omitted. in summary, the estimate of β, which is the key parameter for estimating r, by our model is robust against sudden changes and noise in q, even though the estimate of q itself is unreliable; when the infection begins to saturate in the population, the estimates of both β and q become unreliable. analysis of japan. next, we present our analysis of the japanese data. first, we applied three models, const, const-q and our model (varied-q), to the japanese data. for all parameters of all models, r̂ < 1.1 holds, so convergence is excellent. then two information criteria, psis-loo-cv and waic, were applied to const, const-q and varied-q. the results are shown in table 1.
table 1 shows that our model, varied-q, is the best model. however, we need to be careful because the difference from const-q is not large. further, there is a data point for which the pareto-k of the importance weight distribution is larger than 0.7; therefore, the model is not robust and is influenced too much by a small number of data points. keeping this in mind, we tentatively choose varied-q and analyze its results. fig. 9 shows the estimated r for each day in japan by varied-q. the first upward trend may not be very reliable because there are few data on infections. the downward trend from mid-february until mid-march could be attributed to the "track and quarantine clusters" tactics; by mid-march, the tracking effort might have been overwhelmed, thus creating an upward trend until the beginning of april, when "the state of emergency" was declared in major urban areas. since then, there has been a downward trend, and now r is below 1. thus, currently the infection is shrinking. before going on to the analysis of other countries, we analyze the sensitivity of the result to the prior distributions, using the japanese data as an example. fig. 9 shows the estimated r for each day in japan by varied-q. the blue line shows the estimate of r based on the prior σ_b ∼ exponential(1), used throughout this paper. the orange line shows the estimate of r based on the prior σ_b ∼ exponential(0.1). there is no significant difference between the estimates under the two priors; therefore, we can safely conclude that the effect of the prior is small. comparison of denmark and sweden. next, we compare denmark and sweden. these countries are economically and socially similar, but employed very different policies against covid-19. we applied the same procedure as for japan to the data from the two countries. all parameters converged, and the information criteria favor the varied-q model. for sweden, we only used data from mar 1, because confirmed cases only appear from february 27. the estimated r clearly shows that the lock-down introduced on march 13 was effective in reducing r.
however, in sweden, which did not employ a lock-down, r reduced gradually, and now r is almost the same in denmark and sweden. this might suggest that lock-down is not very effective in the long run, but it might also be due to conditions unfavorable to denmark, for example, a higher population density. multi-national comparison. fig. 13 shows the daily estimates of r for china, italy, japan and the us. the same method as for japan was applied to the rest of the countries, and the adequacy of the varied-q model was verified. for italy, we only used data from march 1, 2020 because otherwise the parameter estimation did not converge. the same method was applied to korea, but the parameter estimation did not converge. the results show that china, italy, japan, and the us are about to exit the epidemic. the method used in this paper has several limitations. first, as the experiments using simulated data revealed, our method cannot determine the true level of the detection rate, nor its long-term trend. to find the true detection rate, we need a different kind of data, such as excess mortality. second, our model uses a naive sir model and does not consider the incubation period or reporting delays. we can reconstruct the date of exposure by back projection, so it would be interesting to apply our method to such data. (figure caption: the solid lines show means, and the shades in the same colors show standard deviations.)
athanasios tsakris, and constantinos siettos. data-based analysis, modelling and forecasting of the covid-19 outbreak
chinese and italian covid-19 outbreaks can be correctly described by a modified sird model.
medrxiv, page 2020
stan: a probabilistic programming language
estimating the number of infections and the impact of non
arviz: a unified library for exploratory analysis of bayesian models in python
analyzing covid-19 data with sird models
practical bayesian model evaluation using leave-one-out cross-validation and waic
asymptotic equivalence of bayes cross validation and widely applicable information criterion in singular learning theory
japan. e-mail address: yoriyuki.yamagata@aist.go.jp
the author thanks kentaro matsuura and the tokyo.r slack group for many suggestions and advice on bayesian modeling. the author also thanks peter turchin, from whose work the author's work started.
key: cord-128991-mb91j2zs authors: agapiou, sergios; anastasiou, andreas; baxevani, anastassia; christofides, tasos; constantinou, elisavet; hadjigeorgiou, georgios; nicolaides, christos; nikolopoulos, georgios; fokianos, konstantinos title: modeling of covid-19 pandemic in cyprus date: 2020-10-05 journal: nan doi: nan sha: doc_id: 128991 cord_uid: mb91j2zs
the republic of cyprus is a small island in the southeast of europe and a member of the european union. the first wave of covid-19 in cyprus started in early march 2020 (imported cases) and peaked in late march-early april. the health authorities responded rapidly and rigorously to the covid-19 pandemic by scaling up testing, increasing efforts to trace and isolate contacts of cases, and implementing measures such as closures of educational institutions, and travel and movement restrictions. the pandemic was also a unique opportunity that brought together experts from various disciplines, including epidemiologists, clinicians, mathematicians, and statisticians. the aim of this paper is to present the efforts of this new, multidisciplinary research team in modelling the covid-19 pandemic in the republic of cyprus.
coronavirus disease 2019 (covid-19), an infection caused by the novel coronavirus sars-cov-2 (coronaviridae study group of the international committee on taxonomy of viruses (2020)) that first emerged in wuhan, china (zhu et al. (2020)), now counts more than 25 million cases and has claimed nearly 850,000 lives (world health organization (2020)). despite some advances in therapy (beigel et al. (2020)) and considerable progress in vaccine development, with some vaccine candidates reaching phase iii trials (jackson et al. (2020)), there are still many gaps in our understanding of the new pandemic disease, including some epidemiological parameters. epidemic modelling is a fundamental component of epidemiology, especially with regard to infectious diseases. following the pioneering work of r. ross, w. kermack, and a. g. mckendrick in the early twentieth century (kermack and mckendrick (1927)), the discipline has established itself and comprises a major source of information for decision makers. for instance, in the united kingdom, the scientific advisory group for emergencies (sage) is a major body that collects evidence from multiple sources, including inputs from mathematical modelling, to advise the british government on its response to the complex covid-19 situation; for more information see this link. in the context of the covid-19 pandemic, expert opinions can help decision makers comprehend the status of the pandemic by collecting, analyzing, and interpreting relevant data and by developing scientifically sound methods and models. an exact model that would describe the data perfectly is usually not feasible and of limited scope; hence scientists usually aim for models that allow a statistical simulation of synthetic data. at the same time, models can also approximate the dynamics of the disease and discover important patterns in the data. in this way, researchers can study various scenarios and understand the likely consequences of government interventions.
finally, the proposed models could motivate the conduct of further studies on the evolution of both infectious and non-infectious diseases of public interest. here we report our work, including results from statistical and mathematical models used to understand the epidemiology of covid-19 in cyprus during the period from the beginning of march till the end of may 2020. we propose a range of different models that capture different aspects of the covid-19 pandemic. the analysis consists of several methods applied to understand the evolution of the pandemic in the long and short run. we use change-point detection and count time series methods for short-term projections, and compartmental models for long-term projections. we estimate the effective reproduction number using three different methods and obtain consistent results irrespective of the method used. results are cross-validated against observed data with considerable consistency. besides providing a comprehensive data analysis, we illustrate the importance of mathematical models in epidemiology. in this section, after a brief introduction to the testing protocol, we introduce the different techniques and models that have been used for the modelling and analysis of covid-19 infections in cyprus. the unit for surveillance and control of communicable diseases (usccd) of the ministry of health operates covid-19 surveillance. the lab-based surveillance system consists of 19 laboratories (7 public and 12 private) that carry out molecular diagnostic testing for sars-cov-2. sociodemographic, epidemiological, and clinical data of individuals with sars-cov-2 infection are routinely collected from laboratories and clinics, and reported to an electronic platform of the usccd. a confirmed covid-19 case is a person, symptomatic or asymptomatic, with a respiratory swab (nasopharynx and/or pharynx) positive for sars-cov-2 by a real-time reverse-transcription polymerase chain reaction (rrt-pcr) assay.
cases are considered imported if they have travel history from an affected area within 14 days of disease onset. locally-acquired cases are individuals who test positive for sars-cov-2 and have the earliest onset date in cyprus without travel history from affected areas. people with symptomatic covid-19 are considered recovered after the resolution of symptoms and two negative tests for sars-cov-2 taken at least 24 hours apart. for asymptomatic cases, the negative tests to document virus clearance are obtained at least 14 days after the initial positive test. a person with a positive test at 14 days is further isolated for one week and finally released 21 days after the initial diagnosis without further laboratory tests. testing approaches in the republic of cyprus included: a) targeted testing of suspect cases and their contacts; of repatriates at the airport and during their 14-day quarantine; of teachers and students when schools re-opened in mid-may; of employees in essential services that continued their operation throughout the first pandemic wave (e.g., customer services, public domain); and of health-care workers in public hospitals; and b) population screenings following random sampling in the general population of most districts and in two municipalities with increased disease burden. by june 2nd, 2020, 120,298 pcr tests had been performed (13,734.2 per 100,000 population).
public health measures were taken in 4 phases: period 1 (10-14 march, 2020) included closures of educational institutions and cancellation of public gatherings (>75 persons); period 2 (15-23 march, 2020) involved closure of entertainment areas (for instance, malls, theatres, etc.), allowance of 1 person per 8 square meters in public service areas, and restrictions on international travel (for example, access to the republic of cyprus was permitted only for specific persons and after sars-cov-2 testing); period 3 (24-30 march, 2020) included closure of most retail services; and period 4 (31 march - 3 may) included the suspension of incoming flights with few exceptions (for instance, repatriated cypriot citizens), a stay-at-home order, and a night curfew. change-point detection is an active area of statistical research that has attracted a lot of interest in recent years and plays an essential role in the development of the mathematical sciences. a non-exhaustive list of application areas includes financial econometrics (schröder and fryzlewicz, 2013), credit scoring (bolton and hand, 2002), and bioinformatics (olshen et al., 2004). the focus is on the so-called a posteriori change-point detection, where the aim is to estimate the number and locations of certain changes in the behaviour of a given data sequence. for a review of methods of inference for single and multiple change-points (especially in the context of time series) under the a posteriori framework, see jandhyala et al. (2013). detecting these change-points enables us to separate the data sequence into homogeneous segments, leading to a more flexible modeling approach. advantages of discovering such heterogeneous segments include interpretation and forecasting. interpretation naturally associates the detected change-points with real-life events or/and political decisions.
in this way, a better description of the observed process and the impact of any intervention can be communicated. forecasting is based on the final detected segment, which is important as it allows for more accurate prediction of future values of the data sequence at hand. methods developed in this context are based on a given model. for the purpose of this paper, we work with the following signal-plus-noise model

x_t = f_t + σ ε_t, t = 1, 2, . . . , T, (1)

where x_t denotes the daily incidence of covid-19 cases and f_t is a deterministic signal with structural changes at certain time points; details about f_t are given below. the sequence ε_t consists of independent and identically distributed (iid) random variables with mean zero and variance equal to one, and σ > 0. we denote the number of change-points by K and their respective locations by r_1, r_2, . . . , r_K. the locations are unknown and the aim is to estimate them based on (1). the daily incidence cases of the covid-19 outbreak in cyprus are investigated by using the following two models for f_t in (1): 1. continuous, piecewise-linear signals: f_t = µ_{j,1} + µ_{j,2} t for t = r_{j−1} + 1, r_{j−1} + 2, . . . , r_j, with the additional continuity constraint µ_{j,1} + µ_{j,2} r_j = µ_{j+1,1} + µ_{j+1,2} r_j for j = 1, 2, . . . , K. the change-points, r_k, satisfy f_{r_k − 1} + f_{r_k + 1} ≠ 2 f_{r_k}. 2. piecewise-constant signals: f_t = µ_j for t = r_{j−1} + 1, r_{j−1} + 2, . . . , r_j, with f_{r_j} ≠ f_{r_j + 1}. in this work, we use the isolate-detect (id) methodology of anastasiou and fryzlewicz (2019) to detect changes based on (1) using the linear and constant signals described above; see appendix a-1 for a description of the method. the analysis of count time series data (like the daily incidence data we consider in this work) has attracted considerable attention; see kedem and fokianos (2002, sec. 4 & 5) for several references and fokianos (2015) for a more recent review of this research area. 
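as a toy illustration of the piecewise-constant model above, the following python sketch (a minimal, hypothetical implementation, not the authors' id code) locates a single mean change by maximising the standard cusum contrast between the sample means on either side of each candidate split:

```python
import numpy as np

def best_split(x):
    """Locate a single mean-change in x by maximising the CUSUM contrast
    between the sample means of x[:b] and x[b:]."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    best_b, best_stat = 1, -np.inf
    total = x.sum()
    csum = 0.0
    for b in range(1, n):
        csum += x[b - 1]
        # scaled contrast between the partial sums left and right of b
        stat = abs(np.sqrt((n - b) / (n * b)) * csum
                   - np.sqrt(b / (n * (n - b))) * (total - csum))
        if stat > best_stat:
            best_b, best_stat = b, stat
    return best_b, best_stat
```

binary segmentation would now recurse on x[:b] and x[b:]; isolate-detect instead applies such a contrast inside expanding intervals so that each change-point is first isolated before being detected.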
in what follows, we take the point of view of generalized linear modelling as advanced by mccullagh and nelder (1989). this framework naturally generalizes the traditional arma methodology and includes several complicated data-generating processes besides count data, such as binary and categorical data. in addition, fitting of such models can be carried out by likelihood methods; therefore testing, diagnostics and all types of likelihood arguments are available to the data analyst. the logarithmic function is the most popular link function for modeling count data. in fact, this choice corresponds to the canonical link of generalized linear models. suppose that {x_t} denotes a daily incidence time series and assume that, given the past, x_t is conditionally poisson distributed with mean λ_t. define ν_t ≡ log λ_t. a log-linear model with feedback for the analysis of count time series (fokianos and tjøstheim (2011)) is defined as

ν_t = d + a_1 ν_{t−1} + b_1 log(1 + x_{t−1}). (2)

in general, the parameters d, a_1, b_1 can be positive or negative, but they need to satisfy certain conditions to obtain stability of the model. the inclusion of the hidden process makes the mean of the process depend on the long-term past values of the observed data. further discussion of model (2) can be found in appendix a-2, which also includes some discussion about interventions. an intervention is an unusual event that has a temporary or a permanent impact on the observed process. computational methods for discovering interventions, in the context of (2), under a general mixed poisson framework have been discussed by liboschik et al. (2017). in this work, we consider additive outliers (ao) defined by

ν_t = d + a_1 ν_{t−1} + b_1 log(1 + x_{t−1}) + Σ_{k=1}^{K} γ_k I(t = r_k), (3)

where the notation follows closely that of sec. 2.2 and I(·) denotes the indicator function. inclusion of the indicator function shows that at the time point r_k the mean process has a temporary shift, whose effect is measured by the parameter γ_k, on the log-scale. 
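to make the recursion concrete, here is a short python simulation of model (2), with an optional additive outlier as in (3); the parameter values are illustrative only and are not estimates from the data:

```python
import numpy as np

def simulate(T, d=0.3, a1=0.5, b1=0.3, outlier_time=None, gamma=2.0, seed=1):
    """Simulate nu_t = d + a1*nu_{t-1} + b1*log(1 + x_{t-1}), optionally with
    an additive outlier gamma at a single time point, where
    x_t | past ~ Poisson(lambda_t) and lambda_t = exp(nu_t)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(T, dtype=np.int64)
    nu = np.zeros(T)
    for t in range(1, T):
        nu[t] = d + a1 * nu[t - 1] + b1 * np.log1p(x[t - 1])
        if outlier_time is not None and t == outlier_time:
            nu[t] += gamma          # temporary shift on the log-scale, as in (3)
        x[t] = rng.poisson(np.exp(nu[t]))
    return x, nu
```

because the outlier enters only at a single time point, its direct effect on the log-intensity is a one-day shift of size gamma, which then propagates through the feedback term a1*nu_{t-1}.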
other types of interventions can be included (see appendix a-2), whose effects can be permanent; in this sense, intervention analysis and change-point detection methodologies address similar problems but from different points of view. model fitting is based on maximum likelihood estimation, and its implementation has been described in detail by liboschik et al. (2017). compartmental models in epidemiology, like the susceptible-infectious-recovered (sir) and susceptible-exposed-infectious-recovered (seir) models and their modifications, have been used to model infectious diseases since the early 1920s (see keeling and rohani (2008), nicolaides et al. (2020), among others). the basic assumptions of these models are the existence of a closed community, i.e. without influx of new susceptibles or mortality due to other causes, with a fixed population, say n, and that individuals who recover from the illness are immune and do not become susceptible again. in the basic seir model, at any point in time t, each individual is either susceptible (s(t)), exposed (e(t)), infectious (i(t)) or recovered (r(t), including deaths). the epidemic starts at time t = 0 with one infectious individual, usually thought of as being externally infected, and the rest of the population being susceptible. people progress between the different compartments, and this motion is usually described through a system of ordinary differential equations that can be put in a stochastic framework. a variety of seir modifications and extensions exist in the literature, and a multitude of them emerged recently because of the covid-19 epidemic. in this work, we consider four such modifications, based on the models proposed in peng et al. (2020) and li et al. (2020) for the analysis of the covid-19 epidemic in wuhan and the rest of the chinese provinces. initially, we employ the seir model based on the meta-population model of li et al. 
(2020), simplified to take into account only a single population. the novelty compared to the standard seir model is that this model takes into account the existence of undocumented/asymptomatic infections, which transmit the virus at a potentially reduced rate. the model tracks the evolution of four state variables at each day t, representing the number of susceptible, exposed, infected-reported and infected-unreported individuals, s(t), e(t), i_r(t), i_u(t) respectively. the parameters of the model are the transmission rate β (days^{−1}), the relative transmission rate µ representing the reduction in transmission for asymptomatic individuals, the average latency/incubation period z (days), the average infectious period d (days) and the reporting rate α representing the proportion of infected individuals which are reported. for a graphic description of the model see figure 16. the time evolution of the system is defined by the following set of differential equations (recall n denotes the population size):

ds/dt = −β s i_r/n − µβ s i_u/n,
de/dt = β s i_r/n + µβ s i_u/n − e/z,
di_r/dt = α e/z − i_r/d,
di_u/dt = (1 − α) e/z − i_u/d. (4)

following li et al. (2020), we use a stochastic version of this model with a delay mechanism. each term, say u, on the right-hand side of (4) is replaced by a poisson random variable with mean u. at each day, we use the 4th order runge-kutta numerical scheme to integrate the resulting equations and obtain the values of the four state variables on the next day. for each new reported infection, we draw a gamma random variable with mean τ_d days to determine when this infection will be recorded. for the main analysis we use τ_d = 6 days as the average reporting delay between the onset of symptoms and the recording of an infection; see also li et al. (2020). note that the results are robust with respect to the value of the reporting delay. the final output of this model is the number of recorded infections on each day t, y = y(t). we also use the meta-population model of li et al. (2020). 
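a minimal deterministic skeleton of one integration day of model (4) is sketched below in python; the stochastic version described above would replace each right-hand-side term by a poisson draw, and the parameter values used in the usage example are illustrative assumptions, not fitted values:

```python
import numpy as np

def rhs(state, beta, mu, Z, D, alpha, N):
    """Right-hand side of model (4): state = (S, E, I_r, I_u)."""
    S, E, Ir, Iu = state
    force = beta * S * Ir / N + mu * beta * S * Iu / N   # new exposures
    return np.array([
        -force,                        # dS/dt
        force - E / Z,                 # dE/dt
        alpha * E / Z - Ir / D,        # dI_r/dt: reported infections
        (1 - alpha) * E / Z - Iu / D,  # dI_u/dt: unreported infections
    ])

def rk4_day(state, *params, dt=1.0):
    """One 4th-order Runge-Kutta step of length dt (one day)."""
    k1 = rhs(state, *params)
    k2 = rhs(state + dt / 2 * k1, *params)
    k3 = rhs(state + dt / 2 * k2, *params)
    k4 = rhs(state + dt * k3, *params)
    return state + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
```

for example, starting from a single unreported infection in a population of roughly cyprus' size and iterating `rk4_day` once per day traces out the deterministic epidemic curve; recovered individuals simply leave the four tracked compartments, so the tracked total can only decrease.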
the meta-population model of li et al. (2020) describes the transmission dynamics in a set of populations, indexed by i, connected through human mobility patterns, say m_ij. this is implemented by incorporating information on human movement between the 5 main districts of cyprus: nicosia, limassol, larnaca, paphos and ammochostos. in this case, i = 1, 2, 3, 4, 5 and m_ij denotes the daily number of people traveling from district i to district j, i ≠ j. such information is based on the 2011 census data obtained from the cyprus statistical service. the time evolution of the four compartmental states in each district i is defined by the following set of differential equations:

ds_i/dt = −β s_i i_i^r/n_i − µβ s_i i_i^u/n_i + θ Σ_k m_ki s_k/(n_k − i_k^r) − θ Σ_k m_ik s_i/(n_i − i_i^r),
de_i/dt = β s_i i_i^r/n_i + µβ s_i i_i^u/n_i − e_i/z + θ Σ_k m_ki e_k/(n_k − i_k^r) − θ Σ_k m_ik e_i/(n_i − i_i^r),
di_i^r/dt = α e_i/z − i_i^r/d,
di_i^u/dt = (1 − α) e_i/z − i_i^u/d + θ Σ_k m_ki i_k^u/(n_k − i_k^r) − θ Σ_k m_ik i_i^u/(n_i − i_i^r), (5)

where the notation follows the notation given in sec. 2.4.1. in addition to the four state variables, this model also updates at each time step the population of each area i, say n_i, by

n_i ← n_i + θ Σ_k m_ki − θ Σ_k m_ik,

where the multiplicative factor θ is assumed to be greater than 1 to reflect under-reporting of human movement. like model (4), model (5) is integrated stochastically with the delay mechanism for reported infections. further, we consider the generalised seir model of peng et al. (2020), consisting of seven states: (s(t), p(t), e(t), i(t), q(t), r(t), d(t)). at time t, the susceptible cases s(t) become either insusceptible, p(t), with rate ζ, or exposed, e(t), with rate β, i.e. infected but not yet infectious (in a latent state). some of the exposed cases eventually become infected with rate γ. infected means that they have the capacity to infect but are not yet quarantined, q(t). the introduction of the new quarantined state, q(t), into the classical seir model, formed from the infected cases at a constant rate δ, allows one to take into account the effect of preventive measures. finally, the quarantined cases are split into cured cases, r(t), with rate λ(t), and closed (deceased) cases, d(t), with mortality rate κ(t). 
the model's parameters are the transmission rate β, the protection rate ζ, the average latent time γ^{−1} (days), the average quarantine time δ^{−1} (days), as well as the time-dependent cure rate λ(t) and mortality rate κ(t). the relations are characterized by the following system of difference equations:

s(t+1) − s(t) = −β s(t) i(t)/n − ζ s(t),
p(t+1) − p(t) = ζ s(t),
e(t+1) − e(t) = β s(t) i(t)/n − γ e(t),
i(t+1) − i(t) = γ e(t) − δ i(t),
q(t+1) − q(t) = δ i(t) − λ(t) q(t) − κ(t) q(t),
r(t+1) − r(t) = λ(t) q(t),
d(t+1) − d(t) = κ(t) q(t). (6)

the total population size is assumed to be constant and equal to n = s(t) + p(t) + e(t) + i(t) + q(t) + r(t) + d(t). according to the official reports, the numbers of quarantined cases (q), recovered (r) and deaths (d) due to covid-19 are available. however, the recovered and death cases are directly related to the number of quarantined cases, which plays an important role in the analysis, especially since the numbers of exposed (e) and infectious (i) cases are very hard to determine. the latter two are therefore treated as hidden variables. this implies that we need to estimate the four parameters ζ, β, γ^{−1}, δ^{−1} and both the time-dependent cure rate λ(t) and mortality rate κ(t). notice here that while the rest of the parameters are considered fixed during the pandemic, we allow the cure and mortality rates to vary with time. we expect that the former will increase with time, given that social distancing measures have been put in place, while the latter will decrease. finally, this is an optimization problem, and the methodology we have followed in order to address it can be found in appendix a-3. the last model we consider is a modified version of a solution created by bettencourt and ribeiro (2008) to estimate the real-time effective reproduction number r_t, using a bayesian approach on a simple susceptible-infected (si) compartmental model, in which the daily count of new cases k_t evolves approximately as k_{t+1} ≈ k_t exp(γ(r_t − 1)), with γ the reciprocal of the serial interval. we use bayes' rule to update the beliefs about the true value of r_t based on our predictions and on how many new cases have been reported each day. 
having seen k new cases on day t, the posterior distribution of r_t is proportional to (denoted by ∝) the prior beliefs of the value of r_t times the likelihood of r_t given that we have recorded k new cases, i.e., p(r_t | k) ∝ p(r_t) × l(r_t | k). to make this iterative, every day that passes we use the previous day's posterior p(r_{t−1} | k_{t−1}) as today's prior p(r_t), so that the posterior at day t accumulates the likelihoods of all days up to t. in this formulation, however, the posterior is influenced equally by all previous days. thus, we adopt a modification suggested in systrom (2020) that shortens the memory and incorporates only the last m days of the likelihood function, p(r_t | k) ∝ ∏_{s=t−m+1}^{t} l(r_t | k_s). the likelihood function is modelled with a poisson distribution. recall the compartmental models discussed in sec. 2.4.1 and 2.4.2. then the effective reproduction number is given by

r_t = α β d + (1 − α) µ β d, (8)

see the supplement of li et al. (2020). we estimate r_t in (8) during consecutive fortnight periods, for which its value is considered to be constant. to achieve this we estimate the parameters of each model, also assumed to be constant over each fortnight, using daily incidence data for cyprus. to estimate the parameters we employ bayesian statistics, that is, we postulate prior distributions on the parameters and incorporate the data and the model (through the likelihood) to obtain the posterior distributions of the parameters. the posterior distributions capture our updated beliefs about the parameters after combining the prior with the observed data; see, for example, bernardo and smith (1994). for the model defined by (4), we consider the whole area of cyprus as a single uniform population. in this case, the observations are not sufficiently informative to identify all five parameters of the model. a solution would be to enforce identifiability by postulating strongly informative prior distributions on the parameters. 
instead, we choose to make the assumption that the parameters z, d and µ have globally constant values, fixed over time. in particular we set d = 3.5 and µ = 0.5, as estimated in li et al. (2020), and z = 5.1, which appears to be the globally accepted mean incubation period. we thus only need to infer the reporting rate α and the transmission rate β, which both vary between different fortnights and between different countries, because of the amount of testing and the degree of adherence to the social distancing policies. on the other hand, the model defined by (5) is sufficiently informative to allow inference of all six model parameters. all computational methods, prior modelling and assumptions in relation to both compartmental models discussed in sec. 2.4.1 and 2.4.2 are given in appendix a-4. in addition to the above methods, we also consider the method of cori et al. (2013) as a benchmark against which to compare all methodologies for estimating the effective reproduction number. by the end of may 2020, 952 cases of covid-19 had been diagnosed in the republic of cyprus. of these, 50.2% were male (n = 478) and the median age was 45 years (iqr: 31-59 years). the setting of potential exposure was available for 807 cases (84.8%). of these, 17.4% (n = 140) had a history of travel or residence abroad during a 14-day period before the onset of symptoms. locally acquired infections were 667 (82.7%), with 8.6% (n = 57) related to a health-care facility in one geographical setting (cluster a) and 12.4% (n = 83) clustered in another setting (cluster b). the epidemic curve by date of sampling and date of symptom onset is shown in figure 1. the number of cases started to decline in april, reaching very low levels in late may. in this section, we investigate the long-term impact of covid-19 in cyprus. towards this, we give long-term projections for the daily incidence and death rates. 
we fit system (6) to the covid-19 data that were collected during the period from the 1st of march 2020 until the 31st of may 2020 in cyprus. we treat all the reported cases without making the distinction between local and imported. the model parameters are estimated using the methodology described in appendix a-3. once the model is fitted to data, it can be used to forecast the epidemic. in order to study the evolution of the model as new data are added, and the quality of the respective forecasts, we have fitted model (6) using different time periods. specifically, four datasets were formed using the daily reported incidences from the beginning of the observation period until and including the 2/4/2020, 17/4/2020, 15/5/2020 and 24/5/2020 respectively. the dates were chosen according to the change-points detected using the methodology described in section 2.2; see also section 3.3. the fitted model in each case was used to predict the pandemic's evolution until the 30/6/2020. in figure 2, we show the number of predicted exposed plus infectious cases (green solid lines) and the number of predicted recovered cases (blue solid lines) for the duration of the prediction period, and compare them to the observed cases, which are indicated by circles and triangles. we use circles for data that have been used in the prediction and triangles for the observed data that are used for validation. visual inspection shows that, after a period of about two months during which the model overestimates the number of active cases and underestimates the number of recovered (see figure 2, top), model (6) was able to capture accurately the evolution of the pandemic (figure 2, bottom). the performance of the predictions can also be evaluated by means of the relative error (re), re = ‖x − y‖/‖x‖, where x is the vector with entries x_t, the datum for day t, and y the corresponding vector of model predictions y_t. 
the re for the recovered cases equals 0.4%, 0.2%, 0.3% and 0.3% for the four time periods respectively, with the corresponding re for the active cases being high in the beginning (18%, 5.8%) but then dropping considerably (0.16% and 0.1%), reflecting the fact that the model caught up with the evolution of the pandemic. overall, system (6) gives adequate predictions, especially when data from longer time periods are used, in particular for the active cases. figure 3 shows the number of deaths and their respective predictions using subsets of data as described above. in the duration of the first data set there were no deaths registered, and therefore the prediction was identically zero, giving also an re equal to 100%; see figure 3 (top left). as more deaths are registered, the model's ability to predict the correct number of deaths improves; see figure 3. the recovery rate λ(t) is modelled as

λ(t) = λ_1 / (1 + exp(−λ_2 (t − λ_3))), λ_i ≥ 0, i = 1, 2, 3, (9)

the idea being that the recovery rate, as time increases, should converge towards a constant. in figure 4 (left), the fitted recovery rate (solid line) is plotted against the observed number of recovered cases (stars). finally, model 3 can be used to estimate the unobserved numbers of exposed, e(t), and infectious, i(t), cases during the development of the pandemic. the maximum number of exposed cases occurs on the 21st of march 2020 and is estimated to be 173 cases, figure 4 (right, blue line), with the maximum of infectious individuals (136) being attained on the 26th of march 2020. we can observe a delay in the transition of exposed to infectious in the order of 5 days, which suggests a 5-day latent time of covid-19. we first consider the change-point detection method of sec. 2.2 for the case of the piecewise-linear signal plus noise model. figure 5 illustrates the results obtained by this analysis on the daily incidence data. 
we first fit model (2) and then include the detected interventions as in (3). note again that the sum of the estimated feedback coefficients, 0.779 + 0.211 ≈ 1, shows that the non-stationarity persists even after including additive outliers (on the log-scale). furthermore, the positive sign of both interventions shows the sudden explosion of the daily number of people infected. the corresponding bic value obtained after fitting this model is 576.643, which improves on the bic of the model without interventions, which was 615.766. figure 6 shows the fit of the model to the data and gives 95% prediction intervals for the week ahead. comparing the change-point analysis (see fig. 5) and the results obtained by the above intervention analysis, we observe that both approaches give similar prediction intervals that include the future observed incidence data. indeed, the observed data for the week ahead (01/06/2020-07/06/2020) were 4, 6, 1, 0, 5, 5 and 1 cases. recall the effective reproduction number r_t defined by (8). we perform a bayesian analysis using (4) (see appendix a-4). for the data concerning all incidents, the first recorded incident was on 07/03/2020; hence, as detailed in appendix a-4, we initialize our analysis of the outbreak 3 days earlier, on 04/03/2020. the corresponding figure shows the posterior probabilities of the event r_t < 1 for the analysis using the full data. next, we consider the estimation model described in li et al. (2020), where cyprus is divided into 5 subpopulations (nicosia, limassol, larnaca, paphos, ammochostos) and the mobility patterns between them are taken into account (as described in meta-population compartmental model 2). the effective reproduction number is given by (8). the compartmental model 2 structure was integrated stochastically using a 4th order runge-kutta (rk4) scheme. we use uniform prior distributions on the parameters of the model, with ranges similar to li et al. (2020), as follows: relative transmissibility 0.2 ≤ µ ≤ 1; movement factor 1 ≤ θ ≤ 1.75; latency period 3.5 ≤ z ≤ 5.5; infectious period 3 ≤ d ≤ 4. 
for the infection rate we choose 0.1 ≤ β ≤ 1.5 before the lockdown and 0 ≤ β ≤ 0.8 after the lockdown, and for the reporting rate we choose 0.3 ≤ α ≤ 1. note that the ensemble adjustment kalman filter (eakf, described in appendix a-4) is not constrained by the initial priors and can migrate outside these ranges to obtain system solutions. for initialization purposes we assume that all 5 districts are potential origins, with an undocumented infected and exposed population drawn from a uniform distribution on [0, 5] a week before the first documented case. the initial condition does not affect the outcome of the inference. transmission model 2 does not explicitly represent the process of infection confirmation. thus, we mapped simulated documented infections to confirmed cases using a separate observational delay model. in this delay model, we account for the time interval between a person transitioning from latent to contagious and the observational confirmation of that individual's infection through a delay of t_d. we assume that t_d follows a gamma distribution g(a, τ_d/a), where τ_d = 6 days and a = 1.85, as derived by li et al. (2020) using data from china. the inference is robust with respect to the choice of τ_d. for the inference we use incidents from local transmission in cyprus as reported by the ministry of health. in figure 10 we plot the time evolution of the weekly effective reproduction number r_t. while at the beginning of the outbreak the effective reproduction number was close to 2.5, after the lockdown measures it dropped below 1 and stayed consistently there until the end of june 2020. we then use the methodology proposed by bettencourt and ribeiro (2008) and recently modified by systrom (2020), as described in detail in section 2.4.4. for that method we also use the incidents from local transmission in cyprus as reported by the ministry of health. 
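the modified bettencourt-ribeiro update with an m-day likelihood window can be sketched on a discrete grid of candidate r_t values; in this minimal python illustration, the serial-interval constant γ = 1/7 and the toy case counts are assumptions for the example, not values used in the paper:

```python
import numpy as np
from math import lgamma

GAMMA = 1 / 7.0                       # assumed reciprocal of the serial interval
R_GRID = np.linspace(0.01, 6.0, 600)  # candidate values for R_t

def log_pois(k, lam):
    """Log Poisson pmf, vectorised over the rate lam."""
    return k * np.log(lam) - lam - lgamma(k + 1)

def rt_posteriors(cases, m=7):
    """For each day t >= 1, posterior over R_GRID built from the last m
    daily likelihoods L(R_t | k_s), s = t-m+1, ..., t, under the growth
    relation k_t ~ Poisson(k_{t-1} * exp(GAMMA * (R_t - 1)))."""
    T = len(cases)
    ll = np.zeros((T, len(R_GRID)))
    for t in range(1, T):
        lam = max(cases[t - 1], 0.1) * np.exp(GAMMA * (R_GRID - 1))
        ll[t] = log_pois(cases[t], lam)
    out = []
    for t in range(1, T):
        w = ll[max(1, t - m + 1): t + 1].sum(axis=0)   # m-day memory window
        p = np.exp(w - w.max())
        out.append(p / p.sum())
    return out

# toy example: a steadily growing case series
post = rt_posteriors([10, 12, 15, 18, 22, 27, 33])
```

for a steadily growing series, the posterior mass concentrates above r_t = 1, while a declining series pushes it below 1; shrinking m makes the estimate react faster at the cost of wider credible intervals.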
figure 11 shows the daily median value as well as the 95% credible intervals for the effective reproduction number using that method. the work presented in this report is the result of intensive collaboration of an interdisciplinary team which was formed shortly after the pandemic started. the main motivation was to give guidance to the cypriot government for controlling this major infectious disease outbreak. accordingly, we developed models and methods that are of critical importance in appreciating how this disease is developing, what its next stage will be, and in what kind of time framework. this is valuable information for outbreak control, resource utilization and for resuming normal daily life. we followed diverse paths to accomplish this by appealing to different modeling approaches and methods. we have shown that the government interventions were successful in containing covid-19 in cyprus by the end of may, even though the disease started with a high value of r_t. the government lockdown helped reduce the reproduction number, as the data show consistently across the different methodologies applied. in addition, we have shown, by change-point methodology and time series analysis, the effect of the various measures taken, and have developed short-term predictions. the models we applied are based on simple surveillance data, seem to work well, give similar results, and can certainly help epidemiologists and public health officials quantify and understand changes in the transmission intensity of future epidemics and the drivers of these changes. finally, we feel that our approach of bringing together experts from various fields avoids misunderstandings and gaps in communication between scientists, and maximizes the effectiveness of efforts to deal with public health emergencies. 
the existing change-point detection techniques for the scenarios mentioned in section 2.2 are mainly split into two categories, based on whether the change-points are detected all at once or one at a time. the former category mainly includes optimization-based methods, in which the estimated signal is chosen based on a least squares or log-likelihood criterion, penalized by a complexity rule in order to avoid overfitting. the most common example of a penalty function is the bayesian information criterion (bic); see schwarz (1978) and yao (1988) for details. in the latter category, in which change-points are detected one at a time, a popular method is binary segmentation, which performs an iterative binary splitting of the data on intervals determined by the previously obtained splits. even though binary segmentation is conceptually simple, it has the disadvantage that at each step of the algorithm it looks for a single change-point, which leads to suboptimality in terms of accuracy, especially for signals with frequent change-points. one method that works towards solving this issue is the isolate-detect (id) methodology of anastasiou and fryzlewicz (2019); it is the method used for the analysis carried out in this paper. the concept behind id is simple and is split into two stages: firstly, the isolation of each of the true change-points within subintervals of the domain [1, 2, . . . , t], and secondly their detection. the basic idea is that, for an observed data sequence of length t and a positive constant λ_t, id first creates two ordered sets of k = ⌈t/λ_t⌉ right- and left-expanding intervals as follows. the j-th right-expanding interval is r_j = [s, s + jλ_t − 1], while the j-th left-expanding interval is l_j = [e − jλ_t + 1, e], where [s, e] denotes the interval currently under consideration (initially s = 1 and e = t). for clarity of exposition, we give below a simple example. figure 13 covers a specific case of two change-points, r_1 = 38 and r_2 = 77. we will be referring to phases 1 and 2, involving six and four intervals, respectively. 
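the interval construction just described can be written in a few lines of python (a hypothetical sketch of the expansion scheme, not the authors' implementation):

```python
def id_intervals(s, e, lam):
    """Right- and left-expanding intervals for Isolate-Detect on [s, e]
    with expansion parameter lam: R_j = [s, s + j*lam - 1] and
    L_j = [e - j*lam + 1, e], truncated at the interval ends."""
    K = (e - s + 1) // lam
    right = [(s, min(s + j * lam - 1, e)) for j in range(1, K + 1)]
    left = [(max(e - j * lam + 1, s), e) for j in range(1, K + 1)]
    return right, left

# the setting of the example discussed in the text
right, left = id_intervals(1, 100, 10)
```

with s = 1, e = 100 and λ_t = 10, the third left-expanding interval is [71, 100]: it contains r_2 = 77 but not r_1 = 38, which is how the second change-point gets isolated before being detected.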
these are clearly indicated in the plot and are only related to this specific example; cases with more change-points will entertain more such phases. at the beginning, s = 1, e = t = 100, and we take the expansion parameter λ_t = 10. then, r_2 gets detected in {x_{s*}, x_{s*+1}, . . . , x_e}, where s* = 71. recall (2) and that the parameters d, a_1, b_1 can be positive or negative, but they need to satisfy certain conditions so that we obtain stable behavior of the process. note that the lagged observations of the response x_t are fed into the autoregressive equation for ν_t via the term log(x_{t−1} + 1). this is a one-to-one transformation of x_{t−1} which avoids zero data values. moreover, both λ_t and x_t are transformed onto the same scale. covariates can be easily accommodated by model (2). when a_1 = 0, we obtain an ar(1)-type model in terms of log(x_{t−1} + 1). in addition, the log-intensity process of (2) can be rewritten, after repeated substitution, as

ν_t = d (1 − a_1^t)/(1 − a_1) + a_1^t ν_0 + b_1 Σ_{i=0}^{t−1} a_1^i log(1 + x_{t−1−i}).

hence, we obtain again that the hidden process {ν_t} is determined by past functions of lagged responses, i.e. (2) belongs to the class of observation-driven models; see cox (1981). in models like (2), the mean process is determined by a latent process. therefore a formal linear structure, as in the case of the gaussian linear time series model, does not hold any more, and the interpretation of interventions is a more complicated issue. hence, a method which allows the detection of interventions and the estimation of their size is needed, so that structural changes can be identified successfully. important steps to achieve this goal are the following; see chen and liu (1993): 1. a suitable model for accommodating interventions in count time series data. 2. derivation of test procedures for their successful detection. 3. implementation of joint maximum likelihood estimation of model parameters and outlier sizes. 4. correction of the observed series for the detected interventions. 
all these issues and possible directions for further development of the methodology have been addressed by liboschik et al. (2017) under the poisson and mixed poisson distributional framework. recall model (6). according to the official reports, the numbers of quarantined cases (q), recovered (r) and deaths (d) due to covid-19 are available. however, the recovered and death cases are directly related to the number of quarantined cases, which plays an important role in the analysis, especially since the numbers of exposed (e) and infectious (i) cases are very hard to determine. the latter two are therefore treated as hidden variables. this implies that we need to estimate the four parameters ζ, β, γ^{−1}, δ^{−1} and both the time-dependent cure rate λ(t) and mortality rate κ(t). this is an optimization problem that we solve as follows: first we allow the latent time γ^{−1} to vary between 1 and 7 days and, for each fixed γ^{−1}, we explore its influence on the rest of the parameters. the system of equations (6) is solved numerically using the runge-kutta 45 numerical scheme. the left plot of figure 14 shows that the protection rate ζ and the transmission rate β both attain their corresponding maximum value when γ^{−1} is equal to 3 days. note that ζ takes values between 0.08 and 0.2, while β converges very fast to 1. the reciprocal of the quarantine time, δ^{−1}, increases with the latent time γ^{−1}. one would suspect that a longer latent time results in a higher transmission rate, and indeed, as the latent time increases, almost every unprotected person will be infected after a direct contact with a covid-19 patient. the right plot of figure 14 shows the effect of the latent time on the total number of infected cases (exposed and infectious, e(t) + i(t)) not yet quarantined. 
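the forward iteration of the difference system (6) can be sketched as follows in python; this is a simple one-day update with illustrative parameter values and with λ and κ held constant for simplicity (the paper fits time-varying rates and uses a runge-kutta scheme instead):

```python
import numpy as np

def step(state, beta, zeta, gamma, delta, lam, kappa, N):
    """One day of the difference equations (6); state = (S, P, E, I, Q, R, D).
    The updates cancel pairwise, so the total population is conserved."""
    S, P, E, I, Q, R, D = state
    new_exposed = beta * S * I / N
    return state + np.array([
        -new_exposed - zeta * S,       # S: newly exposed or protected
        zeta * S,                      # P: insusceptible
        new_exposed - gamma * E,       # E: latent
        gamma * E - delta * I,         # I: infectious, not yet quarantined
        delta * I - lam * Q - kappa * Q,  # Q: quarantined
        lam * Q,                       # R: recovered
        kappa * Q,                     # D: deceased
    ])
```

iterating `step` from a small seed of exposed and infectious cases traces the epidemic while keeping s + p + e + i + q + r + d constant, which is a useful sanity check on any implementation of (6).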
the peak of the infection was achieved between the 21st and the 24th of march, depending on the latent time, with the estimated number of infected people ranging between 338 and 526, again depending on the latent time considered. hence, once the latent time γ^{−1} is fixed, the fitting performance depends on the values of ζ, β and δ^{−1}. after a small sensitivity analysis the latent time was finally set to 3 days. the mortality rate κ(t) is constantly very small and almost equal to zero, therefore we have not attempted to fit any function to it. for the cure rate λ(t) we have fitted the exponential-type function given in (9), the idea being that with time the recovery should converge to a constant rate. for the parameter estimation we have used a modified version of the matlab code given by cheynet (2020); the modification was necessary because cyprus is a small country and this fact needs to be taken properly into account. figure 14: sensitivity analysis of the parameters for the model defined by (6): the influence of the latent time γ^{−1} on the protection rate ζ, the transmission rate β and the quarantine time δ^{−1} (left plot), and on the sum of exposed and infectious cases e(t) + i(t) (right plot). we present a bayesian analysis for the model defined by (4). the prior for the reporting rate α depends on the time period under consideration. in particular, in the first period (when the number of tests was relatively low) we employ a symmetric prior around the value α = 0.5, while for later periods (when the number of targeted and random tests increased) we let the prior become progressively skewed towards 1. for the transmission rate β > 0, in the first period we use a gamma(3/2, 3/2) prior, which puts high probability around 2, while for later periods we use an exponential(1) prior, which puts more mass closer to zero. this choice reflects the existence of super-spreaders in the early stages of the outbreak with higher probability compared to later on. in each time period under consideration we also need to initialize the outbreak in cyprus. 
for the first period in both datasets, we use a uniform prior supported on {0, 1, . . . , 10} for the number of exposed and the number of undocumented infected 3 days before the first recorded incident. the two priors are independent, the number of susceptible individuals is taken equal to cyprus' population and the number of infected-reported is set to zero. for later periods, we use as priors on the four state variables their posterior distributions at the end of the previous period (corrected appropriately based on the observation at the end of the previous period). following li et al. (2020), we assume that the daily numbers of reported cases are independent gaussian random variables and use an empirical variance, where y(t) denotes the number of infected cases at day t. this allows us to build a gaussian likelihood for the parameters α and β. combining this likelihood with the prior distributions, we can deduce a formula for the posterior distribution of α and β. this distribution is not available in closed form; hence, in order to compute posterior estimates and their respective uncertainty quantification, we need to sample from it. in the relatively simple setting of model 1, it is feasible to employ markov chain monte carlo methods (see robert and casella (2013)) to sample the posterior (namely, we use an independence sampler). this is in contrast to the model defined by (5) in sec. 2.4.2, see li et al. (2020), where one has to use the ensemble adjustment kalman filter (eakf), which introduces some approximations to the posterior distribution due to the more complex meta-population structure. originally developed for use in weather prediction, the eakf assumes a gaussian distribution for both the prior and the likelihood and adjusts the prior distribution to a posterior using bayes' rule deterministically.
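the independence sampler mentioned above can be sketched on a toy problem. the data model and priors below are stand-ins (a gaussian likelihood around a linear-in-β mean, a uniform(0, 1) prior on α and an exponential(1) prior on β), not the paper's exact specification: the point is that proposing from the prior makes the metropolis–hastings acceptance ratio reduce to a likelihood ratio.

```python
import math
import random

random.seed(1)

# synthetic daily observations with known parameters (hypothetical, for illustration)
true_alpha, true_beta, sigma = 0.5, 0.3, 0.5
xs = list(range(20))
ys = [true_alpha + true_beta * x + random.gauss(0.0, sigma) for x in xs]

def log_lik(a, b):
    """Gaussian log-likelihood of the synthetic observations."""
    return sum(-0.5 * ((y - (a + b * x)) / sigma) ** 2 for x, y in zip(xs, ys))

# independence sampler: proposals are drawn from the priors themselves,
# so the prior terms cancel and the acceptance ratio is a likelihood ratio
a_cur, b_cur = 0.5, 1.0
ll_cur = log_lik(a_cur, b_cur)
draws, accepted = [], 0
for i in range(30_000):
    a_prop = random.random()           # uniform(0, 1) prior proposal for alpha
    b_prop = random.expovariate(1.0)   # exponential(1) prior proposal for beta
    ll_prop = log_lik(a_prop, b_prop)
    if math.log(random.random()) < ll_prop - ll_cur:
        a_cur, b_cur, ll_cur = a_prop, b_prop, ll_prop
        accepted += 1
    if i >= 10_000:                    # discard burn-in
        draws.append((a_cur, b_cur))

beta_mean = sum(b for _, b in draws) / len(draws)
accept_rate = accepted / 30_000
```

the posterior mean of β recovers the value used to generate the data, illustrating why this sampler suffices for the simple model while the meta-population model needs the eakf.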
in particular, the eakf assumes that both the prior distribution and the likelihood are gaussian, and thus can be fully characterized by their first two moments (mean and variance). the update scheme for ensemble members is computed using bayes' rule (posterior ∝ prior × likelihood) via the convolution of the two gaussian distributions (see li et al. (2020) for the implementation). we report the results obtained after fitting a piecewise-constant signal plus noise model, as described in sec. 2.2. the scenario here is that at each change-point there is a sudden jump in the mean level of the signal. data and code are available at github (https://github.com/chrisnic12/covid_cyprus).

references:
- detecting multiple generalized change-points by isolating single ones
- remdesivir for the treatment of covid-19 - preliminary report
- bayesian theory
- real time bayesian estimation of the epidemic potential of emerging infectious diseases
- statistical fraud detection: a review
- joint estimation of model parameters and outlier effects in time series
- generalized seir epidemic model (fitting and computation)
- a new framework and software to estimate time-varying reproduction numbers during epidemics
- the species severe acute respiratory syndrome-related coronavirus: classifying 2019-ncov and naming it sars-cov-2
- statistical analysis of time series: some recent developments
- impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
- statistical analysis of count time series models: a glm perspective
- log-linear poisson autoregression
- an mrna vaccine against sars-cov-2 - preliminary report
- inference for single and multiple change-points in time series
- regression models for time series analysis
- modeling infectious diseases in humans and animals
- a contribution to the mathematical theory of epidemics
- substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov-2)
- tscount: an r package for analysis of count time series following generalized linear models
- generalized linear models
- hand-hygiene mitigation strategies against global disease spreading through the air transportation network
- circular binary segmentation for the analysis of array-based dna copy number data
- epidemic analysis of covid-19 in china by dynamical modeling
- monte carlo statistical methods
- adaptive trend estimation in financial time series via multiscale change-point-induced basis recovery
- estimating the dimension of a model
- the metric we need to manage covid-19: rt, the effective reproduction number
- coronavirus disease (covid-19) - situation report-197
- estimating the number of change-points via schwarz' criterion
- a novel coronavirus from patients with pneumonia in china

key: cord-024552-hgowgq41
authors: zhang, ruixi; zen, remmy; xing, jifang; arsa, dewa made sri; saha, abhishek; bressan, stéphane
title: hydrological process surrogate modelling and simulation with neural networks
date: 2020-04-17
journal: advances in knowledge discovery and data mining
doi: 10.1007/978-3-030-47436-2_34

environmental sustainability is a major concern for urban and rural development. actors and stakeholders need economic, effective and efficient simulations in order to predict and evaluate the impact of development on the environment and the constraints that the environment imposes on development. numerical simulation models are usually computationally expensive and require expert knowledge. we consider the problem of hydrological modelling and simulation. with a training set consisting of pairs of inputs and outputs from an off-the-shelf simulator, we show that a neural network can learn a surrogate model effectively and efficiently and thus can be used as a surrogate simulation model. moreover, we argue that the neural network model, although trained on some example terrains, is generally capable of simulating terrains of different sizes and spatial characteristics.
an article in the nikkei asian review dated 13 september 2019 warns that both the cities of jakarta and bangkok are sinking fast. these iconic examples are far from being the only human developments under threat. the united nations office for disaster risk reduction reports that the lives of millions were affected by the devastating floods in south asia and that around 1,200 people died in bangladesh, india and nepal [30]. climate change, increasing population density, weak infrastructure and poor urban planning are factors that increase the risk of floods and aggravate their consequences in those areas. under such scenarios, urban and rural development stakeholders are increasingly concerned with the interactions between the environment and urban and rural development. in order to study such complex interactions, stakeholders need effective and efficient simulation tools. a flood occurs with a significant temporary increase in the discharge of a body of water. among the variety of factors leading to floods, heavy rain is one of the most prevalent [17]. when heavy rain falls, water overflows from river channels and spills onto the adjacent floodplains [8]. the hydrological process from rainfall to flood is complex [13]. it involves nonlinear, time-varying interactions between rain, topography, soil types and other components associated with the physical process. several physics-based hydrological numerical simulation models, such as hec-ras [26], lisflood [32] and lisflood-fp [6], are commonly used to simulate floods. however, such models are usually computationally expensive, and expert knowledge is required both for their design and for accurate parameter tuning. we consider the problem of hydrological modelling and simulation. neural network models are known for their flexibility, efficient computation and capacity to deal with nonlinear correlations inside data.
we propose to learn a flood surrogate model by training a neural network with pairs of inputs and outputs from the numerical model. we empirically demonstrate that the neural network can be used as a surrogate model to effectively and efficiently simulate the flood. the neural network that we train learns a general model: with the model trained on a given data set, the neural network is capable of directly simulating spatially different terrains. moreover, while a neural network is generally constrained to a fixed input size, the model that we propose is able to simulate terrains of different sizes and spatial characteristics. this paper is structured as follows. section 2 summarises the main related works regarding physics-based hydrological and flood models as well as statistical machine learning models for flood simulation and prediction. section 3 presents our methodology. section 4 presents the data set, parameter settings and evaluation metrics. section 5 describes and evaluates the performance of the proposed models. section 6 presents the overall conclusions and outlines future directions for this work. current flood models simulate the fluid movement by solving equations derived from physical laws with many assumptions about the hydrological process. these models can be classified into one-dimensional (1d), two-dimensional (2d) and three-dimensional (3d) models depending on the spatial representation of the flow. the 1d models treat the flow as one-dimensional along the river and solve the 1d saint-venant equations; examples are hec-ras [1] and swmm [25]. the 2d models receive the most attention and are perhaps the most widely used models for floods [28]. these models solve different approximations of the 2d saint-venant equations. two-dimensional models such as hec-ras 2d [9] have been used to simulate floods in the assiut plateau in southwestern egypt [12] and in the bolivian amazonia [23].
another 2d flow model, lisflood-fp, solves a dynamic wave model that neglects the advection term to reduce the computational complexity [7]. the 3d models are more complex and mostly unnecessary, as 2d models are adequate [28]. therefore, we focus our work on 2d flow models. instead of conceptual physics-based models, several statistical machine learning based models have been utilised [4, 21]. one state-of-the-art machine learning model is the neural network [27]. tompson [29] uses a combination of neural network models to accelerate the simulation of fluid flow. bar-sinai [5] uses neural network models to study the numerical partial differential equations of fluid flow in two dimensions. raissi [24] developed physics-informed neural networks for solving general partial differential equations and tested them on the scenario of incompressible fluid movement. dwivedi [11] proposes a distributed version of physics-informed neural networks and studies the case of the navier-stokes equations for fluid movement. besides the idea of accelerating the computation of partial differential equations, some neural networks have been developed in an entirely data-driven manner. ghalkhani [14] develops a neural network for a flood forecasting and warning system in the madarsoo river basin in iran. khac-tien [16] combines a neural network with a fuzzy inference system for daily water-level forecasting. other authors [31, 34] apply neural network models to predict floods from collected gauge measurements. those models, implementing neural networks in one dimension, did not take spatial correlations into account. the authors of [18, 35] use combinations of convolutional and recurrent neural networks as surrogate models of navier-stokes-based fluid models in higher dimensions. the recent work [22] develops a convolutional neural network model to predict floods in two dimensions by taking spatial correlations into account.
the authors focus on one specific region along the colorado river. they use a convolutional neural network and a conditional generative adversarial network to predict the water level at the next time step. the authors conclude that neural networks can achieve high approximation accuracy while being a few orders of magnitude faster. instead of focusing on one specific region and learning a model specific to the corresponding terrain, our work focuses on learning a general surrogate model applicable to terrains of different sizes and spatial characteristics with a data-driven machine learning approach. we propose to train a neural network with pairs of inputs and outputs from an existing flood simulator; the output provides the necessary supervision. we choose the open-source python library landlab, which is lisflood-fp based. we first define our problem in subsect. 3.1. then, we introduce the general ideas of the numerical flood simulation model and landlab in subsect. 3.2. finally, we present our solution in subsect. 3.3. we first introduce the representation of the three hydrological parameters that we use in the two-dimensional flood model. a digital elevation model (dem) d is a w × l matrix representing the elevation of a terrain surface. a water level h is a w × l matrix representing the water elevation on the corresponding dem. a rainfall intensity i generally varies spatially and should be a matrix representing the rainfall intensity; however, the current simulator assumes that the rainfall does not vary spatially, so in our case i is a constant scalar. our work intends to find a model that can represent the flood process. a flood happens because the rain drives the water level to change over the terrain region. the model receives three inputs: a dem d, the water level h_t and the rainfall intensity i_t at the current time step t. the model outputs the water level h_{t+1} as the result of the rainfall i_t on the dem d.
the learning process can be formulated as learning the function l such that h_{t+1} = l(d, h_t, i_t). physics-driven hydrology models for floods in two dimensions are usually based on the two-dimensional shallow water equations, a simplified version of the navier-stokes equations averaged over the depth direction [28]. ignoring the diffusion of momentum due to viscosity, turbulence, wind effects and coriolis terms [10], the two-dimensional shallow water equations comprise two parts, conservation of mass and conservation of momentum, shown in eqs. 1 and 2, where h is the water depth, g is the gravitational acceleration, (u, v) are the velocities in the x and y directions, z(x, y) is the topography elevation function and s_fx, s_fy are the friction slopes [33], which are estimated with the friction coefficient η. the two-dimensional shallow water equations have no analytical solutions; therefore, many numerical approximations are used. lisflood-fp is a simplified approximation of the shallow water equations which reduces the computational cost by ignoring the convective acceleration term (the second and third terms of the two equations in eq. 2) and utilising an explicit finite-difference numerical scheme. lisflood-fp first calculates the flow between pixels [20]. for simplicity, we use the 1d version of the equations in the x-direction, shown in eq. 3; the 1d result transfers directly to 2d due to the uncoupled nature of those equations [3]. then, for each pixel, its water level h is updated as in eq. 4. to sum up, for each pixel at location (i, j), the solution derived from lisflood-fp can be written in the form shown in eq. 5, where h^t_{i,j} is the water level at location (i, j) at time step t, or in general as h_{t+1} = θ(d, h_t, i_t). however, the numerical solution θ is computationally expensive and embeds assumptions about the hydrological processes in a flood.
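the flavour of the explicit flux-then-update scheme can be sketched in one dimension. the toy below is a deliberately simplified stand-in for eqs. 3–4: the inter-cell flux is driven by the free-surface gradient using the lisflood-fp style effective flow depth, and the water level is then updated mass-conservatively; friction, the inertial memory term and the exact lisflood-fp flux formula are all omitted, so this illustrates the structure of the scheme, not the production model.

```python
# 1D toy: bed elevation z, water depth h, free surface eta = z + h
g, dx, dt, steps = 9.81, 1.0, 0.01, 500
n = 20
z = [1.0 - i / (n - 1) for i in range(n)]        # bed sloping down to the right
h = [0.3 if i < n // 2 else 0.0 for i in range(n)]
total_before = sum(h) * dx

for _ in range(steps):
    eta = [zi + hi for zi, hi in zip(z, h)]
    q = [0.0] * (n - 1)                          # flux across each interior face
    for i in range(n - 1):
        # effective flow depth between cells (lisflood-fp style)
        h_flow = max(0.0, max(eta[i], eta[i + 1]) - max(z[i], z[i + 1]))
        qi = -g * h_flow * dt * (eta[i + 1] - eta[i]) / dx
        # limiter: a face may drain at most half the donor cell per step
        if qi > 0:
            qi = min(qi, 0.5 * h[i] * dx / dt)
        else:
            qi = max(qi, -0.5 * h[i + 1] * dx / dt)
        q[i] = qi
    for i in range(n):                           # mass-conservative update (eq. 4 analogue)
        inflow = q[i - 1] if i > 0 else 0.0
        outflow = q[i] if i < n - 1 else 0.0     # closed boundaries on both ends
        h[i] += dt * (inflow - outflow) / dx

total_after = sum(h) * dx
```

because each face's flux is subtracted from one cell and added to its neighbour, total water volume is conserved exactly up to floating-point error, which is the property the real scheme relies on.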
moreover, the numerical solution θ demands substantial parameter tuning once high-resolution two-dimensional water-level measurements are available, as mentioned in [36]. therefore, we use such a numerical model to generate pairs of inputs and outputs for the surrogate model. we choose the lisflood-fp-based open-source python library landlab [2], since it is a popular simulator in regional two-dimensional flood studies. landlab includes tools and process components that can be used to create hydrological models over a range of temporal and spatial scales. in landlab, the rainfall and friction coefficients are considered to be spatially constant, and evaporation and infiltration are both temporally and spatially constant. the inputs of landlab are a dem and a time series of rainfall intensity; the output is a time series of water levels. we propose that a neural network model can provide an alternative solution for such a complex hydrological dynamic process. neural networks are well known as collections of nonlinear connected units, flexible enough to model the complex nonlinear mechanisms behind the data [19]. moreover, a neural network can easily be implemented on general-purpose graphics processing units (gpus) to boost its speed. in the numerical solution of the shallow water equations shown in subsect. 3.2, the two-dimensional spatial correlation is important for predicting the water level in a flood. therefore, inspired by the capacity of neural networks to extract spatially correlated features, we investigate whether a neural network model can learn the flood model l effectively and efficiently. we propose a small and flexible neural network architecture. in the numerical solution eq. 5, the water level of each pixel at the next time step is only correlated with the surrounding pixels. therefore, we use as input a 3 × 3 sliding window on the dem, with the corresponding water levels and the rain, at each time step t.
the output is the corresponding 3 × 3 water level at the next time step t + 1. the pixels at the boundary follow a different hydrological dynamic process; therefore, we pad both the water level and the dem with zero values and expect the neural network model to learn the different hydrological dynamics at the boundaries. one advantage of our proposed architecture is that the neural network is not restricted by the input size of the terrain for either training or testing; therefore, it is a general model that can be used on any terrain size. figure 1 illustrates the proposed architecture on a region of size 6 × 6. in this section, we empirically evaluate the performance of the proposed model. in subsect. 4.1, we describe how to generate synthetic dems. subsect. 4.2 presents the experimental setup to test our method on synthetic dems as a micro-evaluation. subsect. 4.3 presents the experimental setup for the case of the onkaparinga catchment. subsect. 4.4 presents the details of our proposed neural network. subsect. 4.5 describes the evaluation metrics for our proposed model. in order to generate synthetic dems, we modify alexandre delahaye's work 1. we arbitrarily set the size of the dems to 64 × 64 and their resolution to 30 metres. we generate three types of dems in our data set that resemble real-world terrain surfaces, as shown in fig. 2a: a river in a plain, a river with a mountain on one side and a plain on the other, and a river in a valley with mountains on both sides. we evaluate the performance in two cases. in case 1, the network is trained and tested with one dem; this dem has a river in a valley with mountains on both sides, as shown in fig. 2a, right. in case 2, the network is trained and tested with 200 different synthetic dems. the data set is generated with landlab. for all the flood simulations in landlab, the boundary condition is set to be closed on all four sides; this means that rainfall is the only source of water in the whole region.
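the sliding-window input construction can be sketched as follows. a stride of 3 over the zero-padded grid (i.e. non-overlapping tiles) is assumed here, since the text does not state the stride, and the function names are hypothetical.

```python
def pad_with_zeros(grid, pad=1):
    """Zero-pad a 2D list on all four sides (the boundary handling from the text)."""
    w = len(grid[0])
    zero_row = [0.0] * (w + 2 * pad)
    return ([zero_row[:] for _ in range(pad)]
            + [[0.0] * pad + row + [0.0] * pad for row in grid]
            + [zero_row[:] for _ in range(pad)])

def extract_tiles(grid, size=3):
    """Split a padded grid into non-overlapping size x size tiles."""
    tiles = []
    for r in range(0, len(grid), size):
        for c in range(0, len(grid[0]), size):
            tiles.append([row[c:c + size] for row in grid[r:r + size]])
    return tiles

# a 64 x 64 DEM and water level, as in the synthetic data set
W = 64
dem = [[float(r + c) for c in range(W)] for r in range(W)]
water = [[0.1] * W for _ in range(W)]

dem_tiles = extract_tiles(pad_with_zeros(dem))      # 66 x 66 -> 22 x 22 = 484 tiles
water_tiles = extract_tiles(pad_with_zeros(water))
# each training sample pairs a 3x3 DEM tile, a 3x3 water tile and the scalar rain
rain = 20.0
samples = [(d, w, rain) for d, w in zip(dem_tiles, water_tiles)]
```

because the windowing is applied independently of the grid dimensions, the same construction works unchanged for a 64 × 64 or a 128 × 128 dem, which is the size-independence property the architecture exploits.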
the roughness coefficient is set to 0.003. we control the initial process, rainfall intensity and duration time for each sample. the different initial processes ensure different initial water levels over the whole region. after the initial process, the system runs for 40 h with no rain for stabilisation. we then run the simulation for 12 h and record the water levels every 10 min; therefore, for one sample, we record a total of 72 time steps of water levels. table 1 summarises the parameters for generating samples in both case 1 and case 2. the onkaparinga catchment, located at the lower onkaparinga river, south of adelaide, south australia, has experienced many notable floods, especially in 1935 and 1951; much research and many reports have covered this region [15]. we obtained two dems, of sizes 64 × 64 and 128 × 128, from the australian intergovernmental committee on surveying and mapping's elevation information system 2. figure 2b shows the dem of the lower onkaparinga river. we test the neural network model under three further cases. in case 3, we train and test on the 64 × 64 onkaparinga river dem. in case 4, we test the 64 × 64 onkaparinga river dem directly with the model trained in case 2. in case 5, we test the 128 × 128 onkaparinga river dem directly with the model trained in case 2. we generate the data sets for both dems from landlab; the initial process, rainfall intensity and rain duration are controlled as in case 1. the architecture of the neural network model is visualised in fig. 1. it first upsamples the rain input to 3 × 3 and concatenates it with the 3 × 3 water level input. this is followed by several batch normalisation and convolutional layers. the activation functions are relu and all convolutional layers use same-size padding. the total number of parameters of the neural network is 169. the model is trained with adam with a learning rate of 10^-4, and the batch size for training is 8.
the data set is split with ratio 8:1:1 into training, validation and testing. the number of training epochs is 10 for cases 1 and 3, and 5 for case 2. we train the neural network model on a machine with a 3 ghz amd ryzen 7 1700 8-core processor, 64 gb of ddr4 memory and an nvidia gtx 1080ti gpu card with 3584 cuda cores and 11 gb of memory; the operating system is ubuntu 18.04. in order to evaluate the performance of our neural network model, we use global measurement metrics for the overall flood in the whole region; these metrics include the global mean squared error and the global mean absolute percentage error (mape). case 5 tests the scalability of our model on a dem of a different size. in table 2b, for global performance, the mape of case 5 is around 50% less than in both case 3 and case 4, and for local performance, the mape of case 5 is 34.45%. similarly, without retraining the existing model, the trained neural network from case 2 can be applied directly to a dem of a different size with good global performance. we present the time needed for the flood simulation of one sample in landlab and in our neural network model (without the training time) in table 3. the average time of the neural network model for a 64 × 64 dem is around 1.6 s, while it takes 47 s in landlab. furthermore, for a 128 × 128 dem, landlab takes around 110 times longer than the neural network model. though training the neural network model is time consuming, the trained network can be reused, without further training or tuning, on terrains of different sizes and spatial characteristics; it remains effective and efficient (fig. 4). we propose a neural network model, trained with pairs of inputs and outputs of an off-the-shelf numerical flood simulator, as an efficient and effective general surrogate model for the simulator. the trained network yields a mean absolute percentage error of around 20%, and it is at least 30 times faster than the numerical simulator used to train it.
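the evaluation metrics can be sketched as follows. the exact formulas are not reproduced in the text, so the grid-averaged mse and mape below use the standard definitions, with mape computed only over pixels with non-negligible true depth; both conventions are assumptions.

```python
def global_mse(pred, true):
    """Mean squared error averaged over every pixel of the region."""
    n = len(pred) * len(pred[0])
    return sum((p - t) ** 2
               for rp, rt in zip(pred, true)
               for p, t in zip(rp, rt)) / n

def global_mape(pred, true, eps=1e-8):
    """Mean absolute percentage error (in %) over pixels with non-negligible depth."""
    errs = [abs(p - t) / t
            for rp, rt in zip(pred, true)
            for p, t in zip(rp, rt) if t > eps]
    return 100.0 * sum(errs) / len(errs)

# tiny worked example: the zero-depth pixel is excluded from the MAPE
truth = [[1.0, 2.0], [4.0, 0.0]]
estimate = [[1.1, 1.8], [4.4, 0.0]]

mse = global_mse(estimate, truth)      # (0.01 + 0.04 + 0.16 + 0) / 4 = 0.0525
mape = global_mape(estimate, truth)    # each wet pixel is 10% off -> 10.0
```

guarding the mape denominator matters in this application, since large parts of a dem are dry and a literal percentage error there would be undefined.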
moreover, it is able to simulate floods on terrains of different sizes and spatial characteristics not directly represented in the training data.

references:
- hec-ras river analysis system, user's manual, version 2
- the landlab v1.0 overlandflow component: a python tool for computing shallow-water flow across watersheds
- improving the stability of a simple formulation of the shallow water equations for 2-d flood modeling
- a review of surrogate models and their application to groundwater modeling
- learning data-driven discretizations for partial differential equations
- a simple raster-based model for flood inundation simulation
- a simple inertial formulation of the shallow water equations for efficient two-dimensional flood inundation modelling
- rainfall-runoff modelling: the primer
- hec-ras river analysis system hydraulic user's manual
- numerical solution of the two-dimensional shallow water equations by the application of relaxation methods
- distributed physics informed neural network for data-efficient solution to partial differential equations
- integrating gis and hec-ras to model assiut plateau runoff
- flood hydrology processes and their variabilities
- application of surrogate artificial intelligent models for real-time flood routing
- extreme flood estimation - guesses at big floods? water down under 94: surface hydrology and water resources papers
- the data-driven approach as an operational real-time flood forecasting model
- analysis of flood causes and associated socio-economic damages in the hindukush region
- deep fluids: a generative network for parameterized fluid simulations
- fully convolutional networks for semantic segmentation
- optimisation of the two-dimensional hydraulic model lisflood-fp for cpu architecture
- neural network modeling of hydrological systems: a review of implementation techniques
- physics informed data driven model for flood prediction: application of deep learning in prediction of urban flood development
- application of 2d numerical simulation for the analysis of the february 2014 bolivian amazonia flood: application of the new hec-ras version 5
- physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations
- storm water management model - user's manual v. 5.0, us environmental protection agency
- hydrologic engineering center hydrologic modeling system, hec-hms: interior flood modeling
- decentralized flood forecasting using deep neural networks
- flood inundation modelling: a review of methods, recent advances and uncertainty analysis
- accelerating eulerian fluid simulation with convolutional networks
- comparison of the arma, arima, and the autoregressive artificial neural network models in forecasting the monthly inflow of dez dam reservoir
- lisflood: a gis-based distributed model for river basin scale water balance and flood simulation
- real-time water-level forecasting using dilated causal convolutional neural networks
- latent space physics: towards learning the temporal evolution of fluid flow
- in-situ water level measurement using nir-imaging video camera

acknowledgment. this work is supported by the national university of singapore institute for data science project watcha: water challenges analytics.
abhishek saha is supported by national research foundation grant number nrf2017vsg-at3dcm001-021.

key: cord-004332-99lxmq4u
authors: zhao, shi; musa, salihu s.; fu, hao; he, daihai; qin, jing
title: large-scale lassa fever outbreaks in nigeria: quantifying the association between disease reproduction number and local rainfall
date: 2020-01-10
doi: 10.1017/s0950268819002267

lassa fever (lf) is increasingly recognised as an important rodent-borne viral haemorrhagic fever presenting a severe public health threat to sub-saharan west africa. in 2017-18, lf caused an unprecedented epidemic in nigeria, and the situation was worsening in 2018-19. this work aims to study the epidemiological features of the epidemics in different nigerian regions and to quantify the association between the reproduction number (r) and state rainfall. we quantify the infectivity of lf by the reproduction numbers estimated from four different growth models: the richards, three-parameter logistic, gompertz and weibull growth models. lf surveillance data are used to fit the growth models and estimate the rs and epidemic turning points (τ) in different regions at different time periods. cochran's q test is further applied to test the spatial heterogeneity of the lf epidemics. a linear random-effects regression model is adopted to quantify the association between r and state rainfall with various lag terms. our estimated r for 2017-18 (1.33, 95% ci 1.29-1.37) was significantly higher than those for 2016-17 (1.23, 95% ci 1.22-1.24) and 2018-19 (ranging from 1.08 to 1.36). we report spatial heterogeneity in the rs for epidemics in different nigerian regions. we find that a one-unit (mm) increase in average monthly rainfall over the past 7 months could cause a 0.62% (95% ci 0.20%-1.05%) rise in r. there is significant spatial heterogeneity in the lf epidemics in different nigerian regions.
we report clear evidence of rainfall impacts on lf epidemics in nigeria and quantify the impact. lassa fever (lf), caused by the lassa virus (lasv), is increasingly recognised as an important rodent-borne viral haemorrhagic fever presenting a severe public health threat to some of the communities in sub-saharan west africa [1]. discovered in 1969 [2], lf is endemic to much of rural nigeria and to regions in the mano river union [3]. lasv transmits from human to human, as well as via the zoonotic cycle [1, 3, 4]. lf has a high case fatality rate, ranging from 1% in the community to over 60% in hospital settings [1, 4, 5]. the common reservoir of lasv is mastomys natalensis, one of the most widespread rodent species in sub-saharan africa [1, 3], which exhibits population dynamics sensitive to the water level, e.g. rainfall and flooded agricultural activities [6, 7]. previous studies have recognised the ecological association between rodent population levels and rainfall [8-10]. lf epidemics typically start in november and last until may of the following year, with the majority of cases occurring in the first quarter of the following year, in addition to sporadic cases reported throughout the year. the 2017-18 epidemic in nigeria was an unprecedented lf epidemic in the country's history [11], which resulted in 400 confirmed cases, including 97 deaths, between january and march 2018 [12]. the most recent epidemic in nigeria has already caused 526 confirmed cases from january to march of 2019, including 121 deaths [12]. the five states of edo, ondo, ebonyi, bauchi and plateau are the only states that have been among the top 10 hardest-hit states in terms of the number of lf cases in both the 2018 (85.5% of total national cases) and 2019 (85.7% of total national cases) epidemics. while there have been discussions about the association between rainfall level and lf incidence rate [13, 14], this association has not yet been demonstrated and quantified.
this work aims to study the epidemiological features of the epidemics in different nigerian regions between january 2016 and march 2019. we estimate lf infectivity in terms of the reproduction number (r) and quantify the association between r and state rainfall. we explore the spatial heterogeneity of the lf epidemics and summarise the overall findings with model-averaged estimates. weekly lf surveillance data are obtained from the nigeria centre for disease control (ncdc); the data are publicly available from the weekly situation reports released by the ncdc [12]. laboratory-confirmed case time series are used for the analysis. we examine the major epidemics that occurred between january 2016 and march 2019 across the whole country and in the aforementioned five states that were among the top 10 hardest-hit states in both the 2018 and 2019 epidemics, i.e. edo, ondo, ebonyi, bauchi and plateau. the state rainfall records of each state were collected as monthly averages from the historical records of the world weather online website [15]. figure 1(a) and (b) shows the rainfall time series of the five states and the weekly reported lf cases across the whole of nigeria. to test the credibility of the coincidence between rainfall and the lf epidemic, we use a simple statistical regression model, 'case ∼ exp(α × rainfall) + θ', where α and θ are free parameters to be estimated. the 'rainfall' term in the model represents the state rainfall time series with a lag of 4-9 months; this lag corresponds to the time interval between the rainfall and the development of the rodent population [7]. we check the least-squares fitting outcomes of these regression models and select the model with the lagged rainfall that yields the highest goodness of fit. the significance of this fit serves as the starting point for quantifying the association between state rainfall and the lf epidemic. four different nonlinear growth models are adopted to pinpoint the epidemiological features of each epidemic.
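the regression 'case ∼ exp(α × rainfall) + θ' can be fitted by profiling: for each candidate α on a grid, the amplitude and intercept enter the model linearly and have a closed-form least-squares solution via the normal equations. the synthetic rainfall series, the 7-month lag and the parameter values below are illustrative, not the paper's data.

```python
import math

# synthetic monthly rainfall (mm) with a seasonal cycle, and cases generated
# from the model with a known alpha and a 7-month lag (noise-free for clarity)
months = list(range(60))
rain = [80.0 + 60.0 * math.sin(2.0 * math.pi * m / 12.0) for m in months]
lag, true_alpha = 7, 0.01
lagged = rain[:-lag]                 # rainfall lagged 7 months behind the cases
cases = [5.0 * math.exp(true_alpha * r) + 20.0 for r in lagged]

def sse_for_alpha(alpha, x, y):
    """Profile out (A, theta) in y ~ A*exp(alpha*x) + theta via normal equations."""
    e = [math.exp(alpha * xi) for xi in x]
    n = len(x)
    se, see = sum(e), sum(v * v for v in e)
    sy, sey = sum(y), sum(v * yi for v, yi in zip(e, y))
    det = n * see - se * se
    A = (n * sey - se * sy) / det
    theta = (sy * see - se * sey) / det
    return sum((yi - (A * ei + theta)) ** 2 for yi, ei in zip(y, e))

grid = [i / 1000.0 for i in range(1, 31)]        # candidate alpha in 0.001 .. 0.030
best_alpha = min(grid, key=lambda a: sse_for_alpha(a, lagged, cases))
```

repeating the search over lags of 4 to 9 months and keeping the lag with the smallest residual error mirrors the lag-selection step described in the text.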
the models are the richards, three-parameter logistic, gompertz and weibull growth models. these simple structured models are widely used to study s-shaped cumulative growth processes, e.g. the curve of a single-wave epidemic, and have been extensively studied in previous work [16, 17]. these models consider cumulative cases with saturation in the growth rate to reflect the progression of an epidemic due to the reduction in the susceptible pool or a decrease in exposure to infectious rodent populations. the extrinsic growth rate increases to a maximum (i.e. saturation) before steadily declining to zero. the modelling and fitting of the epidemic curve via the growth models are illustrated in figure 2. we fit all models to the weekly reported lf cases in different regions and evaluate the fitting performance by the akaike information criterion (aic). we adopt the standard nonlinear least squares (nls) approach for model fitting and parameter estimation, following [16, 18]. a p-value <0.05 is regarded as statistically significant and 95% confidence intervals (cis) are estimated for all unknown parameters. as we use the cumulative number of lf cases to conduct the model fitting, some fitting issues might occur, as per the studies in king et al. [19], due to the non-decreasing nature of the cumulative summation time series. the models are selected by comparing their aic to that of a baseline (or null) model. only the models with an aic lower than the aic of the baseline model are considered for further analysis. importantly, the baseline model adopted is expected to capture the trends of the time series. since the epidemic curves of an infectious disease commonly exhibit autocorrelation [20], we use autoregressive (ar) models of order 2, i.e. ar(2), as the baseline models for growth model selection.
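A minimal sketch of the NLS fitting and AIC comparison for two of the four growth curves (logistic and Gompertz; the parameterizations and helper names are illustrative assumptions — the paper also fitted Richards and Weibull curves, in R):

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, K, gamma, tau):
    """Three-parameter logistic cumulative curve."""
    return K / (1.0 + np.exp(-gamma * (t - tau)))

def gompertz(t, K, gamma, tau):
    """Gompertz cumulative curve."""
    return K * np.exp(-np.exp(-gamma * (t - tau)))

def aic_nls(y, yhat, n_params):
    """Least-squares AIC: n*ln(RSS/n) + 2k."""
    n = len(y)
    rss = max(float(np.sum((y - yhat) ** 2)), 1e-12)
    return n * np.log(rss / n) + 2 * n_params

def fit_growth(t, cum_cases):
    """NLS-fit the candidate curves; return {name: (params, AIC)}."""
    out = {}
    for name, f in [("logistic", logistic), ("gompertz", gompertz)]:
        p0 = (cum_cases.max() * 1.2, 0.2, float(t.mean()))
        popt, _ = curve_fit(f, t, cum_cases, p0=p0, maxfev=20000)
        out[name] = (popt, aic_nls(cum_cases, f(t, *popt), 3))
    return out
```

On noisy data simulated from a logistic curve, the logistic fit attains the lower AIC, as the selection rule in the text requires.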
we also adopt the coefficient of determination (r-squared) and the coefficient of partial determination (partial r-squared) to evaluate goodness-of-fit and fitting improvement, respectively. for the calculation of the partial r-squared, the ar(2) model is used as the baseline model. the growth models with a positive partial r-squared (indicating fitting improvement) against the baseline ar(2) model are selected for further analyses. after model selection, we estimate the epidemiological features (parameters) of the turning point (τ) and the reproduction number (r) via the selected models. the turning point is defined as the time point of a sign change in the rate of case accumulation, i.e. from increasing to decreasing or vice versa [16, 18]. the reproduction number, r, is the average number of secondary human cases caused by one primary human case via the 'human-to-rodent-to-human' transmission path [18, 21]. when the population is totally (i.e. 100%) susceptible, r equates to the basic reproduction number, commonly denoted r0 [21, 22]. the reproduction number (r) is given in eqn (1), r = 1/m(−γ), (1) where γ is the intrinsic per capita growth rate from the nonlinear growth models and κ is the serial interval of the lasv infection. the serial interval (i.e. the generation interval) is the time between the infections of two successive cases in a chain of transmission [21, 23-25]. the function h(·) represents the probability distribution of κ; the function m(·) is related to the laplace transform of h(·), and specifically m(·) is the moment generating function (mgf) of that probability distribution [21]. according to previous work [26], we assume h(κ) to follow a gamma distribution with a mean of 7.8 days and a standard deviation (sd) of 10.7 days. therefore, r can be estimated from the values of γ of the fitted models [18, 21, 27, 28]. the state-level r values were estimated from the γ values of the fitted epidemic growth curves of each state.
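With the gamma-distributed serial interval (mean 7.8 d, SD 10.7 d), eqn (1) has a closed form, since the gamma MGF is M(z) = (1 − sz)^(−k). A sketch, assuming the Wallinga–Lipsitch-type relation R = 1/M(−γ) (the function name is hypothetical):

```python
def reproduction_number(gamma_rate, mean_si=7.8, sd_si=10.7):
    """R = 1/M(-gamma), with M the MGF of a gamma-distributed serial interval
    (mean 7.8 d, SD 10.7 d as assumed in the text).  For a gamma distribution
    with shape k and scale s, M(z) = (1 - s*z)**(-k), so
    R = (1 + s*gamma_rate)**k."""
    k = (mean_si / sd_si) ** 2      # shape
    s = sd_si ** 2 / mean_si        # scale
    return (1.0 + s * gamma_rate) ** k
```

The estimate increases monotonically with the fitted growth rate γ and equals 1 when γ = 0, as required.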
similarly, the national r values are estimated from the γ values of the epidemic growth curves fitted to the national case-count time series in the different epidemic periods. we then summarise the κ and r estimates via aic-weighted model averaging. the aic weight, w i, of the i-th selected model (with positive partial r-squared) is defined in eqn (2), w i = exp[−(aic i − aic min)/2] / Σ j exp[−(aic j − aic min)/2], (2) where aic i is the aic of the i-th selected model and aic min is the lowest aic among all selected models. the model-averaged estimator is the weighted average of the estimates of each selected model, which has been well studied in previous work [16, 29]. for the aic-based model average of r, there could be situations in which no growth model is selected according to the partial r-squared. in such cases, instead of the model average, we report the range of r estimated from all growth models.

testing the spatial heterogeneity of the lf epidemics

after finding the model-averaged estimates, we apply cochran's q test to examine the spatial heterogeneity of the epidemics in different regions over the same period of time [30]. for instance, we treat the model-averaged r estimates as the univariate meta-analytical response across different nigerian regions (states) and further check the heterogeneity by estimating the significance level of the q statistic. a p-value <0.05 is regarded as statistically significant. similar to the approach of a previous study [31], the association between the state rainfall level and lasv transmissibility is modelled by a linear mixed-effects regression (lmer) model in eqn (3), e(r j) = exp(c j + β × ⟨rainfall j,t⟩), (3) where e(·) represents the expectation function and j is the region index corresponding to the different regions (states). the term c j is the intercept of the j-th region to be estimated; it varies across regions, serving as the baseline scale of transmissibility in the different states.
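The AIC weighting of eqn (2) and the model-averaged estimator can be sketched as follows (function names are hypothetical):

```python
import numpy as np

def aic_weights(aics):
    """Akaike weights of eqn (2): w_i proportional to exp(-(AIC_i - AIC_min)/2)."""
    d = np.asarray(aics, float) - np.min(aics)
    w = np.exp(-0.5 * d)
    return w / w.sum()

def model_average(estimates, aics):
    """AIC-weighted average of per-model estimates (e.g. of R or tau)."""
    return float(np.dot(aic_weights(aics), estimates))
```

Two models with equal AIC receive equal weight; a model 2 AIC units worse is down-weighted by a factor of e^(−1).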
the term t denotes the cumulative lag in the model and ⟨rainfall j,t⟩ represents the average monthly rainfall of the previous t months before the turning point, τ, of the j-th region. the lag term, t, is considered from 4 to 9 months, a range explained by the time interval between the peak of the rainfall and the peak of the rodent population [7]. as illustrated in figure 3, the reproduction numbers, r j, are estimated for the different epidemics from the selected growth models. the regression coefficient, β, is to be estimated; the term (e^β − 1) × 100% is then the percentage changing rate of r, which can be interpreted as the percentage change in transmissibility due to a one-unit (mm) increase in the average monthly rainfall level over the past t months (t = 7 in our main results). the framework of the regression is based on an exponential form of the predictor to model the expectation of transmissibility (e.g. r); this framework is inspired by previous work [32-35]. to quantify the impacts of state rainfall, we calculate the percentage changing rate with different cumulative lags (t) from 4 to 9 months and estimate their significance levels. only the lag terms (t) with significant estimates are presented in this work. we summarise the analysis procedure in a flow diagram in figure 3. all analyses are conducted using r (version 3.4.3 [36]) and the r function 'nls' is employed for the nls estimation of the model parameters. the rainfall time series of the five states and the weekly reported lf cases of the whole of nigeria are shown in figure 1(a) and (b). we observe that the major lf epidemics usually occur in nigeria between november and may of the following year. the cumulative lagged effects are observed by matching the peak timing of the rainfall and epidemic curves. in figure 1(c), we shift the rainfall time series of the five states by +6 months to match the trends of the national lf epidemic curve in nigeria.
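A fixed-effects least-squares approximation of eqn (3) can be sketched with NumPy alone (the paper fitted a mixed-effects model in R; treating the per-state intercepts c_j as fixed dummies, and the function name itself, are simplifying assumptions made here for illustration):

```python
import numpy as np

def fit_rain_effect(regions, mean_rainfall, R):
    """Least-squares fit of ln E(R_j) = c_j + beta * <rainfall_{j,t}>.
    Returns (beta, percentage change in R per 1 mm increase in the
    mean monthly rainfall), i.e. beta and (e**beta - 1) * 100%."""
    regions = np.asarray(regions)
    labels = np.unique(regions)
    X = np.column_stack(
        [(regions == g).astype(float) for g in labels]   # intercept dummies c_j
        + [np.asarray(mean_rainfall, float)]             # shared rainfall slope
    )
    coef, *_ = np.linalg.lstsq(X, np.log(R), rcond=None)
    beta = float(coef[-1])
    return beta, (np.exp(beta) - 1.0) * 100.0
```

On exact synthetic data with two regions and a known slope, the routine recovers β and the implied percentage changing rate.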
in figure 1(d) and (e), we find that the fit has a p-value <0.0001, which indicates a statistically significant association between the lf cases and the shifted rainfall curve. we fit four different growth models to the lf confirmed cases and estimate the model-averaged reproduction number (r) after model selection. we show the growth model fitting results in figure 4 and the model estimation and selection results in table 1. most of the models have a positive partial r-squared against the baseline ar(2) model. most of the regions exhibit an epidemic turning point (τ) ranging from epidemiological week (ew) 4 to 10, i.e. from the end of january to mid-march, of each year. out of the four epidemics in the states of bauchi and plateau, three have an estimated τ after ew 10 (table 1). many previous studies adopted the instantaneous reproduction number, commonly denoted r t, which can be estimated by a renewal equation, to quantify the transmissibility of infectious diseases [21, 23, 24, 37, 38]. the factors that affect the changing dynamics of r t include (i) the depletion of the susceptible population [32] or a decrease in exposure to infectious sources, (ii) changes, usually improvements, in unmeasurable disease control efforts, e.g. contact tracing, travel restrictions, school closures, etc. [39-42], and in local awareness of the epidemic [33], and (iii) the natural features of the pathogen, e.g. its original infectivity and other interepidemic factors [32, 33, 35]. in this work, we choose to use the average reproduction number (r), rather than r t, as the measurement of lasv transmissibility. the estimated r summarises lasv transmissibility over the whole period of an epidemic. the reasons why we prefer r to r t are as follows. first, the temporal changes of the susceptible population or the decrease in exposure to infectious sources are removed from the r estimates due to the nature of the growth models.
second, since the changes of the susceptible population and/or of disease awareness or control measures and the effect of the rainfall cannot be disentangled in the time-varying reproduction number r t, the average reproduction number (r) is a better proxy to explore the association between lf infectivity and rainfall. with respect to point (iii) and other heterogeneities of epidemics in different regions, we account for this issue by including 'region' dummy variables in the lmer model in eqn (3). these dummy variables serve as random effects to offset the regional heterogeneities of the lf epidemics. therefore, we can quantify a general effect, i.e. the β in eqn (3), of the lagged rainfall on the lasv r estimate across the different nigerian places. the association between state rainfall and lasv transmissibility (r) is modelled and quantified by the lmer model. in figure 5, we find a positive relation between rainfall and r. the estimated changing rate in r under a one-unit (mm) increase in the average monthly rainfall is summarised for the different cumulative lag terms from 4 to 9 months (the t in eqn (3)). the range of the rainfall lag from 4 to 9 months is explained by the time interval between the peak of the rainfall and the peak of the rodent population [7]. the estimates of the rainfall-associated changing rate in r with the different lag terms are summarised in table 2. we report the most significant (i.e. with the lowest p-value) regression estimates, which appear with a cumulative lag of 7 months. the habitats of the lasv reservoir, i.e. rodents, include irrigated and flooded agricultural lands that are commonly found in and around african villages [6]. the 7-month lag also coincides with the period between the dry and rainy seasons [43]. the association between rodent population dynamics and rainfall levels has been demonstrated in a number of previous studies [6-10].
hence, we consider the 7-month lagged estimation as our main result. namely, a one-unit (mm) increase in the average monthly rainfall over the past 7 months is likely to cause a 0.62% (95% ci 0.20%-1.05%) rise in the r of an lf epidemic. we also remark that this 'one-unit (mm) increase in the average monthly rainfall over the past 7 months' is equivalent to a '7-unit (mm) increase in the total rainfall over the past 7 months'. the present finding of the impact of lagged rainfall on lf epidemics suggests that knowledge of such weather-driven epidemics could be gained by referring to past rainfall levels. for instance, if a relatively high amount of rainfall occurs, local measures, such as rodent population control, could be effective in reducing lf risk. this speculation could also be verified by examining the rodent population data of the nigerian regions included in this work. the findings in this work are of public health interest and are helpful for policymakers in lf prevention and control. on the one hand, our findings suggest the existence of an association between rainfall and lasv transmissibility, which could be mediated by the population dynamics of rodents [13]. on the other hand, the positive relation between rainfall and r indicates that rainfall, particularly in states with a high lf risk, can be translated into a warning signal for lf epidemics. the modelling framework in this study could easily be extended to other infectious diseases. our work has limitations. as in some other african countries, the weather data are available only from a limited number of observation stations and are thus not sufficient to capture spatial variability more accurately. in this work, instead of exploring the spatial differences in the associations between rainfall and lf epidemics, we relaxed the setting and studied a general relationship. we quantified the general rainfall-associated changing rate of r in nigeria.
for the transmissibility estimation, our growth modelling framework provides estimates of r, but not of the basic reproduction number, commonly denoted r0. however, according to theoretical epidemiology [22, 27, 35, 44, 45], r0 can be determined by r0 = r/s, where s denotes the population susceptibility. although s is not involved in our modelling framework, information on s could be acquired from local serological surveillance. the existing literature reports 21.3% seroprevalence among nigerian humans, measured by the enzyme-linked immunosorbent assay (elisa) [46]. hence, r0 can be calculated as 1.63 by using s = 1 − 21.3% = 0.787 and r = 1.28, the average of the 2016-18 lf epidemics. this was a data-driven modelling study, and we quantified the effect of rainfall as a weather-driven force of r on the basis of previous ecological and epidemiological evidence [7, 43]. since the transmission of lasv relies mainly on the rodent population, factors including seasonality, agricultural land use and subtropical or tropical forest coverage that could affect rodent ecology should be relevant and helpful in the analysis. however, owing to the availability of data, the agricultural land-use factors, e.g. pastureland, irrigated land, flooded agricultural land usage and forest coverage, were absent from our analysis; they should be studied in the future if such data become available. the lf epidemic reproduction numbers (r) of the whole of nigeria in 2017-18 (r = 1.33 with 95% ci 1.29-1.37) and 2018-19 (r ranging from 1.08 to 1.36) are significantly higher than in 2016-17 (r = 1.23 with 95% ci 1.22-1.24). there is significant spatial heterogeneity in the lf epidemics of the different nigerian regions. we report clear evidence of rainfall impacts on lf epidemics in nigeria and quantify this impact: a one-unit (mm) increase in the average monthly rainfall over the past 7 months could cause a 0.62% (95% ci 0.20%-1.05%) rise in r.
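The back-calculation of R0 quoted above is simple arithmetic; a sketch (the function name is hypothetical, the numbers are those quoted in the text):

```python
def basic_reproduction_number(R, seroprevalence):
    """R0 = R / S, with S = 1 - seroprevalence the susceptible fraction."""
    S = 1.0 - seroprevalence
    return R / S

# values quoted in the text: R = 1.28 (2016-18 average), 21.3% seroprevalence
r0 = basic_reproduction_number(1.28, 0.213)
```

With these inputs, r0 ≈ 1.63, matching the value reported in the text.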
the state rainfall information has the potential to be utilised as a warning signal for lf epidemics.

data. all data used for analysis are freely available via online public domains [12, 15].

references
[1] understanding the cryptic nature of lassa fever in west africa
[2] lassa fever: epidemiological aspects of the 1970 epidemic
[3] lassa fever: epidemiology, clinical features, and social consequences
[4] the lassa fever fact sheet, the world health organization (who), 2019
[5] lassa fever in post-conflict sierra leone
[6] population dynamics of the multimammate rat mastomys huberti in an annually flooded agricultural region of central mali
[7] stochastic seasonality and nonlinear density-dependent factors regulate population size in an african rodent
[8] the use of rainfall patterns in predicting population densities of multimammate rats
[9] the basis of reproductive seasonality in mastomys rats (rodentia: muridae) in tanzania
[10] forecasting rodent outbreaks in africa: an ecological basis for mastomys control in tanzania
[11] nigeria hit by unprecedented lassa fever outbreak
[12] the collection of the lassa fever outbreak situation reports, nigeria centre for disease control
[13] risk maps of lassa fever in west africa
[14] mapping the zoonotic niche of lassa fever in africa
[15] the historical weather records, the website of world weather online
[16] real-time parameter estimation of zika outbreaks using model averaging
[17] analysis of logistic growth models
[18] intervention measures, turning point, and reproduction number for dengue
[19] avoidable errors in the modelling of outbreaks of emerging pathogens, with special reference to ebola
[20] analysing increasing trends of guillain-barré syndrome (gbs) and dengue cases in hong kong using meteorological data
[21] how generation intervals shape the relationship between growth rates and reproductive numbers
[22] phase shifting of the transmissibility of macrolide-sensitive and -resistant mycoplasma pneumoniae epidemics in hong kong
[23] a new framework and software to estimate time-varying reproduction numbers during epidemics
[24] estimating individual and household reproduction numbers in an emerging epidemic
[25] associations between public awareness, local precipitation, and cholera in yemen in 2017
[26] using modelling to disentangle the relative contributions of zoonotic and anthroponotic transmission: the case of lassa fever
[27] estimating initial epidemic growth rates
[28] simple framework for real-time forecast in a data-limited situation: the zika virus (zikv) outbreaks in brazil from 2015 to 2016 as an example
[29] model selection and multi-model inference, 2nd edn
[30] mortality burden of ambient fine particulate air pollution in six chinese cities: results from the pearl river delta study
[31] the long-term changing dynamics of dengue infectivity in guangdong
[32] ambient ozone and influenza transmissibility in hong kong
[33] cholera epidemic in yemen, 2016-18: an analysis of surveillance data
[34] a comparison study of zika virus outbreaks in french polynesia, colombia and the state of bahia in brazil
[35] modelling the large-scale yellow fever outbreak in luanda, angola, and the impact of vaccination
[36] r: a language and environment for statistical computing
[37] transmission dynamics of the 2009 influenza a (h1n1) pandemic in india: the impact of holiday-related school closure
[38] different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures
[39] strategic decision making about travel during disease outbreaks: a game theoretical approach
[40] mitigation of influenza b epidemic with school closures
[41] a simple model for complex dynamical transitions in epidemics
[42] community-based measures for mitigating the 2009 h1n1 pandemic in china
[43] quantifying the seasonal drivers of transmission for lassa fever in nigeria
[44] mathematical epidemiology
[45] modeling infectious diseases in humans and animals
[46] viral hemorrhagic fever antibodies in nigerian populations

acknowledgements. we appreciate the helpful comments from anonymous reviewers that improved this manuscript.
the funding agencies had no role in the design and conduct of the study; the collection, management, analysis and interpretation of the data; the preparation, review or approval of the manuscript; or the decision to submit the manuscript for publication.

author contributions. sz conceived and carried out the study and drafted the first manuscript. sz and dh discussed the results. all authors revised the manuscript and gave final approval for publication.

conflict of interest. the authors declare that they have no competing interests.

ethical standards. since no personal data were collected, ethical approval and individual consent were not applicable.

key: cord-154170-7pnz98o6 authors: ponciano, josé miguel; ponciano, juan adolfo; gómez, juan pablo; holt, robert d.; blackburn, jason k. title: poverty levels, societal and individual heterogeneities explain the sars-cov-2 pandemic growth in latin america date: 2020-05-22 doc_id: 154170 cord_uid: 7pnz98o6

abstract: latin america is experiencing severe impacts of the sars-cov-2 pandemic, but poverty and weak public health institutions hamper gathering the kind of refined data needed to inform classical seir models of epidemics. we present an alternative approach that draws on advances in statistical ecology and conservation biology to enhance the value of sparse data in projecting and ameliorating epidemics. our approach, leading to what we call a stochastic epidemic gompertz model, can with few parameters flexibly incorporate heterogeneity in transmission within populations and across time. we demonstrate that poverty has a large impact on the course of the pandemic across fourteen latin american countries, and show how our approach provides flexible, time-varying projections of disease risk that can be used to refine public health strategies.
one sentence summary: the growth modality of sars-cov-2 among latin american countries is well explained by poverty differences and by individual and temporal heterogeneities. "…major global crises… demand cooperative global responses that do not leave out the poor. once sars-cov-2 is under control, the world cannot return to business as usual", von braun et al. (1) concluded in a recent editorial commentary in science entitled "the moment to see the poor." as the recent flurry of research on the sars-cov-2 pandemic shows, the languages of mathematics, statistics and computer science are essential instruments for grappling with the uncertain course of the pandemic. joel cohen's (2) remark almost twenty years ago that "mathematics is biology's next microscope, only better" has never been more salient. deterministic epidemiological models of the seir type (susceptible s(t), exposed e(t), infected i(t) and recovered r(t)) have long enabled in-depth exploration of infectious disease processes (3-6), and provide a framework of fundamental principles to manage infectious diseases (7-12), including the sars-cov-2 virus pandemic (10, 13-15). while these models are useful, parameterizing them for a novel viral pandemic with limited diagnostics and systematic data collection approaches can be challenging. we address this need (1) with a complementary, multi-model approach (16) that incorporates social, individual and temporal heterogeneities. approaches employing simple yet biologically sound models with few parameters are particularly needed in regions like latin america, where sound strategies to collect public health data are seldom employed.
here we show that a multi-model, multi-stage modeling approach helps elucidate (i) early epidemic growth in fourteen latin american countries, (ii) the role of poverty in shaping the growth rate of the number of cases, and (iii) the probability that the number of sars-cov-2 cases exceeds any given amount within arbitrarily defined small windows of time, starting from the present. characterizing complex epidemiological processes depends on the adequate formulation of proper probabilistic models of the governing processes. survival, extinction and growth of natural populations, including infectious diseases, are inherently stochastic (17). at its core, the problem of modeling the growth of the accumulation of sars-cov-2 cases is a stochastic population dynamics problem begging for a full characterization of the probabilities of the possible trends. computer-intensive methods for fitting biologically sound stochastic models to the growth of cases permit estimation of such probabilities. these methods notably allow a process-based estimation of the time-varying probability that, within a given window of time, the number of cases will exceed any given number, grounded in the dynamics of the infection process. mathematically, this problem is analogous to the conservation biology goal of estimating extinction risk for endangered populations. the application of stochastic processes to the estimation of extinction risks and of average times until an event of interest happens has enriched conservation biology (18), wildlife management (19, 20) and evolutionary microbial population dynamics (21). in conservation biology, population viability analyses (pva) were transformed by stochastic processes used to predict populations' growth and extinction (17), despite complexities of the target species' life cycle (16).
ideally, a population dynamics model is relatively simple yet retains the essential features of demographic and environmental stochasticity (17, 22) as well as robustness in the face of other sampling complexities (16). these features are particularly important given the paucity of complete data and information. the analysis of sars-cov-2 time series data from countries with scant public health resources is not unlike the study of endangered species time series data: both seek to use whatever information is available to distill the general principles governing data fluctuations in order to then make informed projections and assess "what-if" scenarios. it is now recognized that long-term estimates of persistence probabilities are of little use because of the ever-changing conditions of population growth (18). viable population monitoring (vpm) aims at estimating and updating persistence probabilities over short time horizons using fresh data (18). such stepwise, continuously updated estimates of short-term persistence probabilities are more practical and actionable than long-term projections. we draw on prior work in conservation biology, population dynamics and epidemiological theory to complement the current suite of deterministic epidemiological models, characterize the role of urban poverty in shaping the region's sars-cov-2 epidemics, and develop a methodology to generate short (5-15 days), sequentially updatable, process-based forecasts.

early epidemics: patterns and processes

characterizing the early phase of the epidemic growth profiles in latin america (fig. 1a-d) provides a first step towards understanding the transmission dynamics of sars-cov-2 in the region.
unconstrained growth is often a fair assumption at the outset of an emerging disease, so the growth of the number of cases n(t) over time is properly described by an exponential growth model wherein the per capita rate of change in the total case numbers is the constant intrinsic growth rate r: dn(t)/dt = r n(t). accordingly, the per capita contribution to the growth of the epidemic is unaffected by the total number of cases. early in an epidemic, r has been used to estimate the basic reproduction number r0 using approximate relations (6, 23), but their accuracy is model-dependent. epidemic growth is frequently limited by many factors, including reactive behavior changes or spatially constrained contact structures. however diverse, these factors tend to act as "density-dependent" processes with slower growth patterns (5). chowell et al. (5) note that different compartmental models all lead to an early sub-exponential growth. a key factor is inhomogeneous mixing in contact rates, formalized as a non-linear contact rate function of the type dn(t)/dt = r n(t)^c, with 0 ≤ c ≤ 1 (c = 1 recovering exponential growth). the resulting sub-exponential growth describes epidemiological data reported in the literature for many important epidemics (5). importantly, this model encapsulates, albeit phenomenologically, the outcome of lower-level mechanistic models (5) in which clustering, structuring, other forms of heterogeneity and the mean and variance of the contact distribution are ultimately responsible for the degree of sub-exponentiality. the key observation triggering our research was that the early sub-exponential growth in fourteen latin american countries could be clearly divided into four distinct growth profiles (fig. 1a-d) according to its approach to pure exponential growth. we then hypothesized that such differences in the degree of sub-exponential growth could be explained by differences in poverty, which we expect to modulate the distribution of contacts and possibly other mechanisms (5).
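The contrast between exponential (c = 1) and sub-exponential (c < 1) growth can be sketched by integrating dn/dt = r·n^c numerically (the parameter values and function name are illustrative assumptions, not fitted quantities from the paper):

```python
import numpy as np

def generalized_growth(n0, r, c, t_max, dt=0.01):
    """Euler integration of dn/dt = r * n**c.
    c = 1 gives exponential growth; c < 1 gives sub-exponential growth."""
    steps = int(t_max / dt)
    n = np.empty(steps + 1)
    n[0] = n0
    for i in range(steps):
        n[i + 1] = n[i] + dt * r * n[i] ** c
    return n

exp_growth = generalized_growth(10.0, 0.3, 1.0, 30.0)  # homogeneous mixing
sub_exp = generalized_growth(10.0, 0.3, 0.7, 30.0)     # inhomogeneous mixing
```

After 30 time units the exponential trajectory dwarfs the sub-exponential one, illustrating why the fitted value of c discriminates growth profiles so sharply.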
compartmental, multivariate seir models are an indispensable everyday tool for obtaining conceptual and practical insights. but these models are data-hungry and statistically costly (25) due to their (potentially) large number of parameters. the ratio of data to the number of parameters needing estimation remains a challenge in latin america due to a lack of data-gathering infrastructure, cohesive contact tracing plans, and testing and monitoring capabilities. despite these challenges, we show that in twelve out of these fourteen countries, including some form of heterogeneity (e.g. structuring according to age or poverty, or inhomogeneous mixing of susceptible and infected individuals (24)) improves model fits compared with the classic seir model mostly used to date (table 1). first, we fitted the following seir-type deterministic model variants: a classic seir model with and without non-homogeneous mixing, and a structured-population seir model with and without non-homogeneous mixing, with structuring according to poverty and age class. the two forms of demographic structuring and the non-homogeneous mixing were included to assess the effect of different sources of heterogeneity. for all 14 countries we modeled the start of the epidemic as resulting from imported cases and included the effect of each nation's airport-closing decision (see table 1 and the supplementary material). next, our analyses focused on the development of actionable, theory-grounded univariate models requiring fewer parameters. these models incorporate (i) different degrees of heterogeneity among hosts in pathogen transmission and (ii) variability in the dynamics due to overall poverty levels. the overall time series variability is decomposed into sampling error and two forms of process error: demographic and environmental variability (17, 30, 45).
the early epidemic involves demographic stochasticity

to model the dynamics of initial infection, we used a stochastic pure birth process, a continuous-time, discrete-state markov process used in various epidemiological contexts (22, 26, 27). the type of variability displayed here, demographic stochasticity (17, 20, 22, 26), represents chance variation in infection due to heterogeneities in individual contact rates. this type of process variability looms large at low numbers of infected individuals (17). let n(t) be the random number of accumulated cases in a country at time t and p_n(t) = p(n(t) = n) the probability of observing n cases at time t. we introduced an inhomogeneous contact rate function to obtain a form of the birth rate λ_n that leads to sub-exponential growth early in an epidemic. how to incorporate heterogeneity into a univariate model is detailed next. either analytically or numerically, we calculate p_n(t) and use it to compute the probability that the process exceeds a given threshold n_c within a pre-determined future time interval (fig. s2), as well as first passage times (fpts), defined as t(n_c) = inf{t : n(t) ≥ n_c}. pure birth processes become quasi-deterministic at large population sizes (case counts), but process variability remains important. when case numbers are large, deviations from deterministic predictions can emerge from temporal (or spatial) variation in the transmission rate, known as environmental stochasticity (17, 20, 22) (in addition to observation error (19)). spatial heterogeneities may be determined by socio-economic factors. we build a hierarchical model of the accumulation of the total number of cases, including poverty and heterogeneity in transmission, and jointly fit it to fourteen latin american countries (fig. 1a-d). we then extend this model formulation to include environmental variability and use it to formulate practical risk assessment tools for each country.
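A Gillespie-style simulation sketch of the pure birth process, estimating the probability of exceeding a threshold n_c within a time window (the rate form λ_n = r·n^c, the function names and all parameter values are illustrative assumptions; the paper obtains P_n(t) analytically or numerically rather than by Monte Carlo):

```python
import random

def pure_birth_fpt(n0, r, c, n_crit, t_max, rng):
    """One realisation of a pure birth process with birth rate
    lambda_n = r * n**c (c < 1 gives sub-exponential growth).
    Returns the first passage time to n_crit, or None if n_crit
    is not reached by t_max."""
    n, t = n0, 0.0
    while n < n_crit:
        t += rng.expovariate(r * n ** c)  # exponential waiting time to next case
        if t > t_max:
            return None
        n += 1
    return t

def exceedance_probability(n0, r, c, n_crit, t_max, n_sims=2000, seed=1):
    """Monte Carlo estimate of P(N(t) >= n_crit for some t <= t_max)."""
    rng = random.Random(seed)
    hits = sum(pure_birth_fpt(n0, r, c, n_crit, t_max, rng) is not None
               for _ in range(n_sims))
    return hits / n_sims
```

Widening the time window raises the exceedance probability towards 1, which is exactly the threshold-risk quantity the text describes.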
(20, 28) and reflects what one might call the "cohen principle," which demonstrates that distributions of abundance do not provide a shortcut to understanding the mechanisms that generate those distributions: each requires its own analysis (31). our minimal-assumptions approach has direct, practical consequences: it re-directs the inferential focus towards biologically relevant variance forms of n(t). these are key to accurately estimating population risks (30). second, this framework is amenable to multiple parameterizations of the infection dynamics, for example, by assuming inhomogeneity in p(t). using eq. (6) in eq. (5) (an expression for inhomogeneous mixing during infection), we let i(t) be the number of infected on day t. if γ is the per-day recovery probability, then i(t)^(-b) = (n(t) − n(t)γ)^(-b), so that the average one-step change in the number of cases is a function of n(t) and the inhomogeneity parameter c. higher urban poverty index yields on average (fig. 2a) values of c closer to 1 (i.e., higher homogeneity in contact rates), which in turn implies that the accumulation of cases is closer to exponential growth (fig. 2a). indeed, in poorer countries a higher proportion of the population lives in poverty, which implies greater social homogeneity in contact rates. having demonstrated the significance of the urban-poverty covariate, we repeated the estimation without venezuela because it was the only country for which we had difficulties cross-checking the data from multiple sources. transparency in data reporting is paramount for scientific inference. this time, besides the random poverty-driven effect, we postulated that the growth of the epidemic was dominated by environmental stochasticity. in this new model, eq. (2) in the natural-logarithm scale is the mean of a markovian transition probability distribution. we called this model the stochastic epidemic gompertz (seg) model. using the seg model, we obtained a tighter relationship (fig. 2b) between poverty and the inhomogeneity parameter, despite the added layer of randomness.
just as before, higher urban poverty index yields on average (thick black line in fig. 2b) values of c closer to 1 (i.e., higher homogeneity in contact rates). because the seg is a model with environmental stochasticity, it can accommodate a large mean number of contacts per individual while keeping the variance of the "offspring" distribution null (i.e., no demographic stochasticity, only environmental noise). in network-theory models, this would amount to specifying a distribution of the number of contacts that has a large mean but a very small (if not null) variance. our model construction process focuses on the specification of the nature of the variability (see supplementary material), and hence can readily accommodate many other mean-to-variance relationships besides the one implied by the seg model. the point is that the contact and infection processes (32), not the distributional assumptions, take center stage, as in (28). the kalman filter (kf) applied to our poverty fit yielded joint, time-dependent effective reproduction number (r_t) predictions for the thirteen countries plotted in figure 2b (see fig. 3 and supplementary material for the r_t approximation). in all cases, r_t declines but remains above one. notably, poorer countries tend to suffer higher effective reproduction numbers. to illustrate the conceptual and practical benefits of complementing seir modeling with our seg models, we conducted two numerical experiments. first, we compared the predictive qualities of our seg modeling approach with the deterministic predictions of the best seir model variant fit in each country (table 1). in each case the projections consisted of the deterministic solution of the best-fitting ode model, shown for four countries (fig. 4). similar results for other countries are in the supplementary material. for our seg model, computing these projections amounted to simulating 50 000 trajectories of 16 days using its maximum-likelihood estimates.
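this kind of projection exercise can be sketched with a generic gompertz-type transition on the log scale; the transition equation and all parameter values below are illustrative assumptions standing in for the fitted seg estimates:

```python
import math
import random

def seg_trajectory(x0, a, b, sigma, steps, rng):
    """One path of a stochastic Gompertz-type model on the log scale:
        x[t+1] = a + b * x[t] + eps,  eps ~ Normal(0, sigma),
    where x = log(cumulative cases). Environmental stochasticity enters
    through eps; a, b, sigma are illustrative stand-ins, not fitted values."""
    x, path = x0, [x0]
    for _ in range(steps):
        x = a + b * x + rng.gauss(0.0, sigma)
        path.append(x)
    return path

def forecast_quartiles(x0, a, b, sigma, steps=16, reps=5000, seed=1):
    """Inter-quartile range of log-cases at a 16-day forecast horizon,
    estimated from simulated trajectories."""
    rng = random.Random(seed)
    finals = sorted(seg_trajectory(x0, a, b, sigma, steps, rng)[-1]
                    for _ in range(reps))
    return finals[reps // 4], finals[(3 * reps) // 4]
```

sorting the simulated end points and reading off quartiles is the same device used to report the iq range of the 50 000 simulated paths, just at toy scale.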
we then plotted, for the same four countries, the most probable path along with the inter-quartile (iq) range of these 50 000 simulated paths (fig. 4). this comparison (analogous to stochastic forecasting of hurricane paths) clearly shows that the hierarchical model approach is at least as good as, or better than, the deterministic predictions (fig. 4, table 1). in every case, the future observations (data towards the end of the observed time series, not part of the fitting procedure but retained for testing) are as close (for colombia) or closer to the most probable path than they are to the deterministic predictions. epidemic forecasting using the seg model thus appears more reliable for longer forecasts than the deterministic solutions. this property of the seg model could be particularly useful in the face of sudden changes of some kind (social, political, public-health policy, and so forth) in the context within which the epidemic is developing. risk projections and future waves. we developed an rpm tool that mirrors conservation biology approaches (18) to serially update the quasi-extinction probabilities with every increase in the length of the time series of population abundances. applied to sars-cov-2, this process involves using the past records of the cumulative number of cases to estimate the seg model parameters and then using these to predict, for the near future (τ = 5 or 10 days), the probability that the number of cases will rise above a given critical threshold n_crit: p_ncrit(n(t), τ) = pr(n(t + τ) ≥ n_crit | n(t) = n(t)). with every passing day t′, the estimate of p_ncrit(n(t′), τ) is updated. the resulting p_ncrit(n(t′), τ) trend can be used to diagnose the near-future risk of an increase, by any amount, of the number of cases. this risk can then be propagated to compute the chances of needing a corresponding number of extra intensive care units; the procedure is illustrated for costa rica in fig. 5.
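the serial risk-updating idea can be sketched as follows. a log-scale random walk with drift stands in for the fitted seg transition distribution, and all parameter values (mu, sigma, thresholds) are assumptions chosen only for illustration:

```python
import math
import random

def p_exceed(n_now, mu, sigma, tau, n_crit, reps=4000, seed=7):
    """Monte Carlo sketch of p_ncrit(n(t), tau): the probability that cases
    grow by at least n_crit within tau days, under an assumed log-scale
    random walk with drift mu and environmental noise sigma (an illustrative
    stand-in for the fitted SEG transition distribution)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = math.log(n_now)
        for _ in range(tau):
            x += mu + rng.gauss(0.0, sigma)
        if math.exp(x) - n_now >= n_crit:
            hits += 1
    return hits / reps

def risk_trend(series, mu, sigma, tau=5, n_crit=50):
    """Serially updated risk: one p_ncrit estimate per observed day."""
    return [p_exceed(n, mu, sigma, tau, n_crit) for n in series]
```

re-estimating this probability each day, as in the rpm tool, turns the time series of cases into a time series of near-term risk.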
there, the probability of a spike of 50 cases or more declined over time, indicating the dwindling of the epidemic. this approach is applicable to time series of deaths for studies aiming at assessing trends in mortality risks. finally, the advantage of introducing non-standard contact rate functions that include heterogeneity is that they admit non-linear phenomena like bi-stability, which, in the face of environmental stochasticity, explains how a second wave can arise without the need for a phenomenological environmental forcing function (see fig. 6 in (32)). we have used a multi-model approach to better understand and predict the sars-cov-2 dynamics in fourteen countries of latin america. where possible, we incorporated inhomogeneous mixing and/or the effect of urban poverty. the models we examined and compared included compartmental seir models and models with demographic stochasticity and environmental noise, with added sampling error and a poverty random effect (the seg model). we used the seg model to illustrate how, with only time series of cumulative sars-cov-2 cases, countries with scant public-health resources can make practical and conceptual advances in understanding the pandemic and forecasting its effects. our seg model is one of a suite aimed at decomposing the contributions of demographic, environmental and individual heterogeneities in observed time series of data. accurately accounting for the factors shaping the variance of the growth rate of n(t) yields better forecasts (fig. 4) (20, 30) and the power to estimate short-term trends in the risk of epidemic growth (fig. 5). these projections, along with our finding that urban poverty shapes the region-wide dynamics of the sars-cov-2 pandemic in latin america, highlight the potential power of our approach.
developing reliable tools to better understand and predict complex epidemiological processes in poor countries depends on mathematical and statistical approaches attuned to a multiplicity of realities. yes, more data are needed, and always will be. mathematical "microscopes" tracking the complexities of human behavior are also needed. yet here we show that fundamental ecological principles can illuminate the uncertain fate of countries in need, using only the most readily available source of information in these countries: the reported time series of the total number of cases up to any given day. funding for this study was provided by the universidad de san carlos de guatemala for j.a.p., the universidad del norte for j.p.g., and the national institutes of health grant 1r01gm117617 to j.m.p., r.d. holt and j.k. blackburn. key: cord-022494-d66rz6dc authors: webb, b.; eswar, n.; fan, h.; khuri, n.; pieper, u.; dong, g.q.; sali, a. title: comparative modeling of drug target proteins date: 2014-10-01 journal: reference module in chemistry, molecular sciences and chemical engineering doi: 10.1016/b978-0-12-409547-2.11133-3 sha: doc_id: 22494 cord_uid: cord-022494-d66rz6dc in this perspective, we begin by describing the comparative protein structure modeling technique and the accuracy of the corresponding models. we then discuss the significant role that comparative prediction plays in drug discovery.
we focus on virtual ligand screening against comparative models and illustrate the state-of-the-art by a number of specific examples. structure-based or rational drug discovery has already resulted in a number of drugs on the market and many more in the development pipeline. [1] [2] [3] [4] structure-based methods are now routinely used in almost all stages of drug development, from target identification to lead optimization. [5] [6] [7] [8] central to all structure-based discovery approaches is the knowledge of the three-dimensional (3d) structure of the target protein or complex, because the structure and dynamics of the target determine which ligands it binds. the 3d structures of the target proteins are best determined by experimental methods that yield solutions at atomic resolution, such as x-ray crystallography and nuclear magnetic resonance (nmr) spectroscopy. 9 while developments in the techniques of experimental structure determination have enhanced the applicability, accuracy, and speed of these structural studies, 10, 11 structural characterization of sequences remains an expensive and time-consuming task. the publicly available protein data bank (pdb) 12 currently contains ~92 000 structures and grows at a rate of approximately 40% every 2 years. on the other hand, the various genome-sequencing projects have resulted in over 40 million sequences, including the complete genetic blueprints of humans and hundreds of other organisms. 13, 14 this achievement has resulted in a vast collection of sequence information about possible target proteins with little or no structural information. current statistics show that the structures available in the pdb account for less than 1% of the sequences in the uniprot database. 13 moreover, the rate of growth of the sequence information is more than twice that of the structures, and is expected to accelerate even more with the advent of readily available next-generation sequencing technologies.
due to this wide sequence-structure gap, reliance on experimentally determined structures limits the number of proteins that can be targeted by structure-based drug discovery. fortunately, domains in protein sequences are gradually evolving entities that can be clustered into a relatively small number of families with similar sequences and structures. 15, 16 for instance, 75-80% of the sequences in the uniprot database have been grouped into fewer than 15 000 domain families. 17, 18 similarly, all the structures in the pdb have been classified into about 1000 distinct folds. 19, 20 computational protein structure prediction methods, such as threading 21 and comparative protein structure modeling, 22, 23 strive to bridge the sequence-structure gap by utilizing these evolutionary relationships. the speed, low cost, and relative accuracy of these computational methods have led to the use of predicted 3d structures in the drug discovery process. 24, 25 the other class of prediction methods, de novo or ab initio methods, attempts to predict the structure from sequence alone, without reliance on evolutionary relationships. however, despite progress in these methods, [26] [27] [28] especially for small proteins with fewer than 100 amino acid residues, comparative modeling remains the most reliable method of predicting the 3d structure of a protein, with an accuracy that can be comparable to a low-resolution, experimentally determined structure. 9 the basis of comparative modeling. the primary requirement for reliable comparative modeling is a detectable similarity between the sequence of interest (target sequence) and a known structure (template). as early as 1986, chothia and lesk 29 showed that there is a strong correlation between sequence and structural similarities.
this correlation provides the basis of comparative modeling, allows a coarse assessment of model errors, and also highlights one of its major challenges: modeling the structural differences between the template and target structures 30 (figure 1 ). comparative modeling stands to benefit greatly from the structural genomics initiative. 31 structural genomics aims to achieve significant structural coverage of the sequence space with an efficient combination of experimental and prediction methods. 32 this goal is pursued by careful selection of target proteins for structure determination by x-ray crystallography and nmr spectroscopy, such that most other sequences are within 'modeling distance' (e.g., >30% sequence identity) of a known structure. 15, 16, 31, 33 the expectation is that the determination of these structures combined with comparative modeling will yield useful structural information for the largest possible fraction of sequences in the shortest possible timeframe. the impact of structural genomics is illustrated by comparative modeling based on the structures determined by the new york structural genomics research consortium. for each new structure without a close homolog in the pdb, on average, 3500 protein sequences without any prior structural characterization could be modeled at least at the level of the fold. 34 thus, the structures of most proteins will eventually be predicted by computation, not determined by experiment. in this review, we begin by describing the various steps involved in comparative modeling. next, we emphasize two aspects of model refinement, loop modeling and side-chain modeling, due to their relevance in ligand docking and rational drug discovery. we then discuss the errors in comparative models. finally, we describe the role of comparative modeling in drug discovery, focusing on ligand docking against comparative models. 
we compare successes of docking against models and x-ray structures, and illustrate computational docking against models with a number of examples. we conclude with a summary of topics that will impact the future utility of comparative modeling in drug discovery, including the automation and integration of resources required for comparative modeling and ligand docking. comparative modeling consists of four main steps 23 (figure 2(a)): (1) fold assignment, which identifies similarity between the target sequence of interest and at least one known protein structure (the template); (2) alignment of the target sequence and the template(s); (3) building a model based on the alignment with the chosen template(s); and (4) predicting model errors. although fold assignment and sequence-structure alignment are logically two distinct steps in the process of comparative modeling, in practice almost all fold assignment methods also provide sequence-structure alignments. in the past, fold assignment methods were optimized for better sensitivity in detecting remotely related homologs, often at the cost of alignment accuracy. however, recent methods simultaneously optimize both the sensitivity and alignment accuracy. therefore, in the following discussion, we will treat fold assignment and sequence-structure alignment as a single protocol, explaining the differences as needed. as mentioned earlier, the primary requirement for comparative modeling is the identification of one or more known template structures with detectable similarity to the target sequence. the identification of suitable templates is achieved by scanning structure databases, such as pdb, 12 scop, 19 dali, 36 and cath, 20 with the target sequence as the query. the detected similarity is usually quantified in terms of sequence identity or statistical measures, such as e-value or z-score, depending on the method used.
sequence-structure relationships are coarsely classified into three different regimes in the sequence similarity spectrum: (1) the easily detected relationships, characterized by >30% sequence identity; (2) the 'twilight zone,' 37 corresponding to relationships with statistically significant sequence similarity in the 10-30% range; and (3) the 'midnight zone,' 37 corresponding to statistically insignificant sequence similarity. figure 1 average model accuracy as a function of sequence identity. 30 as the sequence identity between the target sequence and the template structure decreases, the average structural similarity between the template and the target also decreases (dashed line, triangles). 29 structural overlap is defined as the fraction of equivalent cα atoms. for the comparison of the model with the actual structure (filled circles), two cα atoms were considered equivalent if they belonged to the same residue and were within 3.5 å of each other after least-squares superposition. for comparisons between the template structure and the actual target structure (triangles), two cα atoms were considered equivalent if they were within 3.5 å of each other after alignment and rigid-body superposition. the difference between the model and the actual target structure is a combination of the target-template differences (green area) and the alignment errors (red area). the figure was constructed by calculating 3993 comparative models, each based on a single template of varying similarity to the target. all targets had known (experimentally determined) structures. 30 for closely related protein sequences with identities higher than 30-40%, the alignments produced by all methods are almost always largely correct. the quickest way to search for suitable templates in this regime is to use simple pairwise sequence alignment methods such as ssearch, 38 blast, 39 and fasta. 38 brenner et al.
showed that these methods detect only ~18% of the homologous pairs at less than 40% sequence identity, while they identify more than 90% of the relationships when sequence identity is between 30% and 40%. 40 another benchmark, based on 200 reference structural alignments with 0-40% sequence identity, indicated that blast is able to correctly align only 26% of the residue positions. 41 the sensitivity of the search and the accuracy of the alignment become progressively more difficult to achieve as the relationships move into the twilight zone. 37,42 a significant improvement in this area was the introduction of profile methods by gribskov and co-workers. 43 the profile of a sequence is derived from a multiple sequence alignment and specifies residue-type occurrences for each alignment position. the information in a multiple sequence alignment is most often encoded as either a position-specific scoring matrix (pssm) 39, 44, 45 or a hidden markov model (hmm). 46, 47 to identify suitable templates for comparative modeling, the profile of the target sequence is used to search against a database of template sequences. the profile-sequence methods are more sensitive in detecting related structures in the twilight zone than the pairwise sequence-based methods; they detect approximately twice the number of homologs under 40% sequence identity. 41, 48, 49 the resulting profile-sequence alignments correctly align approximately 43-48% of residues in the 0-40% sequence-identity range; 41,50 this number is almost twice as large as that of the pairwise sequence methods. frequently used programs for profile-sequence alignment are psi-blast, 39 sam, 51 hmmer, 46 hhsearch, 52 hhblits, 53 and build_profile. 54 as a natural extension, the profile-sequence alignment methods have led to profile-profile alignment methods that search for suitable template structures by scanning the profile of the target sequence against a database of template profiles, as opposed to a database of template sequences.
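the profile idea can be illustrated with a toy pssm built from a gapless alignment; real profile tools (psi-blast, hmmer) add sequence weighting, gap handling, and richer background models, so this is only a sketch:

```python
import math
from collections import Counter

def build_pssm(msa, alphabet='ACDEFGHIKLMNPQRSTVWY', pseudocount=1.0):
    """Position-specific scoring matrix (log-odds versus a uniform
    background) from a gapless multiple sequence alignment. A minimal
    sketch of the profile idea, not a production implementation."""
    bg = 1.0 / len(alphabet)
    pssm = []
    for j in range(len(msa[0])):
        counts = Counter(seq[j] for seq in msa)
        total = len(msa) + pseudocount * len(alphabet)
        # log-odds of observing residue a at column j vs. background
        col = {a: math.log(((counts[a] + pseudocount) / total) / bg)
               for a in alphabet}
        pssm.append(col)
    return pssm

def score_sequence(pssm, seq):
    """Sum of per-position log-odds scores for an ungapped alignment."""
    return sum(col[a] for col, a in zip(pssm, seq))
```

a sequence resembling the family scores well above an unrelated one, which is exactly the extra signal that makes profile-sequence searches more sensitive than pairwise comparison.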
these methods have proven to include the most sensitive and accurate fold assignment and alignment protocols to date. 50, [55] [56] [57] figure 2 comparative protein structure modeling. (a) a flowchart illustrating the steps in the construction of a comparative model. 23 (b) description of comparative modeling by extraction of spatial restraints as implemented in modeller. 35 by default, spatial restraints in modeller include: (1) homology-derived restraints from the aligned template structures; (2) statistical restraints derived from all known protein structures; and (3) stereochemical restraints from the charmm-22 molecular mechanics force field. these restraints are combined into an objective function that is then optimized to calculate the final 3d model of the target sequence. profile-profile methods detect ~28% more relationships at the superfamily level and improve the alignment accuracy by 15-20% compared to profile-sequence methods. 50, 58 there are a number of variants of profile-profile alignment methods that differ in the scoring functions they use. 50, 55, [58] [59] [60] [61] [62] [63] [64] however, several analyses have shown that the overall performances of these methods are comparable. 50, [55] [56] [57] some of the programs that can be used to detect suitable templates are ffas, 65 sp3, 58 salign, 50 hhblits, 53 hhsearch, 52 and ppscan. 54 sequence-structure threading methods. as the sequence identity drops below the threshold of the twilight zone, there is usually insufficient signal in the sequences or their profiles for the sequence-based methods discussed above to detect true relationships. 48 sequence-structure threading methods are most useful in this regime as they can sometimes recognize common folds, even in the absence of any statistically significant sequence similarity. 21 these methods achieve higher sensitivity by using structural information derived from the templates.
the accuracy of a sequence-structure match is assessed by the score of a corresponding coarse model and not by sequence similarity, as in sequence comparison methods. 21 the scoring scheme used to evaluate the accuracy is either based on residue substitution tables dependent on structural features such as solvent exposure, secondary structure type, and hydrogen-bonding properties, 58,66-68 or on statistical potentials for residue interactions implied by the alignment. [69] [70] [71] [72] [73] the use of structural data does not have to be restricted to the structure side of the aligned sequence-structure pair. for example, sam-t08 makes use of the predicted local structure of the target sequence to enhance homolog detection and alignment accuracy. 74 commonly used threading programs are genthreader, 66,75 3d-pssm, 76 fugue, 68 sp3, 58 sam-t08 multitrack hmm, 67, 74, 77 and muster. 78 iterative sequence-structure alignment. yet another strategy is to optimize the alignment by iterating over the process of calculating alignments, building models, and evaluating models. such a protocol can sample alignments that are not statistically significant and identify the alignment that yields the best model. although this procedure can be time-consuming, it can significantly improve the accuracy of the resulting comparative models in difficult cases. 79 regardless of the method used, searching in the twilight and midnight zones of the sequence-structure relationship often results in false negatives, false positives, or alignments that contain an increasingly large number of gaps and alignment errors. improving the performance and accuracy of methods in this regime remains one of the main tasks of comparative modeling today. 80 it is imperative to calculate an accurate alignment for the target-template pair. although some progress has been made recently, 81 comparative modeling can rarely recover from an alignment error.
82 after a list of all related protein structures and their alignments with the target sequence has been obtained, template structures are prioritized depending on the purpose of the comparative model. template structures may be chosen purely on the basis of target-template sequence identity or using a combination of several other criteria, such as the experimental accuracy of the structures (resolution of x-ray structures, number of restraints per residue for nmr structures), conservation of active-site residues, holo structures that have bound ligands of interest, and prior biological information pertaining to the solvent, ph, and quaternary contacts. it is not necessary to select only one template. in fact, the use of several templates approximately equidistant from the target sequence generally increases the model accuracy. 83, 84 model building. once an initial target-template alignment is built, a variety of methods can be used to construct a 3d model for the target protein. 23, 82, [85] [86] [87] [88] the original and still widely used method is modeling by rigid-body assembly. 86, 87, 89 this method constructs the model from a few core regions, and from loops and side chains that are obtained by dissecting related structures. commonly used programs that implement this method are composer, 90-93 3d-jigsaw, 94 rosettacm, 81 and swiss-model. 95 another family of methods, modeling by segment matching, relies on the approximate positions of conserved atoms from the templates to calculate the coordinates of other atoms. [96] [97] [98] [99] [100] an instance of this approach is implemented in segmod. 99 the third group of methods, modeling by satisfaction of spatial restraints, uses either distance geometry or optimization techniques to satisfy spatial restraints obtained from the alignment of the target sequence with the template structures.
35, [101] [102] [103] [104] specifically, modeller, 35,105,106 our own program for comparative modeling, belongs to this group of methods. modeller implements comparative protein structure modeling by the satisfaction of spatial restraints that include: (1) homology-derived restraints on the distances and dihedral angles in the target sequence, extracted from its alignment with the template structures; 35 (2) stereochemical restraints, such as bond length and bond angle preferences, obtained from the charmm-22 molecular mechanics force field; 107 (3) statistical preferences for dihedral angles and nonbonded interatomic distances, obtained from a representative set of known protein structures; 108 and (4) optional manually curated restraints, such as those from nmr spectroscopy, rules of secondary structure packing, cross-linking experiments, fluorescence spectroscopy, image reconstruction from electron microscopy, site-directed mutagenesis, and intuition (figure 2(b)). the spatial restraints, expressed as probability density functions, are combined into an objective function that is optimized by a combination of conjugate gradients and molecular dynamics with simulated annealing. this model-building procedure is similar to structure determination by nmr spectroscopy. accuracies of the various model-building methods are relatively similar when used optimally. 109, 110 other factors, such as template selection and alignment accuracy, usually have a larger impact on the model accuracy, especially for models based on less than 30% sequence identity to the templates. however, it is important that a modeling method allow a degree of flexibility and automation, so that better models can be obtained more easily and rapidly.
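the satisfaction-of-spatial-restraints idea can be reduced to a toy one-dimensional example: each restraint is a gaussian pdf on a single distance, and minimizing the negative log of their product amounts to a weighted least-squares fit. this sketch only illustrates the principle and bears no resemblance to modeller's actual restraint set, scale, or optimizer:

```python
def optimize_distance(restraints, x0=3.0, lr=0.01, steps=2000):
    """Toy modeling by satisfaction of spatial restraints: each restraint
    is (mean, sigma) of a Gaussian pdf on one interatomic distance x.
    The objective is -log of the product of pdfs (up to a constant), i.e.
    sum((x - mean)^2 / (2 sigma^2)), minimized by gradient descent.
    MODELLER optimizes many such terms jointly with conjugate gradients
    and MD/simulated annealing; this is only a 1-D illustration."""
    x = x0
    for _ in range(steps):
        # d/dx of the objective: sum((x - mean) / sigma^2)
        grad = sum((x - m) / (s * s) for m, s in restraints)
        x -= lr * grad
    return x
```

the minimizer is the precision-weighted mean of the restraint centers, so a tight homology-derived restraint pulls the model harder than a loose statistical one; this is the mechanism by which conflicting restraints are reconciled.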
for example, a method should allow for an easy recalculation of a model when a change is made in the alignment; it should be straightforward to calculate models based on several templates; and the method should provide tools for incorporation of prior knowledge about the target (e.g., cross-linking restraints and predicted secondary structure). protein sequences evolve through a series of amino acid residue substitutions, insertions, and deletions. while substitutions can occur throughout the length of the sequence, insertions and deletions mostly occur on the surface of proteins in segments that connect regular secondary structure segments (i.e., loops). while the template structures are helpful in the modeling of the aligned target backbone segments, they are generally less valuable for the modeling of side chains and irrelevant for the modeling of insertions such as loops. the loops and side chains of comparative models are especially important for ligand docking; thus, we discuss them in the following two sections. loop modeling is an especially important aspect of comparative modeling in the range from 30% to 50% sequence identity. in this range of overall similarity, loops among the homologs vary while the core regions are still relatively conserved and aligned accurately. loops often play an important role in defining the functional specificity of a given protein, forming the active and binding sites. loop modeling can be seen as a mini protein folding problem because the correct conformation of a given segment of a polypeptide chain has to be calculated mainly from the sequence of the segment itself. however, loops are generally too short to provide sufficient information about their local fold. even identical decapeptides in different proteins do not always have the same conformation. 111, 112 some additional restraints are provided by the core anchor regions that span the loop and by the structure of the rest of the protein that cradles the loop. 
although many loop-modeling methods have been described, it is still challenging to correctly and confidently model loops longer than approximately 10-12 residues. 105, 113, 114 two classes of methods. there are two main classes of loop-modeling methods: (1) database search approaches that scan a database of all known protein structures to find segments fitting the anchor core regions 98, 115 ; and (2) conformational search approaches that rely on optimizing a scoring function. [116] [117] [118] there are also methods that combine these two approaches. 119, 120 database-based loop modeling. the database search approach to loop modeling is accurate and efficient when a database of specific loops is created to address the modeling of the same class of loops, such as β-hairpins, 121 or loops on a specific fold, such as the hypervariable regions in the immunoglobulin fold. 115, 122 there are attempts to classify loop conformations into more general categories, thus extending the applicability of the database search approach. [123] [124] [125] however, the database methods are limited because the number of possible conformations increases exponentially with the length of a loop, and until the late 1990s only loops up to 7 residues long could be modeled using the database of known protein structures. 126, 127 however, the growth of the pdb in recent years has largely eliminated this problem. 128 optimization-based methods. there are many optimization-based methods, exploiting different protein representations, objective functions, and optimization or enumeration algorithms. the search algorithms include the minimum perturbation method, 129 dihedral angle search through a rotamer library, 114,130 molecular dynamics simulations, 119, 131 genetic algorithms, 132 monte carlo and simulated annealing, [133] [134] [135] multiple-copy simultaneous search, 136 self-consistent field optimization, 137 robotics-inspired kinematic closure 138 and enumeration based on graph theory.
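the conformational-search class of methods can be illustrated with a toy two-dimensional "loop closure" solved by simulated annealing over segment angles; the geometry, move set, and cooling schedule below are arbitrary choices for illustration, not those of any published method:

```python
import math
import random

def chain_end(angles):
    """End point of a 2-D chain of unit-length segments with absolute angles."""
    return (sum(math.cos(a) for a in angles),
            sum(math.sin(a) for a in angles))

def close_loop(n_seg, target, iters=20000, seed=3):
    """Simulated annealing over segment angles so the chain end approaches
    the anchor point `target` (a stand-in for the loop-closure constraint
    imposed by the core anchor regions)."""
    rng = random.Random(seed)
    angles = [0.0] * n_seg

    def cost(a):
        x, y = chain_end(a)
        return math.hypot(x - target[0], y - target[1])

    cur = cost(angles)
    for i in range(iters):
        temp = max(1e-3, 1.0 - i / iters)   # linear cooling schedule
        j = rng.randrange(n_seg)
        old = angles[j]
        angles[j] += rng.gauss(0.0, 0.5)    # perturb one angle
        new = cost(angles)
        # Metropolis criterion: always accept improvements; accept uphill
        # moves with probability exp(-delta/temp), mostly early on
        if new < cur or rng.random() < math.exp((cur - new) / temp):
            cur = new
        else:
            angles[j] = old
    return angles, cur
```

real loop modeling scores full atomic models with physical or statistical potentials rather than a single closure distance; the sketch only shows why stochastic optimization can navigate a search space too large to enumerate.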
139 the accuracy of loop predictions can be further improved by clustering the sampled loop conformations and partially accounting for the entropic contribution to the free energy. 140 another way of improving the accuracy of loop predictions is to consider solvent effects. improvements in implicit solvation models, such as the generalized born solvation model, motivated their use in loop modeling. the solvent contribution to the free energy can be added to the scoring function for optimization, or it can be used to rank the sampled loop conformations after they are generated with a scoring function that does not include the solvent terms. 105, [141] [142] [143]

side-chain modeling

two simplifications are frequently applied in the modeling of side-chain conformations. 144 first, amino acid residue replacements often leave the backbone structure almost unchanged, 145 allowing us to fix the backbone during the search for the best side-chain conformations. second, most side chains in high-resolution crystallographic structures can be represented by a limited number of conformers that comply with stereochemical and energetic constraints. 146 this observation motivated ponder and richards 147 to develop the first library of side-chain rotamers for the 17 types of residues with dihedral angle degrees of freedom in their side chains, based on 10 high-resolution protein structures determined by x-ray crystallography. subsequently, a number of additional libraries have been derived. [148] [149] [150] [151] [152] [153] [154] [155]

rotamers

rotamers on a fixed backbone are often used when all the side chains need to be modeled on a given backbone. this approach reduces the combinatorial explosion associated with a full conformational search of all the side chains, and is applied by some comparative modeling 86 and protein design approaches. 156 however, ~15% of the side chains cannot be represented well by these libraries.
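a minimal sketch of rigid-backbone rotamer placement may help: each residue, in turn, is assigned the library rotamer with the fewest steric clashes against the fixed backbone and the side chains already placed. this greedy scheme is far simpler than the combinatorial searches discussed in the text; all names, the data layout, and the clash criterion are illustrative assumptions:

```python
import math

def count_clashes(atoms, context, cutoff=2.5):
    """crude repulsion term: number of atom pairs closer than cutoff (in angstroms)."""
    n = 0
    for a in atoms:
        for b in context:
            if math.dist(a, b) < cutoff:
                n += 1
    return n

def place_side_chains(backbone_atoms, rotamer_library):
    """greedy rigid-backbone search.

    backbone_atoms: list of fixed backbone coordinates.
    rotamer_library: dict residue_id -> list of rotamers, each rotamer a
    list of side-chain atom coordinates. for each residue, in order, the
    rotamer with the fewest clashes against the fixed backbone and the
    previously placed side chains is chosen; returns residue_id -> index
    of the chosen rotamer.
    """
    context = list(backbone_atoms)
    choice = {}
    for res_id, rotamers in rotamer_library.items():
        scores = [count_clashes(r, context) for r in rotamers]
        best = min(range(len(rotamers)), key=lambda k: scores[k])
        choice[res_id] = best
        context.extend(rotamers[best])
    return choice
```

a real method would replace the clash count with rotamer preferences plus an energy function and would search combinatorially rather than greedily, but the fixed-backbone structure of the problem is the same.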
157 in addition, it has been shown that the accuracy of side-chain modeling on a fixed backbone decreases rapidly when the backbone errors are larger than 0.5 Å. 158 earlier methods for side-chain modeling often put less emphasis on the energy or scoring function. the function was usually greatly simplified, consisting of empirical rotamer preferences and simple repulsion terms for nonbonded contacts. 151 nevertheless, these approaches have been justified by their performance. for example, a method based on a rotamer library compared favorably with one based on a molecular mechanics force field, 159 and new methods continue to be based on the rotamer library approach. [160] [161] [162] the various optimization approaches include monte carlo simulation, 163 simulated annealing, 164 a combination of monte carlo and simulated annealing, 165 the dead-end elimination theorem, 166, 167 genetic algorithms, 155 neural networks with simulated annealing, 168 mean field optimization, 169 and combinatorial searches. 151, 170, 171 several studies focused on the testing of more sophisticated potential functions for conformational search 171, 172 and on the development of new scoring functions for side-chain modeling, 173 reporting higher accuracy than earlier studies.

errors in comparative models

the major sources of error in comparative modeling are discussed in the relevant sections above. the following is a summary of these errors, dividing them into five categories (figure 3).

errors due to an incorrect template

this error is a potential problem when distantly related proteins are used as templates (i.e., less than 30% sequence identity). distinguishing between a model based on an incorrect template and a model based on an incorrect alignment with a correct template is difficult. in both cases, the evaluation methods (below) will predict an unreliable model. the conservation of the key functional or structural residues in the target sequence increases the confidence in a given fold assignment.
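among the side-chain optimization approaches listed above, the dead-end elimination theorem admits a compact sketch. the inequality below is the original desmet-style criterion; the data layout (dicts keyed by position and rotamer label) is an assumption for illustration:

```python
def dee_eliminate(positions, self_e, pair_e):
    """flag rotamers that provably cannot be part of the global minimum.

    positions: dict i -> list of rotamer labels at position i.
    self_e[(i, r)]: self energy of rotamer r at position i.
    pair_e[(i, r, j, s)]: pairwise energy between rotamers r@i and s@j.

    rotamer r at position i is dead-ending if some alternative t satisfies
        E(i_r) + sum_{j != i} min_s E_pair(i_r, j_s)
            > E(i_t) + sum_{j != i} max_s E_pair(i_t, j_s)
    i.e., r's best case is still worse than t's worst case.
    """
    dead = set()
    for i, rots in positions.items():
        others = [j for j in positions if j != i]
        for r in rots:
            lo = self_e[(i, r)] + sum(
                min(pair_e[(i, r, j, s)] for s in positions[j]) for j in others)
            for t in rots:
                if t == r:
                    continue
                hi = self_e[(i, t)] + sum(
                    max(pair_e[(i, t, j, s)] for s in positions[j]) for j in others)
                if lo > hi:
                    dead.add((i, r))
                    break
    return dead
```

in practice the criterion is applied iteratively (eliminations shrink the rotamer sets, enabling further eliminations), and stronger variants of the inequality exist; this sketch shows only a single pass.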
the single source of errors with the largest impact on comparative modeling is misalignment, especially when the target-template sequence identity decreases below 30%. alignment errors can be minimized in two ways. using the profile-based methods discussed above usually results in more accurate alignments than those from pairwise sequence alignment methods. another way of improving the alignment is to modify those regions in the alignment that correspond to predicted errors in the model. 83

segments of the target sequence that have no equivalent region in the template structure (i.e., insertions or loops) are among the most difficult regions to model. again, when the target and template are distantly related, errors in the alignment can lead to incorrect positions of the insertions. using alignment methods that incorporate structural information can often correct such errors. once a reliable alignment is obtained, various modeling protocols can predict the loop conformation for insertions of fewer than 8-10 residues. 105, 113, 119, 174

distortions and shifts in correctly aligned regions

as a consequence of sequence divergence, the main-chain conformation changes, even if the overall fold remains the same. therefore, it is possible that in some correctly aligned segments of a model, the template is locally different (<3 Å) from the target, resulting in errors in that region. the structural differences are sometimes not due to differences in sequence, but are a consequence of artifacts in structure determination or structure determination in different environments (e.g., packing of subunits in a crystal). the simultaneous use of several templates can minimize this kind of error. 83,84

figure 3: typical errors in comparative modeling. 23 shown are the typical sources of errors encountered in comparative models. two of the major sources of errors in comparative modeling are due to incorrect templates or incorrect alignments with the correct templates.
the modeling procedure can rarely recover from such errors. the next significant source of errors arises from regions in the target with no corresponding region in the template, i.e., insertions or loops. other sources of errors, which occur even with an accurate alignment, are due to rigid-body shifts, distortions in the backbone, and errors in the packing of side chains.

as the sequences diverge, the packing of the atoms in the protein core changes. sometimes even the conformation of identical side chains is not conserved, a pitfall for many comparative modeling methods. side-chain errors are critical if they occur in regions that are involved in protein function, such as active sites and ligand-binding sites.

model evaluation

the accuracy of the predicted model determines the information that can be extracted from it. thus, estimating the accuracy of a model in the absence of the known structure is essential for interpreting it. as discussed earlier, a model calculated using a template structure that shares more than 30% sequence identity is indicative of an overall accurate structure. however, when the sequence identity is lower, the first aspect of model evaluation is to confirm whether or not a correct template was used for modeling. it is often the case, when operating in this regime, that the fold assignment step produces only false positives. a further complication is that at such low similarities the alignment generally contains many errors, making it difficult to distinguish between an incorrect template on one hand and an incorrect alignment with a correct template on the other hand. there are several methods that use 3d profiles and statistical potentials, 70, 175, 176 which assess the compatibility between the sequence and modeled structure by evaluating the environment of each residue in a model with respect to the expected environment, as found in native high-resolution experimental structures.
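the idea of comparing each residue's environment to native expectations can be illustrated with a toy log-odds score in the spirit of 3d-profile methods; the frequency tables here are placeholders, not values from any published potential:

```python
import math

def profile_score(residues, env_freq, background):
    """very reduced sketch of a 3d-profile model evaluation.

    residues: list of (aa_type, env_class) pairs, where env_class is a
    discretized structural environment (e.g., 'buried' or 'exposed').
    env_freq[aa][env]: frequency with which residue type aa is found in
    environment env, compiled from native high-resolution structures.
    background[env]: overall frequency of the environment class.
    the score sums per-residue log-odds; native-like models score above
    zero, while residues in unlikely environments pull the score down.
    """
    return sum(
        math.log(env_freq[aa][env] / background[env])
        for aa, env in residues)
```

real assessment programs use much richer environment descriptors and window-averaged per-residue profiles, but the log-odds comparison against native statistics is the common core.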
these methods can be used to assess whether or not the correct template was used for the modeling. they include verify3d, 175 184 and tsvmod. 185 even when the model is based on alignments that have >30% sequence identity, other factors, including the environment, can strongly influence the accuracy of a model. for instance, some calcium-binding proteins undergo large conformational changes when bound to calcium. if a calcium-free template is used to model the calcium-bound state of the target, it is likely that the model will be incorrect, irrespective of the target-template similarity or accuracy of the template structure. 186 the model should also be subjected to evaluations of self-consistency to ensure that it satisfies the restraints used to calculate it. additionally, the stereochemistry of the model (e.g., bond lengths, bond angles, backbone torsion angles, and nonbonded contacts) may be evaluated using programs such as procheck 187 and whatcheck. 188 although errors in stereochemistry are rare and less informative than errors detected by statistical potentials, a cluster of stereochemical errors may indicate that there are larger errors (e.g., alignment errors) in that region. when multiple models are calculated for the target based on a single template or when multiple loops are built for a single or multiple models, it is practical to select a subset of models or loops that are judged to be most suitable for subsequent docking calculations. if some known ligands or other information for the desired model is available, model selection should be guided by this known information. 189 if this extra information is not available, model selection should aim to select the most accurate model. while models or loops can be selected by the energy function used for guiding the building of comparative models or the sampling of loop configurations, using a separate statistical potential for selecting the most accurate models or loops is often more successful. 
181, 182, 190, 191 it is crucial for method developers and users alike to assess the accuracy of their methods. an attempt to address this problem has been made by the critical assessment of techniques for protein structure prediction (casp) 192 and, in the past, by the critical assessment of fully automated structure prediction (cafasp) experiments, 193 which is now integrated into casp. however, casp assesses methods only over a limited number of target protein sequences, and is conducted only every 2 years. 109, 194 to overcome this limitation, the new cameo web server continuously evaluates the accuracy and reliability of a number of comparative protein structure prediction servers, in a fully automated manner. 195 every week, cameo provides each tested server with the prerelease sequences of structures that are to be shortly released by the pdb. each server then has 4 days to build and return a 3d model of these sequences. when the pdb releases the structures, cameo compares the models against the experimentally determined structures, and presents the results on its web site. this enables developers, non-expert users, and reviewers to determine the performance of the tested prediction servers. cameo is similar in concept to two prior continuous testing servers, livebench 194 and eva. 196, 197 there is a wide range of applications of protein structure models (figure 4). 1, [198] [199] [200] [201] [202] [203] [204] for example, high- and medium-accuracy comparative models are frequently helpful in refining functional predictions that have been based on a sequence match alone, because ligand binding is more directly determined by the structure of the binding site than by its sequence. it is often possible to predict correctly features of the target protein that do not occur in the template structure.
205, 206 for example, the size of a ligand may be predicted from the volume of the binding site cleft, and the location of a binding site for a charged ligand can be predicted from a cluster of charged residues on the protein. fortunately, errors in the functionally important regions of comparative models are often relatively low, because functional regions, such as active sites, tend to be more conserved in evolution than the rest of the fold. even low-accuracy comparative models may be useful, for example, for assigning the fold of a protein.

figure 4: accuracy and applications of protein structure models. 9 shown are the different ranges of applicability of comparative protein structure modeling, threading, and de novo structure prediction, their corresponding accuracies, and their sample applications.

fold assignment can be very helpful in drug discovery, because it can shortcut the search for leads by pointing to compounds that have been previously developed for other members of the same family. 207, 208 the remainder of this review focuses on the use of comparative models for ligand docking. [209] [210] [211] comparative protein structure modeling extends the applicability of virtual screening beyond the atomic structures determined by x-ray crystallography or nmr spectroscopy. in fact, comparative models have been used in virtual screening to detect novel ligands for many protein targets, 201 including the g-protein coupled receptors (gpcrs), 210, [212] [213] [214] [215] [216] [217] [218] [219] [220] [221] [222] [223] protein kinases, [224] [225] [226] [227] nuclear hormone receptors, and many different enzymes. [228] [229] [230] [231] [232] [233] [234] [235] [236] [237] [238] [239] [240] [241] nevertheless, the relative utility of comparative models versus experimentally determined structures has only been sparsely assessed.
212, 224, 225, [242] [243] [244] the utility of comparative models for molecular docking screens in ligand discovery has been documented 245 with the aid of 38 protein targets selected from the 'directory of useful decoys' (dud). 246 for each target sequence, templates for comparative modeling were obtained from the pdb, including at least one holo (ligand bound) and one apo (ligand free) template structure for each of the eight 10% sequence identity ranges from 20% to 100%. in total, 222 models were generated based on 222 templates for the 38 test proteins using modeller 9v2. 35 dud ligands and decoys (98 266 molecules) were screened against the holo x-ray structure, the apo x-ray structure, and the comparative models of each target using dock 3.5.54. 247 the accuracy of virtual screening was evaluated by the overall ligand enrichment that was calculated by integrating the area under the enrichment plot (logauc). a key result was that, for 63% and 79% of the targets, at least one comparative model yielded ligand enrichment better or comparable to that of the corresponding holo and apo x-ray structure. 245 this result indicates that comparative models can be useful docking targets when multiple templates are available. however, it was not possible to predict which model, out of all those used, led to the highest enrichment. therefore, a 'consensus' enrichment score was computed by ranking each library compound by its best docking score against all comparative models and/or templates. for 47% and 70% of the targets, the consensus enrichment for multiple models was better or comparable to that of the holo and apo x-ray structures, respectively, suggesting that multiple comparative models can be useful for virtual screening. despite problems with comparative modeling and ligand docking, comparative models have been successfully used in practice in conjunction with virtual screening to identify novel inhibitors. 
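the 'consensus' scheme described above, together with a simple enrichment measure, can be sketched as follows. logauc itself involves integrating the enrichment curve on a logarithmic axis; the simpler enrichment factor below is a stand-in for it, and all names and data layouts are illustrative:

```python
def consensus_rank(scores_by_model):
    """rank compounds by their best (lowest, i.e., most favorable)
    docking score over all comparative models and/or templates,
    the 'consensus' scheme described in the text.

    scores_by_model: dict model_name -> dict compound -> docking score.
    returns compounds sorted from best to worst consensus score.
    """
    best = {}
    for model_scores in scores_by_model.values():
        for compound, score in model_scores.items():
            if compound not in best or score < best[compound]:
                best[compound] = score
    return sorted(best, key=best.get)

def enrichment_factor(ranked, ligands, top_frac):
    """fraction of known ligands recovered in the top of the ranked list,
    relative to the random expectation (a simpler stand-in for logauc)."""
    k = max(1, int(len(ranked) * top_frac))
    hit_rate = sum(1 for c in ranked[:k] if c in ligands) / k
    return hit_rate / (len(ligands) / len(ranked))
```

an enrichment factor of 1 corresponds to random ranking; values well above 1 in the early part of the list are what the logauc metric rewards.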
we briefly review a few of these success stories to highlight the potential of the combined comparative modeling and ligand-docking approach to drug discovery. comparative models have been employed to aid rational drug design against parasites for more than 20 years. 132, 231, 232, 240 as early as 1993, ring et al. 132 used comparative models for computational docking studies that identified low-micromolar nonpeptidic inhibitors of proteases in malarial and schistosome parasite lifecycles. li et al. 231 subsequently used similar methods to develop nanomolar inhibitors of falcipain that are active against chloroquine-resistant strains of malaria. in a study by selzer et al., 232 comparative models were used to predict new nonpeptide inhibitors of cathepsin l-like cysteine proteases in leishmania major. sixty-nine compounds were selected by dock 3.5 as strong binders to a comparative model of protein cpb, and of these, 21 had experimental ic50 values below 100 mmol/l. finally, in a study by que et al., 240 comparative models were used to rationalize ligand-binding affinities of cysteine proteases in entamoeba histolytica. specifically, this work provided an explanation for why proteases acp1 and acp2 had substrate specificity similar to that of cathepsin b, although their overall structure is more similar to that of cathepsin d. enyedy et al. 248 discovered 15 new inhibitors of matriptase by docking against its comparative model. the comparative model employed thrombin as the template, sharing only 34% sequence identity with the target sequence. moreover, some residues in the binding site are significantly different; a trio of charged asp residues in matriptase correspond to 1 tyr and 2 trp residues in thrombin. thrombin was chosen as the template, in part, because it prefers substrates with positively charged residues at the p1 position, as does matriptase.
the national cancer institute database was used for virtual screening that targeted the s1 site with the dock program. the 2000 best-scoring compounds were manually inspected to identify positively charged ligands (the s1 site is negatively charged), and 69 compounds were experimentally screened for inhibition, identifying the 15 inhibitors. one of them, hexamidine, was used as a lead to identify additional compounds selective for matriptase relative to thrombin. the wang group has also used similar methods to discover seven new, low-micromolar inhibitors of bcl-2, using a comparative model based on the nmr solution structure of bcl-xl. 233 schapira et al. 249 discovered a novel inhibitor of a retinoic acid receptor by virtual screening using a comparative model. in this case, the target (rar-α) and template (rar-γ) are very closely related; only three residues in the binding site are not conserved. the icm program was used for virtual screening of ligands from the available chemicals directory (acd). the 5364 high-scoring compounds identified in the first round were subsequently docked into a full-atom representation of the receptor with flexible side chains to obtain a final set of 300 good-scoring hits. these compounds were then manually inspected to choose the final 30 for testing. two novel agonists were identified, with 50-nanomolar activity. zuccotto et al. 250 identified novel inhibitors of dihydrofolate reductase (dhfr) in trypanosoma cruzi (the parasite that causes chagas disease) by docking into a comparative model based on ~50% sequence identity to dhfr in l. major, a related parasite. the virtual screening procedure used dock for rigid docking of over 50 000 selected compounds from the cambridge structural database (csd). visual inspection of the top 100 hits was used to select 36 compounds for experimental testing. this work identified several novel scaffolds with micromolar ic50 values.
the authors report attempting to use virtual screening results to identify compounds with greater affinity for t. cruzi dhfr than human dhfr, but it is not clear how successful they were. following the outbreak of the severe acute respiratory syndrome (sars) in 2003, anand et al. 251 used the experimentally determined structures of the main protease from human coronavirus (mpro) and an inhibitor complex of porcine coronavirus (transmissible gastroenteritis virus, tgev) mpro to calculate a comparative model of the sars coronavirus mpro. this model then provided a basis for the design of anti-sars drugs. in particular, a comparison of the active site residues in these and other related structures suggested that the ag7088 inhibitor of the human rhinovirus type 2 3c protease is a good starting point for design of anticoronaviral drugs. 252 comparative models of protein kinases combined with virtual screening have also been intensely used for drug discovery. 224, 225, [253] [254] [255] the >500 kinases in the human genome, the relatively small number of experimental structures available, and the high level of conservation around the important adenosine triphosphate-binding site make comparative modeling an attractive approach toward structure-based drug discovery. g protein-coupled receptors are another interesting class of proteins that in principle allow drug discovery through comparative modeling. 212, [256] [257] [258] [259] approximately 40% of current drug targets belong to this class of proteins. however, these proteins have been extremely difficult to crystallize, and most comparative modeling has been based on the atomic resolution structure of bovine rhodopsin. 260 despite this limitation, a rather extensive test of docking methods with rhodopsin-based comparative models shows encouraging results.
the applicability of structure-based modeling and virtual screening has recently been expanded to membrane proteins that transport solutes, such as ions, metabolites, peptides, and drugs. in humans, these transporters contribute to the absorption, distribution, metabolism, and excretion of drugs, and often mediate drug-drug interactions. additionally, several transporters can be targeted directly by small molecules. for instance, methylphenidate (ritalin), which inhibits the norepinephrine transporter (net) and, consequently, the reuptake of norepinephrine, is used in the treatment of attention-deficit hyperactivity disorder (adhd). 261 schlessinger et al. 262 predicted 18 putative ligands of human net by docking 6436 drugs from the kyoto encyclopedia of genes (kegg drug) into a comparative model based on ~25% sequence identity to the leucine transporter (leut) from aquifex aeolicus. of these 18 predicted ligands, ten were validated by cis-inhibition experiments; five of them were chemically novel. close examination of the predicted primary binding site helped rationalize interactions of net with its primary substrate, norepinephrine, as well as positive and negative pharmacological effects of other net ligands. subsequently, schlessinger et al. 263 modeled two different conformations of the human gaba transporter 2 (gat-2), using the leut structures in occluded-outward-facing and outward-facing conformations. enrichment calculations were used to assess the quality of the models in molecular dynamics simulations and side-chain refinements. the key residue, glu48, interacting with the substrate was identified during the refinement of the models and validated by site-directed mutagenesis. docking against two conformations of the transporter enriches for different physicochemical properties of ligands.
for example, top-scoring ligands found by docking against the outward-facing model were bulkier and more hydrophobic than those predicted using the occluded-outward-facing model. among twelve ligands validated in cis-inhibition assays, six were chemically novel (e.g., homotaurine). based on the validation experiments, gat-2 is likely to be a high-selectivity/low-affinity transporter. following these two studies, a combination of comparative modeling, ligand docking, and experimental validation was used to rationalize the toxicity of an anti-cancer agent, acivicin. 264 the toxic side-effects are thought to be facilitated by the active transport of acivicin through the blood-brain barrier (bbb) via the large-neutral amino acid transporter 1 (lat-1). in addition, four small-molecule ligands of lat-1 were identified by docking against a comparative model based on two templates: the structure of the outward-occluded arginine-bound arginine/agmatine transporter adic from e. coli 265 and the inward-apo conformation of the amino acid, polyamine, and organo-cation transporter apct from methanococcus jannaschii. 266 two of the four hits, acivicin and fenclonine, were confirmed as substrates by a trans-stimulation assay. these studies clearly illustrate the applicability of combined comparative modeling and virtual screening to ligand discovery for transporters. although reports of successful virtual screening against comparative models are encouraging, such efforts are not yet a routine part of rational drug design. even the successful efforts appear to rely strongly on visual inspection of the docking results. much work remains to be done to improve the accuracy, efficiency, and robustness of docking against comparative models.
despite assessments of the relative successes of docking against comparative models and native x-ray structures, 225, 244 relatively little has been done to compare the accuracy achievable by different approaches to comparative modeling and to identify the specific structural reasons why comparative models generally produce less accurate virtual screening results than the holo structures. among the many issues that deserve consideration are the following:

• the inclusion of cofactors and bound water molecules in protein receptors is often critical for the success of virtual screening; however, cofactors are not routinely included in comparative models.

• most docking programs currently retain the protein receptor in a rigid conformation. while this approach is appropriate for 'lock-and-key' binding modes, it does not work when the ligand induces conformational changes in the receptor upon binding. a flexible receptor approach is necessary to address such induced-fit cases. 267, 268

• the accuracy of comparative models is frequently judged by the cα root mean square error or other similar measures of backbone accuracy. for virtual screening, however, the precise positioning of side chains in the binding site is likely to be critical; measures of accuracy for binding sites are needed to help evaluate the suitability of comparative modeling algorithms for constructing models for docking.

• knowledge of known inhibitors, either for the target protein or the template, should help to evaluate and improve virtual screening against comparative models. for example, comparative models constructed from holo template structures implicitly preserve some information about the ligand-bound receptor conformation.

• improvement in the accuracy of models produced by comparative modeling will require methods that finely sample protein conformational space using a free energy or scoring function that has sufficient accuracy to distinguish the native structure from the nonnative conformations.
despite many years of development of molecular simulation methods, attempts to refine models that are already relatively close to the native structure have met with relatively little success. this failure is likely due in part to inaccuracies in the scoring functions used in the simulations, particularly in the treatment of electrostatics and solvation effects. a combination of physics-based energy functions with the statistical information extracted from known protein structures may provide a route to the development of improved scoring functions.

• improvements in sampling strategies are also likely to be necessary, for both comparative modeling and flexible docking.

given the increasing number of target sequences for which no experimentally determined structures are available, drug discovery stands to gain immensely from comparative modeling and other in silico methods. despite unsolved problems in virtually every step of comparative modeling and ligand docking, it is highly desirable to automate the whole process, starting with the target sequence and ending with a ranked list of its putative ligands. automation encourages development of better methods, improves their testing, allows application on a large scale, and makes the technology more accessible to both experts and non-specialists alike. through large-scale application, new questions, such as those about ligand-binding specificity, can in principle be addressed. enabling a wider community to use the methods provides useful feedback and resources toward the development of the next generation of methods. there are a number of servers for automated comparative modeling (table 1).
however, in spite of automation, the process of calculating a model for a given sequence, refining its structure, and visualizing and analyzing its family members in the sequence and structure spaces can involve the use of scripts, local programs, and servers scattered across the internet and not necessarily interconnected. in addition, manual intervention is generally still needed to maximize the accuracy of the models in the difficult cases. the main repository for precomputed comparative models, the protein model portal, 195, 198, 279 begins to address these deficiencies by serving models from several modeling groups, including the swiss-model 95 and modbase 34 databases. it provides access to web-based comparative modeling tools, cross-links to other sequence and structure databases, and annotations of sequences and their models. a number of databases containing comparative models and web servers for computing comparative models are publicly available. the protein model portal (pmp) 195, 198, 279 centralizes access to models created by different methodologies. the pmp is being developed as a module of the protein structure initiative knowledgebase (psi kb) 316 and functions as a meta-server for comparative models from external databases, including swiss-model 95 and modbase, 34 in addition to being a repository for comparative models derived from structures determined by the psi centers. it provides quality estimates of the deposited models, access to web-based comparative modeling tools, cross-links to other sequence and structure databases, annotations of sequences and their models, and detailed tutorials on comparative modeling and the use of these tools. the pmp currently contains 19.5 million comparative models for 4.4 million uniprot sequences (august 2013). a schematic of our own attempt at integrating several useful tools for comparative modeling is shown in figure 5.
34, 291 modbase is a database that currently contains ~29 million predicted models for domains in approximately 4.7 million unique sequences from uniprot, ensembl, 269 genbank, 14 and private sequence datasets. the models were calculated using modpipe 30, 291 and modeller. 35 the web interface to the database allows flexible querying for fold assignments, sequence-structure alignments, models, and model assessments. an integrated sequence-structure viewer, chimera, 304 allows inspection and analysis of the query results. models can also be calculated using modweb, 291,309 a web interface to modpipe, and stored in modbase, which also makes them accessible through the pmp. other resources associated with modbase include a comprehensive database of multiple protein structure alignments (dbali), 281 structurally defined ligand-binding sites, 319 structurally defined binary domain interfaces (pibase), 320,321 predictions of ligand-binding sites, interactions between yeast proteins, and functional consequences of human nssnps (ls-snp). 199, 322, 323 a number of associated web services handle modeling of loops in protein structures (modloop), 324, 325 evaluation of models (modeval), fitting of models against small-angle x-ray scattering (saxs) profiles (foxs), 326-328 modeling of ligand-induced protein dynamics such as allostery (allosmod), 329, 330 prediction of the ensemble of conformations that best fit a given saxs profile (allosmod-foxs), 331 prediction of cryptic binding sites, 332 and scoring of protein-ligand complexes. 335, 336

compared to protein structure prediction, the attempts at automation and integration of resources in the field of docking for virtual screening are still in their nascent stages. one of the successful efforts in this direction is zinc, 317,318 a publicly available database of commercially available drug-like compounds, developed in the laboratory of brian shoichet.
zinc contains more than 21 million 'ready-to-dock' compounds organized in several subsets and allows the user to query the compounds by molecular properties and constitution. the shoichet group also provides a dockblaster service 337 that enables end-users to dock the zinc compounds against their target structures using dock. 247, 338 in the future, we will no doubt see efforts to improve the accuracy of comparative modeling and ligand docking. but perhaps as importantly, the two techniques will be integrated into a single protocol for more accurate and automated docking of ligands against sequences without known structures. as a result, the number and variety of applications of both comparative modeling and ligand docking will continue to increase. related resources: cameo 195 (http://cameo3d.org/) and casp 315 (http://predictioncenter.llnl.gov). figure 5: an integrated set of resources for comparative modeling. 34 various databases and programs required for comparative modeling and docking are usually scattered over the internet, and require manual intervention or a good deal of expertise to be useful. automation and integration of these resources are efficient ways to put them in the hands of experts and non-specialists alike. we have outlined a comprehensive interconnected set of resources for comparative modeling and hope to integrate it with a similar effort in the area of ligand docking made by the shoichet group. 317, 318 this article is partially based on papers by jacobson and sali, 201 fiser and sali, 339 and madhusudhan et al. 340 we also acknowledge the funds from the sandler family supporting foundation, nih r01 gm54762, p01 gm71790, p01 a135707, and u54 gm62529, as well as sun, ibm, and intel for hardware gifts.
key: cord-020888-ov2lzus4 authors: formal, thibault; clinchant, stéphane; renders, jean-michel; lee, sooyeol; cho, geun hee title: learning to rank images with cross-modal graph convolutions date: 2020-03-17 journal: advances in information retrieval doi: 10.1007/978-3-030-45439-5_39 sha: doc_id: 20888 cord_uid: ov2lzus4 we are interested in the problem of cross-modal retrieval for web image search, where the goal is to retrieve images relevant to a text query. while most current approaches for cross-modal retrieval revolve around learning how to represent text and images in a shared latent space, we take a different direction: we propose to generalize the cross-modal relevance feedback mechanism, a simple yet effective unsupervised method that relies on standard information retrieval heuristics and the choice of a few hyper-parameters. we show that we can cast it as a supervised representation learning problem on graphs, using graph convolutions operating jointly over text and image features, namely cross-modal graph convolutions. the proposed architecture directly learns how to combine image and text features for the ranking task, while taking into account the context given by all the other elements in the set of images to be (re-)ranked. we validate our approach on two datasets: a public dataset from a mediaeval challenge, and a small sample of proprietary image search query logs, referred to as webq. our experiments demonstrate that our model improves over standard baselines. this paper considers the typical image search scenario, where a user enters a text query and the system returns a set of ranked images. more specifically, we are interested in re-ranking a subset of candidate images retrieved from the whole image collection by an efficient base ranker, following standard multi-stage ranking architectures in search engines [36].
directly including visual features in the ranking process is not straightforward, due to the semantic gap between text and images: this is why the problem was initially addressed using standard text-based retrieval, relying for instance on text crawled from the image's webpage (e.g. surrounding text, title of the page etc.). many techniques have since been developed to exploit visual information and thereby improve the quality of the results -especially since this text is generally noisy and hardly describes the image semantics. for instance, some works have focused on building similarity measures by fusing mono-modal similarities, using either simple combination rules or more complex propagation mechanisms in similarity graphs. more recently, techniques have emerged from the computer vision community in which text and images are embedded in the same latent space (a.k.a. joint embedding), allowing text queries to be matched directly to images. the latter are currently considered the state-of-the-art techniques for the cross-modal retrieval task. however, they are generally evaluated on artificial retrieval scenarios (e.g. on the mscoco dataset [34] ), and rarely considered in a re-ranking scenario, where mechanisms like pseudo-relevance feedback (prf) [31] are highly effective. we propose to revisit the problem of cross-modal retrieval in the context of re-ranking. our first contribution is to derive a general formulation of a differentiable architecture, drawing inspiration from cross-modal retrieval, learning to rank, neural information retrieval and graph neural networks. compared to joint embedding approaches, we tackle the problem from a different angle: instead of learning new (joint) embeddings, we focus on designing a model that learns to combine information from different modalities.
finally, we validate our approach on two datasets, using simple instances of our general formulation, and show that the approach is not only able to reproduce prf, but actually outperforms it. cross-modal retrieval. in the literature, two main lines of work can be distinguished regarding cross-modal retrieval: the first focuses on designing effective cross-modal similarity measures (e.g. [2, 10] ), while the second seeks to learn how to map images and text into a shared latent space (e.g. [15, 18, 19, 54] ). the first set of approaches simply combines different mono-media similarity signals, relying either on simple aggregation rules or on unsupervised cross-modal prf mechanisms, which depend on the choice of a few but critical hyper-parameters [2, 10, 11, 45] . as will be discussed in the next section, the latter can be formulated as a two-step prf propagation process in a graph, where nodes represent multi-modal objects and edges encode their visual similarities. it was later extended to more general propagation processes based on random walks [28] . alternatively, joint embedding techniques aim at learning a mapping between textual and visual representations [15, 18, 19, 23, 52-55, 61] . canonical correlation analysis (cca) [17] and its deep variants [5, 27, 58] , as well as bi-directional ranking losses [8, 9, 52, 53, 55, 61] (or triplet losses), ensure that, in the new latent space, an image and its corresponding text are correlated or close enough relative to the other images and pieces of text in the training collection. other objective functions utilize metric learning losses [35] , machine translation-based measures [44] or even adversarial losses [51] . these approaches suffer from several limitations [61] : they are sensitive to the triplet sampling strategy as well as to the choice of appropriate margins in the ranking losses.
moreover, constituting a training set that ensures good learning and generalization is not an easy task: the text associated with an image should describe its visual content (e.g. "a man speaking in front of a camera in a park"), and nothing else (e.g. "the president of the us, the 10th of march", "john doe", "joy and happiness"). building a universal training collection of paired (image, text) instances, where the text faithfully describes the content of the image in terms of elementary objects and their relationships, would be too expensive and time-consuming in practice. consequently, image search engines rely on such pairs crawled from the web, where the link between image and text (e.g. image caption, surrounding sentences etc.) is tenuous and noisy. to circumvent this problem, query logs could be used but, unfortunately -and this is our second argument regarding the limitations-, real queries are never expressed in the same way as the ones considered when evaluating joint embedding methods (e.g. the artificial retrieval setting on the mscoco [34] or flickr30k [43] datasets, where the query is the full canonical textual description of the image). in practice, queries are characterised by very large intent gaps: they do not really describe the content of the image but, most of the time, contain only a few words, and are far from expressing the true visual need. what does it mean to impose close representations for all images representing "paris" (e.g. "the eiffel tower", "louvre museum"), even if they can be associated with the same textual unit? neural information retrieval. neural networks such as ranknet and lambdarank have been used intensively in ir to address the learning to rank task [7] . more recently, there has been growing interest in designing effective ir models with neural approaches [1, 12, 13, 20, 25, 26, 37, 38, 41, 56] , learning the features useful for the ranking task directly from text.
while standard strategies focus on learning a global ranking function that considers each query-document pair in isolation, they tend to ignore the difference in feature distributions across queries [4] . hence, some recent works have focused on designing models that exploit the context induced by the re-ranking paradigm, either by explicitly designing differentiable prf models [32, 40] , or by encoding the ranking context -the set of elements to re-rank- using either rnns [4] or attention mechanisms [42, 62] . consequently, the score for a document takes into account all the other documents in the candidate list. because of their resemblance to structured problems, this type of approach could benefit from the recent body of work on graph neural networks, which operate on graphs by learning how to propagate information to neighboring nodes. graph neural networks. graph neural networks (gnns) are extensions of neural networks that deal with structured data encoded as a graph. recently, graph convolutional networks (gcns) [30] have been proposed for semi-supervised classification of nodes in a graph. each layer of a gcn can generally be decomposed as follows: (i) node features are first transformed (e.g. by a linear mapping); (ii) node features are convolved, meaning that for each node, a differentiable, permutation-invariant operation (e.g. sum, mean, or max) over its neighbouring node features is computed, before applying some non-linearity; (iii) finally, we obtain a new representation for each node in the graph, which is then fed to the next layer. many extensions of gcns have been proposed (e.g. graphsage [21] , graph attention networks [50] , graph isomorphism networks [57] ), some of them directly tackling the recommendation task (e.g. pinsage [59] ). to the best of our knowledge, however, there is no prior work on using graph convolutions for the (re-)ranking task.
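the three-step gcn layer just described (transform, permutation-invariant neighbourhood aggregation, non-linearity) can be sketched in a few lines of numpy; the mean aggregator, self-loops and toy dimensions are illustrative choices, not a reference implementation:

```python
import numpy as np

def gcn_layer(H, A, W):
    """one graph-convolution layer: transform node features,
    average over neighbours (mean aggregation), apply relu."""
    Z = H @ W                                  # (i) linear transformation
    A_hat = A + np.eye(A.shape[0])             # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)     # node degrees
    H_new = (A_hat @ Z) / deg                  # (ii) permutation-invariant mean
    return np.maximum(H_new, 0.0)              # (iii) non-linearity

# toy path graph: 3 nodes, 2 input features, 4 output features
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
H = np.ones((3, 2))
W = np.ones((2, 4))
out = gcn_layer(H, A, W)
```

in practice such layers are stacked, and libraries like pytorch geometric provide optimized sparse implementations of the same pattern.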
our goal is to extend and generalize simple yet effective unsupervised approaches which have been proposed for the task [2, 3, 10, 11, 45] , and which can be seen as an extension of pseudo-relevance feedback methods for multi-modal objects. let d ∈ D denote a document to re-rank, composed of text and image. we denote by s_v(., .) a normalized similarity measure between two images, and by s_t(q, d) the textual relevance score of document d w.r.t. query q. the cross-modal similarity score is then given by s_{cm}(q, d) = \sum_{d' \in nn_k^t(q)} s_v(d, d') \, s_t(q, d') (1), where nn_k^t(q) denotes the set of k most relevant documents w.r.t. q, based on text, i.e. on s_t(q, .). the model can be understood very simply: similarly to prf methods in standard information retrieval, the goal is to boost images that are visually similar to the top images (from a text point of view), i.e. images that are likely to be relevant to the query but were initially badly ranked (which is likely to happen in the web scenario, where text is crawled from the source page and can be very noisy). despite showing good empirical results, cross-modal similarities are fully unsupervised, and lack dynamic behaviour, such as the ability to adapt to different queries. moreover, they rely on a single relevance score s_t(q, .), while it could actually be beneficial to learn how to use a larger set of features, such as the ones employed in learning to rank models. in [3] , the authors drew a parallel between the cross-modal similarity from eq. (1) and random walks in graphs: it can be seen as a kind of multimodal label propagation in a graph. this motivates us to tackle the task using graph convolutions. we therefore represent each query q ∈ Q as a graph g_q, as follows: -the set of nodes is the set of candidate documents d_i to be re-ranked for this query: typically from a few to hundreds of documents, depending on the query. -each node i is described by a set of n learning to rank features x_{q,d_i} ∈ R^n ; v_i ∈ R^d denotes the (normalized) visual embedding for document d_i .
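the cross-modal prf mechanism of eq. (1) fits in a few lines of numpy; weighting the feedback documents by their text score is one common instantiation, and the toy similarity matrix below is made up for illustration:

```python
import numpy as np

def cross_modal_score(s_t, S_v, k):
    """cross-modal prf scores, one reading of eq. (1):
    s_cm(q, d_i) = sum over d_j in nn_k^t(q) of s_v(d_i, d_j) * s_t(q, d_j).

    s_t : (n,) text relevance scores s_t(q, d_i)
    S_v : (n, n) normalized visual similarities s_v(d_i, d_j)
    k   : number of text-based top documents used as feedback
    """
    top_k = np.argsort(-s_t)[:k]        # nn_k^t(q): k best docs by text score
    return S_v[:, top_k] @ s_t[top_k]

# toy example: doc 1 has no text match but is visually close to the top doc 0
s_t = np.array([1.0, 0.0, 0.0])
S_v = np.array([[1.0, 0.9, 0.1],
                [0.9, 1.0, 0.0],
                [0.1, 0.0, 1.0]])
scores = cross_modal_score(s_t, S_v, k=1)
```

the badly-text-ranked but visually similar document (doc 1) ends up ranked above the visually dissimilar one (doc 2), which is exactly the boosting behaviour described above.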
-as we do not have an explicit graph structure, we consider edges given by a k-nearest neighbor graph, based on a similarity between the embeddings v_i 1 . -we denote by n_i the neighborhood of node i, i.e. the set of nodes j such that there exists an edge from j to i. -we consider edge weights, given by a similarity function between the visual features of the two extremity nodes. our goal is to learn how to propagate features in the above graph. generalizing convolution operations to graphs can generally be expressed as a message passing scheme [16] : h_i^{(l+1)} = \gamma( h_i^{(l)}, \bigoplus_{j \in n_i} \varphi(h_i^{(l)}, h_j^{(l)}, e_{ij}) ) (3), where \gamma and \varphi denote differentiable functions, e.g. mlps (multi-layer perceptrons), and \bigoplus is a permutation-invariant aggregation. by choosing \varphi(h_i^{(l)}, h_j^{(l)}, e_{ij}) = s_v(d_i, d_j) \, s_t(q, d_j) \, 1[d_j \in nn_k^t(q)], a sum aggregation, and n_i := n, the whole set of candidates to re-rank, this graph convolution reduces to the cross-modal similarity in eq. (1). in other words, one layer defined with eq. (3) includes the standard cross-modal relevance feedback as a special case. equation (3) is more general, and can easily be used as a building block in a differentiable ranking architecture. in the following, we derive a simple convolution layer from eq. (3), and introduce the complete architecture -called dcmm, for differentiable cross-modal model-, summarized in fig. 1 . learning to rank features x_{q,d_i} are first encoded with an mlp(.; θ) with relu activations, in order to obtain node features h_i^{(0)} . then, the network splits into two branches: -the first branch simply projects each h_i^{(0)} linearly to a score that acts as a pure text-based score 2 . -the second branch is built upon one or several layer(s) of cross-modal convolution, simply defined as h_i^{(l+1)} = relu( w^{(l)} \sum_{j \in n_i} g(v_i, v_j) \, h_j^{(l)} ). for the edge function g, we consider two cases: the cosine similarity g_cos, defining the first model (referred to as dcmm-cos), and a simple learned similarity measure parametrized by a vector a, g_a(v_i, v_j) = v_i^T diag(a) v_j, defining the second model (dcmm-edge). after the convolution(s), the final embedding of each node h_i^{(l)} is projected to a real-valued score s_conv(q, d_i), using either a linear layer or an mlp(.; ω).
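the cross-modal convolution layer can be sketched as follows; this is a simplified, loop-based reading of the dcmm-cos layer (numpy instead of pytorch, toy shapes), not the authors' implementation:

```python
import numpy as np

def cosine_edges(V):
    """edge function g_cos(v_i, v_j): cosine similarity of visual embeddings."""
    Vn = V / np.linalg.norm(V, axis=1, keepdims=True)
    return Vn @ Vn.T

def dcmm_conv(H, V, W, neighbours):
    """one cross-modal convolution: each node sums its neighbours'
    linearly transformed features, weighted by visual similarity,
    followed by a relu."""
    G = cosine_edges(V)
    H_new = np.zeros((H.shape[0], W.shape[1]))
    for i, nbrs in enumerate(neighbours):
        msg = sum(G[i, j] * (H[j] @ W) for j in nbrs)
        H_new[i] = np.maximum(msg, 0.0)
    return H_new

# toy graph: nodes 0 and 1 share a visual direction, node 2 is orthogonal
V = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
H = np.ones((3, 2))
W = np.ones((2, 2))
out = dcmm_conv(H, V, W, neighbours=[[1], [0], [0]])
```

node 2 receives no signal because its only neighbour is visually orthogonal to it, illustrating how the edge weights gate the propagation.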
finally, the two scores are combined (in the simplest case, summed) to obtain the final ranking score s(q, d_i). the model is trained using backpropagation and any standard learning to rank loss: pointwise, pairwise or listwise. it is worth remarking that, by extending prf mechanisms for cross-modal re-ranking, our model is actually closer to the listwise context-based models introduced in sect. 2 than to current state-of-the-art cross-modal retrieval models. it is listwise by design 3 : an example in a batch is not a single image in isolation, but the whole set of candidate images for a given query, encoded as a graph, that we aim to re-rank together in a one-shot manner. in our experiments, we used the pairwise bpr loss [46] , from which we obtained the best results 4 . consider a graph (i.e. the set of candidate documents for query q) in the batch, and all the feasible pairs of documents (d_q^+, d_q^-) for this query (by feasible, we mean all the pairs that can be formed from positive and negative examples in the graph). the loss is then defined as l = - \sum_{(d_q^+, d_q^-)} \log \sigma( s(q, d_q^+) - s(q, d_q^-) ), where \sigma denotes the sigmoid function. note that contrary to previous works on listwise context modeling, we consider a set of objects to re-rank, and not a sequence (for instance, in [4] an rnn encoder is learned for re-ranking). in other words, we discard the rank information of the first ranker in the re-ranking process: we argue that the role of the first retriever is to be recall-oriented, not precision-oriented. thus, using the initial order might be too strong a prior, and add noisy information. moreover, in the case of implicit feedback (clicks used as weak relevance signals), using rank information raises the issue of biased learning to rank (sensitivity to position and trust biases).
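the bpr loss over all feasible pairs of one graph can be sketched directly:

```python
import numpy as np

def bpr_loss(scores, labels):
    """pairwise bpr loss over all feasible (positive, negative) pairs
    within one graph, i.e. the candidate list of a single query."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diffs = pos[:, None] - neg[None, :]          # every positive vs every negative
    return float(np.log1p(np.exp(-diffs)).mean())  # mean of -log sigmoid(diff)

scores = np.array([2.0, 0.0])
labels = np.array([1, 0])
loss = bpr_loss(scores, labels)  # -log sigmoid(2 - 0), about 0.127
```

when the positive document is scored above the negative one the loss is small, and it grows when the pair is inverted, which is what drives the pairwise training signal.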
it is also worth emphasizing that, contrary to most works on graph convolution models, our graph structure is somewhat implicit: while edges between nodes generally indicate an explicit relationship (for instance, a connection between two users in a social network), in our case an edge represents the visual similarity between two nodes. in the following, we introduce the two datasets used to validate our approach -a public dataset from a mediaeval 5 challenge, and an annotated set of queries sampled from the image search logs of naver, the biggest commercial search engine in korea-, as well as our experimental strategy. we emphasize that we restrict ourselves to two relatively small datasets and few features as input for the models. even though the formulation from eq. (3) is very general, our claim is that a simple model, i.e. one containing a few hundred to a few thousand parameters, should be able to reproduce the prf mechanisms introduced in sect. 3. when adapting the approach to larger datasets, the model capacity can be adjusted accordingly, in order to capture more complex relevance patterns. note that we did not consider in our study the standard datasets generally used to train joint embeddings, such as mscoco [34] or flickr30k [43] , because their retrieval scenario is rather artificial compared to web search: there are no explicit queries, and a text is only relevant to a single image. furthermore, we tried to obtain the clickture [24] dataset without success 6 , and therefore cannot report on it. mediaeval. we first conduct experiments on the dataset from the "mediaeval17, retrieving diverse social images task" challenge 7 . while this challenge also had a focus on diversity aspects, we solely consider the standard relevance ranking task. the dataset is composed of a ranked list of images (up to 300) for each query, retrieved from flickr using its default ranking algorithm. the queries are general-purpose queries (e.g.
q = "autumn color"), and each image has been annotated by expert annotators (binary label, i.e. relevant or not). the goal is to refine the results of the base ranking. the training set contains 110 queries for 33340 images, while the test set contains 84 queries for 24986 images. while we could consider any number of learning to rank features as input for our model, we choose to restrict ourselves to a very narrow set of weak relevance signals, in order to remain comparable to the unsupervised counterpart, and to ensure that the gain does not come from the addition of richer features. hence, we rely solely on four relevance scores, namely tf-idf, bm25, dirichlet-smoothed lm [60] and the desm score [39] , between the query and each image's text component (the concatenation of the image title and tags). we use an inception-resnet model [48] pre-trained on imagenet to get the image embeddings (d = 1536). webq. in order to validate our approach on a real-world dataset, we sample a set of 1000 queries 8 from the image search logs of naver. all images appearing in the top-50 candidates for these queries within a period of two weeks have been labeled by three annotators in terms of relevance to the query (binary label). because of different query characteristics (in terms of frequency, difficulty etc.), and given that new images are continuously added to/removed from the index, the number of images per query in our sample is variable (from around ten to a few hundred). note that, while we actually have access to a much larger amount of click logs, we choose to restrict the experiments to this small sample in order to keep the evaluations simple. our goal here is to show that we are able to learn and reproduce some prf mechanisms without relying on large amounts of data. moreover, in this setting, it is easier to understand the model's behaviour, as we avoid dealing with click noise and position bias.
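as an illustration, the bm25 feature between a query and an image's text component could be computed as below; this is the standard okapi formulation with common default parameters (the paper does not specify the values it used):

```python
import math

def bm25(query, doc, df, N, avgdl, k1=1.2, b=0.75):
    """okapi bm25 between a tokenized query and one tokenized document
    field (here, the concatenation of image title and tags). df maps a
    term to its document frequency in a collection of N documents."""
    score, dl = 0.0, len(doc)
    for t in set(query):
        tf = doc.count(t)
        if tf == 0:
            continue
        idf = math.log(1.0 + (N - df[t] + 0.5) / (df[t] + 0.5))
        score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * dl / avgdl))
    return score

doc = ["autumn", "color", "tree"]
s_match = bm25(["autumn", "color"], doc, {"autumn": 2, "color": 5}, N=100, avgdl=3.0)
s_miss = bm25(["winter"], doc, {}, N=100, avgdl=3.0)
```

such weak per-field scores are exactly the kind of node features the model takes as input.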
after removing queries without relevant images (according to majority voting among the three annotators), our sample includes 952 queries and 43064 images, indexed through various text fields (title of the page, image caption etc.). we select seven such fields that might contain relevant pieces of information, and for each we compute two simple relevance features w.r.t. query q: bm25 and desm [39] (using embeddings trained on a large query corpus from an anterior period). we also add an additional feature, which is a mixture of the two above, computed on the concatenation of all the fields. image embeddings (d = 2048) are obtained using a resnet-152 model [22] pre-trained on imagenet. given the limited number of queries in both collections, we conducted 5-fold cross-validation, by randomly splitting the queries into five folds. the model is trained on four folds (with one fold kept for validation, as we use early stopping on ndcg), and evaluated on the remaining one; this procedure is repeated five times. the average validation ndcg is then used to select the best model configuration. note that for the mediaeval dataset we have access to a separate test set, so we slightly modify the evaluation methodology: we run the above 5-fold cross-validation on the training set, without using a validation fold (hence, we do not use early stopping, and the number of epochs is a hyperparameter to tune). once the best model has been selected with the above strategy, we re-train it on the full training set, and report the final performance on the test set. we report ndcg, map, p@20, and ndcg@20 for both datasets. we train the models using stochastic gradient descent with the adam optimizer [29] . we set the batch size (i.e. the number of graphs per batch) for each dataset; for mediaeval, we also tune the number of epochs ∈ {50, 100, 200, 300, 500}, while for webq we set it to 500 and use early stopping with patience set to 80. all node features are query-level normalized (mean-std normalization).
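the query-level normalization mentioned above amounts to standardizing each feature within each graph rather than over the whole dataset; a minimal sketch:

```python
import numpy as np

def query_level_normalize(X, query_ids):
    """mean-std normalization of learning-to-rank features, computed
    per query (i.e. per graph) rather than globally."""
    X = np.array(X, dtype=float)
    for q in np.unique(query_ids):
        m = query_ids == q
        mu, sd = X[m].mean(axis=0), X[m].std(axis=0) + 1e-8  # guard constants
        X[m] = (X[m] - mu) / sd
    return X

X = [[1.0, 10.0], [3.0, 30.0], [5.0, 50.0], [0.0, 0.0], [2.0, 4.0]]
Xn = query_level_normalize(X, np.array([0, 0, 0, 1, 1]))
```

this removes per-query score offsets, which is important since absolute feature scales (e.g. bm25 values) are not comparable across queries.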
the models are implemented using pytorch and pytorch geometric 9 [14] for the message passing components. in order to be fair, we compare methods with somewhat similar feature sets. obviously, for the supervised methods, results can be improved by either adding richer/more features or increasing the models' capacity. for both datasets, we compare our dcmm model to the following baselines: -a learning to rank model based only on textual features (ltr). -the cross-modal similarity introduced in sect. 3.1 [2, 3, 10, 11, 45] (cm). -the above ltr model with the cross-modal similarity as an additional input feature (ltr+cm), to verify that it is actually beneficial to learn the cross-modal propagation in dcmm in an end-to-end manner. for the cross-modal similarity, we use as a proxy for s_t(q, .) a simple mixture of a term-based relevance score (dirichlet-smoothed lm and bm25 for mediaeval and webq, respectively) and the desm score, on a concatenation of all text fields. from our experiments, we observe that it is actually beneficial to recombine the cross-modal similarity with the initial relevance s_t(q, .), using a simple mixture. hence, three parameters are tuned (the two mixture parameters, and the number of neighbors for the query), following the evaluation methodology introduced in sect. 4.2 10 . the ltr models are standard mlps: they correspond to the upper part of the architecture in fig. 1 (text branch), and are tuned following the same strategy. we do not compare our models with joint embedding approaches on these datasets, for the reasons mentioned in sect. 2, but also because our initial experiments on mediaeval gave poor results. for the sake of illustration, on mediaeval, 64% of the queries have no lemmas in common with training queries (and 35% for webq): given the relatively small size of these datasets, such models cannot generalize to unseen queries.
this illustrates an "extreme" example of the generalization issues -especially on tail queries- of joint embedding techniques. in the meantime, as our model is fed with learning to rank features, especially term-based relevance scores like bm25, it could be less sensitive to generalization issues, for instance on new named entities. however, we want to emphasize that the two approaches are not antagonistic, but can actually be complementary. as our model can be seen as an extension of listwise learning to rank for bi-modal objects (if edges are removed, the model reduces to a standard mlp-based learning to rank model), it can take matching scores from joint embedding models as input node features. the model being an extension of prf, we actually see the approaches as operating at different stages of ranking. table 1 gathers the main results of our study. unsurprisingly, going from a pure text ranker to a model using both media types improves the results by a large margin (all the models are significantly better than the text-based ltr model, so we do not include these tests in table 1 for clarity). moreover, the results indicate that combining the initial features with the unsupervised cross-modal similarity in an ltr model slightly improves results over the latter (not significantly though) on the mediaeval dataset, while it has no effect on webq: this is likely because the features are somewhat redundant in our setting, due to how s_t(q, .) is computed for the cross-modal similarity; the same would not hold if we considered a richer set of features for the ltr models. furthermore, the dcmm-cos model outperforms all the baselines, with larger margins for mediaeval than for webq; the only significant result (p-value < 0.05) is obtained for the map on mediaeval. nevertheless, it shows that this simple architecture -the most straightforward extension of the cross-modal similarity introduced in sect.
3.1-, with a handful of parameters (see table 1 ) and trained on small datasets, is able to reproduce prf mechanisms. interestingly, results tend to drop as we increase the number of layers (best results are obtained with a single convolution layer), no matter the number of neighbors chosen to define the visual graph. while this might be related to the relative simplicity of the model, it actually echoes common observations in prf models (e.g. [3] ): if we propagate too much, we tend to diffuse information too much. similarly, we can draw a parallel with over-smoothing in gnns [33] , which might be even more critical for prf, especially considering the simplicity of this model. the dcmm-edge model shows interesting results: on webq, we manage to improve results significantly w.r.t. cm, while on mediaeval, results are slightly worse than dcmm-cos (except for the map). this might be because the images in the latter are more similar to the ones used to train the image signatures, compared to the (noisy) web images in webq; hence, learning a new metric between images has less impact. interestingly, for both datasets, the best results are obtained with more than a single layer; we hypothesize that the edge function plays the role of a simple filter on edges, allowing information to be propagated from useful nodes across more layers. note that the number of layers needed for the task is tied to how we define the input graph: the fewer neighbors we consider for each node, the more layers might be needed in order for each node to gather information from useful nodes. in fig. 2 , we observe that if the number of neighbors is too small (e.g. 3 or 5), the model needs more layers to improve performance. on the other hand, when considering too many neighbors (e.g. 20 or all), the nodes already have access to all the useful neighbors, hence adding layers only reduces performance.
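the input graph underlying this neighbors-versus-layers trade-off is a plain k-nearest-neighbour search over the visual embeddings; a sketch:

```python
import numpy as np

def knn_graph(V, k):
    """k-nearest-neighbour graph over visual embeddings: node i receives
    edges from its k most cosine-similar nodes (self excluded)."""
    Vn = V / np.linalg.norm(V, axis=1, keepdims=True)
    S = Vn @ Vn.T
    np.fill_diagonal(S, -np.inf)            # a node is not its own neighbour
    return np.argsort(-S, axis=1)[:, :k]

V = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
nbrs = knn_graph(V, k=1)                    # (n, k) neighbour indices
```

with a small k, information from a useful but distant node can only arrive after several convolution layers, which is the balance discussed above.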
we need to find the right balance between the number of neighbors and the number of convolution layers, so that the model can learn to propagate relevant signals (e.g. 10 neighbors and 3 layers for webq). in this paper, we have proposed a reformulation of unsupervised cross-modal prf mechanisms for image search as a differentiable architecture relying on graph convolutions. compared to its unsupervised counterpart, our novel approach can integrate any set of features, while providing high flexibility in the design of the architecture. experiments on two datasets showed that a simple model derived from our formulation achieves comparable or better performance compared to cross-modal prf. there are many extensions and possible directions stemming from the relatively simple model we have studied. given enough training data (e.g. a large amount of click logs), we could for instance learn to dynamically filter the visual similarity by using an attention mechanism to choose which nodes to attend to, similarly to graph attention networks [50] and the transformer model [49] , discarding the need to set the number of neighbors in the input graph. finally, our approach directly addressed the cross-modal retrieval task, but its application to the more general prf problem in ir remains possible.
references:
[1] learning deep structured semantic models for web search using clickthrough data
[2] xrce's participation to imageclef
[3] unsupervised visual and textual information fusion in cbmir using graph-based methods
[4] learning a deep listwise context model for ranking refinement
[5] deep canonical correlation analysis
[6] revisiting approximate metric optimization in the age of deep neural networks
[7] from ranknet to lambdarank to lambdamart: an overview
[8] crossmodal retrieval in the cooking context: learning semantic text-image embeddings
[9] amc: attention guided multimodal correlation learning for image search
[10] trans-media pseudo-relevance feedback methods in multimedia retrieval
[11] unsupervised visual and textual information fusion in multimedia retrieval - a graph-based point of view
[12] convolutional neural networks for soft-matching n-grams in ad-hoc search
[13] modeling diverse relevance patterns in ad-hoc retrieval
[14] fast graph representation learning with pytorch geometric
[15] devise: a deep visual-semantic embedding model
[16] neural message passing for quantum chemistry
[17] a multi-view embedding space for modeling internet images, tags, and their semantics
[18] improving image-sentence embeddings using large weakly annotated photo collections
[19] beyond instance-level image retrieval: leveraging captions to learn a global visual representation for semantic retrieval
[20] a deep relevance matching model for ad-hoc retrieval
[21] inductive representation learning on large graphs
[22] deep residual learning for image recognition
[23] scalable deep multimodal learning for cross-modal retrieval
[24] clickage: towards bridging semantic and intent gaps via mining click logs of search engines
[25] a position-aware deep model for relevance matching in information retrieval
[26] re-pacrr: a context and density-aware neural information retrieval model
[27] multi-view deep network for cross-view classification
[28] multi-modal image retrieval with random walk on multi-layer graphs
[29] adam: a method for stochastic optimization
[30] semi-supervised classification with graph convolutional networks
[31] relevance based language models
[32] nprf: a neural pseudo relevance feedback framework for ad-hoc information retrieval
[33] deeper insights into graph convolutional networks for semi-supervised learning
[34] microsoft coco: common objects in context
[35] deep coupled metric learning for cross-modal matching
[36] cascade ranking for operational e-commerce search
[37] an updated duet model for passage re-ranking
[38] learning to match using local and distributed representations of text for web search
[39] a dual embedding space model for document ranking
[40] task-oriented query reformulation with reinforcement learning
[41] deeprank: a new deep architecture for relevance ranking in information retrieval
[42] personalized context-aware re-ranking for e-commerce recommender systems
[43] flickr30k entities: collecting region-to-phrase correspondences for richer image-to-sentence models
[44] cross-modal bidirectional translation via reinforcement learning
[45] nle@mediaeval'17: combining cross-media similarity and embeddings for retrieving diverse social images
[46] bpr: bayesian personalized ranking from implicit feedback
[47] dropout: a simple way to prevent neural networks from overfitting
[48] inception-v4, inception-resnet and the impact of residual connections on learning
[49] attention is all you need
[50] graph attention networks
[51] adversarial cross-modal retrieval
[52] learning two-branch neural networks for image-text matching tasks
[53] learning deep structure-preserving image-text embeddings
[54] wsabie: scaling up to large vocabulary image annotation
[55] learning semantic structure-preserved embeddings for cross-modal retrieval
[56] end-to-end neural ad-hoc ranking with kernel pooling
[57] how powerful are graph neural networks?
key: cord-169288-aeyz2t6c authors: runvik, haakan; medvedev, alexander; eriksson, robin; engblom, stefan title: initialization of a disease transmission model date: 2020-07-17 journal: nan doi: nan sha: doc_id: 169288 cord_uid: aeyz2t6c approaches to the calculation of the full state vector of a larger epidemiological model for the spread of covid-19 in sweden at the initial time instant from available data and with a simplified dynamical model are proposed and evaluated. the larger epidemiological model is based on a continuous markov chain and captures the demographic composition of and the transport flows between the counties of sweden. its intended use is to predict the outbreak development in temporal and spatial coordinates as well as across the demographic groups. it can also support evaluating and comparing prospective intervention strategies in terms of e.g. lockdown in certain areas or isolation of specific age groups. the simplified model is a discrete time-invariant linear system that has cumulative infectious incidence, infected population, asymptomatic population, exposed population, and infectious pressure as the state variables. since the system matrix of the model depends on a number of transition rates, structural properties of the model are investigated for suitable parameter ranges. it is concluded that the model becomes unobservable for some parameter values. two contrasting approaches to the initial state estimation are considered. one is a version of the rauch-tung-striebel smoother and the other is based on solving a batch nonlinear optimization problem.
the benefits and shortcomings of the considered estimation techniques are analyzed and compared on synthetic data for several swedish counties. this paper is concerned with using publicly available epidemiological data for estimating suitable initial conditions for a large mechanistic general susceptible-exposed-infectious-recovered (seir) model of the swedish covid-19 outbreak. the model incorporates spatial communication between the swedish municipalities, and also includes the swedish demographics, thought to be an important factor for the impact of covid-19, keeling and rohani (2008). the viral contraction is driven by an infectious pressure as in widgren et al. (2018); engblom et al. (2019). fig. 1 provides an overview of the modeling approach and specifies the included compartments. the dynamics of the disease transmission are modeled by a discrete-state continuous-time markov chain. a continuous state variable, the environmental compartment, is included to model the infectious pressure. the markov chain model is implemented using the computational framework siminf in r, widgren et al. (2019). to infer the model parameters, the aim is to utilize a bayesian approach, as it allows the use of empirical measures as prior knowledge of the model parameters. the problem of estimating the state vector of a dynamical system backwards in time is known as smoothing. an optimal (minimum-variance) fixed-interval smoother for a linear time-invariant model under an additive gaussian noise assumption was derived in rauch et al. (1965). since then, various methods have been devised for more general settings, including state-dependent gaussian noise (aravkin and burke (2012)) and non-gaussian noise sources (wang et al. (2020)). in the present work, these two complications occur in combination, as the process noise is poisson-distributed rather than gaussian, and also dependent on the plant state. therefore, none of the approaches found in the literature is readily applicable here.
instead, to obtain a plausible solution quickly, empirical initialization algorithms are developed and compared to determine which one is most suitable in the final setup. to establish ground truth, synthetic data produced by models of increasing complexity are utilized in the performance evaluation. the rest of the paper is organized as follows. first, the model initialization problem is formulated and the properties of the linear time-invariant model that is used to calculate the initial condition are explored. then, three model-based approaches to solving the initialization problem are presented. finally, performance of the considered approaches is evaluated on synthetic data and conclusions are drawn. the inputs to the markov chain model are the parameters inferred from data and an initial chain state. the initial state consists of the epidemiological states in all compartments, including the hidden states, i.e., the exposed and asymptomatic carriers. to find a county-wise initialization, specific to the swedish covid-19 outbreak, the cumulative infected cases data reported by the swedish public health agency were employed, folkhälsomyndigheten (2020a). in sweden, a full disease testing strategy was in effect until march 12, after which the testing was heavily restricted, folkhälsomyndigheten (2020b). with full testing, we assume that the reported cases hold the true number of cumulative infected cases. an accepted standard in stochastic epidemiological modeling is to start simulations when the system has reached some (fairly large) threshold number, allen (2017); giordano et al. (2020). we used the threshold of 100 reported cases, which sweden reached on march 6; the data up until march 12 can therefore be used for smoothing. the problem of estimating the infected, exposed, and asymptomatic populations at a given point in time (model initialization point) is therefore investigated, based on the data for cumulative incidence measured over a fixed time horizon.
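the 100-case threshold rule described above is easy to operationalize; a minimal sketch (hypothetical data, not the authors' code) locates the initialization day as the first day on which the cumulative case count reaches the threshold:

```python
def initialization_day(cumulative_cases, threshold=100):
    """return the index of the first day on which the cumulative
    reported case count reaches the threshold, or None if it never does."""
    for day, count in enumerate(cumulative_cases):
        if count >= threshold:
            return day
    return None

# hypothetical cumulative incidence series (one entry per day)
cases = [1, 3, 8, 21, 55, 90, 143, 210]
start = initialization_day(cases)  # first day with >= 100 cumulative cases
```

with the series above, day 6 (count 143) is the first to reach the threshold, so simulation would start there.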
thus, the problem at hand constitutes a fixed-interval smoothing problem. the remaining compartments of the markov chain model do not influence the infected, exposed or asymptomatic populations and are therefore not included at present in the considered estimation problem. epidemiological mathematical models are typically designed in terms of populations and face difficulties in capturing situations when only a few individuals are infected. this is naturally the case at the beginning of an outbreak. besides, an epidemic is not readily recognized until the number of patients in the healthcare system becomes significant, thus making initial data scarce and unreliable. yet, since disease transmission is a dynamical process, a mathematical model of it has to be initialized so that historical data for the observed output agree well with the output produced by the model. as there were no deaths from the disease and very few individuals were in intensive care prior to the chosen point of initialization, the measurements that are used as input to the markov chain model cannot be used for its initialization. instead, reported county-wise cumulative incidence from the period of february 4th to march 12th 2020 is utilized. as contact tracing was discontinued after this period, incidence data from later times are significantly less reliable. since direct inversion of a continuous markov chain is not easily accomplished, the following linear time-invariant approximation is utilized for the initialization of the model for each county, while the model states are lumped over the considered age groups. the latter simplification is introduced since the cases were few in the beginning of the outbreak and patient age was not specified in the data. the model is derived as a normal approximation of the poisson-distributed forward steps and formulated in state-space form as x_k+1 = f x_k + w_k, (1) where k is the discrete time corresponding to daily sampling and w_k is the process noise sequence, whose properties will be clarified in section 2.3. the elements of the state vector x_k stand for the populations of the model compartments: the cumulative infectious incidence, the infected, asymptomatic and exposed populations, and the infectious pressure. the parameters of the model are specified below:
σ: expected rate of transition from the exposed state;
γ_a: expected rate of transition from the asymptomatic state;
γ_i: expected rate of transition from the infected state;
f_0: fraction of transitions from the exposed state reaching the infected state (the remaining fraction reaches the asymptomatic state);
f_1: fraction of transitions from the asymptomatic state reaching the infected state (the remaining fraction corresponds to recovery from the disease, not included in (1));
β: indirect transmission rate of the environmental infectious pressure;
ρ: infectious pressure decay rate;
θ_a: asymptomatic viral shedding rate;
θ_e: exposed viral shedding rate.
the parameters are positive and so are the elements of the state matrix f. therefore, model (1) is also positive, i.e. the state vector belongs to the positive orthant provided that the initial condition x_0 and w_k, k = 0, 1, ..., do. the latter condition restricts the distribution of the process noise. to obtain the parameter values for model (1), prior distributions for the bayesian parameter estimation algorithm of the markov chain model are utilized. the prior distributions are based on empirical data or published estimates. for parameter values from these distributions, the matrix f tends to have one eigenvalue with magnitude larger than one and is therefore unstable. this is expected, since exponential growth is observed during the early phase of a disease outbreak. since the cumulative incidence is the only measured signal, the output of the model is y_k = h x_k + v_k, (2) where h = [1 0 0 0 0], and v_k is the measurement noise with zero mean and variance r_k.
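one plausible discretization of the compartment flows described above can be sketched as follows (the matrix layout and parameter values are my own reading of the model, not the authors' exact matrix); it illustrates the positivity of f and of the resulting trajectories:

```python
import numpy as np

# hypothetical parameter values, chosen in the spirit of the priors
sigma, gamma_a, gamma_i = 0.5, 0.25, 0.2   # transition rates
f0, f1 = 0.5, 0.3                          # branching fractions
beta, rho = 0.5, 0.5                       # pressure coupling / decay
theta_a, theta_e = 0.6, 0.3                # shedding rates

# state ordering: [cumulative incidence, infected, asymptomatic, exposed, pressure]
F = np.array([
    [1, 0,           f1 * gamma_a, f0 * sigma,       0      ],  # incidence accrues new infections
    [0, 1 - gamma_i, f1 * gamma_a, f0 * sigma,       0      ],  # infected population
    [0, 0,           1 - gamma_a,  (1 - f0) * sigma, 0      ],  # asymptomatic population
    [0, 0,           0,            1 - sigma,        beta   ],  # exposed, driven by the pressure
    [0, 0,           theta_a,      theta_e,          1 - rho],  # infectious pressure
])

x = np.array([0.0, 0.0, 0.0, 10.0, 1.0])   # start from a small exposed seed
traj = [x]
for _ in range(20):                        # noise-free propagation x_{k+1} = F x_k
    traj.append(F @ traj[-1])
traj = np.array(traj)
```

since all entries of f are nonnegative, the trajectory stays in the positive orthant and the cumulative incidence (first state) is nondecreasing, as the positivity argument in the text requires.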
the introduction of measurement noise is a matter of complying with the standard assumptions of kalman filtering and not an actual model property. model (1), (2) does not possess structural observability for the whole range of the parameter values. some combinations of parameter values sampled from the prior distribution make the observability matrix lose rank. in order to analyze the process noise covariance, each error vector w_k is separated into two terms: w_k = w_1k + w_0k, where w_1k describes the error of approximating the stochasticity of the full markov chain model by the linear dynamics of (1), and w_0k captures any other model uncertainty, including both differences between the models (e.g. the spread between counties) and differences between the complete model and the true outbreak dynamics. the process noise covariance matrix q_k is split accordingly as q_k = q_1k + q_0. the model uncertainty is assumed to be additive, independent of k, and uncorrelated between the components. therefore, q_0 is diagonal and constant. the evaluation of the approximation error covariance q_1k is more challenging. when the markov chain model is sampled, the distributions of the elements of w_1k are given by sums of poisson processes that are shifted to have zero mean, and with variances that depend on the populations in the different compartments. the matrix q_1k is thus state-dependent and is given by (3); to avoid confusion with the purely time-varying case, explicit state-dependent notation is utilized. let i_d = [0, d] define a finite interval of discrete time instants corresponding to the measurements y_k, k ∈ i_d, and let m ∈ i_d be the point of initialization of the markov chain model. an estimate x̂_m|d of x_k|k=m defined by model (1) is then sought from the output data y_k, k ∈ i_d. the problem at hand was approached using three different methods, which are presented next. the rauch-tung-striebel (rts) smoother (rauch et al.
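the observability loss mentioned above can be checked numerically. the sketch below uses a hypothetical layout of f (my own guess, as in the earlier sketch, not the authors' exact matrix) and shows that setting β = 0 decouples the infectious pressure from the measured cumulative incidence, so the observability matrix of (1)-(2) loses rank:

```python
import numpy as np

def build_F(sigma, gamma_a, gamma_i, f0, f1, beta, rho, theta_a, theta_e):
    # hypothetical discretization of model (1); state ordering:
    # [cumulative incidence, infected, asymptomatic, exposed, pressure]
    return np.array([
        [1, 0,           f1 * gamma_a, f0 * sigma,       0      ],
        [0, 1 - gamma_i, f1 * gamma_a, f0 * sigma,       0      ],
        [0, 0,           1 - gamma_a,  (1 - f0) * sigma, 0      ],
        [0, 0,           0,            1 - sigma,        beta   ],
        [0, 0,           theta_a,      theta_e,          1 - rho],
    ])

def observability_matrix(F, H):
    n = F.shape[0]
    rows = [H]
    for _ in range(n - 1):
        rows.append(rows[-1] @ F)          # stack h, hF, hF^2, ...
    return np.vstack(rows)

H = np.array([[1.0, 0, 0, 0, 0]])          # only cumulative incidence is measured
F_generic = build_F(0.5, 0.25, 0.2, 0.5, 0.3, 0.5, 0.5, 0.6, 0.3)
F_beta0   = build_F(0.5, 0.25, 0.2, 0.5, 0.3, 0.0, 0.5, 0.6, 0.3)

rank_generic = np.linalg.matrix_rank(observability_matrix(F_generic, H))
rank_beta0   = np.linalg.matrix_rank(observability_matrix(F_beta0, H))
```

with β = 0 the pressure column of the observability matrix is identically zero, so the rank drops below the state dimension, echoing the paper's observation that observability is lost for some parameter values.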
(1965)) is a recursive method for solving fixed-interval smoothing problems. it is proven to be an optimal smoother when the noise sources are gaussian and independent of the system states, and it lacks theoretical justification in the present case. even the stability properties of the rts smoother are not readily guaranteed. however, as the results of section 4 demonstrate, it can nonetheless be used empirically. the stability concerns are not critical, as the estimation is performed with a discrete lti model and on a finite time interval. the rts smoother is a two-pass algorithm consisting of a kalman filter that is run for the full interval in a forward pass, followed by a backwards pass in which the state estimates are smoothed. the kalman filter equations are solved recursively from the initial conditions x̂_0 and p_0|0:
x̂_k|k-1 = f x̂_k-1|k-1, p_k|k-1 = f p_k-1|k-1 f^t + q_k,
ỹ_k = y_k - h x̂_k|k-1, s_k = h p_k|k-1 h^t + r_k,
k_k = p_k|k-1 h^t s_k^-1, x̂_k|k = x̂_k|k-1 + k_k ỹ_k, p_k|k = (i - k_k h) p_k|k-1,
where x̂_k|k-1 and x̂_k|k are the a priori and a posteriori state estimates, p_k|k-1 and p_k|k are the a priori and a posteriori estimate covariances, ỹ_k is the innovation, s_k is the innovation covariance, and k_k is the kalman gain. notice that the kalman filter requires knowledge of the covariance matrix q_k for 1 ≤ k ≤ d. in the present case, the covariance matrix is not available, since it depends on the unknown states of the system. therefore, the plant state is replaced by its estimate, and the covariance matrix q_k is approximated by evaluating the state-dependent expression (3) at the state estimate. the a priori and a posteriori state and covariance estimates at each time are saved for the backwards pass. then the algorithm proceeds backwards from the last time point d. the smoothed estimate x̂_k|d is calculated recursively via the equations
x̂_k|d = x̂_k|k + c_k (x̂_k+1|d - x̂_k+1|k), p_k|d = p_k|k + c_k (p_k+1|d - p_k+1|k) c_k^t,
where c_k = p_k|k f^t p_k+1|k^-1 and p_k|d is the smoothed estimate covariance. the problem of estimating x̂_m|d can be approached as an optimization problem and solved once, rather than recursively. the simplest setup is based on the linear relation between the measurement and the state (i.e. backcasting) and leads to the algebraic system y_k = h f^(k-m) x_m + w̄_k, where the properties of the noise w̄_k will be elaborated upon in section 3.3. the state estimation problem is then formulated as a linear least squares problem (4), which can be solved using standard techniques for linear least squares. furthermore, positivity of the state estimate can be enforced by using constrained least squares. the basic method presented above can potentially be improved through weighting, by taking into account the correlation of the error terms w̄_k. to this end, let s_q = {q_k}, k = 0, ..., d, and define the matrix ω(s_q) by specifying its elements in terms of these covariances. then, ω(s_q) is the covariance matrix of the error terms w̄_k. the error terms are thus neither uncorrelated nor homoscedastic, so the gauss-markov theorem does not apply to the ordinary least squares formulation in (4). if the process noise covariance matrices were independent of the system states, the best linear unbiased estimator would be obtained by including the covariance ω(s_q)^-1 as a weighting matrix in the formulation. since the process noise is state-dependent in our case, the state estimation problem cannot be approached directly. the matrix ω(s_q) will instead be estimated. for this purpose, introduce the set ŝ_q(x_m) of approximated process noise covariance matrices, obtained by evaluating the covariances along the trajectory implied by x_m. a simplified version of the estimation problem can then be expressed with ω(ŝ_q(x_m)) as the weighting matrix. since ω(ŝ_q(x_m)) depends on x_m, this problem is nonlinear. in this work, its solution is sought iteratively by applying algorithm 1: the weighted least squares problem is solved, the covariance estimate is updated from the new solution, and the procedure is repeated. for the parametrizations of f that appear in this work, the observability matrix φ that is utilized in solving the least squares problems of state estimation becomes numerically infeasible to calculate if m is too large. the reason for this is that f has eigenvalues that are significantly smaller than one in magnitude, so that repeated inversions result in very large elements in φ. to avoid this problem, the number of elements included in the optimization formulations was limited.
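a minimal numpy sketch of the two-pass rts smoother described above is given below; it uses a generic lti toy system with constant covariances, not the paper's model or its state-dependent covariance estimate:

```python
import numpy as np

def rts_smooth(y, F, H, Q, R, x0, P0):
    """forward kalman pass followed by the rauch-tung-striebel backward pass."""
    n, d = x0.size, len(y)
    xp, Pp = np.zeros((d, n)), np.zeros((d, n, n))   # a priori estimates
    xf, Pf = np.zeros((d, n)), np.zeros((d, n, n))   # a posteriori estimates
    x, P = x0, P0
    for k in range(d):
        x, P = F @ x, F @ P @ F.T + Q                # predict
        xp[k], Pp[k] = x, P
        S = H @ P @ H.T + R                          # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)               # kalman gain
        x = x + K @ (y[k] - H @ x)                   # update
        P = (np.eye(n) - K @ H) @ P
        xf[k], Pf[k] = x, P
    xs, Ps = xf.copy(), Pf.copy()                    # backward smoothing pass
    for k in range(d - 2, -1, -1):
        C = Pf[k] @ F.T @ np.linalg.inv(Pp[k + 1])
        xs[k] = xf[k] + C @ (xs[k + 1] - xp[k + 1])
        Ps[k] = Pf[k] + C @ (Ps[k + 1] - Pp[k + 1]) @ C.T
    return xs, Ps, xf, Pf

# toy constant-velocity system standing in for model (1)
F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q, R = 0.01 * np.eye(2), np.array([[0.5]])
rng = np.random.default_rng(0)
y = [np.array([0.3 * k]) + rng.normal(0, 0.5, 1) for k in range(30)]
xs, Ps, xf, Pf = rts_smooth(y, F, H, Q, R, np.zeros(2), np.eye(2))
```

one useful sanity check: smoothing never increases the estimate covariance, i.e. trace(p_k|d) ≤ trace(p_k|k) for every k.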
for parametrizations where one eigenvalue of f is very close to zero, the solution was to remove the corresponding state through truncation, thus treating the state as identically zero. the approximation x̂_k = f^(k-m) x_m in the nonlinear least squares formulation can also pose problems when k is significantly smaller than m. for this reason, a simple regularization was implemented, where x̂_k is set to zero whenever any element of f^(k-m) x_m becomes negative. the three estimation algorithms introduced above were evaluated using two types of synthetic data. first, linear model (1) was used to generate the data, with the same poisson-distributed state-dependent noise sources as derived for the estimators. then, the data were generated from stochastic simulations of the markov chain model. in both cases, the models were simulated repeatedly over a time horizon of 42 days (d = 42), from identical initial conditions (distinct between the two cases) and with identical parameter values (identical between the two cases) that were randomly selected from the prior parameter distributions. the probability distributions of the state estimation errors for m = 30 were estimated by fitting a kernel distribution and compared to each other. the state estimation was performed with the three algorithms for 100 realizations. the process noise covariance was calculated with the diagonal elements of q_0 set to 0.1, and r_k = 0.1. measurements for indices k < 19 were neglected in the batch optimization approaches. the estimated distributions for all model states are shown in fig. 2. the rts smoother appears to perform the best, mostly through lower uncertainty in the infected population estimate. the main difference between the linear and nonlinear least squares formulations is the significantly higher uncertainty in the cumulative incidence estimation for the linear method.
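the batch (backcasting) formulation discussed above amounts to stacking the rows h f^(k-m) and solving a linear least-squares problem. a minimal sketch on a small observable toy system (not the epidemic model's f, whose near-zero eigenvalues cause the numerical issues described in the text) is shown below:

```python
import numpy as np

def backcast_estimate(y, F, H, m):
    """estimate x_m from scalar measurements y_0..y_{d-1} via y_k ≈ H F^(k-m) x_m."""
    rows = []
    for k in range(len(y)):
        # k - m may be negative; matrix_power then inverts F, which is
        # exactly where ill-conditioning arises for near-singular F
        rows.append(H @ np.linalg.matrix_power(F, k - m))
    Phi = np.vstack(rows)
    x_m, *_ = np.linalg.lstsq(Phi, np.array(y), rcond=None)
    return x_m

# observable toy system: position measured, constant velocity
F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
x0 = np.array([2.0, 0.5])
y = [(H @ np.linalg.matrix_power(F, k) @ x0).item() for k in range(6)]  # noise-free data
x_hat = backcast_estimate(y, F, H, m=0)
```

with noise-free data and a full-rank stacked matrix, the least-squares solution recovers the initial state exactly; with correlated noise, the weighting described in the text becomes necessary.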
this makes sense, as the linear method does not exploit the low uncertainty of the measurement of this state, which is encoded in the covariance model. in this case, 50 realizations were generated and data from the three counties that were subject to spread of the disease in the highest number of realizations (33, 33 and 31, respectively) were analyzed. to capture the larger model discrepancy, the diagonal elements of q_0 were set to 2 and r_k = 0.5. as above, indices k < 19 were neglected in the batch optimizations. the estimated distributions of the estimation errors for the states i, e and a in the three counties are shown in fig. 3 - fig. 5. similarly to the case considered in section 4.1, the rts smoother is generally better at estimating the infected population. beyond this, it is hard to draw conclusions from the plots, as the characteristics of the distributions vary between the counties. to investigate the effect of the initial estimation on the complete model, this model was simulated using estimated states as initial conditions. the estimated states from the three estimation methods for one realization of the simulation of the complete model were chosen. these are summarized in table 1. the complete model was simulated 50 times from each of the three sets of initial conditions, for 42 days. the probability distributions of the logarithm of the infected, exposed and asymptomatic populations, in the three counties listed in table 1, were then estimated using kernel distribution fitting. the results are depicted in fig. 6 - fig. 8 (fig. 6: probability distributions of model states for stockholm county according to simulation from estimated initial conditions; fig. 8: probability distributions of model states for västra götaland county according to simulation from estimated initial conditions). the main conclusion that can be drawn from these results is that the variations between the considered estimation algorithms have a limited effect on the states of the system at the end of the simulation, compared to the variations due to the stochastic simulation. a greater variance in the states can be observed for the initial conditions generated by the rts smoother compared to the others, but no general conclusion regarding the initialization methods can be drawn from this, as the results are based on a single estimation instance. three approaches to a fixed-interval smoothing problem with the purpose of initialization of a larger epidemiological model have been compared: one based on the rauch-tung-striebel smoother and two batch optimization methods. the non-gaussian state-dependent noise in the model implies that standard approaches could not be used directly; instead, covariance estimates were used in two of the methods. the results indicate that the smoother performs better than the other methods, despite the lack of theoretical justification of the method. simulations from estimated initial conditions indicate that the effect of minor estimation errors is limited compared to the variations inherent to the stochastic simulation of the markov chain model. this suggests that computational complexity, robustness and ease of implementation might be of greater importance than high accuracy when the initialization algorithm is chosen.
references:
- a primer on stochastic epidemic models: formulation, numerical simulation, and analysis
- smoothing dynamic systems with state-dependent covariance matrices
- bayesian epidemiological modeling over high-resolution network data
- ny fas kräver nya insatser mot covid-19
- modelling the covid-19 epidemic and implementation of population-wide interventions in italy
- modeling infectious diseases in humans and animals
- maximum likelihood estimates of linear dynamic systems
- maximum correntropy rauch-tung-striebel smoother for nonlinear and non-gaussian systems
- siminf: an r package for data-driven stochastic disease spread simulations
- spatio-temporal modelling of verotoxigenic e. coli o157 in cattle in sweden: exploring options for control
key: cord-024866-9og7pivv authors: lepenioti, katerina; pertselakis, minas; bousdekis, alexandros; louca, andreas; lampathaki, fenareti; apostolou, dimitris; mentzas, gregoris; anastasiou, stathis title: machine learning for predictive and prescriptive analytics of operational data in smart manufacturing date: 2020-04-29 journal: advanced information systems engineering workshops doi: 10.1007/978-3-030-49165-9_1 sha: doc_id: 24866 cord_uid: 9og7pivv perceiving information and extracting insights from data is one of the major challenges in smart manufacturing. real-time data analytics face several challenges in real-life scenarios, while there is a huge treasure of legacy, enterprise and operational data remaining untouched. the current paper exploits the recent advancements of (deep) machine learning for performing predictive and prescriptive analytics on the basis of enterprise and operational data, aiming at supporting the operator on the shopfloor. to do this, it implements algorithms such as recurrent neural networks for predictive analytics and multi-objective reinforcement learning for prescriptive analytics. the proposed approach is demonstrated in a predictive maintenance scenario in the steel industry. perceiving information and extracting business insights and knowledge from data is one of the major challenges in smart manufacturing [1]. in this sense, advanced data analytics is a crucial enabler of industry 4.0 [2]. more specifically, among the major challenges for smart manufacturing are: (deep) machine learning, prescriptive analytics in industrial plants, and analytics-based decision support in manufacturing operations [3]. the wide adoption of iot devices, sensors and actuators in manufacturing environments has fostered an increasing research interest in real-time data analytics.
however, these approaches face several challenges in real-life scenarios: (i) they require a large amount of sensor data that have already experienced events (e.g. failures of, ideally, all possible causes); (ii) they require an enormous computational capacity that cannot be supported by the existing computational infrastructure of factories; (iii) in most cases, the sensor data involve only a few components of a production line, or a small number of parameters related to each component (e.g. temperature, pressure, vibration), making it impossible to capture the whole picture of the factory shop floor and the possible correlations among all the machines; (iv) the cold-start problem is rarely investigated. on the other hand, there is a huge treasure of legacy, enterprise and operational systems data remaining untouched. manufacturers are sitting on a goldmine of unexplored historical, legacy and operational data from their manufacturing execution systems (mes), enterprise resource planning systems (erp), etc., and they cannot afford to miss out on its unexplored potential. however, only 20-30% of the value from such available data-at-rest is currently accrued [4]. legacy data contain information regarding the whole factory cycle and store events from all machines, whether they have sensors installed or not (e.g. products per day, interruption times of the production line, maintenance logs, causalities, etc.) [5]. therefore, legacy data analytics have the credentials to move beyond kpi calculations for business reports (e.g. oee, uptime, etc.), towards providing an all-around view of manufacturing operations on the shopfloor in a proactive manner. in this direction, the recent advancements of machine learning can contribute substantially to performing predictive and prescriptive analytics on the basis of enterprise and operational data, aiming at supporting the operator on the shopfloor and at extracting meaningful insights.
combining predictive and prescriptive analytics is essential for smarter decisions in manufacturing [2]. in addition, mobile computing (with the use of mobile devices, such as smartphones and tablets) can significantly enable timely, comfortable, non-intrusive and reliable interaction with the operator on the shopfloor [6], e.g. for generating alerts, guiding their work, etc., through dedicated mobile apps. the current paper proposes an approach for predictive and prescriptive analytics on the basis of enterprise and operational data for smart manufacturing. to do this, it develops algorithms based on recurrent neural networks (rnn) for predictive analytics, and multi-objective reinforcement learning (morl) for prescriptive analytics. the rest of the paper is organized as follows: sect. 2 presents the background, the challenges and prominent methods for predictive and prescriptive analytics of enterprise and operational data for smart manufacturing. section 3 describes the proposed approach, while sect. 4 shows a walkthrough scenario of the proposed approach in the steel industry. section 5 presents the experimental results, while sect. 6 concludes the paper and outlines the plans for future research. background. intelligent and automated data analysis, which aims to discover useful insights from data, has become a best practice for modern factories. it is supported today by many software tools and data warehouses, and it is known by the name "descriptive analytics". a step further, however, is to use the same data to feed models that can make predictions with similar or better accuracy than a human expert. in the framework of smart manufacturing, prognostics related to machines' health status is a critical research domain that often leverages machine learning methods and data mining tools.
in most of the cases, this is related to the analysis of streaming sensor data, mainly for health monitoring [7] [8] [9], but also for failure prediction [10] [11] [12] as part of a predictive maintenance strategy. however, in all of these approaches, the prediction is produced only minutes or even seconds before the actual failure, which is often not a realistic and practical solution for a real industrial case. the factory managers need to have this information hours or days before the event, so that there is enough time for them to act proactively and prevent it. one way to achieve this is to perform data mining on maintenance and operational data that capture the daily life-cycle of the shop floor, in order to make more high-level predictions [13] [14] [15]. existing challenges. the most notable challenges related to predictive analytics for smart manufacturing include: (a) predictions always involve a degree of uncertainty, especially when the data available are not sufficient quantity-wise or quality-wise; (b) inconsistent, incomplete or missing data with low dimensionality often result in overfitting or underfitting, which can lead to misleading conclusions; (c) properly preparing and manipulating the data in order to arrive at the most appropriate set of features to be used as input to the model is the most time-consuming activity, yet the most critical to the accuracy of the algorithms; (d) the lack of a common "language" between data scientists and domain experts hinders the extraction of appropriate hypotheses from the beginning, as well as the correct interpretation and explainability of the results. novel methods. time series forecasting involves prediction models that analyze time series data and usually infer future data trends. a time series is a sequence of data points indexed in time order. unlike regression predictive modeling, time series forecasting also adds the complexity of sequence dependence among the input variables.
recurrent neural networks (rnn) are considered to be powerful neural networks designed to handle sequence dependence. the long short-term memory network (lstm) is a type of rnn that is typically used in deep learning for its ability to learn long-term dependencies and to handle multiple input and output variables. background. prescriptive analytics aims at answering the questions "what should i do?" and "why should i do it?". it is able to bring business value through adaptive, time-dependent and optimal decisions on the basis of predictions about future events [16]. in recent years, there has been increasing interest in prescriptive analytics for smart manufacturing [17], and it is considered to be the next evolutionary step towards increasing data analytics maturity for optimized decision making, ahead of time. existing challenges. the most important challenges of prescriptive analytics include [2, 17, 18]: (i) addressing the uncertainty introduced by the predictions, the incomplete and noisy data and the subjectivity in human judgement; (ii) combining the "learned knowledge" of machine learning and data mining methods with the "engineered knowledge" elicited from domain experts; (iii) developing generic prescriptive analytics methods and algorithms utilizing artificial intelligence and machine learning instead of problem-specific optimization models; (iv) incorporating adaptation mechanisms capable of processing data and human feedback to continuously improve the decision making process over time and to generate non-intrusive prescriptions; (v) recommending optimal plans out of a list of alternative (sets of) actions. novel methods. reinforcement learning (rl) is considered to be a third machine learning paradigm, alongside supervised learning and unsupervised learning [19]. rl shows an increasing trend in the research literature as a tool for deriving optimal policies in manufacturing problems (e.g. [20, 21]).
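the lstm-based predictive modeling described above is trained on fixed-length sequences of daily features labelled with a known outcome (here, the number of days until the next failure); the windowing step that produces such training pairs can be sketched in plain numpy (hypothetical feature values and a toy failure countdown, not the paper's data, and no lstm training shown to keep the sketch library-free):

```python
import numpy as np

def make_sequences(daily_features, days_to_failure, window=7):
    """slice a daily feature table into overlapping sequences of length
    `window`, each labelled with the days-to-next-failure at the window end."""
    X, y = [], []
    for end in range(window, len(daily_features) + 1):
        X.append(daily_features[end - window:end])
        y.append(days_to_failure[end - 1])
    return np.array(X), np.array(y)

# hypothetical daily oee factors: availability, performance, quality
features = np.random.default_rng(1).uniform(0.7, 1.0, size=(30, 3))
target = np.arange(29, -1, -1)              # toy countdown to a failure on day 30
X, y = make_sequences(features, target, window=7)
# X has shape (24, 7, 3): 24 training sequences of 7 days x 3 features
```

arrays of this (samples, timesteps, features) shape are exactly what sequence models such as lstms consume, with y as the supervised regression target.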
in rl, the problem is represented by an environment consisting of states and actions, and by learning agents with a defined goal state. the agents aim to reach the goal state while maximizing the rewards by selecting actions and moving to different states. in interactive rl, there is the additional capability of incorporating evaluative feedback by a human observer, so that the rl agent learns from both human feedback and environmental reward [22]. another extension is multi-objective rl (morl), which addresses a sequential decision making problem with multiple objectives. morl requires a learning agent to obtain action policies that can optimize multiple objectives at the same time [23]. the proposed approach consists of a predictive analytics component (sect. 3.1) and a prescriptive analytics component (sect. 3.2) that process enterprise and operational data from manufacturing legacy systems, as depicted in fig. 1. the communication is conducted through an event broker for the event predictions and the action prescriptions, while other parameters (i.e. objective values and alternative actions) become available through restful apis. the results are communicated to business users and shopfloor operators through intuitive interfaces addressed to both computers and mobile devices. the proposed predictive analytics approach aims to: (i) exploit hidden correlations inside the data that derive from the day-to-day shop floor operations, (ii) create and adjust a predictive model able to identify future machinery failures, and (iii) make estimations regarding the timing of the failure, i.e. when a failure of the machinery may occur, given the history of operations in the factory. this type of data usually contains daily characteristics that derive from the production line operations and is typically collected as part of a worldwide best practice for the monitoring, evaluation and improvement of the effectiveness of the production process.
the basic measurement of this process is an industry standard known as overall equipment effectiveness (oee), computed as: oee(%) = availability(%) × performance(%) × quality(%). availability is the ratio of actual operational time versus the planned operational time, performance is the ratio of actual throughput of products versus the maximum potential throughput, and quality is the ratio of the not-rejected items produced versus the total production. the oee factor can be computed for the whole production line, as an indication of the factory's effectiveness, or per machine or group of machines. the proposed methodology takes advantage of these commonly extracted indicators and processes them in two steps: predictive model building (learning) and predictive model deployment. predictive model building. the predictive analytics model incorporates lstm and exploits its unique ability to "remember" a sequence of patterns and its relative insensitivity to possible time gaps in the time series. as in most neural network algorithms, lstm networks are able to seamlessly model non-linear problems with multiple input variables through the iterative training of their parameters (weights). since the predictive analytics model deals with time series, the lstm model is trained using supervised learning on a set of training sequences, each assigned to a known output value. therefore, an analyst feeds the model with a set of daily features for a given machine (e.g. the factors that produce the oee) and uses as outcome the number of days until the next failure. this number is known, since historical data hold this information. nevertheless, when the model is finally built and put into operation, it will use new input data and will have to estimate the new outcome. predictive model deployment. when the lstm model is fed with new data, it can produce an estimation of when the next failure will occur (i.e.
number of days or hours) and what is the expected interruption duration in the following days. although this estimation may not be 100% accurate, it could help factory managers to program maintenance actions proactively in a flexible and dynamic manner, compared to the often rigid and outdated schedules that are currently the common practice. this estimation feeds into prescriptive analytics, aiming at automating the whole decision-making process and providing optimal plans. the proposed prescriptive analytics approach is able to: (i) recommend (prescribe) both perfect and imperfect actions (e.g. maintenance actions with various degrees of restoration); (ii) model the decision making process under uncertainty instead of the physical manufacturing process, thus making it applicable to various industries and production processes; and, (iii) incorporate the preference of the domain expert into the decision making process (e.g. according to their skills, experience, etc.), in the form of feedback over the generated prescriptions. to do this, it incorporates multi-objective reinforcement learning (morl). unlike most multi-objective optimization approaches, which result in the pareto front set of optimal solutions [24] , the proposed approach provides a single optimal solution (prescription), thus generating more concrete insights for the user. the proposed prescriptive analytics algorithm consists of three steps: prescriptive model building, prescriptive model solving, and prescriptive model adapting, which are described in detail below. prescriptive model building. the prescriptive analytics model representing the decision making process is defined by a tuple (S, A, T, R), where S is the state space, A is the action space, T is the transition function T : S × A × S → ℝ, and R is the vector reward function R : S × A × S → ℝ^n, where the n dimensions are associated with the objectives to be optimized o_1, ..., o_n.
the proposed prescriptive analytics model has a single starting state s_n, from which the agent starts the episode, and a state s_b that the agent tries to avoid. each episode of the training process of the rl agent ends when the agent returns to the normal state s_n or when it reaches s_b. figure 2 depicts an example including 3 alternative (perfect and/or imperfect maintenance) actions (or sets of actions) s_ai, i = 1, 2, 3, each one of which is assigned to a reward vector. the prescriptive analytics model is built dynamically. in this sense, the latest updates on the number of the action states s_ai and the estimations of the objectives' values for each state s_k are retrieved through apis from the predictive analytics. each action may be implemented either before the breakdown (in order to eliminate or mitigate its impact) or after the breakdown (if this occurs before the implementation of mitigating actions). after the implementation of each action, the equipment returns to its normal state s_n. solid lines represent the transitions a_i that have non-zero reward with respect to the optimization objectives and move the agent from one state to another. prescriptive model deployment. on the basis of event triggers for predicted abnormal situations (e.g. about the time of the next breakdown) received through a message broker, the model moves from the normal state s_n to the dangerous state s_d. for each objective, the reward functions are defined according to whether the objective is to be maximized or minimized. on this basis, the optimal policy π_oi(s, a) for each objective o_i is calculated with the use of the actor-critic algorithm, a policy gradient algorithm that searches directly in (some subset of) the policy space, starting with a mapping from a finite-dimensional (parameter) space to the space of policies [23] .
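a minimal sketch of how such a prescriptive model could be assembled dynamically is given below. the state names (s_n, s_d, s_b, one intermediate state per maintenance action) follow the description above, while the action names and reward numbers are invented for illustration and the 2-d reward vectors stand for a (cost, rul) objective pair:

```python
def build_model(action_rewards):
    """build the tuple (s, a, t, r) with normal state 's_n', dangerous state
    's_d' and breakdown state 's_b', plus one intermediate state per
    alternative maintenance action. rewards are 2-d vectors (cost, rul)."""
    states = {"s_n", "s_d", "s_b"} | {f"s_{a}" for a in action_rewards}
    transitions, rewards = {}, {}
    for a, r in action_rewards.items():
        transitions[("s_d", a)] = f"s_{a}"       # prescribe maintenance action a
        rewards[("s_d", a)] = r
        transitions[(f"s_{a}", "done")] = "s_n"  # equipment returns to normal
        rewards[(f"s_{a}", "done")] = (0.0, 0.0)
    transitions[("s_d", "ignore")] = "s_b"       # no action -> breakdown
    rewards[("s_d", "ignore")] = (0.0, 0.0)
    return states, transitions, rewards

# hypothetical alternatives: a1 is a costly perfect action, a2 an imperfect one
states, T, R = build_model({"a1": (-100.0, 30.0), "a2": (-60.0, 12.0)})
```

rebuilding this structure on every predicted-breakdown trigger is what makes the model "dynamic": the action set simply mirrors whatever alternatives the predictive side currently exposes through its api.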
assuming independent objectives, the multi-objective optimal policy is derived from: π_opt(s, a) = ∏_{i∈I} π_oi(s, a). the time constraints of the optimal policy (prescription) are defined by the prediction event trigger. the prescription is exposed to the operator on the shop floor (e.g. through a mobile device), providing them the capability to accept or reject it. if accepted, the prescribed action is added to the actions plan. prescriptive model adaptation. the prescriptive analytics model is able to adapt according to feedback by the expert on the generated prescriptions. this approach learns from the operator whether the prescribed actions converge with their experience or skills and incorporates their preference into the prescriptive analytics model. in this way, it provides non-disruptive decision augmentation and thus achieves an optimized human-machine interaction, while, at the same time, optimizing manufacturing kpis. to do this, it implements the policy shaping algorithm [25] , a bayesian approach that attempts to maximize the information gained from human feedback by utilizing it as direct labels on the policy. for each prescription, optional human feedback is received as a signal of approval or rejection, numerically mapped to the reward signals and interpreted into a step function. the feedback is converted into a policy π_feedback(s, a), whose distribution relies on the consistency, expressing the user's knowledge regarding the optimality of the actions, and on the likelihood of receiving feedback. assuming that the feedback policy is independent from the optimal multi-objective policy, the synthetic optimal policy for the optimization objectives and the human feedback is calculated as: π_synth(s, a) = π_opt(s, a) · π_feedback(s, a). the case examined is the cold rolling production line of m. j. maillis s.a.
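the two product rules above (per-objective policies multiplied action-wise, then multiplied with the feedback policy) can be sketched as one generic combination step followed by renormalization. the policy numbers below are made up; in the paper the inputs come from actor-critic and from policy shaping respectively:

```python
def combine(policies, actions):
    """multiply policies action-wise and renormalize into a distribution.
    policies: list of dicts mapping action -> probability."""
    raw = {a: 1.0 for a in actions}
    for p in policies:
        for a in actions:
            raw[a] *= p[a]
    z = sum(raw.values())
    return {a: raw[a] / z for a in actions}

actions = ["a1", "a2"]
p_cost = {"a1": 0.3, "a2": 0.7}   # policy for the cost objective
p_rul = {"a1": 0.6, "a2": 0.4}    # policy for the rul objective
p_fb = {"a1": 0.5, "a2": 0.5}     # feedback policy (uninformative here)
p_opt = combine([p_cost, p_rul, p_fb], actions)
```

note that an uninformative (uniform) feedback policy leaves the ranking of actions unchanged, which is the desired behavior when no operator feedback has been received yet.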
cold rolling is a process of reduction of the cross-sectional area through the deformation caused by a pair of metal rolls rotating in opposite directions, in order to produce rolling products with the closest possible thickness tolerances and an excellent surface finish. in the milling station, there is one pair of back-up rolls and one pair of work rolls. the deformation takes place through the force of the rolls, supported by adjustable strip tension in both coilers and de-coilers. over the life of a roll, some wear will occur due to normal processing, and some wear will occur due to extraneous conditions. during replacement, the rolls are removed for grinding, during which some roll diameter is lost, and then are stored in the warehouse for future use. after several regrindings, the diameter of the roll becomes so small that it is no longer operational. the lstm model of predictive analytics was created using the keras library with tensorflow as backend, and the morl using the brown-umbc reinforcement learning and planning (burlap) library, while the event communication between them is performed with a kafka broker. in the m. j. maillis s.a case, the system predicts the time of the next breakdown and the rul of the available rolls. for the latter, the operator can select one of the repaired rollers, having been subject to grinding, or a new one. therefore, the alternative actions are created dynamically according to the available repaired rollers existing in the warehouse. each one has a different rul, according to its previous operation, and a different cost (retrieved from enterprise systems) due to its depreciation. each roller has an id and is assigned to its characteristics/objectives of morl (i.e. cost, to be minimized, and rul, to be maximized) in order to facilitate its traceability. the available rolls, along with the aforementioned objective values, are retrieved on the basis of a predicted breakdown event trigger.
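the dynamic creation of alternative actions from the warehouse stock can be sketched as follows; the roller ids, costs and ruls are invented, and the field names are placeholders for whatever the enterprise systems actually expose:

```python
def build_actions(repaired_rollers, new_roller_cost, new_roller_rul):
    """assemble the alternative actions on a predicted-breakdown trigger:
    one perfect action (new roller) plus one imperfect action per repaired
    roller currently in the warehouse."""
    actions = [{"action": "replace with new roller",
                "cost": new_roller_cost, "rul": new_roller_rul}]
    for r in repaired_rollers:
        actions.append({"action": f"replace with repaired roller id{r['id']}",
                        "cost": r["cost"], "rul": r["rul"]})
    return actions

# hypothetical warehouse stock of reground rollers
stock = [{"id": 3, "cost": 120.0, "rul": 40.0},
         {"id": 7, "cost": 90.0, "rul": 25.0}]
plans = build_actions(stock, new_roller_cost=400.0, new_roller_rul=100.0)
print(len(plans))  # -> 3
```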
the alternative actions for the current scenario, along with their costs and ruls, are shown in table 1 . the action "replace with new roller" represents a perfect maintenance action, while the rest represent imperfect maintenance actions. figure 3 depicts an example of the process in which the prescription "replace with repaired roller id3" is generated on the basis of a breakdown prediction and previously received feedback, and instantly communicated to the operators through a dedicated mobile app. the operators are also expected to provide feedback, so that their knowledge and preferences are incorporated in the system and the models are adapted accordingly. the second analysis aimed to predict the expected interruption duration for the following day ('what is the expected interruption duration for the following day?'). the input features used in this lstm model were: availability, performance, minutes of breakdown, real gross production, number of breakdowns, and month (date). again, several lstm parameters and layers were tested, and the final model was a sequential model with a first lstm layer of 24 neurons and a 'relu' activation function, a second layer of 12 neurons with a 'relu' activation function, a dropout layer with rate 0.1, and finally a dense layer. the model was trained using data from 2017 and 2018, with a batch size of 20, 100 epochs, a timestep of 3 and the rmsprop optimizer. predictions were performed on 2019 data and the results are depicted in fig. 5 . the blue line represents the actual value, whereas the orange line represents the predicted value. the overall rmse is 107.57, meaning that there is an average of 107.57 min of uncertainty in each prediction. for this experiment, the actor-critic algorithm, which calculates the associated optimal policy sequentially within 10000 episodes, consists of a boltzmann actor and a td-lambda critic with learning rate = 0.3, lambda = 0.4 and gamma = 0.99.
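the "timestep of 3" used to train the lstm corresponds to a standard sliding-window framing of the daily records into supervised sequences, which can be sketched without any deep-learning dependency; the toy feature vectors and targets below are illustrative only:

```python
def make_sequences(rows, timesteps=3):
    """rows: list of (feature_vector, target) pairs in time order. returns
    (x, y), where each x[i] holds `timesteps` consecutive feature vectors
    and y[i] is the target aligned with the last of them."""
    xs, ys = [], []
    for i in range(timesteps - 1, len(rows)):
        xs.append([rows[j][0] for j in range(i - timesteps + 1, i + 1)])
        ys.append(rows[i][1])
    return xs, ys

rows = [([float(i)], 10.0 * i) for i in range(5)]  # toy daily records
x, y = make_sequences(rows, timesteps=3)
print(len(x), y)  # -> 3 [20.0, 30.0, 40.0]
```

in the actual pipeline each feature vector would hold the six inputs listed above (availability, performance, minutes of breakdown, real gross production, number of breakdowns, month), and the resulting (x, y) arrays would be fed to the keras sequential model.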
the generated policies are then integrated into a single policy taking into account the consistency (c = 0.7) and likelihood (l = 0.8) values. table 2 presents five "snapshots" of positive and negative feedback along with the resulting shaped prescriptions and their respective policies. each "snapshot" is compared to the previous one. in this paper, we proposed an approach for predictive and prescriptive analytics aiming at exploiting the huge treasure of legacy enterprise and operational data and at overcoming some challenges of real-time data analytics. the potential of the proposed approach is high, especially in traditional industries that have not benefited from the advancements of industry 4.0 and that have just started investigating the potential of data analytics and machine learning for the optimization of their production processes. the traditional manufacturing sectors (e.g. textile, furniture, packaging, steel processing) usually have older factories with limited capacity for investing in modern production technologies. since neural networks are inherently adaptive, the proposed approach could be applied to similar production lines (e.g. at a newly established factory of the same type), overcoming the cold-start problem due to which other techniques usually fail. it also exploits both the "voice of data" and the "voice of experts". regarding future work, we plan to evaluate our proposed approach in additional use cases with different requirements, as well as to investigate approaches and algorithms for the fusion of the outcomes derived from real-time data analytics and operational data analytics, which represent different levels of information.
references
1. building an industry 4.0 analytics platform
2. predictive, prescriptive and detective analytics for smart manufacturing in the information age
3. big data challenges in smart manufacturing: a discussion paper for bdva and effra research & innovation roadmap alignment, bdva
4. the age of analytics: competing in a data-driven world
5. predictive maintenance in a digital factory shop-floor: data mining on historical and operational data coming from manufacturers' information systems
6. the internet of things for smart manufacturing: a review
7. recent advances and trends in predictive manufacturing systems in big data environment
8. a full history proportional hazards model for preventive maintenance scheduling
9. a neural network application for reliability modelling and condition-based predictive maintenance
10. data mining in manufacturing: a review based on the kind of knowledge
11. data mining in manufacturing: a review
12. a practical approach to combine data mining and prognostics for improved predictive maintenance
13. application of data mining in a maintenance system for failure prediction
14. analyzing maintenance data using data mining methods
15. machine learning for predictive maintenance: a multiple classifier approach
16. prescriptive analytics
17. smart manufacturing with prescriptive analytics
18. prescriptive analytics: literature review and research challenges
19. reinforcement learning: an introduction
20. model-free adaptive optimal control of episodic fixed-horizon manufacturing processes using reinforcement learning
21. a reinforcement learning framework for optimal operation and maintenance of power grids
22. human-centered reinforcement learning: a survey
23. multiobjective reinforcement learning: a comprehensive overview
24. many-objective stochastic path finding using reinforcement learning
25. policy shaping: integrating human feedback with reinforcement learning

acknowledgments. this work is funded by the european commission project h2020 uptime "unified predictive maintenance system" (768634).
key: cord-005033-voi9gu0l authors: xuan, huiyu; xu, lida; li, lu title: a ca-based epidemic model for hiv/aids transmission with heterogeneity date: 2008-06-07 journal: ann oper res doi: 10.1007/s10479-008-0369-3 sha: doc_id: 5033 cord_uid: voi9gu0l the complex dynamics of hiv transmission and subsequent progression to aids make the mathematical analysis untraceable and problematic. in this paper, we develop an extended ca simulation model to study the dynamical behaviors of hiv/aids transmission. the model incorporates heterogeneity into agents' behaviors. agents have various attributes such as infectivity and susceptibility, varying degrees of influence on their neighbors, and different mobilities. additionally, we divide the post-infection process of the aids disease into several sub-stages in order to facilitate the study of the dynamics in different development stages of epidemics. these features make the dynamics more complicated. we find that the epidemic in our model can generally end up in one of two states: extinction and persistence, which is consistent with other researchers' work. higher population density, higher mobility, a higher number of infection sources, and a greater neighborhood are more likely to result in high levels of infection and in persistence. finally, we show that in the four-class agent scenario, variation in susceptibility (or infectivity) and various fractions of the four classes also complicate the dynamics, and some of the results are contradictory and call for further research. we focus on hiv/aids transmission among human groups in order to better understand its dynamical behavior. in epidemic modeling (see, e.g., bailey 1975; anderson and may 1991; murray 2005), there are two frequently used methodologies: mathematical and simulation methods. for mathematical approaches, a cohort of people is often classified into susceptibles, infectives, and recovereds with (or without) immunity (see, e.g., kermack and mckendrick 1927).
systems of differential equations are used to describe the linear (or nonlinear) dynamics of epidemics. macroscopically, mathematical models can reveal the relationships among primary factors and describe their effects on epidemic spreading under certain assumptions. as for the hiv/aids epidemic, many models have been proposed (may and anderson 1987; hyman et al. 1999; brauer and driessche 2001; wu and tan 2000, etc.). however, mathematical approaches have some serious drawbacks due to their intractability and the complexity of epidemics. moreover, the complicated nature of hiv/aids transmission makes it even harder to obtain analytical solutions and to study them. in the early 1990s, some researchers started to apply simulation approaches to this field. there is a large literature that addresses the computer simulation of epidemic dynamics (see, e.g., leslie and brunham 1990; atkinson 1996; rhodes and anderson 1996; rhodes and anderson 1997; ahmed and agiza 1998; benyoussef et al. 2003; tarwater and martin 2001). particularly, the cellular automata (ca) method (some literature refers to this as a lattice-based method) has been widely used in modeling complex adaptive systems. despite its simple structure, ca is well suited to describing propagation phenomena, such as rumor spreading, particle percolation, innovation propagation, and disease spreading. for instance, in epidemic modeling, fuentes and kuperman (1999) propose two ca models corresponding to the classical mathematical sir model and sis model, respectively. ahmed and agiza (1998) develop a ca model that takes into consideration the latency and incubation period of epidemics and allows each individual (agent) to have distinctive susceptibility. gao et al. (2006) put forward a ca model for sars spreading which takes account of social influence. more recently, other methods such as agent-based modeling and system dynamics have been introduced to this field (see, e.g., gordan 2003; bagni et al. 2002).
our paper contributes to this field by developing an extended ca simulation model. we then use the new ca model to investigate some issues in hiv/aids epidemics. most models, including the foregoing ca models, have some limitations: they fail to consider the peculiarities of hiv/aids epidemics and are thereby incapable of describing the epidemic accurately and completely (see frauenthal 1980 for more discussion). first, most of the models assume that there is no latent (or incubation) period. however, for some epidemics, especially aids, there are variously lasting periods of latency and incubation, as well as behavior-varying infectivity (or susceptibility) during these periods. in fact, the development of aids involves a few stages in which an infected individual can exhibit different behaviors. those diversified behaviors, in turn, have non-negligible effects on the dynamics of hiv/aids. in light of this, we extend the conventional division of the epidemic process (i.e., susceptible, infection, and removed) by dividing the infection period into three sub-stages, each corresponding to a clinical stage occurring in the course of aids development. due to the inability of classical ca approaches to accommodate those newly added state-transition events, we also borrow some ideas from discrete-event simulation techniques and make an agent's stage transitions time-triggered instead of using state-based transition rules. secondly, it is commonly assumed that individuals in the population are homogeneous in the sense that they have equal infectivity and susceptibility, or that they can exert the same influence on each other, etc. this assumption may be satisfied in commonly observed epidemics but is not consistent with the hiv/aids epidemic. as we know, susceptibility and infectivity heavily depend on individuals' behavior. for example, safe sex practices such as the use of condoms can dramatically reduce the chance of infection.
also, the way hiv/aids is transmitted from one person to another varies, depending on the interactions between people, and thus the probability of getting infected is determined in part by transmission routes and can be quite different between infected-male/susceptible-female and susceptible-male/infected-female interactions. under this assumption, models that are confined to a single high-risk human group are not suitable for overall population cases. new models are needed to explicitly consider this complexity. therefore, we make an extension to the traditional ca model by introducing an extended definition of neighborhood and attaching attributes to each agent, such as infectivity and resistibility. we also define four types of agents, characterized by different infectivity (and susceptibility) and various forms of neighborhood, to represent four types of people in real life. in doing so, we will be able to investigate the dynamics of hiv/aids with heterogeneous groups in a realistic way. thirdly, classical ca models assume that agents in the grid are spatially fixed, that is, once an agent is placed in a cell, it does not move into another cell. this assumption is problematic because people in the real world are migratory. for instance, in china, millions of rural people leave their hometowns and seek jobs in the cities. the migration of population is a driving force for the spread of hiv/aids. ignoring the mobility of agents in epidemic models would jeopardize the credibility of the results obtained. considering this point, we incorporate agents' mobility into their behaviors. in our model, each agent is allowed to move randomly into one of its adjoining and unoccupied cells at random time intervals. recently, agent-based modeling has been used in various fields to solve many problems (see, e.g., zhang and bhattacharyya 2007; luo et al. 2007).
some readers might notice that our improved ca model has features that are usually found in agent-based methodology. as a matter of fact, our method borrows much from agent-based simulation modeling. to keep things simple, we prefer to view this model as a ca model. this paper is organized as follows. in the next section, we present our extended ca simulation model. section 3 gives a detailed description of the simulation results and analyzes some influential factors that affect the dynamical behavior of the model. section 4 concludes and points out some possible extensions and directions for future research. cellular automata have been extensively used as tools for modeling complex adaptive systems such as traffic flow, financial markets, chemical systems, biological groups, and other social systems (see, e.g., gerhard and schuster 1989; gerhardt et al. 1990; weimar et al. 1992; karafyllidis and thanailakis 1997; karafyllidis 1998). usually, a typical ca model consists of a regular two-dimensional grid with a certain boundary condition and a swarm of agents living in the grid. the neighborhood of an agent is defined as some (or all) of the immediately adjacent cells, and the agents who inhabit the neighborhood are called neighbors. agents are restricted to local neighborhood interaction and hence are unable to communicate globally. there are several states agents can be in at each time, and an agent's state at time t + 1 is determined based on its neighbors' states at time t. the rules used in the determination of next-time states can be written as a mapping f : s^{k+1} → s, s_i(t + 1) = f(s_i(t), s_j1(t), ..., s_jk(t)), where s is the set of states, k is the number of neighbors, and t denotes simulation time. the mathematical properties of cellular automata have been studied in martin et al. (1984). in our model, we consider a population of size n(t) at time t, randomly distributed in a two-dimensional w × w lattice. the population growth rate r is fixed throughout the simulation. at each time, new agents are added to the model, and the dead are removed.
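the synchronous ca update just described can be sketched generically: each occupied cell's next state is a function of its own state and its neighbors' states at time t. the concrete rule below (a healthy agent next to an infectious one becomes at-risk) anticipates the s1 → s2 transition defined later in the section and is only an illustrative instance of the mapping:

```python
def step(grid, rule):
    """grid: dict (x, y) -> state, sparse over occupied cells. applies
    rule(state, neighbor_states) synchronously to every agent, using the
    moore neighborhood."""
    def neighbors(x, y):
        return [grid[(x + dx, y + dy)]
                for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                if (dx, dy) != (0, 0) and (x + dx, y + dy) in grid]
    return {pos: rule(s, neighbors(*pos)) for pos, s in grid.items()}

def rule(state, nbrs):
    # sketch rule: a healthy agent (s1) adjacent to an infectious agent
    # (s4 or s5) moves to the at-risk state s2; everything else is unchanged
    if state == "s1" and any(n in ("s4", "s5") for n in nbrs):
        return "s2"
    return state

g = {(0, 0): "s1", (0, 1): "s4", (5, 5): "s1"}
nxt = step(g, rule)
print(nxt[(0, 0)])  # -> s2
```

the isolated agent at (5, 5) keeps its state, showing why agents in such a model cannot communicate globally: influence only propagates through overlapping neighborhoods, one step per tick.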
simulation time advances in a discrete way. the time interval (t, t + 1) is specified to represent one week in real life. this assumption makes the simulations run reasonably fast (with respect to the whole progress of the epidemic) without losing any time-specific clinical properties associated with hiv/aids. explicitly modeling the post-infection progression to aids is one feature of our model compared with conventional ca models. classical epidemic models divide the closed population into three subgroups: susceptible, infective, and recovered (removed). this simplified classification is not consistent with epidemics in real life. particularly, it is well established that an individual, once infected with hiv, undergoes roughly three clinical phases toward full-blown aids: (1) infected, not yet infectious, (2) infectious, not yet symptomatic, and (3) symptomatic (may and anderson 1987; may et al. 1988). the lifetime of an individual should cover not only the process from health to infection, but also the sub-stages after infection. thus, we assume that each agent can go through the following states: • s1: healthy state. initially, each agent is set to be in s1 state. healthy agents have no risk of being infected. when a healthy agent moves into the neighborhood of an infectious one, or an infectious agent approaches him, the healthy agent's state will change from s1 to s2, because contacts with infectives incur the danger of infection. as for an agent in s2 state, it can transit in two directions: one direction is to change from s2 back to s1, after all its infectious neighbors move away (or its dead neighbors are removed from the grid) or he leaves the neighborhoods of his infectious neighbors; the other direction is to change from s2 to s3 if he unluckily gets infected. note that we assume infection is instantaneous, i.e., there is instantaneous transmission from an infected individual to a susceptible.
a newly infected agent is unable to transmit the hiv virus until seroconversion. the s3 state corresponds to the early stage of hiv infection. let t_1 denote the duration of this period. empirical work has been done to estimate this parameter. anderson and medley (1988) report t_1 to lie between 40 and 60 days in transfusion-induced aids cases. in our model we assume that t_1 is a random variable following a normal distribution with mean μ_1 and variance σ_1^2. after t_1, the infected agent enters s4 state: the infectious state. medically, the duration during which an infected is infectious but not yet symptomatic is called the incubation period. we let t_2 denote this period. empirical work suggests an average incubation period of around 4 to 15 years (medley et al. 1987). a weibull distribution is commonly used to describe this incubation period (see, e.g., anderson 1988; anderson and medley 1988). furthermore, anderson and medley (1988) estimated t_2 with a weibull distribution (with a mean of 7.7 years and a median of 7.4 years) based on 545 transfusion-induced aids cases. for simplicity, we take t_2 as a real number drawn from a normal distribution with mean μ_2 and variance σ_2^2 rather than a weibull distribution. it should be pointed out that the simulation results generated with the normal distribution here prove to differ little, if at all, from those generated when a weibull distribution is employed. during the t_2 period, hiv viruses in the victim's body are constantly cloning themselves, and eventually the immune system collapses. at this point, the victim starts to show some symptoms and thus transits to s5 state: the symptomatic stage. as usual, let t_3 denote the duration of this period. rothenberg et al. (1987) report 5-7 year survival rates among 1660 idus (intravenous drug users) in new york city and find a median time of survival of 282 days. chang et al. (1993) report a median survival time of 10.5 months.
empirical work shows that almost all hiv infectives, excluding those who die from other causes, will inevitably develop aids and die of it (may and anderson 1987). similarly, we assume t_3 follows a normal distribution with mean μ_3 and variance σ_3^2. eventually, the ill agent enters s6 state after t_3 passes by. agents in s6 state will be removed from the population at the beginning of the next time step, and all of their uninfected neighbors will be released from s2 state, back to s1 state. generally, these state transitions take place in the order of s1, s2, s3, s4, s5, and s6. it is impossible for an agent to return from s3 state to s1 or s2 state. the backward transition from s2 to s1, demonstrated by a dashed line in fig. 1 , is due to the disappearance of the threats posed by infectious agents. moreover, although the transitions among s3, s4, and s5 states are not directly relevant to the propagation process, this progression is closely related to hiv/aids transmission; taking account of this progression is essential for a better understanding of hiv/aids transmission. in the model, all the events triggering transitions can be divided into two categories. one category is rule-based, such as the healthy-to-dangerous, dangerous-to-infected, and dangerous-to-healthy state-transition events. these events occur according to the ca transition rules: an agent's state at time t + 1 is based not only on its own state but also on the states of its neighbors at time t. the other category is time-based, meaning that these events are scheduled at pre-specified times. for instance, an agent entering s3 state will be assigned a time indicating when to change to s4 state. after that amount of time elapses, the transition occurs spontaneously. despite the distinction between these two categories, implementing the two event-triggering mechanisms is straightforward and leaves no further elaboration necessary.
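the time-based mechanism can be sketched as sampling the three stage durations once, at infection time, and scheduling the s3 → s4 → s5 → s6 transitions up front. the means and variances below are placeholders, not the paper's calibrated values, and the floor of one time step is an assumed guard against negative normal samples:

```python
import random

# illustrative (mu, sigma) pairs in weeks for t_1, t_2, t_3 -- not calibrated
STAGE_PARAMS = {"t1": (7.0, 1.0), "t2": (400.0, 50.0), "t3": (40.0, 8.0)}

def schedule_transitions(t_infected, rng, params=STAGE_PARAMS):
    """return the absolute times at which an agent infected at t_infected
    enters s4 (infectious), s5 (symptomatic) and s6 (death)."""
    times, t = [], t_infected
    for mu, sigma in (params["t1"], params["t2"], params["t3"]):
        t += max(rng.gauss(mu, sigma), 1.0)  # truncate negative samples
        times.append(t)
    return times

rng = random.Random(0)
t4, t5, t6 = schedule_transitions(0.0, rng)
```

at each simulation tick the engine then only needs to compare the current time against these pre-scheduled times, while the rule-based events (s1 ↔ s2, s2 → s3) are evaluated from the neighborhood as usual.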
actually, these six states can be grouped into three "super" classes in terms of the taxonomy used in the classical mathematical models: s1 and s2 states correspond to the susceptible state; s3, s4, and s5 states belong to the infection state; and s6 state is the removed state. obviously, s1 and s2 states could be treated as one single state without changing any results. the reason why we divide this into two sub-states is that doing so makes our model easier to implement and our logic cleaner and more legible. the hiv/aids epidemic differs from other epidemics in that its dynamics is heavily affected by individuals' behavioral patterns and the interactions between them. for example, careful sex practices and sanitization measures in drug taking will make individuals less likely to be infected. behavioral patterns and interactions are mostly determined by individuals' life styles, personalities, social networks, etc. however, the majority of models fail to take account of the heterogeneity in agents' behaviors. to capture this, we extend classical ca models by allowing each agent to have its own attributes, such as mobility, infectivity, resistibility (susceptibility), and different extents of neighborhood. assume that each cell in the grid can be occupied by at most one agent at a time. at time t, agent i can move from one cell into one of its adjacent cells with probability p_i^m. here, p_i^m is a measurement of agent i's activity level. it is a fixed real number, drawn from a uniform distribution on (p_min^m, p_max^m) (0 ≤ p_min^m ≤ p_max^m ≤ 1). when p_min^m = p_max^m, the activity level across agents is equal and therefore agents have the same inclination to move around.
one extreme case is p_min^m = p_max^m = 0, which corresponds to the situation in which agents stay in their initial places during the simulation, whilst p_min^m = p_max^m = 1 means that each agent will move into an empty neighboring cell at almost every time step (it could get stuck and not move anywhere if its whole neighborhood is occupied). it is easy to deduce that the average time per move is 1/p_i^m. intuitively, a high level of activity leads to rapid spreading; our simulation results verify this. besides the heterogeneity in agents' activity, another kind of heterogeneity is introduced when we assign various levels of infectivity and susceptibility to agents. let f_i denote the infectivity level of agent i. f_i is a real number drawn uniformly from the interval (0, 1). it measures the possibility that agent i transmits hiv to others when they meet. evidently, greater values of f_i indicate higher infectiousness of agent i. suppose also that each agent has some resistance to being infected. we denote this resistibility as r_i for agent i. similarly, r_i is also a real number drawn uniformly from the interval (0, 1), with the property that the greater the resistibility, the smaller the chance of getting infected. note that the infectivity of an agent need not be constant. an agent can have different levels of infectivity, depending both on its state and on its behavior. it is widely believed that infectives experience two periods of high infectivity (see, e.g., may and anderson 1987; may et al. 1988), one shortly after being infected and the other at the late stage of the illness. another example is a patient who has high infectivity during the incubation period and low infectivity, owing to good health care, during the symptomatic period. although our model allows for varying infectivity at different stages, we adopt a fixed infectivity for each agent.
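the attribute sampling described above can be sketched as follows; the default bounds on the mobility interval are illustrative assumptions.

```python
import random

# heterogeneous agent attributes: mobility p_i^m drawn from (p_min^m, p_max^m),
# infectivity f_i and resistibility r_i drawn uniformly from (0, 1).
# the expected time per move is 1 / p_i^m.

def make_agent(p_m_min=0.2, p_m_max=0.8, rng=random):
    p_m = rng.uniform(p_m_min, p_m_max)   # activity level p_i^m
    return {
        "mobility": p_m,
        "avg_time_per_move": 1.0 / p_m,
        "infectivity": rng.uniform(0.0, 1.0),    # f_i
        "resistibility": rng.uniform(0.0, 1.0),  # r_i
    }
```

calling `make_agent(0.5, 0.5)` reproduces the homogeneous case p_min^m = p_max^m, in which every agent has the same inclination to move.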
in doing so, we can focus our attention on the most significant issues; we leave varying-infectivity scenarios for future work. conventional ca models define two types of neighborhoods: the moore neighborhood and the von neumann neighborhood. in this paper, we extend the concept of a ca neighborhood in order to better describe the various situations encountered in agent-based modeling. figure 2 illustrates the definition. fig. 2b shows the classical moore neighborhood, and fig. 2d the classical von neumann neighborhood. figure 2c represents an extended moore neighborhood of order 2 × 2, and fig. 2e an extended 2 × 2 von neumann neighborhood. in particular, fig. 2a can simply be viewed as an extended 0 × 0 moore (or von neumann) neighborhood. note that fig. 2b, f-i show neighborhoods with one direction. this directional structure is able to capture the biases or preferences embedded in individuals' behavioral patterns, and we can use different directions to represent different modes of interaction. clearly, the greater the neighborhood, the larger the extent to which an agent can exert influence on its neighbors. given the above neighborhood definition, a concept of distance is naturally induced. let a pair of integers (x, y) represent an agent's coordinates in the grid. the distance between agents i and j is then given by eq. (2). next, we specify that the influence indicator m_ij of agents i and j satisfies the following condition: the influence intensity is inversely proportional to the distance between them if the influence can be exerted, and zero otherwise. therefore, the infective impact i_ij of agent i on agent j can be expressed accordingly, and the probability of an agent being infected by one of its neighbors is defined via a function p(·) satisfying two conditions: (1) p(·) is a real-valued function with values between zero and one; (2) p(·) is increasing in i_ij and decreasing in r_i.
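the neighborhood, distance, and influence definitions above can be sketched as follows. two assumptions are made, since neither eq. (2) nor the exact form of p(·) is reproduced in the text: chebyshev distance (the natural metric for moore neighborhoods) and a concrete p(·) that merely satisfies the two stated monotonicity conditions.

```python
# neighborhood, distance, influence indicator m_ij, and infection probability
# p(.) for the extended ca model; distance metric and p(.) form are assumptions.

def moore_neighborhood(x, y, order=1):
    """extended moore neighborhood: all cells within chebyshev distance
    `order` of (x, y), excluding the cell itself."""
    return [(x + dx, y + dy)
            for dx in range(-order, order + 1)
            for dy in range(-order, order + 1)
            if (dx, dy) != (0, 0)]

def distance(a, b):
    # chebyshev distance, consistent with the moore neighborhood above
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def influence(a, b, neighborhood):
    """m_ab: inversely proportional to distance if b can be reached from a,
    and zero otherwise."""
    return 1.0 / distance(a, b) if b in neighborhood else 0.0

def p_infect(impact, r):
    """assumed form of p(.): increasing in the infective impact i_ij,
    decreasing in resistibility r, and always within [0, 1)."""
    return impact * (1.0 - r) / (1.0 + impact)
```

any other function meeting conditions (1) and (2) could be substituted for `p_infect` without changing the structure of the model.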
in this model, we assume p(·) takes the form of the following equation, where p(i_ij, r_i) is interpreted as the probability of agent i being infected by its neighbor j. denoting by b_i the set of all its neighbors, agent i's overall probability of infection is then determined by the most influential neighbor. such a specification makes sense in most cases. the model developed in sect. 2 is implemented in the java programming language with the repast software package. thoroughly commented source code is available from the authors upon request. next, we begin our analysis by considering a typical simulation run as a benchmark case. 3.1 benchmark case. table 1 lists the input parameters chosen for the benchmark case. in this case, the grid consists of 100 × 100 sites (w = 100). the population size n is set to 2000 with an initial infected ratio α = 0.005. all agents are homogeneous in the sense of having the same infectivity f_i = 0.2, resistibility r_i = 0.5, and 1 × 1 moore neighborhood. they are uniformly distributed over the grid. the total simulation time t for each run is set to 2000. figure 3 shows a snapshot of the spatial distribution of the population at time t = 1275 in a typical simulation. it is commonly believed that as an epidemic develops, its spread ends up in one of two typical situations: extinction or prevalence. figure 4 depicts these two situations. in fig. 4a, the number of infections climbs early in the process; after about time t = 470, the infection level starts to drop slowly until it reaches zero at time t = 2000. in fig. 4b, the number of infections increases slowly and reaches an equilibrium level after t = 1600. the intuition behind this is that, in the first case, newly infected agents continuously enter the pool of infectives at a fairly low rate in the early stages.
after the lengthy incubation period, these infectives begin to develop aids and eventually die. their total number drops when the infection rate is very low relative to the rate at which infectives leave the pool (through death, in the model). in the second case, healthy agents get infected at a relatively high rate in the early stages, and the infection level continues to increase because the number of removed agents is relatively small in the later stages. thus, a high infection rate often leads to prevalence, as demonstrated in fig. 4b. both outcomes can be found in real-world situations. notice that in this parameter setting, the growth rate r is almost zero (r = 0.0001). in the next subsection, we will explore the effects of various factors, such as population density, initial infection ratio, and infectivity, on the epidemic. now we keep the other parameters constant as before and let the population density β (β = n/w^2) vary to see how it affects the dynamics of hiv/aids transmission. tarwater and martin (2001) investigate this issue when studying outbreaks of measles or measles-like infectious diseases. as one would expect, many common infectious diseases spread more rapidly at a high population density than at a low one. figure 5 illustrates the time series of the mean numbers of infectives for population sizes n = 1500, 2000, 2500, 3000, and 3500. we can see that when population density is relatively low (n = 1500, 2000), the infection levels are relatively low during the entire simulation and decline slowly in the later stages, suggesting that the epidemic eventually dies out. for n = 2500, the infection number grows and reaches about 400 at the end. for n = 3000 and 3500, the infection numbers reach very high levels and then drop rapidly. this collapse occurs because so many infectives are removed from the model that the pool of infectives shrinks.
for clarity, we also plot the fractions of infectives in the population vs. time in fig. 6. clearly, in the late stages of the epidemic, the fractions are greater when β is large than when β is small. in summary, hiv/aids infection is more likely to persist at higher population densities. this is because, as the population density increases, the contact rate rises, leading to a higher probability of infection. early work (see, e.g., rhodes and anderson 1996) suggests that there is a threshold below which the epidemic eventually dies out and above which it persists. due to the limitations of ca methods, it is difficult to pin down its exact value. however, across many simulation runs, we can still give an approximate interval in which the threshold lies. an interesting question one may ask is how epidemic spreading is affected by the initial configuration of susceptibles and infectives, or whether multiple infection sources make the disease more likely to become endemic. with the other parameters fixed as before, we run simulations with α = 0.002, 0.004, 0.006, and 0.008, respectively. figure 7 presents the result: the mean number of infections vs. time for different initial infected ratios. clearly, as α increases, the infection level shifts upwards. in the case of α = 0.004, the infection level reaches 300 at time t = 2000, higher than the 200 in the case of α = 0.002. by contrast, the level in the case of α = 0.006 climbs to around 380 at time t = 1000 and drops slightly to 340 at time t = 2000. the maximum infection is reached in the case of α = 0.008 at time t = 1000, at more than 440. spatially, more sources of infection imply a greater chance of being infected within a certain area, making the hiv/aids epidemic more likely to spread and persist. statistically speaking, an individual's probability of infection is generally proportional to the number of infectious sources.
intuitively, the more migratory the population, the more likely an epidemic is to spread. suppose an agent's activity can be measured by the number of contacts it makes with others within a unit period of time. accordingly, our model measures individuals' activity by their mobility. the infection levels differ markedly in the later stages: in the case of p_max^m = 0.005, the level is above 120; in the case of p_max^m = 0.007, the level lies in the range (180, 210); and in the last case, p_max^m = 0.009, the infection level fluctuates above 300, higher than in the other cases. so we conclude that mobility plays a significant role in the dynamics. this could explain why the chinese government took rather strong measures to control migration and quarantine infectives and suspected cases during the outbreak of sars in the spring of 2003. as for our model, if agents are configured with higher mobilities, it is more likely that the hiv/aids infection will persist in the population, whilst with lower mobilities the infection gradually diminishes and eventually dies out. next, we examine how neighborhood forms affect the hiv/aids epidemic dynamics. the parameter sets are kept the same as in the benchmark case, except for the adoption of different neighborhood forms. figure 9 illustrates the simulation results generated in two cases: one with a 1 × 1 von neumann neighborhood and the other with a 1 × 1 moore neighborhood. in the case where the von neumann neighborhood is used, the level of infection goes up to about 140, clearly greater than in the case of the 1 × 1 moore neighborhood, where the level only reaches about 100. this result suggests that with a wider neighborhood, an agent is more likely to be influenced by its neighbors, and the likelihood of getting infected therefore increases accordingly. it is easy to deduce that the infection level rises with the order of the neighborhood.
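the movement rule described above can be sketched as follows; the torus (wrap-around) grid is an illustrative assumption, since the text does not specify the boundary condition.

```python
import random

# with probability p_i^m, an agent attempts to move to a random empty adjacent
# cell; if the whole neighborhood is occupied, it stays put (gets "stuck").

def try_move(pos, p_m, occupied, width, rng=random):
    x, y = pos
    if rng.random() >= p_m:
        return pos                      # no move attempt this tick
    adjacent = [((x + dx) % width, (y + dy) % width)   # torus grid (assumption)
                for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                if (dx, dy) != (0, 0)]
    empty = [c for c in adjacent if c not in occupied]
    return rng.choice(empty) if empty else pos
```

with p_m = 0 an agent never moves, and with p_m = 1 it moves on every tick unless its entire neighborhood is occupied, matching the two extreme cases discussed earlier.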
this result also suggests that hiv/aids epidemic dynamics is significantly affected by strong interactions between agents. we now turn to heterogeneous mixing, i.e., the coexistence of different at-risk groups. usually, heterogeneous mixing makes the dynamics more complicated and unpredictable. in the following, we assume the whole population is divided into four groups, as shown in table 2. class p0 has very low infectivity, low resistibility, and a 1 × 1 von neumann neighborhood. it can represent children and/or elders in the population, who rarely infect others but are easily infected. class pl refers to ordinary people, who have relatively low infectivity and high resistibility (therefore low susceptibility); in our model, this class amounts to a large fraction of the whole population. classes ph and ph+ represent the two high-risk groups observed in real life. agents of class ph have high infectivity and low resistibility due to high-risk behaviors such as unprotected sex, needle sharing, unhygienic blood transfusion, and so on. the biased 0 × 1 (or 1 × 0) von neumann neighborhood captures their potentially oriented or biased behaviors. in contrast, agents of class ph+, with both higher infectivity and higher susceptibility, represent those who, although in the minority, form the most dangerous and malevolent group. such a group does exist in real life. for instance, crimes were reported in china in recent years in which a few aids infectives intentionally had sex with uninfected people or stabbed people with contaminated syringes in public places; they blamed their infection on society and the government for failing to provide necessary health services and for inadequate compensation. the 2 × 2 moore neighborhood indicates their intensive influence on others.
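the four-class structure of table 2 can be sketched as a weighted population mix; the class fractions (0.1, 0.7, 0.1, 0.1) follow the typical run reported in the text, while the per-class infectivity/resistibility values stand in for table 3, which is not reproduced here, and are therefore illustrative assumptions only.

```python
import random

# four at-risk classes: p0 (children/elders), pl (ordinary people),
# ph and ph+ (high-risk groups); numeric attribute values are assumed.
CLASSES = {
    "P0":  {"fraction": 0.1, "infectivity": 0.05, "resistibility": 0.3},
    "PL":  {"fraction": 0.7, "infectivity": 0.20, "resistibility": 0.7},
    "PH":  {"fraction": 0.1, "infectivity": 0.60, "resistibility": 0.3},
    "PH+": {"fraction": 0.1, "infectivity": 0.80, "resistibility": 0.6},
}

def build_population(n, classes=CLASSES, rng=random):
    """draw n agents, assigning each a class with the given fractions."""
    names = list(classes)
    weights = [classes[c]["fraction"] for c in names]
    pop = []
    for _ in range(n):
        c = rng.choices(names, weights=weights)[0]
        pop.append({"class": c,
                    "infectivity": classes[c]["infectivity"],
                    "resistibility": classes[c]["resistibility"]})
    return pop
```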
we distinguish class ph+ from class ph in order to see whether such malevolent behaviors have a significant impact on the spread of hiv/aids, and to what extent. while such a rough classification may be imprecise, it is reasonable and well supported by our extended ca model. table 3 gives the values of f_i and r_i used in the following simulations. as we will see later, the results generated with this classification are fairly consistent with those obtained through empirical work. figure 10 gives a typical simulation result in the four-class scenario. the fractions of the four classes here are n_p0 = 0.1, n_pl = 0.7, n_ph = 0.1, and n_ph+ = 0.1. the infection curve in fig. 10 is quite similar to those obtained in the single-class scenarios, except that its level is considerably higher. in this section, we investigate the effect of agents' susceptibility on the dynamics of hiv/aids. given n_p0 = 0.05, n_pl = 0.8, n_ph = 0.1, n_ph+ = 0.05, and the other parameters as before, we let r_ph+ vary. figure 11 gives the infection levels when r_ph+ is set to 0.6, 0.7, 0.8, and 0.9, respectively. as we can see, the equilibrium infections are almost at the same level, which is inconsistent with our expectation. the differences are so small that we cannot say with confidence whether changes in susceptibility have an impact on the epidemic dynamics. the possible reasons, we believe, are twofold: first, the role played by susceptibility may not be as decisive as that of the factors above; second, we may have described susceptibility in a way that makes it an inessential factor in our model. future work will reconsider this issue and find a better way to describe susceptibility. finally, we examine whether changes in the fractions of some classes can affect epidemic behavior. given n_p0 = 0.1, n_pl = 0.7, and the other parameters as before, we take n_ph = 0.12 (n_ph+ = 0.08), 0.14 (0.06), 0.16 (0.04), 0.18 (0.02), and 0.20 (0.0), respectively.
figure 12 shows the infection levels in these five combinations. as observed from fig. 12, we obtain results very similar to those in the foregoing analysis: as n_ph+ decreases, the infection level declines. recall that ph+ has more influence on its neighbors than ph, leading to greater transmission and a larger infection rate. this finding suggests that governments should pay more attention to those who have high-risk lifestyles and revengeful behaviors, which makes how to restrict and control highly infectious and malevolent individuals an essential issue. the focus of this paper is on modeling the entire course of hiv/aids epidemics and the heterogeneity in agents' behaviors. classical ca models are capable of describing the spread of common epidemics but fail to represent complicated epidemics like hiv/aids. ignoring heterogeneity gives rise to unacceptable errors in the prediction of development trends. in addition, the components of a conventional ca system, such as the topological form of the grid, the definition of the neighborhood, and the state transition rules, are simple and unchanged over time. this makes the modeling of complicated dynamics such as hiv/aids transmission difficult and uncontrollable. in this paper, we have developed an extended ca model to capture key epidemiological and clinical features of the hiv/aids epidemic. first, we explicitly model and simulate the whole progression of hiv/aids disease (i.e., infected but not infectious, infectious but asymptomatic, symptomatic, and deceased). this improvement can give us a better understanding of the dynamics during the entire epidemic. in order to examine various degrees of influence between agents, we have introduced an extended definition of neighborhood to represent the intensity and bias of influence. this lets us gain insight into how various degrees of interaction affect the hiv/aids epidemic.
another type of heterogeneity, in disease-related attributes such as susceptibility, infectivity, and the durations of epidemic phases, is also taken into consideration. moreover, we consider the effect of agents' mobility on the epidemic dynamics. given all these improvements, we have obtained richer simulation results, similar to those usually found in mathematical models or other classical simulation models, and have identified some influential factors that greatly affect the hiv/aids epidemic dynamics. the main findings are that: 1) the hiv/aids epidemic can end up in two regimes, extinction and persistence; 2) as factors such as agents' mobility, population density, initial infection ratio, and the extent of the neighborhood increase, the infection level gets higher; after crossing some critical point, the generated regime can change from dying out to persistence, a result that is robust across many of the tested parameter combinations; 3) in four-class scenarios, a greater fraction of 'super' infectives (the ph+ class in our model) also produces a higher level of infection. however, our simulation study is still preliminary, and some issues need to be addressed. first, as said before, we should redefine susceptibility in a better way to check its role in the dynamics of the hiv/aids epidemic. second, most models posit that a virus carrier's infectivity is constant during the progress of the disease; this is not the case for hiv/aids, and varying infectivity at different stages could have a substantial impact on the dynamics of transmission. this problem needs special attention. in addition, further developments of our model, e.g., adding age-related structure (griffiths et al. 2000), different subgroup classifications, and other forms of heterogeneity, would greatly add to its appeal. these additions would require a better understanding of hiv/aids and thorough empirical work.
finally, a natural extension of the model is to include the assessment of various control policies and managerial strategies, which would firmly support decision-making in prevention programs against hiv/aids.

references (titles as recovered from the extracted text):
- on modeling epidemics, including latency, incubation and variable susceptibility
- the epidemiology of hiv infection: variable incubation plus infectious periods and heterogeneity in sexual activity
- infectious diseases of humans: dynamics and control
- possible demographic consequences of aids in developing countries
- epidemiology of hiv infection and aids: incubation and infectious periods, survival and vertical transmission
- a simulation model of the dynamics of hiv transmission in intravenous drug users
- a comparison of simulation models applied to epidemics
- the mathematical theory of infectious diseases and its applications
- dynamics of hiv infection on 2d cellular automata
- models for transmission of disease with immigration of infectives
- survival and mortality patterns of an acquired immunodeficiency syndrome (aids) cohort in new york state
- mathematical modeling in epidemiology
- cellular automata and epidemiological models with spatial dependence
- a heterogeneous cellular automata model for sars transmission
- a cellular automaton describing the formation of spatially ordered structures in chemical systems
- a cellular automaton model of excitable media
- a simple agent model of an epidemic
- an age-structured model for the aids epidemic
- the differential infectivity and staged progression models for the transmission of hiv
- a model for the influence of the greenhouse effect on insect and microorganism geographical distribution and population dynamics
- a model for predicting forest fire using cellular automata (ecological modelling)
- a contribution to the mathematical theory of epidemics
- the dynamics of hiv spread: a computer simulation model
- flood decision support system on agent grid: method and implementation
- algebraic properties of cellular automata
- transmission dynamics of hiv infection
- the transmission dynamics of human immunodeficiency virus (hiv) [and discussion]
- incubation period of aids in patients infected via blood transfusion
- the distribution of the incubation period for the acquired immunodeficiency syndrome (aids)
- mathematical biology i
- experiences creating three implementations of the repast agent modeling toolkit
- persistence and dynamics in lattice models of epidemic spread
- epidemic thresholds and vaccination in a lattice model of disease spread
- survival with the acquired immunodeficiency syndrome: experience with 5833 cases in new york city
- effects of population density on the spread of disease
- diffusion and wave propagation in cellular automaton models of excitable media
- modelling the hiv epidemic: a state-space approach
- effectiveness of q-learning as a tool for calibrating agent-based supply network models

we would like to express our gratitude to the many people in xi'an jiaotong university who participated in planning, data collection, and presenting the results. without their efforts, this simulation modeling would not have been possible. this work is supported by nsfc under contract 70601023. we are grateful to ting zhang and jie huang for excellent research assistance and to the referees for many helpful comments that greatly improved the presentation.
key: cord-020193-3oqkdbq0 authors: bley, katja; schön, hendrik; strahringer, susanne title: overcoming the ivory tower: a meta model for staged maturity models date: 2020-03-06 journal: responsible design, implementation and use of information and communication technology doi: 10.1007/978-3-030-44999-5_28 sha: doc_id: 20193 cord_uid: 3oqkdbq0 when it comes to the economic and strategic development of companies, maturity models are regarded as silver bullets. however, the existing discrepancy between the large amount of existing, differently developed models and their rare application remains astonishing. we focus on this phenomenon by analyzing the models’ interpretability and possible structural and conceptual inconsistencies. by analyzing existing, staged maturity models, we develop a meta model for staged maturity models so different maturity models may share common semantics and syntax. our meta model can therefore contribute to the conceptual rigor of existing and future maturity models in all domains and can be decisive for the success or failure of a maturity measurement in a company. economic development, the assumption of growth, and the ongoing transformation of processes increase the competitive pressure on enterprises of all sizes. hence, they search for tools that can help determine their current benchmarking position or assess their subsequent performance compared to a predefined best-practice performance [1] . one famous example of these benchmarking tools is maturity models (mms). considering their components, two definitions result: 'maturity' as a "state of being complete, perfect or ready" [2] and 'model' as "an abstraction of a (real or language based) system allowing predictions or inferences to be made" [3] . thus, an mm can be regarded as an abstraction of a system that allows predictions or inferences about a complete or perfect state. 
its aim is a structured, systematic elaboration of best practices and processes related to the functioning and structure of an organization [4]. the mm is divided into different levels, which are used as benchmarks for the overall maturity of an organization. by applying it, an object is able to assess its own position within a predefined best-practice approach. awareness of the maturity level in a particular domain is necessary in order to recognize improvement potential and to stimulate a continuous improvement process [5]. especially when it comes to emerging phenomena like digitalization or the industrial internet, mms have become a favored instrument, particularly for smes, for determining their own digitalization status and corresponding guidelines for improvement. but it is not only in this area that there has been an enormous increase in mm publications in recent years. due to their general applicability, their rather simple development, and the promised competitive advantages, many mms have emerged in both science and practice over the last decade. they differ in their concepts of maturity, the domains considered, the development approach, and the target group [4, 6]. although several approaches to rigorous development exist (e.g., [7-9]), and some mms already follow these guidelines and can be considered complete and thorough, a plethora of developers still see a need for new and different kinds of mms. a possible reason for this phenomenon is that current mms lack consistency in their semantics, syntax, and concepts. this leads to a situation in which many different mms introduce ontologies that are conceptually the same but are used for different subjects and in different contexts, so that researchers do not recognize similar models, or in which developers try to differentiate themselves from existing approaches by using different terminology (e.g., level vs. stage, maturity vs. assessment, dimension vs. area).
this, in turn, weakens the benefit of mms for companies, as they are unable to understand, compare, and apply the models due to differing terminology or incomprehensible concepts and relationships. as a consequence, a solution is needed that offers theoretical and practical remedies for the semantic and syntactic pitfalls mentioned. we address this problem of conceptual divergence by introducing a uml-based meta model of mms: a formal construct that summarizes the various mm concepts, features, and interdependencies and relates them to each other. each mm can be an instance of this meta model, as it provides a conceptual template for the rigorous development of new maturity models and the evaluation of existing ones. we thereby focus on staged mms, as their holistic and cross-organizational structure is dominant in mm research. based on the most popular mms as well as several newly developed ones, we summarize and discuss existing mm concepts, and we show a valid instantiation of our meta model by classifying different staged mms, in terms of their properties and concepts, against it. according to [10], mms typically consist of: (a) a number of levels (commonly 3-6), (b) a generic description of each level, (c) a number of dimensions or "process areas", (d) a number of elements or activities for each process area, and (e) a description of how each activity or element might be performed at each maturity level. the purpose of an mm assessment can vary: depending on its intended aim, the mm focuses on assessing the current state of development (descriptive), identifying improvement potential (prescriptive), or serving as a benchmarking tool (comparative) for the object under study [11]. mm development looks back on an evolution of around 40 years.
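the typical mm ingredients listed in (a)-(e) can be sketched as a small object model; the class and field names here are our own illustrative choices, not the paper's uml vocabulary, and the staged assessment rule (the weakest dimension caps the overall maturity) is one common interpretation of staged mms.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Activity:
    name: str
    # (e) how the activity might be performed at each maturity level
    performance_by_level: Dict[int, str] = field(default_factory=dict)

@dataclass
class Dimension:  # (c) a dimension, a.k.a. "process area"
    name: str
    activities: List[Activity] = field(default_factory=list)  # (d)

@dataclass
class Level:      # (a)/(b) a level with its generic description
    rank: int
    name: str
    description: str

@dataclass
class StagedMaturityModel:
    name: str
    levels: List[Level]
    dimensions: List[Dimension]

    def assess(self, achieved: Dict[str, int]) -> int:
        """staged logic: overall maturity is the highest rank reached in
        every dimension (achieved maps dimension name -> rank)."""
        return min(achieved.get(d.name, 0) for d in self.dimensions)
```

under this reading, an organization at level 3 in one process area but level 2 in another is assessed at level 2 overall, which is the usual staged (as opposed to continuous) interpretation.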
the early approaches to these models reach back to the late 1970s, when crosby and nolan developed the first precursors of today's mms: the five-staged quality management maturity grid and the stage theory of electronic data processing [12, 13]. the concept gained more and more attention, and to date the capability maturity model integration (cmmi) [14] represents one of the most well-known and widely applied mms. although there are different types of mms, we focus on the dominant form, the staged mm. its basic assumption is that an organization constantly evolves, inter- and intra-organizationally, due to learning effects and improvements. this evolution process is represented in a simplified step-by-step approach using a certain number of maturity degrees (usually 3-6), each defined by a combination of key factors that must be present in order to achieve that level of maturity. figure 1 shows a common representation of a staged mm. when using such an mm, the presence of the relevant factors is measured via indicators. after conducting this assessment, the results are summarized with corresponding evaluations and a representative maturity level (see fig. 1). researchers who develop an mm have to decide on a research design for model development (e.g., empirical qualitative/quantitative, conceptual, design-oriented, or other), the specific research methods to be applied (e.g., case studies, surveys, literature reviews, interviews, conceptual developments, or no method at all), and the research content to be elaborated and described (e.g., conceptual, descriptive, assessment, comparative, or other) [4]. considering the long history of mms, multiple development guidelines have evolved over time that focus on these decisive processes and can be regarded as supportive instruments for mm development. [9] present the most established standard for the development of mms: a six-step generic phase model comprising scope, design, populate, test, deploy, and maintain.
a different procedure model is postulated by [8], who strongly focus on a design science research approach. they suggest an eight-phase development approach covering problem definition, comparison of existing mms, determination of the development strategy, iterative mm development, conception of transfer and evaluation, implementation of transfer media, evaluation, and rejection of the mm. [7] set up general design principles (principles of form and function) for the development of useful mms. a framework for the characterization of existing mms is presented by [15], who, in order to create sound and widely accepted mms, proposes two "maturity model cycles". in summary, many approaches can support researchers in creating mms. however, these guidelines are limited in their interpretability and validity, as they do not provide concrete terminology specifications or structural concept models. consequently, many models have been built following these recommendations, yet they still differ in the terminology used, their descriptive principles, the extent of the information provided about factors, and the factors' influence on the degree of maturity [7]. despite their salient popularity, there are also critical voices regarding the scientific implications and practical applicability of mms. from a scientific point of view, [7] criticize a lack of empirical research foundation in the development process. this, in turn, often leads to a simple copying of existing model structures (e.g., five-staged mms) without considering their conceptual suitability for the given context. furthermore, [16] point out the lack of validation in the selection of factors or dimensions for a model, as well as the assumed linear course of maturation. another aspect is the missing operationalization of maturity measurements, which impedes the replication of an evaluation. similarly, [17] argue that simply demonstrating a gap (as done in many mms) is not enough.
only understanding cause and effect can help to guide action; otherwise, these models might convey the impression of "falsified certainty" to their users [18]. a common promise of mms is an increase in competitiveness, effectiveness and efficiency in a specific domain, or a general improvement of an organization's constitution. however, mms lack a clear description and definition of these concepts in their conceptual grounding, making them theoretical constructs rather than applicable models for users (i.e., smes). [19] summarize shortcomings of mms from a practical perspective. they state that mms are often inflexible, although high flexibility is required. similar to [17], they argue that mms focus on identifying gaps and raising awareness of a specific topic, but enterprises are in need of concrete recommendations for action. furthermore, these models are disciplinary, impractical, and overwhelming for many enterprises, especially for smes where a corresponding executive management level is missing.

however, not only the application and general structure of models are a topic of criticism. inconsistent terminology of the models' concepts plays a major role in mm research. when developing new mms, developers tend to invent new titles for common mm approaches in order to stand out from the plethora of already existing mm terms. for instance, designations like "maturity model", "assessment model", "roadmap", "maturity framework", "maturity matrix", "guide to maturity", and "excellence model" are all synonyms for the concept of an mm. inconsistent terminology used within mms indirectly results in an inflation of similar mms, as these models will not be identified as identical or at least similar approaches.
this inconsistent terminology may cause, among other problems, two different kinds of semantic and syntactic mismatches between mms: (a) developers use the same terminology for maturity concepts, but each concept has a different interpretation (e.g., homonyms), and (b) developers address the same concept but label it in a different way (e.g., synonyms). for instance, to describe the field of interest an mm is applied to, mms use different terms: "domain", "area", or "dimension". not only the terminology but also the relationships between these concepts remain unclear; there rarely is a standardized definition framework of concepts and relationships. although there exists a meta model for project management competence models, it is only available in german and focuses on the broader application context, not on the maturity model itself [20]. in conclusion, these inconsistencies represent a research gap, as they lead to a situation where a comparison of existing mms is almost impossible due to a lack of understanding of what is actually provided.

a recent study revealed that almost all scientific and many consultancy mms (in the field of industrial internet) are still not being used or are even unknown in business practice, although potential applicants stated a need for and interest in these models in general [21]. as this discrepancy comes as a surprise, we assume a weak point either in the development or the application of mms (fig. 2); otherwise, the reluctant use in practice is not explicable. a possible explanation for this phenomenon may be located in the different interpretations of mms' concepts on both sides of stakeholders. developers may use concepts and relationships that are interpreted differently by model applicants and by other developers in the same field of research. as a result, possible applicants as well as researchers misinterpret an mm's structure and concepts.
in other words, syntax and semantics vary across mms, which is why, on the one hand, companies (as applicants) do not use or apply the models, as they simply do not understand their structure and effects. on the other hand, researchers tend to develop ever more new mms, as they do not recognize existing models as similar or comparable mm approaches (i.e., from 2016 to 2018, 18 new mms were developed in the domain of industry 4.0 [21]). what is needed is an overarching, holistic conceptual framework that can unify and standardize the individual objectives and goals. by focusing on the developer's perspective in this paper, the consistency of existing models can be validated, and future models can be developed in a rigorous way, which will then be the foundation for a sound assessment of models later on.

a meta model, as "a model of models" [22], is an approach that is able to meet the above-mentioned requirements. originating from software engineering, it can be understood as a model specifying underlying models that are considered instantiations of the meta model. the meta model itself is instantiated from a meta meta model [23]. in many cases, the meta meta model specifies a subset of the modeling language uml. a meta model typically represents the language (linguistic) and/or structure (ontological) of the underlying (that is, the subordinate) models [24]. it is used as an abstract description of (1) the unification of concepts that are part of the model, (2) the specification and definition of these concepts, and (3) the specification and definition of the relationships between them. thus, multiple valid models can be instances of the same meta model.

the schematic meta-modeling approach for mms is shown in fig. 3. on the top layer (m2), the developed meta model is located. it specifies the underlying mm on m1. every concept on m1 has to adhere to a type concept specified on m2. please note that the mm on m1 is, in principle, generally valid for all enterprises.
it specifies the ranges and used concepts for a later assessment. the developer creates an mm type on m1 (according to the meta model on m2), which is "instantiated" on m0. on the m0 level, the "instantiated" and applied mm characterizes a concrete enterprise. this can be done by the user (the person who applies a constructed mm to an enterprise). with a detailed meta model for mms, different mms can be regarded as instances of it, leading to common semantics and syntax across those models. only by providing a unified meta model ("specification" in fig. 3) that reveals concepts, relations and entities will the developer be able to develop a rigorous and logical mm through instantiation ("structure" in fig. 3). our research can therefore contribute fundamentally to the development of future staged mms in all areas, as an mm's structure on the m1 level and its instantiation on the m0 level are directly interdependent and thus decisive for the success or failure of a maturity measurement in a company. however, compared to classical meta-modeling, where the m1 model concepts serve as types for m0 level instances, here, the instantiation of m0 from m1 must be interpreted differently: during assessment (the process that creates the m0 instances), only some of the concepts from the m1 level are instantiated with "values" that ultimately (typically by calculation) yield the resulting maturity level for the assessed enterprise.

the development of the meta model for maturity models (4m) was based on a study of the most common and representative staged mms. in order to elaborate sufficient meta model elements that are valid for a broad class of staged mms, an analysis of different staged mms, their development and their structure was conducted to summarize and analyze existing concepts, their relationships as well as their multiplicities and instantiations. as a result, universally applicable meta concepts were found and related to each other.
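the m2/m1/m0 layering described above can be illustrated with a small sketch. this is not the authors' 4m implementation; all class, attribute and value names below are hypothetical, chosen only to show how m1 concepts instantiate m2 types and how an assessment fills in m0 values.

```python
from dataclasses import dataclass, field

# --- m2: the meta model defines the *types* of concepts an mm may contain ---
@dataclass
class FactorType:
    name: str

@dataclass
class MaturityLevelType:
    rank: int
    name: str

@dataclass
class MaturityModel:  # instances of this class live on m1
    domain: str
    levels: list = field(default_factory=list)
    factors: list = field(default_factory=list)

# --- m1: a concrete (but enterprise-independent) maturity model ---
mm = MaturityModel(domain="IT management")
mm.levels = [MaturityLevelType(1, "initial"), MaturityLevelType(2, "managed")]
mm.factors = [FactorType("process documentation"), FactorType("tool support")]

# --- m0: an assessment "instantiates" the m1 model with measured values ---
# only some m1 concepts receive values here, mirroring the assessment process
assessment = {factor.name: None for factor in mm.factors}
assessment["process documentation"] = 0.8
assessment["tool support"] = 0.4
```

note that, as in the text, the m0 layer is not a full copy of the m1 model: only the value-carrying concepts are instantiated for a concrete enterprise.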
for that, we followed an opportunistic approach: we identified the necessary meta model concepts (and their relations) by reduction and selection via decision criteria based on the available content of the investigated mms. the resulting meta model in fig. 4 and table 1 shows the m2 level structure as a conceptual and formal description of an mm, whose instances (on m1 level) reflect the actual concepts and interrelations within the various staged mms investigated in our study (see also sect. 5). the staged structure of mms is built on common assumptions, which influenced the meta model's design:

1. a maturity model is typically developed within a domain, as it represents a path of growth for a specific field of interest.

2. an mm consists of several maturity levels (i.e., l1-l5), which have an ascending order and are intended to represent the improvement path of an object's ability.

3. the mm can (but does not have to) be divided into dimensions (i.e., d1-d5). these dimensions divide the object of study into fields of interest, mostly depending on the domain in which the model is applied. this subdivision serves the possibility of an incremental measurement of maturity in a certain area.

4. factors are properties of the maturity level and (if available) are grouped in dimensions. there are two possible ways of maturity determination by factors:

a. in the first approach, the respective maturity level is specified by checking the threshold of certain (related) factors that are applied. due to the fact that a factor could be used on several levels (however, with different values per level), the factor specification, which assigns the concrete expected value of a factor per maturity level, is used between both. thus, e.g., level l1 could use factor f with the factor specification f_l1, while level l2 is still able to use f as well (then with its own factor specification f_l2).
this is used to express evolving enterprise maturity via growing requirements per level of the same factor.

b. second, there is the possibility that the maturity level can be determined per dimension instead of via "requirements per level". for this purpose, the factors need to be assigned to the respective dimensions and calculated by using a dimension-specific aggregation formula (e.g., sum, average, median). in contrast to approach (a), the dimensions do not use any threshold as intermediate value. thus, a factor can only be assigned to one dimension, and the dimension maturity is calculated from its set of factors (and their current values). a factor cannot be used twice, and the dimension maturity is always a representation of the current state of the enterprise (within this dimension). the mm can make use of both several dimension maturities and an overall level maturity, since maturity levels and dimensions are modeled orthogonally in the meta model. however, the specification of the maturity level is still mandatory.

5. within an mm, a factor's value is still abstract and not directly transferable to an object under study. this transfer is done by using indicators. indicators are measurable properties of one or more factors and are directly retrievable (measurable) for an object under study. as a factor can have many indicators, and each indicator can be relevant for multiple factors, the factors retrieve their values from their indicators by using an indicator aggregation. this is a purely technical construct to allow the shared usage of indicators by different factors. however, the aggregation functions (e.g., sum, average, median) are assigned to the factor to yield its value.

6. the indicator measurement is done on the basis of indicator types. this could be, e.g., likert, ordinal, or cardinal scales, counting or index values.

the final meta model with its concepts can be used to develop an initial staged mm.
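a minimal sketch of the two determination paths just described (4.a threshold-based factor specifications and 4.b dimension aggregation), together with indicator aggregation, might look as follows. the indicator names, thresholds and aggregation choices are illustrative assumptions only, not part of the 4m itself.

```python
from statistics import mean

# indicators are measurable properties; factors aggregate their indicators
indicators = {"docs_coverage": 0.9, "audit_score": 0.7, "tool_usage": 0.5}
factor_indicators = {
    "process documentation": ["docs_coverage", "audit_score"],
    "tool support": ["tool_usage"],
}

def factor_value(factor):
    # indicator aggregation: here, the mean of the factor's indicators
    return mean(indicators[i] for i in factor_indicators[factor])

# (4.a) threshold-based: each level carries its own factor specification,
# so the same factor can appear on several levels with growing requirements
level_specs = {
    1: {"process documentation": 0.5},
    2: {"process documentation": 0.7, "tool support": 0.6},
}

def maturity_level():
    achieved = 0
    for level in sorted(level_specs):
        spec = level_specs[level]
        if all(factor_value(f) >= threshold for f, threshold in spec.items()):
            achieved = level
        else:
            break  # levels are ordered; a failed level stops the climb
    return achieved

# (4.b) dimension-based: each factor belongs to exactly one dimension,
# and the dimension maturity is an aggregate (here: mean) of its factors
dimensions = {"processes": ["process documentation"], "technology": ["tool support"]}

def dimension_maturity(dim):
    return mean(factor_value(f) for f in dimensions[dim])
```

with the toy values above, the threshold path yields maturity level 1 (the "tool support" factor misses the level-2 specification), while the dimension path reports a per-dimension degree of maturity instead of a single rank.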
the defined elements (concepts) and their relationships are regarded as best practices in mm development and are the concepts practically needed in order to build a functioning staged mm. although one could opt for several variations, the concepts already discovered in the meta model are the sum of (extracted) expert knowledge from previous mms. thus, the absence of any part of the meta model in an actual mm is neither considered an issue in the meta model, nor is the mm regarded as incorrect. rather, the interpretation is that the actual mm lacks a feature that is typically used by other mms, and it could potentially be improved with such a feature. further, the relatively small number of meta concepts is not necessarily a drawback, since the meta model only contains the most powerful concepts found. despite the few concepts, it provides a strong, defined, and universal framework for a whole class of staged mms.

table 2 summarizes an analysis of currently existing and applicable staged mms regarding the mapping between the developed meta model and the concepts used. the selected mms represent a sample from different domains, years, and development approaches. the bpmm and cmmi are representatives of the most famous mms developed [14, 25]. leyh et al. [26], gökalp et al. [27], schumacher et al. [28], and luftman [29] are representatives of scientific mm approaches from different years.
table 1 (concept definitions):

factor: property of the organization which represents the object/area/process of investigation; used by one or more maturity levels and can be related to a dimension within an organization.

factor specification: technical and foundational requirement construct for determining the maturity level; acts as the maturity level's individual expression of a factor (needed due to multi-usage of factors).

indicator: measurable property of one or more factors within an organization; uses an indicator type for measurement.

indicator type: measuring method for determining the value of the indicator.

maturity level: rank of the organizational maturity that results from factor evaluation by using the factor specification (see 4.a) or aggregation of a dimension's related factors (see 4.b); subdivides the maturity model and represents a relative degree of organizational ability/maturity.

from the analysis of existing mms, we conclude that very few models have in-depth explanations of all the concepts and relationships used. however, almost every examined model fulfills common concepts like the mm's name, the domain in which it is located, as well as a description of its respective maturity levels (except one). the concept names "factor" and "indicator" are not used within the models. although [27] use the term "indicators", the definition matches the 4m concept of a factor. the 4m concept of "indicator" is not further described in their model. in general, the low-level concepts (like, e.g., indicator) are rarely specified, possibly due to abstraction from the actual calculation of a maturity level. when it comes to the interpretation of the examined concepts and relationships of factors, indicators, and requirements, no mm can be regarded as an exact instantiation of our meta model, as not every concept is present.
there is often a lack of information in the available publications about the factors and indicators that are subject to further calculations, or about which requirements can be used to derive the maturity level. the mere mentioning of concepts allows the conclusion that the authors have basically dealt with the respective concepts, but the relationships between them remain undefined. possible general reasons for this phenomenon could be the length of the publications in which the mms are presented: conference papers are often too short to allow a comprehensive presentation of all underlying assumptions. another reason could be that papers do not describe mms for application purposes, but rather focus on their contribution. however, we do not consider existing staged mms that do not match the meta model as incorrect. the meta model, as a conceptual orientation, could help other researchers or developers to understand and interpret the respective intention, and it improves the developer's cycle of mm development. we claim that this will help these models to overcome their ivory tower. the rather few but strong concepts defined in the meta model are a core skeleton of staged mms that may guide both the development and the understanding of mms.

although we have built our meta model on a broad analysis of existing models, our research has limitations. our meta model approach is only valid for staged models and therefore cannot explain other types of mms (continuous or focus area mms). in addition, the meta model initially only considers the development cycle and must be further developed for the application cycle. however, as [15] proposed, both cycles should not be analyzed concurrently anyway, as they differ in their requirements. to this point, we have been able to show with the current approach that the consistency and concepts of existing mms are often not entirely described in the development cycle.
it will be the focus of our future work to concentrate on the application cycle and to compare concepts of assessment with the concepts provided in existing mms.

in this paper, a meta model for mms was introduced, which can be used for a standardized and consistent development of mms. the related concepts, elements, and their relationships were explained and specified in detail. this was done by analyzing relevant literature that introduces staged mms and by extracting their core concepts, including their syntax and semantics. further, the final 4m was evaluated against several mms from the research literature, showing that the majority of mms lack an exact specification of their elements and relationships. this uncertainty and divergence in mm specifications often lead to inconsistent applications and implications derived from their application. the presented 4m is therefore beneficial for consistent mm development and comparison. the 4m, together with its defined concepts and relationships, is a compact but powerful tool for mm developers for the initial development and evaluation of their work, as well as for staying consistent with related work and the staged mm semantics in general. however, the 4m only covers the structural part of an instantiated mm. consideration of the assessment part as an integrated aspect of the 4m would be valuable for the mm user. we therefore intend to develop the assessment aspect in the next research step. also, we do not claim completeness regarding the concepts in the 4m, since we only introduced the concepts that are used by many different mms. additional concepts could be introduced; however, we strove for simplicity and usability instead of a complex and over-specified meta model. the 4m is only applicable for staged mms; thus, a meta model for other mm types (e.g., continuous mms) has to be constructed separately.

references:
a role-based maturity model for digital relevance
the oxford english dictionary (clarendon)
matters of (meta-) modeling
the maturity of maturity model research: a systematic mapping study
operational excellence driven by process maturity reviews: a case study of the abb corporation
business process maturity models: a systematic literature review
what makes a useful maturity model? a framework of general design principles for maturity models and its demonstration in business process management
developing maturity models for it management
understanding the main phases of developing a maturity assessment model
the use of maturity models/grids as a tool in assessing product development capability
prozessverbesserung mit reifegradmodellen
quality is free: the art of making quality certain
software engineering institute: cmmi® for development
maturity assessment models: a design science research approach
maturity models development in is research: a literature review
knowing "what" to do is not enough: turning knowledge into action
inductive design of maturity models: applying the rasch algorithm for design science research
project management maturity models: the silver bullets of competitive advantage?
kompetenz- und reifegradmodelle für das projektmanagement: grundlagen
maturity models in the age of industry 4.0 - do the available models correspond to the needs of business practice?
object management group: mda guide version
object management group: uml infrastructure specification
metamodellierung als instrument des methodenvergleichs: eine evaluierung am beispiel objektorientierter analysemethoden (shaker)
business process maturity model (bpmm)
the application of the maturity model simmi 4.0 in selected enterprises
development of an assessment model for a maturity model for assessing industry 4.0 readiness and maturity of manufacturing enterprises
assessing business-it alignment maturity

key: cord-159103-dbgs2ado
authors: rieke, nicola; hancox, jonny; li, wenqi; milletari, fausto; roth, holger; albarqouni, shadi; bakas, spyridon; galtier, mathieu n.; landman, bennett; maier-hein, klaus; ourselin, sebastien; sheller, micah; summers, ronald m.; trask, andrew; xu, daguang; baust, maximilian; cardoso, m. jorge
title: the future of digital health with federated learning
date: 2020-03-18
journal: nan
doi: nan
sha: doc_id: 159103
cord_uid: dbgs2ado

data-driven machine learning has emerged as a promising approach for building accurate and robust statistical models from medical data, which is collected in huge volumes by modern healthcare systems. existing medical data is not fully exploited by ml, primarily because it sits in data silos and privacy concerns restrict access to this data. however, without access to sufficient data, ml will be prevented from reaching its full potential and, ultimately, from making the transition from research to clinical practice. this paper considers key factors contributing to this issue, explores how federated learning (fl) may provide a solution for the future of digital health, and highlights the challenges and considerations that need to be addressed.

research on artificial intelligence (ai) has enabled a variety of significant breakthroughs over the course of the last two decades. in digital healthcare, the introduction of powerful machine learning-based and particularly deep learning-based models [1] has led to disruptive innovations in radiology, pathology, genomics and many other fields. in order to capture the complexity of these applications, modern deep learning (dl) models feature a large number (e.g.
millions) of parameters that are learned from and validated on medical datasets. sufficiently large corpora of curated data are thus required in order to obtain models that yield clinical-grade accuracy, whilst being safe, fair, equitable and generalising well to unseen data [2, 3, 4]. for example, training an automatic tumour detector and diagnostic tool in a supervised way requires a large annotated database that encompasses the full spectrum of possible anatomies, pathological patterns and types of input data. data like this is hard to obtain and curate. one of the main difficulties is that, unlike other data which may be shared and copied rather freely, health data is highly sensitive, subject to regulation, and cannot be used for research without appropriate patient consent and ethical approval [5]. even if data anonymisation is sometimes proposed as a way to bypass these limitations, it is now well-understood that removing metadata such as patient name or date of birth is often not enough to preserve privacy [6]. imaging data suffers from the same issue: it is possible to reconstruct a patient's face from three-dimensional imaging data, such as computed tomography (ct) or magnetic resonance imaging (mri). the human brain itself has also been shown to be as unique as a fingerprint [7], where subject identity, age and gender can be predicted and revealed [8].

another reason why data sharing is not systematic in healthcare is that medical data are potentially highly valuable and costly to acquire. collecting, curating and maintaining a quality dataset takes considerable time and effort. these datasets may have a significant business value and so are not given away lightly. in practice, openly sharing medical data is often restricted by data collectors themselves, who need fine-grained control over the access to the data they have gathered.
federated learning (fl) [9, 10, 11] is a learning paradigm that seeks to address the problem of data governance and privacy by training algorithms collaboratively without exchanging the underlying datasets. the approach was originally developed in a different domain, but it recently gained traction for healthcare applications because it neatly addresses the problems that usually exist when trying to aggregate medical data. applied to digital health, this means that fl enables insights to be gained collaboratively across institutions, e.g. in the form of a global or consensus model, without sharing the patient data. in particular, the strength of fl is that sensitive training data does not need to be moved beyond the firewalls of the institutions in which it resides. instead, the machine learning (ml) process occurs locally at each participating institution and only model characteristics (e.g. parameters, gradients etc.) are exchanged. once training has been completed, the trained consensus model benefits from the knowledge accumulated across all institutions. recent research has shown that this approach can achieve a performance that is comparable to a scenario where the data is co-located in a data lake, and superior to models that only see isolated single-institutional data [12, 13]. for this reason, we believe that a successful implementation of fl holds significant potential for enabling precision medicine at large scale. the scalability with respect to the number of patients included for model training would facilitate models that yield unbiased decisions, optimally reflect an individual's physiology, and are sensitive to rare diseases, in a way that is respectful of governance and privacy concerns.
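the training process just described (local training behind each institution's firewall, exchange of model parameters only, aggregation into a consensus model) can be sketched as a minimal federated-averaging loop. the linear-regression model, toy datasets and hyperparameters below are illustrative assumptions, not part of any specific fl framework.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_train(weights, X, y, lr=0.1, steps=10):
    """a few local gradient steps of linear regression; the data never leaves."""
    w = weights.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

# three "institutions", each holding its own private dataset
true_w = np.array([2.0, -1.0])
datasets = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    datasets.append((X, X @ true_w))

global_w = np.zeros(2)
for _ in range(20):  # federated rounds
    # each institution trains locally and returns only its parameters
    local_updates = [local_train(global_w, X, y) for X, y in datasets]
    # parameter server: average the returned local models into a consensus
    global_w = np.mean(local_updates, axis=0)

print(global_w)  # converges toward true_w for this noiseless toy data
```

in a real deployment the averaging would typically be weighted by local dataset size and the "institutions" would be separate processes or sites; the point here is only that raw data never appears in the aggregation step.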
whilst fl still requires rigorous technical consideration to ensure that the algorithm is proceeding optimally without compromising safety or patient privacy, it does have the potential to overcome the limitations of current approaches that require a single pool of centralised data. the aim of this paper is to provide context and detail for the community regarding the benefits and impact of fl for medical applications (section 2) as well as to highlight key considerations and challenges of implementing fl in the context of digital health (section 3). the medical fl use-case is inherently different from other domains, e.g. in terms of the number of participants and data diversity, and while recent surveys investigate the research advances and open questions of fl [14, 11, 15], we focus on what it actually means for digital health and what is needed to enable it. we envision a federated future for digital health and hope that this article inspires the community and raises awareness.

ml and especially dl are becoming the de facto knowledge discovery approach in many industries, but successfully implementing data-driven applications requires that models are trained and evaluated on sufficiently large and diverse datasets. these medical datasets are difficult to curate (section 2.1). fl offers a way to counteract this data dilemma and its associated governance and privacy concerns by enabling collaborative learning without centralising the data (section 2.2). this learning paradigm, however, requires consideration from and offers benefits to the various stakeholders of the healthcare environment (section 2.3). all these points will be discussed in this section.

[figure 1 caption] (a) a parameter server distributes the model and each node trains a local model for several iterations, after which the updated models are returned to the parameter server for aggregation. this consensus model is then redistributed for subsequent iterations. (b) decentralised architecture via peer-to-peer: rather than using a parameter server, each node broadcasts its locally trained model to some or all of its peers and each node does its own aggregation. (c) hybrid architecture: federations can be composed into a hierarchy of hubs and spokes, which might represent regions, health authorities or countries.

data-driven approaches rely on datasets that truly represent the underlying data distribution of the problem to be solved. whilst the importance of comprehensive and encompassing databases is a well-known requirement to ensure generalisability, state-of-the-art algorithms are usually evaluated on carefully curated datasets, often originating from a small number of sources, if not a single source. this implies major challenges: pockets of isolated data can introduce sample bias in which demographic (e.g. gender, age etc.) or technical imbalances (e.g. acquisition protocol, equipment manufacturer) skew the predictions, adversely affecting the accuracy of prediction for certain groups or sites. the need for sufficiently large databases for ai training has spawned many initiatives seeking to pool data from multiple institutions. large initiatives have so far primarily focused on the idea of creating data lakes. these data lakes have been built with the aim of leveraging either the commercial value of the data, as exemplified by ibm's merge healthcare acquisition [16], or as a resource for economic growth and scientific progress, with examples such as nhs scotland's national safe haven [17], the french health data hub [18] and health data research uk [19].
substantial, albeit smaller, initiatives have also made data available to the general community, such as the human connectome [20], uk biobank [21], the cancer imaging archive (tcia) [22], nih cxr8 [23], nih deeplesion [24], the cancer genome atlas (tcga) [25], the alzheimer's disease neuroimaging initiative (adni) [26], or as part of medical grand challenges such as the camelyon challenge [27], the multimodal brain tumor image segmentation benchmark (brats) [28] or the medical segmentation decathlon [29]. public data is usually task- or disease-specific and often released with varying degrees of license restrictions, sometimes limiting its exploitation. regardless of the approach, the availability of such data has the potential to catalyse scientific advances, stimulate technology start-ups and deliver improvements in healthcare.

centralising or releasing data, however, poses not only regulatory and legal challenges related to ethics, privacy and data protection, but also technical ones: safely anonymising, controlling access to, and transferring healthcare data is a non-trivial, and often impossible, task. as an example, anonymised data from the electronic health record can appear innocuous and gdpr/phi compliant, but just a few data elements may allow for patient reidentification [6]. the same applies to genomic data and medical images, whose high-dimensional nature makes them as unique as one's fingerprint [7]. therefore, unless the anonymisation process destroys the fidelity of the data, likely rendering it useless, patient reidentification or information leakage cannot be ruled out. gated access, in which only approved users may access specific subsets of data, is often proposed as a putative solution to this issue.
however, not only does this severely limit data availability, it is only practical for cases in which the consent granted by the data owners or patients is unconditional, since recalling data from those who may have had access to it is practically unenforceable.

the promise of fl is simple: to address privacy and governance challenges by allowing algorithms to learn from non-co-located data. in an fl setting, each data controller not only defines their own governance processes and associated privacy considerations, but also, by not allowing data to move or to be copied, controls data access and the possibility to revoke it. so the potential of fl is to provide controlled, indirect access to the large and comprehensive datasets needed for the development of ml algorithms, whilst respecting patient privacy and data governance. it should be noted that this includes both the training and the validation phase of the development. in this way, fl could create new opportunities, e.g. by allowing large-scale validation across the globe directly in the institutions, and enable novel research on, for example, rare diseases, where the incidence rates are low and it is unlikely that a single institution holds a dataset that is sufficient for ml approaches.

moving the to-be-trained model to the data, instead of collecting the data in a central location, has another major advantage: the high-dimensional, storage-intensive medical data does not have to be duplicated from local institutions into a centralised pool and duplicated again by every user that uses this data for local model training. in an fl setup, only the model is transferred to the local institutions, and it can scale naturally with a potentially growing global dataset without replicating the data or multiplying the data storage requirements.
some of the promises of fl are implicit: a certain degree of privacy is provided, since fl participants never directly access the data from other institutions and only receive model parameters that are aggregated over several participants. in a client-server architecture (see figure 1), in which a federated server manages the aggregation and distribution, the participating institutions can even remain unknown to each other. however, it has been shown that the models themselves can, under certain conditions, memorise information [30, 31, 32, 33]. therefore, the fl setup can be further enhanced with privacy protections using mechanisms such as differential privacy [34, 35] or learning from encrypted data (cf. sec. 3), and fl techniques are still a very active area of research [14]. all in all, a successful implementation of fl will represent a paradigm shift away from centralised data warehouses or lakes, with a significant impact on the various stakeholders in the healthcare domain. if fl is indeed the answer to the challenge of healthcare ml at scale, then it is important to understand who the various stakeholders are in an fl ecosystem and what they have to consider in order to benefit from it.

figure 1 caption: a) client-server training: the aggregation may happen on one of the training nodes or on a separate parameter-server node, which then redistributes the consensus model. b) peer-to-peer training: nodes broadcast their model updates to one or more nodes in the federation and each does its own aggregation; cyclic training happens when model updates are passed to a single neighbour one or more times, round-robin style. c) hybrid training: federations, perhaps in remote geographies, can be composed into a hierarchy and use different communication/aggregation strategies at each tier. in the illustrated case, three federations of varying size periodically share their models using a peer-to-peer approach; the consensus model is then redistributed to each federation and each node therein.
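to make the cyclic (round-robin) training variant concrete, the sketch below passes a single scalar model around three hypothetical sites; each fine-tunes it on its own local objective before handing it to its neighbour. the quadratic local losses, the targets and the learning rate are illustrative assumptions, not taken from any cited system.

```python
# cyclic (round-robin) peer-to-peer training sketch: each node
# fine-tunes the incoming model on its local data, then passes it on.
# the "training" here is a gradient step on a toy quadratic loss.

def local_update(theta, local_target, lr=0.1):
    # one gradient step on the local loss (theta - target)^2
    grad = 2.0 * (theta - local_target)
    return theta - lr * grad

def cyclic_training(theta, local_targets, rounds):
    for _ in range(rounds):
        for target in local_targets:  # round-robin over the nodes
            theta = local_update(theta, target)
    return theta

# three hypothetical sites whose local optima differ
theta_final = cyclic_training(0.0, [1.0, 2.0, 3.0], rounds=20)
```

note that with a large learning rate each node would simply overwrite the model with its own optimum, i.e. the model would "forget" earlier sites; a small step size is what lets the cycle settle between the local optima.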
clinicians are usually exposed to only a certain subgroup of the population, based on the location and demographic environment of the hospital or practice they work in. therefore, their decisions might be based on biased assumptions about the probability of certain diseases or their interconnection. by using ml-based systems, e.g. as a second reader, they can augment their own expertise with expert knowledge from other institutions, ensuring a consistency of diagnosis not attainable today. whilst this promise is generally true for any ml-based system, systems trained in a federated fashion are potentially able to yield even less biased decisions and higher sensitivity to rare cases, as they are likely to have seen a more complete picture of the data distribution. being an active part of, or benefiting from, the federation does, however, demand some up-front effort, such as compliance with agreements regarding, e.g., the data structure, annotation and report protocol, which is necessary to ensure that the information is presented to collaborators in a commonly understood format. patients usually rely on local hospitals and practices. establishing fl on a global scale could ensure a high quality of clinical decisions regardless of the location of the deployed system. for example, patients who need medical attention in remote areas could benefit from the same high-quality ml-aided diagnoses that are available in hospitals with a large number of cases. the same advantage applies to patients suffering from rare, or geographically uncommon, diseases, who are likely to have better outcomes if faster and more accurate diagnoses can be made. fl may also lower the hurdle for becoming a data donor, since patients can be reassured that the data remains with the institution and that data access can be revoked. hospitals and practices can remain in full control and possession of their patient data, with complete traceability of how the data is accessed.
they can precisely control the purpose for which a given data sample is used, limiting the risk of misuse when they work with third parties. however, participating in federated efforts will require investment in on-premise computing infrastructure or private-cloud service provision. the amount of necessary compute capability depends, of course, on whether a site participates only in evaluation and testing efforts or also in training efforts. even relatively small institutions can participate: enough of them together will generate a valuable corpus, and they will still benefit from the collective models generated. one of the drawbacks is that fl relies strongly on the standardisation and homogenisation of data formats, so that predictive models can be trained and evaluated seamlessly. this requires significant standardisation efforts from data managers. researchers and ai developers who want to develop and evaluate novel algorithms stand to benefit from access to a potentially vast collection of real-world data. this will especially impact smaller research labs and start-ups, who would be able to develop their applications directly on healthcare data without the need to curate their own datasets. by introducing federated efforts, precious resources can be directed towards solving clinical needs and the associated technical problems, rather than relying on the limited supply of open datasets. at the same time, it will be necessary to conduct research on algorithmic strategies for federated training, e.g. how to combine models or updates efficiently, how to be robust to distribution shifts, etc., as highlighted in the technical survey papers [14, 11, 15]. an fl-based development also implies that the researcher or ai developer cannot investigate or visualise all of the data on which the model is trained. for example, it is not possible to look at an individual failure case to understand why the current model performs poorly on it.
healthcare providers in many countries are affected by the ongoing paradigm shift from volume-based, i.e. fee-for-service-based, to value-based healthcare. a value-based reimbursement structure is in turn strongly connected to the successful establishment of precision medicine. this is not about promoting more expensive individualised therapies, but instead about achieving better outcomes sooner through more focused treatment, thereby reducing the costs for providers. by way of example, with sufficient data, ml approaches can learn to recognise cancer subtypes or genotypic traits from radiology images that could indicate certain therapies and discount others. so, by providing exposure to large amounts of data, fl has the potential to increase the accuracy and robustness of healthcare ai, whilst reducing costs and improving patient outcomes, and is therefore vital to precision medicine. manufacturers of healthcare software and hardware could benefit from federated efforts and infrastructures for fl as well, since combining the learning from many devices and applications, without revealing anything patient-specific, can facilitate the continuous improvement of ml-based systems. this potentially opens up a new source of data and revenue to manufacturers. however, hospitals may require significant upgrades to local compute, data storage, networking capabilities and associated software to enable such a use-case. note, however, that this change could be quite disruptive: fl could eventually impact the business models of providers, practices, hospitals and manufacturers, affecting patient data ownership; and the regulatory frameworks surrounding continual and federated learning approaches are still under development. fl is perhaps best known from the work of konečný et al. [36], but various other definitions have been proposed in the literature [14, 11, 15, 9].
these approaches can be realised via different communication architectures (see figure 1) and respective compute plans (see figure 2). the main goal of fl, however, remains the same: to combine knowledge learned from non-co-located data, residing within the participating entities, into a global model. whereas the initial application field mostly comprised mobile devices, participating entities in the case of healthcare could be the institutions storing the data, e.g. hospitals, or the medical devices themselves, e.g. a ct scanner or even low-powered devices that are able to run computations locally. it is important to understand that this domain shift to the medical field implies different conditions and requirements. for example, in the case of the federated mobile-device application, potentially millions of participants could contribute, but it would be impossible to have the same scale of consortium in terms of participating hospitals. on the other hand, medical institutions may rely on more sophisticated and powerful compute infrastructure with stable connectivity. another aspect is that the variation in terms of data type, defined tasks, acquisition protocol and standardisation is significantly higher in healthcare than for the pictures and messages seen in other domains. the participating entities have to agree on a collaboration protocol, and the high-dimensional medical data that is predominant in the field of digital health poses challenges by requiring models with huge numbers of parameters. this may become an issue in scenarios where the available bandwidth for communication between participants is limited, since the model has to be transferred frequently. and even though data is never shared during fl, considerations about the security of the connections between sites, as well as mitigation of the risk of data leakage through model parameters, are necessary.
in this section, we discuss in more detail what fl is, how it differs from similar techniques, and the key challenges and technical considerations that arise when applying fl in digital health. fl is a learning paradigm in which multiple parties train collaboratively without the need to exchange or centralise datasets. although various training strategies have been implemented to address specific tasks, a general formulation of fl can be given as follows: let \mathcal{L} denote a global loss function obtained via a weighted combination of K local losses \{\mathcal{L}_k\}_{k=1}^{K}, computed from private data X_k residing at the individual involved parties:

    \min_{\phi} \mathcal{L}(X;\phi) \quad \text{with} \quad \mathcal{L}(X;\phi) = \sum_{k=1}^{K} w_k \, \mathcal{L}_k(X_k;\phi_k),    (1)

where w_k > 0 denote the respective weight coefficients. it is important to note that the data X_k is never shared among parties and remains private throughout learning. in practice, each participant typically obtains and refines the global consensus model by running a few rounds of optimisation on its local data and then shares the updated parameters with its peers, either directly or via a parameter server. the more rounds of local training that are performed without sharing updates or synchronising, the less it is guaranteed that the actual procedure is minimising equation (1) [9, 14]. the actual process used for aggregating parameters commonly depends on the fl network topology, as fl nodes might be segregated into sub-networks due to geographical or legal constraints (see figure 1). aggregation strategies can rely on a single aggregating node (hub-and-spokes models) or on multiple nodes without any centralisation. an example of the latter is peer-to-peer fl, where connections exist between all or a subset of the participants and model updates are shared only between directly connected sites [37, 38]. an example of centralised fl aggregation with a client-server architecture is given in algorithm 1.
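as a toy illustration of the weighted combination of local losses in equation (1), the sketch below combines two local mean-squared-error losses, each computed only on that site's private data. the scalar model, the data and the weights are illustrative assumptions.

```python
import numpy as np

# equation (1) in miniature: the global loss is a weighted combination
# of K local losses, each evaluated only on that site's private data.

def local_loss(theta, x_k):
    # mean squared error of the scalar model theta on local data x_k
    return np.mean((x_k - theta) ** 2)

def global_loss(theta, local_data, weights):
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalise so the weights sum to 1
    return sum(w_k * local_loss(theta, x_k)
               for w_k, x_k in zip(w, local_data))

data = [np.array([1.0, 1.0]), np.array([3.0, 5.0])]  # two private sites
L = global_loss(2.0, data, weights=[1, 1])
```

in a federated run, each site would evaluate (and differentiate) only its own term; the server never sees the arrays in `data`, only the resulting parameter updates.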
note that aggregation strategies do not necessarily require information about the full model update; clients might choose to share only a subset of the model parameters in order to reduce the communication overhead of redundant information, to ensure better privacy preservation [10], or to produce multi-task learning algorithms that have only part of their parameters learned in a federated manner. a unifying framework enabling various training schemes may disentangle compute resources (data and servers) from the compute plan, as depicted in figure 2. the latter defines the trajectory of a model across several partners, to be trained and evaluated on specific datasets. for more details regarding the state of the art of fl techniques, such as aggregation methods, optimisation or model compression, we refer the reader to the overview by kairouz et al. [14].

algorithm 1: example of an fl algorithm [12] in a client-server architecture with aggregation via fedavg [39].

    require: num. federated rounds T
     1: procedure aggregating
     2:     initialise global model: W(0)
     3:     for t <- 1 ... T do
     4:         for client k <- 1 ... K do          (run in parallel)
     5:             send global model W(t-1) to client k
     6:             receive model update dW_k(t-1) and number of local
                    iterations N_k from client k's local training
     7:         end for
     8:         aggregate: W(t) <- W(t-1) + sum_k N_k * dW_k(t-1) / sum_k N_k
     9:     end for
    10:     return W(T)
    11: end procedure

fl is rooted in older forms of collaborative learning where models are shared or compute is distributed [13, 40]. transfer learning, for example, is a well-established approach of model-sharing that makes it possible to tackle problems with deep neural networks that have millions of parameters, despite the lack of the extensive local datasets that would be required for training from scratch: a model is first trained on a large dataset and then further optimised on the actual target data. the dataset used for the initial training does not necessarily come from the same domain, or even the same type of data source, as the target dataset. this type of transfer learning has shown better performance [41, 42] than strategies where the model is trained from scratch on the target data only, especially when the target dataset is comparably small. it should be noted that, similar to an fl setup, the data is not necessarily co-located in this approach. for transfer learning, however, the models are usually shared acyclically, e.g. using a pre-trained model to fine-tune it on another task, without contributing to a collective knowledge gain. and, unfortunately, deep learning models tend to "forget" [43, 44]: after a few training iterations on the target dataset, the initial information contained in the model is lost [45]. to adapt this approach into a form of collaborative learning in an fl setup, with continuous learning from different institutions, the participants can share their model in a peer-to-peer architecture in a "round-robin" or parallel fashion and train in turn on their local data. this yields better results when the goal is to learn from diverse datasets. a client-server architecture in this scenario enables learning on multi-party data at the same time [13], possibly even without forgetting [46]. there are also other collaborative learning strategies [47, 48], such as ensembling, a statistical strategy of combining multiple independently trained models or predictions into a consensus, or multi-task learning, a strategy to leverage shared representations for related tasks. these strategies are independent of the concept of fl and can be used in combination with it. the second characteristic of fl, distributing the compute, has been well studied in recent years [49, 50, 51]. nowadays, the training of large-scale models is often executed on multiple devices and even multiple nodes [51].
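a runnable sketch of the client-server fedavg loop of algorithm 1 might look as follows. here, the local "training" is a few gradient steps on a toy mean-squared-error loss rather than a neural network, and the client data, learning rate and round counts are illustrative assumptions; in a real deployment the clients would run in parallel.

```python
import numpy as np

# fedavg-style client-server loop: the server sends the global model
# to each client, clients train locally on their private data, and the
# server aggregates the local models weighted by dataset size.

def client_update(w_global, x_k, lr=0.1, local_steps=5):
    w = w_global
    for _ in range(local_steps):
        grad = 2.0 * (w - np.mean(x_k))  # gradient of local mse loss
        w = w - lr * grad
    return w

def server_round(w_global, client_data, n_rounds=10):
    sizes = np.array([len(x) for x in client_data], dtype=float)
    for _ in range(n_rounds):
        # in practice, these client updates run in parallel
        local = [client_update(w_global, x) for x in client_data]
        # fedavg: weighted average of the local models
        w_global = np.average(local, weights=sizes)
    return w_global

clients = [np.array([1.0, 2.0, 3.0]), np.array([7.0])]
w_final = server_round(0.0, clients)
```

weighting by local dataset size means the consensus model converges towards the optimum of the pooled data (here 3.25), even though the raw samples never leave their respective clients.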
in this way, the task can be parallelised, enabling fast training, such as training a neural network on the extensive dataset of the imagenet project in 1 hour [50] or even in less than 80 seconds [52]. it should be noted that in these scenarios the training is realised in a cluster environment, with centralised data and fast network communication. so, distributing the compute for training over several nodes is feasible, and fl may benefit from the advances in this area. compared to these approaches, however, fl comes with a significant communication and synchronisation cost. in an fl setup, the compute resources are not as closely connected as in a cluster, and every exchange may introduce significant latency. therefore, it may not be suitable to synchronise after every batch; instead, local training may continue for several iterations before aggregation. we refer the reader to the survey by xu et al. [15] for an overview of the evolution of federated learning and the different concepts in the broader sense. despite the advantages of fl, there are challenges that need to be taken into account when establishing federated training efforts. in this section, we discuss five key aspects of fl that are of particular interest in the context of its application to digital health. in healthcare, we work with highly sensitive data that must be protected accordingly. therefore, some of the key considerations are the trade-offs, strategies and remaining risks regarding the privacy-preserving potential of fl. privacy vs. performance. although one of the main purposes of fl is to protect privacy by sharing model updates rather than data, fl does not solve all potential privacy issues and, similar to ml algorithms in general, will always carry some risks. strict regulations and data governance policies make any leakage, or perceived risk of leakage, of private information unacceptable.
these regulations may even differ between federations, and a catch-all solution will likely never exist. consequently, it is important that potential adopters of fl are aware of the potential risks and the state-of-the-art options for mitigating them. privacy-preserving techniques for fl offer levels of protection that exceed today's commercially available ml models [14]. however, there is a trade-off in terms of performance, and these techniques may affect, for example, the accuracy of the final model [10]. furthermore, future techniques and/or ancillary data could be used to compromise a model previously considered to be low-risk. level of trust. broadly speaking, participating parties can enter two types of fl collaboration: trusted: for fl consortia in which all parties are considered trustworthy and are bound by an enforceable collaboration agreement, we can eliminate many of the more nefarious motivations, such as deliberate attempts to extract sensitive information or to intentionally corrupt the model. this reduces the need for sophisticated countermeasures, falling back to the principles of standard collaborative research. non-trusted: in fl systems that operate on larger scales, it is impractical to establish an enforceable collaboration agreement that can guarantee that all parties are acting benignly. some may deliberately try to degrade performance, bring the system down or extract information from other parties. in such an environment, security strategies will be required to mitigate these risks, such as encryption of model submissions, secure authentication of all parties, traceability of actions, differential privacy, verification systems, execution integrity, model confidentiality and protections against adversarial attacks. information leakage. by definition, fl systems sidestep the need to share healthcare data among participating institutions.
however, the shared information may still indirectly expose the private data used for local training, for example via model inversion [53] of the model updates, via the gradients themselves [54], or via adversarial attacks [55, 56]. fl differs from traditional training insofar as the training process is exposed to multiple parties. as a result, the risk of leakage via reverse-engineering increases if adversaries can observe model changes over time, observe specific model updates (i.e. a single institution's update), or manipulate the model (e.g. induce additional memorisation by others through gradient-ascent-style attacks). countermeasures, such as limiting the granularity of the shared model updates and adding specific noise to ensure differential privacy [34, 12, 57], may be needed; this is still an active area of research [14]. data heterogeneity. medical data is particularly diverse, not only in terms of the type, dimensionality and characteristics of medical data in general, but also within a defined medical task, due to factors like acquisition protocol, brand of the medical device or local demographics. this poses a challenge for fl algorithms and strategies: one of the core assumptions of many current approaches is that the data is independent and identically distributed (iid) across the participants. initial results indicate that fl training on medical non-iid data is possible, even if the data is not uniformly distributed across the institutions [12, 13]. in general, however, strategies such as fedavg [9] are prone to fail under these conditions [58, 59, 60], in part defeating the very purpose of collaborative learning strategies. research addressing this problem includes, for example, fedprox [58] and the part-data-sharing strategy [60]. another challenge is that data heterogeneity may lead to a situation in which the globally optimal solution may not be the optimal final solution for an individual participant. the definition of model-training optimality should therefore be agreed by all participants before training.
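one of the countermeasures mentioned above, limiting the granularity of shared updates and adding noise in the spirit of differentially private training [34], can be sketched as clipping an update's norm and adding gaussian noise before it is shared. the clip norm and noise scale below are illustrative assumptions, not calibrated privacy budgets.

```python
import numpy as np

# sketch of a privacy countermeasure applied to a model update before
# it leaves the institution: clip its norm (bounding any single
# participant's influence) and add gaussian noise (masking the exact
# contribution of any individual record).

rng = np.random.default_rng(0)  # fixed seed for reproducibility

def privatise_update(update, clip_norm=1.0, noise_std=0.1):
    norm = np.linalg.norm(update)
    if norm > clip_norm:
        update = update * (clip_norm / norm)  # scale down to clip_norm
    return update + rng.normal(0.0, noise_std, size=update.shape)

raw = np.array([3.0, 4.0])      # norm 5, will be clipped to norm 1
shared = privatise_update(raw)  # this, not `raw`, is what is shared
```

a formal differential-privacy guarantee additionally requires accounting for the total privacy budget across all training rounds, which this sketch does not attempt.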
as for all safety-critical applications, the reproducibility of a system is important for fl in healthcare. in contrast to training on centralised data, fl involves running multi-party computations in environments that exhibit complexity in terms of hardware, software and networks. the traceability requirement should be fulfilled to ensure that system events, data access history and training configuration changes, such as hyperparameter tuning, can be traced during the training processes. traceability can also be used to log the training history of a model and, in particular, to avoid the training dataset overlapping with the test dataset. in non-trusted federations in particular, traceability and accountability processes require execution integrity. after the training process reaches the mutually agreed model-optimality criteria, it may also be helpful to measure the amount of contribution from each participant, such as the computational resources consumed, the quality of the data used for local training, etc. these measurements could then be used to determine relevant compensation and to establish a revenue model among the participants [61]. one implication of fl is that researchers are not able to investigate the images upon which models are being trained. so, although each site will have access to its own raw data, federations may decide to provide some sort of secure intra-node viewing facility to cater for this need, or perhaps even some utility for explainability and interpretability of the global model. however, the issue of interpretability within dl is still an open research question. unlike running large-scale fl amongst consumer devices, healthcare institutional participants are often equipped with better computational resources and reliable, higher-throughput networks. these enable, for example, the training of larger models with larger numbers of local training steps, and the sharing of more model information between nodes.
this unique characteristic of fl in healthcare consequently brings opportunities as well as challenges, such as (1) how to ensure data integrity when communicating (e.g. by creating redundant nodes); (2) how to design secure encryption methods that take advantage of the computational resources; (3) how to design appropriate node schedulers and make use of the distributed computational devices to reduce idle time. the administration of such a federation can be realised in different ways, each of which comes with advantages and disadvantages. in high-trust situations, training may operate via some sort of 'honest broker' system, in which a trusted third party acts as the intermediary and facilitates access to data. this setup requires an independent entity to control the overall system, which may not always be desirable, since it could involve additional cost and procedural viscosity, but it does have the advantage that the precise internal mechanisms can be abstracted away from the clients, making the system more agile and simpler to update. in a peer-to-peer system, each site interacts directly with some or all of the other participants. in other words, there is no gatekeeper function: all protocols must be agreed up-front, which requires significant agreement efforts, and changes must be made in a synchronised fashion by all parties to avoid problems. in a trustless architecture, the platform operator may be cryptographically locked into being honest, which creates significant computational overhead whilst securing the protocol. future efforts to apply artificial intelligence to healthcare tasks may strongly depend on collaborative strategies between multiple institutions, rather than on large centralised databases belonging to only one hospital or research laboratory.
the ability to leverage fl to capture and integrate knowledge acquired and maintained by different institutions provides an opportunity to capture larger data variability and to analyse patients across different demographics. moreover, fl is an opportunity to incorporate multi-expert annotation and multi-centre data acquired with different instruments and techniques. this collaborative effort requires, however, various agreements, including definitions of scope, aim and technology, which, since the approach is still novel, may involve several unknowns. in this context, large-scale initiatives such as the melloddy project 2, the healthchain project 3, the trustworthy federated data analytics (tfda) project and the german cancer consortium's joint imaging platform (jip) 4 represent pioneering efforts to set the standards for safe, fair and innovative collaboration in healthcare research. ml, and particularly dl, has led to a wide range of innovations in the area of digital healthcare. as all ml methods benefit greatly from the ability to access data that approximates the true global distribution, fl is a promising approach for obtaining powerful, accurate, safe, robust and unbiased models. by enabling multiple parties to train collaboratively without the need to exchange or centralise datasets, fl neatly addresses issues related to the egress of sensitive medical data. as a consequence, it may open novel research and business avenues and has the potential to improve patient care globally. in this article, we have discussed the benefits of, and the considerations pertinent to, fl within the healthcare field. not all technical questions have been answered yet, and fl will certainly be an active research area throughout the next decade [14]. despite this, we truly believe that its potential impact on precision medicine, and ultimately on improving medical care, is very promising. financial disclosure: author rms receives royalties from icad, scanmed, philips, and ping an.
his lab has received research support from ping an and nvidia. author sa is supported by the prime programme of the german academic exchange service (daad) with funds from the german federal ministry of education and research (bmbf). author sb is supported by the national institutes of health (nih). author mng is supported by the healthchain (bpifrance) and melloddy (imi2) projects.

references:
[1] deep learning
[2] deep learning: a primer for radiologists
[3] clinically applicable deep learning for diagnosis and referral in retinal disease
[4] revisiting unreasonable effectiveness of data in deep learning era
[5] a systematic review of barriers to data sharing in public health
[6] estimating the success of re-identifications in incomplete datasets using generative models
[7] quantifying differences and similarities in whole-brain white matter architecture using local connectome fingerprints
[8] brainprint: a discriminative characterization of brain morphology
[9] communication-efficient learning of deep networks from decentralized data
[10] federated learning: challenges, methods, and future directions
[11] federated machine learning: concept and applications
[12] privacy-preserving federated brain tumour segmentation
[13] multi-institutional deep learning modeling without sharing patient data: a feasibility study on brain tumor segmentation
[14] advances and open problems in federated learning
[15] federated learning for healthcare informatics
[16] ibm's merge healthcare acquisition
[17] nhs scotland's national safe haven
[18] the french health data hub and the german medical informatics initiatives: two national projects to promote data sharing in healthcare
[19] health data research uk
[20] the human connectome: a structural description of the human brain
[21] uk biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age
[22] the cancer imaging archive (tcia): maintaining and operating a public information repository
[23] chestx-ray8: hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases
[24] deeplesion: automated mining of large-scale lesion annotations and universal lesion detection with deep learning
[25] the cancer genome atlas (tcga): an immeasurable source of knowledge
[26] the alzheimer's disease neuroimaging initiative (adni): mri methods
[27] 1399 h&e-stained sentinel lymph node sections of breast cancer patients: the camelyon dataset
[28] the multimodal brain tumor image segmentation benchmark (brats)
[29] a large annotated medical image dataset for the development and evaluation of segmentation algorithms
[30] membership inference attacks against machine learning models
[31] white-box vs black-box: bayes optimal strategies for membership inference
[32] understanding deep learning requires rethinking generalization
[33] the secret sharer: evaluating and testing unintended memorization in neural networks
[34] deep learning with differential privacy
[35] privacy-preserving deep learning
[36] federated optimization: distributed machine learning for on-device intelligence
[37] braintorrent: a peer-to-peer environment for decentralized federated learning
[38] peer-to-peer federated learning on graphs
[39] learning differentially private recurrent language models
[40] distributed deep learning networks among institutions for medical imaging
[41] deep convolutional neural networks for computer-aided detection: cnn architectures, dataset characteristics and transfer learning
[42] convolutional neural networks for medical image analysis: full training or fine tuning?
[43] catastrophic interference in connectionist networks: the sequential learning problem
[44] an empirical investigation of catastrophic forgetting in gradient-based neural networks
[45] learning without forgetting
[46] overcoming forgetting in federated learning on non-iid data
[47] collaborative learning for deep neural networks
[48] an overview of multi-task learning in deep neural networks
[49] how to scale distributed deep learning
[50] accurate, large minibatch sgd: training imagenet in 1 hour
[51] demystifying parallel and distributed deep learning: an in-depth concurrency analysis
[52] yet another accelerated sgd: resnet-50 training on imagenet in 74.7 seconds
[53] p3sgd: patient privacy preserving sgd for regularizing deep cnns in pathological image classification
[54] deep leakage from gradients
[55] beyond inferring class representatives: user-level privacy leakage from federated learning
[56] deep models under the gan: information leakage from collaborative deep learning
[57] multi-site fmri analysis using privacy-preserving federated learning and domain adaptation: abide results
[58] federated optimization in heterogeneous networks
[59] communication-efficient learning of deep networks from decentralized data
[60] federated learning with non-iid data
[61] data shapley: equitable valuation of data for machine learning

key: cord-103502-asphso2s authors: herrgårdh, tilda; li, hao; nyman, elin; cedersund, gunnar title: an organ-based multi-level model for glucose homeostasis: organ distributions, timing, and impact of blood flow date: 2020-10-21 journal: biorxiv doi: 10.1101/2020.10.21.344499 sha: doc_id: 103502 cord_uid: 103502-asphso2s

glucose homeostasis is the tight control of glucose in the blood. this complex control is important and not yet sufficiently understood, due to its malfunction in serious diseases like diabetes. due to the involvement of numerous organs and sub-systems, each with their own intra-cellular control, we have developed a multi-level mathematical model for glucose homeostasis, which integrates a variety of data.
over the last 10 years, this model has been used to insert new insights from the intra-cellular level into the larger whole-body perspective. however, the original cell-organ-body translation has during these years never been updated, despite several critical shortcomings, which also have not been resolved by other modelling efforts. for this reason, we here present an updated multi-level model. this model provides a more accurate sub-division of how much glucose is being taken up by the different organs. unlike the original model, we now also account for the different dynamics seen in the different organs. the new model also incorporates the central impact of blood flow on insulin-stimulated glucose uptake. each new improvement is clear upon visual inspection, and all are also supported by statistical tests. the final multi-level model describes >300 data points in >40 time-series and dose-response curves, resulting from a large variety of perturbations, describing both intra-cellular processes, organ fluxes, and whole-body meal responses. we hope that this model will serve as an improved basis for future data integration, useful for research and drug developments within diabetes. a dysfunctional glucose homeostasis is a hallmark of both type 1 diabetes and type 2 diabetes mellitus (t2d). in type 1 diabetes, the insulin-producing beta-cells are destroyed by the immune system. since the other organs are unaffected, the treatment of type 1 diabetes simply consists of insulin, taken via injections or insulin pumps. in t2d, the patient has both a reduced capacity to produce insulin and has developed a resistance to the hormone. this resistance appears in all of the three most metabolically active organs, which all respond to insulin: adipose tissue, muscle, and liver. inside each of these organs, the response to insulin is governed by an interaction between intracellular signaling and metabolic networks.
the resistance is spread between the organs, in ways which are not yet fully understood, but which involve numerous hormones, cytokines, and metabolites. to better understand this complex interaction, both in health and in disease, dynamic mathematical models are needed. models for the top-level glucose homeostasis, involving a simple interaction between glucose and insulin, have been developed for decades (bergman et al. 1981). a first more advanced model (dalla man et al. 2007) was based on calculated flows of glucose and insulin between organs in response to a meal. a version of this model, trained on data from patients with type 1 diabetes, is approved by the food and drug administration, fda, for replacement of animal experiments in the approval of the algorithm inside new insulin pumps (kovatchev et al. 2008). for more general applications, involving t2d, the intracellular insulin resistance must be combined with the whole-body interactions. such models are called multi-level models. there have been several efforts to create multi-level models of glucose homeostasis, reviewed in e.g. (ajmera et al. 2013; nyman et al. 2012; nyman et al. 2016). one of the more comprehensive efforts is a series of nonlinear mixed effects models (silber et al. 2007; silber et al. 2010; jauslin et al. 2007) developed to describe plasma levels of glucose and insulin after different interventions for single patients with t2d. another effort has developed a glucose homeostasis model, based partly on (dalla man et al. 2007), to create a simulator to use in education and to simulate scenarios of disease (maas et al. 2015). a third effort is the multi-level model of human glucose homeostasis we created 10 years ago (nyman et al. 2011). this model contains the dynamic glucose-insulin interaction between organs in response to a meal, based on (dalla man et al. 2007).
in this model, we sub-divided the original insulin-responding uptake into a muscle and a fat component, and linked the fat tissue glucose uptake to intracellular insulin signaling data, coming from our own studies. this link was possible since insulin-stimulated glucose uptake can be measured both in isolated adipocytes and in organs. the adipocyte uptake is measured in vitro together with insulin signaling data; the organ-level uptakes are measured using isotopic labelling and/or arteriovenous (av) difference data, which measures the difference between arterial and venous blood. since the uptake measurements from isolated adipocytes should correlate with the av difference-based uptake measurements for fat tissue, one can build a translation from in vitro to in situ, in humans. however, neither this model, nor any of the previously mentioned multi-level models, have subdivided the glucose uptake into the individual contributions of all of the main insulin-responding and glucose-utilizing organs: adipose tissue, muscle, and liver. in this paper, we have updated the original multi-level connections in (nyman et al. 2011), and resolved three critical questions or issues (q1-q3), regarding the role of each of the metabolically active organs in glucose uptake (fig 1). more specifically, we have explicitly included the liver in the model as a glucose-utilizing organ, in contrast to the original models, which only considered it as a glucose producing organ (q1). secondly, we have included a timing difference between muscle and adipose tissue glucose uptake in the response to a meal (q2). thirdly, we have updated the model to include the impact of blood flow on glucose uptake in adipose tissue (q3). finally, we merge these three improvements together with all of the other already published improvements described above, into an updated multi-level model (q4).
this model constitutes an updated view on the multi-level roles that each organ plays in glucose homeostasis, and allows for integration of future data for specific sub-systems into an integrated and more complete picture. we have used ordinary differential equations (odes) in the standard form to build the models. all of the equations are given in the supplementary files, both as equations and as simulation files, and here we only describe the most central equations, relating to the changes done in this paper. the equations for the dynamics of the amount of glucose in interstitial tissue (g_t) and plasma (g_p) are given by

dg_p/dt = egp + ra - e - u_ii - k_1 · g_p + k_2 · g_t,
dg_t/dt = k_1 · g_p - k_2 · g_t - u_id,

where u_id is insulin and glucose dependent glucose uptake, i.e. in fat, muscle, and liver; where u_ii is insulin independent and constant glucose utilization, i.e. glucose uptake by organs such as brain and kidneys; where egp is endogenous glucose production from the liver; where ra is glucose rate of appearance from the intestine; where e is glucose excretion through the kidneys; and where k_1 · g_p and k_2 · g_t denote the fluxes from plasma to interstitial tissue and back, respectively. note that g_t and g_p are states, while u_id, u_ii, k_1 · g_p, k_2 · g_t, egp, ra, and e are the reaction rates that describe flows of glucose. similarly, k_1 and k_2 are parameters, i.e. rate constants, which are constant over time. insulin-dependent and dynamic glucose uptake: the above equations are identical to those in the original dalla man model (dalla man et al. 2007), and the change that was implemented in (nyman et al. 2011) was that u_id was sub-divided into a muscle and an adipose tissue part. we now sub-divide the insulin-dependent dynamic glucose uptake into three parts, i.e.

u_id = u_idm + u_idl + u_idf,

where u_idm, u_idl, and u_idf denote the uptake rates into the muscle, liver, and fat, respectively. all of these uptake descriptions have changed to some extent, so let us now go through them one by one.
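as a concrete illustration, the two-compartment glucose mass balance above can be sketched as follows (a minimal python sketch, not the authors' matlab/iqm code; all parameter and flux values are illustrative placeholders, not fitted values):

```python
# minimal sketch of the plasma / interstitial-tissue glucose mass balance.
# all numbers below are illustrative placeholders, not fitted values.

def glucose_rhs(Gp, Gt, *, EGP, Ra, E, U_ii, U_id, k1, k2):
    """right-hand side of the balance:
    dGp/dt = EGP + Ra - E - U_ii - k1*Gp + k2*Gt
    dGt/dt = k1*Gp - k2*Gt - U_id
    """
    dGp = EGP + Ra - E - U_ii - k1 * Gp + k2 * Gt
    dGt = k1 * Gp - k2 * Gt - U_id
    return dGp, dGt

# one explicit-euler step as a usage example:
Gp, Gt = 100.0, 80.0
dGp, dGt = glucose_rhs(Gp, Gt, EGP=2.0, Ra=5.0, E=0.1,
                       U_ii=1.0, U_id=3.0, k1=0.06, k2=0.08)
Gp, Gt = Gp + 0.1 * dGp, Gt + 0.1 * dGt
```

in practice the full model would be integrated with a stiff ode solver rather than explicit euler; the sketch only shows how the verbal flux description maps to the two state equations.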
glucose uptake in muscle is given by

u_idm = v_mmax · g_t / (k_m + g_t),

where v_mmax is the non-scaled maximal glucose uptake, and where k_m is the corresponding michaelis-menten parameter. the insulin-dependency of the glucose uptake is located in the expression for v_mmax,

v_mmax = part_m · (v + v_x · ins),

where part_m is a scaling parameter to balance the uptake of the muscle with the other organs, where v is the basal rate of glucose utilization, and v_x is the maximum rate of glucose entering the tissue (here muscle) from the surrounding tissue, and where ins denotes the interstitial insulin concentration. so far, these equations for the muscle uptake are the same as in (nyman et al. 2011). in contrast, although ins is calculated in almost the same way as in (nyman et al. 2011), the parameters describing the rate of entry and the rate of degradation are now allowed to be different, i.e.

d(ins)/dt = v_1 - v_2, with v_1 = k_1 · (i - i_b) and v_2 = k_2 · ins,

where v_1 and v_2 describe the rate of transport from the plasma and the rate of degradation, with corresponding rate constants k_1 and k_2, respectively; where i_b denotes the basal plasma insulin concentration; and where i denotes the insulin concentration in plasma. the liver was not included in the previous models, and thus its equations are new. they are similar to the equations for muscle, i.e.

u_idl = v_lmax · g_t / (k_l + g_t),

where k_l is a michaelis-menten constant, and where v_lmax represents the maximum rate of glucose utilization in the liver. just as for the equations for muscle, the insulin dependence is incorporated into the expression for v_lmax, which is given by an analogous expression where part_l represents the relative glucose utilization of the liver in comparison with other tissues. note that the insulin-dependency of the liver glucose uptake is described as being direct, while in reality this dependency is indirect. glucose uptake in the liver is done via the glut2 transporter, which is not regulated by insulin. in contrast, the glucose uptake in muscle and adipose tissue is done by the glut4 transporter, which is regulated by insulin.
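the michaelis-menten muscle uptake and the two-rate-constant interstitial insulin dynamics just described can be sketched as follows (a hedged python sketch; the functional forms are paraphrased from the verbal description, v0 stands for the basal rate called "v" in the text, and all parameter values are illustrative):

```python
# hedged sketch of the muscle uptake and interstitial insulin equations.
# functional forms paraphrased from the verbal description; values illustrative.

def muscle_uptake(Gt, INS, *, part_m, v0, vx, Km):
    """u_idm = v_mmax * g_t / (k_m + g_t), with the insulin dependence
    placed in v_mmax = part_m * (v0 + vx * ins)."""
    Vm_max = part_m * (v0 + vx * INS)
    return Vm_max * Gt / (Km + Gt)

def interstitial_insulin_rhs(INS, I, *, Ib, k1, k2):
    """d(ins)/dt = v1 - v2, with entry from plasma v1 = k1*(i - i_b)
    and degradation v2 = k2*ins, now with separate rate constants."""
    return k1 * (I - Ib) - k2 * INS
```

the new liver uptake u_idl would follow the same michaelis-menten shape with its own constants k_l, part_l, and v_lmax.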
in the liver, insulin instead indirectly affects glucose uptake by up-regulating intracellular glucose phosphorylation and utilization. however, since the model is lacking intracellular reactions, this indirect effect present in the liver is approximated in the same way as the direct effect for the muscle. note, finally, that the egp in the liver is also regulated by insulin, and that this is described as a separate process, in the same way as in (nyman et al. 2011). glucose uptake in the adipose tissue is the most advanced part of the model, since it is determined by intracellular processes, both regarding metabolism and regarding insulin signaling. the ultimate calculation of the uptake is given by the following expression, where part_f is a parameter, and where v_in and v_out describe the rate of glucose transport into, and out of, the cell, respectively. these two fluxes are given by expressions where p3 and p4 are transport parameters, where glu_in is the amount of intracellular glucose, and where ins_f,e is the effect of insulin on these transport rates. these two equations show that the glucose uptake in the fat tissue depends on both intracellular metabolism, which alters the value of glu_in, and the intracellular signaling, which alters the value of ins_f,e. the intracellular metabolism incorporates the first two steps of glycolysis, i.e. the steps involving intracellular glucose-6-phosphate (g6p). the equations are given in terms of the rate constants p2, p1, v_g6pmax, and k_out, where v_g6p is the rate of phosphorylation of glucose. the intracellular insulin signaling is in itself the same as in (brännmark et al. 2013), and it starts with insulin binding to the receptor (supplementary eq 26), and ends with translocation of the glut4 transporter to the membrane (supplementary eqs 46 and 45). what is new compared to (nyman et al. 2011; brännmark et al.
2013) instead concerns the usage of the glut4 transporter to calculate the resulting impact on glucose uptake, ins_f,e. in our updated model, this insulin effect is given by an expression where glut4m is the amount of glut4 in the membrane, where glut1 is the amount of glut1 in the membrane, and where bfe_f is the effect of blood flow; nc, k8, and pf are parameters. the glut4 and glut1 terms correspond to the transport via the two glucose transporters, and bfe_f was introduced in (nyman et al. 2011) as a scaling parameter between the data from the in vitro setting studying isolated adipocytes, and the in situ setting, where the adipose tissue is still located in the human body. in other words, the blood flow effect is not there when simulating in vitro experiments. in (nyman et al. 2011), this difference in insulin effect was hypothesized to be dependent on blood flow, and in this paper, we show that such an impact of blood flow is indeed present. if one does not have data for the blood flow, the model will set bfe_f to a constant value, and if there is data for blood flow, we propose to use the new model described in the next section. equations for the impact of blood flow on glucose uptake in adipose tissue: the impact of blood flow on glucose uptake is dependent on insulin. the same equations are used for adipose tissue as for muscle glucose uptake (cf eqs (6)-(8)), where i is insulin in plasma and i_b is the basal insulin level, and where c1_bf and c2_bf are rate constants. second, to calculate the impact of blood flow, we need to have an expression for how the blood flow is calculated. in this study, we only look at blood flow in controls, and in the presence of bradykinin, which increases the blood flow. this increase is also dependent on insulin.
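the adipose transport fluxes v_in and v_out described above can be illustrated as follows (a python sketch; the linear forms assumed for the two fluxes are simplifications for illustration, and the paper's supplementary equations give the precise expressions for how they depend on ins_f,e and glu_in):

```python
# sketch of the net adipose glucose uptake, u_idf = part_f * (v_in - v_out).
# the linear flux forms below are assumptions for illustration only.

def fat_glucose_uptake(Gt, glu_in, ins_f_e, *, part_f, p3, p4):
    v_in = p3 * ins_f_e * Gt   # transport into the cell, scaled by the insulin effect
    v_out = p4 * glu_in        # transport out, driven by intracellular glucose
    return part_f * (v_in - v_out)
```

the sketch captures the qualitative point made in the text: as intracellular glucose glu_in accumulates, the outward flux grows and the net, gradient-driven uptake falls, even when the insulin effect ins_f,e is still high.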
this control of blood flow, denoted bf_f, is given by an expression where be describes the direct effect of bradykinin on blood flow; where kbf describes the combined effect of insulin and bradykinin, and where ins_offset is a small offset introduced to make insulin concentrations positive (same as in (nyman et al. 2011)). the value of bradykinin is 1 in the absence of bradykinin, and 30 in the presence of bradykinin. finally, the blood flow and insulin are combined to impact the glucose uptake via the following expression for bfe_f, where bf_b is the basal blood flow, where p_bf is a parameter, and where ins_b is the basal insulin level in adipose tissue. these are all the equations that have been changed in the current version of the model. the full set of odes from the final model, including the original simulation files, are found in the supplementary material. an interaction graph of the final model is given in fig s1. models were simulated in a modular fashion, by simulating part of the model with curves from other parts as input. specifically, the new additions were simulated on their own together with equations for g_p and g_t, with curves for egp, ra, ins, and glut4m, simulated by the first model version presented herein (m1), as input. parameter values for existing models are used from (brännmark et al. 2013). the agreement between model simulations and experimental data is used to estimate values for new model parameters. this is done by minimizing the distance between estimation data, denoted y, and corresponding simulated data for parameter p, which is denoted ŷ(p). in our case, the estimation data consists of uptake rates of glucose into the adipose tissue and muscle, which are denoted u_idf and u_idm, respectively. the cost function used is the conventional

v(p) = sum_{i=1}^{n} ((y_i - ŷ_i(p)) / sem_i)^2,

where the subscript i denotes the data point, where n denotes the number of data points, and where sem denotes the standard error of the mean for the data uncertainty (cedersund and roll 2009).
we use a χ2-test to evaluate the agreement between model simulations and data. to be more specific, we use the inverse of the cumulative χ2 distribution function for setting a threshold, and then compare the cost function v(p) with that threshold. in order to set that threshold, we need a significance level and the degrees of freedom. in this study, we use significance level 0.05, and the degrees of freedom used is specified for each analysis in the results. apart from the formal optimization described above, some additional ad hoc requirements were added to the parameter estimation. specifically, to get a good estimate of the proportions of glucose taken up by the different tissues, a term was added that gives a slightly increasing punishment for a total uptake of glucose in the liver higher than 50% or lower than 40% of the total glucose uptake in all organs. the total glucose uptake of other organs except adipose tissue, muscle and liver (u_ii) was punished in the same way for values higher than 28% and lower than 18% of the total glucose uptake of all organs. the simple fitting to the impact of blood flow on glucose uptake was done by hand. a representative simulation was chosen for the comparison to the data uncertainties for total glucose uptake from dalla man (dalla man et al. 2007) (6). the uncertainty of the model simulations was estimated by, during the optimization process, saving all found parameters with an acceptable simulation according to the section above. we used matlab r2018b (mathworks, natick, ma) and the iqm toolbox (intiquan gmbh, basel, switzerland) for modeling. the experimental data as well as the complete code for data analysis and modeling are available at https://gitlab.liu.se/isbgroup/projects/updated-multi-level. experimental and clinical data: no new data were collected in this study. we therefore refer to the methods sections in the original articles (frayn et al. 1993; coppack et al. 1996; gerich 2000; moore et al. 2012; iozzo et al. 2012; brännmark et al.
2013) for the corresponding experimental methods. distribution of postprandial glucose uptake between adipose, muscle, and liver (q1): the first improvement made to the original model (nyman et al. 2011), referred to as m0, was to update the redistribution of the glucose uptake among the different tissues (fig 2a). the liver accounts for almost half of the total postprandial glucose uptake (fig 2b) (gerich 2000), which was not explicitly accounted for in m0 (fig 2a, dotted line). we therefore adapted the fluxes to fit the data in fig 2b. more specifically, the liver was added as a glucose consuming organ, with a high net consumption compared to the other organs. in the updated model, referred to as m1 (table s2), the liver is set to take up 45% of the total postprandial glucose uptake (fig 2a-b), while adipose and muscle uptake were reduced to 5% and 27%, respectively. furthermore, the glucose uptake by organs whose uptake is not affected by a meal (e.g. brain and kidneys) was increased to 23%. note that in fig 2a, this constant uptake is symbolized by the kidneys and the brain, because those are the most prominent glucose consumers (gerich 2000), but that other tissues and organs can be seen as represented in this uptake as well. as a validation of these changes, we compared the resulting model simulations with data from other studies. more specifically, we compared the uptake of glucose in adipose and muscle tissue, as simulated by the two models m0 and m1, with data that measures the uptake in these two organs specifically. such measurements are possible using e.g. av difference data. in fig 2c, the area under the curve (auc) for m0 of adipose and muscle combined (dashed, light orange) is approximately two times larger than the auc of the data (solid, brown) in (frayn et al. 1993; coppack et al. 1996). this is clearly beyond the experimental uncertainty, and m0 is therefore rejected by a χ2 test (v(θ) = 76 > 16.9 = χ2_cum,inv(9, 0.05)).
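the cost function and rejection test used in these comparisons can be sketched as follows (python; chi2_cost is the conventional weighted least squares, and in practice the threshold would come from the inverse cumulative χ2 distribution, e.g. scipy.stats.chi2.ppf(0.95, df)):

```python
def chi2_cost(y, y_sim, sem):
    """v(p) = sum_i ((y_i - yhat_i(p)) / sem_i)^2, the conventional
    weighted least-squares cost used in the parameter estimation."""
    return sum(((yi - si) / s) ** 2 for yi, si, s in zip(y, y_sim, sem))

def rejected(cost, threshold):
    """a model is rejected when its cost exceeds the inverse cumulative
    chi2 threshold, e.g. ~16.9 for 9 degrees of freedom at alpha = 0.05."""
    return cost > threshold

# e.g. the m0 comparison quoted in the text: v(theta) = 76 > 16.9, so m0 is rejected.
```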
in contrast, m1 has approximately the same auc as the data, and its simulations lie within the experimental uncertainty for most data points. therefore, the time series is not rejected by the test based on these independent data (v(θ) = 4.4 < 16.9 = χ2_cum,inv(9, 0.05)). for these reasons, we reject m0, in favor of the new model m1. a more detailed check of the quality of the updated model m1 is obtained by looking at the muscle and adipose tissue glucose uptake one by one (fig 2d). for muscle (red), both the time-dynamics (left) and auc (right) agree between simulations (light red) and independent data (dark red). this visual observation is supported by a χ2 test (v(θ) = 4.9 < 16.9 = χ2_cum,inv(9, 0.05)). in contrast, the adipose tissue shows a reasonable agreement with data, but it is not quantitatively acceptable according to a χ2 test (v(θ) = 29.5 > 16.9 = χ2_cum,inv(9, 0.05)). looking closer at the time-series reveals that the value at the maximal uptake is fine, but that the problem lies in the fact that the dynamics of the uptake in muscle and adipose tissue are different, and that this is not captured in the model. difference in time-resolved glucose uptake in adipose and muscle tissue (q2): since the timing and agreement with dynamic glucose uptake in the muscle tissue is fine already in the model m1, this model was kept essentially intact. however, one minor modification that affects muscle uptake was introduced (fig 3). in the previous model (m1), the rate constant of insulin transport into the interstitium (v_1) is assumed to be the same (k_1) as for the rate of the subsequent degradation of insulin (v_2). since there is no reason for these values to be the same, we updated the model to give these two reaction rates their own rate constants (k_1 and k_2, respectively). we refitted both parameters together with the other new parameters (introduced below) to the data, and the resulting model is referred to as m2a.
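the auc comparisons between simulated and measured uptake curves can be reproduced with a simple trapezoidal rule (python sketch; the sample points below are illustrative, not data from the paper):

```python
# trapezoidal area under a sampled curve, as used when comparing
# simulated and measured uptake time-series. sample points illustrative.

def auc_trapezoid(t, y):
    """area under a sampled curve y(t) by the trapezoidal rule."""
    return sum((t1 - t0) * (y0 + y1) / 2.0
               for t0, t1, y0, y1 in zip(t, t[1:], y, y[1:]))

# e.g. a triangular uptake curve peaking at 2.0 over 2 time units:
area = auc_trapezoid([0.0, 1.0, 2.0], [0.0, 2.0, 0.0])
```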
the developments for the adipose tissue glucose uptake needed to be more elaborate, and are available in fig 4: the new model structure is depicted in fig 4a and comparison with data is included in fig 4b. as can be seen, the same difference as was introduced for muscle, m2a, yields a poor agreement with data for the adipose tissue, since the peak is too late. the main problem is that the glucose uptake in the adipose tissue has gone down to baseline levels already after around 100 min, while insulin levels are still high (dalla man et al. 2007) (7). therefore, since the glucose uptake in the current model cannot go down before insulin goes down, an additional mechanism is needed. one such possible mechanism is the fact that the hexokinase reaction has a product inhibition (may and mikulecky 1983). this leads to two new states in the next version of the model (m2b; fig 4a, red circle): intracellular glucose, glu_in, and phosphorylated glucose, g6p. as seen, there is an inhibition from g6p on the rate of phosphorylation of glu_in. this modification allows for the following chain-of-events. when glucose uptake begins, the amount of intracellular glucose starts to build up, which is then phosphorylated into g6p. when the g6p reaches saturation levels, g6p inhibits the phosphorylation process from intracellular glucose, which leads to increasing intracellular glucose levels. since the net glucose uptake is driven by the gradient across the cell membrane, this increase in intracellular glucose will decrease the glucose uptake, even though insulin levels still might be high. the resulting simulations of glucose uptake in muscle and adipose tissue (fig 4b, right) agree with the data both according to a visual check, and according to a χ2 test (v(θ) = 28.2 < 28.9 = χ2_cum,inv(7, 0.05)).
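the product-inhibition chain-of-events described above can be reproduced qualitatively in a small simulation (illustrative python sketch; the inhibition factor 1/(1 + g6p/Ki), the saturating phosphorylation term, and all constants are assumptions for illustration, not the paper's fitted equations):

```python
# illustrative euler simulation of the m2b mechanism: g6p inhibits
# phosphorylation of intracellular glucose, intracellular glucose builds up,
# and the gradient-driven uptake falls even while insulin would still be high.
# all functional forms and constants are assumptions for illustration only.

def step(glu_in, g6p, *, Gt=80.0, p_in=0.05, Vmax=2.0, Ki=5.0, k_out=0.1, dt=0.1):
    v_uptake = p_in * max(Gt - glu_in, 0.0)                     # gradient-driven uptake
    v_phos = Vmax * glu_in / (glu_in + 1.0) / (1.0 + g6p / Ki)  # product-inhibited phosphorylation
    return glu_in + dt * (v_uptake - v_phos), g6p + dt * (v_phos - k_out * g6p)

glu_in, g6p = 0.0, 0.0
uptakes = []
for _ in range(2000):
    uptakes.append(0.05 * max(80.0 - glu_in, 0.0))
    glu_in, g6p = step(glu_in, g6p)
# the uptake declines over time as intracellular glucose accumulates.
```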
improvements in the intracellular adipose tissue model: glucose metabolism and blood flow effects (q3): the final improvement made was the addition of the impact of blood flow on insulin-stimulated glucose uptake in the adipose tissue. this interaction was hypothesised in iozzo et al, where they looked at the effect of blood flow and insulin, separately and combined, on glucose uptake in adipose tissue (iozzo et al. 2012) (fig 5a-b). increased blood flow was achieved with the drug bradykinin. in these experiments, iozzo et al observed that glucose clearance was not significantly changed when only adding bradykinin (fig 5b, left). in contrast, when combining both bradykinin and insulin, the glucose uptake is increased compared with only adding insulin (fig 5b, right). the same behaviour is produced by the model in fig 5c, where the glucose uptake only increases when both bradykinin and insulin are present. the parameter bradykinin was changed from 1 to 30 to represent the addition of bradykinin, and the parameter ins_offset is changed from 0 to 7 to represent insulin infusion (fig 5a). this behaviour also agrees with data according to a χ2 test (v(θ) = 0.26 < 3.8 = χ2_cum,inv(2, 0.05), where the degrees of freedom have been compensated for with the number of new parameters, 4-2=2). the updated model is referred to as m3, and as for the other model additions, the new equations are briefly depicted in the figure (here fig 5a), and described in detail in materials and methods and supplementary files. finally, we consider the performance of the resulting final multi-level model, in relation to all of the data that has been generated over the years. the final model can fit to dynamic data of postprandial glucose uptake in both adipose and muscle tissue (fig 6a, same data as in fig 2d, from (coppack et al. 1996)).
the same figure displays predictions of dynamic uptake in the liver (for which the same type of av difference data is non-existent), and for the tissues with a constant demand of glucose (such as the brain). finally, the right-most sub-figure in fig 6a shows that the model agrees well with the total dynamic glucose uptake from (dalla man et al. 2007). furthermore, the aucs for the different tissues in the final combined model are in line with the corresponding auc data (fig 6b), just as they were in step q1 (fig 2b). the two left-most bars, for muscle and adipose tissue, are given by the auc of the corresponding time-series in fig 6a (cf fig 2d), and the liver and brain/kidney uptake are the same as in fig 2b. the final model is also in agreement with data previously used in the model development. the agreement with the most important such data sets is re-plotted in fig 7 (dalla man et al. 2007), which describes meal responses for the following variables: plasma glucose, plasma insulin, endogenous glucose production, glucose rate of appearance from the intestines, glucose uptake or utilization, and insulin secretion. as can be seen, the model simulations (lines) are within the experimental uncertainty (grey area) for all these time curves (agreements between simulation and data are similar to those in (dalla man et al. 2007)). similarly, because of the hierarchical way that the multi-level model is constructed, it also still agrees with all of the intracellular signalling data, which we have collected over the years (brännmark et al. 2013). the most important such data is depicted in fig 8. these data (error bars) describe time-series and dose-response curves in response to insulin for a number of intracellular proteins: the insulin receptor (ir), the insulin receptor substrate-1 (irs1), protein kinase-b (pkb), akt-substrate 160 (as160), ribosomal protein s6 kinase beta-1 (s6k1), ribosomal protein s6 (s6), as well as cellular glucose uptake.
the model simulations (lines) are in agreement with both data from non-diabetic and lean controls (blue), and from obese people with type 2 diabetes (red), with changes only in a few key parameters (for more details, see (brännmark et al. 2013)). similar agreements for additional proteins, such as extracellular signal-regulated kinases (erk1), ets like-1 protein elk-1 (elk1), forkhead box protein o1 (foxo1), etc, are equally possible to obtain by replacing the intracellular part of the model with those in (nyman et al. 2014; rajan et al. 2016). glucose homeostasis is a complex multi-organ and multi-level system, which requires multi-level mathematical modelling for a full understanding. we have herein improved an existing such model (nyman et al. 2011) for glucose fluxes in the circulation, linked to intracellular pathways in adipocytes, in response to a meal. specifically, we have (q1) made a new subdivision of glucose uptake between all relevant organs, to provide more reliable proportions and to include uptake in the liver (fig 2); (q2) improved the elimination of interstitial insulin to be tissue-specific (fig 3), and included intracellular metabolism of glucose inside adipocytes, to capture an earlier peak in the glucose uptake in adipocytes compared to the corresponding peak in plasma insulin (fig 4); and (q3) accounted for the impact of blood flow on glucose uptake (fig 5). the final combined model (q4) can fit to all of the new data for glucose uptake in all organs (fig 6), as well as to all previous data, such as the postprandial glucose and insulin fluxes and concentrations in (dalla man et al. 2007) (fig 7), and the intracellular data in (brännmark et al. 2013) (fig 8). to the best of our knowledge, this is the most comprehensive description of such a wide variety of data for glucose homeostasis in humans, and we hope that it will become a useful resource also for integration of future data.
one of the main contributions in this work is the addition of glucose uptake in the liver (q1). this addition is important because the liver is the organ that takes up the most glucose: approximately 45% (fig 2). apart from this, the liver has a unique function in glucose homeostasis, since it is the only organ that can produce glucose from other metabolites. these two functions, glucose uptake and endogenous glucose production (egp), are now modeled as separate processes. in other words, the liver can both produce and take up glucose at the same time. while there may be situations when only the net uptake/release is important, there are also situations when one can experimentally resolve the two fluxes. for instance, when labeled metabolites have been ingested, one can see the rate by which these are converted to glucose and secreted, even in postprandial conditions, when the net effect of glucose transport is into the cell. such data have previously been used to train the egp fluxes (fig 7) (dalla man et al. 2007), and we have now added corresponding data for glucose uptake (fig 6b). note that this model is only fitted to the data in fig 2b, and that the agreements seen in fig 2cd serve as a simple validation of this part of the model. with this said, it should be emphasized that both the muscle and the new liver module are highly simplified. only the muscle and adipose modules have been tested with respect to dynamic uptake data, and only the adipose module includes an intracellular signaling part, based on detailed intracellular data, resolving the complicated intracellular metabolic fluxes. these limitations are present primarily because such data are rare or non-existent. at the heart of resolving both q1 and q2 lie measurements of glucose fluxes, which have been measured in a variety of ways. the glucose fluxes from (dalla man et al.
2007) was based on a triple tracer protocol, which allows for the simultaneous calculation of plasma glucose, egp, glucose rate of appearance, and glucose utilization (fig 7). these data are based on advanced calculations, which in turn are based on various assumptions and mathematical models developed within the field of tracer based measurements (wolfe et al. 2005). these particular assumptions are not necessary in the organ specific glucose utilization curves, available e.g. for muscle and adipose tissue (fig 2c and d). these data are based on an av difference-based protocol, which samples both an artery and veins that have passed through either muscle or adipose tissue, and looks at the difference between the ingoing and the outgoing blood (coppack et al. 1996). this is a more direct way of measuring how each organ contributes to the glucose disappearance from the blood. nevertheless, av-difference data also does not measure glucose uptake in the primary cells, myocytes and adipocytes, respectively. this means that the quick decline in glucose uptake in adipose tissue (fig 4b) could in fact be the result of a quick equilibrium between interstitial and capillary glucose concentration. one could possibly develop an alternative model based on that equilibration-based assumption, to explain the quick decline of the glucose uptake in the adipose tissue, either as a replacement or as a complement to the herein implemented mechanism based on product inhibition (fig 4a). finally, the fact that the model is based on three different types of measurements of glucose uptake (cellular in vitro, tracer-based, and av-difference based), and can describe all of these types of data simultaneously, is a reason why a relatively simple validation, such as that in fig 2c-d, still is of value. the final question addressed herein (q3) concerns the impact of blood flow on glucose uptake, which is highly simplified because the real relationship is a bidirectional one.
the data in fig 5b show that glucose uptake is increased by increased blood flow, at least when insulin is present. this relationship is captured in the final model. however, that model can only describe situations where the blood flow is altered in a way that is not connected to the metabolic response, such as when adding bradykinin (fig 5b). in other words, the model cannot describe meal-induced blood flow changes and their associated impact on glucose uptake. the development of a model for blood-flow regulation during e.g. meal responses is an important task for future modelling work. another weakness of the blood flow part of the model concerns the lack of validation. the model is only fitted to the data in fig 6b. in the analysis, we compensate for that by reducing the degrees of freedom from the number of data points (4) to the number of data points minus the number of parameters (4-2=2). however, one could argue that the two baseline bars should not be counted, since they are normalized to 100%. under that interpretation, the degrees of freedom are 0, a chi2 test cannot be done, and the only possible assessment of the quality of the model is a visual comparison of the differences between fig 5b and c. for all these reasons, the blood flow part of the model is to be considered as a first step in the development of a model for the blood flow and its function in glucose homeostasis. it is important to compare the model presented herein to other similar models in the literature. in the introduction, we mentioned the now classical nonlinear mixed effects models describing plasma levels of glucose and insulin (silber et al. 2007; silber et al. 2010; jauslin et al. 2007). these models have, since these early publications, been used to scale between pre-clinical animal data and clinical human data for glucose and insulin concentrations (alskär et al.
2017), and to describe cross-talk with more long-term processes, such as disease development in mice (choy et al. 2016) and the dynamics of hba1c (kjellsson et al. 2013; møller et al. 2013). glucose homeostasis-centered models, focusing on the glucose-insulin interplay, lie at the heart of mathematical models developed for type 1 diabetes, e.g. to aid insulin pumps and to develop a so-called artificial pancreas (huang et al. 2012; fabris and kovatchev 2020). another application of glucose homeostasis models is a meal-response t2d simulator model, developed for pedagogical and motivational purposes (maas et al. 2015). none of these models have subdivided glucose uptake among the different organs, or included intracellular responses, in multi-level and multi-organ models. there exists one model that does this, developed by uluseker et al. (uluseker et al. 2018). this multi-level model is based on a version of the dalla man model (dalla man et al. 2007) connected with our intracellular adipocyte model (brännmark et al. 2013), while also including hormonal effects on glucose intake/appetite (leptin, ghrelin) and insulin levels (incretin). however, that work does not compare its whole-body simulations with any data, and does not include the liver as a glucose-consuming organ. the model presented in this work only includes intake of glucose, and thus discards the effects of proteins and fat on the meal response, something that other models do take into account, to some extent. sips et al. developed a model that integrates fatty acids with glucose metabolism (sips et al. 2015), but this model needs a triglyceride curve as input and lacks protein metabolism. nevertheless, the sips model is another expansion of the dalla man model (dalla man et al. 2007) and can thus be merged with the developments herein. two models that include protein and fat intake from a meal are the ones developed by hall et al. and sarkar et al. (hall et al. 2011; sarkar et al. 2018).
these models are, however, developed for long-term simulations (over several years), and can thus not simulate a meal response. similarly to the model presented here, the sarkar model includes liver, muscle, and adipose tissue as glucose-consuming organs, but in contrast also adds the pancreas as a glucose-consuming organ. furthermore, the sarkar model disregards the organs taking up a constant amount of glucose (brain and kidneys). in any case, the sarkar model only describes data for long-term dynamics, and does not describe meal responses. another longitudinal model describing glucose dynamics on both short and long time-scales is the one developed by ha et al. (ha and sherman 2019). this model is, in contrast to the other two longitudinal models mentioned above, multi-scale in that it can capture both changes over years, including the progression towards diabetes in a semi-mechanistic fashion, and meal-response dynamics happening on the scale of hours and minutes. this model does not, however, include the distribution of glucose among different organs. there are also some multi-level and multi-scale models for other systems that should be mentioned. one such model is the one developed by barbiero et al. (barbiero and lió 2020). this model combines whole-body dynamics with the function of organs and individual cells, and is able to simulate dynamics from seconds up to several days. the model was used to simulate the cardiovascular and inflammatory effects of both t2d and covid-19, using personalized parameters. however, this model has an important shortcoming: its simulations are not compared with any data. there also exist interconnected models for e.g. heart function, describing the function of cardiac cells up to the integrated behavior of the intact heart (smith et al. 2009). in summary, there does not exist any other multi-level model describing the glucose meal response that also separates between the different organs' glucose uptake.
in this work, we present such a model, which, due to its modular approach, can be easily expanded in different directions. this expansion-possibility is due both to the modular structure and to the fact that each module can be treated as a separate modelling problem. in other words, as long as the model for each module agrees with the input-output profiles of insulin and glucose, a new model can replace the old one, with little alteration of whole-body dynamics. in the earlier developed model (nyman et al. 2011), we took this modularity one step further, by replacing the simpler 5-state insulin receptor module with a much more detailed 37-state module for the receptor dynamics, including the possibility for a receptor to bind up to three insulin molecules (kiselyov et al. 2009). this demonstrates the usefulness of developing a model in modules, so that the right level of detail can be included depending on the data and questions one wants to analyze. since the original publication of our first multi-level model (nyman et al. 2011), we have built further on this model in several directions, and all of these developments can be re-used also in our new model. we have e.g. expanded the intracellular part to explain a more and more comprehensive picture of the alterations in intracellular signaling that occur in t2d. this has been done by taking adipose tissue biopsies from both healthy and t2d individuals and characterising their respective insulin signalling. in (brännmark et al. 2013), we presented a first model of how insulin resistance occurs, and in subsequent works we have added additional proteins, such as the transcription factor foxo1 (rajan et al. 2016) and insulin control of the mapks erk1/2 (nyman et al. 2014). because of the modular way that our multi-level model is structured, one can replace the herein used intracellular model with any of these alternatives. the same expansions can be done also for other organs.
we therefore hope that this multi-level model in the future can serve as a hub for connecting data and models together into a useful systems-level understanding.
figure 2 legend: (a) in model m1, the liver is added, the amount of glucose utilization in muscle and adipose tissue is reduced, and the uptake that is constant during a meal (other tissues) is increased compared to the original m0 model. (b) glucose distribution among organs observed in data from (gerich 2000). (c) glucose uptake in muscle and adipose tissue combined, for m1 and m0. the area under the curve for m0 is higher than seen in data from (frayn et al. 1993; coppack et al. 1996), and m0 is thus rejected. (d) comparison between model m1's predictions of adipose and muscle glucose uptake and new data not used for parameter estimation (frayn et al. 1993; coppack et al. 1996).
figure 4 legend: (a) illustration of the new intracellular adipose tissue module and its ode equations. the flow of glucose into the cell, v_in, depends on the amount of glucose in the interstitium (g_t) and inside the cell (glu_in), and on the amount of glut4m and glut1 membrane glucose transporters, through ins_f,e. the outflow, v_out, depends only on glu_in, which in turn depends on v_in, v_out, and the phosphorylation of glucose into g6p (v_g6p). the amount of g6p depends only on v_g6p and the usage of g6p in metabolism (v_met). (b) timing comparison between the uptake seen in data and the two models: m2a without phosphorylation, and m2b with glucose phosphorylation. in m2b, the peak comes earlier and the quantity of glucose taken up is closer to data than in m2a.
figure 5 legend: (b) behaviour seen in data in response to insulin and bradykinin. insulin alone has a relatively small effect on glucose clearance, but increases glucose uptake significantly when combined with bradykinin (iozzo et al. 2012). (c) the same behaviour as in (b) (iozzo et al. 2012) can be simulated with the model.
figure 5 legend (continued): adding bradykinin is simulated by increasing the value of bradykinin, and adding an insulin infusion is simulated by increasing the value of ins_offset from 0.
figure 6 legend: (a) (coppack et al. 1996); the total glucose uptake is within the bounds presented in (dalla man et al. 2007). (b) total glucose uptake for all organs, simulated by the final model and from the data used to fit the model (coppack et al. 1996).
model m4 legend (brännmark et al. 2013): m4 can describe data for intracellular insulin signaling in adipocytes, both normally (blue) and in t2d (red). ir, insulin receptor; irs1, insulin receptor substrate-1; pkb, protein kinase-b; as160, akt-substrate.
reaction rates of the intracellular signaling module:
v1a = ir · k1a · (ins + 5) · 1e-3 (115)
v4c = k4c · pkb308p · mtorc2a (116)
v4e = k4e · pkb473p · irs1p307 (117)
v4f = k4f · pkb308p473p (118)
v4h = k4h · pkb473p (119)
v6f1 = as160 · (k6f1 · pkb308p473p + k6f2 · pkb473p^n6 / (km6^n6 + pkb473p^n6)) (120)
v9f1 = s6k · k9f1 · mtorc1a^n9 / (km9^n9 + mtorc1a^n9) (124)
v9b1 = s6kp · k9b1 (125)
v9f2 = s6 · k9f2 · s6kp (126)
v9b2 = s6p · k9b2 (127)
figure s1: interaction graph for model m2b.
model development summary:
- original multi-level model: rejected by figure 2
- updated glucose distribution among organs: can describe figure 2; rejected by figure 3
- updated glucose dynamic behaviours by improving interstitial fluid insulin: can describe figure 3; rejected by figure 4
- updated glucose dynamic behaviours by redesigning an intracellular model: can describe figure 4; rejected by figure 5 (blood flow has an influence on adipose tissue glucose uptake; insulin has an influence on adipose tissue glucose uptake)
references (titles as extracted):
- the impact of mathematical modeling on the understanding of diabetes and related complications
- model-based interspecies scaling of glucose homeostasis
- the computational patient has diabetes and a covid
- physiologic evaluation of factors controlling glucose tolerance in man: measurement of insulin sensitivity and β-cell glucose sensitivity from the response to intravenous glucose
- insulin signaling in type 2 diabetes: experimental and modeling analyses reveal mechanisms of insulin resistance in human adipocytes
- systems biology: model based evaluation and comparison of potential explanations for given biological data
- modeling the disease progression from healthy to overt diabetes in zdsd rats
- carbohydrate metabolism in insulin resistance: glucose uptake and lactate production by adipose and forearm tissues in vivo before and after a mixed meal
- meal simulation model of the glucose-insulin system
- the closed-loop artificial pancreas in 2020
- periprandial regulation of lipid metabolism in insulin-treated diabetes mellitus
- physiology of glucose homeostasis
- type 2 diabetes: one disease, many pathways
- quantification of the effect of energy imbalance on bodyweight
- modeling impulsive injections of insulin: towards artificial pancreas
- the interaction of blood flow, insulin, and bradykinin in regulating glucose uptake in lower-body adipose tissue in lean and obese subjects
- an integrated glucose-insulin model to describe oral glucose tolerance test data in type 2 diabetics
- harmonic oscillator model of the insulin and igf1 receptors' allosteric binding and activation
- a model-based approach to predict longitudinal hba1c, using early phase glucose data from type 2 diabetes mellitus patients after anti-diabetic treatment
- in silico model and computer simulation environment approximating the human glucose/insulin utilization
- a physiology-based model describing heterogeneity in glucose metabolism: the core of the eindhoven diabetes education simulator (e-des)
- glucose utilization in rat adipocytes. the interaction of transport and metabolism as affected by insulin
- longitudinal modeling of the relationship between mean plasma glucose and hba1c following antidiabetic treatments
- regulation of hepatic glucose uptake and storage in vivo
- a single mechanism can explain network-wide insulin resistance in adipocytes from obese patients with type 2 diabetes
- insulin signaling - mathematical modeling comes of age
- a hierarchical whole-body modeling approach elucidates the link between in vitro insulin signaling and in vivo glucose homeostasis
- requirements for multi-level systems pharmacology models to reach end-usage: the case of type 2 diabetes
- systems-wide experimental and modeling analysis of insulin signaling through forkhead box protein o1 (foxo1) in human adipocytes, normally and in type 2 diabetes
- a long-term mechanistic computational model of physiological factors driving the onset of type 2 diabetes in an individual
- an integrated model for glucose and insulin regulation in healthy volunteers and type 2 diabetic patients following intravenous glucose provocations
- an integrated model for the glucose-insulin system
- model-based quantification of the systemic interplay between glucose and fatty acids in the postprandial state
- the cardiac physiome: at the heart of coupling models to measurement
- a closed-loop multi-level model of glucose homeostasis
- isotope tracers in metabolic research: principles and practice of kinetic analysis
- ribosomal protein s6 kinase beta-1 s6
we thank the swedish research council (2018-05418, 2018-03319, 2019-03767), ceniit (15.09, 20.08), the heart and lung foundation, the swedish foundation for strategic research (itm17-0245), the scilifelab/kaw national covid-19 research program (project grant 2020.0182), h2020 (precise4q, 777107), the swedish fund for research without animal experiments, and elliit. this manuscript has been submitted to biorxiv. the authors declare that they have no conflict of interest.
all the odes for the final model m4: 26
all variables of final model:
key: cord-025517-rb4sr8r4 authors: koutsomitropoulos, dimitrios a.; andriopoulos, andreas d. title: automated mesh indexing of biomedical literature using contextualized word representations date: 2020-05-06 journal: artificial intelligence applications and innovations doi: 10.1007/978-3-030-49161-1_29 sha: doc_id: 25517 cord_uid: rb4sr8r4 appropriate indexing of resources is necessary for their efficient search, discovery and utilization. relying solely on manual effort is time-consuming, costly and error prone. on the other hand, the special nature, volume and broadness of biomedical literature pose barriers to automated methods. we argue that current word embedding algorithms can be efficiently used to support the task of biomedical text classification. both deep and shallow network approaches are implemented and evaluated. large datasets of biomedical citations and full texts are harvested for their metadata and used for training and testing. the ontology representation of medical subject headings provides machine-readable labels and specifies the dimensionality of the problem space. these automated approaches are still far from entirely substituting human experts, yet they can be useful as a mechanism for validation and recommendation. dataset balancing, distributed processing and training parallelization on gpus all play an important part in the effectiveness and performance of the proposed methods. digital biomedical assets include a variety of information, ranging from medical records to equipment measurements to clinical trials and research outcomes.
the digitization and availability of biomedical literature are important in at least two aspects: first, this information is a valuable source for open education resources (oers) that can be used in distance training and e-learning scenarios; second, future research advancements can stem from the careful examination and synthesis of past results. for both these directions to take effect, it is critical to consider automatic classification and indexing as a means to enable efficient knowledge management and discovery for these assets. in addition, the sheer volume of biomedical literature is continuously increasing and puts excessive strain on manual cataloguing processes: for example, the us national library of medicine experiences a daily workload of approximately 7,000 articles for processing [12]. research in the automatic indexing of literature is constantly advancing, and various approaches have recently been proposed, a fact indicating that this is still an open problem. these approaches include multi-label classification using machine learning techniques, training methods and models on large lexical corpora, and semantic classification approaches using existing thematic vocabularies. to this end, the medical subject headings (mesh) thesaurus is the de-facto standard for thematically annotating biomedical resources [18]. in this paper we propose and evaluate an approach for automatically annotating biomedical articles with mesh terms. while such efforts have been investigated before, in this work we are interested in the performance of current state-of-the-art algorithms based on contextualized word representations, or word embeddings. we suggest producing vectorized word and paragraph representations of articles based on context and existing thematic annotations (labels). consequently, we seek to infer the most similar terms stored by the model, without the need and overhead of a separate classifier.
moreover, we combine these algorithms with structured semantic representations in web ontology language (owl) format, such as the implementation of the mesh thesaurus in owl simple knowledge organization system (skos) [20]. finally, we investigate the effect and feasibility of employing distributed data manipulation and file system techniques for dataset preprocessing and training. the rest of this paper is organized as follows: in sect. 2 we summarize current word embedding approaches as the main background and identify the problem of automated indexing; in sect. 3 we review relevant literature in the field of biomedical text classification; sect. 4 presents our methodology and approach, by outlining the indexing procedure designed, describing the algorithms used and discussing optimizations regarding dataset balancing, distributed processing and training parallelization. section 5 contains the results of the various experiments and their analysis, while sect. 6 outlines our conclusions and future work. word embedding techniques [9] convert words into word vectors. the following approaches have emerged in recent years, with the performance of text recognition as the primary objective. the start was made with the word2vec algorithm [11], where unique vector word representations are generated by means of shallow neural networks and a prediction method, and with its explicit extension doc2vec (document to vector) [8], where a unique vector representation can also be given to whole texts. next, the global vectors (glove) algorithm [14] transfers words into a vector space by making use of a counting (co-occurrence) method. then, the fasttext algorithm [4] achieves not only the management of large bulks of data in optimal time but also better word embeddings, due to its use of subword units. in addition, the elmo algorithm [15] uses deep neural networks, lstms, and produces a different vector representation for a word whose meaning differs with context.
lastly, the bert algorithm [2] also generates different representations of a word according to its meaning, but instead of lstms it uses transformer blocks [21]. assigning a topic to text data is a demanding process. nevertheless, if approached correctly, it ensures easier and more accurate access for the end user. a typical subcase of the topic assignment problem is the creation of mesh indexes in repositories of biomedical publications. this particular task, which improves the time and the quality of information retrieval from the repositories, is usually undertaken by field experts. however, this manual approach is time consuming (it takes two to three months to incorporate new articles) and costly (approximately $10 per article) [10]. a plethora of research approaches have attempted to tackle the indexing problem. to do this, they use word embeddings in combination with classifiers. typical cases are discussed below. mesh now [10] ranks the candidate terms based on their relevance to the target article and selects those with the highest ranking, achieving a 0.61 f-score. to do this, the researchers use k-nn and support vector machine (svm) algorithms. another approach, named deepmesh [13], deals with two challenges: it examines both the frequency characteristics of the mesh tags and the semantics of the references themselves (citations). for the former it proposes a deep semantic representation called d2v-tfidf, and for the latter a classification framework. a k-nn classifier is used to rank the candidate mesh headings. this system achieves an f-score of 0.63. another study [5] applies the word2vec algorithm to all the abstracts of the pubmed repository, thereby generating a complete dictionary of 1,701,632 unique words.
the use of these vectors as a method of dimensionality reduction is examined, allowing greater scaling in hierarchical text classification algorithms, such as k-nn. by selecting a skip-gram neural network model (fasttext), vectors of size 300 are generated with windows ranging from 2 to 25, which the authors call mesh-gram, reaching an f-score of 0.64 [1]. moreover, starting from the assumption that similar documents are classified under similar mesh terms, and using the cosine similarity metric together with a representation of the thesaurus as a graph database, researchers have produced an implementation with an f-score of 0.69 [16]. a related approach converts texts into vectors with the use of elasticsearch and identifies the most similar texts with the help of the cosine similarity metric. then, by deriving the tags from these texts and calculating their frequency of occurrence in conjunction with similarity, an evaluation function is defined which ranks documents. biowordvec [22] is an open set of biomedical word vectors/embeddings which combines subword information from unlabeled biomedical text with mesh. there are two steps in this method: first, constructing the mesh term graph based on its rdf data and sampling mesh term sequences; second, employing the fasttext subword embedding model to learn distributed word embeddings based on text sequences and mesh term sequences. in this way, the f-score is improved to 0.69 and 0.72 for cnn and rnn models, respectively. our proposed approach for the indexing of biomedical resources starts with assembling the datasets to be used for training. we then proceed by evaluating and reporting on two prominent embedding algorithms, namely doc2vec and elmo. the models constructed with these algorithms, once trained, can be used to suggest thematic classification terms from the mesh vocabulary.
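the suggestion step used throughout these approaches (embed a body of text, score it against vectors for mesh terms, keep the best matches above a similarity threshold) can be sketched as follows; the tiny 3-dimensional vectors and term names are illustrative stand-ins for trained 100-dimensional embeddings, not output of any actual model:

```python
import math

def cosine(u, v):
    # cosine similarity between two equal-length vectors
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def suggest_terms(text_vec, term_vecs, threshold=0.5, top_k=10):
    # score the text vector against every mesh term vector and keep
    # the suggestions above the similarity threshold, best first
    scored = [(term, cosine(text_vec, vec)) for term, vec in term_vecs.items()]
    kept = [(t, s) for t, s in scored if s >= threshold]
    return sorted(kept, key=lambda ts: ts[1], reverse=True)[:top_k]

# hypothetical stand-ins for trained term embeddings
term_vecs = {
    "neoplasms": [0.9, 0.1, 0.0],
    "asthma": [0.1, 0.9, 0.2],
    "lung diseases": [0.3, 0.6, 0.6],
}
body_vec = [0.15, 0.85, 0.3]  # would come from the trained model
for term, score in suggest_terms(body_vec, term_vecs):
    print(term, round(score, 3))
```

in a real pipeline the body vector would come from something like a trained doc2vec model's infer_vector on the tokenized title and abstract, rather than being hard-coded.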
in an earlier work we have shown how to glean together resources from various open repositories, including biomedical ones, in a federated manner. user query terms can be reverse-engineered to provide additional mesh recommendations, based on query expansion [7]. finally, we can combine and assess these semantic recommendations by virtue of a trained embeddings model [6]. each item's metadata are scraped for the title and abstract of the item (body of text). this body of text is then fed into the model, and its vector similarity score is computed against the list of mesh terms available in the vocabulary. training datasets comprise biomedical literature from open-access repositories, including pubmed [19], europepmc [3] and clinicaltrials [17], along with their hand-picked mesh terms. for those terms that may not occur at all within the datasets, we fall back to their scopenote annotations within the mesh ontology. as a result, the model comes up with a set of suggestions together with their similarity scores (fig. 1). for the application of the doc2vec and elmo methods, a dataset from the pubmed repository with records of biomedical citations and abstracts was used. in december of every year, the core pubmed dataset integrates any updates that have occurred in the field. each day, the national library of medicine produces updated files that include new, revised and deleted citations. about 30 m records, accumulated up to december 2018, can be accessed by researchers. another source is europepmc, a european-based database that mirrors pubmed abstracts and also provides free access to full texts and to an additional 5 m other relevant resources. each entry in the dataset contains information such as the title and abstract of the article, and the journal in which the article was published. it also includes a list of subject headings that follow the mesh thesaurus.
these headings are selected and inserted after manual reading of the publication by human indexers. indexers typically select 10-12 mesh terms to describe every indexed paper. mesh is a specialized solution for achieving a uniform and consistent indexing of biomedical literature. in addition, it has already been implemented in skos [20]. it is a large and dense thesaurus, consisting of 23,883 concepts. the clinicaltrials repository also provides medical data, that is, records of scientific research studies in xml. clinicaltrials contains a total of 316,342 records, and each record is also mesh indexed. we include these records with the ones obtained from pubmed and europepmc for the purposes of variability and dataset diversity. our initial methodology for collecting data for training followed a serial approach [6]. access to pubmed can be obtained easily via the file transfer protocol (ftp). the baseline folder includes 972 zip files, up to a certain date (december 2018). each file is managed individually, with the information being serially extracted, thus rendering the completion of the entire effort a time-costly process. in addition to delays, this makes the entire process dependent on a constant internet connection to the repositories and susceptible to any interruptions that may occur.
algorithm 1: dataset preparation procedure
input: xml files from repository
output: two csv files
step 1. for each file in repository do
step 2.   connect to ftp server
step 3.   get file to local disk
step 4.   store file as line in rdd
step 5.   delete file from local disk
step 6. end for
step 7. parse file (useful information is extracted)
step 8. convert rdd to dataframe
step 9. write useful information to csv files
to solve this particular problem, we investigate the use of a distributed infrastructure at the initial stage of data collection. for this purpose, apache spark, a framework for parallel data management, was used. xml files are now stored as a whole in a dataframe.
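for illustration, the parsing stage of algorithm 1 (steps 7-9) can be sketched without the distributed machinery; the element names below mirror typical pubmed-style xml, but the snippet is a simplified stand-in and not the authors' code:

```python
import csv
import io
import xml.etree.ElementTree as ET

# hypothetical, minimal pubmed-like record used only for this illustration
SAMPLE = """<PubmedArticleSet>
  <PubmedArticle>
    <ArticleTitle>A study of insulin signaling</ArticleTitle>
    <AbstractText>We analyse insulin signaling in adipocytes.</AbstractText>
    <MeshHeading><DescriptorName>Insulin</DescriptorName></MeshHeading>
    <MeshHeading><DescriptorName>Adipocytes</DescriptorName></MeshHeading>
  </PubmedArticle>
</PubmedArticleSet>"""

def parse_articles(xml_text):
    # extract (title, abstract, mesh terms) per article, as in step 7
    root = ET.fromstring(xml_text)
    for art in root.iter("PubmedArticle"):
        title = art.findtext(".//ArticleTitle", default="")
        abstract = art.findtext(".//AbstractText", default="")
        mesh = [d.text for d in art.iter("DescriptorName")]
        yield title, abstract, "|".join(mesh)

# step 9: write the extracted information as csv rows
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["title", "abstract", "mesh_terms"])
for row in parse_articles(SAMPLE):
    writer.writerow(row)
print(buf.getvalue())
```

in the paper's distributed variant, the same extraction logic runs over an rdd of whole files and the result is collected into a spark dataframe before the csv export.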
in detail, all the information in an xml file is read as a line and then converted into a resilient distributed dataset (rdd). the useful information is then extracted as in the previous procedure. finally, the rdd is converted into a dataframe, from which the information is easily extracted, for example into csv files. with this process, although extracting data from the repositories is still unavoidable, parsing can now be performed on a distributed infrastructure. an essential part of the dataset preparation process is to cover the whole thesaurus as thoroughly as possible. thus, apart from the full coverage of the thesaurus terms, the model must also learn each term from an adequate number of examples (term annotations), and this number should be similar across terms. otherwise, there will be a bias towards specific terms that happen to have several training samples vs. others which might have only a few. therefore, to achieve a balanced dataset for training, when a term is incorporated into the set, its number of samples is restricted by an upper limit (algorithm 2). if there are fewer samples, the term is ignored; if there are more, the exceeding annotations are cut off. the final dataset, as shown in table 1, does not fully cover the thesaurus, but the terms it contains are represented by an adequate and uniform number of samples. doc2vec model. to create the model, the input is formed from the "one-hot" vectors of the fixed-size body of text, which are equal in size to the dictionary. the hidden layer includes 100 nodes with linear activation functions, matching the dimensionality of the resulting vectors. the output layer is again a vector of dictionary size, with a softmax activation function. the training of the doc2vec model, with the help of the gensim library, is performed using the following parameters: train epochs 100, vector size 100, learning rate 0.025 and min count 10.
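the balancing rule just described (algorithm 2) can be sketched as follows; the thresholds here are small illustrative values, whereas the actual dataset caps each term at around 100 samples:

```python
from collections import defaultdict

def balance(samples, min_samples=3, max_samples=5):
    """samples: list of (doc_id, label) annotations. returns a balanced subset:
    labels with fewer than min_samples are ignored, the rest are capped."""
    by_label = defaultdict(list)
    for doc_id, label in samples:
        by_label[label].append(doc_id)
    kept = []
    for label, docs in by_label.items():
        if len(docs) < min_samples:
            continue                       # too rare: ignore the term
        for doc_id in docs[:max_samples]:  # too frequent: cut off the excess
            kept.append((doc_id, label))
    return kept

# toy annotations: "insulin" is over-represented, "kidney" is too rare
raw = [(i, "insulin") for i in range(8)] + [(i, "kidney") for i in range(2)]
balanced = balance(raw)
print(len(balanced))  # insulin capped at 5, kidney dropped -> 5
```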
a variety of tests were performed to estimate these values. tests have shown that, when there are only a few samples per term, a larger number of epochs can compensate for the sparsity of training samples. in addition, removing words with fewer than 10 occurrences also creates better and faster vector representations for thesaurus terms. the created model is stored so that it can be called directly when needed. this model, with the final weights of the synapses that have emerged, is in fact nothing more than a dictionary. the content of this dictionary is the set of words used in the training, along with their vector representations, as well as a vector representation for each complete body of text. for the elmo model, all the vectors are initially extracted through a url connection. then, all the words related to the topic and mentioned in labels are converted into classes, that is, numeric values, e.g. 0, 1, 2, etc., which in turn become "one-hot" vectors. to create the model, we have used the keras library. the input layer receives one body of text (title and abstract) at a time. the next layer is the lambda layer, which is fed by the input layer and uses the elmo embeddings, with an output size of 1024. to create the elmo embeddings, the vectors derived from the url are used, and the full body of text is converted into string format and compressed at the same time with the help of the tensorflow package. selecting the default parameter in this process ensures that the vector representation of the body of text will be the average of the word vectors. the next layer is a dense one, which is fed by the lambda layer and contains 256 nodes with the relu activation function. finally, there is the output layer, which is also a dense layer with as many nodes as classes; in this case the activation function is softmax. in the final stage of the system (compile), the loss parameter is categorical_crossentropy, because a categorization is being attempted.
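in toy dimensions, the layer stack just described (a fixed elmo embedding feeding a 256-node relu dense layer and a softmax output over the classes) amounts to the following; the sizes and random weights are illustrative stand-ins, and the actual implementation uses keras with 1024-dimensional embeddings:

```python
import math
import random

def dense(x, weights, bias):
    # one fully connected layer: weights is a list of rows, one per output node
    return [sum(wi * xi for wi, xi in zip(w, x)) + b for w, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def softmax(x):
    # numerically stable softmax over the output nodes
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

random.seed(0)
emb_dim, hidden, n_classes = 8, 4, 3   # toy stand-ins for 1024, 256, #labels
w1 = [[random.uniform(-1, 1) for _ in range(emb_dim)] for _ in range(hidden)]
b1 = [0.0] * hidden
w2 = [[random.uniform(-1, 1) for _ in range(hidden)] for _ in range(n_classes)]
b2 = [0.0] * n_classes

embedding = [random.uniform(-1, 1) for _ in range(emb_dim)]  # "elmo" vector
probs = softmax(dense(relu(dense(embedding, w1, b1)), w2, b2))
print(probs)  # one probability per class, summing to 1
```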
The Adam optimizer and the accuracy metric are chosen. The training of the ELMo model is done by starting a TensorFlow session with parameters epochs=10 and batch_size=10, ensuring that training will not take longer than 10 epochs and that the data will be transferred in batches of 10 samples. Upon completion, the finalized weights are stored so that a model can be created immediately, either for evaluation or for any other required process.

In implementing the above two models, the most significant stage is the execution of the training algorithm. Depending on the architecture of the model, this process is particularly demanding on computing resources. Most experiments were conducted on average commodity hardware (Intel i7, 2.6 GHz, 4-core CPU with 16 GB of RAM). For a shallow neural network such as Doc2vec, execution completes relatively quickly without significant processing power demands; our test configuration was sufficient for at least the 100 epochs needed to train the model properly. However, this is not the case for deep neural networks such as ELMo, where multiple layers with many connections add complexity and increase the computational power required for adequate training. Therefore, finding a way to parallelize the whole process proved necessary, not only to optimize training time but also to complete the effort at all. Specifically, increasing the size of the training set in combination with the number of epochs can lead either to a collapse of the algorithm's execution or to prohibitive completion times. As an example, training ELMo on a dataset of 1,000 samples (10 labels with 100 items each) took about 20 minutes on the above hardware configuration. Based on this concern, we conducted experiments on infrastructures with a large number of graphics processing units (GPUs).
GPUs owe their speed to high memory bandwidth and, generally, to hardware that performs calculations at a much higher rate than conventional CPUs. We experimented on high-performance hardware with two 12-core Intel Xeon CPUs, 32 GB of RAM, and an NVIDIA V100 GPU. The V100 has 32 GB of memory and 5,120 cores, and supports CUDA v10.1, an API that allows machine learning algorithms to use GPUs in parallel. In this configuration, training the same model on the 1,000-sample dataset took only 30 s. This 40x speedup makes it feasible to run experiments with larger datasets that would otherwise be prohibitive.

To evaluate the Doc2vec approach, a total of 10,000 samples was used. This test set is balanced with the same procedure as the 100k dataset we previously used for training: each MeSH label occurring in the test set appears within the annotations of 10 bibliographic items, vs. 100 in the training set. In both sets, a total of 1,000 distinct MeSH labels, out of the available 22k, are considered. A body of text (title and abstract) is given as input to the model, which in turn generates a vector. The model then searches through the set of vectors already incorporated from training to find those that are close to the generated one. The process is quite difficult due to the plethora of labels overall, but also per sample, as each one contains a finite, non-constant number of labels. The threshold value reported is the similarity score above which suggestions are considered; at most 10 suggestions are produced by the model. Figure 2 plots precision (P) and recall (R) for various threshold values. For our purposes, a single information need is defined as a single-term match with an item (1-1), and we report the mean value of these metrics over all such matches, i.e., they are micro-averaged.
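The search-and-threshold step described above (infer a document vector, rank stored term vectors by similarity, keep at most 10 suggestions above a cutoff) can be sketched with plain cosine similarity. The function names and the default cutoff are illustrative; the 0.64 value mirrors the threshold reported later in the paper:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def suggest_terms(doc_vec, term_vecs, threshold=0.64, top_k=10):
    """Rank thesaurus terms by similarity to the inferred document vector,
    keep only those above `threshold`, and return at most `top_k`."""
    scored = [(term, cosine(doc_vec, vec)) for term, vec in term_vecs.items()]
    scored = [(t, s) for t, s in scored if s >= threshold]
    scored.sort(key=lambda ts: ts[1], reverse=True)
    return scored[:top_k]
```

Raising `threshold` shrinks the suggestion list, which is exactly the precision/recall tradeoff plotted in Figure 2.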
Precision measures how many of these 1-1 suggestions are correct out of the total suggestions made, while recall measures correct suggestions against the total number of matches contained in the ground truth. Beyond the standard tradeoff between precision and recall, we first notice that the higher the threshold, the better the precision. However, an increase in the threshold causes recall to drop. This makes sense, because there may be considerably fewer than 10 suggestions at higher thresholds, i.e., only a few terms pass the similarity threshold, leaving out some relevant terms. For middle threshold values, more than 60% of predictions are correct and they also cover over 60% of the ground truth.

Similarly balanced test sets were also used to evaluate the ELMo model, with a varying number of labels and samples per label. The results are shown in Figure 3. Initially, results comparable to or better than previous related work are observed for precision and recall when a small number of labels is selected. Given that the dataset is balanced, these improve considerably when allowing 10x more samples for each label, which lets the model learn better embeddings. Raising the number of labels to 100 increases the dimensionality of the problem: each item must now be classified among 100 classes rather than 10. Consequently, we notice a decrease in both metrics which, however, is ameliorated with the use of more samples (1,000 per label), as expected. Nonetheless, further increasing the complexity of the dataset, by allowing 1,000 labels, causes absolute values to decrease considerably. Even on the better-performing hardware, any further efforts to improve the results in the 1,000-label case, notably by increasing the number of samples, did not succeed: the execution of the algorithm was interrupted each time by the need for additional memory.
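The micro-averaged precision and recall over 1-1 term matches, as defined above, amount to pooling true positives, false positives, and false negatives across all items before dividing. A minimal sketch, assuming suggestions and ground truth are given as sets of terms per item:

```python
def micro_pr(suggested, truth):
    """Micro-averaged precision/recall over per-item term suggestions.

    `suggested` and `truth` map item ids to sets of thesaurus terms.
    Counts are pooled over all items before dividing (micro-averaging),
    so every 1-1 match counts equally regardless of which item it is in.
    """
    tp = fp = fn = 0
    for item, sugg in suggested.items():
        gold = truth.get(item, set())
        tp += len(sugg & gold)   # suggested and in the ground truth
        fp += len(sugg - gold)   # suggested but not in the ground truth
        fn += len(gold - sugg)   # in the ground truth but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```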
Since our ELMo-based model performs multi-class (but not multi-label) classification, micro-averaged precision and recall are identical. Moreover, because our test set is balanced, micro- and macro-averaged recall are also equal (the denominator is the fixed number of samples per label). Figure 3 plots macro-averaged precision, i.e., precision averaged equally over each class/label. This macro-averaged precision will always be higher than the micro-averaged one, which explains why P appears better than R, but only slightly, because of the absence of imbalance; P and R are virtually the same.

The Doc2vec model is capable of making matching suggestions on its own, by employing similarity scores. A threshold value of 0.64 is where precision and recall reach their maximum values. Certainly, fewer than 10 suggestions may pass this limit, but these are more accurate, apparently because of their higher scores; they also occur more frequently within the terms suggested by experts in the ground truth: on average, each biomedical item in the test set hardly contains 2 MeSH terms, let alone 10 (see Table 1). This observation validates that the similarity measure produced by the model is highly relevant and correlated with the quality of suggestions. The ELMo model, in turn, has allowed us to construct a multi-class classification pipeline built around the ELMo embeddings. This achieves very good results when dimensionality is kept low, i.e., when a relatively small number of labels (classes) is selected. It also seems to outperform the Doc2vec approach, even though classes are fewer and the recommendation problem is reduced to multi-class classification. However, Doc2vec surpasses the ELMo classification pipeline when labels must be chosen from a broader space.
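The claim that micro-averaged precision and recall coincide for single-label multi-class output, while macro-averaged precision can differ, is easy to verify numerically. This is an illustrative sketch of the averaging schemes, not the paper's evaluation code:

```python
def micro_macro(preds, golds, labels):
    """Compare averaging schemes for single-label multi-class output.

    With exactly one prediction and one gold label per item, pooled
    (micro) precision equals pooled recall: both reduce to accuracy.
    Macro precision instead averages per-class precision over `labels`.
    """
    correct = sum(p == g for p, g in zip(preds, golds))
    micro_p = micro_r = correct / len(preds)

    per_class = []
    for c in labels:
        golds_of_pred_c = [g for p, g in zip(preds, golds) if p == c]
        if golds_of_pred_c:
            per_class.append(sum(g == c for g in golds_of_pred_c) / len(golds_of_pred_c))
        else:
            per_class.append(0.0)   # class never predicted
    macro_p = sum(per_class) / len(labels)
    return micro_p, micro_r, macro_p
```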
On the other hand, the existence of a threshold favors Doc2vec, in the sense that fewer ground-truth annotations pass its mark and are therefore considered when computing retrieval metrics. Further attempts at improvement were not possible for ELMo, confirming that it is a very computationally expensive module compared to word embedding modules that only perform embedding lookups.

Contextualized word representations have revolutionized the way traditional NLP used to operate and perform. Neural networks and deep learning techniques, combined with evolving hardware configurations, can offer efficient solutions to text processing tasks that would otherwise be impossible to perform at scale. We have shown that word embeddings can be a critical component and deserve careful consideration when approaching the problem of automated text indexing. Especially in the biomedical domain, the shifting nature of research trends and the complexity of authoritative controlled vocabularies still pose challenges for fully automated classification of biomedical literature. To this end, we have investigated both deep- and shallow-learning approaches and compared them in terms of performance. Dimensionality reduction, whether implied by setting a threshold or by directly placing a hard cap on the available classification choices, appears necessary for automated recommendations to be feasible and of any practical value. Still, careful dataset balancing, as well as the capability of deep networks to leverage distributed GPU architectures, are demonstrably beneficial and should be exercised whenever possible. As a next step, we intend to further evaluate our ELMo implementation and design a model that performs multi-label classification using the ELMo embeddings. In addition, we plan to release a web-based service offering access to the recommendations provided by the two models.
This would facilitate interoperability, for example, with learning management systems and other repositories. Finally, we see room for improvement in the way classification suggestions are offered, especially in view of the density of the thesaurus used: other ontological relations, such as generalization or specialization of concepts, can be taken into account in order to discover and prune hierarchy trees appearing in the recommendations list.

key: cord-024061-gxv8y146 authors: Alkhamis, Moh A.; Li, Chong; Torremorell, Montserrat title: Animal disease surveillance in the 21st century: applications and robustness of phylodynamic methods in recent U.S. human-like H3 swine influenza outbreaks date: 2020-04-21 journal: Front Vet Sci doi: 10.3389/fvets.2020.00176 doc_id: 24061 cord_uid: gxv8y146

Emerging and endemic animal viral diseases continue to impose substantial impacts on animal and human health. Most current and past molecular surveillance studies of animal diseases investigated the spatio-temporal and evolutionary dynamics of the viruses in a disjointed analytical framework, ignoring many uncertainties, yet drew joint conclusions from both analytical approaches. Phylodynamic methods offer a uniquely integrated platform capable of inferring complex epidemiological and evolutionary processes from the phylogeny of viruses in populations using a single Bayesian statistical framework. In this study, we reviewed and outlined basic concepts and aspects of phylodynamic methods and summarized the essential components of the methodology in one analytical pipeline, to facilitate the proper use of the methods by animal health researchers. We also challenged the robustness of the posterior evolutionary parameters inferred by the commonly used phylodynamic models, using the hemagglutinin (HA) and polymerase basic 2 (PB2) segments of the currently circulating human-like H3 swine influenza (SI) viruses isolated in the United States and multiple priors. Subsequently, we compared the similarities and differences between the posterior parameters inferred from the sequence data using multiple phylodynamic models. Our suggested phylodynamic approach attempts to reduce the impact of the method's inherent limitations in order to offer less biased, biologically plausible inferences about the pathogen's evolutionary characteristics, so as to properly guide intervention activities. We also pinpoint the requirements and challenges of integrating phylodynamic methods into routine animal disease surveillance activities.
In the past few decades, genetic analysis of rapidly evolving pathogens has become an integral part of animal disease surveillance systems worldwide (1-4). Most current and past molecular surveillance studies of animal disease pathogens of both public health and economic importance, such as influenza (5-7), foot-and-mouth disease (FMD) (8-10), and porcine reproductive and respiratory syndrome (PRRS) (11-13) viruses, depend on classical epidemiological and phylogenetic methods. These studies or surveillance systems used classical phylogenetic methods, including parsimony, neighbor-joining, or maximum likelihood (ML) approaches, to genotype novel emerging strains, classify viral lineages, or assess tree topologies to distinguish between novel and emerging strains (6, 7, 13). In addition, classical phylogenetic approaches were used to assess correlations between the similarities of nucleotide sequences and related epidemiological characteristics, while ignoring uncertainties associated with estimates of phylogenetic relationships and with host, temporal, and spatial factors (7, 10, 11, 14). Furthermore, they investigated the spatio-temporal and evolutionary dynamics of the virus isolates in a disjointed analytical framework and made joint conclusions from both analytical approaches (7, 10, 11, 14). Many past and current molecular surveillance studies of animal diseases have therefore ignored the fact that the epidemiological and evolutionary dynamics of rapidly evolving viruses occur on approximately the same time scale (15). Studying them in a unified analytical framework will refine their interpretation, limit biased conclusions, and subsequently improve the related molecular surveillance activities.
Classical phylogenetic approaches cannot account for the uncertainties in the evolutionary processes of rapidly evolving pathogens, nor can they integrate related epidemiological features into the phylogeny, which is an important advantage of Bayesian phylodynamic methods. Phylodynamic methods were borrowed from the field of evolutionary biology and have become a powerful tool for exploring the evolutionary epidemiology of infectious pathogens (14-17). During the last two decades, the rapid growth of pathogen genetic data and computational resources has increased the application of phylodynamic methods in animal and human disease surveillance (17). These methods are capable of accounting for uncertainties, and they uniquely integrate complex epidemiological and evolutionary processes in populations within a single Bayesian statistical framework (18, 19). This framework treats the parameters of the phylodynamic model as random variables, each assigned a specified prior probability distribution (and a corresponding inferred posterior probability distribution). This quantitative integration has improved disease investigation by answering novel epidemiological questions about the evolutionary history, spatio-temporal origins, within- and between-host transmission, and environmental risk factors of rapidly evolving pathogens (17). Indeed, during the last decade, phylodynamic models have become well-established tools for studying the evolution of animal viral diseases, especially influenza (20), FMD (17), and PRRS (21). In addition, several studies have advocated integrating phylodynamic methods into routine molecular surveillance pipelines for animal diseases, with the objectives of reclassifying viral genotypes, distinguishing between emerging and endemic viral strains, and selecting proper vaccine strains (17, 21-23).
These approaches provide a robust platform for guiding the allocation of resources within a surveillance system, for example, by targeting emerging strains with higher evolutionary rates or hosts at high risk of generating new strains, which subsequently reduces the economic costs of sampling, control, and prevention activities. Phylodynamic methods are implemented in many open-source statistical software packages; the most popular user-friendly package is Bayesian Evolutionary Analysis Sampling Trees (BEAST) (24). While past studies illustrated the great potential of phylodynamic tools, the methods are sensitive to the density and coverage of sequence sampling, the selection of genetic regions, the quality and quantity of the associated surveillance data, and the priors selected for the evolutionary parameters (15, 25, 26). These limitations may result in biased posterior inferences, which subsequently lead to inaccurate or biologically implausible conclusions about the evolutionary epidemiology of the pathogen under study (e.g., false divergence times or geographical origins). That said, most phylogenetic studies suffer from these inherent limitations. However, setting up a thorough phylodynamic analytical pipeline, while acknowledging these limitations, can reduce their impact on the resulting posterior inferences and the related conclusions. Unfortunately, many published phylodynamic studies have ignored such limitations, particularly in their analytical approach, using simple naive priors for their evolutionary parameters while ignoring the underlying assumptions of these priors (27-31). For example, prior selection should adhere to the assumption that different pathogens have unique evolutionary characteristics (14); using the same simple prior on different pathogens will therefore likely lead to the conclusion that such pathogens behaved similarly during their evolutionary history.
Moreover, these studies ignored the impact of selecting different prior models on the posterior evolutionary inferences for the pathogen under study (26, 32). For example, the use of different prior models often leads to different conclusions about the geographical origins of the pathogen; hence, Bayesian model selection is a critical step in phylodynamic analysis pipelines (25, 33). Many published studies compare the results of phylodynamic models inferred from different gene segments or evolutionary parameter priors (34-36). However, few studies have raised concerns about the sensitivity of the results to the choice of evolutionary models (20, 26) or suggested a focused phylodynamic analytical pipeline for animal disease molecular surveillance (37). Here, we demonstrate the basic principles of building a phylodynamic analytical pipeline, illustrate the impact of gene segment and prior selection on the posterior evolutionary inferences, and highlight the prospects of the methods for improving animal disease surveillance.

As a working example, we selected a publicly available dataset comprising 352 full-genome sequences of human-like H3 SI viruses collected between 2015 and 2018 as part of the United States Department of Agriculture influenza surveillance system. We provide a detailed description of a classical phylodynamic analytical pipeline encompassing both demographic and discrete phylogeographic reconstruction of the human-like H3 virus using BEAST. Our phylodynamic analyses include comparisons between commonly inferred posterior evolutionary parameters (e.g., substitution rate per site per year, divergence times, phylogeographic root-state posterior probabilities, and significant dispersal routes between states) under different combinations of node-age and branch-rate prior models.
Furthermore, we extended this analytical pipeline to compare posterior parameters inferred from the HA and PB2 gene segments. We discuss in detail the interpretation of the resulting posterior inferences under the different scenarios described above, and we highlight examples of their misuse in past phylodynamic studies. Our results identify the prospects and limitations of the presented phylodynamic pipeline in the context of animal disease surveillance at regional and global scales. They also provide researchers and stakeholders of the U.S. swine industry with valuable insights for decisions related to sampling and sequencing of the influenza virus genome when conducting future phylodynamic studies and improving the design of currently implemented surveillance systems.

A summary flow chart of our phylodynamic analytical pipeline is presented in Figure 1. This Bayesian statistical framework is popular and well-established for studying rapidly evolving pathogens, as described elsewhere (37-39). The pipeline is divided into five steps (Figure 1): two steps are dedicated to sequence preparation and curation of relevant viral lineages, and the following three steps to the phylodynamic analyses of the subsequently selected lineages.

A critical step for a sound phylodynamic analysis is sequence preparation. This step can take two directions, depending on the study design and the objectives of the analysis. The first direction involves primary data analysis of novel sequences, which are either part of a study designed to identify the evolutionary characteristics of newly emerging viral strains (27, 37, 39) or part of an ongoing active surveillance program (40). This direction usually includes the collection and sequencing of novel viral isolates from ongoing outbreaks.
The second direction involves secondary data analysis of sequence collections published in publicly available genomic databases such as GenBank, mainly to explore the evolutionary history of specific pathogens at regional or global scales (38, 41, 42). Secondary sequence analysis can target either all available viral isolates or specific well-defined lineages (i.e., monophyletic clades) (38, 41, 42). To reduce the impact of sampling bias on the results of a phylodynamic analysis, it is essential to ensure that the viral isolates under study are representative of the available sequence data on both temporal and spatial scales. This step is most important for primary sequence analyses, in which the dataset under study needs to cover all close relatives of the novel viral isolates published elsewhere. Retrieving and combining the relatives of novel viral isolates in a single dataset warrants a proper inference of representative phylogenetic relationships in the tree topology, based on all available related sequences, since on many occasions novel sequences may belong to distinct viral lineages published elsewhere (39, 43). The Basic Local Alignment Search Tool (BLAST; https://blast.ncbi.nlm.nih.gov/blast.cgi) is the most popular tool for retrieving relatives of novel sequences (Figure 1). Finally, the retrieval process should include only complete and near-complete sequences, to avoid distorting the phylogenetic relationships between the novel and the related isolates.

The integration of a pathogen's epidemiological characteristics into its inferred phylogeny is the ultimate justification for preferring the phylodynamic approach over classical phylogenetic methods. Therefore, thorough preparation of the sequence metadata, which includes retrieving information related to the isolates under study, is another critical step for a sound subsequent phylodynamic analysis.
Sequence metadata can be retrieved either from public genomic databases such as GenBank or from the related published literature. Because phylodynamic methods largely depend on time-stamped data, this step starts with retrieving the dates of collection of the viral isolates under study; isolates with no temporal data are typically excluded from the analysis pipeline. Next, the date of collection is converted into a BEAST-readable format known as fractional years, used to estimate divergence times. For example, a virus collected on April 14, 2017, is converted into the fractional year "2017.282", where "2017" is the year of collection and "0.282" is the number of days from the beginning of that year until the day of sequence collection, divided by the total number of days in a typical year. Alternatively, dates can be imported into BEAUti in a separate text file that includes the complete date of sequence collection with explicit separators (e.g., "-" or "/"). In many instances, however, the complete date of collection is not available, missing either the exact day or month of collection; in that case, we can specify the age of the isolate as the mid-point of the corresponding month or year, respectively. Other epidemiological characteristics, such as spatial or host information, can be prepared in a separate text-delimited file with unique identifiers that link them to the isolates in the sequence dataset. Isolates missing non-temporal information should be kept in the analyses and are usually labeled with a question mark ("?") to represent the missing value. In the phylodynamic field, epidemiological characteristics such as the country or host of origin are defined as discrete traits, described in more detail in the Running and Selecting Phylodynamic Models section.
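The fractional-year conversion described above can be sketched with the standard library. One assumption worth flagging: this sketch divides by the actual length of the collection year (365 or 366 days), whereas some pipelines always divide by 365; for the worked example (April 14, 2017) both give 2017.282:

```python
from datetime import date

def fractional_year(d):
    """Convert a collection date to the fractional-year format used for
    tip dating: year + (days elapsed since January 1st) / (days in year)."""
    start = date(d.year, 1, 1)
    end = date(d.year + 1, 1, 1)
    elapsed = (d - start).days          # days from Jan 1st to collection day
    return d.year + elapsed / (end - start).days
```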
Careful selection of these characteristics should be considered at the beginning of the pipeline as a critical part of the data preparation for the subsequent analyses. Geographical discrete traits can be defined as the country of origin where the pathogen was isolated, or redefined at smaller or larger spatial scales, such as administrative regions within a country (44) or continents (32), depending on the study's hypothesis. Besides the host of origin, other non-spatial attributes, such as host and environmental characteristics, can also be defined as discrete traits (45).

Multiple sequence alignment (MSA) is another key step in the preparation of the pathogen's genetic data (Figure 1). It is worth noting that alignment uncertainty, for example in the choice of alignment algorithm, can affect subsequent phylogenetic inferences, such as tree topology (46). However, the impact of alignment uncertainty has not been reported for simple pathogens like viruses, particularly when dealing with small gene segments. This issue should therefore be considered when dealing with whole genomes or with more complex pathogens like bacteria and fungi, and it can be addressed by averaging multiple sequence alignments produced by different alignment algorithms (47). Common alignment algorithms include Clustal (48), T-Coffee (49), and MUSCLE (50), while AliView is a user-friendly graphical interface that can handle large sequence datasets and integrate multiple alignment algorithms (51). Performing the multiple sequence alignment with one of these algorithms and manually deleting the gaps within the translated alignment are the most common steps in most phylogenetic studies (51). Confirming the reading frame of each gene segment (excluding the 5' UTR) by examining the amino acid translation is another step within the MSA procedure.
This step is commonly done, for example, for the influenza virus HA and PB2 gene segments, and potentially for segments 7 and 8 to account for the frameshifted M2 and NS2 genes. It is worth noting, however, that this step matters only for partitioned nucleotide models, described below.

Phylodynamic analyses require both time and computational resources; therefore, conducting exploratory phylogenetic analyses using classical methods is an essential step that ensures the proper setup of the priors of the subsequent phylodynamic models. Classical methods for inferring basic (i.e., non-time-stamped) phylogenetic trees include the maximum likelihood (ML) (52), maximum parsimony (MP) (53), and neighbor-joining (54) algorithms. Inferring the basic phylogenetic tree of a sequence dataset helps in the preliminary assessment of the tree's topology, in terms of the magnitude of structure across branches, the degree of topological (in)congruence between different gene segments, and the selection of lineages (in large datasets) for the subsequent phylodynamic analyses. Classical phylogenetic algorithms are implemented in many open-source software packages, such as MEGA (55) and RAxML (56).

The rapid spread and transmission of viral diseases during epidemics provide ample time for the pathogen to accumulate informative mutations in its genome (57), and 100% identical sequences within a dataset dilute that information. Retrieved sequence datasets also suffer from inherent redundancy due to sampling bias and issues related to the sequencing procedure (58). Hence, removing 100% identical sequences from the dataset under study reduces the impact of such redundancies, strengthens the tree structure, and shortens the computational time. Furthermore, if the proportion of 100% identical sequences is substantially large, it typically leads to weaker evolutionary signals and subsequently to poorer phylodynamic model convergence.
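The deduplication step above is a simple exact-match filter. A minimal sketch, assuming sequences arrive as (id, sequence) pairs parsed from a FASTA file (case differences are ignored, since they do not change the nucleotides):

```python
def drop_identical(records):
    """Remove 100% identical sequences, keeping the first occurrence.

    `records` is an iterable of (seq_id, sequence) pairs. Duplicates add
    no phylogenetic information and weaken the evolutionary signal.
    """
    seen = set()
    unique = []
    for seq_id, seq in records:
        key = seq.upper()
        if key in seen:
            continue          # exact duplicate of an earlier sequence
        seen.add(key)
        unique.append((seq_id, seq))
    return unique
```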
Recombination is a natural biological phenomenon in rapidly evolving viruses like influenza: it occurs when viral genomes co-infecting the same host cell exchange fragments of their gene segments, resulting in new viral strains (59). Ignoring recombination events in a sequence dataset may adversely bias the inferred posterior phylogenetic relationships, so recombinant sequences must be excluded (60). Recombination events can be detected using the Recombination Detection Program (61). However, recombination events are detected more often in whole genomes than in single gene segments; conducting phylodynamic analyses on whole-genome sequences only would therefore exclude many isolates, resulting in a substantially smaller dataset and subsequently biased inferences. Nevertheless, the occurrence of recombination events at the beginning of a novel viral outbreak may be limited.

Assessing the magnitude of the temporal structure in the phylogeny of sequence data collected at different points in time is the final recommended step of the preliminary phylogenetic analyses (62). Here, "temporal structure" is defined as the measurable difference, in terms of nucleotide or amino acid substitutions, between two genetic sequences sampled at two distinct points in time (63). If the sequence data lack sufficient temporal structure, proceeding to the phylodynamic analysis may yield biased posterior estimates and misleading conclusions (62). An interactive regression-based approach implemented in the TempEst software package (62) assesses the strength of the association between the sequences' sampling dates and genetic divergence through time: R² values closer to 1 than to 0, estimated from a time-stamped ML tree using a linear regression of root-to-tip genetic distance on sampling date, indicate a strong temporal structure (62).
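The R² statistic behind this check is just the coefficient of determination of the root-to-tip distance vs. sampling-date regression. A minimal sketch (TempEst itself works interactively on the ML tree; this only illustrates the statistic, with dates as fractional years):

```python
def root_to_tip_r2(dates, distances):
    """R^2 of the linear regression of root-to-tip genetic distance on
    sampling date; values near 1 indicate strong temporal structure."""
    n = len(dates)
    mx = sum(dates) / n
    my = sum(distances) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(dates, distances))
    sxx = sum((x - mx) ** 2 for x in dates)
    syy = sum((y - my) ** 2 for y in distances)
    return (sxy * sxy) / (sxx * syy)   # squared correlation coefficient
```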
finally, tempest can identify incongruent sequences, defined as outlier isolates showing substantially more or less genetic divergence from tip to root than one would expect given their sampling date (62). incongruent sequences usually result from low sequencing quality, alignment errors, laboratory-adapted and vaccine strains, as well as natural biological processes such as recombination. once the sequence dataset and its metadata are curated (by the two steps described above), we provide a variety of choices for selecting and running phylodynamic models depending on the objectives of the study. the steps involving prior specification, simulations, and summarizing posterior inferences are all implemented in the beast software package (24). large evolutionary distances (i.e., substitutions per site) between pairs of sequences, caused by multiple substitution events through time, can be underestimated when using simple distance measures (e.g., hamming distance) (64). hence, the distance correction provided by substitution models can compensate for the underestimation of such large evolutionary distances (64). phylogenetic tree algorithms such as the ml approach incorporate substitution models that employ continuous-time markov chain (ctmc) models (52). ctmc models are stochastic processes that take values from a discrete evolutionary state space at random times, which is analogous to a nucleotide or amino acid substitution process, allowing the complete state history over the entire phylogeny to be glimpsed where statistical inferences are drawn (52, 64, 65). out of the many available substitution models, the hasegawa, kishino, and yano (hky) (66) and the general time-reversible (gtr) (52, 67) models are the most commonly used to infer the phylogeny of rapidly evolving pathogens.
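the effect of distance correction can be sketched with the simplest such model, jukes-cantor (not the hky/gtr models discussed here, but the same principle): the correction inflates large observed (hamming) distances to compensate for multiple substitutions at the same site.

```python
import math

def p_distance(a, b):
    """observed proportion of differing sites (normalized hamming
    distance) between two aligned sequences of equal length."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b)) / len(a)

def jc69_distance(p):
    """jukes-cantor corrected distance d = -3/4 ln(1 - 4p/3);
    d ~ p for small p but grows much faster as p approaches
    its saturation point at 3/4."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

print(jc69_distance(0.05))  # barely above 0.05
print(jc69_distance(0.50))  # well above 0.50
```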
briefly, both substitution models assume a constant rate of evolution and have two major parameters: a rate matrix (q) and an equilibrium vector of base frequencies. however, the hky model rate matrix has two exchangeability parameters, one transition rate and one transversion rate (66), while the gtr model has a symmetrical substitution rate matrix in which all the exchangeability parameters are free (67). rate variation across sites can be accommodated by combining substitution models with site models such as the discrete gamma (γ) model (68); sites assumed to evolve at a rate of zero are handled by combining the invariant-sites (i) model with the corresponding substitution model (69). selection pressure in protein-coding genes of rapidly evolving pathogens, in terms of synonymous to nonsynonymous substitutions, usually occurs at high rates (70). this evolutionary phenomenon can affect estimates of divergence time and therefore needs to be accounted for when selecting a substitution model (71). partitioning the gene segment into unique codon positions and assigning different substitution and site model combinations can accommodate the differences in evolutionary dynamics within gene segments of the pathogen under study (70, 72). different substitution, site, and codon partitioning models are implemented in many ml software packages as well as in beast. however, selecting the most realistic substitution/site model and partitioning scheme for the sequence data can be statistically achieved using the bayesian information criterion (bic) (73), the akaike information criterion (aic), or the corrected akaike information criterion (aicc) (74, 75). these ml-based statistical methods are well-implemented in both partitionfinder (76) and jmodeltest (77). yet a more robust bayesian method for selecting a site model and an associated substitution model is implemented as an add-on package in beast 2.x (78).
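to make the hky parameterization concrete, a small sketch builds its rate matrix from a transition/transversion ratio (kappa) and equilibrium base frequencies (illustrative values; real analyses estimate both from the data):

```python
def hky_rate_matrix(pi, kappa):
    """hky q matrix over (a, c, g, t): off-diagonal rate i->j is
    kappa * pi[j] for transitions (a<->g, c<->t) and pi[j] for
    transversions; each diagonal is set so its row sums to zero."""
    bases = "acgt"
    transitions = {("a", "g"), ("g", "a"), ("c", "t"), ("t", "c")}
    q = [[0.0] * 4 for _ in range(4)]
    for i, x in enumerate(bases):
        for j, y in enumerate(bases):
            if i != j:
                q[i][j] = (kappa if (x, y) in transitions else 1.0) * pi[j]
        q[i][i] = -sum(q[i])
    return q

q = hky_rate_matrix(pi=[0.3, 0.2, 0.2, 0.3], kappa=4.0)
```

setting kappa = 1 recovers the felsenstein-style equal-rates case; gtr would replace the single kappa with six free exchangeability parameters.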
time-calibrated trees relate the genetic differences between sequences to elapsed time through molecular clock models, in which each substitution occurs after a stochastic waiting time governed by the substitution rate (79). when the substitution rate across branches is assumed to be uniform over the entire tree, the molecular clock model is defined as strict. however, the rate of evolution of rapidly evolving pathogens usually differs between the subtrees of the inferred phylogeny, and relaxed branch-rate models therefore account for the variation in the rate of molecular evolution from clade to clade across the branches of the tree (79). substitution rates across branches are assumed to be either autocorrelated (80) (i.e., dependent) or uncorrelated (81) (i.e., independent). the uncorrelated branch-rate prior is commonly used for rapidly evolving viruses, with branch rates drawn from either an exponential or a log-normal parent distribution (81). another alternative to the strict clock model is the local molecular clock, which can estimate different rates for different predefined branch groups within a tree (82). however, for large datasets the manual task of assigning branches to different groups is impractical (81), and bayesian random local clocks can therefore nest a series of local clocks, each extending over a group of branches within the full phylogeny (83). phylogenetic trees are inferred from individually sampled sequences to estimate the statistical properties of the population from which the sequences were collected (84). kingman's n-coalescent theory (i.e., node-age model) is the first stochastic model framework aimed at estimating the size of the sequences' population (85). the theory describes the distribution of coalescent times in the phylogeny as a function of the size of the population from which the sequences were drawn (85).
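the strict-versus-relaxed distinction can be sketched by drawing independent branch rates from a shared log-normal parent distribution (ucln-style); sigma = 0 collapses to a strict clock. parameter values here are hypothetical:

```python
import math
import random

def branch_rates(n_branches, log_mean, sigma, seed=42):
    """uncorrelated lognormal sketch: every branch gets an independent
    rate exp(normal(log_mean, sigma)); sigma = 0 gives a strict clock
    in which all branches share the same rate."""
    rng = random.Random(seed)
    return [math.exp(rng.gauss(log_mean, sigma)) for _ in range(n_branches)]

strict = branch_rates(5, log_mean=math.log(3e-3), sigma=0.0)
relaxed = branch_rates(5, log_mean=math.log(3e-3), sigma=0.5)
```

an autocorrelated model would instead condition each branch's rate on its parent branch rather than drawing rates independently.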
hence, over the past few decades the coalescent theory has been at the core of phylodynamic methods and has proven the most useful for inferring essential parameters that shape the evolution and population dynamics of evolving populations, including their effective size (86), rate of growth (87), structure (88), recombination, and reticulate ancestry (89). expanding the temporal frame of sampling times is the ultimate approach for increasing the statistical power and precision of the coalescent model in estimating substitution rates and population demographics of rapidly evolving viruses (90). an essential evolutionary parameter estimated from the coalescent model is the effective population size (n_e) at a specific time (t), interpreted as the size of an idealized population whose sample genealogies have the same statistical features as those of the natural population through time, n_e(t) (84). however, such an interpretation is only suitable for a non-recombinant single population, whereas complex populations with more frequent recombination events require the use of the structured tree models (84) described in the following section. estimating the posterior phylogeny of a well-mixed population with changing population size can be attained using either parametric or non-parametric node-age models (84). parametric node-age models accommodate standard continuous population functions, the simplest and most naïve being constant population (cp) growth, which assumes that the population growth rate is zero (91). the other three parametric models are logistic (lg) growth (the population growth rate decreases over time), exponential (ex) growth (the population growth rate is fixed over time), and expansion (egx) growth (the population growth rate increases over time) (91).
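the parametric node-age models can be sketched as simple functions of time before the present; these are simplified textbook forms of the cp and ex models, not beast's exact parameterizations:

```python
import math

def ne_constant(n0, growth_rate, t):
    """cp: zero growth, so the effective size is flat through time."""
    return n0

def ne_exponential(n0, growth_rate, t):
    """ex: a fixed growth rate toward the present means the population
    shrinks exponentially looking backward in time (t = years ago)."""
    return n0 * math.exp(-growth_rate * t)

# at the present (t = 0) both agree; looking back, ex decays
print(ne_constant(1000.0, 0.5, 5.0), ne_exponential(1000.0, 0.5, 5.0))
```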
in the event of an epidemic caused by a rapidly evolving virus like influenza, and in the absence of new vaccination, one would expect the population growth rate of the virus to realistically fit either an exponential or an expansion growth model (44, 92). unlike parametric node-age models, non-parametric models can be used to visually infer the history of population size through time (i.e., genetic diversity) from the sequence data in terms of inclines and declines (93). these models treat each coalescent interval as a separate segment representing a parameter for population size at a given time, and the number of segments can be specified by the investigator to generate a sky plot (93). the piece-wise constant bayesian skyline (bs) is the simplest non-parametric model, which assumes that the effective population size undergoes episodic stepwise changes through time (93). however, the bs model has been shown to be very sensitive to the total number of change points (i.e., coalescent intervals) specified as a prior, as well as to the number of sequences sampled at each point in time (94). hence, the gaussian markov random field bayesian skyride (gmrf) was proposed as an alternative to bs (95). the gmrf model is less sensitive to the prior number of change points because it implements a temporal smoothing approach to recover accurate population size trajectories (95). an improved version of the gmrf is the skygrid (sg), which takes into account mutation parameters of multi-locus sequences (33). the sg provides a more realistic estimate of demographic history in terms of population size and divergence times, as well as the flexibility to specify cut-points in the time trajectories (33). furthermore, the sg model is the least sensitive to the temporal distribution of sequences (33).
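the piece-wise constant idea behind the bayesian skyline can be sketched as a step function of effective size over investigator-specified intervals (illustrative values):

```python
def skyline_ne(change_points, sizes, t):
    """piece-wise constant skyline: change_points are times before the
    present in ascending order, and sizes holds one effective size per
    interval (so len(sizes) == len(change_points) + 1)."""
    assert len(sizes) == len(change_points) + 1
    for cp, size in zip(change_points, sizes):
        if t < cp:
            return size
    return sizes[-1]

# three intervals: [0, 1), [1, 3), [3, inf) years before the present
print(skyline_ne([1.0, 3.0], [10.0, 50.0, 20.0], 2.0))  # 50.0
```

the skyride and skygrid refinements described above keep this step structure but smooth the jumps between adjacent intervals.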
a notable example of the utility of sky plots in prrs virus (prrsv) molecular surveillance in the united states was demonstrated by alkhamis et al. (21) and alkhamis et al. (37). their sky plots inferred a distinctly high genetic diversity through time for the emerging 1-7-4 rflp-type prrsv (37), while inferring consistent seasonal increases and decreases in the relative genetic diversity through time for endemic strains isolated between 2014 and 2015 (21). mugration models are substitution models used to infer the migration processes of evolving organisms (96). the most notable implementation of a migration model was developed by lemey et al. (97), who used a ctmc to infer the h5n1 avian influenza virus's global origins and movements between countries. they used the countries from which the sequences had been sampled as discrete traits to estimate migration rates between pairs of predefined geographical locations, and the method is therefore named discrete phylogeography (97). the method is also known as discrete trait analysis (dta) because it has the flexibility to use any other discrete trait, such as the host or farm characteristics from which the sequences were isolated, to model migration rates between infected hosts and farms (37, 98). besides, the method can infer ancestral origins (i.e., from the assigned discrete traits) for the internal nodes of the phylogeny through their estimated root state posterior probabilities (rspp) (97). however, the most notable feature of discrete phylogeographic models is the integration of a bayesian stochastic search variable selection (bssvs) procedure to identify significant viral dispersal routes between geographical regions or host species (97). bssvs can also infer the significance of the directionality of the migration process between pairs of discrete traits through integrated symmetric and asymmetric substitution models.
the symmetric (sym) model assumes that the transition rate from state "a" to "b" is the same as the transition rate from state "b" to "a" (i.e., directional spread between traits is insignificant), while the asymmetric (asym) model assumes that the transition rate from state "a" to "b" differs from the transition rate from state "b" to "a" (i.e., directional spread between traits is significant) (97). however, the lack of a sufficient number of sequences close to the root of the phylogeny can impair accurate estimation of ancestral traits (i.e., ancestral geographical location or host) by the dta method (97). dta robustness can therefore be improved by increasing the geographical density and temporal depth of sampling (96). dta is also limited by the type and number of variables that can be used to estimate ancestral states. therefore, the bssvs framework has been extended to accommodate the transition rate matrix between discrete traits as a generalized linear model (glm) (22, 32). the method improves the biological plausibility of the inferred rspps for the ancestral traits by simultaneously estimating the inclusion probabilities of geographic, demographic, and environmental predictors (22). however, the method has been shown to be more sensitive to sampling bias than the standard bssvs approach (32). hence, comparative sensitivity analyses of sampling bias between the approaches are recommended to avoid severely biased inferred rspps. in some settings, geographical boundaries cannot be defined by discrete spatial traits, such as the distribution of wildlife hosts or disease vectors, and viral evolution and spread are therefore better modeled by continuous spatial diffusion models (96). when precise geographical information is available (i.e., longitude and latitude), continuous phylogeographic models can reconstruct the viral spatio-temporal evolutionary history using relaxed random walk models (19).
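the bssvs idea can be sketched with binary indicator variables that switch each pairwise transition rate on or off; routes whose indicators carry high posterior support (large bayes factors) are the significant ones. rates and trait names here are hypothetical:

```python
def effective_rates(relative_rates, indicators):
    """bssvs sketch: the realized rate for each ordered pair of traits
    is its relative rate times a binary inclusion indicator; under a
    sym model the (a, b) and (b, a) rates are tied, under asym they
    are free to differ."""
    return {pair: relative_rates[pair] * indicators[pair]
            for pair in relative_rates}

rates = {("iowa", "minnesota"): 2.1, ("minnesota", "iowa"): 0.4}
included = {("iowa", "minnesota"): 1, ("minnesota", "iowa"): 0}
print(effective_rates(rates, included))
```

in beast the indicators are sampled by the mcmc rather than fixed, and their posterior inclusion frequencies feed the bf calculation for each route.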
these models can additionally estimate the viral dispersal rate in km^2/year and can distinguish whether the spatial diffusion process was homogeneous (e.g., dispersal by air) or heterogeneous (e.g., dispersal by movements) (19, 21). in many instances, sequence samples tend to cluster within a geographical region, leading to incomplete mixing and the formation of structure in the population. this may bias the posterior inferences estimated by the coalescent phylogeographic models mentioned above. hence, the recently developed structured coalescent tree models for inferring phylogeography can simultaneously model the migration process between regions while allowing those regions to have their own unique coalescent rates (96, 99). unlike beast 1.x, beast 2.x has recently implemented several structured coalescent models for inferring geographic and between-host transmission histories, including bayesian structured coalescent approximation (basta) (26), structured coalescent transmission tree inference (scotti) (100), and marginal approximation of the structured coalescent (mascot) (101). the complexity of infectious disease transmission dynamics has pushed the capacity of phylodynamic models beyond demographic and phylogeographic reconstructions into investigating traditional and new epidemiological problems. one notable example was demonstrated by volz et al., who developed a structured coalescent susceptible-infected-recovered (sir) model to infer reproductive numbers from viral sequence data (102). similar, but more complex, implementations of mathematical epidemiology in phylodynamic models are described elsewhere (103, 104). the prior phylodynamic models described above can be readily selected and set up using a graphical user interface (gui) implemented within the beast software package, namely the bayesian evolutionary analysis utility (beauti) (24, 105).
after the models are selected and set, the software generates a structured text file in standard xml format, allowing flexible modifications for more sophisticated evolutionary models. however, the generated xml files are very complex in their structure, and manual modifications should therefore be made by relevant experts to avoid introducing significant errors into the model (105). additional tutorials on selecting and setting evolutionary models using beast 1.x are available elsewhere (106-108). phylodynamic model selection is a critical component of the analysis pipeline described in figure 1, simply because different pathogens or gene segments have different evolutionary processes. therefore, using a single phylodynamic model with similar priors to infer the evolution of multiple pathogens may be biologically implausible, leading to biased inferences. exploring the fit of the sequence data to different phylodynamic model combinations, in terms of substitution, branch-rate, and node-age priors, to infer divergence times, times to the most recent common ancestor (tmrcas), and evolutionary rates is the best strategy for ensuring accurate estimation of posterior inferences. for inferring viral demographic history, our suggested pipeline (figure 1) leads to the generation of eight phylodynamic model combinations for a single gene segment, comprising the selected substitution model (by partitionfinder), two branch-rate priors (uced and ucln), and four node-age priors (cp, ex, egx, and sg). however, when inferring phylogeographic history using dta, we suggest exploring both the sym and asym bssvs models (figure 1), which leads to the generation of 16 models. our rigorous analytical pipeline is indeed time- and computationally demanding, but it leads to the selection of the most realistic model that fits the sequence data with confidence.
however, this suggested pipeline is not a strict set of procedures that will guarantee appropriate inferences, and researchers may explore other models or analytical pipelines relevant to their evolutionary hypotheses. it is worth noting that computational efficiency has been substantially improved in beast version 1.10 and the accompanying software library, the broad-platform evolutionary analysis general likelihood evaluator (beagle; permits flexible parallel computing), when compared to earlier versions. the fit of the sequence data to the most realistic phylodynamic model can be assessed by simultaneously estimating the marginal likelihood (mll) using the path sampling (ps) (25) and stepping-stone sampling (ss) (109) methods implemented in beauti with the standard settings (i.e., simulating across 100 samples for 1 million cycles from the posterior to the prior with a prior reflection point of beta [0.3, 1.0]). the joint posterior probability density of the models' parameters is estimated by the mcmc algorithms. setting the appropriate length of the mcmc chains (i.e., number of cycles) to ensure model convergence depends on the number of sequences in the dataset. one recommended approach is to increase the chain length quadratically relative to the number of sequences (e.g., 4 million states per sequence) (110). finally, creating duplicate runs of each generated model can aid in assessing the performance stability of the mcmc simulations and their mll estimates. the mcmc log files generated by beast can be thoroughly evaluated using a friendly gui software package known as tracer (111). the software provides a simultaneous platform for summarizing and visualizing posterior estimates. appropriate model convergence can be evaluated by examining the mcmc mixing (based on acceptance ratios) using trace plots, after discarding the first 10% of the samples (the "burn-in").
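the chain-length rule of thumb and the burn-in fraction quoted above can be sketched as follows (the 4-million-states-per-sequence figure is the text's example, not a universal constant):

```python
def chain_settings(n_sequences, states_per_sequence=4_000_000,
                   burnin_fraction=0.10):
    """scale the mcmc chain with dataset size and discard the first
    10% of samples as burn-in before summarizing the posterior."""
    chain_length = n_sequences * states_per_sequence
    burnin = int(chain_length * burnin_fraction)
    return chain_length, burnin

# e.g., the 142-sequence dataset analyzed later in this paper
print(chain_settings(142))
```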
in addition, the effective sample size (ess) estimates for each parameter should be assessed; ess values >200 indicate good model convergence (111). on some occasions, good model convergence does not ensure consistent parameter estimation due to the use of the non-informative priors implemented in beauti. therefore, it is critical to compare posterior parameter estimates (e.g., evolutionary rates, population growth rates, and ps and ss mll estimates) between independent runs of each model to verify that each parameter is closely identical to that of its duplicate run. in case of improper model convergence and inconsistent parameter estimation, it is recommended either to increase the length of the mcmc chain or to use informative priors from previous mcmc runs for the same gene segment or pathogen. model selection is achieved by comparing the bayes factors (bf) of the resulting mll estimates (from the ps and ss methods) of the corresponding candidate models (25). briefly, the bf values of the candidate models are summarized in a matrix and computed as ln bf(m_i, m_j) = ln p(y|m_i) - ln p(y|m_j), where y is the sequence data, m_i is the candidate model "i", m_j is the competing candidate model "j", and ln p(y|m) is the mll estimate obtained by either the ss or the ps simulator. bf values estimated by the ss method are summarized on the upper off-diagonal of the matrix, while bf values estimated by the ps method are summarized on the lower off-diagonal. a model whose horizontal (i.e., row-wise) bf values are greater than those of the other candidate models is selected. additional applied examples of model selection using beast 1.x are available elsewhere (106-108). the ultimate goal of the model selection procedure is to find the best-fitting model that generated the data, combining simplicity with biological realism, to appropriately represent the evolutionary characteristics of the pathogen under study (25, 112).
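the bayes factor comparison reduces to differences of log marginal likelihoods, ln bf(i, j) = ln p(y|m_i) - ln p(y|m_j); a minimal sketch with hypothetical mll estimates for three candidate models:

```python
def log_bayes_factors(mll):
    """log-bf matrix from marginal likelihood estimates:
    ln bf(i, j) = ln p(y|m_i) - ln p(y|m_j)."""
    return {i: {j: mll[i] - mll[j] for j in mll if j != i} for i in mll}

def best_model(mll):
    """the model with the highest mll beats every rival, i.e., all of
    its row-wise log-bf values are positive."""
    return max(mll, key=mll.get)

# hypothetical ss-estimated mlls for three candidate models
mll = {"m1": -1050.2, "m2": -1061.8, "m3": -1070.3}
print(best_model(mll))  # m1
```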
the inferred relative genetic diversity through time (or other reconstructed demographic trajectories) and its highest posterior density (hpd) interval can be summarized using sky plots (e.g., skygrid plots) generated by tracer. similarly, estimates of divergence times, tmrcas, and substitution rates/site/year with their hpd intervals can be summarized in tracer using either box or violin plots (111). tracer also provides a flexible platform for the simultaneous comparison of evolutionary estimates inferred by multiple phylodynamic models. next, the resulting marginal posterior probability density of the selected model is summarized as a maximum clade credibility (mcc) tree using treeannotator (24) to generate a tree file. the mcc tree (from the tree file) can then be visualized and annotated with either posterior support values or the rspps of the discrete traits at the internal nodes using figtree (113). in addition, figtree provides many customizable tree visualization options and allows users to upload additional information in a text file to flexibly annotate descriptions on the nodes and branches of the trees. spread3 is an interactive java-based parsing and rendering tool that can summarize and visualize phylodynamic reconstructions to infer spatio-temporal and trait evolutionary history (114). spread3 also integrates javascript d3 libraries to provide a web-based visualization platform for phylogeographic trees and their related inferences by combining information from the mcc tree and geojson-based geographic map files (114). spread3 can generate a time-lapse that superimposes the mcc tree, annotated with either discrete or continuous spatial traits, on a map, which can be visualized using either gis-kml virtual globe software (e.g., google earth) or modern web browsers (e.g., safari or chrome).
this time-lapse demonstrates the epidemic reconstruction of the pathogen's evolutionary history through space and time, which can quantify the diffusion processes within and between geographical regions. furthermore, spread3 can identify and plot well-supported rates between pairs of discrete traits using bfs estimated from the symmetric or the asymmetric bssvs models. statistically significant rates with large bf values can be used to demonstrate critical viral dispersal routes between geographical regions or transmission cycles between host species. the spillover of the h3 si virus from humans to swine in the early 2010s in the united states resulted in a novel emerging virulent strain, which was antigenically distinct from endemic swine strains and was therefore named the "human-like" h3 virus (115). swine-related anthropological activities such as pig movement and vaccination are the most likely factors in the continuous emergence of novel si strains (6). therefore, integrating phylodynamic methods with influenza surveillance systems may reduce the continuous evolutionary implications of si viruses for both public and animal health in the united states and worldwide. here, we chose dta models for our comparative phylodynamic analysis example because of their popularity, ease of use and interpretation, and computational efficiency compared to more complex similar models. hence, we retrieved ha and pb2 nucleotide sequences of human-like h3 si viruses from the influenza research database (116) to explore their evolutionary history using our suggested phylodynamic pipeline, described above (figure 1). the data comprised 352 sequences with complete date and geographical information for each gene segment, collected from 17 u.s. states (arkansas, illinois, indiana, iowa, kentucky, maryland, michigan, minnesota, missouri, north carolina, ohio, oklahoma, oregon, pennsylvania, south dakota, west virginia, wisconsin) between january 8, 2015, and june 1, 2018.
the sequence data were collected from swine production systems and exhibition swine at agricultural state fairs as part of the united states department of agriculture (usda) swine influenza surveillance program (40) and were partially analyzed by walia et al. using classical phylogenetic methods (6). we aligned the sequences for both gene segments and assessed the topological (in)congruence of their phylogenies by performing an ml analysis on the individual segments using the gtr + γ substitution model, which entailed 10 thorough bootstrap searches with 100 ml replicates in each run (supplementary figure 1). for the subsequent phylodynamic analyses, we removed recombinant and 100% identical sequences, which reduced the dataset to 142 sequences for each gene segment (supplementary table 1). we then evaluated the fit of the sequences to the most realistic substitution model and partitioning scheme using the bic approach. finally, we evaluated the temporal signal in the sequence data and found that both segments were suitable for the subsequent molecular clock analyses (r^2 = 0.65 and 0.40 for ha and pb2, respectively) (supplementary figure 2). we assessed the sensitivity of the inferred posterior evolutionary parameters of the human-like h3 si sequence data to the choice of gene segment (i.e., ha vs. pb2) and phylodynamic priors, including substitution, discrete spatial trait, branch-rate, and node-age models (figure 1). for each gene segment, we generated 16 phylodynamic models (a total of 32 runs for both segments) using the default non-informative prior combinations implemented in beauti (figure 1). these prior models included: (1) the gtr + γ vs. the hky + γ site models; (2) the symmetric vs. asymmetric discrete spatial models; (3) the ucln vs. uced clock models; and (4) the cp vs. ex vs. egx vs. sg coalescent tree models (figure 1). we excluded spatial traits (i.e., u.s.
states) with only one sequence (supplementary table 2), leading to the inclusion of 10 states in the subsequent dta. we also evaluated the fit of the 16 phylodynamic models to the ha and pb2 sequences using bf comparisons of their mlls, estimated by the ps and ss simulators, in order to select the most realistic model and correctly interpret its posterior inferences. we then used two replicate mcmc simulations of 150 million cycles each, sampling every 1,500th state, for each candidate model. after assessing for proper model convergence, we compared the inferred evolutionary demographics of each candidate model by summarizing their inferred divergence times, substitution rates, and tmrcas. we then generated sg plots to compare the relative genetic diversity for the ha and pb2 gene segments inferred from the two different site and discrete spatial models. similarly, we compared the phylogeographic inferences of each model by generating mcc trees, summarizing the rspps of the states, and plotting them at the internal nodes of their corresponding trees. finally, we selected and plotted the statistically significant dispersal routes between states under each candidate model using a cutoff of bssvs-bf ≥ 10. the bic values described above indicated that the hky + γ is the best-fitting substitution model for the ha gene segment (bic = 13,399), while the gtr + γ is the best-fitting substitution model for the pb2 gene segment (bic = 20,029). in addition, the bf values (≥5) indicate that the best-fitting branch-rate and node-age models for the sequence data were the sg + ucln for the ha segment and the sg + uced for the pb2 segment (supplementary tables 3-6). however, there were no significant changes in the posterior demographic inferences when choosing the opposite substitution model for either gene segment.
similarly, our results indicate that the choice of discrete spatial and node-age models does not substantially change the estimated divergence times and substitution rates/site/year (figure 2) for each gene segment alone. additionally, these estimates were not sensitive to the choice of branch-rate model (i.e., uced vs. ucln). however, when comparing divergence times between segments, our results indicate substantial differences, with a magnitude of ∼8 years: the divergence time for the ha segment was around 2013 (figure 2a), while for the pb2 segment it was around 2005 (figure 2c). no differences were observed in the substitution rates/site/year between the two gene segments, which were 3.3 × 10^−3 (95% hpd: 2.8 × 10^−3 to 3.9 × 10^−3) and 2.9 × 10^−3 (95% hpd: 2.2 × 10^−3 to 3.8 × 10^−3) for the ha and pb2 segments, respectively (figures 2b,d). similarly, posterior estimates of tmrcas were not sensitive to the choice of phylodynamic priors but differed between the two gene segments (figure 3). hence, based on the ha segment, our results hint that the oldest human-like h3 strains emerged from the state of minnesota in mid-2013 (figures 3a,b), but with a notable overlap in the 95% hpds of the tmrcas inferred for the other states (excluding maryland). however, the results distinctly suggest that the youngest strains emerged from the state of maryland in early 2017. results for the pb2 segment were inconclusive in terms of determining the oldest strains, but identical to those for the ha gene in identifying maryland as the state of the youngest viral strains (figures 3c,d). also, the choice of spatial trait model did not affect our estimates of genetic diversity for either the ha or the pb2 segment (figure 4).
our sg plots inferred seasonal variations, in terms of increases and decreases, in the genetic diversity through time for the ha segment (figures 4a,b), while the genetic diversity of the pb2 segment declined slightly after 2015 (figures 4c,d). our inferred phylogeographic posteriors did not show sensitivity to the selection of substitution or molecular clock priors. however, substantial differences were inferred when selecting different node-age and discrete spatial trait priors. inferences from both the cp and the ex node-age priors with the asymmetric model implicated missouri as the most likely ancestral state for the human-like h3 virus currently circulating in the united states when using the ha gene segment (figures 5a,b). however, the egx and the sg priors with the asymmetric model implicated illinois and minnesota as the most likely ancestral states, respectively (figures 5c,d). yet, when using the ha segment, the symmetric model with the cp, ex, and egx priors consistently implicated minnesota, with approximately similar estimates of rspps (figures 5e-g). in contrast, the use of the symmetric model with the sg prior implicated iowa as the ancestral location for the currently circulating human-like h3 strains (rspp = 0.36) (figure 5h). interestingly, the ha sequence data uniquely favored this prior combination in the bf comparisons for the best-fitting phylodynamic model (supplementary tables 3, 4). our bf values suggested that the pb2 sequence data favored the asymmetric model with the sg prior, but with a very slight edge over the symmetric model with the same coalescent prior (supplementary tables 5, 6). the rspps inferred from the pb2 segment were almost equal for all states and hence inconclusive when using the asymmetric model with the four coalescent priors (figures 6a-d).
similarly, using the symmetric model with the four coalescent priors was inconclusive in terms of identifying the ancestral location of the currently circulating viral strains (figures 6e-h). more specifically, the magnitude of the differences between minnesota and missouri in the inferred rspps, across the different coalescent priors, was very small (figures 6e-h). for example, when using the sg prior, the inferred rspps were 0.18 and 0.22 for missouri and minnesota, respectively (figure 6h). our bf-bssvs analyses, using the asymmetric model with the cp and the ex coalescent priors for the ha gene segment, suggest that the top three most significant unidirectional routes of viral dispersal (bf > 18) were between minnesota, iowa, illinois, and missouri (figures 7a,b). the inferred routes maintained their unidirectionality from the origin to the destination geographical locations using the cp and ex priors (figures 7a,b). similarly, the order of statistical significance suggests that the route from iowa to minnesota is the most important for viral dispersal between states (figures 7a,b). in contrast, the exg prior with the asymmetric model suggests that the route from ohio to indiana is by far the most significant dispersal route (bssvs-bf = 1,157) (figure 7c). nevertheless, the sg prior agrees with the results of the cp and ex priors in inferring the route from iowa to minnesota as the most significant (bssvs-bf = 37) (figure 7d), while inferences from the symmetric model and the four coalescent priors consistently agreed that the top most significant bidirectional route of viral dispersal (bf ≥ 990) was between indiana and ohio (figures 7e-h). however, disagreements were inferred for the second and third most significant routes when using the cp and ex priors on one side and the exg and sg priors on the other (figure 7h).
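the bssvs bayes factors used to rank these routes measure how much the data shift the odds that a route's binary indicator variable is switched on, relative to its prior inclusion odds. a minimal sketch of that calculation, with hypothetical posterior and prior inclusion probabilities (not values from our analyses), is:

```python
def bssvs_bayes_factor(posterior_p, prior_p):
    """BF that a dispersal route is supported: posterior odds over prior odds
    of the route's indicator variable being switched on in the BSSVS model."""
    post_odds = posterior_p / (1.0 - posterior_p)
    prior_odds = prior_p / (1.0 - prior_p)
    return post_odds / prior_odds

# hypothetical values: the indicator is on in 90% of mcmc samples, under a
# sparse prior that switches each candidate route on with probability ~0.12
bf = bssvs_bayes_factor(0.90, 0.12)
```

because the prior inclusion probability depends on the number of candidate routes (and hence on the number of discrete states), bf values are comparable within one analysis but should be interpreted cautiously across analyses with different state sets.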
dispersal routes inferred for pb2 (including their order of significance) were also sensitive to the selected discrete spatial model and, slightly, to the coalescent priors (figure 8). thus, when using the asymmetric model, the top two unidirectional routes included (1) iowa → minnesota and (2) indiana → kentucky (figures 8a-d). while the cp, ex, and exg priors inferred the route from illinois to missouri as the third most significant (figures 8a-c), the sg prior inferred the route from ohio to indiana as the third most significant (figure 8d). finally, the top three significant dispersal routes inferred from the symmetric model were between (1) indiana and ohio, (2) minnesota and iowa, and (3) indiana and kentucky (figures 8e-h). over the past decade, our phylodynamic pipeline has become well-established and has demonstrated powerful potential for tracing the evolutionary history of both animal and human pathogens, making it an ideal tool for designing new molecular surveillance systems. in this study, we revisited essential concepts and definitions within the field of phylodynamic methods. also, we challenged the robustness of the posterior evolutionary parameters inferred by the commonly used phylodynamic models, using two gene segments of the currently circulating human-like h3 si viruses isolated in the united states and multiple priors. subsequently, we compared similarities and differences between the posterior parameters inferred from ha and pb2 sequence data using multiple phylodynamic models. hence, we explored the robust and sensitive aspects of si phylodynamic models and highlighted the importance of model selection within their analytical framework. however, unlike the classical phylogenetic methods currently implemented within the si surveillance system in the united states, we were able to reveal higher-resolution insights into the evolutionary epidemiology of human-like h3 viruses by quantifying their demographic and phylogeographic history.
therefore, animal health researchers and stakeholders need to be aware of the method's features, strengths, and limitations in order to generate reliable inferences that properly guide future disease intervention activities. epidemiology of swine influenza in the u.s. based on the results of the best-fitting phylodynamic models for both the ha and pb2 segments, evolutionary rates of currently circulating human-like h3 viruses in the united states remain high, with no apparent signs of substantial decline (figures 2b,d), and were similar to what was inferred elsewhere (117). furthermore, the inferred relative genetic diversity through time did not decline for the ha segment and showed evidence of seasonal variation between 2014 and 2018 (figures 4a,b), while a slight decline in genetic diversity was inferred for the pb2 segment between 2015 and 2018 (figures 4c,d). these findings suggest that currently circulating human-like h3 viruses will continue their evolutionary activity, leading to the generation of novel strains, which is attributed to the frequent and continuous exchange of viruses between commercial and exhibition swine operations in the united states, with the latter as the epicenter of that exchange (117). our estimates of the tmrcas for the ha segment partially support the notion that the oldest h3 viruses diverged from earlier outbreaks in the state of minnesota, which is a central region for the swine industry in the united states (figure 3). however, the notable overlap in the inferred 95% hpds of the tmrcas between most states (figure 3b) suggests that the currently circulating strains are shifting their evolutionary dynamics, in terms of re-emergence and dispersal, when compared to earlier strains. additionally, both gene segments support the assumption that h3 outbreaks were recently introduced into the state of maryland (figure 3).
the state of minnesota was inferred to be the ancestral location of human-like h3 viruses isolated from outbreaks observed between 2009 and 2012 (118), which agrees with our tmrcas inferred from the ha segment (figure 3). however, results of the sg + ucln symmetric model, selected as the best-fitting model for the ha sequence data (supplementary table 4), implicate the state of iowa as the ancestral region (after 2013) for currently circulating human-like h3 viruses, followed by the state of minnesota as a secondary ancestral location (figure 5h). this is not surprising, since iowa and minnesota share the most prominent swine production system in the united states, with the highest swine density and unrestricted, intense movement of animals between states. although iowa and minnesota are the original hotspots of h3 viruses, our bssvs-bf results showed a markedly significant viral dispersal route between indiana and ohio (bf = 990) (figure 7h). this suggests that the h3 viral gene flow between ohio and indiana, inferred for the 2009-2012 viruses, has remained a vital migration route since then, particularly within exhibition swine populations (117). even though illinois and indiana form one swine production system, no significant viral dispersal route was inferred between the two states. likewise, despite the continuous nature of animal movement within the production system of minnesota and iowa, no significant dispersal route was inferred between the two states using the ha segment (figure 7h). nevertheless, using the pb2 segment, a highly significant dispersal route was inferred from iowa to minnesota, suggesting that iowa might be the new epicenter for virus dispersal of the currently circulating h3 lineages (figure 8d). this result is further supported by the significant migration routes between iowa on one side and illinois and ohio on the other when using the ha segment (figure 7h).
also, the inferred dispersal route between iowa and illinois (figure 7h) may reflect interstate movements of exhibition pigs (119). hence, the movements of exhibition pigs across the united states possibly led to expanding the spatial spread of h3 viruses to states with limited swine production systems (117). unlike for the ha segment, rspps inferred from the most realistic phylodynamic model for the pb2 sequences (i.e., asymmetric + sg + uced) (supplementary table 5) did not yield conclusive results about the ancestral geographical origin of human-like h3 in the united states (figure 6d). instead, this result demonstrates a homogeneous spatio-temporal diffusion process of the pb2 gene between states (figure 6d), suggesting that the virus has maintained an endemic status across the united states after 2010. also, results of the sg plot for pb2, described above, showed an overall stationarity in its genetic diversity through time (despite the slight early incline and later decline) (figure 4c) when compared to the ha gene (figures 4a,b), supporting the notion of endemic status. however, using the pb2 segment, we inferred a notably significant dispersal route originating from iowa to minnesota (bssvs-bf = 193) (figure 8d), reflecting a well-established swine transportation route within a production system, as described above. however, this route was not inferred as significant when using the best-fitting model for the ha segment (figure 7h). these results may be attributed to the fact that pb2 evolutionary dynamics are moderately slower than those of the ha segment (figure 2) in terms of the strength of the temporal signal (supplementary figure 2), substitution rate (figure 2), and age of the segment (figure 3). therefore, the pb2 segment maintained evolutionary dynamics similar to those of earlier strains that emerged in minnesota and dispersed into iowa (120).
yet, both the ha and pb2 segments agree on the importance of iowa as a geographical region for the dispersal of currently circulating h3 lineages (figures 7h, 8d). additionally, we inferred two significant viral dispersal routes originating from kentucky to indiana and from ohio to indiana (figure 8d), which further supports the role of exhibition swine movements between states in maintaining the spread of h3 viruses. both dispersal routes are mainly maintained by the annual agricultural fairs, where susceptible exhibition swine and humans from these states are frequently exposed to direct and indirect contact with the same infected hosts (121). it is worth noting that the route from kentucky to indiana was hypothesized to be important for h3 gene flow between states, but past evolutionary analyses did not observe it due to the lack of sufficient samples (117). the uneven sampling of sequences, in terms of temporal depth and frequency of associated discrete traits, is an inherent limitation of most phylodynamic studies. for example, the inclusion of many recent sequences from a single geographical location may lead to a biased bottleneck effect in the shape of the inferred population size through time when using a coalescent model from the skyline family (122). this issue can be resolved by designing studies with uniform probability sampling with respect to space and time (122). further, the dta framework is user-friendly and computationally more efficient when compared to more complex coalescent models, but it relies on a few assumptions, such as that the sequence sample size is proportional to the size of the selected discrete state (26). thus, including sequences from severely undersampled discrete traits will tend to produce unreliable posterior inferences, where, for example, inferred rspps will be skewed toward oversampled areas.
nevertheless, undersampling is a common problem, especially in passive surveillance data, and therefore the use of structured coalescent models (e.g., basta) might be more appropriate (26). despite this inherent sensitivity of phylodynamic methods to uneven sampling, our posterior inferences from the best-fitting models showed remarkable robustness toward such limitations. although the largest number of collected sequences was from 2017 (80) (supplementary table 2), estimates of relative genetic diversity through time did not show any striking jumps in that year for either the ha or pb2 segment (figures 4b,d). additionally, for the ha gene, iowa (with 26 sequences) rather than ohio (39 sequences) was inferred as the ancestral location (figure 5h, supplementary table 2). however, seven out of the 17 u.s. states were excluded from the dta due to the lack of sufficient sequences, and therefore their role in shaping the spatio-temporal evolution of si remained unquantified. yet, these states had substantially fewer swine-related activities, as well as si outbreaks, than the analyzed states. further, we showed how the posterior estimates of demographic reconstruction were almost insensitive to the choice of different phylodynamic priors for each gene segment (figures 2-4). however, evolutionary estimates inferred from different gene regions may differ (41) or coincide (118) due to the natural variation in their mutation rates over time. this raises the question of whether using longer gene segments or whole genomes provides deeper resolution into the evolutionary history of rapidly evolving pathogens. past influenza a studies (41, 123, 124), including the present study, showed that the ha and na segments typically exhibit higher evolutionary rates than more conserved segments like pb1 and pb2. subsequently, segments with higher evolutionary rates will also display stronger evolutionary signals, as described above.
in our analyses, the widths of the 95% hpds (i.e., the lengths of the time scale) for the median age and tmrcas of pb2 were remarkably greater than those of the ha segment (figures 2, 3). this sizeable width of the posterior intervals reflects the magnitude of uncertainty surrounding inferences from the pb2 segment, and suggests that inferences from the ha segment were more precise (or robust) than those from the pb2 segment. also, we demonstrated how the pb2 segment failed to identify the ancestral geographical location of currently circulating h3 viruses (figure 6d), while, using the symmetric model, we inferred four candidate ancestral locations with inconclusive rspps (figure 6h). further, nelson et al. (117) were not able to infer a significant migration route between indiana and ohio using the pb2 segment. yet, we were able to infer this particular route as significant using both the ha and the pb2 segments (figures 7h, 8d). additionally, scotch et al. (118) confirmed agreement in the phylogeographic inferences between the ha and na gene segments. this highlights another decisive question about the suitability and efficiency of using single segments, multiple segments, or the whole genome when using phylodynamic methods for molecular surveillance of viral diseases. most researchers advocate for whole-genome analysis, by either analyzing each segment alone or as concatenated segments. however, in the presence of a large number of sequences, these strategies are time-consuming and require massive computational resources, making them inefficient for targeted and near-real-time surveillance systems. it is worth noting that the substitution rate and divergence time inferred by alkhamis et al. (43) using the fmd sat1 vp1 segment were similar to the evolutionary estimates inferred by lasecka-dykes et al. (125) using whole-genome sequences, confirming the robustness of phylodynamic methods.
nevertheless, the presence of recombination events can severely impact the robustness of phylodynamic methods, leading to the inference of biased evolutionary histories (126). hence, targeting the most rapidly evolving gene segment at the beginning of an epidemic may suffice for molecular surveillance activities. that said, the choice between gene segments or the whole genome should depend on the evolutionary properties of the pathogen, the frequency of recombination events, the availability of resources, and the objectives of the molecular surveillance system. as described above, phylodynamic inferences tend to be biased toward the available subsets of sequence data. hence, when analyzing novel sequence datasets, it is critical to combine them with genetically related lineages published in the scientific literature or publicly available databases to reduce the impact of sampling bias as well as improve the reliability and accuracy of posterior evolutionary inferences. unfortunately, several examples published in the scientific literature used phylodynamic methods on novel sequence datasets while ignoring their published relatives (127) (128) (129). this led to inferring mcc trees with unaccounted-for phylogenetic relationships such as nodes, branches, and roots. our worked example opens avenues for future work involving the use of the more complex phylodynamic models, described above, to shed deeper insights into the evolutionary epidemiology of si. for example, when the exact geographical locations of the sequences are available, the use of continuous phylogeographic models will enable us to include all states in the analyses, including states with few sequences. besides, we can estimate the spatiotemporal dispersal speed of the virus as well as identify dispersal patterns (i.e., homogeneous vs. heterogeneous) across different geographical regions.
also, the use of glm geographical models can directly quantify the importance of different environmental (e.g., climate) and demographic (e.g., pig density) factors in shaping the evolutionary history of si in the united states. finally, exploring the potential of structured coalescent models to improve the reliability of inferences derived from basic dtas should be considered as well. the current surveillance programs rely heavily on collecting and analyzing the spatial, temporal, and genomic aspects of an outbreak using classical statistical methods in a disjointed analytical framework. this disjointed framework suffers from many biases and is not capable of answering more profound epidemiological questions about current outbreak dynamics. using our suggested phylodynamic analytical pipeline, we were able to answer critical epidemiological questions about the emergence and evolution of currently circulating human-like h3 si viruses in the united states, with the primary goal of guiding risk-based surveillance resources. for example, using inferences from the ha segment, we were able to identify the dates of epidemic introduction into each state. also, we were able to identify the geographic origins of the current outbreaks and observe their genomic-spatio-temporal diffusion process through time between states. further, we identified high-risk viral dispersal routes between states, rank-ordered their significance, and defined their directions. all of these are integral components of an effective risk-based molecular surveillance program, and the ability to achieve them in real time is the future of molecular surveillance of animal diseases. nevertheless, the availability of computational resources for designing an ongoing phylodynamic-based molecular surveillance system will always remain a challenge, especially for developing countries.
that said, a few recently developed open-source software packages can perform basic phylodynamic analyses (e.g., estimate molecular clocks and infer evolutionary models) using an ml statistical framework, including timetree (130), the treedater r package (131), and least-squares dating (120). while the algorithms implemented in these packages trade off the advantages of the bayesian framework, in the presence of large sequence datasets they can produce evolutionary estimates similar to those estimated by beast using substantially fewer computational resources (120, 130, 131). nextstrain (https://nextstrain.org), which implements treetime, is a futuristic working example of a web-based real-time molecular surveillance system for important human pathogens such as influenza, ebola, dengue, and the newly emerging corona viruses. this surveillance system has an ongoing phylodynamic analytical engine that traces, in real time, genetic diversity, divergence times, geographical origins, and dispersal on global scales. the system updates the results of the mcc tree once new sequences are deposited in other web-based, publicly available genomic databases. however, this project is achieved through rigorous and consistent global collaboration and data sharing. in the united states, resources for developing a similar system for tracing animal diseases are readily available. nevertheless, the chain of collaboration between researchers, government, and producers in the animal sector is hard to maintain due to logistic, economic, and educational (i.e., lack of awareness and skill in phylodynamic methods) reasons. nevertheless, recent scientific literature on the use of phylodynamic methods for animal disease surveillance is notably growing, which reflects the increased awareness among veterinarians of the capacities of such methods and the goodwill of industry leaders to voluntarily share their data (37, 132).
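the ml dating tools mentioned above exploit the roughly linear relationship between a tip's sampling date and its root-to-tip divergence in measurably evolving populations: the regression slope estimates the clock rate and the x-intercept estimates the root age. a minimal least-squares sketch on hypothetical data (not our actual alignments) is:

```python
import numpy as np

# hypothetical sampling years and root-to-tip divergences (subs/site)
years = np.array([2013.5, 2014.2, 2015.1, 2016.0, 2016.8, 2017.4])
divergence = np.array([0.0021, 0.0044, 0.0075, 0.0102, 0.0128, 0.0149])

# least-squares fit: slope = clock rate, x-intercept = root age (tmrca)
rate, intercept = np.polyfit(years, divergence, 1)
tmrca = -intercept / rate
```

this is essentially the tempest-style regression used to check temporal signal before a full bayesian analysis; tools such as least-squares dating generalize it to whole trees with measurement-error weighting.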
therefore, we anticipate a new era of animal disease prevention and control in the united states. in contrast, veterinary infrastructure in developing countries is severely lacking, in terms of reporting and data sharing, when compared to their human health sectors. consequently, the question of the future of implementing phylodynamic methods in global animal surveillance remains unanswered. our selected phylodynamic analytical pipeline offers an integrated approach that not only answers more profound epidemiological questions about emerging and endemic animal diseases but also attempts to reduce the impact of its inherent limitations, offering less biased and biologically plausible inferences about a pathogen's evolutionary characteristics to properly guide intervention activities. this study has highlighted the value of phylodynamic methods in improving current and future molecular surveillance efforts against animal diseases, using the human-like h3 si virus as a working example. we reviewed and outlined basic concepts and aspects of phylodynamic methods and attempted to summarize essential components of the methodology in one analytical pipeline to facilitate the proper use of the methods by animal health researchers. we also pinpointed requirements and challenges for integrating phylodynamic methods into routine animal disease surveillance activities. the datasets analyzed for this study (alignments, beast xmls, and mcc tree files) can be found in the figshare dataset: https://doi.org/10.6084/m9.figshare.11842989.v1.
references
- carlsson: global animal disease surveillance
- novel analytic tools for the study of porcine reproductive and respiratory syndrome virus (prrsv) in endemic settings: lessons learned in the u.s. porcine health manag
- a review of quantitative tools used to assess the epidemiology of porcine reproductive and respiratory syndrome in u.s. swine farms using dr. morrison's swine health monitoring program data. front vet sci
- towards a genomics-informed, real-time, global pathogen surveillance system
- highly pathogenic avian influenza virus subtype h5n1 in africa: a comprehensive phylogenetic analysis and molecular characterization of isolates
- regional patterns of genetic diversity in swine influenza a viruses in the united states from
- influenza a h5n1 immigration is filtered out at some international borders
- molecular epidemiology of foot-and-mouth disease virus
- comparative study of codon substitution patterns in foot-and-mouth disease virus (serotype o)
- foot-and-mouth disease: past, present and future
- spatial and temporal patterns of porcine reproductive and respiratory syndrome virus (prrsv) genotypes in ontario, canada
- analysis of orf5 and full-length genome sequences of porcine reproductive and respiratory syndrome virus isolates of genotypes 1 and 2 retrieved worldwide provides evidence that recombination is a common phenomenon and may produce mosaic isolates
- molecular evolution of prrsv in europe: current state of play
- bayesian selection of continuous-time markov chain evolutionary models
- unifying the spatial epidemiology and molecular evolution of emerging epidemics
- evolutionary analysis of the dynamics of viral infectious disease
- tracking virus outbreaks in the twenty-first century
- evidence for positive selection in foot-and-mouth disease virus capsid genes from field isolates
- phylogeography takes a relaxed random walk in continuous space and time
- combining phylogeography and spatial epidemiology to uncover predictors of h5n1 influenza a virus diffusion
- novel approaches for spatial and molecular surveillance of porcine reproductive and respiratory syndrome virus (prrsv) in the united states. sci rep
- unifying viral genetics and human transportation data to predict the global transmission dynamics of human influenza h3n2
- widespread reassortment shapes the evolution and epidemiology of bluetongue virus following european invasion
- bayesian phylogenetic and phylodynamic data integration using beast 1.10
- improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty
- new routes to phylogeography: a bayesian structured coalescent approximation
- phylodynamic reconstruction of o cathay topotype foot-and-mouth disease virus epidemics in the philippines
- phylodynamics of h1n1/2009 influenza reveals the transition from host adaptation to immune-driven selection
- transmission dynamics of re-emerging rabies in domestic dogs of rural china
- genetic diversity of prrs virus collected from air samples in four different regions of concentrated swine production during a high incidence season
- the spread of type 2 porcine reproductive and respiratory syndrome virus (prrsv) in north america: a phylogeographic approach
- bayesian phylogeography of influenza a/h3n2 for the 2014-15 season in the united states using three frameworks of ancestral state reconstruction
- improving bayesian population dynamics inference: a coalescent-based model for multiple loci
- real-time characterization of the molecular epidemiology of an influenza pandemic
- reassortment patterns of avian influenza virus internal segments among different subtypes
- avian influenza virus exhibits distinct evolutionary dynamics in wild birds and poultry
- applications of bayesian phylodynamic methods in a recent u.s. porcine reproductive and respiratory syndrome virus outbreak
- hiv epidemiology. the early spread and epidemic ignition of hiv-1 in human populations
- reconstruction of the evolutionary dynamics of the a(h1n1)pdm09 influenza virus in italy during the pandemic and post-pandemic phases
- influenza a virus in swine surveillance
- origins and evolutionary genomics of the 2009 swine-origin h1n1 influenza a epidemic
- phylogenomics and molecular evolution of foot-and-mouth disease virus
- phylogeographic and cross-species transmission dynamics of sat1 and sat2 foot-and-mouth disease virus in eastern africa
- spatiotemporal evolutionary epidemiology of h5n1 highly pathogenic avian influenza in west africa and nigeria
- influenza a virus migration and persistence in north american wild birds
- alignment uncertainty and genomic analysis
- multiple sequence alignment averaging improves phylogeny reconstruction
- clustal w and clustal x version 2.0
- coffee: a novel method for fast and accurate multiple sequence alignment
- muscle: multiple sequence alignment with high accuracy and high throughput
- aliview: a fast and lightweight alignment viewer and editor for large datasets
- evolutionary trees from dna sequences: a maximum likelihood approach
- methods for computing wagner trees
- the neighbor-joining method: a new method for reconstructing phylogenetic trees
- building phylogenetic trees from molecular data with mega
- raxml version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies
- the phylogeography of human viruses
- treemmer: a tool to reduce large phylogenetic datasets with minimal loss of diversity
- recombination in viruses: mechanisms, methods of study, and evolutionary consequences
- consequences of recombination on traditional phylogenetic analysis
- rdp3: a flexible and fast computer program for analyzing recombination
- exploring the temporal structure of heterochronous sequences using tempest (formerly path-o-gen)
- measurably evolving populations
- substitution and site models
- fast, accurate and simulation-free stochastic mapping
- dating of the human-ape splitting by a molecular clock of mitochondrial dna
- some probabilistic and statistical problems in the analysis of dna sequences
- maximum likelihood phylogenetic estimation from dna sequences with variable rates over sites: approximate methods
- maximum likelihood estimation of the heterogeneity of substitution rate among nucleotide sites
- an empirical examination of the utility of codon-substitution models in phylogeny reconstruction
- maximum-likelihood models for combined analyses of multiple sequence data
- likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution
- estimating the dimension of a model
- regression and time series model selection in small samples
- methods for selecting fixed-effect models for heterogeneous codon evolution, with comments on their application to gene and genome data
- partitionfinder: combined selection of partitioning schemes and substitution models for phylogenetic analyses
- modeltest 2: more models, new heuristics and parallel computing
- bmodeltest: bayesian phylogenetic site model averaging and model comparison
- the molecular clock
- divergence time and evolutionary rate estimation with multilocus data
- relaxed phylogenetics and dating with confidence
- estimation of the transition/transversion rate bias and species sampling
- bayesian random local clocks, or one rate to rule them all
- evolutionary trees
- on the genealogy of large populations
- estimating effective population size and mutation rate from sequence data using metropolis-hastings sampling
- estimating mutation parameters, population history and genealogy simultaneously from temporally spaced sequence data
- maximum-likelihood estimation of migration rates and effective population numbers in two populations using a coalescent approach
- unifying vertical and nonvertical evolution: a stochastic arg-based framework
- a viral sampling design for testing the molecular clock and for estimating evolutionary rates and divergence times
- ancestral inference in population genetics
- phylodynamics of h5n1 highly pathogenic avian influenza in europe
- an integrated framework for the inference of viral population history from reconstructed genealogies
- bayesian coalescent inference of past population dynamics from molecular sequences
- smooth skyride through a rough skyline: bayesian coalescent-based inference of population dynamics
- structured trees and phylogeography
- bayesian phylogeography finds its roots
- phylodynamics and evolutionary epidemiology of african swine fever p72-cvr genes in eurasia and africa
- using temporally spaced sequences to simultaneously estimate migration rates, mutation rate and population sizes in measurably evolving populations
- scotti: efficient reconstruction of transmission within outbreaks with the structured coalescent
- mascot: parameter and state inference under the marginal structured coalescent approximation
- phylodynamics of infectious disease epidemics
- modelling tree shape and structure in viral phylodynamics
- using an epidemiological model for phylogenetic inference reveals density dependence in hiv transmission
- bayesian phylogenetics with beauti and the beast 1.7
- available online at
- available online at
- model selection tutorial
- choosing among partition models in bayesian phylogenetics
- posterior analysis and post processing
- posterior summarization in bayesian phylogenetics using tracer 1.7
- should phylogenetic models be trying to "fit an elephant"?
- spread3: interactive visualization of spatiotemporal history and trait evolutionary processes
- novel reassortant human-like h3n2 and h3n1 influenza a viruses detected in pigs are virulent and antigenically distinct from swine viruses endemic to the united states
- lipman: the influenza virus resource at the national center for biotechnology information
- bowman: evolutionary dynamics of influenza a viruses in us exhibition swine
- phylogeography of swine influenza h3n2 in the united states: translational public health for zoonotic disease surveillance
- spatial dynamics of human-origin h1 influenza a virus in north american swine
- fast dating using least-squares criteria and algorithms
- prevalence of influenza a virus in exhibition swine during arrival at agricultural fairs
- the effects of sampling strategy on the quality of reconstruction of viral population dynamics using bayesian skyline family coalescent methods: a simulation study
- phylodynamics of the emergence of influenza viruses after cross-species transmission
- the evolutionary dynamics of influenza a virus adaptation to mammalian hosts
- full genome sequencing reveals new southern african territories genotypes bringing us closer to understanding true variability of foot-and-mouth disease virus in africa
- reconstructing the evolutionary history of pandemic foot-and-mouth disease viruses: the impact of recombination within the emerging o/me-sa/ind-2001 lineage
- evolution of bluetongue virus serotype 1 in northern australia over 30 years
- phylodynamics of h5n1 avian influenza virus in indonesia
- dynamics and evolution of highly pathogenic porcine reproductive and respiratory syndrome virus following its introduction into a herd concurrently infected with both types 1 and 2
- treetime: maximum-likelihood phylodynamic analysis
- scalable relaxed clock phylogenetic dating
- phylodynamic applications in 21(st) century global infectious disease research. glob health res policy
the study was designed by ma and mt.
key: cord-003243-u744apzw authors: michael, edwin; sharma, swarnali; smith, morgan e.; touloupou, panayiota; giardina, federica; prada, joaquin m.; stolk, wilma a.; hollingsworth, deirdre; de vlas, sake j. title: quantifying the value of surveillance data for improving model predictions of lymphatic filariasis elimination date: 2018-10-08 journal: plos negl trop dis doi: 10.1371/journal.pntd.0006674 sha: doc_id: 3243 cord_uid: u744apzw background: mathematical models are increasingly being used to evaluate strategies aiming to achieve the control or elimination of parasitic diseases. recently, owing to growing realization that process-oriented models are useful for ecological forecasts only if the biological processes are well defined, attention has focused on data assimilation as a means to improve the predictive performance of these models.
methodology and principal findings: we report on the development of an analytical framework to quantify the relative values of various longitudinal infection surveillance data collected in field sites undergoing mass drug administrations (mdas) for calibrating three lymphatic filariasis (lf) models (epifil, lymfasim, and transfil), and for improving their predictions of the required durations of drug interventions to achieve parasite elimination in endemic populations. the relative information contribution of site-specific data collected at the time points proposed by the who monitoring framework was evaluated using model-data updating procedures, and via calculations of the shannon information index and weighted variances from the probability distributions of the estimated timelines to parasite extinction made by each model. results show that data-informed models provided more precise forecasts of elimination timelines in each site compared to model-only simulations. data streams that included year 5 post-mda microfilariae (mf) survey data, however, reduced each model’s uncertainty most compared to data streams containing only baseline and/or post-mda 3 or longer-term mf survey data irrespective of mda coverage, suggesting that data up to this monitoring point may be optimal for informing the present lf models. we show that the improvements observed in the predictive performance of the best data-informed models may be a function of temporal changes in inter-parameter interactions. such best data-informed models may also produce more accurate predictions of the durations of drug interventions required to achieve parasite elimination. significance: knowledge of relative information contributions of model only versus data-informed models is valuable for improving the usefulness of lf model predictions in management decision making, learning system dynamics, and for supporting the design of parasite monitoring programmes. 
the present results further pinpoint the crucial need for longitudinal infection surveillance data for enhancing the precision and accuracy of model predictions of the intervention durations required to achieve parasite elimination in an endemic location.
mathematical models of parasite transmission, via their capacity for producing dynamical forecasts or predictions of the likely future states of an infection system, offer an important tool for guiding the development and evaluation of strategies aiming to control or eliminate infectious diseases [1] [2] [3] [4] [5] [6] [7]. the power of these numerical simulation tools is based uniquely on their ability to appropriately incorporate the underlying nonlinear and multivariate processes of pathogen transmission in order to facilitate plausible predictions outside the range of conditions at which these processes are either directly observed or quantified [8] [9] [10] [11]. the value of these tools for guiding policy and management decisions by providing comparative predictions of the outcomes of various strategies for achieving the control or elimination of the major neglected tropical diseases (ntds) has been highlighted in a series of recent publications [8, 11, 12], demonstrating the crucial role these quantitative tools are beginning to play in advancing policy options for these diseases. while these developments underscore the utility of transmission models for supporting policy development in parasite control, a growing realization is that these models can be useful for this purpose only if the biological processes are well defined and demographic and environmental stochasticity are either well-characterized or unimportant for meeting the goal of the policy modelling exercise [9] [10] [11] [13] [14] [15] [16]. this is because the realized predictability of any model for a system depends on the initial conditions, parameterizations and process equations that are utilized in its simulation such that model outcomes are strongly sensitive to the choice of values used for these variables [17].
any misspecification of these system attributes will lead to failure in accurately forecasting the future behaviour of a system, with predictions of actual future states becoming highly uncertain even when the exact representation of the underlying deterministic process is well established but precise specification of initial conditions or forcing and/or parameter values is difficult to achieve [17, 18] . this problem becomes even more intractable when theoretical models depend on parameter estimates taken from other studies [5, 17, 19] . both these challenges, viz. sensitivity to forcing conditions and use of parameter estimates from settings that are different from the dynamical environment in which a model will be used for simulation, imply that strong limits will be imposed on the realized predictability of any given model for an application [9, 10, 20] . as we have shown recently, if such uncertainties are ignored, the ability of parasite transmission models to form the scientific basis for management decisions can be severely undermined, especially when predictions are required over long time frames and across heterogeneous geographic locations [4, 5, 7] . these inherent difficulties with using an idealized model for producing predictions to guide management have led to consideration of data-driven modelling procedures that allow the use of information contained within observations to improve specification and hence the predictive performance of process-based models [9, 10, 14, [21] [22] [23] . such approaches, termed model-data fusion or data assimilation methods, act by combining models with various data streams (including observations made at different spatial or temporal scales) in a statistically rigorous way to inform initial conditions, constrain model parameters and system states, and quantify model errors. 
the result is the discovery of models that can more adequately capture the prevailing system dynamics in a site, an outcome which in turn has been shown to result in the making of significantly improved predictions for management decision making [9, 10, 14, 24] . initially used in geophysics and weather forecasting, these methods are also beginning to be applied in ecological modelling, including more recently in the case of infectious disease modelling [9, 10] . in the latter case, the approach has shown that it can reliably constrain a disease transmission model during simulation to yield results that approximate epidemiological reality as closely as possible, and as a consequence improve the accuracy of forecasts of the response of a pathogen system exposed to various control efforts [4-7, 21, 25-27] . more recently, attention has also focused on the notion that a model essentially represents a conditional proposition, i.e. that running a model in a predictive mode presupposes that the driving forces of the system will remain within the bounds of the model conceptualization or specification [28] . if these driving forces were to change, then it follows that even a model well-calibrated to a given historical dataset will fail. new developments in longitudinal data assimilation can mitigate this problem of potential time variation of parameters via the recursive adjustment of the model by assimilation of data obtained through time [22, 29, 30] . apart from allowing assessment of whether stasis bias may occur in model predictions, such sequential model calibration with time-varying data can also be useful for quantifying the utility of the next measurement in maximizing the information gained from all measurements together [31] . 
carrying out such longitudinal model-data analysis has thus the potential for providing information to improve the efficiency and cost-effectiveness of data monitoring campaigns [24, [31] [32] [33] , along with facilitating more reliable model forecasts. a key question, however, is evaluating which longitudinal data streams provide the most information to improve model performance [33] . indeed, it is possible that from a modelling perspective using more data may not always lead to a better-constrained model [34] . this suggests that addressing this question is not only relevant to model developers, who need observational data to improve, constrain, and test models, but also for disease managers working on the design of disease surveillance plans. at a more philosophical level, we contend that these questions have implications for how current longitudinal monitoring data from parasite control programmes can best be exploited both scientifically and in management [31] . specifically, we suggest that these surveillance data need to be analysed using models in a manner that allows the extraction of maximal information about the monitored dynamical systems so that this can be used to better guide both the collection of such data as well as the provision of more precise estimates of the system state for use in making state-dependent decisions [2, [35] [36] [37] . currently, parasite control programmes use infection monitoring data largely from sentinel sites primarily to determine if an often arbitrarily set target is met [3] . little consideration is given to whether these data could also be used to learn about the underlying transmission dynamics of the parasitic system, or how such learning can be effectively used by management to make better decisions regarding the interventions required in a setting to meet stated goals [2, 4] . 
here, we develop an analytical framework to investigate the value of using longitudinal lf infection data for improving predictions of the durations of drug interventions required for achieving lf elimination by coupling data collected during mass drug administrations (mdas) carried out in three example field sites to three existing state-of-the-art lymphatic filariasis (lf) models [4, 6, 21, 38-43]. to be managerially relevant to current who-specified lf intervention surveillance efforts, we evaluated the usefulness of infection data collected in these sites at the time points proposed by the who monitoring framework in carrying out the present assessment [44]. this was specifically performed by ranking these different infection surveillance data streams according to the incremental information gain that each stream provided for reducing the prediction uncertainty of each model. longitudinal pre- and post-infection and mda data from representative sites located in each of the three major regions endemic for lf (africa, india, and papua new guinea (png)) were assembled from the published literature for use in constraining the lf models employed in this study. the three sites (kirare, tanzania, alagramam, india, and peneng, png) were selected on the basis that each represents the average endemic transmission conditions (average level of infection, transmitting mosquito genus) of each of these three major extant lf regions, while providing details on the required model inputs and data for conducting this study. these data inputs encompassed information on the annual biting rate (abr) and dominant mosquito genus, as well as mda intervention details, including the relevant drug regimen, time and population coverage of mda, and times and results of the conducted microfilaria (mf) prevalence surveys (table 1).
note each site also provided these infection and mda data at the time points pertinent to the existing who guidelines for conducting lf monitoring surveys during a mda programme [44] , which additionally, as pointed out above, allowed the assessment of the value of such infection data both for supporting effective model calibration and for producing more reliable intervention forecasts. the three existing lf models employed for this study included epifil, a deterministic monte carlo population-based model, and lymfasim and transfil, which are both stochastic, individual-based models. all three models simulate lf transmission in a population by accounting for key biological and intervention processes such as impacts of vector density, the life cycle of the parasite, age-dependent exposure, density-dependent transmission processes, infection aggregation, and the effects of drug treatments as well as vector control [4, 21, 38-40, 42, 43, 49] . although the three models structurally follow a basic coupled immigration-death model formulation, they differ in implementation (e.g. from individual to population-based), the total number of parameters included, and the way biological and intervention processes are mathematically incorporated and parameterized. the three models have been compared in recent work [8, 12] , with full details of the implementation and simulation procedures for each individual model also described [6, 8, 12, 21, 39, 42, 43, 49, 50] . individual model parameters and fitting procedures specific to this work are given in detail in s1 supplementary information. we used longitudinal data assimilation methods to sequentially calibrate the three lf models with the investigated surveillance data such that parameter estimates and model predictions reflect not only the information contained in the baseline but also follow-up data points. 
the available mf prevalence data from each site were arranged into four different temporal data streams to imitate the current who guidelines regarding the time points for conducting monitoring surveys during an mda programme. this protocol proposes that infection data be collected in sentinel sites before the first round of mda to establish baseline conditions, no sooner than 6 months following the third round of mda, and no sooner than 6 months following the fifth mda to assess whether transmission has been interrupted (defined as reduction of mf prevalence to below 1% in a population) [44, 51]. thus, the four data streams considered for investigating the value of information gained from each survey were respectively: scenario 1-baseline mf prevalence data only, scenario 2-baseline and post-mda 3 mf prevalence data, scenario 3-baseline, post-mda 3, and post-mda 5 mf prevalence data, and scenario 4-baseline and post-mda 5 mf prevalence data. in addition to these four data streams, a fifth model-only scenario (scenario 0) was also considered where no site-specific data was introduced. in this case, simulations of interventions were performed using only model-specific parameter and abr priors estimated for each region. the first step for all models during the data assimilation exercises reported here was to initially simulate the baseline infection conditions in each site using a large number of samples (100,000 for epifil and transfil, and 10,000-30,000 for lymfasim) randomly selected from the parameter priors deployed by each model. the number of parameters which were left free to be fitted to these data by each model ranged from 3 (lymfasim and transfil) to 21 (epifil). the abr, a key transmission parameter in all three models, was also left as a free parameter whose distribution was influenced by the observed abr (table 1) and/or by fits to previous region-specific datasets (see s1 supplementary information for model-specific implementations).
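for concreteness, the five fitting scenarios can be written down compactly. the sketch below is our own encoding (the scenario numbers and survey labels follow the text; the data structure itself is not from the paper):

```python
# hypothetical encoding (ours, for illustration): which mf prevalence surveys
# feed each fitting scenario; scenario 0 is the model-only case with no site data.
SCENARIOS = {
    0: [],
    1: ["baseline"],
    2: ["baseline", "post-MDA 3"],
    3: ["baseline", "post-MDA 3", "post-MDA 5"],
    4: ["baseline", "post-MDA 5"],
}

def surveys_for(scenario):
    """Return the mf prevalence surveys used to constrain a model."""
    return SCENARIOS[scenario]
```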
the subsequent steps used to incorporate longitudinal infection data into the model calibration procedure varied among the models, but in all cases the goodness-of-fit of the model outputs for the site-specific mf prevalence data was assessed using the chi-square metric (α = 0.05) [52]. epifil used a sequential model updating procedure to iteratively modify the parameters with the introduction of each subsequent follow-up data point through time [6]. this process uses parameter estimates from model fits to previous data as priors for the simulation of the next data which are successively updated with the introduction of each new observation, thus providing a flexible framework by which to constrain a model using newly available data. fig 1 summarizes the iterative algorithm used for conducting this sequential model-data assimilation exercise [6]. lymfasim and transfil, by contrast, included all the data in each investigated stream together for selecting the best-fitting models for each time series, i.e. model selection for each data series was based on using all relevant observations simultaneously in the fitting process [30, 53, 54]. a limitation of this batch estimation approach is that the posterior probability of each model is fixed for the whole simulation period, unlike the case in sequential data assimilation, where a restricted set of parameters is exposed to each observation (as a result of parameter constraining by data used in the previous time step), which thereby yields models that give better predictions for different portions of the underlying temporal process. here, however, we use both methods to include and assess the impact that this implementation difference may have on the results presented below. for all models, the final updated parameter estimates from each data stream were used to simulate the impact of observed mda rounds and for predicting the impact of continued mda to estimate how many years were required to achieve 1% mf prevalence.
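the sequential chi-square filtering used by epifil can be illustrated schematically: draw parameter vectors from the priors, then, as each survey point arrives, retain only the vectors whose simulated mf prevalence passes the goodness-of-fit test, so that the survivors act as the empirical prior for the next point. the sketch below is a simplified stand-in (the toy one-parameter "model", the single-observation chi-square statistic, and all numbers are ours, not the epifil implementation):

```python
import random

CHI2_CRIT = 3.841  # chi-square critical value at alpha = 0.05, 1 degree of freedom

def chi_square_stat(observed, expected):
    """Pearson chi-square contribution for one observed mf prevalence (%)."""
    return (observed - expected) ** 2 / max(expected, 1e-9)

def sequential_filter(prior_samples, surveys, simulate_mf):
    """Sequentially constrain an ensemble of sampled parameter vectors.

    At each survey time, keep only the parameter vectors whose simulated
    prevalence passes the chi-square goodness-of-fit test; the survivors
    serve as the prior ensemble for the next survey point.
    """
    accepted = list(prior_samples)
    for time, observed in surveys:
        accepted = [p for p in accepted
                    if chi_square_stat(observed, simulate_mf(p, time)) < CHI2_CRIT]
    return accepted

# toy usage: a one-parameter 'model' whose prevalence decays at a sampled rate
def toy_model(params, t):
    return 20.0 * (1.0 - params["decay"]) ** t  # assumed 20% baseline mf

random.seed(0)
priors = [{"decay": random.uniform(0.0, 0.5)} for _ in range(1000)]
surveys = [(0, 20.0), (3, 8.0), (5, 4.0)]  # baseline, post-MDA 3, post-MDA 5
posterior = sequential_filter(priors, surveys, toy_model)
```

with each assimilated survey the accepted ensemble shrinks, which is exactly the constraining effect the text describes: fewer, better-fitting parameter vectors and hence narrower intervention forecasts.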
interventions were modelled by using the updated parameter vectors or models selected from each scenario for simulating the impact of the reported as well as hypothetical future mda rounds on the number of years required to reduce the observed baseline lf prevalence in each site to below the who transmission threshold of 1% mf prevalence [44]. when simulating these interventions, the observed mda times, regimens, and coverages followed in each site were used (table 1), while mda was assumed to target all residents aged 5 years and above. for making mf prevalence forecasts beyond the observations made in each site, mda simulations were extended for a total of 50 annual rounds in each site at an assumed coverage of 65%. while the drug-induced mf kill rate and the duration of adult worm sterilization were fixed among the models (table 1), the worm kill rate was left as a free parameter to be estimated from post-intervention data to account for the uncertainty in this drug efficacy parameter [4, 7, 21]. the number of years of mda required to achieve the threshold of 1% mf prevalence was calculated from model forecasts of changes in mf prevalence due to mda for each model-data fusion scenario. the predictions from each model regarding timelines to achieve 1% mf for each fitting scenario were used to determine the information gained from each data stream compared to the information attributable to the model itself [14, 33, 55].

(fig 1 caption: in all scenarios, the initial epifil models were initialized with parameter priors and a chi-square fitting criterion was applied to select those models which represent the baseline mf prevalence data sufficiently well (α = 0.05). the accepted models were then used to simulate the impact of interventions on mf prevalence. the chi-square fitting criterion was sequentially applied to refine the selection of models according to the post-mda mf prevalence data included in the fitting scenario. the fitted parameters from the selection of acceptable models at each data point were used to predict timelines to achieve 1% mf prevalence. the scenarios noted in the blue boxes indicate the final relevant updating step before using the fitted parameters to predict timelines to achieve 1% mf in that data fitting scenario.)

the relative information gained from a particular data stream was calculated as i_d = h_m - h_md, where h measures the entropy or uncertainty associated with a random variable, h_m denotes predictions from the model-only scenario (scenario 0), which essentially represents the impact of prior knowledge of the system, and h_md signifies predictions from each of the four model-data scenarios (i.e. scenarios 1-4). the values of i_d for each data scenario or stream were compared in a site to infer which survey data are most useful for reducing model uncertainty. the shannon information index was used to measure entropy, h, as follows: h = -∑ p_i log(p_i), summed over i = 1, ..., m, where p_i is the discrete probability density function (pdf) of the number of years of mda predicted by each fitted model to reach 1% mf, and is estimated from a histogram of the respective model predictions for m bins (of equal width in the range between the minimum and maximum values of the pdfs) [14, 56]. to statistically compare two entropy values, a permutation test using the differential shannon entropy (dse) was performed [57]. dse is defined as |h_1 - h_2|, where h_1 was calculated from the distribution of timelines to achieve 1% mf for a given scenario, y_1, and h_2 was calculated from the distribution of timelines to achieve 1% mf for a different scenario, y_2. the lists of elements in y_1 and y_2 were combined into a single list of size y_1 + y_2 and the list was permuted 20,000 times.
dse was then recalculated each time by calculating a new h_1 from the first y_1 elements and a new h_2 from the last y_2 elements from each permutation, from which p-values may be quantified as the proportion of all recalculated dses that were greater than the original dse. model predictions of the mean and variance in timelines to lf elimination were weighted according to the frequencies by which predictions occurred in a group of simulations. in general, if d_1, d_2, ..., d_n are data points (model predictions in the present case) that occur in an ensemble of simulations with different weights or frequencies w_1, w_2, ..., w_n, then the weighted mean is d̄ = (∑ w_i d_i) / (∑ w_i) and the weighted variance is s² = ∑ w_i (d_i - d̄)² / (((n_0 - 1)/n_0) ∑ w_i), with all sums over i = 1, ..., n; here, n is the number of data points and n_0 is the number of non-zero weights. in this study, the weighted variance of the distributions of predicted timelines to achieve 1% mf prevalence was calculated to provide a measure of the precision of model predictions in addition to the entropy measure, h. a similar weighting scheme was also used to pool the timeline predictions of all three models. here, predictions made by each of the three models for each data scenario were weighted as above, and a composite weighted 95% percentile interval for the pooled predictions was calculated for each data stream. this was done by first computing the weighted percentiles for the combined model simulations from which the pooled 2.5th and 97.5th percentile values were quantified. the matlab function, wprctile, was used to carry out this calculation. the extent by which parameter constraints are achieved through the coupling of models with data was evaluated to determine if improvements in such constraints by the use of additional data may lead to reduced model prediction uncertainty [33]. parameter constraint was calculated as the ratio of the mean standard deviation of all fitted parameter distributions to the mean standard deviation of all prior parameter distributions.
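the entropy, information-gain, dse and weighted-variance computations described above can be sketched in a few lines. this is our own minimal python implementation; the bin count m and the log base are assumptions, as the text does not fix them:

```python
import math
import random

def shannon_entropy(predictions, m=10):
    """h = -sum(p_i log p_i) from an m-bin histogram of predicted timelines."""
    lo, hi = min(predictions), max(predictions)
    width = (hi - lo) / m or 1.0  # guard against identical predictions
    counts = [0] * m
    for d in predictions:
        counts[min(int((d - lo) / width), m - 1)] += 1
    n = len(predictions)
    return -sum(c / n * math.log(c / n) for c in counts if c)

def information_gain(model_only, model_data, m=10):
    """i_d = h_m - h_md: entropy reduction from assimilating a data stream."""
    return shannon_entropy(model_only, m) - shannon_entropy(model_data, m)

def dse_permutation_test(y1, y2, n_perm=20000, m=10, seed=1):
    """p-value for the differential shannon entropy |h_1 - h_2| by permutation."""
    observed = abs(shannon_entropy(y1, m) - shannon_entropy(y2, m))
    pooled = list(y1) + list(y2)
    rng, hits = random.Random(seed), 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        dse = abs(shannon_entropy(pooled[:len(y1)], m)
                  - shannon_entropy(pooled[len(y1):], m))
        hits += dse > observed
    return hits / n_perm

def weighted_mean_var(d, w):
    """frequency-weighted mean and variance of predicted timelines."""
    sw = sum(w)
    mean = sum(wi * di for wi, di in zip(w, d)) / sw
    n0 = sum(1 for wi in w if wi)  # number of non-zero weights
    var = sum(wi * (di - mean) ** 2 for wi, di in zip(w, d)) / ((n0 - 1) / n0 * sw)
    return mean, var
```

a data stream that sharpens the prediction histogram drives its entropy toward zero, so a larger information_gain value marks the more informative survey combination.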
a ratio of less than one indicates the fitted parameter space is more constrained than the prior parameter space [33]. this assessment was carried out using the epifil model only. in addition, pairwise parameter correlations were also evaluated to assess whether the sign, magnitude, and significance of these correlations changed by scenario to determine if using additional data might alter these interactions to better constrain a model. for this assessment, spearman's correlation coefficients and p-values testing the hypothesis of no correlation against the alternative of correlation were calculated, and the exercise was run using the estimated parameters from the epifil model. epifil was used to conduct a sensitivity analysis investigating whether the trend in relative information gained by coupling the model with longitudinal data was dependent on the interventions simulated. the same series of simulations (for three lf endemic sites and five fitting scenarios) was completed with the extended mda coverage beyond the observations given in table 1 set here at 80% instead of 65% to represent an optimal control strategy. as before, the timelines to reach 1% mf prevalence in each fitting scenario were calculated and used to determine which data stream provided the model with the greatest gain of information. the results were compared to the original series of simulations to assess whether the trends are robust to changes in the intervention coverages simulated. epifil was also used to perform another sensitivity analysis expanding the number of data streams to investigate if the who monitoring scheme is adequate for informing the making of reliable model-based predictions of timelines for achieving lf elimination. to perform this sensitivity analysis, pre- and post-mda data from villupuram district, india that provide extended data points (viz.
scenarios 1-4 as previously defined, plus scenario 5-baseline, post-mda 3, post-mda 5, and post-mda 7 mf prevalence data, and scenario 6-baseline, post-mda 3, post-mda 5, post-mda 7, and post-mda 9 mf prevalence data) were assembled from the published literature [47, 58]. the timelines to reach 1% mf prevalence and the entropy for each of these additional scenarios were calculated to determine whether additional data streams over those recommended by who are required for achieving more reliable model constraints, which among these data might be considered as compulsory, and which might be optional for supporting predictions of elimination. differences in predicted medians, weighted variances and entropy values between data scenarios, models and sites were statistically evaluated using kruskal-wallis tests for equal medians, f-tests for equality of variance, and dse permutation tests, respectively. p-values for assessing significance for all pairwise tests were obtained using the benjamini-hochberg procedure for controlling the false discovery rate, i.e. for protecting against the likelihood of obtaining false positive results when carrying out multiple testing [59]. here, our goal was twofold. first, to determine if data are required to improve the predictability of intervention forecasts by the present lf models in comparison with the use of theoretical models only, and second, to evaluate the benefit of using different longitudinal streams of mf survey data for calibrating the three models in order to determine which data stream was most informative for reducing the uncertainty in model predictions in a site.
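the benjamini-hochberg correction applied to the pairwise tests above is a standard procedure; a minimal sketch of it (our own implementation, with q = 0.05 as in the text) is:

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Flag which p-values are significant under the Benjamini-Hochberg
    false discovery rate procedure at level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # find the largest rank k with p_(k) <= (k / m) * q
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank / m * q:
            k = rank
    # every test ranked at or below k is declared significant
    significant = [False] * m
    for rank, i in enumerate(order, start=1):
        significant[i] = rank <= k
    return significant
```

compared with a bonferroni correction, this step-up rule controls the expected proportion of false discoveries rather than the chance of any single one, which is why the authors use it when comparing many scenario/model/site pairs at once.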
table 2 summarises the key results from our investigation of these questions: these are the number of accepted best-fitting models for each data stream or scenario in the three study sites (table 1), the predicted median and range (2.5th-97.5th percentiles) in years to achieve the mf threshold of 1% mf prevalence, the weighted variance and entropy values based on these predictions, and the relative information gained (in terms of reduced prediction uncertainty) by the use of longitudinal data for constraining the projections of each of the three lf models investigated. even though the number of selected best-fit models based on the chi-square criterion (see methods) differed for each site and model, these results indicate unequivocally that models constrained by data provided significantly more precise intervention predictions compared to model-only predictions (table 2). note that this was also irrespective of the two types of longitudinal data assimilation procedures (sequential vs. simultaneous) used by the different models in this study. thus, for all models and sites, model-only predictions made in the absence of data (scenario 0) showed the highest prediction uncertainty, highlighting the need for data to improve the predictive performance of the present models. the relative information gained by using each data stream in comparison to model-only predictions further supports this finding, with the best gains in reducing model prediction uncertainty provided by those data constraining scenarios that gave the lowest weighted variance and entropy values; as much as 92% to 96% reductions in prediction variance were achieved by these scenarios in comparison to model-only predictions between the three models (table 2).
the results also show, however, that data streams including post-mda 5 mf survey data (scenarios 3 and 4) reduced model uncertainty (based on both the variance and entropy measures) most compared to data streams containing only baseline and/or post-mda 3 mf survey data (scenarios 1 and 2) (table 2). although there were differences between the three models (due to implementation differences either in how the models are run (monte carlo deterministic vs. individual-based) or in relation to how the present data were assimilated (see above)), overall, scenario 3, which includes baseline, post-mda 3, and post-mda 5 data, was most often the scenario that reduced model uncertainty the most. additionally, there was no statistical difference between the performances of scenarios 3 and 4 in those cases where scenario 4 resulted in the greatest gain of information (table 2). it is also noticeable that the best constraining data stream for each combination of site and model also produced as expected the lowest range in predictions of the numbers of years of annual mda required to achieve the 1% mf prevalence in each site, with the widest ranges estimated for model-only predictions (scenario 0) and the shorter data streams (scenario 1). in general, this constriction in predictions also led to lower estimates of the median times to achieve lf elimination, although this varied between models and sites (table 2). the change in the distributions of predicted timelines to lf elimination without and with model constraining by the different longitudinal data streams is illustrated in fig 2 for the kirare site (see s2 supplementary information for results obtained for the other two study villages investigated here).
the results illustrate that both the location and the length of the tail of the prediction distributions can change as models are constrained with increasing lengths of longitudinal data, with the inclusion of post-mda 5 mf survey data consistently producing a narrower or sharper range of predictions compared to when this survey point is excluded. fig 3 compares the uncertainty in predictions of timelines to achieve elimination made by each of the three models without constraining (scenario 0) and with constraining by the data streams providing the lowest prediction entropy for each of the models per site. note that variations in scenario 0 predictions among the three models directly reflect the different model structures, parameterizations, and the presence (or absence) of stochastic elements. the boxplots in the figure, however, show that for all three sites and models, calibration of each model by data greatly reduces the uncertainty in predictions of the years of annual mda required to eliminate lf compared to model-only predictions, with the data streams producing the lowest entropy for simulations in each site significantly improving the precision of these predictions (table 2; in that table, the lowest entropy scenario for each site is bolded and shaded grey, and additional scenarios shaded grey are not significantly different from the lowest entropy scenario). this gain in precision, and thus the information gained using these data streams, is, as expected, greater for the stochastic lymfasim and transfil models than for the deterministic epifil model. note also that even though the ranges in predictions of the annual mda years required to eliminate lf by the data streams providing the lowest prediction entropy differed statistically between the three models, the values overlapped markedly (e.g.
for kirare the ranges are 10-18, 9-14, and 9-15 years for epifil, lymfasim and transfil respectively), suggesting the occurrence of a similar constraining of predictive behaviour among the three models. to investigate this potential for a differential model effect, we further pooled the predictions from all three models for all the data scenarios and evaluated the value of each investigated data stream for improving their combined predictive ability. the weighted 95% percentile intervals from the pooled predictions were used for carrying out this assessment. the results are depicted in fig 4 and indicate that, as for the individual model predictions, uncertainty in the collective predictions by the three lf models for the required number of years to eliminate lf using annual mda in each site may be reduced by model calibration to data, with the longitudinal mf prevalence data collected during the later monitoring periods (scenarios 3 and 4) contributing most to improving the multi-model predictions for each site. the boxplots show that calibrating the models to data streams yields more precise predictions of the timelines to achieve 1% mf prevalence across all models and sites. the results of pairwise f-tests for variance, performed to compare the weighted variance in timelines to achieve 1% mf prevalence between model-only simulations (scenario 0) and the lowest entropy simulations (best scenario) (see table 2), show that the predictions for the best scenarios are significantly different from the predictions for the model-only simulations. significance was determined using the benjamini-hochberg procedure for controlling the false discovery rate (q = 0.05). for epifil, lymfasim and transfil, the best scenarios are scenarios 4, 3, and 4 for kirare, scenarios 4, 4, and 3 for alagramam, and scenarios 3, 3, and 3 for peneng, respectively.
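the variance comparison described above can be illustrated schematically. the sketch below (hypothetical function names, scipy assumed; not the exact procedure used in the study) pairs a two-sided f-test on two prediction samples with the benjamini-hochberg step applied to a collection of p-values at q = 0.05.

```python
import numpy as np
from scipy import stats

def f_test_variance(sample_a, sample_b):
    """two-sided f-test comparing the variances of two prediction samples."""
    a = np.asarray(sample_a, dtype=float)
    b = np.asarray(sample_b, dtype=float)
    f = np.var(a, ddof=1) / np.var(b, ddof=1)
    dfa, dfb = len(a) - 1, len(b) - 1
    # two-sided p-value from the f distribution
    p = 2.0 * min(stats.f.cdf(f, dfa, dfb), stats.f.sf(f, dfa, dfb))
    return f, min(p, 1.0)

def benjamini_hochberg(pvals, q=0.05):
    """boolean mask of p-values significant at false discovery rate q."""
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)
    thresholds = q * np.arange(1, m + 1) / m
    passed = p[order] <= thresholds
    k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
    significant = np.zeros(m, dtype=bool)
    significant[order[:k]] = True  # reject the k smallest p-values
    return significant
```

the bh step controls the expected proportion of false rejections across the many pairwise site/model/scenario comparisons reported in table 2.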
we attempted to investigate whether the reduction in model prediction uncertainty brought about by the use of longitudinal data was a direct function of parameter constraining by the addition of data. given the similarity in outcomes of each model, we remark here on the results from fits of the technically easier-to-run epifil model to evaluate this possibility. the assessment of the parameter space constraint achieved through the inclusion of data was made by determining whether the fitted parameter distributions for the model became reduced in comparison with priors as data streams were added to the system [33]. the exercise showed that the size of the estimated parameter distributions reduced with the addition of data, with even scenario 1 data producing reductions for kirare and peneng (fig 5). in the case of alagramam, however, there was very little, if any, constraint in the fitted parameter space compared to the prior parameter space. this result, together with the fact that even using all the data in kirare and peneng produced only 2.5% to 5% reductions in fitted parameter distributions compared to the priors, indicates that the observed model prediction uncertainty in this study may be due to other complex factors connected with model parameterization. table 3 provides the results of an analysis of pairwise parameter correlations of the selected best-fitting models for data scenario 1 compared to those selected by the data stream that gave the best reduction in epifil prediction uncertainty for alagramam (scenario 3). these results show that while the parameter space was not being constrained with the addition of more data, the pattern of parameter correlations changed in a complex manner between the two constraining data sets. for example, although the number of significantly correlated parameters did not differ, the magnitude and direction of the parameter correlations were shown to change between the two data scenarios (table 3).
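the kind of pairwise correlation analysis reported in table 3 can be sketched as follows, assuming the accepted best-fit parameter vectors are stacked into a matrix (one row per accepted model, one column per parameter); the helper names are hypothetical, and scipy's `spearmanr` supplies the rank correlations.

```python
import numpy as np
from scipy.stats import spearmanr

def parameter_correlations(param_samples):
    """pairwise spearman rank correlations (and p-values) between fitted
    parameters, computed column-wise over the accepted best-fit models."""
    rho, pval = spearmanr(param_samples)  # columns are treated as variables
    return rho, pval

def correlation_sign_flips(rho_a, rho_b):
    """count parameter pairs whose correlation changes direction between two
    data scenarios (upper triangle only, diagonal excluded)."""
    flips = np.sign(rho_a) != np.sign(rho_b)
    return int(np.triu(flips, k=1).sum())
```

comparing the two matrices pair by pair in this way surfaces both magnitude changes and the direction reversals noted in the text.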
the corresponding results for kirare and peneng are shown in s3 supplementary information, and indicate that a broadly similar pattern of changes in parameter associations also occurred as a result of model calibration to the sequential data measured from those sites. this suggests that this outcome may constitute a general phenomenon, at least with regard to the sequential constraining of epifil using longitudinal mf prevalence data. an intriguing finding (from all three data settings) is that the most sensitive parameters in this regard, i.e. with respect to altered strengths in pairwise parameter correlations, may be those representing the relationship of various components of host immunity with different transmission processes, including adult worm mortality, rates of production and survival of mf, larval development rates in the mosquito vector, and infection aggregation (table 3). this suggests that, as more constraining data are added, changes in the multidimensional parameter relationships related to host immunity could contribute to the sequential reductions in lf model predictive uncertainty observed in this study. the lf elimination timeline predictions used above were based on modelling the impacts of annual mda given the reported coverages in each site, followed by an assumed standard coverage for making longer-term predictions (see methods). this raises the question as to whether the differences detected in the best constraining data stream between the present study sites and between models (table 2) could be a function of the simulated mda coverages in each site. to investigate this possibility, we used epifil to model the effect of changing the assumed mda coverage in each site on the corresponding entropy and information gain trends in elimination predictions made from the models calibrated to each of the site-specific data scenarios/streams investigated here.
the results of increasing the assumed coverage of mda to 80% for each site are shown in fig 6 and indicate that the choice of mda coverage in this study is unlikely to have significantly influenced the conclusion made above that the best performing data streams for reducing model uncertainty when predicting lf elimination are those of scenarios 3 and 4. for kirare and peneng, the model-predicted timelines to achieve the 1% mf prevalence threshold using the observed mda coverage followed by 80% coverage showed that the data stream which most reduced uncertainty did not change from that identified using the observed coverage followed by 65% coverage (table 2, fig 6). this was not the case for alagramam, however, where data from scenario 3 with 80% coverage resulted in the greatest reduction in entropy, whereas the original results using 65% coverage indicated that scenario 4 data performed best (table 2, fig 6). notably, though, the entropy values of predictions using the scenario 3 and 4 data constraints were not statistically different for this site (fig 6). epifil was also used to expand the number of calibration scenarios using a dataset with longer-term post-mda data from villupuram district, india. this dataset contained two additional data streams: scenario 5, which included baseline, post-mda 3, post-mda 5, and post-mda 7 mf data, and scenario 6, which included baseline, post-mda 3, post-mda 5, post-mda 7, and post-mda 9 mf data. scenario 6 thus contained the most post-mda data and was demonstrated to be the most effective for reducing model uncertainty, but this effect was not statistically significantly different from the reductions produced by assimilating the data contained in scenarios 3 and 5 (table 4). (table 3 caption: spearman parameter correlations for scenarios 1 (lower left triangle) and 3 (upper right triangle) for alagramam, india.)
the inclusion of more data than are considered in scenario 3 therefore did not result in any significant additional reduction in model uncertainty. (fig 6 caption: for all sites, either scenario 3 or 4 had the lowest entropies, and scenario 4 was not significantly different from scenario 3 for kirare and alagramam. these results were not statistically different from the results given 65% coverage (see table 2), suggesting that the data stream associated with the lowest entropy is robust to changes in the interventions simulated. scenarios where the weighted variance or entropy were not significantly different from the lowest entropy scenario are noted with the abbreviation ns. significance was determined using the benjamini-hochberg procedure for controlling the false discovery rate (q = 0.05). https://doi.org/10.1371/journal.pntd.0006674.g006) (table 4 caption: predictions of timelines to achieve 1% mf in villupuram district, india, considering extended post-mda data. scenarios which are statistically significantly different from each other are indicated by superscript numbers; for example, the weighted variance for scenario 0 carries the superscripts 1-6 to indicate that it is significantly different from the weighted variances for scenarios 1-6. significance was determined using the benjamini-hochberg procedure for controlling the false discovery rate (q = 0.05) in all pairwise statistical tests. information gained by each data stream (scenarios 1-6) is presented in comparison to the information contained in the model-only simulation (scenario 0). https://doi.org/10.1371/journal.pntd.0006674.t004) epifil was used to evaluate the accuracy of the data-driven predictions of the timelines required to meet the goal of lf elimination based on breaching the who-set target of 1% mf prevalence.
this analysis was performed using the longitudinal pre- and post-mda infection and intervention data reported for the nigerian site, dokan tofa, where elimination was achieved according to who-recommended criteria after seven rounds of mda (table 5). the data from this site comprised information on the abr and the dominant mosquito genus, as well as details of the mda intervention carried out, including the relevant drug regimen applied, the timing and population coverage of mda, and the outcomes of the mf prevalence surveys conducted at baseline and at multiple time points during mda [60]. the results of model predictions of the timelines to reach below 1% mf prevalence, obtained by sequential fitting to the mf prevalence data from this site for scenarios 0-4 (as defined above), are shown in table 6. note that because no lf-positive individuals were detected among the sample populations in the post-mda 3, 5 and 7 surveys, we used a one-sided 95% clopper-pearson interval to determine the expected upper one-sided 95% confidence limits for these sequentially observed zero infection values, applying the "rule of three" approximation for the upper limit after k empty samples [61]. the results show that model constraining by scenario 2, which includes baseline and post-mda 3 data, and scenario 3, which includes baseline, post-mda 3, and post-mda 5 data, resulted in both the lowest entropy values and the shortest predicted times, i.e. from as low as 2 to as high as 7 years, required for achieving lf elimination in this site (table 6). the data in table 5 show that the first instance in which the calculated one-sided upper 95% confidence limit in this setting fell below 1% mf prevalence also occurred post-mda 7 (i.e. after 7 years of mda).
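the zero-count upper confidence limit used here can be written down directly. as a sketch (assuming k = 0 positives out of n sampled), the exact one-sided 95% clopper-pearson upper bound solves (1 - p)^n = 0.05, which the "rule of three" approximates as 3/n:

```python
def upper_limit_zero_positives(n, alpha=0.05):
    """exact one-sided (1 - alpha) clopper-pearson upper limit on prevalence
    when 0 of n sampled individuals are mf-positive: solves (1 - p)^n = alpha."""
    return 1.0 - alpha ** (1.0 / n)

def rule_of_three(n):
    """'rule of three' approximation to the 95% upper limit after n negatives."""
    return 3.0 / n
```

for example, with around 300 all-negative samples the exact bound falls just below 1%, so a survey of roughly that size with zero positives is consistent with being under the 1% mf threshold; the sample sizes here are illustrative, not those of the dokan tofa surveys.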
this is a significant result, and indicates that, apart from being able to reduce prediction uncertainty, the best data-constrained models are also able to predict more accurately the maximal time (7 years) by which lf elimination occurred in this site. our major goal in this study was to compare the reliability of forecasts of timelines required for achieving parasite elimination made by generic lf models versus models constrained by sequential mf prevalence surveillance data obtained from field sites undergoing mda. a secondary aim was to evaluate the relative value of the data obtained at each of the sampling time points proposed by the who for monitoring the effects of lf interventions in informing these model predictions. this assessment allowed us to investigate the role of these data in learning system dynamics, and to measure their value for guiding the design of surveillance programmes in order to support better predictions of the outcomes of applied interventions. fundamentally, however, this work addresses the question of how best to use predictive parasite transmission models for guiding management decision making, i.e. whether this should be based on the use of ideal models which incorporate generalized parameter values or on models with parameters informed by local data [10]. if we find that data-informed models can reduce prediction uncertainty significantly compared to the use of theoretical models unconstrained by data, then it is clear that, to be useful for management decision making, we require the application of model-data assimilation frameworks that can effectively incorporate information from appropriate data into models for producing reliable intervention projections. conversely, such a finding implies that using unconstrained ideal models in these circumstances will provide only approximate predictions characterized by a degree of uncertainty that might be too large to be useful for reliable decision making [14, 33, 62].
here, we have used three state-of-the-art lf models calibrated to longitudinal human mf prevalence data obtained from three representative lf study sites to carry out a systematic analysis of these questions in parasite intervention modelling (see also walker et al [63] for a recent study highlighting the importance of using longitudinal sentinel-site data for improving the prediction performance of the closely-related onchocerciasis models). further, by iteratively testing the reduction in the uncertainty of the projections of timelines required to achieve lf elimination in a site made by the models matching each observed data point, we have also quantified the relative value of temporal data streams, including assessing optimal record lengths, for informing the current lf models. our results provide important insights into how best to use process models for understanding and generating predictions of parasite dynamics. they also highlight how site-specific longitudinal surveillance data coupled with models can be useful for providing information about system dynamics, and hence for improving predictions of relevance to management decision-making. the first result of major importance from our work is that models informed by data can significantly reduce predictive uncertainty and hence improve the performance of the present lf models for guiding policy and management decision-making. these improvements in predictive precision were consistent between the three models and across all three of our study sites, and can be substantial: reductions in prediction variance of up to 92% to 96% were obtained by the best data-constrained models in a site compared to the use of model-only predictions (table 2). the practical policy implications of this finding can also be gleaned from appraising the actual numerical ranges in the predictions made by each individual model for each of the modelling scenarios investigated here.
in the case of epifil, the best data-informed model (scenario 3 in peneng) gave an elimination prediction range of 7-12 years, while the corresponding model-only predictions for this site indicated a need for between 6 and 29 years of annual mda (table 2). these gains in information from using data to inform model parameters, and hence predictions, were even larger for the two stochastic models investigated here, viz. lymfasim and transfil: ranges as wide as 7-28 years predicted by model-only scenarios were reduced to 9-14 years for the best data-informed models in the case of lymfasim for kirare village, and from as broad as 8-48 years to 7-22 years in the case of transfil for peneng (table 2). these results unequivocally indicate that using parasite transmission models unconstrained by data, i.e. based on general parameter values uninformed by local data, would lead to predictions marked by uncertainties that are likely to be far too large to be meaningful for practical policy making. if managers are risk averse, this outcome will also mean that they need to plan interventions for substantially longer than necessary, with major implications for the ultimate cost of the programme. note also that although statistically significant changes in the median years of mda required to achieve lf elimination were observed for the best data-informed models for all three lf model types in each site, these were relatively small compared to the large reductions seen in each model's predictive uncertainty (table 2, fig 3). this result highlights that the major gain from constraining the present models by data lies in improving their predictive certainty rather than in advancing their average behaviour.
however, our preliminary analysis of model predictive accuracy suggests that the best data-constrained models may also be able to generate more accurate predictions of the impact of control ( table 6 ), indicating that, apart from simply reducing predictive uncertainty, such models could additionally have improved capability for producing more reliable predictions of the outcomes of interventions carried out in a setting. the iterative testing of the reduction in forecast uncertainty using mf surveillance data measured at time points proposed by the who (to support assessment of whether the threshold of 1% mf prevalence has been reached before implementation units can move to post-treatment surveillance [44]) has provided further insights into the relative value of these data for improving the predictive performance of each of the present lf models. our critical finding here is that parameter uncertainty in all three lf models was similarly reduced by the assimilation of a few additional longitudinal data records (table 2 ). in particular, we show that data streams comprising baseline + post-mda 3 + post-mda 5 (scenario 3) and those comprising baseline + post-mda 5 data (scenario 4) best reduced parameter-based uncertainty in model projections of the impact of mdas carried out in each study site irrespective of the models used. although preliminary, a potential key finding is that the use of longer-term data additional to the data measured at the who proposed monitoring time points did not lead to a significant further reduction in parameter uncertainty (table 4) . also, the finding that the who data scenarios 3 and 4 were adequate for constraining the present lf models appears not to be an artefact of variations in the mda coverages observed between the three study sites (fig 6) . 
these results suggest that up to 5 years of post-mda mf prevalence data are sufficient to constrain model predictions of the impact of lf interventions over time scales that can extend as far as 7 to 22 years depending on the site and model, and that precision may not improve any further as more new data are added (table 2, table 4). given that the who post-mda lf infection monitoring protocol was developed solely to support the meeting of set targets (e.g. the 1% mf prevalence threshold), and not with a priori hypotheses regarding how surveillance data could also be used to understand the evolution, and hence prediction, of the dynamical parasitic system in response to management action, our results are entirely fortuitous with respect to the value of the current lf monitoring data for learning about the lf system and its extinction dynamics in different settings [31]. they do, nonetheless, hint at the value that coupling models to data may offer in informing general theory for guiding the collection and use of monitoring data in parasite surveillance programmes, in a manner that could help extract maximal information about the underlying parasite system of interest. our assessment of whether the incremental increase in model predictive performance observed as a result of assimilating longitudinal data may be due to parameter constraining by the addition of data has shed intriguing new light on the impact that qualitative changes in dynamical system behaviour may have on parameter estimates and structure, and hence on the nature of the future projections of system change we can make from models.
our major finding in this regard is that even though the parameter space itself may not be strongly constrained by the best data stream (scenario 3 in this case for alagramam village), the magnitude and direction of parameter correlations, particularly those representing the relationship of different components of host immunity with various transmission processes, changed markedly between the shorter (scenario 1) and seemingly optimal (scenario 3) data streams. such qualitative change in system behaviour, induced by alteration in parameter interactions in response to perturbations, has been shown to represent a characteristic feature of complex adaptive ecological systems, particularly when these systems approach a critical boundary [64] [65] [66]. this underscores yet another important reason to incorporate parameter information from data for generating sound system forecasts [67]. the finding that additional data beyond 5 years post-mda did not appear to significantly improve model predictive performance in this regard suggests that pronounced change in lf parameter interactions in response to mda interventions may generally occur around this time point for this parasitic disease, and that once in this parameter regime further change appears to be unlikely. this is an interesting finding, which indicates not only that coupling models to at least 5 years of post-mda data will allow detection of the boundaries delimiting the primary lf parameter regions with different qualitative behaviour, but also that the current who monitoring protocol might be sufficient to allow this discovery of system change. although our principal focus in this study was on investigating the value of longitudinal data for informing the predictive performance of the current lf models, the results presented here have also underscored the existence of significant spatial heterogeneity in the dynamics of parasite extinction between the present sites (table 2, fig 3).
in line with our previous findings, this observed conditional dependency of system dynamics on local transmission conditions means that the timelines or durations of interventions required to break lf transmission (as depicted in table 2) will also vary from site to site even under similar control conditions [3] [4] [5] [21]. as we indicated before, this outcome implies that we vitally require the application of models to detailed spatio-temporal infection surveillance data, such as the data collected by countries in sentinel sites as part of their who-directed monitoring and evaluation activities, if we are to use the present models to make more reliable intervention predictions to drive policy and management decisions (particularly with respect to the durations of interventions required, the need for switching to more intensified or new mda regimens, and the need for enhanced supplementary vector control) in a given endemic setting [64]. as we have previously pointed out, the development of such spatially adaptive intervention plans will require the development and use of spatially-explicit data assimilation modelling platforms that can couple geostatistical interpolation of model inputs (e.g. abr and/or sentinel-site mf/antigen prevalence data) with discovery of localized models from such data, in order to produce the required regional or national intervention forecasts [5]. the estimated parameter and prediction uncertainties presented here are clearly dependent on the model-data fusion methodology and its implementation, and on the cost function used to discover the appropriate models for a data stream [20]. while we have attempted to evaluate differences in individual model structures, their computer implementation, and the data assimilation procedures followed (e.g. sequential vs.
simultaneous data assimilation), by comparing the collective predictions of the three models with the predictions provided by each model singly, and have shown that these factors are unlikely to play a major role in influencing the current results, future work must address these issues adequately in order to improve upon the initial methods we have employed here. currently, we are examining the development of sequential bayesian-based multi-model ensemble approaches that will allow better integration of each model's behaviour, as well as better calculation of each model's transient parameter space, each time a new observation becomes available [30]. this work also involves the development of a method to fuse information from several indicators of infection (e.g. mf, antigenemia, antibody responses [21]) in order to achieve a more robust constraining of the present models. as different types of data can act as mutual constraints on a model, we also expect that such multi-indicator model-data fusion methods will additionally address the problem of equifinality, which is known to complicate the parameterization of complex dynamical models [24, 68]. of course, the ultimate test of the results reported here, viz. that lf models constrained by coupling to year 5 post-mda data can provide the best predictions of timelines for meeting the 1% mf prevalence threshold in a site, is the direct validation of our results against independent observations (as demonstrated by the preliminary validation study carried out here using the dokan tofa data (tables 5 and 6)). we expect that data useful for performing these studies at scale may be available at the sentinel-site level in the countries carrying out the current who-led monitoring programme.
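as an illustration of the sequential bayesian ensemble idea mentioned above (a minimal sketch only, not the approach under development by the authors), ensemble member or model weights could be updated each time a new mf prevalence observation arrives, assuming a gaussian observation error with standard deviation `sigma`; all names here are hypothetical.

```python
import numpy as np

def update_weights(weights, predictions, observation, sigma):
    """one sequential bayesian step: multiply the prior weights by the
    gaussian likelihood of the new observation under each member's
    prediction, then renormalize so the weights sum to one."""
    w = np.asarray(weights, dtype=float)
    pred = np.asarray(predictions, dtype=float)
    likelihood = np.exp(-0.5 * ((pred - observation) / sigma) ** 2)
    posterior = w * likelihood
    return posterior / posterior.sum()
```

repeating this step over a survey time series would progressively down-weight members (or models) whose trajectories remain far from the observed mf prevalences, which is the basic mechanism by which such an ensemble tracks the transient parameter space as new data arrive.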
the present results indicate that access to such data, and to the post-treatment surveillance data which are beginning to be assembled by many countries, is now a major need if the present lf models are to provide maximal information about parasite system responses to management, and thus generate better predictions of system states for use in policy making and in judging management effectiveness in different spatio-temporal settings [24, 31]. given that previous modelling work has indicated that the globally fixed who-proposed 1% mf prevalence threshold may be insufficient to break lf transmission in every setting (conversely leading to significant infection recrudescence [21]), the modelling of such spatio-temporal surveillance data will additionally allow testing of whether meeting this recommended threshold will indeed result in successfully achieving the interruption of lf transmission everywhere.

references
- the epidemiology of filariasis control
- the filaria
- epidemiological modelling for monitoring and evaluation of lymphatic filariasis control
- mathematical modelling and the control of lymphatic filariasis. the lancet infectious diseases
- heterogeneous dynamics, robustness/fragility trade-offs, and the eradication of the macroparasitic disease, lymphatic filariasis
- continental-scale, data-driven predictive assessment of eliminating the vector-borne disease, lymphatic filariasis, in sub-saharan africa by 2020
- sequential modelling of the effects of mass drug treatments on anopheline-mediated lymphatic filariasis infection in papua new guinea
- assessing endgame strategies for the elimination of lymphatic filariasis: a model-based evaluation of the impact of dec-medicated salt
- predicting lymphatic filariasis transmission and elimination dynamics using a multi-model ensemble framework
- data-model fusion to better understand emerging pathogens and improve infectious disease forecasting
- ecological forecasting and data assimilation in a data-rich era
- the role of data assimilation in predictive ecology
- effectiveness of a triple-drug regimen for global elimination of lymphatic filariasis: a modelling study. the lancet infectious diseases
- epidemic modelling: aspects where stochasticity matters. mathematical biosciences
- relative information contributions of model vs. data to short- and long-term forecasts of forest carbon dynamics
- inference in disease transmission experiments by using stochastic epidemic models
- plug-and-play inference for disease dynamics: measles in large and small populations as a case study
- the limits to prediction in ecological systems
- big data need big theory too
- hierarchical modelling for the environmental sciences: statistical methods and applications
- parameter and prediction uncertainty in an optimized terrestrial carbon cycle model: effects of constraining variables and data record length
- bayesian calibration of simulation models for supporting management of the elimination of the macroparasitic disease, lymphatic filariasis
- an improved state-parameter analysis of ecosystem models using data assimilation. ecological modelling
- bayesian calibration of mechanistic aquatic biogeochemical models and benefits for environmental management
- the model-data fusion pitfall: assuming certainty in an uncertain world
- transmission dynamics and control of severe acute respiratory syndrome
- curtailing transmission of severe acute respiratory syndrome within a community and its hospital
- transmission dynamics of the etiological agent of sars in hong kong: impact of public health interventions
- philosophical issues in model assessment. model validation: perspectives in hydrological science
- a sequential monte carlo approach for marine ecological prediction
- a sequential bayesian approach for hydrologic model selection and prediction
- inferences about coupling from ecological surveillance monitoring: approaches based on nonlinear dynamics and information theory. towards an information theory of complex networks
- using model-data fusion to interpret past trends, and quantify uncertainties in future projections, of terrestrial ecosystem carbon cycling
- rate my data: quantifying the value of ecological data for the development of models of the terrestrial carbon cycle
- estimating parameters of a forest ecosystem c model with measurements of stocks and fluxes as joint constraints. oecologia
- global mapping of lymphatic filariasis
- adaptive management and the value of information: learning via intervention in epidemiology
- general rules for managing and surveying networks of pests, diseases, and endangered species
- geographic and ecologic heterogeneity in elimination thresholds for the major vector-borne helminthic disease, lymphatic filariasis
- modelling strategies to break transmission of lymphatic filariasis-aggregation, adherence and vector competence greatly alter elimination
- mathematical modelling of lymphatic filariasis elimination programmes in india: required duration of mass drug administration and post-treatment level of infection indicators
- mathematical models for lymphatic filariasis transmission and control: challenges and prospects. parasite vector
- the lymfasim simulation program for modeling lymphatic filariasis and its control
- the dynamics of wuchereria bancrofti infection: a model-based analysis of longitudinal data from pondicherry
- parameter sampling capabilities of sequential and simultaneous data assimilation: ii. statistical analysis of numerical results
- an evaluation of models for partitioning eddy covariance-measured net ecosystem exchange into photosynthesis and respiration
- normalized measures of entropy
- entropyexplorer: an r package for computing and comparing differential shannon entropy, differential coefficient of variation and differential expression
- impact of 10 years of diethylcarbamazine and ivermectin mass administration on infection and transmission of lymphatic filariasis
- controlling the false discovery rate-a practical and powerful approach to multiple testing
- epidemiological and entomological evaluations after six years or more of mass drug administration for lymphatic filariasis elimination in nigeria
- the study of plant disease epidemics
- bayesian data assimilation provides rapid decision support for vector-borne diseases
- modelling the elimination of river blindness using long-term epidemiological and programmatic data from mali and senegal. epidemics
- socio-ecological dynamics and challenges to the governance of neglected tropical disease control. infectious diseases of poverty
- early-warning signals for critical transitions
- early warning signals of extinction in deteriorating environments
- practical limits for reverse engineering of dynamical systems: a statistical analysis of sensitivity and parameter inferability in systems biology models
- multi-sensor model-data fusion for estimation of hydrologic and energy flux parameters
key: cord-205559-q50vog59 authors: zhang, lelin; nan, xi; huang, eva; liu, sidong (university of technology sydney; the university of sydney business school; macquarie university) title: detecting transaction-based tax evasion activities on social media platforms using multi-modal deep neural networks date: 2020-07-27 journal: nan doi: nan sha: doc_id: 205559 cord_uid: q50vog59 social media platforms now serve billions of users by providing convenient means of communication, content sharing and even payment between different users. due to their convenient and anarchic nature, they have also been used rampantly to promote and conduct business activities between unregistered market participants without paying taxes. tax authorities worldwide face difficulties in regulating these hidden economy activities by traditional regulatory means. this paper presents a machine learning based regtech tool for international tax authorities to detect transaction-based tax evasion activities on social media platforms. to build such a tool, we collected a dataset of 58,660 instagram posts and manually labelled 2,081 sampled posts with multiple properties related to transaction-based tax evasion activities. based on the dataset, we developed a multi-modal deep neural network to automatically detect suspicious posts. the proposed model combines comments, hashtags and image modalities to produce the final output. as shown by our experiments, the combined model achieved an auc of 0.808 and an f1 score of 0.762, outperforming any single-modality model. this tool could help tax authorities to identify audit targets in an efficient and effective manner, and combat social e-commerce tax evasion at scale.
the hidden economy is perceived by tax authorities around the world, such as the australian taxation office (ato), as containing businesses that intentionally hide their income to avoid paying the right amount of taxes, primarily by not recording or reporting all cash or electronic transactions; tax evasion refers to an illegal activity in which an individual or entity deliberately evades paying a true tax liability [2] . according to the ato, the hidden economy is most common in the small business segment, where about 1.6 million small businesses operating across 233 industries are more likely to receive cash on a regular basis [2] . those caught evading taxes are usually subject to criminal charges and substantial penalties [23] . social media platforms such as facebook, twitter and instagram serve billions of users by providing convenient means of communication, content sharing and even payment between different users, allowing social e-commerce activities to occur. due to their convenient and anarchic nature, it is easy for unregistered market participants to promote and conduct business activities without paying taxes. these social e-commerce activities have reached an industrial scale and become a worldwide phenomenon, and the total transaction volume is significant. for example, thailand was reported to have the world's largest social e-commerce market in 2017, with 51 percent of online shoppers buying products via social media platforms; the value of these transactions more than doubled to 334.2 billion ($10.92 billion) [20] . according to a mckinsey report [17] , indonesia's e-commerce spending is at least $8 billion a year, driven by 30 million shoppers. among them, transactions over social media platforms constituted more than $3 billion, accounting for around one third of the e-commerce market, which is projected to reach $55-65 billion by 2022.
due to the concealment through text and visual content, and the social nature of social media platforms, solicited transactions are difficult for tax authorities to detect, representing the online form of transaction-based tax evasion. tax leakage from the resulting digitalized hidden economy is significant [7] and should be studied. for example, it was reported [16] that there is serious transaction-based tax evasion from cross-border online goods sales via social media platforms between china and australia, resulting in tax losses in both countries: "up to aud $1 billion in undeclared taxable income may be slipping through the net, leaving a potential tax bill in the hundreds of millions." the advancement of machine learning, especially deep neural networks (dnn), provides powerful tools to analyze the textual and visual content on social media. many exciting applications have been developed with the aim of identifying and combating illegal activities on social media, such as cyberbullying [1] , counterfeit products [3] , substance use [8] , and drug dealing [15] . compared to other applications, detecting transaction-based tax evasion activities is a unique and challenging problem. unlike illegal substances/goods, the goods sold through tax evasion transactions could be perfectly legal if the transaction happened on a registered channel; therefore, they cannot be easily identified by simple product keywords/images. moreover, tax evasion transactions are a mixture of different product sources and selling strategies, resulting in greater varieties of textual and visual content than in most applications. to tackle the above challenges, we built a dnn-based regtech tool to automatically detect transaction-based tax evasion activities based on a dataset we collected from instagram. the contributions of the tool are two-fold: • we collected a dataset of 58,660 instagram posts related to #lipstick. we then manually labeled 2,081 sampled posts with multiple properties.
the dataset provides a solid baseline to understand sales and hidden economy activities on instagram, and facilitates the development of detection models. • we developed a regtech tool to automatically detect suspicious posts of transaction-based tax evasion activities on social media platforms. the tool utilizes a multi-modal dnn model, which combines comments, hashtags and image modalities with state-of-the-art language and image networks. our experiments show that the combined model outperforms any single-modality model, achieving an auc of 0.808 and an f1 score of 0.762. the tool could greatly increase the effective detection rate and save enforcement costs for tax authorities. it is difficult for governments to detect transactions in the hidden economy. the associated tax evasion has been a longstanding topic studied by tax administration and compliance scholars [22] . since transaction-based tax evasion activities moved onto social media platforms, traditional tax auditing detection methods have failed to be implemented effectively by tax authorities, and little research has been done. the main focus of tax authorities on social media platforms is to detect taxpayers who evade income tax. they aim to catch taxpayers directly, that is, they seek to detect whether the taxpayer honestly declared income sourced from their social media accounts [26] . each taxpayer is one detection point. in australia, the ato has hired a team of data mining experts to look at things like facebook and instagram to see whether the income reported by taxpayers matches up with their actual revenue [26] . transaction-based tax evasion differs from income tax evasion: the taxes being evaded are transaction taxes, such as the gst, and each transaction is a stand-alone detection point, giving rise to many more detection points than there are taxpayers.
the uganda government has taken advantage of technology and is collecting a social media tax based on the taxpayer's daily use of the platform, but fails to take transaction activities into account [21] . as social media becomes increasingly multi-modal and the unregistered selling activities become more sophisticated, it is essential to assess both textual (e.g., hashtags and comments) and visual (e.g., image and video) content to decide if a post on social media is intended to facilitate tax evasion transactions. a successful automatic detection system needs to handle both textual and visual information in a robust and efficient manner. thankfully, with the recent development of dnn methods, we have seen exciting breakthroughs in many computer vision (cv) and natural language processing (nlp) tasks. in cv, from alexnet [14] , googlenet [24] , resnet [9] to efficientnet [25] , deep convolutional neural networks (dcnn) are becoming more powerful in terms of accuracy, scalability and efficiency. similarly, nlp models have evolved from word embeddings (word2vec [18] ) and contextual word embeddings (elmo [19] ) to transformers (bert [6] , xlnet [28] ), increasing in sophistication. many existing works analyze textual and visual content on social media platforms using dnn models. in [1] , the authors proposed to detect cyberbullying on instagram using a variety of textual and visual features, including word2vec features from comments and dcnn features from images. in [3] , a dcnn model is used to discover counterfeit sellers on two social media platforms based on shared images. in [8] , a dcnn for images and long short-term memory (lstm) [10] for text are used to extract predictive features from instagram posts to assess substance use risk. in [15] , lstm models are used to extract features from comments and hashtags; the dual-modal features are then combined to detect illicit drug dealing activities.
in [29] , the authors proposed a framework to predict post popularity using both image and text data from the posts by a user. the majority of existing works process content of different modalities with separate nlp/cv models, then combine the features using concatenating, stacking or embedding layers to form an end-to-end dnn model. we developed our multi-modal dnn model following the same idea, with recent nlp/cv models: adapter-bert [11] for hashtags and comments, and efficientnet [25] for images. the aim is to have a modularized structure with the flexibility to change the processing model for each modality, and the extensibility to incorporate more modalities, while keeping the whole network trainable end-to-end. to build our dataset, we crawled publicly available posts and their corresponding poster information from instagram. in the proof-of-concept stage, the posts were collected using the hashtag #lipstick from the 22nd to the 26th of september, 2019. for each post, we collected the username, post timestamp, number of likes, image, post text, and all its comments (we include the original post text as the first comment due to the way instagram presents posts). we also extracted the hashtags from the comments, as they usually form a significant part of the textual information. moreover, although not used at this stage, we also collected posters' user information, including the username, number of followers, number of following users, number of posts, and the user bio. we collected a total of 58,660 posts (short-lived and duplicated posts included), then randomly sampled 3,000 posts for manual data labelling. as some of the instagram posts were deleted after a short period of time, we checked the availability of the posts in november 2019 to ensure all the labelled posts had been available for more than 50 days. 711 posts had been deleted at the time of checking. moreover, we also removed 148 duplicated posts due to the way they are displayed on instagram.
this produced a dataset of 2,081 unique posts sampled from the 22nd to the 26th of september, 2019. we labelled each collected post according to 9 properties (as shown in table 1 ): availability of the post; its relevance to the search keyword #lipstick; its selling intention; its source; its relation to the hidden economy; its image type; the language of the text; the existence of other contact details; and the type of other contact details left on the post. for transaction-based tax evasion, the key property is the post's relationship to the hidden economy, which is defined as posts by unregistered sellers and producers who intend to generate hidden economy sales. during the process, instagram is either their main source of customers, or a useful tool to direct sales to the account holder's other social media channel for the completion of the sale. to detect transaction-based tax evasion activities of instagram users, we analyzed their posts to extract features from the posted images, hashtags, and comments. as images and texts have different data structures and modalities, two dnn architectures, i.e., adapter-bert [11] and efficientnet [25] , were used for text and image feature extraction, respectively. the features were extracted automatically from the post data using the established dnn models and were mapped into a joint feature space of both image and text features. a multi-modal dnn model was then developed, which took the joint feature as input, to detect transaction-based tax evasion activities. the workflow of the proposed method is illustrated in figure 2 . the adapter-bert architecture [11] was used to extract features from the textual content, i.e., hashtags and comments. since the textual content contains multiple languages, a multilingual tokenizer was used to convert words into token vectors, which supported all the languages in our dataset as presented in table 1 .
for image feature extraction, the efficientnetb7 model [25] was used, as it represents the state-of-the-art in image classification while being 8.4 times smaller and 6.1 times faster on inference than the best existing models in the imagenet challenge [4] . the image and text features were then concatenated, resulting in a high-dimensional feature vector for each post sample. the text features extracted by adapter-bert are 768-dimensional for both the hashtag input and the comment input, and the image features extracted by efficientnetb7 are 2,560-dimensional, so the joint feature space is 768 + 768 + 2560 = 4096-dimensional. we implemented a multi-modal dnn model by combining three different basic models, namely two adapter-bert models for hashtags and comments, and an efficientnetb7 model for images. the logit layers of the three models were merged using a concatenate layer. a dropout layer was then added on top of the concatenate layer, followed by an output dense layer, where a positive prediction corresponds to a tax evasion activity and a negative prediction to a benign post. [table 1 excerpt. relevance: y, the post is relevant to #lipstick (figures 1c and 1d); n, the post is not relevant to #lipstick and shows something else (figures 1a and 1b). selling intention: judging from the text and image, does the poster have an intention to sell or will the post lead to a potential sale? y: yes, the poster has an intention to sell or the post will lead to a potential sale (figures 1b, 1c and 1d).] both adapter-bert models were implemented using the bert-for-tf2 python package (v0.14) [13] and pre-trained on the entire wikipedia dump for the top 100 languages in wikipedia [5] . the same meta-parameters were used for building the adapter-bert models: max sequence length = 64, adapter size = 64. the efficientnetb7 model was implemented using the efficientnet package (v1.10) [27] . the imagenet pre-trained weights were used to initialize the efficientnetb7 model.
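to make the shapes in the fusion step concrete, here is a minimal numpy sketch of the concatenation and output layer described above (our own illustration, not the authors' keras code: the feature dimensions come from the text, the weights are random placeholders purely to show shapes, and the dropout layer is omitted since it is only active during training):

```python
import numpy as np

rng = np.random.default_rng(0)

# per-modality features for a batch of 8 posts (dimensions from the text)
hashtag_feat = rng.standard_normal((8, 768))   # adapter-bert, hashtags
comment_feat = rng.standard_normal((8, 768))   # adapter-bert, comments
image_feat = rng.standard_normal((8, 2560))    # efficientnetb7, images

# concatenate into the joint 768 + 768 + 2560 = 4096-dimensional space
joint = np.concatenate([hashtag_feat, comment_feat, image_feat], axis=1)

# output dense layer with a sigmoid: p > 0.5 would flag a suspicious post
# (these weights are random placeholders, purely to illustrate the shapes)
w = rng.standard_normal((4096, 1)) * 0.01
b = 0.0
p = 1.0 / (1.0 + np.exp(-(joint @ w + b)))
```

in the actual tool, the three feature extractors and this head are trained jointly, keeping the whole network trainable end-to-end.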
to address the imbalanced distribution of the positive and negative samples, we set the class weights to (negative: 0.4, positive: 1.6) according to the ratio of each class sample size to the total sample size, scaled by the number of classes. the adam optimizer [12] with a learning rate of 0.0001 was used for training the model for 100 epochs. 400 posts were randomly selected for testing the model, and the remaining 1,681 posts were used for training. an internal validation set (20% of the training samples) was split from the training set and used to evaluate the model's performance during training. to test the contribution of the individual text and image inputs and the effectiveness of multi-modal inputs, we compared our proposed model with three basic dnn models using only hashtag, comment and image inputs, respectively. the same initialization method, meta-parameters, optimization method and learning rate as for the multi-modal model were used for the three basic models. we used precision, recall, f1 score, and the area under the curve (auc) to evaluate the models' performance. the results of the proposed multi-modal dnn model and the compared individual basic models are presented in table 2, and the receiver operating characteristic (roc) curves of these models are illustrated in figure 3 . the model using hashtag features has the most imbalanced performance, with the highest recall (0.89) but the lowest precision (0.444) and f1 score (0.593). the model using comment features is less imbalanced and has a substantially better f1 score (0.742). the model using image features achieved the highest precision (0.756), showing that its positive predictions were the most reliable, but its overall performance (f1 score = 0.696) was compromised by the lowest recall (0.645).
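the class-weight rule described above can be reproduced in a few lines. the counts below use the reported 464 hidden-economy posts out of 2,081, and the formula is our reading of the description, chosen because it reproduces the stated (negative: 0.4, positive: 1.6) weights:

```python
# labelled sample counts: 464 hidden-economy (positive) posts out of 2,081
n_pos = 464
n_neg = 2081 - n_pos
n_total = n_pos + n_neg
n_classes = 2

# one reading of "ratio of each class sample size to total sample size,
# scaled by the number of classes": each class is weighted by the *other*
# class's share, which equalizes the total weight assigned to each class
w_neg = n_classes * n_pos / n_total   # about 0.45, rounded to 0.4 in the text
w_pos = n_classes * n_neg / n_total   # about 1.55, rounded to 1.6 in the text

# sanity check: both classes now contribute equal total weight to the loss
assert abs(w_neg * n_neg - w_pos * n_pos) < 1e-6
```

a dictionary of these weights is what keras-style training loops accept as a `class_weight` argument.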
the differences in precision and recall between the image model and the text models indicate that the visual and textual contents may provide complementary information to each other for hidden economy activity detection. this finding is further evidenced by the improved performance of the multi-modal model (f1 score = 0.762, auc = 0.808), which outperformed any single-modality model. in order for tax authorities to effectively deploy this technology to aid in detecting transaction-based tax evasion, cost is an important factor. the very few examples of actual implementation of data-matching by tax authorities have been somewhat cost ineffective. take the work program of the ato as an example: in order to mitigate the major tax integrity risk of the hidden economy, their budget was aud 39.5 million in 2015-16, employing around 400 people, of whom only 6 were data mining experts [2] . their work includes manually viewing social media posts. in terms of cost, the proof-of-concept stage of our model was time and labor efficient: 3 labelers spent 60 hours completing the manual labelling. our model markedly improves the efficiency of detecting and confirming posts that relate to transaction-based tax evasion. without the detection model, tax officers would need to randomly select posts, and could expect to detect about 22 tax evasion activities per 100 posts (464 out of 2,081). with our method, the model identifies the suspicious posts first, and tax officers then manually confirm whether these posts relate to tax evasion. we can expect to identify about 72 real tax evasion activities out of 100 recommended suspicious posts. therefore, with the same amount of effort, the efficiency can be improved by more than 3 times. in our current labelled dataset, about 10 percent of the samples have video clips instead of images. since our current method could not take video clips as input, we replaced the video clips with random noise images for visual feature extraction.
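the efficiency claim is simple arithmetic and can be checked directly (here the model-guided hit rate of 0.72 is read off the "72 out of 100 recommended posts" figure above):

```python
# random auditing: the expected hit rate is the base rate of evasion posts
base_rate = 464 / 2081            # about 0.22, i.e. ~22 hits per 100 posts

# model-guided auditing: the expected hit rate is the model's precision on
# recommended posts (~72 hits per 100 recommended posts, as stated above)
model_hit_rate = 0.72

improvement = model_hit_rate / base_rate   # about 3.2
```

the ratio exceeds 3, matching the "improved by more than 3 times" statement.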
features extracted from random noise images do not provide useful information for our detection task, and may thereby compromise the performance of the model. as shown in figure 3 , a marked drop in precision was observed in the image model's performance. for future work, we will implement a video modality module to enhance the model's applicability and improve its performance. another interesting finding of the results is that textual features tend to have higher recall, whereas image features have higher precision, and the combined features can outperform any individual type of feature. it would be worthwhile to further investigate the complementary nature of the different features and to understand the relationship between them, e.g., what words and images would have higher weights in detecting tax evasion activities or in other applications. in conclusion, we developed a regtech tool that automatically detects transaction-based tax evasion activities on social media platforms. in the proof-of-concept stage, we collected a dataset of instagram posts about #lipstick and manually annotated sampled posts with multiple labels related to sales and tax evasion activities. the dataset provides a solid baseline to understand sales and hidden economy activities on instagram. we then developed a multi-modal dnn model to automatically detect transaction-based tax evasion activities from the posts. we adopted a modularized structure for the dnn model, so that the processing submodule for each modality can be changed and new modalities can be incorporated easily. the evaluation results confirm the efficiency and effectiveness of the regtech tool. this tool could help tax authorities to identify audit targets in an efficient and effective manner, and combat social e-commerce tax evasion at scale.
as a roadmap to a full-fledged tool, we aim to extend the detection capability to more products (other than lipstick), and eventually to have a robust detection model even if the product is unseen by the model. this continuous process will also expand our novel dataset. we will make this dataset available to the research community once we develop a prototype of the full-fledged tool. moreover, we plan to incorporate an additional modality to further improve the detection rate (e.g., the tax evasion nature of 58 out of the 2,081 posts can only be decided by a combination of the post content and the poster's information). this proof-of-concept model has attracted attention from both the state administration of taxation in the people's republic of china (the prc) and the australian federal police (afp). our team has formed a collaborative relationship with the tax science and research institute, a department under the state administration of taxation in the prc. collaborating with their frontier research lab, we aim to test this project as part of the design stage of china's golden tax project, with the plan to incorporate other products that are popular with daigous into our model design. for example, due to covid-19, there is a worldwide shortage of medical protective equipment/masks, and the prc government is keen to regulate the unregistered sales of those products. we are waiting for them to provide a list of these popular products. as for the collaboration with the afp, the detection of hidden economy transactions on social media platforms covers several legal aspects, and the afp is also interested in the regulation of other legal issues such as infringement of ip, fake products and smuggling.
deep learning for detecting cyberbullying across multiple social media platforms. strategies and activities to address the cash and hidden economy. deep learning-based online counterfeit-seller detection. imagenet: a large-scale hierarchical image database. bert-base, multilingual cased, list of languages. bert: pre-training of deep bidirectional transformers for language understanding. concept, motives and channels of digital shadow economy: consumers' attitude. identifying substance use risk based on deep neural networks and instagram social media data. deep residual learning for image recognition. long short-term memory. parameter-efficient transfer learning for nlp. adam: a method for stochastic optimization. a keras tensorflow 2.0 implementation of bert, albert and adapter-bert. imagenet classification with deep convolutional neural networks. a machine learning approach for the detection and characterization of illicit drug dealers on instagram: model evaluation study. chinese grey market a '$1 billion tax black hole'. the network of chinese personal shoppers known as daigou have been quietly operating a grey market for years, but do they pay tax? the digital archipelago: how online commerce is driving indonesia's economic development. distributed representations of words and phrases and their compositionality. thailand to launch value-added tax in 2020. millions of ugandans quit internet services as social media tax takes effect. the curse of cash: how large-denomination bills aid crime and tax evasion and constrain monetary policy. an economic perspective on tax evasion. going deeper with convolutions. efficientnet: rethinking model scaling for convolutional neural networks. tax office trawls facebook, instagram and other social media to catch out dodgers. 2020. implementation of efficientnet model.
keras and tensorflow keras. xlnet: generalized autoregressive pretraining for language understanding. how to become instagram famous: post popularity prediction with dual-attention. we acknowledge our data and labeling team: alex huang, pei-wen sophie zhong, and jun zhao. special thanks to fujitsu australia limited for providing the computational resources for this study. key: cord-229393-t3cpzmwj authors: srivastava, ajitesh; prasanna, viktor k. title: learning to forecast and forecasting to learn from the covid-19 pandemic date: 2020-04-23 journal: nan doi: nan sha: doc_id: 229393 cord_uid: t3cpzmwj accurate forecasts of covid-19 are central to resource management and to building strategies to deal with the epidemic. we propose a heterogeneous infection rate model with human mobility for epidemic modeling, a preliminary version of which we successfully used during the darpa grand challenge 2014. by linearizing the model and using weighted least squares, our model is able to quickly adapt to changing trends and provide extremely accurate predictions of confirmed cases at the level of countries and states of the united states. we show that during the earlier part of the epidemic, using travel data improves the predictions. training the model to forecast also enables learning characteristics of the epidemic. in particular, we show that changes in model parameters over time can help us quantify how well a state or a country has responded to the epidemic. the variations in parameters also allow us to forecast different scenarios, such as what would happen if we were to disregard social distancing suggestions. the recent outbreak of covid-19 and the world-wide panic surrounding it call for urgent measures to contain the epidemic. predicting the speed and severity of infectious diseases like covid-19 and allocating medical resources appropriately is central to dealing with epidemics.
the role of data science in this area was brought to light when, in december 2013, the first cases of chikungunya virus appeared in the americas. in 2014, darpa announced a grand challenge [1] to predict the spread of chikungunya virus in 55 countries of the western hemisphere. through monthly predictions and re-evaluation over seven months, darpa announced the top winners of the challenge, which included our team [2] . while many winning methods relied on manual adjustments for generating predictions, ours was a completely automated approach, and thus more generalizable. however, training such a model on rapidly changing epidemic trends is difficult. often, epidemic models are trained through numerical solutions to differential equations [3] or through bayesian inference [4, 5] . instead, we transform the model into a linear system and train it using weighted least squares. the more recent data are weighted more heavily, to adapt to rapidly changing trends. further, we explore various hyper-parameter selection strategies to identify the best model. learning the model also enables understanding of the dynamics of the epidemic and how they have changed over time. we utilize the changing parameters to study how various countries and us states have responded to the epidemic. we do so by proposing two measures: (i) a contact reduction score, which measures how much a region has reduced transmission; and (ii) an epidemic reduction score, which measures how much reduction in confirmed cases a region has achieved compared to a hypothetical scenario in which the trends had remained the same as on a reference day in the past. learning the model further enables analysis of scenarios into the future, for instance, what the trend of the epidemic would be if social distancing orders were lifted. while we perform our analysis at the state and country level here, in the future we plan to generate similar predictions at the county and city level.
applying such machine learning-based models at a finer level (from countries to states/cities) and a larger scale (more 'regions' of the world) brings unique challenges in terms of unreported/noisy data and a large number of model parameters, which will be explored in future work. in this paper, we present some of our initial results on country- and state-level predictions. we forecast the number of reported cases for us states and for all countries. we understand that the majority of infections are conjectured to be unreported [6] . we still believe that the number of reported cases is a good indicator of the stress on the healthcare system. henceforth, "number of infected cases" refers to the number of reported cases. in future work, we plan to incorporate modeling of unreported cases informed by various ongoing antibody studies. we are also developing an interactive, customizable tool that can be used to perform predictions using our model. a preliminary version of the tool has been made publicly available. we propose an epidemic model ( fig. 1 ) for the spread of a virus like covid-19 across the world which captures (i) temporally varying infection rates, (ii) arbitrary regions, and (iii) human mobility patterns. within every region (hospital/city/state/country), an individual can exist in one of two states: susceptible and infected. a susceptible individual gets infected when in contact with an infected individual, at a rate depending on when that individual got infected, i.e., the rate of infection is β_1 for an individual infected at t − 1, β_2 for an individual infected at t − 2, and so on, thus resulting in k sub-states of infection. the hypothesis is that how actively one passes on the infection is affected by when one got infected. we assume that after being infected for a certain time, individuals no longer spread the infection, i.e., there exists a k such that β_i = 0 for all i > k.
1 https://jaminche.github.io/covid-19/
also, people traveling from other regions can increase the number of infections in a given region. we assume that this infection can happen because of human mobility. suppose f(q, p) represents mobility from region q to region p. our model is represented by the following system of equations, where s^p_t and i^p_t represent the number of susceptible individuals and infected individuals respectively in region p at time t, and the parameter δ captures the influence of passengers coming into the region. to deal with the fact that using large values of k may overfit the model and that the data is likely to be noisy, we incorporate another hyper-parameter j in eq 2: this creates a dependency on the last k·j days with k parameters while having a "smoothing" effect due to combining infections over j days. note that if we set k = 1, j = ∞, and ignore mobility (δ = 0), this reduces to the susceptible-infected (si) model [7]. on the other hand, with bounded k = 1 and j < ∞, the model is a variation of the susceptible-infected-recovered (sir) model [8], where an infected individual is active for j units of time. while the model is applicable to any definition of regions, in this work we focus on country-level and us state-level forecasts only. further, there is a lag between an individual contracting the virus and their case being reported. we do not account for this lag, assuming that it is constant. county-level and city-level forecasting accounting for the lag and unreported cases is planned for future work. to train the model, we linearize it by setting δ^p_i = δβ^p_i and learning it as an independent parameter. this makes the model more general by allowing different infection rates for the travelers. this is different from the traditional approach of fitting one curve to the data with fixed initial values, which is computationally expensive and cannot capture rapidly changing trends.
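the displayed system of equations did not survive extraction, so the following numpy sketch encodes one plausible reading of the update rule described above: new infections on day t depend on the infections accumulated in each of the k windows of j days, scaled by the susceptible fraction, plus a mobility term. the function name and signature are illustrative assumptions, not the authors' code.

```python
import numpy as np

def predict_new_infections(I, t, beta, j, S_frac=1.0, travel=0.0):
    """Predicted new infections on day t for one region.

    I is the cumulative-infections series and beta holds the k sub-state
    rates. Each beta[i] multiplies the infections that occurred in the
    i-th block of j days, so the prediction depends on the last k*j days
    (the "smoothing" over j described in the text). S_frac stands in for
    S_t/N and `travel` collects the delta-weighted inflow from other
    regions.
    """
    k = len(beta)
    total = 0.0
    for i in range(1, k + 1):
        # infections accumulated in the window (t - i*j, t - (i-1)*j]
        window = I[t - (i - 1) * j] - I[t - i * j]
        total += beta[i - 1] * window
    return S_frac * total + travel
```

with k = 1 and a very large j this collapses to the si-style dependence on all past infections noted in the text.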
our model can then be written as a linear equation in ∆i^p_t, which allows us to learn the parameters using a constrained linear solver. this formulation works only if k and j are the same for all regions; to allow different hyper-parameters for different regions, we used a further simplification. and, to incorporate the fast-evolving trend of covid-19 due to changing policies, we use weighted least squares to learn the parameters β^p_i and δ^p_i from the available reported data. the best fit in the least-squares sense minimizes the sum of squared weighted residuals, i.e., the difference between observed data and predicted values provided by our learned model. we incorporate a forgetting factor α ≤ 1 in our minimization to put more weight on the recent infection trend when learning the model; lower α implies more emphasis on the more recent data. we compute the minimum of the sum of squares of weighted errors, where ∆î^p_t is the actual number of newly infected individuals. we present our analysis on two datasets -(i) global: country-level data with each country defined as a region; and (ii) us states: state-level data for the united states with each state defined as a region. country-level infections were derived from confirmed cases obtained from the jhu csse covid19 dataset [9]. populations of the countries were obtained from the world bank dataset [10]. inter-country travel data were obtained from the kcmd global transnational mobility dataset [10], which makes the estimation on the basis of global statistics on tourism and air passenger traffic. numbers of infections for us states were obtained from confirmed cases compiled by the new york times [11]. the population of each state was obtained from the us census bureau [12]. inter-state travel was estimated from the number of flights flying between airports in the united states [13]. the hyper-parameters to be picked include k, j, and α.
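a minimal sketch of the weighted least-squares step with a forgetting factor, assuming the residual for day t is weighted by α^(T−t) so the most recent day carries weight one; the non-negativity constraints mentioned in the text are omitted here.

```python
import numpy as np

def fit_weighted(X, y, alpha=0.9):
    """Weighted least squares with forgetting factor alpha <= 1.

    Row t of X holds the lagged-infection features for day t and y[t]
    the observed new infections. The residual for day t gets weight
    alpha**(T - 1 - t), so recent days dominate. Solved by rescaling
    the rows with sqrt-weights and calling an ordinary solver.
    """
    T = len(y)
    w = alpha ** np.arange(T - 1, -1, -1)   # weight 1 on the last day
    sw = np.sqrt(w)
    theta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return theta
```

reweighting rows and solving the ordinary normal equations is the standard trick here: it keeps the closed-form solution while letting α tune how fast old data is forgotten.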
the best hyper-parameter set is identified by a grid search on k, j, and α, which minimizes the root mean squared error over a validation set of h = 3. we pick α from {0.1, 0.2, . . . , 1.0}, while k, j ∈ {1, 2, . . . , 14}. we enforce j·k ≤ 14 days so that the dependence of new infections is bounded by the previous two weeks; this is along the lines of the motivation for 14 days of quarantine 2 . we experiment with two methods of setting the hyper-parameters: (i) fixed -the same set of hyper-parameters for all countries/states; (ii) variable -specialized parameters for each state/country. the results are presented next. we evaluated the results for a horizon h = 3 with the two hyper-parameter selection schemes, termed si-kjα (variable) and si-kjα (fixed). the best hyper-parameters were selected by further splitting the training data, setting aside the last three days for validation. we compare these results against a recent generalized version of the seir model [14]. besides rmse, we also measure the mean absolute percentage error. the comparison is shown in tab. 1. we used a publicly available implementation of the baseline 3 . we observe that both fixed and variable hyper-parameter selections widely outperformed the baseline. the performances of the fixed and variable schemes were comparable. we use the average of both predictions as an 'ensemble' prediction, which performed better than either scheme individually. it should be noted that the number of cases across different countries differs by orders of magnitude. filtering out countries with a small number of cases significantly reduces mape but increases rmse. as an example, considering only the us for country-level prediction, the rmse for all three (fixed, variable, and ensemble) is 6886.9, which translates to 1.12% mape. all the code used in our experiments is available on github 4 .
(table 1 fragment, partially lost in extraction: 11.37% · gen-seir [14] 2106.4 · 14.31% · 7471.2* · 41.06%*)
our approach with the variable hyper-parameter set has the best performance in terms of both mape and rmse. *for some countries the baseline method could not converge, and for some the produced mape was greater than 100%. we ignore those countries to favor the baseline, which results in averaging over 147 countries. fig. 2 shows the test results using the si-kjα (ensemble) prediction on h = 5 days for four us states. these were arbitrarily selected; other states show similar results as well. fig. 3 shows the test results on h = 5 days for four countries. again, these were arbitrarily selected. note that the predictions are extremely accurate. for these results, we set all mobility terms (f(p, q)) to zero to reflect that, due to the "stay-at-home" policy around the world, the recent spread due to travel is negligible. this also reduces the number of parameters to be learned. for an earlier date, we include mobility and discuss the details next. to measure the effect of travel, we train our model with data until march 18th, with and without travel data. the next three days are used as a validation set to identify the best hyper-parameters and the following three days are used as a test set for evaluation. table 2 shows validation and test errors for both the variable and fixed hyper-parameter selection schemes, each with and without travel. observe that in all cases including travel data reduces rmse on the test set. for us states, the variable hyper-parameter scheme performs better, while for global, the fixed scheme has a lower rmse. this may be due to a lack of enough data resulting in overfitting of the variable hyper-parameter scheme, as evidenced by the fact that the validation errors are much lower than the test errors. this observation also supports the decision of going forward with the ensemble approach instead of choosing one of the variable or fixed schemes.
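the grid-search procedure above can be sketched end to end; the helper and its simplifications (single region, no mobility term, no non-negativity constraint, plain validation rmse) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from itertools import product

def make_features(I, t, k, j):
    # lag-window features: infections accumulated in each of k blocks of j days
    return [I[t - (i - 1) * j] - I[t - i * j] for i in range(1, k + 1)]

def grid_search(I, horizon=3):
    """Return the (k, j, alpha) minimising validation RMSE.

    I is the cumulative confirmed-case series. The last `horizon` days
    are held out; each candidate model is fit by weighted least squares
    with forgetting factor alpha and scored by RMSE on the held-out new
    infections. k*j <= 14 bounds the dependence to two weeks.
    """
    I = np.asarray(I, dtype=float)
    T = len(I)
    dI = np.diff(I)                       # daily new infections
    best, best_rmse = None, np.inf
    for k, j in product(range(1, 15), repeat=2):
        if k * j > 14 or k * j >= T - horizon - 1:
            continue
        ts = np.arange(k * j + 1, T)      # days with full feature windows
        X = np.array([make_features(I, t, k, j) for t in ts])
        y = dI[ts - 1]                    # new infections on day t
        tr = ts < T - horizon             # training vs validation split
        for alpha in np.arange(0.1, 1.01, 0.1):
            w = alpha ** np.arange(tr.sum() - 1, -1, -1)
            sw = np.sqrt(w)
            beta, *_ = np.linalg.lstsq(X[tr] * sw[:, None], y[tr] * sw,
                                       rcond=None)
            rmse = np.sqrt(np.mean((X[~tr] @ beta - y[~tr]) ** 2))
            if rmse < best_rmse:
                best, best_rmse = (k, j, round(alpha, 1)), rmse
    return best, best_rmse
```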
since the speed of infection is driven by β^p_i, we can assess the effect of a region's effort to battle covid-19 by the change observed in these parameters. we measure the number of transmissions a susceptible person receives from an infected individual, assuming the infections are uniformly distributed across all sub-states. another approach to define this quantity is to measure the number of new infections, given that past infections are uniformly increasing. this allows us to quantify a measure of "contact" without relying on the state of individuals (infected/susceptible) in the population. we define the contact reduction score (crs) for a region (country/state) as the fractional change in this number of transmissions. we picked a date in the middle of march as the reference day to learn τ^p_n (old), and compared against τ^p_n (new) obtained from training up to april 10. since the ensemble approach for hyper-parameter selection has the best performance, we compute the number of transmissions as the average of the transmissions obtained from both schemes. further, we also define an epidemic reduction score (ers), which measures the fractional reduction in the number of infections compared to the scenario where the trend of infection at the reference day had continued. the reduction scores for us states and global countries are shown in fig. 4. we ignored all the regions that had fewer than 100 cases for us state-level and 1000 cases for country-level data on the reference day in order to make the comparison reliable. among the 30 us states that qualify based on the threshold of 100 cases at baseline, the state with the best crs was new jersey, and minnesota scored the worst. mississippi has the top ers, with massachusetts at the bottom of the list. mississippi is also close to new jersey in crs. we note that mississippi has been awarded between c and f since the reference day to early april for its "social distancing" score by unacast 5 .
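as a sketch, one reading of τ and crs under the stated uniformity assumption is that each of the k sub-states contributes its β_i for j days, giving τ = j·Σβ_i, and crs is the fractional drop between the reference-day and current parameters. both the exact formula and the function names here are assumptions for illustration, not the authors' definition.

```python
import numpy as np

def transmissions(beta, j):
    """Total transmissions from one infected individual, assuming the
    active infections are spread uniformly over the k sub-states: each
    sub-state contributes beta_i for j days (a hedged reading of tau)."""
    return j * float(np.sum(beta))

def contact_reduction_score(beta_old, j_old, beta_new, j_new):
    """CRS: fractional drop in transmissions between the reference-day
    model and the current model."""
    t_old = transmissions(beta_old, j_old)
    t_new = transmissions(beta_new, j_new)
    return (t_old - t_new) / t_old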
this suggests that while the "social distancing" score captures the percentage reduction in average mobility, it does not provide the complete picture of the changes in infection dynamics. a visual inspection reveals that mississippi has indeed shown significant change (fig. 5) compared to the reference day, when the number of infections was increasing much more rapidly. at country level, brazil has the best crs with japan at the bottom, among 22 countries selected based on a threshold of 1000 infections at the reference day. based on ers, the us tops the list with japan at the bottom again. we emphasize that these rankings are sensitive to the choice of reference day. while both crs and ers measure a region's response to the epidemic, we believe that crs is a better metric to evaluate a region's efforts. this is because crs only depends on the model parameters, which can be controlled by limiting contact with others and by other policies that reduce transmission. on the other hand, ers depends on the number of current/past infections, which is not completely controlled by changing policies. for instance, a country which already has a large number of cases (yet significantly less than its susceptible population) will experience a higher increase compared to a country with fewer cases, even when they have identical model parameters. we can use our models not only to generate forecasts, but also to emulate scenarios. as an example, we compute forecasts with two models: (i) a model trained on the most recent data with no inter-region mobility, and (ii) a model trained on data until mid-march (the reference day) including inter-region mobility. the reference day was set to the middle of march because the majority of us states and countries had not yet actively imposed "social distancing" suggestions. the results with both models are shown in figs. 6 and 7. instead of showing a single curve for the forecasts, we show a range owing to the difference between the fixed and variable hyper-parameter schemes.
all plots suggest that, even though the current trend is leading to a limited number of infections, immediately releasing all precautions can result in a rapid rise of the epidemic. we have proposed two schemes for selecting hyper-parameters, fixed and variable. while both schemes have similar performance, the variable scheme has a significant difference between test and validation error, suggesting that it is prone to over-fitting. on the other hand, with more data this issue may be resolved. while currently our 'ensemble' approach is simply the average of the predictions from both schemes, the ideal approach may be a hybrid one, where a collection of regions shares the same hyper-parameters. this is especially useful when there are very few cases in a region. then the hyper-parameters, and even the parameters, for this region can be set equal to those of a "similar" region for which more data is available. one way to identify such similar regions is by clustering regions based on their trends in a previous epidemic. in this way, active reporting for covid-19 can be used to improve forecasts of the next epidemic at an early stage. we will explore this in future work. we have only modeled reported cases, as they are an indicator of the stress on the healthcare system. however, unreported cases may affect the long-term dynamics. the unreported cases can be classified into two categories: (i) unreported cases - those who get infected over the course of the epidemic but do not report it; and (ii) immune cases - those who have the antibodies without being infected during the epidemic. for unreported cases, we can add another state to our model: for an individual in the i-th "infected" sub-state, a report will be made with probability γ^p_i. thus, the total number of new reported cases is given by ∆r^p_t = Σ_{i=1}^{k} γ^p_i ∆i^p_{t−i}. then the parameters will be learned by fitting the reported cases to r^p_t. the immune cases can be modeled by considering them not susceptible.
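the reporting equation just given can be sketched directly in plain python (names are illustrative):

```python
def new_reported(dI, t, gamma):
    """Expected newly reported cases on day t: an individual in the
    i-th infected sub-state is reported with probability gamma[i-1],
    so dR_t = sum_i gamma_i * dI_{t-i}, as in the extension described
    above."""
    return sum(g * dI[t - i] for i, g in enumerate(gamma, start=1))
```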
suppose ρ^p is the probability that a randomly selected individual in region p is immune. then the number of susceptible individuals at time t is given by s^p_t = (1 − ρ^p)·n^p − i^p_t. we can integrate these cases, complemented by the various ongoing antibody studies and surveys, to assist with more accurate learning of the parameters. learning these parameters from the data without considering those studies is difficult. this is due to the fact that the number of total cases, even if it is 100 times the reported cases, is currently a very small fraction of the population. as of now, we allow γ^p_i and ρ^p to be inputs to our model. these can be used to study long-term scenarios with various inputs. for example, fig. 8 shows the number of new positive (reported) cases for various values of γ^p_i = γ.
references (numbering as cited in the text; one entry appears to have been lost in extraction):
[1] darpa forecasting chikungunya challenge.
[2] chikv challenge announces winners, progress toward forecasting the spread of infectious diseases.
[3] the estimation of the effective reproductive number from disease outbreak data.
[4] statistical inference in a stochastic epidemic seir model with control intervention: ebola as a case study.
[5] tracking epidemics with google flu trends data and a state-space seir model.
[6] internationally lost covid-19 cases.
[7] behaviors of susceptible-infected epidemics on scale-free networks with identical infectivity.
[8] dynamics of measles epidemics: estimating scaling of transmission rates using a time series sir model.
[9] novel coronavirus covid-19 (2019-ncov) data repository by johns hopkins csse.
[10] (entry lost; cited for the world bank population data and the kcmd global transnational mobility dataset.)
[11] coronavirus (covid-19) data in the united states.
[12] state population totals.
[13] airport, airline and route data.
[14] epidemic analysis of covid-19 in china by dynamical modeling.
this work was supported by national science foundation award no. 2027007. the authors would like to thank frost tianjian xu for preparing datasets and jamin chen for integrating our methods into a web-based visualization.
the authors also thank prathik rao and kangmin tan for implementing and testing various ml approaches. we have proposed a heterogeneous infection rate model with human mobility to model the spread of covid-19 at country and state level. our model incorporates a forgetting factor to quickly adapt to the rapidly changing trends due to the changing policies of how we respond to the epidemic. with weighted least-squares training and identification of the right hyper-parameters, we are able to achieve highly accurate predictions. the parameters obtained over different time intervals allow us to measure how various regions (states/countries) have responded to the epidemic. in particular, we have defined a contact reduction score and an epidemic reduction score that respectively measure the reduction in transmission, and the reduction in epidemic spread compared to a hypothetical scenario in which the trend at a prior reference point were to continue. these varying parameters also represent different policies of the past, and hence allow us to simulate scenarios into the future. for instance, using the parameters learned up to the point before "social distancing" orders, we can forecast the trajectory of the epidemic if everyone were to stop taking distancing precautions.

key: cord-147202-clje3b2r authors: ghanam, ryad; boone, edward l.; abdel-salam, abdel-salam g. title: seird model for qatar covid-19 outbreak: a case study date: 2020-05-26 journal: nan doi: nan sha: doc_id: 147202 cord_uid: clje3b2r the covid-19 outbreak of 2020 has required many governments to develop mathematical-statistical models of the outbreak for policy and planning purposes. this work provides a tutorial on building a compartmental model using susceptible, exposed, infected, recovered and dead statuses through time. a bayesian framework is utilized to perform both parameter estimation and predictions. this model uses interventions to quantify the impact of various government attempts to slow the spread of the virus.
predictions are also made to determine when the peak of active infections will occur. coronavirus disease (covid-19) (wu et al. ( ); rezabakhsh, ala, and khodaei ( )) is a severe pandemic affecting the whole world with a fast-spreading regime, requiring strict precautions to keep it under control. as there is no cure or targeted treatment yet, establishing those precautions becomes inevitable. these limitations (giuliani, et al. ( )) can be listed as social distancing, closure of businesses and schools, and travel prohibitions (chinazzi et al. ( )). the coronavirus is a new human betacoronavirus that uses a densely glycosylated spike protein to penetrate host cells. covid-19 belongs to the same family classification as nidovirales, viruses that use a nested set of mrnas to replicate, and it further falls under the subfamily of alpha, beta, gamma and delta covs. the virus that causes covid-19 belongs to the betacoronavirus b lineage and has a close relationship with sars species. it is a novel virus, since monoclonal antibodies do not exhibit a high degree of binding to sars-cov-2. replication of the viral rna occurs when rna polymerase binds and re-attaches to multiple locations (mcintosh ( ); fisher and heyman ( )). cases of covid-19 started in december 2019 when a strange condition was reported in wuhan, china. this virus has a global mortality rate of . %, which makes it more severe in relation to flu. the elderly who have other pre-existing illnesses are succumbing more to covid-19. people with only mild symptoms recover within to days, while those with conditions such as pneumonia or severe diseases take weeks to recover. the recovery percentage of patients, for example, in china stands at %. the recovery percentage rate of covid-19 is expected to hit % (who ( )). the virus has spread from china to other countries and territories across the globe.
from wuhan, hubei province, the virus spread to mainland china, thailand, japan, south korea, vietnam, singapore, italy, iran, and other countries. the state of qatar was one of the countries affected by the covid-19 spread; the first infected case was reported on the th of february, and qatar could be considered the nd highest in the arab world in number of confirmed cases as of may , . for effectively specifying such security measures, it is essential to have a real-time monitoring system of the infection, recovery and death rates. the aim of this work is to develop, implement and deploy a data-driven forecasting model for use by stakeholders in the state of qatar to deal with the covid-19 pandemic. the model will focus on infected, deaths and recovered, as those are the only data available at this time. this document is organized in the following manner: first, the seird model that is employed is defined; next, the available data is introduced and described; then it is shown how interventions are incorporated into the model. let s(t) be the number of people susceptible at time t, e(t) the number of people exposed at time t, i(t) the number of infected at time t, r(t) the cumulative number of recovered at time t and d(t) the cumulative number of deaths at time t. this can be modeled with a system of ordinary differential equations in which α is the transmission rate from susceptibles to exposed, β is the rate at which exposed become infected, γ is the rate at which infected become recovered and η is the mortality rate for those infected. notice that this model formulation makes several key assumptions: 1. immigration, emigration, natural mortality and births are negligible over the time frame and hence are not in the model. 2. once a person is in the infected group, they are quarantined and hence do not mix with the susceptible population. 3. the recovered and deaths compartments are for those who are first infected.
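the displayed ode system was lost in extraction; under the parameter descriptions just given, one consistent sketch is a mass-action seird with an unnormalised α·s·i term (which matches the very small α magnitudes reported later), integrated here by forward euler. this is a hedged reading, not the authors' exact system.

```python
import numpy as np

def seird_step(state, params, dt=0.1):
    """One forward-Euler step of a SEIRD system consistent with the text:
      S' = -alpha*S*I,  E' = alpha*S*I - beta*E,
      I' = beta*E - gamma*I - eta*I,  R' = gamma*I,  D' = eta*I.
    """
    S, E, I, R, D = state
    alpha, beta, gamma, eta = params
    dS = -alpha * S * I
    dE = alpha * S * I - beta * E
    dI = beta * E - gamma * I - eta * I
    dR = gamma * I
    dD = eta * I
    return state + dt * np.array([dS, dE, dI, dR, dD])

def simulate(state0, params, days, steps_per_day=10):
    # integrate the system, recording the state once per day
    dt = 1.0 / steps_per_day
    state = np.asarray(state0, dtype=float)
    daily = [state.copy()]
    for _ in range(days):
        for _ in range(steps_per_day):
            state = seird_step(state, params, dt)
        daily.append(state.copy())
    return np.array(daily)
```

because the five derivatives sum to zero, the total population is conserved along the trajectory, matching assumption 1 above.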
there is no compartment for those exposed who do not become sick (infected) and recover on their own. a traditional analysis would include a steady-state analysis; however, in this case the short-term dynamics are of interest, so this work does not address any steady-state or equilibrium concerns. this work is concerned with fitting the model above to the covid-19 data for the state of qatar during the outbreak and using the model for forecasting several possible scenarios. the johns hopkins covid-19 github site includes, for every country and each day, the cumulative number of confirmed infections, the cumulative number of recovered and the cumulative number of deaths, starting january . the data for qatar was obtained. notice that in the model the recovered and death states are cumulative, as once one enters the compartment there is no exit. however, the infected compartment has transitions from exposed and to recovered and deaths. hence the data provided for confirmed infections is cumulative and includes both recovered and deaths, which need to be removed from this compartment's data. let ci(t) be the confirmed infections at time t and let the infected i(t) be defined as i(t) = ci(t) − r(t) − d(t). for clarity the term "active infections" will be used to denote this derived variable, versus the cumulative infected provided in the data. figure shows the plots of the active infections, recovered and deaths data for qatar for the days since february . notice that the active infections are very low until around day , when there is a large jump due to increased testing. the active infections then seem to plateau until day , after which there is extreme growth in active infections. there is a similar pattern for the recovered, with a delay reflecting the time of infection before recovery. the plot for deaths shows no deaths until day and then a steady increase in deaths for the remaining days.
the state of qatar prepared an excellent, flexible plan for risk management, grounded in a national risk assessment and taking account of the global risk assessment done by who, which focuses on reinforcing capacities to reduce or eliminate the health risks from covid-19 and embeds a complete emergency risk management strategy in the health sector. furthermore, the state closed all parks and public beaches to curb the spread of coronavirus. on march , (day ), the ministry of commerce and industry decided to temporarily close all restaurants, cafes, food outlets, and food trucks in the main public areas. also, the ministry of commerce and industry decided to close all unnecessary businesses on march , (day ) hamad medical corporation ( ) and mph-qatar ( ). these interventions taken by the government change the dynamics of the system and hence need to be incorporated into the model. the next section details how we introduce interventions, both those from the government and those guided by the data. in figure , one can see the jump at day and a plateau until day . the model needs to be able to handle interventions made by the government of the state of qatar. the main parameter that policy can influence is α, the rate of transmission from susceptible to exposed. one way to implement this is to use indicator functions w_k(t) that take the value 0 before the intervention time and 1 afterwards, where t_k is the time at which the k-th intervention is taken, for k = 1, 2, ..., K. for each intervention there is a corresponding change to the value of α, denoted α_k, that captures the impact of the intervention. collecting these in a vector, this formulation gives a time-dependent transition rate α(t) between s(t) and e(t), which requires constraints ensuring α(t) > 0 for all t; the set defined by these constraints is the feasible region for α. in addition to changes in the infection rate α, impulse functions can be used to model dramatic one-time shifts in transitions between states, via a dirac delta function δ(t − τ).
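a tiny sketch of the piecewise-constant α(t) described above, in plain python; the baseline rate α_0 and the (t_k, α_k) pairs are illustrative values, and the text's constraint is that every partial sum stays positive.

```python
def alpha_t(t, alpha0, interventions):
    """Piecewise-constant transmission rate:
    alpha(t) = alpha0 + sum_k alpha_k * w_k(t), where the indicator
    w_k(t) is 1 once t >= t_k and 0 before. `interventions` is a list
    of (t_k, alpha_k) pairs."""
    return alpha0 + sum(a for tk, a in interventions if t >= tk)
```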
this can be integrated into the model to capture spikes in the number of cases. in our case the state of qatar data exhibits this type of behavior at day , where one can clearly see a large jump in the number of infections. this is incorporated into the model by a dirac delta function, δ(t − τ), in the transition rate between exposed and infected, coupled with a coefficient β_a to capture the impact of the jump. due to the complexity of the model the bayesian inferential framework is chosen. recall that bayes' formula is given by (bayes and price, ) π(θ|d) = l(d|θ)π(θ) / ∫ l(d|θ)π(θ) dθ, where π(θ|d) is the posterior probability distribution for the parameters θ given the data d, π(θ) is the prior distribution of θ and l(d|θ) is the likelihood of the data given θ. in order to specify the likelihood, the model in equation ( ) is modified to describe the mean abundance in each compartment, with the parameters having the same definitions as provided above. since there is no data for s(t) and e(t), these compartments are latent variables and do not directly factor into the likelihood. the likelihoods for i(t), r(t) and d(t) are then built around the corresponding mean functions. to specify the prior distributions for α, β_a, β, γ and η one must incorporate the constraints α(t) > 0, β > 0, γ > 0 and η > 0. hence truncated normal prior distributions are set, using an indicator function that takes the value 1 when α lies in the feasible set; this serves to truncate the normal distribution in order to keep α in the feasible range of values. the likelihood and prior distribution specifications lead to the posterior distribution when a = and σ = . the posterior distribution does not lend itself to any analytic solution; hence markov chain monte carlo (mcmc) techniques will be used to sample from the posterior distribution (gelman et al., ).
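the sampler described next can be illustrated with a miniature random-walk metropolis implementation on a toy positivity-constrained target; this is a generic sketch, not the authors' r code, and the target is a stand-in for the truncated posterior.

```python
import numpy as np

def metropolis(logpost, x0, n=5000, step=0.5, seed=0):
    """Random-walk Metropolis: propose x' ~ N(x, step^2) and accept
    with probability min(1, exp(logpost(x') - logpost(x))). In
    practice an initial burn-in stretch is discarded and `step` is
    tuned for a reasonable acceptance rate, as described in the text."""
    rng = np.random.default_rng(seed)
    x, lp = x0, logpost(x0)
    out = np.empty(n)
    for i in range(n):
        xp = x + step * rng.normal()
        lpp = logpost(xp)
        if np.log(rng.uniform()) < lpp - lp:
            x, lp = xp, lpp
        out[i] = x
    return out

# toy target: standard normal truncated to x > 0, mimicking the
# positivity constraints on the model parameters
def logpost(x):
    return -0.5 * x * x if x > 0 else -np.inf
```

rejecting any proposal with log-posterior −∞ is exactly how the indicator-function truncation in the priors acts inside the sampler.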
specifically, a metropolis-hastings sampler is used to obtain samples from the posterior distribution (gilks, richardson, and spiegelhalter, ) and (albert, ). to tune the sampler, a series of short chains were generated and analyzed for convergence and adequate acceptance rates; these initial short chains were discarded as "burn-in" samples. the tuned sampler was used to generate samples from π(α, β_a, β, γ, η|d), and trace plots were visually examined for convergence and deemed acceptable. all inferences are made from these samples. the model and sampling algorithm are custom programmed in the r statistical programming language version . . . the computation takes approximately seconds using an amd a - . ghz processor with gb of ram to obtain the samples from the posterior distribution. for more on statistical inference see wackerly, mendenhall, and scheaffer ( ), casella and berger ( ), and berger ( ). to apply the model the following initial conditions are specified: s(0), e(0), i(0), r(0) and d(0). here s(0) is the current population of the state of qatar, while i(0), r(0) and d(0) are obtained directly from the data. the choice of e(0) was a minimal value that would allow the disease to spread but not so large as to make the spread rapid; several values of e(0) were explored and the chosen value was found to have the best fit. furthermore, model interventions were placed at days t = , t = , t = , t = and t = , with a dirac delta impulse at time τ = . table shows the means, standard deviations and the 2.5%, 50% and 97.5% quantiles for the model parameters based on the samples from the posterior distribution. notice that α_0 = . × 10^− and α_1 = − . × 10^− are very close in magnitude with different signs, indicating that the first intervention drastically reduced the transmission rate. similarly, one can see that the second and third interventions, α_2 = . × 10^− and α_3 = − . × 10^−, are essentially of the same magnitude with different signs, which when added result in a very low transmission rate. however, α_4 = . × 10^− is a small increase, with a moderate decrease in α_5 = − . × 10^−, which still leaves a final transmission rate Σ_k α_k ≈ . × 10^−. of particular note is the mean mortality rate η = . ≈ 1/ , which means that about 1 in , people perish from the disease each day, which is quite low. also note that the mean infection (confirmed) rate is β = . ≈ 1/ . , which corresponds to about 1 in . exposed people becoming confirmed each day. the quantile intervals provide a 95% credible interval for the parameters and can be used to obtain a range of reasonable parameter values. for example, for the parameter β the interval is ( . , . ), meaning that the probability that β lies in ( . , . ) is 0.95. this can be used to create an interval for the risk interpretation: between 1/ . ≈ . and 1/ . ≈ . exposed people are confirmed as infected each day. this also gives insight into how many people in the population may be exposed and infectious but do not yet exhibit symptoms. recall that β_a is associated with the dirac delta function for the impulse modeling the jump in the transition rate from exposed to infected at day . notice that β_a ≈ . means that there is a one-time influx of approximately . % of the exposed people into the infected compartment; hence the increased testing captured many of the exposed. adding this to the natural exposed-to-infected rate of β = . gives the one-time transmission rate β + β_a = . + . = . , corresponding to a total of approximately . % of exposed being confirmed as infected, leaving the remaining approximately . % of exposed people still interacting with the susceptible population. many of the parameters do not lend themselves well to traditional h_0: θ = 0 hypothesis testing, as they must be positive.
we can, however, conduct simple hypothesis tests on the α parameters to look for significant changes due to interventions using contrasts, specifically the sequential contrasts α_1 − α_0, α_2 − α_1, α_3 − α_2, α_4 − α_3 and α_5 − α_4. these contrasts quantify the changes in the transmission rate from susceptible to exposed due to the interventions and are what policy makers want to see. furthermore, they want a statistical test of whether or not an intervention performed in a statistically significant manner. this can be done by simply subtracting the mcmc samples to generate the contrast of interest. using these subtracted samples one can look at the mean, standard deviation, quantiles and the proportion of samples above 0, p(> 0). table shows these quantities for the contrasts listed above. notice that the intervention at day reduced the transmission rate by approximately . × 10^−, which is considerable, and the proportion of samples above 0 was . , indicating a statistically significant change due to the intervention. the intervention taken at day , α_2 − α_1, actually increased the transmission rate, whereas the intervention taken at day , α_3 − α_2, then reduced it. similarly, the other two interventions increased and then decreased the transmission rates, respectively. furthermore, all interventions can be deemed statistically significant since p(> 0) is either . or . , indicating significance. the model formulation also allows the individual transmission rates to be computed by simply summing the α_k up to the desired time point. table gives the mean, standard deviation and the quantiles (q_0.025, q_0.5, q_0.975) for the transmission rate from susceptible to exposed across each time interval, obtained by simply adding the corresponding mcmc samples. this is another perspective on how the transmission rate changes across the time frame. notice that all of the transmission rates are positive, which is required by the model specification.
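computing such contrasts from paired mcmc draws is a one-liner; a sketch with illustrative names:

```python
import numpy as np

def contrast_summary(samples_a, samples_b):
    """Posterior contrast a - b from MCMC draws: subtract the paired
    samples, then report mean, sd, quantiles and P(>0), the quantities
    used above to judge whether an intervention changed the
    transmission rate."""
    d = np.asarray(samples_a) - np.asarray(samples_b)
    q = np.quantile(d, [0.025, 0.5, 0.975])
    return {"mean": d.mean(), "sd": d.std(ddof=1),
            "q": q, "P(>0)": (d > 0).mean()}
```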
also notice that the mean transmission rates vary in orders of magnitude from . × − to . × − . one interesting point is that the highest transmission rate is at the beginning and the lowest transmission rate is at the end. this is evidence that the interventions taken by the qatari government have ultimately reduced the transmission rate. to assess the fit of the model, the posterior predictive distribution was used, given by p(ỹ | y) = ∫ p(ỹ | θ) p(θ | y) dθ. using the , samples from the posterior distribution, , samples were generated from the posterior predictive distribution. at each time t the median, . and . quantiles were obtained to form a posterior predictive interval. figure shows the model fits for active infections, recovered and deaths with posterior predictive bands. notice that the model does quite well at fitting the dynamics of the active infections, including the jump at day , and captures the plateau and the exponential growth after the plateau as well. the recovered model fits well, as does the deaths data. to assess the explained variance, a pseudo-r² was formed using the median of the posterior predictive distribution at each time as the point estimate. this resulted in a pseudo-r² of . , which indicates the fitted model explains approximately . % of the variance in the data. based on this, the model is deemed to fit well. it should be noted that standard data-splitting procedures for model validation are difficult in this scenario, as removing values from the system may cause unstable behavior. to assess model performance, predictive performance is utilized, with days from may and may used as a test set. using the samples from the posterior distribution, the posterior predictive distribution was computed for each day of the test set, and % posterior predictive intervals were created using the . % and . % quantiles. the test data are then compared to the posterior predictive intervals for each of the endpoints.
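the band construction and pseudo-r² just described reduce to pointwise quantiles over simulated trajectories. a minimal sketch, assuming a toy infection curve and poisson noise as stand-ins for the real posterior predictive draws:

```python
import numpy as np

rng = np.random.default_rng(2)
T, n_draws = 60, 2_000
t = np.arange(T)
truth = 100.0 * np.exp(0.05 * t)  # toy active-infection curve (illustrative)

# Each row is one posterior-predictive trajectory (synthetic here).
draws = rng.poisson(truth, size=(n_draws, T)).astype(float)

# Pointwise median and 95% predictive band at every time step.
median = np.quantile(draws, 0.5, axis=0)
band_lo = np.quantile(draws, 0.025, axis=0)
band_hi = np.quantile(draws, 0.975, axis=0)

# Pseudo-R^2: variance explained, using the predictive median as the
# point estimate at each time.
y = rng.poisson(truth).astype(float)  # observed-data stand-in
ss_res = np.sum((y - median) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
pseudo_r2 = 1.0 - ss_res / ss_tot
print(f"pseudo-R^2 = {pseudo_r2:.3f}")
```

plotting `band_lo`/`band_hi` around `median` reproduces the posterior predictive bands shown in the figure.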
another view of predictive performance is to examine the pseudo-predictive-r², which compares the predicted values with the actual values for the test set. this calculation leads to a pseudo-predictive-r² ≈ . , which is slightly lower than the pseudo-r² associated with the fit of the model to the training data, but is still very high. of course, several other measures of predictive performance exist; however, this is the easiest to understand, as it measures the amount of variation explained by the predictions across the test set. this work has demonstrated how to build a seird model for the covid- outbreak in the state of qatar, include interventions, estimate model parameters and generate posterior predictive intervals using a bayesian framework. furthermore, the model is able to treat the susceptible and exposed compartments as latent variables, as no data are observed about them other than approximate initial values. the model fits the data quite well, with a pseudo-r² ≈ . , and predicts reasonably well, with a pseudo-predictive-r² ≈ . . one can also note that the model definition includes no immigration, emigration, natural births or natural mortality; based on the high pseudo-r², these would have a negligible effect on fit. furthermore, the model did not contain compartments for those who recovered without being confirmed as infected. as this was not observed, one can only speculate on the impact that additional data would have on the model fit; however, it would be very small. the modeling paradigm is quite flexible for modeling the covid- data, as it easily incorporates interventions into the system and can quantify the impact of an intervention. furthermore, using simple differences, the model can be used to predict new infections as well. figure shows the plots of the new infections with predictive bands based on the . % and . % quantiles from , samples from the posterior predictive distribution at each time point.
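the pseudo-predictive-r² is the same variance-explained formula applied to held-out days. a brief sketch with synthetic test-set values and predictions (the paper's may test days and actual numbers are not reproduced):

```python
import numpy as np

rng = np.random.default_rng(4)
# Held-out test-day observations and the corresponding posterior-
# predictive medians (synthetic stand-ins for illustration).
y_test = np.array([1500.0, 1580.0, 1660.0, 1710.0, 1800.0, 1850.0])
y_pred = y_test + rng.normal(0.0, 40.0, size=y_test.size)

# Pseudo-predictive-R^2: share of test-set variance explained by the
# predictions, mirroring the in-sample pseudo-R^2 computation.
ss_res = np.sum((y_test - y_pred) ** 2)
ss_tot = np.sum((y_test - y_test.mean()) ** 2)
pred_r2 = 1.0 - ss_res / ss_tot
print(f"pseudo-predictive-R^2 = {pred_r2:.3f}")
```

comparing this number against the training pseudo-r² gives the train-versus-test gap discussed in the text.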
notice that the model does well at capturing the jump at day and that the bands capture most of the data. the drop beginning at the intervention at day accounts for the drop in daily infection rates. another use of the model may be for long-term predictions. while this is extrapolation, it does provide policy makers a tool for planning, provided nothing changes, i.e., no interventions are taken. it also allows policy makers to see the possible long-term effects of their decisions. figure shows the long-term predictions of the model if no other interventions are made past may . notice that the predictions do eventually decrease across the future time frame. notice also the width of the predictive bands for times farther in the future; this reflects the uncertainty associated with extrapolating into the future. however, one item that can be calculated from this is a % predictive interval for the peak infection time. by simply recording the time of the maximum value for each of the predictive distribution trajectories from the mcmc samples, one can obtain a distribution of the time of the maximum. in this case this gives the % predictive interval for the maximum to be ( , ). this means that the peak infection time will be between day ( june ) and day ( august ) of the outbreak, given that no other interventions or process changes occur. fig. also shows this interval, given by the dark dashed vertical lines. the width of the interval quantifies the uncertainty about when the maximum number of active infections will occur. since the width of the interval is days, there is a large amount of uncertainty about when the number of active infections will begin to decline. future work could add an overdispersion parameter to the model to allow for more accurate capture of uncertainty. furthermore, one could perform simulation studies to better understand how the model may perform under various scenarios.
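the peak-time interval described above is one `argmax` per simulated trajectory followed by quantiles. a minimal sketch, assuming synthetic bell-shaped trajectories in place of the model's posterior predictive draws:

```python
import numpy as np

rng = np.random.default_rng(3)
n_draws, T = 2_000, 200
t = np.arange(T)

# Synthetic posterior-predictive trajectories of active infections:
# each row is a noisy curve that rises and then falls, with an
# uncertain peak location (illustrative, not the fitted model).
peak_loc = rng.normal(110.0, 15.0, size=(n_draws, 1))
traj = (np.exp(-((t - peak_loc) ** 2) / (2.0 * 25.0 ** 2))
        + rng.normal(0.0, 0.01, (n_draws, T)))

# Record the day of the maximum of each trajectory, then take the
# 2.5% and 97.5% quantiles for a 95% predictive interval on peak day.
peak_days = traj.argmax(axis=1)
lo_day, hi_day = np.quantile(peak_days, [0.025, 0.975])
print(f"95% predictive interval for peak day: ({lo_day:.0f}, {hi_day:.0f})")
```

the width `hi_day - lo_day` is exactly the uncertainty measure the text interprets.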
feature selection methods could be employed to select where the interventions should be placed, and other forms of interventions could be included in the model as well. another possibility, to address any deviations from the standard model, would be to study a semi-parametric technique. quantiles from , samples from the posterior predictive distribution. dark dashed vertical lines give the % predictive interval for the maximum active infections. the authors would like to acknowledge the state of qatar and the ministry of health for the daily updates and additional data. in addition, the authors would like to thank virginia commonwealth university in qatar and qatar university for supporting this effort. key: cord-001921-73esrper authors: lin, cheng-yung; chiang, cheng-yi; tsai, huai-jen title: zebrafish and medaka: new model organisms for modern biomedical research date: 2016-01-28 journal: j biomed sci doi: 10.1186/s12929-016-0236-5 sha: doc_id: 1921 cord_uid: 73esrper although they are primitive vertebrates, zebrafish (danio rerio) and medaka (oryzias latipes) have surpassed other animals as the most used model organisms based on their many advantages.
studies on gene expression patterns, identification of regulatory cis-elements, and gene functions can be facilitated by using zebrafish embryos via a number of techniques, including transgenesis, in vivo transient assay, overexpression by injection of mrnas, knockdown by injection of morpholino oligonucleotides, knockout and gene editing by the crispr/cas9 system, and mutagenesis. in addition, transgenic lines of model fish harboring a tissue-specific reporter have become a powerful tool for the study of biological sciences, since it is possible to visualize the dynamic expression of a specific gene in the transparent embryos. in particular, some transgenic fish lines and mutants display defective phenotypes similar to those of human diseases. therefore, a wide variety of fish models not only shed light on the molecular mechanisms underlying disease pathogenesis in vivo but also provide a living platform for high-throughput screening of drug candidates. interestingly, transgenic model fish lines can also be applied as biosensors to detect environmental pollutants, and even as pet fish displaying beautiful fluorescent colors. therefore, transgenic model fish possess a broad spectrum of applications in modern biomedical research, as exemplified in the following review. although zebrafish (danio rerio) and medaka (oryzias latipes) are primitive vertebrates, they have several advantages over other model animals. for example, they are fecund, and light can control their ovulation. spawning takes place frequently, and there is no limitation on their spawning season. microinjection of fertilized eggs is easily accessible and relatively cheap. their embryos are transparent, making it easy to monitor dynamic gene expression in various tissues and organs in vivo without the need to sacrifice the experimental subjects. their genome sizes are approximately 20 to 40 % of the mammalian genome, making them the only vertebrates available for large-scale mutagenesis.
their maturation time is only 2~3 months, which makes generating transgenic lines relatively less laborious and time-consuming. in addition, many routine techniques of molecular biology and genetics, including knock-in, knockdown and knockout, are well developed in these model fish. therefore, zebrafish and medaka are excellent new animal systems for the study of vertebrate-specific biology in vivo. the f0 transgenic line can be established once the exogenous gene is successfully transferred to the embryos, followed by stable germline transmission of the transgene to the f1 generation. generally, around 10-20 % of treated embryos have a chance to achieve germline transmission [1]. it has been reported that a foreign gene flanked with inverted terminal repeats of adeno-associated virus can be used to enhance the ubiquitous expression and stable transmission of a transgene in model fish [2]. meanwhile, transgenesis can be facilitated by using the tol2 transposon derived from medaka [3]. transposase catalyzes transposition of a transgene flanked with the tol2 sequence [4]. the efficiency of tol2-mediated germline transmission can range from 50 to 70 % of injected embryos [4, 5]. cutting-edge techniques have taken the study of fish gene transfer to new horizons, such as knockout zebrafish generated by the transcription activator-like effector nuclease (talen) system and the clustered regularly interspaced short palindromic repeats (crispr) system combined with crispr-associated protein 9 (cas9) [6, 7]. the talen system involves the dna recognition domain of transcription activator-like effectors (tales) and a nuclease domain for the generation of nicks on dna sequences. the crispr/cas9 system, directed by a synthetic single guide rna, can induce targeted genetic knockout in zebrafish. the main difference between these two systems lies in their recognition mechanisms.
unlike the tales applied in the talen system, the crispr/cas9 system recognizes its target dna fragment by a complementary non-coding rna. the development of the talen and crispr/cas9 systems provides new genome-editing approaches for establishing genetic knockout fish lines [8]. the fluorescent protein gene (fpg) has been widely applied as a reporter gene in studies of transgene expression by direct visualization under fluorescence microscopy in vivo [9]. many transgenic model fish lines harbor an fpg driven by various tissue-specific promoters, including the erythroid-specific gata promoter [10], muscle-specific α-actin promoter [11], rod-specific rhodopsin promoter [12], neuron-specific isl-1 promoter [13], pancreas-specific pdx-1 and insulin promoters [14], myocardium-specific cmlc2 promoter [15], liver-specific l-fabp promoter [16], bone-specific col10a1 promoter [17], macrophage-specific mfap4 promoter [18], and germ cell-specific vasa promoter [19]. using the medaka β-actin promoter, tsai's lab generated a transgenic line of medaka displaying green fp ubiquitously throughout the whole fish from the f0 through f2 generations in a mendelian inheritance manner [20]. this is known as the first transgenic line of glowing pet fish, which was reported by science [21] and the far eastern economic review [22] and was honored to be selected among "the coolest inventions of 2003" by time [23]. the dna sequences of the aforementioned promoters, ranging from 0.5 to 6.5 kb, are sufficient to drive the fpg reporter to mimic the tissue-specific expression of the endogenous gene. however, some genes require a longer regulatory dna sequence, such as more than 20 kb, to fully recapitulate the characteristic expression profiles of the endogenous genes. in that case, bacterial artificial chromosome (bac) and phage p1-derived artificial chromosome (pac) clones have been commonly used for this purpose [24].
for example, the zebrafish rag1 gene, flanked with pac dna containing 80 kb at the 5′ upstream and 40 kb at the 3′ downstream, can be expressed specifically in lymphoid cells. instead of using the tedious chi-site-dependent approach, jessen et al. reported a two-step method to construct a bac clone [25]. employing this protocol, chen et al. constructed a bac clone containing the upstream 150 kb region of zebrafish myf5 and generated the transgenic line tg(myf5:gfp) [26]. this transgenic line is able to recapitulate the somite-specific and stage-dependent expression of the endogenous myf5 at an early developmental stage. in summary, all the above transgenic lines should be very useful materials for studying both gene regulation and cell development. zebrafish is particularly useful for studying heart development for the following reasons: (a) zebrafish have a primitive form of the heart, which is completely developed within 48 h post-fertilization (hpf). (b) cardiac development can be easily observed in a transgenic line possessing an fp-tagged heart. (c) zebrafish embryos with a defective cardiovascular system can still keep on growing by acquiring oxygen diffused from water. (d) the discovery of genes involved in heart development can be facilitated by a simple haploid mutation method [27]. for example, using the zebrafish jekyll mutant, which has defective heart valves, walsh and stainier discovered that udp-glucose dehydrogenase is required for zebrafish embryos to develop normal cardiac valves [28]. tsai's lab was the first group to generate a transgenic zebrafish line that possesses a gfp-tagged heart [15]. this line was established from zebrafish embryos introduced with an expression construct in which the gfp reporter is driven by an upstream control region of the zebrafish cardiac myosin light chain 2 gene (cmlc2). using this transgenic line, raya et al. found that the notch signaling pathway is activated during the regenerative response [29].
shu et al. reported that na,k-atpase α1b1 and α2 isoforms have distinct roles in the patterning of the zebrafish heart [30]. this transgenic line should also be useful for studying the dynamic movement and cell fate of cardiac primordial cells. for example, forouhar et al. proposed a hydro-impedance pump model for the embryonic heart tubes of zebrafish [31]. a 4d dynamic imaging method for cardiac development has been developed [32]. furthermore, hami et al. reported that a second heart field is required during cardiac development [33]. recently, nevis et al. showed that tbx1 functions in the proliferation of the second heart field, and that the zebrafish tbx1-null mutant resembles the heart defects in digeorge syndrome [34]. thus, the expression pattern of heart-specific genes could be analyzed based on heart progenitor cells collected from this transgenic line. the analysis of gene or protein expression dynamics at different developmental stages could also be conducted. furthermore, this transgenic fish is a potential platform for detecting chemicals, drugs and environmental pollutants affecting heart development, as detailed in the following section. in vivo transient assay of injected dna fragments in model fish embryos is a simple yet effective way to analyze the function of regulatory cis-elements. for example, myf5, one of the myogenic regulatory factors (mrfs), plays key roles in the specification and differentiation of muscle primordial cells during myogenesis. the expression of myf5 is somite-specific and stage-dependent, and its activation and repression are delicately orchestrated. using in vivo transient assay, chen et al. found that a novel cis-element located at −82/−62 is essential for somite-specific expression of myf5 [35]. lee et al. revealed that this −82/−62 cis-element is specifically bound by forkhead box d3, and proposed that somite development is regulated by the pax3-foxd3-myf5 axis [36].
besides foxd3, foxd5, another protein in the forkhead box family, is necessary for maintaining the anterior-posterior polarity of somite cells in the mesenchymal-epithelial transition [37]. the expression of foxd5 is regulated by fgf signaling in the anterior presomitic mesoderm (psm), which indicates that fgf-foxd5-mesp signaling takes place in somitogenesis [37]. furthermore, analysis of the loci of the adjacent mrf4 and myf5 revealed the complicated regulation mechanism of the mrf genes. it was also found that the biological function of mrf4 is related to myofibril alignment, motor axon growth, and organization of the axonal membrane [38]. the molecular mechanism that underlies the repression of myf5 has also been reported. for example (fig. 1a), a strong repressive element of zebrafish myf5 was found within intron i (+502/+835) [39]. this repressive element is modulated by a novel intronic microrna, termed mir-in300 or mir-3906 [40]. when myf5 transcripts reach their highest level after specification, the accumulated mir-3906 starts to reduce the transcription of myf5 through silencing dickkopf-related protein 3 (dkk3r or dkk3a), a positive factor for the myf5 promoter [41]. itgα6b is a receptor of secretory dkk3a, and the interaction between itgα6b and dkk3a is required to drive the downstream signal transduction that regulates myf5 promoter activity in the somite during embryogenesis of zebrafish [42]. dkk3a regulates p38a phosphorylation to maintain smad4 stability, which in turn enables the formation of the smad2/3a/4 complex required for the activation of the myf5 promoter [43]. however, when myf5 transcripts are reduced at the later differentiation stage, mir-3906 is transcribed from its own promoter [40] (fig. 1b). furthermore, increased expression of mir-3906 interacts with its receptor itgα6b, resulting in the phosphorylation of p38a and the formation of the smad2/3a/4 complex, which in turn activates the myf5 promoter activity.
fig. 1 legend: (a) when myf5 is highly transcribed, the intronic mir-3906 suppresses the transcription of myf5 through silencing dkk3a [39, 41-43]. (b) at late muscle development, mir-3906 starts transcription from its own promoter and switches to silencing homer-1b to control the homeostasis of the intracellular calcium concentration ([ca2+]i) in fast muscle cells [40]. either mir-3906 knockdown or homer-1b overexpression causes an increase of homer-1b protein, resulting in an enhanced level of [ca2+]i, which in turn disrupts sarcomeric actin filament organization. in contrast, either mir-3906 overexpression or homer-1b knockdown causes a decrease of homer-1b, resulting in a reduced [ca2+]i and thus a defective muscle phenotype [40]. (end of legend) mir-3906 controls the intracellular concentration of ca2+ ([ca2+]i) in fast muscle cells through subtly reducing homer-1b expression. the homeostasis of [ca2+]i is required during differentiation to help maintain normal muscle development [40]. nevertheless, it remains to be investigated how mir-3906 switches its target gene at different developmental stages. apart from the regulation of somitogenesis, myf5 is also involved in craniofacial muscle development. the functions of myf5 in cranial muscle and cartilage development are independent of myod, suggesting that myf5 and myod are not redundant. thus, three possible pathways could be associated with the molecular regulation between myf5 and myod: (i) myf5 alone is capable of initiating myogenesis, (ii) myod initiates muscle primordia, which are subdivided from the myf5-positive core, and (iii) myod alone, but not myf5, modulates the development of muscle primordia [44]. furthermore, the six1a gene was found to play an important role in the interaction between myf5 and myod [45]. in cartilage development, myf5 is expressed in the paraxial mesoderm at the gastrulation stage.
myf5 plays a role in mesoderm fate determination by maintaining the expression of fgf3/8, which in turn promotes differentiation from neural crest cells to craniofacial cartilage [46]. this research on myf5 not only reveals that it has different functions between craniofacial muscle development and somitogenesis, but also opens up a new field of study for understanding craniofacial muscle development. hinits et al. reported no phenotype in either myf5-knockdown embryos or the myf5-null mutant, suggesting that myf5 is rather redundant in the somitogenesis of zebrafish [47]. however, it is hard to reasonably explain why these embryos and mutants are all lethal and cannot grow to adulthood. on the other hand, lin et al. reported an observable defective phenotype in myf5-knockdown embryos, and claimed that the concentration of myf5-mo they used can inhibit maternal myf5 mrna translation [46]. this discrepancy might be attributed to the effectiveness of the mo used or to the different phenotypes between the knockdown embryos and the knockout mutant in this case. the retina-specific expression of the carp rhodopsin gene is controlled by two upstream regulatory dna cis-elements [48]. one is located at −63 to −75, which is the carp neural retina leucine zipper response-like element; the other is located at −46 to −52, which is a carp-specific element crucial to reporter gene expression in medaka retinae. intriguingly, immediate activation of the early growth response transcriptional regulator egr1 could result in the incomplete differentiation of retina and lens, leading to microphthalmos [49]. another important factor for ocular development is the adp-ribosylation factor-like 6 interacting protein 1 (arl6ip1). loss of arl6ip1 function leads to the absence of retinal neurons, disorganized retinal layers and smaller optic cups [50].
upon losing arl6ip1, retinal progenitors continued to express cyclin d1, but not shh or p57kip2, suggesting that eye progenitor cells remained at the early progenitor stage and could not exit the cell cycle to undergo differentiation [51]. additionally, it has been reported that arl6ip1 is essential for the specification of neural crest derivatives, but not for neural crest induction. tu et al. found that arl6ip1 mutation causes abnormal neural crest derivative tissues as well as reduced expression of neural crest specifier genes, such as foxd3, snail1b and sox10, indicating that arl6ip1 is involved in specification, but not induction, of neural crest cells [52]. furthermore, they found that arl6ip1 could play an important role in the migration of neural crest cells, because in arl6ip1-knockdown embryos, crestin- and sox10-expressing neural crest cells failed to migrate ventrally from the neural tube into the trunk. more recently, lin et al. found that ras-related nuclear (ran) protein is conjugated with arl6ip1, and proposed that ran protein associates with arl6ip1 to regulate the development of retinae [53]. to date, no in vivo model system has been established to identify cells in the cns that can specifically respond with regeneration after stresses, and, even if identified, no method is in place to trace these responsive cells and further identify their cell fates during hypoxic regeneration. to address these issues, lee et al. generated the transgenic zebrafish line huorfz, which harbors the upstream open reading frame (uorf) from the human ccaat/enhancer-binding protein homologous protein gene (chop), fused with the gfp reporter and driven by a cytomegalovirus promoter [54]. after huorfz embryos were treated with heat shock or placed under hypoxia, the gfp signal was exclusively expressed in the cns, resulting from impairment of the uorf(chop)-mediated translation inhibition [54]. interestingly, zeng et al.
found that gfp(+) cells in the spinal cord respond to stress, survive after stress and differentiate into neurons during regeneration (chih-wei zeng, yasuhiro kamei and huai-jen tsai, unpublished data). micrornas (mirnas) are endogenous single-stranded rna molecules of 19-30 nucleotides (nt) that repress or activate the translation of their target genes through canonical seed and non-canonical centered mirna binding sites. the known mechanisms involved in mirna-mediated gene silencing are decay of mrnas and blockage of translation [55-57]. probably the expression of 30~50 % of human genes is regulated by mirnas [58, 59]. therefore, to understand gene regulation and function in cells or embryos, it is important to know exactly the target gene(s) of a specific mirna at different phases of cells or at particular stages of developing embryos. instead of using a bioinformatic approach, tsai's lab developed the labeled mirna pull-down (lamp) assay system, a simple but effective method to search for the candidate target gene(s) of a specific mirna under investigation [60]. the lamp assay system yields fewer false-positive results than a bioinformatic approach. taking advantage of lamp, scientists discovered that mir-3906 silences different target genes at different developmental stages; e.g., at the early stage, mir-3906 targets dkk3a [41], while at the late stage, it targets homer-1b [40] (fig. 1). in another example (fig. 2), mir-1 and mir-206 are two muscle-specific micrornas sharing the same seed sequences. they are able to modulate the expression of vascular endothelial growth factor aa (vegfaa) and serve as cross-tissue signaling regulators between muscle and vessels. since mir-1 and mir-206 share identical seed sequences, stahlhut et al. demonstrated that they can silence the same target gene, such as vegfaa, and considered them a single cross-tissue regulator termed mir-1/206 [61].
mir-1/206 reduces the level of vegfaa, resulting in the inhibition of angiogenic signaling [61]. surprisingly, using the lamp assay system, lin et al. reported that the target genes for mir-1 and mir-206 are different [62]. while mir-206 targets vegfaa, mir-1 targets the seryl-trna synthetase gene (sars). sars is a negative regulator of vegfaa. although both mir-1 and mir-206 have identical seed sequences, the sars-3′utrs of zebrafish, human and mouse origins can be recognized only by mir-1 in zebrafish embryos and mammalian cell lines (hek-293t and c2c12), but not by mir-206 [62]. conversely, the vegfaa-3′utr is targeted by mir-206, but not by mir-1. therefore, lin et al. concluded that mir-1 and mir-206 are actually two distinct regulators and play opposing roles in zebrafish angiogenesis. the mir-1/sars/vegfaa pathway promotes embryonic angiogenesis by indirectly controlling vegfaa, while the mir-206/vegfaa pathway plays an anti-angiogenic role by directly reducing vegfaa. interestingly, they also found that the mir-1/sars/vegfaa pathway increasingly affects embryonic angiogenesis at late developmental stages in somitic cells [62]. it remains to be studied how mir-1 increases in abundance at the late stage. different from mammals, zebrafish have the ability to regenerate injured parts of the cns. many mirnas have been found in the cns. since mirnas are involved in many aspects of development and homeostatic pathways, they usually play important roles in regeneration [63]. it has been shown that several mirnas have prominent functions in regulating the regeneration process. fig. 2 legend: mir-1 and mir-206 silence different target genes and play opposing roles in zebrafish angiogenesis. both mir-1 and mir-206 are muscle-specific micrornas and share identical seed sequences. however, they silence different target genes to affect the secreted vegfaa level through different pathways [62]. the mir-1/sars/vegfaa pathway plays a positive role in angiogenesis, since sars, a negative factor for vegfaa promoter transcription, is silenced by mir-1, resulting in an increase of vegfaa. however, the mir-206/vegfaa pathway plays a negative role, since vegfaa is silenced directly by mir-206. dynamic changes of mir-1 and mir-206 levels are also observed [62]. the mir-1 level gradually increases between 12 and 20 hpf and increases further significantly between 20 and 32 hpf, while the mir-206 level changes only slightly during this same period. consequently, vegfaa increases greatly from 24 to 30 hpf, which might be attributed to the continuous increase of the mir-1/sars/vegfaa pathway, but not the mir-206/vegfaa pathway. therefore, temporal regulation of the expression of mir-1 and mir-206 with different target genes occurs during embryonic angiogenesis in somitic cells of zebrafish. (end of legend) for example, mir-210 promotes spinal cord repair by enhancing angiogenesis [64], and the mir-15 family represses proliferation in the adult mouse heart [65]. furthermore, the mirnas mir-29b and mir-223 were identified following optic nerve crush. by gene ontology analysis, mir-29b and mir-223 were found to regulate genes including eva1a, layna, nefmb, ina, si:ch211-51a6.2, smoc1, and sb:cb252. these genes are involved in cell survival or apoptosis, indicating that these two mirnas are potential regulators of optic nerve regeneration [66]. although the main hematopoietic sites in zebrafish differ from those in mammals, both zebrafish and mammals share all major blood cell types that arise from common hematopoietic lineages [67]. moreover, many genes and signaling pathways involved in hematopoiesis are conserved among mammals and zebrafish. for example, scl, one of the first transcription factors expressed in early hematopoietic cells, is evolutionarily conserved.
during definitive hematopoiesis, runx1 marks hematopoietic stem cells (hscs) in both mouse and fish. additionally, in differentiated populations, gata1, the erythroid lineage regulator, pu.1 and c/ebp, the myeloid lineage regulators, and ikaros, a marker of the lymphoid population, are in accordance with the hematopoietic hierarchy in zebrafish and mammals [68]. thus, the findings with respect to zebrafish blood development could be applied to the mammalian system. genetic screening in zebrafish has generated many blood-related mutants that help researchers understand hematopoietic genes and their functions [69]. for example, the spadetail mutant carrying a mutated tbx16 exhibits defective mesoderm-derived tissues, including blood. this mutant displays decreased levels of tal1, lmo2, gata2, fli1 and gata1 in the posterior lateral mesoderm, indicating the important role of tbx16 in hemangioblast regulation [70]. chemical screening in zebrafish using biologically active compounds is also a powerful approach to identify factors that regulate hscs. for example, it is well known that prostaglandin (pg) e2 increases the induction of stem cells in the aorta-gonad-mesonephros region of zebrafish, as demonstrated by increased expression of runx1 and cmyb, which, in turn, increases engraftment of murine marrow in experimental transplantation [71]. in human clinical trials, the treatment of cord blood cells with dimethyl pge2 caused an increase in long-term engraftment [72], suggesting that a compound identified in zebrafish could have clinical application in humans. model fish are excellent materials for the study of human diseases, because some mutants display phenotypes similar to those of human diseases [73]. in addition, essential genes, and their regulation controlling the development of tissues or organs, are highly conserved [74]. for example, tbx5 is a t-box transcription factor responsible for cell-type specification and morphogenesis.
the phenotypes of tbx5 mutants are highly similar between mammals and zebrafish. thus, transgenic fish with heart-specific fluorescence could provide a high-throughput screening platform for drugs against cardiovascular disease. for example, tsai's lab established a transgenic line in which the expression level of cardiac troponin c could be knocked down at any developmental stage, in embryos, larvae or adult fish. the reduction of troponin c mimicked dilated cardiomyopathy and the incomplete atrioventricular block seen in humans. therefore, this transgenic line is expected to make a significant contribution to drug screening and the elucidation of the molecular mechanisms underlying cardiovascular diseases. the effect of drugs on embryonic development has also been studied. amiodarone, a class iii antiarrhythmic agent, is used for the treatment of tachyarrhythmia in humans. however, amiodarone-treated zebrafish embryos were found to exhibit backflow of blood in the heart [75]. subsequent research showed that amiodarone caused failure of cardiac valve formation [75]. specifically, amiodarone induces ectopic expression of similar to versican b (s-vcanb), resulting in repression of egfr/gsk3β/snail signaling, which in turn upregulates cdh5 in the heart field and causes defective cardiac valves [76]. moreover, amiodarone was found to repress metastasis of breast cancer cells by inhibiting the egfr/erk/snail pathway [77], a phenomenon analogous to the inhibitory effect of amiodarone on the epithelial-mesenchymal transition (emt) observed in the heart. last but not least, although the zebrafish has a two-chambered heart, its heart rate, action potential duration (apd) and electrocardiogram (ecg) morphology are closer to those of humans than are those of mouse, rat, and rabbit [78, 79]. additionally, tsai et al. demonstrated that in vitro ecg recording of the zebrafish heart is a simple, efficient and high-throughput assay [80].
thus, zebrafish can serve as a platform for direct testing of drug effects on apd prolongation and qt-interval prolongation, an assessment required by the fda as a precondition for drug approval. zebrafish has become a popular experimental animal for studies of human cancer [81], in part because fish homologs of human oncogenes and tumor suppressor genes have been identified, and in part because the signaling pathways regulating cancer development are conserved [82-84]. amatruda et al. reported that many zebrafish tumors are similar to human cancers upon histological examination [85]. a zebrafish transgenic line with skin-specific red fluorescence could be applied for skin tumor detection [86]. when the embryos of this line were treated with solutions containing arsenic, the tumors induced on the skin could be easily identified under a fluorescence microscope. therefore, this transgenic line can potentially be used for the study of skin diseases. for example, the common skin cancer melanoma may be screened by the red fluorescence expression in this transgenic line. zebrafish transgenic lines could also be applied to establish models simulating melanoma development. the human oncogene braf v600e was expressed under the control of the zebrafish melanocyte mitfa promoter to establish a melanoma model [87]. by combining skin-specific red fluorescence with mitfa-driven oncogene expression, melanoma could be easily traced. therefore, transgenic lines and mutants of model fish could provide abundant resources for mechanistic studies and therapeutic research in human diseases. metastasis involves sequential, interlinked and selective steps, including invasion, intravasation, arrest in distant capillaries, extravasation, and colonization [88]. zebrafish is again an attractive organism for in vivo cancer biology studies.
in particular, xenotransplantation of human cancer cells into zebrafish embryos serves as an alternative approach for evaluating cancer progression and for drug screening [89]. for example, fluorescently labeled human primary tumor cells have been implanted into zebrafish liver, and the invasiveness and metastasis of these cells were directly observable and easily traceable [90]. to investigate the mechanism of local cancer cell invasion, fluorescently labeled human glioblastoma cells were infiltrated into the brain of zebrafish embryos; the injected cells were observed to align along the abluminal surface of brain blood vessels [91]. by grafting a small number of highly metastatic human breast carcinoma cells onto the pericardial membrane of zebrafish embryos at 48 hpf, tumor cells were observed to move longitudinally along the aorta [92]. similarly, fluorescently labeled highly metastatic human cancer cells were injected into the pericardium of 48-hpf embryos; it was then possible to visualize how cancer cells entered the blood circulation and arrested in small vessels of the head and tail [93]. in another example, zebrafish embryos injected with tumorigenic human glioma stem cells allowed the observation of different stages of metastasis, including initiation, approaching, clustering, invading, migrating, and transmigrating [94]. thus, grafting a small number of labeled tumor cells into transparent zebrafish embryos allows researchers to dynamically monitor the cancer cells without the interference of immune suppression. apart from its utility in analyzing the mechanisms of tumor dissemination and metastasis, the zebrafish model can also be applied to screen potential anticancer compounds or drugs. in addition, zebrafish feature such advantages as easy gene manipulation, a short generation cycle, high reproducibility, low maintenance cost, and efficient plating of embryos [95, 96].
therefore, this small fish is second only to scid and nude mice as a xenograft recipient of cancer cells. leukemia is a cancer related to hematopoiesis. most often, leukemia results from the abnormal increase of white blood cells, although some human cancers of bone marrow and blood origin have their parental cells in other blood cell types. the search for efficacious therapies for leukemia is ongoing. interestingly, the developmental processes and genes related to hematopoiesis are similar between zebrafish and humans, making zebrafish a feasible model for the study of leukemia. in addition, gene expression in zebrafish can be conveniently modified by several approaches, e.g., mo-induced gene knockdown, talen and crispr/cas9 gene knockout, and dna/rna-introduced overexpression [6, 7]. in the study of yeh et al. [97], the zebrafish model was applied to screen for chemical modifiers of aml1-eto, an oncogenic fusion protein prevalent in acute myeloid leukemia (aml). expression of aml1-eto in zebrafish resulted in hematopoietic dysregulation and elicited a malignant phenotype similar to human aml. cyclooxygenase-2 (cox2) is an enzyme causing inflammation and pain; nimesulide, an inhibitor of cox2, acts as an antagonist of aml1-eto in hematopoietic differentiation. fms-like tyrosine kinase 3 (flt3) is a class iii receptor tyrosine kinase normally expressed in human hematopoietic stem and progenitor cells (hspcs) [98]. internal tandem duplication (itd), which may occur at either the juxtamembrane domain (jmd) or the tyrosine kinase domains (tkds) of flt3, is observed in one-third of human aml cases. zebrafish flt3 shares an overall 32, 35, and 34% sequence identity with that of human, mouse, and rat, respectively. however, the jmd and the activation loop of the tkd are highly conserved, implying that the functions of flt3 signaling are evolutionarily conserved.
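the sequence identity figures above (eg, 32% overall between zebrafish and human flt3) come from pairwise alignments. as a minimal illustration, the sketch below computes percent identity from two pre-aligned sequences; the function name and the sequence fragments are invented placeholders, not real flt3 data.

```python
# Hypothetical sketch: percent identity between two pre-aligned sequences.
# The fragments below are invented placeholders, not real FLT3 sequences.

def percent_identity(aln1: str, aln2: str) -> float:
    """Percent identity over aligned columns, ignoring columns that are
    gaps ('-') in both sequences."""
    if len(aln1) != len(aln2):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aln1, aln2):
        if a == '-' and b == '-':
            continue  # skip columns gapped in both sequences
        compared += 1
        if a == b and a != '-':
            matches += 1
    return 100.0 * matches / compared

zebrafish_frag = "MKV-LLSWTG"
human_frag     = "MKVALLS-TG"
print(percent_identity(zebrafish_frag, human_frag))  # 8 matches / 10 columns -> 80.0
```

for real comparisons, the aligned strings would come from a proper global aligner (eg, a needleman-wunsch implementation in a standard bioinformatics toolkit) rather than being written by hand.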
overexpression of human flt3-itd in zebrafish embryos induces the ectopic expansion of flt3-itd-positive myeloid cells. if those embryos are treated with ac220, a potent and relatively selective inhibitor of flt3, the flt3-itd myeloid expansion is effectively ameliorated [99]. in another example, isocitrate dehydrogenases (idh) 1 and 2 are involved in the citric acid cycle in intermediary metabolism. idh mutations are found in approximately 30% of cytogenetically abnormal aml, suggesting a pathogenetic link to leukemia initiation [100, 101]. injection of either human idh1-r132h or zebrafish idh1-r146h, the mutant corresponding to human idh1-r132h, resulted in increased 2-hydroxyglutarate, which in turn induced the expansion of primitive myelopoiesis [102]. taken together, these reports suggest that the molecular pathways involved in leukemia are conserved between humans and zebrafish. based on the aforementioned experimental evidence, zebrafish can be an exceptional platform for mimicking human myelodysplastic syndromes and establishing an in vivo vertebrate model for drug screening. several liver tumor models have been reported based on liver-specific expression of transgenic oncogenes such as kras, xmrk and myc. these transgenic zebrafish lines usually develop liver tumors of varying severity, from hepatocellular adenoma (hca) to hepatocellular carcinoma (hcc) [103-105]. these three transgenic liver cancer models have been used to identify differentially expressed genes through rna-sage sequencing. for example, researchers have searched for genes either up- or downregulated among the three tumor models and analyzed the possible signaling pathways involved. the correlation between zebrafish liver tumor signatures and the different stages of human hepatocarcinogenesis was then determined [106].
high tumor incidence and convenient chemical treatment make these inducible transgenic zebrafish a plausible platform for studying liver tumor progression and regression, and for anticancer drug screening. interestingly, zebrafish has become a modern organism for studying depressive disorders [107-109]. because the physiological (neuroanatomical, neuroendocrine, neurochemical) and genetic characteristics of zebrafish are similar to those of mammals, zebrafish are ideal for high-throughput genetic and chemical genetic screening. furthermore, since behavioral tests of zebrafish for cognitive, approach-avoidance, and social paradigms are available, the identification of depression-like indices in response to physiological, genetic, environmental, and/or psychopharmacological alterations is feasible [110]. indeed, zebrafish display highly robust phenotypes of neurobehavioral disorders such as anxiety-like and approach-avoidance behaviors. furthermore, novel behavioral indices can be revealed, including geotaxis via top-bottom vertical movement [111]. zebrafish behavior can also be monitored using automated behavioral tracking software, which enhances efficiency and reduces interrater variance [112]. additionally, zebrafish offer potential insight into the social aspects of depression [113] and may be suitable for studying the cognitive deficits of depression [114] and its putative etiological pathways [115]. last but not least, zebrafish are highly sensitive to psychotropic drugs, such as antidepressants, anxiolytics, mood stabilizers, and antipsychotics [116-118], serving as an important tool for drug discovery. aromatic hydrocarbons, heavy metals and environmental estrogens are currently being used to test the impact of environmental pollutants on animals [119]. these studies have mainly focused on mortality and abnormality rates.
however, developing embryos may already have been damaged in subtle ways that preclude direct observation of morphological defects or detection of mortality. to overcome this drawback, transgenic fish can be used, because they are designed to reveal (a) whether toxicants cause gene defects during embryogenesis; (b) whether pollutants affect the expression of tissue-specific genes; and (c) whether the impact of pollutants on embryonic development is dosage dependent. pollutants can be detected directly by simply observing the coloration change of cells before the pollutants cause morphological damage. therefore, transgenic model fish are promising organisms for use as bioindicators of environmental toxicants and mutagens [120, 121]. in addition, chen and lu reported that environmental xenobiotics can be detected by a transgenic line of medaka carrying a gfp reporter driven by the cytochrome p450 1a promoter (cyp1a-gfp) [122]. furthermore, environmental xenoestrogenic compounds can be specifically detected by a hybrid transgenic line derived from a cross between line cyp1a-gfp and line vg-lux, whose lux reporter activity is driven by a vitellogenin promoter [123]. lee et al. reported another zebrafish transgenic line, termed huorfz [54], as described in the previous section. under normal conditions, the translation of the transferred huorf(chop)-gfp mrna in huorfz embryos is completely suppressed by an inhibitory uorf of human chop mrna (huorf(chop)). however, when huorfz embryos are under er stress, such as heat shock, cold shock, hypoxia, metals, alcohol, toxicants or drugs, the downstream gfp becomes apparent due to the blockage of huorf(chop)-mediated translation inhibition. therefore, huorfz embryos can be used to study the mechanism of translational inhibition. additionally, huorfz embryos can serve as a living material to monitor contamination by hazardous pollutants [124].
besides the universal huorfz system, zebrafish can also serve as indicators for specific pollutants. for example, xu et al. reported a transgenic zebrafish tg(cyp1a:gfp) which can serve as an in vivo assay for screening xenobiotic compounds, since cyp1a is involved in the aryl hydrocarbon receptor pathway and can be induced in the presence of dioxins/dioxin-like compounds and polycyclic aromatic hydrocarbons [125]. additional advantages of zebrafish include their small size, abundant offspring, rapid development and transparent eggs. these features make this model fish highly accessible for studies in molecular toxicology. it is increasingly clear that the transgenic fish model is a powerful biomaterial for studies in multiple disciplines, including molecular biology, developmental biology, neurobiology, cancer biology and regenerative medicine. it provides a simple, yet effective, in vivo approach to identify regulatory dna sequences, as well as to determine gene function and molecular pathways. more importantly, an increasing number of papers have reported that (a) the defective phenotypes of model fish mutants can phenocopy known human disorders; and (b) drugs have similar effects on zebrafish and mammalian systems. therefore, the transgenic fish model offers a useful platform for high-throughput drug screening in the biomedical sciences. additionally, it can serve as an environmental indicator for detecting pollutants in our daily lives. nevertheless, there are several limitations and caveats of this fish model. first, unlike mammals, fish lack heart septation, lungs, mammary glands, prostate glands and limbs, which makes the fish model unsuitable for studies of these tissues and organs. additionally, fish lack a placenta, so fish embryos are directly exposed to the environment (e.g., drugs or pollutants) without placental involvement.
second, fish are poikilothermic and are usually maintained below 30°c, which may not be optimal for mammalian agents that evolved at 37°c. last, since the zebrafish genome is tetraploid, it is less straightforward to conduct loss-of-function studies for certain genes.
references:
[1] the molecular biology of transgenic fish
[2] enhanced expression and stable transmission of transgenes flanked by inverted terminal repeats from adeno-associated virus in zebrafish
[3] identification of the tol2 transposase of the medaka fish oryzias latipes that catalyzes excision of a nonautonomous tol2 element in zebrafish danio rerio
[4] functional dissection of the tol2 transposable element identified the minimal cis-sequence and a highly repetitive sequence in the subterminal region essential for transposition
[5] a transposon-mediated gene trap approach identifies developmentally regulated genes in zebrafish
[6] heritable gene targeting in zebrafish using customized talens
[7] efficient genome editing in zebrafish using a crispr-cas system
[8] crispr/cas9 and talen-mediated knock-in approaches in zebrafish
[9] the aequorea victoria green fluorescent protein can be used as a reporter in live zebrafish embryos
[10] gata-1 expression pattern can be recapitulated in living transgenic zebrafish using gfp reporter gene
[11] high-frequency generation of transgenic zebrafish which reliably express gfp in whole muscles or the whole body by using promoters of zebrafish origin
[12] isolation of a zebrafish rod opsin promoter to generate a transgenic zebrafish line expressing enhanced green fluorescent protein in rod photoreceptors
[13] visualization of cranial motor neurons in live transgenic zebrafish expressing green fluorescent protein under the control of the islet-1 promoter/enhancer
[14] analysis of pancreatic development in living transgenic zebrafish embryos
[15] germ-line transmission of a myocardium-specific gfp transgene reveals critical regulatory elements in the cardiac myosin light chain 2 promoter of zebrafish
[16] 435-bp liver regulatory sequence in the liver fatty acid binding protein (l-fabp) gene is sufficient to modulate liver regional expression in transgenic zebrafish
[17] establishment of a bone-specific col10a1:gfp transgenic zebrafish
[18] the macrophage-specific promoter mfap4 allows live, long-term analysis of macrophage behavior during mycobacterial infection in zebrafish
[19] expression of a vas::egfp transgene in primordial germ cells of the zebrafish
[20] uniform gfp-expression in transgenic medaka (oryzias latipes) at the f0 generation
[21] random samples: that special glow
[22] genetics: fish that glow in taiwan
[23] coolest inventions 2003: light and dark-red fish, blue fish and glow-in-dark fish
[24] modification of bacterial artificial chromosomes through chi-stimulated homologous recombination and its application in zebrafish transgenesis
[25] artificial chromosome transgenesis reveals long-distance negative regulation of rag1 in zebrafish
[26] multiple upstream modules regulate zebrafish myf5 expression
[27] use of the gal4-uas technique for targeted gene expression in zebrafish
[28] udp-glucose dehydrogenase required for cardiac valve formation in zebrafish
[29] activation of notch signaling pathway precedes heart regeneration in zebrafish
[30] na,k-atpase is essential for embryonic heart development in the zebrafish
[31] the embryonic vertebrate heart tube is a dynamic suction pump
[32] four-dimensional cardiac imaging in living embryos via postacquisition synchronization of nongated slice sequences
[33] zebrafish cardiac development requires a conserved secondary heart field
[34] tbx1 is required for second heart field proliferation in zebrafish
[35] novel regulatory sequence −82/−62 functions as a key element to drive the somite-specificity of zebrafish myf-5
[36] foxd3 mediates zebrafish myf5 expression during early somitogenesis
[37] foxd5 mediates anterior-posterior polarity through upstream modulator fgf signaling during zebrafish somitogenesis
[38] inactivation of zebrafish mrf4 leads to myofibril misalignment and motor axon growth disorganization
[39] novel cis-element in intron 1 represses somite expression of zebrafish myf-5
[40] microrna-3906 regulates fast muscle differentiation through modulating the target gene homer-1b in zebrafish embryos
[41] novel intronic microrna represses zebrafish myf5 promoter activity through silencing dickkopf-3 gene
[42] zebrafish dkk3a protein regulates the activity of myf5 promoter through interaction with membrane receptor integrin α6b
[43] dickkopf-3-related gene regulates the expression of zebrafish myf5 gene through phosphorylated p38a-dependent smad4 activity
[44] myogenic regulatory factors myf5 and myod function distinctly during craniofacial myogenesis of zebrafish
[45] the transcription factor six1a plays an essential role in the craniofacial myogenesis of zebrafish
[46] normal function of myf5 during gastrulation is required for pharyngeal arch cartilage development in zebrafish embryos
[47] differential requirements for myogenic regulatory factors distinguish medial and lateral somitic, cranial and fin muscle fibre populations
[48] retina-specific cis-elements and binding nuclear proteins of carp rhodopsin gene
[49] egr1 gene knockdown affects embryonic ocular development in zebrafish
[50] the embryonic expression patterns and the knockdown phenotypes of zebrafish adp-ribosylation factor-like 6 interacting protein gene
[51] arl6ip1 plays a role in proliferation during zebrafish retinogenesis
[52] zebrafish arl6ip1 is required for neural crest development during embryogenesis
[53] ras-related nuclear protein is required for late developmental stages of retinal cells in zebrafish eyes
[54] transgenic zebrafish model to study translational control mediated by upstream open reading frame of human chop gene
[55] a parsimonious model for gene regulation by mirnas
[56] gene silencing by micrornas: contributions of translational repression and mrna decay
[57] regulation of mrna translation and stability by micrornas
[58] micrornas: target recognition and regulatory functions
[59] microrna target predictions in animals
[60] labeled microrna pull-down assay system: an experimental approach for high-throughput identification of microrna-target mrnas
[61] mir-1 and mir-206 regulate angiogenesis by modulating vegfa expression in zebrafish
[62] mir-1 and mir-206 target different genes to have opposing roles during angiogenesis in zebrafish embryos
[63] concise review: new frontiers in microrna-based tissue regeneration
[64] administration of microrna-210 promotes spinal cord regeneration in mice
[65] regulation of neonatal and adult mammalian heart regeneration by the mir-15 family
[66] integrated analyses of zebrafish mirna and mrna expression profiles identify mir-29b and mir-223 as potential regulators of optic nerve regeneration
[67] transplantation and in vivo imaging of multilineage engraftment in zebrafish bloodless mutants
[68] hematopoiesis: an evolving paradigm for stem cell biology
[69] transcriptional regulation of hematopoietic stem cell development in zebrafish
[70] mutant-specific gene programs in the zebrafish
[71] prostaglandin e2 regulates vertebrate haematopoietic stem cell homeostasis
[72] prostaglandin e2 enhances human cord blood stem cell xenotransplants and shows long-term safety in preclinical nonhuman primate transplant models
[73] from zebrafish to human: modular medical models
[74] the heartstrings mutation in zebrafish causes heart/fin tbx5 deficiency syndrome
[75] the toxic effect of amiodarone on valve formation in the developing heart of zebrafish embryos
[76] amiodarone induces overexpression of similar to versican b to repress the egfr/gsk3b/snail signaling axis during cardiac valve formation of zebrafish embryos
[77] cancer metastasis and egfr signaling is suppressed by amiodarone-induced versican v2
[78] in vivo recording of adult zebrafish electrocardiogram and assessment of drug-induced qt prolongation
[79] zebrafish model for human long qt syndrome
[80] in-vitro recording of adult zebrafish heart electrocardiogram: a platform for pharmacological testing
[81] liver development and cancer formation in zebrafish
[82] zebrafish as a cancer model
[83] zebrafish modelling of leukaemias
[84] catch of the day: zebrafish as a human cancer model
[85] zebrafish as a cancer model system
[86] a keratin 18 transgenic zebrafish tg(k18(2.9):rfp) treated with inorganic arsenite reveals visible overproliferation of epithelial cells
[87] braf mutations are sufficient to promote nevi formation and cooperate with p53 in the genesis of melanoma
[88] the pathogenesis of cancer metastasis: the 'seed and soil' hypothesis revisited
[89] zebrafish xenotransplantation as a tool for in vivo cancer study
[90] metastatic behaviour of primary human tumours in a zebrafish xenotransplantation model
[91] calpain 2 is required for the invasion of glioblastoma cells in the zebrafish brain microenvironment
[92] distinct contributions of angiogenesis and vascular co-option during the initiation of primary microtumors and micrometastases
[93] visualizing extravasation dynamics of metastatic tumor cells
[94] a novel zebrafish xenotransplantation model for study of glioma stem cell invasion
[95] quantitative phenotyping-based in vivo chemical screening in a zebrafish model of leukemia stem cell xenotransplantation
[96] zebrafish-based systems pharmacology of cancer metastasis
[97] discovering chemical modifiers of oncogene-regulated hematopoietic differentiation
[98] stk-1, the human homolog of flk-2/flt-3, is selectively expressed in cd34+ human bone marrow cells and is involved in the proliferation of early progenitor/stem cells
[99] functions of flt3 in zebrafish hematopoiesis and its relevance to human acute myeloid leukemia
[100] cancer-associated metabolite 2-hydroxyglutarate accumulates in acute myelogenous leukemia with isocitrate dehydrogenase 1 and 2 mutations
[101] regulation of cancer cell metabolism
[102] functions of idh1 and its mutation in the regulation of developmental hematopoiesis in zebrafish
[103] inducible and repressable oncogene-addicted hepatocellular carcinoma in tet-on xmrk transgenic zebrafish
[104] an inducible kras(v12) transgenic zebrafish model for liver tumorigenesis and chemical drug screening
[105] a transgenic zebrafish liver tumor model with inducible myc expression reveals conserved myc signatures with mammalian liver tumors
[106] xmrk, kras and myc transgenic zebrafish liver cancer models share molecular signatures with subsets of human hepatocellular carcinoma
[107] gaining translational momentum: more zebrafish models for neuroscience research
[108] zebrafish as an emerging model for studying complex brain disorders
[109] zebrafish models for translational neuroscience research: from tank to bedside
[110] zebrafish models of major depressive disorders
[111] three-dimensional neurophenotyping of adult zebrafish behavior
[112] aquatic blues: modeling depression and antidepressant action in zebrafish
[113] social modulation of brain monoamine levels in zebrafish
[114] can zebrafish learn spatial tasks? an empirical analysis of place and single cs-us associative learning
[115] cognitive dysfunction in depression: pathophysiology and novel targets
[116] a larval zebrafish model of bipolar disorder as a screening platform for neuro-therapeutics
[117] role of serotonin in zebrafish (danio rerio) anxiety: relationship with serotonin levels and effect of buspirone, way 100635, sb 224289, fluoxetine and para-chlorophenylalanine (pcpa) in two behavioral models
[118] an affective disorder in zebrafish with mutation of the glucocorticoid receptor
[119] global water pollution and human health
[120] transgenic zebrafish for detecting mutations caused by compounds in aquatic environments
[121] mutational spectra of benzo[a]pyrene and meiqx in rpsl transgenic zebrafish embryos
[122] transgenic fish technology: basic principles and their application in basic and applied research
[123] gfp transgenic medaka (oryzias latipes) under the inducible cyp1a promoter provide a sensitive and convenient biological indicator for the presence of tcdd and other persistent organic chemicals
[124] zebrafish transgenic line huorfz is an effective living bioindicator for detecting environmental toxicants
[125] generation of tg(cyp1a:gfp) transgenic zebrafish for development of a convenient and sensitive in vivo assay for aryl hydrocarbon receptor activity
the authors declare that they have no competing
interests.
authors' contributions: hjt conceptualized, organized, took charge of, and revised the content, and hjt, cyl and cyc wrote the manuscript together. all authors read and approved the final manuscript.
key: cord-031957-df4luh5v authors: dos santos-silva, carlos andré; zupin, luisa; oliveira-lima, marx; vilela, lívia maria batista; bezerra-neto, joão pacifico; ferreira-neto, josé ribamar; ferreira, josé diogo cavalcanti; de oliveira-silva, roberta lane; pires, carolline de jesús; aburjaile, flavia figueira; de oliveira, marianne firmino; kido, ederson akio; crovella, sergio; benko-iseppon, ana maria title: plant antimicrobial peptides: state of the art, in silico prediction and perspectives in the omics era date: 2020-09-02 journal: bioinform biol insights doi: 10.1177/1177932220952739 sha: doc_id: 31957 cord_uid: df4luh5v
even before the perception of or interaction with pathogens, plants rely on constitutive guardian molecules, often specific to tissue or stage, with further expression after contact with the pathogen. these guardians include small molecules such as antimicrobial peptides (amps), generally cysteine-rich, that function to prevent pathogen establishment. some of these amps are shared among eukaryotes (eg, defensins and cyclotides), others are plant specific (eg, snakins), while some are specific to certain plant families (such as heveins). when compared with other organisms, plants tend to present a greater number of amp isoforms due to gene duplications or polyploidy, an occurrence possibly also associated with the sessile habit of plants, which prevents them from evading biotic and environmental stresses. therefore, plants arise as a rich resource for new amps.
as these molecules are difficult to retrieve from databases using simple sequence alignments, a description of their characteristics and of the in silico (bioinformatics) approaches used to retrieve them is provided, considering the resources and databases available. the possibilities and applications of tool-based versus database-based approaches are considerable and have so far been underestimated. proteins and peptides play different roles depending on their amino acid (aa) constitution, which may vary from tens to thousands of residues. 1 peptides are conventionally understood as having fewer than 50 aa. 2 proteins, on the contrary, would be any molecule presenting a higher amino acid content, and both proteins and peptides present a plethora of variations in plants. despite that, plant proteomes have been much more studied than peptidomes. it is well known that the biochemical machinery necessary for the synthesis and metabolism of peptides is present in every living organism. from variations of this machinery, a wide structural and functional diversity of peptides was generated, justifying the growing interest in their study. in eukaryotes, peptides are prevalent in intercellular communication, acting as hormones, growth factors, and neuropeptides, but they are also present in the defense system. 3 besides plants and animals, in several pathogenic microorganisms peptides can serve as classical virulence factors, which disrupt the epithelial barrier, damage cells, and activate or modulate host immune responses. an example is candidalysin, 4 a fungal cytolytic peptide toxin found in the pathogenic fungus candida albicans that damages epithelial membranes, triggers a response signaling pathway, and activates epithelial immunity. there are also reports of defense-related fungal peptides.
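the length convention mentioned above (fewer than 50 aa for a peptide) can be made concrete with a short sketch; the threshold constant and the toy sequences are illustrative assumptions only.

```python
# Minimal sketch of the length convention described in the text: molecules
# with fewer than 50 amino acid residues are treated as peptides, longer
# ones as proteins. Threshold and example sequences are illustrative only.

PEPTIDE_MAX_LEN = 49  # "fewer than 50 aa" convention

def classify(seq: str) -> str:
    """Label a sequence as 'peptide' or 'protein' by residue count."""
    return "peptide" if len(seq) <= PEPTIDE_MAX_LEN else "protein"

print(classify("ACDEFGHIKL"))       # 10 residues -> peptide
print(classify("ACDEFGHIKL" * 10))  # 100 residues -> protein
```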
for example, copsin, a peptide-based fungal antibiotic recently identified in the fungus coprinopsis cinerea, 5 kills bacteria by inhibiting their cell wall synthesis. regarding bacterial peptides, certain species from the gastrointestinal microbial community can release low-molecular-weight peptides able to trigger immune responses. 6 there are additionally peptides that act like bacterial "hormones", allowing bacterial communities to organize multicellular behavior such as biofilm formation. 7 some peptides are known for their medical importance, such as defensins, which present antibacterial, antiviral, and antifungal activities. for example, human alpha- and beta-defensins present in the saliva may potentially impede virus replication, including that of sars-cov-2, 8 besides playing other roles, such as protection against intestinal inflammation (colitis). 9 considering the roles of plant peptides, they can also be multifunctional and have been classified into 2 main categories 10 (supplementary figure s1): (1) peptides with no bioactivity, primarily resulting from the degradation of proteins by proteolytic enzymes, aiming at their recycling, and (2) bioactive peptides, which are encrypted in the structure of the parent proteins and are released mainly by enzymatic processes. the first group is innocuous regarding signaling, regulatory functions, and bioactivity. so far, it has been reported that some of them may play a significant role in nitrogen mobilization across cellular membranes. 11 the second group of bioactive peptides has a substantial impact on plant cell physiology. some peptides of this group can act in plant growth regulation (through cell-to-cell signaling), in endurance against pathogens and pests by acting as toxins or elicitors, or even in the detoxification of heavy metals by ion sequestration. concerning bioactive peptides, an additional subcategorization has been proposed regarding their function.
Tavormina et al12 proposed a classification (Supplementary Figure S1) based on the type of precursor:
• Derived from functional precursors: originated from a functional precursor protein;
• Derived from nonfunctional precursors: originated from a longer precursor that has no known biological function (as a preprotein, proprotein, or preproprotein);
• Not derived from a precursor protein: some sORFs (small open reading frames; usually <100 codons) are considered to represent a potential new source of functional peptides (known as "short peptides encoded by sORFs").
A more intuitive classification of bioactive peptides was further proposed by Farrokhi et al10 receptors in leaves.13 Another example is the PLS (POLARIS) peptide, which acts during early embryogenesis but later activates auxin synthesis, also affecting cytokinin synthesis and the ethylene response.14 Regarding the second group, it includes peptides with signaling roles in plant defense, comprising at least 4 subgroups, including Syst (systemin) (Supplementary Figure S1). The Syst peptides were identified in Solanaceae members, like tomato and potato15 (acting on the signaling response to herbivory). Syst leads to the production of a plant protease inhibitor that suppresses the insect's proteases.16 Stratmann17 suggested that in plants, Systs act to stimulate the jasmonic acid signaling cascade within vascular tissues to induce a systemic wound response.
• Defense peptides or antimicrobial peptides (AMPs): to be fitted into this class, a plant peptide must fulfill some specific biochemical and genetic prerequisites. Regarding biochemical features, in vitro antimicrobial activity is required. Concerning the genetic condition, the gene encoding the peptide should be induced in the presence of infectious agents.18 In practice, this last requirement is not always fulfilled, as some AMPs are tissue-specific and considered part of the plant innate immunity, while other isoforms of the same class appear induced after pathogen inoculation.
19 Plant AMPs are the central focus of the present review, comprising information on their structural features (at genomic, gene, and protein levels) and on the resources and bioinformatic tools available, besides the proposition of an annotation routine. Their biotechnological potential is also highlighted in the generation of both transgenic plants resistant to pathogens and new drugs or bioactive compounds. Antimicrobial peptides are ubiquitous host defense weapons against microbial pathogens. The overall plant AMP characterization regards the following variables (Figure 1): electrical charge, hydrophilicity, secondary and 3-dimensional (3D) structures, and the abundance or spatial pattern of cysteine residues.20 These features are primarily related to their defensive role(s) as membrane-active antifungal, antibacterial, or antiviral peptides. Regarding the nucleotide sequence, plant AMPs are hypervariable, and this genetic variability is considered crucial to provide diversity and the ability to recognize different targets. For their charges, AMPs can be classified as cationic or anionic (Figure 1). Most plant AMPs have positive charges, which is a fundamental feature for the interaction with the membrane lipids of pathogens.21 Concerning hydrophilicity, AMPs are generally amphipathic, that is, they exhibit molecular conformations with both hydrophilic and hydrophobic domains.22 With respect to their 3D structure, AMPs can be either linear or cyclic (Figure 1). Some linear AMPs adopt an amphipathic α-helical conformation, whereas non-α-helical linear peptides generally show 1 or 2 predominant amino acids.23 In turn, cyclic AMPs, including cysteine-containing peptides, can be divided into 2 subgroups based on the presence of a single or of multiple disulfide bonds. A peculiar feature of these peptides is their cationic and amphipathic character, which improves their functioning as membrane-permeabilizing agents.
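The cationic/anionic distinction described above can be illustrated with a crude net-charge estimate. This is a minimal sketch, not part of the review: it counts Lys/Arg against Asp/Glu at roughly neutral pH, ignoring histidine, cysteine, and the termini; the example sequence is made up.

```python
# Crude net-charge estimate of a peptide at neutral pH:
# count basic residues (Lys, Arg) minus acidic residues (Asp, Glu).
# His, Cys, and terminal charges are deliberately ignored in this sketch.
def net_charge(seq: str) -> int:
    seq = seq.upper()
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative

def classify_charge(seq: str) -> str:
    q = net_charge(seq)
    return "cationic" if q > 0 else "anionic" if q < 0 else "neutral"

# Example with a made-up, Lys/Arg-rich toy peptide:
print(classify_charge("GKRWWKWWKK"))  # cationic
```

Real tools (eg, full Henderson-Hasselbalch pI calculators) account for all ionizable groups; this approximation only captures the coarse cationic/anionic split used in the classification above.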
23 Considering the secondary structures, AMPs may exhibit α-helices, β-chains, β-pleated sheets, and loops (Figure 1). Wang24 classified plant AMPs into 4 families (α, β, αβ, and non-αβ), based on the protein classification of Murzin et al,25 with some modifications. Antimicrobial peptides of the "α" family present α-helical structures,1 whereas AMPs from the "β" family contain β-sheet structures usually stabilized by disulfide bonds.26,27 Some plant AMPs show an α-hairpinin motif formed by antiparallel α-helices that are stabilized by 2 disulfide bridges.28 Such AMPs present a higher resistance to enzymatic, chemical, or thermal degradation.29 Antimicrobial peptides from the "αβ" family, having both "α" and "β" structures, are also stabilized by disulfide bridges. An example of AMPs presenting "αβ" structures are the defensins, usually with a cysteine-stabilized αβ motif (CSαβ), an α-helix, and a triple-stranded antiparallel β-sheet stabilized mostly by 4 disulfide bonds.30 Finally, AMPs that do not belong to the "αβ" group exhibit no clearly defined "α" or "β" structures.26 Plant AMPs are also classified into families considering protein sequence similarity, cysteine motifs, and distinctive patterns of disulfide bonds, which determine the folding of the tertiary structure.31 Therefore, plant AMPs are commonly grouped as thionins, defensins, heveins, knottins (linear and cyclic), lipid transfer proteins (LTPs), snakins, and cyclotides.27,31 These AMP categories will be detailed in the next sections, together with the other groups considered here (Impatiens-like, macadamia [β-barrelins], puroindoline [PIN], and thaumatin-like protein [TLP]) and the recently described α-hairpinin AMPs. The description includes comments on their structure, pattern for regular expression (regex) analysis (when available), functions, tissue specificity, and scientific data availability.
Thionins are composed of 45 to 48 amino acid residues, with a molecular weight around 5 kDa for the mature peptide. They are synthesized with a signal peptide, together with the mature thionin and a so-called acidic domain.32 To date, there is no experimental information available about possible functions of the acidic domain, even though it is clearly not dispensable, as shown by the high conservation of the cysteine residues.33 The thionin superfamily comprises 2 distinct groups of plant peptides, α/β-thionins and γ-thionins, with distinct structural features.34 The α/β-thionins have homologous amino acid sequences and similar structures.35 Besides, they are rich in arginine, lysine, and cysteine.36 In turn, γ-thionins have a greater similarity with defensins, and some authors classify them within this group.37 However, compared with the defensins, they present a longer conserved amino acid sequence.31 Regarding the cysteine motif, thionins can be divided into 2 subgroups: one with 8 residues connected by 4 disulfide bonds, called 8C, and the other with 6 residues connected by 3 disulfide bonds, called 6C.38 The general designation of thionins has been proposed for a family of homologous peptides that includes the purothionins. The first plant thionin was isolated in 1942 from wheat flour and labeled as purothionin.39 Since then, homologues from various taxa have also been identified, like viscotoxins (Viscum album) and crambins (Crambe abyssinica).40 They have also been isolated from different plant tissues, like seeds, leaves, and roots.41,42 Thionins have been tested against different targets: gram-positive43,44 or gram-negative bacteria,45,46 yeast,38,43 insect larvae,47 nematodes,33 and inhibitory proteinases.48 Thionins are hydrophobic in nature, interact with hydrophobic residues, and lyse bacterial cell membranes.
Their toxicity is due to an electrostatic interaction with the negatively charged membrane phospholipids, followed by either pore formation or a specific interaction with the membrane.38 It has been reported that they are able to inhibit other enzymes, possibly through covalent attachment mediated by the formation of disulfide bonds, as previously observed for other thionin/enzyme combinations.48 Thionin representatives with known 3D structures determined by X-ray crystallography are crambin (PDB ID: 1CRN), α1- and β-purothionins (PDB IDs: 2PHN and 1BHP), β-hordothionin (PDB ID: 1WUW), and viscotoxin A3 (PDB ID: 1OKH). The first to be determined was the mixed form of crambin.35,49 It showed a distinct capital-Γ shape, with the N terminus forming the first strand in a β-sheet. The architecture of this sheet is additionally strengthened by 2 disulfide bonds.50 After a short stretch of extended conformation, there is a helix-turn-helix motif. In crambin, there is a single disulfide involved in stabilizing the helix-to-helix contacts. At the center of this motif, there is a crucial Arg10 that forms 5 hydrogen bonds to tie together the first strand, the first helix, and the C terminus.50 The first plant defensins were isolated from wheat51 and barley grains,52 and initially called γ-hordothionins. Due to some similarities in cysteine content and molecular weight, they were classified as γ-thionins. Later, the term "γ-thionin" was replaced by "defensin," based on the greater similarity of the primary and tertiary structures of these proteins, and of their antifungal activities, to insect and mammalian defensins rather than to plant thionins.53 Plant defensins belong to a diverse protein superfamily called cis-defensins54 and exhibit a cationic charge, consisting of 45 to 54 aa with 2 to 4 disulfide bonds.
53,55 Plant defensins share similar tertiary structures and typically exhibit a triple-stranded antiparallel β-sheet, enveloped by an α-helix and confined by intramolecular disulfide bonds1 (Figure 2A). This motif is called cysteine-stabilized αβ (CSαβ).56 The CSαβ defensins were classified into 3 groups based on their sequence, structure, and functional similarity. Defensins are known for their antimicrobial activity at low micromolar concentrations against gram-positive and gram-negative bacteria,57 fungi,58 viruses, and protozoa.59 In addition, they present protein-inhibitory, insecticidal, and antiproliferative activity, acting as ion-channel blockers, and are also associated with the inhibition of pathogen protein synthesis.60 Plant defensins also act in the regulation of signal transduction pathways and induce inflammatory processes, in addition to wound healing, proliferation control, and chemotaxis.61 In general, plant defensins do not present high toxicity to human cells and have in vivo efficacy records with relevant therapeutic potential, and can be applied in treatments associated with traditional medicine.62 Cools et al63 reported that a peptide derived from a plant defensin (HsAFP1) acted synergistically with caspofungin (an antimycotic), in vivo and in vitro, against the formation of Candida albicans biofilm on polystyrene and catheter substrates, indicating that the HsAFP1 variant presented a strong antifungal potential in the proposed treatment. Other biotechnological applications of defensins have been described, as in the case of EcgDf1, which was isolated from a legume (Erythrina crista-galli), heterologously expressed in Escherichia coli, and purified. EcgDf1 inhibited the growth of various plant and human pathogens (such as Candida albicans and Aspergillus niger, and the plant pathogens Clavibacter michiganensis ssp. michiganensis, Penicillium expansum, Botrytis cinerea, and Alternaria alternata).
64 Due to these features, EcgDf1 is a candidate for the development of antimicrobial products for both agriculture and medicine.64 Non-specific lipid transfer proteins (ns-LTPs) were first isolated from potato tubers65 and have since been identified in diverse terrestrial plant species. They comprise a large gene family and are abundantly expressed in most tissues, but absent in most basal plant groups, such as the chlorophyte and charophyte green algae.66 They generally include an N-terminal signal peptide that directs the protein to the apoplastic space.67 Some LTPs have a C-terminal sequence that allows their post-translational modification with a glycosylphosphatidylinositol molecule, facilitating the integration of the LTP on the extracellular side of the plasma membrane. The ns-LTPs are small proteins that were thus named because of their function of transferring lipids between different membranes, carrying lipids non-specifically (the list includes phospholipids, fatty acids, their acyl-CoAs, and sterols). They consist of approximately 100 aa and are relatively larger than other AMPs, such as defensins. Depending on their sizes, LTPs may be classified into 2 subfamilies, LTP1 and LTP2, with relative molecular weights of 9 and 7 kDa, respectively.68,69 The limited sequence conservation rendered this classification inadequate; thus, a modified and expanded classification system was proposed, presenting 5 main types (LTP1, LTP2, LTPc, LTPd, and LTPg) and 5 additional types with a smaller number of members (LTPe, LTPf, LTPh, LTPj, and LTPk).66 The new classification system is not based on molecular size but rather on (1) the position of a conserved intron, (2) the identity of the amino acid sequence, and other criteria (Supplementary Figure S2). Although this latter classification system is the most recent, the conventional classification into LTP1 and LTP2 types has been maintained by most working groups.
Lipid transfer protein nomenclature has been confusing and without consistent guidelines or standards. There are several examples where specific LTPs receive different names in different scientific articles. The lack of a robust terminology sometimes makes it quite difficult, extremely time-consuming, and frustrating to compare LTPs with different roles/functions.67 Therefore, an additional nomenclature was proposed by Salminen et al,67 naming LTPs as follows: AtLTP1.3, OsLTP2.4, HvLTPc6, PpLTPd5, and TaLTPg7, with the first 2 letters indicating the plant species (eg, At = Arabidopsis thaliana, Pp = Physcomitrella patens); LTP1, LTP2, and LTPc indicating the type; and the last digit (here 3-7) regarding the specific number given to each gene or protein within a given LTP type. For the sake of clarity, the authors recommend the inclusion of a point between the type specification (LTP1 and LTP2) and the gene number. For LTPc, LTPd, LTPg, and the other types of LTP defined with a letter, the punctuation mark is not recommended. This latter classification system is currently recommended, as it comprises several features of LTPs and is more robust than the previous classification systems. Lipid transfer proteins are small cysteine-rich proteins, having 4 to 5 helices in their tertiary structure (Figure 2B), which is stabilized by several hydrogen bonds. Such a folding gives LTPs a hydrophobic cavity to bind lipids through hydrophobic interactions. This structure is stabilized by 4 disulfide bridges formed by 8 conserved cysteines, similar to defensins, although bound by cysteines in different positions. The disulfide bridges promote LTP folding into a very compact structure, which is extremely stable against different temperatures and denaturing agents.
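As a rough illustration of the Salminen et al naming rules quoted above (a two-letter species prefix, a type, and a gene number, with a point only after the numeric types LTP1/LTP2), a small parser could look like the sketch below. The regular expression and function names are assumptions for illustration, not part of the original nomenclature proposal.

```python
import re

# Sketch of a parser for LTP names such as AtLTP1.3 or HvLTPc6.
LTP_NAME = re.compile(
    r"^(?P<species>[A-Z][a-z])"     # two-letter species prefix, eg At, Os, Hv
    r"LTP(?P<ltp_type>1|2|[a-z])"   # numeric type (1, 2) or letter type (c, d, g, ...)
    r"\.?(?P<gene>\d+)$"            # optional point, then the gene number
)

def parse_ltp(name: str):
    """Split an LTP name into (species prefix, type, gene number)."""
    m = LTP_NAME.match(name)
    if m is None:
        raise ValueError(f"not a recognized LTP name: {name}")
    return m.group("species"), m.group("ltp_type"), int(m.group("gene"))

print(parse_ltp("AtLTP1.3"))  # ('At', '1', 3)
print(parse_ltp("HvLTPc6"))   # ('Hv', 'c', 6)
```

Note that this lenient sketch also accepts forms the proposal discourages (eg, a point after a letter type); a stricter grammar would branch on the type before allowing the point.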
[70-72] These foldings provide a different specificity of lipid binding at the LTP binding site, where the LTP2 structure is relatively more flexible and presents a lower lipid specificity when compared with LTP1.34 The first 3D structure of an LTP was established for TaLTP1.1, purified from wheat (Triticum aestivum) seeds, based on 2D and 3D 1H-NMR data in aqueous solution.73,74 Currently, several 3D structures of LTPs have been determined, either by nuclear magnetic resonance (NMR) or by X-ray crystallography, in their free, unbound form or in complexes with ligands. The heveins were first identified in 1960 in the rubber tree (Hevea brasiliensis), but their sequence was determined only later, when a similarity was detected to the chitin-binding domain of an agglutinin isolated from Urtica dioica (L.),75 with 8 cysteine residues forming a typical Cys motif.76 The primary structure of hevein consists of 29 to 45 aa, positively charged, with abundant glycine (6) and cysteine (8-10) residues,76 and aromatic residues.31,77 The chitin-binding domain is a determinant component in the identification of hevein-like peptides, whose binding site is represented by the amino acid sequence SxFGY/SxYGY, where x regards any amino acid.76,78 Most heveins have a coil-β1-β2-coil-β3 structure, which occurs with variations of the secondary structural motif in the presence of turns in the 2 long coils and in the β3 chain.31 Antiparallel β-chains form the central β-sheet of the hevein motif, with 2 long coils stabilized by disulfide bonds (Figure 2C). Although the presence of chitin has not been identified in plants, there are chitin-like structures present in proteins that exhibit a strong affinity to this polysaccharide, isolated from different plant sources.
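The SxFGY/SxYGY binding-site pattern above lends itself directly to regex scanning, which the review itself mentions as an analysis approach for AMP families. A minimal sketch, with a made-up toy sequence (the function name and example are illustrative assumptions):

```python
import re

# Scan a sequence for the hevein chitin-binding site motif SxFGY/SxYGY
# (x = any amino acid), expressed as the single pattern S.[FY]GY.
CHITIN_BINDING = re.compile(r"S.[FY]GY")

def find_binding_sites(seq: str):
    """Return (start position, matched substring) for each motif hit."""
    return [(m.start(), m.group()) for m in CHITIN_BINDING.finditer(seq.upper())]

# Toy sequence containing one SxFGY-type site at position 2:
print(find_binding_sites("ACSQFGYRKW"))  # [(2, 'SQFGY')]
```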
79 The presence of 3 aromatic amino acids in the chitin-binding domain favors chitin binding by providing stability to the hydrophobic C-H groups and the π-electron system through van der Waals forces, as well as through hydrogen bonds between serine and the N-acetylglucosamine (GlcNAc) present in the chitin structure.76,77 This domain is commonly found in chitinases of classes I to V, in addition to other plant antimicrobial proteins, such as lectins and PR-4 (pathogenesis-related protein 4) members.80,81 It may also occur in other proteins that bind to the polysaccharide chitin,80 such as the antimicrobial proteins Ac-AMP1 and Ac-AMP2 of Amaranthus caudatus (Amaranthaceae) seeds, which are homologous to hevein but lack the C-terminal glycosylated region.82 Plant chitinases (class I) have hevein-like domains, called HLDs. Due to the similar structural epitopes between chitinases and heveins, they are responsible for a cross-reactive syndrome (the latex-fruit syndrome).83,84 Among the several classes of proteins mentioned, those with a high degree of similarity to hevein are chitinases I and IV.76 Chitinases are known to play an essential role in plant defense against pathogens,85 also inhibiting fungal growth in vitro,86 especially when combined with β-1,3-glucanases.87 They also interfere with the growth of hyphae, resulting in abnormal ramification, delay, and swelling in their stretching.81 However, it has been shown that heveins have a higher inhibitory potential than chitinases and that their antifungal effect is not related only to the presence of chitinases88; Pn-AMP1 and Pn-AMP2, AMPs with hevein domains, have potent antifungal activities against a broad spectrum of fungi, including those without chitin in their cell walls.88,89 Modes of action of chitinases usually include degradation and disruption of the fungal cell wall and plasma membrane due to their hydrolytic action, causing extravasation of plasma particles.
21,89 Therefore, heveins have good antifungal activity, while only a few are active against bacteria, most of them with low activity. Another role of hevein-like chitinases regards their antagonistic effect in triggering the aggregation of rubber particles in the latex extraction process in rubber trees. Unlike heveins, other chitinases inhibit rubber particle aggregation; however, their action in conjunction with other proteins (β-1,3-glucanase) increases the effect of β-1,3-glucanase on rubber particle aggregation.90 A study by Shi et al91 found that the interaction of the protein network related to the antipathogenic activity released by lutoids (lysosomal microvacuoles in latex) is essential in sealing laticiferous cells (cells that produce and store latex), providing not only a physical barrier but also a biochemical barrier used by laticiferous cells affected by pathogen invasion. Knottins are part of the cysteine-rich peptides (CRPs) superfamily, sharing the cysteine-knot motif and therefore resembling other families such as defensins, heveins, and cyclotides.92 Their structure was initially identified by crystallography of carboxypeptidase inhibitors isolated from potato, showing the cysteine-knot motif with 39 aa and 6 cysteine residues.93 They are also called "cysteine-knot peptides," "inhibitor cysteine-knot peptides," or even "cysteine-knot miniproteins," because their mature peptide presents fewer than 50 aa, forming 3 interconnected disulfide bonds in the cysteine-knot motif and characterizing a particular scaffold.92 This conformation confers thermal stability at high temperatures. For example, the cysteine-stabilized β-sheet (CSB) motif derived from knottins remains stable at approximately 100°C with only 2 disulfide bonds.94 The knottins may have a linear or cyclic conformation; however, both exhibit connectivity between the cysteines at positions 1-4, 2-5, and 3-6, with the last bridge threading the ring formed by the other two92 (Figure 2D).
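The 1-4, 2-5, 3-6 cysteine pairing described above can be sketched programmatically. The function and toy sequence below are hypothetical illustrations, assuming a mature peptide with exactly 6 cysteines.

```python
# Given a peptide with exactly 6 cysteines, pair them with the knottin
# connectivity described in the text (C1-C4, C2-C5, C3-C6) and return
# the residue positions (0-based) of each disulfide bridge.
def knottin_bridges(seq: str):
    cys = [i for i, aa in enumerate(seq.upper()) if aa == "C"]
    if len(cys) != 6:
        raise ValueError(f"expected 6 cysteines, found {len(cys)}")
    # Pair the 1st with the 4th, 2nd with 5th, 3rd with 6th cysteine.
    return [(cys[0], cys[3]), (cys[1], cys[4]), (cys[2], cys[5])]

# Toy sequence with cysteines at positions 1, 4, 8, 12, 16, and 20:
toy = "ACGGCAKKCGGKCGGKCGGKC"
print(knottin_bridges(toy))  # [(1, 12), (4, 16), (8, 20)]
```

The same pairing scheme applies to the cyclic knottins (and, per the later cyclotide section, to the cyclic cysteine knot), with the added head-to-tail backbone cyclization.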
Knottins have different functions, acting as signaling molecules,95 in the response against biotic and abiotic stresses,96 in root growth97 and symbiotic interactions, as well as showing antimicrobial activity against bacteria,98 fungi,99 and viruses,100 and insecticidal activity,101 among others. The antimicrobial activity of knottins has been attributed to action on functional components of the plasma membrane, leading to alterations of lipids, ion flux, and exposed charge.99 The accumulation of peptides on the surface of the membrane results in the weakening of the pathogen membrane,102 producing transient, toroidal perforations.99 In the course of a large-scale survey to identify novel AMPs from Australian plants,103,104 an AMP with no sequence homology was purified. Its complementary DNA (cDNA), containing the complete peptide coding region, was cloned from Macadamia integrifolia (Proteaceae) seeds. The peptide was named MiAMP1, being highly basic, with an estimated isoelectric point (pI) of 10 and a mass of 8 kDa. The MiAMP1 precursor is 102 aa long, including a 26 aa signal peptide in the N-terminal region, bound to a 76 aa mature region with 6 cysteine residues.105 Its 3D structure was determined using NMR spectroscopy,104 revealing a unique conformation among plant AMPs, with 8 β-strands arranged in 2 Greek key motifs, forming a Greek key β-barrel (Figure 2E). Due to its particularities, MiAMP1 was classified as a new structural family of plant AMPs, and the name β-barrelins was proposed for this class.104 This structural fold resembles a superfamily of proteins called γ-crystallin-like, characterized by the βγ-crystallin precursors.106 This family includes AMPs from other organisms, for example, WmKT, a toxin produced by the wild yeast Williopsis mrakii.
107 MiAMP1 exhibited in vitro antimicrobial activity against various phytopathogenic fungi, oomycetes, and gram-positive bacteria,103 with a concentration range of 0.2 to 2 μM generally required for 50% growth inhibition (IC50). In addition, the transient expression of MiAMP1 in canola (Brassica napus) provided resistance against blackleg disease, caused by the fungus Leptosphaeria maculans,108 making MiAMP1 potentially useful for genetic engineering aiming at disease resistance in crop plants. There are few scientific publications on macadamia-like peptides, maybe because they prevail in primitive plant groups (eg, lycophytes and gymnosperms, to early angiosperms such as Amborella and Papaver), being apparently absent in derived angiosperms (eg, Asteridae and Brassicaceae, such as Arabidopsis thaliana). On the other hand, they have been identified in some monocots (as Zantedeschia, Zea, and Sorghum).109 In fact, peptides similar to MiAMP1 appear to play a role in the defense against pathogens in gymnosperms, including species of economic importance (as Pinus and Picea), thus deserving attention for their biotechnological potential.109 Four closely related AMPs (Ib-AMP1, Ib-AMP2, Ib-AMP3, and Ib-AMP4) were isolated from seeds of Impatiens balsamina (Balsaminaceae), with antimicrobial activity against a variety of fungi and bacteria and low toxicity to human cells in culture. These AMPs are the smallest isolated from plants to date, consisting of only 20 aa in length. The Ib-AMPs are highly basic and contain 4 cysteine residues that form 2 disulfide bonds. Interestingly, they have no significant homology with other AMPs available in public databases. Sequencing of cDNAs isolated from I. balsamina revealed that all 4 peptides are encoded within a single transcript.
Concerning the predicted precursor of the Ib-AMP protein, it consists of a prepeptide followed by 6 mature peptide domains, each of them flanked by propeptide domains ranging from 16 to 35 aa in length (Supplementary Figure S3). This primary structure, with repeated domains of alternating basic peptides and acidic propeptide domains, has to date not been reported in other plant species.110 Patel et al111 conducted an experiment to purify Ib-AMP1 from seeds of Impatiens balsamina. After purification, the secondary structure of this peptide was examined by circular dichroism (CD). The results revealed a peptide that may include a β-turn but shows no evidence of either helical or β-sheet structure over a range of temperatures and pH. Structural information from 2D 1H-NMR was obtained in the form of proton-proton internuclear distances, inferred from nuclear Overhauser enhancements (NOEs), and dihedral angle restraints from spin-spin coupling constants, which were used for distance geometry calculations. Owing to the difficulty of obtaining the correct disulfide connectivity by chemical methods, the authors built and performed 3 separate calculations: (1) a model with no disulfides; (2) another with the predicted disulfide bonds; and (3) a model with an alternative disulfide connectivity, as assigned from the nuclear Overhauser effect spectroscopy (NOESY) NMR spectra. As a result, 2 hydrophilic patches were observed at opposite ends and on opposite sides of the models, whereas in between them a large hydrophobic patch was identified. However, the study did not conclude which of the 3 models would be the most likely representative of Ib-AMP1, reporting only that the cysteines are necessary for maintaining the structure.
Based on the experiment performed by Patel et al,111 the present work built 3 different models: model 1, without disulfide bonds, and 2 other models with different disulfide connections: model 2, with the NMR prediction by Patel et al111 (Cys6-Cys16 and Cys7-Cys20), and model 3, with the disulfide bond partner prediction by DiANNA (Cys7-Cys16 and Cys6-Cys20). Calculations have shown that, although the peptide is small, the cysteines constrain part of it to adopt a well-defined main-chain conformation. From residues 4 to 20 (except 11), the main chain is well defined, whereas residues 1 to 3 in the N-terminal region present few restrictions and appear to be more flexible (Supplementary Figure S4). Analyzing the RMSD (root mean square deviation), we observed that all the models lost the initial conformation and that, among them, model 3 was the most stable. Models 1 and 2 showed a similar pattern (Supplementary Figure S5), as in the models of Patel et al,111 although model 1 was the most flexible. Little is known about the mode of action of Impatiens-like AMPs. Lee et al112 investigated the antifungal mechanism of Ib-AMP1, noting that, when oxidized (bound by disulfide bridges), there occurs a 4-fold increase in antifungal activity against Aspergillus flavus and Candida albicans, as compared with reduced Ib-AMP1 (without disulfide bridges). Confocal microscopy analyses have shown that Ib-AMP1 can either bind to the cell surface or penetrate cell membranes, indicating an antifungal activity that inhibits a distinct cellular process, rather than ion-channel or membrane-pore formation. Fan et al113 reported that the antimicrobial activity of Ib-AMP4 depends on a β-sheet configuration to enable insertion into the lipid membrane, thus killing the bacteria through a non-lytic mechanism.114 Current approaches aim to make changes in Ib-AMPs to improve their antimicrobial activity.
As an example, synthetic variants of Ib-AMP1 were fully active against yeasts and fungi, where the replacement of amino acid residues by arginine or tryptophan improved the antifungal activity more than twofold.115 Another study involving AMP modification generated a synthetic peptide without the disulfide bridges (ie, a linear analog of Ib-AMP1), which showed an antimicrobial specificity 3.7 to 4.8 times higher than that of wild-type Ib-AMP1.116 Puroindolines are small basic proteins that contain a single domain rich in tryptophan. These proteins were isolated from wheat endosperm, have a molecular mass around 13 kDa, and a calculated isoelectric point higher than 10. At least 2 main isoforms (called PIN-a and PIN-b) are known, which are encoded by the Pina-D1 and Pinb-D1 genes, respectively. These genes share 70.2% identical coding regions but exhibit only 53% identity in the 3′ untranslated region.117 Both PIN-a and PIN-b contain a structure with 10 conserved cysteine residues and a tertiary structure similar to that of LTPs, consisting of 4 α-helices separated by loops of varying lengths, with the tertiary structure joined by 5 disulfide bonds, 4 of which are identical to those of ns-LTPs.117 The conformation of the 2 PIN isoforms was studied by infrared and Raman spectroscopy. Both PIN-a and PIN-b have similar secondary structures, comprising approximately 30% helices, 30% β-sheets, and 40% non-ordered structures at pH 7. It has been proposed that the folding of both PINs is highly dependent on the pH of the medium. The reduction of the disulfide bridges results in a decrease of PIN solubility in water and in an increment of the β-sheet content by about 15%, at the expense of the α-helix content.118 No high-resolution structure for any of the PIN isoforms is available, bringing challenges to understanding the function of their hydrophobic regions, with some evidence coming only from partially homologous peptides.
117 However, Wilkinson et al119 proposed a theoretical model for several sequences of this AMP. Puroindolines are proposed to be functional components of the wheat grain hardness loci, controlling grain texture, besides having antifungal activity.[120-123] Although the biological function of PINs is unknown, their involvement in lipid binding has been proposed. While LTPs bind to hydrophobic molecules in a large cavity, PINs interact only with lipid aggregates, that is, micelles or liposomes, through a single stretch of tryptophan residues. This stretch of tryptophan residues is especially significant in the main form, PIN-a (WRWWKWWK), while it is truncated in the smaller form, PIN-b (WPTWWK).[124-126] Puroindolines form protein aggregates in the presence of membrane lipids, and the organization of such aggregates is controlled by the lipid structure. In the absence of lipids, these proteins may aggregate, but there is no accurate information on the relationship between aggregation and interaction with lipids. The antimicrobial activity of PINs is targeted to cell membranes. Charnet et al127 indicated that PINs are capable of forming ion channels, exhibiting some selectivity for monovalent cations, in artificial and biological membranes; stress and Ca2+ ions modulate the formation and/or opening of the channels. Puroindolines may also be membranotoxins, which may play a role in the plant defense mechanism against microbial pathogens. Morris128 reported that PIN-a and PIN-b act through similar but somewhat different modes, which may involve "membrane binding, membrane disruption and ion channel formation" or "intracellular nucleic acid binding and metabolic disruption." Natural and synthetic mutants have allowed the identification of PINs as key elements for antimicrobial activity. Snakins are CRPs first identified in potato (Solanum tuberosum).
129,130 Due to their sequence similarity to GASA (Gibberellic Acid Stimulated in Arabidopsis) proteins, the snakins were classified as members of the Snakin/GASA family.131 The genes that encode these peptides have (1) a signal sequence of approximately 28 aa, (2) a variable region, and (3) a mature peptide of approximately 60 residues, with 12 highly conserved cysteine residues. These cysteine residues maintain the 3D structure of the peptide through disulfide bonds, besides providing stability to the molecule when the plant is under stress129,130,132,133 (Figure 2F; Supplementary Figure S6). Snakins may be expressed in different parts of the plant, like stems, leaves, flowers, seeds, and roots,[134-137] either constitutively or induced by biotic or abiotic stresses. In vitro activity was observed against a variety of fungi, bacteria, and nematodes, acting as a destabilizer of the plasma membrane.129,138,139 Moreover, they have been reported as essential agents in biological processes such as cell division, elongation, cell growth, flowering, embryogenesis, and signaling pathways.[140-143] As reported by Nolde et al,144 the α-hairpinins emerged as a new AMP family with an unusual motif configuration. These peptides prevail in plants, and their structure was resolved based on NMR data obtained from the EcAMP-1 peptide isolated from barnyard grass seeds (Echinochloa crus-galli).144 Some α-hairpinins comprise trypsin inhibitors with a helical hairpin structure, and this group was recently proposed as a new plant AMP family.145 Similar to other AMPs, the amino acid sequences of α-hairpinins are variable. They share the conserved cysteine motif (CX3CX(1-15)CX3C), which forms a helix-loop-helix fold and may have 2 disulfide bridges, C1-C4 and C2-C3.146 Their structural stability is maintained by the formation of hydrogen bonds, so that the side chains have a relatively stable spatial orientation.
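The conserved α-hairpinin motif CX3CX(1-15)CX3C quoted above translates directly into a regular expression. A minimal sketch with made-up toy sequences; the exact regex form and function name are assumptions based on the motif as written (C = cysteine, X = any residue):

```python
import re

# The α-hairpinin cysteine motif CX3CX(1-15)CX3C: a cysteine, any 3
# residues, a cysteine, 1 to 15 residues, a cysteine, any 3 residues,
# and a final cysteine.
HAIRPININ_MOTIF = re.compile(r"C.{3}C.{1,15}C.{3}C")

def has_hairpinin_motif(seq: str) -> bool:
    return HAIRPININ_MOTIF.search(seq.upper()) is not None

# Toy sequence: C, 3 residues, C, 5 residues, C, 3 residues, C.
print(has_hairpinin_motif("AKCGGGCAAAAACGGGCK"))  # True
print(has_hairpinin_motif("ACDEFG"))              # False
```

A stricter pattern would exclude cysteines from the X positions (eg, `[^C]` instead of `.`), depending on how the motif is interpreted.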
147 as reviewed by slavokhotova et al, 148 members of the alpha-hairpinin family have been described in both monocot and dicot groups, including species such as echinochloa crus-galli and zea mays (both poaceae, monocot), fagopyrum esculentum (polygonaceae, eudicot), and stellaria media (caryophyllaceae, eudicot). several transcripts with the α-hairpinin motif exhibit similarities to snakin/gasa genes and are sometimes positioned within this family. although the α-hairpinin structure has been published, its mechanism of action is still not resolved (figure 2j, pdb id: 2l2r). however, studies indicate they present a potential dna binding capacity. 149

the term cyclotide was created at the end of the past century to designate a family of plant peptides of approximately 30 aa in size with a structural motif called the cyclic cysteine knot (cck). 150 this motif is composed of a head-to-tail cyclization stabilized by a knotted arrangement of disulfide bridges, with 6 conserved cysteines connected as follows: c1-2, c3-6, c4-5. 151 cyclotides are generally divided into 2 subfamilies, möbius and bracelets, based on structural aspects. in addition to ccks, 2 loops (between c1-2 and c4-5) have high similarity between both subfamilies, while the other 2 loops (between c2-3 and c3-4) exhibit some conservation within the subfamilies 152,153 (supplementary figure s7). to date, several cyclotides have been identified in eudicot families such as rubiaceae, 154 violaceae, 155 fabaceae, 156 and solanaceae, 157 in addition to some monocots of the poaceae family. 158 in general, cyclotides may act in defense against a range of agents like insects, helminths, or mollusks. in addition, they can also act as ecbolic (inducers of uterine contractions), 154 antibacterial, 159 anti-hiv, 100 and anticancer 160 factors. all these characteristics, added to the stability conferred by the cck motif, make these peptides excellent candidates for drug development.
161, 162

thaumatin-like proteins

thaumatins or tlps belong to the pr-5 (pathogenesis-related protein) family and received this name due to their first isolation from the fruit of thaumatococcus daniellii (maranthaceae) from west africa. 163 thaumatin-like proteins are abundant in the plant kingdom, 164 being found in angiosperms, gymnosperms, and bryophytes, 163 and have also been identified in other organisms, including fungi, 165, 166 insects, 167 and nematodes. 168 thaumatin-like proteins are known for their antifungal activity, either by permeating fungal membranes 169 or by binding and hydrolyzing β-1,3-glucans. 170, 171 in addition, they may act by inhibiting fungal enzymes, such as xylanases, 172 α-amylases, or trypsin. 173 moreover, the expression of tlps is regulated in response to some stress factors, such as drought, 174 injuries, 175 freezing, 176 and infection by fungi, 177, 178 viruses, and bacteria. 179 as to structure, tlps present the characteristic thaumatin signature (ps00316). 180, 181 most tlps have molecular masses ranging from 21 to 26 kda 163 and possess 16 conserved cysteine residues (supplementary figure s8) involved in the formation of 8 disulfide bonds, 182 which help stabilize the molecule, allowing correct folding even under extreme conditions of temperature and ph. 183 thaumatin-like proteins also contain a signal peptide at the n-terminal, which is responsible for targeting the mature protein to a particular secretory pathway. 163 the tertiary structure presents 3 distinct, conserved domains; the central cleft, responsible for the enzymatic activity of the protein, is located between domains i and ii. 184 this central cleft may be of an acidic, neutral, or basic nature depending on the different ligands/receptors bound.
all plant tlps with antifungal activity have an acidic cleft known as the reddd motif due to 5 highly conserved amino acid residues (arginine, glutamic acid, and 3 aspartic acids; supplementary figure s8), which is very relevant for specific receptor binding and antifungal activity. 169, 185, 186 crystal structures have been determined for some plant tlps, such as thaumatin 187 (figure 2g), zeamatin 169 (figure 2h), tobacco pr-5d 185 and osmotin, 186 the cherry allergen pruav2, 188 and the banana allergen ba-tlp, 184 among other tlps. some tlps are known as small tlps (stlps) due to the deletion of peptides in one of their domains, culminating in the absence of the typical central cleft. these stlps exhibit only 10 conserved cysteine residues, forming 5 disulfide bonds, resulting in a molecular weight of approximately 16 to 17 kda. so far, they have been described in monocots, conifers, and fungi. 163, 189, 190 other tlps exhibit an extracellular tlp domain and an intracellular kinase domain, being known as pr5k (pr5-like receptor kinases), 191 and are present in both monocots and dicots. for example, arabidopsis contains 3 pr5k genes, while rice has only 1. 163

with the rapid growth in the number of available sequences, it is unfeasible to handle such an amount of data manually. thus, amp sequences (as well as their biological information) have been deposited in large general databases, such as uniprot and trembl, which contain sequences of multiple origins. 192, 193 in this sense, the construction of databases that deal specifically with amps was an important step to organize the data. during the past decade, several databases were built to support the deposition, consultation, and mining of amps. thus, these databases can be classified into 2 groups: general and specific.
194 the specific databases can be divided into 2 subgroups: those containing only 1 specific group (defensins or cyclotides) and those containing data from a supergroup of peptides (plant, animal, or cyclic peptides) (supplementary table 1 ). in general, both types of databases share some characteristics such as the way that the data are available or the tools to analyze amps. the collection of antimicrobial peptides (campr3) is a database that comprises experimentally validated peptides, sequences experimentally deduced and still those with patent data, besides putative data based on similarity. [195] [196] [197] the current version includes structures and signatures specific to families of prokaryotic and eukaryotic amps. 197 the platform also includes some tools for amp prediction. the antimicrobial peptide database (apd) 198 collects mature amps from natural sources, ranging from protozoa to bacteria, archaea, fungi, plants, and animals, including humans. amps encoded by genes that undergo post-translational modifications are also part of the scope, besides some peptides synthesized by multienzyme systems. the apd provides interactive interfaces for peptide research, prediction, and design, statistical data for a specific group, or for all peptides available in the database. the lamp (database linking antimicrobial peptides) comprises natural and synthetic amps, which can be separated into 3 groups: experimentally validated, predicted, and patented. their data were primarily collected from the scientific literature, including uniprot and other amp-related databases. 199 the database of antimicrobial activity and structure of peptides (dbaasp) 200 contains information about amps from different origins (synthetic or non-synthetic) and complexity levels (monomers and dimers) that were retrieved from pubmed using the following keywords: antimicrobial, antibacterial, antifungal, antiviral, antitumor, anticancer, and antiparasitic peptides. 
this database is manually curated and provides information about peptides with specific, experimentally validated targets. it also includes information on chemical structure, post-translational modifications, modifications in the n/c-terminal amino acids, antimicrobial activities, cell targets, and the experimental conditions in which a given activity was observed, besides information about the hemolytic and cytotoxic activities of the peptides. 200 due to the diversity of amps and the need to accommodate the most representative subclasses, several databases were established, focusing on specific types, sources, or features. amps can be classified in several ways: by biological source, such as bacterial amps (bacteriocins), plant amps, animal amps, and so on; by biological activity: antibacterial, antiviral, antifungal, and insecticidal; and by molecular properties, pattern of covalent bonds, 3d structure, and molecular targets. 201, 202 the "defensins knowledgebase" is a manually curated database focused exclusively on defensins. this database contains information about sequence, structure, and activity, with a web-based interface providing access to information and enabling text-based search. in addition, the site presents information on patents, grants, laboratories, researchers, clinical studies, and commercial entities. 203, 205 the cybase is a database dedicated to the study of sequences and 3d structures of cyclized proteins and their synthetic variants, including tools for the analysis of mass spectral fingerprints of cyclic peptides, also assisting in the discovery of new circular proteins. 205 the phytamp is a database solely dedicated to plant amps, based on information collected from the uniprot database and from the scientific literature through pubmed. 206 plantpepdb is a manually curated database of plant-derived peptides, mostly experimentally validated at the protein level.
it includes data on the physicochemical properties and tertiary structure of amps, also useful to identify their therapeutic potential. simple and advanced composite search options are provided for users to perform dynamic searches and retrieve the desired data. overall, plantpepdb is the first database that comprises detailed analysis and comprehensive information on phyto-peptides from a wide functional range. 207 biological data banks (dbs) are organized collections of data of diverse nature that can be retrieved using different inputs. the management of this information is done through various software and hardware resources, so that retrieval and organization can be performed in a quick and efficient way. 208 considering biological data, information can be classified into (1) primary (sequences), (2) secondary (structure, expression, metabolic pathways, types of drugs, etc), and (3) specialized, for example, containing information on a species or on a class of protein. 209 within this third group, some references to amps can be mentioned, such as campr3 196 and apd, 198 which compile sequence and structure data retrieved from diverse sources, and also the defensin knowledgebase 203 and the cybase, 205 which are dedicated to specific classes of peptides (defensins and cyclotides, respectively), in addition to phytamp, 206 a specific database of plant amps (supplementary table 2). the first step to infer the function of a given sequence (annotation) is to retrieve it from databases. for this purpose, 3 approaches have mostly been used: (1) local alignments, especially using the basic local alignment search tool (blast) 210 and fasta 211 ; searching for specific patterns using (2) regex or (3) hidden markov models (hmm). 194 the first approach has been widely used, since most of the information is available in databases as sequences, together with tools to align them, with blast as the primary tool for doing so.
212 this tool splits the sequence into small pieces (words), comparing them with the database. this approach has a limitation, however: small motifs may not be significantly aligned, as they comprise small portions of the sequences that can be smaller than 20% of the total size. 31, 194 due to the high variability of amps, only a few highly conserved sequences can be identified using this type of inference. to reduce the effects of local alignment limitations, other strategies based on the search for specific patterns were introduced, such as regex 213 (supplementary table 1) and hmm. 214 a regex is a precise way of describing a pattern in a string, in which each position must be specified, although ambiguous characters (or wildcards) can also be used. for example, if we want to find a match for both amino acid sequences caiessk and waiesk, we can use the following expression: [cw]aies{1,2}k. this expression finds a pattern starting with the letter "c" or "w," followed by an "a," an "i," and an "e," then 1 or 2 "s," and ending with a "k." hmms are well known for their effectiveness in modeling the correlations between adjacent symbols, domains, or events, and they have been extensively used in various fields of biological analysis, including pairwise and multiple sequence alignment, base-calling, gene prediction, modeling dna sequencing errors, protein secondary structure prediction, noncoding rna (ncrna) identification, protein and rna structural alignments, acceleration of rna folding and alignment, fast noncoding rna annotation, and many others. using an hmm, a statistical profile is included in the model, calculated from a sequence alignment, and a score is determined site by site, with conserved and variable positions defined a priori. 194, 215

predicting antimicrobial activity

the design of new amps led to the development of methods for the discovery of new peptides, thus allowing new experiments to be done by researchers.
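the regex example above ([cw]aies{1,2}k) can be reproduced directly with python's standard re module; this is a minimal illustration, not part of any amp tool:

```python
import re

# pattern from the text: [cw] matches "c" or "w"; s{1,2} matches 1 or 2 "s"
pattern = re.compile(r"[cw]aies{1,2}k")

for seq in ("caiessk", "waiesk", "maiesk"):
    print(seq, bool(pattern.fullmatch(seq)))
```

caiessk and waiesk match, while maiesk does not, because its first residue falls outside the [cw] character class.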
in this sense, the new challenge lies in the construction of prediction models capable of discovering peptides with desired activities. the apd db has established a prediction interface based on parameters defined by the entire set of peptides available in this database. these values are calculated from natural amps and consider features like length, net charge, hydrophobicity, amino acid composition, and so on. if we take the net charge as an example, the amps deposited in the apd range from -12 to +30. this is the first parameter incorporated into the prediction algorithm. however, most amps have a net charge ranging from -5 to +10, which then becomes the alternative prediction condition. the same method is applied to the remaining parameters. the prediction in apd is performed in 3 main steps. first, the sequence parameters are calculated and compared. if defined as an amp, the peptide can then be classified into 3 groups: (1) rich in particular amino acids, (2) stabilized by disulfide bridges, and (3) linear. finally, sequence alignments are conducted to find the 5 most similar peptides. 198, 216, 217 the advent of machine learning (ml) methods has promoted new possibilities for drug discovery. in ml inference, both a positive and a negative dataset are usually required to train the predictive models. the positive data, in this case, are preferably experimentally validated amps that can be collected from databases, whereas the negative data are randomly selected protein sequences that do not have amp characteristics. 197, 218 machine learning methods based on support vector machines (svm), random forests (rf), and neural networks (nn) have been the most widely used. svm is a supervised ml method that aims to classify data points by maximizing the margin between classes in a high-dimensional space.
219, 220 random forest is a non-parametric tree-based approach that combines the ideas of adaptive neighbors with bagging for efficient adaptive data inference. a neural network is an information-processing paradigm inspired by the way a biological nervous system processes information. it is composed of highly interconnected processing elements (neurons or nodes) working together to solve specific problems. [221] [222] [223]

evaluating proteomic data

regarding the use of amps in peptide therapeutics as an alternative to conventional antimicrobial treatment, new, efficient, and specific antimicrobials are in demand. as aforementioned, amps occur naturally across all classes of life, presenting high potential as therapeutic agents against various kinds of bacteria. 224 the identification of novel amps in databases is primarily dependent on knowledge of specific amps together with sufficient sequence similarity. 225 however, orthologs may be divergent in sequence, mainly because they are under strong positive selection for variation in many taxa, 226 leading to remarkably lower similarity, even in closely related species. in this scenario, where alignment tools are of limited use, one strategy to identify amps relies on proteomic approaches. proteins and peptides are biomolecules responsible for various biochemical events in living organisms, from formation and composition to regulation and functioning. thus, understanding the expression, function, and regulation of the proteins encoded by an organism is fundamental, leading to the so-called "proteomic era." the term "proteome" was first used by marc wilkins in 1994, and it represents the set of proteins encoded by the genome of a biological system (cell, tissue, organ, biological fluid, or organism) at a specific time under certain conditions. 227 protein extraction, purification, and identification methods have significantly advanced our capacity to elucidate many biological questions using proteomic approaches.
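the supervised setup described above (experimentally validated positives versus random negative sequences, featurized by apd-style parameters such as charge and hydrophobicity) can be illustrated with a toy sketch; a nearest-centroid rule stands in here for the svm/rf/nn models used in practice, and the training sequences are invented for illustration:

```python
# toy amp classifier: featurize sequences by cationic and hydrophobic
# residue fractions, then assign a query to the nearer class centroid.
HYDROPHOBIC = set("AILMFWV")
CATIONIC = set("KR")

def features(seq):
    n = len(seq)
    return (sum(a in CATIONIC for a in seq) / n,
            sum(a in HYDROPHOBIC for a in seq) / n)

def centroid(samples):
    xs = [features(s) for s in samples]
    return tuple(sum(v) / len(v) for v in zip(*xs))

def classify(seq, pos_c, neg_c):
    f = features(seq)
    dist = lambda c: sum((a - b) ** 2 for a, b in zip(f, c))
    return "amp-like" if dist(pos_c) < dist(neg_c) else "non-amp"

positives = ["KWKLFKKIGAVLKVL", "WRWWKWWK"]  # hypothetical training positives
negatives = ["DEDEDSSGSG", "QQQNNDDEE"]      # hypothetical negatives
pos_c, neg_c = centroid(positives), centroid(negatives)
print(classify("KKLLRRWW", pos_c, neg_c))    # -> amp-like
```

a real pipeline would add the remaining apd parameters (length, full net charge, amino acid composition) and fit a proper svm or random forest on a large labeled set.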
228, 229 the wide diversity of proteomic analysis methods makes the choice of the correct approach dependent on the type of material and compounds to be analyzed. 213, 230 two main tools are used to isolate proteins: (1) 2-dimensional electrophoresis (2-de) associated with mass spectrometry (ms) and (2) liquid chromatography associated with ms, each with its own limitations. [230] [231] [232] obtaining native proteins is a challenge in proteomics and peptidomics, due to the high protein complexity of samples and the occurrence of post-translational modifications. alternative strategies for the extraction, purification, and biochemical and functional analyses of these molecules have been proposed, favoring access to structural and functional information of hard-to-reach proteins and peptides. 233 based on 2d gels, al akeel et al 234 evaluated 14 spots obtained from seeds of foeniculum vulgare (apiaceae), aiming at proteomic analyses and isolation of small peptides. extracted proteins were subjected to 3 kda dialysis, separation was carried out by deae ion exchange chromatography, and proteins were further identified by 2d gel electrophoresis. one of the spots showed high antibacterial activity against pseudomonas aeruginosa, pointing to promising antibacterial effects, although further research is required to confirm the role of the candidate proteins. for amps, 2-de is challenging due to the low concentration of the peptide molecules captured by this approach, their small sizes, and their ionic features (strongly cationic). in addition, the limited number of available specific databases and the high variability of amps make their identification through proteolysis techniques and mass spectrometry (matrix-assisted laser desorption/ionization, maldi-ms) difficult. moreover, their partial hydrophobicity and surface charges facilitate peptide molecular associations, making analysis difficult by any known proteomic approach.
232 in addition, peptides are most often cleaved from larger precursors by various releasing or processing enzymes. 235 furthermore, the profiles generated do not represent the integral proteome, as 2-de has limited ability to detect proteins at low concentrations, with extreme molecular masses or pis, or that are hydrophobic, including membrane proteins. 236 due to these limitations, multidimensional liquid chromatography-high-performance liquid chromatography (mdlc-hplc) has been successfully employed as an alternative to 2d gels. newly developed techniques and equipment for the separation and detection of proteins and peptides, such as nano-hplc and multidimensional hplc, have improved proteomic evaluation. 237 the molecular mass values obtained are used in computational searches in which they are compared with the in silico digestion results of proteins in databases. in silico approaches, usually using trypsin as the proteolytic agent, may generate a set of unique peptides whose masses are determined by ms. 238, 239 these methodologies are widely adopted for large-scale identification of peptides from ms/ms spectra. 240 theoretical spectra are generated using fragmentation patterns known for specific series of amino acids. the first 2 widely used database search engines were sequest 241 and mascot (matrix science, boston, ma; www.matrixscience.com). 242 they rank peptide matches based on a cross-correlation between the hypothetical spectra and the experimental one. mascot is widely used for peptidomics and proteomics analysis, including amp identification in many organisms and the evaluation of the antibacterial efficacy of new amps. evaluating a new amp against multidrug-resistant (mdr) salmonella enterica, tsai et al 243 used 2d gel electrophoresis and liquid chromatography-electrospray ionization-quadrupole-time-of-flight tandem ms to determine the protein profiles.
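the in silico digestion step mentioned above can be sketched in a few lines; the cleavage rule used here (cut after lysine or arginine, but not before proline) is the usual simplification of trypsin specificity, and real search engines additionally compute theoretical masses and allow missed cleavages:

```python
# minimal in silico tryptic digest: cut after K or R, except before P
def trypsin_digest(protein: str) -> list[str]:
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        if aa in "KR" and (i + 1 == len(protein) or protein[i + 1] != "P"):
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides

print(trypsin_digest("MKWVTFISLLRGA"))  # -> ['MK', 'WVTFISLLR', 'GA']
```

each fragment's theoretical mass would then be compared against the experimentally measured ms values.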
the protein identification was performed using mascot with trypsin as the cleavage enzyme, with the ncbi nr protein database set as the reference. the methodology used in this study indicated that the novel amp might serve as a potential candidate for drug development against mdr strains, confirming the usability of mascot. in a similar way, umadevi et al 244 described the amp profile of black pepper (piper nigrum l.) and its expression upon phytophthora infection using a label-free quantitative proteomics strategy. for protein/peptide identification, ms/ms data were searched against the apd database 245 using an in-house mascot server, allowing full tryptic peptides with a maximum of 3 missed cleavage sites, with carbamidomethylation of cysteine and oxidized methionine included as variable modifications. the apd database was used for amp signature identification, 245 together with phytamp 206 and campr3. 197 to enrich the characterization parameters, the isoelectric point, aliphatic index, and grand average of hydropathy (gravy) were also used 246 (via the protparam tool), besides the net charge from the phytamp database. based on a label-free proteomics strategy, they established for the first time the black pepper peptidome associated with innate immunity against phytophthora, evidencing the usability of proteomics/peptidomics data for amp characterization in any taxa, including plant amps, aiming at the exploitation of these peptides as next-generation molecules against pathogens. 244 other tools use database searching algorithms, such as x!tandem, 247 open mass spectrometry search algorithm (omssa), 248 probid, 249 radars, 250 and so on. these search engines are based on database search but use different scoring schemes to determine the top hit for a peptide match. general information on database search engines, their algorithms, and scoring schemes was reviewed by nesvizhskii et al.
251 despite its efficient ability to identify peptides, database searching presents several drawbacks, like false-positive identifications due to overly noisy spectra and lower-quality peptide scores (related to the short size of peptides). thus, identification is strongly influenced by the amount of protein in the sample, the degree of post-translational modification, the quality of automatic searches, and the presence of the protein in the databases. 252, 253 in this scenario, knowledge of the genome of a specific organism is important to allow the identification of the exact pattern of a given peptide. if an organism has no sequenced genome, it is not searchable using these methods. 235, 240 once the sequences are obtained, bioinformatic tools can be used to predict peptide structures and estimate bioactive peptides. 254 more recently, an interactive and free web software platform, mixprotool, was developed to process multigroup proteomics data sets. this tool is implemented in r (www.r-project.org), providing an integrated data analysis workflow for quality control assessment, statistics, gene ontology enrichment, and other facilities. the mixprotool is compatible with identification and quantification outputs from other programs, such as maxquant and mascot, and results may be visualized as vector graphs and tables for further analysis, in contrast to existing software, such as giapronto. 255 according to the authors, the web tool can be conveniently operated, even by users without bioinformatics expertise, and it is beneficial for mining the most relevant features among different samples. 24

the central tenet of structural biology is that structure determines function. for proteins, it is often said that "function follows form" and "form defines function." therefore, to understand protein function in detail at the molecular level, it is mandatory to know its tertiary structure.
256 experimental techniques for determining structures, such as x-ray crystallography, nmr, electron paramagnetic resonance, and electron microscopy, require significant effort and investment. 257 all the methods mentioned have their own limitations, and the gap between the number of known proteins and the number of known structures is still substantial. thus, there is a need for computational methods to predict protein structures based on knowledge of the sequence. 256 in addition, in recent years, there has been impressive progress in the development of algorithms for protein folding that may aid in the prediction of protein structures from amino acid sequence information. 258 historically, the prediction of a protein structure has been classified into 3 categories: comparative modeling, threading, and ab initio. the first 2 approaches construct protein models by aligning the query sequences with already solved template structures. if templates are absent from the protein data bank (pdb), the models must be constructed from scratch, that is, by ab initio modeling, considered the most challenging way to predict protein structures. 256 in the case of comparative modeling methods, given a target sequence, the programs identify evolutionarily related templates among solved structures based on sequence or profile comparison, thus constructing structure models supported by these previously resolved structures. 259 this approach comprises 4 main steps: (1) fold assignment, which identifies similarity between the target and the structure of the solved template; (2) alignment of the target sequence to the template; (3) generation of a model based on the alignment with the chosen template; and (4) analysis of errors in the generated model. 260 there are several servers and computer programs that automate the comparative modeling process, with swiss-model and modeller figuring as the most used.
261, 262 although automation makes comparative modeling accessible to experts and beginners alike, some adjustments are still needed in most cases to maximize model accuracy, especially for more complex proteins. 262 therefore, some caution must be taken regarding the generated models, considering the resolution and quality of the template used, as well as the homology between the template and the protein of interest. threading methods are based on the observation that known protein structures appear to comprise a limited set of stable folds, and that similar fold elements are often found in evolutionarily distant or unrelated proteins. the most used servers based on this approach are muster, 263 sparks-x, 264 raptorx, 259 prosa-web, 265 and most notably i-tasser. 266 in some cases, incorporating structural information into the comparison between the query sequence and possible templates allows the detection of fold similarity even in the absence of an explicit evolutionary relationship. the prediction of structures from known protein templates is, at first sight, a more straightforward task than the prediction of protein structures from sequences alone. therefore, when no solved template is available, another approach is required, namely ab initio modeling. this method is intended to predict the structure from the sequence information only, without any direct assistance from previously known structures. ab initio modeling aims to predict the best model based on the minimum of a potential energy function, sampling the potential energy surface using various search strategies. 267, 268 such approaches make it challenging to produce the high-resolution models essential for determining the native protein fold and its biochemical interpretation.
nevertheless, subsequently resolved structures and comparisons with previously predicted proteins point to more successful models generated by ab initio methods than those generated by pure energy minimization methods. 256 among the most used servers and programs for ab initio modeling, we highlight rosetta, 257 quark, 269 and touchstone ii. 267 the accuracy of the models calculated by many of these methods is evaluated by cameo (continuous automated model evaluation) 270 and by casp (critical assessment of protein structure prediction). 258 probably the first reasonably accurate ab initio model was built in casp4. since then, sustained progress has been achieved in ab initio prediction, but mainly for small proteins (120 residues or less). in casp11, for the first time, a novel 256-residue protein with less than 5% sequence identity to known structures was modeled with high precision for a sequence of this size. 271 in casp12, significant improvement was reported in 4 areas: contact prediction, free modeling, template-based modeling, and estimating the accuracy of models. the authors report that this improvement is due to the accuracy of modeling and alignment methods, as well as increased data availability for both sequence and structure. 258 given the number of amp structures deposited in the pdb (to date, approximately 1099), comparative modeling is the most used approach. however, when it comes to de novo peptide design, the most recommended choice would be ab initio modeling 272 or a hybrid approach that uses more than 1 modeling method. 273 after the generation of a model, the amp stability should be evaluated using molecular dynamics (md). molecular dynamics comprises the application of computational simulations that predict the changes in the positions and velocities of the constituent atoms of a system over time under given conditions.
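the position/velocity update loop at the heart of md can be illustrated with a deliberately tiny system: one particle on a harmonic "bond" (a force field reduced to a single spring term), integrated with the velocity verlet scheme used, in far richer form, by production md packages; spring constant, mass, and time step are arbitrary illustration values:

```python
# toy md loop: velocity verlet for a 1d harmonic oscillator (k = m = 1)
def velocity_verlet(x, v, dt, k=1.0, m=1.0, steps=1000):
    a = -k * x / m                      # force/mass from the "force field"
    for _ in range(steps):
        x += v * dt + 0.5 * a * dt * dt  # position update
        a_new = -k * x / m               # force at the new position
        v += 0.5 * (a + a_new) * dt      # velocity update
        a = a_new
    return x, v

x, v = velocity_verlet(1.0, 0.0, dt=0.01)
# total energy 0.5*k*x**2 + 0.5*m*v**2 should stay close to the initial 0.5
print(0.5 * x * x + 0.5 * v * v)
```

the near-conservation of total energy over many steps is the property that makes this integrator family standard in md codes; a real simulation sums bonded and nonbonded force-field terms over millions of atoms.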
this calculation is done through a classical approximation based on empirical parameters, called a "force field." 274 while this approximation makes the dynamics of a system containing thousands of atoms numerically accessible, it obviously limits the nature of the processes that can be observed during the simulations. no quantum effects are visible in an md simulation: no chemical bonds are broken, and no orbital interactions, resonance, polarization, or charge-transfer effects occur. 275 molecules, however, are more than static systems. thus, md is a computational technique that can be used for predicting or refining structures, studying the dynamics of molecular complexes, developing drugs, and probing the action of molecular biological systems. 276 molecular dynamics simulation is widely used in protein research, aiming to extract information about the physical properties of individual proteins. the results of such simulations are then compared with experimental results. as these experiments are generally carried out in solvents, it is necessary to simulate the protein in water. these simulations have a variety of applications, such as following the folding of a sequence to its native structure and analyzing the dynamic stability of that structure. 277 the use of md to simulate protein folding processes is one of the most challenging applications, and the simulations must be relatively long (on the order of microseconds to milliseconds) to allow a single folding event to be observed. in addition, the force field used must correctly describe the relative energies of a wide variety of conformations, including unfolded and misfolded states that may occur during the simulation. 275 the considerable application potential led to the implementation of md simulation in many software packages, including gromacs, 278-280 amber, 281 namd, 282 charmm, 283 lammps, 284 and desmond.
285 in addition to those mentioned above, other simulation types are available, such as the monte carlo method, stochastic dynamics, and brownian dynamics. 280 in the last decades, md simulation has become a standard tool in theoretical studies of large biomolecular systems, including dna and proteins, in near-realistic solvent environments. indeed, simulations have proven valuable in deciphering functional mechanisms of proteins and other biomolecules, in uncovering the structural basis of disease, and in the design and optimization of small molecules, peptides, and proteins. 286 historically, the computational cost of this type of calculation has been extremely high, and much research has focused on algorithms that make individual simulations as long or as large as possible. 278 the interplay between a given pathogen (eg, virus, bacterium, fungus) and its host must be studied through a holistic approach. host-pathogen relationships are very complex and occur at every conceivable level, including the cellular/molecular level of both pathogen and host, under given environmental conditions. the most complete possible understanding of these interactions at every level is the ultimate goal of "systems biology" (sb). it comprises a holistic approach, integrating distinct disciplines such as biology, computer science, engineering, bioinformatics, and physics to predict how a given system behaves under given conditions and what the role of each of its parts is. systems biology stands out because it is capable of correlating omics data for the understanding of plant-pathogen interaction. the construction of a plant-pathogen interaction network includes the reconstruction of the metabolic pathways of these organisms and the identification of the degree of pathogenicity, besides the expression of genes and proteins from both plant and pathogen.
the networks can be classified into 5 types: (1) regulatory; (2) metabolic; (3) protein-protein interaction; (4) signaling and regulatory; and (5) signaling, regulatory, and metabolic. 287 each of these networks can be built according to different computational approaches. further studies are also required to construct evolutionary in silico models and to characterize these molecular targets in vitro. 288, 289 studies of protein-protein interactions are essential to understand regulatory processes, 290 and new computational methods with more optimized algorithms are necessary for this purpose, also to remove potential false positives. thus, in-depth studies on the orientation of molecules and the contacts that form a stable complex are of great importance both for understanding plant-pathogen systems and for developing new drugs. 291 understanding the principles by which protein receptors recognize, interact, and associate with molecular substrates or inhibitors is of paramount importance for generating new therapeutic strategies. 292 in modern drug discovery, docking plays an important role in predicting the orientation of a ligand when it is bound to a protein receptor or enzyme, using shape complementarity and electrostatic, van der waals, coulombic, and hydrogen-bond interactions as parameters to quantify or predict a given interaction. 293, 294 molecular docking aims to explore the predominant binding mode(s) of a molecule (protein or ligand) when it binds to a protein with a known 3d structure, based on a scoring function that serves 3 main purposes: the first is to determine the binding mode and the binding site on a protein, the second is to predict the absolute binding affinity between protein and ligand (or another protein) in lead optimization, and the third is virtual screening, which can identify potential drug leads for a given protein target by searching a large ligand or protein database.
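the shape of the physics-based scoring functions mentioned above, energy terms summed over receptor-ligand atom pairs, can be sketched minimally in python. the sketch below combines a lennard-jones (van der waals) term with a coulomb (electrostatic) term; all parameter values and the example atom pairs are illustrative and not taken from any real docking program.

```python
def pair_score(r, eps=0.2, sigma=3.4, q1=0.0, q2=0.0, ke=332.06):
    """toy interaction energy (kcal/mol) for one atom pair at distance r
    (angstrom): lennard-jones term plus coulomb term. parameters are
    illustrative; ke is the usual electrostatic constant in these units."""
    lj = 4.0 * eps * ((sigma / r) ** 12 - (sigma / r) ** 6)
    coulomb = ke * q1 * q2 / r
    return lj + coulomb

def dock_score(pairs):
    """sum pairwise terms over (distance, charge1, charge2) tuples --
    the basic form of a physics-based docking scoring function."""
    return sum(pair_score(r, q1=q1, q2=q2) for r, q1, q2 in pairs)

# two neutral atoms near the lj minimum plus one attractive charge pair
pairs = [(3.8, 0.0, 0.0), (4.0, 0.0, 0.0), (3.0, 0.4, -0.4)]
print(round(dock_score(pairs), 2))
```

a more negative score indicates a more favorable predicted pose; real scoring functions add solvation, entropy, and hydrogen-bond terms, precisely the issues flagged later in the text as still demanding improvement.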
295 protein-protein interactions are essential for cellular and immune function. in many cases, in the absence of an experimentally determined structure of the complex, these interactions must be modeled to obtain an understanding of their structure and molecular basis. 296 few studies on plant-pathogen interactions include docking approaches; most studies focus on drug development for medical purposes. structure-based drug research is a powerful technique for the rapid identification of small molecules against the 3d structure of available macromolecular targets, usually obtained by x-ray crystallography, nmr, or homology modeling. given the abundant information on protein sequences and structures, structural information on specific proteins and their interactions has become crucial for current pharmacological research. 297 a variety of docking algorithms, applicable even in the absence of knowledge about the binding site and with limited backbone movement, have been developed over the past 2 decades. although zdock, 296 rdock, 298 and hex 299 have provided results with high docking precision, the complexes they produce are not very useful for designing inhibitors of protein interfaces, owing to the constraints of rigid-body docking. 294 in this context, more flexible approaches have been developed, which generally examine a very limited number of conformations compared with rigid-body methods. these docking methods first predict the broad surface regions where binding is most likely to occur and then refine the high-affinity sites in the complex structures. 300 the best example is the haddock software, 297 which has succeeded in solving a large number of precise models of protein-protein complexes. a good example of its use is the study of the complex formed between plectasin, a member of the innate immune system, and the bacterial cell wall precursor lipid ii.
the study identified the residues involved in the binding site between the 2 molecules, providing valuable information for planning new antibiotics. 301 however, the absolute energies associated with intermolecular interactions are not estimated with satisfactory accuracy by current algorithms. significant issues such as solvent effects, entropic effects, and receptor flexibility still need to be addressed. nevertheless, some methods, such as moe-dock, 302 gold, 303 glide, 304 flexx, 305 and surflex, 306 which handle side-chain flexibility, have proven effective and adequate in most cases. realistic predictions of interactions between small molecules and receptors still depend on experimental wet-lab validation. 294, 307 despite the current difficulties, there is growing interest in the mechanisms and prediction of the binding of small molecules such as peptides, as they bind to proteins in a highly selective and conserved manner, making them promising new medicinal and biological agents. 308 while both "small molecule docking methods" and "custom protocols" can be used, short peptides are challenging targets because of their high torsional flexibility. 307 protein-peptide docking is generally more challenging than docking of other small molecules, and a variety of methods have been applied so far. however, few of these approaches have been published in a way that can be reproduced with ease. [309] [310] [311] although peptide docking is difficult, a recent focus of basic and pharmacological research has used computational tools with modified peptides to predict the selective disruption of protein-protein interactions. these studies are based on the involvement of a few critical amino acid residues that contribute most to the binding affinity of a given interaction, also called hot spots. 312, 313 despite the number of docking programs, existing algorithms still demand improvements.
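the hot-spot idea described above, a few interface residues dominating binding affinity, is commonly operationalized as a threshold on per-residue energy contributions (eg, from computational alanine scanning, where a ddg of about 2.0 kcal/mol is a widely used cutoff). a minimal sketch, with hypothetical ddg values chosen purely for illustration:

```python
def hot_spots(ddg_by_residue, cutoff=2.0):
    """flag interface residues whose predicted change in binding free
    energy on mutation to alanine (ddg, kcal/mol) meets the cutoff.
    a toy stand-in for computational alanine scanning."""
    return [res for res, ddg in ddg_by_residue.items() if ddg >= cutoff]

# hypothetical per-residue ddg values for a protein-protein interface
interface = {"lys11": 3.1, "trp32": 2.4, "ser45": 0.3, "asp60": 1.1}
print(hot_spots(interface))
```

residues passing the cutoff are the candidates a peptide designer would try to mimic or block when aiming to disrupt a protein-protein interaction.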
however, approaches are being developed to address issues related to scoring, protein flexibility, and explicit-solvent (water) interactions, among others. 314 in this context, capri (critical assessment of predicted interactions) is a community initiative that provides quality assessment of different docking approaches. it started in 2001 and has since aided the development and improvement of docking methodologies. 315 an evaluation of capri carried out in 2016 reported improvements in the integration of different modeling tools with docking procedures, as well as in the use of more sophisticated evolutionary information to rank models. however, adequate modeling of conformational flexibility in interacting proteins remains an essential demand with a crucial need for improvement. 314 different docking programs are currently available, 294 and new alternatives continue to appear. some of these alternatives will disappear, just as others will become the top choices among field users. the molecular docking technique is not often used for amps, because their standard mechanism of action is based on the classical association with the external membrane of the pathogen. despite that, some amps have the ability to bind other proteins and/or enzymes, a feature still scarcely studied. in such cases, molecular docking can be useful. an example of success is the study performed by melo et al, 47 which showed the specific binding of trypsin to a cowpea (vigna unguiculata) thionin, revealing that this interaction occurs in a canonical manner through lys11, located in an extended exposed loop. therefore, further application of docking may bring new evidence about antimicrobial mechanisms, revealing other molecular targets of interest.
it is clear that the combination of databank information with bioinformatic tools (especially those allowing the identification of patterns rather than strict sequence order) has the potential to revolutionize the identification of amps and the prediction of their activity. the data may come from genomic, transcriptomic, or proteomic databases, or from a combination of different information sources (eg, genomics and transcriptomics, transcriptomics and proteomics). supplementary figure s9 presents a schematic flowchart describing the steps for mining, annotation, and structural/functional analysis of amps, in addition to some wet-lab analyses that can be integrated to assess/confirm candidate amps. similar bioinformatic approaches have actually been used to identify potential peptide candidates with anti-sars-cov-2 activity, especially those potentially able to interact with the spike protein and with proteases involved in viral penetration. 316, 317 as emphasized, plant amps show greater diversity and abundance compared with those of other kingdoms. it can be speculated that plants shelter many yet undescribed amp classes, given their vast abundance and isoform diversity. the genomic and peptidic structure of amps can be variable, with few key residues conserved, which makes their identification, classification, and comparison challenging even in the omics age. nevertheless, advances in the generation of new bioinformatics tools and specialized databases have led to new and more efficient approaches for both the identification of primary sequences and molecular modeling, besides the analysis of the stability of the generated models. despite the large availability of omics data and bioinformatics tools, most new plant peptides have been discovered by wet-lab approaches targeting single candidates. high-throughput in silico methods have the potential to transform this scenario, revealing many new candidates, including new or "non-canonical" peptides.
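the pattern-based mining step described above, matching conserved key residues rather than strict sequence order, can be sketched with a regular expression scan. the cysteine-spacing pattern below is hypothetical, loosely inspired by cysteine-rich amp scaffolds; real pipelines use curated motifs, hidden markov models, or trained classifiers, and the example sequences are invented for illustration.

```python
import re

# hypothetical cysteine-spacing pattern (illustrative only): four cysteines
# with variable-length spacers, echoing cys-rich amp scaffolds.
AMP_LIKE = re.compile(r"C.{2,10}C.{3,12}C.{3,12}C")

def screen(sequences):
    """return ids of sequences matching the pattern -- a toy stand-in for
    the pattern-based mining step of an amp discovery pipeline."""
    return [sid for sid, seq in sequences.items() if AMP_LIKE.search(seq)]

candidates = {
    "pep1": "MKCLAAGRCSSNCAKTYCGR",  # cysteine-rich, matches the pattern
    "pep2": "MKALAAGGSSSNAAKTYAGR",  # no cysteines, filtered out
}
print(screen(candidates))
```

in a real pipeline, the surviving candidates would then proceed to the annotation, modeling, and stability-analysis steps outlined in the flowchart, and ultimately to wet-lab confirmation.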
it may also be speculated that a myriad of new peptides exist among even smaller peptides, which are still less considered and more difficult to identify. finally, in silico approaches should in future studies become mandatory for defining the design of wet-lab studies, making identification more efficient and requiring considerably less time to track, identify, and confirm new candidate amps. considering the current pandemic scenario of covid-19, plant amps may be regarded as an important source of antiviral drug candidates, especially considering that some amp categories present not only antiviral effects but also wide-spectrum antimicrobial activity, act as anti-inflammatories, and also induce the immune response. of higher education personnel, biocomputational program), cnpq (brazilian national council for scientific and technological development), and facepe (fundação de amparo à ciência e tecnologia de pernambuco) for fellowships. the project is supported by the interreg italia-slovenia, ise-emh 07/2019 and rc 03/20 from irccs burlo garofolo/italian ministry of health. cass performed the literature review and wrote the manuscript. lz, mol, lmbv, jpbn, jrfn, jdcf, rlos, cjp, ffa, and mfo wrote specific chapters; eak and sc critically revised the text and included relevant suggestions. ambi conceived the review, wrote the introduction and concluding remarks, besides critically revising the manuscript. all authors have read the manuscript and agree to its content. supplemental material for this article is available online.
prediction of protein function from protein sequence and structure plant peptides in defense and signaling plant bioactive peptides: an expanding class of signaling molecules candidalysin is a fungal peptide toxin critical for mucosal infection copsin, a novel peptide-based fungal antibiotic interfering with the peptidoglycan synthesis innate and specific gut-associated immunity and microbial interference the wide world of ribosomally encoded bacterial peptides oral saliva and covid-19 human β-defensin 2 mediated immune modulation as treatment for experimental colitis plant peptides and peptidomics nucleic acids and proteins in plants i the plant peptidome: an expanding repertoire of structural features and biological functions a small peptide modulates stomatal control via abscisic acid in long-distance signalling interaction of pls and pin and hormonal crosstalk in arabidopsis root development peptide signals for plant defense display a more universal role protease inhibitors in plants: genes for improving defenses against insects and pathogens long distance run in the wound response-jasmonic acid is pulling ahead rodríguez-palenzuéla p. 
plant defense peptides overview on plant antimicrobial peptides ethnobotanical bioprospection of candidates for potential antimicrobial drugs from brazilian plants: state of art and perspectives conopeptide characterization and classifications: an analysis using conoserver adaptive hydrophobic and hydrophilic interactions of mussel foot proteins with organic thin films cathelicidins, multifunctional peptides of the innate immunity antimicrobial peptides: discovery, design and novel therapeutic strategies scop: a structural classification of proteins database for the investigation of sequences and structures antimicrobial peptides from plants cyclotides insert into lipid bilayers to form membrane pores and destabilize the membrane through hydrophobic and phosphoethanolamine-specific interactions analysis of two novel classes of plant antifungal proteins from radish (raphanus sativus l.) seeds antifungal plant defensins: mechanisms of action and production h-nmr studies on the structure of a new thionin from barley endosperm: structure of a new thionin antimicrobial peptides from plants arabidopsis thionin-like genes are involved in resistance against the beet-cyst nematode (heterodera schachtii) host defense peptides and their potential as therapeutic agents plant thionins-the structural perspective plant antimicrobial peptides de smet i. plant peptides-taking them to the next level antimicrobial peptides from plants and their mode of action the inhibitory effect of a protamine from wheat flour on the fermentation of wheat mashes characterization and analysis of thionin genes thionin genes specifically expressed in barley leaves antimicrobial peptides as effective tools for enhanced disease resistance in plants identification of a cowpea γ-thionin with bactericidal activity novel thionins from black seed (nigella sativa l.) 
demonstrate antimicrobial activity synthetic and structural studies on pyrularia pubera thionin: a single-residue mutation enhances activity against gram-negative bacteria antimicrobial activity of γ-thionin-like soybean se60 in e. coli and tobacco plants inhibition of trypsin by cowpea thionin: characterization, molecular modeling, and docking toxicity of purothionin and its homologues to the tobacco hornworm, manduca sexta (l.) (lepidoptera:sphingidae) studies on purothionin by chemical modifications full-matrix refinement of the protein crambin at 0.83 å and 130 k γ-purothionins: amino acid sequence of two polypeptides of a new family of thionins from wheat endosperm primary structure and inhibition of protein synthesis in eukaryotic cell-free system of a novel thionin, gammahordothionin, from barley endosperm plant defensins: novel antimicrobial peptides as components of the host defense system the evolution, function and mechanisms of action for plant defensins plant γ-thionins: novel insights on the mechanism of action of a multi-functional class of defense proteins disulfide bridges in defensins comparative analysis of the antimicrobial activities of plant defensin-like and ultrashort peptides against food-spoiling bacteria isolation, purification, and characterization of a stable defensin-like antifungal peptide from trigonella foenum-graecum (fenugreek) seeds antimicrobial peptides: pore formers or metabolic inhibitors in bacteria? plant defensins-prospects for the biological functions and biotechnological properties defensins and paneth cells in inflammatory bowel disease plant defensins: types, mechanism of action and prospects of genetic engineering for enhanced disease resistance in plants benko-iseppon am, cecchetto g. 
gene isolation and structural characterization of a legume tree defensin with a broad spectrum of antimicrobial activity recent advances in the chemistry and biochemistry of plant lipids evolutionary history of the non-specific lipid transfer proteins lipid transfer proteins: classification, nomenclature, structure, and function lipid-transfer proteins in plants purification and characterization of a small (7.3 kda) putative lipid transfer protein from maize seeds surprisingly high stability of barley lipid transfer protein, ltp1, towards denaturant, heat and proteases structural stability and surface activity of sunflower 2s albumins and nonspecific lipid transfer protein involvement of gpi-anchored lipid transfer proteins in the development of seed coats and pollen in arabidopsis thaliana two-and three-dimensional proton nmr studies of a wheat phospholipid transfer protein: sequential resonance assignments and secondary structure three-dimensional structure in solution of a wheat lipid-transfer protein from multidimensional 1h-nmr data. 
a new folding for lipid carriers an unusual lectin from stinging nettle (urtica dioica) rhizomes hevein-like antimicrobial peptides of plants structural basis for chitin recognition by defense proteins: glcnac residues are bound in a multivalent fashion by extended binding sites in hevein domains ginkgotides: proline-rich hevein-like peptides from gymnosperm ginkgo biloba structure and function of chitin-binding proteins structural features of plant chitinases and chitin-binding proteins a novel antifungal peptide from leaves of the weed stellaria media l antimicrobial peptides from amaranthus caudatus seeds with sequence homology to the cysteine/glycinerich domain of chitin-binding proteins overview of plant chitinases identified as food allergens the latex-fruit syndrome the n-terminal cysteine-rich domain of tobacco class i chitinase is essential for chitin binding but not for catalytic or antifungal activity a chitin-binding lectin from stinging nettle rhizomes with antifungal properties biochemical and molecular characterization of three barley seed proteins with antifungal properties hevein: an antifungalprotein from rubber-tree (hevea brasiliensis) latex two hevein homologs isolated from the seed of pharbitis nil l. 
exhibit potent antifungal activity comparative proteomics of primary and secondary lutoids reveals that chitinase and glucanase play a crucial combined role in rubber particle aggregation in hevea brasiliensis the formation and accumulation of protein-networks by physical interactions in the rapid occlusion of laticifer cells in rubber tree undergoing successive mechanical wounding plant cystineknot peptides: pharmacological perspectives: plant cystine-knot proteins in pharmacology refined crystal structure of the potato inhibitor complex of carboxypeptidase a at 2.5 å resolution squash inhibitors: from structural motifs to macrocyclic knottins small signaling peptides in arabidopsis development: how cells communicate over a short distance use of scots pine seedling roots as an experimental model to investigate gene expression during interaction with the conifer pathogen heterobasidion annosum (p-type) tying the knot: the cystine signature and molecular-recognition processes of the vascular endothelial growth factor family of angiogenic cytokines a cactus-derived toxin-like cystine knot peptide with selective antimicrobial activity circular proteins from plants and fungi circulins a b. novel human immunodeficiency virus (hiv)-inhibitory macrocyclic peptides from the tropical tree chassalia parvifolia isolation, solution structure, and insecticidal activity of kalata b2, a circular protein with a twist: do möbius strips exist in nature? purification, characterisation and cdna cloning of an antimicrobial peptide from macadamia integrifolia miamp1, a novel protein from macadamia integrifolia adopts a greek key β-barrel fold unique amongst plant antimicrobial proteins peptides of the innate immune system of plants. part ii. 
biosynthesis, biological functions, and possible practical applications nmr structure of the streptomyces metalloproteinase inhibitor, smpi, isolated from streptomyces nigrescens tk-23: another example of an ancestral βγ-crystallin precursor structure ancestral beta gamma-crystallin precursor structure in a yeast killer toxin enhanced quantitative resistance to leptosphaeria maculans conferred by expression of a novel antimicrobial peptide in canola (brassica napus l.) primitive defence: the miamp1 antimicrobial peptide family a novel family of small cysteine-rich antimicrobial peptides from seed of impatiens balsamina is derived from a single precursor protein structural studies of impatiens balsamina antimicrobial protein (ib-amp1) antifungal mechanism of a cysteine-rich antimicrobial peptide, ib-amp1, from impatiens balsamina against candida albicans antimicrobial peptide hybrid fluorescent protein based sensor array discriminate ten most frequent clinic isolates ib-amp4 insertion causes surface rearrangement in the phospholipid bilayer of biomembranes: implications from quartz-crystal microbalance with dissipation antifungal activity of synthetic peptides derived from impatiens balsamina antimicrobial peptides ib-amp1 and ib-amp4 antimicrobial specificity and mechanism of action of disulfide-removed linear analogs of the plant-derived cys-rich antimicrobial peptide ib-amp1 triticum aestivum puroindolines, two basic cystine-rich seed proteins: cdna sequence analysis and developmental gene expression determination of the secondary structure and conformation of puroindolines by infrared and raman spectroscopy sequence diversity and identification of novel puroindoline and grain softness protein alleles in elymus, agropyron and related species puroindolines: their role in grain hardness and plant defence molecular genetics of puroindolines and related genes: allelic diversity in wheat and other grasses isolation, characterization and antimicrobial activity at 
diverse dilution of wheat puroindoline protein the wheat puroindoline genes confer fungal resistance in transgenic corn: the puroindolines confer corn slb resistance mini review: structure, biological and technological functions of lipid transfer proteins and indolines, the major lipid binding proteins from cereal kernels puroindolines: the molecular genetic basis of wheat grain hardness plant lipid binding proteins: properties and applications puroindolines form ion channels in biological membranes the antimicrobial properties of the puroindolines, a review snakin-1, a peptide from potato that is active against plant pathogens snakin-2, an antimicrobial peptide from potato whose gene is locally induced by wounding and responds to pathogen infection snakin: structure, roles and applications of a plant antimicrobial peptide the new casn gene belonging to the snakin family induces resistance against root-knot nematode infection in pepper radiation damage and racemic protein crystallography reveal the unique structure of the gasa/snakin protein superfamily gasa5, a regulator of flowering time and stem growth in arabidopsis thaliana isolation and characterization of the tissue and development-specific potato snakin-1 promoter inducible by temperature and wounding the gibberellic acid stimulatedlike gene family in maize and its role in lateral root development analysis of expressed sequence tags (ests) from avocado seed (persea americana var. drymifolia) reveals abundant expression of the gene encoding the antimicrobial peptide snakin increased tolerance to wheat powdery mildew by heterologous constitutive expression of the solanum chacoense snakin-1 gene recombinant production of snakin-2 (an antimicrobial peptide from tomato) in e. 
coli and analysis of its bioactivity geg participates in the regulation of cell and organ shape during corolla and carpel development in gerbera hybrida two osgasr genes, rice gast homologue genes that are abundant in proliferating tissues, show different expression patterns in developing panicles gasa4, one of the 14-member arabidopsis gasa family of small polypeptides, regulates flowering and seed development identification of novel genes potentially involved in somatic embryogenesis in chicory (cichorium intybus l.) disulfide-stabilized helical hairpin structure and activity of a novel antifungal peptide ecamp1 from seeds of barnyard grass (echinochloa crus-galli) buckwheat trypsin inhibitor with helical hairpin structure belongs to a new family of plant defence peptides novel antifungal αhairpinin peptide from stellaria media seeds: structure, biosynthesis, gene structure and evolution design, synthesis and docking of linear and hairpin-like alpha helix mimetics based on alkoxylated oligobenzamide defense peptide repertoire of stellaria media predicted by high throughput next generation sequencing influence of cysteine and tryptophan substitution on dna-binding activity on maize α-hairpinin antimicrobial peptide plant cyclotides: a unique family of cyclic and knotted proteins that defines the cyclic cystine knot structural motif plants defense-related cyclic peptides: diversity, structure and applications discovery, structure, function, and applications of cyclotides: circular proteins from plants cyclotide evolution: insights from the analyses of their precursor sequences, structures and distribution in violets (viola) isolation of oxytocic peptides from oldenlandia affinis by solvent extraction of tetraphenylborate complexes and chromatography on sephadex lh-20 fractionation protocol for the isolation of polypeptides from plant biomass discovery of cyclotides in the fabaceae plant family provides new insights into the cyclization, evolution, and distribution 
of circular proteins cyclotides associate with leaf vasculature and are the products of a novel precursor in petunia (solanaceae) discovery and characterization of novel cyclotides originated from chimeric precursors consisting of albumin-1 chain a and cyclotide domains in the fabaceae family the cyclotide cycloviolacin o2 from viola odorata has potent bactericidal activity against gram-negative bacteria cyclotides: a novel type of cytotoxic agents potential therapeutic applications of the cyclotides and related cystine knot mini-proteins disulfide-rich macrocyclic peptides as templates in drug design the superfamily of thaumatinlike proteins: its origin, evolution, and expression towards biological function plant thaumatin-like proteins: function, evolution and biotechnological applications some fungi express beta-1,3-glucanases similar to thaumatin-like proteins lentinula edodes tlg1 encodes a thaumatin-like protein that is involved in lentinan degradation and fruiting body senescence plant stress proteins of the thaumatin-like family discovered in animals plant pathogenesis-related proteins: molecular mechanisms of gene expression and protein function the crystal structure of the antifungal protein zeamatin, a member of the thaumatin-like, pr-5 protein family several thaumatin-like proteins bind to β-1,3-glucans some thaumatin-like proteins hydrolyse polymeric beta-1,3-glucans tlxi, a novel type of xylanase inhibitor from wheat (triticum aestivum) belonging to the thaumatin family zeamatin inhibits trypsin and alpha-amylase activities drought-inducible-but aba-independent-thaumatin-like protein from carrot (daucus carota l.) 
key: cord-034834-zap82dta authors: bai, xiao; sun, huaping; lu, shibao; taghizadeh-hesary, farhad title: a review of micro-based systemic risk research from multiple perspectives date: 2020-06-27 journal: entropy (basel) doi: 10.3390/e22070711 sha: doc_id: 34834 cord_uid: zap82dta the covid-19 pandemic has brought about a heavy impact on the world economy, which arouses growing concerns about potential systemic risk, taking place in countries and regions. at this critical moment, it makes sense to interpret the systemic risk from the perspective of the financial crisis framework. by combing the latest research on systemic risks, we may arrive at some precautions relating to the current events. this literature review verifies the origin of systemic risk research. by comparing the retrieved and screened systemic literature with the relevant research on the financial crisis, more focus on the micro-foundations of systemic risk has been discovered. besides, the measurement methods of systemic risks and the introduction of interdisciplinary methods have made the research in this field particularly active.
this paper synthesizes previous research to arrive at an appropriate definition of systemic risk and surveys the systemic risk literature along two lines: first, it organizes the field according to the sub-branch areas within the financial discipline and the relevant interdisciplinary research methods, which helps scholars within and outside the discipline gain a more systematic understanding of the research in this field; second, it identifies the research directions along which the field can be expanded.

the study of systemic risk increased sharply after the subprime crisis. although systemic risk research overlaps with earlier financial crisis research, it covers not only the information asymmetry, liquidity, and crisis transmission pathways involved in financial crises, but also the stability problems caused by the structure of the system itself, the quantification of systemic risk, and early-warning issues. as a result, further research on market microstructure and agent heterogeneity has also been triggered. meanwhile, cross-disciplinary methods from other fields have been introduced, such as complex network models for studying the structural stability of the system, analogies linking the contagion of financial systemic risk to the transmission pathways of infectious diseases or biological food chains [1] [2] [3] [4] [5] [6] , and new measures of systemic risk [7] [8] [9] [10] . different measurement methods from the financial sub-branch fields are also applied to empirically verify and predict the tail loss of systemic risk [8, 11, 12] .

what does systemic risk research study? what methods are used to study systemic risk? these are the questions around which this review revolves. a first point of clarification is that the scope of systemic risk and that of systematic risk is not the same.
the latter refers to the risk factors shared by the entire economy, that is, the non-diversifiable risk, also known as market risk, as opposed to non-systematic (idiosyncratic) risk. systemic risk, by contrast, refers to the risk of an overall collapse of the entire system due to shocks to individual member units, with the collapse reflected in the impact on most or all individual members [13] .

according to the econlit economic literature database, the term "systemic risk" can be traced back to a 1988 speech by andrew f. brimmer, a fed official, at a joint annual meeting of the aea (american economic association) and the society of government economists in new york (the speech was published in the journal of economic perspectives the following year; see [14] ). there are different views in academia on the precise definition of systemic risk. [14] first proposed the term and defined it as the occurrence and spread of a dilemma that is likely to seriously affect a country's financial order, where the dilemma itself may derive from micro or macro factors. the bis describes systemic risk as the possibility that a party to a contract cannot perform an agreement and thereby causes other contracting parties to default, with the resulting chain reaction leading to a wider range of financial difficulties. this definition focuses on the overall risk to the system caused by defaults of individual micro-level entities. besides, darryl hendricks [15] defines it from the perspective of equilibrium theory: systemic risk is the risk of the system moving from one equilibrium to another, worse equilibrium, a move that is difficult to reverse because of the catalysis of many self-reinforcing mechanisms.
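hendricks' equilibrium-shift definition can be made concrete with a toy self-reinforcing dynamic. this is a sketch for intuition only: the sigmoid form and the parameters k and theta are assumptions, not a model from the literature.

```python
import math

def step(c, k=8.0, theta=0.5):
    # toy self-reinforcing "confidence" dynamic with two stable equilibria:
    # states above the threshold theta climb toward 1 (good equilibrium),
    # states below it decay toward 0 (bad equilibrium).
    return 1.0 / (1.0 + math.exp(-k * (c - theta)))

def iterate(c0, n=50):
    c = c0
    for _ in range(n):
        c = step(c)
    return c

good = iterate(0.60)          # above the threshold: converges near 1
bad = iterate(0.60 - 0.15)    # a shock pushes the state below the threshold
print(round(good, 3), round(bad, 3))
```

once the state crosses the threshold, every further iteration reinforces the bad equilibrium, which matches the observation that the shift is difficult to reverse without an external push.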
in a systematic elaboration of systemic risk in 2003, [16] summarized its definitions and classified them into three categories: (1) a huge negative impact that the entire financial system faces at the same time, that is, systemic risk has the characteristic of simultaneity [17] ; (2) all or part of the financial institutions face the same risk exposure, meaning systemic risks share a common cause [18] ; (3) the impact can trigger a chain reaction; in other words, systemic risk is contagious [19] . [20] believe that these definitions have the following points in common: (1) systemic risk concerns all financial institutions; (2) systemic risk focuses on tail-loss characteristics; (3) all definitions need to consider the interconnectedness and cross-exposures of the various institutions within the financial system.

besides, there are both differences and linkages between the concepts of systemic risk and financial crisis. [16] believes that the systemic character is the most noteworthy cause of crisis formation and development: systemic risk is both a cause and a characterization of the financial crisis, in that a financial crisis necessarily presents systemic risk, but systemic risk does not necessarily lead to a financial crisis. moreover, current systemic risk research is mainly limited to financial systemic risk within an economy, whereas the financial crisis has a wider scope and more diversified forms of expression. for example, depending on the scope of the spread, a financial crisis can be divided into banking crises (bank runs), currency crises, bubble crises of capital market speculation, sovereign debt crises, and so on, and may trigger a comprehensive economic crisis. on the other hand, systemic risk is similar to a financial crisis in that it is contagious, and most or all financial institutions in the system will face the same impact at the same time.
however, systemic risk emphasizes the impact of financial institutions on the overall risk of the system, that is, the endogenous nature of systemic risk, while a financial crisis may originate from endogenous or exogenous causes. for instance, in the recent financial crisis the us-centric turmoil was transmitted to iceland's entire banking system, which led to the overall collapse of the country's economy [21] . besides, although systemic risks are not always triggered by leverage (for example, in the stock market crash of october 19, 1987 , us stock index futures prices plummeted, margin accounts could not be topped up in time, and brokers collapsed), once systemic risk occurs its impact on the system is amplified by leverage. therefore, the study of systemic risk is essentially related to the credit creation process and the degree of leverage. according to the sector in which credit is created, systemic risk can be divided into that of the banking system and that of the shadow banking system; according to the financial markets and instruments involved, it can be divided into systemic risks in the real estate market, banking system, bill market, securities market, bond market and derivatives market; according to the debtor sector, it can be divided into systemic risks of public debt (including sovereign debt), corporate debt, and household debt. on the other hand, because the financial crisis is highly correlated with the concept of systemic risk, the scope of financial crisis research also greatly overlaps with that of systemic risk research. entering the keywords "financial crisis" and "risk" in web of science returns a total of 7375 ssci core documents and working papers published since 1992; entering the keyword "systemic risk" returns a total of 1913 core documents and working papers from the same period.
through screening the titles and abstracts of all these papers with the additional keyword "financial markets", 1025 papers were selected. with the help of the software citespace, further bibliometric analysis reveals that financial risk and systemic risk research have many crossovers, and their areas of research concentration almost overlap ( figure 1 ). the sub-branches of financial crisis research are numbered according to the timeline of theory development. the figure demonstrates that early crisis research mainly focused on macroeconomic issues [22] [23] [24] ; as time went by, there was a detailed exploration of specific financial markets and product markets [25, 26] . from another perspective, financial intermediaries and investor behavior were discussed in terms of pricing credit risk [27, 28] . as financial intermediaries share research methods with corporate finance, later attention was drawn to bank governance [29] . as the subprime crisis erupted, more studies were oriented to the stability of the financial system, which overlaps with systemic risk research [7, 10, [30] [31] [32] . a new branch, the extreme value approach, has also been adopted in systemic risk research. systemic risk research is not distinct from crisis research, as the former also addresses bank behavior and liquidity problems. among the 10 subdivisions, the largest co-citation clusters are formed in regional contagion, monetary policy, credit risk pricing and systemic risk. within the systemic risk field, the co-citation cluster of multidisciplinary research is the largest, with most focus on financial networks. thus, by combing the literature on financial crisis theory and systemic risk research, it can be found that systemic risk research is closer to the micro level than the original financial crisis research, although systemic risk management involves macro-level management and supervision.
there are more studies on micro-market structure and participant behavior, more emphasis on systemic risk transmission and contagion research, and more attention to the endogenous mechanisms behind the occurrence of risks; it can be said that systemic risk research is the further deepening and refinement of the theoretical research on the financial crisis. therefore, although the academic community still differs on the definition of systemic risk, by comparing the concepts of systemic risk and financial crisis, and summarizing the definitions of systemic risk in the academic world, the concept of systemic risk can be defined from an economic perspective: triggered by macro or micro events, the institutions in the system are subjected to negative impacts, more organizations are involved in risk diffusion, and the existence of internal correlations strengthens the feedback mechanism, causing the system as a whole to face the risk of collapse. the definition here emphasizes (1) events triggering harms to institutions and systems; (2) systemic risks that spill over and are contagious; (3) a self-reinforcing feedback mechanism formed by the psychological and behavioral characteristics of the micro-subjects; (4) the potentially serious consequence of systemic risk, the overall collapse.
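points (2)-(4) of this definition (spillover and contagion, self-reinforcing feedback, eventual collapse) can be sketched with a minimal default cascade on an interbank exposure network. this is an illustration only: the ring network, the exposure size of 6 and the buffer of 5 are invented numbers, and the sketch is a simplified cousin of the network-contagion models cited above, not any specific one.

```python
def cascade(exposures, buffers, initially_failed):
    # exposures[i][j]: loss bank i suffers if bank j fails.
    # a bank fails once its accumulated losses exceed its capital buffer;
    # each new failure feeds back into its counterparties' losses.
    n = len(buffers)
    failed = set(initially_failed)
    changed = True
    while changed:
        changed = False
        for i in range(n):
            if i in failed:
                continue
            loss = sum(exposures[i][j] for j in failed)
            if loss > buffers[i]:
                failed.add(i)
                changed = True
    return failed

# 4 banks in a ring, each exposed to its neighbour for 6 with a buffer of 5:
# one idiosyncratic failure topples the whole system.
n = 4
exposures = [[6.0 if j == (i + 1) % n else 0.0 for j in range(n)] for i in range(n)]
print(sorted(cascade(exposures, [5.0] * n, {0})))
```

raising the buffers to 7 contains the same shock to the initial bank, showing that whether a local shock becomes systemic depends on the structure of exposures relative to buffers, not only on the shock itself.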
we found that in web of science, across journals of economics, finance, accounting, behavioral psychology, and management, the articles on systemic risk published in ssci core journals from 1992 to march 2020 totaled 1913. from figure 2a , we can tell that research on systemic risk has ups and downs from year to year, echoing worldwide risk events. besides, working papers on systemic risk research on the nber website have exceeded 200 publications in the past 10 years, and the number of working papers newly released in this field in 2020 is still considerable ( figure 2b ). especially under current circumstances, systemic risk research deserves further exploration: the covid-19 pandemic struck the real economy by threatening human life first, then financial systems are impacted in terms of both assets and liabilities, which may further hurt real economies through systemic risk chains. another peak in this area can henceforth be expected.

the research literature on systemic risk sometimes overlaps across the fine branches of section 2.2 above.
for example, the study of tail-risk identification and quantification involves specific empirical methods as well as recommendations for risk management. therefore, the literature review in this chapter is summarized and reviewed from the following four aspects. the concept of systemic risk appeared late, so systematic research on it is still being explored; however, according to the preceding literature comparison, it has many similarities with the research themes of the financial crisis, and the study of systemic risk formation mechanisms also draws on research into the formation mechanism of the financial crisis. therefore, in studying the formation mechanism of systemic risk, the research context of the formation mechanism of the financial crisis is first outlined. studies on the formation mechanism of the financial crisis mainly fall into two categories: crises caused by emergencies and crises related to the economic cycle [34] . in [35, 36] , bank runs come from the self-fulfilling expectations of depositors, which can explain some bank runs and currency crises, but it is difficult to use the model for prediction. [37] proposed that there is information asymmetry between banks and depositors; by establishing a global game model, the bank-run problem is well explained and a theoretical threshold for bank runs is derived. [38] extended this model to the mutual fund market and proposed an empirical approach to the global game model. this branch of research mainly concerns the micro-market structure from the perspective of the micro level and the individual agent. another focus of research on the causes of the financial crisis is its procyclical nature. [39] pointed out that when the economy goes down, it becomes difficult for banks to meet redemptions.
if depositors anticipate the impact of an economic downturn on the banking industry, they will withdraw funds from their banks, which may itself cause a run on the banks. [40] examined economic volatility indicators of the united states from the end of the 19th century to the beginning of the 20th century and found that leading indicators of the scale of debts of failed enterprises can accurately predict the occurrence of banking crises. [41] , however, systematically studied us monetary policy from 1867 to 1960 and found that the four financial crises occurring during this period were all caused by panic and were not related to the real economy. this led to a series of subsequent empirical tests on the causes of the financial crisis [42, 43] , which investigated the four financial crises in more detail with more extensive data and found empirically that the first three were triggered by shocks to the real economy, while the fourth broke out due to panic.

research on the systemic risk formation mechanism is carried out in the context of an established credit money system, in which the degree of regulation of financial markets, the links between financial institutions, and the pace of product innovation have surpassed those of earlier eras. against this backdrop, the occurrence of systemic risk means not only that the liquidity of a bank, or of the banking industry itself, causes a banking crisis, but also that the liquidity of the inter-bank market or the lending markets through which banks finance themselves is exhausted; the resulting chain reaction may spread to other financial market segments and eventually lead to the collapse of the entire system. therefore, research on the formation mechanism of systemic risk is divided into two directions.
the first is research on the formation mechanism of the financial crisis, which focuses on the micro level and uses dynamic analysis to explore the effects of interactions within the system between individual behavior, organizations, and financial markets. the other focuses on the institutional roots of systemic risk from the perspective of normative economics, or of law and sociology. [44] found that the frequency of financial crises in the world in recent decades, including events such as the great depression, was twice the combined number of financial crises in the bretton woods period and the gold standard period (1880-1893). this cannot but make one wonder whether the shortcomings of the credit monetary system itself create the risk of a financial crisis, because the marginal cost of issuing credit currency is almost zero compared to its face value. the central bank is therefore inclined to issue more than a limited amount of money, either to gain from credit capital investment or to achieve other policy objectives, especially when it holds a monopoly position and no other financial institution issues currency to compete with it. moreover, the central bank is not constrained by bankruptcy (its own-capital ratio can be extremely low), so this tendency is even more pronounced. besides, the guidelines proposed by walter bagehot, a founder of central bank theory (loans provided by the central bank should carry higher interest rates to deter those who should not get loans, and the central bank should accept only good collateral during a financial crisis to screen out insolvent borrowers), rely solely on the central bank's self-discipline and can easily be broken before or even during a financial crisis, as the fed demonstrated in the 2007 subprime crisis.
Therefore, [45] argue that a financial crisis is essentially the result of insufficient restraint on central bank behavior, which produces credit funds in excess of the supply of high-quality assets. This view is consistent with the findings of [30, 46-51], which showed that credit booms and asset price bubbles prevailed before systemic banking crises. [52] found that the financial crises of many countries stemmed from the bursting of asset bubbles in the real estate market: as the crisis broke out, asset prices fell sharply and continued to weigh heavily on real output and employment. Probing deeper, if excessive debt is the precondition for the accumulation of systemic financial risk, then the root cause of excessive debt, and the question of what the appropriate "degree" of liability is, become an important part of the discussion of the formation mechanism of systemic risk. The discussion of this issue even goes beyond the scope of mainstream western economic theory, so it is difficult to find it treated fully in the classic systemic risk literature. George Bragues linked the practice of charging interest on capital to the rise of liberal democratic thought, arguing that a large amount of lent funds has no tangible foundation beyond the credit-creation capacity of the banking system, and that it is unfair for people to exchange real goods and labor for such currency without receiving any substantial return; this is the exploitation of the borrower by the lender condemned by the Greek philosophers and by Aquinas. The rise of the liberal democratic notion that "everyone should be free to enter into a loan contract at an interest rate that is considered appropriate", however, came to justify profiting from loans.
The credit-creation function of the banking system and the profit-seeking nature of capital naturally lead to excessive debt for the economy as a whole, because "the main activity of finance is to create a huge debt chain and obtain more profits from it". That is, because of compound interest, even if no new debt is issued, the existing stock compounds without limit, while the growth rate of the real economy is comparatively limited and its periods of high-speed growth are short-lived [53]; the result is a continuously rising leverage ratio. When debts accumulate to a certain scale, credit funds inevitably outgrow the stock of high-quality assets, and given the decentralized character of globalization (financial centers and distribution centers in the west, product centers in the east; savings in the east, debt accumulation in the west [54]), the situation becomes more dangerous still: systemic risk, and even the outbreak of a financial crisis, becomes unavoidable. The free development of the economy and of capital may therefore lead to disastrous consequences, requiring external supervision and constraint, and regulatory forbearance and legal loopholes are accordingly important aspects of the study of systemic risk formation mechanisms. [55] argue that the outbreak of the subprime mortgage crisis stemmed from the SEC's 2004 policy of exempting investment banks from balance sheet liability regulation, converting it into a voluntary supervision model, after which investment banks' leverage ratios generally rose sharply. The Commodity Futures Modernization Act, introduced in 2000, prevented state governments from using their laws to stop Wall Street from employing derivative financial instruments. [56] pointed out that relying on technocrats, or over-relying on technology, to manage financial systemic risk easily creates an illusion of overconfidence among regulators.
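The mechanical point above, that debt compounding at an interest rate above the real growth rate raises leverage even without new borrowing, can be shown with a toy simulation. All numbers (initial stocks, the 5% interest rate, the 2% growth rate) are invented for illustration.

```python
# Toy illustration: debt compounding at interest rate r outpaces real
# output growing at g < r, so the leverage ratio debt/output rises
# without any new borrowing. All parameters are hypothetical.

def leverage_path(debt0, gdp0, r, g, years):
    debt, gdp, path = debt0, gdp0, []
    for _ in range(years):
        debt *= (1 + r)   # compound interest on the existing stock only
        gdp *= (1 + g)    # the real economy grows more slowly
        path.append(debt / gdp)
    return path

path = leverage_path(debt0=100.0, gdp0=100.0, r=0.05, g=0.02, years=30)
print(round(path[0], 3), round(path[-1], 3))  # leverage rises monotonically
```

The ratio grows by the factor ((1+r)/(1+g)) each year, so any persistent gap between r and g, however small, produces unbounded leverage in this stylized setting.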
As Greenspan admitted at the October 2008 congressional hearing, he relied too much on, and believed too firmly in, the "self-correction of the free market". Jonathan C. Lipson, a professor of law at Temple University, argues that the design and trading of CDS financial derivatives became divorced from the nature of the guarantees underlying them; unregulated over-the-counter trading made this market far larger than the actual guarantees, creating a hidden danger that contributed to the outbreak of the subprime mortgage crisis. Compared with research on the formation mechanism of financial crises, research on the formation mechanism of systemic risk is not concerned with whether the trigger is related to the real economy, but with whether the mechanism operates at the macro or the micro level, and whether the appropriate approach is institutional research or event studies. This research is also more dispersed: the formation mechanism of systemic risk spans the whole range of financial subdisciplines and deserves more in-depth and detailed discussion. [49-51] studied a number of financial crisis episodes in high-income and low-income countries and found that a crisis causes average real estate prices to fall by 35% over the following six years; asset prices fall by an average of 55% over the following three and a half years; real output falls by an average of 9% within two years; the unemployment rate rises by 7 percentage points within four years; and government debt rises on average to 86% above its pre-crisis level. Once systemic risk evolves into a financial crisis, the impact on the real economy is evident. As [57] put it, this is because external factors unrelated to firms' fundamentals lead to shocks to market participation caused by changes in capital participation; these affect securities prices in the capital market and thereby change the investment behavior of micro-level enterprises. Since corporate investment has externalities, this ultimately affects the operation of the real economy.
He argues that different financing models have different impacts on the real economy. The equity financing market is suited to investment projects that use equity financing (such as high-tech enterprises); such enterprises have positive externalities for the real economy. Thus, although the bursting of the technology bubble hit the securities market in the early 21st century, investment in high-tech industries accelerated technological progress and benefited the real economy. The bond financing market is suited to investment projects that use debt financing (such as fixed-asset investment in real estate); such investments are generally highly leveraged and have negative externalities for the real economy, so once the bubble bursts, the damage to the real economy is significant. For a long time after Keynes, the theoretical study of the impact of systemic risk on the real economy remained detached from the mainstream of macroeconomic research, because in both the real business cycle models [58-60] and the Keynesian IS-LM model, markets are treated as complete, abstracting away from individual heterogeneity and information asymmetry. Under these assumptions a firm's financing method does not affect its value (the Modigliani-Miller theorem), so problems of financial friction and systemic risk are difficult to fit into the traditional theoretical framework, and research on them lacked microfoundations. Only when the new dynamic Keynesian (DNK) theoretical framework based on the DSGE method was put forward (CEE [61], ACEL [62]) did it become possible to combine the assumption of price stickiness with investment and financing behavior and leverage changes under information asymmetry. The financial accelerator theory proposed by Bernanke et al., the BGG model [63], became the pioneer in this field.
Because of the limitations of model solution methods, the utility and production functions in DSGE models take linear or quasi-linear forms. Linear functions, however, cannot explain the nonlinear changes brought about by economic shocks, and leverage has a nonlinear relationship with the economic cycle, the latter being most directly connected with the occurrence of systemic risk [64]. To explain the nonlinear relationship between leverage and the economic cycle, Bernanke proposed a nonlinear investment and financing decision model linked to corporate net assets. In addition, because information asymmetry entails costly state verification (CSV) by outside investors, a risk premium exists between internal and external corporate financing (the external finance premium). This premium causes a firm's financing conditions to worsen when an economic shock occurs, which accelerates the decline in investment and produces the "financial accelerator effect". To simplify the analysis, Bernanke expressed the other economic sectors as linear relationships, focusing only on the difference between firms' internal and external financing and the resulting crisis transmission mechanism. Subsequent economists extended financial frictions to the household sector, the banking system, and the government sector, using the DNK framework to explain the real estate market [65], the monetary policy transmission mechanism of government departments [66], and the risk transmission mechanism of the banking system [67, 68]. The transmission paths by which leverage affects the real economy are not limited to the ways depicted in figure 4. Empirical studies of the impact of systemic risk on the real economy are cited relatively infrequently: on average, only two of the 100 most-cited works per year relate to this field.
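The external-finance-premium channel described above can be illustrated with a toy calculation. This is not the BGG model itself: the functional forms, the baseline leverage of 2, and all coefficients are invented purely to show the amplification logic, in which a hit to net worth raises leverage, the premium rises with leverage, and investment falls by more than it would without the channel.

```python
# Toy numeric sketch of the external-finance-premium channel (invented
# functional forms and parameters, not the actual BGG model).

def investment_after_shock(shock, accelerator=True):
    net_worth = 100.0 - shock            # equity after the shock
    leverage = 200.0 / net_worth         # assets fixed at 200 in this toy
    # premium rises with leverage above its baseline of 2 (toy rule)
    premium = 0.02 * (leverage - 2.0) if accelerator else 0.0
    return 20.0 * (1.0 - 5.0 * premium)  # investment falls with the premium

no_channel = investment_after_shock(10.0, accelerator=False)
with_channel = investment_after_shock(10.0, accelerator=True)
print(no_channel, round(with_channel, 3))  # the premium amplifies the decline
```

With the channel switched off, the net-worth shock leaves desired investment unchanged in this toy; with it on, the same shock depresses investment, which in the full model feeds back into net worth and amplifies the downturn.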
Even empirical studies of the impact of financial crises on the real economy are few and far between, which may stem from the following reasons: (1) the number of observable systemic risk events is limited, so few samples are available for time series or panel analysis; (2) because systemic risk is quantified in many different ways, the results of using quantitative indicators to assess its impact on the real economy may vary; (3) because of these differences in quantification, predictions of real economic fluctuations differ widely, and insignificant results reduce the value of the research. To circumvent or mitigate these problems, [69] collected all systemic risk event samples in the United States and Europe after World War II, used 19 different methods to quantify and assess systemic risk, and applied quantile regression to test the power of systemic risk indicators to predict deterioration in the real economy. They found that only a small number of quantitative indicators can capture the risk of a macroeconomic downturn, and then only roughly. Another reason for the lack of theoretical research on the impact of systemic risk on the real economy is that traditional macroeconomic theory itself has weak microfoundations, whereas systemic risk is mainly reflected in system-wide problems caused by the interaction of heterogeneous individual behaviors at the micro level; traditional macroeconomic models, which assume identical individuals or only limited heterogeneity, are therefore severely limited. For this reason, [3] argue that agent-based models should be imported from other disciplines to better simulate micro-level individual behavior and obtain a more realistic macroeconomic model.
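The quantile-regression approach used to test tail predictability rests on the pinball (check) loss: minimizing it over a constant predictor recovers the empirical quantile of the outcome, which is why it isolates the lower tail of, say, GDP growth. A pure-Python sketch with made-up data:

```python
# Pinball (check) loss underlying quantile regression; minimizing it
# over a constant recovers the empirical quantile. Data are invented.

def pinball_loss(y, pred, q):
    return sum((q * (v - pred)) if v >= pred else ((q - 1) * (v - pred))
               for v in y) / len(y)

y = [-3.0, -1.0, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0]
q = 0.1  # focus on the lower tail of a hypothetical growth series

# grid-search the best constant predictor on a 0.1 grid
best = min((pinball_loss(y, c / 10.0, q), c / 10.0)
           for c in range(-40, 41))[1]
print(best)  # lies in the lower tail of y, near its 10th percentile
```

Replacing the constant with a linear function of a systemic risk indicator gives the quantile regression used by [69]: the indicator has predictive power if it shifts the fitted lower quantile of future growth.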
Besides, the DNK model focuses more on the interpretation of risk transmission pathways; an endogenous explanation of systemic risk is difficult within it, since shocks are generally assumed to be external, which makes it hard to predict the occurrence of crises. On the question of endogeneity, building on the General Theory, [70] proposed that, because of the functioning of sophisticated financial institutions, capitalist economies neither follow neoclassical theory nor simply sit in the "disequilibrium state" described by Keynesian economics. Two sets of prices coexist in a capitalist economy, one for current output and one for capital assets, and they are determined by different proximate variables; likewise, consumption demand and investment demand have different horizons, the former resting on shorter-run expectations and the latter on longer-run ones. Together they form aggregate (effective) demand. Consumption demand is a function of current factors, while investment demand is a function of the price of capital assets, the supply price of investment goods, expected profits, and external financing conditions. In contrast to the micro-founded general equilibrium methodology, Minsky built his theory from a macroeconomic perspective on the Kalecki profit function, which holds in both closed and open economies. Decomposing the profit function (in a closed economy with government) gives

π = i + df + c − sw,

where π is the current profit of the whole economy, i is the current investment of the private sector, df is the government deficit, sw is the saving of the private sector, and c is current consumption.
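The profit decomposition above can be checked numerically; the figures below are arbitrary, and the variable names follow the text.

```python
# Numeric check of the Kalecki-style profit identity quoted above,
# using the text's variable names; all figures are arbitrary.

i, df, c, sw = 50.0, 20.0, 30.0, 10.0  # investment, deficit, consumption, saving
profit = i + df + c - sw
print(profit)  # 90.0

# a one-unit rise in the government deficit raises aggregate profits
# one-for-one in this identity
profit_higher_deficit = i + (df + 1.0) + c - sw
print(profit_higher_deficit - profit)  # 1.0
```

The second calculation makes the macro perspective concrete: profits are pinned down by aggregate flows, so fiscal deficits feed directly into the cash flows that, in Minsky's account, validate or invalidate past payment commitments.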
Thus profits are determined not only by current productive factors, government activity, saving, and consumption, but also by asset prices, which are closely connected with financial markets. Furthermore, profits, identified with cash flows, are the essential signal for investment, critically linking payment commitments in the past, present, and future. According to which cash-flow commitments can feasibly be financed, the financial system and debt structure are composed of a mixture of three financial postures: hedge finance, speculative finance, and Ponzi finance. As the economy grows, profits, in the form of cash flows, fluctuate because of the endogenous discrepancy between expectation horizons; this causes the shares of speculative and Ponzi finance to increase in prosperous times, leaving the financial system progressively more sensitive to interest rate variations. When current cash flows can no longer sustain current payment commitments, a deflationary break is initiated and a financial crisis takes place. The implementation of mitigation policies can only alleviate the destruction, not resolve it, and makes the whole financial system more fragile to economic fluctuations (the financial instability hypothesis). Because of the complexity of the interactions between the key factors, and the vast variety of markets in capitalist economies, Minsky demonstrated his theory mainly in a narrative fashion. Thus, even though Minsky's theory became the kernel of post-Keynesian economics, it has proved difficult to develop into a comprehensive mathematical model. ([71] once put Minsky's FIH into the framework of the IS-LM model and derived an equilibrium result, but the condition equating the price of investment goods with the price of assets implicitly assumes that the capital market and the commodity market automatically maintain equilibrium.
Even though endogenous factors cause deviations from the equilibrium of financial and commodity markets, the economy returns to equilibrium because of this implicitly assumed equality condition. Besides, in that framework the asset price is determined solely by investment demand, which is also contrary to Minsky's view that asset prices depend on a variety of factors.) The theory also faced various criticisms. [72] proposed that in the upward phase of the economy, rising investment raises aggregate profits, so the leverage ratio does not necessarily increase, whereas in the downward phase falling investment depresses aggregate profits more than expected, which raises the leverage ratio; this "paradox of debt" runs against Minsky's theory. The contradiction should be traced to the preconditions of the FIH, which assume a degree of monopoly in commodity markets, so that producers can earn excess profits from market power. As aggregate profits increase in the upward phase while profit rates decrease, over-expanded investment drives leverage ratios up, imperiling future payment commitments; in the downward phase, shrunken investment reduces aggregate profits while profit rates recover, which helps bring the leverage ratio down. Thus, within Minsky's framework, the paradox of debt is explainable on the presumption of monopoly markets; further discussion can be found in [73]. Some studies extend the Kaleckian model [74-76] by bringing a normal degree of capacity utilization into an endogenous investment function; [77] uses the profit share instead of the profit rate, and [78] introduces external competition into an imperfect commodity market. All these adjustments provide a supplementary explanation of the demand regime. Furthermore, monetary and financial variables have been incorporated into the profit function [79, 80].
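Minsky's three financial postures discussed above can be sketched as a cash-flow classification rule: hedge units cover interest and principal out of cash flow, speculative units cover interest only and must roll over principal, and Ponzi units cannot even cover interest and must borrow to pay it. The function and all figures are illustrative.

```python
# Sketch of Minsky's three financial postures, classified by whether
# expected cash flow covers debt service. All figures are illustrative.

def posture(cash_flow, interest_due, principal_due):
    if cash_flow >= interest_due + principal_due:
        return "hedge"        # covers interest and principal
    if cash_flow >= interest_due:
        return "speculative"  # covers interest; principal is rolled over
    return "ponzi"            # covers neither; debt grows to pay interest

print(posture(120, 30, 50))  # hedge
print(posture(60, 30, 50))   # speculative
print(posture(20, 30, 50))   # ponzi
```

A rise in interest rates shifts units rightward through these categories without any change in cash flow, which is exactly the sense in which a boom that raises the speculative and Ponzi shares makes the system more interest-sensitive.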
Later research attempts to reconcile the FIH with the paradox of debt through a neo-Kaleckian model [81]. Moreover, by following a center-periphery structure, the model explains how financial movements in the periphery are associated with movements in international financial markets and with external vulnerability. Recent research in this field has focused mainly on empirical work supporting and verifying the FIH in different economies [82-84]. Minsky's achievements go beyond the FIH: he focused on the system as a whole, aiming to construct a general theory explaining the entire operating law of capitalist economies, covering both the upward and downward phases of the economic cycle. The discrepancy between horizons and the amplification mechanism, realized through the interaction between profits and investment activity and through the positive (or negative) feedback between asset prices and liabilities, not only explains the endogeneity of financial crises but also demonstrates his economic vision dialectically: the question of whether the economy attains equilibrium gives way to a depiction of economic processes, equilibrium being only a temporary state within the whole process. As for the difference in the expectations governing consumption demand and investment demand, it should not be criticized as a lack of rationality but interpreted as generalized rationality, as in the "Wall Street view", rationally applied to the real economy and to sophisticated financial systems. Early research on the formation of financial crises from the perspective of market microstructure began mainly from two aspects: (1) information asymmetry and (2) liquidity. There are also studies of the impact of financial crises from the perspective of money, interest rates, and expectations.
Compared with the study of financial crises in the market microstructure literature, the study of the micro-market structure of systemic risk adds the complexity of the overall system; it is reviewed here from three aspects. Before the outbreak of the subprime mortgage crisis in 2007, conclusions about the close connections between institutions and the overall risk resilience of complex financial networks were generally positive [22, 85, 86]: sufficiently decentralized interbank liabilities were thought to produce a more stable financial system. Nevertheless, some scholars then began to link the increase in systemic risk to the complexity of financial networks. For instance, [87] argues that as a bank's counterparties increase, the probability of a systemic crash increases. [88, 89] demonstrated, by modeling interbank contagion behavior, that the financial system itself amplifies increases in systemic risk. This amplification research draws on the methods of earlier network contagion models [20, 90], and the amplified transmission pathways are various. On the asset and liability side, [91] showed that financially interrelated companies are more prone to systemic risk in the face of market shocks because of cross-shareholdings. Similarly, [92] argued that the similarity of assets held across banks determines the extent to which relevant information is disseminated and the likelihood of a systemic crisis. Studying bank exposures in the US and Europe, [93] argued that systemic risk is affected by common risk factors and that the interdependence of financial institutions is due mainly to systemic factors. [94, 95] used asset securitization as an example to explain, from the perspective of debt, the transmission path by which a financial network amplifies systemic risk.
They argue that, through the interconnected balance sheet network, asset securitization essentially magnifies the financial leverage of the entire financial system, thereby increasing its vulnerability. [96] showed that the higher the degree of financial integration, the more stable the interbank interest rate in normal conditions, and the higher it soars during a crisis. Combining the positive and negative views, [97] demonstrated through modeling that a closely linked financial system provides sufficient stability in the face of smaller market shocks, but when the scale of the shock exceeds a certain threshold, the tightly connected financial network amplifies risk and makes the financial system more vulnerable; the structure of the financial network thus has an important impact on the stability of the financial system. Research on complex financial networks draws on game-theoretic models and earlier results in network theory, which originated with the Seven Bridges problem, developed into random graph theory, and evolved into complex network theory. Network theory and topological methods provide the basis for theoretical modeling of network structure itself [98] as well as empirical methods [99]. Because the way individuals interact within the network, and their relative importance, can significantly affect both the network structure and the transmission of shocks, research in this field remains very active. [100] pioneered the use of an interbank debt matrix to describe the network structure of the banking system, an approach followed by later scholars of systemic risk [97, 101].
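The above-threshold amplification result can be illustrated with a toy default cascade on an exposure network, in the spirit of threshold-contagion models: a bank defaults when its losses on claims against already-defaulted counterparties exceed its capital buffer. All balance sheets below are made up.

```python
# Toy default cascade in an interbank exposure network (threshold
# contagion in spirit; all balance sheets invented for illustration).

# exposure[i][j] = amount bank i has lent to bank j
exposure = [
    [0, 10, 10, 0],
    [5, 0, 10, 5],
    [5, 5, 0, 10],
    [10, 0, 5, 0],
]
capital = [8.0, 12.0, 9.0, 6.0]  # loss-absorbing buffer of each bank

def cascade(initial_defaults, loss_given_default=1.0):
    defaulted = set(initial_defaults)
    changed = True
    while changed:  # iterate until no new defaults occur
        changed = False
        for i in range(len(capital)):
            if i in defaulted:
                continue
            loss = sum(exposure[i][j] for j in defaulted) * loss_given_default
            if loss > capital[i]:  # losses wipe out the buffer
                defaulted.add(i)
                changed = True
    return sorted(defaulted)

print(cascade({3}))  # a single failure propagates through the whole network
```

With thicker capital buffers or smaller exposures the same seed failure stops after one round; the cliff between "absorbed" and "system-wide" outcomes is the threshold effect the modeling literature describes.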
Early network models, however, tended to be undirected: they assumed that connections between participants are reciprocal, with the same interactions and even the same importance; for example, if the interbank liability matrix is assumed symmetric, bank A's claims on bank B equal bank B's claims on bank A. This was partly because interbank liability data are not available (in empirical work, interbank lending behavior is therefore further assumed, and the banking system's liability matrix is reconstructed through techniques such as the maximum entropy method), and partly for the sake of simplifying model simulation: as in [97], a simplified model with identical agents is used to study the relationship between the structure of the financial network itself and financial stability. The results of many empirical tests contradict the assumptions of these theoretical models. [102] studied the Italian interbank market structure from 1999 to 2010 and found that it conforms to a core-periphery network structure: a few banks continuously play a central role, lending to peripheral banks and providing liquidity to the market, while lending between core and non-core banks is rare, and the illiquidity of the market during the financial crisis was often caused by reduced lending by core banks, which means the network distribution is not symmetric. The core-periphery network structure was first proposed by [103]; other related work was contributed by [104], [105, 106] covered most core-periphery network structures, and a further supplementary version [107] was published in 2017. A series of empirical studies showed that the banking networks of other financial markets, such as India, Mexico, the Netherlands, and the UK, also exhibit core-periphery characteristics [108-110].
Thus more complicated networks with directed connections were introduced, which is more common in the financial contagion literature within the same Eisenberg-Noe (E-N) framework [111, 112]. Moreover, since there are various types of connections between financial institutions (credit, insurance, derivatives, collateral obligations, cross-held assets), the financial network should have multiple layers, and research on financial multi-layer networks is surging. [113] study the interaction of short- and long-term bilateral secured and unsecured lending. [114] study the Mexican banking system on all market layers and find that market-based systemic risk indicators systematically underestimate expected systemic losses. Research on the relationship between the behavior of financial market participants and systemic risk is more dispersed. Financial research can be divided mainly into behavioral finance, corporate finance, and micro-market structure research; participants vary across financial markets, and the relationships between their behavior and systemic risk, and the transmission pathways, vary accordingly. By type of participant, the most studied is the bank: despite the trend toward bank disintermediation, the irreplaceable role of the banking system in the financial crisis, and its impact on the financial system, show that it remains at the core of the financial system. The idea that competition promotes the efficiency of financial allocation propelled the wave of financial liberalization in the developed countries of Europe and America from the 1970s onward (the Second Banking Directive issued by the European Union in 1989 allowed banks to engage in banking, insurance, and other financial operations; in 1999 the US Gramm-Leach-Bliley Act eased similar restrictions).
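The E-N framework referenced above computes a clearing payment vector by a fixed point: each bank pays the minimum of its total obligations and its external assets plus what it receives from others. A minimal sketch with invented balance sheets, iterating from full payment (which converges to the greatest fixed point):

```python
# Minimal Eisenberg-Noe clearing-vector computation by fixed-point
# iteration: p_i = min(pbar_i, e_i + sum_j Pi[j][i] * p_j), where
# Pi[j][i] is the share of bank j's obligations owed to bank i.
# All balance sheets are invented for illustration.

pbar = [10.0, 4.0, 4.0]       # total interbank obligations of each bank
e = [1.0, 10.0, 10.0]         # external (outside-network) assets
liab = [                      # liab[i][j]: amount bank i owes bank j
    [0.0, 5.0, 5.0],
    [2.0, 0.0, 2.0],
    [2.0, 2.0, 0.0],
]
n = 3
Pi = [[liab[i][j] / pbar[i] for j in range(n)] for i in range(n)]

p = pbar[:]                   # start from full payment
for _ in range(50):
    p = [min(pbar[i], e[i] + sum(Pi[j][i] * p[j] for j in range(n)))
         for i in range(n)]
print(p)  # [5.0, 4.0, 4.0]: bank 0 pays only 5.0 of its 10.0 obligations
```

The directed liability matrix matters here: bank 0's shortfall propagates proportionally to its creditors, which is exactly the payment-consistency discipline that undirected, symmetric models cannot capture.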
Financial innovation products and techniques emerged continuously, and competition among banks became increasingly fierce, bringing financial deepening [115] and improved resource allocation efficiency [116], but possibly also instability in the banking system [117]. The relationship between interbank competition and the stability of the banking system has always been controversial. On the one hand, competition reduces the cost of capital, raising firms' returns on investment and profitability, which keeps banks' credit risk controllable and increases the stability of the system [118]. On the other hand, it can induce bank rent-seeking behavior; in particular, deposit insurance leads banks to take more aggressive risks in pursuit of profit in a fiercely competitive environment [119], and on this basis, without deliberate screening of customers, systemic risk increases [22, 26, 120]. This is because perfect competition can hardly exist among banks: differences in scale and the core-periphery structure of the banking system inevitably give this market a monopolistic cast [121], and information asymmetry creates moral hazard and adverse selection [122]. Extended studies of the [118] model modify the strictly inverse relationship it proposed between competition and stability [123, 124], arguing that the relationship should be U-shaped: increased competition can improve the stability of the system but can also make the banking system more vulnerable, with the intensity of competition and other factors determining whether the relationship is negative or positive.
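The U-shaped relationship just described can be given a purely illustrative functional form; the quadratic shape, the interior minimum at 0.5, and the numbers are invented for exposition, not taken from [123, 124].

```python
# Purely illustrative U-shaped competition-fragility curve: fragility
# falls with competition up to a point, then rises. The quadratic form
# and parameters are invented.

def fragility(competition):          # competition index in [0, 1]
    return (competition - 0.5) ** 2 + 0.1

print(fragility(0.1) > fragility(0.5))  # True: too little competition is riskier
print(fragility(0.9) > fragility(0.5))  # True: excessive competition is riskier
```

The empirical question in the literature is then where a given banking system sits on such a curve, which is why the intensity of competition and the policy environment jointly determine the sign of the measured relationship.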
[117] considered the role of policy incentives and constraints in the interaction between the two, and empirically compared the relationship between fierce banking competition and banking system stability under different countries' policy environments. It found that the more standardized the regulation, the more developed the securities market, the more complete the deposit insurance system, and the more efficient the sharing of credit information, the more increased competition raises the fragility of the system. The balance sheet approach is used not only to analyze bank defaults but also to study the behavior of other market participants; fire sales and funding correspond to activity on the bank's asset and liability sides, respectively. When there is a run (a trader run, deposit run, or collateral run), the bank has to hedge the risk by recalling loans early or raising collateral haircuts, but coordination failure may still leave it unable to pay. [34] argue that when the market lacks liquidity, banks have fewer opportunities to hedge their overall risk or absorb liquidity shocks, pushing them into fire sales to meet future liquidity demands, which in turn leads to excessive fluctuations in the market price of capital. Moreover, when monopolistic behavior exists in the interbank market, it further reduces the effective supply of funds [125], and the market may even freeze. [30] classify liquidity risk into market liquidity risk and traders' funding liquidity risk: traders provide liquidity to the market through financing, but their financing capacity (such as equity and margin financing) relies on the liquidity of the asset market. Market liquidity can suddenly dry up in certain circumstances, affecting many securities and producing large market fluctuations.
it also leads to difficulties in traders' financing and in the realization of assets. the interaction between funding liquidity and market liquidity forms a liquidity spiral and triggers systemic risk. the liquidity spiral has a variety of transmission pathways, such as the loss spiral that erodes capital and the margin (guarantee) spiral, all of which lead to fire sales. there are many different interpretations of the reasons for the formation of market liquidity shocks: (1) information asymmetry. some of the research on financial crises caused by the lack of market liquidity is associated with information asymmetry, which is often one of the causes of liquidity exhaustion [122] . the financial crisis caused by information asymmetry is most typical in bank run events. the global game model proposed by [126] is also based on the assumption of information asymmetry, that is, participants can only obtain noisy observations, and the lack of common knowledge leads participants to choose the risk-dominant equilibrium as the unique equilibrium. this global game model has been extended to the currency crisis model [127] and the bank crisis model [37] . when information asymmetry exists in the interbank market, counterparty default risk exacerbates the drying-up of market liquidity, leading to system collapse [128] . moreover, information asymmetry between the supervisory authority and the regulated institution can cause the central bank to misjudge. since regulators do not have a complete picture of the asset quality of the entire banking system, even if a central bank participates in the interbank market to prevent the inefficient allocation of market resources due to adverse selection, moral hazard and monopolistic behaviour, it is difficult for such intervention to be optimal [122] .
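the loss spiral mentioned above can be illustrated with a minimal sketch: a bank that targets a fixed leverage ratio suffers an initial loss, sells assets to restore its target, and the sale itself depresses prices, eroding equity and forcing another round of sales. this is a stylized illustration, not any specific model from the literature; the balance sheet figures, the leverage-targeting rule and the linear price impact are all illustrative assumptions.

```python
# minimal loss-spiral sketch: a leverage-targeting bank hit by an initial
# loss sells assets into a market with linear price impact; each sale
# depresses prices, eroding equity further and forcing another sale.
# sale proceeds are assumed to repay debt, so a sale leaves equity
# unchanged while the price impact hits the remaining book.

def loss_spiral(assets, equity, target_leverage, shock, impact, rounds=50):
    """iterate the fire-sale feedback loop; return (total sold, final equity)."""
    equity -= shock                      # initial mark-to-market loss
    assets -= shock
    total_sold = 0.0
    for _ in range(rounds):
        if equity <= 0:                  # insolvency: spiral ends in default
            break
        excess = assets - target_leverage * equity
        if excess <= 1e-9:               # leverage back on target: spiral stops
            break
        assets -= excess                 # sell just enough to restore target
        total_sold += excess
        price_loss = impact * excess     # linear price impact on remaining book
        assets -= price_loss
        equity -= price_loss
    return total_sold, equity

sold, eq = loss_spiral(assets=100.0, equity=10.0, target_leverage=10.0,
                       shock=1.0, impact=0.02)
```

with these numbers the direct deleveraging need is 9, but the feedback loop amplifies total sales to roughly 9 / (1 - 0.18) ≈ 11: each round of sales is a constant fraction (impact × (leverage − 1)) of the previous one, which is exactly the self-reinforcing character of the spiral described above.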
(2) liquidity mismatch. [129] believe that the common feature of financial markets in the recent financial crisis is that financial institutions generally carry serious liquidity mismatches. for example, the floating-rate terms in subprime borrowers' loan contracts affect their ability to repay and refinance. commercial banks carry contingent liabilities that are disproportionate to capital through unregulated off-balance sheet business. the rising share of investment banks that rely mainly on repo and commercial paper financing means that the share of the whole financial system relying on market financing is rising, so excessive leverage embodies the risk of liquidity mismatch. in the event of a liquidity shock, market liquidity is rapidly depleted through the balance sheet effect, resulting in the interaction of collateral runs, bank runs and counterparty runs. the central bank then has to ease the crisis by releasing a large amount of liquidity, but this further increases financial institutions' moral hazard to raise leverage, which worsens the liquidity mismatch problem. (3) actual demand shock. [130] argued that the demand for liquidity stems from the mismatch between consumers' supply of and demand for goods across locations. once consumers need to spend at different times in different locations, they constitute cross-regional liquidity needs, and banks need to provide liquidity through the crediting behavior of the payment system or the interbank market. [22] agreed that consumers' random demand for liquidity becomes a liquidity shock for a depository institution because interbank market transmission is incomplete.
[86] suggested that systemic risk is mainly transmitted through the payment system, the interbank market and the derivatives market, and that consumers' random withdrawal demand at different times and in different places will have a liquidity impact on the payment system and the interbank market. the existence of the interbank market reduces the incentives for banks to hold non-profit cash; banks remain solvent under certain conditions, yet the market can still be caught in coordination failures. the financial accelerator model proposed by [63] explains the external financing premium faced by enterprises from the perspective of an incomplete credit market. the firm's incentive mechanism prompts the agent to favour debt financing. when a negative shock reduces the net worth of the enterprise, its solvency declines, which in turn hits the asset side of the lending bank. when the lending bank's assets are impaired by corporate defaults and its risk reserves are further affected, the bank faces a liquidity shock, and the impact is nonlinearly amplified through the interbank market. other factors affecting market liquidity risk are institutional and systemic risk management systems, financial system structures, and the institution's own reputation risk [64] . moreover, the phenomena associated with liquidity shocks include market freezes, asset fire sales, contagion effects and institutional bankruptcy, which also tend to appear when systemic risk materializes. financial contagion occurs when the distress of one bank or a small group of banks threatens the stability of other financial institutions, ultimately even spreading to the real economy. financial contagion can occur through both local contractual obligation connections and global market connections, e.g., through asset prices under mark-to-market valuation [131] .
[100] established a framework for contagion analysis, which studies the spread of obligation default within the financial system due to unpaid liabilities. since there are various types of connections between financial firms, shocks can transmit through the network via different channels, and in some historic events the channels also interact with each other. according to the taxonomy of [132] , there are four main types of channel: correlated-asset contagion, default contagion, liquidity contagion, and market illiquidity with asset fire sales. except for default contagion, the observed domino or spillover effects in the remaining channels are demonstrated through slumps in mark-to-market prices. thus, contagion research can be classified into two main branches: default contagion and price-mediated contagion. (1) default contagion. [133] explored how the probability and potential impact of contagion are influenced by aggregate and idiosyncratic shocks. they found that a robust-yet-fragile financial system has a low probability of such contagion. [134] extend the basic default contagion model by introducing default costs into the system. [135] assume minimal information about network structure, and through key node-level quantities such as asset size, leverage and the fraction of a financial institution's liabilities held by other financial firms, they derive explicit bounds on the potential magnitude of network effects on contagion and loss amplification. other models with extensions in default costs include [90, 136, 137] . (2) price-mediated contagion. these contagion models consider mark-to-market asset price slumps due to extreme tension in market and individual liquidity, as demonstrated in fire sales [138] [139] [140] [141] [142] [143] . cross-holdings have also been studied [91, 136] . [144, 145] provide a framework for modelling asset prices during a fire sale.
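the default contagion channel in (1) can be illustrated with a minimal threshold cascade: a bank defaults once its write-downs on defaulted counterparties exceed its capital buffer, and its own creditors then absorb losses and may default in turn. the network, balance sheet figures and the 100% loss-given-default assumption below are illustrative, in the spirit of hard-default cascade models rather than any single paper's calibration.

```python
# minimal default-contagion cascade on an interbank exposure network.
# exposures[i][j] = amount bank i has lent to bank j. a bank defaults when
# its write-downs on defaulted counterparties reach its capital buffer
# (loss given default assumed to be 100%, as in hard-default models).

def default_cascade(exposures, capital, initial_defaults):
    n = len(capital)
    defaulted = set(initial_defaults)
    changed = True
    while changed:                       # iterate to the cascade's fixed point
        changed = False
        for i in range(n):
            if i in defaulted:
                continue
            loss = sum(exposures[i][j] for j in defaulted)
            if loss >= capital[i]:       # buffer exhausted: bank i defaults
                defaulted.add(i)
                changed = True
    return sorted(defaulted)

# chain-like network: bank 0's failure wipes out bank 1's thin buffer,
# which in turn topples bank 2; bank 3 holds enough capital to survive.
exposures = [
    [0, 0, 0, 0],
    [5, 0, 0, 0],   # bank 1 lent 5 to bank 0
    [0, 4, 0, 0],   # bank 2 lent 4 to bank 1
    [0, 0, 2, 0],   # bank 3 lent 2 to bank 2
]
capital = [1.0, 3.0, 3.5, 6.0]
result = default_cascade(exposures, capital, initial_defaults=[0])  # [0, 1, 2]
```

the same fixed-point loop scales to arbitrary networks; the "robust-yet-fragile" property discussed above corresponds to capital buffers that absorb most shock configurations while a few configurations still trigger long chains.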
[131] generalizes the model of [142] by allowing for differing liquidation strategies. [146] study three extensions, namely bankruptcy costs, multiple assets and fire sales, within a single model. for a survey of empirical work on contagion, see [147] . empirical studies reveal that financial contagion cannot be well explained by the base model of obligation default [101, 135, 145, 148] . system stress tests are usually adopted to study the exposure of the whole system to indirect contagion [138, 149] . in general, the network-based models are mainly rooted in the general equilibrium approach, which gives these models substantial economic content. it also means that systemic shocks are usually assumed to come from outside rather than arising endogenously. thus, with limited amplification, contagion halts automatically and the whole system returns to a new equilibrium. even in the multiple-asset models, asset price shocks and the consequent price variations are assumed to be independent of the network contagion; otherwise, the mathematical model would be too complicated to solve. however, financial contagion research does offer great achievements in: (1) explaining contagion paths via networks of all types; (2) empirically revealing the impact of contagion on the whole financial system; (3) providing supportive evidence for capital requirements, mark-to-market accounting and limits on the scale of interbank business. systemic risk identification and measurement is one of the most important aspects of systemic risk research, and in recent years the literature in this field has grown. indicators of systemic risk include ses [7] , mes [8] , klr [24] , covar [9] and capital shortfall [10] . some scholars have backed out financial systemic risk from the market prices of derivatives (credit default swaps, cds) issued by financial institutions [150, 151] .
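the idea of backing out risk from cds prices can be sketched with the standard "credit triangle" approximation, spread ≈ default intensity × (1 − recovery). this is a textbook first-order relation, not the estimation procedure of [150, 151]; the spread, recovery rate and horizon below are illustrative assumptions.

```python
import math

# credit-triangle approximation: a flat cds spread s (per year) implies a
# default intensity lam ≈ s / (1 - R), where R is the assumed recovery rate.
# survival over t years is then exp(-lam * t), so the cumulative default
# probability is 1 - exp(-lam * t). numbers are illustrative.

def implied_default_prob(spread_bps, recovery, horizon_years):
    lam = (spread_bps / 1e4) / (1.0 - recovery)    # hazard rate per year
    return 1.0 - math.exp(-lam * horizon_years)    # cumulative default prob

# a 200 bp spread with 40% recovery over a 5-year horizon
p = implied_default_prob(200, 0.40, 5)
```

aggregating such market-implied probabilities across institutions, or pricing them jointly, is one route to the systemic indicators surveyed in the following paragraphs.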
for a survey of systemic risk measures, see [152] , which also offers code for 31 such measures. besides, [5] adopted the herfindahl index to measure the concentration of credit portfolios in the emergency loan programs provided by the fed in the 2008-2010 crisis. they found that 22 strongly connected institutions received most of the funds, and that small dispersed shocks could have triggered a systemic default event, indicating that the critical banks are "too central to fail". [153] constructed an insurance pricing model based on the put option framework of the merton model, including asset correlation, banking systemic risk and the joint default rate, suggesting that systemic risk leads to underpriced insurance, which in turn leads to insufficient liquidity in the event of a crisis. [154] used multivariate extreme value theory to model the tail dependence between multiple assets and adopted non-parametric estimation to measure systemic risk. [155] studied the monthly return data of hedge funds, banks, securities firms and insurance companies based on principal component analysis combined with granger causality, and concluded that the growing internal correlation among these four types of financial institutions has increased systemic risk. [156] viewed the banking system as a credit portfolio, analyzing the value of the portfolio and its unexpected losses to calculate systemic risk. [157] treated financial sectors with a time-varying overall measure, defined systemic risk as a conditional default probability, and analyzed the default probability through credit repayment. when systemic risk is defined as a series of debt contract defaults, systemic risk is linked to credit risk. the biggest characteristic of credit risk is its tail effect.
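this tail effect is often summarized by the tail dependence coefficient, the probability that one institution suffers an extreme loss given that another does. the sketch below estimates the lower-tail measure P(U2 < q | U1 < q) by monte carlo for a gaussian copula; the correlation, quantile and sample size are illustrative assumptions (and for the gaussian copula the coefficient vanishes as q → 0, which is precisely the motivation for heavier-tailed copulas).

```python
import random
import statistics

# monte carlo estimate of the lower-tail dependence measure
# P(U2 < q | U1 < q) for a gaussian copula with correlation rho.
# correlated normals are built via z2 = rho*z1 + sqrt(1-rho^2)*eps.

def lower_tail_coef(rho, q, n=200_000, seed=7):
    rng = random.Random(seed)
    z_q = statistics.NormalDist().inv_cdf(q)   # tail threshold on normal scale
    both = first = 0
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        z2 = rho * z1 + (1 - rho**2) ** 0.5 * rng.gauss(0, 1)
        if z1 < z_q:
            first += 1
            if z2 < z_q:
                both += 1
    return both / first if first else 0.0

est = lower_tail_coef(rho=0.6, q=0.05)   # roughly 0.3 at the 5% quantile
```

pushing q deeper into the tail drives this estimate toward zero under the gaussian assumption, whereas a t-copula keeps it bounded away from zero.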
therefore, many researchers have introduced copula models to simulate the tail dependence of financial institutions' extreme losses under systemic risk [11, 12] . [158] proposed a new systemic risk analysis framework, which incorporates marginal default probabilities, the credit risk structure, consistent information multivariate density optimization (cimdo), a generalized dynamic factor model and a t-copula model. [149] propose the endogenous risk index (eri) to measure spillovers across portfolios, as well as the indirect contagion index (ici) to capture the importance of a bank by measuring the loss it inflicts on other financial firms. the two indicators provide complementary measures of interconnectedness. [159, 160] provide reviews of more recent measurement studies. it can thus be seen that, owing to the tail and nonlinear characteristics of systemic risk, there are many methods for its quantitative assessment, and every approach and definition should serve its purpose. given the fast development of systemic risk research, this paper may not cover all the latest methods, but it can provide a series of frameworks for reference. according to the survey work, the approaches are classified into five categories: theoretical foundations, mathematical models, econometric methods, simulation and agent-based models. for simulation methods, see [148] ; they will not be discussed in this section. systemic risk research has its theoretical roots in equilibrium approaches, which follow a micro-based research paradigm. to reach equilibrium, the route of deduction must be clarified, and micro-level equilibria or activities must be coordinated toward general equilibrium through a series of preset hypotheses. to explain frictions on financial markets, the preset conditions are adjusted accordingly, based on more convincing and practical micro-foundations.
there are several types of endogenous friction: inconsistency in preference or expectation [22, 161] , heterogeneity [22, 26, 86] , and information asymmetry [67, 86] , as in the bgg model [63] . besides, connections are established within the financial system through obligation contracts [22, 26] , institutions' capital [86] or other generalized agreements [162, 163] , which grounds the network analysis of contagion, domino effects and spillovers. as mentioned above, the study of the generation and transmission mechanisms of systemic risk draws on the dnk model framework. linking the activity of entities and the market structure at the micro level with macroeconomic growth and volatility gives the macroeconomic fluctuations associated with systemic risk a certain micro foundation. because systemic risk is accompanied by rising leverage, financial frictions, large liquidity fluctuations and other derivative phenomena, it is closely related to individual heterogeneity and to the characteristics of the market structure itself. to explain the impact of individual heterogeneity or heterogeneous expectations on financial market asset prices and the real economy, calvo and rotemberg's sticky-price models are used for reference. similar is the application of two-sector models in the utility function or in the constraint function of the household sector: for example, [65] distinguish prime from subprime buyers and separate their constraint functions, explaining the intrinsic relationship between the rapid rise of mortgage loans and the rapid rise in housing prices before the subprime mortgage crisis. game models are another common method for explaining financial frictions and systemic risk. [36] proposed an analytical approach based on coordination games.
by dividing depositors' investment behaviour into two categories, patient and impatient, under incomplete information, an investor chooses whether to withdraw the deposit from the bank in advance according to expectations, and a bank run may then occur through individuals' rational behaviour. this model successfully explains the bank's asset-liability maturity mismatch and the transmission mechanism that triggers a banking crisis. its deficiency is that the model has multiple equilibria, so it cannot predict the occurrence of bank runs. the coordination game model was then further optimized and expanded [37] . [126] proposed a way to relax the public information hypothesis, allowing participants' ex post observations to satisfy certain random distributional characteristics. a game based on this hypothesis is named a global game. compared to the coordination model, the global game model relaxes the common knowledge assumption and reduces the number of equilibria. [127] introduced the global game method into the study of financial crisis theory and studied the mechanism of currency attacks; the sequential global game method is also widely used in macroeconomics and financial crisis research. chinese scholars have also used this model to explain banks' liquidity crises and government rescue policies [164] , and the contagion mechanism by which liquidity problems spread from informal to formal financial institutions [165] . as for the modelling of contagion and spillover risks among financial intermediaries, the covariance of assets [7] or liabilities [7, 86, 90] is introduced based on micro-level maximization, in which the agent is either risk-neutral or risk-averse; the asymmetric information assumption is also brought in.
thus, the general equilibrium approach and its relaxed conditions can explain these endogenous questions: how participants react to an unexpected shock; what expectations they form and what actions they take thereafter; and how systemic risk is transmitted or spillovers occur within the system. however, although frictions are invited into the equilibrium models, the models must always assume that shocks are exogenous. another series of theoretical approaches is based on the analysis of evolutionary history and institutions, like marxism and minsky's theory. the latter inherited the former's essence in the explanation of profits, in that both agree that profits fall as investment increases during economic expansions, and that a crisis takes place when profits cannot cover the current payment obligations. minsky further developed his theory by building a profit function on an accounting identity; it follows a macro-micro-macro non-equilibrium paradigm, which removes the need for preset assumptions. through decomposition, the profit function can embrace various endogenous factors from different sectors of an economy, including technical improvements, public policies and financial institutions, whereas marxism mainly attributes profits to surplus labour and the capital structure. minsky's methodology can also be applied to analyze household and government behaviour and its subsequent effects on the financial system. this methodology solves the exogeneity problem, as financial crisis phases are integrated into a single theory explaining the operation of capitalist economies. to avoid mathematical formalism, minsky insisted on delivering his theory via a narrative approach, which leaves more space for his successors to explore [82] [83] [84] , etc. network and contagion models are mainly established within the general equilibrium approach [166] .
after [22] , the baseline network model was proposed by [100] and has been expanded from explaining default contagion to price-mediated contagion. based on that, [133] offered a model involving hard defaults, in which the interbank debts of defaulted banks recover zero value. there are also other models with extensions for bankruptcy costs, multiple assets and fire sales. [142] provides a framework proving the existence of clearing asset prices and liability payments in equilibrium for a network contagion. [143] expand it to a dual-risky-asset equilibrium. whether the eisenberg-noe model, the gai-kapadia model or the amini framework, all are static cascade models, which assume that the structure and scale of the network are fixed. networks, however, evolve through the interactions of participants' asset inter-connections and outside shocks, so random graph models are brought in [6, 132, 140, 142, 167-169] . moreover, to capture the stochastic and dynamic structure of networks with a vast number of nodes, mean-field models are borrowed from physics, which can be applied to explain herding effects and endogenous contagion [170, 171] . these models focus on quite simple interbank interactions that neglect defaults and contagion [137, 170, 172] . the latest models introduce credit risk to explain default contagion [157, 173, 174] . since, in most economies, data on the interactions of financial institutions in the system are difficult to obtain, it is also impossible to acquire specific counterparty-trading information from a bank's balance sheet. thus, alternatives are adopted to settle this problem, one of which is the maximum entropy approach. the debtrank approach is another measure for estimating distress contagion without observing failing institutions [5] .
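the eisenberg-noe clearing mechanism referenced above computes a clearing payment vector as a fixed point: each bank pays the minimum of its total obligations and what it can actually raise (external assets plus payments received), with payments split pro rata among creditors. the sketch below runs the standard fictitious-default iteration from full payment; the three-bank balance sheet is an illustrative assumption.

```python
# minimal eisenberg-noe clearing: each bank pays min(total liabilities,
# external assets + inflows from other banks), split pro rata among its
# creditors. iterating this map from full payment converges (monotonically
# downward) to the greatest clearing vector. figures are illustrative.

def clearing_vector(liab, external, tol=1e-10, max_iter=1000):
    n = len(external)
    p_bar = [sum(liab[i]) for i in range(n)]       # total nominal obligations
    # pro-rata shares: pi[i][j] = fraction of i's payments owed to bank j
    pi = [[liab[i][j] / p_bar[i] if p_bar[i] > 0 else 0.0 for j in range(n)]
          for i in range(n)]
    p = [float(x) for x in p_bar]                  # start from full payment
    for _ in range(max_iter):
        inflow = [sum(pi[i][j] * p[i] for i in range(n)) for j in range(n)]
        p_new = [min(p_bar[j], external[j] + inflow[j]) for j in range(n)]
        if max(abs(a - b) for a, b in zip(p, p_new)) < tol:
            return p_new
        p = p_new
    return p

# bank 0 owes 10 to bank 1 but holds only 4 externally; bank 1 owes 5 to
# bank 2 and covers it with bank 0's partial payment plus its own 2.
liab = [[0, 10, 0], [0, 0, 5], [0, 0, 0]]
external = [4.0, 2.0, 1.0]
p = clearing_vector(liab, external)   # bank 0 pays 4, bank 1 pays 5
```

the shortfall p_bar − p identifies defaulting banks (here bank 0), and extensions such as default costs or fire-sale price impact modify the right-hand side of the fixed-point map without changing the iteration itself.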
the main methods used are copula models to simulate the joint distribution of risk factors; option pricing models and multivariate extreme value theory; and principal factor analysis to extract key factors, combined with the dcc-garch model to capture the correlation between multiple assets [8] and to build multivariate quantitative indicators. furthermore, to verify how the economic structure varies as systemic risk materializes, var models are usually adopted [82] [83] [84] , etc. [175] proposed introducing agent-based models (abm) to address individual heterogeneity, and monte carlo simulation can be used to predict the impact of micro-agent heterogeneity on macroeconomic variables. compared with game models, abm accommodates more differences and can link the internal logic of economic evolution with external data consistency. the key contributions of [176, 177] are to incorporate the leverage accelerator into agent-based simulated network models. the limitation of such models is that their results are difficult to calibrate and test. in addition to financial heterogeneity issues, systemic risk research pays particular attention to the stability of the system itself, and complex network models are borrowed to address such problems. because its structure resembles infectious disease and bio-food-chain networks, the financial system network links various financial institutions through financial flows and triggers domino effects when risk events hit individual institutions [22, 97, 178] . however, the financial network system is more complicated than the natural food web or infectious disease transmission, because the activities of participating individuals are interactive, the interactions are not always symmetrical, the behavioral choices of each entity change with expectations or event shocks, and the outcomes of behaviors are also uncertain.
this makes research on the structural stability of the financial system more complicated, and forces researchers to restrict the behavioral characteristics of individuals in the system through many strict preconditions [3] . the methods for identifying and quantifying systemic risks have been discussed previously; in general, the system of research methods for systemic risk is summarized in figure 5 . systemic risk is understood from the perspective of classical economics as the transformation between multiple economic equilibria. from the perspective of risk management, it is tail risk management.
from the perspective of behavioral finance, it arises because the psychological and behavioral characteristics of micro agents form a self-reinforcing feedback mechanism; from the perspective of complex network theory, it is the risk of systemic overall collapse. from a broader perspective, it also touches the fields of sociology, psychology and political science. there are a thousand hamlets in the eyes of a thousand people: different starting points focus on different problems, and the methods used differ accordingly. it is precisely the complexity and changeability of systemic risk that attracts researchers to keep innovating. the systemic risk research branch is extensively involved, both in connection with traditional financial crisis research and in expanding research on system structure and risk warning. financial systemic risk research can be said to be the field where finance, and indeed economics, intersects most with other disciplines. the main research methods used include both the tools of traditional economic theory and social network models and natural-science-style experimental simulations. for the generation of systemic risks, the endogeneity of shocks is difficult to handle with traditional economic models; hence dnk's dsge model often assumes that shocks are exogenous, and to predict systemic risks it is urgent to introduce interdisciplinary approaches to innovate research on endogenous factors.
on the other hand, the adoption of complex network models often requires many rigorous assumptions that are removed from reality, and these assumptions may exclude potential real-world risk factors. after all, the financial system itself is much more complicated than the food chain, while other new interdisciplinary research methods still have insufficient internal economic explanatory power. it is, therefore, necessary to use appropriate analytical tools for the specific issues of the specific financial sector. to better sort out the hotspots, the latest 5 years' (2015-2020) systemic risk research, 733 articles in total, was analyzed and connected with the development of crisis research. it turns out that two divisions are mainly focused on ( figure 6 ).
a total of 733 articles were screened and analyzed according to their titles, keywords, abstracts and references. among the 10 subdivisions, more co-citation clusters are formed in systemic risk measurement, financial market structure and financial stability. this map reveals that more recent studies focus on network stabilization and systemic risk measurement. although the extreme value approach was brought into this area, it has attracted only a small amount of attention. furthermore, the high concentration on risk measurement may imply that the measuring methods are gradually maturing, which leaves limited room for subsequent researchers. considering the ongoing worldwide recession, the research directions that may be further expanded in the future are as follows: (1) more work on models with multiple shocks and spillover effects should be done. whether shocks are endogenous or exogenous, contagion analysis should not be limited to a specific market or type of intermediary.
more than one spillover risk should be discussed within a model. moreover, at present there is much research on systemic risks in the banking system, the real estate market and the foreign exchange market, but few studies on the systemic risks of shadow banking [179, 180] and internet financial markets, which may be related to the lack of available data and the difficulty of providing supporting empirical studies. (2) dynamic multi-layer networks await further exploration, especially where asset prices are connected with financial contagion. (3) more empirical studies of the impact of systemic risk on the real economy are necessary. it is expected that recent advances in large-scale data and computing tools will benefit systemic risk studies by making more real-world, large-scale data available for empirical analysis [181] . (4) event studies should focus on modifications of market structure driven by new technologies or by policies that promote information disclosure, for example the alternative index (such as sofr) replacing libor. moreover, digital currency issued by central banks as a substitute for banknotes means the monetary multiplier is no longer applicable, which will profoundly alter the whole financial system; these innovative measures also deserve attention. besides, there are other areas worth exploring, such as more comprehensive risk warning indicators, more effective econometric methods, and crossovers with other fields of economics, sociology, psychology, etc. in short, this is an active research field with strong practical significance and policy reference value, and systemic risk research will become more vibrant with the participation of more researchers.
key: cord-176131-0vrb3law authors: bao, richard; chen, august; gowda, jethin; mudide, shiva title: pecaiqr: a model for infectious disease applied to the covid-19 epidemic date: 2020-06-17 cord_uid: 0vrb3law

The COVID-19 pandemic has made
clear the need to improve modern multivariate time-series forecasting models. Current state-of-the-art predictions of future daily deaths and, especially, of hospital resource usage have confidence intervals that are unacceptably wide. Policy makers and hospitals require accurate forecasts to make informed decisions on passing legislation and allocating resources. We used US county-level data on daily deaths and population statistics to forecast future deaths. We extended the SIR epidemiological model to a novel model we call the PECAIQR model. It adds several new variables and parameters to the naive SIR model by taking into account the ramifications of the partial quarantining implemented in the US. We fitted the model parameters to data with numerical integration. Because of the fit degeneracy in parameter space and the non-constant nature of the parameters, we developed several methods to optimize the fit, such as training on the tail of the data and training on specific policy regimes. We used cross-validation to tune our hyperparameters at the county level and generated a CDF for future daily deaths. For predictions made from training data up to May 25th, we consistently obtained an average pinball loss of 0.096 on a 14-day forecast. Finally, we present examples of possible applications of the model: we generate longer-horizon predictions over various one-month windows in the past, forecast how many medical resources such as ventilators and ICU beds counties will need, and evaluate the efficacy of the model in other countries.

We used the county-level cumulative and daily death reporting from the New York Times [1], as well as the county-level active-case reporting from Johns Hopkins University [2]. We also used county-level population statistics from the 2017 American Community Survey and county-level policy dating information from Johns Hopkins University.
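The data assembly can be sketched in plain Python. The field names and row dictionaries below are hypothetical stand-ins for the NYT and ACS files, which are not reproduced here; only the 3-day window and the FIPS join come from the text:

```python
def moving_average(values, window=3):
    """Trailing moving average used as the smoothed death feature
    (a 3-day window in the paper); short prefixes average what exists."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def merge_by_fips(deaths_rows, population_rows):
    """Join death and population rows on the county FIPS code,
    dropping observations with a missing FIPS (the paper's handling,
    except for the manual inclusion of New York City)."""
    pop = {r["fips"]: r for r in population_rows if r.get("fips")}
    return [{**pop[r["fips"]], **r}
            for r in deaths_rows
            if r.get("fips") and r["fips"] in pop]
```

For example, `moving_average([0, 2, 4, 6])` yields `[0.0, 1.0, 2.0, 4.0]`.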
Specifically, we loaded the policy dating information regarding the implementation of stay-at-home orders. We loaded these data into a data frame by matching the county FIPS codes from each source, and computed a moving-average death statistic as an additional feature, using a window of three days. We dealt with missing values in the FIPS codes and death reporting by removing the corresponding observations. The only exception was county 36061, which represents New York City: this county had a missing FIPS code but well-reported data, and since New York City is the most active COVID-19 hotspot in the United States, its manual inclusion was necessary. In addition to the aforementioned data sources, we experimented with mobility data, but this did not make it into our final epidemiological model, because there was no simple mapping to any of the model variables and we believed the policy dating information was sufficient to establish distinct regimes in the data.

The traditional SIR epidemiological model breaks a region's population into three groups: susceptible, infected, and removed [3]. The primary flaw of applying this model to the current COVID-19 pandemic is the poor approximation of assuming that all people in each group have uniform experiences. Lumping the population into only three groups is an example of omitted-variable bias, excluding major realities caused by the scale of the pandemic. The PECAIQR model described below takes each group in the SIR model and breaks it down into a further level of classification. The variables in our model are developed following simple logical arguments based on current global realities and widely accepted scientific and epidemiological results. We first break down the susceptible class of the SIR model. Owing to policies implemented in the US, many individuals have been self-quarantining [4].
It is sensible, then, that each day only a fraction of the susceptible population is exposed to potentially catching the virus from the outside world. Following this logic, our model breaks the susceptible population into three classes: protected, exposed, and carriers (explained in detail below). Next, we consider the infected group of SIR. There is evidence, from the large data sets gathered in South Korea and other well-respected global scientific efforts, that a significant fraction of people infected with COVID-19 do not display visible symptoms [5]. Following this line of reasoning, our model breaks the infected group into two classes: asymptomatic and infectious. An important note is that we assume these asymptomatic and infectious people still actively participate in the community, enabling them to come into contact with exposed people. Finally, we are left with the removed group. Our model incorporates two lines of reasoning in the breakdown of this group. First, it is sensible that a portion of the members of the asymptomatic and infectious classes end up self-quarantining (at least in effect) as a result of getting tested, showing initial symptoms, or having an "intuition" that they contracted the disease. Second, much of the evidence about COVID-19 so far indicates that an individual who contracts the disease cannot get it again (at least on the order of a few months) [6]. Following these arguments, the model breaks the removed group into quarantined and removed classes. Removed further has the subclasses of dead people and recovered people, the latter assumed to be immune to the virus. Each of the variables we describe changes continuously with time; we describe the variables for a given time t (associated with some given day T).

Protected: people who did not go outside to expose themselves to any infected individuals (asymptomatic or infectious) on t, though they have a chance of coming into contact with a carrier.
Any person in protected also has a chance of joining the exposed class at a later time t + δ (interpreted as the next day, t + 1), as they may wish to travel outside that day for any reason.

Exposed: people who went outside on t and therefore had a chance of coming into contact with an infected individual (asymptomatic or infectious) and becoming a carrier. Any member also has a chance of joining the protected class on the following day t + δ, as they may wish to self-isolate the next day for any reason.

Carrier: people who occupy a deliberately temporary position in the model. They are people in the exposed class who came into contact with an asymptomatic or infectious person and got the disease "on their hands" to some degree. Carriers returning home on t have a chance of spreading the disease to some protected people living in their home, making those people either asymptomatic or infectious on t + δ. Carriers at time t also have a chance either to "touch their face" and contract the disease themselves (becoming asymptomatic or infectious on t + δ), or to "wash their hands" and return to being a member of the exposed class on t + δ.

Asymptomatic: people who were infected by COVID-19 and are contagious on t, but show no symptoms. These people are assumed to be active in public, in that they have a chance to spread the virus to exposed people in public areas on t. An asymptomatic person on t + δ has a chance of becoming a member of infectious after showing initial symptoms, a member of quarantined after somehow finding that they contracted the virus, or a member of removed after having the disease pass through their immune system.
Here, all the people going from asymptomatic to removed enter the recovered subclass.

Infectious: people who were infected by COVID-19, are contagious on t, and show symptoms. These people are assumed to be active in public, in that they have a chance to spread the virus to exposed people in public areas on t. An infectious person on t + δ has a chance of becoming a member of quarantined, after realizing that they contracted the virus, or a member of removed, after having the disease pass through their immune system. When these infectious people become members of removed, they have a chance of dying and joining dead, or of surviving and joining recovered.

Quarantined: people who were originally asymptomatic or infectious, subsequently removed themselves from the public, and are self-quarantining on t. These people are assumed to be unable to spread the disease to anyone, so this group includes people who have COVID-19 but are in a non-contagious stage. A quarantined person on t + δ has a chance of becoming a member of removed, and in doing so has a chance of dying (joining dead) or surviving (joining recovered).

Removed: people who were originally asymptomatic, infectious, or quarantined, and had the disease fully pass through their immune system on or before t. These people belong to one of two subclasses: dead and recovered. Recovered people are assumed to have survived the disease and will not be infected again.

To fit the deaths data to the system of differential equations in the PECAIQR model, we performed numerical integration using the SciPy odeint package [7], and traversed the parameter space to find a set of parameters that minimized the least-squares error of each fit variable relative to its observed counterpart.
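The full PECAIQR right-hand side is not reproduced in this excerpt, so the following is only a minimal sketch of the integrate-and-fit loop on a simplified SIRD stand-in; the rate constants `beta`, `gamma`, `mu`, the bounds, and the synthetic data are illustrative assumptions, not the paper's values:

```python
import numpy as np
from scipy.integrate import odeint
from scipy.optimize import least_squares

def sird(y, t, beta, gamma, mu):
    # Simplified stand-in for the PECAIQR right-hand side.
    S, I, R, D = y
    N = S + I + R + D
    dS = -beta * S * I / N
    dI = beta * S * I / N - (gamma + mu) * I
    dR = gamma * I
    dD = mu * I
    return [dS, dI, dR, dD]

def fit_deaths(t, observed_deaths, y0, guess):
    # Residual: mismatch between the integrated D(t) and observed deaths.
    def residual(params):
        sol = odeint(sird, y0, t, args=tuple(params))
        return sol[:, 3] - observed_deaths
    # Bounded search keeps the rates inside a plausible range.
    return least_squares(residual, guess, bounds=(0.0, 2.0))

# Synthetic check: generate data from known rates, then recover them.
t = np.linspace(0, 60, 61)
true = (0.4, 0.1, 0.02)
y0 = [9990.0, 10.0, 0.0, 0.0]
deaths = odeint(sird, y0, t, args=true)[:, 3]
fit = fit_deaths(t, deaths, y0, guess=[0.3, 0.2, 0.05])
```

On noiseless synthetic data the minimization recovers the generating rates; on real county data the same loop is run per county with county-specific guesses and bounds.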
Due to the size of the parameter space, this requires an initial guess for the parameters and for the initial conditions of each of the seven PECAIQR variables, as well as defined ranges to restrict the search. Of course, we could simply examine the entire logical parameter space (with values ranging from 0 to 1) for each parameter and each PECAIQR variable, and use a random guess within that space to initialize the least-squares minimization. However, this is unnecessarily inefficient. The actual parameters may differ widely from county to county, but they share a similar order of magnitude. A better approach is to run this exhaustive, unconstrained search only once, on a relatively mature curve such as Lombardy in Italy or New York in the United States, to extract a reasonable guess for these orders of magnitude, and then feed this guess to the other counties as well, providing a more sensible starting point for the least-squares minimization. Due to the complexity of our model, however, there are a few more nuanced details, which we discuss later. First, we elaborate on the fitting procedure. Initially, we fit only the death variable (D) to the observed death reporting. Early on, we decided not to fit our infection curves to daily case statistics, as their reporting is inconsistent and unreliable and would only diminish the accuracy of the fit to observed deaths, a much more reliable statistic. Later, however, we realized that we could also effectively fit the quarantined variable (Q) in the PECAIQR model to the active-case reporting. The intuition behind this is the assumption that those who test positive either self-quarantine at home or are forcibly quarantined in a hospital if their condition is severe enough.
Of course, this does not capture all self-quarantined individuals, so we permitted a large degree of fuzziness in the fit of the quarantined variable (Q) to active cases. We achieved this with a bias scale factor that gives much more weight to observed deaths in the fit of the death variable (D). This causes the model to prioritize the deaths fit over the active-cases fit, so the active-cases fit becomes a suggestion rather than a constraint. The hope is that the least-squares error on the active cases is not large enough in magnitude to compromise the deaths fit and force the parameter space into a different minimum, but rather provides a subtle correction around the local minimum discovered by minimizing the least-squares error on the deaths. Unfortunately, a fit to active cases is not helpful for most counties, as most counties do not maintain their active-case reporting well; moreover, we discovered that the criteria for an active case vary widely across states.

We also realized that the PECAIQR model parameters are not static, owing to the dynamic and rapidly evolving state of the pandemic, driven mainly by external forces such as social-distancing protocols and lockdown policies. A naive fit on the data would only yield some average of the parameters. Since we care most about the recent characteristics of the death and infection curves when making predictions, we can do better by giving the more recent data points more weight in the least-squares error calculation. This forces the minimization to favor a solution that fits a more recent window of time more heavily, while still retaining the effects of past data to some degree. We achieve this with a geometric progression of weights, together with a bias term that sets a maximum weight for all data points before a certain cutoff. We called this method training on the tail.
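One possible reading of the "training on the tail" weighting scheme is sketched below; the decay ratio, cutoff, and bias values are illustrative assumptions, and assigning all pre-cutoff points a single flat bias weight is our interpretation of the text:

```python
import numpy as np

def tail_weights(n, ratio=1.15, cutoff=14, bias=0.05):
    """Geometric progression of least-squares weights that favors the
    most recent observations; points older than `cutoff` days all
    receive a constant bias weight instead of the geometric value."""
    ages = np.arange(n)[::-1].astype(float)  # age 0 = most recent point
    w = ratio ** (-ages)                     # geometric decay into the past
    w[ages > cutoff] = bias                  # flat bias weight before cutoff
    return w
```

In practice each residual would be multiplied by the square root of its weight before being passed to the least-squares solver, so the squared-error contribution scales by `w`.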
Another way to isolate the most recent parameters from the data is to use the assumption that there are distinct parameter regimes correlated with the start of policies, the stay-at-home orders in particular. Note that these policies directly affect the infection curves because they limit the spread of the disease. For the death curve, there is typically an offset between the date of policy implementation and the date at which its effects become apparent in the death data, approximately equal to the average time until death for COVID-19. With this assumption, we separate the deaths data into two training regimes: one for dates before the policy implementation plus the offset, and one for dates after. We then train one fit on the first regime, and feed the fitted parameters and the predicted variables on the date of policy implementation as the guesses and initial conditions, respectively, of a second fit on the second regime. We called this method training on the policy regime.

We perform the procedures described above at the county level, and then use the fitted parameters with the numerical integration to extrapolate into the future and obtain county-specific predictions. Note that the predictions and their errors are in cumulative deaths, but for visualization we later converted them to daily deaths. To get the errors, we initially used a method that calculated error bounds by deriving the parameter variance from the residual variance, using the covariance matrix of the residuals and the Jacobian around the fitted parameters. Having obtained the mean and standard deviation of each parameter, we assumed each parameter was normally distributed, and sampled 100 parameter sets.
We could not simply calculate the parameters for each confidence interval from their mean and standard deviation, because it is not obvious which parameters are positively or negatively correlated with the death predictions. This method turned out not to be ideal in all cases, as there are certain invalid regions in the parameter space that cause the error bars to spike. We therefore developed a bounding method that reliably tightens our confidence intervals. This method infers the error bars from the deviation, relative to the predicted fit, of a smoothed version of the residuals calculated from the moving average of daily deaths, which is equivalent to the moving average of the slope of the cumulative deaths. We generate a PDF of deaths around the best-fit prediction as follows. For each time t in the training range, compute the difference in "slope" between the fit and the actual data. The slope in the fit is simply the current predicted death count minus the previous one on the fit curve. For the slope in the actual data, to mitigate the dominating effects of outliers, we instead take the difference between consecutive points of the moving average (window of three days). We then define a normal distribution of slope ratios: for each time t in the training data, we compute the ratio of the actual (moving-average) slope to the fit slope, and collect these ratios in a list. The mean and standard deviation of the distribution are then the mean and (sample) standard deviation of this list. Outliers, defined as values more than three standard deviations from the mean, are removed; this is necessary because the ratios are unrealistically high in the small-number limit. For each time t in the extrapolated prediction, we generate a confidence range of 100 points. We find the slope at the extrapolated time as the predicted death count minus the predicted death count at the previous time.
This slope is multiplied by a random scaler, denoted s, sampled from the normal distribution of ratios. We multiply the predicted death count at the previous time t − 1 in the fit by 1 + s and add this as a point in the PDF at time t. This multiplication is repeated 100 times, so that 100 points are generated. We obtain the discrete 10th through 90th CDF percentiles by sampling the PDF.

The methods described in the past two subsections are implemented as options that can be activated with hyperparameters, and collectively they provide several different ways to fit the PECAIQR model and generate the confidence intervals. Due to the time constraints of the competition, we were not able to develop a sophisticated blending method to optimize over the hyperparameter space. However, we sampled a few combinations of hyperparameters and determined, at the county level, which one works best for each county, by modifying the evaluation script provided by the TAs. This script computes the pinball loss [8] for the submission file of predictions, scored against the most recent data. To determine a good set of hyperparameters for each county, we trained using a two-week cutoff in the data and scored the predictions for the subsequent two weeks using the evaluation script. For predictions made from training data up to May 25th, we obtained an average pinball loss of 0.096 across all counties on a 14-day forecast. Our greatest weakness was the lack of a second working model with which to cross-validate, as well as the lack of a sophisticated blending method to optimize over the hyperparameter space for the single model. We were not able to develop an alternative working model due to the time constraints of our group members; a detailed description of our attempts is given in Section 5, Failed Models.
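The scoring metric can be made concrete. The pinball (quantile) loss below is the standard definition; averaging it over the nine submitted percentiles is our reconstruction of the evaluation script, not the original code:

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    """Pinball (quantile) loss for a single quantile level q in (0, 1):
    under-prediction is penalized by q, over-prediction by 1 - q."""
    diff = y_true - y_pred
    return float(np.mean(np.maximum(q * diff, (q - 1) * diff)))

def averaged_pinball(y_true, quantile_preds):
    """Average pinball loss over the 10th..90th percentile predictions,
    given as a dict mapping quantile level -> prediction array."""
    qs = [round(0.1 * i, 1) for i in range(1, 10)]
    return float(np.mean([pinball_loss(y_true, quantile_preds[q], q)
                          for q in qs]))
```

For a single true value of 10 and a median prediction of 8, the loss at q = 0.5 is 1.0, while at q = 0.9 the same under-prediction costs 1.8, which is what drives the forecast distribution to be calibrated rather than merely centered.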
Having two different models would have allowed us to mitigate their individual weaknesses and cover their edge cases with the other's strengths. The epidemiological model has several major weaknesses. Although it works well for counties with well-recorded data, it fails for the vast majority of counties, which have low or noisy death statistics. We called these the "non-convergent counties," because the epidemiological model could not converge to a parameter set whose predictions consistently scored better than the naive all-zeros prediction.

The curves for the PECAIQR variables in the left column of the preceding figures make sense intuitively. We expect to see conversion between the protected and exposed populations, and the removed population eventually dominating. We also expect the carrier population to peak before the infected population, which should have a similar shape to the asymptomatic population; lastly, we expect the infected and asymptomatic populations to peak before the deceased population. The second figure, which uses the "fit on tail" method, more closely captures the uncertainty of the tail end of the curve, as expected.

For the "non-convergent counties," we attempted to skip the parameter-fitting step by guessing parameters based on those of similar counties. To do this, we first needed to establish proof of concept that there is some correlation between the non-COVID features of a county and its PECAIQR parameters. So we visualized the parameters of all the "convergent" counties using dimension-reduction algorithms: principal component analysis [9], singular value decomposition [10], and t-distributed stochastic neighbor embedding.

Figure 3: Several different visualizations of the parameter space using dimension-reduction algorithms.
Each point represents a county, and its size correlates with the county population. For SVD, we plotted the first two columns of the left singular matrix against each other, which visualizes how similar the counties are, by distance, based on their abstract preferences in the parameter space. Although we did not have time to further explore the possible correlation between parameter clusters and non-COVID county-specific features, we were excited to see that there are distinct intrinsic patterns in the parameter space. Further research would involve classifying these clusters with a clustering algorithm, and then attempting to predict the cluster labels or quantified distances of specific counties from the non-COVID feature space with a regression algorithm.

We also realized that, even for the larger counties, there was some uncertainty caused by the complexity of our model. Since our parameter space has so many dimensions, there is an issue of degeneracy: the least-squares minimization settles on a parameter set that yields a minimum of the least-squares error metric, but this is not necessarily the only minimum, or even the best one. We were convinced of this when we discovered that for each county we could reach distinct parameter regimes with different initial guesses, and that these solutions are similarly valid. We show this in the figures below by repeating the procedure from the previous section with a different set of parameters as the initial guess for the least-squares minimization. Following these observations, we hypothesized that there could be better parameter guesses that we had not encountered. To obtain them, further research should focus on deriving reasonable values for the more interpretable parameters of the model, such as those concerning the rate of infection, from existing data. This also illuminates the issue of ODE stiffness.
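The SVD view of the county parameter space described above can be sketched in a few lines; the random matrix stands in for the fitted per-county parameters, which are not reproduced here:

```python
import numpy as np

def svd_embedding(param_matrix):
    """Project counties (rows) onto the first two left-singular vectors
    after centering; plotting these two columns against each other gives
    the distance-based similarity view described in the text."""
    X = param_matrix - param_matrix.mean(axis=0)  # center each parameter
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :2]

rng = np.random.default_rng(0)
params = rng.random((50, 12))   # 50 counties x 12 fitted parameters (synthetic)
coords = svd_embedding(params)  # one 2-D point per county
```

Scaling each column of `U` by the corresponding singular value would recover the usual PCA scores, so the PCA and SVD views differ only by per-axis scaling.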
We believe the functionality of the SciPy odeint package is quite limited. Given more time, we would explore other statistical packages such as Stan, which has a state-of-the-art implementation of fourth-order Runge-Kutta numerical integration for stiff ODEs [11]. Another improvement would be a numerical integrator that can handle delay differential equations, which would give us more control over the time delay between the infected and death states. We believe that, overall, the PECAIQR model is very promising, and that it reveals the benefit of attempting more ambitious, complex epidemiological models. The differential equations of the epidemiological model establish the intuitive rules by which it operates, so it has more long-term predictive power than most other models. The shapes of the PECAIQR solutions are also quite interesting, as they resemble heavy-tailed distributions similar to the Frechet/Weibull distributions. The heavy tail is especially important for pandemic forecasting, as we expect daily deaths to fluctuate around a low value for some time rather than decaying immediately to zero; in this respect, the tail end of the curve is almost stochastic in nature once the infection curves lose enough momentum.

Given the high variability in the reported death data for each county, with some counties reporting unrealistic jumps in death counts, it seemed sensible to try a stochastic model for predicting county death counts. To best preprocess the county data for training, we explored clustering methods able to cluster time series of varying lengths. Dynamic time warping (DTW) was chosen as the distance measure, as it can compare time series of different lengths and provide a "warped distance" between each pair of series of daily reported deaths.
the daily reported deaths were preprocessed by first removing the series with all 0's, and then performing a z-normalization on each data series. the data series were then compared with dynamic-time warping, and a hierarchical clustering was developed based on the dtw distance matrix. the number of clusters was determined using the elbow method. the series that had all 0's were then added back into the clustering list, with cluster id '0'. it was desired that further clustering be done, specifically hmm-based clustering. the ideal setup would be a clustering based on an iterative dtw-hmm clustering algorithm, in order to fully extract similarities between county reported deaths. unfortunately we did not have time to fully develop this idea. once the clusterings were developed, each cluster was used to train an hmm model. we used hmmlearn's gaussianhmm model in order to have the gaussian emissions needed for this kind of data (a multinomial model with discrete output would be stretched thin with ∼100 states). the number of states each gaussianhmm was given per cluster was chosen by doing a modified version of the elbow method, to ensure that the hmm is complex/simple enough to match the variance in reported deaths. once these hmms were trained, we developed a method to initiate an hmm close to a particular starting emission and allow it to generate 14 subsequent emissions, to make 14-day predictions for the counties in the cluster. these predictions could be made thousands of times for each hmm, allowing each cluster to effectively create a probability distribution of predictions. further work needed to be done to make this stochastic model effective. as the model only allowed for predictions based off of an entire cluster, there needed to be a secondary layer of scaling to allow a cluster prediction to be mapped to each individual county.
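the generation step can be sketched without hmmlearn: given a fitted gaussian hmm's transition matrix, state means and state standard deviations, a 14-day prediction is just 14 sampled emissions. everything below (the 2-state "low deaths"/"high deaths" regime and all numbers) is an invented placeholder for the parameters hmmlearn's gaussianhmm would actually learn:

```python
import random

def generate_emissions(start_state, trans, means, stds, n=14, seed=0):
    """sample n gaussian emissions from a hidden markov chain, starting
    in a chosen state -- a sketch of the 14-day prediction step."""
    rng = random.Random(seed)
    state, out = start_state, []
    for _ in range(n):
        out.append(rng.gauss(means[state], stds[state]))
        u, acc = rng.random(), 0.0
        for nxt, prob in enumerate(trans[state]):   # draw the next hidden state
            acc += prob
            if u <= acc:
                state = nxt
                break
    return out

# invented toy parameters standing in for a fitted gaussianhmm:
# a "low daily deaths" regime and a "high daily deaths" regime
trans = [[0.9, 0.1],
         [0.2, 0.8]]
means, stds = [1.0, 10.0], [0.5, 2.0]
preds = [generate_emissions(0, trans, means, stds, seed=k) for k in range(1000)]
```

repeating the sampling many times, as above, is what turns each cluster's hmm into an empirical probability distribution over 14-day trajectories.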
an idea of attempting some time-series comparison and some sort of series stretching/scaling was developed, but was never fully formed. there is definite room for improvement and optimization in this model. one fundamental issue is that perhaps the small number of states of the gaussianhmm limits the complexity of each hmm per cluster. while this is true, this level of simplicity mixed with stochasticity could capture the random element seen in the reported deaths data. if this random element were able to be removed or smoothed away, however, perhaps this model would no longer be so useful. when attempting to use statistical regression to model nonlinear correlations in data, a common approach is to employ a bayesian non-parametric strategy, such as the gaussian process. bayesian non-parametric strategies like the gaussian process are essentially extensions of bayesian inference on an infinite-dimensional parameter space. very loosely, this allows us to model the data as the combination of many different gaussians (each quite accurate in its local region), stitched together to create a single model [12] . this technique was attempted in the latter stages of the course to create predictions for counties where the predictions from the pecaiqr did not converge, an issue at the time for counties with poor data. we used a custom implementation of gaussian processes, using a mean function of 0 and a squared exponential kernel. the data used was the rolling average of deaths over a three day window vs time. in this model, there are three parameters: l, the length parameter of the kernel; σ_f, the vertical variation parameter of the kernel; and σ_y, the noise parameter. for each county, the optimal hyperparameters were found by searching for the set of parameters within a given range that minimized the error.
here, the bounds on the parameters were 5.0 ≤ l ≤ 15.0, 0.1 ≤ σ_f ≤ m/500.0, and 0.1 ≤ σ_y ≤ m/10.0, where m was the maximum value of the rolling three day average of deaths over the past 14 days. the error function was the root mean squared error (rmse) over the last 30 days. these bounds and the error function were determined with validation procedures. the gaussian process seems to fit quite accurately in the short term, but the predictions quickly drop to zero, as demonstrated in the plots above. unlike the pecaiqr model, it does not retain a heavy tail. therefore, the gaussian process does not have long-term predictive power. we can possibly improve the gaussian process and overcome this issue by setting a custom mean instead of using the default zero mean, which likely contributes to the rapid decay of the tail. we did attempt to do so, inspired by the pymc3 tutorial posted in the cs156b piazza [13] . pymc3 was attempted since a custom mean function could be used [14] ; similarities were noted between the graph of the number of deaths vs time in new york county and the weibull distribution, so the weibull distribution was chosen as this custom mean. however, it was not pursued further since it could not successfully run: when training the model on just one particular county, the program timed out and crashed jupyter on the computer it was run on. since the final deadline was already quite close, this attempt was abandoned. given how this model can fit to a custom mean, if it had been attempted earlier it might have proved to be useful. we verify that the issue of a rapidly decaying tail is not specific to county 36061 as shown above. we can conclude that the gaussian process fits very well on the training data, but fails to present long-term predictive power. again, these failures in the gaussian process model may be overcome by setting a custom mean. but we believe that the epidemiological model, though less accurate, has intrinsic advantages that the gaussian process cannot match.
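assuming the zero mean and squared exponential kernel described above, the posterior mean of such a gp can be sketched in a few lines, and the sketch reproduces the tail problem: far from the training window, the prediction reverts to the zero prior mean. the toy data and the hyperparameter values (l = 1.5, σ_f = 1, σ_y = 0.05) are ours, not the validated county hyperparameters:

```python
import math

def sq_exp(x1, x2, l, sf):
    """squared exponential kernel k(x, x') = sf^2 * exp(-(x - x')^2 / (2 l^2))."""
    return sf * sf * math.exp(-((x1 - x2) ** 2) / (2.0 * l * l))

def solve(A, b):
    """solve A x = b by gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gp_mean(xs, ys, xstar, l=1.5, sf=1.0, sy=0.05):
    """zero-mean gp posterior mean: m(x*) = k(x*, X) (K + sy^2 I)^-1 y."""
    K = [[sq_exp(xi, xj, l, sf) + (sy * sy if i == j else 0.0)
          for j, xj in enumerate(xs)] for i, xi in enumerate(xs)]
    alpha = solve(K, ys)
    return sum(sq_exp(xstar, xi, l, sf) * a for xi, a in zip(xs, alpha))

xs = list(range(10))
ys = [math.exp(-((x - 5.0) ** 2) / 4.0) for x in xs]  # a death-curve-like bump
near = gp_mean(xs, ys, 5.0)    # inside the data: accurate
far = gp_mean(xs, ys, 50.0)    # far from the data: reverts to the zero mean
```

this is exactly the behaviour discussed above: good short-term fit, but the kernel weights vanish away from the data, so the forecast collapses to the prior mean instead of keeping a heavy tail.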
this is because the epidemiological model has some sort of intuition in the form of the rules established by its differential equations, while the gaussian process is purely curve fitting. perhaps, given more time, we could have combined these two models to utilize both the short-term accuracy of the gaussian process and the long-term predictive power of the epidemiological model. analysis of model predictions for different cutoff dates in the training data shows that the model is quite stable and consistent when predicting on the region past the peak. however, predicting before the peak is much harder, as we are no longer operating with the assumption that the infection curves are dying down. in sub-figure d) we see that the peak daily deaths value predicted by the model is significantly less than the actual peak that is revealed with more data. however, the location of the peak is correct. we realized that the training data at this early cutoff is largely dominated by data in the regime before the first stay-at-home order, but we were training using an initial parameter guess that accounted for the effects of a stay-at-home order. so, in sub-figure e), we tried a different set of parameters for the initial parameter guess in the fitting procedure, inspired by the parameters obtained from a fit on similar early curves in italian regions. this modification yielded a more accurate prediction of the peak daily deaths value, but a less accurate placement of the peak. the prediction curve becomes extended when we use this alternative parameter set because the model no longer assumes that there will be a stay-at-home order, and therefore the curve will not flatten to the same degree. analysis of model predictions for different cutoff dates in the training data again shows that the model is quite stable and consistent when predicting on the region past the peak. one thing to note is that sub-figure a), trained on the most recent data, seems to have a much longer tail.
this is a result of the high level of noise in the more recent data points, which are not included in the other cutoffs. predicting before the peak is much harder, as we are no longer operating with the assumption that the infection curves are dying down. in sub-figure d), we see that the fit is quite different from the fits with later cutoffs. the data before this cutoff is quite noisy, so the model cannot accurately predict when the infection curves will begin to die off. to fix this, in sub-figure e) we fit the model on active cases as well. the model is able to use the active case statistic to determine that the infection curve dies down earlier than it otherwise would predict. here, we fit each of the three counties using the policy regime method described in section 3.1 fitting. the faint gray curve shows the predicted infection curve on the data regime before the date of the stay-at-home order, plus the time of death, and the dark gray curve shows the predicted infection curve on the data regime after the stay-at-home order. in counties 36059 and 27053, the stay-at-home order seems to have flattened the curve, but in county 36061 the effects are more ambiguous. this likely relates to factors specific to new york that worsened the outbreak, such as the high population density of new york city, and the shortage of hospital resources later on. the model can be easily adapted to train on international data. at the bare minimum, the model only requires death reporting and population. the predictions made from the may 4th cutoff, roughly a month before the june 3rd cutoff, show that the prediction fit is quite stable and consistent. figure 13 : the first column of figures shows daily deaths plotted together with currently hospitalized, daily hospitalized, and currently in icu, respectively, against time.
the second column shows the linear regression for daily deaths vs currently hospitalized, daily deaths vs daily hospitalized, and daily deaths vs currently on ventilator, respectively. to smooth out the data, we used a moving average with a window of 7 data points. we then aligned the peaks of each hospital statistic to match the peak of the daily deaths, in order to account for the offset term that corresponds to the average time between hospital admittance and death. the first column shows the hospital statistics before they were aligned with the death statistic, and there is clearly a shift, although only by a few days. this makes sense, as patients who are admitted to the hospital are likely patients who have already developed a severe condition. note that the hospital statistics and the death statistics are both gaussian. this suggests that we can find some scaling factor once we align their peaks. for this, we perform a linear regression against the death statistic, revealing a definitive linear correlation for all the statistics. note that there is a slight nonlinearity in the regression between currently hospitalized and daily deaths as the daily deaths increase beyond a certain point. this may indicate that we are nearing the hospital capacity, and so the change in the number of currently hospitalized patients begins to lag behind the change in daily deaths. also note that there is an extreme outlier in the regression between currently on ventilator and daily deaths at the peak daily deaths value. similarly, this is also likely caused by some limit on the number of ventilators available. indeed, there seems to be a ceiling to the number of people on ventilators past a certain number of daily deaths. using the linear regression fit, we can now directly convert any of our model predictions, as well as their confidence intervals, to predictions for hospital resources. we noticed that the vast majority of counties have inconsistent reporting of deaths.
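the align-then-regress step can be sketched as follows. the series below are synthetic (the real inputs are 7-day moving averages of county statistics), and the peak alignment here is the simple argmax shift; it illustrates how, once the lag is removed, a single linear fit converts a hospital statistic into a deaths prediction:

```python
def align_and_fit(deaths, hosp):
    """shift the hospital series so its peak lines up with the deaths peak,
    then fit a least-squares line deaths ~ a * hosp + b."""
    lag = deaths.index(max(deaths)) - hosp.index(max(hosp))
    pairs = [(hosp[i - lag], deaths[i])
             for i in range(len(deaths)) if 0 <= i - lag < len(hosp)]
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    a = sum((x - mx) * (y - my) for x, y in pairs) / sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return lag, a, b

# synthetic example: deaths are 1/10 of hospitalizations, offset by 3 days
hosp = [0.0, 1.0, 4.0, 9.0, 4.0, 1.0, 0.0]
deaths = [0.0, 0.0, 0.0] + [0.1 * h for h in hosp]
lag, a, b = align_and_fit(deaths, hosp)
```

inverting the fitted line (hosp ≈ (deaths − b) / a, shifted by the lag) is what lets a deaths forecast, and its confidence interval, be translated into a hospital-resource forecast.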
some counties even had cumulative death statistics that decreased on certain dates. clearly this is not possible, as death is permanent. other counties report constant values for cumulative deaths (zero values for daily deaths) for an extended period of time, followed by a quick spike. it is doubtful that all deaths suddenly occur in a single day, so this observation suggests that the deaths reporting might not be distributed correctly, perhaps due to administrative lag. for counties with low numbers of deaths, this creates large amounts of variance in the data, which makes it hard to fit. the deaths reporting also seems to be correlated with the day of the week. for many counties there is an interesting trend that the daily deaths increase over consecutive days before the weekend. again, this might be due to administrative lag in the hospitals that report deaths. the active cases statistics also seem to be poorly recorded, and this makes sense given how ambiguous the classification of an active case can be, especially with the limitations on testing. many counties completely lack useful active case statistics, and in other cases, they are only available very late into the infection curve. in general, statistics that attempt to quantify the number of cases, whether active, cumulative, or daily, are intrinsically unreliable, as they depend on the availability of tests, which can vary greatly over time. perhaps a more reliable statistic is the ratio of positive tests to administered tests on any given day. another issue was with the non-time-series data. some files, such as the age/race file from [15] and the aggregate jhu file from [16] , had valuable information but also numerous holes. while this data was not used in creating the predictions, it was used heavily in section 11 (creative data visualizations), which details an attempt at clustering based on categories in these datasets. such holes proved difficult to fill in.
the filling in was done by using the data from the nearest county whenever a given entry was missing for a certain county. however, this could be ineffective if the county and its nearest neighbor are very different in nature, as the borrowed data would then not be particularly representative. while the data holes were not particularly impactful for the models described above, they certainly could be for models that took many non-time-series features from those datasets into consideration. references: a contribution to the mathematical theory of epidemics; unique epidemiological and clinical features of the emerging 2019 novel coronavirus pneumonia (covid-19) implicate special control measures; positive rt-pcr test results in patients recovered from covid-19; american community survey (acs) 2018, data table acsdp1y2018.dp05. key: cord-195263-i4wyhque authors: heider, philipp title: covid-19 mitigation strategies and overview on results from relevant studies in europe date: 2020-05-11 journal: nan doi: nan sha: doc_id: 195263 cord_uid: i4wyhque in december 2019, the first patients in wuhan, china were diagnosed with a primary atypical pneumonia, which showed to be unknown and contagious. since then, known as covid-19 disease, the responsible viral pathogen, sars-cov-2, has spread around the world in a pandemic. decisions on how to deal with the crisis are often based on simulations of the pandemic spread of the virus. the results of some of these, as well as their methodology and possibilities for improvement, will be described in more detail in this paper in order to inform beyond the current public health dogma called "flatten-the-curve". there are several ways to model an epidemic in order to simulate the spread of diseases.
depending on the timeliness, scope and quality of the associated real data, these multivariable models differ in the values of the parameters used, but also in the selection of the influencing factors considered. it was exemplarily shown that epidemics in their course are simulated more realistically by models that assume subexponential growth. furthermore, various simulations of the covid-19 pandemic were presented from a european perspective, compared against each other and discussed in more detail. it is difficult to estimate how credible the simulations of the pandemic models currently are, so it remains to be seen whether the spread of the pandemic can be effectively reduced by the measures taken. whether a model works well in reality is largely determined by the quality and scope of its underlying data. past studies have shown that countermeasures are able to reduce reproduction numbers or transmission rates in epidemics.
in addition to that, the presented modelling study provides a good framework for the creation of subexponential-growth models for assessing the spread of covid-19. as of 30.03.2020, the sars-cov-2 virus had arrived almost everywhere. nevertheless, there is disagreement about the measures to be taken against covid-19 around the world, while the news report frightening incidents. why is it so difficult for countries to find a common consensus on how to tackle the pandemic? virus pandemics spread across national borders with a time lag. therefore, while some countries are at an early stage of the crisis, as evidenced by the low number of new infections confirmed daily, other populations in the world may already be much more seriously affected. china and south korea, for example, are already planning to return to normal life (1, 2) , but elsewhere the situation is becoming increasingly dramatic. these differences result in an inhomogeneous data situation; in addition, the different fundamental conditions of countries make it difficult to compare their situations. creating simulation models that allow a (realistic) prediction of the pandemic virus spread is a difficult task. the basic theories on which such models are based are as old as they are complex. thus, with this thesis i would like to contribute to the understanding of these and to present results from previous publications relevant to this topic. scientists are not only making a special contribution to the general public by informing them about the dangers and the ways to protect ourselves from them, but they also support politicians all over the world. at the moment, it is becoming clear how valuable scientific skills and experience are for assessing the situation and making quick decisions. a concept to mitigate the effects of the virus pandemic is being widely covered by the media: the current public health dogma "flatten-the-curve".
this concept can be explained without further elaboration: the number of seriously ill persons should be kept low so that the national health system does not collapse due to a lack of beds in the intensive care units of the hospitals (icus). the outbreak of the disease in society is to be "slowed down", and the mortality rate is to be kept low. the concept "flatten-the-curve" is based on the frightening result obtained by comparing the results of statistical, epidemiological simulations of the pandemic with the number of intensive care beds available in the country. many of them conclude that without effective countermeasures, a collapse of the health system is to be expected. a model frequently used for simulating directly transmitted infectious diseases is the so-called "sir" model. this model goes back to robert ross, the 2nd nobel prize winner for medicine/physiology (awarded for his findings on malaria), as well as to the researchers hilda hudson and kermack/mckendrick. the sir model assumes that during an epidemic, every person goes through the three states s: "susceptible", i: "infected" and r: "recovered". since the number of people in each stage depends on the dynamics of the epidemic and thus on the time during the epidemic, the sir model can be described mathematically with differential equations (3), which will not be discussed in detail in this work. it is often assumed that every epidemic initially spreads exponentially. due to the initially small number of cases detected, the spread normally goes unnoticed. however, experience has shown that epidemics slow down in the course of their spread and no longer show exponential growth (4). this observation will be discussed in more detail in the following study. for its understanding, however, further foundations are necessary. an important indicator used to assess the risk of epidemic spread is the effective reproduction number rt.
this indicates the average number of secondary infections per primary infection, i.e. how many further people are infected by an infected person at a given time. if the value rt falls below 1, the epidemic decreases; above 1, the epidemic picks up speed. another key figure, which should not be confused with the one just mentioned, is the so-called "basic reproduction number" r_0. it indicates how quickly an epidemic can spread at the beginning, when all people in a population are susceptible to infection. to determine this important parameter, it is important to know the underlying "agent", its transmission and pathogenicity, precisely and as generally as possible. trying to find a value of r_0 for the covid-19 pandemic online, one encounters different values, which vary greatly from 1.4 to 3.9 (5) (6) (7) (8) (9) . to calculate the effective reproduction number, the basic reproduction number must already be known. to understand why the latter varies so much, one must understand what influences it. in the classical sir model, the reproduction number at the beginning of a pandemic, i.e. at time t = 0, is calculated as r_{t=0} = r_0 = β_0 / γ, where β_0 is the initial transmission rate and γ the rate at which infected persons recover. r is a ratio of the new-infection rate to the recovery rate and is therefore dimensionless. the transmission rate β is generally calculated as the product of the contact rate c and the probability p with which contact between infected and susceptible persons leads to virus transmission, but in more complex models other influencing factors can be taken into account. the determination of both reproduction numbers is therefore always based on certain assumptions, which are mathematically described in a model that can be based on different types of growth. this can be exponential (as most commonly used), or other growth processes, e.g. logistic or linear. these growth processes can be tested and compared within the presented sir model.
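as a sketch, the relations above (β = c·p, r_0 = β/γ, and the effective r_t = r_0·s/n in the classical sir model) can be put into a simple forward-euler simulation; all parameter values here are illustrative, not estimates for covid-19:

```python
def simulate_sir(beta, gamma, n=1_000_000, i0=10, days=300, dt=0.1):
    """forward-euler integration of the classic sir model.
    returns daily s, i and the effective reproduction number r_t = r0 * s / n."""
    s, i, r = n - i0, float(i0), 0.0
    r0 = beta / gamma
    S, I, Rt = [], [], []
    for _ in range(days):
        for _ in range(int(round(1 / dt))):
            new_inf = beta * s * i / n * dt   # beta = contact rate c * probability p
            new_rec = gamma * i * dt
            s -= new_inf
            i += new_inf - new_rec
            r += new_rec
        S.append(s)
        I.append(i)
        Rt.append(r0 * s / n)
    return S, I, Rt

c, p, gamma = 10.0, 0.025, 0.1      # beta = c * p = 0.25, so r0 = 2.5
S, I, Rt = simulate_sir(c * p, gamma)
peak_day = I.index(max(I))          # the peak falls where r_t crosses 1
```

the run makes the r_t < 1 threshold concrete: the number of infected grows while r_t > 1, peaks almost exactly when r_t crosses 1, and the epidemic ends with part of the population never infected.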
in a publication by chowell et al (4), published in october 2016, for example, the dynamics of epidemics were examined using various basic concepts. advanced sir models were used to determine how the predicted number of cases and the reproduction number change over time, and how they differ when exponential growth is assumed on the one hand and subexponential/polynomial growth on the other in a generalized growth model. in the course of this, three models were created and mathematically formulated, which consider different aspects: model 1 takes into account a certain degree of clustering within society, which influences the way epidemics spread. taking this into account, model 1 postulates that all people live in a total of c̄ households of a certain size h, and each household is part of a, to a certain degree, networked community. in the subsequent mathematical simulation of the model, it could be seen that a larger number of people in the community leads to more infected people, which is logical. it was also seen, however, that under the assumption of subexponential/polynomial growth, the effective reproduction number decreases as the epidemic progresses. model 2 considers a reactive change in human behaviour, with a resulting time-dependent change in the transmission rate. this is achieved by introducing a parameter q, which influences the transmission rate β. the results were similar: behavioural changes in society lead to a decrease of the reproduction number towards 1. model 3 assumes that the population is unevenly, i.e. inhomogeneously, mixed. for this purpose a parameter α is introduced in the mathematical formulas. the smaller it is (α < 1, α → 0), the more it reduces the simulated epidemic spread. as suspected, this model also behaves like the ones mentioned above: higher inhomogeneity of society lowers r more drastically over time. subsequently, all approaches were combined into one concept and the results were compared with real, historical data.
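the exponential-versus-subexponential contrast can be sketched with the generalized growth model used in this line of work, dC/dt = r·C^p, where p = 1 gives exponential growth and p < 1 subexponential (polynomial-like) growth; the parameter values below are ours, chosen only for illustration:

```python
def ggm_cumulative(r, p, c0=1.0, days=60, dt=0.01):
    """generalized growth model dC/dt = r * C**p, integrated with forward euler:
    p = 1 is exponential growth, p < 1 subexponential growth."""
    c = c0
    daily = []
    for _ in range(days):
        for _ in range(int(round(1 / dt))):
            c += r * c ** p * dt
        daily.append(c)
    return daily

exp_growth = ggm_cumulative(r=0.15, p=1.0)   # illustrative growth rate
sub_growth = ggm_cumulative(r=0.15, p=0.7)
```

after 60 days the exponential run has reached thousands of cumulative cases while the subexponential run stays below a hundred, which is why a model that wrongly assumes p = 1 will systematically overestimate both the case count and the effective reproduction number late in an epidemic.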
the following conclusion could be drawn: models that assume exponential growth estimate the effective reproduction number to be higher than actually proven. in comparison, models such as the one presented, which do not assume exponential growth, calculate the case number and effective reproduction number r_t in the advanced course of an epidemic more realistically (see figure 3 , figure 4 ). comparing the results with the situation at the end of march 2020, shown in figure 5 , it can be seen that at this point in time, the curve for new cases in some countries begins to deviate from exponential growth, which would appear as a straight diagonal line. assuming that many previous simulations have only considered exponential growth, this is a positive signal. it could mean that the effective reproduction number, as mentioned in the study, is actually lower than previously assumed. for the sake of completeness, reference should be made to the model known as "seir". here, the incubation time is also taken into account as another possible state ("exposed"). if the incubation time is longer, this reduces the number of patients at the peak of the epidemic, but extends the period of high exposure for the clinics, since "herd immunity" is reached later. -in a study published in 2012 by the bbk (german federal office for civil protection and disaster assistance), the results of the modelling of a hypothetical sars epidemic are found. the effect on the total population (80 million) is simulated. in the study it is assumed that without measures, one infected person will infect an average of three others (initial reproduction number: r_0 = 3). with effective intervention (curfews, closure of universities/schools) this rate is reduced to 1.6. countermeasures are assumed from day 48 to 408 after the first infected person was detected, and the rate of hospitalized patients requiring intensive care is assumed to be between 20 and 30 %.
furthermore, it is determined that the mortality rate at the age of 60 years is 50 % and that a mild course of the disease occurs in 5 % of cases. the simulation comes to the conclusion that at the peak of the epidemic about 1.1 million patients need intensive care. by comparison, an estimated 23890 intensive care beds are available under normal conditions in germany (10). compared with other countries such as italy or great britain, this is a relatively high value, but according to the study results, the health care system would still collapse under the burden. in this model, the incubation period, during which an infected person is not yet infectious, is not taken into account. for covid-19 this is estimated to be about 5.2 days (11). an essential factor that makes prognostic simulations more credible is knowing how well countermeasures against the spread work. however, since there are few general, evidence-based publications and no databases on this subject, assumptions must be made. the study presented by the german federal office bbk estimates the reduction of r_0 through school closures, the prohibition of events and home quarantine at -46.6 %. for comparison: reports (non-peer-reviewed) from china even estimate a reduction of as much as -91.71 % due to the very drastic measures against covid-19 in wuhan (12, 13) . calculations from spain claim that the transmission of influenza can be reduced by a maximum of 20 % by closing schools without a curfew (14) . this contrasts with observations from russia, which showed that school closure during the flu season reduces the daily contact rate (≠ reproduction number r_0) of individuals by 53 % for students and 19 % for workers (15) . only the results of these studies should be mentioned here; they are not directly related to the current or future situation. however, it can be seen from the data that the measures taken appear to be effective against the spread of infectious diseases.
-the fact that science is currently exerting enormous influence on political decisions is shown by a recent publication by imperial college london, which caused the governments of great britain and the usa to rethink their course against the virus (16) . the author and one of the directors of the renowned university, neil ferguson, is probably the english equivalent of the german doctors drosten and kekulé, to whom politics in this country is currently listening. the icl publication paints a gloomy picture of what lies ahead for the uk. without going into the economic consequences, it is based on the so-called "npi" measures (non-pharmaceutical interventions) taken in the usa in 1918 to contain the spanish flu. the effectiveness of such "npis" is tested in the study using a statistical model which assumes exponential epidemic growth for the uk. the results will be presented in more detail below. the model assumes a reproduction number of r_0 = 2.4, a 0.9 % general death rate of tested and untested infected persons (infection fatality rate, short: ifr), a mean hospitalization rate of 4.4 % among infected persons and an incubation period of 5.1 days. furthermore, it is assumed that symptomatic patients are 50 % more infectious than asymptomatic patients, while a distribution parameter is additionally introduced to take into account "individual infectivity". as shown in the excerpt below, the authors make the proportion of hospitalized infected persons dependent on the age of the person. while ferguson used chinese data in a preprint paper from the beginning of march to calculate hospitalisation rates as shown in table 1 (17) , the figures in the study now published look somewhat more threatening, since they also take into account the situation in italy (see table 2 ). online simulation tools have also been published, one of which was developed by a research group led by richard neher at the university of basel.
with this tool, (seir) simulations can be easily visualized in a browser. by default, the hospitalisation rate (severe cases) is given there as follows, but it can be adjusted individually. the underlying model uses data on the health care and demography of a given country. however, parameters are not (yet) adapted to current knowledge (as of 29.03.2020, (10)). in order to adjust the simulation in the appendix to more recent data, we have therefore changed the parameters for the proportion of severe cases according to the icl paper, as can be seen in figure 6 . in summary, the third assessment of the online tool shown is probably the most pessimistic, as it is so far based only on data from countries particularly affected by the crisis. it has been shown that a collapse of the health care system quickly pushes up the death rate. the icl, on the other hand, predicts a lower probability of hospitalisation and severe courses of disease in older patients (80+). it should be noted that the two important parameters, r_0 and ifr, are currently undergoing continuous correction and also vary widely between different online simulations. in contrast to many other simulations, the oxford centre for evidence based medicine, for example, estimates the death rate at a significantly lower level and has also corrected the prognosis downwards based on the figures from germany (ifr = 0.4, as of march 29, 2020) (19) . the scientists first compared the effects of various very realistically defined measures, among them: • doing nothing • isolating cases of disease • isolating cases of disease and house quarantine • closure of schools and universities • isolating cases of disease, domestic quarantine and social distancing for those aged 70+. they concluded that, although a combination of measures is comparatively more effective, the effectiveness of all the measures described is not sufficient to reduce the transmission rate r_0 sufficiently and prevent the system from collapsing.
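the seir dynamics such a tool integrates can be sketched with forward euler. the effect of the "exposed" state described earlier (a longer incubation period lowers and delays the peak) falls out of the sketch directly; the rates below are illustrative, with only the 5.2-day incubation estimate taken from the text:

```python
def simulate_seir(beta, sigma, gamma, n=1_000_000, i0=10, days=500, dt=0.1):
    """forward-euler seir model; sigma = 1 / incubation period.
    returns the daily number of infectious individuals."""
    s, e, i, r = n - i0, 0.0, float(i0), 0.0
    I = []
    for _ in range(days):
        for _ in range(int(round(1 / dt))):
            inf = beta * s * i / n * dt   # s -> e (new infections)
            act = sigma * e * dt          # e -> i (end of incubation)
            rec = gamma * i * dt          # i -> r (recovery)
            s -= inf
            e += inf - act
            i += act - rec
            r += rec
        I.append(i)
    return I

# same r0 = beta / gamma = 2.5 in both runs; only the incubation period differs
short_inc = simulate_seir(0.5, 1 / 3.0, 0.2)   # ~3-day incubation
long_inc = simulate_seir(0.5, 1 / 5.2, 0.2)    # ~5.2-day incubation
```

with identical r_0, the longer incubation stretches the same total epidemic over more time, so the peak of infectious individuals is both later and lower, which is exactly the trade-off for the clinics noted above.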
this picture is consistent with the estimates and calculations of the studies mentioned at the beginning (14, 15). this procedure is commonly referred to as a "mitigation" strategy. the simulation for great britain showed that icu capacities will not be sufficient despite the measures presented (source: (18)). the researchers see the only way to bring the transmission rate r0 from the specified original value of 2.4 to near or below 1 in a combination of drastic measures, whereby a complete "lockdown", as is currently the case in parts of austria, is considered a (temporarily) safe option. the two remaining scenarios therefore envisage the following countermeasures:
• isolating cases of disease, domestic quarantine and general social distancing, or
• isolating cases of disease, domestic quarantine, closure of schools and universities, and general social distancing
however, the results in figure 8 also show that drastic measures for a limited period of five months only postpone the pandemic to a later point in time, into winter, and do not mitigate it. successful herd immunisation with a vaccine therefore remains necessary in this case. since this strategy, known as "suppression", is ultimately not an option, the researchers propose a special protocol instead:
• permanent case isolation with subsequent quarantine up to a limit of 50 covid-19 intensive care patients per week and school closures, as well as general social distancing from a limit of 200 intensive care patients per week
the researchers show in their simulation that it would be possible to keep intensive care units in normal operation in this way. it should be mentioned in this sense that the duration of the prevention measures depends to a large extent on how well one manages to reduce r0.
assuming that the current reproduction number is 2.0 and that the alternating measures would be sufficient to reduce the value permanently, the study calculates that the proposed protocol would have to be adhered to in great britain for just under 16 months. there are already differing opinions as to the feasibility of this strategy. whether a model works well in reality is largely determined by the quality and scope of its underlying data. although the simulation models available today are already very comprehensively designed (some of them even include global traffic flows and demographics in their datasets (10, 20-22)), an unmanageable number of influencing factors remain, which cannot all be taken into account mathematically. depending on which model is used to determine the reproduction number, it will be lower or higher. if the growth curve begins to flatten out, it is probably worth adjusting the models and taking subexponential growth into account by assuming growth processes other than exponential growth, i.e. logistic growth or generalized growth, in order to realistically estimate rt. the presented study (4) provides a good framework for such models to build on. whether, and which, measures are sufficient for a permanent reduction of the reproduction number has not yet been confirmed. it is a fact, however, that a relaxation of restrictions or non-compliance with rules increases the transmission rate and can, in the worst case, quickly lead to a renewed rise in the curve. detection of such events through monitoring, and rapid follow-up when they occur, offers potential to prevent this. in china, south korea and singapore, the spread of the virus seems to be well under control. under the title "response to covid-19 in taiwan: big data analytics, new technology, and proactive testing", an interesting report on how data can be used to counteract the spread of the virus can already be found (23).
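the point about estimating rt under different growth assumptions can be illustrated directly: under logistic growth the locally estimated per-day growth rate declines as the curve flattens, whereas an exponential model holds it constant. all parameter values in the sketch are invented for illustration.

```python
import math

# under logistic growth the effective per-day growth rate (and with it the
# estimate of rt) declines as the curve flattens, whereas an exponential
# model holds it constant; k, the capacity and c0 are invented values.

def logistic(t, k=0.2, cap=100_000.0, c0=100.0):
    a = (cap - c0) / c0
    return cap / (1.0 + a * math.exp(-k * t))

def daily_growth_rate(curve, t):
    # local exponential growth rate inferred from two adjacent days
    return math.log(curve(t + 1) / curve(t))

early = daily_growth_rate(logistic, 5)    # near-exponential phase
late = daily_growth_rate(logistic, 60)    # flattening phase
print(f"growth rate: {early:.3f}/day early, {late:.4f}/day late")
```

an exponential fit made early in the outbreak would project the early rate indefinitely and so overestimate both the future case counts and rt once flattening begins.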
if the presumed high number of asymptomatic infections is confirmed and the number of fatal infections remains unchanged, this could reduce the ifr (infection fatality rate) and thus also lower the simulation curves shown. among other factors, this may be one reason why the ifr is lower in countries that relied on rigorous testing of many citizens for covid-19 early on. large-scale immunological antibody testing will provide evidence of the current status of herd immunity in populations. it is difficult to estimate how credible the simulations of the pandemic models currently are, so it remains to be seen whether the spread of the pandemic can be effectively reduced by the measures taken. as the presented studies from the past have shown, countermeasures are basically able to reduce reproduction numbers or transmission rates in epidemics. the principle that biotechnologists are so keen to use in the production of modern drugs when working with transgenic organisms is unfortunately playing against us all this time: exponential growth. this must be prevented. enclosed is the result of the simulation for germany for the period until the end of april 2021, which was created with the online tool provided by the neher research group from basel (10). this is based on the assumption that the development of a vaccine takes 12-13 months and is then quickly available. according to this, the capacity of the intensive care units will not be exceeded by the number of serious cases until then (end of april 2021), or the overflow will be delayed. in the beginning, however, very tough measures will be necessary. it is assumed that hard measures such as social distancing will reduce the reproduction number by 75 %, from the initially assumed 2.4 to 0.6. it is also assumed that it will increase again over time due to population dynamics setting in after longer curfews.
after four months with existing restrictions, measures will be eased and the transmission rate will continue to be held at -50 % until the end of april 2021. whether this assessment of the reproductive rate, the ifr, and the effectiveness of countermeasures is realistic will become apparent in the near future, when more data on the course and immunity in germany and neighbouring countries become available. the hospitalisation rate of patients (severe cases), as well as the proportion of persons requiring intensive care (critical cases), was adapted according to the icl paper from the uk (18). the result of the simulation can be seen in figure 9.

list of abbreviations:
• reproductive number
• general reproductive number
• effective reproductive number
• ifr: infection-fatality-rate
• icl: imperial college london
• bkk: bundesamt für bevölkerungsschutz und katastrophenhilfe
• icu: intensive-care-unit
• ggm: generalized-growth-model (subexponential)
• exp: exponential

keywords: sars-cov-2, modeling, epidemiology, public health, forecasting, epidemic spread

references:
• as coronavirus infections slow, south korea plans for life after social distancing. the wall street journal
• coronavirus: china's risky plan to revive the economy. financial times
• contributions to the mathematical theory of epidemics: iv. analysis of experimental epidemics of the virus disease mouse ectromelia
• characterizing the reproduction number of epidemics with early subexponential growth dynamics
• estimating clinical severity of covid-19 from the transmission dynamics in wuhan, china
• pattern of early human-to-human transmission of wuhan
• time-varying transmission dynamics of novel coronavirus pneumonia in china
• novel coronavirus 2019-ncov: early estimation of epidemiological parameters and epidemic predictions
• early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia
• university of basel: covid-19 scenarios
• the incubation period of coronavirus disease 2019 (covid-19) from publicly reported confirmed cases: estimation and application
• evolving epidemiology and impact of non-pharmaceutical interventions on the outbreak of coronavirus disease
• the impact of transmission control measures during the first 50 days of the covid-19 epidemic in china
• estimating the impact of school closure on influenza transmission from sentinel data
• reactive school closure weakens the network of social interactions and reduces the spread of influenza
• behind the virus report that jarred the u.s. and the u.k. to action. the new york times
• global covid-19 case fatality rates [internet]: oxford centre for evidence-based medicine (cebm)
• the effect of travel restrictions on the spread of the 2019 novel coronavirus
• the gleamviz computational tool, a publicly available software to explore realistic epidemic spreading scenarios at the global scale
• response to covid-19 in taiwan: big data analytics, new technology, and proactive testing

not applicable. ethics approval and consent to participate: not applicable. not applicable. all data generated or analysed during this study are included in this published article [and its supplementary information files]. the authors declare that they have no competing interests. the author did not receive any funding in connection with the creation of this work.

key: cord-176677-exej3zwh authors: coveney, peter v.; highfield, roger r. title: when we can trust computers (and when we can't) date: 2020-07-08 journal: nan doi: nan sha: doc_id: 176677 cord_uid: exej3zwh

with the relentless rise of computer power, there is a widespread expectation that computers can solve the most pressing problems of science, and even more besides. we explore the limits of computational modelling and conclude that, in the domains of science and engineering that are relatively simple and firmly grounded in theory, these methods are indeed powerful.
even so, the availability of code, data and documentation, along with a range of techniques for validation, verification and uncertainty quantification, is essential for building trust in computer-generated findings. when it comes to complex systems in domains of science that are less firmly grounded in theory, notably biology and medicine, to say nothing of the social sciences and humanities, computers can create the illusion of objectivity, not least because the rise of big data and machine learning poses new challenges to reproducibility, while lacking true explanatory power. we also discuss important aspects of the natural world which cannot be solved by digital means. in the long term, renewed emphasis on analogue methods will be necessary to temper the excessive faith currently placed in digital computation. the extent to which reproducibility is an issue for computer modelling is, however, more profound and convoluted, depending on the domain of interest, the complexity of the system, the power of available theory, the customs and practices of different scientific communities, and many practical considerations, such as when commercial considerations are challenged by scientific findings. 5 for research on microscopic and relatively simple systems, such as those found in physics and chemistry, theory, both classical and quantum mechanical, offers a powerful way to curate the design of experiments and weigh up the validity of results. in these and other domains of science that are grounded firmly on theory, computational methods more easily help to confer apparent objectivity, with the obvious exceptions of pathological science 6 and fraud 7.
for the very reason that the underlying theory is established and trusted in these fields, there is perhaps less emphasis than there should be on verification and validation ("solving the equations right" and "solving the right equations", respectively 8), along with uncertainty quantification, collectively known by the acronym vvuq. by comparison, in macroscopic systems of interest to engineers, applied mathematicians, computational scientists, technologists and others who have to design devices and systems that actually work, and which must not put people's lives in jeopardy, vvuq is a way of life, in every sense, to ensure that simulations are credible. this vvuq philosophy underpins advances in computer hardware and algorithms that improve our ability to model complex processes using techniques such as finite element analysis and computational fluid dynamics, for end-to-end simulations in virtual prototyping and to create digital twins. 9 there is a virtuous circle in vvuq, where experimental data hone simulations, while simulations hone experiments and data interpretation. in this way, the ability to simulate an experiment influences validation by experiment. in other domains, however, notably biology and the biomedical sciences, theories have rarely attained the power and generality of physics. the state space of biological systems tends to be so vast that detailed predictions are often elusive, and vvuq is less well established, though that is now changing rapidly as, for example, models and simulations begin to find clinical use 10. despite the often-stated importance of reproducibility, researchers still find various ways to unwittingly fool themselves and their peers 11.
data dredging (also known as blind big data, data fishing, data snooping, and p-hacking) seeks results that can be presented as statistically significant, without any knowledge of the structural characteristics of the problem, let alone first devising a hypothesis about the underlying mechanistic relationships. while corroboration or indirect supporting evidence may be reassuring, when taken too far it can lead to the interpretation of random patterns as evidence of correlations, and to conflation of these correlations with causative effects. spurred on by the current reward and recognition systems of academia, it is easier and very tempting to quickly publish one-off findings which appear transformative, rather than invest additional money, energy and time to ensure that these one-off findings are reproducible. as a consequence, a significant number of 'discoveries' turn out to be unreliable because they are more likely to depend on small populations, weak statistics and flawed analysis [12-16]. there is also a temptation to carry out post hoc rationalisation or harking ('hypothesizing after the results are known'), and to invest more effort into explaining away unexpected findings than validating expected results. most contemporary research depends heavily on computers, which generate numbers with great facility. ultimately, though, computers are themselves tools that are designed and used by people. because human beings have a capacity for self-deception 17, the datasets and algorithms that they create can be subject to unconscious biases of various kinds, for example in the way data are collected and curated in data dredging activities, a lack of standardized data analysis workflows 18, or the selection of tools that generate promising results, even if their use is not appropriate in the circumstances.
no field of science is immune to these issues, but they are particularly challenging in domains where systems are complex and many-dimensional, weakly underpinned by theoretical understanding, and exhibit non-linearity, chaos and long-range correlations. with the rise of digital computing power, approaches predicated on big data, machine learning (ml) and artificial intelligence (ai) are frequently deemed to be indispensable. ml and ai are increasingly used to sift experimental and simulation data for otherwise hidden patterns that such methods may suggest are significant. reproducibility is particularly important here because these forms of data analysis play a disproportionate role in producing results and supporting conclusions. some even maintain that big data analyses can do away with the scientific method. 19 however, as data sets increase in size, the ratio of false to true correlations increases very rapidly, so one must be able to reliably distinguish false from true if one is to find robust correlations. that is difficult to do without a reliable theory underpinning the data being analysed. we, like others 20, argue that the faith placed in big data analyses is profoundly misguided: to be successful, big data methods must be more firmly grounded on the scientific method. 21 far from being a threat to the scientific method, the weaknesses of blind big data methods serve as a timely reminder that the scientific method remains the most powerful means we have to understand our world. in science, unlike politics, it does not matter how many people say or agree on something: if science is to be objective, it has to be reproducible ("within the error bars"). observations and "scientific facts and results" cannot depend on who is reporting them but must be universal.
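the scaling of false correlations with dataset size can be made concrete with a little arithmetic: the number of pairwise comparisons among recorded variables grows quadratically, and so does the expected count of spurious "significant" correlations at a fixed significance level. the numbers below are purely illustrative and assume independent noise variables.

```python
# with n recorded variables there are n(n-1)/2 pairwise comparisons, so at a
# fixed significance level alpha the expected number of spurious "significant"
# correlations in pure noise grows quadratically; the numbers are illustrative.

def expected_false_positives(n_variables, alpha=0.05):
    pairs = n_variables * (n_variables - 1) // 2
    return pairs, alpha * pairs

for n in (10, 100, 1000):
    pairs, fp = expected_false_positives(n)
    print(f"{n:>5} variables: {pairs:>7} pairs, ~{fp:,.0f} spurious hits expected")
```

at a thousand variables one already expects tens of thousands of purely spurious hits, which is why robust correlations cannot be identified without theory or stringent correction for multiple comparisons.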
consensus is the business of politics, and the scientific equivalent only comes after the slow accumulation of unambiguous pieces of empirical evidence (although most research and programmes are still funded on the basis of what the majority of people on a review panel think is right, so that scientists who have previously been successful are more likely to be awarded grants 22,23). there is some debate about the definition of reproducibility 24. some argue that replicability is more important than reproducibility. others maintain that the gold standard of research should be 're-testability', where the result is replicated rather than the experiment itself, though the degree to which the 'same result' can emerge from different setups, software and implementations is open to question. 25 by reproducibility we mean the repetition of the findings of an experiment or calculation, generally by others, providing independent confirmation and confidence that we understand what was done and how, thus ensuring that reliable ideas are able to propagate through the scientific community and become widely adopted. when it comes to computer modelling, reproducibility means that the original data and code can be analysed by any independent, sceptical investigator to reach the same conclusions. the status of all investigators is supposedly equal, and the same results should be obtained regardless of who is performing the study, within well-defined error bars; that is, reproducibility must be framed as a statistically robust criterion, because so many factors can change between one set of observations and another, no matter who performs the experiment. the uncertainties come in two forms: (i) "epistemic", or systematic, errors, which might be due to differences in measuring apparatus; and (ii) "aleatoric" errors, caused by random effects.
the latter typically arise in chaotic dynamical systems which manifest extreme sensitivity to initial conditions, and/or because of variations in conditions outside of the control of an experimentalist. by seeking to control uncertainty in terms of a margin of error, reproducibility means that an experiment or observation is robust enough to survive all manner of scientific analysis. note, of course, that reproducibility is a necessary but not a sufficient condition for an observation to be deemed scientific. in the scientific enterprise, a single result or measurement can never provide definitive resolution for or against a theory. unlike mathematics, which advances when a proof is published, it takes much more than a single finding to establish a novel scientific insight or idea. indeed, in the popperian view of science, there can be no final vindication of the validity of a scientific theory: they are all provisional, and may eventually be falsified. the extreme form of the modern machine-learners' pre-baconian view stands in stark opposition to this: there is no theory at all, only data, and success is measured by how well one's learning algorithm performs at discerning correlations within these data, even though many of these correlations will turn out to be false, random or meaningless. moreover, in recent years, the integrity of the scientific endeavour has been open to question because of issues around reproducibility, notably in the biological sciences. confidence in the reliability of clinical research has, for example, been under increasing scrutiny. 5 in 2005, john p. a. ioannidis wrote an influential article about biomedical research, entitled "why most published research findings are false", in which he assessed the positive predictive value of the truth of a research finding from values such as threshold of significance and power of the statistical test applied. 
he found that the more teams were involved in studying a given topic, the less likely the research findings from individual studies were to turn out to be true 26. this seemingly paradoxical corollary follows because of the scramble to replicate the most impressive "positive" results and the attraction of refuting claims made in a prestigious journal, so that early replications tend to be biased against the initial findings. this 'proteus phenomenon' has been observed as an early sequence of extreme, opposite results in retrospective hypothesis-generating molecular genetic research 27, although there is often a fine line to be drawn between contrarianism, wilful misrepresentation and the scepticism ('nullius in verba') that is the hallmark of good science. 28 such lack of reproducibility can be troubling. an investigation of 49 medical studies undertaken between 1990 and 2003, with more than 1000 citations in total, found that 16% were contradicted by subsequent studies, 16% found stronger effects than subsequent studies, 44% were replicated, and 24% remained largely unchallenged. 29 in psychological science, a large portion of independent experimental replications did not reproduce evidence supporting the original results, despite using high-powered designs and original materials when available. 30 even worse performance is found in cognitive neuroscience. 13 more widely, scientists are routinely confronted with issues of reproducibility: a may 2016 survey in nature of 1576 scientists reported that more than 70% had tried and failed to reproduce another scientist's experiments, and more than half had failed to reproduce their own experiments. 31 this lack of reproducibility can be devastating for the credibility of a field.
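the positive predictive value ioannidis used can be written down in a few lines; the formula follows his framework, while the parameter values below are illustrative assumptions rather than figures from his paper.

```python
# positive predictive value of a claimed research finding, in the framework of
# ioannidis (2005): alpha is the significance threshold, power = 1 - beta, and
# prior_odds is the ratio of true to false relationships probed in the field.
# the parameter values below are illustrative assumptions.

def ppv(alpha=0.05, power=0.8, prior_odds=0.1):
    return power * prior_odds / (power * prior_odds + alpha)

print(f"well-powered test, plausible hypotheses: {ppv():.2f}")
print(f"underpowered, exploratory search: {ppv(power=0.2, prior_odds=0.01):.3f}")
```

even with good power and plausible hypotheses, barely three in five "significant" findings are true under these assumptions; in an underpowered exploratory field almost all are false, which is the quantitative core of the "most published research findings are false" argument.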
computers are critical in all fields of data analysis, and computer simulations need to be reliable (validated, verified, and their uncertainty quantified) so that they can feed into real-world applications and decisions, be they governmental policies for dealing with pandemics or the global climate emergency, the provision of food and shelter for refugee populations fleeing conflicts, the creation of new materials, the design of the first commercial fusion reactor, or assistance to doctors testing medication on a virtual patient before a real one. reproducibility in computer simulations would seem trivial to the uninitiated: enter the same data into the same program on the same architecture and you should get the same results. in practice, however, there are many barriers to overcome to ensure the fidelity of a model in a computational environment. 32 overall, it can be challenging if not impossible to test the claims and arguments made by authors in published work without access to the original code and data, and even, in some instances, the machines the software ran on. one study of what the authors dubbed 'weak repeatability' examined 402 papers with results backed by code and found that, for one third, they were able to obtain the code and build it within half an hour, while for just under half they succeeded with significant extra effort. for the remainder, it was not possible to verify the published findings. the authors reported that some researchers are reluctant to share their source code, for instance for commercial and licensing reasons, or because of dependencies on other software, whether due to external libraries or compilers, or because the version they used in their paper had been superseded, or had been lost due to lack of backup. many detailed choices in the design and implementation of a simulation never make it into published papers.
frequently, the principal code developer has moved on, the code turns out to depend on exotic hardware, there is inadequate documentation, and/or the code developers say that they are too busy to help. 33 there are some high-profile examples of these issues, from disclosure of climate codes and data 34 to delays in sharing codes for covid-19 pandemic modelling. 35 if the public are to have confidence in computing models that could directly affect them, transparency, openness and the timely release of code and data are critical. in response to this challenge, there have been various proposals to allow scientists to openly share the code and data that underlie their research publications: runmycode [runmycode.org] and, perhaps better known, github [github.com]; share, a web portal to create, share, and access remote virtual machines that can be cited from research papers to make an article fully reproducible and interactive; 36 papermâché, another means to view and interact with a paper using virtual machines; 37 various means to create 'executable papers' 38,39; and a verifiable result identifier (vri), which consists of trusted and automatically generated strings that point to publicly available results originally created by the computational process. 40 in addition to external verification, there are many initiatives to incorporate verification and validation into computer model development, along with uncertainty quantification techniques.
41 in the united states, for example, the american society of mechanical engineers has a standards committee for the development of verification and validation (v&v) procedures for computational solid mechanics models; 42 guidelines and recommended practices have been developed by the national aeronautics and space administration (nasa); 43 and the us defense nuclear facilities safety board backs model v&v for all safety-related nuclear facility design, analyses, and operations, while various groups within the doe laboratories (including sandia, los alamos, and lawrence livermore) are conducting research in this area. 44 in europe, the vecma (verified exascale computing for multiscale applications) project 45 is developing software tools that can be applied to many research domains, from the laptop to the emerging generation of exascale supercomputers, in order to validate, verify, and quantify the uncertainty within highly diverse applications. the major challenge faced by the state of the art is that many scientific models are multiphysics in nature, combining two or more kinds of physics, for instance to simulate the behaviour of plasmas in tokamak nuclear fusion reactors 46, electromechanical systems 47 or food processing 48. even more common, and more challenging, many models are also multiscale, requiring the successful convergence of various theories that operate at different temporal and/or spatial scales. they are widespread at the interface between various fields, notably physics, chemistry and biology. the ability to integrate macroscopic universality and molecular individualism is perhaps the greatest challenge of multiscale modelling 49. as one example, we certainly need multiscale models if we are to capture the biology and medicine that underpin the behaviour of an individual person.
digital medicine is increasingly important and, as a corollary of this, there have been calls for steps to avoid a reproducibility "crisis" of the kind that has engulfed other areas of biomedicine. 50 although there are many kinds of multiscale modelling, there now exist protocols to enable the verification, validation, and uncertainty quantification of multiscale models. 51 the vecma toolkit 52, which is not only open source but whose development is also performed openly, has many components: fabsim3, to organise and perform complex remote tasks; easyvvuq, a python library designed to facilitate verification, validation and uncertainty quantification for a variety of simulations 53,54; qcg-pilotjob, to provide efficient and reliable execution of large numbers of computational jobs; qcg-now, to prepare and run computational jobs on high performance computing machines; qcg-client, to provide support for a variety of computing jobs, from simple ones to complex distributed workflows; easyvvuq-qcgpilotjob, for efficient, parallel execution of demanding easyvvuq scenarios on high performance machines; and muscle 3, to make creating coupled multiscale simulations easier and to enable efficient uncertainty quantification of such models.
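the kind of forward uncertainty propagation such toolkits automate can be sketched generically: sample an uncertain input over its assumed range, run the model once per sample, and summarise the spread of the output. this is a generic monte-carlo-style sketch, not the easyvvuq api, and the toy model and input range are invented.

```python
import math
import statistics

# a generic forward uncertainty-propagation loop of the kind such toolkits
# automate: sample an uncertain input over its assumed range, run the model
# once per sample, and summarise the spread of the output. the toy model and
# the input range are illustrative, not part of any vecma component.

def model(decay_rate, t=5.0):
    # stand-in simulation: exponential decay observed at time t
    return math.exp(-decay_rate * t)

# uniform grid over the assumed input uncertainty, decay_rate in [0.1, 0.3]
samples = [0.1 + 0.2 * k / 99 for k in range(100)]
outputs = [model(k) for k in samples]

mean = statistics.fmean(outputs)
spread = statistics.stdev(outputs)
print(f"quantity of interest: {mean:.3f} +/- {spread:.3f}")
```

reporting the output as a distribution rather than a single number is the essence of uncertainty quantification: a prediction without its error bars cannot be validated against experiment.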
the vecma toolkit is already being applied in several circumstances: climate modelling, where multiscale simulations of the atmosphere and oceans are required; forecasting refugee movements away from conflicts, or as a result of climate change, to help prioritise resources and investigate the effects of border closures and other policy decisions 55; exploring the mechanical properties of a simulated material at several length and time scales with verified multiscale simulations; multiscale simulations to understand the mechanisms of heat and particle transport in fusion devices, which is important because transport plays a key role in determining the size, shape and more detailed design and operating conditions of a future fusion power reactor, and hence the possibility of extracting almost limitless energy; and verified simulations to aid decision-making in drug prescription, simulating how drugs interact with a virtual version of a patient's proteins 56 or how stents will behave when placed in virtual versions of arteries. 57 recent years have seen an explosive growth in digital data, accompanied by rising public awareness that their lives depend on "algorithms", though it is plain that any computer code is based on an algorithm, without which it will not run. under the banner of artificial intelligence and machine learning, many of these algorithms seek patterns in those data. some (emphatically not the authors of this paper) even claim that this approach will be faster and more revealing than modelling the underlying behaviour, notably by the use of conventional theory, modelling and simulation. 58 this approach is particularly attractive in disciplines traditionally not deemed suitable for mathematical treatment because they are so complex, notably the life and social sciences, along with the humanities. however, to build a machine-learning system, you have to decide what data you are going to choose to populate it.
that choice is frequently made without any attempt to first try to understand the structural characteristics that underlie the system of interest, with the result that the "ai system" produced strongly reflects the limitations or biases (be they implicit or explicit) of its creators. moreover, there are four fundamental issues with big data that are frequently not recognised by practitioners 58 : complex systems are strongly correlated, so they do not generally obey gaussian statistics; no datasets are large enough for systems with strong sensitivity to rounding or inaccuracies; correlation does not imply causality; and too much data can be as bad as no data: although computers can be trained on larger datasets than the human brain can absorb, there are fundamental limitations to the power of such datasets (as one very real example, mapping genotype to phenotype is far from straightforward), not least due to their digital character. all machine-learning algorithms are initialised using (pseudo) random number generators and have to be run vast numbers of times to ensure that their statistical predictions are robust. however, they typically make plenty of other assumptions, such as smoothness (i.e. continuity) between data points. the problem is that nonlinear systems are often anything but smooth, and there can be jumps, discontinuities and singularities. not only the smoothness of behaviour but also the forms of distribution of data regularly assumed by machine learners are frequently unknown or untrue in complex systems. indeed, many such approaches are distribution free, in the sense that there is no knowledge provided about the way the data being used is distributed in a statistical sense. 58 often, a gaussian ("normal") distribution is assumed by default; while this distribution plays an undeniable role across all walks of science it is far from universal. 
indeed, it fails to describe most phenomena where complexity holds sway because, rather than resting on randomness, these typically have feedback loops, interactions and correlations. machine learning is often used to seek correlations in data. but in a real-world system, for instance in a living cell that is a cauldron of activity of 42 million protein molecules 59 , can we be confident that we have captured the right data? random data dredging for complex problems is doomed to fail where one has no idea which variables are important. in these cases, data dredging will always be defeated by the curse of dimensionality -there will simply be far too much data needed to fill in the hyperdimensional space for blind machine learning to produce correlations to any degree of confidence. on top of that, as mentioned earlier, the ratio of false to true correlations soars with the size of the dataset, so that too much data can be worse than no data at all. there are practical considerations too. machine-learning systems can never be better than the data they are trained on, which can contain biases 'whether morally neutral as toward insects or flowers, problematic as toward race or gender, or even simply veridical, reflecting the status quo distribution of gender with respect to careers or first names'. 60 in healthcare systems, for example, where commercial prediction algorithms are used to identify and help patients with complex health needs, significant racial bias has been found. 61 machine learning systems are black boxes, even to the researchers that build them, making it hard for their creators, let alone others, to assess the results produced by these glorified curve-fitting systems. precise replication would be nearly impossible given the natural randomness in neural networks and variations in hardware and code. 
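the curse-of-dimensionality point can be made quantitative with a one-line counting argument (an illustration we add here, not a computation from the paper): covering a space at even modest resolution requires exponentially many cells as the number of variables grows.

```python
# covering the unit cube [0,1]**d at a modest resolution of 0.1 per axis
# requires 10**d cells, so the data needed to "fill in" the space for
# blind pattern-finding grows exponentially with the dimension d.
def cells_needed(dims, bins_per_axis=10):
    return bins_per_axis ** dims

for d in (2, 10, 100):
    print(d, cells_needed(d))

# at d = 100 the cell count already dwarfs the ~1e80 atoms
# in the observable universe
assert cells_needed(100) > 10 ** 80
```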
that is one reason why blind machine learning is unlikely ever to be accepted by regulatory authorities in medical practice as a basis for offering drugs to patients. to comply with regulatory authorities such as the us food and drug administration and the european medicines agency, the predictions of an ml algorithm are not enough: it is essential that an underlying mechanistic explanation is also provided, one which can explain not only when a drug works but also when it fails, and/or produces side effects. there are even deeper problems of principle in seeking to produce reliable predictions about the behaviour of complex systems of the sort one encounters frequently in the most pressing problems of twenty-first century science. we are thinking particularly of the life sciences, medicine, healthcare and environmental sciences, where systems typically involve large numbers of variables and many parameters. the question is how to select these variables and parameters to best fit the data. despite the constant refrain that we live in the age of "big data", the data we have available is never enough to model problems of this degree of complexity. unlike more traditional reductionist models, where one may reasonably assume one has sufficient data to estimate a small number of parameters, such as a drug interacting with a nerve cell receptor, this ceases to be the case in complex and emergent systems, such as modelling a nerve cell itself. the favourite approach of the moment is of course to select machine learning, which involves adjustment of large numbers of parameters inside the neural network "models" used; these can be tuned to fit the data available but have little to no predictability beyond the range of the data used because they do not take into account the structural characteristics of the phenomenon under study. this is a form of overfitting.
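the overfitting point is easy to demonstrate with nothing more than exact polynomial interpolation (a self-contained sketch we add for illustration, not the authors' example): a curve tuned to pass through six samples perfectly has essentially no predictive power beyond the range of those samples.

```python
def lagrange(xs, ys, x):
    """evaluate the unique degree-(n-1) polynomial through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# six noise-free samples of a smooth, bounded function on [-1, 1]
f = lambda x: 1.0 / (1.0 + 25.0 * x * x)
xs = [-1.0, -0.6, -0.2, 0.2, 0.6, 1.0]
ys = [f(x) for x in xs]

# the fitted curve reproduces its training data exactly ...
assert all(abs(lagrange(xs, ys, x) - y) < 1e-9 for x, y in zip(xs, ys))
# ... yet just beyond the data range it is wildly wrong: at x = 2 the
# interpolant is ~12.9 while the true value is ~0.01
assert abs(lagrange(xs, ys, 2.0) - f(2.0)) > 10.0
```

the neural-network case is the same phenomenon with vastly more tunable parameters.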
63 as a result of the uncertainty in all these parameters, the model itself becomes uncertain, as testing it involves an assessment of probability distributions over the parameters and, with nowhere near adequate data available, it is not clear if it can be validated in a meaningful manner. 64 for some related issues of a more speculative and philosophical nature in the study of complexity, see succi (2019). compounding all this, there is a fundamental problem that undermines our faith in simulations, which arises from the digital nature of modern computers, whether classical or quantum. digital computers make use of around four billion rational numbers that range from plus to minus infinity, the so-called 'single-precision ieee floating-point numbers', which refers to a technical standard for floating-point arithmetic established by the institute of electrical and electronics engineers in 1985; they also frequently use double-precision floating-point numbers, while half-precision has become commonplace of late in the running of machine learning algorithms. however, digital computers only use a very small subset of the rational numbers - so-called dyadic numbers, whose denominators are powers of 2 because of the binary system underlying all digital computers - and the way these numbers are distributed is highly nonuniform. moreover, there are infinitely more irrational than rational numbers, which are ignored by all digital computers because to store any one of them, typically, one would require an infinite memory.
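the dyadic character of floating-point numbers is easy to verify directly; here is a minimal python sketch (an aside we add for illustration, not part of the paper):

```python
# every ieee-754 double is a dyadic rational p / 2**k; python exposes
# the exact stored fraction via float.as_integer_ratio().
num, den = (0.1).as_integer_ratio()
print(num, den)  # 3602879701896397 36028797018963968  (denominator = 2**55)

# the stored value is therefore not 1/10 ...
assert (num, den) != (1, 10)
# ... and the denominator is always a power of two
assert den == 2 ** 55 and den & (den - 1) == 0

# the same dyadic structure holds for any finite float
for x in (0.3, 2.5, 1e-12):
    d = x.as_integer_ratio()[1]
    assert d & (d - 1) == 0
```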
leaving aside the mistaken belief held by some that a very few repeats of, say, a molecular dynamics simulation are any replacement for (quasi) monte carlo methods based on ensembles of replicas, these findings strongly suggest that the digital simulation of all chaotic systems, found in models used to predict weather, climate, molecular dynamics, chemical reactions, fusion energy and much more, contains sizeable errors of a nature that hitherto have been unknown to most scientists. by the same token, the use of data from these chaotic simulations to train machine learning algorithms will in turn produce artefacts, making them unreliable. this shortcoming produced generic errors of up to 20 per cent in the case of the bernoulli map, along with pure nonsense on rare occasions. one might ask why, if the consequences can be so substantial, these errors have not been noticed. the difficulty is that for real-world simulations in turbulence and molecular dynamics, for example, there are no exact, closed-form mathematical solutions for comparison, so the numerical solutions that roll off the computer are simply assumed to be correct. given the approximations involved in such models, not to speak of the various sources of measurement errors, it is never possible to obtain exact agreement with experimental results. in short, the use of floating-point numbers instead of real numbers contributes additional systematic errors in numerical schemes that have not so far been assessed at all 66 . for modelling, we need to tackle both epistemic and aleatoric sources of error. a survey of association for the advancement of artificial intelligence (aaai) conferences 76 found that only 6% of the presenters shared the algorithm's code. 77 the most commonly-used machine learning platforms provided by big tech companies have poor support for reproducibility.
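the bernoulli map pathology mentioned above can be reproduced in a few lines: because every float is a dyadic rational p/2**k, the digital orbit of the doubling map x -> 2x mod 1 collapses to exactly zero within a few dozen steps, whereas the true map is chaotic for almost every real starting point. a minimal python sketch (our illustration, not the authors' code):

```python
def bernoulli_orbit(x0, n=60):
    """iterate the doubling (bernoulli) map x -> 2x mod 1 in double precision."""
    xs = [x0]
    for _ in range(n):
        xs.append((2.0 * xs[-1]) % 1.0)
    return xs

orbit = bernoulli_orbit(0.1)
# 0.1 is stored as an odd integer divided by 2**55; doubling shifts its
# bits left one place per step, so after 55 steps the fractional part is
# exactly zero and the digital "orbit" dies, although the true map never
# settles down.
print(orbit.index(0.0))  # 55
assert orbit[55] == 0.0 and orbit[54] == 0.5
```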
78 studies have shown that even if the results of a deep learning model could be reproduced, a slightly different experiment would not support the findings - yet another example of overfitting - which is common in machine learning research. in other words, unreproducible findings can be built upon supposedly reproducible methods. 79 rather than continuing to simply fund, pursue and promote 'blind' big data projects, more resources should be allocated to the elucidation of the multiphysics, multiscale and stochastic processes controlling the behaviour of complex systems, such as those in biology, medicine, healthcare and environmental science. 21 finding robust predictive mechanistic models that provide explanatory insights will be of particular value for machine learning when dealing with sparse and incomplete sets of data and ill-posed problems, for exploring vast design spaces to seek correlations and then, most importantly, for identifying which of those correlations are causal. where machine learning provides a correlation, multiscale modelling can test if this correlation is causal. there are also demands in some fields for a reproducibility checklist, 80 to make ai reproducibility more practical, reliable and effective. another suggestion is the use of so-called "model cards" - documentation that accompanies trained machine learning models and outlines the application domains, the context in which they are being used and their carefully benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, and phenotypic groups; 81 and proposals for best practice in reporting experimental results which permit robust comparison. 82 despite the caveat that computers are made and used by people, there is also considerable interest in their use to design and run experiments, for instance using bayesian optimization methods, such as in the field of cognitive neuroscience 83 and to model infectious diseases and immunology quantitatively.
84 when it comes to the limitations of digital computing, research is under way by boghosian and pvc to find alternative approaches that might render such problems computable on digital computers. among possible solutions, one that seems guaranteed to succeed is analogue computing, an older idea, able to handle the numerical continuum of reality in a way that digital computers can only approximate. 85 in the short term, notably in the biosciences, better data collection, curation, validation, verification and uncertainty quantification procedures of the kind described here, will make computer simulations more reproducible, while machine learning will benefit from a more rigorous and transparent approach. the field of big data and machine learning has become extremely influential but without big theory it remains dogged by a lack of firm theoretical underpinning ensuring its results are reliable. 21 indeed, we have argued that in the modern era in which we aspire to describe really complex systems, involving many variables and vast numbers of parameters, there is not sufficient data to apply these methods reliably. our models are likely to remain uncertain in many respects, as it is so difficult to validate them. in the medium term, ai methods may, if carefully produced, improve the design, objectivity and analysis of experiments. however, this will always require the participation of people to devise the underlying hypotheses and, as a result, it is important to ensure that they fully grasp the assumptions on which these algorithms are based and are also open about these assumptions. it is already becoming increasingly clear that 'artificial intelligence' is a digital approximation to reality. moreover, in the long term, when we are firmly in the era of routine exascale and perhaps eventually also quantum computation, we will have to grapple with a more fundamental issue. 
even though there are those who believe the complexity of the universe can be understood in terms of simple programs rather than by means of concise mathematical equations, 86,87 digital computers are limited in the extent to which they can capture the richness of the real world. 85,88 freeman dyson, for example, speculated that for this reason the downloading of a human consciousness into a digital computer would involve 'a certain loss of our finer feelings and qualities'. 89 in the quantum and exascale computing eras, we will need renewed emphasis on the analogue world and analogue computational methods if we are to trust our computers. 85
references:
- "the turing way" - a handbook for reproducible data science
- the need for open source software in machine learning
- as concerns about non-reproducible data mount, some solutions take shape
- addressing scientific fraud. science
- bad pharma: how drug companies mislead doctors and harm patients
- betrayers of the truth: fraud and deceit in the halls of science
- quantification of uncertainty in computational fluid dynamics
- assessing the reliability of complex models: mathematical and statistical foundations of verification, validation, and uncertainty quantification
- editorial: special issue on verification, validation, and uncertainty quantification of cardiovascular models: towards effective vvuq for translating cardiovascular modelling to clinical utility
- how scientists fool themselves - and how they can stop
- low statistical power in biomedical science: a review of three human research domains
- empirical assessment of published effect sizes and power in the recent cognitive neuroscience and psychology literature
- a survey of the statistical power of research in behavioral ecology and animal behavior
- evaluating the quality of empirical journals with respect to sample size and statistical power. ouzounis ca
- power failure: why small sample size undermines the reliability of neuroscience
- the evolution and psychology of self-deception
- variability in the analysis of a single neuroimaging dataset by many teams
- the end of theory: the data deluge makes the scientific method obsolete
- the scientific method in the science of machine learning
- big data need big theory too
- the matthew effect in science: the reward and communication systems of science are considered. science
- the matthew effect in science funding
- muddled meanings hamper efforts to fix reproducibility crisis
- reproducible research: a minority opinion
- why most published research findings are false
- early extreme contradictory estimates may appear in published research: the proteus phenomenon in molecular genetics research and randomized trials
- why selective publication of statistically significant results can be effective
- contradicted and initially stronger effects in highly cited clinical research
- estimating the reproducibility of psychological science. science
- 1,500 scientists lift the lid on reproducibility
- reproducibility in scientific computing. published online 2014:1-37
- the disclosure of climate data from the climatic
- github - imperial college london - code for modelling estimated deaths and cases for covid19
- share: a web portal for creating and sharing executable research papers
- paper mâché: creating dynamic reproducible science
- a provenance-based infrastructure to support the life cycle of executable papers
- a universal identifier for computational results
- a comprehensive framework for verification, validation, and uncertainty quantification in scientific computing
- verification, validation and uncertainty quantification (vvuq)
- concepts of model verification and validation
- vecma: verified exascale computing for multiscale applications
- multidimensional multiphysics simulation of nuclear fuel behavior
- multiphysics simulation
- innovative food processing technologies: advances in bridging the gaps at the physics-chemistry-biology interface
- the reproducibility crisis in the age of digital medicine
- multiscale modelling: approaches and challenges
- building confidence in simulation: applications of easyvvuq
- a library for verification, validation and uncertainty quantification in high performance computing
- a generalized simulation development approach for predicting refugee destinations
- ensemble-based steered molecular dynamics predicts relative residence time of a2a receptor binders
- semi-intrusive multiscale metamodelling uncertainty quantification with application to a model of in-stent restenosis
- big data: the end of the scientific method
- unification of protein abundance datasets yields a quantitative saccharomyces cerevisiae proteome
- semantics derived automatically from language corpora contain human-like biases
- dissecting racial bias in an algorithm used to manage the health of populations. science
- weapons of math destruction. crown random house
- the problem of overfitting
- the evolution of scientific knowledge: from certainty to uncertainty
- a new pathology in the simulation of chaotic dynamical systems on digital computers
- of naturalness and complexity
- metascience could rescue the 'replication crisis'
- the unity of knowledge
- psychologists strike a blow for reproducibility
- increasing transparency through a multiverse analysis
- the science of team science
- brazilian biomedical science faces reproducibility test
- policy: nih plans to enhance reproducibility
- the academy of medical sciences. reproducibility and reliability of biomedical research: improving research practice
- state of the art: reproducibility in artificial intelligence
- missing data hinder replication of artificial intelligence studies. science
- out-of-the-box reproducibility: a survey of machine learning platforms
- unreproducible research is reproducible
- the machine learning reproducibility checklist (version 1.0)
- model cards for model reporting
- show your work: improved reporting of experimental results
- neuroadaptive bayesian optimization and hypothesis testing
- host genotype and time dependent antigen presentation of viral peptides: predictions from theory. sci rep
- from digital hype to analogue reality: universal simulation beyond the quantum and exascale eras
- the wolfram physics project: a project to find the fundamental theory of physics
- a new kind of science. wolfram media
- a class of models with the potential to represent fundamental physics
acknowledgements: the authors are grateful for many stimulating conversations with bruce boghosian, daan crommelin, ed dougherty, derek groen, alfons hoekstra, robin richardson & david wright. authors' contributions: all authors contributed to the concept and writing of the article. competing interests: the authors have no competing interests. funding statement: p.v.c.
is grateful for funding from the uk epsrc for the ukcomes uk high-end computing consortium (ep/r029598/1), from mrc for a medical bioinformatics grant (mr/l016311/1), from the european commission for the compbiomed, compbiomed2 and vecma grants (numbers 675451, 823712 and 800925 respectively) and special funding from the ucl provost. key: cord-018976-0ndb7rm2 authors: iwasa, yoh; sato, kazunori; takeuchi, yasuhiro title: mathematical studies of dynamics and evolution of infectious diseases date: 2007 journal: mathematics for life science and medicine doi: 10.1007/978-3-540-34426-1_1 sha: doc_id: 18976 cord_uid: 0ndb7rm2 nan the practical importance of understanding the dynamics and evolution of infectious diseases is steadily increasing in the contemporary world. one of the most important mortality factors for the human population is malaria. every year, hundreds of millions of people suffer from malaria, and more than a million children die. one of the obstacles to controlling malaria is the emergence of drug-resistant strains. pathogen strains resistant to antibiotics pose an important threat in developing countries. in addition, we observe new infectious diseases, such as hiv, ebola, and sars. the mathematical study of infectious disease dynamics has a long history. the classic work by kermack and mckendrick (1927) established the basis of modeling infectious disease dynamics. the variables indicate the numbers of host individuals in several different states - susceptible, infective and removed. this formalism is the basis of all current modeling of the dynamics and evolution of infectious diseases. since then, the number of theoretical papers on infectious diseases has increased steadily. especially influential was a series of papers by roy anderson and robert may, summarized in their book (anderson and may 1991). anderson and may have developed population dynamic models of the host engaged in reproduction and migration.
in a sense, they treated epidemic dynamics as a variant of the ecological population dynamics of a multi-species community. combining this with the growth of our knowledge of nonlinear dynamical systems (e.g. chaos), anderson and may also demonstrated the usefulness of simple models in understanding the basic principles of the system, and sometimes even in choosing a proper policy of infectious disease control. the dynamical systems for epidemics are characterized by nonlinearity. the systems include many processes at very different scales, from the population on earth to the individual level, and further to the immune system within a patient. hence, mathematical studies of epidemics need to face this dynamical diversity of phenomena. tools of modeling and analysis for situations including time delay and spatial heterogeneity are very important. as a consequence, there is no universal mathematical model that holds for all problems in epidemics. when we are given a set of epidemiological phenomena and questions to answer, we must "construct" mathematical models that can describe the phenomena and answer our questions. this is quite different from studies in "pure" mathematics, in which usually the models are given beforehand. one of the most important questions in mathematical studies of epidemics is the possibility of the eradication of disease. the standard local stability analysis of the endemic equilibrium and disease-free equilibrium is often not enough to answer the question, because it gives us information only on the local behavior, or the solution in the neighborhood of those equilibria. on the other hand, it is known that global stability analysis of the models is often very difficult, and even impossible in general cases, because the dynamics are highly nonlinear. even if the endemic equilibrium were unstable and the disease-free equilibrium were locally stable, the disease can remain endemic and be sustained forever.
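as a concrete illustration of the threshold behaviour that such stability analysis formalises (a sketch we add for illustration; the code and parameter values are not from the chapter), a minimal numerical integration of the kermack-mckendrick sir model shows that an introduced infection fades out when the basic reproduction number beta/gamma is below one, and causes a large outbreak when it is above one:

```python
def run_sir(beta, gamma, i0=1e-4, days=400, dt=0.1):
    """forward-euler integration of the kermack-mckendrick sir model:
    s' = -beta*s*i,  i' = beta*s*i - gamma*i,  r' = gamma*i,
    with s, i, r as fractions of a closed population (r = 1 - s - i)."""
    s, i = 1.0 - i0, i0
    peak = i
    for _ in range(int(round(days / dt))):
        ds = -beta * s * i
        di = beta * s * i - gamma * i
        s += ds * dt
        i += di * dt
        peak = max(peak, i)
    return peak, 1.0 - s - i  # peak prevalence, final epidemic size

# below threshold (beta/gamma = 0.8): the introduction fades out
peak_sub, final_sub = run_sir(beta=0.16, gamma=0.2)
# above threshold (beta/gamma = 2.0): a large outbreak occurs
peak_super, final_super = run_sir(beta=0.4, gamma=0.2)
assert peak_sub <= 1e-4 and final_super > 0.5
```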
sometimes, rather simple models show periodic or chaotic behavior. recently, the concept of "permanence" was introduced in population biology and has been studied extensively. this concept is very important in mathematical epidemiology as well. permanence implies that the disease will be maintained globally, irrespective of the initial composition. even if the endemic equilibrium were unstable, the disease will last forever, possibly with perpetual oscillation or chaotic fluctuation. since the epidemiological data supplied by medical and public health sectors are abundant, epidemiological models are in general much better tested than similar population models in ecology developed for wild animals and plants. the diversity of models is also extensive, including all the different levels of complexity. rather simple and abstract models are suitable to discuss general properties of the system, while more complex and realistic computerbased simulators are adopted for policy decision making incorporating details of the structure closely corresponding to available data. mathematical modeling of infectious diseases is the most advanced subfield of theoretical studies in biology and the life sciences. what is notable in this development is that, even if many computer-based detailed simulators become available, the rigorous mathematical analysis of simple models remains very useful, medically and biologically, in giving a clear understanding of the behavior of the system. recently, the evolutionary change of infectious agents in the host population or within a patient has attracted an increasing attention. mutations during genome replication would create pathogens that may differ slightly from the original types. this gives an opportunity for a novel strain to emerge and spread. as noted before, emergence of resistant strains is a major obstacle of infectious disease control. essentially the same evolutionary process occurs within the body of a single patient. 
a famous example is hiv, in which viral particles change and diversify their nucleotide sequences after they infect a patient. this supposedly reflects the selection by the immune system of the host working on the virus genome. a similar process of escape is involved in carcinogenesis -a process in which normal stem cells of the host become cancerous. the papers included in this volume are for mathematical studies of models on infectious diseases and cancer. most of them are based on presentations in the first international symposium on dynamical systems theory and its applications to biology and environmental sciences, held in hamamatsu, japan, on 14-17 march 2004. this introductory chapter is followed by four papers on infectious disease dynamics, in which the roles of time delay (chaps. 2 and 3) and spatial structures (chaps. 4 and 5) are explored. then, there are two chapters that discuss competition between strains and evolution occurring in the host population (chap. 6) and within a single patient (chap. 7). finally, there are papers on models of the immune system and cancer (chaps. 8 and 9). below, we briefly summarize the contents of each chapter. in chap. 2, zhien ma and jianquan li give an introduction to the mathematical modeling of disease dynamics. then, they summarize a project of modeling the spread of sars in china by the authors and their colleagues. in chap. 3, yasuhiro takeuchi and wanbiao ma introduce mathematical studies of models with time delay. they first review past mathematical studies on this theme during the last few decades, and then introduce their own work on the stability of the equilibrium and the permanence of epidemiological dynamics. in chaps 4 and 5, wendi wang and shigui ruan discuss the spatial aspect of epidemiology. the spread of a disease in a population previously not infected may appear as "wave of advance". 
this is often modeled as a reaction-diffusion system, or by other models handling spatial aspects of population dynamics. the speed of disease propagation is analogous to the spread of invaders in a novel habitat in spatial ecology (shigesada and kawasaki 1997). since microbes have a shorter generation time and huge numbers of individuals, they have much faster evolutionary changes, causing drug resistance and immune escape, among the most common problems in epidemiology. by considering the appearance of novel strains with different properties from those of the resident population of pathogens, and tracing their abundance, we can discuss the evolutionary dynamics of infectious diseases. in chap. 6, horst thieme summarizes the work on the competition between different, competing strains, and the possibility of their coexistence and replacement. an important concept is the "maximal basic replacement ratio". if a host once infected with and then recovered from a single strain is perfectly immune to all the other strains (i.e. cross-immunity is perfect), then the strain with the largest basic replacement ratio will win the competition among the strains. the author explores the extent to which this result can be generalized. he also discusses the coexistence of strains, considering the aspect of maternal transmission as well. in chap. 7, yoh iwasa and his colleagues analyze the result of evolutionary change occurring within the body of a single patient. some pathogens, especially rna viruses, have high mutation rates, due to an unreliable replication mechanism, and hence show rapid genetic change in a host. the nucleotide sequences of hiv just after infection will be quite different from those found after several years. by mutation and natural selection under the control of the immune system, they become diversified and constantly evolve.
iwasa and his colleagues derive the result that, without cross-immunity among strains, the pathogenicity of the disease tends to increase under any evolutionary change. they explore several different forms of cross-immunity for which the result still seems to hold. in chap. 8, edoardo beretta and his colleagues discuss immune response based on mathematical models including time delay. the immune system has evolved to cope with infectious diseases and cancers. it has the property of immune memory: hosts, once attacked and recovered, will no longer be susceptible to infection by the same strain. to achieve this, the body has a complicated network of diverse immune cells. beretta and his colleagues summarize their study of modeling immune system dynamics in which time delay is incorporated. in the last chapter, h.i. freedman studies cancer, which originates from the self-cells of the patient, but which then becomes hostile through mutations. there is much in common between cancer cells and pathogens originating from outside the host body. freedman discusses optimal chemotherapy, considering the cost and benefit of chemotherapy. this collection of papers gives an overview of theoretical studies of infectious disease dynamics and evolution, and hopefully will serve as a source in future studies of different aspects of infectious disease dynamics. here, the key words are time delay, spatial dynamics, and evolution. toward the end of this introductory chapter, we would like to note one limitation - all of the papers in this volume discuss deterministic models, which are accurate when the population size is very large. since the number of microparasites, such as bacteria, viruses, or cancer cells, is often very large, the neglect of stochasticity due to the finiteness of individuals seems to be acceptable. however, when we consider the speed of the appearance of novel mutants, we do need stochastic models, because mutants always start from a small number.
according to studies on the timing of cancer initiation, which starts from rare mutations followed by population growth of cancer cells, the predictions of deterministic models differ by several orders of magnitude from those of stochastic models and direct computer simulations.
references:
- infectious diseases of humans
- a contribution to the mathematical theory of epidemics
- biological invasions: theory and practice
key: cord-000282-phepjf55 authors: hsieh, ying-hen; fisman, david n; wu, jianhong title: on epidemic modeling in real time: an application to the 2009 novel a (h1n1) influenza outbreak in canada date: 2010-11-05 journal: bmc res notes doi: 10.1186/1756-0500-3-283 sha: doc_id: 282 cord_uid: phepjf55 background: management of emerging infectious diseases such as the 2009 influenza pandemic a (h1n1) poses great challenges for real-time mathematical modeling of disease transmission due to limited information on disease natural history and epidemiology, stochastic variation in the course of epidemics, and changing case definitions and surveillance practices. findings: the richards model and its variants are used to fit the cumulative epidemic curve for laboratory-confirmed pandemic h1n1 (ph1n1) infections in canada, made available by the public health agency of canada (phac). the model is used to obtain estimates for turning points in the initial outbreak, the basic reproductive number (r0), and for expected final outbreak size in the absence of interventions. confirmed case data were used to construct a best-fit 2-phase model with three turning points. r0 was estimated to be 1.30 (95% ci 1.12-1.47) for the first phase (april 1 to may 4) and 1.35 (95% ci 1.16-1.54) for the second phase (may 4 to june 19). hospitalization data were also used to fit a 1-phase model with r0 = 1.35 (1.20-1.49) and a single turning point of june 11.
conclusions: application of the richards model to canadian ph1n1 data shows that detection of turning points is affected by the quality of data available at the time of data usage. using a richards model, robust estimates of r0 were obtained approximately one month after the initial outbreak in the case of 2009 a (h1n1) in canada. epidemics and outbreaks caused by emerging infectious diseases continue to challenge medical and public health authorities. outbreak and epidemic control requires swift action, but real-time identification and characterization of epidemics remains difficult [1]. methods are needed to inform real-time decision making through rapid characterization of disease epidemiology, prediction of short-term disease trends, and evaluation of the projected impacts of different intervention measures. real-time mathematical modeling and epidemiological analysis are important tools for such endeavors, but the limited public availability of information on outbreak epidemiology (particularly when the outbreak creates a crisis environment), and on the characteristics of any novel pathogen, presents obstacles to the creation of reliable and credible models during a public health emergency. one needs to look no further than the 2003 sars outbreak, or ongoing concerns related to highly pathogenic avian influenza (h5n1) or bioterrorism, to be reminded of the need for and difficulty of real-time modeling. the emergence of a novel pandemic strain of influenza a (h1n1) (ph1n1) in spring 2009 highlighted these difficulties. early models of 2009 ph1n1 transmission were subject to substantial uncertainties regarding all aspects of this outbreak, resulting in uncertainty in judging the pandemic potential of the virus and the implementation of reactive public health responses in individual countries (fraser et al. [2]).
multiple introductions of a novel virus into the community early in the outbreak could further distort disease epidemiology by creating fluctuations in incidence that are misattributed to the behavior of a single chain of transmission. we sought to address three critical issues in real-time disease modeling for the newly emerged 2009 ph1n1: (i) to estimate the basic reproduction number; (ii) to identify the main turning points in the epidemic curve that distinguish different phases or waves of disease; and (iii) to predict the future course of events, including the final size of the outbreak in the absence of intervention. we make use of a simple mathematical model, namely the richards model, to illustrate the usefulness of near real-time modeling in extracting valuable information about the outbreak directly from publicly available epidemic curves. we also provide caveats regarding inherent limitations of modeling with incomplete epidemiological data. the accuracy of any modeling is highly dependent on the epidemiological characteristics of the outbreak considered, and most epidemic curves exhibit multiple turning points (peaks and valleys) during the early stage of an outbreak. while these may be due to stochastic ("random") variations in disease spread, or to changes in either surveillance methods or case definitions, turning points may also represent time points where epidemics transition from exponential growth to declining rates of growth, and thus may identify effects of disease control programs, peaks of seasonal waves of infection, or natural slowing of growth due to infection of a critical fraction of susceptible individuals. for every epidemic, there is a time point after which a given phase of the outbreak can be reliably modeled, and beyond which subsequent phases may be anticipated.
detection of such "turning points" and identification of different phases or waves of an outbreak is of critical importance in designing and evaluating different intervention strategies. richards [3] proposed the following model to study the growth of biological populations, where c(t) is the cumulative number of cases reported at time t (in weeks): c'(t) = r c(t) [1 − (c(t)/k)^a]. here the prime "′" denotes the rate of change with respect to time. the model parameter k is the maximum case number (or final outbreak size) over a single phase of outbreak, r is the per capita growth rate of the infected population, and a is the exponent of deviation. the solution of the richards model can be given explicitly in terms of the model parameters as c(t) = k [1 + e^(−r a (t − ti))]^(−1/a), where ti is the turning point. using the richards model, we are able to fit empirical data from a cumulative epidemic curve directly to obtain estimates of epidemiologically meaningful parameters, including the growth rate r. in such a model formulation, the basic reproduction number r0 is given by the formula r0 = exp(rt), where t is the disease generation time, defined as the average time interval from infection of an individual to infection of his or her contacts. it has been shown mathematically [4] that, given the growth rate r, the equation r0 = exp(rt) provides an upper bound on the basic reproduction number regardless of the distribution of the generation interval used, assuming there is little pre-existing immunity to the pathogen under consideration. additional technical details regarding the richards model can be found in [5] [6] [7]. unlike the better-known deterministic compartmental models used to describe disease transmission dynamics, the richards model considers only the cumulative infected population size.
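as a concrete sketch of the formulas above, the reconstructed richards solution and the r0 = exp(rt) relation can be written as follows (parameter names and values here are illustrative, not taken from the paper's fits):

```python
import math

def richards(t, k, r, a, ti):
    # cumulative cases c(t) solving c'(t) = r*c*(1 - (c/k)**a),
    # with turning (inflection) point at t = ti
    return k * (1.0 + math.exp(-r * a * (t - ti))) ** (-1.0 / a)

def r0_from_growth(r, gen_time):
    # upper-bound estimate of the basic reproduction number from the
    # per-capita growth rate r and the generation time: r0 = exp(r*t)
    return math.exp(r * gen_time)
```

at t = ti the curve passes through k * 2**(-1/a) (half of k when a = 1), which is the inflection point of the s-shaped cumulative curve.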
this population size is assumed to saturate as the outbreak progresses, and this saturation can be caused by immunity, by implementation of control measures, or by other factors such as environmental or social changes (e.g., children departing from schools for summer holiday). the basic premise of the richards model is that the incidence curve of a single phase of a given epidemic consists of a single peak of high incidence, resulting in an s-shaped cumulative epidemic curve with a single turning point for the outbreak. the turning point or inflection point, defined as the time when the rate of case accumulation changes from increasing to decreasing (or vice versa), can be easily pinpointed as the point where the second derivative of c(t) changes sign; i.e., the moment at which the daily incidence begins to decline. this time point has obvious epidemiologic importance, indicating either the beginning of a new epidemic phase or the peak of the current epidemic phase. for epidemics with two or more phases, a variation of the s-shaped richards model has been proposed [6]. this multi-staged richards model distinguishes between two types of turning points: the first type ends the initial exponential growth of the initial s curve; the second type occurs where the growth rate of the number of cumulative cases begins to increase again, signifying the beginning of the next epidemic phase. this variant of the richards model provides a systematic method of determining whether an outbreak is single- or multi-phase in nature, and can be used to distinguish true turning points from peaks and valleys resulting from random variability in case counts. more details on application of the multi-staged richards model to sars can be found in [6, 7]. readers are also referred to [8, 9] for its applications to dengue.
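the turning-point definition above (a sign change in the growth of incidence, i.e. of the first difference of the cumulative curve) can be sketched numerically; this simple detector is a hypothetical illustration and would need smoothing on noisy real-world counts:

```python
def turning_points(cumulative):
    """indices (into the cumulative series) where daily incidence --
    the first difference of the cumulative curve -- switches from
    rising to falling (a peak) or falling to rising (a valley)."""
    inc = [b - a for a, b in zip(cumulative, cumulative[1:])]
    points = []
    for i in range(1, len(inc) - 1):
        peak = inc[i - 1] < inc[i] >= inc[i + 1]
        valley = inc[i - 1] > inc[i] <= inc[i + 1]
        if peak or valley:
            points.append(i + 1)  # day number in the cumulative series
    return points
```

a single-phase s-curve yields one turning point; a multi-phase curve yields an alternating peak/valley sequence, matching the multi-staged richards description.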
we fit both the single- and multi-phase richards models to canadian cumulative 2009 ph1n1 case data, using publicly available disease onset dates obtained from the public health agency of canada (phac) website [10, 11]. phac data represent a central repository for influenza case reports provided by each of canada's provinces and territories. onset dates represent best local estimates, and may be obtained differently in different jurisdictions. for example, the province of ontario, which comprises approximately one third of the population of canada, and where most spring influenza activity was concentrated, replaces missing onset dates using a hierarchical schema, whereby a missing onset date may be replaced with the date of specimen collection (if known), or with the date of specimen receipt by the provincial laboratory system if both the onset and specimen-collection dates are missing. data were accessed at different time points during the course of the "spring wave" (or herald wave) of the epidemic in may-july of 2009, whenever a new dataset was made available online by phac. by sequentially considering successive s-shaped segments of the epidemic curve, we estimate the maximum case number (k) and locate turning points, thus generating estimates for cumulative case numbers during each phase of the outbreak. the phac cumulative case data are then fitted to the cumulative case function c(t) in the richards model, with the initial time t0 = 0 being the date when the first laboratory-confirmed case was reported and the initial case number c0 = c(0) = 1 (the case number with onset of symptoms on that day). there were some differences between sequential epidemic curves in assigned case dates. for example, data posted by phac on may 20 indicated an initial case date of april 13, but in the june 3 data this had been changed to april 12, perhaps due to revision of the case date as a result of additional information.
model parameter estimates based on the explicit solution given earlier can be obtained easily and efficiently using any standard software with a least-squares approximation tool, such as sas or matlab. daily incidence data by onset date were posted by phac until june 26, after which date only the daily number of laboratory-confirmed hospitalized cases in canada was posted. for the purpose of comparison, we also fit the hospitalization data to the richards model in order to evaluate temporal changes in the number of severe (hospitalized) cases, which are assumed to be approximately proportional to the total case number. the case and hospitalization data used in this work are provided online as additional file 1. we fit the model to the daily datasets, acquired in real time, throughout the period under study. the least-squares parameter estimation could converge for either the single-phase or the 2-phase richards model. for the sake of brevity, only four of these model fits are presented in table 1 to demonstrate the difference in modeling results over time. the resulting parameter estimates with 95% confidence intervals (ci) (for turning point (ti), growth rate (r), and maximum case number (k)), the time period included in each model, and the time period when the dataset in question was accessed are presented in table 1. note that all dates in the tables are given by month/day. we also note that the ci's for r0 reflect the uncertainty in t as well as in the estimates for r, and do not reflect the error due to the model itself, which is always difficult to measure. in order to compare the 1-phase and 2-phase models, we also calculate the akaike information criterion (aic) [12] for the first, third, and fourth sets of data in table 1, where there is a model fit for the 2-phase model.
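the same least-squares fit is easy to reproduce outside sas or matlab; the sketch below fits the richards solution to a synthetic cumulative curve with scipy (the data and starting values are invented for illustration, not the phac counts):

```python
import numpy as np
from scipy.optimize import curve_fit

def richards(t, k, r, a, ti):
    # cumulative-case solution of the richards model
    return k * (1.0 + np.exp(-r * a * (t - ti))) ** (-1.0 / a)

# a synthetic "epidemic curve" generated from known parameters stands in
# for a real cumulative case series
t = np.arange(0, 80, dtype=float)
cases = richards(t, 1000.0, 0.25, 0.8, 35.0)

# p0 is the initial guess for (k, r, a, ti)
popt, pcov = curve_fit(richards, t, cases, p0=[800.0, 0.2, 1.0, 30.0])
k_hat, r_hat, a_hat, ti_hat = popt
```

with noiseless synthetic data the fit recovers the generating parameters; on real counts the diagonal of pcov supplies the variances behind confidence intervals such as those in table 1.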
the results, given in table 2, indicate that whenever there is a model fit for the 2-phase model, its aic value is always lower than that of the 1-phase model, and hence it compares favorably to the 1-phase model. parameter estimates fluctuate in early datasets, and the least-squares parameter estimations diverge within and between the 1-phase and 2-phase models in a manner that seems likely to reflect artifact. in particular, for the earliest model fits, using data from april 13 to may 15, the estimated reproductive number for the second phase is far larger than that obtained in the first phase, and than that obtained using a single-phase model, illustrating the pitfalls of model estimation using the limited data available early in an epidemic. estimates stabilize as the outbreak progresses, as can be seen with the final datasets (april 11 to june 5 and april 12 to june 19). for comparison, we plot the respective theoretical epidemic curves based on the richards model with the estimated parameters described above in figure 1. as noted above, the model can be used to estimate turning points (ti) and basic reproductive numbers (r0), if the generation time t is known. we used t = 1.91 days (95% ci: 1.30-2.71), as obtained in [2] by fitting an age-stratified mathematical model to the first recognized 2009 influenza a (h1n1) outbreak in la gloria, mexico. estimates are presented in table 1. we also conducted sensitivity analyses with r0# calculated based on the longer generation time (t = 3.6 (2.9, 4.3)) for seasonal influenza in [13] (see last column in table 1). excluding implausibly high estimates of r0 generated using initial outbreak data (april 13 to may 15), we obtain estimates of r0 for the 2-phase model that range between 1.31 and 1.96.
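for least-squares fits with gaussian errors, the aic comparison above reduces to a penalty on the residual sum of squares; a minimal sketch follows (the parameter counts of 4 and 8 for 1-phase and 2-phase fits, and the rss values, are assumptions for illustration):

```python
import math

def aic_least_squares(rss, n, k):
    # akaike information criterion for a least-squares fit:
    # n data points, k estimated parameters, residual sum of squares rss
    return n * math.log(rss / n) + 2 * k

# a 2-phase fit (more parameters) wins only if its rss drops enough
# to pay the 2*k complexity penalty -- lower aic is better
one_phase = aic_least_squares(rss=400.0, n=60, k=4)
two_phase = aic_least_squares(rss=150.0, n=60, k=8)
```

this is the sense in which table 2 favors the 2-phase model: its extra parameters are justified by a sufficiently better fit.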
inasmuch as the richards model analyzes the general trends of an epidemic (e.g., turning point, reproductive number, etc.), it can be used to fit any epidemiological time series for a given disease process, as long as the rate of change in the recorded outcome is proportional to changes in the true number of cases. as such, for comparison, we fit our model using the time series for 2009 ph1n1 hospitalizations in canada posted by phac on july 15 [11] (the last date these data were made available) (table 3). this time series was easily fit to a one-phase model (figure 2). further examples of using hospitalization or mortality data to fit the richards model can be found in [14]. we used the richards model, which permits estimation of key epidemiological parameters based on cumulative case counts, to study the initial wave of 2009 influenza a (h1n1) cases in canada. in most model fits, april 28-29 and may 4-7 were identified as early turning points for the outbreak, with a third and final turning point around june 3-5 in models based on longer time series. although this modeling approach was not able to detect turning points using some earlier datasets (e.g., those limited to the period from april 12 to may 27), in general the turning points identified were consistent across multiple models and time series. perhaps the most important divergence between models occurred with the detection of an april 29 turning point in the case report time series, but not in the time series based on hospitalized cases. we believe this may be attributable to the small number of hospitalizations, relative to cases, that had occurred by that date, as well as to the fact that hospitalization data only became available on april 18.
the turning point can correspond to the point at which disease control activities take effect (such that the rate of change in epidemic growth begins to decline) or can represent the point at which an epidemic begins to wane naturally (for example, due to seasonal shifts, or due to the epidemic having "exhausted" the supply of susceptibles such that the reproductive number of the epidemic declines below 1). this quantity has direct policy relevance; for example, in the autumn 2009 ph1n1 wave in canada, vaccination for ph1n1 was initiated at or after the turning point of the autumn wave due to the time taken to produce vaccine; as the epidemic was in natural decline at that point, the impact of vaccination has subsequently been called into question. although the richards model is able to capture the temporal changes in epidemic dynamics over the course of an outbreak, it does not define their biological or epidemiological basis. as such, determining the nature of these turning points requires knowledge of "events on the ground" for correlation. [table 1 notes: all dates in the tables are given by month/day; dates of posting are listed in parentheses; model duration indicates whether the fit is a 1-phase or 2-phase model; the maximum case number is rounded off to the nearest integer; r0# is obtained using the generation interval t = 3.6 (2.9, 4.3) for seasonal influenza [13].] [table 2: comparison of akaike information criterion (aic) values between 1-phase and 2-phase models for the time periods with a 2-phase model fit in table 1.] we suspect that the last turning point detected in the case data (table 1, last line) and the turning point detected in the 1-phase model using hospitalization data (june 11) reflect the same event; this lag in turning points would actually be expected, due to the time from initial onset of symptoms until hospitalization, which was reported to have an interquartile range of 2-7 days in a recent study from canada [15].
timelines for the 2-phase model for the case data of 4/12-6/19 and the 1-phase model for the hospitalization data are presented graphically in figure 3. in addition to identifying turning points, the richards model is useful for estimation of the basic reproductive number (r0) for an epidemic process, and our estimates derived using a richards model were consistent with estimates derived using other methods. for example, our r0 agrees almost perfectly with that of tuite et al., derived using a markov chain monte carlo simulation parameterized with individual-level data from ontario's public health surveillance system [16]. our estimate of r0 is smaller than that derived by fraser et al. [2] using mexican data, but such differences could relate in part to the different age distributions of the two countries [17], and may also reflect the fact that our estimate is obtained from canadian data at a national level, while the empirical mexican estimates were based on data from the town of la gloria, with only 1575 residents. most epidemic curves in the early stage of a novel disease outbreak have multiple phases or waves due to simple stochastic ("random") variation, mechanisms of disease importation, initial transmission networks and individual/community behavior changes, improvements in the performance of surveillance systems, or changes in case definitions as the outbreak response evolves.
however, changes in phase (signified by the presence of turning points identified using the richards model) may also pinpoint the timing of important changes in disease dynamics, such as effective control of the epidemic via vaccination or other control measures, depletion of disease-susceptible individuals (such that the effective reproductive number for the disease decreases to < 1), or the peak of a "seasonal" wave of infection, as occurs with seasonal influenza. compared with other approaches to estimating the reproduction number [4, 18, 19], some competing methods require more extensive and detailed data than are required to build a richards model, which requires only cumulative case data from an epidemic curve. as we also demonstrate here, the richards model produces fairly stable and credible estimates of reproductive numbers early in the outbreak, allowing these estimates to inform the evolving disease response: the estimates in table 1 derived using early case data accessed on may 20 closely approximate our final estimates (table 1, last row). thus, while early estimation with the richards model failed to correctly detect turning points or accurately estimate the final outbreak size, it was nonetheless useful for rapid estimation of r0 within a month of the first case occurrence in canada. as with any mathematical modeling technique, the approach presented here is subject to limitations, which include data-quality issues associated with real-time modeling (as data are often subject to ongoing cleaning, correction, and reclassification of onset dates as further data become available), reporting delays, and problems related to missing data (which may be non-random).
in our current study, the hierarchical approach used by canada's most populous province (ontario) for replacement of missing data could have had distorting effects on measured disease epidemiology: the replacement of missing onset dates with dates of specimen collection could have resulted in the artifactual appearance of early turning points identified by our model, due to limitations in weekend staffing early in the outbreak. if, as we believe to be the case, public health laboratories did not have sufficient emergency staffing to keep up with testing on weekends, such that weekend specimen log-ins declined sharply, this would have created the appearance of epidemic "fade out" on weekends. other factors that might distort the apparent epidemiology of disease include changes in guidelines for laboratory testing of suspected cases, and improved surveillance and public health alerts at later stages of the outbreak leading to increased case ascertainment or over-reporting of cases [20]. however, the quality of the time series will tend to improve with the duration of the epidemic, both because stochastic variation is "smoothed out", and because small variations become less important as the cumulative series becomes longer. we note that a further application of the richards model in the context of influenza would be comparison of the epidemiology of the 2009 influenza a h1n1 epidemic with past canadian epidemics, though such an endeavor is beyond the scope of the present study. in summary, we believe that the richards model provides an important tool for rapid epidemic modeling in the face of a public health crisis. however, predictions based on the richards model (and all other mathematical models) should be interpreted with caution early in an epidemic, when one needs to balance urgency with sound modeling.
at their worst, hasty predictions are not only unhelpful, but can mislead public health officials, adversely influence public sentiments and responses, undermine the perceived credibility of future (more accurate) models, and become a hindrance to intervention and control efforts in general. additional file 1: electronic supplementary material. 2009 canada novel influenza a(h1n1) daily laboratory-confirmed pandemic h1n1 case and hospitalization data.
references:
1. epidemic science in real time
2. pandemic potential of a strain of influenza a (h1n1): early findings
3. a flexible growth function for empirical use
4. how generation intervals shape the relationship between growth rates and reproductive numbers
5. sars epidemiology. emerging infectious diseases
6. real-time forecast of multi-wave epidemic outbreaks. emerging infectious diseases
7. richards model: a simple procedure for real-time prediction of outbreak severity
8. intervention measures, turning point, and reproduction number for dengue
9. turning points, reproduction number, and impact of climatological events on multi-wave dengue outbreaks
10. public health agency of canada: cases of h1n1 flu virus in canada
11. public health agency of canada: cases of h1n1 flu virus in canada
12. a new look at the statistical model identification
13. estimation of the serial interval of influenza
14. pandemic influenza a (h1n1) during winter influenza season in the southern hemisphere. influenza and other respiratory viruses
15. critically ill patients with 2009 influenza a(h1n1) infection in canada
16. estimated epidemiologic parameters and morbidity associated with pandemic h1n1 influenza
17. age, influenza pandemics and disease dynamics
18. comparative estimation of the reproduction number for pandemic influenza from daily case notification data
19. the ideal reporting interval for an epidemic to objectively interpret the epidemiological time course
20. initial human transmission dynamics of the pandemic (h1n1) 2009 virus in north america. influenza and other respiratory viruses
authors' contributions: yhh conceived the study, carried out the analysis, and wrote the first draft. df interpreted the results and revised the manuscript. jw participated in the analysis, the interpretation of results, and the writing. all authors read and approved the final manuscript. the authors declare that they have no competing interests. key: cord-048325-pk7pnmlo authors: hanley, brian title: an object simulation model for modeling hypothetical disease epidemics – epiflex date: 2006-08-23 journal: theor biol med model doi: 10.1186/1742-4682-3-32 sha: doc_id: 48325 cord_uid: pk7pnmlo background: epiflex is a flexible, easy-to-use computer model for a single computer, intended to be operated by one user who need not be an expert. its purpose is to study in silico the epidemic behavior of a wide variety of diseases, both known and theoretical, by simulating their spread at the level of individuals contracting and infecting others. to understand the system fully, this paper must be read in conjunction with study of the software and its results. epiflex is evaluated using results from modeling influenza a epidemics and comparing them with a variety of field data sources and other types of modeling. epiflex is an object-oriented monte carlo system, allocating entities to correspond to individuals, disease vectors, diseases, and the locations that hosts may inhabit. epiflex defines eight different contact types available for a disease. contacts occur inside locations within the model. populations are composed of demographic groups, each of which has a cycle of movement between locations. within locations, superspreading is defined by skewing of contact distributions.
results: epiflex indicates three phenomena of interest for public health: (1) r0 is variable, and the smaller the population, the larger the infected fraction within that population will be; (2) significant compression/synchronization between cities, by a factor of roughly 2, occurs between the early incubation phase of a multi-city epidemic and the major manifestation phase; (3) if better true morbidity data were available, more asymptomatic hosts would be seen to spread disease than we currently believe is the case for influenza. these results suggest that field research to study such phenomena, while expensive, should be worthwhile. conclusion: since epiflex shows all stages of disease progression, detailed insight into the progress of epidemics is possible. epiflex shows the multimodality and apparently random variation characteristic of real-world data, but does so as an emergent property of a carefully constructed model of disease dynamics; it is not simply a stochastic system. epiflex can provide a better understanding of infectious diseases and strategies for response. the most commonly used measure in public health, r0, is estimated from historical data and derived from sis/sir-type models (and descendents) for forward projection [14, 15]. r0 is the basic reproductive ratio: the number of individuals each infected person is going to infect [16]. r0 is often used on its own in public health as an indicator of epidemic probability; if r0 < 1 then an epidemic is not generally considered possible, while for r0 > 1, the larger the value, the more likely an epidemic is to occur. r0 is a composite value describing the behavior of an infectious agent. hence, r0 can be decomposed classically, for example, as r0 = p × d × c, where p is the probability of infection occurring for a contact, d is the duration of infectiousness, and c is the number of contacts [17].
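the classical decomposition and the threshold rule amount to a couple of lines; the sample values below are illustrative only:

```python
def r0_classical(p, d, c):
    # r0 = p * d * c: probability of infection per contact, duration of
    # infectiousness, and number of contacts (per unit time)
    return p * d * c

def epidemic_possible(r0):
    # the usual public-health threshold: an epidemic requires r0 > 1
    return r0 > 1.0
```

for example, with p = 0.05 per contact, d = 5 days of infectiousness, and c = 10 contacts per day, r0 = 2.5 and an epidemic is possible.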
however, r0 in the classical decomposition above, while it is one of the best tools we have, does not account for age segregation of response, existing immunity in the population, the network topology of infectious contacts, and other factors. these observations were significant in the motivation for developing epiflex. the epiflex model was designed to create a system that could incorporate as much realism as possible in an epidemic model, so as to enable emerging disease events to be simulated. there are limitations, described below in a separate section, but the model is quite effective as it stands. there are a variety of methods used for mathematical modeling of diseases. the most common of these are the sir (susceptible, infected, recovered) model of kermack and mckendrick [15], sis (susceptible, infected, susceptible), seir (susceptible, exposed, infected, recovered), and sirp (susceptible, infected, recovered, partially immune) as developed by hyman et al. [18] and further developed by hyman and laforce [19]. the sirp model was used as the starting point for development of the object model of epiflex. in sirp, the sir model is extended to include partial immunity (denoted by p) and the progressive decline of partial immunity, to allow influenza to be modeled more accurately. (see appendix.) there is a need for experimentation in more realistic discrete modeling, since the lattice type of discrete modeling is understood to skew in favor of propagation, as discussed by rhodes and anderson [20] and haraguchi and sasaki [21]. others such as eames and keeling [22] and edmunds et al. [12] have explored the use of networks to model interactions between infectable entities, and ferguson et al. [23] and others have called for more balance in realism for epidemiology models. since epiflex was completed, lloyd-smith et al.
[17] have shown the importance of superspreading in disease transmission for the sars epidemic. epiflex is designed to take these issues into account. there are known weaknesses in sis-descended models, some of which are discussed by hyman and laforce [14]. they suggested that a model dealing with demographics and their subgroups would be useful, and described a start toward conceiving such a model, creating a matrix of sirp flows for each demographic group within a "city" and modeling contacts between these groups. thus, the possibility of building an entirely discrete model using the object-oriented approach, essentially setting the granularity of the hyman-laforce concept at the level of the individual, together with the monte carlo method, was attractive. the object method of design seemed to be a good fit, since object-oriented programming was invented for discrete simulations [24]. an object-oriented (oo) design defines as its primitive elements "black box" subunits that have defined ways of interacting with each other [25]. the oo language concept was originally conceived for the simula languages [24] for the purpose of verifiable simulation. enforcement of explicit connections between objects is fundamental to oo design, whereas procedural languages such as fortran and cobol do not enforce them, because data areas can be freely accessed by the whole program. oo languages wrap data in methods for accessing the data. if each "black box" (i.e. object) has a set of specified behaviors, without the possibility of invisible, unnoticed interactions between them, then the simulation can potentially be validated by logical proof in addition to testing. (it would take an entire course to introduce oo languages and concepts, and there is not space to do so here. interested readers are suggested to start with an implementation of smalltalk. there are excellent free versions downloadable.
smalltalk also has an enthusiastic and quite friendly user community. see: http://www.smalltalk.org/main/.) the design of epiflex is described more completely in the appendix. design proceeded by establishing the definition of a disease organism as the cornerstone, then defining practical structures and objects for simulating the movement of a disease through populations. the disease object was assigned a set of definitions drawn from literature that would allow a wide spectrum of disease-producing organisms to be specified. the aim was to minimize the number of configuration parameters that require understanding of mathematical models. the hosts that are infected became the second primary object. a host lives and works in some area, where hosts are members of some demographic group, which together determine which of the n contact types they might have to spread an infectious disease. the hosts move about the area in which they live, between locations at which they interact. in epiflex, an area contains some configured number of locations, and locations are containers for temporary groups of hosts. since people travel between metro areas, the model supports linkages between areas to move people randomly drawn from a configurable set of demographic groups. the remainder of this section presents the disease model adopted, an overview of each component, an overview of program flow, and a description of the core methods. this is followed by discussion of results from the epiflex software system. this model has up to four stages during the infection cycle: the incubation, prodromal, manifestation, and chronic stages; to these is added a fatality phase. i have named this 'extended-sirp'. fig. 1 shows a diagram of this model. the model of fig. 1 allows us to track the different phases of the disease process separately, and to define variable infectiousness, symptoms, fatality, recovery and transition to chronic disease at each stage as appropriate.
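extended-sirp builds on the classic compartmental family described above; as a baseline for comparison, a minimal forward-euler sir step (not epiflex code -- the rates and step size here are illustrative) looks like this:

```python
def sir_step(s, i, r, beta, gamma, dt):
    # one forward-euler step of the kermack-mckendrick sir model,
    # with s, i, r as fractions of a closed population (s + i + r = 1)
    new_inf = beta * s * i * dt   # susceptible -> infected
    new_rec = gamma * i * dt      # infected -> recovered
    return s - new_inf, i + new_inf - new_rec, r + new_rec

def simulate(s0=0.99, i0=0.01, beta=0.4, gamma=0.2, dt=0.1, steps=2000):
    s, i, r = s0, i0, 0.0
    for _ in range(steps):
        s, i, r = sir_step(s, i, r, beta, gamma, dt)
    return s, i, r
```

extended-sirp replaces the single i compartment with the four staged sub-compartments and adds the p and f states described in the text.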
this allows us to model the progress of a disease in an individual more realistically. for diseases that have no identifiable occurrence of a particular stage, that stage can be set to length zero to bypass it entirely. the 8 contact types designed into epiflex are drawn from literature in an attempt to model the spread of infection more accurately. these contact types are: blood contact by needle stick, blood-to-mucosal contact, sexual intercourse, skin contact, close airborne, casual airborne, surface-to-hand-to-mucosa, and food contact. the probability of infection for a contact type is input by the user, as estimated from literature or based on hypothetical organism characteristics. durations of disease stages are chosen uniformly at random from a user-specified interval [r_low, r_high]. random numbers, denoted by ξ, on [0, 1] are used to seed the determination of the infected disease stage periods (denoted i_incubation, i_prodromal, i_manifestation, i_chronic). r_low and r_high are taken from medical literature and describe a range of days for each stage of an illness. these calculations are simply: d = ξ × (r_high − r_low) + r_low, where d is the number of days for a particular stage. (this may be extended in the future to include the ability to define a graph to determine the flatness of the distribution and the normative peak. this will make a significant difference in the modeling of diseases such as rabies, which can, under unusual circumstances, have a very long incubation.) one of three equations describing immunity decay is chosen, where l is the current level of partial immunity and p is the level of partial immunity specified as existing at the end of infection. random values on [0, 1] are then used to decide whether an infection occurs during the partial-immunity phase p shown in the chart above.
this decision compares the output of the immunity-level algorithm, l, which is a number on [0, 1], with the random value ξ. epiflex uses a dynamic network to model the interactions between hosts at a particular location, based on the skew provided and the demographic segments' movement cycles. the networks of contacts generated in this version of epiflex are not made visible externally; they can only be observed in their effects. (see: limitations of epiflex modeling.)

figure 1: extended-sirp disease model of epiflex. s: susceptible, i: infected, r: recovered, p: partially immune, f: fatality. extended-sirp breaks the infected stage i into four substages (i incubation , i prodromal , i manifestation , i chronic ) and adds a terminating fatality stage.

the network algorithms were carefully designed and tested at small scales, observing each element. a location describes a place, the activities that occur there, and the demographic groups that may be drawn there automatically. a location can have a certain number of cells, which are used to specify n identically behaving locations concurrently; this acts as a location repetition count within an area when the location is defined. the user sets an average number of hosts inhabiting each cell, and a maximum. a cell exchange fraction can also be specified to model hosts moving from cell to cell. the algorithm for allocating hosts to cells is semi-random: it puts hosts into randomly drawn cells in the location; if a drawn cell has reached the average, it does another random draw of a cell; if all cells are at the maximum, it overloads cells. interactions occur within a cell, so a host must be exchanged to another cell in order to be infective there. see the appendix section 'location component', and, with an open model, the way hospitals were defined. households are modeled at this time using a cell configuration.
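the semi-random cell-allocation scheme can be sketched roughly as follows; the single-redraw rule and the fallback to the least-loaded cell are assumptions filling in details the text leaves open, and the function name is illustrative:

```python
import random

def allocate_hosts(n_hosts, n_cells, avg, cap):
    """semi-random cell allocation, as described for epiflex locations:
    draw a random cell; if it already holds the average, redraw once; if
    the drawn cell is full, fall back to the least-loaded cell; only when
    every cell is at the maximum is a cell overloaded. the single-redraw
    and least-loaded fallback are assumptions, not epiflex internals."""
    cells = [0] * n_cells
    for _ in range(n_hosts):
        i = random.randrange(n_cells)
        if cells[i] >= avg:
            i = random.randrange(n_cells)  # one redraw past the average
        if cells[i] >= cap and any(c < cap for c in cells):
            i = min(range(n_cells), key=cells.__getitem__)
        cells[i] += 1  # overloads only when every cell is already at cap
    return cells
```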
epiflex is implemented with a monte carlo algorithm such that each host in a location is assigned a certain number of interactions according to the cauchy distribution parameter setting for that location. this distribution describes a curve with the y axis specifying the fraction of the maximum interactions for the location and x axis specifying the fractional ordinal within the list of hosts in the location. the distribution can be made nearly flat, or severely skewed with only a few actors providing nearly all contacts, as desired by the user of epiflex. note that the structure of the network formed also depends on what locations are defined, what demographic groups are defined for the population, and how demographic groups are moved between locations. each location has a maximum number of interactions specified per person, which is used as the base input. initially, a gaussian equation was used, but it was discarded in favor of a cauchy function since this better fits the needs of the skew function and computes faster. the algorithm iterates for each infectious host, and selects other hosts to expose to the infected party in the location, by a monte carlo function. this results in a dynamically allocated network of interactions within each location. the exposure cycle also makes use of monte carlo inputs. each location has a list of contact types that can take place at a particular location, and a maximum frequency of interactions. this interaction frequency determines how many times contacts that can spread a disease will be made, and the contact specification defines the fractional efficacy of infection by any specific route. modeling the effect of different types of contacts has been discussed in the literature, e.g. song et al. [26] . epiflex attempts to make a more generalized version. for each host infection source, target hosts are drawn at random from the location queue. 
a contact connection is established with the target as long as the contact allocation of that target has not already been used up. contact connections made to each target are tracked within the location to prevent over-allocation of contacts to any target. thus, for each randomly established connection, a value is set on both ends for the maximum number of connections that can be supported; once the maximum for either end of the link is reached, the algorithm searches for a different connection. the location algorithm is described below in more detail. the user specifies the maximum number of connections for a location; the output σ of a cauchy distribution function determines how many connections an individual will have. this allows variations in the degree of skewness for superspreading in a population to be modeled, which has been shown to be of critical importance by lloyd-smith et al. [17] . if p is the position in the queue and q is the number of hosts in the queue for the location, then x = p/q, where x denotes the proportional fraction of the queue for that position. the cauchy distribution function is evaluated at x using a constant k chosen for the location to express the skew of the distribution. if κ is the number of contacts for a particular host and κ max is the maximum number of contacts for any given host in the location, then κ is obtained by scaling κ max by the function's output σ. when hosts move from one location to another within the model, they tend to maintain a rough order of ordinal position. consequently, when there is a high σ for a location, a high-connection host in one location tends to be a high-connection host in another. this reflects real-world situations (though not perfectly), and corresponds better to reality than persistently maintaining high-connection individuals from location to location, since host behavior changes from place to place. the cauchy distribution function is fairly fast in execution.
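the skewed contact allocation can be sketched as follows. the exact cauchy-style kernel did not survive extraction, so the lorentzian 1/(1 + (x/k)²) below is a placeholder with the qualitative behavior described (nearly flat for large k, sharply skewed toward the head of the queue for small k); the function name is illustrative:

```python
def contacts_for_host(p, q, k, kappa_max):
    """map a host's queue position to its contact count. x = p/q is the
    fractional position in the location queue; a cauchy-style kernel
    turns it into a fraction of kappa_max. the lorentzian form below is
    a placeholder for the kernel, which was lost in extraction."""
    x = p / q
    sigma = 1.0 / (1.0 + (x / k) ** 2)
    return round(sigma * kappa_max)
```

with a small k, only hosts near the head of the queue receive many contacts, approximating super-spreader dynamics; with a large k, contacts are distributed nearly uniformly.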
the function can be used to approximate the often radical variations seen in epidemiology studies; as an extreme example, one active super-spreader individual might infect large numbers of people, when infecting one or even zero is typical [17] . this type of scale-free network interaction has been explored by chowell and chavez [27] . the cauchy function allows networks to be generated dynamically within each type of location in a very flexible manner, for instance corresponding to super-spreader dynamics [17] . in addition to the specification of skew within a location, the network of contacts is also defined by (a) what locations are present and (b) the movement cycles defined for each demographic group within the model. processing time increases with population. this slowing is an expected characteristic of an object modeling system and is the price paid for the discrete detail of the epiflex model. the primary source of this increase in processing time is the sum of the series of possible infectious events that are modeled for each iteration. it therefore scales as a series sum, not as a log, based on the contagiousness of the disease and the number of potential hosts in a location with an infected host. this is minimized by only processing infectious host contacts. the increase stems from the characteristics of networks in which each node has n connections to other nodes. when iteration is done for a location containing infectable hosts, it is the number of infected hosts that creates an element of the series. the infected hosts are put into a list, and each one interacts randomly with other hosts (including other infected ones) in the location. thus, considered as a network with m nodes, each of the m nodes is a host, and a temporary connection to another host is made to n other nodes.

we compare the 1-best and the n(> 1)-best accuracy curves in the single model.
one may naturally expect and hypothesize that the 1-best and the n(> 1)-best accuracy curves behave similarly, but no previous retrosynthesis studies have examined, or even discussed, this hypothesis. we showed in this study that the hypothesis does not hold for single-model training on uspto-50k, which is the default training strategy in previous studies. this observation, that the 1-best and the n-best accuracy curves may be totally different, is especially important when trying to verify the top-n result experimentally (not in a computational experiment, but in an actual experiment that is both time-consuming and expensive). one possible reason for this difference between the 1-best and the n-best accuracy is the form of the objective function: the maximum-likelihood objective l updates the parameter θ so as to maximize the 1-best accuracy on the training set, not the n-best accuracy. it is also remarkable that the two curves of the pre-training plus fine-tuning approach (4th row) are not significantly different, but rather very similar. the trained model quickly hits the peaks of the 1-best and the 20-best accuracy around 10k iterations. after hitting the peaks, both curves decrease rapidly: the model gradually forgets the beneficial knowledge transferred from the augment dataset as training continues, and converges to the single-train model after many iterations. it is therefore important to detect the peaks and early-stop the fine-tuning. in our experiments, validation-score monitoring is always successful in identifying this peak, yielding good scores for all top-n accuracies in tables 1 and 3. the fact that the two curves of the pre-training plus fine-tuning approach are very similar is another advantage of data transfer in retrosynthesis. the joint training (2nd row) also shows similar curves in test scores, but the pre-training plus fine-tuning approach is better in two respects: the final accuracy scores (table 1) and the number of iterations (roughly between 70k and 250k) required to achieve the best-val-score snapshot. it may seem strange that the 1-, 3-, and 5-best accuracy scores of the single model are lower than the known results in (karpov et al., 2019). in our experiment, the best-val-score snapshots of the single model are always chosen from the earlier iterations (fewer than 10,000), resulting in low 1-best scores of the single model compared to the known results (karpov et al., 2019). according to figure 2, the 1-best accuracy would exceed the known score if we chose the snapshot at, e.g., 140,000 iterations; however, that snapshot achieves worse 20-best accuracy than the best-val-score snapshot. next, we examine the predicted reactant smiles to see how the data transfer contributes to the improvement of computational retrosynthesis prediction from the perspective of actual chemical experiments. we evaluate 428 test samples (x j , y j ) ∈ d t test whose 50-best predictions ŷ j do not include the "gold" reactant smiles string y j . we note again that the gold reactant set y j is not the sole correct retrosynthesis prediction, but current studies adopt this "hard" criterion. during the inspection, we realized that the uspto-50k test dataset still contains a number of errors (mislabels) despite the curating efforts of the original authors (lowe, 2012). therefore we exclude these mislabeled samples from this in-depth analysis. heterocycle formation reactions are known to be difficult to predict with a single training of a seq-to-seq model (liu et al., 2017): the heterocycle formation reaction not only forms some new bonds but also aromatizes some atoms, which changes many capital letters in the smiles string to lowercase. therefore, it is expected to be difficult to learn the atom-to-atom mapping from small training samples. figure 3 shows examples that fail in the single model but succeed in the transfer model.
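the top-n exact-match criterion and the validation-score early stopping described above can be sketched as follows; the function names and the toy smiles strings are illustrative, not uspto-50k data:

```python
def top_n_accuracy(golds, predictions, n):
    """fraction of samples whose gold string appears among the model's
    top-n candidates (the 'hard' exact-match criterion used above)."""
    hits = sum(1 for gold, preds in zip(golds, predictions) if gold in preds[:n])
    return hits / len(golds)

def best_snapshot(val_scores):
    """validation-score monitoring: return the iteration whose snapshot
    scored highest on the validation set (used to early-stop fine-tuning)."""
    return max(val_scores, key=val_scores.get)

# toy example with hypothetical predictions
golds = ["CCO", "c1ccccc1"]
preds = [["CCO", "CCN"], ["CCN", "c1ccccc1"]]
acc1 = top_n_accuracy(golds, preds, 1)  # 0.5: only the first sample hits
acc2 = top_n_accuracy(golds, preds, 2)  # 1.0: both hit within the top-2
```

as the text notes, the snapshot maximizing the 1-best score need not maximize the 20-best score, so the validation metric used for `best_snapshot` must match the accuracy one actually cares about.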
significant improvements are observed in the prediction of complicated reactions such as heterocyclic reactions (1st row), suggesting that pre-training plus fine-tuning with a large augment dataset enables the model to learn such large changes. the diels-alder reaction (2nd row) is another example of a complicated reaction, where two c-c bonds are formed and some atom and bond types change. in general, the single model is not good at generating ring-formation reactions and only returns odd answers, while the transferred model is able to return correct answers to such reactions.

figure 3: examples of predictions enabled by pre-training and fine-tuning. the top-1 predictions of the single model are reasonable, but the model does not predict reactants for constructing a ring. the 1st row (a) shows an example of heterocycle formation (rx 4): synthesis of an n-substituted pyrrole from a 1,4-diketone and an alkylamine; the single model does not correctly propose reactants that build a pyrrole ring. the 2nd row (b) shows an example of c-c bond formation (rx 3): synthetic organic chemists easily come up with the diels-alder reaction when they see the bicyclo[2.2.1]hept-2-ene ring system, but the single model does not predict any diels-alder reaction within the top-50.

figure 4 shows difficult examples that could not be predicted correctly even by our best model. if there are multiple similar substituents in a compound, the transferred model sometimes chooses the wrong one. in another case, our model fails to generate a valid smiles string. this indicates that the augment dataset still does not contain a sufficient number of reactions for polycyclic aromatic hydrocarbons. more data augmentation would be needed to prevent this, but such augmentation may be difficult as these chemical groups are rare.
when we examined the predicted results in light of our expertise in synthetic organic chemistry, fewer than 0.5% of the top-1 reactions were found to be wrong, and in fewer than 0.2% of the cases was no organically correct reaction output at all. this is a difficulty of the evaluation of retrosynthesis: there are multiple reasonable (appropriate) hypotheses for reactant predictions, and the n-best accuracy does not perfectly match the problem. at the same time, it is surprising that our model achieves such high accuracy without using domain knowledge or a graph representation of compounds.

references:
- machine-learning driven drug repurposing for covid-19 (arxiv)
- learning to make generalizable and diverse predictions for retrosynthesis
- no electron left behind: a rule-based expert system to predict chemical reactions and reaction mechanisms
- cogmol: target-specific and selective drug design for covid-19 using deep generative models
- computer-assisted design of complex organic syntheses
- retrosynthesis prediction with conditional graph logic network
- pre-training of deep bidirectional transformers for language understanding (arxiv)
- rich feature hierarchies for accurate object detection
- revisiting self-training for neural sequence generation
- deep residual learning for image recognition
- smiles transformer: pre-trained molecular fingerprint for low data drug discovery (arxiv)
- predicting organic reaction outcomes with weisfeiler-lehman network
- junction tree variational autoencoder for molecular graph generation
- a transformer model for retrosynthesis
- self-training for end-to-end speech recognition
- adam: a method for stochastic optimization
- opennmt: open-source toolkit for neural machine translation
- do better imagenet models transfer better?
- imagenet classification with deep convolutional neural networks
- grammar variational autoencoder
- albert: a lite bert for self-supervised learning of language representations
- open-source cheminformatics
- biobert: a pre-trained biomedical language representation model for biomedical text mining
- inductive transfer learning for molecular activity prediction: next-gen qsar models with molpmofit
- retrosynthetic reaction prediction using neural sequence-to-sequence models
- fully convolutional networks for semantic segmentation
- extraction of chemical structures and reactions from the literature
- chemical reactions from us patents (1976-sep2016)
- transfer learning enables the molecular transformer to predict regio- and stereoselective reactions on carbohydrates
- language models are unsupervised multitask learners
- what's what: the (nearly) definitive guide to reaction role assignment
- probability of error of some adaptive pattern-recognition machines
- improving neural machine translation models with monolingual data
- a graph to graphs framework for retrosynthesis prediction
- very deep convolutional networks for large-scale image recognition
- learning graph models for template-free retrosynthesis
- attention is all you need
- discovering chemistry with an ab initio nanoreactor
- smiles-bert: large scale unsupervised pre-training for molecular property prediction
- neural networks for the prediction of organic chemistry reactions
- smiles, a chemical language and information system. 1. introduction to methodology and encoding rules
- self-training with noisy student improves imagenet classification
- towards good practices on building effective cnn baseline model for person re-identification
- xlnet: generalized autoregressive pretraining for language understanding
- rethinking pre-training and self-training

key: cord-162772-5jgqgoet authors: viguerie, alex; lorenzo, guillermo; auricchio, ferdinando; baroli, davide; hughes, thomas j.r.; patton, alessia; reali, alessandro; yankeelov, thomas e.; veneziani, alessandro title: simulating the spread of covid-19 via spatially-resolved susceptible-exposed-infected-recovered-deceased (seird) model with heterogeneous diffusion date: 2020-05-11 journal: nan doi: nan sha: doc_id: 162772 cord_uid: 5jgqgoet

we present an early version of a susceptible-exposed-infected-recovered-deceased (seird) mathematical model based on partial differential equations coupled with a heterogeneous diffusion model. the model describes the spatio-temporal spread of the covid-19 pandemic, and aims to capture dynamics also based on human habits and geographical features. to test the model, we compare the outputs generated by a finite-element solver with measured data over the italian region of lombardy, which has been heavily impacted by this crisis between february and april 2020. our results show a strong qualitative agreement between the simulated forecast of the spatio-temporal covid-19 spread in lombardy and epidemiological data collected at the municipality level. additional simulations exploring alternative scenarios for the relaxation of lockdown restrictions suggest that reopening strategies should account for local population densities and the specific dynamics of the contagion. thus, we argue that data-driven simulations of our model could ultimately inform health authorities to design effective pandemic-arresting measures and anticipate the geographical allocation of crucial medical resources.
the outbreak of covid-19 in 2020 has caused widespread disruption throughout the world, leading to substantial damage in terms of both human lives and economic cost. to arrest the spread of the disease, governments have enacted unprecedented measures, including quarantines, curfews, lockdowns, and suspension of travel. the wide-reaching ramifications of such measures, deemed by many experts as necessary, are driven in part by a lack of clear information about the spatio-temporal spread of covid-19. indeed, the absence of reliable data regarding disease transmission has necessarily led to cautious responses. these recent events, which have required important decisions based on forecasts, have demonstrated more than ever the need for reliable tools intended to model the spread of covid-19 and other infectious diseases [15] . a particularly urgent need is the geo-localization of outbreaks, as this may allow a more effective allocation of medical resources. several notable models of this outbreak have been presented; indeed, at the time of this writing there are over 1,000 covid-19 articles on medrxiv, many of which address the modeling of disease spread. some models aim at offering specific evaluations of policy responses based on the implementation of different social distancing measures with combined compartmental and empirical approaches [5] [6] [7] . rather than adopting a deterministic, mechanism-based model, zhang et al. employed a statistical approach to analyze the spatio-temporal dynamics of covid-19 [17] . in gatto et al. a combined statistical and compartmental approach was employed, in which spatial dependence is addressed by dividing the region of interest (italy) into local communities connected by a network structure [6] . here, we propose an alternative approach, using a partial-differential-equation (pde) model designed to capture the continuous spatio-temporal dynamics of covid-19. 
we leverage a compartmental seird (susceptible, exposed, infected, recovered, deceased ) model that incorporates the spatial spread of the disease with inhomogeneous diffusion terms [8, 9, 11, 12] . the rationale is that the diffusion operator, properly tuned to account for local natural or social inhomogeneities (e.g., mountains, rivers, highways) may describe the local movement of the different populations in a deterministic way, as the limit of a brownian motion [16] . this is critical to accurately account for information relevant to the outbreak dynamics, such as local population densities, which vary in space and time. while a mathematical description of non-local dynamics is still possible in terms of fractional differential operators [10] , we postpone this approach to a follow-up of the present work. hence, our modeling approach is more appropriate for the local dynamics on mesoscales, such as regions within italy. thus, to evaluate the model efficacy, we run a simulation study of the covid-19 outbreak in the italian region of lombardy, which has been severely impacted by the covid-19 crisis between february and april of 2020 and for which the necessary data was available. also, the high density of lombardy's population and transportation network is specifically suitable to our modeling approach. our simulations show a remarkable qualitative agreement with the reported epidemiological data. we further explore various reopening scenarios, obtaining contrasting results that highlight the importance of considering local population densities and contagion dynamics. the paper outline is as follows. in section 2, we describe the seird model. then, section 3 addresses the numerical implementation of our model and section 4 presents the results of the simulation study in lombardy. we conclude in section 5 by examining the shortcomings observed from our simulations and discussing the additional work required to improve model accuracy and practical relevance. 
let ω ⊂ r 2 be a simply connected domain of interest and [0, t ] a generic time interval. we denote the densities of the susceptible, exposed, infected, recovered and deceased populations as s(x, t), e(x, t), i(x, t), r(x, t), and d(x, t), respectively. also, let n(x, t) denote the sum of the living population, i.e., n(x, t) = s(x, t) + e(x, t) + i(x, t) + r(x, t). then, our model comprises the following system of coupled pdes over ω × [0, t ]:

∂ t s = αn − (1 − a/n)β i si − (1 − a/n)β e se − µsn + ∇ · (n ν s ∇s) (1)

∂ t e = (1 − a/n)β i si + (1 − a/n)β e se − σe − φ e e − µen + ∇ · (n ν e ∇e) (2)

∂ t i = σe − φ d i − φ r i − µin + ∇ · (n ν i ∇i) (3)

∂ t r = φ r i + φ e e − µrn + ∇ · (n ν r ∇r) (4)

∂ t d = φ d i (5)

where α is the birth rate, σ is the inverse of the incubation period, φ e is the asymptomatic recovery rate, φ r is the infected recovery rate, φ d is the infected mortality rate, β e is the asymptomatic contact rate, β i is the symptomatic contact rate, µ is the general (non-covid-19) mortality rate, and ν s , ν e , ν i , and ν r are diffusion parameters corresponding to the respective population groups. each of these parameters may depend on time, space, or the model compartments. we also consider the allee effect (depensation), characterized by the parameter a, which enters the transmission terms through the factor (1 − a/n). in this particular setting, the allee effect serves to model the tendency of outbreaks to cluster towards large population centers. specific parameter selection as well as initial and boundary conditions are discussed in section 3. fig. 1 shows the dynamics of contagion between the compartments of our model. we remark that our model accounts for asymptomatic transmission, which is considered a pivotal driver of the covid-19 pandemic [3, 7, 14] . eqs. (1)-(2) show that exposed asymptomatic patients may transmit covid-19 to susceptible individuals at contact rate β e . this aligns with recent studies suggesting that patients may transmit covid-19 almost immediately after exposure [3, 7, 14] . additionally, eqs. (2) and (4) involve a fraction φ e of exposed patients that do not develop symptoms and move directly into the recovered population.
we also assume that recovered patients are immune, as we do not include any backflow from eq. (4) to eq. (1). (we note that this is a current source of debate, but it is consistent with the existing literature for the time scale of months considered here [13] .) the spatial movement over a large population is described by an inhomogeneous random walk, which in the limit tends to a second-order differential operator [16] . the diffusivity coefficient is proportional to the population and can be locally adjusted to incorporate geographical or human-related inhomogeneities [9] . we use a finite-element spatial discretization of the italian region of lombardy, consisting of an unstructured mesh containing 30,407 triangles (a mesh convergence analysis was first performed; data not shown). we use the backward-euler method for time integration and solve each time step fully implicitly with a picard iteration for stability. the resulting linear systems are solved by the gmres algorithm using a jacobi preconditioner. initial conditions for the subpopulations s, e, i, r, and d in the model are defined by means of gaussian circular functions centered at the latitude and longitude coordinates of each municipality with 10,000 or more inhabitants, weighted by the municipality's population size and geographic area. these initial conditions correspond to the data provided by lab24 [1] for 27 february 2020, featuring a severe outbreak in the province of lodi, and moderate numbers of exposed and infected individuals in the provinces of bergamo, brescia, and cremona (see fig. 2 ). we used homogeneous neumann boundary conditions for simplicity, mimicking a complete isolation of the region. we acknowledge that this is likely unrealistic (e.g., if lockdown orders are relaxed) and will revisit it in future efforts. we assume σ = 1/7 day −1 , φ r = 1/24 day −1 , φ d = 1/160 day −1 , and φ e = 1/6 day −1 .
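the fully implicit time stepping with picard iteration described above can be illustrated on a scalar model problem u' = f(u); this is a minimal sketch, not the actual finite-element solver, and the tolerance and test problem are illustrative:

```python
def backward_euler_picard(f, u0, dt, n_steps, tol=1e-10, max_iter=100):
    """solve u' = f(u) with backward euler, resolving the implicit
    relation u_new = u_old + dt * f(u_new) by picard (fixed-point)
    iteration at each time step."""
    u = u0
    traj = [u]
    for _ in range(n_steps):
        v = u                        # picard initial guess: previous step
        for _ in range(max_iter):
            v_next = u + dt * f(v)
            if abs(v_next - v) < tol:
                break
            v = v_next
        u = v_next
        traj.append(u)
    return traj

# model problem: exponential decay u' = -u over one time unit
traj = backward_euler_picard(lambda u: -u, 1.0, 0.1, 10)
```

for this linear problem the picard fixed point at each step is exactly the backward-euler update u_new = u_old / (1 + dt), and the iteration converges because the map is a contraction for small dt.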
these values were based on available data from the literature regarding the mortality, incubation period, and recovery time for infected and asymptomatic patients [2, 3, 5, 7, 14] . additionally, we do not consider births or non-covid-19 mortality (i.e., we set α = 0 and µ = 0, respectively), given the time scale of months in our simulations. the remainder of the model parameters are estimated in a two-step approach. first, we fit a 0d seird version of our model (i.e., consisting of a system of exclusively time-dependent ordinary differential equations and no diffusion terms) to match the temporal dynamics of the outbreak. then, we iteratively refined these values by means of recursive simulations using eqs. (1)-(5) to match the spatiotemporal epidemiological data. we use the r 2 coefficient and the root mean squared error (rmse) to assess the goodness-of-fit. given the uncertainty in the currently available covid-19 data, we think that parameter estimation aiming at matching the dynamics of all model compartments is not viable. as not every member of the population is tested for infection and asymptomatic cases are known to exist in possibly large numbers, we think that the available data of infected cases might lead to unrealistic parameter fitting. conversely, the data reported for covid-19 deaths offer more reliability to calibrate the model parameters. therefore, we pursue quantitative agreement in the deceased compartment (i.e., d), and qualitative agreement for the rest of the model subgroups (i.e., s,e,i,r). in section 4, our results will focus on the model forecasts of exposed and infected cases because these are key data for public health officials, e.g., in deciding resource allocation and measures to prevent contagion. we assume β i and β e to be equal, as precise estimates on the relative infectivity levels between the symptomatic and asymptomatic pools are unclear [3] . 
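the 0d seird system used in the first calibration step (the pde model with the diffusion terms dropped and, as in our simulations, α = µ = 0) can be sketched with a simple forward-euler integrator; the rate constants below are those quoted in the text, while β and the initial pools are illustrative:

```python
def seird_0d(s, e, i, r, d, beta_e, beta_i, sigma, phi_e, phi_r, phi_d,
             a, dt, n_steps):
    """forward-euler sketch of the 0d seird model: the pde system with
    diffusion dropped and alpha = mu = 0. the allee factor (1 - a/n)
    multiplies the transmission terms."""
    traj = [(s, e, i, r, d)]
    for _ in range(n_steps):
        n = s + e + i + r
        allee = max(0.0, 1.0 - a / n)
        new_exposed = allee * (beta_i * s * i + beta_e * s * e)
        ds = -new_exposed
        de = new_exposed - sigma * e - phi_e * e
        di = sigma * e - (phi_r + phi_d) * i
        dr = phi_r * i + phi_e * e
        dd = phi_d * i
        s, e, i, r, d = (s + dt * ds, e + dt * de, i + dt * di,
                         r + dt * dr, d + dt * dd)
        traj.append((s, e, i, r, d))
    return traj

# rates from the text; beta values and initial pools are hypothetical
traj = seird_0d(s=1.0e6, e=100.0, i=10.0, r=0.0, d=0.0,
                beta_e=3.3e-4, beta_i=3.3e-4, sigma=1/7,
                phi_e=1/6, phi_r=1/24, phi_d=1/160,
                a=1000.0, dt=1e-3, n_steps=1000)
```

a fit of this reduced model to the death data gives starting values that the full spatio-temporal simulations then refine, as described above.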
we define β i,e with decreasing piecewise-constant values in time to model the escalation of the lockdown restrictions. following the results of parameter calibration, we initially set β i,e = 3.3 · 10 −4 contacts −1 ·day −1 on 27 february 2020, reducing this to β i,e = 8.5 · 10 −5 contacts −1 ·day −1 after the first lockdown measures on 9 march 2020, to β i,e = 6.275 · 10 −5 contacts −1 ·day −1 after the additional restrictions on 22 march 2020, and to β i,e = 4.125 · 10 −5 contacts −1 ·day −1 following the final restrictions on 28 march 2020. similarly, we assume ν s,e,r = 0.0435, 0.0198, 0.0090, and 0.0075 km 2 ·day −1 over the respective phases. the allee term a is set to 1,000 individuals·km −2 , and we fix ν i = 1.0 · 10 −4 km 2 ·day −1 throughout, assuming that symptomatic individuals are largely immobile.

lombardy

fig. 2 shows the evolving spatial pattern of the covid-19 outbreak in lombardy, beginning with exposure in bergamo, brescia, cremona and lodi. the contagion moves north from lodi into milan via the southern suburbs and eventually reaches the city center. we note that although lodi and cremona are the most affected areas at the onset of the outbreak, they quickly improve and avoid the explosive growth found in milan, bergamo and brescia. this is consistent with the reported data [1] . however, we found cremona to be somewhat underpredicted by our model. this might be attributable to the presence of the neighboring city of piacenza, which shares its metropolitan area with cremona but is not included in our simulations because it belongs to the adjacent region of emilia-romagna. in fig. 2 , we demonstrate the remarkable qualitative agreement in the outbreak dynamics between our model forecasts and the data in the three main affected areas: milan, bergamo, and brescia. the r 2 values between the model forecasts and the data of infected cases are 0.997, 0.977, 0.976, and 0.998 for all of lombardy, bergamo, brescia, and milan, respectively.
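the piecewise-constant transmission rate used above can be encoded as a simple date lookup; the dates and values are those quoted in the text, while the function and table names are illustrative:

```python
import datetime as dt

# piecewise-constant beta_{i,e} over the lockdown phases quoted in the
# text (units: contacts^-1 day^-1)
PHASES = [
    (dt.date(2020, 2, 27), 3.3e-4),    # pre-lockdown
    (dt.date(2020, 3, 9),  8.5e-5),    # first lockdown measures
    (dt.date(2020, 3, 22), 6.275e-5),  # additional restrictions
    (dt.date(2020, 3, 28), 4.125e-5),  # final restrictions
]

def beta_ie(day):
    """return the beta_{i,e} value in force on a given date: the value
    attached to the latest phase start not after `day`."""
    value = PHASES[0][1]
    for start, b in PHASES:
        if day >= start:
            value = b
    return value
```

the diffusion parameters ν s,e,r follow the same phase schedule and could be tabulated analogously.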
we observe that the outbreak emerges later in milan, where it grows more steadily, eventually becoming the most affected area in lombardy. we also note that the lockdowns appear to have effectively halted the spread in bergamo and brescia. these restrictions notably reduce the spread in milan, limiting the virus to a linear growth pattern, but fail to stop it. we observe that our simulations predict a larger number of infections than the reported data. this results from using the data for deceased cases for calibration, which are comparatively more accurate than those for infections (see section 3). we obtained r 2 = 0.972 and range-normalized rmse = 7.6% for this subgroup. thus, the difference between predicted and measured infections suggests underreporting of real cases, probably due to the deficiencies and difficulties of testing a significant sample of the whole living population. however, we also remark that covid-19 mortality data depend on the currently unknown transmission rates, which emphasizes the importance of qualitative agreement for testing novel modeling approaches. to this end, we show that it is possible to rescale our simulation results to accurately match the order of magnitude of the reported infected case data (fig. 2) , though we emphasize that this is purely for visualization purposes. we further use our model to assess four illustrative reopening scenarios over the four months following 27 february 2020: maintenance of restrictions, relaxation of the lockdown everywhere on 3 may 2020 under two different sets of assumptions, and a combination of maintained restrictions in milan and relaxation elsewhere in lombardy. we still consider the changes in parameter values induced by the sequential restrictions (see section 3), and the lockdown relaxation is modeled by setting ν s,e,r = 2.175 · 10 −2 km 2 ·day −1 and β i,e = 9.0 · 10 −5 contacts −1 ·day −1 (scenario a), or β i,e = 6.6 · 10 −5 contacts −1 ·day −1 (scenario b).
scenario a is pessimistic, assuming that the population contact rate returns to levels similar to the early outbreak. scenario b is more optimistic, assuming that greater public awareness of preventative measures (such as mask-wearing and social distancing) translates into greater success in limiting contact, despite increased mobility. fig. 3 shows the resulting outbreak dynamics for these four reopening scenarios. our simulations suggest that relaxing the lockdown restrictions in the entire region may cause severe and rapid growth in the milan area. however, major urban zones far from milan (e.g., brescia and bergamo) experience only a marginal increase in growth and still show a favorable trend in time. conversely, if we maintain the lockdown restrictions in milan and relax them elsewhere, the outbreak shows more favorable dynamics, similar to those obtained for brescia and bergamo. thus, our results suggest that maintaining lockdown measures in high-population, high-density areas like milan may be necessary for longer times to effectively arrest the spread of contagious diseases like covid-19. we have introduced a compartmental pde model describing the spatio-temporal propagation of disease contagion and applied it specifically to the 2020 outbreak of covid-19 in lombardy. our simulations are intended as a proof of concept of the potential of pdes for regional modeling of the outbreak. nonetheless, they show good qualitative agreement with reality, accurately predicting the outbreak dynamics in different areas and recreating the transmission path in time and space. we then used the model to examine some possible reopening scenarios, which suggested that reopening may be best determined by local population and contagion dynamics rather than by a one-size-fits-all approach. our model is at a very early stage, with ample room for improvement.
we plan to consider non-constant model parameters and to update them adaptively according to measured data using data-assimilation procedures [4]. indeed, as more reliable data become available, we can extend parameter calibration to fit additional model compartments beyond the deceased subgroup. boundary conditions can also be defined in a more realistic manner, e.g., by including 0d seir models describing the fluxes with respect to neighboring regions. additionally, we used population-dependent diffusion terms, but ideally these could also account for geographical features (e.g., rivers, mountains, roadways, and railways) [9]. non-local effects like those modeled by fractional operators can also be included [10]. these considerations may be crucial when using the model in larger geographical domains. we would also like to extend our framework to more sophisticated compartmental models including, e.g., hospitalizations, patients in intensive care units, or age and biological sex structures [6, 7]. this would further increase the utility of the model, potentially helping decision-makers determine the allocation of resources among different areas. the present results clearly support the current standpoint of virologists, emphasizing the need for restrictions. finally, the socio-economic costs of lockdowns are not included here, but could ultimately be incorporated in future quantitative analyses, e.g., aiming at the comprehensive optimization of pandemic-arresting measures.
incubation period of 2019 novel coronavirus (2019-ncov) infections among travelers from wuhan, china
serial interval of covid-19 among publicly reported confirmed cases
the ensemble kalman filter
impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
spread and dynamics of the covid-19 epidemic in italy: effects of emergency containment measures
modelling the covid-19 epidemic and implementation of population-wide interventions in italy
partial differential equations in ecology: spatial interactions and population dynamics
numerical simulation of a susceptible-exposed-infectious space-continuous model for the spread of rabies in raccoons across a realistic landscape
modeling the dynamics of novel coronavirus (2019-ncov) with fractional derivative
galerkin methods for a model of population dynamics with nonlinear diffusion
a numerical method for spatial diffusion in age-structured populations
positive rt-pcr test results in patients recovered from covid-19
serial interval of novel coronavirus (covid-19) infections
covid-19 and italy: what next? the lancet
partial differential equations in action, from modeling to theory
comparison of the spatiotemporal characteristics of the covid-19 and sars outbreaks in mainland china. medrxiv
the authors would like to acknowledge the work of marco demarziani, luigi greco, isabella atcha, kasey cervantes, sanne glastra, shreya rana, stefano minelli, chiara macchello, simona petralia, anita de franco, and martina moschella for their crucial help with data acquisition.
key: cord-130240-bfnav9sn authors: friston, karl j.; parr, thomas; zeidman, peter; razi, adeel; flandin, guillaume; daunizeau, jean; hulme, oliver j.; billig, alexander j.; litvak, vladimir; moran, rosalyn j.; price, cathy j.; lambert, christian title: dynamic causal modelling of covid-19 date: 2020-04-09 journal: nan doi: nan sha: doc_id: 130240 cord_uid: bfnav9sn this technical report describes a dynamic causal model of the spread of coronavirus through a population. the model is based upon ensemble or population dynamics that generate outcomes, like new cases and deaths over time. the purpose of this model is to quantify the uncertainty that attends predictions of relevant outcomes. by assuming suitable conditional dependencies, one can model the effects of interventions (e.g., social distancing) and differences among populations (e.g., herd immunity) to predict what might happen in different circumstances. technically, this model leverages state-of-the-art variational (bayesian) model inversion and comparison procedures, originally developed to characterise the responses of neuronal ensembles to perturbations. here, this modelling is applied to epidemiological populations to illustrate the kind of inferences that are supported and how the model per se can be optimised given timeseries data. although the purpose of this paper is to describe a modelling protocol, the results illustrate some interesting perspectives on the current pandemic; for example, the nonlinear effects of herd immunity that speak to a self-organised mitigation process. the purpose of this paper is to show how dynamic causal modelling can be used to make predictions, and test hypotheses, about the ongoing coronavirus pandemic (wu et al., 2020; zhu et al., 2020). it should be read as a technical report, written for people who want to understand what this kind of modelling has to offer (or just build an intuition about modelling pandemics).
it contains a sufficient level of technical detail to implement the model using matlab (or its open source version octave), while explaining things heuristically for non-technical readers. the examples in this report are used to showcase the procedures and subsequent inferences that can be drawn. having said this, there are some quantitative results that will be of general interest. these results are entirely conditional upon the model used. dynamic causal modelling (dcm) refers to the characterisation of coupled dynamical systems in terms of how observable data are generated by unobserved (i.e., latent or hidden) causes (friston et al., 2003; moran et al., 2013). dynamic causal modelling subsumes state estimation and system identification under one bayesian procedure, providing probability densities over unknown latent states (i.e., state estimation) and model parameters (i.e., system identification), respectively. its focus is on estimating the uncertainty attached to these estimates, in order to quantify the evidence for competing models and the confidence in various predictions. in this sense, dcm combines data assimilation and uncertainty quantification within the same optimisation process. specifically, the posterior densities (i.e., bayesian beliefs) over states and parameters, and the precision of random fluctuations, are optimised by maximising a variational bound on the model's marginal likelihood, also known as model evidence. this bound is known as variational free energy or the evidence lower bound (elbo) in machine learning (friston et al., 2007; hinton and zemel, 1993; mackay, 1995; winn and bishop, 2005). intuitively, this means one is trying to optimise probabilistic beliefs, about the unknown quantities generating some data, such that the (marginal) likelihood of those data is as large as possible. the marginal likelihood or model evidence can always be expressed as accuracy minus complexity.
this means that the best models provide an accurate account of some data as simply as possible. therefore, the model with the highest evidence is not necessarily a description of the process generating data: rather, it is the simplest description that provides an accurate account of those data. in short, it is 'as if' the data were generated by this kind of model. importantly, models with the highest evidence will generalise to new data and preclude overfitting, or overconfident predictions about outcomes that have yet to be measured. in light of this, it is imperative to select the parameters or models that maximise model evidence or variational free energy (as opposed to goodness of fit or accuracy). however, this requires the estimation of the uncertainty about model parameters and states, which is necessary to evaluate the (marginal) likelihood of the data at hand. this is why estimating uncertainty is crucial. being able to score a model-in terms of its evidence-means that one can compare different models of the same data. this is known as bayesian model comparison and plays an important role when testing different models or hypotheses about how the data are caused. we will see examples of this later. this aspect of dynamic causal modelling means that one does not have to commit to a particular form (i.e., parameterisation) of a model. rather, one can explore a repertoire of plausible models and let the data decide which is the most apt. dynamic causal models are generative models that generate consequences (i.e., data) from causes (i.e., hidden states and parameters). the form of these models can vary depending upon the kind of system at hand. here, we use a ubiquitous form of model; namely, a mean field approximation to loosely coupled ensembles or populations. in the neurosciences, this kind of model is applied to populations of neurons that respond to experimental stimulation (marreiros et al., 2009; moran et al., 2013) . 
here, we use the same mathematical approach to model a population of individuals and their response to an epidemic. the key idea behind these (mean field) models is that the constituents of the ensemble are exchangeable, in the sense that sampling people from the population at random will give the same average as following one person over a long period of time. under this assumption, one can then work out, analytically, how the probability distribution over the various states of people evolves over time, e.g., whether someone was infected or not. this involves parameterising the probability that people will transition from one state to another. by assuming the population is large, one can work out the likelihood of observing a certain number of people who were infected, given the probabilistic state of the population at that point in time. in turn, one can work out the probability of a sequence or timeseries of new cases. this is the kind of generative model used here, where the latent states were chosen to generate the data that are, or could be, used to track a pandemic. figure 1 provides an overview of this model. in terms of epidemiological models, this can be regarded as an extended seir (susceptible, exposed, infected and recovered) compartmental model (kermack et al., 1997). please see for an application of this kind of model. there are a number of advantages to using a model of this sort. first, it means that one can include every variable that 'matters', such that one is not just modelling the spread of an infection but an ensemble response in terms of behaviour (e.g., social distancing). this means that one can test hypotheses about the contribution of various responses that are installed in the model, or what would happen under a different kind of response. a second advantage of having a generative model is that one can evaluate its evidence in relation to alternative models, and therefore optimise the structure of the model itself.
for example, does social distancing behaviour depend upon the number of people who are infected? or, does it depend on how many people have tested positive for covid-19? (this question is addressed below). a third advantage is more practical, in terms of data analysis: because we are dealing with ensemble dynamics, there is no need to create multiple realisations or random samples to estimate uncertainty. this is because the latent states are not the states of an individual but the sufficient statistics of a probability distribution over individual states. in other words, we replace random fluctuations in hidden states with hidden states that parameterise random fluctuations. the practical consequence of this is that one can fit these models quickly and efficiently-and perform model comparisons over thousands of models. a fourth advantage is that, given a set of transition probabilities, the ensemble dynamics are specified completely. this has the simple but important consequence that the only unknowns in the model are the parameters of these transition probabilities. crucially, in this model, these do not change with time. this means that we can convert what would have been a very complicated, nonlinear state space model for data assimilation into a nonlinear mapping from some unknown (probability transition) parameters to a sequence of observations. we can therefore make precise predictions about the long-term future, under particular circumstances. this follows because the only uncertainty about outcomes inherits from the uncertainty about the parameters, which do not change with time. these points may sound subtle; however, the worked examples below have been chosen to illustrate these properties. this technical report comprises four sections. the first details the generative model, with a focus on the conditional dependencies that underwrite the ensemble dynamics generating outcomes. the outcomes in question here pertain to a regional outbreak. 
this can be regarded as a generative model for the first wave of an epidemic in a large city or metropolis. this section considers variational model inversion and comparison, under hierarchical models. in other words, it considers the distinction between (first level) models of an outbreak in one country and (second level) models of differences among countries, in terms of model parameters. the second section briefly surveys the results of second level (between-country) modelling, looking at those aspects of the model that are conserved over countries (i.e., random effects) and those which are not (i.e., fixed effects). the third section then moves on to the dynamics and predictions for a single country; here, the united kingdom. it considers the likely outcomes over the next few weeks and how confident one can be about these outcomes, given data from all countries to date. this section drills down on the parameters that matter in terms of affecting death rates. it presents a sensitivity analysis that establishes the contribution of parameters or causes in the model to eventual outcomes. it concludes by looking at the effects of social distancing and herd immunity. the final section concludes with a consideration of predictive validity by comparing predicted and actual outcomes. figure 1: generative model. this figure is a schematic description of the generative model used in subsequent analyses. in brief, this compartmental model generates timeseries data based on a mean field approximation to ensemble or population dynamics. the implicit probability distributions are over four latent factors, each with four levels or states. these factors are sufficient to generate measurable outcomes; for example, the number of new cases or the proportion of people infected. the first factor is the location of an individual, who can be at home, at work, in a critical care unit (ccu) or in the morgue. 
the second factor is infection status; namely, susceptible to infection, infected, infectious or immune. this model assumes that there is a progression from a state of susceptibility to immunity, through a period of (pre-contagious) infection to an infectious (contagious) status. the third factor is clinical status; namely, asymptomatic, symptomatic, acute respiratory distress syndrome (ards) or deceased. again, there is an assumed progression from asymptomatic to ards, where people with ards can either recover to an asymptomatic state or not. finally, the fourth factor represents diagnostic or testing status. an individual can be untested or waiting for the results of a test that can either be positive or negative. with this setup, one can be in one of four places, with any infectious status, expressing symptoms or not, and having test results or not. note that-in this construction-it is possible to be infected and yet be asymptomatic. however, the marginal distributions are not independent, by virtue of the dynamics that describe the transition among states within each factor. crucially, the transitions within any factor depend upon the marginal distribution of other factors. for example, the probability of becoming infected, given that one is susceptible to infection, depends upon whether one is at home or at work. similarly, the probability of developing symptoms depends upon whether one is infected or not. the probability of testing negative depends upon whether one is susceptible (or immune) to infection, and so on. finally, to complete the circular dependency, the probability of leaving home to go to work depends upon the number of infected people in the population, mediated by social distancing. the curvilinear arrows denote a conditioning of transition probabilities on the marginal distributions over other factors. these conditional dependencies constitute the mean field approximation and enable the dynamics to be solved or integrated over time. 
at any point in time, the probability of being in any combination of the four states determines what would be observed at the population level. for example, the occupancy of the deceased level of the clinical factor determines the current number of people who have recorded deaths. similarly, the occupancy of the positive level of the testing factor determines the expected number of positive cases reported. from these expectations, the expected number of new cases per day can be generated. a more detailed description of the generative model, in terms of transition probabilities, can be found in the main text. this section describes the generative model summarised schematically in figure 1, while the data used to invert or fit this model are summarised in figure 2. these data comprise global (worldwide) timeseries from countries and regions, from the initial reports of positive cases in china to the current day. figure 2: timeseries data. this figure provides a brief overview of the timeseries used for subsequent modelling, with a focus on the early trajectories of mortality. the upper left panel shows the distribution, over countries, of the number of days after the onset of an outbreak, defined as 8 days before more than one case was reported. at the time of writing (4th april 2020), a substantial number of countries had witnessed an outbreak lasting for more than 60 days. the upper right panel plots the total number of deaths against the durations in the left panel. those countries whose outbreak started earlier have greater cumulative deaths. the middle left panel plots the new deaths reported (per day) over a 48-day period following the onset of an outbreak. the colours of the lines denote different countries. these countries are listed in the lower left panel, which plots the cumulative death rate. china is clearly the first country to be severely affected, with the remaining countries evincing an accumulation of deaths some 30 days after china.
the middle right panel is a logarithmic plot of the total deaths against population size in the initial (48-day) period. interestingly, there is little correlation between the total number of deaths and population size. however, there is a stronger correlation between the total number of cases reported (within the first 48 days) and the cumulative deaths as shown in lower right panel. in this period, germany has the greatest ratio of total cases to deaths. countries were included if their outbreak had lasted for more than 48 days and more than 16 deaths had been reported. the timeseries were smoothed with a gaussian kernel (full width half maximum of two days) to account for erratic reporting (e.g., recording deaths over the weekend). the generative model is a mean field model of ensemble dynamics. in other words, it is a state space model where the states correspond to the sufficient statistics (i.e., parameters) of a probability distribution over the states of an ensemble or population-here, a population of people who are in mutual contact at some point in their daily lives. this kind of model is used routinely to model populations of neurons, where the ensemble dynamics are cast as density dynamics, under gaussian assumptions about the probability densities; e.g., (marreiros et al., 2009) . in other words, a model of how the mean and covariance of a population affects itself and the means and covariances of other populations. here, we will focus on a single population and, crucially, use a discrete state space model. this means that we will be dealing with the sufficient statistics (i.e. expectations) of the probability of being in a particular state at any one time. this renders the model a compartmental model (kermack et al., 1997) , where each state corresponds to a compartment. these latent states evolve according to transition probabilities that embody the causal influences and conditional dependencies that lend an epidemic its characteristic form. 
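the gaussian smoothing applied to the reported timeseries above (a kernel with a full width at half maximum of two days, to absorb erratic weekend reporting) can be sketched as follows. this is a generic illustration with a truncated, renormalised kernel and edge padding; the exact boundary handling used by the authors is not stated.

```python
import numpy as np

def gaussian_smooth(series, fwhm=2.0):
    """Smooth a daily timeseries with a Gaussian kernel of the given FWHM
    (in days), returning a series of the same length."""
    # convert FWHM to standard deviation: FWHM = 2*sqrt(2*ln 2) * sigma
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    radius = int(np.ceil(3 * sigma))                  # truncate at ~3 sigma
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()                            # renormalise
    padded = np.pad(np.asarray(series, dtype=float), radius, mode="edge")
    return np.convolve(padded, kernel, mode="valid")
```

a constant series passes through unchanged, while isolated spikes (e.g., deaths recorded in a batch after a weekend) are spread over neighbouring days.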
our objective is to identify the right conditional dependencies, and form posterior beliefs about the model parameters that mediate these dependencies. having done this, we can then simulate an entire trajectory into the distant future, even if we are only given data about the beginning of an outbreak. the model considers four different sorts of states (i.e., factors) that provide a description of any individual, sampled at random, that is sufficient to generate the data at hand. in brief, these factors were chosen to be as conditionally independent as possible, to ensure an efficient estimation of the model parameters. the four factors were an individual's location, infection status, clinical status and diagnostic status. in other words, we considered that any member of the population can be characterised in terms of where they were; whether they were infected, infectious or immune; whether they were showing mild, severe or fatal symptoms; and whether they had been tested, with an ensuing positive or negative result. each of these factors had four levels. for example, the location factor was divided into home, work, critical care unit, and the morgue. these states should not be taken too literally. for example, home stands in for anywhere that has a limited risk of exposure to, or contact with, an infected person (e.g., in the domestic home, in a non-critical hospital bed, in a care home, etc.). work stands in for anywhere that has a larger risk of exposure to, or contact with, an infected person and therefore covers non-work activities, such as going to the supermarket or participating in team sports. similarly, designating someone as severely ill with acute respiratory distress syndrome (ards) is meant to cover any life-threatening condition that would invite admission to intensive care. having established the state space, we can now turn to the causal aspect of the dynamic causal model.
the causal structure of these models depends upon the dynamics or transitions from one state to another. it is at this point that a mean field approximation can be used. mean field approximations are used widely in physics to approximate a full (joint) probability density with the product of a series of marginal densities (bressloff and newby, 2013; marreiros et al., 2009; schumacher et al., 2015; zhang et al., 2018). in this case, the factorisation is fairly subtle: we factorise the transition probabilities, such that the probability of moving among states, within each factor, depends upon the marginal distribution of other factors (with one exception). for example, the probability of developing symptoms when asymptomatic depends on, and only on, the probability that i am infected. in what follows, we step through the conditional probabilities for each factor to show how the model is put together (and could be changed). the first factor has four levels: home, work, ccu and the morgue. people can leave home but will always return (with unit probability) over a day. the probability of leaving home has a (prior) baseline rate of one third but is nuanced by any social distancing imperatives. these imperatives are predicated on the proportion of the population that is currently infected, such that a social distancing parameter (an exponent) determines the probability of leaving home. in other words, social distancing is modelled as the propensity to leave home and expose oneself to interpersonal contacts. this can be modelled with the following transition probability: p(work | home, asymptomatic) = θ_out · (1 − p_inf)^θ_sde. this means that the probability of leaving home, given i have no symptoms, is the probability that i would have gone out normally, multiplied by a decreasing function of the proportion of people in the population who are infected. formally, this proportion is the marginal probability of being infected, where the marginal probability of a factor is an average over the remaining factors.
the marginal probability of the location factor is obtained by summing the joint distribution over the remaining three factors, p_i^(loc) = Σ_{j,k,l} p_{i,j,k,l}, and similarly for the other factors; these four marginals define each factor or state in the model. the parameters in this social distancing model are the probability of leaving home every day (θ_out) and the social distancing exponent (θ_sde); we will license this assumption using bayesian model comparison later. the only other two places one can be are in a ccu or the morgue. the probability of moving to critical care depends upon bed (i.e., hospital) availability, which is modelled as a sigmoid function of the occupancy of this state (i.e., the probability that a ccu bed is occupied) and a bed capacity parameter (a threshold). if one has severe symptoms, then one stays in the ccu. finally, the probability of moving to the morgue depends on, and only on, being deceased. note that all these dependencies are on different states of the clinical factor (see below). this means we can write the transition probabilities among the location factor for each level of the clinical factor (with a slight abuse of notation), where σ(·) denotes a decreasing sigmoid function. in brief, these transition probabilities mean that i will go out when asymptomatic, unless social distancing is in play. however, when i have symptoms i will stay at home, unless i am hospitalised with acute respiratory distress. i remain in critical care unless i recover and go home, or die and move to the morgue, where i stay. technically, the morgue is an absorbing state. in a similar way, we can express the probability of moving between different states of infection (i.e., susceptible, infected, infectious and immune). these transition probabilities mean that, when susceptible, the probability of becoming infected depends upon the number of social contacts, which depends upon the proportion of time spent at home.
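the social-distancing transition described above can be sketched as a small function. the power-law form, a baseline propensity θ_out scaled down by the proportion of infected people raised to a social distancing exponent θ_sde, is our reading of the text; it is an illustration, not the authors' implementation.

```python
def p_leave_home(theta_out, theta_sde, p_inf):
    """Probability of leaving home when asymptomatic: the baseline daily
    propensity to go out, multiplied by a decreasing function of the
    marginal probability of being infected (assumed power-law form)."""
    return theta_out * (1.0 - p_inf) ** theta_sde
```

with no infection in the population this reduces to the baseline propensity (one third, a priori); as the infected proportion rises, the probability of going out falls, and larger exponents give stronger social distancing.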
this dependency is parameterised in terms of a transition probability per contact (θ_trn) and the expected number of contacts at home (θ_rin) and at work (θ_rou). once infected, one remains in this state for a period of time that is parameterised by a transition rate (θ_inf). this parameterisation illustrates a generic property of transition probabilities; namely, an interpretation in terms of rate constants and, implicitly, time constants. the rate parameter θ is related to the rate constant κ and the time constant τ according to θ = exp(−κ), with κ = 1/τ. in other words, the probability of staying in any one state is determined by the characteristic length of time that state is occupied. this means that the rate parameter above can be specified, a priori, in terms of the number of days we expect people to be infected before becoming infectious. (the complement of the per-contact transition probability, p = 1 − θ_trn, can be interpreted as the probability of eluding infection with each interpersonal contact, such that the probability of remaining uninfected after θ_r contacts is p^θ_r. note that there is no distinction between people at home and at work: both are equally likely to be infectious.) similarly, we can parameterise the transition from being infectious to being immune in terms of a typical period of being contagious, assuming that immunity is enduring and precludes reinfection. note that, in the model, everybody in the morgue is treated as having acquired immunity. the transitions among clinical states depend upon both the infection status and the location. the transitions among clinical states (i.e., asymptomatic, symptomatic, ards and deceased) are relatively straightforward: if i am not infected (i.e., susceptible or immune) i will move to the asymptomatic state, unless i am dead. however, if i am infected (i.e., infected or infectious), i will develop symptoms with a particular probability (θ_dev).
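one common reading of the rate and time constant relation described above is that the daily probability of remaining in a state occupied for a characteristic time τ is exp(−1/τ). a minimal sketch under that assumption (not the authors' code):

```python
import math

def stay_probability(tau_days):
    """Daily probability of staying in a state with characteristic
    occupation time tau (days): theta = exp(-kappa), kappa = 1/tau."""
    return math.exp(-1.0 / tau_days)

def leave_probability(tau_days):
    """Daily probability of leaving the same state."""
    return 1.0 - stay_probability(tau_days)
```

for example, if people are expected to remain infected for several days before becoming infectious, a larger τ gives a higher daily probability of staying in the infected state.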
once i have developed symptoms, i will remain symptomatic and either recover to an asymptomatic state or develop acute respiratory distress with a particular probability (θ_sev). the parameterisation of these transitions depends upon the typical length of time that i remain symptomatic (θ_sym) and, similarly, the typical time spent in acute respiratory distress (θ_rds). however, i may die following ards, with a probability that depends upon whether i am in a ccu or elsewhere. this is the exception (mentioned above) to the conditional dependencies on marginal densities. here, the probability of dying (θ_fat) depends on being infected and on my location: i am more likely to die of ards if i am not in a ccu, where θ_sur is the probability of surviving at home. the implication here is that these transition probabilities depend upon two marginal densities, as opposed to one for all the other factors: see the first equality in (1.6). please refer to table 1 for details of the model parameters. finally, we turn to diagnostic testing status (i.e., untested, waiting, or positive versus negative). the transition probabilities here are parameterised in terms of test availability (θ_tft, θ_sen) and the probability that i would have been tested anyway, which is relatively smaller if i am asymptomatic (θ_tes). test availability is a decreasing sigmoid function of the number of people who are waiting (with a delay θ_del) for their results. i can only move from being untested to waiting. after this, i can only move into positive or negative test states, depending upon whether i have the virus (i.e., infected or infectious) or not. we can now assemble these transition probabilities into a probability transition matrix and iterate from the first day to some time horizon, to generate a sequence of probability distributions over the joint space of all factors: p(t+1) = T(p(t)) · p(t) (1.8). notice that this is a completely deterministic state space model, because all the randomness is contained in the probabilities.
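the iteration just described, rebuilding the state-dependent transition matrix each day and propagating the probability distribution, can be sketched generically. here `transition_fn` stands in for the model-specific construction of the matrix from the current marginals; this is a schematic, not the authors' implementation.

```python
import numpy as np

def iterate_master_equation(p0, transition_fn, n_days):
    """Deterministic ensemble dynamics p_{t+1} = T(p_t) p_t, where the
    (column-stochastic) transition matrix may depend on the current
    distribution. Returns the trajectory of distributions."""
    p = np.asarray(p0, dtype=float)
    trajectory = [p]
    for _ in range(n_days):
        T = transition_fn(p)   # rebuild T from the current marginals
        p = T @ p              # propagate the probability distribution
        trajectory.append(p)
    return np.stack(trajectory)
```

because each column of T sums to one, probability mass is conserved at every step, and the whole future trajectory is determined by the initial distribution and the transition parameters.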
notice also that the transition probability matrix T is both state and time dependent, because the transition probabilities above depend on marginal probabilities. technically, (1.8) is known as a master equation (seifert, 2012; vespignani and zapperi, 1998; wang, 2009) and forms the basis of the dynamic part of the dynamic causal model. this model of transmission supports an effective reproduction number or rate, R, which summarises how many people i am likely to infect, if i am infected. this depends upon the probability that any contact will cause an infection, the probability that the contact is susceptible to infection and the number of people i contact. in this approximation, the number of contacts i make is a weighted average of the number of people i could infect at home and the number of people i meet outside, per day, times the number of days i am contagious. the effective reproduction rate is not a biological rate constant. however, it is a useful epidemiological summary statistic that indicates how quickly the disease spreads through a population. when less than one, the infection will decay to an endemic equilibrium. we will use this measure later to understand the role of herd immunity. this completes the specification of the generative model of latent states. 11 notice that this model is configured for new cases that are reported based on buccal swabs (i.e., am i currently infected?), not tests for antibody or immunological status. a different model would be required for forthcoming tests of immunity (i.e., have i been infected?). furthermore, one might consider the sensitivity and specificity of any test by including them in (1.7). for example, 1 in 3 tests may be false negatives; especially when avoiding bronchoalveolar lavage to minimise risk to clinicians: wang, w., xu, y., gao, r., lu, r., han, k., wu, g., tan, w., 2020b. detection of sars-cov-2 in different types of clinical specimens. jama.
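the iteration in (1.8) can be sketched as follows; this is a toy two-state illustration of a state- and time-dependent transition matrix, not the paper's full factorial model, and all rates are made up:

```python
import numpy as np

def iterate_master_equation(transition_fn, p0, n_days):
    """iterate p[t+1] = T(p[t]) @ p[t], where the transition matrix
    depends on the current marginal probabilities (state dependence)."""
    p = np.asarray(p0, dtype=float)
    trajectory = [p.copy()]
    for _ in range(n_days):
        T = transition_fn(p)   # columns sum to one (probability conserving)
        p = T @ p
        trajectory.append(p.copy())
    return np.stack(trajectory)

def toy_transition(p):
    # hypothetical rates: infection pressure grows with prevalence p[1]
    beta = 0.3 * p[1]
    return np.array([[1.0 - beta, 0.1],
                     [beta,       0.9]])

traj = iterate_master_equation(toy_transition, [0.99, 0.01], 100)
```

because each column of the transition matrix sums to one, the iterated distribution remains a proper probability distribution at every time step, even though the matrix itself changes as the marginals evolve.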
a list of the parameters and their prior means (and variances) is provided in table 1 . notice that all of the parameters are scale parameters, i.e., they are rates or probabilities that cannot be negative. to enforce these positivity constraints, one applies a log transform to the parameters during model inversion or fitting. this has the advantage of simplifying the numerics using gaussian assumptions about the prior density (via a lognormal assumption). in other words, although the scale parameters are implemented as probabilities or rates, they are estimated as log parameters, denoted by ϑ = ln θ. note that prior variances are specified for log parameters. for example, a variance of 1/64 corresponds to a prior confidence interval of ~25% and can be considered weakly informative. sources: (mizumoto and chowell, 2020; russell et al., 2020; verity et al., 2020; wang et al., 2020a) and: • https://www.statista.com/chart/21105/number-of-critical-care-beds-per-100000-inhabitants/ • https://www.gov.uk/guidance/coronavirus-covid-19-information-for-the-public • http://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/ these prior expectations should be read as the effective rates and time constants as they manifest in a real-world setting. for example, a three-day period of contagion is shorter than the period that someone might be infectious (wölfel et al., 2020) 12 , on the (prior) assumption that they will self-isolate when they realise they could be contagious. 12 shedding of covid-19 viral rna from sputum can outlast the end of symptoms. seroconversion occurs after 6-12 days but is not necessarily followed by a rapid decline of viral load. further parameters are required to generate data, such as the size of the population and the number of people who are initially infected 13 , which parameterise the initial state of the population (where ⊗ denotes
a kronecker tensor product): (1.10) 13 table 1 also includes a parameter for the proportion of people who are initially immune, which we will call on later. these parameters are unknown quantities that have to be estimated from the data; however, we still have to specify their prior densities. this raises the question: what kind of population are we trying to model? there are several choices here, ranging from detailed grid models of the sort used in weather forecasting (palmer and laure, 2013) and epidemiologic models (ferguson et al., 2006). one could use models based upon partial differential equations; i.e., (markov random) field models (deco et al., 2008). in this technical report, we will choose a simpler option that treats a pandemic as a set of linked point processes that can be modelled as rare events. in other words, we will focus on modelling a single outbreak in a region or city and treat the response of the 'next city' as a discrete process post hoc. this simplifies the generative model, in the sense that we only have to worry about the ensemble dynamics of the population that comprises one city. a complementary perspective on this choice is that we are trying to model the first wave of an epidemic as it plays out in the first city to be affected. any second wave can then be treated as the first wave of another city or region. under this choice, the population size can be set, a priori, to 1,000,000; noting that a small city comprises (by definition) a hundred thousand people, while a large city can exceed 10 million. note that this is a prior expectation; the effective population size is estimated from the data: the assumption that the effective population size reflects the total population of a country is a hypothesis (that we will test later). the likelihood or observation model: the outcomes considered in figure 2 are new cases (of positive tests and deaths) per day.
these can be generated by multiplying the appropriate probability by the (effective) population size. the appropriate probabilities here are just the expected occupancy of positive test and deceased states, respectively. because we are dealing with large populations, the likelihood of any observed daily count has a binomial distribution that can be approximated by a gaussian density 14 . here, outcomes are counts of rare events with a small probability π of occurring in a large population of size N. for example, the likelihood of observing a timeseries of daily deaths can be expressed as a function of the model parameters as follows: the advantage of this limiting (large population) case is that a (variance stabilising) square root transform of the data counts renders their variance unity. with the priors and likelihood model in place, we now have a full joint probability over causes (parameters) and consequences (outcomes). this is the generative model. one can now use standard variational techniques (friston et al., 2007) to estimate the posterior over model parameters and evaluate a variational bound on the model evidence or marginal likelihood. mathematically, this is expressed as follows: these expressions show that maximising the variational free energy F with respect to an approximate posterior q(ϑ) renders the kullback-leibler (kl) divergence between the true and approximate posterior as small as possible. at the same time, the free energy becomes a lower bound on the log evidence. the free energy can then be used to compare different models, where any differences correspond to a log bayes factor or odds ratio (kass and raftery, 1995; winn and bishop, 2005). one may ask why we have chosen this particular state space and this parameterisation. are there alternative model structures or parameterisations that would be more fit for purpose? the answer is that there will always be a better model, where 'better' is a model that has more evidence.
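the variance-stabilising property of the square-root transform can be checked numerically; the population size and event probability below are hypothetical, and the scaling by two is used so that the transformed counts have variance of about one:

```python
import numpy as np

rng = np.random.default_rng(0)

# daily counts of a rare event: binomial(N, pi) is well approximated by
# a poisson (and, for large means, gaussian) density
N, pi = 1_000_000, 1e-4
counts = rng.binomial(N, pi, size=50_000)

# the square-root transform approximately stabilises the variance,
# independently of the underlying rate: var(2*sqrt(counts)) is close to 1
y = 2.0 * np.sqrt(counts)
```

the practical upshot is that, after this transform, one can model the transformed counts with a fixed (unit) observation variance, whatever the underlying rate happens to be.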
this means that the model has to be optimised in relation to empirical data. this process is known as bayesian model comparison based upon model evidence (winn and bishop, 2005). for example, in the above model we assumed that social distancing increases as a function of the proportion of the population who are infected (1.1). this stands in for a multifactorial influence on social behaviour that may be mediated in many ways; for example, government advice, personal choices, availability of transport, media reports of 'panic buying' and so on. so, what licenses us to model the causes of social distancing in terms of a probability that any member of the population is infected? the answer rests upon bayesian model comparison. when inverting the model using data from countries with more than 16 deaths (see figure 2), we obtained a log evidence (i.e., variational free energy) of -15701 natural units (nats). when replacing the cause of social distancing with the probability of encountering someone with symptoms, or the number of people testing positive, the model evidence fell substantially to -15969 and -15909 nats, respectively. in other words, there was overwhelming evidence in favour of infection rates as a primary drive for social distancing, over and above the alternative models. we will return to the use of bayesian model comparison later, when asking what factors determine differences between each country's response to the pandemic. table 1 lists all the model parameters; henceforth, dcm parameters. in total, there are 21 dcm parameters. this may seem like a large number to estimate from the limited amount of data available (see figure 2). the degree to which a parameter is informed by the data depends upon how changes in the parameter are expressed in data space. for example, increasing the effective population size will uniformly elevate the expected cases per day.
conversely, decreasing the number of initially infected people will delay the curve by shifting it in time. in short, a parameter can be identified if it has a relatively unique expression in the data. this speaks to an important point: the information in the data is not just in the total count, it is in the shape or form of the transient 15 . on this view, there are many degrees of freedom in a timeseries that can be leveraged to identify a highly parameterised model. the issue of whether the model is over-parameterised or under-parameterised is exactly the issue resolved by bayesian model comparison; namely, the removal of redundant parameters to suppress model complexity and ensure generalisation: see (1.13) 16 . one therefore requires the best measures of model evidence. this is the primary motivation for using variational bayes; here, variational laplace (friston et al., 2007). the variational free energy, in most circumstances, provides a better approximation than alternatives such as the widely used akaike and bayesian information criteria (penny, 2012). 15 a transient here refers to a transient perturbation to a system, characterising a response that evolves over time. 16 intuitively, this can be likened to a bat inverting its generative model of the world using the transients created by echo location. the shape of the transient contains an enormous amount of information, provided the bat has a good model of how echoes are generated. exactly the same principle applies here: if one can find the right model, one can go beyond the immediate information in the data to make some precise inferences, based upon prior beliefs that constitute the generative model. this kind of abductive inference speaks to the importance of having a good forward or generative model, and the ability to select the best model based upon model evidence. one special aspect of the model above is that it has absorbing states.
for example, whenever one enters the morgue, becomes immune, dies or has a definitive test result, one stays in that state: see figure 1 . this is important, because it means the long-term behaviour of the model has a fixed point. in other words, we know what the final outcomes will be. these outcomes are known as endemic equilibria. this means that the only uncertainty is about the trajectory from the present point in time to the distant future. we will see later that, when quantified in terms of bayesian credible intervals, this uncertainty starts to decrease as we go into the distant future. this should be contrasted with alternative models that do not parameterise the influences that generate outcomes and therefore call upon exogenous inputs (e.g., statutory changes in policy or changes in people's behaviour). if these interventions are unknown, they will accumulate uncertainty over time. by design, we elude this problem by including everything that matters within the model and parameterising strategic responses (like social distancing) as an integral part of the transition probabilities. we have made the simplifying assumption that every country reporting new cases is, effectively, reporting the first wave of an affected region or city. clearly, some countries could suffer simultaneous outbreaks in multiple cities. this is accommodated by an effective population size that could be greater than the prior expectation of 1 million. this is an example of finding a simple model that best predicts outcomes, one that may not be a veridical reflection of how those outcomes were actually generated. in other words, we will assume that each country behaves as if it has a single large city of at-risk denizens. in the next section, we look at the parameter estimates that obtain by pooling information from all countries, with a focus on between-country differences, before turning to the epidemiology of a single country (the united kingdom).
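the model comparisons described earlier turn on differences in log evidence; the log-evidence values quoted above for the three candidate drivers of social distancing translate into log bayes factors as follows (a minimal sketch using the reported values):

```python
# log evidences (variational free energies) reported in the text for
# three candidate causes of social distancing
log_evidence = {
    "infection rate":  -15701,   # proportion of the population infected
    "symptom contact": -15969,   # probability of meeting someone symptomatic
    "positive tests":  -15909,   # number of people testing positive
}

# log bayes factor of the winning model relative to each alternative
best = max(log_evidence, key=log_evidence.get)
log_bayes_factor = {name: log_evidence[best] - value
                    for name, value in log_evidence.items()}
```

a log bayes factor of a few hundred nats, as here, corresponds to what the text calls overwhelming evidence for one model over another.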
hitherto, we have focused on a generative model for a single city. however, in a pandemic, many cities will be affected. this calls for a hierarchical generative model that considers the response of each city at the first level and a global response at the second. this is an important consideration because it means, from a bayesian perspective, knowing what happens elsewhere places constraints (i.e., bayesian shrinkage priors) on estimates of what is happening in a particular city. clearly, this rests upon the extent to which certain model parameters are conserved from one city to another-and which are idiosyncratic or unique. this is a problem of hierarchical bayesian modelling or parametric empirical bayes (friston et al., 2016; kass and steffey, 1989) . in the illustrative examples below, we will adopt a second level model in which key (log) parameters are sampled from a gaussian distribution with a global (worldwide) mean and variance. from the perspective of the generative model, this means that to generate a pandemic, one first samples city-specific parameters from a global distribution, adds a random effect, and uses the ensuing parameters to generate a timeseries for each city. this section considers the modelling of country-specific parameters, under a simple (general linear) model of between-country effects. this (second level) model requires us to specify which parameters are shared in a meaningful way between countries and which are unique to each country. technically, this can be cast as the difference between random and fixed effects. designating a particular parameter as a random effect means that this parameter was generated by sampling from a countrywide distribution, while a fixed effect is unique to each country. under a general linear model, the distribution for random effects is gaussian. 
in other words, to generate the parameter for a particular country, we take the global expectation and add a random gaussian variate, whose variance has to be estimated under suitable hyperpriors. furthermore, one has to specify systematic differences between countries in terms of independent variables; for example, does the latitude of a country have any systematic effect on the size of the at-risk population? the general linear model used here comprises a constant (i.e., the expectation or mean of each parameter over countries), the (logarithms of) total population size, and a series of independent variables based upon a discrete sine transform of latitude and longitude. the latter variables stand in for any systematic and geopolitical differences among countries that vary smoothly with their location. notice that the total population size may or may not provide useful constraints on the effective size of the population at the first level. under this hierarchical model, a bigger country may have a transport and communication infrastructure that could reduce the effective (at risk) population size. a hint that this may be the case is implicit in figure 2 , where there is no apparent relationship between the early incidence of deaths and total population size. in the examples below, we treated the number of initial cases and the parameters pertaining to testing as fixed effects and all remaining parameters as random effects. the number of initial infected people determines the time at which a particular country evinces its outbreak. although this clearly depends upon geography and other factors, there is no a priori reason to assume a random variation about an average onset time. similarly, we assume that each country's capacity for testing was a fixed effect; thereby accommodating non-systematic testing or reporting strategies 17 . 
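the second-level (general linear) model described above can be sketched generatively; the covariates and coefficients here are hypothetical stand-ins for the constant, log total population and sine-transformed location terms:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_country_log_param(global_mean, X_country, beta, random_sd):
    """country-specific log parameter = global expectation
    + systematic (glm) effect of covariates + random gaussian effect."""
    systematic = float(X_country @ beta)
    random_effect = rng.normal(0.0, random_sd)
    return global_mean + systematic + random_effect

# hypothetical example for one log parameter (effective population size)
X = np.array([1.0, np.log(6.7e7), np.sin(0.9)])  # constant, log pop., location
beta = np.array([0.0, 0.05, -0.2])               # made-up second-level coefficients
phi = sample_country_log_param(np.log(1e6), X, beta, random_sd=0.2)
effective_population = np.exp(phi)               # always positive
```

because the random effect is added in log space, exponentiating always yields a positive scale parameter, which is exactly the rationale for the lognormal parameterisation used throughout.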
note that in this kind of modelling, outcomes such as new cases can only be interpreted in relation to the probability of being tested and the availability of tests 18 . with this model in place, we can now use standard procedures for parametric empirical bayesian modelling (friston et al., 2016; kass and steffey, 1989) to estimate the second level parameters that couple between-country independent variables to country-specific parameters of the dcm. however, there are a large number of these parameters, which may or may not contribute to model evidence. in other words, we need some way of removing redundant parameters based upon bayesian model comparison. this calls upon another standard procedure called bayesian model reduction (friston et al., 2018; friston et al., 2016). in brief, bayesian model reduction allows one to evaluate the evidence for a model that one would have obtained if the model had been reduced by removing one or more parameters. the key aspect of bayesian model reduction is that this evidence can be evaluated using the posteriors and priors of a parent model that includes all possible parameters. there are clearly an enormous number of combinations of parameters that one could consider. fortunately, these can be scored quickly and efficiently using bayesian model reduction, by making use of savage-dickey density ratios (friston and penny, 2011; savage, 1954). because bayesian model reduction scores the effect of changing the precision of priors on model evidence, it can be regarded as an automatic bayesian sensitivity analysis, also known as robust bayesian analysis (berger, 2011). figure 3 shows the results of this analysis. the upper panels show the posterior probability of the 256 models that had the greatest evidence (shown as a log posterior in the upper left panel). each of these models corresponds to a particular combination of parameters that have been 'switched off', by shrinking their prior variance to zero.
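the savage-dickey ratio used by bayesian model reduction can be sketched for the gaussian case; removing (i.e., shrinking to zero) a single parameter changes the log evidence by the log ratio of posterior to prior density at zero. the numbers below are hypothetical:

```python
import math

def log_normal_pdf(x, mean, var):
    """log density of a gaussian at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def savage_dickey_log_bf(post_mean, post_var, prior_mean, prior_var):
    """change in log evidence for reducing a parameter to zero:
    the log ratio of posterior to prior density at zero
    (valid when both densities are gaussian)."""
    return (log_normal_pdf(0.0, post_mean, post_var)
            - log_normal_pdf(0.0, prior_mean, prior_var))

# a parameter whose posterior concentrates near zero is redundant:
# removing it increases model evidence
redundant = savage_dickey_log_bf(0.05, 0.01, 0.0, 1.0)   # positive
# a parameter with a strong posterior effect should be retained
needed = savage_dickey_log_bf(2.0, 0.01, 0.0, 1.0)       # negative
```

this is why reduced models can be scored quickly: only the parent model's posteriors and priors are needed, with no refitting for each combination of removed parameters.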
by averaging the posterior estimates in proportion to the evidence for each model (a procedure known as bayesian model averaging; hoeting et al., 1999), we can eliminate redundant parameters and thereby provide a simpler explanation for differences among countries. this is illustrated in the lower panels, which show the posterior densities before (left) and after (right) bayesian model reduction. these estimates are shown in terms of their expectation or maximum a posteriori (map) value (as blue bars), with 90% bayesian credible intervals (as pink bars). the first 21 parameters are the global expectations of the dcm parameters. the remaining parameters are the coefficients that link various independent variables at the second level to the parameters of the transition probabilities at the first. note that a substantial number of second level parameters have been removed; however, many are retained. this suggests that there are systematic variations over countries in certain random effects at the country level. figure 4 provides an example based upon the largest effect mediated by the independent variables. in this analysis, latitude (i.e., distance from the south pole) appears to reduce the effective size of an at-risk population. in other words, countries in the northern hemisphere have a smaller effective population size, relative to countries in the southern hemisphere. clearly, there may be many reasons for this; for example, systematic differences in temperature or demographics. figure 3: bayesian model reduction. this figure reports the results of bayesian model reduction. in this instance, the models compared are at the second or between-country level. in other words, the models compared contained all combinations of (second level) parameters (a parameter is removed by setting its prior variance to zero). if the model evidence increases, in virtue of reducing model complexity, then this parameter is redundant.
the upper panels show the relative evidence of the most likely 256 models, in terms of log evidence (left panel) and the corresponding posterior probability (right panel). redundant parameters are illustrated in the lower panels by comparing the posterior expectations before and after the bayesian model reduction. the blue bars correspond to posterior expectations, while the pink bars denote 90% bayesian credible intervals. the key thing to take from this analysis is that a large number of second level parameters have been eliminated. these second level parameters encode the effects of population size and geographical location on each of the parameters of the generative model. the next figure illustrates the non-redundant effects that can be inferred with almost 100% posterior confidence. figure 4: between country effects. this figure shows the relationship between parameters of the generative model and the explanatory variables in a general linear model of between country effects. the left panel shows a regression of country-specific dcm parameters on the independent variable that had the greatest absolute value; namely, the contribution of an explanatory variable to a model parameter. here, the effective size of the population appears to depend upon the latitude of a country. the right panel shows the absolute values of the glm parameters in matrix form, showing that the effective size of the population was most predictable (the largest values are in white), though not necessarily predictable by total population size. the red circle highlights the parameter mediating the relationship illustrated in the left panel. figure 5 shows the bayesian parameter averages (litvak et al., 2015) of the dcm parameters over countries. the posterior densities (blue bars and pink lines) are supplemented with the prior expectations (red bars) for comparison.
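bayesian model averaging, as used in figure 3, weights the estimates from each model by its posterior probability, which follows from the log evidences via a softmax; a minimal sketch with made-up values:

```python
import numpy as np

def model_posteriors(log_evidences):
    """posterior model probabilities from log evidences (softmax),
    assuming uniform priors over models."""
    l = np.asarray(log_evidences, dtype=float)
    w = np.exp(l - l.max())          # subtract max for numerical stability
    return w / w.sum()

def bayesian_model_average(estimates, log_evidences):
    """average parameter estimates, weighted by model probability."""
    return model_posteriors(log_evidences) @ np.asarray(estimates, dtype=float)

# hypothetical: two models with equal evidence average their estimates,
# while a dominant model contributes almost all the weight
even = bayesian_model_average([1.0, 3.0], [0.0, 0.0])     # halfway between
skew = bayesian_model_average([1.0, 3.0], [100.0, 0.0])   # ~first estimate
```

this is why the averaged posteriors in the lower panels effectively discard parameters that only appear in low-evidence models: their contributions receive negligible weight.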
the upper panel shows the map estimates of log parameters, while the lower panel shows the same results in terms of scale parameters. the key thing to take from this analysis is the tight credible intervals on the parameters, when averaging in this way. according to this analysis, the number of effective contacts at home is about three people, while this increases by an order of magnitude to about 30 people when leaving home. the symptomatic and acute respiratory distress periods have been estimated here at about five and 13 days respectively, with a delay in testing of about two days. these are the values that provide the simplest explanation for the global data at hand, and are in line with empirical estimates 19 . figure 5: bayesian parameter averages. this figure reports the bayesian parameter averages over countries following a hierarchical or parametric empirical bayesian analysis that tests for, and applies shrinkage priors to, posterior parameter estimates for each country. the upper panel shows the parameters as estimated in log space, while the lower panel shows the same results for the corresponding scale (nonnegative) parameters. the blue bars report posterior expectations, while the thinner red bars in the upper panel are prior expectations. the pink bars denote 90% bayesian credible intervals. one can interpret these parameters as the average value for any given parameter of the generative model, to which a random (country-specific) effect is added to generate the ensemble dynamics for each country. in turn, these ensemble distributions determine the likelihood of various outcome measures under large-number (i.e., gaussian) assumptions. figure 6 shows the country-specific parameter estimates for 12 of the 21 dcm parameters. these posterior densities were evaluated under the empirical priors from the parametric empirical bayesian analysis above.
as one might expect, in virtue of the second level effects that survived bayesian model reduction, there are some substantial differences between countries in certain parameters. for example, the effective population size in the united states of america is substantially greater than elsewhere, at about 25 million (the population of new york state is about 19.4 million). the effective population size in the uk (dominated by cases in london) is estimated to be about 2.5 million (london has a population of about 7.5 million) 20 . social distancing seems to be effective and sensitive to infection rates in france but much less so in canada. the efficacy of social distancing, in terms of the difference between the number of contacts at home and at work, is notably attenuated in the united kingdom, which has the greatest number of home contacts and the fewest work contacts. other notable differences include the increased probability of fatality in critical care evident in china. this is despite the effective population size being only about 2.5 million. again, these assertions are not about actual states of affairs; they are the best explanations for the data under the simplest model of how those data were caused. figure 6: differences among countries. this figure reports the differences among countries in terms of selected parameters of the generative model, ranging from the effective population size through to the probability of testing its denizens. the blue bars represent the posterior expectations, while the pink bars are 90% bayesian credible intervals. notice that these intervals are not symmetrical about the mean, because we are reporting scale parameters, as opposed to log parameters. for each parameter, the countries showing the smallest and largest values are labelled. the red asterisk denotes the country considered in the next section (the united kingdom).
the next figure illustrates the projections, in terms of new deaths and cases, based upon these parameter estimates. the order of the countries is listed in figure 2 . this level of modelling is important because it enables the data or information from one country to inform estimates of the first level (dcm) parameters that underwrite the epidemic in another country 21 . this is another expression of the importance of having a hierarchical generative model for making sense of the data. here, the generative model has latent causes that span different countries, thereby enabling the fusion of multimodal data from multiple countries (e.g., new test or death rates). two natural questions now arise. are there any systematic differences between countries in the parameters that shape epidemiological dynamics-and what do these dynamics or trajectories look like? this concludes our brief treatment of between country effects, in which we have considered the potentially important role of bayesian model reduction in identifying systematic variations in the evolution of an epidemic from country to country. the next section turns to the use of hierarchically informed estimates of dcm parameters to characterise an outbreak in a single country. this section drills down on the likely course of the epidemic in the uk, based upon the posterior density over dcm parameters afforded by the hierarchical (parametric empirical) bayesian analysis of the previous section (listed in table 2 ). figure 7 shows the expected trajectory of death rates, new cases, and occupancy of ccu beds over a six-month (180 day) period. these (posterior predictive) densities are shown in terms of an expected trajectory and 90% credible intervals (blue line and shaded areas, respectively). the black dots represent empirical data (available at the time of writing). notice that the generative model can produce outcomes that may or may not be measured. 
here, the estimates are based upon the new cases and deaths in figure 2 . the panels on the left show that our confidence about the causes of new cases is relatively high during the period for which we have data and then becomes uncertain in the future. this reflects the fact that the data are informing those parameters that shaped the initial transient, whereas other parameters responsible for the late peak and subsequent trajectory are less informed. notice that the uncertainty about cumulative deaths itself accumulates. on this analysis, we can be 90% confident that, in five weeks, between 13,000 and 22,000 people may have died. relative to the total population, the proportion of people dying is very small; however, the cumulative death rates in absolute numbers are substantial, in relation to seasonal influenza (indicated with broken red lines). although cumulative death rates are small, they are concentrated within a short period of time, with near-identical ccu needs, with the risk of overwhelming available capacity (not to mention downstream effects from blocking other hospital admissions to prioritise the pandemic). figure 7: projected outcomes. this figure reports predicted 22 new deaths and cases (and ccu occupancy) for an exemplar country; here, the united kingdom. the panels on the left show the predicted outcomes as a function of weeks. 22 we will use predictions, as opposed to projections, when appropriate, to emphasise the point that the generative model is not a timeseries model, in the sense that the unknown quantities (dcm parameters) do not change with time. this means that there is uncertainty about predictions in the future and the past, given uncertainty about the parameters (see figure 7 ). this should be contrasted with the notion of forecasting or projection; however, predictions in the future, in this setting, can be construed as projections.
the blue lines correspond to the expected trajectory, while the shaded areas are 90% bayesian credible intervals. the black dots represent empirical data, upon which the parameter estimates are based. the lower right panel shows the parameter estimates for the country in question. as in previous figures, the prior expectations are shown as pink bars over the posterior expectations (and credible intervals). the upper right panel illustrates the equivalent expectations in terms of cumulative deaths. the dotted red lines indicate the number of people who died from seasonal influenza in recent years 23 . the key point to take from this figure is the quantification of uncertainty inherent in the credible intervals. in other words, uncertainty about the parameters propagates through to uncertainty in predicted outcomes. this uncertainty changes over time because of the nonlinear relationship between model parameters and ensemble dynamics. by model design, one can be certain about the final states; however, uncertainty about cumulative death rates itself accumulates. the mapping from parameters, through ensemble dynamics to outcomes is mediated by latent or hidden states. the trajectory of these states is illustrated in the next figure. the underlying latent causes of these trajectories are shown in figure 8 . the upper panels reproduce the expected trajectories of the previous figure, while the lower panels show the underlying latent states in terms of expected rates or probabilities. for example, the social distancing measures are expressed in terms of an increasing probability of being at home, given the accumulation of infected cases in the population. during the peak expression of death rates, the proportion of people who are immune (herd immunity) increases to about 30% and then asymptotes at about 90%. this period is associated with a marked increase in the probability of developing symptoms (peaking at about 11 weeks, after the first reported cases). 
interestingly, under these projections, the number of people expected to be in critical care should not exceed capacity: at its peak, the upper bound of the 90% credible interval for ccu occupancy is approximately 4200, which is within the current ccu capacity of london (corresponding to the projected capacity of the temporary nightingale hospital 24 in london, uk). figure 8: latent causes of observed consequences. the upper panels reproduce the expected trajectories of the previous figure, for an example country (here the united kingdom). the expected death rate is shown in blue, new cases in red, predicted recovery rate in orange and ccu occupancy in yellow. the black dots correspond to empirical data. the lower four panels show the evolution of latent (ensemble) dynamics, in terms of the expected probability of being in various states. the first (location) panel shows that after about 5 to 6 weeks, there is sufficient evidence for the onset of an episode to induce social distancing, such that the probability of being found at work falls, over a couple of weeks, to negligible levels. at this time, the number of infected people increases (to about 32%) with a concomitant probability of being infectious a few days later. during this time, the probability of becoming immune increases monotonically and saturates at about 20 weeks. clinically, the probability of becoming symptomatic rises to about 30%, with a small probability of developing acute respiratory distress and, possibly, death (these probabilities are very small and cannot be seen in this graph). in terms of testing, there is a progressive increase in the number of people tested, with a concomitant decrease in those untested or waiting for their results. interestingly, initially the number of negative tests increases monotonically, while the proportion of positive tests starts to catch up during the peak of the episode.
under these parameters, the entire episode lasts for about 10 weeks, or less than three months. the broken red line in the upper left panel shows the typical number of ccu beds available to a well-resourced city, prior to the outbreak. it is natural to ask which dcm parameters contributed the most to the trajectories in figure 8. this is addressed using a sensitivity analysis. intuitively, this involves changing a particular parameter and seeing how much it affects the outcomes of interest. figure 9 reports a sensitivity analysis of the parameters in terms of their direct contribution to cumulative deaths (upper panel) and how they interact (lower panel). these are effectively the gradient and hessian matrix (respectively) of predicted cumulative deaths. the bars in the upper panel pointing to the left indicate parameters that decrease total deaths. these include social distancing and bed availability, which are-to some extent-under our control. other factors that improve fatality rates include the symptomatic and acute respiratory distress periods and the probability of surviving outside critical care. these, at the present time, are not so amenable to intervention. note that initial immunity has no effect in this analysis because we clamped the initial values to zero with very precise priors. we will relax this later. first, we look at the effect of social distancing by simulating the ensemble dynamics under increasing levels of the social distancing exponent (i.e., the sensitivity of our social distancing and self-isolation behaviour to the prevalence of the virus in the community). as one might expect, increasing social distancing, bed availability and the probability of survival outside critical care tends to decrease death rates.
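the gradient and hessian reported in figure 9 can be approximated numerically by finite differences. below is a hedged sketch in python: the function `cumulative_deaths` is a made-up stand-in for the generative model's mapping from parameters to total deaths, chosen only so that both parameters reduce deaths:

```python
import numpy as np

def cumulative_deaths(theta):
    # hypothetical surrogate: deaths fall with social distancing and bed capacity
    social_distancing, bed_capacity = theta
    return 20000.0 * np.exp(-0.1 * social_distancing) / (1.0 + 0.05 * bed_capacity)

def gradient_and_hessian(f, theta, h=1e-4):
    """central finite-difference gradient and hessian of a scalar function."""
    n = len(theta)
    g, H = np.zeros(n), np.zeros((n, n))
    for i in range(n):
        e_i = np.zeros(n); e_i[i] = h
        g[i] = (f(theta + e_i) - f(theta - e_i)) / (2 * h)
        for j in range(n):
            e_j = np.zeros(n); e_j[j] = h
            H[i, j] = (f(theta + e_i + e_j) - f(theta + e_i - e_j)
                       - f(theta - e_i + e_j) + f(theta - e_i - e_j)) / (4 * h * h)
    return g, H

g, H = gradient_and_hessian(cumulative_deaths, np.array([2.0, 10.0]))
# negative entries of g correspond to the left-pointing bars in the upper panel;
# H plays the role of the interaction (lower) panel
```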
interestingly, increasing both the period of symptoms and ards decreases overall death rate, because (in this compartmental model) keeping someone alive for longer in a ccu reduces fatality rates (as long as capacity is not exceeded). the lower panel shows the second-order derivatives. these reflect the effect of one parameter on the effect of another parameter on total deaths. for example, the effects of bed availability and fatality in ccu are positive, meaning that the beneficial (negative) effects of increasing bed availability-on total deaths-decrease with fatality rates. it may be surprising to see that social distancing has such a small effect on total deaths (see upper panel in figure 9). however, the contribution of social distancing is in the context of how the epidemic elicits other responses; for example, increases in critical care capacity. quantitatively speaking, increasing social distancing only delays the expression of morbidity in the population: it does not, in and of itself, decrease the cumulative cost (although it buys time to develop capacity, treatments, and primary interventions). this is especially the case if there is no effective limit on critical care capacity, because everybody who needs a bed can be accommodated. this speaks to the interaction between different causes or parameters in generating outcomes. in the particular case of the uk, the results in figure 4 suggest that although social distancing is in play, self-isolation appears limited. this is because the number of contacts at home is relatively high (at over five), thereby attenuating the effect of social distancing. in other words, slowing the spread of the virus depends upon reducing the number of contacts by social distancing. however, this will only work if there is a notable difference between the number of contacts at home and work. one can illustrate this by simulating the effects of social distancing, when it makes a difference.
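a toy version of this simulation - a plain sir model with a contact reduction factor standing in for social distancing - reproduces the flattening effect. all rates below are assumptions for illustration, not the fitted dcm parameters:

```python
def peak_prevalence(contact_reduction, r0=2.5, gamma=0.2, days=300, dt=0.1):
    """peak infected fraction in a toy sir model under reduced contacts."""
    beta = r0 * gamma * (1.0 - contact_reduction)
    s, i = 0.999, 0.001
    peak = i
    for _ in range(int(days / dt)):
        ds = -beta * s * i
        di = beta * s * i - gamma * i
        s += ds * dt      # forward euler integration of the sir equations
        i += di * dt
        peak = max(peak, i)
    return peak

peaks = [peak_prevalence(c) for c in (0.0, 0.2, 0.4)]
# stronger distancing lowers ('flattens') the epidemic peak
```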
figure 10 reproduces the results in figure 8 but for 16 different levels of the social distancing parameter, while using the posterior expectation for contacts at home (of about four) from the bayesian parameter average. social distancing is expressed in terms of the probability of being found at home or work (see the panel labelled location). as we increase social distancing, the probability and duration of being at home during the outbreak increase. this flattens the curve of death rates per day, reducing the peak from about 600 to about 400. this is the basis of the mitigation ('curve flattening') strategies that have been adopted worldwide. the effect of this strategy is to reduce cumulative deaths (in this example, from about 17,000 to about 14,000, potentially saving about 3000 people) and to prevent finite resources from being overwhelmed. this is roughly four times the number of people who die in the equivalent period due to road traffic accidents. interestingly, these (posterior predictive) projections suggest that social distancing can lead to an endgame in which not everybody has to be immune (see the middle panel labelled infection). we now look at herd immunity using the same analysis. figure 10: the effects of social distancing. this figure uses the same format as figure 9. however, here trajectories are reproduced under different levels of social distancing, from zero through to four (in 16 steps). this parameter is the exponent applied to the probability of not being infected. in other words, it scores the sensitivity of social distancing to the prevalence of the virus in the population. in this example (based upon posterior expectations for the united kingdom and bayesian parameter averages over countries), death rates (per day) decrease progressively with social distancing. the cumulative death rate is shown as a function of social distancing in the upper right panel.
the vertical line corresponds to the posterior expectation of the social distancing exponent for this country. these results suggest that social distancing relieves pressure on critical care capacities and ameliorates cumulative deaths by about 3000 people. note that these projections are based upon an effective social distancing policy at home, with about four contacts. in the next figure, we repeat this analysis but look at the effect of herd immunity. figure 11 reproduces the results in figure 10 using the united kingdom posterior estimates but varying the initial (herd) immunity over 16 levels from, effectively, 0 to 100%. the effects of herd immunity are marked, with cumulative deaths ranging from about 18,000 with no immunity to very small numbers with a herd immunity of about 70%. the broken red lines in the upper right panel are the number of people dying from seasonal influenza (as in figure 7). these projections suggest that there is a critical level of herd immunity that will effectively avert an epidemic, by virtue of reducing infection rates such that the spread of the virus decays exponentially. if we now return to figure 8, it can be seen that the critical level of herd immunity will, on the basis of these projections, be reached 2 to 3 weeks after the peak in death rates. at this point-according to the model-social distancing starts to decline, as revealed by an increase in the probability of being at work. we will put some dates on this trajectory by expressing it as a narrative in the conclusion. from a modelling perspective, the influence of initial herd immunity is important because it could form the basis of modelling the spread of the virus from one city to another-and back again. in other words, more sophisticated generative models can be envisaged, in which an infected person from one city is transported to another city with a small probability or rate.
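the existence of a critical level of immunity follows from the effective reproduction rate falling below one: in the simplest (sir) caricature, r_eff is r0 times the susceptible fraction, so the threshold is 1 - 1/r0 (60% for r0 = 2.5). a sketch under these assumed, illustrative values:

```python
# toy sir run tracking the effective reproduction rate r_eff = r0 * s
r0, gamma = 2.5, 0.2        # assumed values for illustration
beta = r0 * gamma
s, i = 0.999, 0.001
dt = 0.1
r_eff = []
for _ in range(int(300 / dt)):
    r_eff.append(r0 * s)    # effective reproduction rate at this time step
    ds = -beta * s * i
    di = beta * s * i - gamma * i
    s += ds * dt            # forward euler integration
    i += di * dt
immune = 1.0 - s - i
threshold = 1.0 - 1.0 / r0  # herd immunity threshold: 60% here
# once acquired immunity exceeds the threshold, r_eff stays below one, so
# relaxing distancing no longer permits a resurgence (in this caricature)
```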
reciprocal exchange between cities (and ensuing 'second waves') will then depend sensitively on the respective herd immunities in different regions. anecdotally, other major pandemics, without social isolation strategies, have almost invariably been followed by a second peak that is as high as the first (e.g., the 2009 h1n1 pandemic), or higher. under the current model, this would be handled in terms of a second region being infected by the first city and so on, like a chain of dominos or the spread of a bushfire (rhodes and anderson, 1998; zhang and tang, 2016). crucially, the effect of the second city (i.e., wave) on the first will be sensitive to the herd immunity established by the first wave. in this sense, it is interesting to know how initial levels of immunity shape a regional outbreak, under idealised assumptions. figure 12 illustrates the interaction between immunity and viral spread as characterised by the effective reproduction rate, r (a.k.a. number or ratio); see (1.9). this figure plots the predicted death rates for the united kingdom and the accompanying fluctuations in r and herd immunity, where both are treated as outcomes of the generative model. the key thing to observe is that with low levels of immunity, r is fairly high at around 2.5 (current estimates of the basic reproduction ratio 25 r0, in the literature, range from 1.4 to 3.9). as soon as social distancing comes into play, r falls dramatically to almost 0. however, when social distancing is relaxed some weeks later, r remains low due to the partial acquisition of herd immunity during the peak of the epidemic. note that herd immunity in this setting pertains to, and only to, the effective or at-risk population: 80% herd immunity a few months from onset would otherwise be overly optimistic, compared to other de novo pandemics; e.g., (donaldson et al., 2009). on the other hand, an occult herd immunity (i.e.
not accompanied by symptoms) is consistent with undocumented infection and rapid dissemination. note that this way of characterising the spread of a virus depends upon many variables (in this model, two factors and three parameters) and can vary from country to country. repeating the above analysis for china gives a much higher initial or basic reproduction rate, which is consistent with empirical reports (steven et al., 2020). this concludes our characterisation of projections for what is likely to happen and what could happen under different scenarios for a particular country. in the final section, we revisit the confidence with which these posterior predictive projections can be made. variational approaches-of the sort described in this technical report-use all the data at hand to furnish statistically efficient estimates of model parameters and evidence. this contrasts with alternative approaches based on cross-validation. in cross-validation schemes, model evidence is approximated by cross-validation accuracy. in other words, the evidence for a model is scored by the log likelihood that some withheld or test data can be explained by the model. although model comparison based upon a variational evidence bound renders cross-validation unnecessary, one can apply the same procedures to demonstrate predictive validity. figure 13 illustrates this by fitting partial timeseries from one country (italy) using the empirical priors afforded by the parametric empirical bayesian analysis. these partial data comprise the early phase of new cases. if the model has predictive validity, the ensuing posterior predictive density should contain the data that were withheld during estimation. figure 13 presents an example of forward prediction over a 10-day period that contains the peak death rate. in this example, the withheld data are largely within the 90% credible intervals, speaking to the predictive validity of the generative model.
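a predictive validity check of this sort - fit the early phase, then ask whether withheld points fall inside the predictive band - can be sketched on synthetic data. everything below (growth rate, noise level, split point) is an assumption for illustration, not the italian timeseries:

```python
import numpy as np

rng = np.random.default_rng(0)
days = np.arange(20)
# synthetic 'new cases': exponential growth with multiplicative noise
observed = 10 * np.exp(0.2 * days) * rng.lognormal(0.0, 0.05, size=20)

train_t, test_t = days[:14], days[14:]
train, test = observed[:14], observed[14:]
# fit log-linear growth on the training portion only
coeffs = np.polyfit(train_t, np.log(train), 1)
pred = np.exp(np.polyval(coeffs, test_t))
# residual spread on the training data gives a crude 90% band for withheld points
resid_sd = np.std(np.log(train) - np.polyval(coeffs, train_t))
lo = pred * np.exp(-1.645 * resid_sd)
hi = pred * np.exp(+1.645 * resid_sd)
covered = np.mean((test >= lo) & (test <= hi))  # fraction of withheld points in band
```

this crude band ignores parameter uncertainty in the fitted slope, so it is narrower than a full posterior predictive interval; the variational scheme described in the text propagates that uncertainty as well.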
there are two caveats here: first, similar analyses using very early timeseries from italy failed to predict the peak, because of insufficient (initial) constraints in the data. second, the credible intervals probably suffer from the well-known overconfidence problem in variational bayes and the implicit mean field approximation (mackay, 2003) 26. figure 13: predictive validity. this figure uses the same format as figure 7; however, here, the posterior estimates are based upon partial data, from early in the timeseries for an exemplar country (italy). these estimates were obtained under (parametric) empirical bayesian priors. the red dots show outcomes that were not used to estimate the expected trajectories (and credible intervals). this example illustrates the predictive validity of the estimates for a 10-day period following the last datapoint, which captures the rise to the peak of new cases. 26 note further that the credible intervals can include negative values. this is an artefact of the way in which the intervals are computed: here, we used a first-order taylor expansion to propagate uncertainty about the parameters through to uncertainty about the outcomes. however, because this generative model is non-linear in the parameters, higher-order terms are necessarily neglected. we have rehearsed variational procedures for the inversion of a generative model of a viral epidemic-and have extended this model using hierarchical bayesian inference (parametric empirical bayes) to deal with the differential responses of each country, in the context of a worldwide pandemic. clearly, this narrative is entirely conditioned on the generative model used to make these predictions (e.g., the assumption of lasting immunity, which may or may not be true). the narrative is offered in a deliberately definitive fashion to illustrate the effect of resolving uncertainty about what will happen.
it has been argued that many deleterious effects of the pandemic are mediated by uncertainty. this is probably true at both a psychological level-in terms of stress and anxiety (davidson, 1999; mcewen, 2000; peters et al., 2017)-and at an economic level, in terms of 'loss of confidence' and 'uncertainty about markets'. put simply, the harmful effects of the coronavirus pandemic derive not just from what will happen but from the uncertainty about what will happen. this is a key motivation behind procedures that quantify uncertainty, above and beyond being able to evaluate the evidence for different hypotheses about what will happen. one aspect of this is reflected in rhetoric such as "there is no clear exit strategy". it is reassuring to note that, if one subscribes to the above model, there is a clear exit strategy inherent in the self-organised mitigation 27 afforded by herd immunity. for example, within a week of the peak death rate, there should be sufficient herd immunity to preclude any resurgence of infections in, say, london. the term 'self-organised' is used carefully here. this is because we are part of this process, through the effect of social distancing on our location, contact with infected people and subsequent dissemination of covid-19. in other words, this formulation does not preclude strategic (e.g., nonpharmacological) interventions; rather, it embraces them as part of the self-organising ensemble dynamics. there are several outstanding issues that present themselves: the generative model-at both the first and second level-needs to be explored more thoroughly. at the first level, this may entail the addition of other factors; for example, splitting the population into age groups or different classes of clinical vulnerability. procedurally, this should be fairly simple, by specifying the dcm parameters for each age group (or cohort) separately and precluding transitions between age groups (or cohorts).
one could also consider a finer graining of states within each factor; for example, making a more careful distinction between being in and not in critical care (e.g., being in self-isolation, in a hospital, in a community care home, in a rural or urban location and so on). at the between-city or between-country level, the parameters of the general linear model could easily be extended to include a host of demographic and geographic independent variables. finally, it would be fairly straightforward to use increasingly fine-grained outcomes, using regional timeseries as opposed to country timeseries (these data are currently available from: https://github.com/cssegisanddata/covid-19). another plausible extension to the hierarchical model is to include previous outbreaks of mers and sars (middle east and severe acute respiratory syndrome, respectively) in the model. this would entail supplementing the timeseries with historical (i.e., legacy) data and replicating the general linear model for each type of virus. in effect, this would place empirical priors or constraints on any parameter that shares characteristics with mers-cov and sars-cov. in terms of the model parameters-as opposed to model structure-more precise knowledge about the underlying causes of an epidemic will afford more precise posteriors. in other words, more information about the dcm parameters can be installed by adjusting the prior expectations and variances. the utility of these adjustments would then be assessed in terms of model evidence. this may be particularly relevant as reliable data about bed occupancy, the proportion of people recovered, etc. become available. a key aspect of the generative model used in this technical report is that it precludes any exogenous interventions of a strategic sort. in other words, the things that matter are built into the model and estimated as latent causes.
however, prior knowledge about fluctuating factors, such as closing schools or limiting international air flights, could be entered by conditioning the dcm parameters on exogenous inputs. this would explicitly install intervention policies into the model. again, these conditions would only be licensed by an increase in model evidence (i.e., through comparing the evidence for models with and without some structured intervention). this may be especially important when it comes to modelling future interventions, for example, a 'sawtooth' social distancing protocol. a simple example of this kind of extension would be to include a time-dependent increase in the capacity for testing: at present, constraints on testing rates are assumed to be constant. a complementary approach would be to explore models in which social distancing depends upon variables that can be measured or inferred reliably (e.g., the rate of increase of people testing positive) and to optimise the parameters of the ensuing model to minimise cumulative deaths. in principle, this should provide an operational equation that could be regarded as an adaptive (social distancing) policy, one that accommodates as much of the epidemiology as can reliably be inferred. a key outstanding issue is the modelling of how one region (or city) affects another-and how the outbreak spreads from region to region. this may be an important aspect of these kinds of models, especially when it comes to modelling second waves as 'echoes' of infection, which are reflected back to the original epicentre. as noted above, the ability of these echoes to engender a second wave may be sensitively dependent on the herd immunity established during the first episode. herd immunity is therefore an important (currently latent or unobserved) state. this speaks to the importance of antibody testing in furnishing empirical constraints on herd immunity.
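conditioning a dcm parameter on an exogenous input can be as simple as making it a deterministic function of time. a minimal sketch of an on/off cycling ('sawtooth'-like) distancing schedule modulating a contact rate parameter; every name and number here is hypothetical, not part of the model described above:

```python
def distancing_on(day, period=28, on_fraction=0.5):
    """exogenous input: distancing is switched on for the first half of each cycle."""
    return (day % period) / period < on_fraction

def contact_rate(day, baseline=10.0, reduction=0.7):
    """contact parameter conditioned on the exogenous distancing input."""
    return baseline * (1.0 - reduction) if distancing_on(day) else baseline

schedule = [contact_rate(d) for d in range(56)]  # two full cycles
```

comparing models with and without such a conditioned parameter, in terms of model evidence, is then what would license the extension.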
in turn, this motivates antibody testing, even if the specificity and sensitivity of available tests are low. sensitivity and specificity are not only part of generative models; they can be estimated along with the other model parameters. in this setting, the role of antibody testing would be to provide data for population modelling and strategic advice-not to establish whether any particular person is immune or not (e.g., to allow them to go back to work). finally, it would be useful to assess the construct validity of the variational scheme adopted in dynamic causal modelling, in relation to schemes that do not make mean field approximations. these schemes usually rely upon some form of sampling (e.g., markov chain monte carlo sampling) and cross-validation. cross-validation accuracy can be regarded as a useful but computationally expensive proxy for model evidence and is the usual way that modellers perform approximate bayesian computation. given the prevalence of these sampling-based (non-variational) schemes, it would be encouraging if both approaches converged on roughly the same predictions. the aim of this technical report is to place variational schemes on the table, so that construct validation becomes a possibility in the short-term future. the figures in this technical report can be reproduced using annotated (matlab) code that is available as part of the free and open source academic software spm (https://www.fil.ion.ucl.ac.uk/spm/). this software package has a relatively high degree of validation, having been used for the past 25 years by over 5000 scientists in the neurosciences. the routines are called by a demonstration script that can be invoked by typing >> dem_covid. at the time of writing, these routines are undergoing software validation in our internal source version control system and will be released in the next public release of spm (and via github at https://github.com/spm/).
statistical decision theory and bayesian analysis
stochastic models of intracellular transport
the collected writings of paul davidson
the dynamic brain: from spiking neurons to neural masses and cortical fields
mortality from pandemic a/h1n1 2009 influenza in england: public health surveillance study
strategies for mitigating an influenza pandemic
variational free energy and the laplace approximation
post hoc bayesian model selection
dynamic causal modelling
bayesian model reduction and empirical bayes for group (dcm) studies
autoencoders, minimum description length and helmholtz free energy
bayesian model averaging: a tutorial
bayes factors
approximate bayesian inference in conditionally independent hierarchical models (parametric empirical bayes models)
a contribution to the mathematical theory of epidemics
early dynamics of transmission and control of covid-19: a mathematical modelling study. the lancet infectious diseases
substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov2). science, eabb3221
free-energy minimisation algorithm for decoding and cryptoanalysis
information theory, inference and learning algorithms
population dynamics under the laplace assumption
allostasis and allostatic load: implications for neuropsychopharmacology
estimating risk for death from 2019 novel coronavirus disease, china
neural masses and fields in dynamic causal modeling
singular vectors, predictability and ensemble forecasting for weather and climate
uncertainty and stress: why it causes diseases and how it is mastered by the brain
forest-fire as a model for the dynamics of disease epidemics
estimating the infection and case fatality ratio for coronavirus disease (covid-19) using age-adjusted data from the outbreak on the diamond princess cruise ship
the foundations of statistics
a statistical framework to infer delay and direction of information flow from measurements of complex systems
stochastic thermodynamics, fluctuation theorems and molecular machines. reports on progress in physics
high contagiousness and rapid spread of severe acute respiratory syndrome coronavirus 2. emerging infectious disease journal 26
how self-organized criticality works: a unified mean-field picture
clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in
detection of sars-cov-2 in different types of clinical specimens
from dirac notation to probability bracket notation: time evolution and path integral under wick rotations
variational message passing
nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study
advances in variational inference
forest fires model and sir model used in spread of ebola virus in prediction and prevention
this work was undertaken by members of the wellcome centre for human neuroimaging, ucl queen square institute of neurology.
the wellcome centre for human neuroimaging is supported by core funding from wellcome [203147/z/16/z]. a.r. is funded by the australian research council (refs: de170100128 and dp200100757). a.j.b. is supported by a wellcome trust grant wt091681ma. cl is supported by an mrc clinician scientist award (mr/r006504/1). the authors declared no conflicts of interest. key: cord-007129-qjdg46o9 authors: simoes, joana margarida title: spatial epidemic modelling in social networks date: 2005-06-21 journal: aip conf proc doi: 10.1063/1.1985395 sha: doc_id: 7129 cord_uid: qjdg46o9 the spread of infectious diseases is highly influenced by the structure of the underlying social network. the target of this study is not the network of acquaintances, but the social mobility network: the daily movement of people between locations, in regions. it has already been shown that this kind of network exhibits small world characteristics. the model developed is agent based (abm) and comprises a movement model and an infection model. in the movement model, some assumptions are made about its structure and the daily movement is decomposed into four types: neighborhood, intra region, inter region and random. the model is geographical information systems (gis) based, and uses real data to define its geometry. because it is a vector model, some optimization techniques were used to increase its efficiency. human epidemics have long interested scientists from several areas. since diseases spread amongst people, it is impossible to ignore the role of social networks in embedding this phenomenon. therefore, the architecture and general topological features of the social network should be considered in the model. several studies have considered the network of acquaintances [1], which is important for diseases that require close, prolonged contact.
however, in highly contagious diseases the infection may be passed by a short physical contact, and in this case it is more important to track the movement of individuals. this model therefore considers the social mobility network: the daily movement of individuals, which has already been described in the literature as a complex network with small world behaviour [2]. in complex systems, the interaction among the constituents of the system, and the interaction between the system and its environment, are of such a nature that the system cannot be fully understood by simply analysing its components [3]. this paper describes a simulation system using artificial agents integrated with geographical information systems (gis) that helps to understand the spatial and temporal behaviour of epidemic phenomena. the utility of spatially explicit, agent oriented simulations is demonstrated by simulating alternative scenarios to compare differences in the spatial structure of the mobility network and the geographical distribution of individuals. agent based models (abm) offer great potential for studying complex human behaviour within a spatial framework [4]. unlike cellular automata (ca), in abm space is continuous and location is explicit, which means that individuals can be simulated independently of the environment. this allows rules to be focused on the individuals, and not on space. the present model is inspired by a site exchange cellular automaton [5], which considers two phases for each time step: movement and infection, assuming there is no virus transmission while an individual is moving. the movement rules and infection rules are determined by a movement model and an infection model, described in the following paragraphs. the domain of the model is divided into small subunits with geographical relevance: the regions.
these regions can have several definitions depending on the scale; in this case the concelhos were considered, according to the ine definition (administrative division code/1994 revision approved by the deliberation n 86 of 15/12/1994). this choice was motivated by data availability and by the need to keep the simulations reasonably fast. however, the model should be run with smaller regions, since conceptually the region definition is closer to the city definition. the movement rules try to emulate the daily movement of individuals according to the diversity of their activities (working, shopping, etc.). based on these regions, four ranges of movement were considered: neighbourhood, intra region, inter region and small world. neighbourhood is the random movement of an individual within its immediate neighbourhood (here, within a radius defined as a parameter). this stands for the motion of an individual within its street or neighbourhood, where it is assumed to spend most of its time. intra region is random movement inside a region. this stands for the movement of an individual inside a city, for instance for working or going to the cinema. inter region is the random movement of an individual into neighbouring regions. this represents travel to nearby cities, for instance for shopping or visiting friends and family. finally, a totally random movement named small world (sw) was considered. this movement is a very small fraction of the total, and represents the fact that a small number of individuals make long range movements, which produce the shortcuts in the network responsible for the sw phenomena. probabilities were attributed to each kind of movement, based on common sense about which activities are most probable. these weights or probabilities should be based on real mobility data, which were not available for this study, and therefore the results should be viewed with caution. on fig.
3, a graphic shows the probabilities attributed to each kind of movement. the update of the movement model is synchronous. the infection model considered was a slir (susceptible-latent-infected-removed) model. contagion (the state change from susceptible to latent) occurs every time a susceptible meets an infectious individual within a certain radius. the state changes from latent to infectious, and from infectious to removed, are determined only by time. the contagion model is not specific to a disease, but it is flexible enough to be adjusted to fit the characteristics of a given disease. the model is vector based, considering points moving continuously over a polygon layer (fig. 6). the polygon layer reads the geographic information from a shapefile (the esri gis format), using the shapelib api. shapelib provides the programmer with a structure containing all the information in the shapefile, so it is not necessary to program low level access to the datafile (fig. 7). the advantage of reading these files is the possibility of displaying different geographical configurations. several gis functionalities were implemented (such as buffers and zooms) and the algorithms were optimized with the speed of the simulations in mind. since the operation of searching for features was used intensively, one of the efficiency algorithms developed was a search method: the quadtree. a quadtree is a tree structure used to encode two-dimensional spaces, in which the space is recursively subdivided into sub quadrants. the quadtree is normally used for raster spaces, so in this case an adaptation of the quadtree for vectorial spaces was used (figure: structure of the vectorial quadtree implemented in this model). the sensitivity analysis of the movement model was performed by running simulations with exclusively one type of movement. the domain of the model for these simulations was portugal, with concelhos as regions.
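the two-phase update (movement, then infection) can be sketched as follows. the movement probabilities, radii, contact radius and state durations below are illustrative assumptions, not the model's calibrated values (and the sketch is in python, whereas the original implementation is gis/shapelib based):

```python
import random

# movement ranges and assumed probabilities for the four types of movement
MOVEMENT_PROBS = [("neighbourhood", 0.60), ("intra_region", 0.25),
                  ("inter_region", 0.10), ("small_world", 0.05)]
RADII = {"neighbourhood": 1.0, "intra_region": 10.0,
         "inter_region": 50.0, "small_world": 500.0}

class Agent:
    def __init__(self, x, y, state="S"):
        self.x, self.y, self.state, self.clock = x, y, state, 0

def step(agents, rng, contact_radius=0.5, latent_days=3, infectious_days=7):
    """one synchronous time step: movement phase, then infection phase
    (no transmission occurs while individuals are moving)."""
    kinds = [k for k, _ in MOVEMENT_PROBS]
    weights = [w for _, w in MOVEMENT_PROBS]
    for a in agents:
        r = RADII[rng.choices(kinds, weights=weights)[0]]
        a.x += rng.uniform(-r, r)
        a.y += rng.uniform(-r, r)
    infectious = [a for a in agents if a.state == "I"]
    for a in agents:
        if a.state == "S":
            # susceptible -> latent on contact with an infectious agent
            if any((a.x - b.x) ** 2 + (a.y - b.y) ** 2 <= contact_radius ** 2
                   for b in infectious):
                a.state, a.clock = "L", 0
        elif a.state == "L":
            a.clock += 1          # latent -> infectious is driven purely by time
            if a.clock >= latent_days:
                a.state, a.clock = "I", 0
        elif a.state == "I":
            a.clock += 1          # infectious -> removed is driven purely by time
            if a.clock >= infectious_days:
                a.state = "R"

# demo: a single latent agent progresses through the slir states over time
rng = random.Random(0)
patient = Agent(0.0, 0.0, state="L")
for _ in range(3):
    step([patient], rng)
state_after_latency = patient.state   # expected: "I"
for _ in range(7):
    step([patient], rng)
state_after_course = patient.state    # expected: "R"
# demo: contact with an infectious agent converts a susceptible to latent
pair = [Agent(0.0, 0.0, "I"), Agent(0.0, 0.0, "S")]
step(pair, rng, contact_radius=1e9)   # huge radius: contact guaranteed
```

a real run would replace the unconstrained displacements with draws confined to the agent's neighbourhood, region or neighbouring regions, and use the quadtree to make the contact search efficient.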
the initial conditions are the population distribution of the census 91 (ine) and the reported cases of mumps in 1993, which were at the origin of a small epidemic in this country. although the simulations are agent based, the results are shown at the level of the region, in order to be more perceptible. darker shadings correspond to a greater amount of infected and removed individuals in the region. in the neighbourhood movement simulation (fig. 12), the infection is much more restricted, in magnitude and in spatial extension, than in all other simulations. this is obviously due to the tighter movement range. it is also important to remark that the stabilization of the epidemic occurs earlier in the neighbourhood simulation than in all others. the small world movement simulation (fig. 15) presents a totally different distribution of the population. as the individuals reach every part of the domain, so does the epidemic. however, the disease being so contagious works against its own spread, because many individuals die before they have transmitted the disease, and so the stabilization of the epidemic is reached later than in the other simulations. the impact of this component, even when present in a small amount, can be seen by running a simulation with no small world movement (fig. 16) and another simulation with a small world movement probability of 0.05 (fig. 17). when the small world movement is included, the epidemic reaches a greater number of people, and a greater part of the country. this analysis calls attention to the importance of the mobility network embedded in the epidemic model, as it has a determinant impact on the evolution of the epidemic. in this network, it was shown how a small component of random movement (characteristic of small world networks) has an effective influence on the results, which reinforces the belief that it should not be ignored when modelling social networks.
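the slir dynamics and the movement mix described above can be condensed into a small agent-based sketch. the movement probabilities, step sizes, contagion radius, and state durations below are illustrative placeholders, not the values used in this study.

```python
import random

# slir states: susceptible, latent, infectious, removed
S, L, I, R = "S", "L", "I", "R"

class Agent:
    def __init__(self, x, y, state=S):
        self.x, self.y = x, y        # position in a unit-square domain
        self.state = state
        self.clock = 0               # time steps spent in the current state

def step(agents, move_probs, radius=0.01, t_latent=3, t_infectious=5, rng=random):
    """one synchronous update: move everyone, then apply contagion and the
    time-driven state changes. move_probs maps movement type -> probability."""
    kinds, weights = zip(*move_probs.items())
    # movement phase: the step size depends on the movement range drawn
    for a in agents:
        kind = rng.choices(kinds, weights=weights)[0]
        if kind == "neighbourhood":
            step_size = 0.01
        elif kind == "intra_region":
            step_size = 0.1
        elif kind == "inter_region":
            step_size = 0.3
        else:                         # "small_world": long-range random jump
            a.x, a.y = rng.random(), rng.random()
            continue
        a.x = min(1.0, max(0.0, a.x + rng.uniform(-step_size, step_size)))
        a.y = min(1.0, max(0.0, a.y + rng.uniform(-step_size, step_size)))
    # contagion phase: a susceptible meeting an infectious within `radius`
    infectious = [a for a in agents if a.state == I]
    newly_latent = [a for a in agents if a.state == S and any(
        (a.x - b.x) ** 2 + (a.y - b.y) ** 2 <= radius ** 2 for b in infectious)]
    # time-driven transitions (latent -> infectious -> removed)
    for a in agents:
        a.clock += 1
        if a.state == L and a.clock >= t_latent:
            a.state, a.clock = I, 0
        elif a.state == I and a.clock >= t_infectious:
            a.state, a.clock = R, 0
    for a in newly_latent:
        a.state, a.clock = L, 0
```

raising the small_world weight from zero to a few percent qualitatively mirrors the effect discussed above: long-range shortcuts seed the infection in otherwise unreachable regions.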
however, this study still lacks a network analysis, observing measures such as the average path length and the clustering coefficient, which would allow evaluating whether a small world network has effectively emerged. as demonstrated in the previous chapter, the structure of the mobility network is determinant for the spatial pattern and the magnitude of the epidemic. movement should therefore always be considered in human epidemic models. another conclusion of this work is that the agent based approach is very well suited for epidemic modelling, and that vector based modelling, with appropriate programming, is quite efficient and provides a realistic representation of reality. however, there is still a lot of work to be done: the movement model needs to be analysed in terms of networks, and the infection model needs to be tested, to evaluate the importance of the different parameters and, in the future, to fit the characteristics of specific diseases. one dataset has already been introduced in the model, and some data analysis needs to be conducted to evaluate the model's efficacy. although, due to several issues, it is always difficult to match model results with real data, this would provide a way of validating the model. finally, one of the most useful applications of a spatial model like this will be the introduction of vaccination barriers, which will allow the study of different vaccination strategies.
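the two network measures the conclusion calls for can be computed directly once a contact network is extracted from the simulation. a minimal sketch over an adjacency-set representation (the helper names are our own, not part of the model):

```python
from collections import deque

def avg_path_length(adj):
    """mean shortest-path length over all connected ordered pairs, via bfs.
    adj maps each node to the set of its neighbours (undirected graph)."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def clustering_coefficient(adj):
    """mean local clustering: the fraction of each node's neighbour pairs
    that are themselves linked, averaged over all nodes."""
    coeffs = []
    for u, nbrs in adj.items():
        nbrs = list(nbrs)
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for i in range(k) for j in range(i + 1, k)
                    if nbrs[j] in adj[nbrs[i]])
        coeffs.append(2.0 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)
```

a short average path length combined with a high clustering coefficient (relative to a random graph of the same size and density) is the usual operational test for a small world network.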
references:
- epidemics in hierarchical social networks
- scaling laws for the movement of people between locations in a large city
- modelling the spatial dynamics and social interaction of human recreators using gis and intelligent agents
- individual-based lattice model for spatial spread of epidemics, in discrete dynamics in
- deterministic site exchange cellular automata model for the spread of diseases in human settlements
- epidemics and percolation in small world-networks

i would like to thank my supervisor, michael batty (casa), for reviewing the presentation that originated this paper, and carmo gomes (fcul) for providing the dataset used in these simulations.

key: cord-256289-rls5lr27 authors: leeuwenberg, artuur m; schuit, ewoud title: prediction models for covid-19 clinical decision making date: 2020-09-22 journal: lancet digit health doi: 10.1016/s2589-7500(20)30226-0 doc_id: 256289 cord_uid: rls5lr27

www.thelancet.com/digital-health vol 2 october 2020 e496 as of sept 2, 2020, more than 25 million cases of covid-19 have been reported, with more than 850 000 associated deaths worldwide. patients infected with severe acute respiratory syndrome coronavirus 2, the virus that causes covid-19, could require treatment in the intensive care unit for up to 4 weeks. as such, this disease is a major burden on health-care systems, leading to difficult decisions about who to treat and who not to. 1 prediction models that combine patient and disease characteristics to estimate the risk of a poor outcome from covid-19 can provide helpful assistance in clinical decision making. 2 in a living systematic review by wynants and colleagues, 3 a total of 145 models were reviewed, of which 50 were for prognosis of patients with covid-19, including 23 predicting mortality.
critical appraisal of these models showed a high risk of bias for all models (eg, because of a high risk of model overfitting and unclear reporting on the intended use of the models, or because of no reporting of the models' calibration performance). moreover, external validation of these models, deemed essential before application can even be considered, was rarely done. therefore, the use of any of these reported prediction models was not recommended in current practice. in the lancet digital health, arjun s yadaw and colleagues present two models to predict mortality in patients with covid-19 admitted to the mount sinai health system in the new york city area. 4 these researchers have addressed many of the issues encountered by wynants and colleagues 3 and provide extensive information about the modelling in the appendix. the dataset used for model development (n=3841) is larger than in most currently published models, and the accompanying number of patients who died (n=313) seems appropriate according to the prediction model risk of bias assessment tool (probast) 5 and guidance on sample size requirements for prediction model development. 6 the calibration performance of the models is reported, which (although essential) is often missing, particularly in studies reporting on machine-learning algorithms, 7 and external validation of the models was done. yadaw and colleagues acknowledge that additional external validation will be necessary 4 because external validation was done in a random subset of the initial patient population and another set of recent patients from the same health system, and because the number of events in the validation sets was below the 100 suggested for reliable external validity assessment.
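the calibration performance mentioned above is commonly summarised by the calibration intercept (calibration-in-the-large, ideally 0) and the calibration slope (ideally 1), estimated by regressing the observed binary outcomes on the log-odds of the predicted risks. a minimal dependency-free sketch, our own illustrative code rather than anything from the study under discussion:

```python
import math

def logit(p):
    """log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def calibration_slope_intercept(pred, y, iters=25):
    """fit y ~ sigmoid(a + b * logit(pred)) by newton-raphson;
    b is the calibration slope (ideal 1), a the intercept (ideal 0)."""
    x = [logit(p) for p in pred]
    a, b = 0.0, 0.0
    for _ in range(iters):
        # gradient and (negative) hessian of the bernoulli log-likelihood
        ga = gb = haa = hab = hbb = 0.0
        for xi, yi in zip(x, y):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            ga += yi - mu
            gb += (yi - mu) * xi
            haa += w
            hab += w * xi
            hbb += w * xi * xi
        det = haa * hbb - hab * hab
        a += (hbb * ga - hab * gb) / det
        b += (haa * gb - hab * ga) / det
    return a, b
```

a slope below 1 indicates predictions that are too extreme (a typical sign of overfitting at external validation), while a negative intercept indicates systematic overestimation of risk.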
for other researchers to apply and externally validate models, adherence to the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (tripod) criteria 9 is advised to present the full models, accompanied by code in the case of complex machine-learning models. yadaw and colleagues reported many items in tripod; however, the models themselves are not reported in the article or appendix (item 15a of tripod), so it is not possible for a reader to make predictions for new individuals (eg, to validate the developed models in their own data or to investigate the contribution of the individual predictors). the moment of risk estimation defines which values of predictors will be available and is especially important for time-varying predictors (eg, temperature). the models reported by yadaw and colleagues predict risk using measurements collected throughout the entire encounter of the patient with the health system, with no specific moment of prediction defined. 4 this raises questions about the actual prognostic value of the time-varying predictors (eg, the minimum oxygen saturation) and, hence, about how and when the model should be used, as the predictive value of time-varying predictors will likely increase when they are measured closer to the outcome. consequently, it remains unclear how to interpret the reported area under the curve of approximately 90% in relation to the moment of measurement of these time-varying predictors. two suggestions can be made regarding the modelling. first, the current machine-learning models were constructed using the default hyperparameter values provided by the respective software packages. these often provide reasonable starting values, but important hyperparameters should be carefully tuned to the specific use case. 10 second, as acknowledged by yadaw and colleagues, 4 patients who had not developed the outcome by the end of the study were considered not to have the outcome.
since the outcome for these patients might occur after the study ended, the actual incidence of mortality could have been underestimated. ideally, the moment of prediction and the study period should have been defined to allow sufficient follow-up time to measure the outcome in each patient. the study by yadaw and colleagues ticks a lot of boxes, 4 but it still struggles somewhat to break away from the overall negative picture painted by wynants and colleagues. 3 improvements can be achieved by more and better collaboration among researchers from different backgrounds, clinicians, and institutes, and by sharing of patient data from covid-19 studies and registries. then, and with improved reporting (by adherence to tripod criteria), validity, and quality (according to probast), prediction models can provide the decision support that is needed when covid-19 cases and hospital admissions again test the limits of the health-care system.

references:
1. fair allocation of scarce medical resources in the time of covid-19
2. prognosis and prognostic research: what, why, and how?
3. prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal
4. clinical features of covid-19 mortality: development and validation of a clinical prediction model
5. probast: a tool to assess the risk of bias and applicability of prediction model studies
6. calculating the sample size required for developing a clinical prediction model
7. calibration: the achilles heel of predictive analytics
8. sample size considerations for the performance assessment of predictive models: a simulation study
9. transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (tripod): the tripod statement
10. tunability: importance of hyperparameters of machine learning algorithms

key: cord-013784-zhgjmt2j authors: tang, min; xie, qi; gimple, ryan c.; zhong, zheng; tam, trevor; tian, jing; kidwell, reilly l.; wu, qiulian; prager, briana c.; qiu, zhixin; yu, aaron; zhu, zhe; mesci, pinar; jing, hui; schimelman, jacob; wang, pengrui; lee, derrick; lorenzini, michael h.; dixit, deobrat; zhao, linjie; bhargava, shruti; miller, tyler e.; wan, xueyi; tang, jing; sun, bingjie; cravatt, benjamin f.; muotri, alysson r.; chen, shaochen; rich, jeremy n. title: three-dimensional bioprinted glioblastoma microenvironments model cellular dependencies and immune interactions date: 2020-06-04 journal: cell res doi: 10.1038/s41422-020-0338-1 doc_id: 13784 cord_uid: zhgjmt2j

brain tumors are dynamic complex ecosystems with multiple cell types. to model the brain tumor microenvironment in a reproducible and scalable system, we developed a rapid three-dimensional (3d) bioprinting method to construct clinically relevant biomimetic tissue models. in recurrent glioblastoma, macrophages/microglia prominently contribute to the tumor mass.
to parse the function of macrophages in 3d, we compared the growth of glioblastoma stem cells (gscs) alone or with astrocytes and neural precursor cells in a hyaluronic acid-rich hydrogel, with or without macrophage. bioprinted constructs integrating macrophage recapitulate patient-derived transcriptional profiles predictive of patient survival, maintenance of stemness, invasion, and drug resistance. whole-genome crispr screening with bioprinted complex systems identified unique molecular dependencies in gscs, relative to sphere culture. multicellular bioprinted models serve as a scalable and physiologic platform to interrogate drug sensitivity, cellular crosstalk, invasion, context-specific functional dependencies, as well as immunologic interactions in a species-matched neural environment. brain tumors are complex tissues with multicomponent interactions between multiple cell types. 1 precision medicine efforts based solely on genomic alterations and molecular circuitries driving neoplastic cells have translated into relatively limited benefit in clinical practice for brain cancers, including glioblastoma, the most prevalent and lethal primary intrinsic brain tumor. crosstalk between neoplastic cells and the surrounding stroma contributes to tumor initiation, progression, and metastasis. however, most cancer research studies investigate cancer cells in isolation, cultured in non-physiologic adherent conditions containing species-mismatched serum. massive efforts have interrogated functional dependencies of cancer cell lines. [2] [3] [4] [5] while these studies provide valuable insights into cancer cell dependencies, they lack the capacity to investigate interactions of cancer cells with stromal cells or the microenvironment in an appropriate physiological context. 
patient-derived xenografts (pdxs) and genetically engineered mouse models are informative and can better recapitulate the genomic and transcriptomic profiles of patient brain tumors than two-dimensional (2d) culture. however, challenges with engraftment, the low throughput nature of animal experiments, and the lack of normal human cellular interactions, limit their broad applications in clinical settings. in tumors with significant immune cell involvement, such as glioblastoma, pdxs are limited as immunocompromised animals prevent investigation of immune cells in cancer biology. 6 methods to construct self-organizing three-dimensional (3d) coculture systems, termed organoids, have been developed to interrogate physiological and pathophysiological processes. 7, 8 in cancer research, organoid systems serve as models of colorectal cancer, 9, 10 breast cancer, 11, 12 hepatocellular and cholangiocarcinomas, 13 pancreatic cancers, 14 and glioblastomas, 15 among others. 16, 17 in glioblastoma, we first described organoid systems that recapitulate tumor architecture, microenvironmental gradients, and tumor cellular heterogeneity. 15 additional glioblastoma models utilize human-embryonic stem cell (hesc)-derived cerebral organoids to investigate interactions between glioblastoma stem cells (gscs) and normal brain components including infiltration, microenvironmental stimuli, and response to therapies. 18 however, organoid modeling is labor intensive, relatively low throughput, and highly variable in terms of cellular composition and structure due to the process of self-assembly. further development of tissue engineering approaches informs new 3d culture systems with improved scalability and capacity to tune specific biological parameters, including cellular composition and extracellular matrix stiffness. 
19 the development of physiologically relevant brain tumor microenvironments 20 requires careful consideration of the biophysical and biochemical properties of the matrix and cellular composition of specific tumor types, which can be achieved with recent advances in 3d bioprinting and biomaterials designed specifically for the bioprinting process. [21] [22] [23] [24] biocompatible scaffolds for tumor microenvironments include the naturally occurring extracellular matrix products chitosan-alginate (ca) 25 and hyaluronic acid (ha)-based hydrogels, 26, 27 but also synthetic polymers, including poly lactide-co-glycolide (plga), 28 and polyethylene-glycol (peg), 26 or polyacrylamide hydrogels. 29 3d printing with biocompatible materials is emerging to advance the fields of regenerative medicine and tissue modeling, 21 with notable relevance and applicability to cancer research. 22 3d bioprinting models microenvironmental interactions and drug sensitivities, 18 reciprocal interactions with macrophages, 23 and patient-specific screening tools in microfluidics-based systems. 24 among many 3d printing technologies, digital light processing (dlp)-based 3d bioprinting provides superior scalability and printing speed in addition to versatility and reproducibility. 30 several biomimetic tissue models have been developed using this technology, creating tissue-specific architecture and cellular composition that could be used for functional analyses, metastasis studies, and drug screening. 31, 32 here, we employ a rapid 3d bioprinting system and photocrosslinkable native ecm derivatives to create a biomimetic 3d cancer microenvironment for the highly lethal brain tumor, glioblastoma. the model is comprised of patient-derived gscs, macrophages, astrocytes, and neural stem cells (nscs) in a ha-rich hydrogel. one major microenvironmental feature of glioblastoma is the prominent infiltration of tumor masses by macrophage and microglia. 
in progressive or recurrent glioblastoma, macrophage and microglia account for a substantial fraction of the tumor bulk. using genetic depletion, co-implantation, and pharmacologic depletion, macrophage/microglia have been shown to be functionally important for glioblastoma growth, but each of these approaches may have broader effects beyond direct tumor cell-macrophage interactions. using our rapid 3d bioprinting platform, we can interrogate functional dependencies and multicellular interactions in a physiologically relevant manner.

dlp-based rapid 3d bioprinting generates glioblastoma tissue models

brain tumors are composed of numerous distinct populations of malignant and supporting stromal cells, and these complex cellular interactions are essential for tumor survival, growth, and progression. glioblastomas display high levels of intratumoral heterogeneity, with contributions from astrocytes, neurons, npcs, macrophage/microglia, and vascular components. to move beyond serum-free sphere culture-based models, we utilized a dlp-based rapid 3d bioprinting system to generate 3d tri-culture or tetra-culture glioblastoma tissue models, with a background "normal brain" made up of npcs and astrocytes and a tumor mass generated by gscs, with or without macrophage, using brain-specific extracellular matrix (ecm) materials (fig. 1a). leveraging this system with exquisite control of cellular constituents in specific locations, we selected macrophage for additional study, as we hypothesized that dlp-based 3d bioprinting could enable precise spatial arrangement of cells and matrix, and selection of any cell type. the key components of the bioprinting system were a digital micromirror device (dmd) chip and a motorized stage where prepolymer cell-material mixtures were sequentially loaded. the dmd chip with approximately 2 × 10^6 micromirrors controlled the light projection of the brain-shaped patterns onto the printing materials (fig. 1b).
the elliptical pattern corresponded to the core region and the coronal slice pattern corresponded to the peripheral region. each pattern was printed with 20 s of light exposure. in the 3d tri-culture model, a central tumor core composed of gscs was surrounded by a less dense population of astrocytes and npcs. in the 3d tetra-culture model, we mixed m2 macrophages with gscs within the central core to mimic the immune cell infiltrated tumor mass (fig. 1c). the ecm composition of the glioblastoma microenvironment was modeled with gelatin methacrylate (gelma) and glycidyl methacrylate-ha (gmha) hydrogels. cells were encapsulated into a material mixture of 4% gelma (at 95% degree of methacrylation) and 0.25% gmha (at 38% degree of methacrylation), which generated a hydrogel matrix that resembled glioblastoma tissue (supplementary information, fig. s1a, b). gelma has good biocompatibility and serves as a stiffness modulator that provided desirable mechanical properties and little intervention in biochemical cues. ha is the most abundant ecm component in healthy brain tissue and promotes glioblastoma progression, including regulating glioblastoma invasion through the receptor for hyaluronan-mediated motility (rhamm) and cd44, as well as other mechanical and topographical cues. 33 we used a physiologically relevant concentration of ha (0.25%) determined from clinical analysis of a diverse population of biopsy specimens from patients with different brain tumors. 34 while a range of molecular weight ha species is present in the brain, low molecular weight ha promotes gsc stemness and resistance. 33 thus, in this study, low molecular weight ha (200 kda) was used to synthesize gmha to model the pro-invasive brain tumor microenvironment. the mechanical properties of the model were characterized by the compressive modulus and pore sizes. the stiffness of the acellular hydrogel remained stable over a week of incubation at 37°c (data not shown).
the stiffness of the cell-encapsulated tumor core was 2.8 ± 0.6 kpa, while that of the less populated peripheral region containing npcs and astrocytes was 0.9 ± 0.2 kpa. the peripheral region stiffness was designed to match that of healthy brain tissue, reported to be ~1 kpa. glioblastoma displays enhanced migration and proliferation in stiffer materials. 33 the stiffness of the tumor core was modulated with the light exposure time during printing to have a higher modulus than the healthy region. the hydrogel had a porosity of 53% and an average pore size of 85 μm. with these microscale features, small molecules, such as drug molecules, freely diffuse through the matrix. cells closely interacted with other cells and the matrix (fig. 1d). at a macro scale, the model had a thickness of 1 mm and was 4.4 mm by 3.6 mm in width and length, which allowed gradients of oxygen and nutrient diffusion to be formed within the tissue. cells were precisely printed into two prearranged regions to provide more physiologically relevant features: a non-neoplastic peripheral region composed of npcs and astrocytes surrounding a tumor core composed of either gscs alone or gscs with macrophage (fig. 1e). following optimization for cell density (supplementary information, fig. s2a, b), the tumor core in the 3d tri-culture consisted of 2.5 × 10^7 gscs/ml, while the tetra-culture tumor core contained 2.5 × 10^7 gscs/ml and 1.25 × 10^7 macrophages/ml.

3d bioprinted models recapitulate glioblastoma transcriptional profiles

traditionally grown cell lines have been extensively characterized in glioblastoma, revealing that these conditions fail to replicate patient tumors in cellular phenotypes (e.g., invasion) or transcriptional profiles.
35 while patient-derived glioblastoma cells grown under serum-free conditions enrich for stem-like tumor cells (gscs) that form spheres and more closely replicate transcriptional profiles and invasive potential than standard culture conditions, we previously demonstrated that spheres display differential transcriptional profiles and cellular dependencies in an rna interference screen compared to in vivo xenografts. 36 based on this background, we interrogated the transcriptional profiles from a large cohort of patient-derived gscs grown in serum-free, sphere cell culture that we recently reported. 37 gscs grown as spheres were transcriptionally distinct from primary glioblastoma surgical resection tissue specimens, when compared through either principal component analysis (pca) or uniform manifold approximation and projection (umap) (fig. 2a, b). to determine whether the 3d bioprinted culture systems more closely resemble primary glioblastoma tumors, we performed global transcriptional profiling through rna extraction followed by next-generation sequencing (rna-seq) on gscs isolated from the bioprinted models and on gscs in sphere culture (fig. 2c). upregulation of a core set of glioblastoma tissue-specific genes defined a "glioblastoma tissue" gene signature (fig. 2d). when compared to gscs grown in sphere culture, the tetra-culture bioprinted model displayed upregulation of the glioblastoma tissue-specific gene set (fig. 2e), suggesting that the bioprinted model recapitulates transcriptional states present in patient-derived glioblastoma tissues. gscs in 3d tetra-culture displayed upregulation of genes specifically expressed in orthotopic intracranial xenografts (fig. 2f, g) and, to a lesser extent, genes specifically expressed in subcutaneous flank xenografts (supplementary information, fig. s2c) compared to sphere culture. additionally, signatures that distinguish gscs from their differentiated counterparts were upregulated in the tetra-culture system compared to sphere culture (fig. 2h, i), suggesting that the physiologic tissue environment promotes stem-like transcriptional states. we further interrogated the gene expression profiles that distinguish gscs grown in sphere culture from the 3d tetra-culture bioprinted models (fig. 3a). while cells grown in sphere culture displayed enrichment for gene sets involved in ion transport, protein localization, and vesicle membrane function, cells in the tetra-culture 3d model displayed transcriptional upregulation of cell adhesion, extracellular matrix, cell and structure morphogenesis, angiogenesis, and hypoxia signatures (fig. s3b). hypoxia response genes, ca9, ndrg1, angptl4, and egln family members, were upregulated in the tetra-culture system, while various ion transporters, including slc25a48 and slc6a9, were downregulated (fig. 3d, e).

fig. 1 3d bioprinting enables generation of glioblastoma tri-culture and tetra-culture tissue environment model. a schematic diagram of the in vitro 3d glioblastoma model containing gscs, macrophages, astrocytes, and neural stem cells (nscs). b schematic diagram of the digital micromirror device (dmd) chip-based 3d bioprinting system used to produce the 3d glioblastoma model. c diagram of the tri-culture (left) and tetra-culture (right) model system. d (left) scanning electron microscope (sem) images of the acellular glycidyl methacrylate-hyaluronic acid and gelatin methacrylate extracellular matrix. (center and right) sem images of the cells encapsulated in the extracellular matrix. scale bars, 200 μm (left), 10 μm (center), and 2 μm (right). e brightfield and immunofluorescence images of the tri-culture and tetra-culture 3d glioblastoma models. gscs are labeled with green fluorescent protein (gfp) while macrophages are labeled with mcherry. nuclei are stained with dapi. scale bars, 1 mm.
by qpcr, gscs isolated from either 3d system 10 days after printing displayed elevated levels of the stemness marker olig2 and decreased levels of the differentiation markers map2 and tuj1 compared to their sphere counterparts grown in parallel (fig. 3f). additionally, gsc levels of map2 and tuj1 were decreased to a greater degree in tetra-culture (i.e., with macrophage) compared to tri-culture. we further evaluated the protein expression of stemness, hypoxia, and proliferative markers in the tetra-culture system compared to sphere culture. the hypoxia marker ca9 was upregulated in the tetra-culture model compared to sphere culture (fig. 3g). the heightened hypoxia level more closely resembled pathologic in vivo conditions, in which the tumor core had higher hypoxia marker expression compared to the peripheral region of neurons and astrocytes. in the 3d culture model, cells also showed increased levels of the proliferative marker ki67 and increased protein expression of the stemness markers olig2 and sox2 (fig. 3h-j).

macrophages promote hypoxic and invasive signatures in bioprinted models

to understand the relative contributions of each cell type incorporated into bioprinted models, we performed rna-seq on gscs derived from tri-cultures and tetra-cultures. given that thp1-derived macrophages display distinct expression profiles from primary macrophages, we built tetra-cultures containing thp1-derived macrophage, human induced pluripotent stem cell (hipsc)-derived macrophage generated from an established protocol, 38 and primary human volunteer-derived macrophage. both hipsc-derived macrophage and primary macrophage integrated into the tetra-culture models. umap clustering revealed that the transcriptional outputs of sphere cultured gscs are distinct from those of gscs in bioprinted models (fig. 4a, b).
concordantly, we detected differentially expressed genes between sphere cultured cells and any of the bioprinted models (757-968 differentially expressed genes), while there were fewer genes that distinguished the bioprinted models from each other (39-59 differentially expressed genes) (fig. 4c). bioprinted models were characterized by activation of invasion, extracellular matrix, cell surface interaction, and hypoxia signatures, while gscs in sphere culture expressed cell cycle, dna replication, rna processing, and mitochondrial translation signatures (supplementary information). we next investigated differentially expressed pathways between bioprinted models to interrogate the contributions of cellular components. tri-culture-derived gscs upregulated extracellular matrix and biological adhesion pathways compared to gscs in sphere culture (supplementary information, fig. s6a-e). addition of macrophage further increased activation of hypoxia and glycolytic metabolism signatures, with enrichment for invasiveness signatures (fig. 4d-h). tetra-cultures constructed with hipsc-derived macrophage expressed higher levels of extracellular matrix and wound healing and platelet activation signatures and decreased levels of neuron and glial development and differentiation pathways compared to tetra-cultures containing thp1-derived macrophages (supplementary information, fig. s7a, b). incorporation of primary human macrophages did not affect levels of ki67 or sox2 compared to use of thp1-derived cells (supplementary information, fig. s7c, d). consistent with our previous findings, use of hipsc-derived macrophages reduced gsc expression of map2 and tuj1 differentiation markers and increased expression of ca9 and ndrg1 hypoxia markers (supplementary information, fig. s7e). taken together, gscs upregulate extracellular matrix interaction signatures in response to growth in a bioprinted model.
the addition of macrophage further accentuates these gene activation signatures and increases activation of hypoxia and pro-invasive transcriptional profiles. 3d bioprinted tissues model complex cellular interactions and migration interactions between malignant cells and stromal components shape tumor tissue with each cell type impacting the other tissue components. to understand these changes, we investigated how macrophage responded to the 3d brain tumor microenvironment by isolating thp1-derived macrophages from 3d bioprinted constructs and performing rna-seq (fig. 5a, b) . for the 3d printed tissue, macrophage were mixed with gscs at a 1:2 ratio to form the tumor core, while the periphery was formed by astrocytes and npcs using the same composition described previously. the transcriptional output of macrophage grown in traditional culture displayed enrichment for prc2 complex targets, amino acid biosynthesis, protein metabolism signatures and ribosomal pathways, while macrophage exposed to gscs in the fig. 2 3d tetra-culture models better recapitulate transcriptional signatures found in glioblastoma tissues than standard sphere culture. a pca of the global transcriptional landscape of glioma stem cells in culture (gscs in culture, n = 40) vs primary glioblastoma surgical resection tissues (gbm tissue, n = 34) as defined by rna-seq. the top 5000 differential genes were used for the analysis. data was derived from mack et al. 37 b umap of the global transcriptional landscape of glioma stem cells in culture (gscs in culture, n = 40) vs primary glioblastoma surgical resection tissues (gbm tissue, n = 34) as defined by rna-seq. analysis parameters include: sample size of local neighborhood, number of neighbors = 40; learning rate = 0.5; initialization of low dimensional embedding = random; metrics for computation of distance in high dimensional space = manhattan. data was derived from mack et al. 
37 c schematic diagram of experimental approach for gsc rna-seq experiments. d volcano plot of transcriptional landscape profiled by rna-seq comparing gscs in sphere culture (n = 40) vs glioblastoma primary surgical resection tissues (n = 34). the x-axis depicts the log transformed fold change, while the y-axis shows the log transformed p value adjusted for multiple test correction. e gene set enrichment analysis (gsea) of the glioblastoma tissue vs cell culture signature as defined in d when applied to rna-seq data comparing the 3d tetra-culture system with sphere cell culture. f volcano plot of transcriptional landscape profiled by rna-seq comparing gscs in sphere culture (n = 2 biological samples with 2 technical replicates each) vs matched orthotopic intracranial xenograft specimens (n = 2 biological samples with 2 technical replicates each). the x-axis depicts the log transformed fold change, while the y-axis shows the log transformed p value adjusted for multiple test correction. data was derived from miller et al. 36 g gsea of the glioblastoma tissue vs cell culture signature as defined in f when applied to rna-seq data comparing the 3d tetraculture system with sphere cell culture. h volcano plot of transcriptional landscape profiled by rna-seq comparing gscs in sphere culture (n = 3 biological samples with 3 technical replicates each) vs differentiated glioma cells (dgcs) in sphere culture (n = 3 biological samples with 3 technical replicates each). the x-axis depicts the log transformed fold change, while the y-axis shows the log transformed p value adjusted for multiple test correction. data was derived from suva et al. 76 i gsea of the glioblastoma tissue vs cell culture signature as defined in h when applied to rna-seq data comparing the 3d tetra-culture system with sphere cell culture. 
bioprinted construct showed elevation of pathways involved in leukocyte activation and innate immune response, cytokine signaling and inflammatory responses, and tlr-stimulated signatures (fig. 5c; supplementary information, fig. s8a-d). defense response genes, including ch14, pla2g7, and alox5, were upregulated in macrophage derived from the tetra-culture system, while genes involved in amino acid restriction, including il18, cd37, and vldlr, were downregulated (fig. 5d, e). m2 macrophage-related markers were upregulated in the 3d tetra-cultures, with cd163 increased by 37-fold and il-10 increased by 17-fold compared to traditional suspension culture, as measured by qpcr. m1-related markers, including tnf-α and nos2, did not increase, demonstrating that the 3d printed microenvironment preferentially polarized macrophage towards the m2 phenotype (fig. 5f). this is consistent with the m2 polarization of macrophage in glioblastoma tumors. 39, 40 gene expression signatures defining peripherally-derived tumor-associated macrophage in glioma 41, 42 were selectively enriched in macrophage derived from tetra-culture models compared to those grown in 2d culture (supplementary information, fig. s9). collectively, macrophage grown in our 3d bioprinted tetra-culture model expressed gene expression signatures consistent with patient-derived tumor-associated macrophage. we interrogated the functional consequences of the addition of immune components to the 3d bioprinted model. in four patient-derived gscs spanning three major glioblastoma transcriptional subtypes (proneural, classical, and mesenchymal), the addition of thp1-derived m2 macrophage increased gsc invasion into the surrounding brain-like parenchyma (fig. 5g-j). consistent with our gene expression analyses, m2 macrophage increased the area of invasion by 20% for cw468, 60% for gsc23, 41% for gsc3264, and 30% for gsc2907.
collectively, these results support the tetra-culture model as an effective tool to study cancer cell invasion and the mechanisms by which cellular interactions impinge upon these processes. as numerous stromal compartments, including neural progenitor cells, astrocytes, and neurons, [43] [44] [45] interact with glioblastoma cells within patient tumors, we interrogated the effects of the bioprinted model on neuronal and oligodendrocyte differentiation of the non-neoplastic npcs. in 2d culture, most npcs expressed the proliferative npc marker sox2. the high expression and frequency of sox2 was retained in tri-cultures and tetra-cultures containing macrophage derived from thp1 cells or primary human macrophage (supplementary information, fig. s10a). in 2d culture, npcs expressed the neuronal marker tubb3, but retained a progenitor-like cellular morphology. in bioprinted models, npcs adopted a neuronal morphology with the appearance of elongated cellular projections (supplementary information, fig. s10b). expression of map2 was reduced in npcs in bioprinted models compared to 2d culture (supplementary information, fig. s11a). olig2 staining revealed oligodendrocyte-like cells in tri-cultures (supplementary information, fig. s11b). taken together, npcs partially differentiate in our bioprinted system, but are unlikely to form mature functional neurons or oligodendrocytes. the 3d bioprinted model serves as a platform for drug response modeling. we next investigated the ability of our 3d bioprinted constructs to model drug responses and the capacity for cellular interactions within the 3d bioprinted constructs to affect drug sensitivity of gscs. fluorescent dextran molecules (4 kda) modeled drug penetration into 3d bioprinted models. 31, 46 dextran molecules rapidly entered bioprinted constructs when the hydrogel was soaked in a dextran solution, with rapid increases in average fluorescence intensity measured from the hydrogel.
the fluorescence intensity plateaued after 30 min of incubation and displayed a uniform spatial intensity across the hydrogel, demonstrating that drug compounds can effectively permeate the 3d bioprinted model (fig. 6a-c). egfr is commonly amplified, overexpressed, or mutated in glioblastoma, so we evaluated the treatment efficacy of two egfr inhibitors, erlotinib and gefitinib, and the glioblastoma standard-of-care alkylating agent temozolomide in our models. 3d tri-cultures and tetra-cultures were cultured for 5 days before drug treatment. despite activated egfr in glioblastomas, egfr inhibitors have shown little benefit for glioblastoma patients. gsc23 in either 3d model displayed enhanced resistance to egfr inhibitors and temozolomide compared to sphere culture. inclusion of m2 macrophage further increased resistance of gsc23 to egfr inhibitors. glioblastomas are highly lethal cancers for which current therapy is palliative. 47, 48 therefore, we explored the potential utility of 3d bioprinted systems to inform drug responses in glioblastoma. overlaying gene expression data from the 3d tetra-culture model with drug sensitivity and gene expression data from the cancer cell line encyclopedia (ccle) and the cancer therapeutic response platform (ctrp) enabled prediction of drug sensitivity and resistance in our 3d tetra-culture model based on transcriptional signatures (fig. 6f). [49] [50] [51] consistent with our studies of erlotinib, gefitinib, and temozolomide, high expression of genes upregulated in gscs in the 3d tetra-culture model was predicted to be associated with drug resistance for the majority of compounds across all cancer cell lines tested (fig. 6g) or when restricted to brain cancer cell lines (supplementary information, fig. s15a).
drugs predicted to be ineffective included gsk-j4 (jmjd3/kdm6b inhibitor), cytarabine (nucleotide antimetabolite), and decitabine (dna methyltransferase inhibitor), while drugs predicted to be effective included abiraterone (cyp17a1 inhibitor), fig. 3 gscs grown in 3d tetra-culture models upregulate transcriptional signatures of cellular interaction, hypoxia, and cancer stem cells. a volcano plot of transcriptional landscape profiled by rna-seq comparing the cw468 gsc grown in standard sphere culture vs gscs in the 3d tetra-culture model. the x-axis depicts the log transformed fold change, while the y-axis shows the log transformed p value adjusted for multiple test correction. n = 2 technical replicates per condition. b pathway gene set enrichment connectivity diagram displaying pathways enriched among gene sets upregulated (red) and downregulated (blue) in gscs in the 3d tetra-culture system vs standard sphere culture. c normalized single sample gene set enrichment analysis (ssgsea) scores of glioblastoma transcriptional subtypes as previously defined 81 for the cw468 gsc when grown in standard sphere culture vs gscs in the 3d tetra-culture model. bars are centered at the mean value and error bars represent standard deviation. d mrna expression of representative genes in hypoxia response pathways between standard sphere culture vs gscs in the 3d tetra-culture model as defined by rna-seq. p values were calculated using deseq2 75 with a wald test with benjamini and hochberg correction. ****p < 1e−5. bars are centered at the mean value and error bars represent standard deviation. e mrna expression of representative genes in ion transport pathways between standard sphere culture vs gscs in the 3d tetra-culture model as defined by rna-seq. p values were calculated using deseq2 75 with a wald test with benjamini and hochberg correction. ****p < 1e−5. bars are centered at the mean value and error bars represent standard deviation.
f mrna expression of stem cell and differentiation markers between standard sphere culture vs gscs in the 3d tetra-culture model as defined by quantitative pcr (qpcr). three technical replicates were used and ordinary two-way anova with dunnett multiple comparison test was used for statistical analysis, *p < 0.05; **p < 0.01; ***p < 0.001. bars indicate mean, with error bars showing standard deviation. g immunofluorescence staining of ca9 in cells grown in standard sphere culture (top) vs gscs in the 3d tetra-culture model (bottom). scale bars, 50 μm. h immunofluorescence staining of ki67 in cells grown in standard sphere culture (top) vs gscs in the 3d tetra-culture model (bottom). scale bars, 50 μm. i immunofluorescence staining of olig2 in cells grown in standard sphere culture (top) vs gscs in the 3d tetra-culture model (bottom). scale bars, 50 μm. j immunofluorescence staining of sox2 in cells grown in standard sphere culture (top) vs gscs in the 3d tetra-culture model (bottom). scale bars, 50 μm. fig. 4 addition of macrophages activates extracellular matrix and invasiveness signatures. a umap analysis of rna-seq data from gscs grown in (1) sphere culture, (2) tri-culture, (3) tetra-culture with thp1-derived macrophage, and (4) tetra-culture with hipsc-derived macrophages. b heatmap displaying mrna expression of differentially expressed genes between conditions. c upset plot showing the number of differentially expressed genes between conditions. for conditions containing sphere cultured cells, genes were considered differentially expressed if the log2 fold change of mrna expression was greater than 0.5 (or < −0.5) with an adjusted p value of 1e−0. for other conditions, genes were considered differentially expressed if the log2 fold change of mrna expression was greater than 0.5 (or < −0.5) with an adjusted p value of 1e−5. 
d volcano plot of transcriptional landscapes profiled by rna-seq comparing the cw468 gsc grown in tetra-culture containing thp1-derived macrophages vs gscs in the tri-culture model. the x-axis depicts the log transformed fold change, while the y-axis shows the log transformed p value adjusted for multiple test correction. n = 2 technical replicates per condition. e pathway gene set enrichment connectivity diagram displaying pathways enriched among gene sets upregulated (red) and downregulated (orange) in gscs in the 3d tetra-culture system vs tri-culture system. f gsea of the extracellular matrix structural constituent pathway between tetra-culture and tri-culture models. fdr q value = 0.008. g gsea of the anastassiou multicancer invasiveness pathway between tetra-culture and tri-culture models. fdr q value = 0.02. h gene set enrichment analysis (gsea) of the collagen degradation pathway between tetra-culture and tri-culture models. fdr q value = 0.02. vemurafenib and plx-4720 (raf inhibitors), ml334 (nrf2 activator), and ifosfamide (alkylating agent) (fig. 6g-j). the drug sensitivity predictions were similar, but not entirely overlapping, when a glioblastoma orthotopic xenograft expression signature was used (supplementary information, fig. s15b). investigation of the library of integrated network-based cellular signatures (lincs) dataset 52 showed that compounds predicted to recapitulate the 3d tetra-culture signature included hypoxia inducible factor activators, caspase activators, and hdac inhibitors, while raf inhibitors and immunosuppressive agents may impair expression of this gene signature (supplementary information, fig. s15c). these findings suggest that interactions with the local microenvironment affect gsc sensitivity to therapeutic compounds and that the 3d bioprinted tissue model can interrogate these context-dependent effects.
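the ccle/ctrp overlay described above amounts to scoring each cell line for the tetra-culture signature and correlating that score with each compound's area under the curve (auc) across lines; compounds whose auc rises with the signature are predicted resistant, and compounds are ranked by that correlation. a sketch of this ranking logic with simulated data (all names and values are hypothetical):

```python
import numpy as np

def signature_score(expr, sig_genes, gene_names):
    """Mean expression of the signature genes per cell line (rows = lines)."""
    idx = [gene_names.index(g) for g in sig_genes]
    return expr[:, idx].mean(axis=1)

def rank_compounds(score, auc_by_drug):
    """Pearson r between signature score and each drug's AUC across lines;
    sorted ascending, so predicted-sensitive compounds come first."""
    r = {d: float(np.corrcoef(score, auc)[0, 1]) for d, auc in auc_by_drug.items()}
    return sorted(r.items(), key=lambda kv: kv[1])

rng = np.random.default_rng(1)
gene_names = [f"G{i}" for i in range(20)]
expr = rng.normal(size=(30, 20))  # 30 hypothetical cell lines
score = signature_score(expr, ["G0", "G1"], gene_names)
auc_by_drug = {
    "sensitive_drug": -score + rng.normal(scale=0.1, size=30),
    "resistant_drug": score + rng.normal(scale=0.1, size=30),
}
ranked = rank_compounds(score, auc_by_drug)
print(ranked[0][0])  # sensitive_drug (most negative correlation)
```

a positive correlation, as reported for the majority of compounds, corresponds to signature-high lines needing higher doses, i.e., predicted resistance.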
further, as the tetra-culture model expresses genes associated with poor sensitivity to a variety of therapeutic compounds, this system may be a more realistic model for drug discovery in glioblastoma. to validate these predictions, we treated gscs with three of the predicted compounds, abiraterone, vemurafenib, and ifosfamide in triculture and tetra-culture bioprinted models. when treated at the sphere culture ic 50 value (supplementary information, fig. s15d-f ), gscs in tetra-culture displayed enhanced sensitivity to abiraterone and ifosfamide compared to gscs in tri-culture, while sensitivity to vemurafenib was unchanged ( fig. 6i-k) . this suggests that abiraterone and ifosfamide may be effective in targeting tetra-culture derived gscs. further validating these findings in an in vivo subcutaneous glioblastoma xenograft model, ifosfamide therapy reduced tumor growth compared to vehicle (supplementary information, figs. s16a-c). 3d bioprinted tissues uncover novel context-dependent essential pathways and serve as a platform for crispr screening given widespread therapeutic resistance in glioblastoma, we leveraged the 3d bioprinted construct as a discovery platform for glioblastoma dependencies. parallel whole-genome crispr-cas9 loss-of-function screening was performed in gscs in sphere culture as well as in the 3d tetra-culture system ( fig. 7a; supplementary information, fig. s17 ). functional dependencies segregated gscs based on their method of growth ( fig. 7b; supplementary information, fig. s17f ). guide rnas were enriched (indicating that the targeted gene enhances viability when deleted) or depleted (indicating that the targeted gene reduces cell viability when deleted) in each platform (fig. 7c, d) . 
genes essential in each context, as well as pan-essential genes common to both platforms, included core pathways involved in translation, ribosome functions, and rna processing, cell cycle regulation, protein localization, and chromosomes and dna repair ( fig. 7e; supplementary information, fig. s17g, h) . gene hits were stratified to identify context-specific dependencies (fig. 7f) . genes selectively essential in sphere culture were enriched for cell cycle, endoplasmic reticulum, golgi and glycosylation, lipid metabolism, and response to oxygen pathways. gscs grown in the 3d tetra-culture model were more dependent on transcription factor activity, cell development and differentiation, nf-κb signaling, and immune regulation pathways (fig. 7g-k) . thus, the 3d bioprinted model allowed for interrogation of functional dependencies of brain tumor cells in physiological settings and in combination with stromal fractions and revealed a more complex functional dependency network than that observed in sphere culture. to further validate 3d bioprinted-specific dependencies, we stratified our whole-genome crispr screening results, selecting genes predicted to be essential in 3d tetra-culture (fig. 8a, b) . individual gene knockout in luciferase-labeled gscs of pag1, znf830, atp5h, and rnf19a with two independent sgrnas reduced gsc viability in both sphere culture and 3d tetra-culture models (fig. 8c-m) . additionally, knockout of pag1 or znf830 in gscs delayed the onset of neurological signs in orthotopic glioblastoma xenografts compared to gscs treated with a nontargeting sgrna (fig. 8n-q) . pag1 and znf830 are upregulated at the mrna level in glioblastomas compared to normal brain tissue and high expression is associated with poor patient prognosis in primary glioblastomas from the chinese glioma genome atlas (cgga) dataset, highlighting the clinical relevance of these factors in glioblastoma (supplementary information, fig. s18a-d) . 
taken together, this screening approach has identified novel candidates for future investigation and potential therapeutic development. 3d bioprinted cultures express transcriptional signatures associated with poor glioblastoma patient prognosis. to determine the clinical relevance of the 3d bioprinted construct, we investigated the transcriptional profiles relative to glioblastoma patients. signatures of genes upregulated either in intracranial orthotopic xenografts or in 3d tetra-culture compared to sphere culture were elevated in glioblastomas compared to low-grade gliomas in the cancer genome atlas (tcga), cgga, and the rembrandt dataset (fig. 9a-d). the 3d tetra-culture gene signature was elevated in recurrent glioblastomas compared to primary tumors (fig. 9e) and in the mesenchymal subtype compared to classical or proneural glioblastomas (fig. 9f). in the tcga and cgga datasets, the orthotopic xenograft signature and the 3d tetra-culture signature were associated with poor glioblastoma patient prognosis (fig. 9g-j). many genes with individual poor prognostic significance were upregulated in the intracranial xenograft signature, including chi3l2, postn, and ndrg1 (fig. 9k), while dennd2a, maob, and igfbp2 were upregulated in the 3d bioprinted cultures (fig. 9l). genes with poor prognostic significance were enriched among all genes in the 3d tetra-culture signature, when compared to a background of all genes (fig. 9m). thus, 3d bioprinting enabled investigation of gene pathways associated with more aggressive glioblastomas, suggesting that this model can serve as a more realistic therapeutic discovery platform for the most lethal classes of glioblastoma. fig. 5 macrophages grown in 3d tetra-culture models upregulate immune activation signatures, increase m2 polarization, and promote gsc invasion. a schematic diagram of experimental approach for macrophage rna-seq experiments.
b volcano plot of transcriptional landscape profiled by rna-seq comparing macrophages grown in standard sphere culture vs macrophages in the 3d tetra-culture model. the x-axis depicts the log transformed fold change, while the y-axis shows the log transformed p value adjusted for multiple test correction. c pathway gene set enrichment connectivity diagram displaying pathways enriched among gene sets upregulated (red) and downregulated (blue) in macrophages in the 3d tetra-culture system vs standard sphere culture. d mrna expression of representative genes in defense response and macrophage function pathways between standard sphere culture vs macrophages in the 3d tetra-culture model as defined by rna-seq. p values were calculated using deseq2 75 with a wald test with benjamini and hochberg correction. ****p < 1e−20. bars are centered at the mean value and error bars represent standard deviation. e mrna expression of representative genes in amino acid deprivation pathways between standard sphere culture vs macrophages in the 3d tetra-culture model as defined by rna-seq. p values were calculated using deseq2 75 with a wald test with benjamini and hochberg correction. ****p < 1e−20. bars are centered at the mean value and error bars represent standard deviation. f mrna expression of m1 and m2 macrophage polarization markers between standard sphere culture vs macrophages in the 3d tetra-culture model as defined by qpcr. three technical replicates were used and ordinary two-way anova with dunnett multiple comparison test was used for statistical analysis, ***p < 0.001; ****p < 0.0001. bars indicate mean, with error bars showing standard deviation. g fluorescence imaging of cw468 gscs (green) and macrophages (red) grown in the 3d tri-culture model without macrophages (top) vs the 3d tetra-culture model with macrophages (bottom). scale bars, 1 mm. 
h fluorescence imaging of 2907 gscs (green) and macrophages (red) grown in the 3d tri-culture model without macrophages (top) vs the 3d tetra-culture model with macrophages (bottom). scale bars, 1 mm. i fluorescence imaging of gsc23 gscs (green) and macrophages (red) grown in the 3d tri-culture model without macrophages (top) vs the 3d tetra-culture model with macrophages (bottom). scale bars, 1 mm. j fluorescence imaging of 3264 gscs (green) and macrophages (red) grown in the 3d tri-culture model without macrophages (top) vs the 3d tetra-culture model with macrophages (bottom). scale bars, 1 mm. to improve modeling of a highly lethal brain cancer for which current therapies are limited, we utilized a dlp-based 3d bioprinting system to model glioblastoma, the most common and lethal type of brain tumor. studies have reported using 3d printing to create coculture models of glioblastoma cells with other stromal cells or fabricate ha-based hydrogel to mimic brain ecm. 23, 24, 53 however, most prior models focused on only one aspect of the in vivo situation or used non-human cells, which reduced their capacity to be applied to actual clinical settings. to the best of our knowledge, this is the first report of a human cell-based 3d glioblastoma model that recapitulates the complex tumor microenvironment with inclusion of normal brain, immune components, stromal components, and essential mechanical and biochemical cues from the extracellular matrix. the tumor microenvironment provides essential signals to guide tumor growth and survival; however, these cues are inefficiently modeled in standard 2d culture, even in the absence of serum. hypoxic signaling contributes to glioblastoma aggressiveness by remodeling gsc phenotypes. 54, 55 our 3d tetra-culture brain tumor model expressed hypoxia response signatures, allowing for investigation of hypoxic signaling in a physiologic environment, unlike standard cell culture systems.
critical growth factor signaling elements are provided from neurons, [43] [44] [45] 56, 57 npcs, 58 ecm components, 59,60 and immune fractions, including macrophages. 61, 62 the perivascular niche provides a variety of signals including wnts, 63 ephrins, 64 and osteopontins 65 to promote glioblastoma invasion, growth, and maintenance of gscs. future studies will be required to integrate vascular components into the 3d printed model system to further study these important components of the brain tumor microenvironment. the 3d tetra-culture tissue environment presented here enables controlled, reproducible, and scalable interrogation of these various cellular interactions that drive brain tumor biology. while microenvironmental components supply critical niche factors to sustain the tumor ecosystem, stromal elements are also actively remodeled by malignant cells. 66 here, we observed the role of immune cells in glioblastoma growth, including changes in gene expression, invasive behaviors, and response to treatments. reciprocally, we also found that the 3d glioblastoma microenvironment promoted polarization of macrophages towards a protumoral m2 macrophage phenotype, highlighting this bidirectional crosstalk. the bioprinting approach generates a spatially separated tumor region and surrounding non-neoplastic neural tissue with defined cell density, which allows the cells to interact in a more realistic manner, providing a highly reproducible platform for the interrogation of cell-cell interactions with several key advantages. first, this 3d glioblastoma tissue model allows for investigation of tumor-immune interactions in a fully human species-matched system, which is not possible in xenograft or genetically engineered mouse models. this may facilitate understanding of human-specific immune interactions and advance the field of neuro-oncoimmunology by providing insights into immunotherapy efficacy.
second, combining tumoral and non-neoplastic neural components within one model will propel drug discovery efforts by enabling measurements of therapeutic efficacy, toxicities, and therapeutic index. the scalability and reproducibility of this 3d bioprinted model also allows for more high-throughput compound screening efforts. our findings suggest that the 3d bioprinted model displays transcriptional signatures closer to patient-derived glioblastoma tissue, and that local stromal interactions present within our model promote broad therapeutic resistance, enabling compound discovery efforts in a challenging environment. third, the 3d bioprinted model is amenable to large-scale whole-genome crispr-cas9-based screening methods to uncover novel functional dependencies in a physiologic setting. this model extends previous approaches by characterizing context-dependent target essentiality in cancer cells and allowing for investigation of multivalent stromal cell dependencies. in conclusion, we report a controlled, reproducible, and scalable 3d engineered glioblastoma tissue construct that serves as a more physiologically accurate brain tumor model, facilitates interrogation of the multicellular interactions that drive brain tumor biology, and acts as a platform for discovery of novel functional dependencies. gelma and gmha synthesis and characterization. gelma and gmha were synthesized using type a, gel strength 300 gelatin from porcine skin (sigma aldrich cat #: g2500) and 200,000 da hyaluronic acid (lifecore), respectively, as described previously. 67, 68 briefly, for the gelma synthesis of 95% degree of methacrylation, 10% (w/v) gelatin was dissolved in 0.25 m 3:7 carbonate-bicarbonate buffer solution (ph~9) at 50°c. methacrylic anhydride was added dropwise at a volume of 0.1 ml per gram of gelatin. the reaction was left to run for 1 h at 50°c. after synthesis, the solutions were dialyzed, frozen overnight at −80°c, and lyophilized.
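scaling the gelma recipe above (gelatin dissolved at 10% (w/v); methacrylic anhydride added at 0.1 ml per gram of gelatin) is simple arithmetic; a sketch with an illustrative 25 g batch (the function name is ours, not from the paper):

```python
def gelma_reagents(gelatin_g, w_over_v=0.10, ma_ml_per_g=0.1):
    """Scale the 10% (w/v) gelatin dissolution and the 0.1 ml/g
    methacrylic anhydride addition to an arbitrary gelatin mass."""
    buffer_ml = gelatin_g / w_over_v  # 10% w/v means 10 g per 100 ml
    ma_ml = ma_ml_per_g * gelatin_g
    return buffer_ml, ma_ml

# e.g., a hypothetical 25 g gelatin batch
buffer_ml, ma_ml = gelma_reagents(25.0)
print(buffer_ml, ma_ml)  # 250.0 ml buffer, 2.5 ml methacrylic anhydride
```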
freeze-dried gelma and gmha were stored at −80°c and reconstituted immediately before printing to stock solutions of 20% (w/v) and 4% (w/v), respectively. all materials were sterilized by syringe filters before mixing with cells (millipore). cell culture. xenografted tumors were dissociated using a papain dissociation system according to the manufacturer's instructions. gscs were then cultured in neurobasal medium supplemented with 2% b27, fig. 6 3d bioprinting enables a drug discovery platform and microenvironmental interactions contribute to drug resistance. a (top) schematic diagram of drug diffusion experiment. (bottom) images of fitc-dextran diffusion through the 3d hydrogel over a time course. scale bars, 1 mm. b average intensity of fitc-dextran signal through the 3d tetra-culture model over a time course. three replicates were used. bars indicate mean with error bars showing standard deviation. ordinary one-way anova with tukey correction for multiple comparisons was used for statistical analysis. c spatial intensity of fitc-dextran signal through the 3d tetra-culture model over a time course. d cell viability of the gsc23 gsc following treatment with the egfr inhibitors, erlotinib and gefitinib, and the alkylating agent temozolomide (tmz) in standard sphere culture conditions, the 3d tri-culture model, and the 3d tetra-culture model. three replicates were used, ordinary two-way anova with dunnett multiple test correction was used for statistical analysis. bars indicate mean, while error bars show standard deviation. **p < 0.01; ****p < 0.0001. e cell viability of the cw468 gsc following treatment with the egfr inhibitors, erlotinib and gefitinib, and the alkylating agent tmz in standard sphere culture conditions, the 3d tri-culture model, and the 3d tetra-culture model.
three replicates were used, ordinary two-way anova with dunnett multiple test correction was used for statistical analysis. bars indicate mean, while error bars show standard deviation. **p < 0.01; ***p < 0.001; ****p < 0.0001. f schematic diagram of process to determine drug sensitivity based on the 3d tetra-culture gene expression signature from the ccle and ctrp datasets. [49] [50] [51] g therapeutic efficacy prediction of drugs in all cancer cells in the ctrp dataset based on differentially expressed genes between the 3d tetra-culture model and gscs grown in sphere culture as defined by rna-seq. h correlation of (top) abiraterone and (bottom) gsk-j4 sensitivities based on the 3d tetra-culture signature expression across all cancer cell lines in the ccle dataset. compounds are ranked based on the correlation between the tetra-culture gene expression signature and compound area under the curve (auc). i normalized cell viability of gscs in tri-culture and tetra-culture models following treatment with 15 μm of abiraterone. ***p < 0.001. bar shows mean of six technical replicates and error bars indicate standard deviation. unpaired two-tailed t-test was used for statistical analysis. j normalized cell viability of gscs in tri-culture and tetra-culture models following treatment with 25 μm of vemurafenib. ns, p > 0.05. bar shows mean of six technical replicates and error bars indicate standard deviation. unpaired two-tailed t-test was used for statistical analysis. k normalized cell viability of gscs in tri-culture and tetra-culture models following treatment with 50 μm of ifosfamide. ***p < 0.001. bar shows mean of six technical replicates and error bars indicate standard deviation. unpaired two-tailed t-test was used for statistical analysis. 
1% l-glutamine, 1% sodium pyruvate, 1% penicillin/streptomycin, 10 ng/ml basic human fibroblast growth factor (bfgf), and 10 ng/ml human epidermal growth factor (egf) for at least 6 h to recover expression of surface antigens. gsc phenotypes were validated by expression of stem cell markers (sox2 and olig2), functional assays of self-renewal (serial neurosphere passage), and tumor propagation using in vivo limiting dilution. thp-1 monocytes were cultured in rpmi 1640 (gibco) medium supplemented with 10% heat-inactivated fetal bovine serum (fbs, invitrogen) and 1% penicillin/streptomycin. to obtain monocyte-derived m2 macrophage, thp-1 monocytes were first seeded in 6-well plates at a density of 5 × 10^5 cells/ml (3 ml/well). polarization to m2 macrophage was induced by (1) incubating cells in 200 ng/ml phorbol 12-myristate 13-acetate (pma, sigma aldrich) for 48 h, (2) replacing with thp1 complete medium for 24 h, and then (3) incubating in 20 ng/ml interleukin 4 (il4, peprotech) and 20 ng/ml interleukin 13 (il13, peprotech) for 48 h. hnp1 neural progenitor cells (neuromics) were cultured on matrigel-coated plates using the complete nbm medium for gscs. human astrocytes (thermofisher) were cultured with astrocyte medium (sciencell) supplemented with 1% penicillin/streptomycin. 3d bioprinting process. before printing, gscs, hnp1s, and astrocytes were digested by accutase (stemcell technology), and macrophages were digested with tryple (thermofisher). for the 3d tetra-culture samples, the cell suspension solution for the tumor core consisted of 2.5 × 10^7 cells/ml gscs and 1.25 × 10^7 cells/ml macrophages (gscs:m2 = 2:1). for the 3d tri-culture samples, the core cell suspension solution consisted of 2.5 × 10^7 cells/ml gscs only (supplementary information, fig. s2a, b). the cell suspension solution for the peripheral region for both models consisted of 1 × 10^7 cells/ml hnp1s and 1 × 10^7 cells/ml astrocytes.
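the fixed core densities above (2.5 × 10^7 gscs/ml and 1.25 × 10^7 macrophages/ml, i.e., gscs:m2 = 2:1) translate directly into cell numbers for a given suspension volume; a hypothetical helper (the 0.25 ml batch volume is illustrative, not from the paper):

```python
def cells_needed(volume_ml, densities_per_ml):
    """Total cells of each type for a suspension volume at fixed densities."""
    return {name: d * volume_ml for name, d in densities_per_ml.items()}

# tetra-culture core densities from the methods; 0.25 ml is an arbitrary volume
core = cells_needed(0.25, {"gsc": 2.5e7, "m2": 1.25e7})
print(core)  # {'gsc': 6250000.0, 'm2': 3125000.0}
```

note that because the cell suspension is mixed 1:1 with prepolymer solution immediately before printing, final in-gel densities are half of the suspension values.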
all cell suspensions were aliquoted into 0.5 ml eppendorf tubes and stored on ice before use. the prepolymer solution for bioprinting was prepared with 8% (w/v) gelma, 0.5% (w/v) gmha, and 0.6% (w/v) lithium phenyl(2,4,6-trimethylbenzoyl) phosphinate (lap) (tokyo chemical industry). the prepolymer solution was kept at 37°c in the dark before use. cell suspension was mixed with prepolymer solution at a 1:1 ratio immediately before printing to maximize viability. the two-step bioprinting process utilized a customized light-based 3d printing system. components of the system included a digital micromirror device (dmd) chip (texas instruments), a motion controller (newport), a light source (hamamatsu), a printing stage, and a computer with software to coordinate all the other components. the thickness of the printed samples was precisely controlled by the motion controller and the stage. the cell-material mixture was loaded onto the printing stage, and the corresponding digital mask was input onto the dmd chip. light was turned on for an optimized exposure time (20 s for the core and 15 s for the periphery). the bioprinted 3d tri-culture/tetra-culture samples were then rinsed with dpbs and cultured in maintenance medium at 37°c with 5% co2. maintenance medium was made of 50% complete nbm medium, 25% thp1 medium, and 25% astrocyte medium.

hipsc-derived macrophage generation
the hipsc-derived macrophage differentiation protocol was adapted from yanagimachi et al. 69 and modified from mesci et al. 38 briefly, ipsc lines were generated as previously described, by reprogramming fibroblasts from a healthy donor. 70 the ipsc colonies were plated on matrigel-coated (bd biosciences) plates and maintained in mtesr media (stem cell technologies). the myeloid differentiation protocol consisted of 4 sequential steps.
in the first step, primitive streak cells were induced by bmp4 addition; in step 2, these were differentiated into hemangioblast-like hematopoietic precursors (vegf (80 ng/ml, peprotech), scf (100 ng/ml, gemini), and basic fibroblast growth factor (bfgf) (25 ng/ml, life technologies)). then, in the third step, the hematopoietic precursors were pushed towards myeloid differentiation (flt-3 ligand (50 ng/ml, humanzyme), il-3 (50 ng/ml, gemini), scf (50 ng/ml, gemini), thrombopoietin, tpo (5 ng/ml), m-csf (50 ng/ml)) and finally into the monocytic lineage in step 4 (flt3-ligand (50 ng/ml), m-csf (50 ng/ml), gm-csf (25 ng/ml)). cells produced in suspension in step 4 were recovered, sorted using anti-cd14 magnetic microbeads (macs, miltenyi), and then integrated into 3d bioprinted models as described above.

isolation and generation of primary human macrophages
human blood was obtained from healthy volunteers from the scripps research institute normal blood donor service. mononuclear cells were isolated by gradient centrifugation using lymphoprep (#07851 stemcell), washed with pbs, and treated with red blood cell lysis buffer. cells were plated to allow monocytes to adhere and cultured in 10% heat-inactivated fbs in rpmi with hepes, glutamax, 1 mm sodium pyruvate, and pen/strep with 50 ng/ml m-csf for 6 days as described by ogasawara et al. 71 unpolarized m0 macrophages were collected and integrated into 3d bioprinted models as described above.

mechanical testing
the compressive modulus of the 3d printed constructs was measured with a microsquisher (cellscale). pillars 1 mm in diameter and 1 mm in height were printed with the same conditions used for the tissue models and incubated overnight at 37°c. both acellular and cell-encapsulated constructs were tested. the microsquisher utilized stainless steel beams and platens to compress the constructs at 10% displacement of their height.
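an apparent compressive modulus can be estimated from the recorded force-displacement data as the slope of a linear stress-strain fit; a minimal python sketch (the paper used custom matlab scripts; the pillar geometry comes from the text, the data values below are hypothetical):

```python
import math

def compressive_modulus(forces_n, displacements_m, diameter_m, height_m):
    """Apparent modulus (Pa): least-squares slope of stress (F/A) vs strain (d/h)
    for a cylindrical pillar under uniaxial compression."""
    area = math.pi * (diameter_m / 2) ** 2
    stress = [f / area for f in forces_n]
    strain = [d / height_m for d in displacements_m]
    n = len(stress)
    mx, my = sum(strain) / n, sum(stress) / n
    num = sum((x - mx) * (y - my) for x, y in zip(strain, stress))
    den = sum((x - mx) ** 2 for x in strain)
    return num / den
```

for a 1 mm x 1 mm pillar compressed to 10% of its height, as in the text, only the first few percent of strain is usually used for the linear fit.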
customized matlab scripts were used to calculate the modulus from the force and displacement data collected by the microsquisher.

sem
surface patterns of the materials and cell-material interactions on the micron scale were imaged with a scanning electron microscope (zeiss sigma 500). acellular samples were snap-frozen in liquid nitrogen and immediately transferred to the freeze drier to dry overnight. cell-encapsulated samples were dried based on a chemical dehydration protocol. briefly, samples were fixed using 2.5% glutaraldehyde solution for 1 h at room temperature and then overnight at 4°c. on the next day, the samples were rinsed with dpbs three times and soaked sequentially in 70% ethanol, 90% ethanol, and 95% ethanol, each for 15 min. then the solution was replaced with 100% ethanol for 10 min, and the step was repeated two more times. hexamethyldisilazane (hdms) was mixed with 100% ethanol at 1:2 and 2:1 ratios. samples were first transferred to hdms:etoh (1:2) for 15 min, then hdms:etoh (2:1) for 15 min. then the solution was replaced with 100% hdms for 15 min, and the step was repeated two more times. the samples were left uncovered in a chemical hood overnight to dry. the freeze-dried or chemically dried samples were coated with iridium by a sputter coater (emitech) prior to sem imaging.

fig. 7 whole-genome crispr-cas9 screen reveals context-specific functional dependencies. a schematic diagram of whole-genome crispr-cas9 loss-of-function screening strategy in standard sphere culture conditions and the 3d tetra-culture model. b pca of functional dependencies defined by whole genome crispr-cas9 screening as defined in (a). c volcano plot demonstrating genes that enhance (blue) or inhibit (red) cell proliferation in sphere culture when inactivated by a specific sgrna in a whole genome crispr-cas9 loss-of-function screen. the x-axis displays the z-score and the y-axis displays the p value as calculated by the mageck-vispr algorithm. d volcano plot demonstrating genes that enhance (blue) or inhibit (red) cell proliferation in the 3d tetra-culture model when inactivated by a specific sgrna in a whole genome crispr-cas9 loss-of-function screen. the x-axis displays the z-score and the y-axis displays the p value as calculated by the mageck-vispr algorithm. 83 e pathway gene set enrichment connectivity diagram displaying pathways enriched among functional dependency genes common to both sphere culture and 3d culture in the tetra-culture model. f plot comparing the functional dependency z-scores between sphere culture and 3d culture in the tetra-culture model. g pathway gene set enrichment connectivity diagram displaying pathways enriched among functional dependency genes that are specific to sphere culture, as defined in f. h pathway gene set enrichment connectivity diagram displaying pathways enriched among functional dependency genes that are specific to growth in the 3d tetra-culture, as defined in f. i volcano plot displaying differential functional dependency scores between sphere culture and the 3d tetra-culture system as defined by mageck-vispr. 83 j pathway gene set enrichment connectivity diagram displaying pathways enriched among functional dependency genes that are more essential in sphere culture compared to in the 3d tetra-culture system, as defined in i. k pathway gene set enrichment connectivity diagram displaying pathways enriched among functional dependency genes that are more essential in the 3d tetra-culture system compared to in sphere culture, as defined in i.

immunofluorescence staining and image acquisition of tumor model
3d bioprinted samples and sphere cultured cells were fixed with 4% paraformaldehyde (pfa; wako) for 30 min and 15 min, respectively, at room temperature. all samples were blocked and permeabilized using 5% (w/v) bovine serum albumin (bsa, gemini bio-products) solution with 0.1% triton x-100 (promega) for 1 h at room temperature on a shaker.
samples were then incubated with the respective primary antibody (listed below) overnight at 4°c. on the next day, samples were rinsed three times with dpbs with 0.05% tween 20 (pbst) on the shaker. samples were incubated with fluorophore-conjugated goat anti-rabbit or goat anti-mouse secondary antibodies (1:200; biotium) and hoechst 33342 (1:1000; life technologies) counterstain in dpbs with 2% (w/v) bsa for 1 h at room temperature in the dark. after incubation, samples were rinsed three times in pbst and stored in dpbs with 0.05% sodium azide (alfa aesar) at 4°c before imaging. fluorescence images of 3d samples and their sphere cultured counterparts were taken with a confocal microscope (leica sp8) using consistent settings for each antibody (supplementary information, table s1). fluorescence images of egfp- or mcherry-labeled cells in the 3d samples were also acquired using the confocal microscope. tile scan merging was completed by the automated program on the leica microscope and the z-stack projection was completed by imagej. quantification of migration was based on the fluorescence images processed by imagej.

rna isolation and rt-pcr
egfp-labeled gscs and mcherry-labeled thp1s were isolated from 3d printed tri-culture and tetra-culture samples using flow cytometry (bd facsaria ii). cells isolated from 3d and sphere cultured cells were treated with trizol reagent (life technologies) before rna extraction. total rna of each sample was extracted using the direct-zol rna miniprep kit (zymo) and immediately stored at −80°c. to perform rt-pcr, cdna was first obtained by rna reverse transcription using the protoscript® first strand cdna synthesis kit (new england biolabs) with input rna of 200 ng per sample. the primers were purchased from integrated dna technologies. rt-pcr was performed using powerup sybr green master mix (applied biosystems) and detected with a quantstudio 3 rt-pcr system.
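relative expression from ct values normalized to a housekeeping gene, as done here, is commonly computed with the 2^-ΔΔct (livak) method; a minimal sketch (the ct values below are hypothetical, only the normalization scheme comes from the text):

```python
def fold_change_ddct(ct_target, ct_housekeeping, ct_target_ctrl, ct_housekeeping_ctrl):
    """Relative expression by the 2^-ΔΔCt method: each sample's target Ct is
    normalized to the housekeeping gene, then referenced to the control condition."""
    ddct = (ct_target - ct_housekeeping) - (ct_target_ctrl - ct_housekeeping_ctrl)
    return 2.0 ** -ddct

# hypothetical Ct values: the target amplifies 2 cycles earlier than in the control
fold = fold_change_ddct(24.0, 18.0, 26.0, 18.0)
```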
gene expression was determined by threshold cycle (ct) values normalized against the housekeeping gene (supplementary information, table s2).

rna-seq and data analysis
rna was purified as described above and subjected to rna-seq. paired-end fastq sequencing reads were trimmed using trim galore version 0.6.2 (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) with cutadapt version 2.3. transcript quantification was performed using salmon 72 version 0.13.1 in quasi-mapping mode from transcripts derived from human gencode release 30 (grch38.p12). 73 salmon "quant" files were converted using tximport 74 (https://bioconductor.org/packages/release/bioc/html/tximport.html) and differential expression analysis was performed using deseq2 75 in the r programming language. data from gscs and primary glioblastoma surgical resection tissues were derived from mack et al. 37 and were processed using the same analysis pipeline. data from matched gscs grown in serum-free sphere culture and orthotopic intracranial xenografts were derived from miller et al. 36 and were processed using the same analysis pipeline. processed data from matched gscs and differentiated tumor cells were derived from suva et al. 76 and differentially expressed genes were calculated using the limma-voom algorithm in the limma package 77 in the r programming language. pca was performed within the deseq2 package using the top 5000 differentially expressed genes. umap analysis was performed using the umapr package (https://github.com/ropenscilabs/umapr) and uwot (https://cran.r-project.org/web/packages/uwot/index.html). for comparisons of glioblastoma tissue samples with gscs grown in standard sphere culture, analysis parameters included: sample size of local neighborhood (number of neighbors) = 40; learning rate = 0.5; initialization of low dimensional embedding = random; metric for computation of distance in high dimensional space = manhattan.
for comparisons of gscs derived from sphere culture or 3d bioprinted models, analysis parameters included: sample size of local neighborhood (number of neighbors) = 3; initialization of low dimensional embedding = random; metric for computation of distance in high dimensional space = cosine. gene set enrichment analysis was performed using the online gsea webportal (http://software.broadinstitute.org/gsea/msigdb/annotate.jsp) and the gsea desktop application (http://software.broadinstitute.org/gsea/downloads.jsp). 78, 79 pathway enrichment bubble plots were generated using the bader lab enrichment map application 80 and cytoscape (http://www.cytoscape.org). glioblastoma transcriptional subtypes were calculated using a program written by wang et al. 81 and implemented in r. gene signatures were calculated using the single sample gene set enrichment analysis projection (ssgseaprojection) module on genepattern (https://cloud.genepattern.org).

fig. 8 pag1 and znf830 are potential therapeutic targets in glioblastoma. a 3d tetra-culture specific target identification approach. graph showing gene dependency z-score in sphere culture (x-axis) vs tetra-culture (y-axis). red color indicates genes with a sphere culture z-score of > −0.5 and a tetra-culture z-score of < −0.5. b red genes from (a) ranked based on the dependency significance in tetra-culture models (−log2 of the p value). c luminescent signal in gscs transfected with a luciferase expression vector (red) or un-transfected cells following treatment with luciferin reagent for 10 min. ***p < 0.001. unpaired, two-tailed t-test was used for statistical analysis. d western blot for pag1 and flag-tagged cas9 following treatment with two independent sgrnas targeting pag1 in luciferase-expressing cw468 cells or a non-targeting control (sgcont). tubulin was used as a loading control. e western blot for znf830 and flag-tagged cas9 following treatment with two independent sgrnas targeting znf830 in luciferase-expressing cw468 cells or a sgcont. tubulin was used as a loading control. f western blot for atp5h (atp5pd) and flag-tagged cas9 following treatment with two independent sgrnas targeting atp5h in luciferase-expressing cw468 cells or a sgcont. tubulin was used as a loading control. g western blot for rnf19a and flag-tagged cas9 following treatment with two independent sgrnas targeting rnf19a in luciferase-expressing cw468 cells or a sgcont. tubulin was used as a loading control. h cell viability of cw468 luciferase-expressing gscs in sphere culture following treatment with two independent sgrnas targeting pag1 or a sgcont. ****p < 0.0001. two-way repeated measures anova with dunnett multiple comparison testing was used for statistical analysis. i cell viability of cw468 luciferase-expressing gscs in sphere culture following treatment with two independent sgrnas targeting znf830 or a sgcont. ****p < 0.0001. two-way repeated measures anova with dunnett multiple comparison testing was used for statistical analysis. j cell viability of cw468 luciferase-expressing gscs in sphere culture following treatment with two independent sgrnas targeting atp5h or a sgcont. ****p < 0.0001. two-way repeated measures anova with dunnett multiple comparison testing was used for statistical analysis. k cell viability of cw468 luciferase-expressing gscs in sphere culture following treatment with two independent sgrnas targeting rnf19a or a sgcont. ****p < 0.0001. two-way repeated measures anova with dunnett multiple comparison testing was used for statistical analysis. l cell viability of cw468 luciferase-expressing gscs in 3d tetra-culture models after editing with two independent sgrnas targeting pag1, znf830, or a non-targeting sgrna after seven days. ****p < 0.0001. bars show mean and standard deviation of two biological replicates with 5 technical replicates. ordinary one-way anova with dunnett multiple comparison correction was used for statistical analysis. m cell viability of cw468 luciferase-expressing gscs in 3d tetra-culture models after editing with two independent sgrnas targeting atp5h, rnf19a, or a non-targeting sgrna after seven days. *p < 0.05; **p < 0.01. bars show mean and standard deviation of two biological replicates with 5 technical replicates. ordinary one-way anova with dunnett multiple comparison correction was used for statistical analysis. n western blot for pag1 and flag-tagged cas9 following treatment with two independent sgrnas targeting pag1 in cw468 gscs or a sgcont. tubulin was used as a loading control. o kaplan-meier plot showing mouse survival following orthotopic implantation of gscs edited with one of two sgrnas targeting pag1 or a sgcont. sgpag1.1 vs sgcont, p = 0.071. sgpag1.9 vs sgcont, p = 0.023. log-rank test was used for statistical analysis. p western blot for znf830 and flag-tagged cas9 following treatment with two independent sgrnas targeting znf830 in cw468 gscs or a sgcont. tubulin was used as a loading control. q kaplan-meier plot showing mouse survival following orthotopic implantation of gscs edited with one of two sgrnas targeting znf830 or a sgcont. sgznf830.1 vs sgcont, p = 0.011. sgznf830.3 vs sgcont, p > 0.05. log-rank test was used for statistical analysis.

crispr editing
crispr editing was performed on cw468 gscs as well as luciferase-labeled cw468 gscs (cw468-luc). for unlabeled cells, sgrnas were cloned into the lenticrisprv2 plasmid containing a puromycin selection marker (addgene plasmid #52961), while luciferase-labeled cells were edited with sgrnas cloned into the lenticrisprv2 plasmid containing a hygromycin selection marker (addgene plasmid #98291). sgrna sequences were chosen from the human crispr knockout pooled library (brunello) 82 (supplementary information, table s3).
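before cloning spacers such as those picked from the brunello library, a basic sanity check on the sequences is useful; a hypothetical helper (the function names are ours; the 20-nt spacer length is the brunello convention, and the gc check is a common heuristic rather than anything specified in the text):

```python
def is_valid_spacer(seq, length=20):
    """True if seq looks like a well-formed spacer: expected length, DNA alphabet only."""
    seq = seq.upper()
    return len(seq) == length and set(seq) <= set("ACGT")

def gc_fraction(seq):
    """GC content of a sequence; spacers are often kept in a moderate GC range."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)
```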
western blot analysis
cells were collected and lysed in ripa buffer (50 mm tris-hcl, ph 7.5; 150 mm nacl; 0.5% np-40; 50 mm naf with protease inhibitors) and incubated on ice for 30 min. lysates were centrifuged at 4°c for 10 min at 14,000 rpm, and the supernatant was collected. the pierce bca protein assay kit (thermo scientific) was utilized for determination of protein concentration. equal amounts of protein samples were mixed with sds laemmli loading buffer, boiled for 10 min, and electrophoresed using nupage bis-tris gels, then transferred onto pvdf membranes. tbs-t supplemented with 5% non-fat dry milk was used for blocking for 1 h, followed by blotting with primary antibodies at 4°c for 16 h (supplementary information, table s4). blots were washed 3 times for 5 min each with tbs-t and then incubated with appropriate secondary antibodies in 5% non-fat milk in tbs-t for 1 h. for all western immunoblot experiments, blots were imaged using biorad image lab software and subsequently processed using adobe illustrator to create the figures.

molecular diffusion assessment
3d printed hydrogels were printed and incubated in dpbs overnight at 37°c. fluorescein isothiocyanate (fitc)-dextran with an average molecular weight of 4000 da was dissolved in dpbs at a concentration of 500 µg/ml. dpbs was removed and fitc-dextran solutions were added to the wells with 3d printed hydrogels. hydrogels were incubated in fitc-dextran solution at 37°c for 0, 5, 15, 30, 60, and 120 min; rinsed three times with dpbs; and then imaged using a fluorescence microscope. fluorescence intensities of the hydrogels were measured by imagej. the average intensities and the spatial intensities at each time point were calculated in excel and plotted using prism.

drug response assessment
3d tri-culture/tetra-culture samples were printed as described above, with regular gscs substituted with luciferase-labeled gscs.
3d samples and sphere cultured cells plated on matrigel-coated slides were treated with drugs after 5 days in culture. drug effects were evaluated 72 h later for erlotinib and gefitinib. for temozolomide, medium was replaced with fresh medium containing temozolomide 72 h after the first treatment, and the drug response was evaluated 72 h after the second treatment. luciferase readings were obtained using the promega luciferase assay system (e1500) based on the provided protocol and a tecan infinite m200 plate reader. abiraterone (hy-70013), vemurafenib (hy-12057), ifosfamide (hy-17419), erlotinib (hy-50896), and gefitinib (hy-50895) from medchemexpress were used to generate dose response curves in vitro.

fig. 9 3d bioprinting contributes to upregulation of genes with poor prognostic significance in glioblastoma. a heatmap displaying mrna expression signatures of intracranial xenografts (vs sphere cell culture) and 3d bioprinted tetra-cultures (vs sphere cell culture) as defined by the tcga glioma hg-u133a microarray. various clinical metrics, patient information and information on tumor genetics are also displayed. b mrna expression signature of (left) 3d bioprinted tetra-cultures (vs sphere cell culture) and (right) intracranial xenografts (vs sphere cell culture) in the tcga glioma hg-u133a microarray. grade ii (n = 226), grade iii (n = 244), grade iv (n = 150). the box-and-whisker plot indicates the lower quartile, median, and upper quartile. error bars represent the 5%-95% confidence interval. ordinary one-way anova with tukey multiple comparison test was used for statistical analysis, ****p < 0.0001. c mrna expression signature of 3d bioprinted tetra-cultures (vs sphere cell culture) in the cgga. grade ii (n = 188), grade iii (n = 255), grade iv (n = 249). the box-and-whisker plot indicates the lower quartile, median, and upper quartile. error bars represent the 5%-95% confidence interval.
ordinary one-way anova with tukey multiple comparison test was used for statistical analysis, ****p < 0.0001. d mrna expression signature of 3d bioprinted tetra-cultures (vs sphere cell culture) in the rembrandt glioma dataset. grade ii (n = 98), grade iii (n = 85), grade iv (n = 130). the box-and-whisker plot indicates the lower quartile, median, and upper quartile. error bars represent the 5%-95% confidence interval. ordinary one-way anova with tukey multiple comparison test was used for statistical analysis, ****p < 0.0001. e mrna expression signature of 3d bioprinted tetra-cultures (vs sphere cell culture) in the chinese glioma genome atlas (cgga). data presented are restricted to glioblastomas (grade iv glioma). primary (n = 422), recurrent (n = 271). the box-and-whisker plot indicates the lower quartile, median, and upper quartile. error bars represent the 5%-95% confidence interval. ordinary one-way anova with tukey multiple comparison test was used for statistical analysis, ****p < 0.0001. f mrna expression signature of 3d bioprinted tetra-cultures (vs sphere cell culture) in the rembrandt glioma dataset. data presented are restricted to glioblastomas (grade iv glioma). proneural (n = 41), mesenchymal (n = 44), classical (n = 45). the box-and-whisker plot indicates the lower quartile, median, and upper quartile. error bars represent the 5%-95% confidence interval. ordinary one-way anova with tukey multiple comparison test was used for statistical analysis, ****p < 0.0001. g kaplan-meier survival analysis of glioblastoma patients in the tcga dataset based on the mrna expression signature of intracranial xenografts (vs sphere cell culture). patients were grouped into "high" or "low" signature expression groups based on the median signature expression score. low (n = 262), high (n = 263). log-rank analysis was used for statistical analysis, p = 0.017.
h kaplan-meier survival analysis of glioblastoma patients in the tcga dataset based on the mrna expression signature of 3d bioprinted tetra-cultures (vs sphere cell culture). patients were grouped into "high" or "low" signature expression groups based on the median signature expression score. low (n = 262), high (n = 263). log-rank analysis was used for statistical analysis, p = 0.0001. i kaplan-meier survival analysis of glioblastoma patients in the cgga dataset based on the mrna expression signature of intracranial xenografts (vs sphere cell culture). patients in the top 1/3 of the expression signature score were grouped into the "high" group, while those in the bottom 1/3 of the expression signature score were grouped into the "low" group. low (n = 158), high (n = 158). log-rank analysis was used for statistical analysis, p = 0.017. j kaplan-meier survival analysis of glioblastoma patients in the cgga dataset based on the mrna expression signature of 3d bioprinted tetra-cultures (vs sphere cell culture). patients in the top 1/3 of the expression signature score were grouped into the "high" group, while those in the bottom 1/3 of the expression signature score were grouped into the "low" group. low (n = 158), high (n = 158). log-rank analysis was used for statistical analysis, p = 0.0001. k plot showing genes in the intracranial xenograft signature ranked by (x-axis) the mean survival difference between the "high" expressing group and the "low" expressing group and (y-axis) the statistical significance of the survival difference as calculated by the log-rank test. patients were grouped into "high" or "low" signature expression groups based on the median gene expression.
l plot showing genes in the 3d bioprinted tetra-cultures (vs sphere cell culture) signature ranked by (x-axis) the mean survival difference between the "high" expressing group and the "low" expressing group and (y-axis) the statistical significance of the survival difference as calculated by the log-rank test. patients were grouped into "high" or "low" signature expression groups based on the median gene expression. m the outer pie chart displays the fraction of genes with prognostic significance in the 3d bioprinted tetra-cultures gene signature as calculated by the log-rank test. patients were grouped into "high" or "low" signature expression groups based on the median gene expression. the inner pie chart displays the number of total prognostically significant genes as a fraction of all genes. the chi-squared test was used for statistical analysis, p < 0.0001.

sphere culture cell proliferation experiments were conducted by plating cells of interest at a density of 2000 cells per well in a 96-well plate with 6 replicates. cell titer glo (promega) was used to measure cell viability. data are presented as mean ± standard deviation.

drug sensitivity prediction
therapeutic sensitivity and gene expression data were accessed through the cancer therapeutics response portal (https://portals.broadinstitute.org/ctrp/). 49-51 gene signature scores were calculated for each cell line in the dataset using the single sample gene set enrichment analysis projection (ssgseaprojection) module on genepattern (https://cloud.genepattern.org). the gene signature score was then correlated with area under the curve (auc) values for drug sensitivity for each compound tested. the correlation r-value was plotted and statistical analyses included correction for multiple testing.
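the per-compound correlation between signature score and auc described above reduces to a pearson r over cell lines; a stdlib-only sketch (the scores and aucs below are hypothetical, not from the ctrp data):

```python
def pearson_r(xs, ys):
    """Pearson correlation between per-cell-line signature scores and drug AUC values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# hypothetical values: a positive r means the signature tracks resistance (higher AUC)
scores = [0.1, 0.4, 0.5, 0.9]
aucs = [2.0, 3.1, 3.4, 5.2]
r = pearson_r(scores, aucs)
```

ranking compounds by this r across all drugs, then adjusting the resulting p values for multiple testing, mirrors the procedure in the text.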
crispr screening and data analysis
whole-genome crispr-cas9 loss-of-function screening was performed with the human crispr knockout pooled library (brunello), 82 which was a gift from david root and john doench (addgene #73178). the library was used following the instructions on the addgene website (https://www.addgene.org/pooled-library/broadgpp-human-knockout-brunello). briefly, the library was stably transduced into gscs by lentiviral infection at a multiplicity of infection (moi) of around 0.3-0.6. after puromycin selection, cells were propagated in either standard sphere cell culture conditions or in the 3d tetra-culture system. after 10 days, genomic dna was extracted from gscs and the sequencing library was generated using the protocol on the addgene website (https://media.addgene.org/cms/filer_public/61/16/611619f4-0926-4a07-b5c7-e286a8ecf7f5/broadgpp-sequencing-protocol.pdf). sequencing quality control was performed using fastqc (http://www.bioinformatics.babraham.ac.uk/projects/fastqc), and enrichment and dropout were calculated with the mageck-vispr pipeline 83,84 using the mageck-mle algorithm.

in vivo tumorigenesis assays
intracranial xenografts were generated by implanting 15,000 patient-derived gscs (cw468), following treatment with sgrnas targeting pag1 or znf830 or a sgcont, into the right cerebral cortex of nsg mice (nod.cg-prkdcscid il2rgtm1wjl/szj, the jackson laboratory, bar harbor, me, usa) at a depth of 3.5 mm under a university of california, san diego institutional animal care and use committee (iacuc) approved protocol. all murine experiments were performed under an animal protocol approved by the university of california, san diego iacuc. healthy, wild-type male or female mice of nsg background, 4-6 weeks old, were randomly selected and used in this study for intracranial injection. mice had not undergone prior treatment or procedures.
mice were maintained in a 14 h light/10 h dark cycle by animal husbandry staff with no more than 5 mice per cage. experimental animals were housed together. housing conditions and animal status were supervised by a veterinarian. animals were monitored until neurological signs were observed, at which point they were sacrificed. neurological signs or signs of morbidity included hunched posture, gait changes, lethargy and weight loss. survival was plotted using kaplan-meier curves with statistical analysis using a log-rank test. subcutaneous xenografts were established by implanting 2 million luciferase-labeled cw468 gscs into the right flank of nsg mice and maintained as described above. two weeks after implantation, treatment was initiated with 80 mg/kg of ifosfamide (hy-17419, medchemexpress) dissolved in 90% safflower oil (spectrum laboratory products) and 10% dmso, or vehicle alone, by 100 μl intraperitoneal injection once per day for 28 days. luminescence signal was assessed at days 0, 7, 14, 21, and 28 after initiation of treatment using bioluminescence imaging following intraperitoneal injection of luciferin reagent. tumor size was normalized based on the day 7 time point for each mouse individually.

statistical analysis
statistical analysis parameters are provided in each figure legend. multiple group comparisons were compared by one-way anova with tukey's post-hoc analysis (by graphpad prism). p < 0.05 was designated as the threshold value for statistical significance. all data were displayed as mean values with error bars representing standard deviation. all raw sequencing data and selected processed data are available on geo at the accession number gse147147 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=gse147147). there are no restrictions on data availability, and all data will be made available upon request directed to the corresponding authors.
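the two-group log-rank comparison used throughout the survival analyses can be sketched with the standard library alone (illustrative only; the actual analyses used graphpad prism and standard statistical packages, and the survival data below are synthetic):

```python
import math

def logrank(group1, group2):
    """Two-group log-rank test. Each group is a list of (time, event) pairs,
    event = 1 for death and 0 for censoring. Returns (chi-square statistic, p value)."""
    pts = [(t, e, 0) for t, e in group1] + [(t, e, 1) for t, e in group2]
    event_times = sorted({t for t, e, _ in pts if e == 1})
    obs_minus_exp, var = 0.0, 0.0
    for t in event_times:
        n1 = sum(1 for ti, _, g in pts if ti >= t and g == 0)  # at risk, group 1
        n2 = sum(1 for ti, _, g in pts if ti >= t and g == 1)  # at risk, group 2
        n = n1 + n2
        d1 = sum(1 for ti, e, g in pts if ti == t and e == 1 and g == 0)
        d2 = sum(1 for ti, e, g in pts if ti == t and e == 1 and g == 1)
        d = d1 + d2
        obs_minus_exp += d1 - d * n1 / n          # observed minus expected in group 1
        if n > 1:
            var += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    stat = obs_minus_exp ** 2 / var
    p = math.erfc(math.sqrt(stat / 2))            # chi-square survival function, 1 df
    return stat, p
```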
all biological materials used in this manuscript will be made available upon request to the corresponding authors. distribution of human patient-derived gscs may be distributed following completion of a material transfer agreement (mta) with the appropriate institutions if allowed. all computational algorithms utilized in the manuscript have been referenced in the corresponding figure legend and described in the methods section. additional details can be made available upon request. co-evolution of tumor cells and their microenvironment project drive: a compendium of cancer dependencies and synthetic lethal relationships uncovered by large-scale, deep rnai screening the cancer cell line encyclopedia enables predictive modelling of anticancer drug sensitivity next-generation characterization of the cancer cell line encyclopedia the landscape of cancer cell line metabolism patient-derived xenograft models: an emerging platform for translational cancer research organogenesis in a dish: modeling development and disease using organoid technologies organoids as an in vitro model of human development and disease sequential cancer mutations in cultured human intestinal stem cells modeling colorectal cancer using crispr-cas9-mediated engineering of human intestinal organoids a living biobank of breast cancer organoids captures disease heterogeneity brca-deficient mouse mammary tumor organoids to study cancer-drug resistance human primary liver cancer-derived organoid cultures for disease modeling and drug screening ductal pancreatic cancer modeling and drug screening using human pluripotent stem cell-and patient-derived tumor organoids a three-dimensional organoid culture system derived from human glioblastomas recapitulates the hypoxic gradients and cancer stem cell heterogeneity of tumors found in vivo organoids in cancer research organoid cultures derived from patients with advanced prostate cancer modeling patient-derived glioblastoma with cerebral organoids modeling 
supplementary information accompanies this paper at https://doi.org/10.1038/s41422-020-0338-1. competing interests: a.r.m. is a co-founder of and has an equity interest in tismoo, a company dedicated to genetic analysis focusing on therapeutic applications customized for autism spectrum disorder and other neurological disorders of genetic origin. the terms of this arrangement have been reviewed and approved by the university of california, san diego in accordance with its conflict of interest policies. the remaining authors declare no potential conflicts of interest. key: cord-026503-yomnqr78 authors: basile, davide; ter beek, maurice h.; legay, axel title: strategy synthesis for autonomous driving in a moving block railway system with uppaal stratego date: 2020-05-13 journal: formal techniques for distributed objects, components, and systems doi: 10.1007/978-3-030-50086-3_1 sha: doc_id: 26503 cord_uid: yomnqr78 moving block railway systems are the next generation signalling systems currently under development as part of the shift2rail european initiative, including autonomous driving technologies. in this paper, we model a suitable abstraction of a moving block signalling system with autonomous driving as a stochastic priced timed game. we then synthesise safe and optimal driving strategies for the model by applying advanced techniques that combine statistical model checking with reinforcement learning, as provided by uppaal stratego. hence, we show the applicability of uppaal stratego to this concrete case study. next generation railway systems are based on distributed inter-organisational entities, such as on-board train computers, wayside radio-block centres and satellites, which have to interact to accomplish their tasks.
a longstanding effort in the railway domain concerns the use of formal methods and tools for the analysis of railway (signalling) systems, in light of the sector's stringent safety requirements [7, 10, 11, 17, 27-31, 41, 42]. due to their distributed and inter-organisational nature, their formal verification is still an open challenge. whilst model-checking and theorem-proving techniques are predominant, to the best of our knowledge applications of controller synthesis techniques are largely lacking. we describe a formal modelling and analysis experience with uppaal stratego of a moving block railway signalling system. this work was conducted in the context of several projects concerned with the use of formal methods and tools for the development of railway systems based on moving block signalling systems, in which train movement is no longer authorised based on sections of the railway track between fixed points, but computed in real time as safe zones around the trains. most notable are the h2020 shift2rail projects astrail: satellite-based signalling and automation systems on railways along with formal method and moving block validation (http://www.astrail.eu) and 4securail: formal methods and csirt for the railway sector (http://www.4securail.eu). the european shift2rail initiative (http://shift2rail.org) is a joint undertaking of the european commission and the main railway stakeholders to move the european railway industry forward by increasing its competitiveness. this concerns in particular the transition to next generation signalling systems, including satellite-based train positioning, moving-block distancing, and automatic driving. with a budget of nearly 1 billion euro, it is unique of its kind.
previously, in [6, 8], we introduced a concrete case study of a satellite-based moving block railway signalling system, which was developed in collaboration with industrial partners of the astrail project and which was modelled and analysed with simulink and uppaal smc (statistical model checker). while those models offered the possibility to fine-tune communication parameters that are fundamental for the reliability of their operational behaviour, they did not account for the synthesis of autonomous driving strategies. building on such efforts, in this paper we present a formal model of a satellite-based moving block railway signalling system, which accounts for autonomous driving and which is modelled in uppaal stratego as a stochastic priced timed game. the autonomous driving module is not modelled manually, but is synthesised automatically as a strategy, after which both standard and statistical model checking are applied under the resulting (safe) strategy. the starting point for deriving the strategy is a safety requirement that the model must respect. we moreover consider reliability aspects, and the autonomous driving strategy also provides guarantees for the minimal expected arrival time. the model and experiments are available at https://github.com/davidebasile/forte2020. at last year's forte, parametric statistical model checking was applied to unmanned aerial vehicles (uavs) [4]. the model was formalised as a parametric markov chain, with the goal of reducing the probability of failure while varying parameters such as the precision of the position. the uav follows a predefined flight plan, whereas we aim at automatically synthesising a strategy to safely drive the train. it would be interesting to investigate the possibility of synthesising flight plans under safety constraints.
a decade ago at forte'10, one of the first applications of statistical model checking (using the bip toolset) to an industrial case study was presented, namely the heterogeneous communication system for cabin communication in civil airplanes [9] . the goal was to study the accuracy of clock synchronisation between different devices running in parallel on a distributed application, i.e. a time bound within which communication must occur. an implementation of this case study in uppaal smc would allow a comparison of the results. statistical model checking has also been used to verify the reliability of railway interlocking systems [19] and uppaal has been used to verify railway timetables [34] . uppaal stratego has been applied to a few other case studies belonging to the transport domain, such as traffic light controllers [3] , cruise control [38] , and railway scheduling [37] . we conjecture that the uppaal stratego model in [37] could be paired with our model to study railway scheduling for autonomous trains, with the goal of synthesising improved strategies for both the scheduler and the autonomous driver. finally, there have been several recent attempts at modelling and analysing ertms level 3 signalling systems (in particular hybrid level 3 systems with virtual fixed blocks) with promela/spin, mcrl2, alloy/electrum, iuml, sysml, prob, event-b, and real-time maude [2, 5, 14, 21, 25, 33, 40, 43] . none of these concern quantitative modelling and analysis, typically lacking uncertainty, which is fundamental for demonstrating the reliability of the operational behaviour of next generation satellite-based ertms level 3 moving block railway signalling system models. one of the earliest quantitative evaluations of moving block railway signalling systems can be found in [36] , based on gsm-r communications. structure of the paper. after some background on uppaal stratego in sect. 2, we describe the setting of the case study from the railway domain in sect. 3. in sect. 
4, we present the formal model, followed by an extensive description of the conducted analyses in sect. 5. finally, we discuss our experience with uppaal stratego and provide some ideas for future work in sect. 6. in this section, we provide some background on the tools and their input models used in this paper, providing pointers to the literature for more details. uppaal stratego [24] is the latest tool of the uppaal [12] suite. it integrates formalisms and algorithms coming from the less recent uppaal tiga [13] (synthesis for timed games), uppaal smc [22] (statistical model checking), and the synthesis of near-optimal schedulers proposed in [23]. uppaal tiga [13, 20] implements an efficient on-the-fly algorithm for the synthesis of strategies, extended to deal with models of timed games. these are automata modelling a game between a player (the controller) and an opponent (the environment). transitions are partitioned into controllable and uncontrollable ones. the controller plays the controllable transitions, while the opponent plays the uncontrollable ones. the controller is only allowed to deactivate controllable transitions. the goal is to synthesise a strategy for the controller such that, no matter the actions of the opponent, a particular property is satisfied. generally, uncontrollable transitions are used to model events such as delays in communication or other inputs from the environment. conversely, controllable transitions characterise the logic of the controller, generally related to actuators. the strategy synthesis algorithm uses a suitable abstraction of the real-time part of the model, through zones, which are constraints over the real-time clocks. strategy synthesis allows an algorithmic construction of a controller which is guaranteed to ensure that the resulting system satisfies the desired correctness properties, i.e. reachability and safety. uppaal smc is a statistical model checker based on models of stochastic timed automata.
these are automata enhanced with real-time modelling through clock variables. moreover, their stochastic extension replaces non-determinism with probabilistic choices and time delays with probability distributions (uniform for bounded time and exponential for unbounded time). these automata may communicate via (broadcast) channels and shared variables. statistical model checking (smc) [1, 39] is based on running a sufficient number of (probabilistic) simulations of a system model to obtain statistical evidence (with a predefined level of statistical confidence) of the quantitative properties to be checked. smc offers advantages over exhaustive (probabilistic) model checking. most importantly, smc scales better since there is no need to generate and possibly explore the full state space of the model under scrutiny, thus avoiding the combinatorial state-space explosion problem typical of model checking, and the required simulations can be easily distributed and run in parallel. this comes at a price: contrary to (probabilistic) model checking, exact results (with 100% confidence) are out of the question. the method proposed in [23] extends the strategy synthesis of [13] to find near-optimal solutions for stochastic priced timed games, which are basically stochastic timed automata enhanced with controllable and uncontrollable transitions, similarly to timed games. in short, the method starts from the most permissive strategy guaranteeing the time bounds, computed with the algorithms in [13] . this strategy is then converted into a stochastic one by substituting non-determinism with uniform distributions. finally, reinforcement learning is applied iteratively to learn from sampled runs the effect of control choices, to find the near-optimal strategy. uppaal stratego uses stochastic priced timed games as formalism whilst integrating (real-time) model checking, statistical model checking, strategy synthesis, and optimisation. 
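the core idea behind smc described above — estimating a property's probability from a sufficient number of simulation runs, with a prescribed statistical confidence — can be sketched in a few lines of python. the model below is an illustrative toy (an exponentially distributed communication delay exceeding a deadline), not the case-study model, and the interval construction shown is a generic hoeffding bound, not uppaal's internal algorithm.

```python
import math
import random

def smc_estimate(simulate, runs, alpha=0.005):
    """estimate the probability that a property holds by monte carlo
    simulation, with a two-sided hoeffding confidence interval at
    confidence level 1 - alpha."""
    hits = sum(1 for _ in range(runs) if simulate())
    p_hat = hits / runs
    eps = math.sqrt(math.log(2.0 / alpha) / (2.0 * runs))  # interval half-width
    return p_hat, max(0.0, p_hat - eps), min(1.0, p_hat + eps)

# toy property: an exponentially distributed communication delay
# (rate 1/4) exceeds a 10-time-unit deadline; true probability exp(-2.5)
def delay_exceeds_deadline(rate=0.25, deadline=10.0):
    return random.expovariate(rate) > deadline

random.seed(0)
p, low, high = smc_estimate(delay_exceeds_deadline, runs=20000)
```

as in smc tools, more runs shrink the interval at the cost of simulation time, and the estimate carries statistical (not absolute) guarantees.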
it thus becomes possible to perform model checking and optimisation under strategies, which are first-class objects in the tool. internally, abstractions that allow passing from stochastic priced timed games to timed games similar to those in [13] are used to integrate the various algorithms. the european railway traffic management system (ertms) is a set of international standards for the interoperability, performance, reliability, and safety of modern european rail transport [26]. it relies on the european train control system (etcs), an automatic train protection system that continuously supervises the train, ensuring that the safe speed and distance are not exceeded. the current standards distinguish four levels (0-3) of operation of etcs signalling systems, depending largely on the role of trackside equipment and on the way information is transmitted to and from trains. the ertms/etcs signalling systems currently deployed on railways throughout europe concern at most level 2. level 2 signalling systems are based on fixed blocks starting and ending at signals. the block sizes are determined based on parameters like the speed limit, the train's speed and braking characteristics, drivers' sighting and reaction times, etc. but the faster trains are allowed to run, the longer the braking distance and the longer the blocks need to be, thus decreasing the line's capacity. this is because the railway sector's stringent safety requirements impose the length of fixed blocks to be based on the worst-case braking distance, regardless of the actual speed of the train. for exact train position detection and train integrity supervision, level 2 signalling systems make use of trackside equipment (such as track circuits or axle counters). however, communication of the movement authority (ma), i.e.
the permission to move to a specific location with supervision of speed, as well as of speed information and route data to and from the train, is achieved by continuous data transmission via gsm-r or gprs with a wayside radio block centre. moreover, an onboard unit continuously monitors the transferred data and the train's maximum permissible speed by determining its position in between the eurobalises (transponders on the rails of a railway) used as reference points via sensors (axle transducers, accelerometer and radar). the next generation level 3 signalling systems currently under investigation and development no longer rely on trackside equipment for train position detection and train integrity supervision. instead, an onboard odometry system is responsible for monitoring the train's position and autonomously computing its current speed. the onboard unit frequently sends the train's position to a radio block centre which, in turn, sends each train a ma, computed by exploiting its knowledge of the position of the rear end of the train ahead. for this to work, the precise absolute location, speed, and direction of each train need to be known, determined by a combination of sensors: active and passive markers along the track, and trainborne speedometers. the resulting moving block signalling systems allow trains in succession to close up, since a safe zone around the moving trains can be computed, thus considerably reducing headways between trains, in principle to the braking distance. this allows more trains to run on existing railway tracks, in response to the ever-increasing need to boost the volume of passenger and freight rail transport and the cost and impracticability of constructing new tracks. furthermore, the removal of trackside equipment results in lower capital and maintenance costs [32].
one of the current challenges in the railway sector is to make moving block signalling systems as effective and precise as possible, including satellite-based positioning systems, leveraging an integrated solution for signal outages (think, e.g., of the absence of positioning in tunnels) and for the problem of multipaths [44]. however, due to its robust safety requirements, the railway sector is notoriously cautious about adopting technological innovations. thus, while gnss-based positioning systems have been in use for some time now in the avionics and automotive sectors, current train signalling systems are still based on fixed blocks. nevertheless, experiments are being conducted and case studies are being validated in order to move to level 3 signalling systems [2, 5, 6, 8, 14, 15, 21, 25, 33, 40]. the components of the moving block railway signalling case study considered in this paper are depicted in fig. 1. the train carries the location unit and onboard unit components, while the radio block centre is a wayside component. the location unit receives the train's location from gnss satellites, and sends this location (and the train's integrity) to the onboard unit, which, in turn, sends the location to the radio block centre. upon receiving a train's location, the radio block centre sends a ma to the onboard unit (together with speed restrictions and route configurations), indicating the space the train can safely travel based on the safety distance with preceding trains. the radio block centre computes such a ma by communicating with neighbouring radio block centres and exploiting its knowledge of the positions of switches and other trains (head and tail position) by communicating with a route management system.
we abstract from the latter and from communication among neighbouring radio block centres: we consider one train communicating with one radio block centre, based on a seamless handover when the train moves from one radio block centre supervision area to an adjacent one, as regulated by its functional interface specification [45]. in this section, we describe the formal model of the case study introduced before. it consists of a number of stochastic priced timed games (sptgs), which are basically timed automata with prices (a cost function) and stochasticity, composed as a synchronous product. we briefly describe the model's components, followed by details of the onboard unit. delays in the communications are exponentially distributed with rate 1/4; this is a common way of modelling communication delays. moreover, all transitions are uncontrollable, except for the controllable actions of the driver in the train_ato_t component, which are used to synthesise the safe and optimal strategy. component obu_main_generatelocationrequest_t initiates system interactions by generating a request for a new location to send to the location unit. the location unit component lu_main_t receives a new position request from the onboard unit, replying with the current train location (computed via gnss). the main component obu_main_sendlocationtorbc_t of the onboard unit performs a variety of operations. it receives the position from the location unit, sends the received position to the radio block centre, and additionally implements a safety mechanism present in the original system specification. in particular, at each instant of time, it checks that the train's position does not exceed the ma received from the radio block centre; if it does, it enters a failure state. once a position is received, the radio block centre repeatedly sends a ma message until the corresponding acknowledgement from the onboard unit is received. the component obu_main_receivema_t also models the logic of the onboard unit.
it receives a ma from the radio block centre, and sends back a corresponding acknowledgement. finally, the train_ato_t component was defined to synthesise a strategy for moving the train in a safe and optimal way. in particular, the position of the train (variable loc) is defined in a one-dimensional space and identified by a single coordinate (representing the position along its route), and the train is allowed at each cycle to either move one unit or stay idle. to allow state-space reduction, the value of loc represents a range of the space in which the train is located, rather than a specific point in space. next, we describe this component, depicted in fig. 2, in detail. the initial state of train_ato_t is the nominal state i_go, drawn with two circles. two failure states (failwhilego and failwhilereadloc) are reached in case the ma is exceeded in obu_main_sendlocationtorbc_t. the initial state has an invariant to guarantee that the train has not passed its destination. note that invariants can be constraints on clocks or variables. this is done by checking that the location of the train, which is encoded by the integer variable loc, is less than or equal to the integer constant arrive, which is an input parameter of the model used to perform experiments. from the initial state it is possible to transit to state readlocwhilerun, upon a location request coming from lu_main_t, and to come back from readlocwhilerun to i_go by replying to such a request. variable x is a buffer for value-reading messages. to reduce the model's state space, the value transmitted by train_ato is the remaining headway, i.e. the difference between the ma and the location. indeed, such a value has a fixed range compared to the location (under the assumption that the arrival point is greater than the initial headway value; otherwise the train will never exceed its ma before arriving at the destination).
in turn, obu_main_sendlocationtorbc_t checks if such transmitted value (headway) is negative for triggering a failure, since in that case the train has exceeded its ma. from both states i_go and readlocwhilerun, an inner loop is used to receive the new ma (movaut) from rbc_main_t. the movaut should be relative to the current location loc of the train, i.e. movaut = loc + fixed number of meters that the train is allowed to travel. however, to reduce the state space, such a message simply resets the headway variable to its initial value, which is an integer constant called ma. thus, movaut is not stored in the state space because its value can be retrieved as loc+ma. the constant ma is another input parameter of the model. the reason such a loop is also present in state readlocwhilerun is that otherwise the ma message would be lost in this state, and similarly for the urgent state (marked with the symbol u, described below). we now discuss the two controllable transitions in the model. the first is used to move a train. in uppaal stratego a controller cannot delay its actions (whereas the environment can), hence the movement of the train is split into an uncontrollable transition followed by a controllable one, with an intermediate urgent state. an intermediate urgent state is such that a transition must be taken without letting time pass. this is a workaround to force the controller to perform an action at that instant of time. from the initial state i_go, an uncontrollable transition targeting the urgent state is used to check that the conditions for moving the train are met. in particular, if the headway is nonnegative and the train has not arrived, the transition for moving the train is enabled. additionally, a test c>0 on the clock c is used to forbid zeno behaviour. indeed the clock c is reset to zero after the train has moved, hence time is forced to pass before the next movement. 
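the interplay between movement, headway, and ma messages described above can be illustrated with a small python toy. this is a sketch of the abstraction, not the actual uppaal model: the constants, the exponential delay on ma messages, and the uncontrolled coin-flip move/idle choice are simplifications introduced here for illustration.

```python
import random

MA = 5        # headway granted by each movement authority (illustrative)
ARRIVE = 20   # destination in abstract location units (illustrative)

def run(rng, ma_delay_mean=4.0):
    """one run of the abstracted train: each tick the driver
    non-deterministically moves one location unit (consuming one unit
    of headway) or idles; ma messages arrive after exponentially
    distributed delays and reset the headway to MA. returns
    ('fail', t) when the headway drops below zero, i.e. the ma is
    exceeded, and ('done', t) when the destination is reached."""
    loc, headway, t = 0, MA, 0
    next_ma = rng.expovariate(1.0 / ma_delay_mean)
    while loc < ARRIVE:
        t += 1
        if t >= next_ma:                 # ma received: headway := MA
            headway = MA
            next_ma = t + rng.expovariate(1.0 / ma_delay_mean)
        if rng.random() < 0.5:           # uncontrolled choice: move or idle
            loc += 1
            headway -= 1
        if headway < 0:                  # train exceeded its ma: hazard
            return "fail", t
    return "done", t

outcomes = [run(random.Random(seed))[0] for seed in range(1000)]
```

with an undriven (randomly moving) train both outcomes occur over many runs, mirroring the analysis of sect. 5, where the failure state is reachable in the absence of a driving strategy.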
such a condition cannot be stated directly on the controllable transition, otherwise a time-lock (i.e. time is not allowed to pass) would be reached in case the condition is not met. the controllable transition (drawn as a solid arc) from the urgent state can either set the integer speed to 1 or to 0, allowing the train to proceed to the next interval of space or to remain in the previous interval, respectively. recall that loc is not a coordinate but rather an abstraction of a portion of space. the controllable transition also updates the headway. to reduce the state space, the only negative value allowed for the variable headway is -1. finally, the second controllable transition is used to reach a sink state done. to further reduce the state space, the train is not allowed to move once loc has reached value arrive. a hybrid clock timer is used as a stop-watch to measure the time it takes for the train to arrive in state done. to this aim, the invariant timer'==0 in state done sets the derivative of clock timer to zero. a hybrid clock can be abstracted away during the synthesis of a safe strategy. in this section, we report on the analysis of the formal model. the main objective is to synthesise a safe strategy such that the train does not exceed the ma. additionally, the train should be as fast as possible, within the limits imposed by the safety requirements. to this aim, an optimal and safe strategy is synthesised. the experiments were carried out on a machine with an intel(r) core(tm) i7-8500y cpu at 1.50 ghz, 1601 mhz, 2 cores, and 4 logical processors, with 16 gb of ram, running 64-bit windows 10. the development version of uppaal stratego (academic version 4.1.20-8-beta2) was used. indeed, when developing the model and its analysis, minor issues were encountered (more on this later). this version of uppaal stratego contains some patches resulting from a series of interactions between the first author and the developers team at aalborg university.
the set-up of the parameters of the statistical model checker was chosen to provide good confidence in the results and is as follows: the probabilistic deviation is set to δ = 0.01, the probabilities of false negatives and false positives are set to α = 0.005 and β = 0.5, respectively, and the probability uncertainty is set to ε = 0.005. as anticipated in sect. 3, we focussed on one radio block centre, one onboard unit, and one location unit, i.e. one train communicating with one radio block centre. finally, we set ma = 5 and arrive = 20. to begin with, we want to check if the hazard of exceeding the ma is possible at all in our model. if such a hazard were never possible, the safe strategy would simply allow all behaviour. to analyse this, we perform standard model checking of a safety formula checking that, for every possible path, the state maexceededfailure of the component obu_main_sendlocationtorbc is never visited. indeed, this particular state is reached exactly when the hazard occurs, i.e. the ma is exceeded, thus triggering a failure. after 0.016 s, using 38,200 kb of memory, uppaal stratego reports that this formula is not satisfied; thus such a hazard is possible without a proper strategy to drive the train. we would like to check the likelihood of reaching this failure, given this specific set-up of parameters. first, the average maximal time in which the train reaches its destination is computed. this is important to fine-tune the time bound for further simulations. to do so, we use the statistical model checker to evaluate a formula computing the average maximal value of the train_ato.timer stop-watch, i.e. the arrival time. it is computed based on 10,000 simulations with an experimental time bound of 700 s. the computed value is 377.235 ± 3.960, and its probability distribution is depicted in fig. 3. by analysing the probability distribution, it is possible to notice that the average value is lower if faults are ignored.
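the statistical parameters above determine how many simulation runs smc needs. as a rough illustration, the generic two-sided chernoff-hoeffding bound relates the uncertainty ε and the confidence parameter to a worst-case run count; note this is a textbook bound, not uppaal's internal computation, which uses sharper (e.g. sequential) methods and typically needs far fewer runs.

```python
import math

def hoeffding_runs(eps, alpha):
    """number of i.i.d. simulation runs sufficient to estimate a
    probability within +/- eps of its true value with confidence
    1 - alpha (two-sided chernoff-hoeffding bound)."""
    return math.ceil(math.log(2.0 / alpha) / (2.0 * eps ** 2))

# with the uncertainty eps = 0.005 and alpha = 0.005 used above,
# the worst-case bound is:
runs = hoeffding_runs(0.005, 0.005)  # 119830
```

the fact that the tool's reported run counts sit well below this worst-case figure is consistent with its use of sequential estimation.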
indeed, in case of faults the value of timer equals the end of the simulation (i.e. 700 s). hence, the time bound for the following simulations is set to 500 s, thus also considering worst cases of arrival time. we now compute the likelihood of the model reaching a failure, using smc to measure the probability of reaching the failure state with the following formula: uppaal stratego executes 33952 simulations and the probability is within the range [0.117029, 0.127029], with confidence 0.995. the probability confidence interval plot for this experiment is depicted in fig. 4 . we conclude that, for this set-up of parameters, there is a relatively high probability of this hazard occurring. this is as expected, due to the absence of a strategy for driving the train and the non-deterministic choice of whether or not to move the train. after these standard and statistical model-checking experiments, we exploit the synthesis capabilities of uppaal stratego to automatically fix the specification to adhere to the safety constraints. indeed, no manual intervention to fix the model is needed: it suffices to compute a driving strategy and compose it with the model. recall that the only controllable transitions in the model are those for deciding whether or not to move the train (i.e. related to acceleration/deceleration, accordingly). this in turn depends on the stochastic delays in communication. the strategy prunes controllable transitions such that the previously reachable configurations leading to the failure state are no longer reachable. figure 5 shows the trajectory of variable train_ato.loc for 50 simulations, computed in 0.147 s, using 576,984 kb. we see that in all trajectories the train never stops before reaching its destination, i.e. no failure occurs. however, in some simulations the train is relatively slower than in others. uppaal stratego also allows model checking of the synthesised strategies.
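the shape of the reported interval [0.117029, 0.127029] is the point estimate plus/minus the probability uncertainty ε; a minimal sketch of that computation follows, where the success count k is hypothetical, chosen only to land near the values quoted in the text:

```python
def probability_interval(k: int, n: int, eps: float):
    """point estimate k/n widened by the probability uncertainty eps,
    clamped to [0, 1]; a sketch, not the tool's exact procedure."""
    p_hat = k / n
    return (max(0.0, p_hat - eps), min(1.0, p_hat + eps))

# hypothetical success count over the 33952 runs reported in the text:
low, high = probability_interval(k=4143, n=33952, eps=0.005)
```

the interval width is always 2ε by construction, which matches the reported range spanning exactly 0.01.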
we ran a full state-space exploration by means of standard model checking to formally verify that, after composing the model with the safe strategy, the hazard of exceeding the ma is mitigated. this is checked through the following formula: this formula checks that in the model composed with the safe strategy, the 'bad' state is never reached. after 2.283 s and using 599,268 kb of memory, uppaal stratego reports that the formula is satisfied, thus confirming that we automatically synthesised a strategy for mitigating the hazard. however, even if not shown in fig. 5 , there exist trajectories in the composition where the train never reaches its destination. this can be formally proven with a full state-space exploration of the strategy by standard model checking of the following formula: this formula checks that in all paths eventually state train_ato.done is reached (i.e. the train reaches its destination). after 0.053 s, using 599,268 kb of memory, uppaal stratego reports that the formula does not hold. indeed, as expected, the strategy does not guarantee that such a state is always reached; it only guarantees avoidance of state obu_main_sendlocationtorbc.maexceededfailure. for example, there exists also a safe strategy that lets the train remain in its starting position. to evaluate the probability of reaching state train_ato.done under the safe strategy, we ran the statistical model checker to evaluate the following formula; the resulting distribution is depicted in fig. 6 . we conclude that the likelihood of the train not reaching its destination within 500 time units under the safe strategy is low, and is mainly due to the possibility of large delays in communications. these delays are indeed the only source of stochastic behaviour in the model. we now show how uppaal stratego can account for dependability parameters other than safety. in particular, the reliability of the system can be related to the capacity of the train to reach its destination quickly.
we optimise the safe strategy to minimise the arrival time, thus increasing its reliability whilst preserving safety. this can be done with the following query, computed in 0.015 s using 580,844 kb of memory: as expected, the optimised safe strategy improves on the arrival time of the safe strategy. the probability distribution of query φ5 is depicted in fig. 7 . sensitivity analysis of the maximal headway. up to this point, we evaluated the moving block railway signalling system under analysis with a specific parameter set-up. in this set-up, each time the train receives a fresh ma, its headway is reset to 5 (i.e. ma = 5). thus, this is the maximal possible headway. the parameters of the model can be tuned in such a way that the analysed properties are within a desired range of values. in particular, we hypothesise that reducing the maximal headway (i.e. ma) results in a deterioration of the performance of the optimal strategy and in an increment of the probability of reaching a failure without a strategy. indeed, with a tight headway, the train is forced to move slowly in order not to exceed its ma. in the remainder of this section, we experimentally verify our hypothesis. table 1 reports the evaluation of properties φ1-φ5 in three different experiments, with values for ma taken from the set {3, 5, 10}, reporting also the computation times and, where appropriate, the number of runs. by reducing the maximal headway (i.e. ma = 3), we notice an overall deterioration of the average maximal arrival time (cf. properties φ1, φ4, and φ5). moreover, without a strategy the probability of failure is higher when compared to ma = 5 (cf. property φ2). these results confirm our hypothesis and further corroborate the reliability of our model. as a final experiment, we enlarged the maximal headway (i.e. ma = 10) to evaluate the improvement in performance in case of a larger headway.
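the hypothesised effect of the maximal headway can be illustrated with a toy discrete simulation, which is not the uppaal model: the train may move one location per tick while its headway is positive, and a fresh movement authority resets the headway every few ticks. the refresh period and goal distance below are hypothetical parameters chosen for illustration:

```python
def arrival_time(ma: int, refresh: int = 4, goal: int = 20) -> int:
    """ticks needed to reach `goal` when a fresh ma (resetting the
    headway to `ma`) arrives every `refresh` ticks."""
    loc, headway, t = 0, ma, 0
    while loc < goal:
        t += 1
        if t % refresh == 0:
            headway = ma          # fresh movement authority received
        if headway > 0:
            loc += 1              # proceed to the next interval of space
            headway -= 1          # moving consumes headway
    return t

for ma in (3, 5, 10):
    print(ma, arrival_time(ma))
```

even this toy model reproduces the qualitative findings: a tight headway (ma = 3) forces stalls and delays arrival, while enlarging the headway beyond what the refresh rate can exploit (ma = 10 versus ma = 5) buys nothing.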
we recall that a large headway is not desirable, since it would result in a lower capacity of the railway network. in this experiment, the values of φ2 and φ3 are similar to those for ma = 5. however, by observing the values of φ1, φ4, and φ5, we note that there is only a slight improvement in arrival time, even though we doubled the maximal headway. this experiment confirms our intuition that an excessive increment of the maximal headway does not lead to better performance. this is because the train cannot go faster than its optimal speed. conversely, an excessive enlargement of the headway results in a deterioration of the overall track capacity. hence, ma = 5 is a satisfactory set-up for the maximal headway. we have modelled and analysed an autonomous driving problem for a moving block railway signalling system. communication between the train and the radio block centre is modelled such that the train is allowed to proceed only within the limits imposed by the radio block centre via the ma, which is based on the position of the train and updated continuously. the goal is to synthesise a strategy for the train to arrive at its destination as quickly as possible without exceeding its limits. we modelled the problem as a stochastic priced timed game. the controller is in charge of moving the train, playing against uncontrollable stochastic delays in communication. we used uppaal stratego to compute a strategy that enforces safety in the model. the safe strategy was statistically model checked to evaluate the mean arrival time of the train. this quantity was optimised, and the optimised strategy was compared to the safe one. we observed an improvement in the mean arrival time, whilst retaining safety. as far as we know, this is the first application of synthesis techniques to autonomous driving for next-generation railway signalling systems.
this was our first experience with strategy synthesis and optimisation of a case study from the railway domain, and also with uppaal stratego. since this is a very recent tool, there has not yet been much experimentation with it, in particular outside the groups involved in its development. the tool is still undergoing testing, and new versions and patches are released frequently. in fact, while developing the model we ran into corner cases that required interactions with the developer team at aalborg university. those interactions led to the release of new versions, with patches fixing the issues discovered through our model. we did have experience in modelling and analysing railway case studies with uppaal smc [6, 8, 31] . the original model developed in [8] and statistically model checked had to be simplified considerably (cf. sect. 4) to undergo strategy synthesis and verification. indeed, while uppaal smc scales to large systems by applying simulations rather than full state-space exploration, uppaal stratego requires a full state-space exploration of the timed game for strategy synthesis. for example, using the set-up discussed in sect. 5 with ma = 10, if we double the constant arrive (i.e. 40 instead of 20), then during strategy synthesis the tool terminates with an error message due to memory exhaustion. an interesting future line of research would be to adapt the statistical synthesis techniques described in [16, 35] to learn safety objectives, thus avoiding the full state-space exploration (as currently performed in uppaal stratego) while retaining the scalability of smc. this would enable the modelling of more complex ertms case studies. also, further experiments, with different set-ups of the parameters and more trains and radio block centres, need to be performed to investigate the limits of the approach described in this paper in terms of optimisation.
finally, we intend to discuss with our railway project partners the impact of the techniques discussed in this paper.

references:
- a survey of statistical model checking
- modelling the hybrid ertms/etcs level 3 case study in spin
- coordinated intelligent traffic lights using uppaal stratego
- parametric statistical model checking of uav flight plan
- modelling and analysing ertms hybrid level 3 with the mcrl2 toolset
- statistical model checking of a moving block railway signalling scenario with uppaal smc
- on the industrial uptake of formal methods in the railway domain
- modelling and analysing ertms l3 moving block railway signalling with simulink and uppaal smc
- statistical abstraction and model-checking of large heterogeneous systems
- adopting formal methods in an industrial setting: the railways case
- formal methods for transport systems
- quantitative evaluation of systems (qest)
- uppaal-tiga: time for playing games!
- verification of the european rail traffic management system in real-time maude
- performability evaluation of the ertms/etcs level 3
- partial order reduction for reachability games
- formal methods applied to industrial complex systems - implementation of the b method
- abz 2018
- verification of interlocking systems using statistical model checking
- efficient on-the-fly algorithms for the analysis of timed games
- validating the hybrid ertms/etcs level 3 concept with electrum
- uppaal smc tutorial
- on time with minimal expected cost! uppaal stratego
- diagram-led formal modelling using iuml-b for hybrid ertms level 3
- ertms/etcs rams requirements specification - chap. 2 - ram, 30
- twenty-five years of formal methods and railways: what next?
- formal methods and safety certification: challenges in the railways domain
- some trends in formal methods applications to railway signaling
- model-based development and formal methods in the railway industry
- comparing formal tools for system design: a judgment study
- ertms level 3: the game-changer
- validation and real-life demonstration of etcs hybrid level 3 principles using a formal b model
- formal verification of railway timetables - using the uppaal model checker
- teaching stratego to play ball: optimal synthesis for continuous space mdps
- dependability checking with stocharts: is train radio reliable enough for trains?
- safe and time-optimal control for railway games
- safe and optimal adaptive cruise control
- statistical model checking: an overview
- an event-b model of the hybrid ertms/etcs level 3 standard
- ten diverse formal models for a cbtc automatic train supervision system
- towards formal methods diversity in railways: an experience report with seven frameworks
- modeling railway control systems in promela
- recent progress in application of gnss and advanced communications for railway signaling
- unisig: fis for the rbc/rbc handover

acknowledgements. funding by miur prin 2017ftxr7s project it matters (methods and tools for trustworthy smart systems) and h2020 project 4securail (formal methods and csirt for the railway sector). the 4securail project received funding from the shift2rail joint undertaking under eu's h2020 research and innovation programme under grant agreement 881775. we thank the uppaal developers team, in particular danny poulsen, marius mikucionis, and peter jensen, for their assistance with uppaal stratego.

key: cord-023284-i0ecxgus authors: nan title: abstracts of publications related to qasr date: 2006-09-19 journal: nan doi: nan sha: doc_id: 23284 cord_uid: i0ecxgus nan
results: an overview is given on the approaches for the discovery and design concepts of bioactive molecules: a) natural products derived from plant extracts and their chemically modified derivatives (cardiac glycosides, atropine, cocaine, penicillins, cephalosporins, tetracyclines and actinomycins, pyrethrins and cyclosporin; b) biochemically active molecules and their synthetic derivatives: acetylcholine, histamine, cortisonelhydrocortisone, indole-3-acetic acid (phenoxyacetic acid herbicides); c) principles of selective toxicity is discussed exemplified by trimethoprimlmethotrexate, tetracyclines, acylovir, azidothymidine, antifungal agents; d) metabolism of xenobiotics; e) exploitation of secondary effects (serendipity); f) receptor mapping; g) quantitative structure-activity relationship studies; h) empirical screening (shotgun approach). results: past and present of qsar is overviewed: a) historical roots; b) the role of qsar models in rational drug design, together with a simplified diagram of the steps involved in drug development, including the place of qsar investigations; c) classification of qsar models: structure-cryptic (property-activity) models, structure-implicit (quantum chemical) models, structure-explicit (structure-activity) and structure-graphics (computer graphics) models; d) a non-empirical qsar model based on quantities introduced for identification of chemical structures, using szymansk's and randic's identification (id) numbers, including applications for alkyltriazines. bioessays, 1989, 11(5) , 136-141. 
results: a review is given of recent observations about receptor structure, the dynamic nature of drug receptors and the significance of receptor dynamics for drug design: a) receptors are classified according to structure and function: (i) ion channels (nicotinic acetylcholine, gaba, glycine); (ii) g-protein linked [adrenergic (α, β), muscarinic acetylcholine, angiotensin, substance k, rhodopsin]; (iii) tyrosine kinase (insulin, igf, egf, pdgf); (iv) guanylate cyclase (atrial natriuretic peptide, speractin); b) protein conformational changes can be best studied on allosteric proteins whose crystal structure is available (e.g. hemoglobin, aspartate transcarbamylase, tryptophan repressor); no high-resolution receptor structure is known; c) receptor conformational changes can be studied by several indirect approaches: (i) spectral properties of covalently or reversibly bound fluorescent reporter groups; (ii) the sensitivity of the receptor to various enzymes; (iii) the sedimentation or chromatographic properties of the receptor; (iv) the affinity of binding of radioligands; (v) the functional state of the receptor; d) there are many unanswered questions, e.g. (i) are there relatively few conformational states for receptors, with fluctuations around them, or many stable conformational states; (ii) how can static structural information be used in drug design when multiple receptor conformations exist. title: designing molecules and crystals by computer. (review) author: koide, a. ibm japan limited, tokyo scientific center, tokyo research laboratory 5-19 sanban-cho, chiyoda-ku, tokyo 102, japan. source: ibm systems journal 1989, 28(4), 613 -627.
results: an overview is given of three computer-aided design (cad) systems developed by the ibm tokyo scientific center: a) a molecular design support system providing a strategic combination of simulation programs for industrial research and development, optimizing the computational time involved and the depth of the resulting information; b) molworld, on ibm personal systems, intended to create an intelligent visual environment for rapidly building energetically stable 3d molecular geometries for further simulation study; c) a molecular orbital graphics system designed to run on ibm mainframe computers, offering a highly interactive visualization environment for molecular electronic structures; d) the systems allow interactive data communication among the simulation programs for their strategically combined use; e) the structure and functions of molworld are illustrated on modeling the alanine molecule: (i) data model of molecular structures; (ii) chemical formula input; (iii) generation of 3d molecular structure; (iv) formulation of bonding model; (v) interactive molecular orbital graphics; (vi) methods of visualizing electronic structures; (vii) use of molecular orbital graphics for chemical reactions. title: interfacing statistics, quantum chemistry, and molecular modeling. (review) author: magee, p.s. biosar research project vallejo ca 94591, usa. source: acs symposium series 1989, no.413 . in: probing bioactive mechanisms p.37-56. edited by magee, p.s., henry, d.r., block, j.h., american chemical society, washington, 1989. quant. struct.-act. relat. 9, 234 -293 (1990) results: a review is given of the application and overlap of quantum chemical, classical modeling and statistical approaches for the understanding of binding events at the molecular level.
a new complementary method, called a statistical docking experiment, is also presented: insights are obtained using energy-minimized structures on activation in the bound state and on the types and energies of interactions at the receptor site and in crystal; four successful examples (significant regression equations) are given for the modeling of binding events using physico-chemical descriptors and correlation analysis: (i) binding of a diverse set of pyridines to silica gel during thin-layer chromatography; (ii) binding of meta-substituted n-methyl-arylcarbamates to bovine erythrocyte ache; (iii) binding of meta-substituted n-methyl-arylcarbamates to ache obtained from susceptible and resistant green rice leafhoppers; (iv) activity of phenols inhibiting oxidative phosphorylation of adp to atp in yeast; a new statistical method for the mapping of binding sites has been developed based on the hypermolecule approach, identifying key positions of binding and the nature of the energy exchange between the hypermolecule atoms and the receptor site; two examples are given of the successful application of statistical modeling (the statistical docking experiment) based on the hypermolecule approach: (i) inhibition of housefly head ache by meta-substituted n-methyl-arylcarbamates (n = 36, r = 0.841, s = 0.390, f = 25.82); (ii) inhibition of housefly head ache by ortho-substituted n-methyl-arylcarbamates (n = 46, r = 0.829, s = 0.485, f = 14.24). a) qsar of cns drugs has been systematically discussed according to the following classes: (i) general (nonspecific) cns depressants: general anesthetics, hypnotics and sedatives; (ii) general
(nonspecific) cns stimulants; (iii) selective modifiers of cns functions: anticonvulsants, antiparkinsonism drugs, analgetics and psychopharmacological agents; (iv) miscellaneous: drugs interacting with central α-adrenoreceptors, drugs interacting with histamine receptors, cholinergic and anticholinergic drugs; b) the review indicates that the fundamental property of the molecules which most influences the activity of cns drugs is hydrophobicity (they have to pass the cell membrane and the blood-brain barrier); c) electronic parameters, indicative of dipole-dipole or charge-dipole interactions, charge-transfer phenomena and hydrogen-bond formation, are another important factor governing the activity of most cns agents; d) topographical, lipophilic and electronic structures of cns pharmacophores are reviewed; e) 191 qsar equations, 24 tables and 3 figures from 294 references are shown and discussed. the relevant template for each atom in the molecule is mapped into a bit array and the appropriate atomic position is marked; volume comparisons (e.g. common volume or excluded volume) are made by bit-wise boolean operations; the algorithm for the visualization of the molecular surface comprising the calculated van der waals volume is given; comparisons of the cpu times required for the calculation of the van der waals molecular volumes of various compounds using the methods of stouch and jurs, pearlman, gavezzotti and the new method showed that similar or better results can be achieved using the new algorithm with vax-class computers on molecules containing up to several hundred atoms. one of the important goals of protein engineering is the design of isosteric analogues of proteins; major software packages available for molecular modeling are, among others, developed by (i) biodesign, inc., pasadena, california; (ii) biosym technologies, san diego, california; (iii) tripos, st.
louis, missouri; (iv) polygen, waltham, massachusetts; (v) chemical design ltd., oxford; the molecular modelling packages use three basic parameters: (i) a descriptive energy field; (ii) an algorithm for performing molecular mechanics calculations; (iii) an algorithm for performing molecular dynamics calculations; a modelling study of the binding events occurring between the envelope protein (gp120) of the aids (hiv) virus and its cellular receptor (cd4) protein supported the hypothesis that this domain is directly involved in binding the gp120 envelope protein, leading to the design of conformationally restricted synthetic peptides binding to cd4. 19901229 title: finding washington, 1989. results: a new technique called "homology graphing" has been developed for the analysis of sequence-function relationships in proteins, which can be used for sequence-based drug design and the search for lead structures: a) as a target protein is inhibited by the ligands of other proteins having sequence similarity, computer programs have been developed for the search of protein sequence similarity; b) proteins are organized into hierarchical groups of families and superfamilies based on their global sequence similarities; c) global sequence similarities were used to find inhibitors of acetolactate synthase (als) and resulted in a quinone derivative as a lead structure for new als inhibitors; d) local sequence similarities of bacterial and mammalian glutathione synthase (gsh) were used to find inhibitors of gsh; e) it was shown that the sequence segment of gsh similar to dihydrofolate reductase (dhfr) is part of the atp-binding site; f) biological bases of local similarity between sequences of different proteins were indicated: molecular evolution of proteins and functionally important local regions; g) the homology graph, as a measure of sequence similarity, was defined; h) the sequence-chemical structure relationship based on the homology graph and the procedure to find lead structures was illustrated by an
example, resulting in a list of 33 potential inhibitors selected by the procedure based on the sequence segment from residue 150 to 210 of the sequence of tobacco als. source: acs symposium series 1989, no.413. in: probing bioactive mechanisms p.198-214. edited by magee, p.s., henry, d.r., block, j.h., american chemical society, washington, 1989. results: a review is given of the molecular design of the following major types of antifungal compounds in relation to biochemistry, molecular modeling and target site fit: a) squalene epoxidase inhibitors (allylamines and thiocarbanilates) blocking the conversion of squalene to 2,3-oxidosqualene; b) inhibitors of sterol c-14 demethylation by cytochrome p-450 (piperazines, pyridines, pyrimidines, imidazoles and triazoles); c) inhibitors of sterol δ8 → δ7 isomerization and/or sterol reductase inhibitors (morpholines); d) benzimidazoles specifically interfering with the formation of microtubules, and the activity of phenylcarbamates on benzimidazole-resistant strains; e) carboxamides specifically blocking the membrane-bound succinate ubiquinone oxidoreductase activity in the mitochondrial electron transport chain in basidiomycetes; f) melanin biosynthesis inhibitors selectively interfering with the polyketide pathway to melanin in pyricularia oryzae by blocking nadph-dependent reductase reactions of the pathway (fthalide, pcba, chlobentiazone, tricyclazole, pyroquilon, pp389). title: quantitative modeling of soil sorption for xenobiotic chemicals. (review) author: sabljic, a. theoretical chemistry group, department of physical chemistry, institute rudjer boskovic hpob 1016, yu-41001 zagreb, croatia, yugoslavia. source: environ. health perspect. 1989, 83(2), 179 -190. results: the environmental fate of organic pollutants depends strongly on their distribution between different environmental compartments.
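the soil sorption abstract above relies on structural descriptors; a central one in this literature is randic's first-order molecular connectivity index, computed over the hydrogen-suppressed molecular graph as a sum over bonds of 1/sqrt(deg_i * deg_j). a minimal sketch follows; the bond-list input format is our own assumption, not the paper's:

```python
import math

def first_order_chi(bonds):
    """first-order molecular connectivity index over a bond list of a
    hydrogen-suppressed graph: sum over bonds of 1/sqrt(deg_i * deg_j)."""
    deg = {}
    for a, b in bonds:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return sum(1.0 / math.sqrt(deg[a] * deg[b]) for a, b in bonds)

butane = [(0, 1), (1, 2), (2, 3)]      # c-c-c-c chain
isobutane = [(0, 1), (0, 2), (0, 3)]   # central carbon with three methyls
```

the index distinguishes branching: the linear chain of butane scores about 1.914, while the branched isobutane scores about 1.732.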
a review is given on modeling the soil sorption behavior of xenobiotic chemicals: a) distribution of xenobiotic chemicals in the environment and principles of its statistical modeling; b) quantitative structure-activity relationship (qsar) models relating the chemical, biological or environmental activity of the pollutants to their structural descriptors or physico-chemical properties, such as logp values and water solubilities; c) analysis of the existing qsar models showed (i) low precision of water solubility and logp data; (ii) violations of some basic statistical laws; d) the molecular connectivity model has proved to be the most successful structural parameter for modeling soil sorption; e) highly significant linear regression equations are cited between koc values and the first-order molecular connectivity index (¹χ) of a wide range of organic pollutants, such as polycyclic aromatic hydrocarbons (pahs) and pesticides (organic phosphates, triazines, acetanilides, uracils, carbamates, etc.), with r values ranging from 0.976 to 0.986 and s values ranging from 0.202 to 0.300; f) the molecular connectivity model was extended by the addition of a single semiempirical variable (a polarity correction factor), resulting in a highly significant linear regression equation between the calculated and measured koc values of the total set of compounds (n = 215, r = 0.969, s = 0.279, f = 3291); g) molecular surface areas and the polarity of the compounds were found to be responsible for the majority of the variance in the soil sorption data of a set of structurally diverse compounds. title: strategies for the use of computational sar methods in assessing genotoxicity. (review) results: a review is given of the overall strategy and computational sar methods for the evaluation of the potential health effects of chemicals.
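the statistics quoted throughout these abstracts (n, r, s, f) are the standard summary of a simple linear regression and can be reproduced for any data set; a minimal sketch with made-up illustrative data:

```python
import math

def regression_stats(x, y):
    """n, correlation r, standard error of estimate s, and f statistic
    for a one-variable least-squares fit y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ssr = syy - sse                   # variance explained by the fit
    r = math.sqrt(ssr / syy)          # correlation coefficient
    s = math.sqrt(sse / (n - 2))      # standard error of estimate
    f = ssr / (sse / (n - 2))         # f statistic for one regressor
    return n, r, s, f
```

for example, `regression_stats([0, 1, 2, 3], [1, 2, 2, 3])` yields n = 4, r ≈ 0.949, s ≈ 0.316 and f = 18, the same four quantities reported for the koc and carbamate equations above.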
the main features of this strategy are discussed as follows: a) a generalized sar model outlining the strategy of developing information for the structure-activity assessment of the potential biological effects of a chemical or a class of chemicals; b) models for predicting health effects, taking into account a multitude of possible mechanisms; c) theoretical models for the mechanism of the key steps of differential activity at the molecular level; d) sar strategies using linear free-energy methods such as the hansch approach; e) correlative sar methods using multivariate techniques for descriptor generation and an empirical analysis of data sets with a large number of variables (simca, adapt, topkat, case, etc.); f) data base considerations describing three major peer-reviewed genetic toxicology data bases: (i) the national toxicology program (ntp) data base containing short-term in vitro and in vivo genetic tests; (ii) the data base developed by the epa gene-tox program containing 73 different short-term bioassays for more than 4000 compounds, used in conjunction with adapt, case and topkat; (iii) genetic activity profiles (gap) in the form of bar graphs displaying information on various tests using a given chemical. title: quantitative structure-activity relationships. principles and applications to mutagenicity and carcinogenicity. (review) authors: benigni, r.; andreoli, c.; giuliani, a. laboratory of toxicology and ecotoxicology, istituto superiore di sanita, rome, italy. source: mutat. res. 1989, 221(3), 197 -216.
results: methods developed for the investigation of the relationships between structure and toxic effects of compounds are summarized: a) the extra-thermodynamic approach: the hansch paradigm, physical-chemical properties that influence biological activity and their parametrization, originality of the hansch approach, receptors and pharmacophores: the natural content of the hansch approach, predictive value of qsars, a statistical tool: multiple linear regression analysis, the problem of correlations among molecular descriptors, other mathematical utilizations of extrathermodynamic parameters; b) the substructural approach: when topological (substructural) descriptors are needed, how to use topological descriptors; c) qsar in mutagenicity and carcinogenicity: general problems, specific versions of the substructural approach used for mutagenicity and carcinogenicity, applications to mutagenicity and carcinogenicity. title: linking structure and data. (review) author: bawden, d. source: chem. britain 1989, 25(nov) , 1107 -1108. address not given. results: the integration of information from different sources, particularly the linking of structural with non-structural information, is an important consideration in chemical information technology.
a review is given of integrated systems: a) the socrates chemical/biological data system for chemical structure and substructure searching combined with the retrieval of biological and physicochemical data, compound availability, testing history, etc.; b) the psidom suite of pc-based structure-handling routines combining chemical structure with the retrieval of text and data; c) the cambridge crystal structure databank of x-ray data on organic compounds, integrating information on chemical structure, crystal conformation, numerical information on structure determination, bibliographic references and keywording; d) computer-aided organic synthesis for structure and substructure search, reaction retrieval, synthetic analysis and planning, stereochemical analysis, product prediction and thermal hazard analysis. title: determination of three-dimensional structures of proteins and nucleic acids in solution by nuclear magnetic resonance spectroscopy. source: critical rev. biochem. mol. biol. 1989, 24(5) , 479 -564.
results: a comprehensive review is given of the use of nmr spectroscopy for the determination of 3d structures of proteins and nucleic acids in solution, discussing the following subjects: a) the theoretical basis of two-dimensional (2d) nmr and nuclear overhauser effect (noe) measurements for the determination of 3d structures; b) sequential resonance assignment for identifying spin systems of protein nmr spectra and nucleic acid spectra, selective isotope labeling for extension to larger systems and the use of site-specific mutagenesis; c) measurement and calculation of structural restraints of the molecules: (i) interproton distances; (ii) torsion angle restraints; (iii) φ backbone torsion angle restraints; (iv) side chain torsion angle restraints; (v) stereospecific assignments; (vi) dihedral angle restraints in nucleic acids; d) determination of secondary structure in proteins; e) determination of tertiary structure in proteins using (i) metric matrix distance geometry; (ii) minimization in torsion angle space; (iii) restrained molecular dynamics; (iv) dynamical simulated annealing; (v) folding an extended strand by dynamical simulated annealing; (vi) the hybrid metric matrix distance geometry-dynamical simulated annealing method; (vii) dynamical simulated annealing starting from a random array of atoms; f) evaluation of the quality of structures generated from nmr data, illustrated by studies of the structure determination of proteins and oligonucleotides using various algorithms and computer programs; g) comparisons of solution and x-ray structures of (i) globular proteins; (ii) related proteins; (iii) nonglobular proteins and polypeptides; h) evaluation of the attainable precision of the determination of solution structures of proteins for which no x-ray structures exist: (i) bds-i (a small 43-residue protein from the sea anemone sulcata); (ii) hirudin (a small 65-residue protein from the leech which is a potent natural inhibitor of coagulation); i) structure
determination by nmr is the starting point for the investigation of the dynamics of conformational changes upon ligand binding, unfolding kinetics, conformational equilibria between different conformational states, fast and slow internal dynamics and other phenomena (abstr. 236-239, quant. struct.-act. relat. 9, 234-293 (1990)). title: aladdin. an integrated tool for computer-assisted molecular design and pharmacophore recognition from geometric, steric, and substructure searching of three-dimensional molecular structures. aladdin has the ability to (i) objectively describe a receptor map hypothesis; (ii) scan a database to retrieve untested compounds which are predicted to be active by a receptor map hypothesis; (iii) quantitatively compare receptor map hypotheses for the same biological activity; (iv) design compounds that probe the bioactive conformation of a flexible ligand; (v) design new compounds that a receptor map hypothesis predicts to be active; (vi) design compounds based on structures from protein x-ray crystallography. a search made by aladdin in a database for molecules that should have d2 dopaminergic activity recognized unexpected d2 dopamine agonist activity of existing molecules; a comparison of two superposition rules for d2 agonists performed by aladdin resulted in a clear discrimination between active and inactive compounds; a compound set was designed that matches each of the three low-energy conformations of dopamine, resulting in novel active analogues of known compounds; mimics of some peptide beta turns were designed, in order to demonstrate that aladdin can find small molecules that match a portion of a peptide chain and/or backbone. results: lately a number of chemical information systems based on three-dimensional (3-d) molecular structures have been developed and used in many laboratories: a) concord uses empirical rules and simplified energy minimization to rapidly generate approximate but usually highly accurate 3-d molecular structures from chemical notation or molecular connection table input; b) chemical abstracts service (cas) has added 3-d coordinates for some 4 million organic substances to the cas registry file; c) cambridge structural database system contains x-ray and neutron diffraction crystal structures for tens of thousands of compounds; d) maccs3d, developed by molecular design ltd., contains the standard maccs-i structures to which additional 3-d data, such as cartesian coordinates, partial atomic charges and molecular mechanics energy, are added; maccs3d allows exact match, geometric, submodel and substructure searching of 3-d models with geometric constraints specified to a certain degree of tolerance; two 3-d databases are also available from molecular design that can be searched using maccs3d [drug data report (10,000 models) and fine chemicals directory (90,000 models)]; e) aladdin (daylight chemical information systems) also searches databases of 3-d structures to find compounds that meet biological, substructural and geometric criteria such as ranges of distances, angles defined by three points (dihedral angles) and plane angles that the geometric object must match. aladdin is one of a number of menus working within the framework provided by daylight's chemical information system. title: improved access to supercomputers boosts chemical applications. author: borman, s., c&en, 1155 sixteenth st., n.w., washington dc 20036, usa. source: c&en 1989, 67(29), 29-37.
results: supercomputers have become much more accessible to scientists and engineers in the past few years, in part as a result of the establishment of national science foundation (nsf) supercomputer centers. the most powerful class of supercomputers has program execution rates of 100 million to 1 billion floating-point operations per second, memory storage capacities of some ten million to 100 million computer words and a standard digital word size of 64 bits, the equivalent of about 15 decimal digits. the following examples are given for the use of supercomputer resources for chemical calculations and modeling: a) modeling of key chromophores in the photosynthetic reaction center of rhodopseudomonas viridis showing the heme group, the iron atom and the chlorophyll which absorbs light and causes rapid transfer of an electron to pheophytin and then to the quinone; the modeling includes a significant part of the protein, having about 2000 atoms out of a total of some 12,000; b) modeling of the transition state of the reaction between chloride and methyl chloride including electron clouds and water molecules surrounding the reaction site; c) analysis of nucleic acid and protein sequences to evaluate the secondary structure of these biopolymers; d) construction of a graphical image of hexafluoropropylene oxide dimer, a model for dupont krytox high performance lubricant; e) calculation of the heats of formation of diaminobenzene isomers indicated that the target para isomer was 3 kcal/mol less stable than the meta isomer byproduct, therefore the development of its large scale catalytic synthesis was not undertaken (the saving was estimated to be $1 to $2 million). b) comparison of the newly defined eo parameter with the taft-kutter-hansch es (tkh es) parameter showed characteristic steric effects of ortho-alkoxy and π-bonded planar type substituents (e.g. no2, ph); c) in various correlation analyses using retrospective data eo satisfactorily represented the steric effects of ortho-substituents on reactivity and biological activity of various organic compounds; d) semi-empirical am1 calculations using a hydrocarbon model to study the steric effects of a number of ortho-substituents resulted in the calculation of the es value (difference in the heat of formation between ortho-substituted toluene and t-butylbenzene) which linearly correlated with the eo and the tkh es parameters; e) effects of di-ortho substitution on lipophilicity could be mostly expressed by the summed effect of the 2- and 6-position substituents; f) highly significant regression equations were calculated for the pka values of di-ortho-substituted benzoic acids using various substituent parameters; g) quantitative analysis of the effect of ortho-substitution is difficult because it is a result of overlapping steric and electronic effects. title: calculation of partition coefficient of n-bridgehead compounds [(ii) is more lipophilic than propranolol-4-sulphate (iv)]. fig. 1 shows the relationship between lipophilicity and ph for the compounds (circle represents (i), triangle (ii), rhomboid (iii) and square (iv)); f (rekker's constant, characterizing hydrophobicity). results: a good agreement was found between the observed and calculated logp values of ii (3.98 and 3.65, respectively) and for iii. the hydrophobicity of i was found to be significantly lower than that of ii (2.78 and 4.72, respectively). the large deviation was attributed to the surface reduction as a result of condensed ring formation in i.
since interesting pharmacological activities have been reported for several derivatives of this type of compounds, the hydrophobicity of the unsubstituted 11h-indolo[3,2-c]quinoline has been calculated to be 2.22. [the interaction energy between a molecule and the binding site model was assumed to be the sum of its atomic contributions according to the expression ag(b) = sum over site regions r and atoms a placed in r of e(r,type(a)) (1), where e(r,type(a)) was the interaction energy parameter between the site region r and the atom-type of atom a, and ag(b) was the total interaction energy for the binding mode b (a binding mode was regarded as feasible when the molecule was in its energetically most favorable conformation)]. results: for development of the binding site model, first a simple geometry was proposed and it was checked that the calculated ag fell within the experimental range ag ± δ for the whole set of compounds. if the calculated binding energy of any of the compounds was outside of the above boundary, the proposed site geometry was rejected and a more complex one was considered. this procedure was repeated until all molecules in the set could be fitted within the experimental data range. as a result a 3d, five-region voronoi binding site model has been developed for the pahs, containing a trigonal pyramid (r1) in the center and portions r2-r5 having infinite volumes and delimited by boundary planes. region r2 represented access to the solvent and regions r3-r5 were blocked for binding (fig. 1); pyrene is shown in its optimal binding mode with its atoms barely touching the boundary surfaces and edges. calculations showed that benzene and other monoaromatic ring compounds should be very weak competitors for the b[a]p site. the model correctly predicted the binding energy of nine competitors outside of the training set.
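the iterative site-model fitting loop summarized above (propose a geometry, score each molecule's binding modes as a sum of region/atom-type energy parameters, reject the geometry if any calculated energy falls outside the experimental range) can be sketched as follows; the function names, region labels, energy values and the tolerance are illustrative assumptions, not the paper's fitted parameters.

```python
# sketch of a voronoi-type binding site consistency check; all numeric
# values here are made up for illustration.

def binding_energy(mode, site_params):
    # ag(b): sum over (region, atom_type) placements of the mode
    return sum(site_params[(region, atom_type)] for region, atom_type in mode)

def site_is_consistent(molecules, site_params, delta=1.0):
    # a proposed site geometry is kept only if every molecule's best
    # (lowest-energy) binding mode reproduces its measured ag within
    # the experimental tolerance delta; otherwise it is rejected and a
    # more complex geometry must be proposed.
    for measured_ag, modes in molecules:
        best = min(binding_energy(m, site_params) for m in modes)
        if abs(best - measured_ag) > delta:
            return False  # reject this geometry
    return True
```

with a toy two-region site, a molecule whose best mode matches its measured energy passes, while one that cannot be fitted within the tolerance forces rejection of the geometry.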
2dw (wiener index calculated as the sum of all unique shortest distances between atoms in the hydrogen suppressed graph of the compound); 3dw (wiener index calculated as the sum of all geometric distances between atoms in the hydrogen suppressed molecule of the compound). results: the traditional 2d wiener number is defined as the sum of the lengths of all possible routes in the molecular graph. here the length is proposed to be calculated as the real three-dimensional length between atoms: this is the 3d wiener number. this number has many of the advantageous features of the related and very much studied 2d wiener number. additionally, it is highly discriminative and its use in quantitative structure-property relationship studies (qspr) appears to be encouraging, according to the preliminary calculations. of these the most convincing is the set of statistical parameters for the linear correlation between the experimental and calculated enthalpy functions of the lower alkanes, not shown here. three different models have been tried and in all cases the 3d wiener number seemed to be superior to the 2d one, as is reflected in eqs. 1-6. a) gaba receptors in human, mouse, rat and bovine brain tissues, membrane preparations and cellular uptake systems; b) gaba receptors in cat and rat spinal cord preparations; c) cultured astrocytes. as nearly all indicator variables in the equations had negative regression coefficients it was concluded that instead of searching for better analogs, the research should be directed toward degradable pro-gaba or pro-muscimol derivatives that are efficiently taken up into the central nervous system (cns). title: synthesis and qsar of 1-aryl-4-(β-2-quinolyl/1-isoquinolylethyl)piperazines and some related compounds as hypotensive agents. based on eq. 1, an optimal logp is predicted (logpo = 4.23).
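the 2d and 3d wiener numbers defined in the abstract above can be computed directly from a hydrogen-suppressed graph and a set of atomic coordinates; a minimal sketch (the example molecule and atom numbering are illustrative):

```python
import itertools
import math

def wiener_2d(adj):
    # classical wiener number: sum of shortest-path (bond-count) distances
    # over all unordered atom pairs of the hydrogen-suppressed graph.
    # adj[i] is the set of neighbours of atom i.
    n = len(adj)
    inf = float("inf")
    d = [[0 if i == j else (1 if j in adj[i] else inf) for j in range(n)]
         for i in range(n)]
    for k in range(n):  # floyd-warshall all-pairs shortest paths
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return sum(d[i][j] for i, j in itertools.combinations(range(n), 2))

def wiener_3d(coords):
    # 3d variant: sum of euclidean distances over all unordered atom pairs.
    return sum(math.dist(p, q) for p, q in itertools.combinations(coords, 2))
```

for the carbon skeleton of n-butane (a four-atom chain) the 2d index is 10, a standard textbook value; a collinear 3d geometry with unit bond lengths reproduces the same number, while real bent conformations give a smaller, conformation-sensitive value, which is the source of the extra discriminating power claimed above.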
the highest activity was produced by 1-(3-methylphenyl)-4-(β-2-quinolylethyl)piperazine, its logp value being near to the optimal value (4.52). log(bph) values calculated by eq. 1 agree well with the observed ones. source: toxicology 1989, 58(2), 197-210. compounds: 3,5-dimethoxyphenol, 4-chlorophenol, 2,6-dichlorophenol, 4-methyl-2-nitrophenol, 2,4-dichlorophenol, 2,4,6-trichlorophenol, 2,3,4,5-tetrachlorophenol, 2,4,6-triiodophenol, pentachlorophenol. biological material: chinese hamster ovary (cho) cells. data taken from the literature: ec20c; ec50c; ec20a; ec50a [concentration (mmol/l) of the compound leading to a 20 or 50 % inhibition of the cell growth or adenosine uptake, respectively]. data determined: ec20; ec50 [concentration (mmol/l) of the compound leading to a 20 or 50 % inhibition of the na+/k+-atpase activity, respectively]. chemical descriptors: logp (logarithm of the partition coefficient in 1-octanol/water); σ (hammett's constant, characterizing the electron-withdrawing power of the substituent); es (taft's constant, characterizing steric effects of the substituent); χ (molecular connectivity index, calculated by koch's method). results: highly significant linear relationships were calculated between log(ec20) and logp (r = -0.963), the relationship between log(ec50) and σ being less good (r = -0.767). combining the two parameters, the relationship improved (eq. 1). logp (logarithm of the partition coefficient in 1-octanol/water); π (hansch-fujita's substituent constant characterizing hydrophobicity); σ (hammett's constant, characterizing the electron-withdrawing power of the substituent); l (sterimol steric parameter, characterizing the steric effect of the meta substituents); (rplc derived hydrophobic substituent constant, defined by chen and horváth, and extrapolated to 0 % methanol); (indicator variable: 1 for the presence, 0 for the absence of hydrogen bonding substituents). results: logk' values were determined for the benzenesulfonamides and correlated with chemical descriptors. a highly significant linear relationship between logk' and logp was calculated (eq. 1). pka (negative logarithm of the acidic dissociation constant); logp (logarithm of the partition coefficient in 1-octanol/water). results: relationships between ki values and the chemical descriptors were investigated for cpz and its listed metabolites. a relationship between log(1/ki) and logp was calculated (eq. 1); no numerical intercept (c) is given. in spite of the complexity of the full mechanism of inhibition involving at least six transition states and five distinct intermediates, a significant linear regression equation was calculated for ki (eq. 3). since the crystal structure of the acyl-enzyme complex and the acylation and deacylation rates were available, it was concluded that the inhibition begins with the histidine-57 catalyzed attack of serine-195 oγ at the benzoxazinone c4, while the carbonyl oxygen occupies the oxyanion hole formed by glycine 194 and serine 195. title: antifolate and antibacterial activities of 5-substituted 2,4-diaminoquinazolines. authors: harris, n.v.; smith, c.; bowden, k. (rhone). results: it was shown earlier that binding of diaminoquinazolines to dhfr correlated with the torsional angle of the 4-amino group of the quinazoline nucleus.
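the two-parameter correlations reported throughout these abstracts (e.g. log(ec) regressed on logp and σ, with the quoted n, r, s and f statistics) are ordinary least-squares fits; a minimal sketch with synthetic data (the numbers below are illustrative, not the measured phenol values):

```python
import numpy as np

# illustrative data only: not the phenol measurements from the abstract.
logp = np.array([1.2, 2.0, 2.8, 3.5, 4.1])
sigma = np.array([0.00, 0.23, 0.37, 0.45, 0.78])
log_ec = -(0.9 * logp + 1.5 * sigma) + 0.3  # synthetic, exactly linear response

# design matrix with intercept: log(ec) = a*logp + b*sigma + c
x = np.column_stack([logp, sigma, np.ones_like(logp)])
coef, *_ = np.linalg.lstsq(x, log_ec, rcond=None)

pred = x @ coef
r = np.corrcoef(pred, log_ec)[0, 1]  # the correlation coefficient reported as "r"
```

because the synthetic response is exactly linear, the fit recovers the generating coefficients and r is 1; with real measurements r quantifies how much the second descriptor improves the one-parameter fit, as in the log(ec20)/logp versus combined equation above.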
it was postulated that the interaction between the adjacent 5-substituent and the 4-amino group was very important in determining dhfr binding of the compounds, possibly because of the influence on the hydrogen-bond formed between the 4-amino group and a residue at the active site. the existence of such an interaction in 5-substituted 2,4-diaminoquinazolines was shown by measuring the nh2 chemical shift values. the σi and σr electronic parameters correlated well with the chemical shifts of the 2-nh2 groups (eq. 1) but showed poor correlation for the 4-nh2 group (eq. 2), respectively. the equations suggest that the through-ring resonance interactions between the 5-substituent and the adjacent 4-amino group are disrupted by some other effects which might have significance for binding. compounds: a) an extensive set of compounds based on the nalidixic acid structure of type i, where r1, r3, r6 and r7 are various substituents; x6 and x8 = c, n (for nalidixic acid: x6 = c, x8 = n, r1 = et, r3 = cooh, r6 = h, r7 = me); b) subset of (i) (set a) containing fifty two 6,7-disubstituted 1-alkyl-1,4-dihydro-4-oxoquinoline-3-carboxylic acids; c) subset of (i) (set b) containing one hundred and sixty two carboxylic acids; d) subset of (i) (set c) containing eighty five 1,4-dihydro-4-oxo-1,8-naphthyridine-3-carboxylic acids with substituted azetidinyl, pyrrolidinyl and piperidinyl rings at position 7, fluorine at position 6 and ethyl, vinyl or 2-fluoroethyl substituent at position 1. biological material: ps. aeruginosa v-1, e. coli nihj jc-2, s. the study showed that the most active compounds have fluorine in position 6, r7 can be a wide variety of nitrogen containing substituents and the best predictor for r7 is its lipophilicity.
compounds: 17 phytoalexins: pisatin, 3,6a-dihydroxy-8,9-(methylenedioxy)pterocarpan, 6a,11a-dehydropisatin, 3-hydroxy-8,9-(methylenedioxy)-6a,11a-dehydropterocarpan, (±)-3-hydroxy-9-methoxypterocarpan, (+)-3-hydroxy-9-methoxypterocarpan, (-)-3-hydroxy-9-methoxypterocarpan, vestitol, sativan, formononetin, coumestrol, 4'-o-methylcoumestrol, phaseollin, phaseollinisoflavan, 2'-methoxyphaseollinisoflavan, glyceollin, 6a,11a-dehydroglyceollin, tuberosin, 6a,11a-dehydrotuberosin. data determined: k' (capacity factor determined by rp-hplc). logp was calculated from the k' values using six reference compounds (eq. 1): n = 6, r = 0.993, s not given, f not given. the lipophilicity of the phytoalexins was within the range of logp = 1.5-4.2. it was found that for similar compounds lipophilicity positively correlated with antifungal activity, but no equation could be calculated for the whole set of compounds. it was suggested, however, that compounds with logp values higher than 3.5 were retained in the membranes, therefore phytoalexins with slightly lower lipophilicity, as well as greater fungitoxicity and systemic activity, should be searched for. certain structural features seemed to correlate with antifungal activity, such as the presence of phenolic oh and benzylic hydrogen. it was suggested that the ability of the ortho oh group to form a fairly stable intramolecular hydrogen bond may contribute to the greater stability of the schiff base functional group and the higher biological activity of the substances (various subsets required different equations). results showed that compounds with increasing lipophilicity and electron donating substituents at the 3- and 5-positions have high inhibitory activity. 1-[(3'-allyl-2'-hydroxybenzylidene)amino]-3-hydroxyguanidine was found to be the most active compound. the use of parameter focusing of the substituent hydrophobic constant and electronic constants was suggested for the selection of further substituents to design effective compounds.
biological material: a) rabbits; b) rats; c) guinea pig. data taken from the literature: analogue results: prp, ec50h, ec50b, ec50t values were measured and presented for the c,, paf analogue and compared with those of other analogues. the c,, paf analogue was less potent than the c16 or c18 paf analogues and equivalent to the c,, paf analogue, showing that the activity decreased with lipophilicity. a highly significant parabolic relationship was calculated between log(rps) and cf (eq. 1); the maximum activity was calculated at cf = 6.78, which corresponds to the c16 paf. energy minimization of the compounds was calculated using the free valence geometry energy minimization method; molecular shape analysis according to hopfinger was used to quantitatively compare the shape similarity of analogs in their minimum energy conformer states (within 8 kcal/mol of their global minimum energy); fig. 1 shows the superposition of the reference conformations of the phenylalanine and tryptophan analogues. chemical descriptors: logp (logarithm of the partition coefficient in 1-octanol/water); π (hansch-fujita's substituent constant characterizing hydrophobicity of a substituent on the aromatic ring and the hydrophobicity of the aromatic ring itself, respectively); [common overlap steric volumes (å3) between pairs of superimposed molecules in a common low energy conformation]; [dipole moments (debyes) of the whole molecule and of the aromatic ring, respectively, calculated using the cndo/2 method]; quantum chemical indices (partial atomic charges calculated by the cndo/2 method); θ1-θ4 [torsion angles (deg) (fig. 1) rotated during the conformational analysis of the compounds]. results: significant parabolic regression equations were calculated for the antigelling activity of the phenylalanine and tryptophan analogues (eq. 1 and eq. 2, respectively). the different qsar for the phenylalanine and tryptophan analogues indicated that they interact with hemoglobin in different ways or at different sites. for the phenylalanine analogues the hydrophobicity of the side chain, the aromatic dipole moment and the steric overlap volume explained about 50 %, 20 % and 10 % of the variance in antigelling activity, respectively. for the tryptophan analogues the square of the dipole moment or the steric overlap volume explained 70 % or 60 % of the variance in ra, respectively, the two descriptors being highly correlated. the results show that the tryptophan analogs have a relatively tight fit with the receptor site. title: s-aryl (tetramethyl) isothiouronium salts as possible antimicrobial agents, iv. in both eq. 3 and eq. 4, log(1/c) depended primarily on electronic factors and only secondarily on hydrophobicity. a threshold logp value for the active isothiouronium salts was indicated, as the compounds with logp values between -0.70 and -1.58 were found to be totally inactive with the exception of the nitro-derivatives. title: comparative qsar study of the chitin synthesis inhibitory activity of benzoyl-ureas versus benzoyl-biurets. source: tagungsbericht 1989, no. 274. the principal components explained 69.61 %, 19.02 % and 9.30 % of the variance. fig. 1 shows the minimum energy conformation of a highly active representative of the urea analogs (dimilin) with a 5.6 å distance between the 1 and 15 carbon atoms. fig. 2 shows the low energy conformation of the corresponding biuret analog with the two benzene rings in approximately the same plane and with the same c1-c18 distance (5.6 å), allowing it to fit a hypothetical benzoylurea pharmacophore. the similarity of the regression equations and the modelling study supported the hypothesis that the benzoylbiurets act by the same mechanism as the benzoylureas.
biological material: 8 insect species: aedes aegypti, musca domestica, chilo suppressalis, hylemya platura, oncopeltus suppressalis, oncopeltus fasciatus, pieris brassicae, leptinotarsa decemlineata. lc50 [concentration of the benzoylurea derivative (various dimensions) required to kill 50 % of insect larvae (a. aegypti, m. domestica, c. suppressalis, h. platura, o. suppressalis, o. fasciatus, p. brassicae or l. decemlineata)]. data determined: lc50 [concentration of the biuret analogue (ppm) required to kill 50 % of insect larvae (a. aegypti or m. domestica)]; molecular modeling (models of the compounds were built using molidea); conformational analysis (minimum energy conformations of the compounds were calculated using the molecular mechanics method). chemical descriptors: the thesis is devoted to the quantitative analysis of the uncoupling activity of substituted phenols using chemical descriptors in order to obtain further information on the mode of action of phenol uncouplers: the study of the partition coefficient of substituted phenols in the liposome/water system [p(l/w)] showed that (i) p(l/w) depended primarily on the logp value; (ii) the influence of steric and electronic parameters depended on the type of the lipid involved; qsar analysis of uncoupling phenols in rat-liver mitochondria identified the relevant physicochemical parameters required for phenols to act as protonophores in the inner mitochondrial membrane and quantitatively separated the potency as the protonophore in the inner mitochondrial membrane and the incorporation factor (logp); the protonophoric potency of substituted phenols was linearly related to uncoupling activity when certain critical physicochemical parameters of the experiment were taken into account; a linear relationship was calculated between the uncoupling activities of substituted phenols and related uncouplers in the mitochondria from the flight muscles of house flies and in spinach chloroplasts; the results indicated a shuttle type mechanism for the
uncoupling action of substituted phenols. title: uncoupling properties of a chlorophenol series on acer cell suspensions. compounds: 22 chlorinated phenols: 2-cl, 3-cl, 2,4,5-cl, 2,4,6-cl, pentachlorophenol, 4-cl-2-me, 4-cl-3-me, 4-cl-2,3-me, 4-cl-3,5-me, 4-cl-2-allyl, 4-cl-2-pr-5-me, 4-cl, 2,3-cl, 2,4-cl, 2,5-cl, 2,6-cl, 3,4-cl, 3,5-cl, 2,3,6-cl, 2-cl-6-no2, 2,4-cl-6-no2, 2-cl-4,6-no2. biological material: acer pseudoplatanus l. cell suspensions. data determined: d50 [concentration of the compound (µmol/l) required for a 50 % uncoupling effect, registered by measuring the oxygen consumption rate by polarography]; [minimal concentration of the compound (µmol/l) required for giving a full uncoupling effect]. chemical descriptors: logp (logarithm of the partition coefficient in 1-octanol/water); mr (molar refractivity); ed (steric parameter representing the perimeter of coplanar molecules projected onto the aromatic plane); a (angular parameter expressing the hindrance in the neighborhood of the hydroxyl group in positions 2 and 6, respectively); σ1, σ2 (hammett's constants, characterizing the electron-withdrawing power of the para-substituent and the ortho- or 4-nitro substituents, respectively). results: highly significant linear regression equations (eq. 2) were calculated for the uncoupling effects of chlorophenols in acer cell suspensions; the equations for the uncoupling effects in the whole cells and those calculated previously for isolated mitochondria or chloroplasts possess similar structures. title: effects of 3' substituents on diphenyl ether compounds. results: sar suggested that the space for the n1 and n2 substituents in the ps ii binding site is relatively large. the variation of the number of the carbon atoms of r2 on the photosynthetic inhibitory activity is shown in fig. 1. π (hansch-fujita's substituent constant characterizing hydrophobicity); chemical descriptors: 0.32(±0.29) ior + 0.41(±0.43) hb + 5.62. the biological activities of three out of the 30 (dpe-16, 19 and 28) substituted diphenyl esters were measured and listed. igr values were measured for the three compounds and compared with those of a-23 and methoprene. it was found that the acetamido group in the phenol moiety, when it is in the ortho position, increases the lipophilicity of the compound, with a logp value of 2.54. if the same group is in the meta or para position, the logp values are 2.16 and 1.99, respectively, and these compounds are comparatively ineffective. when both ortho positions are substituted with tertiary butyl groups (dpe-28) the logp value is relatively higher (3.30), which increases the lipophilicity of the compound and explains the pronounced idr activity at relatively low concentrations. results: a highly significant linear regression equation was calculated for the descriptors of r1 (r1 = i-pro was eliminated as an outlier) (eq. 1): the compound with r1 = eto, r2 = me and z = o was found to be an effective, broad spectrum insecticide. the replacement of the quaternary carbon with a silicon atom can simplify the synthesis of test compounds and thus can be advantageously utilized for the preparation of large compound sets for qsar studies. the data suggest that the initial electron loss from the given compounds is the preeminent factor affecting the reaction rate. a single mechanism is suggested over the entire range of reactivities, where a transition state with a considerable positive charge is involved. title: connection models of structure and activity: ii. estimation of electron-acceptor and electron-donor functions of active centers in the molecules of physiologically active materials. research institute of physiologically active materials, chernogolovka, moscow district, ussr (engl. summary).
chemical descriptors: logp (logarithm of hydrophobicity). results: calculations of electron-acceptor and electron-donor enthalpic and free energy factors on the basis of functional groups were made according to the principle of independence of active centers. data determined: a linear correlation was found between the calculated and measured characteristics; the accuracy of the fitting was the same as the measurement error of δhm and δgm. the entropy might be calculated from the enthalpy, gibbs energy and temperature. the good linear correlations between the measured and calculated data show that the functional group approaches might be used for these compound types. the substituent effects for the σ-acceptor/π-donor substituents (f, cl, br, i) were found to be very much larger for the c6f5r compounds relative to the nitrobenzenes. these results indicate that the extra electron enters a σ*-orbital, which is localized on the c-r atoms. for the structure-solubility relationship of aliphatic alcohols, the study indicated that the solubility of aliphatic alcohols depends primarily on the molecular connectivity (¹χ), the number of carbon atoms in the alkyl chain (nc), the number of hydrogens on the α-carbon atom (normal, iso, secondary, ternary) and the degree of branching (sg) (eq. 1; n, r, s and f not given). eq. 1 was found to be a highly significant predictor of s (eq. 2): -log(s) = 1.13 ¹χ + (1.13)² sg - 2.5075. the results support kier's, and furthermore kier and hall's, earlier models on the structural dependence of the water solubility of alcohols. title: linear free energy relationships for peroxy radical-phenol reactions. influence of the para-substituent, the ortho-di-tert-butyl groups and the peroxy radical. k [reaction rate constant (m⁻¹s⁻¹) of the reaction between cumyl-, 1-phenylethyl- and t-butyl-peroxy radicals and ortho-, para-substituted phenol inhibitors]. data taken from the literature: chemical descriptors: σ+,
σi, σr (charton's electronic substituent constant and its decomposition into inductive and resonance components, respectively, for the characterization of the para substituent); (indicator variable: 1 for the presence, 0 for the absence of the t-bu groups in the 2,6-position of the phenols). results: highly significant linear regression equations were calculated by stepwise regression analysis for logk, in spite of the diverse data set originating from different laboratories using different peroxy radicals (eq. 1, eq. 2; for eq. 2: n = 32, r = 0.848, s = 0.432, f = 37.2). the indicator variable was not selected by stepwise regression, indicating that the ortho-di-t-bu substitution had no significant effect on the rate of hydrogen abstraction from phenols by the radicals. the form of the equations for different subsets of the phenols and radicals indicated that the reaction mechanism was the same for the different peroxy radicals. title: a fractal study of aliphatic compounds. a quantitative structure-property correlation through topological indices and bulk parameters. the following descriptors are considered as 'bulk parameters': vw (van der waals volume, calculated from the van der waals radii of the atoms); mw (molecular weight); sd (steric density of the functional group). results: highly significant equations are presented for calculating vw, sd and mw, with r values ranging from 0.92 to 1.00; other statistics and the number of investigations are not given. the q and θ values calculated by these equations were introduced into the equation given above and the physicochemical properties were calculated. the observed and calculated log v, d and p values are presented and compared for the alkanes, alcohols, acids and nitriles. the observed and calculated physicochemical parameters agreed well. the fractal nature of the alkyl chain length was discussed and a relationship was presented between the fractal-dimensioned alkyl chain length and a generalized topological index.
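the molecular connectivity indices that recur in these qspr abstracts (the ¹χ used for the alcohol solubility model above, and the higher-order variants below) all derive from the randić first-order index; a minimal sketch for the first-order case on a hydrogen-suppressed graph (atom numbering is illustrative):

```python
import math

def randic_index(adj):
    # first-order molecular connectivity index: sum over bonds (i, j) of
    # 1/sqrt(deg(i) * deg(j)) in the hydrogen-suppressed graph.
    # adj maps each atom to the set of its bonded neighbours.
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    seen = set()
    total = 0.0
    for i, nbrs in adj.items():
        for j in nbrs:
            if (j, i) not in seen:  # count each bond once
                seen.add((i, j))
                total += 1.0 / math.sqrt(deg[i] * deg[j])
    return total
```

for the carbon skeleton of n-butane (degrees 1-2-2-1) this gives 2/√2 + 1/2 ≈ 1.914, the standard tabulated value; valence and higher-order path/cluster variants replace the simple degree with valence deltas and sum over longer subgraphs.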
title: application of micellar liquid chromatography to modeling of organic compounds by quantitative structure-activity relationships. chemical descriptors: logp (logarithm of the partition coefficient in 1-octanol/water). results: in a series of experiments with the listed compounds, micellar liquid chromatography has been applied to model the hydrophobicity of organic compounds in a biological system. the measured logk' values of the substituted benzenes were found to be superior predictors of logp. fig. 1 shows the plot of logp versus logk' of the substituted benzenes. highly significant correlations were calculated for the logk' values of phenols (open squares) (n = 6, r = 0.985), for the rest of the compounds (full squares) (n = 16, r = 0.990) and for the entire set (n = 22, r = 0.922). further experiments using various surfactant types in the mobile phase suggested that logk' values generated on a lamellar phase may be better predictors of hydrophilicity than logp obtained from binary solvent systems. title: isoxazolinyldioxepins. 2. the partitioning characteristics and the complexing ability of some oxazolinyldioxepin diastereoisomers. source: j. chem. soc. perkin trans. ii 1989, no. 11, 1935-1937. compounds: 10 oxazolinyldioxepin derivatives of types i and ii, where x = h, f, cl, cf3 or ch3. data determined: logk' [logarithm of the capacity factor, measured by reversed-phase liquid chromatography (rplc)]; mep (molecular electrostatic potential, computed by giessner-prettre and pullman's vsspot procedure). chemical descriptor: logp (logarithm of the partition coefficient in 1-octanol/water).
Results: the logk' and logP values were measured for the two types of diastereomers and a highly significant linear relationship between logk' and logP was presented (r = 0.995). The MEPs of the F-derivatives of I and II were determined and presented ("a" for type I, "b" for type II). The complex-forming ability of the diastereoisomers with mono-cations was investigated and explained in terms of the structures and electronic properties of the compounds. Results: linear relationships are presented plotting y versus n for the 9 hydrophobic sorbents (Fig. 1), and the slopes of these straight lines are suggested for the experimental determination of the φ0 values. The φ0 values determined by the suggested method are listed. While no linear relationship was found between Kd and n, y depends linearly on n for the test compounds [alkanols (1), alkane diols (2)]. Results: three linear models were fitted with independent variables of logP, MR and σx. The best fitting parameters (independent of composition) were obtained from the following models (no statistical characteristics are presented): lg(k) = a0 + a1·P' + a2·logP + a3·P'·logP (1); lg(k) = a0 + a1·tg(cm) + a2·logP + a3·tg(cm)·logP (2). The two types of correlations (with structural and with mobile-phase parameters) together might be used for the optimization of the chromatographic separation of complex mixtures of sulphur-containing substances. (Zero-order molecular bonding type connectivity index.) The Kd values derived by the suggested method were compared with Kd values calculated by Martin's rule and good agreement was found. Title: Mathematical description of the chromatographic behaviour of isosorbide esters separated by thin layer chromatography. Compounds: 9 isosorbide esters: isosorbide (1), 1-5-monoacetate, 1-2-monoacetate, 1-5-mononitrate, 1-2-mononitrate, 1-diacetate, 1-5-nitro-2-acetate, 1-2-nitro-5-acetate, 1-dinitrate.
RM, RM' [retention factors obtained by thin-layer chromatography in benzene/ethyl acetate/isopropanol (7:3:1.5) and in dichloromethane/diisopropyl ether/isopropanol (20:4:2:1) eluent systems, respectively]. Chemical descriptors: (information index, based on the distribution of the elements in the topological distance matrix); (its geometrical analogue); (Randić connectivity index); (maximum geometric distance in the molecule). Compounds: 46 highly diverse chemicals, grouped according to the following properties: contains (ester or amide or anhydride) or (heterocyclic N) or (O bound to C) or (unbranched alkyl group with more than 4 carbons). Data determined: AERUD (aerobic ultimate degradation in receiving waters). Chemical descriptors: ²χv (valence second-order molecular connectivity index); ⁴χpc (fourth-order path/cluster connectivity index); nCl (number of covalently bound chlorine atoms); Mw (molecular weight). Results: the paper aimed at developing a model for predicting AERUD. The data sets were collected from 22 biodegradation experts, who estimated the biodegradation time that might be required for AERUD on time scales of days, weeks, months and longer. The 46 chemicals, highly diverse but typical of wastewater treatment systems, were examined. Zero- to sixth-order molecular and cluster connectivity indices were calculated using computer programs written in FORTRAN for an IBM PC/XT. The best fitted linear regression model is given. [First-order rate constant: transport or transformation parameter (mol/(Pa·h))]. Results: the QWASI fugacity model describes the fate of a (contaminating) chemical, such as organo-chlorine compounds, pesticides or metals. The lake model consists of water, bottom and suspended sediments, and air. The model includes the following processes: advective flow, volatilization, sediment deposition, resuspension and burial, sediment-water diffusion, wet and dry atmospheric deposition, and degrading reactions.
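The molecular connectivity indices invoked in several of these abstracts are simple graph invariants. As an illustration, the first-order Randić index ¹χ sums 1/√(deg(i)·deg(j)) over the bonds of the hydrogen-suppressed graph; a sketch in Python rather than the FORTRAN of the original programs, with two assumed example molecules:

```python
import math

def randic_index(edges):
    """First-order Randic connectivity index (1-chi) of a hydrogen-suppressed
    molecular graph given as a list of (i, j) bonds: the sum over bonds of
    1/sqrt(deg(i) * deg(j))."""
    deg = {}
    for i, j in edges:
        deg[i] = deg.get(i, 0) + 1
        deg[j] = deg.get(j, 0) + 1
    return sum(1.0 / math.sqrt(deg[i] * deg[j]) for i, j in edges)

# n-butane, C1-C2-C3-C4: 1-chi = 2/sqrt(1*2) + 1/sqrt(2*2)
butane = [(1, 2), (2, 3), (3, 4)]
# isobutane, a central carbon bonded to three methyls: 1-chi = 3/sqrt(1*3)
isobutane = [(1, 2), (1, 3), (1, 4)]
```

Branching lowers the index (isobutane < n-butane), which is why such indices correlate with bulk properties like boiling point and degradability.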
The steady-state solution of the model is illustrated by application to PCBs in Lake Ontario, using the equilibrium criterion of fugacity as the variable controlling the environmental fate of the chemical. The applications are based upon inaccurate data. Use of fugacity is inappropriate for involatile chemicals, such as metals or ionic species, because fugacities are calculated from a basis of vapor-phase concentrations. For these materials the use of the equilibrium concentration activity is more appropriate, since activities are calculated from a water-phase base. Thus, a new equilibrium criterion, termed the "aquivalent" concentration (or equivalent aqueous concentration), is suggested as preferable. This concentration has the advantage of being applicable in all phases, such as water, air and sediments. The formalism developed in the QWASI approach can also be applied, making possible a ready comparison of the relative rates (and thus the importance) of diverse environmental fate processes. All this is illustrated by applying the model on a steady-state basis to the PCB example and to the fate of lead in Lake Ontario. The estimated and observed concentrations of PCBs and lead in Lake Ontario agree well: the largest difference, in the case of PCBs in rain, amounts to a factor of three. In other phases, and especially in the case of lead, the difference is usually less than 30 per cent. Although, in order to judge the biological effects of a contaminant, it is of fundamental importance to know its transport and transformations, and the present model has been proven useful to describe this, direct biological implications are not deduced at the present stage. The similar slopes of the equations show that these compounds exert their cytotoxicity primarily by alkylation.
While the majority of the tested compounds showed no hypoxia-selective cytotoxicity (ratio ≈ 1.0), the 4-NO2 and 3-NO2 substituted compounds were more toxic to UV4 cells under hypoxic conditions (ratio = 3.2 for the compound with R = 4-NO2), indicating cellular reduction of the nitro group. The measured hypoxic selectivity of the 3-NO2 and 4-NO2 substituted compounds was a fraction of the calculated ratio (measured 220-fold and calculated 3500-fold by Eq. 2 between the 4-NO2 and 4-NH2 substituted compounds). The main reason for the difference between the calculated and measured hypoxic selectivity is suggested to be the low reduction potential of the 4-NO2 and 3-NO2 groups (E = −500 mV and E = −470 mV, respectively). Title: Quantitative structure-activity relationships for cytotoxicity. Chemical descriptors: σ (Hammett's constant, characterizing the electron-withdrawing power of the substituent); σ⁻ (Hammett's polar electronic constant, characterizing the electron-withdrawing power of the substituent for anilines). Results: significant linear regression equations were calculated for the half-life (t1/2), growth inhibition (I50) and clonogenicity data (CT10) using Hammett constants (Eq. 1, Eq. 2, Eq. 3): (1) n = 11, r = 0.96, s = 0.24, F not given. The analyses are grouped by test type, test animals, the mean level of toxicity and the form of the equation; e.g. analysis of the toxicity of phenols showed a transition from simple dependence on logP to exclusive dependence on reactivity factors, indicating two separate classes of phenol toxicity (Eq. 1 for mouse i.p. toxicity, and Eq. 2 for rat oral toxicity). Results: an additivity model, pLC50 = Σ ni·ati + t0, where ni is the number of the ith substituent in a benzene derivative, ati is the toxicity contribution of the ith substituent and t0 is the toxicity of the parent compound (benzene), was used for predicting the toxicity of 10 compounds. A similar correlation was found between mutagenicity and σ (Fig.
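The additivity model above is just a weighted sum over substituent counts. A minimal sketch; the substituent contributions and the benzene baseline below are entirely hypothetical placeholders, not the fitted values from the abstract:

```python
def plc50_additive(substituent_counts, contributions, t0):
    """Additivity model: pLC50 = sum_i n_i * a_i + t0, where n_i counts the
    i-th substituent in a benzene derivative, a_i is its toxicity
    contribution and t0 is the toxicity of the parent compound (benzene)."""
    return t0 + sum(n * contributions[s] for s, n in substituent_counts.items())

# Hypothetical contributions chosen purely for illustration:
a = {"Cl": 0.6, "CH3": 0.3, "NO2": 0.9}
t0_benzene = 3.4
# A hypothetical dichlorobenzene would then get 3.4 + 2 * 0.6 = 4.6
pred = plc50_additive({"Cl": 2}, a, t0_benzene)
```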
2), indicating that both the biochemical and the chemical processes involve a pH-dependent nucleophilic ring opening (including the protonation of the aziridine nitrogen as the rate-controlling step) and are influenced by electronic and steric factors (equation not given). (Resonance effect.) (Electron density on N1 in the HOMO, calculated by MNDO.) Results: highly significant linear relationships between log(1/C) and logP, εHOMO (Eq. 1); logP, qHOMO (Eq. 2) are presented, indicating that the more hydrophobic and more electron-rich triazines are more active according to the Ames test. Substructures [a total of 32355 fragments were generated from the 189 compounds using the program CASE (computer-automated structure evaluation)]. Results: a comparative classification of the compounds was performed using CASE to identify molecular fragments associated with carcinogenic activity (biophores) as well as deactivating fragments (biophobes). CASE identified 21 biophores and 2 biophobes from the 32355 fragments of the 189 compounds, with a less than 12.5% probability of being associated with carcinogenicity by chance. The sensitivity and specificity of the analysis were unexpectedly high: 1.00 and 0.86, respectively. The predictive power of CASE was tested using the identified biophores and biophobes on a group of chemicals not present in the data base. The ability of CASE to correctly predict carcinogens and presumed non-carcinogens was found to be very good. It was suggested that non-genotoxic carcinogens may act by a broader mechanism rather than being chemical specific. Compounds: 9 compounds of type I where R = H, CH3, C2H5, C3H7, C3H7, C4H9, C4H9, C3H5, C4H7, SC4H7; 3 compounds of type II where R = H, C4H8OH, SC4H8OH. Data determined: Pi(a) (a priori probability of appearance of the ith active compound); Pi(n) (a priori probability of appearance of the ith nonactive compound).
(The first-order molecular connectivity index); (the second-order molecular connectivity index); (information-theoretic index on graph distances, calculated from the Wiener index according to Gutman and Platt); chemical descriptor: V (rank of smell, where the rank is defined to equal one for the most active compound). Results: the authors' previously proposed structure-activity relationship approach was applied to structure-odor relationships. The 13 different compounds of groups I and II were examined using the topological indices W, R, I, χ as independent variables and V as the dependent variable. The best correlation was obtained between R and V (Fig. 1). Results: logP and logPr values were determined for the nitroimidazole derivatives. Significant linear equations were calculated, the best one relating logP and logPr (Eq. 1): (1) logP = 1.14 logPr + 0.37, n = 9, r = 0.92, s not given, F not given. Chemical descriptors: logP (logarithm of the partition coefficient in 1-octanol/water); (12 indicator variables taking the value of 1 for the presence of α/β-hydroxy, 6α-fluoro, 6α-methyl, 9α-fluoro, 17-hydroxy, 16α-fluoro, 16,17-acetonide, 21-deoxy, 21-acetate, 21-propionate, 21-butyrate or 21-isobutyrate, respectively). Results: a data set of 43 steroids was compiled after removing those containing unique substituents. The set was divided into two categories of approximately equal membership by defining a threshold logP value of 1.45. A descriptor set was created and the non-significant descriptors were eliminated using the weight-sign change feature selection technique. A linear learning machine was applied to calculate the weight vectors, and complete convergence was achieved in the training procedure. The predictive ability of the linear pattern classifier thus obtained was tested using the leave-one-out procedure and found to be 81.4%.
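The linear learning machine is the classic error-correction (perceptron-style) threshold classifier, and leave-one-out validation refits it once per left-out compound. A sketch on synthetic two-class data; the update rule and thresholds are the textbook formulation under stated assumptions, not the paper's exact procedure:

```python
import numpy as np

def train_linear_machine(X, y, epochs=200):
    """Error-correction training of a linear threshold unit (the classic
    'linear learning machine'): the weight vector w is nudged whenever a
    pattern (label +1/-1) is misclassified, until convergence."""
    A = np.column_stack([X, np.ones(len(y))])     # append a bias term
    w = np.zeros(A.shape[1])
    for _ in range(epochs):
        errors = 0
        for a, t in zip(A, y):
            if np.sign(a @ w) != t:
                w += t * a                         # move the boundary toward the pattern
                errors += 1
        if errors == 0:
            break                                  # complete convergence
    return w

def loo_accuracy(X, y):
    """Leave-one-out predictive ability: retrain without each sample in turn
    and score the prediction for the held-out sample."""
    hits = 0
    for k in range(len(y)):
        mask = np.arange(len(y)) != k
        w = train_linear_machine(X[mask], y[mask])
        hits += int(np.sign(np.append(X[k], 1.0) @ w) == y[k])
    return hits / len(y)

# Two well-separated synthetic clusters standing in for the two logP classes.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 1.0, (15, 2)), rng.normal(2.0, 1.0, (15, 2))])
y = np.array([-1] * 15 + [1] * 15)
acc = loo_accuracy(X, y)
```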
The predictive ability of the approach was found to be good, and improvement was expected with a larger data set. Steroids containing new substituents, however, would have to be subjected to a repeated pattern-recognition calculation. [Length/breadth descriptors (2 descriptors).] Results: for modeling the shape of the compounds, SIMCA was used; the approach was to generate disjoint principal-component models of clustered points in a multidimensional space. The number of clusters for each structure was determined using hierarchical cluster analysis. Fig. 1 shows the orthogonal views of a schematic representation of the SIMCA models for the atom clusters in senecionine. Each compound in turn was used as a reference structure, and every other structure was superimposed on the reference using the ends of the corresponding binding-moment vector plus the ring nitrogen atom. Canonical correlation analysis was used for calculating the correlation between the five biological activity data and the shape descriptors of the 21 structures. The best correlation was observed for Jurs' shadow descriptors; the MSA and SIMCA descriptors were comparable. The model was able to express both the amount and the direction of the shape differences, and also to encode relevant information for correlation with the biological activity. Compounds: 6 N-substituted 3-methyl-4-nitropyrazole-5-carboxamides (II), 6 N-substituted 4-amino-3-methylpyrazole-5-carboxamides (III), 14 N-substituted 3-methyl-4-diazopyrazole-5-carboxamides and N-piperidinyl-N-(1,3-dimethyl-4-nitrosopyrazol-5-yl)-urea (VII). Title: Structure-activity correlations for psychotomimetics. 1. Phenylalkylamines: electronic, volume, and hydrophobicity parameters.
Data determined: ED50 [dose of the compound (mg/kg) which causes 50% of the rats trained on 1 mg/kg of the reference compound to respond as they would to the training drug]; conformational analysis (geometries of the compounds were calculated using MM2 from starting geometries determined by the program EUCLID). Discriminant analysis resulted in a function containing six variables which misclassified only one compound in the training set. When the data were repeatedly split randomly into a training and a test set, the misclassification rate was 9% (15 out of 161 classifications). Fig. 2 shows the plot of the two canonical variates from the discriminant analysis, visualizing the separation of hallucinogenic and nonhallucinogenic derivatives (the meaning of the symbols is the same as in Fig. 1). Multiple regression analysis (MRA) was found to be the most useful for identifying relevant and discarding redundant variables. Highly significant parabolic regression equations were calculated for the human activity data (A) (n = 50, r ranging from 0.9004 to 0.9563, F not given) and for the animal data (ED50) (n = 16, r = 0.8679 and r = 0.9825, F not given). Eight descriptors were found to be highly significant. Among these, the importance of directional hydrophobicity and volume effects indicated that steric and hydrophobic interactions participate in the interaction with the receptor. MRA indicated a strong interaction between the meta- and para-substituents and the formation of a charge-transfer complex by accepting charge. The data did not support the hypothesis relating the human activity data and the animal data. Data taken from the literature: sweet(N) (sweet taste of the compound, where N represents the number of times a sample has to be diluted to match the taste of a 3% sucrose solution); [class fit distances of a compound to the sweet and nonsweet classes (dimension not given), calculated by principal component analysis].
Chemical descriptors: MR (molar refractivity); B1, L (Sterimol steric parameters, characterizing the steric effect of the substituent); π (Hansch-Fujita's substituent constant, characterizing hydrophobicity); σm, σp (Hammett's constants, characterizing the electron-withdrawing power of the substituent in the meta- and para-position, respectively). Results: no statistically significant regression equation was obtained by the Hansch-Fujita approach using the chemical descriptors listed. Principal component analysis of the data set extracted 2 principal components, explaining 64% of the variance of the sweet compounds. The sweet compounds clustered in a relatively confined region of the 15-dimensional space, whereas the tasteless and bitter compounds were scattered around the sweet compounds. A Coomans plot, however, indicated, when plotting D1 versus D2, that sweet and nonsweet compounds could be well separated along the D1 axis (Fig. 1). Title: Conformation of cyclopeptides. Factor analysis: a convenient tool for simplifying conformational studies of condensed poly-ring systems. Prolyl-type cyclopeptides. Conformations of the six-membered DOP-ring family may be reproduced by means of a superposition of the canonical twist (T), boat (B) and chair (C) forms. Physically, the coefficients have the meaning of relative contributions (amplitudes) of the T, B and C forms to the total conformation of the ring. Here factor analysis (FA) and principal component analysis were used in conformational studies of 30 various X-ray conformers of DOP/PYR. A correspondence was found between the factors identified and RPT when the rings are considered separately. This fact allows a physical interpretation of the FA results: two or three puckering variables were found for the DOP and PYR rings, expressing the absolute amplitudes of the basic pucker modes.
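Expressing an observed ring conformation as a superposition of canonical twist, boat and chair forms, with the coefficients as relative contributions, amounts to a least-squares decomposition over basis vectors of endocyclic torsion angles. A sketch; the idealized torsion-angle vectors below are assumptions for illustration, not the paper's canonical forms:

```python
import numpy as np

def pucker_coefficients(conformer, basis):
    """Least-squares coefficients expressing an observed ring conformation
    (a vector of endocyclic torsion angles, deg) as a superposition of
    canonical basis forms such as twist, boat and chair."""
    B = np.column_stack(basis)                    # one column per canonical form
    coef, *_ = np.linalg.lstsq(B, conformer, rcond=None)
    return coef

# Idealized (assumed) torsion-angle patterns for a six-membered ring:
chair = np.array([60.0, -60.0, 60.0, -60.0, 60.0, -60.0])
boat  = np.array([0.0, 60.0, -60.0, 0.0, 60.0, -60.0])
twist = np.array([30.0, -60.0, 30.0, 30.0, -60.0, 30.0])

# A synthetic conformer built as 10% T + 20% B + 70% C is recovered exactly.
conf = 0.1 * twist + 0.2 * boat + 0.7 * chair
c = pucker_coefficients(conf, [twist, boat, chair])
```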
Subsequent FA treatment of the condensed system revealed five conformational variables necessary and sufficient to describe the two-ring puckering completely. Each of the basic pucker modes defines a unique pattern of conformational variation of the whole two-ring system. The results demonstrate that FA is a powerful technique for analysing condensed poly-ring systems not amenable to the RPT treatment. Title: Preprocessing, variable selection, and classification rules in the application of SIMCA pattern recognition to mass spectral data. Authors: Dunn*, W.J.; Emery, S.L.; Glen, G.W.; Scott, D.R. College of Pharmacy, The University of Illinois at Chicago, 833 South Wood, Chicago IL 60612, USA. Source: Environ. Sci. Technol. 1989, 23(12), 1499-1505. Compounds: a diverse set of 121 compounds observed in ambient air, classified as (1) nonhalogenated benzenes; (2) chlorine-containing compounds; (3) bromo- and bromochloro compounds; (4) aliphatic hydrocarbons; (5) miscellaneous oxygen-containing hydrocarbon-like compounds (aliphatic alcohols, aldehydes and ketones). Pattern recognition was applied to autocorrelation-transformed mass spectra of the compounds [providing chemical class assignment for an unknown]; m/z1, m/z2, m/z3 (first three principal component scores of SIMCA). Results: the SIMCA pattern recognition method was applied to a training set of 78 toxic compounds targeted for routine monitoring in ambient air. The analysis resulted in very good classification and identification of the compounds (87% and 84%, respectively). However, the training procedure proved to be inadequate, as a number of hydrocarbons from field samples (GC/MS analysis) were incorrectly classified as chlorocarbons.
A new approach to the preprocessing (scaling the MS data by taking the square root of the intensities, followed by an autocorrelation transform), to variable selection (only the 16 most intense ions in the MS spectrum were taken), and to the classification rules of SIMCA has been introduced to improve the results on real data. As a result of the revised rules, the classification performance was greatly improved for field data (97-94%) (Fig. 1). Title: A QSAR model for the estimation of carcinogenicity. It was suggested that the mechanism of mutagenicity of the benz[a]pyrene derivatives measured in the Ames test is probably more complex than the simple reactivity of carbocation intermediates. [Dipole interaction potential (dimension not given)]; [molecular electrostatic potential (kcal/mol) in a plane]; [molecular electrostatic field map (kcal/mol), mapping the |E(r)| values of a molecule surface in a plane, predicting the directions and energies of the interactions with small polar molecules at distances greater than the van der Waals sphere]; (construction of surfaces corresponding to a given value of the potential); 3D MEP and MEF maps (3D maps were generated by superimposing the equipotential curves corresponding to a value of 20 kcal/mol in the case of MEP, and 1.5 kcal/mol in the case of MEF, computed in several planes perpendicular to the mean plane of the analogues in low-energy conformations, stacked over each other at 1 Å distance). Results: the three vasopressin analogues differ significantly in their biological activities. Both the MEP and MEF maps of the biologically active (Mpa1)-AVP and (Cpp1)-AVP are similar, but they are different from those of the inactive (Ths1)-AVP (Fig. 1, Fig. 2). A new method for calculating the points of the equipotential curves was also presented. Crystal structure (crystal coordinates of the molecules were determined by X-ray diffraction methods).
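The preprocessing chain described for SIMCA, keeping the most intense ions, square-root scaling the intensities, then applying an autocorrelation transform, can be sketched as below. The peak count matches the abstract's 16; the lag range and normalization are assumptions, a sketch of the idea rather than the authors' code:

```python
import numpy as np

def preprocess_spectrum(intensities, n_peaks=16, max_lag=20):
    """Mass-spectrum preprocessing in the spirit described: keep only the
    n_peaks most intense ions, take square roots of their intensities, then
    apply an autocorrelation transform (lags 1..max_lag, normalized by the
    lag-0 value) so the features become m/z-shift comparable."""
    x = np.asarray(intensities, dtype=float)
    keep = np.argsort(x)[-n_peaks:]               # indices of the strongest peaks
    y = np.zeros_like(x)
    y[keep] = np.sqrt(x[keep])                    # square-root intensity scaling
    c0 = float(y @ y)                             # lag-0 autocorrelation
    return np.array([(y[:-k] @ y[k:]) / c0 for k in range(1, max_lag + 1)])

# A random synthetic "spectrum" of 120 m/z channels for demonstration.
rng = np.random.default_rng(2)
spec = rng.uniform(0.0, 100.0, 120)
feat = preprocess_spectrum(spec)
```

Because the features are normalized autocorrelations of a non-negative vector, every value lies in [0, 1], which keeps the subsequent principal-component modeling well scaled.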
Data taken from the literature: [electrostatic molecular potentials (eV) were calculated using AM1-type semiempirical MO calculations]; conformational analysis [minimum-energy conformations were calculated using X-ray structures as input geometries, followed by the AM1 method (Fletcher-Powell algorithm)]. Chemical descriptors: u1, u2 [rotational angles of the N-allyl group (deg)]. Results: four similar energy minima were located by AM1 calculations for both NAMH+ and NLPH+. The energy minima for the protonated NAM+ and NLPH+ were the most populated ones, with conformational enantiomers relative to the involved N-allyl-piperidine moiety (37% and 44%, respectively). It was shown that the isopotential curve localization of the EMP contour maps was very similar for the corresponding conformations of both NLPH+ and NAMH+, indicating that both molecules should interact at the same anionic sites of the opioid receptor (μ morphine receptor). Fig. 2 and Fig. 3 show the EMP contour maps of NAMH+ and NLPH+, respectively, in their preferred conformations. Compounds: esfenvalerate (SS and SR isomers) of type I, 3-phenoxybenzyl 2-(4-ethoxyphenyl)-3,3,3-trifluoropropyl ether (R isomer) (II), α-cyano-3-phenoxybenzyl 2-(4-chlorophenyl)-2-methylpropionate (S isomer) (III) and deltamethrin (IV). Data determined: conformational analysis (minimum-energy conformations of the compounds in vacuum were calculated using the AM1 molecular orbital method and the Broyden-Fletcher-Goldfarb-Shanno method integrated into MOPAC); RMS (root mean square, indicating the goodness of fit between two conformers in 3D); logP (logarithm of the partition coefficient in 1-octanol/water, estimated using the CLOGP program); [heat of formation of the most stable conformer (kcal/mol)]. Results: it was assumed that the 3D positions of the benzene rings of the pyrethroids are decisive for good insecticidal activity.
The lower-energy conformers of (I) (SS and SR isomers), (II) (R isomer), (III) (S isomer) and deltamethrin (IV) were compared by superimposition. In spite of their opposite configuration, esfenvalerate (I) (SS isomer) and the new-type pyrethroid II (R isomer) were reasonably superimposed, indicating that the positions of the benzene rings in space are important and the bonds between them are not directly determinant (Fig. 1). Crystal structure (X-ray crystal coordinates of penicillopepsin were obtained from the Protein Data Bank). Data determined: electrostatic potential [the electrostatic potential of the protein atoms (kcal/mol) is calculated using the partial charges in the AMBER united-atom force field]; docking (the DOCK program was used to find molecules that have a good geometric fit to the receptor). Results: a second-generation computer-assisted drug design method has been developed, utilizing a rapid and automatic algorithm for locating sterically reasonable orientations of small molecules in a receptor site of known 3D structure. It also includes a scoring scheme ranking the orientations by how well the compounds fit the receptor site. In the first step a large database (the Cambridge Crystallographic Database) is searched for small molecules with shapes complementary to the receptor structure. The second step is a docking procedure investigating the electrostatic and hydrogen-bonding properties of the receptor, displayed by the MIDAS graphics package. The steps of the design procedure are given. The algorithm includes a simple scoring function approximating a soft van der Waals potential, summing up the interactions between the receptor and ligand atoms. Directional hydrogen bonding is localized using the electrostatic potential of the receptor at contact points with the substrate. The shape search of (I) was described in detail.
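A scoring function of the soft van der Waals type mentioned above can be sketched as a pairwise 12-6 sum over receptor-ligand atom pairs. The well depth and optimal distance below are illustrative placeholders, not DOCK's actual parameters:

```python
import numpy as np

def soft_vdw_score(lig_xyz, rec_xyz, r0=3.4, eps=0.1):
    """Sum a Lennard-Jones-style 12-6 term over all receptor-ligand atom
    pairs: eps*((r0/d)^12 - 2*(r0/d)^6), which has its minimum -eps at
    d = r0 and rises steeply (penalizing clashes) at short range.
    r0 (Angstrom) and eps (kcal/mol) are placeholder parameters."""
    # pairwise distances between every ligand atom and every receptor atom
    d = np.linalg.norm(lig_xyz[:, None, :] - rec_xyz[None, :, :], axis=-1)
    s = r0 / d
    return float(np.sum(eps * (s**12 - 2.0 * s**6)))

# One ligand atom sitting exactly at the optimal distance from one
# receptor atom scores -eps; pushed into a clash, the score turns positive.
lig = np.array([[0.0, 0.0, 0.0]])
rec_ok = np.array([[3.4, 0.0, 0.0]])
rec_clash = np.array([[1.0, 0.0, 0.0]])
```

Lower (more negative) scores rank an orientation as a better steric fit, which is the role the scoring scheme plays in ranking docked orientations.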
A new method has been developed for the construction of a hypothetical active site lattice (HASL) and the estimation of the binding of potential inhibitors to this site. The molecules were quantitatively compared to one another through the use of their HASL representations. After repeated fitting of one molecule lattice to another, they were merged to form a composite lattice reflecting the spatial and atomic requirements of all the molecules simultaneously. The total pKi value of an inhibitor was divided into additive values among its lattice points, presumed to account for the binding of every part of the molecule. Using an iterative method, a self-consistent mathematical model was produced, distributing the partial pKi values of the training set in a predictive manner in the lattice. The HASL model could be used to quantitatively and predictively model enzyme-inhibitor interactions. A lattice resolution of 2-3 Å was found to be optimal. A learning set of 37 E. coli DHFR inhibitors was chosen to test the predictive power of the HASL model at various resolutions. Binding predictions (pKi values) were calculated for the entire inhibitor set at each resolution and plotted separately for the learning and test set members at 2.8 Å resolution (Fig. 1). Data determined: molecular models (3D structures of the molecules have been constructed and displayed using the program GEOM, communicating with the Cambridge X-ray data bank, the Brookhaven Protein Data Bank, the Sandoz X-ray data bank, SYBYL and DISMAN).
Distance geometry [nuclear Overhauser enhancements (NOE) and spin-spin coupling constants were measured by 2D NMR methods, semiempirically calibrated as proton-proton distance (Å) and dihedral angle (deg) constraints, and used in distance geometry calculations (DISMAN) and/or in restrained molecular dynamics calculations to determine the 3D structures of the molecules in solution]; crystal structure (atomic coordinates of the compounds were determined by X-ray crystallography); RMS [root mean square deviation (Å) of the corresponding atoms of two superimposed molecular structures]. Results: distance geometry calculations were carried out using GEOM and DISMAN to identify all conformations of the compounds in solution which were consistent with the experimental data obtained by NOE measurements. The application of GEOM was demonstrated by modelling cyclosporin A with and without a limited set of H-bond constraints and with a full NMR data set. In the case of cyclosporin A, 100 randomly generated linear analogues of the cyclic structure were formed from the monomers. Geometric cyclization was achieved using DISMAN, resulting in many different but stereochemically correct conformations of cyclosporin A. Superposition of the backbones of the 10 best cyclic conformers showed RMS deviations between 1.8 Å and 3.1 Å. Fig. 1 shows the superposition of a DISMAN-generated ring conformation (thick line) with the X-ray structure of cyclosporin A (thin line) with H-bond constraints (RMS = 1.25 Å). 37 distance and 4 dihedral-angle constraints were extracted from NOE and vicinal coupling data and used to generate the conformation and the cyclization conditions of the hexapeptide (Fig. 2) (positions of residual distance violations and their directions are shown by arrows). Although the described method is not exhaustive, it explores a much greater variety of initial structures than had previously been possible. Title: A new model parameter set for β-lactams.
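The backbone RMS deviations quoted above presuppose an optimal rigid-body superposition of the two conformers. A compact sketch of that computation via the Kabsch algorithm, tested here on synthetic coordinates:

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMS deviation (same units as the coordinates) between two conformers
    P and Q (n x 3 arrays of matched atoms) after optimal rigid-body
    superposition: center both, find the best rotation by SVD of the
    covariance matrix (Kabsch algorithm), then measure the residual."""
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    V, S, Wt = np.linalg.svd(P.T @ Q)
    d = np.sign(np.linalg.det(V @ Wt))             # guard against an improper rotation
    R = V @ np.diag([1.0, 1.0, d]) @ Wt            # rotation mapping Q onto P
    diff = P - Q @ R.T
    return float(np.sqrt((diff ** 2).sum() / len(P)))

# A rotated and translated copy of a structure superimposes exactly (RMS ~ 0).
rng = np.random.default_rng(3)
P = rng.normal(size=(10, 3))
theta = 0.7
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0,            0.0,           1.0]])
Q = P @ Rz.T + np.array([1.0, 2.0, 3.0])
```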
Authors: Durkin, K.A.; Sherrod, M.J.; Liotta*, D. Department of Chemistry, Emory University, Atlanta GA 30322, USA. Source: J. Org. Chem. 1989, 54(25), 5839-5841. Compounds: 22 β-lactam antibiotics of diverse structure. Data taken from the literature: crystal structures (crystal coordinates of the β-lactams were determined using the X-ray diffraction method). Results: superposition of the X-ray structures and the geometries of the β-lactams calculated using the original parameter set of the MM2 force field in MODEL gave satisfactory RMS values. However, a lack of planarity of the β-lactam ring and significant differences in the calculated bond lengths and angles around the β-lactam nitrogen were detected [X = O, S, SO, SO2]. In order to improve the fit, a new atom type with new parameters has been developed for the β-lactam nitrogen (atom type 60, coded with symbol 22 in MODEL). The new parameters were evaluated by comparison of the calculated and X-ray geometries of the 22 β-lactams. Using the new parameter set, the X-ray data were satisfactorily reproduced, except for the sulfone β-lactams. It was indicated that the AMPAC data were not suitable for the sulfones, as hypervalent sulfur compounds are not well described by the AM1 Hamiltonian. An additional parameter set was, however, derived, giving good structural data unrelated to the AMPAC information. It is not known which of the new parameter sets is best for the sulfone β-lactams. Title: A molecular modelling study of the interaction of compounds with noradrenaline. Biological material: a) cDNA of the hamster lung β2-adrenergic receptor and β-adrenergic receptor kinase; b) bacterio-, ovine- and bovine-rhodopsin and rhodopsin kinase.
Protein primary sequence (the amino acid sequence of the hamster lung β2-adrenergic receptor has been deduced by cloning the gene and the cDNA of the hamster lung β2-adrenergic receptor); (the COSMIC molecular modeling program was used for modeling α-helices in a hydrophobic environment, using φ and ψ torsion angles of −59° and −44°, respectively, according to Blundell et al.); (the two highest-lying occupied and the two lowest-lying unoccupied orbitals, respectively, calculated using the INDO molecular orbital method); crystal structure (crystal coordinates of noradrenaline have been determined by X-ray diffractometry); conformational analysis (the minimum-energy conformation of the β2-adrenergic receptor model has been calculated using the molecular mechanics method). Results: strong experimental evidence suggested that rhodopsin and the β2-adrenergic receptor have similar secondary structures. Thus it was assumed that, similarly to the bacterio-, ovine- and bovine-rhodopsins, the β2-adrenergic receptor possesses a structure consisting of seven α-helices traversing the cell membrane. Fig. 1 shows the postulated arrangements of the α-helices of rhodopsin and the β2-receptor. Using the experimental data, a model of the β2-adrenergic receptor has been generated for the study of its interaction with noradrenaline, and a possible binding site was created. Successful docking indicated that the HOMO and LUMO orbitals contribute to the binding in a charge-transfer interaction between Trp-109 and noradrenaline. A hydrogen bond was detected between the threonine residue of the model receptor and the noradrenaline side-chain hydroxyl, explaining why chirality was found to be important for the activity of adrenergic substances. Title: Three-dimensional steric molecular modeling of the 5-HT3 pharmacophore. [Binding affinity (nM) of the compounds to the 5-HT3 receptor.]
Data determined: molecular modeling (3D molecular models of each of the 19 compounds were made using the CAMSEQ/M molecular modeling system); [distance (Å) from the center of the aromatic ring to the ring-embedded nitrogen, when the nitrogen is placed in the same plane as the aromatic ring]. Results: in order to derive rules for the 5-HT3 pharmacophore, a molecular-graphics-based analysis was made using six core structures. The structures were aligned so as to overlay the aromatic rings and to place the ring-embedded nitrogen atom in the same plane as the aromatic ring. Nine steric rules common to all 19 potent 5-HT3 agents were derived from the analysis. Fig. 1 shows the 3D representation of the six overlaid 5-HT3 core structures using CAMSEQ/M. The 5-HT3 inactivity of atropine could be explained because its steric properties differed from those of the active ICS 205-930 by only a single atom, and it failed to meet two of the nine hypothetical criteria. UV-visible spectra [spectrophotometric studies of mixtures of the dyes and nicotine in 10% (v/v) aqueous ethanol at 29 °C]. Results: cyanine dyes demonstrate a multitude of biological activities which may be due to the interference of the adsorbed dye molecule with active sites of the living cell. It was shown by UV and visible spectrophotometry that the hydroxystyryl cyanine dyes and nicotine formed 1:1 charge-transfer complexes. The absorption band of the complex formed between nicotine and dye was detected at wavelengths longer than those of the individual pure substances at concentrations identical to those in the mixture. Fig.
1 shows that the two partially positive centres of the dye (2-(2-hydroxystyryl)-pyridinium-1-ethyliodide) were located at a distance similar to that between the two nitrogen atoms of the pyridine and pyrrolidinyl moieties of nicotine, allowing the suggested 1:1 parallel stacking interaction between the two molecules. molecular modeling (200 conformations were calculated using a distance geometry algorithm and energy minimized by a modified mm2 force field in moledit). results: all conformers within 5 kcal/mol of the lowest energy conformer were superposed on the x-ray structure of mk-329. crystal structure (crystal coordinates of the proteins were determined using the x-ray diffraction method). data taken from the literature: (… was less than 1 Å); (probability that a given tetrapeptide sequence is superimposable on the ribonuclease a structure); [probability that the ith residue (amino acid) will occur in the jth conformational state of the tetrapeptide which is superimposable on ribonuclease a]. results: it was suggested that the five tetrapeptides were essential components of larger peptides and might be responsible for their biological activity (binding to the cd4 receptor). earlier it was hypothesized that the critical tetrapeptide, located in a segment of ribonuclease a, would assume low energy conformations (residues 22–25, a β-bend, having a segment homologous to the sequence of peptide t). low energy conformers of the tetrapeptides could be superimposed on the native structure of segment 22–25 of ribonuclease a. the figure shows the superimposition of peptide t (full square). many low energy conformers could be calculated for the tetrapeptides except for the polio sequence. the p values for most tetrapeptides were 5–10 times higher than the value of the less active polio sequence. the results supported the hypothesis that the active peptide t adopts the native ribonuclease β-bend. title: potential cardiotonics. 4.
synthesis, cardiovascular activity, molecule and crystal structure of 5-phenyl- and 5-(pyrid-4-yl)… data determined: [dose of the compound (mol/kg) required for a 30 % increase of the heart beat frequency of guinea pig or dog heart]; [dose of the compound (mol/kg) required for a 10 % decrease of the systolic or diastolic blood pressure of the dog]; crystal structure (atomic coordinates of the compounds were determined by x-ray diffraction); molecular modeling (molecule models were built using molpac); mep [molecular electrostatic potential (mep) (dimension not given) was calculated using cndo/2]. results: milrinon and its oxygen-containing bioisoster possess highly similar crystal structures and mep isopotential maps (fig. 1 and fig. 2). both compounds show strong positive inotropic and vasodilatory activity. it was suggested that the negative potential region around the thiocarbonyl group, like that around the carbonyl group in milrinon, imitates the negative potential field around the phosphate group of cAMP. title: molecular mechanics calculations of cyclosporin a analogues. effect of chirality and degree of substitution on the side chain conformations of (2s,3r,4r,6e)-3-hydroxy-4-methyl-2-(methylamino)-6-octenoic acid and related derivatives. [the solution conformation of csa in cdcl3 has been elucidated via molecular dynamics simulation incorporating 58 distance constraints obtained from ir spectroscopy and nuclear overhauser effect (noe) data]; (conformational analysis was performed using the search subroutine within sybyl); energy minimization (low energy conformers were calculated using molecular mechanics within macromodel ver. 1.5, applying an all-atom version of the amber force field). results: a total of 12 conformations of csa have been identified within 4 kcal/mol of the minimum energy conformer. population analysis showed that one conformer dominates in solution. fig.
1 shows the superposition of the peptide backbone of the crystal and solution structures of csa (the crystal structure is drawn with a thick line and the solution structure with a thin line). it was shown that the boltzmann distribution between active and inactive conformers correlated with the order of the immunosuppressive activity. a common bioactive conformer serving as a standard for further design has been proposed for csa and its analogs. abstr. 320 quant. struct.-act. relat. 9, 234–293 (1990). data determined: molecular modeling (models of (i), (ii) and (iii) were built using sybyl based on x-ray coordinates of the compounds); conformational analysis (minimum energy conformations of the compounds were calculated using the search option of sybyl and the mndo method); [interaction energy of the molecules (kcal/mol) with a hypothetical receptor probe (negatively charged oxygen atom) calculated by grid]. results: the specific receptor area of the sodium channel was modeled with a negatively charged oxygen probe (carboxyl group), interacting with the positively charged (protonated) ligand. fig. 1 shows areas for energetically favorable interaction (areas i and ii). biological material: a) aspergillus ochraceus; b) carboxypeptidase a. data determined: kobs [first-order rate coefficient (10⁻⁶/sec) of the hydrochloric acid hydrolysis of ochratoxin a and b]; x-ray crystallography (coordinates of the crystal structures of ochratoxin a and b were obtained using x-ray diffraction); (models of ochratoxin a and b were built using alchemy); [¹³c nmr chemical shifts (ppm) of the amide and ester carbonyls of the ochratoxins]. chemical descriptors: pka (negative logarithm of the acidic dissociation constant). results: a reversal of the hydrolysis rate between ochratoxin a and b was observed comparing the hydrolysis rates obtained in vitro (carboxypeptidase a) and in vivo (hydrochloric acid).
the difference in hydrolysis rates cannot be due to conformation, since the two toxins have the same conformation both in the crystal and in solution. fig. 1 shows the fit of ochratoxin a and b based on superimposing the phenolic carbon atoms. it is suggested that the relatively large steric bulk of the chlorine atom hinders the fit between ochratoxin a and the receptor site of carboxypeptidase a. thus, slower metabolism is probably the reason why ochratoxin a is more toxic than ochratoxin b. title: inhibitors of cholesterol biosynthesis. 1. trans-6-(2-pyrrol… results: charge distribution studies showed that compactin had two distinct regions of relatively large partial charges corresponding to the pyrrole ring and the isobutyric acid side chain. experiments for more closely mimicking the polar regions associated with the high activity of compactin indicated that the potency of the new compounds was relatively insensitive to the polarity of the r′ group. it was also suggested that an electron-deficient pyrrole ring was required for high potency. title: synthesis and biological activity of new hmg-coa reductase inhibitors. 1. lactones of pyridine- and pyrimidine-substituted 3,5-dihydroxy-6-heptenoic(-heptanoic) acids. chemical descriptors: results: an attempt was made to correlate electrophysiological activity with the effect of the position of the aryl group on the conformation of the side chain using molecular modeling. the study suggested that the compounds with class iii activity prefer a gauche (a in fig. 1) and the compounds with class i activity prefer a trans relationship of the nitrogens (b in fig. 1). the study indicated that the point of attachment of the aryl moiety had an effect on the side chain conformation, which appeared to be a controlling factor of the electrophysiological profile of these compounds. title: a molecular mechanics analysis of molecular recognition by cyclodextrin mimics of α-chymotrypsin.
biological material: chymotrypsin. data taken from the literature: crystal structure (crystal coordinates of the macrocycles determined using x-ray diffraction analysis). data determined: molecular modeling, structure superposition (models of β-cd and of β-cd capped in chains by n-methylformamide and n-dimethylformamide substituents were built using the amber program, and the coordinates for building the n-methylformamide substituent were calculated using mndo in the mopac program); (energy minimization of the molecules was calculated in vacuo using a molecular mechanics program with the amber force field); (the energy minimized structures of β-cd and capped β-cd were separately fit to the x-ray structure of the β-cd complex); [the molecular electrostatic potential (kcal/mol) of β-cd and capped β-cd was approximated by the coulombic interaction between a positive point charge and the static charge distribution of the molecule, modeled by the potential-derived atomic point charges at the nuclei, and visualized as a 2d mep map]. results: β-cd and capped β-cd were analyzed as biomimetic models of the active site of chymotrypsin. capped β-cd was shown to be the more effective biomimetic catalyst. capping also altered certain structural features of molecular recognition: the orientation of the secondary hydroxyls was altered due to twisting of some of the glucose units. a secondary hydroxyl oxygen mimics ser-195 of chymotrypsin in initiating the acyl transfer event through nucleophilic attack on the substrate. fig. 1 shows the energy minimized structures of β-cd (a) and capped β-cd (b) (fragment numbers are given in parentheses). the mep maps of β-cd and capped β-cd showed that the qualitative features of the electrostatic recognition were practically the same in the two mimics.
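the mep maps described here reduce to a coulombic sum over atomic point charges, evaluated by a unit positive probe. a minimal python sketch of that sum, with invented charges and coordinates for a toy three-atom fragment (the actual β-cd models used potential-derived charges, which are not reproduced here):

```python
from math import sqrt

# hypothetical point charges (units of e) and coordinates (Å) for a toy
# three-atom fragment; the abstract's models used potential-derived charges
# at the nuclei, which these numbers do not reproduce.
atoms = [(-0.40, (0.00, 0.00, 0.0)),
         ( 0.20, (0.96, 0.00, 0.0)),
         ( 0.20, (-0.24, 0.93, 0.0))]

COULOMB_KCAL = 332.06  # e^2/Å expressed in kcal/mol

def mep_at(x, y, z):
    """coulombic mep (kcal/mol) felt by a unit positive probe at (x, y, z)."""
    total = 0.0
    for q, (ax, ay, az) in atoms:
        r = sqrt((x - ax) ** 2 + (y - ay) ** 2 + (z - az) ** 2)
        total += q / r
    return COULOMB_KCAL * total

# sample the potential on a plane 1 Å above the fragment, mimicking a 2d mep map
grid = [[mep_at(-2 + i, -2 + j, 1.0) for i in range(5)] for j in range(5)]
```

sampling on a plane offset from the molecular plane avoids singularities at the nuclei and gives the kind of 2d map the abstract describes.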
biological material: four monocotyledonous (johnson grass, yellow foxtail, barnyard grass, yellow millet) and four dicotyledonous weed species (velvetleaf, morning glory, prickly sida, sicklepod). data determined: [pre-emergence and post-emergence herbicidal activities of the compounds were measured and rated using a scale ranging from 0 (no activity) to 9 (complete kill)]; tscf [measure of the compound's ability (dimension not given) to translocate upwards in plants through xylem vessels]; kd [soil sorption coefficient calculated by the formula kd = cs/cw, where cs is the concentration of the compound (µg compound/g soil) and cw is the concentration of the compound (µg compound/ml) in water solution in equilibrium with the soil]; molecular modeling (models of the compounds were built using the maccs and prxbld programs); conformational analysis (minimum energy conformations of the compounds were calculated using the mm2 molecular mechanics method); (molecules were visualized using the program mogli on an evans and sutherland picture system); electronic structure [total energies, orbital eigenvalues, atomic charges and dipole moments of simple model analogs of type i were calculated at the prddo (partial retention of diatomic overlap) level of approximation]. chemical descriptors: logp (logarithm of the partition coefficient in 1-octanol/water). results: conformational analyses and high level quantum mechanical calculations of the conformational preferences showed that the compounds with r = 4-cl and 5-cl substituents adopt a coplanar structure stabilized by an intramolecular hydrogen bond, whereas the 3-cl analogue does not (fig.
1): higher logp values (0.6–1.0 logarithmic unit difference) and higher kd and tscf values of the 4-cl and 5-cl substituted compounds relative to the 3-cl analog were interpreted as the result of the intramolecular hydrogen bond, and were consistent with the observation that the 4-cl and 5-cl analogs were active as post-emergence but not pre-emergence herbicides while the 3-cl derivative was active in both modes. title: application of molecular modeling techniques to pheromones of the marine brown algae cutleria multifida and ectocarpus siliculosus (phaeophyceae). metalloproteins as chemoreceptors? (geometrical models of the compounds were constructed using information from the cambridge structural database (csd) and calculated using molecular mechanics methods in sybyl); (minimum energy conformations of the compounds were calculated using the molecular mechanics method within sybyl). chemical descriptors: kfc/w [partition coefficient in fc72/water (fc72 = fluorocarbon solvent)]. results: as both ectocarpene (i) and multifidene (ii) trigger mutual cross reactions between male gametes of ectocarpus siliculosus and cutleria multifida, it was supposed that a common mode of binding should exist for the two structurally different pheromones. the active analogue approach was applied to model the pheromone receptor by superposing the minimum energy conformations of active structural analogues (iii, iv, v, vi) on ectocarpene and multifidene. the common active conformation of (i) and (ii) was extracted by systematic superimposition of the analogues. to explain the function of the double bonds in the pheromones, the presence of a receptor-bound metal cation was assumed. simultaneous optimization of both structures without and with a receptor-bound metal cation resulted in virtually the same conformations. fig. 1 shows the mapping of multifidene onto ectocarpene in their biologically relevant conformations.
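the soil sorption coefficient defined in the herbicide abstract above (kd = cs/cw) and the logp differences it reports are both simple ratios; a small python sketch with invented concentrations, for illustration only:

```python
def soil_sorption_kd(c_soil, c_water):
    """kd = cs/cw, per the abstract's formula: cs in µg compound per g soil,
    cw in µg compound per ml of water solution in equilibrium with the soil."""
    return c_soil / c_water

# hypothetical equilibrium concentrations, for illustration only
kd = soil_sorption_kd(12.0, 4.0)  # -> 3.0

# a 0.6–1.0 log-unit logp difference corresponds to roughly a 4- to 10-fold
# difference in the 1-octanol/water partition ratio
ratio_low, ratio_high = 10 ** 0.6, 10 ** 1.0
```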
title: critical differences in the binding of aryl phosphate and carbamate inhibitors of acetylcholinesterases. conformational analysis [minimum energy conformations of (asn-ala-asn-pro)9 were calculated using the charmm (chemistry at harvard macromolecular mechanics), amber (assisted model building with energy refinement) and ecepp (empirical conformational energy program for peptides) potential energy functions]; [root mean square deviation (Å) of the positions of the corresponding atoms of two superimposed molecular structures]. results: 24 low energy conformations of (asn-ala-asn-pro)9 have been determined using charmm, ecepp and amber in order to determine their final conformations and relative energies. the final conformations were compared by calculating the rms values of their cα atoms and by matching the parameters of the energy minimized (asn-ala-asn-pro)9 peptide to those of the ideal helix or coiled coil. the similarity of the final conformations obtained by using any two different potentials starting from the same conformation varied from satisfactory to highly unacceptable. the extent of difference between any pair of final conformations generated by two different potential energy functions was not significantly different. the lowest-energy conformation calculated by each of the energy potentials for any starting conformation was a left-handed helix, and pair-wise superposition of the cα atoms in the final conformations showed small rms values (1.0–1.3 Å). it was suggested that the native conformation of (asn-ala-asn-pro)9 in the cs protein may be a left-handed helix, since all three potential energy functions generated such a conformation. source: proteins 1989, 6(2), 193–209. biological material: crambin.
data determined: phi-psi probability plot (probabilities of the occurrence of phi-psi dihedral angle pairs for each amino acid were determined and plotted using the data of approximately 100 proteins from the brookhaven protein data bank); (an optimization technique for reproducing the folding process, converging to the native minimum energy structure by dynamically sampling many different conformations of the simplified protein backbone). chemical descriptors: phi-psi values [dihedral angles (deg) defined by the bonds on either side of the α-carbon atom of the amino acid residue in a protein]. results: a simplified model has been developed for the representation of protein structures. protein folding was simulated assuming a freely rotating rigid chain where the effect of each side chain is approximated by a single atom. phi-psi probabilities were used to determine the potentials representing the attraction or repulsion between the different amino acid residues. many characteristics of native proteins have been successfully reproduced by the model: (i) the optimization was started from protein models with random conformations and led to protein models with secondary structural features (α-helices and β-strands) similar in nature and location to those of the native protein; (ii) the formation of secondary structure was found to be sequence specific, influenced by long-range interactions; (iii) the association of certain pairs of cysteine residues was preferred compared to other cysteine pairs depending on folding; (iv) the empirical potentials obtained from phi-psi probabilities led to the formation of a hydrophobic core of the model peptide. χ [dihedral angle (deg)]. results: four kinds of monte carlo simulations of about 20,000 steps of the conformations of crambin were carried out by using the second derivative matrix of energy functions (starting from native and unfolded conformations, both in two kinds of systems, in vacuo and in solution). fig.
1 shows the native (a) and the unfolded (b) conformation of crambin. starting from the native conformation, the differences between the mean properties of the simulated crambin conformations obtained from in vacuo and solution calculations were not very large. the fluctuations around the mean conformation during simulation were smaller in solution than in vacuo, however. simulation starting from the unfolded conformation resulted in a more intensive fluctuation of the structure in solution than in vacuo, indicating the importance of the hydration energy term in the model. the conformations generated in the simulations starting from the native conformation deviate slightly from the x-ray conformation (rms = 0.70 Å and 1.10 Å for in vacuo and solution simulations, respectively). the results indicate that the simulations of the protein with hydration energy are more realistic than the simulations without hydration energy. results: earlier studies overestimated the catalytic rate decrease of the hypothetical d102a point mutant of thrombin (a 20 orders of magnitude decrease was calculated instead of the 4 orders of magnitude measured). the source of error was an overestimation of v and the neglect of the effects of the surrounding water molecules and induced dipoles. to compensate for these errors, a scale factor of 0.12 was introduced into the calculations. as a result of the rescaling, a one order of magnitude increase of kcat for the d121 mutant and a two orders of magnitude decrease of kcat of the k41 mutant of ribonuclease a were predicted. it was shown that the effect of the mutations on the catalytic rate depended almost entirely on steric factors. it was suggested that in mutants of serine proteases where the buried asp is replaced by ala or asn, the kcat value will decrease by 4–6 orders of magnitude. title: high-resolution structure of an hiv zinc fingerlike domain via a new nmr-based distance geometry approach.
authors: summers*, m.f.; south, t.l.; kim. [root mean square deviation (Å) of the corresponding atoms of two superimposed molecular structures]. results: the atomic resolution structure of an hiv zinc fingerlike domain has been generated by a new nmr-based dg method using 2d noesy back-calculation. the quality of the structures thus obtained was evaluated on the basis of consistency with the experimental data (comparison of measured and back-calculated nmr spectra) rather than by comparing them to structural information from other sources (e.g. x-ray data). the method provided a quantitative measure of consistency between experimental and calculated data, which allowed for the use of tighter interproton distance constraints. the folding of the c(1)-f(2)-n(3)-c(4)-g(5)-k(6) residues was found to be virtually identical with the folding of the related residues in the x-ray structure of the iron domain of rubredoxin (rms values 0.46 and 0.35 Å). the backbone folding of the peptide was found to be significantly different from that of the "classical" dna-binding zn-finger. fig. 1 shows the wire frame model of all the backbone atoms and certain side chain atoms of the peptide (dg structure) (dashed lines indicate hydrogen atoms). the inhibitor binds to the active site of the protease dimer in an extended conformation with extensive van der waals and hydrogen bonding and is more than 80 % excluded from contact with the surrounding water (fig. 1, where the inhibitor is shown in thicker lines and the hydrogen bonds in dashed lines). data determined: Δg°obs, Δg°calc [standard free energy (cal/mol) of transfer of a molecule from an apolar phase to an aqueous phase, observed or calculated by eq. 1: Δg°calc = Σ σi ai, where σi is the atomic solvation parameter of atomic group i and ai is the accessible surface area of atom i].
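eq. 1 above is just a weighted sum of accessible surface areas; a minimal python sketch with invented σi and ai values (the atomic solvation parameters actually fitted in the paper are not reproduced here):

```python
# hypothetical atomic solvation parameters σ_i (cal/mol/Å²) and accessible
# surface areas a_i (Å²); the paper's fitted asp values differ.
asp = {"C": 16.0, "N/O": -6.0, "S": 21.0}
surface = [("C", 30.0), ("C", 12.5), ("N/O", 20.0), ("S", 5.0)]

def transfer_free_energy(surface, asp):
    """eq. 1: ΔG°t = Σ σ_i · a_i, in cal/mol, summed over exposed atoms."""
    return sum(asp[kind] * area for kind, area in surface)
```

a positive σi (apolar carbon) makes burial favorable, a negative σi (polar n/o) makes exposure favorable, which is how the asps partition hydrophobic energy in folding and binding.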
results: atomic solvation parameters (asps) characterizing the free energy change per unit area for the transfer of a chemical group from the protein interior to aqueous surroundings were determined. Δg°obs and Δg°calc were determined and compared, and a highly significant linear relationship is presented (fig. 1; one letter symbols indicate amino acid side chains). the binding of the inhibitor induced substantial movement in the enzyme around residues 77 to 82 in both subunits, at places exceeding 1 Å. the structure of glutamine synthetase is discussed. it was established that hydrophobic interactions are important for the intersubunit interactions, and the hydrophobic interactions between the two rings of subunits are stronger than those between the subunits within a ring. the c-terminal helix contributes strongly to the inter-ring hydrophobic interaction. asps are suggested for estimating the contribution of the hydrophobic energy to protein folding, subunit assembly and the binding of small molecules to proteins. title: determination of the complete three-dimensional structure of the trypsin inhibitor from squash seeds in aqueous solution by nuclear magnetic resonance and a combination of distance geometry and dynamical simulated annealing. authors: holak*, t.a.; gondol, d.; otlewski, j.; wilusz, t. max-planck-institut für biochemie, d-8033 martinsried bei münchen, federal republic of germany. title: interpretation of protein folding and binding with atomic solvation parameters. crystal structure (atomic coordinates of cmti-i were determined by the x-ray diffraction method). results: in order to obtain information on the 3d structure of the free cmti-i in solution, a total of 34 inhibitor structures were calculated by a combination of distance geometry and dynamical simulated annealing methods, resulting in well defined 3d positions for the backbone and side-chain atoms. fig.
1 shows the superposition of the backbone (n, cα, c, o) atoms of the structures best fitted to residues 2 to 29 (binding loop). the average rms difference between the individual structures and the minimized mean structure was 0.35 (±0.08) Å for the backbone atoms and 0.89 (±0.17) Å for all heavy atoms. title: electron transport in sulfate reducing bacteria. molecular modeling and nmr studies of the rubredoxin-tetraheme-cytochrome-c3 complex. biological material: a) sulfate reducing bacterium (desulfovibrio vulgaris); b) rubredoxin (iron-sulfur protein); c) tetraheme cytochrome c3 from d. vulgaris; e) flavodoxin. differences were detected in the segments from residues 16 to 18 and 25–25. fig. 1 shows the best superposition (residues 2 to 29) of the nmr and crystal structures of cmti-i, indicating the backbone c, cα, n, o, as well as the disulfide cβ and s atoms. it was demonstrated that uncertainty in nmr structure determination can be eliminated by including stereospecific assignments and precise distance constraints in the definition of the structure. crystal structure (coordinates of the crystal structures of the compounds were determined by x-ray crystallography). results: the speed of the homolysis of the organometallic bond is 10… times higher in the apoenzyme-bound coenzyme b12 than in homogeneous solution. structural changes occurring during the co-c bond homolysis of coenzyme b12, leading from cobalt(iii) corrin to cobalt(ii) corrin, were investigated. fig. 1 shows the superposition of the structures of the cobalt corrin part of b12 (dotted line) and of cob(ii)alamin (solid line). biological material: apoenzyme binding the coenzyme b12. data determined: … fig. 1 shows that the crystal structures of b12 and cob(ii)alamin are strikingly similar, and offers no explanation for the mechanism of the protein-induced activation of homolysis.
it was suggested that the co-c bond may be labilized by the apoenzyme itself, in addition to a substrate-induced separation of the homolysis fragments (which might be supported by a strong binding of the separated fragments to the protein). ¹h nmr (complete stereospecific assignments were carried out and proton-proton distance constraints were determined by the analysis of dqf-cosy, hohaha and noesy spectra); nh, αh, oh [chemical shifts (ppm) of the proton resonances of human et]; 3d structure (the 3d structure of et was calculated using the distance geometry program dadas based upon the noesy proton-proton distance constraints determined by nmr spectroscopy); [root mean square distance (Å) between 5 et conformers calculated by distance geometry (dadas)]. results: the solution conformation of et has been determined by the combined use of 2d ¹h nmr spectroscopy and distance geometry calculations. five structures of et have been calculated from different initial conformations. the superposition of the backbone atoms of the calculated structures is shown in fig. 1. the average rms value in the core region for the main-chain atoms was 0.46 Å. the lack of specific interactions between the core and tail portions of et, and a characteristic helix-like conformation in the region from lys… to cys…, were shown. literature data indicated that neither the et1–15 nor the et16–21 truncated derivatives of et showed constricting or receptor binding activity, suggesting that the et receptor recognizes an active conformation consisting of both the tail and core portions. the present study, however, suggested that the receptor-bound conformation of et is probably different from that in solution because of the lack of interaction between tail and core. the hydrophobic nature of the tail suggested the importance of a hydrophobic interaction with the receptor. compounds: triphenylmethylphosphonium cation (tpmp+).
biological material: nicotinic acetylcholine receptor (achr), a prototype of the type i membrane receptor proteins, from the electric tissue of torpedo and electrophorus. results: a computer model of the achr ion channel has been proposed. fig. 1 shows the side view of the ion channel model with five pore-forming m2-helices and the channel blocking photoaffinity label (tpmp+) represented by the dotted sphere. it was supported by electron microscopy and by electrophysiological and biochemical experiments that the m2-helices were formed by homologous amino acid sequences containing negatively charged amino acid side chains which act as the selectivity filter. the amino acid side chains may undergo conformational changes during the permeation of the cation. the predicted transmembrane folding of the four transmembrane α-helices of type i receptors is shown in fig. 2. energy profile calculations indicate that other transmembrane sequences of the receptor protein besides m2 may affect the ion channel. source: cabios 1989, 5(3), 219–226. results: an interactive computer program tefoojj2 has been developed for drug design on ibm/pc and compatible computers.
the program contains the following modules and performs the following calculations: a) series design for selecting an optimal starting set of compounds using a modified version of austel's method; b) regression analysis calculating the hansch equation; c) a hansch searching method using the equation calculated by the regression analysis routine, or an input equation, for the identification of the 10 most active compounds; d) geometrical searching methods for finding the optimum substituents in the parameter space using the sphere, ellipse, quadratic or polyhedric cross algorithms with or without directionality factors; e) space contraction for reducing the dimension of the parameter space by eliminating non-significant parameters; f) an example is given for the lead optimization of an aliphatic lead compound, correctly predicting n-pentane to be the optimum substituent. results: a new expert system, sparc, is being developed at epa and at the university of georgia to develop quantitative structure-activity relationships for broad compound classes: a) classical qsar approaches predict therapeutic response, environmental fate or toxicity from structure/property descriptors quantifying hydrophobicity, topological descriptors, electronic descriptors and steric effects; b) sparc (sparc performs automated reasoning in chemistry), an expert system written in prolog, models chemistry at the level of physical organic chemistry in terms of the mechanisms of interaction that contribute to the phenomena of interest; c) sparc uses algorithms based on fundamental chemical structure theory to estimate parameters such as acid dissociation constants (pka values), hydrolysis rate constants, uv, visible and ir absorption spectra, and other properties; d) the information required to predict input data for broad classes of compounds is dispersed throughout the entire ir spectrum and can be extracted using fourier transforms; e) the accuracy of the sparc algorithm was demonstrated on the
close match of calculated and experimental pka values of 20 carboxylic acid derivatives, near the noise level of the measurement. results: a new stand-alone molecular simulation program, nmrgraf, integrating molecular modeling and nmr techniques, has been introduced by biodesign inc. a) nmrgraf is a molecular modeling program utilizing force fields which incorporate empirical properties such as bond lengths and angles, dihedral, inversion and nonbonded interactions, electrostatic charges and van der waals interactions; b) the molecular structural properties are combined with nuclear overhauser effect (noe) and j-coupling nmr data (experimental interproton distance constraints and bond angle data); c) the nmr proton-proton distance data are accurate only at relatively short distances (5 to 10 Å), which restricts the use of nmr noe approaches to the analysis of molecules with known x-ray structure; d) the combination of nmr and molecular modeling approaches, however, makes it possible to model virtually any molecule even if its structure does not exist in the databases. title: electronic structure calculations on workstation computers. results: the main features of the program system turbomole for large-scale calculations of scf molecular electronic structure on workstation computers are described: a) the program system allows for scf level treatments of energy, first- and second-order derivatives with respect to nuclear coordinates, and an evaluation of the mp2
correlation energy approximation; b) the most important modules of turbomole are (i) dscf, performing closed and open shell rhf calculations; (ii) egrad, used for analytical scf gradient evaluations; (iii) kora, calculating the direct two-electron integral transformation; (iv) force, for the computation and processing of integral derivatives and the solution of the cphf equations; c) comparison and evaluation of timings of representative applications of turbomole on various workstations showed that the apollo ds 10000 and iris 4d/210 were the fastest, and comparable to the convex c210 in scalar mode. results: a new algorithm has been developed for calculating and visualizing space filling models of molecules. the algorithm is about 25 times faster than a conventional one and has an interesting transparency effect when using a stereo viewer. a) the algorithm is briefly described and the result is visualized on the modeling of a ribonucleotide unit; b) in the order of increasing atom numbers, the (x,y) sections of the hemispherical disks of the atoms are projected on the screen with decreasing value of the azimuthal angle (φ) of the van der waals radius, and as the value of φ decreases the projection is increasingly whitened to obtain a shading effect on the surfaces of the spheres; c) the transparency of the van der waals surfaces of the atoms of a molecule makes it possible to perceive almost the whole space filling structure and not only the surface hiding the underlying atoms. title: supercomputers and biological sequence comparison algorithms. authors: core*, n.g.; edmiston, e.w.; saltz, j.h.; smith, r.m. yale university school of medicine, new haven, ct 06520-2158, usa. source: computers biomed. res. 1989, 22(6), 497–515. compounds: dna and protein fragments. chemical descriptors: sequences of monomers.
results: a dynamic programming algorithm to determine best matches between pairs of sequences or pairs of subsequences has been used on the intel ipsc/1 hypercube and the connection machine (cm-1). parallel processing of the comparison on the cm-1 results in run times which are 65 to 230 times as fast as the vax 8650, with this factor increasing as the problem size increases. the cm-1 and the intel ipsc hypercube are comparable for smaller sequences, but the cm-1 is several times quicker for larger sequences. a fast algorithm by karlin and his coworkers, designed to determine all exact repeats greater than a given length among a set of strings, has been tried out on the encore multimax/320. dynamic programming algorithms are normally used to compare two sequences, but are very expensive for multiple sequences. the karlin algorithm is well suited to comparing multiple sequences. calculating a multiple comparison of 11 dna sequences, each 300-400 nucleotides long, results in a speedup roughly equal to the number of processors used. source: cabios 1989, 5(4), 323. results: a program has been developed for the prediction and display of the secondary structure of proteins using the primary amino acid sequence as database. a) the program calculates and graphically displays four predictive profiles of the proteins, allowing interpretation and comparison with the results of other programs; b) as a demonstration, the sliding averages of n sequential amino acids were calculated and plotted for four properties of human interleukin 6: (i) plot of the probabilities of α-helix, β-structure and β-turns according to chou and fasman; (ii) β-turn index of chou and fasman; (iii) plot of the hydrophobicity index of hopp and woods; (iv) flexibility index of karplus and schulz; c) the regions of primary structure having properties which usually go together agreed reasonably well with each other, i.e. loops and turns with bend probability, and hydrophilicity with flexibility.
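the sliding-average profiles described above reduce to a windowed mean over per-residue property values. a minimal python sketch; the per-residue values are illustrative stand-ins, not the published hopp-woods scale:

```python
# Sliding-average property profile over a protein sequence, in the
# spirit of the predictive plots described above. The per-residue
# values are illustrative stand-ins, not the published Hopp-Woods scale.

HYDRO = {"A": -0.5, "R": 3.0, "N": 0.2, "D": 3.0, "E": 3.0,
         "G": 0.0, "K": 3.0, "L": -1.8, "S": 0.3, "V": -1.5}

def sliding_profile(seq, values, window=5):
    """Mean property value over each length-`window` stretch of residues."""
    vals = [values[aa] for aa in seq]
    return [sum(vals[i:i + window]) / window
            for i in range(len(vals) - window + 1)]

profile = sliding_profile("ARNDEGKLSV", HYDRO, window=5)
print(len(profile))  # 6: one value per window position
```

plotting the resulting list against residue position gives the kind of profile the abstract describes.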
title: 3dsearch. a system for three-dimensional substructure searching. source: sci. 1989, 29(4), 255-260. results: the search for three-dimensional substructures is becoming widely used in 3d modeling and for the construction of pharmacophores for a variety of biological activities. a system (3dsearch) for the definition and search of three-dimensional substructures is described: a) the representation of atom types consists of five fields: (i) element (he-u); (ii) number of non-hydrogen neighbors (bonded atoms); (iii) number of π electrons; (iv) expected number of attached hydrogens; (v) formal charge; four types of dummy atoms are also used to define geometric points in space (e.g. the centroid of a ring); b) definition of queries: (i) definition of spatial relationships between atoms; (ii) matches in atom type; (iii) preparation of keys (constituent descriptors); (iv) execution of key search; (v) geometric search, including the handling of angle/dihedral constraints and taking into account "excluded volume"; c) time tests showed that a search for 3d structures with 3 to 5 atoms in large databases with more than 200,000 entries took only a few minutes (22-492 s). results: a database containing about 265,000 compounds in connection tables and 30,000 experimentally determined structures from the cambridge structural database has been transformed into a database of low energy 3d molecular structures using the program concord.
the strategy for building the 3d database consisted of the following steps: a) generation of approximate 3d coordinates from connection tables (hydrogens were omitted); b) assignment of atom types from connection table information, characterized by five descriptors: (i) element type (he-u); (ii) number of attached non-hydrogen neighbors (0-8); (iii) number of π electrons (0-2); (iv) calculated number of attached hydrogens (0-4); (v) formal charge (-1, 0, 1); c) addition of three types of chemically meaningful dummy atoms for purposes of 3d substructure searching: (i) centroids of planar 5- and 6-membered rings; (ii) dummy atoms representing the lone electron pairs; (iii) ring perpendiculars positioned orthogonal to and 0.5 a above and below each planar ring; d) efficient storage of the resultant coordinate database, indexing the compounds with identification numbers; e) the database can be used, among others, to deduce pharmacophores essential for biological activity and to search for compounds containing a given pharmacophore. source: c&en 1989, 67(43), 18-20. results: a complex carbohydrate structure database (ccsd) has been developed by the complex carbohydrate research center at the university of georgia, having more than 2000 structures and related text files, with about 3000 more records to be added over the next two years.
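the five-field atom typing used to key the 3d databases above can be sketched as a plain tuple; the neighbor-list encoding and the example atom are assumptions for this sketch, not values taken from concord:

```python
# Five-field atom-type descriptor of the kind described above:
# (element, non-hydrogen neighbor count, pi-electron count,
# attached-hydrogen count, formal charge). The neighbor-list
# encoding and the example atom are assumptions for this sketch.

def atom_descriptor(element, heavy_neighbors, pi_electrons, hydrogens, charge):
    """Pack the five descriptor fields into a hashable key."""
    return (element, len(heavy_neighbors), pi_electrons, hydrogens, charge)

# an sp2 carbonyl-like carbon bonded to three heavy atoms
key = atom_descriptor("C", ["O", "N", "C"], 1, 0, 0)
print(key)  # ('C', 3, 1, 0, 0)
```

because the tuple is hashable, it can serve directly as a lookup key when prefiltering a database before the more expensive geometric search.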
the following are the most important features of ccsd: a) in ccsd, database records include full primary structures for each complex carbohydrate, citations to the papers in which sequences were published, and supplementary information such as spectroscopic analysis, biological activity, information about binding studies, etc.; b) the structural display format visualizes branching, points of attachment between glycosyl residues and substituents, anomeric configuration of glycosyl linkages, absolute configuration of glycosyl residues, ring size, identity of proteins or lipids to which carbohydrates are attached, and other data; c) it is planned that ccsd will provide three-dimensional coordinates, to visualize and rotate the structures in stereo and study their interaction with proteins or other biopolymers. title: probing bioactive mechanisms. compounds: commercial carbamate insecticides of type ii, where r' = h, s-bu. biological material: acetylcholinesterase. data determined: d [distance (a) between the serine oxygen of acetylcholinesterase and the methyl substituents of the carbamate and phosphate inhibitors]. molecular modeling (models and minimum energy conformations of acetylcholine, aryl carbamate and phosphate ester inhibitors were created using the draw mode of a maccs database and the prxbld modeling program). results: transition state modeling of the reaction of the serine hydroxyl ion of acetylcholinesterase with the methylcarbamoyl and dimethyl phosphoryl derivatives of 3,4-dimethyl-phenol showed that the active site binding for these two classes of acetylcholinesterase inhibitors should be different. the model shows that the distances between the serine oxygen and the ring substituents (meta- and para-methyl groups) are different in both spacing and direction. fig.
1 shows the transition state models. the serine hydroxyl d values for the meta- and para-methyl substituents of n-methylcarbamate and dimethylphosphate were meta = 6.20, para = 8.10 and meta = 5.48, para = 7.03 a, respectively. title: a comparison of the charmm, amber and ecepp potentials for peptides. 1. conformational predictions for the tandemly repeated peptide (asn-ala-asn-pro)9. biological material: the tandemly repeated peptide (asn-ala-asn-pro), which is a major immunogenic epitope in the circumsporozoite (cs) protein of plasmodium falciparum. title: conformational analysis: dream or reality? a priori predictions for thrombin and ribonuclease mutants. authors: náray-szabó*, g.; nagy, j.; bérces, a. molecular modeling (geometric models of subtilisin and trypsin were built using protein data bank coordinates, and a model of thrombin was built using a theoretical coordinate set; graphic representations of the triad of the tetrahedral intermediate for the enzymes, formed on the active side chain and residues (ser-221, his-64 and asp-32 in subtilisin; ser-195, his-64 and asp-102 in trypsin and thrombin; his-119, lys-41 and his-12 in ribonuclease a), were created using pcgeom). electrostatic properties (electrostatic surfaces and fields of the molecules were calculated and displayed using amber and associated programs); data determined: [chemical shift (ppm) measured by nmr spectroscopy]. results: a hypothetical model of the complex between rubredoxin and cytochrome c3 was built as a model for the study of electron transfer between different redox centers, as observed in other systems. fig. 1 shows the main chain atoms of the proposed complex, where the hemes of the cytochromes are shown along with the center of rubredoxin; the complex is stabilized by hydrogen bonds and charge-pair interactions (the nonheme iron of the rubredoxin is in close proximity to heme 1 of cytochrome c3). the model was consistent with the requirements of steric factors, complementary electrostatic interactions and the nmr data of the complex.
comparison of the new model and the nmr data of the previously proposed flavodoxin-cytochrome c3 complex showed that both proteins interacted with the same heme group of cytochrome c3. title: nuclear magnetic resonance solution and x-ray structures of squash trypsin inhibitor exhibit the same conformation of the proteinase binding loop. authors: holak*, t.a.; bode, w.; huber, r.; otlewski, j.; wilusz, t. max-planck-institut für biochemie, d-8033 martinsried bei münchen, federal republic of germany. source: j. mol. biol. 1989, 210, 649-654. biological material: a) trypsin inhibitor from the seeds of the squash cucurbita maxima; b) β-trypsin and trypsin inhibitor complex. title: retention prediction of analytes in reversed-phase high-performance liquid chromatography based on molecular structure. 5. results: an expert system (cripes) has been developed for the prediction of rp-hplc retention indices from molecular structure by combining a set of rules with retention coefficients stored in a database. the method underlying the system is based on the "alkyl aryl retention index scale" and aims to improve the reproducibility of prediction and the compatibility between various instruments and column materials. the vp-expert system shell from microsoft was used for the development. the performance of cripes was demonstrated on several subtypes of substituted benzenes (phenacyl halides, substituted arylamines, arylamides and other types). in general the calculated and measured retention indices agreed well, but relatively large deviations were observed between the ie and ic values for phenacyl bromides and chlorides, o- and p-bromoanilines, n-methylbenzamide, n,n-dimethylbenzamide and phthalate esters. the extension of the database with further interaction values was regarded as necessary for a consistently high accuracy of prediction.
e_op [out-of-plane bending energy (kcal/mol), given by the formula e_op = k d^2, where d is the distance from the atom to the plane defined by its three attached atoms and k is a force constant]; e_t [torsional energy (kcal/mol) associated with four consecutive bonded atoms i, j, k, l, given by the formula e_t = k_ijkl (1 + (s/|s|) cos(|s| b_ijkl)), where b is the torsion angle between atoms i, j, k and l, and s and k are constants]; e_nb [nonbonded potential energy (kcal/mol) associated with any pair of atoms which are neither directly bonded nor bonded to a common atom, nor belong to substructures more than a specified cutoff distance away, given by the formula e_nb = k_ij (1.0/a^12 - 2.0/a^6), where a is the distance between the two atoms divided by the sum of their radii, and k_ij is the geometric mean of the k constants associated with each atom]. results: model geometries produced by the tripos 5.2 force field have been assessed by minimizing the crystal structures of three cyclic hexapeptides, crambin and 76 diverse complex organic compounds. comparative force field studies of the tripos 5.2, amber and amber/opls force fields, carried out by energy minimization of three cyclic hexapeptides starting from the crystal structures, showed the tripos 5.2 force field superior to the others, with the exception of the amber, ecepp/2 and levb force field results published by other workers. a direct comparison between the performance of tripos 5.2 and amber using isolated crambin showed that the bond and torsion angles of tripos 5.2 averaged closer to the crystal structure than the angles calculated by amber (rms = 0.025 a, 2.97 deg and 13.0 deg for bond lengths, angles, and torsions, respectively, and rms = 0.42 a for heavy atoms). fig. 1 shows the superimposed structures of crambin before and after energy minimization. tripos 5.2 was assessed for general purpose applications by minimizing 76 organic compounds starting from their crystal structures.
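the three energy terms quoted above translate directly into code. a sketch with placeholder constants, radii and distances, not the published tripos 5.2 parameters:

```python
import math

# Direct transcription of the three energy terms quoted above.
# The force constants, radii and distances used here are
# placeholders, not the published Tripos 5.2 parameters.

def e_out_of_plane(k, d):
    """k * d^2, with d the distance of the atom from the plane
    defined by its three attached atoms."""
    return k * d * d

def e_torsion(k, s, theta):
    """k * (1 + (s/|s|) * cos(|s| * theta)), theta in radians."""
    return k * (1.0 + (s / abs(s)) * math.cos(abs(s) * theta))

def e_nonbonded(k_i, k_j, dist, r_i, r_j):
    """k_ij * (1.0/a^12 - 2.0/a^6), with a = dist/(r_i + r_j) and
    k_ij the geometric mean of the per-atom constants."""
    a = dist / (r_i + r_j)
    k_ij = math.sqrt(k_i * k_j)
    return k_ij * (1.0 / a**12 - 2.0 / a**6)

# at contact distance (a = 1) the nonbonded term sits at its minimum, -k_ij
print(e_nonbonded(0.1, 0.1, 2.0, 1.0, 1.0))
```

note that the nonbonded term is a lennard-jones-like 12-6 form expressed in reduced distance a, so its minimum of -k_ij falls exactly at the sum of the two radii.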
the test showed that tripos 5.2 had a systematic error in overestimating the bond lengths of atoms in small rings. statistical analysis of the results showed that tripos 5.2 had an acceptable overall performance with both peptides and various organic molecules; however, its performance was not equal to the best specialized force fields. title: new software weds molecular modeling, nmr. author: krieger, j. c&en, 1155 sixteenth st., n.w., washington dc 20036, usa. source: c&en 1990, 68(13), 16. key: cord-158494-dww63e9f authors: wakefield, jon; okonek, taylor; pedersen, jon title: small area estimation of health outcomes date: 2020-06-18 journal: nan doi: nan sha: doc_id: 158494 cord_uid: dww63e9f small area estimation (sae) entails estimating characteristics of interest for domains, often geographical areas, in which there may be few or no samples available. sae has a long history and a wide variety of methods have been suggested, from a bewildering range of philosophical standpoints. we describe design-based and model-based approaches and models that are specified at the area-level and at the unit-level, focusing on health applications and fully bayesian spatial models. the use of auxiliary information is a key ingredient for successful inference when response data are sparse and we discuss a number of approaches that allow the inclusion of covariate data. sae for hiv prevalence, using data collected from a demographic health survey in malawi in 2015-2016, is used to illustrate a number of techniques. the potential use of sae techniques for outcomes related to covid-19 is discussed. small area estimation (sae) describes the endeavor of producing estimates of quantities of interest, such as means and totals, for domains (usually areas) which have sparse or non-existent response data. sae is carried out in many fields including health, demography, agriculture, business, education, and environmental planning.
in this article we focus on health outcomes, for which sae aids in highlighting geographical disparities, and is broadly useful for health planning, resource allocation and budgeting. sae allows fundamental questions to be addressed, such as: "how many people in my area have condition x or need treatment y?". it differs from disease mapping, which is traditionally based on a complete enumeration of disease cases; sae is typically based on a subset of individuals, selected via a survey which may have a complex design. the standard reference on sae is rao and molina (2015), while an excellent review is provided by pfeffermann (2013). in survey sampling, inference has often focused on the design-based (or randomization) approach. this focus has carried over into sae. this approach is quite distinct from the model-based approaches that are the bread and butter of mainstream spatial statistics. both inferential approaches are discussed in skinner and wakefield (2017). design-based methods assess the frequentist properties of estimators, averaging over all possible samples that could have been drawn under the specified sampling design. under this paradigm, the values of the responses in the population are viewed as fixed rather than random. model-based approaches can be either frequentist or bayesian. if a model-based approach is taken, a hypothetical infinite population model is specified for the responses, which are now viewed as random variables. modeling may be carried out within the design-based paradigm via model-assisted approaches (särndal et al., 1992), in which a model is specified but desirable design-based properties are retained, even under model misspecification. a cautious view (lehtonen and veijanen, 2009) is that design-based (including model-assisted) inference may be reliable in situations where there are large or medium samples in areas, while if data are sparse, a model-based approach may be a necessity.
in a companion article, datta (2009) reviews, and is more enthusiastic toward, model-based approaches. random effects modeling is popular under the model-based approach to sae, and inference for these models is often frequentist, through empirical bayes estimation; this is what we refer to as frequentist model-based inference, rather than the predictive approach described in valliant et al. (2000). if a model-based approach is taken, a key element of model specification (and a source of contention) is determining how to account for the sampling design. models may be specified at the area-level or at the unit-level. for the former, an important reference is fay and herriot (1979), which introduced the idea of modeling a weighted estimate of an area-level characteristic. for the latter, battese et al. (1988) describe a nested error regression model at the level of the sampling unit. design-based inference from sample surveys aims to estimate finite population quantities using information about the sampling process. in a model-based framework, the finite population is itself a sample from a superpopulation: a random process that can be described by some model. if, in an actual survey, the realized sample and the finite population can be described by the same model, then the sample design is ignorable. a simple random sample (srs) is ignorable in this sense. however, most real-life household surveys have nonignorable designs, and most sae models depend on assumptions. when the design is not ignorable, one needs to incorporate the design into the model. ideally, such incorporation would include the relevant aspects of the design, including design weighting, non-response corrections, and weight adjustments (see section 2). while this information may be available, many aspects of the sampling frame (such as the locations of all clusters in a design with cluster sampling) are typically not available, at least not in sufficient detail to be useful.
for surveys such as the demographic and health surveys (dhs), which are extensively carried out in low- and middle-income countries (lmic), stratification, clustering and estimation weights will typically be available. for surveys in developed countries, little information may be available. prevalence mapping (wakefield, 2020) is a name that has been given to the production of maps displaying the prevalence of health and demographic outcomes, and this endeavor clearly has large overlaps with sae. while sae smoothing methods often use area-level models, prevalence mapping often uses model-based geostatistics (mbg) methods, in which a continuous spatial model is specified. examples of prevalence mapping using area-level sae techniques include hiv prevalence (gutreuter et al., 2019) and the under-5 mortality rate (u5mr) (dwyer-lindgren et al., 2014; mercer et al., 2015; li et al., 2019). examples of prevalence mapping with mbg include hiv prevalence (dwyer-lindgren et al., 2019), malaria (gething et al., 2016), u5mr (golding et al., 2017) and vaccination coverage (utazi et al., 2018). as a motivating example we consider sae of hiv prevalence among females aged 15-29, in districts of malawi, using data from the 2015-16 malawi dhs. we will refer to the districts as admin-2 areas; in malawi there are 3 admin-1 areas, 28 admin-2 areas and 243 admin-3 areas. here we are using the gadm (database of global administrative areas) classification (https://gadm.org/download_country_v3.html). a two-stage stratified cluster sample was implemented, with the sampling clusters (enumeration areas) being stratified by district and urban/rural. the malawi population and housing census (mphc), conducted in 2008, provided the sampling frame for the survey. the sample for the 2015-16 malawi dhs was designed to provide estimates of key indicators for the country as a whole, for urban and rural areas separately, and for each of the 28 districts.
the sampling frame contained 12,558 clusters and our analyses use data from 827 sampled clusters (the supplementary materials give more details). in the 2015-16 dhs survey for malawi, 8,497 women in the age range 15-49 were eligible for testing, and 93% of them were tested. hiv prevalence data were obtained from voluntarily taken blood samples from dhs survey respondents. testing is anonymous, and as such, respondents cannot receive their test results directly from dhs. instead, they are given educational materials and referrals to voluntary, free counseling and testing. blood samples are collected on filter paper via finger pricks before being sent to a laboratory for testing (demographic health surveys, 2016). urban clusters were oversampled relative to rural clusters (details are in the supplementary materials). we dropped the island district of likoma, because it is spatially distinct (and quite culturally different from the nearest points on the mainland) and has a very small population. in the top left panel of figure 1, we plot the weighted prevalence estimates at the district level and see great variation and what appears to be an increasing gradient in hiv prevalence from north to south. however, the uncertainty (shown in the supplementary materials) is relatively large, particularly in the south. ignoring the survey design gives a national prevalence estimate (standard error) of 6.28% (0.37%) for females aged 15-29, while the weighted estimate is 6.18% (0.48%). the weighted estimate is smaller because the weights account for the urban oversampling (hiv is more prevalent in urban areas), and the increased standard error is due to the clustering. the structure of this article is as follows. in section 2, notation and basic ideas are presented, before traditional methods of direct and indirect estimation are described in section 3.
we focus on spatial methods that have not been covered in recent sae reviews in section 4, with both area-level and unit-level models being described. models of each kind are applied to the hiv prevalence data in section 5. the potential application of sae methods to covid-19 modeling is discussed in section 6, and the paper concludes with a discussion in section 7. we will let i index the areal units for which estimation is required, with m areas in total. assume there are N_i individuals in the population, with responses y_ik, k = 1, ..., N_i, in area i. we let s_i represent the set of indices of the selected individuals in area i, with n_i = |s_i| the number sampled in area i. from a design-based perspective n_i may or may not be random, depending on the survey plan; for example, if the small areas (domains) correspond to the strata in a stratified design (sometimes called primary domains), then the n_i will be non-random. it is more typical for this not to be the case, and in this situation we have secondary domains. under a model-based view, n_i is conditioned upon and therefore fixed. we take as target of inference the empirical mean of the response in the finite population, m_i = (1/N_i) sum_{k=1}^{N_i} y_ik. as examples of binary outcomes, interest may focus on the proportion of a population in an area that have a disease, have been vaccinated, or who are practicing social distancing (a much softer endpoint). this is a finite population characteristic, and such targets are common in survey sampling. another common target is the total sum_{k=1}^{N_i} y_ik. in the model-based approach, the hypothetical infinite population mean in area i is denoted µ_i. this can be interpreted as the mean of the distribution from which the finite population was (hypothetically) drawn.
we will focus on household surveys, which are extremely popular both in high-income countries (for example, in the united states, the national crime victimization survey and the american community survey) and in lmic (for example, the dhs and the multiple indicator cluster surveys). sampling of respondents for household surveys based on face-to-face interviewing is usually conducted via stratified multi-stage cluster sampling. a sample of primary sampling units (psus) is drawn from what is typically an area-based sampling frame, such as enumeration areas in a population census. most surveys employ stratification, usually by residence (urban and rural) and administrative units. since stratification usually has very low cost, and somewhat reduces the variance of estimates, there is usually no reason to avoid stratification. from the point of view of estimating totals across the whole sample, an allocation of sampling units to strata that is proportional to the population size is preferable. optimal allocation (lohr, 2019, section 3.4.2), where the allocation of sampling units is proportional to the standard deviation of the estimator and the size of the stratum, is preferable if the variances of the target estimator in each stratum are known and there is a single target estimator. in the case of most household surveys, the aim is to provide estimates of many different population quantities that all have different variances. optimal allocation is, therefore, not possible. proportional allocation ensures that the variances will not be worse than if the sample was drawn by srs. however, many surveys are disproportionally stratified because of the need to report on particular domains with similar precision, such as reporting by urban and rural areas or by individual provinces. nearly all surveys select psus with probability proportional to size (pps), where the size measure is the number of households in each psu as recorded in the sampling frame.
a linear systematic pps sample (murthy and rao, 1988, section 8) can be obtained if the sampling units are arranged contiguously along the real number line, each unit occupying a space equal to its size n_i. let t_i = t_{i-1} + n_i, i = 1, ..., m, with t_0 = 0, and let the sampling interval be t = t_m / n. then, select a random point r from 1 to t and select unit i if t_{i-1} < r + jt <= t_i for some j = 0, ..., n - 1. if the size measure is constant for all sampling units, the sample becomes a linear systematic srs. in a two-stage sample, which is the most common design, a list of households in each selected psu is constructed, usually by mapping the cluster and listing every household by visiting every dwelling. a fixed number of households is then drawn using srs or linear systematic srs. the method has the benefit that the sample size, both in total and in each cluster, is fixed by design. the fixed sample sizes thus reduce variance and ease the planning of the fieldwork. a key element in the design-based approach to inference is the design weights, which are the reciprocals of the inclusion probabilities. we now derive these probabilities for a stratified two-stage cluster design. suppose there are N_h households in stratum h, with N_hc being the number of households in cluster c, c = 1, ..., c_h, h = 1, ..., H. hence, H is the number of strata, and there are c_h clusters in stratum h. it is decided to sample n_h clusters in stratum h using pps sampling, so that the first stage inclusion probability for cluster c in stratum h is π_1hc = n_h N_hc / N_h. consequently, every cluster has a non-zero probability of being sampled. the number of households to sample in cluster c of stratum h is n_hc. the second stage inclusion probability is π_2hc = n_hc / N_hc. the overall inclusion probability is therefore π_hc = π_1hc π_2hc = n_h n_hc / N_h. if a constant number of households is sampled at the second stage, i.e., n_hc = n, then we write m_h = n_h × n and obtain π_hc = m_h / N_h, independent of c.
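the linear systematic pps selection rule just described can be sketched in a few lines of python; the cluster sizes below are invented for illustration:

```python
import random

# Linear systematic PPS selection as described above: units occupy
# contiguous intervals proportional to their sizes, and every t-th
# point from a random start picks a unit. Sizes are invented.

def systematic_pps(sizes, n, rng=random):
    """Select n unit indices with probability proportional to size."""
    totals = [0]
    for s in sizes:
        totals.append(totals[-1] + s)   # cumulative sizes t_0, ..., t_m
    t = totals[-1] / n                  # sampling interval
    r = rng.uniform(0, t)               # random start within one interval
    selected, i = [], 0
    for j in range(n):
        point = r + j * t
        while totals[i + 1] < point:    # walk to the unit whose interval
            i += 1                      # (t_{i-1}, t_i] contains the point
        selected.append(i)
    return selected

random.seed(1)
print(systematic_pps([120, 80, 200, 50, 150], n=2))  # [0, 2] with this seed
```

a unit whose size exceeds the interval t can be selected more than once; with equal sizes the procedure collapses to linear systematic srs, as the text notes.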
thus inclusion probabilities are equal within strata (self-weighting), provided the size measure (number of households) is equal for each psu in the first and second stages. in sampling with more than two stages, such as psus being districts, second stage units being enumeration areas, and third stage units being households, the second stage units are again selected by pps to achieve equal probabilities of selection. multi-stage sampling tends to increase variance substantially, because most of the variance accrues at the first stage of selection. therefore, multi-stage sampling is usually avoided if possible. for sae, surveys that use more than two stages are not ideal, because the sample becomes concentrated in the relatively few geographic areas that have been selected in the first stage. in stages selected with pps, surveys in developing countries usually employ linear systematic pps. the method has the benefit that it is simple and allows for implicit stratification if desired. implicit stratification involves sorting the sampling frame by some variable, usually geographic location. if the selection is systematic, it is likely that adjacent selections will be more similar to each other than to non-adjacent selections. pairs of adjacent clusters are therefore defined as implicit strata. the dhs used implicit stratification for early surveys but no longer does. a great amount of energy is expended on calculating the variances of estimators. the drawback of linear systematic pps is that the resulting selections of psus are dependent on each other. the lack of independence precludes the use of some variance estimators, such as the yates-grundy-sen estimator (yates and grundy, 1953), that require the joint inclusion probabilities of psus, which cannot be defined. therefore, for variance calculation purposes, it is generally assumed that psus were sampled with replacement, even though this is not the case under linear systematic pps.
this assumption leads to a generally insignificant inflation of the variance estimate. alternative pps selection methods do exist, see brewer and hanif (2013), but are seldom used, even though statistical suites such as spss and sas provide them, as do several r packages. surveys in developed countries often employ different designs, since more information is available to the sampler. design-based weights express the contribution of each selected sampling unit to the estimate of a population parameter. the starting point for weight construction is the design weights: they are the inverse of the product of the inclusion probabilities at each stage of sampling. the design weights thus directly reflect the sampling process. imperfections, such as non-response, mar most surveys. a common way to deal with non-response is to adjust the weights, by up-weighting sampling units that are assumed to be similar to the non-responding sampling units. the adjustment is typically carried out using neighboring units, for example a group of geographically adjacent survey clusters, as adjustment cells. the estimation weight is then the product of the design weight and the inverse of the response rate for the cell. an alternative is predicting the response propensity of each sampling unit using, for example, logistic regression. the estimation weight then becomes the product of the design weight and the inverse of the predicted response probability for each sampling unit. methods for adjusting for non-response are described in chapter 8 of lohr (2019). estimates of totals from sample surveys may also differ from known totals. for example, the age and gender distribution from a survey may differ from reliable population registration. to correct for the difference, post-stratification, raking, and calibration are sometimes used to adjust the weights so that the survey estimates match the known totals.
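a sketch of the cell-based non-response adjustment described above, dividing the design weight of each responder by the cell's observed response rate (equivalently, multiplying by its inverse); cell labels and weights are invented:

```python
# Cell-based non-response adjustment: within each adjustment cell the
# design weight of a responder is divided by the cell's observed
# response rate; non-responders get weight zero. Cells and weights
# are invented for illustration.

def adjust_for_nonresponse(design_w, cells, responded):
    """Return estimation weights after the cell adjustment."""
    rates = {}
    for c in set(cells):
        idx = [i for i, ci in enumerate(cells) if ci == c]
        rates[c] = sum(responded[i] for i in idx) / len(idx)
    return [design_w[i] / rates[cells[i]] if responded[i] else 0.0
            for i in range(len(design_w))]

w = adjust_for_nonresponse([2.0, 2.0, 3.0, 3.0],
                           cells=["a", "a", "b", "b"],
                           responded=[1, 0, 1, 1])
print(w)  # [4.0, 0.0, 3.0, 3.0]
```

the adjustment preserves the total weight within each cell: the weight of the non-responder in cell "a" is redistributed to its responding neighbor.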
post-stratification divides the population into groups, for example by age and sex, and adjusts the weights within each group so that known population totals are recovered. raking does the same adjustment, but to the marginals of the table formed by the classification variables. calibration is similar, but uses continuous variables instead of discrete ones. post-stratification, raking, and calibration ideally impart information to the sample, and therefore reduce bias. these methods are commonly used in countries with good sources of information. they are sometimes used when the analyst believes that some sources, such as projections from a population census, are better than the survey estimates, but that is a practice that easily goes awry. in general, weight variation in household surveys increases the variance by a factor of 1 + cv^2 (kish, 1992), where cv is the coefficient of variation of the weights. most household surveys strive for equal probability of selection, and therefore constant weights, at least within each stratum (as we saw with self-weighting). equal weights are difficult to achieve in practice. survey weight variation comes from three principal sources. the first source is disproportional allocation of sampling units to strata, which is part of the design itself. the second is differences that arise between the estimates of the number of sampling units in one stage of the sampling process and subsequent ones. for example, in a typical two-stage cluster sample where the clusters are selected with pps, inclusion probabilities will be equal for all households in a stratum if the number of households in each cluster used for selection in the first stage is the same as the number of households found in the listing of the clusters.
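kish's approximation quoted above, a variance inflation of 1 + cv^2 from weight variation alone, can be checked numerically; the weight vectors are invented:

```python
# Kish's approximation quoted above: weight variation alone inflates
# the variance by 1 + cv^2, with cv the coefficient of variation of
# the weights. The weight vectors are invented.

def kish_deff(weights):
    """1 + cv^2 for a list of estimation weights."""
    n = len(weights)
    mean = sum(weights) / n
    var = sum((w - mean) ** 2 for w in weights) / n
    return 1.0 + var / mean ** 2

print(kish_deff([1.0, 1.0, 1.0, 1.0]))  # 1.0 (equal weights: no inflation)
print(kish_deff([1.0, 3.0]))            # 1.25
```

this also shows why large weights are truncated in practice: a single extreme weight drives cv^2, and hence the design effect, up quickly.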
since the first-stage numbers come from the sampling frame, which may be somewhat outdated, while the listing is carried out immediately before the survey, the difference between the two numbers is often large and leads to substantial weight variation. third, weight variation may arise from non-response corrections, post-stratification, raking, and calibration. since weight variation increases the variance, large weights are truncated in many surveys. the truncation trades the cost of a (hopefully) small bias for an often substantial reduction in variance, and therefore also in mse. as a result, the final data are the product of several random processes (pfeffermann, 2011). the first is the generation of the population units. the second is the selection resulting from the application of the sampling design. the third is the selection of responding units, given that they have been included in the sample. finally, there is usually an ad hoc modeling step that is used to force sample statistics to resemble population parameters and to reduce variance. a direct estimate of a quantity in a specific area and time period only uses data on the variable of interest from that area and time period. for simplicity of explanation, we assume that the strata correspond to the small areas of interest and that the weights are simply the reciprocals of the inclusion probabilities. we will also not explicitly index the clusters in this section, since the discussion is relevant for general designs. the areas are indexed by i, and we let d_ik be the design weight associated with individual k in area i, whose response is y_ik. within area i, the design-based weighted (direct) estimator (horvitz and thompson, 1952; hájek, 1971) is m^ht_i = (Σ_k d_ik y_ik) / (Σ_k d_ik) (1), and its variance v_i may be calculated using standard methods; often the jackknife is used in an lmic context (pedersen and liu, 2012). a starting point for analysis is to map the weighted estimates.
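the weighted (hájek) direct estimator, m^ht_i = Σ_k d_ik y_ik / Σ_k d_ik, is a one-liner; a sketch with hypothetical weights and binary responses (in practice one would use, e.g., the r survey package, as the paper does):

```python
# Hajek-type direct estimator of an area mean from design weights d and
# responses y. Toy inputs; real analyses also need a variance estimate.

def hajek(d, y):
    """Weighted (direct) estimate of the area mean."""
    return sum(di * yi for di, yi in zip(d, y)) / sum(d)

# toy area: three sampled individuals, unequal design weights, binary responses
est = hajek([10.0, 20.0, 10.0], [1.0, 0.0, 1.0])
```

here the two positives carry half of the total weight, so the estimate is 0.5 even though two of three sampled individuals are positive.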
these weighted estimates have excellent properties (so long as the weights are reliable and stable), for example design consistency. if abundant data are available, no further modeling is needed. in general, however, the variance is o(n_i^-1), and so for small n_i this approach will not be sufficient. in this case, models are necessary to allow the incorporation of covariate information and/or the leveraging of spatial dependence between areas. we briefly describe a number of traditional indirect estimators that are constructed to reduce the mean squared error (mse), calculated under the randomization distribution. these estimators are generally design-based, but achieve a favorable mse through variance reduction. suppose we have covariates x_ik that are available on the population. in a conventional regression analysis, attention focuses on the slope coefficients, but in sae we are interested in m_i, and so it makes sense that we would want to make inferences about the responses of unobserved individuals. a synthetic estimator is m^syn_i = x̄_i^t β̂, where β̂ are the slope estimates derived from the set of samples across all areas and x̄_i is the known area-level covariate mean. the success of this estimator depends on how appropriate the regression model is for all areas. the variance will be o(1/n), where n = Σ_{i=1}^m n_i is the total sample size, but the possibility of large bias leads to this estimator being unpopular. fitting separate models in each area is appealing, but then the small-n_i problem re-emerges. as a side note, one possible compromise would be to use spatially varying coefficient models, in which each area has its own slope that is smoothed across areas. to deal with the potentially large bias, the bias may be estimated, to give the survey-regression estimator m^sr_i = m^ht_i + (x̄_i − x̂^ht_i)^t β̂, where m^ht_i and x̂^ht_i are the horvitz-thompson estimates of m_i and x̄_i. this estimator is approximately design-unbiased, but the variance is unfortunately back to o(1/n_i).
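for a single covariate, the synthetic estimator m^syn_i = β0 + β1 x̄_i and the survey-regression estimator m^sr_i = m^ht_i + β1 (x̄_i − x̂^ht_i) reduce to simple arithmetic; a hedged sketch (coefficient values are invented, not fitted):

```python
# Synthetic and survey-regression estimators for one covariate.
# beta0, beta1 would come from a regression on the pooled sample;
# here they are hypothetical numbers for illustration.

def synthetic(beta0, beta1, xbar_i):
    """Prediction from the pooled regression at the known area mean xbar_i."""
    return beta0 + beta1 * xbar_i

def survey_regression(m_ht_i, x_ht_i, beta1, xbar_i):
    """Direct estimate corrected for the gap between the sample-based and
    known covariate means; approximately design-unbiased, variance O(1/n_i)."""
    return m_ht_i + beta1 * (xbar_i - x_ht_i)
```

the synthetic estimator ignores the area's own data entirely (small variance, possibly large bias), while the survey-regression estimator anchors itself to the direct estimate.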
a composite estimator is of the form m^c_i = δ_i m^ht_i + (1 − δ_i) m^syn_i, with 0 ≤ δ_i ≤ 1 estimated in such a way that for larger n_i we have δ_i closer to 1. this estimator is intuitively appealing, and when motivated by a random effects model it leads to a principled approach to the estimation of δ_i. in this section and the next we focus on binary outcomes, because the malawi hiv prevalence example falls under this umbrella. in a major advance, fay and herriot (1979) introduced a very clever approach that models a transform of the weighted estimate, in order to gain precision by using a random effects model. for binary outcomes, one choice of transform is z_i = logit(m^ht_i); let the associated design-based variance estimate be denoted v_i. an area-level model is z_i = θ_i + ε_i, ε_i ∼ n(0, v_i) (2), θ_i = x_i^t β + e_i + s_i (3), where θ_i is the logit of the true proportion in area i. area-specific deviations from the regression model are modeled using a pair of random effects: the independent and identically distributed (iid) terms are e_i ∼ iid n(0, σ^2), while the collection s = [s_1, . . . , s_n] is assigned a spatial distribution. the original fay-herriot model did not include the spatial random effects s, but only the iid random effects. choices for the spatial distribution are described in banerjee et al. (2015), with common forms being the conditional autoregressive (car) and intrinsic car (icar) models. both of these choices capture the idea that, in general, many outcomes are likely to be similar in locations that are close by. mercer et al. (2015) and li et al. (2019) used these smoothed direct models in a space-time context, using spatial icar and temporal random walk components, along with a space-time interaction term. both the car and icar choices are markov random field (mrf) models, which offer computational advantages (rue and held, 2005). this model may be fit using a frequentist approach, or a bayesian approach in which priors are placed on the fixed effects (the β's) and on the variance/spatial dependence parameters.
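the composite estimator m^c_i = δ_i m^ht_i + (1 − δ_i) m^syn_i can be sketched with a precision-based shrinkage weight in the spirit of the random-effects motivation above; the particular choice δ_i = σ^2 / (σ^2 + v_i) is the standard fay-herriot-style shrinkage factor, and all numbers below are invented:

```python
# Composite (shrinkage) estimator: convex combination of a direct estimate
# m_ht (variance v_i) and a synthetic estimate m_syn, with shrinkage weight
# delta = sigma2 / (sigma2 + v_i), where sigma2 is the between-area variance.

def composite(m_ht, m_syn, v_i, sigma2):
    delta = sigma2 / (sigma2 + v_i)   # -> 1 as v_i -> 0 (large area sample)
    return delta * m_ht + (1.0 - delta) * m_syn
```

when v_i is large (few samples in the area) the estimate is pulled towards the synthetic value; when v_i is zero the direct estimate is returned unchanged, which is the sense in which δ_i approaches 1 for larger n_i.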
design-based consistency is achieved (so long as the priors do not assign zero mass to the true θ_i), since the v_i term will tend to zero and the bias due to the random effects smoothing disappears asymptotically. there are two practical difficulties with this approach. the first is that the direct estimates may lie on the boundary for a summary parameter that is transformed to the whole real line; for example, in the binary case we may have m^ht_i equal to 0 or 1, in which case z_i is undefined. further, a transform of the weighted estimator may not share the same design-based properties as the untransformed estimator, such as being design-unbiased. these problems may be alleviated by using an unmatched sampling and linking model (you and rao, 2002). the second difficulty is that reliable variance estimates v_i may be unavailable, particularly for areas with few or no samples. in this case, variance smoothing models can be used (rao and molina, 2015, section 6.4.1). in this section, we assume the units of analysis are clusters within a multi-stage cluster design. we let s_ic represent the geographical location of cluster c in area i, and explicitly index the counts and sample sizes as y_ic and n_ic, respectively. a crucial assumption here (section 4.3 of rao and molina (2015)) is that the probability of selection, given covariates, does not depend on the values of the response (as discussed in the context of ignorability in section 2). this implies that if stratified random sampling is used, the stratification variables must be included in the model. one would expect cluster sampling to lead to correlated responses within clusters, and cluster-level random effects are introduced to accommodate this aspect (scott and smith, 1969). another situation in which care is required is when pps sampling is carried out. if the model used mis-specifies the relationship between the response and size, then incorrect inference will result.
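the boundary problem is easy to see numerically: logit(0) and logit(1) are undefined. the empirical logit shown below, which adds 0.5 to both counts, is a standard ad hoc workaround, not one proposed in the text (which instead points to unmatched sampling and linking models):

```python
# The boundary problem for the logit transform of a direct estimate, and the
# empirical-logit workaround (illustrative; not the fix used in the paper).
import math

def logit(p):
    return math.log(p / (1.0 - p))          # fails for p in {0, 1}

def empirical_logit(y, n):
    """Defined even when y = 0 or y = n."""
    return math.log((y + 0.5) / (n - y + 0.5))

assert empirical_logit(0, 10) < 0           # finite, despite m_ht = 0
```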
to provide some protection against this, zheng and little (2003) and chen et al. (2010) model the response as a spline function of size, for continuous and binary outcomes, respectively. for a binary response, a common model (diggle and giorgi, 2019) is y_ic | p_ic ∼ binomial(n_ic, p_ic), and care must be taken when modeling p_ic = p(s_ic), the response probability associated with location s_ic. for a stratified design it is, in general, important to include fixed effects for each stratum (paige et al., 2020). in practice, there are a number of options when we have a design with strata that consist of urban/rural crossed with geographical districts. if there were no random effects, then strictly we would need interactions between district and urban/rural. however, the models we describe include spatial random effects (to leverage spatial dependence), and interactions would produce an unidentifiable model. consequently, we include a single urban/rural effect, and so are tacitly assuming that the association between the response and the urban/rural variable is approximately constant across areas. another possibility would be to include a spatially varying urban/rural coefficient, or to model the associations separately in larger regions, in the spirit of what is sometimes done with synthetic estimation. one candidate model is logit p(s_ic) = β_0 + x(s_ic)^t β_1 + z(s_ic)γ + s(s_ic) + ε_ic, where z(s_ic) is the stratum within which cluster c lies, exp(γ) is the associated odds ratio, and x(s_ic) are covariates available at location s_ic, with odds ratios exp(β_1). the spatial random effect s(s_ic) is associated with cluster location s_ic, and may be continuous or discrete. the cluster-level error ε_ic ∼ n(0, σ^2) is the so-called nugget, which is traditionally taken to represent short-scale variation and/or "measurement error". a model-based geostatistical model takes s(s_ic) as a realization of a zero-mean gaussian process (gp).
gp models are common choices for continuous spatial models, and imply that any collection of spatial random effects has a multivariate normal distribution. a popular choice for the variance-covariance (stein, 1999) is the matérn covariance function, for which the covariance between s(s_1) and s(s_2), at distance d = ||s_1 − s_2||, is cov(s(s_1), s(s_2)) = σ_s^2 (2^(1−ν_s)/Γ(ν_s)) (√(8ν_s) d/ρ_s)^(ν_s) k_(ν_s)(√(8ν_s) d/ρ_s), where ρ_s is the spatial range, corresponding to the distance at which the correlation is approximately 0.1, σ_s is the marginal standard deviation, ν_s is the smoothness (which is usually fixed, since it is difficult to estimate), and k_(ν_s) is a modified bessel function of the second kind, of order ν_s. when the number of clusters c is large, computation is an issue, because we need to manipulate c × c matrices, which involves o(c^3) operations (rue and held, 2005). various approximations have been proposed to overcome this problem, for example the stochastic partial differential equation (spde) approach pioneered by lindgren et al. (2011); this is the approach we use in section 5. other approaches are described by heaton et al. (2018). the area-level prevalence is obtained by aggregating the risk surface over the area, m_i = ∫_(a_i) p(s) q(s) ds / ∫_(a_i) q(s) ds (5), where p(s) = expit(β_0 + x(s)^t β_1 + z(s)γ + s(s)) is the risk at location s (the nugget is, for better or worse, frequently left out, since it is viewed as measurement error) and q(s) is the population density at s, which is needed at all locations on the approximating mesh, s_l, l = 1, . . . , m_i. cluster-level covariate models are appealing when compared to area-level covariate alternatives, since they are closer to the mechanism of action, and reduce the possibility of ecological bias (wakefield, 2008). a large disadvantage in an sae context is that, to aggregate to the area level at which inference is required, one needs to know the values of the covariates on the mesh, including the design variables. obtaining reliable population density and covariate surfaces is not straightforward.
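the general matérn form needs a modified bessel function; the ν = 1/2 special case reduces to the exponential covariance and can be sketched with the standard library alone. the sketch uses the range parameterization from the text (κ = √(8ν)/ρ_s, so that correlation at distance ρ_s is roughly 0.1):

```python
# Matern covariance, nu = 1/2 special case (exponential form). The general
# case replaces exp(-kappa d) with the Bessel-function expression; this
# sketch is illustrative only.
import math

def matern_half(d, sigma_s, rho_s):
    """Matern covariance with smoothness nu = 1/2 at distance d."""
    kappa = math.sqrt(8 * 0.5) / rho_s      # sqrt(8 * nu) / range
    return sigma_s ** 2 * math.exp(-kappa * d)

# at d = rho_s the correlation is exp(-2), about 0.135, i.e. "approximately 0.1"
```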
the direct and smoothed direct approaches avoid the need for a population surface, since the weights implicitly include population information from the original sampling frame. an alternative, overdispersed binomial, unit-level model, which we use for the hiv prevalence data, is y_ic | p_ic ∼ betabinomial(n_ic, p_ic, λ) (6), with logit(p_ic) = β_0 + x(s_ic)^t β_1 + z(s_ic)γ + s(s_ic) (7), where λ is the overdispersion parameter, and we have taken the spatial random effect to be decomposed as s(s_ic) = e_i + s_i, with e_i and s_i as in (3). the marginal variance is var(y_ic) = n_ic p_ic (1 − p_ic)[1 + (n_ic − 1)λ], so that small values of λ correspond to little overdispersion. the aggregate risk is far easier to evaluate than for the continuous spatial model, i.e., as calculated via (5), since all clusters in the same stratum have identical risk, to give m_i = q_i p_i^u + (1 − q_i) p_i^r (8), where q_i is the proportion of the relevant population in the area that is urban, and p_i^u and p_i^r are the urban and rural risks. dyed-in-the-wool, design-based aficionados are wary of models such as the ones described in this section, because establishing design consistency is very difficult. we return to the hiv prevalence example, and fit both area-level and unit-level models. all fitting was done in the r programming environment. the smoothing models were fit using the inla package, which provides bayesian inference via a fast implementation using the methods described in rue et al. (2009) and rue et al. (2017). to obtain the weighted estimates and variances for the areas, the survey package (lumley, 2004) was used. a number of the models can be conveniently fitted in the summer package (martin et al., 2018). for the discrete spatial model we use the bym2 parameterization (riebler et al., 2016), in which we have b_i = e_i + s_i (in equation (3)), an overall variance parameter σ_b^2 for b_i, and a parameter, φ, that represents the proportion of the variance that is spatial. in all analyses we use penalized complexity (pc) priors, details of which may be found in the supplementary materials.
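two pieces of the unit-level model above can be checked by hand: the marginal variance var(y) = n p (1 − p)[1 + (n − 1)λ], and the stratified aggregation m_i = q_i p^u + (1 − q_i) p^r. a sketch with made-up numbers:

```python
# Beta-binomial marginal variance and urban/rural aggregation, with toy values.

def betabinom_var(n, p, lam):
    """Marginal variance of an overdispersed binomial count."""
    return n * p * (1 - p) * (1 + (n - 1) * lam)

def aggregate_risk(q_urban, p_urban, p_rural):
    """Area risk as a population-weighted mix of urban and rural risks."""
    return q_urban * p_urban + (1 - q_urban) * p_rural

# lambda = 0 recovers the ordinary binomial variance
assert betabinom_var(25, 0.1, 0.0) == 25 * 0.1 * 0.9
```

with λ = 1 the variance of a cluster of size n is inflated by a factor of n, which shows why ignoring overdispersion understates uncertainty.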
in high-income countries, the census (and other sources) provide variables on all of the population in small areas of interest. in this case, weighted estimates can be obtained from model-assisted procedures, such as those described in section 3.2, which can be smoothed, if needed. unfortunately, in lmic, there are few candidate variables, and so instead we use the smoothed direct model (2). figure 2 shows the logits of the direct estimates against the logits of the anc prevalence estimates, with a reference line added. not surprisingly, we see a strong association. the smoothed direct model estimates without the anc covariate are shown in the top middle panel of figure 1. the shrinkage (overall via the iid random effects, and locally via the icar random effects) is apparent, with a flatter map compared to the direct estimates. the posterior uncertainty is also reduced (see the supplementary materials): the average standard error of the direct estimates is 0.018, and the average posterior standard deviation under the smoothed direct model is 17% lower. when moving from the no-covariate smoothed direct model to the model including the anc covariate, the posterior median of σ_b (reflecting the amount of residual variation) changes from 0.40 to 0.14, and φ (the proportion of spatial variation) goes from 0.56 to 0.26. the odds ratio (95% credible interval) associated with the logit anc covariate is 2.8 (2.0, 3.8), so the association is very strong, which explains why the across-district residual variability is so reduced when the covariate is added. the strong spatial structure of the covariate explains why φ is reduced. the dream of spatial modeling is to find covariates that make the area-level random effects "go away"; the top right panel of figure 1 shows the smoothed direct estimates with the anc covariate included. we fit three betabinomial models (6)-(7): (1) no urban/rural and no covariates, (2) urban/rural and no covariates, (3) urban/rural and anc covariate.
model (1) should not be used, as it does not account for the urban/rural stratification; we include it to demonstrate the bias this introduces. the bottom row of figure 1 shows the fits from these three models. again, the smoothness compared to the direct estimates is apparent, and the fits are quite similar to those of the smoothed direct models. closer examination reveals differences, however. the supplementary materials give comparisons of all the estimates. it is clear that leaving the urban/rural covariate out of the model leads to bias: the estimates from the no-adjustment model are too high, because of the oversampling of urban areas, which have higher hiv prevalence. from the sampling frame we know the number of clusters that are urban and rural (this information is given in the supplementary materials), which can be directly used for the aggregation to the admin-2 level, via (8). in the no-covariate model, the odds ratio associated with being urban is 2.3 (1.8, 2.9), and this hardly changes when the anc variable is added. this more-than-doubling of the odds is a cluster-level association. the odds ratio associated with the anc covariate is 2.3 (1.7, 3.1), so this association is also strong. the total spatial residual variation, and the proportion of this variation that is spatial, both plummeted when the anc covariate was included in the model. as compared to the average standard errors of the direct estimates, the average posterior standard deviations for models (1)-(3) above were 18%, 29%, and 43% lower, respectively, so there is a greater reduction in uncertainty under the unit-level models than under the area-level models. the estimates from model (1) are clearly biased (they are too high) because of the non-adjustment for urban/rural. for the three models, the posterior medians of σ_b (representing the standard deviation of the residual variation on the logistic scale) are 0.39, 0.38, and 0.18, and the proportions of the variation that is spatial are 0.60, 0.67, and 0.30.
hence, we see the same pattern as in the area-level modeling, with the inclusion of the anc covariate greatly reducing the residual variation and the proportion that is spatial. we also calculate estimates for the 243 admin-3 areas, which is more difficult with the smoothed direct model because of data sparsity (26 of the areas contain no data). the supplementary materials contain summaries of this analysis. these results should be viewed with some caution, since model checking at this level is very difficult, and the aggregation depends on knowing the proportion of each admin-3 area that is urban. finally, we fit the continuous spatial model, with and without the area-level covariate, and produce pixel-level estimates of hiv prevalence. this analysis should also be viewed with caution, particularly for the admin-2 summaries, since the aggregation requires a mesh-specific estimate of urban/rural. we use worldpop population density estimates (stevens et al., 2015; wardrop et al., 2018). specifically, using the known proportion of clusters that are labeled as urban in the sampling frame, we create a population density threshold (chosen to give the correct urban proportion of the population when a geographic partition is formed based on this threshold), and then we can obtain an urban/rural mesh to use in (8). results of these analyses are contained in the supplementary materials. there are a number of outstanding issues with the unit-level model-based analyses that need further consideration. in general, the weights have three components: design, non-response, and post-stratification/raking, and one needs to consider all three. we have discussed the stratification aspect, but further thought is required for non-response, while post-stratification from a modeling perspective has been considered (gelman, 2007). in practice, the urban/rural covariate is not known with any great accuracy geographically, and the impact of this on inference needs to be investigated.
one of the hardest parts of sae is model assessment. the direct estimates, m^ht_i, if available, provide one standard for comparison; asymptotically, m^ht_i ∼ n(m_i, v_i). one strategy is to systematically remove one area at a time, and then obtain a prediction of the missing area's prevalence based on the remaining 26 districts. the supplementary materials contain plots of the direct estimates versus the predicted estimates, and we see systematic problems with the models that leave out the urban/rural covariate. the performance of both the smoothed direct (area-level) model with the anc covariate and the unit-level model with urban/rural and anc prevalence is superior to that of the other models. the dic and log cpo summary measures support this conclusion. checking the admin-3 estimates is far more difficult, because there are no reliable direct estimates for comparison in a cross-validation exercise. coronavirus disease (covid-19) is an infectious disease caused by the sars-cov-2 virus. there are many endpoints that are of interest in geographical studies, including: disease prevalence, disease incidence, the number and fraction of people who have been infected with sars-cov-2 (via an antibody test), measures of social distancing, and excess mortality. estimating each of these responses over time and by age, gender, and race is also highly desirable. many study/data types would not fall under the heading of sae because they are based on routine surveillance data and are nominally a complete enumeration. surveillance data are often examined via space-time disease mapping or cluster detection techniques (waller and gotway, 2004). seroprevalence studies, in which blood samples are taken from sampled individuals and examined for antibodies, have been carried out to estimate the number of infections due to sars-cov-2.
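the leave-one-area-out check can be sketched as follows. the predictor used here (a precision-weighted mean of the remaining areas) is deliberately simple and purely illustrative; the paper's checks use the fitted smoothing models themselves. the held-out direct estimate is compared via its asymptotic distribution m^ht_i ∼ n(m_i, v_i):

```python
# Leave-one-area-out assessment sketch: drop area i, predict it from the
# other areas, and standardize the discrepancy by the direct estimate's
# standard error. Toy predictor; illustrative only.

def loo_z_scores(m_ht, v):
    z = []
    for i in range(len(m_ht)):
        rest = [(m, 1.0 / vi) for j, (m, vi) in enumerate(zip(m_ht, v)) if j != i]
        pred = sum(m * w for m, w in rest) / sum(w for _, w in rest)
        z.append((m_ht[i] - pred) / v[i] ** 0.5)
    return z
```

systematically large |z| values for a subset of areas (for example, all the urban-heavy ones) would signal exactly the kind of missing-covariate bias discussed above.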
with an appropriate sampling design, and a reliable test, such studies can be used to estimate the proportion of the population, by demographic group, who have been previously infected by sars-cov-2. these studies can be used for sae, if the numbers sampled are sufficiently large. stringhini et al. (2020) describe a seroprevalence study carried out in geneva, switzerland, with participants taken from an existing study that contained a stratified sample from the study population (the canton of geneva). the selection of individuals is key since, for example, those reporting at health facilities (who are subsequently tested) are not representative of the population. unfortunately, many of the early seroprevalence studies came (justifiably) under fire for data quality issues and for sampling from ill-defined populations, which leads to great difficulties with interpretation. this is well documented by vogel (2020), who gives a number of examples. health researchers are under intense pressure, given that they are working in the context of a political and press feeding-frenzy for results to be made available. this has led to inadequate validation of the antibody tests used, and limited scientific scrutiny of the design of studies through peer review. it has been documented that certain subgroups (for example, based on age, sex, and race) are disproportionately affected by covid-19, and as such, a stratified analysis may be used to reliably estimate outcomes of interest. but the numbers required will be large if a small area analysis is envisaged, given the relative rarity of death from covid-19, in particular. a potential drawback with traditional survey sampling and sae in the context of covid-19 is the possibility that seroprevalence or incidence is highly clustered. there are two notions of clustering that are important here. the first is infectious network clustering, i.e., an infected person infects others in his or her network.
such clustering, while intrinsic to any infectious disease, may or may not lead to the other form of clustering, namely geographic clustering. geographic clustering of covid-19 is common, both in communities and in institutions (there are many factors which may lead to this, including the propensity for individuals of common demographic groups, or at high risk, to live in close proximity). when geographic clustering is present, a large proportion of the infected population may be located in relatively few survey clusters, many of which may be geographically adjacent. since a typical cluster sample design spreads the sample clusters out over a geographic region, there is a large chance that the sample will not pick up the high-prevalence areas. one may solve the problem by increasing the sample size, but that rapidly becomes costly. an alternative is adaptive cluster sampling (brewer, 1992; thompson and seber, 1996). an ordinary cluster sample is first drawn. then, each selected cluster is listed and target individuals are identified (e.g., persons with a positive test for sars-cov-2). if the number of target individuals is larger than a given threshold, then clusters that share a border with the initial cluster are included in the survey. the process is repeated for each of the adjacent clusters, until a network of clusters is formed that has no adjacent cluster that satisfies the threshold. if the threshold is well selected, then the adaptive cluster sample will pick up geographically based networks of infection well. if it is set too low, then all clusters in the frame will be included; if the threshold is too high, the sample limits itself to the initial sample. with a target population that is clustered, and judicious setting of the threshold value, adaptive cluster sampling can be very efficient.
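the adaptive expansion step described above (grow the network from an initial cluster while the threshold is met, keeping non-triggering edge clusters but not expanding past them) can be sketched with a toy adjacency structure; cluster counts and neighborhoods are invented:

```python
# Sketch of the adaptive cluster sampling expansion step. `counts` holds the
# number of target individuals found in each cluster; `neighbors` is the
# border-sharing adjacency. All inputs are hypothetical.
from collections import deque

def adaptive_network(start, counts, neighbors, threshold):
    """Return the set of clusters reached by adaptive expansion from `start`."""
    network, frontier = {start}, deque([start])
    while frontier:
        c = frontier.popleft()
        if counts[c] >= threshold:          # this cluster triggers expansion
            for nb in neighbors[c]:
                if nb not in network:
                    network.add(nb)
                    frontier.append(nb)
    return network

# toy line of clusters 0-1-2-3, counts [5, 4, 1, 6], threshold 3:
# cluster 2 is included as an edge cluster but does not expand, so
# cluster 3 is never reached
nbrs = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
net = adaptive_network(0, [5, 4, 1, 6], nbrs, 3)
```

this also illustrates the threshold trade-off in the text: with threshold 0 the whole frame is swept in, while with a threshold above 5 the sample never leaves the initial cluster.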
unfortunately, there is no optimal sampling strategy for adaptive cluster sampling (turk and borkowski, 2005; thompson and seber, 1996), so the exact choice of design must be piloted. a potential disadvantage of adaptive cluster sampling is that while it may be efficient for the estimator that informs the design, such as the prevalence of a disease, it is not necessarily efficient for estimators of other population quantities. thus, if the survey aims to estimate additional quantities other than the one it was designed for, one might have to consider other designs. this problem is, to some extent, mitigated by the fact that many other quantities may be both easy to estimate and not require narrow confidence bands. for example, the proportion of people using face masks may not require as precise an estimate as a seroprevalence estimate. looking into the future, given the successful development of a vaccine, one could use sae techniques to estimate vaccination coverage rates. such an approach has been used in many different settings: bcg, dpt, opv, and measles in india (pramanik et al., 2015), and dpt in africa (mosser et al., 2019). in the united states (us), in the absence of a national register, particular outcomes can be geographically examined using survey data. albright et al. (2019) carried out sae of human papillomavirus (hpv) vaccination coverage at the county level in alabama, using data from the national immunization survey. to get at transmission dynamics, one needs to fit more complex models, such as susceptible-exposed-infected-removed (seir) models. unfortunately, there are a number of serious drawbacks to using such models in an sae enterprise. the first is that the data required are not routinely available, with the usual problems of under- and mis-reporting being serious. the second is that fitting stochastic models is very challenging; see the references in fintzi et al. (2017).
to avoid the computational difficulties, one may use deterministic compartmental models (anderson and may, 1991) and then map summary statistics such as the reproductive number r_0. in large populations such models may perform well, but they may perform poorly when the system is far from its deterministic limit (andersson and britton, 2000); inference is also difficult, because error terms are "added on", rather than the stochastic and deterministic parts of the model being entwined, as is the ideal for a statistical model. however, such models are far preferable to simple curve-fitting exercises in which some generic function is fit to data; such approaches are likely to perform very poorly for predictions of complex phenomena, because they (basically) contain little of the science or relevant context. the impact of interventions, such as social distancing and vaccination campaigns, is also impossible to determine with such approaches. sae is a huge topic, and we have been selective in this review, focusing on the use of bayesian spatial models for analyzing survey data. there is a disconnect between the spatial statistics and sae research communities: the former are very cavalier about ignoring the survey design, and the latter tend to use simple discrete spatial models (or include iid spatial random effects only) and empirical bayes estimation. regular spatial modeling with binary data does not emphasize consistency, perhaps because the parameters of nonlinear random effects models are not generally estimated consistently unless the model is correctly specified. outside of the linear model, results on frequentist (model-based) consistency for the parameters of spatial models are few and far between. it is desirable that when area-level estimates are aggregated, they are in agreement with estimates for the larger areas within which they are nested. the estimates in the larger areas may be well estimated, due to larger sample sizes and additional data.
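a deterministic seir model of the kind mentioned above can be sketched with simple euler steps; all parameter values are invented for illustration (the paper does not fit such a model here), and in this parameterization r_0 = β/γ:

```python
# Minimal deterministic SEIR sketch (forward-Euler integration).
# beta: transmission rate, sigma: 1/incubation period, gamma: removal rate.
# All numbers are hypothetical.

def seir(beta, sigma, gamma, s0, e0, i0, r0_comp, dt=0.1, steps=1000):
    s, e, i, r = s0, e0, i0, r0_comp
    n = s + e + i + r
    for _ in range(steps):
        new_exposed = beta * s * i / n * dt
        new_infectious = sigma * e * dt
        new_removed = gamma * i * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_removed
        r += new_removed
    return s, e, i, r

final = seir(beta=0.5, sigma=0.2, gamma=0.25, s0=990.0, e0=0.0, i0=10.0, r0_comp=0.0)
r0 = 0.5 / 0.25   # basic reproductive number for these hypothetical rates
```

the sketch also makes the limitation in the text concrete: the trajectory is fully determined by the parameters, so observational noise would have to be "added on" rather than being part of the dynamics.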
such benchmarking is an important endeavor in many applications. for example, li et al. (2019) produced small area estimates of under-5 mortality in 25 countries using dhs data, and benchmarked to the official un national estimates, which use far more extensive data sources. many approaches have been suggested; in particular, we find the bayesian method of zhang and bryant (2020) very appealing, and it could be used with the area-level and unit-level models we have described. we have discussed both discrete and continuous spatial models. the former have dominated the sae literature, and have many appealing features, including computational efficiency, a wealth of experience in their use, and robustness to different data-generating mechanisms (paige et al., 2020). however, the discrete spatial models always involve an ad hoc neighborhood specification, which is unfortunate. continuous spatial models are far more appealing in this respect, and also allow data that are aggregated to different levels to be combined (wilson and wakefield, 2020). but continuous spatial models pose greater computational challenges, and careful prior specification is required (fuglstad et al., 2019). there is a growing literature on accounting for errors in the covariates used for modeling in sae, with various errors-in-variables models being considered (ybarra and lohr, 2008; barber et al., 2016; burgard et al., 2019); see section 6.4.4 of rao and molina (2015) for a review. finally, as more and more data streams become available, there is a growing need for methods that combine data in appropriate ways, and this is especially true of sae. for example, in an lmic context, data may be available from a variety of surveys with different designs, sample vital registration systems, and censuses. synthesizing such data is not straightforward, but can offer great benefits if carried out successfully. we use penalized complexity (pc) priors in our analyses.
pc priors facilitate intuitive hyperprior assignment. in this framework, we assign prior probability to a base/simple model and a more flexible/complex model, with the base model being favored unless otherwise indicated by the data. kullback-leibler divergence is used to measure the distance between the base model and the more flexible model, and deviation from the base model is penalized at a constant rate. the pc prior for a parameter θ is specified using the probability statement pr(θ > u) = α, where the user chooses the values of u and α. using this framework, we set priors for the bym2 overall precision parameter and mixing parameter. for the overall precision parameter in the bym2 model, we set u = 1, α = 0.01, which corresponds to a prior probability of 0.99 of having residual odds ratios smaller than 2. for the mixing parameter, we set u = 0.5, α = 2/3, which corresponds to a 67% chance that more than 50% of the total variation of the random effect has spatial structure. details of the derivation of the pc prior for the bym2 model can be found in appendix 2 of riebler et al. (2016). [fragment of the figure 1 caption: top row ends with the smoothed direct model with anc covariate; bottom row, unit-level (betabinomial) models: no urban/rural, no covariate; urban/rural only; urban/rural and anc covariate.] the posterior median of the spatial range parameter was 0.57°, so quite large (malawi ranges in latitude from -16.9° to -9.7° and in longitude from 32.9° to 35.5°), though the 95% interval was 0.32° to 0.98°, so there is a lot of uncertainty. the cluster-level odds ratio was estimated as 2.3 (with a 95% interval of 1.7, 3.0).
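for a standard deviation σ, the pc prior probability statement pr(σ > u) = α corresponds to an exponential prior on σ with rate λ = −log(α)/u; a sketch using the values quoted above for the bym2 precision (u = 1, α = 0.01):

```python
# PC prior for a standard deviation: Pr(sigma > u) = alpha corresponds to an
# exponential prior on sigma with rate lambda = -log(alpha) / u.
import math

def pc_rate(u, alpha):
    """Rate of the exponential prior on sigma implied by Pr(sigma > u) = alpha."""
    return -math.log(alpha) / u

def prob_sigma_exceeds(u, rate):
    """Exponential tail probability Pr(sigma > u)."""
    return math.exp(-rate * u)

rate = pc_rate(1.0, 0.01)
# sanity check: the implied prior puts probability alpha above u
assert abs(prob_sigma_exceeds(1.0, rate) - 0.01) < 1e-12
```

note the "constant rate" penalization mentioned in the text is visible directly: the exponential tail decays at a fixed rate in σ, i.e. in the distance from the base model.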
references:
- small area estimation of human papillomavirus vaccination coverage among school-age children in alabama counties
- infectious diseases of humans: dynamics and control
- stochastic epidemic models and their statistical analysis
- hierarchical modeling and analysis for spatial data
- modelling the presence of disease under spatial misalignment using bayesian latent gaussian models
- an error-components model for prediction of county crop areas using survey and satellite data
- sampling with unequal probabilities
- sampling
- a fay-herriot model when auxiliary variables are measured with error
- bayesian penalized spline model-based inference for finite population proportion in unequal probability sampling
- model-based approach to small area estimation
- demographic health surveys
- model-based geostatistics for global public health: methods and applications
- estimation of district-level under-5 mortality in zambia using birth history data
- mapping hiv prevalence in sub-saharan africa between
- estimates of income for small places: an application of james-stein procedure to census data
- efficient data augmentation for fitting stochastic epidemic models to prevalence data
- constructing priors that penalize the complexity of gaussian random fields
- struggles with survey weighting and regression modeling
- mapping plasmodium falciparum mortality in africa between -15: a baseline analysis for the sustainable development goals. the lancet
- improving estimates of district hiv prevalence and burden in south africa using small area estimation techniques
- an essay on the logical foundations of survey sampling, part i
- a case study competition among methods for analyzing large spatial data
- a generalization of sampling without replacement from a finite universe
- weighting for unequal pi
- design-based methods of estimation for domains and small areas
- changes in the spatial distribution of the under five mortality rate: small-area analysis of 122 dhs surveys in 262 subregions of 35 countries in africa
- an explicit link between gaussian fields and gaussian markov random fields: the stochastic partial differential equation approach (with discussion)
- sampling: design and analysis
- analysis of complex survey samples
- summer: spatio-temporal under-five mortality methods for estimation
- small area estimation of childhood mortality in the absence of vital registration
- mapping diphtheria-pertussis-tetanus vaccine coverage in africa
- systematic sampling with illustrative examples
- model-based approaches to analysing spatial data from complex surveys
- child mortality estimation: appropriate time periods for child mortality estimates from full birth histories
- modelling of complex survey data: why model? why is it a problem? how can we approach it
- new important developments in small area estimation
- vaccination coverage in india: a small area estimation approach
- small area estimation, second edition
- an intuitive bayesian spatial model for disease mapping that accounts for scaling
- gaussian markov random fields: theory and applications
- approximate bayesian inference for latent gaussian models using integrated nested laplace approximations (with discussion)
- bayesian computing with inla: a review
- model assisted survey sampling
- estimation in multi-stage surveys
- penalising model component complexity: a principled, practical approach to constructing priors (with discussion)
- introduction to the design and analysis of complex survey data
- interpolation of spatial data: some theory for kriging
- disaggregating census data for population mapping using random forests with remotely-sensed and ancillary data
- repeated seroprevalence of anti-sars-cov-2 igg antibodies in a population-based sample from geneva, switzerland. the lancet
- a review of adaptive cluster sampling
- high resolution age-structured mapping of childhood vaccination coverage in low and middle income countries
- finite population sampling and inference: a prediction approach
- first antibody surveys draw fire for quality, bias
- ecologic studies revisited
- prevalence mapping
- applied spatial statistics for public health data
- spatially disaggregated population estimates in the absence of national population and housing census data
- pointless spatial modeling
- selection without replacement from within strata with probability proportional to size
- small area estimation when auxiliary information is measured with error
- small area estimation using unmatched sampling and linking models
- fully bayesian benchmarking of small area estimation models
- penalized spline model-based estimation of the finite population total from probability-proportional-to-size samples

in each plot we have the direct estimates (with uncertainty bars) plotted against the posterior
medians of the predictive distribution.

figure 22 (caption): hiv prevalence estimates at the admin-3 level, using the betabinomial model. left: point estimates (posterior mean). right: posterior standard deviation.

the authors would like to thank the malawi ministry of health for allowing the use of the anc data, and jeff eaton for helpful comments on the manuscript.

[table residue; district-level counts, column headers lost in extraction:]
(district name lost)  1  145  4  23  6  204
phalombe  17  165  3  27  3  316
rumphi  8  130  6  20  12  156
salima  5  168  6  23  22  416
thyolo  8  177  4  30  12  674
zomba  19  194  9  26  79  584
total  278  4427  168  659  1409  11149

key: cord-135004-68y19dpg authors: russo, carlo; liu, sidong; ieva, antonio di title: impact of spherical coordinates transformation pre-processing in deep convolution neural networks for brain tumor segmentation and survival prediction date: 2020-10-27 journal: nan doi: nan sha: doc_id: 135004 cord_uid: 68y19dpg

pre-processing and data augmentation play an important role in deep convolutional neural networks (dcnn). while several methods aim at standardization and augmentation of the dataset, we here propose a novel method that feeds the dcnn with input data transformed into spherical space, which could facilitate feature learning better than standard cartesian-space images and volumes. in this work, the spherical coordinates transformation has been applied as a pre-processing method that, used in conjunction with normal mri volumes, improves the accuracy of brain tumor segmentation and patient overall survival (os) prediction on the brain tumor segmentation (brats) challenge 2020 dataset. the lesionencoder framework has then been applied to automatically extract features from dcnn models, achieving 0.586 accuracy of os prediction on the validation data set, which is one of the best results according to the brats 2020 leaderboard. magnetic resonance imaging (mri) is used in everyday clinical practice to assess brain tumors.
however, the manual segmentation of each volume representing the extension of the tumor is time-demanding and operator-dependent, as it is often non-reproducible and depends upon the neuroradiologist's expertise. several automatic or semi-automatic segmentation algorithms have been introduced to help segment brain tumors, and deep convolutional neural networks (dcnn) have recently shown very promising results. to further improve the accuracy of automatic methods, the multimodal brain tumor segmentation (brats) challenge [1] [2] [3] is organized annually within the international conference on medical image computing and computer assisted intervention (miccai). the brats 2020 challenge includes a task for automatic segmentation of the total area containing the tumor (whole tumor, wt), as well as the areas of necrosis and active tumor cells (tumor core, tc; enhancing tumor, et; necrosis and et are contained in tc). furthermore, glioma patients often have a dire survival prognosis following surgical resection and radiochemotherapy [4]. thus, a further task to predict patient overall survival (os) has been added to the challenge, aimed at improving the prediction of patient survival outcome in order to add information that is relevant to the decision-making process. dcnns are data-driven algorithms: they require huge amounts of data to obtain good results. in medical imaging, such big datasets are often unavailable, so pre-processing and data augmentation play an important role. while pre-processing methods are usually used to standardize input data, they can also be used to enhance meaningful data inside the original input images: an example is cropping the region of interest when the input data include a lot of redundant and misleading information.
therefore, we propose a novel spherical space transformation method to enhance information around specific points of the tumor, as well as to make the dcnn learning process invariant to rotation and scaling of the input images. furthermore, we extended the use of lesion features extracted from the latent space of the segmentation models using the lesionencoder framework, which replaces the classic imaging / radiomic features, such as volumetric parameters, intensity, morphologic, histogram-based and textural features, which showed high predictive power in patient os prediction. dataset: the dataset consists of four mri sequences used to determine the segmentation and extract survival features, namely t1-weighted, post-contrast t1, t2-weighted and flair images. the training dataset has 336 4-channel volumes with ground truth segmentation. the validation dataset is composed of data from 125 patients [5, 6]. the testing dataset is composed of an additional 166 patients. the dcnn that we chose as the baseline for our method is derived from myronenko [7], which is based on a variational auto encoder (vae) u-net, with the input shape and loss function adjusted according to the type of transformation used in the pre-processing phase. the vae proposed by myronenko is composed of a u-net with two decoder branches: a segmentation decoder branch, used to obtain the final segmentation, and an additional decoder branch that reconstructs the original volumes, used to regularize the shared encoder. the loss function is given by l = l_dice + 0.1 l_l2 + 0.1 l_kl (weights as in [7]), where l_l2 is the l2 loss on the vae branch and l_kl is the kl divergence penalty term. we trained different models by changing the pre-processing method (cartesian and spherical) and some layer hyperparameters. although the models share the same vae structure proposed by myronenko, there are a few differences.
more specifically, cartesian_v1 includes standard dropout with rate 0.2, 3x3x3 kernel filters in the convolution layers outside the green blocks, and an additional 3x3x3 convolution layer before the blue block, while cartesian_v2 uses spatialdropout3d with the same rate, 1x1x1 convolution filters and no additional layers. the spherical model has the same structure as cartesian_v1 but with the spherical transformation pre-processing applied to the inputs. the spherical transformed model using the cartesian_v2 structure had not been trained by the 2020 challenge deadline. a biscartesian model has also been trained, using the structure of the cartesian_v1 model but with a lower coefficient on the kl loss, set to 0.0001. this model did not give better segmentation results, although it shows improved results on the os task. our team previously presented the spherical coordinate transformation pre-processing as a method to improve segmentation results [8]. the spherical transformed volume is shown in figure 1. figure 1: example of the representation of a radiologic volume in a spherical coordinate system. a) a brain mri volume with its 3d segmentation of the tumor, and b) the same volume transformed into a spherical coordinate system using the center of the volume as the origin. each pre-processed volume uses an origin point; thus, to achieve good performance in training, it is important to correctly select origin points that lie within the tumor. for this reason, we used a cascade of dcnns, the first one predicting a coarse segmentation, and then refining the segmentation by using origin points included in the previous model's output. the first-pass model of the cascade could also be a model trained on non-transformed input (a cartesian model), but using the spherical model already in the first pass of the cascade enabled pre-trained weights to be used for the next training steps. the spherical coordinate transformation also adds an extreme augmentation.
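the transformation to spherical coordinates around a chosen origin can be sketched as a resampling of the volume onto an (r, θ, φ) grid, so that rotations about the origin become shifts along the angular axes and scalings become shifts along the radial axis. this is an illustrative sketch, not the authors' implementation; the grid sizes and interpolation order are arbitrary assumptions:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def to_spherical(vol, origin, n_r=64, n_theta=64, n_phi=64):
    """resample a 3-d volume onto a (r, theta, phi) grid centred at `origin`."""
    r_max = np.linalg.norm(np.array(vol.shape))      # radius covering the volume
    r = np.linspace(0.0, r_max, n_r)
    theta = np.linspace(0.0, np.pi, n_theta)          # polar angle
    phi = np.linspace(0.0, 2.0 * np.pi, n_phi)        # azimuth
    R, T, P = np.meshgrid(r, theta, phi, indexing="ij")
    x = origin[0] + R * np.sin(T) * np.cos(P)
    y = origin[1] + R * np.sin(T) * np.sin(P)
    z = origin[2] + R * np.cos(T)
    coords = np.stack([x.ravel(), y.ravel(), z.ravel()])
    # trilinear interpolation; points outside the volume are filled with 0
    out = map_coordinates(vol, coords, order=1, mode="constant", cval=0.0)
    return out.reshape(n_r, n_theta, n_phi)

vol = np.random.rand(32, 32, 32).astype(np.float32)
sph = to_spherical(vol, origin=(16, 16, 16))
print(sph.shape)  # (64, 64, 64)
```

note that the r = 0 slice collapses onto the origin voxel, which is why origin points inside the tumor matter: the radial axis is anchored there.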
this is a beneficial step, as it adds rotation and scaling invariance to the dcnn model. however, such invariance also has a drawback, especially when dealing with wt segmentation: apparently, wt segmentation works better with a cartesian model, whereas the spherical pre-processing adds many false-positive regions to the wt. thus, we used a cartesian model to filter out the false-positive regions found with the spherical pre-processing, as shown in figure 2. we used this proposed method for the first time in a brats challenge, achieving results similar to those in our original paper regarding the improvement in accuracy of the model trained on transformed input compared to the baseline model. we also tested the intersection of the segmentations on the three different classes (spherical - cartesian intersection 3ch) instead of filtering only the wt class. finally, we ensembled the best segmentations from cartesian_v2 and the intersection method to improve the results further. after filtering the spherical segmentation with the cartesian filter, we used a post-processing method to improve the et segmentation. we noticed that many false-positive et segmentations are due to isolated voxels. for this reason, we applied a binary opening operator to remove thin branches around et spots and then filtered out the spots having fewer than 30 voxels. when the et segmentation is still present after these filters, the original et segmentation is restored and used as the final one; otherwise, the et segmentation is completely erased, meaning that no et is present in the current volume. table 1 shows the summary of segmentation results on the validation dataset. the most promising methods tested so far were the cartesian_v2 and the spherical models: used alone without post-processing, cartesian_v2 gave the best results in wt and tc segmentation, while the spherical model worked better on et segmentation.
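the et post-processing described above (binary opening, a 30-voxel component-size filter, then restore-or-erase) can be sketched roughly as follows; the connectivity structure and array sizes are assumptions, not the authors' exact settings:

```python
import numpy as np
from scipy.ndimage import binary_opening, label

def clean_et_mask(et_mask, min_voxels=30):
    """suppress isolated false-positive et voxels: open the mask to cut thin
    branches, drop connected components smaller than `min_voxels`, and keep
    the *original* et voxels if any component survives; otherwise return an
    empty mask (no et in this volume)."""
    opened = binary_opening(et_mask.astype(bool))
    labels, n = label(opened)
    survives = any((labels == i).sum() >= min_voxels for i in range(1, n + 1))
    if survives:
        return et_mask.astype(bool)   # restore the original segmentation
    return np.zeros_like(et_mask, dtype=bool)

lone = np.zeros((16, 16, 16), dtype=bool)
lone[2, 3, 4] = True                  # a single isolated voxel
cube = np.zeros((16, 16, 16), dtype=bool)
cube[4:9, 4:9, 4:9] = True            # a 125-voxel spot
print(clean_et_mask(lone).sum(), clean_et_mask(cube).sum())  # 0 125
```

the isolated voxel disappears under the opening, while the large spot survives the size filter and the original mask is kept unchanged.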
the ensemble of the models gave a further improvement in the segmentation of the et class, while the best results for the wt and tc classes were still obtained using only the cartesian_v2 model. the best overall improvement on et was obtained by post-processing the segmentation of the intersected cartesian and spherical models on the three channels, even if the tc dice score decreased and the wt class did not improve further. the results of the final model on the testing dataset, shown in table 2, seem to confirm good accuracy, above all on the et segmentation, although it is not possible to make a comparison with the other models since the challenge only allows one method to be tested on the dataset. our team also participated in task 2 of the brats challenge: prediction of patient overall survival (os) from pre-operative mri scans. instead of using the pre-defined imaging / radiomic features, such as volumetric parameters, intensity, morphologic, histogram-based and textural features, we used features automatically extracted from the mri scans using the novel lesionencoder (le) framework [9]. the le features were further processed using principal component analysis (pca) to reduce dimensionality, and then used as input to a generalized linear model (glm) [10] to predict patient os. the le framework was proposed in a recent work on covid-19 severity assessment and progression prediction [9]. the original le adopted the u-net structure [11], which consists of an encoder and a decoder based on efficientnet [12]. while the encoder learns and captures the lesion features in the input images, the decoder maps the lesion features back to the original image space and generates the segmentation maps. the features learnt by the encoder in the latent space encapsulate rich information about the lesions, and can therefore be used for lesion segmentation as well as for other tasks such as classification and prediction. in this study, we used the vae as the backbone to build the le.
as described in the previous section, three different configurations were applied to the vae model, resulting in three different lesion encoders: le_cartesian (cartesian_v2), le_spherical (spherical) and le_biscartesian (cartesian_v1_bis). the latent variables of the input mri scans extracted by the individual lesion encoders were then used as the features to predict patient os. for each mri scan, a 256-dimensional feature vector was derived. as the high-dimensional feature space tended to lead to overfitting, we used pca to control the feature dimensionality by setting different numbers of principal components (from 2 to 60 in this study) for further analysis. figure 3 shows the joint age and os distribution of the patients in the training cohort. the age distribution, shown at the top of the figure, appears to be normal. the os distribution, on the right side of the figure, is heavily skewed, with the majority of cases having os less than 400 days. to model the tailed distribution of the os values, we therefore used a tweedie distribution [13], a special case of the exponential dispersion models whose skewness can be controlled by a power parameter p (p ∈ [1.1, 1.9] in this study). a glm [10] based on the tweedie distribution, i.e., a tweedie regressor [13], was built to predict os values. the tweedie regressor was implemented using scikit-learn (v0.23.2). as the resection status and age are essential predictors of os, both were merged with the lesionencoder features as input to the tweedie regressor for os prediction. two evaluation schemes were used to assess the prediction performance. the results were first evaluated based on the accuracy of classifying subjects as long-survivors (>15 months / 450 days), short-survivors (<10 months / 300 days), and mid-survivors (survival between 10 and 15 months / 300-450 days).
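a minimal sketch of the os regression pipeline described above — pca for dimensionality reduction followed by scikit-learn's tweedieregressor — on synthetic stand-ins for the lesionencoder features (the feature values, cohort size and hyperparameter choices here are illustrative, not the authors' data):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import TweedieRegressor
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# stand-ins for the 256-d lesionencoder features plus age and resection status
n, d = 235, 256
X_le = rng.normal(size=(n, d))
age = rng.uniform(20, 80, size=(n, 1))
gtr = rng.integers(0, 2, size=(n, 1)).astype(float)   # 1 = gross total resection
X = np.hstack([X_le, age, gtr])

# skewed, strictly positive os target (days), mimicking the tailed distribution
os_days = rng.gamma(shape=1.5, scale=200.0, size=n)

# pca to tame the feature dimension, then a tweedie glm (power in (1, 2), log link)
model = make_pipeline(
    PCA(n_components=10),
    TweedieRegressor(power=1.6, link="log", max_iter=1000),
)
model.fit(X, os_days)
pred = model.predict(X)
print(pred.shape)  # (235,)
```

with the log link the predicted os is always positive, which suits a survival time; the power parameter and number of components would be tuned as described below.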
in addition, a pairwise error analysis between the predicted and actual os (in days) was performed, evaluated using the following metrics: mean square error (mse), median square error (median se), standard deviation of the square errors (std se), and the spearman correlation coefficient (spearman r). amongst the 235 patients in the training set, 118 underwent a surgical gross total resection (gtr) and 10 underwent a subtotal resection (str); in 107 cases, no information about the resection status is available. all of the 29 subjects in the validation set had a gtr resection status. the extent of resection was considered in the model as it has been shown to correlate with post-surgical outcome [14]. cross-validation on the training set: we used 5-fold cross-validation to train and validate the proposed method. an internal validation set (20%) was split from the dataset in each fold, with the remaining 80% as the training set. for each of the three lesion encoders, i.e., le_cartesian, le_spherical and le_biscartesian, this process was repeated 5 times, leading to 5 different sub-models. figure 4 illustrates the projected feature space of the features extracted using le_spherical (a), and the scatter plots (b, c) of the predicted vs. actual os of the training samples. there was high variance in the performance of the sub-models (from 0.362 to 0.574). the prediction results of the 5 sub-models were further aggregated, and the results are summarized in table 3. le_cartesian achieved the highest accuracy (0.494) and spearman r (0.429), while le_spherical had the lowest mse, median se and std se. these two models outperformed le_biscartesian; however, the differences were not substantial (<0.034 in accuracy). prediction performance on the validation set: the 5 sub-models with the same configuration were then applied to the official validation set (n=29) to predict the os of each validation case.
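the two evaluation schemes — three-class survival accuracy and the pairwise squared-error metrics — can be sketched as follows; the example inputs are made up for illustration:

```python
import numpy as np
from scipy.stats import spearmanr

def os_class(days):
    """map os in days to the three brats survival classes."""
    if days < 300:
        return "short"   # < 10 months
    if days <= 450:
        return "mid"     # 10-15 months
    return "long"        # > 15 months

def evaluate_os(pred, actual):
    pred = np.asarray(pred, dtype=float)
    actual = np.asarray(actual, dtype=float)
    se = (pred - actual) ** 2
    acc = np.mean([os_class(p) == os_class(a) for p, a in zip(pred, actual)])
    return {
        "accuracy": acc,
        "mse": se.mean(),
        "median_se": np.median(se),
        "std_se": se.std(),
        "spearman_r": spearmanr(pred, actual).correlation,
    }

m = evaluate_os([100, 350, 600], [120, 500, 700])
print(m["accuracy"])  # 2/3: the mid-range case falls in the wrong class
```

note that a prediction can be close in days yet land in the wrong class (350 vs 500 above), which is why the squared-error metrics and spearman r complement the classification accuracy.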
the 5 predictions of each case were then averaged to derive the final prediction. results of the three models with different configurations are summarized in table 4. the le_biscartesian model achieved the highest accuracy (0.552); however, its mse and std se were higher, and its spearman r lower, than those of the other models. the le_cartesian model had the lowest mse and the highest spearman r, showing a better representation of the overall distribution of the os values. these findings showed a complementary nature of the different models; therefore, we combined their outputs to test whether the prediction performance could be improved further. four combinations were tested, which consistently showed equal or better accuracy (between 0.552 and 0.586) compared to the individual models. our final submission for os prediction on the validation dataset, based on the m1&m2 model, ranked 4th in accuracy among the 42 participating teams. meanwhile, it achieved 5th place in both mse and spearman r, 8th place in median se, and 10th place in std se (checked on 23 october 2020). we further applied the m1&m2&m3 model to the official test dataset (n=107). the model's performance, as shown in table 5, was lower on the test dataset than on the validation dataset, implying a marked difference between the two datasets and overfitting of the model. however, without knowing the results of other models, either our own or those of other participating teams, it is difficult to confirm whether such a performance drop is caused by a less representative training dataset or a less generalizable model, or both. spherical coordinate transformation pre-processing of the input dataset contributes to exploring the data in a different way, thus changing the learning process and yielding different features compared to the classical dcnn model learning process.
these different features can help improve the segmentation process as well as contribute to the deep feature extraction used in patients' os prediction. even if the spherical pre-processing method contributes to improving the baseline model results, simple post-processing methods also have a strong impact on segmentation accuracy. however, the overall segmentation results obtained with this method are not amongst the best compared to other teams on the brats 2020 leaderboard, and additional effort is needed to fine-tune both the cartesian and spherical training phases. the lesionencoder framework extends the use of lesion features beyond conventional lesion segmentation. there is a wealth of information in brain tumors, including shape, texture, location, extent and distribution of involvement of the abnormality, that can be extracted by the lesion encoder. while this has been demonstrated in covid-19 progression prediction [9] and severity assessment [15], here we demonstrated a new application of the le in patient os prediction. it may have strong potential in a wide range of other clinical and research applications, e.g., brain tumor pseudo-progression detection [16] and ophthalmic disease screening [17]. various dimension reduction methods were tested in this study, including pca, independent component analysis (ica) and t-distributed stochastic neighbor embedding (t-sne). in the training phase, pca was found to have lower variability in accuracy than the other methods; as a result, it was chosen to process the high-dimensional features. we used a linear search strategy to optimize the two most important parameters of the os prediction model: the number of principal components in pca (from 2 to 60) and the power of the tweedie distribution (p ∈ [1.1, 1.9]). the optimal parameters for le_cartesian were 10 components with p = 1.6, and 3 components with p = 1.6 for both le_spherical and le_biscartesian.
in addition, it will be important to demonstrate the scale invariance of the tweedie regressor on different datasets in our future work. in conclusion, we have introduced a novel and very promising method to pre-process brain tumor mr images by means of a spherical coordinates transformation, to be used in dcnn models for brain tumor segmentation. the lesionencoder framework has been applied to automatically extract imaging features from dcnn models, demonstrating good performance on the survival prediction task.

references:
- the multimodal brain tumor image segmentation benchmark (brats)
- advancing the cancer genome atlas glioma mri collections with expert segmentation labels and radiomic features
- identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the brats challenge
- the 2016 world health organization classification of tumors of the central nervous system: a summary
- segmentation labels and radiomic features for the pre-operative scans of the tcga-gbm collection
- segmentation labels and radiomic features for the pre-operative scans of the tcga-lgg collection
- spherical coordinates transform pre-processing in deep convolution neural networks for brain tumor segmentation in mri
- 3d mri brain tumor segmentation using autoencoder regularization. in: brainlesion: glioma, multiple sclerosis, stroke and traumatic brain injuries (brainles)
- severity assessment and progression prediction of covid-19 patients based on the lesionencoder framework and chest ct. medrxiv
- generalized linear models, second edition
- convolutional networks for biomedical image segmentation
- efficientnet: rethinking model scaling for convolutional neural networks
- the theory of exponential dispersion models and analysis of deviance. monografias de matemática
- evidence for improving outcome through extent of resection
- severity assessment of covid-19 based on clinical and imaging data.
medrxiv
- a deep learning methodology for differentiating glioma from radiation necrosis using multimodal mri: algorithm development and validation
- a deep learning based algorithm identifies glaucomatous discs using monoscopic fundus photos

key: cord-222868-k3k0iqds authors: goswami, anindya; rajani, sharan; tanksale, atharva title: data-driven option pricing using single and multi-asset supervised learning date: 2020-08-02 journal: nan doi: nan sha: doc_id: 222868 cord_uid: k3k0iqds

we propose three different data-driven approaches for pricing european-style call options using supervised machine-learning algorithms. the proposed approaches are tested on two stock market indices, nifty50 and banknifty, from the indian equity market. although neither historical nor implied volatility is used as an input, the results show that the trained models capture the option pricing mechanism better than, or comparably to, the black-scholes formula in all the experiments. our choice of scale-free i/o allows us to train models using the combined data of multiple different assets from a financial market. this not only allows the models to achieve far better generalization and predictive capability, but also addresses the paucity of data, the primary limitation of using machine learning techniques. we also illustrate the performance of the trained models in the period leading up to the 2020 stock market crash, jan 2019 to april 2020. fair pricing of financial instruments is at the heart of market stability. mispricing securities may cause traders to incur massive losses and can also indirectly affect the financial health of a market. it is thus vital to be able to derive the fair price of tradable financial instruments. the seminal paper [3] laid the foundation of the theory of no-arbitrage option pricing, following which the scope of the theory has been extended by several authors.
however, the fair price of an option contract depends on the current anticipation of the future dynamics of the underlying asset. this is why the authors of [15] argued that the success or failure of theoretical option pricing and hedging is closely tied to success in capturing the dynamics of the underlying asset's price movements. since this is a hard problem, the adoption of data-driven approaches to pricing option contracts is gaining attention with the advent of superior computational power and advancements in statistical learning techniques. in this manuscript, we propose data-driven approaches for prescribing the fair price of an option contract without assuming any particular theoretical law for the underlying asset dynamics. we also propose and illustrate the use of data drawn from multiple assets/sources to train these data-driven option pricing models. this gives us a way to mitigate the possible paucity of data available to train models. we would like to emphasize that the work presented in this study does not attempt to emulate the black-scholes formula or any other theoretical option pricing model. in the past, several authors have investigated the possibility of building a data-driven option pricing model; we give a brief overview of the existing literature. in [20], the authors conveyed their belief that the trading process of option contracts itself may reveal analytical models. the data-driven investigations in [15] and [20] were based on option contracts on the s&p 500. while the former used only the moneyness parameter (the ratio of spot and strike values) and time-to-maturity as inputs to their learning model, the latter also used historical volatility, the interest rate, and lagged prices of the underlying asset and option contract. the authors of [18] obtained better prediction performance than [15] by including the open interest in addition to all the non-lagged inputs of [20].
on the other hand, in [19], s&p 100 data was used to predict the implied volatility instead of the option price, using past volatilities and option-contract parameters. in [17], a variant of implied volatility was used as an input to predict the deviation of the actual market price from the black-scholes price of the option contract; the model performance was illustrated on ao spi index options. if the log return of the underlying asset is independent of the stock price level, the formula for the fair price of an option is homogeneous of degree one in both spot and strike. the authors of [11] implemented this relation in the structure of the neural network and built a model using option contract data of the s&p 500 index. the authors of [5] discuss how a technique named profiling could be used to select the optimal neural network structure to predict the implied volatility; this technique was illustrated on usd/dem exchange rate options, and the model took various contract parameters as inputs. the authors of [22] argued that option contract data should be partitioned according to moneyness in order to improve the accuracy of pricing options, and they illustrated this performance improvement using nikkei 225 index option contracts. in [13] the authors exhibited the effectiveness of cross-validation, bayesian regularization, early stopping and bagging in preventing overfitting and improving generalization in the process of pricing s&p 500 call options using an artificial neural network (ann). the author of [1] attempted to predict the bid-ask spread of options on the omx stockholm 30 index, using multiple lagged asset prices and their sample standard deviations. in [2], the authors used the dividend rate in addition to black-scholes-based features to price option contracts on the ftse 100 index; the model performance was compared with the black-scholes-merton price that incorporates dividends.
the authors of [12] used s&p 500 option contract data and developed a "modular" ann model for option price prediction. in particular, they divided the data set into 9 disjoint parts, or modules, according to the moneyness and time-to-maturity parameters of the contracts. a similar modularity is adopted in [7], where the authors build a hybrid model using banknifty option contracts. some of the previously mentioned papers have prescribed data-driven option hedging strategies, while some others have also demonstrated success in predicting the price of exotic options using their model outputs. the above survey is not meant to be exhaustive but conveys the broadly accepted methodologies for developing supervised learning models to price options. this manuscript borrows aspects like the homogeneity hint and modularity from the existing literature. in this manuscript, we propose three different approaches to generate feature sets from the market data, each of which yields 17-22 features. each feature set is then used to train two models, using an ann and the xgboost algorithm respectively. none of the approaches includes measures of volatility as features. however, we assume that the statistical distribution of the underlying asset's returns is independent of the level of the stock price (s). this implies that the option price function is homogeneous of degree one in both the spot price (s) and the strike price (k). in view of this, we construct feature sets using the underlying asset's log returns, the moneyness (s/k), and the time to maturity. furthermore, the output variable is constructed using the ratio (c/k × 100) of the option price (c) to the strike price (k). the fair price of an option contract must depend on the anticipated statistical distribution of the future price of the underlying asset.
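the homogeneity property motivating the scale-free i/o can be verified numerically with the black-scholes formula: c(λs, λk) = λ · c(s, k), so scaling spot and strike together leaves moneyness s/k and the target 100·c/k unchanged. a sketch (parameter values are arbitrary):

```python
import math

def bs_call(S, K, T, r, sigma):
    """black-scholes price of a european call option."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))  # standard normal cdf
    return S * N(d1) - K * math.exp(-r * T) * N(d2)

# homogeneity of degree one: doubling both spot and strike doubles the price
c = bs_call(100.0, 95.0, 0.5, 0.05, 0.2)
c_scaled = bs_call(200.0, 190.0, 0.5, 0.05, 0.2)
print(abs(c_scaled - 2.0 * c) < 1e-10)  # True

# hence the scale-free i/o: moneyness s/k in, 100*c/k out — identical at both scales
target_1 = 100.0 * c / 95.0
target_2 = 100.0 * c_scaled / 190.0
```

because both the input moneyness and the output 100·c/k are invariant under this rescaling, contracts from differently priced assets become directly comparable training samples.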
we try to incorporate this principle using a non-parametric approach, wherein we consider a fixed number of consecutive order statistics of the log returns of the daily underlying close prices as features. we compare the performance of this approach with another approach, wherein the feature set consists of only the first two moments of the log returns of the underlying asset's daily open-high-low-close prices. both approaches appear to be equally effective. finally, we compare these two approaches with a third approach, which augments the features from the second approach with a few additional features derived from the historical option price data. this particular approach outperforms the previous two, as the option price data contains significant additional information relevant to the present-day option price. to the best of our knowledge, option pricing models using these feature sets have not been reported in the literature so far. in the proposed data-driven approaches, disjoint consecutive intervals of the option contract price are set as the output instead of a single predicted option price, as we believe that no real market is complete. in other words, a random payoff such as an option contract may have multiple fair prices, and a single predicted price is more confusing than convincing. hence we define the output variable in a manner that conveys the range of fair prices. we measure and compare the performance of the models described in the manuscript using two different error metrics. the first proposed error metric attempts to mimic the mean absolute error (mae), while the second metric gives the probability that the predicted option price fails to lie within a certain neighborhood of the actual option price. we also compare the performance of the proposed models with the theoretical black-scholes option pricing model.
it is observed that the models constructed using the third approach outperform the black-scholes pricing formula in terms of the above mentioned metrics, whereas the other proposed models perform equivalently, if not better. again, we would like to emphasize that neither historical nor implied volatility is used as an input in any of the proposed models. we would also like to emphasize that none of the features were selected based on importance analysis, as the process of determining feature importance essentially depends on the particular choice of the training data used. despite maintaining such indifference, the success in predicting option prices indicates that perhaps these data-driven models are capable of learning certain universal rules of option pricing. we also ensure that the inputs and outputs of the models are scale-free, which allows us to investigate whether models could be trained on option contract data from two different assets/sources. this, in principle, would allow us to construct models that can capture the option pricing mechanism for a broader range of underlying asset dynamics. our experiments show that the models trained using data from multiple assets/sources possess superior option pricing capabilities compared to the models trained on individual assets/sources. these experiments have been performed using nifty50 and banknifty option price data. however, since we have not experimented with a sufficiently broad class of assets, the complete scope and the limitations of this technique (referred to as combined training) are still unclear. nevertheless, we propose a methodology to gain a deeper understanding of the combined training effect than what the error metrics offer. in this method, for a trained model, we perform a family of tests using simulated black-scholes option price data with varying volatility.
results show that the simple idea of combined training produces models that predict the option price fairly well for a wide range of underlying asset price dynamics. in other words, we observe domain adaptability for a wide variety of simulation data, clearly indicating the effectiveness of the combined training technique. drawing from the modularity approach proposed by [22], [12] and [7], we choose to train our models on a particular subset of the contract data. to elaborate, we perform our experiments on a "filtered" dataset comprising only near-atm (at-the-money) contracts. the "filtered" dataset also excludes option contracts that have either too short or too long time-to-maturity values. we believe that including a full range of modularity, as in [22], [12], and [7], would complicate the exposition of this paper with too many experiments, as we study six different models constructed using three approaches and two algorithms, on two different assets/sources. this paper is organized into eight sections. the second section briefly presents the basics of supervised learning, and explains the two supervised learning algorithms used to construct the models. section 3 contains details about the data under consideration. the inputs and outputs of the learning models are explained in section 4. in section 5 we report the performance of the trained models. an analysis of the combined-trained models' performance is presented in section 6. the performance of the models on 2019-2020 data is given in section 7. finally, we comment on future research directions in the last section. attempts to develop algorithms that are capable of performing a task without explicitly specifying the expected outcome have led to the development of the field of machine learning. this manuscript leverages a specific subset of machine learning algorithms, known as supervised learning algorithms. these algorithms take in labelled data as input and "learn" the task at hand.
the term "learn" implies that the algorithms construct abstract representations of the data with the aim of capturing patterns that are fundamental to the task at hand. in the following subsections, we briefly describe two supervised learning algorithms, namely extreme gradient boosting (xgboost) and artificial neural networks (ann). these algorithms are used in the later sections of this manuscript. before studying the specifics of the algorithms, it is instructive to understand the general premise of supervised learning. consider a finite labelled dataset represented as {(x_1, y_1), (x_2, y_2), (x_3, y_3), ..., (x_J, y_J)}, where each vector x_j is associated with a label y_j. the algorithms attempt to find a mapping f : x_j → y_j such that the mapping obtained is the "best" out of all the possible mappings. a qualitative assessment of the mapping (also referred to as a model) is made possible by an "objective" function (also known as a "loss function"). the specifics of the objective function and the strategy used to create the mappings vary with the choice of the algorithm. 2.1. extreme gradient boosting. developed by tianqi chen in 2016 (refer [6]), extreme gradient boosting combines two powerful techniques, namely "boosting" and "gradient descent". it builds upon the gradient boosting decision tree algorithms developed by friedman in 2001 (refer [9]) and 2002 (refer [10]). gradient boosting involves constructing an ensemble of "weak" learners, which in the case of xgboost are decision trees. these "weak" learners are combined in an iterative fashion to obtain a "strong" learner. a "weak" learner is a model whose prediction accuracy is only slightly better than that of a model making random predictions. refer to [8] for more details on how "weak" learners can be combined to create "strong" learners. a typical classification task involves categorizing an input to its label (or class).
successfully performing classification requires the model to determine a close approximation of the true conditional probabilities of the classes, given an input. the xgboost algorithm, for a set of n output classes, assigns a score f_i(x) to the i-th class for the input x, and we define f(x) := (f_1(x), f_2(x), ..., f_n(x)). the scores obtained are then used to calculate the probability of each class to be the predicted class using the softmax function p(x), whose i-th component is p_i(x) = exp(f_i(x)) / Σ_{k=1}^{n} exp(f_k(x)). the xgboost algorithm then computes the "objective" (or loss) function value for each input x by determining how far the distribution of the predicted values is from the true distribution. this is done using categorical cross entropy (ce), a loss function defined as ce(z, p(x)) = −Σ_{i=1}^{n} z_i log p_i(x), where z := (z_1, z_2, ..., z_n) is a given p.m.f. of the true outputs. the xgboost algorithm seeks to minimize the value of this loss function over all possible f(x) based on the training set of J input-output pairs, {(x_j, y_j) | j = 1, 2, ..., J}. these pairs are used to compute the value of z^(j) for each j, such that z_i^(j) = 1 if y_j is the i-th class and z_i^(j) = 0 otherwise. at iteration m, the pseudo-residuals r_j^(m) of the loss with respect to the current model f^(m−1)(x_j) are computed, and the weak learner h^(m)(x) is then fit to the training dataset {(x_j, r_j^(m))}_{j=1}^{J}. the algorithm then computes the multiplier α^(m) as the step size minimizing the loss along h^(m), α^(m) = argmin_α Σ_{j=1}^{J} ce(z^(j), softmax(f^(m−1)(x_j) + α h^(m)(x_j))), and this multiplier is used to update the model/score by the scheme f^(m)(x) = f^(m−1)(x) + α^(m) h^(m)(x). the xgboost algorithm thus yields a strong learner by combining m weak learners in order to obtain a close approximation to the true probability distribution. the reader is encouraged to consult the cited references, as this exposition is not meant to be comprehensive. 2.2. artificial neural network. developments in the field of machine learning led to the advent of algorithms that sought to mimic biological neural networks. these algorithms (referred to as anns) attempt to harness the ability of biological networks to learn patterns within data. this manuscript presents a brief overview of a special type of ann known as the feed forward neural network.
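the softmax and categorical cross entropy computations used above can be sketched directly; this is a minimal illustration in our own notation, not code from the xgboost library:

```python
import math

def softmax(scores):
    """p_i(x) = exp(f_i(x)) / sum_k exp(f_k(x)), computed stably
    by subtracting the maximum score first."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(z, p):
    """ce(z, p) = -sum_i z_i * log(p_i), where z is the true p.m.f.
    (a one-hot vector when labels are hard classes)."""
    return -sum(zi * math.log(pi) for zi, pi in zip(z, p) if zi > 0)

# class scores f(x) for n = 3 classes; class 0 has the largest score
p = softmax([2.0, 1.0, 0.1])
loss = cross_entropy([1, 0, 0], p)  # small when p[0] is close to 1
```

gradient boosting then repeatedly fits a new tree to the gradient of this loss and adds it to the running score vector, as described in the text.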
we use feed forward neural networks for the experiments proposed in the later sections to classify structured data inputs. the reader is referred to [14] for a comprehensive study of anns. a neural network is a set of "neurons" that interact with each other to "learn" the representation space of the input data. figure 2 shows the structure of a neuron. as can be seen in figure 2, the output η of a neuron is given by η = f(Σ_{i=1}^{n} w_i ψ_i + b), where ψ = (ψ_1, ψ_2, ..., ψ_n) are the inputs to the neuron, w_i is the weight associated with each input ψ_i and b is the overall bias associated with the neuron; the function f is called the activation function and is used to impart non-linearity to the neural network. as evident from figure 1, a feed forward neural network consists of a number of "layers" of stacked neurons. each neuron in a layer is connected to every neuron in the next layer; thus the outputs of the neurons in the preceding layer act as the inputs to the neurons in the next layer. as stated earlier, each "connection" between any pair of neurons has a weight w associated with it. the optimal number of layers in a neural network and the number of neurons in each layer are to be determined for each given problem, and together they are referred to as the architecture of the neural network. along with this, it is also necessary to determine the appropriate activation function for each of the neurons, as well as the optimization scheme to be used. the architecture of the ann used in the present study is given in table 1.

table 1. "architecture" of the neural net used
layer 1: 128 neurons, relu activation
layer 2: 64 neurons, relu activation
layer 3: 50 neurons, softmax activation

the activation function used for each layer is indicated in table 1. the relu activation function is defined as relu: f(x) = max(0, x); the softmax function, as explained previously (refer equation (1)), gives the class probabilities.
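the single-neuron computation η = f(Σ w_i ψ_i + b) described above can be sketched as follows; the weights, bias and inputs here are made-up numbers for illustration only:

```python
def relu(x: float) -> float:
    """ReLU activation: f(x) = max(0, x)."""
    return max(0.0, x)

def neuron(psi, w, b, f=relu):
    """Output of one neuron: activation applied to the weighted sum
    of inputs psi with weights w, plus bias b."""
    return f(sum(wi * pi for wi, pi in zip(w, psi)) + b)

eta = neuron(psi=[1.0, -2.0, 0.5], w=[0.3, 0.1, 0.4], b=0.05)
# 0.3*1.0 + 0.1*(-2.0) + 0.4*0.5 + 0.05 = 0.35, and relu(0.35) = 0.35
```

a full feed forward layer simply evaluates many such neurons on the same inputs, and stacking three layers with 128, 64 and 50 neurons reproduces the architecture of table 1.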
we use the categorical cross entropy loss function (refer equation (2)) to determine how far the true probability distribution is from the distribution of the predicted values. in order to "learn" a given task, the set of weights that minimizes the loss function is to be found, as this corresponds to a higher prediction accuracy of the neural network. this is achieved by optimizing the weights using an optimization scheme (a process commonly known as training the network). in the present study, we use the adam optimizer, an advancement of the stochastic gradient descent optimizer (refer [16]). we aim to model the pricing mechanism of option contracts that are traded in a financial market. nse, an indian stock exchange, facilitates the trading of option derivatives on stocks and stock indices in high volumes. markets with high trading volumes generally imply a high level of trader participation, which in turn implies a lower chance of the market being imperfect (i.e., the market is efficient). this also allows us to consider the traded price of the derivative as the "fair" price. persistent high trading volumes for a particular range of option contracts give us a better chance to "learn" the pricing mechanism of those option contracts. some of the nse based stock indices with high option contract trade volumes are the nifty50 and banknifty. for our experimentation, we extract the daily contract price data of call options for both nifty50 and banknifty. data is extracted for the years 2015 to 2018 (4 years of data) from the contract-wise archive section of the nse website. it is then ensured that the data set obtained is purged of contracts that are not traded. for reasons related to the construction of the models, we add a new column to the filtered dataset that records the close price of the same option on the previous day. if the option contract did not exist on the previous day, we report the value 0 in this new column.
we subsequently screen the data to remove all rows that have a zero in the new column. we then add more columns to the data array to include the "open", "high", "low" and "close" prices of the underlying asset for the past 20 days corresponding to each row. furthermore, we add an additional column that represents the three months' government bond yield (see section 4.2). we then select the option contracts that are in the vicinity of at-the-money (atm) contracts. to be more precise, we only select those contracts for which the quantity |1 − s/k| is not more than the pre-decided value of 0.04, where k and s are the strike and the spot prices respectively. we refer to such contracts as near-atm option contracts. it has been observed that numerous near-atm option contracts are traded every day with identical or different times to maturity. however, significantly low trading volume is observed for contracts with very large or very small times to maturity. hence we choose to study only those contracts whose time-to-maturity values are not more than 45 days and not less than 3 days. figure 3 is an indicative sample of the nifty50 option contract dataset that we obtain from the nse. in order to build a predictive model using the algorithms described in section 2, the dataset needs to be split into separate datasets used to train and evaluate the trained models. most supervised learning algorithms, when trained with time series data, necessitate splitting the dataset linearly, as the individual observations are not independent. in the same vein, we split the dataset into two parts according to the timestamp. the first 33 months, i.e., data from jan 2015 to sept 2017, form the training dataset, and the succeeding data, i.e. from oct 2017 to dec 2018, forms the test dataset for evaluating the proposed models. table 2 shows the number of datapoints we deal with at every step of the model building and evaluation process.
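the contract filter described above (near-atm with |1 − s/k| ≤ 0.04 and time to maturity between 3 and 45 days) can be sketched as follows; the field names and the sample records are hypothetical:

```python
def keep_contract(spot, strike, tau_days,
                  max_moneyness_dev=0.04, min_tau=3, max_tau=45):
    """True iff the contract is near-ATM and has an acceptable maturity."""
    near_atm = abs(1.0 - spot / strike) <= max_moneyness_dev
    tau_ok = min_tau <= tau_days <= max_tau
    return near_atm and tau_ok

contracts = [
    {"s": 10150, "k": 10000, "tau": 20},  # near-ATM, mid maturity -> keep
    {"s": 9000,  "k": 10000, "tau": 20},  # |1 - s/k| = 0.1 > 0.04 -> drop
    {"s": 10150, "k": 10000, "tau": 60},  # maturity too long -> drop
]
kept = [c for c in contracts if keep_contract(c["s"], c["k"], c["tau"])]
```

in practice this filter would be applied row-wise to the full nse contract archive before the train/test split by timestamp.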
as mentioned previously, this study aims to develop supervised learning models that can "learn" the market-perceived pricing of option contracts and give us the fair price of an option contract in accordance with past market behaviour. in order to develop supervised machine learning models (refer section 2), we need to train the models with a set of "inputs" and "outputs". sections 4.2, 4.3 and 4.4 describe the different feature sets, each of which we intend to use as inputs to the supervised learning algorithms. these feature sets are derived from the information available to market participants. before describing each of the feature sets, we explain the desired format of the output variable, which is kept uniform across all the approaches. 4.1. categorical output variable. as for the output of the proposed data-driven option pricing models, using the option contract prices obtained directly from the market would not be prudent. this is because, for contracts with a fixed value of moneyness, the magnitude of contract parameters like "strike" and "spot" prices may vary over the years. it makes much more sense to create an output variable that is scale free. we therefore define the "output" as the ratio, expressed as a percentage, of the close price (c) to the strike price (k) of the contract, i.e. we designate 100 × c/k as the output variable. this ratio serves as a scale-free proxy for the contract price. since the "output" variable is continuous, it is natural to formulate the problem using a regression model. however, since no real market is complete, a single predicted price of an option contract is more confusing than convincing. indeed, the fair price could be anything in a certain interval. determining this interval is a hard problem from both the theoretical and the empirical aspects.
instead of finding such an interval of the fair price, selecting the most likely interval from a pre-determined set of non-overlapping consecutive intervals is fairly straightforward. one can divide the range of outputs into non-overlapping "bins" and select the "embracing" bin as the output variable. however, a major hurdle in this approach is determining the width of the bin. the larger the width of each bin, the less useful the model is, due to lack of precision. on the other hand, a finer binning confuses the model due to the presence of a certain degree of indocile uncertainty in the option trading price, which can be attributed to the lack of completeness in the market. the most straightforward way to tackle this quandary would be to formulate and optimize an appropriate loss function. instead of adopting such an objective approach, which essentially depends on the type of data and the model used, we first introduce a binning-insensitive performance measure for the models. we refer to this measure as em; subsection 5.1 (refer equation (5)) gives a description of the proposed metric. we then study the values of em obtained for different bin widths, for a fixed dataset and a fixed model type. depending on the persistent stability of em and the gain in precision, we decide the bin width to be used. figure 4 shows the results of the procedure used to determine the interval width. we observe that for bin widths larger than 0.1, the supposedly bin-insensitive measure decreases drastically. this is expected, as larger bin widths imply a smaller number of classes, which makes classification easier for the models due to the increased imprecision. for bin widths less than 0.075, a certain monotonicity appears; but for bin widths roughly between 0.075 and 0.1, the em value is insensitive to the binning.
this manuscript uses a bin width of 0.1 and partitions the entire range of output values in the manner explained in the following paragraph. the interval ((n − 1)w, nw] is set as the n-th bin, where n is a natural number and w (here w = 0.1) is the bin width. this creates a set of equispaced bins, allowing us to map option contracts to their respective bins by computing the value of 100 × c/k for each contract and assigning the corresponding integer-valued bin number to it as its label. these labels are then considered as the ordinal output variables and are used to train and test the constructed models. we illustrate this binning in figure 5. the figure is a histogram (plotted using 0.1 as the bin width) of the 100 × c/k values for the filtered nifty50 contract dataset. it is evident from the plot that there are enough data points per bin and yet enough categories, i.e. bins, to make the model robust. the above procedure of binning is rather subjective and is not meant to be precise, for a vital reason: any precise data-driven optimization depends upon the choice of the data and the model, whereas we wish to fix the binning regardless of the choice of the model or the dataset. this is because the binning defines the output variable in the training and test datasets for each model, and we wish to keep all the models comparable. without identical binning, it would not be possible to combine or compare models trained on different datasets. 4.1.1. remark. the following subsections (4.2, 4.3 and 4.4) describe three separate, independent "approaches" used to generate feature sets that serve as inputs to the supervised learning algorithms described in section 2. here the term "approach" is used to convey the motivation/idea behind generating the feature sets (see table 3). 4.2. approach i.
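the mapping from 100 × c/k to the bin label described above can be sketched as follows; the small rounding guard is our own addition to avoid floating-point edge cases at bin boundaries:

```python
import math

W = 0.1  # bin width used in the paper

def bin_label(c: float, k: float, w: float = W) -> int:
    """Map 100*c/k to the bin number n, where ((n-1)*w, n*w] is the n-th bin.
    The round(..., 9) guards against float noise at exact bin boundaries."""
    v = 100.0 * c / k
    return math.ceil(round(v / w, 9))

# example: c = 210, k = 10000 gives 100*c/k = 2.1,
# which lies in bin 21 since (2.0, 2.1] = ((21-1)*0.1, 21*0.1]
label = bin_label(c=210, k=10000)
```

note that a value falling exactly on a boundary n*w is assigned to bin n, matching the half-open interval convention ((n − 1)w, nw] in the text.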
from the very definition of an option contract, it is known that the fair price of a call option must depend on the values of the option contract parameters (like the strike price (k) and the time to maturity (τ)), the risk free interest rate (r), the spot price (s) and the anticipated statistical behavior of the future dynamics of the underlying asset. the closest real-world approximation of the value of r is the government bond yield. amongst the parameters available to the practitioner, it is natural to hypothesize that the most important determinants of the option contract's value are the present value of the underlying security and the price dynamics it followed over the past few days. directly using the past asset price data as features would make the values scale dependent, especially so when data over many years is considered for model training. as a means to resolve the scale dependency, the log returns of the time series (henceforth referred to as lr) are considered, given by lr_i = ln(s_i / s_{i−1}), where s_i is the i-th term of a time series s. in order to obtain a non-parametric inference of the recent distribution of log returns, we calculate the order statistics of the log returns. this is done by computing the log returns of the daily close prices of the underlying asset over a window of the past 20 trading days, as this corresponds to approximately a calendar month excluding all holidays. following this, the order statistics are computed by simply arranging the log returns in ascending order for each sample. to be more precise, if x_(i) denotes the i-th order statistic of a sample of distinct real values (x_1, x_2, x_3, ..., x_n), then x_(i) = x_j for some j = 1, ..., n, and x_(1) < x_(2) < ... < x_(n) holds.
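the log-return and order-statistics computation above can be sketched as follows; the price series here is made up, and the function names are our own:

```python
import math

def log_returns(prices):
    """lr_i = ln(s_i / s_{i-1}) for consecutive terms of the price series."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def order_statistics(xs):
    """x_(1) <= x_(2) <= ... <= x_(n): the sample arranged in ascending order."""
    return sorted(xs)

# 20 daily close prices -> 19 log returns -> 19 order statistics (approach i)
closes = [100 + i + ((-1) ** i) * 0.5 for i in range(20)]  # made-up series
feats = order_statistics(log_returns(closes))
```

appending the time to maturity, the bond yield and the moneyness to these 19 values gives the 22-feature row of approach i listed in the next paragraph.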
in view of the preceding discussions, we calculate the order statistics of historical log returns for each of the near-atm option contracts, resulting in a row of 22 features as given below: (1) the 19 log return order statistics. (2) the time to maturity (τ) of the option contract. (3) the interest rate r: we use the 3 month sovereign bond yield rates as an approximation of the risk free interest rates. (4) moneyness: this quantity is computed as s/k (the ratio of spot to strike prices). a collection of such rows is what constitutes the train/test dataset. 4.3. approach ii. this subsection proposes a feature set that takes into account the market participant's access to other facets of the asset price data. intuitively, a lot more information on asset dynamics can be gleaned by taking into account the values of "open", "high", and "low" along with the values of "close" (refer figure 6, which shows a cross section of the underlying asset price dataset). however, this intuitive anticipation deserves a quantitative backing. let us first understand the need for a completely new "approach". the previous subsection attempted to generate a feature set that captures the empirical distribution of the "close" price data of the underlying asset. the present subsection seeks to remedy the fact that the asset price data obtained from the market consists of multiple facets that haven't been accounted for in approach i. the joint distribution of these four time series ("open", "high", "low" and "close") cannot be inferred from the order statistics of each individual time series, as they are not independent. this renders a direct mimicking of approach i ineffective. moreover, a direct extension of approach i would lead to a feature set with 19 × 4 = 76 features. this bloating up of the feature set would prevent any meaningful comparison between different models.
it is therefore prudent to adopt a moments-based approach to generate a feature set that is sensitive to all facets of the underlying asset data. instead of trying to obtain an empirical distribution of the multivariate time series, we measure the central tendency and the dispersion using the first raw moment and the covariance matrix σ of the component-wise log returns of the vector-valued series. the feature set is then built using these statistics. as σ is symmetric, the six entries in the upper triangular part are repeated in the lower part. we include sign-preserving square roots of the entries of σ in the feature set after discarding the repetitions. thus we build the second feature set using the following 17 features: (1) the means of the log return series: μ_o, μ_h, μ_l and μ_c. (2) ten statistics from σ, namely σ_ij / √|σ_ij| for 1 ≤ i ≤ j ≤ 4, where σ_ij is the (i, j)-th element of σ, using the convention x/√|x| = 0 iff x = 0. (3) features (2)-(4) from approach i. 4.4. approach iii. approaches i and ii primarily utilize the underlying asset price data to derive the set of features. however, a market participant also has access to the historical option contract trade prices, and it would be imprudent not to develop an approach that factors in this key aspect. in fact, including the historical option contract trade prices in an appropriate form helps the supervised learning algorithms develop abstract representations of market factors like implied volatility, allowing them to predict the option contract price more accurately. we would like to stress that the intent of approach iii is to build upon the progress made in approaches i and ii. we cannot use an extension of approach i for the reasons mentioned previously.
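the sign-preserving square-root features of approach ii can be sketched as follows; the tiny three-day return series below is fabricated purely to show the shape of the computation (4 means plus 10 distinct covariance statistics):

```python
import math

def signed_sqrt(x: float) -> float:
    """x / sqrt(|x|), with the convention that the value is 0 iff x == 0.
    Equivalent to sign(x) * sqrt(|x|)."""
    return 0.0 if x == 0 else x / math.sqrt(abs(x))

def covariance(xs, ys):
    """Sample covariance of two equal-length return series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (n - 1)

# made-up log returns of the open/high/low/close series
series = {"o": [0.01, -0.02, 0.005], "h": [0.012, -0.018, 0.004],
          "l": [0.008, -0.022, 0.006], "c": [0.011, -0.02, 0.005]}
keys = ["o", "h", "l", "c"]
means = [sum(series[k]) / len(series[k]) for k in keys]        # 4 features
cov_feats = [signed_sqrt(covariance(series[a], series[b]))     # 10 features
             for i, a in enumerate(keys) for b in keys[i:]]
```

the signed square root keeps the covariance features on the same scale as the log returns themselves while preserving the sign of the co-movement.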
we instead seek to augment the feature set developed in approach ii by adding the features listed below: (1) previous option price (scaled): this is computed as c_{t−1}/k, where c_{t−1} is the previously reported close price of the option contract under study and k is the strike price of the contract. including this feature helps account for any auto-regressive characteristics that might be present in the option price data. (2) mean moneyness: computed as s̄/k, where s̄ is the mean of the underlying asset prices (over a window of the past 20 trading days) and k is the strike price of the contract. table 3 summarizes the features used by the three approaches described in this section, and figure 7 presents an overview of the steps that constitute the process of model building. once a model is trained, it is imperative to test the performance of the model on data that has not been used for training (i.e. the test dataset) and study the quality of the predictions. the most common way to evaluate predictions of nominal variables is to find the value of the accuracy metric a, defined as a = c/t, where c is the number of correct predictions and t is the total number of predictions. it is, however, not ideal to use the accuracy metric for an ordinal output variable having a wide range. in such cases, one can examine the quality of the incorrect predictions by measuring the distance between the actual and the predicted classes. doing so is meaningful because it is desirable for a good model to predict a class identical to or very close to the actual class. in contrast, the accuracy metric treats all incorrect predictions in the same manner, regardless of whether the predicted class is close to or far from the actual class.
it is therefore important to come up with a metric that does a better job of informing us about the quality of the incorrect predictions. we define em = (1/t) Σ_{i=1}^{t} w |c_i − p_i|, where w denotes the bin width, t is the number of contracts in the test dataset and the ordinal variables c_i and p_i denote the actual and the model-predicted bin numbers respectively. as mentioned in subsection 4.1, we set the value of w to 0.1. multiplying the bin-number difference by the bin width makes em asymptotically insensitive to the binning. we illustrate the implication of em in figure 8, which gives an example of the case where the distance between the actual and the predicted classes is 2. it can easily be proved that em converges to the mean absolute error (mae) as the bin width tends to 0. however, the mae metric is known to be sensitive to outliers. hence, in order to get a better insight into the performance of the models, we also consider an additional metric, the "inaccuracy metric", which is robust to outliers. the "inaccuracy metric" (ρ) gives the probability of the predicted and actual bins lying more than 2 bins apart. in other words, the metric ρ gives the probability that the model fails to include the actual price bin (labelled c_i) in a band of five consecutive bins with the predicted bin (labelled p_i) in the middle. henceforth we refer to this band as the predicted band (see figure 9). the ρ metric is defined as ρ = (1/t) Σ_{i=1}^{t} 1{|c_i − p_i| > 2}. while em is a measure of prediction imprecision, the empirical quantiles of the error c_i − p_i give the confidence interval of c_i using the prediction p_i. in particular, 1 − ρ denotes the confidence of c_i being in [p_i − 2, p_i + 2]. nifty50 index option data. table 4 (model evaluation metrics for models trained and tested on nifty50 options contract price data) lists the em and ρ values for all models that are trained and tested using nifty50 data. the results reported in table 4 convey that all trained models perform at par with or better than the pricing formula of the black-scholes model (we use the historical volatility values observed over a window of the past 20 trading days to compute the black-scholes price). we also note that, in comparison to xgboost, the use of ann results in lower values of em and ρ. table 4 also shows that the values of the metrics do not differ significantly between approach i and approach ii. this indicates that the two supervised learning algorithms were unable to extract more information on the asset dynamics from the first two moments of the open-high-low-close (ohlc) data than from the close price data alone. from the results, it is also clear that the performance of approach iii is far superior to that of approaches i and ii, which indicates that the historical option price data contains valuable information relevant to the current option price. it is evident from table 4 that in all cases the em value is less than 0.19. loosely speaking, this implies that on average, the predicted value of 100 × c/k is not further than 0.19 from the actual value (refer to equation (5)). in other words, the difference between the actual and predicted option prices is, on average, less than 0.0019 × k. a more precise statement in terms of confidence intervals can be made using the empirical quantiles (refer to figure 10). the 2% and 98% quantiles of c_i − p_i obtained using the approach i ann model for nifty50 data are −5 and 5 respectively. this implies that the actual price bin is within 5 neighbouring bins of the predicted bin with 96% probability for the test dataset. similarly, from the other quantile values we can also deduce that the actual price bin is within 2 neighbouring bins of the predicted bin with 74% confidence. figure 10 illustrates this using a plot of the empirical cdf of c_i − p_i. indeed, the ρ metric is useful in this regard.
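the two error metrics, em and ρ, can be sketched directly from their definitions; the bin labels in the example are fabricated for illustration:

```python
def em(actual_bins, predicted_bins, w=0.1):
    """em = (1/t) * sum_i w * |c_i - p_i|; converges to MAE as w -> 0."""
    t = len(actual_bins)
    return sum(w * abs(c - p) for c, p in zip(actual_bins, predicted_bins)) / t

def rho(actual_bins, predicted_bins):
    """Fraction of predictions where the actual bin lies more than 2 bins
    away from the predicted bin (i.e., outside the 5-bin predicted band)."""
    t = len(actual_bins)
    return sum(abs(c - p) > 2 for c, p in zip(actual_bins, predicted_bins)) / t

c = [10, 12, 15, 20]  # actual bin numbers
p = [10, 13, 18, 20]  # predicted bin numbers
# em = 0.1 * (0 + 1 + 3 + 0) / 4 = 0.1, and rho = 1/4 since only |15-18| > 2
```

note that 1 − rho is then the empirical confidence that the actual bin falls inside the predicted band, as stated in the text.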
To be more precise, the difference between the actual and predicted option price intervals is less than 2K/1000 with probability 1 − ρ (refer to Equation (6)). Thus an interval of length 5K/1000 (the predicted band) can be obtained from a model prediction, which succeeds in containing the close price of the option (having strike price K) with probability 1 − ρ (refer to Figure 9). We recall from Figure 5 that this predicted band width is less than one tenth of the full range of option prices for the NIFTY50 data under consideration.

We consider the Approach III ANN models to further illustrate the implication of the predicted bands. For this, we first identify the upper and lower limit option prices of the band and compute the corresponding daily implied volatility values for each contract. From these values, we obtain the daily averaged predicted implied volatility band. We then compute the average market-realized implied volatility for each day using near-ATM options data and compare it with the predicted implied volatility band. A time series plot of that comparison is presented in Figure 11. The figure shows that 90% of the time the market-realized implied volatility lies within the predicted band. It is not surprising that this band prediction error is only 0.10, a value much less than the ρ value for Approach III in Table 4. The main reason behind the observed error reduction is the presence of averaging in the computation. This indicates the possibility of building a superior hybrid model by exploiting such an averaging effect; however, we do not attempt to build such models in the present study.

BANKNIFTY index option data. Table 5 lists the performance of the models that were trained and tested on BANKNIFTY index data.
The data processing, feature-set generation and train-test splitting for the BANKNIFTY options dataset are done in exactly the same way as for the NIFTY50 index option data, in accordance with the methodologies laid down in Sections 3 and 4. It can clearly be seen that the values of EM and ρ are the lowest for the Approach III models. The evaluation metrics for Approach III are also lower than those for the Black-Scholes formula. Using the trained models and the results shown in Table 5, an analysis similar to the one performed for the NIFTY50-trained models can be carried out, but we avoid repetitive explanation.

Figure 11: Average empirical IV and the predicted IV band, plotted for the NIFTY50 test dataset.

Table 5: Model evaluation metrics (EM, ρ) for models trained and tested on BANKNIFTY options contract price data.

From the results shown in Tables 4 and 5, it is evident that the Approach III ANN models perform significantly better than all other proposed models. Furthermore, they are far more accurate than what the Black-Scholes formula can prescribe. Having said so, it is also important to recall that no measure of volatility has been fed into any of the proposed models. We also present a set of experiments that shows the promise of ensemble modeling.

5.3. Ensemble models. The predictions of the two pricing models obtained using the ANN and XGBoost for each approach can be averaged to obtain a new prediction. We refer to this as the prediction of a simple ensemble model. The rationale behind this approach is straightforward: it is plausible that, for a particular approach, the XGBoost model learns a subset of the representation space very well but does not learn some other subsets well enough. The ANN model could hypothetically learn those missed subsets of the representation space better than the XGBoost model is capable of.
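The simple ensemble — averaging the ANN and XGBoost bin predictions — can be sketched in a couple of lines; a minimal illustration with toy bin labels, not the actual model outputs:

```python
def ensemble_predict(ann_bins, xgb_bins):
    """Simple ensemble: average the bin predictions of the two models.

    The averaged "prediction" need not be an integer bin label; it can
    be an integer multiple of 1/2, which the EM and rho metrics handle
    without any change to their computation.
    """
    return [(a + b) / 2 for a, b in zip(ann_bins, xgb_bins)]

preds = ensemble_predict([10, 13, 7], [12, 13, 8])
```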
By averaging the predictions of the models, we seek to minimize the number of subsets over which the individual models perform poorly. Averaging the model predictions allows us to leverage the well-learnt portions of the representation space of both models at the same time. We evaluate the performance of the ensemble models by computing the EM and ρ values for the test sets. Tables 6 and 7 present the model evaluation metric values for the ensemble models trained and tested on NIFTY50 and BANKNIFTY contracts respectively. The results in Tables 6 and 7 show a marked improvement in the EM values for all the approaches when compared to the results in Tables 4 and 5 respectively.

Table 7: Model evaluation metrics for ensemble-averaged models trained and tested on BANKNIFTY option contracts.

Remarks: It is important to note that the "predictions" (p) of the ensemble model need not be integer class labels but could instead be integer multiples of 1/2. However, no change is needed in the computation scheme of the model evaluation metrics.

5.4. Models trained with multiple sources. Since the features and the output variable used are scale free, models trained on one asset should be able to give reasonable option price predictions for another asset, provided their log return distributions are not too different from each other. This anticipation hinges on our assumption that, for a given financial market, two assets having the same return distribution should have the same option pricing mechanism. Conversely, there is a possibility that the prediction quality may be inferior even when the training and test datasets belong to the same asset, as the return dynamics of the underlying asset may have changed drastically. This subsection presents some experiments in this direction. We first carry out an empirical investigation of the asset portability of the models.
In order to do this, we consider all six models trained on NIFTY50 option contracts and test them with data from BANKNIFTY-based contracts on non-overlapping time intervals. The results of this experiment are given in Table 8. It is crucial to note that these two indices are sufficiently independent and have contract parameters with vastly different magnitudes. We present the Q-Q plot (Figure 12) of the "close" price log returns of the two underlying assets in order to compare their log return distributions. Figure 12 shows a moderate mismatch between the return distributions of these two assets. Thus, although we do not expect the predictive performance to be equivalent to that on the NIFTY50 test sets, we expect the error metrics to be decently small in magnitude. Our experiment supports this anticipation. Moreover, a quick comparison of our results (Table 8) with Table 5 shows that the NIFTY50-trained models outperform the BANKNIFTY-trained models on the BANKNIFTY test set. This gives evidence that a model trained on a different asset/source can outperform a model trained on the target asset/source.

Table 8: Model evaluation metrics for models trained on NIFTY50 contract data and tested on BANKNIFTY contracts.

The results of the above experiment encourage us to train the models using contract data from two or more assets/sources. In principle, this should broaden the range of features and allow the models to achieve far better generalization and predictive capability. We investigate this by training all six models (one XGBoost and one ANN for each of the three approaches) using the combined data of NIFTY50 and BANKNIFTY contracts and then performing out-of-sample tests for each asset. The EM and ρ values of the respective experiments are given in Table 9.
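The Q-Q comparison of the two assets' log return distributions (Figure 12) amounts to plotting matched empirical quantiles against each other. A minimal sketch with NumPy, using synthetic normal returns as stand-ins for the actual index data:

```python
import numpy as np

def qq_points(returns_a, returns_b, n_quantiles=99):
    """Matched empirical quantiles of two log-return samples.

    Points near the line y = x indicate similar return distributions;
    systematic deviation signals a mismatch such as the moderate one
    observed between the NIFTY50 and BANKNIFTY returns.
    """
    qs = np.linspace(0.01, 0.99, n_quantiles)
    return np.quantile(returns_a, qs), np.quantile(returns_b, qs)

# Synthetic stand-ins for the two assets' daily log returns.
rng = np.random.default_rng(0)
qa, qb = qq_points(rng.normal(0.0, 0.01, 5000), rng.normal(0.0, 0.01, 5000))
```

Plotting qa against qb (with the reference line y = x) reproduces the kind of comparison shown in Figures 12 and 16.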
Table 9: Model evaluation metrics for models trained on both NIFTY50 and BANKNIFTY contract data.

A comparison of the metrics given in Table 9 with those in Tables 4 and 5 clearly shows that the combined-trained models have better option pricing capabilities than the models trained on the respective assets individually. Each of the combined-trained models also outperforms the price prescription of the Black-Scholes formula. The performance of the option price prediction can be better perceived using a scatter plot of the actual and predicted option prices, which we present in Figure 13. Since the proposed models predict a bin, in order to plot the graph we take the midpoint of the predicted bin to get a single predicted price. The prices obtained using the midpoints of the bins are plotted along the horizontal axis and the actual prices along the vertical axis. The scatter plot shown in Figure 13 is constructed using the predictions given by the Approach III ANN model (trained using the combined data). To the plot, we add the line y = x (dashed red) and the orthogonal regression line (dashed green). The proximity of these two lines validates the absence of bias in the model. In principle, such scatter plots can be constructed for all the proposed models. The success of the above experiment warrants an in-depth explanation. In the next section, we use the concept of domain adaptation to design a methodology that provides a deeper understanding of the combined-training effect.

This section brings to the fore an interesting application of the models constructed using Approach I in Sections 5.2 and 5.4. We test the pre-trained models (obtained using Approach I) with simulated Black-Scholes option price data.
A family of such tests is conducted by varying the volatility parameter in the geometric Brownian motion that is used to generate the simulated asset price time series; these time series datasets are then augmented with the option prices prescribed by the Black-Scholes formula. We recall from Section 4.2 that the Approach I based models use order statistics of the log returns of the underlying asset's daily close prices as their primary inputs. Thus Approach I can be directly used to generate the simulated test datasets by treating the simulated time series data as "close" prices. Approach II or Approach III cannot be used directly, however, as they use the "open", "high" and "low" time series along with the "close" time series to generate the features, and simulating the corresponding "open", "high" and "low" time series is not straightforward. Hence we only use Approach I based models for the experiments described in this section.

We simulate geometric Brownian motion with the drift parameter set at µ = 0.1 and vary the volatility parameter from 1% to 20% in increments of 1%. Daily data is simulated for each value of the volatility parameter, such that we obtain a test set representing a trading session of 500 days. This test data is augmented with the prices of several near-ATM option contracts (with time to maturity ∈ [10, 25, 40]) using the Black-Scholes formula. We then find the prediction error of the models for each variant of the test data and plot it against the volatility parameter. We do this using the XGBoost/ANN models trained on NIFTY50, BANKNIFTY and the combined dataset respectively. The purpose of this exercise is to explain the results detailed in Section 5.4. It has not been done to judge the performance of the trained models on data derived from theoretical models (as option contract prices obtained using theoretical models involve certain mathematical assumptions that render the resulting prices dissonant from reality).
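One variant of the simulated test family above can be sketched as follows. This is a self-contained sketch: the risk-free rate, the initial price, and the nearest-integer-strike rule for "near-ATM" are illustrative assumptions not specified in the text, and the Black-Scholes pricer is redefined here so the block stands alone.

```python
import math
import random

def bs_call(S, K, T, r, sigma):
    """Black-Scholes European call price (T in years)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return S * N(d1) - K * math.exp(-r * T) * N(d2)

def simulate_test_set(sigma, mu=0.1, s0=100.0, days=500, r=0.05,
                      maturities=(10, 25, 40), seed=0):
    """Simulate GBM daily closes and augment them with near-ATM
    Black-Scholes call prices, one row per (day, maturity)."""
    rng = random.Random(seed)
    dt = 1.0 / 252
    closes = [s0]
    for _ in range(days):
        z = rng.gauss(0.0, 1.0)
        closes.append(closes[-1] * math.exp((mu - 0.5 * sigma ** 2) * dt
                                            + sigma * math.sqrt(dt) * z))
    rows = []
    for s in closes[1:]:
        k = round(s)              # nearest integer strike ~ near-ATM (assumption)
        for m in maturities:      # time to maturity in trading days
            rows.append((s, k, m, bs_call(s, k, m / 252, r, sigma)))
    return rows

# One member of the family: sigma = 10%; sweep sigma over 0.01..0.20 for the full set.
rows = simulate_test_set(sigma=0.10)
```

Repeating this over the volatility grid and scoring each resulting test set with EM traces out the EM-versus-σ curves discussed next.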
The volatility values that minimize EM provide a class of theoretical asset dynamics whose option prices are best predicted by the trained model. We call this the "error minimizing volatility" (EMV) of a given option price dataset corresponding to the learning model. From Figures 14 and 15 it is evident that the EM plots obtained for the combined-trained models give lower and flatter V-shaped curves. This implies that models trained on the combined dataset result in lower EM values for a wide range of test sets with varying σ values. This hints at the possibility of domain adaptability of predictive models trained on datasets derived from multiple assets/sources. It also hints at the existence of a common representation space for datasets with similar log return distributions. Such an application of domain adaptability can be a very powerful method, as it could potentially aid research in areas where data is scarce.

During the period from January 2020 to April 2020 of the COVID-19 pandemic, the dynamics of the NIFTY50 index were radically different from its usual dynamics. A Q-Q plot comparison of the log return distributions of the NIFTY50 index during the periods Oct '19-Dec '19 and Jan '20-Mar '20 is shown in Figure 16. It is evident from the Q-Q plot that there is almost no match between the price dynamics of these two time intervals. Therefore, for option contracts based on the NIFTY50 index, we cannot expect the models trained on 2015-2017 data to perform well on 2019-2020 data. Table 11 presents the values of the performance metrics when the pre-trained Approach III models (constructed in Sections 5.2 and 5.4) are tested on 2019-2020 data for the NIFTY50 index. We consciously choose the models constructed using Approach III as the benchmark for testing the 2019-2020 dataset, as these models have given us the best predictive capability.
Table 11: Model evaluation metrics for models trained on 2015-2017 NIFTY50 contract data but tested on 2019-2020 NIFTY50 contract data.

Table 11 makes it evident that the error in predicting option prices for the 2019-2020 NIFTY50 test data is significantly larger than the prediction error for the 2017-2018 NIFTY50 test data in Table 4. It must be noted, however, that the performance of the models on the recent data is still far better than what the Black-Scholes formula prescribes. The large values of the evaluation metrics for the Black-Scholes pricing formula imply a large gap between the historical and implied volatilities. This is typically observed when drastic changes occur in a financial market. We also observe a significant improvement in the case of the combined-trained models as compared to the individually trained NIFTY50 models. This reaffirms the power of combined training.

In addition to the above experiments, we plot the empirical IV and the predicted IV band in Figure 17, in a manner similar to the plot reported in Figure 11. The band prediction error for the 2019-2020 dataset (Figure 17) is 25%, which is less than the value of ρ observed in Table 11. Figure 17 helps us identify regions in the test dataset where the model does not perform well. It is observed that when the implied volatility of the underlying asset changes sharply, the prediction bands deviate from the actual values. These abrupt changes are usually caused by rapid shifts in market sentiment (in this case due to the COVID-19 pandemic), an aspect that is not represented in the data used to train the models.

In this paper, we present three data-driven approaches to build option pricing models using supervised learning algorithms. These approaches are illustrated for two different assets/sources (NIFTY50 and BANKNIFTY), and we use two different learning algorithms to build a range of models.
Upon evaluating the performance of the models on out-of-sample data, it was seen that the Approach I and II based models performed better than the Black-Scholes option pricing formula in most cases, while the Approach III based models performed significantly better than all comparative models. Since Approach III uses features derived from the historical option price data that are not present in the Approach I and II feature sets, the performance improvement clearly indicates the value of including such information. The results also highlight the superior performance of the ANN-based models in comparison to the XGBoost-based models. In this paper, we have also built averaging ensemble models for each data source, the results of which show a marked improvement in the accuracy of pricing option contracts. Lastly, we have investigated the effect of multi-asset combined training for each of the proposed approaches. It was observed that the multi-asset trained models gave a significant improvement in prediction quality when compared to single-asset trained models. We have further examined this performance enhancement using the concept of domain adaptation. The success of the multi-asset trained models makes us optimistic about the viability of building a non-asset-specific data-driven option pricing model. Such a model, once trained on data from multiple assets belonging to a particular financial market, would be capable of predicting the fair price of any European-style call option on any asset belonging to the same financial market with a high degree of precision. However, in our paper, we have examined the combined-training effect using only two assets/sources. Extensive experimentation is required to determine the limitations and the scope of such non-asset-specific models. Readers may refer to [21], which reports a similarly extensive experiment studying other universal non-asset-specific relations captured by a deep learning model.
Further research to develop and validate the existence of such models has been planned by the authors. The code used in this study can be made available on request.

References
- A neural network versus Black-Scholes: a comparison of pricing and hedging performances
- Black-Scholes versus artificial neural networks in pricing FTSE 100 options. Intelligent Systems in Accounting
- The pricing of options and corporate liabilities
- Classification and regression trees
- Profiling neural networks for option pricing
- XGBoost: a scalable tree boosting system. KDD '16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
- A new hybrid parametric and machine learning model with homogeneity hint for European-style index option pricing
- A decision-theoretic generalization of on-line learning and an application to boosting
- Greedy function approximation: a gradient boosting machine
- Stochastic gradient boosting
- Pricing and hedging derivative securities with neural networks and a homogeneity hint
- Option pricing with modular neural networks
- Pricing and hedging derivative securities with neural networks: Bayesian regularization, early stopping and bagging
- A nonparametric approach to pricing and hedging derivative securities via learning networks
- A method for stochastic optimization
- A hybrid neural network approach to the pricing of options
- Option pricing using artificial neural networks: the case of S&P 500 index call options. Neural Networks in Financial Engineering
- Using neural networks to forecast the S&P 100 implied volatility
- Universal features of price formation in financial markets: perspectives from deep learning. Quantitative Finance
- Option price forecasting using neural networks

Acknowledgments. We are grateful to Arkaprava Sinha and Prof. Amit Mitra (IIT Kanpur) for some useful discussions.

key: cord-020764-5tq9cr7o
authors: vertrees, roger a.; goodwin, thomas; jordan, jeffrey m.; zwischenberger, joseph b.
title: tissue culture models
date: 2010-05-21
journal: molecular pathology of lung diseases
doi: 10.1007/978-0-387-72430-0_15
sha:
doc_id: 20764
cord_uid: 5tq9cr7o

Abstract: The use of tissue cultures as a research tool to investigate the pathophysiologic bases of diseases has become essential in the current age of molecular biomedical research. Although it will always be necessary to translate and validate the observations seen in vitro in the patient or animal, the ability to investigate the role(s) of individual variables free from confounders is paramount to increasing our understanding of the physiology of the lung and the role of its cellular components in disease. Additionally, it is not feasible to conduct certain research in humans because of ethical constraints, yet investigators may still be interested in the physiologic response in human tissues; in vitro characterization of human tissue is an acceptable choice.

Tissue culture techniques have been utilized extensively to investigate questions pertaining to lung physiology and disease.
The isolation and propagation of human bronchial epithelial cells has allowed investigators to begin to characterize the interactions and reactions that occur in response to various stimuli. Moreover, the culture of human airway smooth muscle has allowed researchers to investigate the pathologic cascade that occurs in asthma as well as other physiologic responses in the smooth muscle of the lung. Numerous lung cancer cell lines have been established to investigate their responses to chemotherapy and determine their biologic properties. Overall, the use of cultured human lung tissue has provided a windfall of information on the pathogenesis of diseases that affect the lung and on the basic physiology and development of the lung in general. Despite this wealth of information in the literature, this chapter is the first to discuss the use of tissue culture models to examine the physiology and pathologic basis of lung diseases. In light of this, we briefly discuss the history and principles behind the utilization of tissue culture. We then discuss the current use of tissue culture to examine many of the diseases that affect the lung.

In 1951, cells from Henrietta Lacks were cultivated into the first immortal cell line, "HeLa." [5] HeLa cells are still one of the most widely used cell lines today. Since the 1950s, tissue culture has become firmly established as a mechanism to answer many questions in biomedical research. Today, tissue culture is widely used to investigate diseases that affect the lung, and through this work we have been able to increase our understanding of the pathologic cascades that occur in lung diseases, as well as the normal physiologies of the lung.

Tissue culture is a commonly used generic term for the in vitro cultivation of cells, attributed to the early cultures that generally consisted of heterogeneous cultures of crudely disaggregated tissues.
Currently, many terms can be encompassed by "tissue culture": organ culture, cell culture, primary explants, and ex vivo propagation all deal with the in vitro cultivation of cells or tissues. Cell culture in general can be applied either to primary cells (i.e., those with a finite life span) or to cell lines (e.g., HeLa cells). Additionally, these cultures can be either a homogeneous or a heterogeneous group of cells.

Primary cell culture involves the isolation of cells from a tissue by disaggregation. Single cell suspensions from tissues can be obtained either through enzymatic digestion of the extracellular matrix surrounding the cells, such as with ethylenediaminetetraacetic acid, trypsin, or collagenase, or by mechanical disaggregation. These disaggregation procedures have the disadvantage of possibly injuring cells. If the cells of interest are adherent viable cells, they will be separated from nonviable cells when the medium is changed. Alternatively, viable cells can be separated from nonviable cells prior to culture by subjecting the single cell suspension to density gradient centrifugation (e.g., Hypaque). Primary cells have the advantage of possessing many of the biologic properties that they possessed in vivo because they are not transformed. Primary cells, unlike cell lines, are not immortal and have only a finite survival time in culture before becoming senescent. Variant cells, however, as well as those obtained from neoplastic tissue, may proliferate indefinitely, thus becoming immortal in vitro. This will eventually allow the immortal cell to take over the culture, and such a culture can be thought of as a cell line. In general, primary human cultures will survive for 30-80 passages in vitro, although this number depends on cell type, conditions, and possibly other unknown factors. Primary cells are widely used to examine the effects of toxins, infectious agents, or other cellular interactions that would not be feasible to study in vivo.
Primary cells have the disadvantage of being a heterogeneous mixture of cells upon primary isolation, with the type of cell obtained generally a function of the disaggregation method used. The most common contaminants seen following isolation of primary cells are cells of mesenchymal origin (e.g., fibroblasts). However, advances have been made that allow the culture of homogeneous populations of cells. For instance, cell surface molecules specific for the cells of interest may be tagged with monoclonal antibodies. Techniques such as fluorescence-activated cell sorting or the use of magnetic beads can then be utilized to enrich the single cell suspension for the cell type of interest. Additionally, some investigators have recently exploited unique characteristics of certain cells, such as the presence of P-glycoprotein or multidrug resistance-associated proteins expressed on endothelial cells, to poison other contaminating cells in culture. [6]

Another type of primary cell culture is the "primary explant." This type of culture is not subjected to a disaggregation procedure like the primary cell technique described earlier; therefore, single cell suspensions do not occur. Briefly, tissue samples are dissected and finely minced, and these tissue pieces are then placed onto the surface of a tissue culture plate. Following plating of tissue pieces, cells have been shown to migrate out of the tissue and onto the tissue culture surface. [7] This technique is useful when cells of interest may become damaged or lost in the disaggregation technique described earlier, and it is often used to culture human bronchial epithelial cells. [8]

Cell lines are another useful source of cells with which to investigate questions in biomedical research. These cells have the advantage of being immortal, as opposed to the finite life spans that primary cells possess. Additionally, they are generally well studied and characterized, leaving few experimental variables to worry about.
These cells, however, are prone to dedifferentiation, a process by which they lose the phenotypic characteristics of the cell from which they began. Many of the early cell lines were established from tumor tissue and as such possess abnormal growth characteristics. Newer cell lines have been established by molecular techniques such as inserting a telomerase gene into a cell to allow it to replicate indefinitely. [9] Because of the phenotypic changes that allow cell lines to replicate indefinitely in culture, they are often a first choice for experiments; however, they are also highly criticized in light of their non-natural phenotype.

Organ culture, as the name implies, involves ex vivo culture of the whole or a significant portion of an organ. The main advantage of this type of culture is the retention and preservation of the original cell-cell interactions and extracellular architecture. This type of culture may be particularly important when the experimental design necessitates the use of an ex vivo system but researchers still need to retain the original organ architecture to answer the questions posed. These types of cultures do not grow rapidly, however, and are therefore not suitable for experiments needing large numbers of a particular cell type. [10]

Advantages and limitations of tissue culture. Tissue culture has become the penultimate tool of the reductionist biologist. The utilization of tissue culture as a research methodology has allowed investigators to study isolated interactions in a near-normal environment. These experiments by their very nature introduce artifacts; however, they do minimize the number of confounding variables that may affect a particular experiment. For instance, tissue culture allows investigators to determine the effects of one particular treatment on a particular cell type, which would not be feasible in vivo.
Additionally, tissue culture models of disease allow investigators to obtain samples and make observations more readily than can be done in vivo. However, it is the relative simplicity of experiments done in vitro that brings models of disease or physiology under frequent and warranted criticism: these models do not take into consideration the complexity of biologic systems. Diminishing possible confounding variables by culturing cells in vitro raises the constant criticism of how applicable the results are, given the alterations of the normal cellular environment found in vivo. For example, cell-cell interactions in vitro are reduced and unnatural. Moreover, the culture does not contain the normal heterogeneity and three-dimensional architecture that are seen in vivo. This said, tissue culture biology has proved to be successful in many ways.

We have briefly discussed the advantages that experimental systems using tissue culture afford researchers studying physiology and pathogenesis. Because of its ability to isolate individual variables and determine their role(s) in physiology, cell culture has become an integral tool in deciphering the pathologic cascades that occur in human disease. Diseases that affect the lung are no exception. Many diseases that affect the lung, and humans in general, are multifactorial. This begs the question: how can cell culture, which because of its reductionist nature deals with only a minimal number of variables, help to solve the unknown questions and decipher the components involved in disease? Often, clinical observations, and the questions arising therein, have been the launching pad for investigation. For instance, observations of massive inflammation in the bronchoalveolar lavage samples of patients with acute respiratory distress syndrome (ARDS), consistent with the damage seen in histologic samples, prompted investigators to determine the role(s) of inflammation in the etiology of ARDS.
Through the use of cell culture, investigators were able to determine individual interactions that occur in the disease process. Investigators have utilized culture models employing microcapillary endothelial cells under flow conditions to understand the role of proinflammatory cytokines in the cytokinesis and emigration of neutrophils in disease. Using a model of pulmonary endothelium under flow conditions allowed investigators to demonstrate the importance of certain proinflammatory cytokines in ARDS. [11]

The role of inhaled toxicants in lung injury, and the mechanism(s) by which they cause disease, is another area of investigation that has utilized cell culture. Scientists have developed diverse and unique tissue culture systems that contain air-liquid barriers of lung epithelium and subjected these cells to various gaseous toxicants to determine what occurs following inhalation of various chemicals. Utilizing these types of systems, investigators are able to control the exposure time and other variables that may be difficult to control when determining inhaled toxicant effects in vivo. Moreover, the use of tissue culture, as opposed to an animal model, allows investigators to observe effects kinetically, without undue changes (e.g., sacrifice) and expense in the experimental model. [11] A tissue culture model also permits an investigator to observe multiple changes in real time, such as cellular integrity, cell signaling and intracellular trafficking, protein expression changes, oxidant-induced cellular damage, and more. Deciphering each of these changes in an animal model would be extremely difficult; by employing a tissue culture model, researchers are able to tightly control the experimental system while isolating the events of interest. Further examples of how tissue culture models are currently being used to elucidate questions in lung physiology and disease are discussed later in the section on lung tissue cell lines.
Maintaining cells in vitro was initially a very difficult task. Many requirements need to be fulfilled before a successful cell culture is achieved. Some of these requirements depend on the type of tissue being studied; others may depend on the specific needs of the individual cells. Various chemically defined media are now available commercially to support the growth and differentiation of numerous cell types. The creation of defined media has allowed investigators to culture a multitude of cell types while controlling the local environment to answer pertinent questions. For example, glucose can be removed from a culture medium in order to study its effects on cellular metabolism, relative position in the cell cycle, and many other processes. Each chemical component of these media is known. Additionally, investigators can add growth factors to nourish their cell cultures. The medium chosen for tissue culture must meet two main requirements: (1) it must allow cells to continue to proliferate in vitro, and (2) it must allow the preservation of the specialized functions of interest.[7] The most common medium formulations currently used in lung research are Dulbecco's Modified Eagle's Medium, Minimum Essential Medium, RPMI 1640, and Ham's F-12. Occasionally, investigators develop new medium types to attain a formulation that optimizes their own experimental conditions. Fetal bovine serum is a common additive to most tissue culture media, although some investigators choose to forgo this additive in favor of more defined supplementation. Additionally, others may choose sera from other sources, such as human serum when culturing cells of human origin. Inactivation of complement by heat treating serum for 1 hr at 56°C was initially very popular in tissue culture. However, it has become clear that this treatment may in fact damage some of the proteinaceous growth factors present in the medium, rendering it less effective.
Currently, many experts recommend heat inactivation only if the cell type of interest is particularly sensitive to complement.[12] More specific examples of media utilized in lung tissue culture models are given later in the section on lung tissue cell lines. When determining whether the current culture conditions are sufficient for the experimental design, the investigator must decide which cellular characteristics are important. Not only are general characteristics, such as adhesion, multiplication, and immortalization of cell types, important, but so are tissue-specific characteristics. Of importance to pulmonary research, the lung is a unique environment to simulate in vitro because of the air-liquid interface. Recently, investigators have made use of culture insert wells (e.g., Transwells, Corning) in order to study this interaction.[6]

Cell Adhesion

Nearly all normal or neoplastic human epithelial cells will attach with relative ease to tissue culture surfaces. Most tissue culture models utilizing tissue of lung origin fit this description, with the notable exception of small cell lung carcinoma cell lines. However, for cells that adhere loosely, or may not adhere at all, scientists coat tissue culture surfaces with extracellular matrix proteins. Incubating tissue culture surfaces with serum, as well as laminin, fibronectin, or collagen, prior to culture has been shown to improve the attachment of finicky cells.[8] These treatments also help replicate the normal attachment of cells to extracellular matrix proteins in vivo. The development of continuous cell lines may be serendipitous, as was the development of early cell lines. In brief, many investigators would continue splitting primary cell cultures until one or more cell clones became immortal. Unfortunately, the changes that generally occurred in culture led to cells with abnormal phenotypes that had undergone dedifferentiation.
Today, many investigators choose to use molecular biology techniques, exploiting our current knowledge of oncogenic viruses and the enzymatic processes of cellular aging, to transform primary cells in vitro to an immortal phenotype. It is known that the large T antigen of simian virus 40 (SV40) is capable of transforming cells to an abnormal phenotype.[11,13,14] Moreover, transfection of primary cells with the telomerase enzyme has also been shown to induce an immortal phenotypic change while preserving most normal cellular functions and phenotypes.[11]

Dedifferentiation

A commonly encountered problem in tissue culture is dedifferentiation. This loss of phenotype may be insignificant to the research at hand or it may be critical, and it must be dealt with on a case-by-case basis. When a cell culture undergoes dedifferentiation, it is often unclear whether undifferentiated cells took over a culture of terminally differentiated cells or whether a primary cell of interest became immortal under the culture conditions. The functional environment in which cells are cultured is critical when correlating experimental results with those seen in vivo. We previously alluded to the importance of the culture environment when discussing the advantages and limitations of tissue culture. Investigators have frequently striven to replicate integral in vivo environments in vitro in order to increase the significance of their experimental results. The development of cell culture insert wells (e.g., Transwells, Corning) has allowed investigators to culture bronchial or alveolar epithelial cells at an air-liquid interface. This ability allows investigators to begin to replicate a significant aspect of these cells' functional environment in vitro, thereby increasing their understanding of the effects of gaseous particles on pulmonary epithelial cells. Alternatively, scientists have also cultured epithelial cells in a roller bottle apparatus.
This method allows investigators to control the amount of time the apical epithelial cell surface is in contact with the air. Capillary cell cultures have also come under frequent criticism when grown as a monolayer in a tissue culture plate. Investigators have been able to utilize gel matrices in which capillary cells form tubule-like structures, more closely replicating the architecture these cells maintain in vivo. Additionally, endothelial cells are constantly under flow conditions in vivo. Addressing this condition in vitro has allowed investigators to examine the role of endothelial cells during inflammation, helping to increase understanding of the role the endothelium plays in acute lung injury. At times, researchers may also choose to determine the effects of soluble factors (e.g., cytokines, hormones, neurotransmitters) from acute patients or animal models in a cell culture model. The milieu of soluble factors present in serum that may play a role in a disease state is considerable. Moreover, these factors may have actions alone that differ when they are combined with other soluble factors. Reconstituting every factor is difficult in vitro and leaves the possibility that an unknown factor may be missing. To address this, investigators have harvested sera from patients or animal models and used these samples as additives in their media formulations. For instance, through the use of serum samples from an animal model of smoke/burn injury-induced acute lung injury, investigators have demonstrated that the use of arteriovenous CO2 removal in acute lung injury significantly reduces apoptotic cell death in epithelial cells.[15]

Lung Tissue Cell Lines: Establishment and Significance

The diversity of research fields utilizing tissue culture models of lung diseases is extensive.
In this section, we give a brief overview of the main lung cell types being utilized in research today to answer pressing questions about lung physiology and the pathophysiology of pulmonary disease. Included in this discussion is also an overview of cell isolation and culture. The use of normal human bronchial epithelial (HBE) cells is extensively reported in the literature. Based on a method pioneered by Lechner et al.,[16] bronchial fragments obtained from surgery, autopsy, or biopsy specimens may be used as explants. The outgrowth of bronchial epithelial cells occurs readily from these explants when they are grown in medium supplemented with bovine pituitary extract and epidermal growth factor. Alternatively, these cells have also been demonstrated to grow in basal keratinocyte serum-free medium without supplementation; however, they demonstrate a slower growth rate and earlier senescence.[8] Cultures of HBE cells are valuable for determining responses to toxic inhaled pollutants. In vitro exposure systems based on these methods have several advantages. First, in vitro exposure systems can be stringently controlled and reproduced much better than animal systems; second, individual determination of each cell type's response to pollutants allows better characterization of that cell type's contribution to a biologic response. Finally, in vitro determination of responses to toxic agents allows investigators to observe the reactions of human cells when testing in humans is not feasible because of ethical restraints. In vitro study of the responses of bronchial cells to gaseous pollutants is not without its difficulties. Wallaert et al.[17] have described these constraints well. Briefly, because of the gaseous nature of the pollutants, culture systems should be designed to allow significant exposure times to pollutants while also taking care to keep cells from drying out when exposed to air.
To facilitate these experiments, roller bottle cultures have been developed that allow cells direct contact with the ambient air. Alternatively, cells have been grown on a membrane filter and cultured at an air-liquid interface, which allows constant exposure to the experimental treatment. The same types of experiments used to determine the responses of cells to inhaled toxicants have also been used to characterize responses to inhaled pharmaceuticals. In addition to the characterization of responses to inhaled agents, epithelial cell cultures, notably alveolar epithelium obtained from fetal lung tissue, have allowed investigators to characterize the liquid transport phenotype of the developing lung. Characterization of the Cl− ion secretion system, which operates in the distal lung epithelium throughout gestation, has shown it to be integral in stimulating growth of the developing lung by regulating liquid secretion. Likewise, a phenotypic switch to Na+ absorptive capacity has been described toward the end of gestation, which is important in preparing the lung to function postpartum and beyond. These culture systems have elucidated important physiologic changes that occur in the developing lung. Similar experiments have demonstrated that, while ion transport plays a crucial role in this process, hormones and neurotransmitters are also important. Pulmonary endothelial cells represent a unique type of endothelium because of their paradoxical responses to hypoxia. This uniqueness highlights the need to utilize cell culture models of pulmonary endothelium, as opposed to other endothelia, when investigating their role(s) in pulmonary physiology. Several investigators have described the isolation and culture of pulmonary endothelial cells.
Persistent pulmonary hypertension of the newborn, also known as neonatal pulmonary hypertension, is caused by a disorder of the pulmonary vasculature in the transition from fetal to neonatal circulation, culminating in hypoxemic respiratory failure and death. The inciting events that culminate in neonatal pulmonary hypertension are multifactorial. Despite this, decreased production of vasodilator molecules such as nitric oxide and prostaglandin I2 in the pulmonary endothelium has been shown to be a critical component of disease progression.[18] Primary cell cultures of airway smooth muscle can be obtained using a method described by Halayko et al.,[19] who isolated and characterized airway smooth muscle cells from canine tracheal tissue. Briefly, airway smooth muscle cells were obtained by finely mincing tissue and subjecting it to an enzymatic disaggregation solution containing collagenase, type IV elastase, and type XXVII Nagarse protease. Following generation of a single-cell suspension, cells may be grown in Dulbecco's Modified Eagle's Medium supplemented with 10% fetal bovine serum. Halayko et al.[20] obtained approximately 1.3 × 10^6 smooth muscle cells per gram of tissue using this method. Although Halayko et al.[21] pioneered this technique using trachealis tissue, many other investigators have obtained airway smooth muscle cells from a variety of biopsy specimens. Airway smooth muscle hyperreactivity and hypertrophy have been known for nearly 100 years[2] to be an important end response of asthma. The use of airway smooth muscle in vitro has been vital in delineating the pathologic steps that occur in asthma, as well as in the testing of potential therapeutics that may help to decrease the morbidity and mortality of asthma. Additionally, the relative paucity of in vivo models of asthma further illustrates the value of isolating and characterizing smooth muscle cells from asthmatic patients in vitro.
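The reported yield invites a quick planning calculation. The sketch below is a minimal illustration in Python; the 1.3 × 10^6 cells/g figure comes from the text, while the flask area and seeding density are hypothetical values chosen only for the example.

```python
# Illustrative planning sketch (not from the original protocol):
# how much minced tissue is needed to seed a batch of flasks,
# given the ~1.3e6 cells per gram yield reported by Halayko et al.

YIELD_CELLS_PER_G = 1.3e6  # smooth muscle cells per gram of tissue (from text)

def grams_needed(n_flasks, flask_area_cm2=75.0, seed_density_per_cm2=1.0e4):
    """Grams of tissue required to seed n_flasks flasks.

    flask_area_cm2 and seed_density_per_cm2 are assumed, illustrative
    values (a T-75 flask seeded at 1e4 cells/cm^2).
    """
    cells_needed = n_flasks * flask_area_cm2 * seed_density_per_cm2
    return cells_needed / YIELD_CELLS_PER_G

print(f"{grams_needed(4):.2f} g of tissue for four T-75 flasks")
```

Under these assumptions, a few grams of tissue suffice for several flasks; in practice, viability losses during disaggregation would push the requirement upward.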
Using airway smooth muscle cell culture, investigators have characterized both the hypertrophic and hyperplastic growth of smooth muscle in affected individuals. Investigation of the potential stimuli that lead to airway smooth muscle proliferation and hypertrophy has led researchers to implicate the mitogen-activated protein kinase family members extracellular signal-regulated kinase-1 and -2, as well as the phosphoinositol-3 kinase pathways, in pathogenesis.[22] Additionally, mediators directing smooth muscle migration have also been observed in vitro and may play a role in the progression of asthma. Platelet-derived growth factor, fibroblast growth factor-2, and transforming growth factor-β (TGF-β) have all been shown to play a role in the migratory response of smooth muscle cells seen in asthma.[22] Additionally, contractile agonists such as leukotriene E4 have been shown to potentiate the migratory responses seen with platelet-derived growth factor treatment.[22] Human airway smooth muscle cell culture has also been utilized to investigate possible pharmacologic interventions for the treatment of asthma. β2-agonists have been shown to decrease the rate of DNA synthesis, and likewise the hyperplasia seen in airway smooth muscle cells in response to mitogenic stimuli, through an increase in cyclic adenosine monophosphate. Like β2-agonists, glucocorticoids have similar antiproliferative activities.

Lung Cancer Tissue and the Development of Novel Therapeutics

Culture of neoplastic cells from human tumors has allowed investigators to harvest a wealth of knowledge about the biology of lung cancers; moreover, these cultures have provided potential models for testing candidate therapeutics. The propagation of lung cancer cells in vitro has been covered in great depth previously.[8] In contrast to primary cell cultures, cultures of neoplastic cells are immortal, allowing their easy growth in culture with less chance of being overgrown by mesenchymal cells such as fibroblasts.
The relative ease of growth in culture has led to many cell lines of lung cancer tissue. The National Cancer Institute, recognizing the need for a variety of lung cancer cell lines (both small cell and non-small cell), helped establish over 300 cell lines.[23] These lines are a wonderful resource for investigators given that they are extensively characterized, and many have full clinical data available. Moreover, many of these cell lines are now easily available through the American Type Culture Collection for a modest handling fee. Additionally, if investigators do not wish to use currently established lung cancer cell lines, obtaining clinical samples for use in tissue culture models is relatively easy. The same methods used to obtain biopsy specimens for clinical staging can also be used to begin cell cultures. Following culture and initial characterization of lung cancer cell lines, many investigators have demonstrated that lung cancer cell lines maintain a similar phenotype after establishment. Specifically, it has been verified that lung cancer cell lines injected into nude mice exhibit histopathology similar to the original tumor, indicating that minimal change occurred following establishment of the cell lines. Small cell lung carcinoma (SCLC) cell lines have been established from a multitude of biopsy specimens, including bone marrow, lymph nodes, and pleural effusions.[8,24] Once viable cells have been obtained from clinical samples, cells are easily maintained in a basal cell culture medium such as RPMI 1640 in a humidified incubator at 37°C and 5% CO2, although the initial isolations of SCLC lines utilized HITES and ACL-4 media.[25] Most established SCLC cell lines maintain a neuroendocrine phenotype in culture; however, Baillie-Johnson et al.[24] noticed considerable heterogeneity in the cell lines they established, highlighting the point that establishing a cell line from the clinical sample of interest may provide investigators with a line that possesses the exact phenotypic properties of interest. Small cell carcinoma poses many difficulties for surgical treatment, owing to its early and widespread metastasis. Therefore, combination chemotherapy is generally utilized in treatment. Unfortunately, despite initial sensitivity, SCLC tumors become resistant to further treatment. Utilizing in vitro cultures of SCLC cell lines, Sethi et al.[26] began to describe how extracellular matrix proteins can protect SCLC against apoptosis-inducing chemotherapeutics through β1-integrin-mediated survival signals. These data indicate that the extracellular matrix proteins surrounding SCLC may play a role in the local recurrence seen in patients following chemotherapy in vivo, and they suggest novel therapeutics aimed at blocking these survival signals. Non-small cell lung carcinoma (NSCLC) cell lines, including squamous cell carcinoma, adenocarcinoma, and large cell carcinoma, have all been established. Despite the fact that NSCLC comprises three distinct histologic cell types, all can be established relatively easily. The primary treatment protocol for patients afflicted by NSCLC is generally surgical resection of the tumor; therefore, tumor cells for culture are readily available. These cell types can be grown under conditions similar to those described for SCLC. Infectious diseases play a unique role in lung pathology in light of their roles as either important contributors to or consequences of many lung diseases. For instance, certain lung diseases may predispose patients to infection: patients afflicted with obstructive lung diseases, as well as cystic fibrosis patients, commonly suffer from severe and recurrent bacterial infections.
Additionally, patients may become superinfected following a viral respiratory infection. Systemic infections, such as gram-negative bacterial sepsis, may lead to lung diseases such as ARDS.

Human Type II Alveolar Pneumocytes and Acute Lung Injury/Acute Respiratory Distress Syndrome

Pulmonary alveolar type II cells are a unique cell subset that carries out highly specialized functions, including the synthesis and secretion of surfactant, a unique composition of lipoproteins that acts to reduce surface tension at the alveolar air-liquid interface.[27] Defining the molecular mechanisms leading to production of surfactant by type II pneumocytes is important in many disease processes. The pathogenic sequence that results in ARDS, the most severe manifestation of alveolar lung injury, is generally thought to be initiated by a systemic inflammatory response.[28] Despite this knowledge, many questions remain about the initial triggers and pathologic steps that occur in ARDS. Greater understanding of these steps may help in developing new treatment regimens. Currently, treatment of ARDS consists of mechanical ventilation, which helps to stabilize blood gases. However, mechanical ventilation itself may provoke further inflammation in the alveoli, thereby decreasing compliance and gas exchange.[29] The cell type of particular interest in ARDS and diffuse alveolar damage is the type II pneumocyte.[30-34] Until recently, studies attempting to decipher the pathologic sequence in acute lung injury have had to rely on standard lung epithelial cell lines. Recently, however, human type II alveolar epithelial cells (pneumocytes) have been successfully isolated from fetal human lung tissue by collagenase digestion.[35] Briefly, fetal lung tissues were minced and incubated in a serum-free medium containing dibutyryl cyclic adenosine monophosphate for 5 days.
The tissue explants were then treated with collagenase and incubated with DEAE-dextran to eliminate contaminating fibroblasts. Cells were then plated onto tissue culture dishes treated with extracellular matrix derived from MDCK cells and cultured overnight in Waymouth's medium containing 10% serum. These steps resulted in relatively pure populations of human type II pneumocytes, which were then cultured at an air-liquid interface. Using these methods, Alcorn et al.[35] were able to maintain a primary culture that retained the morphologic and biochemical characteristics of type II pneumocytes for up to 2 weeks.

Conventional Bioreactors and Three-Dimensionality: The Origins of Three-Dimensional Culture

Carrel postulated that tissue development was linked to access to the nutrient supply, noting that peripheral cells grew readily while internal cells became necrotic, presumably because of their distance from the nutrient source. To circumvent this issue, Carrel implemented cultures on silk veils, preventing the plasma clots of the growth media from deforming or becoming spherical and thus facilitating the internal cells' ability to obtain nutrient replenishment. Many attempts were made in standard culture systems (bioreactors) and other culture apparatuses to escape the constraints of two-dimensional cell culture, with the intent of yielding high-fidelity human and mammalian tissues, thus emphasizing the need for the development of three-dimensional biology. Another famous researcher, Leighton, improved on Carrel's techniques in the 1950s and 1960s. Leighton's major contribution to three-dimensional culture technology was the introduction of the idea of a sponge matrix as a substrate on which to culture tissues.[36,37] Leighton first experimented with cellulose sponges surrounded by plasma clots resident within glass tubes. He devised a system to grow 1- to 5-mm^3 tissue explants on sponges, using small amounts of chick plasma and embryo extract.
After the mixture solidified on the sponge, Leighton added the nutrient medium and inserted the "histoculture" into a roller apparatus to facilitate nutrient mass transfer. He experimented with many sponge combinations, discovering that collagen-impregnated cellulose sponges were optimal for sustaining the growth of native tissue architecture.[3,38] Leighton was successful in growing many different tissue types on sponge-matrix cultures.[3,38] Leighton also found that C3HBA mouse mammary adenocarcinoma cells, when grown in sponge-matrix histoculture, aggregated "much like the original tumor, forming distinct structures within the tumors such as lumina and stromal elements, and glandular structures." An extremely important difference between this three-dimensional histoculture and standard two-dimensional culture is the apparent quiescence of the stromal component and the balanced growth of these cells with respect to the overall culture. Leighton further advanced the concept of three-dimensional histoculture with histophysiologic gradient cultures.[39] These cultures are conducted in chambers that allow metabolic exchange between "the pool of medium and the culture chamber by diffusion across a membrane." Histophysiologic gradient cultures mimic, to some degree, diffusion in tissues.[38] From the pioneering work of Carrel and Leighton, other methods of emulating three-dimensional culture have been developed, such as embedding cells and tissues in collagenous gels of rat tail as per the techniques of Nandi and colleagues. Many of the advantages of three-dimensional culture seen by Leighton, Nandi, and others may be attributed to permitting the cells to retain their normal shape and spatial associations.[3] This global concept will be important as we begin to understand the physical and environmental characteristics of rotating-wall vessel systems.
Other methods of three-dimensional culture include a technique known as organ culture, or culture on a filter, a strategy developed by Strangeways[40] and Fell and Robinson.[41] Tissue explants were grown on lens paper in a watch glass containing liquid culture medium. Browning and Trier[42] found "that for some tissues, it is critical to keep the cultures at the air-liquid interface," thus allowing the tissues to experience conditions similar to the in vivo environment. Another strategy is the use of three-dimensional cultures known as proto-tissues: aggregates of cells used to form spheroids. This technique was popularized by Sutherland and colleagues more than 20 years ago, when they manipulated aggregates of cells into a spherical configuration by spinning agitation of the cells in spinner flasks.[43] This technique produced pseudo-tissue-like organoids useful for research evaluations. Each of these methodologies will be of benefit as we continue to examine strategies for achieving three-dimensional lung tissue constructs.[3,38] Finally, membrane bioreactors are capable of retaining enzymes, organelles, and microbial, animal, and plant cells behind a membrane barrier, trapped in a matrix or adherent to the membrane surface. In 1963, Gallup and Gerhardt[44] first used the membrane bioreactor for dialysis culture of Serratia marcescens. Immobilized enzyme microencapsulation was pioneered by Chang,[45] but Butterworth et al.[46] first developed the enzyme membrane reactor to successfully accomplish starch hydrolysis with α-amylase. Likewise, for animal cell culturing, Knazek et al.[47] first cultured human choriocarcinoma cells on compacted bundles of Amicon fibers. Many reviews of the particular applications of hollow fiber and immobilized-bioreactant bioreactors for enzyme catalysts, microbial cells, and animal cell culture are available.[48-53] As presented previously, tissue-engineering applications of three-dimensional function and structure are well known in medical science research.[54] In microgravity, three-dimensional aggregates form, facilitating the expression of differentiated organotypic assemblies. Investigations of the effect of composite matrices spiked with esterified hyaluronic acid and gelatin in augmenting osteochondral differentiation of cultured, bone marrow-derived mesenchymal progenitor cells, and of the effects of the matrix on cellular differentiation, have been carried out in vitro and in vivo.[54] Briefly, empty and populated matrices cultured for 28 days, with and without TGF-β1, demonstrated the following results. Cells implanted in the matrix produced a robust type II collagen extracellular matrix in vitro. Matrices placed in immunodeficient mice yielded no differentiation in empty constructs, osteochondral differentiation in loaded implants, and an enhanced level of differentiation in matrices containing TGF-β1 that were cultured in vitro before implantation. These results demonstrate the utility of a three-dimensional matrix for the presentation of bone mesenchymal progenitor cells in vivo for repair of cartilage and bone defects, and they indicate the efficacy of in vitro tissue engineering regimens.[54] These techniques lend themselves to microgravity and ground-based research tissue cultures alike. Many Earth-based laboratories are researching and developing hemopoietic bone marrow cultures of stem cell origin, and three-dimensional configurations are providing promising results, as illustrated by Schoeters and coworkers.[55] They report that murine bone marrow cells, cultured under long-term hemopoietic conditions, produce mineralized tissue and bone matrix proteins in vitro, but only in the presence of adherent bone stromal cells in three-dimensional collagen matrices.
At a concentration of 8 × 10^6 stromal cells, mineralization occurs in 6 days. In contrast, two-dimensionally oriented marrow fragments at 1 × 10^7 cells require more than 10 days before mineralization can be similarly detected.[55] Two-dimensional long-term marrow culture facilitates and enhances expansion of the stromal component and rudimentary differentiation of osteogenic-like cells in the adherent stromal layer, as verified by type I collagen or cells positive for alkaline phosphatase. Production of osteonectin and osteocalcin, a bone-specific protein, combined with calcification, is observed only in three-dimensional cultures. These studies demonstrate the need for and benefit of three-dimensionality and its application to the microgravity environment.[55] As we can see, this further reinforces the quest for three-dimensionality and the potential of modeling the microgravity environment. Investigations clearly show the need for the application of three-dimensional techniques in lung pathophysiologic studies. Interestingly, three-dimensional biology has facilitated full-scale investigations in most areas of tissue engineering, cell biology and physiology, immunology, and cancer research. Anchorage-dependent cells are widely cultured on microcarriers.[56] Studies show that, for the purposes of improved surface-to-volume ratio and scale-up, microcarrier suspension culture provides excellent potential for high-density cell growth.[57] In addition, microcarriers serve well as structural supports for three-dimensional assembly, the composite of which is the basis for three-dimensional tissue growth.[58] Conventional culture systems for microcarrier cultures (i.e., bioreactors) use mechanical agitation to suspend the microcarriers and thus induce impeller strikes as well as fluid shear and turbulence at the boundary layer between the wall and the fluid.
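The surface-to-volume advantage of microcarriers follows from simple sphere geometry. A minimal sketch in Python, assuming a bead diameter of ~175 µm and an illustrative bead count per millilitre (neither value comes from the text):

```python
import math

def area_per_ml(bead_diameter_um=175.0, beads_per_ml=5.0e4):
    """Growth area (cm^2) supplied per mL of culture by spherical
    microcarriers. Both defaults are assumed, illustrative values."""
    r_cm = bead_diameter_um * 1e-4 / 2.0           # radius in cm
    return beads_per_ml * 4.0 * math.pi * r_cm ** 2  # surface area per mL

# For comparison, a T-75 flask holds roughly 15 mL of medium over
# 75 cm^2 of growth surface, i.e. about 5 cm^2 per mL.
print(f"{area_per_ml():.0f} cm^2 of growth surface per mL")
```

Under these assumptions the suspension offers roughly an order of magnitude more growth area per unit volume than a flat flask, which is the scale-up advantage the text refers to.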
Investigators have attempted a comprehensive study of the most efficient bioreactor designs and agitation regimens.[59] They concluded that virtually all stirred-tank bioreactors operate in the turbulent regime. It has been demonstrated that bead-to-bead bridging of cells is enhanced significantly at lower agitation rates in a stirred reactor.[60] Excessive agitation, from either stirring or gas bubble sparging, has been documented as a cause of cell damage in microcarrier cell cultures.[61,62] To overcome the problems induced by these mechanisms, investigators developed alternative culture techniques such as porous microcarriers to entrap cells,[63] increased viscosity of the culture medium,[64] bubble-free oxygenation,[65] and improved methods for quiescent inoculation.[66,67] These steps decreased the damage attributed to turbulence and shear forces but failed to rectify the problems significantly. Reactor systems of substantially increased volume exhibit less agitation-related cell damage, presumably because of the decreased frequency of cell-microcarrier contact with the agitation devices in these systems. Research-scale investigations do not afford the luxury of experimenting with large-scale production systems. Therefore, if a large-volume system is indeed more quiescent, an improved bioreactor system should emulate the fluid dynamics present in the upper regions of large-scale reactors, in which cells and microcarriers reside with minimal agitation. The problem, then, is to suspend microcarriers and cells without inducing turbulence or shear while providing adequate oxygenation and nutritional replenishment. The term rotating-wall vessel denotes a family of vessels, batch-fed and perfused, that embody the same fluid dynamic operating principles.
these principles are (1) solid body rotation about a horizontal axis, characterized by (a) colocation of particles of different sedimentation rates, (b) extremely low fluid shear stress and turbulence, and (c) three-dimensional spatial freedom; and (2) oxygenation by active or passive diffusion to the exclusion of all but dissolved gases from the reactor chamber, yielding a vessel devoid of gas bubbles and gas-fluid interface (zero head space). 68, 69

three-dimensional models of lung disease

current cell culture models have shortcomings resulting in unreliable tumor growth, uncharacteristic tumor development, nonhuman tumors, and inadequate methods of detection. cells propagated under traditional culture conditions differ widely in their expression of differentiated markers, adhesion receptors, and growth factor receptors compared with cells in situ or those grown as tissue-like structures. 70, 71 this is of concern because the phenotypic changes leading to malignant transformation often stem from alterations in the balanced and multifaceted roles of growth factors, receptors, and cytokines (reviewed by herlyn et al. 71 ). with increasing evidence of the importance of adhesive contacts, paracrine cross-talk between different cell types, and signaling cascades that link the cell with a complex substratum, there is now recognition that models must be developed that better simulate these complexities. there is still much to learn about the dynamic relationships among the different phenotypes found in the normal lung and in lung cancers. until a cell culture system is developed that allows differentiation to occur, 72 it is difficult to make any firm statement about relating effects in cell culture to clinical practice. tissue engineering is very embryonic in development and currently nearly universally focused on building replacement tissues.
a new technology developed at the nasa johnson space center and used to study colon cancer has been adapted to three-dimensional in vitro lung tissue culture models but has not been reported on to date. rotating-wall vessels are horizontally rotating cylindrical tissue culture vessels that provide controlled supplies of oxygen and nutrients with minimal turbulence and extremely low shear. 69 these vessels suspend cells and microcarriers homogeneously in a nutrient-rich environment, which allows the three-dimensional assembly of cells to tissue. prior to seeding rotating-wall vessels (synthecon, inc, houston, tx), cells were cultured in standard t flasks (corning, corning, ny) in gtsf-2 medium (1993 psebm) in a humidified 37°c, 5% co2 incubator. the rotating-wall vessels were seeded with 1-2 mg/ml cultispher-gl microcarriers (hyclone laboratories, inc., logan, ut) followed by beas-2b or bzr-t33 cells (atcc, baltimore, md) at a density of 2 × 10^5 cells/ml. cultures were grown in the rotating-wall vessels for 14-21 days for formation of 3- to 5-mm diameter tumor masses. rotating-wall vessel rotation was initiated at 25 rpm and increased as aggregate size became larger. stationary control cultures were initiated under the same conditions using fep teflon bags (american fluoroseal, columbia, md). at 24-hour intervals ph, dissolved co2, and dissolved o2 were determined using a corning 238 model clinical blood gas analyzer. glucose concentration was determined using a beckman 2 model clinical glucose analyzer (beckman, fullerton, ca). cell samples were harvested every 48 hr and fixed with omnifix (xenetics, tustin, ca) for immunohistochemistry or fixed with 3% glutaraldehyde/2% paraformaldehyde in 0.1 m cacodylic buffer (electron microscopy sciences, fort washington, pa) for scanning electron microscopy.
cancer models already developed by nasa investigators include growth and differentiation of an ovarian tumor cell line, 72-74 growth of colon carcinoma lines, 72 and three-dimensional aggregate and microvillus formation in a human bladder carcinoma cell line. 74 in support of their appropriateness as cancer models, even the most rudimentary three-dimensional cellular structures exhibit different phenotypes than cell lines cultured under two-dimensional conditions. properties such as responses to tgf-β, drug resistance to cisplatin or cyclophosphamide, and resistance to apoptosis are all altered in various types of cell aggregates. 75 many investigations sustain consistent evidence that cells growing in three-dimensional arrays appear more resistant to cytotoxic chemoagents than cells in monolayer culture. 38 li et al. found that spheroids were more resistant to cytosine arabinoside by 11-fold and methotrexate by 125-fold when compared with single cell suspensions. 76 further, monolayer cultures of colon carcinoma cells were sensitive to piericidin c, in contrast to responses within in vivo colon tumors or three-dimensional slices of tumors grown in vitro. 77 numerous other investigations have revealed increased levels of drug resistance of spheroids compared with single cell monolayers. 3, 38 questions of poor diffusion and insufficient drug absorption within spheroids, and a relatively frequent high proportion of resting cells, have clouded differences in drug resistance, which could be the result of nutrient deprivation and hypoxia. heppner and colleagues executed precise experiments confirming that three-dimensional structure and function, rather than simple inaccessibility to nutrients or drug concentration, were responsible for drug resistance. heppner embedded tumor specimens or cell aggregates in collagen gels, exposed the cultures to various cytotoxic drugs, and compared the drug responses of the same cells in monolayers.
these experiments revealed an increased resistance in the three-dimensional tumor arrays a remarkable 1,000-fold greater than in monolayer cultures, and a similar result was seen in three-dimensional histocultures in collagen. the tumor cells grew in the presence of drug concentrations that reduced the viability of monolayers to less than 0.1% of control cultures. strikingly, heppner observed that the cells became sensitive again when replated as monolayers, and finally showed that cells exposed to melphalan and 5-fluorouracil as monolayers were again resistant once transferred to collagen gels, on the basis of three-dimensional architecture. thus, the cells were exposed to the drugs as monolayers, facilitating access to the drugs, and, once the cells were transferred after drug exposure to a three-dimensional structure, high resistance to the drugs was sustained. 38, 78-81 based on the caliber of the data referenced above, teicher et al. 82 serially passaged emt-6 tumors in mice through multiple (10) transfers, treating them with thiotepa, cisplatin, and cyclophosphamide over a prolonged 6-month period, thus producing extremely drug-resistant tumors in vivo. when these tumors were grown as monolayer cultures, they were as drug sensitive as the parental cells. kobayashi and colleagues 83 grew the same in vivo drug-resistant tumor cell lines as spheroids in three-dimensional arrays, and resistance was almost 5,000 times that of the parent line with selected drugs, an example being the active form of cyclophosphamide used in vitro. similarly extreme resistance was also observed to cisplatin and thiotepa. this resistance was not seen in monolayer cultures, even when the monolayers were cultured on traditional extracellular matrix substrates.
these experiments reconfirmed that cells in a three-dimensional array are more drug resistant than monolayer cells in vitro and demonstrated that three-dimensional cellular configurations can and do become resistant to superpharmacologic doses of drugs by forming compact structures. 38

rotating-wall vessel tumor models

several important human tumor models have been created in rotating-wall vessel cultures, specifically lung, prostate, colon, and ovarian. 14, 58, 73, 84 many of these models involve cancers that are leading killers in our society. we present two such examples in this section, colon and prostate carcinoma. as previously reviewed, the literature indicates a remarkable difference between chemotherapeutic cytotoxicity in two-dimensional and three-dimensional cellular constructs, which may be predicated on a number of criteria. therefore, a three-dimensional tumor model that emulates differentiated in vivo-like characteristics would provide unique insights into tumor biology. goodwin et al. 58 detail the first construction of a complex three-dimensional ex vivo tumor in rotating-wall vessel culture composed of a normal mesenchymal base layer (as would be seen in vivo) and either of two established human colon adenocarcinoma cell lines: ht-29, an undifferentiated line, and ht-29km, a stable, moderately differentiated subline of ht-29. each of these engineered tumor tissues produced tissue-like aggregates (tlas) with glandular structures, apical and internal glandular microvilli, tight intercellular junctions, desmosomes, cellular polarity, sinusoid development, internalized mucin, and structural organization akin to normal colon crypt development. necrosis was minimal throughout the tissue masses for up to 60 days of culture, while the masses achieved >1.0 cm in diameter. other notable results included enhanced growth of neoplastic colonic epithelium in the presence of mixed normal human colonic mesenchyme.
these results mimic the cellular differentiation seen in vivo and are similar to results obtained with other tumor types. prostate carcinoma has also been modeled in the rotating-wall vessel system by several investigators. 85-87 one of the most comprehensive descriptions of these engineered tissues is detailed by wang et al. 88 in that review, the authors describe the ability of the rotating-wall vessel system to recapitulate human prostate carcinoma (lncap) and bone stroma (mg63) to illuminate the evolution of prostate tumorigenesis to the metastatic condition. in particular, the lncap and arcap models represented in the review are known to be lethal in the human, being androgen independent and metastatic. rotating-wall vessel tla engineering also allowed in-depth study of epithelial and stromal interactions, which are the facilitating elements of the continuance of lncap prostate-specific antigen production in vitro. when lncap was cultured in three dimensions without stroma, production of prostate-specific antigen ceased and metastatic markers were not observed. the authors outline the process of malignant transformation, demonstrating that these metastatic models are only possible in three-dimensional tlas and are achieved by specific geometric relationships in three-dimensional configuration. furthermore, they show through direct comparison with other culture systems the advantages of the rotating-wall vessel system in allowing synergistic relationships to study this disease state. 88 unlike two-dimensional models, these rotating-wall vessel tumor tissues were devoid of metabolic and nutrient deficiencies and demonstrated in vivo-like architecture. these data suggest that the rotating-wall vessel affords a new model for investigation and isolation of growth, regulatory, and structural processes within neoplastic and normal tissues. in this section, we explore the utility of rotating-wall vessel tlas as targets for microbial infection and disease.
several studies have been conducted recently indicating that three-dimensional tissues respond to infective agents with greater fidelity and with a more in vivo-like response than traditional two-dimensional cultures. nickerson et al. 89 describe the development of a three-dimensional tla engineered from int-407 cells of the human small intestine, which was used as a target for the study of salmonella typhimurium. in this study, three-dimensional tlas were used to study the attachment, invasion, and infectivity of salmonella into human intestinal epithelium. immunocytochemical characterization and scanning and transmission electron microscopic analyses of the three-dimensional tlas revealed that the tlas more accurately modeled human in vivo differentiated tissues than did two-dimensional cultures. the level of differentiation in the int-407 tlas was analogous to that found in previously discussed small intestine tlas 72 and in other organ tissues reconstructed in rotating-wall vessels. analysis of the infectivity studies revealed that salmonella attached and infected in a manner significantly different from that in control two-dimensional cultures. during an identical exposure period of infection with salmonella, the three-dimensional tlas displayed only a minor loss of structural integrity when compared with the two-dimensional int-407 cultures. furthermore, salmonella demonstrated a greatly reduced ability to adhere, invade, and induce apoptosis in these int-407 three-dimensional tlas compared with two-dimensional cultures. this result is not unlike the in vivo human response. two-dimensional cultures were significantly damaged within several hours of contact with the bacteria; conversely, although "pot marks" could be seen on the surfaces of the three-dimensional tlas, they remained structurally sound.
cytokine analysis and expression postinfection of three-dimensional tlas and two-dimensional cultures with salmonella exhibited remarkable differences in expressed levels of interleukin (il)-1α, il-1β, il-6, il-1ra, and tumor necrosis factor-α mrnas. additionally, noninfected three-dimensional tlas constitutively demonstrated elevated levels of tgf-β1 mrna and prostaglandin e2 compared with noninfected two-dimensional cultures of int-407. 89 as previously stated, traditional two-dimensional cell monolayers lack adequate fidelity to emulate the infection dynamics of in vivo microbial adhesion and invasion. the respiratory epithelium is of critical importance in protecting humans from disease. exposed to the environment, the respiratory epithelium acts as a barrier to invading microbes present in the air, defending the host through a multilayered complex system. 90 the three major layers of the human respiratory epithelium are pseudostratified epithelial cells, a basement membrane, and underlying mesenchymal cells. ciliated, secretory, and basal epithelial cells are connected by intercellular junctions and anchored to the basement membrane through desmosomal interactions. together with tight junctions and the mucociliary layer, the basement membrane maintains the polarity of the epithelium and provides a physical barrier between the mesenchymal layer and the airway. 91, 92 infiltrating inflammatory and immune cells move freely between the epithelial and subepithelial compartments. airway epithelial cells play a vital role in host defense 90 by blocking paracellular permeability and modulating airway function through cellular interactions. ciliated epithelial cells block invasion of countless inhaled microorganisms by transporting them away from the airways.
93 as regulators of the innate immune response, epithelial cells induce potent immunomodulatory and inflammatory mediators such as cytokines and chemokines that recruit phagocytic and inflammatory cells, which remove microbes and enhance protection. 90, 91, 94, 95 ideally, cell-based models should reproduce the structural organization, multicellular complexity, differentiation state, and function of the human respiratory epithelium. immortalized human epithelial cell lines, such as beas-2b, 96 primary normal human bronchial epithelial cells, 97 and air-liquid interface cultures 98 are used to study respiratory virus infections in vitro. traditional monolayer cultures (two-dimensional) of immortalized human bronchoepithelial cells represent homogeneous lineages. although growing cells in monolayers is convenient and proliferation rates are high, such models lack the morphology and the cell-cell and cell-matrix interactions characteristic of human respiratory epithelia. thus, their state of differentiation and intracellular signaling pathways most likely differ from those of epithelial cells in vivo. primary cell lines of human bronchoepithelial cells provide a differentiated model similar in structure and function to epithelial cells in vivo; however, this state is short lived in vitro. 97, 99 air-liquid interface cultures of primary human bronchoepithelial cells (or submerged cultures of human adenoid epithelial cells) are grown on collagen-coated filters in wells on top of a permeable filter. these cells receive nutrients basolaterally, and their apical side is exposed to humidified air. the result is a culture of well-differentiated heterogeneous (ciliated, secretory, basal) epithelial cells essentially identical to airway epithelium in situ.
98, 100 although this model shows fidelity to the human respiratory epithelium in structure and function, maintenance of consistent cultures is not only difficult and time consuming but also limited to small-scale production, which limits industrial research capability. true cellular differentiation involves sustained complex cellular interactions 101-103 in which cell membrane junctions, extracellular matrices (e.g., basement membrane and ground substances), and soluble signals (endocrine, autocrine, and paracrine) play important roles. 104-107 this process is also influenced by the spatial relationships of cells to each other. each epithelial cell has three membrane surfaces: a free apical surface, a lateral surface that connects neighboring cells, and a basal surface that interacts with mesenchymal cells. 108 recently, viral studies by goodwin et al. 109 and suderman et al. 110 were conducted with rotating-wall vessel-engineered tla models of normal human lung. this model is composed of a coculture of in vitro three-dimensional human bronchoepithelial tlas engineered using a rotating-wall vessel to mimic the characteristics of in vivo tissue and to provide a tool to study human respiratory viruses and host-pathogen cell interactions. the tlas were bioengineered onto collagen-coated cyclodextran beads using primary human mesenchymal bronchial-tracheal cells as the foundation matrix and an adult human bronchial epithelial immortalized cell line (beas-2b) as the overlying component. the resulting tlas share significant characteristics with in vivo human respiratory epithelium, including polarization, tight junctions, desmosomes, and microvilli. the presence of tissue-like differentiation markers, including villin, keratins, and specific lung epithelium markers, as well as the production of tissue mucin, further confirms that these tlas differentiated into tissues functionally similar to in vivo tissues.
increasing virus titers for human respiratory syncytial virus (wtrsva2) and parainfluenza virus type 3 (wtpiv3 js) and the detection of membrane-bound glycoproteins (f and g) over time confirm productive infections with both viruses. viral growth kinetics up to day 21 pi with wtrsva2 and wtpiv3 js were as follows: wtpiv3 js replicated more efficiently than wtrsva2 in tlas. peak replication was on day 7 for wtpiv3 js (approximately 7 log10 plaque-forming units [pfu] per milliliter) and on day 10 for wtrsva2 (approximately 6 log10 pfu/ml). viral proliferation remained high through day 21, when the experiments were terminated. viral titers for severe acute respiratory syndrome-coronavirus were approximately 2 log10 pfu/ml at 2 days pi. human lung tlas mimic aspects of the human respiratory epithelium well and provide a unique opportunity to study the host-pathogen interaction of respiratory viruses and their primary human target tissue independent of the host's immune system, as there can be no secondary response without the necessary immune cells. these rotating-wall vessel-engineered tissues represent a valuable tool in the quest to develop models that allow analysis and investigation of cancers and infectious disease in models engineered with human cells alone. we have explored the creation of three-dimensional tlas for normal and neoplastic studies and finally as targets for microbial infections. perhaps carrel and leighton would be fascinated to know that from their early experiments in three-dimensional modeling and the contributions they made has sprung the inventive spirit to discover a truly space age method for cellular recapitulation.

- the biochemical basis of pulmonary function
- the pathology of bronchial asthma
- three-dimensional histoculture: origins and applications in cancer research
- histoculture of human breast cancers
- studies on the propagation in vitro of poliomyelitis viruses. iv. viral multiplication in a stable strain of human malignant epithelial cells (strain hela) derived from an epidermoid carcinoma of the cervix
- effects of paramyxoviral infection on airway epithelial cell foxj1 expression, ciliogenesis, and mucociliary function
- culture of animal cells: a manual of basic technique
- preclinical models of lung cancer: cultured cells and organ culture
- human papillomavirus and the development of cervical cancer: concept of carcinogenesis
- relationship of alveolar epithelial injury and repair to the induction of pulmonary fibrosis
- generation of human pulmonary microvascular endothelial cell lines
- promotion of mitochondrial membrane complex assembly by a proteolytically inactive yeast lon
- culture and transformation of human airway epithelial cells
- a mechanism of hyperthermia-induced apoptosis in ras-transformed lung cells
- smoke/burn injury-induced respiratory failure elicits apoptosis in ovine lungs and cultured lung cells, ameliorated with arteriovenous co2 removal
- clonal growth of epithelial cells from normal adult human bronchus
- experimental systems for mechanistic studies of toxicant-induced lung inflammation
- regulation of vasodilator synthesis during lung development
- markers of airway smooth muscle cell phenotype
- airway smooth muscle cell proliferation: characterization of subpopulations by sensitivity to heparin inhibition
- divergent differentiation paths in airway smooth muscle culture: induction of functionally contractile myocytes
- airway wall remodelling and hyperresponsiveness: modelling remodelling in vitro and in vivo
- nci series of cell lines: an historical perspective
- establishment and characterisation of cell lines from patients with lung cancer (predominantly small cell carcinoma)
- cell culture methods for the establishment of the nci series of lung cancer cell lines
- extracellular matrix proteins protect small cell lung cancer cells against apoptosis: a mechanism for small cell lung cancer growth and drug resistance in vivo
- radioimmunoassay of pulmonary surface-active material in the tracheal fluid of the fetal lamb
- selected anatomic burn pathology review for clinicians and pathologists
- overview of ventilator-induced lung injury mechanisms
- surfactant protein-a levels in patients with acute respiratory distress syndrome
- positive end-expiratory pressure modulates local and systemic inflammatory responses in a sepsis-induced lung injury model
- injurious mechanical ventilation and end-organ epithelial cell apoptosis and organ dysfunction in an experimental model of acute respiratory distress syndrome
- alveolar inflation during generation of a quasi-static pressure/volume curve in the acutely injured lung
- evaluation of mechanical ventilation for patients with acute respiratory distress syndrome as a result of interstitial pneumonia after renal transplantation
- primary cell culture of human type ii pneumonocytes: maintenance of a differentiated phenotype and transfection with recombinant adenoviruses
- collagen-coated cellulose sponge: three dimensional matrix for tissue culture of walker tumor 256
- histophysiologic gradient culture of stratified epithelium
- the three-dimensional question: can clinically relevant tumor drug resistance be measured in vitro?
- structural biology of epithelial tissue in histophysiologic gradient culture
- tissue culture in relation to growth and differentiation
- the growth, development and phosphatase activity of embryonic avian femora and end-buds cultivated in vitro
- organ culture of mucosal biopsies of human small intestine
- growth of nodular carcinomas in rodents compared with multicell spheroids in tissue culture
- dialysis fermentor systems for concentrated culture of microorganisms
- semipermeable microcapsules
- application of ultrafiltration for enzyme retention during continuous enzymatic reaction
- cell culture on artificial capillaries: an approach to tissue growth in vitro
- membranes and bioreactors: a technical challenge in biotechnology
- membrane bioreactors: engineering aspects
- membrane bioreactors: present and prospects
- artificial membrane as carrier for the immobilization of biocatalysts
- hollow fiber cell culture in industry
- hollow fiber enzyme reactors
- engineering of osteochondral tissue with bone marrow mesenchymal progenitor cells in a derivatized hyaluronan-gelatin composite sponge
- haemopoietic long-term bone marrow cultures from adult mice show osteogenic capacity in vitro on 3-dimensional collagen sponges
- microcarrier cultures of animal cells
- mammalian cell culture: engineering principles and scale-up
- morphologic differentiation of colon carcinoma cell lines ht-29 and ht-29km in rotating-wall vessels
- hydrodynamic effects on animal cells in microcarrier cultures
- physical mechanisms of cell damage in microcarrier cell culture bioreactors
- growth and death in overagitated microcarrier cell cultures
- cell death in the thin films of bursting bubbles
- growth of anchorage-dependent cells on macroporous microcarriers
- viscous reduction of turbulent damage in animal cell culture
- biological experiences in bubble-free aeration system
- critical parameters in the microcarrier culture of animal cells
- the large-scale cultivation of mammalian cells
- analysis of gravity-induced particle motion and fluid perfusion flow in the nasa-designed rotating zero-head-space tissue culture vessel
- cell culture for three-dimensional modeling in rotating-wall vessels: an application of simulated microgravity
- autocrine and paracrine roles for growth factors in melanoma
- growth-regulatory factors for normal, premalignant, and malignant human cells in vitro
- reduced shear stress: a major component in the ability of mammalian tissues to form three-dimensional assemblies in simulated microgravity
- three-dimensional culture of a mixed mullerian tumor of the ovary: expression of in vivo characteristics
- three-dimensional modeling of t-24 human bladder carcinoma cell line: a new simulated microgravity vessel
- multicellular resistance: a new paradigm to explain aspects of acquired drug resistance of solid tumors
- comparison of cytotoxicity of agents on monolayer and spheroid systems
- modified 2-tumour (l1210, colon 38) assay to screen for solid tumor selective agents
- significance of three-dimensional growth patterns of mammary tissues in collagen gels
- assessing tumor drug sensitivity by a new in vitro assay which preserves tumor heterogeneity and subpopulation interactions
- factors affecting growth and drug sensitivity of mouse mammary tumor lines in collagen gel cultures
- origin and deposition of basement membrane heparan sulfate proteoglycan in the developing intestine
- tumor resistance to alkylating agents conferred by mechanisms operative only in vivo
- acquired multicellular-mediated resistance to alkylating agents in cancer
- heat sterilisation to inactivate aids virus in lyophilised factor viii
- three-dimensional growth patterns of various human tumor cell lines in simulated microgravity of a nasa bioreactor
- long term organ culture of human prostate tissue in a nasa-designed rotating wall bioreactor
- establishment of a three-dimensional human prostate organoid coculture under microgravity-simulated conditions: evaluation of androgen-induced growth and psa expression
- three-dimensional coculture models to study prostate cancer growth, progression, and metastasis to bone
- three-dimensional tissue assemblies: novel models for the study of salmonella enterica serovar typhimurium pathogenesis
- series introduction: innate host defense of the respiratory epithelium
- the airway epithelium: structural and functional properties in health and disease
- apicobasal polarization: epithelial form and function
- robbins infectious diseases, 6th ed. philadelphia: wb saunders
- epithelial regulation of innate immunity to respiratory syncytial virus
- epithelial cells as regulators of airway inflammation
- human bronchial epithelial cells with integrated sv40 virus t antigen genes retain the ability to undergo squamous differentiation
- identification and culture of human bronchial epithelial cells
- developing differentiated epithelial cell cultures: airway epithelial cells
- evaluation of anchorage-independent proliferation in tumorigenic cells using the redox dye alamar blue
- airway epithelium and mucus: intracellular signaling pathways for gene expression and secretion
- disorganization of stroma alters epithelial differentiation of the glandular stomach in adult mice
- expression and function of cell surface extracellular matrix receptors in mouse blastocyst attachment and outgrowth
- milk protein expression and ductal morphogenesis in the mammary gland in vitro: hormone-dependent and -independent phases of adipocyte-mammary epithelial cell interaction
- cell replication of mesenchymal elements in adult tissues. i. the replication and migration of mesenchymal cells in the adult rabbit dermis
- defining conditions to promote the attachment of adult human colonic epithelial cells
- laminin expression in colorectal carcinomas varying in degree of differentiation
- influence of mammary cell differentiation on the expression of proteins encoded by endogenous balb/c mouse mammary tumor virus genes
- opinion: building epithelial architecture: insights from three-dimensional culture models
- three-dimensional engineered high fidelity normal human lung tissue-like assemblies (tla) as targets for human respiratory virus infection
- severe acute respiratory syndrome (sars)-cov infection in a three-dimensional human bronchial-tracheal (hbte) tissue-like assembly

key: cord-232238-aicird98 authors: ferrario, andrea; loi, michele title: a series of unfortunate counterfactual events: the role of time in counterfactual explanations date: 2020-10-09 journal: nan doi: nan sha: doc_id: 232238 cord_uid: aicird98

counterfactual explanations are a prominent example of post-hoc interpretability methods in the explainable artificial intelligence research domain. they provide individuals with alternative scenarios and a set of recommendations to achieve a sought-after machine learning model outcome. recently, the literature has identified desiderata of counterfactual explanations, such as feasibility, actionability and sparsity, that should support their applicability in real-world contexts. however, we show that the literature has neglected the problem of the time dependency of counterfactual explanations. we argue that, due to their time dependency and because of the provision of recommendations, even feasible, actionable and sparse counterfactual explanations may not be appropriate in real-world applications.
this is due to the possible emergence of what we call "unfortunate counterfactual events." these events may occur due to the retraining of machine learning models whose outcomes have to be explained via counterfactual explanations. series of unfortunate counterfactual events frustrate the efforts of those individuals who successfully implemented the recommendations of counterfactual explanations. this negatively affects people's trust in the ability of institutions to provide machine learning-supported decisions consistently. we introduce an approach to address the problem of the emergence of unfortunate counterfactual events that makes use of histories of counterfactual explanations. in the final part of the paper we propose an ethical analysis of two distinct strategies to cope with the challenge of unfortunate counterfactual events. we show that they respond to an ethically responsible imperative to preserve the trustworthiness of credit lending organizations, the decision models they employ, and the social-economic function of credit lending. the provision of explanations of machine learning model outcomes-also called post-hoc explanations-is key in the domain of explainable artificial intelligence (xai) [3, 12, 16]. post-hoc explanations are interfaces between humans and the machine learning model that are "both an accurate proxy of the decision maker [i.e., the model] and comprehensible to humans" [6]. they are invoked in relation to the need to 1) audit and improve machine learning models by supporting their interpretability, 2) enable learning from data by discovering previously unknown patterns, and 3) establish compliance with legislation and legal requirements [18, 25, 31]. counterfactual explanations [30] are a class of post-hoc interpretability explanations that provide the person subjected to a machine learning model-generated decision with 1) understandable information on the model outcome, and 2) a strategy to achieve an alternative (future) one.
they are an example of "contrastive explanations in xai" [15, 18]: they explain a given model outcome by sharing a "what-if" alternative scenario comprising "feature-perturbed versions of the same persons [i.e., of the original instance]" [19]. recent literature from the xai domain has discussed selected desiderata that may support the applicability of counterfactual explanations in real-world machine learning model pipelines [10, 13, 14, 19, 22]. the desiderata of feasibility, actionability and sparsity make it possible to generate and share cognitively accessible counterfactual explanations that respect causal models between features and suggest actionable strategies whose alternative scenarios comprise a limited number of features. however, at the basis of any discussion on post-hoc explanations lies the assumption that the machine learning model whose outcomes have to be explained remains "stable", or does not change, in a given time frame of interest [2, 9, 19]. in this paper, we argue that this assumption is violated in most real-world applications, where machine learning models are retrained with frequencies that depend on the application under consideration. we show that this phenomenon affects the provision and management of counterfactual explanations to a great extent. in fact, the times 1) of the provision of the explanation, and 2) of the successful implementation of its scenario by an interested individual are, in general, different. this time delay may lead to the emergence of unfavorable cases-called "unfortunate counterfactual events" (uce) in these notes-where the retraining of the machine learning model invalidates the efforts of an individual who successfully implemented the scenario originally recommended by a feasible, actionable and possibly sparse counterfactual explanation.
moreover, we introduce an approach to address the problem of the emergence of uces that makes use of all those counterfactual explanations that have been shared with the affected individuals until the point of time of model retraining. finally, we discuss the emergence of uces from the perspective of the relation between the institution providing machine learning-generated decisions (e.g., a bank with an algorithmic credit lending system) and their post-hoc explanations, and the people affected by those decisions (e.g., those people asking for a loan). we argue that series of uces are detrimental to trust in the institutions and in ai [5, 11, 20, 29, 33]. therefore, we focus on the example of the credit lending function, and we propose an ethical analysis of two actionable strategies (and their combination). these make it possible to preserve trustworthiness in the lending institutions, their machine learning models and the social-economic function of credit lending, while providing counterfactual explanations to potential credit borrowers. counterfactual explanations [30] are explanations of machine learning model outcomes that provide people with a scenario describing a state of the world-called "closest world" [30]-in which an individual would have received an alternative machine learning outcome. for example, they explain to an individual why he or she did not receive a bank loan by providing the "what-if" scenario: "you would have received the loan if your income was higher by $10,000" [19]. this "what-if" scenario shows that an alternative outcome can be reached by altering the values of a subset of the features describing the instance at hand (i.e., the data point of the individual asking for an explanation of the denied loan, in the above example) [18, 30]. for this reason, counterfactual explanations are an example of model-agnostic "feature-highlighting explanations" [2].
not only do they provide a human-interpretable [30] explanation of a machine learning outcome, but they outline a strategy (footnote 1), or "recommendation," to achieve an alternative, and possibly favorable, one, through the provision of a "what-if" scenario. this aspect differentiates counterfactual explanations from more descriptive machine learning model outcome explanation methods, such as local interpretable model-agnostic explanations [23] and shapley values [28]. the counterfactual scenario provided by a counterfactual explanation is a "hypothetical point that is classified differently from the point currently in question" [2]. we call it a "counterfactual" (data point) for simplicity. the counterfactuals are algorithmically generated [30] by "identifying the features that, if minimally changed, would alter the output [i.e., the current outcome] of the model" [2]. in fact, the original algorithm to generate counterfactuals by wachter et al. [30] solves the minimization problem

\min_{x'} \; \ell(h(x'), y') + \lambda \, d(x, x') \quad (1)

where x' denotes the counterfactual data point, y' the desired outcome for it (i.e., alternative to the one of the original instance x to be explained), h is the machine learning model, \ell(\cdot,\cdot) a loss function, and d a distance measure. the first term in (1) encodes the counterfactual condition, i.e., the search for the alternative outcome y', while the second term keeps the counterfactual "close" to the original instance x. by definition, counterfactual explanations are model agnostic, providing, at the same time, a degree of protection to companies' intellectual property through the disclosure of only a select set of features to third parties [2]. moreover, they comply with legal requirements on explanations in both europe and the united states [2]. for these reasons, they have begun to attract the interest of different sectors of society, such as businesses, regulators, and legal scholars [2].
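the minimization (1) can be made concrete with a minimal numerical sketch. this is not wachter et al.'s implementation: we assume a toy logistic model h, a squared-error loss, the l1 distance as d, and plain numerical gradient descent; all names and values are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def find_counterfactual(h, x, y_target, lam=0.1, lr=0.05, steps=500):
    """Minimize ell(h(x'), y') + lam * d(x, x') as in (1), with squared-error
    loss and L1 distance, via numerical gradient descent. A sketch only."""
    x_cf = x.copy()
    eps = 1e-4
    def objective(v):
        return (h(v) - y_target) ** 2 + lam * np.sum(np.abs(v - x))
    for _ in range(steps):
        grad = np.zeros_like(x_cf)
        for i in range(len(x_cf)):
            e = np.zeros_like(x_cf)
            e[i] = eps
            # central-difference estimate of the partial derivative
            grad[i] = (objective(x_cf + e) - objective(x_cf - e)) / (2 * eps)
        x_cf -= lr * grad
    return x_cf

# toy model: "approve" (output near 1) when the first feature is high enough
w, b = np.array([2.0, 0.0]), -1.0
h = lambda v: sigmoid(v @ w + b)
x = np.array([0.0, 0.0])                      # rejected instance, h(x) < 0.5
x_cf = find_counterfactual(h, x, y_target=1.0)
```

note that, because d is the l1 distance, the second (irrelevant) feature is left untouched, which is the sparsity-inducing effect discussed below.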
recently, the literature on counterfactual explanations has focused on selected desiderata, namely feasibility, actionability and sparsity [10, 13, 14, 19, 22]. these desiderata are deemed relevant to facilitate the applicability of counterfactual explanations in real-world applications that make use of machine learning-generated predictions. in fact, feasible, actionable and sparse counterfactual explanations recommend causality-consistent scenarios that can be implemented by the impacted individual, once he or she acts on the values of a limited number of features. let us discuss this point in some detail. counterfactual explanations are said to be feasible [22, 30] if they propose a scenario that respects the causal model [14, 21] underlying the data. we note that the original algorithm (1) to compute counterfactual explanations does not implement feasibility constraints. recent approaches aim at ensuring the feasibility of counterfactual explanations by implementing post-hoc constraints on a set of generated counterfactuals. these constraints are originally introduced by domain experts to encode known causal relations between features [19, 22]. let us consider a feasible counterfactual explanation. we say that it is actionable [19, 22] if the corresponding scenario can be reasonably implemented by the individual whose outcome is explained by the provision of the counterfactual explanation. clearly, actionability is context-dependent: in particular, it depends on the capabilities of the individual implementing the counterfactual scenario. considering mothilal et al.'s example again [19], increasing the yearly income by $10,000 may be a relatively easy task for affluent individuals. however, it may represent a daunting challenge for low-income ones. considering actionable counterfactual explanations allows excluding explanations that, although feasible, propose scenarios whose implementation is practically not realizable.
lastly, sparsity is the property of those counterfactual explanations whose scenarios suggest altering only the values of a few variables [10, 19]. mothilal et al. argue that "intuitively, a counterfactual example will be more feasible if it makes changes to fewer number of features" [19]. in other words, sparse counterfactuals "differ from the original datapoint in a small number of factors, making the change easier to comprehend" [24]. therefore, they are deemed to be cognitively accessible. sparsity becomes an important desideratum of counterfactual explanations, especially in big data contexts. wachter et al. aim at ensuring the sparsity of counterfactual explanations by using an l1 distance measure in (1) [30]. more recent studies have addressed sparsity by means of a two-step approach making use of a "growing spheres" algorithm [10] and the use of a "post-hoc operation to restore the value of continuous features back to their values in x [the input data point] greedily until the predicted class [...] changes" [19]. by definition, counterfactual explanations and the different algorithms generating counterfactuals make use of a given (and fixed) machine learning model h. however, in most real-world applications, machine learning models, their predictions and explanations depend on time. this basic remark has relevant repercussions on the applicability of counterfactual explanations in real-world use cases and their use as a reliable post-hoc interpretability method. we discuss this point in the forthcoming sections. machine learning models [17] are inherently dynamic objects. they are designed to perform a task, such as the binary classification of a bank's customers into "creditworthy" and "not creditworthy", by learning on data. this process is referred to as "training", or "learning" [17].
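before moving on, the greedy "post-hoc restore" operation for sparsity quoted above can be illustrated with a short sketch. this is a simplification, not the method of [19]; in particular, trying the smallest perturbations first is our own heuristic assumption.

```python
import numpy as np

def sparsify(h, x, x_cf, threshold=0.5):
    """Greedily restore perturbed features of the counterfactual x_cf back to
    their original values in x, keeping a restoration only if the predicted
    class (h >= threshold) is preserved. A sketch of the quoted idea."""
    x_sparse = x_cf.copy()
    target = h(x_cf) >= threshold
    for i in np.argsort(np.abs(x_cf - x)):   # smallest changes first (heuristic)
        trial = x_sparse.copy()
        trial[i] = x[i]
        if (h(trial) >= threshold) == target:
            x_sparse = trial                 # class preserved: keep the restoration
    return x_sparse

# toy model depending on feature 0 only; feature 1 was changed needlessly
h = lambda v: 1.0 / (1.0 + np.exp(-(2.0 * v[0] - 1.0)))
x, x_cf = np.array([0.0, 0.0]), np.array([2.0, 1.5])
x_sparse = sparsify(h, x, x_cf)   # restores feature 1, keeps feature 0
```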
after training, and depending on the application, (trained) machine learning models are deployed in it architectures where they are fed batches of new data to generate predictions (footnote 3), also referred to as outcomes, and support human decision-making. typically, the training of machine learning models does not occur only once, i.e., just before their deployment. in fact, the process can be periodically repeated whenever new batches of data are made available and the performance of the machine learning model degrades. this happens as the model often generates predictions in changing environments, whose evolution is not encoded in the dataset originally used for its training. for example, in e-commerce new products become available and can be recommended on an online marketplace platform. as a result, time affects machine learning models, their predictions and the subsequent post-hoc explanations. kroll et al. [9] warn against a plain search for machine learning model transparency, as "systems that change over time cannot be fully understood through transparency alone" [9]. moreover, in these cases "transparency alone does little to explain either why any particular decision was made or how fairly the system operates across bases of users or classes of queries" [9]. considering systems with a high frequency of retraining, such as algorithms serving e-commerce or social media platforms, "there is the added risk that the rule disclosed is obsolete by the time it can be analyzed" [9]. as commented in section 2, feasible and actionable counterfactual explanations not only describe a scenario in which the user could have achieved an alternative (and preferred) outcome in understandable terms, but they highlight an actionable strategy to achieve it. in the case of sparse counterfactual explanations, this strategy focuses on altering the values of a limited number of variables.
clearly, the points of time at which 1) the explanation is generated and shared with the impacted individual (footnote 4), and 2) its recommended scenario is successfully achieved by him or her, may differ. we argue that this time dependency of counterfactual explanations may represent a sensitive risk to their applicability. (footnote 3: the model assigns, e.g., the label "creditworthy" or "not creditworthy". we also note that most machine learning models endow their predictions with a confidence score, which is interpreted as an empirical probability.) (footnote 4: without loss of generality, in these notes we assume that the explanations are generated and shared with an interested individual at the same point of time.) let us elaborate this point by describing the occurrence of an "unfortunate counterfactual event" (uce). we refer to the appendix for a discussion of all possible cases emerging from the analysis of the time dependency of counterfactual explanations. in an uce, a feasible, actionable and possibly sparse counterfactual scenario is shared with an individual who received a machine learning outcome y_0 at time t_0. the individual successfully implements the recommendations of the counterfactual scenario at a given time t_1 > t_0, through the investment of a certain amount of resources (in primis, time). finally, at time t_2 >= t_1, the machine learning model is again requested to compute a prediction for the same individual: the predicted outcome y_1 is the same one from t_0, i.e., y_1 = y_0 (footnote 5). in fact, in the time interval between t_0 and t_1, the machine learning model has been retrained. in particular, the retrained model did not properly learn the counterfactual scenario suggested at t_0, providing a wrong outcome for it. in summary, in the case of an uce, all the efforts spent by the individual to implement the recommendations of the counterfactual explanation are frustrated by the deployment of a retrained machine learning model that did not properly learn the counterfactual and its alternative outcome.
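the uce timeline just described can be simulated minimally. the models below are deliberately trivial income thresholds and all figures are illustrative; the point is only the sequence of events, not any realistic credit model.

```python
# A toy timeline of an unfortunate counterfactual event (UCE).
def model_t0(income):
    """Model in production at time t0."""
    return "creditworthy" if income >= 50_000 else "not creditworthy"

def model_retrained(income):
    """Model retrained between t0 and t1; the decision boundary has shifted."""
    return "creditworthy" if income >= 65_000 else "not creditworthy"

y0 = model_t0(45_000)            # t0: the individual is rejected (outcome y0)
# counterfactual shared at t0: "you would be approved with income >= 50,000"
income_t1 = 55_000               # t1: the recommendation is implemented
y1 = model_retrained(income_t1)  # t2 >= t1: the retrained model decides again
# y1 == y0: the effort spent implementing the recommendation is frustrated
```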
at the time of writing, no actionable solution to the emergence of unfortunate counterfactual events has been proposed in the literature. in fact, the effects of the time dependency of counterfactual explanations on their generation and provision have not yet been structurally investigated. barocas et al. [2] discuss four key assumptions of feature-highlighting explanations, such as counterfactual explanations [2]. one assumption is that "the model is stable over time, monotonic, and limited to binary outcomes" [2]. stability over time is not further specified, and may be interpreted as the absence of retraining or of changes to selected model properties. similarly, mothilal et al. [19] argue that counterfactual explanations provide the information on "what to do to obtain a better outcome in the future" [19], but only "assuming that the algorithm remains relatively static" [19]. however, it is not clear how the structural stability of a model relates to the counterfactual explanations of model outcomes and allows avoiding the occurrence of uces. on the other hand, barocas et al. [2], when discussing counterfactual explanations in credit lending, highlight that wachter et al. have thus argued that the law should treat a counterfactual explanation as a promise rather than just an explanation. they argue that if a rejected applicant makes the recommended changes, the promise should be honored and the credit granted, irrespective of the changes to the model that have occurred in the meantime [2]. (footnote 5: in general, we note that the digital representations of individuals may change over time: some attributes are constant (e.g., the date of birth), others are slowly changing (e.g., age, education level), others may seldom change (e.g., the nationality), while others may change with high frequency (e.g., the amount of money on a credit card account). in these notes, to simplify the discussion, we consider the updated data point of the individual at time t_1 to be equal to the (successfully implemented) counterfactual scenario.) more specifically, to this end wachter et al. propose: data controllers could contractually agree to provide the data subject with the preferred outcome if the terms of a given counterfactual were met within a specified period of time [30]. these contractual agreements are the legal counterpart of what we call the counterfactual "commitment". however, neither the nature of these contractual agreements nor their applicability (for example, considering the scenario where hundreds of customers of a company require counterfactual explanations every semester) are further discussed. how can the contractual agreement with data subjects represented by the provision of a counterfactual explanation and its recommendations be honored? as long as counterfactual explanations are implemented in a computer system at a company's premises, refraining from the retraining of machine learning models to avoid the possibility of a series of unfortunate counterfactual events is not practically feasible. as an alternative approach, we suggest making use of the counterfactual explanations themselves during retraining (footnote 6). we argue that one could perform retraining of the machine learning model on data that include, in particular, all the counterfactual scenarios that have been generated and shared with third parties until that moment, together with their alternative outcomes. the idea is that, during retraining, the model would try to learn the alternative outcomes for all the counterfactual points, together with other data deemed relevant for retraining. we show this proposal in figure 1: the machine learning model, originally trained on data d_0 at t_0, is retrained at t_1, using the updated training dataset d_1, new data, and the counterfactual scenarios c_1, ..., c_n generated and shared with third parties until t_1. (footnote 6: in what follows, we consider only scenarios of feasible and actionable counterfactual explanations.)
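the retraining-with-counterfactuals proposal of figure 1 can be sketched as follows. the classifier, the data and all names are illustrative assumptions (the paper does not prescribe a model family); the sketch also shows, in this toy case, how plain retraining produces a uce while the augmented retraining honors the commitment.

```python
import numpy as np

class NearestCentroid:
    """Deliberately minimal classifier, used only to make the sketch runnable."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self
    def predict(self, X):
        # distance of each point to each class centroid; pick the nearest
        d = np.linalg.norm(np.asarray(X)[:, None, :] - self.centroids_[None, :, :], axis=2)
        return self.classes_[d.argmin(axis=1)]

def retrain_with_counterfactuals(X_new, y_new, cf_points, cf_outcomes):
    """Retrain on the new data augmented with every counterfactual scenario
    shared so far, each labelled with its promised alternative outcome."""
    X = np.vstack([X_new, cf_points])
    y = np.concatenate([y_new, cf_outcomes])
    return NearestCentroid().fit(X, y)

X_new = np.array([[0.0], [0.5], [4.0], [5.0]])   # hypothetical data at t1
y_new = np.array([0, 0, 1, 1])
cf_points = np.array([[2.0]])                    # scenario shared at t0 ...
cf_outcomes = np.array([1])                      # ... with promised outcome 1

plain = NearestCentroid().fit(X_new, y_new)      # ignores the commitment
augmented = retrain_with_counterfactuals(X_new, y_new, cf_points, cf_outcomes)
```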
as machine learning is inherently probabilistic [17], the promise to honor the contractual agreements mentioned by wachter et al. and represented by the counterfactual scenarios would hold only with a certain degree of certainty (footnote 7). the quality of the agreements would then be assessed by a validation of the retrained model. therefore, a "good" retrained model would be characterized not only by a satisfactory level of performance, but also by a high degree of certainty in honoring the counterfactual scenario agreements held up to the time of retraining. our proposal is a form of data augmentation, which is a commonly used technique in machine learning [1, 27, 32]. moreover, it is simple to implement and does not alter the retraining algorithmic procedures by adding constraints to the optimization routines used for learning. on the other hand, by augmenting data we make use of "artificial" data points, that is, the counterfactual scenarios, which may, by definition, correspond to fictitious individuals (e.g., customers, patients, convicts etc.) (footnote 8). this has the effect of altering the underlying distribution and the composition of the data, i.e., a portfolio of customers or cohorts of patients or convicts, used to retrain the model. however, we argue that, although fictitious, the data points encoding the counterfactual scenarios may represent possible individuals. these become certain to the institution using the machine learning model at the moment at which the individual impacted by the original counterfactual explanation successfully implements the recommended scenario and requests a new prediction for himself or herself from the model. in view of the above data augmentation approach, we argue that the ontological status of counterfactual scenarios of feasible and actionable counterfactual explanations advocates for their use in the retraining of machine learning models.
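the validation step just described (checking the degree of certainty with which a retrained model honors past counterfactual agreements) could be sketched as follows. the function name, the toy model and the notion of "honored" as "the promised outcome is the most likely one" are our assumptions.

```python
import numpy as np

def commitment_report(predict_proba, cf_points, cf_outcomes):
    """For each stored counterfactual, the degree of certainty p that the
    retrained model assigns to the promised outcome, plus the share of
    commitments honored (promised outcome predicted as most likely).
    `predict_proba` is assumed to return P(outcome = 1)."""
    p1 = np.array([predict_proba(c) for c in cf_points])
    p_promised = np.where(np.asarray(cf_outcomes) == 1, p1, 1.0 - p1)
    honored_share = float(np.mean(p_promised > 0.5))
    return p_promised, honored_share

# toy retrained model: each "point" is already a score in [0, 1]
predict_proba = lambda c: c
p, share = commitment_report(predict_proba, [0.9, 0.2, 0.7], [1, 1, 0])
```

a "good" retrained model, in the sense of the text, would be one whose `share` is high on the commitments held up to the time of retraining.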
therefore, the correct estimation of their outcomes by a retrained machine learning model becomes the degree of protection from the aforementioned series of unfortunate counterfactual events. depending on the application, the class distribution of training data, e.g., the number of creditworthy vs. not creditworthy customers in the credit lending case, may be imbalanced (footnote 9). as an effect, considering counterfactual scenarios and their outcomes in the training data at time t_1 may increase class imbalance (footnote 10). in that case, standard machine learning techniques to cope with class imbalance, such as class-weighted learning, sub-sampling or oversampling [4, 7, 8], may be taken into account. (footnote 7: as noted in footnote 3, each machine learning outcome is endowed with a score p (normalized in the interval [0,1]) that is interpreted as a probability, or degree of certainty, in the corresponding outcome. therefore, in the case of a counterfactual scenario, the corresponding outcome is honored with degree of certainty p.) (footnote 8: in general, these artificial data points may, or may not, belong to the dataset used to retrain the machine learning model in the first instance; therefore the adjective "artificial." however, this depends on the specific implementation of the algorithmic search for counterfactuals as in equation (1). for example, all counterfactuals would belong to the training dataset (therefore being "artificial" no more) if the search for counterfactuals is limited to the samples in the training dataset with an outcome alternative to the one of the original instance to be explained. this is the case of algorithm 1 in [10]. no such constraint is mentioned in wachter et al.'s paper [30].) finally, we argue that the probability of a counterfactual scenario being successfully implemented by an individual correlates positively with the actionability of the corresponding counterfactual explanation. therefore, companies could envisage the development of scoring systems to assess the probability of successful implementation of counterfactual scenarios, once a sufficient number of cases is collected. this information could be used to assist the validation of the retrained machine learning models, in order to select those models that correctly learn counterfactual scenarios with high implementation probabilities. the availability of models that are consistent with sets of counterfactual decisions is of the utmost importance each time exogenous factors impact model performance to the extent that retraining becomes necessary. a case in point is the recent covid-19 pandemic, which entailed abrupt changes in economic expectations. these may lead to an overall disruption of financial operations such as credit lending. one may imagine different options to preserve the trustworthiness of an institution facing these operation-impacting changes while, at the same time, committing to counterfactual scenarios. in the forthcoming sections we introduce two main options that involve distinct ethical trade-offs, and we comment on their applicability. the first approach for the institution is to keep counterfactual commitments only within well-specified "boundary conditions," i.e., the circumstances within which the explanation can be treated formally as a promise. these circumstances are applied to all counterfactual commitments in a given time frame (footnote 11), and may appeal to understandable values and constraints such as economic necessity. we stress that the greatest threat to trustworthiness here is having undisclosed boundary conditions motivating the decision-maker to break its (implicit or explicit) counterfactual commitment. this can be avoided if the organization is transparent with the recipient of the explanation about the boundary conditions. (footnote 9: typically, in credit lending applications, the number of not creditworthy customers is smaller than the number of creditworthy ones. for example, in the german credit dataset, not creditworthy customers represent 30% of all data.) (footnote 10: this holds if the distribution of classes in the training data from time t_0 to t_1 does not vary significantly. otherwise, it is not possible, a priori, to infer how the class imbalance in the data may be affected by taking into consideration counterfactual scenarios and their outcomes. this would be the case in an economic downturn scenario, such as the one discussed in section 4.) (footnote 11: therefore the choice of the term "boundary conditions.") for example, the organization may identify and publicize a given set of circumstantial conditions under which it will make decisions that depart significantly from the previously communicated counterfactuals. for example, a bank may have a criterion stating explicitly that it will not be able to stick to its counterfactual commitments in the event of a downward or upward movement of a certain economic index beyond a given threshold, or when honoring the commitment would imply a profit loss higher than a given threshold. to be trustworthy, the company would need to refer either to public data (e.g., a publicly accessible economic index) or to confidential data that can be audited confidentially by trustworthy independent parties (the auditors may need to obtain access to confidential information to be able to verify whether certain claims, e.g., about prospects of economic losses, are credible). when the decision-maker is confident about its ability to avoid exceptional circumstances (or exceptional choices in exceptional circumstances) within a given time frame, the former strategy can be realized by simply stating counterfactual commitments with an "expiration date." we would like to point out that adopting circumstance-limited or time-limited counterfactual commitments is ethically valuable, i.e., valuable impersonally, for society at large, and is even a moral duty. we argue that this is not only in the self-interest of the institution.
to see why, let us consider the case of a "sudden economic downturn scenario" for a bank, or, in general, a credit lending institution. in this scenario fulfilling all extant counterfactual commitments likely implies awarding loans that are too risky. one immediate consequence of a defaulting client is a cost for the lender, as defaulted loans reduce overall profit. however, due to the economic downturn, a significant proportion of defaulting clients may well cause the financial collapse of the institution. such behavior is clearly unethical on kantian deontological grounds: if all credit institutions acted according to this moral maxim (i.e., lending to all clients fulfilling past counterfactual explanations, even when the updated model predicts them to default), the likely result would be the collapse of the entire financial system, so no credit would be possible. this is a violation of kant's categorical imperative [34] . interestingly, violating counterfactual commitments also violates a common-sense deontological rule against breaking promises (which also violates kant's categorical imperative). in summary, stating clearly that counterfactual commitments are only valid within situational constraints becomes a viable alternative to the following ethically permissible, although limiting, options: a) avoiding the provision of counterfactual explanations altogether, or b) avoiding the interpretation of counterfactual explanations as implicit promises on the recipient's end. as pointed out in section 3, the proposal of considering past counterfactual scenarios during the retraining of machine learning models will typically not lead to the fulfillment of all past counterfactual commitments. in fact, as a result of the use of counterfactual scenarios during retraining, the corresponding counterfactual commitments are endowed with degrees of certainty (see section 3). 
this said, in the probabilistic approach an institution may decide to guarantee that a subset of the counterfactual explanations and subsequent commitments made at time t_0 will hold at time t_1 > t_0, based on statistical reasoning. for example, a bank could guarantee only those commitments with a degree of certainty greater than a given threshold. as noted in section 3, the degree of certainty of counterfactual scenarios is computed as a result of the machine learning model retraining, i.e., only after the generation of the corresponding counterfactual explanation (at time t_0). therefore, it cannot be considered as a clause of validity for the commitment at t_0. on the other hand, the same bank may decide to guarantee only a fixed percentage of a given type of commitment (e.g., those involving the increase of annual income), based on the statistical analysis of histories of past ones. however, in general, institutions will not be able to tell, for a specific client, whether his or her specific counterfactual commitment will be maintained. even if a company relies on a public (or independently auditable), systematic, non-morally arbitrary way to select the claims to be fulfilled in altered circumstances, there may be no way to know in advance which claims fulfill those conditions. for different commitments may be unequally hard to satisfy in different circumstances, and, at the time at which counterfactual explanations are given, it will not be known in advance what are the low-probability circumstances that will occur and call for a revision of the promises made. in other words, at t_0 one may argue that a's counterfactual commitment could be easier to satisfy than b's. however, at t_1 > t_0 the reverse may be true (footnote 12). if the probabilistic approach has to be trustworthy, it must rely on a public (or independently auditable), systematic, non-morally arbitrary way of selecting the counterfactual commitments to be discarded vs. those to be maintained using statistical reasoning.
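the threshold rule mentioned above (guarantee only those commitments whose degree of certainty exceeds a declared threshold) is straightforward to express. the function name, the pair structure and the threshold value are our illustrative assumptions.

```python
def guaranteed_commitments(commitments, threshold=0.8):
    """Sketch of the statistical selection rule: keep only those counterfactual
    commitments whose degree of certainty p, computed after retraining, reaches
    a publicly declared threshold. `commitments` is a list of (id, p) pairs."""
    return [cid for cid, p in commitments if p >= threshold]

# hypothetical degrees of certainty produced by the retrained model
kept = guaranteed_commitments([("a", 0.95), ("b", 0.60), ("c", 0.85)])
```

note that, to be trustworthy in the sense discussed above, both the threshold and the selection rule would have to be public or independently auditable.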
the advantage of this strategy is that it can help the institution providing these commitments to find the optimal balance between two conflicting prima facie moral obligations: a) to fulfill the expectation of the counterfactual commitment, and b) to enable mutually beneficial and socially advantageous economic transactions that require violating past commitments. the probabilistic strategy provides a "grey option", an alternative both to the option of being trustworthy but risking institutional collapse (e.g., bankruptcy), and to the option of violating past commitments and doing simply what one has most reason to do, given future prospects and ignoring the history of the interaction. the grey option is a quantitative constraint on the breaking of promises. it limits the damage to trustworthiness and deontological morality by limiting commitment breaking to a given percentage of cases, which is planned, known in advance, and controlled by the decision maker. the difference with the first strategy is that, in the first, one refrains from the commitment altogether, declaring that it is not valid under specific circumstances. here, instead, commitments are made and then (selectively, and partially) broken. the expectations of some clients are upset, because these clients could not know that their counterfactuals would no longer be valid. but the damage is limited, because only a limited proportion of expectations is violated. framed in this way, the solution ignores deontological morality altogether. the goal is to obtain the desired balance of trustworthiness and profitability (and sustainability) and, in order to achieve the desired mix, the company decides to break promises, but only to a certain degree. this may seem hypocritical, but it is morally justifiable from a consequentialist standpoint.
a credit institution, for example, faces a reputational dilemma between a) disrespecting its commitments to clients implied by counterfactual explanations, with a negative impact on trust in the system, and b) eroding profitability or, worse, engendering the financial collapse of the institution. by training models that are probabilistically constrained by past counterfactuals, and using this information to determine statistical rules to guarantee counterfactual commitments, this dilemma can be turned into a trade-off the institution can control. a bank, for example, may find it optimal to sacrifice some degree of profitability for the sake of promoting trust from the majority of clients. faced with a choice between honoring past commitments and financial stability in the long run, the probabilistic strategy achieves the latter at the expense of the former. the idea of reducing the number of broken promises, rather than fulfilling all promises, also has a distinctive flavour of "negative consequentialism." in fact, consequentialism can be stated in a negative form: to act so as to prevent the most harm or most moral wrongs [26]. if the company honors all counterfactual commitments when it can no longer afford to do so, it risks going bankrupt, which implies the failure to honor commitments to shareholders and other stakeholders that are harmed by bankruptcy. so, by selectively discarding some, the bank minimizes the proportion of commitments it breaks in the long term. while the morality of this approach may be doubted by strict deontologists, utilitarianism recommends it. the option is morally objectionable because it sacrifices moral obligations to a minority of clients for the sake of higher-order long-term objectives of trustworthiness combined with financial sustainability. but if you are a utilitarian, this is a bullet you are willing to bite anyway, and for the general case.
let us suppose that the objection from deontological morality against the probabilistic approach should be taken seriously. can the probabilistic approach be made compatible with deontological morality? the answer is affirmative, and it consists in combining the declared boundary condition and probabilistic approaches. the core of the combined approach is that the probabilistic nature of the commitment to respect counterfactuals should be honestly and transparently declared to the clients. this is in the interest of both bank and clients: from the point of view of the bank, it prevents the machine learning model from behaving fully erratically. the clients, on the other hand, can rationally factor the risk of an unfulfilled counterfactual commitment into their action plans. when making a financial decision, the client will not be forced to decide "should i trust the counterfactual explanation or not?", but can reasonably ask "how much should i trust the counterfactual explanation?", and decide that also on the basis of the probability of realization attached to it. let us consider the economic downturn scenario again. in the case of a sudden downturn, a bank may be able to approve lending to a known proportion of individuals who fulfilled the counterfactual condition given in the past (but who no longer satisfy the relevant risk metrics after the model update), where fulfilling the condition for all would have meant going bankrupt. the financial risk arising from fulfilling a known proportion of past counterfactuals can be approximately assessed in advance, given that the selection of counterfactuals to be disrespected is grounded in a statistical model. ideally, a bank will be able to cover the cost arising from the need to promote its trustworthiness in a sustainable manner (e.g., by raising the interest rate on safer loans, or by distributing the costs among many different stakeholders).
the bank in the example can also plan in advance to disrespect a maximum proportion of counterfactual commitments in case a specified type of future risk materializes. this plan violates deontological morality when it is kept secret, as it allows clients to form misplaced expectations about the future behavior of the bank. but it need not violate it if it is communicated at the time the explanation is given. moreover, keeping the plan secret ignores the potential benefit for the client of accessing the probabilistic information the bank has. knowing the probability that a promise will not be respected is valuable from the client's perspective. ideally, a rational client would want to be fully aware of the probabilistic uncertainty of a probabilistic commitment to a counterfactual, in order to know how much weight to place on it when making decisions. hence, one may resolve the tension between the consequentialist and the deontological approaches by declaring the probabilistic nature of the plan to respect counterfactuals to the client at the moment the explanation is given to him or her. it may be objected that "making a promise while declaring that it may be broken" would also violate kant's categorical imperative, and count as unethical in that perspective. this is true of promises that are made with the intention that they may be broken for arbitrary reasons; it is not what the combined strategy does. to probabilistically guarantee a proportion of counterfactuals, in a way that varies according to the circumstances, based on a plan established in advance and honestly communicated to the recipient of the explanation, amounts to a commitment that has social value. finally, let us return to our credit bank scenario to see how this may play out in practice.
a bank may provide a client with the following prospect: "we expect ourselves to behave as described in our explanation in 90% of cases, except in economic circumstances x or y, where that proportion falls to 70%, and we offer no guarantee about honoring our commitments when z occurs." this would be a promise that includes the conditions of its own breaking, with a probability attached to it. like ordinary promises, this probabilistic and conditional commitment can ground reasonably accurate expectations of the client about the bank's behavior. the recipient of the counterfactual commitment receives useful information, and knows what to expect in ordinary circumstances, factoring in some degree of risk. we could label this approach as wisdom-by-design: the system will exhibit a certain degree of coherence and predictability, yet avoid rigidity in its commitment to the past. counterfactual explanations are a class of contrastive explanations of machine learning outcomes with interesting properties. in fact, the possibility of suggesting a strategy to have recourse against a machine learning model outcome is a useful tool available to those affected by ai-assisted decisions. however, we showed that this property becomes a double-edged sword in the hands of organizations that make use of ai in their decision-making processes, due to the role of time in machine learning applications. the approach to tackling the emergence of series of unfortunate counterfactual events discussed in these notes is a first step towards a systematic analysis of the methods to 1) ensure the consistent use of counterfactual explanations in real-world applications, and 2) support trust in institutions, and their ais, while striving for the interpretability of machine learning models.
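the conditional prospect just quoted can be represented as a tiny lookup table; the circumstances "x", "y", "z" and the percentages are the hypothetical ones from the example above:

```python
def commitment_guarantee(circumstance, policy=None):
    """Declared probability that the institution honors a past
    counterfactual commitment under a given circumstance."""
    if policy is None:
        # mirrors the example prospect: 90% ordinarily,
        # 70% under circumstances X or Y, no guarantee under Z
        policy = {"ordinary": 0.9, "X": 0.7, "Y": 0.7, "Z": 0.0}
    # unlisted circumstances fall back to the ordinary guarantee
    return policy.get(circumstance, policy["ordinary"])
```

a client can then weight the counterfactual advice by this declared probability instead of treating it as an unconditional promise.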
we advocate for qualitative and empirical studies with real datasets to test the propensity of organizations to implement different ethics-preserving strategies to commit to counterfactual explanations, and to assess the impact of series of unfortunate counterfactual events on their functions. in table 1 we enumerate all possible cases that emerge from the change in time of data points, machine learning models and their outcomes, when considering the implementation of counterfactual scenarios. we show that the unfortunate counterfactual event discussed in section 3 is one of these cases (i.e., case 4). we start with some notation: with x we denote a given data point corresponding, for example, to a specific individual. with h we denote the machine learning model at hand (e.g., a credit loan system). the outcome of x computed by h is y, or y = h(x). let us consider two distinct moments of time t0 and t1, with t1 > t0, as in section 3. in table 1 we write "+" if the corresponding element does not change from t0 to t1, "-" otherwise. in case a "-" is highlighted for the data point x, the counterfactual scenario of the explanation at t0 is implemented at t1. it follows that we need to describe 8 distinct cases. we note that the occurrence of an unfortunate counterfactual event is given by case 4. on the other hand, case 6 is a "paradigmatic counterfactual event:" the implementation of the counterfactual scenario is successful and occurs in the absence of machine learning model changes. case 8 encodes the possibility of a machine learning model retraining that is compatible with the implementation of a counterfactual scenario (i.e., ending in the change of outcome y, as desired). all other cases are not relevant for the implementation of counterfactual scenarios we discuss in these notes. case 1 (x: +, h: +, y: +): no change; in particular, no counterfactual scenario is applied. case 2 (x: -, h: +, y: +): only x changes, although its change does not alter the corresponding prediction.
this case cannot hold if the individual has successfully implemented the counterfactual scenario suggested at time t0 since, in that case, the outcome at t1 would have changed (given the same machine learning model h).
references:
data augmentation generative adversarial networks
the hidden assumptions behind counterfactual explanations and principal reasons
towards a rigorous science of interpretable machine learning
learning from imbalanced data sets
ai we trust incrementally: a multi-layer model of trust to analyze human
a survey of methods for explaining black box models
imbalanced learning: foundations, algorithms, and applications
learning from imbalanced data: open challenges and future directions
inverse classification for comparison-based interpretability in machine learning
ethics guidelines for trustworthy ai. futurium - european commission
the mythos of model interpretability
transparency as design publicity: explaining and justifying inscrutable algorithms
preserving causal constraints in counterfactual explanations for explaining data-driven document classifications
explanation in artificial intelligence: insights from the social sciences
machine learning
explaining explanations in ai
explaining machine learning classifiers through diverse counterfactual explanations
the scored society: due process for automated predictions
causality: models, reasoning and inference
face: feasible and actionable counterfactual explanations
"why should i trust you?": explaining the predictions of any classifier
efficient search for diverse coherent explanations
towards explainable artificial intelligence
consequentialism, reasons, value and justice
a survey on image data augmentation for deep learning
explaining prediction models and individual predictions with feature contributions
the relationship between trust in ai and trustworthy machine learning technologies
counterfactual explanations without opening the black box: automated decisions and the gdpr
the explanation game: a formal framework for interpretable machine learning
eda: easy data augmentation techniques for boosting performance on text classification tasks
transparent predictions
groundwork for the metaphysics of morals
key: cord-184685-ho72q46e authors: huang, tongtong; chu, yan; shams, shayan; kim, yejin; allen, genevera; annapragada, ananth v; subramanian, devika; kakadiaris, ioannis; gottlieb, assaf; jiang, xiaoqian title: population stratification enables modeling effects of reopening policies on mortality and hospitalization rates date: 2020-08-10 journal: nan doi: nan sha: doc_id: 184685 cord_uid: ho72q46e objective: we study the influence of local reopening policies on the composition of the infectious population and their impact on future hospitalization and mortality rates. materials and methods: we collected datasets of daily reported hospitalization and cumulative mortality of covid-19 in houston, texas, from may 1, 2020 until june 29, 2020. these datasets are from multiple sources (usa facts, southeast texas regional advisory council covid-19 report, tmc daily news, and new york times county-level mortality reporting). our model, risk-stratified sir-hcd, uses separate variables to model the dynamics of low-contact (e.g., work from home) and high-contact (e.g., work on site) subpopulations while sharing parameters to control their respective $r_0(t)$ over time. results: we evaluated our model's forecasting performance in harris county, tx (the most populated county in the greater houston area) during the phase i and phase ii reopening. not only did our model outperform other competing models, it also supports counterfactual analysis to simulate the impact of future policies in a local setting, which is unique among existing approaches. discussion: local mortality and hospitalization are significantly impacted by quarantine and reopening policies.
no existing model has directly accounted for the effect of these policies on local trends in infections, hospitalizations, and deaths in an explicit and explainable manner. our work is an attempt to close this important technical gap to support decision making. conclusion: despite several limitations, we think this is a timely effort to rethink how best to model the dynamics of pandemics under the influence of reopening policies. covid-19 has taken the international community by surprise [1]. at the time of writing this paper, the covid-19 pandemic has surpassed 10 million confirmed cases and 500,000 deaths worldwide [2]. covid-19 is having a dramatic impact on health care systems in even the most developed countries [3]. without effective vaccines and treatments in sight, the only effective actions include policies of containment, mitigation, and suppression [4]. the infection, hospitalization, and mortality trends of covid-19 across different countries vary considerably and are affected mainly by policy-making and resource mobilization [5]. predicting the local trends of the epidemic is critical for the timely adjustment of medical resources and for the evaluation of policy changes that attempt to curtail the economic impact [6]. in the united states, policies vary by state and city, and therefore robust local models are essential for learning fine-grained changes that meet the needs of local communities and policymakers. under appropriate intervention, early studies observe a trajectory of consumption recovery near the end of the eight-week post-outbreak period (following the classical epidemiology models) [7]. however, traditional models do not account for the impact of local policies, such as a multiphase reopening. the recent rebounds in texas indicate different trends in different counties, which motivates the need to study the underlying impact of policy on local mortality and hospitalization trends.
in this paper, we present the design of our regional model and demonstrate its use by applying it to the houston, tx area, marking its difference from global trend estimation models. due to the lack of consistent and accurate estimations of infection rates in asymptomatic individuals (using, e.g., random serological testing [8]), we focus on mortality and hospitalization. we present the development of a forecasting model using local fine-grained hospital-level data to track the changes in hospitalization and mortality rates owing to reopening orders in the greater houston area encompassing nine counties in the state of texas, usa. the modeled area consists of 4,600 km^2, incorporating a population of 3,012,050 adults and 1,080,409 children (by the 2010 census), and includes over 100 hospitals with a total bed capacity of 23,940 [9, 10]. our methodological contribution is directly modeling the impact of phased reopening. we achieve this by splitting the targeted population into low-contact and high-contact groups (determined by the subpopulations that return to work at different phases of the reopening). the mechanism adjusts the proportion of infectious subpopulations (depending on their category of jobs) to quantitatively represent the policy impact on the epidemiological dynamical system (please refer to figure 1 for a high-level overview). it can be built into most existing epidemiological models with ease, offering additional explanatory power and better prediction efficacy. we demonstrate our new approach using a policy-aware risk-stratified susceptible-infectious-recovered hospitalization-critical-dead (ssir-hcd) model, which compares favorably to existing methods (including our neural network latent space modeling, a nonlinear extension of sir-hcd). there are many predictive models for covid-19 trend prediction. the centers for disease control and prevention (cdc) alone hosts 23 different trend predictors [11, 12] that forecast total deaths.
there are several big categories:
• purely data-driven models (with no modeling of disease dynamics), which include regression-based parametric and non-parametric models (auto-regressive integrated moving average or arima, support vector regression, random forest), neural network (deep learning) based trend prediction (e.g., gt-deepcovid [13]), etc.
• epidemiology-based dynamic models that group populations into a discrete set of compartments (i.e., states) and define ordinary differential equation (ode) rate equations describing the movement of people between compartments: seir (susceptible, exposed, infected, recovered) models and their myriad variants are examples in this category.
• individual-level network-based models: finest-grain modeling of a population through agent simulation, such as the ones built in netlogo by marathe et al. [14] and notredame-fred [15].
• various ensemble and hybrid models, including the imperial college london short-term ensemble forecaster [11] and the ihme model [16], which combines a mechanistic disease transmission model with a curve-fitting approach.
among existing models, the ode compartment-based models occupy a middle ground between network models at the individual level and purely count-driven statistical analyses that are disease-dynamics-agnostic; they are our main interest in this paper. compartment models, which originated in the early 20th century [17], still represent the mainstream in epidemiological studies of infectious disease. they make a critical mathematical simplification by decomposing the entire population into compartments (i.e., states), e.g., susceptible, infectious, recovered, and use odes to model the transitions between the compartments (table 1). these compartment models assume that the observation counts in the various compartments naturally reflect the reproduction number r0, which changes over time.
the recent covid-19 pandemic, however, has introduced the need to incorporate lockdown policy interventions (i.e., how long the population will remain at home), which existing compartment models have not considered. we observe different patterns of hospitalization and mortality even within a single metropolitan area such as houston, tx, which means traditional epidemiology systems might not be sufficient to explain the dynamics. many have speculated that local policies (shutdown and reopening) could have introduced perturbations to the disease dynamics. still, it is not clear how to quantify their impacts and provide counterfactual reasoning to support future policy decisions. our ssir-hcd is a unique effort to close the modeling gap by using appropriate data to enrich the established compartment models. the only other relevant model [18] focused on anti-contagion policies, which is significantly different from our phased-reopening model in that we consider the stratified risks in the population (related to people who might have more chances of exposure, depending on the phases in the reopening policy). we collected experimental datasets of the daily reported hospitalization and cumulative mortality of covid-19 in houston, texas, from may 1, 2020 (the start date of phase i reopening in houston, tx) until june 29, 2020. population data was collected from usa facts [23], industry employment data was gathered from the u.s. bureau of labor statistics [24], and the hospitalization data originate from the southeast texas regional advisory council (setrac) covid-19 report [25]. we used tmc daily news [26] to set the initial length of hospitalization for our model. we also used mortality data from the new york times county-level report [27]. note that the new york times data combine confirmed and suspected cases in their reporting of mortality.
to be consistent, we used setrac hospitalization reporting that contains both confirmed and suspected cases. in this study, we focused on data from harris county, the most populous of the nine counties in the greater houston area. we propose a forecast model based on sir-hcd with a novel variant on compartments to address the differences in local policy. in sir-hcd, the entire population is divided into six subgroups: susceptible population s, infectious population i, recovered population r, hospitalized population h, critical population c, and dead population d. the transitions between subgroups are governed by nonlinear ordinary differential equations. please refer to table 2 for our nomenclature. we use sir-hcd to model the state transitions. the model is a simplification of seir-hcd: we decided to drop the exposed state (e), which cannot be reliably modeled in covid-19 because the cdc guideline for exposure, defined as staying within less than six feet for more than fifteen minutes of a person with known or suspected covid-19 [11], is too short a time period to be modeled adequately. thus, a simpler sir-hcd model, which assumes the possibility of direct transitions between the susceptible state and the infectious state, is more suitable for covid-19. in the sir-hcd model, some susceptible people may become infectious after the incubation period. infectious people may either be hospitalized or recover after a certain period of time. a proportion of the hospitalized people might be admitted to the intensive care unit (icu), while the rest recover in the hospital. similarly, among the critical cases (i.e., icu patients), some people might die, and others will recover.
thus, the sir-hcd model follows a series of nonlinear odes to model the state transitions. note that r0(t), which is shortened to r0 and used interchangeably in our paper, denotes a dynamically changing reproduction number (reflecting the several changes of quarantine policy announced in houston). t_inc denotes the average incubation period of covid-19. in the equations that model r, h, c, and d, the term t_hosp represents the average time that a patient is in a hospital before either recovering or becoming critical, and t_crit denotes the average time that a patient is in a critical state before either recovering or dying. additional parameters give the asymptomatic rate in the infected population i, the critical rate in the hospitalized population h, and the deceased rate in the critical population c. this model is more robust than sir, as the introduction of more reliable observations of h, c, and d provides extra stabilization to the dynamic system. figure 2 illustrates the sir-hcd model with its basic states and the transitions implied by the odes. with reopening policies in place, there are more interactions between people, so the likelihood of spread increases. our expectation is either that r0 remains constant (because people maintain safe distances and follow cdc protocols) or, more likely, that it increases with spotty compliance with pandemic protocols. to make the computation tractable, we use the inverse of an exponential hill decay equation to model r0(t), with a parameter k for the rate of decay and a parameter n controlling the shape of the decay; when n = 1, the expression reduces to a monotonically increasing linear function of t. we set the starting point t = 0 as the reopening date, may 1, 2020. the initial states h(t = 0) and d(t = 0) are the numbers of reported hospitalized cases and cumulative mortality in harris county on that date.
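the odes themselves are not reproduced here; for concreteness, one standard sir-hcd formulation consistent with the transitions just described is the following, where $T_{\mathrm{inf}}$, $T_{\mathrm{hosp}}$, $T_{\mathrm{crit}}$ are the infectious, hospital, and critical durations and $a$, $c$, $f$ the asymptomatic, critical, and deceased rates (this notation is ours, not necessarily the authors'):

```latex
\begin{aligned}
\frac{dS}{dt} &= -\frac{R_0(t)}{T_{\mathrm{inf}}}\,\frac{S\,I}{N}, &
\frac{dI}{dt} &= \frac{R_0(t)}{T_{\mathrm{inf}}}\,\frac{S\,I}{N} - \frac{I}{T_{\mathrm{inf}}},\\
\frac{dR}{dt} &= \frac{a\,I}{T_{\mathrm{inf}}} + \frac{(1-c)\,H}{T_{\mathrm{hosp}}} + \frac{(1-f)\,C}{T_{\mathrm{crit}}}, &
\frac{dH}{dt} &= \frac{(1-a)\,I}{T_{\mathrm{inf}}} - \frac{H}{T_{\mathrm{hosp}}},\\
\frac{dC}{dt} &= \frac{c\,H}{T_{\mathrm{hosp}}} - \frac{C}{T_{\mathrm{crit}}}, &
\frac{dD}{dt} &= \frac{f\,C}{T_{\mathrm{crit}}}.
\end{aligned}
```

one plausible reading of the "inverse hill decay" for the reproduction number is $R_0(t) = R_0(0)\,(1 + k\,t)^{n}$, which is indeed a monotonically increasing linear function of $t$ when $n = 1$; this particular form is our guess, not taken from the text.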
we decided not to rely on confirmed cases, assuming that the actual number of infected individuals is larger than the reported number (such an effect has been reported in california [28] and new york [29]). since a fraction of the actual infected patients were hospitalized on the first day, the initial infectious population i(t = 0) is estimated as κ times the initially hospitalized number h(t = 0), where κ is a positive constant coefficient. some studies suggested that true positive infectious cases should be 50-90 times more than the reported positives [30, 31]. in the harris county projection, we set κ to be 60, assuming that i(t = 0) is approximately equal to the "known positives" on the first day. to estimate the recovery rate, we divided harris county's case mortality (the number of confirmed deaths on the current day) by the number of confirmed cases 14 days before that, as reported by the new york times [27]. the average mortality rate starting from may 1, 2020, was 2%; therefore, we have an estimated recovery rate of 98%. in this case, the initially recovered individuals r(t = 0) = 0.98 times the confirmed cases at t = −14, where t = −14 refers to 14 days earlier than the starting date (i.e., april 17, 2020). the number of critical individuals c(t = 0) is set to 50% of the hospitalized individuals h(t = 0), based on the average proportion of icu usage among covid-19 hospitalizations in texas [11, 25]. the initial susceptible population is s(t = 0) = n − i(0) − r(0) − h(0) − c(0) − d(0), where n is the total population in the county. in this section, we introduce the unique aspect of our model that differentiates it from existing ones. our intuition here is that people get infected either through family transmission or through social (including job) activities. in the transition from a strict stay-at-home order to reopening, the population is subject to changes in social activities, which impact both the probability of infection and the risk of transmission to family members.
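the initialization rules for the compartments can be collected into a small helper; the function name and the example figures below are ours, for illustration only:

```python
def initial_states(N, H0, D0, confirmed_14d_ago,
                   kappa=60.0, recovery_rate=0.98, icu_fraction=0.5):
    """Estimate SIR-HCD initial conditions from reported counts.

    N: county population; H0, D0: reported hospitalized and cumulative
    deaths at t = 0; confirmed_14d_ago: confirmed cases 14 days earlier.
    kappa scales hospitalizations to true infections (60 for Harris County).
    """
    I0 = kappa * H0                         # true infections far exceed reported
    R0 = recovery_rate * confirmed_14d_ago  # 98% of cases from 14 days ago
    C0 = icu_fraction * H0                  # ~50% of hospitalized are in ICU
    S0 = N - I0 - R0 - H0 - C0 - D0         # everyone else is susceptible
    return {"S": S0, "I": I0, "R": R0, "H": H0, "C": C0, "D": D0}
```

by construction, the six compartments sum to the county population n.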
therefore, we can divide the total population in harris county into two groups: a low-contact group, which includes people in industries that were still closed (e.g., the work-from-home subpopulation and their families, including those who are unemployed but not homeless), and a high-contact group, which includes people in industries that were reopened due to the economic restart (e.g., the work-on-site subpopulation and their families). intuitively, the subpopulation of people who work from home continues to stay at home and has limited chances of contacting the working subpopulation. the two groups share the same fitted parameters (the hospitalization and critical durations and the transition rates), as well as the same constant incubation period, but each estimates its own r0(t). we set the initial r0(t = 0) for the low-contact group slightly lower than that of the high-contact group, and the low-contact growth parameter slightly higher than or equal to that of the high-contact group; this keeps the low-contact r0 differentiated from the high-contact r0 over time. this unique coupling strategy makes it possible to directly reflect the impact of policy in ssir-hcd (the extra s indicates squaring, as we model two subpopulations in the joint ssir-hcd model). according to reopening announcements released on the texas government website [34] and the houston employment rates by industry (reported by the greater houston partnership research [34, 35]), necessary industries such as transportation, utilities, government, and a subset of the health services kept running before and during the reopening of the economy, accounting for 32.3% of the population in houston. after the release of the reopening phase i policies (may 1, 2020), 100% of the essential industries reopened, in addition to 15% of health services, 25% of professional and business services, and 25% of leisure and hospitality, constituting a work-on-site (high-contact) subpopulation proportion of 39.62% after subtracting the unemployment rate of 0.4% [36].
the proportion of the high-contact population after reopening phase ii (may 18, 2020) was a combination of 100% of the essential industries, 100% of health services, 50% of professional and business services, and 50% of the leisure and hospitality industries; hence, the high-contact proportion in reopening phase ii was 58.3% after subtracting the unemployment rate. our model accounts for the change of low-contact and high-contact subpopulations between reopening phase i and reopening phase ii, therefore directly modeling the policy's impact on the epidemiological data over time. our training process uses the mean squared logarithmic error (msle) to minimize the errors in curve fitting. additionally, we evaluated the mean squared error (mse), but it was not used in the curve-fitting process. as the training period is very short and the observation data are highly volatile, we do not directly use the raw daily reported data for training and forecasting. similar to early work conducted by the school of public health at uthealth [37], we observed some data bumps (i.e., a large number of cases counted on one date instead of spread over time) in the reported hospitalization and mortality. to avoid the influence of unreliable data on our modeling, we used a 7-day rolling average to smooth the raw inputs (generating the training hospitalization and mortality data in the experiments). the hospitalization curve represents a delayed epidemic effect following the strict stay-at-home order of march 30, 2020. after reopening policies were issued in texas (may 1, 2020), their effects began to appear in the dynamics of reopening phase i and reopening phase ii. our local hospitalization and mortality modeling aims to fit the most recent phases (i.e., reopening phase i and reopening phase ii), from may 1, 2020, to june 29, 2020. we validated the accuracy with data between june 23 and june 29.
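the 7-day smoothing can be sketched as a trailing moving average (a sketch only; the authors' exact windowing and edge handling may differ):

```python
def rolling_mean(series, window=7):
    """Trailing moving average used to damp one-day reporting bumps.
    Early entries average over however many points are available so far."""
    out = []
    for i in range(len(series)):
        lo = max(0, i - window + 1)
        chunk = series[lo:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out
```

the smoothed series, not the raw counts, is what the model is fitted against.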
for comparison, our baselines were time-series regression models (exponential smoothing, autoregression, and arima) and vanilla sir-hcd. we predicted hospitalization and mortality separately in the time-series regression models because they lack the capability to account for hospitalization and mortality together in one model. we also included our own neural network sir-hcd model, which is as flexible as ssir-hcd; interested readers can find the details in the appendix. trained with harris county cumulative hospitalization and mortality data in reopening phase i and reopening phase ii, our ssir-hcd model fits the trends in the training data well: reopening phase i (mse = 27.67 for hospitalization, mse = 1.57 for mortality) and reopening phase ii (mse = 5.20 for hospitalization, mse = 0.81 for mortality). as figure 4 shows, the local hospitalization and mortality training curves are very close to the reported data, and the test curves also follow the data trends closely, which indicates that our model is not overfitting the training period. table 3 shows the prediction accuracy of the baseline models and the risk-stratified sir-hcd (ssir-hcd) model. for hospitalization prediction, the proposed ssir-hcd model had a substantially lower error (mse = 8.04) than the baselines (mse = 649.63, 22.48, and 20.98 for the three time-series regressions, and mse = 54.04 for vanilla sir-hcd). for mortality prediction, we found that the time-series regression models generally predict well, and our proposed model had comparable accuracy. the high accuracy of the general time-series models in mortality prediction is mainly because the mortality rates were more stable than the hospitalization curve over time. table 4 displays the fitted values of the eight training parameters in the ssir-hcd equations for the low-contact group and the high-contact group. these fitted parameter values correspond well to the values obtained in previous studies of covid-19 [11, 38, 39].
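the fitting loss (msle) and the reported evaluation metric (mse) mentioned earlier can be written as follows; using log1p to handle zero counts is one common choice and may differ from the authors' exact implementation:

```python
import math

def msle(y_true, y_pred):
    """Mean squared logarithmic error: the log damps the penalty on
    large-count days, so small early counts are not swamped."""
    return sum((math.log1p(t) - math.log1p(p)) ** 2
               for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Plain mean squared error, used here only for evaluation."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```

the log scaling is why msle suits curve fitting on counts spanning several orders of magnitude: a 10% miss is penalized similarly whether the count is 10 or 10,000.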
moreover, the ratio of hospitalizations turning critical is close to the average icu proportion among hospitalizations in harris county, which was 50% in our initial state settings [11, 25]. the constant incubation period is set at 11.5 days in both groups, based on the values suggested by the world health organization (who) [40] and the cdc [11]. as a sanity check, the r0 values in the low-contact group are indeed lower than those in the high-contact group, indicating a lower expected number of cases directly infected by individuals in the low-contact group. figure 5 displays the ssir-hcd model's counterfactual analysis of what would have happened in the absence of reopening policies over the 160 days starting may 1, 2020. on the x-axis, day 0 refers to may 1, 2020, day 17 refers to may 18, 2020, and day 60 refers to june 29, 2020. we restored the proportions of low-contact and high-contact people to the no-reopening status (corresponding to a 31.90% high-contact proportion of the population) while keeping all the trained parameters the same. upon excluding all changes resulting from the reopening policies, both the modeled hospitalization and mortality curves flatten dramatically. the hospitalization curve with intervention reaches its peak on day 90, reducing nearly 2,500 existing cases. this demonstrates that quarantine policies are effective in controlling the spread of the coronavirus as well as in reducing hospitalization and mortality rates. similarly, figure 6 displays the counterfactual estimates of what would have happened if the texas government had not continued to reduce limitations in reopening phase ii. in figure 6, the presumed reopening phase i policies represent moderate control of the hospitalization and mortality curves, reducing nearly 1,500 existing cases.
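the qualitative effect of shifting population between the two contact groups can be illustrated with a toy two-group sir simulation; this is a cartoon of the stratification idea, not the fitted ssir-hcd model, and all parameter values below are invented:

```python
def two_group_sir_peak(N, frac_high, days=160, r0_low=1.1, r0_high=1.6,
                       t_inf=10.0, i0=1000.0):
    """Euler-step a two-group SIR daily and return the peak total
    infectious count. Each group keeps its own R0 and mixes only
    within itself, mimicking low-/high-contact stratification."""
    groups = []
    for frac, r0 in ((1.0 - frac_high, r0_low), (frac_high, r0_high)):
        n = frac * N
        groups.append({"S": n - i0 * frac, "I": i0 * frac, "n": n, "r0": r0})
    peak = 0.0
    for _ in range(days):
        for g in groups:
            new_inf = g["r0"] / t_inf * g["S"] * g["I"] / g["n"]
            g["S"] -= new_inf
            g["I"] += new_inf - g["I"] / t_inf  # recoveries leave I
        peak = max(peak, sum(g["I"] for g in groups))
    return peak
```

with these toy numbers, raising the high-contact share from the no-reopening 31.9% toward the phase ii 58.3% raises the epidemic peak, which is the qualitative effect the counterfactual analysis probes.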
since a long stay-at-home order is not economically practical, our counterfactual analysis demonstrates that moderate reopening policies, keeping essential quarantine measures (such as mask order adoption) and opening several industries at lower capacity, may offer a reasonable middle ground between a strict quarantine and a fully open economy. the chart of dynamic r0 values shows how dynamic r0 differentiates the low-contact group and the high-contact group, such that the modeled hospitalization and mortality curves would be flattened by increasing the proportion of the low-contact population. the model does not use a single reproduction number to measure the overall transmission rate, as the two subgroups have different levels of risk of getting infected. our ssir-hcd model forecasts fine-grained covid-19 hospitalization and mortality by accounting for the impact of local policies. one challenge is that the ssir-hcd model is very sensitive to the initial values of its compartment variables, as the number of infectious agents is nonzero at the initial time point. we managed to avoid overfitting the local time-series curve by setting these initial values based on accumulated knowledge and by smoothing the time series with a rolling 7-day average to alleviate fluctuations. after variable adjustment, the predictive results obtained a low error rate, while also yielding parameters that are close to real-world values, such as an asymptomatic rate close to the 93.8% reported in the covid-19 scenarios outcome summary [25]. in publicly reported data, the cumulative mortality data in reopening phase ii do not perfectly follow the hospitalization trends. our expectation was that mortality would lag behind the hospitalization cases by approximately 14 days. the actual mortality rate fluctuated in the middle of reopening phase ii (even though we had already smoothed the curve) when the number of hospitalization cases started to increase rapidly. 
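the rolling 7-day smoothing mentioned above can be sketched in a few lines. this is a generic trailing moving average, not the authors' code; the handling of the first few days (shorter windows) is an assumption.

```python
# minimal sketch of a trailing rolling average for a daily count series,
# used to alleviate fluctuations before model fitting. early points with
# fewer than `window` observations use a shorter window (assumption).
def rolling_average(series, window=7):
    """Return the trailing moving average of `series`."""
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - window + 1)
        chunk = series[lo:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

daily = [10, 40, 12, 50, 9, 45, 11, 48]  # illustrative daily counts
print(rolling_average(daily))
```

in practice a library routine (e.g., a dataframe rolling mean) would be used; the sketch only shows the arithmetic.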
nonetheless, our ssir-hcd model still approximates the hospitalization and mortality trends better than competing models, and is thus advantageous over the baseline regressions. it can fit epidemiological data with complicated shapes, such as the harris county hospitalization data, based on the proportions of the low-contact and high-contact groups, and it can consider several epidemiological states together in one model that makes predictions for one or more sub-populations simultaneously. in addition to forecasting, our model offers another unique functionality to support counterfactual analysis, which can be useful in supporting critical decision-making. however, our ssir-hcd model inherits from sir-hcd the assumption of a monotonically increasing r0. this assumption limits projections into a future where economic reopening might be paused, due to the resulting overestimation of r0 (facing a large susceptible population). for example, if a local policy were to clamp down on exposure (e.g., mandating masks and other means to influence infectivity), it would not be reflected in ssir-hcd, which is an obvious weakness. one possible strategy is to introduce an adjustable r0 control to the model, such as our extended model called neural network sir-hcd (see appendix), which learns the quarantine strength over time to determine r0 changes. additionally, our model interprets the recovered population as those who can no longer infect other individuals, under the condition that the number of susceptible individuals keeps decreasing over time. we did not consider the possibility that some covid-19 survivors may be reinfected after they have recovered, which could influence the modeled coronavirus transmission rate. several of these aspects involve controversial discussions in the scientific community, but a powerful model should be able to accommodate different assumptions. there are other real-world constraints that our model does not take into consideration. 
for example, the number of daily hospitalizations and critical patients cannot increase without limit, due to the total bed capacity of hospitals. in fact, the texas medical center reported that it reached 100% of its base icu capacity on june 25, 2020 [41]. our model did not consider hospitalization and icu delays when some hospitals are fully loaded, which would require more model parameters. yet another limitation of our model is the lack of full consideration of population density, demographic composition, daily in-bound/out-bound traffic flows, and medical resource disparities. for example, many patients in harris county might come from other counties but be treated in the texas medical center (in harris county), so the total hospitalization and mortality might not completely match the local infection rates. joint consideration of multiple counties and decomposition of hospitalized patients by their residency would produce more accurate predictions. we have presented a proof-of-concept of a policy-aware compartmental dynamical epidemiological model that stratifies populations into low- and high-risk groups based on people's affiliated industries during the reopening phases at a county level, using limited data. we believe it is an important effort to better understand the dynamic feedback of this stratification through an ode control system. there are many limitations and future directions that we have exposed through this exploration. we will further explore these challenges with more data and better assumptions to improve existing models. 
- international trends of combating covid-19: present and future perspectives. technium conference
- covid-19 trend estimation in the elderly italian region of sardinia
- mask wearing to complement social distancing and save lives during covid-19
- unfolding trends of covid-19 transmission in india: critical review of available mathematical models
- estimation of reproduction numbers of covid-19 in typical countries and epidemic trends under different prevention and control scenarios
- assessing the tendency of 2019-ncov (covid-19) outbreak in china
- the impact of the covid-19 pandemic on consumption: learning from high frequency transaction data
- searching for the peak: google trends and the covid-19 outbreak in italy
- centers for disease control and prevention
- home - covid 19 forecast hub
- a simulation study of coronavirus as an epidemic disease using agent-based modeling
- institute for health metrics and evaluation
- a contribution to the mathematical theory of epidemics
- the effect of large-scale anti-contagion policies on the covid-19 pandemic
- first-principles machine learning modelling of covid-19
- preliminary analysis of covid-19 spread in italy with an adaptive seird model
- anjum. seir-hcd model. kaggle. 2020
- covid-19 in the united states: covid-19 data report
- tmc daily new covid-19 hospitalizations - texas medical center
- (covid-19) data in the united states
- covid-19 antibody seroprevalence
- actual coronavirus infections vastly undercounted, c.d.c. data shows. the new york times
- estimating covid-19 antibody seroprevalence
- algorithm 778: l-bfgs-b: fortran subroutines for large-scale bound-constrained optimization
- covid-19 kaggle community contributions
- covid-19 scenarios
- epidemic analysis of covid-19 outbreak and counter-measures in france
- coronavirus disease 2019 (covid-19) situation report - 73. who website
- texas won't specify where hospital beds are available as coronavirus cases hit record highs. the texas tribune
- quantifying the effect of quarantine control in covid-19 infectious spread using machine learning
- a method for stochastic optimization
github link: https://github.com/shayanshams66/nn-sir-hcd [neural network sir-hcd] 
neural network sir-hcd model (with adjusted quarantine control). the controlling parameter r0 within the sir-hcd model does not account for specific quarantine effects, and in reality it is reasonable to consider quarantine factors when adjusting the free parameters of our existing sir-hcd model. therefore, we utilize a multilayer perceptron (mlp) architecture [39], [42] to estimate the hidden quarantine variable and augment the epidemiological estimation process. the augmented model introduces a quarantine strength term and a quarantined population, designing the quarantine strength as an n-layer mlp network whose input vector collects the current model states, so that the hidden quarantine variable at each timestep is estimated by the network. the original reproduction number r0 at each timestep is a constant value; we aim to adjust the value of r0 by adding the time-varying quarantine strength term so that the curve can respond more flexibly to policy changes. we utilized an mlp network with two hidden layers in our implementation. the deep-learning-adjusted sir-hcd model is trained by minimizing a weighted mean squared log error loss function using the adam optimizer [43] for 1000 iterations. the loss function computes a weighted average squared error in which weights at later times are higher. the optimization continues until the loss value converges. 
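the weighted mean squared log error described above can be sketched as follows. the linear weighting scheme and the function name are assumptions for illustration; the paper only states that later time points receive higher weight.

```python
import math

def weighted_msle(y_true, y_pred):
    """Weighted mean squared log error: the weight grows linearly with
    the time index so later points contribute more (assumed scheme)."""
    n = len(y_true)
    weights = [(i + 1) / n for i in range(n)]  # later time -> larger weight
    errs = [w * (math.log1p(t) - math.log1p(p)) ** 2
            for w, t, p in zip(weights, y_true, y_pred)]
    return sum(errs) / sum(weights)

print(weighted_msle([10, 20, 30], [10, 20, 30]))  # perfect fit -> 0.0
```

using log1p keeps the loss defined at zero counts, which matters for cumulative epidemic curves that start near zero.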
key: cord-035388-n9hza6vm titles: xu, jie; glicksberg, benjamin s.; su, chang; walker, peter; bian, jiang; wang, fei title: federated learning for healthcare informatics date: 2020-11-12 journal: j healthc inform res doi: 10.1007/s41666-020-00082-4 sha: doc_id: 35388 cord_uid: 035388 with the rapid development of computer software and hardware technologies, more and more healthcare data are becoming readily available from clinical institutions, patients, insurance companies, and pharmaceutical industries, among others. this access provides an unprecedented opportunity for data science technologies to derive data-driven insights and improve the quality of care delivery. healthcare data, however, are usually fragmented and private, making it difficult to generate robust results across populations. for example, different hospitals own the electronic health records (ehr) of different patient populations, and these records are difficult to share across hospitals because of their sensitive nature. this creates a big barrier for developing effective analytical approaches that are generalizable, which need diverse, “big data.” federated learning, a mechanism of training a shared global model with a central server while keeping all the sensitive data in the local institutions where the data belong, provides great promise to connect the fragmented healthcare data sources with privacy preservation. the goal of this survey is to provide a review of federated learning technologies, particularly within the biomedical space. in particular, we summarize the general solutions to the statistical challenges, system challenges, and privacy issues in federated learning, and point out the implications and potentials in healthcare. 
recent years have witnessed a surge of interest in healthcare data analytics, due to the fact that more and more such data are becoming readily available from various sources, including clinical institutions, individual patients, insurance companies, and pharmaceutical industries, among others. this provides an unprecedented opportunity for the development of computational techniques to derive data-driven insights for improving the quality of care delivery [72, 105]. healthcare data are typically fragmented because of the complicated nature of the healthcare system and its processes. for example, different hospitals may be able to access the clinical records of their own patient populations only. these records are highly sensitive, containing protected health information (phi) of individuals. rigorous regulations, such as the health insurance portability and accountability act (hipaa) [32], have been developed to regulate the process of accessing and analyzing such data. this creates a big challenge for modern data mining and machine learning (ml) technologies, such as deep learning [61], which typically require a large amount of training data. federated learning is a paradigm with a recent surge in popularity, as it holds great promise for learning with fragmented sensitive data. instead of aggregating data from different places all together, or relying on the traditional discovery-then-replication design, it enables training a shared global model with a central server while keeping the data in the local institutions where they originate. the term "federated learning" is not new. in 1976, patrick hill, a philosophy professor, first developed the federated learning community (flc) to bring people together to jointly learn, which helped students overcome the anonymity and isolation in large research universities [42]. subsequently, there were several efforts aiming at building federations of learning content and content repositories [6, 74, 83]. 
in 2005, rehak et al. [83] developed a reference model describing how to establish an interoperable repository infrastructure by creating federations of repositories, where the metadata are collected from the contributing repositories into a central registry provided with a single point of discovery and access. the ultimate goal of this model is to enable learning from diverse content repositories. these practices in federated learning communities and federated search services have provided effective references for the development of federated learning algorithms. federated learning holds great promise for healthcare data analytics. for both provider-based applications (e.g., building a model for predicting hospital readmission risk with patient electronic health records (ehr) [71]) and consumer (patient)-based applications (e.g., screening atrial fibrillation with electrocardiograms captured by smartwatch [79]), the sensitive patient data can stay either in local institutions or with individual consumers without leaving them during the federated model learning process, which effectively protects patient privacy. the goal of this paper is to review the setup of federated learning, discuss the general solutions and challenges, and envision its applications in healthcare. in this review, after a formal overview of federated learning, we summarize the main challenges and recent progress in this field. then we illustrate the potential of federated learning methods in healthcare by describing successful recent research. finally, we discuss the main opportunities and open questions for future applications in healthcare. there have been a few review articles on federated learning recently. for example, yang et al. [109] wrote an early federated learning survey summarizing the general privacy-preserving techniques that can be applied to federated learning. 
some researchers surveyed sub-problems of federated learning, e.g., personalization techniques [59], semi-supervised learning algorithms [49], threat models [68], and mobile edge networks [66]. kairouz et al. [51] discussed recent advances and presented an extensive collection of open problems and challenges. li et al. [63] reviewed federated learning from a systems viewpoint. different from those reviews, this paper focuses on the potential of federated learning to be applied in healthcare. we summarize general solutions to the challenges of the federated learning scenario and survey a set of representative federated learning methods for healthcare. in the last part of this review, we outline some directions and open questions in federated learning for healthcare. an early version of this paper is available on arxiv [107]. federated learning is the problem of training a high-quality shared global model with a central server from decentralized data scattered among a large number of different clients (fig. 1). mathematically, assume there are k activated clients where the data reside (a client could be a mobile phone, a wearable device, or a clinical institution data warehouse, etc.). let d_k denote the data distribution associated with client k and n_k the number of samples available from that client, so that n = Σ_{k=1}^k n_k is the total sample size. (fig. 1: schematic of the federated learning framework. the model is trained in a distributed manner: the institutions periodically communicate local updates to a central server to learn a global model; the central server aggregates the updates and sends back the parameters of the updated global model.) the federated learning problem boils down to solving an empirical risk minimization problem of the form [56, 57, 69]: min_w f(w), with f(w) = (1/n) Σ_{i=1}^n f_i(w), where w is the model parameter to be learned. the function f_i is specified via a loss function on an input-output data pair {x_i, y_i}. 
typically, x_i ∈ r^d and y_i ∈ r or y_i ∈ {−1, 1}. in particular, algorithms for federated learning face a number of challenges [13, 96], specifically:
- statistical challenge: the data distributions among clients differ greatly, i.e., for any k ≠ k', the expected loss under d_k differs from that under d_k', so that data points available locally are far from being a representative sample of the overall distribution.
- communication efficiency: the number of clients k is large and can be much bigger than the average number of training samples stored at each client, i.e., k ≫ (n/k).
- privacy and security: additional privacy protections are needed for unreliable participating clients, since it is impossible to ensure that all clients are equally reliable.
next, we survey, in detail, the existing federated learning works handling these challenges. the naive way to solve the federated learning problem is through federated averaging (fedavg) [69]. it has been demonstrated to work with certain non independent and identically distributed (non-iid) data by requiring all the clients to share the same model. however, fedavg does not address the statistical challenge of strongly skewed data distributions. the performance of convolutional neural networks trained with the fedavg algorithm can degrade significantly due to weight divergence [111]. existing research on the statistical challenge of federated learning can be grouped into two fields, i.e., consensus solutions and pluralistic solutions. most centralized models are trained on aggregated training samples drawn from the local clients [96, 111]. intrinsically, the centralized model is trained to minimize the loss with respect to the uniform distribution [73]: d̄ = Σ_{k=1}^k (n_k/n) d_k, where d̄ is the target data distribution for the learning model. however, this specific uniform distribution is not an adequate solution in most scenarios. 
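the fedavg aggregation step mentioned above can be sketched in a few lines: the server averages the locally trained parameter vectors weighted by each client's sample share n_k/n. this is an illustrative sketch, not the reference implementation.

```python
# minimal fedavg aggregation sketch: the server combines client model
# parameters weighted by each client's local sample count n_k / n.
def fedavg(client_params, client_sizes):
    """client_params: list of parameter vectors (lists of floats);
    client_sizes: list of n_k. Returns the weighted-average global vector."""
    n = sum(client_sizes)
    dim = len(client_params[0])
    global_w = [0.0] * dim
    for w_k, n_k in zip(client_params, client_sizes):
        for j in range(dim):
            global_w[j] += (n_k / n) * w_k[j]
    return global_w

# two clients: one with 30 local samples, one with 10
print(fedavg([[1.0, 0.0], [5.0, 4.0]], [30, 10]))  # -> [2.0, 1.0]
```

a full fedavg round would also include local sgd epochs on each client before this aggregation; only the server-side averaging is shown.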
to address this issue, recently proposed solutions either model the target distribution or force the data to adapt to the uniform distribution [73, 111]. specifically, mohri et al. [73] proposed a minimax optimization scheme, i.e., agnostic federated learning (afl), where the centralized model is optimized for any possible target distribution formed by a mixture of the client distributions. this method has only been applied at small scales. compared to afl, li et al. [64] proposed q-fair federated learning (q-ffl), which assigns higher weight to devices with poor performance so that the variance of the accuracy distribution across the network is reduced. they empirically demonstrate the improved flexibility and scalability of q-ffl compared to afl. another commonly used method is globally sharing a small portion of data among all the clients [75, 111]. the shared subset, distributed from the central server to the clients, is required to contain a uniform distribution over classes. in addition to handling the non-iid issue, sharing information about a small portion of trusted instances and noise patterns can guide the local agents to select a compact training subset, while the clients learn to add changes to the selected data samples, in order to improve the test performance of the global model [38]. generally, it is difficult to find a consensus solution w that is good for all components d_i. instead of wastefully insisting on a consensus solution, many researchers choose to embrace this heterogeneity. multi-task learning (mtl) is a natural way to deal with data drawn from different distributions; it directly captures the relationships among non-iid and unbalanced data by leveraging the relatedness between tasks, in comparison to learning a single global model. to do this, it is necessary to target a particular way in which tasks are related, e.g., sharing sparsity, sharing low-rank structure, or graph-based relatedness. recently, smith et al. 
[96] empirically demonstrated this point on real-world federated datasets and proposed a novel method, mocha, to solve a general convex mtl problem while handling the system challenges at the same time. later, corinzia et al. [22] introduced virtual, an algorithm for federated multi-task learning with non-convex models. they consider the federation of central server and clients as a bayesian network and perform training using approximate variational inference. this work bridges the frameworks of federated and transfer/continuous learning. the success of multi-task learning rests on whether the chosen relatedness assumptions hold. compared to this, pluralism can be a critical tool for dealing with heterogeneous data without any additional or even low-order terms that depend on the relatedness, as in mtl [28]. eichner et al. [28] considered training in the presence of block-cyclic data and showed that a remarkably simple pluralistic approach can entirely resolve the source of data heterogeneity. when the component distributions are actually different, pluralism can outperform the "ideal" iid baseline. in the federated learning setting, training data remain distributed over a large number of clients, each with unreliable and relatively slow network connections. naively, for a synchronous protocol in federated learning [58, 96], the total number of bits required during uplink (clients → server) and downlink (server → clients) communication by each of the k clients during training is given by: b_up/down ∈ O( u × |w| × ( h(Δw_up/down) + β ) ), where u is the total number of updates performed by each client, |w| is the size of the model, and h(Δw_up/down) is the entropy of the weight updates exchanged during transmission. β is the difference between the true update size and the minimal update size (which is given by the entropy) [89]. apparently, we can consider three ways to reduce the communication cost: (a) reduce the number of clients k, (b) reduce the update size, and (c) reduce the number of updates u. 
starting from these three points, we can organize existing research on communication-efficient federated learning into four groups, i.e., model compression, client selection, update reduction, and peer-to-peer learning (fig. 2). the most natural and rough way to reduce communication cost is to restrict the participating clients or choose a fraction of parameters to be updated at each round. shokri et al. [92] use a selective stochastic gradient descent protocol, where the selection can be completely random, or only the parameters whose current values are farther away from their local optima are selected, i.e., those that have a larger gradient. nishio et al. [75] proposed a new protocol referred to as fedcs, where the central server manages the resources of heterogeneous clients and determines which clients should participate in the current training task by analyzing the resource information of each client, such as wireless channel states, computational capacities, and the size of data resources relevant to the current task. here, the server should decide how much data, energy, and cpu resources are used by the mobile devices such that the energy consumption, training latency, and bandwidth cost are minimized while meeting the requirements of the training tasks. anh [5] thus proposes to use the deep q-learning [102] technique that enables the server to find the optimal data and energy management for the mobile devices participating in mobile crowd-machine learning through federated learning without any prior knowledge of network dynamics. the goal of model compression is to compress the server-to-client exchanges to reduce uplink/downlink communication cost. 
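the resource-aware client selection idea above can be reduced to a toy filter: admit only clients whose estimated round time fits a deadline. this is a deliberately simplified sketch, not the fedcs protocol itself, and all numbers and names are assumptions.

```python
# toy resource-aware client selection: keep clients whose estimated round
# time (local compute + upload) fits within a deadline, in the spirit of
# resource-managed selection schemes such as fedcs (illustrative only).
def select_clients(clients, deadline):
    """clients: list of (client_id, compute_seconds, upload_seconds)."""
    return [cid for cid, compute_s, upload_s in clients
            if compute_s + upload_s <= deadline]

clients = [("a", 2.0, 1.0), ("b", 5.0, 4.0), ("c", 1.0, 0.5)]
print(select_clients(clients, deadline=4.0))  # -> ['a', 'c']
```

real schemes additionally optimize how many clients to admit per round, since dropping too many slows convergence.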
the first way is through structured updates, where the update is directly learned from a restricted space parameterized by a smaller number of variables, e.g., sparse or low-rank updates [58], or, more specifically, pruning the least useful connections in a network [37, 113], weight quantization [17, 89], and model distillation [43]. the second way is lossy compression, where a full model update is first learned and then compressed using a combination of quantization, random rotations, and subsampling before being sent to the server [2, 58]; the server then decodes the updates before doing the aggregation. a third technique is federated dropout, in which each client, instead of locally training an update to the whole global model, trains an update to a smaller sub-model [12]. these sub-models are subsets of the global model and, as such, the computed local updates have a natural interpretation as updates to the larger global model. federated dropout not only reduces the downlink communication but also reduces the size of uplink updates. moreover, the local computational cost is correspondingly reduced since the local training procedure deals with parameters of smaller dimensions. kamp et al. [52] proposed to average models dynamically depending on the utility of the communication, which leads to a reduction of communication by an order of magnitude compared to periodically communicating state-of-the-art approaches; this makes it well suited for massively distributed systems with limited communication infrastructure. bui et al. [11] improved federated learning for bayesian neural networks using partitioned variational inference, where the client can decide to upload the parameters back to the central server after multiple passes through its data, after one local epoch, or after just one mini-batch. guha et al. 
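to make the lossy-compression idea concrete, here is a generic sign-based quantizer (of our own construction for illustration, not any of the cited algorithms): transmit only the signs of an update plus one scale factor, roughly one bit per parameter instead of 32.

```python
# sketch of 1-bit update compression: send only the signs plus a single
# scale (the mean absolute value), then reconstruct on the server side.
def compress(update):
    scale = sum(abs(u) for u in update) / len(update)
    signs = [1 if u >= 0 else -1 for u in update]
    return scale, signs            # ~1 bit/parameter instead of 32

def decompress(scale, signs):
    return [scale * s for s in signs]

scale, signs = compress([0.5, -0.25, 0.75, -0.5])
print(decompress(scale, signs))  # -> [0.5, -0.5, 0.5, -0.5]
```

the reconstruction is lossy (all magnitudes collapse to the mean), which is why practical schemes combine quantization with error feedback or random rotations before aggregation.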
[35] focused on techniques for one-shot federated learning, in which they learn a global model from data in the network using only a single round of communication between the devices and the central server. (fig. 3: a) secure multi-party computation: during the computation, no computation node is able to recover the original value nor learn anything about the output (green pie); any nodes can combine their shares to reconstruct the original value. b) differential privacy: it guarantees that anyone seeing the result of a differentially private analysis will make the same inference, i.e., answer 1 and answer 2 are nearly indistinguishable.) besides the above works, ren et al. [84] theoretically analyzed the detailed expression of the learning efficiency in the cpu scenario and formulated a training acceleration problem under both communication and learning resource budgets. reinforcement learning and round-robin learning are widely used to manage the communication and computation resources [5, 46, 106, 114]. in federated learning, a central server is required to coordinate the training process of the global model. however, the communication cost to the central server may not be affordable since a large number of clients are usually involved. also, many practical peer-to-peer networks are usually dynamic, and it is not possible to regularly access a fixed central server. moreover, because of the dependence on the central server, all clients are required to agree on one trusted central body, whose failure would interrupt the training process for all clients. therefore, some researchers began to study fully decentralized frameworks where the central server is not required [41, 60, 85, 91]. the local clients are distributed over a graph/network where they only communicate with their one-hop neighbors. each client updates its local belief based on its own data and then aggregates information from its one-hop neighbors. 
in federated learning, we usually assume the number of participating clients (e.g., phones, cars, clinical institutions...) is large, potentially in the thousands or millions. it is impossible to ensure that none of the clients is malicious. the setting of federated learning, where the model is trained locally without revealing the input data or the model's output to any clients, prevents direct leakage while training or using the model. however, the clients may infer some information about another client's private dataset given the execution of f(w), or from the shared predictive model w [100]. to this end, many efforts have focused on privacy, either from an individual point of view or from multiparty views, especially in the social media field, which has significantly exacerbated multiparty privacy (mp) conflicts [97, 98] (fig. 3). secure multi-party computation (smc) has a natural application to federated learning scenarios, where each individual client uses a combination of cryptographic techniques and oblivious transfer to jointly compute a function of their private data [8, 78]. homomorphic encryption is a public key system, where any party can encrypt its data with a known public key and perform calculations with data encrypted by others with the same public key [29]. due to its success in cloud computing, it comes naturally into this realm, and it has certainly been used in many federated learning studies [14, 40]. although smc guarantees that none of the parties shares anything with each other or with any third party, it cannot prevent an adversary from learning some individual information, e.g., which clients' absence might change the decision boundary of a classifier, etc. moreover, smc protocols are usually computationally expensive even for the simplest problems, requiring iterated encryption/decryption and repeated communication between participants about some of the encrypted results [78]. 
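the share-and-reconstruct property that fig. 3a alludes to can be shown with additive secret sharing, a toy building block of smc (this is not a production protocol, and the modulus choice is an assumption): each share alone is a uniformly random value, yet all shares together sum back to the secret.

```python
import random

PRIME = 2_147_483_647  # Mersenne prime used as the share modulus (assumption)

def make_shares(secret, n_parties, rng):
    """Split `secret` into n additive shares modulo PRIME; any single
    share reveals nothing, all shares together reconstruct the secret."""
    shares = [rng.randrange(PRIME) for _ in range(n_parties - 1)]
    last = (secret - sum(shares)) % PRIME
    return shares + [last]

def reconstruct(shares):
    return sum(shares) % PRIME

rng = random.Random(0)
shares = make_shares(42, 3, rng)
print(reconstruct(shares))  # -> 42
```

because the sharing is linear, parties can also add their shares locally to compute a sum of secrets without revealing any individual value, which is the basis of secure aggregation in federated learning.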
differential privacy (dp) [26] is an alternative theoretical model for protecting the privacy of individual data, which has been widely applied to many areas, not only traditional algorithms, e.g., boosting [27], principal component analysis [15], and support vector machines [86], but also deep learning research [1, 70]. it ensures that the addition or removal of a single data point does not substantially affect the outcome of any analysis, and it is thus also widely studied in federated learning research to prevent indirect leakage [1, 70, 92]. however, dp only protects users from data leakage to a certain extent and may reduce prediction accuracy because it is a lossy method [18]. thus, some researchers combine dp with smc to reduce the growth of noise injection as the number of parties increases, without sacrificing privacy, while preserving provable privacy guarantees and protecting against extraction attacks and collusion threats [18, 100]. federated learning has been incorporated and utilized in many domains. this widespread adoption is due in part to the fact that it enables a collaborative modeling mechanism that allows for efficient ml, all while ensuring data privacy and legal compliance between multiple parties or multiple computing nodes. some promising examples that highlight these capabilities are virtual keyboard prediction [39, 70], smart retail [112], finance [109], and vehicle-to-vehicle communication [88]. in this section, we focus primarily on applications within the healthcare space and also discuss promising applications in other domains, since some principles can be applied to healthcare. ehrs have emerged as a crucial source of real-world healthcare data that has been used for an amalgamation of important biomedical research [30, 47], including machine learning research [72]. 
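the noise-injection step behind dp-style training can be sketched as clip-then-perturb. the clipping threshold and noise scale below are illustrative assumptions, not calibrated privacy parameters, and this sketch omits the accounting needed for a formal dp guarantee.

```python
import math
import random

def privatize_gradient(grad, clip_norm, noise_std, rng):
    """Clip the gradient to an L2 norm bound, then add Gaussian noise --
    the basic recipe behind DP-SGD-style training (sketch only)."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [g * scale for g in grad]
    return [g + rng.gauss(0.0, noise_std) for g in clipped]

rng = random.Random(0)
noisy = privatize_gradient([3.0, 4.0], clip_norm=1.0, noise_std=0.1, rng=rng)
print(noisy)
```

clipping bounds each record's influence (the sensitivity), and the noise scale relative to that bound is what determines the privacy loss; this is also why dp is a lossy method, as the text notes.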
while providing a huge amount of patient data for analysis, ehrs contain systemic and random biases, both overall and specific to hospitals, that limit the generalizability of results. for example, obermeyer et al. [76] found that a commonly used algorithm to determine enrollment in specific health programs was biased against african americans, assigning the same level of risk to healthier caucasian patients. these improperly calibrated algorithms can arise due to a variety of reasons, such as differences in underlying access to care or low representation in training data. it is clear that one way to alleviate the risk of such biased algorithms is the ability to learn from ehr data that is more representative of the global population and which goes beyond a single hospital or site. unfortunately, due to a myriad of reasons such as discrepant data schemas and privacy concerns, it is unlikely that data will ever be connected together in a single database to learn from all at once. the creation and utility of standardized common data models, such as omop [44], allow for more widespread replication analyses, but they do not overcome the limitations of joint data access. as such, it is imperative that alternative strategies emerge for learning from multiple ehr data sources that go beyond the common discovery-replication framework. federated learning might be the tool to enable large-scale representative ml on ehr data, and we discuss many studies which demonstrate this below. federated learning is a viable method to connect ehr data from medical institutions, allowing them to share their experiences, and not their data, with a guarantee of privacy [9, 25, 34, 45, 65, 82]. in these scenarios, the performance of the ml model will be significantly improved by iteratively learning from large and diverse medical data sets. 
several tasks have been studied in the federated learning setting in healthcare, e.g., patient similarity learning [62] , patient representation learning, phenotyping [55, 67] , and predictive modeling [10, 45, 90] . specifically, lee et al. [62] presented a privacy-preserving platform in a federated setting for patient similarity learning across institutions. their model can find similar patients from one hospital to another without sharing patient-level information. kim et al. [55] used tensor factorization models to convert massive electronic health records into meaningful phenotypes for data analysis in a federated learning setting. liu et al. [67] conducted both patient representation learning and obesity comorbidity phenotyping in a federated manner and obtained good results. vepakomma et al. [103] built several configurations upon a distributed deep learning method called splitnn [36] to facilitate health entities collaboratively training deep learning models without sharing sensitive raw data or model details. silva et al. [93] illustrated their federated learning framework by investigating brain structural relationships across diseases and clinical cohorts. huang et al. [45] sought to tackle the challenge of non-iid icu patient data by clustering patients into clinically meaningful communities that captured similar diagnoses and geographic locations and simultaneously training one model per community. federated learning has also enabled predictive modeling based on diverse sources, which can provide clinicians with additional insights into the risks and benefits of treating patients earlier [9, 10, 90] . brisimi et al. [10] aimed to predict future hospitalizations for patients with heart-related diseases using ehr data spread among various data sources/agents by solving an l1-regularized sparse support vector machine classifier in a federated learning environment.
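the sparse-svm objective that brisimi et al. solve in a distributed way can be illustrated, in its centralized form, with a short subgradient-descent sketch. the toy data and hyperparameters are assumptions for illustration; the cited work optimizes the same kind of objective across multiple agents.

```python
import numpy as np

def sparse_svm(X, y, lam=0.05, lr=0.01, epochs=500):
    """l1-regularized linear svm trained by subgradient descent.

    minimizes mean hinge loss + lam * ||w||_1; the l1 penalty drives
    the weights of uninformative features toward zero, yielding a
    sparse (and thus more interpretable) model.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        margins = y * (X @ w)
        active = margins < 1  # points that violate the margin
        grad = -(y[active] @ X[active]) / n + lam * np.sign(w)
        w -= lr * grad
    return w

# toy data: only the first feature is predictive; the rest are noise
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 4))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=200))
w = sparse_svm(X, y)
```

with the l1 penalty, the learned weight on the predictive feature dominates while the noise features are shrunk toward zero.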
owkin is using federated learning to predict patients' resistance to certain treatments and drugs, as well as their survival rates for certain diseases [99] . boughorbel et al. [9] proposed a federated uncertainty-aware learning algorithm for the prediction of preterm birth from distributed ehr, where the contribution of models with high uncertainty to the aggregated model is reduced. pfohl et al. [80] considered the prediction of prolonged length of stay and in-hospital mortality across thirty-one hospitals in the eicu collaborative research database. sharma et al. [90] tested a privacy-preserving framework for the task of in-hospital mortality prediction among patients admitted to the intensive care unit (icu). their results show that training the model in the federated learning framework leads to performance comparable to the traditional centralized learning setting. a summary of these works is listed in table 1 . an important application of federated learning is for natural language processing (nlp) tasks. when google first proposed the federated learning concept in 2016, the application scenario was gboard, a virtual keyboard of google for touchscreen mobile devices with support for more than 600 language varieties [39, 70] . indeed, as users increasingly turn to mobile devices, fast mobile input methods with auto-correction, word completion, and next-word prediction features are becoming more and more important. for these nlp tasks, especially next-word prediction, typed text in mobile apps is usually better than data from scanned books or speech-to-text for aiding typing on a mobile keyboard. however, these language data often contain sensitive information, e.g., passwords, search queries, or text messages with personal information. therefore, federated learning has promising applications in nlp, such as virtual keyboard prediction [7, 39, 70] . other applications include smart retail [112] and finance [54] .
specifically, smart retail aims to use machine learning technology to provide personalized services to customers, based on data such as user purchasing power and product characteristics, for product recommendation and sales services. in terms of financial applications, tencent's webank leverages federated learning technologies for credit risk management, where several banks can jointly generate a comprehensive credit score for a customer without sharing his or her data [109] . with the growth and development of federated learning, many companies and research teams have developed various tools oriented to scientific research and product development. popular ones are listed in table 2 . in this survey, we review the current progress on federated learning, including, but not limited to, the healthcare field. we summarize the general solutions to the various challenges in federated learning and hope to provide a useful resource for researchers to refer to. besides the general issues summarized for the federated learning setting, we list some likely directions and open questions that arise when federated learning is applied in the healthcare area in the following.
- data quality. federated learning has the potential to connect all the isolated medical institutions, hospitals, or devices to make them share their experiences with a privacy guarantee. however, most health systems suffer from data clutter and efficiency problems. the quality of data collected from multiple sources is uneven, and there is no uniform data standard. the analyzed results are apparently worthless when dirty data are accidentally used as samples. the ability to strategically leverage medical data is critical. therefore, how to clean, correct, and complete data and accordingly ensure data quality is a key to improving the machine learning model, whether we are dealing with a federated learning scenario or not.
- incorporating expert knowledge.
in 2016, ibm introduced watson for oncology, a tool that uses a natural language processing system to summarize patients' electronic health records and search the powerful database behind it to advise doctors on treatments. unfortunately, some oncologists say they trust their own judgment more than watson's recommendations. 1 therefore, it is hoped that doctors will be involved in the training process. since not every data set collected can be of high quality, it will be very helpful if standards of evidence-based machine learning are introduced, so that doctors can see the diagnostic criteria of the artificial intelligence; when it is wrong, doctors can give further guidance to improve the accuracy of the machine learning model during the training process.
- incentive mechanisms. with the internet of things and the variety of third-party portals, a growing number of smartphone healthcare apps are compatible with wearable devices. in addition to data accumulated in hospitals or medical centers, another type of data of great value comes from wearable devices, not only to researchers but, more importantly, to the owners. however, during the federated model training process, the clients suffer from considerable overhead in communication and computation. without well-designed incentives, self-interested mobile or other wearable devices will be reluctant to participate in federated learning tasks, which will hinder the adoption of federated learning [53] . how to design an efficient incentive mechanism to attract devices with high-quality data to join federated learning is another important problem.
- personalization. wearable devices are more focused on public health, which means helping people who are already healthy to improve their health, such as helping them exercise, practice meditation, and improve their sleep quality.
how to assist patients to carry out scientifically designed personalized health management, correct functional pathological states by examining indicators, and interrupt pathological change processes is very important. reasonable chronic disease management can avoid emergency visits and hospitalization and reduce the number of visits, saving cost and labor. although there is some general work on federated learning personalization [48, 94] , for healthcare informatics, how to combine medical domain knowledge and personalize the global model for every medical institution or wearable device is another open question.
- model precision. federated learning tries to make isolated institutions or devices share their experiences, and the performance of the machine learning model will be significantly improved by the large medical dataset thus formed. however, the prediction task is currently restricted and relatively simple. medical treatment itself is a very professional and precise field, and medical devices in hospitals have incomparable advantages over wearable devices. for example, the models of doc.ai can predict a collection of phenotypic biometric traits, such as height, weight, age, sex, and bmi, from one's selfie. 2 how to improve the prediction model to predict future health conditions is definitely worth exploring.
funding: the work is supported by onr n00014-18-1-2585 and nsf 1750326. fw would also like to acknowledge the support from the amazon aws machine learning research award and the google faculty research award.
references
- deep learning with differential privacy
- cpsgd: communication-efficient and differentially-private distributed sgd
- federated ai technology enabler
- human activity recognition on smartphones using a multiclass hardware-friendly support vector machine
- efficient training management for mobile crowd-machine learning: a deep reinforcement learning approach
- an agent-based federated learning object search service
- towards federated learning at scale: system design
- practical secure aggregation for privacy-preserving machine learning
- federated uncertainty-aware learning for distributed hospital ehr data
- federated learning of predictive models from federated electronic health records
- partitioned variational inference: a unified framework encompassing federated and continual learning
- expanding the reach of federated learning by reducing client resource requirements
- leaf: a benchmark for federated settings
- secure federated matrix factorization
- a near-optimal algorithm for differentially-private principal components
- fedhealth: a federated transfer learning framework for wearable healthcare
- communication-efficient federated deep learning with asynchronous model update and temporally weighted aggregation
- secureboost: a lossless federated learning framework
- differential privacy-enabled federated learning for sensitive health data
- predicting adverse drug reactions on distributed health data using federated learning
- the physionet/computing in cardiology challenge 2015: reducing false arrhythmia alarms in the icu
- variational federated multi-task learning
- international application of a new probability algorithm for the diagnosis of coronary artery disease
- doc.ai: declarative, on-device machine learning for ios, android, and react native
- learning from electronic health records across multiple sites: a communication-efficient and privacy-preserving distributed algorithm
- our data, ourselves: privacy via distributed noise generation
- boosting and differential privacy
- semi-cyclic stochastic gradient descent
- a survey of homomorphic encryption for nonspecialists
- the next generation of precision medicine: observational studies, electronic health records, biobanks and continuous monitoring
- national health information privacy: regulations under the health insurance portability and accountability act
- robust aggregation for adaptive privacy preserving federated learning in healthcare
- ketos: clinical decision support and machine learning as a service - a training and deployment platform based on docker
- one-shot federated learning
- distributed learning of deep neural network over multiple agents
- deep compression: compressing deep neural networks with pruning, trained quantization and huffman coding
- robust federated training via collaborative machine teaching using trusted instances
- federated learning for mobile keyboard prediction
- private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption
- central server free federated learning over single-sided trust social networks
- the rationale for learning communities and learning community models
- distilling the knowledge in a neural network
- observational health data sciences and informatics (ohdsi): opportunities for observational researchers
- patient clustering improves efficiency of federated machine learning to predict mortality and hospital stay time using distributed electronic medical records
- privacy preserving qoe modeling using collaborative learning
- mining electronic health records: towards better research applications and clinical care
- improving federated learning personalization via model agnostic meta learning
- a survey towards federated semi-supervised learning
- mimic-iii, a freely accessible critical care database
- advances and open problems in federated learning
- efficient decentralized deep learning by dynamic model averaging
- incentive design for efficient federated learning in mobile networks: a contract theory approach
- credit risk assessment from combined bank records using federated learning
- federated tensor factorization for computational phenotyping
- federated optimization: distributed optimization beyond the datacenter
- federated optimization: distributed machine learning for on-device intelligence
- federated learning: strategies for improving communication efficiency
- survey of personalization techniques for federated learning
- peer-to-peer federated learning on graphs
- deep learning
- privacy-preserving patient similarity learning in a federated environment: development and analysis
- federated optimization for heterogeneous networks
- fair resource allocation in federated learning
- distributed learning from multiple ehr databases: contextual embedding models for medical events
- two-stage federated phenotyping and patient representation learning
- threats to federated learning: a survey
- communication-efficient learning of deep networks from decentralized data
- learning differentially private recurrent language models
- predictive modeling of the hospital readmission risk from patients' claims data using machine learning: a case study on copd
- deep learning for healthcare: review, opportunities and challenges
- agnostic federated learning
- system and method for dynamic context-sensitive federated search of multiple information repositories
- client selection for federated learning with heterogeneous resources in mobile edge
- dissecting racial bias in an algorithm used to manage the health of populations
- multiparty differential privacy via aggregation of locally trained classifiers
- large-scale assessment of a smartwatch to identify atrial fibrillation
- federated and differentially private learning for electronic health records
- the eicu collaborative research database, a freely available multi-center database for critical care research
- modern framework for distributed healthcare data analytics based on hadoop
- a model and infrastructure for federated learning content repositories
- accelerating dnn training in wireless federated edge learning system
- braintorrent: a peer-to-peer environment for decentralized federated learning
- learning in a large function space: privacy-preserving mechanisms for svm learning
- a generic framework for privacy preserving deep learning
- federated learning for ultra-reliable low-latency v2v communications
- robust and communication-efficient federated learning from non-iid data
- preserving patient privacy while training a predictive model of in-hospital mortality
- biscotti: a ledger for private and secure peer-to-peer machine learning
- privacy-preserving deep learning
- federated learning in distributed medical databases: meta-analysis of large-scale subcortical brain data
- an investigation into on-device personalization of end-to-end automatic speech recognition models
- using the adap learning algorithm to forecast the onset of diabetes mellitus
- federated multi-task learning
- multiparty privacy in social media
- unfriendly: multi-party privacy risks in social networks
- federated learning: rewards & challenges of distributed private ml
- a hybrid approach to privacy-preserving federated learning
- federated learning of electronic health records improves mortality prediction in patients hospitalized with covid-19 (medrxiv)
- deep reinforcement learning with double q-learning
- split learning for health: distributed deep learning without sharing raw patient data
- ai in health: state of the art, challenges, and future directions
- edge ai: intelligentizing mobile edge computing, caching and communication by federated learning
- federated learning for healthcare informatics
- federated patient hashing
- federated machine learning: concept and applications
- a federated learning framework for healthcare iot devices
- federated learning with non-iid data
- mobile edge computing, blockchain and reputation-based crowdsourcing iot federated learning: a secure, decentralized and privacy-preserving system
- multi-objective evolutionary federated learning
- federated reinforcement learning
publisher's note: springer nature remains
neutral with regard to jurisdictional claims in published maps and institutional affiliations. conflict of interest: the authors declare that they have no conflict of interest.
key: cord-168862-3tj63eve authors: porter, mason a. title: nonlinearity + networks: a 2020 vision date: 2019-11-09 doc_id: 168862 cord_uid: 3tj63eve
i briefly survey several fascinating topics in networks and nonlinearity. i highlight a few methods and ideas, including several of personal interest, that i anticipate to be especially important during the next several years. these topics include temporal networks (in which the entities and/or their interactions change in time), stochastic and deterministic dynamical processes on networks, adaptive networks (in which a dynamical process on a network is coupled to dynamics of network structure), and network structure and dynamics that include "higher-order" interactions (which involve three or more entities in a network). i draw examples from a variety of scenarios, including contagion dynamics, opinion models, waves, and coupled oscillators. in its broadest form, a network consists of the connectivity patterns and connection strengths in a complex system of interacting entities [121] . the most traditional type of network is a graph g = (v, e) (see fig. 1a), where v is a set of "nodes" (i.e., "vertices") that encode entities and e ⊆ v × v is a set of "edges" (i.e., "links" or "ties") that encode the interactions between those entities.
however, recent uses of the term "network" have focused increasingly on connectivity patterns that are more general than graphs [98] : a network's nodes and/or edges (or their associated weights) can change in time [70, 72] (see section 3), nodes and edges can include annotations [26] , a network can include multiple types of edges and/or multiple types of nodes [90, 140] , it can have associated dynamical processes [142] (see sections 3, 4, and 5), it can include memory [152] , connections can occur between an arbitrary number of entities [127, 131] (see section 6), and so on. associated with a graph is an adjacency matrix a with entries a_ij . in the simplest scenario, edges either exist or they don't. if edges have directions, a_ij = 1 when there is an edge from entity j to entity i and a_ij = 0 when there is no such edge. when a_ij = 1, node i is "adjacent" to node j (because we can reach i directly from j), and the associated edge is "incident" from node j and to node i. the edge from j to i is an "out-edge" of j and an "in-edge" of i. the number of out-edges of a node is its "out-degree", and the number of in-edges of a node is its "in-degree". for an undirected network, a_ij = a_ji , and the number of edges that are attached to a node is the node's "degree". one can assign weights to edges to represent connections with different strengths (e.g., stronger friendships or larger transportation capacity) by defining a function w : e −→ r. in many applications, the weights are nonnegative, although several applications [180] (such as in international relations) incorporate positive, negative, and zero weights. in some applications, nodes can also have self-edges and multi-edges. the spectral properties of adjacency (and other) matrices give important information about their associated graphs [121, 187] . for undirected networks, it is common to exploit the beneficent property that all eigenvalues of symmetric matrices are real.
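the adjacency-matrix conventions above can be made concrete with a small numpy example; the specific 4-node graph is an arbitrary illustration.

```python
import numpy as np

# directed graph on 4 nodes, using the convention in the text:
# a_ij = 1 when there is an edge from entity j to entity i
A = np.array([
    [0, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
])

in_degree = A.sum(axis=1)   # row sums: edges incident *to* each node
out_degree = A.sum(axis=0)  # column sums: edges incident *from* each node

# for an undirected version, symmetrize; the symmetric matrix then has
# real eigenvalues, which carry spectral information about the graph
A_und = np.maximum(A, A.T)
eigenvalues = np.linalg.eigvalsh(A_und)
```

here the symmetrized graph is a 4-cycle, whose (real) eigenvalues are 2, 0, 0, and -2.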
traditional studies of networks consider time-independent structures, but most networks evolve in time. for example, social networks of people and animals change based on their interactions, roads are occasionally closed for repairs and new roads are built, and airline routes change with the seasons and over the years. to study such time-dependent structures, one can analyze "temporal networks". see [70, 72] for reviews and [73, 74] for edited collections. the key idea of a temporal network is that networks change in time, but there are many ways to model such changes, and the time scales of interactions and other changes play a crucial role in the modeling process. there are also other important modeling considerations.
[fig. 1: (a) an example of a graph; (b) an example of a temporal network (inspired by fig. 1 of [72] ); (c) an example of a multilayer network with three layers, in which intralayer edges are drawn as solid arcs and interlayer edges as broken arcs (dashed if they connect corresponding entities, dotted if they connect distinct ones, with arrowheads representing unidirectional edges); (d) an example of a simplicial complex (drawn by wikipedia user cflm001 and available at https://en.wikipedia.org/wiki/simplicial_complex). panels (a)-(c) were drawn using tikz-network, by jürgen hackl and available at https://github.com/hackl/tikz-network, which allows one to draw networks (including multilayer networks) directly in a latex file.]
to illustrate potential complications, suppose that an edge in a temporal network represents close physical proximity between two people in a short time window (e.g., with a duration of two minutes). it is relevant to consider whether there is an underlying social network (e.g., the friendship network of mathematics ph.d. students at ucla) or if the people in the network do not in general have any other relationships with each other (e.g., two people who happen to be visiting a particular museum on the same day). in both scenarios, edges that represent close physical proximity still appear and disappear over time, but indirect connections (i.e., between people who are on the same connected component, but without an edge between them) in a time window may play different roles in the spread of information. moreover, network structure itself is often influenced by a spreading process or other dynamics, as perhaps one arranges a meeting to discuss a topic (e.g., to give me comments on a draft of this chapter). see my discussion of adaptive networks in section 5. for convenience, most work on temporal networks employs discrete time (see fig. 1(b)). discrete time can arise from the natural discreteness of a setting, discretization of continuous activity over different time windows, data measurement that occurs at discrete times, and so on. one way to represent a discrete-time (or discretized-time) temporal network is to use the formalism of "multilayer networks" [90, 140] . one can also use multilayer networks to study networks with multiple types of relations, networks with multiple subsystems, and other complicated networked structures. a multilayer network m (see fig.
1 (c)) has a set v of nodes (these are sometimes called "physical nodes", and each of them corresponds to an entity, such as a person) that have instantiations as "state nodes" (i.e., node-layer tuples, which are elements of the set v_m) on layers in l. one layer in the set l is a combination, through the cartesian product l_1 × · · · × l_d, of elementary layers. the number d indicates the number of types of layering; these are called "aspects". a temporal network with one type of relationship has one type of layering, a time-independent network with multiple types of social relationships also has one type of layering, a multirelational network that changes in time has two types of layering, and so on. the set of state nodes in m is v_m ⊆ v × l_1 × · · · × l_d, and the set of edges is e_m ⊆ v_m × v_m. an edge ((i, α), (j, β)) ∈ e_m indicates that there is an edge from node j on layer β to node i on layer α (and vice versa, if m is undirected). for example, in fig. 1(c), there is a directed intralayer edge from (a,1) to (b,1) and an undirected interlayer edge between (a,1) and (a,2). the multilayer network in fig. 1(c) has three layers, |v| = 5 physical nodes, d = 1 aspect, |v_m| = 13 state nodes, and |e_m| = 20 edges. to consider weighted edges, one proceeds as in ordinary graphs by defining a function w : e_m −→ r. as in ordinary graphs, one can also incorporate self-edges and multi-edges. multilayer networks can include both intralayer edges (which have the same meaning as in graphs) and interlayer edges. the multilayer network in fig. 1(c) has 4 directed intralayer edges, 10 undirected intralayer edges, and 6 undirected interlayer edges. in most studies thus far of multilayer representations of temporal networks, researchers have included interlayer edges only between state nodes in consecutive layers and only between state nodes that are associated with the same entity (see fig. 1(c)).
however, this restriction is not always desirable (see [184] for an example), and one can envision interlayer couplings that incorporate ideas like time horizons and interlayer edge weights that decay over time. for convenience, many researchers have used undirected interlayer edges in multilayer analyses of temporal networks, but it is often desirable for such edges to be directed to reflect the arrow of time [176] . the sequence of network layers, which constitute time layers, can represent a discrete-time temporal network at different time instances or a continuous-time network in which one bins (i.e., aggregates) the network's edges to form a sequence of time windows with interactions in each window. each d-aspect multilayer network with the same number of nodes in each layer has an associated adjacency tensor a of order 2(d + 1). for unweighted multilayer networks, each edge in e_m is associated with a 1 entry of a, and the other entries (the "missing" edges) are 0. if a multilayer network does not have the same number of nodes in each layer, one can add empty nodes so that it does, but the edges that are attached to such nodes are "forbidden". there has been some research on tensorial properties of a [35] (and it is worthwhile to undertake further studies of them), but the most common approach for computations is to flatten a into a "supra-adjacency matrix" a_m [90, 140] , which is the adjacency matrix of the graph g_m that is associated with m. the entries of diagonal blocks of a_m correspond to intralayer edges, and the entries of off-diagonal blocks correspond to interlayer edges. following a long line of research in sociology [37] , two important ingredients in the study of networks are examining (1) the importances ("centralities") of nodes, edges, and other small network structures and the relationship of measures of importance to dynamical processes on networks and (2) the large-scale organization of networks [121, 193] .
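the binning of edges into time windows and the flattening of the resulting layers into a supra-adjacency matrix can be sketched as follows. this minimal illustration uses the common choice of coupling each node to itself in consecutive layers with uniform weight ω; the contact events are hypothetical.

```python
import numpy as np

def bin_contacts(events, n_nodes, t_max, window):
    """aggregate timestamped contacts (i, j, t) into one adjacency
    matrix per time window, giving the layers of a discrete-time
    temporal network."""
    n_layers = int(np.ceil(t_max / window))
    layers = np.zeros((n_layers, n_nodes, n_nodes))
    for i, j, t in events:
        k = min(int(t // window), n_layers - 1)
        layers[k, i, j] = layers[k, j, i] = 1  # undirected contact
    return layers

def supra_adjacency(layers, omega):
    """flatten the layers into a supra-adjacency matrix: intralayer
    edges sit in diagonal blocks, and interlayer edges of weight omega
    couple each node to itself in consecutive time layers."""
    T, n, _ = layers.shape
    A = np.zeros((T * n, T * n))
    idx = np.arange(n)
    for t in range(T):
        A[t*n:(t+1)*n, t*n:(t+1)*n] = layers[t]  # intralayer blocks
    for t in range(T - 1):                       # interlayer coupling
        A[t*n + idx, (t+1)*n + idx] = omega
        A[(t+1)*n + idx, t*n + idx] = omega
    return A

# hypothetical proximity events (node, node, time in minutes)
events = [(0, 1, 0.5), (1, 2, 1.2), (0, 1, 2.7), (2, 3, 3.9)]
layers = bin_contacts(events, n_nodes=4, t_max=4.0, window=2.0)
A_supra = supra_adjacency(layers, omega=0.5)
```

using undirected interlayer couplings here mirrors the common convenience choice noted in the text; directed couplings that respect the arrow of time would make the off-diagonal blocks asymmetric.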
studying central nodes in networks is useful for numerous applications, such as ranking web pages, football teams, or physicists [56] . it can also help reveal the roles of nodes in networks, such as those that experience high traffic or help bridge different parts of a network [121, 193] . mesoscale features can impact network function and dynamics in important ways. small subgraphs called "motifs" may appear frequently in some networks [111] , perhaps indicating fundamental structures such as feedback loops and other building blocks of global behavior [59] . various types of larger-scale network structures, such as dense "communities" of nodes [47, 145] and core-periphery structures [33, 150] , are also sometimes related to dynamical modules (e.g., a set of synchronized neurons) or functional modules (e.g., a set of proteins that are important for a certain regulatory process) [164] . a common way to study large-scale structures is inference using statistical models of random networks, such as through stochastic block models (sbms) [134] . much recent research has generalized the study of large-scale network structure to temporal and multilayer networks [3, 74, 90] . various types of centrality -including betweenness centrality [88, 173] , bonacich and katz centrality [65, 102] , communicability [64] , pagerank [151, 191] , and eigenvector centrality [46, 146] -have been generalized to temporal networks using a variety of approaches. such generalizations make it possible to examine how node importances change over time as network structure evolves.
in recent work, my collaborators and i used multilayer representations of temporal networks to generalize eigenvector-based centralities to temporal networks [175, 176] . one computes the eigenvector-based centralities of nodes for a time-independent network as the entries of the "dominant" eigenvector, which is associated with the largest positive eigenvalue (by the perron-frobenius theorem, the eigenvalue with the largest magnitude is guaranteed to be positive in these situations) of a centrality matrix c(a). examples include eigenvector centrality (by using c(a) = a) [17] , hub and authority scores (by using c(a) = aa^t for hubs and a^t a for authorities) [91] , and pagerank [56] . given a discrete-time temporal network in the form of a sequence of adjacency matrices a^(t) , where the entry a^(t)_ij encodes a directed edge from entity i to entity j in time layer t, we construct a "supracentrality matrix" c(ω), which couples centrality matrices c(a^(t)) of the individual time layers. we then compute the dominant eigenvector of c(ω), where ω is an interlayer coupling strength. in [175, 176] , a key example was the ranking of doctoral programs in the mathematical sciences (using data from the mathematics genealogy project [147] ), where an edge from one institution to another arises when someone with a ph.d. from the first institution supervises a ph.d. student at the second institution. by calculating time-dependent centralities, we can study how the rankings of mathematical-sciences doctoral programs change over time and the dependence of such rankings on the value of ω. larger values of ω impose more ranking consistency across time, so centrality trajectories are less volatile for larger ω [175, 176] .
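the supracentrality construction can be sketched for the simplest choice c(a) = a (eigenvector centrality). the two tiny layers below are illustrative, and this simplified version couples consecutive layers symmetrically with strength ω; it is a sketch of the idea, not the full construction of the cited papers.

```python
import numpy as np

def supracentrality(layer_matrices, omega):
    """build a supracentrality matrix c(omega) that places the
    centrality matrix c(a^(t)) = a^(t) of each time layer on the
    diagonal and couples consecutive layers with strength omega, then
    return joint centralities from the dominant eigenvector."""
    T = len(layer_matrices)
    n = layer_matrices[0].shape[0]
    C = np.zeros((T * n, T * n))
    for t, A in enumerate(layer_matrices):
        C[t*n:(t+1)*n, t*n:(t+1)*n] = A
    idx = np.arange((T - 1) * n)
    C[idx, idx + n] = omega  # couple each node to itself in layer t+1
    C[idx + n, idx] = omega
    vals, vecs = np.linalg.eigh(C)          # C is symmetric here
    v = np.abs(vecs[:, np.argmax(vals)])    # dominant eigenvector
    return v.reshape(T, n)                  # centrality of node i at time t

# two illustrative layers: an edge between the two nodes, then no edges
A1 = np.array([[0., 1.], [1., 0.]])
A2 = np.array([[0., 0.], [0., 0.]])
joint = supracentrality([A1, A2], omega=1.0)
```

the reshaped output gives each node a centrality trajectory over the time layers, and sweeping ω shows the trade-off between per-layer rankings and ranking consistency across time that the text describes.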
numerous methods for community detection -including inference via sbms [135] , maximization of objective functions (especially "modularity") [117] , and methods based on random walks and bottlenecks to their traversal of a network [38, 80] -have been generalized from graphs to multilayer networks. they have yielded insights in a diverse variety of applications, including brain networks [183] , granular materials [129] , political voting networks [113, 117] , disease spreading [158] , and ecology and animal behavior [45, 139] . to assist with such applications, there are efforts to develop and analyze multilayer random-network models that incorporate rich and flexible structures [11] , such as diverse types of interlayer correlations. activity-driven (ad) models of temporal networks [136] are a popular family of generative models that encode instantaneous time-dependent descriptions of network dynamics through a function called an "activity potential", which encodes the mechanism to generate connections and characterizes the interactions between entities in a network. an activity potential encapsulates all of the information about the temporal network dynamics of an ad model, making it tractable to study dynamical processes (such as ones from section 4) on networks that are generated by such a model. it is also common to compare the properties of networks that are generated by ad models to those of empirical temporal networks [74] . in the original ad model of perra et al. [136] , one considers a network with n entities, which we encode by the nodes. we suppose that node i has an activity rate a_i = η x_i , which gives the probability per unit time to create new interactions with other nodes.
the scaling factor η ensures that the mean number of active nodes per unit time is η⟨x⟩n. we define the activity rates such that x_i ∈ [ε, 1], where ε > 0, and we assign each x_i from a probability distribution f(x) that can either take a desired functional form or be constructed from empirical data. the model uses the following generative process:

• at each discrete time step (of length ∆t), start with a network g_t that consists of n isolated nodes.
• with a probability a_i ∆t that is independent of other nodes, node i is active and generates m edges, each of which attaches to other nodes uniformly (i.e., with the same probability for each node) and independently at random (without replacement). nodes that are not active can still receive edges from active nodes.
• at the next time step t + ∆t, we delete all edges from g_t, so all interactions have a constant duration of ∆t. we then generate new interactions from scratch.

this is convenient, as it allows one to apply techniques from markov chains. because entities in time step t do not have any memory of previous time steps, f(x) encodes the network structure and dynamics. the ad model of perra et al. [136] is overly simplistic, but it is amenable to analysis and has provided a foundation for many more general ad models, including ones that incorporate memory [200]. in section 6.4, i discuss a generalization of ad models to simplicial complexes [137] that allows one to study instantaneous interactions that involve three or more entities in a network. many networked systems evolve continuously in time, but most investigations of time-dependent networks rely on discrete or discretized time. it is important to undertake more analysis of continuous-time temporal networks. researchers have examined continuous-time networks in a variety of scenarios.
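the generative process above can be sketched as follows (a hedged python illustration; the function name and edge representation are my own, not from [136]):

```python
import random

def activity_driven_snapshot(activities, m, dt=1.0, rng=random):
    """One time step of a Perra-et-al-style activity-driven model:
    each node i activates with probability a_i * dt and, if active,
    sends m edges to distinct other nodes chosen uniformly at random.
    Returns the snapshot as a set of undirected edges (u, v) with u < v."""
    n = len(activities)
    edges = set()
    for i, a in enumerate(activities):
        if rng.random() < a * dt:
            # m distinct targets, chosen uniformly without replacement
            targets = rng.sample([j for j in range(n) if j != i], m)
            for j in targets:
                edges.add((min(i, j), max(i, j)))
    return edges
```

one regenerates the snapshot from scratch at every time step, which is what makes the resulting dynamics markovian.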
examples include a compartmental model of biological contagions [185], a generalization of katz centrality to continuous time [65], generalizations of ad models (see section 3.1.3) to continuous time [198, 199], and rankings in competitive sports [115]. in a recent paper [2], my collaborators and i formulated a notion of "tie-decay networks" for studying networks that evolve in continuous time. we distinguished between interactions, which we modeled as discrete contacts, and ties, which encode relationships and their strength as a function of time. for example, perhaps the strength of a tie decays exponentially after the most recent interaction. more realistically, perhaps the decay rate depends on the weight of a tie, with strong ties decaying more slowly than weak ones. one can also use point-process models like hawkes processes [99] to examine similar ideas using a node-centric perspective. suppose that there are n interacting entities, and let b(t) be the n × n time-dependent, real, non-negative matrix whose entries b_ij(t) encode the tie strength between agents i and j at time t. in [2], we made the following simplifying assumptions:

1. as in [81], ties decay exponentially when there are no interactions: db_ij/dt = −α b_ij, where α ≥ 0 is the decay rate.
2. if two entities interact at time t = τ, the strength of the tie between them grows instantaneously by 1.

see [201] for a comparison of various choices, including those in [2] and [81], for tie evolution over time. in practice (e.g., in data-driven applications), one obtains b(t) by discretizing time, so let's suppose that there is at most one interaction during each time step of length ∆t. this occurs, for example, in a poisson process. such time discretization is common in the simulation of stochastic dynamical systems, such as in gillespie algorithms [41, 142, 189]. consider an n × n matrix a(t) in which a_ij(t) = 1 if node i interacts with node j at time t and a_ij(t) = 0 otherwise.
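under assumptions 1 and 2, a time-discretized update of the tie-strength matrix takes a simple form; the following python sketch (function name mine, assuming the exponential-decay-plus-increment rule above) decays all ties and then adds the new interactions:

```python
import numpy as np

def tie_decay_step(B, A, alpha, dt):
    """One discrete step of tie-decay dynamics: all tie strengths in B
    decay exponentially at rate alpha over a step of length dt, and each
    interaction recorded in the 0-1 matrix A increments the
    corresponding tie strength by 1."""
    return np.exp(-alpha * dt) * B + A
```

iterating this map from b(0) = 0 accumulates an exponentially discounted history of the interaction matrices.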
for a directed network, a(t) has exactly one nonzero entry during each time step when there is an interaction and no nonzero entries when there isn't one. for an undirected network, because of the symmetric nature of interactions, there are exactly two nonzero entries in time steps that include an interaction. we write

b(t + ∆t) = e^(−α∆t) b(t) + a(t + ∆t) .

equivalently, if interactions between entities occur at times τ^(ℓ) such that 0 ≤ τ^(0) < τ^(1) < · · · < τ^(t), then at time t ≥ τ^(t), we have

b(t) = ∑_(ℓ=0)^(t) e^(−α(t − τ^(ℓ))) a(τ^(ℓ)) .

in [2], my coauthors and i generalized pagerank [20, 56] to tie-decay networks. one nice feature of our tie-decay pagerank is that it is applicable not just to data sets, but also to data streams, as one updates the pagerank values as new data arrives. by contrast, one problematic feature of many methods that rely on multilayer representations of temporal networks is that one needs to recompute everything for an entire data set upon acquiring new data, rather than updating prior results in a computationally efficient way. a dynamical process can be discrete, continuous, or some mixture of the two; it can also be either deterministic or stochastic. it can take the form of one or several coupled ordinary differential equations (odes), partial differential equations (pdes), maps, stochastic differential equations, and so on. a dynamical process requires a rule for updating the states of its dependent variables with respect to one or more independent variables (e.g., time), and one also has (one or a variety of) initial conditions and/or boundary conditions. to formalize a dynamical process on a network, one needs a rule for how to update the states of the nodes and/or edges. the nodes (of one or more types) of a network are connected to each other in nontrivial ways by one or more types of edges. this leads to a natural question: how does nontrivial connectivity between nodes affect dynamical processes on a network [142]?
when studying a dynamical process on a network, the network structure encodes which entities (i.e., nodes) of a system interact with each other and which do not. if desired, one can ignore the network structure entirely and just write out a dynamical system. however, keeping track of network structure is often a very useful and insightful form of bookkeeping, which one can exploit to systematically explore how particular structures affect the dynamics of particular dynamical processes. prominent examples of dynamical processes on networks include coupled oscillators [6, 149] , games [78] , and the spread of diseases [89, 130] and opinions [23, 100] . there is also a large body of research on the control of dynamical processes on networks [103, 116] . most studies of dynamics on networks have focused on extending familiar models -such as compartmental models of biological contagions [89] or kuramoto phase oscillators [149] -by coupling entities using various types of network structures, but it is also important to formulate new dynamical processes from scratch, rather than only studying more complicated generalizations of our favorite models. when trying to illuminate the effects of network structure on a dynamical process, it is often insightful to provide a baseline comparison by examining the process on a convenient ensemble of random networks [142] . a simple, but illustrative, dynamical process on a network is the watts threshold model (wtm) of a social contagion [100, 142] . it provides a framework for illustrating how network structure can affect state changes, such as the adoption of a product or a behavior, and for exploring which scenarios lead to "virality" (in the form of state changes of a large number of nodes in a network). the original wtm [194] , a binary-state threshold model that resembles bootstrap percolation [24] , has a deterministic update rule, so stochasticity can come only from other sources (see section 4.2). 
in a binary state model, each node is in one of two states; see [55] for a tabulation of well-known binary-state dynamics on networks. the wtm is a modification of mark granovetter's threshold model for social influence in a fully-mixed population [62] . see [86, 186] for early work on threshold models on networks that developed independently from investigations of the wtm. threshold contagion models have been developed for many scenarios, including contagions with multiple stages [109] , models with adoption latency [124] , models with synergistic interactions [83] , and situations with hipsters (who may prefer to adopt a minority state) [84] . in a binary-state threshold model such as the wtm, each node i has a threshold r i that one draws from some distribution. suppose that r i is constant in time, although one can generalize it to be time-dependent. at any time, each node can be in one of two states: 0 (which represents being inactive, not adopted, not infected, and so on) or 1 (active, adopted, infected, and so on). a binary-state model is a drastic oversimplification of reality, but the wtm is able to capture two crucial features of social systems [125] : interdependence (an entity's behavior depends on the behavior of other entities) and heterogeneity (as nodes with different threshold values behave differently). one can assign a seed number or seed fraction of nodes to the active state, and one can choose the initially active nodes either deterministically or randomly. the states of the nodes change in time according to an update rule, which can either be synchronous (such that it is a map) or asynchronous (e.g., as a discretization of continuous time) [142] . in the wtm, the update rule is deterministic, so this choice affects only how long it takes to reach a steady state; it does not affect the steady state itself. 
with a stochastic update rule, the synchronous and asynchronous versions of ostensibly the "same" model can behave in drastically different ways [43]. in the wtm on an undirected network, to update the state of a node, one compares its fraction s_i/k_i of active neighbors (where s_i is the number of active neighbors and k_i is the degree of node i) to the node's threshold r_i. an inactive node i becomes active (i.e., it switches from state 0 to state 1) if s_i/k_i ≥ r_i; otherwise, it stays inactive. the states of nodes in the wtm are monotonic, in the sense that a node that becomes active remains active forever. this feature is convenient for deriving accurate approximations for the global behavior of the wtm using branching-process approximations [55, 142] or when analyzing the behavior of the wtm using tools such as persistent homology [174]. a dynamical process on a network can take the form of a stochastic process [121, 142]. there are several possible sources of stochasticity: (1) choice of initial condition, (2) choice of which nodes or edges to update (when considering asynchronous updating), (3) the rule for updating nodes or edges, (4) the values of parameters in an update rule, and (5) selection of particular networks from a random-graph ensemble (i.e., a probability distribution on graphs). some or all of these sources of randomness can be present when studying dynamical processes on networks. it is desirable to compare the sample mean of a stochastic process on a network to an ensemble average (i.e., to an expectation over a suitable probability distribution). prominent examples of stochastic processes on networks include percolation [153], random walks [107], compartmental models of biological contagions [89, 130], bounded-confidence models with continuous-valued opinions [110], and other opinion and voter models [23, 100, 142, 148].
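a minimal python sketch of one synchronous wtm update (names and array conventions are my own) makes the threshold rule concrete:

```python
import numpy as np

def wtm_step(A, states, thresholds):
    """One synchronous update of the Watts threshold model on an
    undirected network with adjacency matrix A: an inactive node
    activates when its fraction of active neighbors reaches its
    threshold; active nodes stay active (monotonicity)."""
    degrees = A.sum(axis=1)
    active_nbrs = A @ states
    # fraction s_i / k_i of active neighbors (0 for isolated nodes)
    frac = active_nbrs / np.maximum(degrees, 1)
    newly_active = (frac >= thresholds).astype(int)
    return np.maximum(states, newly_active)   # enforce monotonicity
```

iterating `wtm_step` until the state vector stops changing yields the steady state, which (because the rule is deterministic and monotonic) does not depend on whether the updates are synchronous or asynchronous.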
compartmental models of biological contagions are a topic of intense interest in network science [89, 121, 130, 142]. a compartment represents a possible state of a node; examples include susceptible, infected, zombified, vaccinated, and recovered. an update rule determines how a node changes its state from one compartment to another. one can formulate models with as many compartments as desired [18], but investigations of how network structure affects dynamics typically have employed examples with only two or three compartments [89, 130]. researchers have studied various extensions of compartmental models, including contagions on multilayer and temporal networks [4, 34, 90], metapopulation models on networks [30] for simultaneously studying network connectivity and subpopulations with different characteristics, non-markovian contagions on networks for exploring memory effects [188], and explicit incorporation of individuals with essential societal roles (e.g., health-care workers) [161]. as i discuss in section 4.4, one can also examine coupling between biological contagions and the spread of information (e.g., "awareness") [50, 192]. one can also use compartmental models to study phenomena, such as dissemination of ideas on social media [58] and forecasting of political elections [190], that are much different from the spread of diseases. one of the most prominent examples of a compartmental model is a susceptible-infected-recovered (sir) model, which has three compartments. susceptible nodes are healthy and can become infected, and infected nodes can eventually recover. the steady state of the basic sir model on a network is related to a type of bond percolation [63, 68, 87, 181]. there are many variants of sir models and other compartmental models on networks [89]. see [114] for an illustration using susceptible-infected-susceptible (sis) models. suppose that an infection is transmitted from an infected node to a susceptible neighbor at a rate of λ.
the probability of a transmission event on one edge between an infected node and a susceptible node in an infinitesimal time interval dt is λ dt. assuming that all infection events are independent, the probability that a susceptible node with s infected neighbors becomes infected (i.e., for a node to transition from the s compartment to the i compartment, which represents both being infected and being infective) during dt is

1 − (1 − λ dt)^s .

if an infected node recovers at a constant rate of µ, the probability that it switches from state i to state r in an infinitesimal time interval dt is µ dt. when there is no source of stochasticity, a dynamical process on a network is "deterministic". a deterministic dynamical system can take the form of a system of coupled maps, odes, pdes, or something else. as with stochastic systems, the network structure encodes which entities of a system interact with each other and which do not. there are numerous interesting deterministic dynamical systems on networks - just incorporate nontrivial connectivity between entities into your favorite deterministic model - although it is worth noting that some stochastic features (e.g., choosing parameter values from a probability distribution or sampling choices of initial conditions) can arise in these models. for concreteness, let's consider the popular setting of coupled oscillators. each node in a network is associated with an oscillator, and we want to examine how network structure affects the collective behavior of the coupled oscillators. it is common to investigate various forms of synchronization (a type of coherent behavior), such that the rhythms of the oscillators adjust to match each other (or to match a subset of the oscillators) because of their interactions [138]. a variety of methods, such as "master stability functions" [132], have been developed to study the local stability of synchronized states and their generalizations [6, 142], such as cluster synchrony [133].
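returning to the stochastic sir dynamics above, here is a discrete-time python sketch (with a finite time step dt, so a susceptible node with s infected neighbors is infected with probability 1 − (1 − λ dt)^s and an infected node recovers with probability µ dt; the function name and state encoding are mine):

```python
import random

def sir_step(adj, states, lam, mu, dt, rng=random):
    """One discrete-time step of a stochastic SIR model on a network.
    adj[i] lists the neighbors of node i; states[i] is 'S', 'I', or 'R'.
    Infection probability per step: 1 - (1 - lam*dt)**s for a susceptible
    node with s infected neighbors; recovery probability per step: mu*dt."""
    new = list(states)
    for i, state in enumerate(states):
        if state == 'S':
            s = sum(1 for j in adj[i] if states[j] == 'I')
            if rng.random() < 1.0 - (1.0 - lam * dt) ** s:
                new[i] = 'I'
        elif state == 'I':
            if rng.random() < mu * dt:
                new[i] = 'R'
    return new
```

note that all transitions are evaluated against the *old* state vector, so the update is synchronous.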
cluster synchrony, which is related to work on "coupled-cell networks" [59], uses ideas from computational group theory to find synchronized sets of oscillators that are not synchronized with other sets of synchronized oscillators. many studies have also examined other types of states, such as "chimera states" [128], in which some oscillators behave coherently but others behave incoherently. (analogous phenomena sometimes occur in mathematics departments.) a ubiquitous example is coupled kuramoto oscillators on a network [6, 39, 149], which is perhaps the most common setting for exploring and developing new methods for studying coupled oscillators. (in principle, one can then build on these insights in studies of other oscillatory systems, such as in applications in neuroscience [7].) coupled kuramoto oscillators have been used for modeling numerous phenomena, including jetlag [104] and singing in frogs [126]. indeed, a "snowbird" (siam) conference on applied dynamical systems would not be complete without at least several dozen talks on the kuramoto model. in the kuramoto model, each node i has an associated phase θ_i(t) ∈ [0, 2π). in the case of "diffusive" coupling between the nodes, the dynamics of the ith node are governed by the equation

dθ_i/dt = ω_i + ∑_j b_ij a_ij f_ij(θ_j − θ_i) ,   (4)

where one typically draws the natural frequency ω_i of node i from some distribution g(ω), the scalar a_ij is an adjacency-matrix entry of an unweighted network, b_ij is the coupling strength on oscillator i from oscillator j (so b_ij a_ij is an element of an adjacency matrix w of a weighted network), and f_ij(y) = sin(y) is the coupling function, which depends only on the phase difference between oscillators i and j because of the diffusive nature of the coupling. once one knows the natural frequencies ω_i, the model (4) is a deterministic dynamical system, although there have been studies of coupled kuramoto oscillators with additional stochastic terms [60].
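a minimal python sketch of euler integration of the kuramoto model (4) with f_ij(y) = sin(y), along with the usual order parameter for measuring coherence (function names are mine; this is an illustration, not a production integrator):

```python
import numpy as np

def kuramoto_step(theta, omega, W, dt):
    """One Euler step for Kuramoto oscillators on a weighted network:
    dtheta_i/dt = omega_i + sum_j W_ij * sin(theta_j - theta_i),
    where W_ij = b_ij * a_ij is the weighted adjacency matrix."""
    phase_diff = theta[None, :] - theta[:, None]   # entry (i, j) = theta_j - theta_i
    dtheta = omega + np.sum(W * np.sin(phase_diff), axis=1)
    return np.mod(theta + dt * dtheta, 2 * np.pi)

def order_parameter(theta):
    """Magnitude r = |mean(e^{i*theta})| of the Kuramoto order parameter;
    r = 1 for full phase synchrony and r ~ 0 for incoherent phases."""
    return np.abs(np.mean(np.exp(1j * theta)))
```

with identical natural frequencies and positive coupling, repeated steps drive the phases together and the order parameter toward 1.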
traditional studies of (4) and its generalizations draw the natural frequencies from some distribution (e.g., a gaussian or a compactly supported distribution), but some studies of so-called "explosive synchronization" (in which there is an abrupt phase transition from incoherent oscillators to synchronized oscillators) have employed deterministic natural frequencies [16, 39]. the properties of the frequency distribution g(ω) have a significant effect on the dynamics of (4). important features of g(ω) include whether it has compact support or not, whether it is symmetric or asymmetric, and whether it is unimodal or not [149, 170]. the model (4) has been generalized in numerous ways. for example, researchers have considered a large variety of coupling functions f_ij (including ones that are not diffusive) and have incorporated an inertia term θ̈_i to yield a second-order kuramoto oscillator at each node [149]. the latter generalization is important for studies of coupled oscillators and synchronized dynamics in electric power grids [196]. another noteworthy direction is the analysis of the kuramoto model on "graphons" (see, e.g., [108]), an important type of structure that arises in a suitable limit of large networks. an increasingly prominent topic in network analysis is the examination of how multilayer network structures - multiple system components, multiple types of edges, co-occurrence and coupling of multiple dynamical processes, and so on - affect qualitative and quantitative dynamics [3, 34, 90]. for example, can certain types of multilayer structures induce unexpected instabilities or phase transitions in certain types of dynamical processes? there are two categories of dynamical processes on multilayer networks: (1) a single process can occur on a multilayer network; or (2) processes on different layers of a multilayer network can interact with each other [34].
an important example of the first category is a random walk, where the relative speeds and probabilities of steps within layers versus steps between layers affect the qualitative nature of the dynamics. this, in turn, affects methods (such as community detection [38, 80] ) that are based on random walks, as well as anything else in which the diffusion is relevant [22, 36] . two other examples of the first category are the spread of information on social media (for which there are multiple communication channels, such as facebook and twitter) and multimodal transportation systems [51] . for instance, a multilayer network structure can induce congestion even when a system without coupling between layers is decongested in each layer independently [1] . examples of the second category of dynamical process are interactions between multiple strains of a disease and interactions between the spread of disease and the spread of information [49, 50, 192] . many other examples have been studied [3] , including coupling between oscillator dynamics on one layer and a biased random walk on another layer (as a model for neuronal oscillations coupled to blood flow) [122] . numerous interesting phenomena can occur when dynamical systems, such as spreading processes, are coupled to each other [192] . for example, the spreading of one disease can facilitate infection by another [157] , and the spread of awareness about a disease can inhibit spread of the disease itself (e.g., if people stay home when they are sick) [61] . interacting spreading processes can also exhibit other fascinating dynamics, such as oscillations that are induced by multilayer network structures in a biological contagion with multiple modes of transmission [79] and novel types of phase transitions [34] . a major simplification in most work thus far on dynamical processes on multilayer networks is a tendency to focus on toy models. 
for example, a typical study of coupled spreading processes may consider a standard (e.g., sir) model on each layer, and it may draw the connectivity pattern of each layer from the same standard random-graph model (e.g., an erdős-rényi model or a configuration model). however, when studying dynamics on multilayer networks, it is particularly important in future work to incorporate heterogeneity in network structure and/or dynamical processes. for instance, diseases spread offline but information spreads both offline and online, so investigations of coupled information and disease spread ought to consider fundamentally different types of network structures for the two processes. network structures also affect the dynamics of pdes on networks [8, 31, 57, 77, 112]. interesting examples include a study of a burgers equation on graphs to investigate how network structure affects the propagation of shocks [112] and investigations of reaction-diffusion equations and turing patterns on networks [8, 94]. the latter studies exploit the rich theory of laplacian dynamics on graphs (and concomitant ideas from spectral graph theory) [107, 187] and examine the addition of nonlinear terms to laplacians on various types of networks (including multilayer ones). a mathematically oriented thread of research on pdes on networks has built on ideas from so-called "quantum graphs" [57, 96] to study wave propagation on networks through the analysis of "metric graphs". metric graphs differ from the usual "combinatorial graphs", which in other contexts are usually called simply "graphs". in metric graphs, in addition to nodes and edges, each edge e has a positive length l_e ∈ (0, ∞]. for many experimentally relevant scenarios (e.g., in models of circuits of quantum wires [195]), there is a natural embedding into space, but metric graphs that are not embedded in space are also appropriate for some applications.
as the nomenclature suggests, one can equip a metric graph with a natural metric. if a sequence {e_j}_(j=1)^(m) of edges forms a path, the length of the path is ∑_j l_(e_j). the distance ρ(v_1, v_2) between two nodes, v_1 and v_2, is the minimum path length between them. we place coordinates along each edge, so we can compute a distance between points x_1 and x_2 on a metric graph even when those points are not located at nodes. traditionally, one assumes that the infinite ends (which one can construe as "leads" at infinity, as in scattering theory) of infinite edges have degree 1. it is also traditional to assume that there is always a positive distance between distinct nodes and that there are no finite-length paths with infinitely many edges. see [96] for further discussion. to study waves on metric graphs, one needs to define operators, such as the negative second derivative or more general schrödinger operators. this exploits the fact that there are coordinates for all points on the edges - not only at the nodes themselves, as in combinatorial graphs. when studying waves on metric graphs, it is also necessary to impose boundary conditions at the nodes [96]. many studies of wave propagation on metric graphs have considered generalizations of nonlinear wave equations, such as the cubic nonlinear schrödinger (nls) equation [123] and a nonlinear dirac equation [154]. the overwhelming majority of studies of metric graphs (with both linear and nonlinear waves) have focused on networks with a very small number of nodes, as even small networks yield very interesting dynamics. for example, marzuola and pelinovsky [106] analyzed symmetry-breaking and symmetry-preserving bifurcations of standing waves of the cubic nls equation on a dumbbell graph (with two rings attached to a central line segment and kirchhoff boundary conditions at the nodes). kairzhan et al. [85] studied the spectral stability of half-soliton standing waves of the cubic nls equation on balanced star graphs.
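for nodes (rather than arbitrary points on edges), computing the distance ρ(v_1, v_2) amounts to a shortest-path computation with edge lengths; a python sketch using dijkstra's algorithm (names mine; finite edge lengths only):

```python
import heapq

def metric_distance(edges, source, target):
    """Distance between two nodes of a metric graph, where each edge is a
    triple (u, v, length) with positive length l_e, and the length of a
    path is the sum of its edge lengths (Dijkstra's algorithm)."""
    graph = {}
    for u, v, l in edges:
        graph.setdefault(u, []).append((v, l))
        graph.setdefault(v, []).append((u, l))
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float('inf')):
            continue                      # stale heap entry
        for v, l in graph.get(u, []):
            if d + l < dist.get(v, float('inf')):
                dist[v] = d + l
                heapq.heappush(heap, (d + l, v))
    return float('inf')                   # target unreachable
```

extending this to points in the interiors of edges only requires splitting the two edges that contain the query points.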
sobirov et al. [168] studied scattering and transmission at nodes of sine-gordon solitons on networks (e.g., on a star graph and a small tree). a particularly interesting direction for future work is to study wave dynamics on large metric graphs. this will help extend investigations, as in odes and maps, of how network structures affect dynamics on networks to the realm of linear and nonlinear waves. one can readily formulate wave equations on large metric graphs by specifying relevant boundary conditions and rules at each junction. for example, joly et al. [82] recently examined wave propagation of the standard linear wave equation on fractal trees. because many natural real-life settings are spatially embedded (e.g., wave propagation in granular materials [101, 129] and traffic-flow patterns in cities), it will be particularly valuable to examine wave dynamics on (both synthetic and empirical) spatially-embedded networks [9] . therefore, i anticipate that it will be very insightful to undertake studies of wave dynamics on networks such as random geometric graphs, random neighborhood graphs, and other spatial structures. a key question in network analysis is how different types of network structure affect different types of dynamical processes [142] , and the ability to take a limit as model synthetic networks become infinitely large (i.e., a thermodynamic limit) is crucial for obtaining many key theoretical insights. dynamics of networks and dynamics on networks do not occur in isolation; instead, they are coupled to each other. researchers have studied the coevolution of network structure and the states of nodes and/or edges in the context of "adaptive networks" (which are also known as "coevolving networks") [66, 159] . 
whether it is sensible to study a dynamical process on a time-independent network, a temporal network with frozen (or no) node or edge states, or an adaptive network depends on the relative time scales of the dynamics of network structure and the states of nodes and/or edges of a network. see [142] for a brief discussion. models in the form of adaptive networks provide a promising mechanistic approach to simultaneously explain both structural features (e.g., degree distributions) and temporal features (e.g., burstiness) of empirical data [5]. incorporating adaptation into conventional models can produce extremely interesting and rich dynamics, such as the spontaneous development of extreme states in opinion models [160]. most studies of adaptive networks that include some analysis (i.e., that go beyond numerical computations) have employed rather artificial adaptation rules for adding, removing, and rewiring edges. this is relevant for mathematical tractability, but it is important to go beyond these limitations by considering more realistic types of adaptation and coupling between network structure (including multilayer structures, as in [12]) and the states of nodes and edges. when people are sick, they stay home from work or school. people also form and remove social connections (both online and offline) based on observed opinions and behaviors. to study these ideas using adaptive networks, researchers have coupled models of biological and social contagions with time-dependent networks [100, 142]. an early example of an adaptive network of disease spreading is the susceptible-infected (si) model in gross et al. [67]. in this model, susceptible nodes sometimes rewire their incident edges to "protect themselves". suppose that we have an n-node network with a constant number of undirected edges. each node is either susceptible (i.e., of type s) or infected (i.e., of type i).
at each time step, and for each edge - so-called "discordant edges" - between nodes of different types, the susceptible node becomes infected with probability λ. for each discordant edge, with some probability κ, the incident susceptible node breaks the edge and rewires to some other susceptible node. this is a "rewire-to-same" mechanism, to use the language from some adaptive opinion models [40, 97]. (in this model, multi-edges and self-edges are not allowed.) during each time step, infected nodes can also recover to become susceptible again. gross et al. [67] studied how the rewiring probability affects the "basic reproductive number", which measures how many secondary infections on average occur for each primary infection [18, 89, 130]. this scalar quantity determines the size of a critical infection probability λ* to maintain a stable epidemic (as determined traditionally using linear stability analysis of an endemic state). a high rewiring rate can significantly increase λ* and thereby significantly reduce the prevalence of a contagion. although results like these are perhaps intuitively clear, other studies of contagions on adaptive networks have yielded potentially actionable (and arguably nonintuitive) insights. for example, scarpino et al. [161] demonstrated using an adaptive compartmental model (along with some empirical evidence) that the spread of a disease can accelerate when individuals with essential societal roles (e.g., health-care workers) become ill and are replaced with healthy individuals. another type of model with many interesting adaptive variants is the opinion model [23, 142], especially in the form of generalizations of classical voter models [148]. voter dynamics were first considered in the 1970s by clifford and sudbury [29] as a model for species competition, and the dynamical process that they introduced was dubbed "the voter model" by holley and liggett shortly thereafter [69].
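the adaptive rewiring dynamics of gross et al. described above can be sketched in python as follows (the update order, tie-breaking, and function name are my own simplifications of [67], not the authors' exact algorithm):

```python
import random

def adaptive_sis_step(edges, states, lam, kappa, mu, rng=random):
    """One step of a Gross-et-al-style adaptive model: along each
    discordant (S-I) edge, the susceptible node becomes infected with
    probability lam; otherwise, with probability kappa, it breaks the
    edge and rewires to another susceptible node (rewire-to-same, no
    multi-edges or self-edges). Infected nodes then recover with
    probability mu. Edges are canonical tuples (u, v) with u < v."""
    edges = set(edges)
    susceptible = [i for i, s in enumerate(states) if s == 'S']
    new_states = list(states)
    for (u, v) in list(edges):
        if states[u] == states[v]:
            continue                         # concordant edge: nothing happens
        s_node, i_node = (u, v) if states[u] == 'S' else (v, u)
        if rng.random() < lam:
            new_states[s_node] = 'I'         # infection along the edge
        elif rng.random() < kappa:
            candidates = [w for w in susceptible
                          if w not in (s_node, i_node)
                          and (min(s_node, w), max(s_node, w)) not in edges]
            if candidates:                   # rewire to a susceptible node
                edges.discard((min(u, v), max(u, v)))
                w = rng.choice(candidates)
                edges.add((min(s_node, w), max(s_node, w)))
    for i, s in enumerate(states):           # recovery back to susceptible
        if s == 'I' and rng.random() < mu:
            new_states[i] = 'S'
    return edges, new_states
```

increasing κ in simulations of this kind of sketch starves the infection of discordant edges, which is the mechanism behind the increase of λ* described above.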
voter dynamics are fun and are popular to study [148] , although it is questionable whether it is ever possible to genuinely construe voter models as models of voters [44] . holme and newman [71] undertook an early study of a rewire-to-same adaptive voter model. inspired by their research, durrett et al. [40] compared the dynamics from two different types of rewiring in an adaptive voter model. in each variant of their model, one considers an n-node network and supposes that each node is in one of two states. the network structure and the node states coevolve. pick an edge uniformly at random. if this edge is discordant, then with probability 1 − κ, one of its incident nodes adopts the opinion state of the other. otherwise, with complementary probability κ, a rewiring action occurs: one removes the discordant edge, and one of the associated nodes attaches to a new node either through a rewire-to-same mechanism (choosing uniformly at random among the nodes with the same opinion state) or through a "rewire-to-random" mechanism (choosing uniformly at random among all nodes). as with the adaptive si model in [67] , self-edges and multi-edges are not allowed. the models in [40] evolve until there are no discordant edges. there are several key questions. does the system reach a consensus (in which all nodes are in the same state)? if so, how long does it take to converge to consensus? if not, how many opinion clusters (each of which is a connected component, perhaps interpretable as an "echo chamber", of the final network) are there at steady state? how long does it take to reach this state? the answers and analysis are subtle; they depend on the initial network topology, the initial conditions, and the specific choice of rewiring rule. 
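a python sketch of one update of the adaptive voter models of durrett et al. (the choice of which endpoint adopts or rewires, and the function name, are my own simplifications of [40]):

```python
import random

def adaptive_voter_step(edges, opinions, kappa, mode='rewire-to-random',
                        rng=random):
    """One update of a Durrett-et-al-style adaptive voter model: pick an
    edge uniformly at random; if it is discordant, then with probability
    1 - kappa one endpoint adopts the other's opinion, and otherwise that
    endpoint rewires to a new node ('rewire-to-same' chooses among
    same-opinion nodes; 'rewire-to-random' among all nodes). No
    multi-edges or self-edges. Edges are canonical tuples (u, v), u < v."""
    edges = set(edges)
    u, v = rng.choice(sorted(edges))
    if opinions[u] == opinions[v]:
        return edges, opinions               # concordant edge: no change
    opinions = list(opinions)
    if rng.random() < 1.0 - kappa:
        opinions[u] = opinions[v]            # adopt the neighbor's opinion
    else:
        pool = [w for w in range(len(opinions))
                if w not in (u, v)
                and (min(u, w), max(u, w)) not in edges
                and (mode == 'rewire-to-random'
                     or opinions[w] == opinions[u])]
        if pool:                             # rewire the discordant edge
            edges.discard((u, v))
            w = rng.choice(pool)
            edges.add((min(u, w), max(u, w)))
    return edges, opinions
```

iterating this step until no discordant edges remain reproduces the absorbing states discussed above (consensus, or a partition into opinion clusters).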
as with other adaptive network models, researchers have developed some nonrigorous theory (e.g., using mean-field approximations and their generalizations) on adaptive voter models with simplistic rewiring schemes, but they have struggled to extend these ideas to models with more realistic rewiring schemes. there are very few mathematically rigorous results on adaptive voter models, although there do exist some, under various assumptions on initial network structure and edge density [10]. researchers have generalized adaptive voter models to consider more than two opinion states [163] and more general types of rewiring schemes [105]. as with other adaptive networks, analyzing adaptive opinion models with increasingly diverse types of rewiring schemes (ideally with a move towards increasing realism) is particularly important. in [97], yacoub kureh and i studied a variant of a voter model with nonlinear rewiring (where the probability that a node rewires or adopts is a function of how well it "fits in" within its neighborhood), including a "rewire-to-none" scheme to model unfriending and unfollowing in online social networks. it is also important to study adaptive opinion models with more realistic types of opinion dynamics. a promising example is adaptive generalizations of bounded-confidence models (see the introduction of [110] for a brief review of bounded-confidence models), which have continuous opinion states, with nodes interacting either with other nodes or with other entities (such as media [21]) whose opinions are sufficiently close to theirs. a recent numerical study examined an adaptive bounded-confidence model [19]; this is an important direction for future investigations. it is also interesting to examine how the adaptation of oscillators (including their intrinsic frequencies and/or the network structure that couples them to each other) affects the collective behavior (e.g., synchronization) of a network of oscillators [149].
such ideas are useful for exploring mechanistic models of learning in the brain (e.g., through adaptation of coupling between oscillators to produce a desired limit cycle [171]). one nice example is by skardal et al. [167], who examined an adaptive model of coupled kuramoto oscillators as a toy model of learning. first, we write the kuramoto system as θ̇_i = ω_i + Σ_j b_ij f_ij(θ_j − θ_i), where f_ij is a 2π-periodic function of the phase difference between oscillators i and j and b_ij is the coupling strength on oscillator i from oscillator j. one way to incorporate adaptation is to define an "order parameter" r_i (which, in its traditional form, quantifies the amount of coherence of the coupled kuramoto oscillators [149]) for the ith oscillator by r_i = Σ_j a_ij e^{iθ_j} and to consider a dynamical system in which the phases and the couplings coevolve (see [167] for the equations), where re(ζ) denotes the real part of a quantity ζ and im(ζ) denotes its imaginary part. in that model, λ_d denotes the largest positive eigenvalue of the adjacency matrix a, the variable z_i(t) is a time-delayed version of r_i with time parameter τ (with τ → 0 implying that z_i → r_i), and z_i* denotes the complex conjugate of z_i. one draws the frequencies ω_i from some distribution (e.g., a lorentz distribution, as in [167]). the parameter t gives an adaptation time scale, and α ∈ ℝ and β ∈ ℝ are parameters (which one can adjust to study bifurcations). skardal et al. [167] interpreted scenarios with β > 0 as "hebbian" adaptation (see [27]) and scenarios with β < 0 as anti-hebbian adaptation, as they observed that oscillator synchrony is promoted when β > 0 and inhibited when β < 0. most studies of networks have focused on networks with pairwise connections, in which each edge (unless it is a self-edge, which connects a node to itself) connects exactly two nodes to each other. however, many interactions (such as playing games, coauthoring papers and other forms of collaboration, and horse races) often occur among three or more entities of a network.
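before moving on to higher-order interactions, the adaptation idea can be made concrete in code. the sketch below is not the model of [167]; it uses the base kuramoto system with f_ij(θ) = sin θ and a simple hebbian-style relaxation of each coupling b_ij toward α cos(θ_j − θ_i), which is my own stand-in rule: couplings of in-phase pairs strengthen and couplings of anti-phase pairs weaken.

```python
import cmath
import math
import random

random.seed(1)

N = 30
ALPHA = 2.0        # > 0 mimics "Hebbian" reinforcement of in-phase pairs
T_ADAPT = 10.0     # adaptation time scale (slower than the phase dynamics)
DT, STEPS = 0.05, 1000

omega = [random.gauss(0.0, 0.5) for _ in range(N)]
theta = [random.uniform(0.0, 2.0 * math.pi) for _ in range(N)]
b = [[0.1] * N for _ in range(N)]     # coupling strengths b_ij, all-to-all

for _ in range(STEPS):
    dtheta = [omega[i]
              + sum(b[i][j] * math.sin(theta[j] - theta[i])
                    for j in range(N)) / N
              for i in range(N)]
    for i in range(N):
        for j in range(N):
            # relax b_ij toward ALPHA * cos(phase difference)
            target = ALPHA * math.cos(theta[j] - theta[i])
            b[i][j] += DT * (target - b[i][j]) / T_ADAPT
    theta = [(theta[i] + DT * dtheta[i]) % (2.0 * math.pi) for i in range(N)]

# global Kuramoto order parameter: R = 1 means full phase coherence
R = abs(sum(cmath.exp(1j * t) for t in theta)) / N
```

because each coupling is updated by a convex relaxation toward a target in [−α, α], the couplings remain bounded while they coevolve with the phases.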
to examine such situations, researchers have increasingly studied "higher-order" structures in networks, as they can exert a major influence on dynamical processes. perhaps the simplest way to account for higher-order structures in networks is to generalize from graphs to "hypergraphs" [121]. hypergraphs possess "hyperedges" that encode a connection among an arbitrary number of nodes, such as among all coauthors of a paper. this allows one to make important distinctions, such as between a k-clique (in which there are pairwise connections between each pair of nodes in a set of k nodes) and a hyperedge that connects all k of those nodes to each other, without the need for any pairwise connections. one way to study a hypergraph is as a "bipartite network", in which nodes of a given type can be adjacent only to nodes of another type. for example, a scientist can be adjacent to a paper that they have written [119], and a legislator can be adjacent to a committee on which they sit [144]. it is important to generalize ideas from graph theory to hypergraphs, such as by developing models of random hypergraphs [25, 26, 52]. another way to study higher-order structures in networks is to use "simplicial complexes" [53, 54, 127]. a simplicial complex is a space that is built from a union of points, edges, triangles, tetrahedra, and higher-dimensional polytopes (see fig. 1d). simplicial complexes approximate topological spaces and thereby capture some of their properties. a p-dimensional simplex (i.e., a p-simplex) is a p-dimensional polytope that is the convex hull of its p + 1 vertices (i.e., nodes). a simplicial complex s is a set of simplices such that (1) every face of a simplex in s is also in s and (2) the intersection of any two simplices σ_1, σ_2 ∈ s is a face of both σ_1 and σ_2. an increasing sequence k_1 ⊂ k_2 ⊂ · · · ⊂ k_l of simplicial complexes forms a filtered simplicial complex; each k_i is a subcomplex.
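in the abstract (combinatorial) setting, these definitions translate directly into code. the sketch below builds the closure of a set of maximal simplices and checks the face-closure property; the example complex (a filled triangle plus a dangling edge) is my own, and i work with abstract simplices (plain vertex sets), for which condition (2) holds automatically once condition (1) does.

```python
from itertools import combinations

def faces(simplex):
    """All nonempty faces of a simplex: an abstract p-simplex is just a set
    of p + 1 vertices, and its faces are its nonempty subsets."""
    s = tuple(sorted(simplex))
    return {frozenset(c) for k in range(1, len(s) + 1)
            for c in combinations(s, k)}

def closure(maximal_simplices):
    """Smallest abstract simplicial complex containing the given simplices."""
    complex_ = set()
    for sigma in maximal_simplices:
        complex_ |= faces(sigma)
    return complex_

def is_complex(simplices):
    """Property (1): every face of every simplex is itself in the set."""
    return all(faces(sigma) <= simplices for sigma in simplices)

# a filled triangle {a, b, c} plus a dangling edge {c, d}
K = closure([{"a", "b", "c"}, {"c", "d"}])
```

note the clique-versus-hyperedge distinction from above: the 2-simplex {a, b, c} is an element of K in its own right, over and above its three edges.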
as discussed in [127] and references therein, one can examine the homology of each subcomplex. in studying the homology of a topological space, one computes topological invariants that quantify features of different dimensions [53] . one studies "persistent homology" (ph) of a filtered simplicial complex to quantify the topological structure of a data set (e.g., a point cloud) across multiple scales of such data. the goal of such "topological data analysis" (tda) is to measure the "shape" of data in the form of connected components, "holes" of various dimensionality, and so on [127] . from the perspective of network analysis, this yields insight into types of large-scale structure that complement traditional ones (such as community structure). see [178] for a friendly, nontechnical introduction to tda. a natural goal is to generalize ideas from network analysis to simplicial complexes. important efforts include generalizing configuration models of random graphs [48] to random simplicial complexes [15, 32] ; generalizing well-known network growth mechanisms, such as preferential attachment [13] ; and developing geometric notions, like curvature, for networks [156] . an important modeling issue when studying higher-order network data is the question of when it is more appropriate (or convenient) to use the formalisms of hypergraphs or simplicial complexes. the computation of ph has yielded insights on a diverse set of models and applications in network science and complex systems. examples include granular materials [95, 129] , functional brain networks [54, 165] , quantification of "political islands" in voting data [42] , percolation theory [169] , contagion dynamics [174] , swarming and collective behavior [179] , chaotic flows in odes and pdes [197] , diurnal cycles in tropical cyclones [182] , and mathematics education [28] . see the introduction to [127] for pointers to numerous other applications. 
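as a concrete (if minimal) example of ph, the 0-dimensional case can be computed with nothing more than a union-find structure: in a vietoris-rips filtration of a point cloud, every connected component is born at scale 0 and dies when it merges into an older component. the code below is my own illustration, not a general ph implementation (which must also handle higher-dimensional holes; see [127]).

```python
import math
from itertools import combinations

def h0_persistence(points):
    """Barcodes of 0-dimensional persistent homology for a Vietoris-Rips
    filtration of a point cloud. Each bar is (birth, death); one component
    lives forever, so its bar has death = infinity."""
    n = len(points)
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    # sweep edges in order of increasing length (Kruskal-style)
    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(n), 2))
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(d)         # one component dies at scale d
    return [(0.0, d) for d in deaths] + [(0.0, math.inf)]

# two well-separated clusters of two points each
bars = h0_persistence([(0, 0), (0, 1), (10, 0), (10, 1)])
```

for the four-point example, the two short bars (death at scale 1) reflect the nearby pairs, the bar that persists until scale 10 reflects the two well-separated clusters, and the infinite bar reflects the single component that remains: long bars are exactly the "shape" features that tda reads off across scales.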
most uses of simplicial complexes in network science and complex systems have focused on tda (especially the computation of ph) and its applications [127, 131, 155]. in this chapter, however, i focus instead on a somewhat different (and increasingly popular) topic: the generalization of dynamical processes on and of networks to simplicial complexes to study the effects of higher-order interactions on network dynamics. simplicial structures influence the collective behavior of the dynamics of coupled entities on networks (e.g., they can lead to novel bifurcations and phase transitions), and they provide a natural approach to analyze p-entity interaction terms, including for p ≥ 3, in dynamical systems. existing work includes research on linear diffusion dynamics (in the form of hodge laplacians, such as in [162]) and generalizations of a variety of other popular types of dynamical processes on networks. given the ubiquitous study of coupled kuramoto oscillators [149], a sensible starting point for exploring the impact of simultaneous coupling of three or more oscillators on a system's qualitative dynamics is to study a generalized kuramoto model. for example, to include both two-entity ("two-body") and three-entity interactions in a model of coupled oscillators on networks, we write [172] ẋ_i = f_i(x_i) + Σ_{j,k} w_ijk(x_i, x_j, x_k), where f_i describes the intrinsic dynamics of oscillator i and the three-oscillator interaction term w_ijk includes two-oscillator interaction terms w_ij(x_i, x_j) as a special case. an example of n coupled kuramoto oscillators with three-term interactions appears in [172]; in it, one draws the coefficients a_ij, b_ij, c_ijk, α_1ij, α_2ij, α_3ijk, α_4ijk from various probability distributions. including three-body interactions leads to a large variety of intricate dynamics, and i anticipate that incorporating the formalism of simplicial complexes will be very helpful for categorizing the possible dynamics.
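as a minimal numerical illustration of three-body coupling (this is not the model of [172]: the all-to-all topology, the symmetric triplet term sin(θ_j + θ_k − 2θ_i), and all parameter values are my own assumptions), one can integrate:

```python
import cmath
import math
import random

random.seed(7)

N = 15
K2, K3 = 1.0, 2.0          # pairwise and triplet coupling strengths
DT, STEPS = 0.05, 300

omega = [random.gauss(0.0, 0.3) for _ in range(N)]
theta = [random.uniform(0.0, 2.0 * math.pi) for _ in range(N)]

for _ in range(STEPS):
    new_theta = []
    for i in range(N):
        # two-body Kuramoto term
        pair = sum(math.sin(theta[j] - theta[i]) for j in range(N)) / N
        # three-body term: every (j, k) pair acts on oscillator i jointly
        trip = sum(math.sin(theta[j] + theta[k] - 2.0 * theta[i])
                   for j in range(N) for k in range(N)) / N**2
        new_theta.append((theta[i] + DT * (omega[i] + K2 * pair + K3 * trip))
                         % (2.0 * math.pi))
    theta = new_theta

R = abs(sum(cmath.exp(1j * t) for t in theta)) / N   # order parameter
```

sweeping K3 (including K3 = 0 as the pairwise baseline) is a simple way to probe how the triplet term alone changes the onset of synchrony.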
in the last few years, several other researchers have also studied kuramoto models with three-body interactions [92, 93, 166] . a recent study [166] , for example, discovered a continuum of abrupt desynchronization transitions with no counterpart in abrupt synchronization transitions. there have been mathematical studies of coupled oscillators with interactions of three or more entities using methods such as normal-form theory [14] and coupled-cell networks [59] . an important point, as one can see in the above discussion (which does not employ the mathematical formalism of simplicial complexes), is that one does not necessarily need to explicitly use the language of simplicial complexes to study interactions between three or more entities in dynamical systems. nevertheless, i anticipate that explicitly incorporating the formalism of simplicial complexes will be useful both for studying coupled oscillators on networks and for other dynamical systems. in upcoming studies, it will be important to determine when this formalism helps illuminate the dynamics of multi-entity interactions in dynamical systems and when simpler approaches suffice. several recent papers have generalized models of social dynamics by incorporating higher-order interactions [75, 76, 118, 137] . for example, perhaps somebody's opinion is influenced by a group discussion of three or more people, so it is relevant to consider opinion updates that are based on higher-order interactions. some of these papers use some of the terminology of simplicial complexes, but it is mostly unclear (except perhaps for [75] ) how the models in them take advantage of the associated mathematical formalism, so arguably it often may be unnecessary to use such language. nevertheless, these models are very interesting and provide promising avenues for further research. petri and barrat [137] generalized activity-driven models to simplicial complexes. 
such a simplicial activity-driven (sad) model generates time-dependent simplicial complexes, on which it is desirable to study dynamical processes (see section 4), such as opinion dynamics, social contagions, and biological contagions. the simplest version of the sad model is defined as follows.
• each node i has an activity rate a_i that we draw independently from a distribution f(x).
• at each discrete time step (of length ∆t), we start with n isolated nodes. each node i is active with a probability of a_i ∆t, independently of all other nodes. if it is active, it creates a (p − 1)-simplex (forming, in network terms, a clique of p nodes) with p − 1 other nodes that we choose uniformly and independently at random (without replacement). one can either use a fixed value of p or draw p from some probability distribution.
• at the next time step, we delete all edges, so all interactions have a constant duration. we then generate new interactions from scratch.
this version of the sad model is markovian, and it is desirable to generalize it in various ways (e.g., by incorporating memory or community structure). iacopini et al. [76] recently developed a simplicial contagion model that generalizes an si process on graphs. consider a simplicial complex k with n nodes, and associate each node i with a state x_i(t) ∈ {0, 1} at time t. if x_i(t) = 0, node i is part of the susceptible class s; if x_i(t) = 1, it is part of the infected class i. the density of infected nodes at time t is ρ(t) = (1/n) Σ_{i=1}^n x_i(t). suppose that there are d parameters β_1, . . . , β_d (with d ∈ {1, . . . , n − 1}), where β_d represents the probability per unit time that a susceptible node i that participates in a d-dimensional simplex σ is infected from each of the faces of σ, under the condition that all of the other nodes of the face are infected. that is, β_1 is the probability per unit time that node i is infected by an adjacent node j via the edge (i, j).
similarly, β_2 is the probability per unit time that node i is infected via the 2-simplex (i, j, k) in which both j and k are infected, and so on. the recovery dynamics, in which an infected node i becomes susceptible again, proceed as in the sis model that i discussed in section 4.2. one can envision numerous interesting generalizations of this model (e.g., ones that are inspired by ideas that have been investigated in contagion models on graphs). the study of networks is one of the most exciting and rapidly expanding areas of mathematics, and it touches on myriad other disciplines in both its methodology and its applications. network analysis is increasingly prominent in numerous fields of scholarship (both theoretical and applied), it interacts very closely with data science, and it is important for a wealth of applications. my focus in this chapter has been a forward-looking presentation of ideas in network analysis. my choices of which ideas to discuss reflect their connections to dynamics and nonlinearity, although i have also mentioned a few other burgeoning areas of network analysis in passing. through its exciting combination of graph theory, dynamical systems, statistical mechanics, probability, linear algebra, scientific computation, data analysis, and many other subjects, and through a comparable diversity of applications across the sciences, engineering, and the humanities, the mathematics and science of networks has plenty to offer researchers for many years.
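a minimal simulation of this kind of simplicial contagion is sketched below. it is an illustration in the spirit of [76] rather than a reimplementation: the underlying complex (a ring plus random triangles), the parameter values β_1, β_2, μ, and the synchronous updates are my own assumptions, and i keep only edges and 2-simplices (d ≤ 2).

```python
import random
from itertools import combinations

random.seed(5)

B1, B2, MU = 0.08, 0.2, 0.05   # edge rate, triangle rate, recovery rate
N, STEPS = 60, 100

# a small simplicial complex: a ring of edges plus 40 random 2-simplices
# (we also add each triangle's edges, so the complex is closed under faces)
edges = {frozenset((i, (i + 1) % N)) for i in range(N)}
triangles = set()
while len(triangles) < 40:
    tri = frozenset(random.sample(range(N), 3))
    triangles.add(tri)
    edges.update(frozenset(p) for p in combinations(tri, 2))

x = [0] * N                       # x_i = 0 susceptible, x_i = 1 infected
for i in random.sample(range(N), 5):
    x[i] = 1

rho = []                          # density of infected nodes over time
for _ in range(STEPS):
    new_x = list(x)
    for e in edges:               # infection through 1-simplices (edges)
        u, v = tuple(e)
        for s, i in ((u, v), (v, u)):
            if x[s] == 0 and x[i] == 1 and random.random() < B1:
                new_x[s] = 1
    for tri in triangles:         # infection through 2-simplices: a node is
        for s in tri:             # exposed only if BOTH other nodes are infected
            if x[s] == 0 and all(x[o] == 1 for o in tri - {s}) \
                    and random.random() < B2:
                new_x[s] = 1
    for i in range(N):            # recovery: infected -> susceptible again
        if x[i] == 1 and random.random() < MU:
            new_x[i] = 0
    x = new_x
    rho.append(sum(x) / N)
```

setting B2 = 0 recovers an ordinary pairwise process on the underlying graph, so comparing the ρ(t) curves with and without the triangle term isolates the effect of the higher-order interactions.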
congestion induced by the structure of multiplex networks tie-decay temporal networks in continuous time and eigenvector-based centralities multilayer networks in a nutshell multilayer networks in a nutshell temporal and structural heterogeneities emerging in adaptive temporal networks synchronization in complex networks mathematical frameworks for oscillatory network dynamics in neuroscience turing patterns in multiplex networks morphogenesis of spatial networks evolving voter model on dense random graphs generative benchmark models for mesoscale structure in multilayer networks birth and stabilization of phase clusters by multiplexing of adaptive networks network geometry with flavor: from complexity to quantum geometry chaos in generically coupled phase oscillator networks with nonpairwise interactions topology of random geometric complexes: a survey explosive transitions in complex networks' structure and dynamics: percolation and synchronization factoring and weighting approaches to clique identification mathematical models in population biology and epidemiology how does active participation effect consensus: adaptive network model of opinion dynamics and influence maximizing rewiring anatomy of a large-scale hypertextual web search engine a model for the influence of media on the ideology of content in online social networks frequency-based brain networks: from a multiplex network to a full multilayer description statistical physics of social dynamics bootstrap percolation on a bethe lattice configuration models of random hypergraphs annotated hypergraphs: models and applications hebbian learning architecture and evolution of semantic networks in mathematics texts a model for spatial conflict reaction-diffusion processes and metapopulation models in heterogeneous networks multiple-scale theory of topology-driven patterns on directed networks generalized network structures: the configuration model and the canonical ensemble of simplicial complexes structure
and dynamics of core/periphery networks the physics of spreading processes in multilayer networks mathematical formulation of multilayer networks navigability of interconnected networks under random failures identifying modular flows on multilayer networks reveals highly overlapping organization in interconnected systems explosive phenomena in complex networks graph fission in an evolving voter model a practical guide to stochastic simulations of reaction-diffusion processes persistent homology of geospatial data: a case study with voting limitations of discrete-time approaches to continuous-time contagion dynamics is the voter model a model for voters? the use of multilayer network analysis in animal behaviour on eigenvector-like centralities for temporal networks: discrete vs. continuous time scales community detection in networks: a user guide configuring random graph models with fixed degree sequences nine challenges in incorporating the dynamics of behaviour in infectious diseases models modelling the influence of human behaviour on the spread of infectious diseases: a review anatomy and efficiency of urban multimodal mobility random hypergraphs and their applications elementary applied topology two's company, three (or more) is a simplex binary-state dynamics on complex networks: pair approximation and beyond quantum graphs: applications to quantum chaos and universal spectral statistics the structural virality of online diffusion patterns of synchrony in coupled cell networks with multiple arrows finite-size effects in a stochastic kuramoto model dynamical interplay between awareness and epidemic spreading in multiplex networks threshold models of collective behavior on the critical behavior of the general epidemic process and dynamical percolation a matrix iteration for dynamic network summaries a dynamical systems view of network centrality adaptive coevolutionary networks: a review epidemic dynamics on an adaptive network pathogen mutation modeled by 
competition between site and bond percolation ergodic theorems for weakly interacting infinite systems and the voter model modern temporal network theory: a colloquium nonequilibrium phase transition in the coevolution of networks and opinions temporal networks temporal networks temporal network theory an adaptive voter model on simplicial complexes simplical models of social contagion turing instability in reaction-diffusion models on complex networks games on networks the large graph limit of a stochastic epidemic model on a dynamic multilayer network a local perspective on community structure in multilayer networks structure of growing social networks wave propagation in fractal trees synergistic effects in threshold models on networks hipsters on networks: how a minority group of individuals can lead to an antiestablishment majority drift of spectrally stable shifted states on star graphs maximizing the spread of influence through a social network second look at the spread of epidemics on networks centrality prediction in dynamic human contact networks mathematics of epidemics on networks multilayer networks authoritative sources in a hyperlinked environment dynamics of multifrequency oscillator communities finite-size-induced transitions to synchrony in oscillator ensembles with nonlinear global coupling pattern formation in multiplex networks quantifying force networks in particulate systems quantum graphs: i. 
some basic structures fitting in and breaking up: a nonlinear version of coevolving voter models from networks to optimal higher-order models of complex systems hawkes processes complex spreading phenomena in social systems: influence and contagion in real-world social networks wave mitigation in ordered networks of granular chains centrality metric for dynamic networks control principles of complex networks resynchronization of circadian oscillators and the east-west asymmetry of jet-lag transitivity reinforcement in the coevolving voter model ground state on the dumbbell graph random walks and diffusion on networks the nonlinear heat equation on dense graphs and graph limits multi-stage complex contagions opinion formation and distribution in a bounded-confidence model on various networks network motifs: simple building blocks of complex networks portrait of political polarization six susceptible-infected-susceptible models on scale-free networks a network-based dynamical ranking system for competitive sports community structure in time-dependent, multiscale, and multiplex networks multi-body interactions and non-linear consensus dynamics on networked systems scientific collaboration networks. i. network construction and fundamental results network structure from rich but noisy data collective phenomena emerging from the interactions between dynamical processes in multiplex networks nonlinear schrödinger equation on graphs: recent results and open problems complex contagions with timers a theory of the critical mass. i. 
interdependence, group heterogeneity, and the production of collective action interaction mechanisms quantified from dynamical features of frog choruses a roadmap for the computation of persistent homology chimera states: coexistence of coherence and incoherence in networks of coupled oscillators network analysis of particles and grains epidemic processes in complex networks topological analysis of data master stability functions for synchronized coupled systems cluster synchronization and isolated desynchronization in complex networks with symmetries bayesian stochastic blockmodeling modelling sequences and temporal networks with dynamic community structures activity driven modeling of time varying networks simplicial activity driven model the multilayer nature of ecological networks network analysis and modelling: special issue of dynamical systems on networks: a tutorial the role of network analysis in industrial and applied mathematics a network analysis of committees in the u.s. house of representatives communities in networks spectral centrality measures in temporal networks reality inspired voter models: a mini-review the kuramoto model in complex networks core-periphery structure in networks (revisited) dynamic pagerank using evolving teleportation memory in network flows and its effects on spreading dynamics and community detection recent advances in percolation theory and its applications dynamics of dirac solitons in networks simplicial complexes and complex systems comparative analysis of two discretizations of ricci curvature for complex networks dynamics of interacting diseases null models for community detection in spatially embedded, temporal networks modeling complex systems with adaptive networks social diffusion and global drift on networks the effect of a prudent adaptive behaviour on disease transmission random walks on simplicial complexes and the normalized hodge 1-laplacian multiopinion coevolving voter model with infinitely many phase 
transitions the architecture of complexity the importance of the whole: topological data analysis for the network neuroscientist abrupt desynchronization and extensive multistability in globally coupled oscillator simplexes complex macroscopic behavior in systems of phase oscillators with adaptive coupling sine-gordon solitons in networks: scattering and transmission at vertices topological data analysis of continuum percolation with disks from kuramoto to crawford: exploring the onset of synchronization in populations of coupled oscillators motor primitives in space and time via targeted gain modulation in recurrent cortical networks multistable attractors in a network of phase oscillators with three-body interactions analysing information flows and key mediators through temporal centrality metrics topological data analysis of contagion maps for examining spreading processes on networks eigenvector-based centrality measures for temporal networks supracentrality analysis of temporal networks with directed interlayer coupling tunable eigenvector-based centralities for multiplex and temporal networks topological data analysis: one applied mathematician's heartwarming story of struggle, triumph, and ultimately, more struggle topological data analysis of biological aggregation models partitioning signed networks on analytical approaches to epidemics on networks using persistent homology to quantify a diurnal cycle in hurricane felix resolution limits for detecting community changes in multilayer networks analytical computation of the epidemic threshold on temporal networks epidemic threshold in continuous-time evolving networks network models of the diffusion of innovations graph spectra for complex networks non-markovian infection spread dramatically alters the susceptible-infected-susceptible epidemic threshold in networks temporal gillespie algorithm: fast simulation of contagion processes on time-varying networks forecasting elections using compartmental models of
infection ranking scientific publications using a model of network traffic coupled disease-behavior dynamics on complex networks: a review social network analysis: methods and applications a simple model of global cascades on random networks braess's paradox in oscillator networks, desynchronization and power outage inferring symbolic dynamics of chaotic flows from persistence continuous-time discrete-distribution theory for activity-driven networks an analytical framework for the study of epidemic models on activity driven networks modeling memory effects in activity-driven networks models of continuous-time networks with tie decay, diffusion, and convection key: cord-238342-ecuex64m authors: fong, simon james; li, gloria; dey, nilanjan; crespo, ruben gonzalez; herrera-viedma, enrique title: composite monte carlo decision making under high uncertainty of novel coronavirus epidemic using hybridized deep learning and fuzzy rule induction date: 2020-03-22 journal: nan doi: nan sha: doc_id: 238342 cord_uid: ecuex64m since the advent of the novel coronavirus epidemic in december 2019, governments and authorities have been struggling at their best efforts to make critical decisions under high uncertainty. composite monte-carlo (cmc) simulation is a forecasting method which extrapolates available data, broken down from multiple correlated/causal micro-data sources, into many possible future outcomes by drawing random samples from some probability distributions. for instance, the overall trend and propagation of the infected cases in china are influenced by the temporal-spatial data of the nearby cities around wuhan (where the virus originated), in terms of the population density, travel mobility, medical resources such as hospital beds, the timeliness of quarantine control in each city, etc.
hence a cmc is reliable only up to the closeness of its underlying statistical distributions, which are supposed to represent the behaviour of future events, and the correctness of the composite data relationships. in this paper, a case study of using a cmc that is enhanced by a deep learning network and fuzzy rule induction for gaining better stochastic insights about the epidemic development is presented. instead of applying the simplistic and uniform assumptions of a standard mc, a deep-learning-based cmc is used in conjunction with fuzzy rule induction techniques. as a result, decision makers benefit from better-fitted mc outputs complemented by min-max rules that foretell the extreme ranges of future possibilities with respect to the epidemic. on top of its devastating health effects, an epidemic impacts hugely on the world economy. in the ebola outbreak between 2014 and 2016, in which more than 28,000 cases were suspected and over 10,000 deaths occurred in west africa [1], $2.2 billion was lost [2]. on the other hand, sars took 648 lives in china (including hong kong) and over 700 lives worldwide between 2002 and 2003 [3]; its losses to the global economy amounted to roughly $100 billion, with dips of 1% and 0.5% in the chinese and asian domestic gdps, respectively [3]. although the current coronavirus (codename: ncp or covid-19) epidemic is not over yet, its economic impact is anticipated by economists from ihs markit to be far worse than that of the sars outbreak in 2003 [4]. the impact is so profound that it will lead to factory shutdowns, enterprise bankruptcies (especially in the tourism, retail and f&b industries), and suspensions or withdrawals of long-term investment, if the outbreak cannot be contained in time. since the first case in december 2019, the numbers of cases and deaths around the world have skyrocketed to over 76,395 confirmed cases and 2,348 deaths, mostly in china, by the time of writing this article.
an early intervention measure in public health to thwart the outbreak of covid-19 is absolutely imperative. according to a recent mathematical model reported in a research article in the lancet [5], the growth of the epidemic will ease if the transmission rate of the new contagious disease can be lowered by 0.25. knowing that ending the virus epidemic early, or even just reducing the human-to-human transmission rate, depends on such efforts, all governments, especially that of china, where wuhan is the epicenter, are taking up all the necessary preventive measures and all national efforts to halt the spread. how much input is really necessary? many decision makers take reference from sars, which is by far the virus most similar to covid-19. however, this is difficult because the characteristics of the virus are not fully known; its details, and how it spreads, are unfolding gradually from day to day. given the limited information on hand about the new virus, and the ever-evolving epidemic situation both geographically and temporally, it boils down to a grand data-analytics challenge to answer this question: how many resources are enough to slow down the transmission? this is a composite problem that requires cooperation among multi-prong measures such as medical provision, suspension of schools, factories and offices, minimization of human gatherings, limits on travel, strict city surveillance, and enforced quarantines and isolation on large scales. there is no single easy equation that could tell the amount of resources needed in terms of monetary value, manpower and other intangible usage of infrastructure; at the same time, there exist too many uncertain variables, from both societal factors and the new developments of the virus itself. for example, the effective incubation period of the new virus was found to be longer than a week only some time after the outbreak. time is of the essence in stopping the epidemic, so as to reduce its damage as soon as possible.
however, uncertainty is the largest obstacle to obtaining an accurate model for forecasting the future behaviour of the epidemic should an intervention apply. in general, data scientists have a choice between deterministic and stochastic modelling; the former is based solely on past events which are already known for sure. e.g., if we know the height and weight of a person, we know his body mass index (bmi). should either of the two variables be updated, the bmi changes to a new value, which again remains the same no matter how many times the calculation is repeated. the latter is called a probabilistic or stochastic model: instead of generating a single, absolute result, a stochastic model outputs a collection of possible outcomes which may happen under some probabilities and conditions. a deterministic model is useful when the conditions of the experiment are assumed rigid. it is useful for obtaining a direct forecasting result from a relatively simple and stable situation in which the variables are unlikely to deviate in the future. otherwise, in a non-deterministic model, which is sometimes referred to as probabilistic or stochastic, the conditions of the future situation under which the experiment will be observed are simulated, yielding probabilistic behaviours of the future observable outcome. as an epidemic example, we may want to determine how many lives could be saved from people who are infected by a new virus as a composite result of the multi-prong efforts that are put into medical resources, logistics, infrastructure, spread prevention, and others; at the same time, other contributing factors also matter, such as the percentage of high-risk patients residing in that particular city, the population and its mobility, as well as the severity of the virus itself and the efficacy of its vaccine.
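the contrast can be shown in a few lines. the bmi function below is deterministic (same inputs, same output, every time); the toy epidemic path is stochastic (its growth factor is drawn at random each day, so repeated simulation yields a distribution of outcomes rather than a single number). all numbers here are my own illustrative assumptions.

```python
import random
import statistics

def bmi(weight_kg, height_m):
    """Deterministic: identical inputs always give identical output."""
    return weight_kg / height_m ** 2

def simulate_path(days=30, initial_cases=100):
    """Stochastic: one random path of a toy epidemic whose daily growth
    factor is uncertain and drawn anew each day."""
    cases = float(initial_cases)
    for _ in range(days):
        cases *= random.uniform(0.9, 1.2)   # uncertain daily growth factor
    return cases

random.seed(11)
outcomes = [simulate_path() for _ in range(2000)]  # a distribution of results
spread = (min(outcomes), statistics.median(outcomes), max(outcomes))
```

calling `bmi(70, 1.75)` twice gives the same number both times, whereas two calls to `simulate_path()` almost never agree; only the collection of many paths is meaningful.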
real-time tools like cdc data reporting and national big data centers are available with which any latest case can be recorded. however, behind all these records are sequences of factors associated with high uncertainty. for example, the disease transmission rate depends on uncertain variables ranging from the macro scale of the weather and the economy of the city in a particular season, down to an individual's personal hygiene and the social interaction of commuters as a whole. these variables are dynamic in nature, changing quickly from time to time, person to person, culture to culture and place to place. such phenomena can hardly converge to a deterministic model; rather, a probabilistic model can capture their behaviour more accurately. so for epidemic forecasting, a deterministic model such as trending uses the physical considerations to predict an almost exact outcome, whereas a non-deterministic model uses those considerations to predict a probable outcome in the form of a probability distribution. in order to capture and model such dynamic epidemic-recovery behaviour, stochastic methods ingest a collection of input variables that have complex dependencies on multiple risk factors. the epidemic recovery can be viewed in the abstract as a bipolar force between the number of people who have contracted the disease and the number of patients who are cured of it. the newly infected and the eventually cured (or unfortunately deceased) individuals each depend on complex societal and physiological factors as well as preventive measures and contagion control, and each of these factors has its own underlying and dependent factors carrying uncertain levels of risk. a popular probabilistic approach for modeling such complex conditions is monte carlo (mc) simulation, which provides a means of estimating the outcome of complex functions by simulating multiple random paths of the underlying risk factors.
rather than deterministic analytic computation, mc uses random number generation to produce random samples of input trials that explore the behaviour of a complex epidemic situation for decision support. mc is particularly suitable for modeling epidemics, especially a new and unknown disease like covid-19, because the data collected in the early stage are bound to change. in mc, data distributions are entered as input, since precise values are either unknown or uncertain. the output of mc is also a distribution, specifying a range of possible values (or outcomes) each of which has its own probability of occurring. compared to the deterministic approach, where precise numbers are loaded as input and a precise number is computed as output, mc simulates a broad spectrum of possible outcomes for subsequent expert evaluation in a decision-making process. recently, as epidemics draw global concern and impose huge costs on public health and the world economy, the use of mc in epidemic forecasting has become popular; it offers decision makers an extra dimension of probability information, so-called risk factors, for analyzing the possible outcomes and their associated risks as a whole. there has long been research interest in using mc to model epidemic behaviour quantitatively. in 1957, bailey et al were among the pioneers in formulating the mathematical theory of epidemics. subsequently, around the millennium, andersson and britton [7] adopted mc simulation techniques to study the behaviour of stochastic epidemic models, observing their statistical characteristics. in 2003, house et al attempted to estimate how big the final size of an epidemic is likely to be, by using mc to simulate the course of a stochastic epidemic; the probability mass function of the final number of infections was calculated by drawing random samples over small homogeneous and large heterogeneous populations.
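the distributions-in, distribution-out workflow described above can be sketched as follows; the distribution parameters and the toy spread formula are assumptions for illustration only, not calibrated epidemic figures:

```python
import random
import statistics

def mc_outbreak_trials(trials=20000):
    # draw uncertain inputs from distributions and propagate each
    # random path to one simulated outcome; the collection of
    # outcomes forms the output distribution
    outcomes = []
    for _ in range(trials):
        r0 = random.uniform(1.5, 3.5)               # uncertain reproduction number
        cure_rate = min(max(random.gauss(0.9, 0.03), 0.0), 1.0)
        cases = 100 * r0 ** 3                       # toy three-generation spread
        outcomes.append(cases * (1.0 - cure_rate))  # cases still uncured
    return outcomes

out = mc_outbreak_trials()
summary = (statistics.mean(out), statistics.quantiles(out, n=10))
```

rather than one number, the simulation hands the decision maker a mean and deciles of the outcome, i.e. the risk profile of the situation.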
yashima and sasaki in 2013 extended the mc epidemic model from a whole population to a particular commuter-network model, studying the spread of an infectious disease within a metropolitan area (the tokyo train network). mc is used to simulate the spread of infectious disease by considering the commuter flow dynamics, the population sizes and other factors, the resulting size of the epidemic and the timing of the epidemic peak. it is claimed that the mc model can serve as a pre-warning system, forecasting the incoming spread of infection prior to its actual arrival. narrowing from an mc model that captures the temporal-spatial dynamics of epidemic spread, a more specific mc model was constructed by fitzgerald et al [10] in 2017 for simulating the queuing behaviour of an emergency department. the model incorporates queuing theory and buffer occupancy, which mimic the demand and the nursing resources of the emergency department respectively. it was found that adding a separate fast track helps relieve the burden of patient handling and cuts down the overall median wait times when an emergency virus outbreak occurs and operating hours are at their peak. mielczarek and zabawa [11] adopted a similar mc model to investigate how erratic changes in the population, and hence in the number of infected patients, affect the fluctuations in emergency medical services (ems), assuming epidemiological changes such as calls for service, urgent admissions to hospital and icu usage. based on empirical data obtained from the ems center of the lower silesia region in poland, the ems events and changes in demographic information are simulated as random variables. due to the randomness of the changes (in population size as people migrate out, and as infected cases increase) in both demand and supply of an ems, such a loosely structured model cannot easily be examined by deterministic analytic means.
the mc model, however, allows decision makers to predict how the changes impact the effectiveness of the polish ems system by studying the probabilities of the possible outcomes. there are similar works which tap the stochastic nature of the mc model for finding the most effective escape route during emergency evacuation [12] and for modelling emergency responses [13]. overall, the above-mentioned related works have several features in common. their studies center on using a probabilistic approach to model complex real-life phenomena, where a deterministic model may fall short of finding precisely the right parameters to cater for every detail. the mc model is parsimonious, meaning it can achieve a satisfactory level of explanation or insight while requiring as few predictor variables as possible; the model which uses the minimum predictor variables and offers good explanation is selected by some goodness-of-fit measure such as the bic model selection criterion. the input or predictor variables are often dynamic in nature, their values changing over some spatial-temporal distribution. finally, the situation in question, which is simulated by the mc model, is not only complex but a priori in nature: just like the new covid-19 pandemic, nobody can tell when or whether it will end in the near future, as it depends on too many dynamic variables. while the challenge of establishing an effective mc model for a completely new epidemic behaviour is acknowledged, the model reported in [13] inspires us to design our mc model by decomposing it into several sub-problems. we therefore propose a new mc model, called composite mc (cmc), which accepts predictor variables from multi-prong data sources that have correlations or other dependencies with one another. the challenge here is to ensure that the input variables, though they may come from random distributions, have underlying inference patterns that contribute to the final outcome in question.
considering multi-prong data sources widens the spectrum of possibly related input data, thereby enhancing the performance of the monte carlo simulation. however, naive mc by default has no function for deciding on the importance of input variables: what matters to the underlying inference engine of mc is the historical data distribution, which tells none or little about the input variables prior to running the simulation. to this end, we propose a pre-processor, in the form of an optimized neural network, namely the bfgs polynomial neural network (bfgs-pnn), placed at the front of the mc simulator. bfgs-pnn serves both as a filter for selecting important variables and as a forecaster which generates future time-series as part of the input variables to the mc model. traditionally, all the input variables to mc are distributions drawn from past data, usually random, uniform or some customized distribution of sophisticated shape. in our proposed model, a hybrid input is used, composed of both deterministic and non-deterministic variables. the deterministic variables come from the forecasted time-series output by the bfgs-pnn; the non-deterministic variables are the usual random samples drawn from data distributions. in the case of covid-19, the future forecasts of the time-series are the predictions of the number of confirmed infection cases and the number of cured cases. observed from the historical records, however, these two variables display very erratic trends, and one of them contains extreme outliers. they are difficult to model closely with any probability density function; intuitively, imposing any standard data distribution will not help deliver accurate outcomes from the mc model.
our proposal therefore uses a polynomial style of self-evolving neural network, found to be one of the best-suited machine learning algorithms in our prior work [14], to render the most likely future curve unique to each particular data variable. the composition of the multiple data sources covers those relevant to the development (rise and decline) of the covid-19 epidemic. specifically, the case modelled by mc is how much daily monetary budget is required to fight the infection spread. the data sources for these factors are publicly available from chinese government websites; more details follow in section 2 below. the rationale behind using a composite model is that what appears to be a single important figure, e.g. the number of suspected cases, is directly and indirectly related to a number of sub-problems, each of which carries a different level of uncertainty. whether a person gets infected depends on 1) the intensity of travel (within a community, suburb, inter-city, or overseas), 2) preventive measures, 3) trail-tracking of the suspected and quarantining them, 4) the medical resources (isolation beds) available, and 5) eventual recovery or death. some of these data sources exert opposing influences on one another; for example, when tracking and quarantine measures are tightened up, the number of infected drops, and vice-versa. in theory, the more relevant data are available, the better the performance and the more accurate the outcomes the mc can provide. mc plays an important role here because the simulation is founded on a probabilistic basis, and the situation and its factors are nothing but uncertainty. given that the available data are scarce while the epidemic is new, any deterministic model is prone to high error under such high uncertainty about the future. the contribution of this work is twofold.
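the hybrid input scheme can be sketched as below, where the forecast list is a hypothetical stand-in for bfgs-pnn output and both distributions are illustrative assumptions:

```python
import random

# deterministic input: a point forecast per day, standing in for
# the time-series produced by the bfgs-pnn forecaster
forecast_confirmed = [420.0, 390.0, 360.0, 330.0, 300.0]

def one_trial():
    # non-deterministic inputs: re-sampled on every trial
    travel_intensity = random.uniform(0.2, 1.0)
    quarantine_effect = min(max(random.gauss(0.6, 0.1), 0.0), 1.0)
    # both kinds of input combine into one simulated outcome
    return sum(c * travel_intensity * (1.0 - quarantine_effect)
               for c in forecast_confirmed)

trial_outcomes = [one_trial() for _ in range(10000)]
```

the forecast curve anchors each trial to the expected epidemic trajectory, while the sampled factors spread the trials into a distribution of outcomes.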
firstly, a composite mc model, called cmcm, is proposed which advocates using non-deterministic data distributions along with future predictions from a deterministic model. the deterministic model in use should be selected from a collection of machine learning models as the one capable of minimizing the prediction error, with its model parameters appropriately optimized. the advantage of fitting both into the mc model is the flexibility of embracing input variables that are solely comprised of historical data, e.g. trends of people infected, while the underlying elements which contribute to high uncertainty, e.g. the chances of people gathering, are best represented as probabilistic distributions feeding non-deterministic variables to the mc model. by this approach a better-quality mc model can be established, and the outcomes from the mc model become more trustworthy. secondly, the sensitivity chart obtained from the mc simulation is used as corrective feedback to the rules generated from a fuzzy rule induction (fri) model. fri outputs decision rules with a probability or certainty attached to each individual rule; a rule consists of a series of testing nodes without any priority weights. by referencing the feedback from the sensitivity chart, decision makers can relate the priority of the variables to the tests along each sequence of decision rules. combining these twofold advantages, even under conditions of high uncertainty, decision makers benefit from a better-quality mc model which embraces composite input variables, and from fuzzy decision rules with tests ranked by priority. this tool offers comprehensive decision support at its best effort under high uncertainty. the remainder of the paper is structured as follows.
section 2 describes the proposed methodology, called grooms+cmcm, followed by an introduction of two key soft computing algorithms, bfgs-pnn and fri, which are adopted for forecasting particular future trends as inputs to the mc model and for generating fuzzy decision rules respectively. section 3 presents some preliminary results from the proposed model, followed by discussion. section 4 concludes this paper. mc has been applied by researchers over the years for estimating epidemic characteristics, because the nature of an epidemic and its influences are full of uncertainty. an application that is relatively less studied but important is the direct cost of fighting the virus. the direct cost is critical to keeping the virus at bay while it is still early, before it becomes a pandemic; but it is often hard to estimate during the early days because of the many unknown factors. jiang et al [15] have modelled the shape of a typical epidemic, concluding that the curve is almost exponential: it took less than a week from the first case to the peak. if appropriate and urgent preventive measures were applied early enough to stop it in time, the virus would probably not escalate into an epidemic and then a pandemic. ironically, during the first few days (some call them the golden critical hours), most of the time within this critical window is spent on observation, study, even debate over funding allocation and the strategies to apply. with an effective simulation tool such as the one proposed here, decision makers can be better informed of the costs involved and the corresponding uncertainty and risks. the methodology therefore has to be designed with limited data availability in mind, and each functional component of the methodology, in the form of a soft computing model, should be made as accurate as possible.
being able to work with limited data, flexible in simulating input variables (hybrid deterministic and non-deterministic), and producing informative outcomes coupled with fuzzy rules and risks, the methodology should be useful for experts making sound decisions at the critical time. our novel methodology is based on the group of optimized and multisource selection (grooms) methodology [14], which is designed for choosing the machine learning method with the highest level of accuracy. grooms, as a standalone optimizing process, helps ensure that the deterministic model used to supply input variables for the subsequent mc simulation has the most accurate data source input. by default, the mc model in its naive form accepts only input variables from a limited range of standard data distributions (uniform, normal, bernoulli, pareto, etc.); a best-fitting-curve technique is applied should the historical data shape fall outside the common distribution types. this limitation is lifted in our composite mc model, so-called cmcm, in such a way that all relevant data sources are embraced, both direct and indirect. an enhanced version of a neural network is used first to capture the non-linearity (often with high irregularity and a lack of apparent trend and seasonality) of the historical data. out of the full spectrum of direct and indirect data sources, those selected through feature selection by correlation and filtered by the neural network have their data distributions taken as input variables to the mc model. the combined methodology, grooms+cmcm, is shown in figure 1. according to the methodology, a machine learning candidate called bfgs-pnn, which is basically the pnn selected as the winning algorithm in [14] enhanced with a further parameter optimization function, is adopted. the given time-series data fluctuate more than those collected earlier. as a data pre-processor, bfgs-pnn has two functions.
firstly, for the non-deterministic data, salient features can be found by feature selection using a classifier-based filter function in a wrapper approach. the selected salient features are those most relevant to the forecast target in the mc; in this case the composite mc model (cmcm) is the simulation engine that takes in multiple data sources of both deterministic and non-deterministic types. the second function is to forecast a future time-series as a deterministic input variable for the cmcm. the formulation of bfgs-pnn is as follows. the naive version of pnn is identical to the one reported in [14]. bfgs-pnn uses the bfgs (broyden-fletcher-goldfarb-shanno) algorithm to optimize the parameters and the network structure size in an iterative manner using a hill-climbing technique. bfgs is a method for solving non-linear optimization problems iteratively, finding a stationary equilibrium through the quasi-newton method [16] and the secant method [17]. let the pnn [18] take the form of the kolmogorov-gabor polynomial as a functional series, eqn. (1): y = a_0 + Σ_i a_i x_i + Σ_i Σ_j a_ij x_i x_j + Σ_i Σ_j Σ_k a_ijk x_i x_j x_k + … . the polynomial is capable of taking the form of any function, generalized as y = f(x). the induction process is a matter of finding all the values of the coefficient vector a. as the process iterates, the variables of x arrive in sequence, fitting into the polynomial via regression and minimizing the error. the complexity grows incrementally by trying to add one neuron at a time while the forecasting error is monitored [19]. when the number of neurons reaches a pre-set level, the number of hidden layers increases. this continues until no further performance gain is observed; then the growth of the polynomial stops and it is taken as the final equation of the pnn. note that for the naive pnn the increment of the network growth is linear, whereas for bfgs-pnn the expansion of the polynomial is non-linear and heuristic.
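the grow-while-error-improves idea can be illustrated with a one-dimensional polynomial fitted by least squares; this is a sketch of the growth criterion only, not the multivariate kolmogorov-gabor network of [18]:

```python
import numpy as np

def grow_polynomial(x, y, max_terms=6, tol=1e-4):
    # incrementally add polynomial terms (1, x, x^2, ...) and keep
    # growing only while the monitored fitting error still improves
    best_err = float("inf")
    coeffs = None
    for n_terms in range(1, max_terms + 1):
        basis = np.vander(x, n_terms, increasing=True)  # columns 1, x, x^2, ...
        c, *_ = np.linalg.lstsq(basis, y, rcond=None)
        err = np.sqrt(np.mean((basis @ c - y) ** 2))
        if best_err - err < tol:        # no useful gain: stop growing
            break
        best_err, coeffs = err, c
    return coeffs, best_err

x = np.linspace(0.0, 1.0, 50)
y = 2.0 + 3.0 * x - x ** 2              # quadratic test signal
coeffs, err = grow_polynomial(x, y)
```

each added term plays the role of one new neuron; growth stops as soon as the error no longer improves by a useful margin, mirroring the stopping rule described above.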
the optimal state is reached by an unconstrained hill-climbing method guided by the quasi-newton and secant methods. let the error function be e(p), where p is a vector of real numbers holding the network structure information and parameters, i.e. neurons and layers in an ordered set. at the start, t = 0, p_0 is initialized with randomly chosen states. let the search direction be s_i at iteration i, and let h_i be the hessian, a square matrix of 2nd-order partial derivatives of e; as the process iterates, h_i becomes a better and better estimate. ∇e(p_i) is the gradient of the error function to be minimized at iteration i. following the quasi-newton search pattern, the search direction solves h_i s_i = −∇e(p_i) (2), and the next state of the parameter values is p_{i+1} = p_i + α s_i, where the scalar step size α > 0 is chosen by line search over e(p_i + α s_i). the updated hessian approximation must obey the quasi-newton (secant) condition h_{i+1} (p_{i+1} − p_i) = ∇e(p_{i+1}) − ∇e(p_i) (3). writing δ_i = p_{i+1} − p_i and y_i = ∇e(p_{i+1}) − ∇e(p_i), the secant condition becomes h_{i+1} δ_i = y_i (4); imposing this condition together with symmetry yields the two rank-one correction terms (6) that define the updating function for the hessian matrix (5): h_{i+1} = h_i + (y_i y_iᵀ)/(y_iᵀ δ_i) − (h_i δ_i δ_iᵀ h_i)/(δ_iᵀ h_i δ_i). applying the sherman-morrison formula [20] to this update gives the inverse of the hessian matrix directly (7); expanding it (8), we obtain h_{i+1}⁻¹ = (i − ρ_i δ_i y_iᵀ) h_i⁻¹ (i − ρ_i y_i δ_iᵀ) + ρ_i δ_i δ_iᵀ, with ρ_i = 1/(y_iᵀ δ_i), an equation that can be computed quickly without any buffer space, for fast optimization aimed at minimizing e(·). by our grooms+cmcm methodology, raw data from multiple sources are thus filtered, condensed and converted into insights about future behaviour in several forms.
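the sherman-morrison form of the inverse-hessian update can be written compactly as below; the quadratic error function used for the check is a hypothetical test case, not the pnn error surface:

```python
import numpy as np

def bfgs_inverse_update(h_inv, delta, y):
    # one bfgs update of the inverse hessian approximation, with
    # delta = p_{i+1} - p_i and y = grad e(p_{i+1}) - grad e(p_i);
    # the sherman-morrison form needs no matrix inversion
    rho = 1.0 / (y @ delta)
    eye = np.eye(len(delta))
    return ((eye - rho * np.outer(delta, y)) @ h_inv
            @ (eye - rho * np.outer(y, delta))
            + rho * np.outer(delta, delta))

# for a quadratic error e(p) = 0.5 p^T a p, the gradient change is a @ delta
a = np.array([[3.0, 0.5], [0.5, 2.0]])
delta = np.array([0.4, -0.2])
y = a @ delta
h_inv_new = bfgs_inverse_update(np.eye(2), delta, y)
# by construction, the updated matrix satisfies the secant condition
```

the update can be verified against the secant condition: applying the new inverse hessian to the gradient change y recovers the parameter step delta exactly.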
traditionally in mc simulation, probability density functions are generated as the simulated outcomes, together with a sensitivity chart which ranks each factor's relevance to the predicted outcome. fuzzy rule induction (fri) plays a role in the methodology by inferring a rule-based model which supplies a series of conditional tests leading to some consequences, based on the same data that were loaded into the mc engine. fri serves the threefold purpose of being easy to use, neutral and scalable. firstly, the decision rules are interpretable by users: they complement the probability density functions, which show only a macro view of the situation. fri gives another perspective on causality, assisting decision makers in investigating the logic of cause and effect. furthermore, unlike other decision rule models, fri allows some fuzzy relaxation in bracketing the upper and lower bounds, so a decision can be made based on the min-max values pertaining to each conditional test (attribute in the data). the fri rules are formatted as branching logic, also known as predicate logic, which preserves the crudest form of knowledge representation: a rule has an if-test[min-max]-then-verdict basic structure and a propensity indicating how often it occurs in the dataset. the number of different groups of fri rules depends on how many different labels exist in the predicted class. the second advantage is that the fri rules are objective and free from human bias, as they are derived homogeneously from the data; they are therefore suitable ingredients for scientifically devising policy and strategy for epidemic control. thirdly, fri rules can scale up or down not only in quantity but also in cardinality: a rule can consist of as many tests as there are attributes available in the data.
in other words, as a composite mc system, a new source of data can be chipped in when it becomes necessary or newly available; the attributes of the new data simply add to the conditional tests of the fri rules. one drawback of fri is the lack of an indicator for each specific conditional test (or attribute): in the current formulation of fri, the likelihood of occurrence is assigned to the rule as a whole, and little is known about how much each conditional test contributes to the outcome specified in the rule. in light of this shortcoming, our proposed methodology suggests that the scores from the sensitivity chart, with respect to the relations between the attributes and the outcome, be applied at the rule level by simple majority voting. rules are generated as a by-product of classification in data mining; the process fuzzifies the data ranges, and the confidence factors of their effects in classification are taken as indicators. let a rule be a series of components constraining the attributes a_{j=1..n} (with outcome λ = y) in the classification model building, so that the components remain valid even after the values are fuzzified. a rule can then be expressed in predicate format such that each test takes the form a_j ∈ [φ_{l,j}, φ_{u,j}], where φ_{l,j} and φ_{u,j} are the lower and upper bounds of the core interval, which maps to a fuzzy membership value of 1; similarly, the supports of the lower and upper bounds are denoted by ψ_{l,j} and ψ_{u,j}, outside of which the membership falls to 0. the fuzzy rules are built on decision rules generated by a standard decision tree algorithm, such as a direct rule-based classifier equipped with incremental reduced-error pruning via greedy search [22]. given the rule sets generated, the task here is to find the most suitable fuzzy extension for each rule.
the task can be seen as replacing the current crisp memberships of the rules by their corresponding fuzzy memberships, which is not too computationally difficult as long as the rule structures and their elements stay the same. in order to fuzzify a membership, a formula is applied over the antecedent Ω_i of the rule while considering the relevant data d_i, and at the same time the instances covered by the other antecedents Ω_j are taken into account. by this approach, the instances d_i are divided into two subsets: one subset contains all the positive instances d_i⁺ and the other contains all the negative instances d_i⁻. after that, a purity measure, eqn. (11), is used to separate the two groups further into positive and negative subgroups. in actual operation, a certainty factor, which serves as an indicator of how much a new data instance indeed belongs to a subgroup, is needed to quantify the division. after segregating the data into fuzzy rules r_1^(k), …, r_m^(k) by machine-learning the relations between the attribute and instance values and some class label λ_k, a further indicator is needed to quantify the strength of each rule. assume we have a new test instance x; the support of class λ_k for x is computed as s_k(x) = Σ_{j=1..m} μ_{r_j^(k)}(x) · cf(r_j^(k)) (13), where cf(r_j^(k)) is the certainty factor pertaining to the rule r_j^(k) and μ is its fuzzy membership. the certainty factor cf is expressed as eqn. (14), a ratio relating the membership mass of the training instances labelled λ_k, denoted d^(k), to that of all training instances covered by the rule. the class label predicted by the classifier is the one with the greatest value of the support function (13). at times some instance x cannot be classified into any rule or subgroup, which happens when s_j(x) = 0 for all classes λ_j; x can then be randomly assigned or temporarily placed into a special group. otherwise, the fuzzy rules are formed, and certainty and support indicators are assigned to each of them.
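the membership and support computations above can be sketched as follows; the trapezoidal membership shape, the example rule and its certainty factor are hypothetical illustrations, not rules induced from the study's data:

```python
def trapezoid(x, psi_l, phi_l, phi_u, psi_u):
    # fuzzy membership: 1 inside the core [phi_l, phi_u],
    # linear ramps down to 0 at the supports psi_l / psi_u
    if phi_l <= x <= phi_u:
        return 1.0
    if psi_l < x < phi_l:
        return (x - psi_l) / (phi_l - psi_l)
    if phi_u < x < psi_u:
        return (psi_u - x) / (psi_u - phi_u)
    return 0.0

def rule_membership(instance, tests):
    # a rule is a conjunction of fuzzy interval tests on attributes
    m = 1.0
    for attr, bounds in tests.items():
        m = min(m, trapezoid(instance[attr], *bounds))
    return m

def support(instance, rules):
    # sum of membership * certainty factor over all rules of one class
    return sum(rule_membership(instance, r["tests"]) * r["cf"] for r in rules)

# hypothetical rule: "if fever in ~[38, 40] and contacts in ~[3, 10] then high-risk"
rules_high = [{"tests": {"fever": (37.5, 38.0, 40.0, 41.0),
                         "contact": (1.0, 3.0, 10.0, 14.0)},
               "cf": 0.8}]
s = support({"fever": 39.0, "contact": 5.0}, rules_high)
```

an instance inside every core interval contributes the rule's full certainty factor to its class support, while instances on the ramps contribute proportionally less.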
the indicators express how strong the rules are with respect to their predictive power for the class label. for the purpose of validating the proposed grooms+cmcm methodology, empirical data were obtained from the chinese center for disease control and prevention‡ (cdcp), an official chinese government agency in beijing, china. since the beginning of the covid-19 outbreak, cdcp has been releasing the data to the public and updating them daily via the mainstream media tencent and its subsidiary§. the data come mainly from two sources. one source is deterministic in nature, harvested from cdcp in the form of time-series starting from 25 jan 2020 to 25 feb 2020; a snapshot of the published data is shown in figure 2, and these data are deterministic as historical facts. the data collected for this experiment are only part of the total statistics available on the website. the data required for this experiment are the numbers of people in china who have contracted the covid-19 disease, in the following categories: suspected of infection by displaying some symptoms, confirmed infections by medical tests, cumulative confirmed cases, current number of confirmed cases, current number of suspected cases, current number of critically ill, cumulative number of cured cases, cumulative number of deceased cases, recovery (cured) rate % and fatality rate %. this group of time-series is subjected to grooms to find the most accurate machine learning technique for forecasting the future trends under development. in this case, bfgs-pnn was found to be the winning candidate model, hence it is applied here for generating the future trend of each of the above-mentioned records. the forecasts of these selected data by bfgs-pnn are shown in fig. 3, and are in turn used as deterministic input variables to the cmc model.
they have the lowest errors in rmse in comparison to the other time-series forecasting algorithms tested in [14]. the rationale is to use the most accurate forecasted inputs possible, so as to achieve the most reliable simulated outcomes from the mc simulation at best effort. (‡ http://www.chinacdc.cn/en/  § https://news.qq.com/zt2020/page/feiyan.htm) the goal of this monte carlo simulation experiment, which is part of grooms+cmcm, is to hypothetically estimate the direct cost needed, as an urgent part of national budget planning, to control the covid-19 epidemic. direct cost means the cost of medical resources, including but not limited to medicine, personnel, facilities and other medical supplies directly involved in providing treatment to patients of the covid-19 outbreak. of course, the grand total cost is much wider and greater than the samples experimented with here; the experiment aims at demonstrating the possibility and flexibility of embracing both deterministic and non-deterministic data inputs in the composite mc methodology. the other group of data fed into the cmc is non-deterministic or probabilistic, because these data bear a high level of uncertainty: they are subject to situations that change dynamically, with little control over the outcome. in the case of covid-19 epidemic control, finding a cure for the virus is a good example: best effort is put into treatment, but there is no certainty at all about a cure, let alone knowing when exactly a cure could be developed, tested and proven effective against the novel virus. other probabilistic factors are used in this experiment as well; the selected main attributes are tabulated in table 1. we assume a simple equation for estimating the direct cost of fighting covid-19, using only data on quarantining and isolated medical treatment. note that the variables shown are abbreviated from the term names, e.g.
d-t-r = days_till_recovery. the assumptions and hypotheses are derived from past experience of the direct costs involved in quarantine and isolation during the sars epidemic of 2003, as published in [24], with reasonable adjustment. for the non-deterministic variables ppi/day, d-t-r and d-t-d, the following assumptions are derived from [23]. these variables are probabilistic in nature, as shown in table 1: e.g., nobody can actually tell how long it takes for an infected patient to recover and go home, nor how long the isolation needs to be when the patient is in critical condition. all of these are bounded by probabilities that can be expressed in statistical properties such as min-max, mean, standard deviation and so on. so probability functions are needed to describe them, and random samples from these probability distributions are drawn to run the simulation. it is assumed that the daily medical cost ppi/day follows a normal distribution with a daily increase rate. the daily increase rate is estimated from [24] to rise as days go by, because the chinese government has been putting in increasing resources to stop the epidemic through national efforts: the increase is due to the daily rise in the number of medical staff flown to wuhan from other cities, and the growing volume of consumable medical items as well as their inflating costs. the daily cost is anticipated to become increasingly higher as long as the battle against covid-19 continues at full force. there are other supporting material costs and infrastructure relocation costs, such as those of imposing curfews, and economic damage; however, these other costs are not considered here, for the purpose of demonstrating a simple and easy-to-understand cmc model. a normal distribution and a uniform distribution are assumed for the cost increase and for the probability distributions describing the lengths of hospital stay.
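a stripped-down sketch of this cost simulation follows; the unit costs, the length-of-stay bounds and the daily growth rate are illustrative assumptions, not the study's calibrated figures:

```python
import random
import statistics

def simulate_direct_cost(n_patients, days=14, trials=10000):
    # monte carlo over an uncertain, daily-growing per-patient cost
    # (ppi/day, normal) and an uncertain length of stay (uniform)
    totals = []
    for _ in range(trials):
        ppi_day = max(random.gauss(500.0, 80.0), 0.0)   # cost per isolation day
        growth = random.uniform(1.00, 1.05)             # daily increase rate
        stay = random.uniform(10.0, float(days) + 7.0)  # days till recovery
        cost = sum(ppi_day * growth ** d for d in range(int(stay)))
        totals.append(n_patients * cost)
    return totals

costs = simulate_direct_cost(1000)
low, mid, high = statistics.quantiles(costs, n=4)  # quartiles of the outcome
```

the quartiles, rather than a single total, are what the decision maker reads off: a plausible budget range with its associated risk, in the spirit of the cmc outcomes described above.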
when more information becomes available, one can consider refining them to weibull and rayleigh distributions, which can better describe the progress of the epidemic in terms of severity and a dual statistical degree of freedom. this is a deliberately simplified approach to estimating the daily cost of the so-called medical expenses, based on only two interventions: quarantine and isolation. nevertheless, this cmc model, though simplified, serves as an example of how monte carlo style modelling can help generate probabilistic outcomes, and demonstrates the flexibility and scalability of the modelling system. theoretically, the cmc system can expand from considering two direct inputs (quarantining and isolation) to 20, or even 200, other direct and indirect inputs to estimate the future behaviour of the epidemic. in practical application, the data to be considered should be widely collected, pre-processed, examined for quality and relevance (via grooms), and then carefully loaded into the cmc system to obtain the outcomes. stochastic simulation is well suited for studying the risk factors of infectious disease outbreaks, whose figures constantly change across time and geographical dispersion, thereby posing a high level of uncertainty in decision making. each model forecast by mc simulation is an abstraction of a situation under observation; in our experiment, it is the impact of the dynamics of epidemic development on the direct medical costs of fighting covid-19. the model forecast depicts future tendencies in a real-life situation rather than statements of future occurrence. the output of mc simulation sheds light on the possibilities of the anticipated outcomes. being a composite mc model, the ultimate performance of the simulated outcomes is sensitive to the choice of the machine learning technique that generates the deterministic forecast used as an input variable to the cmc model.
in light of this, a part of our experiment, besides showcasing the mc outcomes obtained with the best available technique, is to compare the levels of accuracy (or error) resulting from the winning candidate of grooms and a standard (default) approach. the forecasting algorithms in comparison are bfgs+pnn and linear regression, respectively. the performance criterion is rmse, which is consistent across models: with forecast errors e_t = y_t − ŷ_t over n points, rmse = sqrt((1/n) Σ e_t²), as defined in [25] . at the same time, the total costs produced manually, by explicit use of a spreadsheet and human knowledge, are compared vis-à-vis those of the forecast models by cmcm. the comparative performances are tabulated in table ii . the forecasting period is 14 days. the cmcm model is implemented on oracle crystal ball release 11.1.2.4.850 (64-bit), running on an i7 cpu @ 2ghz, 16gb ram and an ms windows 10 platform. 10,000 trials were set to run the simulation for each model. as seen from table ii , the rmse of the monte carlo forecasting method using linear regression is more than double that of the method using bfgs+pnn (approximately 128k vs 62k). that is mainly due to the over-estimation of all the deterministic input variables by linear regression. referring to the first diagram in figure 3 , the variable called new_daily_increase_confirmed is non-stationary and contains an outlier which rose unusually high near the end. furthermore, the other correlated variable, new_daily_increase_suspected, which precedes the trends of the confirmed cases, is also non-stationary and has an upward trend near the end, though it eventually dips. with linear regression, the outlier and the upward trends encourage the predicted curve to continue the upward trend, linearly, perhaps with a sharp gradient. consequently, most of the forecast outcomes in the system have been over-forecasted.
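as a minimal illustration of the comparison, the snippet below computes rmse for two hypothetical forecasts of a series containing an end-of-series outlier; all the numbers are invented, but they show how a linear extrapolation inflates rmse relative to a forecast that tracks the non-linearity.

```python
import numpy as np

def rmse(actual, forecast):
    # root-mean-square error: sqrt of the mean squared residual
    e = np.asarray(actual, float) - np.asarray(forecast, float)
    return float(np.sqrt(np.mean(e ** 2)))

# toy series with an outlier spike near the end (mimicking the
# new_daily_increase_confirmed behaviour described above)
actual = [400, 420, 450, 430, 440, 460, 1500, 470]
lin_reg = [390, 430, 470, 510, 550, 590, 630, 670]  # keeps extrapolating upward
nonlin = [405, 418, 448, 435, 442, 458, 900, 480]   # tracks the series closer

print("rmse (linear regression):", rmse(actual, lin_reg))
print("rmse (non-linear model): ", rmse(actual, nonlin))
```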
as such, using linear regression causes unstable stochastic simulation, leading to more extreme final results compared to the other methods. this is evident in table ii, where total_daily_cost has been largely over-forecasted and under-forecasted by the manual and mc approaches. on the other hand, bfgs+pnn, which better recognizes the non-linear mapping between the attributes and the prediction class, offers more realistic trends, which in turn are loaded into the cmcm. as a result, the range of the final total_daily_cost results is narrower than that of its peer, linear regression ([lr: 12mil - 83mil] vs [bfgs+pnn: 54mil - 74mil]). the direct medical cost of fighting covid-19 for a fortnight is estimated to be about 73.6 million usd, given the available data, using grooms+cmcm. according to the results in the form of probability distributions in figure 4 , different options are available for the user to choose from when estimating the fortnightly budget for fighting covid-19 in the existing situation. each option comes with a different price tag, and at a different level of risk. in general, the higher the risk the user is willing to tolerate, the lower the budget, and vice-versa. from the simulated possible outcomes in figure 4 , if the budget is constrained, the user can consider bearing the risk (uncertainty of 50%) that a mean of $74mil with [min:$69mil, max:$78mil] is forecasted to be sufficient to fulfil the direct medical cost need. likewise, if high certainty is to be assured, for example an 80% chance that the required budget will be met, about a mean of $79mil with [min:$66mil, max:$82mil] is needed. for a high certainty of 98%, it is forecasted that the budget will fall around a mean of $90mil, ranging from [min:$60mil, max:$89mil].
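reading a budget off the simulated distribution at a chosen certainty level amounts to taking a percentile of the trial outcomes. the sketch below uses a synthetic stand-in distribution (not the experiment's actual outcomes) to show the mechanics.

```python
import numpy as np

rng = np.random.default_rng(1)
# stand-in for the 10,000 simulated fortnight cost outcomes (usd);
# the mean and spread are invented for illustration
totals = rng.normal(74e6, 6e6, 10_000)

def budget_at_certainty(outcomes, certainty):
    # the budget that would have covered the simulated cost in
    # `certainty` fraction of the trials
    return float(np.percentile(outcomes, certainty * 100))

for c in (0.50, 0.80, 0.98):
    print(f"{c:.0%} certainty -> budget {budget_at_certainty(totals, c)/1e6:.1f} mil")
```

higher certainty always maps to a higher percentile, hence a larger budget, matching the risk/budget trade-off described above.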
as a de-facto practice, some users will take 80% certainty as a pareto principle (80-20) decision [26] and accept the mean budget of $79mil. $79mil should be a realistic compromise figure when compared to manual forecasts without stochastic simulation, where budgets of $54mil and $84mil would have been forecasted by the manual approach using linear regression and the neural network, respectively. a sensitivity chart, as its name suggests, displays the extent to which the output of a simulated mc model is affected by changes in some of the input variables. it is useful in risk analysis of a so-called black-box model, such as the cmcm used in this experiment, by opening up information about how sensitive the simulated outcome is to variations in the input variables. since the mc output is an opaque function of multiple composite input variables that were blended and simulated in random fashion many times over, the exact relationship between the input variables and the simulated outcome cannot be known except through the sensitivity chart. a sensitivity chart generated in our experiment is shown in figure 5 . as can be observed from figure 5 , the top three input variables most influential to the predicted output, which in our experiment is the total medical cost, are: the average number of days before recovery on day 10 and day 12, and the average cost per day for isolating a patient on day 2. the first two key variables concern how soon a patient can recover from covid-19 near the final days, and the third most important variable is the average daily cost of isolating patients at the beginning of the forecasting period. this insight can be interpreted as follows: an early recovery near the final days and a reasonably lower medical cost at the beginning would impact the final budget to the greatest extent.
consequently, based on the results from the sensitivity analysis, decision makers could consider putting large or maximum effort into treating isolated patients at the beginning and observing for a period of 10 days or so; if the medical efforts invested in the early days take effect, the last several days of recovery become promising, perhaps leading to a substantial saving on the medical bill. the sensitivity chart can be extended to what-if scenario analysis for epidemic key-variable selection and modeling [27] . for example, one can modify the quantity of each of the variables, and the effects on the output will be updated instantly. however, this is beyond the scope of this paper, though it is worth exploring, for it helps fine-tune how the variables should be managed in the effort of maximizing or minimizing the budget and its impacts. since the effect of a group of independent variables on the predicted output is known and ranked from the chart, it could be used as an alternative to feature ranking or feature engineering in data mining. the sensitivity chart is a byproduct generated by the mc after a long series of repeated runs using different random samples from the input distributions. figure 6 depicts how the sensitivity chart relates to the processes in the proposed methodology. effectively, the top-ranked variables can be used to infer the most influential or relevant attributes of the dataset that is loaded into an fri model (described in section 2.2) for supervised learning. one suggested approach, fast and easy, is to create a correlogram; from there one can do pairwise matching between the most sensitive variables from the non-deterministic data sources and the corresponding attributes from the deterministic dataset. ranking scores can be mapped over too, by the boyer-moore majority vote algorithm [28] .
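one common way to build such a sensitivity ranking from raw mc trials is rank correlation between each input variable's samples and the simulated output. the snippet below is a sketch with three invented input variables and a toy cost model; it is not the crystal ball implementation, only an illustration of the idea.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

# random samples of three hypothetical input variables
days_to_recovery = rng.uniform(10, 21, n)
cost_per_day = rng.normal(200, 30, n)
quarantine_headcount = rng.uniform(2000, 3000, n)

# opaque model output: total cost (the "black box" the chart opens up)
total_cost = cost_per_day * quarantine_headcount * days_to_recovery / 14

def rank_corr(x, y):
    # spearman-style rank correlation, a common basis for sensitivity charts
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    return float(np.corrcoef(rx, ry)[0, 1])

sens = {name: rank_corr(v, total_cost) for name, v in {
    "days_to_recovery": days_to_recovery,
    "cost_per_day": cost_per_day,
    "quarantine_headcount": quarantine_headcount,
}.items()}
for name, s in sorted(sens.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:22s} {s:+.2f}")
```

the variable with the widest relative spread dominates the ranking, which is exactly the kind of insight the chart in figure 5 conveys.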
some selected fuzzy rules generated by the methodology and filtered by the sensitivity chart correlation mapping are shown below. the display threshold is 0.82, arbitrarily chosen to display only the top six rules; half of them predict that an inflection point in the struggle to control covid-19 can be attained, the other half indicate otherwise. an fri model, in a nutshell, is a classifier which predicts an output belonging to one of two classes. in our experiment, we set up a classification model using fri to predict whether an inflection point of the epidemic could be reached. there is no standard definition of an inflection point, though it is generally agreed that it is a turning point at which the momentum of accumulation changes from one direction to another or vice-versa. that could be interpreted as an intersection of two curves whose trajectories begin to switch. in the context of epidemic control, an inflection point is the moment from which the rate of spreading starts to subside, after which the trend of the epidemic leads to elimination or eradication. based on a sliding window of 3 days' length, a formula for computing the inflection point from the three main attributes of the covid-19 data is listed as follows:

win: score = w1 × (Δ down-trend over the past 3 days of new_daily_increase_confirmed (n.d.i.c)) + w2 × (Δ down-trend over the past 3 days of current_confirmed) + w3 × (Δ up-trend over the past 3 days of cured_rate)

lose: score = w1 × (Δ up-trend over the past 3 days of new_daily_increase_confirmed (n.d.i.c)) + w2 × (Δ up-trend over the past 3 days of current_confirmed) + w3 × (Δ up-trend over the past 3 days of death_rate)

where w1=0.1, w2=0.15, and w3=0.25, which can be set arbitrarily by the user. the weights reflect how much importance one places on how the up- or downward trends of confirmed cases and the cured vs death rates contribute to reaching the inflection point.
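the win/lose scoring above can be implemented directly. the helper below treats the "Δ down-trend" and "Δ up-trend" terms as the change across the 3-day window, clipped at zero (one plausible reading of the formula); the three-day figures in the usage example are hypothetical.

```python
def trend_delta(series):
    # change over a 3-day sliding window: last value minus first
    return series[-1] - series[0]

def win_lose_scores(ndic3, confirmed3, cured_rate3, death_rate3,
                    w1=0.1, w2=0.15, w3=0.25):
    # positive down-trend / up-trend magnitudes, clipped at zero
    down = lambda s: max(0.0, -trend_delta(s))
    up = lambda s: max(0.0, trend_delta(s))
    win = w1 * down(ndic3) + w2 * down(confirmed3) + w3 * up(cured_rate3)
    lose = w1 * up(ndic3) + w2 * up(confirmed3) + w3 * up(death_rate3)
    return win, lose

# three days of (hypothetical) figures: new cases falling, cures rising
win, lose = win_lose_scores([3400, 3000, 2600], [58000, 57000, 55500],
                            [2.1, 2.9, 3.8], [2.2, 2.2, 2.1])
print("win:", win, "lose:", lose)
```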
a dual-curve chart that depicts the inflection point is shown in figure 7 . interestingly, near the end of the timeline (28/1 - 20/2), that is, from the 19th point onwards, the two curves seem to intersect, as it has been hoped that the winning curve would rise over the losing curve. an inflection point might have been reached, but the momentum of winning is not there yet. further observation of the epidemic development is needed to confirm the certainty of winning. nevertheless, the top six rules, built from the classification of the inflection point and processed by feature selection via sensitivity analysis, are shown below. cf stands for confidence factor, which indicates how strong the rule is. on the winning side, the rules reveal that when the variables about new confirmed cases fall below certain numbers, a win is scored, contributing towards an inflection point. (yester3days-ndic = '(10706.5 .. ∞)') → win=0 (cf = 0.53). the strongest rules of the two forces are rules 1 and 4. rule 1 shows that to win epidemic control, the down-trend over three consecutive days must fall below 3581; on the other hand, epidemic control may lead to failure if the cured rate stays below 3.86% and the new daily increase in confirmed cases remains high, between 1874 and 3350 (decimal points rounded). originating from wuhan, china, the novel coronavirus epidemic spread over many chinese cities, then over other countries worldwide, from december 2019. the chinese authorities took strict measures to contain the outbreak resolutely, by restricting travel, suspending businesses and schools, etc. this gave rise to an emergency situation in which critical decisions were demanded while the virus was novel and very little information was known about the epidemic at the early stage.
with incomplete information, limited data on hand, and an ever-changing epidemic, it is extremely hard for anybody to make a decision using only a deterministic approach that foretells precisely the future behaviour of the epidemic. in this paper a composite monte-carlo model (cmcm) is proposed, to be used in conjunction with the grooms methodology [23] , which finds the best-performing deterministic forecasting algorithm. coupling grooms+cmcm offers the flexibility of embracing both deterministic and non-deterministic input data in the monte carlo simulation, where random samples are drawn from the distributions of the data from the non-deterministic data sources for reliable outputs. during the early period of disease outbreaks, data are scarce and full of uncertainty. the advantage of cmc is that a range of possible outcomes is generated, associated with probabilities. subsequently, sensitivity analysis, what-if analysis and other scenario planning can be done for decision support. as a part of the grooms+cmcm methodology, fuzzy rule induction is also proposed, which provides another dimension of insight in the form of decision rules for decision support. a case study of the recent novel coronavirus epidemic (also known as wuhan coronavirus, covid-19 or 2019-ncov) is used as an example to demonstrate the efficacy of grooms+cmcm. through experimentation on the empirical covid-19 data collected from the chinese government agency, it was found that the outcomes generated by monte carlo simulation are superior to those of the traditional methods. a collection of soft computing techniques, such as bfgs+pnn, fuzzy rule induction, and other algorithms supporting grooms+cmcm, used together with monte carlo simulation, can produce better results for decision support than any deterministic forecaster alone.
references:
outbreaks chronology: ebola virus disease
the impacts on health, society, and economy of sars and h7n9 outbreaks in china: a case comparison study
coronavirus: the hit to the global economy will be worse than sars
nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study
the mathematical theory of epidemics
stochastic epidemic models and their statistical analysis
how big is an outbreak likely to be? methods for epidemic final-size calculation
epidemic process over the commute network in a metropolitan area
a queue-based monte carlo analysis to support decision making for implementation of an emergency department fast track
monte carlo simulation model to study the inequalities in access to ems services
real-time stochastic evacuation models for decision support in actual emergencies
on algorithmic decision procedures in emergency response systems in smart and connected communities
finding an accurate early forecasting model from small dataset: a case of 2019-ncov novel coronavirus outbreak
bayesian prediction of an epidemic curve
analytical study of the least squares quasi-newton method for interaction problems
numerical analysis for applied science
heuristic self-organization in problems of engineering cybernetics
estimating the coefficients of polynomials in parametric gmdh algorithms by the improved instrumental variables method
adjustment of an inverse matrix corresponding to changes in the elements of a given column or a given row of the original matrix (abstract)
nonlinear digital filters: analysis and applications
medical applications of artificial intelligence
a cost-based comparison of quarantine strategies for new emerging diseases
how china built two coronavirus hospitals in just over a week
fitting mechanistic epidemic models to data: a comparison of simple markov chain monte carlo approaches
mathematical and computer modelling of the pareto principle
soubeyrand samuel and thébaud gaël, using sensitivity analysis to identify key factors for the propagation of a plant epidemic
automated reasoning: essays in honor of woody bledsoe

he is a co-founder of the data analytics and collaborative computing research group in the faculty of science and technology. prior to his academic career, simon took up various managerial and technical posts, such as systems engineer, it consultant and e-commerce director in australia and asia. dr. fong has published over 450 international conference and peer-reviewed journal papers, mostly in the areas of data mining, data stream mining, big data analytics, meta-heuristics optimization algorithms, and their applications. he serves on the editorial boards of the journal of network and computer applications of elsevier, ieee it professional magazine, and various special issues of scie-indexed journals. simon is also an active researcher with leading positions such as vice-chair of an ieee computational intelligence society (cis) task force. her latest winning work includes the first unmanned supermarket in macau, enabled by the latest sensing technologies, face recognition and e-payment systems. she is also the founder of several online-to-offline dot-com companies in trading and retailing, both online and offline. ms li is also an active researcher, manager and chief knowledge officer in the dacc laboratory at the faculty of science and technology. rubén gonzález crespo has a phd in computer science engineering. currently he is vice chancellor of academic affairs and faculty at unir and global director of engineering schools of the proeduca group.
he is an advisory board member for the ministry of education of colombia and an evaluator for the national agency for quality evaluation and accreditation of spain (aneca). his current research interests include group decision making, consensus models, linguistic modeling, aggregation of information, information retrieval, bibliometrics, digital libraries, web quality evaluation, recommender systems, and social media. on these topics he has published more than 250 papers in isi journals and coordinated more than 22 research projects. dr. herrera-viedma is vice-president of publications of the ieee smc society and an associate editor of international journals.
key: cord-190495-xpfbw7lo authors: molnar, tamas g.; singletary, andrew w.; orosz, gabor; ames, aaron d. title: safety-critical control of compartmental epidemiological models with measurement delays date: 2020-09-22 journal: nan doi: nan sha: doc_id: 190495 cord_uid: xpfbw7lo
we introduce a methodology to guarantee safety against the spread of infectious diseases by viewing epidemiological models as control systems and by considering human interventions (such as quarantining or social distancing) as control input. we consider a generalized compartmental model that represents the form of the most popular epidemiological models and we design safety-critical controllers that formally guarantee safe evolution with respect to keeping certain populations of interest under prescribed safe limits. furthermore, we discuss how measurement delays originating from the incubation period and testing delays affect safety, and how delays can be compensated via predictor feedback. we demonstrate our results by synthesizing active intervention policies that bound the number of infections, hospitalizations and deaths for epidemiological models capturing the spread of covid-19 in the usa.
the rapid spreading of covid-19 across the world forced people to change their lives and practice mitigation efforts at a level never seen before, including social distancing, mask-wearing, quarantining and stay-at-home orders. these human actions played a key role in reducing the spreading of the virus, although such interventions often have economic consequences, loss of jobs and psychological effects. therefore, it is important to focus mitigation efforts and determine when, where and what level of intervention needs to be taken. this research provides a methodology to determine the level of active human intervention needed to provide safety against the spreading of the infection while keeping mitigation efforts minimal. we use compartmental epidemiological models to describe the spreading of the infection [1] , [2] , and we view these models as control systems where human intervention is the control input. viewing epidemiological models as control systems has been proposed in the literature recently [3] , [4] , [5] , and various models with varying transmission rate [6] , [7] , [8] , [9] have appeared to quantify the level of human interventions in the case of covid-19. in this paper, we build on our recent work [10] and use a safety-critical control approach to synthesize control strategies that guide human interventions so that certain safety criteria (such as keeping infection, hospitalization and death below given limits) are fulfilled with minimal mitigation efforts.
fig. 1. illustration of the sir model as control system and its fit to us covid-19 data [10]. model parameters were estimated from compartmental data (right) by accounting for a measurement delay τ. the transmission rate and the corresponding control input (left) were fitted to mobility data.
the approach is based on the framework of control barrier functions [11] , [12] , which leverages the theory of set invariance [13] for dynamical [14] , [15] and control systems [16] , [17] , [18] .
we take into account that data about the spreading of the infection may involve significant measurement delays [5] , [19] , [20] , [21] , due to the fact that infected individuals may not show symptoms or get tested for quite a few days. we use predictor feedback control [22] , [23] , [24] to compensate for these delays, and we provide safety guarantees against errors in delay compensation. the outline of the paper is as follows. section ii introduces a generalized compartmental model, which covers the class of the most popular epidemiological models. section iii introduces safety-critical control without considering measurement delays, while sec. iv is dedicated to delay compensation. conclusions are drawn in sec. v. compartmental models describe how the sizes of certain populations of interest evolve over time. consider n + m compartments, given by x ∈ R^(n+m), which are separated into two groups: n so-called multiplicative compartments, given by w ∈ R^n, and m outlet compartments, given by z ∈ R^m. the evolution of these compartments over time t can be given by the following generalized compartmental model: ẇ = f(w) + g(w)u, ż = q(w) + r(z), (1) where x = [wᵀ zᵀ]ᵀ, initial conditions are x(0) = x₀, and f, g : R^n → R^n, q : R^n → R^m and r : R^m → R^m depend on the choice of the model; see examples 1, 2 and 3. in model (1), the multiplicative compartments w are populations that essentially describe the transmission of the infection. the transmission can be reduced by active interventions, whose intensity is quantified by a control input u ∈ U ⊂ R. the outlet compartments z, on the other hand, do not actively govern the transmission, but rather indicate its effects, as they are driven by the evolution of the multiplicative compartments. example 1. sir model. one of the most fundamental epidemiological models is the sir model [25] , [26] , which consists of susceptible, s, infected, i, and recovered, r, populations.
the sir model captures the spread of the infection based on the interplay between the susceptible and infected populations. thus, s and i are multiplicative compartments, while r, which measures the number of recovered (or deceased) individuals, is an outlet compartment. the model uses three parameters: the transmission rate β0 > 0, the recovery rate γ > 0 and the total population n. active interventions, given by the control input u ∈ [0, 1], allow the population to reduce the transmission to an effective rate β = β0(1 − u), where u = 0 means no intervention and u = 1 means total isolation of infected individuals. this puts the sir model with active intervention into form (1) with w = [s, i]ᵀ, z = r, f(w) = [−β0 s i/n, β0 s i/n − γ i]ᵀ, g(w) = [β0 s i/n, −β0 s i/n]ᵀ, q(w) = γ i, r(z) = 0. (2) example 2. seir model. the seir model [27] , [28] is an extension of the sir model that incorporates an exposed population e apart from the s, i and r compartments. the exposed individuals are infected but not yet infectious over a latency period given by 1/σ > 0. since the latency affects the transmission, e is a multiplicative compartment. the seir model can be described by (1) with w = [s, e, i]ᵀ, z = r, f(w) = [−β0 s i/n, β0 s i/n − σ e, σ e − γ i]ᵀ, g(w) = [β0 s i/n, −β0 s i/n, 0]ᵀ, q(w) = γ i, r(z) = 0. (3) example 3. sihrd model. the sihrd model [10] adds two more outlet compartments to the sir model: the hospitalized population h and the deceased population d. their evolution is captured by three additional parameters: the hospitalization rate λ > 0, the recovery rate ν > 0 in hospitals and the death rate µ > 0. equation (1) yields the sihrd model for w = [s, i]ᵀ, z = [r, h, d]ᵀ, f(w) = [−β0 s i/n, β0 s i/n − (γ + λ + µ) i]ᵀ, g(w) = [β0 s i/n, −β0 s i/n]ᵀ, q(w) = [γ i, λ i, µ i]ᵀ, r(z) = [ν h, −ν h, 0]ᵀ. (4) there exist several other compartmental models of form (1) which involve further compartments, such as the sird [29] , sirt [7] , sixrd [30] or sidarthe [1] models. more complex models can provide higher fidelity, although they involve more parameters that need to be identified. in what follows, we show applications of the sir and sihrd models and we discuss the occurrence of time delays related to incubation and testing. we omit further discussions on latency, the seir model or other more complex models. fig.
1 shows the performance of the sir model in capturing the spread of covid-19 for the case of us national data. the parameters β0 = 0.33 day⁻¹, γ = 0.2 day⁻¹ and n = 33 × 10⁶ of the sir model were fitted, following the algorithm in [10] , to the recorded number of confirmed cases i + r [31] between march 25 and august 9, 2020, while the control input u(t), which represents the level of quarantining and social distancing, was identified from mobility data [32] based on the median time people spent at home. the fitted control input (blue) follows the trend of the mobility data (gray) well, especially in march when stay-at-home orders came into action. while the fit (blue) captures the data about confirmed cases (gray), the model also has predictive power (orange); see more details about forecasting in [10] . note that once an individual gets infected by covid-19, it takes a few days of incubation to show symptoms and an additional few days to get tested for the virus [5] , [19] , [20] , [21] . therefore, the measured number of confirmed cases represents a delayed state of the system, and thus we involved a time delay τ in the model identification process, which was found to be τ = 11 days by fitting [10] . the delay-free counterpart of the fit (purple) shows that the measurement delay can lead to a significant error in identifying the true current level of infection. the effects of the delay τ on safety-critical control and its compensation will be discussed in sec. iv. formally, safety can be translated into keeping system (1) within a safe set s ⊂ R^(n+m) that is the 0-superlevel set of a continuously differentiable function h : R^(n+m) → R: s = {x ∈ R^(n+m) : h(x) ≥ 0}. (5) function h prescribes the condition for safety: for example, if one intends to keep the infected population i under a limit i_max for the sir, seir or sihrd models, the safety condition is h(x) = i_max − i ≥ 0.
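the sir dynamics with the intervention input u(t) can be simulated with a simple forward-euler sketch. the β0, γ and n values below are the fitted ones quoted above; the initial infected count, step size and the constant-u inputs are illustrative assumptions, not values from the paper.

```python
def simulate_sir(beta0, gamma, N, u, S0, I0, days, dt=0.1):
    """forward-euler integration of the sir model with intervention input u(t),
    where the effective transmission rate is beta0 * (1 - u(t))."""
    steps = int(days / dt)
    S, I, R = S0, I0, N - S0 - I0
    traj = []
    for k in range(steps):
        t = k * dt
        beta = beta0 * (1.0 - u(t))
        dS = -beta * S * I / N
        dI = beta * S * I / N - gamma * I
        dR = gamma * I
        S, I, R = S + dt * dS, I + dt * dI, R + dt * dR
        traj.append((t, S, I, R))
    return traj

# no intervention: the infection grows since beta0/gamma = 1.65 > 1
traj = simulate_sir(0.33, 0.2, 33e6, u=lambda t: 0.0,
                    S0=33e6 - 1e4, I0=1e4, days=200)
peak_I = max(I for _, _, I, _ in traj)
print(f"peak infected without intervention: {peak_I:,.0f}")
```

with a constant u above 1 − γ/β0 ≈ 0.39, the effective rate drops below γ and the infected population decays instead of peaking.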
to guarantee safety, we design a controller that ensures that the set s in (5) is forward invariant under the dynamics (1), i.e., if x(0) ∈ s (h(x(0)) ≥ 0), then x(t) ∈ s (h(x(t)) ≥ 0) for all t > 0. below we use the framework of control barrier functions [11] , [12] to synthesize controllers that are able to keep certain compartments of interest within prescribed limits. first, we consider safety for multiplicative compartments, and then for outlet compartments. consider keeping the i-th multiplicative compartment (1 ≤ i ≤ n) below a safe limit given by c_i, i.e., we prescribe h(x) = c_i − w_i, (7) where c_i is an upper bound for w_i. a lower bound could also be considered similarly, by taking h(x) = w_i − c_i. theorem 1: consider dynamical system (1), function h in (7) and the corresponding set s given by (5). the following safety-critical active intervention controller guarantees that s is forward invariant (safe) under dynamics (1) if g_i(w) ≠ 0, ∀w ∈ R^n: u(t) = relu(ϕ_i(x(t))), (8) where ϕ_i(x) = (f_i(w) − α(c_i − w_i)) / (−g_i(w)), (9) relu(·) = max{0, ·} is the rectified linear unit, and α > 0. furthermore, the controller is optimal in the sense that it has minimum-norm control input. proof. according to [12] , the necessary and sufficient condition of forward set invariance is given by ḣ(x(t)) ≥ −α h(x(t)) for all t ≥ 0, (10) where the derivative is taken along the solution of (1). if there exists a control input u(t) so that (10) is satisfied, then h is called a control barrier function. substitution of (7) and (1) into (10) gives the safety condition u(t) ≥ ϕ_i(x(t)), (11) where ϕ_i is given by (9). the control input u(t) must satisfy (11) for all t ≥ 0. to keep control efforts minimal, one can achieve this by solving the quadratic program u(t) = argmin_{u ∈ R} u² s.t. u ≥ ϕ_i(x(t)). (12) based on the kkt conditions [33] , the explicit solution is u(t) = ϕ_i(x(t)) if ϕ_i(x(t)) > 0 and u(t) = 0 otherwise, (13) if g_i(w(t)) ≠ 0, which can be simplified to (8). we remark that if g_i(w) = 0, safety can be ensured with the help of extended control barrier functions, as discussed for the safety guarantees of outlet compartments in sec. iii-b.
for example, to keep the infected population i below the limit i_max for the sir model given by (2), one shall prescribe h(x) = i_max − i, and (8) leads to the controller u(t) = relu(1 − (γ i(t) + α(i_max − i(t))) / (β0 s(t) i(t)/n)). (14) (footnote 1: more precisely, α must be chosen as an extended class k function [12] , but we use a constant for simpler discussion and without loss of generality.)
fig. 2. safety-critical active intervention control of the sir model fitted in fig. 1 to us covid-19 data. the controller keeps the infected population under the prescribed limit i_max, as opposed to the second wave of infection that was experienced during the summer of 2020.
fig. 2 shows the dynamics of the closed control loop for the covid-19 model fitted in fig. 1 by prescribing
we again use (10) as the necessary and sufficient condition for safety, where the following expression appears: h_e(x(t)) := ḣ(x(t)) + αh(x(t)) = −(q_j(w(t)) + r_j(z(t))) + α(c_j − z_j(t)), (18) which puts the safety condition into the form h_e(x(t)) ≥ 0, ∀t ≥ 0. however, the control input does not explicitly show up in (18). still, if there exists a control input that satisfies ḣ_e(x(t)) ≥ −α_e h_e(x(t)), (19) then h_e is an extended control barrier function [34], [13], whose 0-superlevel set is forward invariant, that is, h_e(x_0) ≥ 0 implies h_e(x(t)) ≥ 0, ∀t > 0. substitution of (18), (15) and (1) into (19) gives the extended safety condition (20), where ϕ_e_j is defined by (17). this can be satisfied by a min-norm controller obtained from the quadratic program (21). the explicit solution of the quadratic program is (22), which is equivalent to (16). as an example of keeping outlet compartments safe, consider limiting the number of hospitalizations below h_max and deaths below d_max for the sihrd model given by (4). by choosing h(x) = h_max − h, one can guarantee safety in terms of hospitalization based on (16) by the controller (23), whereas prescribing h(x) = d_max − d ensures safety by upper bounding deaths via (24). having synthesized controllers that keep selected compartments safe, let us now guarantee safety for multiple compartments at the same time: a set of multiplicative compartments i ⊂ {1, . . . , n} and a set of outlet compartments j ⊂ {1, . . . , m}. to formulate the safety condition, one can utilize (11) for any multiplicative compartment i ∈ i and (20) for any outlet compartment j ∈ j. then, one needs to solve the corresponding quadratic program subject to all of these constraints. in general, the quadratic program can only be solved numerically, and one may need relaxation terms to satisfy multiple constraints [12]. however, analytical solutions can be found in some special cases, such as the one given by the following assumption. assumption 1.
assume that the following terms have the same sign: sign(g i (w(t))) = sign(l g q j (w(t))) = −1, ∀i ∈ i, ∀j ∈ j , ∀t ≥ 0. this assumption often holds for models where compartments need to be upper bounded for safety, e.g., the assumption holds for keeping e, i, r, h or d below a safe limit in the sir, seir or sihrd models. under this assumption, one can state the following proposition. proposition 3: consider dynamical system (1) with assumption 1 and the controllers (8) and (16) that keep individual multiplicative compartments w i , i ∈ i ⊂ {1, . . . , n} and outlet compartments z j , j ∈ j ⊂ {1, . . . , m} safe using the control barrier functions in (7) and (15) . the following safety-critical active intervention controller guarantees safety for all compartments at the same time: that is, one needs to take the maximum of the individual control inputs that keep each individual compartment safe. proof. if assumption 1 holds, the safety conditions in (11) and (20) can be combined into one inequality: then, one can solve the quadratic program: in the form: this can be simplified to (25) based on (8) and (16) . fig. 3 shows the closed loop response of the sihrd model given by (4) that was fitted to us covid-19 data [10] . the data about confirmed cases were scaled by the cube root of the positivity rate (positive per total tests) to account for the significant under-reporting of cases during the first wave of the virus (and cube root was applied to scale less aggressively). starting from june 1, safetycritical active intervention control is applied to limit both the hospitalizations below h max = 40, 000 and the deaths below d max = 400, 000. based on (25), we utilize the controller a hd (x) = max{a h (x), a d (x)} where a h and a d are given by (23) and (24) . 
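the sir example above (keeping i below i_max) can be simulated end to end. the extracted text does not reproduce the controller's display equation, so the expression below is our re-derivation from (8): with transmission rate β0(1 − u), the safety condition ḣ ≥ −αh for h = i_max − i yields the min-norm input u = relu(1 − n(γi + α(i_max − i))/(β0 s i)). all parameter values here are illustrative, not the paper's fitted us data:

```python
def simulate_safe_sir(beta0=0.5, gamma=0.14, N=1e6, I0=1e3,
                      i_max=5e4, alpha=0.014, dt=0.05, days=300):
    """Forward-Euler SIR under the reconstructed safety filter.
    Returns the peak infected population, which should stay <= i_max."""
    S, I = N - I0, I0
    peak = I
    for _ in range(int(days / dt)):
        # min-norm mitigation keeping dI/dt <= alpha * (i_max - I)
        u = max(0.0, 1.0 - N * (gamma * I + alpha * (i_max - I)) / (beta0 * S * I))
        u = min(u, 1.0)  # mitigation cannot exceed a full lockdown
        dS = -beta0 * (1.0 - u) * S * I / N
        dI = -dS - gamma * I
        S += dt * dS
        I += dt * dI
        peak = max(peak, I)
    return peak
```

with the filter active, the infected population approaches i_max from below at rate α and never crosses it, while the control input decays as susceptibles deplete, mirroring the behavior described for fig. 2.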
the model and controller parameters are β0 = 0.53 day^−1, γ = 0.14 day^−1, λ = 0.03 day^−1, ν = 0.14 day^−1, µ = 0.01 day^−1, n = 15×10^6, τ = 9 days, α_d = α_e_d = α_h = (γ + λ + µ)/10 and α_e_h = ν/10. safety-critical control is able to reduce mitigation efforts while keeping the system below the prescribed hospitalization and death bounds and preventing a second wave of the virus. controller (6) in sec. iii is designed based on feeding back the instantaneous state x(t) of the compartmental model. however, data about certain compartments are measured with delay due to the incubation period and testing delays. thus, the instantaneous state x(t) may not be available for feedback, and the delayed state x(t − τ) with measurement delay τ must be used instead. if one implements a(x(t − τ)) instead of a(x(t)) for active intervention, a significant discrepancy between the delayed and instantaneous states can endanger safety. for example, the delay was identified to be τ = 11 days for the us covid-19 data in fig. 1, while the infected population grew from a few thousand to more than a hundred thousand within 11 days in mid-march. this difference significantly impacts safety-critical control. thus, we propose a method to compensate delays by predicting the instantaneous state from the delayed one, and we analyze how the prediction error affects safety. we use the idea of predictor feedback control [22], [23], [24] to overcome the effect of delays. namely, at each time moment t we use the data that are available up to time t − τ and we calculate a predicted state x_p(t) that approximates the instantaneous state: x_p(t) ≈ x(t). then, we use the predicted state in the feedback law by applying a(x_p(t)) ≈ a(x(t)). if the prediction is perfect (i.e., x_p(t) = x(t)), safety is guaranteed even in the presence of delay according to sec. iii. below we analyze how errors in the prediction affect safety.
the prediction can be done by any model-based or data-based method; see example 4 for an instance. at this point we only assume that the prediction error e(t) := x_p(t) − x(t), defined by (30), is bounded in the sense that ‖e(t)‖_∞ ≤ ε for some ε ≥ 0. the prediction error leads to an input disturbance d(t) relative to the nominal control input u(t) = a(x(t)), which yields the closed control loop ẇ(t) = f(w(t)) + g(w(t))(u(t) + d(t)), ż(t) = q(w(t)) + r(z(t)). (31) note that for a lipschitz continuous controller a with lipschitz constant c, the input disturbance is upper bounded by ‖d(t)‖_∞ ≤ c‖e(t)‖_∞ ≤ cε =: δ. the following theorem summarizes how the disturbance affects safety via the notion of input-to-state safety [35]. for simplicity, we state this theorem only for the safety of multiplicative compartments. theorem 4: consider the closed-loop dynamical system (31), function h in (7) and the corresponding set s given by (5). assume that the nominal controller u(t) guarantees safety without the input disturbance d(t) by satisfying (11), while the input disturbance d(t) defined by (30) is bounded by ‖d(t)‖_∞ ≤ δ. then, set s is input-to-state safe in the sense that a larger set s_d ⊇ s, given by (32)-(33), is forward invariant (safe) under dynamics (31). proof. similarly to (10) and (19), the necessary and sufficient condition for the invariance of s_d is given by (34). substituting (33) and taking the derivative along the solution of (31) yields −ϕ_i(w(t)) − g_i(w(t))(u(t) + d(t)) + δ‖g_i(w(t))‖_∞ ≥ 0, (35) which indeed holds, since (11) and ‖d(t)‖_∞ ≤ δ hold. how much larger set s_d is compared to set s depends on the size δ of the disturbance, which is related to the prediction error ε. if the prediction is perfect (x_p(t) = x(t)), then ε = 0, δ = 0 and s_d recovers s. however, if one implements a delayed state feedback controller without prediction (x_p(t) = x(t − τ)), then ε and δ can be large, while s_d can be significantly larger than the desired set s. example 4.
a possible model-based prediction can be done as follows. at each time moment t, we take the most recent available measurement x(t − τ ) and calculate the predicted state x p (t) by numerically integrating the ideal delay-free closed loop over the delay interval θ ∈ [t − τ, t]: w p (θ) = f (w p (θ)) + g(w p (θ))a(x p (θ)), z p (θ) = q(w p (θ)) + r(z p (θ)), where x p = [w t p z t p ] t and the initial condition for integration is x p (t − τ ) = x(t − τ ). the results in figs. 2 and 3 involve this kind of predictor feedback to compensate the delay τ . the simulations were carried out without considering uncertainties in the delay or other parameters, therefore the predicted state x p (t) matched the instantaneous state x(t) up to the numerical accuracy of integration. this allowed us to guarantee safety even in the presence of a significant delay. v. conclusions in this paper, we viewed compartmental epidemiological models as control systems where human actions (such as quarantining or social distancing) are considered as control input. by the framework of control barrier functions, we synthesized optimal safety-critical active intervention policies that formally guarantee safety against the spread of infection while keeping mitigation efforts minimal. we highlighted that time delays arising during state measurements can significantly affect safety-critical control, and we proposed predictor feedback to compensate the delays while preserving a certain level of input-to-state safety. we demonstrated our results on compartmental models fitted to us covid-19 data, where we synthesized controllers to keep infection, hospitalization and deaths within prescribed limits. these controllers can help guide policy makers to decide when and how much mitigation efforts shall be reduced or increased. 
references:
modelling the covid-19 epidemic and implementation of population-wide interventions in italy
early dynamics of transmission and control of covid-19: a mathematical modelling study
optimal control of an sir model with delay in state and control variables
time-optimal control strategies in sir epidemic models
can the covid-19 epidemic be controlled on the basis of daily test reports
estimating the impact of covid-19 control measures using a bayesian model of physical distancing
quantifying the effect of quarantine control in covid-19 infectious spread using machine learning
a feedback sir (fsir) model highlights advantages and limitations of infection-based social distancing
modeling shield immunity to reduce covid-19 epidemic spread
safety-critical control of active interventions for covid-19 mitigation
control barrier function based quadratic programs with application to adaptive cruise control
control barrier function based quadratic programs for safety critical systems
control barrier functions: theory and applications
on a characterization of flow-invariant sets
barrier certificates for nonlinear model validation
viability theory
set-theoretic methods in control
a time delay dynamical model for outbreak of 2019-ncov and the parameter identification
initial simulation of sars-cov2 spread and intervention effects in the continental us
risk assessment of novel coronavirus covid-19 outbreaks outside china
time delay compensation in unstable plants using delayed state feedback
compensation of infinite-dimensional input dynamics
predictor feedback for delay systems: implementations and approximations
the mathematics of infectious diseases
estimation of the final size of the covid-19 epidemic
effects of latency and age structure on the dynamics and containment of covid-19
a modified seir model to predict the covid-19 outbreak in spain and italy: simulating control scenarios and multi-scale epidemics
estimating and simulating a sird model of covid-19 for many countries, states, and cities
a metapopulation network model for the spreading of sars-cov-2: case study for ireland
safegraph
convex optimization
exponential control barrier functions for enforcing high relative-degree safety-critical constraints, american control conference (acc)
input-to-state safety with control barrier functions

the authors would like to thank franca hoffmann for her insights into compartmental epidemiological models and gábor stépán for discussions regarding non-pharmaceutical interventions in europe. this research is supported in part by the national science foundation, cps award #1932091.

key: cord-163946-a4vtc7rp authors: awasthi, raghav; guliani, keerat kaur; bhatt, arshita; gill, mehrab singh; nagori, aditya; kumaraguru, ponnurangam; sethi, tavpritesh title: vacsim: learning effective strategies for covid-19 vaccine distribution using reinforcement learning date: 2020-09-14 journal: nan doi: nan sha: doc_id: 163946 cord_uid: a4vtc7rp

a covid-19 vaccine is our best bet for mitigating the ongoing onslaught of the pandemic. however, the vaccine is also expected to be a limited resource. an optimal allocation strategy, especially in countries with access inequities and a temporal separation of hot-spots, might be an effective way of halting the disease spread. we approach this problem by proposing a novel pipeline, vacsim, that dovetails an actor-critic using kronecker-factored trust region (acktr) model into a contextual bandits approach for optimizing the distribution of the covid-19 vaccine. whereas the acktr model suggests better actions and rewards, contextual bandits allow online modifications that may need to be implemented on a day-to-day basis in the real-world scenario.
we evaluate this framework against a naive allocation approach of distributing vaccine proportional to the incidence of covid-19 cases in five different states across india and demonstrate up to 100,000 additional lives potentially saved and a five-fold increase in the efficacy of limiting the spread over a period of 30 days through the vacsim approach. we also propose novel evaluation strategies including a standard compartmental model based projections and a causality preserving evaluation of our model. finally, we contribute a new open-ai environment meant for the vaccine distribution scenario, and open-source vacsim for wide testing and applications across the globe. all countries across the globe are eagerly waiting for the launch of an effective vaccine against sars-cov-2. the operation warp speed [1] aims to deliver 300 million doses of a safe, effective vaccine for covid-19 by january 2021, however, the pace of development continues to be punctuated by the safety concerns [2] . as potential candidates start getting ready to enter the market, there will be an urgent need for optimal distribution strategies that would mitigate the pandemic at the fastest rate possible [3] [4] . center for american progress estimated that 462 million doses of covid-19 vaccine along with accompanying syringes will be needed for the us alone to reach herd immunity [5] . here we summarize the key factors that will need to be considered for effective mitigation: • scarcity of supply: despite large scale production efforts, it is expected that the vaccine will still be a scarce resource as compared to the number of people who would need it. in addition to the vaccine itself, there may also be scarcity in the components leading to its delivery, e.g syringes. the white house director of trade and manufacturing policy stated earlier this year that the us would need 850 million syringes to deliver the vaccine en-masse. 
this highlights the next challenge of the optimal distribution of scarce resources related to the vaccine.
(arxiv preprint: arxiv:2009.06602v1 [cs.ai], 14 sep 2020)
• equitable distribution: a truly equitable distribution will not be defined by the population or incidence of new cases alone, although these will be strong factors. other factors ensuring equity of distribution include the quantum of exposure, e.g. of the healthcare workforce that needs to be protected. in this paper, we assume that the exposure is proportional to the number of cases itself, although the proposed methodology allows more nuanced models to be constructed. there may also be unseen factors, such as vaccine hoarding and political influence, which are not accounted for in this work.
• transparent, measurable and effective policy: the design of policy would need to be guided by data before, during and after the administration of the vaccine to a section of the population. since the viral dynamics are rapidly evolving, the policy should allow changes to be transparent and effects to be measurable in order to ensure maximum efficacy of the scarce vaccine resource. on the larger scale of states and nations, this would imply continuous monitoring of incidence rates vis-a-vis a policy action undertaken.
although the aforementioned factors seem straightforward, the resulting dynamics that may emerge during the actual roll-out of the vaccine may be far too complex for human decision-making. the daunting nature of such decision-making can be easily imagined for large and diverse countries such as india and the united states, especially where health is a state subject. artificial intelligence for learning data-driven policies is expected to aid such decision-making, as there would be limited means to identify optimal actions in the real world. a "near real-time" evaluation as per the demographic layout of states, and the consequent initiation of a rapid response to contain the spread of covid-19 [6], will be required.
furthermore, these policies will need to be contextualized to the many variables governing demand or 'need' for the vaccine distribution to be fair and equitable [7] . therefore, ground testing of these scenarios is not an option, and countries will have to face this challenge. in this paper, we introduce vacsim, a novel feed-forward reinforcement learning approach for learning effective policy combined with near real-time optimization of vaccine distribution and demonstrate its potential benefit if applied to five states across india. since real-world experimentation was out of question, the change in projected cases obtained via a standard epidemiological model was used to compare the vacsim policy with a naive approach of incidence-based allocation. finally, our novel model is open-sourced and can be easily deployed by policymakers and researchers, thus can be used in any part of the world, by anyone, to make the process of distribution more transparent. actor-critic methods are temporal difference (td) learning-based methods that have a policy structure(actor) which select actions, and an estimated value function(critic) that critiques the actions made by the actor. a markov decision process is represented as a function of (x, a, γ, p, r), at a given time t. an agent performs an action ∈ a following a certain policy π θ (a|x) to receive a reward r(x, a) and a transition to the consequent state x with a probability p (x |x, a). the objective of this task is to maximize the expected γ-discounted cumulative return (given the policy parameters θ). policy gradient methods optimises a policy π θ (a|x) with respect to its parameters and update θ, to maximise j(θ) [8] , [9] . as defined in [10] , policy gradient is expressed as follows: where ψ t is representative of the advantage function a p i(x, a). an advantage function provides a relativistic idea of how useful a certain action is, given a certain state. 
a good advantage function provides low-variance and low-bias gradient estimates. in our work, we refer to [11], which uses the a3c (asynchronous advantage actor-critic) method proposed by [12] and suggests the following k-step advantage function: a(x_t, a_t) = Σ_{i=0}^{k−1} γ^i r(x_{t+i}, a_{t+i}) + γ^k v^π_φ(x_{t+k}) − v^π_φ(x_t), (2) where v^π_φ(x_t) represents the value network, whose parameters are further trained by performing temporal difference updates. the variable k corresponds to the k-step return used in the advantage function. trpo [13] is an on-policy algorithm that is well suited for environments with both discrete and continuous action spaces. considering that we make use of the latter, this optimization scheme can be employed in our model. updating the policy using trpo involves taking the largest possible step to improve model performance, whilst also satisfying a constraint on the proximity between the old and the new policies. the constraint is expressed as a kl-divergence. if π_θ is a policy (θ being the parameters), then the trpo update is performed as θ_{k+1} = argmax_θ l(θ_k, θ) subject to d_kl(θ||θ_k) ≤ δ, (3) where l(θ_k, θ) is a surrogate advantage function comparing how the current policy performs with respect to the old policy, and d_kl(θ||θ_k) is the average kl-divergence metric. natural gradient descent (ngd) [14] is an efficient realization of second-order optimization and is based on information geometry. in contrast to first-order methods such as stochastic gradient descent, ngd converges faster by approximating the loss landscape using the fisher information matrix (fim) [15] as the curvature matrix corresponding to the loss function. kronecker-factored approximate curvature [16] provides an efficient way to approximate the inverse of the fim, which is otherwise a computational challenge associated with this approach. acktr is a scalable and sample-efficient algorithm that applies the natural gradient method to policy gradient for actor-critic methods. proposed in [11], it is used to calculate and apply the natural gradient update to both actor and critic.
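the k-step advantage estimate used by a3c/acktr can be sketched directly from its definition (function and argument names are ours):

```python
def k_step_advantage(rewards, values, gamma):
    """A(x_t) = sum_{i=0}^{k-1} gamma^i * r_{t+i} + gamma^k * V(x_{t+k}) - V(x_t).
    rewards: [r_t, ..., r_{t+k-1}]; values: (V(x_t), V(x_{t+k}))."""
    v_t, v_tk = values
    k_step_return = sum(gamma ** i * r for i, r in enumerate(rewards))
    return k_step_return + gamma ** len(rewards) * v_tk - v_t
```

for example, with k = 2, rewards [1, 1], γ = 0.5, V(x_t) = 0 and V(x_{t+2}) = 4, the estimate is 1 + 0.5 + 0.25·4 = 2.5, i.e., the discounted return relative to the critic's baseline.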
it uses a critic to estimate the advantage function. training the model amounts to solving the least-squares problem of minimizing the mean squared error (mse) between the model predictions and the target values, i.e., (r(x) = estimated(x) − target(x)). the contextual bandits algorithm is an extension of the multi-armed bandits approach [17] which contextualizes the choice of the bandit to its current environment. this serves to circumvent the problem where a multi-armed bandit may simply end up playing the same action multiple times even though the environment (context) may have changed, thus getting stuck at a sub-optimal condition. contextual bandits play an action based on its current context, given a corresponding reward, hence are more relevant to real-world environments such as the vaccine distribution problem attacked in this work. given, for time t = 1...n, a set of contexts c and a set of possible actions x and reward/payoffs p are defined. at a particular instant, based on a context c t ∈ c, an action x t ∈ x is chosen and a reward or a payoff p t = f (c t , x t ) is obtained. regret [18] is a conceptual function to understand and optimize the performance of the bandits problem. since we don't know if an action played was the most "reward-fetching", rewards against all actions that can be played are sampled, and the difference between the action chosen and the action against the maximum reward is defined as 'regret'. therefore, minimizing regret achieves the goal of maximizing reward. for an optimal action x * ∈ x such that the expectation of reward against this action is maximum, p * = max xt∈x (e(p | x t )), the regret and cumulative regret can be expressed as z = [p * − e(p | x t )] and z * = n t=1 z respectively. in order to derive the exact posterior inference in linear models, a bayesian linear regression [19] may be performed, and several computationally-efficient versions are available [20] . 
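the regret bookkeeping described above (z = p* − e(p | x_t), accumulated over rounds) can be sketched as follows; the representation of per-round expected payoffs as dicts is our convenience, not the paper's:

```python
def cumulative_regret(expected_rewards, chosen):
    """Running cumulative regret for a contextual bandit.
    expected_rewards: per round, a dict mapping action -> E[payoff | action];
    chosen: the action actually played in each round."""
    total, history = 0.0, []
    for rewards_t, a_t in zip(expected_rewards, chosen):
        total += max(rewards_t.values()) - rewards_t[a_t]  # z_t = p* - E[p|x_t]
        history.append(total)
    return history
```

a policy that always plays the optimal action accrues zero regret, so minimizing this quantity is equivalent to maximizing the expected reward.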
linear posteriors assume that the data were generated as per p = c^⊤β + ε, (5) where ε ∼ n(0, σ²), p represents the reward or payoff, and c is the context. the joint distribution of β and σ² for each action is modeled. sequentially estimating the noise level σ² for each action allows the algorithm to adaptively improve its understanding of the volume of the hyper-ellipsoid of plausible β's, which generally leads to a more aggressive initial exploration phase (for both β and σ²). the posterior at time t for action x_t, after observing c and p, is given by (6), where we assume σ² ∼ ig(a_t, b_t) and β | σ² ∼ n(µ_t, σ²σ_t), an inverse-gamma and a gaussian distribution, respectively; their parameters are given by the standard conjugate updates. we set the prior hyperparameters to µ_0 = 0 and λ_0 = λ id, while a_0 = b_0 = η > 1. it follows that initially σ² ∼ ig(η, η). note that we independently model and regress each action's parameters, β_i, σ²_i for i = 1, . . . , k [18]. for practical purposes, a_0 and b_0 are initialized as integer values.
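the display equations for the posterior parameters were lost in extraction, so the sketch below uses the standard normal-inverse-gamma conjugate update consistent with the stated priors (µ_0 = 0, λ_0 = λ·id, a_0 = b_0 = η); it is a reconstruction, and the function name is ours:

```python
import numpy as np

def nig_posterior(C, p, lam=0.25, eta=6.0):
    """Normal-inverse-gamma posterior for p = C @ beta + eps, eps ~ N(0, s2),
    with priors mu0 = 0, Lambda0 = lam*I, a0 = b0 = eta.
    Returns (mu_t, Sigma_t, a_t, b_t); Thompson sampling would then draw
    s2 ~ IG(a_t, b_t) and beta ~ N(mu_t, s2 * Sigma_t)."""
    n, d = C.shape
    precision = C.T @ C + lam * np.eye(d)      # Lambda_t
    Sigma_t = np.linalg.inv(precision)
    mu_t = Sigma_t @ (C.T @ p)                 # mu0 = 0 drops out
    a_t = eta + n / 2.0
    b_t = eta + 0.5 * (p @ p - mu_t @ precision @ mu_t)
    return mu_t, Sigma_t, a_t, b_t
```

on (near-)noiseless data the posterior mean µ_t recovers the generating β up to the small shrinkage introduced by the prior λ.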
this is analogous to zero-shot learning problems thus precluding a simple supervised learning based approach. • absence of evaluation with certainty: lack of ground testing naturally implies nil on-ground evaluation. in that case, it often becomes challenging to employ evaluation metrics that offer a significant amount of confidence in results. in order to solve this problem, we rely upon the simulated decrease in number of susceptible people as vaccine is administered. • model scaling: we ensured that the learning process of the models simulates the relationship between different objects in the real world environment accurately, and at the same time can be scaled down in response to computational efficiency and resource utilization challenges. this is done by choosing the right set of assumptions that reflect the real world. while reinforcement learning approaches replicate human decision-making processes, the absence of evaluation makes them less trustworthy, especially when real lives may be at stake. therefore, we pipelined the acktr model with a supervised learning based contextual bandits approach where recommendations for vaccine distribution policy were used as training data for the latter. we extracted the state-wise time series data of covid-19 characteristics from the website https://mohfw.gov.in/ and use them in the experiments as described below. the five states chosen for this study, i.e., assam, delhi, jharkhand, maharashtra and nagaland are representative of possible scenarios for the vaccine distribution problem, i.e., high incidence(maharashtra and delhi), moderate incidence (assam) and low incidence (jharkhand and nagaland) as on august 15, 2020. this particular date was chosen following the announcement that india may launch its indigenous vaccine on this date. 
in choosing the five different states, we hope to generalize our predictions to other states across the spectrum while minimizing the bias introduced into the learning by a widely variant covid-19 incidence across the country. we also enhanced our modeling context with: population share: the percentage ratio of the population of the state to the population of all the five states. predicted death rate: the percentage ratio of the predicted deaths in the state to the total predicted cases in that state calculated using projections obtained from a fitted standard compartmental model, i.e. a susceptible, exposed, infected and recovered(seir) model. predicted recovery rate: the percentage ratio of the predicted recoveries in the state to the total predicted cases in that state using projections obtained from the seir model. the implementation of acktr and contextual bandit sub-models of vacsim are detailed henceforth: acktr model: open-ai [21] stable-baselines framework was used to construct a novel and relevant environment suited to our problem statement for acktr to learn in this environment. a. input: with both the observation space and action space being declared as box-type, a context vector describing the situation of the state in terms of total predicted cases, predicted death rate, predicted recovery rate, susceptible cases and population, at a given time, was fed as input. predictions were obtained from seir projections [22] . the action space was a one-dimensional vector with its size equal to the number of recipients of the vaccine in one round of distribution. notably, we accounted for the cumulative time it would take for vaccine distribution, administration and generation of the immune response by the body i.e. the total time between dispatch of vaccine from the distribution centre to achievement of immunity in the vaccinated individual. 
this introduces a lag of 15 days between the distribution date of vaccines at day 't', to the gathering of context at day 't+15' by vacsim using seir projections. b. model working: following are the assumptions used while building the environment: vaccine efficacy 100% 4 number of recipients per day 5 table 1 : hyper-parameters used during policy learning • the nature of the vaccine is purely preventative and not curative, i.e., it introduces immunity when administered to a susceptible person against future covid infection, but plays no role in fighting against existing infections (if any). • the vaccine has 100% efficacy i.e. all people who are vaccinated will necessarily develop immunity against future sars-cov-2 infection. this assumption is easily modifiable and is expected to have little consequence in deciding the optimal allocation, unless data reveal differential immunity response of certain populations within india. however, we leave scope for it to be modified as per the situation by taking this as a hyperparameter for vacsim. • each person requires only 1 dose (vial) of the vaccine to be vaccinated completely and successfully. this too may be treated as a hyperparameter. reward function: the reward function was designed to maximize the decrease in the susceptible population count with the minimum amount of vaccine, considering it to be a limited resource. where r i was the reward given to vaccine recipient i s i was the susceptible population count of the same state 15 days from the day of distribution. q i was the amount of vaccine given to the state by the model. flow: vacsim gives us a recommendation for the distribution ratio between various recipients (indian states) of the vaccine as its output. following this distribution set, it awards the corresponding number of vials to each state and converts a corresponding number of individuals in that the state from susceptible to recovered, thus short-circuiting the seir model. 
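the display equation for the reward was lost in extraction; a plausible reading consistent with the surrounding text ("maximize the decrease in the susceptible population count with the minimum amount of vaccine") is r_i = (s_i(t) − s_i(t+15)) / q_i. the sketch below encodes this hypothetical form; it is our reconstruction, not necessarily the paper's exact expression:

```python
def vaccine_reward(s_before, s_after_15d, q):
    """HYPOTHETICAL reward for recipient state i: susceptible decrease,
    observed 15 days after distribution, per vial allocated (q > 0).
    Reconstruction of the lost display equation, not the paper's code."""
    if q <= 0:
        return 0.0
    return (s_before - s_after_15d) / q
```

this rewards allocations that convert many susceptibles per dose, which matches the stated intent of treating the vaccine as a scarce resource.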
the explore-exploit tradeoff: we tap into the explore-vs-exploit dilemma, allowing the model to reassess its approach with ample opportunities and accordingly, redirect the learning process. we set the exploration rate at 40%. however, this too is flexible and can be treated as a hyperparameter. hyperparameters: a complete list of the hyperparameters is given in table 1 . c. output: the output of the first sub-model was a distribution set dictating the share of each recipient in the batch of vaccines dispatched at the time of delivery. the output of the first sub-model spanned over 15 days (1 september-15 september 2020) with each state having a total of 50 episodes being run for each day. for every episode, the distribution sets so obtained (one per episode) were normalised to get the percentage distribution ratio for all states. normalised here refers to the percentage ratio of a given distribution set of a state and the sum of the distribution sets of the 5 states over an entire episode for a given date. since the time period is fifteen days, this amounts to a total of 3750 episodes. a. training: these episodes, which comprised of the normalised distribution sets along with the corresponding set of rewards, were then fed to the second sub-model, i.e., contextual bandits as training data set. the action space in this model was assumed to lie in the range [0,100] (both inclusive) to represent the percentage of vaccine that went to each state during the distribution process. the features in the context were the same as in sub-model 1, except population, which was now replaced by population share (fraction of population of a state with respect to the total recipient population). this was done merely for scaling purposes. c. 
testing: using the context, the normalised actions and the corresponding set of rewards as the training dataset, we tested the model day-wise for a period of fifteen days (16 september-30 september), with each day having fifty possible actions for each state as output, similar to what was done in the acktr model. the unadjusted actions (which were not normalised) obtained after testing the model were first adjusted day-wise for each state by taking the mean of all fifty possible actions for the state on that particular day. these were then normalised to obtain the percentage vaccine distribution for the five states for the period under consideration (15 days). results indicate that out of the five states, maharashtra, which had the highest number of cases, saw a gradually decreasing trend, whereas nagaland, which had the lowest number of cases, saw a gradually increasing trend with respect to the distribution of the vaccine (figure 3, right). since there is no way that the evaluation of a distribution policy can be done in the absence of a vaccine and real-world distribution, we defined the naive baseline distribution policy as: % of vaccine given to a state = % of infected people in that state, and compared it with our model's learned distribution. with 100000 doses and 5 states, we simulated the distribution of the available vaccine on 16 september for the naive and vacsim policies. the number of resulting (projected) infections for 30 days after the vaccine distribution was calculated using the seir model. day-wise total cases of all 5 states for both policy models were summarized. our results indicate that the vacsim-based policy would additionally reduce a total of 102385 infected cases, with 95% ci [10234, 102427], in the next 30 days.
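the naive baseline policy — a state's vaccine share equals its share of infected cases — can be written directly. the state names and case counts below are illustrative only, not figures from the paper:

```python
def naive_policy(infected_by_state, total_doses):
    """Baseline: a state's vaccine share equals its share of total infected cases."""
    total_infected = sum(infected_by_state.values())
    return {state: total_doses * cases / total_infected
            for state, cases in infected_by_state.items()}
```

with 100000 doses, a state holding 80% of the cases receives 80% of the vials — the behaviour that, as noted below, leaves a low-incidence state like nagaland with a negligible allocation.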
as seen from figure 3 (right) and figure 4 (left), in the case where distribution was done using the naive approach, the vaccine distribution for all fifteen days was nearly the same, with maharashtra receiving the highest amount of vaccine and nagaland receiving the least, as expected from their infection rates. on the other hand, in the case of the results obtained through vacsim, each state would get a variable and sufficient amount of vaccine during the fifteen-day period, including nagaland, which received a negligible amount under the naive approach. unlike the naive approach, vacsim, therefore, was not biased towards any state, thus ensuring equitable distribution while mitigating the epidemic. the ultimate goal of vaccine distribution is to reduce mortality and morbidity. since our model relies entirely upon simulations, in the absence of a vaccine, we checked if the data generated by such an approach follow the cause-and-effect relationships expected in real-world data. structure learning was carried out using a data-driven bayesian network [23] approach with the hill climbing algorithm [24], the akaike information criterion as the scoring function, and ensemble averaging over 101 bootstrapped networks. these models were learned using the wiser package [25]. state-wise time series data of deaths, recoveries, infected people, susceptible people and the amount of vaccine obtained from our model were used to learn the structure. blacklisting among nodes was done such that vaccine percentage cannot be the child node of the covid-19 trajectory indicators (susceptible, recovered, infected, dead). the resulting structure shows a causal relationship between the vaccine amount (parent node) and the susceptible count (child node), thus confirming the technical correctness of the vacsim model through an external evaluation approach (see figure 5).
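the greedy hill-climbing structure search with edge blacklisting can be sketched generically as below. this is not the wiser/aic implementation used in the paper — the scoring function is pluggable, and the toy usage only demonstrates that blacklisted (parent, child) pairs such as (infected, vaccine) are never added:

```python
from itertools import permutations

def hill_climb(nodes, score, blacklist=frozenset()):
    """Greedy structure search: repeatedly add the single directed edge that most
    improves score(edges), skipping blacklisted (parent, child) pairs and cycles."""
    edges = set()

    def creates_cycle(edge):
        # DFS from the edge's child; reaching its parent would close a cycle
        parent, child = edge
        stack, seen = [child], set()
        while stack:
            node = stack.pop()
            if node == parent:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(c for p, c in edges if p == node)
        return False

    current = score(edges)
    while True:
        candidates = [e for e in permutations(nodes, 2)
                      if e not in edges and e not in blacklist and not creates_cycle(e)]
        if not candidates:
            break
        best, edge = max((score(edges | {e}), e) for e in candidates)
        if best <= current:
            break
        edges.add(edge)
        current = best
    return edges
```

with a toy score that rewards the edges {(vaccine, susceptible), (infected, vaccine)} and a blacklist forbidding vaccine as a child node, the search adds only (vaccine, susceptible) — mirroring the constraint described above.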
researchers worldwide are working around the clock to find a vaccine against sars-cov-2, the virus responsible for the covid-19 pandemic. once available, distribution of the vaccine will face numerous logistical challenges (supply chain, legal agreements, quality control and application, to name a few) which might slow down the distribution process. in this paper, we have developed a novel distribution policy model, vacsim, using reinforcement learning. we have pipelined an actor-critic using kronecker-factored trust region (acktr) model and a bayesian regression-based contextual bandit model in a feed-forward way, such that the outputs (actions and rewards) of the acktr model are fed into the contextual bandit model in order to provide a sensible context comprising actions and rewards. the contextual bandits then optimized the policy considering demographic metrics, such as the population share of a state with respect to the chosen 5 states, and time series-based characteristics of the covid-19 spread (susceptible population, recovery rate, death rate, total infected cases) as context. while distributing the vaccine, identifying the part of the population who needs it the most is a challenging task, and in our case, we addressed it through the usage of the aforementioned context. rather than using the present-day count of infected and susceptible people, we have used seir-based projections, which makes our predicted policy more robust and trustworthy. evaluation of a model-driven policy is a tough assignment due to the unavailability of ground truth, and we proposed a novel causality-preserving approach to evaluate such models. the open-source code will enable testing of our claims by other researchers. vacsim may have some limitations shared by all rl models, i.e., the transparency of their learning process and the explainability of their decisions. secondly, the development of vacsim has been carried out while observing the pandemic over the past few months.
however, the dynamic nature of the pandemic may require changes in actions, thus calling for common sense working alongside artificial intelligence. in conclusion, we believe that artificial intelligence has a role to play in the optimal distribution of scarce resources such as vaccines, syringes, drugs, personal protective equipment (ppes), etc. that the world will see in the coming months. we provide a novel, open-source, and extensible solution to this problem that policymakers and researchers may refer to while making decisions.
references:
astrazeneca covid-19 vaccine study is put on hold
a proposed lottery system to allocate scarce covid-19 medications: promoting fairness and generating knowledge
if a coronavirus vaccine arrives, can the world make enough?
a comprehensive covid-19 vaccine plan
a survey of agent-based intelligent decision support systems to support clinical
covid-19 vaccine: development, access and distribution in the indian context
policy gradient methods for reinforcement learning with function approximation
simple statistical gradient-following algorithms for connectionist reinforcement learning
high-dimensional continuous control using generalized advantage estimation
scalable trust-region method for deep reinforcement learning using kronecker-factored approximation
asynchronous methods for deep reinforcement learning
trust region policy optimization
new insights and perspectives on the natural gradient method
optimizing neural networks with kronecker-factored approximate curvature
deep contextual multi-armed bandits
deep bayesian bandits showdown: an empirical comparison of bayesian deep networks for thompson sampling
thompson sampling for contextual bandits with linear payoffs
pattern recognition and machine learning, ser. information science and statistics
gym: a toolkit for developing and comparing reinforcement learning algorithms
global stability for the seir model in epidemiology
a tutorial on learning with bayesian networks
learning bayesian networks by hill climbing: efficient methods based on progressive restriction of the neighborhood
wiser: a shiny application for end-to-end bayesian decision network analysis and web-deployment
key: cord-123800-pxhott2p authors: pandey, gaurav; chaudhary, poonam; gupta, rajan; pal, saibal title: seir and regression model based covid-19 outbreak predictions in india date: 2020-04-01 journal: nan doi: nan sha: doc_id: 123800 cord_uid: pxhott2p
the covid-19 pandemic has become a major threat to the country. to date, a well-tested medication or antidote is not available to cure this disease. according to who reports, covid-19 is a severe acute respiratory syndrome which is transmitted through respiratory droplets and contact routes. analysis of this disease requires major attention by the government to take the necessary steps in reducing the effect of this global pandemic. in this study, the outbreak of this disease has been analysed for india till 30th march 2020, and predictions have been made for the number of cases for the next 2 weeks. the seir model and a regression model have been used for predictions based on the data collected from the johns hopkins university repository over the time period 30th january 2020 to 30th march 2020. the performance of the models was evaluated using rmsle, achieving 1.52 for the seir model and 1.75 for the regression model. the rmsle error rate between the seir model and the regression model was found to be 2.01. also, the value of r0, which characterizes the spread of the disease, was calculated to be 2.02. expected cases may rise to 5000-6000 in the next two weeks. this study will help the government and doctors in preparing their plans for the next two weeks.
based on the predictions for the short-term interval, these models can be tuned for forecasting over long-term intervals. the covid-19 (sars-cov-2) pandemic is a major global health threat. the novel covid-19 has been reported as the most detrimental respiratory virus since the 1918 h1n1 influenza pandemic. according to the world health organization (who) covid-19 situation report [1], as of march 27, 2020, a total of 509,164 confirmed cases and 23,335 deaths had been reported across the world. global spread has been rapid, with 170 countries now having reported at least one case. coronavirus disease 2019 (covid-19) is an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (sars-cov-2). coronaviruses belong to a family of viruses responsible for illnesses ranging from the common cold to deadly diseases such as severe acute respiratory syndrome (sars) and middle east respiratory syndrome (mers), which were first discovered in china [2002] and saudi arabia [2012], respectively. the 2019 novel coronavirus, better known as covid-19, was reported in wuhan, china for the very first time on 31st december 2019. according to jiang et al. [3], the fatality rate for this virus has been estimated to be 4.5%, but for the age group 70-79 this goes up to 8.0%, while for those >80 it has been noted to be 14.8%. this has led to elderly persons above the age of 50 with underlying diseases like diabetes, parkinson's disease and cardiovascular disease being considered at the highest risk. symptoms of this disease can take 2-14 days to appear and can range from fever, cough and shortness of breath to pneumonia, kidney failure and even death [1]. transmission is person to person via respiratory droplets among close contacts, with the average number of people infected by a patient being 1.5-3.5, but the virus is not considered airborne [2]. there exists a large body of evidence that machine learning algorithms can give efficient predictions in healthcare [4] [5] [6].
nsoesie et al. [7] have provided a systematic review of approaches used to forecast the dynamics of influenza pandemics. they reviewed research papers based on deterministic mass action models, regression models, prediction rules, bayesian networks, the seir model, the arima forecasting model, etc. recent studies on covid-19 include only exploratory analysis of the available limited data [8] [9] [10]. an effective and well-tested vaccine against covid-19 has not been invented, and hence a key part of managing this pandemic is to decrease the epidemic peak, also known as flattening the epidemic curve. the role of data scientists and data mining researchers is to integrate the related data and technology to better understand the virus and its characteristics, which can help in taking the right decisions and concrete plans of action. it will also help in taking aggressive measures to develop infrastructure, facilities and vaccines, and in restraining similar epidemics in the future. the objectives of the current study are as follows. 1. finding the rate of spread of the disease in india. 2. developing a mathematical seir (susceptible, exposed, infectious, recovered) model to evaluate the spread of the disease. 3. prediction of the covid-19 outbreak using seir and regression models. after presenting background in section i, section ii presents the methodology and the models used in this study. section iii covers analysis, experimental results and performance evaluation. discussion is provided in section iv, followed by the conclusion in section v. time series data provided by johns hopkins university, usa has been used for the empirical result analysis [12]. the time period of the data is from 30/01/2020 to 30/03/2020. the data includes confirmed cases, death cases and recovered cases of all countries. however, this paper focuses only on india's data for analysis and prediction of covid-19 confirmed patients.
the fact that india covers approximately 17.7% of the world's population and that, to date, the number of covid-19 cases per million is less than 1 is the motivation behind this research. for analysis and prediction of the number of covid-19 patients in india, the following models have been used. mathematical models can be designed to simulate the effect of disease at many levels. these models can be used to evaluate disease from the within-host level, i.e. the interactions among the cells of the host, up to the metapopulation level, i.e. how the disease spreads in geographically separated populations. the most important part of this model is to calculate the value of r0. the value of r0 indicates the contagiousness of the disease, and estimating it is a fundamental goal of epidemiologists studying a new outbreak. in simple terms, r0 determines on average how many people can be infected by a single infected person over the course of their infection. if the value of r0 < 1, the spread is expected to stop. if the value of r0 = 1, the spread is stable or endemic. if the value of r0 > 1, the spread is increasing in the absence of intervention, as shown in figure 1. equation (1) calculates the percentage of the population that needs to be vaccinated to stabilize the spread of the disease. the r0 value of covid-19 for india calculated from earlier data has been reported in the range 1.5-4 [11]. assuming a target reproduction number of 0.5, the lower limit of the population requiring the vaccine is 919.7 million and the upper limit would be 1.2071 billion. this remains a very vague figure, so calculating the value of r0 remains an important task. the seir model has four main components, viz. susceptible (s), exposed (e), infected (i) and recovered (r), as shown in figure 2.
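the quoted lower and upper limits are consistent with the standard critical-vaccination-fraction formula p = 1 − r_target/r0 applied to a population of roughly 1.38 billion; since equation (1) itself does not appear in the extracted text, the formula and the exact population figure below are our reconstruction:

```python
def vaccination_fraction(r0, r_target):
    """Fraction of the population to vaccinate so that the effective
    reproduction number drops from r0 to r_target (assumed form of equation 1)."""
    return 1.0 - r_target / r0

population = 1.3796e9  # assumed total population, chosen to match the quoted limits

lower = vaccination_fraction(1.5, 0.5) * population  # r0 at the low end of the 1.5-4 range
upper = vaccination_fraction(4.0, 0.5) * population  # r0 at the high end
```

this reproduces both quoted figures: 2/3 of the population (about 919.7 million) for r0 = 1.5, and 7/8 (about 1.2071 billion) for r0 = 4.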
s is the fraction of susceptible individuals (those able to contract the disease), e is the fraction of exposed individuals (those who have been infected but are not yet infectious), i is the fraction of infective individuals (those capable of transmitting the disease) and r is the fraction of recovered individuals (those who have become immune). in figure 2, β is the infectious rate, which represents the probability of the disease being transmitted to a susceptible person by an infectious person; σ is the incubation rate, the rate at which an exposed person becomes infectious; γ is the recovery rate, which is determined by 1/d (where d is the duration of infection); and ξ is the rate at which recovered people become susceptible again due to low immunity or other health-related issues. the ordinary differential equations (odes) for these four compartments are shown in equations 2 to 5: ds/dt = ξr − βsi/n, de/dt = βsi/n − σe, di/dt = σe − γi, dr/dt = γi − ξr. here, n = s + e + i + r is the total population. now, we can calculate the value of r0 using the formula in equation 6. the values of β, σ, γ and ξ can be calculated using the ordinary differential equations for the 4 components of the seir model. to describe the spread of covid-19 using the seir model, a few considerations and assumptions were made due to the limited availability of data. they are as follows. 1. the numbers of births and deaths remain the same. 2. 1/σ is the latent period of the disease and 1/γ is the infectious period. 3. a recovered person is not sick again during the calculation period. now, considering 70% of india's population, approximately 966 million, to be in the susceptible class (s), and assuming only 1 person got infected initially, with an average incubation period of 5.2 days, an average infectious period of 2.9 days and r0 equal to 4, the seir model without intervention is shown in figure 3 under the assumptions mentioned above. in figure 3, we can observe that the susceptible population decreases by 80% in the first 100 days under the listed assumptions. the algorithm for the seir model is shown as follows.
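the algorithm referenced above is not reproduced in the extracted text; a minimal forward-euler sketch of the seir system under the stated parameters (incubation 5.2 days, infectious period 2.9 days, r0 = 4, s(0) ≈ 966 million, with the waning-immunity term dropped in line with assumption 3) would look like:

```python
def seir(days, n, i0=1.0, r0=4.0, incubation=5.2, infectious=2.9, dt=0.1):
    """Forward-Euler integration of the SEIR ODEs (no births/deaths, no waning immunity)."""
    sigma = 1.0 / incubation   # rate of leaving the exposed class (1 / latent period)
    gamma = 1.0 / infectious   # recovery rate (1 / infectious period)
    beta = r0 * gamma          # transmission rate implied by r0 = beta / gamma
    s, e, i, r = n - i0, 0.0, i0, 0.0
    for _ in range(int(days / dt)):
        new_exposed = beta * s * i / n * dt
        new_infectious = sigma * e * dt
        new_recovered = gamma * i * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
    return s, e, i, r
```

running it with n = 966e6 for 100 days reproduces the qualitative behaviour described for figure 3: the susceptible pool is depleted by well over 80%, while the population total is conserved.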
regression models are statistical processes used to estimate or predict a target (dependent) variable on the basis of independent variables. regression has many variants, like linear regression, ridge regression, stepwise regression, polynomial regression, etc. this study has used linear regression and polynomial regression for the prediction of covid-19 cases. linear regression is a simple model which finds the relation between a dependent and an independent variable. it uses the values of the intercept and slope to predict the output variable. equation 7 shows the relationship between the dependent and independent variables in a linear regression model: y = θ0 + θ1 x + ε. in equation 7, θ0 and θ1 are the two parameters representing the intercept and slope respectively, and ε is the error term. this produces a straight line and is mostly used for predictive analysis. to make the linear regression algorithm more accurate, we minimize the sum of squared residuals between the predicted and actual values. polynomial regression is a special type of regression which models a curvilinear relationship between the dependent and independent values. equation 8 shows this relationship: y = θ0 + θ1 x + θ2 x^2 + ... + θn x^n. in equation 8, x is the independent variable, θ0 is the bias (intercept), θ1, θ2, ..., θn are the weights or partial coefficients assigned to the predictors, and n is the degree of the polynomial. the polynomial regression used in this study transforms the data into polynomial features and applies linear regression to fit the parameters. a polynomial regression of degree 1 is a linear regression. choosing the degree is a challenging task: if the degree is too low, the model will not fit the data properly, and if it is higher than necessary, the model will overfit the training data.
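the linear fit of equation 7 has a closed-form ordinary-least-squares solution; a minimal sketch is below (polynomial regression as in equation 8 amounts to applying the same fit after expanding x into its powers):

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = theta0 + theta1 * x (the form of equation 7)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    theta1 = cov / var                  # slope
    theta0 = mean_y - theta1 * mean_x   # intercept
    return theta0, theta1
```

on perfectly linear data such as y = 1 + 2x, the fit recovers the intercept 1 and slope 2 exactly.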
in india, the first case of covid-19 was reported on 30th january 2020. during the month of february, the number of reported cases was 3 and remained constant for the entire month. the major rise in the spread of the disease started in march 2020. figures 4 and 5 show the change in confirmed cases and death cases from 22nd jan 2020 to 30th mar 2020. data from march shows a significant change in the spread of the disease. in the current analysis, we have used data till 25th march 2020 as our training data and data from 25th march 2020 to 30th march 2020 as the test/evaluation data. before applying the prediction models, we analysed the time series training data to check whether the models could fit the data or not. figure 6 shows the confirmed cases on a log10 scale versus the last 15 days of training data. the reason for using the last 15 days of training data is that they show the major growth in confirmed cases, as shown in figure 4. we then trained the seir model on the data; the resulting line of fit is shown in figure 7. while applying seir, we took into account interventions such as the quarantine and lockdown announced by the government of india during this period, so we applied a decay function to lower the number of confirmed cases during prediction. we used a hill decay in our model, which is a half-decay function; its formula, given in equation 9, is f(t) = 1/(1 + (t/l)^k), where l describes the rate of decay, t is time and k is a shape parameter (no dimension). half-decay functions never reach zero and have half their original efficacy at time l. since we know covid-19 is contagious and the rate of transmission of the disease from an infected person to a susceptible person is 2.02, we need to predict the rate at which the disease can grow. table 1 shows the prediction results using the two models. in both models, we have used 25 days of training data, as there was no significant trend in india before march 2020.
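a hill-type decay with the stated properties — never reaching zero, half of its original value at time l — has the standard form below; since the exact parameterisation in equation 9 is not reproduced in the text, this is an assumed form:

```python
def hill_decay(t, L, k):
    """Hill-type decay: equals 1 at t = 0, exactly 0.5 at t = L,
    and approaches (but never reaches) 0 as t grows."""
    return 1.0 / (1.0 + (t / L) ** k)
```

for example, `hill_decay(10, 10, 2)` is exactly 0.5: at t = l the intervention has halved the original transmission, and larger k makes the transition around t = l sharper.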
to check the performance of the models used in this study, we used the root mean squared log error (rmsle); the rmsle for the seir model was 1.52, and 1.75 for the regression model. the rmsle error rate between the seir model and the regression model was found to be 2.01. the current trend shows that a linear trend will continue over the next few days, as the control mechanisms adopted by the government of india are fairly strict and working well for the time being. also, with linear trends, the patients getting recovered can be managed easily and the death rate can be controlled as well. the findings of the current study may explode exponentially, as shown by gupta and pal [13], if stringent control measures are not taken by the government. hospital provisions and medical facility enhancement work should be continued at a very rapid pace to prepare the country for exponential growth, if it occurs. however, with current interventions and preparations, the government of india is looking forward to flattening the curve. during the prediction of the confirmed cases shown in table 1, there were a few challenges associated with the data. the data was not stationary and showed exponential growth 40 days after 22nd jan 2020, as shown in figure 4. overfitting remains a major problem with disease spread time series data. in this model, we have addressed the overfitting problem using a decay-based intervention. another problem faced in this study was the shortage of training data. data for 25 days was used for training purposes and 5 days of data for validation, based on which the number of confirmed cases for the next 14 days was predicted. the training data is too little for any machine learning model to train on. also, rapid changes in the number of infected cases occurred in mid-march. the seir model shows an advantage as it does not grow exponentially with time and also incorporates intervention methods over time. for intervention, a hill decay model was used.
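the rmsle values quoted above can be computed with the standard definition of the metric (the +1 offset keeps zero counts well-defined):

```python
import math

def rmsle(predicted, actual):
    """Root mean squared logarithmic error between predicted and actual case counts."""
    errors = [(math.log(p + 1) - math.log(a + 1)) ** 2
              for p, a in zip(predicted, actual)]
    return math.sqrt(sum(errors) / len(errors))
```

because the error is taken on a log scale, being off by a constant factor costs the same regardless of the absolute case count, which suits exponential-growth data like an epidemic curve.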
in the case of a regression model, different features can be used for decay or intervention, e.g. the number of recovered cases, but the growth of the regression line still remains a problem. for a regression model, we always need to retrain the model after some time as the trend in the data changes. the seir model also uses some assumptions, such as the number of people in the susceptible class, for which we assigned 70% of the population. in this study, we have only predicted the number of confirmed cases. in predicting the number of death cases, we faced many problems of data stationarity; with the limited data, the model was not able to predict the number of death cases properly. we have used only time series data for confirmed cases and death cases in this study. using other data related to weather, the geographic layout of the country, state-level population and governance parameters, the model prediction rate can be further improved. in this study, two models, seir and regression, were used to analyse and predict the change in the spread of the covid-19 disease. we analysed the data and found that the number of cases per million in india was less than 0.5 till 30th march 2020. then, with the help of the seir model, the value of r0 was computed to be 2.02. we also predicted the number of confirmed cases of covid-19 for the next 14 days, from 31st march 2020 to 13th april 2020. during performance evaluation, the rmsle was computed to be 1.52 for the seir model and 1.75 for the regression model. the results obtained from this study are based on training data up to 30th march 2020. further, looking at the trend, there is definitely going to be an increase in the number of cases. doctors, health workers and people involved in providing essential services have to be protected in accordance with prescribed medical norms.
community spreading in the future due to the carelessness of individuals as well as groups can exponentially increase the number of cases. the peak is yet to come, hence the government has to be extra vigilant and enforce strict measures. in addition, the provision of medical facilities across the country has to be aggressively enhanced. in the future, an automated algorithm can be developed to fetch data at regular intervals and automatically predict the number of cases for weekly and biweekly data. in this way, government and hospital facilities can also maintain a check on the supply and medical assistance / isolation required for new patients.
references:
world health organization
characteristics of and important lessons from the coronavirus disease 2019 (covid-19) outbreak in china: summary of a report of 72 314 cases from the chinese center for disease control and prevention
review of the clinical characteristics of coronavirus disease 2019 (covid-19)
predicting hepatitis b virus-positive metastatic hepatocellular carcinomas using gene expression profiling and supervised machine learning
controlling testing volume for respiratory viruses using machine learning and text mining
volatile fingerprinting of human respiratory viruses from cell culture
a systematic review of studies on forecasting the dynamics of influenza outbreaks
investigating a serious challenge in the sustainable development process: analysis of confirmed cases of covid-19 (new type of coronavirus) through a binary classification using artificial intelligence and regression analysis
a serological survey of canine respiratory coronavirus in new zealand
risk factors associated with acute respiratory distress syndrome and death in patients with coronavirus disease
prudent public health intervention strategies to control the coronavirus disease 2019 transmission in india: a mathematical model-based approach
an interactive web-based dashboard to track covid-19 in real time
trend analysis and forecasting of covid-19 outbreak in india
key: cord-208252-e0vlaoii authors: calvetti, daniela; hoover, alexander; rose, johnie; somersalo, erkki title: bayesian dynamical estimation of the parameters of an se(a)ir covid-19 spread model date: 2020-05-09 journal: nan doi: nan sha: doc_id: 208252 cord_uid: e0vlaoii
in this article, we consider a dynamic epidemiology model for the spread of the covid-19 infection. starting from the classical seir model, the model is modified so as to better describe characteristic features of the underlying pathogen and its infectious modes. in line with the large number of secondary infections not related to contact with documented infectious individuals, the model includes a cohort of asymptomatic or oligosymptomatic infectious individuals not accounted for in the data of new daily counts of infections. a bayesian particle filtering algorithm is used to dynamically update the relevant cohorts and simultaneously estimate the transmission rate as new data on the number of new infections and disease-related deaths become available. the underlying assumption of the model is that the infectivity rate changes dynamically during the epidemic, either because of a mutation of the pathogen or in response to mitigation and containment measures. the sequential bayesian framework naturally provides a quantification of the uncertainty in the estimates of the model parameters, including the reproduction number, and of the sizes of the different cohorts. moreover, we introduce a dimensionless quantity, the equilibrium ratio between the asymptomatic and symptomatic cohort sizes, and propose a simple formula to estimate this quantity. this ratio leads naturally to another dimensionless quantity that plays the role of the basic reproduction number $r_0$ of the model.
when we apply the model and particle filter algorithm to covid-19 infection data from several counties in northeastern ohio and southeastern michigan, we find that the proposed reproduction number $r_0$ has a consistent dynamic behavior within both states, thus proving to be a reliable summary of the success of the mitigation measures. since its emergence in wuhan, china, at the end of 2019, the novel coronavirus sars-cov-2 has spread worldwide. in a little over four months it has evolved into a pandemic affecting nearly every country in the world, in spite of the measures taken to control and contain the contagion. the novelty of the virus, which is related to, but different from, the coronaviruses responsible for the sars and mers epidemics of 2003 and 2012, poses a challenge to using mathematical models to predict the dynamics of the pandemic and the effects of changing mitigation strategies. following the initial outbreak, a number of contributors have either proposed mathematical models specifically designed to examine the spread of covid-19 [1, 2], or addressed important points related to the estimation of the key model parameters [3, 4] from partial, biased, daily updated data [5, 6] and incomplete information about the pathogen responsible for the pandemic. early virological assessment of sars-cov-2 [7] from a small sample of patients in germany found active virus replication in the upper respiratory tract, as opposed to sars, where the predominant expression was in the lower respiratory tract. patients were tested when symptoms were mild or in the prodromal phase. pharyngeal viral shedding was very high during the first week of symptoms, peaking on the fourth day, with shedding continuing after the symptoms subsided, well into the second week.
moreover, most of the patients seemed to have passed their shedding peak in the upper respiratory tract at the time of the first testing, suggesting efficient transmission of sars-cov-2 through pharyngeal shedding when the symptoms are mild. in [8], peak shedding is suggested to occur around day 0, and 44% of the transmission is estimated to occur in the asymptomatic phase. these specific clinical features need an interpretation in the epidemic models. the spread and speed of the covid-19 pandemic have triggered a burst of modeling activity in the effort to help predict the location and intensity of the next hotspots. one of the most used indicators of the potential for spread of an infectious agent is the basic reproduction number r0 [9, 10, 11], although questions have been raised as to how it should be computed [12, 13, 14] and interpreted [15, 16, 17], and what can be inferred from it [18]. in general, r0 > 1 signals epidemic potential, and the larger the r0, the faster the spread of the infectious agent. in [19], where a seir model is calibrated and tested on the wuhan outbreak data, the initial estimate of r0, whose definition and significance will be discussed later, was updated after the implementation of strict measures for control and prevention of the contagion. the model was then used to predict the magnitude of the pandemic by predicting the number of infected individuals on february 29, 2020 under two different scenarios: an increasing r0, changing from 1.9 to 2.6 and ending at 3.1, corresponding to letting the pandemic follow its course, and a decreasing r0, going from 3.1 to 2.6, then 1.9, and eventually 0.9 and 0.5, following the mitigation measures taken in wuhan.
of models particularly adapted to describe covid-19, we mention the sidarthe model [20] , consisting of eight separate compartments to address the different roles played in the spread of the pandemic by unreported infectious individuals, who presumably constitute the great majority. the model parameters, inferred from official data on the number of diagnosed and recovered cases, and fitted by recursive least squares methods, are adaptively changed during the simulation to reflect the introduction of progressively restrictive measures. more specifically, for the wuhan outbreak the value of r 0 , set to 2.38 on day 1, is changed to 1.6 on day 5, increasing to 1.8 on day 12 and returning to 1.6 on day 22. after day 22, r 0 is reduced to 0.99, and from day 38 it is set to 0.85. the sidarthe model is then used to predict the course of the pandemic if the lockdown measures are either weakened or tightened. the importance of considering the contribution from undocumented infections is repeatedly addressed in the covid-19 literature [21] . the population seroprevalence of antibodies in a random sample of 3300 people in santa clara county, california [22] , measured early in the u.s. outbreak, would indicate the number of infections to be much higher than the number of reported cases, possibly by a factor of 50 to 80. this is in line with the 10% seropositivity reported in the town of robbio, italy, and 14% in the town of gangelt, germany. another important issue, raised in [23] , is the bias in the estimate of key epidemic parameters if the delay in case reporting is not taken into consideration. documented and undocumented infections are accounted for in separate compartments in the seir network dynamic metapopulation model in [24] , which assumes that asymptomatic or oligosymptomatic cases can expose a far larger proportion of the susceptible cohort to the virus than can reported cases, with asymptomatic cases less infectious than symptomatic ones.
the model takes into account the travels of infectious individuals between cities by setting up a network comprised of 375 cities in china, with the amount of traffic between any pair of cities estimated from mobility data, and assigns four separate cohorts to each city. the estimation of the distributions of the six model parameters via a variant of the ensemble kalman filter suggests that in the first month of the wuhan outbreak, only approximately one out of six infections was reported. the adoption of strong mitigation measures starting from january 23, 2020 is included in the model by first reducing the mobility coefficients to 2% of the normal value, and by setting them to zero when all traffic stopped. the median latent period of covid-19 in wuhan according to this model is 3.7 days, and the median infectious period 3.48 days, with a median delay in case reporting of 10 days at first, and 6 days after the lockdown. the median estimate of the reproduction number varies from 2.38 in the early phase, to 1.66 and 0.99 after containment measures, when the fraction of undocumented infections decreases sharply. in [25] , the epidemic is assumed to go through three phases: in the first there is a slow accumulation of new infections, most of which are unreported. the second phase is characterized by a rapid growth of infections, disease and deaths, while in the third phase the epidemic slows down due to the depletion of susceptible individuals. the sir model used to describe the course of the covid-19 pandemic, calibrated on the daily number of deaths in the united kingdom and italy and with r 0 = 2.75 and r 0 = 2.225, respectively, inferred from the literature, suggests that the epidemic already started one month before the first reported death. the most widely used tools in the united states for predicting the dynamics of the pandemic include the chime [26, 27] and the ihme forecasting model [28] .
the former is an sir/seir based differential-algebraic simulation tool with an accessible interface. the latter, in order to counteract the typical sir overestimation of the proportion of infected and to focus on the most severely ill patients, uses empirically observed covid-19 population mortality curves. the underlying statistical model assumes that the cumulative death rate at each location follows the gaussian error function, and produces long and short term predictions. the goal of this article is to provide a relatively simple and flexible model equipped with a bayesian state and parameter estimation protocol to dynamically process the covid-19 data inflow, to assess the current standing of the population, and to make short term forecasts of the progression. all parameters and outputs of the algorithm are easily interpretable and adjustable. the underlying model is an adaptation of the classical seir model [29] , adjusted to better conform with certain specific features of the current covid-19 epidemic. in particular, we reinterpret the cohort of exposed individuals, defining it as individuals who carry and shed the virus asymptomatically, presymptomatically, or oligosymptomatically, thus not being isolated or hospitalized. moreover, this cohort does not contribute to the number of confirmed cases. we refer to this model as se(a)ir. we propose a bayesian particle filtering (pf) algorithm for estimating dynamically the state vector consisting of the sizes of the four cohorts in the model, based on a poisson distributed observation of the infected cohort size, with the dynamic model generating the mean of the distribution. concomitantly, we estimate the presumably dynamically changing rate of transmission, with posterior envelopes of model uncertainty. being a fully bayesian algorithm, its output consists of quantified model uncertainties. moreover, we show model-based forecasts of the expected number of new infections, equipped with predictive uncertainty envelopes.
one key factor contributing to the challenge of making predictions and planning is the unknown number of individuals spreading the virus asymptomatically. the particle filter algorithm provides an estimate of the ratio of asymptomatic to symptomatic virus carriers. a novel feature of this contribution is the derivation of a riccati type equation for the ratio of the sizes of the two cohorts. moreover, the riccati equation admits an approximate stable equilibrium over short time scales. the equilibrium value, which can be analytically calculated from the model parameters, corresponds well to the model-based estimated ratio and can be used to define a dynamically changing effective basic reproduction number r 0 for the epidemic, facilitating the comparison of model predictions with other models. the methodology is extensively tested using covid-19 data of 18 counties in northeastern ohio (cleveland area) and 19 counties in southeastern michigan (detroit area) during the period from early march 2020 to early may 2020, including the period when both states introduced similar, yet slightly different mitigation protocols. compartment models in mathematical epidemiology partition a homogeneous and well-mixed population into cohorts of individuals at different stages of the infection [?] . the popular sir model, with separate compartments for susceptible (s), infected (i) and recovered (r), proposed nearly a century ago by kermack and mckendrick [29] , introduced a population dynamics component into the previous, purely phenomenological statistical models (see, e.g. [30] ) that still seem to have a life of their own in modeling the covid-19 epidemic [28] . a significant challenge for the control and containment of the covid-19 epidemic is the spread of the infection by a large portion of asymptomatic or lightly symptomatic infectious individuals who are unaware of being vectors of the virus.
this is especially problematic when, as is the case at the time of the writing of this article, due to limited availability, testing priority is given to symptomatic individuals or to vulnerable populations, so that the size of the asymptomatic cohort must be estimated indirectly. in the next subsection, we propose a compartment model that can be used to obtain such an estimate, and discuss its advantages and limitations. we begin by considering a modification of the classical seir model where the infected cohort is subdivided into two groups according to the manifestation of symptoms, denoting by a the asymptomatic, infected, and infectious subcohort and by i the symptomatic infected one. hence, while e and a are both asymptomatic and technically infected, the e cohort is not infectious, as in the classical seir model, whereas a sheds the virus. the compartment model, schematically represented by the branching flow diagram in the left panel of figure 1 , is governed by a system of differential equations whose fluxes will be specified below. the covid-19 data, consisting of the daily count of newly reported infections, correspond to observations of the flux ϕ 2 : in the absence of additional population-level inputs, this is the data that must be used to estimate the cohort sizes and model parameters. in particular, if no data concerning the asymptomatic cohort dynamics are available, the values of the fluxes into and out of the asymptomatic compartment can be set rather arbitrarily, since they are only minimally connected with the fluxes in the lower branch of the flow diagram, whose values are part of the observations. to estimate the size of the asymptomatic cohort in the absence of additional information, it is necessary to modify the model. below we propose a modified version of the model that is suitable for estimating the size of the asymptomatic cohort, while retaining many of the salient features of the extended model.
the modification that we propose applies an approach similar to that used in metabolic network model reduction [?] , where lumping enzymatic reactions whose parameters cannot be estimated from the data is fairly common. in our reduced model, the fictitious compartment e(a) embeds the asymptomatic cohort into the exposed one: after a non-infectious incubation period, the exposed individuals branch either to the symptomatic (i) or the asymptomatic (a) infectious compartment, with unknown frequencies and for unknown reasons. in the modified se(a)ir model, depicted in figure 1 (right), the two compartments e and a are lumped together to form the fictitious e(a) compartment representing the exposed and asymptomatic cohorts, and the two fluxes into and out of the asymptomatic subcohort are merged into a single flux ϕ 3 .
[figure 1. left: compartment diagram of the model including both symptomatic and asymptomatic infected cohorts. right: the modified se(a)ir model, in which the two compartments e and a (on the blue background) are lumped together into the fictitious e(a) compartment.]
in this manner, the flux describing a variation internal to the lumped compartment is no longer part of the model governing equations. assuming, for simplicity, that both asymptomatic and symptomatic individuals move to the recovered compartment with the same relative rate, denoted by γ, the flux ϕ 2 represents the rate at which part of the exposed individuals develop symptoms; for simplicity, we assume that this flux is proportional to the size of the compounded cohort, that is, ϕ 2 = η e(a). one of the most relevant parameters in an epidemic is the rate at which susceptible individuals are infected. in the classical seir model, the transmission flux assumes the form ϕ 1 = β s i / n, where β is the transmission rate and n is the population size. however, one of the problems of covid-19 is the significant amount of virus transmission by the asymptomatic cohort within e(a). our model assumes that in the current covid-19 epidemic, most infected individuals who have developed symptoms are either hospitalized or in self-isolation, thus playing a limited role in the exposure of susceptible individuals to the contagion. acknowledging the differing roles that symptomatic and asymptomatic virus shedders play in transmission to susceptibles, we define ϕ 1 = β s (p e(a) + q i) / n, where p and q are frequencies, 0 ≤ p, q ≤ 1. the frequency p may be interpreted as the fraction of the compound cohort e(a) who are infectious. classical seir models, in which it is usually assumed that the exposed cohort is not infective, hence only the infected cohort i is responsible for the spreading of the epidemic, can be accounted for by setting p = 0, q = 1. in our model of covid-19 dynamics, we assume that p > q. the transmission rate β, which is the quantity of primary interest when it comes to assessing how fast the epidemic spreads, integrates elements related to the characteristics of the pathogen determining the probability of infection of a given susceptible contact, as well as factors related to social behavior, including the number and nature of daily contacts. in summary, the governing equations of the proposed covid-19 model are
ds/dt = −β s (p e(a) + q i) / n,
de(a)/dt = β s (p e(a) + q i) / n − (η + γ) e(a),
di/dt = η e(a) − (γ + µ) i,
dr/dt = γ (e(a) + i),
where β is the transmission rate, γ is the recovery rate, η is the symptomatic rate integrating in it the incubation process e → i, and µ is the death rate. the data that is used to inform our parametrized model, as well as to update the state estimation of the cohorts over time, is the number of confirmed, symptomatic new daily cases or, equivalently, daily observations of the flux ϕ 2 = η e(a), or, in fact, of a stochastic integer-valued realization of it, as explained in more detail in the next section. since, as already pointed out, the rate of transmissibility integrates not only properties of the pathogen, but also factors related to human behavior that change in the course of an epidemic, in general β is not a static parameter.
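the governing equations can be integrated numerically with a standard ode solver. the sketch below, in python, assumes the fluxes take the forms discussed in this section (with p absorbed into β, so that ϕ1 = βs(e + qi)/n and ϕ2 = ηe); the parameter values follow table 1, while the population size, initial condition and function name are our own illustrative choices.

```python
import numpy as np
from scipy.integrate import solve_ivp

def seair_rhs(t, y, beta, q, eta, gamma, mu, n):
    """right-hand side of the se(a)ir equations (p absorbed into beta)."""
    s, e, i, r = y
    phi1 = beta * s * (e + q * i) / n   # transmission flux
    phi2 = eta * e                      # symptom-onset flux
    return [-phi1,                       # ds/dt
            phi1 - (eta + gamma) * e,    # de/dt
            phi2 - (gamma + mu) * i,     # di/dt
            gamma * (e + i)]             # dr/dt

# parameter values as listed in table 1; n and y0 are illustrative
n, q = 1.0e6, 0.1
gamma, eta, mu = 1/21, 1/7, 0.004
beta = 0.3

y0 = [n - 10.0, 10.0, 0.0, 0.0]          # a handful of exposed individuals
sol = solve_ivp(seair_rhs, (0, 90), y0,
                args=(beta, q, eta, gamma, mu, n), dense_output=True)
```

`sol.y` then contains the trajectories of the four cohorts over the 90-day window.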
in the filtering approach used to estimate the size of the cohorts in the course of the epidemic, to be explained below, β is modeled as a time dependent parameter. in the absence of a known deterministic dynamical evolution model for β, its proposed changes are described in terms of a stochastic geometric random walk. before describing the computational methodology on which our model predictions are based, some qualitative comments about the model are in order. in classical seir models, the inclusion of the exposed cohort e adds a time delay corresponding to the incubation period, not accounted for in sir models. while our expression for ϕ 1 implicitly assumes that the infected and asymptomatic individuals are immediately infectious, the flux ϕ 2 introduces a slight delay in the i cohort if, for instance, the transmission rate is changed. moreover, if p ≠ 0, we may write ϕ 1 = (β p) s (e(a) + (q/p) i) / n, and by absorbing p into the transmission rate β, the model can be written in terms of the ratio q/p, which determines the size of the relative effect of infected symptomatic individuals on the spread of the infection. hence, without loss of generality, we may set p = 1 and assume q < 1. the value q = 0.1 used in our numerical tests is a rough estimate, and the effect of that parameter on the estimated quantities is briefly discussed in the results. finally, we point out that the simplifying assumption that the symptomatic and asymptomatic individuals recover at the same relative rate is not essential for the methodology developed below, and can easily be removed. in the discussion to follow, we simplify the notation and write e instead of e(a) for the exposed/asymptomatic cohort size. bayesian filtering algorithms update the information about the state vector and parameters in a sequential manner as new data arrive. in the following, we use the convention of denoting random variables by upper case letters and their realizations by lower case letters.
we refer to [31, 32] for particle filtering, and to [33, 34, 35] for this particular type of application to ode systems. let {x t } denote a discrete time markov process with transition probability distribution π(x t+1 | x t ). furthermore, let {b t } denote the stochastic process representing the observations, and let π(b t | x t ) denote the likelihood density, where we implicitly assume that the current observation b t depends only on the current state x t and not on the past. finally, let b t denote the cumulative data up to time t, that is, the set of observed realizations b t = {b 1 , b 2 , . . . , b t }. in bayesian filtering algorithms the update of the posterior distributions is carried out in two consecutive steps, where the first step is referred to as the propagation step, and the second as the correction, or analysis, step. the first step is accomplished through the chapman-kolmogorov formula, while the analysis step builds on bayes' formula, that is, the predicted distribution of x t+1 acts as the prior when the next observation arrives. in the following, we specify the transition probability kernel and the likelihood in our model, and describe the computational steps for the numerical implementation. we then extend the discussion to include the estimation of static parameters. consider our modified se(a)ir model introduced in the previous section, and define the state vector at time t as z t = (s t , e t , i t , r t ). since we assume that the infectivity parameter β may vary over time, we denote its value at time t by β t . formally, let ψ be the numerical propagator advancing the state variable and the infectivity parameter from one day to the next. in our computations, the time integration is performed by means of a standard ode solver such as runge-kutta, and during the one day propagation step the infectivity β t is kept constant.
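the one day propagator ψ can be sketched as a fixed step runge-kutta integration over one day with β held constant, as described above; the substep count, the parameter defaults and the function names below are our own choices.

```python
import numpy as np

def rk4_step(f, y, dt):
    """one classical fourth-order runge-kutta step."""
    k1 = f(y)
    k2 = f(y + 0.5 * dt * k1)
    k3 = f(y + 0.5 * dt * k2)
    k4 = f(y + dt * k3)
    return y + dt / 6.0 * (k1 + 2*k2 + 2*k3 + k4)

def propagate_one_day(z, beta, q=0.1, eta=1/7, gamma=1/21, mu=0.004,
                      n=1.0e6, substeps=24):
    """advance the state z = (s, e, i, r) by one day, keeping beta constant."""
    def rhs(y):
        s, e, i, r = y
        phi1 = beta * s * (e + q * i) / n
        return np.array([-phi1,
                         phi1 - (eta + gamma) * e,
                         eta * e - (gamma + mu) * i,
                         gamma * (e + i)])
    dt = 1.0 / substeps
    for _ in range(substeps):
        z = rk4_step(rhs, z, dt)
    return z
```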
formally, we write a propagation model of the form x t+1 = ψ(x t ), and we account for uncertainties both in the state vector z t and in the parameter β t by introducing an innovation term. we guarantee that all components of the state vector and the parameter β remain nonnegative by using a multiplicative innovation, assuming a geometric random walk model with a diagonal positive definite innovation covariance matrix c. in the definition of the likelihood, we assume that the data consist of realizations b t of the daily new infection counts. the new infection count is assumed to be poisson distributed, with the poisson parameter equal to the flux ϕ 2 (t); therefore, the likelihood of the observed number b t of new infections is the poisson probability with mean ϕ 2 (t) = η e t . to initialize the process, we need to define the initial state at the time t = 0, before the first observed infections. since the initial state is unknown, its initial value becomes part of the estimation problem. we postulate that before the first infection is observed, there is an unknown number of asymptomatic individuals in the community. we assume that the initial number of asymptomatic cases follows a poisson distribution with a uniformly distributed expected value λ ∼ uniform([0, λ max ]). moreover, we assume that β 0 follows a uniform distribution over some interval [β min , β max ]. we are now ready to outline the particle filter (pf) algorithm for estimating the state vector x t based on the infection count. assume that at time t a sample {x 1 t , x 2 t , . . . , x n t } from the distribution π(x t | b t ) is available, and that each sample point is associated with its corresponding weight w j t . we write a particle approximation of the chapman-kolmogorov formula and combine it with bayes' formula to obtain the updating formula. let x j t+1 denote the propagated predictor particle associated with x j t , that is, the particle obtained by applying the propagator ψ to x j t .
at the arrival of the next observation b t+1 , the likelihood π(b t+1 | x j t+1 ) expresses how well the predictor explains the data. the updating formula can be interpreted as follows: given a predictor particle x j t+1 , the expression (1), combining the importance of its predecessor in the weight and its fitness in the likelihood, evaluates the relevance of the particle; (2) weighs the importance of the new particle relative to the predictor; and the transition kernel (3) generates a new particle from the predictor by the formula (7). this hierarchical organization is the backbone of the particle filter algorithm.
initialize: draw n independent realizations of β 0 ∼ uniform([β min , β max ]) and λ ∼ uniform([0, λ max ]), generate n realizations of the initial state z 0 , and define the initial cloud of particles.
while t < t max do:
(a) propagate the particles according to (6) to generate the predictive particle cloud.
(b) extract the second component e j t+1 of each of the propagated particles, and compute the weights.
(c) sample with replacement n indices j ∈ {1, 2, . . . , n } with the probability weights g j t+1 , and define the new resampled predictive cloud. generate a new particle cloud through the innovation process, log x j t+1 = log x j t+1 + c 1/2 w j , w j ∼ n (0, i 5 ).
(d) extract the second component e j t+1 from each new particle, compute the new likelihood weights, and update the weights.
advance t → t + 1.
in the particle filter algorithm, the model parameters γ, η and µ are fixed. it is possible to modify the algorithm to estimate these parameters as well. in that case, to estimate the death rate µ, information about the deceased must be included in the data, as the new infection count is essentially insensitive to that parameter. the parameters γ and η are less time dependent than the infectivity β, and we set their values according to what is suggested in the literature.
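steps (a)-(d) above can be sketched as a single assimilation step in python. this is a simplified illustration, not the authors' code: the one day propagator is passed in as a function, the weight bookkeeping is collapsed into a straightforward resample-and-reweight scheme, and all names are ours.

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(0)

def pf_update(particles, weights, betas, new_cases, eta=1/7,
              sigma_state=0.01, sigma_beta=0.1, propagate=None):
    """one assimilation step of a particle filter of the kind sketched in
    the text. particles: (n_part, 4) array of (s, e, i, r) states,
    betas: (n_part,) transmission rates, weights: (n_part,),
    new_cases: observed daily count b_{t+1}."""
    n_part = len(weights)
    # (a) propagate each particle one day with its own transmission rate
    pred = np.array([propagate(p, b) for p, b in zip(particles, betas)])
    # (b) fitness weights from the poisson likelihood of the observed
    # count, with mean equal to the flux phi_2 = eta * e
    g = weights * poisson.pmf(new_cases, eta * pred[:, 1])
    g = g / g.sum()
    # (c) resample with replacement, then apply the multiplicative
    # (geometric random walk) innovation to state and beta
    idx = rng.choice(n_part, size=n_part, p=g)
    pred, betas = pred[idx], betas[idx]
    pred = pred * np.exp(sigma_state * rng.standard_normal(pred.shape))
    betas = betas * np.exp(sigma_beta * rng.standard_normal(n_part))
    # (d) recompute the likelihood weights for the innovated particles
    w = poisson.pmf(new_cases, eta * pred[:, 1])
    return pred, w / w.sum(), betas
```

the multiplicative innovation keeps all cohort sizes and the transmission rate strictly positive, as required in the text.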
a challenge in forecasting the spread of the covid-19 epidemic is the presumably large portion of the population that is asymptomatic, or shows only light symptoms while shedding the virus, which in our model is part of the cohort e. the ratio ρ = e/i between the numbers of asymptomatic and symptomatic individuals can be estimated from the output of the particle filter. differentiating ρ with respect to time, expressing the derivatives of e and i in terms of the governing equations, and simplifying, we find that ρ satisfies the riccati equation dρ/dt = −η ρ 2 + (β ν s − η + µ) ρ + β ν s q. if we assume that the frequency ν s = s/n of the susceptible cohort is approximately constant, we find an equilibrium state ρ * of the ratio, given by the positive root of the right hand side, which is a stable equilibrium. in particular, when the infection has not yet spread to a significant portion of the population, substituting ν s ≈ 1 yields a potentially useful estimate for the prevalence of the asymptomatic infection in the population. in the computed results, a comparison of the values of the ratio ρ computed from the state vectors with the equilibrium estimate can be used to assess whether the infection pattern is settling to or near the equilibrium state. the equilibrium condition provides a natural way of defining a basic reproduction number r 0 for the model: assuming equilibrium, we have r 0 = η ρ * /(γ + µ); thus, when r 0 = 1 the infection stops growing under the assumption of a stable ratio of asymptomatic to symptomatic infections. this interpretation of the reproduction number is in close agreement with the r 0 of sir models. while not perfect, this number summarizes the phase of the epidemic, as our computed examples will demonstrate. to test our model, we consider the covid-19 data of new infection counts in different counties of northeastern ohio and southeastern michigan.
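the stability of the equilibrium can be illustrated numerically. the sketch below assumes the per-capita governing equations for e and i take the form de/dt = βν s (e + qi) − (η + γ)e, di/dt = ηe − (γ + µ)i with ν s = s/n held constant, from which a quadratic (riccati) right-hand side for ρ = e/i follows; note that the common recovery rate γ cancels in the ratio dynamics, and the function name is ours.

```python
import numpy as np

def rho_star(beta, q, eta, mu, nu_s=1.0):
    """positive root of the riccati equilibrium condition
    -eta*rho^2 + (beta*nu_s - eta + mu)*rho + beta*nu_s*q = 0."""
    a = beta * nu_s - eta + mu
    return (a + np.sqrt(a * a + 4.0 * eta * beta * nu_s * q)) / (2.0 * eta)

# forward-euler check that the equilibrium is stable: starting far from
# rho_star, the ratio rho = e/i settles to the equilibrium value
beta, q, eta, mu = 0.3, 0.1, 1/7, 0.004  # illustrative parameter values
rho, dt = 5.0, 0.01
for _ in range(20_000):
    rho += dt * (-eta * rho**2 + (beta - eta + mu) * rho + beta * q)
```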
the counties represent different population densities and demographics: e.g., cuyahoga (oh) and wayne (mi) include dense urban areas (cleveland and detroit), summit (oh) and genesee (mi) represent mid-size urban centers (akron and flint), lake (oh) and oakland (mi) represent areas near urban centers, while holmes (oh) and sanilac (mi) host rural communities. the data consist of the confirmed daily cases made available by usafacts [6] . the values of the model parameters and of the prior densities, as described in the algorithm in section 3, are listed in table 1 . the innovation covariance matrix c is a diagonal matrix whose entries are the variances σ 1 2 of the uncertainty in the logarithm of the state vector z t and σ 2 2 of the uncertainty in the logarithm of the transmission rate β t . observe that for the susceptible population, the variance is weighted with the population squared, to keep the innovation for this cohort from becoming excessive: without the scaling by n, the innovation in a large population could be significantly large. numerical tests indicate that with a too large innovation, the algorithm may go astray. the panels in the left column of figures 2-3 display the raw data and the dynamically estimated expected values of the new infections computed with the particle filter. in the estimation process, we averaged the data over a moving window of three days to attenuate the effects of, e.g., reporting lags and differences between weekends and weekdays. for each county we show the 50% and 75% credibility intervals and the median defined by the particles. more precisely, for each parameter/state vector sample x j t , 1 ≤ j ≤ n , we calculate the corresponding fluxes ϕ j 2 (t) = ηe j t , and evaluate the interval obtained by discarding the 1/2 × (1 − p/100) × n smallest and largest values. the red curve is the median value of the fluxes.
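the credibility intervals described above (discard the 1/2 × (1 − p/100) × n smallest and largest sample values, and report the median) can be computed along the following lines; the function name and the synthetic poisson sample are ours.

```python
import numpy as np

def belief_envelope(samples, p):
    """p% credibility interval of a particle sample: discard the
    (1 - p/100)/2 fraction of smallest and of largest values."""
    s = np.sort(np.asarray(samples))
    k = int(round(0.5 * (1 - p / 100) * len(s)))
    trimmed = s[k:len(s) - k] if k > 0 else s
    return trimmed[0], trimmed[-1], float(np.median(s))

# synthetic stand-in for the flux samples phi_2^j(t) across particles
rng = np.random.default_rng(1)
flux = rng.poisson(100, size=20_000)
lo50, hi50, med = belief_envelope(flux, 50)
lo75, hi75, _ = belief_envelope(flux, 75)
```

by construction, the 75% envelope contains the 50% envelope, which contains the median.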
the middle column in figures 2-3 shows the 50% and 75% posterior belief envelopes of the transmission rate parameter for the respective counties. finally, the right column of figures 2-3 shows the 50% and 75% posterior envelopes of the estimated ratio between the cohorts e and i. in the same figures, the dashed curve represents the equilibrium value ρ * of the ratio, computed from (8) with β(t) equal to the posterior mean; towards the end of the observation period the equilibrium value is in good agreement with the sample-based estimate.
table 1: parameters of the particle filter algorithm.
a priori lower bound for the transmission rate [1/days]: β min = 0.1
a priori upper bound for the transmission rate: β max = 0.5
a priori upper bound on the initial number of infectious individuals: λ max = 5
number of particles: n = 20 000
standard deviation of the innovation of the state: σ 1 = 0.01
standard deviation of the innovation of the transmission rate: σ 2 = 0.1
recovery rate: γ = 1/21
incubation rate: η = 1/7
death rate: µ = 0.004
the numerical results indicate that the asymptotic value ρ * is a good proxy of the ratio e/i, in particular once the transmission rate β has stabilized, therefore providing a quick way to assess the size of the asymptomatic cohort from the size of the infected cohort. to understand better the dependency of the ratio ρ * on the model parameters, we introduce two dimensionless quantities, t and s, characterizing the system of differential equations. neglecting the effect of the death rate µ, and assuming that the infection is not yet widespread, so that s/n ≈ 1, we find an approximate formula for the ρ * of (8) in terms of the dimensionless quantities t and s. figure 5 shows the equilibrium value as a function of the dimensionless parameters for three different choices of the parameter q: q = 0.1 (left), q = 0.5 (center) and q = 1 (right). in the four ohio counties, the equilibrium value is consistently near ρ * ≈ 0.5, while slightly below it in the michigan counties.
observe that the value ρ * = 0.5 would correspond to an effective basic reproduction number indicating that the disease is in a slow progression phase. while this definition of r 0 in terms of ρ * implicitly assumes equilibrium, it is possible to compute the time course of r 0 for the period when the system has not reached the equilibrium, by using formula (9) and the estimated transmission rate β. figure 4 shows the posterior belief envelopes for r 0 for six counties, three in ohio and three in michigan, calculated from the posterior sample for β at each time. we did not show the results for holmes county (oh) and sanilac county (mi), whose r 0 was consistently very low. the time courses of r 0 show similar patterns in all six counties, and while in michigan the peak values were higher than in ohio, the values towards the end of the observation period are lower, varying around the critical value r 0 = 1. the particle solutions to the estimation problem provide a natural tool for predicting new infections. for each particle, the last realization x j t provides an initial value for the differential equations and an estimated value for the transmission rate; thus we may use the system of odes to propagate the particle data forward in time. from the ensemble of the predictions we can then extract the forecasting belief envelopes, as we did for the nowcasting problem. figures 6-7 show the predictions for the following 20 days for the four sample counties under the assumption that the mitigation conditions remain unchanged. in addition, we have run two alternative scenarios: in the first one, the last estimate of the transmission rate for each particle is multiplied by a factor κ = 0.8, simulating a tightening of the containment and control measures, while in the second scenario, the transmission rate is increased by a multiplicative factor κ = 1.2, simulating a relaxation of the control measures.
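the scenario runs amount to propagating each particle forward with its last estimated transmission rate multiplied by κ; a minimal sketch, with the one day propagator passed in as a function and all names ours:

```python
import numpy as np

def forecast_scenario(particles, betas, days, kappa, propagate):
    """propagate each particle forward `days` days with its transmission
    rate scaled by kappa (kappa < 1: tightening, kappa > 1: relaxation).
    returns an array of shape (n_particles, days, 4)."""
    out = []
    for z, b in zip(particles, betas):
        z = np.array(z, dtype=float)
        traj = []
        for _ in range(days):
            z = propagate(z, kappa * b)   # one day forward with scaled beta
            traj.append(z.copy())
        out.append(traj)
    return np.array(out)
```

the forecasting belief envelopes are then extracted from the returned ensemble exactly as in the nowcasting case.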
we observe that the predictions illustrate well what the lower r 0 in michigan, compared to ohio, means in terms of the expected number of new infections. to further test the predictive skill of the model, we leave out the last t pred data points from the current data set {b 1 , b 2 , . . . , b t }, considering only the data up to time t − t pred , and compute a sample of predicted new cases: we propagate the particles computed at the truncated final time forward, and generate artificial observations. a comparison of the resulting predictive envelopes with the actual data is indicative of the predictive skill of the model: large deviations indicate that the model may not have taken some important factors into account. figure 8 shows the 10 day predictions for three counties in ohio and three counties in michigan, together with the actual data. the plots indicate that, in general, the model captures the trend of the data relatively well, with the exception of saginaw county (mi), where the predicted trend is higher than the actual observation. finally, we consider the sensitivity of the results to the choice of some of the model parameters that are assumed to be known, in particular the time constants t inc = 1/η (incubation) and t rec = 1/γ (recovery). to test the sensitivity, we select one of the counties, cuyahoga (oh), and run the algorithm with different parameter combinations. figures 9-12 show a selection of results. the results show that both the estimated transmission rate and the ratio ρ = e/i are rather robust to variations of these parameters, the former decreasing slightly with increasing t rec . interestingly, towards the end of the observation interval, the ratio ρ is in all cases close to the equilibrium value ρ * ≈ 0.5; therefore, the value of r 0 can be approximated from the equilibrium formula. however, this approximation gives an idea of the speed of the transmission only if the ratio e/i is near the equilibrium.
the plots show that, in particular at the beginning of the observation period, the ratio is far from the equilibrium, and r 0 would predict too slow a growth rate for the transmission dynamics. the proposed dynamical bayesian filtering algorithm based on particle filters is shown to be able to produce a robust and consistent estimate of the state vectors consisting of the sizes of the four cohorts of a new se(a)ir model for covid-19 spread, as well as of the transmission rate, a key parameter that is not expected to be a constant. in epidemiology models, the transmission rate β is usually defined as β = (transmission probability per contact) × (number of contacts per day), where the first factor is related to the characteristics of the pathogen, while the second factor is related to the behavior of the individuals and can change in connection with hygiene, social distancing and other mitigation measures. in the ohio and michigan counties, we observe a rapid growth of β at the beginning of the outbreak, followed by a clear and consistent drop. the initial increase may be due, to some extent, to the scarcity of the data in the beginning of the epidemic, and reflect the learning effort of the algorithm. the drop that follows, however, correlates well with the stay-at-home order and other social distancing measures introduced early on in both states, from march 16 to march 23 in michigan, and from march 15 to march 19 in ohio. in both states and across all counties considered here, the transmission rate is fairly stable, except for wayne county, michigan, where β is slightly higher early in the epidemic. this may be the effect of a wider use of public transportation, socio-economic conditions and demographic factors. if the reason for the drop is the adoption of mitigation measures, it is natural to wonder why its onset is not the same in all counties.
a possible explanation of the delay may lie in the fact that the infection arrived in different communities through the mobility of infectious individuals, and what happens earlier in large, densely populated areas affects the surrounding communities with a delay. the effects of the network structure on the timing and intensity of the covid-19 spread will be addressed separately. the se(a)ir model we present directly addresses the role, in the transmission dynamics, of asymptomatic individuals shedding virus, many of whom recover before developing symptoms. we demonstrate a method for dynamically estimating the ratio of asymptomatic/presymptomatic/oligosymptomatic individuals in the e compartment to symptomatic individuals in the i compartment. this ratio, whose equilibrium value consistently settles to a value less than one across all counties examined and with a wide range of model parameters, is considerably lower than the 50-85 ratio speculated in [22] based on a serosurvey conducted in california on april 3-4, 2020. this disparity may stem partly from the growing availability of testing throughout april, as well as from the cross-reactivity in the elisa tests used as part of the serosurvey. understanding this discrepancy better will be a topic of future studies. however, we emphasize that while our methodology predicts that the asymptomatic cohort is typically smaller than the symptomatic one, according to the model it is still responsible for the majority of transmissions. furthermore, as already pointed out, the current se(a)ir model does not address some of the features of covid-19 transmission, including the latent period of the asymptomatic cohort. while designing a model that properly addresses the delay of the infectious phase of the asymptomatic patients is straightforward, retaining a structure that allows us to infer the size of the asymptomatic cohort based on data on symptomatic patients alone remains a challenge.
elaborating on that point will be a future direction of this research. the model-based predictions indicate that the current value of β alone is not enough to project how the infection process will proceed. in fact, while the value of β at the end of the considered time interval is nearly the same for all counties, the predictions of the new infections differ significantly. the main reason for this discrepancy is the different size of the cohorts. the number of infected tends to increase strongly, even if β is relatively small, if the pipeline of exposed/asymptomatic individuals is crowded. because an unobservable concentration of individuals in the exposed/asymptomatic category may represent an uptick in future cases, only several days (preferably two weeks) of consecutively falling case numbers likely represents a systematic decline in population-level disease activity. the uncertainty associated with undulating case counts is reflected in the wide predictive envelopes seen in the projections for several of the ohio counties. it is worth noting that the 10-day projections shown in figure 8 tend to overestimate the observed incidence in the last few days. we suspect this is attributable in part to reporting delays. in this paper we introduce a new way of estimating the reproduction number r 0 for the proposed model, based on an equilibrium condition of a related non-linear riccati-type equation. the equilibrium condition, in turn, was shown to correspond well to the estimated ratio of the symptomatic and non-symptomatic cases after the transmission rate had stabilized, presumably due to the social distancing measures. interestingly, the equilibrium value ρ * and the corresponding r 0 are very consistent for a number of counties in two states, ohio and michigan, that adopted similar distancing measures around the same time, regardless of the fact that michigan had a significantly higher number of infections than ohio at the time of the adoption.
these observations give an indication that the quantities reflect well the effectiveness of the mitigation measures, providing a useful and quick indicator for assessing such measures.
figure captions: in the left column, the propagation for each particle is continued by using the last state vector as initial data, and the last estimated β value as transmission rate. in the middle column, the transmission rates are slightly lowered by a factor of 0.8, while in the right column, the rates are increased by a factor of 1.2. the predictive envelopes correspond to 50% and 75% levels of belief. the predictions are based on the data excluding the last ten observations. the two different shades correspond to 50% and 75% uncertainty of predicting new observations. the actual data are shown as stem plots.
references:
early dynamics of transmission and control of covid-19: a mathematical modelling study. the lancet infectious diseases
first-wave covid-19 transmissibility and severity in china outside hubei after control measures, and second-wave scenario planning: a modelling impact assessment. the lancet
athanasios tsakris, and constantinos siettos. data-based analysis, modelling and forecasting of the covid-19 outbreak
preliminary estimation of the basic reproduction number of novel coronavirus (2019-ncov) in china, from 2019 to 2020: a data-driven analysis in the early phase of the outbreak
coronavirus 2019-ncov global cases
coronavirus locations: covid-19 map by county and state
katrin zwirglmaier, christian drosten, and clemens. virological assessment of hospitalized patients with covid-2019
temporal dynamics in viral shedding and transmissibility of covid-19
the analysis of equilibrium in malaria
the epidemiology and control of malaria.
the epidemiology and control of malaria
a brief history of r 0 and a recipe for its calculation
on the definition and computation of the basic reproduction ratio r 0 in models for infectious diseases in heterogeneous populations
the construction of next-generation matrices for compartmental epidemic models
the estimation of the basic reproduction number for infectious diseases. statistical methods in medical research
complexity of the basic reproduction number (r 0 )
perspectives on the basic reproductive ratio
the failure of r 0 . computational and mathematical methods in medicine
unraveling r 0 : considerations for public health applications
phase-adjusted estimation of the number of coronavirus disease
modelling the covid-19 epidemic and implementation of population-wide interventions in italy
nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study
covid-19 antibody seroprevalence
effect of changing case definitions for covid-19 on the epidemic curve and transmission parameters in mainland china: a modelling study
substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov2)
fundamental principles of epidemic spread highlight the immediate need for large-scale serological surveys to assess the stage of the sars-cov-2 epidemic. medrxiv
simcovid: an open-source simulation program for the covid-19 outbreak. medrxiv
locally informed simulation to predict hospital capacity needs during the covid-19 pandemic
ihme covid-19 health service utilization forecasting team and christopher jl murray. forecasting covid-19 impact on hospital bed-days, icu-days, ventilator-days and deaths by us state in the next 4 months
a contribution to the mathematical theory of epidemics
causes of death in england and wales.
second annual report of the registrar general of births, deaths and marriages in england
combined parameter and state estimation in simulation-based filtering
statistical and computational inverse problems
parameter estimation for stiff deterministic dynamical systems via ensemble kalman filter
linear multistep methods, particle filtering and sequential monte carlo
astrocytic tracer dynamics estimated from [1-11c]-acetate pet measurements
the work of dc was partly supported by nsf grants dms-1522334 and dms-1951446, and the work of es was partly supported by nsf grant dms-1714617. key: cord-140839-rij8f137 authors: langfeld, kurt title: dynamics of epidemic diseases without guaranteed immunity date: 2020-07-31 journal: nan doi: nan sha: doc_id: 140839 cord_uid: rij8f137 the global sars-cov-2 pandemic suggests a novel type of disease spread dynamics. who states that there is currently no evidence that people who have recovered from covid-19 and have antibodies are immune from a second infection [who]. conventional mathematical models consider cases for which a recovered individual either becomes susceptible again or develops an immunity. here, we study the case where infected agents recover and only develop immunity if they are continuously infected for some time. otherwise, they become susceptible again. we show that field theory bounds the peak of the infectious rate. consequently, the theory's phases characterise the disease dynamics: (i) a pandemic phase and (ii) a response regime. the model excellently describes the epidemic spread of the sars-cov-2 outbreak in the city of wuhan, china. we find that only 30% of the recovered agents have developed an immunity. we anticipate our paper to influence the decision making upon balancing the economic impact and the pandemic impact on society. as long as disease controlling measures keep the disease dynamics in the "response regime", a pandemic escalation ('second wave') is ruled out.
introduction: the rapid spread of a disease across a particular region or regions (epidemic) or the global outbreak of a disease (pandemic) [2] can have a detrimental effect on health systems, and on local and global economies, including the financial markets and socio-economic interactions, ranging from the city to the international level. measures to reduce the pandemic spread include curtailing interactions between infected and uninfected parts of the population and reducing the infectiousness or the susceptibility of members of the public [3]. the two major strategies governments use to handle an outbreak are to slow down an outbreak (mitigation) or to interrupt the disease spread (suppression). since each of those interventions bears significant risks for societal and economic well-being, it is crucial to understand the effectiveness of these strategies (or any hybrid of them). mathematical methods provide essential input for governmental decision making that aims at controlling the outbreak. among those are statistical methods [4, 5], deterministic state-space models [6] with the prototype developed by kermack and mckendrick [7], and a variety of complex network models, e.g., [8, 9]. the different mathematical approaches have different objectives: a significant application of the statistical methods frequently aims at the early detection of disease outbreaks [4], while modelling either tries to develop a model as realistic as possible for a given outbreak or to design a simplistic model which, however, reveals some universal truth about the outbreak dynamics. in the simplest version, the so-called compartmental models [7, 10] consider the fraction of the population which is either susceptible (s), infected (i) or removed (r) from the disease network. coupled differential equations capture the dynamics of the disease and determine the time dependence of s, i and r.
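the coupled differential equations of the basic kermack-mckendrick sir model can be integrated numerically in a few lines. this is a textbook sketch for illustration, not code from the paper; the parameter values are arbitrary assumptions.

```python
import numpy as np
from scipy.integrate import solve_ivp

def sir(t, y, beta, gamma):
    # Classic Kermack-McKendrick SIR model in population fractions:
    # S susceptible, I infected, R removed; beta is the transmission
    # rate, gamma the removal (recovery/isolation) rate.
    S, I, R = y
    return [-beta * S * I, beta * S * I - gamma * I, gamma * I]

y0 = [0.9999, 0.0001, 0.0]      # one initial infected per 10,000
beta, gamma = 0.4, 0.1          # basic reproductive rate R0 = beta/gamma = 4
sol = solve_ivp(sir, (0, 160), y0, args=(beta, gamma),
                t_eval=np.linspace(0, 160, 400))
S, I, R = sol.y
```

with these assumed rates the epidemic burns itself out: s + i + r stays constant, r grows monotonically, and the final susceptible fraction is small because r0 = 4 is well above the epidemic threshold r0 = 1.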
extensions add more compartments to the sir model, such as exposed (e). for example, such an seir model was used in [11] for a description of the ebola outbreak in the democratic republic of congo in 1995. compartmental models have been applied to describe the recent sars-cov-2 outbreaks [12, 13, 14, 15, 16, 17]. for example, the elaborate model from giordano et al. uses a total of 8 compartments - susceptible (s), infected (i), diagnosed (d), ailing (a), recognized (r), threatened (t), healed (h) and extinct (e) - to describe the covid-19 epidemic in italy. compartmental models have been extended in order to capture stochastically unknown influences, such as changing behaviours [18]. such models were recently used to analyse the covid-19 outbreak in wuhan [19]. compartmental models address global quantities such as the fraction of susceptible individuals s and assume that heuristic rate equations can describe the disease dynamics. in cases of a strongly inhomogeneous (social) network, e.g. taking into account different population densities, this assumption does not always seem justified. in these cases, spatial disease spread patterns can be described by a stochastic network model, with monte-carlo simulations a common choice for the simulation. in this paper, we consider a disease dynamics for which the duration (severity) of the illness depends on the amount of exposure. using an elementary (social) network, we are looking for universal mechanisms describing a pandemic spread. we will reveal a connection to statistical field theory, enabling us to characterise an outbreak with the tools of critical phenomena. we will discuss the impact of the findings on policies to curb an outbreak and will draw conclusions from the recent covid-19 outbreak in hubei, china. model basics: we assume that each individual has two states u, with u = 0 (susceptible) and u = 1 (infected). each individual interacts with four 'neighbours' of the social network.
the disease spread is described as a stochastic process. at each time step (say, 'day'), the probability that an individual gets infected (or recovers) depends on the status of the neighbours in the social network. here, we only study the simple case of a homogeneous network with four neighbours for each site. we also consider periodic boundary conditions to minimise edge effects. immunity: we study two closely related scenarios. (i) there is no immunity: every individual can be reinfected and can recover only to be susceptible again. (ii) individuals can be reinfected and recover; only if individuals stay infected for τ consecutive days are they considered immune. in case (ii), the sites of immune individuals are removed from the disease network. disease dynamics: if x is a site of the disease network, at every time step the state u_x ∈ {0, 1} is randomly chosen with probability (1), where xy is an elementary link on the lattice joining sites x and y and, hence, n_x is the number of infected neighbours, and N_x = 1 + exp{4β n_x + 2h} is the normalisation. the parameter β describes the contagiousness of the disease. the parameter h is linked to the probability of contracting the disease from outside the network. in fact, if no one in the network is infected (n_x = 0, ∀x), the probability p that any individual contracts the disease is connected to h. if the lattice contains n individuals (i.e., sites), one time step is said to be completed once we have considered n randomly chosen sites for the update. the pandemic spread as a critical phenomenon: scenario (ii) shows the typical time evolution of an epidemic, with the infection rate approaching zero for large times due to agents developing immunity. for a vanishing external field h, the model shows a critical behaviour with a phase transition at β = β_c = ln(1 + √2)/2 ≈ 0.44. in the ordered phase for β > β_c, even a small seed probability p triggers an outbreak ('pandemic' phase).
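the site update rule can be sketched as a heat-bath sweep on a periodic lattice. this is my own minimal reconstruction from the text: the infection probability exp(4β n_x + 2h)/N_x with N_x = 1 + exp(4β n_x + 2h) is read off from the quoted normalisation, and the relation between the seed probability p and the field h follows from setting n_x = 0; function names are assumptions.

```python
import numpy as np

def step(u, beta, h, rng):
    """One time step: N = L*L randomly chosen heat-bath site updates
    on an L x L lattice with periodic boundary conditions."""
    L = u.shape[0]
    for _ in range(L * L):
        x, y = rng.integers(0, L, size=2)
        # number of infected neighbours of site (x, y)
        n = (u[(x + 1) % L, y] + u[(x - 1) % L, y]
             + u[x, (y + 1) % L] + u[x, (y - 1) % L])
        w = np.exp(4 * beta * n + 2 * h)
        # P(u_x = 1) = exp(4*beta*n + 2h) / (1 + exp(4*beta*n + 2h))
        u[x, y] = 1 if rng.random() < w / (1 + w) else 0
    return u

def field_from_seed(p):
    """With no infected neighbours (n = 0), P(infect) = exp(2h)/(1+exp(2h));
    setting this equal to the seed probability p gives h = 0.5*log(p/(1-p))."""
    return 0.5 * np.log(p / (1 - p))
```

iterating `step` from a healthy lattice (all zeros) with β above or below β_c ≈ 0.44 reproduces the qualitative pandemic/response behaviour; tracking `u.mean()` per step gives the daily infection rate.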
for β < β_c, the model is in the 'response' phase, i.e., the infection rate is in response to the seed probability p, but no outbreak occurs. the asymptotic infection rate can be calculated using markov chain monte-carlo methods. we used a modified swendsen-wang cluster algorithm, which performs well near the phase transition [22]. our numerical findings are summarised in figure 1, left panel. curve 2 clearly separates both phases - the pandemic phase and the response regime. note that the dynamics (1) corresponds to the standard heat-bath update [23]. starting from a healthy population (u_x = 0, ∀x), it takes the 'thermalisation' time t_th until the daily infection rate starts fluctuating around its asymptotic value. immunity: let us now study scenario (ii), where individuals can develop immunity if they are infected for τ consecutive days. for τ > t_th, the peak infection rate is that of the asymptotic state of the corresponding model (i) and, hence, inherits the classification 'pandemic' or 'response' phase. this is illustrated in figure 1, right panel, for the pandemic phase for several values of τ. figure 2 illustrates the vastly different behaviour of the disease spread in the pandemic phase (β = 0.41, p = 5%) and in the response regime (β = 0.38, p = 4%). results are for an n = 100 × 100 network and τ = 11. note that the curve for 'infected+immune' ('triangle' symbol) in the pandemic phase is not monotonically increasing with time, since infected individuals can return to the 'susceptible' state, i.e., not every infected individual becomes immune. note that in the response regime ('circle' and 'square' symbols), the 'pandemic' peak is absent altogether. however, on the downside, the so-called 'herd immunity' only slowly develops over time. comparison with data: we stress that the model assumption of a homogeneous (social) network with 'four neighbours' is unrealistic. knowledge of the underlying disease network is essential to make quantitative predictions for, e.g.,
the critical value β_c of the contagiousness. here, we adopt a different approach: we assume that the qualitative time evolution of bulk quantities, such as the fraction of infected individuals, is within the grasp of model scenario (ii) and use those as fit functions to determine the model parameters β, p and τ by comparison with actual data. for this study, we used data from the covid-19 outbreak in 2020 in the hubei province in china [24]. the data for the number of infected individuals show a jump at day 73 (on the arbitrary time scale) by 40%, which is due to a change in reporting. we assume that the same under-reporting has occurred in the days before and have corrected the data by multiplying the number of infected (and infected+recovered) by a factor 1.4 for times t ≤ 73. let d(t, τ, β, p) be the fraction of the population of infected individuals as a function of time t, depending on the parameters τ (time to develop immunity), β (contagiousness) and p (seed probability to get infected). we have calculated d(t, τ, β, p) using an n = 250 × 250 lattice. we checked that the result is independent of the lattice size in the percentage range for the parameters relevant in this study. if d_hubei(t) quantifies the measured number of infected in the hubei outbreak, we want to approximate these data with a suitable choice of the parameters n_pop, t_s, β and p. since the offset of the time axis in the hubei data is arbitrary, we have chosen the shift t_s such that the peaks of the simulated data and the measured data coincide. all other parameters are treated as fit parameters. altogether, we find a good agreement with the data for the fitted parameter values. the model data overshoot the data in the early days of the epidemic spread, which could be related to under-reporting due to limited testing capabilities.
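the data correction and the peak-alignment step described above can be sketched as follows. this is an illustrative reconstruction, not the authors' code: the function names and the least-squares score are my own assumptions, following the stated procedure (multiply counts by 1.4 up to the reporting jump at day 73, then shift the simulated curve so its peak coincides with the data peak before scoring).

```python
import numpy as np

def correct_reporting(infected, jump_day=73, factor=1.4):
    """Apply the 40% under-reporting correction to all days t <= jump_day."""
    out = np.asarray(infected, dtype=float).copy()
    out[:jump_day + 1] *= factor
    return out

def peak_aligned_sse(sim, data):
    """Shift the simulated curve so its peak coincides with the data peak
    (the time origin of the reported data is arbitrary), then return the
    sum of squared residuals on the overlapping window."""
    sim, data = np.asarray(sim, float), np.asarray(data, float)
    t_s = int(np.argmax(data)) - int(np.argmax(sim))
    i0 = max(0, t_s)
    i1 = min(len(data), len(sim) + t_s)
    resid = data[i0:i1] - sim[i0 - t_s:i1 - t_s]
    return float(np.sum(resid ** 2))
```

a fit then amounts to minimising `peak_aligned_sse(n_pop * d, corrected_data)` over the remaining parameters (τ, β, p, n_pop), e.g. by a grid search over lattice simulations.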
it is interesting to observe that the curve of the infection rate is asymmetric: the slope of the rise at the beginning is larger than the slope of the decline after the maximum. also, the number of infected seems to level off at a nonzero value. in the present model, this is explained as follows: with more agents being immune, it is harder for susceptible agents to be continuously infected for a time greater than or equal to τ and, thus, to develop immunity. we also find that only about 30% of the infected (and recovered) develop an immunity. the pandemic spread corresponds to the ordered phase of the field theory, and the critical value for the contagiousness is that of the phase transition. the disease is in a controllable response mode if the corresponding field theory is in the disordered phase. the quantitative results reported here are derived with an unrealistic homogeneous disease network in which each agent interacts with four neighbours. nevertheless, we find that the covid-19 data of the hubei outbreak are well represented. for this case, we find that only 30% of the infected develop an immunity. the heavy tail of the decline of the number of infected, which levels off at non-zero values, is an inherent feature of the model and can be traced back to the fact that agents can be reinfected. in a network with a sizeable portion of immune agents, it is increasingly challenging to develop immunity. if these model assumptions were underpinned by medical investigations, achieving 'herd immunity' would be difficult. this should influence the decision to what extent efforts focus on developing a cure or a vaccine.
references:
strategies for containing an emerging influenza pandemic in southeast asia
statistical methods for the prospective detection of infectious disease outbreaks: a review
statistical studies of infectious disease incidence
a contribution to the mathematical theory of epidemics
thresholds for epidemic outbreaks in finite scale-free networks
the impacts of network topology on disease spread
the mathematics of infectious diseases
statistical inference in a stochastic epidemic seir model with control intervention: ebola as a case study
modelling the covid-19 epidemic and implementation of population-wide interventions in italy
mathematical modelling on phase based transmissibility of coronavirus
a conceptual model for the coronavirus disease 2019 (covid-19) outbreak in wuhan, china with individual reaction and governmental action
data-based analysis, modelling and forecasting of the covid-19 outbreak
estimating clinical severity of covid-19 from the transmission dynamics in wuhan, china
capturing the time-varying drivers of an epidemic using stochastic dynamical systems
early dynamics of transmission and control of covid-19: a mathematical modelling study
beitrag zur theorie des ferromagnetismus
statistical mechanics of lattice systems: a concrete mathematical introduction
nonuniversal critical dynamics in monte carlo simulations
an r package with real-time data, historical data and shiny app
i thank lorenz von smekal (giessen) and paul martin (leeds) for helpful discussions. key: cord-130967-cvbpgvso authors: dinamarca, jos'e luis; aguilar, sara; runzer-colmenares, fernando; morales, alexander title: clinical concepts might be included in health-related mathematic models date: 2020-04-23 journal: nan doi: nan sha: doc_id: 130967 cord_uid: cvbpgvso the construction of mathematical models that allow a comprehensive approach to decision-making in situations of absence of robust evidence is important.
while it is interesting to use models that are easy to understand, using values of direct interpretation, we analyzed a published index (covid-19 burden index) and found it seems to be oversimplified. it is possible that the proposed index, with current data, could be useful in geographically and administratively narrow places. but it is inaccurate to be applied throughout the process and in places as broad as american countries. it would be ideal to correct and refine the referred model, bearing in mind the clinical concepts described, to take advantage of the proposal and generate a more accurate response, which can serve as an input both in the implementation of measures and in the prediction of the behavior of a pandemic like the current one. however, what we propose is to improve the accuracy of the model in terms of quantities and applicability, agreeing with the concept of "stay at home". the approach between complementary areas of knowledge should be the door that we must open to generate the new evidence we need. mathematics should not dispense with the clinical sciences. we have carefully read the article "is a covid19 quarantine justified in chile or usa right now?", published by ri gonzález, f muñoz, ps moya and m kiwi [1]. the construction of mathematical models that allow a comprehensive approach to decision-making in situations of absence of robust evidence is important. as such, the approach from the exact sciences is a matter of interest to the current pandemic. indeed, we have some examples of current models. in mexico, an attempt has been made to predict the behavior of covid-19 and estimate possible scenarios through a gauss model. with the universidad de guadalajara's data, "contagion critical days" between march 20th-24th were determined. assuming y=1.0257x and r2=0.9846, and considering how the virus evolved in china and recent local data, the results using the equation, with 98% confidence, create some possible optimistic and critical scenarios.
the model was built to estimate these scenarios at 5 days and not daily (as in spain or italy), given the far fewer confirmed infected in mexico [2]. while it is interesting to use models that are easy to understand, using values of direct interpretation, the proposed index ("covid-19 burden index") seems to us to be oversimplified. this is because the authors minimize the impact of three relevant situations that are a substantial issue of the integral process: first, in relation to a contextual justification, they use the assumption that different country realities are comparable. second, relative to the equation, they use a numerator greater than the actual one. finally, in relation to the denominator, they do not incorporate fundamental clinical-epidemiological variables. all this leads to a possible overestimation of the total, with a risk of bias in the interpretation and applicability of the results. we will briefly develop these three elements in the same order in which we have just presented them. three realities (south korea, united states of america and chile) are put as comparable considering data related only to the covid-19 incidence. it is not taken into account [3] that south korea is a small country with just over 100,000 km² (109th by land surface globally), with four cities (seoul, busan, incheon and daegu) concentrating just over a third of the country's total population [4,5]; the united states of america is a federal country with almost 10 million km² (4th), in which the fifteen most populous cities do not account for even 10% of the country's total [6]; while chile is a highly centralized country with approximately 750,000 km² (38th), in which a single city (santiago) concentrates one third of the country's total population (table n° 1).
analyzing and comparing the incidence of a disease without considering the geographical and population realities of three very different territories constitutes a methodological limitation, too important not to be considered in the development of a model. since the measures being implemented will be different according to geographical realities, the initial rate of contagion will be different, and the distribution of resources available to deal with the disease will be different. in the mexican example, the total n of confirmed cases reached a value less than the daily n confirmed in spain and italy, in the same time elapsed since "case one". that is why we think it is correct to compare similar realities, especially in terms of population concentration according to territorial area, and according to the distribution of critical resources that exist in a given place and time. next, the numerator (15% of the confirmed population) equates to between 3 and 6.5 times the estimated number of people that will require mechanical ventilation [7]. there is no justification for why this high percentage was used. had it initially been decided that the total n of confirmed cases were going to develop severe pneumonia? moreover, in most hospitals in latin america, including chile, this type of patient does not necessarily burden intensive care beds, but specialized beds. this is a major factor to refine in the proposed model. corrections that would bring the model closer to reality would be: or, in any case: n confirmed × 0.05. in statistical terms, the problem is to use all of the n observations to obtain optimal estimates θ̂i of the θi, considering the uncertainty introduced by the observation errors. this is a problem that gauss describes as "the most important of the applications of mathematics to natural philosophy" [2].
finally, with regard to the denominator, the following elements should be well thought out: the distribution of clinical resources that each country has to deal with a situation such as the current one, beyond the gross n. relevant in all cases, this is especially relevant in countries with strong organizational and administrative centralism, as in latin american ones: the distribution of resources is heterogeneous and does not have to relate to a particular n of contagions in the same place and time. on the other hand, there is a set of variables that makes a comparison between countries more complex. beyond the effect that certain measures implemented may have generated on the shape of the infection, lethality and recoverability curves, there will always be population-specific variables that will end up affecting the development of these curves: genetic, nutritional and idiosyncratic elements. variables of this type should be considered and quantified to be included in the denominator in the form of a corrective factor. it is possible that the proposed index, with current data, could be useful in geographically and administratively narrow places. but it is inaccurate to be applied throughout the process and in places as broad as american countries: today, the reality in chile shows that for several days we would have a result of "1" which, according to the model, should be interpreted as "collapse of the health system", which is still far from happening. it would be ideal to correct and refine the presented model, bearing in mind the concepts described, to take advantage of the proposal and generate a more accurate response, which can serve as an input both in the implementation of measures and in the prediction of the behavior of a pandemic like the current one. however, what we propose is to improve the accuracy of the model in terms of quantities and applicability, agreeing with the concept of "stay at home".
in conclusion, we believe that the convergence of complementary areas of knowledge should be the door that we must open to generate the new evidence we need. mathematics should not dispense with the clinical sciences, and the result of their mutual synergy will always exceed the mere sum of the two. as long as evidence is being built, we must avoid venturing conclusions that can promote or result in hasty decisions. references: is a covid19 quarantine justified in chile or usa right now?; gauss's contributions to statistics; food and agriculture organization of the united nations; clinical characteristics of coronavirus disease 2019 in china. key: cord-195082-7tnwkxuh authors: oodally, ajmal; kuhn, estelle; goethals, klara; duchateau, luc title: modeling dependent survival data through random effects with spatial correlation at the subject level date: 2020-10-12 journal: nan doi: nan sha: doc_id: 195082 cord_uid: 195082-7tnwkxuh dynamical phenomena such as infectious diseases are often investigated by following up subjects longitudinally, thus generating time to event data. the spatial aspect of such data is also of primordial importance, as many infectious diseases are transmitted from one subject to another. in this paper, a spatially correlated frailty model is introduced that accommodates for the correlation between subjects based on the distance between them. estimates are obtained through a stochastic approximation version of the expectation maximization algorithm combined with a monte-carlo markov chain, for which convergence is proven. the novelty of this model is that spatial correlation is introduced for survival data at the subject level, each subject having its own frailty. this univariate spatially correlated frailty model is used to analyze spatially dependent malaria data, and its results are compared with other standard models.
malaria remains a disease with high morbidity and mortality in ethiopia according to the most recent information from the world health organization (2018), with about 74 million people at risk for the disease. additionally, ethiopia launched an ambitious green energy program making use of the large altitude differences in the country to generate electricity from hydroelectric dams. such dams, however, provide excellent breeding grounds for the vector of the malaria parasite, the anopheles mosquito, and thus might lead to increasing malaria incidence. in order to investigate the dam effect, data have been collected in villages around the gilgel-gibe hydroelectric dam in the oromia region at different distances from the dam. getachew et al. (2013) reported the results of this study using a shared frailty model. in this paper, we revisit these data and develop new survival models to better mimic the correlation structure in the data. the use of spatial statistics tools is relatively new in survival analysis, although it may substantially improve dependent survival data modelling. banerjee et al. (2003) proposed a parametric frailty model to estimate parameters using a bayesian approach to analyse an infant mortality dataset in minnesota. spatial dependence between the clusters was modelled using two different approaches: a geostatistical approach where the exact locations are needed, and a lattice approach where the relative distance between the groups is required. li and ryan (2002) developed a semi-parametric spatial frailty model with monte carlo simulations and laplace approximation of a rank based marginal likelihood. along the same lines, lin (2012) estimated parameters of a log-normal spatial frailty model using a two-iteration approach based on an approximate likelihood function, alternating between the estimation of the regression parameter and the variance components. however, all of the above models introduced spatial correlation between groups of subjects.
in our approach, we go beyond that and model spatial correlation at the subject level. in section 2, the model is detailed, followed by the estimation procedure in section 3. simulation results are presented in section 4 and the analysis of the malaria data set is discussed in section 5. conclusion and discussion follow in section 6.

2 description of the univariate spatially correlated frailty model

we propose a univariate frailty model with spatial correlations at the subject level. let us consider n subjects. for 1 ≤ i ≤ n, the time to event and the time of censoring for subject i are modelled by random variables denoted by t i and c i respectively. then, for 1 ≤ i ≤ n, the right censored time and the censoring indicator are denoted by x i and ∆ i respectively and defined by x i = min(t i , c i ) and ∆ i = 1_{t i ≤ c i}. we denote x = (x i ) 1≤i≤n and ∆ = (∆ i ) 1≤i≤n. the spatially correlated univariate frailty model is defined as follows. for 1 ≤ i ≤ n, the conditional instantaneous hazard of occurrence of the event for subject i at time t is defined by: h i (t) = h 0 (t) exp(β z i + b i ), (1) where h 0 (t) is the baseline hazard function at time t, b i the frailty term of subject i, β the vector of the unknown regression parameters, and z i the vector of covariates associated with subject i. let us introduce the frailty vector b = (b i ) 1≤i≤n, which is assumed to follow a centered multivariate normal distribution with covariance matrix σ 2 σ(ρ), where σ 2 is a scaling factor and σ(ρ) is the correlation matrix parameterized by ρ > 0: b ∼ n(0, σ 2 σ(ρ)). (2) we consider two different correlation structures following li and ryan (2002), denoted σ exp (ρ) and σ pol (ρ) in the sequel, whose entries depend on d ii' , the distance between subject i and subject i'. by definition, d ii = 0. for the baseline hazard function, usual regular parametric forms can be considered. some examples include the weibull, gompertz and piecewise constant baselines. we denote by α the parameters of the baseline hazard function h 0 .
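to illustrate the frailty distribution in equation (2), the following sketch draws a spatially correlated frailty vector from subject coordinates; the exponential decay exp(−d/ρ) used for σ(ρ) is an assumed illustrative form, since the exact entries of σ exp and σ pol are not reproduced in this text.

```python
import numpy as np

def simulate_frailties(coords, sigma2=1.5, rho=1.0, seed=0):
    """Draw b ~ N(0, sigma^2 * Sigma(rho)) for subjects at the given 2-d
    coordinates, with Sigma(rho)_{ii'} = exp(-d_{ii'} / rho) as an assumed
    exponential-decay correlation (so d_{ii} = 0 gives 1 on the diagonal)."""
    rng = np.random.default_rng(seed)
    # pairwise Euclidean distance matrix d_{ii'}
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    cov = sigma2 * np.exp(-d / rho)
    return rng.multivariate_normal(np.zeros(len(coords)), cov)
```

nearby subjects then receive strongly correlated frailties, while distant ones are nearly independent, which is the behaviour the model is designed to capture.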
finally, the whole set of model parameters is θ = (α, β, σ 2 , ρ).

3 estimation in the spatially correlated frailty model

we estimate the parameters of the model by maximising the marginal likelihood. we introduce the following assumptions on the univariate spatially correlated frailty model: (f1) the censoring times (c i ) are independent of the event times (t i ) and of the frailty vector b. (f2) conditional on the frailty vector b, the event times (t i ) are independent. by assumptions (f1) and (f2), the complete likelihood can be expressed as in (3), where h 0 (x i ) = ∫ 0 x i h 0 (t) dt is the cumulative hazard function and f is the density of a centered multivariate gaussian distribution with covariance matrix σ 2 σ(ρ). the marginal likelihood is obtained by integrating the complete likelihood with respect to the frailty vector b. the estimator of the maximum of the marginal likelihood, θ̂, is defined by: θ̂ = argmax θ l marg (θ; x, ∆). this estimator cannot be evaluated directly in many models, in particular when the marginal likelihood does not admit an analytical form, which is the case in the frailty model we consider. in practice, we calculate the value of the estimator using an iterative algorithm. we apply the saem-mcmc algorithm with truncation on random boundaries (kuhn and lavielle (2004), allassonnière et al. (2010)) to compute the maximum of the marginal likelihood. let us describe this algorithm, denoted later by algorithm a, in detail. first note that the complete likelihood function defined in equation (3) belongs to the exponential family since it can be written as (4) l comp (θ; x, ∆, b) = exp(−ψ(θ) + ⟨s(x, ∆, b), φ(θ)⟩), where s, ψ and φ are borel functions. the sufficient statistics take values in a subset s of r n(n+3)/2 . let (k q ) q≥0 be a sequence of increasing compact subsets of s such that ∪ q≥0 k q = s and k q ⊂ int(k q+1 ) for all q ≥ 0. initialize θ 0 in θ, and b 0 and s 0 in two fixed compact sets k and k 0 respectively. each iteration of algorithm a is composed of four steps, detailed below.
repeat until convergence, for k ≥ 1:

1. simulation step: draw a realization b̃ of the unobserved frailty vector from a transition probability π θ of a convergent markov chain having as stationary distribution the conditional distribution π θ (·|x, ∆) defined by π θ (b|x, ∆) = l comp (θ; x, ∆, b) / l marg (θ; x, ∆), with the current parameters.

2. stochastic approximation step: compute s̃ = s k−1 + µ k (s(x, ∆, b̃) − s k−1 ), where (µ k ) k is the step-size sequence.

3. truncation step: if s̃ is outside the current compact set k κ k−1 , where κ is the index of the current active truncation set, or too far from the previous value s k−1 , then restart the stochastic approximation in the initial compact set, extend the truncation boundary to k κ k and start again with a bounded value of the missing variable. otherwise, if ‖s̃ − s k−1 ‖ ≤ ε k , where ε = (ε k ) k≥0 is a monotone non-increasing sequence of positive numbers, set (b k , s k ) = (b̃, s̃) and keep the truncation boundary at k κ k−1 .

4. maximization step: set θ k = θ̂(s k ), where the function θ̂ : s → θ is defined such that l(θ̂(s), s) ≥ l(θ, s) for all θ ∈ θ, with l(θ, s) = exp(−ψ(θ) + ⟨s, φ(θ)⟩).

note that the truncation step guarantees two conditions on the sequence (b k , s k ) generated by this algorithm at each iteration k: it ensures that the stochastic approximation does not wander outside the current compact set and that the current value is not too far from the previous value. indeed, the truncation step is introduced following andrieu et al. (2005) to allow for frailties that do not live in a compact set and thereby establish the convergence proof under weak assumptions on the model. in this section, we prove the almost sure convergence of the sequence (θ k ) k generated by algorithm a. to this end, we introduce a function w and a mean field h, defined in (5) and (6) respectively, and state a first assumption on these functions. (f6) the functions w and h are such that (i) there exists an m 0 > 0 such that the set l = {s ∈ s, ⟨∇w(s), h(s)⟩ = 0} is contained in {s ∈ s, w(s) < m 0 }, where w is defined in (5) and h is defined in (6); (ii) there exists m 1 ∈ ]m 0 , ∞] such that {s ∈ s, w(s) < m 1 } is a compact set; (iii) the closure of w(l) has an empty interior.
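ignoring the truncation step (which, as noted below, is only needed for the theoretical convergence proof), the four steps reduce to the following schematic loop; the callable names are placeholders for the model-specific pieces, so this is a sketch of the iteration scheme rather than an implementation of the full algorithm a.

```python
def saem(simulate_b, suff_stat, argmax_theta, theta0, s0, mu, n_iter=100):
    """Schematic SAEM iteration (truncation step omitted):
    simulate_b(theta): one MCMC draw targeting pi_theta(. | x, Delta)
    suff_stat(b):      sufficient statistic S(x, Delta, b)
    argmax_theta(s):   M-step mapping theta_hat(s)
    mu(k):             step sizes (e.g. 1 during burn-in, then decreasing)"""
    theta, s = theta0, s0
    for k in range(1, n_iter + 1):
        b = simulate_b(theta)               # 1. simulation step
        s = s + mu(k) * (suff_stat(b) - s)  # 2. stochastic approximation step
        theta = argmax_theta(s)             # 4. maximization step
    return theta
```

with mu(1) = 1, the first iteration discards the initial statistic s0 entirely, which is the "no memory" behaviour during burn-in described later in the text.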
we now state a condition on the step-size sequences in the stochastic approximation and truncation steps of the algorithm. (f7) the sequences µ = (µ k ) k≥0 and ε = (ε k ) k≥0 are non-increasing, positive and satisfy the standard stochastic approximation summability conditions. theorem 1 then states that lim k d(θ k , l) = 0 almost surely, where (θ k ) k is generated by algorithm a and d(x, a) denotes the distance from x to any closed set a. we refer to appendix a for the proof of theorem 1. 1. in order to reduce the computational time needed to achieve convergence, it is crucial to start the algorithm with good initial estimates. the usual approach is to start with estimates obtained from simpler existing methods. for instance, we use as initial values for the regression parameter β and the baseline components the estimated values obtained when fitting the data by a piecewise constant proportional hazards model. 2. at iteration k of the algorithm, the transition kernel used for simulating the unobserved frailty is often chosen as a transition kernel of a random walk metropolis-hastings algorithm, where the proposal distribution q is a normal distribution centred at the current value b k−1 . however, if the frailty vector b is of high dimension, this can be inefficient in practice. to cope with this high dimensionality, one can implement a hybrid gibbs algorithm where b k−1 is updated coordinate-wise, each candidate being accepted with an acceptance probability that has to be computed n times for a frailty vector of size n. nevertheless, this step can be computationally intensive with an increasing size n of the frailty vector. instead, one can update blocks of size k at a time. in doing so, the computational cost is reduced by a factor of k. a subtle compromise has to be found between too big a value of k, which translates into a change in too many directions at a time for b and leads to a possibly extremely low acceptance rate, and too small a k, which leads to a high computational cost. moreover, we recommend using an adaptive version to achieve reasonable acceptance rates (haario et al. (2001)). 3.
the decreasing positive step size (µ k ) k is taken as follows: µ k = 1 for all 0 ≤ k ≤ k 0 , and µ k = 1/(k − k 0 ) for all k > k 0 , where k 0 is a number to be specified. the step size (µ k ) k verifies assumption (f7). the algorithm is said to have no memory during the first k 0 iterations. after this burn-in time, which allows the algorithm to widely visit the parameter space, the sequence (µ k ) k decreases and converges to zero as k → ∞. besides, there is no need to implement the truncation step in practice, as it is only required for the theoretical proof of convergence. thus we state no recommendation for the sequence (ε k ) k . 4. following ripatti et al. (2002), we consider a stopping criterion based on the relative difference between two consecutive values of the parameters. let us fix a positive threshold ε > 0. if, for some k > 1, the relative difference stays below ε for three consecutive iterations, the algorithm is stopped. all programs are available on request from the authors.

4 simulation study

the simulation setting is chosen to mimic the malaria data. we generate event times for n = 300 subjects using model (m 1 ), defined by equations (1) and (2), where the baseline hazard is piecewise constant with change points (τ 0 , τ 1 , τ 2 , τ 3 ) = (0, 0.2, 0.8, +∞) and constant hazards (h 1 , h 2 , h 3 ) = (2, 0.5, 1). the parameter vector β is set equal to (2, 3) and the covariates (z i ) are simulated following a bernoulli distribution of parameter 0.5. the parameters of the correlation structure are fixed at σ 2 = 1.5 and ρ = 1. the matrix d is chosen by taking subsets of size 300 of the real malaria distance matrix. we simulate the event times under three different censoring settings, namely no censoring, moderate censoring (40%) and heavy censoring (60%). the censoring times are simulated following an exponential distribution with the rate parameter adjusted so as to obtain the desired censoring level. the estimates are computed using algorithm a and presented in table 1.
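event times under a piecewise constant baseline, as in the simulation setting above, can be drawn by inverse-transform sampling of the cumulative hazard; this sketch assumes the frailty and covariate effects enter through a single multiplicative factor scale = exp(β z i + b i ).

```python
import math

def sample_event_time(taus, hazards, scale, rng):
    """Solve scale * H_0(t) = -log(U) for a piecewise constant hazard with
    change points `taus` (tau_0 = 0, last entry +inf) and levels `hazards`;
    `rng` only needs a uniform() method returning a value in (0, 1)."""
    u = -math.log(rng.uniform())  # target value of the cumulative hazard
    cum = 0.0
    for j, h in enumerate(hazards):
        width = taus[j + 1] - taus[j]
        if cum + scale * h * width >= u:
            # event falls inside this piece: invert the linear segment
            return taus[j] + (u - cum) / (scale * h)
        cum += scale * h * width
```

with the paper's values taus = (0, 0.2, 0.8, +inf) and hazards = (2, 0.5, 1), repeated calls with e.g. numpy's default generator produce one simulated data set, to which an independent exponential censoring time can then be applied.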
all estimates are close to the true values, whatever the censoring setting. we emphasize that the lower the censoring rate, the closer the estimates are to the true values. we also note that the standard errors are larger with increasing proportions of censored data for all estimates. for instance, the standard error for ĥ 3 nearly doubles when we compare the non-censoring setting to heavy censoring, with respective standard errors of 0.447 and 0.884. to evaluate the effects of misspecification with respect to the correlation structure, we introduce model (m 2 ), which differs from (m 1 ) only in its correlation structure. we fit the model (m 2 ) to data simulated under the model (m 1 ). the results, shown in table 2, are far from the true values, and even more so in the case of moderate and heavy censoring. when the correlation structure is misspecified, the scaling factor σ 2 seems to compensate for the wrong assumption on the correlation structure, which leads to an overestimation of this parameter. we compare the estimator of the univariate spatially correlated frailty model with two existing models that do not take spatial correlation into account, based on the same simulation setting: the proportional hazards model (m 3 ) and the univariate frailty model (m 4 ). neither the proportional hazards model (m 3 ) nor the univariate frailty model (m 4 ) accounts for the spatial correlation present in the simulated data.

5 analysing the gilgel gibe time to malaria data set

in the gilgel gibe study, 2037 children living in 16 different villages were followed over time. the location of the children is presented in figure 1, which demonstrates that the village boundaries are mostly of an administrative nature, and that villages are almost overlapping in some cases. the geographical coordinates (longitude and latitude) of the children around the dam are used to compute the inter-distances between pairs of children. we consider four covariates: most importantly the distance to the dam, and next to that the sex of the child, the age, and the structure of the roof of the child's household.
the children are grouped into three age categories, namely 0-3 (used as reference group), 3-7 and older than 7. some children have exactly the same geographical coordinates (same household) or live so close to each other that the locations recorded are exactly the same. we assign a common frailty term to those children so that the distance matrix remains invertible and positive definite. we note however that this grouping structure is different from the usual grouping structure in spatial survival models, in the sense that the members of the group have exactly the same geographical coordinates. in the common grouping structure, the group normally refers to a region, state or country (li and ryan (2002), banerjee (2016)). the number of children in a group varies from 1 to 11, with most groups consisting of 1, 2 or 3 children (760 groups of 1, 364 groups of 2 and 89 groups of 3). we fit the univariate spatially correlated frailty model defined in equations (1) and (2) to the data. the parametric baseline hazard is chosen to be piecewise constant, using the rainfall data to determine adequate cut-points as described in belay et al. (2017). the parameters of the model are estimated using the saem-mcmc algorithm detailed in section 3.2 with a gibbs-block of size k = 10. smaller block sizes give similar results but need longer computational times. besides, the algorithm fails to converge for larger block sizes. initial parameter estimates for β and the baseline components are obtained using the r package eha (broström (2012)), which allows for estimation in a piecewise baseline proportional hazards model. the marginal log-likelihood value for the model with spatial correlation structure given by σ pol (ρ), equal to -6013.608, is substantially larger than that of the model with spatial correlation structure given by σ exp (ρ), equal to -6038.579. therefore, we proceed with the former model to investigate the spatial correlation structure and the covariate effects.
we also investigate whether the correlation term ρ is significantly different from ∞, i.e. whether the univariate spatially correlated frailty model is a better fit than a univariate frailty model with no spatial correlation. we perform a likelihood-ratio test with null hypothesis "ρ = ∞". we note that the parameter tested here is on the boundary of the parameter space. therefore, following the work of self and liang (1987), the likelihood-ratio test statistic converges asymptotically to a 50:50 mixture of δ 0 and χ 2 (1) distributions if the null hypothesis is correct. with y a random variable from this mixture distribution, the p-value is given by p(y > 7.64) = 0.003, and we can thus reject the null hypothesis at the 1% significance level. the estimation results are presented in the fourth column of the corresponding table, and the estimated baseline hazard with specific parameter estimates is depicted in figure 2. the malaria risk seems to be highest during or just after periods of heavy rainfall. the limited number of observations in the last interval makes the estimate of the last baseline component untrustworthy; we would expect the estimate ĥ 6 to be higher had data been available for a few more weeks. with respect to the spatial correlation structure, the variance estimate σ̂ 2 was equal to 0.43 and ρ̂ was equal to 0.81, with respective standard errors 0.09 and 0.14. in figure 3, we give a graphical representation of how the correlation between the children evolves with the distance between them, as imposed by these estimates. we analyze the malaria data using a marginal model and a shared frailty model with the frailty at the village level. both models are defined such that the baseline hazard function is piecewise constant, with cut-points determined following rainfall data as in the previous section. the results are presented in table 4.
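the reported p-value can be reproduced from the 50:50 mixture of δ 0 and χ 2 (1): for a positive statistic, the mixture survival function is half the χ 2 (1) survival function, which equals erfc(√(x/2)). a minimal sketch:

```python
import math

def boundary_lrt_pvalue(stat):
    """p-value for a likelihood-ratio statistic when the tested parameter
    lies on the boundary (Self and Liang, 1987):
    Y ~ 0.5 * delta_0 + 0.5 * chi2(1), so for stat > 0
    P(Y > stat) = 0.5 * P(chi2_1 > stat) = 0.5 * erfc(sqrt(stat / 2))."""
    if stat <= 0.0:
        return 0.5  # half the mass of the mixture sits at zero
    return 0.5 * math.erfc(math.sqrt(stat / 2.0))

print(round(boundary_lrt_pvalue(7.64), 3))  # 0.003, matching the value reported above
```

without the boundary correction, the naive χ 2 (1) p-value would be twice as large, making the test unnecessarily conservative.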
the marginal model estimates are obtained by fitting the data to a piecewise constant proportional hazards model and then correcting the variances of the parameters using the grouped jackknife method (wu et al. (1986), lipsitz and parzen (1996)). the hazard ratio for the distance-to-dam effect does not differ significantly from 1 in either model. as for children older than 7, the marginal model suggests about 46% higher malaria risk, while the shared frailty model suggests about 38% higher malaria risk. both models found no significant effect for gender or roof. when analyzing time to event data of several subjects, it is of paramount importance to accommodate spatial correlation at the subject level. this is even more crucial for infectious diseases, where transmission occurs at the subject level. the univariate spatially correlated frailty model introduced in this paper is, to the best of our knowledge, the first such model to incorporate spatial correlation at the subject level for time to event data. regarding the estimation of model parameters, the theoretical proof of convergence of the adapted saem-mcmc algorithm and the simulation results for reasonably small datasets, as encountered in practice, demonstrate that proper estimates can be obtained for all model parameters, including those modelling the spatial correlation. previous approaches rather introduce spatial correlation between clusters of subjects. in many circumstances, however, and certainly in the case study on malaria analysed in this paper, the delineations of the clusters are often based on administrative rather than physical boundaries. infectious agents are not halted by such administrative boundaries, but by distance or other physical barriers between subjects. modelling correlation at the subject level allows us to bypass such artificial cluster delimitation choices and instead include accurate physical characteristics between subjects in the model.
moreover, it is demonstrated in the simulation studies that not taking spatial correlation into account has a serious impact on the parameter estimates, leading to serious bias, and should thus not be neglected. furthermore, using the villages as clusters in the marginal and shared frailty models to analyse the malaria data set has a serious impact on some of the parameter estimates. the main practical objective of the malaria study was to establish the association between the distance from the dam and malaria incidence. although no significant effect of distance was found, regardless of which model was used, the direction of the effect changes. in the shared frailty model, the incidence decreases with increasing distance, whereas in the marginal and univariate spatially correlated frailty models the incidence increases with increasing distance from the dam. although the result from the shared frailty model is the more expected one, it has serious problems, as described in getachew et al. (2013). spatial survival analysis is a relatively new research topic that will become more relevant with the increasing availability of geographical data. the recent covid-19 outbreak and various tools based on gps tracking come to mind, where contact tracing is obviously done at a subject level. the univariate spatially correlated frailty model, having shown its benefits for modelling spatially correlated malaria data, is an appropriate model to take into account the data structure of many other infectious diseases.

a appendix a: proof of theorem 1

first, we state the classical assumptions (f3)-(f5), corresponding to the assumptions of delyon et al. (1999), required to prove the almost sure convergence of em-like algorithms: (f3) the function s̄ : θ → s, mapping a parameter value to the corresponding expected sufficient statistic, is continuously differentiable on θ. (f4) the function l : θ → r, defined as the marginal extended log-likelihood, is continuously differentiable on θ. (f5) there exists a function θ̂ : s → θ such that l(θ̂(s), s) ≥ l(θ, s) for all θ ∈ θ; moreover, the function θ̂ is continuously differentiable on s.
we first apply theorem 5.5 of andrieu et al. (2005) to prove the convergence of the sequence (s k ) k , and therefore check the required assumptions. as detailed in andrieu et al. (2005), the drift assumptions (dri) imply assumptions (a2-a3) by proposition 6.1. thus we can apply theorem 5.5 of andrieu et al. (2005). this results in the sequence (s k ) k generated by algorithm a satisfying lim k d(s k , s 0 ) = 0, where s 0 = {s ∈ s, ⟨∇w(s), h(s)⟩ = 0}. we recall that we choose a piecewise baseline function h 0 and we assume that b follows a multivariate normal distribution. in addition to those model hypotheses and assumptions (f1), (f2), assumptions (m1) and (m2) of delyon et al. (1999) hold in our case. it suffices to show that ψ and φ are twice continuously differentiable to satisfy assumption (m2); this is guaranteed by assumptions (f1), (f2), the choice of the frailty distribution and that of the baseline hazard function h 0 . as shown in equation (4), we can write the complete likelihood in exponential form, which fulfils assumption (m1). following the lines of the proof of lemma 2 of delyon et al. (1999), we get that lim k d(θ k , l) = 0. the proof of theorem 1 is therefore complete.

b appendix b

let us first recall the complete log-likelihood of the data, where the cumulative hazard function h 0 (x i ) is determined by the piecewise constant baseline. we differentiate the complete log-likelihood with respect to each parameter to obtain the necessary equations to update the parameter estimates in algorithm a. differentiating the complete log-likelihood with respect to h m , we obtain an analytic expression for the update of h m . differentiating the complete log-likelihood with respect to β, the newton-raphson method is used to update the values of the parameter β. only the last term of the log-likelihood depends on the parameter σ 2 .
differentiating the log-likelihood with respect to σ 2 , we obtain an analytic expression to update the parameter σ 2 . updating the parameter ρ: we now differentiate the log-likelihood with respect to ρ, and the values of the parameter ρ are updated using a gradient descent method. we recall that the frailty b is assumed to follow a multivariate normal distribution parameterized by γ. a numerical approximation of the marginal likelihood is computed based on the parameter estimates θ̂ = ((ĥ m ) 1≤m≤M , β̂, σ̂ 2 , ρ̂). we simulate c independent realizations (b c ) 1≤c≤C following a multivariate normal distribution with mean zero and covariance matrix σ̂ 2 σ(ρ̂). the marginal likelihood is then computed based on a monte carlo sum. the law of large numbers (van der vaart (2000)) ensures that the monte carlo sum converges to the marginal likelihood as c → ∞. the bigger the number of realizations c, the better the quality of the approximation. we compute l̂ marg for increasing values of c; the values obtained are compared and, once they are of the same order, the quality of the approximation is deemed sufficient. we obtain an estimate of the fisher information matrix through the observed fisher information matrix i obs (θ) = −∂ 2 θ log l̂ marg (θ; x, ∆) (andersen et al. (1997)). using louis's missing information principle (louis (1982)), we express the matrix i obs (θ) as i obs (θ) = −e θ [∂ 2 θ log l comp (θ; x, ∆, b) | x, ∆] − cov θ [∂ θ log l comp (θ; x, ∆, b) | x, ∆], where e θ and cov θ denote respectively the expectation and the covariance under the posterior distribution π θ of the frailty. we approximate the quantity i obs (θ) by a monte carlo sum based on the realizations of the markov chain generated in the algorithm, having as stationary distribution the posterior distribution π θ .
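the monte carlo approximation of the marginal likelihood described above can be sketched as follows; cond_lik stands for the conditional likelihood of the data given a frailty vector and is a placeholder for the model-specific expression, not a function defined in the paper.

```python
import numpy as np

def mc_marginal_likelihood(cond_lik, sigma2, Sigma, C, rng):
    """Approximate L_marg = E_b[ L(x, Delta | b) ] by averaging the
    conditional likelihood over C draws b_c ~ N(0, sigma^2 * Sigma(rho));
    by the law of large numbers this converges as C grows."""
    n = Sigma.shape[0]
    draws = rng.multivariate_normal(np.zeros(n), sigma2 * Sigma, size=C)
    return float(np.mean([cond_lik(b) for b in draws]))
```

in practice one recomputes the estimate for increasing values of C and stops once successive values agree to the desired order, mirroring the procedure described above.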
after a burn-in period, we use the remaining l realizations (b l ) 1≤l≤L of the markov chain to compute the monte carlo estimate î l (θ). the ergodic theorem guarantees the convergence of the quantity î l (θ) to the observed fisher information matrix i obs (θ) as l goes to infinity (meyn and tweedie (2012)). in addition to the derivatives calculated to compute the m-step of algorithm a, we also compute second and cross derivatives, such as ∂ 2 log l comp (θ; x, ∆, b) / ∂ρ 2 = −(1/2) tr(−a(ρ) + σ(ρ) −1 ∂ 2 σ(ρ)/∂ρ 2 ). references: construction of bayesian deformable models via a stochastic approximation algorithm: a convergence study; estimation of variance in cox's regression model with shared gamma frailties; stability of stochastic approximation under verifiable conditions; spatial data analysis, annual review of public health; frailty modeling for spatially correlated survival data, with application to infant mortality in minnesota; joint bayesian modeling of time to malaria and mosquito abundance in ethiopia; eha: event history analysis.
r package version 2; convergence of a stochastic approximation version of the em algorithm; coping with time and space in modelling malaria incidence: a comparison of survival and count regression models; an adaptive metropolis algorithm; coupling a stochastic approximation version of em with an mcmc procedure; modeling spatial survival data using semiparametric frailty models; analysis of spatial frailty models by a weighted estimating equation; a jackknife estimator of variance for cox regression for correlated survival data; finding the observed information matrix when using the em algorithm; markov chains and stochastic stability; convergent stochastic algorithm for parameter estimation in frailty models using integrated partial likelihood; maximum likelihood inference for multivariate frailty models using an automated monte carlo em algorithm; asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions; asymptotic statistics; jackknife, bootstrap and other resampling methods in regression analysis; the effect of dams and seasons on malaria incidence and anopheles abundance in ethiopia. key: cord-214774-yro1iw80 authors: srivastava, anuj title: agent-level pandemic simulation (alps) for analyzing effects of lockdown measures date: 2020-04-25 journal: nan doi: nan sha: doc_id: 214774 cord_uid: yro1iw80 this paper develops an agent-level simulation model, termed alps, for simulating the spread of an infectious disease in a confined community. the mechanism of transmission is agent-to-agent contact, using parameters reported for the corona covid-19 pandemic. the main goal of the alps simulation is to analyze the effects of preventive measures -imposition and lifting of lockdown norms- on the rates of infections, fatalities and recoveries. the model assumptions and choices represent a balance between the competing demands of being realistic and being efficient for real-time inferences.
the model provides a quantification of the gains in reducing casualties by the imposition and maintenance of restrictive measures. there is great interest in the statistical modeling and analysis of the medical, economic and epidemiological data resulting from the current covid-19 pandemic. from an epidemiological perspective, as large amounts of infection, containment and recovery data from this pandemic become available over time, the community is currently relying essentially on simulation models to help assess situations and to evaluate options [1]. naturally, simulation systems that follow precise mathematical and statistical models are playing an important role in understanding this dynamic and complex situation [2]. in this paper we develop a mathematical simulation model, termed alps, to replicate the spread of an infectious disease, such as covid-19, in a confined community and to study the influence of some governmental interventions on final outcomes. since alps is purely a simulation model, the underlying assumptions and choices of statistical distributions for random quantities become critical to its success. on one hand, it is important to capture the intricacies of the observed phenomena as closely as possible, using sophisticated modeling tools. on the other hand, it is important to keep the model efficient and tractable by using simplifying assumptions. one can, of course, relax these assumptions and obtain more and more realistic models as desired, but at the cost of increasing computational complexity. a large number of models relating to the spread of epidemics through human contact or otherwise have been proposed in the literature. they can be broadly categorized into two main classes (a more detailed taxonomy of simulation models can be found in [3]): 1.
population-level coarse modeling: a large number of epidemiological models have focused on population-level variables -counts of infected (i), susceptible (s), removed or recovered (r), etc. the most popular model of this type is the susceptible-infected-removed (sir) model [4] proposed by kermack and mckendrick in 1927. this model uses ordinary differential equations to model the constrained growth of the counts in these three categories: ds/dt = −β i(t) s(t), di/dt = β i(t) s(t) − γ i(t), dr/dt = γ i(t). the two parameters β and γ control the dynamics of infections, and the condition ds/dt + di/dt + dr/dt = 0 ensures constancy of the community size. a number of other papers have studied variants of these models and have adapted them for different epidemics, such as ebola and sars [5]. while there are spatial versions of sir models, they are usually limited in their modeling of spatial dynamics. they typically use a uniform static grid to represent the spread of infections, from a location to its neighbors, over time. in general these models do not explicitly model people dynamics as residents move around in a community. several recent simulation models, focusing directly on the covid-19 illness, also rely on such coarser community-level models [6]. 2. agent-level modeling: while population-level dynamical evolutions of population variables are simple and very effective for overall assessment, they do not explicitly take into account social dynamics, human behavior, government-mandated restrictions, and the complexities of human interactions. the models that study these human-level factors and variables, while tracking disease at an individual level, are called agent-level models [7]. here one models the mobility, health status, and interactions of individual subjects (agents) in order to construct an overall population-level picture in a bottom-up way.
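the sir system above can be integrated numerically with a simple forward-euler scheme; a minimal sketch, in which the step size and the rates used in the usage note are illustrative rather than the parameters of any particular study:

```python
def sir_euler(beta, gamma, s0, i0, r0, dt=0.01, T=100.0):
    """Forward-Euler integration of the Kermack-McKendrick SIR system:
    ds/dt = -beta*i*s,  di/dt = beta*i*s - gamma*i,  dr/dt = gamma*i.
    Since the three derivatives sum to zero, s + i + r stays constant,
    reflecting the fixed community size."""
    s, i, r = s0, i0, r0
    for _ in range(int(T / dt)):
        ds = -beta * i * s
        di = beta * i * s - gamma * i
        dr = gamma * i
        s, i, r = s + dt * ds, i + dt * di, r + dt * dr
    return s, i, r
```

for example, with beta = 0.5 and gamma = 0.1 (a basic reproductive rate of 5) and 1% initially infected, the susceptible fraction is rapidly depleted, illustrating the exponential growth phase mentioned in the introduction.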
the advantages of agent-based models are that they are more detailed and that one can vary the parameters of restriction measures, such as social distancing, at a granular level to infer overall outcomes. agent-based models have been discussed in several papers, including [3, 8-10]. the importance of simulation-based analysis of epidemic spread is emphasized in [11], but with a focus on within-host infection models; some aspects of disease spread over contact networks are also discussed there. hunter et al. [10] construct a detailed agent-based model for the spread of infectious diseases, taking into account population demographics and other social conditions, but they do not consider countermeasures such as lockdowns in their simulations. a broad organization of different agent-based simulation methods has been presented in [3]. there are numerous other papers on agent-based simulation of infection spread that are not referenced here. the main distinction of the current paper from the past literature is its focus on agent-level transmission of infections, and on the influence of social dynamics and lockdown-type restrictions on these transmissions. in this paper we assume a closed community with the infection started by a single agent at the initial time. infections are transmitted through physical exposure of susceptible agents to infected agents. infected agents go through a period of sickness with two eventual outcomes: full recovery for most and death for a small fraction. once recovered, an agent can no longer be infected. the social dynamical model used here is based on fixed domicile, i.e., each agent has a fixed housing unit. under unrestricted conditions, or no lockdown, the agents are free to move over the full domain using a simple motion model. these motions are independent across agents and encourage smooth paths.
under lockdown conditions, most of the agents head directly to their housing units and generally stay there during that period. a very small fraction of agents are allowed to move freely under the restrictions. the rest of this paper is organized as follows. section 2 develops the proposed agent-level pandemic simulation (alps) model, specifying the underlying assumptions and motivating the model choices. it also discusses choices of model parameters and presents a validation using comparisons with the sir model. section 3 presents some illustrative examples and discusses the computational complexity of alps. the use of alps in understanding the influence of countermeasures is presented in section 4. the paper ends by discussing model limitations and suggesting some future directions. in this section we develop our simulation model for agent-level interactions and the spread of infections across a population in a well-defined geographical domain. in terms of model design, there are competing requirements for such a simulation to be useful. our main considerations in the design of alps are as follows. on one hand, we want to capture detailed properties of agents and their pertinent environments so as to render a realistic model of pandemic evolution with or without countermeasures. on the other hand, we want to keep model complexity reasonably low, in order to utilize it for analysis under variable conditions and countermeasures. also, to obtain statistical summaries of pandemic conditions under different scenarios, we want to run a large number of simulations and compute averages. this requires keeping the overall model tractable from a computational perspective, to allow for multiple runs of alps in a short time. the overall setting of the simulation model is the following. we assume that the community is based in a square geographical region d with h household units arranged in a uniformly-spaced square grid.
we assume that there are n total agents in the community and that the configuration updates every unit interval (hour), counted by the variable t. the agents are fully mobile to traverse all of d when unconstrained, but are largely restricted to their home units under restrictions. next, we specify the assumptions/models being used at the moment. • independent agents: each agent (person) has an independent motion model and independent infection probability. the actual infection event is of course dependent on being in close proximity to an infected carrier (within a certain distance, say ≈ 6 feet) for a certain exposure time. but the probability of an agent being infected is independent of such events for other persons. • full mobility in absence of restrictive measures: we assume that each agent is fully mobile and moves across the domain freely when no restrictive measures are imposed. in other words, there are no agents with restricted mobility due to age or health. also, we do not impose any day/night schedules on the motions; we simply follow a fixed motion model at all times. some papers, including [12], provide two- or three-state models where the agents transition between some stable states (home, workplace, shopping, etc.) in a predetermined manner. • homestay during restrictive measures: we assume that most agents stay at home at all times under the restrictive conditions. only a small percentage (set as a parameter ρ_0) of the population is allowed to move freely; the large majority stays at home. • sealed region boundaries: in order to avoid the complications of introducing a transportation model into the system, we assume that there is no transfer of agents into or out of the region d. the region is modeled to have reflecting boundaries to ensure that all the citizens stay within the region. the only way the population of d changes is through the death of agents.
• fixed domicile: the whole community is divided into a certain number of living units (households/buildings). these units are placed in square blocks with uniform spacing. each agent has a fixed domicile at one of the units. during a lockdown period, the agents proceed to and stay at home with a certain fidelity. we assume that all agents within a unit are exposed to each other, i.e., they are in close proximity and can potentially infect one another. • no re-infection: we assume that once a person has recovered from the disease, he/she cannot be infected again for the remaining observation period. while this is an important unresolved issue for the current covid-19 infections, it has been a valid assumption for past coronavirus infections. • single patient ground zero: the infection is introduced into the population by a single carrier, termed patient ground zero, at time t = 0. this patient is selected randomly from the population, and the time variable is defined relative to this introduction event. • constant immunity level: the probability of infection of an agent, under the exposure conditions, remains the same over time. we do not assume any increase or decrease in agent immunity over time. also, we do not assign any age or ethnicity to the agents, and all agents are assumed to have equal immunity levels. there are several parts of the model that require individual specification. these parts include a model of the dynamics of individual agents (with or without restrictions in place), the mechanism for transmitting infections from agent to agent, and the process of recovery and fatality for infected agents. a full listing of the model parameters and some typical values are given in table 1 in the appendix. • motion model: the movement of a subject follows a simple model where the instantaneous velocity v_i(t) is a weighted sum of three components: (1) the velocity at the previous time, i.e.
v_i(t − 1), (2) a directed component guiding the agent home, α(h_i − x_i(t − 1)), and (3) an independent gaussian increment σw_i(t), with w_i(t) ∼ n(0, 1). note that the motions of different agents are kept independent of each other. the location h_i ∈ d denotes the home unit (or stable state) of the i-th agent. the instantaneous position x_i(t) and velocity v_i(t) of the i-th agent are obtained by combining these three components; here α ∈ R+ determines how fast one moves towards home, and µ quantifies the degree to which one follows the directive to stay home. if µ = 0, then the agent reaches home and stays there, except for the random component w_i. however, if µ = 0.5, then a significant fraction of the motion represents continuity irrespective of the home location. the value µ = 1 implies that either there is no restriction in place or the agent does not follow the directive. reflecting boundary: when a subject reaches the boundary of the domain d, the motion is reflected and continues in the opposite direction. fig. 1 shows examples of random agent motions under different simulation conditions. the leftmost case shows no lockdown, with agents moving freely throughout. the middle case shows restrictions imposed on day 10 and kept in place after that. the last plot shows a lockdown imposed on day 10 and then lifted on day 20. • restrictions (or lockdown) model: once the restriction period starts, at time t_0, all agents are directed towards their homes and asked to stay there. we assume that ρ% of the subjects follow this directive while the others ((100 − ρ)%) follow a different motion model. the variable ρ changes over time according to a prescribed schedule. we note that the agents who do not follow restrictions follow the prescribed motion model with µ = 1. • exposure-infection model: the event of infection of an agent depends on the level of exposure to another infected person.
this process is controlled by the following parameters: - the distance between the subject and the infected person should be less than r_0. - the amount of exposure, in terms of the number of time units, should be at least τ_0. at the moment we use the cumulative exposure over the whole history, rather than just the recent history. - given that these conditions are satisfied, the probability of catching the disease at each time t is an independent bernoulli random variable with probability p_i. • recovery-death model: once a subject is infected, we randomly associate the nature of the illness immediately: either the infected agent is going to recover (non-fatal type, or nft) or the person is eventually going to die (fatal type, or ft). the probability of a fatal infection, given that a person is infected, is a fixed model parameter. - recovery: a subject with a non-fatal type (nft) is sick for a period of t_r days. after this period, the person can recover at any time according to a bernoulli random variable with probability p_r. - fatality: a subject with a fatal type (ft) is sick for a period of t_d days. after this period, the person can die at any time, independently, according to a bernoulli random variable with probability p_d. a complete listing of the alps model parameters is provided in table 1 in the appendix. in this section, we motivate the values chosen for those parameters in these simulations; they are informed by current reports on the coronavirus pandemic. • we have used a square domain of size 2 miles × 2 miles for a community with a population of n agents. for n = 900, the model represents a population density of 225 people per square mile. the community contains h living units (buildings) with a domicile of n/h people per unit. when n/h is high, a unit represents a tall building in a metropolitan area; when n/h is small, a unit represents a single-family home in a suburban area.
the time unit for updating configurations is one hour, and the occurrence of major events is specified in days. for example, the lockdown can start on day 1 and end on day 60. • the standard deviation for accelerations in agent mobility is approximately 1-5 feet/hour (fph). through integration over time, this results in agent speeds of up to 1000 fph. we assume that ρ_0 = 0.98, i.e., 98% of the people follow the restriction directives. • the physical distance between agents needed to catch the infection should be at most r_0 ≈ 6 ft, and the exposure time should be at least τ_0 = 5 time units (hours). the probability p_i of getting infected, under the right exposure conditions, is set at 5% at each time unit (hour), independently. • once infected, the probability of a fatal outcome is set at 5-10%. the period of recovery for agents with non-fatal outcomes starts at 7 days; the probability of reaching full recovery for those agents is p_r = 0.001 at each subsequent time unit (hour). similarly, for agents with fatal outcomes, the period of being infected is set at 7 days, after which the probability of death at each time unit (hour) is set at p_d = 0.1. although alps is perhaps too simple a model to capture the intricate dynamics of an active society, it does provide an efficient tool for analyzing the effects of countermeasures during the spread of a pandemic. before it can be used in practice, there is an important need to validate it. as described in [3], there are several ways to validate a simulation model. one is to use real data (an observed census of infections over time) in a community to estimate model parameters, followed by statistical model testing. while such agent-level data may emerge for public use in the future (especially with the advent of tracking apps being deployed in many countries), none is currently available for covid-19.
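a minimal sketch of one agent's hourly update under the rules and parameter values above; the exact weighting of the three velocity components is not displayed in this text, so the µ-weighted form below (previous velocity weighted by µ, homeward pull active under lockdown) is an assumption:

```python
import random

def step_position(x, v, home, mu, alpha=0.1, sigma=1.0, domain=(0.0, 10560.0)):
    """One hourly update of an agent's 1-d position/velocity in feet;
    the domain defaults to a 2-mile side. mu = 1: free motion; mu = 0:
    the agent is pulled toward its home unit and stays there."""
    v = mu * v + (1.0 - mu) * alpha * (home - x) + sigma * random.gauss(0.0, 1.0)
    x += v
    lo, hi = domain
    if x < lo:            # reflecting boundary: fold back, reverse velocity
        x, v = 2 * lo - x, -v
    elif x > hi:
        x, v = 2 * hi - x, -v
    return x, v

def infection_trial(distance, exposure_hours, r0=6.0, tau0=5, p_i=0.05):
    """Bernoulli infection trial, fired only when both the proximity and
    cumulative-exposure conditions hold."""
    if distance >= r0 or exposure_hours < tau0:
        return False
    return random.random() < p_i

def illness_duration(fatal, t_r=7, t_d=7, p_r=0.001, p_d=0.1):
    """Hours until recovery (nft) or death (ft): a fixed sick period
    followed by an independent hourly Bernoulli trial."""
    days, p = (t_d, p_d) if fatal else (t_r, p_r)
    hours = days * 24
    while random.random() >= p:
        hours += 1
    return hours

random.seed(0)
x, v = 9000.0, 0.0
for _ in range(300):      # under lockdown (mu = 0) the agent drifts home
    x, v = step_position(x, v, home=500.0, mu=0.0)
```

the infection and recovery helpers are per-agent, per-hour trials; a full simulation would loop them over all agent pairs in proximity and over all sick agents at each time unit.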
the other approach to validation is to consider coarse population-level variables and their dynamics, and compare them against established models such as sir and its variations. we take this latter approach. fig. 2 shows plots of the evolution of global infection counts (susceptible, infected, recovered) in a community under the well-known sir model (on the left) and the proposed alps model (on the right). in the alps model the counts for recovered and fatalities are kept separate, while in the sir model these two categories are combined. one can see a remarkable similarity in the shapes of the corresponding curves, and this provides a certain validation of the alps model. in fact, given the dynamical models of agent-level mobility and infection, one can in principle derive the parameters of the population-level differential equations used in the sir model. we have left that for future developments. we illustrate the use of the alps model by showing its outcomes under a few typical scenarios. further, we discuss the computational cost of running alps on a regular laptop computer. we start by showing outputs of alps under some interesting settings. in these examples, we use a relatively small number of agents (n = 900) and household units (h = 9), with t = 100 days, in order to improve the visibility of the displays. fig. 3 shows a sequence of temporal snapshots representing the community at different times over the observation period. in this example, the population is fully mobile over the observation period and no social distancing restrictions are imposed. the snapshots show the situations at 10, 20, 50, and 100 days. the corresponding time evolution of the global count measures is shown in the bottom right panel. the infection starts to spread rapidly around the 10th day and reaches a peak infection level of 81% around day 35. then the recovery starts and continues until very few infected people are left. in this simulation, the fraction of fatalities is found to be 11%.
in the second example, the restrictions are introduced on day 5 after the infection and stay in place thereafter. the results are shown in fig. 4. the bottom right panel shows temporal evolutions of the population-level infection counts: susceptible (blue curve), infected (red), recovered (green), and deceased (magenta). as the picture shows, the infections start growing initially, but the gains of the lockdown measures start appearing around day 15: it takes about 10-12 days for the restrictions to show results. the subsequent bumps in the infected counts are due to new infections transmitted by roaming agents. in this run of alps, we see an overall fatality rate of 3% and an uninfected population of 67%. in the third example, with results shown in fig. 5, the restrictions are introduced on day 5 after the infection and lifted on day 30, so the restriction period of 25 days is surrounded by unrestricted mobility on both sides. as the plot of global variables indicates, the early restrictions help reduce the infection rates, but these gains are lost soon after the lifting of the restrictions. the percentage of infected people goes back up, and the rate of fatalities resembles the unrestricted situation in example 1. since the computational efficiency of the simulator is of vital importance, we next study the computational cost of running alps for different variable sizes. from these experiments we see that the computational cost is linear in t with slope 1. for a change in n, the number of agents, with other variables kept fixed, the change in the computational cost is also linear; however, the rate of change is 2 for smaller values and increases to 4 for larger values. the changes in computational cost due to changes in the number of households, with other variables held fixed, are minimal. there are several ways to utilize this model for prediction, planning, and decision making. we illustrate some of these ideas using examples.
at first we show individual simulations under different scenarios, and then present results on average behavior obtained using hundreds of simulations. in these illustrations, we have used n = 972 agents and h = 81 households. effect of timing of imposition of restrictions: fig. 6 shows some examples of alps outputs when a lockdown is imposed on the community at different times. from top-left to bottom-right, the plots show lockdowns starting on day 1, day 5, day 10, and day 20, respectively. once the restrictions are imposed, they are not removed in these examples. the best results are obtained for the earliest imposition of restrictions. in the top-left case, the peak infection rises to 22% of the community, on day 18, and then comes down steadily. the fraction of fatalities is 2% and the fraction of the community never infected is 77%. in case the restrictions are imposed on day 5, with all other parameters held the same, there is a small change in the situation: the peak infection rises to 55%, the fatalities increase to 7%, and the fraction of uninfected goes down to 40%. we can see that an early imposition of lockdown measures also helps reduce peak infection rates in the community. sometimes we notice a saw-tooth shape in the curve for infected people. this implies that even the small fraction of mobile agents can break through and spread infections to other home units despite full restrictions being in place. this saw-tooth shape underscores the need for severely limiting mobility: even a small fraction of the population being mobile can spread infections to the immobile agents. the bottom two panels in fig. 6 show results for a delayed lockdown, with the restrictions being imposed on day 20 and day 30. one can see that the peak infection rate becomes quite high (82-84%) and casualties mount to 10-11%. the fraction of uninfected population falls to 0% in these cases.
this shows that, under the chosen parameter settings, day 20 is quite late for imposing lockdown conditions on the community, and the results are very similar to those of any later imposition. if the restrictions are imposed after 15-20 days, then there are no uninfected people left in the community in a typical run of alps. effect of timing of removal of restrictions: in the next set of simulations, we study the effects of lifting restrictions and thus re-allowing full mobility in the community. some sample results are shown in fig. 7; each plot shows the evolution for a different starting time t_0 and end time t_1. fig. 8 shows an example of the configuration with 25 households and 200 agents in the community. we vary the start time t_0 (start day of restrictions) from 1 to 30 and then to 150, and study the resulting outcomes using 100 runs of alps. (the value t_0 = 150 implies that the restrictions are never imposed in that setting.) the remaining panels in fig. 8 show box plots of the three variables changing with t_0. • death rate: the top right panel shows the percentage of fatalities increasing from around 1% to almost 11% as t_0 changes from 1 to 30. the largest rate of increase is observed when t_0 is between 10 and 30 days. • number of uninfected: in the bottom left panel we see a decrease in the uninfected population from around 90% to 0% as t_0 increases. if the restrictions are not imposed in the first 30 days after the first infection, there is no agent left uninfected in the community. • peak infection rate: if the restrictions are imposed on the first day after the infection, the peak infection rate is contained to 10%. as t_0 is increased and the restrictions are delayed, the peak infection rate rises to almost 80% of the community. in fig. 9 we study the effect of changing t_1 while t_0 = 1 is kept fixed (other experimental conditions being the same as in the last experiment).
the results show that there is no difference in the eventual number of deaths and the peak infection rates when t_1 is changed from 10 to 40, because agent immunity and the other infection parameters remain unchanged across these settings. the strengths and limitations of the alps model are the following. alps provides an efficient yet comprehensive model of the spread of infections in a self-contained community, using simple model assumptions. the model can prove very useful in evaluating the costs and effects of imposing social lockdown measures in a society. in the current version, the initial placement of agents is normally distributed, with means given by their home units and a fixed variance. this variance is kept large to allow for near-arbitrary placements of agents in the community. in practice, however, agents typically follow semi-rigid daily schedules of being at work, performing chores, or being at home. thus, at the time of imposition of a lockdown, the agents could be placed in the scene according to their regular schedules rather than arbitrarily. in terms of future directions, there are many ways to develop this simulation model to capture more realistic scenarios: (1) it is possible to model multiple, interacting communities instead of a single isolated community. (2) one can include typical daily schedules for agents in the simulations; a typical agent may leave home in the morning, spend time in the office during the day, and return home in the evening. (3) it is possible to give the community an age demographic and assign immunity to agents according to their demographic labels [13]. (4) as more data becomes available in the future, one can change the immunity levels of agents over time according to the spread and the seasons. (5) in practice, when an agent is infected, he/she goes through different stages of the disease, associated with varying degrees of mobility [9].
one can introduce an additional variable to track these stages of infection in the model and change agent mobility accordingly. this paper develops an agent-based simulation model, called alps, for modeling the spread of an infectious disease in a closed community. a number of simplifying and reasonable assumptions make alps efficient and effective for statistical analysis. the model is validated at the population level by comparison with the popular sir model in epidemiology. these results indicate that: (1) early imposition of lockdown measures (right after the first infection) significantly reduces infection rates and fatalities; (2) lifting of lockdown measures recommences the spread of the disease, and the infections eventually reach the same level as in the unrestricted community; (3) in the absence of any extraneous solutions (a medical treatment/cure, a weakening mutation of the virus, or a natural development of agent immunity), the only viable option for preventing large infections is the judicious use of lockdown measures.

references: [1] special report: the simulations driving the world's response to covid-19. how epidemiologists rushed to model the coronavirus pandemic. [2] flute, a publicly available stochastic influenza epidemic simulation model. [3] a taxonomy for agent-based models in human infectious disease epidemiology. [4] a contribution to the mathematical theory of epidemics. [5] population-based simulations of influenza pandemics: validity and significance for public health policy. [6] estimates of the severity of coronavirus disease 2019: a model-based analysis. [7] agent-based models. [8] growing artificial societies: social science from the bottom up. [9] an agent-based approach for modeling dynamics of contagious disease spread. international journal of health geographics. [10] an open-data-driven agent-based model to simulate infectious disease outbreaks. [11] high-resolution epidemic simulation using within-host infection and contact data. [12] an agent-based spatially explicit epidemiological model in mason. [13] modelling transmission and control of the covid-19 pandemic in australia.

this research was supported in part by grants nsf dms-1621787 and nsf dms-1953087. a listing of alps parameters: table 1 provides a listing of all the parameters one can adjust in alps to achieve different scenarios, along with the typical values used in the experiments presented in the paper.

key: cord-030683-xe9bn1cc authors: wang, wenxi; usman, muhammad; almaawi, alyas; wang, kaiyuan; meel, kuldeep s.; khurshid, sarfraz title: a study of symmetry breaking predicates and model counting date: 2020-03-13 journal: tools and algorithms for the construction and analysis of systems doi: 10.1007/978-3-030-45190-5_7 sha: doc_id: 30683 cord_uid: xe9bn1cc propositional model counting is a classic problem that has recently witnessed many technical advances and novel applications. while the basic model counting problem requires computing the number of all solutions to the given formula, in some important application scenarios the desired count is not of all solutions, but instead of all unique solutions up to isomorphism. in such a scenario, the user herself must either use the full count that the model counter returns to compute the count up to isomorphism, or ensure that the input formula to the model counter adequately captures the symmetry breaking predicates so it can directly report the count she desires. we study the use of cnf-level and domain-level symmetry breaking predicates in the context of the state of the art in model counting, specifically the leading approximate model counter approxmc and the recently introduced exact model counter projmc.
as benchmarks, we use a range of problems, including structurally complex specifications of software systems and constraint satisfaction problems. the results show that while it is sometimes feasible to compute the model counts up to isomorphism using the full counts that are computed by the model counters, doing so suffers from poor scalability. the addition of symmetry breaking predicates substantially assists model counters. domain-specific predicates are particularly useful, and in many cases can provide full symmetry breaking to enable highly efficient model counting up to isomorphism. we hope our study motivates new research on designing model counters that directly account for symmetries to facilitate further applications of model counting. propositional model counting is the classic problem of counting the number of all solutions for the given formula in propositional logic. while the core problem is an integral part of complexity theory literature, advances in propositional satisfiability (sat) solvers and other decision procedures in the last decade have led to much progress in tackling this problem in innovative ways [7, 9, 10, 15, 17, 31, 39, 40, 47, 49, 50, 56, 64] . these advances have fueled the application of model counters in various software verification and reliability domains, e.g., to perform probabilistic analyses [13, 26, 28] , check and repair string manipulation code [9, 41] , and estimate information leakage using quantified information flow [19, 44] . while the basic model counting problem requires computing the number of all solutions, in some important application scenarios, the desired count is not of all solutions, but instead, of all unique solutions up to isomorphism, i.e., non-isomorphic (also called non-symmetric) solutions. 
for example, consider the context of software reliability analysis [26], where a goal is to find the number of inputs that can lead to an assertion violation, or bounded exhaustive testing [14, 42, 62, 68], where the goal is to estimate the total number of inputs that exist for a certain bound on the input size, in order to decide which bound stays within the testing budget. the desired counts in these cases are of non-isomorphic inputs: two isomorphic (and possibly non-identical) inputs produce the same output [66], so they are equivalent with respect to the behaviors a program can exhibit. as another example, consider computing the number of solutions to a constraint satisfaction problem (csp) [45], e.g., the number of unique ways 8 queens can be arranged on a fixed chess board such that no queen is under attack [6]. once again, one is typically interested in the number of non-symmetric solutions, because the indistinguishability of the queens implies that a user does not consider two solutions obtained by swapping positions of queens to be unique. in such scenarios, the user has two basic options. one option is to compute the full count using the model counter, and then use mathematical reasoning about symmetries to project the full count onto the desired count. doing so is straightforward in some cases: e.g., if each solution consists of n indistinguishable objects of the same type and the composition of each solution implies that each permutation of those n objects leads to a distinct (albeit isomorphic) solution, dividing the full count by n! gives the count of non-isomorphic solutions. doing so is, however, not always easy, for example when different solutions have different numbers of objects that can be permuted to form non-identical solutions.
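the n! projection just described can be captured in a small helper (this is our illustration, not part of any counter mentioned here); it also guards against the case where the symmetry orbits are not all of full size, in which the simple division is invalid:

```python
from math import factorial

def non_isomorphic_count(full_count, n):
    """Project a full model count to the count up to isomorphism, assuming
    every solution's n indistinguishable objects form a full orbit of n!
    distinct permutations."""
    orbit = factorial(n)
    if full_count % orbit != 0:
        raise ValueError("full count not divisible by n!: orbits are not "
                         "all full, so the simple projection is invalid")
    return full_count // orbit

# the 8-queens problem with labeled (distinguishable) queens has 92 * 8!
# solutions; up to isomorphism there are 92
labeled_total = 92 * factorial(8)
```

for instance, `non_isomorphic_count(labeled_total, 8)` recovers the familiar count of 92 distinct 8-queens arrangements.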
the other option is to ensure that the formula input to the model counter includes symmetry breaking predicates [20, 21], i.e., additional constraints that allow only canonical solutions from each isomorphism class, so the model counter can report the desired count directly. symmetry breaking predicates can be added using three basic approaches [29]. perhaps the most common approach is to add them at the cnf level using an off-the-shelf tool [8, 23], which takes as input a cnf formula and creates symmetry breaking predicates for it. another common approach is to create them at the problem-domain level using a domain-specific tool [58], and then translate the formula and predicates together to cnf. a third approach is to add them manually at the problem-domain level [38, 59], and then translate to cnf. a goal of our work is to study the best way to add symmetry breaking predicates (if at all) to obtain precise counts of non-isomorphic solutions. we conduct the study in the context of the state of the art in model counting, specifically the leading approximate model counter approxmc [16, 17, 52] and the recently introduced exact model counter projmc [40]. approxmc and projmc embody very different algorithms for model counting and provide us with a diverse set of tools for the study. approxmc employs novel approximation methods to efficiently predict highly accurate model counts with formal guarantees, and is now in its third generation (called approxmc3 [52]). projmc uses a recursive algorithm and employs a disjunctive decomposition method together with a search for disjoint components, and has just had its first public release. as benchmark formulas, we use a range of problems, including structurally complex specifications of software systems [34] and constraint satisfaction problems [45]. to create the benchmark formulas, we employ the alloy toolset [34] and its kodkod backend [58].
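as a brute-force illustration of what a symmetry breaking predicate does (this is not how breakid or alloy operate internally), the following keeps only the lex-leader of each orbit, in the spirit of crawford-style predicates [20], for a toy formula whose symmetry group is the full permutation group on its variables:

```python
from itertools import product, permutations

def models(n, formula):
    """All satisfying assignments over n boolean variables."""
    return [a for a in product([0, 1], repeat=n) if formula(a)]

def is_lex_leader(a, perms):
    """Symmetry breaking predicate: accept an assignment only if it is
    lexicographically >= every image of itself under the symmetry group."""
    return all(a >= tuple(a[p[i]] for i in range(len(a))) for p in perms)

# toy formula: exactly two of four interchangeable variables are true;
# all 6 solutions lie in a single orbit under the symmetry group S4
formula = lambda a: sum(a) == 2
perms = list(permutations(range(4)))
full = models(4, formula)
canonical = [a for a in full if is_lex_leader(a, perms)]
```

the unconstrained count is 6, while the count with the symmetry breaking predicate applied is 1: only the canonical solution (1, 1, 0, 0) survives, which is exactly the count up to isomorphism.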
alloy allows writing formulas in relational first order logic with transitive closure, and has been used in academia and industry for design and specification of systems [11, 18, 35, 37, 65, 67, 70] as well as for various forms of analyses of code [27, 32, 36, 42, 48, 69] . the alloy analyzer translates alloy formulas with respect to a scope, i.e., bound on the universe of discourse, into propositional logic to create cnf problems that are solved using off-the-shelf sat solvers [25] . alloy supports fully automatic (partial) symmetry breaking at the level of alloy specifications [51, 57] by adapting crawford's symmetry breaking predicates [20] , which are statically added to the formula before the solvers solve it. alloy provides an ideal vehicle for evaluating the different approaches to symmetry breaking that are our focus in this study. similar to other techniques that use cnf-based backends, the alloy analyzer translates problems from a higher-level (alloy) to a lower-level (cnf). this translation often introduces new boolean variables in the resulting formula, which are not essential for creating the cnf formula but are required for a compact (feasible) encoding in cnf [60] . as a result, the translated formula is equisatisfiable to the original formula but may not be equivalent to it, and hence it may be the case that the model count for the cnf formula is very different from the original formula. several modern model counters [16, 40, 50] readily handle this case by providing support for projected model counting [10] , i.e., computing the model count with respect to a subset of all the variables. for alloy, the subset is the primary variables, i.e., all boolean variables that directly correspond to the variables in the alloy specification. 
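projected model counting can be illustrated by brute force on a tiny hypothetical formula; the clause encoding and variable names below are our own (DIMACS-style integer literals), not alloy's or the counters':

```python
from itertools import product

def projected_count(clauses, num_vars, sampling_set):
    # Brute-force projected model count: the number of distinct projections
    # of satisfying assignments onto the sampling set. Clauses use DIMACS
    # conventions: a positive int is a variable, a negative int its negation.
    projections = set()
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            projections.add(tuple(bits[v - 1] for v in sorted(sampling_set)))
    return len(projections)

# phi = (x1 OR x2): three satisfying assignments over {x1, x2}, but only
# two distinct projections onto the "primary" variable x1.
print(projected_count([[1, 2]], 2, {1, 2}))  # 3  (full model count)
print(projected_count([[1, 2]], 2, {1}))     # 2  (projected model count)
```

this mirrors the alloy situation: if x2 were an auxiliary variable introduced by the cnf translation, the projected count over the primary variable would be the meaningful one.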
for each benchmark formula f, we create three model counting problems using automatic tools: 1) f with no symmetry breaking, which we create by setting alloy's default symmetry breaking to off; 2) f with symmetry breaking predicates added at the problem domain level, which we create by having alloy's default symmetry breaking turned on; and 3) f with symmetry breaking predicates added at the cnf level, which we create by first using alloy to create a cnf formula with no domain-level symmetry breaking, and then using the breakid [23] tool to add cnf-level symmetry breaking predicates using its default settings. in addition, for select benchmarks we create formulas with manually added domain-specific symmetry breaking predicates, which we write in alloy following previous work [38]. the results show that while it is sometimes feasible to compute the model counts up to isomorphism using the full counts that are computed by the model counters, doing so suffers from poor scalability. the addition of symmetry breaking predicates substantially assists model counters, even though symmetry breaking is already a well-known technique in sat solving, supported by theoretical findings [46, 61]. domain-specific predicates are particularly effective, and in many cases can provide full symmetry breaking to enable highly efficient model counting up to isomorphism. we were surprised by the extent of the impact. since the addition of symmetry breaking predicates introduces new dependencies among the variables, we expected these dependencies to make the formula more complex and perhaps less amenable to efficient model counting. however, the sheer reduction in the number of solutions caused by symmetry breaking more than compensates for the additional logical complexity of the formula. 
in cases where it was possible to create full symmetry breaking predicates, the model count for the formula with the predicates was computed up to a few orders of magnitude faster than for the formula with no symmetry breaking predicates. a key lesson of our study (in the context of the model counting problems considered) is: if non-isomorphic solution counts are desired, use full symmetry breaking predicates at the domain level whenever feasible, even if it is straightforward to compute the number of non-isomorphic solutions from the number of all solutions, or even if the symmetry breaking constraints have to be written manually. this paper makes the following contributions: -study. to the best of our knowledge, we present the first study of symmetry breaking in the context of model counting. as pointed out earlier, there is a tradeoff between the reduction of the solution space and the likely increase in complexity due to added symmetry breaking predicates. whereas in prior work the benefits of symmetry breaking in sat solving were typically observed for unsatisfiable problems [43], our study shows the importance of symmetry breaking, and its deep relation to problem formulation, in the context of satisfiable problems, albeit for model counting. -dataset. all cnf files we used for the experiments are being made publicly available: https://github.com/wenxiwang/tacas2020. we expect the dataset to be useful for future work on evaluating the performance of different model counters, and of the different strategies they employ, as well as for evaluating model enumeration tools. we believe there is an important bi-directional relation between symmetry breaking and model counting, in that: 1) in one direction, model counters directly support computing the counts for non-isomorphic solutions to facilitate applications that so require; and 2) in the other direction, symmetry breaking helps model counters become more efficient. 
we hope our study motivates future work that further investigates this relation. this section provides two illustrative examples that require computing the number of unique solutions up to isomorphism. we specify the examples in the alloy language, which allows us to explore different approaches for applying symmetry breaking. we provide intuitive descriptions of alloy constructs as we introduce them; further details can be found elsewhere [34]. the first example illustrates a csp problem [45] where alloy's default symmetry breaking provides full symmetry breaking; we use approxmc to solve this problem (section 2.1). the second example illustrates a software testing problem [42] where manually written symmetry breaking predicates provide full symmetry breaking; we use projmc to solve this problem (section 2.2). section 5 presents a detailed experimental evaluation where we use the two tools against many additional benchmarks. consider specifying the well-known n-queens problem of placing n interchangeable queens on a fixed n×n chess-board, and computing the number of solutions to the problem using a modern propositional model counter [16, 40, 50]. figure 1 shows a fragment of an alloy specification of the n-queens problem, which has been studied before using alloy [2, 4, 55]. the keyword sig introduces a set of (interchangeable) atoms. the keyword one makes the set a singleton. the field state introduces a quaternary relation of type "board x queen x int x int" where int is a built-in type that represents integers. the fact stateokay describes the basic constraints for the state of the board to be valid; the fact contains 4 sub-formulas that are implicitly conjoined; each of them uses universal quantification (all); the keyword disj constrains the quantified variables to represent distinct values. the dot operator ('.') is relational join [34]. 
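as a sanity check that is independent of alloy and the model counters, the non-isomorphic n-queens count can be computed by brute force (one queen per row and per column, no two queens on a shared diagonal); this is a plain python sketch of our own, not part of the tooling described in the text:

```python
from itertools import permutations

def n_queens_count(n):
    # Count placements of n interchangeable queens: cols[r] is the column
    # of the queen in row r (a permutation covers the row/column
    # constraints), and the filter rejects shared diagonals.
    count = 0
    for cols in permutations(range(n)):
        if all(abs(cols[i] - cols[j]) != j - i
               for i in range(n) for j in range(i + 1, n)):
            count += 1
    return count

print(n_queens_count(7))  # -> 40 non-isomorphic solutions for the 7x7 board
```

the value 40 for the 7×7 board matches the exact count quoted in the discussion of table 1.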
a predicate (pred) is a parameterized formula that can be invoked elsewhere; likewise, a fun is a parameterized expression. the predicate nqueensproblem represents the overall specification of the n-queens constraints. any model of the alloy specification must satisfy the constraints in all the facts and any predicates that are invoked (directly or transitively). the alloy user writes a command and executes it to solve desired constraints. for example, "run nqueensproblem for 5 int, exactly 8 queen" asks the analyzer to find a solution to the 8-queens problem. this command creates a constraint solving problem such that the integer bit-width is 5, and there are exactly 8 queens. figure 2 shows a valuation for each set and relation created by the alloy analyzer to solve this problem, and graphically illustrates the solution. next, we illustrate the use of the approximate model counter approxmc [16]. for the nqueens specification, for each 7 ≤ n ≤ 12, we create three constraint solving problems: 1) no symmetry breaking (no-sb); 2) breakid's default cnf-level symmetry breaking [23] (cnf-sb); and 3) alloy's default domain-level symmetry breaking [58] (dom-sb). table 1 shows the number of solutions found and time taken in each case (table 1: approxmc results for n-queens for 7 ≤ n ≤ 12; model count ("#") and time taken in seconds ("t[s]") for different problem sizes are shown; time-out (t.o.) is 5000 sec). the model count with no symmetry breaking is the highest and takes the longest to compute; this approach times out for 8×8 and larger boards. breakid's default cnf-level symmetry breaking significantly reduces the model count and the time to compute it, and alloy's default domain-level symmetry breaking reduces them further. note that the non-isomorphic solution count can easily be estimated from the full count for this problem. for example, for the 7×7 board we can estimate it as 208896/7! = 41.44, which is quite close to the actual count of 40. while the calculation is simple, the time to compute the full count is much higher (3727.1 seconds instead of 1.14 seconds). 
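the estimate above is easy to reproduce; the numbers are those quoted from table 1 in the text:

```python
import math

full_count = 208896                  # ApproxMC count with no symmetry breaking
estimate = full_count / math.factorial(7)
print(f"{estimate:.1f}")             # ~41.4, close to the exact count of 40
```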
moreover, for larger board sizes, computing the full count times out, so using it for those sizes may be simply infeasible. this example illustrates a case where symmetry breaking predicates reduce both the model count and the time to compute it by relatively large factors. 3-queens. table 2 shows the results for a variation of the n-queens problem where the number of queens is fixed to 3, and the board size varies. to specify this variation, we replace the expression "(#queen).minus[1]" in predicate validindex with the value of "k − 1" for the board size k × k, and set the scope for queen to "exactly 3" in the run command. we validate the approxmc counts using the oeis sequence #a047659 [6]. once again, breakid's cnf-level predicates significantly reduce the model count and the time to compute it, and alloy's domain-level predicates reduce them further. since the number of queens is fixed to 3, the ratio of the total number of solutions (no-sb) to the number of non-isomorphic solutions is 3! = 6. for example, for the 11 × 11 board, the ratio for the approxmc counts is exactly 6; however, the time to compute the full count is, as before, much higher (1307.04 seconds instead of 45.1 seconds). this example shows a case where symmetry breaking predicates reduce the model count by a relatively small factor but the time to compute the counts by a much larger factor. next, consider the context of bounded exhaustive testing where the program under test is run against every non-equivalent input within a bound on the input size, and the inputs are characterized by a logical formula [42]. assume the goal is to identify a bound that will lead to a feasible number of inputs that can be executed within the testing budget. we use model counting to estimate the number of solutions for different bounds. assume the inputs to the program under test are binary trees. figure 3a shows a partial alloy specification for binary trees. 
the singleton sig bt represents the tree, which has a root node and an integer size; the keyword lone defines a partial function, so, e.g., the tree root is either exactly one node or none. each node has an integer key and a left and a right child. the predicate repok specifies the constraints for a valid binary tree, which must be acyclic. the predicate acyclic specifies acyclicity; the operator "ˆ" is transitive closure, "*" is reflexive transitive closure, "+" is set union, "&" is set intersection, and "˜" is transpose. consider the constraint solving problem for size k so that the binary tree has exactly k nodes and the keys are 1, . . . , k. figure 3b illustrates the 5 non-isomorphic trees for size 3. to show that the impact of symmetry is not limited to only approximate counting, we perform this case study with the exact model counter projmc [40]. table 3 shows the model counts for different sizes. as before, cnf-level symmetry breaking reduces the model count, which is further reduced by alloy's default symmetry breaking. however, unlike before, cnf-level symmetry breaking sometimes makes the model counter, which is projmc in this case, slower. moreover, alloy's default symmetry breaking does not break all symmetries. for this example, they can be broken using manually written predicates. binary trees belong to a restricted class of data structures for which full symmetry breaking can be achieved by writing predicates in alloy so that only the canonical solution from each isomorphism class is allowed [38]. figure 4 shows a fact that embodies this approach. intuitively, the fact requires that a pre-order traversal starting at the root visits the nodes in the same order as a pre-defined linear ordering of the nodes; the ordering module in alloy allows defining a linear order. the manually written predicates provide the most efficient counting. 
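the counts in this example can be cross-checked against the standard recurrence for binary tree shapes (the catalan numbers, oeis a000108); this python sketch is ours, not part of the alloy specification:

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def tree_shapes(n):
    # Binary tree shapes with n nodes: choose a root and split the
    # remaining n-1 nodes between the left and right subtrees
    # (the Catalan recurrence, OEIS A000108).
    if n == 0:
        return 1
    return sum(tree_shapes(i) * tree_shapes(n - 1 - i) for i in range(n))

print(tree_shapes(3))   # -> 5, the non-isomorphic trees of size 3 (Fig. 3b)
print(tree_shapes(8))   # -> 1430; with 8 labeled keys: 1430 * 8! = 57657600
```

the size-8 figures match the projmc full count and the count up to isomorphism discussed next.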
in this example the count up to isomorphism can, once again, be computed from the full count, but at a much higher computational cost. for example, for 8 nodes, the full count is 57657600, which divided by 8! is 1430, i.e., the count up to isomorphism; but projmc takes 3673 seconds to compute the full count, whereas once the manual symmetry breaking predicates are added it takes 0.34 seconds. the number of binary trees with n nodes is the oeis sequence #a000108, which allows us to validate that the manually written predicates are indeed breaking all symmetries. this section gives the relevant background on model counting, with a focus on projected and approximate model counting. let ϕ be a boolean formula in conjunctive normal form (cnf) over the variable set x. an assignment σ of truth values to the variables in ϕ is called a solution of ϕ if it makes ϕ evaluate to true. we denote the set of all solutions of ϕ by r ϕ . given a set of variables s ⊆ x and an assignment σ, we use σ ↓s to denote the projection of σ on s. similarly, r ϕ↓s denotes the projection of r ϕ on s. the projected model counting problem is to compute |r ϕ↓s | for a given cnf formula ϕ and sampling set s ⊆ x. when s = x, the problem is referred to as model counting. a probably approximately correct (or pac) counter is a probabilistic algorithm approxcount(·, ·, ·, ·) that takes as inputs a formula ϕ, a sampling set s, a tolerance ε > 0, and a confidence 1 − δ ∈ (0, 1], and returns a count c such that pr[ |r ϕ↓s |/(1 + ε) ≤ c ≤ (1 + ε)·|r ϕ↓s | ] ≥ 1 − δ. for clarity, we omit mention of s unless needed for a given context. projected model counting is a fundamental problem in computer science with applications ranging from reliability of networks to information leakage. valiant initiated complexity-theoretic studies of model counting and showed that model counting is #p-hard [63]. the earliest practical approaches to model counting, such as relsat [12], were based on extending dpll approaches. 
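the pac guarantee just defined can be phrased as a simple predicate on a returned count; in this sketch, ε = 0.05 is an arbitrary illustrative tolerance, and the exact labeled 7-queens count 40 · 7! = 201600 is derived from the numbers in section 2.1:

```python
import math

def within_pac_bounds(c, exact, eps):
    # The event inside the PAC guarantee: the returned count c lies
    # within a (1 + eps) multiplicative factor of the exact count.
    # A PAC counter makes this hold with probability at least 1 - delta.
    return exact / (1 + eps) <= c <= (1 + eps) * exact

exact = 40 * math.factorial(7)                 # 201600 labeled solutions
print(within_pac_bounds(208896, exact, 0.05))  # True: within 5% of exact
```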
the advent of cdcl solvers led to the paradigm of combining conflict-driven search with component caching, leading to the development of solvers such as cachet [49] and sharpsat [56]. furthermore, darwiche and marquis [22] pioneered a knowledge-compilation-based approach, relying on the static partitioning of the solution space, which led to the development of c2d. recent years have witnessed the combination of cdcl and static approaches, with solvers such as d4 and dsharp. recently, lagniez and marquis proposed a recursive algorithm, called projmc [40], that exploits the disjunctive decomposition technique pioneered in earlier works to perform projected model counting. concurrently, another approach, called ganak [50], for projected model counting has been developed that provides probabilistic exact bounds via the use of universal hash functions. in this work, we focus on projmc due to its ability to provide exact counts and its demonstrated scalability in comparison to other approaches. the theoretical studies of approximation led to the introduction of pac-style, also referred to as (ε, δ), guarantees, wherein the underlying algorithm returns an estimate within a (1 + ε) factor of the exact count with confidence at least 1 − δ. stockmeyer [54] demonstrated that pac guarantees can be achieved by a probabilistic polynomial-time turing machine with access to an np oracle. the practical exploration of stockmeyer's approach was pursued by gomes et al. with the development of mbound [31] and samplecount [30]. chakraborty, meel, and vardi proposed a scalable approximate counter, called approxmc, with formal (ε, δ) guarantees, which seeks to combine the advances in sat solving with the design of efficient universal hash functions. approxmc is now in its third generation, called approxmc3. 
the central idea behind approxmc is to employ universal hash functions, represented by randomly chosen xor constraints, to partition the solution space into roughly equal small cells, where every cell can be defined by the original constraints augmented with randomly chosen xor constraints. approxmc invokes cryptominisat [53], a solver designed specifically for combinations of cnf and xor constraints, to enumerate solutions in a randomly chosen small cell. approxmc2 achieves a significant reduction in the number of sat calls from linear in |s| to log(|s|) by exploiting dependence among different sat calls. soos and meel proposed approxmc3 by augmenting approxmc2 with a new architecture to handle cnf+xor formulas [52]. this section describes the overall design of our study, including the model counting tools, the generation of constraint solving problems, and the measurements for evaluation. for approximate model counting, we use approxmcv3 (https://github.com/meelgroup/approxmc), which is the latest public release of approxmc [52]. for each model counting problem, we list the primary variables in the input cnf file as a comment, as required by approxmc. for exact model counting, we use the latest public release of projmc [40] (http://www.cril.univ-artois.fr/kc/projmc.html). for each model counting problem, we list the primary variables in a separate file, as required by projmc. base formulas. we use four sources of base formulas. (1) alloy specs. we consider all alloy specifications in the standard distribution [1]; each command in an alloy spec defines a constraint solving problem and provides a scope; we use the given scope. we remove unsatisfiable problems since their model count is 0 (regardless of symmetry breaking), and our focus in this study is on satisfiable problems. we also remove all "easy" cases that complete within 1 second for both tools and all symmetry settings. this creates a set of 47 base problems derived from alloy specifications. 
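the xor-hash partitioning idea described at the start of this section can be illustrated with a toy sketch (this is a simplification for intuition only, not approxmc's actual algorithm; the solution set below is hypothetical):

```python
import random
from itertools import product

random.seed(0)  # deterministic toy run
n, m = 6, 2     # 6 boolean variables, m random XOR (parity) constraints

# Hypothetical solution set: every assignment with at least one true bit.
solutions = [bits for bits in product([0, 1], repeat=n) if any(bits)]  # 63

# Each hash maps a solution to its parity over a random subset of variables.
hashes = [[random.randrange(2) for _ in range(n)] for _ in range(m)]
cell_of = lambda bits: tuple(sum(h * b for h, b in zip(row, bits)) % 2
                             for row in hashes)

cells = {}
for s in solutions:
    cells.setdefault(cell_of(s), []).append(s)

# The (at most) 2^m cells partition the solution set exactly.
print(sum(len(c) for c in cells.values()))  # 63, the full solution count

# Counting one cell and scaling by 2^m gives a rough estimate of the total.
a_cell = next(iter(cells.values()))
print(len(a_cell) * 2 ** m)
```

fewer solutions mean fewer xor constraints are needed to shrink a cell to an enumerable size, which is the connection to symmetry breaking discussed later in the paper.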
(2) kodkod problems. we consider all kodkod programs in the standard distribution [5]. once again, we remove the unsatisfiable problems and "easy" cases. in addition, we remove problems that do not admit symmetry breaking, i.e., where kodkod does not add any symmetry breaking by default (e.g., when there is a given partial solution, which prevents kodkod's greedy base partitioning [57] from having an effect). some of the kodkod programs are parameterized over integer bounds and input files. we manually create those inputs in the appropriate format. this gives us a total of 13 base problems derived from kodkod programs. (3) n-queens. we use 2 common variations of the n-queens problem: 1) k queens are placed on a k × k board (1 ≤ k ≤ 12); 2) 3 queens are placed on a k × k board (1 ≤ k ≤ 12). this gives us a total of 24 base problems derived from the n-queens problem. (4) complex data structures. we use 6 complex data structures: (1) singly-linked lists; (2) sorted lists; (3) doubly-linked lists; (4) binary trees; (5) binary search trees; and (6) red-black trees. for each structure, we bound the number of nodes to be between 6 and 9 (inclusive). this gives us a total of 24 base problems based on structural invariants. model counting benchmarks. for each base formula f, we create 3 model counting problems using automatic tools: 1) f with no symmetry breaking, which we create by setting alloy's default symmetry breaking to off; 2) f with symmetry breaking predicates added at the cnf level, which we create by first using alloy to create a cnf formula with no domain-level symmetry breaking, and then using the breakid [23] tool to add cnf-level symmetry breaking predicates using the same arguments as in the satrace'15 competition [3]; and 3) f with symmetry breaking predicates added at the problem domain level, which we create by having alloy's default symmetry breaking turned on. 
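the benchmark arithmetic can be tallied as follows (the manual-sb variants for the data structures, described next, account for the extra 24 problems beyond the three automatic variants per base formula):

```python
base_problems = {"alloy": 47, "kodkod": 13,
                 "n_queens": 24, "data_structures": 24}
total_base = sum(base_problems.values())               # 108 base problems

# Three automatic variants (no-sb, cnf-sb, dom-sb) per base formula,
# plus one manual-sb variant for each data-structure formula.
total_benchmarks = total_base * 3 + base_problems["data_structures"]
print(total_base, total_benchmarks)  # 108 348
```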
moreover, for data structures, we create formulas with manually added domain-specific symmetry breaking predicates, which we write in alloy following previous work [38]. this gives us a total of 348 model counting problems. table 4 shows some characteristics of the benchmarks, specifically the minimum and maximum numbers of primary variables, and of all variables and clauses, under the different symmetry breaking settings. we use two key metrics, the model counts and the time to compute them, and measure them under different symmetry breaking settings. for model counts, we report the tool output and the ratio of the count under one setting to the count under another setting. for time, we report the actual wall-clock times, and the ratio of the time taken under one setting to the time taken under another setting. in line with prior work [17], we report the error rate of the approximate model counting, which is max(approx/exact, exact/approx) − 1, based on multiplicative guarantees. this section reports the results of the experimental evaluation. section 5.1 describes the results for approxmc. section 5.2 describes the results for projmc. time. figures 5a, 5c, and 5e illustrate the time performance of approxmc on the benchmarks based on alloy, kodkod, and data structure invariants, respectively. with no symmetry breaking, approxmc times out on 21 (of 47) alloy benchmarks, 6 (of 13) kodkod benchmarks, and 10 (of 24) data structure benchmarks. in all but 16 cases, formulas with alloy's default symmetry breaking take less time than with cnf-level symmetry breaking. in all but 10 cases, formulas with cnf-level symmetry breaking take less time than with no symmetry breaking. moreover, for data structure benchmarks, in all but 1 case, formulas with manual symmetry breaking take less time than with alloy's default symmetry breaking. among all the problems that time out with no symmetry breaking, the smallest time taken by the corresponding problem with alloy's default symmetry breaking was 0.14 seconds, and the smallest time taken by 
the corresponding problem with manual symmetry breaking was 0.008 seconds. (in figures 5a, 5c, and 5e, the x-axis has benchmark model counting problems and the y-axis has time in seconds on a log-scale; benchmarks on the x-axis are sorted in ascending order based on the number of primary variables, and the data structure benchmarks are grouped by the type of the structure; a blue diamond is no symmetry breaking (no-sb), a red triangle is cnf-level symmetry breaking (cnf-sb), a green square is alloy's default symmetry breaking (dom-sb), and an orange cross is manual symmetry breaking (man-sb).) for the alloy benchmarks, approxmc does not time-out under any symmetry breaking setting for benchmarks that have up to 90 primary variables. the time results for the n-queens benchmarks were presented in section 2.1. model counts. figure 6a graphically illustrates how the model counts vary under different symmetry breaking settings. (in figure 6, the x-axis has benchmark model counting problems; the y-axis (log-scale) has the count ratio n/c, where n is the model count for the formula with no symmetry breaking and c is the corresponding count with cnf-level symmetry breaking (green square), alloy's default symmetry breaking (blue diamond), or manual symmetry breaking (red triangle; only for data structures); only cases where the calculation of n did not time out are shown.) for the alloy and kodkod benchmarks, in all but 10 cases the model count for the formula with alloy's default symmetry breaking is less than the corresponding count with cnf-level symmetry breaking. 
for the data structures, the model count for the formula with alloy's symmetry breaking is less than the corresponding count with cnf-level symmetry breaking in all cases; moreover, in all but 5 cases, manual symmetry breaking gives the lowest count (the 5 exceptions are due to approximation in computing the model counts). among all problems where approxmc reports a count with no symmetry breaking, the largest ratio of the count with no symmetry breaking to the count with alloy's default symmetry breaking was 61167, and the largest ratio of the count with no symmetry breaking to the count with manual symmetry breaking was 45056. the model count results for the n-queens benchmarks were presented in section 2.1. figures 5b, 5d, and 5f illustrate the time performance of projmc on the benchmarks based on alloy, kodkod, and data structure invariants, respectively. with no symmetry breaking, projmc times out on 21 (of 47) alloy benchmarks (which is the same number as for approxmc, although the two sets of benchmarks are not the same), 9 (of 13) kodkod benchmarks (which is more than the number for approxmc), and 9 (of 24) data structure benchmarks (which is one fewer than the number for approxmc). in all but 8 cases, formulas with alloy's default symmetry breaking take less time than with cnf-level symmetry breaking. in all but 24 cases, formulas with cnf-level symmetry breaking take less time than with no symmetry breaking. moreover, for data structure benchmarks, in all but 2 cases, formulas with manual symmetry breaking take less time than with alloy's default symmetry breaking. among all the problems that time out with no symmetry breaking, the smallest time taken by the corresponding problem with alloy's default symmetry breaking was 3.12 seconds, and the smallest time taken by the corresponding problem with manual symmetry breaking was 0.01 seconds. model counts. figure 6b graphically illustrates how the model counts vary under different symmetry breaking settings. 
for the alloy and kodkod benchmarks, in all but 9 cases the model count for the formula with alloy's default symmetry breaking is less than the corresponding count with cnf-level symmetry breaking. for the data structures, the model count for the formula with alloy's symmetry breaking is less than the corresponding count with cnf-level symmetry breaking in all cases; moreover, in all cases, manual symmetry breaking gives the lowest count. among all problems where projmc reports a count with no symmetry breaking, the largest ratio of the count with no symmetry breaking to the count with alloy's default symmetry breaking was 40320, and the largest ratio of the count with no symmetry breaking to the count with manual symmetry breaking was 362880. overall, the impact of symmetry breaking is significant for both approxmc and projmc. in the majority of cases, alloy's default symmetry breaking is more effective than cnf-level symmetry breaking using breakid. for data structure benchmarks, manual symmetry breaking is the most effective, and reports exactly the counts of the non-isomorphic solutions, as desired; moreover, in cases where alloy's default symmetry breaking provides full symmetry breaking, manual symmetry breaking provides much faster solving. the empirical evaluation in the preceding subsections clearly demonstrates the significant impact of symmetry breaking on approxmc and projmc. while a detailed study to explain the observed behavior is beyond the scope of this work, we offer some explanations. as pointed out by soos and meel [52], over 99% of the runtime of approxmc is consumed by the underlying sat solver handling cnf-xor formulas. the usage of symmetry breaking predicates for satisfiable instances typically leads to only small runtime overheads in the context of satisfiability queries. 
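as an aside, and purely our observation rather than a claim made in the paper, the two largest projmc ratios reported above are exactly 8! and 9!, which would be consistent with benchmarks whose solutions contain 8 or 9 fully interchangeable atoms:

```python
import math

# 40320 and 362880 are the largest no-sb/dom-sb and no-sb/man-sb count
# ratios reported for projmc; they coincide with 8! and 9!.
print(math.factorial(8), math.factorial(9))  # 40320 362880
```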
as discussed above, the use of symmetry breaking predicates significantly reduces the number of solutions and thereby leads to a significant reduction in the number of xors to be added by approxmc. note that the number of xors to be added is proportional to the logarithm of the number of solutions of a formula. the performance of sat solvers has been observed to be sensitive to the number of xors [24], and therefore we believe that the reduction in the required number of xors is the primary reason behind the performance improvements in the context of approxmc. the performance improvement of projmc is, however, more surprising, since it is not necessarily the case that a reduction in the number of solutions would lead to a reduction in the size of the corresponding d-dnnf (decision-deterministic decomposable negation normal form), which represents the trace of the execution of projmc [33]. furthermore, given the lack of a noticeable runtime performance improvement via off-the-shelf symmetry breaking tools, it would be an interesting direction for future work to understand the difference in the traces between the formulas generated via alloy's default symmetry breaking and cnf-level symmetry breaking. this paper presented, to the best of our knowledge, the first study of symmetry breaking in the context of model counting. a goal of the study was to determine the best way to add symmetry breaking predicates (if at all) to obtain precise counts of non-isomorphic solutions. we studied two model counters from two different classes and four scenarios of applying symmetry breaking. a key lesson of our study is that domain-specific symmetry breaking predicates are most effective at enabling precise computation of model counts up to isomorphism. 
we believe the results of our study can provide insights into more effective use of cutting-edge model counters in important domains where the number of unique solutions up to isomorphism is desired, and also enable developing novel model counting methods that exploit symmetries.

references
1. alloy github repository
2. alloy models repository
3. breakid bitbucket repository
4. kodkod examples repository
5. kodkod github repository
6. the on-line encyclopedia of integer sequences
7. predictive constraint solving and analysis
8. shatter: efficient symmetry-breaking for boolean satisfiability
9. automata-based model counting for string constraints
10. projected model counting
11. a formal approach for detection of security flaws in the android permission system
12. counting models using connected components
13. compositional solution space quantification for probabilistic software analysis
14. korat: automated testing based on java predicates
15. approximate probabilistic inference via word-level counting
16. a scalable approximate model counter
17. algorithmic improvements in approximate counting for probabilistic inference: from linear to logarithmic sat calls
18. the semantics of transactions and weak memory in x86
19. quantitative analysis of the leakage of confidential data
20. a theoretical analysis of reasoning by symmetry in first-order logic (extended abstract)
21. symmetry-breaking predicates for search problems
22. a knowledge compilation map
23. improved static symmetry breaking for sat
24. combining the k-cnf and xor phase-transitions
25. an extensible sat-solver
26. reliability analysis in symbolic pathfinder
27. taco: efficient sat-based bounded verification using symmetry breaking and tight bounds. transactions on software engineering
28. probabilistic symbolic execution
29. symmetry in constraint programming
30. short xors for model counting: from theory to practice
31. model counting: a new strategy for obtaining good bounds
32. specification-based program repair using sat
33. dpll with a trace: from sat to knowledge compilation
34. software abstractions: logic, language, and analysis
35. com revisited: tool-assisted modelling of an architectural framework
36. finding bugs with a constraint solver
37. exploring the design of an intentional naming scheme with an automatic constraint analyzer
38. a case for efficient solution enumeration
39. bit-vector model counting using statistical estimation
40. a recursive algorithm for projected model counting
41. a model counter for constraints over unbounded strings
42. testera: a novel framework for automated testing of java programs
43. cdclsym: introducing effective symmetry breaking in sat solving
44. abstract model counting: a novel approach for quantification of information leaks
45. artificial intelligence: a modern approach
46. symmetry and satisfiability
47. algorithms for propositional model counting
48. falling back on executable specifications
49. combining component caching and clause learning for effective model counting
50. ganak: a scalable probabilistic exact model counter
51. generating effective symmetry-breaking predicates for search problems
52. bird: engineering an efficient cnf-xor sat solver and its applications to approximate model counting
53. extending sat solvers to cryptographic problems
54. the complexity of approximate counting
55. automated test generation and mutation testing for alloy
56. sharpsat: counting models with advanced component caching and implicit bcp
57. a constraint solver for software engineering: finding models and cores of large relational specifications
58. kodkod: a relational model finder
59. checkmate: automated synthesis of hardware exploits and security litmus tests
60. on the complexity of derivation in propositional calculus
61. the symmetry rule in propositional logic
62. testmc: a framework for testing model counters. under submission
63. the complexity of enumeration and reliability problems
64. first-order model counting in a nutshell
65. crns exposed: systematic exploration of chemical reaction networks
66. theories of program testing and the application of revealing subdomains
67. automatically comparing memory consistency models
68. symstra: a framework for generating object-oriented unit tests using symbolic execution
69. contract-based data structure repair using alloy
70. how to make chord correct (using a stable base)

acknowledgments. this work was supported in part by the u.s. national science foundation grant ccf-1718903, and the national research foundation singapore under its ai singapore programme [aisg-rp-2018-005].

authors: peiffer, robert l.; armstrong, joseph r.; johnson, philip t. title: animals in ophthalmic research: concepts and methodologies date: 2013-11-17 journal: methods of animal experimentation doi: 10.1016/b978-0-12-278006-6.50008-2

the component tissues of the eye are both individually and collectively unique and fascinating to both clinician and researcher. we find in a single organ a variety of complex tissues, each with its own particular vascular relationships, from the avascular cornea and lens to the extensively vascularized uvea. 
characteristic biochemical features of the tissues exist as well, including the endothelial pump of the cornea, the anaerobic glycolytic pathway of the lens, and the poorly defined transport systems of aqueous humor production and outflow. besides the dynamic aqueous, we find the most substantial tissue of the eye, the vitreous humor, to be apparently a relatively stagnant blob of hydrated connective tissue whose physiology is still largely a mystery. the neurologic control mechanisms of such diverse processes as pupillary size, accommodation, and probably aqueous humor inflow and outflow add another dimension of potential inquiry. the eye is devoid of lymphatics, and the nature of its tissues further contributes to its immunologic uniqueness. the retina incorporates all the complexities of neuroperception and transmission. nowhere else in the body are such diverse structure and function so intimately and intricately related. an appreciation of these perspectives, the importance of vision as a cognitive sense, and consideration of the plethora of infectious, inflammatory, degenerative, traumatic, toxic, nutritional, and neoplastic diseases that can affect any one or all of the component tissues of the eye will elucidate the challenge the authors face in attempting to review pertinent conceptual and logistical approaches to eye research in the animal laboratory.
while the majority of investigations have had as their objective ultimate correlation with normal and abnormal function and structure of the human eye, laboratory studies have provided an abundance of comparative information that emphasizes that, while there are numerous and amazing similarities in the peripheral visual system among the vertebrate (and even the invertebrate) animals, significant differences exist that are important to both researcher and clinician in selection of a research model and in extrapolation of data obtained from one species to another, and even among different species subdivisions. details of comparative anatomy and physiology have been reviewed by several authors (prince, 1956, 1964a; polyak, 1957; duke-elder, 1958; prince et al., 1960; walls, 1967). these are classic works and invaluable references. one is reminded that "research" means "to look for again," and while some concepts presented in these earlier texts are indeed outdated, and some facts since disproven, the phylogenetic perspectives acquired from familiarity with this literature arm the scientist with valuable information. throughout this chapter, we will dwell on species anatomic and physiologic differences when they have been specifically defined or where they are of value in regard to specific laboratory methodologies. while the subhuman primate is frequently described as the ideal laboratory model for eye research, many of these animals are threatened or endangered, and logistical aspects of procurement, maintenance, and restraint should encourage investigators to explore and define the validity of nonprimate animals. when deemed essential, primates should be utilized with maximum thrift and humane care in mind.
the eye is a sensitive organ, and in all species adequate anesthesia for manipulative procedures, as well as postoperative analgesia, should be considered in in vivo studies; the rabbit, for instance, is extremely sensitive to barbiturate anesthesia, and prolonged procedures that require intraocular manipulations pose a challenging problem in terms of subject mortality. the in vitro culture of ocular tissue cell lines offers increasing potential in research as an alternative to animal models. the lens, retinal pigment epithelium, and corneal epithelium and fibrocytes may be readily maintained and manipulated. current work with the corneal and trabecular endothelium is perhaps the most exciting advance in ophthalmic research in recent years. in vitro culture techniques offer a number of logistical advantages over animal models. a knowledge of spontaneous ocular diseases is essential to avoid interpreting as experimental pathology lesions that may well be unrelated to the investigation. thorough preoperative ophthalmic examination is a prerequisite of any study, and the laboratory scientist will find it to his benefit to be associated with an interested human or veterinary clinical ophthalmologist to assist in defining and managing spontaneous ocular disease. for instance, electrophysiologic studies of the canine retina will be invalid if the subject has chorioretinal lesions of canine distemper, a common entity in laboratory dogs. examination beyond the target organ is also recommended to detect abnormalities that may directly or indirectly influence results. frequently encountered spontaneous infectious diseases are reviewed in a subsequent section. we have limited our discussions primarily to research methodologies involving the globe itself; a paucity of information is available regarding the adnexa, orbit, and extraocular muscles, and described techniques are refinements of general approaches rather than those specifically developed for ocular research.
in the same perspective, topics of pure physiology or biochemistry are limited, and we did not embark into the unfamiliar and complex area of central visual mechanisms. our approach, we hope, if not all-inclusive, has been comprehensive, with adequate references to specific detail and additional information for the interested reader. spontaneous infectious diseases that involve the eyes of laboratory animals are sufficiently prevalent to warrant brief discussion. it is important to consider these entities, especially before using animals derived from poorly defined sources. propensity of a species toward the development of spontaneous infectious diseases that might not only affect general health but also induce ocular pathology in both experimental and control subjects may affect selection of a model and the procurement and isolation procedures utilized. preinvestigation ocular examinations are necessary to detect preexisting pathology. of the three large species used in ophthalmic research (cat, dog, and primate), the primate appears to have a minimal predisposition for spontaneous infectious ocular disease. this may be a result, however, of a lack of ophthalmic examination in disease situations. it is also important to note that the hamster and guinea pig rarely develop eye infections. table i summarizes those infectious diseases that more commonly involve the eyes of laboratory species. the use of laboratory animals in the investigation of infectious ocular disease has included rats, hamsters, guinea pigs, rabbits, cats, dogs, and subhuman primates. the disease agents studied have ranged from viruses to protozoa, with the rabbit serving as the most popular model. these studies have been of value in defining pathogenesis and the effects of therapeutic agents. those disease agents known to cause conjunctivitis in man do not necessarily produce a similar disease in laboratory species.
a particular species may require immunosuppression to allow the induction of a specific infection (payne et al., 1977; forster and rebell, 1975). in other instances, a model may be specifically susceptible to an organism; for example, the chlamydial inclusion body conjunctivitis agent in guinea pigs. an additional and somewhat unusual use of the guinea pig involves the sereny test (formal et al., 1972). because their conjunctivae are so susceptible to shigella organisms, this animal, through the use of a drip apparatus, has been used to identify the pathogenic strains (mackel et al., 1961). listeria monocytogenes produces a severe keratoconjunctivitis in the guinea pig, but in the rabbit and monkey the disease is much milder and more difficult to reproduce (morris and julianelle, 1935). this concept has lent itself to studies concerning epithelial cell phagocytosis in the guinea pig (zimianski et al., 1974). organism instillation techniques involve either direct swabbing or syringe inoculation into the conjunctival sac. the owl monkey (aotus trivirgatus) has proved to be the model of choice for the study of chlamydia trachomatis conjunctivitis; other primate species, including orangutans, have proved to be only mildly susceptible (fraser, 1976). a number of models are available for investigators interested in studying keratitis. the rabbit has proved to be the most feasible subject, especially in herpes simplex studies (kaufman and maloney, 1961). many bacterial and fungal diseases afflicting man have also been examined in this model, including the effects of antibiotics, steroids, interferon, vitamins, and other agents (nesburn and ziniti, 1971; smolin et al., 1979; pollikoff et al., 1972). herpesvirus studies can be significantly influenced by the particular strain of virus utilized. for example, some are more effective in producing stromal keratitides as compared to the epithelial condition (metcalf et al., 1976).
concomitant in importance may be the route of inoculation. in some studies viral suspensions are placed into the conjunctival sac (nesburn and ziniti, 1971), while in others they are dropped onto the cornea. corneal preparation may be important; for example, some studies have employed a chalazion curette (stern and stock, 1978), while others have employed spatula scraping or circular trephine defects (kaufman and maloney, 1961). deep inoculations are required to produce a stromal keratitis (fig. 1), while the epithelial disease may be induced through shallow scratches (pollikoff et al., 1972). the rabbit appears to be the species of choice for bacterial keratitis, although davis et al. (1978) feel that the guinea pig is of equal value. in one study an inbred animal was used (strain 13) to reduce the inherent variability seen in some of these experiments (davis and chandler, 1975). whether or not the rabbit model has any specific biologic advantage, the guinea pig would appear to be more feasible because of its smaller size, ease of handling, ease of anesthesia, cost of purchase and maintenance, and, perhaps most significantly, resistance to spontaneous pasteurella conjunctivitis, which frequently occurs in laboratory rabbits. methods of bacterial inoculation usually involve an intracorneal injection using a microsyringe and small volumes of quantitated microorganisms. fungal keratitides have been studied in rats, rabbits, and owl monkeys. rats were initially popular (burda and fisher, 1959), but rabbits and monkeys have become the select species. fusarium solani keratitis was first studied in the subhuman primate, but, due to expense, forster and rebell (1975) utilized the pigmented rabbit. this model, having received germinating conidia (as opposed to spores) interlamellarly, developed a sustained progressive infection, but only after pretreatment with subconjunctival steroids.
candida albicans keratitis has also been induced in the corticosteroid-pretreated rabbit. inoculation techniques for these organisms are identical to those described for bacterial and viral keratitides. ishibashi (1972) suggests that backward flow from the inoculation wound and volume accuracy can be controlled by using a microsyringe and a 27-gauge flat-faced needle. recently, human corneal transplants have been incriminated in the transmission of rabies to recipients. a model has been developed in the guinea pig utilizing the slow virus agent of creutzfeldt-jakob disease (manuelidis et al., 1977). using a method developed by greene (1950), this study involved the heterologous transplantation of infected guinea pig cornea sections into the anterior chamber of six recipients with resultant development of spongiform encephalopathy. studies designed to produce anterior uveitis usually employ inoculation into the anterior chamber. this procedure is best performed with a 25- to 30-gauge needle directed obliquely through the limbus into the chamber (smith and singer, 1964). ocular histoplasmosis was effectively studied in the rabbit using this method, the result being a well-defined granulomatous uveitis. the same procedure may be applied to produce bacterial or viral anterior uveitis (kaufman et al., 1970). it is advisable, however, to withdraw initially an equivalent amount of aqueous fluid to avoid intraocular pressure elevation (zaitseva et al., 1978). aguirre et al. (1975) administered canine adenovirus type i intravenously to young beagles to produce the anterior uveitis characteristic of infectious canine hepatitis. carmichael et al. (1974) demonstrated that this was a complement-mediated antigen-antibody complex disease.
the induction of endophthalmitis may utilize direct transcorneal chamber inoculation as described above; this method was used in rabbits to produce a bacterial infection in order to assess diagnostic anterior chamber paracentesis technique (tucker and forster, 1972). the same procedure was also used in the rat to create a fulminating fusarium solani endophthalmitis (o'day et al., 1979). the intravitreal route of inoculation is utilized to induce posterior segment infection. this technique requires toothed forceps stabilization of the eye and needle introduction through the pars plana (may et al., 1974). indirect ophthalmoscopy may be employed to ensure accurate deposition. typically a tuberculin syringe with a 5/8-inch, 27-gauge needle is used with the subject under general anesthesia or deep tranquilization. the needle is inserted at 12 o'clock and directed toward the posterior pole. small volumes may be injected without significant alteration in the intraocular pressure. avallone et al. (1978) used a similar technique in rabbits to demonstrate the effectiveness of the limulus lysate test for rapid detection of e. coli endophthalmitis. after producing a staphylococcal endophthalmitis by this method, michelson and nozik (1979) tested the performance of a subcutaneously implanted minipump for antibiotic administration in the rabbit. ocular toxocariasis, a human disease caused by larval migration of toxocara canis, was produced in mice by olson et al. (1970). freshly incubated ova were administered per os by stomach tube. anterior chamber hemorrhage and actual sightings of larval migration were reported. hobbs et al. (1978) found ocular mycobacterium leprae and m. lepraemurium in the nine-banded armadillo and mouse, respectively, following intravenous, intraperitoneal, or foot-pad inoculation of the organisms; the uveal tract was primarily involved, although lesions were observed in other ocular tissues.
lesions were more dramatic in mice immunologically depressed by thymectomy and total body irradiation. due to its susceptibility to toxoplasma gondii, the rabbit has been developed as a model for ocular toxoplasmosis. nozik and o'connor (1968) used the california pigmented rabbit and a variation of a method described by vogel (1954) to study the associated retinochoroiditis. this technique consists of proptosing the eye of an anesthetized rabbit and passing the needle retrobulbarly through the sclera to the suprachoroidal space (fig. 2). the authors point out that while the lesions are less active and heal spontaneously in 3 weeks, features are of comparative value. for detailed information concerning other uses of this model the reader is referred to tabbara's organism recoverability experiments (tabbara et al., 1979). histoplasmosis has been studied in mice, rats, rabbits, guinea pigs, dogs, pigeons, chickens, and primates. smith et al. (1978) point out that because it is essentially a macular disease in man, the ideal model should have similar anatomy. basically two orders of animals qualify: the avian species, due to their dual fovea, and the primates. the above authors used the stump-tailed macaque, m. arctoides, to produce a focal choroiditis by inoculating yeast-phase histoplasma capsulatum into the internal carotid artery. they emphasized the desirability of creating a model that, in addition to focal choroiditis, also exhibits minimal involvement of the retina, no anterior segment disease, and a late macular lesion. experimental ocular cryptococcosis has been induced in rabbits, primates, and cats, the latter species appearing to be the ideal choice because of its susceptibility and related high incidence of chorioretinal lesions. blouin and cello (1980) used an intracarotid injection technique which reproduced lesions identical to those seen in the naturally occurring disease. a septic choroiditis of bacterial origin was produced by meyers et al.
(1978) by intracarotid injection of staphylococcal and streptococcal organisms. this in turn produced a serous retinal detachment, which was the primary objective of the study. animal studies have provided an abundance of information dealing with the pharmacodynamics of the eye and drugs used to treat ocular disease. because of the accessibility of the ocular structures, there are a number of routes of drug administration which are not available in treating other organ systems. on the other hand, there are barriers to drug penetration in the eye which must be considered, including a hydrophobic corneal epithelial layer over the external ocular structures and an internal blood-ocular barrier similar to the blood-brain barrier. the endothelial cells of the iris and retinal vessels, the nonpigmented epithelium of the ciliary body, and the retinal pigment epithelium all contribute to this blood-ocular barrier (cunha-vaz, 1979). various agents can compromise these barriers to drug penetration. inflammation can also affect any of the ocular barriers. in some experimental designs it may be desirable to create inflammation and/or minimize the barriers. techniques include mechanical removal of the corneal epithelium with a scalpel blade, inducing infectious keratitis, or treating the external eye with enzymes (barza et al., 1973). a more generalized inflammation can be created by sensitizing the animal to a specific antigen and then challenging the eye with that antigen (levine and aronson, 1970). certain pharmacologic agents affect specific barriers; anticholinesterases, for instance, can break down the blood-aqueous barrier (von sallmann and dillon, 1947), while agents with epithelial toxicity, such as benzalkonium chloride, are added to many topical preparations to increase the permeability of the corneal and conjunctival epithelium. all these factors regarding method of administration and barrier permeability must be considered in an experimental design.
once again the rabbit has been most frequently chosen for pharmacologic experiments on the eye. although at first glance the anatomy of the rabbit eye would seem close enough to the human to make this a valid experimental model, at least for drug penetration studies, significant differences do exist, as will be pointed out. drugs can usually be applied topically to the eye in concentrations much higher than could be tolerated systemically. topical application has many limitations, however. if a topical drug is to be effective intraocularly it must traverse the hydrophobic epithelial layer and the hydrophilic corneal stroma; ideally, then, it should exist in an equilibrium between ionized and un-ionized forms with biphasic solubility characteristics. thus, both chemical configuration and ph of the vehicle are important. other important limitations are the small tear volume and rapid clearance of drugs from the tear film (janes and stiles, 1963). rabbits tend to blink infrequently; consequently corneal contact time is prolonged compared to humans. most nonprimate mammals possess a nictitating membrane which may affect tear dynamics. drugs may be applied with an ointment vehicle to prolong contact time, but bioavailability may be affected. finally, the position of the animal may be important. a tenfold difference in aqueous drug levels in rabbits has been reported depending on whether the animal was upright or recumbent during the experiment (sieg and robinson, 1974). injection of a drug under the conjunctiva or tenon's capsule theoretically allows delivery of greater amounts of drug, prolonged contact time, and bypass of the external epithelial barrier. subconjunctival injection may be given in two ways: (1) by injecting directly through the bulbar conjunctiva, or (2) by inserting the needle through the skin of the lids, leaving the conjunctiva intact.
the method of injection may be very important; one study showed that in rabbits leakage of hydrocortisone back through the injection site into the tear film accounted for most of the intraocular absorption (wine et al., 1964). although numerous articles have been written on the kinetics of drug absorption after subconjunctival injection (baum, 1977), the subject is still controversial, and recent evidence presented by maurice and ota (1978) indicates that absorption into the anterior chamber by this route may be much lower in rabbits than in man. if so, the rabbit may be an inappropriate model for this type of research. the potential for serious complications makes intraocular injection of drugs a "last resort" option in clinical practice. in many cases the maximum amount of drug tolerated within the eye is small (leopold, 1964). good results have been reported, however, in the treatment of experimental bacterial endophthalmitis in rabbits with intravitreal injection of antibiotics (von sallmann et al., 1944; peyman, 1977). usually 0.05-0.2 ml of antibiotic solution is injected into the nucleus of the vitreous through the pars plana. intramuscular and intravenous are the preferred routes for systemic administration of drugs in most laboratory animals. drugs may be given orally in food, water, or by stomach tube, or more reliably and precisely with pills or capsules in the case of cats and dogs. whatever the systemic route, intraocular levels are limited by the body's tolerance for the drug and by the blood-ocular barriers. one method of obtaining high ocular levels is arterial infusion, usually of the ipsilateral carotid artery. the highest levels have been obtained experimentally by retrograde perfusion of the intraorbital artery in dogs; details of the technique are discussed by o'rourke et al. (1965).
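the tear-film clearance and absorption kinetics discussed above are classically approximated with first-order compartment models. purely as an illustrative sketch — the function names, the bateman-type absorption form, and any rate constants used with them are our own assumptions for illustration, not values taken from the studies cited here:

```python
import math

def tear_film_conc(c0, k_el, t):
    """first-order loss from the tear film: c(t) = c0 * exp(-k_el * t).
    k_el is a hypothetical lumped constant (drainage, blinking, tear turnover)."""
    return c0 * math.exp(-k_el * t)

def aqueous_conc(c0, k_abs, k_el, t):
    """bateman-type one-compartment profile for drug appearing in the aqueous:
    absorbed from the ocular surface at rate k_abs, cleared from the aqueous
    at rate k_el. rises to a peak, then declines."""
    if abs(k_abs - k_el) < 1e-12:
        # degenerate case k_abs == k_el: limit of the general expression
        return c0 * k_abs * t * math.exp(-k_abs * t)
    return c0 * k_abs / (k_el - k_abs) * (
        math.exp(-k_abs * t) - math.exp(-k_el * t))
```

with invented constants (say k_abs = 1.0 and k_el = 0.1 per hour), aqueous_conc rises and then falls, mirroring the qualitative rise-and-decay profiles that serial paracentesis sampling is used to measure; fitting such curves to sampled levels is how rate constants are estimated in practice.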
in general the assessment of penetration into the various intraocular compartments is carried out using radio-labeled preparations of the drugs under study. the labeled drug is administered, and after a predetermined length of time samples of aqueous are removed by paracentesis or the whole eye is removed and standard-size tissue samples taken for scintillation counting. o'brien and edelhauser (1977) utilized a direct chamber perfusion technique to study penetration of radio-labeled antiviral drugs through the excised cornea. in the case of antimicrobials, drug levels in ocular tissues may also be determined using a microbioassay such as the agar diffusion technique described by simon and yin (1970). other methods have been used less commonly. autoradiography has been used to assess intraocular penetration (mccartney et al., 1965), and distribution in the orbital compartments has been studied using diatrizoate, a radio-opaque contrast medium (levine and aronson, 1970). techniques used to assess drug efficacy depend primarily upon the type of drug under study. most drugs used in the treatment of glaucoma, for instance, work through complicated autonomic mechanisms which alter the inflow and outflow of aqueous humor. tonography, tonometry, and fluorophotometry are used to study these effects and are discussed elsewhere. methods used to study antibiotic efficacy, on the other hand, are usually less objective. results are often based on semiquantitative or subjective impressions of clinical response. actual bacterial counts on samples of tissue following treatment of experimental infections are feasible, but the organism and size of original inoculum must be strictly defined (kupferman and leibowitz, 1976). steroid efficacy is even more difficult to assess objectively in experimental models. leibowitz et al.
(1974) have reported a method of quantifying steroid response based on the number of polymorphonuclear leukocytes labeled with tritiated thymidine remaining in the tissue following treatment of inflammatory keratitis. successful organ and cell culture has been reported with a variety of ocular tissues, including whole eyes. the culture of lens, cornea, and corneal endothelium has become a routine technique in many ophthalmic laboratories. paul (1975) provides a good manual of basic tissue culture techniques. the early work in culture of ocular tissue is the subject of an excellent review by lucas (1965). a number of significant advances have been made since that time, especially in the areas of retinal pigment epithelial and cornea culture. suspended between the aqueous and vitreous humors with no direct vascular supply, the lens is maintained in a type of natural organ culture. it has always been felt, therefore, that the lens should be ideal for in vitro culture, and lens culture experiments have been utilized extensively to study the metabolism of this tissue. a great deal of research into ideal media and culture conditions has been necessary, however. rabbit, mouse, and rat lenses have been most widely used, but culture of bovine lenses is also feasible despite their large size (owens and duncan, 1979). the lens may be cultured in an open (continuous perfusion) or closed system. the advantages and disadvantages of each type of system are discussed by schwartz (1960a). closed systems patterned after that described by merriam and kinsey (1950) are by far the most popular, but for certain types of experiments more elaborate perfusion systems are necessary (schwartz, 1960b; sippel, 1962). the composition of the media for lens culture has been found to be very important. kinsey et al. (1955) have provided systematic studies on the optimum concentration of various culture media constituents.
presently most lens culture is being done in a modified tc199 medium which has been used by kinoshita and others with good success (von sallmann and grimes, 1974). thoft and kinoshita (1965) have found that a calcium concentration somewhat higher than that in the aqueous is desirable in the culture media. more recently, chylack and kinoshita (1973) have shown that removal of the lens with part of the vitreous still attached to the posterior surface improves certain parameters of lens function in vitro. presumably, leaving vitreous attached to the lens prevents damage to the posterior capsule. in addition to culture of whole lenses, cell cultures of pure lens epithelium have been important in the study of lens metabolism. most of the early work was done with chick epithelium, but more recently cultures of mouse lens epithelium have been shown to retain greater cellular differentiation in culture (mann, 1948; russell et al., 1977). three different cell types make up the layers of the cornea: epithelial cells, fibrocytic stromal cells, and a single layer of endothelial (mesothelial) cells. all three corneal cell types, as well as whole corneas, can be maintained in culture. the culture of whole corneas has been investigated as a means of corneal preservation prior to transplantation. most of these studies have used human rather than animal tissues. endothelial cell culture has become an important technique in the study of this very metabolically active cell layer. stocker et al. (1958) described a technique for separating a corneal button by carefully peeling off descemet's membrane with endothelium from the posterior surface and peeling the epithelium from the opposite side to yield three relatively pure cell types which can be explanted onto separate cultures (fig. 3). colosi and yanoff (1977) have used trypsin to obtain endothelial cells or epithelial cells from whole corneas.
perlman and baum (1974) have used stocker's method to isolate endothelial cells from rabbits and have had good success in maintaining large endothelial cell cultures for several months using a modified eagle's mem supplemented with calf serum, bicarbonate, glutamine, and kanamycin sulfate. most exciting are the reports of using tissue-cultured endothelial cells, seeded onto donor corneas with endothelium removed, for transplantation in rabbits (jumblatt et al., 1978). although the corneal stromal cells are the most thoroughly studied ocular fibrocytic cells, other fibrocytes have been successfully cultured, including goniocytes from the anterior chamber angle (francois, 1975) and hyalocytes from the vitreous (françois et al., 1979). tissue culture of immature neural retina has been a popular subject. early experiments are summarized by lucas (1965). recently, differentiation of embryonic neural retinal cells from the chick has been described both in cell aggregate cultures (sheffield and moscona, 1970) and monolayer cultures (combes et al., 1977). organ culture of mature neural retina has proved more difficult. mature retina tends to deteriorate rapidly in a conventional warburg apparatus (lucas, 1962). the organ culture technique of trowell (1959) has been used to maintain mature and nearly mature retina for several days (lucas, 1962). this technique requires mechanical support of the retina on a metal grid with receptor cells uppermost, exposed to the gas phase; in this case air, as 95% oxygen was found to be toxic. ames and hastings (1956) described a technique for rapid removal of the rabbit retina, together with a stump of optic nerve, for use in short-term culture experiments including in vitro studies of retinal response to light (ames and gurian, 1960).
until recently only a few studies were available on the behavior of retinal pigment epithelium (rpe) cells in culture, but recent work on the multipotential nature of the rpe cell has aroused new interest in this area. as with neural retina, chick embryos have been most often used for rpe cultures. hayashi et al. (1978) used edta and trypsin to isolate chick rpe cells for culture on eagle's mem supplemented with 10% fetal calf serum. a similar technique has been used to culture rpe cells from syrian hamsters (albert et al., 1972a). mandelcorn et al. (1975) have reported an ingenious experiment in which rpe cells from one eye of an owl monkey were transplanted into the vitreous of the fellow eye through the pars plana, where the proliferation and metaplasia of those cells could be studied. the experiment described above is not the first time that investigators have attempted to take advantage of the eye's transparent media and large avascular spaces to create a type of in vivo tissue culture. many successes have been reported in transplanting homologous and autologous tissues into the anterior chamber. markee (1932) transplanted endometrium into the anterior chamber of guinea pigs, rabbits, and monkeys, where it continued to undergo cyclic changes. goodman (1934) also reported ovulatory cycles of ovaries transplanted into the anterior chamber of rats. woodruff and woodruff (1959) successfully transplanted homologous thyroid tissue into the anterior chamber of guinea pigs. eifrig and prendergast (1968) found that autologous transplants of lymph node tissue in rabbits were well tolerated. a variety of embryonic tissues, with the exception of liver, have been successfully transplanted into the anterior chamber of rabbits (greene, 1943). heterologous transplants of tissue from one species to another have more often than not been unsuccessful. greene (1947) reported good results in transplanting malignant tumors into the anterior chamber of different species.
He observed that transplants from human to guinea pig and rabbit to mouse worked best. Greene's technique of transplantation involves introducing a small piece of donor tissue in a small cannula fitted with a stylet through a limbal incision and firmly implanting the tissue into the anterior chamber angle opposite the incision. Morris et al. (1950) reported disappointing results using this technique for heterologous tumor transplants in guinea pigs, but obtained better results by transplanting the tissue into the lens beneath the lens capsule. The reader is referred to Woodruff (1960) for further discussion of intraocular transplantation experiments. The eye is unique in that it is devoid of lymphatics and thus has no directly associated lymph nodes. Classic animal studies have been utilized to define ocular mechanisms of immune response and have demonstrated that the uveal and limbal tissues play a role similar to that of secondary lymphoid tissue elsewhere in the body. Bursuk (1928) injected typhoid bacilli and staphylococci into the cornea of rabbits and found that agglutinins and opsonins appeared in the corneal tissues earlier and in greater concentration than in the serum, suggesting local responsiveness. Thompson and co-workers (1936) obtained identical results using crystalline egg albumin as the antigen, and further showed that when only one cornea of each animal was injected, precipitating antibody could not be detected in the contralateral uninjected cornea. Rabbit experiments in which ovalbumin was injected into the right cornea and human serum albumin was injected into the left cornea demonstrated that the respective corneas contained antibodies only to the antigen they had received and not to the antigen present in the contralateral cornea; specific antigen was detectable within 8 days (Thompson and Olsen, 1950). Using a fluorescent antibody technique, Witmer (1955) demonstrated immunoglobulin cells in the uvea of the rabbit, and Wolkowicz et al.
(1960) showed that excised, sensitized uveal tissue in short-term culture would actively produce specific antibody. Lymphoid cells migrate to the eye in the presence of antigenic stimulus. Thus, x irradiation of the eyes of a rabbit does not impair the ability of either the limbal tissues or the uvea to form antibody, whereas irradiation of the peripheral lymphoid system does (Thompson and Harrison, 1954; Silverstein, 1964). When the primary ocular response to an antigenic challenge has subsided, sensitized "memory" lymphocytes may persist in the uvea or limbal tissues, since a later exposure to the same antigen introduced at a distant site or injected directly into the circulation results in renewed antibody production within the eye (Pribnow and Hall, 1970; Silverstein, 1964). The dog, cat, and monkey have all been used as animal models in cornea research. The rabbit, however, has been exploited to such a degree for this purpose that a few comments on the rabbit cornea are appropriate. The rabbit has a large cornea approximately the same diameter as the human cornea. It is significantly thinner, however, with an average central thickness of about 0.40 mm. The thin cornea and shallow anterior chamber have contributed to technical difficulties in performing intraocular lens implantation and penetrating keratoplasty in rabbits (Mueller, 1964). The corneal epithelium is also thin in rabbits, and the existence of Bowman's layer in the rabbit has been disputed (Prince, 1964b). One of the most important qualities of the rabbit cornea is the capacity of the endothelium for regeneration. Following injury, rabbit endothelial cells are able to divide and cover the defect with minimal effect on the endothelial cell density. The dog is similar in this respect (Befanis and Peiffer, 1980). Cats and monkeys, on the other hand, have been shown to have very little potential for regeneration of endothelium.
In these animals, as in man, remaining endothelial cells grow and spread to cover the defect (Van Horn, 1975). Thus cats and monkeys are probably better models in studies where response to endothelial injury is important. The recent surge in interest in the nature of the corneal endothelium has been brought about largely by the demonstration that the endothelium is the site of the "fluid pump" transport mechanism essential for maintaining corneal dehydration and thus transparency (Maurice, 1972). Localization of the fluid pump and many subsequent studies have been made possible by the endothelial specular microscope, developed for viewing the endothelium of enucleated eyes and modified for use with excised corneas in the form of the perfusion specular microscope (Maurice, 1972) (Fig. 4). Later the endothelial microscope was adapted for in vivo examination of humans and animals, including cats, rabbits, and monkeys (Laing et al., 1975). There is nothing new about viewing the specular reflection from the corneal endothelium, a long-established slit lamp technique. The specular microscope, however, is an instrument designed to optimize this type of illumination to produce a high-magnification (200-400×) image of the endothelium suitable for photomicrography, accurate endothelial cell counts, and detailed observation of individual cells (Fig. 5). A discussion of the optical principles of specular microscopy is provided by Laing et al. (1979). Commercially available endothelial microscopes are designed primarily for use with humans, but can be easily adapted to any laboratory animal with a sufficiently large cornea (which excludes rats and mice). The smallest eye movements blur the image of the endothelium, and, therefore, general anesthesia is usually required when working with animals. Technique and observations in dogs (Stapleton and Peiffer, 1979) and cats (Peiffer et al., 1980a) have been described.
In 1951 Ussing and Zerahn described an apparatus consisting of two small Lucite chambers separated by a piece of frog skin which could be used to measure transport of materials and potential differences across the frog skin. This same technique was modified for study of the cornea, and dual-chamber perfusion experiments form the basis for much of what we know about corneal physiology (Donn et al., 1959; Mishima and Kudo, 1967). The perfusion specular microscope has added an extra dimension to this type of in vitro experimentation by allowing direct observation of endothelial cell shape and integrity during perfusion. Atraumatic removal of the cornea, avoiding contact with the endothelium or excessive bending of the cornea, is essential to the success of this type of experiment. The technique of Dikstein and Maurice (1972) for excision of the cornea with a scleral rim has been used very successfully. For longer-term in vitro experiments the cornea can be cultured by conventional closed organ culture techniques as described elsewhere in this chapter. The impetus for the intensive investigation of optimum conditions for corneal storage has resulted from the feasibility of corneal transplantation in humans. Most of the research has been done using rabbits, however, and is especially applicable, therefore, to use and storage of animal corneas for whatever purpose. The method of corneal storage in most eye banks today is moist-chamber storage, storing the intact globe at 4°C in a glass bottle with saline-soaked cotton in the bottom. The stagnant fluid quickly becomes loaded with waste products, making this method unsuitable for long-term storage. A number of alternative techniques are used in laboratory storage, including storage of excised corneas at 4°C, organ culture at 37°C, and cryopreservation for long-term storage. Several different media for storage of excised corneas have been evaluated.
The most popular at present is McCarey-Kaufman (M-K) medium, a modified tissue culture medium containing TC-199, 5% dextran 40, and a mixture of streptomycin and penicillin (McCarey and Kaufman, 1974). Storage in M-K medium probably preserves corneal viability longer than moist-chamber storage, endothelial damage occurring at 2 days in moist-chamber-stored rabbit corneas compared to 4-6 days for M-K-stored corneas (Geeraets et al., 1977). Stocker et al. (1963) reported good success with excised corneas stored in autologous serum at 4°C. Subsequent studies, however, show conflicting results as to the advantage of serum storage over moist-chamber storage (Geeraets et al., 1977; Van Horn and Schultz, 1974). Corneal organ culture represents an alternative method of prolonging corneal viability. It is similar to conventional storage of excised corneas except for temperature, which is maintained at 37°C, and may allow for preservation of endothelial viability for up to 3 weeks (Doughman et al., 1974). The culture medium most often used is Eagle's MEM supplemented with fetal calf serum. Cryopreservation is the only method which permits indefinite storage of whole viable corneas. The tissue is passed through a graded series of solutions containing dimethyl sulfoxide and frozen at a controlled rate over liquid nitrogen. Studies indicate that some endothelial damage may occur during the freezing process, but destruction is incomplete (Basta et al., 1975). In fact, survival of donor endothelial cells following transplantation of cryopreserved corneas has been demonstrated using sex chromatin markers in monkeys (Bourne, 1974). Techniques of cryopreservation have been reviewed and evaluated by Ashwood-Smith (1973). Methods of estimating corneal viability after various storage regimens are all based on evaluating endothelial integrity. Histochemical stains have been used extensively (Stocker et al., 1966).
Dye exclusion stains, such as 0.25% trypan blue applied to the endothelium for 90 seconds and then rinsed off with saline, can be used without apparent toxicity to the endothelium. The pattern obtained with trypan blue staining is more difficult to see in rabbits than other species. A modification of this technique using a combination of trypan blue and alizarin red on rabbit corneas has recently been reported (Spence and Peyman, 1976) (Fig. 6). Electron microscopy, both scanning and transmission, is also used to assess the condition of the endothelium (Fig. 7). The most innocuous method of estimating corneal viability is simply measurement of corneal thickness at body temperature, which is an indirect indication of the condition of the endothelium, healthy endothelium being able to deturgesce the cornea to near-normal thickness. The measurement of corneal thickness, or pachometry, has traditionally been done using an optical technique developed by von Bahr (1948). The specular microscope, however, provides for a more direct method of measuring corneal thickness by registering the distance from the point of applanation at the corneal surface to the image of the endothelium. Viral and bacterial infections of the cornea have been studied mostly in rabbits, especially in the case of herpes simplex keratitis, where attempts to study this disease in other animals have been unsuccessful. Techniques for initiating these infections are discussed in Section III. Noninfectious forms of keratitis may result from tear deficiencies, vitamin A deficiency, allergic phenomena, and chemical or mechanical damage. Keratoconjunctivitis sicca (KCS) resulting from tear deficiency has been studied in the dog, the disorder being surgically induced by removal of the lacrimal gland and nictitating membrane (Helper et al., 1974a; Gelatt et al., 1975) or chemically induced with phenazopyridine (Pyridium), 60-90 mg/day orally for 3-6 weeks (Slatter, 1973). François et al.
(1976) have used rabbits to study KCS. They found it necessary to remove all the accessory lacrimal tissue, including Harder's glands and the nictitating membrane as well as the lacrimal gland proper, to induce the disease in rabbits. Interstitial stromal keratitis, thought to be an immune-mediated phenomenon, has been demonstrated in mice after intravenous and intracutaneous injections of bovine γ-globulin or bovine serum albumin (Kopeloff, 1976). A simple method of inducing an inflammatory stromal keratitis in rabbits is injection of 0.03 ml of clove oil into the corneal stroma (Leibowitz et al., 1974). Phlyctenular keratitis, another allergic form of keratitis, has been reported in animal models, but Thygeson et al. (1962) have failed to produce true phlyctenular disease in rabbits and question whether the previously reported models are valid. Most of the clinically significant lesions, hereditary or acquired, affecting the corneal endothelium result in visually disabling corneal edema. Spontaneous dystrophies leading to corneal edema are described in the veterinary literature and are reviewed by Dice (1980). Corneal edema can be induced by in vivo freezing with a metal probe to destroy endothelial cells. The edema may be temporary or permanent depending on the area, temperature, and duration of cold exposure (Chi and Kelman, 1966). We have found that two successive freezes with a contoured 10-mm brass probe, cooled to -140°C with liquid CO2, consistently produce temporary corneal edema in dogs (Befanis and Peiffer, 1980). A slightly smaller probe is used for rabbits. The probe is applied two times for 15-60 seconds, with time for complete thawing allowed in between. Corneal edema and scarring are often irreversible.
Corneal transplantation is one method of restoring a transparent visual axis, but a number of prosthetic devices, usually fashioned out of methyl methacrylate, have been used experimentally over the last 30 years and are now being used clinically on a limited basis. Stone and Herbert (1953) reported a two-stage procedure for implanting a plastic window in rabbit corneas. Dohlman and Refojo (1968) have reviewed the previous 15 years' experience with plastic corneal implants. Many of these procedures require the use of a tissue adhesive to hold the plastic prosthesis in place. Cyanoacrylate adhesives are used. Methyl-2-cyanoacrylate creates the strongest bond, but is too toxic for use on the cornea. Butyl-2-cyanoacrylate can be used, and the longer-chain substituted forms are even less irritating but do not form as strong a bond (Havener, 1978). Gasset and Kaufman (1968) have used octyl-2-cyanoacrylate for gluing contact lenses to rabbit and monkey corneas with the epithelium removed. Glued-on contact lenses have also been used for treatment of experimental alkali burns in rabbits (Zauberman and Refojo, 1973; Kenyon et al., 1979). Conventional contact lenses without adhesive can be used in animal experiments. Most investigators have found that both hard lenses and silicone lenses are well tolerated in the rabbit (Thoft and Friend, 1975). Hard lenses should be specifically fitted to the rabbits, however, and a partial tarsorrhaphy may be used if necessary to prevent rabbits from expelling lenses (Enrich, 1976). Systemic connective tissue disorders may be associated with local ocular signs of diffuse, nodular, or necrotizing scleritis that can involve the cornea as well. Hembry et al. (1979) sensitized rabbits by intradermal ovalbumin plus Freund's adjuvant, followed by injection of ovalbumin at the limbus, to produce a necrotizing corneoscleritis.
Animal studies of the composition of the aqueous humor, its function, and the processes controlling the dynamic state of its constituents and volume have two main objectives: (1) appreciation of the physiology, biochemistry, and hence metabolism of the tissues of the anterior segment and (2) definition of the mechanisms that control the rate of aqueous humor production and outflow and hence intraocular pressure. Definitive information is available for only a limited number of mammalian species, and these data suggest that species variation in aqueous humor composition and dynamics exists. In addition, anatomical characteristics of the outflow pathways differ between species. Excellent reviews by Cole (1974) and Tripathi (1974) detail these differences, which emphasize that making inferences from research results using nonprimate mammals to man requires appreciation of these species differences. The maintenance of normal intraocular pressure (IOP) is dependent upon a critical balance of dynamic equilibrium between the processes of aqueous humor production and drainage. Concepts of aqueous humor formation are based largely upon laboratory work with the rabbit. Aqueous humor is produced by the ciliary processes as a result of active transport and passive ultrafiltration processes; approximately one-half of the aqueous is produced by active secretion across the two-layered epithelium. The exact mechanisms of transport have not been defined and may demonstrate species differences; active transport of sodium in the presence of a sodium- and potassium-activated adenosine triphosphatase located in the cell membrane of the nonpigmented epithelium is one of the main primary events in the formation of the aqueous fluid (Cole, 1974). The remaining 50% of aqueous humor is formed by passive processes, including diffusion, dialysis, and ultrafiltration. The composition of aqueous humor is essentially that of a plasma ultrafiltrate but varies between species. Gaasterland et al.
(1979) have studied the composition of rhesus monkey aqueous. Differences between aqueous and plasma levels of potassium, magnesium, chloride, and bicarbonate have been demonstrated and suggest species differences in transport mechanisms and/or anterior segment metabolism. A saturable active transport system for ascorbate has been documented in the rabbit (Barany and Langham, 1955). Transport mechanisms for amino acids demonstrated in the rabbit may not be present in rat, cat, monkey, and dog (Reddy, 1967). Total volume of the aqueous humor will vary among species with the size of the globe and relative proportions of the anterior segment (Cole, 1974). The rate of aqueous humor formation in the species studied, with the exception of the uniquely high-valued cat, is approximately 1.0-2.0 μl/minute (Cole, 1974) and is dependent upon ciliary artery blood pressure, the pressure in the ciliary body stroma (essentially equal to IOP), and the facility of flow through the ciliary capillary and capillary wall. Because of these pressure gradients, passive aqueous humor production is decreased as IOP increases; this pressure-dependent component, or "pseudofacility," has been demonstrated to account for up to 30% of total facility in monkeys (Bill and Barany, 1966; Brubaker and Kupfer, 1966). The aqueous humor enters the posterior chamber, flows through the pupil into the anterior chamber due to thermal currents, and exits the globe by passing through the flow holes in the trabecular meshwork and reentering the peripheral venous circulation via thin-walled vascular channels. Methodology utilized to quantitate aqueous humor production and flow rates involves the measurement of the turnover of a substance within the aqueous introduced by active and/or passive processes from the peripheral vasculature or by intraocular injection. The ideal technique does not involve introduction of a needle into the globe, as this undoubtedly disrupts normal homeostatic mechanisms.
Man, guinea pig, rabbit, and cat have been studied utilizing fluorescein turnover with slit lamp fluorometry, a variety of isotopes, and other substances. Reasonable correlation among investigators utilizing different techniques within species has been observed (Cole, 1974). The comparative anatomy of the outflow pathways has been reviewed by Calkins (1960) and Tripathi (1974). In man and the primates a well-defined trabecular meshwork spans the scleral sulcus from the termination of Descemet's membrane to the base of the iris to the scleral spur. Deep to the trabecular meshwork within the sulcus a single large channel, Schlemm's canal, drains the aqueous into the episcleral veins (Fig. 8). In nonprimate mammals, the scleral spur and sulcus are absent; the trabecular fibers span the ciliary cleft, a division of the ciliary body into inner and outer leaves, deep to the fibers of the pectinate ligament. These fibers consist of uveal tissue and extend from the termination of Descemet's membrane to the iris base. Aqueous drains into small sacculated vessels, the aqueous or trabecular veins, which communicate with an extensive scleral venous plexus (Figs. 9 and 10). Van Buskirk (1979) utilized plastic lumenal castings and scanning electron microscopy to study the canine vessels of aqueous drainage and to demonstrate that in this species the exiting aqueous mixes with uveal venous blood. The mechanisms of aqueous humor outflow through the trabecular meshwork are not completely understood. Passive pressure mechanisms certainly play an important role; increases in episcleral venous pressure result in decreased outflow and increased IOP. Active transport mechanisms for both organic and inorganic anions have been demonstrated.
Transmission electron microscopic studies of the endothelium of the scleral venous plexus have revealed giant cytoplasmic vacuoles that are suggestive of transcellular transport mechanisms and indicate that this is the main site of resistance to outflow (Tripathi, 1974). Some aqueous humor may exit through the ciliary body, entering the suprachoroidal space and passing into the choroidal circulation and sclera. This pressure-independent uveoscleral outflow accounts for up to 30-65% of bulk flow in subhuman primates and has been demonstrated in cats, rabbits, and man to be quantitatively less than in two species of monkeys (Bill, 1961, 1965, 1966; Bill and Phillips, 1971; Fowlks and Havener, 1969). Qualitative uveoscleral routes have been suggested in the dog utilizing dextran-fluorescein studies (Gelatt et al., 1979b). Because of the radius of curvature of the cornea and the presence of an overlying scleral ledge, light rays from the base of the iris, the angle recess, and the trabecular meshwork undergo total internal reflection, preventing direct visualization of the outflow structures without the use of a contact lens to eliminate the corneal curve. Two general types of gonioscopic contact lenses are available: indirect lenses, which contain mirrors and allow examination of the angle by reflected light, and direct lenses, through which the angle is observed directly. A magnifying illuminated viewing system, ideally the slit-lamp biomicroscope, is essential for critical evaluation. Gonioscopic examination of the canine iridocorneal angle was first reported by Troncoso and Castroviejo (1936), although Nicholas had previously depicted the canine angle by drawings in 1924. Troncoso in 1948 compared the gross and gonioscopic appearance of the angles of the dog, cat, pig, rabbit, rhesus monkey, and man. The clinical application of gonioscopy in comparative ophthalmology is relatively recent.
Lescure and Amalric (1961) described its use in the dog, and subsequently numerous investigators have stressed the value of the technique in the diagnosis of glaucoma in the dog (Vainsi, 1970; Gelatt and Ladds, 1971; Bedford, 1973, 1977). Martin (1969) has correlated the microscopic structure and gonioscopic appearance of the normal and abnormal canine iridocorneal angle. The technique is straightforward and involves topical anesthesia and minimal physical restraint in dog, cat, and rabbit and ketamine sedation in primates (Fig. 11). The concave surface of the lens is filled with artificial tears and placed on the corneal surface; the Franklin goniolens with a circumferential flange is retained by the eyelids and enables the examiner to have both hands free. The Troncoso or Koeppe lens may also be utilized; vacuum lenses and the Swan lens are smaller and thus adaptable to younger and smaller animals (Fig. 12). The structures observed during gonioscopic examination of the nonprimate mammal, using the dog as an example, include, from posterior to anterior, the following:
1. The anterior surface and base of the iris
2. The pectinate ligament
3. Deep to the pectinate ligament, the ciliary cleft, and the trabecular meshwork
4. The deep or inner pigmented zone representing the anterior extension of the outer leaf of the ciliary body
5. The outer or superficial pigmented zone, which is variable in presence and density and represents melanocytes in the limbus
6. The corneal dome (Fig. 13)
Species variations in appearance do exist. The subhuman primates present a gonioscopic appearance of the iridocorneal angle identical to man. In the cat, the pectinate fibers are thin and nonbranching, while in the dog they tend to be deeply pigmented, stout, and arbiform. In the rabbit the pectinate fibers are short and broad (Fig. 14a-c).
These tests have been used in man to detect suspicious or borderline glaucoma patients as well as to investigate the heredity of open angle glaucoma by stressing the homeostatic mechanisms of the globe. Provocation results in an increase in IOP that can be characterized by extent and duration. The water drinking and corticosteroid provocative tests have been most useful in open angle glaucoma, whereas the mydriatic and dark room tests have been valuable in narrow angle glaucoma (Kolker and Hetherington, 1976). The water provocative test in man, rabbits, and subhuman primates (Macaca irus) has been studied with Schiotz and applanation tonometry and in combination with tonography and constant pressure perfusion (Swanljung and Blodi, 1956; Galin et al., 1961, 1963; McDonald et al., 1961; Galin, 1965; Thorpe and Kolker, 1967; Casey, 1974). The procedure has been used in rabbits to test the effects of different drugs on pressure-regulating mechanisms (McDonald et al., 1961). The test in dogs has defined normal values and demonstrated significant differences in American cocker spaniels and beagles with spontaneous glaucoma (Lovekin, 1971; Gelatt et al., 1976a) (Fig. 15). It is generally accepted that the increase in intraocular pressure after the administration of a substantial volume of water is primarily related to a sudden decrease in the osmolarity of the blood; the related influx of water into the eye is presumed to increase intraocular pressure proportional to the volume of water administered (Galin et al., 1961; Galin, 1965). Hemodilution may not be solely responsible; in man the increase in IOP in 20% of the patients occurs before the fall in serum osmolarity (Spaeth, 1967). The subsequent rate of decay of IOP assesses the ability of the outflow system to cope with the increased inflow.
Clinical measurement of the facility of outflow is accomplished by raising IOP by placing a tonometer on the eye and determining the subsequent rate of volume loss and pressure decrease; as resistance to outflow of aqueous humor increases, the pressure changes will decrease. The principle of the test may be traced to the massage effect, whereby pressure on the eye leads to a softening of the globe due to increased outflow. Schiotz indentation tonography employs placement of an electronically recording Schiotz tonometer on the eye; the weight of the tonometer will indent the cornea, reducing ocular volume and increasing IOP. The instrument is left on the cornea for a 4-minute period. Tables are utilized to derive the rate of aqueous humor outflow based upon the observed changes in IOP over the 4-minute period as recorded by the tonograph (Grant, 1950; Kronfeld, 1952, 1961; Garner, 1965; Drews, 1971; Podos and Becker, 1973). In 1950, Grant showed that the coefficient of outflow facility (C) is related to the change in ocular volume (ΔV) occurring over the time interval (t) as a result of the difference between the average pressure during tonography (Ptav) and the IOP prior to placement of the tonometer on the globe (P0): C = ΔV/[t(Ptav - P0)]. The coefficient is expressed in microliters of aqueous humor outflow per minute per millimeter of mercury pressure (Grant, 1950). Tonography has certain limitations. The accuracy is dependent upon the accuracy of the tonometer calibration, since ΔV, Ptav, and P0 are derived from these data. In addition, six physiologic assumptions are made upon which accurate quantitative results are dependent (Grant, 1950):
1. That there is a constant continuous flow of aqueous humor
2. That the process of tonography does not alter this flow
3. That the process does not alter outflow facility
4. That the process does not change the uveal vascular volume
5. That tonography does in fact measure the outflow of aqueous humor through the trabecular meshwork
6. That the eye exists in a steady state in regard to aqueous humor dynamics during the 4-minute period
Previous discussions of pseudofacility (the decrease in aqueous humor production that occurs with increased IOP) and uveoscleral outflow demonstrate that assumptions 2 and 5 are not valid. Evidence exists that outflow facility is dependent upon IOP. In addition, uveal blood volume is probably influenced by the placement of the tonometer on the cornea (Podos and Becker, 1973). However, tonographic C values determined for the human eye correlated well with values obtained by constant pressure perfusion, aqueous humor fluorescein disappearance time, and aqueous humor turnover of certain substances (Becker and Costant, 1956; François et al., 1956; Bill and Barany, 1966). Additional factors warrant consideration when evaluating tonographic procedures in species other than man; animal globes differ significantly from the human eye in terms of ocular volume, radius of corneal curvature, vascular dynamics, and tissue characteristics upon which the accuracy of Schiotz indentation tonometry or tonography is dependent; Schiotz tonometric conversion tables are inaccurate if applied to the canine eye (Peiffer et al., 1977). The technique cannot be utilized in animals without pharmacologic restraint, and the effect of the drugs utilized on the steady state of IOP must be taken into account. Helper and his associates (1974b) used xylazine and ketamine sedation and found C values from 0.05 to 0.32 in normal dogs, 0.03 to 0.38 in basset hounds, and 0.03 to 0.10 in a small number of glaucomatous patients. Gelatt and his associates (1977a) observed a combination of acetylpromazine and ketamine to have minimal effect on the steady-state IOP of the normal canine eye, and demonstrated impairment of outflow in beagles with inherited glaucoma. Peiffer et al.
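Grant's coefficient of outflow facility lends itself to a short worked calculation. The sketch below is illustrative only; the function name and the sample values (volume expelled, tonogram duration, pressures) are hypothetical, not measurements from the text.

```python
# Sketch of Grant's (1950) tonographic outflow-facility coefficient,
# C = dV / [t * (Ptav - P0)], in ul/min/mm Hg.
# All sample values below are illustrative, not data from the chapter.

def outflow_facility(delta_v_ul: float, t_min: float,
                     p_tav_mmhg: float, p0_mmhg: float) -> float:
    """Coefficient of outflow facility in ul/min/mm Hg."""
    return delta_v_ul / (t_min * (p_tav_mmhg - p0_mmhg))

# Example: 8 ul expelled over a 4-minute tonogram with an average
# tonographic pressure of 30 mm Hg against a resting IOP of 20 mm Hg.
c = outflow_facility(delta_v_ul=8.0, t_min=4.0, p_tav_mmhg=30.0, p0_mmhg=20.0)
print(round(c, 2))  # prints 0.2
```

A value of 0.2 falls within the 0.05-0.32 range reported above for normal dogs under xylazine-ketamine sedation.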
(1976) reported mean tonographic values of 0.21 using this anesthetic combination in normal mongrel dogs (Fig. 16). Applanation tonography may prove more accurate and versatile than Schiotz indentation tonography in the animal laboratory. A 2-minute tonography period is adequate, and the effect of anatomic variables is minimized. Invasive laboratory techniques that have been described to measure facility of outflow involve direct measurement of intraocular pressure via a cannula inserted into the globe and one of three techniques: (1) injection of a known volume of fluid and observation of the rapid increase and subsequent slow decrease in IOP (pressure decay curves), (2) perfusion of the globe with fluid at a constant rate and observation of the related pressure changes (constant rate perfusion), and (3) perfusion at a variable rate necessary to maintain a given IOP (constant pressure perfusion) (Barany and Scotchbrook, 1954; Grant and Trotter, 1955; Macri, 1959; Melton and Hayes, 1959; Armaly, 1960; Melton and Deville, 1960; Grant, 1963; Langham and Eisenlohr, 1963; Barany, 1964; Brubaker, 1975; Wickham et al., 1976) (Fig. 17). These invasive techniques eliminate the variables of scleral creep and changes in uveal blood volume and species differences in corneal anatomy. All require methods in which intraocular pressure, fluid volume in the eye, and flow reach steady state during the measurement. Cannulation of the anterior chamber will induce qualitative and quantitative aqueous humor changes. Despite mathematical equivalence, the methods are quite different in practice because they differ greatly in the time it takes for the eye to go from one steady state to another. The time required to reach steady state during constant pressure perfusion is less than 5 minutes, compared to much longer times for the other techniques. All are based on the assumptions of pressure-independent facility and secretion rate, which have been criticized (Langham, 1959).
facility of outflow can be determined in constant pressure perfusion by measuring average flow from an external reservoir into the eye; the averaging period is a compromise between the desire to achieve accuracy by using a long period and to achieve high temporal resolution by using a short one. a 4-minute averaging period is utilized arbitrarily to facilitate comparison to tonographic values. facility may be calculated at any given pressure utilizing the formula c = f/p, where f is the rate of perfusion flow in microliters per minute, p is the intraocular pressure in millimeters of mercury, and c is facility expressed in microliters of fluid per millimeter of mercury per minute. in an in vivo system, however, this equation does not consider aqueous humor production and thus provides values lower than actual facility. barany showed that facility can be calculated at two levels of pressure, p1 and p2, utilizing the formula c = (f2 - f1)/(p2 - p1). this formula necessitates the assumption of constant episcleral venous pressure and aqueous humor production. if one estimates a value of 10 mm hg for the former and 1 ul/minute for the latter, similar results between the two equations are obtainable by dividing the rate of secretion by the episcleral venous pressure and adding the result (0.1 ul/minute/mm hg) to the values obtained by the equation c = f/p. the character of the perfusate can influence facility; 0.9% unbuffered saline causes a decline in resistance on prolonged infusion of the anterior chamber. barany (1964) utilized phosphate-buffered saline with calcium and glucose added, and brubaker and kupfer (1966) perfused heparinized mammalian tissue culture medium to minimize this "washout" factor. gaasterland et al. (1979) used pooled heterologous aqueous humor to perfuse rhesus monkey eyes in vivo. melton and deville (1960) studied enucleated canine eyes using constant pressure perfusion and found an average c value of 0.28 at 28 mm hg iop.
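the one-point formula, barany's two-level formula, and the secretion correction above can be written as a small calculator; the perfusion readings used in the example are made up for illustration.

```python
# Facility-of-outflow formulas from the text.
# Units: flow f in ul/min, pressure p in mmHg, facility c in ul/min/mmHg.

def facility_one_point(f, p):
    """c = f / p; ignores aqueous humor production, so in vivo it
    underestimates the actual facility."""
    return f / p

def facility_two_point(f1, p1, f2, p2):
    """Barany's formula: c = (f2 - f1) / (p2 - p1); assumes constant
    episcleral venous pressure and constant aqueous humor production."""
    return (f2 - f1) / (p2 - p1)

def corrected_one_point(f, p, secretion=1.0, episcleral=10.0):
    """The text's correction: divide the estimated secretion rate (ul/min)
    by the episcleral venous pressure (mmHg), here 1/10 = 0.1, and add it
    to the one-point estimate."""
    return facility_one_point(f, p) + secretion / episcleral

# Hypothetical perfusion readings at two pressure levels:
c_two = facility_two_point(f1=4.0, p1=20.0, f2=6.0, p2=30.0)  # 0.2
c_one = corrected_one_point(f=4.0, p=20.0)                    # 0.2 + 0.1
```

with these illustrative numbers the corrected one-point value (0.3) exceeds the two-level value (0.2); with real data the correction is what brings the two equations into rough agreement.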
peiffer and his associates (1976) found the facility of outflow in normal dogs anesthetized with sodium pentobarbital to have a mean value of 0.13 ± 0.07 sd, which increased as iop increased. perfusion values for outflow were less than those determined tonographically (0.21 ± 0.14 sd with acetylpromazine-ketamine hydrochloride sedation, 0.15 ± 0.09 sd with pentobarbital anesthesia) or in the enucleated globe. van buskirk and brett (1978a,b) perfused enucleated canine eyes and found a pressure-dependent facility of outflow of 0.26 ± 0.02 sd at 5 mm hg iop, which increased to 0.62 at 40 mm hg iop. outflow increased significantly during perfusion with hyaluronidase-containing solution and with time in the non-hyaluronidase-perfused eyes. peiffer et al. (1980a) perfused beagles with inherited glaucoma in vivo and found impairment of outflow facility compared to normal dogs (fig. 18). the limitations of tonographic and perfusion techniques to quantitate aqueous humor outflow in animal species must be appreciated; neither is ideal. the refinement and development of more accurate methodology remains a challenge to the investigator. either technique is certain to provide more relevant information if performed in vivo rather than on enucleated globes. the iop is a differential fluid pressure that measures the vector sum of the forces generated by the intraocular fluids acting at the interface between the fluid and the fibrous coats of the globe. accurate determination of the iop is difficult because all the techniques utilized to measure it in some way alter the parameter from its original value. in the laboratory, the anterior chamber can be cannulated and iop determined directly by the fluid level of an open-air manometer; this situation, of course, is not applicable to clinical situations where a noninvasive technique is required. any noninvasive technique, however, must ultimately be compared to simultaneous readings from the cannulated globe.
quantitative determination of iop is achieved by one of two types of tonometry. clinical estimations depend on subjecting the cornea to a force that either indents (impresses) or flattens (applanates) it. tonometers that indent the cornea are referred to as indentation tonometers, and those that flatten it are referred to as applanation tonometers. the cornea is utilized because other areas, such as the anterior sclera, have a nonuniform thickness and the added variability of overlying tissues, including the conjunctiva, bulbar fascia, and the underlying anterior uvea. normal iop in many of the animal species, notably the nonprimate mammals, appears to be higher and more variable than that observed in persons (bryan, 1965; heywood, 1971). a number of physiologic variables may affect the iop. these include the nature of the subject, the time of day, and the position of the subject. intraocular pressure is related to blood pressure, and it has been demonstrated that animals that are excited will have higher iop. accurate values are obtained in animals that have been handled and previously subjected to the technique. in persons, a diurnal variation of iop has been observed, with the lowest iop occurring early in the morning. this variation is probably related to changes in endogenous corticosteroid levels. diurnal variation has been demonstrated in the new zealand white rabbit (katz et al., 1975) and in beagles with inherited glaucoma, but not in normal dogs (gelatt et al., 1979a). in persons, significant differences in iop are observed with the patient in a sitting position as compared to the prone position. in addition, the variable of technique may contribute to the wide range of normal intraocular pressures observed in animals. sedation and anesthesia, because of associated cardiovascular effects, are likely to affect iop. this is especially true of the barbiturates.
bito and co-workers (1979) found that ketamine hydrochloride had minimal effect on the iop of rhesus monkeys (macaca mulatta) and reported a mean iop of 14.1 ± 2.1 sd, with higher values and a greater diurnal variation in young animals. schiotz indentation tonometry estimates iop by applying a carefully standardized instrument on the cornea and measuring the depth of indentation of the cornea by a weighted plunger. the schiotz tonometer has the advantages of simple construction, reasonable cost, portability, and relative simplicity of technique. in schiotz tonometry, a force from a small solid surface with a spherical curvature which indents the cornea is balanced by the fluid pressure, separated from the solid surface by a thin flexible membrane (the cornea in this case), when the applied force or weight of the tonometer equals the resultant force from the fluid pressure, measured parallel to the direction of the applied force, times the area of distortion of the membrane. the surface tension of the tear film exerts a small force parallel to the corneal surface. the instrument consists of a footplate that approximates the radius of curvature of the human cornea, a plunger, a holding bracket, a recording scale, and 5.5, 7.5, 10.0, and occasionally 15.0 gm weights (fig. 19). the cornea is indented with a relatively frictionless weighted plunger; the amount of plunger protruding from the plate depends upon the amount of indentation of the cornea. the tonometer scale is adjusted so that 0.05 mm of plunger travel equals one scale unit. calibration tables are used to derive the actual iop from the observed tonometer reading. the most accurate estimations of iop are obtained with the lighter weights within the middle scale ranges. the technique for schiotz tonometry is relatively simple (bryan, 1965; vainisi, 1970). the cornea is anesthetized with a drop of topical anesthetic.
while allowing a few seconds for the anesthesia to take effect, check the recording arm of the tonometer by placing it on the solid convex metal surface provided. the tonometer should read 0, indicating that no indentation of the plunger is occurring and that the plunger surface is flush with the footplate. with a bit of practice, schiotz tonometry can be performed without assistance in the dog and cat. rabbit tonometry is facilitated with an assistant holding the animal in lateral recumbency. the animal should be relaxed, and care should be taken not to compress the jugular veins. the first, second, and/or third fingers of the left hand or the thumb may be used to simultaneously retract the lower eyelid of the right or left eye, respectively. the tonometer is grasped between the thumb and first finger of the right hand and the tonometer scale rotated such that it is easily observed by the examiner. the fourth finger of the right hand is rested on the frontal bone to provide stability and retract the upper eyelid. care must be taken in retracting the lids so that pressure is applied only to the bony orbital rim and not to the globe itself. excessive eyelid retraction may also create abnormal forces on the globe and should be avoided. the footplate is placed on the cornea as centrally as possible and gentle pressure applied until the holding bracket glides freely around the footplate shaft (fig. 20). the scale reading is noted, and the tonometer is removed. the procedure is repeated two more times; the scale readings should agree within one full unit of one another. if the readings are at the low end of the scale, additional weights may be applied to the tonometer and the procedure repeated to obtain midscale readings. one potential source of error is the position of the tonometer in relation to the perpendicular; deviation from the perpendicular will result in an overestimation of iop directly proportional to the degree of deviation.
the process of tonometry will induce an increase in aqueous humor outflow, and each repeated measurement may be slightly lower than the previous estimate of iop. following use, the tonometer should be disassembled and cleaned carefully with a pipe cleaner. free movement of the plunger within the casing is essential for proper function, as is smooth working of the lever system and the recording arm. the process of placing the tonometer on the eye will increase iop between 5 and 20 mm hg; the schiotz tonometric conversion tables enable the clinician to correlate the tonometer reading with the iop prior to the placement of the instrument on the cornea. (fig. 20: use of the schiotz tonometer in a dog. the instrument is applied in a perpendicular fashion to the central cornea. its use is limited to larger primates, dogs, cats, and rabbits, and calibration tables must be devised for each species for maximally accurate determination of intraocular pressure.) because of species differences in corneal curvature, ocular rigidity, and tissue characteristics, the use of a human conversion table will result in inaccurate estimation of iop. conversion tables have been developed for rabbit (best et al., 1970) and canine (peiffer et al., 1977) globes. limitations of the schiotz indentation technique depend upon the species, the clinician, and the instrument. use in animals is limited to those species with relatively large corneas and animals that can be adequately restrained and positioned. it is most useful in larger primates, the dog, cat, and rabbit. ocular rigidity, or the ability of the cornea and sclera to stretch, will vary with age, from species to species, and from animal to animal (best et al., 1970; peiffer et al., 1978b). ocular rigidity also increases as the tonometer is placed closer to the limbus.
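a species-specific conversion table of the kind described above is, in software terms, a lookup with interpolation between tabulated scale readings. the sketch below shows the mechanics only; the (scale reading, iop) pairs are hypothetical placeholders, not values from any published human, rabbit, or canine table.

```python
# Sketch of a Schiotz conversion-table lookup with linear interpolation.
# The table below is HYPOTHETICAL: real tables are specific to the plunger
# weight and to the species (human tables misestimate canine IOP).

HYPOTHETICAL_TABLE_5_5G = {2.0: 35.0, 4.0: 25.0, 6.0: 18.0, 8.0: 13.0}

def schiotz_to_iop(reading, table=HYPOTHETICAL_TABLE_5_5G):
    """Convert a tonometer scale reading to IOP (mmHg) by linear
    interpolation between tabulated entries."""
    keys = sorted(table)
    if not keys[0] <= reading <= keys[-1]:
        # off-scale reading: the procedure calls for changing plunger weight
        raise ValueError("reading outside table range; change plunger weight")
    for lo, hi in zip(keys, keys[1:]):
        if lo <= reading <= hi:
            frac = (reading - lo) / (hi - lo)
            return table[lo] + frac * (table[hi] - table[lo])

schiotz_to_iop(5.0)  # midway between 25.0 and 18.0 -> 21.5
```

swapping in a different table per weight and per species reproduces the clinical workflow of keeping readings midscale with the lightest adequate weight.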
increases in ocular rigidity provide schiotz recordings that are higher than actual iop; with increased ocular rigidity there is less indentation by the schiotz tonometer, creating a false impression of increased iop. applanation tonometry is based upon the principles of fluid pressures: pressure equals force divided by area. a force from a plane solid surface applied to a fluid contained by a thin membranous surface will be balanced when the area of contact times the pressure of the fluid equals the force applied by the plane solid surface. this is known as the imbert-fick law. simply stated, in applanation tonometry one may either measure the force necessary to flatten a constant area of the corneal surface or measure the area of cornea flattened by a constant applied force. electronic applanation tonometers, notably the mackay-marg, are the most versatile and accurate in a wide variety of species. probe tips are smaller than those of indentation tonometers, a minimum of intraocular fluid is displaced, iop is not significantly increased by the procedure, and the technique is independent of ocular rigidity and corneal curvature. it is applicable to smaller laboratory species, and its use has been reported in the chinchilla (peiffer and johnson, 1980). experimental animal models of glaucoma have been developed to study the effects of elevated iop on other ocular tissues, to determine the efficacy of medical and/or surgical treatment in reducing iop, and to define mechanisms of the glaucomatous process itself. spontaneous animal models of glaucoma have been utilized for the above purposes in addition to investigations to determine their similarities to the disease in man. historically, experimental glaucoma has been produced in rabbits by the injection of 1% kaolin into the anterior chamber to obstruct the outflow channels, with iop reaching 50-70 mm hg within 14 days (voronina, 1954).
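the imbert-fick relation above (pressure = force / applanated area) can be checked numerically. the sketch below uses an assumed unit conversion and an assumed 3.06 mm flattened-disc diameter, the classic applanation geometry at which grams of force times 10 approximately equal mmhg; neither value comes from this text.

```python
import math

# Imbert-Fick law: fluid pressure = applied force / applanated area.
# In the constant-area variant one measures the force needed to flatten a
# fixed corneal disc; in the constant-force variant one measures the area.

def iop_constant_area(force_g, diameter_mm):
    """IOP (mmHg) from the applied force (grams) and the diameter (mm) of
    the flattened corneal disc. The conversion 1 mmHg ~= 0.01360 g/mm^2 is
    an assumption of this sketch (133.3 Pa vs 9807 Pa per g-force/mm^2)."""
    area_mm2 = math.pi * (diameter_mm / 2.0) ** 2
    return (force_g / area_mm2) / 0.01360

# With a 3.06 mm disc, 1.5 g of force corresponds to roughly 15 mmHg,
# i.e. grams x 10 ~= mmHg for that particular geometry.
iop = iop_constant_area(force_g=1.5, diameter_mm=3.06)
```

the sketch ignores tear-film surface tension and corneal bending forces, which is why the technique works best at a diameter where those two small opposing forces roughly cancel.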
skotnicki (1957) enclosed rabbit globes with cotton threads to induce glaucoma with resultant optic disc cupping. flocks and his associates (1959) utilized a similar technique and rubber bands; while initial increases in iop reached 70-100 mm hg, pressures dropped to 35-50 mm hg within 48 hours. one-third of the eyes developed panophthalmitis, and only 12% showed cupping of the optic disc. kupfer (1962) threaded polyethylene tubing into rabbit iridocorneal angles; iop increase was observed within 24 hours and remained elevated for 6 months. loss of retinal ganglion cells and cupping of the optic disc occurred. samis (1962) and kazdan (1964) produced glaucoma in rabbits by the intraocular injection of methylcellulose. huggert (1957) blocked the outflow of aqueous humor in rabbits using three different techniques but failed to cause significant increases in iop. de carvalho (1962) injected cotton fragments into rabbit anterior chambers, which elevated iop to 38-40 mm hg; in those animals in which the glaucoma persisted longer than 30 days, retinal and optic disc pathology was observed. injection of talcum powder or dental cement (kalvin et al., 1966b) produced elevated iop in monkeys. the injection of the enzyme α-chymotrypsin into the globe will produce a variable increase in iop that may or may not be prolonged; the enzyme dissolves the zonules of the lens, the fragments of which collect in and obstruct the trabecular meshwork. the technique has been utilized in the rhesus and owl monkey to study optic nerve and ocular vascular changes (kalvin et al., 1966a; zimmerman et al., 1967; lambert et al., 1968; lessell and kuwabara, 1969). the enzyme may have direct toxic effects on the retina, which must be considered in studying the morphologic and functional effects of elevated iop on this tissue.
cyclocryotherapy will cause an acute elevation of iop in rhesus monkeys (minckler and tso, 1976), and the technique has been utilized to study axoplasmic transport in the axons of the ganglion cells and optic nerve (minckler et al., 1976); the same parameters have been studied by controlled elevation of iop by cannulation and perfusion (minckler et al., 1977). the use of repeated circumferential argon laser photocoagulation of the iridocorneal angle as described by gaasterland and kupfer (1964) results in a predictable sustained elevation of iop, marked reduction of outflow facility, and progressive cupping of the optic nerve head. the technique has the advantages of being noninvasive and associated with minimal intraocular inflammation and unrelated pathology. in chickens raised under continuous light exposure from the day of birth onward, iop rises in relation to an increased resistance to aqueous humor outflow that is detectable as early as 2 weeks of age (lauber et al., 1972). morphologic studies of the trabecular meshwork of affected animals reveal an increase of intercellular collagen and elastic trabecular tissue with resultant densification of the meshwork; there was an absence of the endothelial vacuoles, pores, and microchannels observed by transmission electron microscopy in normal birds (tripathi and tripathi, 1974a,b; rohen, 1978). while sporadic spontaneous cases of animal glaucoma have been described in a variety of species, only two models are reliably producible by controlled breedings: inherited glaucoma in the rabbit and in the beagle. these two models will be discussed briefly, emphasizing the methodologies utilized to define the disease processes. buphthalmia (hydrophthalmos, congenital infantile glaucoma) in rabbits is due to an autosomal recessive gene with incomplete penetrance (hanna et al., 1962).
histologic abnormalities of the eye are observed at birth; elevated iop and buphthalmos may be observed as early as 2 to 3 weeks of age in some animals, with progressive clouding and flattening of the cornea; ectasia of the globe, particularly in the sclerocorneal region; deepening of the anterior chamber with detachment and fragmentation of the iris membrane; partial atrophy of the ciliary body; and glaucomatous excavation of the optic disc (hanna et al., 1962). the primary defect responsible for the development of glaucoma probably involves impairment of facility of outflow. it has been postulated that the glaucoma may be part of a primary systemic disorder (hanna et al., 1962). the gross ocular enlargement which characterizes buphthalmia is accompanied by fibrosis of the filtering angle (mcmaster and macri, 1967), a decrease in facility of aqueous humor outflow (mcmaster, 1960; kolker et al., 1963), and a hyposecretion of aqueous humor (smith, 1944; auricchio and wistrand, 1959; mcmaster and macri, 1967; greaves and perkins, 1951). anatomic studies have emphasized changes at the angle and have postulated that they constitute at least one site of obstruction to aqueous humor outflow. hanna et al. (1962) demonstrated an absence of the space of fontana, the iris pillars, and either total absence or rudimentary development of the trabecular canals and intrascleral channels. mcmaster and macri (1967) observed that the obstruction to outflow lies between the trabeculum and the episcleral veins. the angle, according to hanna et al. (1962), is open in the adult buphthalmic rabbit but appears closed in the newborn. the combination of hyposecretion and reduced outflow explains why buphthalmic rabbits may have a normal iop. rabbits that are genetically buphthalmic but phenotypically normal appear to have an iop approximately 5 mm hg higher than normal, and rabbits with clinical signs of buphthalmia may have an iop as much as 50 mm hg greater.
the iop tends to increase a few weeks prior to observable distention of the globe. elevated iop will subsequently return to normal levels. the cause and effect relationship of the striking inverse correlation of aqueous humor ascorbate concentration with the severity of the buphthalmia is not clear; a marked drop in ascorbate levels is present in early preclinical stages of buphthalmia (lam et al., 1976; fox et al., 1977; lee et al., 1978). the actual sequence of events, involving alterations in iop, outflow facility, ascorbate concentration of the aqueous humor, and clinical signs, is not clearly defined. the rate of progression has been shown to vary with the genetic background. sheppard et al. (1967) reported that the corneal endothelial cells in a flat preparation from a buphthalmic rabbit were enlarged and of variable size, and postulated that the cells expand to cover the increased corneal area. van horn et al. (1977) utilized scanning electron microscopy to confirm this report, but also indicated that there is a loss of endothelial cells in the disease as well. gelatt (1972) published a report on a familial glaucoma in the beagle; pathologic increases in iop occurred from 9 months to 4 years of age, and affected dogs demonstrated open iridocorneal angles upon gonioscopy. additional observations were summarized in subsequent papers (gelatt et al., 1976b, 1977c). controlled breedings suggested an autosomal recessive mode of inheritance. the glaucomatous process was divided into early (6 to 21 months of age), moderate (13 to 30 months of age), and advanced (31 months of age and greater) stages and was evaluated clinically by tonometry, gonioscopy, and anterior and posterior segment examination. in early glaucoma, the iridocorneal angle was open and without anomalies, and iop was elevated.
with moderate glaucoma, variable optic disc atrophy, elongation of the ciliary processes, and focal disinsertion of the zonules from the lens were seen in addition to elevated iop and open iridocorneal angles. advanced glaucoma was characterized by increased iop, narrow to closed iridocorneal angles, lens dislocation, optic atrophy, and progression to phthisis bulbi. scanning electron microscopy was performed in a small number of dogs and correlated with the gonioscopic observations. affected dogs responded positively to water provocation (gelatt et al., 1976a) and demonstrated decreased facility of aqueous humor outflow at all stages of the glaucomatous process when compared to normal dogs, both tonographically (gelatt et al., 1977c) and by constant pressure perfusion (peiffer et al., 1980b). responses to topical autonomic agents (whitley et al., 1980) and carbonic anhydrase inhibitors (gelatt et al., 1978) have been studied, and histochemical studies of adrenergic and cholinergic receptor sites have been performed (gwin et al., 1979a). peiffer and gelatt (1980) described gross and light microscopic observations of the iridocorneal angle; the data supported gonioscopic observations that the disease appeared to be an open angle glaucoma, with secondary pathology of the angle structures noted. inflammation of the uveal tract is a common, challenging, and enigmatic clinical entity that encompasses the variables of inciting stimulus, host response, and associated alteration of ocular structure. the infectious uveitides are discussed elsewhere; this section will review noninfectious uveal inflammation (primarily immune-mediated in nature) in animal models, which have proved useful in defining the etiopathogenesis of disease processes; enhancing our understanding of the immune response in general and specifically in regard to the eye, a rather unique organ immunologically speaking; and investigating pharmacologic mediation of the disease processes.
limitations of these models should be noted: inflammation of the uveal tract in the animal model, regardless of species, tends to be an acute, self-limited disease. in addition to antigens derived from ocular tissue, complete freund's adjuvant must be given to induce experimental allergic uveitis (eau). models of chronic uveitis have been particularly difficult to develop and require repeated immunizations with the inciting antigen. even in such models, the inflammation is usually restricted to the anterior segment, and there are minimal retrograde changes compared to chronic human uveitis. several studies have shown that the eye is not an immunologically privileged site. allogeneic tissue implanted into the anterior chamber triggers an immunologic reaction. franklin and pendergrast (1972) observed that allogeneic thyroidal implants were rejected by the rabbit eye in a histologic manner and chronologic sequence similar to that for implants to other parts of the body. kaplan and his associates (1975; kaplan and streilein, 1977) have reported that although the rejection of allogeneic implants placed in the anterior chamber of inbred rats is delayed, the immunologic recognition via the afferent limb of the immune response is intact. since ocular implant vascularization is coincident with that of implants in other areas of the body, immunologic suppression is probably responsible for the delay in the transplantation rejection observed (raju and grogan, 1969). in studies of anterior chamber immunization, kaplan and streilein (1978) have demonstrated that allogeneic antigens present in the anterior chamber are processed immunologically by the spleen via a vascular route, since afferent lymphatic channels do not drain the anterior chamber. they suggest that immune deviation occurs as a result of splenic suppressor factors which delay anterior chamber graft rejection. work by vessela and his co-workers (1978) in the rat has supported this theory.
their model suggests that antigen processing and recognition occur, but that the effector response is delayed. a possible delay mechanism could be low-dose antigen sensitization with tolerance because of the lack of lymphatics, but a more likely explanation would be the presence of suppressor factors, including nonspecific serologic blocking factors, specific antigen-antibody blocking complexes, suppressor macrophages, or suppressor t cells. the majority of experimental animal investigations have been performed using models of sympathetic ophthalmitis or lens-induced uveitis. the role of immunologic mechanisms in the sympathetic ophthalmitis models has been fairly well characterized. the vast majority of this literature does not distinguish whether the immunologic response is primary or whether it is merely an epiphenomenon resulting from an alteration in the ocular antigens produced by another, nonimmunologic, insult. the physical location and biochemical characterization of the antigens responsible for the induction of sympathetic ophthalmitis models of eau have been studied extensively. earlier studies dealt with the identification of uveal antigens; bilateral uveitis can be produced by homologous immunization (aronson, 1968). the inflammation tends to be nongranulomatous; however, occasional reports (collins, 1949, 1953) describe lesions histopathologically similar to the granulomatous process seen in human sympathetic ophthalmitis. vannas et al. (1960) produced experimental uveitis in rabbits by enucleating one eye and implanting it in the peritoneal cavity. aronson and co-workers (1963a,b,c) demonstrated that uveal preparations from albino guinea pigs are antigenic, suggesting that melanin is not a vital antigenic component of the disease process.
a number of workers have demonstrated that antigens from the rod outer segments and retinal pigment epithelium can induce experimental allergic uveitis in primates, guinea pigs, and rabbits (wong et al., 1975; meyers and pettit, 1975; hempel et al., 1976). while previous workers had demonstrated that crude extracts of uveal tissue can also produce eau, faure et al. (1977) have suggested that the activity observed with these uveal preparations was probably due to contamination by retinal antigens, and most investigators accept their hypothesis that retinal antigens are more important than uveal antigens in eau. wacker (1973) and wacker and his associates (1977) have demonstrated that antigens present in the photoreceptor and retinal pigment epithelial layers can elicit the development of chorioretinitis and have partially characterized the responsible antigens. there is a soluble antigen (s antigen) located throughout the photoreceptor layer. this s antigen is most active in the production of eau; animals with the disease often develop delayed hypersensitivity responses to it. the s antigen appears biochemically similar to a protein subunit of a retinol-binding lipoglycoprotein present throughout the photoreceptor layer. s antigen is apparently tissue-specific; crude bovine s antigen was relatively ineffective in the induction of guinea pig eau. a more purified preparation, however, resulted in a histopathologic lesion, with cellular infiltration in the anterior uvea and destruction of the photoreceptor layer. thus a purified extract of xenogeneic antigenic material had characteristics similar to those of the tissue-specific allogeneic extract. the s antigen is probably not rhodopsin, since a number of physical characteristics, including solubility, molecular weight, amino acid sequence, specific location, and the absorption spectrum, are different.
the authors also identified a particulate antigen (p antigen), located in the rod outer segments, that does not elicit a delayed hypersensitivity response and has a lower eau induction rate. it does elicit the development of antibodies, even in the absence of disease. in most models of experimental uveitis, cell-mediated, rather than humoral, immune responses are most important in the pathophysiology of the disease. delayed hypersensitivity reactions to inciting antigens have correlated with the onset and course of disease in eau; passive transfer experiments using lymphoid cells have been conducted; cellular mediators (lymphokines) have produced variations of eau; and lymphoid cell depletion experiments have been performed. friedlander and his associates (1974) defined a predominantly eosinophilic response to delayed hypersensitivity in the guinea pig. meyers and pettit (1975) demonstrated in the guinea pig both cutaneous delayed hypersensitivity and macrophage migration inhibition reactions toward rod outer segment and retinal pigment epithelial antigens in eau. in their study, cellular immunity appeared to correlate with clinical disease; specific antibody to the inciting antigen was absent in a number of animals that developed eau, and there was no correlation with humoral immunity. in a similar study, wacker and lipton (1971) demonstrated that delayed hypersensitivity toward the inciting antigen is usually present when eau is induced; in a number of animals who developed uveitis, no antibody response was detected. experimental uveitis can be passively transferred to normal animals with lymphoid cells but not with serum from animals with eau (aronson and mcmaster, 1971).
chandler and his co-workers (1973) demonstrated that an ocular inflammation resembling uveitis can be induced with lymphokines produced by sensitized lymphocytes, and other investigators have demonstrated that experimental uveitis can be abrogated using anti-lymphocyte serum (bürde et al., 1974). in a mouse viral model, the induction and maintenance of lymphocytic choriomeningitis (lcm) virus-induced uveitis is dependent on thymus-derived lymphocytes. when immunosuppressed animals did not receive adoptively transferred immune spleen cells, no uveitis was produced; if the immune spleen cells were treated with anti-theta serum to eliminate the t cells, the uveitis was also aborted. while cell-mediated immunologic alteration appears to be responsible for the majority of experimental allergic uveitis models, in experimental lens-induced granulomatous uveitis (elgu) humoral immunity is important. needling of the lens will provoke a uveitis in rabbits provided they also receive an intramuscular injection of freund's adjuvant (müller, 1952). homologous lens antigens injected into the eye have minimal effect in the absence of adjuvant (müller, 1952; goodner, 1964), and even animals previously sensitized by repeated injections of lens material into the footpad fail to mount more than a limited uveitis when challenged with an intravitreal injection of either homologous or heterologous lens antigens (selzer et al., 1966). findings such as these prompted the search for a naturally occurring adjuvant which might account for the spontaneous disease; halbert et al. (1957a,b) were able to show that streptococcal extracts are able to potentiate lens reactions, although to a lesser degree compared to freund's adjuvant, while burky (1934) had already demonstrated a similar effect in rabbits with staphylococcal extracts. clinical lens-induced granulomatous uveitis, however, is not associated with bacterial infection.
as lens-induced uveitis is essentially a feature of old and cataractous lenses containing a high proportion of insoluble material, behrens and manski (1973a) focused attention on the possible adjuvant effect of albuminoid lens protein. they found that a single injection of albuminoid into the vitreous of inbred rats produced, after about 5 days, a uveitis characterized by an initial neutrophil and macrophage response, whereas a similar injection of crystallins was essentially without effect; this suggested that albuminoid may have an effect similar to freund's adjuvant. other experiments in rats previously sensitized with whole lens preparations showed that intraocular challenge with albuminoid evokes a cellular response consisting initially of neutrophils followed by macrophages, whereas crystallins, particularly α-crystallin, give rise to round cell exudation (behrens and manski, 1973b). these findings suggest that the soluble antigens are responsible for humoral antibody formation and that insoluble albuminoid accounts for cell-mediated responses. müller (1963) has drawn attention to the enhancing effect of previous sensitization on the uveal response to lens proteins, having found that injection of homologous lens tissue in the presence of freund's adjuvant some time before needling the lens gives rise to a uveitis of marked proportions. marak and his associates (1976) have demonstrated that elgu can be transferred with serum. immunofluorescent studies of traumatized lenses of sensitized animals have demonstrated the presence of igg and complement, components of an immune complex-mediated reaction. depletion of the third component of complement using cobra venom factor, or the use of anti-leukocyte serum to inactivate polymorphonuclear leukocytes (pmns), decreased the incidence of elgu. while it is most likely that elgu is an immune complex disease, immune complexes in the serum or aqueous of experimental animals have not been demonstrated. carmichael et al.
(1974) reproduced the uveitis that accompanies infectious canine hepatitis adenovirus infection or vaccination, demonstrating an immune complex disease resulting from soluble virus-antibody complexes and the associated pmn cell response. there is significant variance in the incidence of experimental allergic uveitis produced in different species of animals. genetic influences can markedly alter immune responses; in congenic animals which differ only in the region of the chromosome containing immune response (ir) genes, marked differences in the incidence of autoimmune disease, malignancy, and infections have been noted, and it is possible that genetically determined differences in immunologic reactivity may be important in the development of uveitis in both humans and animals. there is a paucity of data using experimental animal models to determine the importance of immune response genes in the development of uveitis. in guinea pigs, the incidence of uveitis differs markedly between different strains. while it is relatively easy to induce uveitis in the hartley or nih strains, it is slightly more difficult to induce it in strain 13 and almost impossible to induce it in strain 2 (mcmaster et al., 1976). mice that have well-characterized histocompatibility and immune response gene systems are a potential model to further study this phenomenon (silverstein, 1964; hall and o'connor, 1970). is there a specific immunologic reaction by these cells toward uveal antigens that is important in the production of human endogenous uveitis, or is the reactivity observed merely an epiphenomenon? an equally feasible mechanism for the prolongation of an ocular inflammatory response could be the structural alteration induced by a nonspecific immunologic reaction. in the rabbit, the development of systemic immune complex disease with nonocular antigens results in a change in ocular vascular permeability (gamble et al., 1970; howes and mckay, 1975).
it is conceivable that, once an animal receives this type of insult to its vascular system, the structural alteration of the ocular tissue is such that a chronic uveitis either develops or continues despite the lack of specific immune reactivity toward ocular antigens. the structure, composition, and physiology of the lens have been studied in a wide variety of animal species. investigators in specific areas of lens research, however, have tended to concentrate on one type of animal model. thus, much of what is known about the embryology of the lens is derived from research with chick embryos. likewise, the characterization of lens proteins is based largely on research with bovine lenses, and the physiology and metabolism of the lens have been studied mainly in rabbits and rats. some caution must be exercised, therefore, in extrapolating data from one animal model to another, since significant differences undoubtedly exist. the chick lens, for instance, has a very different protein composition than that of the mammalian lens (rabaey, 1962). there are a number of important differences between the lenses of all the commonly studied laboratory animals and the lens of humans and other higher primates, as have been pointed out by van heyningen (1976). the rat, rabbit, and bovine lenses all demonstrate minimal growth during adulthood, whereas the human lens continues to grow throughout life. also, the human lens tends to maintain its water content at about 65%, while lower animals have a decrease in water content during old age. the composition of the primate lens is different from all others in that it contains a group of tryptophan derivatives, hydroxykynurenines, which are highly absorbent in the ultraviolet range, and, finally, only the higher primates appear to have the ability to change the shape of their lens in accommodation. in addition to the higher vertebrates, the amphibians also deserve mention for their contribution to lens research.
newts and salamanders have long been recognized for their ability to regenerate a lens from the pigmented epithelium of the iris (reyer, 1954). toads, on the other hand, demonstrate a similar ability to regenerate a lens from cells derived from the cornea. some of the techniques involved in this type of study are discussed by stone (1965). more recently, eguchi and okada (1973) have caused renewed interest in the relationship between pigmented epithelium and lens by demonstrating lenslike structures in cultures of retinal pigment epithelium from chick embryos. the dry weight of the lens is comprised almost entirely of proteins, the transparency of which is the sine qua non of lens function. understandably, then, a large part of lens research has been aimed at analyzing these proteins. the techniques used include the complete armamentarium of sophisticated tools available to the protein chemist. as previously mentioned, the bovine lens, because of its large size and availability, has been most extensively used, although the rabbit has also been studied in some depth. mammalian lenses consist of three major classes of soluble proteins, known as α-, β-, and γ-crystallins, and an insoluble fraction, predominantly albuminoid, most likely derived from α-crystallin. the separation of lens proteins can be achieved in a variety of ways depending on the objectives of the experiment. most recent studies, however, begin by separating the lens crystallins into major classes by gel filtration chromatography. specifically, sephadex g200, ultrogel aca 34, and biogel p300 all yield four major protein peaks corresponding to α-, βh-, βl-, and γ-crystallin (bloemendal, 1977). sephacryl s-200 has also been used to separate the β-crystallins into four subclasses (mostafapour and reddy, 1978).
further separation into subclasses is often done utilizing polyacrylamide electrophoresis, and still more refined separations can be achieved using two-dimensional electrophoresis and immunochemical techniques. an important concept of lens composition which has been elicited by the use of immunochemical identification of proteins is that of organ specificity, as opposed to the more usual rule of species specificity. immunoelectrophoresis can be used to demonstrate cross-reactions between lens proteins of species which are widely separated on the phylogenetic scale. in general, antibodies to the lens of one species will show extensive cross-reactivity to lens proteins of species lower on the evolutionary scale, indicating that these antigens have been carried over as evolution has progressed. this technique has been useful in defining some basic evolutionary relationships. immunoelectrophoresis and other modern methods for analysis of lens proteins are discussed by kuck (1970). more recently, a technique for radioimmunoassay of α- and γ-crystallin which is sensitive to very minute amounts of protein has been described (russell et al., 1978). systems for in vitro study of the intact lens may be of the open type, where the lens is continually perfused with fresh media, or the closed type, in which there is no exchange or only intermittent exchange of culture media. very elaborate open systems have been designed for accurately measuring such functions as metabolite production and oxygen consumption by the lens (schwartz, 1960a,b; sippel, 1962). although the open systems theoretically approximate the physiologic state more closely, closed systems similar to the one described by merriam and kinsey (1950) continue to be much more widely used because of their simplicity and the potential for maintaining several lenses simultaneously. since 1950 a long series of articles has been published by kinsey et al.
describing a variety of in vitro studies primarily using closed culture systems. these articles may be traced by referring to a recent addition to the series (kinsey and hightower, 1978). if measuring exchange of materials between lens and culture media is not an important part of the experimental design, a compromise between open and closed systems has been described by bito and harding (1965) which involves culturing lenses in dialysis bags with intermittent changes of the outer media. there is an overwhelming amount of literature describing agents capable of inducing experimental cataracts in animals and the subsequent chemical and metabolic effects of those agents. an attempt will be made here to review some of the most important and interesting methods of producing experimental cataracts. for a more complete list of experimental cataractogenic agents the reader is referred to reviews by van heyningen (1969) and kuck (1970, 1977). rabbits and rats have been the most popular animals for this type of research. it is interesting that, despite the vast amount of information accumulated on the mechanisms of cataractogenesis, the etiopathogenesis of the senile cataracts so common in man remains undefined. a variety of physical insults, ranging from the relatively gross technique of sticking a needle through the lens capsule to bombarding the lens with a neutron beam, will induce cataracts in animals. by far the most widely studied type of cataract under this category is that produced by ionizing radiation. radiation cataracts have been studied in many different laboratory animals, most of which make satisfactory models, with the exception of the bird, which seems to be resistant to radiation cataract (pirie, 1961). the lenses of younger animals are generally more susceptible to radiation, and the periphery of the lens, which contains the actively dividing cells, is much more sensitive than the nucleus (pirie and flanders, 1957) (fig. 21).
x rays, γ rays, and neutrons are all potent cataractogens. beta radiation in sufficient doses can also be used to cause cataracts (von sallmann et al., 1953). much of the early work on the ocular effects of ionizing radiation is reviewed by duke-elder (1954). because of the recent boom in microwave technology, cataracts caused by this type of radiant energy have received increased attention. the cataractogenic potential of low doses of microwave radiation is probably minimal, however. hirsch et al. (1977) and appleton et al. (1975) found that lethal doses of microwaves were often required to produce cataract in rabbits. ultraviolet radiation, although not ordinarily thought of as cataractogenic, can cause cataracts experimentally in animals when a photosensitizing agent is administered simultaneously (see toxic cataracts), which illustrates the important concept that two or more cataractogenic factors may act synergistically to produce lens opacities (hockwin and koch, 1975). diets consisting of more than 25% galactose will consistently produce cataracts in young rats. this type of experimental cataract has obvious clinical relevance to galactosemia in humans and the secondary cataract which often develops in this disease. the work of kinoshita et al. (1962) and others in explaining the mechanism of this cataract has demonstrated that this model is also applicable to the study of diabetic cataracts. effects of cataractogenic sugars (galactose, xylose, and glucose) have also been studied in vitro using rabbit lenses (chylack and kinoshita, 1969). diabetic cataracts may also be studied more directly by surgical or chemical induction of diabetes in laboratory animals.
patterson (1953) used alloxan, 40 mg/kg in a single intravenous dose, to induce diabetes in rats, but other techniques have also been effective, including 95% pancreatectomy and intravenous dehydroascorbate (patterson, 1956), and intravenous streptozotocin in some species (white and cinotti, 1972). a large number of drugs and toxins have been demonstrated to cause cataracts in both man and animals. toxic cataracts are reviewed in depth by kuck (1977). many of these agents have been of particular interest because of their therapeutic use in humans. metabolic inhibitors, not surprisingly, frequently cause cataracts. one such inhibitor, iodoacetate, has been shown to produce cataracts in rabbit lenses in vivo, in vitro, and, interestingly, by direct injection into the lens (zeller and shoch, 1961). dinitrophenol, a metabolic inhibitor used for weight loss in humans until a high incidence of cataracts was recognized, will also induce cataracts in the chicken (buschke, 1947). sodium cyanate, used experimentally in sickle cell anemia, has recently been shown to cause cataracts in the beagle (kern et al., 1977). cytotoxic agents are another group of drugs associated with cataracts. busulfan and triethylene melamine cataracts have been studied in the rat, as has dibromomannitol (grimes, 1966, 1970). recently bleomycin has been added to the list of cytotoxins with cataractogenic potential, causing cataracts in 3- to 5-day-old rats injected intraperitoneally with a dosage of 75-100 gm/logm (weill et al., 1979). drugs which cause cataracts through more subtle effects on lens metabolism include corticosteroids, anticholinesterases, and a number of drugs which interfere with lipid metabolism. steroid cataracts, although relatively common in humans, have been difficult to reproduce in animals, but tarkkanen et al. (1966) and wood et al. (1967) have reported success in rabbits using subconjunctival and topical steroid preparations.
anticholinesterase cataracts have also proved challenging to reproduce in animals, with the monkey being the only animal model available at present (kaufman et al., 1977). triparanol, which blocks cholesterol synthesis, and chloroquine and chlorphentermine, which also affect lipid metabolism, have all been shown to produce cataracts. a reversible cataract is seen in rats receiving a diet containing 0.05% triparanol; the cataract could not be reproduced in rabbits (harris and gruber, 1972). an unusual anterior polar cataract is produced with chloroquine and chlorphentermine (drenckhahn, 1978). a group of drugs with the potential to produce cataracts by photosensitizing the lens to ultraviolet light has generated considerable attention. chlorpromazine and methoxypsoralen have been used in rats to study this effect (howard et al., 1969; jose and yielding, 1978). of the nontherapeutic agents known to cause cataract, naphthalene has been studied most thoroughly, and lens opacities have been induced in the rabbit, rat, and dog. the mechanism of this type of cataract has been studied extensively by van heyningen and pirie (1967). naphthalene can be used to create congenital cataracts in rabbits (kuck, 1970). toxic congenital cataracts have also been seen in mice secondary to corticosteroids (ragoyski and trzcinska-dabrowska, 1969). congenital cataracts can be due to infections as well as to toxic and metabolic insults in utero. although no animal model of rubella cataract is available for study, three models of virus-induced cataracts in animals have been reported. the enders strain of mumps virus, injected into the chick blastoderm prior to differentiation of the lens placode, has been reported to cause cataracts in the survivors (robertson et al., 1965). hanna et al. (1968) showed that subviral particles of st. louis encephalitis virus injected intracerebrally into 4-day-old rats would cause cataracts in survivors.
finally, a poorly categorized infectious agent, believed to be a type of slow virus, the suckling mouse cataract agent (smca), produces cataracts when injected intracerebrally in mice less than 24 hours old (olmsted et al., 1966). smca has also been studied in vitro by infecting cultured rabbit lenses (fabiyi et al., 1971). amino acid deficiencies can also cause cataracts in rats (hall et al., 1948). the most reproducible form of this type of cataract is that due to tryptophan deficiency, which can be demonstrated in guinea pigs by feeding a diet containing less than 0.1% tryptophan (von sallmann et al., 1959). despite the skyrocketing proliferation of intraocular lenses for use in humans following cataract extraction, very little animal experimentation has been done in this area. animal selection should consider the size and shape of the anterior chamber, pupil, and lens as well as species-related reactivity to surgery and lens implantation. in general, nonprimate mammals respond to intraocular manipulations with increased exudation and inflammation compared to primates and humans. sampson (1956) implanted ridley-type lenses in nine dogs undergoing cataract extraction. schillinger et al. (1958) reported disappointing results with crude lenses implanted in rabbit and dog eyes. eifrig and doughman (1977) used modern iris fixation lenses in rabbits (fig. 22). rabbits were prone to develop corneal edema after lens extraction with or without lens implantation, but edema was prolonged in eyes receiving lenses. cats have been used for implantation of lenses with slightly less technical difficulty than rabbits, but their vertical slit pupil is not ideal. because of their large anterior chamber, cats are not suitable for implantation of standard-sized anterior chamber lenses. degeneration of the retina has been a leading cause of visual impairment and partial or complete blindness in both man and animals.
many causal factors have been defined and may be categorized as genetic, chemical, developmental, or environmental. the latter category has received the majority of experimental attention, being primarily concerned with the induction of retrograde changes by chemicals, radiant energy, physical trauma, nutritional deficiency, and viral infections. the multiplicity of genetic models will be dealt with briefly because, rather than specific techniques, they are entities within themselves. the wag/rij rat is described by lai (1977) as the "retinitis pigmentosa model," featuring a slowly progressive photoreceptor degeneration thought to be an autosomal dominant trait. at present the genetic model for human retinal dystrophy is the rcs (tan-hooded) or "dystrophic" rat (herron, 1977). in this animal, rod photoreceptor degeneration occurs because the retinal pigment epithelial (rpe) cells do not phagocytize shed rod outer segment photopigment discs. the (rd) mutant mouse has proved to be helpful in evaluating the effects of photoreceptor degeneration on the bipolar cell terminal synapse (blanks et al., 1974). feline central retinal degeneration, involving a diffuse atrophy of retinal cones, has been described and is a potential model for human macular degeneration (bellhorn and fischer, 1970) (fig. 23); the genetics have not been defined (fischer, 1974). a tool for studying retinal degeneration in mice and rats involves the use of chimeras. inherited canine retinal degenerative disease, collectively described as "progressive retinal atrophy," has been recognized in a number of breeds (magnusson, 1911; hodgman et al., 1949; parry, 1953; barnett, 1965a) (fig. 24) (table ii). genetic studies have revealed the majority to be inherited as a simple autosomal recessive trait, with three genotypes in the population: the homozygous normal, the heterozygous normal carrier, and the homozygous affected dog.
clinical, electroretinographic, and transmission electron microscopic studies have been utilized to define distinctly different conditions in various breeds, including rod-cone dystrophy in poodles (aguirre and rubin, 1972) and central progressive retinal atrophy in other breeds (parry, 1954; barnett, 1965b; aguirre and laties, 1976). another agent, given to long evans rats, produced a retinal degeneration that involved both photoreceptor cells and the rpe. fluorescein angiography and electron microscopy were utilized to assess altered retinal vascularity. these findings were likened to retinitis pigmentosa in man. to evaluate retinal changes caused by phenylalanine, the drug was administered subcutaneously to newborn rats for 1 week, producing profound damage to the bipolar and ganglion cell layers. the presence of the immature blood-brain barrier was suggested to be significant in the development of the lesions (colmant, 1977). hamsters were used by another investigator to produce retinal pigmentary degeneration with n-methyl-n-nitrosourea, this model being suggested for use in screening the retinotoxicity of carcinogenic drugs. it should be pointed out, however, that the fundus was difficult to visualize, and the results were derived from histopathology (herrold, 1967). male dutch-belted rabbits were given different levels of oxalate compounds subcutaneously to study the "flecked retina" condition seen in oxalate retinopathy. this model is also proclaimed valuable for assessing b vitamin deficiencies, ethylene glycol poisoning, and methoxyflurane anesthesia toxicity (caine et al., 1975). four- to 5-month-old pigmented rabbits were used by brown in 1974 to study the effect of lead on the rpe. the electroretinogram, electrooculogram, and flat mount and histopathological preparations were utilized. the erg and eog recordings were normal, suggesting no major disturbance of the rpe. one of the smaller new world primate species, the squirrel monkey, was used by mascuilli et al.
(1972) to evaluate the effects of various elemental iron salts on the retina. ocular siderosis with rpe and photoreceptor cell damage was evident as early as 4 hours after intravitreal administration. the erg was used to study these effects. synthetic antimalarials have long been recognized as producing ocular complications in humans, especially macular pigmentary disturbances. berson (1970) administered chloroquine to cats to assess acute damage to the retina, utilizing intravitreal and posterior ciliary artery injections (figure 25). heywood (1974) mentions the affinity of chloroquine and related drugs for melanin pigment. miyata et al. (1978) used the long evans pigmented rat to evaluate the effects of intramuscular fenthion on the retina. erg as well as histopathologic changes were extensive 12 months after administration. a brief review of retinotoxic drug effects, illustrated in color, is available in the "atlas of veterinary ophthalmoscopy" (rubin, 1974). included are funduscopic representations of ethambutol effects in the dog, naphthalene in the rabbit, chloroquine and related drugs in rabbits and cats, and a selection of other retinotoxic agents. in the study of induced retinal degeneration, especially those conditions with rpe involvement, pigmented animals are more desirable than albinos. the subhuman primate is the only species with a macula comparable to the human, although birds may possess dual foveae. some mammals have an area centralis, a zone of high cone concentration that, while similar, is neither functionally nor anatomically identical to the macula. species differences in the relative numbers of rod and cone photoreceptors vary with the animal's ecological niche, with nocturnal creatures exhibiting a preponderance of rods and diurnal animals more likely to possess a cone-dominant retina.
with these exceptions (not including variations in retinal vascular patterns, discussed elsewhere), the structural and functional relationships of the neurosensory retina are remarkably similar throughout the vertebrates. a multitude of variations producing this type of degeneration have been studied, including fluorescent, incandescent, ultraviolet, infrared, and colored light excesses with continuous or interrupted exposure patterns, high- and low-level intensities, and other variations. additional factors such as age, sex, body temperature, species, and nutrition have also been simultaneously evaluated. lai et al. (1978) used fisher 344 albino rats exposed to a high-intensity fluorescent light to produce a severe peripheral retinal degeneration. a continuous high-intensity fluorescent light exposure regimen for albino and hooded rats produced a more severe photoreceptor cell degeneration in the former (reuter and hobbelen, 1977). low-intensity continuous colored lights disclosed that white and blue bulbs produced the most damage in adult albino rats. it is apparent that continuous light sources are the most damaging to the albino animal, the higher the intensity the more severe the degeneration. newborn albino animals, however, will show initial outer segment growth before damage eventually occurs (kuwabara and funahashi, 1976). in 1973, berson evaluated the 13-striped ground squirrel, which has an all-cone retina, under high-intensity illumination. essentially 200 times the amount of light required to produce cone degeneration in the albino rat produced "retinitis punctata albescens" in the ground squirrel. zigman et al. (1975) . a more specific study was performed by tso et al. (1972) in rhesus monkeys to assess the effects of a retinally focused, low-intensity light beam under normal and hypothermic conditions.
although no difference was detected due to body temperature, a progressive degeneration that involved the rpe and photoreceptor cells was identified in both study groups. the recent development of laser applications in ocular therapy prompted study in laboratory species. gibbons et al. (1977) used rhesus monkeys to study visible and nonvisible lasers at high and low power levels and found that a 1-hour exposure to the latter was equivalent in rpe damage to 24 hours of the former. in another study, tso and fine (1979) exposed the foveolas of rhesus monkeys to an argon laser for 10-20 minutes and found cystoid separations of the rpe and bruch's membrane at 3-4 years post-insult. good anesthesia and subject stabilization are obvious requirements for these procedures. the advent of microwave radiation for everyday use has initiated considerable evaluation of its safety for the human eye. retinal damage in the form of synaptic neuron degeneration was produced in rabbits by exposure to 3100 mhz microwave radiation (paulsson et al., 1979). ultrastructural changes were discovered that had not previously been seen, and these resulted from lower levels of energy than anticipated. solar retinitis has long been a problem in man but recently has increased due to greater numbers of unprotected people being exposed to snow-reflected sunlight. ham et al. (1979) produced photochemical (short-wavelength) and thermal (long-wavelength) lesions in rhesus monkeys by exposure to a 2500 w xenon lamp, simulating in the former case the solar spectrum at sea level. this model demonstrated the short-wavelength causal nature of sunlight overexposure, resulting in solar retinitis and eclipse blindness. higher-energy radiation effects have been evaluated by other means, including high-speed particle acceleration. proton irradiation of discrete areas of the retina in owl monkeys produced full-thickness retinal damage (gragoudas et al., 1979).
rhesus monkey retinal irradiation by highly accelerated (250 mev) oxygen nuclei produced extensive retinal vascular damage followed by degrees of degeneration and necrosis of the neural layers (bonney et al., 1977). traumatic optic nerve injuries in humans are not so unusual as diseases that attack and destroy the optic nerve. in either case, however, retinal degeneration frequently follows. ascending and descending optic atrophy was produced in squirrel monkeys by anderson (1973) to study the degenerative process. total optic nerve neurectomy by razor blade knife via lateral orbitotomy was performed to assess the descending condition, while xenon arc photocoagulation was performed to create the ascending condition. electron microscopy demonstrated degenerative changes affecting the nerve fiber and ganglion cell layers. both methods produced morphologically similar retrograde processes; photocoagulation, however, provided evidence of ascending atrophy within 2 weeks. commotio retinae, or traumatic retinopathy, was initially induced in rabbits by berlin (1873), who used an elastic stick to produce the retinopathy. sipperley et al. (1978) employed a bb gun technique that delivered a standard-sized metallic bb which struck the cornea between the limbus and pupillary axis to produce the desired contrecoup injury of the posterior pole. the segments of the outer nuclear photoreceptor layer were specifically affected, the authors using fluorescein angiographic, histopathologic, and electron microscopic results to demonstrate the disease. retinal degenerative conditions caused by nutritionally related problems have been studied in rats, cats, and monkeys. these studies have mostly been concerned with vitamin deficiencies and, for economic reasons, have lent themselves well to the rat as a model. the subhuman primate may be genealogically ideal for certain investigations, but it is equally impractical for fiscal reasons.
the cat has been used because of a specific susceptibility to a deficiency of the amino acid taurine (fig. 26). in 1975 hayes et al. used young and adult cats to assess the effects of taurine deficiency by feeding a diet of casein, which contains very little cystine, the major precursor for taurine. in 3 months a photoreceptor cell degeneration was produced, the initial signs including a hyperreflective granular zone in the area centralis. the erg recordings indicated a photoreceptor degenerative process, while morphologically the outer nuclear and plexiform layers were destroyed. recently, wen et al. (1979) discovered that taurine deficiency in cats produces, in addition to the photoreceptor cell degeneration, a tapetal cell degeneration, thus suggesting that this amino acid also plays a role in maintaining structural integrity in this tissue. these investigators employed the visual evoked potential (vep) technique to arrive at their conclusions. because of its known unusually high concentration in the retina, numerous species, such as chickens, rabbits, rats, and frogs, have been used to define the biologic function of taurine. pourcho (1977) utilized intravenous and intravitreal radiolabeled 35s-taurine. the cat, mouse, rabbit, and monkey exhibited higher concentrations of taurine in müller cells, whereas, by comparison, the chick and frog showed very little. vitamin a deficiency has been studied in various species, the rat having been utilized most extensively because it is susceptible to night blindness as well as being less expensive and readily available. carter-dawson et al. (1979) used the offspring of vitamin a-deficient pregnant female rats to accelerate the desired condition of tissue depletion. under low levels of cyclic illumination (1.5-2 ft-c) both rhodopsin and opsin levels decreased, the latter requiring a longer period of time. in addition, it was determined that rod cells degenerated before the cone cells.
in another experiment, autoradiographic techniques were used to assess the influence of vitamin a deficiency on the removal of rod outer segments in weanling albino rats (herron and riegel, 1974). the process was distinctly retarded by the absence of the vitamin. in a combination study, robison et al. (1979) evaluated the effects of relative deficiencies of vitamins a and e in rats. autofluorescent and histochemical techniques revealed that the vitamin e-free diets produced significant increases in fluorescence and lipofuscin granule staining regardless of the vitamin a levels. in conclusion vitamin e was thought to prevent light damage by scavenging oxygen radicals and thereby providing protection against lipid peroxidation. two different species of subhuman primates (cebus and macaca) were used by hayes (1974) to compare (separately) vitamin a and e deficiencies. after 2 years on a vitamin e-deficient diet a macular degeneration developed, characterized by a focal massive disruption of cone outer segments. vitamin a deficiency, on the other hand, produced xerophthalmia, keratomalacia, and impaired vision in six cebus monkeys, the latter due to cone degeneration in the macula and midperipheral retina. information describing these techniques is available from a previous publication (ausman and hayes, 1974). the ocular effects of two slow virus diseases have recently been studied: scrapie agent in hamsters (buyukmihci et al., 1977) and borna disease in rabbits (krey et al., 1979). the scrapie agent, when inoculated intracerebrally in young hamsters, produced a diffuse thinning of the retina, the photoreceptor cell layer being most severely affected (fig. 27). volvement in each foci. viral antigen was detected in these lesions as well as in nonaffected areas, thus suggesting an immune component involvement. this condition was likened to ocular pathology seen in subacute sclerosing panencephalitis in humans. friedman et al.
(1973) injected newborn albino rats with simian virus 40 (sv40) and demonstrated viral antigen by immunofluorescence at 48 hours. adult rats injected at birth developed retinal neovascularization, folds, and gliosis, features similar to retinal dysplasia. in comparison, an extensive retinopathy was produced by monjan et al. (1972) by infecting newborn rats with lymphocytic choriomeningitis virus. the outer nuclear layers of the retina were initially affected, followed by the inner nuclear and ganglion cell layers and finally total destruction. due to a modest inflammatory infiltrate, this condition was also suspected to be immunopathologic in nature. the critical assessment of retinal degeneration has been done through the implementation of five basic techniques: (1) electroretinography (erg), (2) electrooculography (eog), (3) visual evoked response (ver), (4) autoradiography, and (5) light and electron microscopy. the concept of the electroretinogram (erg) was first demonstrated by holmgren in 1865 using the enucleated eye of a frog. this technique is in principle applicable to both animal and man and provides a detailed assessment of the rod-cone (visual cell) nature of the retina as well as its functional status. it is based upon the summation of retinal electrical potential changes which occur in response to light stimulation and are measured via corneal or skin surface electrodes (fig. 28). detailed analyses of the erg can be obtained from brown (1968), as well as information concerning the use of microelectrodes. additional information describing the local electroretinogram (lerg), a more precise but invasive technique, is available from rodieck (1972). the organization of the vertebrate retina and the origin of the erg potential change have been elucidated by the use of intracellular readings. the mudpuppy, necturus maculosus, has been a favorite subject because of its large retinal cells.
excellent reviews of comparative retinal organization and function are provided by dowling (1970) and witkovsky (1971). the electrooculogram (eog) is a clinically applied test of retinal function that was first measured in the eye of a tench by du bois-reymond in 1849. this test assesses the standing or corneofundal potential which exists between the anterior and posterior poles of the eye as a subject is taken from a dark- to a light-adapted state. for detailed information concerning this technique refer to krogh (1979) (rabbit) or arden and ikeda (1966) (rat). a newer technique, the visual evoked potential (vep), has been adapted for use in animals by wen et al. (1979). this enables the clinical assessment of visual function in the area centralis as compared to the peripheral retina by measuring potential changes in the visual cortex in response to focal or diffuse retinal stimulation. the techniques described for the cat are also applicable to other species. autoradiography has been available and successfully used in ophthalmologic research for several years (cowan et al., 1972). tritiated amino acids are readily available in various concentrations and forms and may be injected into the vitreous or other ocular tissues where they may become incorporated into cellular protein metabolism pathways. distribution may be studied by light or electron microscopy. ogden (1974) utilized the concept of axoplasmic transport together with autoradiography to trace the course of peripheral retinal nerve fibers to the optic disc. others have utilized this technique to study outer segment photopigment metabolism. rods have been shown to regenerate photopigment discs continuously, with the rpe participating in the process by phagocytizing the shed products (young, 1970; norton et al., 1974; mandelcorn et al., 1975). cones also have mechanisms for photopigment renewal. specific information concerning autoradiographic techniques may be obtained from cowan et al.
(1972), rogers (1967), and kopriwa and leblond (1962). electron microscopy has expanded the morphologic study of the retina beyond the limitations of light microscopy. in 1970 laties and liebman discovered by chance that the chlortriazinyl dye, procion yellow, was a selective stain for the outer segments of retinal cones in the mud puppy and frog. further investigation by laties et al. (1976) demonstrated the value of this dye in distinguishing between the outer segments of rods and cones and its use in monitoring outer segment renewal. these investigators studied the dye in rat, dog, rabbit, and monkey eyes as well and discovered that the rod basal saccules could not be visualized in the rat and dog with the same certainty as in the gekko (nocturnal lizard) and mud puppy. specimen preparation involves the injection of aqueous procion yellow (0.5 to 4%) into the vitreous using a 3/8-inch, 26-gauge needle. although originally used to describe a syndrome consisting of cns and systemic anomalies in the human infant, retinal dysplasia (rd) now applies specifically to any abnormal differentiation of the retina after the formation of the anlage. the histopathologic features normally constituting rd include: (1) rosette formation, (2) retinal folds, and (3) gliosis (lahav and albert, 1973). table iii lists those agents that have either been associated with or used in the induction of rd. the administration of blue tongue virus vaccine to fetal lambs appears to be the one technique that offers all of the features described for rd. this technique involves the use of a live, attenuated virus vaccine, and it is effective only during the first half of gestation (silverstein et al., 1971). the rd-inducing effects of x-ray exposure on the retina of the primate fetus have been described (rugh, 1973). the erg was used in this experiment as well as histopathology to demonstrate the lesions. shimada et al.
(1973) used suckling rats to evaluate the effects of antitumor and antiviral drugs, respectively, the results in both cases being the induction of rosette formation. in 1973 silverstein used the fetal lamb and intrauterine trauma to demonstrate the relationship of the rpe in the organizational histogenesis of the retina (fig. 29). experimental retinal detachment was first described by chodin in 1875. the rabbit, dog, and subhuman primate have since become species of choice, the pigmented rabbit model probably being the most practical, as it is less expensive, more genetically uniform, and easier to manipulate. a spontaneous inherited detachment has been reported in the collie dog, accompanied by choroidal hypoplasia and posterior staphylomas (freeman et al., 1966). experimental methods have utilized specifically designed techniques to induce detachment including: (1) traction detachment by perforating injury (cleary and ryan, 1979a,b), (2) intravitreal hyperosmotic injection (marmor, (norton et al., 1974), (4) detachment by experimental circulatory embolization (algvere, 1976), and (5) detachment by blunt needle rotation and suction without the use of hyaluronidase (johnson and foulds, 1977). traction detachments are dependent on vitreal hemorrhage and scarring effects (cleary and ryan, 1979a,b), the rabbit and rhesus monkey having been successfully utilized in these studies. vitreal changes are significant in most of these techniques, producing detachments of varying dimensions and duration (marmor, 1979). embolization of the retinal and choroidal circulations with resultant retinal ischemia produced a long-lasting retinal detachment (algvere, 1976); the technique used plastic (polystyrene) beads injected into the central retinal artery and the supratemporal vortex veins of owl monkeys (aotus trivirgatus).
another approach which produced septic choroiditis and multifocal serous retinal detachments in dogs involved the intracarotid injection of pathogenic bacteria (meyers et al., 1978). experimental studies of the optic nerve have explored two aspects of this tissue: the pathogenesis of the degenerative changes resulting from increased iop, and the development of an experimental model for allergic optic neuritis, similar to that seen in multiple sclerosis. the former topic is discussed in the section on glaucoma. in regard to the latter, several workers have observed optic neuritis as well as retinal vasculitis and uveitis in guinea pigs, rabbits, and monkeys affected with experimental allergic encephalomyelitis (raine et al., 1974; von sallmann et al., 1967; bullington and waksman, 1958). rao et al. (1977) induced papilledema and demyelination of the optic nerve in strain 13 guinea pigs by sensitization with isogenic spinal cord emulsion in complete freund's adjuvant. unlike many other areas of ocular research where evidence obtained from animal models has been accepted with allowances for differences between the human and animal eye, the striking contrasts in ocular blood supply from one species to the next are impossible to ignore. as a result a variety of animal models have been studied in an attempt to sort out those aspects of the vascular anatomy and physiology which are highly variable from those which seem to be generally similar for a number of different species. the rabbit is sometimes used in this line of research, but more often it is supplanted by the monkey, cat, pig, or rat, all of which have an ocular blood supply which is more analogous to that of man. there are some general trends in the comparative anatomy of laboratory animals with respect to the ocular circulation. all of the commonly studied animals have a dual circulation to the eye consisting of (1) the uveal and (2) the retinal blood vessels.
in higher primates, including man, both the uveal and retinal vessels are branches of the ophthalmic artery, itself a branch of the internal carotid; the external carotid system contributes very little to the ocular blood supply. in lower animals, on the other hand, the external carotid system, by way of the external ophthalmic artery, supplies the major portion of blood to the eye. many animals, including the dog, cat, and rat (but not the rabbit), have a strong anastomosing ramus between the external ophthalmic artery and the circle of willis, and since a relatively large part of their cerebral blood supply comes from the vertebral arteries, the common carotid artery may often be totally occluded with no apparent ill effects to the eye (jewell, 1952). the retinal circulation in primates is supplied by the central retinal artery, which branches off from the ophthalmic artery to enter the optic nerve close to the globe (fig. 30). in lower animals the retinal circulation is more often derived from a network of anastomosing branches of the short ciliary arteries usually referred to as the circle of haller-zinn. in fact, of the nonprimate mammals, the rat is the only animal with a central retinal artery homologous to that of the primates. the presence of a central retinal artery in dogs and cats is disputed (françois and neetens, 1974), but even if present it is not the main source of blood to the retina (fig. 31). the extent and configuration of the retinal vessels are even more variable than their source. in most laboratory animals, including primates, dogs, cats, rats, and mice, the retina is more or less completely vascularized, or holangiotic. rabbits, however, have only two wing-shaped horizontal extensions of vessels from the optic nerve which are accompanied by myelinated nerve fibers (the medullary rays), and these do not actually extend into the retina at all.
horses have only a few small retinal vessels scattered around the optic disc (paurangiotic), and birds have a retina which is completely devoid of blood vessels (anangiotic). in summary, then, the primates have an ocular blood supply which is almost identical to that of man. the rat should also be considered as an animal model in studies of the retinal circulation. the ocular blood supply in cats and dogs is somewhat less analogous to that of the human. the rabbit has some very atypical features, especially of the retinal circulation, which should be considered very carefully before including it in experiments on the ocular circulation. one feature of the orbital vascular anatomy in rabbits which has not been mentioned is a peculiar anastomosis between the two internal ophthalmic arteries, which has been implicated in the consensual irritative response in rabbits following an injury to one eye (forster et al., 1979). because of the complexity and small size of the intraocular blood vessels, dissection, even under the microscope, is usually inadequate to study the anatomical relationships of these vessels. corrosion casting techniques, involving infusion of the eye with neoprene (prince, 1964b) or plastic resin followed by digestion of the tissues, usually with concentrated sodium hydroxide, have been used for study of the uveal circulation and less often the retinal circulation. in our laboratory corrosion casting with a special low-viscosity plastic has been used with excellent results to study both the anterior and posterior uveal as well as the retinal circulation in cats and monkeys (risco and noapanitaya, 1980). castings are studied with the scanning electron microscope (fig. 32). the retina, because it is thin and relatively transparent, is amenable to simpler studies of vascular morphology. the classic text on this subject is by michaelson (1954), who based his observations primarily on flat mounts of retina perfused with india ink.
other dyes have been utilized in a similar fashion (prince, 1964b). another popular technique, introduced by kuwabara and cogan (1960), involves digestion of the retina in 3% trypsin at 37°c for 1-3 hours until the tissue shows signs of disintegration, followed by careful cleaning away of the partially digested tissue from the vascular tree and then staining with periodic acid-schiff (pas). other investigators believe that trypsin digestion may destroy some of the more delicate or immature vessels and advocate pas staining alone to delineate retinal vessels in immature retina (engerman, 1976a,b; henkind et al., 1975). so far we have mentioned only in vitro studies of the ocular circulation. in vivo studies can provide information about the dynamics of blood flow as well as morphology. the most widely used in vivo technique is fluorescein angiography. fluorescein dye is excited by blue light with a wavelength of 490 nm, emitting a yellow-green light with a wavelength of 520 nm. the blood vessels of the retina and iris are relatively impermeable to fluorescein because of their endothelial tight junctions. thus fluorescein angiography has been useful not only in the study of normal anatomy and the time sequence of blood flow in these vessels, as demonstrated by the excellent studies in the monkey by hayreh (1969; hayreh and baines, 1972a,b), but also as a test of integrity of the blood-ocular barrier in certain pathological states, especially neovascular lesions. the diffusion of fluorescein across the blood-ocular barrier can be assessed quantitatively by fluorophotometric techniques (cunha-vaz, 1979). there is evidence that, in mammals at least, the blood-ocular barriers are similar for different species (rodriguez-peralta, 1975). the basic equipment required is a fundus camera capable of rapid-sequence flash photographs and appropriate filters.
satisfactory fluorescein angiograms can be obtained in most laboratory animals, even mice and rats, although general anesthesia is usually required. angiograms of dogs, cats, and other carnivora are technically more difficult because the reflection from the tapetum interferes with the observation of retinal blood flow. in these animals color fluorescein angiography may give better results. many of the technical aspects of fluorescein angiography in animals are discussed by bellhorn (1972). anterior segment angiography of the iris vessels has the advantage of requiring less expensive equipment; a 35 mm camera with a macro lens and rapid recycling flash are essential (rumelt, 1974). the rabbit, the most popular experimental animal for anterior segment angiograms, has an anterior segment circulation similar to humans, with the exception of the absence of a contribution from the anterior ciliary arteries to the anterior uveal blood supply, a situation possibly unique among mammals (ruskell, 1964). the technique and interpretation of anterior segment angiography are discussed in detail by kottow (1978). choroidal angiograms are also possible but present a technical problem in that in pigmented animals the retinal pigment epithelium (rpe) absorbs light strongly in both the excitation and emission wavelengths for fluorescein. the rpe is transparent to wavelengths in the infrared range, and the use of cyanine dyes for infrared absorption and emission angiograms of the choroid has been described (hochheimer, 1979), as well as a technique for simultaneous angiography of the retinal and choroidal circulations using both fluorescein and indocyanine green (flower and hochheimer, 1973). tsai and smith (1975) have used fluorescein for choroidal angiograms by injecting dye directly into a vortex vein. angiograms of the ciliary circulation in rabbits have been obtained using conventional radio-opaque media and dental x-ray film (rothman et al., 1975).
the measurement of absolute and relative blood flow, velocity of blood flow, and oxygen content in various ocular tissues has been key to the understanding of nervous control of ocular physiology as well as the ocular effects of many drugs. unfortunately, these measurements have been fraught with technical difficulties. many of the techniques utilized have been indirect methods requiring complicated mathematical analysis often based on tenuous assumptions. as a result large discrepancies have resulted from the use of different techniques or even the same technique in different hands. to complicate matters further, different investigators have expressed results in different terms which are not easily interconvertible. the results of many of these studies have been reviewed and tabulated for comparison by henkind et al. (1979). measurements of total or localized ocular blood flow have been made based on the washout of various gases including nitrous oxide, 85kr, and 133xe (o'rourke, 1976). a heated thermocouple has been used to measure relative blood flow (armaly and araki, 1975), and bill (1962a,b) measured choroidal flow directly in rabbits by cannulating a vortex vein and in cats by cannulation of the anterior ciliary vein. the most recent technique is the use of radio-labeled microspheres (o'day et al., 1971). the labeled microspheres are injected into the arterial system, and shortly thereafter the animal is sacrificed and samples of ocular tissues are taken for quantitation of radioactivity. the microspheres are presumed to embolize in the small vessels of the various ocular tissues in amounts proportional to the blood flow in that tissue. photographic analysis following injection of fluorescein and other dyes has been used to determine retinal oxygen saturation (hickam et al., 1963), mean circulation time (hickam and frayser, 1965), and choroidal blood flow (trokel, 1964; ernest and goldstick, 1979).
oxygen saturation has also been measured directly in monkeys using a microelectrode inserted through the pars plana (flower, 1976). velocity of blood flow has been estimated using doppler techniques (riva et al., 1972; stern, 1975) and thermistors (takats and leister, 1979). while absolute measurements may not agree from one study to another, relative measurements may still be valid. most studies confirm that the choroid receives about 75% of the total ocular blood flow, while the anterior uvea receives from 10 to 35% and the retina about 5% of total ocular flow. one of the most important problems facing clinical human ophthalmologists is the control and treatment of vasoproliferative diseases in the eye. these neovascular proliferations may be confined to the retina or grow into the overlying vitreous and eventually result in retinal detachment or vitreous hemorrhage. they may also involve the iris (rubeosis iridis) and lead to neovascular glaucoma. diabetes, retrolental fibroplasia, and retinal vein occlusion are all commonly associated with ocular neovascularization. all of the above conditions have in common some degree of vasoobliteration followed by a period of retinal ischemia and a subsequent vasoproliferative response. the most widely accepted theory is that growth of new vessels is stimulated by elaboration of a vasoproliferative factor from ischemic retina, a diffusible substance similar to the embryologic inducing agents discovered by spemann (1924). this concept was supported by the isolation of a diffusible factor from certain tumors which stimulated neovascularity (folkman et al., 1971). ryu and albert (1979) demonstrated a variable nonspecific neovascular response to viable or nonviable melanoma or retinoblastoma cells in a rabbit corneal model. the response was negligible in immune-deficient animals. more recently, experiments by federman et al.
(1980) have indicated that implantation of ischemic ocular tissues into the rabbit cornea can stimulate a noninflammatory neovascular response distant to the site of implantation. in diabetes, capillary dropout in the retina is a well-documented phenomenon and is probably the precursor of proliferative retinopathy. reports of eye changes, especially capillary changes, in animals with spontaneous or induced diabetes are common (table iv), but many of these reports represent sporadic findings. engerman (1976a,b) has reviewed the subject and feels that the retinopathy associated with alloxan-induced diabetes in dogs is the best animal model based on morphology and reproducibility of the lesions. alloxan diabetes produces a similar retinopathy in monkeys, but microaneurysms require about 7 years to develop (r. l. engerman, unpublished communication, 1980). the retinopathic changes seen in these animals consist of microaneurysms and other structural capillary changes. we are not aware of any animal model of proliferative diabetic retinopathy with extraretinal neovascularization. (table iv sources: patz and maumenee, 1962; patz et al., 1965; sibay and hausler, 1967; gepts and toussaint, 1967; bloodworth, 1965; engerman et al., 1972; engerman and bloodworth.) retrolental fibroplasia (rlf) is a result of oxygen toxicity to the immature retina. thus premature babies treated with oxygen are most susceptible, and the most immature part of the retina, usually the temporal periphery, is the most common area of involvement. the disease can be induced in newborn puppies, kittens, rats, and mice by exposure to high oxygen concentrations because these animals have an immature, incompletely vascularized retina at birth. the kitten has been the most popular animal model for rlf, but the changes seen are comparable only to the early stages of rlf in humans, with severe extraretinal disease complicated by retinal detachment being a rare finding in any of the animal models.
there is some evidence that there are subtle differences in the pathophysiology of human rlf and that seen in animal models (ashton, 1966). isolated retinal branch vein occlusion produced with the argon laser has been shown to cause a neovascular response in monkeys (hamilton et al., 1975). once again, however, the new vessels are intraretinal and do not grow into the vitreous. the injection of inflammatory or toxic substances, such as blood (yamashita and cibis, 1961), ammonium chloride (sanders and peyman, 1974), or immunogenic substances including insulin (shabo and maxwell, 1976), into the vitreous may result in intravitreal vessels and fibrous membranes. these eyes show similarities to the late stages of diabetes or rlf in humans, but the pathophysiology is most likely different. the difficulty in producing intravitreal neovascular growth in animals may be due in part to an inhibitory effect of vitreous on proliferating blood vessels (brem et al., 1976; felton et al., 1979). rabbit v-2 carcinoma implanted into rabbit vitreous results in a neovascular response only after the tumor has grown into contact with the retina (finkelstein et al., 1977). the same tumor implanted into the corneal stroma stimulates a neovascular ingrowth toward the tumor (brem et al., 1976), and, as already mentioned, federman et al. (1980) have reported that corneal implantation of ischemic retina stimulates a similar response. rubeosis, or neovascularization of the iris, may accompany diseases which cause retinal neovascularization and often produces glaucoma which is difficult to treat. there is apparently no reproducible way to induce true rubeosis in animals at present. attempts to produce the condition experimentally have been reviewed by gartner and henkind (1978). occlusive vascular disease of the eye includes not only the relatively spectacular central retinal artery and vein occlusions but also a number of disorders with more subtle or insidious signs and symptoms.
ischemic optic neuropathy, geographic choroiditis, some forms of persistent uveitis (knox, 1965), and possibly acute posterior multifocal placoid pigment epitheliopathy (ueno et al., 1977) are all examples of ocular disorders which can result from vasoocclusive events. experimental occlusion of the larger ocular vessels can be accomplished by occlusion of the vessel with a ligature or clamp. this includes the long and short posterior ciliary arteries (hayreh, 1964; hayreh and baines, 1972a,b; anderson and davis, 1974) and the central retinal artery and vein (hayreh et al., 1978, 1979). lessell and miller (1975) reported effects on the optic nerve and retina in surviving monkeys following complete circulatory arrest for 14 to 30 minutes. experimental occlusion of the smaller vessels has been produced by embolization of the arterial system with plastic beads or various other particulate matter. most of the early embolization experiments were limited by the fact that the particulate material was injected into the carotid artery, thus indiscriminately embolizing small vessels throughout the eye (reinecke et al., 1962; hollenhorst et al., 1962; ashton and henkind, 1965). more recent reports describe techniques for selective embolization of the choroidal and retinal circulations based on injection site (kloti, 1967) or temporary occlusion of the retinal circulation during injection (stern and ernest, 1974). the most selective occlusion of retinal vessels has been accomplished with photocoagulation using the argon laser (hamilton et al., 1975) or xenon photocoagulator (ernest and archer, 1979). the concept of using the globe as an in vivo tissue culture medium has been discussed and exemplified in a previous section; the technique is applicable to a variety of autologous, homologous, and heterologous tumors of a variety of cell line origins.
specifics are more within the realm of the oncologist than the ophthalmologist and as such will not be dwelt on. several experimental tumor models have been well defined and have been utilized to elucidate spontaneous tumorigenic processes and to study the biological behavior of the induced tumors. intraocular injection of viruses will induce tumors in a number of laboratory animals. injection of human adenovirus type 12, a dna virus, into newborn rat vitreous produced retinoblastoma-like tumors in 3 of 35 animals between 204 and 300 days postinjection. albert and his associates (1967) had earlier described similar observations in hamsters injected subcutaneously with tissue-cultured ocular cell lines exposed to the virus. mukai and kobayashi (1973) produced extraocular orbital tumors in 12 of 59 newborn hamsters 27 to 106 days following intravitreal injection of the same virus; the tumors resembled a neuroepithelioma, suggesting that they were derived from neurogenic primordia in the retrobulbar space. further investigations demonstrated the retinoblastoma in mice (mukai et al., 1977). jc polyoma virus was injected into the eyes of newborn hamsters by ohashi and his associates (1978); 20% of the animals developed retinoblastoma- or ependymoblastoma-like intraocular tumors 7½ to 13 months postinjection. eleven percent developed extraocular tumors, including schwannomas, fibrosarcomas, lacrimal gland carcinomas, and ependymal tumors. feline leukemia virus, an oncornavirus, was injected systemically and intraocularly into fetal and newborn kittens; a tumor of apparent retinal origin developed in 1 of 29 subjects (albert et al., 1977a). the same virus injected into the anterior chamber of cats results in the development of iris and ciliary body melanomas or, less commonly, fibrosarcomas, in a high percentage of injected animals (shadduck et al., 1977; albert et al., 1977b).
taylor and his associates (1971) utilized intravenous 226ra to induce ciliary body melanomas in dogs; the tumors appeared to originate from the pigmented epithelium. albert et al. (1980) transplanted human choroidal melanoma into the vitreous of the "nude" mouse, a homozygous mutant (nu/nu) with a defect in cellular immunity. serial passage transplantation was possible. the fact that the animal is immunodeficient limits the value of this model for studying biologic behavior, but this in vivo system provides abundant tissue for morphologic, biochemical, immunologic, and therapeutic studies.
coll key: cord-241351-li476eqy authors: liu, junhua; singhal, trisha; blessing, lucienne t.m.; wood, kristin l.; lim, kwan hui title: crisisbert: a robust transformer for crisis classification and contextual crisis embedding date: 2020-05-11 journal: nan doi: nan sha: doc_id: 241351 cord_uid: li476eqy classification of crisis events, such as natural disasters, terrorist attacks and pandemics, is a crucial task to create early signals and inform relevant parties for spontaneous actions to reduce overall damage. despite crisis such as natural disasters can be predicted by professional institutions, certain events are first signaled by civilians, such as the recent covid-19 pandemics. social media platforms such as twitter often exposes firsthand signals on such crises through high volume information exchange over half a billion tweets posted daily. prior works proposed various crisis embeddings and classification using conventional machine learning and neural network models. however, none of the works perform crisis embedding and classification using state of the art attention-based deep neural networks models, such as transformers and document-level contextual embeddings. this work proposes crisisbert, an end-to-end transformer-based model for two crisis classification tasks, namely crisis detection and crisis recognition, which shows promising results across accuracy and f1 scores. the proposed model also demonstrates superior robustness over benchmark, as it shows marginal performance compromise while extending from 6 to 36 events with only 51.4% additional data points. we also proposed crisis2vec, an attention-based, document-level contextual embedding architecture for crisis embedding, which achieve better performance than conventional crisis embedding methods such as word2vec and glove. to the best of our knowledge, our works are first to propose using transformer-based crisis classification and document-level contextual crisis embedding in the literature. 
crisis-related events, such as earthquakes, hurricanes and train or airliner accidents, often stimulate a sudden surge of attention and actions from both the media and the general public. despite the fact that crises, such as natural disasters, can be predicted by professional institutions, certain events are first signaled by everyday citizens, i.e., civilians. for instance, the recent covid-19 pandemic was first reported by the general public in china via weibo, a popular social media site, before pronouncements by government officials. social media sites have become centralized hubs that facilitate timely information exchange across government agencies, enterprises, working professionals and the general public. as one of the most popular social media sites, twitter enables users to asynchronously communicate and exchange information with tweets, which are mini-blog posts limited to 280 characters. there are on average over half a billion tweets posted daily [1]. therefore, one can leverage such high-volume and frequent information exchange to expose firsthand signals of crisis-related events for early detection and warning systems that reduce overall damage and negative impacts. event detection from tweets has received significant research attention as a means to analyze crisis-related messages for better disaster management and increased situational awareness. several recent works studied various natural crisis events, such as hurricanes and earthquakes, and man-made disasters, such as terrorist attacks and explosions [2, 3, 4, 5]. these works focus on binary classification of various attributes of a crisis, such as classifying the source type, predicting the relatedness between tweets and the crises, and assessing informativeness and applicability [6, 7, 8]. on the other hand, several works proposed multi-label classifiers on affected individuals, infrastructure, casualties, donations, caution, advice, etc. [9, 10].
crisis recognition tasks are likewise conducted, such as identifying crisis types, e.g., hurricanes, floods and fires [11, 12]. machine learning-based models are commonly used to perform the above-mentioned tasks. conventional linear models such as logistic regression, naive bayes and support vector machines (svm) have been reported for automatic binary classification of informativeness [13] and relevancy [8], among others. these models were implemented with pre-trained word2vec embeddings [14]. several unsupervised approaches have also been proposed for classifying crisis-related events, such as the clustop algorithm, which utilizes community detection for automatic topic modelling [15]. a transfer-learning approach has also been proposed [16], though its classification is limited to two classes, and its ability for cross-crisis evaluation remains questionable. more recently, numerous works proposed neural network (nn) models for crisis-related data detection and classification. for instance, alrashdi and o'keefe investigated two deep learning architectures, namely bidirectional long short-term memory (bilstm) and convolutional neural networks (cnn), using domain-specific and glove embeddings [17]. nguyen et al. proposed a cnn-based classifier with word2vec embeddings pretrained on google news [14] and domain-specific embeddings [18]. lastly, a parallel cnn architecture was proposed to detect disaster-related events using tweets [19, 20].
while prior works report remarkable performance on various crisis classification tasks using nn models and word embeddings, no studies have leveraged the most recent natural language understanding (nlu) techniques, such as attention-based deep classification models [21] and document-level contextual embeddings [22], which reportedly improve state-of-the-art performance on many challenging natural language problems, from upstream tasks such as named entity recognition and part-of-speech tagging, to downstream tasks such as machine translation and neural conversation. this work focuses on deep attention-based classification models and document-level contextual representation models to address two important crisis classification tasks. we study recent nlu models and techniques that have demonstrated drastic improvements on the state of the art and localize them for domain-specific crisis-related tasks. overall, the main contributions of this work include:
• proposing crisisbert, an attention-based classifier that improves state-of-the-art performance for both crisis detection and recognition tasks;
• demonstrating superior robustness over various benchmarks, where extending crisisbert from 6 to 36 events with 51.4% additional data points results in only a marginal performance decline, while increasing the number of crisis classes by 500%;
• proposing crisis2vec, a document-level contextual embedding approach for crisis representation, and showing substantial improvement over conventional crisis embedding methods such as word2vec and glove.
to the best of our knowledge, this work is the first to propose a transformer-based classifier for crisis classification tasks. we are also the first to propose a document-level contextual crisis embedding approach. in this section, we discuss the recent works that propose various machine learning approaches for crisis classification tasks.
while these works report substantial improvements in performance over prior works, none of them uses state-of-the-art attention-based models, i.e., transformers [21], to perform crisis classification tasks. we propose crisisbert, a transformer-based architecture that builds upon a distilled bert model, fine-tuned via a large-scale hyper-parameter search. various works propose linear classifiers for crisis-related events. for instance, parilla-ferrer et al. proposed automatic binary classification of informative versus uninformative tweets using naive bayes and support vector machines (svm) [13]. an svm with pretrained word2vec embeddings was also proposed [14]. besides linear models, recent works also propose deep learning-based methods with different neural network architectures. for instance, alrashdi and o'keefe investigated bidirectional long short-term memory (bilstm) and convolutional neural network (cnn) models using domain-specific and glove embeddings [23]. (figure: tweets are tokenized and passed into a distilbert model; since we are performing classification tasks, the cls token vector, i.e., the first output vector, is passed into a linear classifier for the detection or recognition task, whereas the remainder of the output vectors are average-pooled to create crisis2vec embeddings.) nguyen et al. proposed a cnn model to classify tweets into information types using google news and domain-specific embeddings [18]. in 2017, vaswani et al. from google introduced the transformer [21], a new category of deep learning models that are solely attention-based, without convolution or recurrence mechanisms. later, google proposed the bidirectional encoder representations from transformers (bert) model [24], which drastically improved state-of-the-art performance on multiple challenging natural language processing (nlp) tasks. since then, multiple transformer-based models have been introduced, such as gpt [25] and xlnet [26], among others.
transformer-based models have also been deployed to solve domain-specific tasks, such as medical text inferencing [27] and occupational title embedding [28], and demonstrated remarkable performance. bidirectional encoder representations from transformers (bert), for instance, is a multi-layer bidirectional transformer encoder with an attention mechanism [24]. the proposed bert model has two variants, namely (a) bert base, which has 12 transformer layers, a hidden size of 768, 12 attention heads, and 110m total parameters; and (b) bert large, which has 24 transformer layers, a hidden size of 1024, 16 attention heads, and 340m total parameters. bert is pre-trained with self-supervised approaches, i.e., masked language modeling (mlm) and next sentence prediction (nsp). while transformers such as bert are reported to perform well in natural language processing, understanding and inference tasks, to the best of our knowledge, no prior works propose and examine the performance of transformer-based models for crisis classification. in this work, we investigate the transformer approach for crisis classification tasks and propose crisisbert, a transformer-based classification model that surpasses conventional linear and deep learning models in performance and robustness. we conduct a large-scale random search over language models [29], optimizers, learning rates, and batch sizes. table 1 shows the breakdown of the search space and the final hyper-parameters for crisisbert. each set of parameters is randomly chosen and run for 3 epochs over two trials. in total, we evaluate over 300 hyper-parameter sets on an nvidia titan-x (pascal) for over 1,000 gpu hours. taking the performance-efficiency trade-off into consideration, we select the distilbert model for our transformer lm layer. distilbert is a compressed version of bert base obtained through knowledge distillation. using only 50% of the layers of bert, distilbert runs 60% faster while preserving 97% of bert's capabilities on language understanding tasks.
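the random hyper-parameter search just described can be sketched as follows; the search-space values below are illustrative stand-ins, not the exact grid from table 1 of the paper:

```python
import random

# illustrative search space; the actual grid is given in table 1 of the paper
SEARCH_SPACE = {
    "language_model": ["distilbert", "bert-base", "bert-large"],
    "optimizer": ["adam", "adamw"],
    "learning_rate": [1e-5, 3e-5, 5e-5],
    "batch_size": [16, 32, 64],
}

def sample_config(rng):
    """randomly pick one value per hyper-parameter dimension."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}

rng = random.Random(0)
# each sampled set would then be trained for 3 epochs over two trials;
# the paper reports evaluating over 300 such sets
trials = [sample_config(rng) for _ in range(300)]
```

each sampled configuration is then trained and scored, and the best-performing set is kept as the final model configuration.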
the optimal set of hyper-parameters for distilbert includes an adamw [30] optimizer, an initial learning rate of 5e-5, and a batch size of 32. output layer. the output layer of the distilbert lm is a set of 768-d vectors led by the class header vector. since we are conducting classification tasks, only the [cls] token vector is used as the aggregate sequence representation for classification with a linear classifier. the remainder of the output vectors are processed into crisis2vec embeddings using a mean-pooling operation. as discussed in section 2.3, the crisis2vec embedding is a byproduct of crisisbert: the embeddings are constructed from a pre-trained bert model and subsequently fine-tuned on three corpora of crisis-related tweets [6, 31, 32] to become domain-specific for crisis-related tweet representation. crisis2vec leverages the advantages of transformers, including (1) a self-attention mechanism that incorporates sentence-level context bidirectionally, (2) both word-level and positional information for contextual word representations, and (3) pre-training on large relevant corpora. to the best of our knowledge, we are the first to propose a document-level contextual embedding approach for crisis-related document representation. upon convergence, we construct the fixed-length tweet vector using a mean-pooling strategy [22], where we compute the mean of all output vectors, as illustrated in algorithm 1. in this work, we conduct two crisis classification tasks, namely crisis detection and crisis recognition. we formulate the crisis detection task as binary classification that identifies whether a tweet is relevant to a crisis-related event. the crisis recognition task, on the other hand, extends the problem to multi-class classification, where the output is a probability vector indicating the likelihood that a tweet refers to specific events.
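the output-layer handling described above — the [cls] vector going to the linear classifier and the remaining vectors being mean-pooled into a crisis2vec embedding — can be sketched with small toy vectors standing in for the 768-d distilbert outputs:

```python
def split_outputs(token_vectors):
    """token_vectors: list of per-token output vectors from the lm,
    led by the [cls] vector. returns (cls_vector, mean_pooled_rest)."""
    cls_vec, rest = token_vectors[0], token_vectors[1:]
    dim = len(cls_vec)
    # mean-pool the non-[cls] vectors into a fixed-length tweet embedding
    pooled = [sum(v[d] for v in rest) / len(rest) for d in range(dim)]
    return cls_vec, pooled

# toy 4-d "outputs" for a 3-token tweet plus the [cls] header vector
outputs = [
    [1.0, 0.0, 0.0, 0.0],   # [cls] -> would go to the linear classifier
    [2.0, 2.0, 0.0, 0.0],
    [0.0, 2.0, 2.0, 0.0],
    [1.0, 2.0, 1.0, 3.0],
]
cls_vec, crisis2vec = split_outputs(outputs)
# crisis2vec is the element-wise mean of the last three vectors
```

in the real model the vectors are 768-dimensional and produced by distilbert; the pooling logic is the same.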
both tasks are modelled as sequence classification problems that are formally defined below. we define the crisis detection task d = (s, φ), which is specified by a finite sample space of tweets s = {s_1, ..., s_n} of size n. each sample s_i is a sequence of tokens over t time steps, i.e., s_i = {s_i^1, ..., s_i^t}. φ denotes the set of labels, in the same order as the sample set, φ = {φ_1, ..., φ_n} with φ_i ∈ {0, 1}, where φ_i = 1 indicates that sample s_i is relevant to a crisis, and φ_i = 0 indicates otherwise. a deterministic classifier c_d : s → φ specifies the mapping from sample tweets to their flags. our objective is to train a crisis detector using the provided tweets and labels that minimizes the difference between predicted labels and true labels, i.e., min over c_d of j_d(c_d(s), φ), where j_d denotes some cost function. similarly, we define a crisis recognition task r = (s, l), where the sample space s is identical to that of crisis detection. l denotes a sequence of multi-class labels in the same order as s, i.e., l = {l_1, ..., l_n}, with l_i ∈ r^m for m classes. a deterministic classifier c_r : s → l specifies the mapping from the sample tweets to the crisis classes. the objective of the crisis classification task is to train a sequence classifier using the provided tweets and labels that minimizes the difference between predicted labels and true labels, i.e., min over c_r of j_r(c_r(s), l), where j_r denotes some cost function for classifier c_r. in this section, we discuss the experiments performed and their results, in order to propose a highly effective and efficient approach for text classification. three datasets of labelled crisis-related tweets [6, 31, 32] are used to conduct the crisis classification tasks and evaluate the proposed methods against benchmarks. in total, these datasets consist of close to 8 million tweets, of which 91.6k are labelled.
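the cost functions j_d and j_r are left generic in the formulation above; a common concrete choice is the cross-entropy between predicted class probabilities and true labels, sketched here as an assumption rather than the paper's stated choice:

```python
import math

def cross_entropy(pred_probs, true_label):
    """cost for one sample: negative log-probability assigned to the true class.
    pred_probs: list of class probabilities; true_label: index of the true class."""
    return -math.log(pred_probs[true_label])

def total_cost(batch_probs, labels):
    """mean cross-entropy over a batch; training seeks the classifier
    parameters that minimize this quantity (the j_d / j_r role)."""
    return sum(cross_entropy(p, y) for p, y in zip(batch_probs, labels)) / len(labels)

# two tweets, binary detection task (phi in {0, 1})
probs = [[0.9, 0.1], [0.2, 0.8]]
labels = [0, 1]
cost = total_cost(probs, labels)  # low: both predictions are confident and correct
```

the recognition task uses the same cost with m-class probability vectors instead of binary ones.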
these datasets take the form of: (1) 60k labelled tweets on 6 crises [6], (2) 3.6k labelled tweets on 8 crises [32], and (3) 27.9k labelled tweets on 26 crises [31]. table 2 gives more detail about each dataset and its respective classes. for our experimental evaluation, the 91.6k labelled crisis-related tweets are organized into two datasets, denoted c6 and c36. in particular, c6 consists of 60k tweets from 6 classes of crises, whereas c36 comprises all 91.6k tweets in 36 classes. both datasets are split into training, validation and test sets consisting of 90%, 5% and 5% of the original sets, respectively. crisisbert. we evaluate the performance of crisisbert against multiple benchmarks, which comprise recently proposed crisis classification models from the literature. these include linear classifiers, such as logistic regression (lr), support vector machines (svm) and naive bayes [33], and non-linear neural networks, such as convolutional neural networks (cnn) [20] and long short-term memory (lstm) [34]. furthermore, we investigate the robustness of crisisbert for both detection and recognition tasks. this is achieved by extending the experiments from c6 to c36, which comprise 6 and 36 classes respectively, with only 51.4% additional data points. we evaluate the robustness of the proposed models against benchmarks by observing how much performance is compromised while the classification task is drastically enlarged. as described in section 2.3, we use the optimal set of hyper-parameters for crisisbert in the experiments, which includes a bert model with distillation (i.e., distilbert), an adamw [30] optimizer, an initial learning rate of 5e-5, a batch size of 32, and a word dropout rate of 0.25. crisis2vec. to evaluate crisis2vec, we choose two classifiers, with the aim of representing both traditional machine learning approaches and nn approaches.
the two selected models are: (1) a linear logistic regression model, denoted lr_c2v, and (2) a non-linear lstm model, denoted lstm_c2v. we evaluate the performance of crisis2vec with the two models by replacing the original embeddings with crisis2vec, ceteris paribus. we use two common evaluation metrics, namely accuracy and f1 score, which are functions of true-positive (tp), false-positive (fp), true-negative (tn) and false-negative (fn) predictions. accuracy is calculated as: accuracy = (tp + tn) / (tp + tn + fp + fn). for the f1 score over multiple classes, we calculate the unweighted mean of the per-class f1 scores, i.e., for n classes of labels: f1 = (1/n) Σ_i 2 · precision_i · recall_i / (precision_i + recall_i), where precision = tp / (tp + fp) and recall = tp / (tp + fn). we select and implement several crisis classifiers proposed in recent works to serve as benchmarks for evaluating our proposed methods. concretely, we compare crisisbert with the following models:
• lr_w2v: logistic regression model with word2vec embeddings pre-trained on the google news corpus [33]
• svm_w2v: support vector machine model with word2vec embeddings pre-trained on the google news corpus [33]
• nb_w2v: naive bayes model assuming gaussian-distributed features, with word2vec embeddings pre-trained on the google news corpus [33]
• cnn_gv: convolutional neural network model with 2 convolutional layers of 128 hidden units, kernel size of 3, pool size of 2, 250 filters, and glove word embeddings [20]
• lstm_w2v: long short-term memory model with 2 layers of 30 hidden states and a word2vec-based crisis embedding [34]
overall, the experimental results show that both proposed models achieve significant improvements in performance and robustness over the benchmarks across all tasks. the experimental results for crisisbert and crisis2vec are tabulated in table 4. robustness. comparing the crisis detection task between c6 and c36, crisisbert shows declines of 1.7% and 1.3% in f1-score and accuracy, which is better than most benchmarks (declines from 1.7% to 6.3%), with the exception of cnn.
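the accuracy and unweighted (macro-averaged) f1 metrics defined above can be computed directly from their definitions; a minimal sketch:

```python
def accuracy(y_true, y_pred):
    """fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred, n_classes):
    """unweighted mean of per-class f1 scores (one-vs-rest tp/fp/fn counts)."""
    f1s = []
    for c in range(n_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / n_classes

# toy 3-class example
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred, 3)
```

because the macro average is unweighted, rare crisis classes count as much as frequent ones, which is why it is reported alongside accuracy.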
however, when we compare the more challenging crisis recognition task between c6 and c36, the performance of crisisbert degrades only marginally, i.e., by 1.6% in f1-score and 0.7% in accuracy. in contrast, all benchmark models record significant declines, from 6.0% to 67.2%. discussion. based on the experimental results discussed above, we observe that: (1) crisisbert exceeds state-of-the-art performance for both detection and recognition tasks, by up to 8.2% and 25.0% respectively; (2) crisisbert demonstrates higher robustness, with only a marginal performance decline (i.e., less than 1.7% in f1-score and accuracy); and (3) crisis2vec shows superior performance compared to conventional word2vec embeddings, for both lr and lstm models across all experiments. 5 related work. event detection from tweets has received significant research attention as a means to analyze crisis-related messages for better disaster management and increased situational awareness [4, 5, 2, 3]. parilla-ferrer et al. proposed automatic binary classification of informativeness using naive bayes and support vector machines (svm) [13]. stowe et al. presented an annotation scheme for tweets to classify relevancy and six further categories [8]. furthermore, the use of pre-trained word2vec reportedly improved svm performance for crisis classification [14]. lim et al. proposed the clustop algorithm, utilizing a community detection approach for automatic topic modelling [15]. pedrood et al. proposed to transfer-learn the classification of one event to another using a sparse coding model [16], though the scope was limited to only two events, i.e., hurricane sandy (2012) and super typhoon yolanda (2013). a substantial number of works focuses on using neural networks (nn) with word embeddings for crisis-related data classification. manna et al. compared nn models with conventional ml classifiers [33].
alrashdi and o'keefe investigated two deep learning architectures, namely bidirectional long short-term memory (bilstm) and convolutional neural networks (cnn), with domain-specific and glove embeddings, and showed good performance [17]. however, the study had yet to validate the relevance of the model on a different crisis type. nguyen et al. applied a cnn to classify information types using google news and domain-specific embeddings [18]. kersten et al. [20] implemented a parallel cnn to detect two disasters, namely hurricanes and floods, reporting an f1-score of 0.83; the cnn architecture was proposed earlier by kim [19]. word-level embeddings such as word2vec [35] and glove [23] are commonly used to form the basis of crisis embeddings [18, 36] in various crisis classification works to improve model performance. for context, word2vec uses a neural network language model (nnlm) that is able to represent latent information at the word level. glove achieved better results with a simpler approach, constructing global vectors to represent contextual knowledge of the vocabulary. more recently, a series of high-quality embedding models, such as fasttext [37] and flair [38], have been proposed and reported to improve the state of the art on multiple nlp tasks. both word-level contextualization and character-level features are commonly used in these works. pre-trained models on large corpora of news and tweet collections are also made publicly available to assist downstream tasks. furthermore, transformer-based models have been proposed to conduct sentence-level embedding tasks [22]. social media platforms such as twitter have become hubs of crowd-generated information for early crisis detection and recognition. in this work, we present a transformer-based crisis classification model, crisisbert, and a contextual crisis-related tweet embedding model, crisis2vec.
we examine the performance and robustness of the proposed models by conducting experiments on three datasets and two crisis classification tasks. experimental results show that crisisbert improves the state of the art for both detection and recognition tasks, and further demonstrates robustness by extending from 6 classes to 36 classes with only 51.4% additional data points. finally, our experiments with two classification models show that crisis2vec enhances classification performance compared to word2vec embeddings, which are commonly used in prior works.
references:
• natural disasters detection in social media and satellite imagery: a survey
• situational awareness enhanced through social media analytics: a survey of first responders
• processing social media messages in mass emergency: a survey
• earthquake shakes twitter users: real-time event detection by social sensors
• crisislex: a lexicon for collecting and filtering microblogged communications in crises
• semi-supervised discovery of informative tweets during the emerging disasters
• identifying and categorizing disaster-related tweets
• extracting information nuggets from disaster-related messages in social media
• online public communications by police & fire services during the 2012 hurricane sandy
• on semantics and deep learning for event detection in crisis situations
• verifying baselines for crisis event information classification on twitter
• automatic classification of disaster-related tweets
• distributed representations of words and phrases and their compositionality
• clustop: a clustering-based topic modelling algorithm for twitter using word networks
• mining help intent on twitter during disasters via transfer learning with sparse coding
• deep learning and word embeddings for tweet classification for crisis response
• robust classification of crisis-related data on social networks using convolutional neural networks
• convolutional neural networks for sentence classification
• robust filtering of crisis-related tweets
• attention is all you need
• sentence-bert: sentence embeddings using siamese bert-networks
• glove: global vectors for word representation
• bert: pre-training of deep bidirectional transformers for language understanding
• language models are unsupervised multitask learners
• xlnet: generalized autoregressive pretraining for language understanding
• ncuee at mediqa 2019: medical text inference using ensemble bert-bilstm-attention model
• ipod: an industrial and professional occupations dataset and its applications to occupational data mining and analysis
• distilling the knowledge in a neural network
• what to expect when the unexpected happens: social media communications across crises
• analysing how people orient to and spread rumours in social media by looking at conversational threads
• effectiveness of word embeddings on classifiers: a case study with tweets
• a deep multi-modal neural network for informative twitter content classification during emergencies
• a neural probabilistic language model
• applications of online deep learning for crisis response using social media information
• enriching word vectors with subword information
• contextual string embeddings for sequence labeling
key: cord-258018-29vtxz89 authors: cooper, ian; mondal, argha; antonopoulos, chris g. title: a sir model assumption for the spread of covid-19 in different communities date: 2020-06-28 journal: chaos solitons fractals doi: 10.1016/j.chaos.2020.110057 sha: doc_id: 258018 cord_uid: 29vtxz89 in this paper, we study the effectiveness of the modelling approach on the pandemic due to the spreading of the novel covid-19 disease and develop a susceptible-infected-removed (sir) model that provides a theoretical framework to investigate its spread within a community. here, the model is based upon the well-known susceptible-infected-removed (sir) model, with the difference that a total population is not defined or kept constant per se and the number of susceptible individuals does not decline monotonically.
on the contrary, as we show herein, it can increase during surge periods. in particular, we investigate the time evolution of the different populations and monitor diverse significant parameters for the spread of the disease in various communities, represented by countries and the state of texas in the usa. the sir model can provide us with insights into and predictions of the spread of the virus in communities that the recorded data alone cannot. our work shows the importance of modelling the spread of covid-19 with the sir model that we propose here, as it can help to assess the impact of the disease by offering valuable predictions. our analysis takes into account data from january to june 2020, the period that contains the data before and during the implementation of strict control measures. we propose predictions for various parameters related to the spread of covid-19 and for the number of susceptible, infected and removed populations until september 2020. by comparing the recorded data with the data from our modelling approaches, we deduce that the spread of covid-19 can be brought under control in all communities considered, if proper restrictions and strong policies are implemented to control the infection rates early in the spread of the disease. in december 2019, a novel strand of coronavirus (sars-cov-2) was identified in wuhan, hubei province, china, causing a severe and potentially fatal respiratory syndrome, i.e., covid-19. since then, it has become a pandemic, declared by the world health organization (who) on march 11, and has spread around the globe [1, 2, 3, 4, 5]. who published on its website preliminary guidelines with public health care measures for countries to deal with the pandemic [6]. since then, the infectious disease has become a public health threat. italy and the usa are severely affected by covid-19 [7, 8, 9]. millions of people are forced by national governments to stay in self-isolation and in difficult conditions.
the disease is growing fast in many countries around the world. in the absence of a proper medicine or vaccine, social distancing, self-quarantine and wearing a face mask have currently emerged as the most widely used strategies for the mitigation and control of the pandemic. in this context, mathematical models are required to estimate disease transmission, recovery, deaths and other significant parameters separately for various countries, that is, for different, specific regions of high to low reported cases of covid-19. different countries have already taken precise and differentiated measures that are important to control the spread of the disease. however, important factors such as population density, insufficient evidence for different symptoms, the transmission mechanism and the unavailability of a proper vaccine still make it difficult to deal with such a highly infectious and deadly disease, especially in countries with high population density such as india [10, 11, 12]. recently, many research articles have adopted the modelling approach, using real incidence datasets from affected countries, and have investigated different characteristics as functions of various parameters of the outbreak, as well as the effects of intervention strategies in different countries, with respect to their current situations. it is imperative that mathematical models are developed to provide insights and make predictions about the pandemic, and to plan effective control strategies and policies [13, 14, 15]. modelling approaches [8, 16, 17, 18, 19, 20, 21] are helpful to understand and predict the possibility and severity of the disease outbreak and provide key information to determine the intensity of covid-19 disease intervention.
the susceptible-infected-removed (sir) model and its extended modifications [22, 23, 24, 25], such as the extended-susceptible-infected-removed (esir) mathematical model in its various forms, have been used in previous studies [26, 27, 28] to model the spread of covid-19 within communities. here, we propose the use of a novel sir model with different characteristics. one of the major assumptions of the classic sir model is that there is a homogeneous mixing of the infected and susceptible populations and that the total population is constant in time. in the classic sir model, the susceptible population decreases monotonically towards zero. however, these assumptions are not valid in the case of the spread of the covid-19 virus, since new epicentres spring up around the globe at different times. to account for this, the sir model that we propose here does not consider the total population and takes the susceptible population as a variable that can be adjusted at various times to account for new infected individuals spreading throughout a community, resulting in an increase in the susceptible population, i.e., in the so-called surges. the sir model we introduce here is given by the same simple system of three ordinary differential equations (odes) as the classic sir model and can be used to gain a better understanding of how the virus spreads within a community of variable population in time, when surges occur. importantly, it can be used to make predictions of the number of infections and deaths that may occur in the future and provide an estimate of the time scale for the duration of the virus within a community. it also provides us with insights on how we might lessen the impact of the virus, which is nearly impossible to discern from the recorded data alone. consequently, our sir model can provide a theoretical framework and predictions that can be used by government authorities to control the spread of covid-19.
in our study, we used covid-19 datasets from [29] in the form of time series, spanning january to june, 2020. in particular, the time series are composed of three columns, which represent the total cases i_tot^d, active cases i^d and deaths d^d in time (rows). these datasets were used to update the parameters of the sir model to understand the effects and estimate the trend of the disease in various communities, represented by china, south korea, india, australia, usa, italy and the state of texas in the usa. this allowed us to estimate the development of covid-19 spread in these communities by obtaining estimates for the number of deaths d, susceptible s, infected i and removed r_m populations in time. consequently, we have been able to estimate its characteristics for these communities and assess the effectiveness of modelling the disease. the paper is organised as follows: in sec. 2, we introduce the sir model and discuss its various aspects. in sec. 3, we explain the approach we used to study the data in [29] and in sec. 4, we present the results of our analysis for china, south korea, india, australia, usa, italy and the state of texas in the usa. section 5 discusses the implications of our study for the "flattening the curve" approach. finally, in sec. 6, we conclude our work and discuss the outcomes of our analysis and its connection to the evidence that has already been collected on the spread of covid-19 worldwide.
2. the sir model that can accommodate surges in the susceptible population
the world around us is highly complicated. for example, how a virus spreads, including the novel strand of coronavirus (sars-cov-2) that was identified in wuhan, hubei province, china, depends upon many factors, some of which are considered by the classic sir model, which is rather simplistic and cannot take into consideration surges in the number of susceptible individuals.
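as a concrete illustration of how the removed and recovered populations are derived from the three recorded columns, the sketch below computes removals r_m^d = i_tot^d − i^d and recoveries r^d = r_m^d − d^d; the daily values used here are hypothetical, not the actual data from [29].

```python
def removals_and_recoveries(total_cases, active_cases, deaths):
    """derive removals r_m^d = i_tot^d - i^d and recoveries
    r^d = r_m^d - d^d from the three recorded columns."""
    removals = [tot - act for tot, act in zip(total_cases, active_cases)]
    recoveries = [rm - d for rm, d in zip(removals, deaths)]
    return removals, recoveries

# hypothetical three-column time series (rows are days)
i_tot = [10, 25, 60, 120]   # total cases
i_act = [10, 20, 40, 70]    # active cases
d_rec = [0, 1, 3, 8]        # deaths

r_m, r = removals_and_recoveries(i_tot, i_act, d_rec)
# r_m -> [0, 5, 20, 50], r -> [0, 4, 17, 42]
```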
here, we propose the use of a modified sir model with new characteristics, based upon the classic sir model. in particular, one of the major assumptions of the classic sir model is that there is a homogeneous mixing of the infected i and susceptible s populations and that the total population n is constant in time. also, in the sir model, the susceptible population s decreases monotonically towards zero. these assumptions, however, are not valid in the case of the spread of the covid-19 virus, since new epicentres spring up around the globe at different times. to account for this, we introduce here a sir model that does not consider the total population n, but rather takes the susceptible population s as a variable that can be adjusted at various times to account for new infected individuals spreading throughout a community, resulting in its increase. thus, our model is able to accommodate surges in the number of susceptible individuals in time, whenever these occur and as evidenced by published data, such as those in [29] that we consider here. our sir model is given by the same, simple system of three ordinary differential equations (odes) as the classic sir model that can be easily implemented and used to gain a better understanding of how the covid-19 virus spreads within communities of variable populations in time, including the possibility of surges in the susceptible populations. thus, the sir model here is designed to remove many of the complexities associated with the real-time evolution of the spread of the virus, in a way that is useful both quantitatively and qualitatively. it is a dynamical system that is given by three coupled odes that describe the time evolution of the following three populations: 1. susceptible individuals, s(t): these are those individuals who are not infected but could become infected. a susceptible individual may become infected or remain susceptible.
as the virus spreads from its source or new sources occur, more individuals will become infected, thus the susceptible population will increase for a period of time (surge period). furthermore, it is assumed that the time scale of the sir model is short enough that births and deaths (other than deaths caused by the virus) can be neglected and that the number of deaths from the virus is small compared with the living population. based on these assumptions and concepts, the rates of change of the three populations are governed by the following system of odes, which constitutes our sir model: ds(t)/dt = -a s(t) i(t), di(t)/dt = a s(t) i(t) - b i(t), dr_m(t)/dt = b i(t), (1) where a and b are real, positive parameters of the initial exponential growth and final exponential decay of the infected population i. it has been observed that in many communities a spike in the number of infected individuals, i, may occur, which results in a surge in the susceptible population, s, recorded in the covid-19 datasets [29], amounting to a secondary wave of infections. to account for such a possibility, s in the sir model (1) can be reset to s_surge at any time t_s that a surge occurs, and thus it can accommodate multiple such surges if recorded in the published data in [29], which distinguishes it from the classic sir model. the evolution of the infected population i is governed by the second ode in system (1), where a is the transmission rate constant and b the removal rate constant. we can define the basic effective reproductive rate r_e = a s(t)/b, as the fate of the evolution of the disease depends upon it. if r_e is smaller than one, the infected population i will decrease monotonically to zero, and if it is greater than one, it will increase, i.e., di(t)/dt < 0 if r_e < 1 and di(t)/dt > 0 if r_e > 1. thus, the effective reproductive rate r_e acts as a threshold that determines whether an infectious disease will die out quickly or will lead to an epidemic.
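a minimal numerical sketch of system (1) with the surge mechanism can be written in a few lines; the parameter values, initial conditions, surge time and reset value s_surge used below are illustrative assumptions, not fitted values from this study.

```python
def simulate_sir(a, b, s0=1.0, i0=1e-4, t_final=200.0, h=0.04, surges=None):
    """first-order euler integration of the sir system (1):
        ds/dt = -a*s*i,  di/dt = a*s*i - b*i,  dr_m/dt = b*i.
    `surges` is an optional dict {time: s_surge} resetting s when a surge occurs."""
    surges = surges or {}
    n = int(round(t_final / h))
    s, i, rm = s0, i0, 0.0
    history = [(0.0, s, i, rm)]
    for k in range(1, n + 1):
        t = k * h
        ds = -a * s * i
        di = a * s * i - b * i
        drm = b * i
        s, i, rm = s + h * ds, i + h * di, rm + h * drm
        # reset the susceptible population at any recorded surge time
        for ts, s_surge in surges.items():
            if abs(t - ts) < h / 2:
                s = s_surge
        history.append((t, s, i, rm))
    return history

# illustrative run: a = 0.5, b = 0.2 (r_e = 2.5 initially), with one surge at t = 100
hist = simulate_sir(0.5, 0.2)
hist_surge = simulate_sir(0.5, 0.2, surges={100.0: 0.5})
```

without surges the sum s + i + r_m is conserved by construction, since the right-hand sides of system (1) add to zero at every step.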
at the start of an epidemic, when r_e > 1 and s ≈ 1, the rate of change of the infected population is described by the approximation di(t)/dt ≈ (a − b) i(t) and thus the infected population i will initially increase exponentially according to i(t) = i(0) e^((a−b)t). the infected population will reach a peak when the rate of change of the infected population is zero, di(t)/dt = 0, and this occurs when r_e = 1. after the peak, the infected population will start to decrease exponentially, following i(t) ∝ e^(−bt). thus, eventually (for t → ∞), the system will approach s → 0 and i → 0. interestingly, the existence of a threshold for infection is not obvious from the recorded data, but it can be discerned from the model. this is crucial in identifying a possible second wave, where a sudden increase in the susceptible population s will result in r_e > 1 and in another exponential growth of the number of infections i. the data in [29] for china, south korea, india, australia, usa, italy and the state of texas (communities) are organised in the form of time series where the rows are recordings in time (from january to june, 2020), and the three columns are the total cases i_tot^d (first column), the number of infected individuals i^d (second column) and deaths d^d (third column). consequently, the number of removals can be estimated from the data by r_m^d = i_tot^d − i^d. since we want to adjust the numerical solutions of our proposed sir model (1) to the recorded data from [29], for each dataset (community) we consider initial conditions in the interval [0, 1] and scale them by a scaling factor f to fit the recorded data by visual inspection. in particular, the initial conditions for the three populations are set such that s(0) = 1 (i.e., all individuals are considered susceptible initially), and the scaling factor f is of the order of the maximum number of infected individuals i_max^d.
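the early-time approximation and the threshold role of r_e can be made concrete in a few lines; the parameter values below are illustrative assumptions.

```python
import math

def effective_reproductive_rate(a, b, s):
    """r_e = a * s(t) / b; the infected population grows while r_e > 1."""
    return a * s / b

def early_infected(i0, a, b, t):
    """early-epidemic approximation (s ~ 1): i(t) = i(0) * exp((a - b) * t)."""
    return i0 * math.exp((a - b) * t)

def doubling_time(a, b):
    """time for i to double during the initial exponential phase (requires a > b)."""
    return math.log(2.0) / (a - b)

# illustrative values: with a = 0.5, b = 0.2 and s = 1, r_e = 2.5
# and the infected population doubles every log(2)/0.3 ~ 2.31 days
r_e = effective_reproductive_rate(0.5, 0.2, 1.0)
t_double = doubling_time(0.5, 0.2)
```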
consequently, the parameters a, b, f and i_max^d are adjusted manually to fit the recorded data as well as possible, based on a trial-and-error approach and visual inspections. a preliminary analysis using non-linear fittings to fit the model to the published data [29] provided results at best inferior to those obtained in this paper using our trial-and-error approach with visual inspections, in the sense that the model solutions did not follow the published data as closely, which justifies our approach in the paper. a prime reason for this is that the published data (including those in [29] we are using here) come from different countries that follow different methodologies to record them, with not all infected individuals or deaths accounted for. in this context, s, i and r_m ≥ 0 at any t ≥ 0. system (1) can be solved numerically to find how the scaled (by f) susceptible s, infected i and removed r_m populations (what we call model solutions) evolve with time, in good agreement with the recorded data. in particular, since this system is simple with well-behaved solutions, we used the first-order euler integration method to solve it numerically, with a time step h = 200/5000 = 0.04 that corresponds to a final integration time t_f of 200 days since january, 2020. this amounts to double the time interval in the recorded data in [29] and allows for predictions for up to 100 days after january, 2020. obviously, what is important when studying the spread of a virus is the number of deaths d and recoveries r in time. as these numbers are not provided directly by the sir model (1), we estimated them by first plotting the data for deaths d^d vs the removals r_m^d, where r_m^d = d^d + r^d = i_tot^d − i^d, and then fitting the plotted data with a non-linear function whose constants d_0 and k are estimated by the non-linear fitting. the function is expressed in terms of only model values and is fitted to the curve of the data.
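since the exact non-linear function is not reproduced in the text above, the sketch below assumes one plausible saturating form, d(r_m) = d_0 (1 − e^(−k r_m)), and estimates d_0 and k with a coarse grid search standing in for a proper non-linear least-squares fit; both the functional form and all numbers are assumptions for illustration.

```python
import math

def death_curve(rm, d0, k):
    # assumed saturating form; d0 and k are the constants to be estimated
    return d0 * (1.0 - math.exp(-k * rm))

def fit_death_curve(rm_data, d_data, d0_grid, k_grid):
    """coarse grid search minimising the sum of squared residuals."""
    best = None
    for d0 in d0_grid:
        for k in k_grid:
            sse = sum((death_curve(rm, d0, k) - d) ** 2
                      for rm, d in zip(rm_data, d_data))
            if best is None or sse < best[0]:
                best = (sse, d0, k)
    return best[1], best[2]

# synthetic data generated from the assumed form with d0 = 1000, k = 0.01
rm_data = [0, 100, 200, 400, 800]
d_data = [death_curve(rm, 1000.0, 0.01) for rm in rm_data]
d0_est, k_est = fit_death_curve(rm_data, d_data,
                                d0_grid=[800.0, 900.0, 1000.0, 1100.0],
                                k_grid=[0.005, 0.01, 0.02])
# recoveries then follow as r(t) = r_m(t) - d(t)
```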
thus, having obtained d from the non-linear fitting, the number of recoveries r can be described in time by the simple observation that it is given by the scaled removals r_m from the sir model (1), less the number of deaths d from eq. (3), i.e., r = r_m − d. the rate of increase in the number of infections depends on the product of the number of infected and susceptible individuals. an understanding of the system of eqs. (1) explains the staggering increase in the infection rate around the world. infected people travelling around the world has led to an increase in infected numbers, and this results in a further increase in the susceptible population [14]. this gives rise to a positive feedback loop leading to a very rapid rise in the number of active infected cases. thus, during a surge period, the number of susceptible individuals increases and, as a result, the number of infected individuals increases as well. for example, as of 1 march, 2020, there were 88 590 infected individuals and by 3 april, 2020, this number had grown to a staggering 1 015 877 [29]. understanding the implications of what the system of eqs. (1) tells us, the only conclusion to be drawn using scientific principles is that drastic action needs to be taken as early as possible, while the numbers are still low, before the exponential increase in infections kicks in. here, we have applied the sir model (1) considering data from various countries and the state of texas in the usa provided in [29]. assuming the published data are reliable, the sir model (1) can be applied to assess the spread of the covid-19 disease and predict the number of infected, removed and recovered populations and deaths in the communities, accommodating at the same time possible surges in the number of susceptible individuals.
figures 1-17 show the time evolution of the cumulative total infections i_tot, current infected individuals i, recovered individuals r, dead individuals d, and normalized susceptible populations s for china, south korea, india, australia, usa, italy and texas in the usa, respectively. the crosses show the published data [29] and the smooth lines, solutions and predictions from the sir model. the cumulative total infections plots also show a curve for the initial exponential increase in the number of infections, where the number of infections doubles every five days. the figures also show predictions, and a summary of the sir model parameters in (1) and published data in [29] for easy comparisons. we start by analysing the data from china and then move on to the study of the data from south korea, india, australia, usa, italy and texas. the number of infections peaked in china about 16 february, 2020 and since then it has slowly decreased. from the plots shown in figs. 3 and 4, it is obvious that the south korean government has done a wonderful job in controlling the spread of the virus. the country has implemented an extensive virus testing program. there has also been a heavy use of surveillance technology: closed-circuit television (cctv) and tracking of bank cards and mobile phone usage, to identify who to test in the first place. south korea has achieved a low fatality rate (currently one percent) without resorting to such authoritarian measures as in china. the most conspicuous part of the south korean strategy is simple enough: implementation of repeated cycles of test and contact-trace measures. to match the recorded data from india with predictions from the sir model (1), it is necessary to include a number of surge periods, as shown in fig. 5. this is because the sir model cannot accurately predict the peak number of infections if the actual numbers in the infected population have not yet peaked in time.
it is most likely that the spread of the virus as of early june, 2020 is not contained and there will be an increasing number of total infections. however, by adding new surge periods, a higher and delayed peak can be predicted and compared with future data. in fig. 5, a consequence of the surge periods is that the peak is delayed and higher than if no surge periods were applied. the model predictions for 30 september, 2020 including the surges are: 330 000 total infections, 700 active infections and 7 500 deaths, whereas if there were no surge periods, there would be 130 000 total infections, 700 active infections and 6 300 deaths, with a peak of 60 000, which is about 40% of the current number of active cases, occurring around 20 may, 2020. thus, the model can still give a rough estimate of future infections and deaths, as well as the time it may take for the number of infections to drop to safer levels, at which time restrictions can be eased, even without an accurate prediction of the peak in active infections (see figs. 5 and 6). a surge in the susceptible population was applied in early march, 2020 in the country. the surge was caused by 2 700 passengers disembarking from the ruby princess cruise ship in sydney and then returning to their homes around australia. more than 750 passengers and crew have become infected and 26 died. two government enquiries have been established to investigate what went wrong. also, at this time many infected overseas passengers arrived by air from europe and the usa. the australian government was too slow in quarantining arrivals from overseas. from mid-march, 2020 until mid-may, 2020, the australian governments introduced measures of testing, contact tracing, social distancing, a staying-at-home policy, closure of many businesses and encouraging people to work from home. from figs.
7 and 8, it can be observed that the actions taken were successful. as of early june, 2020, the peak number of infections in the usa has not been reached. when a peak in the data is not reached, it is more difficult to fit the model predictions to the data. in the model, it is necessary to add a few surge periods. this is because new epicentres of the virus arose at different times. the virus started spreading in washington state, followed by california, new york, chicago and the southern states of the usa. the need to add surge periods shows clearly that the spread of the virus is not under control. in the usa, by the end of may, 2020, the number of active infected cases has not yet peaked and the cumulative total number of infections keeps getting bigger. this can be accounted for in the sir model by considering how the susceptible population changes with time in may. during that time, to match the data to the model predictions, surge periods were used where the normalized susceptible population s was reset to 0.2 every four days. what is currently happening in the usa is that as susceptible individuals become infected, their population decreases, with these infected individuals mixing with the general population, leading to an increase in the susceptible population. this is shown in the model by the variable for the susceptible population, s, varying from about 0.06 to 0.20, repeatedly during may. until this vicious cycle is broken, the cumulative total infected population will keep growing at a steady rate and not reach an almost steady state. the fluctuating normalized susceptible variable provides clear evidence that government authorities do not have the spread of the virus under control (see figs. 9 and 10). the plots in figs. 11 and 12 show that the peak in the total cumulative number of infections has not been reached as of early june; however, the peak is probably not far away.
if there are no surges in the susceptible population, then one could expect that by late september, 2020, the number of infections will have fallen to very small numbers and the virus will have been well under control, with the total number of deaths of the order of 2 000. in mid-may, 2020, some restrictions were lifted in the state of texas. the sir model can be used to model some of the possible scenarios if the early relaxation of restrictions leads to an increasing susceptible population. if there is a relatively small increase in the future number of susceptible individuals, no serious impacts occur. however, if there is a large outbreak of the virus, then the impacts can be dramatic. for example, if at the end of june, 2020, s were reset to 0.8 (s = 0.8), a second wave of infections would occur, with the peak number of infections occurring near the end of july and the second-wave peak being higher than the initial peak number of infections. subsequently, the number of deaths would rise from about 2 000 to nearly 5 000, as shown in figs. 13 and 14. if governments start lifting their containment strategies too quickly, then it is probable there will be a second wave of infections with a larger peak in active cases, resulting in many more deaths. figure 15 shows clearly that the peak of the pandemic has been reached in italy and, without further surge periods, the spread of the virus is contained and the number of active cases is declining rapidly. the plots in panels (a), (b) in fig. 16 are a check on how well the model can predict the time evolution of the virus. these plots also assist in selecting the model's input parameters. the term flattening the curve has rapidly become a rallying cry in the fight against covid-19, popularised by the media and government officials.
claims have been made that flattening the curve results in: (i) a reduction in the peak number of cases, thereby helping to prevent the health system from being overwhelmed, and (ii) an increase in the duration of the pandemic with the total burden of cases remaining the same. this implies that social distancing measures and management of cases, with their devastating economic and social impacts, may need to continue for much longer. the picture which has been widely shown in the media is shown in fig. 17(a). the idea presented in the media, as shown in fig. 17(a), is that by flattening the curve, the peak number of infections will decrease; however, the total number of infections will be the same and the duration of the pandemic will be longer. hence, it was concluded that flattening the curve will have a lesser impact upon the demands on hospitals. figure 17(b) gives the scientific meaning of flattening the curve. by governments imposing appropriate measures, the number of susceptible individuals can be reduced, and this, combined with isolating infected individuals, will reduce the peak number of infections. when this is done, it actually shortens the time the virus impacts the society. thus, the second claim has no scientific basis and is incorrect. what is important is reducing the peak in the number of infections; when this is done, it shortens the duration in which drastic measures need to be taken, rather than lengthening the period as stated in the media and by government officials. figure 17 shows that reducing the peak number of infections actually reduces the duration of the impact of the virus on a community. mathematical modelling theories are effective tools to deal with the time evolution and patterns of disease outbreaks. they provide us with useful predictions in the context of the impact of intervention in decreasing the number of infected-susceptible incidence rates [30, 31, 32].
in this work, we have augmented the classic sir model with the ability to accommodate surges in the number of susceptible individuals, supplemented by recorded data from china, south korea, india, australia, usa and the state of texas, to provide insights into the spread of covid-19 in communities. in all cases, the model predictions could be fitted to the published data reasonably well, with some fits better than others. for china, the actual number of infections fell more rapidly than the model prediction, which is an indication of the success of the measures implemented by the chinese government. there was a jump in the number of deaths reported in mid-april in china, which results in a less robust estimate of the number of deaths predicted by the sir model. the susceptible population dropped to zero very quickly in south korea, showing that the government was quick to act in controlling the spread of the virus. as of the beginning of june, 2020, the peak number of infections in india has not yet been reached. therefore, the model predictions give only minimum estimates of the duration of the pandemic in the country and of the total cumulative number of infections and deaths. the case study of the virus in australia shows the importance of including a surge where the number of susceptible individuals can be increased. this surge can be linked to the arrival of infected individuals from overseas and infected people from the ruby princess cruise ship. the data from the usa are an interesting example, since there are multiple epicentres of the virus that arise at different times. this makes it more difficult to select appropriate model parameters and surges where the susceptible population is adjusted. the results for texas show that the model can be applied to communities other than countries. italy provides an example where there is excellent agreement between the published data and model predictions.
thus, our sir model provides a theoretical framework to investigate the spread of the covid-19 virus within communities. the model can give insights into the time evolution of the spread of the virus that the data alone do not. in this context, it can be applied to communities, given that reliable data are available. its power also lies in the fact that, as new data are added to the model, it is easy to adjust its parameters and provide best-fit curves between the data and the predictions from the model. in this context, then, it can provide estimates of the number of likely deaths in the future and time scales for the decline in the number of infections in communities. our results show that the sir model is well suited to predicting the epidemic trend of the disease, as it can accommodate surges and be adjusted to the recorded data. by comparing the published data with predictions, it is possible to predict the success of government interventions. the considered data were taken between january and june, 2020, and contain the datasets before and during the implementation of strict control measures. our analysis also confirms the successes and failures of the control measures taken in some countries. strict, adequate measures have to be implemented to further prevent and control the spread of covid-19. countries around the world have taken steps to decrease the number of infected citizens, such as lockdown measures, awareness programs promoted via media, hand sanitization campaigns, etc., to slow down the transmission of the disease. additional measures, including early detection approaches and isolation of susceptible individuals to avoid mixing them with asymptomatic and self-quarantined individuals, traffic restrictions, and medical treatment have shown they can help to prevent the increase in the number of infected individuals. strong lockdown policies can be implemented in different areas, if possible.
in line with this, necessary public health policies have to be implemented in countries with high rates of covid-19 cases as early as possible to control its spread. the sir model used here is only a simple one and thus the predictions that come out of it might not be accurate enough, something that also depends on the published data and their trustworthiness. however, as the model data show, one thing that is certain is that covid-19 is not going away quickly or easily.
references
world health organization, coronavirus disease (covid-19) outbreak
nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study
novel coronavirus (covid-19) cases, provided by jhu csse
estimation of the transmission risk of the 2019-ncov and its implication for public health interventions
the effect of human mobility and control measures on the covid-19 epidemic in china
naming the coronavirus disease (covid-19) and the virus that causes it
an epidemiological forecast model and software assessing interventions on covid-19 epidemic in china
extended sir prediction of the epidemics trend of covid-19 in italy and compared with hunan, china
quantifying the effect of quarantine control in covid-19 infectious spread using machine learning
predictions for covid-19 outbreak in india using epidemiological models
covid-19: india imposes lockdown for 21 days and cases rise
mohfw, coronavirus disease 2019 (covid-19), available online
on the predictability of infectious disease outbreaks
the effect of travel restrictions on the spread of the 2019 novel coronavirus (covid-19) outbreak, science
epidemics with mutating infectivity on small-world networks
early dynamics of transmission and control of covid-19: a mathematical modelling study, the lancet infectious diseases
modified seir and ai prediction of the epidemics trend of covid-19 in china under public health interventions
analysis and forecast of covid-19 spreading in china, italy and france
a data-driven network model for the emerging covid-19 epidemics in wuhan, toronto and italy
estimation of covid-19 dynamics on a back-of-envelope: does the simplest sir model provide quantitative parameters and predictions?
modeling the impact of mass influenza vaccination and public health interventions on covid-19 epidemics with limited detection capability
three basic epidemiological models
the mathematics of infectious diseases
the basic epidemiology models: models, expressions for r0, parameter estimation, and applications
the sir model and the foundations of public health
global analysis of the covid-19 pandemic using simple epidemiological models
a modified sir model for the covid-19 contagion in italy
mathematical modeling of covid-19 transmission dynamics with a case study of wuhan
modelling the covid-19 epidemic and implementation of population-wide interventions in italy
the effectiveness of quarantine of wuhan city against the corona virus disease 2019 (covid-19): a well-mixed seir model analysis
declaration of competing interest
i am attaching herewith a copy of our manuscript entitled "a sir model assumption for the spread of covid-19 in different communities", co-authored by ian cooper and chris g. antonopoulos, in favor of publication in your esteemed journal chaos, solitons & fractals. the authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work. am is thankful for the support provided by the department of mathematical sciences, university of essex, uk, to complete this work.
key: cord-259534-hpyf0uj6 authors: panda, sumati kumari title: applying fixed point methods and fractional operators in the modelling of novel coronavirus 2019-ncov/sars-cov-2 date: 2020-10-06 journal: results phys doi: 10.1016/j.rinp.2020.103433 sha: doc_id: 259534 cord_uid: hpyf0uj6 this study aims to discuss the prevalence of covid-19 in the u.s., italy, spain, france and china, where the virus spreads most rapidly and causes tragic outcomes. thereafter, we present new insights into the existence and uniqueness of solutions of the 2019-ncov models via fractional and fractal-fractional operators by using fixed point methods. 1 prelude to 2019-ncov 2019-ncov has been terrifying the world. the virus, first seen in wuhan, china, has spread through continents. the death toll has reached its peak in italy, spain, the us, and other advanced and emerging economies alike. the smallest creature, invisible to the eye, questions the existence of mankind. even a country like the u.s.a. is afraid of this virus. this virus disrupts global economies. considering the current situation of europe and the united states, the situation of developing countries has become an unanswered question. researchers are working hard to develop a vaccine for the virus, but no progress has been made so far. from developing countries to well-developed countries, all are working hard to stop the spread of communal infection. health-care services are being kept up to date; nursing and medical staff and physicians are being trained, and many organisations are spreading awareness about the issues related to this virus and its transmission. nevertheless, it is impossible to forecast the propagation of the infection, as the numbers of sick people and those being treated change significantly in various countries every day. viruses in humans have traditionally been regarded as relatively unremarkable pathogens in comparison with those in animals.
human mortality from viruses was very small relative to other diseases such as aids, cardiovascular disorders and cancer. however, whenever a person has had any autoimmune illness or respiratory ailments, or has a weakened immune system, it has been stated that viruses can intensify the effects and have a more significant impact on human safety. however, with its causative agent, this viral disease is a novel entry into the viral world and thus poses unforeseen challenges. 2019-ncov/sars-cov-2, widely known as the novel coronavirus, is a single-stranded rna virus that belongs to the order nidovirales [1] and is responsible for the ongoing 2019-ncov global pandemic [2]. human viruses were not considered to be essential pathogens, as infected individuals develop flu-like symptoms and then cure themselves, the adaptive immune system stimulating the formation of disease-resistant antibodies [3]. although some vaccines have been developed recently, and older people are advised to take shots every year as they have compromised immunity, signs of common influenza have not been a concern in either developed or developing countries. yet the spread of the novel 2019-ncov has alarmed people around the world. it is essential to note how the virus is going to fare in new environments. for this purpose, interdisciplinary work involving biologists, data scientists, mathematicians and clinicians is required in order to stop the propagation of these diseases and to implement effective procedures and medications to control them before the situation gets out of hand. according to data from the who, more than 4.5 million cases of 2019-ncov were registered globally before 18th may 2020, which is around 60 cases per lakh population. china implemented and/or followed strict rules and regulations to defeat 2019-ncov.
when the 2019-ncov outbreak escalated in wuhan, china, in late february, officials went house to house for medical examinations, forcefully isolating every infected person in makeshift hospitals and provisional quarantine centers, even separating parents from small children who exhibited signs of 2019-ncov, no matter how mild. healthcare professionals at the city's omnipresent large apartment buildings were pressed into action as makeshift security officers, tracking the temperatures of all occupants, determining who could enter, and carrying out inspections of delivered food and medicines. drones circled over the sidewalks, ordering citizens to get inside and haranguing them for not wearing surgical masks, while elsewhere in china face-detection technology, connected to a compulsory phone app that color-coded people based on their contagion risk, determined who could enter shopping malls, subway stations, restaurants and other public areas. we may conclude that these strict rules and regulations prevented the spread of 2019-ncov; china has not yet reached the mark of 100k novel coronavirus cases even though the outbreak first escalated there. additional travel restrictions (∼90% of traffic) have only a marginal impact unless combined with public safety measures and behavioural changes that can produce a substantial reduction in the transmissibility of the disease. the 2019-ncov pandemic is currently causing havoc throughout the world, dealing a devastating blow to countries with some of the world's best health services. though the west has not been able to handle the 2019-ncov pandemic adequately thus far, china, singapore, taiwan and thailand have taught the world how to control this extremely contagious epidemic. although china initially botched its response to 2019-ncov, it rapidly adapted and improvised.
even though their response initially seemed intrusive, in retrospect it was probably the only way left for the country: suspending schools and places of employment after january 26th 2020, leading to a drastic decline in new infections; keeping testing for the coronavirus free and easy to access; extensive contact tracing of people who might have met patients with the novel coronavirus 2019-ncov/sars-cov-2; establishing transitional health centers and employing 40k health care professionals from other provinces; timely supply of medical kits, food and groceries to the needy; awareness campaigns demanding that people always wear a mask; and implementing both domestic and international travel bans. as of around 27th may 2020, the total number of coronavirus cases in china was 82,891, the total number of deaths was 4,634, and the recovery rate was ∼91%. for more information, the reader can refer to [21]. a set of major components describes the trajectory of an outbreak, some of which remain poorly known for covid-19. the basic reproduction number (r_0), which characterizes the average number of secondary cases produced by one initial case when the community is largely susceptible to infection, helps determine the total number of people likely to be infected or, more exactly, the area under the outbreak trajectory. for an outbreak to take hold, the value of r_0 must be greater than one. a straightforward estimate then gives the fraction of the population likely to become infected without containment: for covid-19 in china, r_0 was about 2.5 in the early stages of the epidemic, and it was estimated that around 60% of the population would become infected. for a variety of reasons, this is a worst-case scenario.
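the ~60% figure above is consistent with the classical herd-immunity threshold 1 − 1/r_0 evaluated at r_0 = 2.5; the unmitigated final epidemic size is larger still. a minimal sketch (function names are ours, not from the paper):

```python
import math

def herd_immunity_threshold(r0: float) -> float:
    """Fraction that must be immune before incidence starts to decline: 1 - 1/R0."""
    return 1.0 - 1.0 / r0

def final_size(r0: float, tol: float = 1e-12) -> float:
    """Solve the classical final-size relation z = 1 - exp(-R0 * z) by fixed-point iteration."""
    z = 0.9
    while True:
        z_new = 1.0 - math.exp(-r0 * z)
        if abs(z_new - z) < tol:
            return z_new
        z = z_new
```

for r_0 = 2.5 the threshold is 60%, matching the figure quoted above, while the unmitigated final size is roughly 89% of the population.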
we are uncertain about transmission in infants; voluntary social distancing by persons and groups has an effect that is unlikely to be revealed in the estimates; and preventive initiatives, such as the steps taken in china, can significantly reduce transmission. as an outbreak develops, the effective reproduction number drops until it falls below one in value as the disease peaks and then declines, either because of the depletion of individuals susceptible to infection or because of the impact of control measures. the rate of the initial growth of the disease, its subsequent decline, the associated serial interval (the time it takes for an infectious person to transmit the infection to others) and the likely duration of the outbreak are determined by variables such as the duration of infection and the mean infectious period. a new study [22] indicates that the r_0 of severe acute respiratory syndrome coronavirus 2 (sars-cov-2) could be as high as 5.7, up from the original estimate of 2.28 (according to [23]). these studies tend to presume uniform pathogenicity and virus propagation over time, and do not account for mutation of the virus either toward or away from a more virulent strain. with expanding worries about bioterrorism and increasing concerns about biological threats, epidemic modelling has taken on a much more important role in strategy making from a public health viewpoint. mathematical models of infectious diseases can help us interpret disease dynamics and transmission rates. models also allow us to re-enact the spread of diseases under various scenarios in order to develop and assess intervention methodologies to prevent or mitigate infections and to better allocate available resources (for example, choosing the target population, the timing of an intervention and its location).
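as an illustration of the kind of compartmental model referred to above, a minimal sir sketch (parameter values are illustrative, not from the paper):

```python
def sir_step(s, i, r, beta, gamma, dt):
    """One forward-Euler step of the classical SIR model, normalized so s + i + r = 1."""
    new_inf = beta * s * i * dt   # susceptible -> infected
    new_rec = gamma * i * dt      # infected -> recovered/removed
    return s - new_inf, i + new_inf - new_rec, r + new_rec

def simulate_sir(beta=0.5, gamma=0.2, i0=1e-4, days=200, dt=0.1):
    """Run the epidemic to completion; beta/gamma here gives R0 = 2.5."""
    s, i, r = 1.0 - i0, i0, 0.0
    for _ in range(int(days / dt)):
        s, i, r = sir_step(s, i, r, beta, gamma, dt)
    return s, i, r
```

with r_0 = beta/gamma = 2.5 the epidemic burns through most of the susceptible pool before dying out, which is why the effective reproduction number falls below one even without intervention.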
predictive mathematical models are important for forecasting the progression of the outbreak and for preparing effective response strategies. the basic human-to-human transmission model is the sir model, which describes people's movement across three mutually exclusive stages of infection: susceptible, infected, and recovered. more complex models can depict more precisely the diverse propagation of particular infectious diseases, and several models have been developed for the 2019-ncov pandemic. one widely applied framework is that of lin et al., who extended an seir model (susceptible, exposed, infectious, removed) to take into account the perception of risk and the cumulative number of cases. it is worth mentioning some recent developments in mathematical models pertinent to 2019-ncov: ❼ altaf and atangana [4] suggested a mathematical model of type seiarm that is able to depict the spread of 2019-ncov; importantly, using collected data they estimated a reproduction number (r_0) of about 2.4, in good agreement with the value suggested by the who. the fractional model is then solved numerically, with several graphical results that can help to further mitigate the infection; for more information on this model and its mathematical terminology the reader can refer to [4]. in addition, this model helps to analyse and understand the impact of implementing various guidelines and regulations (e.g., more substantial disease confirmatory testing or more stringent social distancing measures), typically resulting in a change in model parameters.
❼ in [6], the authors proposed a mathematical model of type scird that takes into consideration the lock-down effect and the possibility of transmission from deceased to susceptible individuals as well. suffice it to say, the model does not take every feature of the spread into consideration, nor is it a solution for covid-19; rather, it is intended to confirm or dismiss the impact of lock-down as a potentially appropriate step to better flatten the death and infection curves. it is therefore important to note that there has been a drastic increase in the use of mathematical modelling in the study of epidemiological diseases; mathematical models are also used for identifying and estimating the growth of various diseases, abnormal tissue growth, resistance to particular infections, and the surveillance of related natural phenomena (see for example [7]-[11]). the function f is assumed to satisfy: (c*). f is strictly increasing; (c♦). there exists k ∈ (0, 1) such that lim_{a→0+} a^k f(a) = 0; and the contractive inequality is required for all x, y ∈ x such that ox ≠ oy. theorem 5.1. [12] let (x, d) be a complete metric space and let o : x → x be a (ξ − f)-contraction. then for each x_0 ∈ x, o has a unique fixed point. the notion of (ξ − f)-contraction is an extension of f-contraction; the literature on this topic shows that the concept of f-contraction is a natural generalization of the banach contraction principle, with uniqueness as a consequence. for an extensive study of fixed points, f-contractions and fractional operators the reader may refer to [13]-[34]. we now establish the existence of a fixed point for (ξ − f)-contractions applied to the following 2019-ncov model of type seiarm; the theorem stated above plays a vital role in the proof. moreover, we need the following assumptions: (h_1). lim inf_{a→θ+} ξ(a) > 0 for all θ ≥ 0; (h_2). ((1 − α)Γ(α) + θ^α) / (c_19(α)Γ(α)) < (σ_n/σ_{n+1}) (1 + σ_n(σ_n − σ_{n−1})) e^{−θσ_n}. proof. consider the operator o : c(i) → c(i) defined so that a fixed point of o is a solution of eq. (2).
in order to fulfil all the assumptions of theorem 5.1, let us consider the function f(x) = −1/x, x > 0, and a function ξ : (0, ∞) → (0, ∞) of the corresponding form. in this case one can calculate that the contractive condition holds for all x, y ∈ c(i) satisfying σ_{n−1} ≤ ‖x − y‖ < σ_n when n ≥ 2, and 0 < ‖x − y‖ < σ_1 for n = 1. we will show that o satisfies the conditions of theorem 5.1. fix n ≥ 2 and take any a(ν, s_{p1}), a(ν, s_{p2}) ∈ c(i) such that σ_{n−1} ≤ ‖x − y‖ < σ_n. observing the resulting estimates for each ν ∈ i, and noting that since σ_{n+1} > 1 we have −νσ_{n+1} ≤ −ν for all ν ∈ i, it follows, using the properties of the sequence (σ_n) and the same pattern of estimation, that o satisfies all the conditions of theorem 5.1. hence o has a fixed point, which yields the existence of a solution of the 2019-ncov model of type seiarm. in this section we apply the fractal-fractional integral with mittag-leffler kernel according to [6]; the transformed equation is the 2019-ncov model of type scird associated with the fractal-fractional integral with mittag-leffler kernel. we apply theorem 5.1 to verify existence and uniqueness of solutions for the mittag-leffler kernel covid model. we look for the solution of eq. (15) in a subset c of the banach space x of continuous functions, equipped with a supremum-type norm, where c(i, x) is the banach space of all continuous functions from i into x and bielecki's norm is defined as ‖x‖ = sup_{θ∈i} e^{−θ}|x(θ)| for x ∈ c(i, x). in order to obtain our claims we will need the stated assumptions; a fixed point of the operator o is a solution of eq. (15).
in order to fulfil all the assumptions of theorem 5.1, let us consider the function f(x) = log(x), x > 0, and ξ : (0, ∞) → (0, ∞) of the corresponding form. from the hypothesis of the theorem we have, for all x, y ∈ c(i, x) satisfying σ_{n−1} ≤ ‖x − y‖ < σ_n when n ≥ 2 and 0 < ‖x − y‖ < σ_1 for n = 1, that our task is to show that o satisfies eq. (17). estimating |a(ψ, s(ψ), c(ψ), i(ψ), r(ψ), d(ψ)) − a*(ψ, s(ψ), c(ψ), i(ψ), r(ψ), d(ψ))| and hence |a(θ, s(θ), c(θ), i(θ), r(θ), d(θ)) − a*(θ, s(θ), c(θ), i(θ), r(θ), d(θ))| ∫_0^{π/2} θ^{1−β} sin^{2−2β}γ · θ^{α−1} cos^{2α−2}γ · 2t sin γ cos γ dγ (18) < sup_{θ∈i} |a(θ, ...) − a*(θ, ...)| < sup_{θ∈i} |a(θ, ...) − a*(θ, ...)| · σ_n σ_{n+1} (1 + σ_n(σ_n − σ_{n−1})) e^{−θσ_n} e^{−θσ_{n+1}} e^{θσ_{n+1}}. further, we have e^{σ_n − ‖a(θ) − a*(θ)‖} < 1 + σ_n(σ_n − σ_{n−1}), and since σ_{n+1} > 1, −ψσ_{n+1} ≤ −ψ for all ψ ∈ i. in consequence the following holds: |os(θ) − os*(θ)| ≤ sup_{θ∈i} |a(θ, s(θ), c(θ), i(θ), r(θ), d(θ)) − a*(θ, s(θ), c(θ), i(θ), r(θ), d(θ))| × σ_n σ_{n+1} (e^{σ_n − ‖a(θ) − a*(θ)‖}) e^{−θσ_{n+1}} e^{θ(σ_{n+1} − σ_n)}. using the properties of the sequence (σ_n), we get that o satisfies eq. (17). hence all the conditions of theorem 5.1 are satisfied, and thus the 2019-ncov model of type scird has a unique solution. there has been a drastic increase in the use of mathematical modelling in the study of epidemiological diseases: mathematical models can predict how infectious diseases progress, demonstrate the possible outcome of an outbreak, and help support initiatives in public health. in the present situation, 2019-ncov terrifies the world. in this article we presented new existence and uniqueness results for solutions of 2019-ncov models via fractional and fractal-fractional operators, using fixed point methods. a few words about possible extensions of the preceding conclusions: ❼ fixed point method for correlation between weather conditions and the 2019-ncov model of type seiarm in india.
❼ fixed point method for correlation between weather conditions and the 2019-ncov model of type scird in spain. ❼ fixed point method for correlation between weather conditions and the 2019-ncov model of type sidarthe in italy. funding: funding is not applicable for this research paper. competing interests: the authors declare that they have no competing interests. authors' contributions: all authors contributed equally and significantly in writing this article. all authors read and approved the final manuscript. references:
evaluation and treatment coronavirus (covid-19). statpearls. statpearls publishing llc
clinical features of patients infected with 2019 novel coronavirus in wuhan, china
rapid cloning of high-affinity human monoclonal antibodies against influenza virus
modeling the dynamics of novel coronavirus (2019-ncov) with fractional derivative
modelling the covid-19 epidemic and implementation of population-wide interventions in italy
modelling the spread of covid-19 with new fractal-fractional operators: can the lockdown save mankind before vaccination?
fractal-fractional differentiation for the modeling and mathematical analysis of nonlinear diarrhea transmission dynamics under the use of real data
solving a system of fractional partial differential equations arising in the model of hiv infection of cd4+ cells and attractor one-dimensional keller-segel equations
application of stationary wavelet entropy in pathological brain detection
the role of power decay, exponential decay and mittag-leffler function's waiting time distribution: application of cancer spread
dynamics of ebola disease in the framework of different fractional derivatives
solving existence problems via f-contractions
a new approach to the solution of non-linear integral equations via various f_be-contractions
a new approach to the solution of the fredholm integral equation via a fixed point on extended b-metric spaces. symmetry 10
solutions of the nonlinear integral equation and fractional differential equation using the technique of a fixed point with a numerical experiment in extended b-metric space
unification of the fixed point in integral type metric spaces
some fixed-point theorems in b-dislocated metric space and applications
connecting various types of cyclic contractions and contractive self-mappings with hardy-rogers self-mappings. fixed point theory and applications
cyclic contractions and fixed point theorems on various generating spaces. fixed point theory and applications
d-neighborhood system and generalized f-contraction in dislocated metric space
high contagiousness and rapid spread of severe acute respiratory syndrome coronavirus 2
estimation of the reproductive number of novel coronavirus (covid-19) and the probable outbreak size on the diamond princess cruise ship: a data driven analysis
a new exploration on existence of fractional neutral integro-differential equations in the concept of atangana-baleanu derivative
new results on controllability in the framework of fractional integrodifferential equations with nondense domain
solutions of boundary value problems on extended-branciari b-distance
on new approach of fractional derivative by mittag-leffler kernel to neutral integro-differential systems with impulsive conditions
a numerical schemes and comparisons for fixed point results with applications to the solutions of volterra integral equations in dislocated extended b-metric space
novel fixed point approach to atangana-baleanu fractional and l_p-fredholm integral equations
new numerical scheme for solving integral equations via fixed point method using distinct (ω − f)-contractions
a complex valued approach to the solutions of riemann-liouville integral, atangana-baleanu integral operator and non-linear telegraph equation via fixed point method
a mathematical model of covid-19 using fractional derivative: outbreak in india with dynamics of transmission and control
mathematical analysis of novel coronavirus (2019-ncov) delay pandemic model
analysis of the covid-19 pandemic spreading in india by an epidemiological model and fractional differential operator
key: cord-275395-w2u7fq1g authors: romero-severson, ethan obie; hengartner, nick; meadors, grant; ke, ruian title: change in global transmission rates of covid-19 through may 6 2020 date: 2020-08-06 journal: plos one doi: 10.1371/journal.pone.0236776 sha: doc_id: 275395 cord_uid: w2u7fq1g we analyzed covid-19 data through may 6th, 2020 using a partially observed markov process. our method uses a hybrid deterministic and stochastic formalism that allows for time-variable transmission rates and detection probabilities. the model was fit using iterated particle filtering to case count and death count time series from 55 countries. we found evidence for a shrinking epidemic in 30 of the 55 examined countries. of those 30 countries, 27 have significant evidence for subcritical transmission rates, although the decline in new cases is relatively slow compared to the initial growth rates. generally, the transmission rates in europe were lower than in the americas and asia. this suggests that global-scale social distancing efforts to slow the spread of covid-19 are effective, although they need to be strengthened in many regions and maintained in others to avoid further resurgence of covid-19. the slow decline also suggests that alternative strategies to control the virus are needed before social distancing efforts are partially relaxed. since its initial outbreak in wuhan, china in late 2019 and early 2020 [1], covid-19 caused repeated rapid outbreaks across the globe from february through april 2020. the extremely rapid spread of covid-19 in china [2] does not appear to be an anomaly: the disease has shown a short doubling time (2.4-3.6 days) outside of china as well [3].
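the quoted doubling times translate directly into exponential growth rates via r = ln(2)/t_d; a small sketch (assuming pure exponential growth; function name is ours):

```python
import math

def growth_rate_from_doubling(td_days: float) -> float:
    """Per-day exponential growth rate r implied by a doubling time: r = ln(2)/td."""
    return math.log(2.0) / td_days
```

doubling times of 2.4-3.6 days thus correspond to growth rates of roughly 0.19-0.29 per day.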
as of may 21, 2020, the virus had caused 5,034,458 reported infections and 328,730 deaths globally [4]. in response, most affected countries/regions have implemented strong social distancing efforts, such as school closures, working from home, and shelter-in-place orders. as a result, the spread of covid-19 slowed down substantially in some countries [5], leading to a flattening of the epidemic curve. as social distancing induces high costs to both society and individuals, plans to relax social distancing are being discussed. however, changes in both the transmission rates and detection probabilities over time, coupled with stochasticity due to reporting delays, make it difficult to differentiate between truly subcritical dynamics and merely reduced transmission. in this report, we developed a deterministic-stochastic hybrid model and fitted the model to case incidence and death incidence time series data from 55 countries. following the approach suggested by king et al. [6], we use a (partially) stochastic model and base our estimates on incidence rather than cumulative incidence data. using both case count and death count data allowed us to disentangle changes in surveillance intensity from changes in transmission [3]. we found evidence for large decreases in the country-level transmission rates in several of the worst-affected countries. importantly, using data up to may 6, 2020, we computed 99% confidence intervals to test whether or not the data were consistent with subcritical dynamics (i.e. the reproductive number r was below 1 on may 6, 2020). most countries showed large decreases in transmission rates over time, and more than half of the studied countries have transmission rates below the epidemic threshold. on the other hand, many countries still appear to be showing rapid exponential growth.
given its highly contagious nature, covid-19 can spread rapidly when strong social distancing measures are lifted, even partially [3]. alternative strategies that can effectively control the virus are needed when social distancing measures are relaxed. case count and death data were downloaded from the johns hopkins github repository (https://github.com/cssegisanddata/covid-19) through may 6, 2020. data included aggregate counts of reported cases and deaths at the country level and contained no identifying information. any country in the data that had more than 2000 cumulative cases and 100 deaths by may 6, 2020 was included. to minimize the effect of repatriated cases, we started each time series on the first day when the cumulative number of cases exceeded 100. all data processing and model fitting were otherwise done on the incidence scale. to address obvious bulk-reporting issues in the data (e.g. sudden zeros in the data followed by very large numbers), we smoothed the data using tukey's 3-median method. because several countries had days with a single death surrounded by days with no deaths, which the smoothing method would set to zero, days with a single death were not replaced with smoothed values. the original data and the smoothed data used for estimation are shown in s1 fig in s1 file. we model the spread of covid-19 as a partially observed markov process with real-valued states s (susceptible), e (exposed), i (infected), and r (removed) to describe the latent population dynamics, and integer-valued states c_0 (to be counted), y_1 (counted cases), d_{0:3} (dying), and y_2 (counted deaths) to model sampling into the data. we use multiple states to model the counted deaths to produce an erlang distribution with mean 21 days and standard deviation 11 days in the time to death, based on previous estimates of the time to death [7]. the model and all of its parameters (table 1) have time units of days.
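the smoothing rule described above might look like the following (a single pass of a running median of 3 with the single-death exception; the authors' exact implementation may differ):

```python
def median3_smooth(xs, keep_single_deaths=False):
    """Running median of 3 (Tukey's '3' smoother); endpoints are left unchanged.

    If keep_single_deaths is True, days with exactly one count keep their
    original value, mirroring the exception described in the text.
    """
    out = list(xs)
    for i in range(1, len(xs) - 1):
        if keep_single_deaths and xs[i] == 1:
            continue
        out[i] = sorted(xs[i - 1:i + 2])[1]  # median of the 3-day window
    return out
```

a bulk-reported spike such as [0, 100, 0, 5, 6] is flattened to [0, 0, 5, 5, 6], while an isolated single death survives when the exception is enabled.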
the latent population model is governed by the following ordinary differential equations:
ds/dt = −χ(t) s i / n,
de/dt = χ(t) s i / n − λ_ei e,
di/dt = λ_ei e − λ_ir i,
dr/dt = λ_ir i,
where χ(t) is the time-variable transmission rate and n = s + e + i + r is assumed to be fixed over the run of the model. λ_ei is the rate at which exposed persons become infectious and λ_ir is the rate at which cases recover (i.e. are either no longer infectious or die). at every time interval, we sample persons moving from the e to i states into a stochastic arm of the model that is used to calculate the likelihood of the data. to relate the latent population model to data, we randomly sample individuals from the unobserved population into a stochastic process that models the random movement from infection to being either counted as an observed case or counted as an observed death. the number of persons sampled into the observation arm of the model over a time interval dt is given by a multinomial random variate, where ρ(t) and ω are the probabilities that an infected person will be counted or die respectively, ρ_c(t) and ω_c are the probabilities of not being sampled or not dying respectively, and g(u) is a stochastic function that maps from a real value u to an integer g(u); it takes value ⌈u⌉ with probability u mod 1 and value ⌊u⌋ with probability 1 − (u mod 1). in plain english, the model tracks the random fate of each newly infected person as they move from the exposed to infected state with respect to eventually being observed as a case and/or a death in the data. at each time step of length δt, the change in state space is given by the update equations, where f_t | c_0 is a random variate from bin(c_0, 1 − exp(−λ_{y1} dt)), and h_{it} is a random variate from bin(d_{i−1}, 1 − exp(−(4/21) dt)). x_i indicates the i-th element of the multinomial random variate defined above. the rate λ_{y1} determines the rate at which persons who will be counted are counted (i.e. lower values of λ_{y1} mean a longer delay in cases showing up in the data).
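the stochastic rounding function g(u) can be sketched as follows (we assume the unbiased form, so that e[g(u)] = u; the function name matches the text, everything else is illustrative):

```python
import random

def g(u: float) -> int:
    """Unbiased stochastic rounding for non-negative u: returns ceil(u) with
    probability (u mod 1) and floor(u) otherwise, so E[g(u)] = u."""
    frac = u % 1.0
    return int(u) + (1 if random.random() < frac else 0)
```

this lets the real-valued latent flow feed an integer-valued observation process without systematically inflating or deflating counts.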
the values of y_{1:2} are set to zero at the beginning of every day such that they accumulate the simulated number of cases and deaths that occur each day. both the transmission rate and the detection probability are allowed to vary with time, where t_f is the final time in the given dataset. in plain english, the transmission rate is constant up to some time, t_w^{(1)}, after which it linearly increases or decreases to the value χ_f by time t_w^{(2)}. the model is constrained such that t_w^{(2)} < t_f − 20, that is, the transmission rate must be constant for at least 20 days before the end of the data collection period. this constraint is in place to avoid overfitting the final transmission rate. the detection rate likewise has a linear increase or decrease beginning at time t_ρ; however, the increase or decrease continues to a fixed point that is 20 days before the final datum. variation in the detection rate (i.e. the probability that an infected case will be counted in a fixed interval) over the course of the epidemic can strongly bias estimates of the population growth rate (derivation for the exponential growth case in s1 file); not allowing the detection probability to change over time could lead to discordance between the case count and death count time series. we assume that the data are negative binomial-distributed, conditional on the simulated number of cases and deaths that occur in a given time interval, i.e. the number of cases in the i-th observation period has density negbin(y_1, ϵ_1) and the number of deaths in the i-th observation period has density negbin(y_2, ϵ_2). the negative binomial is parameterized such that the first argument is the expectation and the second is an inverse overdispersion parameter that controls the variance of the data about the expectation; as ϵ_i becomes large, the data model approaches a poisson with parameter y_i. both ϵ_1 and ϵ_2 were estimated from the data.
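the piecewise-linear transmission-rate schedule described above can be sketched as (parameter names are ours):

```python
def chi(t, chi0, chif, t1, t2):
    """Piecewise-linear transmission rate: constant chi0 up to t1, a linear
    ramp from chi0 to chif between t1 and t2, and constant chif afterwards."""
    if t <= t1:
        return chi0
    if t >= t2:
        return chif
    return chi0 + (chif - chi0) * (t - t1) / (t2 - t1)
```

the same shape, with its own change point, serves for the detection probability ρ(t).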
parameter estimation had two distinct steps: model selection and computation of confidence intervals. in the model selection phase, the model was fit to the data using an iterated particle-filtering method, implemented in the pomp r library [8]. to optimize the likelihood of the data, we used 1500 particles in 125 iterations for each country. to determine whether or not 1500 particles was sufficient to minimize the variance in the estimate of the mean likelihood at a given parameter value, we ran 20 independent particle filters on a single data set and found that the average deviation from the mean was less than 0.1 log units, suggesting that we would be able to detect differences in the log likelihood greater than 0.1 log units. the reported likelihoods were measured using 4500 particles at the optimized maximum likelihood estimates (mles). for all fits, the initial state at time zero is computed by assuming there were i(0) infected persons 21 days before the first reported death (by definition time one). the initial number of susceptible persons was assumed to be the predicted 2020 population size of the given country [9]. the model is then simulated forward for 21 days, assuming exponential growth with transmission rate χ_0, and the result is taken as the initial state of the model at time zero. because the data were highly variable in the complexity of the patterns they showed, we considered three nested models of increasing complexity for each country. the first model (model 1) assumed simple exponential growth with a constant sampling probability, i.e. ρ_0 = ρ_f and t_w^{(1)} = t_w^{(2)} = 0. this amounts to a period of exponential growth with transmission rate χ_0 in the pre-data period and then a constant transmission rate χ_f over the observation period. the second model (model 2) allowed the transmission rate to vary but kept the detection probability constant, i.e. ρ_0 = ρ_f.
the third model (model 3) allowed both the transmission rate and the detection rate to vary, i.e. all parameters were estimated. we determined the best model for each country by a sequence of likelihood ratio tests, first comparing model 1 to model 2 and then model 2 to model 3. because we are using an optimization method, we do not have access to samples from the likelihood surface directly. therefore, to obtain estimates of the parametric uncertainty in the final transmission rate, χ_f, we computed 99% confidence intervals using the profile likelihood method. for each country we computed the profile likelihood by optimizing the model along a fixed grid with points every 0.1 units centered on the mle of the χ_f value from the previous fit. points were added to the grid until the measured likelihood on either side of the mle was more than 9 log units lower than the measured maximum likelihood. we then fit a local polynomial regression (loess in r) to those points, found the predicted maximum likelihood parameter value [10], and obtained the 99% ci by locating the points on either side of the mle that were 3.84 log units below the maximum likelihood. we fixed several parameter values based on published work and our scientific judgement. the probability that a case would eventually die, ω = 1%, is based on estimates of the case fatality ratio for both asymptomatic and symptomatic patients (95% ci 0.5-4%) [11]. the latency rate, λ_ei = 1/3, was initially set based on our general sense of what was consistent with the pre-print literature available at the time; however, that value has proven to be consistent with later reports [12]. the recovery rate, λ_ir = 1/10, was similarly set to be consistent with the available literature when the model was being developed. likewise, longitudinal studies have shown that our assumption of an average infectious period of 10 days is reasonably consistent with the clinical data [13, 14].
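the profile-likelihood interval step can be sketched as follows (reading the interval off a raw profile grid; the paper additionally smooths the profile with a loess fit before locating the cutoff):

```python
def profile_ci(grid, loglik, cutoff=3.84):
    """Given a parameter grid and profile log-likelihoods, return the range of
    grid values whose log-likelihood is within `cutoff` of the maximum."""
    best = max(loglik)
    inside = [g for g, ll in zip(grid, loglik) if ll >= best - cutoff]
    return min(inside), max(inside)
```

for a toy profile peaked at 2 with the neighbours at 1 and 3 within 3.84 log units, the interval is (1, 3).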
although the clinical data paint a picture of the natural history of infection that is far more complex than our model can capture, the formulation of our model is consistent with the available data. our primary outcome of interest is the growth rate, r, at the end of the observation period, which is derived from the transmission rate estimated from the data. using the equation in [15] we can express the growth rate in terms of the model parameters. we found that for most countries the fraction of the population that was still susceptible at the end of the observation period was greater than 95%; therefore we omitted the term pr(s) = s/n when comparing growth rates between countries. from the same paper we also have the equation relating r to the basic reproduction number, showing that r_0 = 1 when r = 0. we also compare predicted deaths due to covid-19 in each country through july 5th 2020 to the average number of deaths in a period of the same length (fig 3). predicted deaths were computed by simulating the number of daily deaths from the first observation through july 5th 2020 for each country and taking the mean value. confidence intervals were computed as the relevant quantiles of the sum over 1000 simulations. pre-covid-19 deaths were based on all-cause death data downloaded from the who mortality database (https://www.who.int/healthinfo/mortality_data/en/) for each country for which they were available. using data from the most recent year, we computed the death rate and multiplied it by the length of the interval from the first observation to july 5th 2020 for each country. we fit our model to data collected from 55 countries. model fits are shown in fig 1, and parameter values are given in table 1. the model can capture the data well, with a few exceptions. the model was not able to find a robust fit to the data from bangladesh; in general, the upside-down 'v' shape in the deaths could not be captured well in such a short time series.
algeria had a similarly odd pattern in the deaths time series that the model could not capture in detail, although the overall trend in deaths was recovered. in previous versions of this paper, the model had a hard time fitting data from both italy and spain. however, given the longer time series and modifications to the model form, we now find that both italy and spain are well captured by the model. overall, the model slightly overestimates the number of deaths around the time when deaths begin to decrease. however, this was generally corrected if the time series was long enough. including temporal heterogeneity in the time from infection to detection of a covid-19 death would likely correct this; however, it is not clear that this is advisable, as death counts are likely under-reported. all countries except japan and saudi arabia were found to have lower transmission rates on may 6, 2020 than at the beginning of the observation period, suggesting a global decline in the transmission of covid-19 through may 6, 2020. however, the initial transmission rate should be interpreted cautiously, as we allowed a wide range of infected persons to exist 21 days before the first observation. that is, the initial transmission rate parameter is rather a convolution of the unknown number of infected persons and the initial growth rate consistent with the data. we found significant evidence for subcritical dynamics in 27 countries (3 countries had subcritical point estimates but their cis contained the epidemic threshold). fig 2 shows the point estimates and cis of the final transmission rate on may 6, 2020 for all countries, stratified by continent [16]. european countries had the highest probability of being subcritical (21 of 25), with asia (7 of 15) and the americas (2 of 11) having fewer subcritical countries. none of the countries in africa that met the inclusion criteria had subcritical dynamics.
generally, countries that were found to have both variable transmission rates and variable detection probabilities (model 3 in table 1) show a pattern of level or increasing deaths coupled with a level or slightly declining number of reported cases. this pattern illustrates how viewing the case count data alone can be misleading, as declines in reported cases can be confounded by variation in the probability of detection (e.g. comparing canada to denmark). some countries were found to have detection probabilities lower than 1% on may 6, 2020; however, these values should not be over-interpreted, as the simple linear model for changing detection probabilities imposes strong assumptions that are focused on capturing the general trend. fig 3 shows the predicted number of deaths projected out to july 5, 2020, assuming that all parameter values are constant over the period may 6 through july 5, 2020. the average duration of the country-level epidemic in european countries is longer than in asia, leading to a higher level of death, despite asian countries having on average higher growth rates. however, in the americas the predicted deaths are higher, with 8 of 11 countries having total predicted deaths greater than 0.1% of the total population by july 5, 2020. the model predicts 1,320,170 total covid-19 deaths in all 55 countries by july 5, 2020; of those deaths, 21% are predicted to occur in the us. the deaths due to covid-19 in europe are lower than the average number of reported deaths in a period of the same length for all countries in the data set that also had all-cause death counts from previous years. however, in the americas the covid-19 death counts are approaching the all-cause death levels in several countries, suggesting that covid-19 is approaching a doubling of all-cause deaths in those countries. our model found evidence for reductions in transmission rates of covid-19 in 53 of 55 examined countries.
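the baseline used in the fig 3 comparison (the most recent annual all-cause death rate multiplied by the interval length) reduces to a one-line computation. the numbers below are hypothetical, not values from the paper.

```python
def baseline_deaths(annual_deaths, population, interval_days):
    """Expected all-cause deaths over `interval_days`, assuming the most
    recent annual death rate (as in the WHO mortality database) holds
    uniformly through the year."""
    daily_rate = annual_deaths / population / 365.0
    return daily_rate * interval_days * population

# Hypothetical country: 600,000 deaths/year, 60 million people,
# 130 days from the first observation to July 5th.
expected = baseline_deaths(600_000, 60_000_000, 130)
```

comparing the simulated covid-19 death total against `expected` then reproduces the kind of contrast drawn above between europe and the americas.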
encouragingly, of those countries, we found statistical evidence that the size of the epidemic is decreasing in 27 countries, i.e. the effective reproductive number is less than 1, using data up to may 6, 2020. this suggests that, despite the highly heterogeneous populations represented by these countries, the growth of the covid-19 outbreak can be reversed. although our model cannot attribute exact causes to the global decline in transmission rates, most countries implemented sustained, population-level social distancing efforts over a period of weeks to months. these efforts are highly likely to play a major role in reducing the transmission of covid-19 [5]. we estimated that in countries with decreasing transmission, the rate of decrease is in general less than 0.1/day (average -0.04/day). based on data from 8 european countries, the us, and china, we previously estimated that in the absence of intervention efforts the epidemic can grow at rates between 0.19-0.29/day [2]. this means that the outbreak can grow rapidly and quickly wipe out gains made through public health efforts if social distancing measures are completely relaxed. for example, if the rate of decrease under strong public health interventions is 0.1/day and the growth rate in the absence of public health interventions is 0.2/day, then the number of cases averted in two weeks of intervention will be regained in only one week. social distancing measures have their own social costs. our results suggest that alternative strategies to control the virus need to be in place when social distancing efforts are relaxed. due to the uncertainties in the impact of each specific control measure, changes to policies should be made slowly, because the signal of changing transmission can take weeks to fully propagate into current data streams as a result of the long lag between infection and case confirmation (which we estimated to be on average approximately 2 weeks).
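the two-weeks-regained-in-one-week example follows from equating the exponential decline during the intervention with the exponential growth after relaxation; a quick check with the rates quoted above:

```python
import math

decline, growth = 0.1, 0.2          # per-day rates from the example above
days_of_intervention = 14

# Cases fall by a factor exp(-decline * 14) during the intervention;
# growth at rate `growth` undoes this after t_regain days, where
# growth * t_regain = decline * days_of_intervention.
t_regain = decline * days_of_intervention / growth
assert math.isclose(t_regain, 7.0)
assert math.isclose(math.exp(-decline * days_of_intervention)
                    * math.exp(growth * t_regain), 1.0)
```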
our goal in this paper was to develop a model that could be applied very broadly to multiple countries, and we have made assumptions that facilitate that goal. however, our model makes key assumptions that should be considered when placing these results in the vast collection of covid-19 modeling papers. for example, we slightly privilege death counts over case counts in linking the population model to the data, by assuming that the distribution of the time to death is known and that the probability of a death being detected is fixed over time. likewise, we assume that at the country level the change in transmission rates can be modeled by a simple linear function, which we believe is reasonable, as interventions implemented at the local level are likely to lead to a smooth change when aggregated up to the population level. our model produces reasonable fits to the global data, but our approach does not allow us to have unique models for each country, which could almost certainly capture country-level trends with greater accuracy. our model also makes strong simplifying assumptions about the natural history of infection, about which we are continuously learning. for example, our model assumes that contagiousness is constant over the course of infection, whereas it possibly varies over time [12]. we also assume that diagnosis does not affect transmission from infected persons. however, given that we are inferring broad, population-average parameters and allowing those parameters to change over time to reflect broad changes in the transmission dynamics, we believe that our results are reasonable portrayals of reality despite using a simple model of the natural history of infection. overall, our results suggest that covid-19 is controllable in diverse settings using a full range of strong and comprehensive non-pharmaceutical measures, and that future deaths from the disease are avoidable. supporting information s1 file.
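the linear transmission-rate assumption discussed above can be illustrated with a minimal seir integration in which the transmission rate declines linearly over the observation window. this is a hedged sketch: the parameter values, the function name and the euler scheme are ours, and the paper's fitted model additionally tracks deaths and a time-varying detection probability.

```python
def seir_linear_beta(beta0, beta1, T=120, dt=0.1, sigma=1/3, gamma=1/10,
                     N=1e6, I0=10):
    """SEIR trajectory with a transmission rate declining linearly from
    beta0 to beta1 over [0, T], integrated with forward Euler.
    Illustrative only; not the authors' fitted country-level model."""
    S, E, I, R = N - I0, 0.0, I0, 0.0
    for step in range(int(T / dt)):
        t = step * dt
        beta = beta0 + (beta1 - beta0) * min(t / T, 1.0)
        inf = beta * S * I / N * dt    # new infections S -> E
        prog = sigma * E * dt          # progression E -> I
        rec = gamma * I * dt           # recovery I -> R
        S -= inf
        E += inf - prog
        I += prog - rec
        R += rec
    return S, E, I, R
```

because every outflow matches an inflow, the four compartments sum to n (up to floating-point error) at every step.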
(pdf)

references:
• early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia
• early release - high contagiousness and rapid spread of severe acute respiratory syndrome coronavirus 2. emerging infectious diseases journal, cdc
• fast spread of covid-19 in europe and the us and its implications: even modest public health goals require comprehensive intervention
• an interactive web-based dashboard to track covid-19 in real time. the lancet infectious diseases
• 2020-03-30-covid19-report-13.pdf
• avoidable errors in the modelling of outbreaks of emerging pathogens, with special reference to ebola
• high contagiousness and rapid spread of severe acute respiratory syndrome coronavirus 2. emerging infectious diseases
• pomp: statistical inference for partially observed markov processes
• maximum smoothed likelihood estimation
• severity of 2019-novel coronavirus (ncov)
• temporal dynamics in viral shedding and transmissibility of covid-19
• virological assessment of hospitalized patients with covid-2019
• clinical and virologic characteristics of the first 12 patients with coronavirus disease 2019 (covid-19) in the united states
• estimating epidemic exponential growth rate and basic reproduction number
• countrycode: an r package to convert country names and country codes

conceptualization: ethan obie romero-severson, nick hengartner, ruian ke.

key: cord-196353-p05a8zjy authors: backhausz, ágnes; bognár, edit title: virus spread and voter model on random graphs with multiple type nodes date: 2020-02-17 journal: nan doi: nan sha: doc_id: 196353 cord_uid: p05a8zjy when modelling epidemics or the spread of information on online social networks, it is crucial to include not just the density of the connections through which infections can be transmitted, but also the variability of susceptibility.
different people have different chances of being infected by a disease (due to age or general health conditions), or, in the case of opinions, some are more easily convinced by others, while others are stronger at sharing their opinions. the goal of this work is to examine the effect of multiple types of nodes on various random graphs such as erdős–rényi random graphs, preferential attachment random graphs and geometric random graphs. we used two models for the dynamics: an seir model with vaccination and a version of the voter model for exchanging opinions. in the first case, among others, various vaccination strategies are compared to each other, while in the second case we studied several initial configurations to find the key positions where the most effective nodes should be placed to disseminate opinions. freedom in choosing the position of these vertices. one of our main interests is epidemic spread. the accurate modelling, regulation or prevention of a possible epidemic is still a difficult problem of the 21st century. (as of the time of writing, a novel strain of coronavirus has spread to at least 16 other countries from china, although authorities have been taking serious actions to prevent a worldwide outbreak.) as for mathematical modelling, there are several approaches to model these processes, for example using differential equations, the theory of random graphs or other probabilistic tools [10, 12, 15]. as it is widely studied, the structure of the underlying graph can have an important impact on the course of the epidemic. in particular, structural properties such as degree distribution and clustering are essential to understand the dynamics and to find the optimal vaccination strategies [7, 11].
from the point of view of random graphs, in the case of preferential attachment graphs it is also known that the initial set of infected vertices can have a huge impact on the outcome of the process [3]: a small proportion of infected vertices is enough for a large outbreak if the positions are chosen appropriately. on the other hand, varying susceptibility of the vertices also has an impact, for example on the minimal proportion of vaccinated people needed to prevent the outbreak [6, 4]. in the current work, by computer simulations, we study various cases when these effects are combined in an seir model with vaccination: we have a multitype random graph, and the vaccination strategies may depend on the structure of the graph and the types of the vertices as well. the other family of models which we studied is a variant of the voter model. the voter model is also a common model of interacting particle systems and population dynamics, see e.g. the book of liggett [16]. this model is related to epidemics as well: durrett and neuhauser [9] applied the voter model to study virus spread. the two processes can be connected by the following idea: we can see virus spread as a special case of the voter model with two different opinions (healthy and infected), where only one of the opinions (infected) can be transmitted, and any individual with the infected opinion switches to the healthy opinion after a period of time. also, the virus can spread only through direct contacts of individuals (edges of the graph), while in the voter model it is possible for the particles to influence one another without being neighbours in the graph. similarly to the case of epidemics, the structure of the underlying graph has an important impact on the dynamics of the process [2, 8]. here we study a version of this model with various underlying random graphs and multiple types of nodes.
we examined virus spread with vaccination and the voter model on random graphs of different structures, where in some cases the nodes of the graph, corresponding to the individuals of the network, are divided into groups representing significantly distinct properties for the process. we studied the possible differences between the processes on the different graphs, regarding the nature and magnitude of the distinct results, and tried to find the reasons for them, to understand how the structure of an underlying network can affect outcomes. the outline of the paper is as follows. in the second section we give a description of the virus spread in continuous time, and the discretized model. parameters are chosen such that they match the real-world data from [14]. we compare outcomes on different random graphs and the numerical solutions of the differential equations originating from the continuous-time counterpart of the process. we also study different possible choices of the reproduction number r_0, corresponding to the seriousness of the disease. we examine different vaccination strategies (beginning at the start of the disease or a few days before), and a model with weights on the edges is also mentioned. in the third section we study the discretized voter model on erdős–rényi and barabási–albert graphs, firstly without, then with multiple types of nodes. later we run the process on random graphs with a geometric structure on the plane. the dynamics of virus spread can be described by differential equations, therefore they are usually studied from this approach. however, differential equations use only transmission rates calculated from the number of contacts in the underlying network, while the structure of the whole graph and other properties are not taken into account. motivated by the paper "modelling the strategies for age specific vaccination scheduling during influenza pandemic outbreaks" of diána h.
knipl and gergely röst [14], we modelled the process on random graphs of different kinds. in this section we use the same notions and the same sets for most of the parameters. ideas for the vaccination strategies are also derived from there. we examined a model in which individuals experience an incubation period, delaying the process. the dynamics are also affected by a vaccination campaign started at the outbreak of the virus, or a vaccination campaign beginning a few days before the outbreak. in the classical seir model, each individual is in exactly one of the following compartments during the virus spread: • susceptible: individuals are healthy, but can be infected. • exposed: individuals are infected but not yet infectious. • infectious: individuals are infected and infectious. • recovered: individuals are not infectious anymore, and immune (cannot be infected again). individuals can move through the compartments only in the order defined above (it is not possible to skip one in the line). the rates at which individuals leave the compartments are described by probabilities (transmission rates) and the parameters of the model (incubation rate, recovery rate). individuals in r are immune, so r is a terminal point. seir with vaccination: we combine the model with a vaccination campaign. the campaign lasts for 90 days, and we vaccinate individuals according to some strategy (described later) so that at the end of the campaign 60% of the population is vaccinated (if it is possible). we vaccinate individuals only in s, but the vaccination ensures immunity only with probability q, and only after 14 days. we vaccinate individuals at most once, irrespective of the success of the vaccination. however, vaccinated individuals can be infected within the first 14 days; in this case, nothing differs from the process without vaccination. to describe the underlying network, we use real-life data. we distinguish individuals according to their age.
in particular, we consider 5 age groups, since they have different social contact profiles. to describe the social relationships of the different age groups, we used the contact matrix obtained in [1], whose elements c_{i,j} represent the average number of contacts an individual in age group i has with individuals in age group j. in the sequel, the number of individuals in a given group is denoted by the label of the group according to figure 1. the model is specified by the following family of parameters. • r_0 = 1.4: basic reproduction number. it characterizes the intensity of the epidemic. its value is the average number of infections an infectious individual causes during its infectious period in a population of only susceptible individuals (without vaccination). later we also study less severe cases with r_0 = 1.0-1.4. • β_{i,j}: transmission rates. they control the rate of the infection between a susceptible individual in age group i and an infectious individual in age group j. they can be derived from r_0 and the contact matrix c. according to [1] we used β_{i,j} = β · c_{i,j}/n_j, where β = 0.0334 for r_0 = 1.4. • 1/ν_e = 1.25: latent period. ν_e is the rate of exposed individuals becoming infectious. • 1/ν_i = 3: infectious period. each individual spends an average of 1/ν_i days in i. • 1/ν_w = 14: time to develop antibodies after vaccination. • q_i = 0.8 for i = 1, . . . , 4 and q_5 = 0.6: vaccine efficacy. the probability that a vaccinated individual develops antibodies and becomes immune. • δ = 0.75: reduction in infectiousness. the rate by which the infectiousness of unsuccessfully vaccinated individuals is reduced. • λ_i = Σ_{j=1}^{5} β_{j,i} · (i_j + δ · i_j^v) is the total rate at which individuals of group i get infected and become exposed. • v_i: vaccination rate functions determined by a strategy. this describes the rate of vaccination in group i.
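the relation β_{i,j} = β · c_{i,j}/n_j can be written down directly. the contact matrix and group sizes below are hypothetical 2-group stand-ins (the paper's real c is 5×5 and taken from [1]).

```python
import numpy as np

beta = 0.0334                      # value quoted for r_0 = 1.4
C = np.array([[8.0, 2.5],          # hypothetical 2-group contact matrix
              [2.5, 5.0]])
n = np.array([4000, 6000])         # hypothetical group sizes

# beta_{i,j} = beta * c_{i,j} / n_j: per-pair transmission rate between a
# susceptible in group i and an infectious in group j. Broadcasting over
# the trailing axis divides column j by n_j.
B = beta * C / n
```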
the dynamics of the virus spread and the vaccination campaign can be described by 50 differential equations (10 for each age group), according to [14]. we would like to create an underlying network and examine the outcome of the virus spread on this given graph. we generated random graphs of different structures with n = 10000 nodes, such that each node has a type corresponding to the age of the individual. the age distributions and the numbers of contacts in the graph between age groups comply with the statistical properties detailed above. since the contact matrix c describes only the average number of contacts, the variances can be different. • erdős–rényi graphs: we create 10000 nodes and their types are defined immediately, such that the numbers of types comply exactly with the age distribution numbers. the relationships within each age group and the connections between different age groups are both modelled with an erdős–rényi graph in the following sense: we create an edge between every node in age group i and every node in age group j independently with probability p_{i,j} = c_{i,j}/n_j, so that the expected number of group-j neighbours of a group-i node matches the contact matrix. • preferential attachment graphs: initially we start from an erdős–rényi graph of size 100, then we keep adding nodes to the graph sequentially. every new node chooses its type randomly, with probabilities given by the observed age distribution. after that we create edges between the new node and the old ones with preferential attachment: if the new node is of type i, then we connect it with an edge independently to an old node v of type j with a probability proportional to c_{i,j} · d(v)/d, where d(v) denotes the actual degree of v, and d is the sum of the degrees belonging to nodes with type j. thus the new node is more likely to attach to nodes with a high degree, resulting in a few enormous degrees in each age group. on the other hand, the connection matrix c is used to ensure that the density of edges between different age groups is different.
• preferential attachment mixed with erdős–rényi graphs: we create the 10000 nodes again with their types exactly according to the age distribution numbers. first we create five preferential attachment graphs, the ith of size n_i, so that every node has an average of c_{i,i} neighbours. in particular, the endpoints of the new edges are chosen independently, and the attachment probabilities are proportional to the degrees of the old vertices. then we attach nodes in different age groups independently with the corresponding p_{i,j} probabilities defined above. • random graphs of minimal degree variance with the configuration model: we prescribe not only a degree sequence for the nodes, but the degree of each node broken down into 5 parts according to the age groups, in such a way that the expectations comply with the contact matrix c, but the degrees also have a small variance. the distribution is chosen such that the variance is minimal among distributions supported on the integers, given the expectation. for example, in the case of c_{1,4} = 2.4847, every node in age group 1 has exactly 2 or 3 neighbours in age group 4, and the average number is 2.4847. our configuration model creates a random graph with the given degree sequence. according to [5], the expected number of loops and multiple edges divided by the number of nodes tends to zero, thus for n = 10000 it is suitable to neglect them and to represent a network of social contacts with this model. in this section we detail how we implemented the discretization of the process on the generated random graphs. most of the parameters remained the same as in the differential equations; however, to add more realism, we perturbed them with a little variance. as for the transmission rates, we needed to find different numbers to describe the probability of infection, since β was derived from the contact matrix c and the basic reproduction number r_0.
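the minimal-variance integer degree distribution described above is supported on the two integers surrounding the prescribed mean. a small sketch (the helper name is ours):

```python
import math
import random

def min_variance_degree(mu, rng=random):
    """Integer degree with mean mu and minimal variance: only floor(mu)
    and ceil(mu) occur, the latter with probability mu - floor(mu)."""
    lo = math.floor(mu)
    frac = mu - lo
    return lo + (1 if rng.random() < frac else 0)

# Exact-mean check for the example c_{1,4} = 2.4847: the degree is 2 or 3,
# and lo*(1 - f) + (lo + 1)*f recovers the prescribed expectation.
mu = 2.4847
lo, f = math.floor(mu), mu - math.floor(mu)
assert math.isclose(lo * (1 - f) + (lo + 1) * f, mu)
```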
since c is built into the structure of our graphs, using the different parameters β_{i,j} would add the same effect of contacts to the process twice. therefore, instead of β_{i,j}, we determined a universal β̂ according to the definition of r_0. we set the disease transmission probability to β̂ = r_0/(3 · 12.8113), under the assumption that the contact profiles of the age groups are totally implemented in the graph structure: only the average density of the graph (without age groups), the severity of the disease and the average time spent in the infectious period affect the parameter. the parameters ν_w, δ, q_i remained exactly the same, while 1/ν_e = 1.25 and 1/ν_i = 3 hold only in expected value, with the exact distributions chosen accordingly. we built the reduction in infectiousness δ into the process in such a way that an unsuccessfully vaccinated individual spends 3 · 0.75 = 2.25 days on average in i, instead of modifying β̂. in the discretized process, we start with 10 infectious nodes chosen randomly and independently from the age groups. we observe a 90-day period with vaccination plus 10 days without it. (in the basic scenarios we start vaccination at day 1; however, we later examine the process with vaccination starting a few days before the outbreak.) at a time step, firstly the infectious nodes can transmit the disease to their neighbours. only nodes in s can be infected, and they cannot be infected ever again. when a node becomes infected, its position is set immediately to e, and the number of days it spends in e is generated. secondly, we check whether a node has reached the end of its latent/infectious period, and we set its position to i or r. (as soon as a node becomes infectious, the number of days it spends in i is also calculated.) then, at the end of each iteration, we vaccinate 0.67% of the whole population according to some strategy (if it is possible).
only nodes in s get vaccinated (at most once); it is generated immediately whether the vaccination is successful (with probability q_i, according to the node's type). in case of success, the day on which the node could become immune without any infection is also noted; if that 14th day is reached and the node is still in s, its position is set to immune. the first question is whether the structure of the underlying graph can affect the process in the case when the edge densities are described by the same contact matrix c. we can ask how it affects the overall outcome and other properties, and how we can explain and interpret these differences in terms of the structure of the graph. we compare results on different graphs with each other, and also with the numerical solution of the differential equations describing the process. in this section we study the basic scenario: vaccination starts at day 1, and we vaccinate by the uniform strategy. this strategy does not distinguish age groups: every day we vaccinate 0.67% of each age group randomly (as long as it is still possible). we set r_0 = 1.4 and use the differential equation system from [14] as a baseline. giving structure to the underlying social network boosted these numbers in every case; however, the differences are still significant between the random graphs of different properties. to study the results in the discretized model, we generated 5 random graphs with n = 10000 nodes for each graph structure, and ran the process 20 times on each random graph with independent initial choices. (in the case of most of the structures we can get rather different outcomes on the same graph with different initial values concerning the peak of the virus; therefore reusing the same graphs is acceptable.) as we can see in figure 3 (compared to figure 2), random graphs from the configuration model were the closest to the numerical solution of the differential equations.
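one iteration of the discretized process described above can be sketched as follows. this is a hedged simplification: the function name and data layout are ours, vaccination and the 14-day immunity delay are omitted, and the rounded-exponential stage durations are our assumption (the exact distributions are not reproduced in the text).

```python
import random

def seir_step(states, timers, adj, beta_hat, rng,
              latent_mean=1.25, infectious_mean=3.0):
    """One iteration of a discretized SEIR process on a graph.

    states[u] in {'S', 'E', 'I', 'R'}; timers[u] is the number of days
    remaining in E or I; adj[u] lists u's neighbours.
    """
    # Phase 1: infectious nodes expose susceptible neighbours.
    newly_exposed = set()
    for u, s in enumerate(states):
        if s == 'I':
            for v in adj[u]:
                if states[v] == 'S' and rng.random() < beta_hat:
                    newly_exposed.add(v)
    for v in newly_exposed:
        states[v] = 'E'
        timers[v] = max(1, round(rng.expovariate(1 / latent_mean)))
    # Phase 2: advance stage timers; E -> I and I -> R on expiry.
    for u, s in enumerate(states):
        if u in newly_exposed or s not in ('E', 'I'):
            continue
        timers[u] -= 1
        if timers[u] <= 0:
            if s == 'E':
                states[u] = 'I'
                timers[u] = max(1, round(rng.expovariate(1 / infectious_mean)))
            else:
                states[u] = 'R'
    return states, timers
```

on a path graph with one infectious endpoint and β̂ = 1, the middle node is exposed in a single step while farther nodes stay susceptible.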
however, the difference in outcomes can be clearly seen from every perspective: almost 20% of the population (5.7% more) was infected by the virus at the end of the time period, the infection peaked almost 10 days sooner (at day 41), and the number of infectious cases at the peak is almost twice as large. we got similar, but more severe, results on erdős–rényi graphs; however, still only a maximum of 0.021% of the population was infected at the same time. the outcome in the case of graphs with a (partial) preferential attachment structure shows that the distribution of degrees does matter in this process. (this observation initially gave the idea to model a graph with minimal degree deviation with the help of the configuration model; we were curious whether we could get results closer to the differential equations on such a graph.) on preferential attachment graphs, 47.66% of the individuals came through the disease. what is more, 1% of the population was infected at the same time at the peak of the virus, as early as day 21. however, after day 40 the infection was substantially over. with a preferential attachment structure it is very likely that a node with a huge degree gets infected in the early days of the process, irrespective of the choice of the initially infectious individuals, quickly resulting in an epidemic. however, after the dense part of the graph has passed through the virus around day 40, even though 40% of the population is still in s, the magnitude of infectious cases is really low. the process on preferential attachment graphs mixed with erdős–rényi graphs reflects something in between, yet the preferential properties dominate. it was possible to reach the 60% vaccination rate during the process, except in the case of preferential attachment graphs. at the end of the 100th day, a proportion of 0.4-0.45 of the individuals had acquired immunity after vaccination. the basic reproduction number is a representative measure of the seriousness of a disease.
generally, diseases with a reproduction number greater than 1 should be taken seriously; however, the number is a measure of potential transmissibility, and it does not actually tell how fast a disease will spread. seasonal flu has an r_0 of about 1.3, while hiv and sars are around 2-5; for the novel coronavirus, see the estimates in [18]. in this section we investigate how different vaccination strategies can affect the attack rates. we study three very different strategies based on age groups or other properties of the graph. in each strategy, 0.67% of the population is vaccinated at each time step (sometimes exactly, sometimes only in expected value). after the 90-day vaccination campaign, 60% of the population should be vaccinated in each age group (if it is possible). we still start our vaccination campaign at day 1, and we vaccinate individuals at most once, irrespective of the success of the vaccination. • uniform strategy: this strategy does not distinguish age groups; every day we vaccinate 0.67% of each age group randomly. • contacts strategy: we prioritize age groups with bigger contact numbers, corresponding to denser parts of the graph (concerning the 5 groups). we vaccinate the second age group for 11 days, then the third age group for 26 days, the first age group for 10 days, the fourth group for 29 days, and at last age group 5, with the smallest number of contacts, for 15 days. this strategy turned out to be the best in the case without any graph structure [14]. however, in conventional vaccination strategies, in the first days of the campaign health care personnel, among others, are vaccinated, which certainly makes sense, but they can also be interpreted as nodes of the graph not only with high degree, but also with a high probability of getting infected. the effect of vaccination by degrees can also be noticed in the shape of the curves of infected individuals in the age groups developing in time (see figure 6).
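two bits of arithmetic behind the campaign described above: the daily 0.67% rate over 90 days gives roughly 60% coverage, and when groups are vaccinated one after another at that rate, the block of days devoted to each group is proportional to its share of the population. the group shares below are hypothetical, and `days_per_group` is our illustrative helper, not the authors' scheduling code.

```python
# Daily vaccination of 0.67% of the population over a 90-day campaign
# covers about 60% of it: 0.0067 * 90 = 0.603.
assert abs(0.0067 * 90 - 0.603) < 1e-12

def days_per_group(group_fractions, daily_rate=0.0067, target=0.6):
    """Vaccinating the whole population at `daily_rate` and handling the
    groups one after another, group i reaches `target` coverage after
    target * f_i / daily_rate days."""
    return [round(target * f / daily_rate) for f in group_fractions]

# Hypothetical age-group shares of the population.
blocks = days_per_group([0.11, 0.12, 0.29, 0.32, 0.16])
```

with these shares the five blocks sum to the 90-day campaign length.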
not only did the magnitude decrease, but the vaccination also increased the skewness, especially for age group 2. vaccination by contacts totally distorted the curve of age group 2, while the others did not change much. we examine whether vaccination before the outbreak of a virus (only a few, 5-10, days before) could influence the epidemic spread significantly. the delay in the development of immunity after vaccination is one of the key factors of the model, thus pre-vaccination could counterbalance this effect. the edges of the graph have so far represented only the existence of a social contact; however, relationships between individuals can be of different quality. it is also a natural idea to make a connection between the types of the nodes (age groups of the individuals) and the features of the edges between them. for example, we can generally assume that children of age 0-9 (age group one) are more likely to catch or transmit a disease to any other individual regardless of age, since the nature of contacts with children is usually more intimate. so on the one hand, the weights created on the edges of the graph can be strongly connected to the types of the given nodes. on the other hand, regardless of age groups, individuals tend to have a few relationships considered more significant from the point of view of virus spread (individuals sharing a household), while many social contacts are less relevant. for the reasons above, we upgrade our random graphs with a weighting on the edges, taking into account the age groups of the individuals. regardless of age, relationships are divided into two types: close and distant. only 20% of the contacts of an individual can be close; transmission rates on these edges are much higher, while on distant edges they are reduced. we examine a model in which age groups do not affect the weights of the edges. we double the probability of transmitting the disease on edges representing close contacts, and decrease the probability on other edges by a factor of 0.75.
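the claim that this weighting leaves the expected r_0 unchanged is a one-line check: 20% of the edges get factor 2 and 80% get factor 0.75.

```python
close_share, distant_share = 0.20, 0.80
close_factor, distant_factor = 2.0, 0.75

# Expected multiplier on the per-edge transmission probability.
expected_factor = close_share * close_factor + distant_share * distant_factor
assert abs(expected_factor - 1.0) < 1e-12   # overall r_0 unchanged in expectation
```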
in expected value the total r_0 of the disease has not changed; however, the results on graphs can differ from the unweighted cases. with the basic scenario we experience the biggest difference on erdős-rényi graphs, yet models with edge weights give attack rates that are bigger by only 0.01. only on the configuration model do we get a less severe virus spread with weighted edges.

in this section we study the discretized voter model, in which particles exchange opinions from time to time, depending on the relationships between them. we create a simplified process in order to be able to examine the outcome on larger graphs. firstly, we examine this simplified process on erdős-rényi and barabási-albert graphs, then multiple types of nodes are introduced. with a possible interpretation of the different types of nodes in the graphs, we generalize the voter model. later we examine the "influencer" model, in which our aim is, in opposition to the seir model, to spread one of the opinions. the process in continuous time can be modelled with a family of independent poisson processes: for each pair of vertices (x, y) we have a poisson process of rate q(x, y), which describes the moments of x convincing y. the rate q(x, y) increases as the distance d(x, y) decreases. in this case, every time a vertex is influenced by another one, it changes its opinion immediately. in our discretized voter process, there are two phases at each time step. first, nodes try to share their opinions and influence each other, which is successful with probabilities depending on the distance of the two vertices; more precisely, vertices that are closer to each other have a higher chance that their opinion "reaches" the other one. still, every vertex can "hear" different opinions from many other vertices. in the second phase, if a node v receives the message of m_0 nodes with opinion 0 and m_1 nodes with opinion 1, then v will represent opinion 0 with probability m_0/(m_0 + m_1) during the next step, and opinion 1 otherwise.
if a node v does not receive any opinion from others at a time step, then its opinion remains the same. this way, the order of the influencing messages in the first phase can be arbitrary, and it is also possible that two nodes exchange opinions. now we specify the probability that a vertex x manages to share its opinion with vertex y in the first phase. we transform graph distances d(x, y) into a matrix of transmission probabilities with the choice q(x, y) = e^(−c·d(x,y)), where c is a constant. this is not a direct analogue of the continuous case, but it is still a natural choice of a decreasing function of d. (usually we use c = 2, however later we also investigate the cases c ∈ {0.5, 1, 2, 3}; decreasing c escalates the process.) in the model above, on a graph on n nodes, at every time step our algorithm consists of o(n^2) steps, which can be problematic for bigger graphs if our aim is to collect samples with viter = 100 or 200 iterations of the voter model (in the sequel, viter denotes the number of steps of the voter model). however, with c = 2 a node x convinces vertices y with d(x, y) = 3 only with a probability of e^(−6) ≈ 0.0025. thus we used the following simplified model: when we create a graph, we store the list of edges and also calculate, for each node, the neighbours at distance 2. the simplified voter model spreads opinions only on this reduced set of edges, with the proper probabilities. we were able to run the original discretized model only on graphs with n = 100 nodes, while the simplified version can deal with n = 1000 nodes. we made the assumption that neglecting those tiny probabilities cannot significantly change the outcome of the process; from now on we only model the simplified version of the process. firstly we study the voter model on erdős-rényi(n, p) and barabási-albert(n, m) random graphs.
• er(n, p): we create n nodes, and connect every possible pair x, y ∈ v independently with probability p.
• ba(n, m): initially we start with a graph g_0.
at every time step we add a new node v to the graph and attach it to the old nodes with exactly m edges, with preferential attachment probabilities: let d denote the sum of degrees in the graph before adding the new node; then we attach an edge independently to u with probability d(u)/d. we generated graphs starting from a g_0 = er(50, m/(50−1)) graph of matching density. multiple edges can be created by the algorithm, however loops cannot occur. attachment probabilities are not updated during a time step. multiple edges do matter in the voter model, since they somehow represent a stronger relationship between individuals: an opinion on a k-multiple edge is transmitted with a k-times bigger probability. firstly, we examine the voter model on graphs without any nodes of multiple types, to understand the pure differences of the process resulting from the structure. we compare graphs with the same density, ba(1000, m) graphs with m ∈ {4, 5, ..., 10} and er(1000, p), where p ∈ [0.004, 0.01]. the initial probability of opinion 1 is set to 0.05 in both graphs. we compare the probability of the opinion disappearing within viter = 50 iterations of the voter model. we generated 10 different graphs from each structure and ran the voter model on each 20 times with independent initial opinions; altogether the results of 200 trials were averaged. figure 8 shows the results. before the phase transition of erdős-rényi graphs, that is, with p < ln(n)/n ≈ 0.007 for n = 1000 nodes (ba graphs of the same density are those with m ≤ 7), the graph consists of several components.

as mentioned before, in this sequel we investigate extreme outcomes of the process caused by one of the most important properties of barabási-albert graphs. since nodes do not play a symmetrical role in barabási-albert graphs, fixing the proportion of nodes representing opinion 1 (we usually use v = 0.05, so 50 nodes represent opinion 1 in expected value) but changing the position of these nodes in the graph can lead to different results.
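the two-phase update of the simplified discretized voter model described above can be sketched as follows (an illustration, not the authors' code; `neighbors` is assumed to hold, for each node, the vertices within graph distance 2, and `dist` the corresponding graph distances):

```python
import math
import random

def voter_step(neighbors, dist, opinions, c=2.0, rng=random):
    """One step of the discretized voter model (sketch).

    Phase 1: the opinion of x reaches y with probability exp(-c * d(x, y)).
    Phase 2: a node hearing m0 zeros and m1 ones adopts opinion 0 with
    probability m0 / (m0 + m1), opinion 1 otherwise; a node hearing nothing
    keeps its opinion.
    """
    heard = {v: [] for v in opinions}
    for x in opinions:
        for y in neighbors[x]:
            if rng.random() < math.exp(-c * dist[(x, y)]):
                heard[y].append(opinions[x])
    new = {}
    for v, msgs in heard.items():
        if not msgs:
            new[v] = opinions[v]                  # no message: unchanged
        else:
            m1 = sum(msgs)
            m0 = len(msgs) - m1
            new[v] = 0 if rng.random() < m0 / (m0 + m1) else 1
    return new
```

with c = 0 every message gets through, so two mutually connected nodes simply exchange opinions; with a very large c no message gets through and the opinions stay fixed.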
we examined the following three ways of setting the initial opinions:
• randomly: each individual chooses opinion 1 with probability v.
• "oldest nodes": we deterministically set the first 50 nodes of the graph to represent opinion 1. these nodes usually have the largest degrees, thus they play a crucial part in the process. not only do they have large degrees, but they are also very likely to be connected to each other (this is the densest part of the graph).
• "newest nodes": we deterministically set the last 50 nodes of the graph to represent opinion 1. these nodes usually have only m edges, and with high probability they are not connected to each other.
the histogram in figure 9 shows the distribution of nodes with opinion 1 for the three different choices of the l_0 vector after viter = 50 iterations of the voter model on ba(1000, 5) graphs. we experience differences in the probability of opinion 1 disappearing: with the random opinion distribution it is 11%, with l_new almost one third of the cases resulted in the extinction of opinion 1, while for l_old this probability was negligible (0.005%). actually, for l_old, after only one iteration of the voter model it is impossible to see any structure in the distribution of the individuals with opinion 1: the vector of opinions became totally random, but now with opinion 1 at a probability of 0.12. indeed, in only one step of the voter model the individuals with opinion 1 could double in number; however, opinion 1 cannot take advantage of any special position in the graph anymore. all in all, giving a certain opinion to individuals who are more likely to be connected in the graph reduces the probability of its disappearing, since they can keep their opinion with high probability; while with opinion 1 scattered across the graph (in the case of l_new as well as l_rand), with a dynamic parameter setting of c the number of individuals with opinion 1 can drop drastically even in a few time steps.
it is a natural idea to divide the nodes of a network into separate groups according to some aspect, where the properties of the different groups can affect processes on the graph. there are various ways to classify nodes into different types; we examined a simple one and another widely used method. in the following section we only have nodes of two types, however the definitions still hold for the multiple-type case. from now on, for purposes of discussion, we refer to the types as red and blue. we consider two different ways to assign types to the nodes:
• each node, independently of the others, chooses to be red with probability p_r, and blue with probability 1 − p_r. (here the index r corresponds to random.)
• since preferential attachment graphs are dynamic models, this enables another very natural and logical way of choosing types: after a new node has connected to the graph with some edges, informally, the node chooses its type with probabilities corresponding to the proportions among its neighbours' types (see also [1, 13, 17]). this way nodes of the same type tend to connect to each other with a higher probability, forming a "cluster" in the graph. we only examined linear models.
according to [2], a few properties of the initial graph g_0 and of the initial types of the nodes can determine the asymptotic behaviour of the proportion of types. let g_n denote the graph when n nodes have been added to the initial graph g_0, and let A_n and B_n denote the number of red and blue nodes in g_n. then the following theorem holds for the asymptotic proportions of red and blue nodes, a_n = A_n/(A_n + B_n) and b_n = B_n/(A_n + B_n). let x_n and y_n denote, respectively, the sum of the degrees of the red and of the blue nodes in g_n, and assume that x_0, y_0 ≥ 0. then a_n converges almost surely as n → ∞. furthermore, the limiting distribution of a := lim_{n→∞} a_n has full support on the interval [0, 1], has no atoms, and depends only on x_0, y_0 and m.
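the convergence of the red proportion a_n can be watched in a small simulation (a sketch under simplifying assumptions: the m neighbours of a new node are drawn preferentially by degree, and the new node copies a type drawn from its neighbours' type proportions, as in the linear model; `type_proportions` is a name introduced here):

```python
import random

def type_proportions(n_steps, m, init_types, rng=random):
    """Track the proportion of red nodes (True = red, False = blue) while
    each new node copies a type drawn according to the type proportions
    among its m preferentially chosen neighbours (sketch)."""
    degrees = [m] * len(init_types)
    types = list(init_types)
    history = []
    for _ in range(n_steps):
        nbrs = rng.choices(range(len(types)), weights=degrees, k=m)
        red_share = sum(types[u] for u in nbrs) / m
        types.append(rng.random() < red_share)    # copy a neighbour's type
        degrees.append(m)
        for u in nbrs:
            degrees[u] += 1
        history.append(sum(types) / len(types))
    return history
```

starting from an all-red initial graph the proportion stays at 1, consistently with the limit depending on the initial configuration.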
this property has great significance, since we would like to compare graphs with the same proportion of red and blue nodes; the theorem ensures the existence of such a limiting proportion. what is more, by generating barabási-albert graphs with multiple edges we can examine the speed of convergence. we set the types of the nodes in the initial graph g_0 in such a way that not necessarily half of the nodes are blue, but the sum of the degrees of the blue nodes is approximately half of the total sum of degrees. (of course, in the case of an initial erdős-rényi graph these coincide in expected value; however, we get a more stable proportion of types with the second method, where by stable we mean that the proportions can be closer to 1/2.) in the voter model we can use nodes with multiple types, defined in the last section, with the following interpretation: each node (individual) has two types according to two different aspects, so each node chooses a type for both of the aspects, and the choices according to the different aspects are independent. (since four combinations of these are possible, we could say that each node chooses one of the 4 possible pairs.) during the voter model, the interaction of nodes with different types influences the process in the following way: complying with the names of the types, we expect that good reasoner nodes can convince any node with a higher probability than bad reasoner nodes, and that any node can convince a node of unstable type with a higher probability than a node of stable type. in a step of the voter model, when node x influences a node y, the probability of success should depend only on node x's ability to convince (good/bad reasoner type) and on node y's stability (stable/unstable type). we investigated the model with a symmetric parameter set: the probability of a good reasoner node convincing a stable one is equal to the probability of a bad reasoner node convincing an unstable one.
we also made the assumption that a bad reasoner node can convince a stable node with probability 0. the voter model was examined with different parameter sets c(1) ≥ c(2) and different possible choices of the types in the graph. in this sequel we examine a special case of the voter model with multiple-type nodes, in which the aim is to spread an initially underrepresented opinion. this problem may be related to finding good marketing strategies on online social networks, where the "opinion" might concern a commercial product or a certain political conviction. we investigate the following "influencer" model: the types of a node according to the different aspects are not independent, nor is the l_0 vector of initial opinions. the nodes of the graph are divided into two groups, influencers and non-influencers. influencers usually form a smaller population; they represent opinion 1, which we want to spread across the graph. they are good reasoners and also stable, while non-influencers are bad reasoners according to the ability to convince, and can be stable as well as unstable. according to the definitions of the c values, it is impossible for a bad reasoner node to convince a stable one, so influencers represent opinion 1 for the whole process. firstly, we study a case in which the nodes of a ba graph get a type randomly or deterministically, not according to preferential attachment. we study the equivalent of the case in subsection 3.2.1 with multiple-type nodes. in each graph 10% of the individuals (100 nodes) are influencers. in ba graphs the influencers are situated randomly, on the "oldest nodes" or on the "newest nodes" of the graph. in er graphs the influencers are situated randomly (since the roles of the nodes are symmetric, they can be situated anywhere with no difference in the outcome). we would like to examine the differences in opinion spread.
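the type-dependent conviction probabilities can be tabulated as below; the numerical values c(1) = 0.9 and c(2) = 0.5 are hypothetical placeholders, and only the zero entry (bad reasoner versus stable node) and the symmetry c(good, stable) = c(bad, unstable) come from the text:

```python
C1, C2 = 0.9, 0.5   # hypothetical values satisfying c(1) >= c(2)

def success_probability(x_good_reasoner: bool, y_stable: bool) -> float:
    """Probability that x convinces y, depending only on x's reasoner
    type and y's stability (sketch of the symmetric parameter set)."""
    if x_good_reasoner and not y_stable:
        return C1          # easiest case: good reasoner vs unstable node
    if x_good_reasoner and y_stable:
        return C2
    if (not x_good_reasoner) and (not y_stable):
        return C2          # symmetric to the good-vs-stable case
    return 0.0             # a bad reasoner cannot convince a stable node
```

with this table, influencers (good reasoners, stable) can never be convinced by bad reasoners, so they hold opinion 1 for the whole process.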
we are also interested in whether it is possible to convince all the nodes of the graph of opinion 1, and in case it is, we calculate the average time needed to do so. we observed differences in the outcome over 100 runs (on 5 different random graphs) with the parameter set c = [2, 1] and m = 8 (see figure 10). we wanted to exclude cases in which the proportion of one of the types is negligibly small; for these reasons, we created er and ba graphs with multiple-type nodes, where the proportion of good reasoners is 1/2, and, according to preferential attachment (in ba graphs), we set these nodes to be the influencers (they are stable, while non-influencer individuals can be stable or unstable with probability 1/2). so in expected value half of the nodes are influencers, but in the case of ba graphs we can experience a greater deviation (in expected value half of the nodes have the good reasoner type according to the ability to convince, and 3/4 of the nodes have the stable type). for er graphs the only meaningful possibility to create the types is the random choice, but the same proportions hold (the proportions in ba graphs are still a bit greater). however, in terms of disappearing opinions the results are rather different. on er graphs opinion 0 could not disappear in any of the cases, while on ba graphs the outcome strongly depended on the exact initial proportion of influencers in the graph: on the same graph (and hence with the same proportion of influencers) opinion 0 either disappears within the first 50 iterations of the voter model, or opinion 1 holds a high proportion, yet is never able to reach the limit. this main difference results from the fact that we cannot exactly set the proportion of types in ba graphs, thus the co-existence of opinions is rather sensitive to changes in the number of influencers in ba graphs. (in er graphs only 20% of the examined runs resulted in the disappearance of opinion 0, even with 600 influencers.)
in this section we examine the voter model on a random graph which has a geometric structure on the plane. since this graph model is not dynamic, nodes can only choose their type randomly (or according to some deterministic strategy related to the positions of the nodes in the plane). however, firstly we study the model without multiple types, with constants c = 0.5, 0.1. the voter model is rather time-consuming, and even in the case of parameter c = 0.5 the probability of conviction q(x, y) = e^(−c·d(x,y)) for d(x, y) = 10 is ≈ 0.0067; thus we create a reduced graph from rp(n) by erasing the edges with d(x, y) > 10. we can assume that results on the reduced graph approximate the outcome on the original one, since the transmission of opinions on the erased edges is negligible. the average degree in the reduced graph is still 27.85. modifying the voter model to spread opinions only on these edges makes the algorithm less costly and manageable to run on graphs with many (n = 1000) nodes. firstly, we would like to understand the behaviour of the process without multiple types in the graph. in this section we take advantage of the geometric structure of the graph and examine different deterministic and random choices for the initial opinions l_0. we study how these alternative options can influence the outcome (the probability of the disappearance of an opinion, the expected time needed for extinction). another interesting question is whether after a given number of iterations t of the voter model we can still observe any nice shape in the positions of the opinions. in each of the following four choices for the initial opinions, in expected value 10% of the individuals are given opinion 1, and the rest of them represent opinion 0. the discretized voter model with the different initial opinion vectors l_0 was performed on 400 different graphs for viter = 100 steps.
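the pruning of negligible edges can be sketched as follows (`reduced_graph` is a name introduced here; the cutoff 10 and the value e^(−0.5·10) ≈ 0.0067 are the ones quoted above):

```python
import math

def reduced_graph(points, cutoff=10.0):
    """Keep only pairs within Euclidean distance `cutoff`; beyond it the
    conviction probability exp(-c * d) is negligible for the c values used."""
    edges = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            (x1, y1), (x2, y2) = points[i], points[j]
            if math.hypot(x1 - x2, y1 - y2) <= cutoff:
                edges.append((i, j))
    return edges

print(round(math.exp(-0.5 * 10.0), 4))  # 0.0067: size of the dropped probabilities
```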
with n = 1000 nodes, c = 0.5, and only 10% of the population representing opinion 1, opinion 1 disappeared only in a few (negligible) cases for any examined l_0. according to figure 11 we can say, without any doubt, that after 100 time steps the deterministic position of the initial opinions is still recognizable (even after viter = 200 steps). we can generally state that clustering individuals with the same opinion in a group makes the proportion of opinions more stable during the process: from the different runs we observed that the proportion of opinion 1 (from the initial 0.1) stayed within [0.08, 0.12] with probability more than 0.4 when opinion 1 was situated in a corner of the graph, while in all other cases this probability was significantly lower (less than 0.3). with this placement of opinion 1, the average distance within the individuals with opinion 1 was the smallest, while the average distance between the different opinion groups was the largest among the examined cases, resulting in a moderate change of opinions. the number of individuals representing opinion 1 decreased below 50 only with probability 0.08, while when placing opinion 1 in the center this probability is 0.1325. opinion 1 is the most likely to disappear (with probability 0.195), or to be reduced to an insignificant amount, with the random placement of opinion 1. however, the inverse extreme cases are also more likely to occur, since the proportion of opinion 1 exceeding 0.2 is outstandingly frequent in this scenario. moreover, despite the high probability of extinction, in expected value we get the highest proportion of opinion 1 after viter = 100 iterations of the voter model with the random initial configuration. we also examined random graphs on the plane with a random or deterministic type choice of the nodes, corresponding to the two different aspects as before. we set the type pairs to form the influencer model defined before.
due to the fact that average distances in this random graph model are significantly larger than in er and ba graphs, which have the small-world property, in most of the cases a small proportion of influencers cannot spread opinion 1 over the whole graph, not even when all non-influencer individuals are set to the unstable type. even with random influencer positions, calculating the average time needed to convince all nodes of the graph is challenging due to its time cost. with the number of influencers increased to 300, in half of the runs opinion 1 was able to reach all nodes of the graph within 400 time steps, sometimes in a relatively small number of iterations, suggesting that the exact position on the plane of the randomly chosen individuals does affect the process significantly.

references:
• coexistence in preferential attachment networks
• evolving voter model on dense random graphs
• on the spread of viruses on the internet
• a trust model for spreading gossip in social networks
• random graphs, second edition
• on critical vaccination coverage in multitype epidemics
• graphs with specified degree distributions, simple epidemics, and local vaccination strategies
• the noisy voter model on complex networks
• coexistence results for some competition models
• random graph dynamics
• sir epidemics and vaccination on random graphs with clustering
• random graphs and complex networks
• preferential attachment graphs with co-existing types of different fitnesses
• gergely röst, modelling the strategies for age specific vaccination scheduling during influenza pandemic outbreaks
• interacting particle systems
• daihai he, preliminary estimation of the basic reproduction number of novel coronavirus (2019-ncov) in china, from 2019 to 2020: a data-driven analysis in the early phase of the outbreak

key: cord-260966-9n23fjnz title: inversion of a sir-based model: a critical analysis about the application to covid-19 epidemic date: 2020-08-12 journal: physica d doi:
10.1016/j.physd.2020.132674 sha: doc_id: 260966 cord_uid: 9n23fjnz

calibration of a sir (susceptible-infected-recovered) model with official international data for the covid-19 pandemic provides a good example of the difficulties inherent in the solution of inverse problems. inverse modeling is set up in a framework of discrete inverse problems, which explicitly considers the role and the relevance of the data. together with a physical vision of the model, the present work addresses numerically the issue of parameter calibration in sir models, discusses the uncertainties in the data provided by international authorities, and shows how they influence the reliability of the calibrated model parameters and, ultimately, of the model predictions.

epidemic modeling is usually performed with compartmental models, often called sir (susceptible-infected-recovered) models, which are claimed to go back to the work by ronald ross and hilda p. hudson more than one century ago [1, 2] and, ten years later, to the work of anderson gray mckendrick and william ogilvy kermack [3, 4]. this class of models shares several characteristics with models of population dynamics and with conceptual lumped models. this work does not aim to provide forecasts of the pandemic evolution at this stage: it is the authors' opinion that the quality of the currently available data does not allow reliable forecasts to be performed, and model outcomes should be used with great prudence. it will be the subject of future work to further develop and refine the sir model presented here and to address the issue of providing forecasts of the epidemic, once the data are better understood. for instance, the correct number of infected people "remains unknown because asymptomatic cases or patients with very mild symptoms might not be tested and will not be identified", as recognized, e.g., by [28].
in an interview published on march 23rd, 2020, by the italian newspaper "la repubblica", angelo borrelli, head of the dipartimento della protezione civile (national civil protection department), stated that a ratio of one certified case out of every 10 total cases is credible. furthermore, different criteria have been adopted by different countries and institutions to define the various categories of infected, recovered and deceased people by or with covid-19. this fact has been widely recognized as a cause of uncertainty in the collected data. finally, censorship on the covid-19 pandemic is reported by journalists and organizations in some of the countries affected by the pandemic. the paper is organized as follows. section 2 contains the description of the sir model in both the continuous and the discrete case (subsection 2.1), together with a precise formulation of the inverse problem addressed in this paper in the discrete setting (subsection 2.2). in particular, inverse modeling, i.e., model calibration, is set up and discussed computationally within the framework proposed by [22]. the results obtained by applying our sir model to the pandemic are shown in section 3. section 4 is devoted to a discussion.

2. methods and materials

2.1. the continuous and the discrete models

we start by defining the objects involved in the continuous sir model considered in this paper.

definition 2.1. we denote by s(t), i(t), r(t) and d(t) the number of susceptible, infected, recovered and deceased individuals of the population under study at time t, respectively, for t varying in some interval i ⊂ r. here d includes only those individuals who died while being infected, whereas the total population at time t is given by p(t) = s(t) + i(t) + r(t).

definition 2.2. we denote by β and δ the birth and death rates, respectively, under normal conditions, i.e., without considering deaths caused by the epidemic.
we also denote by γ, ρ and φ the infection, recovery and fatality rates, respectively. the dimension of these coefficients is [time⁻¹]. notice that φ accounts for the deaths related to the pandemic, i.e., it represents the increase in the death rate due to the pandemic; the normal death rate is accounted for through δ. note that β and δ in definition 2.2 are rarely considered in epidemic modeling, as the time variation of p due to the normal evolution of the population is either negligible or smoother than its variation due to the presence of an epidemic. this is due to the fact that typical values of β and δ are smaller than those of γ, ρ and φ by one or more orders of magnitude, as shown in subsection 3.2. we keep the birth and death rates in the model in order to facilitate a thorough discussion of the assumptions behind this model, which is given in section 4. we make the following assumptions.

assumption 2.1. the coefficients β, δ, γ, ρ and φ are assumed to be constant.

the fraction of individuals who can be infected is given by s/p, whereas (i + r)/p is the fraction of those persons who cannot be infected, as it is also assumed that recovered people are immunized. the following equations, based on the seminal papers [1, 2, 3, 4], are used to describe the time evolution of s, i, r and d:

ds/dt = β p − γ (s/p) i − δ s,   (1)
di/dt = γ (s/p) i − (ρ + φ + δ) i,   (2)
dr/dt = ρ i − δ r,   (3)
dd/dt = φ i,   (4)

with initial conditions s(t_ini) = p_ini − 1, i(t_ini) = 1, r(t_ini) = 0 and d(t_ini) = 0, where t_ini ∈ i ⊂ r is the time at which the first individual is infected and p_ini is the population at t_ini. notice that from equations (1) to (4) one can easily deduce

dp/dt = (β − δ) p − φ i,   (5)

and if we couple (5) with (2) we obtain a system (6) which, for some h with 0 < h ≪ 1, can be approximated by a simple system of autonomous linear ordinary differential equations. this rough approximation is justified by thinking that, for h small enough, i(t) ≪ p_ini ≈ s(t) and therefore (s/p) i ≈ i in (6).
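the early-phase behaviour implied by this approximation can be sketched numerically (a sketch; here α denotes the total removal rate, assumed to be ρ + φ + δ, and the initial value i(t_ini) = 1 is also an assumption of the sketch):

```python
import math

def infected_early(t, gamma, alpha, t_ini=0.0):
    """Early-phase infected count from di/dt = (gamma - alpha) * i with
    i(t_ini) = 1 (sketch; alpha = rho + phi + delta is assumed)."""
    return math.exp((gamma - alpha) * (t - t_ini))

def infected_early_linear(t, gamma, alpha, t_ini=0.0):
    """Linearisation of the exponential around t_ini."""
    return 1.0 + (gamma - alpha) * (t - t_ini)
```

for t close to t_ini the two curves are almost indistinguishable, which is why the exponential growth is hard to detect early.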
the system

dp/dt = (β − δ) p,

describes the population evolution taking into account demographic aspects only, i.e., in the absence of the perturbation caused by the epidemic and under the assumption that the birth and death rates are constant, whereas

di/dt = (γ − α) i, in (t_ini, t_ini + h),   (10)

with α = ρ + φ + δ, describes the time evolution of the number of infected cases during a short time after the beginning of the infection at time t = t_ini. the solution to (10), i(t) = e^((γ−α)(t−t_ini)), and, for h small enough, its linear approximation 1 + (γ − α)(t − t_ini), give a first rough explanation of why, during the first phases of the epidemic, i.e., for t close to t_ini, the number of infected individuals i(t) seems to grow linearly. this fact motivates the difficulties in the design of an efficient early warning system: in fact, once i(t) increases to a level significant enough to be detected, the exponential growth has already kicked in and the containment measures can be effective only if quite drastic.

the discrete model is a simple forward-time finite-difference discretization of equations (1) to (4). for n ∈ z, we denote the discrete time steps by t_n = n ∆t, at a constant time spacing ∆t.

definition 2.3. we denote by s_n, i_n, r_n and d_n the number of susceptible, infected, recovered and deceased individuals of the population under study at time t_n, respectively, for n = n_ini, ..., n_ini + n^(mod) − 1, where n_ini is such that t_ini = n_ini ∆t and n^(mod) is the number of modeled time steps. the total population at time t_n is given by p_n = s_n + i_n + r_n.

the resulting algebraic iterative equations are of the form

s_{n+1} = s_n + ∆t (β p_n − γ (s_n/p_n) i_n − δ s_n),
i_{n+1} = i_n + ∆t (γ (s_n/p_n) i_n − (ρ + φ + δ) i_n),
r_{n+1} = r_n + ∆t (ρ i_n − δ r_n),   (11)
d_{n+1} = d_n + ∆t φ i_n,

for n = n_ini, ..., n_ini + n^(mod) − 1, with initial conditions

s_{n_ini} = p_ini − 1, i_{n_ini} = 1, r_{n_ini} = 0, d_{n_ini} = 0,   (12)

and the discrete counterpart of (5) is

p_{n+1} = p_n + ∆t ((β − δ) p_n − φ i_n).

here the time spacing is ∆t = 1 day, in agreement with the sampling of the available data set on the covid-19 pandemic (see section 2.3). equations (11) are implemented in a specifically designed code, developed using the python programming language.
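one step of such a forward-time discretization can be sketched in python (an illustration, not the authors' code; the right-hand sides are the standard sird balances, chosen to be consistent with the population balance dp/dt = (β − δ)p − φi of (5)):

```python
def sird_step(s, i, r, d, p, beta, delta, gamma, rho, phi, dt=1.0):
    """One explicit (forward Euler) step of the SIRD model, dt in days."""
    new_infections = gamma * s * i / p            # gamma * (s/p) * i
    s_next = s + dt * (beta * p - new_infections - delta * s)
    i_next = i + dt * (new_infections - (rho + phi + delta) * i)
    r_next = r + dt * (rho * i - delta * r)
    d_next = d + dt * phi * i
    return s_next, i_next, r_next, d_next
```

summing the first three updates with β = δ = 0 reproduces the balance p_{n+1} = p_n − ∆t φ i_n, i.e. the population only shrinks by the epidemic deaths.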
the choice n ∈ z allows us to simplify the notation adopted in the formulation of the inverse problem in section 2.2. it is important to notice that n = 0, i.e., t_0 = 0, represents the first day for which epidemic data are available and in general it does not coincide with n = n_ini, which corresponds to t_ini, the day when the first person was infected in a given nation, according to our model. we will call t_0 = 0 (n = 0) the monitoring initial time and t_ini (n = n_ini) the model initial time.

2.2. the inverse problem: model calibration

as stated in the introduction, the inverse problem addressed here is defined in the discrete setting by making use of the conceptual framework and the notation of [22]. the numerical task in treating the inverse problem consists in solving (11) iteratively and matching the solutions with the data collected within a certain time frame [t_min, t_max). the (discrete) time-varying solutions are collected in an array s = (s_n, i_n, r_n, d_n), n = n_ini, ..., n_ini + n^(mod) − 1, called the state of the system, where n^(mod) and n_ini have been introduced in definition 2.3. s is the model outcome used to forecast the number of infected, recovered and dead individuals. to this end, we also introduce the model forecast, an array y defined for some n_min, n_max with n_ini ≤ n_min < n_max ≤ n_ini + n^(mod), collecting the model counterparts of the observed quantities. the available data are collected in an array d. in the specific case considered here, a subset of d includes the cumulative number of confirmed infected cases, together with the numbers of recovered and dead persons, released by official health organizations; d can also include other data, e.g., demographic data used to infer the values of some model parameters (β and δ).
n = 0 represents the so-called monitoring initial time introduced in section 2.1, which corresponds to the first day for which the epidemic data d are available; recall that, in general, it does not coincide with the day n = n_ini when the first person was infected in a given country. model calibration requires that the model forecast be close to a calibration target, an array t that collects the values which should be attained by the model forecast if the model were physically "correct" and the model parameters were "optimal". in this specific case t is defined in terms of i^(ref)_n, given by (17), for n_min ≤ n < n_max. the model parameters are placed in an array p = (β, δ, ∆t, ρ, φ, γ, n_ini, p_ini), with the rate coefficients taking values in r_+ = (0, +∞); we recall that ∆t = 1 day and that p_ini is the model initial population introduced in section 2.1. if we summarize the algebraic equations of the discrete model (11), together with the initial conditions (12), as s = g(p), the forward problem can be stated as: given p, find the unique state s = g(p) that solves (20). in other words, given the parameters p, the solution of the forward problem gives the state of the system s. in order to introduce the corresponding inverse problem, it is convenient to write p as p = (p^(fix), p^(cal)), where p^(fix) = (β, δ, ∆t) and p^(cal) = (ρ, φ, γ, n_ini, p_ini). p^(fix) and p^(cal) include, respectively, the model parameters whose values are fixed before the simulation and the model parameters whose values are obtained from the solution of the underlying inverse problem, which is yet to be stated.

remark 2.1. some remarks on p, y and t are in order.
1. the array of fixed parameters is a function of d: p^(fix) = p^(fix)(d).
2. the model forecast y is a function of s, p and d: y = y(d, s, p).
3. t may depend on d and p^(fix), but must be independent of p^(cal): t = t(d, p^(fix)).
The misfit between model predictions and the target values is computed by means of an objective function O(y, t), which is the sum of three terms O^(i), i = 1, 2, 3, each of which considers one of the three reference quantities separately; ξ ≥ 1 is a threshold-and-weight parameter and n_min, n_max are such that n_ini ≤ n_min < n_max ≤ n_ini + n^(mod). The model calibration is then performed by solving the following inverse problem: given p^(fix) and d, given the solution s = g(p) to (20), determine y(d, g(p), p) and t, and find p^(cal) such that p^(cal) = arg min O(y, t). In other words, the objective of model calibration is to find the parameter values which best fit the reference data in a given time interval, n_min ≤ n < n_max. The parameter ξ plays a double role. First of all, it is a threshold which keeps the denominator of the quantity appearing in (24) positive, even in the particular case when the target values vanish; for ξ = 1, O(y, t) is nothing but the root-mean-squared relative difference between target and forecast, whereas for large values of ξ, O(y, t) reduces to the standard root-mean-squared error. Notice that the latter condition implies that the value of ξ could be very great. A sensible upper limit for ξ, ξ_max, could for instance be given by the upper bound of p_ini, which is identified with the total population of the country for which the simulation is performed, as shown in Section 3.2. It is worth stressing that the explicit use of an interval n_min ≤ n < n_max for the definition of t, y and the objective function, although somewhat cumbersome, is useful to assess changes in the physical parameters with time. Some examples will be shown in Section 3.2.

The application of the model introduced in Section 2.1 and of the model calibration introduced in Section 2.2 can be attempted thanks to publicly available data on the COVID-19 pandemic. The application will be performed at national level, i.e., the considered population will be the whole population of some countries.
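A minimal Python sketch can make the structure of the three-term misfit concrete. The exact denominator of (24) is not reproduced in the text above; max(t_n, ξ) is an assumption chosen to reproduce the two limiting behaviours of ξ just described (relative differences when ξ = 1, plain differences scaled by ξ when ξ is large).

```python
import numpy as np

def objective(y, t, xi=1.0):
    """Sum of three RMS misfit terms, one per reference series (I, R, D).

    y, t : arrays of shape (3, n): model forecast and calibration target.
    xi   : threshold-and-weight parameter (>= 1); it keeps the denominator
           positive even where the target vanishes.  The denominator
           max(t_n, xi) is an assumption consistent with the description,
           not the paper's exact formula (24).
    """
    y, t = np.asarray(y, float), np.asarray(t, float)
    total = 0.0
    for i in range(3):
        rel = (y[i] - t[i]) / np.maximum(t[i], xi)
        total += np.sqrt(np.mean(rel ** 2))
    return total
```

With ξ = 1 and unit targets, a doubled forecast in one series gives a misfit of exactly 1; with a very large ξ the same discrepancy is scaled down by ξ, i.e., the term becomes proportional to the absolute RMS error.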
For each country, the array d is populated with data coming from two basic sources. Data on the COVID-19 pandemic are available from the GitHub repository managed by the Johns Hopkins University [29]. This is a collection of publicly available data from multiple sources, which are processed and delivered by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE). Notice that the data are provided to the public strictly for educational and academic research purposes. The data are updated daily and the files used in this work are taken from that repository. The pseudo-code of the inversion procedure is sketched in Figure 1. The optimization algorithms that have been tested are based on constrained minimization, so that some bounds on p^(cal) should be prescribed. Best results have been obtained by global optimization with the function differential_evolution [30]. Since this function implements a stochastic algorithm, the pseudo-code of Figure 1 shows that several runs of the algorithm are executed in an easily parallelized loop.

Aside from China, for which the starting phase is not reported, since the virus diffusion started earlier than the first date for which data are available in the data set, the number of confirmed cases (plots (a) in Figure 2) shows a first slow increase, followed by an exponential increase and possibly a slowdown after a few weeks. It is highly questionable what this behavior should be attributed to. First of all, the behavior of the model is shown with test case 1, which includes four model runs for which all the model parameters, but ρ, are kept fixed. Exploratory tests were run with diverse parameter sets. However, the selection of the parameter sets used to obtain the results presented here was inspired by a preliminary calibration test performed on the data available for Italy.
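The parallelized multi-run loop of Figure 1 can be mimicked with SciPy's `differential_evolution`. The objective below is a stand-in (the Rosenbrock test function shipped with SciPy), not the paper's misfit; the per-run seeding and the aggregation over runs follow the description of several independent stochastic runs.

```python
import numpy as np
from scipy.optimize import differential_evolution, rosen

# Stand-in objective and bounds; in the application these would be the
# misfit O(y, t) and the bounds on p(cal) of Table 3.
bounds = [(-2.0, 2.0), (-2.0, 2.0)]

def one_run(seed):
    # The seed makes each stochastic search reproducible; different runs
    # may end in different (possibly local) minima.
    res = differential_evolution(rosen, bounds, seed=seed, tol=1e-8)
    return res.x, res.fun

runs = [one_run(s) for s in range(5)]           # easily parallelized loop
best_x, best_f = min(runs, key=lambda r: r[1])  # keep the overall best run
p_mean = np.mean([x for x, _ in runs], axis=0)  # mean parameter values
p_std = np.std([x for x, _ in runs], axis=0)    # spread among the runs
```

The mean and standard deviation over runs correspond to the per-parameter statistics reported later for the 10-run experiments.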
In this sense, these parameters, listed in Table 1, could be considered realistic. Since the four model runs differ only for the value of ρ, the total population has only a limited variation, so that approximating p to a constant value could appear reasonable. Nevertheless, the term used to compute the infection rate is directly proportional to both i and s and inversely proportional to p, so that it introduces a nonlinearity. In particular, this paper is focused on the results obtained with data from Italy, but the same qualitative remarks hold also for the application to data from other countries.

The basic properties of the performed tests are listed in Table 2. The comparison between reference and fitted time series for test A, which is to be considered as the ideal one, because all the data are used and the standard settings are applied, is shown in Figure 5. The discrepancy between reference and modelled values in log scale is greater for the initial phase of the epidemic. Notice that for tests B, C and D three subsets of data are used, corresponding to three non-overlapping time intervals, each of which is 33 days long. In particular, the first day for which data are available is January 22, 2020 and the data series used in this paper ends on May 2, 2020. Therefore, the data set for test B ends on February 24, 2020, and the data set for test C covers the subsequent 33-day interval.

Figure 5. Reference and fitted time series for Italy with the parameters obtained by solution of the inverse problem for test A (see Table 2). The vertical dotted black lines delimit the time-frame of the data set used for model calibration, i.e., they correspond to t_min and t_max.

Recall that (24) is nothing but the root-mean-squared relative difference between reference and modeled values of i, r and d, for i = 1, 2, 3, respectively.
Therefore, test E has been designed in order to assess the dissimilarities in the inversion results due to the application of different objective functions, by comparing tests A and E, which in essence are founded on absolute versus relative errors between model predictions and calibration targets, respectively. Test F is based on a subset of the data; in particular, for this test the number of dead patients only is fitted. The rationale behind this test is that d^(ref) should be less uncertain than the other data of d. Finally, test G is an attempt to consider the hints raised by several authorities and researchers, suggesting that official numbers could be heavily underestimated. In this test, it is assumed that the numbers of infected and recovered persons are 10 times greater than those reported in official documents; analogously, the number of deaths is assumed to be twice the official value. Notice that this does not mean that these estimates are more accurate than the official ones; test G is designed as a first attempt at sensitivity analysis, by considering a data set very different from the reference one, which is used for test A.

Several routines from the SciPy library can be applied to find a minimum, also taking into account possible bounds on p^(cal). The reader is referred to the on-line SciPy documentation for details; here, it suffices to recall that the tested methods include the Nelder-Mead simplex algorithm [32] and conjugate-gradient based methods [33]. The bounds have been assigned on the basis of preliminary gross estimates from available data and they are listed in Table 3. In particular, notice that the upper bound for p_ini is 10^8, hence here ξ_max = 10^8, i.e., ξ ∈ [1, 10^8].
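The local searches mentioned above can be sketched through SciPy's single `minimize` interface; the actual misfit is again replaced by the Rosenbrock test function, and the box bounds stand in for those of Table 3.

```python
import numpy as np
from scipy.optimize import minimize, rosen

x0 = np.array([0.0, 0.0])
bounds = [(-2.0, 2.0), (-2.0, 2.0)]

# Derivative-free Nelder-Mead simplex search [32]:
res_nm = minimize(rosen, x0, method="Nelder-Mead",
                  options={"xatol": 1e-10, "fatol": 1e-10, "maxiter": 5000})

# Bound-constrained quasi-Newton search (L-BFGS-B), using the kind of box
# bounds prescribed for p(cal):
res_lb = minimize(rosen, x0, method="L-BFGS-B", bounds=bounds)
```

Both methods converge on this smooth two-dimensional test; as discussed next in the text, the picture is different on the actual calibration problem, which motivates global optimization.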
Several runs have been conducted with a routine for local minimization and the best results were obtained with the L-BFGS-B method, which is a variation of the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm [33] with reduced memory requirements and support for bound constraints. Nevertheless, the outcomes were not fully satisfactory; as a consequence, these preliminary tests called for the application of a global minimization algorithm. Global minimization by application of differential evolution [30], even with the default settings, yielded good results, which are listed in Tables 4 and 5. The mean value of each parameter and the corresponding standard deviation have been estimated after 10 runs of this stochastic algorithm, for which the random initializing seed introduces variations among the returned results. When looking at Table 5, it is important to recall again that t_ini and p_ini are integer numbers, but in the table the averages and the relative standard errors are computed over the 10 runs, and this explains the floating-point notation. Besides the optimal values of p^(cal) listed in Tables 4 and 5, it is important and useful to consider also some properties of the inversion procedure for each test; they are listed in Table 6.

Table 4 shows that, apart from a few tests, the optimal values of γ, ρ and φ are relatively similar, sharing the same order of magnitude and similar mutual relationships. Table 4 would also suggest that at the pandemic peak a large fraction of the population would have already been infected, and possibly recovered. In fact, in the continuous model, when i reaches its maximum value we have γ s/p = α and, after simple algebraic manipulations, (p − s)/p = (γ − α)/γ, where α is defined in (7). The calibration results listed in Table 4 show that α is about one order of magnitude smaller than γ. In particular, for the calibration tests performed in this study (Table 4), (γ − α) · γ^(−1) assumes a relatively high value, close to 0.8.
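The quoted peak quantity (γ − α) · γ^(−1) is a two-line computation; the values of γ and α below are illustrative (not taken from Table 4), chosen so that the ratio reproduces the stated value of about 0.8.

```python
gamma = 0.25   # illustrative infection-rate coefficient (not from Table 4)
alpha = 0.05   # illustrative removal-rate coefficient, alpha < gamma

# At the peak of i, the condition gamma*s/p = alpha gives s/p = alpha/gamma,
# so the fraction of the population no longer susceptible is:
frac_affected = (gamma - alpha) / gamma   # (gamma - alpha) * gamma**-1
```

With these values frac_affected equals 0.8, i.e., four fifths of the involved population would already have been infected at the peak.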
Two facts should be mentioned about the results of test A shown in Table 5: first, t_ini < 0, i.e., it seems that the infection started before the official appearance of the first confirmed case; second, p_ini is close to the lower bound chosen in Table 3, so that the model predicts that the population which has been involved in the infection could be relatively small. These qualitative remarks hold for the other tests as well. Table 6 shows that tests A, D and G are those for which the results of different runs are more consistent with each other. This is important, because it shows that the identification of p^(cal) with the proposed approach appears to be robust for these tests. On the other hand, for the remaining tests, it is important to carefully check the outcomes of each single run. In fact, the initial seed could introduce some bias which cannot be overcome by the differential evolution routine with its default settings, and the final result could correspond to a local minimum instead of the global one. This is illustrated by the comparison in Figure 6, which shows the results of test F for the optimal parameters and for those averaged among the 10 runs and listed in Tables 4 and 5. This test was designed to fit the data on the deceased people, as shown in Figure 6(a); the fit seems extremely good, and in fact the two green curves overlap almost perfectly over a large time interval. On the other hand, from Figure 6(b) it is evident that some of the inversion runs yielded parameters which do not permit to properly and satisfactorily reproduce the data. From Table 6, it is also apparent that the number of objective function evaluations varies among the tests.

Figure 6. Reference and fitted time series for Italy with the parameters obtained by solution of the inverse problem for test F (see Table 2): (a) optimal parameters corresponding to the global minimum; (b) parameters averaged among the 10 inversion runs (Tables 4 and 5).
The vertical dotted black lines delimit the time-frame of the data set used for model calibration, i.e., they correspond to t_min and t_max.

Here δ is the death rate introduced in Definition 2.2. An attempt to account for time-varying parameters, depending on varying conditions, can be found in [36]. The assumption of homogeneity could be relaxed by considering distributed models, similar to those applied to the study of transport phenomena, e.g., for the diffusion of contaminants in the environment. Those models can account for "diffusive" spread and for "advective" transport. However, the required parametrization is often much finer than the one for lumped models, so that the number of parameters to be calibrated strongly increases; therefore, in the absence of good quality data, it could be difficult to perform a reliable calibration and validation of the model for a practical application. It is also assumed that the population under study is a closed system. Epidemic models rarely consider birth and death rates, because the corresponding terms in the underlying equations are usually negligible. In this work, however, these terms have been kept, as they play a significant role in our discussion. In particular, following the assumption of population homogeneity mentioned above, it is assumed that infected pregnant women give birth to infected babies and that this occurs at the same rate as for susceptible women. With regard to the infection rate, which is described by the term γis/p in (1) and (2), some remarks are in order. This term is computed by assuming that each infected individual has a given, constant number of contacts with other persons per unit time. Our model assumes that the number of persons who cannot be infected is i + r, so that the fraction of contacted persons who cannot be infected is given by (i + r)/p; on the other hand, the fraction of contacted individuals who can be infected is given by s/p.
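A minimal explicit update built around the γis/p term can illustrate how the fractions s/p and (i + r)/p enter the bookkeeping. This is a sketch, not the paper's full discrete model (11): the exposed compartment and the birth/death demographic terms are omitted, and alpha and delta stand for assumed recovery and death rates.

```python
def infection_step(S, I, R, D, gamma, alpha, delta, dt=1.0):
    """One explicit step of a minimal SIRD-type update.

    The infection term gamma*I*S/P assumes each infected individual has a
    constant number of contacts per unit time, of which a fraction S/P can
    actually be infected (the remaining fraction (I+R)/P cannot).
    """
    P = S + I + R + D                       # total (closed) population
    new_infections = gamma * I * S / P * dt
    recoveries = alpha * I * dt
    deaths = delta * I * dt
    return (S - new_infections,
            I + new_infections - recoveries - deaths,
            R + recoveries,
            D + deaths)
```

Because the system is closed and demographic terms are omitted here, each step conserves the total population exactly.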
This is equivalent to assuming that recovered people become immune to the virus. Notice that the timing, magnitude and longevity of immunity against SARS-CoV-2 are still an open question for the scientific community (see, e.g., [37, 38, 39]). Moreover, recovered people are assumed to be non-infectious, which is the case if the response of their immune system is fast enough so that, once they come in contact with the virus again, the virus is destroyed by the immune system before it can be spread to susceptible persons. Other promising classes of models are stochastic models [40], which can be adapted in a relatively easy way to account for several phenomena and also to consider the role of aspects like gender, age, health and wellness on the probability of infection, recovery and decease. On the other hand, the EnKF could provide a firm theoretical framework to improve model predictions by means of uncertain data. Other models in the Bayesian framework [42] could be very helpful to handle discrepancies between model predictions and reference values. Unfortunately, in this case the systematic and random errors could be so high as to make it very difficult to handle them even in a stochastic framework.

Several tests were performed in order to study the impact of the data, of the time frames over which the data were collected, and of the performance of different objective functions, depending on the choice of ξ, on the calibration of the parameters (see Table 2). Test A can be considered as the reference one, as it uses all the available data with the standard settings. To cope with the difficulties of local minimization, we applied the "differential evolution" algorithm [30] and the results obtained were very good. Other relevant algorithms for global optimization that could be tested as part of future work are genetic algorithms [43, 35], particle swarm optimization [44] and simulated annealing [45]. Another limitation is given by the uncertainty in the available data.
Test G was performed with data ten times greater than the official ones with the intent of starting a sensitivity analysis. The results obtained in test G show that the calibration of the parameters via these data, in particular the values of γ and ρ, is not very consistent with tests A, D and F mentioned above. These results emphasize the ill-posed nature of the underlying inverse problem, by providing evidence about the great care that has to be given to the quality of the available data.

References

[1] An application of the theory of probabilities to the study of a priori pathometry. Part I.
[2] An application of the theory of probabilities to the study of a priori pathometry. Part II.
[3] A contribution to the mathematical theory of epidemics.
[4] Contributions to the mathematical theory of epidemics. II. The problem of endemicity.
[5] Game theory of social distancing in response to an epidemic.
[6] Economic considerations for social distancing and behavioral based policies during an epidemic.
[7] The macroeconomics of epidemics. Working Paper 26882.
[8] A generalization of the Kermack-McKendrick deterministic epidemic model.
[9] Mathematical structures of epidemic systems.
[10] Global stability of an SIR epidemic model with time delays.
[11] The mathematics of infectious diseases.
[12] The effect of cross-immunity and seasonal forcing in a multi-strain epidemic model.
[13] External forcing of ecological and epidemiological systems: a resonance approach.
[14] Lyapunov functions and global stability for SIR and SIRS epidemiological models with non-linear transmission.
[15] Complex dynamic behavior in a viral model with delayed immune response.
[16] Asymptotic profile of the positive steady state for an SIS epidemic reaction-diffusion model: effects of epidemic risk and population movement.
[17] Complete global stability for an SIR epidemic model with delay: distributed or discrete.
[18] Comparing vector-host and SIR models for dengue transmission.
[19] Real time Bayesian estimation of the epidemic potential of emerging infectious diseases.
[20] Modeling the effect of information campaigns on the HIV epidemic in Uganda.
[21] A double epidemic model for the SARS propagation.
[22] A conceptual framework for discrete inverse problems in geophysics.
[23] Development, calibration and validation of physical models.
[24] Solutions of Ill-Posed Problems. V.H. Winston & Sons.
[25] Stable determination of conductivity by boundary measurements.
[26] Open issues of stability for the inverse conductivity problem.
[27] Lipschitz stability at the boundary for time-harmonic diffuse optical tomography.
[28] Real estimates of mortality following COVID-19 infection.
[29] An interactive web-based dashboard to track COVID-19 in real time.
[30] Differential evolution: a simple and efficient heuristic for global optimization over continuous spaces.
[31] Demographic Yearbook / Annuaire démographique, 69th edition, United Nations.
[32] Implementing the Nelder-Mead simplex algorithm with adaptive parameters.
[33] Practical Methods of Optimization.
[34] Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy.
[35] Early dynamics of COVID-19 in Algeria: a model-based study.
[36] Modelling provincial COVID-19 epidemic data in Italy using an adjusted time-dependent SIRD model.
[37] COVID-19 infection: the perspectives on immune responses.
[38] The dynamics of humoral immune responses following SARS-CoV-2 infection and the potential for reinfection.
[39] SARS-CoV-2: virology, epidemiology, immunology and vaccine development.
[40] Stochastic Models for Epidemics with Special Reference to AIDS.
[41] The ensemble Kalman filter: theoretical formulation and practical implementation.
[42] Bayesian dynamical estimation of the parameters of an SE(A)IR COVID-19 spread model.
[43] Handbook of Genetic Algorithms.
[44] Particle swarm optimization.
[45] Optimization by simulated annealing.

The data on the COVID-19 epidemic have been downloaded from https://github.com/CSSEGISandData/COVID-19. Finally, the authors wish to thank the anonymous referees and the editor.

key: cord-254729-hoa39sx2 title: bayesian analysis of robust poisson geometric process model using heavy-tailed distributions date: 2011-01-01 journal: comput stat data anal doi: 10.1016/j.csda.2010.06.011 sha: doc_id: 254729 cord_uid: hoa39sx2

We propose a robust Poisson geometric process model with heavy-tailed distributions to cope with the problem of outliers, as outliers may lead to an overestimation of mean and variance, resulting in inaccurate interpretations of the situation.
Two heavy-tailed distributions, namely the Student's t and exponential power distributions, with different tailednesses and kurtoses, are used, and they are represented in scale mixture of normal and scale mixture of uniform forms, respectively. The proposed model is capable of describing the trend, and meanwhile the mixing parameters in the scale mixture representations can detect the outlying observations. Simulations and real data analysis are performed to investigate the properties of the models.

Most time series measured over a continuous range assume a normal error distribution. These traditional time series models are, however, vulnerable to outliers. Outliers in time series have been recognized as an influential factor in model fitting and forecasting. Failure to downweigh the outlying effects may lead to a poor model fit, an overestimation of variance, an inappropriate interpretation and an inaccurate prediction. This issue has received a great deal of attention, and therefore several approaches have been developed to reduce the influence of outliers and of the distributional deviation in the data analysis. Over the past decades, two main approaches have been considered to cope with the overdispersion caused by outliers. The first approach simply incorporates mixture effects to account for the heterogeneity in the distribution of the data. This can be viewed as a missing data problem, assuming that the membership of the data in one of the distributions is unknown and has to be estimated. The mixture model is usually implemented using the expectation-maximization (EM) algorithm or Markov chain Monte Carlo (MCMC) sampling algorithms. The second approach is to adopt a heavy-tailed distribution, instead of the commonly used Gaussian distribution, as the error distribution of the data. Popular choices of heavy-tailed distributions include the Student's t-distribution; a more general class of distributions is the Pearson type IV distribution (Johnson et al., 1995).
Alternatively, the exponential power (EP) distribution, which can describe a leptokurtic (positive excess kurtosis) or platykurtic (negative excess kurtosis) shape, is another good choice. However, the implementation of these distributions is difficult, because the derivation of the marginal posterior distributions of the parameters is intractable using conventional numerical and analytic approximations (Choy and Smith, 1997). To overcome this problem, Box and Tiao (1973) proposed the exponential power family of normal scale mixtures (SMN), and later Qin et al. (1998) pioneered the scale mixtures of uniform (SMU), which replaced the normal distribution in the SMN form by the uniform distribution. Theoretically, any distribution that can be expressed in SMN form also has an SMU representation. The hierarchical structure of the SMN or SMU representation possesses two prominent advantages: (1) the resulting density contains a mixing parameter, which can accommodate the extra-Poisson variation and help to identify the extreme values in outlier diagnosis, and (2) the parameter estimation can be simplified by sampling from the normal or uniform distribution using Markov chain Monte Carlo (MCMC) algorithms such as Gibbs sampling. The recent emergence of the software WinBUGS, which performs Bayesian statistical inference using MCMC algorithms, also facilitates the implementation of these representations, thus enhancing their popularity in the context of Bayesian modelling. This hierarchical structure is very practical in insurance applications, because it is well known that the normal error distribution falls short of allowing for irregular and extreme claims, and hence contaminates the estimation procedure and leads to poor estimation. For instance, Choy and Chan (2003) applied the Student's t- and EP distributions in SM representations to predict the insurance premiums to be charged to policyholders in credibility analysis, and Chan et al.
(2008) predicted and projected the loss reserves data with various heavy-tailed distributions from the generalized-t distribution family expressed in SMU representation. So far, the techniques using scale mixture representations apply solely to continuous time series. Yet, discrete count time series are observed on many occasions, especially in medical contexts. In clinical trials, patients usually have longitudinal measurements, and sometimes the appearance of outlying observations in the data set may inflate the mean and variance of the data distribution and have an adverse effect on both parameter estimation and prediction. Despite overdispersion, a trend is often observed in time series, and examining the trend patterns provides useful information on the movement of outcomes over time. To cope with this problem, Thall and Vail (1990) proposed adding cluster-specific and time-specific random effects to the mean link function to give extra-Poisson variation to the data. Thereafter, Jowaheer and Sutradhar (2002) incorporated gamma random effects in the Poisson mixed model, and later Jowaheer et al. (2009) adopted a Poisson mixed model with two independent sources of normal random effects to deal with the problem. Nevertheless, the mixed effects approach, which assumes that the mean of the Poisson distribution follows a gamma or log-normal distribution, may be inadequate to cast light on the outlying effect caused by extreme observations. To tackle this pitfall, many studies have tried various mixing distributions; some useful ones include the generalized inverse Gaussian, generalized gamma, generalized exponential and inverse gamma distributions. Refer to Gupta and Ong (2005) for more details. In this paper, we seek a new direction to model overdispersed longitudinal time series of counts in the presence of outliers.
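The SMN idea reviewed above can be verified numerically: a Student's t density is recovered exactly by mixing normal densities over an inverse-gamma distributed scale. The representation used below (y | λ ~ N(μ, λσ²) with λ ~ Inverse-Gamma(ν/2, ν/2)) is the standard one, shown here as a self-contained sketch rather than code from the paper.

```python
import math
from scipy.integrate import quad

def t_pdf(y, nu, mu=0.0, sigma=1.0):
    """Student's t density with location mu, scale sigma, df nu."""
    z = (y - mu) / sigma
    c = math.gamma((nu + 1) / 2) / (
        math.gamma(nu / 2) * math.sqrt(nu * math.pi) * sigma)
    return c * (1 + z * z / nu) ** (-(nu + 1) / 2)

def smn_pdf(y, nu, mu=0.0, sigma=1.0):
    """Same density via the scale mixture of normals:
    y | lam ~ N(mu, lam*sigma^2), lam ~ Inverse-Gamma(nu/2, nu/2)."""
    a = b = nu / 2.0
    def integrand(lam):
        normal = math.exp(-(y - mu) ** 2 / (2 * lam * sigma ** 2)) \
            / math.sqrt(2 * math.pi * lam * sigma ** 2)
        invgamma = b ** a / math.gamma(a) * lam ** (-a - 1) * math.exp(-b / lam)
        return normal * invgamma
    val, _ = quad(integrand, 0, math.inf)
    return val
```

The mixing variable λ is exactly the kind of mixing parameter used later for outlier diagnosis: a large posterior λ flags an observation far out in the tail.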
Our proposed model is an extension of the generalized mixture Poisson geometric process (GMPGP) model proposed by Wan and Chan (2009), which is developed from the geometric process (GP) model originated by Lam (1988a,b). The GMPGP model is a two-component mixture model with a mean function to analyze the covariate effect and a ratio function to describe the time effect. Meanwhile, the mixture effects in both mean and ratio functions can capture the population heterogeneity and overdispersion in the data. See Wan (2006) and Wan and Chan (2009) for more details. In this paper, we introduce a robust mixture Poisson geometric process (RMPGP) model using heavy-tailed distributions in scale mixture representation. We assume that the outcome variable has a Poisson distribution with a stochastic mean which forms a latent GP, and that the mean of the GP, after being geometrically discounted by the ratio, follows a heavy-tailed distribution such as the log-t distribution or log-exponential power (log-EP) distribution, represented in SMN and SMU forms respectively. Under the scale mixture representation, the model parameters can be simulated using MCMC algorithms, and the mixing parameters help to identify the extreme values in outlier diagnosis. To our knowledge, this is pioneering work in adopting the Student's t- or EP distributions for Poisson time series. Besides, our proposed GP models have a trend component (the ratio function) which enables the study of trends of the outcomes over time, and they can accommodate the clustering effect using a mixture of homogeneous distributions. These features make our model more advantageous than many existing Poisson time series models. To demonstrate the characteristics and application of our models, the paper is organized as follows. First, the development of the robust mixture Poisson geometric process (RMPGP) model using the Student's t- and EP distributions from the GP model is described in Section 2.
Next, Section 3 introduces the scale mixture representations of the two heavy-tailed distributions and their implementation in the RMPGP model. Besides, the hierarchical structure and MCMC algorithms of the models are given, followed by the introduction of the model assessment criterion. Furthermore, Section 4 consists of a simulation study of the robust Poisson geometric process (RPGP) models, and Section 5 demonstrates an application of the proposed models using the epilepsy data studied by Leppik et al. (1985), with discussion. Lastly, a brief summary is given in Section 6.

Lam (1988a,b) first proposed to model positive continuous data with monotone trend, such as inter-arrival times, by a monotone process called the geometric process (GP), defined as follows.

Definition. Let X_1, X_2, ... be a sequence of non-negative random variables. If there exists a positive real number a such that {Y_t = a^(t-1) X_t, t = 1, 2, ...} forms a renewal process (RP), then the stochastic process {X_t, t = 1, 2, ...} is called a geometric process (GP) and the real number a is called the ratio of the GP.

The GP model asserts that if the ratio a discounts the t-th outcome X_t geometrically by t − 1 times, the resulting process {Y_t} becomes stationary and forms an RP, which may follow some parametric distribution f(y_t) with E(Y_t) = µ and Var(Y_t) = σ². Hence, the mean and variance of the GP model are E(X_t) = µ / a^(t−1) and Var(X_t) = σ² / a^(2(t−1)), respectively. With the ratio parameter, the GP model allows the mean and variance to change over time. In fact, a GP is a monotonic increasing sequence of non-negative random variables if a < 1, and monotone decreasing if a > 1. When a = 1, it becomes a stationary renewal process (RP) which is independently and identically distributed with the same distribution f(y_t).
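A GP trajectory is straightforward to simulate from the definition: draw an i.i.d. renewal sequence Y_t and undo the geometric discount. The exponential choice for f(y_t) below is one of the lifetime distributions mentioned later in the text (Wan, 2006); the parameter values are illustrative.

```python
import random

def simulate_gp(a, mu, n, seed=0):
    """Simulate X_1,...,X_n from a geometric process with ratio a:
    Y_t = a**(t-1) * X_t is i.i.d. exponential with mean mu, hence
    E[X_t] = mu / a**(t-1) and Var[X_t] = mu**2 / a**(2*(t-1))."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / mu) / a ** (t - 1)
            for t in range(1, n + 1)]
```

With a > 1 the sequence is stochastically decreasing, with a < 1 increasing, and with a = 1 it is an ordinary renewal process, mirroring the three cases above.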
Over the past decade, GP models have been applied to various fields of research (Chan et al., 2006; Lam and Chan, 1998; Lam and Zhang, 2003; Lam et al., 2004) and have been extended to fit different types of data. For example, Chan et al. (2006) introduced the threshold GP model to study the trends of the daily number of infected cases for the severe acute respiratory syndrome (SARS) epidemic outbreak in 2003. Chan and Leung (2010) initiated the binary GP model to study the trends of methadone treatment outcomes. And Wan and Chan (2009) introduced the generalized mixture Poisson GP (GMPGP) model to analyze the new tumour counts of bladder cancer patients, which is extended from the PGP model initiated by Wan (2006).

Let W_it denote the count for subject i at time t, i = 1, ..., m, t = 1, ..., n_i, and N = Σ_i n_i. Following the framework of the GP model, W_it is assumed to follow a Poisson distribution f_P(w_it | x_it) with mean X_it, which forms a latent GP. Then, we further assume that the stochastic process {Y_it = a_it^(t−1) X_it} follows some lifetime distribution f(y_it), such as the exponential distribution in Wan (2006) or the gamma distribution in Wan and Chan (2009), with mean µ_it; the resultant model is called the Poisson geometric process (PGP) model. Without loss of generality, we assume that the logarithm of Y_it, denoted by Y*_it, follows a heavy-tailed distribution as in (1); the resultant model is named the robust Poisson GP (RPGP) model. It is essentially a state space model with state variables X_it and has time-evolving mean and ratio functions to accommodate the exogenous effects and non-monotone trends. The mean µ*_it and ratio a_it are identity-linked and log-linked, respectively, to linear functions of covariates defined below:

µ*_it = β_µ0 + β_µ1 z_µ1it + ... + β_µq_µ z_µq_µ,it,
ln a_it = β_a0 + β_a1 z_a1it + ... + β_aq_a z_aq_a,it,

where z_jkit, k = 1, ..., q_j, are some time-evolving covariates. Hence, {Y_it} is no longer an RP but becomes a stochastic process.
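The pieces above can be put together to simulate one RPGP-type trajectory. The Student's t error on the log scale anticipates the RPGP-t model discussed next; the covariate-free mean function, the parameter names and the values are all illustrative assumptions for the sketch, not quantities from the paper.

```python
import numpy as np

def simulate_rpgp_t(beta_mu0, beta_a0, beta_a1, n, sigma=0.1, nu=5.0, seed=1):
    """Sketch of one trajectory from an RPGP-t-type model:
      mu*_t  = beta_mu0                 (identity link, no covariates)
      ln a_t = beta_a0 + beta_a1 * t    (log link, time-evolving ratio)
      y*_t   = mu*_t + sigma * Student-t(nu) draw   (log-scale error)
      X_t    = exp(y*_t) / a_t**(t-1),  W_t ~ Poisson(X_t).
    """
    rng = np.random.default_rng(seed)
    w = []
    for t in range(1, n + 1):
        a_t = np.exp(beta_a0 + beta_a1 * t)
        y_star = beta_mu0 + sigma * rng.standard_t(nu)
        x_t = np.exp(y_star) / a_t ** (t - 1)   # latent Poisson mean
        w.append(rng.poisson(x_t))
    return np.array(w)
```

With β_a0 = β_a1 = 0 the ratio is 1 and the counts fluctuate around exp(β_µ0); nonzero ratio coefficients generate the non-monotone trends described in the text.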
similarly, the non-constant ratio function a_it allows a non-monotone trend, and different values of β_ak, k = 0, ..., q_a, can describe a variety of trend patterns. refer to wan (2006) for more details. apparently, by mixing the poisson distribution with some heavy-tailed distributions, extra variability is added to the poisson distribution, which enables the model to accommodate the enlarged variance caused by extreme observations. in this paper, we consider the student's t- and ep distributions because they have different shapes, including heavier-than-normal to normal tails in the former as well as platykurtic shapes and leptokurtic shapes with a kink in the latter. the various shapes allow the model to be more flexible in capturing different kurtoses in the data. if y*_it has a student's t-distribution with mean µ*_it and variance ν/(ν-2) σ², the probability density function of the student's t-distribution f_t(y*_it) is given by (4), where µ*_it is given by (2) and ν > 2 is the degrees of freedom, which controls the tails. a small ν gives a heavier tail, and the student's t-distribution converges to the normal distribution when ν → ∞. the kurtosis is 6/(ν-4) + 3 for ν > 4, which is greater than 3, the kurtosis of the normal distribution. to study the pmfs of the proposed rpgp-t model, we assume q_µ = q_a = 1, z_µ1 = b = 0, 1 as the covariate effect and z_a1t = t as the time-evolving effect in (2) and (3). hence the mean function is µ_t = exp(β_µ0 + β_µ1 b) and the ratio function is a_t = exp(β_a0 + β_a1 t). fixing b = 1 and t = 2, we change the value of one of the scale, shape or location parameters each time while keeping the other parameters constant and approximate the pmf in (1) using monte carlo integration as described below. conditional on covariate b and time t, the marginal pmf estimator f̂_bt(w), in general, can be obtained by (5), where m = 10 000 and the latent ŷ*_bt^(j) are simulated from the student's t-distribution in (4) given the parameters µ*_it, σ and ν.
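the monte carlo estimator of the marginal pmf can be sketched as follows: draw latent y* from a student's t-distribution, back-transform to the poisson mean x, and average the poisson pmf over draws. the coefficient, scale and shape values below are our own illustrative choices, not the paper's.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
M = 10_000                       # monte carlo sample size, as in the text
beta_mu0, beta_mu1 = 3.0, -0.2   # illustrative mean-function coefficients
beta_a0, beta_a1 = 0.5, -0.1     # illustrative ratio-function coefficients
sigma, nu = 0.5, 5.0             # scale and degrees of freedom (assumed values)
b, t = 1, 2                      # covariate level and time point

mu_star = beta_mu0 + beta_mu1 * b     # identity link for the mean of y*
a_t = np.exp(beta_a0 + beta_a1 * t)   # log link for the ratio

# draw y* ~ student's t(location mu_star, scale sigma, df nu), then invert
# y* = ln y and y = a^(t-1) x to recover the latent poisson mean x
y_star = mu_star + sigma * rng.standard_t(nu, size=M)
x = np.exp(y_star) / a_t ** (t - 1)

def pmf_hat(w):
    """monte carlo estimate of the marginal pmf at count w."""
    return stats.poisson.pmf(w, x).mean()

probs = np.array([pmf_hat(w) for w in range(200)])
```

the estimated probabilities should be non-negative and nearly sum to one over a wide enough support; the heavy t-tail leaves a little mass beyond any finite truncation.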
besides the mean and variance, we also study the kurtosis of the pmfs using the method in gupta and ong (2005) for a discrete distribution. the relative long-tailedness of the distribution is defined as lim_{w→∞} f̂_bt(w+1)/f̂_bt(w), where the limit is zero for the poisson distribution. the marginal pmfs are displayed in fig. 1(a)-(d) with their means, variances and kurtoses summarized in table 1 below. note that since different pmfs are drawn on the same graph for comparison, we use curves instead of bars to represent the marginal pmfs for better visualization. results reveal that, in general, the location, variability and tailedness of the marginal pmf depend on the parameters β_jk, j = µ, a; k = 0, 1, in which a larger β_µ0 and a smaller β_a0 lead to a larger mean, variance and kurtosis. in contrast, ν and σ control the spread and the tail behaviour of the distribution without altering its mean. a smaller ν and a larger σ contribute to a larger variability and a heavier tail, and thus the model can accommodate the outlying effect due to extreme values while keeping the mean unchanged. since the variance of each distribution in table 1 is substantially larger than the mean, the rpgp-t model is capable of fitting data with overdispersion due to outliers as well as data with equidispersion when σ is small. if y*_it has an exponential power (ep) distribution, also known as the generalized error distribution, with mean µ*_it and variance σ², it has the probability density function in (6), where c_0 = Γ(3ν/2)/Γ(ν/2), c_1 = c_0^(1/2)/(ν Γ(ν/2)) and ν ∈ (0, 2] is a shape parameter which controls the kurtosis. this family subsumes a range of symmetric distributions such as the uniform (ν → 0) with kurtosis equal to 1.8, the normal (ν = 1) with kurtosis equal to 3 and the double exponential (ν = 2) with a kurtosis of 6. its tails can be more platykurtic when ν < 1 or more leptokurtic when ν > 1 compared with the normal tail (ν = 1).
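the kurtosis values quoted for the ep family can be reproduced with scipy's generalized normal distribution, assuming the mapping β = 2/ν between the paper's shape ν and scipy's gennorm shape β (this mapping is our assumption about the parameterization, inferred from the special cases listed above):

```python
from scipy import stats

def ep_kurtosis(nu):
    """ordinary kurtosis of the ep family via scipy's gennorm (beta = 2/nu assumed)."""
    beta = 2.0 / nu
    excess = float(stats.gennorm.stats(beta, moments="k"))  # excess kurtosis
    return excess + 3.0

k_normal = ep_kurtosis(1.0)       # nu = 1: normal, kurtosis 3
k_laplace = ep_kurtosis(2.0)      # nu = 2: double exponential, kurtosis 6
k_uniformish = ep_kurtosis(0.05)  # nu -> 0: approaches the uniform limit 1.8
```

the three special cases match the values in the text, which supports the assumed correspondence between the two parameterizations.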
by fixing the parameters at β_a0 = 0.5, β_a1 = −0.1, β_µ0 = 3.0, β_µ1 = −0.2, ν = 1, σ = 0.5 but leaving one floating, the marginal pmfs are again approximated using the monte carlo integration specified in (5), replacing f_t(y*_bt^(j) | µ*_t, σ, ν) with f_ep(y*_bt^(j) | µ*_t, σ, ν). the effects of the different parameters on the resulting pmfs are illustrated in table 2 (different moments of marginal pmfs for the rmpgp-ep model under a set of floating parameters with fixed values ν = 1, σ = 0.5, β_a0 = 0.5, β_a1 = −0.1, β_µ0 = 3, β_µ1 = −0.5). overdispersion may arise due to clustering effects, hence an alternative way to tackle overdispersion is to add mixture effects into the mean and ratio functions. suppose that there are g groups of subjects who have different trend patterns and each subject has probability π_l of coming from group l, l = 1, ..., g. conditional on group l, the marginal pmf f_l(w_it) for w_it is given by (7), and the group-specific mean µ_itl and ratio a_itl functions become µ*_itl = β_µ0l + β_µ1l z_µ1it + ... + β_µq_µ l z_µq_µ,it and ln a_itl = β_a0l + β_a1l z_a1it + ... + β_aq_a l z_aq_a,it respectively. the resulting model is named the robust mixture pgp (rmpgp) model, in which f(y*_itl) is given by (4) or (6) for the rmpgp-t or rmpgp-ep model. to illustrate the distribution of a 2-group rmpgp-ep model, its pmf f_2(w_it), where f_l(w_it), l = 1, 2, is given by (7), is plotted in fig. 3 by assuming g = 2, q_µ = q_a = 1, t = 1, z_µ1i = 1 and z_a1it = t. for l = 1, we set π_1 = 0.8, β_a01 = −0.1, β_a11 = 0.05, β_µ01 = 3, β_µ11 = −0.2, ν_1 = 0.2 and σ_1 = 0.2, while for l = 2, we use β_a02 = 0.1, β_a12 = −0.01, β_µ02 = 5, β_µ12 = −0.4, ν_2 = 1.9 and σ_2 = 0.01. fig. 3 clearly displays the two distinct modes in the distribution, with a larger mode (l = 1) at smaller values of w and a smaller one (l = 2) at larger values of w representing the outliers.
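the bimodal shape described for fig. 3 comes from mixing a dominant low-count component with a small high-count component. a stripped-down sketch with plain poisson components (the weight and component means below are illustrative, not the paper's fitted values) reproduces the two-mode pattern:

```python
import numpy as np
from scipy import stats

pi1, m1, m2 = 0.8, 4.6, 39.5  # illustrative mixture weight and component means
w = np.arange(80)
pmf = pi1 * stats.poisson.pmf(w, m1) + (1 - pi1) * stats.poisson.pmf(w, m2)

# locate the local maxima (interior points larger than both neighbours)
modes = [int(k) for k in w[1:-1] if pmf[k] > pmf[k - 1] and pmf[k] > pmf[k + 1]]
```

the larger mode sits near the dominant low-count component and the smaller mode near the high-count component, mirroring the l = 1 and l = 2 groups in the text.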
this explains how the incorporation of mixture effects in the rmpgp-ep model can accommodate overdispersion due to clustering effects. performing statistical inference using classical methods such as the maximum likelihood approach is cumbersome when the data distribution has no closed form, because the likelihood function involving high-dimensional integration is intractable. to avoid such numerical difficulties, we use a bayesian approach via mcmc algorithms to convert the optimization problem into a sampling problem. since the non-conjugate structure in the posterior distribution of both rmpgp models and the absolute-value term in the density function of the ep distribution complicate the sampling algorithms, representing the heavy-tailed distributions in a scale mixture form produces a simpler set of full conditional posterior distributions for the parameters and alleviates the computational burden of the gibbs sampler in the mcmc algorithms. choy and smith (1997) have shown that the student's t- and ep distributions can be expressed in scale mixture representation to facilitate the simulation in the mcmc algorithms via a bayesian hierarchical structure. however, the ways they handle outliers are different. choy and walker (2003) revealed that the former downweighs the extreme values, whereas the latter merely bounds the influence of the outliers. thus, it is interesting to study their performance in outlier diagnosis. in the following, the student's t-distribution expressed in smn form and the ep distribution represented in smu form will be discussed in detail. assume that a continuous random variable y has a student's t-distribution f_t(y) with location µ, scale σ² and degrees of freedom ν.
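the smn construction for the student's t can be verified by simulation: drawing λ from a gamma(ν/2, rate ν/2) and then y | λ from n(µ, σ²/λ) reproduces a t-distribution with location µ, scale σ and ν degrees of freedom. the sketch below (location, scale, df and sample sizes are illustrative) also checks that small λ goes with large deviations, which is what makes λ useful for flagging outliers:

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, nu, n = 1.0, 2.0, 5.0, 400_000

# mixing parameter: lambda ~ gamma(shape nu/2, rate nu/2), i.e. scale 2/nu
lam = rng.gamma(shape=nu / 2, scale=2 / nu, size=n)
# conditional normal with inflated variance sigma^2 / lambda
y_smn = rng.normal(mu, sigma / np.sqrt(lam))

# direct student's t draws with the same location, scale and df
y_t = mu + sigma * rng.standard_t(nu, size=n)

q = [0.1, 0.5, 0.9]
q_smn, q_t = np.quantile(y_smn, q), np.quantile(y_t, q)
```

matching quantiles indicate the two constructions give the same distribution, and the negative association between λ and |y − µ| illustrates the variance-inflation mechanism described in the text.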
the probability density function of y is said to have a smn representation if it can be expressed as in (10), where f_n(· | c, d) denotes a normal distribution with mean c and variance d, f_g(· | c, d) refers to a gamma distribution with mean c/d and variance c/d², ν is a shape parameter and λ is a mixing parameter which can be used to identify outliers. an outlier is indicated if λ is substantially small, as a small value implies that the normal distribution in (10) has an inflated variance and hence helps to downweigh the observation's influence on the variance σ². applying the smn form (10) to (4) in the rmpgp-t model, the marginal pmf for w_it in (7) becomes a corresponding scale mixture form. theoretically, any distribution that can be expressed in smn form also has a smu representation (qin et al., 1998). to simplify the implementation of the mcmc sampling algorithm, walker and gutiérrez-peña (1999) first proposed to express the ep distribution in smu representation. in a slightly different form, chan et al. (2008) write the ep distribution in smu form as in (11), where f_u(· | c, d) is a uniform distribution on the interval (c, d) and again λ is a mixing parameter. in contrast to the student's t-distribution, the larger the λ, the wider is the range of the uniform distribution to accommodate a possible outlier. in the rmpgp-ep model, if we replace (6) with (11), the marginal pmf for w_it in (7) becomes a corresponding scale mixture form as well. to implement the mcmc algorithms, winbugs, an interactive windows version of the bugs program for bayesian analysis of complex statistical models using mcmc techniques, is used. for the rmpgp-t and rmpgp-ep models, the hierarchical structure under the bayesian framework is outlined for the rmpgp-t model and for the rmpgp-ep model, where µ*_itl and a_itl are given by (8) and (9) and i_il is the group membership indicator for subject i such that i_il = 1 if he/she comes from group l and zero otherwise.
in order to construct the posterior density, prior distributions are assigned to the model parameters as follows: β_jkl ~ f_n(0, τ²_jkl), j = µ, a; k = 0, 1, ..., q_j; l = 1, ..., g for the rmpgp-ep model (14), and (i_i1, ..., i_ig)^t ~ multinomial(1, π_1, ..., π_g), where c_l, d_l, h_l are some positive constants, f_ig(c, d) denotes the inverse gamma density and f_dir(α) represents a dirichlet distribution, a conjugate to the multinomial distribution, with parameters α = (α_1, ..., α_g). in the case of a 2-group (g = 2) mixture model, (15) can be simplified to i_i1 ~ bernoulli(π_1), i_i2 = 1 − i_i1, and (16) becomes a uniform prior f_u(0, 1) for π_1 with π_2 = 1 − π_1. with the posterior means î_il of the group membership indicators i_il, patient i is classified into group l if î_il = max_l î_il. according to bayes' theorem, the posterior density is proportional to the product of the complete data likelihood and the prior probability distributions. for the rmpgp-t and rmpgp-ep models, the complete data likelihood functions l_t(θ) and l_ep(θ) for the observed data w_it and missing data {y*_itl, λ_itl, i_il} are given by (17). the vector of model parameters is θ = (θ_1^t, ..., θ_g^t, π_1, ..., π_{g-1})^t, where θ_l = (β_µl, β_al, σ_l, ν_l)^t; i = 1, ..., m, t = 1, ..., n_i, l = 1, ..., g, β_µl = (β_µ0l, ..., β_µq_µ l) and β_al = (β_a0l, ..., β_aq_a l). treating {y*_itl, λ_itl, i_il} as missing observations, the joint posterior density of the rmpgp-ep model follows, where w = (w_11, w_12, ..., w_mn_m)^t, i = (i_11, i_12, ..., i_mg)^t, y* = (y*_111, y*_121, ..., y*_mn_m g)^t, λ = (λ_111, λ_121, ..., λ_mn_m g)^t, β = (β_a01, ..., β_aq_a g, β_µ01, ..., β_µq_µ g)^t, σ = (σ_1, ..., σ_g)^t, ν = (ν_1, ..., ν_g)^t and π = (π_1, ..., π_g)^t.
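for g = 2, the multinomial membership prior collapses to a bernoulli draw for i_i1, and classification uses the posterior means î_il. a small sketch of both facts follows; the probability and the posterior means are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
pi1, n_draws = 0.7, 20_000

# (i_i1, i_i2) ~ multinomial(1, pi1, pi2) is equivalent to i_i1 ~ bernoulli(pi1)
draws_mn = rng.multinomial(1, [pi1, 1 - pi1], size=n_draws)[:, 0]
draws_bern = rng.binomial(1, pi1, size=n_draws)

# classify each patient to the group with the largest posterior mean indicator
i_hat_1 = np.array([0.9, 0.2, 0.55])   # illustrative posterior means of i_i1
group = np.where(i_hat_1 >= 0.5, 1, 2)  # group 1 iff i_hat_i1 >= i_hat_i2
```

the two sampling routes should agree in distribution, and the argmax rule assigns each subject to whichever group its indicator favours.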
the complete data likelihood l_ep(θ) of the rmpgp-ep model is given by (17) and the priors are given by (12)-(16). in gibbs sampling, the unknown parameters are simulated iteratively from their univariate full conditional posterior distributions, which are proportional to the joint posterior density of the complete data likelihood and the prior densities. the univariate full conditional posterior densities for each of the unknown model parameters θ = (β, σ, ν, π, λ) and the latent group indicators i_il involve the vectors β_−, y*_−, λ_−, σ_−, ν_−, π_− and i_−, which are the vectors β, y*, λ, σ, ν, π and i excluding β_jkl, y*_itl, λ_itl, σ_l, ν_l, π_l and i_i respectively. the mcmc algorithms are implemented using winbugs, where 55 000 iterations are executed for each model and the first 5000 iterations are discarded as the burn-in period. thereafter, parameters are sub-sampled from every 50th iteration to reduce the auto-correlation in the sample. this results in m = 1000 simulated posterior samples of every parameter, and parameter estimates are given by their sample means or medians. history plots and auto-correlation function (acf) plots of each parameter are examined to ensure convergence and independence among the parameters. in our analysis, we adopt the deviance information criterion (dic), proposed by spiegelhalter et al. (2002), as the model selection criterion. the dic is the sum of the posterior mean deviance d(θ), measuring the model fit, and an effective dimension p_d, which accounts for the model complexity. for the rmpgp model, the dic is defined as in (18), where d = t or ep, f_d(·) are the densities given by (4) and (6) respectively, θ^(j) and θ̄ represent the jth posterior sample and the posterior mean of the parameter θ, and ȳ*_itl and ī_il are defined in a similar way by replacing y*^(j)_itl and θ^(j) with their posterior means. the rule of thumb is that the smaller the dic, the better the model.
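the dic computation from posterior samples can be sketched on a toy conjugate poisson-gamma example (the model and numbers here are ours, purely illustrative): d̄ is the deviance averaged over posterior draws, d(θ̄) is the deviance at the posterior mean, p_d = d̄ − d(θ̄), and dic = d̄ + p_d.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
w = rng.poisson(6.0, size=40)  # toy observed counts

# conjugate posterior for the poisson mean under a gamma(1, 1) prior
theta_post = rng.gamma(w.sum() + 1, 1.0 / (len(w) + 1), size=1000)

def deviance(theta):
    return -2.0 * stats.poisson.logpmf(w, theta).sum()

d_bar = np.mean([deviance(th) for th in theta_post])  # posterior mean deviance
d_hat = deviance(theta_post.mean())                   # deviance at posterior mean
p_d = d_bar - d_hat                                   # effective dimension
dic = d_bar + p_d
```

in this one-parameter toy model the effective dimension p_d should land near 1, matching its interpretation as an effective number of parameters.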
to investigate the properties of the rpgp-t and rpgp-ep models, we conduct a simulation study in which r = 100 data sets are simulated from each of the rpgp models based on a set of true parameters. we set q_µ = q_a = 1 and each data set contains m = 80 time series of length n_i = 8, from m_0 = 40 subjects in the control group (b = 0) and another m_1 = 40 subjects in the treatment group (b = 1), with z_µ1it = b in the mean function (2). the degrees of freedom ν is set to include both heavy (ν = 2.5) and light (ν = 50) tails for the student's t-distribution, and platykurtic (ν = 0.1) and leptokurtic (ν = 1.8) shapes for the ep distribution. the parameters β_ak in the ratio function (3), with z_a1it = t, are also set to include different trend patterns by varying the sign and magnitude of the true values. afterwards, both models are fitted to each data set using the bayesian approach implemented in winbugs and r2winbugs, and the parameter estimate θ̂ is given by the average of the m = 1000 posterior medians θ̂_j. to examine the bias and precision of the mcmc sampling algorithms, the standard deviation (sd) of θ̂_j over the r = 100 simulated data sets is reported for each parameter θ. to assess the accuracy of the parameter estimates when a data set is simulated from and fitted to the same model, we calculate the mean squared error (mse) for each parameter θ, which averages the squared errors (θ̂_j − θ)² over the r replicates. for model selection, the average dic and the average squared error (ase) proposed by wegman (1972) are used to assess the quality of the density estimator relative to the true pmf. denoting the true pmf of the rmpgp model by f_bt(w) at time t with covariate b, the ase is used to compare the performance of the two models on the same simulated data set and is defined as the ψ_b-weighted average of the squared differences between f̂_jbt(w) and f_bt(w) over counts w, where f̂_jbt(w) is the pmf estimator of (1) obtained by the monte carlo integration described in (5) for counts w in the jth simulated set at time t with treatment effect b, and ψ_b = m_b/m is the weight associated with the control (b = 0) or treatment (b = 1) group.
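the two assessment quantities can be written down directly. in this sketch the estimator values and pmfs are synthetic (our own numbers); only the formulas mirror the text: mse averages the squared estimation error over the r replicates, and ase is a ψ_b-weighted squared distance between an estimated and the true pmf.

```python
import numpy as np

rng = np.random.default_rng(5)
theta_true, R = 2.0, 100

# one posterior-median estimate per simulated data set (synthetic noise level)
theta_hat = theta_true + rng.normal(0.0, 0.1, size=R)

mse = np.mean((theta_hat - theta_true) ** 2)  # accuracy over replicates
sd = theta_hat.std(ddof=1)                    # precision over replicates

def ase(f_hat, f_true, psi):
    """psi_b-weighted squared distance between pmfs indexed as [group b, count w]."""
    return float(np.sum(psi[:, None] * (f_hat - f_true) ** 2))

f_true = np.array([[0.2, 0.8], [0.5, 0.5]])  # toy pmfs for b = 0 and b = 1
psi = np.array([0.5, 0.5])                   # equal group weights
```

an unbiased, precise estimator keeps both mse and sd small, and ase is zero exactly when the estimated pmf matches the true one.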
clearly, the smaller the ase, the closer the estimated pmf is to the true one, and thus the better the model performance. table 3 summarizes the results of the four sets of simulation experiments, with the first two data sets simulated from the rpgp-t model and the next two simulated from the rpgp-ep model. in general, the mcmc algorithms give unbiased and precise results, as both the mse and sd of most parameters are reasonably small. moreover, the values of the shape parameter ν of the two models match each other in terms of tailedness. for example, the small ν̂ = 5.286 of the student's t-distribution agrees with ν̂ = 1.8234, which is close to 2, in the ep distribution. however, it is noticed that ν of the rpgp-t model has relatively lower precision and higher bias, reflecting the higher level of difficulty in estimating the tailedness of the heavy-tailed student's t-distribution. in model comparison, although the rpgp-t model has a slightly smaller ase (0.04504 versus 0.04817, averaged over the four simulated sets), the rpgp-ep model outperforms the rpgp-t model in dic (1968.4 versus 1979). all in all, the simulation experiment shows that the performance of the mcmc algorithms for the two models is satisfactory and the estimated pmfs f̂_jbt(w) approximate the true pmfs f_bt(w) reasonably well. while the ep distribution can be platykurtic or leptokurtic with a kink and the student's t-distribution can give a very heavy tail when the outlying effect is tremendous, the two rpgp models are suitable under different circumstances. we illustrate the usefulness of our proposed models through the epilepsy data, which can be found in thall and vail (1990). the data were collected from a clinical trial of 59 epileptics by leppik et al. (1985). in the randomized controlled trial, m = 59 patients suffering from simple or complex partial seizures were assigned to either the antiepileptic drug progabide (z_µi1 = 1) or a placebo (z_µi1 = 0) with no intrinsic therapeutic value.
the seizure counts were recorded at two-week intervals over an eight-week period (n_i = 4) with no dropout or missing cases. as shown in table 4 (mean and variance for the observed epilepsy data and under the two simple fitted models and the best model), the seizure counts exhibit a prominent extra-poisson variation with large variance-to-mean ratios at all times t, due to some outlying observations as displayed in fig. 4(a) and (b). to assess the overdispersion in the data, we fit a simple poisson regression model using the mean link function η_it = exp(β_0 + β_1 z_µi1 + β_2 t) and a pgp model using the mean function µ_it = exp(β_µ0 + β_µ1 z_µi1) and the ratio function a_it = exp(β_a0 + β_a1 t). the mean and variance under both models (indicated by '*' in table 4) are equal and are given by η_it and µ_it / a_it^(t-1) respectively. obviously, neither of the two simple models, as restricted by their equidispersed property, can capture the overdispersion. besides, the higher mean seizure counts of the placebo group indicate that 'treatment group' is a feasible covariate. moreover, the gradually decreasing seizure counts over time for both the placebo and progabide groups suggest that time t may be a possible time-evolving effect. in addition, population heterogeneity in terms of trend pattern and count level is also detected intuitively. in consideration of these observations and the clinical interest of examining trend patterns, we adopt the rmpgp models to analyze the epilepsy data. referring to the mcmc algorithms detailed in section 3.2, our prior specifications are mostly non-informative except for ν_l in the rmpgp-t model. in both rmpgp models, we assign τ²_jkl = 1000 in (12), c_l = d_l = 0.001 in (13) and α_l = 1/g in (16) for a g-group (g ≥ 2) model. for ν_l in the rmpgp-t model, we take h_l = 20 since there is a high degree of overdispersion in the seizure counts.
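the variance-to-mean ratio used above to detect extra-poisson variation is easy to illustrate: a plain poisson sample has a ratio near 1, while counts whose mean is itself random (latent heterogeneity, as induced by mixture or heavy-tailed latent means) are overdispersed. the numbers below are illustrative only, not the epilepsy data.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 50_000

# equidispersed: plain poisson counts, variance/mean close to 1
w_pois = rng.poisson(8.0, size=n)

# overdispersed: poisson counts with a gamma-distributed latent mean
lam = rng.gamma(shape=2.0, scale=4.0, size=n)  # e(lam) = 8, var(lam) = 32
w_over = rng.poisson(lam)

ratio_pois = w_pois.var() / w_pois.mean()
ratio_over = w_over.var() / w_over.mean()  # theoretically (8 + 32) / 8 = 5
```

this is the same diagnostic logic as table 4: an equidispersed model cannot reproduce a ratio substantially above 1.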
after implementing the mcmc algorithms in winbugs, the posterior sample means are adopted as parameter estimates, since the posterior densities of most model parameters are highly symmetric and the posterior sample means are close to the posterior sample medians. table 5 summarizes the parameter estimates, standard errors (se), 95% credibility intervals (ci) and the model selection criterion of the fitted models. we first fitted a simple rpgp model with treatment group (z_µ1it = 0, 1) as the covariate in the mean function µ_itl in (8) and two-week interval (z_a1it = t = 1, 2, 3, 4) as the time-evolving effect in the ratio function a_itl in (9). the negative β_µ1 in both rpgp models indicates a treatment effect: patients receiving progabide are associated with lower seizure counts. however, within the treatment group, it is evident that some of these patients have abnormally high seizure counts. fitting a simple rpgp model may fail to allow for the clustering effect among patients receiving the same treatment. we therefore fitted a 2-group rmpgp model and also attempted a 3-group rmpgp model using both the student's t- and ep distributions, but the results indicated that one of the groups in the 3-group rmpgp model degenerated and hence those models were discarded. not surprisingly, both rmpgp models give parallel results, as they share some common model properties except the shape of the distribution of y*_itl. in the rmpgp-t model, two distinct groups of patients were identified, with the first group of patients having generally higher seizure counts; it is named the high-level group (l = 1). within this group, 54% of the patients are receiving progabide and they have lower seizure counts in general (β_µ11 < 0) than those receiving the placebo. in the low-level (l = 2) group, 49% of the patients belong to the progabide group and again they generally have fewer epileptic seizures during the study period (β_µ12 < 0).
besides, the ratio function a_it1 in (9) reveals that there is a slightly decreasing trend in the seizure counts in the high-level group, while a_it2 indicates that no obvious trend is detected in the low-level group. in addition, compared with the low-level group, the relatively smaller ν_1 shows that the high-level group has a higher degree of overdispersion in the seizure counts due to the existence of some abnormally large observations, as revealed in fig. 4(a) and (b). as expected, the rmpgp-ep model gives consistent results in terms of trend pattern and treatment effect. moreover, the group membership of the patients has a close affinity with that of the rmpgp-t model, and the two diverse groups, the high-level and low-level groups, are recognized as well. but for the high-level group, despite the comparable mean level, the ratio function shows that the seizure count increases in the second 2-week interval before it drops in the next two 2-week intervals. coherently, the estimate ν̂_1 = 1.49 in the rmpgp-ep model agrees with the small ν̂_1 = 7.21 in the rmpgp-t model, indicating that the distribution of the seizure counts in the high-level group has a heavier tail to account for the higher degree of overdispersion. on the other hand, the smaller ν̂_2 = 0.24 indicates that the data distribution is more uniform in the low-level group. for model selection, the smaller dic given by (18), as shown in table 5, for the rmpgp-ep model manifests its better fit to the epileptic seizure counts after accounting for the model complexity. a plausible explanation is that the ep distribution has a more flexible tail behaviour and thus provides a better fit to the data. to further investigate this, the observed and fitted pmfs for the low-level group are illustrated in fig.
5(a)-(d) at different time points, in which the observed pmf f_tl(w) for group l at time t is generally given by (19), where i(w_it = w) is an indicator which returns 1 when w_it = w for patient i at time t in group l and 0 otherwise, i(z_µ1i = b) indicates the treatment group b of patient i, î_il is the posterior mean of the group membership indicator and ψ_b is the weight associated with the placebo (b = 0) or progabide (b = 1) group. on the other hand, the fitted pmf f̂_tl(w) is simply obtained by the numerical approximation described in (5) based on the parameter estimates θ̂_l and is weighted by ψ_b. based on fig. 5(a)-(d), both models match the observed pmfs quite well. however, the estimated trend in the ratio function cannot accommodate the upsurge in observation w = 4 at t = 2, resulting in a discrepancy between the observed and fitted pmfs in fig. 5(b). it is not surprising to find that the two rmpgp models give similar pmfs, as both have the capability of modelling highly overdispersed data. nevertheless, despite the affinity, the slightly heavier tail in the distribution of the rmpgp-ep model possibly gives rise to its better dic. in the best model, the rmpgp-ep model, extra variation is added to the mean of the poisson distribution; hence the variances of the estimated pmf f̂_tl(w) for each treatment group, and the overall variances, which comprise the variance of the expectation and the expectation of the variance conditional on the mixture group being known, show a dramatic improvement and are reported in table 4. advantageously, implementing the model using a bayesian approach enables us to study the latent stochastic process y*_itl and the mixing parameters λ_itl, which can be output in the coda in winbugs.
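the membership-weighted observed pmf described above can be sketched as follows; the counts and posterior means î_i1 below are made-up values, and the treatment weighting ψ_b is omitted to keep the sketch minimal:

```python
import numpy as np

w_obs = np.array([2, 3, 2, 15, 2, 3, 18, 2])                # toy counts w_it
i_hat = np.array([0.1, 0.2, 0.0, 0.9, 0.1, 0.3, 1.0, 0.0])  # toy posterior i_hat_i1

def observed_pmf(w, membership):
    """empirical pmf over counts 0..max(w), weighted by posterior memberships."""
    support = np.arange(w.max() + 1)
    num = np.array([(membership * (w == k)).sum() for k in support])
    return num / membership.sum()

f_group1 = observed_pmf(w_obs, i_hat)
```

because the large counts carry the large memberships here, the group-1 observed pmf concentrates on the high values, which is the mechanism behind the high-level group's pmf in fig. 5.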
under the rmpgp-ep model, to examine the density of the unobserved y*_itl, which are simulated from an ep distribution with location parameter µ*_itl, scale parameter σ²_l and shape parameter ν_l, we compare the densities of the posterior samples of y*_itl with the normal distribution having the mean and standard deviation of the posterior samples, and with the ep distribution having the same mean and standard deviation and shape parameter ν̂_l. four selected y_itl from each cluster group l and treatment group z_µ1i are illustrated in fig. 6. obviously, the y_it1's have a leptokurtic shape whereas the y_it2's appear to be more uniform than the normal distribution. these agree with the results in table 5, where the shape parameter of the high-level group is larger than that of the low-level group (ν_1 > ν_2), and explain how the ep distribution can downweigh the outlying effect. last but not least, outlier diagnosis is performed using the mixing parameters of the better model, the rmpgp-ep model. for each cluster group l, the mixing parameters λ̂_itl are plotted against the standardized observations w'_itl of the patients who are classified into group l in fig. 7. for a fair comparison, the seizure count is standardized as w'_itl = |w_it − ŵ_itl| / σ̂_witl, where ŵ_itl and σ̂_witl are respectively the mean and standard deviation of the estimated pmf f̂_tl(w) in (19) at time t under group l of the rmpgp-ep model. an unusually large λ_itl indicates that w_it is possibly an outlier under group l. both the mixing parameters λ_itl and the standardized seizure counts w'_itl are sorted by group (l = 1, 2) for better visualization and are graphed in fig. 7. clearly, all of the top 10 (5%) outlying counts, those with the 10 largest λ_itl, lie in the high-level group (l = 1) due to the presence of some extreme observations. they are highlighted in fig. 7 with the corresponding rank and observation in parentheses.
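the outlier diagnosis step (standardize each count by its fitted mean and sd, then rank observations by the mixing parameter) can be mimicked with stylized data. here λ is constructed to grow with the standardized deviation, as a stand-in for actual posterior output rather than a model fit:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 500
w_hat, sd_hat = 10.0, 3.0                  # fitted mean and sd (stylized)

w = rng.poisson(w_hat, size=n)
w[:4] = [40, 38, 45, 50]                   # planted extreme counts

w_std = np.abs(w - w_hat) / sd_hat         # standardized observations
lam = w_std + rng.normal(0.0, 0.1, size=n) # stylized mixing parameters

top10 = np.argsort(lam)[-10:]              # indices of the 10 largest lambdas
r = float(np.corrcoef(lam, w_std)[0, 1])
```

ranking by λ recovers the planted extremes, and the strong correlation between λ and the standardized counts mirrors the r = 0.9731 reported for the real data.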
not surprisingly, the correlation between λ_it1 and w'_it1 is high (r = 0.9731), which signifies the appropriateness of using the mixing parameters in outlier diagnosis. since the outlying effect is not substantial in the low-level group (l = 2), λ_it2 appears to be relatively smaller. the top 4 outlying values come from the same patient, with id 49, who has abnormally high seizure counts at every 2-week interval. besides, 2 large observations are identified from the patients with ids 8 and 25. in addition to large observations, four small seizure counts are also classified as outliers, from the patients with ids 10, 24, 39 and 56. knowing which patients are associated with abnormal seizure counts, specialists can pay more attention to their abnormalities, and alternative treatments may be considered. at the same time, the rmpgp-ep model has downweighed the outlying effect and thus the general trend pattern is not distorted by the extreme observations. in this paper, we propose using a poisson geometric process (pgp) model to analyze repeated measurements of counts over time. the model is essentially a state space model with a ratio function, which describes the direction and strength of the trend, and a mean function, which reveals the initial level and studies the covariate effect. however, the pgp model fails to allow for extra-poisson variation when extreme observations appear in the data. ignoring the outlying effects may also lead to overestimated means and variances, resulting in invalid interpretation and prediction. as remedies, two methods are suggested to account for overdispersion: adopting a heavy-tailed distribution and incorporating mixture effects. the latter can handle clustering effects in the data, and the overdispersion arising from them may also be captured. as a new direction to account for population heterogeneity, we apply heavy-tailed distributions to the modelling of time series of counts and pioneer the robust poisson geometric process (rpgp) model.
this model allows the mean x_itl of the poisson distribution to follow a gp, while the logarithm of the underlying stochastic process {y_it = a_it^(t-1) x_it} follows a heavy-tailed distribution; the resultant model is called the rpgp model. by varying a set of model parameters, the properties of the rpgp models are reported in tables 1 and 2 and their pmfs are revealed in figs. 1 and 2. although the marginal pmfs do not have a closed form, the monte carlo integration in (5) can be used to approximate the pmf, and hence the mean as well as the variance. tables 1 and 2 show that the model can accommodate both equidispersed and overdispersed data with varying degrees of kurtosis. the rpgp models and their extension, the rmpgp models which allow for clustering effects, are implemented using a bayesian approach. expressing the heavy-tailed distributions in a scale mixture form facilitates the model implementation using mcmc algorithms, and the mixing parameters enable us to perform outlier diagnosis, as shown in the real data analysis. here, the student's t-distribution in scale mixture of normals (smn) form and the exponential power (ep) distribution in scale mixture of uniforms (smu) form are adopted. the resultant models can be efficiently implemented via the user-friendly software winbugs. moreover, the posterior densities for the mcmc algorithm are derived for the rmpgp-ep model. the simulation study shows that the performances of the two rmpgp models are comparable and satisfactory. in the case of data with a very long tail, the rmpgp-t model seems to fit better, since the student's t-distribution allows a much heavier tail than the normal distribution. on the other hand, the rmpgp-ep model gives a better fit in the analysis of the epilepsy data, with diverse degrees of overdispersion across mixture groups, as the ep distribution has a more flexible tail which can be either leptokurtic or platykurtic.
one pitfall of our proposed model is that taking the log-transformation of the latent y_itl inevitably causes data associated with close-to-zero means to be identified as outliers. hence, when zeros are dominant in the data, the proposed rmpgp models can be extended to include a zero-altered component (wan and chan, 2009).

references:
- bayesian inference in statistical analysis
- robust bayesian analysis of loss reserve data using the generalized-t distribution
- binary geometric process model for the modelling of longitudinal binary data with trend
- modelling sars data using threshold geometric process
- scale mixtures distributions in insurance applications
- hierarchical models with scale mixtures of normal distributions
- the extended exponential power distribution and bayesian robustness
- analysis of long-tailed count data by poisson mixtures
- continuous univariate distributions
- analysing longitudinal count data with overdispersion
- on familial poisson mixed models with multi-dimensional random effects
- geometric process and replacement problem
- a note on the optimal replacement problem
- statistical inference for geometric processes with lognormal distribution
- a geometric-process maintenance model for a deteriorating system under a random environment
- analysis of data from a series of events by a geometric process model
- a double-blind crossover evaluation of progabide in partial seizures
- uniform scale mixture models with applications to bayesian inference
- bayesian measures of model complexity and fit (with discussion)
- some covariance models for longitudinal count data with overdispersion
- robustifying bayesian procedures
- analysis of poisson count data using geometric process model
- a new approach for handling longitudinal count data with zero-inflation and overdispersion: poisson geometric process model
- nonparametric probability density estimation: a comparison of density estimation methods

see tables 1-5 and figs. 1-7.

key: cord-274513-0biyfhab authors: baumgartner, m. t.; lansac-toha, f. m.
title: assessing the relative contributions of healthcare protocols for epidemic control: an example with network transmission model for covid-19 date: 2020-07-22 journal: nan doi: 10.1101/2020.07.20.20158576 sha: doc_id: 274513 cord_uid: 0biyfhab the increasing number of covid-19 cases threatens human life and requires retainment actions that control the spread of the virus in the absence of effective medical therapy or a reliable vaccine. there is a general consensus that the most efficient health protocol at the current stage is to disrupt the infection chain through social distancing, although economic interests stand against closing non-essential activities, posing a debatable tradeoff. in this study, we used an individual-based age-structured network model to assess the effective roles of different healthcare protocols, such as the use of personal protection equipment and social distancing at neighbor- and city-level scales. using as much empirical data as available in the literature, we calibrated a city model and simulated low, medium, and high parameters representing these protocols. our results revealed that the model was most sensitive to changes in the parameter representing the rate of contact among people from different neighborhoods, which supports social distancing at the city level as the most effective protocol for controlling the disease outbreak. another important parameter represented the use of individual equipment such as masks, face shields, and hand sanitizers like alcohol-based solutions and antiseptic products. interestingly, our simulations suggest that some periodical activities such as going to the supermarket, gas station, and pharmacy would contribute little to sars-cov-2 spread when performed within the same neighborhood.
as we can see nowadays, there is an inevitable context-dependency and economic pressure on the level of social distancing recommendations, and we reinforce that every decision must be a welfare-oriented, science-based decision. epidemics usually pose challenges to society by threatening human life, which frequently leads to social disruption and economic depletion (meltzer et al., 1999). the most recent coronavirus disease outbreak, caused by the severe acute respiratory syndrome [sars]-cov-2 virus, is a particularly urgent global event that has already induced massive losses of human life and affected economic development worldwide (anderson et al., 2020; baldwin and mauro, 2020; kabir et al., 2020). since the first case of covid-19 in wuhan, hubei province of china, the disease has established local transmission in many countries, with the number of confirmed and fatal cases growing exponentially in several regions (chinazzi et al., 2020; wilder-smith et al., 2020). the rapid spread of this new coronavirus has motivated numerous studies on its epidemiological characteristics (adhikari et al., 2020; lipsitch et al., 2020; rothan and byrareddy, 2020). clinical symptoms include high fever, dry cough, and respiratory distress (lai et al., 2020). however, the disease onset may occasionally progress into severe lung failure owing to alveolar damage. the clinical characteristics and common course of ill covid-19 patients range from absent to mild symptoms and are certainly context-dependent, but a considerable proportion of individuals will likely require medical assistance, especially elderly people and/or those with underlying comorbidities (k. yang et al., 2020). in fact, a common concern for health systems is that an uncontrolled outbreak would be catastrophic, and effective retainment measures are now the only realistic option to avoid the total collapse of healthcare facilities (adams and walls, 2020).
there is an ongoing debate about the optimal mitigation strategies to prevent or reduce contagion among people. strategies range from the use of physical barriers (e.g., masks), hand hygiene, and avoidance of direct contact among people (e.g., handshakes and hugs) to social distancing at different scales, such as staying meters apart from each other and traveling restrictions (leung et al., 2020). however, the proposal of distancing interventions on a large scale, such as the suspension of classes and the lockdown of non-essential activities such as many commercial facilities, has proved to be a drawback to economic interests (ayittey et al., 2020; bonaccorsi et al., 2020). this side effect stands against the social distancing protocols, highlighting that the optimal combination of retainment strategies needs to be a welfare-oriented consensus between healthcare and economic sustainability (gostin and wiley, 2020). in this context, modeling the epidemic propagation with a focus on these strategies can provide useful insights to guide field interventions and to understand covid-19 infection states. qualitative information from these models can assist decision-makers and support critical intervention policies. it is a common perception that the novel coronavirus is transmitted through a contact network among humans (rothan and byrareddy, 2020). the spread of the virus in human networks occurs over time and across geographical space. therefore, enhanced models should account for such spatiotemporal dynamics, as well as individual-level epidemiological phenomena (li et al., 2019). moreover, because susceptibility and mortality rates from covid-19 are age-dependent (moghadas et al., 2020), nearly optimal epidemiological models must attempt to incorporate these heterogeneities in order to produce more realistic results (bian, 2004).
those semi-mechanistic models that define population dynamics considering age groups are known as individual-based age-structured network models (ajelli et al., 2010). in this endeavor, we built city-level simulations to investigate which strategy could best mitigate the infection outbreak. models were calibrated with an empirical spatial structure and specific parameters of covid-19, considering an extended susceptible-infected-recovered (sir) epidemiological structure. in depth, we aimed to investigate the relative roles of health protocols such as the direct exposure to sars-cov-2, as well as social distancing on both local and large scales. by varying the model parameters related to these protocols, we were able to identify better scenarios in terms of delayed infection peaks and lower numbers of cases, as well as activities with low potential to boost the outbreak. our simulations indicate that changes in a single public protocol (e.g., social distancing or individual-level care) could result in quite different patterns of the infection wave. meanwhile, we show that the carrying capacity of healthcare facilities will likely be overloaded and that social distancing, allied with investments in mass testing and hospital facilities, is the most appropriate engagement against covid-19. we calibrated the simulations using an epidemiological model that considered all combinations of relatively low, intermediate, and high probabilities of personal exposure to sars-cov-2 (β), as well as of the probability of contact among people at local (v) and regional (k) scales. as β, v, and k increase, so do, respectively, the individual-level chance of being exposed to the virus and the encounter probabilities at local and regional scales. in this sense, there was a clear trend of faster and higher infection peaks as the values of the three parameters increased from scenarios s1 to s27 (figure 1; table s1).
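the 27 scenarios (s1-s27) described above are simply the full factorial combination of three levels for each of β, v, and k. a minimal sketch of how such a scenario grid can be generated; the level values below are illustrative placeholders (only the baselines v = 0.7, β = 0.2, and k = 0.01 are reported in the methods), not the authors' calibrated numbers:

```python
from itertools import product

# Illustrative low/medium/high levels for each parameter (placeholders).
levels = {
    "beta": (0.1, 0.2, 0.4),      # individual exposure probability
    "v":    (0.35, 0.7, 0.9),     # within-zone contact probability
    "k":    (0.005, 0.01, 0.02),  # between-zone distance-kernel scale
}

# Full factorial design: 3 x 3 x 3 = 27 scenarios, labeled s1..s27.
scenarios = [
    dict(zip(levels, combo), id=f"s{i + 1}")
    for i, combo in enumerate(product(*levels.values()))
]

print(len(scenarios))  # 27
```

each entry carries its parameter triple plus a scenario label, so a simulation loop can iterate over the list directly.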
for those models parametrized with high values (i.e., models with two or three red dots in figure 1), the infection outbreak peaked 6-8 weeks after the first case, on average. while these scenarios yielded infection waves with many infected individuals already in the first weeks, their peaks were also the narrowest (table s1). given the specified model structure, the results forecasting early wave peaks emerged under moderate to high probabilities of individual-level exposure to the sars-cov-2 virus (high β), in combination with higher encounter rates among people (v and k) (figure 1; table s1). infection waves peaked later when models were calibrated with lower values of the three parameters. under these circumstances, waves peaked roughly twice as late (14-16 weeks after the first case, on average) compared to the aforementioned scenarios (figure 1; table s1). these scenarios produced flattened infection curves and later peaks, especially when the model assumed that people were less exposed to the virus (low β) and had a low probability of encountering others at both local (low v) and regional (low k) scales. considering our model city, all scenarios potentially overloaded the nominal carrying capacity of the healthcare system (figure 1). nevertheless, there was a clear trend that those simulations with steeper waves (bottom scenarios in figure 1) had a shorter-lasting overload of the modeled hospital capacity. to accurately identify the relative effectiveness of each of the mitigation strategies against covid-19, we modeled the outputs of the simulations as a function of the different parameter levels. when we considered the results in terms of the rapid growth of the infection (i.e., day of the infection peak, ratio of the total population infected by the virus, and number of infected people), there were significant influences of increasing parameters β and k (table 1).
in depth, these results point towards consistent effectiveness against the disease burden from both decreasing the exposure rate of individuals and increasing social distancing at the city-level scale (table 1), and this relationship seems to be non-linear (figure 2). we found that those models calibrated with a low exposure rate and high social distancing on a large scale had delayed infection peaks and fewer infected people (table 1; figure 2). similarly, in terms of saturation of the healthcare system (i.e., first and last day and duration of the overload, and the proportional deficit in the number of beds), the model indicated that both the exposure rate (β) and social distancing at a large scale (k; i.e., city level) also had significant influences (table 1). all aspects of the infection wave related to the time that the health system would be saturated decreased with both the exposure rate and the social proximity of people (figure 2b). these scenarios reveal that the infection waves could be dramatically earlier and more intense if people get infected at an increased rate. however, the predictions also reveal more dramatic forecasts, where the number of healthcare units would likely need to be three to five times the number of potentially available beds (figure 2b). the results reveal no significant effect of the social distancing protocols at the local (i.e., neighborhood) level. table 1. partial regression coefficients obtained from multi-response models through canonical powered partial least squares (cppls) regressions. the predictor matrix considered all three model parameters (β, v, and k). we built separate models for each response matrix: infection peak (peak day, ratio of infected people, and number of cases at the peak) and healthcare saturation (first and last day, duration of the period, and estimated deficit in the number of hospital beds). significant values under jack-knife t-tests (considering α = 0.01) are in bold. see methods for details.
as an increasing number of covid-19 cases is still being identified due to the impressive transmissibility of sars-cov-2, economic consequences have become a major concern (kabir et al., 2020; mckee and stuckler, 2020). it is therefore fundamental to determine the relative effectiveness of control measures on disease retainment and to inform decisions about an adequate framework for management and mitigation strategies. for this reason, model projections considering different levels of these protocols are perhaps the best bridge between researchers and policy-makers during this epidemic. it is noteworthy that those countries that took slightly delayed actions (i.e., days or a few weeks after the first confirmed cases) had rapid spreads of covid-19, accompanied by high mortality, which imposed extraordinary demands on the public health systems (legido-quigley et al., 2020; sen-crowe et al., 2020). studies demonstrate that strategies become much more effective when combined with the detection and isolation of cases (anderson et al., 2020; ferguson et al., 2020; kretzschmar et al., 2020). unfortunately, many low-income countries may not afford or are too large to conduct mass testing (mayorga et al., 2020). thus, the brunt of retaining the spread of covid-19 lies on social distancing and associated efforts to manage and control the infection progress (crokidakis, 2020). social distancing and the isolation of infected people is a core intervention protocol for many infectious diseases and acts by reducing the potential for onward transmission, especially when 'herd immunity' protocols are unfeasible (anderson et al., 2020; lewnard and lo, 2020; kretzschmar et al., 2020). fortunately, we found that the movements of people within their residential neighborhoods had lower effects on the evolution of the epidemic curves.
this likely suggests that some periodically necessary activities, such as going to the supermarket, gas station, pharmacy, or bank agency, would have little effect overall, once performed within the same neighborhood. in our projections, respecting social distancing protocols likely delays and reduces the peak of the infection curve, thereby scattering the number of severe cases over a longer period. more importantly, under this delayed peak, healthcare systems are able to increase their carrying capacity by building up mobile cabin hospitals, which can provide better treatment for ill people and partially reduce the mortality rate. early actions are fundamental, and optimal interventions should precede the overload of healthcare carrying capacity (prem et al., 2020). as a fortunate example, since march 20, 2020, maringá (the city used as a model) declared partial lockdown, with considerably reduced traffic of people. the main interventions included the closure of educational institutions and non-essential commercial activities, as well as a complete lockdown from 21:00 to 05:00 for a few weeks. these preventive social distancing protocols were imposed only two days after the confirmation of the first case, which further increased their effectiveness. however, some cities such as large brazilian capitals are now experiencing dramatic scenarios, even with early stay-at-home recommendations (crokidakis, 2020; dana et al., 2020). although social distancing may reduce the effective spread of the sars-cov-2 virus, it can never be reduced to zero, and many people tend to underestimate this protocol since its effects may take weeks to appear. equally worrisome is the fact that our projections put the direct exposure of each individual at the frontline of the factors pushing the infection progress towards the worst-case scenarios. these results deal directly with the rate at which citizens are exposed to the virus.
the use of personal protective equipment (ppe), such as surgical or fabric-made masks, face shields, and hand sanitizers like alcohol-based solutions and antiseptic products, has been strongly recommended to the general public (who, 2020). however, this strategy raises a debate: on one hand, people's willingness to be protected whenever performing any outside activity, particularly those who work in essential services and are frequently at moderate to high exposure risk (bourouiba, 2020); on the other hand, the need for rational use of this equipment, since the global demand has grown nearly as exponentially as the outbreak itself (feng et al., 2020). so far, the most effective action seems to be imposing and encouraging the rational use of masks and the provision of hygiene items by decree or other legal dispositions. however, although most of these policies have been adopted in several countries (e.g., japan, uk, singapore, and germany), there is not enough evidence for the real effectiveness of wearing masks, alone or in combination with frequent hand washing, in preventing the contact- or aerosol-based transmission of sars-cov-2 (feng et al., 2020; rothan and byrareddy, 2020). besides, the incorrect use of ppe is thought to be worse than not using it at all, and exaggerated acquisition and overpricing of ppe could be similarly adverse. in this sense, we particularly recommend that people use ppe adequately, especially when there is potential to spread or get in contact with droplets in the air. nevertheless, we underline that social distancing protocols seem considerably more effective. just like any model, ours has limitations as well. first, we used a fixed number of hospital beds, which is certainly unrealistic if we consider that there is a current effort to expand the nominal carrying capacity of these facilities in many cities (croda et al., 2020).
however, no matter what estimate we use to forecast the deficit in the number of available beds, projections show that there will not be enough ventilators to treat covid-19 patients in the next few months, even in the best-case scenarios (ranney et al., 2020). second, we did not account for time-varying or dynamic public health protocols. as we can see nowadays, there is an inevitable context-dependency and economic pressure on the level of social distancing recommendations. certainly, models with a real-time structure accounting for these dynamics would be more appropriate for building an evidence-based political framework. we used an individual-based age-structured network model with an underlying modified susceptible-infected-recovered (sir) epidemiological structure. the modeling approach starts with two main components, the node transition graph and the contact network (fig. s1). the node transition graph consists of five compartments: susceptible (s), exposed (e), infected (i), recovered (r), and deceased (d). each individual may be in only one of these compartments at any given time, and the rate of transition from one to the next is modulated by the parameters β (transmission rate), δ (infection rate), γ (recovery rate), and θ (mortality rate). the modeled progression of covid-19 infection assumes that each individual may transit from susceptible (s) towards infected (i), and the model then estimates when they are able to recover from a given condition. the contact network is represented by the number of individuals (n; i.e., circles/nodes) as well as by their interactions (i.e., edges/links), whenever an opportunity for transmission arises (fig. s1). theoretically, we assume that an interaction between two nodes occurs whenever there is potential contact between individuals (e.g., hugs, handshakes, or airborne droplets), or when individuals interact with previously contaminated objects (e.g., doorknobs, handrails, and elevator panels).
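as a toy illustration of the contact-network component described above, the sketch below samples within-zone contact opportunities, assuming (as in the model) that any two individuals in the same zone meet with a common probability v; the function and variable names are ours, not the authors':

```python
import random

def within_zone_contacts(members, v, rng):
    """Sample contact edges: each pair of individuals living in the same
    zone gets a transmission opportunity (an edge) with probability v."""
    edges = []
    for a in range(len(members)):
        for b in range(a + 1, len(members)):
            if rng.random() < v:
                edges.append((members[a], members[b]))
    return edges

rng = random.Random(2020)
zone = list(range(10))  # ten individuals in one census zone
print(len(within_zone_contacts(zone, 1.0, rng)))  # 45: all 10*9/2 pairs meet
```

with v = 1.0 every pair is connected (a complete graph on the zone), while smaller v thins the edge set, which is the mechanism the paper varies between its low/medium/high scenarios.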
thus, these opportunities serve as windows for the spread of the virus from an already infected individual to a new potential host. in practice, we explicitly modeled each individual and its probability of movement through the node transition graph. at first, all individuals were assigned as susceptible (s), given no known previous immunity against covid-19. thereafter, the transition of each node to the exposed (e) compartment depended on the transmission rate β and the combined proportion p_i of infected individuals, both nearby and in potentially visited neighborhoods. thus, the probability of a susceptible individual becoming exposed to the virus took place at rate β p_i. this product forces the transmission probability to be directly proportional to the number of network-level infected individuals, which seems quite realistic. once individuals enter the exposed (e) compartment, the transition to the infected (i) stage depends on the infection rate δ, which, in practice, portrays the average incubation time. each node i carries a state x_i ∈ {0, 1, …, 4}, which depicts whether node i is susceptible, exposed, infected, recovered, or deceased, respectively (fig. s1). to include more realism in the model, we included an age-dependent parameter modulating the e-to-i transition probability, portraying the susceptibility of each individual i to become infected at its specific age (see model parametrization). figure s1. diagram of the individual-based network model consisting of the transition graph and the contact network. colored circles represent the five compartments of the model for each node (i.e., individual): susceptible (s), exposed (e), infected (i), recovered (r), and deceased (d). the transitions between compartments are modulated by the transmission rate (β), infection rate (δ), recovery rate (γ), and mortality rate (θ). each circle in the contact network represents an individual, and links are potential opportunities (i.e., contacts) for covid-19 transmission.
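the per-node transition rules above can be condensed into a daily bernoulli update. this is a paraphrase of the described dynamics, not the authors' code: the combined infected proportion is passed in as p_inf and the age-dependent susceptibility as age_susc (both names are ours), and how local and neighboring infected proportions are combined into p_inf is left to the caller:

```python
import random

S, E, I, R, D = range(5)  # node states x_i in {0, ..., 4}

def step_node(state, p_inf, beta, delta, gamma, theta, age_susc, rng):
    """One daily stochastic transition for a single node.

    p_inf    -- combined proportion of infected contacts (own + neighboring zones)
    age_susc -- age-dependent factor modulating the e-to-i transition
    """
    u = rng.random()
    if state == S and u < beta * p_inf:      # s -> e at rate beta * p_inf
        return E
    if state == E and u < delta * age_susc:  # e -> i, mean incubation 1/delta
        return I
    if state == I:                           # i -> d (theta) or i -> r (gamma)
        if u < theta:
            return D
        if u < theta + gamma:
            return R
    return state                             # r and d are absorbing

rng = random.Random(1)
print(step_node(S, 1.0, 1.0, 0.196, 0.090, 0.00014, 1.0, rng))  # 1 (exposed)
```

a full simulation would apply this update to every node each day, recomputing p_inf from the current contact network before each sweep.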
to incorporate a spatially explicit structure, we modeled the distribution and movement of people based on the city of maringá, pr, brazil (23º25'38"s / 51º56'15"w; fig. s2). this is the seventh largest city in southern brazil and has a successfully implemented and developed urbanization plan, which translates into relatively efficient movement of people across the city. data for each census zone represented the locations considered in the network model. we extracted the centroid of each zone according to its geographic coordinates (black circles in fig. s2). to model movement across the city, people were assumed to move freely within each location. under this structure, each individual had an equal probability v of contact with all other individuals within the same zone (sahneh et al., 2017). however, their movement was constrained among locations. the transmission of the virus from one zone to another was assumed to occur through the movement of individuals across the network. in the city model, for instance, the most frequent reason why people leave their residential neighborhoods is for working purposes (amram et al., 2019; wang et al., 2018). therefore, the spread of the virus resulting from contact among people was proportional to their proximity within the city network, assuming that people tend to work near their own households. this weighting was based on an exponential distance kernel exp(−k·d), where k scales the probability of individuals from different zones being in contact, and d is the geographic distance (km) between the centroids of each pair of zones. we parameterized the model using as much information as available in the literature. under our model structure, the parameters δ, γ, and θ had previously been approximated and were fixed throughout all simulations.
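the between-zone weighting follows directly from the kernel definition: the contact probability between two zones scales as exp(−k·d). a minimal illustration, with made-up planar centroid positions in km (the real model uses geographic centroids of the census zones):

```python
import math

def zone_contact_weight(c1, c2, k):
    """exp(-k*d) distance kernel: k scales how quickly the contact
    probability decays with distance d (km) between zone centroids."""
    d = math.dist(c1, c2)  # Euclidean distance (Python 3.8+)
    return math.exp(-k * d)

same = zone_contact_weight((0.0, 0.0), (0.0, 0.0), k=0.01)  # d = 0 -> 1.0
far = zone_contact_weight((0.0, 0.0), (3.0, 4.0), k=0.01)   # d = 5 km
print(same, round(far, 4))  # 1.0 0.9512
```

with d = 0 the weight is 1 (within-zone contacts are governed by v instead), and larger k makes between-zone contact fall off faster with distance, which is exactly the knob varied in the scenarios.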
we then invariably used δ = 0.196 day^-1, which portrays 5.1 days of incubation, and γ = 0.090 day^-1, which depicts an 11.1-day period of recovery (lauer et al., 2020; pan et al., 2020). the value θ = 0.00014 was also fixed, which yielded a fatality ratio of nearly 1.5%. we initially calibrated the parameter representing the encounter probability within zones as v = 0.7, as suggested in the literature (sahneh et al., 2017; sekamatte et al., 2019). lacking proper knowledge of the remaining parameters, we set β = 0.2 and k = 0.01. nevertheless, the model is valid for any value of these parameters (sekamatte et al., 2019). we set the model up to start by distributing the 357,077 citizens across the city. to each individual, we assigned a 'home' location using the population density of each zone as probabilities (fig. s2). to include the age-structured information that brings more realism to the simulations, we then randomly assigned an age between 0 and 99 years to each individual, following the empirical age pyramid in fig. s3. from the simulated curves, we extracted three response variables related to the infection peak: day of the peak, ratio of infected people, and number of cases at the peak. figure s3. age-specific numbers of individuals (green bars) and susceptibility to infection by sars-cov-2 (red circles). the age pyramid was obtained empirically using data on 357,077 citizens from the 2010 national census of our model city (maringá, pr, brazil). the age-dependent relative mean susceptibility to infection was obtained from the literature. the reference group used to calculate the values is people between 30 and 39 years, for which susceptibility is set to 1. to infer about prioritizing the efforts of public health personnel, we compared the outcomes of the simulations with the nominal carrying capacity of the healthcare facilities potentially able to treat covid-19 cases in maringá (fig. s4). we then compared this information with the simulations under all scenarios.
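under the usual exponential-waiting-time reading, the fixed rates quoted above are just reciprocals of the reported mean durations. this quick check (ours, not the authors' code) reproduces them:

```python
# delta and gamma as reciprocal mean durations (per day)
incubation_days = 5.1  # mean incubation period (lauer et al., 2020)
recovery_days = 11.1   # mean recovery period (pan et al., 2020)

delta = 1 / incubation_days  # e -> i rate
gamma = 1 / recovery_days    # i -> r rate

print(round(delta, 3), round(gamma, 3))  # 0.196 0.09
```

the same reciprocal-duration logic lets one re-derive either quantity when only the other is reported.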
we did this because the potential collapse of healthcare systems is a major concern worldwide (adams and walls, 2020). we considered that the number of available hospital beds in maringá was 1,657, without distinction between ordinary and intensive care units, for simplicity. of these beds, the average normal occupancy is estimated at 58.75% (sms/mga, 2020), which yields 684 beds virtually available for covid-19 cases. fortunately, most people infected by the sars-cov-2 virus develop only mild symptoms and do not require medical care. however, we considered that people would develop severe or critical symptoms at an approximate rate of 0.19 (wu and mcgoogan, 2020). for these severe/critical cases, there is an expected age-dependent probability of hospital admission of 0.025 (0-19 years), 0.32 (20-49 years), 0.32 (50-64 years), and 0.64 (65+ years) (shoukat et al., 2020). using these probabilities and the empirical age-structured data from the 2010 census, we extracted the median of a fitted gamma distribution (fig. s5) to represent the proportion of infected people demanding hospital beds at each time step (0.0379; interquartile range = 0.0180-0.0697). over the course of each infection wave, we were then able to approximate the duration (i.e., days) of the overload of hospital carrying capacity, as well as a rough deficit in the number of beds available at the projected infection peaks. finally, to investigate the relative role of each public health protocol against the spread of the sars-cov-2 virus, we used multi-response partial least squares (pls) regression models. specifically, we used the canonical powered extension of pls (cppls), which is suitable for multivariate responses and relationships involving discrete and continuous variables with potential correlation (indahl et al., 2009). we built models using the varying model parameters (β, v, and k) as the predictor matrix.
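the capacity figures above can be reproduced with simple arithmetic; note that the saturation threshold in the last lines is our own derived quantity, not a number reported by the authors:

```python
total_beds = 1657          # hospital beds in the model city
normal_occupancy = 0.5875  # baseline occupancy (sms/mga, 2020)
beds_free = total_beds * (1 - normal_occupancy)
print(round(beds_free))    # 684 beds virtually available for covid-19

demand_fraction = 0.0379   # median share of infected needing a bed (gamma fit)
# Derived (not from the paper): concurrent infections that fill all free beds.
saturation = beds_free / demand_fraction
print(round(saturation))
```

dividing the free beds by the per-infection bed demand gives roughly 18,000 concurrent infections as the point where the modeled system saturates, which helps make the "all scenarios overload capacity" result concrete.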
as response matrices, we used the three variables related to the infection peak (i.e., peak day, ratio of infected people, and number of cases at the peak) and the four variables related to the saturation of the healthcare system (i.e., first and last day, duration, and deficit), separately. variables were centered (i.e., scaled) prior to model fitting to allow for comparisons among parameter estimates. models were fitted using the package 'pls' (mevik et al., 2019) in the r environment (r core team, 2019). the significance of parameter estimates was assessed using approximated jack-knife t-tests (martens and martens, 2000).

references:
- supporting the health care workforce during the covid-19 global epidemic
- epidemiology, causes, clinical manifestation and diagnosis, prevention and control of coronavirus disease (covid-19) during the early outbreak period: a scoping review
- comparing large-scale computational approaches to epidemic modeling: agent-based versus structured metapopulation models
- mapping workplace neighborhood mobility among sex workers in an urban canadian setting: results of a community-based spatial epidemiological study from 2010-2016
- how will country-based mitigation measures influence the course of the covid-19 epidemic?
- economic impacts of wuhan 2019-ncov on china and the world
- economics in the time of covid-19
- a conceptual framework for an individual-based spatially explicit epidemiological model
- evidence of economic segregation from mobility lockdown during covid-19 epidemic. ssrn electron
- turbulent gas clouds and respiratory pathogen emissions
- the effect of travel restrictions on the spread of the 2019 novel coronavirus
- covid-19 in brazil: advantages of a socialized unified health system and preparation to contain cases
- covid-19 spreading in rio de janeiro, brazil: do the policies of social isolation really work? medrxiv
- brazilian modeling of covid-19 (bram-cod): a bayesian monte carlo approach for covid-19 spread in a limited data set context
- rational use of face masks in the covid-19 pandemic
- impact of nonpharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
- governmental public health powers during the covid-19 pandemic: stay-at-home orders, business closures, and travel restrictions
- the origin, transmission and clinical therapies on coronavirus disease 2019 (covid-19) outbreak - an update on the status
- temporal dynamics in viral shedding and transmissibility of covid-19
- feasibility of controlling covid-19 outbreaks by isolation of cases and contacts
- canonical partial least squares - a unified pls approach to classification and regression problems
- covid-19 pandemic and economic cost; impact on forcibly displaced people
- isolation and contact tracing can tip the scale to containment of covid-19
- early dynamics of transmission and control of covid-19: a mathematical modelling study
- severe acute respiratory syndrome coronavirus 2 (sars-cov-2) and coronavirus disease-2019 (covid-19): the epidemic and the challenges
- the incubation period of coronavirus disease 2019 (covid-19) from publicly reported confirmed cases: estimation and application
- the resilience of the spanish health system against the covid-19 pandemic
- respiratory virus shedding in exhaled breath and efficacy of face masks
- scientific and ethical basis for social-distancing interventions against covid-19
- age-dependent risks of incidence and mortality of covid-19 in hubei province and other parts of china
- climate-driven variation in mosquito density predicts the spatiotemporal dynamics of dengue
- defining the epidemiology of covid-19 - studies needed
- clinical features of covid-19 in elderly patients: a comparison with young and middle-aged patients
- aerodynamic analysis of sars-cov-2 in two wuhan hospitals
- modified jack-knife estimation of parameter uncertainty in bilinear modelling by partial least squares regression (plsr)
- detection and isolation of asymptomatic individuals can make the difference in covid-19 epidemic management
- if the world fails to protect the economy, covid-19 will damage health not just now but also in the future
- the economic impact of pandemic influenza in the united states: priorities for intervention
- pls: partial least squares and principal component regression
- projecting hospital utilization during the covid-19 outbreaks in the united states
- time course of lung changes on chest ct during recovery from 2019 novel coronavirus (covid-19) pneumonia. radiology
- the effect of control strategies to reduce social mixing on outcomes of the covid-19 epidemic in wuhan, china: a modelling study
- r: a language and environment for statistical computing
- critical supply shortages - the need for ventilators and personal protective equipment during the covid-19 pandemic
- the epidemiology and pathogenesis of coronavirus disease (covid-19) outbreak
- gemfsim: a stochastic simulator for the generalized epidemic modeling framework
- individual-based network model for rift valley fever in kabale district
- social distancing during the covid-19 pandemic: staying home save lives
- covid-19 infection: the perspectives on immune responses
- projecting demand for critical care beds during covid-19 outbreaks in canada
- estimates of the severity of coronavirus disease 2019: a model-based analysis
- urban mobility and neighborhood isolation in america's 50 largest cities
- rational use of personal protective equipment for coronavirus disease (covid-19) and considerations during severe shortages
- can we contain the covid-19 outbreak with the same measures as for sars?
we would like to thank marco túlio pacheco coelho, ricardo dobrovolski, and josé alexandre felizola diniz-filho for fundamental discussions and suggestions on an early draft of this manuscript. we are also grateful for the infrastructure provided by the galileo cloud computing program for computational simulations. this work was developed as a scientific counterpart that summarizes many data collected by those people involved directly or indirectly in collecting epidemiologic data and working in favor of health. they deserve all the merit of this paper. key: cord-261530-vmsq5hhz authors: rodriguez, jorge; acuna, juan m; uratani, joao m; paton, mauricio title: a mechanistic population balance model to evaluate the impact of interventions on infectious disease outbreaks: case for covid19 date: 2020-04-07 journal: nan doi: 10.1101/2020.04.04.20053017 sha: doc_id: 261530 cord_uid: vmsq5hhz infectious diseases, especially when new and highly contagious, can be devastating, producing epidemic outbreaks and pandemics. predicting the outcomes of such events in relation to possible interventions is crucial for societal and healthcare planning and forecasting of resource needs. deterministic and mechanistic models can capture the main known phenomena of epidemics while also allowing for a meaningful interpretation of results. in this work a deterministic mechanistic population balance model was developed. the model describes individuals in a population by infection stage and age group. the population is treated as a closed, well-mixed community with no migration.
infection rates and clinical and epidemiological information govern the transitions between stages of the disease. the present model provides a stepping stone to build upon, and its current low complexity keeps it accessible to non-experts and policy makers to comprehend the variables and phenomena at play. the impact of specific interventions on the outbreak time course, number of cases and outcome of fatalities was evaluated, including that of available critical care. data available from the covid19 outbreak as of early april 2020 was used. key findings in our results indicate that (i) universal social isolation measures appear effective in reducing total fatalities only if they are strict and the number of daily social interactions is reduced to very low numbers; (ii) selective isolation of only the elderly (at higher fatality risk) appears almost as effective in reducing total fatalities but at much lower economic damage; (iii) an increase in the number of critical care beds could save up to eight lives per extra bed in a million population with the current parameters used; (iv) the use of protective equipment (ppe) appears effective in dramatically reducing total fatalities when implemented extensively and to a high degree; (v) infection recognition through random testing of the population, accompanied by subsequent (self) isolation of infection-aware individuals, can dramatically reduce the total fatalities, but only if conducted extensively over almost the entire population and sustained over time; (vi) ending isolation measures while r0 values remain above 1.0 (with a safety factor) renders the isolation measures useless, and total fatality numbers return to values as if nothing had ever been done; (vii) ending the isolation measures for only the population under 60 y/o at r0 values still above 1.0 increases total fatalities, but only around half as much as if isolation ends for everyone; (viii) a threshold value, equivalent to that for r0, appears to exist for the
daily fatality rate at which to end isolation measures; this is significant as the fatality rate is (unlike r0) very accurately known. any interpretation of these results for covid19 outbreak predictions and interventions should be considered only qualitatively at this stage due to the low confidence (lack of complete and valid data) in the parameter values available at the time of writing. any quantitative interpretation of the results must be accompanied by a critical discussion in terms of the model limitations and its frame of application. understanding the potential spread of diseases using mathematical modelling approaches has a long history. deterministic epidemic models published in the early 20th century already demonstrated the importance of understanding the population-based dynamics as well as potential parameters of interest therein (kermack & mckendrick, 1927). numerous modelling approaches are available for the prediction of propagation of infectious diseases (may & anderson, 1979; capasso & wilson, 1997; hethcote, 2000; mccallum et al., 2001; ruan & wang, 2003; li et al., 2004; keeling & eames, 2005; grassly & fraser, 2008; keeling & rohani, 2008; balcan et al., 2010; britton, 2010; funk et al., 2010; gray et al., 2011; brauer et al., 2012; miller et al., 2012; siettos & russo, 2013; pastor-satorras et al., 2015). their outputs inform studies on health projections and play an important role in shaping policies related to public health (murray and lopez, 1997a, 1997b, 1997c, and 1997d; ferguson et al., 2006). data availability has greatly increased in recent years, which led to direct improvements in epidemiological models (colizza et al., 2006; riley, 2007; siettos & russo, 2013).
these models provided a more comprehensive understanding of recent outbreaks of diseases such as ebola (gomes et al., 2014; who ebola response team, 2014) and zika (zhang et al., 2017). however, all modelling efforts are highly dependent on several elements: a comprehensive algorithm of the true clinical and public health options and stages of events; the probability of such options given certain conditions of the system; the identification of parameters that reflect such events and their probabilities (such as mortality by age, infectiousness by contacts, etc.); assumptions for parameters with insufficient data; and valid data for those parameters that allow the calibration and posterior validation of the forecasts (tizzoni et al., 2012). in viral pandemics in particular, one of those parameters, the direct estimation of infected subpopulation fractions, is not feasible using available epidemiological data (unless universal, highly sensitive testing is used, which is rarely possible to implement in these situations), particularly if very mild cases, asymptomatic infections or pre-symptomatic transmission are observed or expected. this was the case in the previous influenza a (h1n1-2009) pandemic and it is the observation for the covid-19 pandemic (russel et al., 2020). thus, in many cases, modelling uses a combination of the best available data from historical events and datasets, parameter estimation and assumptions. then, data about these parameters are computed with statistical tools for the development of epidemic models (cooper et al., 2006; biggerstaff et al., 2014). the most challenging phase for the understanding of the potential spread of a disease is when novel disease outbreaks emerge in global populations (anderson & may, 1992), in which data availability is limited (e.g.
novelty of pathogen; delay of communication of case datasets from public health workers and facilities to researchers) or biased by external factors (e.g., limited availability of testing capacity; undefined or partially defined diagnostics for the disease). with novel disease-specific epidemic models, the development of models with a sufficiently low level of complexity and meaningful parameters, which can be identified with data as the infection progresses and data become more available, is posited as a potential tool to inform public health policy and impact mitigation strategies (berezovskaya et al., 2005; hall et al., 2007; bettencourt et al., 2008; nishiura, 2011; wang & zhao, 2012; lee et al., 2013; nsoesie et al., 2014; chowell et al., 2016; rivers et al., 2019). the covid-19 outbreak and posterior pandemic have brought unprecedented attention to the limitations of these kinds of modelling approaches, with multiple epidemic models and disease spread forecasts being published as more data become available. these models have evaluated the ongoing course of the disease spread evolution, from the early dynamics of transmission from initial cases, to the potential of non-pharmaceutical interventions to limit the disease spread, such as: international travel restrictions (chinazzi et al., 2020), contact tracing and isolation of infected individuals at onset (hellewell et al., 2020), and different scales of social distancing and isolation (flaxman et al., 2020; prem et al., 2020). other statistical models have tried to estimate fundamental characteristics (i.e. potential model parameters) of the disease, such as the incubation period and the basic reproduction number, r0, as well as to assess short-term forecasts. given the inherent uncertainty associated with most of the parameters used, a stochastic approach is employed in the above models.
effective communication between health care and public health systems and science hubs is considered one of the bigger challenges in both health sciences and public health (zarcadoolas, 2010; squiers et al., 2012). in health care it is not only necessary to take effective measures but also to take them in a timely manner. this requires strategies for data sharing, generation of information and knowledge, and timely dissemination of such knowledge for effective implementation. the development of strategies for interaction under the general, and correct, assumption of low-literacy health communication paradigms is especially relevant (plimpton & root, 1994), and there is good evidence that health illiteracy greatly influences health behaviours, which in turn are likely to play a role in the degree of effectiveness of such interventions. given the complexity and the expected short- and long-lasting impacts that these public health interventions should have when dealing with disease outbreaks and pandemics (reluga, 2010; fenichel et al., 2011), sufficiently complex but user-accessible modelling tools should provide researchers, public health authorities, and the general public with useful information to act in moments of clear and wide uncertainty. in order to work properly they require access to up-to-date data, in this case on the covid-19 spread (dong et al., 2020). additionally, simple and interactive models can contribute to the understanding by broader audiences of what to expect in the propagation of infectious diseases and how specific interventions may help. this increased awareness of the disease behaviour and potential course in time by the public and policy makers can directly and positively impact the outcome of epidemic outbreaks (funk et al., 2009).
population balance models are widely used in disciplines such as chemical engineering to describe the evolution of a population of particles (henze et al., 2000; ramkrishna and singh, 2014; yang, 2014; gonzález-peñas et al., 2020). these types of models describe the variation over time of so-called state variables as functions of state transition equations governed by transport processes, chemical reactions or any type of change rate from one state to another. such models allow for the description of the underlying processes in a mechanistic manner, therefore maintaining a direct interpretation of the model behaviour. if the state transition rates are defined in a mechanistic manner and with meaningful parameters, such models can describe a process in a way that is interpretable into reality and open the possibility not only of prediction but also of hypothesis generation when data deviate from model predictions. the present work attempts to provide a deterministic population balance-based model with a sufficiently minimal, but clinically and public health robust, set of mechanistic and interpretable parameters and variables. the model aims at improving the understanding of the major phenomena involved and of the impacts of several possible interventions on the system's resources and needs. the model's level of complexity is targeted such that it retains the mechanistic meaning of all variables and parameters, captures the major phenomena at play and specifically allows non-experts and policy makers to comprehend the variables at play. in this way expert advice and decision making can be brought closer together to help guide interventions for immediate and longer-term needs. the model presented is based on balances of individuals transitioning between infection stages and segregated by age group. all individuals are placed in a common single domain or closed community (e.g.
a well-mixed city or town); no geographical clustering or separation of any type is considered, nor is any form of migration in or out of the community. big cities with ample use of public transportation are thought to be the settings best described by the model. the model also provides a direct estimation of r0 (reproduction number or reproductive rate) (delamater et al. 2019) under different circumstances of individual characteristics (such as personal protection or awareness) as well as under population-based interventions (such as imposed social isolation). r0 is a dynamic number often quoted erroneously as a constant for a specific microorganism or disease. the ability to estimate r0 for different times of the outbreak (given the interventions), outbreak settings and interventions is considered to be a valuable model characteristic. r0 is predicted to change over time with interventions that do not produce immune subjects (such as isolation or use of personal protection equipment (ppe), as opposed to vaccination). however, in many instances over the course of an outbreak, r0 is consistently estimated as a constant, frequently overestimating it and not allowing correct estimation of the course of events. the model solves dynamic variables or states. every individual belongs, in addition to their age group (which she/he never leaves), to only one of the possible states that correspond to stages of the infection, namely: healthy non-susceptible (hn); healthy susceptible (h); pre-symptomatic (ps); symptomatic (s); hospitalised (sh); critical (sc) (with and without available intensive care); deceased (d) and recovered immune (r). definitions of the model states are shown in table 1. each variable is a state vector with the number of individuals in that stage per age group. age groups are defined per decade, from 0-9 up to 80+ years old.
nine age groups are defined in the model; each state is therefore a vector of dimensions 1x9, and the total set of states is a matrix of dimensions 8x9. the transitions between these states are governed by rates of infection and transition as defined in table 2. note that vector variables and parameters are represented in bold font and scalar ones in regular font. a schematic representation of the modelling approach, with the population groups considered for the infection stages, the rates of infection and transition between groups, and the possible interactions between population groups, is shown in figure 1. two main interventions, currently being used to slow the spread of the covid19 disease outbreak, are described in the model: (i) the degree of social isolation of the individuals in the population, in terms of the average number of random interactions individuals have per day with others that are also interacting, and (ii) the level of personal protection and awareness that individuals have to protect themselves and others against contagion or spread during interactions. these interventions can be stratified by age group. table 3 describes the key parameters that define the interventions. the degree of isolation is described by a parameter (nih) (vector per age group) corresponding to a representative average number of daily interactions that healthy susceptible individuals have with others. different nih values can be assigned per age group to describe the impact of isolation strategies selective to age group, such as e.g. selective isolation of the elderly and/or the young. the level of use of ppe and awareness is described by the parameters lpah for healthy and lpaps and lpas for infectious individuals (all in vectors per age group). values of the lpa parameters can vary between 0 and 1, with 1 corresponding to the use of complete protective measures and zero to the most reckless opposite situation.
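the state bookkeeping described above (eight infection stages, nine age groups, each stage a 1x9 vector forming an 8x9 state matrix) can be sketched as follows; the stage and age-group labels follow the paper, while the dictionary-of-lists representation and the initial counts are our own illustrative choices, not the authors' implementation:

```python
# Eight infection stages (Table 1) across nine age groups (decades 0-9 ... 80+).
# Each stage is a 1x9 vector; the full state is an 8x9 matrix.
STAGES = ["HN", "H", "PS", "S", "SH", "SC", "D", "R"]
AGE_GROUPS = ["0-9", "10-19", "20-29", "30-39", "40-49",
              "50-59", "60-69", "70-79", "80+"]

def empty_state():
    """Return an all-zero state: one row per stage, one column per age group."""
    return {stage: [0.0] * len(AGE_GROUPS) for stage in STAGES}

state = empty_state()
# Hypothetical initial condition (not from the paper): a mostly susceptible
# population with a few pre-symptomatic cases seeded in the 20-59 age bands.
state["H"] = [100_000.0] * len(AGE_GROUPS)
for g in range(2, 6):                      # age groups 20-29 ... 50-59
    state["PS"][g] = 5.0

# Total population is conserved by the balances: individuals only move
# between stages, never in or out of the closed community.
population = sum(sum(counts) for counts in state.values())
```

because the community is closed with no migration, this total is the invariant the population balances must preserve at every time step.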
an additional reduction factor is defined for the decreased social interactivity of infectious individuals, both for symptomatic (rfis), to describe e.g. self- or imposed isolation of s individuals, and for pre-symptomatic (rfips), to describe e.g. awareness of their infection if extensive random testing of the population is implemented. the infection of healthy susceptible individuals (h) is modelled as occurring only by interaction between them and other infected, either pre-symptomatic (ps) or symptomatic (s), individuals. infected hospitalised (sh) and critical (sc) individuals are assumed not to be available for infectious interactions, and neither are the deceased (d). two rates of infection of healthy susceptible individuals (in number of infections per day) are defined, one for each of the two possible infecting groups (ps and s). the rates of infection are vectors per age group given by the product of (i) the fraction of interactions with ps (or s) among the total interactions (fips or fis), times (ii) the probability of contagion in an interaction with ps (or s) (pi_ps or pi_s) (per age group), (iii) the average number of daily interactions that h individuals have (nih) and (iv) the number of h individuals themselves (per age group) (see eqs. 1.a-b). note that point operators between vectors indicate an element-by-element operation. the probabilities of infection per interaction are calculated as per eqs. 2.c-d. the average rates of transition between states are defined such that available epidemiological and clinical data can be used, such as the proportion of individuals that transition or recover (see table 4) and the average times reported at each stage before transition or recovery (see table 5). table 4. epidemiological parameters (all in vectors per age group).
    fhn_t    fraction of population non-susceptible to infection           #hs/#h
    fs_ps    fraction of ps that will become s [1]   (= 1 - fr_ps)         #s/#ps
    fsh_s    fraction of s that will become sh       (= 1 - fr_s)          #sh/#s
    fsc_sh   fraction of sh that will become sc      (= 1 - fr_sh)         #sc/#sh
    fd_sc    fraction of cared sc that will die into d  (= 1 - fr_sc)      #d/#scic
    fr_ps    fraction of ps that will recover into r                       #r/#ps
    fr_s     fraction of s that will recover into r                        #r/#s
    fr_sh    fraction of sh that will recover into r                       #r/#sh
    fr_sc    fraction of sc with critical care that will recover into r    #r/#scic
    [1] calculated, not an input parameter
the rates of individuals transitioning between stages (in number of individuals per day) are described in eqs. 3.a-e. all rates are vectors per age group of dimensions 1x9. in order to describe the possible shortage of critical care resources, critical individuals are distributed between those with available intensive care (nsc_ic) and those without available intensive care (nsc_nc). at each simulation time step, nsc_ic and nsc_nc are computed via an allocation function of critical care resources over the total nsc per age group. the function allocates resources with priority to lower age groups until the maximum number of intensive care units is reached. all critical individuals with no available intensive care (nsc_nc) are assumed to become deceased after td_nc. the rate of transition from critical to deceased is therefore the sum of that of those with available care (rd_scic) plus that of those without available care (rd_scnc), as per eqs. 3.e-g. the rates of individuals recovering from the different infected stages (in number of individuals per day) are described in eqs. 4.a-d (all rates in vectors per age group). the state transitions as governed by these rates are represented in matrix form in figure a2.
since the model produces instantaneous values of outputs over time based on the parameters used, the simulated reproduction number (r0) must be considered an instantaneous estimation of r0 (delamater et al. 2019). the model structure allows for several parameters to influence r0, including the duration of the infectious stages of the virus as known so far; the potential infection of others by those infected; the probabilities of infection per social interaction; and other parameters such as social isolation and use of ppe. elements such as the number of recovered immune individuals should not directly affect r0, as the reproductive number refers only to the potential infection of susceptible individuals by infected individuals. the dynamic reproduction number (r0) during the outbreak (delamater et al. 2019) is computed over time from the model state variables according to eq. 6. under this approach, infectious individuals can only infect others while they are in the pre-symptomatic (ps) and symptomatic (s) stages. although it is known that post-symptomatic recovered individuals may be infectious for some period of time, this has not been considered in the model at this time due to lack of data. hospitalised and critical individuals are assumed to be well isolated and also not able to infect others. the provided dynamic output of the reproduction number can be used to guide and interpret the impact of interventions in terms of r0. modelled infected individuals can take only three possible infectious paths, namely: (i) ps → r; (ii) ps → s → r and (iii) ps → s → sh. these paths are made of combinations of four possible infectious stage intervals in which infected individuals spend time and infect at their corresponding rate (see table 6).
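eq. 6 itself is not reproduced in this excerpt; a back-of-envelope version consistent with the description (individuals infect only while in the ps and s stages, so r0 accumulates the expected new infections per day in each infectious interval times the time spent there) might look like the following, with every number invented:

```python
def r0_estimate(intervals):
    """Instantaneous R0 sketch: sum over the infectious stage intervals of
    (new infections per infectious individual per day) * (days spent in
    the stage).  This mirrors the idea behind Eq. 6 but is NOT the
    paper's exact formula, which is not shown in this excerpt."""
    return sum(rate * days for rate, days in intervals)

# Hypothetical figures: PS individuals infect 0.4/day for 4 days,
# S individuals infect 0.3/day for 5 days.
r0 = r0_estimate([(0.4, 4.0), (0.3, 5.0)])
```

because the per-day rates shrink under isolation or ppe interventions, an r0 computed this way changes over the outbreak, which is exactly the dynamic behaviour the paragraph above emphasises.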
the model presented is deterministic and based on population balances of individuals classified by their stage of infection and age group only; no other differentiation within those groups is captured by this version of the model. this characteristic allows for the model's application to single, densely populated clusters. the model has low complexity and requires a small number of mechanistically meaningful parameters, most of which can be directly estimated from epidemiological and clinical data. the model however carries limitations in its prediction capabilities due to the fact that all variables and parameters refer to representative averages for each stage and age group population. this may limit the model's representation of the non-linear interactions in the real system, and therefore at this stage any interpretation of results for prediction purposes should be critically discussed against these limitations. a case study based on a scenario of propagation of the covid19 pandemic, using data available as of april 2020, is presented below. the results obtained are intended to be interpreted qualitatively and to be contextualised to the specific setting characteristics. they are intended to serve as a demonstration of the model's potential if applied with higher-confidence parameter values. a number of selected scenarios aimed at illustrating the impact of different interventions were simulated. conclusions should be taken qualitatively at this stage given the low confidence in some parameter values. default reference epidemiological and clinical parameter values were obtained from different information sources on the covid19 outbreak as available in early april 2020. details of values and sources are provided in the appendix tables a1-a2, respectively, with an indication of the level of confidence. a population with an age distribution matching that of the region of madrid (spain) in 2019 was used (ine spain, 2020).
default reference intervention parameters were selected arbitrarily for a situation assimilated to that previous to the outbreak and without any specific intervention (see values and rationale in appendix table a3). the dynamic simulation results of the default outbreak scenario with no intervention are shown in the appendix figure a4. all scenarios are simulated for 365 days and evaluated in terms of (i) the final total number of fatalities at outbreak termination and (ii) the final number of fatalities per age group. in addition, the scenarios are also presented in terms of dynamic profiles over time for (iii) the number of active cases; (iv) the reproduction number; (v) the number of critical cases; and (vi) the number of fatalities. in this scenario, the impact of different imposed degrees of universal social isolation was evaluated. the parameter that describes this intervention is the average number of daily social interactions that healthy susceptible individuals have (nih). as indicated above, evidence suggests that during viral infections that behave like covid-19, the number of personal contacts increases the likelihood of infection linearly. in this scenario, the isolation measures are applied equally across all age groups and the same nih values apply to all. figure 2 illustrates the model predictions for this scenario, in terms of the output variables indicated and in the absence of any other interventions. as can be observed in figure 2 (top left), the overall risk of dying from the virus increases as the average number of daily social interactions (nih) increases. however, it seems to plateau at around 4 interactions per day, suggesting that a critical value of nih may exist for the intervention to succeed at lowering the final number of deaths. once age is placed in the equation, mortality behaves similarly only for those at ages over 70 (figure 2 (top right)).
interestingly, nih does not appear to significantly modify mortality beyond a single interaction per day. this suggests that, for those younger than 60, interactions would need to be reduced to zero (complete social distancing and isolation) in order to decrease mortality, and that, based only on social interactions, most of the mortality decrease achieved by partial social isolation will be among those older than 60 years of age. the number of fatalities appears clearly and directly related to social isolation, as does the speed at which the saturation in fatalities occurs, figure 2 (bottom right). the model is capable of capturing this due to its description of the saturation of healthcare capacity and the withdrawal of critical care over capacity. the middle and bottom graphs in figure 2 show the impact of nih on the time course of several variables. figure 2 (middle left) supports the now globally popular "flatten the curve" concept: if interactions are not modified, the number of cases grows rapidly, exponentially and explosively. the impact of imposed social isolation selective to those over 60 years old is evaluated in this scenario. the parameter that describes this intervention is the average number of daily social interactions with other people that healthy susceptible individuals within the age groups over 60 years old have (nih). in this scenario, the isolation measures are applied selectively only to the elderly. figure 3 illustrates the model predictions for this scenario, in terms of the output variables indicated, in the absence of any other interventions. as shown in figure 3 (top left), the selective social isolation of the elderly has a potentially very significant impact on final total fatalities, at a level almost comparable to the previous scenario of universal isolation.
this is a result with potentially significant consequences, as it indicates that a sustained isolation selective only to the elderly, and not to the other age groups, could alleviate the economic damage at the cost of a small increase in total fatalities. the decrease in social interactions in schools and colleges by isolation of the young may however have an impact on the overall multiplier of infections from youngsters to adults. the impact of selective imposed social isolation of those under 20 years old is evaluated in this scenario. the parameter that describes this intervention is the average number of daily social interactions with other people that healthy susceptible individuals of the age groups under 20 years old have (nih). in this scenario, isolation measures are applied only to the youngsters. figure 4 shows the results for this scenario, for the output variables indicated, in the absence of other interventions. the young population has been observed to be quite resistant to the disease; theoretically at least, young, unaffected lungs tolerate and defend better against the viral load. the isolation of the young produces no effect on the overall final fatality rate, but produces a moderate impact on the mortality of the elderly at low values of nih. as can be seen in figure 4, social isolation of the young has little impact, producing almost identical curves for any level of social isolation. it is thought however that the decrease in social interactions in schools and colleges by isolation of the young may have a large impact on the overall multiplier of infections from youngsters to adults. this emergent aspect of the disease spread behaviour and containment efforts is captured in our results, even though the present model does not incorporate geographical features and does not explicitly describe location-specific population interactions (such as the synthetic location-specific contact patterns in prem et al., 2020).
the impact of selective imposed social isolation of both those under 20 and those over 60 years old is evaluated in this scenario. the parameter that describes this intervention is the average number of daily social interactions with other people (nih) that healthy susceptible individuals of the age groups under 20 and over 60 years old have. in this scenario, the isolation measures are applied selectively only to the youngsters and the elderly. figure 5 illustrates the model predictions for this scenario, in terms of the output variables indicated, in the absence of any other interventions. many of the early interventions during the covid19 outbreak started by protecting the elderly and isolating the young (no schools, no colleges or universities for students), decreasing the number of interactions of the two subpopulations substantially. isolating these population groups together produces results similar to those of isolating the elderly alone, with no significant added value from also isolating the young, as shown in figure 3.

the impact of the availability of intensive care beds is evaluated in this scenario. the parameter that describes this intervention is the number of available intensive care beds per million population. figure 6 illustrates the model predictions for this scenario, in terms of the output variables indicated, in the absence of any other interventions. figure 6 (top left) shows the enormous impact that an increase in critical care resources can have in decreasing total fatalities: the higher the availability of critical care beds, the lower the fatality rate. the trend continues until there is no shortage of ic beds and only the unavoidable fatalities remain. this intervention avoids those deaths that are preventable by the availability of ventilators (mainly) and critical care support. with the current parameter values, in a million population it appears that around 8 lives could be saved per additional intensive care bed.
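the saturation mechanism behind this result can be sketched with a toy calculation. this is a minimal illustration with hypothetical fatality fractions, not the paper's actual model: patients who receive intensive care are assumed to die at one rate, patients over capacity (care withdrawn) at a higher one.

```python
def fatalities(critical_cases, icu_beds, f_with_icu=0.5, f_without_icu=1.0):
    """deaths among critical cases when icu capacity is limited.

    hypothetical fractions: critical patients who receive intensive
    care die with probability f_with_icu; those above capacity, from
    whom care is withdrawn, die with probability f_without_icu.
    """
    treated = min(critical_cases, icu_beds)
    untreated = critical_cases - treated
    return treated * f_with_icu + untreated * f_without_icu

# while demand exceeds capacity, each extra bed saves
# (f_without_icu - f_with_icu) lives among the cases it serves
print(fatalities(100, 40))    # 40*0.5 + 60*1.0 = 80.0
print(fatalities(100, 100))   # 50.0: no shortage, only unavoidable deaths remain
print(fatalities(100, 150))   # 50.0: beds beyond demand save no one
```

because a bed is reused by successive patients over the course of an outbreak, the cumulative lives saved per added bed (such as the ~8 quoted above) can exceed this single-cohort marginal effect.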
the impact of increased use of ppe and behavioural awareness is evaluated in this scenario. the parameters that describe this intervention are a factor increasing the default values (see table a3) of the lpa parameters of the healthy and infected population groups (laph, lapps and lpas). increases in these parameters decrease the probability of infection per interaction (see eqs. 1) and subsequently the rates of infection (eqs. 2). figure 7 illustrates the model predictions for this scenario, in terms of the output variables indicated, in the absence of any other interventions. as shown in figure 7, the extensive use of ppe appears to potentially have a major impact on total outbreak fatalities at the highest levels of protection. there is an inverse relationship between the level of protection and the overall fatality of the disease. the peak number of cases is reached earlier and is higher at low levels of personal protection; the infectability and r0 follow the same pattern. the peak number of critical cases is also decreased and slowed through time.

the impact of increasing the number of tests to the whole population is evaluated in this scenario. widespread testing will increase the tests done on both infected pre-symptomatic and symptomatic individuals. the parameters that describe this intervention are the reduction factors in social interactions due to knowledge of infection by ps and s individuals (rfips and rfis). reduced values of rfips and rfis decrease the fraction of total interactions contributed by both ps and s individuals (see eqs. 2.a-b) and therefore the rates of infection by these two groups (eqs. 1.a-b). the impact of applying an isolation reduction factor to the default rfi values is evaluated. figure 8 illustrates the model predictions for this scenario, in terms of the output variables indicated, in the absence of any other interventions.
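the dependence of the infection rate on interactions and protection levels can be sketched as follows. the multiplicative form and all numbers are assumptions for illustration; the paper's eqs. 1-2 are not reproduced here.

```python
def infection_probability(p_base, lpa_susceptible, lpa_infectious):
    """per-interaction infection probability, reduced by the personal
    protection and awareness levels (lpa, between 0 and 1) of both
    the susceptible and the infectious party (assumed multiplicative)."""
    return p_base * (1.0 - lpa_susceptible) * (1.0 - lpa_infectious)

def daily_infection_rate(nih, prevalence, p_inf):
    """daily infection rate for one susceptible individual: nih
    interactions per day, a fraction `prevalence` of them with
    infectious people, each transmitting with probability p_inf."""
    return nih * prevalence * p_inf

# raising both protection levels from 0.25 to 0.5 more than halves the rate
low = daily_infection_rate(10, 0.05, infection_probability(0.2, 0.25, 0.25))
high = daily_infection_rate(10, 0.05, infection_probability(0.2, 0.5, 0.5))
print(round(low, 6), round(high, 6))  # 0.05625 0.025
```

because protection enters on both sides of an interaction, moderate increases in lpa compound into large reductions in the infection rate, which is the inverse relationship described above.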
the increased awareness of the infected through testing has a great impact and is a great differentiator among subgroups (from no awareness to high awareness) in the number of cases, critical cases and total number of fatalities throughout time. the peaks are significantly decreased by awareness. it is worth noting how the number of fatalities can be brought almost to zero by complete awareness of infection and isolation. universal testing and isolation, if possible, could be one of the great modifiers of the outcome of the outbreak.

the above static interventions were evaluated in terms of a sustained action on a parameter at different levels and its impact on the outbreak outputs. in outbreaks, aside from the immediate management of needs and resources, the time to return to normal becomes of great concern. in this second section, dynamic interventions are evaluated, specifically in terms of the ending of social isolation measures once different threshold values are reached for r0 (ever-changing due to interventions to manage infectability) or for the fatality rate. the model's dynamic calculation of r0 allows for the evaluation of the use of this variable as a criterion for the relaxation (or application) of interventions. these dynamic scenarios are considered of potential interest as governments and local authorities must evaluate and decide on when to apply or lift the social distancing and isolation mitigation measures; whether this can be done totally or gradually by subgroups; and the potential impact that ending social isolation will have on the further behaviour of the disease spread. the impact of ending social isolation upon reaching different threshold values of r0 (as a function of all interventions to decrease infectability) is evaluated in this scenario.
this intervention is implemented by starting with initial social isolation in place, with the average number of daily social interactions (nih) at a value of 1, and returning it to its default "do nothing" value once the threshold r0 value is reached. figure 9 shows the model predictions for this scenario, in terms of the output variables indicated. the results in figure 9 (top left) clearly indicate that a withdrawal of isolation measures while r0 values remain above 1 leaves the isolation with little impact on the total fatalities. it is also observed that when isolation is ended, even at low threshold r0 values, an increase in the crude number of new fatalities and a peak in critical cases occur after a period of time. these are always accompanied by a sudden spike in r0 for a short period before its collapse. a complete end of isolation may prove not to be the best course of action until r0 has reached levels much lower than 1.

the impact of ending social isolation for all except those over 60 years old, upon reaching different threshold values of r0, is evaluated in this scenario. this intervention is implemented by starting with initial social isolation in place, with a value of 1 for the average number of daily social interactions (nih), and, once the given threshold r0 value is reached, returning it to its default "do nothing" value for all age groups except the elderly. figure 10 shows the model predictions for this scenario, in terms of the output variables indicated. the results in figure 10 (top left) show again that an impact on fatalities will occur if the isolation ends at values of r0 over 1. the impact of ending social isolation at any r0 value is in this case smaller, as the elderly remain isolated; this is in line with the results obtained in scenario 2 and shown in figure 3. the decrease in total fatalities among those over 80 years of age observed when isolation ends at increasing threshold r0 values above 1 is somewhat unexpected.
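the r0-triggered relaxation can be illustrated with a toy sir model in which transmission scales with nih. all rates and population numbers here are assumptions for illustration; this is not the paper's age-structured model.

```python
def simulate(r0_threshold, days=400, n=1_000_000):
    """toy sir: transmission scales with daily interactions (nih).
    isolation (nih = 1) is lifted back to the 'do nothing' value
    (nih = 10) once the effective reproduction number drops below
    r0_threshold. p and gamma are illustrative assumptions."""
    p, gamma = 0.03, 0.1              # per-interaction transmission prob., recovery rate
    s, i, r = n - 100.0, 100.0, 0.0
    nih, lifted = 1, False
    for _ in range(days):
        beta = nih * p
        r_eff = (beta / gamma) * s / n
        if not lifted and r_eff < r0_threshold:
            nih, lifted = 10, True    # end isolation
        new_inf = beta * s * i / n
        new_rec = gamma * i
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
    return r                           # total ever infected (attack-size proxy)

# lifting while the unisolated r0 (3.0 here) is still above 1 triggers
# a full rebound epidemic; never lifting keeps the outbreak suppressed
print(simulate(r0_threshold=0.9) > 500_000)   # True: large rebound
print(simulate(r0_threshold=0.2) < 1_000)     # True: stays suppressed
```

the sketch reproduces the qualitative point above: the trigger must reflect what r0 will become once isolation ends, not merely its currently suppressed value.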
the impact of ending social isolation upon reaching different threshold values of the daily fatality rate (after it has passed its maximum) is evaluated in this scenario. the daily fatality rate is selected instead of, e.g., the number of cases because it can be assessed much more exactly (it is incontrovertible, as opposed to the number of cases). the decrease in the fatality rate is usually reached after the decrease in the number of cases ("over the peak"), as shown by most epidemiological curves for covid-19 published so far. this intervention is implemented by starting with initial social isolation in place, with a value of 1 for the average number of daily social interactions (nih), and returning it to its default "do nothing" value when, after the rate has passed its maximum, the given threshold value of the rate is reached. figure 11 shows the results for this scenario for the output variables indicated.

figure 11. impact of ending social isolation (nih from 1 back to 10) once the fatality rate, after surpassing its maximum, reaches different threshold values, on the final total number of fatalities (top left); the final total number of fatalities per age group (top right); as well as the time course profiles of the total active cases (middle left); the reproduction number (r0) (middle right); the number of critical cases (bottom left); and the number of fatalities (bottom right). numbers are as percentage of total population.

in this scenario all social isolation measures are ended once the fatality rate, after it has started declining, reaches a threshold. as shown in figure 11, there appears to be a very narrow threshold from which the isolation measures can be withdrawn with low impact on total fatality. if measures are ended just before the threshold is reached, the overall fatality rate and the fatality rate for elders rise sharply.
for values below that threshold, a further decrease in total fatalities can still be obtained if lower fatality-rate thresholds are used to end the isolation.

the impact of ending social isolation for all except those over 60 years old, upon reaching different threshold values of the fatality rate after it has passed its maximum, is evaluated in this scenario. this intervention is implemented by starting with initial social isolation in place, with a value of 1 for the average number of daily social interactions (nih), and returning it to its default "do nothing" value, for all age groups except the elderly, once, after the fatality rate has passed its maximum, the given threshold value of the rate is reached. figure 12 shows the model predictions for this scenario, in terms of the output variables indicated. in this scenario all social isolation, except that of the elderly, ends once the fatality rate reaches a threshold after it is already declining. as shown in figure 12, and analogous to the previous scenario, there appears to be a very narrow threshold from which the isolation measures can be withdrawn with low impact on total fatality. also, if measures are ended just before the threshold is reached, the overall fatality rate similarly rises sharply. for values below that threshold, however, no further decrease in total fatalities is predicted here when lower fatality-rate thresholds are used to end the isolation. these results are also consistent with the idea that isolation of the age groups more vulnerable to the disease should be maintained.

the model requires parameter calibration against valid data from representative populated cities. data from cities in which the population is typically very interconnected socially in public areas and public transport is widely used are particularly suited for the calibration of this model.
the model in its current version would benefit from more detailed descriptions and sub-models of some of the intervention-relevant parameters, such as the levels of social interaction and personal protection measures. the model's modularity and fast computation allow for its easy scale-up into multiple population nuclei that could be simulated in parallel with degrees of interconnectivity among them. separate independent copies of the model can be run in parallel, e.g. one for each city in a region or country, and migration terms can be added between cities. interventions can then be defined to include, e.g., travel restrictions between those cities at different levels. the mechanistic nature of the model also makes it very suitable for the evaluation of advanced optimisation and optimum control strategies. its capacity for describing complex interactions also makes it of potentially great use in developing advanced artificial intelligence (ai) algorithms to aid and advise authorities during decision making. ai algorithms could be trained by evaluating very large numbers of scenarios combining static and dynamic interventions of different types against total fatalities and economic damage.

table a3 (fragment): lpas = 0.5 for all nine age groups. * rationale: no reduction factor (rfips = 1) of social interactivity relative to healthy individuals is applied to presymptomatic infected individuals, as they are unaware of their condition; symptomatic infected individuals are expected to reduce their social interactivity relative to healthy ones because they feel sick (rfis < 1); the default level of personal protection and awareness (lpa) in children and youngsters is taken as smaller than that of adults; adult symptomatic individuals are expected to take a higher level of personal protection and awareness (lpas) so as not to spread any general disease to others, irrespective of the knowledge of their specific condition.
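the proposed scale-up (independent model copies per city, coupled by migration terms) can be sketched with a toy two-city sir. the rates, migration fraction and city sizes are illustrative assumptions, not calibrated values.

```python
def step_city(state, beta=0.3, gamma=0.1):
    """one day of a toy sir in a single city (one 'model copy')."""
    s, i, r = state
    n = s + i + r
    new_inf = beta * s * i / n
    new_rec = gamma * i
    return (s - new_inf, i + new_inf - new_rec, r + new_rec)

def migrate(a, b, frac=0.01):
    """symmetric daily migration: a fraction of every compartment of
    each city moves to the other (the added coupling term). setting
    frac = 0 models a complete travel restriction between cities."""
    out_a = tuple(x * frac for x in a)
    out_b = tuple(x * frac for x in b)
    a2 = tuple(x - xa + xb for x, xa, xb in zip(a, out_a, out_b))
    b2 = tuple(x - xb + xa for x, xb, xa in zip(b, out_b, out_a))
    return a2, b2

city_a = (99_900.0, 100.0, 0.0)   # outbreak seeded here
city_b = (100_000.0, 0.0, 0.0)    # initially uninfected
for _ in range(60):
    city_a, city_b = step_city(city_a), step_city(city_b)
    city_a, city_b = migrate(city_a, city_b)
print(city_b[1] > 0)              # True: infection reached city b via migration
```

each `step_city` call is independent, so the per-city updates could run in parallel, with only the migration step requiring communication, which is the design point made above.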
references:
- infectious diseases of humans: dynamics and control
- modeling the spatial spread of infectious diseases: the global epidemic and mobility computational model
- a day-by-day breakdown of coronavirus symptoms shows how the disease, covid-19, goes from bad to worse. business insider (last accessed
- a simple epidemic model with surprising dynamics
- real time bayesian estimation of the epidemic potential of emerging infectious diseases
- estimates of the reproduction number for seasonal, pandemic, and zoonotic influenza: a systematic review of the literature
- mathematical models in population biology and epidemiology
- stochastic epidemic models: a survey
- analysis of a reaction-diffusion system modelling man-environment-man epidemics
- the effect of travel restrictions on the spread of the 2019 novel coronavirus
- real-time forecasting of epidemic trajectories using computational dynamic ensembles
- mathematical models to characterize early epidemic growth: a review
- the role of the airline transportation network in the prediction and predictability of global epidemics
- delaying the international spread of pandemic influenza
- complexity of the basic reproduction number (r0)
- an interactive web-based dashboard to track covid-19 in real time. the lancet infectious diseases
- adaptive human behavior in epidemiological models
- impact of non-pharmaceutical interventions (npis) to reduce covid19 mortality and healthcare demand
- strategies for mitigating an influenza pandemic
- estimating the number of infections and the impact of nonpharmaceutical interventions on covid-19 in 11 european countries
- a microbial population dynamics model for the acetone-butanol-ethanol fermentation process. authorea
- mathematical models of infectious disease transmission
- a stochastic differential equation sis epidemic model
- real-time epidemic forecasting for pandemic influenza
- feasibility of controlling covid-19 outbreaks by isolation of cases and contacts. the lancet global health
- activated sludge models asm1, asm2, asm2d and asm3
- the mathematics of infectious diseases
- clinical characteristics of 24 asymptomatic infections with covid-19 screened among close contacts in nanjing
- networks and epidemic models
- modeling infectious diseases in humans and animals
- containing papers of a mathematical and physical character
- early dynamics of transmission and control of covid-19: a mathematical modelling study
- the incubation period of coronavirus disease 2019 (covid-19) from publicly reported confirmed cases: estimation and application
- modelling during an emergency: the 2009 h1n1 influenza pandemic
- epidemiological models for mutating pathogens
- the reproduction number of covid-19 is higher compared to sars coronavirus
- population biology of infectious diseases: part ii
- how should pathogen transmission be modelled
- edge-based compartmental modelling for infectious disease spread
- actualización nº64: enfermedad por sars-cov-2 [update no. 64: sars-cov-2 disease]
- mortality by cause for eight regions of the world: global burden of disease study
- regional patterns of disability-free life expectancy and disability-adjusted life expectancy: global burden of disease study
- global mortality, disability, and the contribution of risk factors: global burden of disease study
- alternative projections of mortality and disability by cause 1990-2020: global burden of disease study
- the structure and function of complex networks
- estimation of the asymptomatic ratio of novel coronavirus infections (covid-19)
- real-time forecasting of an epidemic using a discrete time stochastic model: a case study of pandemic influenza (h1n1-2009)
- did modeling overestimate the transmission potential of pandemic (h1n1-2009)?
- sample size estimation for post-epidemic seroepidemiological studies
- a systematic review of studies on forecasting the dynamics of influenza outbreaks
- epidemic processes in complex networks
- materials and strategies that work in low literacy health communication
- the effect of control strategies to reduce social mixing on outcomes of the covid-19 epidemic in wuhan, china: a modelling study
- population balance modeling: current status and future prospects
- game theory of social distancing in response to an epidemic
- large-scale spatial-transmission models of infectious disease
- using "outbreak science" to strengthen the use of models during epidemics
- real-time forecasts of the covid-19 epidemic in china from
- dynamical behavior of an epidemic model with a nonlinear incidence rate
- using a delay-adjusted case fatality ratio to estimate under-reporting
- mathematical modeling of infectious disease dynamics
- the health literacy skills framework
- real-time numerical forecast of global epidemic spreading: case study of 2009 a/h1n1pdm
- basic reproduction numbers for reaction-diffusion epidemic models
- ebola virus disease in west africa-the first 9 months of the epidemic and forward projections
- nature-inspired optimization algorithms
- the simplicity complex: exploring simplified health messages in a complex world. health promotion international
- spread of zika virus in the americas
- clinical course and risk factors for mortality of adult inpatients with covid-19 in wuhan

all authors wish to thank khalifa university and the government of abu dhabi for the funding and support.

the impact of specific interventions on the outbreak time course, number of cases and fatality outcomes was evaluated. data available from the covid19 outbreak as of early april 2020 were used. our preliminary results for the scenarios above and the parameter values used indicate that:
1. universal social isolation measures may be effective in reducing total fatalities only if they are strict and the average number of daily social interactions is reduced to very low numbers.
2. selective isolation of only the age groups most vulnerable to the disease (i.e. older than 60) appears almost as effective in reducing total fatalities, but at a much lower economic damage. the comparison between the impacts on the final total number of fatalities of social isolation applied to all versus selectively by age (figure 13) shows that the isolation of the elderly can achieve an impact equivalent to that of isolating all.
3. an increase in the number of critical care beds could save significant numbers of lives. using our current parameter values, for a one million population, an estimated 8 fatalities could be avoided per extra available critical care unit.
4. the use of protective equipment (ppe) appears capable of very significantly reducing total fatalities if implemented extensively and to a high degree.
5. extensive random testing of the population, leading to infection recognition and subsequent immediate (self) isolation of the infected individuals, can dramatically reduce the total fatalities, but only if implemented to almost the entire population and sustained over time.
6. ending isolation measures while r0 is above one (with a safety factor) appears to render the previous isolation measures useless, as the fatality rate eventually reaches values close to those if nothing had ever been done.
7. ending isolation measures only for the population under 60 y/o while r0 values are still above one increases total fatalities, but only around half as much as if isolation is ended for everyone.
8. a threshold value for the daily fatality rate (equivalent to r0 below one) appears to exist for the feasible ending of isolation measures. daily fatality rates are known very accurately, unlike r0, and could be used as criteria for intervention.
in figure 14 the impacts on the final total number of fatalities of the withdrawal of social isolation at threshold values of r0 and of the daily fatality rate per million people are shown, comparing the cases in which withdrawal is universal or restricted only to those under 60 years old.

it is important to note that any interpretation of the above results for the covid19 outbreak interventions must be considered only qualitatively at this stage, due to the low confidence (lack of complete and valid data) in the parameter values available at the time of writing. any quantitative interpretation of the results must be accompanied by a critical discussion in terms of the model limitations and its frame of application. the next immediate steps involve the sensitivity analysis of the parameters with the lowest confidence. a roadmap for model expansion and broader implementation is discussed below. the matlab® source code and an excel file containing all parameter values used, as well as a non-age-segregated version of the model, are available at https://github.com/envbioprom/covid_model

appendix i. epidemiological and clinical parameters per age group: covid19 case study. table a1. default epidemiological and clinical parameters per age group used in the covid19 outbreak case study simulations presented.

appendix ii. data sources for the epidemiological and clinical parameters: covid19 case study. table a2. data sources and level of confidence assigned to the epidemiological and clinical parameters from table a1 for the covid19 outbreak case study. the model simulation of the outbreak time course under the default parameters and no intervention is presented in figure a4.
key: cord-266424-wchxkdtj authors: lofstedt, jeanne; dohoo, ian r.; duizer, glen title: model to predict septicemia in diarrheic calves date: 2008-06-28 journal: j vet intern med doi: 10.1111/j.1939-1676.1999.tb01134.x sha: doc_id: 266424 cord_uid: wchxkdtj

the difficulty in distinguishing between septicemic and nonsepticemic diarrheic calves prompted a study of variables to predict septicemia in diarrheic calves <28 days old that were presented to a referral institution. the prevalence of septicemia in the study population was 31%. variables whose values were significantly different (p < .10) between septicemic and nonsepticemic diarrheic calves were selected using stepwise, forward, and backward logistic regression. variables identified as potentially useful predictors were used in the final model-building process. two final models were selected: 1 based on all possible types of predictors (laboratory model) and 1 based only on demographic data and physical examination results (clinical model). in the laboratory model, 5 variables retained significance: serum creatinine > 5.66 mg/dl (> 500 μmol/l) (odds ratio [or] = 8.63, p = .021), toxic changes in neutrophils ≥ 21 (or = 2.88, p = .026), failure of passive transfer (or = 2.72, p = .023), presence of focal infection (or = 2.68, p = .024), and poor suckle reflex (or = 4.10, p = .019). four variables retained significance in the clinical model: age ≤ 5 days (or = 2.58, p = .006), presence of focal infection (or = 2.45, p = .006), recumbency (or = 2.98, p = .011), and absence of a suckling reflex (or = 3.03, p = .031). the hosmer-lemeshow goodness-of-fit chi-square statistics for the laboratory and clinical models had p-values of .72 and .37, respectively, indicating that the models fit the observed data reasonably well. the laboratory model outperformed the clinical model by a small margin at a predictability cutoff of 0.5; however, the predictive abilities of the 2 models were quite similar.
the low sensitivities (39% and 40%) of both models at a predicted probability cutoff of 0.5 meant that many septicemic calves were not being detected by the models. the specificity of both models at a predicted probability cutoff of 0.5 was >90%, indicating that >90% of nonsepticemic calves would be predicted to be nonsepticemic by the 2 models. the positive and negative predictive values of the models were 66-82%, which indicated the proportion of cases for which a predictive result would be correct in a population with a prevalence of septicemia of 31%.

the mortality risk of live-born neonatal calves <1 month of age has been reported to range from 15 to 30%. [1-4] the majority of deaths are attributable to infectious diseases; diarrhea, pneumonia, and septicemia are the most common. [5-8] the predominant pathogen cultured from calves with septicemia is escherichia coli, but other gram-negative, gram-positive, and mixed bacterial infections have been documented. [9-12] important risk factors for the development of septicemia in calves include decreased passive transfer of colostral immunoglobulins and exposure to invasive bacterial serotypes. 13 neonatal diarrhea may also predispose calves to septicemia. in 2 separate studies conducted on a california veal operation, blood cultures revealed bacteremia in 31% and 19% of calves with signs of diarrhea, depression, and/or weakness. 10, 14 septicemia in these calves was attributed to intestinal mucosal damage caused by bacterial, viral, or parasitic gastrointestinal infections, which allowed opportunistic gut pathogens to enter the systemic circulation. 10 the early signs of septicemia in neonatal foals and calves are vague and nonspecific and are often indistinguishable from the signs of noninfectious diseases or those of focal infections such as diarrhea.
14, 15 positive blood cultures are required for a definitive antemortem diagnosis of septicemia, but results are not reported for 48-72 hours, and false-negative culture findings are common. 12, 16 no single laboratory test has emerged as completely reliable for the early diagnosis of septicemia in farm animal neonates; 12, 17 therefore, various scoring systems and predictive models using easily obtainable historical, clinical, and clinicopathologic data have been developed for this purpose. 14, [17-19] the goal of these mathematical models is to identify septicemic neonates early in the course of disease, when appropriate therapeutic intervention would most likely result in a favorable outcome.

for a period of time, routine blood cultures were performed on all diarrheic calves presented to the atlantic veterinary college teaching hospital, regardless of whether the clinical or clinicopathologic findings indicated a diagnosis of septicemia. the results of this exercise indicated that septicemia was more common in diarrheic neonatal calves than we had anticipated and that it increased the cost of treatment while decreasing the prognosis for survival. a model capable of predicting sepsis in diarrheic neonatal calves presented to the hospital for treatment would assist the clinicians in distinguishing between calves with undifferentiated diarrhea and calves with diarrhea complicated by sepsis. in the case of a relatively low-value calf with diarrhea and predicted sepsis, the farmer may be advised not to initiate treatment because of the expense involved and the poor prognosis. in contrast, the owner of a valuable diarrheic calf that was predicted to be septic could be informed that early initiation of costly but appropriate antimicrobial and supportive therapy may result in an improved outcome. models to predict septicemia in calves have been published.
14, 19 study populations in these reports consisted of veal calves, 1 day to 4 months of age, with signs of serious illness (diarrhea, depression, and/or weakness) 14 and calves <30 days of age that were presented to a veterinary teaching hospital.

medical records of calves <28 days of age presented to the atlantic veterinary college teaching hospital between the years 1989 and 1993 with a primary complaint of diarrhea were retrieved. pretreatment data were extracted from the medical records of these calves. independent variables included demographic information, physical examination findings, and clinicopathologic values for hematology, venous blood gases, serum chemistry, and immunoglobulins (table 1). immunoglobulin concentration in serum was determined quantitatively or qualitatively using a variety of procedures, including the quantitative zinc sulfate turbidity test (69 calves), 20 sodium sulfite precipitation test (17 calves), glutaraldehyde coagulation test (62 calves), and radial immunodiffusion test (5 calves). failure of passive transfer of immunoglobulins was defined as an igg concentration of ≤800 mg/dl by any of the above-mentioned tests, a globulin concentration of ≤2 g/dl (≤20 g/l), or a total serum protein of ≤5 g/dl (≤50 g/l) (table 1). other information extracted from the medical records of the calves in the study population included fecal culture, fecal electron microscopic examination, and fecal flotation results.

logistic regression assumes that any independent variable recorded on a continuous scale (eg, respiratory rate) has a linear relationship with the logit of the probability of sepsis.
for each such independent variable, this assumption was assessed by evaluating the distribution of the independent variable, categorizing the independent variable into 2 or more groups and cross-tabulating the new variable with sepsis, or running unconditional logistic regressions with the variable in question and evaluating the hosmer-lemeshow goodness-of-fit statistics. the most appropriate form for each independent variable was then chosen (table 1). categorical variables were all entered into the predictive models as a series of dummy variables.

the dependent variable of interest was a diagnosis of septicemia. a diagnosis of septicemia in diarrheic calves was based on the following antemortem criteria: (1) a positive blood culture, (2) culture of the same bacterial agent from ≥2 body fluids, or (3) culture of a bacterial agent from a single joint in a calf with joint effusion involving multiple joints. blood for culture was collected aseptically from the jugular vein and inoculated into the oxoid signal blood culturing system (oxoid canada, inc, nepean, on, canada). a number of criteria contributed to a postmortem diagnosis of septicemia, including (1) morphologic changes such as multiple disseminated abscesses of similar size, purulent vasculitis and intravascular identification of bacteria, or fibrin in multiple body cavities; (2) bacterial isolation from heart blood; or (3) recovery of the same bacterial organism from ≥2 body tissues (excluding intestine). other outcome variables recorded were days hospitalized and calf survival.

means for continuous variables were compared between septicemic and nonsepticemic calves using student's t-test. 21 categorical variables were analyzed using contingency table chi-square analysis.
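the logit-linearity assumption and the dummy-variable coding described above can be sketched as follows. the coding scheme and values are illustrative; these are not the study's fitted coefficients.

```python
import math

def logit(p):
    """log-odds: the scale on which logistic regression is linear
    in the continuous predictors."""
    return math.log(p / (1.0 - p))

def dummy_code(value, levels):
    """code a categorical variable with k levels as k-1 indicator
    (dummy) variables, taking the first level as the reference."""
    return [1 if value == lev else 0 for lev in levels[1:]]

def predict_prob(intercept, coefs, xs):
    """inverse logit of the linear predictor b0 + sum(bi * xi)."""
    z = intercept + sum(b * x for b, x in zip(coefs, xs))
    return 1.0 / (1.0 + math.exp(-z))

# posture coded against 'standing' as the reference level
print(dummy_code("lateral", ["standing", "sternal", "lateral"]))  # [0, 1]
print(logit(0.5))                                                 # 0.0
```

with this coding, each dummy coefficient measures the shift in log-odds of sepsis for that level relative to the reference category.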
21 variables that were significantly different (p < .10) between septicemic and nonsepticemic calves were grouped into 1 of 3 categories: physical variables, hematological variables, and chemistry variables, which included measures of passive transfer. variables in each group were selected for logistic regression models using stepwise, forward, and backward selection procedures. 22 variables identified as possibly useful predictors by the various selection procedures were used in the final model-building process. this process involved comparing a number of possible models and selecting those that maximized the area under the receiver operating characteristic (roc) curve 23 but did not include any variables whose coefficients were not significant (p < .05). two final models were selected: one based on all possible types of predictors (the laboratory model) and one based only on demographic data and physical examination results (the clinical model). odds ratios (or) derived from the models were interpreted as measures of increased risk of disease. because septicemia was not a rare condition, the or were not precise measures of increased risk, but the approximation was reasonable and was utilized to clarify the presentation of results. once the final models were selected, their sensitivity and specificity were determined at predictive probability cutoff points of 0.5 and 0.3 (the points that roughly balanced the sensitivity and the specificity). the goodness of fit of the models was evaluated using the hosmer-lemeshow goodness-of-fit chi-square statistic 22 with the data divided into 5 groups. for each model, the pearson residuals, the standardized pearson residuals, and the Δβ values were computed for all covariate patterns to determine whether any specific covariate pattern had an undue influence on the model.
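the cutoff-dependent sensitivity and specificity, and the odds ratios derived from model coefficients, can be computed as follows. the predictions below are toy values for illustration, not the study's data.

```python
import math

def odds_ratio(coef):
    """odds ratio implied by a logistic regression coefficient."""
    return math.exp(coef)

def sens_spec(probs, truths, cutoff):
    """sensitivity and specificity when animals with predicted
    probability >= cutoff are classified as septicemic."""
    tp = sum(p >= cutoff and t for p, t in zip(probs, truths))
    fn = sum(p < cutoff and t for p, t in zip(probs, truths))
    tn = sum(p < cutoff and not t for p, t in zip(probs, truths))
    fp = sum(p >= cutoff and not t for p, t in zip(probs, truths))
    return tp / (tp + fn), tn / (tn + fp)

probs = [0.9, 0.4, 0.2, 0.6]            # toy predicted probabilities
truths = [True, True, False, False]     # toy true sepsis status
print(sens_spec(probs, truths, 0.5))    # (0.5, 0.5)
print(sens_spec(probs, truths, 0.3))    # (1.0, 0.5): lower cutoff raises sensitivity
```

this is the trade-off behind evaluating the models at cutoffs of both 0.5 and 0.3: lowering the cutoff catches more septicemic calves at the price of more false positives.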
because evaluating the fit of a model using the same data that were used to build the model is likely to overestimate the predictive ability of the model, a split sample goodness-of-fit evaluation was also carried out. the data were randomly (using a computer-based random number generator) divided into 2 separate databases with 60% and 40% of the observations in each. the 2 models were refit using the database containing 60% of the observations by forcing in all of the independent variables included in the 2 final models described above. the predictive ability of the resulting models was assessed using the database containing 40% of the observations. analyses were performed on a personal computer using the statistical software package stata (stata corp, college station, tx). two hundred fifty-four calves met the criteria for inclusion in the study. however, 2 calves were excluded from analysis because they could not be classified as either septicemic or nonsepticemic based on criteria established for the study; campylobacter-like organisms were seen on gram stains of their blood, but the organisms were not recovered from blood culture. the mean age of diarrheic calves at presentation was 9.4 days. crossbred beef calves (32.9%) and holstein calves were the most common breeds. seventy-eight (31%) of the calves met the study criteria for a diagnosis of septicemia. positive blood cultures were used to diagnose 50 cases (40.7% of all blood cultures were positive). a single organism was cultured from most calves, but 3 of the blood cultures yielded 2 isolates each. e coli accounted for 29 (55%) of the isolates from blood. an additional 10 bacterial agents were cultured from the blood, including campylobacter spp. (8) and clostridium spp. (1). culture of the same pathogen from ≥2 body fluids antemortem was used to diagnose 6 cases, and a positive antemortem joint culture from a calf with multiple enlarged joints was used to diagnose 1 case.
the same bacterial agent was cultured from ≥2 tissues at necropsy in 33 calves, postmortem heart blood culture was positive in 1 calf, and morphologic lesions such as multiple abscessation, purulent vasculitis, intravascular identification of bacteria, and fibrin exudation in multiple body cavities were present in 18 calves. e coli was also the most frequent isolate in antemortem body fluid and postmortem tissue samples, accounting for 7/11 (64%) and 93/154 (60%) isolates, respectively. because individual calves often satisfied multiple criteria for a diagnosis of septicemia, the totals for all the above criteria are greater than the total number of septicemic calves. one hundred forty-one of the 174 nonsepticemic calves (81%) were discharged from the hospital (survived) compared with only 23 of 78 septicemic calves (29.5%). the mean hospital stay for nonsepticemic calves that lived was 4.1 days compared with 5.0 days for septicemic calves that survived. variables identified as potentially useful predictors (p < .1) are described in table 2. septicemic calves had significantly higher (p < .05) values for respiratory rate, packed cell volume, band neutrophil count, and venous pco2 than did nonsepticemic calves. mean values for rectal temperature and total plasma and serum protein, globulin, calcium, and glucose concentrations were significantly lower (p < .05) in septicemic than in nonsepticemic calves. a significantly larger (p < .05) proportion of septicemic calves was ≤5 days of age, unresponsive or comatose, and in sternal or lateral recumbency at presentation when compared with nonsepticemic calves. the proportion of septicemic calves exhibiting a weak or absent suckling reflex, scleral injection, and hyperemic or cyanotic oral mucous membranes was also significantly greater (p < .05).
elevated serum creatinine concentration, toxic changes in neutrophils ≥2+, and failure of passive transfer of colostral immunoglobulins were present more frequently in septicemic than in nonsepticemic calves (p < .05). two final models, the laboratory model and the clinical model, were selected based on the maximal area under the roc curve (table 3). variables included in the laboratory model were serum creatinine concentration, toxic changes in neutrophils ≥2+, failure of passive transfer, focal infection, and a poor suckle reflex. the clinical model identified age of ≤5 days, focal infection, recumbency, and a poor suckle reflex as predictors of septicemia. the laboratory model indicated that calves with moderately increased serum creatinine concentration were twice as likely to be septicemic (or = 2.07; 95% ci = 0.836-5.119), whereas those with creatinine concentration of >5.66 mg/dl (>500 µmol/l) were 8 times more likely to be septicemic (or = 8.63; 95% ci = 1.383-53.827). similarly, evidence of toxic changes in their neutrophils (≥2+), failure of passive transfer, or evidence of focal infections all increased the risk of septicemia by 2.68-2.88 times. calves with a poor suckle reflex were 4 times as likely to be septic. the clinical model (ie, the model containing only data from the physical examination and history) showed that each of the following was a factor that increased the risk of being septicemic by approximately 2.5- to 3-fold: <5 days of age, signs of focal infection, recumbency on admission, and a poor suckle reflex. the hosmer-lemeshow goodness-of-fit chi-square statistics for the laboratory and clinical models had p-values of 0.72 and 0.37, respectively, indicating that the models fit the observed data reasonably well. evaluation of residual patterns did not identify any outlying observations, and no specific group of calves (ie, calves with a common covariate pattern) exerted a large influence on the model.
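the odds ratios and confidence intervals above are obtained by exponentiating logistic regression coefficients and their wald confidence limits. a minimal sketch, using a hypothetical coefficient and standard error chosen only so the result lands near the reported or of 2.07 (these are not the paper's fitted values):

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Exponentiate a logistic regression coefficient and its Wald
    95% confidence limits to obtain an odds ratio with its CI."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))

# hypothetical beta and se, assumed for illustration only
or_, lo, hi = odds_ratio_ci(beta=0.727, se=0.462)
print(round(or_, 2), round(lo, 2), round(hi, 2))  # 2.07 0.84 5.12
```

note that a wide interval such as the reported 1.383-53.827 reflects a large standard error, typically from few calves in the extreme creatinine category.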
we concluded that the models fit the data reasonably well. the predictive abilities of the models are presented in table 4. when evaluated at a cutoff value of 0.5 (ie, a calf was predicted to be septicemic if the predicted probability was ≥0.5), all models had quite low sensitivity and very high specificity. a cutoff of 0.3 roughly balanced the sensitivity and specificity and has been presented for comparison purposes. the laboratory models outperformed the clinical models, but only by a small margin. similarly, the models based on the full data set outperformed those in which the model was constructed on 60% of the data and its predictive ability was evaluated in the other 40%. however, the differences were small. overall, using a probability cutoff of 0.5 to define predicted sepsis, the model sensitivities ranged from 25 to 40% and the specificities ranged from 90 to 95%. the low sensitivities mean that many septicemic calves would not be detected by the model. however, the positive and negative predictive values ranged from 66 to 82%, which indicates the proportion of cases in which the predicted result was correct (for a population with a prevalence of septicemia of 31%). at a cutoff of 0.3, the models had sensitivities and specificities in the 70-75% range, which means that a much higher proportion of septicemic calves would have been detected by the model. however, a positive prediction was only associated with a 52-68% probability (positive predictive value) that the calf was truly septic. in contrast, a negative prediction was associated with an 83-89% probability (negative predictive value) that the calf was not septic. thirty-one percent of the diarrheic calves in this study were diagnosed with septicemia based on results of blood cultures, antemortem or postmortem tissue or fluid cultures, and morphologic changes at postmortem.
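the cutoff-based evaluation described above (dichotomizing predicted probabilities at 0.5 or 0.3 and computing sensitivity, specificity, and predictive values) can be sketched as follows; the labels and probabilities are made-up toy values, not the study's data:

```python
def classify_metrics(labels, probs, cutoff):
    """Dichotomize predicted probabilities at a cutoff and compute
    sensitivity, specificity, and positive/negative predictive values."""
    tp = sum(1 for lab, p in zip(labels, probs) if lab == 1 and p >= cutoff)
    fn = sum(1 for lab, p in zip(labels, probs) if lab == 1 and p < cutoff)
    tn = sum(1 for lab, p in zip(labels, probs) if lab == 0 and p < cutoff)
    fp = sum(1 for lab, p in zip(labels, probs) if lab == 0 and p >= cutoff)
    return {"sens": tp / (tp + fn), "spec": tn / (tn + fp),
            "ppv": tp / (tp + fp), "npv": tn / (tn + fn)}

labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]          # 1 = septicemic (toy data)
probs  = [0.9, 0.6, 0.4, 0.2, 0.7, 0.4, 0.3, 0.2, 0.1, 0.1]
print(classify_metrics(labels, probs, cutoff=0.5))  # strict cutoff
print(classify_metrics(labels, probs, cutoff=0.3))  # more sensitive cutoff
```

running both cutoffs on the same toy data reproduces the tradeoff the authors report: lowering the cutoff raises sensitivity at the expense of specificity.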
the prevalence of septicemia in this study was identical to that reported for calves with diarrhea, depression, and/or weakness on a veal raising facility [14], which suggests that the predictive values of the models developed herein may be relevant to other calf populations. the risk of contamination of the blood sample for culturing probably was low, so it is unlikely that many truly nonsepticemic calves were misclassified as septicemic. however, because of the low sensitivity of blood cultures [24,25] and the fact that blood was not cultured for some surviving calves, some truly septicemic calves may have been misclassified as nonsepticemic if they survived and were not subjected to postmortem examination. however, this misclassification probably was not common because (1) the calves survived, and the probability of survival was low if the calf was truly septicemic; and (2) the effect of misclassification of truly septicemic calves on the model would likely have been a reduction in specificity, but low specificity was not a problem with either model. e coli was the bacterial agent cultured with greatest frequency from septicemic calves in this study, a finding that agrees with results of previous investigations [10,12] and may be attributed to the fact that certain strains of e coli possess virulence factors that promote systemic invasion [26]. in addition to e coli, a variety of noncoliform bacteria were isolated. of particular interest was the recovery of campylobacter fetus subsp. fetus from blood of 8 diarrheic calves (15% of the isolates from blood); there is only 1 previous report of this agent being cultured from a septicemic calf [10]. possible explanations for the frequent isolation of campylobacter spp. in this study were that the oxoid signal blood culture system was more effective than other blood culture systems for growing this agent or that there were unique predisposing factors leading to campylobacter bacteremia in calves in our geographic location.
two models, the laboratory model and the clinical model, were ultimately selected to predict septicemia in diarrheic neonatal calves presented for treatment. the laboratory model, which included all possible variables, was intended for use in a hospital setting where the clinician would have access to laboratory facilities. the clinical model, which contained only demographic and physical examination variables, was developed for use in the field. clinicopathologic parameters identified by the laboratory model as being associated with an increased risk of septicemia were moderate (1.99-5.66 mg/dl; 176-500 µmol/l) and marked (>5.66 mg/dl; >500 µmol/l) increases in serum creatinine concentration, moderate to marked toxic changes in neutrophils (≥2+), and failure of passive transfer (igg concentration ≤800 mg/dl, globulin ≤2 g/dl [≤20 g/l], and total serum protein ≤5 g/dl [≤50 g/l]). because azotemia secondary to shock and dehydration was anticipated in both septicemic and nonsepticemic diarrheic calves presented for treatment, it was interesting to note that moderate and marked increases in serum creatinine concentration in this study increased the risk of a calf being septicemic by 2- and 8-fold, respectively. moderate to severe azotemia in septicemic calves was attributed to endotoxin-induced tubular necrosis and embolic nephritis, in addition to decreased renal perfusion associated with dehydration. toxic changes in neutrophils, which are induced by inflammatory mediators and endotoxins, are frequently encountered in septicemic calves [12]; therefore, inclusion of this variable in the laboratory model was not unexpected. because the relationship between septicemia and failure of colostral antibody transfer has been documented in previous studies [12,14], the increased risk of septicemia in calves with failure of passive transfer was also anticipated.
both the laboratory model and the clinical model indicated that the risk of septicemia in calves with identified sites of focal infection (omphalitis, arthritis, meningitis, uveitis) was more than double that of calves without evidence of focal infection. this association was biologically plausible because one of the initial sites of bacterial invasion in septicemic calves is the umbilicus [13], and hematogenous spread of infection to the meninges, synovial lining, and uveal tract is known to occur [12]. inclusion of a poor suckling reflex by both models and recumbency by the clinical model as predictors of septicemia in a population of diarrheic calves presented for treatment was unforeseen. the fluid, blood gas, and electrolyte derangements associated with the diarrhea could have caused recumbency and disinterest in suckling in the majority of calves in the study. however, multiple organ invasion by bacterial pathogens and the cascade of inflammatory reactions initiated by these infecting agents were more likely responsible for the increased frequency of recumbency and poor suckling reflex in septicemic calves. the clinical model indicated that being <5 days of age increased the risk of septicemia by 2.5-fold. this contrasts with results of a study conducted in veal calves where age >7 days significantly increased the risk of being blood culture positive [14]. the reason for this discrepancy between the 2 studies is unclear. because most calves in the present study were presented for treatment in the winter months, when all calves presented to the hospital would be born inside and there is increased stocking density in maternity pens, they could have been heavily exposed to pathogens early in life. the theory that enteritis predisposes calves >1 week of age to septicemia [10] was not supported by results of this study.
although the laboratory model outperformed the clinical model by a small margin at a probability cutoff of 0.5, the predictive abilities of the 2 models were remarkably similar (table 4). thus, the practitioner in the field, without immediate access to a laboratory, could use the clinical model to predict septicemia in a diarrheic calf with almost the same accuracy as that of the clinician at a referral institution, who would be using the laboratory model for the same purpose. the low sensitivities (39% and 40%) of both models at a probability cutoff of 0.5 meant that many septicemic calves were not being detected by the models. one explanation is that other useful predictors of septicemia were not included in the models. however, with the exception of historical data pertaining to calving and management practices, most variables previously identified as risk factors in calves and foals were evaluated [14,17,18]. there may be no group of variables unique to the septicemic calf, partly because septicemia is a dynamic process and laboratory and clinical parameters can vary widely depending on when in the disease course the calf is evaluated. the specificity of both models at a cutoff of 0.5 was >90%, indicating that >90% of nonsepticemic calves would be predicted to be nonsepticemic by the 2 models. when the predictive probability cutoff value was set to 0.5, the laboratory model had a positive predictive value of 76%. this cutoff could be used when the clinician wants to be relatively certain that a calf predicted to be septicemic would, in fact, be septicemic. for example, when dealing with a relatively low-value calf, a decision to euthanize the calf may be made given the poor prognosis for septicemic calves. in this situation, a high positive predictive value would be desirable. when the predictive probability cutoff was set to 0.3, the positive predictive value decreased but the negative predictive value rose to 89%.
consequently, a clinician could be relatively certain that a calf that tested negative was, in fact, nonsepticemic. this would be a desirable situation when treating a valuable calf with diarrhea, in which ancillary treatments for septicemia (broad spectrum antimicrobials and plasma) would only be omitted from the treatment plan if the clinician was relatively confident that the calf was nonsepticemic. the predictive values for any test depend on the sensitivity and the specificity of the test and the prevalence of the disease. all of the predictive values discussed above are based on the assumption that the prevalence of septicemia was 31%. if the prevalence of septicemia in a population of calves presented for treatment were <31%, the negative predictive value of the test would rise but the positive predictive value would drop off dramatically (fig 1). the relatively small reduction in the predictive ability of the models when they were evaluated using a split sample approach was encouraging and suggests that the models may perform reasonably well in predicting future observations. however, there may still be a slight upward bias in the assessment of the performance of these models because the variables chosen for inclusion in the split sample assessment were those obtained from the analysis of the whole data set. in conclusion, the predictive models fit the observed data reasonably well and had moderate predictive ability. the main limitation in the models seemed to be a low sensitivity. many of the septicemic calves may have been early in the course of the disease and may not have had any distinguishing features that could be used to identify them as septicemic. reassessment of these calves as the disease progresses may clarify their septicemia status but would not help in making initial decisions about therapy.
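the dependence of predictive values on prevalence noted above follows directly from bayes' theorem. the sensitivity and specificity below are assumed round numbers in the 70-75% range the paper reports at the 0.3 cutoff, not its exact figures:

```python
def predictive_values(sens, spec, prevalence):
    """Positive and negative predictive values as functions of test
    sensitivity, specificity, and disease prevalence (Bayes' theorem)."""
    p, q = prevalence, 1.0 - prevalence
    ppv = sens * p / (sens * p + (1.0 - spec) * q)
    npv = spec * q / (spec * q + (1.0 - sens) * p)
    return ppv, npv

# assumed operating point; prevalences below the study's 31% shrink the PPV
for prev in (0.31, 0.15, 0.05):
    ppv, npv = predictive_values(sens=0.72, spec=0.72, prevalence=prev)
    print(f"prevalence {prev:.2f}: ppv {ppv:.2f}, npv {npv:.2f}")
```

as the loop shows, the ppv falls sharply as prevalence drops while the npv rises, which is exactly the behavior the text attributes to fig 1.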
calf mortality
management factors associated with calf mortality in south carolina dairy herds
dairy calf mortality rate: influence of management and housing factors on calf mortality rate in tulare county, california
factors influencing dairy calf mortality in michigan
descriptive epidemiology of calfhood morbidity and mortality in new york holstein herds
factors affecting susceptibility of calves to disease
an epidemiologic study of calf health and performance in norwegian dairy herds. iii. morbidity and performance: literature review
dairy calf management, morbidity and mortality in ontario holstein herds. iii. association of management with morbidity
blood culture from calves and foals
bacteriological culture of ill neonatal calves
pasteurella multocida septicaemia in two calves
neonatal septicemia in calves: 25 cases (1985-1990)
septicaemic colibacillosis and failure of passive transfer of colostral immunoglobulin in calves
use of a clinical sepsis score for predicting bacteremia in neonatal dairy calves on a calf rearing farm
brewer bd, koterba am.
bacterial isolates and susceptibility patterns in foals in a neonatal intensive care unit
development of a scoring system for the early diagnosis of equine neonatal sepsis
comparison of empirically developed sepsis score with a computer generated and weighted scoring system for the identification of sepsis in the equine neonate
bovine neonatal sepsis score
quantitation of bovine immunoglobulins: comparison of single radial immunodiffusion, zinc sulfate turbidity, serum electrophoresis, and refractometer methods
primer of biostatistics
applied logistic regression
basic principles of roc analysis
bacteremia: pathogenesis and diagnosis
comparison of bacteriological culture of blood and necropsy specimens for determination of the cause of foal septicemia: 47 cases (1978-1987)
escherichia coli from calves with bacteremia

key: cord-252894-c02v47jz authors: chae, sangwon; kwon, sungjun; lee, donghyun title: predicting infectious disease using deep learning and big data date: 2018-07-27 journal: int j environ res public health doi: 10.3390/ijerph15081596 sha: doc_id: 252894 cord_uid: c02v47jz

infectious disease occurs when a person is infected by a pathogen from another person or an animal. it is a problem that causes harm at both individual and macro scales. the korea center for disease control (kcdc) operates a surveillance system to minimize infectious disease contagions. however, in this system, it is difficult to act immediately against infectious disease because of missing and delayed reports. moreover, infectious disease trends are not known, which means prediction is not easy. this study predicts infectious diseases by optimizing the parameters of deep learning algorithms while considering big data, including social media data. the performance of the deep neural network (dnn) and long-short term memory (lstm) learning models was compared with the autoregressive integrated moving average (arima) when predicting three infectious diseases one week into the future.
the results show that the dnn and lstm models perform better than arima. when predicting chickenpox, the top-10 dnn and lstm models improved average performance by 24% and 19%, respectively. the dnn model performed stably, and the lstm model was more accurate when infectious disease was spreading. we believe that this study's models can help eliminate reporting delays in existing surveillance systems and, therefore, minimize costs to society. infectious disease occurs when a person is infected by a pathogen from another person or an animal. it not only harms individuals but also causes harm on a macro scale and, therefore, is regarded as a social problem [1]. at the korea center for disease control (kcdc), infectious disease surveillance is a comprehensive process in which information on infectious disease outbreaks and vectors is continuously and systematically collected, analyzed, and interpreted. moreover, the results are distributed quickly to people who need them to prevent and control infectious disease. the kcdc operates a mandatory surveillance system, in which mandatory reports are made without delay to the relevant health center when an infectious disease occurs, and it operates a sentinel surveillance system, in which the medical organization that has been designated as the sentinel reports to the relevant health center within seven days. the targets of mandatory surveillance consist of a total of 59 infectious diseases from groups 1 to 4 designated by the kcdc. the targets of sentinel surveillance include influenza from group 3 along with 21 infectious diseases from group 5. overall, a total of 80 infectious diseases in six groups are monitored. in the current korean infectious disease reporting system, if there is a legally defined infectious disease patient at a medical organization, a report is made to the managing health center through the infectious disease web reporting system.
the managing health center reports to the city and province health offices through another system and the city and province health offices report to the kcdc. in the conventional reporting system, some medical organizations' infectious disease reports are incomplete and delays can occur in the reporting system. for instance, in the traditional influenza surveillance system, around two weeks elapses between when a report is made and when it is disseminated [2] . the kcdc has been running an automated infectious disease reporting system as a pilot project since 2015. however, by 2017, only 2.3% of all medical organizations were participating in the pilot project. in medical organizations using the conventional infectious disease reporting system, a large number of missing and delayed reports can occur, which hinders a prompt response to infectious disease. as such, it is necessary to create a data-based infectious disease prediction model to handle situations in real time. furthermore, if this model can understand the extent of infectious disease trends, the costs to society from infectious disease can be minimized. an increasing number of researchers recognize these facts and are performing data-based infectious disease surveillance studies to supplement existing systems and design new models [3] [4] [5] [6] [7] [8] [9] . among these, studies are currently being performed on detecting infectious disease using big data such as internet search queries [10] [11] [12] [13] [14] [15] . the internet search data can be gathered and processed at a speed that is close to real time. according to towers et al., internet search data can create surveillance data faster than conventional surveillance systems [16] . for example, when huang et al. predicted hand, foot, and mouth disease using the generalized additive model (gam), the model that included search query data obtained the best results. 
as such, it has been reported that new big data surveillance tools have the advantage of being easy to access and can identify infectious disease trends before official organizations [17] . in addition to internet search data, social media big data is also being considered. tenkanen et al. report that social media big data is relatively easy to collect and can be used freely, which means accessibility is satisfactory and the data is created continuously in real time with rich content [18] . as such, studies have used twitter data to predict the occurrences of mental illness [19] and infectious disease [20] [21] [22] [23] in addition to predictions in a variety of other scientific fields [24] [25] [26] [27] . in particular, a study by shin et al. reported that infectious diseases and twitter data are highly correlated. there is the possibility of using digital surveillance systems to monitor infectious disease in the future [20] . when these points are considered, using search query data and social media big data should have a positive effect on infectious disease predictions. in addition to these studies, there are also studies that have used techniques from the field of deep learning to predict infectious disease [22, 23, 28, 29] . deep learning is an analysis method and, like big data, it is being actively used in a variety of fields [30] . deep learning yields satisfactory results when it is used to perform tasks that are difficult for conventional analysis methods [31] [32] [33] . in a study by xu et al., a model that used deep learning yielded better prediction performance than the generalized linear model (glm), the least absolute shrinkage and selection operator (lasso) model, and the autoregressive integrated moving average (arima) model [28] . as such, methods of predicting infectious disease that use deep learning are helpful for designing effective models. 
there are also examples of infectious disease prediction based on environmental factors such as weather [34-37]. previous studies have confirmed that weather is a factor with a strong influence on the occurrence of infectious diseases [38-40]. liang et al. showed that rainfall and humidity are risk factors for hemorrhagic fever with renal syndrome [41]. in addition, a study by huang et al. reported that trends in dengue fever show a strong correlation with temperature and humidity [42]. previous studies indicate that infectious disease can be predicted more effectively if weather variables, internet big data, and deep learning are used. most previous research has attempted to predict infectious disease using internet search query data alone. however, as discussed above, it is necessary to also consider various big data and environmental factors such as weather when predicting infectious disease. in addition, in the case of models that use deep learning, it is possible to improve prediction performance by optimizing the model's parameters. therefore, the aim of this study is to design a model that uses the infectious disease occurrence data provided by the kcdc, search query data from search engines that are specialized for south korea, twitter social media big data, and weather data such as temperature and humidity. according to a study by kwon et al., a model that considers the time difference between clinical and non-clinical data can detect infectious disease outbreaks one to two weeks before current surveillance systems [43]. therefore, this study adds lag to the collected dataset to take temporal characteristics into account. in addition, in the design process, a thorough testing of all the input variable combinations is performed to examine the effects of each resulting dataset on infectious disease outbreaks and select the optimal model with the most explanatory power.
the model's prediction performance is verified by comparing it with an infectious disease prediction model that uses a deep learning method and an infectious disease prediction model that uses time series analysis. ultimately, using the results obtained by this study, it should be possible to create a model that can predict trends about the occurrence of infectious disease in real time. such a model can not only eliminate the reporting time differences in conventional surveillance systems but also minimize the societal costs and economic losses caused by infectious disease. the remainder of this paper is organized as follows. section 2 describes the data sources and standards used in this study and introduces the analysis methodology used to design the prediction model. in section 3, the analysis results are described and their implications are discussed. section 4 discusses the results. section 5 concludes the paper. as mentioned above, this study uses four kinds of data to predict infectious disease, which includes search query data, social media big data, temperature, and humidity. the standards for the non-clinical data are as follows. data from 576 days between 1 january, 2016 and 29 july, 2017 was used. the infectious diseases selected for this study are subject to mandatory reporting. unlike those diseases subject to mandatory reporting, diseases subject to sentinel reporting aggregate data on a weekly basis. since prediction is also performed on a weekly basis, it is difficult to cope with infectious diseases in real time. therefore, diseases that are subject to sentinel reporting were excluded from the study. moreover, the study excluded infectious diseases with an annual occurrence rate of less than 100 as well as infectious diseases that have a statistically insignificant model with an adjusted r-squared value of less than 0.25 when regression analysis is performed using all variables. 
three infectious diseases satisfied all conditions: malaria, chickenpox, and scarlet fever. the search data was collected from the naver data lab (https://datalab.naver.com/keyword/trendsearch.naver). the usage share data provided by internettrend (http://internettrend.co.kr/trendforward.tsp) on search engines in the health/medicine field in the first half of 2017 shows that the naver search engine had the highest usage share (86.1%) in south korea. therefore, it was chosen as the search engine for extracting search data. note that the collected search data consists of only korean terms because the search engine is specific to south korea. the search queries used in this study consisted of the infectious disease's proper name and symptoms (e.g., "chickenpox" and "chickenpox symptoms" in korean). the frequency of inquiries using these search queries was used as the search data. the number of searches was normalized with respect to the largest number of searches within the study period. weather data (temperature and humidity) were collected from the korea meteorological administration's weather information open portal (https://data.kma.go.kr). hourly data collected from weather stations nationwide was converted into daily average data for each station. in gyeonggi-do province, where around half of south korea's population lives, there are many weather stations crowded together. there was a concern that simply averaging the daily data over all stations would cause errors, so the following process was performed. first, the averages of the data from each station were collected for the eight provinces in south korea (gyeonggi-do, gangwon-do, chungcheongnam-do, chungcheongbuk-do, jeollanam-do, jeollabuk-do, gyeongsangbuk-do, and gyeongsangnam-do). next, the averages of the data for each of the eight provinces were found to obtain south korea's national average weather data.
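the two-stage averaging described above (station to province, then province to national) can be sketched as follows; the station readings and the province grouping below are made-up toy values, not the study's data:

```python
from statistics import mean

# toy daily-average temperature readings per station, grouped by province
readings = {
    "gyeonggi-do": [12.1, 11.8, 12.5],   # many stations crowded together
    "gangwon-do": [8.9],
    "jeollanam-do": [14.2, 13.8],
}

# first average within each province so dense regions are not over-weighted...
province_means = {prov: mean(vals) for prov, vals in readings.items()}
# ...then average the province means to get the national daily figure
national_mean = mean(province_means.values())
print(round(national_mean, 3))
```

a naive mean over all six station readings would weight gyeonggi-do three times as heavily as gangwon-do, which is the error the province-first averaging avoids.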
average temperature (degrees celsius) and average humidity (percentage) were recorded. social media big data was collected for each infectious disease from twitter through a web crawler that used the python selenium library. for the twitter data, the daily number of tweets mentioning infectious disease was recorded. lastly, infectious disease data was collected from the infectious disease web statistics system (https://is.cdc.go.kr/dstat/index.jsp). this data consists of the daily number of people who were infected throughout south korea. table 1 shows the sources and descriptions of the data. table 2 shows the statistics for each of the infectious disease variables used in this study. in the case of temperature and humidity, the same conditions were used, which means they were put in a shared category. the data in table 2 shows that an average of 166.76 people are infected with chickenpox daily with a standard deviation of 98.37 and the daily naver frequency average is 33.94 with a standard deviation of 15.50. we observed that all the statistics for chickenpox are higher than those for other infectious diseases. figure 1 shows the overall framework of the model used in this study including the data collection process and the comparison of models designed using the deep neural network (dnn) method, the long-short term memory (lstm) method, the autoregressive integrated moving average (arima) method, and the ordinary least squares (ols) method. this study constructed an infectious disease surveillance model that uses non-clinical search data, twitter data, and weather data. to design the optimal prediction model, the ols models that use all possible combinations of variables in the dataset were created. the adjusted r-squared values of each model were compared. in addition, lags of 1-14 days were added to each infectious disease and their adjusted r-squared values were compared in a preliminary analysis. 
a lag of seven days, which had high explanatory power for all infectious diseases, was selected as the optimal lag parameter. the optimal parameters were used to create the ols, arima, dnn, and lstm models. before analysis, this study applied a lag of seven days between the input variables (optimal variable combination) and their associated output variable (disease occurrence). the ols dataset was divided into a training data subset and a test data subset using a ratio of 8:2. this means all 569 rows of collected data were divided such that there were 455 rows for the training data subset and 114 rows for the test data subset. the training data subset was only used for model training. the test data subset was only used for prediction and performance evaluation after training. the arima dataset was also divided into training and test data subsets using a ratio of 8:2, but only the disease occurrences were required for arima: the 569 rows of disease occurrence data were divided into 455 rows for training and 114 rows for testing. in the dnn and lstm models, the whole dataset was divided into training, validation, and test data subsets at a ratio of 6:2:2, which gives 341 rows for training, 114 rows for validation, and 114 rows for testing. the training data subset was used for model training. the validation data subset was only used for performance evaluation during training; the final model was the one that performed best on the validation data subset. the test data subset was only used for prediction and performance evaluation. to compare the models, the root mean squared error (rmse) was used to evaluate the prediction rates.
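the seven-day lag and the 8:2 and 6:2:2 chronological splits just described can be sketched as follows; the input and occurrence series here are placeholders standing in for the study's 569 lagged rows:

```python
# pair each day's input variables with the disease count seven days later,
# then split the resulting 569 rows chronologically as described in the text.
LAG = 7
n_days = 569 + LAG                 # raw days needed to produce 569 lagged rows

inputs = list(range(n_days))        # placeholder input-variable rows
occurrences = list(range(n_days))   # placeholder daily case counts

# input variables on day t predict occurrences on day t + 7
rows = [(inputs[t], occurrences[t + LAG]) for t in range(n_days - LAG)]
assert len(rows) == 569

# 8:2 split used for ols and arima
n_train = int(len(rows) * 0.8)                 # 455 training rows
train, test = rows[:n_train], rows[n_train:]   # 455 / 114

# 6:2:2 split used for dnn and lstm
n_tr = int(len(rows) * 0.6)                    # 341 training rows
n_val = (len(rows) - n_tr) // 2                # 114 validation rows
tr = rows[:n_tr]
val = rows[n_tr:n_tr + n_val]
tst = rows[n_tr + n_val:]                      # 114 test rows

print(len(train), len(test), len(tr), len(val), len(tst))
```

splitting chronologically (rather than shuffling) matters for time series: the test subset stays strictly later than the training subset.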
rmse is a common measurement of the difference between predicted and actual values. it is used in other fields as well as in the prediction of infectious diseases [28,44,45]. rmse is calculated using the equation below.

rmse = sqrt( (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)² )    (1)

where y_i is the actual value, ŷ_i is the predicted value, and n is the number of observations. the optimal variable combinations for the model were selected by considering all possible models in the regression analysis. the models are combinations of the four types of data in the dataset (naver searches (n), twitter searches (tw), temperature (t), and humidity (h)). figure 2 shows the adjusted r-squared values of 15 regression models for each infectious disease. among the observed regression models, the models that are combinations of all variables had the best explanatory power.
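the rmse of equation (1), used throughout to compare models, can be written as a small helper; the values in the example are invented:

```python
import math

def rmse(actual, predicted):
    """root mean squared error between paired actual and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

# example: squared errors 1, 0, 4 -> mean 5/3 -> rmse = sqrt(5/3)
print(rmse([3.0, 5.0, 7.0], [2.0, 5.0, 9.0]))
```

a smaller rmse means the predicted series sits closer to the actual series, which is how the models are ranked in the results below.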
therefore, this combination was chosen as the optimal variable combination. previous results [43] have shown that it is possible to predict infectious disease at an early stage if a model is designed to consider the time difference between clinical data and non-clinical data. based on this observation, our model was designed to consider the time difference in each data set. here, "lag" refers to the time delay between the date the data is collected and the date at which the effects actually occur. this means analysis was performed by establishing the time difference between the four input variables used in this study and the output variable that is actually affected. for example, a lag of 1 means that the output variable of 2 january 2016 is calculated using the input variables of 1 january 2016. figure 3 shows the adjusted r-squared values of regression models when lags of 1-14 days were tested for each of the infectious diseases in order to select the optimal lag. for chickenpox, lags of 1, 7, and 14 days yielded the highest explanatory power. for scarlet fever, lags of 4, 7, and 11 days yielded the highest explanatory power. for malaria, lags of 1, 2, and 7 days yielded the highest explanatory power. for chickenpox and malaria, the lag with the highest explanatory power was one day. however, this lag was judged unsuitable for the ultimate goal of reducing the length of delay from reporting to dissemination. in the observed regression models, the explanatory power of a lag of seven days was high for all infectious diseases. therefore, this lag was judged the most suitable and was used for later predictions. in this study, the ols model was used to select the optimal parameter values. it was also used as a comparison model to evaluate the prediction performance of the deep learning models. linear regression is a regression analysis technique that models the linear correlation between the output variable y and one or more input variables x in the collected data. the model has the following form.

y = Xβ + ε

ols is the simplest and most commonly used form of linear regression.
it is a technique that minimizes the sum of squared errors; the parameter vector β to be estimated can be solved using the equation below.

β̂ = (XᵀX)⁻¹ Xᵀ y

ols analyses were performed with r version 3.3.3 (https://www.r-project.org/). because ols is the simplest form of linear regression analysis, it is not sufficient for comparison with deep learning models. therefore, we also compare the arima model, which is often used for the prediction of infectious diseases [44-46]. this allows a clearer comparison of the traditional analysis methods (ols and arima) with deep learning (dnn and lstm). the arima model is a method for analyzing non-stationary time series data. one characteristic of arima analysis is that it can be applied to any time series. in particular, it captures the detailed changes when the data fluctuates rapidly over time. in this study, we used seasonal arima because the collected data is seasonal. the seasonal arima model is denoted as arima(p, d, q)(P, D, Q)s, where p is the order of the autoregressive part, d is the order of differencing, q is the order of the moving-average process, and s is the length of the seasonal cycle. (P, D, Q) is the seasonal part of the model. the seasonal arima model is written below.

φ_s(B^s) φ(B) (1 − B)^d (1 − B^s)^D y_t = µ + θ_s(B^s) θ(B) a_t

where y_t refers to the value of the time series at time t, µ is the mean term, a_t is the independent disturbance, B is the backshift operator, φ(B) is the autoregressive operator, and θ(B) is the moving-average operator. φ_s(B^s) and θ_s(B^s) are the seasonal operators of the model. the arima analyses were carried out using r version 3.3.3. the dnn model is a feedforward analysis method that is a basic model for deep learning. dnn is composed of a minimum of three node layers and, with the exception of the input nodes, each node uses a nonlinear activation function. dnn uses a supervised learning technique called backpropagation. in this study, an infectious disease prediction model that uses dnn was designed, and this basic dnn model was compared with the more advanced lstm model. the variables used in dnn are bias b, input x, output y, weight w, summation σ, and activation function f(σ). each neuron in dnn uses the following equation.

σ = Σ_i w_i x_i + b,    y = f(σ)
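the single-neuron computation just described — a weighted sum of inputs plus a bias, passed through an activation f — can be sketched as follows; the weights, bias, and inputs are made-up values:

```python
def neuron(x, w, b, f):
    """one dnn neuron: sigma = sum(w_i * x_i) + b, then output y = f(sigma)."""
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return f(s)

# relu is one of the nonlinear activations evaluated in this study
relu = lambda s: max(0.0, s)

# sigma = 0.5*1.0 + (-1.0)*2.0 + 0.25 = -1.25, which relu clips to 0.0
y = neuron([1.0, 2.0], [0.5, -1.0], 0.25, relu)
print(y)
```

a dense layer is just many such neurons sharing the same input vector, which is what the keras "dense layer" mentioned below provides.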
figure 4 shows the structure of a neuron in the dnn model. the dnn analyses were carried out using the "dense layer" option of the keras package in python version 3.5.3 (https://keras.io/). there are 10 parameters available in the dense layer. we only modified the units, activation function, and dropout. the rest of the parameters used the default values (e.g., use_bias = true and kernel_regularizer = none). the lstm model is suitable for predicting time series data when there is a time step with a random size [47].
it was thought that prediction performance could be improved by creating an infectious disease prediction model using lstm and the time series data collected in this study. an important advantage of recurrent neural networks (rnns) is that contextual information is available when mapping io sequences. however, there is a gradient problem in that the effect of a given input on the hidden layer can be increased or decreased significantly during the circular connection. as new inputs are overwritten, the sensitivity to the first input decreases over time, so the network "forgets". the input gate, output gate, and forget gate are non-linear summation units that control the activation of the cell. the forget gate multiplies the previous state of the cell while the input and output gates multiply the io of the cell. the activation function f of the gate is a logistic sigmoid. the io activation functions g and h of the cell usually use hyperbolic tangents or logistic sigmoids. however, in some cases, h uses the identity function. as long as the forget gate is open and the input gate is closed, the memory cell continues to remember the first input. in this way, lstm is an algorithm that resolves a problem in traditional rnns [48]. the equations for forgetting, storing, renewing, and outputting information in the cell are shown below.

f_t = σ(W_f · [h_{t−1}, x_t] + b_f)    (7)
i_t = σ(W_i · [h_{t−1}, x_t] + b_i)    (8)
c̃_t = tanh(W_c · [h_{t−1}, x_t] + b_c)    (9)
c_t = f_t ⊙ c_{t−1} + i_t ⊙ c̃_t    (10)
o_t = σ(W_o · [h_{t−1}, x_t] + b_o)    (11)
h_t = o_t ⊙ tanh(c_t)    (12)

when data (x_t) is input to the lstm cell, equation (7) determines via f_t the information to be forgotten in the cell layer. in equations (8) and (9), information that will be newly saved in the cell layer is created in i_t and c̃_t. in equation (10), the cell state c_t is renewed using f_t, i_t, and c̃_t. in equation (11), the output gate o_t is computed; in equation (12), the cell state is passed through the tanh function, taking a value between −1 and 1, and is multiplied by o_t to give the output h_t. the values of c_t and h_t are kept for the next iteration of the lstm. lstm analyses were carried out using the "lstm layer" of the keras package in python version 3.5.3. there are 23 parameters available in the lstm layer. we only set the units, activation function, return sequence, and dropout. the rest of the parameters used the default values (e.g., use_bias = true, recurrent_regularizer = none, recurrent_constraint = none, and unit_forget_bias = none). figure 5 shows the parameter selection method for the deep learning approach used in this study. the adadelta, adagrad, adam, adamax, nadam, rmsprop, and stochastic gradient descent (sgd) optimizers were compared.
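a minimal numpy sketch of one lstm cell step following equations (7)-(12); the weight matrices here are small random placeholders rather than anything a keras layer would actually learn:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """one lstm time step: forget, store, renew, and output, as in (7)-(12)."""
    z = np.concatenate([h_prev, x_t])          # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])         # (7) forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])         # (8) input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])     # (9) candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde         # (10) cell state update
    o_t = sigmoid(W["o"] @ z + b["o"])         # (11) output gate
    h_t = o_t * np.tanh(c_t)                   # (12) hidden state output
    return h_t, c_t

units, n_in = 2, 3
rng = np.random.default_rng(0)
W = {k: rng.normal(size=(units, units + n_in)) * 0.1 for k in "fico"}
b = {k: np.zeros(units) for k in "fico"}

h, c = np.zeros(units), np.zeros(units)
for x_t in np.ones((5, n_in)):                 # feed a short dummy sequence
    h, c = lstm_step(x_t, h, c, W, b)
print(h.shape, c.shape)
```

note that h_t is bounded in (−1, 1) because it is the product of a sigmoid gate and a tanh of the cell state, which is the stabilizing behavior the text attributes to the gates.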
all parameters of each optimizer used the default values of the keras package. for instance, in sgd, the learning rate is 0.01, the momentum is 0, the decay is 0, and the nesterov momentum is false. in addition, the following activation functions were evaluated: exponential linear unit (elu), rectified linear unit (relu), scaled elu (selu), and softplus. lastly, various numbers of epochs (400, 600, 800, and 1000) were evaluated. the other parameters were fixed as follows: number of hidden layers = 4, number of units in each hidden layer = 32, batch size = 32, and dropout = 0. prediction models with variable and fixed parameters were trained on the data and the resulting models were compared to determine the optimal prediction model. to ensure the amount of dnn model data was the same as that of the lstm model, previous data from the same time period as the lstm was inserted. all deep learning models were implemented using the keras package in python version 3.5.3. the regression model was formed based on 569 days of data in which a lag of seven days was applied to each infectious disease dataset. the dataset was divided up in an 8:2 ratio and each part was used for constructing the regression model and prediction. table 3 presents the ols results. each regression model had results that were below the level of significance (p < 0.05). the adjusted r-squared value was greater than 0.25 for all three infectious diseases, which means the models can be said to have significant explanatory power.
of the infectious disease regression models, the chickenpox model yielded significant results for the naver search queries, temperature, and humidity. the scarlet fever model yielded significant results for the naver search queries and humidity. additionally, the malaria model yielded significant results for the naver search queries and temperature (p < 0.05). taken together, the naver search query data was significant for all three infectious diseases and the twitter data was not significant for any of the three. this indicates that internet search query data can be used to design an infectious disease prediction model, as reported by previous studies. however, the results for the twitter data differ from those of previous studies. this is believed to be because naver accounted for the largest share (86.2%) of korean search engine use in the health/medicine field for the first half of 2017 while twitter accounted for the smallest share (0.5%) of social media use in the health/medicine field for the same time period (http://internettrend.co.kr/trendforward.tsp). however, the twitter data had an effect on the process of finding the model with the highest adjusted r-squared value. therefore, it is expected to have an effect on future analysis as well. temperature had a significant relationship with all infectious diseases except scarlet fever, and humidity had a significant relationship with all infectious diseases except malaria. the values of the coefficients show that the most significant variable for chickenpox and scarlet fever was the naver search query data (4.4589 and 2.1956, respectively) and, for malaria, it was the temperature (0.0770). the effect of the naver search query data in particular was significant for all three infectious diseases, which confirms that it can be suitable for predicting infectious disease. the seasonal arima model was evaluated using the same data used for ols.
the autocorrelation function and the partial autocorrelation function were checked for the seasonality of the infectious diseases, and seasonality was observed. it was considered inappropriate to select the parameters (e.g., p, d, q) visually because the cut-off and tail-off patterns were unclear. therefore, the optimal model for each infectious disease was selected based on the akaike information criterion (aic) and rmse. the aic and rmse were used to compare the arima models. table 4 shows the aic and rmse of the seasonal arima model for each infectious disease, which allows us to identify the top three arima models. in addition, the choice of parameter values did not substantially affect the aic and rmse of the model for a single infectious disease. to compare the performance of each model, figure 6 shows the 10 models with the lowest rmse on the test data subset. the numbers inside the parentheses in each model's name represent the optimizer, the activation function, and the number of epochs used in the model (e.g., dnn (1, 2, 3) indicates that the optimizer, the activation function, and the number of epochs are adadelta, relu, and 800, respectively). the metric used to compare the models is rmse, which shows the difference between the actual and predicted values. a smaller rmse value indicates a smaller difference between the actual and predicted values and, thus, a higher prediction performance. table s1 shows the rmse and prediction graphs of the dnn and lstm models with the lowest rmse for chickenpox. it can be seen that the prediction graphs for each analysis method have similar shapes overall. the 10 dnn models for chickenpox had a mean rmse of 72.8215 and a standard deviation of 1.28, which shows stable model performance. when the prediction performances of each model were compared based on rmse, the top 10 dnn models showed a 24.45% performance improvement compared to the arima model. the mean rmse of the 10 lstm models was 78.2850, which is higher than that of the dnn models.
the standard deviation was 3.64, which shows that the difference among the lstm models was more marked than the difference among the dnn models. despite this, the top 10 lstm models achieved an 18.78% performance improvement over the arima model on average. there was a difference between the dnn and lstm models' average figures and standard deviations. however, in the models with the lowest rmse for each analysis method, there was not a big difference, which indicates that there was not a large difference in performance when the optimal parameters for each analysis method were used. table s2 shows the rmse and the prediction graphs of the dnn and lstm models with the lowest rmse for scarlet fever. the shapes of the graphs for the dnn models are similar. for the lstm models, the shapes of the graphs are similar except for the model with the lowest rmse. unlike the graphs of the other lstm models, the graph of the lstm model with the lowest rmse showed a strong tendency to follow the actual trend. this result suggests that a prediction model that is better than existing prediction models can be designed by changing the deep learning parameters to achieve optimization. as is the case with chickenpox, the mean rmse of the dnn models (34.4347) for scarlet fever was lower than that of the lstm models (36.8140). the standard deviation of the dnn models (0.80) was also lower than that of the lstm models (1.37). when comparing each model based on rmse, the top 10 dnn models showed a 23.28% performance improvement over the arima model and the lstm models showed a 17.97% performance improvement over the arima model. table s3 shows the rmse and the prediction graphs of the dnn and lstm models with the lowest rmse for malaria. like the other infectious diseases, the dnn model prediction graphs have a similar shape. however, the shapes of the lstm model prediction graphs have a tendency to not follow the trend. the rmses of each prediction model, excluding the arima model, showed little difference.
this is believed to be because the number of malaria occurrences is fewer than those of the other infectious diseases. therefore, adequate predictions could not be formed. it is difficult to understand the special characteristics of each analysis method by simply comparing rmse figures alone. therefore, a detailed comparison was performed on the basic comparison models (ols and arima) and the analysis methods that use deep learning (dnn and lstm). the deep learning models used for comparison were the models with the optimal performance and lowest rmse so that they could best represent each analysis method. figure 7 shows the chickenpox predictions of the model with the lowest rmse out of the 10 models with the lowest rmse for each analysis method.
the dnn model with the best performance had the following specifications: optimizer = adadelta, activation function = relu, and number of epochs = 400 (dnn (1, 2, 1)). the lstm model with the best performance had the following specifications: optimizer = nadam, activation function = softplus, and number of epochs = 800 (lstm (5, 4, 3)). the ols model's predictions had a smaller range of fluctuation than the deep learning models. from day 480, it seems to follow the trend, but it does not follow the small changes. after day 550, it cannot predict the downward shape even within a stable graph model. in short, the ols model is not suitable as a prediction model. the arima model's prediction graph has a very simple shape. this model is cyclic and there is a slight increasing trend in which the predicted value per cycle increases by a factor of about 2.5 each time. this model cannot predict the trend at all. it can only predict a stable cyclic behavior. in contrast, the dnn (1, 2, 1) predictions followed the actual occurrence trend well. moreover, they had a large range of fluctuation, which means they made accurate predictions overall. however, when the number of occurrences rose rapidly in days 510-520, the model was unable to follow these values. the lstm (5, 4, 3) predictions had a smaller range of fluctuation than the dnn (1, 2, 1) model. its range of variance was small, which means it had a stable shape, and it performed better than dnn (1, 2, 1) when the number of occurrences rose rapidly.
figure 8 shows the scarlet fever predictions of the models with the best performance for each analysis method. the dnn model with the best performance had the following specifications: optimizer = adadelta, activation function = elu, and number of epochs = 600 (dnn (1, 1, 2)). the lstm model with the best performance had the following specifications: optimizer = adamax, activation function = elu, and number of epochs = 400 (lstm (4, 1, 1)). the ols model's predictions were completely unable to follow the trend, which was similar to the chickenpox case. the arima model's prediction has no particular merit because it also predicts a simple cycle. much like its chickenpox prediction, it can only predict a stable cyclic behavior. the dnn (1, 1, 2) predictions were relatively close when the number of occurrences was low, but they were too low when the number of occurrences was high. lstm (4, 1, 1) had a larger range of variance than dnn (1, 1, 2) and its predictions were close when the number of occurrences was high. in the prediction graphs for all of the top performing scarlet fever models, none of the models were able to follow the trend on days 480-500 when there was a severe variance in the number of actual occurrences.
Looking at the mean of each model's predicted number of occurrences, the mean of the LSTM model (88.7568) was larger than that of the DNN model (77.096). The same pattern was seen for chickenpox (DNN model mean = 237.5318, LSTM model mean = 241.5186). This suggests that more suitable results can be obtained if the LSTM model is used to predict the maximum number of occurrences and the DNN model is used to predict the minimum.

Figure 9 shows the malaria predictions of the models with the best performance for each analysis method. The DNN model with the best performance had the following specifications: optimizer = Adamax, activation function = softplus, and number of epochs = 800 (DNN (4, 4, 3)). The LSTM model with the best performance had the following specifications: optimizer = Adadelta, activation function = softplus, and number of epochs = 800 (LSTM (1, 4, 3)). The predictions of the analysis methods were not satisfactory, although the DNN (4, 4, 3) model's predictions seemed to follow the trend relatively well. The ARIMA model predicts values close to 0. The occurrences in the malaria data are fewer than those of the other diseases and are concentrated in the summer season, which makes the data poorly suited to time series analysis. As seen in Section 3.3, the lowest RMSEs of the prediction models, excluding the ARIMA model, showed little difference. LSTM (1, 4, 3) had a lower range of variance than OLS and seemed completely unable to make predictions. As mentioned before, predictions were inadequate for all of the models, not only LSTM (1, 4, 3), because the number of malaria occurrences was small and proper results could not be produced.

The deep learning models showed outstanding performance compared to the traditional ARIMA method. Among all the DNN and LSTM prediction models for chickenpox, the optimal models with the lowest RMSE yielded 27.22% and 27.33% better performance than the ARIMA model, respectively. The top 10 DNN models for chickenpox improved performance by an average of 24.45%, and the LSTM models by an average of 18.78%. The lowest RMSEs of the DNN and LSTM prediction models for scarlet fever showed 26.25% and 23.79% improved performance compared to the ARIMA models. The top 10 DNN models for scarlet fever improved performance by an average of 23.28%, and the LSTM models by an average of 17.97%.

As noted in the previous sections, it was difficult to predict infectious diseases when the number of infections was small and concentrated in one season. In effect, we observed that the incidence of malaria was high over days 160-250 and after day 530; this period corresponds to the summer season in Korea. Predicting infectious diseases with this particular data set was difficult, and it was not suitable for the ARIMA analysis. Even on this data set, however, the DNN followed the trend of the infectious disease comparatively well (Figure 9). Moreover, the performance of the DNN model could likely be improved if more diverse parameters were adjusted, which means that deep learning has the advantage of scalability; this can be investigated further in future studies.

The ARIMA model used in this study was observed to be effective when the number of incidences of an infectious disease was regular and had no increasing or decreasing trend. However, when the predictions were compared across analysis methods, the DNN and LSTM deep learning models performed better than the OLS and ARIMA models, provided there was a sufficiently large number of occurrences. When comparing the DNN and LSTM models, the best models had similar performance, but the DNN models were better in terms of average performance. However, when the number of occurrences was large, the LSTM model made close predictions; it appears to be a suitable analysis method when the number of occurrences is rapidly increasing and an infectious disease is believed to be spreading. Actual data can have trends and be irregular, so deep learning can be an excellent analytical method for analyzing such data and predicting future situations. According to the results of the previous analyses, the deep learning models follow increasing and decreasing trends sufficiently well. Moreover, the DNN and LSTM models were observed to be sensitive to decreasing trends and increasing trends, respectively.

Infectious disease is a social problem in that it can cause not only personal damage but also widespread harm. For this reason, research is being conducted to minimize social losses by predicting the spread of infectious diseases. The aim of this study was to design an infectious disease prediction model that is more suitable than existing models by using various input variables and deep learning techniques.
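The percentage improvements quoted for the deep learning models compare each model's lowest RMSE against the ARIMA baseline as a relative reduction. A minimal sketch of that comparison (the RMSE values below are hypothetical; only the formula reflects the computation described):

```python
def improvement_over_baseline(baseline_rmse, model_rmse):
    """Relative RMSE improvement versus a baseline model, as a percentage."""
    return (baseline_rmse - model_rmse) / baseline_rmse * 100.0

# Hypothetical RMSEs: an ARIMA baseline and the best DNN and LSTM models.
arima_rmse = 100.0
dnn_rmse, lstm_rmse = 72.78, 72.67

dnn_gain = improvement_over_baseline(arima_rmse, dnn_rmse)
lstm_gain = improvement_over_baseline(arima_rmse, lstm_rmse)
```

With these illustrative inputs, the two gains come out near the 27.22% and 27.33% figures reported for chickenpox; averaging the gains of the top 10 configurations gives the "average improvement" numbers.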
Therefore, in this study, the optimal parameters were set using a variable selection method based on OLS. The relationship between actual instances of disease occurrence and the internet search query data tends to have a time lag, so a lag was added to each infectious disease's data set to capture the future trend. Next, ARIMA, DNN, and LSTM analyses were performed with the optimal parameters. The results of the OLS analysis using the optimal parameters showed that the regression models for each infectious disease were significant. Of the four input variables, the Naver search frequency had a significant relationship with all three infectious diseases. The performance of the OLS and ARIMA analyses was used to evaluate the deep learning models. Looking at the results for DNN and LSTM, both deep learning models made much better predictions than the OLS and ARIMA models for all infectious diseases. Moreover, the DNN models had the best performance on average, but the LSTM models made more accurate predictions when infectious diseases were spreading. However, in the case of malaria, there were few occurrences of the disease compared to the other infectious diseases, so the predictions were comparatively inaccurate. This study was also able to reveal special characteristics of the DNN and LSTM models. The DNN model produced smaller values than the LSTM model on average when predicting infectious diseases. Suitable predictions can therefore be made by using the DNN model when predicting the minimum value of disease occurrence and the LSTM model when predicting the maximum value. In previous studies, deep learning algorithms were not used [10] [11] [12] [13] [14] [15] [17] or the amount of data considered was small [22, 23, 28, 29]. This study used social media big data and weather data, which have not been sufficiently considered in existing studies.
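The time lag mentioned above amounts to pairing each occurrence count with predictor values observed a fixed number of days earlier. A minimal sketch of that alignment (the series values and the lag of 2 are illustrative, not taken from the paper):

```python
def lag_align(predictor, target, lag):
    """Align predictor values with target values observed `lag` steps later."""
    if lag <= 0:
        raise ValueError("lag must be positive")
    # Drop the last `lag` predictors and the first `lag` targets so that
    # predictor[i] precedes target[i] by exactly `lag` steps.
    return predictor[:-lag], target[lag:]

queries = [10, 12, 15, 20, 18, 22, 30]   # e.g. daily search-query volume
occurrences = [3, 4, 6, 9, 8, 10, 14]    # e.g. daily case counts
x, y = lag_align(queries, occurrences, lag=2)
# x[i] is the search volume observed 2 days before y[i].
```

The appropriate lag per disease would be chosen from the data (for example, by where the cross-correlation between queries and occurrences peaks).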
It also used deep learning analysis, which yields high prediction performance, to increase the performance of infectious disease predictions. The results showed that, when selecting the optimal parameters, adding all input variables gave the highest explanatory power. This means that, by adding various data, it was possible to design a model with higher explanatory power. Moreover, the LSTM model results for scarlet fever indicate that it is possible to optimize a deep learning model by changing its parameters in various ways and, therefore, to design a prediction model that is better than existing prediction models. This study reviewed the factors involved in infectious disease occurrence using search query data and social media big data, which exist because of the development of the internet, as well as temperature and humidity weather data. It also constructed traditional prediction models, such as OLS and ARIMA, and deep learning prediction models, such as DNN and LSTM, and compared their prediction performance, confirming that the models that use deep learning are the most suitable for infectious disease prediction. Infectious disease prediction models that employ deep learning can be used to supplement current infectious disease surveillance systems and, at the same time, predict trends in infectious disease. If this can reduce the time differences in reporting systems so that infectious disease trends are known immediately, immediate responses to infectious disease will become possible and costs to society can be minimized. According to a study by Shin et al., an emerging infectious disease known as the Middle East respiratory syndrome (MERS) has a deep correlation with internet search data [20], and it will become possible to extend these methods to the real-time surveillance and prediction of emerging infectious diseases as well.
However, this study has three limitations: a relatively short data collection period, regionally combined predictions, and a narrow range of parameters considered in the deep learning models. The search query data collection period used in this research was relatively short, extending from 1 January 2016 to 29 July 2017, and the spatial ranges of the data were averaged across the whole of South Korea. If the data are expanded and the spatial ranges are subdivided, the models' performance should improve. In addition, although an effort was made to vary the DNN and LSTM model parameters and create a variety of prediction models, the deep learning prediction models used in this study did not cover all the prediction models that could be implemented; parameters such as the number of hidden layers and the batch size were not considered. Therefore, it is difficult to conclude that the most effective model was created. If more parameters are considered and more prediction models are built in future research, prediction performance can likely be increased. Supplementary Materials: the following are available online at http://www.mdpi.com/1660-4601/15/8/1596/s1. Table S1: the root mean squared error (RMSE) and prediction graphs of the top 10 deep neural network (DNN) and long short-term memory (LSTM) models for chickenpox. The seasonal autoregressive integrated moving average (ARIMA) model is denoted as ARIMA(p, d, q)(P, D, Q)_s, where p is the order of the autoregressive part, d is the order of the differencing, q is the order of the moving-average process, and s is the length of the seasonal cycle; (P, D, Q) is the seasonal part of the model. The numbers in parentheses indicate each deep learning model's optimizer, activation function, and number of epochs, respectively.
(Optimizer) 1: Adadelta, 2: Adagrad, 3: Adam, 4: Adamax, 5: Nadam, 6: RMSprop, 7: SGD; (activation function) 1: ELU, 2: ReLU, 3: SELU, 4: softplus; (number of epochs) 1: 400, 2: 600, 3: 800, 4: 1000. Table S2: the RMSE and prediction graphs of the top 10 DNN and LSTM models for scarlet fever. Table S3: the RMSE and prediction graphs of the top 10 DNN and LSTM models for malaria.

References:
Infectious disease, safety, state: history of infectious disease prevention and MERS situation
A profile of the online dissemination of national influenza surveillance data
Multiscale mobility networks and the spatial spreading of infectious diseases
Modeling the worldwide spread of pandemic influenza: baseline case and containment interventions
Seasonal transmission potential and activity peaks of the new influenza A(H1N1): a Monte Carlo likelihood analysis based on human mobility
Modelling disease outbreaks in realistic urban social networks
Strategies for mitigating an influenza pandemic
Controlling pandemic flu: the value of international air travel restrictions
Mitigation measures for pandemic influenza in Italy: an individual based model considering different scenarios
Monitoring pertussis infections using internet search queries
Disease surveillance based on internet-based linear models: an Australian case study of previously unmodeled infection diseases
Advances in nowcasting influenza-like illness rates using search query logs
Correlation between national influenza surveillance data and Google Trends in South Korea
Dynamic forecasting of Zika epidemics using Google Trends
Influenza forecasting with Google Flu Trends
Mass media and the contagion of fear: the case of Ebola in America
Monitoring hand, foot and mouth disease by combining search engine query data and meteorological factors
Assessing the usability of social media data for visitor monitoring in protected areas
Forecasting the onset and course of mental illness with Twitter data
High correlation of Middle East respiratory syndrome spread with Google search and Twitter trends in Korea
DEFENDER: detecting and forecasting epidemics using novel data-analytics for enhanced response
Applying GIS and machine learning methods to Twitter data for multiscale surveillance of influenza
Forecasting influenza-like illness dynamics for military populations using neural networks and social media
Twitter in the cross fire: the use of social media in the Westgate Mall terror attack in Kenya
Real-time diffusion of information on Twitter and the financial markets
Bibliographic analysis of Nature based on Twitter and Facebook altmetrics data
Frequent discussion of insomnia and weight gain with glucocorticoid therapy: an analysis of Twitter posts
Forecasting influenza in Hong Kong with Google search queries and statistical model fusion
Construction and evaluation of two computational models for predicting the incidence of influenza in Nagasaki Prefecture
Deep learning applications and challenges in big data analytics
Deep learning for digital pathology image analysis: a comprehensive tutorial with selected use cases
Dermatologist-level classification of skin cancer with deep neural networks
Deep learning based tissue analysis predicts outcome in colorectal cancer
Time series analyses of hand, foot and mouth disease integrating weather variables
Short term effects of weather on hand, foot and mouth disease
Weather and virological factors drive norovirus epidemiology: time-series analysis of laboratory surveillance data in England and Wales
Imported dengue cases, weather variation and autochthonous dengue incidence in Cairns
A large temperature fluctuation may trigger an epidemic erythromelalgia outbreak in China
Implications of temperature variation for malaria parasite development across
The impact of variations in temperature on early Plasmodium falciparum development in Anopheles stephensi
Mapping the epidemic changes and risks of hemorrhagic fever with renal syndrome in Shaanxi province
A threshold analysis of dengue transmission in terms of weather variables and imported dengue cases in Australia
Monitoring seasonal influenza epidemics in Korea through query search
Forecast model analysis for the morbidity of tuberculosis in Xinjiang, China
Time series analysis of dengue incidence in Guadeloupe, French West Indies: forecasting models using climate variables as predictors

The authors declare no conflict of interest.

key: cord-000759-36dhfptw authors: Uribe-Sánchez, Andrés; Savachkin, Alex title: Predictive and Reactive Distribution of Vaccines and Antivirals during Cross-Regional Pandemic Outbreaks date: 2011-06-05 journal: Influenza Res Treat doi: 10.1155/2011/579597 sha: doc_id: 759 cord_uid: 36dhfptw

As recently pointed out by the Institute of Medicine, the existing pandemic mitigation models lack dynamic decision support capability. We develop a large-scale simulation-driven optimization model for generating a dynamic predictive distribution of vaccines and antivirals over a network of regional pandemic outbreaks. The model incorporates measures of morbidity, mortality, and social distancing, translated into the cost of lost productivity and medical expenses. The performance of the strategy is compared to that of a reactive myopic policy, using a sample outbreak in Florida, USA, with an affected population of over four million. The comparison is implemented at different levels of vaccine and antiviral availability and administration capacity. Sensitivity analysis is performed to assess the impact of the variability of some critical factors on policy performance. The model is intended to support public health policy making for effective distribution of limited mitigation resources. As of July 2010, the WHO had reported 501 confirmed human cases of avian influenza A/(H5N1), which resulted in 287 deaths worldwide [1].
At the same time, the statistics for the H1N1 2009 outbreak have so far included 214 countries, with total reported numbers of infections and deaths of 419,289 and 18,239, respectively [2]. Today, an ominous expectation exists that the next pandemic will be triggered by a highly pathogenic virus to which there is little or no pre-existing immunity in humans [3]. The nation's ability to mitigate a pandemic influenza depends on the available emergency response resources and infrastructure, and, at present, challenges abound. Predicting the exact virus subtype remains a difficult task, and even when it is identified, reaching an adequate vaccine supply can currently take up to nine months [4, 5]. Even if the existing vaccines prove to be potent, their availability will be limited by high production and inventory costs [6, 7] and will also be constrained by the supply of antiviral drugs, healthcare providers, hospital beds, medical supplies, and logistics. Hence, pandemic mitigation will have to be done amidst limited availability of resources and supporting infrastructure. This challenge has been acknowledged by the WHO [7] and echoed by the HHS and CDC [8, 9].
The existing models on pandemic influenza (PI) containment and mitigation aim to address various complex aspects of the pandemic evolution process, including: (i) the mechanism of disease progression, from the initial contact and infection transmission to the asymptomatic phase, manifestation of symptoms, and the final health outcome [10] [11] [12]; (ii) the population dynamics, including individual susceptibility [13, 14] and transmissibility [10] [15] [16] [17], and behavioral factors affecting infection generation and the effectiveness of interventions [18] [19] [20]; and (iii) the impact of pharmaceutical and non-pharmaceutical measures, including vaccination [21] [22] [23], antiviral therapy [24] [25] [26], social distancing [27] [28] [29] [30] [31], travel restrictions, and the use of low-cost measures such as face masks and hand washing [26] [32] [33] [34]. Recently, the modeling efforts have focused on combining pharmaceutical and non-pharmaceutical interventions in search of synergistic strategies aimed at better resource utilization. Most of these approaches implement a form of social distancing followed by the application of pharmaceutical measures; for significant contributions in this area, see [33] [35] [36] [37] [38] [39] [40] [41]. One of the most notable efforts is a 2006-07 initiative by MIDAS [42], which cross-examined independent simulation models of PI spread in rural areas of Asia [43, 44], the USA and UK [45, 46], and the city of Chicago [47]. MIDAS cross-validated the models by simulating the city of Chicago, with 8.6M inhabitants, and implementing a targeted layered containment [48, 49]. The research findings of MIDAS and some other groups [12, 33] were used in a recent "Modeling Community Containment for Pandemic Influenza" report by the IOM to formulate a set of recommendations for PI mitigation [50]. These findings were also used in the pandemic preparedness guidance developed by the CDC [51].
At the same time, the IOM report [50] points out several limitations of the MIDAS models, observing that "because of the significant constraints placed on the models . . . the scope of models should be expanded." The IOM recommends "to adapt or develop decision-aid models that can . . . provide real-time feedback . . . and include the costs and benefits of intervention strategies." Our literature review yields a similar observation: most existing approaches focus on the assessment of a priori defined strategies, and virtually none of the models are capable of "learning," that is, adapting to changes in the pandemic progress, or even predicting them, in order to generate dynamic strategies. Such a strategy has the advantage of being developed dynamically, as the pandemic spreads, by selecting a mix of available mitigation options at each decision epoch, based on both the present state of the pandemic and its predicted evolution. In an attempt to address the IOM recommendations, we present a simulation optimization model for developing a predictive resource distribution over a network of regional outbreaks. The underlying simulation model mimics the disease and population dynamics of each of the affected regions (Sections 2.1 and 2.2). As the pandemic spreads from region to region, the optimization model distributes mitigation resources, including stockpiles of vaccines and antivirals and administration capacities (Section 2.3). The model seeks to minimize the impact of ongoing outbreaks and the expected impact of potential outbreaks, using measures of morbidity, mortality, and social distancing, translated into the cost of lost productivity and medical expenses. The methodology is calibrated and implemented on a sample outbreak in Florida, USA, with over 4M inhabitants (Section 3).
The strategy is compared to the reactive myopic policy, which allocates resources from one actual outbreak region to the next, each time trying to cover the entire regional population at risk, regardless of the resource availability. The comparison is done at different levels of vaccine and antiviral availability and administration capacity. We also present a sensitivity analysis assessing the impact of the variability of some critical factors, including: (i) antiviral efficacy, (ii) social distancing conformance, and (iii) CDC response delay. The objective of our methodology is to generate a progressive allocation of the total resource availability over a network of regional outbreaks. The methodology incorporates (i) a cross-regional simulation model, (ii) a set of single-region simulation models, and (iii) an embedded optimization model. We consider a network of regions, each of which is classified as unaffected, ongoing outbreak, or contained outbreak (Figure 1). The cross-regional simulation model connects the regions by air and land travel. The single-region simulation models mimic the population and disease dynamics of each ongoing region, impacted by intervention measures. The pandemic can spread from ongoing to unaffected regions via infectious travelers who pass through regional border control. At every new regional outbreak epoch, the optimization model allocates available resources to the new outbreak region (actual distribution) and to unaffected regions (virtual distribution). Daily statistics are collected for each ongoing region, including the number of infected, deceased, and quarantined cases in different age groups. Once a regional outbreak is contained, its societal and economic costs are calculated. In Sections 2.1-2.3, we present the details of the simulation and optimization models. A testbed illustration and a comparison of our strategy to the myopic policy are given in Section 3.
A schematic of the cross-regional simulation model is shown in Figure 2. The model is initialized by creating population entities and mixing groups for each region. A pandemic is started by an infectious case injected into a randomly chosen region. The details of the resulting regional contact dynamics and infection transmission are given in Section 2.2. As the infected cases start seeking medical help, a new regional outbreak is detected; a resource distribution is then determined and returned to the single-region model. The single-region model subsumes the following components (see Figure 3): (i) population dynamics (mixing groups and schedules), (ii) the contact and infection process, (iii) the disease natural history, and (iv) mitigation strategies, including social distancing, vaccination, and antiviral application. The model collects detailed statistics, including the number of infected, recovered, deceased, and quarantined cases in different age groups. For a contained outbreak, its societal and economic costs are calculated: the societal cost includes the cost of lost lifetime productivity of the deceased; the economic cost includes the cost of medical expenses of the recovered and deceased and the cost of lost productivity of the quarantined [52]. Each region is modeled as a set of population centers formed by mixing groups, that is, places where individuals come into contact with each other during the course of their social interaction. Examples of mixing groups include households, offices, schools, universities, shopping centers, and entertainment centers [53]. Each individual is assigned a set of attributes, such as age, gender, parenthood, workplace, infection susceptibility, and probability of travel, among others.
Each person is also assigned time-discrete (e.g., δt = 1 hour) weekday and weekend schedules, which depend on (i) the person's age, parenthood, and employment status, (ii) disease status, (iii) travel status, and (iv) the person's compliance with social distancing decrees [54]. As their schedules advance, the individuals circulate throughout the mixing groups and come into contact with each other (see Section 2.2.2). It is assumed that at any point in time an individual belongs to one of the following compartments (see Figure 4): susceptible, contacted (by an infectious individual), infected (asymptomatic or symptomatic), and recovered/deceased. In what follows, we present the infection transmission and disease natural history model, which delineates the transitions between the above compartments. Infection transmission occurs during contact events between susceptible and infectious cases, which take place in the mixing groups. At the beginning of every δt period (e.g., one hour), for each mixing group g, the simulation tracks the total number of infectious cases, n_g, present in the group. It is assumed that each infectious case generates r_g new contacts per δt unit of time [46], chosen randomly (uniformly) from the pool of susceptibles present in the group. We also assume that (i) during a δt period, a susceptible may come into contact with at most one infectious case, and (ii) each contact exposure lasts δt units of time. Once a susceptible has started her contact exposure at time t, she will develop infection at time t + δt with a certain probability, calculated as shown below. Let L_i(t) be a nonnegative continuous random variable representing the duration of contact exposure, starting at time t, required for susceptible i to become infected. We assume that L_i(t) is distributed exponentially with mean 1/λ_i(t), where λ_i(t) represents the instantaneous force of infection applied to susceptible i at time t [55] [56] [57].
The probability that susceptible i, whose contact exposure has started at time t, will develop infection at time t + δt is then given as

P{L_i(t) ≤ δt} = 1 − exp(−λ_i(t) δt).

A schematic of the disease natural history is shown in Figure 5. During the incubation phase, the infected case stays asymptomatic. At the end of the latency phase, she enters the infectious phase [44, 46, 48], and she becomes symptomatic at the end of the incubation period. At the end of the infectious phase, she enters the period leading to a health outcome, which culminates in her recovery or death. Mortality for influenza-like diseases is a complex process affected by many factors and variables, most of which have limited accurate data support from past pandemics. Furthermore, the time of death can sometimes be weeks after the disease episode (often attributable to pneumonia-related complications [58]). Because of the uncertainty underlying the mortality process, we adopted an age-based form of the mortality probability of infected i, as follows:

p_i = μ_i (1 − ρ_i τ),

where μ_i is the age-dependent base mortality probability of infected i, ρ_i is her status of antiviral therapy (0 or 1), and τ is the antiviral efficacy, measured as the relative decrease in the base probability [44]. We assume that a recovered case develops full immunity but continues circulating in the region. Mitigation is initiated upon detection of a critical number of confirmed infected cases [59], which triggers resource distribution and deployment. The model incorporates a certain delay for deploying field responders. Pharmaceutical intervention (PHI) includes vaccination and antiviral application. Vaccination is targeted at individuals at risk [60] to reduce their infection susceptibility. The vaccine takes a certain period to become effective [61]. Vaccination is constrained by the allocated stockpile and administration capacity, measured in terms of immunizer-hours.
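The infection and mortality probabilities above translate directly into per-step Monte Carlo draws in the simulation. A minimal sketch of those two formulas (the numeric parameter values are hypothetical, chosen only for illustration):

```python
import math
import random

def infection_probability(force_of_infection, dt):
    """P{exposure of length dt causes infection} = 1 - exp(-lambda * dt)."""
    return 1.0 - math.exp(-force_of_infection * dt)

def mortality_probability(base_mortality, on_antiviral, antiviral_efficacy):
    """Age-based mortality mu_i, reduced by the relative factor tau on antivirals."""
    rho = 1 if on_antiviral else 0
    return base_mortality * (1.0 - rho * antiviral_efficacy)

# Hypothetical values: force of infection 0.2 per hour, 1-hour exposure,
# 2% base mortality, antiviral efficacy 0.3 (a 30% relative reduction).
p_inf = infection_probability(0.2, 1.0)
p_die_treated = mortality_probability(0.02, True, 0.3)

# A single Monte Carlo transition, as the simulation would draw it each step.
infected = random.random() < p_inf
```

Because the exposure time is exponential, the per-step infection probability depends only on λ_i(t) and the step length δt, which is what lets the simulation advance all susceptibles in hourly increments.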
We assume that as some symptomatic cases seek medical help [62, 63], those of them at risk will receive an antiviral. The process is constrained by the allocated stockpile and administration capacity, measured in terms of the number of certified providers. Both vaccination and antiviral application are affected by a number of sociobehavioral factors, including conformance of the target population, degree of risk perception, and compliance of healthcare personnel [64] [65] [66]. The conformance level of the population at risk can be affected, among other factors, by demographics and income level [67] [68] [69] [70] [71], as well as by the quality of the public information available [54]. The degree of risk perception can be influenced by negative experience developed during previous pharmaceutical campaigns [72, 73], as well as by public fear and rumors [74, 75]. Non-pharmaceutical intervention (NPI) includes social distancing and travel restrictions. We adopted a CDC guidance [51], which establishes five categories of pandemic severity and recommends quarantine and closure options according to the category. The categories are determined based on the value of the case fatality ratio (CFR), the proportion of fatalities in the total infected population. For CFR values lower than 0.1% (category 1), voluntary at-home isolation of infected cases is implemented. For CFR values between 0.1% and 1.0% (categories 2 and 3), in addition to at-home isolation, the following measures are recommended: (i) voluntary quarantine of household members of infected cases and (ii) child and adult social distancing. For CFR values exceeding 1.0% (categories 4 and 5), all the above measures are implemented. As the effectiveness of social distancing is affected by some of the behavioral factors listed above [54], we assume a certain social distancing conformance level.
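The CFR thresholds above amount to a simple severity lookup in the simulation. A minimal sketch (the function name and measure labels are illustrative; the thresholds, in percent, follow the guidance described above):

```python
def npi_measures(case_fatality_ratio):
    """Map a CFR (in percent) to a severity category and the measures above."""
    measures = ["at-home isolation of infected cases"]
    if case_fatality_ratio < 0.1:
        category = 1
    elif case_fatality_ratio <= 1.0:
        category = 2  # categories 2-3; modeled identically in this sketch
        measures += ["household quarantine", "child and adult social distancing"]
    else:
        category = 4  # categories 4-5; all the above measures implemented
        measures += ["household quarantine", "child and adult social distancing"]
    return category, measures

cat_mild, m_mild = npi_measures(0.05)     # below 0.1%: isolation only
cat_severe, m_severe = npi_measures(2.0)  # above 1.0%: all measures
```

In the full model, each returned measure would further be attenuated by the assumed social distancing conformance level before it affects the contact process.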
travel restrictions considered in the model include regional air and land border control for infected travelers. as shown in figure 2, the optimization model is invoked at the beginning of every nth new regional outbreak epoch (n = 1, 2, . . .), starting from the initial outbreak region (n = 1). the objective of the model is to allocate some of the available mitigation resources to the new outbreak region (the actual distribution) while reserving the rest of the quantities for potential outbreak regions (the virtual distribution). by doing so, the model seeks to progressively minimize the impact of ongoing outbreaks and the expected impact of potential outbreaks spreading from the ongoing locations. mitigation resources can include stockpiles of vaccines and antivirals, administration capacity, hospital beds, medical supplies, and social distancing enforcement resources, among others. the predictive mechanism of the optimization model is based on a set of regression equations obtained using single-region simulation models. in what follows, we present the construction of the optimization model and explain the solution algorithm for the overall simulation-based optimization methodology. we introduce the following general terminology and notation. the optimization criterion (objective function) of the model incorporates measures of the expected societal and economic costs of the pandemic: the societal cost includes the cost of lost lifetime productivity of the deceased; the economic cost includes the cost of medical expenses of the recovered and deceased and the cost of lost productivity of the quarantined.
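the actual-versus-virtual allocation trade-off described above can be illustrated with a deliberately simplified, single-resource, two-region brute-force search; the real model optimizes over several resources and many candidate regions, so everything below (function name, cost functions, discretization step) is an illustrative assumption.

```python
def distribute_resource(total, cost_new, cost_unaffected, p_outbreak, step=1.0):
    """brute-force sketch of a single-resource allocation: choose q for the
    new outbreak region (actual) and reserve the rest for one unaffected
    region (virtual), minimizing

        cost_new(q) + cost_unaffected(total - q) * p_outbreak

    subject to 0 <= q <= total."""
    best = None
    q = 0.0
    while q <= total + 1e-9:
        obj = cost_new(q) + cost_unaffected(total - q) * p_outbreak
        if best is None or obj < best[0]:
            best = (obj, q, total - q)
        q += step
    return best  # (objective value, actual allocation, virtual allocation)
```

with costs decreasing in the allocated quantity and an outbreak probability below one, the search naturally favors the ongoing outbreak while still reserving part of the stockpile.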
to compute these costs, the following impact measures of morbidity, mortality, and quarantine are used for each region k. to estimate these measures, we use regression models obtained using a single-region simulation of each region k, in which δ_i denotes the regression coefficient associated with resource i and δ_im is the regression coefficient for the interaction between resources i and m. similar models are used for y_hk, d_hk, and v_hk. the above relationships between the impact measures and the resource distributions ought to be determined a priori, before implementing a cross-regional scenario (see section 3). here, we consider each region k as the initial outbreak region. we assume, however, that as the pandemic evolves, the disease infectivity will naturally subside. hence, the regression equations need to be re-estimated at every new outbreak epoch, for each region k ∈ c_n, using the single-region simulation models, where each simulation must be initialized to the current outbreak status of region k in the cross-regional simulation. as an alternative to the computationally burdensome approach of re-estimating the regression equations, a modeler may choose to use a certain decay factor α_n [76] to adjust the estimates of the regional impact measures at every nth outbreak epoch. in addition, we use a regression model to estimate the probability of pandemic spread from affected region l to unaffected region k, as a function of the resources allocated to region l, which, in turn, impact the number of outgoing infectious travelers from the region; here γ_i denotes the regression coefficient associated with resource i, γ_im is the regression coefficient associated with the interaction between resources i and m, and γ_0 represents the intercept. consequently, the total outbreak probability for unaffected region k can be found as p_k = Σ_{l∈b_n} p_lk.
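a sketch of the second-order regression predictors described above, with main-effect coefficients (the δ_i or γ_i of the text) and pairwise interaction coefficients (δ_im, γ_im); the multiplicative form of the decay-factor adjustment is an assumption here, since the original adjustment equation is not reproduced in this excerpt.

```python
def predict_impact(q, coef, coef_int, intercept=0.0):
    """second-order regression estimate of a regional impact measure
    (e.g. number of infected) from the resource allocation q[i].

    coef[i]         -- main-effect coefficient for resource i
    coef_int[i, m]  -- interaction coefficient for resources i and m
    """
    y = intercept + sum(coef[i] * q[i] for i in coef)
    y += sum(c * q[i] * q[m] for (i, m), c in coef_int.items())
    return y

def decayed_estimate(y, alpha, n):
    """adjust an impact estimate at the nth outbreak epoch with a decay
    factor alpha in (0, 1] (multiplicative form assumed) instead of
    re-running the single-region simulations."""
    return y * alpha ** (n - 1)
```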
as in the case of the impact measures, the estimates of the regional outbreak probabilities need to be progressively re-estimated or adjusted using a scheme similar to (4). finally, we calculate the total cost of an outbreak in region k at the nth decision epoch (equation (7)), where m_h is the total medical cost of an infected case in age group h over his/her disease period, w_h is the total cost of lost wages of an infected case in age group h over his/her disease period, w_h is the cost of lost lifetime wages of a deceased case in age group h, and w_h is the daily cost of lost wages of a non-infected case in age group h who complies with quarantine. the model. the optimization model has the following form: minimize tc_n_j (q_1j, q_2j, . . . , q_rj) + Σ_{s∈c_n} tc_n_s (q_1s, q_2s, . . . , q_rs) · p_n_s, subject to resource availability constraints. the first term of the objective function represents the total cost of the new outbreak j, estimated at the nth outbreak epoch, based on the actual resource distribution {q_1j, q_2j, . . . , q_rj} (see (7)). the second term represents the total expected cost of outbreaks in currently unaffected regions, based on the virtual distributions {q_1s, q_2s, . . . , q_rs} (7) and the regional outbreak probabilities p_n_s (6). the set of constraints assures that for each resource i, the total quantity allocated (actual and virtual, both nonnegative) does not exceed the total resource availability at the nth decision epoch. note that both the objective function and the availability constraints are nonlinear in the decision variables. the solution algorithm proceeds as follows. (1) estimate the regression equations for each region using the single-region simulation model. (2) begin the cross-regional simulation model. (4) select at random the initial outbreak region j; set n = 1. (c) re-estimate the regression equations for each region k ∈ b_n ∪ c_n using the single-region simulations, where each simulation is initialized to the current outbreak status of the region (alternatively, use (4) and (6)).
(d) solve the resource distribution model for region j. (e) update the total resource availabilities. (10) calculate the total cost for each contained region and update the overall pandemic cost. to illustrate the use of our methodology, we present a sample h5n1 outbreak scenario including four counties in fla, usa: hillsborough, miami dade, duval, and leon, with populations of 1.0, 2.2, 0.8, and 0.25 million people, respectively. a basic unit of time for the population and disease dynamics models was taken to be δt = 1 hour. regional simulations were run for a period (up to 180 days) until the daily infection rate approached near zero (see section 3.3). below, we present the details of selecting the model parameter values. most of the testbed data can be found in the supplement [77]. demographic and social dynamics data for each region [77] were extracted from the u.s. census [78] and the national household travel survey [79]. daily (hourly) schedules [77] were adopted from [53]. each infected person was assigned a daily travel probability of 0.24% [79], of which 7% was by air and 93% by land. the probabilities of travel among the four regions were calculated using traffic volume data [80] [81] [82] [83]; see table 1. infection detection probabilities at border control for symptomatic cases were assumed to be 95% and 90%, for air and land, respectively [84]. the instantaneous force of infection applied to contact i at time t ((1), [57]) was modeled as α_i (1 − θ_i(t) δ), where α_i is the age-dependent base instantaneous infection probability of contact i, θ_i(t) is her status of vaccination at time t (0 or 1), and δ is the vaccine efficacy, measured as the reduction in the base instantaneous infection probability (achieved after 10 days [61]). the values of the age-dependent base instantaneous infection probabilities were adopted from [46] (see table 2).
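the vaccination-adjusted force of infection described above, α_i (1 − θ_i(t) δ), in code; the exponential accumulation of infection probability over an exposure interval is a common modeling choice and an assumption here, not necessarily the exact form used in the paper.

```python
import math

def instantaneous_infection_probability(alpha_i, theta_i, delta):
    """base probability alpha_i reduced by vaccination (theta_i in {0, 1})
    with efficacy delta: alpha_i * (1 - theta_i * delta)."""
    return alpha_i * (1.0 - theta_i * delta)

def infection_by_time(alpha_i, theta_i, delta, hours):
    """probability of developing infection over an exposure of the given
    duration, assuming exponential accumulation of the instantaneous
    force of infection (a modeling assumption, see lead-in)."""
    lam = instantaneous_infection_probability(alpha_i, theta_i, delta)
    return 1.0 - math.exp(-lam * hours)
```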
the disease natural history included a latent period of 29 hours (1.21 days), an incubation period of 46 hours (1.92 days), an infectiousness period from 29 to 127 hours (1.21 to 5.29 days), and a period leading to a health outcome from 127 to 240 hours (5.29 to 10 days) [85]. base mortality probabilities (μ_i in (2)) were found using the statistics recommended by the working group on pandemic preparedness and influenza response [52]. these data show the percentage of mortality for age-based high-risk cases (hrc) (table 3, columns 1-3). mortality probabilities (column 4) were estimated under the assumption that high-risk cases are expected to account for 85% of the total number of fatalities, for each age group [52]. single-region simulation models were calibrated using two common measures of pandemic severity [35, 45, 46]: the basic reproduction number (r_0) and the infection attack rate (iar). r_0 is defined as the average number of secondary infections produced by a typical infected case in a totally susceptible population. iar is defined as the ratio of the total number of infections over the pandemic period to the size of the initial susceptible population. to determine r_0, all infected cases inside the simulation were classified by generation of infection, as in [33, 43]. the value of r_0 was calculated as the average reproduction number of a typical generation in the early stage of the pandemic, with no interventions implemented (the baseline scenario) [33]. historically, r_0 values for pandemic influenza (pi) ranged between 1.4 and 3.9 [37, 43]. to attain similar values, we calibrated the hourly contact rates of the mixing groups [77] (original rates were adopted from [46]). for the four regions, the average baseline value of r_0 was 2.54, which represents a high-transmissibility scenario. the values of the regional baseline iar averaged 0.538. mitigation resources included stockpiles of vaccines and antivirals and administration capacities (section 3.4).
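the two calibration measures can be computed from simulation output as follows; the generation-bookkeeping data structure is an assumption, since the paper classifies cases by generation of infection [33, 43] without specifying the implementation.

```python
def infection_attack_rate(total_infected, initial_susceptible):
    """iar: total infections over the pandemic period divided by the size
    of the initial susceptible population."""
    return total_infected / initial_susceptible

def reproduction_number(secondary_by_generation, early_generations=3):
    """estimate r0 as the average number of secondary infections per case
    over the early generations of a baseline (no-intervention) run.

    secondary_by_generation[g] is the list of secondary-infection counts
    for the cases in generation g."""
    counts = [c for gen in secondary_by_generation[:early_generations]
              for c in gen]
    return sum(counts) / len(counts)
```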
a 24-hour delay was assumed for the deployment of resources and field responders [59]. the vaccination risk group included healthcare providers [66] and individuals younger than 5 years (excluding those younger than 12 months) and older than 65 years [60]. the risk group for the antiviral included symptomatic individuals below 15 years and above 55 years [60, 86]. the efficacy levels for the vaccine (δ in (9)) and the antiviral (τ in (2)) were assumed to be 40% [44, 87] and 70%, respectively. we did not consider the use of the antiviral for a mass prophylactic reduction of infection susceptibility, due to the limited antiviral availability [9] and the risk of emergence of antiviral-resistant transmissible virus strains [26]. we assumed a 90% target population conformance for both vaccination and antiviral treatment [64]. the immunity development period for the vaccine was taken as 10 days [61]. a version of the cdc guidance for quarantine and isolation for category 5 was implemented (section 2.2.4, [51]). once the reported cfr value had reached 1.0%, the following policy was declared and remained in effect for 14 days [51]: (i) individuals below a certain age ξ (22 years) stayed at home during the entire policy duration, (ii) of the remaining population, a certain proportion φ [88] stayed at home and was allowed a one-hour leave, every three days, to buy essential supplies, and (iii) the remaining (1 − φ) noncompliant proportion followed a regular schedule. all testbed scenarios considered a quarantine conformance level φ equal to 80% [54]. an outbreak was considered contained if the daily infection rate did not exceed five cases for seven consecutive days. once contained, a region was simulated for an additional 10 days for accurate estimation of the pandemic statistics.
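the containment criterion stated above (daily infection rate at most five cases for seven consecutive days) translates directly into a check over the daily infection series; the function name is a hypothetical helper.

```python
def outbreak_contained(daily_infections, threshold=5, window=7):
    """true if the daily infection rate has stayed at or below `threshold`
    cases for `window` consecutive days at some point in the series."""
    run = 0
    for cases in daily_infections:
        run = run + 1 if cases <= threshold else 0
        if run >= window:
            return True
    return False
```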
a 2^5 statistical design of experiments [89] was used to estimate the regression coefficient values of the significant decision factors and their interactions (see section 2.3; the values of adjusted r^2 ranged from 96.36% to 99.97%). the simulation code was developed in c++. the running time for a cross-regional simulation replicate involving over four million inhabitants was between 17 and 26 minutes (depending on the initial outbreak region, with a total of 150 replicates) on a pentium 3.40 ghz with 4.0 gb of ram. the performance of the dpo and myopic policies was compared at different levels of resource availability. table 4 summarizes the total vaccine and antiviral requirements for each region, based on the composition of the regional risk groups (see section 3.3). table 5 shows the per capita costs of lost productivity and medical expenses, which were adopted from [52] and adjusted for inflation to the year 2010 [90]. the comparison of the two strategies is done at levels of 20%, 50%, and 80% of the total resource requirement shown in table 4. figures 6(a) and 6(b) show the policy comparison in the form of 95% confidence intervals (ci) for the average numbers of infected and deceased, respectively. figure 7 shows the policy comparison using the 95% ci for the average total pandemic cost, calculated using the pandemic statistics and the per capita costs from table 5. for illustrative purposes, we also show the average number of regional outbreaks, for each policy, at different levels of resource availability, in the testbed scenario involving four regions, with hillsborough as the initial outbreak region (table 6). it can be observed that the values of all impact measures exhibit a downward trend, for both the dpo and myopic policies, as the total resource availability increases from 20% to 80%.
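the 2^5 design of experiments mentioned above enumerates every combination of low and high levels of the five decision factors; a generic generator (hypothetical helper, using coded levels -1/+1) is:

```python
import itertools

def two_level_design(n_factors):
    """full 2^k factorial design: every combination of low (-1) and
    high (+1) coded levels of the decision factors."""
    return list(itertools.product((-1, 1), repeat=n_factors))
```

for five factors this yields the 32 runs over which the regression coefficients and interactions are estimated.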
an increased total resource availability not only helps alleviate the pandemic impact inside the ongoing regions but also reduces the probability of spread to the unaffected regions. for both policies, as the total resource availability approaches the total resource requirement (starting from approximately 60%), the impact numbers show a converging behavior, whereby the marginal utility of additional resource availability diminishes. this behavior can be explained by noting that the total resource requirements were determined assuming the worst-case scenario, in which all (four) regions would be affected and would have to be provided with enough resources to cover their respective regional populations at risk. it can also be seen that, on average, the dpo policy outperforms the myopic approach at all levels, which attests to a more efficient resource utilization achieved by the dpo policy (see also table 6). the difference in policy performance is particularly noticeable at the lower levels of resource availability, and it gradually diminishes as the resource availability increases and becomes close to sufficient to cover the entire populations at risk in all regions. it can also be noted that the variability in the performance of the dpo strategy is generally smaller than that of the myopic policy. in general, for both strategies, the performance variability decreases with higher availability of resources. in this section, we assess the marginal impact of variability in some of the critical factors. the impact was measured separately by the change in the total pandemic cost and the number of deaths (averaged over multiple replicates) resulting from a unit change in a decision factor value, one factor at a time. the factors under consideration included: (i) antiviral efficacy, (ii) social distancing conformance, and (iii) cdc response delay. we used all four regions, separately, as initial outbreak regions for each type of sensitivity analysis.
the results (patterns) were rather similar. due to limited space, we have opted to show the results for only one initial region, chosen arbitrarily, for each of the three types of sensitivity studies. while duval county was selected as the initial outbreak region to show the sensitivity results on antiviral efficacy, hillsborough and miami dade were used as the initial regions to show the results on, respectively, social distancing conformance and cdc response delay. figure 8 depicts the sensitivity of the average total cost and the average total deaths to antiviral efficacy values between 0% and 80%. as expected, for both policies, the curves for the average number of deaths exhibit a decreasing trend, which is almost linear for values of τ between 0% and 40%. as the value of τ approaches 70%, the curves start to exhibit a converging behavior. the curves for the average total pandemic cost exhibit a similar pattern for both policies. it can be noted that the performance of the two policies is nearly identical for low antiviral efficacy (between 0% and 30%). however, the performance of the dpo policy improves consistently as τ increases, which can be attributed to a more discretionary allocation of the antiviral stockpile by the dpo policy. reduction of the contact intensity through quarantine and social distancing has proven to be one of the most effective containment measures, especially in the early stages of a pandemic [27, 30, 31, 41]. figure 9 shows the sensitivity of the average total cost and the average total deaths to the social distancing conformance ranging between 60% and 80%. we observed that for both impact measures, the dpo policy demonstrated a better performance, with the difference ranging from $3b to $26b in the total cost and from 1,400 to 20,000 in the number of fatalities. the biggest difference in performance was achieved at the lower-to-medium levels of conformance (between 65% and 72%).
as the conformance level approached 80%, the dominating impact of social distancing masked the effect of the better utilization of vaccines and antivirals achieved by the dpo strategy. the cdc response delay corresponds to the interval of time from the moment an outbreak is detected to the complete deployment of mitigation resources. depending on the disease infectivity, the cdc response delay may represent one of the most critical factors in the mitigation process. figure 10 (sensitivity analysis for cdc response delay) shows how the performance of both policies was significantly impacted by this factor. the dpo policy showed a uniformly better performance, with the difference ranging between $3b and $4b in the average total cost, and between 800 and 1,800 in the average number of mortalities, over the range of delays considered (starting from 24 hours). as recently pointed out by the iom, the existing models for pi mitigation fall short of providing dynamic decision support which would incorporate "the costs and benefits of intervention" [50]. in this paper, we present a large-scale simulation optimization model which attempts to fill this gap. the model supports dynamic predictive resource distribution over a network of regions exposed to the pandemic. the model aims to balance both the ongoing and potential outbreak impact, which is measured in terms of morbidity, mortality, and social distancing, translated into the cost of lost productivity and medical expenses. the model was calibrated using historic pandemic data and compared to a myopic strategy, using a sample outbreak in fla, usa, with over 4 million inhabitants. summary of the main results. in the testbed scenario, for both strategies, the marginal utility of additional resource availability was found to be diminishing as the total resource availability approached the total requirement. in the testbed scenario, the dpo strategy on average outperformed the myopic policy.
as opposed to the dpo strategy, the myopic policy is reactive, rather than predictive, as it allocates resources regardless of the remaining availability and the overall cross-regional pandemic status. in contrast, the dpo model distributes resources trying to balance the impact of actual outbreaks and the expected impact of potential outbreaks. it does so by exploiting region-specific effectiveness of mitigation resources and a dynamic reassessment of pandemic spread probabilities, using a set of regression submodels. hence, we believe that in scenarios involving regions with more heterogeneous demographics, the dpo policy is likely to perform even better, and with less variability, than the myopic strategy. we also note that the difference in model performance was particularly noticeable at lower levels of resource availability, which is in accordance with the higher marginal utility of additional availability at those levels. we thus believe that the dpo model can be particularly useful in scenarios with very limited resources. contributions of the paper. the simulation optimization methodology presented in this paper is one of the first attempts to offer dynamic predictive decision support for pandemic mitigation which incorporates measures of societal and economic costs. our comparison study of the dpo versus myopic cross-regional resource distribution is also novel. additionally, our simulation model is one of the first of its kind to consider a broader range of social behavioral aspects, including vaccination and antiviral treatment conformance. the simulation features a flexible design which can be particularized to a broader range of phi and npi and even more granular mixing groups. we also developed a decision-aid simulator which is made available to the general public through our web site at http://imse.eng.usf.edu/pandemics.aspx.
the tool is intended to assist public health decision makers in implementing what-if analysis for assessment of mitigation options and development of policy guidelines. examples of such guidelines include vaccine and antiviral risk groups, social distancing policies (e.g., thresholds for declaration/lifting and closure options), and travel restrictions. limitations of the model. lack of reliable data prevented us from considering geo-spatial aspects of mixing group formation. we also did not consider the impact of public education and the use of personal protective measures (e.g., face masks) on transmission, again due to a lack of effectiveness data [91] . we did not study the marginal effectiveness of individual resources due to a considerable uncertainty about the transmissibility of an emerging pandemic virus and efficacy of vaccine and antiviral. for the same reason, the vaccine and antiviral risk groups considered in the testbed can be adjusted, as different prioritization schemes have been suggested. the form of social distancing implemented in the testbed can also be modified as a variety of schemes can be found in the literature, including those based on geographical and social targeting. effectiveness of these approaches is substantially influenced by the compliance factor, for which limited accurate data support exists. it will thus be vital to gather the most detailed data on the epidemiology of a new virus and the population dynamics early in the evolution of a pandemic, and expeditiously analyze the data to adjust the interventions accordingly. 
cumulative number of confirmed human cases of avian influenza a(h5n1) reported to who
world health organization
spanish flu pandemic
influenza and the global vaccine supply
vaccine production
pandemic (h1n1) vaccine deployment
pandemic influenza preparedness and response
preparing for pandemic influenza
department of health & human services
towards a quantitative understanding of the within-host dynamics of influenza a infections
initial human transmission dynamics of the pandemic (h1n1) 2009 virus in north america
quantifying the routes of transmission for pandemic influenza
little evidence for genetic susceptibility to influenza a (h5n1) from family clustering data
estimating variability in the transmission of severe acute respiratory syndrome to household contacts in hong kong, china
the transmissibility and control of pandemic influenza a (h1n1) virus
detecting human-to-human transmission of avian influenza a (h5n1)
a bayesian mcmc approach to study transmission of influenza: application to household longitudinal data
the role of the airline transportation network in the prediction and predictability of global epidemics
coupled contagion dynamics of fear and disease: mathematical and computational explorations
invited commentary: challenges of using contact data to understand acute respiratory disease transmission
optimal vaccination policies for stochastic epidemics among a population of households
optimal vaccination strategies for a community of households
repeated influenza vaccination of healthy children and adults: borrow now, pay later?
a population-dynamic model for evaluating the potential spread of drug-resistant influenza virus infections during community-based use of antivirals
economic analysis of influenza vaccination and antiviral treatment for healthy working adults
antiviral resistance and the control of pandemic influenza
a small community model for the transmission of infectious diseases: comparison of school closure as an intervention in individual-based models of an influenza pandemic
measures against transmission of pandemic h1n1 influenza in japan in 2009: simulation model
responding to simulated pandemic
simulation suggests that rapid activation of social distancing can arrest epidemic development due to a novel strain of influenza
analysis of the effectiveness of interventions used during the 2009 a/h1n1 influenza pandemic
on estimation of vaccine efficacy using validation samples with selection bias
targeted social distancing design for pandemic influenza
living with influenza: impacts of government imposed and voluntarily selected interventions
containing pandemic influenza with antiviral agents
reducing the impact of the next influenza pandemic using household-based public health interventions
transmissibility of 1918 pandemic influenza
finding optimal vaccination strategies for pandemic influenza using genetic algorithms
modeling the worldwide spread of pandemic influenza: baseline case and containment interventions
using influenza-like illness data to reconstruct an influenza outbreak
flute, a publicly available stochastic influenza epidemic simulation model
models of infectious disease agent study
strategies for containing an emerging influenza pandemic in southeast asia
containing pandemic influenza at the source
strategies for mitigating an influenza pandemic
mitigation strategies for pandemic influenza in the united states
modelling disease outbreaks in realistic urban social networks
modeling targeted layered containment of an influenza pandemic in the united states
report from the models of infectious disease agent study (midas) steering committee
modeling community containment for pandemic influenza: a letter report
interim pre-pandemic planning guidance: community strategy for pandemic influenza mitigation in the united states
the economic impact of pandemic influenza in the united states: priorities for intervention
a large scale simulation model for assessment of societal risk and development of dynamic mitigation strategies
pandemic influenza: quarantine, isolation and social distancing
spatial considerations for the allocation of pre-pandemic influenza vaccination in the united states
statistical models and methods for lifetime data
mathematical epidemiology of infectious diseases: model building, analysis and interpretation
deaths from bacterial pneumonia during 1918-19 influenza pandemic
cdc influenza operational plan
who guidelines on the use of vaccine and antivirals during influenza pandemics
influenza a(h1n1) monovalent vaccine
public response to community mitigation measures for pandemic influenza
precautionary behavior in response to perceived threat of pandemic influenza
the immediate psychological and occupational impact of the 2003 sars outbreak in a teaching hospital
the psychosocial effects of being quarantined following exposure to sars: a qualitative study of toronto health care workers
influenza vaccination of health-care personnel
confidence in vaccination: a parent model
parental decisionmaking for the varicella vaccine
exploring hepatitis b vaccination acceptance among young men who have sex with men: facilitators and barriers
hepatitis b vaccine acceptance among adolescents and their parents
why do parents decide against immunization? the effect of health beliefs and health professionals
reassessment of the association between guillain-barré syndrome and receipt of swine influenza vaccine in 1976-1977: results of a two-state study
psychosocial determinants of immunization behavior in a swine influenza campaign
the fear factor
doctors swamped by swine flu vaccine fears
a simulation based learning automata framework for solving semi-markov decision problems under long run average reward
supplemental data and model parameter values for cross-regional simulation-based optimization testbed
american community survey
national household travel survey (nths)
tampa international airport: daily traffic volume data
miami international airport: daily traffic volume data
miami international airport: daily traffic volume data
tallahassee regional airport: daily traffic volume data
how thermal-imaging cameras spot flu fevers
writing committee of the world health organization (who)
antivirals for pandemic influenza: guidance on developing a distribution and dispensing program
safety and immunogenicity of an inactivated subvirion influenza a (h5n1) vaccine
attitudes toward the use of quarantine in a public health emergency in four countries
design and analysis of experiments
inflation calculator
nonpharmaceutical interventions for pandemic influenza, national and community measures

the authors would like to acknowledge with thanks the many helpful suggestions made by professor yiliang zhu, department of epidemiology and biostatistics at the university of south florida, tampa, fla, usa.
key: cord-142398-glq4mjau authors: dhar, abhishek title: a critique of the covid-19 analysis for india by singh and adhikari date: 2020-04-11 journal: nan doi: nan sha: doc_id: 142398 cord_uid: glq4mjau in a recent paper (arxiv:2003.12055), singh and adhikari present results of an analysis of a mathematical model of epidemiology based on which they argue that a 49-day lockdown is required for containing the pandemic in india. we assert that, as a model study with the stated assumptions, the analysis presented in the paper is perfectly valid; however, any serious comparison with real data and attempts at prediction from the model are highly problematic. the main point of the present paper is to convincingly establish this assertion while providing a warning that the results and analysis of such mathematical models should not be taken at face value and need to be used with great caution by policy makers. a large number of recent papers have analyzed mathematical models, of varying degrees of complexity, in an attempt to understand and sometimes make predictions about the growth and spread of covid-19 across the world. such models often come with serious limitations arising from several factors, such as: (i) compartmentalization of the actual degrees of freedom to make the analysis simpler, (ii) too many parameters whose values are known with high degrees of uncertainty, and (iii) inherent nonlinearities in the governing equations which make long-time predictions difficult. nevertheless, there have been papers which make (or appear to make) definitive predictions, and many of these end up generating a lot of media hype and getting wide attention on social media. one real risk is that this is actually noticed and used by policy makers. in the present note, we analyze the paper by singh and adhikari, which we believe is in this class.
in particular, the paper appears to make the claim that a 49-day lockdown is required to effectively control the growth of the covid-19 pandemic in india. in this note, we explain why this claim should not be taken seriously. at the outset we make it clear that the present comment does not question the technical validity of the results in [1]. however, the present comment points out that there are many issues of detail, such as the interpretation of variables when making comparisons with real data, and the choice of model parameters; these issues make any attempts at prediction meaningless. in sec. (ii) we first analyze a simpler version of the model studied in [1] and show that, for a certain choice of model parameters, this already reproduces the main features seen in the more complicated model. however, we show that, on making small changes of the parameters to somewhat more realistic values, the predictions change significantly. in sec. (iii) we point out a large number of other problems related to the study in [1] which imply that numerical predictions emerging from such studies are completely unreliable. the basic model considered by the authors is one which compartmentalizes the population according to age. at any time each compartment has a number of susceptible (s), infected (i) and recovered (r) individuals, and these evolve with time. the division into age groups accounts for the fact that the levels of contact between different groups (and within groups) vary, so the groups have different probabilities of passing on infections. to discuss the main aspects of the model, it is sufficient to ignore the compartmentalization into age groups, and we present the simpler version here. this more basic model divides the population of size n into individuals who are susceptible (s), asymptomatic infectious (i_a), symptomatic infectious (i_s) or recovered (r), with the constraint n = s + i_a + i_s + r. the number r also includes people who have died.
dynamics: let i = i_a + i_s be the total number of infected individuals. for a well-mixed population, the probability that an s individual meets an i individual is ∝ (i/n). we define the infection rate β, which controls the rate of spread and represents the probability of transmitting disease between a susceptible and an infectious individual. we assume that infected people typically either recover or die after t_r days, so i → r happens at a rate γ = 1/t_r. we also assume that a fraction α of the infected population is asymptomatic. however, as in the analysis of [1], we will set α = 0, which effectively means that we ignore asymptomatic carrier populations (if any). we then have the corresponding dynamics for the system. the time-dependent function u(t) is the "lockdown" function that incorporates the effect of a lockdown on the rate of spreading of the infection. a reasonable form is one where u(t) has the constant value (= 1) before the beginning of the lockdown, at time t_on, and then changes to 0 < u_l < 1 over a characteristic time scale ∼ t_w; we take such a form. the number u_l indicates the lowering of social contacts as a result of a lockdown. this was set to the value 0 in [1], but we note that realistically it is expected that social contacts cannot be brought down to zero (for example, people still have to get groceries). an important parameter is the so-called basic reproductive number r_0 = β/γ. [figure 1 caption: the evolution for parameter set (i) is very close to the results presented in [1]; the evolution for the three parameter sets is identical before the lockdown; notably, the real data (for the period march 6 - april 7, 2020) follows a completely different trajectory (orange points); the inset shows the same data plotted on a semi-log scale.] parameter set (iii): we keep γ = 1/14, β = 0.24, α = 0, but change the time for the lockdown to be effective to t_w = 4.0 days and take u_l = 0.1 (since, as discussed above, contacts are never reduced to 0).
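the dynamics just described can be sketched numerically. below is a minimal forward-euler integration, assuming a smooth sigmoid form for u(t) consistent with the text and the quoted parameter values γ = 1/14 and β = 0.24 with α = 0; the population size n, the initial infected count i(0), and the lockdown start t_on are purely illustrative assumptions, not values from the paper.

```python
import math

def u(t, t_on=20.0, t_w=4.0, u_l=0.1):
    """Lockdown function: ~1 before t_on, relaxing to u_l over ~t_w days.
    The sigmoid shape is an assumption consistent with the text."""
    return u_l + (1.0 - u_l) / (1.0 + math.exp((t - t_on) / t_w))

def simulate(beta=0.24, gamma=1/14, N=1e6, I0=10.0, days=300, dt=0.1,
             **lockdown):
    """Forward-Euler integration of the S-I-R system with alpha = 0
    (no separate asymptomatic compartment), as in the simplified model."""
    S, I, R = N - I0, I0, 0.0
    t = 0.0
    traj = [(t, I)]
    for _ in range(int(round(days / dt))):
        new_inf = beta * u(t, **lockdown) * S * I / N * dt  # S -> I
        rec = gamma * I * dt                                # I -> R
        S -= new_inf
        I += new_inf - rec
        R += rec
        t += dt
        traj.append((t, I))
    return traj

# sensitivity demonstration: identical model, different lockdown parameters
strict = simulate(t_w=1.0, u_l=0.0)   # near-instant, total lockdown (as in [1])
loose  = simulate(t_w=4.0, u_l=0.1)   # delayed, partial lockdown (set (iii))
```

running both scenarios shows the point made in the text: the infected curve decays much more slowly once u_l > 0 and t_w is longer, so the predicted course after lockdown depends strongly on these parameters.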
in fig. (1) we show the results for these three parameter sets and a comparison with real data for the number of confirmed covid-19 cases in india. [note: for comparison with real data, we follow [1] and assume that i_s(t) is equal to the number of reported confirmed cases. in the comments in the next section, we explain however that this is not a reasonable assumption.] in fig. (1) we observe that the results from parameter set (i) are already in close agreement with those presented in fig. (4d) of [1]. the estimated lockdown period for the number of infected to reduce to a value less than 10 is 54 days (which compares remarkably well with the number 49 in [1]; the small difference is expected in this type of analysis). for the three parameter sets we find that the times required for the infected population to reduce to the value i = 10 are, respectively, the values quoted in eq. (6). this is of course also true for the predictions of the more sophisticated model studied in [1]: the true data after the lockdown has no resemblance to their predicted evolution. warning: the improved estimates quoted in eq. (6) should not be used as suggested periods of lockdown. they are given here simply to point out the limitations of the present model. we end this section by emphasizing that while the analysis presented here ignores details such as the population's age break-up, population distribution and contact matrices, we already recover the main qualitative (and to a large extent, even quantitative) aspects of the results in [1]. it is expected that the main message on the sensitivity of model predictions to parameter choices would not change by increasing the complexity of the model. in the previous section, through the analysis of a simpler model, we highlighted some of the reasons why models such as those studied in [1] cannot be taken seriously for their predictive powers. however, there are many more serious problems in the analysis and claims of the paper. some of these are: 1.
the modeling ignores important factors such as latency periods and asymptomatic and presymptomatic transmission. presymptomatic transmission could happen over the order of 5-10 days and is a period during which individuals show no symptoms (and are undetected) but are infecting others. 2. the modeling ignores the fact that the number of people who have been tested is relatively very small in india, so the actual number of infected people could be much higher. 3. the model's infected number i is compared with the total reported number of confirmed cases. however, one expects that the reported confirmed cases (say c) are those that have already been detected, and therefore it does not make sense that they should be counted amongst the ones who are spreading the infection. it seems reasonable to us that i >> c, so the comparison with real data (c) through the identification i = c is flawed. in fact, it seems to us that it makes more sense to instead compare r with the real data. 4. the choice of initial conditions in solving these equations is subtle and it is unclear how the authors fix these. small changes in initial conditions can lead to drastically different predictions. 5. as we have demonstrated, small changes in parameter values would lead to drastically different predictions. 6. the choice of the lockdown parameters t_w < 1 and u_l = 0 is completely unrealistic. we expect there should be a time lag for the lockdown to take effect, and even under lockdown, social contacts are not reduced to zero. 7. the optimal lockdown period is computed based on the arbitrary criterion that the infected number drops to the value 10. why not 100, which would give a completely different prediction for the required lockdown period? we do not see why this completely arbitrary value should be used as a criterion. 8. the actual trajectory of infections (or rather, the number of confirmed cases) after the lockdown started in india has no resemblance whatsoever to what is predicted by the authors, even for short times.
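point 7 above can be made concrete with a back-of-the-envelope sketch: if the infected number decays roughly exponentially after lockdown, the time to reach a threshold depends logarithmically on that threshold, so moving the criterion from 10 to 100 shifts the "required" lockdown by a fixed number of days. the peak value and decay rate below are illustrative assumptions, not numbers from [1].

```python
import math

def days_to_threshold(i_peak, threshold, decay_rate):
    """Days for I(t) = i_peak * exp(-decay_rate * t) to fall to threshold."""
    return math.log(i_peak / threshold) / decay_rate

# illustrative numbers: peak of 1e4 infected, decay at gamma = 1/14 per day
gamma = 1 / 14
d10 = days_to_threshold(1e4, 10, gamma)    # criterion: I < 10
d100 = days_to_threshold(1e4, 100, gamma)  # criterion: I < 100
gap = d10 - d100                           # = ln(10)/gamma, about 32 days
```

with a decay rate of 1/14 per day, the two criteria differ by ln(10) x 14 ≈ 32 days of lockdown, which illustrates why the choice of the arbitrary value 10 matters so much.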
in fact, a data analysis of the effect of lockdowns in several countries on the confirmed number of cases, presented in [3], shows that the evolution in all of these countries differs qualitatively from the picture presented in [1]. in our opinion, this basically proves that the model has no predictive value and should not be used as guiding material for policy makers. we have established in this brief note that studies of mathematical models of disease spreading, such as the one in [1], typically have very limited ability to make numerical predictions. such models, if analyzed properly, could play an important role in our qualitative understanding of disease evolution and, with some effort, perhaps in making short-term predictions. unfortunately, completely absurd claims sometimes attract a lot of public attention and run the risk of leading to wrong policy decisions. while there are many such examples in the recent literature, we have focused here on analyzing one specific model study [1]. this paper is technically correct in that they state their assumptions clearly and the stated results are obtained under these assumptions. i thank samriddhi sankar ray and sanjib sabhapandit for a careful reading of the manuscript and pointing out errors. i thank rajesh gopakumar. references: age-structured impact of social distancing on the covid-19 epidemic in india; the reproductive number of covid-19 is higher compared to sars coronavirus; covid-19 data analysis.

key: cord-182586-xdph25ld authors: sun, fei title: dynamics of an imprecise stochastic multimolecular biochemical reaction model with lévy jumps date: 2020-04-26 journal: nan doi: nan sha: doc_id: 182586 cord_uid: xdph25ld population dynamics are often affected by sudden environmental perturbations. parameters of stochastic models are often imprecise due to various uncertainties. in this paper, we formulate a stochastic multimolecular biochemical reaction model that includes lévy jumps and interval parameters.
firstly, we prove the existence and uniqueness of the positive solution. moreover, the threshold between extinction and persistence of the reaction is obtained. finally, some simulations are carried out to demonstrate our theoretical results. in recent decades, the study of biochemical reaction models has become one of the famous topics in mathematical biology and catalytic enzyme research. for a better review of mathematical models on the theory of biochemical reactions, see kwek and zhang [1] and tang and zhang [2]. considering that biochemical reactions are inevitably affected by environmental noise, kim and sauro [3] studied the sensitivity summation theorems for stochastic biochemical reaction models. in order to capture essential features of stochastic biochemical reaction systems, some researchers have used different methods to add random terms into deterministic chemical reaction or epidemic models and studied the dynamical behavior of the corresponding stochastic models driven by white noise (see, e.g., [4]-[11] as well as the references therein). most population systems assume that model parameters are accurately known. however, sudden environmental perturbations may bring substantial social and economic losses. for example, the recent covid-19 outbreak has had a serious impact on the world. it is more realistic to study population dynamics with imprecise parameters. panja et al. [16] studied a cholera epidemic model with imprecise numbers and discussed the stability condition of the equilibrium points of the system. das and pal [17] analyzed the stability of the system and solved the optimal control problem by introducing an imprecise sir model. other studies on imprecise parameters include those of [12]-[15], and the references therein. the main focus of this paper is the dynamics of an imprecise stochastic multimolecular biochemical reaction model with lévy jumps. to this end, we first introduce the imprecise stochastic multimolecular biochemical reaction model.
with the help of lyapunov functions, we prove the existence and uniqueness of the positive solution. further, the threshold between extinction and persistence of the reaction is obtained. the remainder of this paper is organized as follows. in sect. 2, we introduce the basic models. in sect. 3, the unique global positive solution of the system is proved. the thresholds for extinction and persistence of the reaction are derived in sect. 4 and sect. 5. in this section, we introduce the imprecise stochastic multimolecular biochemical reaction model. the multimolecular reactions are described by the corresponding reaction formulas (selkov [18]). let x(t) and y(t) denote the concentrations of ξ_1 and ξ_2 at time t, respectively, and use x_0 to denote the concentration of ξ_0. then a stochastic multimolecular biochemical reaction model with lévy jumps takes the form (2.1) (gao and jiang [19]), where p ≥ 1, and b(t) is a standard brownian motion with b(0) = 0. σ² > 0 represents the intensity of the white noise. γ(u): y × ω → r is a bounded and continuous function satisfying |γ(u)| < z, where z > 0 is a constant. x(t−) and y(t−) are the left limits of x(t) and y(t), respectively. ñ denotes the compensated random measure defined by ñ(dt, du) = n(dt, du) − λ(du)dt, where n is the poisson counting measure and λ is the characteristic measure of n, which is defined on a finite measurable subset y of (0, +∞) with λ(y) < ∞. we assume b and n are independent throughout the paper and denote. before we state the imprecise stochastic multimolecular biochemical reaction model, the definition of an interval-valued function should be recalled (pal [20]). for an interval number [a, b], the interval-valued function is taken as h(π) = a^(1−π) b^π for π ∈ [0, 1]. let k̃_1, k̃_2, k̃_3, k̃_4, p̃, σ̃ represent the interval numbers of k_1, k_2, k_3, k_4, p, σ, respectively. the system (2.1) with imprecise parameters becomes system (2.2), according to theorem 1 in pal et al.
[12], and considering the interval-valued function, we can prove that system (2.2) is equivalent to system (2.3), for υ ∈ [0, 1]. for convenience in the following investigation, let (ω, f, {f_t}_{t≥0}, p) be a complete probability space with a filtration {f_t}_{t≥0} satisfying the usual conditions. we also need the following assumption (h). to study the dynamical behavior of an imprecise stochastic multimolecular biochemical reaction model, our first concern is whether the solution is global and positive. in this section, with the help of a lyapunov function, we show that system (2.3) has a unique global positive solution for any given initial value. because the coefficients of system (2.3) are locally lipschitz continuous (mao [21]), for any given initial value (x(0), y(0)) ∈ r²₊, there is a unique local solution (x(t), y(t)) on t ∈ [0, τ_e), where τ_e is the explosion time (see mao [21]). in order to show that the local solution is global, we only need to prove that τ_e = ∞ a.s. in this context, choose a sufficiently large number m_0 ≥ 1 such that x(0) and y(0) lie within the interval [1/m_0, m_0]. for each integer m ≥ m_0, we define the stopping time τ_m = inf{t ∈ [0, τ_e): x(t) ∉ (1/m, m) or y(t) ∉ (1/m, m)}, where inf ∅ = ∞ (∅ being the empty set). by definition, τ_m increases as m → ∞. set τ_∞ = lim_{m→∞} τ_m; hence τ_∞ ≤ τ_e a.s. if τ_∞ = ∞ a.s. is true, then τ_e = ∞ a.s. for all t > 0. in other words, we need to verify τ_∞ = ∞ a.s. if this claim is wrong, then there exist a constant t > 0 and an ǫ ∈ (0, 1) such that p{τ_∞ ≤ t} > ǫ. hence there is an integer m_1 ≥ m_0 such that (3.1) holds. consider the lyapunov function v. let m ≥ m_1 and t > 0. then, for any 0 ≤ t ≤ min{τ_m, t}, itô's formula (situ [22]) shows (3.2), where l is a differential operator. by use of (h) and taylor's formula, we know the corresponding estimate, with θ ∈ (0, 1), and similarly for the other term. taking expectations on both sides of (3.2), we obtain the corresponding bound. let ω_m = {τ_m ≤ t} for m ≥ m_1. then, by (3.1), we know that p(ω_m) ≥ ǫ.
noting that for every ω ∈ ω_m there exists x(τ_m, ω) or y(τ_m, ω) equal to either m or 1/m, v(x(τ_m, ω), y(τ_m, ω)) is bounded below accordingly, where 1_{ω_m}(ω) represents the indicator function of ω_m. letting m → ∞ leads to a contradiction; therefore, we have τ_∞ = ∞ a.s. the proof is complete. when studying biochemical reaction models, two of the most interesting issues are persistence and extinction. in this section, we discuss the extinction conditions for system (2.3) and leave its persistence to the next section. theorem 4.1. let assumption (h) hold. for any initial value (x(0), y(0)) ∈ υ, there is a unique positive solution (x(t), y(t)) to system (2.3). if one of the following two conditions holds, then the reaction will become extinct exponentially with probability one. proof. integrating both sides of (2.3) from 0 to t yields (4.1), and clearly we can derive (4.2). applying itô's formula to (2.3), we can conclude (4.3). integrating (4.3) from 0 to t and then dividing by t on both sides, we obtain (4.4). the terms involving ñ(ds, du) are all martingale terms, and h is defined as in (4.5). thus, by (4.4), we have (4.6). moreover, the quadratic variation can be calculated. clearly, taking the superior limit on both sides of (4.6), we obtain the extinction of x(t); similarly, we get the corresponding bound, which yields lim_{t→∞} y(t) = 0 a.s. this completes the proof. in this section, we establish sufficient conditions for persistence in the mean of system (2.3). this together with (4.1) implies (ln y(t) − ln y(0))/t ≥ (p^l)^(1−υ) (p^u)^υ (k_3 ...) (5.4). taking the inferior limit on both sides of (5.4) and combining with the lemma, from (4.2) and (4.7) we have the lim inf bound; therefore, by assumption (l), we can easily obtain (5.1).
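the interval-valued function h(π) = a^(1−π) b^π used to turn interval numbers into ordinary parameters can be sketched directly; the interval bounds below are illustrative assumptions, not values from the paper.

```python
def interval_valued(lo, hi, p):
    """h(p) = lo**(1-p) * hi**p : sweeps the interval [lo, hi]
    geometrically as p goes from 0 to 1 (requires 0 < lo <= hi)."""
    if not (0.0 <= p <= 1.0):
        raise ValueError("p must lie in [0, 1]")
    return lo ** (1.0 - p) * hi ** p

# e.g. an imprecise rate k with assumed bounds [0.2, 0.8]
k_lo = interval_valued(0.2, 0.8, 0.0)     # lower bound, 0.2
k_hi = interval_valued(0.2, 0.8, 1.0)     # upper bound, 0.8
k_mid = interval_valued(0.2, 0.8, 0.5)    # geometric mean, 0.4
```

this is exactly the substitution used above, where each imprecise rate such as k̃_3 is replaced by (k_3^l)^(1−υ)(k_3^u)^υ for a chosen υ ∈ [0, 1].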
references: [1] periodic solutions and dynamics of a multimolecular reaction system; [2] bogdanov-takens bifurcation of a polynomial differential system in biochemical reaction; [3] sensitivity summation theorems for stochastic biochemical reaction systems; [4] the dynamics of the stochastic multi-molecule biochemical reaction model; [5] long-time behavior of a perturbed enzymatic reaction model under negative feedback process by white noise; [6] permanence and extinction of certain stochastic sir models perturbed by a complex type of noises; [7] dynamic behavior of a stochastic sirs epidemic model with media coverage; [8] dynamics of a novel nonlinear stochastic sis epidemic model with double epidemic hypothesis; [9] a stochastic sirs epidemic model with nonlinear incidence rate; [10] analysis and numerical simulations of a stochastic seiqr epidemic system with quarantine-adjusted incidence and imperfect vaccination; [11] the global dynamics for a stochastic sis epidemic model with isolation; [12] optimal harvesting of prey-predator system with interval biological parameters: a bioeconomic model; [13] incorporating prey refuge into a predator-prey system with imprecise parameter estimates; [14] optimal harvesting of a two species competition model with imprecise biological parameters; [15] ergodic stationary distribution of a stochastic hepatitis b epidemic model with interval-valued parameters and compensated poisson process; [16] dynamical study in fuzzy threshold dynamics of a cholera epidemic model; [17] a mathematical study of an imprecise sir epidemic model with treatment control; [18] self-oscillations in glycolysis; [19] analysis of stochastic multimolecular biochemical reaction model with lévy jumps; [20] a mathematical study of an imprecise sir epidemic model with treatment control; [21] stochastic differential equations and applications; [22] theory of stochastic differential equations with jumps and applications. proof.
according to (4.4), we know that (ln y(t) − ln y(0))/t = (p^l)^(1−υ) (p^u)^υ (k_3 ...).

key: cord-246317-wz7epr3n authors: wang, wei-yao; chang, kai-shiang; tang, yu-chien title: emotiongif-yankee: a sentiment classifier with robust model based ensemble methods date: 2020-07-05 journal: nan doi: nan sha: doc_id: 246317 cord_uid: wz7epr3n this paper provides a method to classify sentiment with robust model based ensemble methods. we preprocess tweet data to enhance the coverage of the tokenizer. to reduce domain bias, we first further train the pre-trained language model on the tweet dataset. besides, each classifier has its strengths and weaknesses, so we leverage different types of models with ensemble methods: average and power weighted sum. from the experiments, we show that our approach has achieved a positive effect for sentiment classification. our system reached third place among 26 teams in the evaluation of the socialnlp 2020 emotiongif competition. natural language is often indicative of one's emotion. hence, detecting emotions in textual conversations has been one of the popular topics in the field of natural language processing (nlp) in the sentiment domain. a sentiment classifier can help researchers study such information on users' feelings. there are various tasks of sentiment classification; for example, riloff et al. (2005) present an information extraction (ie) system that automatically uses filtering extractions to improve subjectivity classification. on opinion extraction, zhai et al. (2011) extract different opinion features, including sentiment-words, substrings, and key-substring-groups, to help improve sentiment classification performance. in recent years, hazarika et al. (2018) proposed a multi-modal emotion detection framework, the interactive conversational memory network (icon), to extract multi-modal features for emotion detection. in socialnlp 2020 emotiongif, the challenge is to use tweet text and reply to recommend exactly 6 categories.
in this paper, we propose an architecture to apply to the shared task. we preprocess the original tweet data, further train the pre-trained language model on it, and then fine-tune the result into a multi-label classification model. to build a comprehensive emotion classifier, we design an ensemble scheme to get higher performance. the shared task includes a first-of-its-kind dataset of 40,000 two-turn twitter threads. each thread contains 5 columns: idx, text, reply, categories, and mp4. here are the explanations of the 5 columns:
• idx: a unique identifier of each tweet
• text: the text of the original tweet
• reply: the text content of the response tweet
• categories: the categories of the response gif, containing 1 to 6 categories out of a list of 43 categories
• mp4: the hash file name of the response gif
the dataset is split into three json files: train-gold, dev-unlabeled, and test-unlabeled. the first, including 32,000 threads, is the training data; the other two, each including 4,000 threads, are the validation data and the testing data. the difference between train-gold and dev-unlabeled/test-unlabeled is that the former consists of all 5 columns while the latter two only consist of 3 columns: idx, text, and reply. figure 1 is a subset of the correlation table which contains the frequency of co-appearance of any two categories. the figure illustrates the correlation between different categories, and we can observe that some categories have strong connections while others have weak connections. our study can be mainly divided into three topics: multi-label classification, pre-trained models, and ensemble methods. multi-label classification is a generalization of multiclass classification. nowadays, multi-label classification is increasingly used in many fields of nlp, such as semantic scene classification and sentiment classification. there are two main categories of multi-label classification approaches: problem transformation (pt) methods and algorithm adaptation (aa) methods.
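as an aside, the co-appearance frequencies behind a table like figure 1 can be computed directly from the categories column; the thread label sets below are toy examples, not rows from the actual dataset.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(threads):
    """Count how often each unordered pair of categories appears together
    in a single thread's label set (the quantity behind figure 1)."""
    counts = Counter()
    for cats in threads:
        for a, b in combinations(sorted(set(cats)), 2):
            counts[(a, b)] += 1
    return counts

# toy threads using category names from the 43-category list
threads = [
    ["agree", "thumbs up"],
    ["agree", "thumbs up", "applause"],
    ["scared", "agree"],
]
pairs = cooccurrence(threads)
```

sorting each label set first gives a canonical key per unordered pair, so ("agree", "thumbs up") and ("thumbs up", "agree") are counted together.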
generally speaking, problem transformation methods transform the multi-label classification problem into one or more single-label classification problems (zhang and zhou, 2014), while algorithm adaptation methods use algorithms that have been adapted to the multi-label task and need no problem transformation. for problem transformation methods, a good strategy is classifier chains (cc) (read et al., 2011), which classifies whether an instance belongs to each label or not in a chain structure, and is able to capture the interdependencies between the labels. for algorithm adaptation methods, multi-label k-nearest neighbors (ml-knn) (zhang and zhou, 2007) is one of the most popular. it relies on the maximum a posteriori (map) principle applied to the k-nearest neighbors (knn) algorithm, a well-known traditional machine learning algorithm, to determine which label subset each instance belongs to. due to its promising results and simplicity, it has been applied to many practical text classification tasks. most existing multi-label classification approaches solve emotion classification by training the model on a large dataset. the idea is to find informative features which can reflect the emotion expressed in the text, so with this approach most studies aim to find efficient features leading to better performance (jabreel and moreno, 2016). also, deep learning models have been introduced to solve the multi-label classification problem, and it has been shown that such models are able to extract high-level features from raw data. for instance, baziotis et al. (2018), the winner of the semeval-2018 task 1 competition (affect in tweets), propose a bi-lstm architecture with an attention mechanism. they leverage a set of word2vec word embeddings trained on a dataset of 550 million tweets. pre-trained models have been widely applied in a variety of nlp systems and achieve dramatic performance gains on downstream tasks.
there are three major advantages of pre-trained models. first of all, since pre-training is unsupervised, an essentially unlimited corpus can be used for training. secondly, a strong pre-trained language model can generate deep contextual word representations, which means a word token can have several representations in different sentences. hence, through fine-tuning, we can improve downstream tasks more efficiently. last but not least, using pre-trained models can reduce huge architecture engineering: we do not need to design a deep learning network by ourselves and pre-train it at massive cost. bert (devlin et al., 2018), bidirectional encoder representations from transformers, is one of the state-of-the-art (sota) pre-trained models. there are two main tasks in its pre-training stage. the first task, called masked lm (mlm), is to replace 15% of the words in each sequence with a [mask] token; the model must then predict these masked tokens. the encoder learns contextual representations during this stage. in the second task, next sentence prediction (nsp), the model takes pairs of sentences as input and learns to predict whether the second sentence in the pair is the subsequent sentence in the original document. in detail, 50% of the inputs during training are pairs from the original documents, while for the other 50% a random sentence from the corpus is chosen as the second sentence. there are some variant models based on bert, like roberta (liu et al., 2019) and distilbert (sanh et al., 2019). distilbert, distilled bert, reduces the size of a bert model by 40%, while retaining 97% of its language understanding and being 60% faster. distilbert halves the number of layers and removes the token-type embeddings and the pooler. instead of focusing on efficiency, roberta, the robustly optimized bert approach, finds bert undertrained, which is why its authors carefully study and modify key hyperparameters to improve performance.
since there are discrepancies in the literature about whether to remove nsp (devlin et al., 2018; lample and conneau, 2019; yang et al., 2019; joshi et al., 2020), roberta's authors run experiments and find that removing nsp can slightly improve downstream tasks. furthermore, roberta uses bytes instead of unicode characters as the base subword units (radford et al., 2019). using bytes lets the model learn a larger subword vocabulary. in general, supervised learning can be defined as finding hypotheses (classifiers) that are close to the true function which represents all the data points in the training data. however, learning algorithms that output only one hypothesis face three major problems: statistical, computational, and representational. fortunately, ensemble methods construct a set of classifiers and then classify new data points by taking a vote over their predictions, which can usually address the three problems just mentioned (dietterich, 2000). the first problem is that learning algorithms may give the same accuracy with different hypotheses. by constructing an ensemble out of all of these accurate classifiers, the algorithm can use a simple and fair voting mechanism to reduce the risk of choosing the wrong classifier. the second problem is that many learning algorithms implement local search which may stop even if the best solution found by the algorithm is not optimal. an ensemble constructed by running the local search from many different starting points may provide a better approximation to the true unknown function than any of the individual classifiers. the third problem is that in most applications of machine learning, the true function cannot be represented by any of the hypotheses. by forming weighted sums of hypotheses, it may be possible to expand the space of representable functions. hagen et al. (2015) introduce an approach with ensemble methods for twitter sentiment detection.
their ensemble method is a voting scheme over the actual classifications of the individual classifiers rather than an averaging of confidences. their system proved a strong baseline in the semeval 2015 evaluation. these studies motivate us to cast the emotiongif task as a multi-label classification problem, since this task needs to infer the 6 most likely categories for each tweet. to reduce huge architecture engineering, we adopt pre-trained models and then focus on the preprocessing and postprocessing stages, such as ensemble methods, to achieve better performance in the competition. in addition to solving the problems mentioned previously, each classifier has its strengths and weaknesses; if we can combine different types of classifiers so that each one's forte covers the others' drawbacks, we can obtain highly accurate classifiers by combining less accurate ones. by combining these three techniques, we can build a robust system for the emotiongif task. the main goal of the present work is to predict the 6 most likely categories for each tweet in the emotiongif task. we propose an architecture as in figure 3 which includes three stages: preprocessing, model framework, and ensemble methods. tweet data do not have the same structure as a formal corpus (e.g., wikipedia). there are multiple methods to clean up original tweet data. we perform several normalization steps, but we do not convert to lower case. here are the main five steps to normalize the tweet dataset, performed in order: 1. transform unusual punctuation characters. 2. transform apostrophe contractions into their original words. for example, hasn't will be converted to has not. 3. map unknown punctuation which is not in the tokenizer's vocabulary. for example, β is unknown to the roberta tokenizer and will be transformed to the word beta. 4. demojize: convert emoji symbols into their corresponding meanings. also, if there are duplicate emojis, we retain only one emoji to represent them. 5.
detweetize and more word conversions: some words in the dataset are in tweet style, which means they are seldom seen in a formal corpus. we manually replace these words with common representations; for example, idk will be replaced with i don't know. moreover, there are many recent terms, like covid, which tokenizers have not seen before. therefore, we transform these words into common words, like virus, which can be tokenized correctly. the model framework is composed of two parts: an enhanced pre-trained language model and a fine-tuned multi-label classification model. pre-trained models are trained on formal corpora like wikipedia instead of tweet datasets. to avoid our model overfitting and domain bias on the training data, we use the provided 32,000-thread training set to further train the pre-trained language model. the enhanced language model understands more about tweet-style sentences. we treat the emotiongif task as a multi-label classification problem; hence, we fine-tune the enhanced pre-trained model into a multi-label classification model for the downstream task. to properly handle multi-label classification, we select bcewithlogitsloss as our loss function. bcewithlogitsloss combines a sigmoid layer and the bce loss, and takes advantage of the log-sum-exp trick for numerical stability, as in equations (1) and (2): ℓ(x, y) = (1/n) Σ_i l_i (1), l_i = −[y_i log σ(x_i) + (1 − y_i) log(1 − σ(x_i))] (2), where n is the batch size. since our goal is better performance rather than efficiency, we use roberta-base, bert-base-cased, and bert-base-uncased, individually training a language model and fine-tuning it into a multi-label classification model for each. since roberta and bert use different input formats, and our dataset has a pair of sequences, text and reply, in each tweet, we convert input sentences based on the corresponding model. the bert format is to add a special token [cls] at the start and [sep] between the sentences and at the end. the roberta format is to add <s> at the start and </s> between the sentences and at the end. an example representation is shown in table 1.
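the numerically stable form behind bcewithlogitsloss can be sketched in a few lines; this is a minimal pure-python rendition of the standard stable identity l = max(x, 0) − x·y + log(1 + exp(−|x|)), which is algebraically equivalent to applying a sigmoid followed by the bce loss in equation (2), with the batch-mean reduction assumed here.

```python
import math

def bce_with_logits(logits, targets):
    """Numerically stable binary cross-entropy on raw logits:
    l = max(x, 0) - x*y + log(1 + exp(-|x|)), averaged over the batch.
    Equivalent to sigmoid + BCE, but safe for large |x|."""
    losses = [max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
              for x, y in zip(logits, targets)]
    return sum(losses) / len(losses)

# one instance, three labels: logits from a hypothetical classifier head
loss = bce_with_logits([2.0, -1.0, 0.0], [1.0, 0.0, 1.0])
```

the naive form −[y log σ(x) + (1 − y) log(1 − σ(x))] overflows for logits like x = 1000, while the rearranged form above stays finite, which is the point of the log-sum-exp trick mentioned in the text.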
since each classifier has its strengths and weaknesses, if we can combine different types of classifiers so that each one's forte covers the others' drawbacks, we can obtain highly accurate classifiers by combining less accurate ones. to attain the desired results, we combine three different types of models: roberta-base, bert-base-cased, and bert-base-uncased. on account of different dropout weights in each training run, the performance of each trained model may have a big gap compared with the others. by training 10 models of the same type with different dropout weights and averaging their predictions, we can lower the risk of using a single model with bad performance. after training and averaging the three types of models, we use equation (3), where p_i for i = 1, 2, 3 are the average prediction scores from roberta-base, bert-base-cased, and bert-base-uncased, respectively, and w_i for i = 1, 2, 3 are the weights corresponding to each model. [figure 4: visualization of the equation y = x^n with n = 1/4, 1/3, 1/2, 1, 2, 3, 4.] to choose a reasonable n, we look into the properties of the power function. figure 4 shows that the further away the probability is from 1, the faster it is pushed toward 0, and vice versa. the probabilities that remain the highest at the end are those whose relative agreement (weighted down by the probability and the power) across the ensemble models is the highest (laurae). we take advantage of the power weighted sum to enhance the performance of the model. in emotiongif, we only have ground-truth labels in the training data, so we use dev-unlabeled as our validation data. that is, we fine-tune hyperparameters based on the validation data and use the best models from tuning to predict the testing data, test-unlabeled. in this section, our system gives some reasonable results from experiments. the source code for this paper is available as a github repository.
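a power weighted sum of the form score_c = Σ_i w_i · p_{i,c}^n is one natural reading of equation (3) consistent with the description above; since the exact form of equation (3) is not reproduced in the text, this sketch, its exponent n = 2, its uniform weights, and all the probability values are assumptions for illustration.

```python
def power_weighted_sum(preds, weights, n=2.0):
    """Combine per-model probability vectors p_i as sum_i w_i * p_i**n.
    Raising probabilities to n > 1 suppresses categories on which the
    models disagree, matching the intuition behind figure 4."""
    num_cats = len(preds[0])
    return [sum(w * p[c] ** n for p, w in zip(preds, weights))
            for c in range(num_cats)]

# three models' averaged scores over four toy categories
p_roberta = [0.90, 0.20, 0.60, 0.10]
p_bert_c  = [0.80, 0.30, 0.10, 0.10]
p_bert_u  = [0.85, 0.25, 0.20, 0.60]
scores = power_weighted_sum([p_roberta, p_bert_c, p_bert_u],
                            weights=[1/3, 1/3, 1/3], n=2.0)
top = sorted(range(4), key=lambda c: -scores[c])  # categories, best first
```

in this toy example category 0, where all three models agree strongly, stays on top, while category 3, boosted by a single model, is weighted down relative to its plain average.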
for both the pre-trained language model and the multi-label classification model, we use adam (kingma and ba, 2014) as the optimizer, with epsilon 1e-8 and learning rate 4e-5. gradient accumulation steps and warmup ratio are 1 and 0.06. max sequence length, number of epochs, and batch size are set to 113, 4, and 16. for the pre-trained language model, we set the block size to 96. for the multi-label classification model, early stopping is used: training stops when the early stopping metric fails to improve. the early stopping patience is set to 3, the early stopping metric is the evaluation loss, and the metric is minimized. most of these configurations are default arguments in simple transformers 2. to apply the ensemble methods, we train 10 roberta-base, 5 bert-base-cased, and 5 bert-base-uncased models, all with the above configurations. the metric used to evaluate entries is mean recall at k, with k=6 (mr@6). table 2 shows an example of how we evaluate our predictions. for each output, we predict 6 categories out of a list of 43 categories and count how many of them (n) are identical to the answer categories. the mr@6 is n divided by the total number of answers, and the final result is the average of the mr@6 over all twitter threads. (table 2 — answer: agree, thank you, thumbs up; prediction: oops, scared, thank you, you got this, do not want, agree; mr@6: 1/3.) to check the preprocessing methods, we validate using the roberta-base tokenizer coverage shown in table 3. table 4 shows the first 6 out-of-vocab (oov) tokens. although there are still some unknown tokens, we can observe that some words like medium-dark might be tokenized into expected tokens like medium, -, and dark. exploratory data analysis (eda) is an initial investigation of the data to discover patterns or to check assumptions with the help of statistics.
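the prose definition of mr@6 and the worked example are slightly ambiguous, but the example (two overlapping categories scored as 1/3) is consistent with dividing the overlap count by k = 6. the sketch below assumes that reading; the function name is ours.

```python
def mr_at_k(answers, predictions, k=6):
    """mean recall at k over all threads, assuming the per-thread score is
    |answer ∩ top-k predictions| / k -- the reading consistent with the
    1/3 score in the worked example (2 overlapping categories out of 6)."""
    scores = []
    for ans, pred in zip(answers, predictions):
        hits = len(set(ans) & set(pred[:k]))
        scores.append(hits / k)
    return sum(scores) / len(scores)
```

on the table 2 example, the answer {agree, thank you, thumbs up} overlaps the 6 predictions in two categories (thank you, agree), giving 2/6 = 1/3.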
through eda, we find that converting words to lower case may cause unexpected tokens from the tokenizer. for example, the word hug can be tokenized correctly in some positions, while after lower-casing it cannot be tokenized as we expect; that is, it will be tokenized into h and ug. hence we don't convert all words into lower case in the emotiongif task. the experimental results on the validation data are shown in table 5, and table 6 shows our system's predictions on the testing set. the ensemble models achieve about 0.5662 mr@6, while using only a single type of model gets 0.5404. this indicates that a single type of model may be slightly worse on the testing data, and applying the ensemble methods does solve this problem. overall, our proposed system successfully outperforms both fine-tuning the original pre-trained language model and the official emotiongif baseline, achieving a high mr@6 score on both the validation data and the testing data in this competition. in this work, we propose a system architecture combining preprocessing, a model framework, and ensemble models for the emotiongif task. we intentionally convert some words to our desired format and increase the coverage of words recognized by the tokenizer. based on the preprocessed data, we apply multi-label classification with pre-trained models to make our work more sophisticated. besides, we also show that ensemble models with the power weighted sum outperform any single model trained with the same parameters. in section 2, we observe that there is an imbalance between categories; however, in the present work, we don't deal with it. furthermore, in future work we consider replacing multi-label classification with ranking classification because of its ability to model dependency: the probabilities of multi-label classification are treated as independent, so there is no correlation among categories, while ranking classification is the opposite.
since the categories have some connection with each other, as table 1 shows, we assume that it is better to let our model treat the dependency between categories as critical.

references:
ntua-slp at semeval-2018 task 1: predicting affective content in tweets with deep attentive rnns and transfer learning
bert: pre-training of deep bidirectional transformers for language understanding
ensemble methods in machine learning
webis: an ensemble for twitter sentiment detection
icon: interactive conversational memory network for multimodal emotion detection
sentirich: sentiment analysis of tweets based on a rich set of features
spanbert: improving pre-training by representing and predicting spans
adam: a method for stochastic optimization
cross-lingual language model pretraining
reaching the depths of (power/geometric) ensembling when targeting the auc metric
roberta: a robustly optimized bert pretraining approach
language models are unsupervised multitask learners
classifier chains for multi-label classification
exploiting subjectivity classification to improve information extraction
distilbert, a distilled version of bert: smaller, faster, cheaper and lighter
xlnet: generalized autoregressive pretraining for language understanding
exploiting effective features for chinese sentiment classification
a review on multi-label learning algorithms
ml-knn: a lazy learning approach to multi-label learning. pattern recognition

key: cord-151871-228t4ymc authors: unceta, irene; nin, jordi; pujol, oriol title: differential replication in machine learning date: 2020-07-15 journal: nan doi: nan sha: doc_id: 151871 cord_uid: 228t4ymc when deployed in the wild, machine learning models are usually confronted with data and requirements that constantly vary, either because of changes in the generating distribution or because external constraints change the environment where the model operates.
to survive in such an ecosystem, machine learning models need to adapt to new conditions by evolving over time. the idea of model adaptability has been studied from different perspectives. in this paper, we propose a solution based on reusing the knowledge acquired by already deployed machine learning models and leveraging it to train future generations. this is the idea behind differential replication of machine learning models. a learning system's environment is prone to change over time. the gartner data science team survey [2] found that over 60% of machine learning models developed in companies are never actually put into production, due mostly to a lack of alignment with environmental demands. to survive in such an ecosystem, machine learning models need to adapt to new conditions by learning to evolve over time. this notion of adaptability has been present in the literature since the early days of machine learning, as practitioners have had to devise ways to adapt theoretical proposals to their everyday scenarios [3, 4, 5]. as the discipline has evolved, so have the available techniques to this end. consider, for example, situations where the underlying data distribution changes, resulting in concept drift. traditional batch learners are incapable of adapting to such drifts; online learning algorithms were devised [6] to succeed in this task by iteratively updating their knowledge according to the data shifts. in other situations, it is not the data that change but the business needs themselves. for instance, fraud detection algorithms [7] are regularly retrained to incorporate new types of fraud. commercial machine learning applications are designed to answer very specific business objectives that may evolve in time. take, for example, the case where a company wants to focus on a new client portfolio; this may require evolving from a binary classification setting to a multi-class configuration [8].
under such circumstances, the operational point of a trained classifier can be changed to adapt it to the new policy. alternatively, it is also possible to add patches in the form of wrappers to already deployed models. these endow models with new traits or functionalities that help them adapt to the new data conditions, either globally [9] or locally [10]. in contrast, when it is the very nature of the system that changes, alternatives are scarce. when change affects the entire ecosystem of a model, most existing solutions are not applicable. say, for example, that one of the original input attributes is no longer available, that a devised black-box solution is required to be interpretable, or that updated software licenses require moving to a new production environment. it is such drastic changes in the demands of a machine learning system that this article is concerned with. a straightforward solution in this context is to discard the existing model and re-train a new one. however, by discarding the given solution, we lose all the acquired knowledge and have to rebuild and validate the full machine learning stack. this is seldom either the most efficient or the most effective way of tackling this challenge. in this paper, we explore this problem and advocate for imitating the way in which biological systems adapt to changes. in particular, we stress the importance of reusing the knowledge acquired by the already existing machine learning model in order to train a second generation that is better adapted to the new conditions. adaptation of machine learning models to new scenarios or domains is a well known problem in the machine learning literature. the best known research branches are transfer learning and domain adaptation. domain adaptation usually deals with changes in the data distributions as time evolves, or with learning knowledge representations for one domain such that they can be transferred to another related target domain.
for example, due to the covid-19 pandemic, several countries decided to accept card payments without introducing the pin up to 50 euros instead of the previous 20 euro limit, in order to minimize the interactions of card holders with points of sale. this domain modification may affect card fraud detection algorithms, requiring modifications in their usage. conversely, transfer learning refers to the specific case where the knowledge acquired when solving one task is recycled to solve a different, yet related task [11]. in general, the problem of transfer learning can be mathematically framed as follows. given source and target domains d s and d t and their corresponding tasks t s and t t, the goal of transfer learning is to build a target conditional distribution p(y t |x t) in d t for task t t from the information obtained when learning d s and t s, where t s ≠ t t and d s ≠ d t. advantages of this kind of learning over traditional learning are that learning is performed much faster, requires less data, and can even achieve better accuracy. in this work we focus on a different adaptation problem. in our scenario, the task remains the same, but the existing solution is no longer fit because of changes in the environmental constraints. we can frame the problem at hand using the former notation as follows. given a domain d s, its corresponding task t, and the set of original environmental constraints c s that make the solution of this problem feasible, we assume a scenario where a hypothesis space h s has been defined. in this context, we want to learn a new solution for the same task and for a new target scenario defined by the set of feasibility constraints c t, where c t ≠ c s. this may or may not require the definition of a new hypothesis space h t. in a concise form, and considering an optimization framework, this can be rewritten as an optimization over the new hypothesis space subject to the new constraints. this problem corresponds to that of environmental adaptation.
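the display equation referenced here was lost in extraction. a plausible reconstruction from the surrounding notation (hypothesis space $\mathcal{H}_t$, feasibility constraints $\mathcal{C}_t$, likelihood $P(y \mid x; h)$) is the following; it should be read as an assumption about the authors' formulation, not a verbatim recovery:

```latex
h_t \;=\; \operatorname*{arg\,max}_{h \,\in\, \mathcal{H}_t} \; P(y \mid x;\, h)
\qquad \text{subject to} \qquad h \text{ satisfies } \mathcal{C}_t .
```

this mirrors the original scenario, where $h_s \in \mathcal{H}_s$ maximizes the same likelihood subject to $\mathcal{C}_s$; environmental adaptation is the transition from the first problem to the second.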
under this notation, the original solution corresponds to a model h s that belongs to the hypothesis space h s defined for the first scenario. this is a model that fulfills the constraints c s and maximizes p(y|x; h) for a training dataset s = {(x, y)}, defined by the task t on the domain d s. adaptation involves transitioning from this original scenario to scenario ii, a process which may be straightforward, although in general it is not. the solution obtained for the first scenario may be unfeasible in the second, thus effectively ending the lifespan of the learned model. this happens when h s is outside of the feasible set defined by c t. in the most benevolent case, there may exist another solution from the original hypothesis space h s that fulfills the new constraints. however, there might be no overlap between the set of constraints c t and the set of models defined by h s. this implies that no model of that hypothesis space can be considered a solution. in such cases, adaptation involves the definition of a new hypothesis space h t altogether. take, for example, the case of an application with a multivariate gaussian kernel support vector machine. assume that, due to changes in the existing regulation, models are required to be fully interpretable in the considered application. the new set of constraints is not compatible with the original scenario, and hence we would require a complete change of substrate. in this scenario, we introduce the notion of differential replication of machine learning models as an efficient approach to ensure model survival in a new, demanding environment by building on knowledge acquired in previous generations. this is solving scenario ii considering the solution obtained for scenario i. differential replication ensures the environmental adaptation of machine learning models when they are subjected to constant changes.
"when copies are made with variation, and some variations are in some tiny way "better" (just better enough so that more copies of them get made in the next batch), this will lead inexorably to the ratcheting process of design solving the former environmental adaptation problem can sometimes be straightforwardly done by discarding the existing model and re-training a new one. this is possible when the new model hypothesis space h t is required to provide feasible solutions for the constraint set c t . however, it is worth considering the costs of this approach. in general, rebuilding a model from scratch (i) implies obtaining the clearance from legal, business, ethical, and engineering departments, (ii) does not guarantee that a good or better solution of the objective function will be achieved 1 , (iii) requires a whole new iteration of the machine learning pipeline, which is costly and time-consuming, (iv) assumes full access to the training dataset, which may no longer be available or require a very complex version control process. plus, in many companies machine learning solutions are kept up-to-date using automated systems that continuously evaluate and retrain models; a technique known as continuous learning. note, however, that this may take huge storage space, due to the need to save all the new incoming information. hence, in the best case scenario, re-training is an expensive and difficult approach that assumes a certain level of knowledge that is not always guaranteed. nonetheless, this is the most commonly used technique for solving this environmental adaptation problem. in what follows we consider other techniques. environmental adaptation is well known for living beings. under the theory of natural selection, adaptation relies on changes in the phenotype of a species over several generations to guarantee its survival and evolution. this is sometimes referred to as differential reproduction. 
along the same lines, we define differential replication of a machine learning model as a cloning process in which traits are inherited from generation to generation, while at the same time adding variations that make descendants more fit to the new environment. more formally, differential replication refers to the process of finding a solution h t that fulfills the constraints c t, i.e. a feasible solution, while preserving or inheriting features from h s. note that, in general, p(y|x; h t) ≈ p(y|x; h s). in the best case, we would like to preserve or improve the performance of the source solution h s, also known as the parent; however, this is a requirement that may not always be met. in a biological simile, requiring a cheetah to be able to fly may imply losing its ability to run fast. in this section, we consider existing approaches to implementing differential replication in its attempt to solve the problem of environmental adaptation. the notion of differential replication is built on top of two concepts. first, there is some inheritance mechanism that is able to transfer key aspects from the previous generation to the next; this accounts for the name replication. second, the next generation should display new features or traits not present in its parents; this corresponds to the idea of differential. these new traits should make the new generation more fit to the current environment, enabling environmental adaptation of the offspring. particularizing to machine learning models, implementing the concept of differential may involve a fundamental change in the substratum of the given model. this means we need a new hypothesis space that ensures that part of the models in that space fulfill the constraints of the new environment c t. consider, for example, the case of a large ensemble of classifiers. in highly time-demanding tasks, this model may be too slow to provide real-time predictions when deployed into production.
differential replication of this model enables moving from this architecture to a simpler, more efficient one, such as a shallow neural network [12]. this "child" network can inherit the decision behavior of its predecessor while at the same time being more fit to the new environment. conversely, replication requires that some behavioral aspect be inherited by the next generation. usually, it is the model's decision behavior that is inherited, so that the next generation replicates the parent's decision boundary. replication can be attained in many different ways. as shown in fig. 1, depending on the amount of knowledge that is assumed about the initial data and model, mechanisms for inheritance can be categorized as follows:
• inheritance by sharing the dataset: two models trained on the same data are bound to learn similar decision boundaries. this is the weakest form of inheritance possible, where no actual information is transferred from source to target. here the decision boundary is reproduced indirectly, mediated through the data themselves. re-training falls under this category. this form of inheritance requires no access to the parent model, but assumes knowledge of its training data.
• inheritance using edited data: editing is the methodology that allows data selection for training purposes [13, 14, 15]. editing can be used to preserve those data that are relevant to the decision boundary learned by the original solution and use them to train the next generation. take, for example, the case where the source hypothesis space corresponds to the family of support vector machines. in training a differential replica, one could retain only those data points that were identified as support vectors. this mechanism assumes full access to the model internals, as well as to the training data.
• inheritance using model-driven enriched data: data enrichment is a form of adding new information to the training dataset through either the features or the labels. in this scenario, each data sample in the original training set is augmented using information from the parent's decision behavior. for example, a sample can be enriched by adding additional features using the prediction results of a set of classifiers. alternatively, if instead of learning hard targets one considers using the parent's class probability outputs or logits as soft targets, this richer information can be exploited to build a new generation that is closer in behavior to the parent. under this category fall methods like model distillation [12, 16, 17, 18], as well as techniques such as label regularization [19, 20] and label refinery [21]. in general, this form of inheritance requires access to the source model and is performed under the assumption of full knowledge of the training data.
• inheritance by enriched data synthesis: a similar scenario is that where the original training data are not accessible, but the model internals are open for inspection. in this situation, the use of synthetic datasets has been explored [12, 22]. in some cases, intermediate information about the representations learned by the source model is also used as a training set for the next generation. this form of inheritance can be understood as a zero-shot distillation [23].
• inheritance of the model's internal knowledge: in some cases, it is possible to access the internal representations of the parent model, so that more explicit knowledge can be used to build the next generation [24, 25]. for example, if both parent and child are neural networks, one can force the mid-layer representations to be shared among them [26]. alternatively, one could use the second-level rules of a decision tree to guide the next generation of rule-based decision models.
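as a concrete illustration of the soft-target enrichment mentioned above, the following minimal sketch (plain python, illustrative names, not the cited papers' implementations) builds temperature-softened teacher targets and evaluates the cross-entropy a student would minimize against them, in the spirit of model distillation [12, 16]:

```python
import math

def softmax(logits, temperature=1.0):
    """temperature-scaled softmax; a higher temperature gives softer targets."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_probs, temperature=2.0):
    """cross-entropy of the student's probabilities against the teacher's
    soft targets -- the quantity a distilled child model is trained to minimize."""
    soft_targets = softmax(teacher_logits, temperature)
    return -sum(t * math.log(p) for t, p in zip(soft_targets, student_probs))
```

the loss is minimized when the student's output distribution matches the teacher's soft targets, which is exactly the inheritance of decision behavior described in the text.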
• inheritance by copying: in highly regulated environments, access to the original training samples or to the model internals is not possible. in this context, experience can also be transmitted from one model to its differential replica using synthetic data points labelled according to the hard predictions of the source model. this has been referred to as copying [27].
note that, on top of a certain level of knowledge about either the data or the model, or both, some of the techniques listed above often also impose additional restrictions on the considered scenarios. techniques such as distillation, for example, assume that the original model can be controlled by the data practitioner, i.e. the internals of the model can be tuned to force specific representations of the given input throughout the adaptation process. in certain environments this may be possible, but generally it is not. to illustrate the utility of differential replication, we describe six different scenarios where it can be exploited to ensure a devised machine learning solution adapts to different changes in its environment. a widely established technique to provide explanations is to use linear models, such as logistic regression. model parameters, i.e. the linear coefficients associated with the different attributes, can be exploited to provide explanations to different audiences. although this approach works in simple scenarios where the variables do not need to be modified or pre-processed, this is never the case for real-life applications, where variables are usually redesigned before training and new, more complex features are often introduced. this is even worse when, in order to improve model performance, data scientists create a large set of new variables, such as bi-variate ratios or logarithm-scaled variables, to capture non-linear relations between original attributes that linear models cannot handle during the training phase.
this results in new variables being obfuscated and therefore often not intelligible to humans. a straightforward solution using differential replication is to replace the whole predictive system, composed of both the pre-processing/feature engineering step and the machine learning model, by a copy that treats both steps as a single black-box model [28]. doing this, we are able to deobfuscate model variables by training copies to learn the decision outputs of trained models directly from the raw data attributes without any pre-processing. in-company infrastructure is subject to continuous updates due to the rapid pace with which new software versions are released to the market. changes in the organizational structure of a company may drive the engineering department to change course. say, for example, that a company whose products were originally based on google's tensorflow package [29] makes the strategic decision of moving to pytorch [30]. in doing so, they might decide to re-train all models from scratch in the new environment. this is a long and costly process that can even result in a loss of performance, especially if the original data are not available or the in-house data scientists are new to the framework. alternatively, using differential replication, the knowledge acquired by the existing solutions could be exploited in the form of hard or soft labels, or as additional data attributes for the new generation. as a model is tested against new data throughout its lifespan, some of its learned biases may become apparent. under such circumstances, one may wish to transition to a new model that inherits the original predictive performance but ensures non-discriminatory outputs. a possible option is to edit the sensitive attributes to remove any bias, reducing in this way the disparate impact on the task t, and then training a new model on the edited dataset.
alternatively, in very specific scenarios where the sensitive information is not leaked through additional features, it is possible to build a copy by removing the protected data variables [31], or even to redesign the hypothesis space with a loss function that accounts for the fairness dimension when training subsequent generations. batch machine learning models are rendered obsolete by their inability to adapt to a change in the data distribution. when this happens, the most straightforward solution is to wait until there are enough samples of the new distribution and re-train the model. however, this solution is slow and often expensive. a faster solution is to use the idea of differential replication to create a new enriched dataset able to detect the data drift, for example by including the soft targets and a timestamp attribute in the target domain d t, and then to train a new model on this enriched dataset that replicates the decision behavior of the previously trained classifier. finally, we also need to allow the new model to learn from new incoming data samples. this second characteristic can be added by incorporating the online requirement into the constraints c t of the differential replication process [32]. developing good machine learning models requires abundant data. the more accessible the data, the more effective a model will be. in real applications, training machine learning models usually requires collecting a large volume of data from users, often including sensitive information. directly releasing models trained on user data could lead to privacy breaches, as the risk of leaking sensitive information encoded in the public model increases. differential replication can be used to avoid this issue by training another model, usually a simpler one, that replicates the learned decision behavior but preserves the privacy of the original training set by not being directly linked to these data.
the use of distillation techniques in the context of teacher-student networks, for example, has been reported to be successful in this task [33, 34]. auditing machine learning models is not an easy task. when an auditor wants to audit several models under the same constraints, it is required that all audited models fulfill an equivalent set of requirements. those requirements may limit the use of certain software libraries, or of certain model architectures. usually, even within the same company, each model is designed and trained on its own basis. as research in machine learning grows, new models are continuously devised. however, this fast growth in available techniques hinders the possibility of having a deep understanding of those models and makes the assessment of some auditing dimensions a nearly impossible task. in this scenario, differential replication can be used to establish a small set of canonical models into which all others can be translated. in this sense, deep knowledge of this set of canonical models would be enough to drive auditing tests. for example, let us consider that the canonical model is a deep learning architecture with a certain configuration. any other model can be translated into this particular architecture using differential replication 2. the auditing process then only needs to consider how to probe the canonical deep network to report an impact assessment. in this paper we have described a general framework based on knowledge reuse from generation to generation to extend the useful life of machine learning models by adapting them to their changing environment. to tackle this issue of environmental adaptation, we have proposed a mechanism inspired by how biological organisms evolve: differential replication. differential replication allows machine learning models to modify their behavior to meet the new requirements defined by the environment.
we envision this replication mechanism as a projection operator able to translate the decision behavior of a machine learning model into a new hypothesis space with different characteristics. this allows traits of a given classifier to be inherited by another that is more suitable under the new premises. we have listed different inheritance mechanisms to achieve this goal, depending on specific knowledge availability scenarios. these range from the more permissive inheritance by sharing the dataset to the more restrictive inheritance by copying, which is the solution requiring the least knowledge about the parent model and training data. finally, we provide examples of how differential replication applies in practice in six different real-life scenarios.

references:
on the origin of species by means of natural selection, or preservation of favoured races in the struggle for life
magic quadrant for data science and machine learning platforms
engaging the ethics of data science in practice
fairer machine learning in the real world: mitigating discrimination without collecting sensitive data
the fallacy of inscrutability
large scale online learning
end-to-end neural network architecture for fraud scoring in card payments
online error-correcting output codes
uncertainty estimation for black-box classification models: a use case for sentiment analysis
why should i trust you?: explaining the predictions of any classifier
a survey on transfer learning
model compression
application of proximity graphs to editing nearest neighbor decision rule
geometric decision rules for instance-based learning algorithms
application of the gabriel graph to instance based learning
distilling the knowledge in a neural network
rethinking the inception architecture for computer vision
knowledge distillation in generations: more tolerant teachers educate better students
when does label smoothing help
revisiting knowledge distillation via label smoothing regularization
label refinery: improving imagenet classification through label progression
using a neural network to approximate an ensemble of classifiers
zero-shot knowledge distillation in deep networks
knowledge distillation from internal representations
explaining knowledge distillation by quantifying the knowledge
transactional compatible representations for high value client identification: a financial case study
copying machine learning classifiers
towards global explanations for credit risk scoring
tensorflow: a system for large-scale machine learning
automatic differentiation in pytorch
using copies to remove sensitive data: a case study on fair superhero alignment prediction
from batch to online learning using copies
patient-driven privacy control through generalized distillation
private model compression via knowledge distillation

this work has been partially funded by the spanish project pid2019-105093gb-i00 (mineco/feder, ue), and by agaur of the generalitat de catalunya through the industrial phd grant 2017-di-25.

key: cord-161039-qh9hz4wz authors: tripathy, shrabani s.; bhatia, udit; mohanty, mohit; karmakar, subhankar; ghosh, subimal title: flood evacuation during pandemic: a multi-objective framework to handle compound hazard date: 2020-10-03 journal: nan doi: nan sha: doc_id: 161039 cord_uid: qh9hz4wz the evacuation of the population from flood-affected regions is a non-structural measure to mitigate flood hazards. shelters used for this purpose usually accommodate a large number of flood evacuees for a temporary period. floods during a pandemic result in a compound hazard. evacuations under such situations are difficult to plan, as social distancing is nearly impossible in highly crowded shelters. this results in a multi-objective problem with conflicting objectives of maximizing the number of evacuees from flood-prone regions and minimizing the number of infections at the end of the shelter stay. to the best of our knowledge, such a problem is yet to be explored in the literature.
here we develop a simulation-optimization framework, where multiple objectives are handled with a max-min approach. the simulation model consists of an extended susceptible-exposed-infectious-recovered-susceptible (seirs) model. we apply the proposed model to the flood-prone jagatsinghpur district in the state of odisha, india. we find that the proposed approach can provide an estimate of the number of people who need to be evacuated from individual flood-prone villages to reduce flood hazards during the pandemic; at the same time, this does not result in an uncontrolled number of new infections. the proposed approach can generalize to different regions and can provide a framework for stakeholders to manage conflicting objectives in disaster management planning and to handle compound hazards. floods have been among the most devastating natural disasters, causing massive loss of lives and property (2, 3). adequate preparedness and disaster management planning are required to minimize these losses and to increase recovery speed (4). during floods, evacuation is one of the most critical preparedness measures for minimizing the loss of lives: people from high flood-risk areas are shifted to safer areas (5). the objective of evacuation planning is to define a policy for people at high risk that minimizes loss of lives and damage to property (6). preparedness for floods and cyclones starts by creating safe shelters at strategic locations, not very far from the high-hazard areas. evacuation planning involves shifting vulnerable populations efficiently to the shelter homes while ensuring the timely distribution of essential commodities (7). preparing an evacuation strategy well before the flood occurs is pivotal to avoiding last-moment chaos, which arises from the involvement of decision-makers at multiple stages and the need to make the necessary arrangements to implement the evacuation in real time.
informing the evacuees well in advance about the evacuation eases the process greatly and makes evacuees comfortable in following the instructions (8). optimal allocation of evacuees to shelters is a key challenge in evacuation planning (4). under normal circumstances, the only objective is to decrease the number of people at flood risk, i.e. to maximize the number of people evacuated to nearby shelters. these shelter homes are designed to accommodate a very large number of people during natural disasters (for example, the capacity of a shelter on the east coast of india is approximately 2000). as shelter stays are short (around one week), the per capita area allocated is low (9). this is acceptable in normal scenarios; during a pandemic, however, it is essential to maintain social distancing to control the spread of covid-19. hence, during the pandemic scenario it is not desirable to fill the shelters to full capacity. on the other hand, evacuation demands the shifting of the maximum number of people from the (possibly) flood-affected regions to the shelters. the two objectives, reducing the spread of the pandemic (covid-19 here) and increasing the number of evacuated people, are in conflict with each other. this poses a challenge to disaster mitigation organizations and policymakers. in evacuation, people at potential risk are moved to safer shelter houses in a timely and safe manner (5, 10). evacuation planning involves a number of decision-makers and the disparate individual behavior of evacuees. effective evacuation planning requires well-defined roles, responsibilities, and communication amongst stakeholders (11). evacuation planning depends on factors such as geographical location, population size, the spatial extent of the event's extremes, the duration and intensity of the event, and uncertainties (12-14). understanding the evacuation process and the associated models is necessary for evacuation planning (15).
mathematical modeling and optimization have become helpful tools for evaluating the time required for evacuation and for allocating evacuees to optimal shelters (7, 16). various studies have used optimization models for flood evacuation to minimize losses, considering factors such as travel time and distance, cost of evacuation, and usage of infrastructure (5, 6, 14, 17-19). most of these studies take the objective function to be the minimization of the transportation distance and/or the time required to reach the shelters. while the objective of designed evacuation strategies is to minimize injuries and loss of life during the disaster, the prevalence of contagious diseases, including covid-19, presents conflicting priorities to stakeholders and policymakers. flood evacuation strategies are designed to encourage people to take shelter in designated areas. however, violation of social distancing protocols in these shelters could result in a sudden surge in infections and mortality rates (20). besides, immediately following a disaster and throughout the recovery period, healthcare facilities are often disrupted, which reduces the sector's capacity to respond to the primary health consequences of flooding and to deliver care to covid-19 patients (21). hence, disaster management approaches need to account for the effect of social distancing (figure 1(a-b)). we address these multiple objectives using the max-min approach of multi-objective optimization, which has been widely used in areas such as water resources management (23-25) and waste load allocation for water quality management in a stream (26-29). the model is applied to a flood-prone district on the east coast of india, the jagatsinghpur district in odisha. figure 1(c) caption: rates of transition to susceptible, susceptible to exposed, exposed to infected, infected to recovered, infected to the fatality state, and recovered to susceptible, respectively.
parameters θe and θi are testing rates, whereas ψe and ψi are positivity rates for exposed and infected individuals, respectively. jagatsinghpur is a coastal (east coast) district in the state of odisha, india (figure 2). the first step in designing any evacuation strategy is to identify the villages with high flood hazard. the hazard values associated with the 100-year return period were estimated for the jagatsinghpur district as reported by mohanty et al. (31). the authors considered regionalized design rainfall, design discharge, and design storm-tide as primary inputs to a comprehensive 1d-2d coupled mike flood model (32) to derive flood hazard values at the village level. in the present study, hazard values generated for flood quantiles corresponding to a 100-year return period and 24-hour duration are used to classify villages into the "high hazard" category (31, 33). the resulting optimization model for designing evacuation strategies (eqs. 1-11) assigns zero evacuees from a village to any shelter that is not in the set of its five closest shelters; the capacity of each shelter j, which bounds the number of evacuees staying in shelter j, is considered here to be 2000, as per the information provided by the government agency. the function f in eq. (7) is an epidemiological model based on the extended susceptible-exposed-infectious-recovered-susceptible (seirs) model. eq. (9) ensures that people in pucca houses are evacuated only after the complete evacuation of the population living in kutcha houses. to study the effect of social contact network structures on the propagation of covid-19 in shelter houses, we use the extended seirs model. in the standard seirs model, the entire population is divided into susceptible (s), exposed (e), infectious (i), and recovered (r) individuals.
in the extended seirs model, we further divide the population into detected exposed (de) and detected infected (di) via social tracing and testing parameters. the initial seed is then provided in terms of the population in each category. recent developments in the field of epidemiological modeling further compartmentalize contagious individuals according to the degree of severity of symptoms; however, given the limited availability of datasets to calibrate the associated parameters, we use a 7-compartment model in this study (figure 1(c)). a susceptible member becomes exposed or infected upon contacting an infected individual during a transmission event. newly exposed individuals experience a latent period during which they are not contagious (referred to as the incubation period). exposed individuals then progress to the infected stage, where they can get tested if they exhibit symptoms or if they have been selected for testing through the contact-tracing network, at the prevailing rates of contact tracing and testing in the society. the infected individual can then progress either to recovery (r) or succumb to the infection (f). since we are interested in decreasing flood risk under the covid-19 scenario, we use the deterministic mean-field implementation of the extended seirs model. specifically, we assume that, despite the underlying social interaction structure that is ubiquitous in any society, interactions within the shelter homes will primarily be random due to the violation of social distancing norms. hence, all individuals mix uniformly and have the same rates and parameters in the current implementation of the epidemiological model. we use the seirsplus package, implemented in python, to obtain the number of infected individuals in each shelter filled to the full capacity of 2000, using different values for the initial number of infections.
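the mean-field dynamics described above can be sketched in a few lines. the sketch below is not the seirsplus implementation used in the study: the compartment structure (s, e, i, de, di, r, f) follows the text, but every rate value, and the exact form of the detection terms, is an illustrative assumption.

```python
# hedged mean-field sketch of the 7-compartment extended seirs model:
# s, e, i, plus detected-exposed (de) and detected-infected (di) split off
# via testing/tracing, recovered (r) and fatalities (f). all rates below are
# illustrative placeholders, not the calibrated values from the study.

def seirs_step(state, beta=0.5, sigma=0.2, gamma=0.1, mu=0.01,
               theta_e=0.05, psi_e=0.5, theta_i=0.1, psi_i=0.9, xi=0.01):
    """one day of the deterministic mean-field dynamics (uniform mixing)."""
    s, e, i, de, di, r, f = state
    n = s + e + i + de + di + r           # living population
    new_exposed   = beta * s * i / n      # s -> e on contact with infectious
    e_to_i        = sigma * e             # end of latent period: e -> i
    e_detected    = theta_e * psi_e * e   # tracing and testing: e -> de
    i_detected    = theta_i * psi_i * i   # testing: i -> di
    de_to_di      = sigma * de            # detected exposed become infected
    rec_i, rec_di = gamma * i, gamma * di # recoveries
    d_i, d_di     = mu * i, mu * di       # fatalities
    waning        = xi * r                # r -> s (the seirs loop)
    s  += waning - new_exposed
    e  += new_exposed - e_to_i - e_detected
    i  += e_to_i - i_detected - rec_i - d_i
    de += e_detected - de_to_di
    di += i_detected + de_to_di - rec_di - d_di
    r  += rec_i + rec_di - waning
    f  += d_i + d_di
    return [s, e, i, de, di, r, f]
```

iterating seirs_step for seven days from a shelter of 2000 with 1% initially infected mimics the experiment described above, with the caveat that the numbers produced depend entirely on the assumed rates.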
we note that if the underlying social network structure and information on testing and isolation protocols are available, stochastic network models are recommended, to account for stochasticity, heterogeneity, and deviations from the uniform-mixing assumption (34). the multi-objective optimization model presented in eqs. (1-11) is solved here with the max-min approach. the first objective function can take a value between 0 and 1, as per eq. (4). the second objective function is also standardized, by dividing ij by ij,max, the maximum possible value of ij. the max-min approach maximizes the minimum of all the objectives (when an objective is to be minimized, it is recast as the maximization of its negative), which forces all the individual objectives toward their maxima. following the max-min approach, the optimization model is formulated accordingly. the model mentioned above is a non-linear optimization model. we use a search algorithm known as probabilistic global search lausanne (pgsl) to obtain a feasible optimal solution (35). pgsl, a global search algorithm, was developed by raphael and smith (35) based on the assumption that better results can be obtained by focusing the search on the neighborhood of good solutions. in every iteration, the algorithm increases the probability of drawing a solution from the region of good solutions of the previous iteration; the search space is thus narrowed until it converges to the optimum. pgsl differs from other methods in its use of four nested cycles, which improves the search by concentrating effort around good solutions (29). the four cycles of pgsl are as follows. sampling cycle: samples are generated randomly from the current pdf of each variable; each point is evaluated with the objective functions, and the best point is selected.
probability updating cycle: the probability of the neighborhood of good results is increased and that of bad results decreased, and the pdf of each variable is updated accordingly after each cycle. focusing cycle: after a number of probability updating cycles, the search is focused on an interval containing better solutions, by subdividing the interval containing the best solution for each variable. subdomain cycle: the search space keeps narrowing through the selection of a subdomain of the region of good points. we apply the developed optimization model (eqs. 12-25) to the case study of jagatsinghpur district. due to data non-availability, we have assumed hypothetical values for some of the variables for demonstration purposes. we considered zx to be 0.6, zy to be 0.4, and rf to be 0.8 for all villages i. the maximum shelter capacity is taken to be 2000, and the stay period in the shelter is taken to be seven days. these values were chosen after discussions with planners and management authorities working at different levels of decision making. we first applied our optimization model by considering a uniform initial infection value across the district; the infection value is taken as 1% for demonstration purposes. we then simulated the increase in the number of infections in a shelter at full capacity over seven days, assuming different initial infection values (0.1%, 0.25%, 0.5%, 0.75%, and 1%). we find the number of infections in a shelter to be in the range of 7 to 60 at the end of the stay, depending on the initial infection (supplementary figure s1). violation of social distancing norms within shelter houses operating at their designated capacity could expose a large number of individuals to highly contagious diseases, including covid-19. once exposed and infected individuals move back to their respective villages, this may result in widespread outbreaks at local scales.
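the four nested pgsl cycles described above can be caricatured in a few lines of code. this is not raphael and smith's algorithm: the sampling cycle is kept, while the probability-updating, focusing, and subdomain cycles are collapsed into a single interval-shrinking step around the best point found, purely for illustration.

```python
import random

# highly simplified pgsl-style probabilistic global search: sample uniformly,
# keep the best point, and progressively narrow each variable's interval
# around it. (the real pgsl maintains and updates per-variable pdfs; the
# shrink step below stands in for its probability-updating, focusing, and
# subdomain cycles.)

def pgsl_minimize(f, bounds, n_samples=100, n_focus=25, shrink=0.8, seed=0):
    """minimize f over the box 'bounds' = [(lo, hi), ...]."""
    rng = random.Random(seed)
    best_x, best_val = None, float("inf")
    for _ in range(n_focus):                     # focusing / subdomain cycles
        for _ in range(n_samples):               # sampling cycle
            x = [rng.uniform(lo, hi) for lo, hi in bounds]
            val = f(x)
            if val < best_val:
                best_x, best_val = x, val
        # narrow each interval around the current best point (stand-in for
        # the probability-updating and focusing cycles)
        bounds = [(max(lo, xb - shrink * (hi - lo) / 2),
                   min(hi, xb + shrink * (hi - lo) / 2))
                  for (lo, hi), xb in zip(bounds, best_x)]
    return best_x, best_val
```

on a smooth test function the interval quickly contracts onto the optimum; for the non-linear evacuation model, the same idea is applied to the standardized max-min objective.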
such a scenario may also become unmanageable, as medical and other facilities will be limited after flooding events. hence, proper evacuation strategy planning is needed to decrease both flood losses and the spread of covid-19. to check the applicability of our optimization model, we first used it for the non-pandemic scenario, which includes eqs. (12) to (25), excluding eqs. (14), (19), and (20). the results obtained from the model are presented in figure 3. we find that in most of the villages with high flood hazard, more than 50% of the population is evacuated (figure 3(a)). a good number of shelters (213) remain unused in the central area (figure 3(b)), as they are far away from the hazardous villages and transporting people to them is difficult. these shelters may not be useful during floods, but they are extensively used during cyclones. in most of the villages considered, more than 75% of the population in kutcha houses is evacuated (figure 3(c)); for 69 villages this fraction reaches 100%, with evacuations of a few people in pucca houses as well (figure 3(d)). such realistic results suggest that the model works efficiently under non-pandemic conditions. here we address an important compound-event problem related to flood evacuation, minimizing the loss due to flood hazards while also demonstrating the potential issues associated with compound climate risks in the covid-19 pandemic.

references:
increased human and economic losses from river flooding with anthropogenic warming
detecting change in uk extreme precipitation using results from the climateprediction.net bbc climate change experiment
shelter location-allocation model for flood evacuation planning
a bi-objective evacuation routing engineering model with secondary evacuation expected costs
multi-objective evacuation routing in transportation networks
optimal shelter location-allocation during evacuation with uncertainties: a scenario-based approach
evacuation modeling including traveler information and compliance behavior
book for management and maintenance of multipurpose cyclone / flood shelters
planning for protective action decision making: evacuate or shelter-in-place
a model-based approach for a systematic risk analysis of local flood emergency operation plans: a first step toward a decision support system
a dynamic evacuation network optimization problem with lane reversal and crossing elimination strategies
institutional traps and vulnerability to changes in climate and flood regimes in thailand
a review of recent studies on flood evacuation planning
a review of planning and operational models used for emergency evacuation situations in australia
models, solutions and enabling technologies in humanitarian logistics optimisation
patterns for the process of a planned evacuation in the event of a flood
a spatiotemporal optimization model for the evacuation of the population exposed to flood hazard
computer-based model for flood evacuation emergency planning
managing disasters amid covid-19 pandemic: approaches of response to flood disasters
coastal flooding and frontline health care services: challenges for flood risk resilience in the english health care system
socio-economic disparities in health system responsiveness in india
a fuzzy max-min decision bi-level fuzzy programming model for water resources optimization allocation under uncertainty
multiobjective two-phase fuzzy optimization approaches in management of water resources
regulation of water resources systems using fuzzy logic: a case study of amaravathi dam
an imprecise fuzzy risk approach for water quality management of a river system
hybrid fuzzy and optimal modeling for water quality evaluation
grey fuzzy optimization model for water quality management of a river system
risk minimization in water quality control problems of a river system
new delhi: office of the registrar general and census commissioner
a new bivariate risk classifier for flood management considering hazard and socio-economic dimensions
on the use of 1d and coupled 1d-2d modelling approaches for assessment of flood damage in urban areas
flood hazard assessment with multiparameter approach derived from coupled 1d and 2d hydrodynamic flow model
sir model on a dynamical network and the endemic state of an infectious disease
a direct stochastic algorithm for global search
modern optimization methods in water resources planning, engineering and management

key: cord-216208-kn0njkqg authors: botha, andré e.; dednam, wynand title: a simple iterative map forecast of the covid-19 pandemic date: 2020-03-23 journal: nan doi: nan sha: doc_id: 216208 cord_uid: kn0njkqg we develop a simple 3-dimensional iterative map model to forecast the global spread of the coronavirus disease. our model contains at most two fitting parameters, which we determine from the data supplied by the world health organisation for the total number of cases and new cases each day. we find that our model provides a surprisingly good fit to the currently available data, which exhibit a cross-over from exponential to power-law growth as lock-down measures begin to take effect. before these measures, our model predicts exponential growth from day 30 to 69, starting from the date on which the world health organisation provided the first 'situation report' (21 january 2020 - day 1). based on these initial data, the disease may be expected to infect approximately 23% of the global population, i.e. about 1.76 billion people, taking approximately 83 million lives.
under this scenario, the global number of new cases is predicted to peak on day 133 (about the middle of may 2020), with an estimated 60 million new cases per day. if current lock-down measures can be maintained, our model predicts power-law growth from day 69 onward. such growth is comparatively slow and would have to continue for several decades before a sufficient number of people (at least 23% of the global population) have developed immunity to the disease through being infected. lock-down measures appear to be very effective in postponing the unimaginably large peak in the daily number of new cases that would occur in the absence of any interventions. however, should these measures be relaxed, the spread of the disease will most likely revert to its original exponential growth pattern. as such, the duration and severity of the lock-down measures should be carefully weighed against their potentially devastating impact on the world economy. in march 2020, the world health organisation (who) characterised the 2019 outbreak of coronavirus disease (covid-19) as a pandemic, referring to its prevalence throughout the whole world 1. the outbreak started as a pneumonia of unknown cause, first detected in the city of wuhan, china. it was reported as such to the who on the 31st of december 2019, and has since reached epidemic proportions within china, where it has infected more than 80 000 citizens to date. during the first six weeks of 2020 the disease spread to more than 140 other countries, creating widespread political and economic turmoil due to unprecedented levels of spread and severity. the rapid spread of covid-19 is fuelled by the fact that the majority of infected people do not experience severe symptoms, making it more likely for them to remain mobile and hence to infect others 2. at the same time the disease can be lethal to some members of the population, having a globally averaged fatality ratio of 4.7% so far.
it is most likely this particular combination of traits that has made the covid-19 outbreak one of the largest in recorded history. in late 2002 and early 2003 a similar outbreak took place with the occurrence of severe acute respiratory syndrome (sars). although the etiological agent of sars is also a coronavirus, the virus was not able to spread as widely as in the current case. one possible reason why the sars outbreak was less devastating than the current outbreak is, paradoxically, its much higher fatality ratio (almost 10% globally), making it too severe to spread easily. while there are a number of models available for the global spread of infectious diseases 3, some even containing very sophisticated traffic layers 4, relatively few researchers are making use of simpler models that can provide the big picture, without details that are difficult to interpret unambiguously. in the latter category of relatively simple models we could find only a discrete epidemic model for sars 5 and, more recently, a comparison of the logistic growth and susceptible-infected-recovered (sir) models for covid-19 6. in our present work we develop a simple discrete 3-dimensional iterative map model, which shares some similarities with the classic sir model. we show that our model fits the currently available global data for covid-19. the fact that the available data for the pandemic can be fitted well by a simple model such as ours suggests that past and current interventions to curb the spread of the disease, globally, may not be very effective. as a model for the global data we use a 3-dimensional iterative map, given by

x_{i+1} = x_i + y_i ,
y_{i+1} = (α/z_0) y_i (z_i − x_i) ,    (1)
z_{i+1} = z_i − c y_i ,

where x_i is the total number of confirmed cases, y_i is the number of new cases, and z_i is the global population on any given day i. we denote the only fitting parameter by α, while c is a fixed parameter equal to the fraction of people who have died from the disease. according to the latest available data from the who (see table 2 in methods), c = 0.04719.
by using levenberg-marquardt (least-squares) optimisation 7 we find α = 1.14594, for the initial condition x_30 = 75152, y_30 = 359.63, and z_30 = z_0 = 7.7000 × 10^9. we briefly describe the physical content of eqs. (1). the first equation simply updates the total number of cases by setting it equal to the previous total plus the number of new cases. in the second equation the number of new cases is assumed to be proportional to the previous number of new cases multiplied by the previous number of susceptible people; here the factor of 1/z_0 has been introduced for convenience, to ensure that the proportionality constant α remains close to unity. the third equation keeps track of the global population by subtracting the estimated number of people who have died each day, based on the fraction c. figure 1 shows a comparison of the data with the model, as well as a forecast made up to the 200th day (see table 1). the forecast made in figure 1(b) (corresponding to the last row of table 1) predicts that approximately a quarter of the world's population, i.e. ≈ 1.83/7.7 = 0.24, would have had covid-19 by the 200th day. the peak of the pandemic is expected to occur on day 129, when about 65 million daily new cases can be expected. we also predict that by the beginning of august 2020 hardly any new cases should occur; however, the total number of lives lost by then could be as high as 86 million. in table 1 we see that the fitting parameter α, and hence the predictions made by the model, change somewhat as more of the available data is used in the fitting procedure. to see the variation in α more clearly we have plotted the first and second columns of table 1 in figure 2. it shows that, as more data is used, there is a general upward trend in α until day 64. at the same time the increase in α is not monotonic, since α appears to oscillate.
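a compact implementation of the map makes the dynamics easy to reproduce. the update rules below are a reconstruction from the verbal description of eqs. (1) given in this section (totals accumulate new cases; new cases grow in proportion to the susceptible population with constant α/z_0; deaths remove a fraction c of new cases from the population), so treat it as a sketch rather than the authors' exact code.

```python
# reconstruction of the 3-dimensional iterative map, eqs. (1):
#   x_{i+1} = x_i + y_i                        (total cases)
#   y_{i+1} = (alpha / z0) * y_i * (z_i - x_i) (new cases)
#   z_{i+1} = z_i - c * y_i                    (population minus deaths)

ALPHA = 1.14594   # fitted growth parameter reported in the text
C = 0.04719       # fatality fraction reported in the text
Z0 = 7.7e9        # initial global population

def step(x, y, z, alpha=ALPHA, c=C, z0=Z0):
    """one day of the map for (total cases, new cases, population)."""
    return x + y, (alpha / z0) * y * (z - x), z - c * y

def run(days, x0=75152.0, y0=359.63, z0=Z0):
    """iterate the map from the day-30 initial condition quoted above."""
    x, y, z = x0, y0, z0
    traj = [(x, y, z)]
    for _ in range(days):
        x, y, z = step(x, y, z)
        traj.append((x, y, z))
    return traj
```

early on, (z − x)/z_0 ≈ 1, so new cases grow by a factor of roughly α ≈ 1.146 per day (the initial exponential phase); the quadratic term then produces the sir-like turnover in y.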
while we are not sure, at this stage, whether α is converging, or whether it will continue to increase (or decrease) in the future, we note that the variations in α, and in the predictions made by the model, have been relatively small over the last two weeks. thus it seems that, as more data is used to calculate α, the variations will become smaller, assuming there are no systematic errors in the current or future data (see discussion). table 1 columns: last day used in the fit, α, x_200 × 10^9, max{y} × 10^6, and the day of max{y}. in figure 2 we also plot (blue solid line) the mean value ᾱ = 1.156 over the last ten days. the oscillations of the calculated data points about this line give an indication of the uncertainty in ᾱ over the last ten days. figure 2 caption: variation of the fitting parameter α as more and more of the available data is used in the fitting procedure; the value of α seems to have stabilised over the last ten days, as discussed in the main text. as a rough estimate of the uncertainties in the predictions made by the model, we also calculate the means and standard deviations of the other quantities given in the last ten rows of table 1. this results in x_200 = (1.94 ± 0.06) billion, max{y} = (74 ± 5) million, peak day = 126 ± 2, and 'deaths' = (86 ± 3) million. while we realise that this method may not result in rigorous estimates of the uncertainties involved, we provide it merely as a rough estimate of the sensitivity of our simple model to the new data coming in as the pandemic continues. from the trend that can be seen in table 1, it seems that our current model actually provides a best-case-scenario prediction since, as more data becomes available, the resulting predictions become less and less optimistic, e.g. in terms of the total number of lives lost.
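the rough uncertainty estimate used above, the mean and sample standard deviation over the last ten daily re-fits, amounts to the following; the α values listed are illustrative placeholders, not the actual entries of table 1.

```python
from statistics import mean, stdev

# illustrative stand-ins for the last ten fitted alpha values (not table 1)
alphas = [1.149, 1.162, 1.151, 1.160, 1.154, 1.158, 1.155, 1.159, 1.156, 1.157]

alpha_bar = mean(alphas)   # rough central estimate of alpha
alpha_sd = stdev(alphas)   # rough uncertainty: spread of the last ten fits
```

the same mean/stdev treatment applied to the other columns of table 1 yields the quoted x_200, max{y}, peak-day, and deaths intervals.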
furthermore, as the disease spreads there will probably be many more unreported cases, either due to asymptomatic responses or simply because the numbers are becoming too large to manage (see, for example, ref. 8). in developing countries such as south africa there is also a relatively large percentage of people with compromised immunity, due to the high prevalence of the human immunodeficiency virus (hiv), and this too could result in the coronavirus having a much larger impact than our model of the current global data shows. another factor to consider is the reliability of the who data itself. at present, this data is probably the most accurate we will ever have. however, as things progress, there will be a much greater chance of unreported cases, since people are now being instructed to contact hospitals only if they experience severe symptoms; all other cases are unlikely to be tested or confirmed. our present model does not take such details into account. one can of course try to answer more specific questions with a more sophisticated model, like the discrete model we mentioned for sars 5; here, however, we have been more interested in developing a very simple model that brushes over the details and captures only the essential, large-scale behaviour. as we have already alluded to, our model may not be suitable for individual countries, because it does not include many factors that may be necessary to predict the spread of the disease in specific situations. additionally, one must realise that the population of one country is much smaller than the world's, and the initial interventions taken could range from minimal to very severe, as in italy and china, for example. in contrast, on a global scale the population is essentially limitless, and it is nearly impossible to impose restrictions on everybody.
hence, it is our contention that the virus will spread more naturally on a global scale, almost as if it were left completely unchecked. the direct human cost of such an unchecked spread could be truly devastating 9. on the one hand it could result in a catastrophic loss of tens of millions of lives, as our model predicts; on the other hand, all the (possibly ineffective) measures being taken by individual countries to contain the virus could also have fatal consequences. so far these measures have included enforced quarantine, which has led to a severe slowdown in economic activity and manufacturing production, principally due to declining consumption and disrupted global supply chains 10. (as an example of the severity of the slowdown in production, several major car manufacturers are gradually halting production in major manufacturing hubs throughout the developed world 11.) this decline, coupled with the associated economic uncertainty, has had knock-on effects in the form of historically unprecedented stock market falls 12. although the stock market is more of an indicator of the future value of the profits of listed corporations, their collapsed share prices could trigger severe financial crises because of a spike in bankruptcies. (the debt of us corporations is the highest it has ever been 13.) the inevitable loss of jobs will also lead to an inability to pay bills and mortgages, increased levels of crime, etc. in principle, such a major decline in economic conditions could also result in a large-scale loss of life, which should be weighed carefully against the direct effects that the unimpeded global spread of covid-19 could have. we have fitted our model to the data shown in table 2. for the reader's convenience, the complete python script for the optimisation is provided on the following page.
in this script, the function leastsq(), imported from the module scipy.optimize 15, uses levenberg-marquardt optimization to minimize the residual vector returned by the function ef(). the function leastsq() is called from within main(), which reads in the data and sets up the initial parameter and the other two quantities (the initial values x[0] and y[0]) for optimisation. these three quantities are passed to leastsq() via the vector v0, and the fitted values are printed for the data in table 2.

references:
who director-general's opening remarks at the media briefing on covid
covert coronavirus infections could be seeding new outbreaks
insights from early mathematical models of 2019-ncov acute respiratory disease (covid-19) dynamics
gleamviz: the global epidemic and mobility model
a discrete epidemic model for sars transmission and control in china
estimation of the final size of the covid-19 epidemic
an algorithm for least-squares estimation of nonlinear parameters
test backlog skews sa's corona stats (the mail and guardian)
virus could have killed 40 million without global response (nature news)
a covid-19 supply chain shock born in china is going global
coronavirus: car production halts at ford, vw and nissan
coronavirus: ftse 100, dow, s&p 500 in worst day since
a modern jubilee as a cure to the financial ills of the coronavirus
coronavirus disease (covid-2019) situation reports
python scripting for computational science

a. e. b. would like to acknowledge m. kolahchi and v. hajnová for helpful discussions about this work. both authors wish to thank a. thomas for uncovering some of the related literature.

table 2 (excerpt): day, total cases, new cases, deaths
30 75204 1872 2009
31 75748 548 2129
32 76769 1021 2247
33 77794 599 2359
34 78811 1017 2462
35 79331 715 2618
36 80239 908 2700
37 81109 871 2762
38 82294 1185

a. e. b. devised the research project and performed all the numerical simulations. both authors analysed the results and wrote the paper.
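the fitting procedure described above can be imitated without scipy. the sketch below replaces the levenberg-marquardt call on the residual function ef() with a coarse grid search over α alone, holding x[0] and y[0] fixed at the values quoted earlier and using the day-30 to day-38 totals from the excerpt of table 2; it illustrates the structure of the fit, not the authors' script.

```python
# who totals for days 30-38 (excerpt of table 2)
DATA = [75204, 75748, 76769, 77794, 78811, 79331, 80239, 81109, 82294]
C, Z0 = 0.04719, 7.7e9   # fatality fraction and initial population (from the text)

def simulate_totals(alpha, x0=75152.0, y0=359.63, n=len(DATA)):
    """totals x_i produced by the map of eqs. (1) for n consecutive days."""
    x, y, z = x0, y0, Z0
    out = [x]
    for _ in range(n - 1):
        # simultaneous update: the right-hand side uses the old (x, y, z)
        x, y, z = x + y, (alpha / Z0) * y * (z - x), z - C * y
        out.append(x)
    return out

def sse(alpha):
    """sum of squared residuals, the quantity leastsq() minimises."""
    return sum((m - d) ** 2 for m, d in zip(simulate_totals(alpha), DATA))

def fit_alpha(lo=1.0, hi=1.3, steps=601):
    """coarse grid-search stand-in for the levenberg-marquardt fit over alpha."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return min(grid, key=sse)
```

because x[0] and y[0] are held fixed here and only nine data points are used, the resulting α differs from the paper's 1.14594; the point is the structure of the fit (map → residuals → least squares), not the value.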
the authors declare no competing interests. key: cord-174036-b3frnfr7 authors: thomas, loring j.; huang, peng; yin, fan; luo, xiaoshuang iris; almquist, zack w.; hipp, john r.; butts, carter t. title: spatial heterogeneity can lead to substantial local variations in covid-19 timing and severity date: 2020-05-20 journal: nan doi: nan sha: doc_id: 174036 cord_uid: b3frnfr7 standard epidemiological models for covid-19 employ variants of compartment (sir) models at local scales, implicitly assuming spatially uniform local mixing. here, we examine the effect on disease diffusion of employing more geographically detailed diffusion models based on known spatial features of interpersonal networks, most particularly the presence of a long-tailed but monotone decline in the probability of interaction with distance. based on simulations of unrestricted covid-19 diffusion in 19 u.s. cities, we conclude that heterogeneity in population distribution can have large impacts on local pandemic timing and severity, even when aggregate behavior at larger scales mirrors a classic sir-like pattern. impacts observed include severe local outbreaks with long lag time relative to the aggregate infection curve, and the presence of numerous areas whose disease trajectories correlate poorly with those of neighboring areas. a simple catchment model for hospital demand illustrates potential implications for health care utilization, with substantial disparities in the timing and extremity of impacts even without distancing interventions. likewise, analysis of social exposure to others who are morbid or deceased shows considerable variation in how the epidemic can appear to individuals on the ground, potentially affecting risk assessment and compliance with mitigation measures.
these results demonstrate the potential for spatial network structure to generate highly non-uniform diffusion behavior even at the scale of cities, and suggest the importance of incorporating such structure when designing models to inform healthcare planning, predict community outcomes, or identify potential disparities. since its emergence at the end of 2019, the sars-cov-2 virus has spread rapidly to all portions of the globe, infecting nearly five million people as of late may 2020 (1) . the disease caused by this virus, denoted covid-19, generally manifests as a respiratory illness that is spread primarily via airborne droplets. while most cases of covid-19 are non-fatal, a significant fraction of those infected require extensive supportive care, and the mortality rate is substantially higher than for more common infectious diseases such as seasonal influenza (2) . even for survivors, infection can lead to long-term damage to the lungs and other organs, leading to long convalescence times and enhanced risks of secondary complications (3, 4) . by early march of 2020, covid-19 outbreaks had appeared on almost every continent, including significant clusters within many cities (5) . in the absence of an effective vaccine, public health measures to counteract the pandemic in developed nations have focused on social distancing measures that seek to slow diffusion sufficiently to avoid catastrophic failure of the healthcare delivery system. both the planning and public acceptance of such measures have been highly dependent upon the use of epidemiological models to probe the potential impact of distancing interventions, and to anticipate when such measures may be loosened with an acceptable level of public risk. as such, the assumptions and behavior of covid-19 diffusion models are of significant concern.
currently dominant approaches to covid-19 modeling (6) (7) (8) are based on compartment models (often called sir models, after the conventional division of the population into susceptible, infected, and recovered groups in the most basic implementations) that implicitly treat individuals within a population as geographically well-mixed. while some such models include differential contact by demographic groups (e.g., age), and may treat states, counties, or occasionally cities as distinct units, those models presently in wide use do not incorporate spatial heterogeneity at local scales (e.g., within cities). past work, however, has shown evidence of substantial heterogeneity in social relationships at regional, urban, and sub-urban scales (9, 10) , with these variations in social network structure impacting outcomes as diverse as regional identification (11) , disease spread (12) , and crime rates (13) , in both human and non-human networks (14) . if individuals are not socially "well-mixed" at local scales, then it is plausible that diffusion of sars-cov-2 via interpersonal contacts will likewise depart from the uniform mixing characteristic of sir models. indeed, at least one computational study (15) using a fairly "generic" (non-covid) diffusion process on realistic urban networks has shown considerable nonuniformity in diffusion times, suggesting that such effects could hypothetically be present. however, it could also be hypothesized that such effects would be small perturbations to the broader infection curve captured by conventional compartment models, with little practical importance. the question of whether these effects are likely to be present for covid-19, and if so their strength and size, has to date remained open. in this paper, we examine the potential impact of local spatial heterogeneity on covid-19, modeling the diffusion of sars-cov-2 in populations whose contacts are based on spatially plausible network structures.
we focus here on the urban context, examining nineteen different cities in the united states. we simulate the population of each city in detail (i.e., at the individual level), modeling hypothetical outbreaks on the contact network in each city in the absence of measures such as social distancing. despite allowing the population to be well-mixed in all other respects (i.e., not imposing mixing constraints based on demographic or other characteristics), we find that spatial heterogeneity alone is sufficient to induce substantial departures from spatially homogeneous sir behavior. among the phenomena observed are "long lag" outbreaks that appear in previously unharmed communities after the aggregate infection wave has largely subsided; frequently low correlations between infection timing in spatially adjacent communities; and distinct sub-patterns of outbreaks found in some urban areas that are uncorrelated with the broader infection pattern. gaps between infection peaks at the intra-urban level can be large, e.g. on the order of weeks or months in extreme cases, even for communities that are within kilometers of each other. such heterogeneity is potentially consequential for the management of healthcare delivery services: as we show using a simple "catchment" model of hospital demand, local variations in infection timing can easily overload some hospitals while leaving others relatively empty (absent active reallocation of patients). likewise, we show that individuals' social exposures to others who are morbid or deceased vary greatly over the course of the pandemic, potentially leading to differences in risk assessment and bereavement burden for persons residing in different locations.
differences in outbreak timing and severity may exacerbate health disparities (since e.g., surge capacity varies by community) and may even affect perception of and support for prophylactic behaviors among the population at large, with those in so-far untouched communities falsely assuming that the pandemic threat is either past or was exaggerated to begin with, or attributing natural variation in disease timing to the impact of health interventions. we note at the outset that the models used here are intended to probe the hypothetical impact of spatial heterogeneity on covid-19 diffusion within particular scenarios, rather than to produce high-accuracy predictions or forecasts. for the latter applications, it is desirable to incorporate many additional features that are here simplified to facilitate insight into the phenomenon of central interest. in particular, we do not incorporate either demographic effects or social distancing, allowing us to consider a setting that is as well-mixed as possible (and hence as close as possible to an idealized sir model) with the exception of spatial heterogeneity. as we show, even this basic scenario is sufficient to produce large deviations from the sir model. despite the simplicity of our models, we do note that the approach employed here could be integrated with other factors and calibrated to produce models intended for forecasting or similar applications. covid-19 is typically transmitted via direct contact with infected individuals, with the greatest risk occurring when an uninfected person is within approximately six feet of an infected person for an extended period of time. such interactions can be modeled as events within a social network, where individuals are tied to those with whom they have a high hazard of intensive interaction. in prior work, this approach has been successfully employed for modeling infectious diseases ranging from hiv (16) and influenza (17) to zika (18) transmission.
to model networks of potential contacts at scale, we employ spatial network models (19) , which are both computationally tractable and able to capture the effects of geography and population heterogeneity on network structure (20) . such models have been successfully used to capture social phenomena ranging from neighborhood-level variation in crime rates (13) and regional identification (11) to the flow of information among homeless persons (21) . the spatial network models used here allow for complex social dependence through a kernel function, referred to as the social interaction function or sif. the sif formally defines the relationship between two individuals based on spatial proximity. for example, it has been shown that many social interaction patterns obey zipf's law (22) , where individuals are more likely to interact with others close by rather than far away (a pattern that holds even for online interactions (9) ). here, we use this approach to model a network that represents a combination of frequent interactions due to ongoing social ties, and contacts resulting from frequent incidental encounters (e.g., interactions with neighbors and community members). we follow the protocol of (13, 20) to simulate social network data that combines the actual distribution of residents in a city with a pre-specified sif. we employ the model and data from (13) to produce large-scale social networks for 19 cities and counties in the united states -providing a representation of major urban areas in the united states (see supplement). given these simulated networks, we then implement an sir-like framework to examine covid-19 diffusion. at each moment in time, each individual can be in a susceptible, infected but not infectious, infectious, deceased, or recovered state.
the disease diffuses through the contact network, with currently infectious individuals infecting susceptible neighbors as a continuous-time poisson process with a rate estimated from mortality data (see supplement); recovered or deceased individuals are not considered infectious for modeling purposes. upon infection, an individual's transitions between subsequent states (and into mortality or recovery) are governed by waiting time distributions based on epidemiological data as described in the supplementary materials. to begin each simulated trajectory, we randomly infect 25 individuals, with all others being considered susceptible. simulation proceeds until no infectious individuals remain. from the simulated trajectory data, we produce several metrics to assess spatial heterogeneity in disease outcomes. first, we present infection curves for illustrative cities, showing the detailed progress of the infection and its difference from what an sir model would posit. we also present choropleth maps showing spatial variation in peak infection times, as well as the correlations between the infection trajectory within local areal units and the aggregate infection trajectory for the city as a whole. while an sir model would predict an absence of systematic variation in the infection curves or the peak infection day for different areal units in the same city, geographically realistic models show considerable disparities in infection progress from one neighborhood to another. to quantify the degree of heterogeneity more broadly, we examine spatial variation in outcomes for each of our city networks. we show that large variations in peak infection days across tracts are typical (often spanning weeks or even months), and that overall correlations of within-tract infection trajectories with the aggregate urban trajectory are generally modest (a substantial departure from what would be expected from an sir model).
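the diffusion process described above can be sketched as a discrete-event simulation: infectious nodes transmit to neighbors with exponentially distributed waiting times, and each infected node is removed (recovered or deceased) after its own waiting time. the state names, rates, and the toy usage below are illustrative simplifications (the paper calibrates its rates from mortality data and uses additional states), not the authors' actual implementation, which is in r.

```python
# Hedged sketch of an SIR-like continuous-time diffusion on a contact
# network. Rates and the case-fatality ratio are illustrative.
import heapq
import random

def simulate(adj, seeds, beta=0.5, recover_mean=7.0, cfr=0.02, seed=42):
    rng = random.Random(seed)
    state = {v: "S" for v in adj}   # S, I, R (recovered), D (deceased)
    # event tuples: (time, tiebreak, kind, node, source)
    events = [(0.0, i, "infect", s, None) for i, s in enumerate(seeds)]
    heapq.heapify(events)
    tick = len(seeds)
    infected_at = {}
    while events:
        t, _, kind, v, src = heapq.heappop(events)
        if kind == "infect":
            # a transmission only succeeds if the target is still
            # susceptible and the source is still infectious
            if state[v] != "S" or (src is not None and state[src] != "I"):
                continue
            state[v] = "I"
            infected_at[v] = t
            for u in adj[v]:  # schedule transmission attempts to neighbors
                heapq.heappush(events,
                               (t + rng.expovariate(beta), tick, "infect", u, v))
                tick += 1
            out = "D" if rng.random() < cfr else "R"  # removal event
            heapq.heappush(events,
                           (t + rng.expovariate(1.0 / recover_mean), tick, out, v, None))
            tick += 1
        elif state[v] == "I":
            state[v] = kind
    return state, infected_at
```

running this on a small line graph, e.g. `simulate({i: [j for j in (i-1, i+1) if 0 <= j < 10] for i in range(10)}, [0])`, shows the infection propagating outward from the seed until no infectious individuals remain.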
in addition to these relatively abstract metrics, we also examine a simple measure of the potential load on the healthcare system in each city. given the locations of each hospital in each city, we attribute infections to each hospital using a voronoi tessellation (i.e., under the simple model that individuals are most likely to be taken to the nearest hospital if they become seriously ill). examination of the potential hospital demand over time shows substantial differences in load, with some hospitals severely impacted while others have few cases. finally, we consider the social exposure of individuals to covid-19, by computing the fraction of individuals with a personal contact who is respectively morbid or deceased. our model shows considerable differences in these metrics over time, revealing that the pandemic can appear very different to those "on the ground" -evaluating its progress by its impact on their own personal contacts -than what would be suggested by aggregate statistics. networks are generated using population distributions from (13), based on block-level census data. hospital information is obtained from the homeland infrastructure foundation-level data (hifld) database (23) . hifld is an initiative that collects geospatial information on critical infrastructure across multiple levels of government. we employ the national-level hospital facility database, which contains locations of hospitals for the 50 us states, washington d.c., and the us territories of puerto rico, guam, american samoa, the northern mariana islands, palau, and the virgin islands; underlying data are collated from various state departments or federal sources (e.g., oak ridge national laboratory). we employ all hospitals within our 19 target cities, excluding facilities closed since 2019. latitude/longitude coordinates and capacity information were employed to create a spatial database that includes information on the number of beds in each hospital.
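the voronoi attribution above is equivalent to a nearest-site assignment: each case lands in the cell of the hospital closest to it. a minimal sketch, using planar coordinates and illustrative data (the paper works from latitude/longitude and real hospital locations):

```python
# Hedged sketch of the catchment model: assigning each case to the
# nearest hospital is exactly a Voronoi tessellation by hospital sites.
import math

def nearest_hospital(case_xy, hospitals):
    # hospitals: dict name -> (x, y); returns the owner of the Voronoi
    # cell containing the case location
    return min(hospitals, key=lambda h: math.dist(case_xy, hospitals[h]))

def catchment_counts(cases, hospitals):
    # tally cases per hospital catchment
    counts = {h: 0 for h in hospitals}
    for xy in cases:
        counts[nearest_hospital(xy, hospitals)] += 1
    return counts
```

for example, with hospitals at (0, 0) and (10, 0), a case at (2, 0) falls in the first catchment and one at (9, 1) in the second.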
the dates of the first confirmed case and all the death cases for king county, where seattle is located, were obtained from the new york times, based on reports from state and local health agencies (24) . the death rate was calculated based on population size of each county from the 2018 american community survey, and employed to calibrate the infection rate (the only free parameter in the models used here); details are provided in the supplemental materials. we ran 10 replicates of the covid-19 diffusion process in each of our 19 cities, seeding with 25 randomly selected infections in each replicate and following the course of the diffusion until no infectious individuals remained. simulations were performed using a combination of custom scripts for the r statistical computing system (25) and the statnet library (26) (27) (28) . analyses were performed using r. when taken over even moderately sized regions, aggregate infection curves can appear relatively smooth. although this suggests homogeneous mixing (as assumed e.g. by standard sir models), appearances can be deceiving. fig. 1 shows typical realizations of infection curves for two cities (seattle, wa and washington, dc), showing both the aggregate trajectory (red) and trajectories within individual census tracts (black). while the infection curves in both cases are relatively smooth, and suggestive of a fairly simple process involving a sharp early onset followed by an initially sharp but mildly slowing decline in infections, within-tract trajectories tell a different story. instead of one common curve, we see that tracts vary wildly in onset time and curve width, with some tracts showing peaks weeks or months after the initial aggregate spike has passed. the cases of fig. 1 are emblematic of a more systematic phenomenon: the progress of the infection within any given areal unit often has relatively little relationship to its progress in the city as a whole. 
we quantify the extent to which tracts share a common pattern by taking the variance on the first principal component of the standardized infection curves. as before, where different parts of the city experience similar patterns of growth and decline in infections, we expect the dimension of greatest shared variance to account for the overwhelming majority of variation in infection rates. contrary to these expectations, however, fig. 2 shows that there is little coherence in tract-level infection patterns. mean correlations of local infection curves across tracts typically range between approximately 0 and 0.5, with a mean of approximately 0.2, indicating very little correspondence between infection timing in one tract and that of another. the principal component analysis tells a similar story: overall, we see that the first component accounts for relatively little of the total variance in trajectories, with on average only around 35% of variation in infection curves lying on the first principal component (and no observed case of the first component accounting for more than 60% of the variance). this confirms that local infection curves are consistently distinct, with behavior that is only weakly related to infections in the city as a whole. this is a substantially different scenario than what is commonly assumed in traditional sir models. these differences in local infection curves are a consequence of the unevenness of the "social fabric" that spans the city: while the disease can spread rapidly within regions of high local connectivity, it can easily become stalled upon reaching the boundaries of these regions. further transmission requires that a successful infection event occur via a bridging tie, an event with a potentially long waiting time.
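the principal-component diagnostic described above can be sketched as follows: standardize each tract's infection curve, then compute the share of total variance captured by the first singular direction. the standardization convention (per-tract) is an assumption about the authors' procedure; the synthetic curves in the usage note are illustrative.

```python
# Hedged sketch of the PC1 variance-share diagnostic for coherence of
# tract-level infection curves. A share near 1 means tracts rise and
# fall together; a low share means little shared pattern.
import numpy as np

def first_pc_share(curves):
    # curves: (n_tracts, n_days) array of infection counts
    z = curves - curves.mean(axis=1, keepdims=True)
    sd = curves.std(axis=1, keepdims=True)
    z = z / np.where(sd == 0, 1.0, sd)        # standardize each tract curve
    s = np.linalg.svd(z, compute_uv=False)     # singular values
    return s[0] ** 2 / np.sum(s ** 2)          # variance share on PC1
```

feeding in twenty identical bell-shaped curves yields a share near 1, while independent noise curves yield a much lower share, mirroring the paper's observed ~35% average.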
such delays create potential opportunities for public health interventions (trace/isolate/treat strategies), but they can also create a false sense of security for those on the opposite side of the bridge (who may incorrectly assume that their area was passed over by the infection). indeed, examining the time to peak infection across the cities of seattle and washington, d.c. (fig. 3) shows that while peak times are visibly autocorrelated, tracts with different peak times frequently border each other. residents on opposite sides of the divide may be exposed to very different local infection curves, making risk assessment difficult. the cases of seattle and washington, dc are not anomalous. looking across multiple trajectories over our entire sample, fig. 4 shows consistently high variation in per-tract peak infection times for nearly all study communities. (this variation is also seen within individual trajectories, as shown in supplemental figure s2.) although peak times in some cities are concentrated within an interval of several days to a week, it is more common for peak times to vary by several months. such gaps are far from what would be expected under uniform local mixing. variation in the timing of covid-19 impacts across the urban landscape has potential ramifications for healthcare delivery, creating unequally distributed loads that overburden some providers while leaving others with excess resources. to obtain a sense of how spatial heterogeneity in the infection curve could potentially impact hospitals, we employ a simple "catchment" model in which seriously ill patients are taken to the nearest hospital, subsequently recovering and/or dying as assumed throughout our modeling framework. based on prior estimates, we assume that 20% of all infections are severe enough to require hospitalization (29) . while hospitals draw from (and hence average across) areas that are larger than tracts, the heterogeneity shown in fig.
1 suggests the potential for substantial differences in hospital load over time. indeed, our models suggest that such differences will occur. fig. 5 shows the number of patients arriving at each hospital in seattle and washington, dc (respectively) during a typical simulation trajectory. while some hospitals do have demand curves that mirror the city's overall infection curve, others show very different patterns of demand. in particular, some hospitals experience relatively little demand in the early months of the pandemic, only to be hit hard when infections in the city as a whole are winding down. just as hospital load varies, hospital capacities vary as well. as a simple measure of strain on hospital resources, we consider the difference between the number of covid-19 hospitalizations and the total capacity of the hospital (in beds), truncating at zero when demand outstrips supply. (for ease of interpretation as a measure of strain, we take the difference such that higher values indicate fewer available beds.) using data on hospital locations and capacities, we show in fig. 6 strain on all hospitals in seattle and washington, d.c. (respectively) during a typical infection trajectory. while some hospitals are hardest hit early on (as would be expected from the aggregate infection curve), others do not peak for several months. likewise, hospitals proximate to areas of the city with very different infection trajectories experience natural "curve flattening," with a more distributed load, while those that happen to draw from positively correlated areas experience very sharp increases and declines in demand. these conditions in some cases combine to keep hospitals well under capacity for the duration of the pandemic, while others are overloaded for long stretches of time. these marked differences in strain for hospitals within the same city highlight the potentially complex consequences of heterogeneous diffusion for healthcare providers. 
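the strain and overload metrics discussed above admit a simple sketch. one plausible reading of the definition (an assumption on our part, since the text is terse): occupied beds are demand capped at capacity, so that available beds truncate at zero when demand outstrips supply and higher strain means fewer available beds; overload duration is then the count of days at or over capacity. the numbers below are illustrative.

```python
# Hedged sketch of hospital strain and overload duration. The exact
# truncation convention is our reading of the text, not a quoted formula.
def strain_series(demand, capacity):
    # occupied beds per day: demand capped at capacity
    return [min(d, capacity) for d in demand]

def overload_days(demand, capacity):
    # number of days the hospital is at or over capacity
    return sum(1 for d in demand if d >= capacity)
```

for a 100-bed hospital seeing daily demand [10, 50, 120, 200, 90], the strain series saturates at 100 on the two overloaded days.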
looking across cities, we see the same high-variability patterns as observed in seattle and washington. in particular, we note that local variation in disease timing leads to a heavy-tailed distribution for the duration at which hospitals will be at capacity. fig. 7 shows the marginal distribution of hospital overload periods (defined as total number of days at capacity during the pandemic), over the entire sample. while the most common outcome is for hospitals to be stressed for a brief period (not always to the breaking point), a significant fraction of hospitals end up being overloaded for months -or even, in a small fraction of cases, nearly the whole duration of the pandemic. while most hospitals will have only brief periods of overload, some will be at or over capacity for the entire pandemic, potentially several years. it should be reiterated that the hospital load model used here is extremely simplified, and that we are employing a no-mitigation scenario. however, these results quite graphically demonstrate that the importance of curve-flattening interventions does not abate once geographical factors are taken into account. on the other hand, these results suggest that differences in hospital load may be substantially more profound than would be anticipated from uniform mixing models, creating logistical challenges and possibly exacerbating existing differences in resource levels across hospitals. at the same time, such heterogeneity implies that resource sharing and patient transfer arrangements could prove more effective as load-management strategies than would be suggested by spatially homogeneous models, as hospitals are predicted to vary considerably in the timing of patient demand. in addition to healthcare strain, the subjective experience of the pandemic will potentially differ for individuals residing in different locations. 
in particular, social exposures to outcomes such as morbidity or mortality may shape individuals' understandings of the risks posed by covid-19, and their willingness to undertake protective actions to combat infection. such exposures may furthermore act as stressors, with potential implications for physical and/or mental health. as a simple measure of social exposure, we consider the question of whether a focal individual (ego) either has experienced a negative outcome themselves, or has at least one personal contact (alter) who has experienced the outcome in question. (given the highly salient nature of covid-19 morbidity and mortality, we focus on the transition to first exposure rather than e.g. the total number of such exposures, as the first exposure is likely to have the greatest impact on ego's assessment of the potential severity of the disease.) to examine how social exposure varies by location, we compute the fraction of individuals in each tract who are socially exposed to (respectively) morbidity or mortality. fig. 8 shows these proportions for baltimore, md, over the course of the pandemic. as with other outcomes examined here, we see considerable variation in timing, with many tracts seeing a rapid increase in exposure to infections, while others go for weeks or months with relatively few persons having a personal contact with the disease. another notable axis of variation is sharpness. in many tracts, the fraction of individuals with at least one morbid contact transitions from near zero to near one within a matter of days, creating an extremely sharp social transition from the "pre-exposure world" (in which almost no one knows someone with the illness) to a "post-exposure world" (in which almost everyone knows someone with the illness). by contrast, other tracts show a much more gradual increase (sometimes punctuated by jumps), as more and more individuals come to know someone with the disease.
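the ego/alter exposure measure above is straightforward to sketch: an ego counts as socially exposed if they have the outcome themselves or have at least one network neighbor who does. the toy network in the usage note is illustrative.

```python
# Hedged sketch of the social-exposure measure: fraction of egos who
# either have the outcome or have at least one alter who does.
def exposed_fraction(adj, has_outcome):
    # adj: node -> list of alters; has_outcome: node -> bool
    def exposed(ego):
        return has_outcome[ego] or any(has_outcome[a] for a in adj[ego])
    return sum(exposed(v) for v in adj) / len(adj)
```

on a four-node path where only one node has the outcome, its two neighbors and itself count as exposed, giving a fraction of 0.75; an isolate with no affected alters does not.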
in a few tracts that are never hit hard by the pandemic, few people ever have an infected alter; residents of these areas obviously have a very different experience than those of high-prevalence tracts. these distinctions are even more stark for mortality, which takes longer to manifest and which does so much more unevenly. tracts vary greatly in the fraction of individuals who ultimately lose a personal contact to the disease, and in the rapidity with which that fraction is reached. in many cases, it may take a year or more for this quantity to be realized; until that point, many residents may be skeptical of the notion that the pandemic poses a great risk to them personally. by way of assessing the milieu within each tract, it is useful to consider the "cross-over" point at which at least half of the residents of a given tract have been socially exposed to either covid-19 morbidity or mortality. fig. 9 maps these values for baltimore, md. it is immediately apparent that social exposures are more strongly spatially autocorrelated than other outcomes considered here, due to the presence of long-range ties within individuals' personal networks. even so, however, we see strong spatial differentiation, with residents in the urban core being exposed to both morbidity and mortality much more quickly than those on the periphery. this suggests that the social experience of the pandemic will be quite different for those in city centers than for those in more outlying areas, with the latter taking far longer to be exposed to serious consequences of covid-19. this may manifest in differences in willingness to adopt protective actions, with those in the urban core being more highly motivated to take action (and perhaps resistant to rhetoric downplaying the severity of the disease) than those on the outskirts of the city. we see a large degree of spatial heterogeneity, as some tracts are more insulated from others in terms of social exposure.
however, by the end of the pandemic, most people across all tracts have been exposed to someone who has had the disease. (right) the fraction of persons in each tract who have an alter who died from covid-19 in their personal network. on average, only around 40% of people in any given tract know someone who died by the end of the pandemic, though this varies widely across tracts. figure 9 : (left) choropleth showing the time for half of those in each tract to be socially exposed to covid-19 morbidity in baltimore, md. the central and southern parts of the city are exposed far sooner than the northwestern part of the city. (right) choropleth showing the time for half of those in each tract to be socially exposed to covid-19 mortality. central baltimore is exposed to deaths in personal networks far sooner than the more outlying areas of the city. our simulation results all underscore the potential effects of local spatial heterogeneity on disease spread. the spatial heterogeneity driving these results occurs on a very small scale (i.e., census blocks), operating well below the level of the city as a whole. as the infection spreads, relatively small differences in local network connectivity and the prevalence of bridging ties driven by uneven population distribution can lead to substantial differences in infection timing and severity, leading different areas in each city to have a vastly different experience of the pandemic. resources will be utilized differently in different areas, some areas will have the bulk of their infections far later than others, and the subjective experience of a given individual regarding the pandemic threat may differ substantially from someone in a different area. these behaviors are in striking contrast to what is assumed by models based on the assumption of spatially homogeneous mixing, which posit uniform progress of the infection within local areas.
as noted at the outset, our model is based on a no-mitigation scenario, and is not intended to capture the impact of social distancing. while distancing measures by definition limit transmission rates -and will hence slow diffusion -contacts occurring through spatially correlated networks like those modeled here are still likely to show patterns of heterogeneity like those described. one notable observation from our simulations is the long outbreak delay that some census tracts experience, even in the absence of social distancing. this would suggest that relaxation of mitigation measures leading to a resumption of "normal" diffusion may initially appear to have few negative effects, only to lead to deadly outbreaks weeks or months later. public health messaging may need to stress that apparent lulls in disease progress are not necessarily indicators that the threat has subsided, and that areas "passed over" by past outbreaks could be impacted at any time. finally, we stress that conventional diffusion models using locally homogeneous mixing have been of considerable value in both pandemic planning and scenario evaluation. our findings should not be taken as an argument against the use of such models. however, the observation that incorporating geographical heterogeneity in contact rates leads to radically different local behavior would seem to suggest that there is value in including such effects in models intended to capture outcomes at the city or county level. since these are the scales on which decisions regarding infrastructure management, healthcare logistics, and other policies are often made, improved geographical realism could potentially have a substantial impact on our ability to reduce lives lost to the covid-19 pandemic. in this supplement, we go into more depth on spatial interaction functions, spatial bernoulli models, the setup and parameterizations of our simulations, and the parameter estimation procedures used for this paper.
we focus specifically on the more technical aspects of each of these components, showing how they have been formally specified and parameterized. a spatial interaction function (sif) describes the marginal probability of a tie between any two nodes, given the distance between those nodes, represented as f(d_ij, θ). in this representation, d_ij is the distance between the nodes, and θ are the parameters for the function. prior literature shows that spatial interaction functions tend to be of the power law or attenuated power law form (19). thus, we can represent the sif as f(d_ij, θ) = p_b / (1 + α·d_ij)^γ. here, p_b represents the base tie probability, which can be thought of as the probability of a tie at distance 0. α is a scaling parameter that determines the speed at which the probability drops towards zero. γ is the parameter that determines the weight of the tail. we draw on two sifs in this paper, using models for social interactions and face-to-face interactions employed in prior studies (13, 20). the social interaction sif declines with a γ of 2.788, while the face-to-face sif declines with a γ of 6.437. the parameters for the social interaction sif are p_b = 0.533, α = 0.032, γ = 2.788, and the parameters for the face-to-face sif are p_b = 0.859, α = 0.035, γ = 6.437 (13). bernoulli models are a class of random graph models that leverage the concept of a bernoulli graph, or a graph in which each edge is determined by an independent bernoulli trial. in a spatial bernoulli graph, tie probabilities are determined by a spatial interaction function, applied to the pairwise distances between individuals within some space (here, geographically determined using census data). 
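to make the functional form concrete, the sif above can be written out directly; this is a minimal sketch (the function name is ours; the parameter values are the ones quoted from (13)):

```python
def sif(d, p_b, alpha, gamma):
    """spatial interaction function: marginal tie probability at distance d.
    attenuated power law: p_b / (1 + alpha * d) ** gamma."""
    return p_b / (1.0 + alpha * d) ** gamma

# the two parameterizations quoted in the text
social = dict(p_b=0.533, alpha=0.032, gamma=2.788)
face_to_face = dict(p_b=0.859, alpha=0.035, gamma=6.437)
```

at d = 0 the function returns p_b exactly, and the face-to-face sif's much larger γ makes its tie probability fall off far faster with distance than the social-interaction sif's.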
spatial bernoulli models are highly scalable due to the conditional independence of edges, but allow for extremely complex structure due to the heterogeneity in edge probabilities induced by the sif; likewise, they naturally produce properties such as local cohesion and degree heterogeneity observed in many types of social networks (20). formally, we can specify a spatial bernoulli model by pr(y_ij = 1) = f(d_ij, θ), where y_ij is the edge indicator for dyad (i, j), and f(d_ij, θ) is a spatial interaction function with distance d_ij as input and parameters θ. to simulate diffusion of covid-19, we require a contact network. here, we employ the above-described spatial bernoulli graphs, with node locations for each of our 19 study locations drawn based on block-level census data (including clustering within households, an important factor in disease diffusion). we follow the protocols described in (15, 20) to generate node positions, specifically using the quasirandom (halton) placement algorithm. node placement begins with the households in each census block, using census data from (13). the quasirandom placement algorithm uses a halton sequence to place households in space within the areal unit in which they reside. if any two households are placed within a critical radius of each other, then the algorithm "stacks" the households on top of each other by introducing artificial elevation (simulating, e.g., a multistory apartment building). once all households are placed, individuals within households are placed at jittered locations about the household centroid. (individuals not otherwise attached to households are treated as households of size 1.) given an assignment of individuals to spatial locations, we simulate spatial bernoulli graphs using the models specified above. we generate two networks for each city, one with the social interaction sif, and the other with the face-to-face interaction sif. 
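a hedged sketch of how such a graph could be drawn, assuming planar coordinates and euclidean distances; the function names and the toy coordinates are ours and do not reproduce the household-placement protocol of (15, 20):

```python
import math
import random

def sif(d, p_b, alpha, gamma):
    # attenuated power-law spatial interaction function
    return p_b / (1.0 + alpha * d) ** gamma

def spatial_bernoulli_graph(coords, p_b, alpha, gamma, rng):
    """draw an undirected spatial bernoulli graph: each dyad (i, j) is an
    independent bernoulli trial with success probability sif(d_ij)."""
    n = len(coords)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            if rng.random() < sif(d, p_b, alpha, gamma):
                edges.add((i, j))
    return edges

# toy example: 40 individuals scattered uniformly in a 50x50 region,
# tied via the social-interaction sif parameters quoted in the text
rng = random.Random(0)
coords = [(rng.uniform(0, 50), rng.uniform(0, 50)) for _ in range(40)]
net = spatial_bernoulli_graph(coords, 0.533, 0.032, 2.788, rng)
```

because every dyad is conditionally independent, the whole graph is drawn in one pass over the dyads, which is what makes these models scale to city-sized populations.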
to form a network of potential high-risk contacts, we then merge these networks (which share the same node set) by taking their union, leading to a network in which two individuals are tied if they either have an ongoing social relationship or would be likely to have extensive face-to-face interactions for other reasons (e.g., interacting with neighbors). this process is performed for each city in our sample. table s1 lists the cities that we use for our simulations. these data are drawn from (13). we conduct a series of simulations to examine the spread of covid-19 across city-sized networks. these simulations use a simple continuous-time network diffusion process, a general description of which is given in the main text. the input for the diffusion simulation is a network and a vector of initial disease states (susceptible, latent (infected but not yet infectious), infectious, recovered, and deceased), and the output is a detailed history of the diffusion process up to the point at which a steady state is obtained (i.e., no infectious individuals remain). infection occurs via the network, with currently infectious individuals infecting susceptible alters as poisson events with a fixed rate. the transitions from latent to infectious, and from infectious to either recovery or mortality, are governed by gamma distributions estimated from epidemiological data. table s2 shows the estimated shape and scale parameters for the gamma distributions employed here. the parameters for waiting time to infectiousness are directly available in the appendix of (30), while those for recovery and death are estimated by matching the mean and standard deviation of durations reported in the literature (31). selection into death versus recovery was made via a bernoulli trial drawn at time of infection (thereby determining which waiting time distribution was used), with the estimated mortality probability being 0.0215. 
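the per-individual disease course described above can be sketched as follows; the mortality probability of 0.0215 is from the text, but the gamma (shape, scale) pairs below are illustrative placeholders standing in for table s2, and all names are ours:

```python
import random

MORTALITY_PROB = 0.0215  # estimated mortality probability from the text

def draw_disease_course(rng, shapes_scales):
    """draw per-individual waiting times: latent -> infectious, then
    infectious -> (death | recovery), with the branch chosen by a
    bernoulli trial at time of infection. shapes_scales maps a transition
    name to the (shape, scale) pair of a gamma distribution."""
    latent = rng.gammavariate(*shapes_scales["to_infectious"])
    dies = rng.random() < MORTALITY_PROB
    key = "to_death" if dies else "to_recovery"
    infectious = rng.gammavariate(*shapes_scales[key])
    return latent, infectious, dies

# placeholder (shape, scale) pairs standing in for table s2
params = {"to_infectious": (4.0, 1.2),
          "to_recovery": (6.0, 4.0),
          "to_death": (5.0, 3.5)}
rng = random.Random(1)
latent, infectious, died = draw_disease_course(rng, params)
```

note that the mortality branch is drawn at infection time, as in the text: the branch decides which waiting-time distribution governs the remainder of the individual's disease course.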
to determine the infection rate (the only free parameter for the network models used in our simulations), we simulate the diffusion of the virus in seattle and fit it to the over-time death rate of king county, wa before the first shelter-in-place order went into effect on march 23, 2020. (we limit our data to this time period because our simulation employs a no-mitigation scenario.) a grid search strategy was employed to determine the expected days to transmission (the inverse of the infection rate), and the number of days between the existence of the first infected cases and the first confirmed cases (i.e., the time lag, a nuisance parameter that is relevant only for estimation of the infection rate). the time lag is treated as an integer and the expected days to transmission as a continuous variable. for each lag/rate pair, we randomly take 5 draws from the expected infection waiting time distribution, add them to the lag time (i.e., the introduction of the true patient zero for the initial outbreak), and simulate 50 realizations of the diffusion process (redrawing the network each time). the diffusion rate parameter was selected by minimizing the mean squared error between the simulated and the observed number of deaths over the selected period. the first round of the grid search divided the range of expected days to transmission into 100 intervals, from (1,2) to (100,101), with days of lag ranging from 1 to 100 days. the second round, based on the performance of the first round, divided the range into 240 intervals, from (41.00,41.25) to (100.75,101.00), with days of lag ranging from 1 to 60 days. 
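the two-round lag/rate search can be illustrated with a generic grid search; here simulate_deaths is a toy stand-in for the full diffusion simulation, and the toy target values are illustrative:

```python
def mse(a, b):
    # mean squared error between two equal-length series
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def grid_search(candidate_lags, candidate_days, observed, simulate_deaths):
    """return the (lag, expected_days_to_transmission) pair minimizing
    the mean squared error between simulated and observed deaths.
    simulate_deaths(lag, days) stands in for the diffusion simulation."""
    best, best_err = None, float("inf")
    for lag in candidate_lags:
        for days in candidate_days:
            err = mse(simulate_deaths(lag, days), observed)
            if err < best_err:
                best, best_err = (lag, days), err
    return best, best_err

# toy stand-in whose error is minimized at lag=30, days=78.5,
# over a fine grid shaped like the second search round in the text
observed = [0.0] * 10
toy = lambda lag, days: [(lag - 30) ** 2 + (days - 78.5) ** 2] * 10
(best_lag, best_days), err = grid_search(range(1, 61),
                                         [41.0 + 0.25 * k for k in range(241)],
                                         observed, toy)
```

the real procedure averages each lag/rate pair over 50 stochastic realizations before computing the error; the sketch above only shows the outer minimization.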
the grid search suggests that the expected days to transmission is 78.375 (78.25, 78.50) days (fig. s1); that is, in a hypothetical scenario in which a single infective ego remained indefinitely in the infective state, and a single alter remained otherwise susceptible, the average waiting time for ego to infect alter would be approximately 80 days. while this may at first blush appear to be a long delay, it should be borne in mind that this embodies the reality that no individual is likely to infect any given alter within a short period (since, indeed, ego and alter may not happen to interact within a narrow window). with many alters, however, the chance of passing on the disease is quite high. likewise, we note that the thought experiment above should not be taken to imply that actors remain infectious for such an extended period of time; per the above-cited epidemiological data, individuals typically remain infectious for roughly 18-33 days (though variation outside this range does occur, as captured by the above gamma distributions). when both delay times are considered, the net probability of infecting any given alter prior to recovering is approximately 27%. using the above, we can calculate the corresponding basic reproductive number as r_0 = d · α · τ, where d is the mean degree of an individual in the network, α is the infection rate (the inverse of the expected days to transmission), and τ is the time spent in the infectious state (here in days). for each simulated seattle contact network, we calculate the degree for every individual. the time in the infectious state was obtained by simulating gamma distributions for days of incubation, recovery, and death, and randomly permuting the distribution 10 times for each simulated sif network. taking the mean of r_0 over individuals, the corresponding basic reproductive number in the diffusion simulation model is 1.7. 
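the r_0 relationship described above (mean degree times infection rate times infectious duration) is simple enough to write down directly; the 78.375 expected days to transmission is the estimate from the grid search, while the degree and infectious-duration values below are placeholders, not the per-individual values used in the paper:

```python
def basic_reproductive_number(mean_degree, expected_days_to_transmission,
                              mean_infectious_days):
    """r0 = d * alpha * tau, where alpha is the infection rate
    (the inverse of the expected days to transmission)."""
    alpha = 1.0 / expected_days_to_transmission
    return mean_degree * alpha * mean_infectious_days

# illustrative inputs; the paper averages this quantity over individuals
r0 = basic_reproductive_number(mean_degree=5.0,
                               expected_days_to_transmission=78.375,
                               mean_infectious_days=26.0)
```

with these placeholder inputs the result lands in the same general range as the value of 1.7 reported in the text, but the reported figure comes from averaging over simulated individuals and infectious durations, not from a single point calculation like this one.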
to supplement the results on the variation in the peak infection time given in the main text, we ran a series of simulation replicates. figure 3 in the main text shows the data from the figure below aggregated across all replicates. in supplemental figure s2, we break out the peak infection days in each city, by replicate. these data show that the significant variation in figure 3 is not due to the number of replicates that were run, but instead due to the intrinsic variation that is present (due to spatial heterogeneity). this research was supported by nsf awards iis-1939237 and ses-1826589 to c.t.b., and by the uci seed grants program. figure s2: boxplot showing the peak infection days across 10 replicates for each city in the sample. within any given city there is a consistently high degree of heterogeneity: the day the infection peaks for any given tract is not uniform at all. in other words, the variance shown here is a property of the spread of the disease, rather than of the number of simulation replicates. 
key: cord-230430-38fkbjq0 authors: williams, tom; johnson, torin; culpepper, will; larson, kellyn title: toward forgetting-sensitive referring expression generation for integrated robot architectures date: 2020-07-16 journal: nan doi: nan sha: doc_id: 230430 cord_uid: 38fkbjq0 to engage in human-like dialogue, robots require the ability to describe the objects, locations, and people in their environment, a capability known as "referring expression generation." as speakers repeatedly refer to similar objects, they tend to re-use properties from previous descriptions, in part to help the listener, and in part due to cognitive availability of those properties in working memory (wm). because different theories of working memory "forgetting" necessarily lead to differences in cognitive availability, we hypothesize that they will similarly result in generation of different referring expressions. to design effective intelligent agents, it is thus necessary to determine how different models of forgetting may be differentially effective at producing natural human-like referring expressions. in this work, we computationalize two candidate models of working memory forgetting within a robot cognitive architecture, and demonstrate how they lead to cognitive availability-based differences in generated referring expressions. effective human-robot interaction requires human-like natural language and dialogue capabilities that are sensitive to robots' embodied nature and inherently situated context (mavridis, 2015; tellex et al., 2020). in this paper we explore the role that models of working memory can play in enabling such capabilities in integrated robot architectures. while working memory has long been understood to be a core feature of human cognition, and thus a central component of cognitive architectures, recent evidence from psychology suggests a conception of working memory that is subtly different from what is implemented in most cognitive architectures. 
specifically, while most models of working memory in computational cognitive architectures maintain a single central working memory store, converging evidence from different communities suggests that humans have different resource limitations for different types of information. moreover, recent psychological evidence suggests that working memory may be a limited resource pool, with resources consumed on the basis of the number and type of features retained. this suggests that forgetting in working memory should be modeled in cognitive architectures as a matter of systematic removal (on the basis of decay or interference) of entity features, with sensitivity to the resource limitations imposed for the specific type of information represented by each feature. of course, robot cognition need not directly mirror human cognition, and indeed robots have both unique knowledge representation needs and increased flexibility in how resource limitations are implemented. in this work, we present a robot architecture in which (1) independent resource pools are maintained for different robot-oriented types of information; (2) wm resources are maintained at the feature level rather than the entity level; and (3) both interference- and decay-based forgetting procedures may be used. this architecture is flexibly configurable both in terms of what type of forgetting procedure is used and how that model is parameterized. for robot designers, this choice of parameterization may be made in part on the basis of facilitation of interaction. in this paper we specifically consider how the use of different models of forgetting within this architecture leads to different information being retained in working memory, which in turn leads to different referring expressions being generated by the robot, which in turn can produce interactive alignment effects purely through working memory dynamics. 
while in future work it will be important to identify exactly which parameterizations lead to selection of referring expressions that are optimal for effective human-robot interaction and teaming, in this work we take the critical first step of demonstrating, as a proof of concept, that decay- and interference-based forgetting mechanisms can be flexibly used within this architecture, and that those policies do indeed produce different natural language generation behavior. "referring" has been referred to as the "fruit fly" of language due to the amount of research it has attracted (van deemter, 2016; gundel & abbott, 2019). in this work, we focus specifically on referring expression generation (reg) (reiter & dale, 1997), in which a speaker must choose words or phrases that will allow the listener to uniquely identify the speaker's intended referent. reg includes two constituent sub-problems (gatt & krahmer, 2018): referring form selection and referential content determination. while referring form selection (in which the speaker decides whether to use a definite, indefinite, or pronominal form (poesio et al., 2004; mccoy & strube, 1999; see also pal et al., 2020)) has attracted relatively little attention, referential content determination is one of the most well-explored sub-problems within natural language generation, in part due to the logical nature of the problem that enables it to be studied in isolation, to the point where "reg" is typically used to refer to the referential content determination phase alone. in this section we will briefly define and describe the general strategies that have been taken in the computational modelling of referential content determination; for a more complete account we recommend the recent book by van deemter (2016), which provides a comprehensive account of work on this problem. 
referential content determination, typically employed when generating definite descriptions, is the process by which a speaker seeks to determine a set of constraints on known objects that if communicated will distinguish the target referent from other candidate referents in the speaker and listener's shared environment. these constraints most often include attributes of the target referent, but can also include relationships that hold between the target and other salient entities that can serve as "anchors", as well as attributes of those anchors themselves (dale & haddock, 1991) . three referential content determination models have been particularly influential: the full brevity algorithm (dale, 1989) , in which the speaker selects the description of minimum length, in order to straightforwardly satisfy grice's maxim of quantity (grice, 1975) ; the greedy algorithm, in which the speaker incrementally adds to their description whatever property rules out the largest number of distractors (dale, 1992) ; and the incremental algorithm (ia), in which the speaker incrementally adds properties to their description in order of preference so long as they help to rule out distractors (dale & reiter, 1995) . a key aspect of the ia is its ability to overspecify through its inclusion of properties that are not strictly needed from a logical perspective to single out the target referent, but are nevertheless included due to being highly preferred; a behavior also observed in human speakers (engelhardt et al., 2006) . because the ia's behavior is highly sensitive to preference ordering (gatt et al., 2007) , there has been substantial research seeking to determine what properties are in general psycholinguistically preferred over others (belke & meyer, 2002) , or to automatically learn optimal preference orderings (koolen et al., 2012) . 
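the incremental algorithm described above can be sketched in a few lines; this is a simplified version assuming flat attribute-value objects and a fixed preference ordering (object and attribute names are illustrative):

```python
def incremental_algorithm(target, distractors, preference_order):
    """simplified incremental algorithm (dale & reiter, 1995): walk
    attributes in preference order, keeping any attribute that rules out
    at least one remaining distractor; stop once no distractors remain.
    an attribute kept early is never retracted, which is how
    overspecification can arise."""
    description = {}
    remaining = list(distractors)
    for attr in preference_order:
        value = target.get(attr)
        if value is None:
            continue
        if any(d.get(attr) != value for d in remaining):
            description[attr] = value
            remaining = [d for d in remaining if d.get(attr) == value]
        if not remaining:
            break
    return description, remaining

target = {"type": "ball", "color": "red", "size": "large"}
distractors = [{"type": "ball", "color": "red", "size": "small"},
               {"type": "cube", "color": "red", "size": "large"}]
desc, left = incremental_algorithm(target, distractors,
                                   ["color", "size", "type"])
```

in this toy scene "red" rules out no distractors and so is skipped, while "large" and "ball" together fully distinguish the target; with a scene where a preferred attribute rules out some but not all distractors, the algorithm would retain it even if later attributes render it logically redundant.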
as highlighted by goudbeek & krahmer (2012), however, this focus on a uniform concept of "preference" obscures a much more complex story that relates to fundamental debates over the extent to which speakers leverage listener knowledge during sentence production. a notion of "preference" as encoded in the ia could be egocentrically grounded, with speakers "preferring" concepts that are easy for themselves to assess or cognitively available to themselves (keysar et al., 1998); it could be allocentrically grounded, with speakers intentionally seeking to facilitate the listener's ease of processing (janarthanam & lemon, 2009); or a hybrid model could be used, in which egocentric and allocentric processes compete (bard et al., 2004), with egocentrism vs. allocentrism "winning out" on the basis of factors such as cognitive load (fukumura & van gompel, 2012). these approaches, which treat accounting for listener knowledge as slow and intentional, stand in contrast to memory-oriented accounts of referential content determination in which such accounting can occur naturally as a result of priming. pickering & garrod (2004)'s interactive alignment model of dialogue suggests that dialogue is a highly negotiated process (see also clark & wilkes-gibbs (1986)) in which priming mechanisms lead interlocutors to influence each others' linguistic choices at the phonetic, lexical, syntactic, and semantic levels, through mutual activation of phonetic, lexical, syntactic, and semantic structures and mental representations, as in the case of lexical entrainment (brennan & clark, 1996). while there has been extensive evidence for lexical and syntactic priming, research on semantic or conceptual priming in dialogue has only relatively recently become a target of substantial investigation (gatt et al., 2011). 
a theory of dialogue including semantic or conceptual priming would suggest that the properties or attributes that speakers choose to highlight in their referring expressions (e.g., when a speaker chooses to refer to an object as "the large red ball" rather than "the sphere") should be due in part to these priming effects. and in fact, as demonstrated by goudbeek & krahmer (2010), speakers can be influenced through priming to use attributes in their referring expressions that would otherwise have been dispreferred. these findings have motivated dual-route computational models of dialogue (gatt et al., 2011), in which the properties used for referring expression selection are chosen on the basis of interaction between two parallel processes, each of which is periodically called upon to provide attributes of the target referent to be placed into a wm buffer that is consulted when properties are needed for reg (at which point selected properties are removed from that buffer). the first of these processes is a priming-based procedure in which incoming descriptions trigger changes in activation within a spreading activation model, and properties are selected if they are the highest-activation properties (above a certain threshold) for the target referent. the second of these processes is a preference-based procedure in which a set of properties is generated by a classic reg algorithm (cp. gatt & krahmer, 2018) such as the incremental algorithm, in which properties are incrementally selected according to a pre-established preference ordering designed or learned to reflect frequency of use, ease of cognitive or perceptual assessability, or some other informative metric (dale & reiter, 1995). 
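a rough sketch of the dual-route buffer-filling idea described above; all names, the threshold value, and the uniform sampling between routes are our simplifying assumptions, not the exact mechanics of gatt et al. (2011):

```python
import random

def priming_route(activations, threshold):
    """the priming-based route: return properties whose spreading-activation
    level exceeds the threshold, most active first."""
    return [p for p, a in sorted(activations.items(),
                                 key=lambda kv: -kv[1]) if a > threshold]

def preference_route(target_props, preference_order):
    """the preference-based route: return the target's properties in a fixed
    preference order (standing in for an incremental-algorithm-style reg)."""
    return [p for p in preference_order if p in target_props]

def fill_wm_buffer(activations, target_props, preference_order,
                   threshold, rng, n=2):
    """sample between the two routes and push one new property per call into
    a wm buffer of size n, which is consulted at generation time.
    assumes the routes jointly supply at least n distinct properties."""
    buffer = []
    while len(buffer) < n:
        route = rng.choice([priming_route(activations, threshold),
                            preference_route(target_props, preference_order)])
        for p in route:
            if p not in buffer:
                buffer.append(p)
                break
    return buffer
```

because the buffer is filled by whichever route happens to be sampled, an interlocutor's primed properties can end up in the buffer and hence in the robot's next description, which is the mechanism by which priming alone can yield apparently listener-friendly attribute choices.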
one advantage of this type of dual-process model is that it accounts for audience design effects (in which speaker utterances at least appear to be intentionally crafted for ease of comprehension) within an egocentric framework, by demonstrating how priming influences on wm can themselves account for listener-friendly referring expressions. that is, if a concept is primed by an interlocutor's utterance, a speaker will be more likely to use that concept in their own references simply because it is in wm, with the side effect that that property will then be easy to process by the interlocutor responsible for its inclusion in wm in the first place (vogels et al., 2015). moreover, this phenomenon aligns well with evidence suggesting that despite the prevalence of lexical entrainment and alignment effects, people are actually slow to explicitly take others' perspectives into account (bard et al., 2000; fukumura & van gompel, 2012; gann & barr, 2014). another advantage of this type of dual-process model is its alignment with overarching dual-process theories of cognition (e.g., kahneman, 2011; evans & stanovich, 2013; sloman, 1996; oberauer, 2009): the first priming-driven process for populating wm, grounded in semantic activation, can be viewed as a reflexive system-one process, whereas the second preference-driven process leveraging the incremental algorithm can be viewed as a deliberative system-two process. of course, in the model under discussion the two routes do not truly compete with each other or operate on different time courses, but are instead essentially sampled between; however, it is straightforward to imagine how the two processes used in this type of model could instead be deployed in parallel. 
one major disadvantage of this type of model, however, is that its focus with respect to wm is entirely on retrieval (i.e., how priming and preference-based considerations impact what information is retrieved from long-term memory into wm), and it fails to satisfactorily account for maintenance within wm. within gatt et al. (2011)'s model, as soon as a property stored in wm is used in a description, it is removed from wm so that that space is available for another property to be considered. this behavior seems counter-intuitive, as it ensures that representations are removed from wm at just the time when they are established to be important and useful, which should be a cue to retain said representations rather than dispose of them. moreover, this behavior is surprising from the perspective of models of wm such as cowan (2001). a final complication for this model is its speaker-blind handling of priming. specifically, within gatt et al. (2011)'s model a speaker's utterances are only primed by their interlocutor's utterances, when in fact the choices a speaker makes should also impact the choices they themselves make in the future (shintel & keysar, 2007), either due to gricean notions of cooperativity (grice, 1975) or, as we argue, because a speaker's decision to refer to a referent using a particular attribute should make that attribute more cognitively available to themselves in the immediate future. these concerns are addressed by our previously proposed model of robotic short-term memory (williams et al., 2018b), in which speakers rely on the contents of wm for initial attribute selection and, if their selected referring expression is not fully discriminating, select additional properties using a variant of the ia. while this does not align with dual-process models of cognition, it does account for both encoding and maintenance of wm, and provides a potentially more cognitively plausible account of reg with respect to wm dynamics. 
one shortcoming shared by both models, however, is that neither the dual-process model of gatt et al. (2011) nor our wm-focused model appropriately accounts for when and how information is removed from wm over time, or for how this impacts reg. different theories of wm "forgetting" necessarily lead to predicted differences in cognitive availability. accordingly, these different models of forgetting should similarly predict cognitive availability-based differences in the properties selected during reg. to design effective intelligent agents, it is thus necessary to determine how different models of forgetting may be differentially effective at producing natural human-like referring expressions. in this work, we first computationalize two candidate models of wm forgetting within a robot cognitive architecture. next, we propose a model of reg that is sensitive to the wm dynamics of encoding, retrieval, maintenance, and forgetting, and discuss the particulars of deploying this type of model within an integrated robot architecture, where wm resources are divided by domain (i.e., people, locations, and objects) rather than by modality (i.e., visual vs. verbal). finally, we provide a proof-of-concept demonstration of two parametrizations of our model within an integrated robot cognitive architecture, and demonstrate how these different parametrizations lead to cognitive availability-based differences in generated referring expressions. models of forgetting in working memory are typically divided into two broad categories (reitman, 1971; jonides et al., 2008; ricker et al., 2016): decay-based models and interference-based models. decay-based models of wm (brown, 1958; atkinson & shiffrin, 1968) posit that time plays a causal role in wm forgetting, with a representation's inclusion in wm made on the basis of a level of activation that gradually decays over time if the represented information is not used or rehearsed. 
accordingly, in such models, a piece of information is "forgotten" with respect to wm if it falls below some threshold of activation due to disuse. this model of forgetting is intuitively appealing due to the clear evidence that memory performance decreases over time (brown, 1958; ricker et al., 2016). computational advantages and limitations: as with gatt et al., spreading activation networks can be used to elegantly model how activation of representations impacts the rise and fall of activation of semantically related pieces of information. one disadvantage of this approach, however, is that activation levels need to be continuously re-computed for each knowledge representation in memory. while this may be an accurate representation of actual cognitive processing, artificial cognitive systems do not enjoy the massively parallelized architectures enjoyed by biological cognitive systems, meaning that this approach may face severe scaling limitations in practice. computational model: to allow for straightforward comparison with other models of forgetting, we define a simple model of decay that operates on individual representations outside the context of a semantic activation network. we begin by representing wm as a set wm = {q_0, . . . , q_n}, where q_i is a mental representation of a single entity, represented as a queue of properties currently activated for that entity. next, we define an encoding procedure that specifies how the representations in wm are created and manipulated on the basis of referring expressions generated either by the agent or its interlocutors. as shown in alg. 1, this procedure operates by considering each property included in the referring expression, and updating the queue used to represent the entity being described, by placing that property at the back of the queue, or by moving the property to the back of the queue if it is already included in the representation. 
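alg. 1's encoding procedure maps naturally onto a deque; a minimal sketch (property names are illustrative):

```python
from collections import deque

def encode(entity_queue, mentioned_properties):
    """alg. 1 (per-entity encoding): for each property used in a referring
    expression, move it to the back of the entity's property queue,
    appending it first if it is not yet represented."""
    for prop in mentioned_properties:
        if prop in entity_queue:
            entity_queue.remove(prop)
        entity_queue.append(prop)
    return entity_queue

q = deque()
encode(q, ["red", "large", "ball"])
encode(q, ["red"])  # "red" is already represented, so it moves to the back
```

because re-used properties move to the back, the front of the queue always holds the least recently used property, which is exactly what the forgetting procedures later operate on.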
note that this procedure can be used either after each utterance is heard (in which case the representation is updated based on all properties used to describe the entity) or incrementally (in which case the representation is updated after each property is heard). if used incrementally, then forgetting procedures may be interleaved with representation updates. finally, we define a model of decay that operates on these representations. as shown in alg. 2, this procedure operates by removing the property at the front of queue q at fixed intervals defined by decay parameter δ (algorithm 1: per-entity encoding model). in contrast, interference-based models (waugh & norman, 1965) argue that wm is a fixed-size buffer in which a piece of information is "forgotten" with respect to wm if it is removed (due to, e.g., being the least recently used representation in wm) to make room for some new representation. interference-based models have been popular for nearly as long as decay-based models (keppel & underwood, 1962), both because the evidence cited for "decay" can just as easily be read as evidence for forgetting due to intra-representational interference, as longer periods of time directly correlate with higher frequencies of interfering events (lewandowsky & oberauer, 2015; oberauer & lewandowsky, 2008), and because tasks with varying temporal lengths but consistent overall levels of interference have been shown to yield similar rates of forgetting, thus failing to support time-based decay (oberauer & lewandowsky, 2008). recent work has trended towards interference-based accounts of forgetting, with a number of further debates and competing models opening up within this broad theoretical ground. first, there is debate as to whether interference alone is sufficient to explain forgetting, or whether time-based decay still plays some role in conjunction with interference. 
recent work suggests that in fact these two models may be differentially employed for different types of representations, with phonological representations forgotten due to interference and non-phonological representations forgotten due to a combination of interference and time-based decay (ricker et al., 2016). second, within interference-based models, there exist competing models based on the reasons for displacement. in particular, while theories of pure displacement (waugh & norman, 1965) posit that incoming representations replace maintained representations on the basis of frequency or recency of use, or on the basis of random chance (similar to caching strategies from computing systems research (press et al., 2014)), theories of retroactive interference instead posit that replacements are made on the basis of semantic similarity, with representations "forgotten" if they are too similar to incoming representations (wickelgren, 1965; lewis, 1996). third, within both varieties of interference-based models, there has been recent debate on the structure and organization of the capacity-limited medium of wm. ma et al. (2014) contrasts four such models: (1) slot models, in which a fixed number of slots are available for storing coherent representations (miller, 1956; cowan, 2001); (2) resource models, in which a fixed amount of representational medium can be shared between an unbounded number of representations (with storage of additional features in one representation leaving less medium available for other representations) (wilken & ma, 2004); (3) discrete-representation models, in which a fixed number of feature "quanta" are available to distribute across representations (zhang & luck, 2008); and (4) variable-precision models, in which working memory precision is statistically distributed (fougnie et al., 2012). 
computational advantages and limitations: one advantage of interference-based models for artificial cognitive agents is decreased computational expense, as only a fixed number of entities or features must be maintained in wm, and wm need not be updated at continuous intervals if no new stimuli are processed. rather, wm only needs to be updated when (1) new representations are encoded into wm, or (2) existing representations are manipulated. another advantage of this approach is its conceptual alignment with the process of caching from computer science, which means that caching mechanisms from computing systems research, such as least-recently-used and least-frequently-used caching policies, can be straightforwardly leveraged, with prior work providing substantial information about their theoretical properties and guarantees. in fact, recent work has explored precisely how caching strategies from computer science can be used for this purpose (press et al., 2014). within the interference-based family of models, slot-based and discrete-representation models are likely the easiest to computationalize, since resource and variable-precision models depend on an ephemeral, undiscretized "representational medium" that lacks a direct computational analogue. computational model: to model interference-based forgetting, we use the same wm representation and encoding procedure as used to model decay-based forgetting, and propose a new model designed specifically for robotic knowledge representation. this model can be characterized as a per-entity discrete-representation displacement model. as shown in alg. 3, this procedure operates by removing properties at the front of queue q whenever the size of q is greater than some size limitation imposed by parameter α. this model is characterized as discrete-representation because resource limitations are imposed at the level of discrete features rather than holistic representations.
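the size-bounded procedure just described can be sketched in python. names are ours, and treating re-encoding as a recency refresh is an assumption; together with removal from the front of the queue, it yields least-recently-used displacement.

```python
from collections import deque

class DisplacementBuffer:
    """per-entity discrete-representation displacement buffer (cf. alg. 3).

    at most `alpha` properties are held per entity; encoding beyond that
    limit displaces the property at the front of the queue, i.e., the
    least recently used one.
    """

    def __init__(self, alpha):
        self.alpha = alpha      # maximum buffer size
        self.q = deque()        # per-entity property queue

    def encode(self, prop):
        if prop in self.q:      # refresh recency on re-use (lru behavior)
            self.q.remove(prop)
        self.q.append(prop)
        while len(self.q) > self.alpha:
            self.q.popleft()    # forget the least recently used property
```

with alpha = 2, encoding a third property displaces the oldest one, and re-encoding an existing property only refreshes its recency.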
it is characterized as a displacement model because features are replaced on the basis of a least-recently-used (lru) caching strategy (knuth, 1997) rather than on the basis of semantic similarity, due to the pragmatic difficulty of assessing the similarity of different categories of properties without mandating the use of architectural features such as well-specified conceptual ontologies (cp. tenorth & beetz, 2009; lemaignan et al., 2010), which may not be available in all robot architectures. this model is characterized as per-entity because resource limitations are imposed locally (i.e., for each entity) rather than globally (i.e., shared across all entities). while this is obviously not a cognitively plausible characteristic, it was selected, as a starting point, to reduce the need for coordination across architectural components and to facilitate improved performance (as entity representations need not compete with each other). however, if desired, it would be straightforward to extend this approach to allow for imposition of global resource constraints.
algorithm 3: per-entity discrete-representation displacement model.
  procedure displacement(q, α)
    q: the per-entity property queue
    α: maximum buffer size
    loop
      if |q| > α then pop(q) end if
    end loop
  end procedure
memory modeling has long been a core component of cognitive architectures, due to its central role in cognition (baxter et al., 2011). the relative attention paid to wm has, however, varied widely between cognitive architectures. arcadia, for example, has placed far more emphasis on attention than on memory (bridewell & bello, 2016). while arcadia does include a visual short term memory component, it is treated as a "potentially infinite storehouse," with consideration of resource constraints left to future work (bridewell & bello, 2016). act-r and soar, in contrast, do place larger emphases on wm.
act-r did not originally have explicit wm buffers, instead implicitly representing wm as the subset of ltm with activation above some particular level (anderson et al., 1996), with "forgetting" thus modeled through activation decay (cp. cowan, 2001). in more recent incarnations of act-r, a very small short-term buffer is maintained, with contents retrieved from ltm on the basis of both base-level activation (determined by recency and frequency of use) and informative cues. similarly, soar (laird, 2012) has long emphasized the role of wm, due to its central focus on problem solving through continuous manipulation of wm (rosenbloom et al., 1991). and while soar did not initially represent wm resource limitations (young & lewis, 1999), it has by now long included decay-based mechanisms on at least a subset of wm contents (chong, 2003; nuxoll et al., 2004), as well as base-level activation and cue-based retrieval methods such as those mentioned above (jones et al., 2016). while these models may be intended primarily as models of human cognition, flaws and all, rather than as systems for enabling effective task performance through whatever means necessary regardless of cognitive plausibility, a significant body of work has well demonstrated the utility of these architectures for cognitive robotics (laird et al., 2017; kurup & lebiere, 2012) and human-robot interaction (trafton et al., 2013). there has also been significant work in robotics seeking to use insights from psychological research on wm to better inform the design of select components of larger robot cognitive architectures that do not necessarily aspire towards cognitive plausibility. for example, a diverse body of researchers has collaborated on the development and use of the wm toolkit (phillips & noelle, 2005; gordon & hall, 2006; kawamura et al., 2008; persiani et al., 2018): a software toolkit that maintains pointers to a fixed number of chunks containing arbitrary information.
at each timestep, this toolkit proposes a new set of chunks, and then uses neural networks to select a subset of these chunks to retain. this model has been primarily used for enabling cognitive control, in which the link between robot perception and action is modulated by a learned memory management policy. models of wm have also been leveraged within the field of human-robot interaction. broz et al. (2012), for example, specifically model the episodic buffer sub-component of wm (baddeley, 2000). baxter et al. (2013) leverage models of wm to better enable non-linguistic social interaction through alignment, similar to our approach in this work. researchers have also leveraged models of wm to facilitate communication. for example, hawes et al. (2007) leverage a model of wm within the cosy architecture, with concessions made to accommodate the realistic needs of integrated robot architectures, in which specialized representations are stored and manipulated within distributed, parallelized, domain-specific components. similarly, in our own work within the diarc architecture, we have demonstrated (as we will further discuss in this paper) the use of distributed wm buffers associated with such architectural components (williams et al., 2018b), as well as hierarchical models of common ground jointly inspired by models of wm (e.g., cowan, 2001) and models of givenness from natural language pragmatics (e.g., gundel et al., 1993). in the next section we propose a new architecture that builds on this previous work of ours to allow for flexible selection between (and comparison of) different models of forgetting. our forgetting models were integrated into the distributed, integrated, affect, reflection, cognition (diarc) robot cognitive architecture: a component-based architecture designed with a focus on robust spoken language understanding and generation.
diarc's mnemonic and linguistic components integrate via a consultant framework in which different architectural components (e.g., vision, mapping, social cognition) serve as heterogeneous knowledge sources that comprise a distributed model of long term memory (williams, 2017). these consultants are used by diarc's natural language pipeline for the purposes of reference resolution and reg. in recent work we have extended this framework to produce a new short-term memory augmented consultant framework in which consultants additionally maintain, for some subset of the entities for which they are responsible, a short term memory buffer of properties that have recently been used by interlocutors to refer to those entities. in this work, we build upon that stm (short term memory)-augmented consultant framework through the introduction of a new architectural component, the wm manager, which is responsible for implementing the two forgetting strategies introduced in the previous section. our model aligns with two key psychological insights. first, converging evidence from different communities suggests that humans have different resource limitations for different types of information (wickens, 2008), due to either decreased interference between disparate representations (oberauer et al., 2012) or the use of independent domain-specific resource pools (baddeley, 1992; logie & logie, 1995). our approach takes a robot-oriented perspective on this second hypothesis, with our use of the wm-augmented consultant framework resulting in independent resource pools maintained for different types of entities (e.g., objects vs. locations vs. people) rather than for different modalities (e.g., visual vs. auditory) or different codes of processing (e.g., spatial vs. verbal).
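a minimal sketch of such independent, per-entity-type resource pools follows. the class and method names are ours, and the fixed per-entity capacity (implemented with a bounded deque) stands in for whichever forgetting policy is selected; the point is only that each consultant's pool is maintained independently of the others.

```python
from collections import deque

class WorkingMemoryManager:
    """sketch: independent wm buffers grouped by consultant (entity type).

    each consultant (e.g., vision for objects, mapping for locations)
    gets its own pool of per-entity property buffers, so resource limits
    are local to an entity type rather than shared globally.
    """

    def __init__(self, alpha=2):
        self.alpha = alpha
        self.pools = {}  # consultant name -> {entity id -> property deque}

    def encode(self, consultant, entity, prop):
        pool = self.pools.setdefault(consultant, {})
        buf = pool.setdefault(entity, deque(maxlen=self.alpha))
        if prop in buf:
            buf.remove(prop)    # refresh recency on re-use
        buf.append(prop)        # deque(maxlen=...) drops the oldest itself

    def properties(self, consultant, entity):
        return list(self.pools.get(consultant, {}).get(entity, ())) 
```

encoding properties for a location never competes with the buffers held for objects, mirroring the independent resource pools described above.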
second, while early models of wm suggested that wm resource limitations are bounded to a limited number of chunks (miller, 1956), more recent models instead suggest that the size of wm is affected by the complexity of those chunks (mathy & feldman, 2012), and that maintaining multiple features of a single entity may detract from the total number of maintainable entities and, accordingly, the number of features maintainable for other entities (alvarez & cavanagh, 2004; oberauer et al., 2016; taylor et al., 2017). our approach, again, takes a robot-oriented perspective on these models, maintaining wm resources at the feature level rather than the entity level, while enabling additional flexibility that may not be reflected in human cognition. specifically, instead of enforcing global resource limits, we allow for flexible selection between decay-based and interference-based (i.e., resource-limited) memory management models, as well as for simultaneous employment of both models, in order to model joint impacts of interference and decay as discussed by ricker et al. (2016). moreover, while we currently focus on local (per-entity) feature-based resource limitations, our system is designed to allow for global resource limitations (cp. just & carpenter, 1992; ma et al., 2014) in future work, due to our use of a global wm manager component. using this architecture, our different forgetting models can differentially affect reg without any direct interaction with the reg process. rather, the wm manager simply interfaces with the wm buffers maintained by each distributed consultant, which then implicitly impacts the properties available to the sd-pia algorithm that we use for reg. this algorithm takes a lazy approach to reg in which the speaker initially attempts to rely only on the properties that are currently cached in wm, and only if this is insufficient to distinguish the target from distractors does the speaker retrieve additional properties from long-term memory.
incorporating the wm manager into diarc yields the configuration shown in fig. 1. when a teammate speaks to a robot running this configuration, text is recognized and then semantically parsed by the tldl parser, after which intention inference is performed by the pragmatic understanding component (williams et al., 2015), whose results are provided to the reference resolution module of the referential executive (rex), which leverages the growler algorithm (williams et al., 2018a). growler searches through givenness hierarchy (gh; gundel et al., 1993)-informed data structures (focus, activated, familiar) representing a hierarchically organized cache of relevant entity representations. this is important to highlight due to its relation to the wm buffers described in this work. while the wm buffers described in this work serve as a model of the robot's own wm, the referential executive's data structures can instead be viewed as either second-order theory-of-mind data structures or as a form of hierarchical common ground. when particular entities are mentioned by the robot's interlocutor or by the robot itself, pointers to the full representations for these entities (located in the robot's distributed long-term memory) are placed into the referential executive's gh-informed data structures, and the properties used to describe them are placed into the robot's distributed wm buffers. in addition, properties are placed into wm whenever the robot affirms that those properties hold through an ltm query. critically, this can happen when considering other entities. for example, if property p is used during reference resolution, then p will be placed into wm for any distractors for which p is affirmed to hold, before being ruled out for other reasons.
similarly, if r is the target referent during reg, and if property p holds for r and is considered for inclusion in the generated re, then for any distractors for which p also holds, p will be added to those distractors' wm buffers at the point where it is affirmed that p cannot be used to rule out those distractors because it holds for them as well. once reference resolution is completed, if the robot decides to respond to the human utterance, it does so through a process that is largely the inverse of language understanding, including a dedicated reg module, which uses the properties cached in wm, along with properties retrievable from long-term memory, to translate this intention into text (williams & scheutz, 2017; williams et al., 2018b).
table 1: six-round face case study (columns: human's turn [face, description]; robot's turn [face, description under the decay and interference models]). column contents denote properties used to describe the target under each forgetting model, listed in the consistent order below for easy comparison rather than in order selected. properties: hl = light-hair(x), hd = dark-hair(x), hs = short-hair(x), gm = male(x), gf = female(x), ct = t-shirt(x), cl = lab-coat(x), ch = hoodie(x), gy = glasses(x), gn = no-glasses(x).
we analyzed the proposed architecture by assessing two claims: (1) the proposed architecture demonstrates interactive alignment effects purely through wm dynamics; and (2) the proposed architecture demonstrates different referring behaviors when different models of forgetting are selected. these claims were assessed within the context of a "guess who"-style game in which partners take turns describing candidate referents (assigned from a set of 16 faces). on each player's turn, they are assigned a referent, and must describe that referent using a referring expression they believe will allow their interlocutor to successfully identify it (a process of reg).
the other player must then process their interlocutor's referring expression and identify which candidate referent they believe to be their interlocutor's target (a process of reference resolution). ideally, we would have assessed our claims in a setting in which a robot played this reference game with a naive human subject. this proved to be impossible due to the covid-19 global pandemic. instead, we present a case study in which a series of three six-round reference games are played between robot agents and a single human agent. all three games followed the same predetermined order of candidate referents and used the same pre-determined human utterances. the robot's responses were generated autonomously, with the robot in each of the three games using a different model of forgetting. in the first game, the robot uses our decay-based model of forgetting with δ = 10; in the second game, the robot uses our interference-based model of forgetting with α = 2; in the third game, the robot did not retain any properties in short-term memory at all. the referring behavior under each model of forgetting is shown in tab. 1. as shown in this table, the three examined models perform similarly in initial dialogue turns, but increasingly diverge over time. to help explain the observed differences in robot behavior, we examine specifically turn 6, in which the robot refers to face 1 for the third time. this face could ostensibly be referred to using four properties: light-hair(x), short-hair(x), lab-coat(x), and glasses(x). the architectural configuration that did not maintain representations in wm (no wm) operated according to the dist-pia algorithm (williams & scheutz, 2017), which is a version of the incremental algorithm that is sensitive to uncertainty and that allows for distributed sources of knowledge. this algorithm first considers the highly preferred property light-hair(x), which is selected because it applies to face 1 while ruling out distractors.
next, it considers short-hair(x), which it ignores because, while it applies to face 1, the faces with short hair are a subset of those with light hair, and thus short-hair(x) is not additionally discriminative. next, the algorithm considers lab-coat(x), which it selects because it applies to face 1 and rules out further distractors. finally, to complete disambiguation, the algorithm considers and selects glasses(x). in contrast, the configuration that used the decay model had the following properties in wm: {lab-coat(x), short-hair(x), glasses(x)} (ordered from least-recently used to most-recently used [1]). the algorithm starts by considering the properties stored in wm, beginning with lab-coat(x), which is selected because it applies to face 1 while ruling out distractors. next, it considers short-hair(x), which is ignored because the set of entities with short hair is a subset of those wearing lab coats, and thus this is not additionally discriminative. next, it considers glasses(x), which it selects because it applies to face 1 and rules out distractors. in fact, {lab-coat(x), glasses(x)} is fully discriminating for face 1, so no further properties are needed. finally, the configuration that used the interference model had the following properties in wm: {short-hair(x), glasses(x)}. this is easy to see, as those properties were recently used in the human's description of face 6, and thus would have been considered for face 1 when ruling it out during reference resolution. the algorithm thus starts by considering both of these properties, which are both selected because they apply to face 1 and rule out distractors. however, because these are not sufficient for full disambiguation, the algorithm must also retrieve another property from ltm, i.e., lab-coat(x), which allows for completion of disambiguation. the differences in behavior demonstrated in this simple example validate both our claims.
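the three runs above can be reproduced with a small self-contained sketch of the lazy, wm-first selection strategy. the five-face domain below is our own construction, chosen only to reproduce the subset relations described in the walkthrough; it is not the study's actual 16-face domain, and the real dist-pia/sd-pia algorithms additionally handle uncertainty and distributed knowledge sources, which we omit.

```python
def lazy_reg(target, domain, wm, ltm_preference):
    """greedy, wm-first property selection (sketch of the lazy strategy).

    `domain` maps each face to its set of properties; properties cached in
    `wm` (least- to most-recently used) are tried before falling back to
    the long-term-memory preference order.
    """
    distractors = {f for f in domain if f != target}
    selected = []
    for prop in list(wm) + [p for p in ltm_preference if p not in wm]:
        if not distractors:
            break
        if prop not in domain[target]:
            continue
        ruled_out = {f for f in distractors if prop not in domain[f]}
        if ruled_out:                     # keep only discriminative properties
            selected.append(prop)
            distractors -= ruled_out
    return selected

# hypothetical domain constructed to reproduce the turn-6 behavior
faces = {
    "f1": {"light-hair", "short-hair", "lab-coat", "glasses"},  # target
    "f2": {"light-hair", "short-hair", "lab-coat"},
    "f3": {"light-hair", "short-hair", "glasses"},
    "f4": {"dark-hair", "short-hair", "lab-coat"},
    "f5": {"dark-hair", "long-hair", "glasses"},
}
pref = ["light-hair", "short-hair", "lab-coat", "glasses"]

lazy_reg("f1", faces, [], pref)
# → ['light-hair', 'lab-coat', 'glasses']          (no wm)
lazy_reg("f1", faces, ["lab-coat", "short-hair", "glasses"], pref)
# → ['lab-coat', 'glasses']                        (decay)
lazy_reg("f1", faces, ["short-hair", "glasses"], pref)
# → ['short-hair', 'glasses', 'lab-coat']          (interference)
```

the divergent outputs arise purely from which properties happen to sit in wm, matching the qualitative pattern in tab. 1.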
first, the proposed architecture's ability to demonstrate interactive alignment effects purely through wm dynamics is demonstrated by the system's tendency to re-use properties originating from its interlocutor. second, this example clearly demonstrates that the proposed architecture produces different referring behaviors when different models of forgetting are selected. we have presented a flexible set of forgetting mechanisms for integrated cognitive architectures, and conducted a preliminary, proof-of-concept demonstration of these mechanisms, showing that they lead to different referring expressions being generated due to differences in cognitive availability between different properties. the next step of our work will be to fully explore the implications of different parametrizations of each of our presented mechanisms, as well as the combined use of these mechanisms, on reg, and whether the referring expressions generated under different parametrizations are comparatively more or less natural, human-like, or effective, which would present obvious benefits for interactive, intelligent robots. in addition, the perspective taken in this paper may also yield insights and benefits for cognitive science more broadly. specifically, we argue that our perspective may suggest alternative interpretations of the role of cognitive load on attribute choice. in work building on gatt et al. (2011)'s model, it is suggested that when speakers are under high cognitive load, they rely less on previously primed attributes, and are thus more likely to rely on their stable preference orderings. the explanation offered for this finding is that a decrease in available wm capacity leads to an inability to retrieve dialogue context into wm.
[1] future work should consider other algorithmic configurations, such as having properties within wm considered in the reverse order, or according to the preference ordering specified by the target referent's consultant.
we suggest an alternative explanation: cognitive load leads to decreased priming not because priming-activated representations cannot be retrieved into wm, but because those representations are less likely to be in wm in the first place. an additional promising direction for future work will thus be to compare the ability of the model presented in this paper to previously presented models with respect to modeling of reg under cognitive load.
references:
the capacity of visual short-term memory is set both by visual information load and by number of objects
working memory: activation limitations on retrieval
human memory: a proposed system and its control processes
working memory
controlling the intelligibility of referring expressions in dialogue
referential form, word duration, and modeling the listener in spoken dialogue. approaches to studying world-situated language use
cognitive architecture for human-robot interaction: towards behavioural alignment
memory-centred architectures: perspectives on human-level cognitive competencies
tracking the time course of multidimensional stimulus discrimination: analyses of viewing patterns and processing times during "same"-"different" decisions
conceptual pacts and lexical choice in conversation
a theory of attention for cognitive systems
some tests of the decay theory of immediate memory
interaction histories and short term memory: enactive development of turn-taking behaviors in a childlike humanoid robot
the addition of an activation and decay mechanism to the soar architecture
referring as a collaborative process
the magical number 4 in short-term memory: a reconsideration of mental storage capacity
cooking up referring expressions
generating referring expressions: constructing descriptions in a domain of objects and processes
generating referring expressions involving relations. fifth conference of the european chapter
computational interpretations of the gricean maxims in the generation of referring expressions
do speakers and listeners observe the gricean maxim of quantity?
dual-process theories of higher cognition: advancing the debate
variability in the quality of visual working memory
producing pronouns and definite noun phrases: do speakers use the addressee's discourse model?
speaking from experience: audience design as expert performance. language
attribute preference and priming in reference production: experimental evidence and computational modeling
survey of the state of the art in natural language generation: core tasks, applications and evaluation
evaluating algorithms for the generation of referring expressions using a balanced corpus
system integration with working memory management for robotic behavior learning
preferences versus adaptation during referring expression generation
referring under load: disentangling preference-based and alignment-based content selection processes in referring expression generation
alignment in interactive reference production: content planning, modifier ordering, and referential overspecification
logic and conversation
the oxford handbook of reference
cognitive status and the form of referring expressions in discourse
towards an integrated robot with multiple cognitive functions
learning lexical alignment policies for generating referring expressions for spoken dialogue systems
efficient computation of spreading activation using lazy evaluation
the mind and brain of short-term memory
a capacity theory of comprehension: individual differences in working memory
thinking, fast and slow
implementation of cognitive control for a humanoid robot
proactive inhibition in short-term retention of single items
the egocentric basis of language use: insights from a processing approach. current directions in psychological science
the art of computer programming
learning preferences for referring expression generation: effects of domain, language and algorithm
what can cognitive architectures do for robotics? biologically inspired cognitive architectures
the soar cognitive architecture
cognitive robotics using the soar cognitive architecture. workshops at the twenty-sixth aaai conference on artificial intelligence
a standard model of the mind: toward a common computational framework across artificial intelligence, cognitive science, neuroscience, and robotics
oro, a knowledge management platform for cognitive architectures in robotics
rehearsal in serial recall: an unworkable solution to the nonexistent problem of decay
interference in short-term memory: the magical number two (or three) in sentence processing
visuo-spatial working memory
changing concepts of working memory
what's magic about magic numbers? chunking and data compression in short-term memory
a review of verbal and non-verbal human-robot interactive communication
generating anaphoric expressions: pronoun or definite description?
the relation of discourse/dialogue structure and reference
the magical number seven, plus or minus two: some limits on our capacity for processing information
comprehensive working memory activation in soar. international conference on cognitive modeling
psychology of learning and motivation
forgetting in immediate serial recall: decay, temporal distinctiveness, or interference? psychological review
modeling working memory: an interference model of complex span
givenness hierarchy theoretic cognitive status filtering. proc. cogsci
a working memory model improves cognitive control in agents and robots
a biologically inspired working memory framework for robots
toward a mechanistic psychology of dialogue
centering: a parametric theory and its instantiations. computational linguistics
caching algorithms and rational models of memory
building applied natural language generation systems
mechanisms of forgetting in short-term memory
decay theory of immediate memory: from brown (1958) to today
a preliminary analysis of the soar architecture as a basis for general intelligence
spoken instruction-based one-shot object and action learning in a cognitive robotic architecture
an overview of the distributed integrated cognition affect and reflection diarc architecture
you said it before and you'll say it again: expectations of consistency in communication
does working memory have a single capacity limit?
robots that use language
knowrob: knowledge processing for autonomous personal robots
act-r/e: an embodied cognitive architecture for human-robot interaction
computational models of referring: a study in cognitive science
how cognitive load influences speakers' choice of referring expressions
acoustic similarity and retroactive interference in short-term memory
multiple resources and mental workload
a detection theory account of change detection
a consultant framework for natural language processing in integrated robot architectures
situated open world reference resolution for human-robot dialogue
going beyond command-based instructions: extending robotic natural language interaction capabilities
towards givenness and relevance-theoretic open world reference resolution
a framework for resolving open-world referential expressions in distributed heterogeneous knowledge bases
referring expression generation under uncertainty: algorithm and evaluation framework
reference in robotics: a givenness hierarchy theoretic approach
augmenting robot knowledge consultants with distributed short term memory
the soar cognitive architecture and human working memory. models of working memory: mechanisms of active maintenance and executive control
discrete fixed-resolution representations in visual working memory
key: cord-252903-pg0l92zb
authors: abueg, m.; hinch, r.; wu, n.; liu, l.; probert, w. j. m.; wu, a.; eastham, p.; shafi, y.; rosencrantz, m.; dikovsky, m.; cheng, z.; nurtay, a.; abeler-dörner, l.; bonsall, d. g.; mcconnell, m. v.; o'banion, s.; fraser, c.
title: modeling the combined effect of digital exposure notification and non-pharmaceutical interventions on the covid-19 epidemic in washington state
date: 2020-09-02
journal: nan
doi: 10.1101/2020.08.29.20184135
sha: doc_id: 252903 cord_uid: pg0l92zb
contact tracing is increasingly being used to combat covid-19, and digital implementations are now being deployed, many of them based on apple and google's exposure notification system. these systems are new and are based on smartphone technology that has not traditionally been used for this purpose, presenting challenges in understanding possible outcomes. in this work, we use individual-based computational models to explore how digital exposure notifications can be used in conjunction with non-pharmaceutical interventions, such as traditional contact tracing and social distancing, to influence covid-19 disease spread in a population. specifically, we use a representative model of the household and occupational structure of three counties in the state of washington together with a proposed digital exposure notifications deployment to quantify impacts under a range of scenarios of adoption, compliance, and mobility. in a model in which 15% of the population participated, we found that digital exposure notification systems could reduce infections and deaths by approximately 8% and 6%, effectively complementing traditional contact tracing.
we believe this can serve as guidance to health authorities in washington state and beyond on how exposure notification systems can complement traditional public health interventions to suppress the spread of covid-19. the covid-19 pandemic has brought about tremendous societal and economic consequences across the globe, and many areas remain deeply affected. due to the urgency and severity of the crisis, the poorly understood long-term consequences of the virus, and the lack of certainty about which control measures will be effective, many approaches to stopping or slowing the virus are being explored. in seeking solutions to this problem, many technology-based non-pharmaceutical interventions have been considered and deployed (1), including data aggregation to track the spread of the disease, gps-enabled quarantine enforcement, ai-based clinical management, and many others. contact tracing, driven by interviews of infected persons to reveal their interactions with others, has been a staple of epidemiology and public health for the past two centuries (2). these human-driven methods have been brought to bear against covid-19 since its emergence, with some success (3). unfortunately, owing in part to the rapid and often asymptomatic spread of the virus, these efforts have not been successful in preventing a global pandemic. further, as infections have reached into the millions, traditional contact tracing resources have been overwhelmed in many areas (4, 5). given these major challenges for traditional contact tracing, technology-based improvements are being explored, with particular focus on the use of smartphones to detect exposures to others carrying the virus. smartphone apps may approximate pathogen exposure risk through the use of geolocation technologies such as gps, and/or via proximity-based approaches using localized radio frequency (rf) transmissions like bluetooth.
location-based approaches attempt to compare the places a user has been with a database of high-risk locations or overlaps with infected people ( 6 ) , while proximity-based approaches directly detect nearby smartphones that can later be checked for "too close for too long" exposure to infected people ( 7 ) . in either approach, users who are deemed to be at risk are then notified, and in some implementations, health authorities also receive this information for follow-up. due to accuracy and privacy concerns, the majority of contact tracing proposals have avoided the location signal and focused on a proximity-based approach, such as pepp-pt ( 8 ) and nhsx ( 9 ) . further privacy safeguards may be achieved by decentralizing and anonymizing important elements of the system, as in dp-3t ( 10 ) and apple and google's exposure notifications system (ens) ( 11 ) . in these approaches, the recognition of each user's risk level can take place only on the user's smartphone, and server-side knowledge is limited to anonymous, randomized ids. technological solutions in this space have never been deployed at scale before, and their effectiveness is unknown. there is an acute need to understand their potential impact, to establish and optimize their behavior as they are deployed, and to harmonize them with traditional contact tracing efforts. specifically, we will examine these issues in the context of ens, which is currently being adopted by many countries ( 12 ) . there are many variables to consider when characterizing the behavior of any system of this type. technology-dependent parameters, such as those needed to convert bluetooth signal strength readings to proximity ( 13 ) ( 14 ) , vary from device to device and require labor-intensive calibration. they will not be discussed in this paper. here we seek to explore the general conditions and public health backdrop in which an ens deployment may exist, and the policy characteristics that can accompany it. 
in order to improve our understanding of this new approach, we employ individual-based computational models, also known as agent-based models, which allow the exploration of disease dynamics in the presence of complex human interactions, social networks, and interventions ( 15 ) . this technique has been used to successfully model the spread of ebola in africa ( 16 ) , malaria in kenya ( 17 ) , and influenza-like illness in several regions ( 18 ) ( 19 ) , among many others. in the case of covid-19, the openabm-covid19 model by hinch et al. ( 20 ) has already been used to explore smartphone-based interventions in the united kingdom. this model seeks to simulate individuals and their interactions in home, work, and community contexts, using epidemiological and demographic parameters as a guide. in this work, we adapt the openabm-covid19 model to simulate the ens approach and apply it to data from washington state in the united states in order to explore possible outcomes. we use data at the county level to match the population, demographic, and occupational structure of the region, and calibrate the model with epidemiological data from washington state and google's community mobility reports for a time-varying infection rate ( 21 ) . similar to hinch et al., we find that digital exposure notification can effectively reduce infections, hospitalizations, and deaths from covid-19 at all levels of participation. we extend the findings by hinch et al. to show how digital exposure notification can be deployed concurrently with traditional contact tracing and social distancing to suppress the current epidemic and aid in various "reopening" scenarios. we believe the demographic and occupational realism of the model and its results have important implications for the public health of washington state and other health authorities around the world working to combat covid-19. 
to model the combined effect of digital exposure notification and other non-pharmaceutical interventions (npis) in washington state, we use a model first proposed by hinch et al. ( 20 ) , who have also made their code available as open source on github ( 22 ) . openabm-covid19 is an individual-based model that simulates interactions of synthetic individuals in different types of networks based on the expected type of interaction (fig. 1) . workplaces, schools, and social environments are modeled as watts-strogatz small-world networks ( 23 ) , households are modeled as separate fully connected networks, and random interactions, such as those on public transportation, are modeled in a random network. the networks are parameterized such that the average number of interactions matches the age-stratified data in ( 24 ) . contacts between synthetic individuals in those interaction networks have the potential for transmission of the virus that causes covid-19 and are later recalled for contact tracing and possible quarantine. while the original model by hinch et al. ( 22 ) included a single occupation network for working adults, we extend this to support multiple networks for workplace heterogeneity. this is motivated by increasing evidence that workplace characteristics play an important role in the spread of sars-cov-2, such as having to work in close physical proximity to other coworkers and interacting with the public. baker et al. found that certain u.s. working sectors experience a high rate of sars-cov-2 exposure, including healthcare workers, protective services (e.g., police officers), personal care and services (e.g., child care workers), and community and social services (e.g., probation officers) ( 25 ) . as another example, the centers for disease control and prevention (cdc) has issued specific guidance to meat and poultry processing workers due to the possible increased exposure risk in those environments ( 26 ) . 
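as an illustration of the network construction described above, a minimal watts-strogatz small-world generator can be sketched in pure python. this is a generic sketch of the construction in ( 23 ), not the openabm-covid19 implementation; the parameter values are illustrative:

```python
import random

def watts_strogatz(n, k, p, seed=0):
    """Ring lattice of n nodes, each joined to its k nearest neighbours
    (k even), with each edge rewired to a random endpoint with probability p."""
    rng = random.Random(seed)
    # Start from the regular ring lattice: k/2 "forward" edges per node.
    edges = [(u, (u + j) % n) for u in range(n) for j in range(1, k // 2 + 1)]
    present = {frozenset(e) for e in edges}
    for u, v in edges:
        if rng.random() < p:
            w = rng.randrange(n)
            # Avoid self-loops and duplicate edges.
            while w == u or frozenset((u, w)) in present:
                w = rng.randrange(n)
            present.discard(frozenset((u, v)))
            present.add(frozenset((u, w)))
    return present

g = watts_strogatz(n=100, k=4, p=0.1)
print(len(g))  # n * k / 2 = 200 edges, regardless of rewiring
```

rewiring preserves the edge count, so the mean number of contacts per node stays at k while path lengths shrink, which is the property that makes small-world networks a reasonable stand-in for workplace and school mixing.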
therefore, we model each individual industry sector as its own small-world network and parameterize it with real-world data such as the sector size and interaction rates. in openabm-covid19, transmission between infected and susceptible individuals through a contact is determined by several factors, including the duration since infection, susceptibility of the recipient (a function of age), and the type of network where it occurred (home networks assume a higher risk of transmission due to the longer duration and close proximity of the exposure). individuals progress through stages of susceptible, infected, recovered, or deceased. in this model, the dynamics of progression through these stages are governed by several epidemiological parameters, such as the incubation period, disease severity by age, asymptomatic rate, and hospitalization rate, and are based on the current literature of covid-19 epidemiology. a complete list of the epidemiological parameters can be found at ( 27 ) and any modifications to those are described in the subsequent sections and documented in the supplementary materials (tables s1, s2). in this work we model the three largest counties in washington state -- king, pierce, and snohomish -- with separate and representative synthetic populations. the demographic and household structure were based on data from the 2010 u.s. census of population and housing ( 28 ) and the 2012-2016 acs public use microdata sample ( 29 ) . we combined census and public use microdata sample (pums) data using a method inspired by ( 30 ) . for each census block in washington state we took distributions over age, sex, and housing type from several marginal tables (called census summary tables) and from the pums, and combined them into a multiway table using the iterative proportional fitting (ipf) algorithm. we then resampled the households from the pums to match the probabilities in the multiway table. 
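the ipf step used to build the synthetic population can be illustrated with a small two-way example; this is a generic sketch of the algorithm only (the actual synthesis combines multiway tables over age, sex, and housing type), and the toy marginals below are made up:

```python
def ipf(seed, row_targets, col_targets, iters=100, tol=1e-9):
    """Iterative proportional fitting: rescale a seed table so its row and
    column sums match the target marginals (which must share the same total)."""
    table = [row[:] for row in seed]
    for _ in range(iters):
        # Rescale each row to hit its row marginal.
        for i, target in enumerate(row_targets):
            s = sum(table[i])
            if s > 0:
                table[i] = [x * target / s for x in table[i]]
        # Rescale each column to hit its column marginal.
        for j, target in enumerate(col_targets):
            s = sum(row[j] for row in table)
            if s > 0:
                for row in table:
                    row[j] *= target / s
        err = max(abs(sum(row) - t) for row, t in zip(table, row_targets))
        if err < tol:
            break
    return table

# Toy example: age group (rows) x household type (columns), uniform seed.
fitted = ipf([[1, 1], [1, 1]], row_targets=[60, 40], col_targets=[30, 70])
print([[round(x, 1) for x in row] for row in fitted])  # [[18.0, 42.0], [12.0, 28.0]]
```

with a uniform seed the fit converges to the independence table (row total times column share); a non-uniform seed instead preserves the seed's interaction structure while matching the marginals, which is what lets pums household structure survive the fit.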
the resulting synthetic population in each census block respects the household structure given by pums and matches marginals from the census summary tables. our synthetic working population was drawn to match the county-level industry sector statistics reported by the u.s. bureau of labor statistics in their quarterly census of employment and wages for the fourth quarter of 2019 ( 31 ) . we also used a report by the washington state department of health (doh) containing the employment information of lab-confirmed covid-19 cases among washington residents as of may 27, 2020 to parameterize each occupation sector network ( 32 ) . for each sector, we use its lab-confirmed case number weighted by the total employment size as a multiplier factor to adjust the number of work interactions of that occupational network. while the doh report does not explicitly measure exposure risk for different industries, it is, to the best of our knowledge, the best source of data for confirmed covid-19 cases and occupations to date. our model should be refined with better data from future work that studies the causal effect of workplace characteristics on covid-19 transmission. a complete list of the occupation sectors and interaction multipliers can be found in the supplementary materials (tables s3, s4). testing and quarantine. in the openabm-covid19 model, if an individual presents with covid-19 symptoms, they receive a test and are 80% likely to enter a voluntary 7-day isolation with a 2% dropout rate each day for noncompliance. if the individual receives a positive test result, they isolate for a full 14 days from initial exposure with a daily dropout rate of 1%. prior to confirmation of the covid-19 case via a test result, the household members of the voluntarily self-isolating symptomatic individual do not isolate, which is in line with current recommendations by the cdc ( 33 ) . 
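the compliance parameters above imply a simple geometric dropout process; a minimal sketch using the stated 80% entry probability and 2% daily dropout rate:

```python
def still_isolating(day, p_enter=0.8, p_daily_dropout=0.02):
    """Probability that a symptomatic individual is still in voluntary
    isolation `day` days after symptom onset: enters with p_enter, then
    drops out independently each day with p_daily_dropout."""
    return p_enter * (1.0 - p_daily_dropout) ** day

# Expected fraction of symptomatic individuals completing the 7-day isolation.
print(round(still_isolating(7), 2))  # 0.69
```

so under these parameters roughly seven in ten symptomatic individuals complete the full voluntary isolation; the 14-day post-test quarantine, with its 1% daily dropout, retains a slightly larger share of those who enter it.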
household quarantines may still occur through digital exposure notification or manual contact tracing, described in the following sections. we simulate digital exposure notification in openabm-covid19 by broadcasting exposure notifications to other users as soon as an app user either tests positive or is clinically diagnosed with covid-19 during hospitalization. the model recalls the interaction networks of this app user, known as the "index case", to determine their first-order contacts within the previous 10 days. those notified contacts are then 90% likely to begin a quarantine until 14 days from initial exposure with a 2% dropout rate each day for noncompliance. see ( 22 ) for a more comprehensive description of the model. while the actual ens allows health authorities to configure notifications as a function of exposure distance and duration, our model does not have the required level of resolution and instead assumes that 80% of all "too close for too long" interactions are captured between users that have the app. (see the supplemental materials for a sensitivity analysis of this parameter.) the overall effect of digital exposure notification depends on a number of factors that we explore in this work, including the fraction of the population that adopts the app and the delay between infection and exposure notification. as an upper bound on app adoption, we configure the age-stratified smartphone population using u.s. smartphone ownership data from the pew research center ( 34 ) for ages 20+ and common sense media ( 35 ) for ages 0-19. since this data was not available for washington state specifically, we assumed that the u.s. distribution was representative of washington state residents. we also extend openabm-covid19 to model traditional or "manual" contact tracing as a separate intervention. 
in contrast to digital exposure notification, human tracers work directly with index cases to recall their contact history without the proximity detection capabilities of a digital app. those contacts are then given the same quarantine instructions as those traced through the digital app. we configure the simulation such that manual contact tracers have a higher likelihood of tracing contacts in the household and workplace/school networks (100% and 80%, respectively) than for the additional random daily contacts (5%). this is based on the assumption that people will have better memory and ability to identify contacts in the former (e.g., involving family members or coworkers) compared to the latter (e.g., a random contact at a restaurant). additionally, we configure the capacity of the contact tracing workforce with parameters for workforce size, maximum number of index-case interviews per day, and maximum number of tracing notification calls per day following those interviews. tracing is initiated on an index case after either a positive test or hospitalization, subject to the capacity in that area. finally, we add a delay parameter between the initiation of manual tracing and contact with the traced individuals, to account for the processing and interview time of manual tracing. model calibration is the process of adjusting selected model parameters such that the model's outputs closely match real-world epidemiological data. to calibrate openabm-covid19 for washington state we use components of a bayesian seir model by liu et al. ( 36 ) for modeling covid-19. they extend the classic seir model by allowing the infection rate to vary as a function of human mobility and a latent changepoint to account for unobserved changes in human behavior. we fit that model to washington state county-level mortality data from the new york times ( 37 ) and mobility data from the community mobility reports published by google and publicly available at ( 21 ) . 
the community mobility reports are created with aggregated, anonymized sets of data from users who have turned on the location history setting, which is off by default. no personally identifiable information, such as an individual's location, contacts or movement, is ever made available ( 38 ) . the reports chart movement trends over time by geography, across different categories of places such as retail and recreation, groceries and pharmacies, parks, transit stations, workplaces, and residential. we note that, because of the opt-in nature of this dataset, it may not be representative of the overall population. we extend the methodology in liu et al. to model calibration in openabm-covid19 by applying the time-varying infection rate coefficients to the relevant county-specific parameters that guide user interaction levels and disease transmission likelihood. more specifically, the numbers of daily interactions in the random and occupation networks, i_r(t) and i_w(t), are scaled by the mobility coefficient m(t) at time step t, which is calculated based on the aggregated and anonymized location visits from the community mobility reports. the time-dependent infectious rate β(t) is scaled by a weighting term σ(t) that depends on how far time step t is from a learned changepoint, which is modeled as a negative sigmoid. both σ(t) and m(t) are learned functions and are described in more detail in ( 36 ) . finally, we use an exhaustive grid search to compute two openabm-covid19 parameters for each county: its initial infectious rate and its infection seed date. the infectious rate is the mean number of individuals infected by each infectious individual with moderate-to-severe symptoms, and can be considered a function of population density and social mixing. the infection seed date is the date at which the county reaches 30 total infections, possibly before the first official cases due to asymptomatic and unreported cases. 
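the scaling just described can be sketched as follows. the sigmoid parametrization of σ(t) here is illustrative only (the actual σ(t) and m(t) are learned functions as in ( 36 )), and all numeric values are made up:

```python
import math

def sigma(t, changepoint, steepness=0.5):
    """Negative-sigmoid weight on the infectious rate: near 1 well before
    the learned changepoint, decaying toward 0 after it (illustrative form)."""
    return 1.0 / (1.0 + math.exp(steepness * (t - changepoint)))

def scaled_parameters(beta0, n_random0, n_work0, t, mobility, changepoint):
    """Apply the mobility coefficient m(t) to the random/occupation
    interaction counts and sigma(t) to the time-dependent infectious rate."""
    beta_t = beta0 * sigma(t, changepoint)
    return beta_t, n_random0 * mobility, n_work0 * mobility

beta_t, n_r, n_w = scaled_parameters(beta0=3.0, n_random0=20, n_work0=10,
                                     t=50, mobility=0.55, changepoint=30)
print(n_r, n_w)  # 11.0 5.5
```

the point of the split is that m(t) is observed (from the mobility reports) while σ(t) absorbs behavior changes the mobility signal does not capture, such as mask adoption.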
we pick the parameters where the simulated mortality best matches the actual covid-19 mortality from epidemiological data, as measured by root-mean-square error (rmse). the results of the calibrated models for king, pierce, and snohomish counties are shown in fig. 2 . note that while there is a strong correlation in the predicted and reported incidence, the absolute predicted counts are approximately 6x higher than those that were officially reported. we attribute this difference to the fact that openabm-covid19 is counting all asymptomatic and mild symptomatic cases that may not be recorded in reality. this is approximately consistent with the results of a seroprevalence study by the cdc that estimated that there were 6 to 24 times more infections than official case report data ( 39 ) . (this preprint is made available under a cc-by-nc-nd 4.0 international license; the author/funder has granted medrxiv a license to display the preprint in perpetuity. this version was posted september 2, 2020.) in this section we present several forward-looking simulations for washington state counties by comparing multiple hypothetical scenarios that implement some combination of digital exposure notification, manual contact tracing, or social distancing. each simulation uses the same calibrated model parameters up to july 11, 2020, at which point the hypothetical interventions are implemented. beyond this date, each simulation uses the model parameters from the last week of the calibration period, except where explicitly specified as part of the intervention. for each simulated intervention we report the number of infections (daily and cumulative), cumulative number of deaths, number of hospitalizations, number of tests per day, and fraction of the population in quarantine. each simulation covers 300 consecutive days from march 1, 2020 through december 25, 2020, plus the additional calibrated seeding period before march 1. 
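the calibration step above amounts to a generic rmse grid search; in the sketch below, `toy_simulate` is a hypothetical stand-in for running the abm (the real search runs full simulations per grid point), and the grids are illustrative:

```python
import math

def rmse(a, b):
    """Root-mean-square error between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def calibrate(observed_deaths, simulate, rates, seed_dates):
    """Exhaustive grid search over (infectious rate, seed date), keeping the
    pair whose simulated mortality series best matches the observed one."""
    best = None
    for r in rates:
        for d in seed_dates:
            err = rmse(simulate(r, d), observed_deaths)
            if best is None or err < best[0]:
                best = (err, r, d)
    return best

# Toy stand-in for the ABM: deaths grow linearly with rate after seeding.
def toy_simulate(rate, seed_day):
    return [max(0.0, rate * (t - seed_day)) for t in range(10)]

observed = [max(0.0, 2.5 * (t - 2)) for t in range(10)]
err, rate, seed = calibrate(observed, toy_simulate,
                            rates=[2.0, 2.5, 3.0], seed_dates=[0, 2, 4])
print(rate, seed)  # 2.5 2
```

grid search is expensive but robust here because only two parameters are being fit per county and the abm is stochastic, which makes gradient-based fitting unreliable.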
unless otherwise stated, the reported result is the mean value over 10 runs with different random seeds of infection. note that results may be affected by the end date of the simulation because of the time it takes some interventions to have their full effect. we believe that a time horizon of approximately 5 and a half months is long enough to be practically useful for public health agencies who are considering deploying such interventions, but short enough to minimize the long-term uncertainty and effects of externalities such as a vaccine becoming available. we first study the effect of a digital exposure notification app at different levels of app adoption -- 15%, 30%, 45%, 60%, and 75% (or all smartphone owners) -- of the population in each county. as a baseline, we compare those results to the "default" scenario without digital exposure notification and assume no change in behavior or interventions beyond july 11, 2020. the results show an overall benefit of digital exposure notification at every level of app adoption (fig. 3 and 4) . when compared to the default scenario of only self isolation due to symptoms, each scenario results in lower overall incidence, mortality, and hospitalizations. unsurprisingly, the effect on the epidemic is more significant at higher levels of app adoption. an app with 75% adoption reduces the total number of infections by 56-73%, 73-79%, and 67-81% and the number of total deaths by 52-70%, 69-78%, and 63-78% for king, pierce, and snohomish counties, respectively. even at a relatively low level of adoption of 15%, total infections are reduced by 3.9-5.8%, 8.1-9.6%, and 6.3-11.8% and total deaths are reduced by 2.2-6.6%, 11.2-11.3%, and 8.2-15.0% for king, pierce, and snohomish counties, respectively. 
in addition to its effect on the epidemic, we also evaluate the trade-off between exposure notification app adoption and the total number of quarantine events. there is an incentive to minimize the quarantine rate because of the perceived economic and social consequences of stay-at-home orders. at 15% exposure notification app adoption the number of total quarantine events increases by 4.6-6.4%, 6.6-6.8%, and 5.8-10.2% for king, pierce, and snohomish counties (fig. 5) . in general, the higher the level of exposure notification adoption the greater the number of total quarantine events, with the exception of very high levels of adoption (60% and 75%) where this number plateaus or even decreases, likely due to the significant effect of the intervention in suppressing the overall epidemic in those scenarios. from another perspective, achieving epidemic control at the price of high initial quarantine is preferable to lower levels of quarantine that are sustained for much longer. fig. 5 : estimated total quarantine events of king, pierce, and snohomish counties for various levels of exposure notification app uptake among the population from july 11, 2020 to december 25, 2020. 
note that even for the "default" (0% en app uptake) scenario there is a non-zero number of quarantine events because this assumes that symptomatic and confirmed covid-19 positive individuals will self-quarantine at a rate of 80%, even in the absence of an app. next we study the potential impact of manual contact tracing in suppressing the epidemic as a function of the contact tracing workforce size. manual tracing with the full desired staffing levels of 15 workers per 100,000 people is able to affect the epidemic trend in all three counties, but has a significantly smaller effect at current staffing levels (fig. 6) . unsurprisingly, the impact for a given level of staffing is dependent upon the current epidemic trend, reinforcing the need for concurrent interventions to effectively manage the epidemic. additionally, we compare the performance of exposure notification to manual contact tracing to establish similarities between relative staffing level and exposure notification adoption and to verify an additive effect of concurrent manual tracing and exposure notification. we see improvements in all cases when combining interventions (fig. 7) . in all three counties, exposure notification has a stronger effect at the given staffing and adoption levels, but adding either intervention to the other results in reduced infections, albeit to different extents based on the trend of the epidemic. this suggests that both methods are useful separately and combined, even if they do not explicitly coordinate. 
while the results shown above suggest that the interventions are effective in suppressing the covid-19 epidemic to various degrees, in practice, health organizations will implement multiple intervention strategies simultaneously to try to curb the spread of the virus while also allowing controlled reopenings. therefore, we also study the combined effect of concurrent interventions including digital exposure notification, manual contact tracing, and social distancing (fig. 8) . we model social distancing as a function of infectiousness of interactions in the random and occupation networks, where increasing social distancing decreases the relative transmission likelihood on a network by a multiplicative factor relative to their values as of march 1, 2020 (i.e., before broad-based social distancing and mobility reductions). for example, social distancing of 1.7x is equivalent to multiplying the relative transmission by 1 / 1.7 ≈ 0.6. note that this does not change the number of person-to-person interactions, but rather the likelihood of transmission of any individual encounter, which may be affected by factors other than physical distancing such as mask usage, improved hygiene, use of personal protective equipment, etc. next we examine the effects of combined npis under various "reopening" scenarios by gradually increasing the number of interactions in every interaction network, including households, workplaces, schools, and random networks. specifically, we increase these interactions by a given percentage from the levels as of july 11, 2020 (0% reopen) up to the initial levels at march 1, 2020, at the very start of the epidemic (100% reopen). 
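the two knobs described above, the distancing factor and the reopening fraction, amount to simple rescalings; a minimal sketch with illustrative numbers:

```python
def transmission_multiplier(distancing):
    """Social distancing of e.g. 1.7x divides the per-contact transmission
    likelihood by that factor; contact counts themselves are unchanged."""
    return 1.0 / distancing

def reopened_interactions(baseline, pre_epidemic, reopen_fraction):
    """Interpolate a network's interaction count between the July 11
    baseline (0% reopen) and the March 1 pre-epidemic level (100% reopen)."""
    return baseline + reopen_fraction * (pre_epidemic - baseline)

print(round(transmission_multiplier(1.7), 2))  # 0.59
print(reopened_interactions(baseline=8.0, pre_epidemic=20.0,
                            reopen_fraction=0.5))  # 14.0
```

keeping the two effects separate matters for interpretation: distancing scales the per-encounter risk (mask usage, hygiene), while reopening scales the number of encounters.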
the average number of interactions in each network is thus interpolated between its value at the end of the baseline period and its pre-epidemic value. the increase in new infections from a 10-20% reopening is balanced by 22-37% exposure notification app adoption, although the effect varies by county (fig. 9) . this shows that limited additional reopenings may be possible after introducing exposure notification alongside existing fully staffed manual tracing (15 staff per 100,000 people), but that social distancing remains an important measure under these circumstances. additionally, there is an increased effect of adding exposure notification under greater reopening scenarios. as an example, we plot some primary metrics for a 50% network reopening and see significant reductions in nearly all metrics at even 30% adoption (fig. 10) . fig. 10 : estimated total infected percentage, total deaths, and peak hospitalized under a 50% reopening scenario (an increase of 50% of the difference between pre-lockdown and post-lockdown network interactions) at various exposure notification adoption rates for king, pierce, and snohomish counties, assuming no change to social distancing (β(t)) after the baseline and 15 manual contact tracers per 100k people. as part of the washington state department of health's "safe start" plan, a key target metric to reopen washington is to reach fewer than 25 new cases per 100,000 inhabitants over the prior two weeks ( 42 ) . here, we examine how many days it would take to reach that target under the combined npis. with the recent spike in cases, the trajectory for reaching these targets without renewed lockdowns is out of the range of the simulations. 
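the safe start target just described can be computed directly from a daily case series; a minimal sketch (the population figure in the example is illustrative, not an official county count):

```python
def meets_safe_start_target(daily_new_cases, population, threshold=25.0):
    """Washington 'Safe Start' reopening metric: fewer than `threshold` new
    cases per 100,000 residents over the prior two weeks."""
    last_two_weeks = sum(daily_new_cases[-14:])
    rate_per_100k = last_two_weeks * 100_000 / population
    return rate_per_100k < threshold

# e.g. 10 cases/day in a county of 800,000: 14 * 10 * 1e5 / 8e5 = 17.5 per 100k
print(meets_safe_start_target([10] * 30, population=800_000))  # True
```

because the metric is a trailing 14-day sum, incidence must stay low for two full weeks before the target registers, which is part of why the interventions shorten time-to-target rather than flipping it immediately.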
therefore, to show the relative benefits of the npis, we introduce an artificial renewed lockdown at the mobility levels averaged over the month before the phase 2 reopenings (phase 1.5 for king county) that occurred on june 5, 2020. using this averaged mobility from may 6 to june 5, 2020, we model the relative effects of manual tracing and exposure notification on the washington safe start key metric. we find that for all three counties, manual contact tracing at the recommended staffing levels combined with an exposure notification app can significantly reduce the amount of time it takes to achieve this metric (fig. 11) . under the recommended standard for manual tracing, adding exposure notification at 30% adoption results in reaching the target in 92%, 87%, and 85% of the time versus no exposure notification for king, pierce, and snohomish counties respectively. at the reduced levels of 4.7 tracers per 100,000 population, the target is reached in less than 83% and 88% of the time for king and snohomish respectively, although the exact ratio cannot be calculated as the metric is not achieved in the baseline simulation. our individual-based modeling approach attempts to simulate the behavior of humans in a complex environment, in order to better understand the relative effects of different levels of intervention. while we have attempted to add realistic elements and calibrate it with the best available data, it still represents a dramatic simplification of the real world. 
choices and simplifications made surrounding the behavior of the individuals, their movements in the world, disease dynamics, and many others, mean that the results should be viewed as an exploration of possible outcomes, not a prediction ( 43 ) . a more specific limitation in our work is that we modeled each county separately without cross-county interactions. in particular, we did not model how cross-county human movement contributes to disease spreading. we plan to explore this effect in our future work. our simulations assume that it takes 2 days from symptom onset to receive a covid-19 test result and we acknowledge that this is a key assumption underlying our findings. ferretti et al. ( 44 ) showed that the delay between initial exposure and case confirmation, notification, and quarantine has a significant impact on the efficacy of the intervention. rapid testing protocols can shorten the time between symptom development and case confirmation, and are essential for epidemic control ( 20 ) . we used published covid-19 mortality data to calibrate model parameters. while the death count is arguably a good proxy to the true infection numbers, the published mortality data are scarce and noisy in small counties, making it difficult to model those counties accurately. the synthetic occupation networks are based on the latest employment data corresponding to the fourth quarter of 2019 ( 31 ) . since the beginning of the pandemic, the size and structure of occupation networks may have changed compared to the latest available data. in our work we used the mobility data along with a changepoint to model time-varying infection rates. while the changepoint vector models the net effect of various latent factors, it may be limited when multiple change points or more complex latent factors exist. the derived time-varying infection rate is homogeneously distributed to the random network and occupational networks. 
this is an approximation to the reality where the change may vary on different networks. the compartmental modeling approach ( 45 ) ( 46 ) ( 47 ) has been widely used for epidemic studies. this approach segments the total population into subgroups according to the disease progression stage and models the transitions between stages with differential equations. seir (susceptible-exposed-infected-recovered) ( 48 ) ( 49 ) ( 50 ) ( 44 ) is a common type of compartmental model used to study covid-19 spread. however, this approach is not suitable for studying the impact of individual-level interventions like exposure notification apps because it characterizes the disease dynamics at a population level. in contrast to the compartmental model, the individual-based modeling approach ( 13 , 18 , 19 , 51 -62 ) simulates the infectious disease progression of individuals and can consider demographics, social interactions, and the environment. these individual-based models can predict the spread of covid-19 in multiple countries by fitting the stochastic model of disease progression and human interactions from historical data. however, the impact of additional interventions such as digital exposure notification is unexplored. in ( 63 ) ( 64 ) , disease transmission is modeled by a stochastic process to fit the reproduction number of the total population. however, manipulating the reproduction number by real contact tracing actions can be challenging as it is subject to human interaction patterns, adoption rate, and many other types of interventions. this model lacks the characteristics of individuals as it uses mean field theory to approximate the total population. 
(65-67) study contact tracing by situating individuals randomly in a space and mimicking human contacts through collisions arising from individuals' spatial movement. while this spatial individual-based approach shows promising results for virus spread in relatively small and closed areas, such as public buildings (68) and cruise ships (69), the ad-hoc assumptions about individual mobility patterns are not suitable for studying the impact of contact tracing at the scale of a city. (70) introduces a spatiotemporal model with more realistic mobility patterns; however, the spatial movement used in these models is a simplification of contact tracing which lacks individual interactions among family members, workmates, and random activities. the effectiveness of manual and digital contact tracing is discussed in (71) using empirical contact data collected from a small-scale work-related network, without considering virus spread among family members and other random interactions. the references (57, 72) are the closest to ours, but they do not cover the joint impact of manual and digital contact tracing; in addition, model calibration is missing in their case studies. in contrast, openabm-covid19 (22) simulates concurrent manual contact tracing and digital exposure notification interventions over interaction networks at a large scale. in this study we conducted a model-based estimation of the potential impact of a digital exposure notification app in washington state. openabm-covid19 simulates interactions among synthetic agents in various small-world networks, representing households, workplaces, schools, and random interactions. interactions in those networks can result in covid-19 transmission and are recalled to simulate different tracing interventions, including "manual" contact tracing or digital exposure notification, such as the recently released apple and google exposure notifications system (ens).
we calibrated our model using real-world data on human mobility and showed how it can accurately match epidemiological data in washington state's three largest counties: king, pierce, and snohomish. similar to hinch et al.'s report on digital contact tracing in the uk (20), we found that a digital exposure notification app can meaningfully reduce infections, deaths, and hospitalizations in these washington state counties at all levels of app uptake, even if only a small fraction of the eligible population participates. we also showed how digital exposure notification can be combined with manual contact tracing at the recommended levels to further suppress the epidemic, even if the two interventions do not explicitly coordinate. our simulations showed that the simultaneous deployment of both interventions can help these washington counties meet the key incidence metric defined by the safe start washington plan before december 2020. the potential overall effect of digital exposure notification seems to be greater than even optimal levels of manual contact tracing, likely because of its ability to scale and to better identify random interactions. we also found that quarantine rates, which contribute to the social and economic cost of these interventions, scale sublinearly with app adoption, meaning that in some cases fewer people are quarantined even though a greater fraction of the population is participating in the app. we credit this effect to the success of the app in suppressing the epidemic at high levels of adoption. given a longer simulation time horizon we may see a similar effect even at lower levels of app adoption.
health authorities may consider this when appealing to the public, by explaining how greater rates of collective participation may reduce the severity of the epidemic while also minimizing or reducing the need for quarantine. finally, we looked at the combined effects of digital exposure notification and manual tracing in the context of different reopening scenarios, where mobility and interaction levels increase to pre-epidemic levels. our results suggest that both interventions are helpful in counterbalancing the effect of reopening, but are not sufficient to fully offset new cases except at very high levels of adoption and manual tracing staffing. as a result, we believe that continued social distancing and limiting person-to-person interactions remain essential. future work is needed to study targeted reopening strategies, such as reopening specific occupation sectors or schools, or more stringent social distancing interventions in places that do reopen. looking ahead to future work, we are considering the question of coordination between different regions when deploying digital exposure notification as part of a suite of non-pharmaceutical interventions. the united states has seen a highly spatially varied response to the covid-19 pandemic, with significant consequences for epidemic control (73). under the conditions of varying cross-county and cross-state flows, we seek to quantify the empirical efficiency gap between coordinated and uncoordinated deployments, and the policies around testing, tracing, and isolation in which a digital exposure notification system can aid. in particular, the beginning of such cross-state collaborations is evident in consortia of state governments such as the western states pact and a multi-state council in the northeast, both working to coordinate their responses.
we expect that coordinated deployments of digital exposure notification applications and public policies may lead to more effective epidemic control as well as more efficient use of limited testing and isolation resources.
applications of digital technology in covid-19 pandemic planning and response. the lancet digital health
john haygarth's 18th-century "rules of prevention" for eradicating smallpox
spread of sars-cov-2 in the icelandic population
israel's contact tracing system said to be vastly overwhelmed by virus spread
california's plan to trace travelers for virus faltered when overwhelmed, study finds
a case for participatory disease surveillance of the covid-19 pandemic in india
nist pilot too close for too long (tc4tl) challenge evaluation plan
dp3t - decentralized privacy-preserving proximity tracing
privacy-preserving contact tracing
a flood of coronavirus apps are tracking us. now it's time to keep track of them
using bluetooth low energy (ble) signal strength estimation to facilitate contact tracing for sc '08: proceedings of the 2008 acm/ieee conference on supercomputing
spatiotemporal spread of the 2014 outbreak of ebola virus disease in liberia and the effectiveness of non-pharmaceutical interventions: a computational modelling analysis
simulation of malaria epidemiology and control in the highlands of western kenya
comparing large-scale computational approaches to epidemic modeling: agent-based versus structured metapopulation models
flute, a publicly available stochastic influenza epidemic simulation model
effective configurations of a digital contact tracing app: a report to nhsx
agent-based model for modelling the covid-19 epidemic
collective dynamics of "small-world" networks
social contacts and mixing patterns relevant to the spread of infectious diseases
estimating the burden of united states workers exposed to infection or disease: a key factor in containing risk of covid-19 infection
cdc, meat and poultry processing workers and employers
openabm-covid19/baseline_parameters_transpose.csv at v0
census of population and housing
creating synthetic baseline populations. transportation research part a: policy and practice
covid-19 confirmed cases by occupation and industry
estimating the changing infection rate of covid-19 using bayesian models of mobility
the new york times, coronavirus in the u.s.: latest map and case count
google covid-19 community mobility reports: anonymization process description (version 1.0)
seroprevalence of antibodies to sars-cov-2 in 10 sites in the united states
king county safe start application moving from modified phase 1 to phase 2
building covid-19 contact tracing capacity in health departments to support reopening american society safely
safe start washington - phased reopening county-by-county
agent-based modeling in public health: current applications and future directions
quantifying sars-cov-2 transmission suggests epidemic control with digital contact tracing
a contribution to the mathematical theory of epidemics
estimation of parameters in a structured sir model
epidemic analysis of covid-19 in china by dynamical modeling
modified seir and ai prediction of the epidemics trend of covid-19 in china under public health interventions
a modified seir model to predict the covid-19 outbreak in spain and italy: simulating control scenarios and multi-scale epidemics
revealing covid-19 transmission in australia by sars-cov-2 genome sequencing and agent-based modeling
sustainable and resilient strategies for touristic cities against covid-19: an agent-based approach
predicting the impacts of epidemic outbreaks on global supply chains: a simulation-based analysis on the coronavirus outbreak (covid-19/sars-cov-2) case
an agent-based epidemic model reina for covid-19 to identify destructive policies
social network analysis and agent-based modeling in social epidemiology
modelling disease outbreaks in realistic urban social networks
covasim: an agent-based model of covid-19 dynamics and interventions
covid-abs: an agent-based model of covid-19 epidemic to simulate health and economic effects of social distancing interventions
modeling covid-19 on a network: super-spreaders, testing and containment
modelling transmission and control of the covid-19 pandemic in australia
enhancing response preparedness to influenza epidemics: agent-based study of 2050 influenza season in switzerland. simulation modelling practice and theory
initial simulation of sars-cov2 spread and intervention effects in the continental us
impact of delays on effectiveness of contact tracing strategies for covid-19: a modelling study
feasibility of controlling covid-19 outbreaks by isolation of cases and contacts. the lancet global health
strategies for containing an emerging influenza pandemic in southeast asia
strategies for mitigating an influenza pandemic
an agent-based model to evaluate the covid-19 transmission risks in facilities
how to restart? an agent-based simulation model towards the definition of strategies for covid-19 "second phase"
how many infections of covid-19 there will be in the "diamond princess" - predicted by a virus transmission model based on the simulation of crowd flow
a spatiotemporal epidemic model to quantify the effects of contact tracing, testing, and containment
effect of manual and digital contact tracing on covid-19 outbreaks: a study on empirical contact data
modelling the impact of testing, contact tracing and household quarantine on second waves of covid-19
interdependence and the cost of uncoordinated responses to covid-19
key: cord-267180-56wqok4c authors: aliee, m.; castano, s.; davis, c.; patel, s.; miaka, e. m.; spencer, s. e.; keeling, m. j.; chitnis, n.; rock, k. s.
title: predicting the impact of covid-19 interruptions on transmission of gambiense human african trypanosomiasis in two health zones of the democratic republic of congo date: 2020-10-27 journal: nan doi: 10.1101/2020.10.26.20219485 sha: doc_id: 267180 cord_uid: 56wqok4c many control programmes against neglected tropical diseases have been interrupted due to the covid-19 pandemic, including those that rely on active case finding. in this study we focus on gambiense human african trypanosomiasis (ghat), for which active screening was suspended in the democratic republic of congo (drc) due to the pandemic. we use two independent mathematical models to predict the impact of covid-19 interruptions on transmission and reporting, and on the achievement of the 2030 elimination of transmission (eot) goal for ghat, in two moderate-risk regions of the drc. we consider different interruption scenarios, including reduced passive surveillance in fixed health facilities, and whether this suspension lasts until the end of 2020 or 2021. our models predict an increase in the number of new infections in the interruption period only if both active screening and passive surveillance are suspended, and a slowed reduction but no increase if passive surveillance remains fully functional. in all scenarios, eot may be slightly pushed back if no mitigation, such as increased screening coverage, is put in place. however, we emphasise that the biggest challenge will remain in the higher-prevalence regions, where eot is already predicted to be behind schedule without interruptions unless interventions are bolstered. the threat posed by the 2019 coronavirus disease pandemic is not limited to the direct consequences of the disease itself. in addition to the economic burden that many countries are facing due to lockdowns, the public health measures initiated to suppress transmission of the coronavirus, sars-cov-2, may have an additional impact on the ability to control other infectious diseases.
a disruption in the usual activities of health services has the potential to lead to increased loss of life as surveillance and access to diagnostics and treatment are more limited (wang, 2020). these challenges to public health systems will affect africa disproportionately, with covid-19 placing an additional burden on many already fragile health systems (makoni, 2020); previous modelling studies have suggested that covid-19-related interruptions to malaria control could lead to substantially higher cases and deaths in africa in the near future (sherrard-smith2020, hogan2020). gambiense human african trypanosomiasis (ghat) is a vector-borne disease of central and west africa, which is typically fatal when left untreated, and for which the delivery of interventions has already been impacted by the covid-19 pandemic. a primary intervention to control ghat is the use of active screening (as), where at-risk populations in hard-to-reach locations are tested using serological diagnostics followed by subsequent case confirmation and treatment. in april 2020, the world health organization (who) recommended that active case-finding activities and mass treatment campaigns for neglected tropical diseases should be postponed until further notice (who, 2020). the lack of as results in a reliance on the passive healthcare system and on individuals self-presenting to health centres following onset of symptoms. this will likely lead to diagnosis at a later stage of this lethal disease, with additional uncertainty about whether the pandemic will also lead to a fear of travelling to health centres by patients, and a reduced priority of the disease for diagnosis in health centres due to covid-19 testing (camara, 2017, chanda-kapata, 2020). for ghat, interruptions in as in the past have led to increased infections in subsequent years.
the 2014-2015 ebola outbreak in guinea, which completely interrupted as and resulted in "partial" passive surveillance, coincided with a fall in reported ghat incidence, which was likely due to poor surveillance rather than a true reduction in infection numbers (camara, 2017). we can also learn from other interruptions of as: a period of no active screening in 2007-2008 in mandoul, chad, also led to increased passive case detections in 2008, with modelling suggesting there was an increase in transmission (mahamat, 2017). a challenge for controlling intensified disease management neglected tropical diseases (idm ntds) is the strong link between control activities (i.e. screening) and surveillance through case reporting, so a reduction in reported cases can be indicative either of success (if surveillance is strong) or of reduced intervention effort. there is a risk that the interruption of ghat interventions will impact the goal of elimination of transmission (eot) of ghat by 2030, which has been set by the who (franco, 2020). while modelling suggests that it is already unlikely that all areas will achieve this goal (huang 2020, ntd-ghat 2020), there is now the potential for further delays. to quantify the length of the delay in time to reach eot caused by covid-19 interruptions, and the corresponding increase in transmission, cases and possible deaths, we use two independent stochastic models of ghat infection in two distinct transmission settings in the democratic republic of congo (drc), the country with the highest case burden. (this preprint is made available under a cc-by 4.0 international license; the copyright holder is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity; this version posted october 27, 2020; not certified by peer review.)
in this work, we focus on administrative regions in kwilu province (within the former bandundu province), which was the highest-endemicity province in the country in 2016. we selected two different health zones within kwilu, bagata and mosango, both classified as moderate-risk in 2016 (1-10 reported cases per 10,000 people per year (franco et al, 2020)), although bagata has had higher case reporting (see si fig 2, 4, 5). both health zones are similar in population size (estimated as 121,433 for mosango and 165,990 for bagata in 2015 (ocha)). the health zones have had regular active screening, with bagata generally observing higher coverage: in bagata the mean coverage during 2014-2018 was approximately 35%, and the maximum was around 45%; in mosango the mean coverage during 2014-2018 was around 33%, and the maximum was 60%. we chose regions in which vector control has not yet been implemented in order to better study the significance of interruptions of screening. the impact of interruptions of screening programmes was studied using two independently-developed stochastic models of ghat infection, named model s and model w, previously described in other modelling studies (stone2015, rock2015, castano2019, crump2020, davis2019). both models take into account different stages of the disease and transmission between vectors and humans who might be at low or high risk of exposure to vector bites. they also allow for various screening programmes, including passive surveillance (ps) and as, through which infected people may be identified and therefore treated (more details in the si and in (castano2020, crump2020)). a table with the similarities and differences between the models and simulations can be found in the si.
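the vector-human transmission structure with low- and high-risk exposure groups can be illustrated with a simple force-of-infection calculation in the style of ross-macdonald vector models. this is a generic sketch with invented values, not either of the fitted ghat models.

```python
# hedged sketch of vector-to-human force of infection with two human risk
# groups that differ in their exposure to vector bites (all values invented)
def force_of_infection(i_vectors, n_vectors, bite_rate, p_transmit, exposure):
    """per-human infection rate from infectious vectors, scaled by the
    group's relative exposure to bites."""
    return bite_rate * p_transmit * exposure * i_vectors / n_vectors

# toy state: 2% of vectors infectious; high-risk humans are 5x more exposed
lam_low = force_of_infection(i_vectors=200, n_vectors=10_000,
                             bite_rate=0.3, p_transmit=0.065, exposure=1.0)
lam_high = force_of_infection(i_vectors=200, n_vectors=10_000,
                              bite_rate=0.3, p_transmit=0.065, exposure=5.0)
```

the exposure multiplier is what distinguishes the low- and high-risk groups here: the high-risk force of infection is simply five times the low-risk one under these toy values.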
the parameters used in the stochastic models were obtained by fitting the corresponding deterministic models to the human case screening data recorded for the years 2000-2016 in each health zone, except for the model s calibration for bagata, which used data for 2000-2018 (see (crump2020) and the si). the fits assumed a 3% annual increase of the human population, in line with available estimates, and an increase in the passive detection rate over time, an improvement that is backed by both anecdotal evidence and previous modelling studies (crump 2020 and castano 2020). annual rates of active screening were captured directly from the data for the period 2000-2018. the stochastic simulations use 200 parameter sets from the posterior distributions of the specific health zones; each model generates 200,000 stochastic realisations (1,000 from each of the 200 parameter sets), and the model outputs compared to historic data can be seen in the si (figures 2 and 4). for a no-interruption baseline, we assume as and ps interventions continue indefinitely from 2019 with the same number of people screened annually, given by the mean value of the last five years (2014-2018). we then consider six potential interruption scenarios of ghat activities due to covid-19. we assume all interruptions start at the beginning of april 2020, but they may last until either the end of 2020 or 2021. the interruptions may disrupt either as alone or both as and ps. whilst as is assumed to be fully suspended within the interruption period, ps may be partially operating, reverting to the detection capacity before the modelled improvement; for model w this was the ps level in 1998, and for model s that of 2000. table 1 summarises these six interruption scenarios. the interventions are reinstated to the baseline values (mean as and full ps) after the interruption period. moreover, we study identical scenarios with mitigation, where as is set to the maximum coverage observed in the data between 2000 and 2018 after the interruption finishes. simulating these 13 scenarios allows us to study possibilities of catching up to previously expected progress or even accelerating towards the 2030 goal. we compare the different scenarios by following the predicted infection dynamics over time after the interruptions are introduced in both health zones. in particular, our model outputs include the annual number of new human infections, reported cases corresponding to both passive surveillance and active screening, and the deaths caused by ghat. we calculate these variables for all realisations and examine the mean values over all 200,000 realisations. figures 1 and 2 show the mean dynamics over time of all 13 scenarios as introduced in table 1. in some cases the mean is substantially higher than the median, so we provide baseline projections in the si showing both averages as well as 95% prediction intervals for the two models and both health zones (see si figures 3 and 6). these wider prediction intervals explain how the mean number of new transmissions in model s can be higher than in model w for mosango (figure 2), even though there is a higher probability of meeting eot during 2020-2025 for model s (figure 3). similar trends can be recognised for the two health zones, bagata and mosango, despite their different prevalence and infection dynamics. the results from both models and both health zones predict a significant increase in the mean number of new infections following suspension of both as and ps.
on the other hand, the number of reported cases decreases during the interruption period, but then resurges in the following years, especially in mitigated scenarios where active screening is reinstated at the maximum historic coverage. more importantly, our models predict higher death rates during and after the interruptions. if ps continues, either at full or reduced capacity, our simulations indicate that infection is unlikely to increase during the interruption; however, progress towards elimination of transmission could slow or stagnate. putting these results together, the loss is more pronounced for longer interruptions (until the end of 2021) and, as expected, the worst outcome is observed when both active and passive activities are fully ceased. these results suggest that retaining a minimum level of ps plays a significant role in controlling transmission even if planned as cannot go ahead. in the worst scenario, if all activities are suspended for two years, mitigations are not expected to help catch up with the baseline before 2030. in addition to examining the expected infection and reporting dynamics, we also estimate the eot probability by calculating the cumulative fraction of simulations reaching eot over time for these health zones. in the case of no interruptions, our models predict a fairly high probability (82% for model w, and 77% for model s) of achieving eot in mosango by 2030. suspension of as alone is not predicted to alter this substantially (see figure 3). this probability is clearly decreased, yet still above 50%, in the worst-case scenario in which all activities are suspended for two years, even without mitigation.
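the quantity just described, the probability of eot by a given year computed as the cumulative fraction of realisations that have reached eot, can be sketched as follows; the per-realisation elimination years are invented for illustration.

```python
# probability of eot by year = cumulative fraction of realisations whose
# elimination year is <= that year; median year from the same list
def eot_probability_by_year(elim_years, horizon):
    """elim_years: elimination year of each realisation (None if eot is not
    reached within the simulated window). returns {year: p(eot by that year)}."""
    n = len(elim_years)
    return {year: sum(1 for y in elim_years if y is not None and y <= year) / n
            for year in horizon}

# toy elimination years from six hypothetical realisations
elim_years = [2027, 2029, 2029, 2031, 2033, None]
probs = eot_probability_by_year(elim_years, range(2026, 2036))
reached = sorted(y for y in elim_years if y is not None)
median_year = reached[len(reached) // 2]  # median over realisations reaching eot
```

here probs[2030] is 0.5 (three of the six realisations eliminated by 2030), and the median elimination year among realisations that reach eot is 2029; the per-year probabilities are non-decreasing by construction, matching the cumulative curves reported above.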
the median elimination year is delayed by almost three years in this scenario. mitigated programmes can facilitate elimination, and may even result in higher probabilities of achieving the eot goal compared to the baseline scenario without interruption, provided some ps was retained. in contrast to mosango, reaching the 2030 eot goal does not seem likely with the current programmes in bagata (only a 29% probability for both models), even without any pause in ghat-related activities. however, eot by 2030 becomes even less likely when severe interruptions are introduced. our predictions for bagata suggest a median elimination year between 2034 and 2035, which can be delayed to 2038-2039 in the worst scenario. to visualise the predicted delays in elimination, we present violin plots representing the probability that eot occurs during a specific year (the probability density function of eot) in our simulations (figure 4), which clearly show how the distributions are shifted toward later years for severe scenarios. whilst the qualitative trends, and even the 2030 eot probabilities, for the two health zones are similar between models, we note that there are quantitative differences. in particular, model s has wider prediction intervals than model w (see si figures 2 and 4), and this is reflected both in the higher means for new transmissions in both health zones, even though the medians are lower, and in the wider violin plot distributions. our analysis of two independent models highlights the possible damage caused by the suspension of ghat interventions due to covid-19. in the most severe scenario we assume both as and ps would be completely stopped, which is predicted to delay eot by an average of 2-3 years if the interruption continued until the end of 2021. however, the covid-19 pandemic continues to evolve and it remains unclear when public health measures will be relaxed (or reinstated).
in the case of longer interruptions (not simulated here), we might expect more serious damage to the gains made to date by the elimination efforts. the predictions made are specific to the two health zones of the drc, mosango and bagata; other health zones could behave differently, especially if current controls are playing a stronger role in reducing transmission. our analysis predicts eot may still be achieved in mosango by 2030 thanks to the recent boost in as coverage; however, in bagata the elimination goal is unlikely without intensifying interventions, even without covid-19-related interruptions. the heterogeneous nature of active screening in ghat-endemic areas (franco, 2020) and the underlying focal nature of disease transmission (franco, 2014) mean that results will be region-specific for the delay in eot, and the necessity of mitigation strategies should be evaluated on this basis. additionally, in mosango and bagata there has been no wide-scale vector control, and we would expect qualitatively different impacts in areas where this has already been implemented. in the ebola outbreak of 2014-2015, the interruption to active screening and passive surveillance is thought to have caused an increase in transmission in the affected areas, except in places where small insecticide-impregnated targets could be maintained, indicating that tsetse control is likely protective during medical interruptions if it remains in situ (kagabadouno, 2018, camara, 2017). moreover, our results suggest that retaining functioning passive surveillance, even partially, can help to avoid significant delays in eot and to prevent substantial increases in mortality.
even with a functional health system, it is unclear how pandemic-induced changes in health-seeking behaviour, in addition to the redirection of limited health resources, will impact levels of passive case finding. mitigation through increased coverage of as following the interruption could also increase the probability of meeting the 2030 eot goal. on the whole, our results suggest a milder impact of covid-19-related delays on ghat incidence and mortality than that suggested by similar studies on malaria (sherrard-smith2020, hogan2020). the use of stochastic events in both presented model formulations enables the direct computation of eot predictions within health zones, by calculating the probability from multiple realisations of the infection process. however, we note that neither model considers the impact of an asymptomatic reservoir of human infection that is able to self-cure, or how an animal reservoir may be able to sustain transmission, as the extent of these factors remains unknown (buscher, 2018, capewell, 2019). furthermore, local eot for health zones will depend to some extent on eot in neighbouring regions, especially if there is substantial movement of people between locations. the modelling work here assumed that no importation of infection occurs, with health zones considered "epidemiological islands". previous modelling work indicates there may be only small rates of importation between villages in endemic regions within kwilu province (davis, plos ntd, village scale), but nearby regions of continued transmission could pose a threat of re-introduction into health zones which successfully achieve eot. the use of two different models has enabled us to account for some structural and parameter uncertainty arising from the way the models were constructed and fitted to observed data. a brief summary of the model differences is given in the si (table 5).
through fitting to data, both models estimated the passive detection rates in 2000 and 2016 (see si tables 1 and 5), with both models achieving comparable values for 2016 (those used for model projections) in both health zones. there was a noticeable difference in the inferred base passive detection values for bagata between the models, with model s estimating a ~40-fold improvement between 2000 and 2016 for both stages, while model w suggested that these improvements were more modest, at around 4-fold for stage 1 and 1.1-fold for stage 2. these structural and parameter differences illustrate why it is unsurprising that the two models provide different projections for the baseline and interruption scenarios; despite this, they do reach consensus that the interruptions considered here would be unlikely to represent a large setback for the programme. furthermore, they provide similar quantitative estimates for the probability that each health zone will meet the eot target by 2030 under the baseline, and suggest that bagata in particular will need an intensified strategy to bring forward eot and achieve the goal. these results provide a rather optimistic perspective on the covid-19-related interruptions to ghat control. we stress that the cumulative impact of simultaneous programme interruptions for multiple diseases (ntds and other infections) afflicting the same vulnerable populations could become an additional global health issue that deserves early attention and sustained control efforts.
https://doi.org/10.1101/2020.10.26.20219485 doi: medrxiv preprint
• wang j, xu c, wong y k, he y, adegnika a a, kremsner p g, agnandj s t, sall a a, liang z, qiu c, liao f l, jiang t, krishna s, tu y. preparedness is essential for malaria-endemic regions during the covid-19 pandemic. the lancet 2020;395(10230):1094.
• world health organization. control and surveillance of human african trypanosomiasis, 2013.
• world health organization. global health observatory data repository, 2020.
• do cryptic reservoirs threaten gambiense-sleeping sickness elimination?
• screening strategies for a sustainable endpoint for gambiense sleeping sickness
• assessing the impact of aggregating disease stage data in model predictions of human african trypanosomiasis transmission and control activities in bandundu province (drc)
• covid-19 and malaria: a symptom screening challenge for malaria endemic countries
• quantifying epidemiological drivers of gambiense human african trypanosomiasis across the democratic republic of congo. medrxiv
• village-scale persistence and elimination of gambiense human african trypanosomiasis
• cost-effectiveness modelling to optimise active screening strategy for gambiense human african trypanosomiasis in the democratic republic of congo. medrxiv
• monitoring the elimination of human african trypanosomiasis at continental and country level: update
• potential impact of the covid-19 pandemic on hiv, tuberculosis, and malaria in low-income and middle-income countries: a modelling study
• integration of diagnosis and treatment of sleeping sickness in primary healthcare facilities in the democratic republic of the congo
• ntd modelling consortium discussion group on gambiense human african trypanosomiasis. insights from quantitative and mathematical modelling on the proposed 2030 goal for gambiense human african trypanosomiasis (ghat)
• journees nationales de vaccination (jnv): activites de vaccination supplementaire
• quantitative evaluation of the strategy to eliminate human african trypanosomiasis in the democratic republic of congo
• data-driven models to predict the elimination of sleeping sickness in former equateur province of drc
• the potential public health consequences of covid-19 on malaria in africa
• the atlas of human african trypanosomiasis: a contribution to global mapping of neglected tropical diseases
• monitoring the progress towards the elimination of gambiense human african trypanosomiasis
• implications of heterogeneous biting exposure and animal hosts on trypanosoma brucei gambiense transmission and control
the authors thank pnltha for original data collection, who for data access (in the framework of the who hat atlas
key: cord-133917-uap1vvbm authors: grave, malú; coutinho, alvaro l. g. a.
title: adaptive mesh refinement and coarsening for diffusion-reaction epidemiological models date: 2020-10-22 journal: nan doi: nan sha: doc_id: 133917 cord_uid: uap1vvbm the outbreak of covid-19 in 2020 has led to a surge of interest in the mathematical modeling of infectious diseases. disease transmission may be modeled with compartmental models, in which the population under study is divided into compartments, with assumptions about the nature and time rate of transfer from one compartment to another. usually, they are composed of a system of ordinary differential equations (odes) in time. a class of such models considers the susceptible, exposed, infected, recovered, and deceased populations: the seird model. however, these models do not always account for the movement of individuals from one region to another. in this work, we extend the formulation of seird compartmental models to diffusion-reaction systems of partial differential equations to capture the continuous spatio-temporal dynamics of covid-19. since the virus spread is not only through diffusion, we introduce a source term to the equation system, representing exposed people who return from travel. we also add the possibility of anisotropic non-homogeneous diffusion. we implement the whole model in libmesh, an open finite element library that provides a framework for multiphysics, considering adaptive mesh refinement and coarsening. therefore, the model can represent several spatial scales, adapting the resolution to the disease dynamics. we verify our model against standard seird models and show several examples highlighting the present model's new capabilities. disease transmission may be modeled with compartmental models, in which the population under study is divided into compartments, with assumptions about the nature and time rate of transfer from one compartment to another [14].
these models have been used extensively in biological, ecological, and chemical applications [15, 16, 17]. they allow for an understanding of the processes at work and for predicting the dynamics of the epidemic. the large majority of compartmental models are composed of a system of ordinary differential equations (odes) in time. though compartmentalized models are simple to formulate, analyze, and solve numerically, they do not always account for the movement of individuals from one region to another. different approaches have been used to introduce spatial variation into such ode models [18, 19, 20, 11]. the strategies consist of defining regional compartments corresponding to different geographic units and adding coupling terms to the model equations to account for species' movement from unit to unit. in this work, we use a partial differential equation (pde) model to capture the continuous spatio-temporal dynamics of covid-19. pde models incorporate spatial information more naturally and allow capturing the dynamics across several scales of interest. they have a significant advantage over ode models, whose ability to describe spatial information is limited by the number of geographic compartments. indeed, recent research indicates that covid-19 spreading presents multi-scale features that go from the virus and individual immune system scale to the collective behavior of a whole population [21]. we study a compartmental seird model (susceptible, exposed, infected, recovered, deceased) that incorporates spatial spread through diffusion terms [16, 22, 8, 9, 23]. adaptive mesh refinement and coarsening [24] can resolve population dynamics from local (street, city) to regional (district, state) scales, providing an accurate spatio-temporal description of the infection spreading. moreover, diffusion may be properly tuned to account for local natural or social inhomogeneities (e.g., mountains, lakes, highways) that shape populations' movements.
however, the main limitation of the diffusion-reaction pde approach is the definition of the diffusion operator and transmission coefficients, which depend on the population's behavior. another issue is that the virus spread is not only through diffusion, since people, who may be infected, travel long distances in a short period. some models relate mobile geolocation data with the spread of the disease [25, 26]. these issues make the model a highly complex system, which may completely change as the population's behavior changes. therefore, this work contributes to improving the knowledge of compartmental diffusion-reaction pde models. all implementations are done using the libmesh library. like other freely available open-source libraries (deal.ii [27], fenics [28], grins [29], moose [30], etc.), libmesh provides a finite element framework that can be used for the numerical simulation of partial differential equations in various applied fields in science and engineering. it has already been used in more than 1,000 publications, with applications in many different areas. see, for example, recent applications in sediment transport [31] and bubble dynamics [32]. this library is an excellent tool for programming the finite element method and can be used for one-, two-, and three-dimensional steady and transient simulations on serial and parallel platforms. the libmesh library provides native support for adaptive mesh refinement and coarsening, thus providing a natural environment for the present study. the main advantage of this library is the possibility of focusing on implementing the specific features of the model without worrying about adaptivity and code parallelization. consequently, the effort to build a high performance computing code tends to be minimized.
first, we present a generic spatio-temporal seird model, based on the epidemic software [33], used to verify our implementation. we then present a model that better represents the dynamics of covid-19 infection spread, based on [9, 8]. in section 3, we introduce the galerkin finite element formulation, the time discretization, and the libmesh implementation. then, we present the numerical verification of the generic spatio-temporal seird model implementation. we verify our algorithm's capacity to represent a compartmental model [33] and show how the diffusion influences the dynamics. section 5 presents the numerical results of the spatio-temporal model of covid-19 infection spread. we perform simulations similar to the ones presented in [8] and show tests to highlight the new modeling capabilities introduced in this work. finally, the paper ends with a summary of our main findings and the perspectives for the next steps of this research. the presentation of the governing equations follows the continuum mechanics framework in [8] instead of the more traditional approach found in mathematical and biological references. consider a system which may be decomposed into n distinct populations: u 1 (x, t), u 2 (x, t), ..., u n (x, t). let ω ⊂ r 2 be a simply connected domain of interest with boundary ∂ω = γ d ∪ γ n , and [0, t] a generic time interval. the vector compact representation of the governing equations as a transient nonlinear diffusion-reaction system of equations reads: we denote the densities of the susceptible, exposed, infected, recovered and deceased populations as s(x, t), e(x, t), i(x, t), r(x, t), and d(x, t). also, let c(x, t) denote the cumulative number of infected and n(x, t) the living population; i.e., n(x, t) = s(x, t) + e(x, t) + i(x, t) + r(x, t). we consider u = [s, e, i, r, d] t . the matrices a, b and ν, and the vector f depend on the particular form of the system dynamics.
furthermore, in general, ν = ν(x), that is, diffusion is heterogeneous and anisotropic. besides the boundary conditions (2), (3), we specify the initial condition for i = 1, 2, · · · , n. we first consider a seird model [14] given by the following system of coupled pdes over ω × [0, t]: where β is the transmission rate (days −1), α the latent rate (days −1), γ the recovery rate (days −1), δ the death rate (days −1), and ν s , ν e , ν i , ν r the diffusion parameters respectively corresponding to the different population groups (km 2 persons −1 days −1). we append homogeneous neumann boundary conditions to the system of equations, that is, (ν · ∇u) · n = 0. we can reframe this model in the general form given by equation (1); thus, the matrices a, b, ν and the vector f read: this model is based on the epidemic software 1 , and it is employed to verify our implementation. the system of equations represents that the susceptible population decreases as the exposed population increases. this variation depends on the transmission rate between infected and susceptible. the number of exposed increases because of the transmission rate and decreases when the exposed individuals become infected (after the incubation period). the number of infected increases after the incubation period and decreases depending on the recovery and death rates. the number of deaths depends only on the death rate, just as the number of recovered depends only on the recovery rate. finally, the cumulative number of infected depends only on the exposed and the incubation period. the diffusion parameters are included in the model to spread the disease spatially.
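the compartment transfer logic just described (susceptible → exposed → infected → recovered/deceased, with transmission scaled by the living population) can be sketched as a small set of odes. the sketch below uses plain forward euler with an illustrative step size; it is not the paper's finite element implementation, only the reaction part of the model without diffusion:

```python
# minimal sketch of the reaction (non-diffusive) part of the generic seird
# model described above. parameter names follow the text: beta (transmission),
# alpha (latent rate), gamma (recovery), delta (death). the time stepper and
# step size are illustrative choices, not the paper's discretization.

def seird_rhs(u, beta, alpha, gamma, delta):
    s, e, i, r, d = u
    n = s + e + i + r                    # living population
    new_exposed = beta * s * i / n       # transmission scaled by living population
    return (-new_exposed,
            new_exposed - alpha * e,
            alpha * e - (gamma + delta) * i,
            gamma * i,
            delta * i)

def integrate(u0, params, dt=0.1, days=365):
    u = list(u0)
    for _ in range(int(days / dt)):
        du = seird_rhs(u, *params)
        u = [x + dt * dx for x, dx in zip(u, du)]
    return u

if __name__ == "__main__":
    # 1000 people/km^2 with 1 initially infected, as in the verification test
    u0 = (999.0, 0.0, 1.0, 0.0, 0.0)
    params = (0.25, 0.14286, 0.1, 0.06666)   # beta, alpha, gamma, delta
    print(integrate(u0, params))
```

note that the right-hand sides sum to zero, so the total population s + e + i + r + d is conserved exactly, a useful sanity check that also holds for the pde model under homogeneous neumann boundary conditions.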
summarizing, this model assumes:
• movement is proportional to population size; i.e., more movement occurs within heavily populated regions;
• no movement occurs among the deceased population;
• there is a latency period between exposure and the development of symptoms;
• the probability of contagion is inversely proportional to the population size;
• all exposed persons eventually develop symptoms;
• only infected persons are capable of spreading the disease;
• the non-virus mortality rate is not considered in this model;
• new births are not considered in this model.
note that the epidemic model's dynamics do not represent the actual covid-19 dynamics since, in the case of covid-19, the exposed population may be asymptomatic, recover without becoming infected, and still spread the virus. thus, a better model is the one based on [9, 8]. we begin by making several model assumptions to represent the covid-19 infection spread adequately [8]:
• only mortality due to covid-19 is considered;
• new births are not considered in this model;
• some portion of exposed persons never develop symptoms and move directly from the exposed compartment to the recovered compartment (asymptomatic cases);
• both asymptomatic (exposed) and symptomatic (infected) patients are capable of spreading the disease;
• there is a latency period between exposure and the development of symptoms;
• it is possible that new cases of exposed people appear randomly in the system (exposed people who return from travel);
• the probability of contagion increases with population size (allee effect [9]);
• movement is proportional to population size; i.e., more movement occurs within heavily populated regions;
• no movement occurs among the deceased population.
1 https://americocunhajr.github.io/epidemic/ [33]
then, the system of equations becomes: where a characterizes the allee effect (persons), which takes into account the tendency of outbreaks to cluster around large populations; β i is the transmission rate between symptomatic and susceptible (persons −1 days −1); β e is the transmission rate between asymptomatic and susceptible (persons −1 days −1); f is a source function that depends on space and time (persons); α is the latent rate (days −1); γ e is the recovery rate of the asymptomatic (days −1); γ i is the recovery rate of the symptomatic (days −1); δ is the death rate (days −1); and ν s , ν e , ν i , ν r are the diffusion parameters corresponding to the different population groups (km 2 persons −1 days −1). now, we call exposed those who have had contact with the virus but remain asymptomatic. however, since the virus is highly transmissible, the exposed population may also transmit the virus. the exposed may recover without any symptoms or may become infected. the infected follow the same logic as in the previous seird system (they may recover or die). the main difference in the new seird system is in the exposed population and with whom it interacts.
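a minimal sketch of the reaction terms of this covid-19 variant follows. it reflects our reading of [9, 8] and should be treated as an assumption: the allee effect is taken as a multiplicative factor (1 − a/n) on both transmission terms, f is set to zero, and the value a = 100 persons is purely illustrative:

```python
# sketch of the reaction part of the covid-19 seird variant described above,
# with asymptomatic (exposed) transmission and an allee factor. the term
# structure is our reading of [9, 8], not code from the paper; beta_i and
# beta_e are not divided by the living population, consistent with the text.

def covid_rhs(u, beta_i, beta_e, alpha, gamma_i, gamma_e, delta, a, f=0.0):
    s, e, i, r, d = u
    n = s + e + i + r
    allee = 1.0 - a / n                  # contagion increases with population size
    new_exposed = allee * (beta_i * s * i + beta_e * s * e)
    return (-new_exposed,
            new_exposed - (alpha + gamma_e) * e + f,   # exposed recover or progress
            alpha * e - (gamma_i + delta) * i,
            gamma_e * e + gamma_i * i,
            delta * i)

if __name__ == "__main__":
    u0 = (994.0, 5.0, 1.0, 0.0, 0.0)     # initial conditions of the test in [8]
    print(covid_rhs(u0, 0.005, 0.005, 0.125, 0.041666667, 0.1666667, 0.0625,
                    a=100.0))
```

with f = 0 the right-hand sides again sum to zero, so total population is conserved; a nonzero source f injects exposed individuals, which is why the full model shows the small population gains reported later.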
the source function f may be defined to represent exposed people who return from travel. note that β has units (days −1) while β i and β e have units (persons −1 days −1). while equations (5) and (6) divide β by the living population, equations (15), (16) and (17) keep β i and β e constant, independent of it. therefore, to express this model in the general form given by equation (1), the matrices a, b, ν and the vector f read: if we assume that the region of interest is isolated, we prescribe homogeneous neumann boundary conditions, or simply (ν · ∇u) · n = 0. the basic reproduction number, r 0 , is defined as the average number of additional infections produced by an infected individual in a wholly susceptible population over the full course of the disease outbreak. in an epidemic situation, the threshold r 0 = 1 is the dividing line between the infection dying out and the onset of an epidemic. r 0 > 1 implies growth of the epidemic, whereas r 0 < 1 implies decay in infectious spread [14]. the concept of r 0 is well-defined for ode models. however, its extension to a pde model is unclear, owing to the influence of diffusion. viguerie et al. [8] found that an r 0 derived for the ode version of the pde model is not consistently reliable in representing the epidemic's dynamic growth. if we do not consider the diffusion, r 0 may be calculated from the reaction terms alone; for further details about the r 0 calculation, refer to [34, 8]. in this section we briefly introduce the galerkin finite element formulation, the time discretization, and the libmesh implementation, supporting adaptive mesh refinement and coarsening. appendices a and b give, respectively, the resulting finite element matrices for the generic spatio-temporal seird and covid-19 models. we introduce a galerkin finite element variational formulation for space discretization. without loss of generality, we consider the case of homogeneous dirichlet and neumann boundary conditions.
let v u h be a finite-dimensional space in which u h (·, t) is the discrete counterpart of u and w h the weight function. the weak formulation is then: find u h ∈ v u h such that the variational equation holds for all w h . here we define the operation (·, ·) as the standard scalar product in l 2 (ω). the seird and covid-19 models yield stiff systems of equations, making explicit time-marching methods unfeasible. the backward euler method is widely applied because of its unconditional numerical stability characteristics. however, it has the disadvantage of being only first-order accurate, which introduces a significant amount of numerical diffusion. thus, we use the second-order backward differentiation formula (bdf2), which, compared to the prevailing backward euler method, has significantly better accuracy while retaining unconditional linear stability. after time discretization the model becomes a two-step recurrence; the subscript n + 1 is associated with t = t n+1 , and n and n − 1 with the previous time-steps. we implement the compartmental epidemiological models in libmesh, a c++ fem open-source software library for parallel adaptive finite element applications [35]. libmesh also interfaces with external solver packages like petsc [36] and trilinos [37]. recently, libmesh was also coupled with in-situ visualization and data-analysis tools [38, 39]. it provides a finite element framework that can be used for the numerical simulation of partial differential equations on serial and parallel platforms. this library is an excellent tool for programming the finite element method and can be used for one-, two-, and three-dimensional steady and transient simulations. the libmesh library also has native support for adaptive mesh refinement and coarsening (amr/c). multiple scales can be resolved by amr/c. libmesh supports amr/c by h-refinement (element subdivision), p-refinement (increasing the polynomial approximation order), and hp-refinement, that is, a combination of both [24]. in libmesh, coarsening is supported in the h, p, and hp amr/c options.
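the bdf2 time discretization introduced above, u_{n+1} = (4 u_n − u_{n−1})/3 + (2/3) ∆t f(u_{n+1}), can be illustrated on the scalar test problem u' = λu, where the implicit equation is solvable in closed form (the paper's nonlinear systems require newton's method instead); the function below is a sketch, not the libmesh implementation:

```python
# bdf2 sketch for u' = lam * u. for this linear right-hand side the implicit
# bdf2 equation  3*u_new - 4*u + u_prev = 2*dt*lam*u_new  can be solved for
# u_new directly; nonlinear models need an inner newton solve at each step.
import math

def bdf2(lam, u0, dt, t_end):
    n_steps = int(round(t_end / dt))
    u_prev, u = u0, u0 * math.exp(lam * dt)   # exact second value keeps order 2
    for _ in range(n_steps - 1):
        u_prev, u = u, (4.0 * u - u_prev) / (3.0 - 2.0 * dt * lam)
    return u

if __name__ == "__main__":
    # second-order accuracy: halving dt should cut the error by roughly 4
    exact = math.exp(-1.0)
    e1 = abs(bdf2(-1.0, 1.0, 0.1, 1.0) - exact)
    e2 = abs(bdf2(-1.0, 1.0, 0.05, 1.0) - exact)
    print(e1 / e2)
```

the same recurrence remains stable for λ∆t far outside the explicit stability region (e.g. λ = −1000, ∆t = 0.1), which is the property that makes bdf2 attractive for the stiff seird systems.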
in the present work, we restrict ourselves to h-refinement with hanging nodes. the amr/c procedure uses a local error estimator to drive the refinement and coarsening procedure, considering the error of an element relative to its neighbor elements in the mesh. this error may come from any variable of the system. as is standard in libmesh, kelly's error indicator is employed, which uses the h 1 seminorm to estimate the error [40]. apart from the element interior residual, the flux jumps across the inter-element edges influence the element error. the flux jump of each edge is computed and added to the element error contribution. for both the element residual and flux jump, the values of the desired variables at each node are necessary. therefore, the error e 2 can be stated as, where e 2 i is the error of each variable. in this study, we use all population types as variables for the error estimator. after computing the error values, the elements are "flagged" for refinement or coarsening according to their relative error. this is done by a statistical element flagging strategy: it is assumed that the element error e is distributed approximately as a normal probability function. the statistical mean µ s and standard deviation σ s of all errors are calculated, and whether an element is flagged depends on a refinement fraction (r f) and a coarsening fraction (c f). all elements with error e < µ s − σ s c f are flagged for coarsening, and all with e > µ s + σ s r f are flagged for refinement (see figure 1). the refinement level is limited by a maximum h-level (h max) (see figure 2), and the coarsening is done by h-restitution of sub-elements [24, 41]. to verify the implementation of the generic spatio-temporal seird model, we have performed several tests. for these, we consider a square domain of 1 km × 1 km centered at (0, 0) for all tests in this section. in the first test, we do not consider diffusion.
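the statistical flagging strategy described above can be sketched directly; the thresholds µ_s − σ_s c_f and µ_s + σ_s r_f and the fractions r_f = 0.95, c_f = 0.05 follow the text, while the function name and return convention are ours (libmesh's own implementation also enforces the maximum h-level):

```python
# sketch of the statistical element flagging strategy: elements with error
# below mu - sigma*c_f are flagged for coarsening, above mu + sigma*r_f for
# refinement, and kept otherwise. the population standard deviation is used.
import statistics

def flag_elements(errors, r_f=0.95, c_f=0.05):
    mu = statistics.mean(errors)
    sigma = statistics.pstdev(errors)
    flags = []
    for e in errors:
        if e < mu - sigma * c_f:
            flags.append("coarsen")
        elif e > mu + sigma * r_f:
            flags.append("refine")
        else:
            flags.append("keep")
    return flags

if __name__ == "__main__":
    print(flag_elements([3.0, 3.0, 3.0, 5.0, 10.0]))
```

because the coarsening fraction is small, elements only slightly below the mean are already coarsened, while only clear outliers above the mean are refined, which keeps the mesh concentrated on the infection front.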
we consider a population of 1000 people/km 2 with 1 person/km 2 initially infected in the whole domain. then, the initial conditions are: s 0 = 999, e 0 = 0, i 0 = 1, r 0 = 0 and d 0 = 0. this test aims to reproduce a compartmental simulation of the epidemic software by using the same initial parameters. the results have to be the same at each point of the domain and the same as those of the epidemic software. we set α = 0.14286 days −1 , β = 0.25 days −1 , δ = 0.06666 days −1 , γ = 0.1 days −1 and ∆t = 1 day. the mesh has 50 × 50 bilinear quadrilateral elements. figure 3 shows the comparison of the results, where we can see very good agreement between the two solutions. figure 4 shows the results over a centralized horizontal line crossing the domain at t = 365 days. it is possible to see that the results are the same over the whole domain, as expected. now, we consider the same parameters as in the previous example, but different initial conditions. we consider a population of 1000 people/km 2 in the whole domain, with 1 person/km 2 initially infected only in a circle centered at (0, 0) with radius r = 0.5 km. we assume that ν s = ν e = ν i = ν r = 10 −8 km 2 persons −1 days −1 . then, the initial conditions are: s 0 = 999, e 0 = 0, i 0 = 1 for r ≤ 0.5 and i 0 = 0 for r > 0.5, with r = √(x 2 + y 2 ), r 0 = 0 and d 0 = 0 (see figure 5). we consider adaptive mesh refinement in this example. the original mesh has 50 × 50 bilinear quadrilateral elements, and after the refinement, the smallest element has size 0.005 km. we initially refine the domain in two levels. for the amr/c procedure, we set h max = 2, r f = 0.95, c f = 0.05. we apply the adaptive mesh refinement every 5 time-steps. figure 6 shows the results over a centralized horizontal line crossing the domain at t = 365 days. figure 7 shows the infected people at different time-steps. note that the infected remain active in other parts of the domain because of the diffusion.
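this diffusive spread can be reproduced with a minimal 1d finite-difference sketch: an infection seeded in the centre travels outward when the diffusion parameter ν is positive and stays put when ν = 0. for brevity only the infected compartment diffuses here, and the grid, time step, and ν values are toy choices rather than the paper's setup:

```python
# illustrative 1d explicit finite-difference sketch of diffusive seird spread.
# homogeneous neumann conditions are mimicked by leaving the end cells without
# a diffusive update. values are toy choices; this is not the paper's solver.

def spread(nu, steps=2000, nx=101, dx=0.02, dt=0.05,
           beta=0.25, alpha=0.14286, gamma=0.1, delta=0.06666):
    s = [999.0] * nx
    e = [0.0] * nx
    i = [0.0] * nx
    i[nx // 2] = 1.0                     # seed one infected cell in the centre
    for _ in range(steps):
        lap = [0.0] * nx
        for j in range(1, nx - 1):       # second difference of the infected field
            lap[j] = (i[j - 1] - 2.0 * i[j] + i[j + 1]) / dx ** 2
        for j in range(nx):
            n = s[j] + e[j] + i[j] + 1e-12
            new_e = beta * s[j] * i[j] / n
            s[j] += dt * (-new_e)
            e[j] += dt * (new_e - alpha * e[j])
            i[j] += dt * (alpha * e[j] - (gamma + delta) * i[j] + nu * lap[j])
    return s, i

if __name__ == "__main__":
    s_diff, _ = spread(1e-3)
    s_none, _ = spread(0.0)
    j = 101 // 2 - 10                    # a cell 0.2 km from the seed
    print(s_none[j], s_diff[j])          # infection reaches j only with diffusion
```

with ν = 0 the seeded cell burns through its local epidemic while every other cell stays untouched; with ν > 0 a travelling infection wave depletes susceptibles away from the seed, which is the behaviour seen along the centralized line plots.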
it is possible to see the wave effect of the disease spreading. note that the amr/c procedure improves spatial resolution in the regions where the number of infected people is higher. in this test, we change the initial population. instead of a constant value over the whole domain, we set 1000 people/km 2 in the left/top quadrant, 500 people/km 2 in the right/top quadrant, 250 people/km 2 in the left/bottom quadrant and 750 people/km 2 in the right/bottom quadrant (figure 8). then, the initial conditions are: s 0 = 999 for x ≤ 0 and y > 0, s 0 = 499 for x > 0 and y > 0, s 0 = 249 for x ≤ 0 and y ≤ 0, s 0 = 749 for x > 0 and y ≤ 0, e 0 = 0, i 0 = 1 for r ≤ 0.5 and i 0 = 0 for r > 0.5, with r = √(x 2 + y 2 ), r 0 = 0 and d 0 = 0. the initially infected population is 1 person/km 2 in the same circular region as in the previous test. all other parameters are the same as in the previous simulation. figure 9 shows the infected people at different time-steps. it is possible to see that the regions with denser populations (more people/km 2 ) are more affected by the disease. figure 10 shows the total number of deaths after 365 days; the regions with more people/km 2 have more deaths than the less dense regions. note also that the amr/c procedure generates meshes following the model dynamics. in this section, we perform some simulations to validate the spatio-temporal model of covid-19 infection spread. in this test, we do not consider diffusion. we consider a square domain of 1 km × 1 km centered at (0, 0) with a population of 1000 people/km 2 , with 1 person/km 2 initially infected and 5 people/km 2 exposed in the whole domain. then, the initial conditions are: s 0 = 994, e 0 = 5, i 0 = 1, r 0 = 0 and d 0 = 0. the aim of this test is to reproduce a compartmental simulation presented in [8] by using the same initial parameters. the results have to be the same at each point of the domain and also the same as the ones given in [8].
we set α = 0.125 days −1 , β i = β e = 0.005 persons −1 days −1 , δ = 0.0625 days −1 , γ i = 0.041666667 days −1 and γ e = 0.1666667 days −1 . the mesh has 50 × 50 bilinear quadrilateral elements. figure 11 shows the comparison of the results, where we can see excellent agreement. to reproduce a 1d simulation with quadrilateral elements, we fix the element width to 0.0005 and vary its length to find the proper refinement for this case. therefore, we run a mesh convergence study as well as a time-step convergence study. for the initial conditions, we set s = s 0 and e = e 0 as given by equation (36). figure 12 shows the initial conditions. we further set i 0 = 0, r 0 = 0, and d 0 = 0. qualitatively, the initial conditions represent a large population centered around x = 0.35 with no exposed persons and a small population centered around x = 0.75 with some exposed individuals. we also enforce homogeneous neumann boundary conditions at x = 0 and a zero-population dirichlet boundary condition at x = 1 for all model compartments. the latter represents a non-populated area at x = 1. we compare numerical solutions computed on successively refined uniform grids with mesh sizes ∆x = 1/50, 1/100, 1/250, 1/500, and 1/1000. the time step is ∆t = 0.25 days. figure 15 shows the difference in the total population of each compartment of individuals for the different meshes. good resolution is found for ∆x = 1/500. this convergence is easy to see in figure 16, where the number of individuals in each compartment is plotted at t = 90 days. we examine the impact of the time-step size ∆t on the numerical approximation of the model solution. we consider the time-step sizes ∆t = 1, ∆t = 0.5, ∆t = 0.25, ∆t = 0.125 and ∆t = 0.0625 days. as the results in section 5.2.1 suggested that ∆x = 1/500 is a sufficiently fine spatial discretization, we utilize this mesh resolution here. figure 17 shows the difference in the total population of each compartment of individuals for the different time-steps.
good accuracy is found for ∆t = 0.25 days. the improvement in accuracy is easy to see in figure 18, where the number of individuals in each compartment is plotted at t = 90 days. this test is the application of the previous configuration rotated into a two-dimensional square with corners at (−1,−1), (1,−1), (1,1) and (−1,1). the initial population is given by the same profile, with r = √(x 2 + y 2 ). the original mesh has 50 × 50 bilinear quadrilateral elements and is refined in two levels at the beginning of the simulation. for the amr/c procedure, we set h max = 2, r f = 0.95, c f = 0.05. we apply the adaptive mesh refinement every 4 time-steps. the behavior of the transmission has to be similar to the 1d model results, but in a radial configuration. figure 19 shows the populations at different time steps. figure 6 shows the results over a centralized horizontal line (or vertical, because of the axisymmetry) crossing the domain at t = 200 days. if we compare figure 6 with figure 14, it is possible to see that the populations follow a similar behavior. in figure 21 we plot the time history of the total number of individuals. there is a small gain in the total number of individuals (less than 0.1%). this test considers anisotropic diffusion in the previous configuration (only in the x direction). therefore, the populations move spatially only in the x direction. figure 22 shows the populations at different time-steps. figure 23 shows the results over a centralized horizontal line crossing the domain, and figure 24 over a centralized vertical line. by comparing these two figures, it is clear how the diffusion direction influences the behavior of the virus spread. since there is no movement of infected or exposed people in the y direction, part of the population never has contact with the virus because there is no way for the virus to reach them. in figure 25 we plot the time history of the total number of individuals.
we can see a gain in the total number of individuals of less than 0.1%. the original mesh has 50 × 50 bilinear quadrilateral elements and is refined in two levels at the beginning of the simulation. for the amr/c procedure, we set h max = 2, r f = 0.95, c f = 0.05. we apply the adaptive mesh refinement every 4 time-steps. the initial population is: figure 26 shows the initial susceptible population. note that there are no infected or exposed people at the initial time. we implement a random source of the exposed population that depends on the number of susceptible people. in all time-steps, random nodes of the domain receive a certain number of exposed people. this simulates people who travel and suddenly appear in a region carrying the virus. the random source does not add individuals to the population, but moves individuals from the susceptible to the exposed compartment. of course, this model is simple; nevertheless, it demonstrates how to handle a random source term in the equations. figure 27 shows an example of the random number of exposed people appearing in one time-step. in figure 31, we plot the time history of the total number of individuals. there is a negligible increase in the total number of individuals (less than 0.1%). we developed an extended continuum seird model to represent the dynamics of the covid-19 virus spread based on the framework proposed in [9]. we validate our code by comparing our results with other simulations. we introduce new test cases to highlight new modeling capabilities. among the new features added to the base model is a source term, which represents exposed people who return from travel, by moving individuals from the susceptible compartment to the exposed compartment. we also add the possibility of anisotropic non-homogeneous diffusion. our code is implemented through the libmesh library and supports adaptive mesh refinement and coarsening.
therefore, it can represent several spatial scales, adapting the resolution to the disease dynamics. data are essential to define the epidemic spreading parameters, such as the diffusion and infection rates. we still have to study the best way to represent people who return from travel, addressing questions such as the probability of a random source appearing in the system, in which area, and with which population density, among others. diffusion-reaction models, such as the present one, are richer than standard compartmental models. however, they are slower, which hampers their widespread use in what-if scenarios, parametric studies, and time-critical situations. therefore, the development of low-dimensional computational models will leverage the ability of continuous models to perform in real-time scenarios. projection-based or data-driven model order reduction [42, 43], which aims to lower the computational complexity of a given computational model by reducing its dimensionality (or order), can provide this leverage. these methods can work in conjunction with emerging machine learning methods such as physics-informed neural networks [44]. we foresee a tremendous impact of all these new methods and techniques on the mathematical epidemiology field, enlarging the predictive capabilities and computational efficiency of diffusion-reaction epidemiological models. in libmesh, we calculate directly the new solution (u n+1 ) instead of the variation (δu). then, on the left-hand side we gather the terms containing an unknown, whereas all the other terms are taken to the right-hand side. the superscript k refers to the previous newton iteration. the terms in black are from the mass matrix, those in blue are the nonlinear terms, those in red the diffusive terms, and those in green the remaining terms from the stiffness matrix. the finite element shape functions are represented by n a , a = 1, · · · , n nnos , where n nnos is the number of nodes in the finite element mesh.
susceptible (equation 5):

we present the matrix contributions of the system of equations that represents the covid-19 dynamics [9, 8]. we use the bdf2 time discretization method, newton's method for the nonlinear terms, and we simplify the number of the living population by considering the previous linear solution. for all test cases the nonlinear tolerance for newton's method is set to 10^-8 and the linear solver tolerance is set to 10^-10. the linear solver is gmres with an ilu(0) preconditioner. in libmesh, we calculate directly the new solution (u^{n+1}) instead of the variation (δu). then, on the left-hand side, we gather the terms containing an unknown, whereas all the other terms are taken to the right-hand side. the superscript k refers to the previous newton iteration. the terms in black are from the mass matrix, in blue the nonlinear terms, in red the diffusive terms, in green the remaining terms from the stiffness matrix, and in yellow the source terms. susceptible (equation 15):

∫_{Ω_e} ∇N_a (n^k ν_s) ∇N_b dΩ    (61)

references:
- epidemiologia matemática: estudos dos efeitos da vacinação em doenças de transmissão direta
- a dengue model with a dynamic aedes albopictus vector population
- calibration of a seir-sei epidemic model to describe the zika virus outbreak in brazil
- global analysis of an hiv/aids epidemic model
- modeling the sars epidemic
- remote sensing-based time series models for malaria early warning in the highlands of ethiopia
- statistical inference in a stochastic epidemic seir model with control intervention: ebola as a case study
- diffusion-reaction compartmental models formulated in a continuum mechanics framework: application to covid-19, mathematical analysis, and numerical study
- simulating the spread of covid-19 via spatially-resolved susceptible-exposed-infected-recovered-deceased (seird) model with heterogeneous diffusion
- a simple model for covid-19
- modelling the covid-19 epidemic and implementation of population-wide interventions in italy
- a simulation of a covid-19 epidemic based on a deterministic seir model
- spreading of covid-19 in brazil: impacts and uncertainties in social distancing strategies. medrxiv
- mathematical models in epidemiology
- mathematical models in population biology and epidemiology
- numerical simulation of a susceptible-exposed-infectious space-continuous model for the spread of rabies in raccoons across a realistic landscape
- compartmental modelling in chemical engineering: a critical review
- partial differential equations in ecology: spatial interactions and population dynamics
- modeling infectious diseases in humans and animals
- spread and dynamics of the covid-19 epidemic in italy: effects of emergency containment measures
- a multi-scale model of virus pandemic: heterogeneous interactive entities in a globally connected world
- galerkin methods for a model of population dynamics with nonlinear diffusion
- spatial ecology via reaction-diffusion equations
- computational grids, generation, adaptation and solution strategies
- modeling future spread of infections via mobile geolocation data and population dynamics. an application to covid-19 in brazil
- the effect of human mobility and control measures on the covid-19 epidemic in china
- deal.ii: a general-purpose object-oriented finite element library
- the fenics project version 1.5. archive of numerical software
- grins: a multiphysics framework based on the libmesh finite element library
- moose: a parallel computational framework for coupled systems of nonlinear equations
- residual-based variational multiscale 2d simulation of sediment transport with morphological changes
- a new convected level-set method for gas bubble dynamics
- epidemic: epidemiology educational code
- on the definition and the computation of the basic reproduction ratio r0 in models for infectious diseases in heterogeneous populations
- libmesh: a c++ library for parallel adaptive mesh refinement/coarsening simulations
- in situ visualization and data analysis for turbidity currents simulation
- dfanalyzer: runtime dataflow analysis tool for computational science and engineering applications
- a posteriori error estimation in finite element analysis
- a posteriori error analysis and adaptive processes in the finite element method: part i - error analysis
- reduced order methods for modeling and computational reduction
- data-driven science and engineering: machine learning, dynamical systems, and control
- physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations

this research was financed in part by the coordenação de aperfeiçoamento de pessoal de nível superior - brasil (capes) - finance code 001 and capes tecnodigital project 223038.014313/2020-19. this research has also received funding from cnpq and faperj. we are indebted to prof. americo cunha jr., prof. regina almeida and prof. sandra malta for fruitful discussions and invaluable help in the understanding of epidemiological models.

a. implementation of the generic spatio-temporal seird model

we implement the generic seird model similar to the epidemic software. we have used the bdf2 time discretization method, newton's method for the nonlinear terms, and we simplify the number of the living population by considering the previous time-step solution.
for all test cases the nonlinear tolerance for newton's method is set to 10^-8 and the linear solver tolerance is set to 10^-10. the linear solver is gmres with an ilu(0) preconditioner.

key: cord-267890-j64x6f5r
authors: armstrong, e.; gerardin, j.; runge, m.
title: identifying the measurements required to estimate rates of covid-19 transmission, infection, and detection, using variational data assimilation
date: 2020-05-29
journal: nan
doi: 10.1101/2020.05.27.20112987
sha:
doc_id: 267890
cord_uid: j64x6f5r

we demonstrate the ability of statistical data assimilation to identify the measurements required for accurate state and parameter estimation in an epidemiological model for the novel coronavirus disease covid-19. our context is an effort to inform policy regarding social behavior, to mitigate strain on hospital capacity. the model unknowns are taken to be: the time-varying transmission rate, the fraction of exposed cases that require hospitalization, and the time-varying detection probabilities of new asymptomatic and symptomatic cases. in simulations, we obtain accurate estimates of undetected (that is, unmeasured) infectious populations by measuring the detected cases together with the recovered and dead, and without assumed knowledge of the detection rates. these state estimates require a measurement of the recovered population, and are tolerant to low errors in that measurement. further, excellent estimates of all quantities are obtained using a temporal baseline of 112 days, with the exception of the time-varying transmission rate at times prior to the implementation of social distancing. the estimation of this transmission rate is sensitive to contamination in the data, highlighting the need for accurate and uniform methods of reporting. finally, we employ the procedure using real data from italy reported by johns hopkins. the aim of this paper is not to assign extreme significance to the results of these specific experiments per se.
rather, we intend to exemplify the power of sda to determine what properties of measurements will yield estimates of unknown model parameters to a desired precision, all set within the complex context of the covid-19 pandemic. the coronavirus disease 2019 (covid-19) is burdening health care systems worldwide, threatening physical and psychological health, and desecrating the global economy. it is our nation's top priority to assuage this harm. in particular, it is vital to maintain appropriate social behavior to avoid straining hospital capacity, and meanwhile minimize the psychological stress that social restrictions place upon citizens. given the uncertainties, however, in the intrinsic properties of the disease and the inaccuracies in the recording of its effects on the population [1-3], an optimal protocol for social behavior is far from clear. it is invaluable to identify any and all methodologies that may be brought to bear upon this situation. within this context, we seek a means to quantify what data must be recorded in order to estimate specific unknown quantities in an epidemiological model tailored to the covid-19 pandemic. these unknown quantities are: i) the transmission rate, ii) the fraction of the exposed population that acquires symptoms sufficiently severe to require hospitalization, and iii) the time-varying detection probabilities of asymptomatic and symptomatic cases. in this paper, we demonstrate the ability of statistical data assimilation (sda) to quantify the accuracy with which these parameters can be estimated, given certain properties of the data, including noise level. sda is an inverse formulation [4]: a machine learning approach designed to optimally combine a model with data.
invented for numerical weather prediction [5-10], and more recently applied to biological neuron models [11-17], sda offers a systematic means to identify the measurements required to estimate unknown model parameters to a desired precision. data assimilation has been presented as a means for general epidemiological forecasting [18], and one work has examined variational data assimilation specifically - the method we employ in this paper - for estimating parameters in epidemiological models [19]. related bayesian frameworks for estimating unknown properties of epidemiological models have also been explored [20, 21]. to date, there have been two applications of sda to covid-19 specifically. ref [22] used a simple sir (susceptible/infected/recovered) model, and ref [23] expanded the sir model to include a compartment of patients in treatment. two features of our work distinguish this paper as novel. first, we significantly expand the model in terms of the number of compartments. the aim here is to capture key features of covid-19 so as to eventually inform state policy on containing the pandemic. these features are: i) asymptomatic versus symptomatic populations, ii) undetected versus detected cases, and iii) two hospitalized populations: those who do and do not require critical care. for our motivations for these choices, see model. second, we employ sda for the specific purpose of examining the sensitivity of estimates of time-varying parameters to various properties of the measurements, including the degree of noise (or error) added. moreover, we aim to demonstrate the power and versatility of the sda technique to explore what is required of measurements to complete a model with a dimension sufficiently high to capture the complexities of covid-19 - an examination that has not previously been done.
to this end, we sought to estimate the parameters noted above, using simulated measurements representing a metropolitan-area population loosely based on new york city. we examined the sensitivity of estimations to: i) the subpopulations that were sampled, ii) the temporal baseline of sampling, and iii) uncertainty in the sampling. our findings using simulated data are threefold. first, reasonable estimations of time-varying detection probabilities require the reporting of new detected cases (asymptomatic and symptomatic), dead, and recovered, and low (∼ ten percent) noise is well tolerated. second, the information contained in the measured detected populations propagates successfully to the estimation of the numbers of undetected cases entering hospitals. third, a temporal baseline of 112 days (the recent past) is sufficiently long for the sda procedure to capture the general trends in the evolution of the model populations, the detection probabilities, and the time-varying transmission rate following the implementation of social distancing. the transmission rate at early times is highly sensitive to contamination in the measurements and to the temporal baseline of sampling. finally, we test the procedure on real data, using the reports from italy that are listed in the open-access github repository of the johns hopkins project systems science and engineering [24] . the key sections to note in this paper are: the experiments -which describes the studies done and our motivations for them, results: general findings, and conclusion. finally, we comment on the model parameters and initial conditions that are taken in this paper to be known quantities. as noted, our aim is to exemplify sda as a tool to examine how certain properties of data impact the estimation of unknown model parameters. 
as any tool of possible utility must be brought to bear upon this situation in a timely manner, we made some assumptions that do not reflect all that is known to date about the covid-19 pandemic. it is important to test the procedure for robustness over various parameter ranges, "correct" or otherwise. meanwhile, we are testing the procedure's stability over a wide range of choices for parameter values and initial conditions, including those that reflect most accurately the current state of our knowledge of the pandemic. moreover, it is our hope that the exercises described in this paper can be taken and applied to a host of complicated questions surrounding covid-19. the model is written in 22 state variables, each representing a subpopulation of people; the total population is conserved. figure 1 shows a schematic. each member of a population s that enters an exposed population (e) ultimately reaches either recovered (r) or dead (d). absent additive noise, the model is deterministic. the red ovals indicate the variables that will correspond to measured quantities in the inference experiments of this paper. as noted, the model is written with the aim of informing policy on social behavior so as to avoid overwhelming hospital capacity. to this end, the model resolves asymptomatic-versus-symptomatic cases, undetected-versus-detected cases, and the two tiers of hospitalization: the general h versus the critical care c populations. the resolution of asymptomatic versus symptomatic cases was motivated by an interest in what requirements exist to control the epidemic. for example, is it sufficient to focus only on symptomatic individuals, or must we also target and address asymptomatic individuals who are not personally suffering from disease? (many details of this model, including compartmentalization by age and geographic region, are omitted from exploration in this paper.) figure 1: schematic of the model. each rectangle represents a population.
note the distinction of asymptomatic cases, undetected cases, and the two tiers of hospitalized care: h and c. the aim of including this degree of resolution is to inform policy on social behavior so as to minimize strain on hospital capacity. the red ovals indicate the variables that correspond to measured quantities in the inference experiments of this paper. the detected and undetected populations exist for two reasons. first, we seek to account for underreporting of cases and deaths. second, we will eventually seek a model structure that can simulate the impact of increasing detection rates on disease transmission, including the impact of contact tracing. thus the model was structured from the beginning so that we might examine the effects of interventions that were imposed later on. the ultimate aim here is to inform policy on the requirements for containing the epidemic. we included both h and c populations because hospital inpatient and icu bed capacities are the key health system metrics that we aim to avoid straining. any policy that we consider must include predictions on inpatient and icu bed needs. preparing for those needs is a key response need if or when the epidemic grows uncontrolled. note on figure 1 the return arrow from the recovered r to susceptible s population. the return of any portion of r to s is believed to be zero. nevertheless, we assayed the ability of sda to assimilate some portion of r back into s, as a proof-of-principle that it can be done readily. for details of the model, including the reaction equations and descriptions of all state variables and parameters, see appendix a. sda is an inference procedure, or a type of machine learning, in which a model dynamical system is assumed to underlie any measured quantities. 
this model f can be written as a set of d ordinary differential equations that evolve in some parameterization t as:

dx_a(t)/dt = f_a(x(t), p(t)),  a = 1, ..., d,

where the components x_a of the vector x are the model state variables, and the unknown parameters to be estimated are contained in p(t). a subset l of the d state variables is associated with measured quantities. one seeks to estimate the p unknown parameters and the evolution of all state variables that is consistent with the l measurements. a prerequisite for estimation using real data is the design of simulated experiments, wherein the true values of parameters are known. in addition to providing a consistency check, simulated experiments offer the opportunity to ascertain which and how few experimental measurements, in principle, are necessary and sufficient to complete a model. sda can be formulated as an optimization, wherein a cost function is extremized. we take this approach, and write the cost function in two terms: 1) a term representing the difference between state estimate and measurement (measurement error), and 2) a term representing model error. it will be shown below in this section that treating the model error as finite offers a means to identify whether a solution has been found within a particular region of parameter space. this is a non-trivial problem, as any nonlinear model will render the cost function nonconvex. we search the surface of the cost function via the variational method, and we employ a method of annealing to identify a lowest minimum - a procedure that has been referred to loosely in the literature as variational annealing (va). the cost function a_0 used in this paper is written as:

a_0 = Σ_l Σ_j r_m [x_l(j) - y_l(j)]^2 + Σ_a Σ_n r_f [x_a(n+1) - x_a(n) - δt f_a(x(n))]^2.    (1)

it is made available under a cc-by-nc-nd 4.0 international license. the copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. this version posted may 29, 2020.
one seeks the path x^0 = {x(0), ..., x(n), p(0), ..., p(n)} in state space on which a_0 attains a minimum value 2. note that equation 1 is shorthand; for the full form, see appendix a of ref [17]. for a derivation - beginning with the physical action of a particle in state space - see ref [26]. the first squared term of equation 1 governs the transfer of information from measurements y_l to model states x_l. the summation on j runs over all discretized timepoints at which measurements are made, which may be a subset of all integrated model timepoints. the summation on l is taken over all l measured quantities. the second squared term of equation 1 incorporates the model evolution of all d state variables x_a. the term f_a(x(n)) is defined, for discretization, as (1/2)[f_a(x(n)) + f_a(x(n+1))]. the outer sum on n is taken over all discretized timepoints of the model equations of motion, and the sum on a over all d state variables. r_m and r_f are inverse covariance matrices for the measurement and model errors, respectively. we take each matrix to be diagonal and treat them as relative weighting terms, whose utility will be described below in this section. the procedure searches a (d(n+1) + p(n+1))-dimensional state space, where d is the number of state variables, n is the number of discretized steps, and p is the number of unknown parameters. to perform simulated experiments, the equations of motion are integrated forward to yield simulated data, and the va procedure is challenged to infer the parameters and the evolution of all state variables - measured and unmeasured - that generated the simulated data. this specific formulation has been tested with chaotic models [27-30], and used to estimate parameters in models of biological neurons [12, 13, 15, 17, 31, 32], as well as in astrophysical scenarios [33]. our model is nonlinear, and thus the cost function surface is non-convex.
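as a concrete, hypothetical illustration, the two terms of the cost function just described can be evaluated along a candidate path as follows. scalar weights r_m and r_f stand in for the diagonal covariance matrices, and the function and variable names are assumptions:

```python
import numpy as np

def action(X, Y, f, dt, Rm, Rf, measured):
    """Cost A0 along a candidate path X of shape (N+1, D):
    a measurement-error term on the measured components against data Y,
    plus a model-error term using the trapezoidal definition of f."""
    meas = Rm * np.sum((X[:, measured] - Y) ** 2)
    F = f(X)                                      # vector field evaluated along the path
    pred = X[:-1] + 0.5 * dt * (F[:-1] + F[1:])   # trapezoidal prediction of X[n+1]
    model = Rf * np.sum((X[1:] - pred) ** 2)
    return meas + model

# sanity check: the exact path of u' = -u should have near-zero action
t = np.linspace(0.0, 1.0, 101)
X = np.exp(-t)[:, None]                           # D = 1 state variable, measured
a0 = action(X, X.copy(), lambda Z: -Z, t[1] - t[0], Rm=1.0, Rf=1.0, measured=[0])
```

a path that satisfies both the data and the dynamics makes both terms small; perturbing the path raises the cost, which is the property the variational search exploits.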
for this reason, we iterate - or anneal - in terms of the ratio of model and measurement error, with the aim of gradually freezing out a lowest minimum. this procedure was introduced in ref [34], and has since been used in combination with variational optimization on nonlinear models in refs [10, 17, 31, 33] above. the annealing works as follows. we first define the coefficient of measurement error r_m to be 1.0, and write the coefficient of model error r_f as

r_f = r_{f,0} α^β,

where r_{f,0} is a small number near zero, α is a small number greater than 1.0, and β is initialized at zero. parameter β is our annealing parameter. for the case in which β = 0, the cost function surface - relatively free from model constraints - is smooth, and there exists one minimum of the variational problem that is consistent with the measurements. we obtain an estimate of that minimum. then we increase the weight of the model term slightly, via an integer increment in β, and recalculate the cost. we do this recursively, toward the deterministic limit of r_f ≫ r_m. the aim is to remain sufficiently near to the lowest minimum so as not to become trapped in a local minimum as the surface becomes resolved. we will show in results that a plot of the cost as a function of β reveals whether a solution has been found that is consistent with both measurements and model. we based our simulated locality loosely on new york city 3, with a population of 9 million 4. simulations ran from an initial time t_0 of four days prior 5 to 2020 march 1, the date of the first reported covid-19 case in new york city [35]. at time t_0, there existed one detected symptomatic case within the population of 9 million 6.
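the annealing schedule described above can be sketched as follows. the inner solver, the values r_{f,0} = 10^-8 and α = 2, and the toy one-dimensional problem are illustrative assumptions, not the authors' implementation:

```python
def anneal(optimize, x0, rf0=1e-8, alpha=2.0, beta_max=37, rm=1.0):
    """Variational annealing: repeatedly re-solve the optimization while
    the model-error weight r_f = rf0 * alpha**beta grows toward the
    deterministic limit r_f >> r_m; each solve warm-starts from the last."""
    x, costs = x0, []
    for beta in range(beta_max + 1):
        x, cost = optimize(x, rm, rf0 * alpha ** beta)
        costs.append(cost)
    return x, costs

# toy inner problem: minimize rm*(x - y)**2 + rf*(x - m)**2 over x
y, m = 1.0, 0.0   # the "data" pulls x toward 1, the "model" toward 0
def optimize(x, rm, rf):
    x_star = (rm * y + rf * m) / (rm + rf)   # analytic minimizer of the quadratic
    return x_star, rm * (x_star - y) ** 2 + rf * (x_star - m) ** 2

x_final, costs = anneal(optimize, x0=y)
```

as β grows, the estimate migrates from the data-only answer toward one consistent with the model, and the cost climbs toward a plateau, which mirrors the cost-versus-β diagnostic used in the paper's results.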
we chose five quantities as unknown parameters to be estimated (table 1): 1) the time-varying transmission rate k_i(t); 2) the detection probability of mild symptomatic cases d_sym(t); 3) the detection probability of severe symptomatic cases d_sys(t); 4) the fraction of cases that become symptomatic f_sympt; and 5) the fraction of symptomatic cases that become severe enough to require hospitalization f_severe.

2 it may interest the reader that one can derive this cost function by considering the classical physical action on a path in state space, where the path of lowest action corresponds to the correct solution [26].
3 the model has been written to inform policy for the state of illinois, but it generalizes to any geographical region.
4 for simplicity, we assume a closed population.
5 in the case of new york city, the first known symptomatic individual re-entered the u.s. from iran several days prior, and may have been infected on that date [35].
6 the true number of initial cases is likely to be far higher. we are currently examining the model's sensitivity to initial conditions on population numbers. meanwhile, it is important to examine the sda behavior over a range of choices for initial conditions, and we were struck by the rapid growth of the infected population - even given an initial infected population of one.

table 1: unknown parameters to be estimated. k_i, d_sym, and d_sys are taken to be time-varying. parameters f_sympt and f_severe are constant numbers, as they are assumed to reflect an intrinsic property of the disease. the detection probability of asymptomatic cases is taken to be known and zero. here we summarize the key features that we
k_i(t): time-varying transmission rate
d_sym(t): time-varying detection probability of mild symptomatics
d_sys(t): time-varying detection probability of symptomatics requiring hospitalization
f_sympt: fraction of positive cases that produce symptoms
f_severe: fraction of symptomatics that are severe

sought to capture in modeling these parameters; for their mathematical formulations, see appendix b. the transmission rate k_i (often referred to as the effective contact rate) in a given population for a given infectious disease is measured in effective contacts per unit time. this may be expressed as the total contact rate multiplied by the risk of infection, given contact between an infectious and a susceptible individual. the contact rate, in turn, can be impacted by amendments to social behavior 7. as a first step in applying sda to a high-dimensional epidemiological model, we chose to condense the significance of k_i into a relatively simple mathematical form. we assumed that k_i was constant prior to the implementation of a social-distancing mandate, which then effected a rapid transition of k_i to a lower constant value. specifically, we modeled k_i as a smooth approximation to a heaviside function that begins its decline on march 22, the date that the stay-at-home order took effect in new york city [39]: 25 days after time t_0. for further simplicity, we took k_i to reflect a single implementation of a social distancing protocol, and adherence to that protocol throughout the remaining temporal baseline of estimation 8. detection rates impact the sizes of the subpopulations entering hospitals, and their values are highly uncertain [2, 3]. thus we took these quantities to be unknown, and - as detection methods will evolve - time-varying. we also optimistically assumed that the methods will improve, and thus we described them as increasing functions of time. we used smoothly-varying forms, the first linear and the second quadratic, to preclude symmetries in the model equations.
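the parameter forms just described can be sketched as follows. all numerical values (the rates, the transition steepness, and the baseline length) are illustrative assumptions; the paper's exact formulations are given in its appendix b:

```python
import math

def k_i(t, k_high=0.6, k_low=0.1, t_sd=25.0, steepness=1.0):
    """Smooth approximation to a step-down Heaviside function: the
    transmission rate falls from k_high to k_low around the
    social-distancing day t_sd (day 25 after t_0 in the text)."""
    return k_low + (k_high - k_low) / (1.0 + math.exp(steepness * (t - t_sd)))

def d_sym(t, t_end=201.0):
    """Linearly increasing detection probability for mild symptomatics."""
    return min(1.0, max(0.0, t / t_end))

def d_sys(t, t_end=201.0):
    """Quadratically increasing detection probability for severe cases;
    the different functional form precludes symmetries with d_sym."""
    return min(1.0, max(0.0, t / t_end)) ** 2
```

before day t_sd the transmission rate sits near k_high, and well after it near k_low, while the two detection probabilities rise monotonically at different rates.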
meanwhile, we took the detection probability for asymptomatic cases (d_as) to be known and zero, a reasonable reflection of the current state of testing. finally, we assigned as unknowns the fraction of cases that become symptomatic (f_sympt) and the fraction of symptomatic cases that become sufficiently severe to require hospitalization (f_severe), as these fractions possess high uncertainties (refs [40] and [41], respectively). as they reflect an intrinsic property of the disease, we took them to be constants. all other model parameters were taken to be known and constant 9 (appendix a). the simulated experiments are summarized in the schematic of figure 2. they were designed to probe the effects upon estimations of three considerations: a) the number of measured subpopulations, b) the temporal baseline of measurements, and c) contamination of measurements by noise. to this end, we designed a "base" experiment sufficient to yield an excellent solution, and then four variations on this experiment. the base experiment (i in figure 2) possesses the following features: a) five measured populations: detected asymptomatic as_det, detected mild symptomatic sym_det, detected severe symptomatic sys_det, recovered r, and dead d; b) a temporal baseline of 201 days, beginning on 2020 february 26; c) no noise in measurements. the four variations on this basic experiment (ii through v in figure 2) incorporate the following independent changes. in experiment ii, the r population is not measured - an example designed to reflect the current situation in some localities (e.g. refs [2, 3]). experiment iii reduces the temporal baseline from 201 to 112 days, ending on 2020 may 12: the recent past. the motivation for this design is to ascertain whether - in principle - it was possible to attain an accurate solution that early, given perfectly-recorded data.
7 the reproduction number r_0, in the simplest sir form, can be written as the effective contact rate divided by the recovery rate. in practice, r_0 is a challenge to infer [21, 36-38].
8 the full model permits k_i to vary as a function of multiple such rules, each initiated and rescinded at independent times.
9 the values of other model parameters also possess significant uncertainties given the reported data, including, for example, the fraction of those hospitalized that require icu care. in future va experiments, these quantities will also be treated as unknowns.

experiment iv includes a ∼ ten percent noise level in the simulated r data, and experiment v includes the same noise level in r, sym_det, and sys_det (for the form of additive noise, see appendix c). for each experiment, ten independent calculations were initiated in parallel searches, each with a randomly-generated set of initial conditions on state variable and parameter values.

b. an example using real data: italy

we tested the va procedure using real data obtained from the johns hopkins repository [24], using as measurements their reported confirmed cases c, recovered r, and dead d, over a 112-day period. here, c is reported as a cumulative measurement, including r and d. to formulate a correspondence with our model, we subtracted from c the r and d numbers and divided the remainder among as_det, sym_det, and sys_det. we assumed that the data accord with our model, and assigned zero percent of c to the as_det population. then, absent any well-informed guideline, we assigned 60 and 40% of c to sym_det and sys_det, respectively.
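the bookkeeping just described for the cumulative confirmed count c can be sketched as follows; the 0%/60%/40% split follows the text, while the function name is an assumption of ours:

```python
def split_confirmed(c_cum, r_cum, d_cum):
    """Convert cumulative confirmed cases into model compartments:
    subtract the cumulative recovered and dead to get active cases,
    then assign 0% to as_det and 60%/40% to sym_det/sys_det."""
    active = c_cum - r_cum - d_cum
    as_det = 0.0 * active    # no detected asymptomatics, matching d_as = 0
    sym_det = 0.6 * active
    sys_det = 0.4 * active
    return as_det, sym_det, sys_det
```

applied per reporting day, this yields daily compartment values whose sum reproduces the active (confirmed minus recovered minus dead) count.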
also in accordance with our model, we multiplied the reported value of r by 0.9, as the model assumes a 10% rate of re-entry from the r to s populations. finally, for the real data we initialized 50 independent calculations in parallel (rather than 10 for the simulated experiments). for technical details of all experimental designs and implementation, see appendix c. the salient results for the simulated experiments i through iv are as follows: table 2. the base experiment, which employed five perfectly-measured populations over 201 days 10, yielded an excellent solution in terms of model evolution and parameter estimates. prior to examining the solution, we first plot the cost function versus the annealing parameter β, as this distribution can serve as a tool for assessing the significance of a solution. figure 3 shows the evolution of the cost throughout annealing, for the ten distinct independent paths that were initiated; the x-axis shows the value of annealing

10 the implication of a 201-day requirement for accurate estimations would be a bleak outlook. we emphasize that these experiments are to be taken as examples of how the technique can be expanded to a host of realistic scenarios.

table 2: estimates of static parameters f_sympt and f_severe, over the five simulated experiments and the example using data from italy. with two exceptions, the reported numbers are taken from the annealing iteration with a value of parameter β of 37: once the deterministic limit has been reached (see text). the first exception is experiment ii, for which we chose the result corresponding to β = 8: before the solution grows exponentially unstable (see figure 5).
the second exception is the case for real data, where we report results at two values of β to show that the estimates vary widely over the annealing process. the result italy i corresponds to β = 20, prior to the strong imposition of model constraints; italy ii corresponds to β = 37, once the deterministic limit is reached. see specific subsections for details of each experiment.

at low β the procedure endeavours to fit the measured variables to the simulated measurements. as β increases, the cost increases until it approaches a plateau (around β = 25), indicating that a solution has been found that is consistent with both measurements and model.

parameter β, or: the increasing rigidity of the model constraint. at the start of iterations, the cost function is mainly fitting the measurements to data, and its value begins to climb as the model penalty is gradually imposed. if the procedure finds a solution that is consistent not only with the measurements but also with the model, then the cost will plateau. in figure 4, we see this happen, beginning around β = 25. once β has reached 30, the plateau is established, with some scatter across paths. unless noted, the reported estimates in this section are taken at a value of β of 37: on the plateau. the significance of this plateau may become clearer once we examine the contrasting case of experiment ii. we now examine the state and parameter estimates for the base experiment i 11. figure 4 shows an excellent result, excepting k_i(t) for times prior to its steep decline. we note no improvement in this estimate for k_i(t) following a tenfold increase in the temporal resolution of measurements (not shown). the procedure does appear to recognize that a fast transition in the value of k_i occurred at early times, and that that value was previously higher.
It will be important to investigate the reason for this failure in the estimation of k_i at early times, to rule out numerical issues involving the quickly changing derivative. (As noted in Experiments, we chose k_i to reflect a rapid adherence to social distancing at day 25 following time t_0, which then remains in place through to day 201; for the form of k_i, see Appendix B.)

C. Experiment II: no measurement of R

Figure 5 shows the cost as a function of annealing for the case with no measurement of the recovered population R. Without examining the estimates, we know from the cost(β) plot that no solution has been found that is consistent with both measurements and model: no plateau is reached. Rather, as the model constraint strengthens, the cost increases exponentially. Indeed, Figure 6 shows the estimation, taken at β = 8, prior to the runaway behavior. Note the excellent fit to the measured states and the simultaneously poor fit to the unmeasured states. As no stable solution is found at high β, we conclude that there exists insufficient information in As_det, Sym_det, Sys_det, and D alone to corral the procedure into a region of state-and-parameter space in which a model solution is possible. (For all experiments, each solution shown is representative of the solution for all ten paths.)

Figure 5: Cost versus β for Experiment II, in which R is not measured.
As β increases, the cost increases indefinitely, indicating that no solution has been found that is consistent with both measurements and model dynamics.

D. Experiment III: a shorter temporal baseline

Figure 7 shows estimates that include the measurement of R but limit the temporal baseline to the first 112 days following time t_0. The estimate suffers little from this reduction in the baseline of measurements. The only state estimate that differs significantly from the result shown in Figure 4 is S (the others are not shown), and the estimate of the transmission rate k_i(t) is poor at all times. The estimates of the static parameters and the trends in the detection probabilities are still captured well.

E. Experiment IV: noise added to R

With ~10% noise added to the measurement of R, a plateau appears in the cost-versus-β plot (Figure 8), indicating that a stable minimum has been found. Figure 9 shows that the noise propagates to the unmeasured states S, E, As, P, and to the detection probability d_sys. The general evolution of the states and detection rates, however, is still captured well, with the one exception of Sym. Note that the low estimate for Sym is not offset by a high estimate for any of the other state variables: the addition of noise in these numbers, by definition, breaks the conservation of the population. In the future, we will ameliorate this problem by adding equality constraints to the cost function to impose the conservation of N.

F. Experiment V: noise added to Sym_det and Sys_det as well

With the same noise level added to the measurements Sym_det and Sys_det as well, the state estimate is similar to that for Experiment IV (not shown). Estimates of the detection probabilities are poorer but still trace the general trends (Figure 10).

G. The Italy data

We took the confirmed (C), recovered, and dead cases in Italy listed in the GitHub repository of the Johns Hopkins Center for Systems Science and Engineering [24]. These are cumulative numbers; C is the sum of R, D, and new cases. The C were divided among the model variables as described in Experiments.
Figures 11 and 12 show the result. The cost-versus-β plot indicates that a stable solution is found. The fits to the measured states are reasonable. The estimates of the unmeasured states that impact hospital capacity (the H and C populations) appear smooth and reasonable, with the exception of the features in C_3,det and H_3,det beginning at time t_0. The estimates of S and E, on the other hand, are discontinuous. This may be due to the lack of a smooth solution for the time-varying parameters (not shown). We are currently amending the procedure to handle data that may include errors of unknown magnitude.

We have endeavoured to illustrate how SDA can systematically identify the specific measurements, temporal baseline of measurements, and degree of measurement accuracy required to estimate unknown model parameters in a high-dimensional model. We have used as our dynamical system an epidemiological model designed to examine the complex problems that COVID-19 presents to hospitals. In doing so, we have assumed knowledge of some model parameters; in light of these assumptions, we restrict our conclusions to general comments. We emphasize that estimation of the full model state requires measurements of the detected cases but not the undetected, provided that the recovered and dead are also measured. The model is tolerant to low noise in these measurements, excepting the transmission rate k_i.

The SDA technique can be expanded in many directions.
Examples include: 1) defining additional model parameters as unknowns to be estimated, including the fraction of patients hospitalized, the fraction who enter critical care, and the various timescales governing the reaction equations; 2) imposing various constraints on the unknown time-varying quantities, particularly the transmission rate k_i(t), and identifying which forms permit a solution consistent with measurements; 3) examining model sensitivity to the initial numbers within each population; and 4) examining model sensitivity to the temporal frequency of data sampling.

The ultimate aim of SDA is to test the validity of model estimation, via prediction. In advance of that step, we are currently examining the stability of the SDA procedure over a range of choices for parameter values and initial numbers for the infected populations. As this pandemic is unlikely to abate soon, it is important to design well-informed plans for social and economic activity.

Acknowledgments. Thank you to Patrick Clay for discussions on inferring exposure rates given social distancing protocols.
The model state variables:
H_1,det: hospitalized and will recover, detected
H_2,det: hospitalized and will go to critical care and recover, detected
H_3,det: hospitalized and will go to critical care and die, detected
H_1: hospitalized and will recover, undetected
H_2: hospitalized and will go to critical care and recover, undetected
H_3: hospitalized and will go to critical care and die, undetected
C_2,det: in critical care and will recover, detected
C_3,det: in critical care and will die, detected
C_2: in critical care and will recover, undetected
C_3: in critical care and will die, undetected
R: recovered
D: dead

Figure 9: Estimates for Experiment IV. The noise added to R propagates to some unmeasured states and to the detection probability d_sym, but, with the exception of Sym, the overall evolution is captured well. The noise does preclude an estimate of the transmission rate k_i.

Figure 10: Estimates for Experiment V: noise added to R, Sym_det, and Sys_det. Estimates of the detection probabilities are noisier than those for Experiment IV, although the trends are still captured. The state estimate is similar to that shown in Figure 9 for Experiment IV (not shown).

Figure 11: Cost versus β using the Italy data, indicating that a stable solution has been found.

The blue notation specified by overbrackets denotes the correspondence of specific terms to the reactions between the populations depicted in Figure 1.
Note that the return of any portion of the R population to S is believed to be zero. Nevertheless, we assayed the ability of SDA to assimilate some portion of R back into S, as a proof of principle that it can be done readily.

Figure 12: Estimates of all measured and unmeasured state variables, given the Italy measurements of R, D, and confirmed cases C over 112 days. See the text for comments and for our translation of C into the state variables As_det, Sym_det, and Sys_det.

Table 4: The model parameters, with the unknown parameters to be estimated denoted in boldface. The unknown parameters k_i, d_sym, and d_sys are taken to be time-varying. The unknown parameters f_sympt and f_severe are taken to be intrinsic properties of the disease and are therefore constant numbers. The detection probability of asymptomatic cases is taken to be known and zero. Units of time are days.
Parameter, description, value:
N: total population; 9,000,000
f_immune: fraction of the recovered population that gains permanent immunity to re-infection; 0.9
t_resuscept: time to re-enter the susceptible population, for those recovered who do not gain immunity
(the property that a detected case is likely to transmit less, via successful quarantine)
detection probability of asymptomatic cases; 0.0
f_sympt: fraction of positive cases that produce symptoms; 0.5

The unknown parameters assumed to be time-varying are the transmission rate k_i and the detection probabilities d_sym and d_sys for mild and severe symptomatic cases, respectively. The transmission rate in a given population for a given infectious disease is measured in effective contacts per unit time. This may be expressed as the total contact rate (the total number of contacts, effective or not, per unit time) multiplied by the risk of infection given contact between an infectious and a susceptible individual. The total contact rate can be impacted by social behavior. In this first employment of SDA upon a pandemic model of such high dimensionality, we chose to represent k_i as a relatively constant value that undergoes one rapid transition corresponding to a single social-distancing mandate. As noted in Experiments, social distancing rules were imposed in New York City roughly 25 days following the first reported case. We thus chose k_i to transition between two relatively constant levels roughly 25 days following time t_0. Specifically, we wrote k_i(t) as a smooth transition between the two levels. The parameter T was set to 25, spanning from four days prior to the first report of a detection in NYC [35] to the imposition of a stay-home order in NYC on March 22 [39]. The parameter S governs the steepness of the transformation and
was set to 10. The parameters F and ξ were then adjusted, to 1.2 and 1.5, to achieve a transition from about 1.4 to 0.3. For the detection probabilities d_sym and d_sys, a linear and a quadratic form, respectively, were chosen to preclude symmetries, and both were optimistically taken to increase with time:

d_sym(t) = 0.2 t
d_sys(t) = 0.1 t^2

Finally, each time series was normalized to the range [0, 1] via division by its respective maximum value.

The simulated data were generated by integrating the reaction equations (Appendix A) via a fourth-order adaptive Runge-Kutta method encoded in the Python package odeint. A step size of one day was used to record the output. Except for the one instance noted in Results regarding Experiment I, we did not examine the sensitivity of estimations to the temporal sparsity of measurements. The initial conditions on the populations were: S_0 = N - 1 (where N is the total population), As_0 = 1, and zero for all others.

For the noise experiments, the noise added to the simulated Sym_det, Sys_det, and R data was generated by Python's numpy.random.normal package, which defines a normal distribution of noise. For the "low-noise" experiments, we set the standard deviation to the respective mean of each distribution divided by 100. For the experiments using higher noise, we multiplied that original level by a factor of ten. For each noisy data set, the absolute value of the minimum was then added to each data point, so that the population did not drop below zero.

The optimization was performed via the open-source Interior Point Optimizer (IPOPT) [46]. IPOPT uses a Simpson's rule method of finite differences to discretize the state space, a Newton's method to search, and a barrier method to impose user-defined bounds upon the searches.
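The construction of the simulated inputs described above can be sketched as follows. The explicit expression for k_i(t) is given in the authors' Appendix B and did not survive extraction here, so the logistic form below is our assumption, chosen only to reproduce the described transition (from about 1.4 to 0.3 near day T = 25); the detection-probability forms and the noise recipe follow the text directly. All function names are ours.

```python
import numpy as np

def k_i(t, hi=1.4, lo=0.3, T=25.0, S=10.0):
    """ASSUMED logistic stand-in for the transmission rate: a transition
    between two roughly constant levels near day T with steepness S.
    The authors' exact expression (with parameters F and xi) is in their
    Appendix B and is not reproduced here."""
    return lo + (hi - lo) / (1.0 + np.exp(S * (t - T)))

t = np.arange(201.0)  # days 0..200 of the simulated window

# Linear (mild) and quadratic (severe) detection probabilities, each
# normalized to [0, 1] by dividing by its maximum value.
d_sym = (0.2 * t) / (0.2 * t).max()
d_sys = (0.1 * t**2) / (0.1 * t**2).max()

def add_low_noise(series, rng):
    """'Low-noise' recipe from the text: Gaussian noise with
    sigma = mean(series)/100, then a shift by |min| so that no simulated
    count drops below zero. (A tenfold sigma gives the high-noise case.)"""
    noisy = series + rng.normal(0.0, series.mean() / 100.0, series.size)
    return noisy + abs(noisy.min())
```
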
We note that IPOPT's search algorithm treats state variables as independent quantities, which is not the case for a model involving a closed population. This feature did not affect the results of this paper; those interested in expanding the use of this tool, however, might keep it in mind. One might negate undesired effects by, for example, imposing equality constraints in the cost function that enforce the conservation of N.

Within the annealing procedure described in Methods, the parameter α was set to 2.0, and β ran from 0 to 38 in increments of 1. The inverse covariance matrix for measurement error (R_m) was set to 1.0, and the initial value of the inverse covariance matrix for model error (R_f,0) was set to 10^-7. For each of the five simulated experiments, ten paths were searched, beginning at randomly generated initial conditions for parameters and state variables. For the Italy data, fifty paths were searched. All simulations were run on a 720-core, 1440-GB, 64-bit CPU cluster.
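Assuming the standard variational-annealing update, in which the model-error precision grows geometrically as R_f = R_f,0 · α^β (the exact update is defined in the authors' Methods section, which is not part of this excerpt), the stated settings imply the following schedule:

```python
# Annealing schedule implied by the stated settings, under the assumed
# update R_f = R_f0 * alpha**beta.
ALPHA = 2.0   # annealing parameter alpha
R_F0 = 1e-7   # initial model-error precision R_f,0
R_M = 1.0     # measurement-error precision R_m

def model_precision(beta):
    """Model-error precision (inverse covariance) at annealing step beta."""
    return R_F0 * ALPHA ** beta

schedule = [model_precision(b) for b in range(39)]  # beta = 0, 1, ..., 38
# The model term starts negligible relative to R_M and ends dominating it,
# which is the "increasing rigidity of the model constraint" in the text.
```
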
References:
- The need for data innovation in the time of COVID-19
- Estimating the early death toll of COVID-19 in the United States
- Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2)
- Inverse problem theory and methods for model parameter estimation
- Numerical weather prediction
- Atmospheric modeling, data assimilation and predictability
- Data assimilation: the ensemble Kalman filter
- Practical methods for optimal control and estimation using nonlinear programming
- The number of required observations in data assimilation for a shallow-water flow
- Estimating the state of a geophysical system with sparse observations: time delay methods to achieve accurate initial states for prediction
- Kalman meets neuron: the emerging intersection of control theory with neuroscience. Annual International Conference of the IEEE Engineering in Medicine and Biology Society
- Dynamical estimation of neuron and network properties I: variational methods
- Dynamical estimation of neuron and network properties II: path integral Monte Carlo methods
- Real-time tracking of neuronal network structure using data assimilation
- Estimating parameters and predicting membrane voltages with conductance-based neuron models
- Automatic construction of predictive neuron models through large scale assimilation of electrophysiological data
- Statistical data assimilation for estimating electrophysiology simultaneously with connectivity within a biological neuronal network
- Towards real time epidemiology: data assimilation, modeling and anomaly detection of health surveillance data streams
- Variational data assimilation with epidemic models
- Bayesian tracking of emerging epidemics using ensemble optimal statistical interpolation
- Real time Bayesian estimation of the epidemic potential of emerging infectious diseases
- Adjoint-based data assimilation of an epidemiology model for the COVID-19 pandemic in 2020
- An epidemiological modelling approach for COVID-19 via data assimilation
- Projecting hospital utilization during the COVID-19 outbreaks in the United States
- Predicting the future: completing models of observed complex systems
- Dynamical parameter and state estimation in neuron models. In: The Dynamic Brain: An Exploration of Neuronal Variability and Its Functional Significance
- Estimating the biophysical properties of neurons with intracellular calcium dynamics
- Accurate state and parameter estimation in nonlinear systems with sparse observations
- Improved variational methods in statistical data assimilation
- Nonlinear statistical data assimilation for HVC_RA neurons in the avian song system
- Data assimilation of membrane dynamics and channel kinetics with a neuromorphic integrated circuit
- An optimization-based approach to calculating neutrino flavor evolution
- Systematic variational method for statistical nonlinear state and parameter estimation
- Improved inference of time-varying reproduction numbers during infectious disease outbreaks
- A new framework and software to estimate time-varying reproduction numbers during epidemics
- Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures
- Pause order in New York City takes effect
- Getting a handle on asymptomatic SARS-CoV-2 infection
- Estimating the burden of SARS-CoV-2 in France
- Epidemiology and transmission of COVID-19 in Shenzhen China: analysis of 391 cases and 1,286 of their close contacts
- Incidence, clinical outcomes, and transmission dynamics of hospitalized 2019 coronavirus disease among 9,596,321 individuals residing in California and Washington
- Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. The Lancet
- Clinical course and outcomes of critically ill patients with SARS-CoV-2 pneumonia in Wuhan, China: a single-centered, retrospective, observational study.
The Lancet Respiratory Medicine.
- Short tutorial: getting started with IPOPT in 90 minutes

key: cord-191876-03a757gf
authors: Weinert, Andrew; Underhill, Ngaire; Gill, Bilal; Wicks, Ashley
title: Processing of Crowdsourced Observations of Aircraft in a High Performance Computing Environment
date: 2020-08-03
journal: nan
doi: nan
sha:
doc_id: 191876
cord_uid: 03a757gf

As unmanned aircraft systems (UAS) continue to integrate into the U.S. National Airspace System (NAS), there is a need to quantify the risk of airborne collisions between unmanned and manned aircraft to support regulation and standards development. Both regulators and standards-developing organizations have made extensive use of Monte Carlo collision risk analysis simulations using probabilistic models of aircraft flight. We previously determined that the observations of manned aircraft by the OpenSky Network, a community network of ground-based sensors, are appropriate for developing models of the low-altitude environment. This work overviews the high performance computing workflow designed and deployed on the Lincoln Laboratory Supercomputing Center to process 3.9 billion observations of aircraft. We then trained the aircraft models using more than 250,000 flight hours at 5,000 feet above ground level or below. A key feature of the workflow is that all the aircraft observations and supporting datasets are available as open source technologies or have been released to the public domain.

The continuing integration of unmanned aircraft system (UAS) operations into the National Airspace System (NAS) requires new or updated regulations, policies, and technologies to maintain safe and efficient use of the airspace. To help achieve this, regulatory organizations such as the Federal Aviation Administration (FAA) and the International Civil Aviation Organization (ICAO) mandate the use of collision avoidance systems to minimize the risk of a midair collision (MAC) between most manned aircraft (e.g.
14 CFR § 135.180). Monte Carlo safety simulations and statistical encounter models of aircraft behavior [1] have enabled the FAA to develop, assess, and certify systems that mitigate the risk of airborne collisions. These simulations and models are based on observed aircraft behavior and have been used to design, evaluate, and validate collision avoidance systems deployed on manned aircraft worldwide [2].

For assessing the safety of UAS operations, the Monte Carlo simulations need to determine whether the UAS would be a hazard to manned aircraft; there is therefore an inherent need for models that represent how manned aircraft behave. While various models have been developed over decades, many of them were not designed to model manned aircraft behavior where UAS are likely to operate [3]. In response, new models designed to characterize the low-altitude environment are required. We previously identified and determined that the OpenSky Network [4], a community network of ground-based sensors that observe aircraft equipped with Automatic Dependent Surveillance-Broadcast (ADS-B) Out, would provide sufficient and appropriate data to develop these new models [5]. ADS-B was initially developed and standardized to enable aircraft to leverage satellite signals for precise tracking and navigation [6, 7]. However, the previous work did not train any models.

This work considered only aircraft, observed by the OpenSky Network, flying within the United States between 50 and 5,000 feet above ground level (AGL). Thus this work does not consider all aircraft, as not all aircraft are equipped with ADS-B. The scope of this work was informed by the needs of the FAA UAS Integration Office, along with the activities of the standards development organizations ASTM F38, RTCA SC-147, and RTCA SC-228.
Initial scoping discussions were also informed by the UAS ExCom Science and Research Panel (SARP), an organization chartered under the ExCom Senior Steering Group; however, the SARP did not provide a final review of the research.

We focused on two objectives identified by the aviation community to support integration of UAS into the NAS: first, to train a generative statistical model of how manned aircraft behave at low altitudes; and second, to estimate the relative frequency with which a UAS would encounter a specific type of aircraft. These contributions are intended to support current and expected UAS safety system development and evaluation, and to facilitate stakeholder engagement to refine our contributions for policy-related activities.

The primary contribution of this paper is the design and evaluation of the high performance computing (HPC) workflow used to train models and complete analyses that support the community's objectives; refer to previous work [5, 8] for uses of the results from this workflow. This paper focuses primarily on the use of the Lincoln Laboratory Supercomputing Center (LLSC) [9] to process billions of aircraft observations in a scalable and efficient manner.

We first briefly overview the storage and compute infrastructure of the LLSC. The LLSC and its predecessors have been widely used to process aircraft tracks and support aviation research for more than a decade. The LLSC high-performance computing (HPC) systems have two forms of storage: distributed and central. Distributed storage comprises the local storage on each of the compute nodes and is typically used for running database applications. Central storage is implemented using the open-source Lustre parallel file system on a commercial storage array. Lustre provides high-performance data access to all the compute nodes while maintaining the appearance of a single filesystem to the user. The Lustre filesystem is used in most of the largest supercomputers in the world.
Specifically, the block size of Lustre is 1 MB; thus any file created on the LLSC will take at least 1 MB of space. The processing described in this paper was conducted on the LLSC HPC system [9]. The system consists of a variety of hardware platforms, but we specifically developed, executed, and evaluated our software using compute nodes based on dual-socket Haswell (Intel Xeon E5-2683 v3 @ 2.0 GHz) processors. Each Haswell processor has 14 cores and can run two threads per core with Intel Hyper-Threading Technology. Each Haswell node has 256 GB of memory.

This section describes the high performance computing workflow and the results for each step.

A shell script was used to download from the OpenSky Network the raw data archives for a given Monday. Data were organized by day and hour: both the OpenSky Network and our architecture create a dedicated directory for a given day, such as 2020-06-22. After extracting the raw data archives, up to 24 comma-separated value (CSV) files populate the directory, with each UTC hour corresponding to a specific file; in a few cases, not every hour of the day was available. The files contain the abstracted observations of all aircraft for the given hour. For a specific aircraft, observations are updated at least every ten seconds. For this paper, we downloaded 85 Mondays spanning February 2018 to June 2020, totaling 2,002 hours.

The size of each hourly file depends on the number of active sensors that hour, the time of day, the quantity of aircraft operations, and the diversity of the operations. Across a given day, the hourly files can range in size by hundreds of megabytes, with the maximum file size between 400 and 600 megabytes. Together, all the hourly files for a given day currently require about 5-9 gigabytes of storage. We observed that, on average, the daily storage requirement for 2019 was greater than for 2018.
Parsing, organizing, and aggregating the raw data for each specific aircraft required high performance computing resources, especially when organizing the data at scale. Many aviation use cases require organizing the data and building a track corpus for each specific aircraft, yet it was unknown how many unique aircraft were observed in a given hour, or whether a given hourly file had any observations for a specific aircraft. To efficiently organize the raw data, we needed to address these unknowns.

We identified unique aircraft by parsing and aggregating the national aircraft registries of the United States, Canada, the Netherlands, and Ireland. Registries were processed for each individual year from 2018 to 2020. All registries specify the registered aircraft's type (e.g. rotorcraft, fixed-wing single-engine, etc.), the registration expiration date, and a globally unique hex identifier of the transponder equipped on the aircraft. This identifier is known as the ICAO 24-bit address [10], with (2^24 - 2) unique addresses available worldwide. Some of the registries also specify the maximum number of seats for each aircraft.

Using the registries, we created a four-tier directory structure to organize the data. The highest-level directory corresponds to the year, such as 2019. The next level is organized by twelve general aircraft types, such as fixed-wing single-engine, glider, or rotorcraft. The third level is based on the number of seats, with each directory representing a range of seats; a dedicated directory was created for aircraft with an unknown number of seats. The lowest level is based on the sorted unique ICAO 24-bit addresses: for each seat-based directory, up to 1,000 ICAO 24-bit address directories are created. Additionally, to address the fact that the four aircraft registries do not contain all aircraft registered globally, a second-level directory titled "unknown" was created and populated with directories corresponding to each hour of data.
The top- and bottom-level directories remained the same as for the known aircraft types; the bottom directories for unknown aircraft are generated at runtime. This hierarchy ensures that there are no more than 1,000 directories per level, as recommended by the LLSC, while organizing the data to easily enable comparative analysis between years or between different types of aircraft. The hierarchy was also sufficiently deep and wide to support efficient parallel I/O operations across the entire structure. For example, a full directory path for the first three tiers of the hierarchy could be "2020/rotorcraft/seats_001_010/"; this directory would contain all the known unique ICAO 24-bit addresses for rotorcraft with 1-10 seats. Within it would be up to 1,000 directories, such as "a00c12_a00d20" or "a00d20_a00ecf"; each such lowest-level directory stores all the organized raw data for aircraft whose ICAO 24-bit address falls in the named range, where the first hex value is inclusive and the second is not.

With the directory structure established, each hourly file was loaded into memory, parsed, and lightly processed. Observations with incomplete or missing position reports were removed, along with any observations outside a user-defined geographic polygon. The default polygon, illustrated by Figure 1, was a convex hull with a buffer of 60 nautical miles around approximately North America, Central America, the Caribbean, and Hawaii. Units were also converted to U.S. aviation units. The country polygons were sourced from Natural Earth, a public domain map dataset [11].

For the 85 Mondays across the three years, 2,214 directories were generated across the first three tiers of the hierarchy and 802,159 directories were created in total across the entire hierarchy. Of these, 770,661 directories were non-empty. The majority of the directories were created within the unknown-aircraft-type directories.
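The bottom-tier range directories can be generated with a sketch like the following (function and variable names are ours; the production pipeline's implementation is not shown in the paper):

```python
def icao24_bins(addresses, max_dirs=1000):
    """Partition ICAO 24-bit hex addresses into at most `max_dirs` range
    directories named 'first_next', where the first address is inclusive
    and the second (the start of the next bin) is exclusive."""
    addrs = sorted(a.lower() for a in addresses)
    size = -(-len(addrs) // max_dirs)  # ceiling division: addresses per bin
    names = []
    for i in range(0, len(addrs), size):
        if i + size < len(addrs):
            upper = addrs[i + size]    # first address of the next bin
        else:
            # Last bin: one past the final address keeps the bound exclusive.
            upper = format(int(addrs[-1], 16) + 1, '06x')
        names.append(f"{addrs[i]}_{upper}")
    return names

bins = icao24_bins(['a00c12', 'a00c45', 'a00d20', 'a00e00'], max_dirs=2)
# A bin directory then sits under a seat-tier path, e.g.
# '2020/rotorcraft/seats_001_010/' + bins[0]
```
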
As overviewed by Tables 1 and 2, about 3.9 billion raw observations were organized, with about 1.4 billion observations available after filtering. There was a 15% annual increase in observations per hour from 2018 to 2019. However, a 50% decrease in the average number of observations per hour was observed when comparing 2020 to 2019; this can be attributed to the COVID-19 pandemic, a worldwide incident that sharply curtailed travel, especially travel between countries. This reduction in travel was reflected in the amount of data filtered using the geospatial polygon: in 2018 and 2019, about 41-44% of observations were filtered based on their location, whereas only 27% of observations were filtered for March to June 2020. Conversely, the share of observations removed by quality control did not vary significantly in 2020, with 26%, 20%, and 25% removed for 2018, 2019, and 2020, respectively.

These results were generated using 512 CPUs across 2,002 tasks, where each task corresponded to a specific hourly file. Tasks were uniformly distributed across CPUs; a dynamic self-scheduling parallelization approach was not implemented. Each task required on average 626 seconds to execute, with a median time of 538 seconds; the maximum and minimum times to complete a task were 2,153 and 23 seconds. Across all tasks, about 348 hours of total compute time were required to parse and filter the 85 days of data. It is expected that, had the geospatial filtering been relaxed so that observations from Europe were not removed, the compute time would have increased due to the increased demands of creating and writing hourly files for each aircraft.

Since files were created for every hour for each unique aircraft, tens of millions of small files less than 1 megabyte in size were created. This was problematic, as small files typically use a single object storage target, thus serializing access to the data.
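The uniform, statically scheduled task assignment described above (2,002 hourly-file tasks across 512 CPUs, with no dynamic self-scheduling) amounts to something like the following round-robin sketch; the actual LLSC launch mechanism is not detailed in the paper:

```python
def static_partition(n_tasks, n_cpus):
    """Round-robin (static) assignment of task indices to CPUs: task i
    always goes to CPU i % n_cpus. With no dynamic self-scheduling, a long
    task (e.g. 2,153 s) can leave its CPU busy long after CPUs that drew
    only short tasks (e.g. 23 s) have finished."""
    assignment = [[] for _ in range(n_cpus)]
    for task in range(n_tasks):
        assignment[task % n_cpus].append(task)
    return assignment

cpu_tasks = static_partition(2002, 512)  # the paper's task and CPU counts
```
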
additionally, in a cluster environment, hundreds or thousands of concurrent, parallel processes accessing small files can lead to highly random i/o patterns and generate massive amounts of network traffic. this increases latency for file access, significantly slows down i/o, and consequently degrades overall application performance. while this approach to data organization may provide acceptable performance on a laptop or desktop computer, it was unsuitable for use in a shared, distributed hpc system. in response, we created zip archives for each of the bottom directories. in a new parent directory, we replicated the first three tiers of the directory hierarchy from the previous step. then, instead of creating directories based on the icao 24-bit addresses, we archived each directory with the hourly csv files from the previous organization step. we then removed the hourly csv files from storage. this was achieved using llmapreduce [12], with a task created for each of the 770,661 non-empty bottom-level directories. similar to the previous organization step, all tasks were completed in a few hours, but with no optimization for load balancing. the performance of this step could be improved by distributing tasks based on the number of files in the directories or the estimated size of the output archive. a key advantage of archiving the organized data is that the archives can be updated with new data as it becomes available. if the geospatial filtering parameters and aircraft registry data do not change, only new opensky data needs to be organized. once organized into individual csv files, llmapreduce can be used again to update the existing archives. this substantially reduces the computational and storage requirements to process new data. the archived data can now be segmented, have outliers removed, and be interpolated.
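the per-directory archiving step can be sketched with the standard zipfile module; opening an existing archive in append mode is what allows new hourly csv files to be folded in later. paths and flags here are illustrative, not the production llmapreduce task:

```python
import os
import zipfile

def archive_bottom_directory(src_dir, dest_zip, remove_sources=False):
    """Archive the hourly CSV files of one bottom-level directory into a
    single zip, optionally deleting the originals afterwards. Appending
    to an existing archive lets new data be folded in incrementally."""
    mode = "a" if os.path.exists(dest_zip) else "w"
    with zipfile.ZipFile(dest_zip, mode, zipfile.ZIP_DEFLATED) as zf:
        for name in sorted(os.listdir(src_dir)):
            if name.endswith(".csv"):
                path = os.path.join(src_dir, name)
                zf.write(path, arcname=name)
                if remove_sources:
                    os.remove(path)
    return dest_zip
```

consolidating each directory of tiny hourly files into one archive avoids the single-object-storage-target bottleneck described above.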
additionally, above ground level (agl) altitude was calculated, airspace class was identified, and dynamic rates (e.g. vertical rate) were calculated. we also split the raw data into track segments based on unique position updates and the time between updates. this ensures that each segment does not include significantly interpolated or extrapolated observations. track segments with fewer than ten points are removed. figure 2 illustrates the track segments for a faa-registered fixed wing multi-engine aircraft from march to june 2020. note that segment length can vary from tens to hundreds of nautical miles. track segment length was dependent upon the aircraft type, the availability of active opensky network sensors, and nearby terrain. however, the ability to generate track segments that span multiple states represents a substantial improvement over previous processing approaches for the development of aircraft behavior models. then, for each segment, we detect altitude outliers using a 1.5 scaled median absolute deviations approach and smooth the track using a gaussian-weighted moving average filter with a 30-second time window. dynamic rates, such as acceleration, are calculated using a numerical gradient. outliers are then detected and removed based on these rates. outlier thresholds were based on aircraft type; for example, speeds greater than 250 knots were considered outliers for rotorcraft, while fixed wing multi-engine aircraft had a threshold of 600 knots. the tracks were then interpolated to a regular one-second interval. lastly, we estimated the above ground level altitude using digital elevation models. this altitude estimation was the most computationally intensive component of the entire workflow. it consists of loading into memory and interpolating srtm3 or noaa globe [13] digital elevation models (dems) to determine the elevation for each interpolated track segment position.
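the per-segment cleaning described above (scaled-mad outlier flags, then resampling to a one-second grid) can be sketched as follows; the 1.4826 scale factor and the linear-interpolation scheme are standard choices assumed here, not taken from the paper's code:

```python
import statistics

def mad_outliers(values, scale=1.4826, threshold=1.5):
    """Flag values more than `threshold` scaled median absolute deviations
    from the median (1.4826 makes the MAD consistent with a normal
    standard deviation; the text uses a 1.5-scaled-MAD criterion)."""
    med = statistics.median(values)
    mad = scale * statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)
    return [abs(v - med) / mad > threshold for v in values]

def interpolate_1hz(times, values):
    """Linearly resample an irregular (time, value) series to a regular
    one-second interval, as done for the cleaned tracks."""
    out = []
    t = times[0]
    i = 0
    while t <= times[-1]:
        while times[i + 1] < t:
            i += 1  # advance to the bracketing interval
        f = (t - times[i]) / (times[i + 1] - times[i])
        out.append((t, values[i] + f * (values[i + 1] - values[i])))
        t += 1
    return out
```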
to reduce the computational load prior to processing the terrain data, a c++ based polygon test was used to identify which track segment positions were over the ocean, as defined by natural earth data. points over the ocean are assumed to have an elevation of 0 feet mean sea level, and their elevation is not estimated using the dems. for the 85 days of organized data, approximately 900,000,000 interpolated track segments were generated. for each aircraft in a given year, a single csv was generated containing all the computed segments. in total across the three years, 619,337 files were generated. as these files contained significantly more rows and columns than when organizing the raw data, the majority of these final files were greater than 1 mb in size; the output of this step therefore did not face any significant storage block size challenges. similar to the previous step, tasks were created based on the bottom tier of the directory hierarchy; specifically, parallel tasks were created for each archive. during processing, archives were extracted to a temporary directory while the final output was stored in standard memory. given the processed data, this section overviews two applications that exploit and disseminate the data to inform and support the aviation safety community. as the aircraft type was identified when organizing the raw data, it was a straightforward task to estimate the observed distribution of aircraft types per hour. these distributions are not reflective of all aircraft operations in the united states, as not all aircraft are observed by the opensky network. the distributions were also calculated independently for each aircraft type, so the yearly (row) percentages may not sum to 100%. furthermore, the relatively low percentage of unknown aircraft was due to the geospatial filtering when organizing the raw data.
if the same aircraft registries were used but the filtering were changed to only include tracks in europe, the percentage of unknown aircraft would likely rise significantly. this analysis can be extended by identifying specific aircraft manufacturers and models, such as the boeing 777. however, manufacturer and model information is not consistent within an aircraft registry nor across different registries. for example, entries of "cessna 172," "textron cessna 172," and "textron c172" all refer to the same aircraft model. one possible explanation for the differences between entries is that cessna used to be an independent aircraft manufacturer before eventually being acquired by textron. depending on the year of registration, the recorded name of the aircraft may differ, but the size and performance of the aircraft remain constant. since over 300,000 aircraft with unique icao 24-bit addresses were identified annually across the aircraft registries, parsing and organizing the aircraft models can be formulated as a traditional natural language processing problem. parsing the aircraft registries differs from the common problem of parsing aviation incident or safety reports [14, 15, 16] due to the reduced word count and the structured format of the registries. future work will focus on using fuzzy string matching to identify similar aircraft. for many aviation safety studies, manned aircraft behavior is represented using mit lincoln laboratory encounter models. each encounter model is a bayesian network, a generative statistical model that mathematically represents aircraft behavior during close or safety-critical encounters, such as near midair collisions. the development of the modern models started in 2008 [1], with significant updates in 2013 [17] and 2018 [18]. all the models were trained using the llsc [9] or its predecessors.
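the registry-normalization problem illustrated by the "cessna 172" / "textron cessna 172" example can be attacked with stdlib fuzzy matching; the drop-token list below is an invented illustration (abbreviations such as "c172" would need additional expansion rules):

```python
from difflib import SequenceMatcher

def normalize(entry):
    """Crude registry normalization: lowercase, strip hyphens, and drop
    the parent-company token so 'textron cessna 172' and 'cessna 172'
    compare equal. The drop list is illustrative only."""
    drop = {"textron"}
    tokens = [t for t in entry.lower().replace("-", " ").split() if t not in drop]
    return " ".join(tokens)

def same_model(a, b, threshold=0.8):
    """Fuzzy-match two registry entries after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold
```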
the most widely used of these models were trained using observations collected by ground-based secondary surveillance radars from the 84th radar evaluation squadron (rades) network. aircraft observations by the rades network are based on mode 3a/c, an identification friend or foe technology that provides less metadata than ads-b. notably, aircraft type or model cannot be explicitly correlated or identified with specific aircraft tracks. instead, we filtered the rades observations based on the flying rules reported by the aircraft. this type of filtering is not unique to the rades data; it is also supported by the opensky network data. additionally, due to the performance of the rades sensors, we filtered out any observations below 500 feet agl due to position uncertainties associated with radar time-of-arrival measurements. observations of ads-b equipped aircraft by the opensky network differ because ads-b enables aircraft to broadcast their own estimate of their location, which is often based on precise gnss measurements. the improved position reporting of ads-b enabled the new opensky network-based models to be trained with an altitude floor of 50 feet agl instead of 500. specifically, three new statistical models of aircraft behavior were trained, one each for the aircraft types fixed wing multi-engine, fixed wing single-engine, and rotorcraft. a key advantage of these models is the data and dimensionality reduction. a model was created for each of the three aircraft types and stored as a human-readable text file, each requiring approximately 0.5 megabytes. this is a significant reduction from the hundreds of gigabytes used to store the original 85 days of data. table iv reports the quantity of data used to train each model. for example, the rotorcraft model was trained from about 25,000 flight hours over 85 days.
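an encounter model is a bayesian network, and sampling from one is ancestral: parents first, children conditioned on the sampled parent values. the toy two-variable network below is invented for illustration and is far smaller than the trained models, but it samples the same way and hints at why the model files are so compact:

```python
import random

# Toy discrete Bayesian network: altitude_layer -> speed_bin.
# All probabilities are invented for illustration; the real encounter
# models have many more variables plus transition dynamics.
P_ALT = {"low": 0.5, "mid": 0.3, "high": 0.2}
P_SPEED_GIVEN_ALT = {
    "low": {"slow": 0.7, "fast": 0.3},
    "mid": {"slow": 0.5, "fast": 0.5},
    "high": {"slow": 0.2, "fast": 0.8},
}

def sample_categorical(table, rng):
    """Draw one value from a {value: probability} table."""
    u = rng.random()
    acc = 0.0
    for value, p in table.items():
        acc += p
        if u <= acc:
            return value
    return value  # guard against floating-point shortfall

def sample_encounter_state(rng):
    """Ancestral sampling: sample parents first, then children
    conditioned on the sampled parent values."""
    alt = sample_categorical(P_ALT, rng)
    speed = sample_categorical(P_SPEED_GIVEN_ALT[alt], rng)
    return alt, speed
```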
however, like the rades-based model, these models do not represent the geospatial nor temporal distribution of the training data. for example, a limitation of these models is that they do not indicate whether more aircraft were observed in new york city than in los angeles. figures comparing the new models with the rades-based model [17] illustrate how different aircraft behave, such as rotorcraft flying relatively lower and slower than fixed wing multi-engine aircraft. also note that the rades-based model has no altitude observations below 500 feet agl, whereas 18% of the approximately 25,000 rotorcraft flight hours were observed at 50-500 feet agl. it has not been assessed whether the opensky network-based models can be used as surrogates for other aircraft types or operations. additionally, the new models do not fully supersede the existing rades-based models, as each model represents a different variety of aircraft behavior. on github.com, please refer to the mit lincoln laboratory (@mit-ll) and airspace encounter models (@airspace-encounter-models) organizations.
airspace encounter models for estimating collision risk
safety analysis of upgrading to tcas version 7.1 using the 2008 u.s.
correlated encounter model
well-clear recommendation for small unmanned aircraft systems based on unmitigated collision risk
bringing up opensky: a large-scale ads-b sensor network for research
developing a low altitude manned encounter model using ads-b observations
vision on aviation surveillance systems
ads-mode s: initial system description
representative small uas trajectories for encounter modeling
interactive supercomputing on 40,000 cores for machine learning and data analysis
mode s: an introduction and overview (secondary surveillance radar)
introducing natural earth data, naturalearthdata.com
llmapreduce: multi-level map-reduce for high performance data analysis
the global land one-kilometer base elevation (globe) digital elevation model, version 1.0
using structural topic modeling to identify latent topics and trends in aviation incident reports
temporal topic modeling applied to aviation safety reports: a subject matter expert review
ontologies for aviation data management, ieee/aiaa 35th digital avionics systems conference (dasc)
uncorrelated encounter model of the national airspace system, version 2.0
correlated encounter model for cooperative aircraft in the national airspace system version 2.0
we greatly appreciate the support and assistance provided by sabrina saunders-hodge, richard lin, and adam hendrickson from the federal aviation administration. we would also like to thank fellow colleagues dr. rodney cole, matt edwards, and wes olson. key: cord-268959-wh28s0ws authors: gao, da-peng; huang, nan-jing title: optimal control analysis of a tuberculosis model date: 2017-12-29 journal: appl math model doi: 10.1016/j.apm.2017.12.027 sha: doc_id: 268959 cord_uid: wh28s0ws in this paper, we extend the model of liu and zhang (math comput model 54:836-845, 2011) by incorporating three control terms and applying optimal control theory to the resulting model.
optimal control strategies are proposed to minimize both the disease burden and the intervention cost. we prove the existence and uniqueness of optimal control paths and obtain these optimal paths analytically using pontryagin's maximum principle. we analyse our results numerically to compare various strategies of the proposed controls. it is observed that the implementation of all three controls is the most effective and least expensive among the strategies. thus, we conclude that in order to reduce the tuberculosis threat all three controls must be applied concurrently. tuberculosis (tb) is a major global health threat caused by mycobacterium tuberculosis (mtb) that most often affects the lungs (pulmonary tb) but can affect other sites as well (extrapulmonary tb). fortunately, tb is a preventable and curable disease. according to the world health organization (who), more than 51 million patients have been successfully treated in countries that adopted the who strategy since 1995 [1]. however, the global burden of tb remains enormous. in 2011, there were an estimated 8.7 million new cases of tb (13% co-infected with hiv) and 1.4 million people died from tb [1]. in 2013, there were 9 million new tb cases and 1.5 million tb deaths (of which 0.4 million were hiv-positive patients) [2]. the increase in new cases stems from multiple factors such as the spread of hiv, the collapse of public health programs, the emergence of drug-resistant strains of mtb [3] [4] [5] and exogenous re-infections, where a latently-infected individual acquires a new infection from another infectious individual (see, for example, [6, 7] and references therein). in the absence of an effective vaccine, the main tb control approach is chemoprophylaxis treatment. poor compliance with drug treatment might result in relapse and antibiotic-resistant tb, i.e. multidrug-resistant tb (mdr-tb) [8]. mdr-tb is one of the most serious health problems, and progress in curing it is slow.
critical funding gaps for tb care and control make it difficult to sustain recent gains, promote further research and develop new drugs and vaccines [1]. it is well known that mathematical models play vital roles in the dynamics and control of many epidemics including malaria, severe acute respiratory syndrome (sars) and tb (see, for example, [9] [10] [11] [12] [13] [14] [15] [16] [17] and references therein). many mathematical dynamic models for tb have been studied extensively in the literature; for instance, we refer the reader to [18] [19] [20] [21] [22] [23]. most models consider two different routes to active disease after infection: "fast progressors" and "slow progressors". some recent models also take into account the probability that latent and treated individuals might be reinfected [24], since it is well recognized that infection and/or disease do not confer full protection [25]. some previous models of tb, especially the predictive models, endeavor to calculate a threshold called the basic reproductive number. the dynamics of transmission of the disease has been analyzed in terms of the reduction of the basic reproductive number. since the first paper published by jung et al. [8] in 2002, time-dependent optimal control strategies have been employed in the study of the dynamics of tb mathematical models by many authors (see, for example, [8, [26] [27] [28] [29] [30] [31] [32] [33] [34]). valuable theoretical results generated by both approaches to studying control strategies can be used to guide epidemic control programs. various objective criteria may be adopted, according to a chosen goal (or goals).
in 2011, considering that the treated individuals are also infectious, liu and zhang [24] proposed a nonlinear tb model system (1.1) in which a population of size n(t) is partitioned into five subclasses: susceptible (s(t)), vaccinated (v(t)), individuals infected with tb in the latent stage (l(t)), individuals infected with tb in the active stage (i(t)) and treated individuals infected with tb (t(t)). susceptible individuals are recruited at a constant rate. the parameter p is the rate at which susceptible individuals are moved into the vaccination process. the natural death rate is μ. the disease-induced death rate coefficient for individuals in compartment i is α. the parameter ρ represents the rate at which individuals successfully treated of tb return to the latent tb stage, and the parameter γ is the treatment rate in the infective class. it is assumed that the time before latent individuals become infectious obeys an exponential distribution, with mean waiting time 1/δ. thus, δ is the rate at which an individual leaves the latent compartment to become infectious. susceptible individuals acquire tb infection from individuals with active tb at rate βs(i + ρ1t), where β is the disease transmission coefficient and the parameter ρ1 < 1 accounts for the reduction in infectiousness among individuals with active tb who are treated (in comparison to those who are not treated). a fraction l of susceptible individuals who acquire tb infection is assumed to enter the latent tb class (l), at the rate lβs(i + ρ1t), while the remainder moves to the active tb class (i). it is assumed that individuals in the latent class do not transmit infection. the vaccinated individuals are infected at rate ρ2βv(i + ρ1t), where ρ2 < 1 represents the reduction in the risk of infection owing to vaccination (for a vaccine that offers 100% protection, ρ2 = 0; thus in reality 0 < ρ2 < 1).
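the display of system (1.1) did not survive extraction. based on the parameter descriptions above, a plausible reconstruction (writing the recruitment rate as $\Lambda$, whose symbol was lost, and assuming the fraction $l$ splits new infections of vaccinated individuals the same way as susceptibles) is:

```latex
\begin{aligned}
\frac{dS}{dt} &= \Lambda - \beta S (I + \rho_1 T) - (\mu + p) S,\\
\frac{dV}{dt} &= p S - \rho_2 \beta V (I + \rho_1 T) - \mu V,\\
\frac{dL}{dt} &= l \beta S (I + \rho_1 T) + l \rho_2 \beta V (I + \rho_1 T) + \rho T - (\mu + \delta) L,\\
\frac{dI}{dt} &= (1 - l) \beta S (I + \rho_1 T) + (1 - l) \rho_2 \beta V (I + \rho_1 T) + \delta L - (\mu + \alpha + \gamma) I,\\
\frac{dT}{dt} &= \gamma I - (\mu + \rho) T.
\end{aligned}
```

each term corresponds directly to one parameter description in the text; the sketch is for orientation only and is not the authors' published display.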
for model (1.1), liu and zhang [24] first obtained the basic reproduction number r 0 . after that, they showed that if the basic reproduction number r 0 of the model system is below one then the disease dies out; otherwise it persists. this indicates that the basic reproduction number r 0 plays a crucial role in characterizing the dynamics of the disease and could be used to suggest or design tb control strategies. however, they did not further investigate time-dependent control strategies, since their discussion concentrated on the prevalence of tb at equilibria. note that ∂r 0 /∂p < 0 for model (1.1), also due to liu and zhang [24], which implies that increasing the vaccination rate of susceptible individuals against infection of tb is conducive to disease control. in order to understand under what circumstances tb can be controlled or curtailed, we implement the theory of optimal control. three intervention strategies, called controls, are included in model system (1.1). we write the controls as functions of time and assign reasonable upper and lower bounds to them. first, a "case finding" (identification of latently infected individuals) control mechanism is incorporated in model (1.1) by replacing the constant vaccination rate with u 1 (t). second, a "case holding" (treatment of individuals with active tb) control mechanism is incorporated in model (1.1) by replacing the constant successful-treatment rate with u 2 (t). finally, another "case holding" control is instituted in model (1.1) by replacing the constant treatment rate with u 3 (t). it is required that u 1 (t), u 2 (t) and u 3 (t) are bounded lebesgue-integrable functions. we intend to determine optimal control strategies that minimize not only the infected individuals but also the cost of these three controls. as mentioned by kumar and srivastava [14], the larger the proportion of the population covered, the higher the effort and expenditure required to provide vaccination.
hence, it is suitable to use a nonlinearity of order four to describe the high expense associated with the vaccination process. following the idea of kumar and srivastava [14], in this paper a cost construction that accounts for a highly nonlinear relationship (nonlinearity of order four, i.e. u 1 (t)^4) between cost and effort during the vaccination process is employed in the application of optimal control to tb mathematical models. using the same parameters and class names as in model system (1.1), the system of differential equations describing the controlled model is denoted (1.2). in view of the fact that vaccination and treatment are easily available and implementable control strategies to stop tb, we obtain model system (1.2) by considering vaccination and treatment as control variables in model system (1.1). the detailed analysis of the optimal control problem is given in the following sections. the remainder of this paper is organized as follows: we present the formulation of the optimal control problem in section 2. in section 3, we investigate the existence of an optimal control function and derive an optimality system characterizing the optimal control. in section 4, we analyze the uniqueness of the optimality system, in which the state system is coupled with the co-states. it is worth pointing out that, as far as we know, the proof of existence and uniqueness involving a biquadratic form of the control variables has not been explored before. we perform numerical simulations to illustrate our theoretical results in section 5. the paper ends with a conclusion in section 6. this section is devoted to the study of our model system (1.2) when vaccination and treatment policies are administered over a fixed time window. to that end, we design an objective functional that reflects the biomedical goal.
our goal is to minimize the number of individuals infected with tb (including latent, infectious and treated individuals) while at the same time keeping the cost of implementing the three control strategies very low. from a mathematical perspective, for a fixed terminal time t f , the problem is to minimize an objective functional (2.2). here, the total cost on a finite time horizon [0, t f ] consists of the cost induced by the disease itself and the cost induced by the vaccination and treatment efforts. we split the disease cost into the cost induced by the latent group, proportional to the number of individuals infected with tb in the latent stage; the cost induced by the infected group, proportional to the number of individuals infected with tb in the active stage; and the cost induced by the treated group, proportional to the number of treated individuals infected with tb. modeling the control cost is a more delicate issue. the cost involved in the vaccination process requires relatively higher effort and expenditure because a larger proportion of the population is covered. for example, the vaccination effort targeting the first quarter of the population will be less than that for the next quarter, and so on. therefore, the cost of covering a larger proportion of the population in the vaccination process will not only increase, but its growth will be faster and steeper. hence, we consider a biquadratic form in the vaccination control to represent the high expense of the vaccination process [16]. it is also assumed that the cost of treatment is nonlinear and takes a quadratic form, which is consistent with previous works in the literature (see, e.g. [8]).
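the displayed functional (2.2) was lost in extraction; with hypothetical weight constants $A_i$ for the disease burden and $B_i$ for the control efforts, the objective described above takes the form:

```latex
J(u_1, u_2, u_3) = \int_0^{t_f} \Big[ A_1 L(t) + A_2 I(t) + A_3 T(t)
  + B_1 u_1^4(t) + B_2 u_2^2(t) + B_3 u_3^2(t) \Big] \, dt
```

the biquadratic term $B_1 u_1^4$ encodes the steeply rising vaccination cost discussed above, while the quadratic treatment terms follow [8]; the weight symbols are an assumption for illustration.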
in other words, the cost incurred in improving the successful-treatment rate (efforts such as ensuring regularity of drug intake) and the cost incurred in providing treatment are both taken to be quadratic. the associated coefficients are the weight constants of the infected tb individuals and the control measures; they can be chosen to balance cost factors due to the size and importance of the parts of the objective functional. we assume that there are practical limitations on the maximum rate at which individuals may be vaccinated or treated during a given period of time and on the maximum successful-treatment rate. in the rest of this paper, for simplicity of notation, we write u 1 (t) = u 1 , u 2 (t) = u 2 and u 3 (t) = u 3 . we seek optimal controls u 1 , u 2 and u 3 in the admissible control set u ad . in optimal control theory, several basic problems arise, such as proving the existence of an optimal control, characterizing the optimal control, proving the uniqueness of the control, computing the optimal control numerically and investigating how the optimal control depends on various parameters in the model. in this section, we study a sufficient condition for the existence of an optimal control of our system of ordinary differential equations (1.2). we refer to the conditions in theorem iii.4.1 and its corresponding corollary in [35]. after that, we characterize the optimal control functions by using pontryagin's maximum principle [36, 37], and then we derive necessary conditions for our control problem. in this subsection, we investigate the existence of an optimal control of our model (1.2). the boundedness of solutions to system (1.2) on a finite time interval is needed to establish the existence of an optimal control and to prove the uniqueness of the optimality system. we begin by examining the a priori boundedness of the state solutions of model (1.2).
for model system (1.2) to be epidemiologically meaningful, it is important to prove that all its state variables are nonnegative for all time. in other words, solutions of model system (1.2) with nonnegative initial data remain nonnegative for all time t > 0. suppose, for example, that the variable s becomes zero at some time; its derivative there is nonnegative, so the solution cannot leave the nonnegative orthant. to establish the upper bounds for the solutions, we consider an equation for the total population size n. the rate of change of the total population, obtained by adding all the equations in model (1.2), yields a differential inequality from which boundedness follows. proof. to prove this theorem, we follow the requirements of theorem 4.1 and corollary 4.1 in [35] and verify the nontrivial requirements. let r(t, x, u) be the right-hand side of (1.2). we need to show that the following conditions are satisfied: (1) r is of class c 1 and is bounded linearly in the state and control by some constant c; (2) the admissible set f of all solutions to system (1.2) with corresponding control in u ad is nonempty. it is easy to see that r(t, x, u) is of class c 1 and that |r(t, 0, 0)| is bounded. moreover, since s, v, l, i and t are bounded, there exists a constant c for which the required linear bound holds; this means that condition (1) holds. thanks to condition (1), there exists a unique solution to system (1.2) for a constant control, which further implies that condition (2) holds. in addition, condition (4) is obvious from the definition, and the proof of this theorem can be completed by verifying condition (5), the convexity of the integrand g(t, x, u) in the objective functional, and so the proof is complete. since there exists an optimal control minimizing functional (2.2) subject to system (1.2), we derive the necessary conditions for this optimal control by using pontryagin's maximum principle [36, 37]. we define the hamiltonian (3.1) for deriving the necessary conditions; here, λ = (λ 1 , λ 2 , λ 3 , λ 4 , λ 5 ) ∈ r 5 is known as the adjoint variable.
the optimality system of equations is found by taking the appropriate partial derivatives of the hamiltonian (3.1) with respect to the associated state variables. the following theorem is a consequence of the maximum principle. given an optimal control triple (u * 1 , u * 2 , u * 3 ) and corresponding solutions of the state system s * , v * , l * , i * , t * that minimize objective functional (2.2), there exist adjoint variables λ 1 , λ 2 , λ 3 , λ 4 and λ 5 satisfying the adjoint system with transversality conditions, together with the characterization of the controls. proof. the result follows from a direct application of a version of pontryagin's maximum principle for bounded controls [36, 37]. the differential equations governing the adjoint variables are obtained by differentiating the hamiltonian function (3.1), evaluated at the optimal control. the adjoint system can then be written as (3.2), and the characterizations of u i (i = 1, 2, 3) follow from the optimality conditions. we point out that the optimality system consists of the state system (1.2) with the initial conditions, the adjoint system (3.2) with transversality conditions (3.3), and the optimality condition (3.4); together these form the optimality system (3.7). due to the boundedness of the state system, the adjoint system has bounded coefficients and is linear in each of the adjoint variables. therefore, the adjoint variables have finite upper bounds. we state the following proposition (without proof), needed for the proof of the uniqueness of the optimality system over a small time window. proposition 4.1 ([13]). the function u * (m) = min{max{m, a}, b} is lipschitz continuous in m, where a < b are fixed nonnegative constants. now we are in a position to prove the uniqueness of the optimality system in the same way as shown in [12, 13, 17]. proof. the proof is by contradiction. note that the boundedness of the state and adjoint systems, the lipschitz continuity of the optimal control functions, and some elementary inequalities play key roles in obtaining a contradiction.
for ease of notation, we generally omit the dependence on time in what follows, except where a specific time is intended. we consider the differences of the corresponding equations for the two candidate solutions x i and y i , i = 1, 2, . . . , 5. from the first equation of (3.7), we obtain the equations for dx 1 /dt for each candidate solution; subtracting them yields (4.1). multiplying both sides of (4.1) by the difference of the two x 1 components and integrating from 0 to t f , one obtains (4.2). in order to simplify the right-hand expressions of (4.2), we need some elementary inequalities. from an elementary inequality it follows that |a − b| 6 ≤ 32(a 6 + b 6 ), and by the elementary inequality (a + b) 2 ≤ 2(a 2 + b 2 ) we obtain an estimate in which the constant c depends on the bounds for x 1 and y 1 . another inequality is also needed to obtain the specific estimate, in which m i (i = 1, 2, . . . , 8) and n j (j = 1, 2, . . . , 9) depend on the coefficients and the bounds of the state variables and co-state variables. we employed parameter values similar to those reported in previous studies whenever possible; table 1 provides a summary of the parameter values used in the numerical simulation of model (3.7). due to a lack of data, we make hypotheses on some parameter values within realistic ranges. the initial values for the states are also given in table 1. we start by comparing the population with optimal control actions and without control actions. in figs. 1-5, the population with optimal control actions is shown with solid lines, compared with the population without control actions shown with dashed lines. we observe from fig. 1 that the susceptible population is higher under control compared to the case without control. the results in fig. 2 predict a higher level of vaccinated population for the aforementioned tuberculosis intervention strategies under control compared to the case without control. the results depicted in figs.
3 and 4 clearly suggest that the optimal control results are very effective in controlling the individuals infected with tb in the latent stage and in the active stage, as expected. in fig. 5, the population of treated humans declines sharply in the first 18 years and decreases further after that, while fig. 4 shows a slight increase in the last two years of the intervention strategy. this is due to the fact that the treatment control u 3 becomes zero after 18 years. the simulation results in fig. 6 suggest that this strategy would require the vaccination control u 1 to be at its maximum for almost the entire period of intervention, while the treatment controls u 2 and u 3 should maintain maximum effort for 13 years and 18 years, respectively, before decreasing to zero. in what follows, we consider different control strategies for the reduction of infected individuals (including latent, infectious and treated individuals). we observe in fig. 7 that the infected individuals are fewest when all three controls are considered. as cost is one of the important criteria in deciding on an applicable strategy, we performed a cost analysis and made a comparative study of the cost of the various strategies. the cost profiles for the various control strategies are given in fig. 8. note that the cost is least when all three controls are considered; that is, implementing the three control measures to restrict the epidemic while at the same time keeping the cost of vaccination and treatment very low is a good policy for the achievement of our goal. in fig. 9 we present the effect of varying the disease transmission coefficient β on the number of individuals infected with tb in the active stage (i(t)). we observe that after about 12 years the effect of β on the individuals infected with tb in the active stage is negligible. fig. 10 shows the effect of varying the fraction coefficient l on the number of individuals infected with tb in the active stage (i(t)).
it is easy to see that the individuals infected with tb in the active stage are strongly affected by l. in this work, we have studied a simple mathematical model of tb presented by liu and zhang [24]. in order to control the spread of tb, we applied a preventive control in the form of vaccination, and two treatment controls, to susceptible, latent, and actively infected individuals. next, we established the objective functional, keeping the fundamental biomedical goals in mind. theoretically, we proved the existence of an optimal control and studied its characterization by pontryagin's maximum principle. to ensure that the depicted optimal control is unique, we investigated the uniqueness of the optimality system. numerically, the simulations of the extended tb model (1.2) indicate that the vaccination and treatment strategies are effective in reducing tb disease transmission; in particular, maximum disease control may be obtained at minimum cost, as shown in figs. 7 and 8. it can therefore be concluded that the comprehensive effect of all three controls not only minimizes the cost burden but also keeps the infective population in check. undoubtedly, the dynamics of tb infection are far more complicated and varied than those captured by this mathematical model. therefore, we will incorporate a time delay for drug resistance into the present model so as to provide a better assessment of tb dynamics. in addition, we note that the principal technique for an optimal control problem is to solve a set of "necessary conditions" that an optimal control and the corresponding state must satisfy. if the hamiltonian is linear in the control variable u, it would be difficult to solve for u* from the optimality equation. we leave these cases as future work.
• tuberculosis control: past 10 years and future progress
• evolution of who, 1948-2001 policies for tuberculosis control
• for the global surveillance and monitoring project: assessment of worldwide tuberculosis control
• exogenous reinfection in tuberculosis
• a model for tuberculosis with exogenous reinfection, theor
• optimal control of treatments in a two-strain tuberculosis model
• optimal control of vector-borne diseases: treatment and prevention
• the intrinsic transmission dynamics of tuberculosis epidemics
• dynamical models of tuberculosis and their applications
• optimal control applied to vaccination and treatment strategies for various epidemiological models
• the combined effects of optimal control in cancer remission
• vaccination and treatment as control interventions in an infectious disease model with their cost optimization
• optimal control of an influenza model with seasonal forcing and age-dependent transmission rates
• keeping options open: an optimal control model with trajectories that reach a dnss point in positive time
• an optimal strategy for hiv multitherapy
• control strategies for tuberculosis epidemics: new models for old problems
• mathematical models for the disease dynamics of tuberculosis
• modeling epidemics of multidrug-resistant m. tuberculosis of heterogeneous fitness
• prospects for worldwide tuberculosis control under the who dots strategy
• directly observed short-course therapy
• the reinfection threshold promotes variability in tuberculosis epidemiology and vaccine efficacy
• the natural history of tuberculosis: the implications of age-dependent risks of disease and the role of reinfection
• global stability for a tuberculosis model
• rate of reinfection tuberculosis after successful treatment is higher than rate of new tuberculosis
• optimal intervention strategies for tuberculosis
• optimal intervention strategy for prevention of tuberculosis using a smoking-tuberculosis model
• optimal control for a tuberculosis model with undetected cases in cameroon
• modeling the impact of early therapy for latent tuberculosis patients and its optimal control analysis
• cost-effectiveness analysis of optimal control measures for tuberculosis
• optimal control applied to tuberculosis models. the iea-eef european congress of epidemiology 2012: epidemiology for a fair and healthy society
• optimal control for a tuberculosis model with reinfection and post-exposure interventions
• global stability and optimal control for a tuberculosis model with vaccination and treatment
• deterministic and stochastic optimal control
• mathematical theory of optimal processes
the authors are grateful to the editor and the referees for their valuable comments and suggestions. 3[(x_1 − x̄_1)^2 + (y_1 − ȳ_1)^2 + (y_2 − ȳ_2)^2]. (iv) case of x̄_1(y_1 − y_2) < x_1(ȳ_1 − ȳ_2) < 0. by aid of the elementary inequalities, an easy induction gives the required bound, where k ≥ 1 + 2d. in order to simplify the right-hand side of (4.2), another common expression can be used repeatedly, where c depends on bounds for x and y. based on the above arguments, we find an estimate in which m_1 and n_1 are appropriate upper bounds. similarly, we can obtain the corresponding inequalities for (x_i(t_f), x̄_i(t_f)) and (y_j(0), ȳ_j(0)) with i = 2, 3, 4, 5 and j = 1, 2, 3, 4, 5. it now follows from (4.5)-(4.14) that the coefficient of ∫_0^{t_f} (x_1 − x̄_1)^2 dt will be nonnegative.
similar arguments apply to the remaining integral terms, so we can obtain all of the other λ's and t_f's. taking the maximum of all of the λ's used as λ and the minimum of the t_f's used as t_f, the coefficient of each integral term in (4.15) is nonnegative. this implies that the solution of (3.7) is unique for small time. this ends the proof. remark 4.1. we note that theorem 4.1 shows that the solution of (3.7) is unique when t_f is small. as pointed out by lenhart and workman [36], we can usually obtain uniqueness of the solutions for (3.7), but only for small time t_f. this small-time condition is due to the opposite time orientations of the state equation and the adjoint equation. in this section, a fourth-order runge-kutta algorithm is used to solve the optimality system (3.7) numerically. with initial conditions for the states and an initial guess chosen for the controls, the state differential equations (1.2) are solved forward in time by a fourth-order runge-kutta scheme. with the application of the current iteration's solution of the state equations (1.2) and the transversality conditions (3.3), the adjoint system (3.2) is solved backward in time, again by a fourth-order runge-kutta scheme. both the state and adjoint values are used to update the controls with the characterizations given by (3.4), and then the iterative process is repeated. this procedure is continued iteratively until the current state, adjoint, and control values converge sufficiently. key: cord-132843-ilxt4b6g authors: zhao, liang title: event prediction in the big data era: a systematic survey date: 2020-07-19 journal: nan doi: nan sha: doc_id: 132843 cord_uid: ilxt4b6g events are occurrences in specific locations, time, and semantics that nontrivially impact either our society or nature, such as civil unrest, system failures, and epidemics. it is highly desirable to be able to anticipate the occurrence of such events in advance in order to reduce the potential social upheaval and damage caused.
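the forward-backward sweep described above (state equations forward in time, adjoint equations backward, control update from the optimality characterization, repeated until convergence) can be sketched on a toy scalar problem. the problem below (minimize ∫ (x^2 + u^2) dt subject to dx/dt = x + u), the damping factor, and the step counts are our own illustrative choices, not the tb model itself:

```python
import numpy as np

def rk4_step(f, t, y, h):
    """one classical fourth-order runge-kutta step of size h (h may be negative)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def forward_backward_sweep(x0=1.0, tf=1.0, n=200, tol=1e-8, max_iter=1000):
    """minimize integral of (x^2 + u^2) subject to dx/dt = x + u, x(0) = x0.
    hamiltonian h = x^2 + u^2 + lam*(x + u) gives the optimality condition
    u* = -lam/2 and the adjoint equation lam' = -(2x + lam), lam(tf) = 0."""
    h = tf / n
    t = np.linspace(0.0, tf, n + 1)
    u = np.zeros(n + 1)              # initial guess for the control
    x = np.zeros(n + 1)
    lam = np.zeros(n + 1)
    for _ in range(max_iter):
        u_old = u.copy()
        x[0] = x0                    # 1) state solved forward in time
        for i in range(n):
            f = lambda s, z: z + np.interp(s, t, u)
            x[i + 1] = rk4_step(f, t[i], x[i], h)
        lam[n] = 0.0                 # 2) adjoint solved backward in time
        for i in range(n, 0, -1):
            g = lambda s, p: -(2.0 * np.interp(s, t, x) + p)
            lam[i - 1] = rk4_step(g, t[i], lam[i], -h)
        # 3) control update from u* = -lam/2, damped for stability
        u = 0.5 * u_old + 0.5 * (-lam / 2.0)
        if np.max(np.abs(u - u_old)) < tol:
            return t, x, u, True
    return t, x, u, False
```

the same loop structure carries over to a system like (3.7): only the state and adjoint right-hand sides and the control characterization change, with one state/adjoint component per compartment.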
event prediction, which has traditionally been prohibitively challenging, is now becoming a viable option in the big data era and is thus experiencing rapid growth. there is a large amount of existing work that focuses on addressing the challenges involved, including heterogeneous multi-faceted outputs, complex dependencies, and streaming data feeds. most existing event prediction methods were initially designed to deal with specific application domains, though the techniques and evaluation procedures utilized are usually generalizable across different domains. however, it is imperative yet difficult to cross-reference the techniques across different domains, given the absence of a comprehensive literature survey for event prediction. this paper aims to provide a systematic and comprehensive survey of the technologies, applications, and evaluations of event prediction in the big data era. first, systematic categorization and summary of existing techniques are presented, which facilitate domain experts' searches for suitable techniques and help model developers consolidate their research at the frontiers. then, comprehensive categorization and summary of major application domains are provided. evaluation metrics and procedures are summarized and standardized to unify the understanding of model performance among stakeholders, model developers, and domain experts in various application domains. finally, open problems and future directions for this promising and important domain are elucidated and discussed. and is the focus of this survey. accurate anticipation of future events enables one to maximize the benefits and minimize the losses associated with some event in the future, bringing huge benefits for both society as a whole and individual members of society in key domains such as disease prevention [167], disaster management [140], business intelligence [226], and economic stability [24]. "prediction is very difficult, especially if it's about the future.
" -niels bohr, 1970 event prediction has traditionally been prohibitively challenging across different domains, due to the lack or incompleteness of our knowledge regarding the true causes and mechanisms driving event occurrences in most domains. with the advent of the big data era, however, we now enjoy unprecedented opportunities that open up many alternative approaches for dealing with event prediction problems, sidestepping the need to develop a complete understanding of the underlying mechanisms of event occurrence. based on large amounts of data on historical events and their potential precursors, event prediction methods typically strive to apply predictive mapping to build on these observations to predict future events, utilizing predictive analysis techniques from domains such as machine learning, data mining, pattern recognition, statistics, and other computational models [16, 26, 92] . event prediction is currently experiencing extremely rapid growth, thanks to advances in sensing techniques (physical sensors and social sensors), prediction techniques (artificial intelligence, especially machine learning), and high performance computing hardware [78] . event prediction in big data is a difficult problem that requires the invention and integration of related techniques to address the serious challenges caused by its unique characteristics, including: 1) heterogeneous multi-output predictions. event prediction methods usually need to predict multiple facets of events including their time, location, topic, intensity, and duration, each of which may utilize a different data structure [171] . this creates unique challenges, including how to jointly predict these heterogeneous yet correlated facets of outputs. due to the rich information in the outputs, label preparation is usually a highly labor-intensive task performed by human annotators, with automatic methods introducing numerous errors in items such as event coding. 
so, how can we improve the label quality as well as the model robustness under corrupted labels? the multi-faceted nature of events makes event prediction a multi-objective problem, which raises the question of how to properly unify the prediction performance on different facets. it is also challenging to verify whether a predicted event "matches" a real event, given that the various facets are seldom, if ever, 100% accurately predicted. so, how can we set up the criteria needed to discriminate between a correct prediction ("true positive") and a wrong one ("false positive")? 2) complex dependencies among the prediction outputs. beyond conventional isolated tasks in machine learning and predictive analysis, in event prediction the predicted events can correlate to and influence each other [142]. for example, an ongoing traffic incident event could cause congestion on the current road segment in the first 5 minutes but then lead to congestion on other contiguous road segments 10 minutes later. global climate data might indicate a drought in one location, which could then cause famine in the area and lead to a mass exodus of refugees moving to another location. so, how should we consider the correlations among future events? 3) real-time stream of prediction tasks. event prediction usually requires continuous monitoring of the observed input data in order to trigger timely alerts of future potential events [182]. however, during this process the trained prediction model gradually becomes outdated, as real-world events continually change dynamically, concepts are fluid, and distribution drifts are inevitable. for example, in september 2008, 21% of the united states population were social media users, including 2% of those over 65. however, by may 2018, 72% of the united states population were social media users, including 40% of those over 65 [37]. not only the data distribution but also the number of features and input data sources can vary in real time.
hence, it is imperative to periodically upgrade the models, which raises further questions concerning how to train models on non-stationary distributions while balancing costs (such as computation and data annotation costs) against timeliness. in addition, event prediction involves many other common yet open challenges, such as imbalanced data (for example, data that lacks positive labels in rare event prediction) [206], data corruption in inputs [248], the uncertainty of predictions [25], longer-term predictions (including how to trade off prediction accuracy against lead time) [27], trade-offs between precision and recall [171], and how to deal with high dimensionality [247] and sparse data involving many unrelated features [208]. event prediction problems provide unique testbeds for jointly handling such challenges. in recent years, a considerable amount of research has been devoted to event prediction technique development and applications, in order to address the aforementioned challenges [157]. recently, there has been a surge of research that both proposes and applies new approaches in numerous domains, though event prediction techniques are generally still in their infancy. most existing event prediction methods have been designed for specific application domains, but their approaches are usually general enough to handle problems in other application domains. unfortunately, it is difficult to cross-reference these techniques across different application domains serving totally different communities. moreover, the quality of event prediction results requires sophisticated and specially-designed evaluation strategies due to the subject matter's unique characteristics, for example its multi-objective nature (e.g., accuracy, resolution, efficiency, and lead time) and heterogeneous prediction results (e.g., heterogeneity and multi-output).
as yet, however, we lack systematic standardization and comprehensive summarization approaches with which to evaluate the various event prediction methodologies that have been proposed. this absence of a systematic summary and taxonomy of existing techniques and applications in event prediction causes major problems for those working in the field who lack clear information on the existing bottlenecks, traps, open problems, and potentially fruitful future research directions. to overcome these hurdles and facilitate the development of better event prediction methodologies and applications, this survey paper aims to provide a comprehensive and systematic review of the current state of the art for event prediction in the big data era. the paper's major contributions include: • a systematic categorization and summarization of existing techniques. existing event prediction methods are categorized according to their event aspects (time, location, and semantics), problem formulation, and corresponding techniques to create the taxonomy of a generic framework. relationships, advantages, and disadvantages among different subcategories are discussed, along with details of the techniques under each subcategory. the proposed taxonomy is designed to help domain experts locate the most useful techniques for their targeted problem settings. • a comprehensive categorization and summarization of major application domains. the first taxonomy of event prediction application domains is provided. the practical significance and problem formulation are elucidated for each application domain or subdomain, enabling it to be easily mapped to the proposed technique taxonomy. this will help data scientists and model developers to search for additional application domains and datasets that they can use to evaluate their newly proposed methods, and at the same time expand their advanced techniques to encompass new application domains. • standardized evaluation metrics and procedures.
due to the nontrivial structure of event prediction outputs, which can contain multiple fields such as time, location, intensity, duration, and topic, this paper proposes a set of standard metrics with which to standardize existing ways to pair predicted events with true events. then additional metrics are introduced and standardized to evaluate the overall accuracy and quality of the predictions to assess how close the predicted events are to the real ones. • an insightful discussion of the current status of research in this area and future trends. based on the comprehensive and systematic survey and investigation of existing event prediction techniques and applications presented here, an overall picture and the shape of the current research frontiers are outlined. the paper concludes by presenting fresh insights into the bottlenecks, traps, and open problems, as well as a discussion of possible future directions. this section briefly outlines previous surveys in various domains that have some relevance to event prediction in big data in three categories, namely: 1. event detection, 2. predictive analytics, and 3. domain-specific event prediction. event detection has been an extensively explored domain over many years. its main purpose is to detect historical or ongoing events rather than to predict as yet unseen events in the future [181, 222]. event detection typically focuses on pattern recognition [26], anomaly detection [92], and clustering [92], which are tasks very different from those in event prediction. there have been several surveys of research in this domain in the last decade [9, 15, 63, 146]. for example, deng et al. [63] and atefeh and khreich [15] provided overviews of event extraction techniques in social media, while michelioudakis et al. [146] presented a survey of event recognition with uncertainty. alevizos et al. [9] provided a comprehensive literature review of event recognition methods using probabilistic methods.
predictive analysis covers the prediction of target variables given a set of input variables. these target variables are typically homogeneous scalar or vector data for describing items such as economic indices, housing prices, or sentiments. the target variables may not necessarily be values in the future. larose [120] provides a good tutorial and survey for this domain. predictive analysis can be broken down into subdomains such as structured prediction [26], spatial prediction [105], and sequence prediction [88], enabling users to handle different types of structure for the target variable. fülöp et al. [81] provided a survey and categorization of applications that utilize predictive analytics techniques to perform event processing and detection, while jiang [105] focused on spatial prediction methods that predict indices that have spatial dependency. bakır et al. [17] summarized the literature on predicting structural data such as geometric objects and networks, and arias et al. [12], phillips et al. [163], and yu and kak [232] all proposed techniques for predictive analysis using social data. as event prediction methods are typically motivated by specific application domains, there are a number of surveys of event prediction for specific domains such as flood events [47], social unrest [29], wind power ramp forecasting [76], tornado events [68], temporal events without location information [87], online failures [182], and business failures [6]. however, in spite of its promise and its rapid growth in recent years, the domain of event prediction in big data still suffers from the lack of a comprehensive and systematic literature survey covering all its various aspects, including relevant techniques, applications, evaluations, and open problems.
section 3 then presents a taxonomy and comprehensive description of event prediction techniques, after which section 4 categorizes and summarizes the various applications of event prediction. section 5 lists the open problems and suggests future research directions, and this survey concludes with a brief summary in section 6. this section begins by examining the generic denotation and formulation of the event prediction problem (section 2.1) and then considers ways to standardize event prediction evaluations (section 2.2). an event refers to a real-world occurrence that happens at a specific time and location with a specific semantic topic [222]. we can use y = (t, l, s) to denote an event, where its time t ∈ t, its location l ∈ l, and its semantic meaning s ∈ s. here, t, l, and s represent the time domain, location domain, and semantic domain, respectively. notice that these domains need to have very general meanings that cover a wide range of types of entities. for example, the location l can include any features that can be used to locate the place of an event in terms of a point or a neighborhood in either euclidean space (e.g., a coordinate or geospatial region) or non-euclidean space (e.g., a vertex or subgraph in a network). similarly, the semantic domain s can contain any type of semantic features that are useful when elaborating the semantics of an event's various aspects, including its actors, objects, actions, magnitude, textual descriptions, and other profiling information. for example, ("11am, june 1, 2019", "hermosillo, sonora, mexico", "student protests") and ("june 1, 2010", "berlin, germany", "red cross helps pandemics control") denote the time, location, and semantics of two events, respectively.
an event prediction system requires inputs that could indicate future events, called event indicators; these could contain both critical information on occurrences that precede the future event, known as precursors, and irrelevant information [86, 171]. event indicator data can be denoted as x ⊆ t × l × f, where f is the domain of the features other than location and time. if we denote the current time as t_now and define the past time and future time as t^− ≡ {t | t ≤ t_now, t ∈ t} and t^+ ≡ {t | t > t_now, t ∈ t}, respectively, the event prediction problem can now be formulated as follows: definition 2.1 (event prediction). given the event indicator data x ⊆ t^− × l × f and historical event data y_0 ⊆ t^− × l × s, event prediction is a process that outputs a set of predicted future events ŷ ⊆ t^+ × l × s, such that each predicted future event ŷ = (t, l, s) ∈ ŷ satisfies t > t_now. not every event prediction method necessarily focuses on predicting all three domains of time, location, and semantics simultaneously; it may instead predict any subset of them. for example, when predicting a clinical event such as the recurrence of disease in a patient, the event location might not always be meaningful [167], but when predicting outbreaks of seasonal flu, the semantic meaning is already known and the focus is the location and time [27], and when predicting political events, sometimes the location, time, and semantics (e.g., event type, participant population type, and event scale) are all necessary [171]. moreover, due to the intrinsic nature of time, location, and semantic data, the prediction techniques and evaluation metrics for each are necessarily different, as described in the following. event prediction evaluation essentially investigates the goodness of fit of a set of predicted events ŷ against real events y.
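the denotation y = (t, l, s) and the t^−/t^+ split in definition 2.1 above can be sketched directly; the class and function names below are ours, for illustration only:

```python
from typing import NamedTuple

class Event(NamedTuple):
    """an event y = (t, l, s): time, location, and semantic topic."""
    t: float   # time stamp, e.g. days since some epoch
    l: str     # location; could equally be coordinates or a graph node
    s: str     # semantic topic, e.g. "student protests"

def split_past_future(events, t_now):
    """partition events into historical data y0 (t <= t_now, i.e. over t^-)
    and future events to be predicted (t > t_now, i.e. over t^+)."""
    past = [e for e in events if e.t <= t_now]
    future = [e for e in events if e.t > t_now]
    return past, future
```

prediction then amounts to producing ŷ ⊆ t^+ × l × s from indicators and historical events observed over t^−.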
unlike the outputs of conventional machine learning models, such as the simple scalar values used to indicate class types in classification or numerical values in regression, the outputs of event prediction are entities with rich information. before we evaluate the quality of a prediction, we need to first determine the pairs of predictions and labels that will be used for the comparison. hence, we must first optimize the process of matching predictions and real events (section 2.2.1) before evaluating the prediction error and accuracy (section 2.2.2). 2.2.1 matching predicted events and real events. the following two types of matching are typically used: • prefixed matching: the predicted events will be matched with the corresponding ground-truth real events if they share some key attributes. for example, for event prediction at a particular time and location point, we can evaluate the prediction against the ground truth for that time and location. this type of matching is most common when each of the prediction results can be uniquely distinguished along the predefined attributes (for example, location and time) that have a limited number of possible values, so that one-on-one matching between the predicted and real events is easily achieved [1, 244]. for example, to evaluate the quality of a predicted event on june 1, 2019 in san francisco, usa, the true event occurrence on that date in san francisco can be used for the evaluation. • optimized matching: in situations where one-on-one matching is not easily achieved for any event attribute, the quality of the match between the set of predicted events and the set of real events might instead need to be assessed via an optimized matching strategy [168, 171]. for example, consider two predictions, prediction 1: ("9am, june 4, 2019", "nogales, sonora, mexico", "worker strike"), and prediction 2: ("11am, june 1, 2019", "hermosillo, sonora, mexico", "student protests").
the two ground truth events that these can usefully be compared with are real event 1: ("9am, june 1, 2019", "hermosillo, sonora, mexico", "teacher protests"), and real event 2: ("june 4, 2019", "navojoa, sonora, mexico", "general-population protest"). none of the predictions is an exact match for any of the attributes of the real events, so we will need to find a "best" matching among them, which in this case pairs prediction 1 with real event 2 and prediction 2 with real event 1. this type of matching allows some degree of inaccuracy in the matching process by quantifying the distance between the predicted and real events along all the attribute dimensions. the distance metrics are typically either euclidean distance [105] or some other distance metric [92]. some researchers have hired referees to manually check the similarity of semantic meanings [168], but another way is to use event coding to code the events into an event type taxonomy and then consider a match to have been achieved if the event types match [49]. based on the distance between each pair of predicted and real events, the optimal matching will be the one that results in the smallest average distance [157]. however, if there are m predicted events and n real events, then there can be as many as 2^(m·n) possible ways of matching, making it prohibitively difficult to find the optimal solution. moreover, there could be different rules for matching. for example, the "multiple-to-multiple" rule shown in figure 2(a) allows one predicted (real) event to match multiple real (predicted) events [170], while "bipartite matching" only allows one-to-one matching between predicted and real events (figure 2(b)). "non-crossing matching" requires that the real events matched by the predicted events follow the same chronological order (figure 2(c)).
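for small m and n, the minimum-total-distance bipartite matching can be found by brute force over permutations; a sketch (square case only, and a real system would use the hungarian algorithm to avoid the factorial cost):

```python
from itertools import permutations

def best_bipartite_matching(dist):
    """dist[i][j] is the distance between predicted event i and real
    event j (square case, m == n). returns (pairs, total) minimizing
    the summed distance under one-to-one matching."""
    m = len(dist)
    best_pairs, best_total = None, float("inf")
    for perm in permutations(range(m)):   # perm[i] = real event matched to prediction i
        total = sum(dist[i][perm[i]] for i in range(m))
        if total < best_total:
            best_total = total
            best_pairs = [(i, perm[i]) for i in range(m)]
    return best_pairs, best_total
```

on the example above, the cross pairing (prediction 1 with real event 2, prediction 2 with real event 1) wins because it minimizes the summed attribute distances.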
in order to utilize any of these types of matching, researchers have suggested using event matching optimization to learn the optimal set of "(real event, predicted event)" pairs [151]. the effectiveness of the event predictions is evaluated in terms of two indicators: 1) goodness of matching, which evaluates performance metrics such as the number and percentage of matched events [26], and 2) quality of matched predictions, which evaluates how close the predicted event is to the real event for each pair of matched events [171]. • goodness of matching. a true positive means a real event has been successfully matched by a predicted event; if a real event has not been matched by any predicted event, then it is called a false negative; and a false positive means a predicted event has failed to match any real event, which is referred to as a false alarm. assume the total number of predictions is n, the number of real events is n̄, the number of true positives is n_tp, the number of false negatives is n_fn, and the number of false positives is n_fp. then the following key evaluation metrics can be calculated: precision = n_tp/(n_tp + n_fp), recall = n_tp/(n_tp + n_fn), and f-measure = 2 · precision · recall/(precision + recall). other measurements such as the area under the roc curve are also commonly used [26]. this approach can be extended to include other items such as multi-class precision/recall and precision/recall at top k [2, 103, 123, 252]. • quality of matched predictions. if a predicted event matches a real one, it is common to go on to evaluate how close they are. this reflects the quality of the matched predictions in terms of different aspects of the events. event time is typically a numerical value and hence can be easily measured in terms of metrics such as mean squared error, root mean squared error, and mean absolute error [26].
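the goodness-of-matching metrics just given, together with the multi-resolution location score (1/3)(l_country + l_country·l_state + l_country·l_state·l_city) used later in this section, reduce to a few lines; a sketch, with function names of our own choosing:

```python
def matching_metrics(n_tp, n_fp, n_fn):
    """precision, recall, and f-measure from matched-event counts."""
    precision = n_tp / (n_tp + n_fp) if n_tp + n_fp else 0.0
    recall = n_tp / (n_tp + n_fn) if n_tp + n_fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure

def multires_location_score(pred, real):
    """pred/real: (city, state, country) tuples. implements
    (1/3)(l_country + l_country*l_state + l_country*l_state*l_city),
    where each l_* is 1 on an exact match and 0 otherwise."""
    l_city = int(pred[0] == real[0])
    l_state = int(pred[1] == real[1])
    l_country = int(pred[2] == real[2])
    return (l_country + l_country * l_state
            + l_country * l_state * l_city) / 3.0
```

for example, the predicted location ("new york city", "new york state", "usa") scored against the real location ("los angeles", "california", "usa") yields 1/3, matching only at the country level.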
event location in euclidean space can likewise be easily measured, in terms of the euclidean distance between the predicted point (or region) and the real point (or region). some researchers consider the administrative-unit resolution. for example, a predicted location ("new york city", "new york state", "usa") has a distance of 2 from the real location ("los angeles", "california", "usa") [240]. others prefer to measure multi-resolution location prediction quality as follows: (1/3)(l_country + l_country · l_state + l_country · l_state · l_city), where l_city, l_state, and l_country can only be either 0 (i.e., no match to the truth) or 1 (i.e., completely matches the truth) [171]. for a location in non-euclidean space such as a network [187], the quality can be measured in terms of the shortest path length between the predicted node (or subgraph) and the real node (or subgraph), or by the f-measure between the detected subsets of nodes and the real ones, which is similar to the approach for evaluating community detection [92]. for event topics, conventional ways of evaluating continuous values such as population size, ordinal values such as event scale, and categorical values such as event type, actors, and actions still apply, while more complex semantic values such as texts can be evaluated using natural language processing measurements such as edit distance, bleu score, top-k precision, and rouge [10]. this section focuses on the taxonomy and representative techniques utilized for each category and subcategory. due to the heterogeneity of the prediction output, the technique types depend on the type of output to be predicted, such as time, location, and semantics. as shown in figure 3, all the event prediction methods are classified in terms of their goals, including time, location, semantics, and the various combinations of these three.
these are then further categorized in terms of the output forms of the goals of each and the corresponding techniques normally used, as elaborated in the following. fig. 3 . taxonomy of event prediction problems and techniques. event time prediction focuses on predicting when future events will occur. based on their time granularity, time prediction methods can be categorized into three types: 1) event occurrence: binary-valued prediction of whether an event does or does not occur in a future time period; 2) discrete-time prediction: in which future time slot the event will occur; and 3) continuous-time prediction: at which precise time point the future event will occur. occurrence prediction is arguably the most extensive, classical, and generally simplest type of event time prediction task [12] . it focuses on identifying whether there will be an event occurrence (positive class) or not (negative class) in a future time period [244] . this problem is usually formulated as a binary classification problem, although a handful of other methods instead leverage anomaly detection or regression-based techniques. 1. binary classification. binary classification methods have been extensively explored for event occurrence prediction. the goal here is essentially to estimate and compare the values of f(y = "yes"|x) and f(y = "no"|x), where the former denotes the score or likelihood of event occurrence given observation x while the latter corresponds to no event occurrence. if the value of the former is larger than the latter, then a future event occurrence is predicted; if not, no event is predicted.
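the comparison between f(y = "yes"|x) and f(y = "no"|x) can be sketched with a logistic score (an illustrative sketch; the weights w and bias b are hypothetical, and any discriminative model could play the role of f):

```python
import math

def occurrence_score(x, w, b):
    """logistic score f(y = "yes"|x) for feature vector x,
    hypothetical weights w and bias b."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def predict_occurrence(x, w, b):
    """predict an event iff f(y="yes"|x) > f(y="no"|x) = 1 - f(y="yes"|x)."""
    p_yes = occurrence_score(x, w, b)
    return p_yes > 1.0 - p_yes
```

with a 0.5 decision boundary this reduces to the usual thresholded logistic classifier.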
to implement f, the methods used typically rely on discriminative models, where dedicated feature engineering is leveraged to manually extract potential event precursor features to feed into the models. over the years, researchers have leveraged various binary classification techniques, ranging from the simplest threshold-based methods [121, 251] to more sophisticated methods such as logistic regression [18, 248] , support vector machines [102] , (convolutional) neural networks [35, 135] , and decision trees [58, 185] . in addition to discriminative models, generative models [11, 239] have also been used to embed human knowledge for classifying event occurrences using bayesian decision techniques. specifically, instead of assuming that the input features are independent, prior knowledge can be directly leveraged to establish bayesian networks among the observed features and variables based on graphical models such as (semi-)hidden markov models [54, 183, 239] and autoregressive logit models [199] . the joint probabilities p(y = "yes", x) and p(y = "no", x) can thus be estimated using graphical models, and then utilized to estimate f(y = "yes"|x) = p(y = "yes"|x) and f(y = "no"|x) = p(y = "no"|x) using bayesian rules [26] . 2. anomaly detection. alternatively, anomaly detection can be utilized to learn a "prototype" of normal samples (typical values corresponding to the situation of no event occurrence), and then identify whether any newly-arriving sample is close to or distant from the normal samples, with distant ones being identified as future event occurrences. such methods are typically utilized to handle "rare event" occurrences, especially when the training data is highly imbalanced with little to no data for "positive" samples. anomaly detection techniques such as one-class classification [189] and hypothesis testing [100, 187] are often utilized here. 3. regression.
in addition to simply predicting occurrence or not, some researchers have sought to extend the binary prediction problem to ordinal and numerical prediction problems, including event count prediction based on (auto)regression [73] , event size prediction using linear regression [229] , and event scale prediction using ordinal regression [84] . 3.1.2 discrete-time prediction. in many applications, practitioners want to know the approximate time (i.e., the date, week, or month) of future events in addition to just their occurrence. to do this, time is typically first partitioned into different slots, and the various methods focus on identifying in which time slot future events are likely to occur. existing research on this problem can be classified into either direct or indirect approaches. 1. direct approaches. these methods discretize the future time into discrete values, which can take the form of some number of time windows or time scales such as near future, medium future, or distant future. these are then used to directly predict the integer-valued index of the future time window of the event occurrence using (auto)regression methods [147, 154] , or to predict the ordinal values of future time scales using ordinal regression or classification [201] . 2. indirect approaches. these methods adopt a two-step approach: the first step is to place the data into a series of time bins and then perform time series forecasting, using techniques such as autoregressive models [26] , based on the historical time series X = {x_1, · · · , x_T} to obtain the future time series X̂ = {x_{T+1}, · · · , x_T̂}. the second step is to identify events in the predicted future time series X̂ using either unsupervised methods such as burstiness detection [31] and change detection [109] , or supervised techniques based on a learned event characterization function.
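the two-step indirect approach can be sketched as follows (a minimal illustration; the ar(1) forecaster with a fixed coefficient and the mean-plus-two-standard-deviations burst rule are both simplifying assumptions of ours):

```python
def ar1_forecast(series, horizon, phi=0.8):
    """step 1: forecast the next `horizon` bins with a fixed-coefficient
    ar(1) model x_{t+1} = phi * x_t (phi is an assumed coefficient)."""
    last = series[-1]
    out = []
    for _ in range(horizon):
        last = phi * last
        out.append(last)
    return out

def burst_events(forecast, history):
    """step 2: flag future slots whose forecast value exceeds
    mean(history) + 2 * std(history), a simple burstiness rule."""
    mean = sum(history) / len(history)
    var = sum((v - mean) ** 2 for v in history) / len(history)
    threshold = mean + 2 * var ** 0.5
    return [t for t, v in enumerate(forecast) if v > threshold]
```

a supervised variant would replace the burst rule in step 2 with a classifier fitted on labeled event bins.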
for example, existing works [165, 173] first represent the predicted future time series X̂ ∈ R^{T̂×D} using time-delayed embedding as X̂' ∈ R^{T̂×D'}, where each observation at time t is represented by an embedded vector x̂'_t for t = T, T+1, · · · , T̂. then an event characterization function f_c(x̂'_t) is established to map x̂'_t to the likelihood of an event, which can be fitted based on the event labels provided in the training set. overall, the unsupervised methods require users to assume the type of pattern (e.g., burstiness or change) of future events based on prior knowledge, but do not require event label data. in cases where the event time series pattern is difficult to assume but label data is available, supervised learning-based methods are usually used instead. discrete-time prediction methods, although usually simple to establish, suffer from several issues. first, their time resolution is limited to the discretization granularity; increasing this granularity significantly increases the computational resources required, which means the resolution cannot be made arbitrarily high. moreover, the granularity is itself a hyperparameter to which prediction accuracy is sensitive, rendering it difficult and time-consuming to tune during training. to address these issues, a number of techniques work around discretization by directly predicting the continuous-valued event time [191] , usually by leveraging one of three techniques. 1. simple regression. the simplest methods directly formalize continuous-event-time prediction as a regression problem [26] , where the output is the numerical-valued future event time [212] and/or its duration [80, 129] . common regressors such as linear regression and recurrent neural networks have been utilized for this. despite their apparent simplicity, this is not straightforward, as simple regression typically assumes a gaussian distribution [129] , which does not usually reflect the true distribution of event times.
for example, the future event time needs to be left-bounded (i.e., larger than the current time), as well as being typically non-symmetric and usually periodic, with recurrent events having multiple peaks in the probability density function along the time dimension. 2. point processes. as they allow more flexibility in fitting true event time distributions, point process methods [167, 219] are widely leveraged and have demonstrated their effectiveness for continuous-time event prediction tasks. they require a conditional intensity function, defined as λ(t|x) = g(t|x)/(1 − G(t|x)), (1) where g(t|x) is the conditional density function of the event occurrence probability at time t given an observation x, G(t|x) is its corresponding cumulative distribution function, and λ(t|x)·dt equals the expected count of events N(t, t + dt) during the time period between t and t + dt, where dt is an infinitely small time period. hence, by leveraging the relation between the density and cumulative functions and then rearranging equation (1), the following conditional density function is obtained: g(t|x) = λ(t|x) · exp(−∫_0^t λ(u|x) du). once the above model has been trained using a technique such as maximum likelihood [26] , the time of the next event in the future is predicted as the expectation t̂ = ∫ u · g(u|x) du. although existing methods typically share the same workflow as that shown above, they vary in the way they define the conditional intensity function λ(t|x). traditional models typically utilize prescribed distributions such as the poisson distribution [191] , gamma distribution [53] , hawkes process [69] , weibull process [56] , and other distributions [219] . for example, damaschke et al. [56] utilized a weibull distribution to model volcano eruption events, while ertekin et al. [72] instead proposed the use of a non-homogeneous poisson process to fit the conditional intensity function for power system failure events.
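as a minimal illustration of the prescribed-distribution approach, consider a homogeneous poisson process with constant intensity λ: the waiting time to the next event is exponential, so it can be sampled by inverse-transform sampling and its expectation is 1/λ (the function names are ours):

```python
import math

def next_event_time(t_now, lam, u):
    """inverse-transform sample of the next event time for a homogeneous
    poisson process: waiting time = -ln(u)/lam for a uniform draw u."""
    return t_now + (-math.log(u) / lam)

def expected_next_event_time(t_now, lam):
    """expected next event time: E[t_next] = t_now + 1/lam."""
    return t_now + 1.0 / lam
```

richer prescribed intensities (gamma, weibull, hawkes) replace the constant lam with a time- and history-dependent λ(t|x), at the cost of numerical rather than closed-form sampling.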
however, in many other situations where there is no information regarding appropriate prescribed distributions, researchers must instead leverage nonparametric approaches to learn sophisticated distributions from the data using expressive models such as neural networks. for example, simma and jordan [191] utilized an rnn to learn a highly nonlinear function of λ(t|x). 3. survival analysis. survival analysis [62, 204] is related to point processes in that it also defines an event intensity or hazard function, but in this case based on survival probability considerations, as follows: h(t|x) = g(t|x)/ξ(t|x), (4) where h(t|x) is the so-called hazard function denoting the hazard of event occurrence between time (t − dt) and t for a given observation x, and ξ(t|x) is the survival probability that no event has occurred up to time t. either h(t|x) or ξ(t|x) can be utilized for predicting the time of future events. for example, the event occurrence time can be estimated as the time when ξ(t|x) drops below a specific value. also, one can obtain ξ(t|x) = exp(−∫_0^t h(u|x) du) according to equation (4) [132] . here h(t|x) can adopt any one of several prescribed models, such as the well-known cox hazard model [61, 132] . to learn the model directly from the data, some researchers have recommended enhancing it using deep neural networks [119] . vahedian et al. [204] suggest learning the survival probability ξ(t|x) and then applying the function h(·|x) to indicate an event at time t if h(t|x) is larger than a predefined threshold value. a classifier can also be utilized. instead of using the raw sequence data, the conditional intensity function can also be projected onto additional continuous-time latent state layers that eventually map to the observations [62, 205] . these latent states can then be extracted using techniques such as hidden semi-markov models [26] , which ensure the elicitation of the continuous-time patterns. event location prediction focuses on predicting the location of future events.
location information can be formulated as one of two types: 1. raster-based. here, a continuous space is partitioned into a grid of cells, each of which represents a spatial region, as shown in figure 4 (a). this type of representation is suitable for situations where the spatial size of the event is non-negligible. 2. point-based. in this case, each location is represented by an abstract point with infinitely small size, as shown in figure 4 (b). this type of representation is most suitable for situations where the spatial size of the event can be neglected, or where the location regions of the events can only be in discrete spaces such as network nodes. there are four types of techniques used for raster-based event location prediction, namely spatial clustering, spatial interpolation, spatial convolution, and trajectory destination prediction. 1. spatial clustering. in raster-based representations, each location unit is usually a regular grid cell with the same size and shape. however, regions with similar spatial characteristics typically have irregular shapes and sizes, which can be approximated as composite representations of a number of grid cells [105] . the purpose of spatial clustering here is to group the contiguous regions that collectively exhibit significant patterns. the methods are typically agglomerative in style: they start from the original finest-grained spatial raster units and proceed by merging the spatial neighborhood of a specific unit in each iteration, with different research works defining different criteria for instantiating the merging operation. for example, wang and ding [211] merge neighborhoods if the unified region after merging maintains the spatially frequent patterns. xiong et al. [220] chose an alternative approach, merging spatial neighbor locations into the current locations sequentially until the merged region possesses event data that is statistically significant.
these methods usually run in a greedy style to ensure their time complexity remains smaller than quadratic. after the spatial clustering is completed, each spatial cluster is input into a classifier to determine whether or not there is an event corresponding to it. 2. spatial interpolation. unlike spatial clustering-based methods, spatial interpolation-based methods maintain the original fine granularity of the event location information. the estimated event occurrence probability can be further interpolated for locations with no historical events, thus achieving spatial smoothness. this can be accomplished using commonly-used methods such as kernel density estimation [5, 93] and spatial kriging [105, 114] . kernel density estimation is a popular way to model the geo-statistics of numerous types of events such as crimes [5] and terrorism [93] : K(s) = (1/(Nγ)) Σ_{i=1}^N k((s − s_i)/γ), where K(s) denotes the kernel estimate for the location point s, N is the number of historical event locations, each s_i is a historical event location, γ is a tunable bandwidth parameter, and k(·) is a kernel function such as the gaussian kernel [85] . more recently, ristea et al. [176] further extended kde-based techniques by leveraging localized kde and then applying spatial interpolation techniques to estimate spatial feature values for the cells in the grid. since each cell is an area rather than a point, the center of each cell is usually leveraged as the representative of that cell. finally, a classifier takes this as its input to predict the event occurrence for each grid cell [5, 176] . 3. spatial convolution. in the last few years, convolutional neural networks (cnns) have demonstrated significant success in learning and representing sophisticated spatial patterns from image and spatial data [88] . a cnn contains multiple convolutional layers that extract the hierarchical spatial semantics of images.
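returning to kernel density estimation, the estimate above can be sketched in one spatial dimension with a gaussian kernel (illustrative only; real systems use two-dimensional coordinates and tuned bandwidths):

```python
import math

def kde(s, events, gamma):
    """K(s) = (1/(N*gamma)) * sum_i k((s - s_i)/gamma) with a gaussian
    kernel, for 1-d location coordinates (a simplifying assumption)."""
    n = len(events)

    def k(u):
        # standard gaussian kernel
        return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

    return sum(k((s - si) / gamma) for si in events) / (n * gamma)
```

the estimate at a cell center would then be fed to the downstream classifier as a spatial feature.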
in each convolutional layer, a convolution operation is executed by scanning a feature map with a filter, which results in another smaller feature map with higher-level semantics. since raster-based spatial data and images share a similar mathematical form, it is natural to leverage cnns to process them. existing methods [19, 150, 164, 209] in this category typically formulate a spatial map as input to predict another spatial map that denotes future event hotspots. such a formulation is analogous to the "image translation" problem popular in recent years in the computer vision domain [46] . specifically, researchers typically leverage an encoder-decoder architecture, where the input images (or spatial maps) are processed by multiple convolutional layers into a higher-level representation, which is then decoded back into an output image of the same size through a reverse convolutional process known as transposed convolution [88] . 4. trajectory destination prediction. this type of method typically focuses on population-based events whose patterns can be interpreted as the collective behaviors of individuals, such as "gathering events" and "dispersal events". these methods share a unified procedure that typically consists of two steps: 1) predict future locations based on the observed trajectories of individuals, and 2) detect the occurrence of "future" events based on the future spatial patterns obtained in step 1. the specific methodologies for each step are as follows: • step 1: here, the aim is to predict each location an individual will visit in the future, given a historical sequence of locations visited. this can be formulated as a sequence prediction problem.
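one of the simplest strategies for this sequence prediction problem is a first-order markov model over grid cells; a minimal sketch (function names are ours):

```python
from collections import Counter, defaultdict

def fit_markov(trajectories):
    """count first-order transitions s_t -> s_{t+1} over all trajectories;
    each trajectory is a list of grid-cell identifiers."""
    transitions = defaultdict(Counter)
    for traj in trajectories:
        for a, b in zip(traj, traj[1:]):
            transitions[a][b] += 1
    return transitions

def next_location(transitions, current):
    """most probable next cell given the current cell, or None if the
    current cell was never observed."""
    if current not in transitions:
        return None
    return transitions[current].most_common(1)[0][0]
```

normalizing each counter row gives the conditional distribution p(s_{t+1}|s_t) used in the probabilistic formulations below.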
for example, wang and gerber [214] sought to predict the probability of the location s_{t+1} at the next time point t + 1 based on all the preceding time points: p(s_{t+1}|s_{≤t}) = p(s_{t+1}|s_t, s_{t−1}, · · · , s_0), based on various strategies including a historical volume-based prior model, markov models, and multi-class classification models. vahedian et al. [203] adopted bayesian theory, p(s_{t+1}|s_{≤t}) = p(s_{≤t}|s_{t+1}) · p(s_{t+1})/p(s_{≤t}), which requires the conditional probability p(s_{≤t}|s_{t+1}) to be stored. however, in many situations, there is a huge number of possible trajectories for each destination. for example, with a 128 × 64 grid, one needs to store (128 × 64)^3 ≈ 5.5 × 10^11 options. to improve the memory efficiency, this can be limited to a consideration of just the source and current locations, leveraging a quad-tree style architecture to store the historical information. to achieve more efficient storage and to speed up p(s_{≤t}|s_{t+1}) queries, vahedian et al. [203] further extended the quad-tree into a new technique called vigo, which removes duplicate destination locations in different leaves. • step 2: the aim in this step is to forecast future event locations based on the future visiting patterns predicted in step 1. the most basic strategy here is to consider each grid cell independently. for example, wang and gerber [214] adopted supervised learning strategies to build a predictive mapping between the visiting patterns and the event occurrence. a more sophisticated approach is to consider spatial outbreaks composited from multiple grid cells. scalable algorithms have also been proposed to identify regions containing statistically significant hotspots [110] , such as spatial scan statistics [116] . khezerlou et al.
[110] proposed a greedy-based heuristic tailored for the grid-based data formulation, which extends an original "seed" grid cell containing a statistically large future event density in four directions until the extended region is no longer a statistically significant outbreak. point-based prediction. unlike the raster-based formulation, which covers the prediction of a contiguous spatial region, point-based prediction focuses specifically on locations of interest, which can be distributed sparsely in a euclidean (e.g., spatial region) or non-euclidean space (e.g., graph topology). these methods can be categorized into supervised and unsupervised approaches. 1. supervised approaches. in supervised methods, each location is classified as either "positive" or "negative" with regard to a future event occurrence. the simplest setting is based on the independent and identically distributed (i.i.d.) assumption among the locations, where each location is predicted by a classifier independently using its respective input features. however, given that different locations usually exhibit strong spatial heterogeneity and dependency, further research has been proposed to model these properties through different locations' predictors and outputs, resulting in two research directions: 1) spatial multi-task learning, and 2) spatial auto-regressive methods. • spatial multi-task learning. multi-task learning is a popular learning strategy that can jointly learn the models for different tasks such that the learned models not only share their knowledge but also preserve some exclusive characteristics of the individual tasks [244] . this notion coincides very well with spatial event prediction tasks, where combining the outputs of models from different locations needs to consider both their spatial dependency and heterogeneity. zhao et al.
[244] proposed a spatial multi-task learning framework of the form min_W Σ_{i=1}^m L(Y_i, f(W_i, X_i)) + R(W, M), s.t. C(W), where m is the total number of locations (i.e., tasks), W_i and Y_i are the model parameters and true labels (event occurrences for all time points), respectively, of task i, L(·) is the empirical loss, f(W_i, X_i) is the predictor for task i, and R(·) is the spatial regularization term based on the spatial dependency information M ∈ R^{m×m}, where M_{i,j} records the spatial dependency between locations i and j. C(·) represents the spatial constraints imposed over the corresponding models to enforce them to remain within the valid space C. over recent years, there have been multiple studies proposing different strategies for R(·) and C(·). for example, zhao et al. [245] assumed that all the locations are evenly correlated and enforced similar sparsity patterns for feature selection, while gao et al. [85] further extended this to differentiate the strength of the correlations between different locations' tasks according to the spatial distance between them. this approach has been further extended to tree-structured multi-task learning to handle the hierarchical relationships among locations at different administrative levels (e.g., cities, states, and countries) [246] , in a model that also considers the logical constraints over the predictions from different locations that have hierarchical relationships. instead of assuming evenly correlated tasks, zhao et al. [243] estimated the spatial dependency utilizing inverse distance with gaussian kernels, while ning et al. [156] proposed estimating the spatial dependency based on the event co-occurrence frequency between each pair of locations. • spatial auto-regressive methods. spatial auto-regressive models have been extensively explored in domains such as geography and econometrics, where they are applied to perform predictions in which the i.i.d. assumption is violated due to the strong dependencies among neighboring locations.
its generic framework is as follows: ŷ_{t+1} = ρ · M · ŷ_{t+1} + X_t · W, where X_t ∈ R^{m×D} and ŷ_{t+1} ∈ R^m are the observations at time t and the event predictions at time t + 1 over all the m locations, and M ∈ R^{m×m} is the spatial dependency matrix with zero-valued diagonals. this means the prediction for each location, ŷ_{t+1,i} ∈ ŷ_{t+1}, is jointly determined by its input x_{t,i} and its neighbors {j | M_{i,j} ≠ 0}, and ρ is a positive value balancing these two factors. since event occurrence requires discrete predictions, simple threshold-based strategies can be used to discretize ŷ_i into ŷ'_i ∈ {0, 1} [32] . moreover, due to the complexity of event prediction tasks and the large number of locations, it is sometimes difficult to define the whole matrix M manually. zhao et al. [243] proposed jointly learning the prediction model and the spatial dependency from the data using graphical lasso techniques. yi et al. [228] took a different approach, leveraging conditional random fields to instantiate the spatial autoregression, where the spatial dependency is measured by gaussian kernel-based metrics. yi et al. [227] then went on to propose leveraging a neural network model to learn the locations' dependency. 2. unsupervised approaches. without supervision from labels, unsupervised methods must first identify potential precursors and determinant features in different locations. they can then detect anomalies that are characterized by specific feature selection and location combinatorial patterns (e.g., spatial outbreaks and connected subgraphs) as the future event indicators [41] . the generic formulation is max_{F,R} Q(F, R), s.t. F ∈ M(G, β), R ∈ C, (8) where Q(·) denotes a scan statistic that scores the significance of each candidate pattern, represented by both a candidate location combinatorial pattern R and a feature selection pattern F. specifically, F ∈ {0, 1}^{D'×N} denotes the feature selection results (where "1" means selected and "0" otherwise) and R ∈ {0, 1}^{m×N} denotes the m involved locations for the N events.
M(G, β) and C are the sets of all feasible solutions of F and R, respectively. Q(·) can be instantiated by scan statistics such as kulldorff's scan statistic [116] and the berk-jones statistic [41] , which can be applied to detect and forecast events such as epidemic outbreaks and civil unrest events [171] . depending on whether the embedding space is a euclidean region (e.g., a geographical region) or a non-euclidean region (e.g., a network topology), the pattern constraint C can restrict R either to predefined geometric shapes such as circles, rectangles, or irregular shapes, or to subgraphs such as connected subgraphs, cliques, and k-cliques. the problem in equation (8) is nonconvex and sometimes even discrete, and hence difficult to solve. a generic way is to optimize F using sparse feature selection (a useful survey is provided in [127] ), while R can be defined using the two-step graph-structured matching method detailed in [42] . more recently, new techniques have been developed that are capable of jointly learning both the feature and location selection [42, 187] . event semantics prediction addresses the problem of forecasting topics, descriptions, or other meta-attributes of future events in addition to their times and locations. unlike time and location prediction, the data in event semantics prediction usually involve symbols and natural languages in addition to numerical quantities, which means different types of techniques may be utilized. the data are categorized into three types based on how the historical data are organized and utilized to infer future events. the first of these categories covers rule-based methods, where future event precursors are extracted by mining association or logical patterns in historical data. the second type is sequence-based, considering event occurrence to be a consequence of temporal event chains. the third type further generalizes event chains into event graphs, where additional cross-chain contexts need to be modeled.
these are discussed in turn below. association rule-based methods are among the most classic approaches in the data mining domain for event prediction, typically consisting of two steps: 1) learn the associations between precursors and target events, and then 2) utilize the learned associations to predict future events. for the first step, for example, an association could be x = {"election", "fraud"} → y = "protest event", which indicates that serious fraud occurring in an election process could lead to future protest events. to discover all the significant associations from the ocean of candidate rules efficiently, frequent set mining [92] can be leveraged. each discovered rule needs to come with both sufficient support and confidence. here, support is defined as the number of cases where both "x" and "y" co-occur, while confidence is the ratio of cases in which "y" occurs once "x" has happened. to better estimate these discriminative rules, further temporal constraints can be added that require the occurrence times of "x" and "y" to be sufficiently close for them to be considered "co-occurrences". once the frequent set rules have been discovered, pruning strategies may be applied to retain the most accurate and specific ones, with various strategies available for generating the final predictions [92] . specifically, given a new observation x', one of the simplest strategies is to output the events that are triggered by any of the association rules starting from event x' [206] . other strategies first rank the predicted results based on their confidence and then predict just the top r events [252] . more sophisticated and rigorous strategies tend to build a decision list where each element in the list is an association rule mapping, so that once a generative model has been built for the decision process, maximum likelihood can be leveraged to optimize the order of the decision list [124] .
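the support/confidence computation for a single candidate rule can be sketched as follows (a minimal sketch that omits the frequent-set mining machinery and the temporal-closeness constraint):

```python
def support_confidence(transactions, antecedent, consequent):
    """support = number of transactions where antecedent and consequent
    co-occur; confidence = support / number of transactions containing
    the antecedent. each transaction is a set of event items."""
    ante = set(antecedent)
    n_x = sum(1 for t in transactions if ante <= set(t))
    n_xy = sum(1 for t in transactions
               if ante <= set(t) and consequent in t)
    confidence = n_xy / n_x if n_x else 0.0
    return n_xy, confidence
```

in practice this is computed only for candidates surviving the minimum-support pruning of frequent set mining, rather than for every possible rule.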
this type of research leverages the causality inferred among historical events to achieve future event prediction. the methods here typically share a generic framework consisting of the following procedures: 1) event representation, 2) event graph construction, and 3) future event inference. step 1: event semantic representation. this approach typically begins by extracting the events from the target texts using natural language processing techniques such as sanitization, tokenization, pos-tag analysis, and named entity recognition. several types of objects can be extracted to represent the events: i) noun phrase-based [39, 94, 111] , where a noun phrase corresponds to each event (for example, "2008 sichuan earthquake"); ii) verbs and nouns [112, 168] , where an event is represented as a set of noun-verb pairs extracted from news headlines; and iii) tuple-based [249] , where each event is represented by a tuple consisting of objects (such as actors, instruments, or receptors), a relationship (or property), and time. an rdf-based format has also been leveraged in some works [57] . step 2: event causality inference. the goal here is to infer the cause-effect pairs among historical events. due to its combinatorial nature, narrowing down the number of candidate pairs is crucial. existing works usually begin by clustering the events into event chains, each of which consists of a sequence of time-ordered events under the relevant semantics, typically the same topics, actors, and/or objects [2] . the causal relations among the event pairs can then be inferred in various ways. the simplest approach is just to consider the likelihood that y occurs after x has occurred throughout the training data. other methods utilize nlp techniques to identify causal mentions such as causal connectives, prepositions, and verbs [168] .
some formulate causal-effect relationship identification as a classification task where the inputs are the cause and effect candidate events, often incorporating contextual information including related background knowledge from web texts. here, the classifier is built on a multi-column cnn that outputs either "1" or "0" to indicate whether the candidate cause has the stated effect or not [115] . in many situations, the cause-effect rules learned directly using the above methods can be too specific and sparse, with low generalizability, so a typical next step is to generalize the learned rules. for example, "earthquake hits china" → "red cross help sent to beijing" is a specific rule that can be generalized to "earthquake hits [a country]" → "red cross help sent to [the capital of this country]". to achieve this, an external ontology or knowledge base is typically needed in order to establish the underlying relationships among items or provide necessary information on their properties, such as wikipedia (https://www.wikipedia.org/), yago [196] , wordnet [75] , or conceptnet [137] . based on these resources, the similarity between two cause-effect pairs (c_i, ε_i) and (c_j, ε_j) can be computed by jointly considering the respective similarities of the putative causes and effects: σ((c_i, ε_i), (c_j, ε_j)) = (σ(c_i, c_j) + σ(ε_i, ε_j))/2. an appropriate algorithm can then apply hierarchical agglomerative clustering to group them and hence generate a data structure that can efficiently manage the task of storing and querying them to identify cause-effect pairs. for example, [168, 169, 190] leverage an abstraction tree, where each leaf is an original specific cause-effect pair and each intermediate node is the centroid of a cluster.
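the pairwise similarity σ((c_i, ε_i), (c_j, ε_j)) can be sketched as follows (a token-overlap similarity stands in for the ontology-based σ, which is a simplifying assumption of ours):

```python
def jaccard(a, b):
    """illustrative token-overlap similarity between two event strings;
    in practice sigma would be derived from an ontology or knowledge base."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

def pair_similarity(sim, cause_i, effect_i, cause_j, effect_j):
    """sigma((c_i, e_i), (c_j, e_j)) = (sigma(c_i, c_j) + sigma(e_i, e_j)) / 2."""
    return (sim(cause_i, cause_j) + sim(effect_i, effect_j)) / 2.0
```

these pairwise scores are exactly what hierarchical agglomerative clustering consumes when building the abstraction tree.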
instead of using hierarchical clustering, [249] directly uses the word ontology to simultaneously generalize causes and effects (e.g., the noun "violet" is generalized to "purple", and the verb "kill" is generalized to "murder-42.1") and then leverages a hierarchical causal network to organize the generalized rules. step 3: future event inference. given an arbitrary query event, two steps are needed to infer the future events it causes based on the event causality learned above. first, we need to retrieve similar events that match the query event from the historical event pool. this requires calculating the similarity between the query event and all the historical events. to achieve this, lei et al. [123] utilized context information, including event time, location, and other environmental and descriptive information. for methods requiring event generalization, the first step is to traverse the abstraction tree starting from the root, which corresponds to the most general event rule. the search frontier then moves down the tree whenever a child node is more similar, culminating in the retrieval of the nodes that are the least general yet still similar to the new event [168]. similarly, [45] proposed another tree structure, referred to as a "circular binary search tree", to manage the event occurrence pattern. we can now apply the learned rules starting from the retrieved event to obtain the prediction results. since each cause event can lead to multiple events, a convenient way to determine the final prediction is to calculate the support [168] or conditional probability [226] of the rules. radinsky et al. [168] took a different approach, instead ranking the potential future events by a similarity defined by the length of their minimal generalization path; for example, the minimal generalization path between "london" and "paris" passes through their closest common generalization (e.g., a capital city). alternatively, zhao et al.
[249] proposed embedding the event causality network into a continuous vector space and then applying an energy function designed to rank potential events, where true cause-effect pairs are assumed to have low energies. these methods share a very straightforward problem formulation: given a temporal sequence for a historical event chain, the goal is to predict the semantics of the next event using sequence prediction [26]. the existing methods can be classified into four major categories: 1) classical sequence prediction; 2) recurrent neural networks; 3) markov chains; and 4) time series prediction. sequence classification-based methods. these methods formulate event semantic prediction as a multi-class classification problem, where a finite number of candidate events are ranked and the top-ranked event is treated as the future event semantic. the objective is ĉ = arg max_{c_i} u(s_{t+1} = c_i | s_1, · · · , s_t), where s_{t+1} denotes the event semantic in time slot t+1 and ĉ is the optimal semantic among all the candidates c_i (i = 1, · · · ). the classes here correspond to events with different topics or semantic meanings. three types of sequence classification methods have been utilized for this purpose, namely feature-based methods, prototype-based methods, and model-based methods such as markov models. • feature-based. one of the simplest methods is to ignore the temporal relationships among the events in the chain by either aggregating the inputs or the outputs. tama and comuzzi [198] formulated historical event sequences with multiple attributes for event prediction, testing multiple conventional classifiers.
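this order-agnostic aggregation can be sketched with a pmi-style co-occurrence score as an assumed component function u(s_{t+1} | s_i) and plain summation as the aggregator; the event labels are hypothetical:

```python
import math
from collections import Counter

def cooccurrence_scorer(chains):
    """build u(a, b): a pmi-style co-occurrence score estimated from how
    often two event semantics appear in the same chain."""
    event_n, pair_n = Counter(), Counter()
    for chain in chains:
        uniq = set(chain)
        for e in uniq:
            event_n[e] += 1
        for a in uniq:
            for b in uniq:
                if a != b:
                    pair_n[(a, b)] += 1
    n = len(chains)

    def u(a, b):
        if pair_n[(a, b)] == 0:
            return float("-inf")  # never observed together
        return math.log(pair_n[(a, b)] * n / (event_n[a] * event_n[b]))

    return u

def score_candidate(candidate, history, u):
    # the aggregator v(.) realized as a summation over component scores
    return sum(u(s_i, candidate) for s_i in history)

u = cooccurrence_scorer([["a", "b", "c"], ["a", "b"], ["a", "d"]])
```

the top-ranked candidate is then arg max over `score_candidate`, mirroring the arg-max objective stated above.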
another type of approach based on this notion utilizes compositional methods [89] that typically leverage an independence assumption among the historical input events to simplify the original problem u(s_{t+1} | s_1, s_2, · · · , s_t) = u(s_{t+1} | s_{≤t}) into v(u(s_{t+1} | s_1), u(s_{t+1} | s_2), · · · , u(s_{t+1} | s_t)), where v(·) is simply an aggregation function, typically a summation over all the components. each component function u(s_{t+1} | s_i) can then be calculated by estimating how likely it is that the event semantics s_{t+1} and s_i (i ≤ t) co-occur in the same event chain. granroth-wilding and clark [89] investigated various models, ranging from straightforward similarity-scoring functions, through bigram models and word embeddings combined with similarity scoring, to newly developed compositional neural networks that jointly learn the representations of s_{t+1} and s_i and then calculate their coherence. some other researchers have gone further and considered the dependency among the historical events. for example, letham et al. [125] proposed optimizing the ordering among the candidate events with a ranking objective in which the semantic candidates in a set i must be ranked strictly lower than those in a set j, the goal being to penalize "incorrect orderings". here, 1_[·] is a discrete indicator function; since 1_[b≥a] ≤ e^{b−a}, the exponential can be utilized as an upper bound for minimization. w is the set of parameters of the function u(·). the objective can thus be relaxed to an exponential-based approximation for effective optimization using gradient-based algorithms [88]. other methods focus on first transferring the sequential data into sequence embeddings that encode the latent sequential context. for example, fronza et al.
[79] apply random indexing to obtain vector representations of words, embedding information from neighboring words into each word, before utilizing conventional classifiers, such as support vector machines (svms), to identify future events. • model-based. markov-based models have also been leveraged to characterize temporal patterns [224]. these typically use e_i to denote an event of a specific type, with e denoting the set of event types. the goal here is to predict the type of the next event to occur. in [7], the event types are modeled using a markov model, so that, given the current event type, the next event type can be inferred simply by looking up the state with the highest probability in the transition matrix. a tool called wayeb [8] has been developed based on this method. laxman et al. [121] developed a more complicated model based on a mixture of hidden markov models, introducing new assumptions and the concept of episodes composed of subsequences of event types. they assumed that different event episodes should have different transition patterns, so they started by discovering the frequent episodes in the event data, each of which they modeled by a specific hidden markov model over the event types. this made it possible to establish the generative process for each future event type based on the mixture of the above episode markov models. when predicting, the likelihood of the currently observed event sequence under each possible generative process, p(x | λ_y), is evaluated; a future event type y is then predicted either when this likelihood exceeds some threshold (as in [121]) or as the y with the largest likelihood (as in [239, 241]). • prototype-based. adhikari et al. [3] took a different approach, utilizing a prototype-based strategy that first clusters the event sequences into different clusters in terms of their temporal patterns.
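the transition-matrix look-up of the markov approach above can be sketched as follows; the event-type sequences are hypothetical:

```python
from collections import Counter, defaultdict

def fit_transitions(sequences):
    """estimate a first-order markov transition matrix from event-type
    sequences, as nested dicts of conditional probabilities."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    return {s: {t: c / sum(ctr.values()) for t, c in ctr.items()}
            for s, ctr in counts.items()}

def next_type(state, trans):
    """predict the next event type: the highest-probability transition."""
    return max(trans[state], key=trans[state].get)

seqs = [["login", "browse", "buy"],
        ["login", "browse", "logout"],
        ["login", "browse", "buy"]]
trans = fit_transitions(seqs)
```

the episode-mixture model of [121] replaces this single matrix with one hidden markov model per frequent episode.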
when a new event sequence is observed, its closest cluster's centroid is then leveraged as a "reference event sequence" whose subsequent events are referred to when predicting future events for the new sequence. recurrent neural network (rnn)-based methods. approaches in this category can be classified into two types: 1) attribute-based models and 2) descriptive-based models. attribute-based models ingest feature representations of events as input, while descriptive-based models typically ingest unstructured information, such as texts, to directly predict future events. • attribute-based methods. here, each event y = (t, l, s) at time t is recast and represented as e_t = (e_{t,1}, e_{t,2}, · · · , e_{t,k}), where e_{t,i} is the i-th feature of the event at time t. the features here can include the location and other information such as the event topic and semantics. each sequence e = (e_1, · · · , e_t) is then input into a standard rnn architecture to predict the next event e_{t+1} at time point t + 1 [134]. various types of rnn components and architectures have been utilized for this purpose [33, 34]; a vanilla rnn [70, 88] for sequence-based event prediction can be written as h_i = w a_{i−1} + u e_i, a_i = σ(h_i), o_i = v a_i, where h_i, o_i, and a_i are the latent state, output, and activation for the i-th event, respectively, and w, u, and v are the model parameters for the corresponding mappings. the prediction e_{t+1} := ψ(t + 1) can then be calculated in a feedforward way from the first event, and the model can be trained by back-propagating the error from the layer of ψ(t). existing work typically utilizes variants of the vanilla rnn to handle the vanishing gradient problem, especially when the event chain is not short. the most commonly used methods for event prediction are lstm and gru [88].
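a minimal numpy sketch of such a vanilla-rnn forward pass, with tanh as an assumed activation and toy identity/zero parameter matrices:

```python
import numpy as np

def rnn_forward(events, W, U, V, a0):
    """vanilla rnn over event feature vectors e_1..e_t:
    h_i = W a_{i-1} + U e_i,  a_i = tanh(h_i),  o_i = V a_i."""
    a, outputs = a0, []
    for e in events:
        h = W @ a + U @ e      # latent state
        a = np.tanh(h)         # activation
        outputs.append(V @ a)  # output for this step
    return outputs, a

# tiny deterministic example: 2-d event features, identity/zero parameters
W, U, V = np.zeros((2, 2)), np.eye(2), np.eye(2)
outs, a_last = rnn_forward([np.array([1.0, 0.0])], W, U, V, np.zeros(2))
```

lstm and gru swap the tanh recurrence for gated updates but keep the same outer loop over events.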
for example, lstm extends this recurrence with additional components: a cell state c_{i−1} that keeps track of the earlier "history" and a gating term ζ_i that controls which information is forgotten, enabling longer sequences to be handled. some researchers opt to leverage a simple lstm architecture to extend rnn-based sequential event prediction [33, 97], while others leverage variants of lstm, such as bi-directional lstm [113, 155], and yet others prefer gated recurrent units (gru) [70]. moving beyond chain relationships among events, li et al. [131] generalized this to graph-structured relationships to better incorporate event contextual information via the narrative event evolutionary graph (neeg). a neeg is a knowledge graph in which each node is an event and each edge denotes the association between a pair of events, enabling the neeg to be represented by a weighted adjacency matrix a. as detailed in [131], the current activation a_i then depends not only on the previous time point but is also influenced by the node's neighbors in the neeg. • descriptive-based methods. attribute-based methods require extra effort during preprocessing to convert the unstructured raw data into feature vectors, a process that is not only labor intensive but also not always feasible. therefore, multiple architectures have been proposed to directly process raw (textual) event descriptions so that they can be used to predict future event semantics or descriptions. these models share a similar generic framework [96, 97, 139, 195, 221, 231], which begins by encoding each sequence of words into an event representation utilizing an rnn architecture, as shown in figure 5. the sequence of events is then characterized by another, higher-level rnn to predict future events.
under this framework, some works begin by decoding the predicted future candidate events into event embeddings, after which they are compared with each other and the one with the largest confidence score is selected as the predicted event. these methods are usually constrained by a known list of event types, but sometimes we are interested in open-set prediction, where the predicted event type may be one never seen in the training set. to achieve this, other methods focus on directly generating the descriptions of future events, whose semantics may or may not have appeared before, by designing an additional sequence decoder that decodes the latent representation of future events into word sequences. more recent research has enhanced utility and interpretability by adding hierarchical attention mechanisms that expose the relationship between words and the relevant event, and between all previous events and the relevant future event. for example, yu et al. [231] and su and jiang [195] both proposed word-level and event-level attention, while hu [96] leveraged word-level attention in both the event encoder and the event decoder. this section discusses the research into ways to jointly predict the time, location, and semantics of future events. existing work in this area can be categorized into three types: 1) joint time and semantic prediction, 2) joint location and semantic prediction, and 3) joint prediction of time, location, and semantics. association rule-based methods express a prediction as a rule whose left-hand side (lhs) contains the antecedent events and whose right-hand side (rhs) contains the future event. for example, vilalta and ma [206] defined the lhs as a tuple (e_l, τ), where τ is a user-predefined time window before the target in the rhs. only the events occurring within this time window before the event in the rhs satisfy the lhs. similar techniques have also been leveraged by other researchers [45, 194]. however, τ is difficult to define beforehand, and it is preferable for it to be flexible to suit different target events. to handle this challenge, yang et al. [225] proposed a way to automatically identify information on a continuous time interval from the data.
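the fixed-window rule semantics of [206] can be sketched as a membership test over a timestamped history; the event names and times are hypothetical:

```python
def rule_fires(history, lhs_events, target_time, tau):
    """check whether every lhs event occurs within the window
    [target_time - tau, target_time); history is a list of (event, time)."""
    window = {e for e, t in history if target_time - tau <= t < target_time}
    return set(lhs_events) <= window

history = [("disk_warning", 3.0), ("disk_warning", 9.0), ("cpu_spike", 9.5)]
```

a too-small τ here silently drops valid antecedents, which is exactly the rigidity that motivates learning the interval from data.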
here, each transaction is composed not only of items but also of continuous time-duration information. the lhs is a set of items (e.g., previous events), while the rhs is a tuple (e_r, [t_1, t_2]) consisting of a future event's semantic representation and its time interval of occurrence. to automatically learn the time interval in the rhs, [225] proposed two different methods. the first is the confidence-interval-based method, which leverages a statistical distribution (e.g., gaussian or student-t [26]) to fit all the observed occurrence times of the events in the rhs and then treats the statistical confidence interval as the time interval. the second method, minimal temporal region selection, aims to find the temporal region with the smallest interval that covers all historical occurrences of the event in the rhs. time expression extraction. in contrast to the above statistical methods, another way to achieve joint prediction of event time and semantics comes from the pattern recognition domain and aims to directly discover time expressions that mention (planned) future events. as this type of technique can simultaneously identify time, semantics, and other information such as locations, it is widely used and will be discussed in more detail as part of the discussion of "planned future event detection methods" in section 3.4.3. time series forecasting-based methods. the methods based on time series forecasting can be separated into direct and indirect methods. direct methods typically formulate the event semantic prediction problem as a multi-variate time series forecasting problem, where each variable corresponds to an event type c_i (i = 1, · · · ), and hence the predicted event type at a future time t̂ is calculated as ŝ_t̂ = arg max_{c_i} f(s_t̂ = c_i | x).
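the two interval-learning methods of [225] can be sketched as follows, with a gaussian fit for the confidence-interval variant (the observed occurrence times are hypothetical):

```python
from statistics import mean, stdev

def confidence_interval(times, z=1.96):
    """confidence-interval-based method: fit a gaussian to the observed
    occurrence times of the rhs event and use mean +/- z*sigma as [t1, t2]."""
    mu, sigma = mean(times), stdev(times)
    return (mu - z * sigma, mu + z * sigma)

def minimal_region(times):
    """minimal temporal region selection: the smallest interval covering
    all historical occurrences."""
    return (min(times), max(times))
```

the confidence interval trades coverage for a tighter, noise-robust bound, while the minimal region guarantees coverage of every past occurrence.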
for example, in [128], a longitudinal support vector regressor is utilized to predict multi-attribute events, where n support vector regressors, each corresponding to one attribute, are built to predict the next time point's attribute values. weiss and page [219] took a different approach, leveraging multiple point process models to predict multiple event types. to further estimate the confidence of their predictions, biloš et al. [25] first leveraged an rnn to learn the historical event representation and then input the result into a gaussian process model to predict future event types. to better capture the joint dynamics across the multiple variables in the time series, brandt et al. [30] extended this to bayesian vector autoregression. indirect-style methods instead focus on learning a mapping from the observed event semantics down to a low-dimensional latent-topic space using tensor decomposition-based techniques. for example, matsubara et al. [142] proposed a 3-way topic analysis of the observed event tensor y_0 ∈ r^{d_o × d_a × d_c}, whose three modes correspond to objects, actors, and time. they decomposed this tensor into latent variables via three corresponding low-rank matrices p_o ∈ r^{d_k × d_o}, p_a ∈ r^{d_k × d_a}, and p_c ∈ r^{d_k × d_c}, respectively, as shown in figure 6, where d_k is the number of latent topics. for prediction, the time matrix p_c is extrapolated into the future p̂_c via multi-variate time series forecasting, after which a "future event tensor" ŷ is estimated by multiplying the predicted time matrix p̂_c with the known actor matrix p_a and object matrix p_o. raster-based. these methods usually formulate the data into temporal sequences consisting of spatial snapshots. over the last few years, various techniques have been proposed to characterize the spatial and temporal information for event prediction.
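the extrapolate-then-multiply step can be sketched with numpy, using a naive per-topic linear trend as an assumed stand-in for the multi-variate time series forecaster:

```python
import numpy as np

def forecast_tensor(P_o, P_a, P_c):
    """extrapolate the time factor matrix P_c (topics x time) one step
    ahead with a per-topic first difference, then recover the future
    event tensor via the cp-style product of the three factor matrices:
    Y_hat[o, a, t] = sum_k P_o[k, o] * P_a[k, a] * P_c_future[k, t]."""
    trend = P_c[:, -1] - P_c[:, -2]
    P_c_future = (P_c[:, -1] + trend)[:, None]  # topics x 1
    return np.einsum("ko,ka,kt->oat", P_o, P_a, P_c_future)

# one latent topic, two objects, one actor, two past time steps
Y_hat = forecast_tensor(np.array([[1.0, 2.0]]),
                        np.array([[1.0]]),
                        np.array([[1.0, 2.0]]))
```

only the time factor is forecast; the object and actor factors are reused unchanged, which is what makes the indirect approach efficient.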
the simplest way to consider spatial information is to directly treat location information as one of the input features and feed it into predictive models such as linear regression [250], lstm [174], and gaussian processes [118]. during model training, zhao and tang [250] leveraged the spatiotemporal dependency to regularize their model parameters. most of the methods in this domain aim to jointly consider the spatial and temporal dependencies for prediction [64]. at present, the most popular framework is the cnn+rnn architecture, which implements sequence-to-sequence learning problems such as the one illustrated in figure 7. here, the multi-attributed spatial information for each time point can be organized as a series of multi-channel images, which can be encoded using convolution-based operations. for example, huang et al. [99] proposed adding convolutional layers to process the input into vector representations. other researchers have leveraged variational autoencoders [215] and cnn autoencoders [104] to learn a low-dimensional embedding of the raw spatial input data. the learned representation of the input can then be fed into the temporal sequence learning architecture. different recurrent units have been investigated, including rnn, lstm, convlstm, and stacked convlstm [88]. the resulting representation of the input sequence is then passed as input to the output sequence, where another recurrent architecture is established. the output of the unit at each time point is fed into a spatial decoder component, which can be implemented using transposed convolutional layers [233], transposed convlstm [104], or the spatial decoder of a variational autoencoder [215]. a conditional random field is another popular technique often used to model the spatial dependency [105]. point-based.
the spatiotemporal point process is an important technique for spatiotemporal event prediction, as it models the rate of event occurrence jointly over space and time. it is characterized by a conditional intensity function λ(t, l | x) giving the instantaneous rate of event occurrence at time t and location l given the observed data x (equation (11)). various models have been proposed to instantiate this framework. for example, liu and brown [136] began by assuming conditional independence between the spatial and temporal factors and hence achieved a decomposition of the intensity into a spatial term λ_1(·) and a temporal term λ_2(·), where x, l, t, and f denote the whole input indicator data and its different facets, namely location, time, and other semantic features, respectively. the term λ_1(·) can be modeled by a markov spatial point process, while λ_2(·) can be characterized using temporal autoregressive models. to handle situations where explicit distributional assumptions are difficult to make, several methods have been proposed that involve deep architectures in the point process. most recently, okawa et al. [159] proposed an intensity of the form λ(t, l) = Σ_{(t′, l′)} k((t, l), (t′, l′)) · g_θ(f(t′, l′)), where k(·, ·) is a kernel function, such as a gaussian kernel [26], that measures the similarity in the time and location dimensions, f(t′, l′) ⊆ f denotes the feature values (e.g., event semantics) for the data at location l′ and time t′, and g_θ(·) can be a deep neural network parameterized by θ that returns a nonnegative scalar. the model selection for g_θ(·) depends on the specific data types; for example, these authors constructed an image attention network by combining a cnn with the spatial attention model proposed by lu et al. [138]. in this section, we introduce the strategies that jointly predict the time, location, and semantics of future events, which can be grouped into either system-based or model-based strategies. system-based. the first type of system-based method considered here is the model-fusion system.
the most intuitive approach is to integrate the aforementioned techniques for time, location, and semantic prediction into a single event prediction system. for example, embers [171] is an online warning system for future events that can jointly predict the time, location, and semantics of future events, including their type and participant population. this system also provides information on the confidence of the predictions obtained. using an ensemble of predictive models for time [160], location, and semantic prediction, it achieves a significant performance boost in terms of both precision and recall. the trick here is to first prioritize the precision of each individual prediction model by suppressing its recall; then, thanks to the diversity and complementary nature of the different models, fusing their predictions eventually results in high recall as well. a bayesian fusion-based strategy has also been investigated [95]. another system, carbon [108], leverages a similar strategy. the second type involves crowd-sourced systems that implement fusion strategies over the event predictions made by human predictors. for example, in order to handle the heterogeneity and diversity of the human predictors' skill sets and background knowledge under limited human resources, rostami et al. [177] proposed a recommender system for matching event forecasting tasks to human predictors with suitable skills, in order to maximize the accuracy of their fused predictions. li et al. [126] took a different approach, designing a prediction market system that operates like a futures market, integrating information from different human predictors to forecast future events. in this system, the predictors can decide whether to buy or sell "tokens" (using virtual dollars, for example) for each specific prediction they have made, according to their confidence in it.
they typically make careful decisions, as they will receive corresponding rewards (for correct predictions) or penalties (for erroneous predictions). planned future event detection methods. these methods focus on detecting planned future events, usually from sources such as social media and news, typically relying on nlp techniques and linguistic principles. existing methods typically follow a workflow similar to the one shown in figure 8, consisting of four main steps: 1) content filtering. content filtering methods are typically leveraged to retain only the texts relevant to the topic of interest. existing works utilize either supervised methods (e.g., textual classifiers [117]) or unsupervised methods (e.g., querying techniques [152, 238]); 2) time expression identification is then utilized to identify future reference expressions and determine the time to the event. these methods either leverage existing tools such as the rosetta text analyzer [55] or propose dedicated strategies based on linguistic rules [101]; 3) future reference sentence extraction is the core of planned event detection and is implemented either by designing regular expression-based rules [153] or by textual classification [117]; and 4) location identification. the expression of locations is typically highly heterogeneous and noisy, so existing works have relied heavily on geocoding techniques that can resolve the event location accurately. to infer event locations, various types of locations are considered by different researchers, such as article locations [152], authors' profile locations [49], locations mentioned in the articles [22], and authors' neighbors' locations [107]. multiple locations have been combined using a geometric median [49] or fused using logical rules such as probabilistic soft logic [152]. tensor-based methods. some methods formulate the data into tensor form, with dimensions including location, time, and semantics.
tensor decomposition is then applied to approximate the original tensor as the product of multiple low-rank matrices, each of which is a mapping from latent topics to one dimension. finally, the tensor is extrapolated towards future time periods by various strategies. for example, mirtaheri [148] extrapolated only the time-dimension matrix, which was then multiplied with the other dimensions' matrices to recover the estimated tensor extrapolated into the future. zhou et al. [253] took a different approach, choosing instead to add "empty values" for the entries at future times to the original tensor and then use tensor completion techniques to infer those missing values, which correspond to future events. this category generally consists of two types of event prediction: 1) population level, which includes disease epidemics and outbreaks, and 2) individual level, which relates to clinical longitudinal events. there has been extensive research on disease outbreaks for many different types of diseases and epidemics, including seasonal flu [3], zika [200], h1n1 [158], ebola [13], and covid-19 [162]. these predictions target both the location and time of future events, while the disease type is usually fixed to a specific type for each model. compartmental models such as sir models are among the classical mathematical tools used to analyze, model, and simulate epidemic dynamics [186, 237]. more recently, individual-based computational models have begun to be used to perform network-based epidemiology grounded in network science and graph-theoretical models, where an epidemic is modeled as a stochastic propagation over an explicit interaction network among people [52].
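the classical sir compartmental dynamics mentioned above can be sketched with forward-euler integration; the parameter values are illustrative only:

```python
def sir_simulate(beta, gamma, s0, i0, r0, days, dt=0.1):
    """integrate ds/dt = -beta*s*i, di/dt = beta*s*i - gamma*i,
    dr/dt = gamma*i with forward euler (s, i, r as population fractions)."""
    s, i, r = s0, i0, r0
    for _ in range(int(days / dt)):
        new_inf = beta * s * i * dt   # newly infected this step
        new_rec = gamma * i * dt      # newly recovered this step
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
    return s, i, r

s, i, r = sir_simulate(beta=0.3, gamma=0.1, s0=0.99, i0=0.01, r0=0.0, days=50)
```

each step conserves s + i + r exactly, and with beta/gamma = 3 (the basic reproductive number) the infection grows before the susceptible pool depletes.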
thanks to the availability of high-performance computing resources, another option is to construct a "digital twin" of the real world by building a realistic representation of a population, including members' demographic, geographic, behavioral, and social contextual information, and then using individual-based simulations to study the spread of epidemics within each network [27]. the above techniques rely heavily on model assumptions regarding how the disease progresses in individuals and is transmitted from person to person [27]. the rapid growth of large surveillance data and social media data sets such as twitter and google flu trends in recent years has led to a massive increase in interest in using data-driven approaches to directly learn the predictive mapping [3]. these methods are usually both more time-efficient and less dependent on assumptions, while the aforementioned computational models are more powerful for longer-term prediction thanks to their ability to take specific disease mechanisms into account [242]. finally, there have also been reports of synergistic research that combines both techniques to benefit from their complementary strengths [98, 242]. this research thread focuses on the longitudinal predictive analysis of individual health-related events, including the occurrence of death [62], adverse drug events [185], sudden illnesses such as strokes [124] and cardiovascular events [23], and other clinical events [62] and life events [59] for different groups of people, including the elderly and people with mental illness. the goal here is usually to predict the time before an event occurs, although some researchers have attempted to predict the type of event. the data sources are essentially the electronic health records of individual patients [172, 185].
recently, social media, forum, and mobile data have also been utilized for predicting adverse drug events [185] and events that arise during chronic disease treatment (e.g., chemoradiation and surgery) [62]. this category focuses on predicting events based on information held in various types of media, including video-based, audio-based, and text-based formats. the core issue is to retrieve key information related to future events from the data using semantic pattern recognition. video- and audio-based. while event detection has been extensively researched for video data [129] and audio mining [192], event prediction is more challenging and has been attracting increasing attention in recent years. the goal here is usually to predict the future status of the objects in the video, such as the next action of soccer players [60] or basketball players [154], or the movement of vehicles [235]. text- and script-based. a huge amount of news data has accumulated in recent decades, much of which can be used for big data predictive analytics of news events. a number of researchers have focused on predicting the location, time, and semantics of various events. to achieve this, they usually leverage the immense volume of historical news and knowledge bases to learn the associations and causality among events, which are then applied to forecast future events given current ones. some studies have even directly generated textual descriptions of future events by leveraging nlp techniques such as sequence-to-sequence learning [57, 97, 123, 153, 168, 170, 190, 195, 221]. this category can be classified into: 1) population-based events, including dispersal events, gathering events, and congestion; and 2) individual-level events, which focus on fine-grained patterns such as human mobility behavior prediction. 4.3.1 group transportation pattern.
here, researchers typically focus on transportation events such as congestion [43, 104], large gatherings [203], and dispersal events [204]. the goal is thus to forecast the future time period [80] and location [203] of such events. data from traffic meters, gps, and mobile devices are usually used to sense real-time human mobility patterns. transportation and geographical theories are usually considered in determining the spatial and temporal dependencies for predicting these events. another research thread focuses on individual-level prediction, such as predicting an individual's next location [130, 223] or the likelihood or time duration of car accidents [19, 174, 233]. sequential and trajectory analyses are usually used to process trajectory and traffic flow data. different types of engineering systems have begun to routinely apply event forecasting methods, including: 1) civil engineering, 2) electrical engineering, 3) energy engineering, and 4) other engineering domains. despite the variety of systems in these widely different domains, the goal is essentially to predict future abnormal or failure events in order to support the system's sustainability and robustness. both the location and time of future events are key factors for these predictions, and the input features usually consist of sensing data relevant to the specific engineering system. • civil engineering. this covers a wide range of problems in diverse urban systems, such as fault and adverse event prediction in smart buildings [21], emergency management equipment failure prediction [66], manhole event prediction [179], and other events [99]. • electrical engineering. this includes teleservice system failures [61] and unexpected events in wire electrical discharge machining operations [184]. • energy engineering. event prediction is also a hot topic in energy engineering, as such systems usually require strong robustness to handle disturbances from the natural environment.
active research domains here include wind power ramp prediction [83], solar power ramp prediction [1], and adverse events in low carbon energy production [50]. • other engineering domains. there is also active research on event prediction in other domains, such as irrigation event prediction in agricultural engineering [161] and mechanical fault prediction in mechanical engineering [197]. here, the prediction models proposed generally focus on either network-level events or device-level events. for both types, the general goal is essentially to predict the likelihood of future system failures or attacks based on various indicators of system vulnerability. so far these two categories have essentially differed only in their inputs: the former relies on network features, including system specifications, web access logs and search queries, mismanagement symptoms, and spam, phishing, and scamming activity, although some researchers are investigating the use of social media text streams to identify semantics indicating the potential targets of future ddos attacks [142, 217]. for device-level events, the features of interest are usually the binary file appearance logs of machines [160, 210]. some work has also been done on micro-architectural attacks [90], proactively analyzing observations of speculative branches, out-of-order executions, and shared last-level caches [188]. political event prediction has become a very active research area in recent years, largely thanks to the popularity of social media. the most common research topics can be categorized as: 1) offline events, and 2) online activism. 4.6.1 offline events. this includes civil unrest [171], conflicts [218], violence [28], and riots [67]. this type of research usually targets the geo-location, time, and topics of future events by leveraging social sensors that indicate public opinions and intentions.
utilization of social media has become a popular approach for these endeavors, as social media is a source of vital information during the event development stage [171]. specifically, many aspects are clearly visible in social media, including complaints from the public (e.g., toward the government), discussions of intentions regarding specific political events and targets, and advertisements for planned events. due to the richness of this information, further details of future events, such as the type of event [85], the anticipated participant population [171], and the event scale [84], can also be discovered in advance. 4.6.2 online events. due to the major impact of online media such as online forums and social media, many events on such platforms, including online activism, petitions, and hoaxes, also involve strong motivations for achieving some political purpose [213]. beyond simple detection, the prediction of various types of events has been studied in order to enable proactive intervention to sidetrack events such as hoaxes and rumor propagation [106]. other researchers have sought to foresee the results of future political events in order to benefit a particular group of practitioners, for example by predicting the outcome of online petitions or presidential elections [213]. different types of natural disasters have been the focus of a great deal of research. typically, these are rare events, but mechanistic models, long historical records (often extending back dozens or hundreds of years), and domain knowledge are usually available. the input data are typically collected by sensors or sensor networks, and the output is the risk or hazard of future potential events. since these events are typically rare but very high-stakes, many researchers strive to cover all event occurrences and hence aim to ensure high recall. 4.7.1 geophysics-related. earthquakes.
predictions here typically focus on whether there will be an earthquake with a magnitude larger than a specified threshold in a certain area during a future period of time. to achieve this, the original sensor data is usually processed using geophysical models such as the gutenberg-richter law, the distribution of characteristic earthquake magnitudes, and seismic quiescence [14, 175]. the processed data are then fed into machine learning models as input features for predicting the output, which can be either binary values of event occurrence or time-to-event values. some studies are devoted to identifying the time of future earthquakes and their precursors, based on an ensemble of regressors and feature selection techniques [178], while others focus on aftershock prediction and the consequences of the earthquake, such as fire prevention [140]. it is worth noting that social media data have also been used for such tasks, as they often support early detection of the first-wave earthquake, which can then be used to predict the aftershocks or earthquakes in other locations [181]. fire events. research in this category can be grouped into urban fires and wildfires. this type of research often focuses on the time at which a fire will affect a specific location, such as a building. the goal here is to predict the risk of future fire events. to achieve this, both the external environment and the intrinsic properties of the location of interest are important. therefore, both static input data (e.g., natural conditions and demographics) and time-varying data (e.g., weather, climate, and crowd flow) are usually involved. shin and kim [189] focus on building fire risk prediction, where the input is the building's profile. others have studied wildfires, where weather data and satellite data are important inputs. this type of research focuses primarily on predicting both the time and location of future fires [216, 234].
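the geophysical preprocessing step described above for earthquakes can be made concrete with a small sketch. the snippet below computes a gutenberg-richter b-value feature from a magnitude catalog using aki's maximum-likelihood estimator; the catalog values, the completeness magnitude, and the idea of feeding the b-value to a downstream classifier are illustrative assumptions, not a specific method from the surveyed works.

```python
import math

def b_value(magnitudes, m_c):
    """Aki's maximum-likelihood estimate of the Gutenberg-Richter b-value
    from a catalog of magnitudes at or above completeness magnitude m_c."""
    events = [m for m in magnitudes if m >= m_c]
    mean_mag = sum(events) / len(events)
    return math.log10(math.e) / (mean_mag - m_c)

# hypothetical magnitude catalog for one region and time window
catalog = [3.1, 3.4, 3.2, 4.0, 3.6, 3.3, 5.1, 3.8, 3.5, 4.4]
b = b_value(catalog, m_c=3.0)

# the estimated b-value (and, e.g., its change between successive windows)
# can then serve as one input feature for a model predicting whether an
# event above a magnitude threshold occurs in the next time period
print(round(b, 2))  # → 0.59
```

in practice such features are computed per region and time window and combined with others (characteristic magnitude distributions, quiescence indicators) before training.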
other researchers have focused on rarer events such as volcanic eruptions. for example, some leverage chemical prior knowledge to build a bayesian network for prediction [44], while others adopt point processes to predict the hazard of future events [56]. 4.7.2 atmospheric science-related. flood events. floods may be caused by many different factors, including atmospheric (e.g., snow and rain), hydrological (e.g., ice melting, wind-generated waves, and river flow), and geophysical (e.g., terrain) conditions. this makes the forecasting of floods a highly complicated task that requires multiple diverse predictors [212]. flood event prediction has a long history, with the latest research focusing especially on computational and simulation models based on domain knowledge. this usually involves using ensemble prediction systems as inputs for hydrological and/or hydraulic models to produce river discharge predictions. for a detailed survey of flood computational models, please refer to [47]. however, it is prohibitively difficult to comprehensively consider and model all the factors correctly while avoiding all the accumulated errors from upstream predictions (e.g., precipitation prediction). another direction, based on data-driven models such as statistical and machine learning models for flood prediction, is deemed promising and is expected to be complementary to existing computational models. these newly developed machine learning models are often based solely on historical data, requiring no knowledge of the underlying physical processes. representative models include svms, random forests, and neural networks, together with their variants and hybrids. a detailed recent survey is provided in [149]. tornado forecasting. tornadoes usually develop within thunderstorms, and hence most tornado warning systems are based on the prediction of thunderstorms. for a comprehensive survey, please refer to [68].
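the purely data-driven direction described above for floods, i.e. learning from historical records with no physical model, can be illustrated with a toy nearest-neighbour classifier. the feature values, the rainfall scaling, and the use of k-nn in place of the svms, random forests, and neural networks named in the text are all illustrative assumptions.

```python
import math

# hypothetical historical records: (rainfall_mm, river_level_m) -> flood?
history = [
    ((120.0, 4.8), 1), ((15.0, 2.1), 0), ((95.0, 4.2), 1),
    ((40.0, 2.9), 0), ((70.0, 3.5), 0), ((110.0, 5.0), 1),
    ((20.0, 2.4), 0), ((85.0, 4.5), 1),
]

def predict_flood(rainfall, level, k=3):
    """Classify a day by majority vote among the k most similar past days."""
    def dist(feat):
        # crude rescaling so rainfall and river level are comparable
        return math.hypot((feat[0] - rainfall) / 10.0, feat[1] - level)
    nearest = sorted(history, key=lambda rec: dist(rec[0]))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0

print(predict_flood(100.0, 4.6))  # resembles past flood days → 1
print(predict_flood(25.0, 2.5))   # resembles past dry days → 0
```

a real study would use many more hydrological and atmospheric predictors and a properly validated model; the point is only that the mapping from historical observations to a flood/no-flood label needs no explicit physical simulation.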
machine learning models, when applied to tornado forecasting tasks, usually suffer from high-dimensionality issues, which are very common in meteorological data. some methods have leveraged dimensionality reduction strategies to preprocess the data [230] before prediction. research on other atmosphere-related events such as droughts and ozone events has also been conducted [77]. there is also a large body of prediction research focusing on events outside the earth, especially those affecting the star closest to us, the sun. methods have been proposed to predict various solar events that could impact life on earth, including solar flares [20], solar eruptions [4], and high energy particle storms [141]. the goal here is typically to use satellite imagery of the sun to predict the time and location of future solar events and their activity strength [74]. business intelligence events can be grouped into company-based events and customer-based events. 4.8.1 customer activity prediction. the most important customer activities in business are whether a customer will continue doing business with a company and how long a customer is willing to wait before receiving service. a great deal of research has been devoted to these topics, which can be categorized based on the type of business entity, namely enterprises, social media, and education, which are primarily interested in churn prediction, site migration, and student dropout, respectively. the first of these focuses on predicting whether and when a customer is likely to stop doing business with a profitable enterprise [71]. the second aims to predict whether a social media user will move from one site, such as flickr, to another, such as instagram, a movement known as site migration [236]. while site migration is not widespread, attention migration might actually be much more common, as a user may "move" their major activities from one social media site to another.
the third type, student dropout, is a critical domain for education data mining, where the goal is to predict the occurrence of absenteeism from school without good reason for a continuous number of days; a comprehensive survey is available in [143]. for all three types, the procedure is first to collect features of a customer's profile and activities over a period of time, and then conventional or sequential classifiers or regressors are generally used to predict the occurrence or time-to-event of the future targeted activity. financial event prediction has been attracting a huge amount of attention for risk management, marketing, investment prediction, and fraud prevention. multiple information resources, including news, company announcements, and social media data, can be utilized as the input, often taking the form of time series or temporal sequences. these sequential inputs are used for the prediction of the time and occurrence of future high-stakes events such as company distress, suspension, mergers, dividends, layoffs, bankruptcy, and market trends (rises and falls in the company's stock price) [36, 40, 65, 91, 122, 144, 226]. it is difficult to deduce the precise location and time of individual crime incidents. therefore, the focus is instead on estimating the risk and probability of the location, time, and types of future crimes. this field can be naturally categorized based on the various crime types: 4.9.1 political crimes and terrorism. this type of crime is typically highly destructive, and hence its anticipation and prevention attract huge attention. terrorist activities are usually aimed at religious, political, iconic, economic, or social targets. the attacker typically targets large numbers of people, and the evidence related to such attacks is retained in the long run.
though it is extremely challenging to predict the precise location and time of individual terrorism incidents, numerous studies have shown the potential utility of predicting the regional risks of terrorism attacks based on information gathered from many data sources, such as geopolitical data, weather data, and economic data. the global terrorism database is the most widely recognized dataset recording descriptions of worldwide terrorism events over recent decades. in addition to terrorism events, other similar events such as mass killings [202] and armed-conflict events [193] have also been studied using similar problem formulations. 4.9.2 regional crime risks. most studies on this topic focus on predicting the types, intensity, count, and probability of crime events across defined geo-spatial regions. to date, urban crime has been the most common research topic due to data availability. the geospatial characteristics of the urban areas, their demographics, and temporal data such as news, weather, economics, and social media data are usually used as inputs. the geospatial dependency and correlation of the crime patterns are usually leveraged during the prediction process using techniques originally developed for spatial predictions, such as kernel density estimation and conditional random fields. some works simplify the task by focusing only on specific types of crimes such as theft [180], robbery, and burglary [51]. 4.9.3 organized and serial crimes. unlike the above research on regional crime risks, some recent studies strive to predict the next incidents of criminal individuals or groups. this is because different offenders may demonstrate different behavioral patterns, such as targeting specific regions (e.g., wealthy neighborhoods) or victims (e.g., women), or pursuing specific benefits (e.g., money). the goal here is thus to predict the next crime site and/or time, based on the historical crime event sequence of the targeted criminal individual or group.
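the kernel density estimation technique mentioned above for regional crime risk can be sketched in a few lines: past incident locations induce a smooth risk surface, and the estimated intensity at a grid cell serves as its predicted risk. the incident coordinates and bandwidth below are illustrative assumptions.

```python
import math

# hypothetical past incident locations (x, y) in city-grid coordinates
incidents = [(1.0, 1.2), (1.3, 0.9), (0.8, 1.1), (4.0, 4.2), (1.1, 1.0)]

def crime_risk(x, y, bandwidth=0.5):
    """Gaussian kernel density estimate of incident intensity at (x, y)."""
    h2 = bandwidth ** 2
    norm = 1.0 / (2 * math.pi * h2 * len(incidents))
    return norm * sum(
        math.exp(-((x - ix) ** 2 + (y - iy) ** 2) / (2 * h2))
        for ix, iy in incidents
    )

# the risk surface peaks near the historical cluster around (1, 1)
assert crime_risk(1.0, 1.0) > crime_risk(3.0, 3.0)
```

real systems add temporal features (news, weather, social media) and model spatial dependency more carefully, e.g. with conditional random fields, but the density-as-risk idea is the same.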
models such as point processes [130] or bayesian networks [133] are usually used to address such problems. despite the major advances in event prediction in recent years, there are still a number of open problems and potentially fruitful directions for future research, as follows: increasingly sophisticated forecasting models have been proposed to improve prediction accuracy, including those utilizing approaches such as ensemble models, neural networks, and the other complex systems mentioned above. however, although accuracy can be improved, event prediction models are rapidly becoming too complex to be interpreted by human operators. the need for better model accountability and interpretability is becoming an important issue; as big data and artificial intelligence techniques are applied to ever more domains, this can lead to serious consequences for applications such as healthcare and disaster management. models that are not interpretable by humans will find it hard to build the trust needed if they are to be fully integrated into the workflow of practitioners. a closely related key feature is the accountability of the event prediction system. for example, disaster managers need to thoroughly understand a model's recommendations if they are to be able to explain the reason for a decision to displace people in a court of law. moreover, an ever increasing number of laws in countries around the world are beginning to require adequate explanations of decisions reached based on model recommendations. for example, articles 13-15 of the european union's general data protection regulation (gdpr) [207] require algorithms that make decisions that "significantly affect" individuals to provide explanations ("right to explanation") as of may 25, 2018. similar laws have also been established in countries such as the united states [48] and china [166].
the massive popularity of the proposal, development, and deployment of event prediction is stimulating a surge of interest in developing ways to counter-attack these systems. it will therefore not be a surprise when we begin to see the introduction of techniques to obfuscate event prediction methods in the near future. as with many state-of-the-art ai techniques applied in other domains, such as object recognition, event prediction methods can also be very vulnerable to noise and adversarial attacks. the famous failure of google flu trends, which missed the peak of the 2013 flu season by 140 percent due to low relevance and high disturbance affecting the input signal, is a vivid memory for practitioners in the field [82]. many predictions relying on social media data can also be easily influenced or flipped by injecting scam messages. event prediction models also tend to over-rely on low-quality input data that can be easily disturbed or manipulated, lacking sufficient robustness to survive noisy signals and adversarial attacks. similar problems threaten other application domains such as business intelligence, crime, and cyber systems. over the years, many domains have accumulated a significant amount of knowledge and experience about event development and occurrence mechanisms, which can provide important clues for anticipating future events; examples include epidemiology models, socio-political models, and earthquake models. all of these models focus on simplifying real-world phenomena into concise principles in order to grasp the core mechanism, discarding many details in the process. in contrast, data-driven models strive to ensure the accurate fitting of large historical data sets, based on sufficient model expressiveness, but cannot guarantee that the true underlying principles and causality of event occurrence are modeled accurately.
there is thus a clear motivation to combine their complementary strengths, and although this has already attracted a great deal of interest [98, 242], most of the models proposed so far are merely ensemble learning-based and simply merge the final predictions from each model. a more thorough integration is needed that can directly embed the core principles to regularize and instruct the training of data-driven event prediction methods. moreover, existing attempts are typically specific to particular domains and are thus difficult to develop further, as they require in-depth collaborations between data scientists and domain experts. a generic framework developed to encompass multiple different domains is imperative and would be highly beneficial for the various domain experts. the ultimate purpose of event prediction is usually not just to anticipate the future, but to change it, for example by avoiding a system failure or flattening the curve of a disease outbreak. however, it is difficult for practitioners to determine how to act appropriately and implement effective policies in order to achieve the desired results in the future. this requires a capability that goes beyond simply predicting future events based on the current situation, requiring them instead to also take into account the new actions being taken in real time and then predict how they might influence the future. one promising direction is the use of counterfactual event prediction [145], which models what would have happened if different circumstances had occurred. another related direction is prescriptive analysis, where different actions can be merged into the prediction system and future results anticipated or optimized. related works have been developed in a few domains such as epidemiology; however, many other domains still lack sufficient research of this kind, which will be needed if we are to develop generic frameworks that can benefit multiple domains.
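the "flattening the curve" intervention analysis mentioned above can be illustrated with a toy what-if comparison: the same discrete-time sir epidemic simulation run under two hypothetical contact-rate scenarios, so that the effect of an action (here, halving the transmission rate) on a future outcome can be compared. all parameter values are illustrative assumptions, not calibrated epidemiological estimates.

```python
def sir_peak_infected(beta, gamma=0.1, s0=0.99, i0=0.01, days=300):
    """Discrete-time SIR simulation; returns the peak infected fraction
    reached under transmission rate beta and recovery rate gamma."""
    s, i = s0, i0
    peak = i
    for _ in range(days):
        new_inf = beta * s * i   # newly infected fraction this step
        new_rec = gamma * i      # newly recovered fraction this step
        s -= new_inf
        i += new_inf - new_rec
        peak = max(peak, i)
    return peak

baseline = sir_peak_infected(beta=0.30)   # no intervention
lockdown = sir_peak_infected(beta=0.15)   # what-if: contacts halved
print(lockdown < baseline)  # the hypothetical intervention flattens the curve
```

prescriptive analysis generalizes this pattern: candidate actions parameterize the forward model, and the action whose simulated future best meets the decision maker's objective is recommended.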
existing event prediction methods focus primarily on accuracy. however, decision makers who utilize the predicted event results usually need much more, including key factors such as event resolution (e.g., time resolution, location resolution, description details), confidence (e.g., the probability a predicted event will occur), efficiency (e.g., whether the model can predict per day or per second), lead time (how many days before the event occurs the prediction can be made), and event intensity (how serious it is). there are typically trade-offs between accuracy and the other metrics, so merely optimizing accuracy during training will inevitably cause the results to drift away from the overall optimal event-prediction-based decision. a system that can flexibly balance these trade-offs based on decision makers' needs and achieve a multi-objective optimization (e.g., over accuracy, confidence, and resolution) is the ultimate objective for these models. this paper has presented a comprehensive survey of existing methodologies developed for event prediction in the big data era. it provides an extensive overview of the event prediction challenges, techniques, applications, evaluation procedures, and future outlook, summarizing the research presented in over 200 publications, most of which were published in the last five years. event prediction challenges, opportunities, and formulations have been discussed in terms of the event element to be predicted, including the event location, time, and semantics, after which we went on to propose a systematic taxonomy of the existing event prediction techniques according to the formulated problems and the types of methodologies designed for the corresponding problems.
we have also analyzed the relationships, differences, advantages, and disadvantages of these techniques from various domains, including machine learning, data mining, pattern recognition, natural language processing, information retrieval, statistics, and other computational models. in addition, a comprehensive and hierarchical categorization of popular event prediction applications has been provided that covers domains ranging from the natural sciences to the social sciences. based upon the numerous historical and state-of-the-art works discussed in this survey, the paper concludes by discussing open problems and future trends in this fast-growing domain.
references
forecasting of solar power ramp events: a post-processing approach
causal prediction of top-k event types over real-time event streams
epideep: exploiting embeddings for epidemic forecasting
prediction of solar eruptions using filament metadata
area-specific crime prediction models
methodological approach of construction business failure prediction studies: a review
event forecasting with pattern markov chains
wayeb: a tool for complex event forecasting
probabilistic complex event recognition: a survey
on-line new event detection and tracking
a bayesian approach to event prediction
forecasting with twitter data
forecasting ebola with a regression transmission model
earthquake magnitude prediction in hindukush region using machine learning techniques
a survey of techniques for event detection in twitter
modern information retrieval
predicting structured data
customer event history for churn prediction: how long is long enough?
a spatiotemporal deep learning approach for citywide short-term crash risk prediction with multi-source data
a comparison of flare forecasting methods. i. results from the "all-clear" workshop
scalable causal learning for predicting adverse events in smart buildings
identifying content for planned events across social media sites
comparison of machine learning algorithms for clinical event prediction
data-driven prediction and prevention of extreme events in a spatially extended excitable system
uncertainty on asynchronous time event prediction
pattern recognition and machine learning
epifast: a fast algorithm for large scale realistic epidemic simulations on distributed memory systems
predicting local violence: evidence from a panel survey in liberia
forecasting civil wars: theory and structure in an age of "big data" and machine learning
real time, time series forecasting of inter- and intra-state political conflict
forecasting social unrest using activity cascades
estimating binary spatial autoregressive models for rare events
sensor event prediction using recurrent neural network in smart homes for older adults
prediction of next sensor event and its time of occurrence using transfer learning across homes
temporal convolutional networks allow early prediction of events in critical care
making words work: using financial text as a predictor of financial events
social media fact sheet
event summarization using tweets
extracting causation knowledge from natural language texts
a text-based decision support system for financial sequence prediction
non-parametric scan statistics for event detection and forecasting in heterogeneous social media graphs
a generic framework for interesting subspace cluster detection in multi-attributed networks
pcnn: deep convolutional networks for short-term traffic congestion prediction
bayesian networks based rare event prediction with sensor data
a tree-based approach for event prediction using episode rules over event streams
stargan: unified generative adversarial networks for multi-domain image-to-image translation
ensemble flood forecasting: a review
regulating by robot: administrative decision making in the machine-learning era
using publicly visible social media to build detailed forecasts of civil unrest
infrequent adverse event prediction in low carbon energy production using machine learning
an architecture for emergency event prediction using lstm recurrent neural networks
disease transmission in territorial populations: the small-world network of serengeti lions
bayes predictive analysis of a fundamental software reliability model
hidden markov models as a support for diagnosis: formalization of the problem and synthesis of the solution
text analytics apis, part 2: the smaller players
a volcanic event forecasting model for multiple tephra records, demonstrated on mt
news events prediction using markov logic networks
a new hybrid classification algorithm for customer churn prediction based on logistic regression and decision trees
leveraging fine-grained transaction data for customer life event predictions
predicting soccer highlights from spatiotemporal match event streams
event prediction for individual unit based on recurrent event data collected in teleservice systems
isurvive: an interpretable, event-time prediction model for mhealth
an overview of event extraction from twitter
traffic congestion prediction by spatiotemporal propagation patterns
deep learning for event-driven stock prediction
online failure prediction for railway transportation systems based on fuzzy rules and data analysis
forecasting location-based events with spatio-temporal storytelling
tornado forecasting: a review
recurrent marked temporal point processes: embedding event history to vector
on clinical event prediction in patient treatment trajectory using longitudinal electronic health records
systematic review of customer churn prediction in the telecom sector
reactive point processes: a new approach to predicting power failures in underground electrical systems
forecasting heroin overdose occurrences from crime incidents
mag4 versus alternative techniques for forecasting active region flare productivity
wordnet. the encyclopedia of applied linguistics
a survey on wind power ramp forecasting
managing the risks of extreme events and disasters to advance climate change adaptation: special report of the intergovernmental panel on climate change
issues in complex event processing: status and prospects in the big data era
failure prediction based on log files using random indexing and support vector machines
titan: a spatiotemporal feature learning framework for traffic incident duration prediction
survey on complex event processing and predictive analytics
google flu trends' failure shows good data > big data
a review on the recent history of wind power ramp forecasting
incomplete label multi-task ordinal regression for spatial event scale forecasting
incomplete label multi-task deep learning for spatio-temporal event subtype forecasting
extreme events: dynamics, statistics and prediction
a taxonomy of event prediction methods
deep learning
what happens next? event prediction using a compositional neural network model
fortuneteller: predicting microarchitectural attacks via unsupervised deep learning
automated news reading: stock price prediction based on financial news using context-specific features
data mining: concepts and techniques
simulating spatio-temporal patterns of terrorism incidents on the indochina peninsula with gis and the random forest method
toward future scenario generation: extracting event causality exploiting semantic relation, context, and association features
bayesian model fusion for forecasting civil unrest
integrating hierarchical attentions for future subevent prediction
what happens next? future subevent prediction using contextual hierarchical lstm
social media based simulation models for understanding disease dynamics
mist: a multiview and multimodal spatial-temporal learning framework for citywide abnormal event forecasting
improved disk-drive failure warnings
estimating time to event of future events based on linguistic cues on twitter
using machine learning methods to forecast if solar flares will be associated with cmes and seps
skip n-grams and ranking functions for predicting script events
deepurbanevent: a system for predicting citywide crowd dynamics at big events
a survey on spatial prediction methods
epidemiological modeling of news and rumors on twitter
that's what friends are for: inferring location in online social media platforms based on social relationships
carbon: forecasting civil unrest events by monitoring news and social media
time-series event-based prediction: an unsupervised learning framework based on genetic programming
forecasting gathering events through trajectory destination prediction: a dynamic hybrid model
extracting causal knowledge from a medical database using graphical patterns
supervenience and mind: selected philosophical essays
diversity-aware event prediction based on a conditional variational autoencoder with reconstruction
prediction for big data through kriging: small sequential and one-shot designs
improving event causality recognition with multiple background knowledge sources using multi-column convolutional neural networks
a spatial scan statistic
leveraging unscheduled event prediction through mining scheduled event tweets
spatio-temporal violent event prediction using gaussian process regression
time-to-event prediction with neural networks and cox regression
data mining and predictive analytics
stream prediction using a generative model based on frequent episodes in event sequences
a hybrid model for business process event prediction
event prediction based on causality reasoning
interpretable classifiers using rules and bayesian analysis: building a better stroke prediction model
sequential event prediction
the wisdom of crowds in action: forecasting epidemic diseases with a web-based prediction market system
feature selection: a data perspective
multi-attribute event modeling and prediction over event streams from sensors
time-dependent representation for neural event sequence prediction
next hit predictor: self-exciting risk modeling for predicting next locations of serial crimes
constructing narrative event evolutionary graph for script event prediction
failure event prediction using the cox proportional hazard model driven by frequent failure signatures
a novel serial crime prediction model based on bayesian learning theory
mm-pred: a deep predictive model for multi-attribute event sequence
grid-based crime prediction using geographical features
a new point process transition density model for space-time event prediction
conceptnet: a practical commonsense reasoning tool-kit
knowing when to look: adaptive attention via a visual sentinel for image captioning
sam-net: integrating event-level and chain-level attentions to predict what happens next
major earthquake event prediction using various machine learning algorithms
data handling and assimilation for solar event prediction
fast mining and forecasting of complex time-stamped events
a survey of machine learning approaches and techniques for student dropout prediction
a multi-stage deep learning approach for business process event prediction
counterfactual theories of causation
event recognition and forecasting technology
forecasting occurrences of activities
tensor-based method for temporal geopolitical event forecasting
flood prediction using machine learning models: literature review
urban events prediction via convolutional neural networks and instagram data
embers at 4 years: experiences operating an open source indicators forecasting system
capturing planned protests from open source indicators
a prototype method for future event prediction based on future reference sentence extraction
future event prediction: if and when
sequence to sequence learning for event prediction
staple: spatio-temporal precursor learning for event forecasting
spatio-temporal event forecasting and precursor identification
real-time forecasting of an epidemic using a discrete time stochastic model: a case study of pandemic influenza (h1n1-2009)
deep mixture point processes: spatio-temporal event prediction with rich contextual information
mobile network failure event detection and forecasting with multiple user activity data sets
prediction of irrigation event occurrence at farm level using optimal decision trees
forecasting the novel coronavirus covid-19
using social media to predict the future: a systematic literature review
towards a deep learning approach for urban crime forecasting
a new temporal pattern identification method for characterization and prediction of complex time series events
assessing china's cybersecurity law
pairwise-ranking based collaborative recurrent neural networks for clinical event prediction
learning causality for news events prediction
learning to predict from textual data
mining the web to predict future events
'beating the news' with embers: forecasting civil unrest using open source indicators
an investigation of interpretable deep learning for adverse drug event prediction
forecasting natural events using axonal delay
a deep learning approach to the citywide traffic accident risk prediction
neural networks to predict earthquakes in chile
spatial crime distribution and prediction for sporting events using social media
a crowdsourcing triage algorithm for geopolitical event forecasting
machine learning predicts laboratory earthquakes
a process for predicting manhole events in manhattan
theft prediction with individual risk factor of visitors
earthquake shakes twitter users: real-time event detection by social sensors
a survey of online failure prediction methods
using hidden semi-markov models for effective online failure prediction
unexpected event prediction in wire electrical discharge machining using deep learning techniques
adverse drug event prediction combining shallow analysis and machine learning
forecasting seasonal outbreaks of influenza
an efficient approach to event detection and forecasting in dynamic multivariate social media networks
tiresias: predicting security events through deep learning
autoencoder-based one-class classification technique for event prediction
predicting an effect event from a new cause event using a semantic web based abstraction tree of past cause-effect event pairs
modeling events with cascades of poisson processes
neural speech recognizer: acoustic-to-word lstm model for large vocabulary speech recognition
fundamental patterns and predictions of event size distributions in modern wars and terrorist campaigns
high-impact event prediction by temporal data mining through genetic algorithms
hierarchical gated recurrent unit with semantic attention for event prediction
yago: a core of semantic knowledge
machine learning for predictive maintenance: a multiple classifier approach
an empirical comparison of classification techniques for next event prediction using business process event logs
probabilistic forecasting of wind power ramp events using autoregressive logit models
dynamic forecasting of zika epidemics using google trends
predicting time-to-event from twitter messages
a multimodel ensemble to forecast onsets of state-sponsored mass killing
forecasting gathering events through continuous destination prediction on big trajectory data
predicting urban dispersal events: a two-stage framework through deep survival analysis on mobility data
a measurement-based model for estimation of resource exhaustion in operational software systems
predicting rare events in temporal domains
the eu general data protection regulation (gdpr)
a practical guide graph-based deep modeling and real time forecasting of sparse spatio-temporal data deep learning for real-time crime forecasting and its ternarization an iot application for fault diagnosis and prediction a hierarchical pattern learning framework for forecasting extreme weather events towards long-lead forecasting of extreme flood events: a data mining framework for precipitation cluster precursors identification incomplete label uncertainty estimation for petition victory prediction with dynamic features using twitter for next-place prediction, with an application to crime prediction csan: a neural network benchmark model for crime forecasting in spatio-temporal scale cityguard: citywide fire risk forecasting using a machine learning approach ddos event forecasting using twitter data the perils of policy by p-value: predicting civil conflicts forest-based point process for event prediction from electronic health records on predicting crime with heterogeneous spatial patterns: methods and evaluation a miml-lstm neural network for integrated fine-grained event forecasting event history analysis spatio-temporal check-in time prediction with recurrent neural network based survival analysis finding progression stages in time-evolving event sequences web-log mining for quantitative temporal-event prediction using external knowledge for financial event prediction based on graph neural networks neural network based continuous conditional random field for fine-grained crime prediction an integrated model for crime prediction using temporal and spatial factors predicting future levels of violence in afghanistan districts using gdelt tornado forecasting with multiple markov boundaries dram: a deep reinforced intra-attentive model for event prediction a survey of prediction using social media hetero-convlstm: a deep learning approach to traffic accident prediction on heterogeneous spatio-temporal data blending forest fire smoke forecasts with observed data 
key: cord-260407-jf1dnllj authors: tang, catherine so-kum; wong, chi-yan title: factors influencing the wearing of facemasks to prevent the severe acute respiratory syndrome among adult chinese in hong kong date: 2004-06-11 journal: prev med doi: 10.1016/j.ypmed.2004.04.032 sha: doc_id: 260407 cord_uid: jf1dnllj background. the global outbreak of the severe acute respiratory syndrome (sars) in 2003 has been an international public health threat.
quick diagnostic tests and specific treatments for sars are not yet available; thus, prevention is of paramount importance to contain its global spread. this study aimed to determine factors associated with individuals' practice of the target sars preventive behavior (facemask wearing). methods. a total of 1329 adult chinese residing in hong kong were surveyed. the survey instrument included demographic data, measures on the five components of the health belief model, and the practice of the target sars preventive behavior. logistic regression analyses were conducted to determine rates and predictors of facemask wearing. results. overall, 61.2% of the respondents reported consistent use of facemasks to prevent sars. women, the 50–59 age group, and married respondents were more likely to wear facemasks. three of the five components of the health belief model, namely, perceived susceptibility, cues to action, and perceived benefits, were significant predictors of facemask wearing even after considering effects of demographic characteristics. conclusions. the health belief model is useful in identifying determinants of facemask wearing. findings have significant implications for enhancing the effectiveness of sars prevention programs. a new and highly infectious disease in humans, the severe acute respiratory syndrome (sars), has created a major public health threat in many countries. within 2 months of its first appearance in asia in mid-february of 2003, the world health organization (who) had already received reports of outbreaks of sars in 26 countries on all five continents [1]. the clinical symptoms of sars are nonspecific, including high fever, dry cough, breathing difficulties, muscle pain, and generalized weakness. the incubation period can last from 2 to 10 days, thus enabling symptomless individuals to transmit the disease through either close person-to-person contact or travel from one city to another.
the mortality rate of sars is about 3-10%. only recently has the causative agent of this disease been found. the latest multicountry laboratory findings have confirmed that a new pathogen, a member of the coronavirus family never before seen in humans, is the cause of sars [2]. however, the exact transmission route of the disease is still unknown, and quick diagnostic tests as well as specific treatments are also not yet available. under these unknown circumstances, prevention is of particular importance in containing the global spread of this new infectious disease. this study aimed to examine factors affecting hong kong people's practice of preventive behaviors against sars. findings from this study would provide pertinent information for designing and implementing sars prevention programs not only for hong kong, but for other countries as well. hong kong was one of the hardest-hit areas during the global outbreak of sars in 2003 and has accounted for almost 40% of the probable cases and deaths of sars. at the beginning of the local outbreak of this disease in early march of 2003, mainly health care workers who treated the index patients were infected with the disease [3]. very soon, new cases were reported among close contacts of known patients, and the disease then quickly spread to the community. local health authorities have since stepped up various prevention and intervention activities against further spread of the disease [4]. at the community level, health authorities have launched large-scale public health education programs about the disease, issued preventive health guidelines to health care workers and the general public, suspended classes at schools and universities, promptly isolated infected individuals, and ordered probably infected individuals to quarantine themselves at home for 10 days. at the individual level, health advice is given on ways to prevent contracting and spreading sars.
the suggested sars preventive behaviors include (1) maintaining good personal hygiene (covering nose and mouth with a tissue when sneezing or coughing and washing hands immediately afterward with liquid soap), (2) developing a healthy lifestyle with proper diet, regular exercises, adequate rest, and no smoking, (3) ensuring good ventilation at home and in the office, and (4) wearing facemasks, especially for those with respiratory tract infections or those caring for them. despite all these efforts, an average of 40-50 new cases and about five deaths from sars were reported daily. the disease continued to affect both health care workers and individuals from the community until june 2003. researchers have argued that the practice of preventive behaviors by individuals is one of the most effective ways of disease prevention and health promotion [5] [6] [7]. with environmental and policy support, these individual preventive behaviors can translate into effective population-level prevention efforts [8, 9]. for example, public education and media campaigns that disseminate health messages and information, environmental manipulation that provides necessary facilities, and national policies that make available economic incentives or reimbursement can motivate many individuals in the community to practice the desired health behaviors. thus, various psychosocial approaches, such as the health belief model [10], the theory of reasoned action [11], the social cognitive model [12], the protection motivation theory [13], and the stages of change model [14], have been put forward to predict the practice of preventive behaviors at the individual level. among these, the health belief model is one of the most widely used and provides the conceptual framework for this study.
this model postulates that the practice of preventive behaviors is a function of the degree to which individuals perceive a personal health threat and the perception that particular preventive behaviors will be effective in reducing the threat [10]. in applying this model to understand the practice of sars preventive behaviors, perceived health threat refers to individuals' perception of their vulnerability to contracting sars (perceived susceptibility) and of this disease having serious consequences (perceived severity). individuals' belief that the practice of the suggested sars preventive behaviors will prevent sars depends on (1) whether they think these preventive behaviors will be effective (perceived benefits), (2) whether the cost of undertaking these behaviors (perceived barriers) exceeds the benefits, and (3) whether there are any cues (cues to action) to trigger these behaviors. cues to action can be internal, such as the perception of a body state, or external, such as the influence of mass media and social pressure. there is ample support for the health belief model in explaining individual practice of preventive behaviors. this model helps to predict the practice of preventive dental care [15], dieting for obesity [16], aids risk-reduction behaviors [17], breast self-examination [18-20], sunscreen use [21], and participation in a broad array of health screening programs such as obtaining a mammogram to screen for breast cancer [22-24] and undergoing genetic testing for cancer susceptibility [25, 26]. prevention programs that draw on this model to effect behavioral change have also yielded positive results in increasing various health behaviors to prevent dental problems [15], osteoporosis [27], and diabetes [28].
overall, perceived benefits, perceived barriers, and perceived susceptibility are the three most powerful components of the health belief model in influencing whether individuals practice different preventive behaviors [21, 29, 30] . it is also found that the actual risk of developing a disease is a much less important predictor of individual preventive behaviors than is perceived susceptibility [31] . other than psychosocial predictors, there is also an accumulation of literature documenting the importance of associations between individuals' demographic characteristics and their practice of preventive behaviors [5,32 -35] . in general, women and more affluent and better educated individuals are more likely to practice the suggested preventive behaviors. an inverted curvilinear relationship is found between age and practice of preventive behaviors. typically, young children are often compliant in adopting various preventive behaviors, which tend to decline in adolescence and adulthood but improve again among older people. findings in relation to marital status and ethnicity are inconclusive. the purposes of this study were twofold. the first objective was to determine the rates of the target sars preventive behavior in adult chinese with different demographic background. among various sars preventive behaviors suggested by local and international health authorities, this study focused on the wearing of facemasks. this target preventive behavior was chosen for this study because it was specific to this disease and involved deliberate effort of individuals. based on past related literature, it was expected that men, the younger age group, and individuals with low educational attainment would be less likely to wear facemasks. the second objective of this study was to test the efficacy of the health belief model in predicting the practice of the target preventive behavior. 
based on past related literature, it was expected that perceived susceptibility, perceived severity, perceived benefits, perceived barriers, and cues to action were significant predictors of facemask wearing. this study was conducted between march 29 and april 1, 2003, in hong kong, when there was clear evidence that sars had started to spread from health care workers in hospitals to the community. at the time of the study, the exact causative agent and route of transmission of the disease were not yet fully known [3]. local health authorities had since implemented enhanced infection control procedures in all hospitals and cohorting of sars patients [4]. they had also stepped up communitywide sars prevention and intervention activities. data for this study were obtained using a community telephone survey of adult chinese (aged 19 and above) residing in hong kong during the specified period. random-digit dialing of the local residential telephone directory for 2002 was used to select respondents. this directory covered all listed telephone numbers in all regions of hong kong, where over 90% of the households owned at least one telephone line. telephone surveys were conducted by trained telephone interviewers and took about 20 min. when telephones were busy or there was no answer, three follow-up calls at different times or dates were attempted before substituting a new telephone number. the response rate was 65%, and the sampling error was 3.1 percentage points. a total of 1329 adult chinese were surveyed, and their demographic information is summarized in table 1. compared to the 2001 hong kong population census data [36], the present sample included more women as well as individuals with university education and higher monthly personal income. similar differences in demographic characteristics between telephone surveys and census data were also noted in previous local telephone surveys [37].
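the quoted sampling error can be sanity-checked against the textbook margin-of-error formula for a simple random sample. the sketch below is a simplification that ignores the survey's design effects and nonresponse adjustments, so it does not exactly reproduce the reported 3.1 percentage points; it computes the worst-case half-width at p = 0.5:

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of an approximate 95% confidence interval for a
    proportion p estimated from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# Worst-case proportion (p = 0.5) for the survey's n = 1329 respondents:
print(round(100 * margin_of_error(0.5, 1329), 1))  # about 2.7 percentage points
```

the gap between 2.7 and the reported 3.1 points presumably reflects the survey's sampling design rather than the simple formula above.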
the present sample comprised 40.2% men and 59.8% women. about half of them were aged between 30 and 49 years, 20.2% between 19 and 29 years, 14.7% between 50 and 59 years, and 18.4% were older than 60 years. among them, 30% were single, 66% were currently married, and the remainder were either separated, divorced, or widowed. slightly more than half of the respondents had completed high school education and worked either full time or part time. another 21.1% of the respondents were homemakers, 7.8% were students, 12.6% were retirees, and 5.2% were unemployed. respondents were asked to indicate how often in the past week they wore facemasks to prevent contracting and spreading sars. they responded with either "never," "occasionally," or "most of the time." the first two responses were coded as "0" and the last response was coded as "1" for subsequent statistical analyses. perceived susceptibility was assessed by three items: (1) whether respondents felt vulnerable to contracting sars, (2) whether they knew or had close contact with any individuals infected with sars, and (3) whether they had respiratory infection symptoms such as sore throat, dry cough, fever, muscle ache, and shortness of breath. respondents answered with "yes" or "no" responses, and affirmative responses were then summed to form a total score. high total scores represent respondents perceiving themselves as being highly susceptible to contracting sars. for perceived severity, respondents indicated on two 4-point scales the degree to which they were fearful of sars (1 as "not at all fearful" to 4 as "very fearful") and worried that hong kong would become a quarantine city because of the wide spread of sars to the community (1 as "not at all worried" to 4 as "very worried"). high mean scores on these two scales represent respondents perceiving sars as having very severe adverse consequences.
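a minimal sketch of the coding scheme described above (function and variable names are illustrative, not taken from the paper):

```python
def score_facemask_use(response: str) -> int:
    """Outcome coding: "most of the time" -> 1; "never"/"occasionally" -> 0."""
    return 1 if response == "most of the time" else 0

def score_susceptibility(felt_vulnerable: bool, knew_infected: bool,
                         had_symptoms: bool) -> int:
    """Perceived susceptibility: count of affirmative answers to the
    three yes/no items (range 0-3)."""
    return sum([felt_vulnerable, knew_infected, had_symptoms])

def score_severity(fear: int, worry: int) -> float:
    """Perceived severity: mean of the two 4-point ratings (range 1-4)."""
    return (fear + worry) / 2

print(score_facemask_use("occasionally"),     # 0
      score_susceptibility(True, False, True),  # 2
      score_severity(3, 4))                   # 3.5
```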
for perceived benefits, respondents were asked to indicate on a 4-point scale the degree to which they agreed that wearing facemasks could prevent contracting and spreading sars (1 as "strongly disagree" to 4 as "strongly agree"). high scores indicate respondents perceiving great benefits in wearing facemasks. for perceived barriers, respondents were asked to rate on two 4-point scales the degree to which they had difficulty in obtaining facemasks (1 as "not at all difficult" to 4 as "very difficult") and the level of discomfort when wearing them (1 as "not at all uncomfortable" to 4 as "very uncomfortable"). high mean scores on these two items indicate respondents perceiving great barriers in wearing facemasks. cues to action were assessed by asking respondents to indicate on two 4-point scales the degree to which the local government and their family members encouraged them to wear facemasks (1 as "strongly disagree" to 4 as "strongly agree"). high mean scores on these two scales represent respondents having great awareness of environmental cues to wear facemasks. respondents were also asked about their sex, age, educational attainment, marital status, and personal monthly income. statistical analyses in this study were conducted using spss 10.0 software. descriptive statistics for demographic characteristics of respondents were generated and compared with the 2001 hong kong population census data (table 1). the rates of wearing facemasks were determined for individuals with various demographic characteristics. bivariate logistic regression analyses were then conducted to determine whether the practice of facemask wearing differed within each demographic characteristic (table 2). a multivariate logistic regression analysis was also performed to test the health belief model and to identify significant predictors of the target preventive behavior. odds ratios (ors) for each predictor were estimated from the logistic regression (table 3).
demographic variables of the respondents were entered into the logistic regression first to control for their effects before testing the health belief model. then, the predictor variables were entered simultaneously in the next block of the regression. the predictor variables consisted of the five components of the health belief model: perceived susceptibility, perceived severity, perceived benefits, perceived barriers, and cues to action. overall, 61.2% of the respondents reported consistent wearing of facemasks to prevent contracting and spreading sars. table 2 presents these rates by demographic group; within their own demographic groups, women, the 50-59 age group, and married respondents were more likely to wear facemasks to prevent sars. results also showed that sex (χ2 = 26.8, p < 0.001), age (χ2 = 12.6, p < 0.001), and marital status (χ2 = 36.97, p < 0.001) had significant subgroup differences in the target preventive behavior. a logistic regression with odds ratios was conducted to test the efficacy of the health belief model in predicting the wearing of facemasks to prevent sars. in block i, the demographic factors of sex, age, and marital status were entered first to control for their effects. results showed that this block was significant (χ2 = 40.23, p < 0.001). the five components of the health belief model were entered in the next block, and they were significant in predicting facemask wearing, even after considering effects of demographic factors (χ2 = 100.86, p < 0.001). the final model of the logistic regression analysis is presented in table 3. with the exception of perceived barriers, all estimated coefficients were in the expected direction. all odds ratios were above 1.0.
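the odds ratios reported in table 3 are the exponentiated logistic-regression coefficients. a minimal sketch of that conversion (the coefficient and standard error below are back-computed from the reported or for sex, not values taken from the paper):

```python
import math

def odds_ratio(beta: float, se: float, z: float = 1.96):
    """Convert a logistic-regression coefficient and its standard error
    into an odds ratio with an approximate 95% confidence interval."""
    point = math.exp(beta)
    ci = (math.exp(beta - z * se), math.exp(beta + z * se))
    return point, ci

# beta ~= 0.329 with se ~= 0.142 reproduces the reported OR of 1.39
# for sex (CI roughly 1.05-1.84):
point, (lo, hi) = odds_ratio(0.329, 0.142)
print(round(point, 2), round(lo, 2), round(hi, 2))  # 1.39 1.05 1.84
```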
in summary, respondents who were women (or = 1.39; ci = 1.053, 1.835), who belonged to the older age group (or = 1.037; ci = 0.982, 1.095), who were married (or = 1.487; ci = 1.07, 2.067), who felt more susceptible to contracting sars (or = 2.575; ci = 1.586, 4.181), who perceived sars as having more serious consequences (or = 1.176; ci = 0.909, 1.521), who believed greater benefits in wearing facemasks (or = 1.354; ci = 1.019, 1.800), who encountered greater barriers in wearing facemasks (or = 1.131; ci = 0.887, 1.443), and who were more aware of environmental cues (or = 2.447; ci = 1.875, 3.194) were more likely to wear facemasks. results showed that three of the five components of the health belief model, namely, perceived susceptibility, cues to action, and perceived benefits, were significant predictors. perceived severity and perceived barriers were not significant predictors of facemask wearing when other factors were also considered. this study examined how various psychosocial factors are associated with the practice of the target sars preventive behavior among adult chinese in hong kong. similar to previous research [15 -26] , this study found the health belief model useful in identifying major determinants of the wearing of facemasks to prevent contracting and spreading sars. in particular, findings showed that three of the five components of the model, namely, perceived susceptibility, cues to action, and perceived benefits, were significant predictors. review studies of the health belief model have also found that among the five components, perceived susceptibility and perceived benefits are the more powerful components in predicting preventive behaviors [29, 30] . the present results showed that compared to those with a low level of perceived susceptibility, individuals feeling personally very vulnerable to contracting sars were 2.5 times more likely to wear facemasks. 
it was also found that individuals who had strong beliefs in the effectiveness of wearing facemasks to prevent sars were 1.4 times more likely to wear facemasks than those who did not have these beliefs. furthermore, this study showed that cues to action were as important a predictor as perceived susceptibility. those who were more aware of environmental cues were 2.4 times more likely to wear facemasks than those who perceived few cues to action. previous studies have also indicated that cues to action in the form of advice from family members and health care professionals are very important factors in increasing various preventive behaviors [23]. the remaining two components of the health belief model, perceived severity and perceived barriers, were found to be nonsignificant determinants of the target sars preventive behavior in this study. in spite of previous literature indicating the strength of perceived barriers in influencing the practice of preventive behaviors [21, 23, 29, 30], this component did not significantly predict the wearing of facemasks in the present sample of adult chinese. it might be that this target preventive behavior is relatively easy to perform, despite some discomfort and inconvenience. individuals have complete control of the behavior and can take off the facemasks anytime they want or feel uncomfortable. facemasks are cheap and easy to obtain, except during the early outbreak of sars. thus, there are relatively fewer perceived barriers to wearing facemasks as compared to participating in screening or immunization programs. this study also found that perceived severity of sars did not significantly predict facemask wearing. earlier studies on cancer noted that perceived severity was not related to preventive behaviors, as cancer is uniformly thought of as a serious disease [30].
on the other hand, it could also be argued that the failure of perceived severity to predict the target sars preventive behavior might be related to individuals' underestimation of the potential of this disease to become a global epidemic. during the early stage of the local outbreak, this disease affected mainly health care workers, a few index patients, and their close contacts [3]. the serious consequences of this disease, with increasing prevalence and mortality rates, might have been overlooked. thus, the public was not motivated or did not anticipate the need to practice the suggested sars preventive behavior. similar to existing literature on preventive behaviors [7,32-35], this study also found that individuals' demographic characteristics were significant predictors of facemask wearing. results showed that men, single people, and the 19-29 age group were significantly less likely to wear facemasks to prevent sars than women, married people, and individuals aged between 30 and 59 years. although statistically nonsignificant, odds ratios also showed that individuals with less than university education and very low personal income were less likely to wear facemasks. researchers have argued that the low rates of preventive behaviors among young, single, and less educated men may be related to their inadequate knowledge about the disease, peer influences, risk-taking tendency, and perceived immortality and immunity to various diseases [5, 38]. this study found that about 61.2% of the respondents consistently wore facemasks to prevent sars. in other words, despite the rapid spread, increasing death toll, and vigorous preventive activities, close to 40% of the respondents did not practice the target sars preventive behavior. for some subgroups, the noncompliance rates were even higher.
as there is evidence that sars may emerge again both locally and across countries [1], there is a need for hong kong, as well as other countries, to further enhance the effectiveness of related prevention and intervention measures to contain the disease. results of this study have significant implications for sars prevention programs. it is suggested that individuals' perception of their own vulnerability to the disease should be emphasized by highlighting the global outbreak and debilitating health consequences of sars, which affect not just health care workers, older people, those with chronic illness, or individuals of a particular ethnicity. public education and media campaigns should be conducted regularly in schools, workplaces, and community settings to disseminate the latest findings on sars. the public should be made aware that there is no specific treatment or vaccine for this disease; therefore, individuals must take their own precautions to prevent contracting the disease. it should also be emphasized that, at the individual level, facemask wearing remains one of the most effective ways to prevent contracting and spreading sars [39]. indeed, studies have shown that communications that highlight perceived susceptibility and simultaneously increase the perception that a particular health behavior will reduce the health threat are successful in promoting various preventive behaviors [15, 27, 28]. the present study also showed that environmental cues to wear facemasks are very important. other researchers have argued that environmental manipulation and policy support are crucial to translate individual into collective preventive efforts [8, 9]. environmental cues can be in the form of signs and posters to remind people to wear facemasks and should be placed in prominent areas in schools, at worksites, and in public places.
Advice, recommendations, or encouragement from health care professionals are also useful to increase individuals' likelihood of wearing facemasks. Policies that require the wearing of facemasks in certain locations (e.g., inside hospitals) or for those who have respiratory symptoms will also ensure that individuals practice the desired preventive behaviors. Finally, it is also particularly important to target SARS prevention activities at those who are less likely to practice these behaviors, namely, men, single individuals, and the younger age group. This study had several limitations. First, there were concerns about the representativeness of the sample. This study used telephone surveys to collect data and failed to include individuals who were without home telephone lines or not at home during the survey period. Comparison with the census data showed that the present sample comprised more women as well as more affluent and better educated individuals, who were also more likely to practice the suggested SARS behaviors. In view of this, the present sample might have higher rates of facemask wearing than the general population. Second, only retrospective self-reports of respondents were collected, without external verification, and the results might be subject to recall and social desirability bias. Third, this study focused on the five key components of the health belief model to predict rates of facemask wearing. It did not investigate individuals' knowledge of and emotional responses to SARS, which have also been found to significantly influence the practice of preventive behaviors [23,40]. Lastly, the measurement of the target SARS preventive behavior was crude and lacked any contextual information. It is possible that rates of facemask wearing might differ at home, in the workplace, or in public places. Despite these limitations, this study provided pertinent information on factors influencing the practice of the target SARS preventive behavior.
The findings have significant implications for enhancing the effectiveness of prevention activities against the global spread of SARS.

References:
- Update: outbreak of severe acute respiratory syndrome - worldwide
- A novel coronavirus associated with severe acute respiratory syndrome
- SARS: experience at Prince of Wales Hospital, Hong Kong
- Guideline on management of severe acute respiratory syndrome (SARS)
- Patterns of health behaviors in US adults
- Change in lifestyle factors and other influences on health status and all-cause mortality
- NIH Consensus Development Panel on Physical Activity and Cardiovascular Health: physical activity and cardiovascular health
- A tale of 3 tails
- Translating social ecological theories into guidelines for community health promotion
- Social learning theory and the health belief model
- Understanding attitudes and predicting social behavior
- Self-efficacy: the exercise of control
- Cognitive and physiological processes in fear appeals and attitude change: a revised theory of protection motivation
- In search of how people change: application to addictive behavior
- Conditional health threats: health beliefs, decisions, and behaviors among adults
- Psychosocial predictors of compliance with a weight control intervention for obese children and adolescents
- Psychosocial predictors of gay men's AIDS risk-reduction behavior
- Predicting mammography and breast self-examination
- Factors associated with breast self-examination among Jordan women
- Perceptions of threat, benefits, and barriers in breast self-examination amongst young asymptomatic women
- Message framing and sunscreen use: gain-framed messages motivate beachgoers
- Mammography adherence and beliefs in a sample of low-income African American women
- Attitudes, beliefs, and knowledge as predictors of nonattendence in a Swedish population-based mammography screening program
- Breast cancer worry and screening: some prospective data (Solomon LJ)
- Likelihood of undergoing genetic testing for cancer risk: a population-based study
- Factors influencing intention to obtain a genetic test for colon cancer risk: a population-based study
- Dimensions of the severity of a health threat: the persuasive effects of visibility, time of onset, and rate of onset on young women's intentions to prevent osteoporosis
- Preventive care in the context of men's health
- A meta-analysis of studies of the health belief model with adults
- The health belief model: a decade later
- Decision-making about genetic testing among women at familial risk for breast cancer
- Associations between exercise and health behaviors in a community sample of working adults
- A population-based study of age and gender differences in patterns of health-related behaviors
- Elucidating the relationships between race, socioeconomic status, and health
- Socio-demographic characteristics and individual health behaviors
- Main tables of the 2001 population census
- The rate of physical child abuse in Chinese families: a community survey in Hong Kong
- Individual psychology of risk-taking behaviors in non-adherence
- Effectiveness of precautions against droplets and contact in prevention of nosocomial transmission of severe acute respiratory syndrome (SARS)
- Incomplete knowledge and attitude-behavior inconsistency

key: cord-263606-aiey8nvq
authors: Bonate, Peter L.
title: Musings on the current state of COVID-19 modeling and reporting
date: 2020-05-29
journal: J Pharmacokinet Pharmacodyn
doi: 10.1007/s10928-020-09692-2
doc_id: 263606
cord_uid: aiey8nvq

In thinking about the current state of COVID-19-related modeling, I am reminded of The Rime of the Ancient Mariner. Samuel Taylor Coleridge wrote "Water, water, every where, nor any drop to drink." Today it's "models, models every where, which one to believe." Every day COVID-19-related models are reported in the newspapers and on television.
Model predictions have become an everyday topic of conversation in households and among coworkers. People have become armchair modelers or critics. Models are often chosen to support a political narrative. There comes a point where, with more and more models, the predictions have less and less value. Here are some musings I've had on the current state of COVID-19 modeling and reporting.

• How many different models are there for COVID-19 transmission dynamics? I think you would be hard pressed to know exactly. The website of the Centers for Disease Control and Prevention (CDC) uses 12 different models for its predictions of COVID-19-related deaths [1]. I like this website for several reasons. Not only do they report predictions for each model, they use ensemble modeling to make a collective 2-week prediction into the future. They don't report how the ensemble was done (perhaps it was as simple as an average), but they do try to account for the many different models. I also like that they report 95% prediction intervals. Too many scientists confuse confidence intervals and prediction intervals, and it's nice to see that the CDC gets it right.

• Twenty years ago, when spreadsheets were first starting to be used, there was concern that, because of their ease of use, scientists were inappropriately using, and possibly misusing, statistics to obtain the results they desired, i.e., p < 0.05. Programs like Excel or Minitab made it far too easy for anyone to push a button and generate a statistic and a p-value. Spreadsheets removed the critical thinking around whether the statistic you are using is appropriate and whether the assumptions behind that statistic are met. It looks like this concern also applies to modeling. In these times it seems everyone is a modeler. In the last week, two stories have come into the news.
The first relates to a graphic used by the Georgia Department of Health showing that the number of new coronavirus cases declined every day for the last 2 weeks in the five counties with the most infections. A close examination by reporters revealed that the dates of the graph were not sorted in chronological order and that the graph was a grievous error [2]. The Department of Health claimed it was a mistake, not purposeful, and I believe them. It's far too easy for amateurs to use the graphics capability of Excel to make graphs any way they want without having to think about whether what they are doing is correct.

• Related to this was a story shortly thereafter which reported that the White House was making claims related to the number of COVID-19 deaths using a model developed by Kevin Hassett of the Council of Economic Advisors. Hassett showed that deaths would drop to zero by May 15, which would be great news if true [3]. The problem is that Hassett used a cubic polynomial, which is easily implemented in Excel, to fit the observed data and then extrapolated into the future to predict when the number of COVID deaths would be zero. Every modeler knows that extrapolation is always risky, and working with polynomials is especially risky for many reasons. In this case, had he extrapolated beyond May 15 he would have seen that the predicted number of COVID deaths was negative, an impossibility, and that his model was wrong (don't get me started on "all models are wrong"; that's another commentary). Hassett says he was just "smoothing the data," and he probably was. The problem again goes back to the ease of the statistics and graphics functions in spreadsheets, which takes the critical thinking of modeling out of the equation (no pun intended).

• Today I was reading in the New York Times an article that reported that lockdown delays cost an additional 36,000 lives [4]. That more lives would have been saved had the lockdowns started sooner seems obvious.
To be fair, the Times goes on to say that "all models are only estimates, and it is impossible to know for certain the exact number of people who would have died." But this caveat was buried at the end of the story, and many casual readers might not even have noticed it or gotten that far. There are a couple of other issues related to this article that are worth mentioning. First, just how many significant figures are needed for death counts? I am not an expert in this area, but five significant figures seem too many to me. This seems an unacceptable level of precision. There is a phenomenon in psychology whereby experts who report percentages as x.x% are deemed more credible than experts who just report x%. I would argue that a prediction of 29,410 is deemed more believable than a prediction of 29,400 (which seems more reasonable, using a smaller number of significant figures). The second aspect relates to the title of the article, which was "Lockdown delays cost at least 36,000 lives, data show." I think it wrong to have said "data show." I personally do not believe that models are data. Models are built from data. Data are observable phenomena. A more correct title would have used "predictions show" or "models show."

• Suddenly, groups that operated in relative obscurity are now celebrities. Did you know there was an Institute for Disease Modeling funded by the Bill and Melinda Gates Foundation? I didn't, at least not before the pandemic. As scientists we are thought of as impartial, but we are also human and subject to the same kind of vanity as others. There is a lot at stake here. Sure, most people go into science because they want to better the world, but there is also recognition and money (grants, books, etc.) at stake. Modelers are being interviewed by reporters and presented on television to discuss their models, having their own 15 min of fame.
There is a huge incentive to get the work released first rather than being first and correct. Preprint services like medRxiv do a service by publishing potentially important articles that could benefit society, but these articles are not subject to peer review. Reporters routinely report from these preprints as if they had appeared in peer-reviewed journals [4]. This opens the door for potentially flawed work to be accepted [6]. Flawed models decrease the credibility of other models, opening the door to criticisms that models are "simply unreliable" [7].

• I think there are important implications of COVID-19 modeling outside of modeling pandemic-related statistics, and they relate to climate change. Let's not forget that climate change predictions, many of which are dire, are based on models. Using models to drive policy change during the pandemic sets a precedent for using models to drive policy change with climate change. And by corollary, discrediting pandemic models leads to bystander discrediting of models in the climate change arena. Hence, opponents of climate change policy have a genuine stake in discrediting pandemic models as well. Where does all this leave us, as citizens of the world? As modelers? I recognize that reporters can't state the assumptions behind all the models they report, nor do most laypeople know what a confidence interval is, let alone a prediction interval (hell, a lot of modelers don't even know the difference), but I do think we can do better to educate people. As I stated in my commentary in the last issue, we have an opportunity here. This time we have an opportunity to improve quantitative literacy in society. For the first time, people are interested in modeling and models. We've used modeling in weather prediction for decades, but most people don't seem to care about the science or modeling behind those predictions. They just joke that they are often wrong.
Today, people seem genuinely interested in what's behind COVID-19 predictions and whether they agree with those predictions. Let's take to task the news and how it reports these predictions. Estimates should be reported together with the error in those predictions. Let's take to task the politicians who are selectively using models, or discrediting modeling as a scientific endeavor [8], to serve their own agendas or political views. As modelers we need to remain vigilant in these times because the work being done is important. Modeling matters, and how we use and report those models matters too.

References:
1. Centers for Disease Control and Prevention: COVID-19 forecasts
2. 'It's just cuckoo': state's latest data mishap causes critics to cry foul
3. The Trump administration's "cubic model" of coronavirus deaths
4. Lockdown delays cost at least 36,000 lives, data show
5. Differential effects of intervention timing on COVID-19 spreading in the United States
6. Influential COVID-19 model uses flawed methods and shouldn't guide U.S. policies, critics say
7. COVID-19 projection models are proving to be wrong
8. Cornyn steps in it again, tweeting that data modeling isn't part of the scientific process

key: cord-254339-djmibi3a
authors: Griette, Quentin; Magal, Pierre; Seydi, Ousmane
title: Unreported cases for age dependent COVID-19 outbreak in Japan
date: 2020-06-17
journal: Biology (Basel)
doi: 10.3390/biology9060132
doc_id: 254339
cord_uid: djmibi3a

We investigate the age structured data for the COVID-19 outbreak in Japan. We consider a mathematical model for the epidemic with unreported infectious patients, with and without age structure. In particular, we build a new mathematical model and a new computational method to fit the data by using age-class-dependent exponential growth at the early stage of the epidemic.
This allows us to take into account differences in the response of patients to the disease according to their age. The model also allows for a heterogeneous response of the population to the social distancing measures taken by the local government. We fit this model to the observed data and obtain a snapshot of the effective transmissions occurring inside the population at different times, which indicates where and among whom the disease propagates after the start of public mitigation measures. The COVID-19 disease, caused by the severe acute respiratory syndrome coronavirus (SARS-CoV-2), first appeared in Wuhan, China, and the first cases were notified to WHO on 31 December 2019 [1,2]. Beginning in Wuhan as an epidemic, it then spread very quickly and was characterized as a pandemic on 11 March 2020 [1]. Symptoms of this disease include fever, shortness of breath, and cough, and a non-negligible proportion of infected individuals may develop severe forms of the symptoms, leading to their transfer to intensive care units and, in some cases, death; see, e.g., Guan et al. [3] and Wei et al. [4]. Both symptomatic and asymptomatic individuals can be infectious [4-6], which makes the control of the disease particularly challenging. The virus is characterized by its rapid progression among individuals, most often exponential in the first phase, but also by a marked heterogeneity in populations and geographic areas [7-9]. The number of reported cases worldwide exceeded 3 million as of 3 May 2020 [10]. The heterogeneity of the number of cases and of the severity according to age groups, especially for children and elderly people, has aroused the interest of several researchers [11-15]. Indeed, several studies have shown that the severity of the disease increases with the age and co-morbidity of hospitalized patients (see, e.g., To et al. [15] and Zhou et al. [8]). Wu et al.
[16] have shown that the risk of developing symptoms increases by 4% per year of age in adults aged between 30 and 60 years old, while Davies et al. [17] found that there is a strong correlation between chronological age and the likelihood of developing symptoms. Since completely asymptomatic individuals can also be contagious, a higher probability of developing symptoms does not necessarily imply greater infectiousness: Zou et al. [6] found that, in some cases, the viral load in asymptomatic patients was similar to that in symptomatic patients. Moreover, while adults are more likely to develop symptoms, Jones et al. [18] found that the viral loads in infected children do not differ significantly from those of adults. These findings suggest that a study of the dynamics of inter-generational spread is fundamental to better understand the spread of the coronavirus and, most importantly, to efficiently fight the COVID-19 pandemic. To this end, the distribution of contacts between age groups in society (work, school, home, and other locations) is an important factor to take into account when modeling the spread of the epidemic. To account for these facts, some mathematical models have been developed [13,14,17,19,20]. In Ayoub et al. [19] the authors studied the dependence of the COVID-19 epidemic on the demographic structures of several countries but did not focus on the contact distribution of the populations. In [13,14,17,20] a focus on social contact patterns with respect to chronological age has been made by using the contact matrices provided in Prem et al. [21]. While Ayoub et al. [19], Chikina and Pegden [20], and Davies et al. [17] included the example of Japan in their studies, their approach is significantly different from ours. Indeed, Ayoub et al. [19] use a complex mathematical model to discuss the influence of the age structure on the infection in a variety of countries, mostly through the basic reproduction number R0.
They use parameter values from the literature and from another study by the same group of authors [22], where the parameter identification is done by a nonlinear least-squares minimization. Chikina and Pegden [20] use an age-structured model to investigate age-targeted mitigation strategies. They rely on parameter values from the literature and do not discuss using age-structured temporal series to fit their model. Finally, Davies et al. [17] also discuss age-related effects in the control of the COVID epidemic, and use statistical inference to fit an age-structured SIR variant to data; the model is then used to discuss the efficiency of different control strategies. We provide a new, explicit computational solution for the parameter identification of an age-structured model. The model is based on the SIUR model developed in Liu et al. [23], which accounts for a differentiated infectiousness for reported and unreported cases (contrary to, for instance, other SIR-type models). In particular, our method is significantly different from nonlinear least-squares minimization and does not involve statistical inference. In this article we focus on an epidemic model with unreported infectious symptomatic patients (i.e., with mild or no symptoms). Our goal is to investigate the age structured data of the COVID-19 outbreak in Japan. In Section 2 we present the age structured data and in Section 3 the mathematical models (with and without age structure). One of the difficulties in fitting the model to the data is that the growth rate of the epidemic is different in each age class, which led us to adapt our early method presented in Liu et al. [23]. The new method is presented in Appendix A. In Section 4 we present the comparison of the model with the data. In the last section we discuss our results. Patient data in Japan have been made public since the early stages of the epidemic, with the quarantine of the Diamond Princess in the harbor of Yokohama.
We used data from the website covid19japan.com (https://covid19japan.com, accessed 6 May 2020), which is based on reports from national and regional authorities. Patients are labeled "confirmed" when tested positive for COVID-19 by PCR. Interestingly, the age class of the patient is provided for 13,660 out of 13,970 confirmed patients (97.8% of the confirmed population) as of 29 April. The age distribution of the infected population is represented in Figure 1, compared to the total population per age class (data from the Statistics Bureau of Japan, estimate for 1 October 2019). In Figure 2 we plot the number of reported cases per 10,000 people of the same age class (i.e., the number of infected patients divided by the population of the age class, times 10,000). Both datasets are given in Table 1, and a statistical summary is provided in Table 2. Note that the high proportion of 20-60-year-old confirmed patients may indicate that the severity of the disease is lower for those age classes than for older patients, and therefore that the disease transmits more easily in those age classes because of a higher number of asymptomatic individuals. Elderly infected individuals might transmit less because they are identified more easily. The cumulative number of deaths (Figure 3) is another argument in favor of this explanation. We also reconstructed the time evolution of the reported cases in Figures 4 and 5. Note that the steepest curves precisely concern the 20-60-year-olds, probably because they are economically active and therefore have a high contact rate with the population.

Figure 1. In this figure we plot in blue bars the age distribution of the Japanese population for 10,000 people, and we plot in orange bars the age distribution of the number of reported cases of SARS-CoV-2 for 10,000 patients on 29 April (based on the total of 13,660 reported cases). We observe that 77% of the confirmed patients belong to the 20-60 years age class.
The model consists of the following system of ordinary differential equations:

S'(t) = -tau(t) S(t) [I(t) + U(t)] / N,
I'(t) = tau(t) S(t) [I(t) + U(t)] / N - nu I(t),
R'(t) = nu1 I(t) - eta R(t),
U'(t) = nu2 I(t) - eta U(t).     (1)

This system is supplemented by initial data. Here t >= t0 is time in days, t0 is the starting date of the epidemic in the model, S(t) is the number of individuals susceptible to infection at time t, I(t) is the number of asymptomatic infectious individuals at time t, R(t) is the number of reported symptomatic infectious individuals at time t, and U(t) is the number of unreported symptomatic infectious individuals at time t. A flow chart of the model is presented in Figure 6. Asymptomatic infectious individuals I(t) are infectious for an average period of 1/nu days. Reported symptomatic individuals R(t) are infectious for an average period of 1/eta days, as are unreported symptomatic individuals U(t). We assume that reported symptomatic infectious individuals R(t) are reported and isolated immediately, and cause no further infections. The asymptomatic individuals I(t) can also be viewed as having a low-level symptomatic state. All infections are acquired from either I(t) or U(t) individuals. A summary of the parameters involved in the model is presented in Table 3. Our study begins in the second phase of the epidemic, i.e., after the pathogen has succeeded in surviving in the population. During this second phase tau(t) = tau0 is constant. When strong government measures such as isolation, quarantine, and public closings are implemented, the third phase begins. The actual effects of these measures are complex, and we use a time-dependent decreasing transmission rate tau(t) to incorporate these effects. The formula for tau(t) is

tau(t) = tau0                        for t < D,
tau(t) = tau0 exp(-mu (t - D))       for t >= D.

The date D is the first day of public intervention and mu characterises the intensity of the public intervention. A similar model has been used to describe the epidemics in mainland China, South Korea, Italy, and other countries, and gives reasonable trajectories for the evolution of the epidemic based on actual data [23,25-29].
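The S-I-R-U dynamics with a constant-then-exponentially-decreasing transmission rate can be sketched numerically as follows. All parameter values below (tau0, mu, D, f, the initial seed) are illustrative placeholders, not the values fitted to the Japanese data.

```python
import math

# Illustrative parameters (placeholders, not fitted values)
N = 126.8e6        # total population
f = 0.8            # fraction of asymptomatic who become reported
nu = 1.0 / 7.0     # 1/nu: mean asymptomatic infectious period (days)
eta = 1.0 / 7.0    # 1/eta: mean symptomatic infectious period (days)
nu1, nu2 = f * nu, (1.0 - f) * nu
tau0 = 0.3         # baseline transmission rate (assumed)
mu = 0.05          # intensity of public intervention (assumed)
D = 60.0           # first day of public intervention (assumed)

def tau(t):
    """Constant before day D, exponentially decreasing afterwards."""
    return tau0 if t < D else tau0 * math.exp(-mu * (t - D))

def step(state, t, dt):
    """One forward-Euler step of the S-I-R-U system."""
    S, I, R, U = state
    force = tau(t) * S * (I + U) / N     # new infections per day
    dS = -force
    dI = force - nu * I
    dR = nu1 * I - eta * R
    dU = nu2 * I - eta * U
    return (S + dt * dS, I + dt * dI, R + dt * dR, U + dt * dU)

# Integrate for 120 days from a small seed of asymptomatic infectious
state = (N - 100.0, 100.0, 0.0, 0.0)
dt, t = 0.1, 0.0
trajectory = [state]
while t < 120.0:
    state = step(state, t, dt)
    t += dt
    trajectory.append(state)

S_end = trajectory[-1][0]
print(f"susceptible at day 120: {S_end:.0f} (started at {N - 100:.0f})")
```

With these made-up parameters the epidemic grows until the intervention drives the effective transmission rate below the recovery rates, after which the infectious compartments decay; a production implementation would use a higher-order integrator.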
Compared with these models, we added a scaling with respect to the total population size N, for consistency with the age-structured model (12). This only changes the value of the parameter tau and does not impact the qualitative or quantitative behavior of the model.

Table 3. Parameters of the model.
Symbol | Interpretation | Method
t0 | time at which the epidemic started | fitted
S0 | number of susceptible at time t0 | fixed
I0 | number of asymptomatic infectious at time t0 | fitted
U0 | number of unreported symptomatic infectious at time t0 | fitted
tau(t) | transmission rate at time t | fitted
D | first day of public intervention | fitted
mu | intensity of the public intervention | fitted
1/nu | average time during which asymptomatic infectious are asymptomatic | fixed
f | fraction of asymptomatic infectious that become reported symptomatic infectious | fixed
nu1 = f nu | rate at which asymptomatic infectious become reported symptomatic | fixed
nu2 = (1 - f) nu | rate at which asymptomatic infectious become unreported symptomatic | fixed
1/eta | average time symptomatic infectious have symptoms | fixed

At the early stages of the epidemic, the infectious components of the model I(t), U(t) and R(t) must be exponentially growing. Therefore, the cumulative number of reported symptomatic infectious cases at time t, denoted by CR(t), is

CR(t) = nu1 * integral from t0 to t of I(s) ds.

Since I(t) is an exponential function and CR(t0) = 0, it is natural to assume that CR(t) has the following special form:

CR(t) = chi1 exp(chi2 t) - chi3.     (5)

As in our early articles [23,26-29], we fix chi3 = 1 and we evaluate the parameters chi1 and chi2 by using an exponential fit to the cumulative reported-case data. We use only early data for this part, from day t = d1 until day t = d2, because we want to catch the exponential growth of the early epidemic and avoid the influence of saturation arising at later stages. Once chi1, chi2, chi3 are known, we can compute the starting time of the epidemic t0 from (5) as:

t0 = (1/chi2) ln(chi3/chi1).

We fix S0 = 126.8 x 10^6, which corresponds to the total population of Japan.
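The early-phase identification step can be sketched as follows: with chi3 = 1, fit chi1 and chi2 by a log-linear least-squares fit over an early window, then recover t0. The downstream expressions for I0, U0, tau0 and the basic reproduction number follow from the model equations and are our reconstruction of the paper's elided formulas; the data below are synthetic, not the Japanese series.

```python
import math

# Fixed parameters (illustrative values within the ranges quoted in the text)
f, nu, eta = 0.8, 1.0 / 7.0, 1.0 / 7.0
nu1, nu2 = f * nu, (1.0 - f) * nu
N = S0 = 126.8e6
chi3 = 1.0

# Synthetic early cumulative reported cases CR(t) = chi1*exp(chi2*t) - chi3
true_chi1, true_chi2 = 179.0, 0.085
days = list(range(10))
CR = [true_chi1 * math.exp(true_chi2 * t) - chi3 for t in days]

# Log-linear least squares: log(CR + chi3) = log(chi1) + chi2 * t
ys = [math.log(c + chi3) for c in CR]
n = len(days)
tbar, ybar = sum(days) / n, sum(ys) / n
chi2 = sum((t - tbar) * (y - ybar) for t, y in zip(days, ys)) / \
       sum((t - tbar) ** 2 for t in days)
chi1 = math.exp(ybar - chi2 * tbar)

# Derived quantities (our reconstruction of the elided formulas)
t0 = math.log(chi3 / chi1) / chi2                 # solves CR(t0) = 0
I0 = chi2 * chi3 / nu1                            # from CR'(t) = nu1 * I(t)
U0 = nu2 * I0 / (eta + chi2)                      # exponential ansatz for U(t)
tau0 = N * (chi2 + nu) * I0 / (S0 * (I0 + U0))    # from the I equation
R0_basic = tau0 * S0 / (N * nu) * (1.0 + nu2 / eta)
print(f"chi1={chi1:.1f} chi2={chi2:.4f} t0={t0:.1f} R0={R0_basic:.2f}")
```

Because the synthetic data are exactly exponential, the regression recovers chi1 and chi2 to machine precision; on real, noisy counts the choice of the fitting window [d1, d2] matters.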
The quantities I0, R0, and U0 correspond to the values taken by I(t), R(t) and U(t) at t = t0 (and in particular R0 should not be confused with the basic reproduction number R0). We fix the fraction f of symptomatic infectious cases that are reported. We assume that between 80% and 100% of infectious cases are reported; thus, f varies between 0.8 and 1. We assume that the average time during which the patients are asymptomatic infectious, 1/nu, varies between 1 day and 7 days. We assume that the average time during which a patient is symptomatic infectious, 1/eta, varies between 1 day and 7 days. In other words, we fix the parameters f, nu, eta. Since f and nu are known, we can compute

nu1 = f nu and nu2 = (1 - f) nu.

Computing further (see below for more details), we should have

I0 = chi2 chi3 / nu1,   U0 = nu2 I0 / (eta + chi2),   tau0 = N (chi2 + nu) I0 / (S0 (I0 + U0)),

and by using the approach described in Diekmann et al. [30] and van den Driessche and Watmough [31], the basic reproductive number for model (1) is given by

R0 = (tau0 S0 / (N nu)) (1 + nu2/eta).     (8)

By using (8) we obtain

R0 = ((chi2 + nu)/nu) (1 + nu2/eta) I0/(I0 + U0).

In what follows we will denote by N1, ..., N10 the number of individuals in the age classes [0,10[, ..., [90,100[, respectively. The model for the number of susceptible individuals S1(t), ..., S10(t), respectively for the age classes [0,10[, ..., [90,100[, is the following:

S_i'(t) = -tau_i S_i(t) [ phi_{i,1} (I_1(t) + U_1(t))/N_1 + ... + phi_{i,10} (I_10(t) + U_10(t))/N_10 ],   i = 1, ..., 10.

The model for the number of asymptomatic infectious individuals I1(t), ..., I10(t), respectively for the age classes [0,10[, ..., [90,100[, is the following:

I_i'(t) = tau_i S_i(t) [ phi_{i,1} (I_1(t) + U_1(t))/N_1 + ... + phi_{i,10} (I_10(t) + U_10(t))/N_10 ] - nu I_i(t),   i = 1, ..., 10.

The model for the number of reported symptomatic infectious individuals R1(t), ..., R10(t), respectively for the age classes [0,10[, ..., [90,100[, is

R_i'(t) = nu1,i I_i(t) - eta R_i(t),   i = 1, ..., 10.

Finally, the model for the number of unreported symptomatic infectious individuals U1(t), ..., U10(t), respectively in the age classes [0,10[, ..., [90,100[, is the following:

U_i'(t) = nu2,i I_i(t) - eta U_i(t),   i = 1, ..., 10.

In each age class [0,10[, ..., [90,100[ we assume that there is a fraction f_1, . . .
, f_10 of asymptomatic infectious individuals who become reported symptomatic infectious (i.e., with severe symptoms) and a fraction (1 - f_1), ..., (1 - f_10) who become unreported symptomatic infectious (i.e., with mild symptoms). Therefore we define

nu1,i = f_i nu and nu2,i = (1 - f_i) nu,   i = 1, ..., 10.

In their survey, Prem and co-authors [21] present a way to reconstruct contact matrices from existing data and provide such contact matrices for a number of countries, including Japan. Based on the data provided by Prem et al. [21] for Japan, we construct the contact probability matrix phi. More precisely, we inferred contact data for the missing age classes [80,90[ and [90,100[. The precise method used to construct the contact matrix gamma is detailed in Appendix B. An analogous contact matrix for Japan has been proposed by Munasinghe, Asai and Nishiura [32]. In the contact matrix gamma we used, the entry gamma_{i,j} of the i-th row is the average number of contacts made by an individual in the age class i with an individual in the age class j during one day. Notice that the highest numbers of contacts are achieved within the same age class. The matrix of conditional probabilities phi of contact between age classes is given by

phi_{i,j} = gamma_{i,j} / (gamma_{i,1} + ... + gamma_{i,10}),     (18)

and we plot a visual representation of this matrix in Figure 7.

Figure 7. Graphical representation of the contact matrix phi. The intensity of blue in the cell (i,j) indicates the conditional probability that, given a contact between an individual of age group i and another individual, the latter belongs to the age class j. The matrix was reconstructed from the data of Prem et al. [21], with the method described in Appendix B.

The daily number of reported cases from the model, DR(t), can be obtained by computing the solution of the following equation:

DR'(t) = nu1 I(t) - DR(t).

In Figures 8 and 9 we employ the method presented previously in Liu et al. [29] to fit the data for Japan without age structure.
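The row normalization that turns a raw contact matrix gamma into the conditional-probability matrix phi can be sketched with a hypothetical 3-class matrix (the actual matrix for Japan is 10 x 10 and comes from Prem et al.):

```python
# Hypothetical 3-age-class contact matrix: gamma[i][j] is the average daily
# number of contacts of an individual of class i with individuals of class j.
# Values are made up for illustration.
gamma = [
    [9.0, 3.0, 1.0],
    [3.0, 7.0, 2.0],
    [1.0, 2.0, 4.0],
]

def row_normalize(matrix):
    """phi[i][j] = gamma[i][j] / sum_k gamma[i][k]: given a contact made by an
    individual of class i, the probability that the contactee is in class j."""
    return [[v / sum(row) for v in row] for row in matrix]

phi = row_normalize(gamma)
for i, row in enumerate(phi):
    print(f"class {i}: " + " ".join(f"{p:.3f}" for p in row))
```

Each row of phi sums to one by construction, which is what makes it a conditional probability distribution over the age class of the contactee.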
The model to compute the cumulative number of deaths D(t) from the reported individuals is the following:

D'(t) = p eta_d R(t),

where eta_d is the death rate of reported infectious symptomatic individuals and p is the case fatality rate (namely the fraction of deaths per reported infectious individuals). In the simulation we chose 1/eta_d = 6 days, and the case fatality rate p = 0.0286 is computed by using the cumulative number of confirmed cases and the cumulative number of deaths (as of 29 April) as follows:

p = (cumulative number of deaths) / (cumulative number of reported cases) = 393/13,744.

In Figure 10 we plot the cumulative number of deaths D(t) by using the same simulations as in Figures 8 and 9.

Figure 8. Cumulative number of cases. We plot the cumulative data (red dots) and the best fits of the model CR(t) (black curve) and CU(t) (green curve). We fix f = 0.8, 1/eta = 7 days and 1/nu = 7 days, and we apply the method described in Liu et al. [29]. The best fit is d1 = 2 April, d2 = 5 April, D = 27 April, mu = 0.6, chi1 = 179, chi2 = 0.085, chi3 = 1 and t0 = 13 January.

Figure 9. Daily number of cases. We plot the daily data (black dots) with DR(t) (blue curve). We fix f = 0.8, 1/eta = 7 days and 1/nu = 7 days, and we apply the method described in Liu et al. [29]. The best fit is d1 = 2 April, d2 = 5 April, D = 27 April, mu = 0.6, chi1 = 179, chi2 = 0.085, chi3 = 1 and t0 = 13 January.

In order to describe the confinement for the age-structured model (12)-(15) we will use, for each age class i = 1, ..., 10, a different transmission rate having the following form:

tau_i(t) = tau_i                          for t < D_i,
tau_i(t) = tau_i exp(-mu_i (t - D_i))     for t >= D_i.

The date D_i is the first day of public intervention for the age class i, and mu_i is the intensity of the public intervention for each age class. In Figure 11 we plot the cumulative number of reported cases as given by our model (12)-(15) (solid lines), compared with the reported cases data (black dots). We used the method described in Appendix A to estimate the parameters tau_i from the data.
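The case-fatality computation and the death compartment above can be sketched as follows. The differential form D'(t) = p * eta_d * R(t) is our reading of the elided death equation, and the trajectory R(t) below is a made-up epidemic pulse, not the fitted Japanese curve.

```python
import math

# Case fatality rate from cumulative counts as of 29 April (from the text)
p = 393 / 13744            # about 0.0286, i.e., 2.86%
eta_d = 1.0 / 6.0          # 1/eta_d = 6 days from report to death (from the text)

def R(t):
    """Hypothetical trajectory of reported infectious individuals:
    a smooth pulse peaking around day 50, for illustration only."""
    return 4000.0 * math.exp(-((t - 50.0) / 15.0) ** 2)

# Integrate D'(t) = p * eta_d * R(t) with forward Euler
D, dt, t = 0.0, 0.1, 0.0
while t < 120.0:
    D += dt * p * eta_d * R(t)
    t += dt

print(f"case fatality rate p = {p:.4f}")
print(f"cumulative deaths predicted by day 120: {D:.0f}")
```

Note that p enters only as a multiplicative factor, so the quoted ratio 393/13,744 pins down the scale of D(t) once R(t) is known.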
in figure 12 we plot the cumulative number of unreported cases (solid lines) as given by our model with the same parameter values, compared to the existing data of reported cases (black dots). figure 11 shows a comparison between the model (12)-(15) and the age-structured data from japan by age class; we took 1/ν = 1/η = 7 days for each age class, and our best fit is obtained for f_i depending linearly on the age class until it reaches 90%. in order to understand the role of the transmission network between age groups in this epidemic, we plot in figure 13 the transmission matrices computed at different times. the transmission matrix combines the contact matrix φ given in (18) with the transmission rates τ_i(t) fitted to the data as in figure 11. during the early stages of the epidemic, the transmission seems to be evenly distributed among age classes, with a little bias towards younger age classes (figure 13a). younger age classes seem to react more quickly to social distancing policies than older classes, therefore their transmission rate drops rapidly (figure 13b,c); one month after the start of social distancing measures, the transmission mostly occurs within the elderly classes (60-100 years, figure 13d). the recent covid-19 pandemic has led many local governments to enforce drastic control measures in an effort to stop its progression. those control measures were often taken in a state of emergency and without any real visibility concerning the later development of the epidemic, in order to prevent the collapse of the health systems under the pressure of severe cases. mathematical models can help see more clearly what the future of the pandemic could be, provided that the particularities of the pathogen under consideration are correctly identified.
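the time-dependent transmission matrices of figure 13 can be sketched numerically. the sketch below makes the simplifying assumption that the (i, j) entry is the contact probability φ[i, j] scaled by the fitted, time-dependent transmission rate τ_i(t) of age class i; the paper's exact normalisation may differ, and all parameter values in the example are placeholders:

```python
import numpy as np

def transmission_matrix(t, tau, d, mu, phi):
    """Entry (i, j) scales the contact probability phi[i, j] by the
    time-dependent transmission rate tau_i(t) of age class i
    (constant before the intervention day d[i], exponential decay after)."""
    tau = np.asarray(tau, dtype=float)
    d = np.asarray(d, dtype=float)
    mu = np.asarray(mu, dtype=float)
    rates = np.where(t <= d, tau, tau * np.exp(-mu * (t - d)))
    return rates[:, None] * phi

# Toy example with 3 age classes and a uniform contact-probability matrix.
phi = np.full((3, 3), 1.0 / 3.0)
tau = np.array([0.1, 0.2, 0.3])
d = np.array([10.0, 10.0, 10.0])
mu = np.array([0.6, 0.6, 0.6])
m_early = transmission_matrix(0.0, tau, d, mu, phi)
m_late = transmission_matrix(11.0, tau, d, mu, phi)
```

plotting such matrices at several times reproduces the qualitative picture described in the text: rows belonging to age classes whose τ_i(t) has decayed fade out first.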
in the case of covid-19, one of the features of the pathogen which makes it particularly dangerous is the existence of a high contingent of unidentified infectious individuals who spread the disease without notice. this makes non-intensive containment strategies such as quarantine and contact-tracing relatively inefficient, but also renders predictions by mathematical models particularly challenging. early attempts to reconstruct the epidemics by using siur models were performed in liu et al. [23,26-28], who used them to fit the behavior of the epidemics in many countries, by including undetected cases into the mathematical model. here we extend our modeling effort by adding the time series of deaths into the equation. in section 4 we present an additional fit of the number of disease-induced deaths coming from symptomatic (reported) individuals (see figure 10). in order to fit the data properly, we were forced to reduce the length of stay in the r-compartment to 6 days (on average), meaning that death induced by the disease should occur on average faster than recovery. a shorter period between infection and death (compared to remission) has also been observed, for instance, by verity et al. [7]. the major improvement in this article is to combine our early siur model with chronological age. early results using age-structured sir models were obtained by kucharski et al. [33], but no unreported individuals were considered and no comparison with age-structured data was performed. indeed, in this article we provide a new method to fit the data and the model. the method extends our previous method for the siur model without age (see appendix a). the data presented in section 2 suggest that chronological age plays a very important role in the expression of the symptoms. most of the reported patients are between 20 and 60 years old (see figure 1), while most of the deceased are between 60 and 90 years old (see figure 3).
this suggests that the symptoms associated with covid-19 infection are more severe in elderly patients, which has been reported in the literature several times (see e.g., lu et al. [12], zhou et al. [8]). in particular, the probability of being asymptomatic (our parameter f) should in fact depend on the age class. indeed, the best match for our model (see figure 11) was obtained under the assumption that the proportion of symptomatic individuals among the infected increases with the age of the patient. this linear dependency of f as a function of age is consistent with the observations of wu et al. [16] that the severity of the symptoms increases linearly with age. as a consequence, unreported cases are a majority for young age classes (age classes below 50 years) and become a minority for older age classes (above 50 years), see figure 12. moreover, our model reveals that the policies used by the government to reduce contacts between individuals have strongly heterogeneous effects depending on the age class. plotting the transmission matrix at different times (see figure 13) shows that younger age classes react more quickly and more efficiently than older classes. this may be due to the fact that the number of contacts in a typical day is higher among younger individuals. as a consequence, we predict that one month after the effective start of public measures, new transmissions will almost exclusively occur in the elderly classes. the observation that younger age classes play a major role in the transmission of the disease has been highlighted several times in the literature, see e.g., davies et al. [17], cao et al. [11], kucharski et al. [33] for the covid-19 epidemic, but also mossong et al. [34] in a more general context. we developed a new model for age-structured epidemics and provided a new and efficient method to identify the parameters of this model based on observed data.
our method differs significantly from the existing nonlinear least-squares and statistical inference methods and we believe that it produces high-quality results. moreover, we only use the initial phase of the epidemic for the identification of the epidemiological parameters, which shows that the model itself is consistent with the observed phenomenon and argues against overfitting. yet our study could be improved in several directions. we only use reported cases which were confirmed by pcr tests, and therefore the number of tests performed could introduce a bias in the observed data, and therefore in our results. we are currently working on an integration of this number of tests in our model. we use a phenomenological model to describe the response of the population, in terms of number of contacts, to the mitigation measures imposed by the government. this could probably be described more precisely by investigating the mitigation strategies in terms of social networks. nevertheless we believe that our study offers a precise and robust mathematical method which adds to the existing literature. we first choose two days d_1 and d_2 between which each cumulative age group grows like an exponential. by fitting the cumulative age classes [0, 10[, [10, 20[, . . . , [90, 100[ between d_1 and d_2, for each age class j = 1, . . . , 10 we can find χ_1^j, χ_2^j and χ_3^j such that cr_j(t) = χ_1^j e^(χ_2^j t) − χ_3^j, where χ_i^j ≥ 0 for all j = 1, . . . , n and i = 1, 2, 3, and we choose a starting time t_0 ≤ d_1. figure a1 shows an exponential fit for each age class using the data from japan. we assume that cr_j'(t) = ν f_j i_j(t) for j = 1, . . . , n, and therefore we obtain i_j(t) = i_j* e^(χ_2^j t) (a3), where i_j* = χ_1^j χ_2^j / (ν f_j). by assuming that the number of susceptible individuals remains constant, we obtain a differential equation (a5) for the unreported compartments u_j(t),
and if we assume that the u_j(t) have the form u_j(t) = u_j* e^(χ_2^j t), then by substituting in (a5) we obtain the expression of the constants u_j*. the cumulative number of unreported cases cu_j(t) is computed as cu_j'(t) = ν (1 − f_j) i_j(t), with the corresponding initial condition. we define the error between the data and the model as follows:

ε_j(t) = i_j'(t) − τ_j s_j [ φ_j1 (i_1(t) + u_1(t)) / n_1 + . . . + φ_jn (i_n(t) + u_n(t)) / n_n ] + ν i_j(t), for j = 1, . . . , n,

or equivalently

ε_j(t) = (χ_2^j + ν) i_j* e^(χ_2^j t) − τ_j s_j [ φ_j1 (i_1* + u_1*) / n_1 e^(χ_2^1 t) + . . . + φ_jn (i_n* + u_n*) / n_n e^(χ_2^n t) ].

let the matrix φ be fixed. we look for the vector τ = (τ_1, . . . , τ_n) which minimizes min_{τ ∈ r^n} ∑_{j=1,...,n} ∫ ε_j(t)^2 dt. define for each j = 1, . . . , n

h_j(t) := s_j [ φ_j1 (i_1* + u_1*) / n_1 e^(χ_2^1 t) + . . . + φ_jn (i_n* + u_n*) / n_n e^(χ_2^n t) ].

the survey [21] presents reconstructed contact matrices for a number of countries including japan, for the 5-year age classes [0, 5), [5, 10), . . . , [75, 80), at various locations (work, school, home, and other locations) and a compilation of those contact matrices to account for all locations. the precise description of the compilation is presented in the paper. note that this paper is a follow-up of mossong et al. [34], where the survey procedure is described (including the data collection protocol) for several european countries participating in the polymod study. the data is publicly available online (prem et al. [21], supporting dataset, doi: https://doi.org/10.1371/journal.pcbi.1005697.s002) and is presented in the form of a zipped collection of spreadsheets, containing the data for several countries in columns x1, x2, ..., x16. the columns stand for the average number of contacts of one individual of the corresponding age class (0-5 years for x1, 5-10 years for x2, etc.), with an individual of the age class indicated by the row (the first row is 0-5 years, the second is 5-10 years, etc.). since the age span covered by the study stops at 80, we had to infer the number of contacts for people over the age of 80.
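the two numerical steps of the identification procedure in appendix a — the exponential fit of the cumulative reported cases and the least-squares estimate of the transmission rates τ_j — can be sketched with numpy. this is a minimal sketch under simplifying assumptions: χ_3 is held fixed during the fit, and the integral of ε_j(t)^2 is approximated by a sum on a uniform time grid (so the grid spacing cancels in the ratio):

```python
import numpy as np

def fit_exponential(t, cr, chi3=1.0):
    """Fit cr(t) ~ chi1 * exp(chi2 * t) - chi3 for one age class by linear
    regression on log(cr + chi3); chi3 is held fixed (simplifying assumption)."""
    chi2, log_chi1 = np.polyfit(t, np.log(np.asarray(cr) + chi3), 1)
    return np.exp(log_chi1), chi2

def estimate_tau_j(a_j, h_j):
    """Closed-form minimiser of sum_t (a_j(t) - tau_j * h_j(t))^2, the
    discretised analogue of minimising the integral of epsilon_j(t)^2,
    where a_j(t) = (chi2_j + nu) * I_j* * exp(chi2_j * t)."""
    return np.dot(a_j, h_j) / np.dot(h_j, h_j)

# Synthetic check with the fitted values quoted in the text
# (chi1 = 179, chi2 = 0.085, chi3 = 1).
t = np.arange(0.0, 20.0)
cr = 179.0 * np.exp(0.085 * t) - 1.0
chi1, chi2 = fit_exponential(t, cr, chi3=1.0)
```

on noise-free synthetic data the regression recovers χ_1 and χ_2 exactly, and when a_j is exactly proportional to h_j the least-squares estimate returns the true proportionality constant τ_j.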
we postulated that most people aged 80 or more are retired and that their behaviour does not significantly differ from the behaviour of people in the age class [75, 80). therefore we completed the missing columns by copying the last available information and shifting it to the bottom. we repeated the procedure for the lines. we believe that the introduced bias is kept to a minimum, since the numerical values are relatively low compared to the diagonal. because we use 10-year age classes and the data is given in 5-year age classes, we had to combine adjacent columns to recover the average number of contacts. to combine columns together, we used the weighted average

c_i = ( n_{2(i−1)+1} c_{2(i−1)+1} + n_{2(i−1)+2} c_{2(i−1)+2} ) / ( n_{2(i−1)+1} + n_{2(i−1)+2} ),

where the column c_i corresponds to the average number of contacts of an individual taken at random in the age class [10(i − 1), 10i), and c_j (on the right-hand side) is the average number of contacts of an individual taken at random in the age class [5(j − 1), 5j). to combine two lines, we simply use the sum of the data: l_i = l_{2(i−1)+1} + l_{2(i−1)+2}. the matrix γ in (17) is the transpose of the array obtained by the former procedure applied to the "all locations" dataset. then φ is obtained by scaling the lines of γ to 1.

references:
- who timeline - covid-19. world health organization.
- pneumonia of unknown cause - china. disease outbreak news.
- clinical characteristics of coronavirus disease 2019 in china.
- presymptomatic transmission of sars-cov-2 - singapore.
- transmission of 2019-ncov infection from an asymptomatic contact in germany.
- sars-cov-2 viral load in upper respiratory specimens of infected patients.
- clinical course and risk factors for mortality of adult inpatients with covid-19 in wuhan, china: a retrospective cohort study.
- report of the who-china joint mission on coronavirus disease 2019 (covid-19). 2020.
- available online: world health organization.
- sars-cov-2 infection in children: transmission dynamics and clinical characteristics.
- sars-cov-2 infection in children.
- the effect of control strategies to reduce social mixing on outcomes of the covid-19 epidemic in wuhan, china: a modelling study.
- age-structured impact of social distancing on the covid-19 epidemic in india.
- temporal profiles of viral load in posterior oropharyngeal saliva samples and serum antibody responses during infection by sars-cov-2: an observational cohort study.
- estimating clinical severity of covid-19 from the transmission dynamics in wuhan, china.
- cmmid covid-19 working group. age-dependent effects in the transmission and control of covid-19 epidemics.
- an analysis of sars-cov-2 viral load by patient age.
- age could be driving variable sars-cov-2 epidemic trajectories worldwide.
- modeling strict age-targeted mitigation strategies for covid-19.
- projecting social contact matrices in 152 countries using contact surveys and demographic data.
- characterizing key attributes of the epidemiology of covid-19 in china: model-based estimations.
- understanding unreported cases in the 2019-ncov epidemic outbreak in wuhan, china, and the importance of major public health interventions.
- reference table for the year 2019: computation of population by age (single years) and sex - total population, japanese population.
- estimating the last day for covid-19 outbreak in mainland china.
- predicting the cumulative number of cases for the covid-19 epidemic in china from early data.
- a covid-19 epidemic model with latency period.
- a model to predict covid-19 epidemics with applications to south korea, italy, and spain.
- predicting the number of reported and unreported cases for the covid-19 epidemic in china.
- on the definition and the computation of the basic reproduction ratio r0 in models for infectious diseases in heterogeneous populations.
- reproduction numbers and subthreshold endemic equilibria for compartmental models of disease transmission.
- quantifying heterogeneous contact patterns in japan: a social contact survey.
- early dynamics of transmission and control of covid-19: a mathematical modelling study.
- social contacts and mixing patterns relevant to the spread of infectious diseases.

acknowledgments: data from https://covid19japan.com. the authors declare no conflict of interest. author contributions: p.m. and o.s. designed the original study; q.g., p.m. and o.s. participated in the adaptation to the age-structured data; q.g. and p.m. wrote the computer code; all authors actively contributed to the initial version and revisions of the manuscript. all authors have read and agreed to the published version of the manuscript.

remark a1. it does not seem possible to estimate the matrix of contacts φ by using a similar optimization method. indeed, if we look for a matrix φ = (φ_ij) which minimizes the same criterion, the minimum is attained whenever φ is diagonal; therefore the optimum is reached for any diagonal matrix. moreover, by using similar considerations, if several χ_2^j are equal, we can find a multiplicity of optima (possibly with φ not diagonal). this means that trying to optimize by using the matrix φ does not yield significant and reliable information. in figure a2 we present an example of application of our method to fit the japanese data; we use the period going from 20 march to 15 april. key: cord-203620-mt9ivgzi authors: mccreery, clara h.; katariya, namit; kannan, anitha; chablani, manish; amatriain, xavier title: effective transfer learning for identifying similar questions: matching user questions to covid-19 faqs date: 2020-08-04 journal: nan doi: nan sha: doc_id: 203620 cord_uid: mt9ivgzi people increasingly search online for answers to their medical questions, but the rate at which medical questions are asked online significantly exceeds the capacity of qualified people to answer them. this leaves many questions unanswered or inadequately answered.
many of these questions are not unique, and reliable identification of similar questions would enable more efficient and effective question answering schema. covid-19 has only exacerbated this problem. almost every government agency and healthcare organization has tried to meet the informational need of users by building online faqs, but there is no way for people to ask their question and know if it is answered on one of these pages. while many research efforts have focused on the problem of general question similarity, these approaches do not generalize well to domains that require expert knowledge to determine semantic similarity, such as the medical domain. in this paper, we show how a double fine-tuning approach of pretraining a neural network on medical question-answer pairs followed by fine-tuning on medical question-question pairs is a particularly useful intermediate task for the ultimate goal of determining medical question similarity. while other pretraining tasks yield an accuracy below 78.7% on this task, our model achieves an accuracy of 82.6% with the same number of training examples, an accuracy of 80.0% with a much smaller training set, and an accuracy of 84.5% when the full corpus of medical question-answer data is used. we also describe a currently live system that uses the trained model to match user questions to covid-related faqs. even before the advent of the covid-19 pandemic, people across the world were turning to the internet to find answers to their medical concerns [1]. around 7% of google's daily searches were health-related, equivalent to around 70,000 queries every minute [2]. with the emergence of medical question-answering websites such as adam 1, webmd 2, askdocs 3 and healthtap 4, people now have the opportunity to ask detailed questions and find answers, from experts, that satisfy their needs. covid-19 has done nothing but accelerate this trend.
almost every government agency and healthcare organization has tried to meet the informational need of users by building online faqs that try to address as many covid-related topics as possible (see cdc's 5, who's 6, or mayo clinic's 7 faqs, for example). the examples above already illustrate two important problems of any medical q&a collection: (1) there is a very large number of possible questions that can be formulated in different ways, and (2) it is not easy for a user to browse through a large collection of pre-existing questions to find the one that most resembles their need. a scalable solution to overcome both of these issues is to build a system that can automatically match user-formulated questions with semantically similar answered questions, and provide those as suggestions to the users. if no similar answered questions exist, we can mark them as priority for experts to respond. this approach more directly satisfies user needs, allowing them to use their own words to formulate the question. it also provides an avenue for collecting unanswered questions that users want answered, which is extremely important in a rapidly changing situation such as the current covid-19 pandemic. the problem of matching general unanswered questions with semantically similar answered questions has been well-studied in the context of online user forums [7, 9, 11, 27], community qa [8, 16, 29] and question-answer archives [15, 16]. typical approaches assume a large amount of training data on which either statistics can be computed or models can be learned. however, these approaches fall short when applied to the problem of medical question similarity. first, medical questions imbibe a large amount of medical information, such that a single word can completely change the meaning of the question. as an example, i'm pregnant and i believe i've been infected with coronavirus. what should i know about going to the hospital?
and should i visit the doctor if i am expecting and think i might have covid-19? are similar questions with low overlap, but is it safe to take vitamin d3 supplements to build immunity against coronavirus? and is it safe to take hydroxychloroquine to build immunity against coronavirus? are critically different and only a couple of words apart. second, there is no publicly available medical question-question similarity data at the scale where these differences can be effectively encoded in order to learn a reliable similarity function. in fact, we hypothesize that constructing such large datasets that cover the large functional space of nuanced variations in the medical domain can be quite hard, and is not a scalable proposition. in this paper, we tackle the general problem of medical question-question similarity, assuming only a small amount of labeled data of similarity pairs. we also apply the general solution to a specific covid-19 scenario (see figure 1) where many different questions from different sources are integrated into a user-friendly experience. our proposed solution stems from two key insights. first, whether or not two questions are semantically similar is akin to asking whether or not the answer to one also answers the other. this means that the answers in the answered questions contain a wealth of medical knowledge that can be distilled into the model. the second insight is that we can infuse this medical knowledge from the answers as a pretraining task within a language model, so that we can capture relatedness between words/concepts in the language. the recent success of pretrained bi-directional transformer networks for natural language processing in non-medical fields supports this insight [12, 20, 22, 24, 28].
our approach stems from augmenting a general language model such as bert with medical knowledge by a process of double fine-tuning that first distills medical knowledge using a large corpus from the relevant in-domain task of medical question-answer pairs. subsequently, it fine-tunes on the available small corpus of the question-question similarity dataset. our models pretrained on medical question-answer pairs outperform models pretrained on out-of-domain question similarity with high statistical significance. in particular, while other pretraining tasks yield an accuracy below 78.7% on this task, our model achieves an accuracy of 82.6% with the same number of training examples, an accuracy of 80.0% with a much smaller training set, and an accuracy of 84.5% when the full corpus of medical question-answer data is used. the main contributions of this paper are:
• we present an approach of double fine-tuning for the problem of question-question similarity: this helps the model cope with data sparsity by imbibing domain knowledge through an intermediate fine-tuning task.
• we prove that, particularly for medical nlp, domain matters: pretraining on a different task in the same domain outperforms pretraining on the same task in a different domain. however, using extensive experimentation we show that the choice of the in-domain task matters: choosing the task that provides ample signal to capture the domain knowledge needed for the final task is central.
• we apply the general approach of medical question similarity to covid-19 specific questions.
• we release 8 a dataset of medical question pairs generated and labeled by doctors that is based upon real, patient-asked questions, hereafter referred to as the mqp dataset. some sample examples from this dataset are provided in table 1.
the rest of the paper is structured as follows: § 2 describes the methodology used in creating a dataset that will be made publicly available. § 3 provides the overview of the approach.
§ 4 describes how we used the model to build a service that matches users' covid-19-related questions to faqs published online. § 5 describes experimental details and the key results, § 6 discusses related work, and we end with a discussion on future work. there is no existing dataset that we know of for medical question similarity. therefore, one contribution of this paper is that we have generated such a dataset, which we refer to as mqp and are releasing. this dataset is hand-generated by doctors and contains 3048 medical question pairs that are labeled similar or different. we explicitly choose doctors for this task because determining whether or not two medical questions are the same requires medical training that crowd-sourced workers rarely have. we present doctors with a list of 1524 patient-asked questions randomly sampled from a publicly available crawl of healthtap [13]. in all of the intermediate tasks that we consider, we make sure to exclude these sampled questions. each question results in one similar and one different pair through the following instructions provided to the labelers:
(1) rewrite the original question in a different way while maintaining the same intent. restructure the syntax as much as possible and change medical details that would not impact your response (ex. 'i'm a 22-y-o female' could become 'my 26 year old daughter').
(2) come up with a related but dissimilar question for which the answer to the original question would be wrong or irrelevant. use similar key words.
the first instruction generates a positive question pair (similar) and the second generates a negative question pair (different). with the above instructions, we intentionally frame the task such that positive question pairs can look very different by superficial metrics, and negative question pairs can conversely look very similar. this ensures that the task is not trivial. in table ??, we provide examples of how the doctors perform this task.
table 1 has more examples of the pairs they generate. we anticipate that each doctor interprets these instructions slightly differently, so to reduce bias, no doctor providing data in the train set generates any data in the dev or test set. thus, instead of a random train-dev-test split, the splits are created based on the doctors that labeled the examples. in other words, we make sure that the set of doctors that created the examples in the training set is disjoint from those that created examples in the dev or test set. furthermore, we also ensure that there is no overlap between the seed questions in the train and test set. the final dataset contains 4567 unique questions. the minimum, maximum, median and average number of tokens in these questions are 4, 81, 20 and 22.675 respectively, showing that there is variance in the length of the questions. the shortest question is "are fibroadenomas malignant?" to obtain an oracle score, we also have doctors hand-label question pairs that a different doctor generated. the accuracy of the second doctor with respect to the labels intended by the first (viz. inter-annotator agreement) is used as an oracle and is 87.6% in our test set of 836 question pairs. we are interested in learning a model that determines whether two medical questions are similar, i.e., have semantic correspondence. if a large corpus of pairs of similar medical questions is available, it would be relatively straightforward to learn a model for question similarity, as done in the case of the large-scale quora dataset (qqp) [9, 10] for general question-question similarity on the quora platform. however, labeled training data is still one of the largest barriers to supervised learning, particularly in the medical field where it is expensive to get doctor time for hand-labeling data. to overcome this issue, we take the approach of double fine-tuning, derived from transfer learning.
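the labeler-disjoint splitting scheme described above can be sketched in a few lines of plain python (a minimal sketch; the tuple layout and the 25% test fraction are illustrative assumptions, not the paper's exact protocol):

```python
import random

def split_by_labeler(pairs, test_fraction=0.25, seed=0):
    """Split labeled question pairs so that no doctor (labeler) contributes
    to both train and test. `pairs` is a list of tuples
    (doctor_id, question1, question2, label)."""
    doctors = sorted({p[0] for p in pairs})
    rng = random.Random(seed)
    rng.shuffle(doctors)
    n_test = max(1, int(len(doctors) * test_fraction))
    test_doctors = set(doctors[:n_test])
    train = [p for p in pairs if p[0] not in test_doctors]
    test = [p for p in pairs if p[0] in test_doctors]
    return train, test
```

splitting on the doctor id rather than on individual pairs is what prevents a model from exploiting one labeler's idiosyncratic phrasing habits across the train/test boundary.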
double fine-tuning works as follows: starting with a pretrained model trained on a large general corpus, the model is subsequently fine-tuned twice. in the first fine-tuning stage, a related task with large amounts of training data is used to train the model. the goal of this step is to have the model imbibe the requisite knowledge into an otherwise generic model. our main dataset for this purpose is the medical question-answering (qa) dataset described under the bert+qa model in section 5.2, where the goal is to predict if the given answer correctly answers the given question. the final fine-tuning is performed using the small amount of labeled data available for the final goal. in our case, this refers to the task of identifying question similarity, and the dataset we use is mqp, described in section 2. both tasks are posed as binary classification problems with cross-entropy loss. in order to understand the importance of intermediate fine-tuning, we also experiment with different types of intermediate tasks (see figure 2). for the base representation model, we use the architecture and weights from bert [12]. we also compare against the previous state-of-the-art (sota) models biobert [17], scibert [6], and clinicalbert [14]. note that these three bert models have been fine-tuned once already on the original bert tasks, but with different text corpora. we also perform an ablation over the pretrained model architecture and reproduce our results starting with the xlnet model instead of bert. as the coronavirus crisis has proliferated, one useful source of information has been faqs published by various sources such as the cdc, fda, nytimes and others. moreover, these faqs have continued to evolve and are still evolving as we learn new information about the disease, prevention and safety measures. curai health is a remote healthcare platform where one can get treatment for many common health issues from real doctors, without having to go to a doctor's office.
we deployed a service on our platform that enables users to enter their question in free text and attempts to match their question to an existing faq. the goal of the system is to match a user question to a given set of question-answer pairs (faqs). while the answer to a question can also be useful in knowing whether the user's question is relevant to a given faq, we simplified this problem to identifying questions in our faqs that were similar to the user question, ignoring the answer. therefore, given a pair of (user question, faq question), we can now use our double-fine-tuned bert-based question-similarity model to predict whether the questions are similar or not. since the model was not trained on coronavirus-related data and it did not have terms such as 'coronavirus' and 'covid' in its tokenizer vocabulary, the model was yielding unexpected results when the questions were input as-is. therefore, we replaced such terms in both the user question as well as the faqs with generic placeholders like 'disease'. since the launched service made it clear that the questions were meant to be covid-19 related and the set of faqs pertains to the same topic as well, performing such replacements is acceptable and will not change the semantic meaning of the question. therefore, questions such as "how can i protect myself from covid-19?" were transformed into "how can i protect myself from the disease?" since we have only a few hundred curated faqs, we score every (user question, faq) pair using the bert model to get similar pairs. as we scale up to serving more faqs, we will need to come up with an effective candidate generation scheme before running bert inference. with our current throughput, we anticipate the current approach of scoring all faqs to continue to work until we reach a couple of thousand faqs. since our model was not fine-tuned on any kind of covid questions or text, we found that it did make errors.
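the preprocessing and brute-force scoring described above can be sketched as follows. this is a minimal sketch: the replacement table is illustrative (the paper only mentions placeholders like 'disease'), and `score_fn` stands in for the bert similarity model:

```python
import re

# Illustrative replacement table; longer terms listed first so that
# "covid-19" is rewritten before the bare "covid" substring.
REPLACEMENTS = {"covid-19": "the disease",
                "coronavirus": "the disease",
                "covid": "the disease"}

def normalize_question(q):
    """Replace out-of-vocabulary pandemic terms with a generic placeholder
    before scoring, as described in the text."""
    out = q.lower()
    for term, generic in REPLACEMENTS.items():
        out = re.sub(re.escape(term), generic, out)
    return out

def rank_faqs(user_q, faqs, score_fn):
    """Score the user question against every FAQ question (brute force,
    as described for a few hundred FAQs) and return FAQs by similarity."""
    q = normalize_question(user_q)
    scored = [(score_fn(q, normalize_question(f)), f) for f in faqs]
    return [f for _, f in sorted(scored, reverse=True)]
```

with a real model, `score_fn` would be the double-fine-tuned classifier's similarity probability; any cheap surrogate (such as token overlap) can be plugged in to test the plumbing.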
since our platform allows users to chat with a medical professional for free, we wanted to bias towards precision over recall, i.e., pairs the model predicts as similar should actually be similar, and it is acceptable to output no results even if there are actually relevant faqs in our curated set. we achieved this by ensuring some minimum amount of "key" token overlap between the user question and the faq using a tf-idf based filter. given the nature of the bert model and the latency required to provide a good user experience, we used two nvidia tesla k80 gpus for inference. we encapsulate the model as a microservice, containerize it, and run the container image on google kubernetes engine in google cloud platform. the service can be seen in action in figure 1. as shown in figure 3, for each faq that we render, we also display who it came from (source) and when it was last updated. since (1) we did not have a readily available dataset of covid-specific questions to quantify performance on covid question similarity and (2) the model itself is applicable to general medical question similarity, we evaluate the model performance on general medical question similarity. this section describes the datasets, models and the evaluation setup along with the results. we used the following datasets to derive the pretraining tasks. quora question pairs (qqp) is a labeled corpus of 363,871 question pairs from quora, an online question-answer forum [10]. these question pairs cover a broad range of topics, most of which are not related to medicine. however, it is a well-known dataset containing labeled pairs of similar and dissimilar questions. healthtap is a medical question-answering website in which patients can have their questions answered by doctors. we use a publicly available crawl [13] with 1.6 million medical questions. each question has corresponding long and short answers, doctor meta-data, category labels, and lists of related topics.
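a tf-idf based "key token" overlap filter of the kind mentioned above can be sketched in plain python. this is a simplified sketch under assumptions not stated in the paper: idf weights are computed over a question corpus, the overlap score is the idf mass of shared tokens divided by the idf mass of the user question, and the threshold value is illustrative:

```python
import math
import re
from collections import Counter

def idf_weights(corpus):
    """Compute inverse document frequencies over a corpus of questions."""
    docs = [set(re.findall(r"[a-z']+", q.lower())) for q in corpus]
    df = Counter(tok for d in docs for tok in d)
    n = len(docs)
    return {tok: math.log(n / df[tok]) for tok in df}

def key_token_overlap(user_q, faq_q, idf, threshold=0.2):
    """Precision filter: require that the idf-weighted token overlap between
    the user question and the FAQ question exceeds a threshold before a
    model match is surfaced (the threshold value here is illustrative)."""
    u = set(re.findall(r"[a-z']+", user_q.lower()))
    f = set(re.findall(r"[a-z']+", faq_q.lower()))
    overlap = sum(idf.get(t, 0.0) for t in u & f)
    total = sum(idf.get(t, 0.0) for t in u) or 1.0
    return overlap / total >= threshold
```

weighting by idf means that sharing a rare, informative token (e.g., a drug name) counts far more than sharing common words, which is exactly the precision-biasing behaviour the deployment needed.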
we reduce this dataset to match the size of qqp via random sampling for direct performance comparisons, but also run one experiment leveraging the full corpus. webmd is an online publisher of medical information including articles, videos, and frequently asked questions (faq). for a second medical question-answer dataset, we use a publicly available crawl [21] over the faq of webmd with 46,872 question-answer pairs. we decrease the size of qqp and healthtap to match this number before making direct performance comparisons. we are interested in understanding the role of double fine-tuning, and in particular the effectiveness of the intermediate task before the final fine-tuning with the training set from mqp (§2). for this intermediate training, we consider the following training variations (see figure 2).
bert: this is the baseline model without any intermediate training.
bert+qqp: this is bert trained using the quora question pairs dataset [10] on the binary classification task of classifying a given pair of questions as similar or dissimilar.
bert+qc: here, we take questions from healthtap, pair them up with their main-category labels, and call these positive examples. we then pair each question with a random other category and call this a negative example. there are 227 main categories represented, such as abdominal pain, acid reflux, acne, adhd, alcohol, etc. we then train a bert model to classify category matches and mismatches, rather than to predict which class each example belongs to.
bert+aa: one task that has been known to generalize well is next-sentence prediction, which is one of the two tasks used to train the original bert model. to mimic this task, we take each answer from healthtap and split it into two parts: the first two sentences (start), and the remaining sentences (end). we then take each answer start and end that came from the same original question and label these pairs as positives.
we also pair each answer start with a different end from the same main category and label these as negatives. this is, therefore, a binary classification task in which the model tries to predict whether an answer start is completed by the given answer end.
bert+qa: this is the proposed approach for imbibing medical knowledge into the classifier. in order to correctly determine whether or not two questions are semantically similar, as is our ultimate goal, a network must be able to interpret the nuances of each question. another task that requires such nuanced understanding is that of pairing questions with their correct answers. we isolate each true question-answer pair from the medical question-answering websites and label these as positive examples. we then take each question and pair it with a random answer from the same main category or tag and label these as negative examples. finally, we train bert to label question-answer pairs as either positive or negative.
metrics: we report accuracy as our metric. for a dataset consisting of T question pairs, accuracy is defined as accuracy = (1/T) Σ_{t=1}^{T} 𝟙[ŷ^(t) = y^(t)], where, for the t-th example, ŷ^(t) is the model's prediction of whether the pair is similar, y^(t) is the ground-truth label, and 𝟙 denotes the indicator function.
training details: for each intermediate task, we train the network for 5 epochs [20] with 364,000 training examples to ensure that differences in performance are not due to different dataset sizes. we then fine-tune each of these intermediate-task models on a small number of labeled medical question pairs until convergence. a maximum sentence length of 200 tokens, a learning rate of 2e-5, and a batch size of 16 are used for all models. all experiments are done with 5 different random train/validation splits to generate error bars representing one standard deviation in accuracy.
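the accuracy definition above can be written directly in code; this is a straightforward transcription of the formula, not project code:

```python
def accuracy(predictions, labels):
    """Accuracy = (1/T) * sum over t of 1[y_hat(t) == y(t)], i.e. the
    fraction of question pairs for which the predicted similarity label
    matches the ground-truth label."""
    assert len(predictions) == len(labels)
    correct = sum(1 for y_hat, y in zip(predictions, labels) if y_hat == y)
    return correct / len(labels)
```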
we use the accuracy of each model, as described above, as our quantitative metric for comparison and a paired t-test to measure statistical significance. here we investigate whether the domain of the training corpus matters more than task similarity when choosing an intermediate training step for the medical question similarity task. accuracy on the final task (medical question similarity) is our quantitative proxy for performance.
domain similarity vs. task similarity: we fine-tune bert on the intermediate tasks of quora question pairs (qqp) and healthtap question-answer pairs (qa) before fine-tuning on the final task to compare performance. we find that the qa model performs better than the qqp model by 2.4% to 4.5%, depending on the size of the final training set (figure 4). conducting a paired t-test over the 5 data splits used for each experiment, the p-value is always less than 0.0006, so this difference is highly statistically significant. we thus see with high confidence that models trained on a related in-domain task (medical question-answering) outperform models trained on the same question-similarity task but an out-of-domain corpus (qqp). furthermore, when the full corpus of question-answer pairs from healthtap is used, the performance climbs all the way to 84.5% ± 0.7%.
results hold across models: the same trends hold when the bert base model is replaced with xlnet, with a p-value of 0.0001 (table 3). to benchmark ourselves against existing medical models, we compare our fine-tuned models to biobert, scibert, and clinicalbert as the base model. each of these models has fine-tuned the original bert weights on a medically relevant corpus using the original bert tasks. since only the base model has changed, we compare bert fine-tuned on our mqp dataset with each of these off-the-shelf models fine-tuned on mqp. all of them perform comparably to the original bert model.
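as a sketch of the significance test, the paired t-statistic over matched per-split accuracies can be computed as below; the accuracy values in the usage example are invented for illustration, and scipy.stats.ttest_rel would additionally return the p-value.

```python
import math

# Minimal paired t-test sketch (statistic only). The pairing is over the
# same 5 train/validation splits, one accuracy per split per model.
def paired_t_statistic(scores_a, scores_b):
    """t = mean(d) / sqrt(var(d)/n) for paired differences d = a - b."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n)

# Invented per-split accuracies, purely to exercise the computation:
qa_scores = [0.805, 0.812, 0.798, 0.809, 0.801]
qqp_scores = [0.771, 0.780, 0.769, 0.775, 0.772]
```

with n = 5 splits there are 4 degrees of freedom, so a large t-statistic corresponds to a very small p-value, as reported in the text.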
accuracies were as follows: bert = 78.5 ± 1.32, clinicalbert = 74.2 ± 1.72, biobert = 78.5 ± 0.75, and scibert = 75.8 ± 1.19. we hypothesize that this is because the technical literature and doctor notes that these models are pretrained on have their own distinct vocabularies. while they are more medical in nature than wikipedia articles, they still use language quite distinct from the colloquial medical question-answer language found online.
results hold across datasets: we repeat our experiments with a question-answer dataset from webmd and restrict the healthtap and qqp dataset sizes for a fair comparison. we find that the qa model again outperforms the qqp model by a statistically significant margin (p-value 0.049) and that the webmd model even outperforms the healthtap model with the same amount of data (table 3). our findings therefore hold across multiple in-domain datasets. we investigate further the extent to which the task matters for an in-domain corpus in two different ways. we start by using the same healthtap data and forming different tasks from the questions therein, and then we compare our models against intermediate models trained by other researchers. to test the extent to which any in-domain task would boost the performance of an out-of-domain model, we compare bert+qa to bert+aa and bert+qc. as before, we use accuracy on the final question-similarity task as our proxy for performance and keep the test set constant across all models. figure 4 shows the results. we find that both of these tasks actually perform worse than the baseline bert model, making the final model less useful for understanding the subtler differences between two questions. we hypothesize that for bert+qc, questions in the same category can easily be dissimilar, and therefore the model likely has not learned question representations useful for question similarity.
similarly for bert+aa, the language in answers can be different from the personal language used in patient-asked questions, so the learned representations for question language might not be ideal. this suggests that while domain does matter a lot, many tasks are not well-suited to encoding the proper domain information from the in-domain corpus. to get a better qualitative understanding of performance, we perform an error analysis on our trained models. we define a consistent error as one that is made by at least four of the five models trained on different train/validation splits. similarly, we consider a model as getting an example consistently correct if it does so on at least four of the five models trained on different train/validation splits. by investigating the question pairs that a model type gets consistently wrong, we can form hypotheses about why the model may have failed on that specific example. table 4 shows examples of pairs of questions with their true label and how each of the models labeled the pair. we form hypotheses based on key pieces of medical language that we could potentially perturb to make the models avoid the mistakes that they make. for instance, from row 1 in table 4, we hypothesize that the nuance is hypertension being a synonym for high blood pressure. other than qa and aa, the rest of the models do not get it right because they would not have seen enough medical-domain training instances that highlight this equivalence. however, the nuances are not always well-encapsulated in a handful of concepts that can be tweaked. row 4 shows an example of fairly complex ways of formulating similar questions that one cannot easily distill to a few edits to make them understandable for the models.
an approach to understanding model errors: we can prove or disprove our hypotheses by augmenting each question pair to add or remove one challenging aspect of the language at a time and observing whether or not those changes result in a different label. (table 4: examples that were consistently labeled wrong by at least one model type; the patterns reveal the key differences in what is learned by each intermediate task.) we repeatedly make such small changes to the input until the models label those examples correctly. note that the augmented questions are not added to our test set and do not contribute to our quantitative performance metrics; they are only created for the sake of probing and understanding the trained models. we demonstrate our approach in table 5 by showing our analysis for one specific question pair. question 1 is kept fixed and we incrementally make small changes to question 2 to inspect model predictions. while not shown here, using the same approach, we find that differences in spelling and capitalization do not cause a significant number of errors in any model, although they are present in many questions. we believe that this analysis not only helps shed some light on the explainability of our models but also guides the collection of additional training data through strategies such as active learning. while there has been significant work on question-question similarity ([7, 18] and references therein), research in medical question-question similarity is still somewhat nascent. the closest to our work is that of abacha and demner-fushman [3, 4]. we differ from this work in significant ways. first, rather than training a model to answer medical questions correctly, we train a model to determine whether any existing question in the dataset is semantically similar to the new question.
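a minimal sketch of the consistency criterion above (an example counts as a consistent error when at least four of the five split-specific models get it wrong); names are illustrative:

```python
# Consistency criterion from the error analysis: an example is a
# "consistent error" for a model type if at least `threshold` of the
# five models (one per train/validation split) predict it incorrectly.
def consistent_errors(split_predictions, labels, threshold=4):
    """split_predictions: one prediction list per split, aligned with labels.
    Returns the indices of examples that are consistently wrong."""
    errors = []
    for i in range(len(labels)):
        wrong = sum(1 for preds in split_predictions if preds[i] != labels[i])
        if wrong >= threshold:
            errors.append(i)
    return errors
```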
second, unlike their approach, which constructs training data using specialized rules and manual curation, we use the approach of transfer learning, where a surrogate in-domain task with large amounts of data is used to infer medical knowledge, and we subsequently fine-tune on a small corpus of manually labeled question-similarity pairs. (table 5: example question pair that was augmented to reveal which aspects of the question pair the network failed to understand.) finally, we are interested in questions that are patient-asked, which tend to use less technical language, include more misspellings, and span a different range of topics than the language and distribution of the pairs that they generate using faqs. previous work has tried to overcome this using augmentation rules to generate similar question pairs automatically [19], but this leads to an overly simplistic dataset in which negative question pairs contain no overlapping keywords and positive question pairs follow similar lexical structures. another technique for generating training data is weak supervision [25], but due to the nuances of determining medical similarity, generating labeling functions for this task is difficult. also related to medical question-pair generation is the problem of recognizing question entailment (rqe) [5]. while question entailment allows for an asymmetric similarity metric where one question is more specific than the other, question similarity requires that the metric be symmetric. second, as pointed out previously, their question pairs have a different language and topic distribution from the one we care about. nlp has undergone a transfer learning revolution in the past year, with several large pretrained models earning state-of-the-art scores across many linguistic tasks. two such models that we use in our own experiments are bert [12] and xlnet [28].
these models have been trained on semi-supervised tasks such as predicting a word that has been masked out at a random position in a sentence, and predicting whether or not one sentence is likely to follow another. the corpus used to train bert was exceptionally large (3.3 billion words), but all of the data came from bookscorpus and wikipedia. talmor and berant [26] recently found that bert generalizes better to other datasets drawn from wikipedia than to tasks using other web snippets. this is consistent with our finding that the pretraining domain makes a big difference. to address the need for pretrained models in particular domains, some researchers have recently re-trained bert on different text corpora such as scientific papers [6], doctors' medical notes [14], and biomedical journal articles [17]. however, re-training bert on the masked-language and next-sentence prediction tasks for every new domain is unwieldy and time-consuming. we investigate whether the benefits of retraining on a new domain can also be realized by fine-tuning bert on other in-domain tasks. phang et al. [23] see a boost with other tasks across less dramatic domain changes, where a different text corpus is used for the final task but not an entirely different technical vocabulary or domain. in this work, we release mqp, a dataset of medical question pairs generated and labeled by doctors that is based upon real, patient-asked questions. we also show that the double fine-tuning approach of pretraining on in-domain question-answer matching (qa) is particularly useful for the difficult task of identifying semantically similar questions. furthermore, we show that the choice of this in-domain task matters: a task that provides ample signal to capture the domain knowledge is needed to be able to perform the final task well.
although the qa model outperforms the out-of-domain same-task qqp model, there are a few examples where the qqp model seems to have learned information that is missing from the qa model. in the future, we can further explore whether these two models learned independently useful information from their pretraining tasks. if they did, then we hope to be able to combine these features into one model with multi-task learning. an additional benefit of the error analysis is that we now have a better understanding of the types of mistakes that even our best model is making. it is therefore easier to use weak supervision, augmentation rules, or even active learning to supplement our datasets and increase the number of training examples in those difficult regions of the data. both of these improvements could further improve our performance on this task.
references:
- the social life of health information
- dr google will see you now: search giant wants to cash in on your medical queries
- recognizing question entailment for medical question answering
- asma ben abacha and dina demner-fushman. 2019.
- a question-entailment approach to question answering
- mediqa 2019 - recognizing question entailment (rqe)
- scibert: pretrained contextualized embeddings for scientific text
- detecting semantically equivalent questions in online user forums
- learning the latent topics for question retrieval in community qa
- question-question similarity in online forums
- together we stand: siamese networks for similar question retrieval
- bert: pre-training of deep bidirectional transformers for language understanding
- medical question answer datasets
- clinicalbert: modeling clinical notes and predicting hospital readmission
- finding similar questions in large question and answer archives
- question-answer topic model for question retrieval in community question answering
- biobert: a pre-trained biomedical language representation model for biomedical text mining
- semi-supervised question retrieval with gated convolutions
- finding similar medical questions from question answering websites
- multi-task deep neural networks for natural language understanding
- medical question answer data
- deep contextualized word representations
- sentence encoders on stilts: supplementary training on intermediate labeled-data tasks
- language models are unsupervised multitask learners
- multiqa: an empirical investigation of generalization and transfer in reading comprehension
- retrieval models for question and answer archives
- xlnet: generalized autoregressive pretraining for language understanding
- question retrieval with high quality answers in community question answering

key: cord-000332-u3f89kvg
authors: broeck, wouter van den; gioannini, corrado; gonçalves, bruno; quaggiotto, marco; colizza, vittoria; vespignani, alessandro
title: the gleamviz computational tool, a publicly available software to explore realistic epidemic spreading scenarios at the global scale
date: 2011-02-02
journal: bmc infect dis
doi: 10.1186/1471-2334-11-37
sha: doc_id: 332 cord_uid: u3f89kvg
background: computational models play an increasingly
important role in the assessment and control of public health crises, as demonstrated during the 2009 h1n1 influenza pandemic. much research has been done in recent years in the development of sophisticated data-driven models for realistic computer-based simulations of infectious disease spreading. however, only a few computational tools are presently available for assessing scenarios, predicting epidemic evolutions, and managing health emergencies that can benefit a broad audience of users including policy makers and health institutions. results: we present "gleamviz", a publicly available software system that simulates the spread of emerging human-to-human infectious diseases across the world. the gleamviz tool comprises three components: the client application, the proxy middleware, and the simulation engine. the latter two components constitute the gleamviz server. the simulation engine leverages on the global epidemic and mobility (gleam) framework, a stochastic computational scheme that integrates worldwide high-resolution demographic and mobility data to simulate disease spread on the global scale. the gleamviz design aims at maximizing flexibility in defining the disease compartmental model and configuring the simulation scenario; it allows the user to set a variety of parameters including: compartment-specific features, transition values, and environmental effects. the output is a dynamic map and a corresponding set of charts that quantitatively describe the geo-temporal evolution of the disease. the software is designed as a client-server system. the multi-platform client, which can be installed on the user's local machine, is used to set up simulations that will be executed on the server, thus avoiding specific requirements for large computational capabilities on the user side. 
conclusions: the user-friendly graphical interface of the gleamviz tool, along with its high level of detail and the realism of its embedded modeling approach, opens up the platform to simulate realistic epidemic scenarios. these features make the gleamviz computational tool a convenient teaching/training tool as well as a first step toward the development of a computational tool aimed at facilitating the use and exploitation of computational models for the policy making and scenario analysis of infectious disease outbreaks. the 2009 h1n1 influenza pandemic highlighted the importance of computational epidemic models for the real-time analysis of the health emergency related to the global spreading of new emerging infectious diseases [1] [2] [3] . realistic computational models are highly complex and sophisticated, integrating substantial amounts of data that characterize the population and geographical context in order to attain superior accuracy, resolution, and predictive power [4] [5] [6] [7] [8] [9] [10] . the challenge consists in developing models that are able to capture the complexity of the real world at various levels by taking advantage of current information technology to provide an in silico framework for testing control scenarios that can anticipate the unfolding of an epidemic. at the same time, these computational approaches should be translated into tools accessible by a broader set of users who are the main actors in the decision-making process of health policy, especially during an emergency like an influenza pandemic. the tradeoff between realistic and accurate descriptions of large-scale dynamics, flexibility, computational feasibility, ease of use, and accessibility of these tools creates a major challenge from both the theoretical and the computational points of view [4, 5, 11, 12, 10, 13] . 
gleamviz is a client-server software system that can model the world-wide spread of epidemics of human-transmissible diseases like influenza-like illnesses (ili), offering extensive flexibility in the design of the compartmental model and scenario setup, including computationally optimized numerical simulations based on high-resolution global demographic and mobility data. gleamviz makes use of a stochastic and discrete computational scheme to model epidemic spread called "gleam" (global epidemic and mobility model), presented in previously published work [6, 3, 14], which is based on a geo-referenced metapopulation approach that considers 3,362 subpopulations in 220 countries of the world, as well as air travel flow connections and short-range commuting data. the software includes a client application with a graphical user interface (gui) for setting up and executing simulations and for retrieving and visualizing the results; the client application is publicly downloadable. the server application can be requested by public institutions and research centers; conditions of use and possible restrictions will be evaluated specifically. the tool is currently not suitable for the simulation of vector-borne diseases, infection transmission depending on local contact patterns such as sexually transmitted diseases, or diseases with a time scale that would make demographic effects relevant. the tool, however, allows the introduction of mitigation policies at the global level. localized interventions in space or time can be implemented in the gleam model, and their introduction into the gleamviz computational tool is planned for future releases. only a few computational tools are currently available to the public for the analysis and modeling of epidemics. these range from very simple spreadsheet-based models aimed at providing quick estimates for the number of patients and hospitalizations during a pandemic (see e.g.
flusurge [15] ) to more complicated tools based on increasingly sophisticated simulation approaches [11, 16, 12, 10, 13, 5] . these tools differ in their underlying modeling approaches and in the implementation, flexibility, and accessibility of the software itself. influsim is a tool that provides a visual interface to simulate an epidemic with a deterministic compartmental model in a single population [11] . the model includes age structure and explicit sojourn times with different stages in each compartment, extending an seir compartmentalization to include hospitalizations and intervention measures. the software provides the infectious disease dynamics and the user can set parameter values and add or remove interventions. however, no spatial structure or other forms of heterogeneity and stochasticity are considered in the model. on the other hand agent-based models describe the stochastic propagation of a disease at the individual level, thus taking into account the explicit social and spatial structure of the population under consideration. communityflu is a software tool that simulates the spread of influenza in a structured population of approximately 1,000 households with 2,500 persons [13] . user interaction with the software is limited to the spreadsheet portion of the program, where one can choose the type of intervention and other parameters describing the disease and the population. a larger population is considered in flute, a publicly available tool for the stochastic simulation of an epidemic in the united states at the level of individuals [10] . the model is based on a synthetic population, structured in a hierarchy of mixing social groups including households, household clusters, neighborhoods, and nation-wide communities. flute comes with a configuration file in text format that can be modified by an expert user to set various parameters such as the initiation of the epidemic, the reproductive number, and the interventions considered. 
no gui is provided, and the output of the simulations is given in the form of text files that must be analyzed with additional software. epifast involves a parallel algorithm implemented using a master-slave approach which allows for scalability on distributed memory systems, from the generation of synthetic populations aggregated in mixing groups to the explicit representation of the contact patterns between individuals as they evolve in time [5]. the epifast tool allows for the detailed representation and simulation of the disease on social contact networks among individuals that dynamically evolve in time and adapt to actions taken by individuals and public health interventions. the algorithm is coupled with a web-based gui and the middleware system didactic, which allows users to specify the simulation setup, execute the simulation, and visualize the results via plots. epidemic models and interventions are pre-configured, and the system can scale up to simulate the population of a large metropolitan area on the order of tens of millions of inhabitants. another class of models focuses on the global scale by using a metapopulation approach in which the population is spatially structured into patches or subpopulations (e.g. cities) where individuals mix. these patches are connected by the mobility patterns of individuals. in this vein two tools are currently available. the global epidemic model (gem) uses a metapopulation approach based on an airline network comprising 155 major metropolitan areas in the world for the stochastic simulation of an influenza-like illness [16]. the tool consists of a java applet in which the user can simulate a hypothetical h1n1 outbreak and test pre-configured intervention strategies. the compartmentalization is set to an seir model, and the parameterization can be modified in the full or stand-alone mode, but not currently in the java applet.
the spatiotemporal epidemiological modeler (stem) is a modeling system for the simulation of the spread of an infectious disease in a spatially structured population [16] . contrary to other approaches, stem is based on an extensible software platform, which promotes the contribution of data and algorithms by users. the resulting framework therefore merges datasets and approaches and its detail and realism depend on continuous developments and contributions. however, these are obtained from a variety of sources and are provided in different formats and standards, thus resulting in possible problems related to the integration and merging of datasets. such issues are left to the user to resolve. the existing tools described above thus offer the opportunity to use highly sophisticated data-driven approaches at the expense of flexibility and ease of use by non-experts on the one hand, or very simplified models with user-friendly guis and no specific computational requirements on the other. our approach aims at optimizing the balance of complex and sophisticated data-driven epidemic modeling at the global scale while maintaining an accessible computational speed and overall flexibility in the description of the simulation scenario, including the compartmental model, transition rates, intervention measures, and outbreak conditions by means of a user-friendly gui. in the gleamviz tool the setup of the simulations is highly flexible in that the user can design arbitrary disease compartmental models, thus allowing an extensive range of human-to-human infectious diseases and intervention strategies to be considered. the user interface has been designed in order to easily define both features specific to each compartment, such as the mobility of classes of individuals, and general environmental effects, such as seasonality for diseases like influenza. 
in addition, the user can define the initial settings that characterize the initial geographical and temporal conditions, the immunity profile of the population, and other parameters including but not limited to: the definition of an outbreak condition in a given country; the number of stochastic runs to be performed; and the total duration of each simulation. the tool allows the production of global spreading scenarios with high geographical resolution by simply interacting with the graphical user interface. while expert input would be required to interpret and discuss the results obtained with the software, the present computational platform facilitates the generation and analysis of scenarios from intensive data-driven simulations. the tool can be deployed both in training activities and to facilitate the use of large-scale computational modeling of infectious diseases in the discussion between modelers and public health stakeholders. the paper is organized as follows. the "implementation" section describes the software application architecture and its major components, including the computational model gleam. the "results and discussion" section presents in detail the gleamviz client and its components that allow for software-user interaction, including an application of the simulator to an influenza-like-illness scenario. the top-level architecture of the gleamviz tool comprises three components: the gleamviz client application, the gleamviz proxy middleware, and the simulation engine. the latter two components constitute the gleamviz server, as shown in figure 1. users interact with the gleamviz system by means of the client application, which provides graphical user interfaces for designing and managing the simulations, as well as visualizing the results. the clients, however, do not themselves run the simulations. instead they establish a connection with the gleamviz proxy middleware to request the execution of a simulation by the server.
multiple clients can use the same server concurrently. upon receipt of requests to run a simulation, the middleware starts the simulation engine instances required to execute the requests and monitors their status. once the simulations are completed, the gleamviz proxy middleware collects and manages the resulting simulation data to be served back to the clients. a schematic diagram of the workflow between client and server is shown in figure 2. (figure 2 caption: 1. design the compartmental model of the infectious disease in the model builder; 2. configure the simulation of the world-wide epidemic spreading in the simulation wizard; 3. submit the simulation for execution by the engine on the server; 4. inspect the results of a simulation in the interactive visualization; 5. inspect all simulations and retrieve results in the simulations history.) this client-server model allows for full flexibility in its deployment; the client and server can be installed on the same machine, or on different machines connected by a local area network or the internet. the two-part decomposition of the server in terms of middleware and engines additionally allows for advanced high-volume setups in which the middleware server distributes the engine instances over a number of machines, such as those in a cluster or cloud. this architecture thus ensures high speed in large-scale simulations and does not rely on cpu-specific availability on the user side. the gleamviz simulation engine uses a stochastic metapopulation approach [2, 16-22] that considers data-driven schemes for the short-range and long-range mobility of individuals at the inter-population level, coupled with coarse-grained techniques to describe the infection dynamics within each subpopulation [6, 14]. the basic mechanism for epidemic propagation occurs at multiple scales. individuals interact within each subpopulation and may contract the disease if an outbreak is taking place in that subpopulation.
by travelling while infected, individuals can carry the pathogen to a non-infected region of the world, thus starting a new outbreak and shaping the spatial spread of the disease. the basic structure of gleam consists of three distinct layers: the population layer, the mobility layer, and the epidemic layer (see figure 3) [6, 14]. the population layer is based on the high-resolution population database of the gridded population of the world project by the socio-economic data and applications center (sedac) [23], which estimates population with a granularity given by a lattice of cells covering the whole planet at a resolution of 15 × 15 minutes of arc. the mobility layer integrates short-range and long-range transportation data. long-range air travel mobility is based on travel flow data obtained from the international air transport association (iata [24]) and the official airline guide (oag [25]) databases, which contain the list of worldwide airport pairs connected by direct flights and the number of available seats on any given connection [26]. the combination of the population and mobility layers allows for the subdivision of the world into geo-referenced census areas obtained by a voronoi tessellation procedure around transportation hubs. these census areas define the subpopulations of the metapopulation modeling structure, identifying 3,362 subpopulations centered on iata airports in 220 different countries. the model simulates the mobility of individuals between these subpopulations using a stochastic procedure defined by the airline transportation data [6]. short-range mobility considers commuting patterns between adjacent subpopulations based on data collected and analyzed from more than 30 countries on 5 continents across the world [6]. it is modeled with a time-scale separation approach that defines the effective force of infection in connected subpopulations [6, 27, 28].
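the hub-centred census areas described above can be illustrated with a small sketch: assigning each lattice cell to its nearest transportation hub is exactly a voronoi tessellation of the plane around the hubs. the hub names and coordinates below are invented for illustration; the real model works with geographic distances on the 15' × 15' population lattice.

```python
import math

def nearest_hub(cell, hubs):
    """Return the name of the hub closest to a (x, y) cell centre.

    Nearest-neighbor assignment of every cell yields the Voronoi
    partition of the plane around the hubs.
    """
    best, best_d = None, math.inf
    for name, (hx, hy) in hubs.items():
        d = (cell[0] - hx) ** 2 + (cell[1] - hy) ** 2
        if d < best_d:
            best, best_d = name, d
    return best

def build_subpopulations(cells, hubs):
    """Group lattice cells (with their populations) into hub-centred census areas."""
    subpops = {name: 0 for name in hubs}
    for (x, y), pop in cells:
        subpops[nearest_hub((x, y), hubs)] += pop
    return subpops

# hypothetical hubs and population cells
hubs = {"AAA": (0.0, 0.0), "BBB": (10.0, 0.0)}
cells = [((1, 1), 500), ((2, 0), 300), ((9, 1), 800), ((6, 0), 100)]
print(build_subpopulations(cells, hubs))  # {'AAA': 800, 'BBB': 900}
```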
on top of the population and mobility layers lies the epidemic layer, which defines the disease and population dynamics. the infection dynamics takes place within each subpopulation and assumes a compartmentalization [29] that the user can define according to the infectious disease under study and the intervention measures being considered. all transitions between compartments are modeled through binomial and multinomial processes to preserve the discrete and stochastic nature of the processes. the user can also specify the initial outbreak conditions that characterize the spreading scenario under study, enabling the seeding of the epidemic in any geographical census area in the world and defining the immunity profile of the population at initiation. seasonality effects are still an open problem in the transmission of ili diseases. in order to include the effect of seasonality on the observed pattern of ili diseases, we use a standard empirical approach in which seasonality is modeled by a forcing that reduces the basic reproductive number by a factor α_min ranging from 0.1 to 1 (no seasonality) [20]. the forcing is described by a sinusoidal function with a 12-month period that reaches its peak during winter and its minimum during summer in each hemisphere, the two hemispheres having opposite phases. [figure 3 caption: the population layer, the short-range mobility layer, and the long-range mobility layer. the short-range mobility layer covers commuting patterns between adjacent subpopulations based on data collected and analyzed from more than 30 countries on 5 continents across the world, modeled with a time-scale separation approach that defines the effective force of infection in connected subpopulations. the long-range mobility layer covers the air travel flow, measured in available seats between worldwide airport pairs connected by direct flights.]
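the seasonal forcing just described can be sketched as a simple sinusoidal rescaling of the basic reproductive number. the exact functional form and the day assumed to be mid-winter are illustrative choices, not gleam's actual implementation:

```python
import math

def seasonal_scaling(day_of_year, alpha_min=0.65, southern=False):
    """Sinusoidal seasonal rescaling factor, between alpha_min and 1.0.

    The factor has a 12-month period: maximal in winter, minimal in
    summer, and phase-shifted by six months in the southern hemisphere.
    Day 0 is assumed (for illustration) to be mid-winter in the north.
    """
    phase = 2 * math.pi * day_of_year / 365.0
    if southern:
        phase += math.pi  # opposite seasons in the southern hemisphere
    return alpha_min + (1.0 - alpha_min) * (1.0 + math.cos(phase)) / 2.0

r0 = 1.75
print(round(r0 * seasonal_scaling(0), 3))      # winter: full r0 -> 1.75
print(round(r0 * seasonal_scaling(182.5), 3))  # summer: r0 scaled down toward alpha_min * r0
```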
given the population and mobility data, infection dynamics parameters, and initial conditions, gleam performs the simulation of stochastic realizations of the worldwide unfolding of the epidemic. from these in silico epidemics a variety of information can be gathered, such as prevalence, morbidity, number of secondary cases, number of imported cases, hospitalized patients, amounts of drugs used, and other quantities for each subpopulation with a time resolution of 1 day. gleam has been under continuous development since 2005 and during these years it has been used: to assess the role of short-range and long-range mobility in epidemic spread [30, 31, 6] ; to retrospectively analyze the sars outbreak of 2002-2003 in order to investigate the predictive power of the model [22] ; to explore global health strategies for controlling an emerging influenza pandemic with pharmaceutical interventions under logistical constraints [21] ; and more recently to estimate the seasonal transmission potential of the 2009 h1n1 influenza pandemic during the early phase of the outbreak to provide predictions for the activity peaks in the northern hemisphere [3, 32] . the gleamviz simulation engine consists of a core that executes the simulations and a wrapper that prepares the execution based on the configuration relayed from the client by the gleamviz proxy middleware. the engine can perform either single-run or multi-run simulations. the single-run involves only a single stochastic realization for a given configuration setup and a random seed. the multi-run simulation involves a number of stochastic realizations as set by the user and performed by the core (see the following section), each with the same configuration but with a different random seed. the results of the multi-run simulation are then aggregated and statistically analyzed by the wrapper code. the simulation engine writes the results to files and uses lock files to signal its status to the middleware component. 
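the statistical aggregation of multi-run output performed by the wrapper can be sketched as follows: for each day, compute the median and a 95% reference range across the stochastic runs. the linear-interpolation percentile scheme is an assumption, not necessarily the wrapper's exact method.

```python
def aggregate_runs(runs):
    """Summarize per-day values from repeated stochastic runs.

    `runs` is a list of equal-length per-day series; the result gives,
    for each day, the median and the 2.5th/97.5th percentiles
    (a 95% reference range).
    """
    def percentile(sorted_vals, q):
        # simple linear-interpolation percentile
        idx = q * (len(sorted_vals) - 1)
        lo, hi = int(idx), min(int(idx) + 1, len(sorted_vals) - 1)
        frac = idx - lo
        return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

    days = len(runs[0])
    summary = []
    for d in range(days):
        vals = sorted(run[d] for run in runs)
        summary.append({
            "median": percentile(vals, 0.5),
            "low": percentile(vals, 0.025),
            "high": percentile(vals, 0.975),
        })
    return summary

runs = [[10, 20], [12, 18], [11, 25]]  # three runs, two days each
print(aggregate_runs(runs)[0]["median"])  # 11.0
```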
the core is written in c++, resulting in a fast and efficient engine that allows the execution of a single stochastic simulation of a 1-year epidemic with a standard seir model in a couple of minutes on a high-end desktop computer. the wrapper code is written in python [33]. the server components can be installed on most unix-like operating systems such as linux, bsd, mac os x, etc. the gleamviz proxy middleware is the server component that mediates between clients and simulation engines. it accepts tcp connections from clients and handles requests relayed over these connections, providing client authorization management. a basic access control mechanism is implemented that associates a specific client with the simulations it launches by issuing a private simulation identifier key upon submission. users can only retrieve the results of the simulations they launched, or simulations for which they have obtained the simulation definition file (containing the private simulation identifier key) from the original submitter. upon receipt of a request to execute a simulation, the middleware sets up the proper system environment and then launches an instance of the simulation engine with the appropriate configuration and parameters according to the instructions received from the client. for single-run simulations, the daily results are incrementally served back to the client while the simulation is being executed. this allows for the immediate visualization of the spreading pattern, as described in the "visualization interface" subsection. for multi-run simulations the results are statistically analyzed after all runs are finished, and the client has to explicitly request the retrieval of the results once they become available. the gleamviz proxy server component can be configured to keep the simulation data indefinitely or to schedule the cleanup of old simulations after a certain period of time.
multi-run metadata is stored in an internal object that is serialized on a system file, ensuring that authorization information is safely kept after a server shutdown or failure. the gleamviz proxy component additionally provides control features such as accepting administrative requests at runtime in order to manage stored simulations or to modify several configuration parameters like the number of simultaneous connections allowed, the number of simultaneous simulations per client, the session timeout, etc. the middleware server is written in python [33] and uses the twisted matrix library suite [34] for its networking functionality. client and server communicate using a special purpose protocol, which provides commands for session handling and simulation management. commands and data are binary encoded using adobe action message format (amf3) in order to minimize bandwidth needs. the gleamviz client is a desktop application by which users interact with the gleamviz tool. it provides guis for its four main functions: 1) the design of compartmental models that define the infection dynamics; 2) the configuration of the simulation parameters; 3) the visualization of the simulation results; and 4) the management of the user's collection of simulations. in the following section we describe these components in detail. the client was developed using the adobe air platform [35] and the flex framework [36] and can thus be deployed on diverse operating systems, including several windows versions, mac os x, and several common linux distributions. the gleamviz client has a built-in updating mechanism to check for the latest updates and developments and prompts the user to automatically download them. it also offers a menu of configuration options of the interface that allows the user to customize preferences about data storage, visualization options, the server connection, and others. 
the software system presented above is operated through the gleamviz client, which provides the user interface: the part of the tool actually experienced on the user side. the gleamviz client integrates different modules that allow the management of the entire process flow from the definition of the model to the visualization of the results. in the following we will describe the various components and provide the reader with a user study example. the model builder provides a visual modeling tool for designing arbitrary compartmental models, ranging from simple sir models to complex compartmentalizations in which multiple interventions can be considered along with disease-associated complications and other effects. (an example can be found in previous work [37].) a snapshot of the model builder window is shown in figure 4. the models are represented as flow diagrams with stylized box shapes that represent compartments and directed edges that represent transitions, which is consistent with standard representations of compartmental models in the literature. through simple operations like 'click and drag' it is possible to create any structure with full flexibility in the design of the compartmentalization; the user is not restricted to a given set of pre-loaded compartments or transition dynamics. the interactive interface provided by the model builder enables the user to define the compartment label, the mobility constraints that apply (e.g. allowed/not allowed to travel by air or by ground), whether the compartment refers to clinical cases, as well as the color and position of their representation in the diagram (see figure 5). this allows the user to model many kinds of human-to-human infectious diseases, in particular respiratory and influenza-like diseases. transitions can either be spontaneous or mediated by contacts with infectious individuals; in the latter case, the number of new infections per unit time is equal to βsi/n, where n is the total size of the subpopulation. the gleam simulation engine considers discrete individuals.
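the contact-mediated infection process just described can be illustrated with a chain-binomial seir step for a single subpopulation, in which the s-to-e transition uses the force of infection β i/n and every transition is a discrete stochastic draw. this is a sketch with illustrative parameter values, not the engine's actual implementation; the per-individual bernoulli draws stand in for binomial sampling.

```python
import random

def seir_step(state, beta, eps, mu, rng):
    """One day of a chain-binomial SEIR update for one subpopulation.

    Each transition is a sum of Bernoulli draws (a binomial sample),
    keeping individuals discrete and the dynamics stochastic. The
    per-capita infection probability derives from beta * I / N.
    """
    s, e, i, r = state
    n = s + e + i + r
    p_inf = 1 - (1 - beta / n) ** i if n else 0.0   # contact-mediated
    new_e = sum(rng.random() < p_inf for _ in range(s))  # S -> E
    new_i = sum(rng.random() < eps for _ in range(e))    # E -> I
    new_r = sum(rng.random() < mu for _ in range(i))     # I -> R
    return (s - new_e, e + new_e - new_i, i + new_i - new_r, r + new_r)

rng = random.Random(42)
state = (990, 0, 10, 0)  # illustrative initial conditions
for _ in range(60):
    state = seir_step(state, beta=0.6, eps=1 / 1.5, mu=1 / 2.5, rng=rng)
print(state)  # after 60 days most of the population has moved through the compartments
```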
all its transition processes are both stochastic and discrete, and are modeled through binomial and multinomial processes. transitions can be visually added by dragging a marker from the source to the target compartment. spontaneous transitions are annotated with their rates, which can be modified interactively. infection transitions are accompanied by a representation of the infection's source compartment and the applicable rate (i.e., β in the example above), which can also be modified in an interactive way. the rates can be expressed in terms of a constant value or in terms of a variable whose value needs to be specified in the variables table, as shown in figure 4. the value can also be expressed by simple algebraic expressions. the client automatically checks for and reports inconsistencies in the model in order to assist the user in the design process (see bottom right window in figure 4). models can be exported to xml files and stored locally, allowing the user to load a model later, modify it, and share it with other users. the diagram representation can be exported as a pdf or svg file for use in documentation or publications. a few examples of compartmental models are available for download from the simulator website. the simulation wizard provides a sequence of panels that leads the user through the definition of several configuration parameters that characterize the simulation. figure 6 shows some of these panels. the consecutive steps of the configuration are as follows:
• choice of the type of the simulation (panel a): the user is prompted with three options: create a new single-run simulation or a new multi-run simulation from scratch, or a new one based on a saved simulation previously stored in a file.
• compartmental model selection and editing (panel b): the user can design a new compartmental model, modify the current compartmental model (when deriving it from an existing simulation), or load a model compartmentalization from a file.
• definition of the simulation parameters (panel c): the user is asked to specify various settings and parameter values for the simulation, including, e.g., the number of runs to perform (only accessible in the case of a multi-run), the initial date of the simulation, the length of the simulation (in terms of days), whether or not seasonality effects should be considered, the airplane occupancy rate, the commuting time, the conditions for the definition of an outbreak, and others.
• initial assignment of the simulation (panel d): here the user assigns the initial distribution of the population amongst compartments, defining the immunity profile of the global population on the starting date.
• definition of the outbreak start (panel e): this panel allows the user to define the initial conditions of the epidemic by selecting the city (or cities) seeded with the infection.
• selection of output results (panel f): here the user selects the compartments that will constitute the output provided by the client at the end of the simulation. the corresponding data will be shown in the visualization window and made available for download.
when all the above configuration settings are defined, the user can submit the simulation to the gleamviz server for execution. this will automatically add the simulation to the user's simulations history. it is furthermore possible to save the definition of the simulation setup to a local file, which can be imported again later or shared with other users. the simulations history is the main window of the client and provides an overview of the simulations that the user has designed and/or submitted, in addition to providing access to the model builder, the simulation wizard, and the visualization component.
the overview panel shown in figure 7 lists the simulation identifier, the submission date and time, the simulation type (i.e., single or multi-run), the execution status (i.e., initialized, start pending, started, aborted, complete, failed, or stop pending) and the results status (i.e., none, retrieve pending, retrieving, stop retrieve pending, complete, or stored locally). additional file 1 provides a detailed explanation of all these values. a number of context-dependent command buttons are available once a simulation from the list is selected. those buttons allow the user to control the simulation execution, retrieve the results from the server and visualize them, clone and edit the simulation to perform a new execution, save the simulation definition or the output data to the local machine (in order to analyze the obtained data with other tools, for example), and remove the simulation. in addition to exporting the compartmental model (see the "model builder" subsection) the user can export a complete configuration of a simulation that includes the compartmental model and the entire simulation setup to a local file, which can be imported again later or shared with other users. once the execution of a simulation is finished and the results have been retrieved from the server, the client can display the results in the form of an interactive visualization of the geo-temporal evolution of the epidemic. this visualization consists of a temporal and geographic mapping of the results accompanied by a set of graphs (see figure 8 ). the geographic mapping involves a zoomable multi-scale map on which the cells of the population layer are colored according to the number of new cases of the quantity that is being displayed. several visualization features can be customized by clicking on the gear icon and opening the settings widget. it is possible to zoom in and out and pan by means of the interface at the top left of the map. 
dragging the map with the mouse (on a location where there are no basin marks) can also pan the visualization. all the widgets and the graphs displayed over the map can be re-positioned according to the user's preferences by clicking and dragging the unused space in the title bar. the color coding of the map represents the number of cases on a particular day. the time evolution of the epidemic can be shown as a movie, or in the form of daily states by moving forward or backward by one day at a time. for single-run simulations it is also possible to show the airline transportation of the 'seeding' individuals by drawing the traveling edge between the origin and destination cities. in the case where the output quantity is a subset of the infectious compartments, the edges show the actual seeding of the infection. note that the evolution of the epidemic depends strongly on the model definition. for example, it is possible that some basins are infected by a latent individual that later develops the disease. in this case no seeding flight will be shown if only infectious compartments are selected as output. beside the geographical map, the visualization window displays two charts. one chart shows the number of new cases per 1,000 over time (incidence), and the other shows the cumulative number of new cases per 1,000 over time (size). for multi-run simulations, median values and corresponding 95% confidence intervals are shown. the menu above each chart combo lets the user choose the context for which the corresponding charts show incidence and size data. this context is either global, one of three hemispheres, one continent, one region, one country, or one city. the currently selected day is marked by a vertical line in these plots, and the day number, counted from the initial date selected for the simulation, is shown by the side of the time slider.
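the two chart series described above can be reproduced from a daily new-case series in a few lines (a minimal sketch of the per-1,000 normalization; variable names are illustrative):

```python
def incidence_and_size(daily_new_cases, population):
    """Convert daily new cases into the two chart series:
    incidence (new cases per 1,000 per day) and cumulative
    size (total cases per 1,000 up to that day)."""
    incidence, size, total = [], [], 0
    for c in daily_new_cases:
        total += c
        incidence.append(1000.0 * c / population)
        size.append(1000.0 * total / population)
    return incidence, size

inc, size = incidence_and_size([5, 20, 10], population=10_000)
print(inc)   # [0.5, 2.0, 1.0]
print(size)  # [0.5, 2.5, 3.5]
```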
here we present an example application of the gleamviz tool to study a realistic scenario for the mitigation of an emerging influenza pandemic. disease-control programs foresee the use of antiviral drugs for treatment and short-term prophylaxis until a vaccine becomes available [38]. the implementation of these interventions relies both on logistical constraints [21, 39] (related, e.g., to the availability of drugs) and on the characteristics of the infection, including the severity of the disease and the virus's potential to develop resistance to the drugs [40]. here we focus on the mitigation effects of systematic antiviral (av) treatment in delaying the activity peak and reducing the attack rate [3, 7, 8, 39-43], and assume that all countries have access to av stockpiles. we consider a scenario based on the 2009 h1n1 influenza pandemic outbreak and feed the simulator with the set of parameters and initial conditions that have been estimated for that outbreak through a maximum likelihood estimate by using the gleam model [3]. [figure 8 caption: the simulation results can be inspected in an interactive visualization of the geo-temporal evolution of the epidemic. the map shows the state of the epidemic on a particular day with infected population cells color-coded according to the number of new cases of the quantity that is being displayed. pop-ups provide more details upon request for each city basin. the zoomable multi-scale map allows the user to get a global overview, or to focus on a part of the world. the media-player-like interface at the bottom is used to select the day of interest, or show the evolution of the epidemic like a movie. two sets of charts on the right show the incidence curve and the cumulative size of the epidemics for selectable areas of interest.] the results provided by the present example are not meant to be compared with those contained in the full analysis carried out with gleam [3] due to the fact that in the
present example we do not consider additional mitigation strategies that were put in place during the early phase of the outbreak, such as the sanitary control measures implemented in mexico [3, 44], or the observed reduction in international travel to/from mexico [45]. indeed, the current version of gleamviz does not allow for interventions that are geographically and/or temporally dependent. however, these features are currently under development and will be available in the next software release. for this reason the simulation scenario that we study in this application of the simulator does not aim to realistically reproduce the timing of the spreading pattern of the 2009 h1n1 pandemic. the results reported here ought to be considered as an assessment of the mitigating impact of av treatment alone, based on the initial conditions estimated for the h1n1 outbreak, and assuming the implementation of the same av protocol in all countries of the world. we adopt a seir-like compartmentalization to model influenza-like illnesses [29] in which we include the systematic successful treatment of 30% of the symptomatic infectious individuals (see figure 9). [figure 9 caption: compartmental structure in each subpopulation in the intervention scenario.] a modified susceptible-latent-infectious-recovered model is considered to take into account asymptomatic infections, traveling behavior while ill, and the use of antiviral drugs as a pharmaceutical measure. in particular, infectious individuals are subdivided into: asymptomatic (infectious_a), symptomatic individuals who travel while ill (infectious_s_t), symptomatic individuals who refrain from traveling while ill (infectious_s_nt), and symptomatic individuals who undergo antiviral treatment (infectious_avt). a susceptible individual interacting with an infectious person may contract the illness with rate β and enter the latent compartment, where he/she is infected but not yet infectious.
the infection rate is rescaled by a factor r_a in the case of asymptomatic infection [41, 46], and by a factor r_avt in the case of a treated infection. at the end of the latency period, of average duration ε^-1, each latent individual becomes infectious, showing symptoms with probability 1-p_a, or becoming asymptomatic with probability p_a [41, 46]. the change in traveling behavior after the onset of symptoms is modeled by the probability p_t, set to 50%, that individuals stop traveling when ill [41]. infectious individuals recover permanently after an average infectious period μ^-1 equal to 2.5 days. we assume the antiviral treatment regimen to be administered to 30% (i.e., p_avt = 0.3) of the symptomatic infectious individuals within one day of the onset of symptoms. the efficacy of the av treatment is accounted for in the model by a 62% reduction in the transmissibility of the disease by an infected person under av treatment, together with a shortening of the infectious period by 1 day [41, 42]. the scenario with av treatment is compared to the baseline case in which no intervention is considered, i.e., the probability of treatment is set equal to 0 in all countries. the gleamviz simulation results are shown in figure 10, where the incidence profiles in two different regions of the world, north america and western europe, are shown for both the baseline case and the intervention scenario with av treatment. the results refer to the median (solid line) and 95% reference range (shaded area) obtained from 100 stochastic realizations of each scenario starting from the same initial conditions.
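a back-of-envelope calculation illustrates how the quoted treatment parameters combine to reduce transmission. this sketch ignores asymptomatic cases and latency, so it only roughly reflects the full model; the parameter values (30% treated, 62% transmissibility reduction, infectious period 2.5 vs. 1.5 days) come from the text.

```python
def relative_r(p_treated=0.3, transmissibility_factor=0.38,
               infectious_period=2.5, treated_period=1.5):
    """Rough multiplier on the reproductive number under AV treatment.

    Untreated cases transmit at the full rate for the full infectious
    period; treated cases transmit at a reduced rate for a shortened
    period. The overall multiplier is the treated-fraction-weighted sum.
    """
    untreated = (1 - p_treated) * 1.0
    treated = p_treated * transmissibility_factor * (treated_period / infectious_period)
    return untreated + treated

print(round(relative_r(), 3))  # 0.768: treating 30% cuts transmission by roughly a quarter
```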
the resulting incidence profiles of the baseline case peak at around mid-november and the end of november 2009 in the us and western europe, respectively. these results show an anticipated peak of activity for the northern hemisphere with respect to the expected peak time of seasonal influenza. in order to make a more accurate comparison with the surveillance data in these regions, we should rely on the predictions provided by models that can take into account the full spectrum of strategies that were put in place during the 2009 h1n1 outbreak, viz. the predictions obtained by gleam [3] . in the case of a rapid and efficient implementation of the av treatment protocol at the worldwide level, a delay of about 6 weeks would be obtained in the regions under study, a result that could be essential in gaining time to deploy vaccination campaigns targeting high-risk groups and essential services. in addition, the gleamviz tool provides simulated results for the number of av drugs used during the evolution of the outbreak. if we assume treatment delivery and successful administration of the drugs to 30% of the symptomatic cases per day, the number of av drugs required at the activity peak in western europe would be 4.5 courses per 1,000 persons, and the size of the stockpile needed after the first year since the start of the pandemic would be about 18% of the population. again, we assume a homogeneous treatment protocol for all countries in the world; results may vary from country to country depending on the specific evolution of the pandemic at the national level. computer-based simulations provide an additional instrument for emerging infectious-disease preparedness and control, allowing the exploration of diverse scenarios and the evaluation of the impact and efficacy of various intervention strategies. 
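the quoted stockpile size can be sanity-checked with a one-line calculation: if 30% of symptomatic cases each receive one av course, a cumulative symptomatic attack rate of about 60% over the first year (an assumed figure, not stated in the text) would indeed give a stockpile of about 18% of the population.

```python
def stockpile_fraction(attack_rate, p_treated):
    """Fraction of the population needing one AV course over the pandemic:
    the treated share of all symptomatic cases."""
    return attack_rate * p_treated

# 60% attack rate is an assumption used only to check consistency with the ~18% figure
print(round(stockpile_fraction(0.60, 0.30), 2))  # 0.18
```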
here we have presented a computational tool for the simulation of emerging ili infectious diseases at the global scale based on a data-driven spatial epidemic and mobility model that offers an innovative solution in terms of flexibility, realism, and computational efficiency, and provides access to sophisticated computational models in teaching/training settings and in the use and exploitation of large-scale simulations in public health scenario analysis.
project name: gleamviz simulator v2.6
project homepage: http://www.gleamviz.org/simulator/
operating systems (client application): windows (xp, vista, 7), mac os x, linux
programming language: c++ (gleamsim core), python (gleamproxy, gleamsim wrapper), actionscript (gleamviz)
other requirements (client application): adobe air runtime, at least 200 mb of free disk space
license: saas
any restrictions to use by non-academics: none. the server application can be requested by public institutions and research centers; conditions of use and possible restrictions will be evaluated specifically.
[figure 10 caption: simulated incidence profiles for north america and western europe in the baseline scenario (left panels) and in the scenario with av treatment (right panels). the plots are extracted from the gleamviz tool visualization. in the upper plots of each pair the curves and shaded areas correspond to the median and 95% reference range of 100 stochastic runs, respectively. the lower curves show the cumulative size of the infection. the dashed vertical line marks the same date for each scenario, clearly showing the shift in the epidemic spreading due to the av treatment.]
additional file 1: the gleamviz computational tool: additional file. this file includes information for installing the gleamviz client and details of the features of its various components.
the transmissibility and control of pandemic influenza a (h1n1) virus
potential for a global dynamic of influenza a (h1n1)
seasonal transmission potential and activity peaks of the new influenza a(h1n1): a monte carlo likelihood analysis based on human mobility
modelling disease outbreaks in realistic urban social networks
epifast: a fast algorithm for large scale realistic epidemic simulations on distributed memory systems
multiscale mobility networks and the spatial spreading of infectious diseases
strategies for containing an emerging influenza pandemic in southeast asia
mitigation strategies for pandemic influenza in the united states
mitigation measures for pandemic influenza in italy: an individual based model considering different scenarios
flute, a publicly available stochastic influenza epidemic simulation model
the influenza pandemic preparedness planning tool influsim
an extensible spatial and temporal epidemiological modelling system
centers for disease control and prevention (cdc)
modeling the spatial spread of infectious diseases: the global epidemic and mobility computational model
centers for disease control and prevention (cdc)
controlling pandemic flu: the value of international air travel restrictions
a mathematical model for the global spread of influenza
assessing the impact of airline travel on the geographic spread of pandemic influenza
forecast and control of epidemics in a globalized world
delaying the international spread of pandemic influenza
modeling the worldwide spread of pandemic influenza: baseline case and containment interventions
predictability and epidemic pathways in global outbreaks of infectious diseases: the sars case study
socioeconomic data and applications center (sedac),
columbia university
the architecture of complex weighted networks
estimating spatial coupling in epidemiological systems: a mechanistic approach
a structured epidemic model incorporating geographic mobility among regions
infectious diseases in humans
the role of airline transportation network in the prediction and predictability of global epidemics
the modeling of global epidemics: stochastic dynamics and predictability
modeling vaccination campaigns and the fall/winter 2009 activity of the new a (h1n1) influenza in the northern hemisphere
python programming language
twisted matrix networking engine
adobe flex framework
modeling the critical care demand and antibiotics resources needed during the fall 2009 wave of influenza a (h1n1) pandemic
world health organization: pandemic preparedness
antiviral treatment for the control of pandemic influenza: some logistical constraints
hedging against antiviral resistance during the next influenza pandemic using small stockpiles of an alternative chemotherapy
containing pandemic influenza with antiviral agents
containing pandemic influenza at the source
potential impact of antiviral drug use during influenza pandemic
modelling of the influenza a(h1n1)v outbreak in mexico city
secretaría de comunicaciones y transportes
the who rapid pandemic assessment collaboration: pandemic potential of a strain of influenza a(h1n1): early findings
we are grateful to the international air transport association for making the airline commercial flight database available to us. this work has been partially funded by the nih r21-da024259 award, the lilly endowment grant 2008 1639-000 and the dtra-1-0910039 award to av; the ec-ict contract no. 231807 (epiwork) to av, vc, and wvdb; the ec-fet contract no. 233847 (dynanets) to av and vc; the erc ideas contract no. erc-2007-stg204863 (epifor) to vc, cg, and mq.
authors' contributions: cg, wvdb and bg designed the software architecture. wvdb and mq developed the client application.
bg implemented the gleam engine. cg developed the proxy middleware. cg, wvdb, vc and av drafted the manuscript. mq and bg helped draft the manuscript. av and vc conceived and coordinated the software project, designed and coordinated the application study. all authors read and approved the final manuscript. competing interests: av is consulting and has a research agreement with abbott for the modeling of h1n1 diffusion. the other authors have declared that no competing interests exist. key: cord-178783-894gkrsk authors: zhang, rui; hristovski, dimitar; schutte, dalton; kastrin, andrej; fiszman, marcelo; kilicoglu, halil title: drug repurposing for covid-19 via knowledge graph completion date: 2020-10-19 journal: nan doi: nan sha: doc_id: 178783 cord_uid: 894gkrsk objective: to discover candidate drugs to repurpose for covid-19 using literature-derived knowledge and knowledge graph completion methods. methods: we propose a novel, integrative, and neural network-based literature-based discovery (lbd) approach to identify drug candidates from both pubmed and covid-19-focused research literature. our approach relies on semantic triples extracted using semrep (via semmeddb). we identified an informative subset of semantic triples using filtering rules and an accuracy classifier developed on a bert variant, and used this subset to construct a knowledge graph. five state-of-the-art (sota) neural knowledge graph completion algorithms were used to predict drug repurposing candidates. the models were trained and assessed using a time-slicing approach and the predicted drugs were compared with a list of drugs reported in the literature and evaluated in clinical trials. these models were complemented by a discovery-pattern-based approach. results: the accuracy classifier based on pubmedbert achieved the best performance (f1 = 0.854) in classifying semantic predications. among five knowledge graph completion models, transe outperformed the others (mr = 0.923, hits@1 = 0.417).
some known drugs linked to covid-19 in the literature were identified, as well as some candidate drugs that have not yet been studied. discovery patterns enabled generation of plausible hypotheses regarding the relationships between the candidate drugs and covid-19. among them, five highly ranked and novel drugs (paclitaxel, sb 203580, alpha 2-antiplasmin, pyrrolidine dithiocarbamate, and butylated hydroxytoluene) with their mechanistic explanations were further discussed. conclusion: we show that an lbd approach can be feasible for discovering drug candidates for covid-19, and for generating mechanistic explanations. our approach can be generalized to other diseases as well as to other clinical questions. anteed. on the other hand, de novo development and approval of an effective antiviral therapy can take more than a decade. in the absence of an effective vaccine or other therapies, there have been significant efforts in repurposing drugs approved for other diseases for covid-19 treatment, some of which have been tested in clinical trials (e.g., dexamethasone [9] , hydroxychloroquine and lopinavir/ritonavir [10] ). computational approaches to drug repurposing have also garnered much attention to accelerate discovery of therapies for covid-19 [11, 12] . common computational drug repurposing methods include drug signature matching, molecular docking, genome-wide association studies, and network analysis [13] . these data-driven approaches involve systematic analysis of various types of biological and clinical data (e.g., gene expression, chemical structure, genome and protein sequences, and electronic health records) to generate hypotheses regarding repurposed use of approved or investigational drugs [13] . the potential of recent advances in artificial intelligence (ai) and machine learning for covid-19 drug repurposing has also been highlighted [14] and several studies using these techniques have reported promising results [15] [16] [17] [18] . 
in particular, approaches leveraging network medicine [19] principles and biological knowledge graphs have been emphasized [14]. most of these computational approaches have focused on biological data, such as gene expression, protein-protein and drug-target interactions, and used sars-cov-2-related data. however, covid-19-specific data is meaningful in the context of the larger body of diverse knowledge underpinning medicine and life sciences, a primary source of which is the biomedical literature. while some covid-19 drug repurposing studies incorporated some literature-based knowledge [15, 18], their focus has remained largely covid-19-specific. we argue that efficiently and safely repurposing drugs to treat covid-19 requires more effective integration of literature-based knowledge with biological data collected via high-throughput methods. in this paper, we propose a novel literature-based discovery [20, 21] approach for covid-19 drug repurposing. similar to related work [18], we cast drug repurposing as a task of knowledge graph completion (or link prediction). we use a large, literature-derived biomedical knowledge graph constructed from semmeddb [22] as well as covid-19 research literature [23] as our data source. we use several state-of-the-art, neural network-based algorithms [24] [25] [26] for the task, and complement these approaches with an approach based on discovery patterns [27]. furthermore, we highlight the role of discovery patterns in the search for mechanistic explanations for the proposed drugs. unlike most approaches that focus on covid-19-specific knowledge [15, 18], we consider a larger body of biomedical knowledge, as captured in the pubmed bibliographic database as well as in the covid-19 research literature. our results show that our approach can identify known drugs that have been used for covid-19 and discover other novel drugs that can potentially be repurposed for covid-19.
significant computational work has already been done to prioritize fda-approved drugs for repurposing to treat covid-19 [11, 12]. for the most part, these studies can be categorized as molecular docking-based drug screening studies and network-based studies, the majority belonging to the former category. in molecular docking studies, small molecules in compound libraries are screened for effectiveness against the host proteins in the sars-cov-2 host interactome. many studies of this kind have been reported, and some of the proposed drugs, such as ritonavir, ribavirin, remdesivir, and oseltamivir, have been used in practice and many are being evaluated in clinical trials [28] [29] [30] [31] [32] [33] [34] [35]. while not as common as docking studies, network-based approaches to drug repurposing have also been explored. in one early study, a virus-related knowledge graph consisting of drug-target and protein-protein interactions and similarity networks from publicly available databases (e.g., drugbank [36], chembl [37], biogrid [38]) was constructed, and network-based machine learning and statistical analysis were used to predict an initial list of covid-19 drug candidates. this list was narrowed down based on text mining of the literature and gene expression profiles from covid-19 patients, and a poly-adp-ribose polymerase 1 (parp1) inhibitor, cvl218, was proposed for therapeutic use against covid-19 [15]. cava et al. [39] used gene expression profiles from public datasets to construct a protein-protein interaction network in conjunction with pathway enrichment analysis to identify 36 potential drugs, including nimesulide, thiabendazole, and fluticasone propionate.
in another study, network proximity analyses of drug targets and hcov-host interactions in the human interactome were used to prioritize 16 potential repurposed drugs, including melatonin, mercaptopurine, and sirolimus, which were validated by enrichment analyses of drug-gene signatures and transcriptome data in human cell lines. potentially useful drug combinations (e.g., melatonin plus mercaptopurine) were also suggested [16]. a follow-up study combined network medicine approaches based on the human interactome with clinical patient data from a covid-19 registry to show that melatonin was associated with a reduced likelihood of a positive sars-cov-2 laboratory test [17]. the approach was further extended to explore deep learning [18]. a comprehensive knowledge graph of drugs, diseases, and proteins/genes (named cov-kge) was constructed by combining molecular interaction information from the literature with knowledge from drugbank. a knowledge graph embedding model named rotate [25] was used to represent the entities and relationships of the knowledge graph in a low-dimensional vector space. using the ongoing covid-19 trial data as a validation set, 41 high-confidence repurposed drug candidates (including dexamethasone, indomethacin, niclosamine, and toremifene) were identified, and further validated via an enrichment analysis of gene expression and proteomics data in sars-cov-2-infected human cells. another study used node2vec graph embeddings and variational graph autoencoders for the same purpose [40]. a further study [41] evaluated three algorithms (graph neural network, network proximity, and network diffusion) on a network of drug protein targets and disease-associated proteins for covid-19 drug repurposing.
while they obtained low correlations across the three algorithms, an ensembling approach that combined the predictions of all algorithms was shown to outperform the individual methods, ranking ritonavir, chloroquine, and dexamethasone among the most promising candidates. some limited literature knowledge relevant to covid-19 has been incorporated into network-based approaches; however, their focus remains largely on structured molecular interaction information encoded in databases. literature-based discovery (lbd) [20, 21] is a method of automatic hypothesis generation pioneered by swanson [42]. based on the concept of "undiscovered public knowledge", lbd seeks to uncover valuable hidden connections between disparate research literatures, and has been proposed as a potential solution for the problem of "research silos" (the view that scientific research areas are largely isolated from one another). the primary lbd paradigm is the so-called abc model. in the open discovery form of this model, a relationship between two concepts a and b is known in one research area and another relationship between concepts b and c is known in another, and a potential relationship between concepts a and c is proposed. conversely, in closed discovery, the relationship ac is known, and a concept b is proposed as an explanation for the relationship ac. extensions to the abc model have also been proposed, such as the discovery browsing model, which aims to elucidate more complex relationship paths between biomedical concepts [43, 44]. most applications of lbd have been in the biomedical domain, beginning with swanson's discovery of fish oil as a treatment for raynaud disease [42], a hypothesis supported subsequently by clinical studies. while early lbd systems focused primarily on term co-occurrence [45, 46], semantic relations have been widely used in later years for representing the scientific content of biomedical publications [27, [47] [48] [49].
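the open discovery form of the abc model described above can be sketched as a simple search over a set of (subject, predicate, object) triples; the concepts and relations below are illustrative toy data (loosely echoing swanson's fish oil example), not triples drawn from semmeddb.

```python
# Toy sketch of open discovery in the ABC model: from concept A, collect
# intermediate concepts B linked to A, then propose concepts C linked to
# those Bs. All triples here are illustrative, not real extracted data.
known = {
    ("fish oil", "REDUCES", "blood viscosity"),
    ("fish oil", "REDUCES", "platelet aggregation"),
    ("blood viscosity", "ASSOCIATED_WITH", "raynaud disease"),
}

def open_discovery(triples, concept_a):
    """Propose C concepts reachable from A via one intermediate concept B."""
    bs = {obj for (subj, _, obj) in triples if subj == concept_a}
    return {obj for (subj, _, obj) in triples if subj in bs}

# fish oil -> blood viscosity -> raynaud disease
assert open_discovery(known, "fish oil") == {"raynaud disease"}
```

closed discovery would run the same search in both directions from the known a and c, intersecting the intermediate b concepts.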
more recently, distributed vector representations based on term or semantic relation co-occurrence have been gaining popularity [50] [51] [52]. drug repurposing has been one of the prominent applications of lbd [27, [53] [54] [55] [56] [57] [58]. for example, hristovski et al. [27] used semantic discovery patterns following the abc model to identify potential therapeutic uses for drugs. zhang et al. [56] used discovery patterns and semmeddb relations to identify potential prostate cancer drugs. cohen et al. [55] used a vector representation approach based on semantic relations to predict a small number of active agents within a large library screened for activity against prostate cancer cells. knowledge graphs are represented as a collection of head entity-relation-tail entity triples (h, r, t), where entities correspond to nodes and relations to edges between them. knowledge graph completion (or link prediction) is the task of predicting unseen relations between two existing entities, or of predicting the tail entity given the head entity and the relation (or the head entity given the tail entity and the relation). recent approaches to knowledge graph completion rely on knowledge graph embedding methods [59], which learn a mapping from nodes and edges to a continuous vector space that preserves the proximity structure of the knowledge graph and is amenable to the application of machine learning methods. such methods include translational models, which use distance-based scoring functions (e.g., transe [24], transh [60], rotate [25]), and semantic matching models, which use similarity-based scoring functions (e.g., rescal [61], distmult [62], complex [63], and holographic embeddings (hole) [64]). graph convolutional networks [65, 66] as well as methods that use a context-based encoding approach (kg-bert [67], stelp [26]) have also been recently proposed.
knowledge graph embedding techniques based on a network of drug, disease, and gene/protein entities have been used to support drug repurposing for rare diseases [68]. graph convolutional networks were used to model drug side effects resulting from drug-drug interactions [69]; a multimodal graph of protein-protein interactions, drug-protein target interactions, and drug-drug interactions was constructed from publicly available datasets. sang et al. (2018) [70] constructed low-dimensional knowledge graph embeddings from semmeddb relations and trained a long short-term memory (lstm) model using known drug therapies from the therapeutic target database [71], proposing potential drugs using the trained model. in this section, we first describe our data sources and the preprocessing steps that were taken to construct a literature knowledge graph from these data sources. next, we discuss the knowledge graph completion methods that we used to predict candidate drugs for covid-19, as well as the discovery patterns used for providing mechanistic explanations. lastly, we detail the various evaluation schemes that we used to automatically validate our predictions. a workflow diagram illustrating our approach is provided in fig. 1. we constructed our biomedical knowledge graph primarily from semmeddb [22], a repository of semantic relations automatically extracted from biomedical literature using the semrep natural language processing (nlp) tool [72, 73]. semrep-extracted relations are in the form of subject-predicate-object triples (also called semantic predications) and are derived from unstructured text. concepts are enriched with semantic type information (disease or syndrome, pharmacologic substance, etc.) and the relations are linked to the supporting article and sentence.
semmeddb has supported a wide range of computational applications, ranging from gene regulatory network inference [76] to in silico screening for drug repurposing [55] and medical reasoning [77], and has also found widespread use for literature-based knowledge discovery and hypothesis generation [44, 48, [78] [79] [80]. in its most recent release (version 43, dated 8/28/2020), semmeddb contains more than 107m relations. in this work, we focused on a subset of semantic relations derived from the combination of the pubmed and cord-19 datasets, predicted to be accurate and informative for drug repurposing. first, we eliminated relations involving generic biomedical concepts (i.e., relations in which both subject and object were present in a generic concept table of semmeddb, such as pharmaceutical preparations) and relations with identical subject and object arguments. next, we excluded a subset of predicate types that were not expected to be useful for drug repurposing, such as part of and process of. the predicate types used are affects, associated with, augments, causes, coexists with, complicates, disrupts, inhibits, interacts with, manifestation of, predisposes, prevents, produces, stimulates, and treats. lastly, we also excluded an additional subset of relations. in the second step, we eliminated uninformative semantic relations using the log-likelihood ratio (g2) and network degree centrality for the concepts (in-degree and out-degree). we assigned each semantic relation a g2 score indicating how strongly the terms within a triple are associated with each other [82]. a high g2 score means that the observed and expected frequencies are significantly different, indicating that the triple is less likely to occur by chance. for computational purposes, we created two three-dimensional contingency tables with indices i, j, and k.
the first table (ot) holds the observed frequencies of each triple from the knowledge graph, and the second table (et) contains the expected values assuming independence of the terms in each triple. g2 was then calculated as g2 = 2 * sum over i,j,k of n_ijk * ln(n_ijk / m_ijk), where n_ijk is cell (i, j, k) in ot, m_ijk is cell (i, j, k) in et, and t = sum over i,j,k of n_ijk. next, we normalized all three measures (g2, in-degree, and out-degree) to the range [0, 1] and summed them into a final score. the lower the score, the more specific and informative the relation. for example, the relation operative surgical procedures-treats-woman, which has a high score, is more general than the relation interleukin-6-affects-autoimmune diseases. we kept all relations for which the score was less than a threshold value α. we manually tuned the α value to achieve a balance between the specificity of relations and their variability, keeping the 30% of relations with the lowest scores in the data set. we also kept all biomedical concepts that refer to covid-19 terms (cuis: c5203670, c5203671, c5203672, c5203673, c5203674, c5203675, c5203676). at the end of the preprocessing stage, the knowledge graph consists of 131 355 nodes and 2 558 935 relations. the precision of semantic predications generated by semrep varies by domain (e.g., molecular interactions are less precise than clinical relationships). to improve the precision of the relations used for drug repurposing, we extended the semrep accuracy classifiers previously proposed [83, 84]. we fine-tuned a collection of transformer-based pretrained language models to classify semantic predications as correct vs. incorrect. we used the following models: vanilla bert (base size, cased and uncased) [85], biobert [86], bioclinicalbert [87], bluebert [88], and pubmedbert [89]. to extend the coverage of our existing classifiers, we used 6492 predications annotated as correct vs. incorrect with respect to their source sentences.
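a minimal sketch of this scoring step, assuming the contingency cells are flattened into parallel lists of observed and expected counts (the paper uses three-dimensional tables indexed by i, j, k); the example counts are made up.

```python
import math

def g2(observed, expected):
    """Log-likelihood ratio G2 = 2 * sum over cells of n * ln(n / m),
    for observed counts n and expected-under-independence counts m."""
    return 2.0 * sum(n * math.log(n / m)
                     for n, m in zip(observed, expected) if n > 0)

def normalize(values):
    """Rescale scores to [0, 1], as done before summing the G2,
    in-degree, and out-degree components into the final score."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

# Observed == expected means no association (G2 = 0); divergence raises G2.
assert g2([10, 10], [10, 10]) == 0.0
assert g2([40, 5], [20, 25]) > 0.0
```

in the paper's pipeline the three normalized components are summed per relation and the 30% of relations with the lowest combined score are kept.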
we leveraged 6000 annotations from a previous study [84] (cohen's κ of 0.80) and annotated 492 new predications. the annotation guidelines generated in the previous study were used. two of the authors (hk and mf) and two health informatics graduate students annotated predications containing predicates of interest absent from the prior study (fleiss' κ = 0.410, indicating moderate agreement). the resulting annotated set was split into 80/10/10 training/validation/test sets. hyperparameters were determined empirically: the learning rate was set to 1e-5, the batch size was 16, and the maximum number of epochs was set to 10, with early stopping employed. optimization was done using the adam optimizer [90] with decoupled weight decay regularization, using betas (0.9, 0.999) and decay 0.01. the pooled output from the bert model was fed through a linear layer to produce logits that then underwent a softmax transformation to return class probabilities. a single tesla v100 gpu was used to train the models. we compared the performance of the various above-mentioned transformers. the best classifier was then used to filter out incorrect semantic predications. consider a knowledge graph g = (e, r, e), where e refers to a set of entities, r denotes a set of possible relations, and t stands for a set of triples in the form (h)ead-(r)elation-(t)ail, formally denoted as {(h, r, t)} ⊆ e × r × e. the aim of knowledge graph completion is to infer new triples (h', r', t') such that h', t' ∈ e and r' ∈ r. in this setting, the knowledge graph completion problem can be represented as a ranking task in which we learn a prediction function φ(h, r, t) : e × r × e → r which generates higher scores for true triples and lower scores for false triples. we explored three classes of knowledge graph completion methods: transe [24] and rotate [25] for translational models, distmult [62] and complex [63] for semantic matching models, and stelp [26] for context-based encoding.
these methods differ in the way they encode the entities and relations of a knowledge graph into a low-dimensional vector space (i.e., a kg embedding). such distributed vector representations can be used for downstream reasoning and machine learning tasks. transe [24] describes a triple (h, r, t) as a translation between the head entity h and tail entity t through the relation r in a continuous vector space, i.e., h + r ≈ t, where h, r, t ∈ r^d are the embeddings of h, r, and t, respectively. to measure the plausibility of relations, transe employs a distance-based score function s(h, r, t) = −||h + r − t||. we chose transe because of its simplicity and good prediction performance. however, transe is able to model only one-to-one relations and fails to embed one-to-many, many-to-one, and many-to-many relations. to solve this problem, many other solutions have been proposed, including rotate [25]. rotate treats each relation in a complex vector space as a rotation from the head entity to the tail entity, i.e., s(h, r, t) = ||h ∘ r − t||_l1, where ∘ is the hadamard (element-wise) product. we selected rotate as a counterpart to transe, as transe reportedly does not perform well on some data sets (e.g., the fb15k family of data sets), which require symmetric pattern modeling. distmult [62] is the simplest approach among semantic matching models. the semantic triple encoder for link prediction (stelp) [26] is a context-based encoding approach to knowledge graph completion. at its core is a siamese bert model that shares one set of weights across two models to produce encoded, contextual representations of the predications. the model is trained on the set d of correct triples and sets n(tp) of corrupted triples built for each true predication tp, using a loss that combines a classification term with a contrastive term, where γ is a scaling factor for the contribution of the contrastive loss.
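the scoring functions above can be illustrated with small hand-written embeddings; the vectors are toy values (real models learn them over hundreds of dimensions), and the functions follow the standard published forms of each model.

```python
def transe_score(h, r, t):
    """TransE distance score ||h + r - t||_1; lower means more plausible."""
    return sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))

def distmult_score(h, r, t):
    """DistMult similarity score sum_i h_i * r_i * t_i; higher is better."""
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))

def rotate_score(h, r, t):
    """RotatE: relation r rotates h in complex space; score ||h o r - t||_1
    where o is the element-wise (Hadamard) product of complex numbers."""
    return sum(abs(hi * ri - ti) for hi, ri, ti in zip(h, r, t))

# Embeddings that satisfy h + r = t give a perfect TransE score of 0.
h, r, t = [0.25, 0.5], [0.25, -0.25], [0.5, 0.25]
assert transe_score(h, r, t) == 0.0
assert distmult_score([1, 2], [3, 4], [5, 6]) == 63
```

a unit-modulus complex relation, e.g. r = [1j], rotates the head by 90 degrees, so rotate_score([1+0j], [1j], [1j]) is 0: the relation maps the head exactly onto the tail.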
at inference, stelp considers every entity-context combination for a given partial predication, (h, r) to find t or (r, t) to find h, and ranks every pair using the sum of the positive class probability and the scaled negative euclidean distance. we replaced the vanilla base bert model proposed in the stelp paper with biobert, trained on biomedical literature corpora. the 1 016 124 unique predications remaining after preprocessing were each corrupted to produce five negative predications, for a total of 5 080 620 negative predications and a grand total of 6 096 744 predications. the hyperparameters were set to the same values as in the original stelp paper: the learning rate was 1e-5, the batch size was 16, and the contrastive loss scaling factor was 1.0. optimization was done using adam with decoupled weight decay, with betas (0.9, 0.999) and decay 0.01. training was run for 29 000 iterations. ranking was done by adding the scaled contrast score to the positive class probability, with entities ordered in descending rank order. all preprocessing was done using custom bash and python scripts. the transe, rotate, distmult, and complex link prediction models were implemented in pytorch using the dgl-ke package [91] for learning large-scale kg embeddings. the bert models were based on huggingface bert implementations using pytorch. pre-trained weights for biobert (biobert-base v1.1 (+ pubmed 1m)), bioclinicalbert, pubmedbert, and bluebert (bluebert-base, uncased, pubmed+mimic-iii) came from the sources associated with each paper. stelp was also implemented using a combination of a huggingface bert model and pytorch. our source code is also publicly available. discovery patterns are defined as a set of constraints that need to be satisfied for the discovery of new relations between concepts [27]. herein, we used discovery patterns for two purposes.
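the stelp ranking rule described above (positive-class probability plus scaled negative euclidean distance) can be sketched directly; the probabilities and embeddings below are made-up stand-ins for the siamese bert outputs, not values from the trained model.

```python
import math

def stelp_rank_score(p_positive, emb_a, emb_b, scale=1.0):
    """Candidate score: positive-class probability plus the scaled
    negative Euclidean distance between the two encoded representations.
    Higher scores rank first."""
    dist = math.dist(emb_a, emb_b)  # Euclidean distance (Python 3.8+)
    return p_positive - scale * dist

# A confident prediction with nearby encodings should outrank an
# unconfident one with distant encodings.
close = stelp_rank_score(0.9, [1.0, 0.0], [1.0, 0.1])
far = stelp_rank_score(0.3, [1.0, 0.0], [0.0, 2.0])
assert close > far
```

candidates for a partial predication would be sorted by this score in descending order, as in the ranking procedure described above.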
first, we explored open discovery patterns to identify drugs that can be repurposed for covid-19. second, we used closed discovery patterns to propose plausible mechanisms for the drugs identified via the knowledge graph completion methods described above. discovery patterns are expressed in terms of predication pairs (or predication chains). in particular, we focused on the following discovery pattern: druga - inhibits|interacts with - conceptb and 10/15/2020). we focus on the latter category in our qualitative evaluation below. we semi-automatically generated a ground truth drug list, similar to the approach taken in other computational drug repurposing studies for covid-19 [18]. we downloaded the interventions used in covid-19 drug trials from clinicaltrials.gov using the following search: https://clinicaltrials.gov/ct2/res. this yielded a set of 1167 clinical trials. we extracted the interventions from these studies, mapped the intervention terms to umls cuis using metamap (v2016) [92], and filtered the resulting concepts by their semantic groups [93], keeping only those concepts with the semantic group chemicals & drugs. we also considered the additional semantic types therapeutic procedure and gene or genome, which also appeared for some concepts in the intervention lists. we removed duplicates and some general concepts (e.g., therapeutic procedure, placebo) as well as incorrect mappings, which resulted in a final list of 285 concepts. the automatic evaluation (below) was performed against this set. time slicing is an evaluation technique often used in literature-based discovery and link prediction tasks [20]. the idea is to train models on data prior to a specific date, test them on data after that date, and evaluate whether links that formed only after the cutoff date can be predicted by the trained model.
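the time-slicing setup can be sketched as a date-based split of dated triples; the cutoff matches the 03/11/2020 date used in the paper, while the triples themselves are hypothetical placeholders.

```python
from datetime import date

CUTOFF = date(2020, 3, 11)  # WHO pandemic declaration, the paper's cutoff

def time_slice(dated_triples, cutoff=CUTOFF):
    """Split (triple, publication_date) pairs into training triples
    (published on or before the cutoff) and test triples (after it)."""
    train = [t for t, d in dated_triples if d <= cutoff]
    test = [t for t, d in dated_triples if d > cutoff]
    return train, test

pairs = [
    (("drug_a", "TREATS", "covid-19"), date(2020, 2, 1)),  # hypothetical
    (("drug_b", "TREATS", "covid-19"), date(2020, 6, 1)),  # hypothetical
]
train, test = time_slice(pairs)
assert train == [("drug_a", "TREATS", "covid-19")]
assert test == [("drug_b", "TREATS", "covid-19")]
```

the models are then trained only on the pre-cutoff slice and scored on whether they recover the post-cutoff links.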
in this study, we trained our models on semantic relations extracted from publications dated 03/11/2020 or earlier and tested whether they can predict the drugs that have been proposed for covid-19 since then or have been evaluated in clinical trials. this date was selected as the cutoff because it is the date on which the who declared covid-19 a pandemic. it is also a date by which enough biological knowledge about sars-cov-2 had accumulated, although covid-19 therapies were still in their infancy, making it a suitable cutoff for time slicing experiments. all five link prediction models were automatically assessed using the link prediction evaluation protocol proposed by bordes et al. [24]. suppose that x is a set of triples, θ_e the embeddings of entities e, and θ_r the embeddings of relations r. in the first, corruption step, we go through the set of triples and for each triple x = (h, r, t) ∈ x replace its head and tail with all other entities in e. each triple is corrupted exactly 2|e| − 1 times. formally, the corrupted triples are defined as x' = (h', r, t) or x' = (h, r, t'), where h' ≠ h and t' ≠ t. we employ the filtered setting protocol, not taking into account any corrupted triple that already appears in the kg. in the second, scoring phase, the original and corrupted triples are scored using the constructed function φ; the intuition is that the model will assign a higher score to the original triple and lower scores to the corrupted triples. in the third, evaluation phase, the proposed link prediction models are assessed using three measures: mean rank (mr), mean reciprocal rank (mrr), and the hits@k measure. mr is the average rank assigned to the true predication over all predications in the test set: mr = (1 / 2|x|) * sum over i of (rank_h_i + rank_t_i), where rank_h_i and rank_t_i denote the rank positions of the true triple among its head- and tail-corrupted alternatives, and the indicator function i[p] is 1 iff p is true, and 0 otherwise.
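the corruption step with the filtered setting can be sketched over a toy entity set; entity and relation names are illustrative.

```python
def corrupt_filtered(triple, entities, known_triples):
    """Corrupt (h, r, t) by replacing its head and its tail with every
    other entity, then drop corruptions that are themselves true triples
    (the filtered setting of the Bordes et al. protocol)."""
    h, r, t = triple
    candidates = [(e, r, t) for e in entities if e != h] + \
                 [(h, r, e) for e in entities if e != t]
    return [c for c in candidates if c not in known_triples]

known = {("a", "rel", "b"), ("c", "rel", "b")}
out = corrupt_filtered(("a", "rel", "b"), ["a", "b", "c"], known)
assert ("c", "rel", "b") not in out  # filtered out: it is a true triple
assert ("b", "rel", "b") in out      # an ordinary corruption survives
```

each surviving corruption is then scored alongside the original triple, and the original triple's rank among them feeds the metrics below.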
mrr is the average inverse rank over all test triples and is formally computed as mrr = (1 / 2|x|) * sum over i of (1/rank_h_i + 1/rank_t_i). hits@k measures the percentage of predications in which the true triple appears among the top k ranked triples, where k ∈ {1, 3, 10}; formally, hits@k = (1 / 2|x|) * sum over i of (i[rank_h_i ≤ k] + i[rank_t_i ≤ k]). our aim was to achieve low mr and high mrr and hits@k. in addition, we also performed a qualitative evaluation. one of the authors (mf) used the neo4j browser to assess the plausibility of some of the drugs highly ranked by the knowledge completion models, guided by literature search and review, and following the closed and open discovery paradigms. we report the performance of the semantic relation accuracy classifier as well as the knowledge graph completion methods in this section. the full table of results for the comparison of the various bert models for the accuracy classifier is included below (table 1). the link prediction results for all employed models are presented in table 3. for mr a lower score is better; for all other measures a higher score is better. the score for each method is the mean value over all triples in the testing set. we used the l1 norm, learning rate η = 0.01, and regularization coefficient λ = 2 × 10^−8. model training was limited to 20 000 epochs. the relatively small number of relations (15) ensures that all entities and relations can be smoothly embedded into the same vector space. next, we use the t-sne (t-distributed stochastic neighbor embedding) [94] algorithm to graphically represent the computed concept embeddings in a two-dimensional space (figure 2). the t-sne algorithm enables reduction of high-dimensional data into a low-dimensional space such that similar concepts are represented by nearby points. the plot demonstrates relatively good co-localization of the selected concepts, especially for suspected covid-19 and paclitaxel. our results indicate that more complex knowledge graph completion models might be less efficient in drug repurposing tasks.
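the three evaluation measures reduce to simple statistics over the ranks of the true triples; a minimal sketch with made-up ranks (real evaluation aggregates head- and tail-side ranks over the whole test set):

```python
def mean_rank(ranks):
    """MR: average rank assigned to the true triple; lower is better."""
    return sum(ranks) / len(ranks)

def mean_reciprocal_rank(ranks):
    """MRR: average inverse rank; rewards placing the truth near the top."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_k(ranks, k):
    """Hits@k: fraction of test cases whose true triple ranks in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

ranks = [1, 2, 10]  # illustrative ranks of true triples among corruptions
assert mean_rank(ranks) == 13 / 3
assert hits_at_k(ranks, 3) == 2 / 3
```

the filtered-setting ranks produced by the corruption protocol are the `ranks` input here.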
theoretical considerations suggest that transe should be outperformed by its successors [25, 62, 63]. however, the differences in performance among distmult, complex, and rotate are relatively small, and all three models achieved low performance on mrr and hits@1. although transe can inherently model only one-to-one relations and fails to represent one-to-many, many-to-one, or many-to-many relations, it proved efficient in embedding a large-scale, complex biomedical knowledge graph, such as the extended semmeddb used here. empirical evidence shows that distmult and complex usually perform well for high-degree entities but fail with low-degree entities [95]. because we eliminated highly frequent concepts due to their lack of informativeness, it is possible that this is reflected in the lower performance scores of both models. the context-encoding model, stelp, showed rather poor performance in the evaluation. one possibility is that the model was only able to learn high-level groupings for the predicates. this is likely the case, as it was observed that the model could separate, for example, treats covid-19 versus affects covid-19, but did not learn more granular features that would allow it to differentiate between subjects within the context of treats covid-19. however, analysis of the t-sne embedding and the qualitative evaluation show that the model mostly clustered the ground truth drugs into a couple of large clusters. to further compare the drug rankings between transe and stelp, we performed the wilcoxon signed-rank test (p = 0.851), which indicates that no correlation was found between how the two models rank novel predications. spearman's rank correlation between the novel predication rankings of the two models was found to be −0.003, which further supports the results of the wilcoxon test. table 4 and table 5 show that there is very little agreement between transe and stelp, particularly in the top 1000 rankings for each model. it is worth noting that there were 47 items in common in the top 1000 rankings for both models.
ranked triples for the specified model, calculating the absolute difference between the rankings from the two models for each of those triples, and calculating the statistics. for example, we gathered the 1000 triples that transe ranked highest, calculated the absolute differences between the transe and stelp rankings for those triples, and computed the statistics from those differences. it may be possible to explore larger graphs than the one explored in this work. on the other hand, with adequately large computational resources, it may be possible to optimize stelp hyperparameters and train over multiple random seeds to generate a model that obtains better results than transe or rotate, which are limited by their smaller representational capacity. discovery patterns based on semantic relations provide an intuitive way of exploring potential mechanistic links between biological phenomena. neo4j and cypher, its query language, are powerful tools that complement semantic relations nicely in quickly pinpointing promising research directions, although massive graphs present some challenges for effective query and retrieval. in addition, a domain expert is needed to sort out some of the noise in semantic relations (some of it obvious) due to text mining errors. however, given that predictions made by the knowledge completion models above are largely opaque, a human-in-the-loop discovery browsing approach based on discovery patterns [43, 44] remains an effective alternative to these more complex approaches, and also complements them by providing potential explanations. the following classes of drugs have been used for the management of covid-19 so far: antivirals (e.g., remdesivir), antibodies (e.g., convalescent plasma), anti-inflammatory agents (e.g., dexamethasone), immunomodulators (e.g., interleukin inhibitors), anticoagulants (e.g., heparin), antifibrotics (e.g., tyrosine kinase inhibitors), and adjuvants (e.g., vitamin d) [96, 97].
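the rank-difference statistics described above can be sketched as follows; the two rankings are toy stand-ins for the transe and stelp orderings, and the function name is invented.

```python
def rank_difference_stats(rank_a, rank_b, k):
    """For the k triples ranked highest by model A, compute summary
    statistics of |rank_A - rank_B| between the two models."""
    top_a = sorted(rank_a, key=rank_a.get)[:k]
    diffs = sorted(abs(rank_a[t] - rank_b[t]) for t in top_a)
    n = len(diffs)
    mean = sum(diffs) / n
    median = diffs[n // 2] if n % 2 else (diffs[n // 2 - 1] + diffs[n // 2]) / 2
    return {"mean": mean, "median": median, "min": diffs[0], "max": diffs[-1]}

# toy rankings over six candidate triples
rank_a = {"t1": 1, "t2": 2, "t3": 3, "t4": 4, "t5": 5, "t6": 6}
rank_b = {"t1": 6, "t2": 1, "t3": 5, "t4": 2, "t5": 4, "t6": 3}
stats = rank_difference_stats(rank_a, rank_b, k=4)
```

large mean and median differences over the top-k triples indicate the kind of disagreement reported between the two models.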
in addition, several trials have studied antimalarials (e.g., hydroxychloroquine) and antiparasites (e.g., ivermectin), but evidence from trials does not support their use. the knowledge graph completion models did not predict antivirals, antimalarials, and antiparasites, except for antivirals from the class of neuraminidase inhibitors and the antimalarial artemisone. all the other drug classes and most of their members were predicted by the models. dexamethasone, currently considered the most effective drug for reducing mortality in patients receiving oxygen, was the highest ranking drug from the rotate model. it is possible that the models missed specific antivirals and antiparasites due to their mechanism of action, which usually involves binding to specific receptors, a relation type on which semrep does relatively poorly. despite this issue, qualitative assessment of the drugs predicted by the models was overall positive. using the open discovery pattern approach, we identified five promising drugs that were ranked highly and were not, to our knowledge, discussed in the literature, which we discuss below (paclitaxel, sb 203580, alpha 2-antiplasmin, pyrrolidine dithiocarbamate, and butylated hydroxytoluene). the same approach also yielded other highly ranked substances which are currently evaluated in clinical trials, such as quercetin, melatonin, losartan, estradiol, and simvastatin. note that the knowledge graph completion models predicted 7 of these drugs (excluding sb 203580, alpha 2-antiplasmin, and pyrrolidine dithiocarbamate). figure 3 shows the resulting network from this discovery pattern generated by the neo4j browser. paclitaxel is used to treat several cancer types, including ovarian cancer, breast cancer, lung cancer, cervical cancer, and pancreatic cancer. it stabilizes the microtubule polymer and protects it from disassembly. chromosomes are thus unable to achieve a metaphase spindle configuration.
this blocks the progression of mitosis, and prolonged activation of the mitotic checkpoint triggers apoptosis or reversion to the g0-phase of the cell cycle without cell division [98]. the following patterns support the paclitaxel discovery:
1. paclitaxel-inhibits-interleukin-6-causes-covid-19
2. paclitaxel-inhibits-nf-kappa b-associated with-covid-19
3. paclitaxel-inhibits-interleukin-1, beta-associated with-covid-19
4. paclitaxel-inhibits-granulocyte colony-stimulating factor-associated with-covid-19
5. paclitaxel-inhibits-interleukin-10-predisposes-covid-19
6. paclitaxel-inhibits-interleukin-8-predisposes-covid-19
7. paclitaxel-inhibits-thromboplastin-associated with-covid-19
the first six patterns support a role for paclitaxel in alleviating the cytokine storm of covid-19, triggered by a dysfunctional immune response and mediating widespread lung inflammation. paclitaxel may plausibly help as an immunosuppressive therapy against immune-mediated damage in covid-19 [99]. thromboplastin (pattern 7) is a complex enzyme found in brain, lung, and other tissues, and especially in blood platelets; it functions in the conversion of prothrombin to thrombin in the clotting of blood and may be elevated in patients with covid-19. as pulmonary microvascular thrombosis plays an important role in progressive lung failure in covid-19 patients, paclitaxel may reduce the state of hypercoagulability by acting as an inhibitor of thromboplastin [100]. the final pattern involves the interaction of paclitaxel with tlr4. paclitaxel is known to have high affinity for tlr4 receptors. sars-cov-2 spike protein binds with human innate immune receptors, mainly tlr4, increasing secretion of il-6 and tnf-α and the neuroimmune response. this suggests that paclitaxel may dislocate sars-cov-2 spike proteins [101, 102].
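a closed discovery pattern of the shape used above (drug-inhibits-b, b-causes/associated with-covid-19) can be matched programmatically over a small in-memory set of predications; the triples below are a toy fragment, not actual semmeddb output.

```python
def closed_discovery(triples, drug, disease, allowed):
    """Match discovery patterns drug-REL1->B, B-REL2->disease over a
    list of (subject, relation, object) predications, keeping only
    relation pairs (REL1, REL2) listed in `allowed`."""
    out_edges = {}
    for s, rel, o in triples:
        out_edges.setdefault(s, []).append((rel, o))
    paths = []
    for rel1, b in out_edges.get(drug, []):
        for rel2, o in out_edges.get(b, []):
            if o == disease and (rel1, rel2) in allowed:
                paths.append((rel1, b, rel2))
    return paths

# toy predication fragment (not actual SemMedDB content)
triples = [
    ("paclitaxel", "INHIBITS", "interleukin-6"),
    ("interleukin-6", "CAUSES", "covid-19"),
    ("paclitaxel", "INHIBITS", "nf-kappa b"),
    ("nf-kappa b", "ASSOCIATED_WITH", "covid-19"),
    ("paclitaxel", "STIMULATES", "tlr4"),
]
allowed = {("INHIBITS", "CAUSES"), ("INHIBITS", "ASSOCIATED_WITH")}
paths = closed_discovery(triples, "paclitaxel", "covid-19", allowed)
```

in the paper this kind of two-hop match is run in neo4j with cypher; the sketch only illustrates the pattern shape.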
sb 203580 is a specific inhibitor of p38α, which suppresses downstream activation of mapkap kinase-2, involved in many cellular processes including stress and inflammatory responses and cell proliferation. the following patterns support the sb 203580 discovery: ity, especially in patients with comorbidities such as hypertension, diabetes, and coronary heart disease [104]. the following patterns support the alpha 2-antiplasmin discovery:
1. alpha 2-antiplasmin-inhibits-plasmin-predisposes-covid-19
2. alpha 2-antiplasmin-inhibits-fibrinogen-associated with-covid-19
3. alpha 2-antiplasmin-interacts with-igy-associated with-covid-19
more specifically, plasmin may cleave a newly inserted furin site in the s protein of sars-cov-2, which increases its infectivity and virulence in covid-19. in addition, fibrinogen levels are higher in covid-19 patients and may contribute to hypercoagulability [104]. by inhibiting plasmin and fibrinogen (first two patterns), alpha 2-antiplasmin may confer protection against covid-19. in addition, pattern 3 suggests a mechanism of protection via immunoglobulin y (igy). in the immunology field, igy against acute respiratory tract infection has been developed for more than 20 years. several igy applications have been effectively confirmed in both human and animal health. igy antibodies extracted from chicken eggs have been used in bacterial and viral infection therapy. igy immunization has been proposed as an adjuvant therapy in viral respiratory infection caused by covid-19 [105]. chickens were immunized with alpha 2-antiplasmin, and the peptide-specific antibody (igy) isolated from the egg yolks of the immunized hens could be used as a potential protection for covid-19 patients [106]. pyrrolidine dithiocarbamate belongs to a family of drugs used for metal chelation and induction of g1-phase cell cycle arrest. it binds to zinc, and the resulting complex can enter the cell and inhibit the viral rna-dependent rna polymerase [107].
it is supported by the following patterns:
1. pyrrolidine dithiocarbamate-inhibits-nf-kappa b-associated with-covid-19
2. pyrrolidine dithiocarbamate-inhibits-interleukin-6-associated with-covid-19
3. pyrrolidine dithiocarbamate-inhibits-tnf protein, human-associated with-covid-19
the mechanisms suggested here are similar to those observed for the previous drugs. pyrrolidine dithiocarbamate has antioxidant properties and prevents inflammatory changes. through its antiviral activity, it inhibits the expression of il-6, tnf, and nf-κb in virus-infected chorion cells. it has been proposed for the treatment of influenza [107] and it may have potential as a therapeutic option for covid-19. butylated hydroxytoluene is a lipophilic compound useful for its antioxidant properties. it is widely used to prevent free radical-mediated oxidation in fluids and other materials, and is generally recognized as safe as a food additive. it has been postulated in the past as an antiviral drug. open discovery identified the following relevant patterns:
1. butylated hydroxytoluene-inhibits-cd69 protein, human
2. butylated hydroxytoluene-inhibits-free radicals-associated with-covid-19
3. butylated hydroxytoluene-inhibits-tnf protein, human-associated with-covid-19
4. butylated hydroxytoluene-inhibits-hydrogen peroxide-associated with-covid-19
the first pattern indicates butylated hydroxytoluene as an inhibitor of cd69. studies have shown that cd69+ cells were detected in the lungs of patients with asthmatic and eosinophilic pneumonia, suggesting a crucial role for cd69 in the pathogenesis of such inflammatory diseases. cd69 is, potentially, a new therapeutic target for patients with intractable inflammatory disorders and tumors [108]. therefore, by inhibiting cd69, butylated hydroxytoluene may halt potential inflammatory responses in covid-19. however, cd69 does not appear to be a major player in the physiopathology of covid-19 (the query "cd69 and covid-19" did not return any results in pubmed).
nonetheless this is noteworthy, because this pathway has been suggested as a novel and important pathway for all immune responses [108]. the crucial role of free radicals in covid-19 has been acknowledged, and an antioxidative therapeutic strategy for covid-19 has been suggested [109]. along these lines, patterns 2-3 point to the antioxidant function of butylated hydroxytoluene through scavenging free radicals and inhibiting reactive oxygen species [110]. our approach relies on the accuracy of the predications extracted by semrep. semrep precision is about 0.70 and its recall around 0.42 [73]. while the accuracy classifier helped us improve the accuracy of the predications used, the remaining errors were still significant, impacting the knowledge graph completion task. in addition, despite aggressive filtering, the graph formed by the relations in extended semmeddb is very large, making it difficult to apply computationally intensive models like stelp. in this study, we examined a sub-graph, which, inevitably, results in a loss of information available to knowledge graph completion techniques. while we were still able to apply modeling techniques to a fairly large sub-graph focusing on drug repurposing, there exists a larger, complementary sub-graph that may provide further drug candidates. as noted above, the transe model benefited from hyperparameter tuning using a grid search method to find an optimal configuration. stelp would likely benefit from similar tuning. for example, a single linear layer was used on the pooled output from the biobert model to produce the logits; increasing the representational capacity of the linear layer, by depth or width, might allow stelp to develop a richer model of the underlying space formed by the biobert contextualized embeddings. our methods were limited to knowledge from the literature.
other types of biological data (e.g., protein-protein interactions, drug-target interactions, gene/protein sequences, pharmacogenomic and pharmacokinetic data) are likely to benefit the identification of drug candidates, as shown to some extent by other studies [14], as well as by our prior work [53]. however, the computational resources needed for training models based on such massive data can be prohibitive. transe and similar methods seem more promising in that respect. lastly, with our in silico approach, we can of course only propose drug candidates for repurposing. to evaluate whether these drugs could indeed act as therapeutic agents for covid-19, clinical studies are needed. however, the fact that we were able to identify some drugs known to have some benefit for covid-19 (e.g., dexamethasone) via purely computational methods that rely only on automatically extracted literature knowledge is encouraging. in this study, we proposed an approach that combines literature-based discovery and knowledge graph completion for covid-19 drug repurposing. unlike similar efforts that largely focused on covid-19-specific knowledge, we incorporated knowledge from a wider range of biomedical literature. we used state-of-the-art knowledge graph completion models as well as simple but effective discovery patterns to identify candidate drugs. we also demonstrated the use of these patterns for generating plausible mechanistic explanations, showing the complementary nature of the two methods. the approach proposed here is not specific to covid-19 and can be used to repurpose drugs for other diseases. it can also be generalized to answer other clinical questions, such as discovering drug-drug interactions or identifying drug adverse effects. as the covid-19 pandemic continues its spread and disruption around the globe, we are reminded that the spread of infectious diseases is increasingly common and future pandemics ever more likely.
innovative computational methods leveraging existing biomedical knowledge and infrastructure could help us plan for, respond to, and mitigate the effects of such global health crises. drug repurposing is a key piece of this response, and our approach provides an efficient computational method to facilitate this goal.
1. sb 203580-inhibits-interleukin-6-causes-covid-19
2. sb 203580-inhibits-tnf protein
3. sb 203580-inhibits-interleukin-1, beta-associated with-covid-19
4. sb 203580-inhibits-nf-kappa b-associated with-covid-19
5. sb 203580-inhibits-interleukin-1-causes-covid-19
6. sb 203580-inhibits-granulocyte-macrophage colony-stimulating factor-associated with-covid-19
7. sb 203580-inhibits-macrophage colony-stimulating factor-associated with-covid-19
similarly to paclitaxel, all patterns involving sb 203580 point to a potential inhibition of the hyperinflammatory response in covid-19. the role of the protein kinase p38α in inflammation and innate immunity was found when the compound sb 203580 suppressed tumor necrosis factor (tnf) production in monocytes, and this resulted in inhibition of septic (inflammatory) shock. alpha 2-antiplasmin is a serine protease inhibitor responsible for inactivating plasmin.
draft landscape of covid-19 candidate vaccines
placebo-controlled study of azd1222 for the prevention of covid-19 in adults
statement on astrazeneca oxford sars-cov-2 vaccine, azd1222, covid-19 vaccine trials temporary pause
safety and immunogenicity of an rad26 and rad5 vector-based heterologous prime-boost covid-19 vaccine in two formulations: two open, non-randomised phase 1/2 studies from russia
researchers highlight 'questionable' data in russian coronavirus vaccine trial results
dexamethasone in hospitalized patients with covid-19-preliminary report
effect of hydroxychloroquine in hospitalized patients with covid-19: preliminary results from a multi-centre, randomized, controlled trial
current status of covid-19 therapies and drug repositioning applications
covid-19 drug repurposing: a review of computational screening methods, clinical
trials, and protein interaction assays
drug repurposing: progress, challenges and recommendations
artificial intelligence in covid-19 drug repurposing, the lancet digital health
a data-driven drug repositioning framework discovered a potential therapeutic agent targeting covid-19
network-based drug repurposing for novel coronavirus 2019-ncov/sars-cov-2
a network medicine approach to investigation and population-based validation of disease manifestations and drug repurposing for covid-19
repurpose open data to discover therapeutics for covid-19 using deep learning
network medicine: a network-based approach to human disease
literature based discovery: models, methods, and trends
emerging approaches in literature-based discovery: techniques and performance review
semmeddb: a pubmed-scale repository of biomedical semantic predications
cord-19: the covid-19 open research dataset
translating embeddings for modeling multi-relational data
rotate: knowledge graph embedding by relational rotation in complex space
semantic triple encoder for fast open-set link prediction
exploiting semantic relations for literature-based discovery
a sars-cov-2 protein interaction map reveals targets for drug repurposing
discovery of sars-cov-2 antiviral drugs through large-scale compound repurposing
analysis of therapeutic targets for sars-cov-2 and discovery of potential drugs by computational methods
anti-hcv, nucleotide inhibitors, repurposing against covid-19
virtual screening and repurposing of fda approved drugs against covid-19 main protease
using integrated computational approaches to identify safe and rapid treatment for sars-cov-2
fast identification of possible drug treatment of coronavirus disease-19 (covid-19) through computational drug repurposing study
ribavirin, remdesivir, sofosbuvir, galidesivir, and tenofovir against sars-cov-2 rna dependent rna polymerase (rdrp): a molecular docking study
drugbank: a knowledgebase for drugs, drug actions and drug targets
chembl: a large-scale bioactivity database for drug discovery
biogrid: a general repository for interaction datasets
in silico discovery of candidate drugs against covid-19
predicting potential drug targets and repurposable drugs for covid-19 via a deep generative model for graphs
network medicine framework for identifying drug repurposing opportunities for covid-19
fish oil, raynaud's syndrome, and undiscovered public knowledge, perspectives in biology and medicine
graph-based methods for discovery browsing with semantic predications
semantic medline for discovery browsing: using semantic predications and the literature-based discovery paradigm to elucidate a mechanism for the obesity paradox
an interactive system for finding complementary literatures: a stimulus to scientific discovery
using concepts in literature-based discovery: simulating swanson's raynaud-fish oil and migraine-magnesium discoveries
using the literature-based discovery paradigm to investigate drug mechanisms
exploring relation types for literature-based discovery
context-driven automatic subgraph creation for literature-based discovery
reflective random indexing and indirect inference: a scalable method for discovery of implicit connections
finding schizophrenia's prozac
emergent relational similarity in predication space
embedding of semantic predications
combining semantic relations and dna microarray data for novel hypotheses generation
using literature-based discovery to identify novel therapeutic approaches
predicting high-throughput screening results with scalable literature-based discovery methods
exploiting literature-derived knowledge and semantics to identify potential prostate cancer drugs
a new method for prioritizing drug repositioning candidates extracted by literature-based discovery
literature-based discovery of new candidates for drug repurposing
knowledge graph embedding: a survey of approaches and applications
knowledge graph embedding by translating on hyperplanes
a three-way model for collective learning on multi-relational data, in: icml
embedding entities and relations for learning and inference in knowledge bases
complex embeddings for simple link prediction
holographic embeddings of knowledge graphs
convolutional 2d knowledge graph embeddings
modeling relational data with graph convolutional networks
kg-bert: bert for knowledge graph completion
a literature-based knowledge graph embedding method for identifying drug repurposing opportunities in rare diseases
modeling polypharmacy side effects with graph convolutional networks
gredel: a knowledge graph embedding based method for drug discovery from biomedical literatures
ttd: therapeutic target database
the interaction of domain knowledge and linguistic structure in natural language processing: interpreting hypernymic propositions in biomedical text
broad-coverage biomedical relation extraction with semrep
the unified medical language system
the unified medical language system (umls): integrating biomedical terminology
augmenting microarray data with literature-based knowledge to enhance gene regulatory network inference
a reasoning and hypothesis-generation framework based on scalable graph analytics enabling discoveries
link prediction on the semantic medline network
are abstracts enough for hypothesis generation?
investigating the role of interleukin-1 beta and glutamate in inflammatory bowel disease and epilepsy using discovery browsing
keep up with the latest coronavirus research
extending the log-likelihood measure to improve collocation identification, master's thesis
mining biomedical literature to explore interactions between cancer drugs and dietary supplements
evaluating active learning methods for annotating semantic predications
bert: pre-training of deep bidirectional transformers for language understanding
biobert: a pre-trained biomedical language representation model for biomedical text mining
proceedings of the 2nd clinical natural language processing workshop
transfer learning in biomedical natural language processing: an evaluation of bert and elmo on ten benchmarking datasets
domain-specific language model pretraining for biomedical natural language processing
adam: a method for stochastic optimization
training knowledge graph embeddings at scale
an overview of metamap: historical perspective and recent advances
aggregating umls semantic types for reducing conceptual complexity
visualizing data using t-sne
a capsule network-based embedding model for knowledge graph completion and search personalization
pharmacologic treatments for coronavirus disease 2019 (covid-19): a review
pathophysiology, transmission, diagnosis, and treatment of coronavirus disease 2019 (covid-19): a review
how taxol/paclitaxel kills cancer cells
the trinity of covid-19: immunity, inflammation and intervention
covid-19: coagulopathy, risk of thrombosis, and the rationale for anticoagulation
the role of tlr4 in chemotherapy-driven metastasis
is toll-like receptor 4 involved in the severity of covid-19 pathology in patients with cardiometabolic comorbidities?
what goes up must come down: molecular basis of map-kap kinase 2/3-dependent regulation of the inflammatory response and its inhibition
elevated plasmin(ogen) as a common risk factor for covid-19 susceptibility
igy-turning the page toward passive immunization in covid-19 infection
purification of human α2-antiplasmin with chicken igy specific to its carboxy-terminal peptide
antiviral function of pyrrolidine dithiocarbamate against influenza virus: the inhibition of viral gene replication and transcription
a new therapeutic target: the cd69-myl9 system in immune responses
tackle the free radicals damage in covid-19
understanding the chemistry behind the antioxidant activities of butylated hydroxytoluene (bht): a review
we thank françois-michel lang, leif neve, and jim mork for their assistance with processing the cord-19 dataset with semrep and providing updates to semmeddb. we acknowledge tom rindflesch for his encouragement with the project.
key: cord-244657-zp65561y authors: hawryluk, iwona; mishra, swapnil; flaxman, seth; bhatt, samir; mellan, thomas a. title: simulating normalising constants with referenced thermodynamic integration: application to covid-19 model selection date: 2020-09-08 journal: nan doi: nan sha: doc_id: 244657 cord_uid: zp65561y
model selection is a fundamental part of bayesian statistical inference and a widely used tool in the field of epidemiology. simple methods such as the akaike information criterion are commonly used, but they do not incorporate the uncertainty of the model's parameters, which can give misleading choices when comparing models with similar fit to the data. one approach to model selection in a more rigorous way that uses the full posterior distributions of the models is to compute the ratio of the normalising constants (or model evidence), known as bayes factors. these normalising constants integrate the posterior distribution over all parameters and balance over- and under-fitting.
however, normalising constants often come in the form of intractable, high-dimensional integrals, therefore special probabilistic techniques need to be applied to correctly estimate the bayes factors. one such method is thermodynamic integration (ti), which can be used to estimate the ratio of two models' evidence by integrating over a continuous path between the two un-normalised densities. in this paper we introduce a variation of the ti method, here referred to as referenced ti, which computes a single model's evidence in an efficient way by using a reference density, such as a multivariate normal, where the normalising constant is known. we show that referenced ti, an asymptotically exact monte carlo method of calculating the normalising constant of a single model, in practice converges to the correct result much faster than other competing approaches such as the method of power posteriors. we illustrate the implementation of the algorithm on informative 1- and 2-dimensional examples, apply it to a popular linear regression problem, and use it to select parameters for a model of the covid-19 epidemic in south korea. mathematical modelling of infectious diseases is widely used to understand the processes underlying pathogen transmission and inform public health policies. with advances in both computing power and availability of data, it is possible to build more complex, robust and accurate models. the increasing importance of epidemiological models requires synthesis with rigorous statistical methods. this synthesis is required to robustly estimate necessary parameters, quantify uncertainty in predictions, and test hypotheses [1]. in practice the decision to choose a model is often based on heuristics, relying on the knowledge and experience of the modeller, rather than through a systematic selection process [2, 3]. a number of model selection methods are available, but those methods often come with a trade-off between accuracy and computational complexity.
for example, the akaike information criterion (aic) and the bayesian information criterion (bic), widely used in epidemiology, are easy to compute but come with certain limitations [1]. specifically, they do not take into account the parameters' uncertainty or the prior probabilities, and might favour excessively complex models. the ratio of two normalising constants -the bayes factor (bf) -is a popular model selection method in the bayesian setting [4]. in general, normalising constants cannot be computed analytically or through common quadrature methods, and more complex probabilistic algorithms need to be employed. thermodynamic integration (ti) [5, 6, 7, 8] provides a useful way to estimate the log ratio of the normalising constants of two densities. instead of marginalising the densities explicitly, which results in high-dimensional integrals, by using ti we only need to evaluate a 1-dimensional integral, where the integrand can easily be sampled with markov chain monte carlo (mcmc). to see how this works, consider a pair of normalising constants z_1 and z_2,

z_i = ∫ q_i(θ) dθ , i = 1, 2 , (1)

where q_i is a density for model m_i with parameters θ, that can be normalised to give the model's bayesian posterior density p_i(θ) = q_i(θ)/z_i. to apply thermodynamic integration we introduce the concept of a path between q_1(θ) and q_2(θ), linking the two densities via a series of intermediate ones. this family of densities is denoted q(λ; θ). an example path in λ is shown in figure 1, for selected values of the coupling parameter. note -often the coupling parameter is denoted β (or t) in reference to a physical thermodynamic integration in inverse temperature (or temperature). in many instances this analogy makes sense, but a more generic procedure is a thermodynamic integration between two systems with distinct hamiltonians coupled by a switching parameter λ, which is closer to the spirit of this work.
the parametric density q(λ; θ), linking q_1 to q_2 and defining the intermediate densities, can be chosen to have an optimal or in some way convenient path, but a common choice is simply the geometric one

q(λ; θ) = q_2(θ)^λ q_1(θ)^(1−λ) , λ ∈ [0, 1] .

the important point to note is that for λ = 0, q(λ; θ) returns the first density q(0; θ) = q_1(θ), for λ = 1 it gives q(1; θ) = q_2(θ), and for in-between λ values a log-linear mixture of the end-point densities (code available at https://github.com/mrc-ide/referenced-ti). just as we have defined a family of densities, there is an associated normalising constant for any point along the path, that for any value of λ is given by

z(λ) = ∫_ω q(λ; θ) dθ .

a further small but important point to avoid complications is to have densities that have common support, e.g. ω(1) = ω(0). hereafter support is denoted ω. having set up the definitions of q(λ; θ) and z(λ), the ti expression can be derived, to compute the log ratio of z_1 = z(0) and z_2 = z(1), while avoiding explicit integrals over the models' parameters θ. by the fundamental theorem of calculus, assuming that the order of the ∂_λ derivative and the dθ integral can be exchanged, and by the elementary rules of differentiating logarithms, we get:

log (z_2/z_1) = log z(1) − log z(0)
= ∫_0^1 (d/dλ) log z(λ) dλ
= ∫_0^1 (1/z(λ)) ∫_ω ∂_λ q(λ; θ) dθ dλ
= ∫_0^1 ∫_ω (q(λ; θ)/z(λ)) ∂_λ log q(λ; θ) dθ dλ
= ∫_0^1 e_{q(λ;θ)} [ ∂_λ log q(λ; θ) ] dλ . (2)

the notation e_{q(λ;θ)} is for an expectation with respect to the sampling distribution q(λ; θ). the final line in the expression summarises the usefulness of ti -instead of having to work with the complicated high-dimensional integrals of equation 1 to find the log-bayes factor log(z_2/z_1), we only need consider a 1-dimensional integral of an expectation, and the expectation can be readily produced by mcmc. here we set out in detail a variation on the ti theme that we have found to be useful in practice. the variation is to work primarily in terms of referenced normalising constants.
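the ti identity above can be checked numerically on a pair of 1-d gaussian densities, where log(z_2/z_1) is known in closed form; here deterministic quadrature stands in for the mcmc expectation, and the grid sizes are arbitrary choices for the sketch.

```python
import math

def ti_log_ratio(log_q1, log_q2, xs, n_lambda=101):
    """Thermodynamic integration along the geometric path:
    log(Z2/Z1) = integral_0^1 E_{q(lambda)}[log q2 - log q1] d(lambda).
    Expectations are taken by quadrature over the grid xs (a stand-in
    for MCMC sampling); the lambda integral uses the trapezoid rule."""
    dx = xs[1] - xs[0]
    f = [log_q2(x) - log_q1(x) for x in xs]     # d/d(lambda) of log q(lambda; x)
    lams = [i / (n_lambda - 1) for i in range(n_lambda)]
    means = []
    for lam in lams:
        w = [math.exp(lam * log_q2(x) + (1 - lam) * log_q1(x)) for x in xs]
        z_lam = sum(w) * dx                     # un-normalised mass at this lambda
        means.append(sum(wi * fi for wi, fi in zip(w, f)) * dx / z_lam)
    h = lams[1] - lams[0]
    return h * (0.5 * means[0] + sum(means[1:-1]) + 0.5 * means[-1])

log_q1 = lambda x: -x * x / 2.0    # Z1 = sqrt(2*pi)
log_q2 = lambda x: -x * x / 8.0    # Z2 = sqrt(8*pi), so log(Z2/Z1) = log 2
xs = [-12.0 + 24.0 * i / 2000 for i in range(2001)]
est = ti_log_ratio(log_q1, log_q2, xs)
```

with these grids the estimate lands close to the exact answer log 2 ≈ 0.693, illustrating how the high-dimensional marginalisation is traded for a smooth 1-d integral in λ.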
the approach of using exactly integrate-able references has provided us with a particularly efficient method of selection between different hierarchical bayesian time-series models, and we hope the approach will be useful to others working on similar problems, for example in phylogenetic model selection where ti is already a popular established method [9, 8]. in the following, we begin with introducing the reference density, then go on to illustrate different practical choices of a reference normalising constant, along with theoretical and practical considerations. next, the mechanics of applying the method are set out for two simple pedagogical examples. performance benchmarks are discussed for a well-known problem in the statistical literature [10], which shows the method performs favourably in terms of accuracy and the number of iterations to convergence. finally the technique is applied to a hierarchical bayesian time-series model describing the covid-19 epidemic in south korea. in the covid-19 infections model we select technical parameters in the reproduction number (r_t) model, such as the autoregression window size and ar(k) lag, as well as epidemiologically meaningful parameters such as the serial interval distribution time for generating infections. an efficient approach to compute bayes factors, or more generally to marginalise a given density for any application, is to introduce a reference:

log z = log z_ref + log (z/z_ref)
= log z_ref + ∫_0^1 e_{q(λ;θ)} [ log( q(θ)/q_ref(θ) ) ] dλ . (3)

to clarify notation, z is the normalising constant of interest with density q, and z_ref is a reference normalising constant with associated density q_ref. the second line replaces the ratio z/z_ref with a thermodynamic integral as per the identity derived in equation 2. the introduction of a reference naturally facilitates the marginalisation of a single density, rather than requiring pairwise model comparisons by direct application of equation 2. this is useful when comparing multiple models, as n < n(n−1)/2 for n ≥ 3.
however, the primary motivation for referencing the ti is the computational efficiency of converging the ti expectation. in equation 3, with a judicious choice of q_ref, the reference normalising constant z_ref can be evaluated analytically and accounts for most of z. in this case log(q(θ)/q_ref(θ)) tends to have a small expectation and variance and converges quickly. the idea of using an exact reference to aid in the solution of computationally intractable problems is a fundamental and perennial one throughout the computational sciences. within normalising constant evaluation methods, our suggested algorithm is closely related to several other techniques [11, 7, 12, 8, 9, 13, 14]. in the generalised stepping stone method a reference is introduced to speed up convergence of the importance sampling at each temperature [13, 14]. in the power posteriors (pp) approach the reference in equation 3 is the prior distribution of q and thus z_ref = 1 [11]. this is elegant as the reference need not be chosen (it is simply the prior); however, the downside of this simplicity is that for poorly chosen or uninformative priors, the thermodynamic integral will be slow to converge and susceptible to instability. in particular, for complex hierarchical models with uninformative priors this can often be an issue. the reference density in equation 3 can be chosen at convenience, but the main desirable features are that it should be easily formed without special consideration or adjustments, similar to the power posteriors method, and that z_ref should be analytically integrable and account for as much of z as possible. such a choice of z_ref ensures the part requiring expensive sampling is small and convergence is fast. an obvious choice in this regard is a laplace-type reference, where the log density is approximated with a second-order one, i.e. a multivariate gaussian.
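the contrast with power posteriors can be made concrete in a toy conjugate-gaussian setting (hypothetical numbers chosen purely for illustration, not from the paper): with a wide n(0, 100) prior, the pp integrand E_λ[log likelihood] spans two orders of magnitude across λ, while the referenced-ti integrand with a mildly mismatched gaussian reference is nearly flat.

```python
import math

def pp_integrand(lam, prior_var=100.0):
    # power posteriors: q(λ;θ) ∝ prior(θ)·likelihood(θ)^λ with
    # prior N(0, prior_var) and likelihood ∝ exp(-θ²/2);
    # E_λ[log likelihood] = -½·E_λ[θ²] in closed form
    prec = 1.0 / prior_var + lam
    return -0.5 / prec

def ref_integrand(lam, prior_var=100.0, ref_scale=1.2):
    # referenced ti: geometric path between q(θ) ∝ exp(-a·θ²/2)
    # (the full prior·likelihood) and a gaussian reference whose
    # variance is deliberately inflated by ref_scale (mismatch)
    a = 1.0 / prior_var + 1.0   # posterior precision
    b = a / ref_scale           # reference precision
    prec = lam * a + (1.0 - lam) * b
    return 0.5 * (b - a) / prec  # E_λ[log q/q_ref]

lams = [i / 10 for i in range(11)]
pp_vals = [pp_integrand(l) for l in lams]
ref_vals = [ref_integrand(l) for l in lams]
pp_spread = max(pp_vals) - min(pp_vals)    # large: integrand blows up at λ=0
ref_spread = max(ref_vals) - min(ref_vals)  # small: integrand nearly flat
```

the pp integrand ranges from about -50 at λ = 0 to about -0.5 at λ = 1, requiring a dense λ grid near 0, whereas the referenced integrand varies by less than 0.02 over the whole path.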
for densities with a single concentration, laplace-type approximations are ubiquitous, and an excellent natural choice for many problems. in the following section we consider three approaches that can be used to formulate a reference normalising constant z_ref from a second-order log density (though more generally other tractable references are possible). in each referenced ti scenario, we note that even if the reference approximation is poor, the estimate of the normalising constant based on equation 3 remains asymptotically exact; only the speed of convergence may be reduced (provided assumptions such as matching support of the end-point densities remain satisfied). the most straightforward way to generate a reference density is to taylor expand log q(θ) to second order about a mode. noting there is no linear term, we see the reference density is q_ref(θ) = q(θ_0) exp(−(θ − θ_0)^T h (θ − θ_0)/2), where h is the hessian matrix of −log q(θ) at the mode and θ_0 is the vector of mode parameters. the associated normalising constant is z_ref = q(θ_0) √(det(2π h^(−1))). the taylor expansion method tends to produce a reference density that works well in the neighbourhood of θ_0, but can be less suitable if the density is asymmetric, has long or short tails, or if the derivatives at the mode are poorly approximated, for example due to cusps or, conversely, very flat curvature at the mode. in many instances a more robust choice of reference can be found by using mcmc samples from the whole posterior density. another elementary but often more robust approach to form a reference density is to draw samples from the true density q(θ) and estimate the mean parameters θ̄ and covariance matrix σ̄, such that the reference normalising constant is z_ref = q(θ̄) √(det(2π σ̄)). this method of generating a reference is simple and reliable. it requires sampling from the posterior q(θ), so is more expensive than derivative methods, but the cost associated with drawing enough samples to generate a sufficiently good reference tends to be quite low.
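the sampled-covariance reference can be sketched in one dimension as follows; for illustration the "posterior" draws are taken directly from a standard gaussian, a hypothetical target whose true normalising constant z = √(2π) is known exactly, so the quality of z_ref can be checked.

```python
import math
import random

def sampled_covariance_reference(q, sampler, n=20000, seed=0):
    # form a gaussian reference from posterior draws (1-d case):
    # z_ref = q(θ̄) · sqrt(2π · σ̂²), with θ̄ and σ̂² the sample
    # mean and variance of the draws
    random.seed(seed)
    draws = [sampler() for _ in range(n)]
    mean = sum(draws) / n
    var = sum((d - mean) ** 2 for d in draws) / (n - 1)
    return q(mean) * math.sqrt(2.0 * math.pi * var)

# hypothetical target: un-normalised standard gaussian, true z = sqrt(2π)
q = lambda t: math.exp(-0.5 * t * t)
z_ref = sampled_covariance_reference(q, lambda: random.gauss(0.0, 1.0))
```

because the target here is exactly gaussian, z_ref lands within monte carlo error of √(2π) ≈ 2.5066; for a non-gaussian target the residual z/z_ref would be recovered by the thermodynamic integral.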
in the primary application discussed later, regarding relatively complex high-dimensional bayesian hierarchical models, we use this approach to generate a reference density and normalising constant. the sampled covariance reference is typically a good approach, but it is not in general optimal: typically another gaussian reference with different parameters can generate a normalising constant closer to the true one, thus leading to faster convergence of the thermodynamic integral to the exact value. such an optimal reference can be identified variationally. the conditions identifying an optimal reference normalising constant can be derived by elementary methods using perturbation theory. consider a taylor expansion of the log normalising constant log z(λ) about λ = 0: the first derivative gives an expectation, as per the derivation in equation 2, and the second derivative is a variance. since the curvature of log z(λ) is therefore non-negative, to first order we see log z(λ) ≥ log z(0) + λ E_{q(0;θ)}[log(q(θ)/q_ref(θ))], and for the specific case of λ = 1 this inequality establishes the bound log z ≥ log z_ref + E_{q_ref}[log(q(θ)/q_ref(θ))], which can be maximised with respect to the position (µ) and scale (s) parameters of a reference density such as a gaussian; the parameters that maximise it provide a reference density that is variationally optimal. we note this is simply an application of the gibbs-feynman-bogoliubov inequality [15, 16, 17], and that finding such an approximation to the true density is a well-studied problem with numerous approaches that can be used to determine q_ref [18, 19]. in itself the existence of a variational bound provides no guarantee of being a good approximation to the true normalising constant, so alone it is not a satisfactory general approach. however, as a point of reference from which to estimate the true normalising constant, it provides an optimal density within the family of trial reference functions considered, thereby improving convergence to the mcmc normalising constant in referenced ti.
having set out three approaches to find a single reference for the ti expression in equation 3, a natural generalisation is the telescopic expansion log(z/z_0) = Σ_{i=0}^{n−1} ∫_0^1 E[log(q_{i+1}(θ)/q_i(θ))] dλ, with q_0 = q_ref and q_n = q, where each term is a thermodynamic integral between neighbouring densities. note that here the analytic reference is denoted z_0 rather than z_ref, to generalise the indexing. in cases where q_0 differs substantially from q, the telescopic generalisation can improve numerical stability. by bridging the endpoints in terms of intermediate density pairs, q_{i+1}(θ)/q_i(θ), we can form a series of lower-variance mcmc simulations with favourable convergence properties. a reasonable choice for generating intermediate densities is for the i-th density q_i to be the 2(i+1)-th order taylor expansion of q(θ). if a model has parameters with limits, e.g. θ_1 ∈ [0, ∞), θ_2 ∈ (−1, ∞), etc., then in referenced ti the exact analytic integration of the reference density should be commensurately limited. however, the calculation of arbitrary probability density function orthants, even for well-known analytic functions such as the multivariate gaussian, is in general a difficult problem. computing high-dimensional orthants usually requires advanced techniques, the use of approximations, or sampling methods [20, 21, 22, 23, 24, 25]. fortunately we can simplify our reference density to create a reference with tractable analytic integration over the limits, by using a diagonal approximation to the sampled covariance or hessian matrix. for example, the orthant of a diagonal multivariate gaussian can be given in terms of the error function [26], leading to the expression z_ref = q(θ_0) √(det(2π σ_diag)) ∏_{i∈k} (1/2) erfc((a_i − θ_{0,i}) / √(2 σ_diag,i)), where k denotes the set of indices of the parameters with lower limits a_i. σ_diag is a diagonal covariance matrix, that is, one containing only the variance of each of the parameters without the covariance terms, and σ_diag,i denotes the i-th element of the diagonal. restricting our density to a diagonal one is a poorer approximation than using the full covariance matrix.
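as a small numerical check of the orthant correction, take a hypothetical 2-dimensional standard gaussian density with the constraint θ_1 ≥ 0 (true z = π): ignoring the limit overestimates z by exactly the orthant factor of ½, which the erfc expression recovers.

```python
import math

def lower_orthant_factor(mu, sigma, limits):
    # P(θ_i ≥ a_i) for independent gaussians, via the error function;
    # entries of `limits` set to None are unconstrained
    fac = 1.0
    for m, s, a in zip(mu, sigma, limits):
        if a is not None:
            fac *= 0.5 * math.erfc((a - m) / (s * math.sqrt(2.0)))
    return fac

# hypothetical density q(θ) = exp(-θ1²/2 - θ2²/2) with θ1 ≥ 0:
# true z = ∫₀^∞ ∫_{-∞}^{∞} q dθ2 dθ1 = π
mode_val = 1.0                            # q at the mode (0, 0)
z_unconstrained = mode_val * 2.0 * math.pi  # ignores the limit: 2π (too big)
z_corrected = z_unconstrained * lower_orthant_factor(
    [0.0, 0.0], [1.0, 1.0], [0.0, None])    # multiply by P(θ1 ≥ 0) = ½
```

without the correction the evidence is overestimated by a factor of two, mirroring the overestimate reported for the 2d pedagogical example below.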
in practice, however, this has not been particularly detrimental to the convergence of the thermodynamic integral, and again we note that the quality of the reference only affects convergence rather than the eventual theoretical monte carlo accuracy of the normalising constant. this behaviour is observed in the practical examples considered later, though the distinction between accuracy and convergence, and matters of asymptotic consistency when using an mcmc estimator with finite iterations, are naturally less clear cut. the algorithm described in this section was implemented in the python and stan programming languages. using stan enables fast mcmc simulations, using the hamiltonian monte carlo (hmc) and no-u-turn sampler (nuts) algorithms [27, 28], and portability between other statistical languages, such as r or julia. additionally it is familiar to many epidemiologists using bayesian statistics [29]. the code for all examples shown in this paper is available at https://github.com/mrc-ide/referenced-ti. in the examples shown in section 3, we used 4 chains with 20,000 iterations per chain for the pedagogical examples, and 4 chains with 2,000 iterations for the other applications. in all cases, half of the iterations were used for the burn-in. mixing of the mcmc chains and sampling convergence were checked in each case, by ensuring that the r̂ value was ≤ 1.05 for each parameter in every model. in all examples in the remaining part of this paper, the integral given in equation 2 was discretised to allow computer simulations. each expectation E_{q(λ;θ)}[log(q_1(θ)/q_0(θ))] was evaluated at λ = 0.0, 0.1, 0.2, ..., 0.9, 1.0, unless stated otherwise. to obtain the value of the integral in equation 2, we interpolated a curve linking the expectations using a cubic spline, which was then integrated numerically. the pseudo-code of the algorithm with the sampled covariance laplace reference is shown in algorithm 1.
algorithm 1. input: q, the un-normalised density; q_ref, the un-normalised reference density; λ, the set of coupling parameters; n, the number of mcmc iterations. output: z, the normalising constant of the density q. compute the mean and covariance from posterior samples to form q_ref and z_ref; for each λ, sample from q(λ; θ) and record the expectation e_λ = E_{q(λ;θ)}[log(q(θ)/q_ref(θ))]; interpolate between the consecutive e_λ values to obtain a curve ∂_λ log z(λ); integrate ∂_λ log z(λ) over λ ∈ [0, 1] to get log(z/z_ref). to illustrate the technique consider a 1-dimensional density q(θ) with a cusp, with normalising constant z = ∫_{−∞}^{∞} q(θ) dθ. [figure 1: a) the density and its gaussian reference; b) expectation E_{q(λ;θ)}[log(q(θ)/q_ref(θ))] vs mcmc iteration, shown at each value of λ sampled; c) λ dependence of the ti contribution to the log-evidence; d) convergence of the evidence z, with 1% convergence after 500 iterations and 0.1% after 17,000 iterations per λ.] a cusp is one of the more awkward pathologies of some naturally occurring densities [30, 31], and this density does not have an analytical integral that easily generalises to multiple dimensions, but it is otherwise a made-up 1-dimensional example that could be interchanged with another. in this instance the laplace approximation based on the second-order taylor expansion at the mode will fail due to the cusp, so we can use the more robust covariance sampling method. sampling from the 1d density q(θ) we find a variance of σ̂² = 0.424, giving the gaussian reference density q_ref shown in figure 1. in this simple example, the integral can be easily evaluated to high accuracy using quadrature [32, 33], giving a value of 1.523. referenced thermodynamic integration reproduces this value, with the convergence of z shown in figure 1, converging to within 1% of z after 500 iterations and within 0.1% after 17,000 iterations. this example illustrates notable characteristic features of referenced thermodynamic integration. the reference q_ref(θ) is a good approximation to q(θ), with z_ref accounting for most (102%) of z. consequently z/z_ref is close to 1, and the expectations E_{q(λ;θ)}[log(q(θ)/q_ref(θ))] evaluated by mcmc are small.
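the paper's exact 1-dimensional cusp density is not reproduced in the extracted text, so as a stand-in sketch take the laplace-like cusp density q(θ) = exp(−|θ|), whose true z = 2 is known, with a gaussian reference of variance 2; replacing the mcmc expectations with deterministic quadrature isolates the referenced-ti mechanics of algorithm 1.

```python
import math

# hypothetical cusp density (the paper's exact example is not reproduced here)
q = lambda t: math.exp(-abs(t))           # true z = 2
q_ref = lambda t: math.exp(-t * t / 4.0)  # gaussian reference, variance 2
z_ref = math.sqrt(4.0 * math.pi)          # q(0)·sqrt(2π·2), exactly

def expectation(lam, grid):
    # E_{q(λ;θ)}[log q/q_ref] with q(λ;θ) ∝ q^λ · q_ref^(1-λ),
    # computed by quadrature instead of mcmc (weights cancel the grid step)
    w = [q(t) ** lam * q_ref(t) ** (1.0 - lam) for t in grid]
    f = [-abs(t) + t * t / 4.0 for t in grid]  # log(q/q_ref) in closed form
    return sum(wi * fi for wi, fi in zip(w, f)) / sum(w)

grid = [-20.0 + 0.01 * i for i in range(4001)]
lams = [i / 10 for i in range(11)]
vals = [expectation(l, grid) for l in lams]
# trapezoid over λ, then exponentiate: z = z_ref · exp(∫ E dλ)
log_z_over_zref = sum(0.5 * (vals[i] + vals[i + 1]) * 0.1 for i in range(10))
z_est = z_ref * math.exp(log_z_over_zref)
```

the integrand varies only between about -0.63 and -0.50 across λ, so the coarse 11-point grid already recovers z = 2 to high accuracy, mirroring the weak λ dependence noted in the text.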
for the same reasons the variance at each λ is small, leading to favourable convergence within a small number of iterations. finally, E_{q(λ;θ)}[log(q(θ)/q_ref(θ))] depends only weakly on λ, so there is no need to use a very fine grid of λ values or to consider optimal paths: satisfactory convergence is easily achieved using a simple geometric path with 4 λ intervals. 2d pedagogical example with constrained parameters. as a second example, consider a 2-dimensional un-normalised density with a constrained parameter space, where θ_1 ∈ [0, +∞) and θ_2 ∈ (−∞, +∞). a reference density q_ref(θ) can be constructed from the hessian at the mode of q(θ). notice that because the parameter θ_1 is constrained to be ≥ 0, integrating the gaussian approximation q_ref(θ) using the formula given in equation 5 will give an overestimate. to account for this we use the reference density q_ref^diag(θ), based on a diagonal hessian, which has an exact and easy-to-calculate orthant. all densities are shown in figure 2. to obtain the log-evidence of the model, we calculated the exact value numerically [34, 32], using the full covariance laplace method as per equation 6, and using the diagonal covariance with a correction added to take into account the lower bound of the parameter θ_1, as per equation 8. the gaussian reference densities were then used to carry out referenced thermodynamic integration. results of all methods are given in table 1. as expected, without applying the correction the value of the evidence is overestimated. to benchmark the application of referenced ti in the model selection task, consider fitting non-nested linear regressions to the radiata pine data [10]. this example is widely used for testing normalising constant calculation methods, due to the fact that the exact value of the model evidence can be computed. the data consist of 42 3-dimensional datapoints, expressed by y_i (maximum compression strength), x_i (density) and z_i (density adjusted for resin content).
in this example, we follow the approach of friel and wyse [12], and test which of the two models m_1 and m_2 provides better predictions for the compression strength: in other words, we want to know whether density or density adjusted for resin content predicts the compression strength better. the priors for the models were selected in a way which enables obtaining an exact solution and can be found in friel and wyse [12]. five methods of estimating the model evidence were used in this example: laplace approximation using a sampled covariance matrix, model switch ti along a path directly connecting the models [8, 35], referenced ti, power posteriors with 11 equidistant λ-placements (labelled here as pp_11), and power posteriors with 100 λ values (pp_100) as in [12]. for the model switch ti, referenced ti and pp_11 we used λ ∈ {0.0, 0.1, ..., 1.0}. the expectations from mcmc sampling at each λ for model switch ti, referenced ti, pp_11 and pp_100, and the fitted cubic splines between the expectations, are shown in figure 3. immediately we notice that both ti methods eliminate the problem of divergence of the expectation at λ = 0, which is observed with the power posteriors, where samples for λ = 0 come from the prior distribution. the pp_11 method failed to estimate the log-evidence correctly. the 1-dimensional curves estimated by fitting a cubic spline to the expectations were integrated for each of the models to obtain the log-evidence for m_1 and m_2, and the log ratio of the two models' evidences for the model switch ti. the rolling mean of the integral over 1500 iterations for referenced ti and pp_100 is shown in figure 4 a-b. we can see from the plots that referenced ti presents excellent convergence to the exact value, whereas pp_100 oscillates around it. in the same figure 4, plots c-d show the distribution of the log-evidence for each model generated by 15 runs of the three algorithms: laplace approximation with a sampled covariance matrix, referenced ti and pp_100.
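the radiata pine benchmark relies on the friel and wyse priors, which make the evidence exact; those priors are not reproduced here, but the flavour of the model comparison can be sketched with a simplified conjugate model, y ~ n(βx, σ²i) with β ~ n(0, τ²), whose log-evidence is closed-form via the matrix determinant lemma and the sherman-morrison identity (synthetic data and hypothetical hyperparameters, not the paper's setup).

```python
import math
import random

def log_evidence_1feature(y, x, sigma2=1.0, tau2=10.0):
    # marginal likelihood of y ~ N(βx, σ²I) with β ~ N(0, τ²):
    # y ~ N(0, σ²I + τ²xxᵀ); det and quadratic form via the
    # matrix determinant lemma and sherman-morrison (no n×n algebra)
    n = len(y)
    xx = sum(v * v for v in x)
    xy = sum(a * b for a, b in zip(x, y))
    yy = sum(v * v for v in y)
    logdet = n * math.log(sigma2) + math.log(1.0 + tau2 * xx / sigma2)
    quad = (yy - tau2 * xy * xy / (sigma2 + tau2 * xx)) / sigma2
    return -0.5 * (n * math.log(2.0 * math.pi) + logdet + quad)

# synthetic analogue of the two competing covariates: y is generated
# from x, while z is an unrelated covariate
random.seed(0)
x = [random.gauss(0, 1) for _ in range(42)]
z = [random.gauss(0, 1) for _ in range(42)]
y = [2.0 * xi + random.gauss(0, 1) for xi in x]
ev_x = log_evidence_1feature(y, x)  # evidence for the "m1-like" model
ev_z = log_evidence_1feature(y, z)  # evidence for the "m2-like" model
```

with the data generated from x, the x-based model receives decisively higher log-evidence, which is the kind of exact answer the mcmc-based methods in this section are benchmarked against.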
although all three methods resulted in a log-evidence satisfactorily close to the exact solution, referenced ti was the most accurate and, importantly, converged fastest. the bfs calculated to assess whether model m_2 fits the data better than model m_1, and the number of iterations needed to achieve a standard error of 0.5% (excluding the iterations needed for the mcmc burn-in), are presented in table 2. [table 2: both ti and referenced ti methods used 11 equidistant λ values; the power posteriors method was used with 11 (pp_11) and 100 (pp_100) λ values. the third column shows the total number of mcmc steps required to achieve a standard error of 0.5%, excluding the burn-in steps. * - using a sampled covariance matrix.] notably, both ti methods gave a bf very close to the exact value. referenced ti performed the best out of the tested methods: it converged faster than all other methods, requiring only 308 mcmc draws compared to 55,000 draws needed for the power posterior method or over 2,000 for the model switch ti. referenced ti also showed excellent accuracy both in estimating the individual models' evidence and the bf. the final example of using referenced ti for calculating model evidence is fitting a renewal model to covid-19 case data from south korea. the data were obtained from https://opendata.ecdc.europa.eu/covid19/casedistribution/csv (accessed 19-07-2020) and contained time-series data of confirmed cases from 31-12-2019 to 18-07-2020. the model is based on the bellman-harris branching process whose expectation is the renewal equation. its derivation and details are explained in mishra et al. [36], and a short explanation of the model is provided in the appendix. briefly, the model is fitted to the time-series case data and estimates a number of parameters, including the serial interval and the effective reproduction number, r_t. the number of cases for each day is modelled by a negative binomial distribution, with shape and overdispersion parameters estimated by a renewal equation. three modifications of the original model were tested: • gi = k, k = 5, 6, 6.5, 7, 8, 9, 10, 20, fixing the mean of the rayleigh-distributed generation interval (which we assume to be the same as the serial interval); • ar(k), k = 2, 3, 4, an autoregressive model with a k-day lag; • w = k, k = 1, 2, 3, 4, 7, changing the length k of the sliding window w. within each group of models, gi, ar and w, we want to select the best model through the highest-evidence method. for example, we want to check whether gi = 6 fits the data better than gi = 10, etc. the dimension of each model depended on the modifications applied, but in all cases the normalising constant was a 40- to 200-dimensional integral. the log-evidence of each model was calculated using the laplace approximation with a sampled covariance matrix, and a correction to the estimate was then obtained using the referenced ti method. values of the log-evidence for each model calculated by both the laplace and referenced ti methods are given in table 3. interestingly, the favoured model in each group, i.e. the model with the highest log-evidence, was different when the evidence was evaluated using the laplace approximation than when it was evaluated with referenced ti. for example, using the laplace method, a sliding window of length 7 was incorrectly identified as the best model, whereas with referenced ti a window of length 2 was chosen as the best among the tested sliding-window models, which agrees with previous studies of window-length selection in h1n1 influenza and sars outbreaks [37]. this exposes how essential it is to accurately determine the evidence: even good approximations can result in misleading conclusions. log-bayes factors for all model pairs within each of the three groups are shown in figure 7 in the appendix.
the importance of performing model selection in a rigorous way is clear from figure 5, where the posterior densities of the parameters φ and σ and the generated r_t time-series are plotted for the models favoured by the laplace and referenced ti methods (the meaning of the parameters is given in the appendix). the differences in the densities and time-series show the pitfalls of selecting an incorrect model. for example, the parameter σ was overestimated by the models selected by the laplace approximation in comparison to those selected by referenced ti. [table 3: log-evidence estimated by the laplace approximation, the added referenced ti correction, and the total log-evidence from referenced ti, with 95% credible intervals given in brackets. in each section, the model with the highest log-evidence estimated by the laplace or referenced ti method is indicated in bold.] the differences between the two favoured models were most extreme for the gi = 8 and gi = 20 models. while gi = 8 is plausible, even likely, for covid-19, gi = 20 is implausible given observed data [38]. this is further supported by observing that for gi = 20, favoured by the laplace method, r_t reached a value of over 125 in the first peak, around 100 more than for gi = 8. the second peak was also largely overestimated, with r_t reaching a value of 75. particularly interesting is the fact that all models present a similar fit to the confirmed covid-19 case data, as shown in figure 8 in the appendix. that makes it impossible to select the best model just by visually comparing the model fits, or by using model selection methods that do not take the full posterior distributions into account. it shows that although the models might fit the data well, other generated quantities, which are often of interest to the modeller, might be completely incorrect. moreover, it emphasises the need to test multiple models before any conclusion or inference is undertaken, especially with complex, hierarchical models.
in epidemiology this is particularly important, as modellers can be tempted to pick arbitrary parameters for their model as long as the predictions fit the data [37]. although the fit might be accurate, other inferred parameters or the uncertainty related to the predictions might be completely inappropriate for making any meaningful predictions. in the radiata pine example, the contribution of ti to the marginal likelihood estimated by the laplace approximation was not substantial, and using the laplace approximation would suffice to make an informed model choice. in this example, we see that the ti contribution is relatively large and changes the decision we would make if we calculated the model evidence using only the laplace method. moreover, from table 3 we see that the evidence was highest for the "boundary" models when the laplace approximation was applied. for example, for the fixed sliding-window-length models, the log-evidence increased monotonically with the value of w when the gaussian approximation was applied. a monotonically increasing log-evidence makes it less clear what the best value of w is: in general, in that case more values should be tested until the preferred model is no longer on the boundary. with referenced ti, that issue no longer exists, as the preferred model was not the boundary one and the log-evidence change was concave. the examples shown in section 3 illustrate the applicability of the referenced ti algorithm for calculating model evidence and for the model selection process. in the radiata pine example, referenced ti performed better than the other tested methods, both in terms of accuracy and speed. the power posteriors method required a much denser placement of the coupling parameters around λ = 0, where the values are sampled purely from the prior distribution.
in the case of referenced ti, at λ = 0 values are sampled from the reference density, which should be close to the original density (in the sense of the kullback-leibler divergence); this results not only in a more accurate estimate of the normalising constant, but also much faster convergence of the mcmc samples. it is worth noting that referenced ti performed better even than the model switch ti method. the speed of convergence of the referenced ti algorithm needs eventually to be theoretically characterised, but the practical tests shown in this work suggest faster convergence than other approaches. this is a very important property for costly models where mcmc iterations are computationally demanding. in the bellman-harris renewal model for the covid-19 epidemic in south korea, we showed that for complex models, model selection through the laplace approximation of the normalising constant can give misleading results. using referenced ti, we calculated the model evidence for 16 models, which enabled a quick comparison between chosen pairs of competing models. importantly, the evidence given by referenced ti was not monotonic with the increase of one of the parameters, which was the case for the laplace approximation. referenced ti proves useful in situations where the posterior distribution is uni-modal but non-gaussian. in the case of gaussian posteriors, the bfs calculated using the laplace approximation of the model evidence provide almost immediate and sufficient information on which model fits the data better, although the error introduced by this approximation grows quickly in higher dimensions [39]. therefore, in the case of multi-dimensional datasets it is not always clear whether the posterior distribution is of a gaussian shape, and it is thus still worth testing the outcome with a more accurate method. in epidemiology, as well as other fields, the best model is often picked using simple methods such as the aic or bic [1].
their main advantage is that they can be computed with virtually no additional computational cost and are often provided by off-the-shelf statistical software. it is important to note that neither the aic nor the bic incorporates the uncertainty of the parameters and the model's predictions [4]. as a result, the aic often favours models with a larger number of parameters, and although the bic penalises more complex models, it might give misleading answers when multiple models with similar scores are compared [4]. additionally, neither of these methods takes into account the prior probabilities of the parameters [9], nor do they force the researcher to consider prior predictive checks on model appropriateness [40]. fully bayesian model selection methods which compare the models' evidences are often more appropriate, as they take into account the full posterior densities, instead of just maximum a posteriori probability estimates [41]. using model evidence and bfs as a method of model selection has been disputed in the scientific literature. bfs have been shown to be sensitive to the choice of priors of the model, which some consider to be a disadvantage, although we have shown in the covid-19 model example that this sometimes proves useful when the competing models generate outputs that cannot be compared to the empirical data. it is also possible that all competing models make similar predictions under sensible priors, but their marginal likelihoods might differ substantially once integrated over the parameter space [42]. due to this sensitivity, it is recommended to evaluate the bfs over a range of possible prior choices, which may become computationally expensive [4]. in general, however, if the sample size is sufficiently large, the effect of the priors is small [42, 4]. it is also worth noting that bfs indicate which of two models is relatively better, and do not give an absolute score of an individual model's accuracy.
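for reference, the aic and bic scores mentioned above are simple closed-form penalties on the maximised log-likelihood, which is what makes them essentially free to compute compared with the evidence:

```python
import math

def aic(k, log_lik):
    # akaike information criterion: 2k - 2·log-likelihood
    # (k = number of free parameters; lower is better)
    return 2 * k - 2 * log_lik

def bic(k, n, log_lik):
    # bayesian information criterion: k·ln(n) - 2·log-likelihood
    # (n = number of observations; penalises complexity more for large n)
    return k * math.log(n) - 2 * log_lik
```

neither score involves an integral over the parameter space, which is precisely why they cannot reflect posterior uncertainty in the way the model evidence does.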
for a comprehensive discussion of the applicability of bfs we refer the reader to [42, 43]. although referenced thermodynamic integration and other methods using path-sampling are asymptotically exact monte carlo estimators in theory, in practice a number of considerations affect accuracy. for example, biases will be introduced into the referenced ti estimate in practice if one endpoint density substantially differs from the other. this is because the volume of parameter space that must be explored to produce an unbiased estimate of the expectation cannot be sampled, within a practical number of iterations, when proposals are generated based on the reference density. generally, the larger the mismatch in endpoint densities, the larger the bias within finite iterations. the point is shown for a simple 1d example in figure 6. similarly, the larger the mismatch, the higher the variance and the slower the expectation is to converge. this illustrates the advantage of using a reference that matches the posterior as closely as possible, as opposed to a typically wide reference like the prior distribution, which gives the characteristic divergence at λ = 0 with power posteriors. measures of density similarity and the scaling of reference performance with dimension should be considered in future work. furthermore, discretisation of the coupling parameter path in λ introduces a discretisation bias. for the power posteriors method, friel et al. (2017) propose an iterative way of selecting the λ-placements which reduces the discretisation error [7]. calderhead and girolami (2009) test multiple λ-placements for 2- and 20-dimensional regression models, and report the relative bias for each tested scenario [44]. in the referenced ti algorithm the discretisation bias is however negligible: the use of the reference density results in ti expectations that are small and have low variance, and therefore converge quickly.
because of that, both the expected distance between the two densities estimated by the mcmc draws and the curvature in λ of the per-λ expectations are small. in our framework we use geometric paths with equidistant coupling parameters λ between the un-normalised posterior densities, but there are other possible choices of path construction, e.g. a harmonic [6] or hypergeometric path [35]. further optimisation might be worth exploring with alternative choices of λ placement, to correctly estimate the evidence with fewer mcmc draws. normalising constants are fundamental in bayesian statistics and allow the best model to be selected for given data. in this paper we describe referenced thermodynamic integration, which allows efficient calculation of the model evidence by sampling from geometric paths between the un-normalised density of interest and a reference, here a multivariate normal reference. in the examples presented the method converges to an accurate solution faster than other approaches such as power posteriors or model switch thermodynamic integration. the referenced ti approach was applied to several examples, in which normalising constants involving 1- to 200-dimensional integrals were calculated. the referenced ti method can be applied to model selection in the epidemiology of infectious diseases, as it allows bayes factors to be calculated efficiently for competing models, even for high-dimensional and complex models. the ar(3) and ar(4) models use an autoregressive process with a longer lag (3 and 4 days respectively). finally, the models w = k, k = 1, ..., 7 were similar to the ar(2) model, but the underlying assumption of these models is that r_t stays constant for the duration of the length of the sliding window w = k. in the original model developed in [36], imported infections are generated with an exogenous component; however, the simplified version used in this paper neglected those and focused only on the endogenous component.
ar (2) ar (3) ar (4) ar (2) ar (3) ar (4) (c) figure 7 : logarithms of bayes factors for the analysed covid-19 renewal models, evaluated using the normalising constants ratios obtained by referenced ti. in each cell, the colour indicates the value of the bf 1,2 for models m 1 (row) and m 2 (column). higher values, that is a brighter orange colour, suggest that m 1 is strongly better than m 2 , and values below 0 in blue palette indicate that m 1 is worse than m 2 . gi = 8 performed best out of fixed gi models, w = 2 best out of sliding window models, and ar(3) performed better than ar(2) and ar (4) . for the interpretation of the bf values see [4] . mathematical models of infectious disease transmission parameter inference and model selection in deterministic and stochastic dynamical models via approximate bayesian computation: modeling a wildlife epidemic model selection and parameter estimation for dynamic epidemic models via iterated filtering: application to rotavirus in germany statistical mechanics of fluid mixtures simulating normalizing constants: from importance sampling to bridge sampling to path sampling improving power posterior estimation of statistical evidence computing bayes factors using thermodynamic integration improving marginal likelihood estimation for bayesian phylogenetic model selection regression analysis marginal likelihood estimation via power posteriors estimating the evidence: a review choosing among partition models in bayesian phylogenetics genealogical working distributions for bayesian model testing with phylogenetic uncertainty on model dynamical systems in statistical mechanics variational principle of bogoliubov and generalized mean fields in manyparticle interacting systems the application of the gibbs-bogoliubov-feynman inequality in mean field calculations for markov random fields a view of the em algorithm that justifies incremental, sparse, and other variants an introduction to variational methods for graphical models 
- computation of gaussian orthant probabilities in high dimension
- estimating orthant probabilities of high-dimensional gaussian vectors with an application to set estimation
- orthant probabilities
- the evaluation of general non-centred orthant probabilities
- the numerical evaluation of certain multivariate normal integrals
- an asymptotic expansion for the multivariate
- a generalized error function in n dimensions
- the no-u-turn sampler: adaptively setting path lengths in hamiltonian monte carlo
- stan: a probabilistic programming language
- stan for epidemiology
- star distribution around a massive black hole in a globular cluster
- on the eigenfunctions of many-particle systems in quantum mechanics
- quadpack: a subroutine package for automatic integration
- scipy: open source scientific tools for python: scipy.integrate
- scipy 1.0: fundamental algorithms for scientific computing in python
- thermodynamic bayesian model comparison
- on the derivation of the renewal equation from an age-dependent branching process: an epidemic modelling perspective
- using information theory to optimise epidemic models for real-time prediction and estimation
- epidemiology and transmission of covid-19 in 391 cases and 1286 of their close contacts in shenzhen, china: a retrospective cohort study
- on the error in the laplace approximations of high-dimensional integrals
- can the strengths of aic and bic be shared? a conflict between model identification and regression estimation
- bayesian model evidence as a practical alternative to deviance information criterion
- a survey of bayesian predictive methods for model assessment, selection and comparison
- marginal likelihood computation for model selection and hypothesis testing: an extensive review
- estimating bayes factors via thermodynamic integration and population mcmc
a.0.1 covid-19 model
the covid-19 model shown is based on the renewal equation derived from the bellman-harris process. the details of the model and its derivation are provided in mishra et al. [36].
here, we give a short overview of the ar(2) model. the model has a bayesian hierarchical structure and is fitted to time-series data containing the number of new confirmed covid-19 cases per day in south korea, obtained from https://opendata.ecdc.europa.eu/covid19/casedistribution/csv. new infections y(t) are modelled by a negative binomial distribution whose mean takes the form of a renewal equation. the number of confirmed cases is modelled as y(t) ∼ negbinom(f(t), φ), where φ is an overdispersion (variance) parameter and f(t), the mean of the negative binomial distribution, represents the daily case data through the renewal equation. as the case data are not continuous but reported per day, f(t) can be represented in a discretised, binned form as f(t) = r_t Σ_{τ=1}^{t} y(t − τ) g(τ). here, g(τ) is a rayleigh-distributed serial interval with mean gi, discretised into daily bins. r_t, the effective reproduction number, is parametrised as r_t = exp(ε_t), with the exponent ensuring positivity. ε_t is an autoregressive process with a two-day lag, that is ar(2), with ε_1 ∼ n(−1, 0.1), ε_2 ∼ n(−1, σ) and ε_t ∼ n(ρ_1 ε_{t−1} + ρ_2 ε_{t−2}, σ) for t = {3, 4, 5, ...}. the model's priors are given in mishra et al. [36]. modifications were applied to this basic model to obtain the different variants of the model as described in section 3. the first group of models analysed was the ar(2) model described above, but with the gi parameter fixed to a certain value instead of being inferred from the data. the ar(3) and ar(4) models had additional parameters ρ_3 and ρ_4, which allow longer lags to be modelled.
key: cord-241057-cq20z1jt authors: han, jungmin; cresswell-clay, evan c; periwal, vipul title: statistical physics of epidemic on network predictions for sars-cov-2 parameters date: 2020-07-06 journal: nan doi: nan sha: doc_id: 241057 cord_uid: cq20z1jt the sars-cov-2 pandemic has necessitated mitigation efforts around the world.
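a minimal forward simulation of the ar(2) renewal model described above can illustrate its moving parts; the parameter values (gi mean, ρ_1, ρ_2, σ, φ), the initial case count, and the 20-day truncation of the serial interval are illustrative assumptions, not the fitted posterior values.

```python
import numpy as np

def simulate_renewal_ar2(T=60, gi_mean=6.0, rho1=0.7, rho2=0.2,
                         sigma=0.05, phi=10.0, seed=1):
    """sketch of the ar(2) renewal model: daily cases are negative-binomial
    with mean f(t) = R_t * sum_tau g(tau) * y(t - tau)."""
    rng = np.random.default_rng(seed)
    # rayleigh serial interval with mean gi_mean, discretised into daily bins
    scale = gi_mean / np.sqrt(np.pi / 2)
    tau = np.arange(1, 21)
    g = (tau / scale**2) * np.exp(-tau**2 / (2 * scale**2))
    g /= g.sum()
    y = np.zeros(T)
    y[0] = 10.0                         # illustrative initial case count
    # ar(2) latent process: eps_1, eps_2 ~ N(-1, .), then the recursion
    eps = np.full(T, -1.0)
    for t in range(2, T):
        eps[t] = rng.normal(rho1 * eps[t - 1] + rho2 * eps[t - 2], sigma)
    for t in range(1, T):
        k = min(t, len(g))
        f = np.exp(eps[t]) * np.dot(g[:k], y[t - 1::-1][:k])  # renewal mean
        # negative binomial with mean f and overdispersion phi
        p = phi / (phi + max(f, 1e-9))
        y[t] = rng.negative_binomial(phi, p)
    return y
```

with r_t = exp(ε_t) ≈ exp(−1) < 1 under these illustrative settings, the simulated epidemic decays, as expected from the renewal equation.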
we use only reported deaths in the two weeks after the first death to determine infection parameters, in order to make predictions of hidden variables such as the time dependence of the number of infections. early deaths are sporadic and discrete events, so the use of network models of epidemic spread is imperative, with the network itself a crucial random variable. location-specific population age distributions and population densities must be taken into account when attempting to fit these events with parametrized models. these characteristics render naive bayesian model comparison impractical, as the networks have to be large enough to avoid finite-size effects. we reformulated this problem as the statistical physics of independent location-specific 'balls' attached by elastic springs to every model in a six-dimensional lattice of 56448 parametrized models, with model-specific 'spring constants' determined by the stochasticity of the network epidemic simulations for that model. the distribution of balls then determines all bayes posterior expectations. important characteristics of the contagion are determinable: the fraction of infected patients that die (0.017 ± 0.009), the expected period an infected person is contagious (22 ± 6 days) and the expected time between the first infection and the first death (25 ± 8 days) in the us. the rate of exponential increase in the number of infected individuals is 0.18 ± 0.03 per day, corresponding to 65 million infected individuals in one hundred days from a single initial infection; this fell to 166000 with even imperfect social distancing effectuated two weeks after the first recorded death. the fraction of compliant socially-distancing individuals matters less than their fraction of social-contact reduction for altering the cumulative number of infections. the pandemic caused by the sars-cov-2 virus has swept across the globe with remarkable rapidity.
the parameters of the infection produced by the virus, such as the infection rate from person-to-person contact, the mortality rate upon infection and the duration of the infectivity period, are still controversial. parameters such as the duration of infectivity, and predictions such as the number of undiagnosed infections, could be useful for shaping public health responses, as the predictive aspects of model simulations are possible guides to pandemic mitigation [7, 10, 20]. in particular, the possible importance of superspreaders should be understood [24-27]. the authors of [5] had the insight that the early deaths in this pandemic could be used to find some characteristics of the contagion that are not directly observable, such as the number of infected individuals. this number is, of course, crucial for public health measures. the problem is that standard epidemic models with differential equations are unable to determine such hidden variables, as explained clearly in [6]. the early deaths are sporadic and discrete events. these characteristics imply that simulating the epidemic must be done in the context of network models with discrete dynamics for infection spread and death. the first problem that one must contend with is that even rough estimates of the high infection transmission rate and of a death rate with strong age dependence imply that one must use large networks for simulations, on the order of 10^5 nodes, because one must avoid finite-size effects in order to accurately fit the early stochastic events. the second problem that arises is that the contact networks are obviously unknown, so one must treat the network itself as a stochastic random variable, multiplying the computational time by the number of distinct networks that must be simulated for every parameter combination considered.
the third problem is that there are several characteristics of sars-cov-2 infections that must be incorporated in any credible analysis, and the credibility of the analysis requires an unbiased sample of parameter sets. these characteristics are the strong age dependence of mortality of sars-cov-2 infections and a possible dependence on population density, which should determine network connectivity in an unknown manner. thus the network nodes have to have location-specific population age distributions incorporated as node characteristics, and the network connectivity itself must be a free parameter. an important point in interpreting epidemics on networks is that the simplistic notion that there is a single rate at which an infection is propagated by contact is indefensible. in particular, for the sars-cov-2 virus, there are reports of infection propagation through a variety of mucosal interfaces, including the eyes. thus, while an infection rate must be included as a parameter in such simulations, there is a range of infection rates that we should consider. indeed, one cannot make sense of network connectivity without taking into account the modes of contact, for instance whether an individual is infected during the course of travel on a public transit system or while working in the emergency room of a hospital. one expects that network connectivity should be inversely correlated with infectivity in models that fit mortality data equally well, but this needs to be demonstrated with data to be credible, not imposed by fiat. the effective network infectivity, which we define as the product of network connectivity and infection rate, is the parameter that needs to be reduced, either by social distancing measures such as stay-at-home orders or by lowering the infection rate with mask wearing and hand washing. a standard bayesian analysis with these features is computationally intractable.
we therefore adopted a statistical physics approach to the bayesian analysis. we imagined a six-dimensional lattice of models with balls attached to each model by springs. each ball represents a location for which data is available, and each parameter set determines a lattice point. the balls are, obviously, all independent, but they face competing attractions to each lattice point. the spring constants for each model are determined by the variation we find in stochastic simulations of that specific model. one of the dimensions in the lattice of models corresponds to a median age parameter in the model. each location ball is attracted to the point in the median-age parameter dimension that best matches that location's median age, and we only have to check that the posterior expectation of the median age parameter for that location's ball is close to the location's actual median age. thus we can decouple the models and the data simulations without having to simulate each model with the characteristics of each location, making the bayesian model comparison amenable to computation. finally, the distribution of location balls over the lattice determines the posterior expectation values of each parameter. we matched the outcomes of our simulations with data on the two-week cumulative death counts after the first death, using bayes' theorem to obtain parameter estimates for the infection dynamics. we used the bayesian model comparison to determine posterior expectation values of parameters for three distinct datasets. finally, we simulated the effects of various partially effective social-distancing measures on random networks, with parameter sets given by the posterior expectation values of our bayes model comparison. we used data for the sars-cov-2 pandemic as compiled by [28] from the original sources. we generated random g(n, p = 2l/(n − 1)) networks of n = 90000 or 100000 nodes, with an average of l links per node, using the python package networkx [36].
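the balls-and-springs bookkeeping above can be sketched as a weighted average over a model lattice: each model contributes a gaussian log-likelihood whose variance comes from that model's own stochastic simulations (the "spring constant"), and the normalised weights under a uniform model prior give posterior expectations. the gaussian form and the array shapes are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def posterior_expectations(obs, sim_mean, sim_var, param_grid):
    """one 'ball' (a location's two-week daily death counts) attached to every
    model on the lattice.

    obs        : (d,)    observed death counts for one location
    sim_mean   : (m, d)  mean simulated counts for each of m models
    sim_var    : (m, d)  model-specific simulation variance (spring constant)
    param_grid : (m, k)  parameter values of each lattice model
    returns the posterior expectation of each of the k parameters."""
    # gaussian log-likelihood of the observation under each model
    ll = -0.5 * np.sum((obs - sim_mean)**2 / sim_var
                       + np.log(2 * np.pi * sim_var), axis=1)
    w = np.exp(ll - ll.max())      # stabilised weights
    w /= w.sum()                   # uniform model prior -> normalised posterior
    return w @ param_grid          # posterior-weighted parameter average
```

when the observation matches one model's simulated mean closely, the weights concentrate on that lattice point and the expectation returns that model's parameters.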
scalepopdens ≡ l is one of the parameters that we varied. we compared the posterior expectation for this parameter for a location with the actual population density, in an attempt to predict the appropriate way to incorporate measurable population densities in epidemic-on-network models [37, 38]. we used the python epidemics on networks package [39, 40] to simulate networks with specific parameter sets. we defined nodes to have status susceptible, infected, recovered or dead. we started each simulation with exactly one infected node, chosen at random. the simulation has two sorts of events:
1. an infected node connected to a susceptible node can change the status of the susceptible node to infected with an infection rate, infrate. this event is network-connectivity dependent; therefore we expect to see a negative or inverse correlation between infrate and scalepopdens.
2. an infected node can transition to recovered status with a recovery rate, recrate, or transition to dead status with a death rate, deathrate. both these rates are entirely node-autonomous.
the reciprocal of the recrate parameter (recdays in the following) is the number of days an individual is contagious. we assigned an age to each node according to a probability distribution parametrized by the median age of each data set (country or state). as is well known, there is a wide disparity in median ages in different countries. the probability distribution approximately models the triangular shape of the population pyramids observed in demographic studies. we parametrized it as a function of age a, where medianage is the median age of a specific country and maxage = 100 y is a global maximum age. it is computationally impossible to perform model simulations for the exact age distribution for each location.
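the two event types above can be sketched in a self-contained discrete-time simulation on a g(n, p) network (the paper uses the continuous-time epidemics on networks package; the daily time step, the small network size and the rate values below are illustrative assumptions):

```python
import numpy as np

def simulate_network_epidemic(n=2000, links=5, inf_rate=0.1,
                              rec_rate=0.1, death_rate=0.01,
                              days=100, seed=2):
    """discrete-time sketch: nodes are S/I/R/D, one random initial infection,
    per-day per-link transmission, node-autonomous recovery and death.
    returns the cumulative death count per day."""
    rng = np.random.default_rng(seed)
    p = 2 * links / (n - 1)                  # G(n, p) with ~links links/node
    rows, cols = np.triu_indices(n, k=1)     # all candidate edges
    keep = rng.random(rows.size) < p
    a, b = rows[keep], cols[keep]
    S, I, R, D = 0, 1, 2, 3
    status = np.zeros(n, dtype=int)
    status[rng.integers(n)] = I              # one random initial infection
    deaths = []
    for _ in range(days):
        infected = status == I
        # event 1: transmission along edges with one infected endpoint
        hit = rng.random(len(a)) < inf_rate
        new_inf = np.unique(np.concatenate([
            b[infected[a] & (status[b] == S) & hit],
            a[infected[b] & (status[a] == S) & hit]]))
        # event 2: node-autonomous death or recovery of infected nodes
        u = rng.random(n)
        status[infected & (u < death_rate)] = D
        status[infected & (u >= death_rate) & (u < death_rate + rec_rate)] = R
        status[new_inf] = I
        deaths.append(int((status == D).sum()))
    return deaths
```

since dead nodes never change status again, the returned death count is a non-decreasing step function, mirroring the sporadic, discrete early deaths the paper fits.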
we circumvented this problem, as detailed in the next subsection (bayes setup), by incorporating a scalemedage parameter in the model, scaled so that scalemedage = 1.0 corresponds to a median age of 40 years. the node age is used to make the deathrate of any node age-specific, in the form of an age-dependent weight w(a), where a[n] is the age of node n and ageincrease = 5.5 is an age-dependence exponent. w(a) is normalized so that Σ_a w(a|ageincrease) p(a|medianage = 38.7 y) = 1, using the median age of new york state's population. the value of ageincrease given above was approximately determined by fitting to the observed age-specific mortality statistics of new york state [35]. however, we included ageincrease as a model parameter, since the strong age dependence of sars-cov-2 mortality is not well understood, with the normalization adjusted appropriately as a function of ageincrease. note that a decrease in the median age, with all rates and the age-dependence exponent held constant, will lead to a lower number of deaths. we use simulations to find the number of dead nodes as a function of time. the first time at which a death occurs following the initial infection in the network is labeled timefirstdeath. we implemented bayes' theorem as usual, computing the probability of a model m given the set of observed deaths; varying the number of days used after the first death did not affect our results. as alluded to in the previous subsection, the posterior expectation of the scalemedage parameter (×40 y) for each location should turn out to be close to the actual median age for each location in our setup, and this was achieved (right column, figure 5). we simulated our grid of models on the nih biowulf cluster. our grid comprised 56448 × 2 parametrized models, simulated with 40 random networks each, with parameters in all possible combinations from fixed lists of values.
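the age machinery can be sketched as follows: a decreasing triangular age density matched approximately to a given median age, and a power-law death weight normalised so that Σ_a w(a) p(a) = 1. the exact functional forms are not reproduced in this text, so both the triangular parametrisation and the power-law form of w(a) below are assumptions for illustration.

```python
import numpy as np

def age_distribution(median_age, max_age=100):
    """assumed decreasing triangular age density whose median approximately
    matches median_age (truncation at max_age shifts the median slightly)."""
    apex = median_age / (1 - 1 / np.sqrt(2))   # age at which density reaches zero
    a = np.arange(max_age + 1)
    p = np.clip(1 - a / apex, 0, None)
    return p / p.sum()

def death_weight(ages_p, age_increase=5.5, max_age=100):
    """assumed age-dependent mortality weight w(a) ~ (a/max_age)**age_increase,
    normalised so that sum_a w(a) p(a) = 1 over the given age distribution
    (the paper normalises against new york state's age distribution)."""
    a = np.arange(max_age + 1)
    w = (a / max_age) ** age_increase
    return w / np.sum(w * ages_p)
```

with an exponent of 5.5, the weight for an 80-year-old node is orders of magnitude larger than for a 40-year-old node, reproducing the strong age dependence of mortality the paper describes.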
in particular, note that the network infectivity (infcontacts) has a factor-of-two smaller uncertainty than either of its factors, as these two parameters (infrate and scalepopdens) cooperate in the propagation of the contagion and therefore turn out to have a negative posterior-weighted correlation coefficient (table i). the concordance of posterior expectation values (table i) goes along with the approximately 80-day period between the first infection and the first death for a few outlier trajectories. however, it is also clear from the histograms in figure 9 and the mean timefirstdeath given in table i that the likely value of this duration is considerably shorter. finally, we evaluated a possible correlation between the actual population density and the scalepopdens parameter governing network connectivity. we found a significant correlation; when we added additional countries to the european union countries in this regression, we obtained (p < 0.0019, r = 0.33) scalepopdens(us&eu+) = 0.11 ln(population per km²) + 2.9. while epidemiology is not the standard stomping ground of statistical physics, bayesian model comparison is naturally interpreted in a statistical physics context. we showed that taking this interpretation seriously leads to enormous reductions in computational effort. given the complexity of translating the observed manifestations of the pandemic into an understanding of the virus's spread and the course of the infection, we opted for a simple data-driven approach, taking into account population age distributions and the age dependence of the death rate. while the conceptual basis of our approach is simple, there were computational difficulties we had to overcome to make the implementation amenable to computability with finite computational resources.
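the reported us&eu+ regression can be applied directly to predict network connectivity (mean links per node) from a measured population density:

```python
import numpy as np

def predicted_connectivity(pop_per_km2, slope=0.11, intercept=2.9):
    """mean links per node (scalepopdens) from the reported regression
    scalepopdens(us&eu+) = 0.11 * ln(population per km^2) + 2.9."""
    return slope * np.log(pop_per_km2) + intercept
```

the logarithmic dependence means connectivity grows only slowly with density: a hundred-fold increase in population density adds about 0.11 · ln(100) ≈ 0.5 links per node.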
our results were checked to depend neither on the size of the networks we simulated, nor on the number of stochastic runs we used for each model, nor on the number of days that we used for the linear regression. all the values we report in table i are well within most estimated ranges in the literature, but with the benefit of uncertainty estimates performed with a uniform model prior. while each location ball may range over a broad distribution of models, the consensus posterior distribution (table i) shows remarkable concordance across datasets. we can predict the posterior distribution of the time of initial infection, timefirstdeath, as shown in table i. the dynamic model can predict the number of people infected after the first infection (right panel, figure 10) and relative to the time of first death (left panel, figure 10), because we made no use of infection or recovery statistics in our analysis [9]. note the enormous variation in the number of infections for the same parameter set, only partly due to the stochasticity of the networks themselves, as can be seen by comparing the upper and lower rows of figure 4. with parameters intrinsic to the infection held fixed, we can predict the effect of various degrees of social distancing by varying network connectivity. we assumed that a certain fraction of nodes in the network would comply with social distancing, and only these compliant nodes would reduce their connections at random by a certain fraction. figure 12 shows the effects of four such combinations of compliant-node fraction and fraction of contact reduction. comparing these simulations (table ii) with the posterior expectations of parameters (table i) shows that the bayes entropy of the model posterior distribution is an important factor to consider, validating our initial intuition that optimization of model parameters would be inappropriate in this analysis.
the regression we found (eqs. 6, 7, 8) with respect to population density must be considered in light of the fact that many outbreaks are occurring in urban areas, so they are not necessarily reflective of the true population-density dependence. furthermore, we did not find a significant regression for the countries of the european union by themselves, perhaps because they have a smaller range of population densities, though the addition of these countries to the us states data further reduced the regression p-value of the null hypothesis without materially altering the regression parameters. detailed epidemiological data could be used to clarify its significance. [24-27] have suggested the importance of super-spreader events, but we did not encounter any difficulty in modeling the available data with garden-variety g(n, p) networks. certainly, if the network has clusters of older nodes, there will be abrupt jumps in the cumulative death count as the infection spreads through the network. furthermore, it would be interesting to consider how to make the basic model useful for more heterogeneous datasets, such as all countries of the world with vastly different reporting of death statistics. using the posterior distribution we derived as a starting point for more complicated models may be an approach worth investigating. infectious disease modeling is a deep field with many sophisticated approaches in use [39, 41-43] and, clearly, our analysis is only scratching the surface of the problem at hand. network structure, in particular, is a topic that has received much attention in social network research [37, 38, 44-46]. bayesian approaches have been used in epidemics-on-networks modeling [47] and have also been used in the present pandemic context in [2, 27, 48]. to our knowledge, there is no work in the published literature that has taken the approach adopted in this paper.
there are many caveats to any modeling attempt with data this heterogeneous and complex. first of all, any model is only as good as the data incorporated, and unreported sars-cov-2 deaths would impact the validity of our results. secondly, if the initial deaths occur in specific locations such as old-age care centers, our modeling will over-estimate the death rate. a safeguard against this is that the diversity of locations we used may compensate to a limited extent. detailed analysis of network structure from contact tracing can be used to correct for this if such data is available, and our posterior model probabilities could guide such refinement. thirdly, while we ensured that our results did not depend on our model ranges as far as practicable, we cannot guarantee that a model with parameters outside our ranges could not be a more accurate model. the transparency of our analysis and the simplicity of our assumptions may be helpful in this regard. all code is available.
references:
- an seir infectious disease model with testing and conditional quarantine
- the lancet infectious diseases
- the lancet infectious diseases
- the lancet infectious diseases
- proceedings of the 7th python in science conference
- 2015 winter simulation conference (wsc)
- agent-based modeling and network dynamics
- infectious disease modeling
- charting the next pandemic: modeling infectious disease spreading in the data science age
we are grateful to arthur sherman for helpful comments and questions, and to carson chow for prepublication access to his group's work [6]. this work was supported by the
key: cord-217139-d9q7zkog authors: kumar, sumit; sharma, sandeep; kumari, nitu title: future of covid-19 in italy: a mathematical perspective date: 2020-04-18 journal: nan doi: nan sha: doc_id: 217139 cord_uid: d9q7zkog we have proposed an seir compartmental mathematical model. the prime objective of this study is to analyze and forecast the pandemic in italy for the upcoming months.
the basic reproduction number has been calculated. based on the current situation in italy, in this paper we estimate the possible time for the end of the pandemic in the country. the impact of lockdown and rapid isolation on the spread of the pandemic is also discussed. further, we have studied four of the most pandemic-affected regions in italy. using the proposed model, a prediction has been made about the duration of the pandemic in these regions. the variation in the basic reproduction number corresponding to the sensitive parameters of the model is also examined. since its first appearance in wuhan (china), the severe acute respiratory syndrome coronavirus-2 disease (covid-19) has been spreading rapidly across the globe [1, 2]. the number of patients is increasing exponentially, and in some countries thousands of people are losing their lives almost every day due to covid-19. the seriousness of the situation can be understood from the fact that the number of infected cases has crossed the 1.2 million mark, with more than 75 thousand confirmed deaths across the globe due to the disease as of 7th april [3]. moreover, the outbreak has spread to more than 200 countries [3]. owing to the grievous condition, the world health organization first declared covid-19 a public health emergency of international concern on 30 january 2020, and later a pandemic on 11 march 2020 [4]. in fact, at this point in time, covid-19 seems to be unstoppable. different countries and health agencies are working together to find some robust method to prevent the ongoing wave of the disease, which has now become a pandemic. moreover, the ongoing outbreak has forced various countries to implement countrywide lockdowns to break the chain of transmission of the coronavirus. this has resulted in a huge loss in terms of economy and resources, and at the same time has created a chaotic situation worldwide.
efforts have been made from different corners of the research community to understand the structure of the virus and the transmission mechanism of the disease. different studies analyze the role of social distancing (the only effective method available to date) in the prevention of covid-19. at the initial stage, it was believed that transmission of the disease took place through an animal-to-human mode, but it has since been established that direct human-to-human transmission is also possible and is the primary reason for the acute transmission to various countries [5, 6, 7]. the possibility of hospital-related transmission was also explored, and it was suspected to be the possible cause in 41% of patients [6]. along with its high transmission efficiency, the high level of global travel contributed heavily to the spread of sars-cov-2 across the globe [8]. the ongoing pandemic of covid-19 has challenged many developed nations, including germany, spain, the usa, italy, france and several others, and has had catastrophic impacts on the lifestyle of the people of these countries. this is almost the opposite of the pattern of infectious diseases observed to date: in the past, such pandemic outbreaks generally took place in underdeveloped and developing countries. the african region, which is the host of many infectious diseases, has the lowest reported cases of covid-19 to date. italy recorded its first case of covid-19 on february 20, 2020, at lodi (lombardy) [9]. in the next 24 hours, the number of infected cases increased to 36 [9]. moreover, at the initial stage of the outbreak, italian data closely followed the exponential growth trend observed in hubei province, china. to date, italy is one of the countries that have faced the most grievous consequences of covid-19. up to 7th april 2020, italy had 132,547 recorded cases and 16,523 deaths due to covid-19.
in terms of reported cases it is the third highest, while it is at the top when it comes to the number of deaths across the globe. of the patients who died, 42.2% were aged 80-89 years, 32.4% were aged 70-79 years, 8.4% were aged 60-69 years, and 2.8% were aged 50-59 years. the male-to-female ratio is 80% to 20%, with an older median age for women (83.4 years for women vs 79.9 years for men) [10]. moreover, the estimated mean age of those who lost their lives in italy was 81 years [10]. the covid-19 outbreak has completely disturbed the economic condition of italy, and several family-owned small businesses are suffering [11]. the condition of italy surprised the research fraternity, because italy stands among the top five countries in terms of medical facilities. the dismal ongoing scenario in italy has forced the government to admit that they do not have any control over the spread of the disease, and do not know when the ongoing wave of covid-19 will stop. due to the catastrophic impacts of the covid-19 outbreak, efforts have been made to analyze the trend of the disease and predict the future of the epidemic [10, 12]. the work carried out in [10] predicts that, in the absence of timely implementation of available medical resources, the authorities will not be able to control the outbreak of the disease. the study further concludes that, together with the medical facilities, people's movement and social activities should be restricted immediately in order to curtail the burden of covid-19. the work carried out in [12] collected the day-to-day data and measured the possible similarity between italy and hubei province (china). further, the study also shows that the number of deaths increased almost five times as the available treatment facilities reached their limit. the limited availability of medical staff and facilities also delays the possible end of the covid-19 crisis from 15 april 2020 to 8 may 2020.
however, to the best of our knowledge, so far no compartmental deterministic mathematical model has been developed and studied for covid-19 spread in italy. in the current work, we propose a compartmental mathematical model to study the case of italy. the proposed model incorporates four different compartments, namely the susceptible, exposed, infected and recovered populations. the present covid-19 has an incubation period that ranges up to 14 days; therefore, to make our model realistic, we include the exposed population along with the infected population, which certainly results in improved prediction. we collected the data of italy for covid-19 available on the worldometers website [13]. further, we trained the proposed mathematical model using the data available till 6th april 2020. using this mathematical model we study the impact of lockdown on the spread of covid-19 in italy. also, we predict the possible end of the current outbreak of covid-19 in italy. moreover, to make our predictions more realistic, we have trained and validated our model with covid-19 data of some of the most affected regions of italy. the proposed model describes the transmission mechanism of covid-19. in the modelling process, we have divided the human population into four mutually exclusive compartments, namely susceptible (s), exposed (e), infected (i) and recovered (r). the exposed class is the collection of those individuals who have just been infected and are not yet showing any symptoms of the disease, while the infected-class individuals have clear symptoms of covid-19. we further assume that a recovered individual becomes immune to the infection. the flow chart of the model is presented in figure 1. based on the above assumptions, the model is governed by the following system of equations:
ds/dt = a − β s i/n − β0 s e/n − μ s
de/dt = β s i/n + β0 s e/n − (α + α1 + μ) e
di/dt = α e − (α2 + θ + μ) i
dr/dt = α1 e + α2 i − μ r
in the above model, n represents the total population. we assume that all new recruits join the susceptible class at a constant rate a.
β is the disease transmission rate from infected individuals to susceptible individuals. we further assume that susceptible individuals who come into contact with an infected individual do not join the infected class directly: they first join the exposed class (e) and, after a certain period of time, show visible symptoms of the disease and enter the infected class (i). exposed-class individuals are assumed to be less infectious than infected-class individuals; therefore, β0 represents the disease transmission rate for exposed individuals, and clearly β0 ≤ β. here, α is the rate at which exposed individuals join the infected class, α1 is the recovery rate of exposed individuals, α2 is the recovery rate of infected individuals, θ is the disease-induced death rate, and μ is the natural death rate. in this section, we check the mathematical feasibility of the proposed model; for this purpose, we check whether all the solutions of the proposed model remain positive and bounded. since our proposed model system involves human populations, all the compartmental values are assumed to be non-negative in the initial state, and we consider a non-negative initial condition for the analysis. to show the epidemiological feasibility of the proposed model system (1), it is required that all the solutions remain non-negative. hence, in the following theorem we verify that all solutions with a non-negative initial condition remain non-negative. theorem 1: the solution (s(t), e(t), i(t), r(t)) of the proposed model system is non-negative for all t ≥ 0 with a non-negative initial condition. proof: from the first equation of system (1), we have ds/dt ≥ −(β i/n + β0 e/n + μ) s. integrating on both sides and simplifying, this gives s(t) ≥ s(t1) exp(−∫_{t1}^{t} (β i/n + β0 e/n + μ) dτ) ≥ 0, where t1 ≥ 0 is arbitrary. similarly, we can show the non-negativity of the compartments e, i and r.
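the seir dynamics described above can be integrated numerically. since the displayed equations are not reproduced in this extracted text, the right-hand side below is reconstructed from the parameter descriptions, and the rate values are illustrative rather than the paper's fitted ones; the initial condition is the one reported for 1st march 2020.

```python
import numpy as np
from scipy.integrate import solve_ivp

def seir_rhs(t, y, A, beta, beta0, alpha, alpha1, alpha2, theta, mu):
    """assumed seir right-hand side, reconstructed from the text:
    recruitment A, transmission beta (infected) and beta0 (exposed),
    progression alpha, recoveries alpha1/alpha2, disease death theta,
    natural death mu."""
    S, E, I, R = y
    N = S + E + I + R
    force = (beta * I + beta0 * E) * S / N
    return [A - force - mu * S,
            force - (alpha + alpha1 + mu) * E,
            alpha * E - (alpha2 + theta + mu) * I,
            alpha1 * E + alpha2 * I - mu * R]

# illustrative parameter values (A, beta, beta0, alpha, alpha1, alpha2, theta, mu)
pars = (0.0, 0.5, 0.1, 1 / 7, 0.02, 0.1, 0.02, 0.0)
y0 = [50_000_000.0, 18_000.0, 1_577.0, 83.0]  # initial condition from the paper
sol = solve_ivp(seir_rhs, (0.0, 120.0), y0, args=pars, rtol=1e-8, atol=1.0)
```

with recruitment and natural death set to zero, the only outflow from the system is the disease-induced death term θ i, so the total population in the solver output is non-increasing, consistent with the boundedness argument above.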
hence, the solution (s(t), e(t), i(t), r(t)) remains positive for a non-negative initial condition. in order to accurately predict the epidemic, the solutions of the mathematical model should be bounded. the following theorem guarantees the boundedness of the solutions of the covid-19 model designed for italy. theorem 2: all solutions of the proposed model are bounded. proof: we need to show that (s(t), e(t), i(t), r(t)) is bounded for each value of t ≥ 0. adding the equations of model system (1), we obtain dn/dt = a − μn − θi ≤ a − μn. from here, we can conclude that lim sup n(t) ≤ a/μ as t → ∞. similarly, we can show the boundedness of every compartment of the model. therefore, the feasible region for our model system is ω = {(s, e, i, r) ∈ r4+ : 0 ≤ s + e + i + r ≤ a/μ}. as of now, italy is one of the countries most highly affected by covid-19. according to the data collected from [13], the total number of covid-19 cases had crossed 139422 as on 9th april 2020. there have been more than 17669 deaths in the country due to this epidemic. our proposed model aims to predict the future scenario of the covid-19 epidemic in italy by analyzing its present state in the country. in this section, we perform rigorous numerical simulations to get an insight into the epidemic in italy. the parameter values used for the simulation are provided in table 1. we have considered the following initial condition to perform our analysis, taken on the basis of the data officially reported by [13] as on 1st march 2020: (s(0), e(0), i(0), r(0)) = (50000000, 18000, 1577, 83). according to situation report-11, available on the official website of the who [15], the first two covid-19 positive cases in italy were reported on 31st january 2020. both infected individuals had a travel history to the city of wuhan, china. this was the initial phase of the epidemic in the country. due to the lack of government concern on this matter, the epidemic started to grow slowly but significantly. by the end of february, the number of cases had reached 1000 [13].
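the model and the quoted initial condition can be explored numerically. the sketch below is a minimal rk4 integrator; the right-hand side is a reconstruction from the stated parameter definitions, and the rate values are illustrative placeholders, not the calibrated values of the paper's table 1 (which is not reproduced in this text).

```python
# Minimal RK4 integration of the SEIR-type system described in the text.
# Rates below are illustrative placeholders, NOT the fitted table-1 values.

def seir_rhs(state, p):
    s, e, i, r = state
    n = s + e + i + r
    new_inf = p["beta"] * s * i / n + p["beta0"] * s * e / n
    ds = p["a"] - new_inf - p["mu"] * s
    de = new_inf - (p["alpha"] + p["alpha1"] + p["mu"]) * e
    di = p["alpha"] * e - (p["alpha2"] + p["theta"] + p["mu"]) * i
    dr = p["alpha1"] * e + p["alpha2"] * i - p["mu"] * r
    return (ds, de, di, dr)

def rk4_step(state, p, dt):
    def shift(u, k, c):
        return tuple(a + c * b for a, b in zip(u, k))
    k1 = seir_rhs(state, p)
    k2 = seir_rhs(shift(state, k1, dt / 2), p)
    k3 = seir_rhs(shift(state, k2, dt / 2), p)
    k4 = seir_rhs(shift(state, k3, dt), p)
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))

def simulate(state, p, days, dt=0.1):
    for _ in range(int(days / dt)):
        state = rk4_step(state, p, dt)
    return state

# demographic turnover (a, mu) neglected over this short horizon
params = {"a": 0.0, "beta": 0.7, "beta0": 0.4, "alpha": 0.2,
          "alpha1": 0.05, "alpha2": 0.07, "theta": 0.02, "mu": 0.0}
# initial condition quoted in the text (1 march 2020)
state0 = (5.0e7, 18000.0, 1577.0, 83.0)
final = simulate(state0, params, days=60)
```

with a = μ = 0 the total population can only shrink through disease-induced deaths (dn/dt = −θi), which gives a quick sanity check on the integration.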
we trained our model defined for italy on the month of march to check its accuracy against the officially available data [13]. it was observed that our model predicted the epidemic closely (figure 4). it can be seen from the simulated results (see fig. 3) that the active cases will still increase till the end of april. the pandemic cases may start to decrease by the first week of may. the epidemic may hit its peak in italy between 28th april and 3rd may, as shown in figure 3. also, according to our proposed model, and if the conditions remain the same, the epidemic in the country may not last till the end of july 2020. in this section, we discuss the factors which can significantly control the spread of the pandemic. the two major factors discussed here are (a) early lockdown and (b) rapid isolation. on 9th march 2020, the government of italy imposed a nationwide quarantine restricting the movement of the population except for necessity, work and health circumstances, in response to the growing covid-19 pandemic in the country. in this subsection, we investigate the effect of an early lockdown on the spread of the pandemic in italy. it is clear from figure 4 that the epidemic could have been controlled at a very early stage if the government had imposed the lockdown earlier. figure 4 shows three different scenarios of the epidemic in italy with different initial values of the susceptible class. figure (4a) shows the current scenario of italy. however, if the lockdown had been imposed prior to 9th march 2020, the number of susceptibles would have been significantly lower. in figure (4b), we assume the susceptible population to be 3.5 crores (35 million) due to the lockdown in the country. it can be seen from the plot that this not only significantly reduces the number of infections, but also causes the overall extinction of the pandemic by 20th april 2020.
also, figure (4c) indicates that the epidemic could have been eliminated by 31st march 2020 if the susceptible population had been reduced to 10 lakhs (one million). covid-19 is a global pandemic which is spreading all across the globe. early research shows that the disease transmission rate from an infected individual to a susceptible one is very high [16]. the transmission rate can be reduced by isolating the infected individuals as quickly as possible. in this subsection, with the help of numerical simulations, we show the variation in the infected population for different values of β, the disease transmission rate from infected to susceptible individuals. figure 5 shows various scenarios of the epidemic in italy in case the disease transmission rate had been controlled in time. rapid isolation of the infected population reduces the disease transmission rate β. from figure 5, we see that as the disease transmission rate β is reduced from 0.7 to 0.4, it not only decreases the number of active infections from 45000 to 9000, but also reduces the overall lifespan of the pandemic from july 31 to april 20, 2020. in this section, we study and analyze the spread of the pandemic in different regions of italy. a few regions have become hotspots for the pandemic in the country. these regions include lombardia, emilia romagna, piemonte and veneto. in order to study the pandemic more precisely, we have assumed that the transmission rate in these regions varies from the generalized transmission rate of italy. this can be explained by the visible population gap between these regions. also, we have assumed that by the first week of march, the whole population of these regions was susceptible. hence, we have assumed that there are no new individuals available for recruitment, so that a equals zero. for this purpose, we have used the parameter values available in table 2. the disease transmission rate from exposed individuals is also reduced to 0.4.
all the remaining parameter values are considered to be the same as given in table 1. in figure 6, we study the case of the lombardia region of italy. it is interesting to note that the number of infective cases was highest in this region of the country. in sub-figure 6a, the proposed model is used for fitting the available data. in sub-figure 6b, we use our proposed model to predict the future of the pandemic in lombardia. we see from the study that by mid july, the pandemic will die out in this region. figure 7 shows the case study of the emilia romagna region of italy. in figure (7a), we can see the accuracy of our proposed model. figure (7b) consists of the upcoming predicted scenario of the epidemic in this region. according to our proposed model, the pandemic will last till mid july 2020 in this particular region. figures 8 and 9 illustrate the behaviour of the pandemic in the piemonte and veneto regions, respectively. sub-figures (9a) and (10a) exhibit the agreement of our proposed model with the officially reported covid-19 cases in these regions. the possible extinction of the pandemic in these regions is shown in sub-figures (8a) and (9a). according to our proposed model, the pandemic in both of these regions will last till the last week of july 2020. also, it can be observed from sub-figure (8b) that the active infected cases in piemonte will not surpass 18000, whereas the active infected cases in veneto may hit the 21000 mark (see sub-fig 9b). these fits show the accuracy of our model with respect to the officially reported data of these regions. also, from our simulations, one can conclude that the epidemic is yet to reach its peak. the basic reproduction number is one of the key parameters to examine the long-term behaviour of an epidemic. it can be defined as the number of secondary cases produced by a single infected individual over its entire life span as an infectious agent. we have used the next-generation matrix technique explained in [18] to derive the expression of the basic reproduction number r0.
the expression of r0 is obtained as the spectral radius of the next-generation matrix. we will analyse the variation in r0 for different values of the parameters involved in the model system. figure 10 illustrates the simultaneous variation in the basic reproduction number for different values of the corresponding parameters. the parameter values used are given in table 1. a seir-type compartmental model is proposed to study the current scenario of covid-19 in italy. our proposed model accurately fits the officially available data of the pandemic in italy. it is concluded from the study that the pandemic may still grow in italy till 30th april 2020 and may hit the 2.5 lakh (250,000) active cases mark (see fig. 3). also, we have discussed how the lockdown imposed on 9th march 2020 was a good but delayed decision of the government of italy. through simulations, we have shown that rapid isolation of infective individuals and an early lockdown in the country are two of the most efficient procedures to terminate the spread of covid-19. as of now, a vaccine for covid-19 has not been discovered. hence, this research can also be beneficial for countries which are in the initial stage of the pandemic, as it describes two of the most effective procedures to counter the spread of the pandemic and its long-term impact. the expression of the basic reproduction number r0 is also derived. as our proposed model involves various parameters, we have shown the sensitivity of these parameters via numerical simulations. it is clear from the simulations (see fig. 10d) that the transmission rates β and β0 are the most sensitive parameters. the reproduction number can be minimized if we can reduce these two parameters.
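the next-generation matrix computation referenced above can be sketched explicitly. this is a reconstruction under the assumption that the model takes the standard seir form implied by the stated parameter definitions, with infected compartments (e, i) and disease-free equilibrium s0 = a/μ (so s0/n0 = 1):

```latex
F = \begin{pmatrix} \beta_0 & \beta \\ 0 & 0 \end{pmatrix},
\qquad
V = \begin{pmatrix} \alpha + \alpha_1 + \mu & 0 \\ -\alpha & \alpha_2 + \theta + \mu \end{pmatrix},
```

```latex
R_0 = \rho\!\left(FV^{-1}\right)
    = \frac{\beta_0}{\alpha + \alpha_1 + \mu}
    + \frac{\alpha\,\beta}{(\alpha + \alpha_1 + \mu)(\alpha_2 + \theta + \mu)}.
```

the two summands separate the contributions of exposed-stage transmission (β0) and infected-stage transmission (β), which is consistent with the reported sensitivity of r0 to both rates.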
this research can be extended in various ways. one can refine the model by introducing new compartments in order to examine the epidemic more precisely. there are certain assumptions which we have made while constructing this model because of the limited data and short onset time. as more data become available in the future, this model can be trained with more real data to increase its efficiency.

references:
- severe acute respiratory syndrome coronavirus 2 (sars-cov-2) and coronavirus disease-2019 (covid-19): the epidemic and the challenges
- corcione, 2019-novel coronavirus outbreak: a new challenge
- early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia
- clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in wuhan, china
- novel wuhan (2019-ncov) coronavirus: the next big threat to global health?
- critical care utilization for the covid-19 outbreak in lombardy, italy: early experience and forecast during an emergency response
- covid-19 and italy: what next?
- covid-19 in italy: momentous decisions and many uncertainties
- predicting the ultimate outcome of the covid-19 outbreak in italy
- the reproductive number of covid-19 is higher compared to sars coronavirus
- the construction of next-generation matrices for compartmental epidemic models

key: cord-254107-02bik024
authors: hillisch, alexander; pineda, luis felipe; hilgenfeld, rolf
title: utility of homology models in the drug discovery process
date: 2004-08-31
journal: drug discovery today
doi: 10.1016/s1359-6446(04)03196-4
sha:
doc_id: 254107
cord_uid: 02bik024

abstract: advances in bioinformatics and protein modeling algorithms, in addition to the enormous increase in experimental protein structure information, have aided in the generation of databases that comprise homology models of a significant portion of known genomic protein sequences. currently, 3d structure information can be generated for up to 56% of all known proteins.
however, there is considerable controversy concerning the real value of homology models for drug design. this review provides an overview of the latest developments in this area and includes selected examples of successful applications of the homology modeling technique to pharmaceutically relevant questions. in addition, the strengths and limitations of the application of homology models during all phases of the drug discovery process are discussed. the majority of drugs available today were discovered either from chance observations or from the screening of synthetic or natural product libraries. the chemical modification of lead compounds, on a trial-and-error basis, typically led to compounds with improved potency, selectivity and bioavailability and reduced toxicity. however, this approach is labor- and time-intensive, and researchers in the pharmaceutical industry are constantly developing methods with a view to increasing the efficiency of the drug discovery process [1]. two directions have evolved from these efforts. the 'random' approach involves the development of hts assays and the testing of a large number of compounds. combinatorial chemistry is used to satisfy the need for extensive compound libraries. the 'rational', protein structure-based approach relies on an iterative procedure of the initial determination of the structure of the target protein, followed by the prediction of hypothetical ligands for the target protein from molecular modeling and the subsequent chemical synthesis and biological testing of specific compounds (the structure-based drug design cycle). the rational approach is severely limited to target proteins that are amenable to structure determination.
although the protein data bank (pdb; http://www.rcsb.org/pdb) is growing rapidly (~13 new entries daily), the 3d structure of only 1-2% of all known proteins has as yet been experimentally characterized. however, advances in sequence comparison, fold recognition and protein-modeling algorithms have enabled the partial closure of the so-called 'sequence-structure gap' and the extension of experimental protein structure information to homologous proteins. the quality of these homology models, and thus their applicability to, for example, drug discovery, predominantly depends on the sequence similarity between the protein of known structure (template) and the protein to be modeled (target). despite the numerous uncertainties that are associated with homology modeling, recent research has shown that this approach can be used to significant advantage in the identification and validation of drug targets, as well as for the identification and optimization of lead compounds. in this review, we will focus on the application of homology models to the drug discovery process. homology, or comparative, modeling uses experimentally determined protein structures to predict the conformation of another protein that has a similar amino acid sequence. the method relies on the observation that in nature the structural conformation of a protein is more highly conserved than its amino acid sequence and that small or medium changes in sequence typically result in only small changes in the 3d structure [2] . generally, the process of homology modeling involves four steps -fold assignment, sequence alignment, model building and model refinement ( figure 1 ). the fold assignment process identifies proteins of known 3d structure (template structures) that are related to the polypeptide sequence of unknown structure (the target sequence; this is not to be mistaken with drug target). next, a sequence database of proteins with known structures (e.g. 
the pdb-sequence database) is searched with the target sequence using sequence similarity search algorithms or threading techniques [3] . following identification of a distinct correlation between the target protein and a protein of known 3d structure, the two protein sequences are aligned to identify the optimum correlation between the residues in the template and target sequences. the next stage in the homology modeling process is the model-building phase. here, a model of the target protein is constructed from the substitution of amino acids in the 3d structure of the template protein and the insertion and/or deletion of amino acids according to the sequence alignment. finally, the constructed model is checked with regard to conformational aspects and is corrected or energy minimized using force-field approaches. several improvements and modifications of this general homology modeling strategy have been developed and applied to the prediction of protein structures. to subject the available structure prediction methods to a blind test, community-wide experiments on the critical assessment of techniques for protein structure prediction (casp 1-5) have been performed and their results presented and published. as a result, the current state-of-the-art in protein structure prediction has been established, the progress made has been documented and the areas where future efforts might be most productively concentrated have been highlighted [4, 5] . homology modeling techniques are dependent on the availability of high-resolution experimental protein structure data. the development of effective protein expression systems and major technological advances in the instrumentation used for structure determination (x-ray crystallography and nmr spectroscopy) has contributed to an exponential growth in the number of experimental protein 3d structures. 
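the sequence alignment step described above ultimately yields the target-template sequence identity on which the quality of the resulting model is judged. as a toy illustration (not a production aligner; real pipelines use search tools such as psi-blast for this step), percent identity over an already-computed gapped alignment can be calculated as:

```python
# Toy percent-identity calculation over a pre-computed pairwise
# alignment, with '-' marking gaps. Illustrative only.

def percent_identity(aln_a: str, aln_b: str) -> float:
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    # columns where both sequences carry the same residue
    matches = sum(1 for x, y in zip(aln_a, aln_b)
                  if x == y and x != "-")
    # normalise by all columns that contain at least one residue
    cols = sum(1 for x, y in zip(aln_a, aln_b)
               if x != "-" or y != "-")
    return 100.0 * matches / cols

# hypothetical template/target fragment, for illustration
template = "MKT-AYIAKQR"
target   = "MKTWAYLAK-R"
identity = percent_identity(template, target)
```

note that different tools normalise identity differently (by alignment length, by the shorter sequence, or by aligned columns as here), which can shift a borderline pair across the usability thresholds discussed below in this review.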
by may 2004, the pdb contained ~23,000 experimental protein structures for ~7400 different proteins (proteins with less than 90% sequence identity). a recent analysis of all protein chains in the pdb shows that these proteins can be grouped into 2500 protein families comprising 900 unique protein folds [6] (updates can be found at http://scop.mrc-lmb.cam.ac.uk).

[figure 1 (ddt vol. 9, no. 15, august 2004, p. 660): the steps involved in the prediction of protein structure by homology modeling. structure modeling of the bacterial transcriptional repressor copr is shown [28]. although the model is based on a low sequence identity of only 13.8% between copr and the p22 c2 repressor, several experimental methods support this homology model. reproduced, with permission, from ref. [84]. abbreviation: copr, plasmid copy control protein.]

the majority of the structures in the pdb (84%) were determined by x-ray crystallography, with 15% of the structures being characterized by nmr spectroscopy. the pdb database encompasses experimental information on an extensive array of ligands (small organic molecules and ions) bound to more than 50,000 different binding sites that can be analyzed using programs including relibase (http://relibase.ebi.ac.uk) [7], ligbase (http://alto.compbio.ucsf.edu/ligbase) [8] and pdbsum (http://www.biochem.ucl.ac.uk/bsm/pdbsum) [9]. although the experimental structure database is growing rapidly, there is still a substantial gap between the number of known annotated sequences [1,182,126 unique sequences in swiss-prot-trembl (http://www.expasy.org/sprot) as of 29 august 2003] and the number of known protein 3d structures (23,000).
if only significantly different proteins are considered (~7400), which omits muteins, artificial proteins and multiple structure determinations of the same proteins (e.g. hiv-protease and carbonic anhydrase ii), then less than 1% of the 3d structures of known protein sequences have been elucidated. this sequence-structure gap can partly be filled with homology models. for example, the queryable database modbase (http://alto.compbio.ucsf.edu/modbase-cgi/index.cgi) provides access to an enormous number of annotated comparative protein structure models [10]. the program psi-blast was used to assign protein folds to all 1,182,126 unique sequence entries in swiss-prot-trembl. for 56% of these sequences, comparative models with an average model size of 235 amino acids could be built using the program modeller [11]. thus, by august 2003, 659,495 3d structure models of proteins were accessible via the internet. the models are predicted to have at least 30% of their cα atoms superimposed within 3.5 å of their correct positions. information on binding sites and ligands can be retrieved from this database using ligbase [8]. however, the majority of the models are built on a low sequence identity, and it should be realized that this level of accuracy is, in most cases, not sufficient for detailed structure-based ligand design. the swiss-model repository (http://swissmodel.expasy.org/repository) [12] is also a database of annotated comparative protein 3d structure models, which have been generated using the fully automated homology-modeling pipeline swiss-model. as of august 2003, this database contained models for 282,096 different protein sequence entries (26%) from the swiss-prot-trembl databases (1,073,566 sequences), with an average model size of ~200 amino acids. researchers from eidogen (http://www.eidogen.com) have created a database system called target informatics platform™ [13] that currently includes homology models for 55,000 proteins.
homology modeling of 26,279 human protein sequences resulted in the construction of 17,442 models for 13,114 different sequences (50%). thus, putative and known ligand-binding pockets can be detected, analyzed and compared, and the resulting data used to support target prioritization and lead discovery and/or optimization procedures. accelrys (http://www.accelrys.com) produces discovery studio (ds) atlasstore™ as a complete oracle®-based protein and ligand structural data management solution. currently, ds atlasstore™ contains 2,052,000 homology models that have been automatically generated from the sequences of 195,000 proteins from 33 different genomes. in conjunction with homology models, cengent therapeutics (http://www.cengent.com) offers dynamic structural information, generated from molecular dynamics simulations, for 5500 human drug target proteins. this structural information can be used for target prioritization and virtual screening. the quality of homology models depends on the level of sequence identity between the protein of known structure and the protein to be modeled [14]. for a sequence identity greater than 30%, homology can be assumed; the two proteins probably have a common ancestor and are therefore evolutionarily related and likely to share a common 3d structure. in this case, pairwise and multiple sequence alignment algorithms are reliable and can be used for the generation of homology models (figure 2). if the sequence identity is below 15%, structure modeling becomes speculative, which could lead to misleading conclusions. when the sequence identity is between 15% and 30%, conventional alignment methods are not sufficiently reliable and only sophisticated, profile-based methods are capable of recognizing homology and predicting the fold. for regions of low sequence identity, threading methods [15] are often applied.
protein models that are built on such low sequence identities can be used for the assignment of protein function and for the direction of mutagenesis experiments (figure 2). models that have a sequence identity between ~30% and 50% could facilitate the structure-based prediction of target drugability, the design of mutagenesis experiments and the design of in vitro test assays (figure 2). if the sequence identity is greater than ~50%, the resulting models are frequently of sufficient quality to be used in the prediction of detailed protein-ligand interactions, such as structure-based drug design and prediction of the preferred sites of metabolism of small molecules (figure 2). there are numerous applications for protein structure information and, hence, homology models at various stages of the drug discovery process [16]. the most spectacular successes are clearly those where protein structural information has helped to identify or to optimize compounds that were subsequently progressed to clinical trials or to the drug market [17]. the applications of homology models that had an impact on target identification and/or validation, lead identification and lead optimization are reviewed here (figure 3). it is clear that only a minute fraction of the entire proteome can be affected by drug-like (preferentially orally bioavailable) small molecules. based on the total numbers of known genes, disease-modifying genes and drugable proteins, the number of drug target proteins for humans has been estimated at 600-1500 [18]. for small molecules, sets of properties have been established that differentiate drugs from other compounds [19,20]; these properties can be used to identify compounds with, for example, poor oral absorption properties [21]. drug molecules and their corresponding target proteins are highly complementary, which suggests that some rules that distinguish good target proteins from others should be deducible [22].
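the identity bands described above (speculative below ~15%, profile/threading territory between 15% and 30%, mutagenesis and assay design up to ~50%, and drug design above ~50%) can be summarised as a simple triage rule. the thresholds follow the text; the function name and return labels are illustrative shorthand, not established nomenclature:

```python
# Triage of a target-template pair by sequence identity, following
# the usability bands described in the text (cf. figure 2).

def model_utility(identity_pct: float) -> str:
    if identity_pct > 50.0:
        # detailed protein-ligand interactions, structure-based design
        return "drug-design"
    if identity_pct > 30.0:
        # drugability prediction, mutagenesis and assay design
        return "target-validation"
    if identity_pct >= 15.0:
        # 'twilight zone': only profile/threading methods are reliable
        return "fold-recognition-only"
    # below ~15% identity, structure modeling becomes speculative
    return "speculative"

labels = [model_utility(x) for x in (60.0, 40.0, 20.0, 10.0)]
```

such a rule is of course only a coarse first filter; local quality around the binding site matters more than the global identity figure, as the copr example of figure 1 illustrates.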
deep lipophilic pockets that comprise distinct polar interaction sites are clearly superior to shallow, highly charged protein surface regions. the inhibition of protein-protein interfaces as a valuable therapeutic principle has recently been shown with inhibitors of the p53-murine double minute clone 2 (mdm2) interaction [23,24]. the binding site for these inhibitors is a distinct lipophilic pocket that normally interacts with the α-helical surface patches of the p53 tumor suppressor transactivation domain. advances in the rapid detection, description and analysis of ligand-binding pockets [25-27], together with the availability of more than 0.5 million homology models, will open new possibilities for the prioritization of proteins with regard to drugability. in the pharmaceutical industry, structural aspects are being increasingly implemented as additional decision criteria on the drugability of potential drug targets. companies such as inpharmatica (http://www.inpharmatica.com) have developed an integrated suite of informatics-based discovery technologies that contain software tools for the structure-based assessment of target drugability.

the design of site-directed mutant proteins is one further important option for the application of homology models to target validation. introducing point mutations and subsequently studying the effects in vitro or in vivo is a common approach in molecular biology. this strategy enables the identification of amino acids that are functionally or structurally important in the protein under investigation, which ultimately contributes to biological knowledge on, for example, potential target proteins. typically, the amino acids that are to be modified in these studies are selected on the basis of sequence alignments by focusing on conserved residues. however, if at least some structure information is available, the selection of the amino acids that are to be mutated can be much more precise and successful [28]. this approach is even more powerful when applied in conjunction with pharmacologically active compounds. site-directed mutants of the target protein can be made to render that target sensitive to an existing pharmacological agent. based on homology models, some members of the mitogen-activated protein (map) kinase family were mutated to make them sensitive to a kinase inhibitor from the pyridinyl imidazole class [29]. this enabled the use of the compound for broader target validation studies. one of the most attractive ways to validate a target protein is to administer a pharmacologically active compound that selectively acts on that protein and to study the effects in a relevant animal model. similar strategies have been described under the term 'chemogenomics' [30].

[figure 2: relationship between target and template sequence identity and the information content of the resulting homology models. arrows indicate the methods that can be used to detect sequence similarity between target and template sequences. applications of the homology models in drug discovery are listed to the right. the higher the sequence identity, the more accurate the resulting structure information. homology models that are built on sequence identities above ~50% can frequently be used for drug design purposes. superimpositions of x-ray crystal structures of the ligand-binding domains of members of the nuclear receptor family are shown to the left. these x-ray structures illustrate the increase in structure deviation with decreased sequence identity. the pr is red, the gr is green, the erα is blue and the trβ is cyan. sequence identities: pr:gr, 54%; pr:erα, 24%; pr:trβ, 15%. abbreviations: erα, estrogen receptor α; gr, glucocorticoid receptor; pr, progesterone receptor; trβ, thyroid receptor β.]
it has recently been shown that it is possible to design small molecules based on homology models and then to use these compounds as tools to study the physiological role of the respective target protein [31]. eight years after the discovery of estrogen receptor β (erβ), the distinct roles of the two er isotypes, erα and erβ, in mediating the physiological responses to estrogens are not completely understood. although knockout animal experiments have provided an insight into estrogen signaling, additional information on the function of erα and erβ was imparted by the application of isotype-selective er agonists. based on the crystal structure of the erα ligand-binding domain (lbd) and a homology model of the erβ-lbd (59% sequence identity to erα), hillisch et al. [31] designed steroidal ligands that exploit the differences in size and flexibility of the two ligand-binding cavities (figure 4). compounds that were predicted to bind preferentially to either erα or erβ were synthesized and tested in vitro. this approach led directly to highly er isotype-selective (200-250-fold) ligands that were also highly potent. to unravel the physiological roles of each of the two receptors, in vivo experiments with rats were conducted using the erα- and erβ-selective agonists in parallel with the natural ligand of the er, 17β-estradiol. the erα agonist was shown to be responsible for most of the known estrogenic effects (e.g. induction of uterine growth and bone protection), in addition to pituitary (e.g. reduction of luteinizing hormone plasma levels) and liver (e.g. increase in angiotensin i plasma levels) effects [31]. however, the erβ agonist had distinct effects on the ovary, for example the stimulation of early folliculogenesis [32], which possibly presents clinicians with a new option for tailoring classical ovarian stimulation protocols.
a comparison of the homology model with the x-ray crystal structure of the erβ-lbd complexed with genistein [33] revealed that the homology model had a root-mean-square deviation (rmsd) of the backbone atoms (not considering helix 12) of 1.4 å. the x-ray crystal structure confirmed the presence of essential interactions between the ligand and the erβ and did not reveal, at least in this case, any new aspects for the design of erβ agonists that were not covered by the homology model. these studies show that it is possible to design highly selective compounds if structure information on all of the relevant homologs of the target is available.

[figure 3: applications of homology models in the drug discovery process. the enormous amount of protein structure information currently available could not only support lead compound identification and optimization, but could also contribute to target identification and validation. reproduced, with permission, from [84].]

there are numerous examples where protein homology models have supported the discovery and the optimization of lead compounds with respect to potency and selectivity. currently, the structures of 40 of the 518 known different human protein kinases have been characterized by x-ray crystallography [34]. homology model-based drug design has been applied to the epidermal growth factor-receptor tyrosine kinase [35,36], bruton's tyrosine kinase [37], janus kinase 3 [38] and human aurora 1 and 2 kinases [39]. using the x-ray crystal structure of cyclin-dependent kinase 2 (cdk2), honma et al. [40] generated a homology model of cdk4. this model guided the design of highly potent and selective cdk4 inhibitors targeted towards the atp-binding pocket. compounds of the diarylurea class were subsequently synthesized and tested. in an in vitro inhibition assay, the most potent compound had an ic50 of 42 nm.
the predicted binding mode of the lead compound was verified by co-crystallization with cdk2 [40]. vangrevelinghe et al. [41] identified a protein kinase ck2 inhibitor using a homology model of the protein and high-throughput docking. siedlecki et al. [42] have demonstrated the utility of homology modeling in the prediction of pharmacologically active compounds. alterations in dna methylation patterns play an important role in tumorigenesis; therefore, inhibitors of dna methyltransferase 1 (dnmt1), the protein that accounts for the major dna methyltransferase activity in human cells, are desired. known inhibitors from the 5-azacytidine class were docked into the active site of a dnmt1 homology model, which led to the design of n4-fluoroacetyl-5-azacytidine derivatives that acted as highly potent inhibitors of dna methylation in vitro. thrombin-activatable fibrinolysis inhibitor (tafi) is an important regulator of fibrinolysis, and inhibitors of this enzyme have potential use in antithrombotic and thrombolytic therapy. based on a homology model of tafi (~50% sequence identity to carboxypeptidases a and b), appropriately substituted imidazole acetic acids were designed and were subsequently found to be potent and selective inhibitors of activated tafi [43]. homology models of the voltage-gated k+ channel kv1.3 and the ca2+-activated channel ikca1 were used to design selective ikca1 inhibitors based on the polypeptide toxin charybdotoxin. comparison of the two models revealed a unique cluster of negatively charged residues in the turret of kv1.3 that is not present in ikca1. to exploit this difference, the homology model was used to design novel analogs, which were then synthesized and tested. the novel compounds blocked ikca1 activity with ~20-fold higher affinity than kv1.3 [44].

figure 4. (a,b) comparison of the two isotypes of the estrogen receptor.
in the homology model, the erα (blue) and erβ (green) ligand-binding pockets are shown in complex with the natural ligand of the er, 17β-estradiol. the binding of 8β-ve2, a highly potent and selective erβ agonist, modeled into the erβ ligand-binding niche, is depicted to the right. reproduced, with permission, from ref. [31]. (c,d) a model of the antiprogestin ru 486 (mifepristone) bound to hpr. a single amino acid mutation renders this compound inactive at cpr and hamster pr. steric clashes between ru 486 and cpr are shown on the right side. abbreviations: er, estrogen receptor; her, human estrogen receptor; cpr, chicken progesterone receptor; hpr, human progesterone receptor; pr, progesterone receptor; rba, relative binding affinity; 8β-ve2, 8β-vinylestra-1,3,5(10)-triene-3,17β-diol.

the key proteinase (mpro, or 3clpro) of the new coronavirus (cov) that caused the severe acute respiratory syndrome (sars) outbreak of 2003 (sars-cov) is another example of successful homology model building; in this case, success is defined as the ability to use the model to propose an inhibitor with significant affinity for the target enzyme. x-ray crystal structures of the mpro enzymes of transmissible gastroenteritis virus (tgev, a porcine coronavirus) and of human coronavirus 229e [45,46] have been characterized. these proteinases have 44% and 40% sequence identity, respectively, with the key proteinase of sars-cov. following publication of the genome sequence of the new virus, first on the internet and a few weeks later in print [46,47], this level of sequence identity between the proteinases enabled anand et al. [46] to construct a 3d homology model for the mpro of the new human cov. however, the 3d homology model alone was insufficient for the design of inhibitors with reasonable confidence.
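the 44% and 40% identity figures cited above come from pairwise sequence comparison. a minimal sketch of percent identity over a pre-computed pairwise alignment follows; this is illustrative only (real pipelines derive the alignment with dedicated tools), and the gap character and denominator convention are assumptions, since conventions vary between programs.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """percent identity over an existing pairwise alignment:
    identical non-gap positions divided by the aligned length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b)
                  if a == b and a != gap)
    return 100.0 * matches / len(aligned_a)
```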
to establish the structural basis of the interaction of mpro with its polypeptide substrate, anand and co-workers [46] synthesized a substrate-analogous hexapeptidyl chloromethyl ketone inhibitor and complexed it with tgev mpro. the x-ray crystal structure of the complex was then determined, which revealed that, as expected, the chloromethyl ketone moiety had reacted covalently with the active-site cysteine residue of the proteinase. the p1, p2 and p4 side chains of the inhibitor had bound to, and thereby defined, the specificity binding sites of the target enzyme. the experimentally determined structure of the inhibitor-tgev mpro complex was then compared with all inhibitor complexes of cysteine proteinases in the pdb, which revealed a surprisingly similar inhibitor binding mode in the complex of human rhinovirus type 2 (hrv2) 3c proteinase with ag7088 (figure 5) [48]. at that time, ag7088 was in late phase ii clinical trials as a drug for the treatment of the common cold caused by human rhinovirus. the comparison of the crystal structures of hrv2 in complex with ag7088 and of tgev mpro in complex with the hexapeptidyl chloromethyl ketone inhibitor revealed little similarity between the two target enzymes, except in the immediate neighborhood of the catalytic cysteine residue, but an almost perfect match of the inhibitors. to investigate these findings further, ag7088 was docked into the substrate-binding site of the sars-cov mpro model without much difficulty, although it was noted that there could be steric problems with the p-fluorobenzyl group in the s2 pocket and with the ethyl ester moiety in s1′. it was therefore proposed that, although ag7088 was not an ideal inhibitor, this compound should be a good starting point for the design of anti-sars drugs. indeed, only a few days after the on-line publication of these results in sciencexpress [46], it was confirmed that ag7088 had anti-sars activity in vitro.
derivatives of ag7088 with modified p2 residues have since been shown to have ki values in the low micromolar range (rao et al., pers. commun.). the crystal structure of the authentic sars-cov key proteinase was determined a few months later [49]. although the dimeric structure showed the expected similarity to the homologous enzymes of tgev and human cov 229e, there were interesting differences in detail. in particular, one of the monomers in the dimer was observed to be in an inactive conformation, thought to be a result of the low ph of crystallization. the overall rmsd of the entire dimer from the homology model of anand et al. [46] was >3.0 å (i.e. with no residues excluded from the comparison), which dropped to 2.1 å when a few outliers at the carboxy terminus were excluded, and to <1.8 å for each of the three individual domains of the enzyme. other homology models have been generated (d. debe, unpublished and [50]), and virtual screening has been performed using a sars-cov mpro model [51]. taken together, these findings confirm that homology modeling is often inadequate for predicting the mutual orientation of domains in multidomain proteins. however, the homology model generated by anand et al. [46] also shows that a reasonable model of a substrate-binding site can serve to develop useful ideas for inhibitor design, and can inspire medicinal chemists to start a synthesis program long before the 3d structure of the target enzyme is experimentally determined. in the case of g-protein-coupled receptors (gpcrs), homology-modeling approaches are limited by the lack of experimentally determined structures and by the low sequence similarity between the structures that have been characterized and pharmacologically important target proteins. the x-ray crystal structure of only one gpcr, bovine rhodopsin, has been determined [52].
this structure is complemented by that of bacteriorhodopsin, a seven-helix transmembrane protein that is also of relevance for modeling approaches, even though it is not a gpcr. some examples of homology models for gpcrs and their utility have recently been reviewed [53]. high-throughput docking has been applied to verify the ability of homology models to identify agonists (glucocorticoid receptor agonists) [54], antagonists of retinoic acid receptor α [55] and of the d3-dopamine, m1-muscarinic acetylcholine and v1a-vasopressin receptors [56], and inhibitors of thrombin [57]. in the identification of thrombin inhibitors, homology models of thrombin were built retrospectively, based on homologous serine proteases (28-40% sequence identity); the best docking solutions were obtained with models derived from proteins of higher sequence identity. recently, the performance of docking into protein active sites constructed from homology models was assessed using experimental screening datasets for cdk2 and factor viia [58]. when the sequence identity between the model and the template near the binding site was greater than ~50%, approximately fivefold more active compounds were identified than would have been detected randomly. this performance is comparable to docking into crystal structures. a further application of homology models is the design of test assays for the in vitro pharmacological characterization of compounds or for hts. based on the structure of the coiled-coil domain of c-jun, models of α-helical proteins were designed such that they can be used as affinity-tagged proteins that incorporate protease cleavage sites [59].
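the 'approximately fivefold more actives than random' result described above is an enrichment factor. a minimal sketch of how such a figure is computed from a docking-ranked screening library follows; the function name and the choice of denominator are our assumptions, since enrichment-factor conventions vary.

```python
def enrichment_factor(ranked_labels, top_fraction=0.01):
    """enrichment factor for a ranked screen: hit rate in the
    top-scoring fraction divided by the hit rate in the whole
    library. ranked_labels holds 1 (active) / 0 (inactive),
    best docking score first."""
    n = len(ranked_labels)
    n_top = max(1, int(n * top_fraction))
    top_hit_rate = sum(ranked_labels[:n_top]) / n_top
    overall_hit_rate = sum(ranked_labels) / n
    if overall_hit_rate == 0:
        raise ValueError("library contains no actives")
    return top_hit_rate / overall_hit_rate
```

an enrichment factor of ~5 at a given score cutoff corresponds to the fivefold improvement over random selection reported above.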
the resulting 10.5 kda recombinant proteins were synthesized and used as molecularly defined, uniform substrates for the in vitro detection of hiv-1 and iga endoprotease activity, which enabled surface plasmon resonance-based screening of inhibitors. the enormous volume of structure information available on entire target protein families might also have an impact on screening cascades. many drug discovery projects endeavor to identify ligands that are highly selective for particular drug targets. selective compounds are considered superior because they typically lead to fewer adverse side effects (e.g. cox-2 inhibitors). however, it is not always clear, particularly within the larger target protein families, which homologs of the actual target are the most important ones that the desired drug should not hit. the sequence similarity of the full-length proteins or entire domains is not always representative of the conservation of the ligand-binding pockets. comparison of the shape and features of the binding pockets within a protein family could indicate which homologs should be included in the screening cascade for so-called 'counter screening'. the structure information that is currently available on entire protein families (e.g. proteases, kinases and nuclear receptors) could contribute to the design of selective compounds or better screening cascades, both of which could advance the design of drugs with fewer side effects. detailed structural knowledge of the ligand-binding sites of target proteins has also been shown to facilitate the selection of animal models for ex vivo or in vivo experiments. the proposal is that animals whose target proteins have significantly different binding sites from the human orthologs should be excluded as pharmacological models.
many promising compounds showing high potency in human in vitro assays have not reached clinical trials because efficacy could not be demonstrated in animal models. single amino acid differences between humans and animals might, in some cases, be sufficient to cause such effects. the er selectivity of the ligands described by hillisch et al. [31] was shown by homology models and in vitro assays to depend crucially on the interaction of ligand substituents with one particular amino acid that differs between erα and erβ (figure 4a) [31]. to ensure that this important interaction is present in the estrogen receptors of all animal models used to characterize the compounds [32], homology models of murine, rat and bovine erβ were built and compared with the binding pocket of human erβ (herβ). complete conservation of the amino acids within the binding pockets of human, murine and rat erβ was observed. bovine erβ, however, showed one amino acid difference at exactly the position determined to be crucial for erβ ligand selectivity. the prediction that the herβ-selective compounds should not bind to bovine erβ was later verified using transactivation experiments (unpublished results). thus, uninterpretable experiments could be avoided at an early stage, and the otherwise attractive bovine tissues (available in larger amounts) could be excluded from ex vivo investigations. similarly, information on the structure of progesterone receptors (pr) can be used to explain the abolished binding of the progesterone antagonist mifepristone (ru 486) to chicken pr and hamster pr [60]. a single point mutation (human pr gly722 to chicken pr cys575) prevents antiprogestins containing 11β-aryl substituents (e.g. ru 486) from binding to chicken (and hamster) pr (figure 4c), which therefore excludes hamsters, for example, from pharmacological studies with antiprogestins [61].
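the species-selection workflow described above (compare the binding-pocket residues of animal orthologs with the human target before committing to a pharmacological model) can be sketched as a simple residue comparison. the residue positions and amino acids below are invented placeholders, not the actual erβ pocket residues:

```python
def pocket_differences(pockets, reference="human"):
    """given {species: {residue_position: amino_acid}} for an aligned
    ligand-binding pocket, return, per species, the positions where
    that species differs from the reference -- candidate reasons to
    exclude it as a pharmacological model."""
    ref = pockets[reference]
    diffs = {}
    for species, residues in pockets.items():
        if species == reference:
            continue
        mismatched = sorted(pos for pos, aa in residues.items()
                            if ref.get(pos) != aa)
        if mismatched:
            diffs[species] = mismatched
    return diffs
```

a single entry in the returned dictionary, like the single bovine mismatch described above, is enough to flag a species for follow-up before any ex vivo work is planned.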
in the future, such effects could be predicted and particular species excluded from pharmacological studies at an early stage, which would ultimately reduce attrition rates in the drug discovery process. one of the challenges in lead optimization is to identify compounds that not only show high potency at the desired target protein but also have adequate physical properties to reach the systemic circulation, to resist metabolic inactivation for a specific time period and to avoid undesired pharmacological effects. knowledge of the structures of the proteins involved in these processes, such as drug-metabolizing enzymes, transcription factors or transporters, could help in designing molecules that do not interact with these 'non-target' proteins. the cytochrome p450s (cyps) are an extremely important class of enzymes involved in the phase i oxidative metabolism of structurally diverse chemicals. only ~10 hepatic cyps are responsible for the metabolism of 90% of known drugs. recently, the x-ray crystal structures of three mammalian cyps, cyp2c5 [62], cyp2c8 [63] and cyp2c9 [64], have been solved, and these represent a solid basis for homology modeling of the entire superfamily. models of cyp1a2, cyp2a6, cyp2b6, cyp2c8, cyp2c9, cyp2c19, cyp2d6, cyp2e1, cyp3a4 and cyp4a11 have been generated using different structure templates. these models have been used to explain and to predict the probable sites of metabolic attack in a variety of cyp substrates [65-72]. however, the large, lipophilic and highly flexible character of some cyp binding cavities renders purely in silico prediction of the occurrence and site of small-molecule metabolism extremely difficult. if protein structure information is combined with pharmacophoric patterns and quantum mechanical calculations, some predictions concerning the preferred sites of metabolism within small molecules are possible [73].
regarding this aspect of homology modeling, cyp2d6 is a particularly interesting cyp because 5-9% of the caucasian population does not produce this polymorphic member of the cyp superfamily. the resulting deficiencies in drug oxidation can lead to severe side effects in these individuals. predicting whether or not a lead compound could act as a cyp2d6 substrate could help to identify problematic cases early in drug discovery. combined homology modeling and quantitative sar approaches are able to predict such cyp inhibitors [74]. thus, in the future, protein structure information in conjunction with high-throughput docking and pharmacophore-based methods could be used to decide which compounds have the potential to inhibit particular cyps. this approach could facilitate the detection of potential drug-drug interactions early in the drug discovery process, and measures could then be taken to avoid such interactions [75]. cyp substrates and inhibitors are not the only compounds to have been studied using homology models; these approaches have recently been used to investigate cyp inducers as well. the induction of cyps is primarily mediated via the activation of ligand-dependent transcription factors, such as the aryl hydrocarbon receptor (ahr) for the cyp1a family, the constitutive androstane receptor (car) for the cyp2d family, and the pregnane x receptor (pxr), glucocorticoid receptor (gr) and vitamin d receptor (vdr) for the cyp3a family [76]. in principle, the in silico prediction of drug-metabolizing enzyme induction could be reduced to predicting the binding and activation of transcription factors (e.g. ahr and car). however, recent x-ray structure analyses of pxr have shown that the lbd of this nuclear receptor contains a large, lipophilic and flexible binding pocket [77]. this renders purely in silico structure-based prediction of whether or not a small molecule will activate pxr difficult.
the homology modeling of car [78,79] and of other members of the nuclear receptor family involved in cyp induction [80] has recently been described. these models predict reasonably shaped potential ligand-binding pockets; however, further results on their utility are needed. with respect to the structure-based prediction of adverse health effects, progress has been described for the human ether-a-go-go-related gene (herg) product. this tetrameric potassium channel contributes to phase 3 repolarization of heart muscle cells by opposing the depolarizing ca2+ influx during the plateau phase. inhibition of this protein results in cardiovascular toxicity (qt prolongation) and has caused several drugs to be withdrawn from the market. therefore, in silico predictions of the probability of an interaction between a drug and herg have gained enormous attention and have recently been reviewed [81]. homology models of herg, based on the x-ray crystal structures of the bacterial kcsa [82] and mthk [83] channels, have already shed light on some details of the molecular interactions that initiate herg inhibition. however, the complexity of this potassium channel means that detailed x-ray structure analyses of the protein in the open and closed states are required before these molecular interactions can be fully understood and predicted, with implications for the prediction of cardiotoxicity. numerous examples of the successful application of homology modeling in drug discovery are described here. in the absence of experimental structures of drug target proteins, homology models have supported the design of several potent pharmacological agents. one of the advantages of homology models is that they can be generated relatively easily and quickly.
furthermore, such models could support the hypotheses of medicinal chemists on how to generate biologically active compounds in the important early conceptual phase of a drug discovery project. the design of compounds that are selectively directed at particular drug target proteins is one of the strengths of this technique. such selective compounds can even be applied to gain insights into the physiological role of novel drug targets. the in silico protein structure-based prediction of the metabolism and toxicity of small molecules, particularly cyp inhibition and induction and herg inhibition, is currently in its infancy, and predictive capabilities could be limited to classification only. however, while complete experimental structures of pharmacologically important proteins are missing, the homology modeling technique provides one approach to bridge the gap until this information becomes available.

references:
1. modern methods of drug discovery: an introduction
2. the response of protein structures to amino-acid sequence changes
3. fold recognition methods
4. progress in protein structure prediction
5. assessment of homology-based predictions in casp5
6. scop: a structural classification of proteins database for the investigation of sequences and structures
7. databases for protein-ligand complexes
8. ligbase: a database of families of aligned ligand binding sites in known protein sequences and structures
9. pdbsum: summaries and analyses of pdb structures
10. modbase, a database of annotated comparative protein structure models, and associated resources
11. comparative protein modelling by satisfaction of spatial restraints
12. the swiss-model repository of annotated three-dimensional protein structure homology models
13. supporting your pipeline with structural knowledge
14. the relation between the divergence of sequence and structure in proteins
15. casp5 assessment of fold recognition target predictions
16. the role of protein 3d structures in the drug discovery process
17. the impact of structure-guided drug design on clinical agents
18. the druggable genome
19. prediction of 'drug-likeness'
20. a scoring scheme for discriminating between drugs and nondrugs
21. drug-like properties and the causes of poor solubility and poor permeability
22. structural bioinformatics in drug discovery
23. inhibition of the p53-hdm2 interaction with low molecular weight compounds
24. inhibition of the p53-mdm2 interaction: targeting a protein-protein interface
25. a new method to detect related function among proteins independent of sequence and fold homology
26. inferring functional relationships of proteins from local sequence and spatial surface patterns
27. structural classification of protein kinases using 3d molecular interaction field analysis of their ligand binding sites: target family landscapes
28. transcriptional repressor copr: structure model-based localization of the deoxyribonucleic acid binding motif
29. use of a drug-resistant mutant of stress-activated protein kinase 2a/p38 to validate the in vivo specificity of sb 203580
30. chemogenomics: an emerging strategy for rapid target and drug discovery
31. dissecting physiological roles of estrogen receptor alpha and beta with potent selective ligands from structure-based design
32. impact of isotype-selective estrogen receptor agonists on ovarian function
33. structure of the ligand-binding domain of oestrogen receptor beta in the presence of a partial agonist and a full antagonist
34. the protein kinase complement of the human genome
35. design and synthesis of novel tyrosine kinase inhibitors using a pharmacophore model of the atp-binding site of the egf-r
36. rational design of potent and selective egfr tyrosine kinase inhibitors as anticancer agents
37. rational design and synthesis of a novel antileukemic agent targeting bruton's tyrosine kinase (btk), lfm-a13
38. structure-based design of specific inhibitors of janus kinase 3 as apoptosis-inducing antileukemic agents
39. targeting aurora2 kinase in oncogenesis: a structural bioinformatics approach to target validation and rational drug design
40. structure-based generation of a new class of potent cdk4 inhibitors: new de novo design strategy and library design
41. discovery of a potent and selective protein kinase ck2 inhibitor by high-throughput docking
42. establishment and functional validation of a structural homology model for human dna methyltransferase 1
43. synthesis and evaluation of imidazole acetic acid inhibitors of activated thrombin-activatable fibrinolysis inhibitor as novel antithrombotics
44. structure-guided transformation of charybdotoxin yields an analog that selectively targets ca2+-activated over voltage-gated k+ channels
45. structure of coronavirus main proteinase reveals combination of a chymotrypsin fold with an extra alpha-helical domain
46. coronavirus main proteinase (3clpro) structure: basis for design of anti-sars drugs
47. characterization of a novel coronavirus associated with severe acute respiratory syndrome
48. structure-assisted design of mechanism-based irreversible inhibitors of human rhinovirus 3c protease with potent antiviral activity against multiple rhinovirus serotypes
49. the crystal structures of severe acute respiratory syndrome virus main protease and its complex with an inhibitor
50. evaluation of homology modeling of the severe acute respiratory syndrome (sars) coronavirus main protease for structure-based drug design
51. a 3d model of sars cov 3cl proteinase and its inhibitors design by virtual screening
52. crystal structure of rhodopsin: a g-protein-coupled receptor
53. modeling the 3d structure of gpcrs: advances and application to drug discovery
54. nuclear hormone receptor targeted virtual screening
55. rational discovery of novel nuclear hormone receptor antagonists
56. protein-based virtual screening of chemical databases. ii. are homology models of g-protein-coupled receptors suitable targets?
57. docking ligands onto binding site representations derived from proteins built by homology modelling
58. performance of 3d database molecular docking studies into homology models
59. design of helical proteins for real-time endoprotease assays
60. a single amino acid that determines the sensitivity of progesterone receptors to ru486
61. ru486 is not an antiprogestin in the hamster
62. mammalian microsomal cytochrome p450 monooxygenase: structural adaptations for membrane binding and functional diversity
63. structure of human microsomal cytochrome p450 2c8: evidence for a peripheral fatty acid binding site
64. crystal structure of human cytochrome p450 2c9 with bound warfarin
65. molecular modeling of human cytochrome p450-substrate interactions
66. modelling human cytochromes p450 involved in drug metabolism from the cyp2c5 crystallographic template
67. homology modelling of human cyp1a2 based on the cyp2c5 crystallographic template structure
68. homology modelling of cyp2a6 based on the cyp2c5 crystallographic template: enzyme-substrate interactions and qsars for binding affinity and inhibition
69. molecular modelling of cyp2b6 based on homology with the cyp2c5 crystal structure: analysis of enzyme-substrate interactions
70. a molecular model of cyp2d6 constructed by homology with the cyp2c5 crystallographic template: investigation of enzyme-substrate interactions
71. investigation of enzyme selectivity in the human cyp2c subfamily: homology modelling of cyp2c8, cyp2c9 and cyp2c19 from the cyp2c5 crystallographic template
72. prediction of drug metabolism: the case of cytochrome p450 2d6
73. a novel approach to predicting p450 mediated drug metabolism: cyp2d6 catalyzed n-dealkylation reactions and qualitative metabolite predictions using a combined protein and pharmacophore model for cyp2d6
74. competitive cyp2c9 inhibitors: enzyme inhibition studies, protein homology modeling, and three-dimensional quantitative structure-activity relationship analysis
75. molecular basis of p450 inhibition and activation: implications for drug development and drug therapy
76. prediction of human drug metabolizing enzyme induction
77. coactivator binding promotes the specific interaction between ligand and the pregnane x receptor
78. a structural model of the constitutive androstane receptor defines novel interactions that mediate ligand-independent activity
79. insights from a three-dimensional model into ligand binding to constitutive active receptor
80. molecular modelling of the human glucocorticoid receptor (hgr) ligand-binding domain (lbd) by homology with the human estrogen receptor alpha (heralpha) lbd: quantitative structure-activity relationships within a series of cyp3a4 inducers where induction is mediated via hgr involvement
81. predicting undesirable drug interactions with promiscuous proteins in silico
82. a structural basis for drug-induced long qt syndrome
83. characterization of herg potassium channel inhibition using comsia 3d qsar and homology modeling approaches
84. modern methods of drug discovery

acknowledgements: we gratefully acknowledge fruitful discussions with mario lobell (bayer healthcare; http://www.bayerhealthcare.com), derek debe, sean mullen (eidogen) and sunil patel (accelrys).

key: cord-273815-7ftztaqn authors: gupta, r. k.; marks, m.; samuels, t. h. a.; luintel, a.; rampling, t.; chowdhury, h.; quartagno, m.; nair, a.; lipman, m.; abubakar, i.; van smeden, m.; wong, w. k.; williams, b.; noursadeghi, m.
title: systematic evaluation and external validation of 22 prognostic models among hospitalised adults with covid-19: an observational cohort study date: 2020-07-26 journal: nan doi: 10.1101/2020.07.24.20149815 sha: doc_id: 273815 cord_uid: 7ftztaqn

background: the number of proposed prognostic models for covid-19, which aim to predict disease outcomes, is growing rapidly. it is not known whether any are suitable for widespread clinical implementation. we addressed this question by independent and systematic evaluation of their performance among hospitalised covid-19 cases.

methods: we conducted an observational cohort study to assess candidate prognostic models, identified through a living systematic review. we included consecutive adults admitted to a secondary care hospital with pcr-confirmed or clinically diagnosed community-acquired covid-19 (1st february to 30th april 2020). we reconstructed candidate models as per their original descriptions and evaluated performance for their original intended outcomes (clinical deterioration or mortality) and time horizons. we assessed discrimination using the area under the receiver operating characteristic curve (auroc), and calibration using calibration plots, slopes and calibration-in-the-large. we calculated net benefit compared to the default strategies of treating all and no patients, and against the most discriminating predictor in univariable analyses, based on a limited subset of a priori candidates.

results: we tested 22 candidate prognostic models among a cohort of 411 participants, of whom 180 (43.8%) and 115 (28.0%) met the endpoints of clinical deterioration and mortality, respectively. the highest aurocs were achieved by the news2 score for prediction of deterioration over 24 hours (0.78; 95% ci 0.73-0.83), and a novel model for prediction of deterioration <14 days from admission (0.78; 0.74-0.82). calibration appeared generally poor for models that used probability outcomes.
in univariable analyses, admission oxygen saturation on room air was the strongest predictor of in-hospital deterioration (auroc 0.76; 0.71-0.81), while age was the strongest predictor of in-hospital mortality (auroc 0.76; 0.71-0.81). no prognostic model demonstrated consistently higher net benefit than using the most discriminating univariable predictors to stratify treatment, across a range of threshold probabilities.

conclusions: oxygen saturation on room air and patient age are strong predictors of deterioration and mortality among hospitalised adults with covid-19, respectively. none of the prognostic models evaluated offer incremental value for patient stratification beyond these univariable predictors.

coronavirus disease 2019 (covid-19), caused by severe acute respiratory syndrome coronavirus-2 (sars-cov-2), causes a spectrum of disease ranging from asymptomatic infection to critical illness. among people admitted to hospital, covid-19 has a reported mortality of 21-33%, with 14-17% requiring admission to high dependency or intensive care units (icu) [1-4]. exponential surges in transmission of sars-cov-2, coupled with the severity of disease among a subset of those affected, pose major challenges to health services by threatening to overwhelm resource capacity [5]. rapid and effective triage at the point of presentation to hospital is therefore required to facilitate adequate allocation of resources and to ensure that patients at higher risk of deterioration are managed and monitored appropriately. importantly, prognostic models may have additional value in patient stratification for emerging drug therapies [6,7]. as a result, there has been global interest in the development of prediction models for covid-19 [8]. these include models aiming to predict a diagnosis of covid-19, and prognostic models aiming to predict disease outcomes. at the time of writing, a living systematic review had already catalogued 145 diagnostic or prognostic models for covid-19 [8].
critical appraisal of these models using quality assessment tools developed specifically for prediction modelling studies suggests that the candidate models are poorly reported, at high risk of bias and over-estimation of their reported performance 8, 9 . however, independent evaluation of candidate prognostic models in unselected datasets has been lacking. it therefore remains unclear how well these proposed models perform in practice, or whether any are suitable for widespread clinical implementation. we aimed to address this knowledge gap by systematically evaluating the performance of proposed prognostic models, among consecutive patients hospitalised for covid-19 at a single centre. it is made available under a cc-by 4.0 international license. the copyright holder for this preprint is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. this version posted july 26, 2020. https://doi.org/10.1101/2020.07.24.20149815 doi: medrxiv preprint. we used a published living systematic review to identify candidate prognostic models for covid-19 indexed in pubmed, embase, arxiv, medrxiv, or biorxiv until 5th may 2020 8 . we included models that aim to predict clinical deterioration or mortality among patients with covid-19. we also included prognostic scores commonly used in clinical practice [10] [11] [12] , but not specifically developed for covid-19 patients. for each candidate model identified, we extracted predictor variables, outcome definitions (including time horizons), modelling approaches, and final model parameters from original publications, and contacted authors for additional information where required. we excluded scores where the underlying model parameters were not publicly available, since we were unable to reconstruct them, along with models for which included predictors were not available in our dataset.
the latter included models that require computed tomography imaging or arterial blood gas sampling, since these investigations were not routinely performed among unselected patients with covid-19 at our centre. our study is reported in accordance with transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (tripod) guidance for external validation studies 13 . we included consecutive adults admitted to university college hospital london with a final diagnosis of pcr-confirmed or clinically diagnosed covid-19, between 1st february and 30th april 2020. since we sought to use data from the point of hospital admission to predict outcomes, we excluded patients transferred in from other hospitals, and those with hospital-acquired covid-19 (defined as 1st pcr swab sent >5 days from date of hospital admission, as a proxy for the onset of clinical suspicion of sars-cov-2 infection). the composite endpoint of clinical deterioration was defined as the need for ventilatory support (oxygen, invasive mechanical ventilation or extra-corporeal membrane oxygenation) or death, equivalent to world health organization clinical progression scale ≥ 6 14 . the rationale for this composite outcome is to make the endpoint more generalisable between centres, since hospital respiratory management algorithms may vary substantially. each chest radiograph was reported by a single radiologist, reflecting routine clinical conditions, using british society of thoracic imaging criteria, and using a modified version of the radiographic assessment of lung edema (rale) score 15, 16 . participants were followed up clinically to the point of discharge from hospital. we extended follow-up beyond discharge by cross-checking nhs spine records to identify reported deaths post-discharge, thus ensuring >30 days' follow-up for all participants.
for each prognostic model included in the analyses, we reconstructed the model according to the authors' original descriptions, and sought to evaluate the model's discrimination and calibration performance against its original intended endpoint. for models that provide online risk calculator tools, we validated our reconstructed models against the original authors' models, by cross-checking our predictions against those generated by the web-based tools for a random subset of participants. for models that used icu admission or death, or 'severe' covid-19 or death, as composite endpoints, we used our 'clinical deterioration' endpoint as the primary outcome, as defined above. where models specified their intended time horizon in their original description, we used this timepoint in the primary analysis, in order to ensure unbiased assessment of model calibration. where the intended time horizon was not specified, we assessed the model to predict in-hospital deterioration or mortality, as appropriate. for all models, we assessed discrimination by quantifying the area under the receiver operating characteristic curve (auroc) 17 . where the model intercept was not available, we calibrated the model to our dataset by calculating the intercept when using the model linear predictor as an offset term, leading to perfect citl. this approach, by definition, overestimated calibration with respect to citl, but allowed us to examine the calibration slope in our dataset.
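the intercept-recalibration step described above can be sketched as follows. this is a hedged illustration only (the study itself worked in r, and the function name `recalibrate_intercept` is mine): fitting just an intercept, with the published linear predictor held fixed as an offset, forces the mean predicted risk to match the observed event rate, which is exactly why calibration-in-the-large becomes zero by definition.

```python
import math


def recalibrate_intercept(linear_predictor, y, iters=50):
    """Find an intercept a such that mean(sigmoid(a + lp)) equals the
    observed event rate: the maximum-likelihood intercept when the
    model's linear predictor is held fixed as an offset term."""
    a = 0.0
    for _ in range(iters):
        p = [1.0 / (1.0 + math.exp(-(a + lp))) for lp in linear_predictor]
        grad = sum(p) - sum(y)                      # gradient of the negative log-likelihood in a
        hess = sum(pi * (1.0 - pi) for pi in p)     # second derivative (always >= 0)
        if hess == 0:
            break
        a -= grad / hess                            # Newton-Raphson update
    return a
```

after this recalibration, the mean predicted risk equals the observed outcome prevalence, so only the calibration slope remains informative, as the text notes.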
we also assessed the discrimination of each candidate model for standardised outcomes of: (a) our composite endpoint of clinical deterioration; and (b) mortality, across a range of pre-specified time horizons from admission (7 days, 14 days, 30 days and any time during hospital admission), by calculating time-dependent aurocs (with cumulative sensitivity and dynamic specificity) 18 . the rationale for this analysis was to harmonise endpoints, in order to facilitate more direct comparisons of discrimination between the candidate models. in order to further benchmark the performance of candidate prognostic models, we then computed aurocs for a limited number of univariable predictors considered to be of highest importance a priori, based on clinical knowledge and existing data, for prediction of our composite endpoints of clinical deterioration and mortality (7 days, 14 days, 30 days and any time during hospital admission). the a priori predictors of interest examined in this analysis were age, clinical frailty scale, oxygen saturation at presentation on room air, c-reactive protein and absolute lymphocyte count 8, 19 . we performed decision curve analyses to quantify the net benefit achieved by each model for predicting the intended endpoint, in order to inform clinical decision making across a range of risk:benefit ratios for an intervention or 'treatment' 20 . in this approach, the risk:benefit ratio is analogous to the cut point for a statistical model above which the intervention would be considered beneficial (deemed the 'threshold probability'). net benefit was calculated as: net benefit = sensitivity × prevalence - (1 - specificity) × (1 - prevalence) × w, where w is the odds at the threshold probability and the prevalence is the proportion of patients who experienced the outcome 20 . we calculated net benefit across a range of clinically relevant threshold probabilities, ranging from 0 to 0.5, since the risk:benefit ratio may vary for any given intervention (or 'treatment').
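the net benefit formula above is simple to compute directly at a single threshold probability. the sketch below is in python rather than the authors' r/rmda workflow, and the function and variable names are mine:

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients with predicted risk >= threshold,
    using the formula from the text:
    NB = sens * prev - (1 - spec) * (1 - prev) * w,  with w = pt / (1 - pt)."""
    n = len(y_true)
    pos = sum(y_true)
    neg = n - pos
    prev = pos / n
    # confusion counts at this threshold
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    sens = tp / pos if pos else 0.0
    spec = 1.0 - fp / neg if neg else 1.0
    w = threshold / (1.0 - threshold)   # odds at the threshold probability
    return sens * prev - (1.0 - spec) * (1.0 - prev) * w
```

note that sens × prev reduces to tp/n and (1 − spec) × (1 − prev) to fp/n, so this is the familiar tp/n − (fp/n) × w form of net benefit; sweeping `threshold` from 0 to 0.5 reproduces a decision curve.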
we compared the utility of each candidate model against strategies of treating all and no patients, and against the best performing univariable predictor for in-hospital clinical deterioration, or mortality, as appropriate. we calculated 'delta' net benefit as net benefit when using the index model minus net benefit when: (a) treating all patients; and (b) using the most discriminating univariable predictor. decision curve analyses were done using the rmda package in r 21 . we handled missing data using multiple imputation by chained equations 22 , using the mice package in r 23 . all variables in the final prognostic models were included in the imputation model to ensure compatibility 22 . a total of 10 imputed datasets were generated; discrimination and calibration metrics were pooled using rubin's rules 24 . individual predictions for each prognostic model were averaged across imputations for each participant in order to generate pooled calibration plots, roc curves and decision curves. all analyses were conducted in r (version 3.5.1). we recalculated discrimination and calibration parameters for each candidate model using a complete case analysis. we also examined for non-linearity in the a priori univariable predictors using restricted cubic splines, with 3 knots. finally, we estimated optimism for discrimination and calibration parameters for the a priori univariable predictors using bootstrapping (1,000 iterations), using the rms package in r 25 .
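the rubin's rules pooling mentioned above combines one estimate (e.g. an auroc or calibration slope) per imputed dataset into a single estimate and standard error. the study used mice in r; the sketch below is a minimal, dependency-free python illustration, with a function name of my own choosing:

```python
import math


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their variances with Rubin's rules.

    Returns the pooled estimate and its standard error, where total
    variance = within-imputation variance + (1 + 1/m) * between-imputation
    variance across the m imputed datasets."""
    m = len(estimates)
    qbar = sum(estimates) / m                                # pooled estimate
    w = sum(variances) / m                                   # within-imputation variance
    b = sum((q - qbar) ** 2 for q in estimates) / (m - 1)    # between-imputation variance
    t = w + (1.0 + 1.0 / m) * b                              # total variance
    return qbar, math.sqrt(t)
```

the between-imputation term inflates the pooled standard error to reflect the uncertainty added by the missing data, which is why pooled confidence intervals are wider than any single imputed dataset would suggest.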
we identified a total of 37 studies describing prognostic models, of which 19 studies (including 22 unique models) were eligible for inclusion (supplementary figure 1 and table 1 ). of these, 5 models were not specific to covid-19, but were developed as prognostic scores for emergency department attendees 26 , hospitalised patients 12, 27 , people with suspected infection 10 or community-acquired pneumonia 11 , respectively. of the 17 models developed specifically for covid-19, most (10/17) were developed using datasets originating in china. a total of 13/22 models use points-based scoring systems to derive final model scores, with the remainder using logistic regression modelling approaches to derive probability estimates. a total of 12/22 prognostic models primarily aimed to predict clinical deterioration, while the remaining 10 sought to predict mortality alone. when specified, time horizons for prognosis ranged from 1 to 30 days. during the study period, 521 adults were admitted with a final diagnosis of covid-19, of whom 411 met the eligibility criteria for inclusion (supplementary figure 2) . median age of the cohort was 66 years (interquartile range (iqr) 53-79), and the majority were male (252/411; 61.3%). table 2 summarises the characteristics of the cohort. figure 4 shows missingness of each prognostic model in the complete case dataset, stratified by the outcomes of interest, due to unavailability of predictor variables.
table 3 shows discrimination and calibration metrics, where appropriate, for the 22 evaluated prognostic models in the primary multiple imputation analysis. the highest aurocs were achieved by the news2 score for prediction of deterioration over 24 hours (0.78; 95% ci 0.73-0.83), and the carr 'final' model for prediction of deterioration within 14 days of admission (0.78; 0.74-0.82). for all models that provide probability scores for either deterioration or mortality, calibration appeared visually poor with evidence of overfitting and either systematic overestimation or underestimation of risk ( figure 1 ). supplementary figure 6 shows associations between prognostic models with points-based scores and actual risk. in addition to demonstrating reasonable discrimination, the news2 and curb65 models demonstrated approximately linear associations between scores and actual probability of deterioration at 24 hours and mortality at 30 days, respectively. next, we sought to compare the discrimination of these models for different outcomes across the range of time horizons, benchmarked against preselected univariable predictors associated with adverse outcomes in covid-19 8, 19 . we recalculated time-dependent aurocs for each of these outcomes, stratified by time horizon to the outcome ( supplementary figures 7 and 8 ). these analyses showed that aurocs generally declined with increasing time horizons. admission oxygen saturation on room air was the strongest predictor of in-hospital deterioration (auroc 0.76; 95% ci 0.71-0.81), while age was the strongest predictor of in-hospital mortality (auroc 0.76; 95% ci 0.71-0.81). we compared net benefit for each prognostic model (for its original intended endpoint) to the strategies of treating all patients, treating no patients, and using the most discriminating univariable predictor for either deterioration (i.e. oxygen saturation on air) or mortality (i.e. patient age) to stratify treatment (supplementary figure 9 ).
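the auroc figures reported for the univariable predictors have a simple rank interpretation: the probability that a randomly chosen case with the outcome is scored higher than a randomly chosen case without it. a minimal illustration follows (my own helper, not the authors' code; for a predictor like oxygen saturation, where lower values signal higher risk, the negated value would be passed):

```python
def auroc(y_true, scores):
    """AUROC via the Mann-Whitney statistic: the fraction of
    (event, non-event) pairs in which the event case scores higher,
    counting ties as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

a value of 0.5 corresponds to a predictor with no discrimination, while the 0.76 reported for age and oxygen saturation indicates that an event case outranks a non-event case roughly three times out of four.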
although all prognostic models showed greater net benefit than treating all patients at the higher range of threshold probabilities, none of these models demonstrated consistently greater net benefit than the most discriminating univariable predictor, across the range of threshold probabilities ( figure 2 ). recalculation of model discrimination and calibration metrics for prediction of the original intended endpoint using a complete case analysis revealed similar results to the primary multiple imputation approach (supplementary table 1 ). visual examination of associations between the most discriminating univariable predictors and log odds of deterioration or death using restricted cubic splines showed no evidence of non-linear associations (supplementary figure 10) . finally, internal validation using bootstrapping showed near zero optimism for discrimination and calibration parameters for the univariable models (supplementary table 2 ). in this observational cohort study of consecutive adults hospitalised with covid-19, we systematically evaluated the performance of 22 prognostic models for covid-19. these included models developed specifically for covid-19, along with existing scores in routine clinical use prior to the pandemic. for prediction of both clinical deterioration and mortality, discrimination appeared modest or poor for most models.
news2 performed reasonably well for prediction of deterioration over a 24-hour interval, achieving an auroc of 0.78, while the carr 'final' model 29 also had reasonable discrimination (auroc 0.78), but tended to systematically underestimate risk. all covid-specific models that derived an outcome probability of either deterioration or mortality showed poor calibration. we found that oxygen saturation (auroc 0.76) and patient age (auroc 0.76) were the most discriminating single variables for prediction of in-hospital deterioration and mortality respectively. these predictors have the added advantage that they are immediately available at the point of presentation to hospital. in decision curve analysis, no prognostic model demonstrated clinical utility consistently greater than using oxygen saturation on room air to predict deterioration, or patient age to predict mortality. while previous studies have largely focused on novel model discovery, or evaluation of a limited number of existing models, this is the first study to our knowledge to evaluate systematically-identified candidate prognostic models for covid-19. we used a comprehensive living systematic review 8 to identify eligible models and sought to reconstruct each model as per the original authors' description. we then evaluated performance against its intended outcome and time horizon, wherever possible, using recommended methods of external validation incorporating assessments of discrimination, calibration and net benefit 17 . moreover, we used a robust approach of electronic health record data capture, supported by manual curation, in order to ensure a high-quality dataset, and inclusion of unselected and consecutive covid-19 cases that met our eligibility criteria. 
in addition, we used robust outcome measures of mortality and clinical deterioration, aligning with the who clinical progression scale. a weakness of the current study is that it is based on data from a single centre, and therefore cannot assess between-setting heterogeneity in model performance. second, due to the limitations of routinely collected data, predictor variables were available for varying numbers of participants for each model. we therefore performed multiple imputation in our primary analyses, in keeping with recommendations for development and validation of multivariable prediction models 30 . findings were similar in the complete case sensitivity analysis, thus supporting the robustness of our results. third, a number of models could not be reconstructed in our data. for some models, this was due to the absence of predictors in our dataset, such as those requiring computed tomography imaging, since this is not currently routinely recommended for patients with suspected or confirmed covid-19 16 . we were also not able to include models for which the parameters were not publicly available. this underscores the need for strict adherence to reporting standards in multivariable prediction models 13 . finally, we used admission data only as predictors in this study, since most prognostic scores are intended to predict outcomes at the point of hospital admission. we note, however, that some scores (such as news2) are designed for dynamic in-patient monitoring. future studies may integrate serial data to examine model performance when using such dynamic measurements.
despite the vast global interest in the pursuit of prognostic models for covid-19, our findings show that no covid-19-specific models can currently be recommended for routine clinical use. all novel prognostic models for covid-19 assessed in the current study were derived from single-centre data. future studies may seek to pool data from multiple centres in order to robustly evaluate the performance of existing models across heterogeneous populations, and to develop and validate novel prognostic models, through individual participant data meta-analysis 31 . such an approach would allow assessments of between-study heterogeneity and the likely generalisability of candidate models. it is also imperative that discovery populations are representative of target populations for model implementation, with inclusion of unselected cohorts. moreover, we strongly advocate for transparent reporting in keeping with tripod standards (including modelling approaches, all coefficients and standard errors) along with standardisation of outcomes and time horizons, in order to facilitate ongoing systematic evaluations of model performance and clinical utility 13 . we conclude that baseline oxygen saturation on room air and patient age are strong predictors of deterioration and mortality, respectively. none of the prognostic models evaluated in this study offer incremental value for patient stratification beyond these univariable predictors. therefore, none of the evaluated prognostic models for covid-19 can be recommended for routine clinical implementation. future studies seeking to develop prognostic models for covid-19 should consider integrating multi-centre data in order to increase generalisability of findings, and should ensure benchmarking against existing models and simpler univariable predictors. the uclh covid-19 reporting group comprised the following individuals, who were involved in data curation as non-author contributors: asia ahmed, ronan astin, malcolm avari, elkie benhur, and others. the funder had no role in the writing of the report; or in the decision to submit the article for publication. all authors have completed the icmje uniform disclosure form at www.icmje.org/coi_disclosure.pdf and declare: non-financial support from aidence bv (dr nair), outside the submitted work; no support from any organisation outside those declared above for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work. this study was approved by east midlands - nottingham 2 research ethics committee (ref: 20/em/0114). the conditions of regulatory approvals for the present study preclude open access data sharing to minimise risk of patient identification through granular individual health record data. the authors will consider specific requests for data sharing as part of academic collaborations subject to ethical approval and data transfer agreements in accordance with gdpr regulations.
1 mews = modified early warning score; qsofa = quick sequential (sepsis-related) organ failure assessment; rems = rapid emergency medicine score; 2 news = national early warning score; tactic = therapeutic study in pre-icu patients admitted with covid-19; avpu = alert / responds to voice / responds to pain / unresponsive.
for each plot, the blue line represents a loess-smoothed calibration curve, and scatter points show quartiles of predicted risk. rug plots indicate the distribution of data points. no model intercept was available for the caramelo or colombi 'clinical' models; the intercepts for these models were calibrated to the validation dataset, by using the model linear predictors as offset terms. calibration-in-the-large is therefore not shown for these models, since it is zero by definition. the primary outcome of interest for each model is shown in the plot sub-heading. individual predictions for each prognostic model were averaged across imputations for each participant in the dataset in order to generate these pooled calibration plots. for each analysis, the endpoint is the original intended outcome and time horizon for the index model. delta net benefit is calculated as net benefit when using the index model minus net benefit when: (1) treating all patients; and (2) using the most discriminating univariable predictor. the most discriminating univariable predictor is admission oxygen saturation (spo2) on room air for deterioration models and patient age for mortality models. individual predictions for each prognostic model were averaged across imputations for each participant in the dataset in order to generate pooled decision curve plots. delta net benefit is shown with loess-smoothing. black dashed line indicates the threshold above which the index model has greater net benefit than the comparator.
full decision curves for each candidate model are shown in supplementary figure 9 .
references:
presenting characteristics, comorbidities, and outcomes among 5700 patients hospitalized with covid-19 in the new york city area
features of 20 133 uk patients in hospital with covid-19 using the isaric who clinical characterisation protocol: prospective observational cohort study
critical care utilization for the covid-19 outbreak in lombardy
report 17 - clinical characteristics and predictors of outcomes of hospitalised patients with covid-19 in a london nhs trust: a retrospective cohort study (imperial college london)
the demand for inpatient and icu beds for covid-19 in the us: lessons from chinese cities
remdesivir for the treatment of covid-19 - preliminary report
effect of dexamethasone in hospitalized patients with covid-19: preliminary report
prediction models for diagnosis and prognosis of covid-19: systematic review and critical appraisal
probast: a tool to assess the risk of bias and applicability of prediction model studies
assessment of clinical criteria for sepsis
defining community acquired pneumonia severity on presentation to hospital: an international derivation and validation study
transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (tripod): the tripod statement
who working group on the clinical characterisation and management of covid-19 infection
frequency and distribution of chest radiographic findings in covid-19 positive patients
time-dependent roc curve analysis in medical research: current methods and applications
the effect of frailty on survival in patients with covid-19 (cope): a multicentre, european, observational cohort study
a simple, step-by-step guide to interpreting decision curve analysis
multiple imputation using chained equations: issues and guidance for practice
mice: multivariate imputation by chained equations in r
multiple imputation for nonresponse in surveys
regression modeling strategies
rapid emergency medicine score: a new prognostic tool for in-hospital mortality in nonsurgical emergency department patients
the ability of the national early warning score (news) to discriminate patients at risk of early cardiac
figure legend (models and intended outcomes): hu, mortality (in-hospital); zhang_death, mortality (in-hospital); zhang_poor, deterioration (in-hospital); caramelo, mortality (in-hospital); carr_final, deterioration (14 days); carr_threshold, deterioration (14 days); colombi_clinical, deterioration (in-hospital).
key: cord-287145-w518a0wa authors: habib, nahida; hasan, md. mahmodul; reza, md. mahfuz; rahman, mohammad motiur title: ensemble of chexnet and vgg-19 feature extractor with random forest classifier for pediatric pneumonia detection date: 2020-10-30 journal: sn comput sci doi: 10.1007/s42979-020-00373-y sha: doc_id: 287145 cord_uid: w518a0wa pneumonia, an acute respiratory infection, causes serious breathing hindrance by damaging one or both lungs. recovery of pneumonia patients depends on the early diagnosis of the disease and proper treatment. this paper proposes an ensemble method for pneumonia diagnosis from chest x-ray images. the deep convolutional neural networks (cnns) chexnet and vgg-19 are trained and used to extract features from given x-ray images. these features are then ensembled for classification. to overcome the data imbalance problem, random under sampler (rus), random over sampler (ros) and synthetic minority oversampling technique (smote) are applied on the ensembled feature vector.
the ensembled feature vector is then classified using several machine learning (ml) classification techniques (random forest, adaptive boosting, k-nearest neighbors). among these methods, random forest achieved better performance metrics than the others on the available standard dataset. comparison with existing methods shows that the proposed method attains improved classification accuracy and auc values, outperforming all other models with 98.93% accurate prediction. the model also exhibits potential generalization capacity when tested on a different dataset. the outcomes of this study can be of great use for pneumonia diagnosis from chest x-ray images. pneumonia is an infection of one or both lungs in which the alveoli may fill with fluid or pus, causing cough and fever, and making it difficult to breathe. it is a major cause of morbidity and mortality for infants under 2 years old and people over 60 years old in many countries. every year in the us alone, more than 250,000 people are hospitalized and around 50,000 die [1] . pneumonia causes the death of more than 800,000 children, a death toll greater than that of malaria, aids and measles combined [2, 3] . according to a study conducted by a uk-based ngo in india [4] , around 4 million people could have been saved if appropriate actions had been taken. early and accurate diagnosis and treatment of pneumonia is thus of utmost importance. pneumonia can be diagnosed by blood tests, chest radiography, lung ultrasound, ct scans, pulse oximetry and bronchoscopy [5] . advances in artificial intelligence (ai), especially machine learning (ml) and deep learning (dl) algorithms, have driven robust improvements in the automatic diagnosis of diseases. computer aided diagnosis (cad) tools make disease detection and prediction easier, cheaper and more accessible. cad refers to a diagnosis made by a radiologist who takes the results of computer analysis into account [6] .
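the pipeline summarised above (extract deep features from two cnns, concatenate them, rebalance the classes, then classify) can be sketched schematically. the sketch below is a hedged, dependency-free stand-in, with helper names of my own: `ensemble_features` mimics the concatenation step and `random_oversample` mimics the ros step; in the actual study the feature vectors would come from chexnet and vgg-19 and the classifier would be a random forest (e.g. scikit-learn's `RandomForestClassifier`), while smote would interpolate between minority-class neighbours rather than duplicate rows.

```python
import random
from collections import Counter


def ensemble_features(feats_a, feats_b):
    """Concatenate per-image feature vectors from two CNN extractors."""
    return [a + b for a, b in zip(feats_a, feats_b)]


def random_oversample(features, labels, seed=0):
    """Balance classes by resampling minority-class rows with replacement
    (a minimal stand-in for the ROS step described in the text)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(features), list(labels)
    for cls, n in counts.items():
        rows = [x for x, y in zip(features, labels) if y == cls]
        for _ in range(target - n):           # top up each class to the majority size
            out_x.append(rng.choice(rows))
            out_y.append(cls)
    return out_x, out_y
```

after rebalancing, each class contributes the same number of rows, so the downstream classifier is not biased toward the majority (pneumonia) class.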
cad methods using deep convolutional neural networks (cnns) and ml are now used in the successful diagnosis of many diseases: lung cancer, pneumonia, covid-19, tuberculosis, breast cancer, colon cancer, prostate cancer, coronary artery disease, congenital heart defects, brain diseases, skin lesions, etc. chest x-ray is one of the most commonly used painless and non-invasive radiological tests to screen and diagnose many lung diseases, although other methods such as ct and mri can also be used [7]. as chest x-ray is faster, easier and less expensive than ct and mri, it is mostly used in the emergency diagnosis and treatment of lung, heart and chest wall diseases. chexnet is a 121-layer convolutional neural network model proposed by researchers at stanford university to diagnose pneumonia. the model is trained on the chestx-ray14 dataset and diagnoses all 14 pathologies of the dataset with the best results [8]. vgg-19 is a 19-layer trained convolutional neural network developed by the visual geometry group of oxford university. this cnn architecture contains 16 convolutional layers and 3 fully connected layers. this paper proposes an ensemble technique of two cnn models, fine-tuned chexnet and vgg-19, for the diagnosis of pediatric pneumonia from chest x-ray images. the kermany dataset [9] of 5856 chest x-ray images is used here for this purpose. cnn features from the two models are collected and ensembled. the numbers of pneumonia and normal images in the dataset are not the same, so, to balance the dataset, random under sampler (rus), random over sampling (ros) and smote oversampling techniques are applied to the ensembled features. for the detection and classification of pneumonia versus normal images, different ml algorithms (random forest (rf), adaptive boosting (adaboost) and k-nearest neighbors (knn)) are afterwards applied to the features. among the models, rf achieves the best classification accuracy of almost 99%. the rest of the paper is organized as follows.
related works are mentioned in the "related works" section. the proposed method of this paper is described in the "proposed methodology" section. the "results and discussion" section describes and discusses the results. finally, the conclusion and future work of this research can be found in the "conclusion" section. this section highlights the studies done by other researchers related to this research. modern technology has made diagnosis and treatment easier and more convenient than before. the availability of large datasets and the success of deep learning have made the diagnosis task more accurate. the authors of ref. [5] mention that at least 2500 children die of pneumonia every day and that it is the leading cause of childhood death. rudan et al. [10] report that, on an annual basis, more than 150 million people get infected with pneumonia, especially children under 5 years old. three types of diseases, lobar pneumonia, pulmonary tuberculosis and lung cancer, are discriminated from chest x-rays by the authors of ref. [11] in their paper. ghimire et al. [12] indicated that influenza was the cause of 10% of all childhood pneumonia and that 28% of all children with influenza developed pneumonia in bangladesh. the rate of pneumonia is higher among children from poor families with unhygienic living conditions than among those from economically stable families. for the successful diagnosis of pneumonia, various deep learning cnn models have already been developed and are still being developed to get more accurate results. chest x-ray is an easy-to-use medical imaging and diagnostic technique performed by expert radiologists to diagnose pneumonia, tuberculosis, interstitial lung disease, and early lung cancer [13]. stephen et al. [14] proposed a deep learning method for pneumonia classification. pneumocad is a computer aided diagnosis system developed by ref. [15] that uses handcrafted features to diagnose pediatric pneumonia from chest x-ray images. vamsha deepa et al.
[5] proposed a feature extraction method that classifies normal lungs and pneumonia-infected lungs from x-ray images. the method extracts haralick texture features from x-rays and detects pneumonia with an accuracy of 86%. to detect clouds of pneumonia in chest x-rays, [16] used the otsu thresholding method, which separates the healthy part of the lung from the pneumonia-infected part. based on segmentation, feature extraction and artificial neural networks, [17] defined a method for the classification of lung diseases such as tuberculosis, pneumonia and lung cancer from chest radiographs. glcm features, haralick features and a congruency parameter are used by some authors for pneumonia detection. in rapid childhood pneumonia diagnosis, wavelet-augmented analysis of the cough sound is used by ref. [18], while ref. [19] proposed a deep convolutional neural network with transfer learning for detecting pneumonia on chest x-rays. for pediatric pneumonia diagnosis, a transfer learning method with a deep residual network is proposed by ref. [7]. the national institutes of health (nih) cxr dataset released by ref. [20] comprises 112,120 frontal cxrs, individually labeled to include up to 14 different diseases. to create these labels from the associated radiological reports, the authors used natural language processing to text-mine disease classifications, with an expected accuracy of more than 90%. the chexnet deep cnn model uses this nih cxr dataset and is said to exceed average radiologist performance on the pneumonia detection task [8]. both the vgg-19 and chexnet image classification models accept input images of size 224 × 224. in this paper, an ensemble of two deep cnn models (chexnet and vgg-19) is proposed with transfer learning and fine-tuning. the features collected from the ensembled models are then balanced and fed to the ml model for the successful and accurate classification of pneumonia and non-pneumonia (normal) images.
the proposed model is also tested on a different dataset [21] for generalization. every step of the proposed methodology, from data collection to the classification of the x-ray images, is discussed here. the proposed methodology includes image preprocessing using an image enhancement technique and resizing of images, augmentation of training images, fine-tuning the cnn models, model training, extraction of the cnns' feature vectors, ensembling of the extracted feature vectors, dataset imbalance handling and pneumonia classification using different machine learning algorithms. figure 1 shows the overall procedure of the proposed methodology. table 1 describes the datasets used to train and evaluate the proposed method. this study uses two kaggle datasets: one is the dataset of ref. [9], collected from guangzhou women and children's medical center, used to train, test and validate the cnn models' performance, and the other is the dataset of ref. [21], used to validate the proposed method's performance. the dataset [9] contains 5856 chest x-ray images collected by x-ray scanning of pediatric patients between 1 and 5 years old. this dataset also comes with the ground truth of each x-ray image, and the data is distributed into train, validation and test folders [22]. there are 4273 images of pneumonia patients and the rest are chest x-ray images of healthy people. the other dataset [21], named 'covid-19 chest x-ray database', from a kaggle challenge and the 'winner of the covid-19 dataset award', is used to evaluate the generalization performance of the proposed method. the dataset contains chest x-ray images of covid-19 patients, normal healthy people and viral pneumonia patients. there are 219 covid-19 positive images, 1341 normal images and 1345 viral pneumonia images. but as this research only focuses on the diagnosis of pneumonia, only the normal and pneumonia images are selected and tested on the proposed final model.
this section narrates the image preprocessing operations performed on the x-ray images. the preprocessing consists of two different tasks: image enhancement and data augmentation. different image enhancement techniques, such as gabor filtering, local binary patterns, histogram equalization and adaptive histogram equalization, were applied to the x-ray images. among them, adaptive histogram equalization (ahe) does well in cnn-based feature extraction. the ahe technique enhances the contrast of the images. to retrain a transfer-learned cnn model, the x-ray images need to be rescaled as required by the model. for training the transfer-learned, fine-tuned chexnet and vgg-19 models, the images are resized to 224 × 224. besides that, the pixel values are rescaled into the 0-1 range to match the models' input type. training deep cnn models requires an extensive amount of labeled data. a model trained with limited data may perform well on the training set but fail to generalize. this overfitting to the training examples leads to poor performance on the test dataset. increasing the dataset size is an excellent way to reduce this overfitting or memorization problem. data augmentation, the artificial increase of data from the available data, is one of the best practices in the research community to reduce overfitting. this study utilizes geometric translations of images to increase the dataset size. shearing, zooming, flipping and shifting (width shift and height shift) are the operations applied for data augmentation. the fine-tuned chexnet model combines convolution, relu activation and batch normalization layers. figure 2 shows the architectural design of the fine-tuned chexnet model. the vgg-19 model consists of five blocks, each containing a few convolution layers followed by a max-pooling layer. for the convolution operation, vgg-19 uses 3 × 3 kernels with 64, 128, 256, 512 and 512 channels in its five convolution blocks, respectively.
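a minimal sketch of the enhancement and rescaling steps described above. plain (global) histogram equalization is used here as a simplified stand-in for the adaptive, tile-wise variant (ahe) the pipeline actually uses, and the tiny nested-list "image" is purely illustrative:

```python
def equalize_histogram(img, levels=256):
    """plain histogram equalization: spread the pixel intensities over the
    full range by mapping each gray level through the normalized cdf.
    (a simplified stand-in for the adaptive, tile-wise ahe variant.)"""
    flat = [p for row in img for p in row]
    n = len(flat)
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    cdf, total = [], 0
    for h in hist:
        total += h
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    # lookup table: normalized cumulative distribution scaled to [0, levels-1]
    lut = [round((cdf[g] - cdf_min) / max(n - cdf_min, 1) * (levels - 1))
           for g in range(levels)]
    return [[lut[p] for p in row] for row in img]

def rescale01(img, levels=256):
    """rescale pixel values into the 0-1 range expected by the cnn inputs."""
    return [[p / (levels - 1) for p in row] for row in img]
```

a low-contrast patch such as `[[100, 100], [101, 102]]` gets stretched over the full 0-255 range before being rescaled to 0-1; in the real pipeline this runs per image before resizing to 224 × 224.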
figure 3 represents the architecture of the fine-tuned vgg-19 model. both models are fine-tuned for the detection and classification of pneumonia images. during fine-tuning, the following modifications are performed on the models before retraining them on dataset [9]. for fine-tuning chexnet, the classifier part of the pre-trained chexnet model is first removed. the redesigned classifier uses 512, 128 and 64 neurons in its hidden dense layers. the output layer of the model has one neuron to classify images into two classes (normal and pneumonia). each dense layer (except the output layer) of the classifier is followed by a dropout layer. dropout reduces the capacity of the model during training and guards it against overfitting. the dense layers use the relu activation function while the output layer uses the sigmoid activation function for binary classification. fine-tuning vgg-19 includes retraining only the last convolution block of vgg-19 while freezing the first four blocks of the model. instead of using a flattening layer after the feature extractor, this research applies global average pooling to reduce the number of learnable parameters. both models use the adam optimizer with binary cross-entropy as the loss function. the learning rate of the models is adjusted based on validation accuracy, and the training procedure uses learning rate decay. a dataset with a severe skew in the distribution of data among classes is a tricky situation known as dataset imbalance. an inconsistent distribution of data may produce a biased model that generalizes poorly. as the dataset [9] used here contains more pneumonia images than normal images, though the difference is not severe, the model may become slightly biased towards pneumonia detection.
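to see why global average pooling shrinks the classifier head, here is a minimal sketch; the tensor is a nested h × w × c list with made-up sizes, not the actual chexnet/vgg-19 feature-map shapes:

```python
def global_average_pooling(feature_maps):
    """collapse each h x w feature map to its mean, turning an h x w x c
    tensor into a length-c vector. unlike flattening (h * w * c inputs to
    the first dense layer), the output size is independent of h and w, so
    the classifier head needs far fewer learnable parameters."""
    h = len(feature_maps)
    w = len(feature_maps[0])
    c = len(feature_maps[0][0])
    out = [0.0] * c
    for row in feature_maps:
        for pixel in row:
            for k, v in enumerate(pixel):
                out[k] += v
    return [s / (h * w) for s in out]
```

a 2 × 2 × 2 toy tensor pools down to just 2 values, one per channel, regardless of spatial size.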
thus, as dataset balancing techniques, random under sampling (rus), random over-sampling (ros) and the synthetic minority oversampling technique (smote) are applied to the ensembled feature vector to balance the dataset. rus randomly deletes images from the majority class, ros increases the dataset size by randomly reproducing the minority class data, and smote randomly creates new minority class points by interpolating between existing minority points and their neighbors [23]. all of the techniques produce the same number of images for each class. among the techniques, ros performs a little better than rus and smote. the classification techniques used for pneumonia detection from the feature vector are discussed in this section. this study experimented with different ml classification methods, both tree-based and non-tree-based. random forest uses the bagging ensemble technique for classification. the rf classifier contains decision trees (dts) as its building blocks. each decision tree is trained on a randomly selected sample of the dataset, which facilitates training uncorrelated dts. the outputs of all dts are then combined to make the final decision. figure 4 illustrates the random forest classifier. adaptive boosting, or adaboost, is an ensemble method that aims to create a strong classifier from a number of weak classifiers. in adaboost, a base model is trained first, then further models are added, each trying to correct the errors of the previous one. this research uses logistic regression (lr), dts and support vector machines (svms) as the base models of adaboost. adaboost with lr works best here. k-nearest neighbors is a simple supervised algorithm that can be used for both classification and regression problems. knn stores all available cases and classifies new cases based on a distance function. using 7 neighbors gives the best result in this research.
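the balancing step can be sketched as follows; this is a minimal illustration of ros and of the smote interpolation rule described above, with feature vectors as plain tuples (the real pipeline operates on the ensembled cnn feature vectors):

```python
import random

def random_oversample(minority, target_size, rng):
    """ros: replicate randomly drawn minority samples until the minority
    class reaches target_size (the majority class size)."""
    out = list(minority)
    while len(out) < target_size:
        out.append(rng.choice(minority))
    return out

def smote_sample(minority, k, rng):
    """smote: create one synthetic point by interpolating between a random
    minority point and one of its k nearest minority neighbors."""
    base = rng.choice(minority)
    neighbors = sorted((p for p in minority if p is not base),
                       key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))[:k]
    neigh = rng.choice(neighbors)
    lam = rng.random()  # interpolation factor in [0, 1)
    return tuple(a + lam * (b - a) for a, b in zip(base, neigh))
```

every synthetic smote point lies on the segment between two real minority points, so it stays inside the minority class's region of feature space, unlike plain replication.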
among these different methods, the random forest (rf) classifier achieved better performance than the others. rf is an ensemble of high-variance, low-bias dts in which the output of each dt acts as a vote for the output class. ensembling these dts creates a final model with low bias and moderate variance; thus, rf delivered better performance here than the other classifiers. this section justifies each step of the proposed methodology by discussing the experimental results of each method and showing comparisons with other pneumonia detection models. both x-ray image datasets, [9] and [21], are collected and preprocessed using the techniques mentioned in the "data preprocessing" section. as training a cnn model with a huge amount of data helps to make the model more accurate, generalizable and robust, augmentation is applied only to the training datasets to artificially increase the number of training images, to obtain more varied images and to mitigate overfitting. moreover, the proposed method is also tested on a different dataset [21] to ensure the strong generalization ability of the model. figures 5 and 6 display the preprocessed images of datasets [9] and [21]. the images are labeled as 0 and 1 for normal and pneumonia, respectively. the cnn models chexnet and vgg-19 used in this research were trained using the preprocessed chest x-ray images of size 224 × 224. both models are then tested and validated on the 624 test and 16 validation images of dataset [9]. chexnet achieves 92.63% test accuracy and 100% validation accuracy, while vgg-19 obtains 89.26% test accuracy and 50% validation accuracy. the ensemble of the two models helps to increase the overall performance. figure 7a, b shows the validation performance of the chexnet and vgg-19 models on the validation dataset.
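the rf decision rule described above, each tree votes and the forest takes the majority, can be sketched in a few lines (a toy illustration, not the fitted classifier from the paper):

```python
import random
from collections import Counter

def bootstrap_sample(dataset, rng):
    """draw a same-size sample with replacement; training each tree on its
    own bootstrap sample is what decorrelates the trees in a random forest."""
    return [rng.choice(dataset) for _ in dataset]

def forest_predict(tree_predictions):
    """majority vote: each fitted tree casts one vote for a class label,
    and the forest outputs the most common label."""
    return Counter(tree_predictions).most_common(1)[0][0]
```

averaging many high-variance, low-bias trees this way is exactly why the combined model ends up with low bias and moderate variance.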
the features of all the train and test images are now collected separately and combined to make a feature vector for the dataset. to make the pneumonia detection model more realistic and to increase accuracy, the models' feature vectors are then ensembled. the coding part of this task is done completely in the python keras framework with a tensorflow backend, using a google colab gpu on a mac operating system. the ensembled feature vector has 4265 pneumonia images and 1575 normal images. thus, to ensure unbiased results from the classifier, the rus, ros and smote data balancing techniques are applied to the feature vector. after applying rus there are 1575 pneumonia images and 1575 normal images, while both ros and smote yield 8530 images in total, 4265 for each of the normal and pneumonia classes. ml models such as random forest, adaboost and knn are then applied with five-fold cross validation on the balanced dataset. among the techniques, ros with rf performs better than the others. table 2 below shows the mean accuracy of five-fold cross validation for the different ml models using the different data balancing techniques. the table shows that the best mean accuracy is obtained using rf with ros. in table 3, the comparative accuracy of rf, adaboost and knn is shown for each of the five folds with ros. in every fold, rf achieves better classification accuracy than adaboost and knn. so, rf is selected as the final classifier and ros as the data imbalance handler. the best mean accuracy obtained by the proposed model is 98.93% (approximately 99%), with 99% precision, 99% recall and 99% f-score on average. to validate the model's performance again, the previous 16 validation images are tested on the final proposed model, which achieves 100% validation accuracy. figure 8 displays the validation report of our final model with a confusion matrix. to test the model's generalization performance, a different dataset [21] is used.
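the five-fold protocol behind tables 2 and 3 can be sketched as below; this is a simple contiguous split for illustration (the actual experiments may shuffle the data first):

```python
def kfold_indices(n, k):
    """partition indices 0..n-1 into k near-equal folds; each fold serves
    once as the held-out test set while the remaining folds train the model."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def mean_accuracy(per_fold_accuracies):
    """the score reported per model: the mean over the k fold accuracies."""
    return sum(per_fold_accuracies) / len(per_fold_accuracies)
```

with k = 5 every sample is tested exactly once, so the mean fold accuracy is a less optimistic estimate than a single train/test split.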
all of the normal and pneumonia images of the new test dataset [21] are tested using the final model for generalization. the model achieves 98.25% generalization accuracy on this new test dataset. figure 9 presents the model's performance on the test dataset. the model's performance on this new dataset demonstrates its strong generalization capability. among the 2686 images, the model successfully classifies 2639 images, while only 47 images are misclassified. the above discussion shows that the proposed model performs well in diagnosing pediatric pneumonia from chest x-rays. this section compares our model's performance with existing models and methods. the most frequently used models for medical image classification are vgg16, resnet, densenet121, inceptionv3 and xception, which achieve approximately 74-87% accuracy on pneumonia classification. the proposed model performs much better than these models. the models proposed in papers [7, 14, 19] and [24] are reported to perform better than the other existing models for pneumonia detection. a comparison of these models with the proposed model is given in table 4. the comparison clearly shows that the proposed model is the best-performing pediatric pneumonia classification model. the model achieves an average auc (area under the roc curve) value of 98.94%. the auc is the two-dimensional area under the roc curve. figure 10 shows the roc curve (receiver operating characteristic curve) with a 99% auc. the roc curve is a graphical representation of a model's performance. an automated cad tool is presented in this paper for the detection of childhood pneumonia from chest x-ray images. in this research, a fusion of two fine-tuned cnn models is proposed, whose ensembled features are classified with an ml random forest classifier. the proposed final model classifies pneumonia with 98.93% accuracy and provides better predictions than other models.
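the generalization figures above follow directly from confusion-matrix counts. the split of the 47 errors into false positives and false negatives is not reported, so the split used below is hypothetical; only the overall accuracy (2639 / 2686 ≈ 98.25%) is fixed by the text:

```python
def classification_metrics(tp, tn, fp, fn):
    """accuracy, precision, recall and f-score from confusion-matrix counts,
    treating pneumonia as the positive class."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_score
```

any split with tp + tn = 2639 and fp + fn = 47 reproduces the reported 98.25% accuracy; precision and recall would need the actual per-class counts.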
so, our proposed model can be referred to as one of the best pneumonia classification models. in future, the authors will try to optimize the model to classify two or more diseases with effective results.

table 4 (accuracy comparison with existing models):
prateek et al. [19]: 0.901
liang et al. [7]: 0.905
stephen et al. [14]: 0.937
saraiva et al. [24]: 0.953
proposed model: 0.989

references:
epidemiology and etiology of childhood pneumonia
childhood pneumonia as a global health priority and the strategic interest of the bill & melinda gates foundation
17 lakh indian children to die due to pneumonia by 2030; here's how it can be averted
feature extraction and classification of x-ray lung images using haralick texture features
role of gist and phog features in computer-aided diagnosis of tuberculosis without segmentation
a transfer learning method with deep residual network for pediatric pneumonia diagnosis
radiologist-level pneumonia detection on chest x-rays with deep learning. 2017. arxiv
identifying medical diagnoses and treatable diseases by image-based deep learning
global estimate of the incidence of clinical pneumonia among children under five years of age
pair-wise discrimination of some lung diseases using chest radiography
pneumonia in south-east asia region: public health perspective
computer-aided detection in chest radiography based on artificial intelligence: a survey
an efficient deep learning approach to pneumonia classification in healthcare. hindawi j healthcare eng
computer-aided diagnosis in chest radiography for detection of childhood pneumonia
detection of pneumonia clouds in chest x-ray using image processing approach
automatic detection of major lung diseases using chest radiographs and classification by feed-forward artificial neural network
wavelet augmented cough analysis for rapid childhood pneumonia diagnosis
deep convolutional neural network with transfer learning for detecting pneumonia on chest x-rays
chest x-ray8: hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases
covid-19 chest x-ray database
visualizing and explaining deep learning predictions for pneumonia detection in pediatric chest radiographs
handling data irregularities in classification: foundations, trends, and future challenges. pattern recogn
classification of images of childhood pneumonia using convolutional neural networks, in: 6th international conference on bioimaging

publisher's note: springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. the authors are grateful to the participants who contributed to this research. no financial support was provided by any organization during the research project. conflict of interest: the authors declare that there is no conflict of interest regarding the publication of this work.

key: cord-185125-be11h9wn authors: baldea, ioan title: what can we learn from the time evolution of covid-19 epidemic in slovenia? date: 2020-05-25 journal: nan doi: nan sha: doc_id: 185125 cord_uid: be11h9wn a recent work (doi 10.1101/2020.05.06.20093310) indicated that temporarily splitting larger populations into smaller groups can efficiently mitigate the spread of the sars-cov-2 virus.
the fact that, soon afterwards, on may 15, 2020, the two-million-people slovenia was the first european country proclaiming the end of the covid-19 epidemic within its national borders may be relevant from this perspective. motivated by this evolution, in this paper we investigate the time dynamics of coronavirus cases in slovenia, with emphasis on how efficiently various containment measures act to diminish the number of covid-19 infections. noteworthily, the present analysis does not rely on any speculative theoretical assumption; it is solely based on raw epidemiological data. of the results presented here, the most important one is perhaps the finding that, while imposing drastic curfews and travel restrictions reduces the infection rate κ by a factor of four with respect to the unrestricted state, these measures only improve the κ-value by ~15% as compared to the much more bearable state of social and economic life wherein (justifiable) face mask wearing and social distancing rules are enforced/followed. significantly for behavioral and social science, our analysis of the time dependence κ = κ(t) may reveal an interesting self-protection instinct of the population, which became manifest even before the official lockdown enforcement. in the unprecedented difficulty created by the covid-19 pandemic outbreak, 1 mathematical modeling developed by epidemiologists over many decades 2-7 may make an important contribution in helping policymakers adopt adequate regulations to efficiently fight the spread of the sars-cov-2 virus while mitigating negative economic and social consequences.
the latter aspect is of paramount importance 8 also because, if not adequately considered by governments currently challenged to decide, possibly under dramatic circumstances and on a formidably tight schedule, it can jeopardize the healthcare system itself. as an effort in this direction, we recently drew attention 9 to the general fact that the spread of the sars-cov-2 virus in smaller groups can be substantially slowed down as compared to the case of larger populations. in this vein, the time evolution of covid-19 disease in the two-million-people slovenia certainly deserves special consideration, as on 15 may 2020, concluding that this country had the best epidemic situation in europe, prime minister janez janša declared the end of the covid-19 epidemic within slovenian borders. 10 subsequent developments (only four new cases between 15 and 24 may, 11 cf. table 1) have fortunately given further support to this declaration. attempting to understand and learn from this sui generis circumstance is the very aim of the present paper. thanks to long-standing efforts extending over many decades, a rich arsenal of theoretical methods for analyzing epidemics exists. most of them trace back to the celebrated sir model, 2-7 wherein the time evolution of the numbers of individuals belonging to various epidemiological classes (susceptible (s), infected (i), recovered (r), etc.) is described by deterministic differential equations. unfortunately, those approaches need many input parameters 12,13 that can often be reliably estimated only after an epidemic has ended, 14 which unavoidably compromises their ability to make predictions.
as an aggravating circumstance, one should also add a difficulty not encountered in the vast majority of previous studies: how do the input parameters needed in model simulations change in time under so many restrictive measures (wearing face masks, social distancing, movement restrictions, isolation and quarantine policies, etc.) unknown in the pre-covid-19 era? estimating model parameters from data fitting in a certain time interval in order to make predictions can easily run into a difficulty like that described in the first paragraph of section 2.3. as shown below, our approach obviates the aforementioned difficulty. we will adopt a logistic growth model in a form which is different from that often employed in the past [15-19] (see sections 2.3 and 3 for technical details). this model is considerably simpler than the sir flavors and has already turned out to be an appealing framework for dealing with current covid-19 pandemic issues. 9,13 logistic functions (see equation (2) below) have been utilized for studying various problems. [20-25] studies on the population dynamics of epidemics [26-32] were also frequently based on the logistic function. nevertheless, as anticipated, there is an important difference between the present approach (section 2.3) and all the other approaches of which we are aware. the latter merely justify the logistic model by the fact that recorded disease numbers followed a sigmoidal curve. the shortcomings of this standpoint are delineated in the beginning of section 2.3. the strength of the approach presented in section 2.3 is that we do not use data fitting.
rather, we use raw epidemiological data to validate the logistic growth and straightforwardly extract the time-dependent infection rate, which is the relevant model parameter for the specific case considered; this makes it possible to compare how efficiently different restrictive measures act to mitigate the covid-19 pandemic, and even to get insight significant for behavioral and social science. to briefly remind, standard logistic growth in time t of an infected population n = n(t) follows an ordinary differential equation containing two constants (input model parameters), the (intrinsic) infection rate κ (> 0) and the so-called carrying capacity N: dn/dt = κ n (1 − n/N) (1). in a given environment, the latter has a fixed value to which the population saturates asymptotically (lim_{t→∞} n(t) = N). this can be seen by straightforwardly integrating equation (1) with the initial condition n(t)|_{t=t0} = n0, which yields the logistic function n(t) = N / {1 + (N/n0 − 1) exp[−κ(t − t0)]} (2); this is often recast by using the half-time τ ≡ t0 + (1/κ) ln(N/n0 − 1), at which n(t)|_{t=τ} = N/2. noteworthily for the discussion that follows, equation (2) assumes time-independent model parameters. in epidemiological language, n(t) gives the cumulative number of cases at time t. plotted as a function of t, the derivative with respect to time (throughout assumed a continuous variable) ṅ(t) ≡ dn/dt, representing the "daily" number of new infections, is referred to as the epi(demiological) curve. before proceeding with the data analysis, let us briefly summarize the relevant public health measures, social distancing and movement restrictions imposed during the covid-19 crisis in slovenia. 11, 33 the first case of coronavirus was confirmed on march 4, 2020, imported via a returnee traveling from morocco via italy.
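the logistic solution and the half-time relation can be checked numerically; the parameter values below are arbitrary illustrations, not slovenian estimates:

```python
import math

def logistic(t, kappa, cap, n0, t0=0.0):
    """closed-form solution of the logistic ode, equation (2):
    n(t) = cap / (1 + (cap/n0 - 1) * exp(-kappa * (t - t0)))."""
    return cap / (1.0 + (cap / n0 - 1.0) * math.exp(-kappa * (t - t0)))

def half_time(kappa, cap, n0, t0=0.0):
    """tau = t0 + (1/kappa) * ln(cap/n0 - 1): the inflection time at which
    the cumulative count reaches half the carrying capacity, n = cap/2."""
    return t0 + math.log(cap / n0 - 1.0) / kappa
```

evaluating the curve at the half-time indeed returns cap/2, and for large t the count saturates at the carrying capacity, matching the asymptotic limit stated in the text.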
34 on 10 march, the government banned all incoming flights from italy, south korea, iran, and china; the land border with italy was closed for all but freight transport; indoor public events with more than 100 persons were prohibited, and sporting and other events with more than 500 participants were allowed only without an audience. the (blue) curve of figure 2a depicting the evolution of total covid-19 infections in slovenia (the underlying data are collected in table 1) has an appealing similarity to the logistic s-shaped curve depicted in figure 1. one would therefore be tempted to follow numerous previous authors, [26-32] who claimed that the logistic model applies merely because of the (apparently) good data fitting. still, to claim that a description based on a model like that of equation (2) is valid, checking that the model parameters do not depend on the fitting range (t1, t2) is mandatory. for the specific case considered here, this means that fitting the numbers of infected individuals in the time range t1 < t < t2 should yield, within inherent statistical errors, values of N and κ independent of t1 and t2. and, as in other known cases, 35, 36 this is just the stumbling block for the logistic function approach delineated in section 2.1. in particular, the infection rate κ should not depend on how broad the range (t1, t2) is; however, we checked by straightforward numerical calculations that it does. given the real epidemic timeline delineated in section 2.2, the infection rate must indeed depend on time, κ = κ(t). if the contrary were true, all containment measures would be useless. but when κ depends on t, equations (2) and (3) no longer apply; they were deduced by integrating equation (1) assuming a time-independent κ.
fortunately, rather than merely inquiring how good the fitting curve based on equation (2) is, we are able to directly check (and demonstrate, see below) the validity of a time-dependent logistic model based merely on the real epidemiological reports. to this aim, we recast the differential equation (1), which is the basic definition of logistic growth (not to be confused with the logistic function of equation (2)), as follows: ṅ(t)/n(t) = κ(t) [1 − n(t)/N] (4). when put in this way, one can straightforwardly get insight into how to proceed. one should plot the ratio of the daily new cases to the cumulative number of cases (numerator and denominator in equation (4), respectively) as a function of the cumulative number of cases and inspect whether the curve is linear or departs from linearity. if the decrease is linear (as anticipated in the ideal simulation presented in figure 1b), this demonstrates that the logistic growth model applies. the curve constructed as described above using the covid-19 epidemic reports for slovenia (table 1) exhibits this behavior. noteworthily, the low-κ regime comprises two periods: lockdown and lockdown easing. to quantify the differences between these two periods, we used equation (4). we believe that two of the results are especially important. (i) the first is that what really matters is not to keep everyone at home ("italian approach") but rather to impede virus transmission ("german approach"), e.g., by wearing masks, adequate hygiene, and social distancing. infection transmission does not strongly increase upon easing as long as face masks and social distancing prevent sars-cov-2 virus spreading. one should add at this point (an important fact that appears to be currently inadequately understood) that, along with the less pleasant effect of a short-term slight increase in the daily new cases, a moderate increase in the infection rate also has a positive impact: it reduces the epidemic duration; compare the right tails of the green and orange curves in figure 3b.
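the recipe of equation (4), plotting ṅ/n against the cumulative count n and reading κ and the carrying capacity off the straight line, can be sketched as follows; the synthetic data are generated from the exact logistic law with made-up parameters, so the fit should recover them:

```python
import math

def estimate_logistic_params(daily_new, cumulative):
    """least-squares line through y = (daily new)/(cumulative) versus the
    cumulative count n: by equation (4), y = kappa - (kappa/cap) * n, so
    the intercept gives the infection rate kappa and -intercept/slope the
    carrying capacity cap."""
    ys = [d / n for d, n in zip(daily_new, cumulative)]
    m = len(cumulative)
    mean_x = sum(cumulative) / m
    mean_y = sum(ys) / m
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(cumulative, ys))
             / sum((x - mean_x) ** 2 for x in cumulative))
    kappa = mean_y - slope * mean_x  # intercept at n = 0
    cap = -kappa / slope
    return kappa, cap

# synthetic check: exact logistic data with kappa = 0.3, cap = 2000
kappa_true, cap_true, n0 = 0.3, 2000.0, 5.0
ns = [cap_true / (1.0 + (cap_true / n0 - 1.0) * math.exp(-kappa_true * t))
      for t in range(31)]
rates = [kappa_true * n * (1.0 - n / cap_true) for n in ns]  # exact n-dot
kappa_est, cap_est = estimate_logistic_params(rates, ns)
```

with real reports one would instead feed in the daily case counts and running totals over a window in which κ is roughly constant; a changing slope or intercept between windows is precisely the time dependence κ = κ(t) discussed in the text.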
(ii) the fact that the carrying capacity n does not change upon lockdown easing is equally important. this is the maximum number of individuals that can be infected in a given environment. in other words, the maximum number of infected individuals does not increase when the lockdown is released; the total carrying capacity of a given environment does not change. from a methodological perspective, one should emphasize the important technical strength of the approach proposed above, which made it possible to arrive at the aforementioned conclusions. employing only the differential form of logistic growth, equation (1), obviates the need for any additional theoretical assumption. the traditional approach of validating the logistic model by blind data fitting using its integral counterpart, equation (2), does not work for covid-19 pandemic applications because the model parameter κ can and does depend on time. this time dependence κ = κ(t) is essential to properly assess and make recommendations on the efficiency of the restriction measures to be enforced against sars-cov-2 virus spread. and precisely because, in the differential form utilized here, the logistic model merely requires directly "measurable" epidemiological quantities (the daily reports ṅ(t) and the cumulative number of cases n(t), cf. equation (1)), it is, in the present unusual situation, an alternative preferable to other, more elaborate sir-based flavors. the latter models contain a series of quantities that cannot be directly accessed "experimentally". governments confronted with taking decisions under unprecedented time pressure cannot await confirmation of often speculative theoretical hypotheses needed in data processing. before ending, let us also note that monitoring the κ(t)-timeline allowed us to gain insights also relevant to behavioral and social science: the self-protection instinct of the population became manifest even before the official lockdown enforcement (cf. section 2.3).
who declares covid-19 a pandemic
contributions to the mathematical theory of epidemics
contributions to the mathematical theory of epidemics. ii. the problem of endemicity
contributions to the mathematical theory of epidemics. iii. further studies of the problem of endemicity
the mathematical theory of infectious diseases and its applications
a thousand and one epidemic models
the mathematics of infectious diseases
mitigating the covid economic crisis: act fast and do whatever it takes (a cepr press voxeu.org ebook)
suppression of groups intermingling as appealing option for flattening and delaying the epidemiologic curve while allowing economic and social life at bearable level during covid-19 pandemic
finding an accurate early forecasting model from small dataset: a case of 2019-ncov novel coronavirus outbreak
robert koch institut: modellierung von beispielszenarien der sars-cov-2-epidemie 2020 in deutschland
notice sur la loi que la population poursuit dans son accroissement
recherches mathématiques sur la loi d'accroissement de la population
du système social et des lois qui le régissent
studien zur chemischen dynamik; erste abhandlung: die einwirkung der säuren auf acetamid
the rate of multiplication of micro-organisms: a mathematical study
german and british antecedents to pearl and reed's logistic curve
the early origins of the logit model
how populations grow: the exponential and logistic equations
floppy molecules as candidates for achieving optoelectronic molecular devices without skeletal rearrangement or bond breaking
a sui generis electrode-driven spatial confinement effect responsible for strong twisting enhancement of floppy molecules in closely packed self-assembled monolayers
progressive intensification of uncontrolled plant-disease outbreaks
field trials of copper fungicides for the control of potato blight. i. foliage protection and yield
horizontal (polygenic) and vertical (oligogenic) resistance against blight
four essays
covid-19: government measures timeline
slovenia confirms first case of coronavirus: health minister (the new york times)
important issues facing model-based approaches to tunneling transport in molecular junctions
counterintuitive issues in the charge transport through molecular junctions

table 1. confirmed covid-19 cases in slovenia.

day  date   total confirmed  new confirmed per day
64   03-04    1    1
65   03-05    6    5
66   03-06    8    2
67   03-07   12    4
68   03-08   16    4
69   03-09   25    9
70   03-10   34    9
71   03-11   57   23
72   03-12   89   32
73   03-13  141   52
74   03-14  181   40

key: cord-260797-tc3pueow authors: aleta, alberto; ferraz de arruda, guilherme; moreno, yamir title: data-driven contact structures: from homogeneous mixing to multilayer networks date: 2020-07-16 journal: plos comput biol doi: 10.1371/journal.pcbi.1008035 sha: doc_id: 260797 cord_uid: tc3pueow the modeling of the spreading of communicable diseases has experienced significant advances in the last two decades or so. this has been possible due to the proliferation of data and the development of new methods to gather, mine and analyze it. a key role has also been played by the latest advances in new disciplines like network science. nonetheless, current models still lack a faithful representation of all possible heterogeneities and features that can be extracted from data. here, we bridge a current gap in the mathematical modeling of infectious diseases and develop a framework that allows one to account simultaneously for both the connectivity of individuals and the age-structure of the population. we compare different scenarios, namely: i) the homogeneous mixing setting, ii) one in which only the social mixing is taken into account, iii) a setting that considers the connectivity of individuals alone, and finally, iv) a multilayer representation in which both the social mixing and the number of contacts are included in the model.
we analytically show that the thresholds obtained for these four scenarios are different. in addition, we conduct extensive numerical simulations and conclude that heterogeneities in the contact network are important for a proper determination of the epidemic threshold, whereas the age-structure plays a bigger role beyond the onset of the outbreak. altogether, when it comes to evaluating interventions such as vaccination, both sources of individual heterogeneity are important and should be concurrently considered. our results also provide an indication of the errors incurred in situations in which one cannot access all needed information in terms of connectivity and age of the population. the average. within a network perspective, this is just a consequence of the higher number of contacts, or degree, that some individuals have in the network [17, 18, 20]. this individual heterogeneity also signaled that outbreaks could be very large if key individuals become infected and, at the same time, gave a new target for efficient control strategies, such as vaccinating highly connected individuals [21, 22]. however, despite the many advantages of this approach, determining the complete contact network of a large population is almost infeasible, especially for infections transmitted by respiratory droplets or close contacts. hence, it is common to use idealized networks built using some empirical data of the population, such as the degree distribution [23]. lastly, there are high-resolution approaches that rely on large amounts of statistical data to build agent-based models in which the behavior of every single individual is taken into account [24-29]. note, however, that in agent-based models, individuals are usually assigned to certain mixing groups (i.e., their household, school, or workplace), and that inside those groups homogeneous mixing is used, due to the lack of data for all these settings at a country scale [30].
an important step to create more realistic models in this direction is to collect high-resolution data on individual contacts using wearable sensors [21], which can be used to build time-varying networks that contain not only information about who contacts whom but also the duration and frequency of contacts [31]. several settings have been monitored, such as schools and workplaces [32, 33], or even conferences and museums [34, 35]. although the data is still too scarce to be used in large-scale simulations, it has already been shown that the heterogeneity induced by the time-varying networks inside each mixing group produces a different outcome than the one obtained assuming homogeneous mixing within each group [30]. our goal in this paper is to analyze the role of one particular type of heterogeneity in disease dynamics, namely, the age structure of the population. originally, age was introduced into the models to study childhood diseases [5]. the classical approach consists of dividing the population into different groups, one for each age bracket under consideration, and establishing an age-dependent transmission rate. this transmission rate can be arranged in a matrix in which each element encodes the transmission probability between groups i and j (this matrix is also known as the who-acquired-infection-from-whom matrix [36, 37]). it is also possible to separate the effect of the transmission itself into a common parameter and encode the number of contacts between each group in the matrix [38]. note that this procedure falls into the second category described previously. that is, it takes into account the heterogeneity induced by having different classes of individuals but hides the individual variability under a homogeneous mixing approach within each group, as in models of sexually transmitted diseases with groups with different activity levels.
nevertheless, this approach is widely used today and has yielded outstanding results for many diseases such as chickenpox [39], herpes zoster [40], measles [41-43], pertussis [44] and tuberculosis [45]. in fact, even though the theoretical basis of this method is relatively old, data on the contact patterns of the general population as a function of their age have become available only recently. the first large-scale study on the contact patterns between and within groups in the context of infections spread by respiratory droplets or close contact took place in 2008 and focused on europe [14]. since then, a number of studies covering different countries have appeared, although data on africa and asia are still scarce [46]. various methods have been developed to infer the contact patterns in the absence of direct data [47-49], and to project them into the future [50]. and yet, most studies that use these data disregard the whole distribution of contacts and use only the average number of contacts between groups, completely neglecting the individual heterogeneity (with a few exceptions [51]). as a consequence, in these studies, superspreading events cannot occur naturally unless the model is modified, in contrast to network models, in which the large connectivity of some individuals can result in the appearance of such events. similarly, the virtual absence of an epidemic threshold for certain types of contact networks cannot be observed with these simplified contact patterns [52]. to bridge this gap, in this paper, we focus on analyzing the role that disease-independent heterogeneity in host contact rates plays in the spreading of epidemics in large populations under several scenarios, both numerically and analytically.
furthermore, in contrast to previous approaches to this problem [53-56], we use a data-driven approach to highlight not only the role of those heterogeneities but also to explore the validity of the conclusions that one can derive when only limited information about the population is available. there are multiple ways of modeling the contact patterns of the population, depending on the availability of data and the characteristics of the disease. in this work, we consider that diseases have the same outcome on all individuals regardless of their condition and that individuals do not change their behavior as a consequence of the disease. this way, we can focus on the effect of adding different characteristics to the population contact patterns. to be more specific, we use the information from the survey that was carried out in italy for the polymod project [14]. in this project, over 7,000 participants from eight european countries were asked to record the characteristics of their contacts with different individuals during one day, including age, sex, location, etc. since that pioneering work, the number of countries where this type of study has been conducted has been increasing steadily, but data on africa and asia are still scarce. besides, the resolution and amount of information vary from study to study [46]. as such, we build four different models of interaction, assuming that only partial information about the population is available; see fig 1. the simplest formulation is the homogeneous mixing approach (model h), suitable when very limited information about the population is available. in this model, all individuals are able to contact each other with equal probability. the number of such interactions, ⟨k⟩, can be extracted from contact surveys simply by calculating the average number of contacts per individual. note, however, that this formulation is very simplistic, since all individuals are completely equivalent.
a slightly better approximation is to divide the population into age-groups, given the demographic structure of the population, fig 1b, and establish a different number of contacts between and within them (model m), which is the common approach currently used in the epidemic literature to model age-mixing patterns. in this case, the necessary information includes knowing the age of both individuals participating in each contact, although this information can be easily summarized in an age-contact matrix, m, where each entry m_αβ represents the average number of contacts from an individual in age group α to individuals in age group β. note that in both models only the average number of contacts is used, in one case the average over the whole population and in the other over each age-group. another possibility is to use the whole contact distribution, fig 1d, to build the contact network of the population. this formulation is commonly found in the network science literature, since it highlights the role that the disproportionate number of contacts of some individuals has in the dynamics of the disease. a simple way of creating these networks is to represent each individual i as a node and extract its degree (number of contacts) from the distribution. then, the expected number of edges between nodes i and j is ⟨a_ij⟩ = k_i k_j / ∑_l k_l (model c). to obtain this expression, we can consider that each node i has k_i associated stubs. next, if these stubs are matched together randomly, the probability that each stub from node i ends up at one of the k_j stubs of node j is k_j over the total number of stubs, ∑_l k_l. this method is known as the configuration model. lastly, we can combine both ingredients, the mixing patterns and the contact distribution of the population, in a network representation. to do so, we propose to arrange nodes in a multilayer network, in which each layer represents an age-group.
as such, the first step to create this network is to extract the age associated with each node from the demographic structure of the population, fig 1b, and assign it to its corresponding layer (since we are working with 15 age-groups, our system is composed of that same number of layers). then, the degree of each node should be extracted from the desired distribution. to incorporate the mixing patterns into the configuration model, we propose the following scheme: 1. given a node i located in layer α (where the layer represents the age-group associated with i), the probability that any of its stubs ends up at a node in any layer β (including the same layer) is p_αβ. this probability can be extracted from the mixing matrix as p_αβ = m_αβ / ∑_γ m_αγ. 2. the stub from node i will match a stub of node j, situated in layer β, with probability k_j / ∑_{l∈β} k_l, where the denominator indicates the sum over the degrees of all nodes present in layer β. hence, the expected number of edges between nodes i and j will be given by ⟨a_ij⟩ = k_i p_αβ k_j / ∑_{l∈β} k_l. yet, note that incorporating the mixing patterns introduces a restriction in the degree distribution. indeed, one of the important properties of the mixing patterns matrix is that it has to satisfy a reciprocity condition. [fig 1 caption: on the other hand, if the full contact distribution of the population is known, regardless of their age, it is possible to build the contact network of the population (c). lastly, when both the contact distribution and the interaction patterns between different age groups are known, the individual heterogeneity and the global mixing patterns can be combined to create a multilayer network in which each layer represents a different age group (c+m). panel b: demographic structure of italy in 2005 [57]. panel c: age-contact patterns in italy obtained in the polymod study [14]. panel d: contact distribution in italy obtained in the polymod study [14].]
[fig 1 caption, continued: the x axis represents the number of daily contacts and the y axis the fraction of individuals that have reported such an amount of contacts. the distribution is fitted to a right-censored negative binomial distribution, since the maximum number of contacts that could be reported was 45. https://doi.org/10.1371/journal.pcbi.1008035.g001] the mixing matrix has to verify reciprocity, that is, the number of contacts going from group α to group β has to be the same as the ones from β to α (if the populations of each group were equal, this would lead to a symmetric matrix). it is easy to see that eq (1) only fulfills this property if the average degree of each layer is fixed by the row sums of the mixing matrix and, thus, ⟨k⟩_α = ∑_β m_αβ, where ⟨k⟩_α represents the average degree in layer α. hence, even though the shape of the distribution can be chosen freely, the mixing matrix fixes the average degree of each layer. eqs (1) and (4) thus fully specify the multilayer network model. to determine the consequences of each of the previous assumptions, we first consider a general susceptible-infected-susceptible (sis) markovian model [58, 59]. in this model, the recovery of each infected individual is modeled by a poisson process with rate δ. in turn, each successful contact emanating from an infected individual (i.e., a contact that transmits the disease) is modeled as a poisson process with rate λ. we denote by y_i the bernoulli random variables that are equal to one if individual i is infected or zero otherwise. finally, the only ingredient left to be defined is how the contact process between individuals actually takes place. in general, in its exact formulation, we can do so by introducing the matrix a, which denotes whether two individuals can contact each other or not [58, 59]: d⟨y_i⟩/dt = -δ⟨y_i⟩ + λ ∑_j a_ij ⟨(1 - y_i) y_j⟩. with this formulation we can already study the spreading of an epidemic on any network, models c and cm.
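before moving to the dynamics, the stub-matching construction behind models c and c+m can be sketched as follows. the 5-node degree sequence and the 2-group "mixing matrix" are toy inputs chosen for illustration, not the polymod data:

```python
import random

def configuration_model_edges(degrees, seed=0):
    # one stub per unit of degree; matching stubs uniformly at random gives
    # expected edge counts <a_ij> = k_i * k_j / sum_l k_l (model c).
    # self-loops and multi-edges are kept for simplicity; in a sparse network
    # they are rare and usually discarded or rewired in practice.
    stubs = [node for node, k in enumerate(degrees) for _ in range(k)]
    if len(stubs) % 2:
        raise ValueError("the degree sequence must have an even sum")
    rng = random.Random(seed)
    rng.shuffle(stubs)
    return list(zip(stubs[0::2], stubs[1::2]))

def layer_probabilities(m):
    # probability that a stub of a node in layer a ends in layer b,
    # taken here as p_ab = m_ab / sum_g m_ag (model c+m scheme, step 1)
    return [[entry / sum(row) for entry in row] for row in m]

edges = configuration_model_edges([3, 2, 2, 2, 1])
p = layer_probabilities([[12.0, 4.0], [3.0, 8.0]])
```

every node ends up incident to exactly as many edge endpoints as its degree, and each row of p sums to one, which is what step 2 of the scheme (stub matching restricted to the chosen layer) relies on.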
indeed, assuming that the states are independent, i.e., ⟨y_i y_j⟩ = ⟨y_i⟩⟨y_j⟩ ≡ y_i y_j, we get ẏ_i = -δ y_i + λ (1 - y_i) ∑_j a_ij y_j. considering that nodes with the same degree are statistically equivalent, we can obtain the epidemic threshold using the heterogeneous mean field approximation [60]: τ = ⟨k⟩/⟨k²⟩. this well-known result from network science clearly shows the importance of the heterogeneity of the contacts, since it depends on the second moment of the distribution. in the case of italy, using this expression we obtain a theoretical threshold of τ_cm = 0.033 and τ_c = 0.035 for the cm and c models, respectively. for the m model, since individuals are indistinguishable, eq (6) is rewritten as ẏ_α = -δ y_α + λ (1 - y_α) ∑_β m_αβ y_β, where m_αβ is the matrix depicted in fig 1c and y_α the fraction of infected individuals in layer α. in this case, using the next generation approach [61, 62], the epidemic threshold is τ_m = 1/ρ(m). regarding italy, the spectral radius of m is ρ(m) = 22.51, resulting in an epidemic threshold of τ_m = 0.044. lastly, the equation governing the h model is ẏ = -δ y + λ ⟨k⟩ (1 - y) y, where the epidemic threshold is τ_h = 1/⟨k⟩. according to fig 1d, the epidemic threshold in our system is thus τ_h = 0.052. thus, in this case, the following relation holds: τ_cm < τ_c < τ_m < τ_h. some observations are in order. first, even though the average number of contacts is the same in all models, the epidemic threshold is completely different. besides, increasingly adding heterogeneity to the model lowers the epidemic threshold. this is especially relevant when going from classical mixing models to network models. indeed, when we introduce the whole contact distribution, we are indirectly adding the possibility of having super-spreading events, which, as noted before, is missing in the classical approaches. on the other hand, as expected, the difference between both network models is relatively small (τ_c/τ_cm = 1.06), since the main driver of the epidemic threshold is the contact distribution.
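the three threshold formulas can be computed with a few lines; the degree sequence and the 2x2 mixing matrix below are illustrative toy inputs, not the italian polymod values:

```python
# toy threshold computations for the h, c, and m models
degrees = [1, 2, 2, 3, 5, 8, 10, 15, 40]        # heavy-tailed toy sample
k1 = sum(degrees) / len(degrees)                 # <k>
k2 = sum(k * k for k in degrees) / len(degrees)  # <k^2>

tau_h = 1.0 / k1     # homogeneous mixing: tau_h = 1/<k>
tau_c = k1 / k2      # heterogeneous mean field: tau_c = <k>/<k^2>

m = [[12.0, 4.0],
     [3.0, 8.0]]     # toy age-contact matrix

def spectral_radius(mat, iters=200):
    # power iteration; converges for a positive matrix (perron-frobenius)
    v = [1.0] * len(mat)
    norm = 1.0
    for _ in range(iters):
        w = [sum(row[j] * v[j] for j in range(len(v))) for row in mat]
        norm = max(abs(val) for val in w)
        v = [val / norm for val in w]
    return norm

tau_m = 1.0 / spectral_radius(m)   # next-generation threshold: 1/rho(m)
```

even on toy inputs the ordering reported in the text shows up: the second-moment correction pushes tau_c well below tau_h whenever the degree distribution has a heavy tail.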
nonetheless, as we shall see next, for other scenarios the multilayer framework will yield quite different results from model c. to assess the quality of our theoretical analysis, our first step is to obtain the epidemic threshold for each configuration numerically. to do so, we create an artificial population of 10^6 individuals and assign them an age according to the demographic structure of the italian population [57]. then, we simulate a stochastic sis markov model, with δ = 1 and multiple values of λ, for each of the four contact models under consideration (see materials and methods). in fig 2a, we show the attack rate (total number of cases over the whole population) as a function of λ. the overall behavior of the four scenarios is qualitatively similar, although large differences are observed in the value of the epidemic threshold (see inset), as predicted. to properly characterize the value of the epidemic threshold and compare it with the theoretical expectations, we use the quasistationary state (qs) method [59, 63]. this technique allows computing the susceptibility of the system, which presents a peak at the epidemic threshold (see materials and methods). the caveat is that it is highly dependent on the system size, since the epidemic threshold is only properly defined for infinite systems. nevertheless, in fig 2b we compute the susceptibility χ for the four configurations with system sizes ranging from 10^4 to 10^6 individuals, and we can see that for the latter the peak of the susceptibility is already quite close to the predicted value of the epidemic threshold, validating our theoretical approach. next, we focus on studying the impact that the disease has on each age group under the different configurations, fig 2c. we set the value of λ in each case so that the attack rate is equal to 0.4, since the four scenarios converge to that value for similar values of λ (see fig 2a).
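the susceptibility used with the qs method is commonly defined as χ = N (⟨ρ²⟩ - ⟨ρ⟩²)/⟨ρ⟩, with ρ the infected fraction sampled in the quasistationary regime; a minimal sketch, where the sample values are synthetic and purely illustrative:

```python
def susceptibility(rho_samples, n_individuals):
    # chi = N * (<rho^2> - <rho>^2) / <rho>; a peak of chi as a function of
    # the spreading rate locates the epidemic threshold in finite systems
    m1 = sum(rho_samples) / len(rho_samples)
    m2 = sum(r * r for r in rho_samples) / len(rho_samples)
    return n_individuals * (m2 - m1 * m1) / m1

# synthetic qs samples of the infected fraction for a population of 10^4
chi = susceptibility([0.010, 0.014, 0.008, 0.012, 0.016], 10_000)
```

in practice one sweeps λ, collects many qs samples of ρ per value, and reads the threshold off the location of the peak of χ(λ); the peak sharpens and shifts toward the infinite-size threshold as N grows, which is the size dependence noted in the text.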
using the homogeneous mixing approximation, we obtain a distribution of infected individuals across ages proportional to the demographic structure of the population (fig 1b), as one would expect given that all individuals are virtually indistinguishable for the dynamics. the same result is obtained for the c model, in which the age of the nodes is completely independent of the network structure. at variance with these results, if we incorporate the heterogeneous mixing patterns of the population, either in the age-mixing (m) model or in the multilayer network (cm) setting, the incidence in each age group is quite different, see fig 2c. note that we have again set λ so that the overall incidence is 0.40 in all cases; this ensures that the total number of infected individuals is the same, and only its distribution across age classes differs. results show that in both scenarios the prevalence is much higher for teenagers and smaller for the older cohorts than in the homogeneous mixing model. although the sis model facilitates the theoretical and numerical analysis of the system, especially near the epidemic threshold, it is too simplistic to model real diseases such as influenza-like illness (ili). thus, to highlight the impact of these observations in a more realistic scenario, we slightly modify the model by incorporating the removed compartment, so that the dynamics are governed by a susceptible-infected-removed (sir) model, which is better suited for studying ili [64]. it has been recently shown that using a constant and group-independent basic reproduction number, r0, might not describe well key features of the disease dynamics in realistic scenarios [28]. for this reason, we first explore the dependence of this parameter on the age of the individual in the two networked scenarios.
to do so, we simply count the total number of newly infected individuals that a single seeded infectious subject would produce in a fully susceptible population over 10^8 simulations, with the value of λ set so that the average value of r0 is 1.3, in line with typical values for influenza [65]. fig 3a shows the value of r0 as a function of the age of the seed node in the network in which all nodes have the same degree distribution. clearly, the same r0 value is obtained regardless of the age of the nodes, as it should be, given that both their degree and their connections are independent of their age. conversely, in the multilayer network where the mixing patterns of the population are incorporated, fig 3b, the situation changes completely. the value of r0 is above the average for teenagers and adults but below the average for the elderly, highlighting the importance of the underlying structure in the value of r0. lastly, we study the effect of vaccinating a fraction of the nodes before the epidemic begins. this sort of containment measure is among those that can benefit the most from knowledge about the structure of the population, as they allow devising more efficient vaccination strategies. first, we set the baseline scenario to values compatible with the 2018-2019 ili epidemic in italy. according to the world health organization, the total attack rate was 13.3%. besides, an important fraction of the population was vaccinated preemptively. in italy, vaccination is recommended for several groups of people, such as those with chronic medical conditions, firefighters, health care workers, or the elderly [66]. of these groups, the only one that we can distinguish in our model is the elderly, but it is also the one with the largest vaccination rates. unfortunately, the uptake of the vaccine has been decreasing for the past few years and is now close to 50% [67].
moreover, the effectiveness of the vaccine is estimated to be around 60%, yielding an effective vaccination rate of 30% in the elderly [68]. hence, to obtain the baseline values in our model, we set 30% of the elderly in the recovered state initially and set the value of λ so that the attack rate is 13.3%, fig 4a. our first observation is that in the c scheme we trivially obtain a reduction in the attack rate among the elderly due to their vaccination, but otherwise the incidence is the same in all age groups. on the other hand, both in the m and cm models, the attack rate depends highly on the age of the individual. to gauge the effect of increasing vaccination rates, we vaccinate 1% of the total population (assuming that the effectiveness is 60% for all age-groups). note that since the elderly group represents 19% of the population, the initial vaccination rate was roughly 10% of the total population. if these new vaccines are administered randomly, we can see that the effect is just a homogeneous reduction of 5-6% in all age groups, independently of the model, fig 4b. conversely, if that same amount of new vaccinations is targeted, the situation changes completely. in the m model, we vaccinate individuals in the 15-19 age group, since it is the one with the largest number of contacts and the highest attack rates. we can see that the overall reduction is much larger than in the previous case, and especially so in this particular group, see fig 4c. in the c and cm models, instead, we apply the vaccines to the individuals with the largest degrees. we can see that the reduction is larger in the c setting than in the cm one. this result might seem counterintuitive, since the same measure is applied to both systems. however, note that while in the c model the largest degrees are homogeneously distributed across the population, in the cm model they are concentrated in specific age groups, or layers.
furthermore, since nodes in the same layer tend to be connected together, the previous observation implies that the effect of removing hubs will be lower. to verify this, we have rewired the connections of the cm model while preserving the age, degree and vaccination status of each node. as we can see, in that case we recover the same value as in the c model. in other terms, the correlations induced by the age mixing patterns lower the effectiveness of this vaccination strategy. note also that in both the random and the targeted vaccination schemes, the number of new vaccines introduced in the system is exactly the same; only who is vaccinated changes. models can range from simple homogeneous mixing models to high-resolution approaches. the latter, even though they might provide better insights, are also much more data-demanding. as a compromise between the two, network models can capture the heterogeneity of the population while keeping the amount of data necessary low. nevertheless, most network approaches focus only on determining the role that differences in the number of contacts play in disease dynamics but ignore other types of heterogeneities, such as the age mixing patterns. we have shown that, to determine the epidemic threshold of the population properly, the heterogeneity in the number of contacts cannot be neglected, making the simple homogeneous approach and the homogeneous approach with age mixing patterns ill-suited for it. in fact, a description that accounts for the contact distribution but ignores the age mixing patterns of the population captures the value of the epidemic threshold much better. furthermore, we observe two different regimes in the attack rate as a function of the spreading rate. for low values of the spreading rate, individual heterogeneity plays a more important role, yielding larger attack rates than the homogeneous counterparts.
however, after a certain value, the phenomenology reverses, i.e., larger attack rates are obtained for the homogeneous approaches than for the networked versions. the reason is that, in homogeneous models, an infected agent can contact everyone in the population, and thus it can keep infecting individuals even if the attack rate is high. when the network is taken into consideration, it is possible that nodes run out of susceptible individuals within their vicinity, virtually preventing them from spreading the disease any further. on the other hand, if we study the distribution of infected individuals across age cohorts, we can see that the c scheme is no longer valid, yielding the same results as the simple homogeneous mixing approach. if the age mixing patterns are added into the model, either in the m or cm schemes, a larger fraction of young individuals will be infected, while the incidence in older cohorts is reduced. hence, even though the c approach can predict fairly well the value of the epidemic threshold, it cannot be used to study the spreading of diseases in which taking into account the age of the individuals is important beyond the epidemic threshold. conversely, the multilayer network of the cm model can describe both the epidemic threshold and the distribution of the disease across age groups correctly. in other words, it combines the importance of individual heterogeneity with the inherent assortativity present in human interactions. individual heterogeneity also introduces important variations in the measured value of r0. this observation is quite important, since it shows that for the proper evaluation of r0 during emerging diseases, the sampling of the population has to be done carefully. biases in the sampled individuals, such as having too many young individuals, could lead to estimations of r0 much larger than its actual value.
moreover, this is not limited to the age of the individuals, since we have also seen the importance of individual heterogeneity in the dynamics. most importantly, if the sample contains individuals with an average number of contacts higher than that of the general population, the estimations of r0 will also be higher. lastly, we have also observed the crucial role that heterogeneity plays if we want to devise efficient vaccination strategies. the role of networks in this regard is known to be important, not only because there are tools that allow identifying the most important individuals, but because it provides a clear way to study herd immunity. yet, if we do not take into account the contact distribution of the population, the effectiveness of vaccination campaigns will be lower. conversely, if we rely simply on the contact distribution of the population and disregard their mixing patterns, we would overestimate the effect of vaccination. as the current covid-19 pandemic has shown, accounting for both the age and the contact heterogeneity of individuals is crucial to control the epidemic. the exact role that age plays in this disease is still unknown, although preliminary results show that children are less susceptible and that the case fatality rate for older individuals is much higher. similarly, large super-spreading events are possible, such as the ones detected in south korea, boston or spain [19, 69]. the latter country is also among the ones most affected by the current epidemic, but empirical information about the age mixing patterns of the population is not available [46, 69]. thus, to the inherent problems of forecasting the evolution of an emerging disease [70, 71], we have to add our ignorance about these factors which, as we have shown in this article, can substantially modify the predictions. 
this highlights once again the importance of obtaining precise information about the behavior of the population, enhancing our preparedness for this type of event. to sum up, we have shown the importance that individual heterogeneities have on the spreading of infectious diseases. yet, although in general the more details in the model the better, it is also important to take into account the inherent limitations of the data that currently exist. therefore, it is crucial to correctly gauge what can and cannot be done given the information available to us. in particular, we have shown that to predict the epidemic threshold, it is indispensable to know the degree distribution of the population. nonetheless, this is not strictly needed to evaluate the impact of a disease away from the threshold. yet, adding this information, even though it does not dramatically change the predicted outcomes of the epidemic under normal conditions, could be pivotal to devise efficient vaccination strategies. furthermore, we have seen that the underlying information of the system also has an impact on quantities that are commonly measured and used in real settings, such as r0, implying that care must be taken when extrapolating the results from one study to another. in all cases, we consider populations of 10^6 individuals. in the h model, since individuals are indistinguishable, the impact of the disease over the age groups is computed by randomly extracting values from the demographic distribution of italy in 2005 [57]. in the m model, the size of each age group is computed using the same procedure. in addition, the age-mixing matrix was corrected so that reciprocity is fulfilled and the average connectivity is exactly 19.40 [50]. in the c model, we randomly extract the degree of each node from a right-censored negative binomial distribution adjusted to the survey data from polymod [14]. 
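the degree-sampling step described above can be sketched as follows. this is a minimal illustration only: the negative binomial parameters (n=5, p=0.2) and the censoring cutoff (60) are hypothetical placeholders, not the values fitted to the polymod survey data.

```python
import numpy as np

def sample_censored_degrees(num_nodes, n, p, cutoff, seed=0):
    """draw node degrees from a negative binomial distribution,
    right-censored at `cutoff` (values above it are clipped)."""
    rng = np.random.default_rng(seed)
    degrees = rng.negative_binomial(n, p, size=num_nodes)
    degrees = np.minimum(degrees, cutoff)   # right-censoring
    # guarantee at least one contact so every node can be wired
    return np.maximum(degrees, 1)

degrees = sample_censored_degrees(num_nodes=1000, n=5, p=0.2, cutoff=60)
```

with these placeholder parameters the mean degree is about n(1-p)/p = 20; in practice the parameters would be adjusted so the empirical mean connectivity (19.40 in the text) is reproduced.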
then, links are sampled by performing a bernoulli trial over each pair of nodes, respecting ⟨a_ij⟩ = k_i k_j / Σ_l k_l. a similar procedure is followed to create the multilayer with age mixing patterns, but in this case each layer has its own values for the negative binomial distribution, according to the data (see fig 1d), and the probability of establishing a link respects ⟨a_ij⟩ = p_{α(i),β(j)} k_i k_j / Σ_{l∈α(j)} k_l, where p_{α(i),β(j)} is the probability that a link from a node with the same age as node i ends up at a node with the same age as node j, and α(j) is the layer to which j belongs. we remark that the network is simplified by removing multiple edges. close to the critical point, the fluctuations of the system are often high, driving the system to the absorbing state [59, 63]. to avoid this problem, the quasistationary state (qs) method stores m active configurations previously visited by the dynamics. at each step, with probability p_r, the current configuration (as long as it is active) replaces one of the m stored ones. then, if the system tries to visit an absorbing state, the whole configuration is substituted by one of the stored ones. the system evolves for a relaxation time, t_r, and then the distribution of the number of infected individuals, p_n, is obtained during a sampling time t_a. lastly, the threshold is estimated by locating the peak of the modified susceptibility χ = n(⟨ρ²⟩ − ⟨ρ⟩²)/⟨ρ⟩, where ⟨ρ^k⟩ is the k-th moment of the distribution of the number of infected individuals, p_n (note that ⟨ρ^k⟩ = Σ_n n^k p_n). in our analysis, the number of stored configurations and the probability of replacing one of them are fixed to m = 100 and p_r = 0.01, while the relaxation and sampling times vary in a range depending on the size of the system, t_r = 10^4 – 10^6 and t_a = 10^5 – 10^7. 
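the modified susceptibility used above to locate the threshold can be computed directly from the sampled distribution p_n. a minimal sketch (the toy distribution below is illustrative, not data from the simulations):

```python
import numpy as np

def susceptibility(n_values, p_n, system_size):
    """chi = n(<rho^2> - <rho>^2) / <rho>, with <rho^k> = sum_n n^k p_n,
    where p_n is the distribution of the number of infected individuals."""
    n_values = np.asarray(n_values, dtype=float)
    p_n = np.asarray(p_n, dtype=float)
    m1 = np.sum(n_values * p_n)        # first moment <rho>
    m2 = np.sum(n_values ** 2 * p_n)   # second moment <rho^2>
    return system_size * (m2 - m1 ** 2) / m1

# toy distribution: 1 infected half the time, 3 infected half the time
chi = susceptibility([1, 3], [0.5, 0.5], system_size=10)
```

in a full analysis one would evaluate χ over a grid of spreading rates and take the peak as the threshold estimate.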
heterogeneity in pathogen transmission: mechanisms and methodology
the role of population heterogeneity and human mobility in the spread of pandemic influenza
an infectious disease model on empirical networks of human contact: bridging the gap between dynamic network data and contact matrices
a review of simulation modelling approaches used for the spread of zoonotic influenza viruses in animal and human populations. zoonoses and public health
modeling infectious diseases in humans and animals
sexual practices and risk of infection by the human immunodeficiency virus
gonorrhea transmission dynamics and control
heterogeneities in the transmission of infectious agents: implications for the design of control programs
transmission dynamics of hiv infection
models for vector-borne parasitic diseases
age-related changes in the rate of disease transmission: implications for the design of vaccination programmes
sexual lifestyles and hiv risk
aids and sexual behaviour in france
social contacts and mixing patterns relevant to the spread of infectious diseases
complex networks: structure and dynamics
superspreading and the effect of individual variation on disease emergence
network theory and sars: predicting outbreak diversity
stochasticity and heterogeneity in the transmission dynamics of sars-cov-2
infectious disease transmission and contact networks in wildlife and livestock
a high-resolution human contact network for infectious disease transmission
statistical physics of vaccination
networks and epidemic models
strategies for containing an emerging influenza pandemic in southeast asia
strategies for mitigating an influenza pandemic
mitigation strategies for pandemic influenza in the united states
spread of zika virus in the americas
measurability of the epidemic reproduction number in data-driven contact networks
reactive school closure weakens the network of social interactions and reduces the spread of influenza
recalibrating disease parameters for increasing realism in modeling epidemics in closed settings
robust modeling of human contact networks across different scales and proximity-sensing techniques
contact patterns in a high school: a comparison between data collected using wearable sensors, contact diaries and friendship surveys
data on face-to-face contacts in an office building suggest a low-cost vaccination strategy based on community linkers
what's in a crowd? analysis of face-to-face behavioral networks
empirical temporal networks of face-to-face human interactions
mixing patterns between age groups in social networks
disease control in age structure population
infectious diseases of humans: dynamics and control
using empirical social contact data to model person to person infectious disease transmission: an illustration for varicella
the impact of demographic changes on the epidemiology of herpes zoster: spain as a case study
age-structure and transient dynamics in epidemiological systems
global dynamics of a discrete age-structured sir epidemic model with applications to measles vaccination strategies
parental vaccination to reduce measles immunity gaps in italy. elife
contact network structure explains the changing epidemiology of pertussis
data-driven model for the assessment of mycobacterium tuberculosis transmission in evolving demographic structures. proceedings of the national academy of sciences
a systematic review of social contact surveys to inform transmission models of close-contact infections
inferring the structure of social contacts from demographic data in the analysis of infectious diseases spread
projecting social contact matrices in 152 countries using contact surveys and demographic data
inferring high-resolution human mixing patterns for disease modeling
projecting social contact matrices to different demographic structures
high-resolution epidemic simulation using within-host infection and contact data
epidemic spreading in scale-free networks
transmission dynamics of an sis model with age structure on heterogeneous networks
effects of vaccination and population structure on influenza epidemic spread in the presence of two circulating strains
incorporating disease and population structure into models of sir disease in contact networks
multiple lattice model for influenza spreading
world population prospects 2019, custom data acquired via website
virus spread in networks
fundamentals of spreading processes in single and multilayer complex networks
dynamical processes on complex networks
the construction of next-generation matrices for compartmental epidemic models
the impact of regular school closure on seasonal influenza epidemics: a data-driven spatial transmission model for belgium
epidemic thresholds of the susceptible-infected-susceptible model on networks: a comparison of numerical and theoretical results
age-specific contacts and travel patterns in the spatial spread of 2009 h1n1 influenza pandemic
estimates of the reproduction number for seasonal, pandemic, and zoonotic influenza: a systematic review of the literature. bmc infectious diseases
seasonal influenza vaccination and antiviral use in eu/eea member states. european centre for disease prevention and control
association between vaccination coverage decline and influenza incidence rise among italian elderly
adjuvanted influenza vaccine for the italian elderly in the 2018/19 season: an updated health technology assessment
evaluation of the potential incidence of covid-19 and effectiveness of containment measures in spain: a data-driven approach
forecasting covid-19. front phys. 2020
predictability: can the turning point and end of an expanding epidemic be precisely forecast? arxiv

key: cord-167889-um3djluz authors: chen, jianguo; li, kenli; zhang, zhaolei; li, keqin; yu, philip s. title: a survey on applications of artificial intelligence in fighting against covid-19 date: 2020-07-04 journal: nan doi: nan sha: doc_id: 167889 cord_uid: um3djluz the covid-19 pandemic caused by the sars-cov-2 virus has spread rapidly worldwide, leading to a global outbreak. most governments, enterprises, and scientific research institutions are participating in the covid-19 struggle to curb the spread of the pandemic. as a powerful tool against covid-19, artificial intelligence (ai) technologies are widely used in combating this pandemic. in this survey, we investigate the main scope and contributions of ai in combating covid-19 from the aspects of disease detection and diagnosis, virology and pathogenesis, drug and vaccine development, and epidemic and transmission prediction. in addition, we summarize the available data and resources that can be used for ai-based covid-19 research. finally, the main challenges and potential directions of ai in fighting against covid-19 are discussed. currently, ai mainly focuses on medical image inspection, genomics, drug development, and transmission prediction, and thus ai still has great potential in this field. 
this survey presents medical and ai researchers with a comprehensive view of the existing and potential applications of ai technology in combating covid-19, with the goal of inspiring researchers to continue to maximize the advantages of ai and big data to fight covid-19.

1 introduction

figure 1. main scope of ai in fighting against covid-19.

we collected 1273 online publications related to covid-19, sars-cov-2, and 2019-ncov from databases such as nature, elsevier, google scholar, arxiv, biorxiv, and medrxiv. then, we selected 267 papers that explicitly use ai methods.

long short-term memory (lstm) [66]; variational auto-encoder (vae) [186].

covid-19 detection and diagnosis approaches include serological diagnosis, chest x-ray and ct image inspection, and other noninvasive methods. benefitting from the advantages of high sensitivity and specificity, real-time reverse transcriptase polymerase chain reaction (rt-pcr) is the current standard detection technology in diagnosing the sars-cov-2 virus and bacterial infections. using rt-pcr, 9 rna positives were detected from pharyngeal swabs of patients, indicating that the sars-cov-2 virus had spread in communities of wuhan, china, in early january 2020 [106]. the shedding of the sars-cov-2 virus detected in the throat, lungs, and feces suggests multiple routes of virus transmission [217, 227]. however, rt-pcr faces the limitations of a complicated sample preparation, low detection efficiency, and a high false-negative rate [106, 212, 223]. isothermal nucleic acid amplification and blood testing methods are also commonly used for the rapid screening of sars-cov-2 [103, 122, 223]. an ml classification method was used for blood testing to extract important routine hematological and biochemical characteristics and to provide covid-19 classification. 
in [223], 105 blood test reports were collected, of which 27 were positive samples from patients with confirmed covid-19; for comparison, negative samples were collected from patients with ordinary pneumonia, tuberculosis, and lung cancer. each sample contains 49 feature variables, including routine hematological and biochemical parameters. next, the authors applied the rf algorithm [22] to the training samples to perform feature learning and classification. based on the extracted 11 key feature variables, they built an rf classifier and tested 253 samples of 169 patients with suspected covid-19 with an accuracy of 96.97%. although ai technologies rarely participate directly in rt-pcr and blood testing, the viral load and covid-19 case data collected in these methods provide important data sources for the subsequent ai-based analysis. medical imaging inspection is another widely used clinical approach for covid-19 detection and diagnosis. covid-19 medical image inspection mainly includes chest x-ray and lung ct imaging. ai technology plays an important role in medical image inspection and has achieved significant results in image acquisition, organ recognition, infection region segmentation, and disease classification. it not only greatly shortens the time of a radiologist's image diagnosis but also improves the accuracy and performance of the diagnosis. we will discuss in detail the contributions of ai methods to chest x-ray and lung ct imaging. ct imaging provides an important basis for the early diagnosis of covid-19. the ct imaging manifestations of covid-19 are mainly ground-glass opacity (ggo) in the periphery of the subpleural region, and some are consolidated. if the situation improves, the area will be absorbed and form fibrous stripes. examples of lung ct images of normal and covid-19 cases are shown in fig. 2 [42, 161, 184]. 
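the feature-selection-plus-classification scheme described above can be sketched with scikit-learn. this is an illustrative stand-in, not the pipeline of [223]: the data below are synthetic (49 random features, only the first three informative), and only the general pattern — rank features by forest importance, keep the top 11, retrain — follows the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)

# synthetic stand-in for the blood-test data: 200 samples, 49 features
# mimicking routine hematological/biochemical variables
X = rng.normal(size=(200, 49))
y = (X[:, 0] + X[:, 1] - X[:, 2] > 0).astype(int)

# fit a forest and rank features by importance; keep the top 11 (as in [223])
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
top11 = np.argsort(forest.feature_importances_)[::-1][:11]

# retrain a classifier restricted to the selected key features
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X[:, top11], y)
train_acc = clf.score(X[:, top11], y)
```

on real data one would of course report held-out accuracy rather than training accuracy; the 96.97% figure in the text refers to the authors' independent test set.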
the progress of ct image inspection based on ai usually includes the following steps: region of interest (roi) segmentation, lung tissue feature extraction, candidate infection region detection, and covid-19 classification. the representative ai architecture for covid-19 ct image classification is shown in fig. 3. the segmentation of lung organs and rois is a foundational step in ai-based image inspection. it delineates the rois in lung ct images (such as lungs, lung lobes, bronchopulmonary segments, and infected regions or lesions) for further evaluation and quantification. different dl models (such as u-net, v-net, and vb-net) have been used for ct image segmentation [32, 94, 114, 193, 226]. in [179], shan et al. collected 549 ct images from patients with confirmed covid-19 and proposed an improved segmentation model (named vb-net) based on the v-net [117] and resnet [69] models. in [32], chen et al. built a dl model based on the u-net++ structure [244] to extract the rois from each ct image and detect suspicious lesions. in [226], xu et al. used a 3d dl model to segment the infection regions from lung ct images. they then built a classification model using resnet and location-attention structures and divided the segmented region images into three categories: covid-19, influenza-a viral pneumonia, and normal. in [114], li et al. used the u-net segmentation model to extract the lung organ from each lung ct image as an roi. in [94], jin et al. proposed an ai-based covid-19 diagnostic system, which consists of a lung segmentation module and a covid-19 diagnostic module. the lung segmentation module is implemented based on deeplabv1 [33]. in [193], tang et al. used the vb-net model [179] to accurately segment 18 lung regions and infected regions from lung ct images and further calculated 63 quantitative features. 
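the candidate-region step of the pipeline above can be illustrated without any dl model at all: the surveyed works use trained networks, but as a minimal stand-in, simple intensity thresholding followed by connected-component labeling already captures the idea of turning a slice into a set of candidate regions. the threshold and toy image below are hypothetical.

```python
import numpy as np
from collections import deque

def candidate_regions(image, threshold, min_size=1):
    """label 4-connected components of pixels above `threshold` and
    return a list of (size, pixel_coordinates) candidate regions."""
    mask = image > threshold
    seen = np.zeros_like(mask, dtype=bool)
    regions = []
    h, w = mask.shape
    for i in range(h):
        for j in range(w):
            if mask[i, j] and not seen[i, j]:
                comp, queue = [], deque([(i, j)])
                seen[i, j] = True
                while queue:                      # breadth-first flood fill
                    y, x = queue.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
                if len(comp) >= min_size:
                    regions.append((len(comp), comp))
    return regions

# toy slice with two bright "opacities"
img = np.zeros((10, 10))
img[1:3, 1:3] = 1.0   # 4-pixel region
img[6:9, 6:8] = 1.0   # 6-pixel region
regs = candidate_regions(img, threshold=0.5)
```

each candidate region would then be passed to a feature extractor or classifier, which is the role the dl models play in the works surveyed here.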
focusing on the detection and localization of candidate infection regions, different ai methods were proposed in [65, 85, 181, 212]. in [65], gozes et al. used commercial software to identify lung nodules and small opacities within the 3d lung volume. then, they constructed a dl model consisting of u-net and resnet structures, where the u-net module was used to extract the roi regions, and the resnet module was used to detect and classify diffuse turbidity and ground-glass infiltration. the authors compared the ct images of 56 patients with confirmed covid-19 and 101 noncoronavirus patients and analyzed the ct features of covid-19 in detail. in [181], shi et al. used a cnn model based on v-net to segment lung organs and infection regions from lung ct images. then, they used the lasso method to calculate the best ct morphological features. finally, the severity of covid-19 was predicted and evaluated based on the best ct morphological and clinical features. in [212], wang et al. collected 195 ct images from 44 patients with covid-19 and 258 ct images from 55 negative patients. they used a cnn model with the inception structure [191] to classify randomly selected roi images and predict covid-19 disease. in [85], huang et al. used the ai-based inferread ct pneumonia tool to quantitatively evaluate changes in the lung burden of patients with covid-19. the tool includes three modules: lung and lobe region extraction, pneumonia segmentation, and quantitative analysis. the ct image features of covid-19 pneumonia are divided into four types: mild, moderate, severe, and critical. based on roi segmentation and candidate infection region detection, the important features of rois and infection regions are extracted for covid-19 classification [156]. in [156], qi et al. collected 71 ct images from 52 patients with confirmed covid-19 in 5 hospitals. 
they used the pyradiomics method to extract 1,218 features from each ct image and then applied lr and rf methods to these features to distinguish between short-term and long-term hospital stays. in [180], shi et al. used the vb-net model [179] to segment the infection and lung fields from ct images and characterized them with 96 features, including 26 volume features, 31 digital features, 32 histogram features, and 7 surface features. next, they proposed an isarf method to classify features and predict covid-19 disease. comparative experiments showed that the isarf method is superior to the lr, svm, and nn methods. in [241], zheng et al. proposed a 3d dcnn model (named decovnet) to detect covid-19 from ct images. the proposed decovnet model includes three components. the first component uses vanilla 3d convolutional layers to extract lung image features, the second component consists of two 3d residual blocks that perform element-wise transformations on the 3d feature maps, and the third component gradually extracts the information in the 3d feature maps through 3d max-pooling and outputs the probability of covid-19. in [187], song et al. collected 1990 ct images, including 777 images from 88 patients with covid-19, 505 images from 100 patients with bacterial pneumonia, and 708 images from 86 healthy people. they proposed a dre-net dl model based on the pretrained resnet50 structure and feature pyramid networks. dre-net extracts the top-k lesion features from each ct image to predict the classification of patients with covid-19. the lack of large-scale datasets is the main challenge that hinders the implementation of ai-based ct image inspection and affects diagnostic performance. to address these challenges, strategies such as transfer learning, data augmentation, and "human-in-the-loop" were used in [94, 179, 239]. in [94], jin et al. used the imagenet dataset [47] to pretrain the proposed 2d classification network. in [239], zhao et al. 
provided a public covid-19 ct scan dataset, including 275 covid-19 cases and 195 non-covid-19 cases. they used data augmentation and tl methods to alleviate the shortage of training data. in terms of data augmentation, they used transformation operations to expand the training dataset, such as random transformation, cropping, and rotation. in terms of tl, they pretrained the densenet model [84] on the chest x-ray dataset [213] and then used the pretrained model to predict covid-19. in addition, a "human-in-the-loop" strategy was adopted to reduce the workload of radiologists in annotating the training samples [179] . radiologists annotate a small portion of training samples in the first batch of training. then, they manually correct the segmentation results in the second batch and used them as annotations of the images. iterative training is performed in this way to complete the annotation of all training samples. it is commendable that several works provided open-source code of the designed models and online covid-19 ct image inspection systems. for example, li [114] , jin [94] , zheng [241] , and zhao [239] published the proposed dl models on github [88] . in addition, song et al. [187] provided an online ct diagnosis service. wang et al. [212] provided a public website for ct image uploading and testing. in [32] , chen et al. developed a public online ct diagnostic system, and anyone can upload ct images for self-diagnosis. more detailed information about ai-based ct image segmentation and classification methods is provided in table 2 and fig. 4 . compared with ct images, chest x-ray (cxr) images are easier to obtain in clinical radiology inspections. although cxr image inspection is a typical imaging method used for covid-19 diagnosis, it is generally considered to be less sensitive than ct image inspection. some cxr images of patients with early covid-19 showed normal characteristics. 
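the augmentation operations mentioned above (random transformation, cropping, and rotation) can be sketched in a few lines of numpy. this is a generic illustration, not the augmentation pipeline of [239]; the crop fraction and padding scheme are arbitrary choices.

```python
import numpy as np

def augment(image, rng):
    """produce simple augmented variants of a 2d image: flips, a random
    90-degree rotation, and a random crop zero-padded back to shape."""
    variants = [np.fliplr(image),
                np.flipud(image),
                np.rot90(image, k=int(rng.integers(1, 4)))]
    # random crop of 3/4 of each side, then pad back to the original shape
    h, w = image.shape
    y0 = int(rng.integers(0, h // 4))
    x0 = int(rng.integers(0, w // 4))
    crop = image[y0:y0 + 3 * h // 4, x0:x0 + 3 * w // 4]
    padded = np.zeros_like(image)
    padded[:crop.shape[0], :crop.shape[1]] = crop
    variants.append(padded)
    return variants

rng = np.random.default_rng(0)
img = np.arange(64, dtype=float).reshape(8, 8)
augmented = augment(img, rng)
```

each training image thus yields several label-preserving variants, which is how such schemes multiply the effective size of a small ct or cxr dataset.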
the radiological signs of covid-19 cxr images include airspace opacity, ggo, and later consolidation. in addition, a bilateral, peripheral, and lower-zone distribution is mostly observed. examples of cxr images of normal and covid-19 cases are shown in fig. 5 from [124, 160]. the cxr image inspection process based on ai techniques usually includes steps such as data preprocessing, dl model training, and covid-19 classification. the representative ai architecture for covid-19 cxr image inspection is shown in fig. 6. unlike ct images, cxr image segmentation is more challenging because the ribs are projected onto soft tissues, confounding the image contrast. consequently, most dl models focus on the classification of the entire cxr image, while few works focus on segmenting rois and lung organs from cxr images. in [67], hassanien et al. used a classification method to identify and classify covid-19 on lung x-ray images through multilevel thresholding and svm. multilevel image-segmentation thresholding was used to segment the lung organs from the background, and then the svm module classified the infected lungs from the uninfected lungs. focusing on covid-19 classification based on cxr images, several studies built ai-based classification models. the authors of [237] proposed a new dl model, which consists of a backbone network, a classification module, and an anomaly detection module. the backbone network extracts the features of each input cxr image. the classification module and anomaly detection module use the extracted features to generate classification scores and scalar anomaly scores, respectively. in [209], wang et al. introduced the covid-net dcnn model to identify covid-19 cases based on cxr images. the covid-net model uses a large number of convolutional layers in a projection-expansion-projection design pattern. they collected 13,800 cxr images from 13,725 patients (including 183 covid-19 patients) to establish a cxr database (called covidx) for training covid-net. 
it is commendable that the authors provided open-source code of the proposed model and the covidx database. similar to ct images, cxr image inspection also suffers from the lack of large-scale datasets for dl model training. in [121], loey et al. used the gan model [86] to generate additional cxr images, thereby extending the scale of the cxr dataset. in addition, three dl models (alexnet [109], googlenet [190], and resnet18 [69]) were used to classify cxr images into four categories: covid-19, normal, pneumonia bacteria, and pneumonia virus. in [127], maghdid et al. used cnn and alexnet models to train on cxr and ct images to diagnose covid-19 cases, respectively. among them, the alexnet model was pretrained on the imagenet dataset to perform covid-19 classification on the datasets in [41, 124, 140]. unlike existing tl and image augmentation methods, afshar et al. designed a capsule network model (named covid-caps) suitable for small-scale cxr datasets [2]. each layer of the covid-caps model contains multiple capsules, and each capsule represents a specific image instance at a specific position through multiple neurons. the capsule module [75] uses routing-by-agreement to capture alternative models of spatial information and attempts to reach a consensus on the existence of objects. in this way, the routing uses information from instances and objects to identify the relationship between them without the need for large-scale datasets. more detailed information about ai-based cxr image classification methods for covid-19 inspection is shown in table 3 and fig. 7. in addition to rt-pcr detection and image inspection techniques, some noninvasive measurement methods have also been used for covid-19 detection and diagnosis, including cough sound judgment and breathing pattern detection. (1) monitoring covid-19 through ai-based cough sound analysis. schuller et al. 
[174] discussed the potential application of computer audition (ca) and ai in the analysis of cough sounds in patients with covid-19. they first analyzed the ca's ability to automatically recognize and monitor speech and cough under different semantics, such as breathing, dry and wet coughing or sneezing, speech during colds, eating behaviors, drowsiness, or pain. then, they suggested applying ca technology to the diagnosis and treatment of patients with covid-19. however, due to the lack of available datasets and annotation information, there is no report yet on the application of this technology in covid-19 diagnosis. similarly, iqbal et al. [90] also discussed an abstract framework that uses the speech recognition function of mobile applications to capture and analyze the cough sounds of suspicious persons to determine whether the user is healthy or suffers from a respiratory disease. in [214], wang et al. analyzed the respiratory patterns of patients with covid-19 and the breathing patterns of patients with influenza and the common cold. in addition, they proposed a respiratory simulation model (named bi-at-gru) for covid-19 diagnosis. the bi-at-gru model is a gru neural network with bidirectional and attention mechanisms and can classify six types of clinical respiratory patterns: eupnea, tachypnea, bradypnea, biots, cheyne-stokes, and central apnea. (2) covid-19 diagnosis based on noninvasive measurements. in [128], maghdid et al. designed an abstract framework for covid-19 diagnosis based on smartphone sensors. in the proposed framework, smartphones can be used to collect the disease characteristics of potential patients. for example, the sensors can acquire the patient's voice through the recording function and can obtain the patient's body temperature through the fingerprint recognition function. then, the collected data are submitted to the ai-supported cloud server for disease diagnosis and analysis. 
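to make the respiratory-pattern idea concrete: the bi-at-gru model above is a trained recurrent network, but even a trivial rate-based rule separates the simplest patterns (eupnea vs. tachypnea vs. bradypnea). the sketch below is such a stand-in; the rate thresholds are illustrative only and not clinical criteria, and it cannot distinguish the irregular patterns (biots, cheyne-stokes, central apnea), which is precisely where a learned sequence model is needed.

```python
import numpy as np

def breaths_per_minute(breath_times):
    """estimate the respiratory rate from timestamps (in seconds)
    of successive detected breaths."""
    intervals = np.diff(np.asarray(breath_times, dtype=float))
    return 60.0 / intervals.mean()

def classify_rate(bpm):
    """coarse rate-only labels; thresholds are illustrative placeholders."""
    if bpm < 12.0:
        return "bradypnea"
    if bpm > 20.0:
        return "tachypnea"
    return "eupnea"

# one detected breath every 2 seconds -> 30 breaths per minute
label = classify_rate(breaths_per_minute(np.arange(0.0, 60.0, 2.0)))
```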
the virology and pathogenesis of sars-cov-2 are among the most important scientific topics in the fields of biology and medicine. scientists have analyzed the characteristics of the sars-cov-2 virus through proteomic and genomic studies [7, 77, 123]. in the field of virology, the origin and classification of sars-cov-2, its physical and chemical properties, receptor interactions, cell entry, and the ecology and genomic variation of sars-cov-2 have been studied [123, 217, 242]. we mainly discuss the contribution of ai to the pathological research of sars-cov-2 from the perspectives of proteomics and genomics. since the advent of sars-cov-2, there have been a large number of research achievements in proteomics. four types of structural proteins of sars-cov-2 were confirmed, including nucleocapsid (n) proteins, envelope (e) proteins, membrane (m) proteins, and spike (s) proteins [145, 206, 240]. in addition, other proteins translated in the host cells essential for virus replication have also attracted the attention of researchers, such as non-structural protein 5 (nsp5) and 3c-like protease (3clpro). moreover, several studies have shown that sars-cov-2 uses the human angiotensin-converting enzyme 2 (ace2) to enter the host [77, 242]. in this field, ai techniques are used to predict protein structures and analyze the interaction network between proteins and drugs. the representative ai architecture for protein structure prediction is shown in fig. 8. in [176, 177], senior et al. used dl models to implement the alphafold system for protein structure prediction. the alphafold system uses a resnet model [69] to analyze the covariance and amino acid residue contacts in homologous gene sequences and to predict the corresponding protein structures. the alphafold system consists of a feature extraction module and a distance prediction neural network. 
the feature extraction module is responsible for searching for protein sequences that are similar to the input protein sequence and constructing the multiple sequence alignment (msa). the module simultaneously generates residue-position and sequence-profile features, and the resulting 485 feature parameters are input into the distance prediction neural network. the distance prediction neural network is a two-dimensional (2d) resnet structure, which is responsible for accurately predicting the distances between all residue pairs of every two protein sequences. the authors added a one-dimensional output layer to the network to predict the accessible surface area, distance map, and secondary structure of each residue. finally, the generated potential is optimized by gradient descent to generate protein structures. based on [176, 177], jumper et al. [96] used the alphafold system to predict the structure of sars-cov-2 membrane proteins. they published the predicted protein structures, such as 3a, nsp2, nsp4, nsp6, and papain-like proteases. although the structure of these proteins has not been verified by clinical experiments, this publication allows researchers to quickly conduct sars-cov-2 studies. in [145], ortega et al. used a computational method to detect changes in the s1 subunit of the spike receptor-binding domain and determined mutations in the sars-cov-2 spike protein sequence, which may be beneficial for studying human-to-human transmission. they collected sequences for modeling from the protein data bank (pdb) [19] and used swiss-model software [12] to construct the sars-cov-2 spike protein model. then, z-dock software [151] was used to dock the spike protein with ace2, and a clustering algorithm was used to cluster the docking results. the work indicated that the sars-cov-2 spike protein has a higher affinity for human ace2 receptors. 
another branch of ai-assisted proteomics research involves finding new compounds and drug candidates for the treatment of covid-19 by building interaction networks and knowledge maps between proteins and drugs; see section 4 for details. in sars-cov-2 research, genomics is mainly used to analyze the origin of sars-cov-2, support vaccine development, and improve rt-pcr detection. various ai algorithms are applied for similarity comparisons of gene sequences and gene fragments and for mirna prediction [46, 164]. in [164], randhawa et al. used different ml methods to analyze the pathogen sequences of covid-19 and identified intrinsic features of the viral genomes, thereby rapidly classifying new pathogens. they collected the complete reference genome of the covid-19 virus from ncbi [53], the bat β-coronavirus from gisaid [61], and all available virus sequences from virus-host db [135]. each genomic sequence was mapped to a corresponding genomic signal, a discrete digital sequence, by using chaos game representation [93]. in addition, the amplitude spectrum of these genomic signals was calculated by using a discrete fourier transform. on this basis, they trained 6 ml classification models on the resulting sequence distance matrix and compared their performance. finally, they applied the trained ml models to 29 covid-19 sequences to classify covid-19 pathogens. the results of this work support the hypothesis that covid-19 originated in bats, as well as its classification as a β-coronavirus. in [46], demirci et al. performed mirna prediction on the sars-cov-2 genome based on 3 ml methods and identified mirna-like hairpins and mirna-mediated sars-cov-2 infection interactions. they collected the complete covid-19 genome from ncbi [53] and human mature mirna sequences from mirbase [108]. the genomic sequences are transcribed and divided into multiple overlapping fragments, which are folded into secondary structures to extract hairpin structures.
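the transcription and overlapping-fragment step described above can be sketched in a few lines; the window size and step below are illustrative choices, not the values used in [46].

```python
def transcribe(dna):
    # DNA coding strand to RNA: replace thymine with uracil.
    return dna.replace("T", "U")

def overlapping_fragments(seq, size, step):
    # Slide a fixed-size window with the given step to produce
    # candidate fragments for downstream hairpin folding.
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, step)]

rna = transcribe("ATGGCGTACGTTAGCATG")
frags = overlapping_fragments(rna, size=10, step=4)
```

each fragment would then be folded (e.g., by an rna secondary-structure predictor) and scanned for hairpins resembling mirna precursors.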
on this basis, the authors used 3 ml methods (dt, naive bayes, and rf) to predict the category of each hairpin and determined the similarity between the hairpins and human mirnas. they searched for mature mirna targets in human and sars-cov-2 genes and analyzed the potential interactions between sars-cov-2 mirnas and human genes and between human mirnas and sars-cov-2 genes. finally, the gene ontology of sars-cov-2 mirna targets in human genes was analyzed, and the similarity between sars-cov-2 mirna candidates and mature mirnas of any known organism was evaluated using the panther classification system [134]. in [133], metsky et al. used genomic and ai technologies to rapidly design nucleic acid detection assays and improved current rt-pcr testing of sars-cov-2. they developed a crispr tool that uses enzymes to edit the genome by cutting specific genetic sequences and used different ml methods to predict the diversity of the target genome. the authors designed an rt-pcr test through the crispr tool, and it can effectively detect 67 respiratory viruses, including sars-cov-2. in the field of drug development, ai technologies can screen existing drug candidates for covid-19 by analyzing the interactions between existing drugs and covid-19 protein targets. in addition, ai technologies can help to discover new drug-like compounds against covid-19 by constructing new molecular structures that have inhibitory effects on proteases at the molecular level. the representative ai architecture for new drug-like compound discovery is shown in fig. 9. drug development can be divided into small-molecule drug discovery and biological product development. small-molecule drug discovery mainly focuses on chemically synthesized small molecules of active substances, which can be made into small-molecule drugs through chemical reactions between different organic and inorganic compounds.
one group of ai-based drug development studies focuses on the discovery of new drug-like compounds at the molecular level. figure 9. representative ai architecture for new drug-like compound discovery. in [18, 182], beck et al. proposed a dl-based drug-target interaction model (mt-dti) to predict potential drug candidates for covid-19. the mt-dti model uses smiles strings and amino-acid sequences, rather than 3d crystal structures, to predict drug-target interactions. the authors collected the amino-acid sequences of 3c-like proteases and related antiviral drugs and drug targets from the databases of ncbi [53], drug target commons (dtc) [194], and bindingdb [120]. in addition, they used a molecular docking and virtual screening tool (autodock vina [202]) to predict the binding affinity between 3,410 drugs and sars-cov-2 3clpro. the experimental results identified 6 potential drugs: remdesivir, atazanavir, efavirenz, ritonavir, dolutegravir, and kaletra (lopinavir/ritonavir). note that remdesivir has shown promise in clinical trials. in [136], moskal et al. used ai methods to analyze the molecular similarity between anti-covid-19 drugs (termed "parents") and drugs with similar indications to screen out second-generation drugs (termed "progeny") for covid-19. they first used the mol2vec [91] method to convert the molecular structures of the parent drugs into a high-dimensional vector space, treating each drug molecule as a "sentence" and mapping its molecular substructures to "words". then, they used the vae [186] model to generate smiles strings with 3d shapes and pharmacodynamic properties similar to a given seed molecule [63]. in addition, cnn, lstm, and mlp models were used to generate the corresponding smiles strings and molecules. the authors selected 71 parent drugs as seed molecules from the literature and selected 4,456 drugs as candidate progeny drugs from zinc [245] and chembl [56]. in [23], bung et al.
developed new chemical entities for the sars-cov-2 3clpro based on dl technology. they constructed an rl-based rnn model to classify protease inhibitor molecules and obtained a smaller subset that favored the relevant chemical space. they collected 2,515 protease inhibitor molecules in smiles format from the chembl database as training data, where each smiles string is regarded as a time series and each position or symbol is regarded as a time point. the generated small molecules were docked to the 3clpro structure with minimal energy and ranked by the virtual screening scores used to select anti-sars-cov-2 candidates [202]. in [192], tang et al. analyzed 3clpro, whose 3d structure is similar to that of sars-cov, and evaluated it as an attractive target for anti-covid-19 drug development. they proposed an advanced deep q-learning network (called adqn-fbdd) to generate potential lead compounds against sars-cov-2 3clpro. they collected 284 reported molecules as sars-cov-2 3clpro inhibitors. these molecules were split using the improved brics algorithm [45] to obtain the target fragment library of sars-cov-2 3clpro. then, the proposed adqn-fbdd model trains on each target fragment and predicts the corresponding molecules and lead compounds. through the proposed structure-based optimization policy (sbop), they finally obtained 47 derivatives with inhibitory effects on sars-cov-2 3clpro from these lead compounds, which are regarded as potential anti-sars-cov-2 drugs. another group of studies focused on screening candidate biological products for covid-19. biological products are protein-based therapeutics that mainly bind specific cell receptors involved in the disease process. they are prepared from microbial cells such as genetically modified bacteria, yeast, or mammalian cell strains through biotechnology processes. in [79], hu et al.
established a multitask dl model to predict possible binding between potential drugs and sars-cov-2 protein targets, thereby selecting available drugs for sars-cov-2. they first collected 8 sars-cov-2 viral proteins from ghddi [58] as potential targets. the proposed dl model is based on the atomnet model [188, 205] and includes a shared layer to learn a joint representation across all tasks and task-specific layers for performing individual tasks. by fine-tuning the dl model on a coronavirus-specific dataset, the model can predict the possible binding between drugs and protein targets and output a binding affinity score. based on existing studies, rdrp, 3clpro, and papain-like protease have been confirmed as the three principal targets of sars-cov-2 [60, 146, 206]. based on the prediction results [113, 210], the authors selected the top 10 potential drugs with a high likelihood of inhibition for each target. in [98], kadioglu et al. used high-performance computing (hpc), virtual drug screening, molecular docking, and ml technologies to identify sars-cov-2 drug candidates. after performing virtual drug screening and molecular docking, two supervised ml models (nn and naive bayes) were used to analyze clinical drugs and test compounds and to construct corresponding drug-likelihood prediction models. several approved drugs, including those used for the hepatitis c virus (hcv), enveloped ssrna viruses, and other infectious diseases, were selected as sars-cov-2 drug candidates. targeting the known covid-19 protease 3clpro, zhavoronkov et al. [240] designed a small-molecule drug-discovery pipeline to produce 3clpro inhibitors, using 3clpro's crystal structure, homology modeling, and co-crystallized fragments to generate candidate molecules. they collected the crystal structure of covid-19 3clpro from [230] and constructed a homology model.
at the same time, molecules with activity against various proteases were extracted from [56, 89] to constitute a protease peptidomimetic dataset with 5,891 compounds. then, they used 28 ml methods (such as gae, gan, and ga) and rl strategies to separately train on the input datasets (crystal structure, homology model, and co-crystal ligands) and generated new molecular structures with high scores. in [78], hofmarcher et al. used a chemai dl model [154] based on the smileslstm structure [76] to test the activity of molecules against covid-19 proteases. they collected 3.6 million molecules from chembl [56], zinc [245], and pubchem [104] to form a training dataset. then, the chemai model was trained on the dataset in a multitask parallel fashion, where the output neurons of the model represent the biological effects of the input molecules. the authors used the chemai model to predict the inhibitory effects of these molecules on the 3clpro and plpro proteases of covid-19, scoring each molecule's binding, inhibitory, and toxic effects on the targets. a list of ai-based covid-19 drug development methods is provided in table 4. currently, there are 3 types of covid-19 vaccine candidates: (1) whole virus vaccines, (2) recombinant protein subunit vaccines, and (3) nucleic acid vaccines [38, 238]. ai technology has been involved in the design and development of covid-19 vaccines. compared with its explicit applications in other fields, ai is usually embedded implicitly in the sub-processes of vaccine development. the netmhc and netmhcpan ai algorithms are used in the development of covid-19 vaccines for epitope prediction [74, 97, 215]. in [74], herst et al. obtained the sars-cov-2 protein sequences from genbank and used msa to trim the nucleocapsid phosphoprotein sequences to candidate peptide sequences. on this basis, they used the netmhc and netmhcpan ai algorithms to train on and predict peptide sequences [8, 97].
the pan variant of netmhc incorporates in-vitro binding data covering 215 hla molecules for prediction. finally, they averaged the outputs of the ann, svm, netmhc, and netmhcpan methods to rank the vaccine candidates. in [215], ward et al. downloaded the sars-cov-2 nucleotide sequences from the ncbi [53] and gisaid [61] databases and generated a consensus sequence for each sars-cov-2 protein. these sequences can be used as references for prediction, specificity, and epitope mapping analysis. next, the authors used different epitope prediction tools to predict b cell epitopes and map them to the amino-acid sequences of each gene. on this basis, they used the ai-based netmhcpan algorithm to predict hla-1 peptides and obtained a total of 2,915 alleles across all peptide lengths. the blastp tool [6] was used to map the short amino-acid epitope sequences to the canonical sequences of sars-cov-2 proteins. finally, the authors provided an online tool offering sars-cov-2 genetic variation analysis, epitope prediction, coronavirus homology analysis, and candidate proteome analysis. in [143], ong et al. used ml and reverse vaccinology (rv) methods to predict and evaluate potential vaccines for covid-19. they used rv to analyze the bioinformatics of pathogen genomes to identify promising vaccine candidates. they obtained the sars-cov-2 sequences and all proteins of the 6 known human coronavirus strains from the ncbi [53] and uniprot [17] databases. then, they used vaxign and vaxign-ml [71, 142] to analyze the complete proteomes of the coronaviruses and predicted their biological characteristics. next, they improved the vaxign-ml model based on ml and rv using the lr, svm, knn, rf, and xgboost methods and predicted the protective antigenicity of all sars-cov-2 proteins. the nsp3 protein was selected for phylogenetic analysis, and the immunogenicity of nsp3 was evaluated by predicting t cell mhc-i and mhc-ii and linear b cell epitopes. in [158], qiao et al.
used dl to predict patients' mutated neoantigens and identified the best t-cell epitopes for peptide-based covid-19 vaccines. they first sequenced the diseased cells in the patients' blood and extracted 6 human leukocyte antigen (hla) types and t-cell receptor (tcr) sequences. then, they proposed the deepnovo model to train on the patients' immune peptides and to identify the best t-cell epitope set based on a person's hla alleles and immune peptidome information. the deepnovo model uses lstm and rnn structures to capture sequence patterns in peptides or proteins and predicts hla peptides from conserved regions of the virus, thereby predicting new mutant antigens in patients. in addition, they used the iedb [204] tool to predict the immunogenicity of 177 peptides. they suggested designing an epitope-based covid-19 vaccine specific to each person based on their hla alleles. the prediction of immune stimulation ability is an important part of vaccine design [162, 166]. different ml methods and position-specific scoring matrices (pssm) are usually used to predict epitopes and immune interactions, thereby predicting the generation of adaptive immunity in the target host. in [162], rahman et al. used immuno-informatics and comparative genomic methods to design a multi-epitope peptide vaccine against sars-cov-2 that combines epitopes of the s, m, and e proteins. they used the ellipro antibody epitope prediction tool [87] to predict linear b cell epitopes on the s protein. ellipro uses multiple ml methods to predict and visualize b-cell epitopes in a given protein sequence or structure. in addition, sarkar et al. [172] studied epitope-based vaccine design for covid-19 and used the svm method to predict the toxicity of the selected epitopes. in [153], prachar et al. used 19 combined epitope-hla prediction tools, including the iedb, ann, and pssm algorithms, to predict and verify 174 sars-cov-2 epitopes.
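as a minimal illustration of the pssm idea behind these epitope tools, the sketch below builds a log-odds matrix from a few toy 9-mer "binder" peptides; the +1 pseudocounts and uniform 1/20 background are our simplifying assumptions, not the scoring scheme of any specific tool.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_pssm(peptides):
    # Log-odds position-specific scoring matrix with +1 pseudocounts,
    # assuming a uniform 1/20 background frequency for every residue.
    length = len(peptides[0])
    pssm = []
    for pos in range(length):
        column = [p[pos] for p in peptides]
        scores = {}
        for aa in AMINO_ACIDS:
            freq = (column.count(aa) + 1) / (len(column) + 20)
            scores[aa] = math.log(freq / 0.05)
        pssm.append(scores)
    return pssm

def score_peptide(peptide, pssm):
    # Sum the per-position log-odds scores of the candidate peptide.
    return sum(pssm[i][aa] for i, aa in enumerate(peptide))

# Toy 9-mers standing in for known hla-binding peptides.
binders = ["KLMDRWYLV", "KLFDKWYLV", "KLYDRWYMV"]
pssm = build_pssm(binders)
```

a candidate resembling the binders then scores above zero, while an unrelated peptide scores below it; real tools replace the toy background and training set with measured binding data.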
thanks to well-developed information and multimedia technology, the outbreak and spread of covid-19 were reported in a timely and accurate manner. the numbers of suspected, confirmed, cured, and dead covid-19 cases in each country/region are announced in real time. in addition, passenger travel trajectories and related big data are shared for scientific research. based on these rich data, numerous researchers have worked on the prediction, spread analysis, and tracking of the covid-19 outbreak. researchers collected clinical covid-19 case data and used different ai methods to extract important features and to predict the mortality and survival rates of patients with covid-19. the representative ai architecture for the prediction of patient mortality and survival rate is shown in fig. 10. figure 10. representative ai architecture for prediction of patient mortality and survival rate. in [152], pourhomayoun et al. used 6 ai methods to predict the mortality rate of patients with covid-19. they used public data on patients with covid-19 from 76 countries around the world [225] and compiled 112 features, including 80 medical annotations and disease features and 32 features from the patients' demographic and physiological data. using filter and wrapper feature selection methods, the 42 best features were extracted, such as demographic features, general medical information, and patient symptoms. on this basis, 6 ai methods (svm, nn, rf, dt, lr, and knn) were used to predict the mortality of patients with covid-19. in [173], sarkar et al. used the rf model to analyze the records of 433 patients with covid-19 from kaggle [43] and identified the important features and their impact on mortality. experimental results show that patients over 62 years of age have a higher risk of death. in [228, 229], yan et al.
analyzed a blood sample dataset of 404 patients with covid-19 in wuhan, china, and used the xgboost classification method [37] to select three important biomarkers and to predict individual patient survival rates. experimental results, with an accuracy of 90%, indicated that higher ldh levels seem to play an important role in distinguishing the most critical covid-19 cases. bluedot [21] and metabiota [132] are two ai companies that made accurate predictions for the covid-19 outbreak. bluedot collected large-scale heterogeneous data from various sources, such as news reports, global ticketing data, animal diseases, global infectious disease alerts, and real-time climate conditions. it then used filtering tools to narrow its focus, used various ml and natural language processing (nlp) techniques to detect, mark, and display the potential risk frequency of covid-19, and predicted the timing of outbreak transmission. it is worth mentioning that, 9 days before the official announcement of the covid-19 outbreak, bluedot accurately predicted the covid-19 epidemic and the cities with a high risk of virus outbreaks. metabiota collected large-scale data from social and nonsocial sources (such as biological, socioeconomic, political, and environmental data) and used technologies such as ai, ml, big data, and nlp to accurately predict the outbreak, spread, and intervention measures of covid-19. more ai-based covid-19 outbreak and transmission prediction methods are shown in table 5. table 5. covid-19 outbreak and transmission prediction based on ai methods.
ref. | data sources | methods | country/region
huang [82] | yang [231], who [216] | cnn, lstm, mlp, gru | china
hu [80, 81] | the paper [148], who [216] | mae, clustering | china
yang [233] | baidu [16] | seir, lstm | china
fong [51, 52] | nhc [139] | svm, pnn | china
ai [3] | who [54, 216] | anfis, fpa | china, usa
rizk [168] | who [216] | isacl-mfnn | usa, italy, spain
giuliani [62] | italy [144] | emtmgl | italy
ayyoubzadeh [14] | worldometer [218], google [201] | lr, lstm | iran
marini [129, 130] | swiss population | enerpol | switzerland
lai [110] | iata [126], worldpop [219] | ml | global
punn [155] | jhu csse [49] | svr, pr, dnn, lstm, rnn | -
lampos [111] | mediacloud [131], phe [64], ecdc [55] | transfer learning | global

although the source of the covid-19 epidemic has not yet been identified, it was first reported in wuhan, china. therefore, the outbreak and spread of covid-19 in china have received extensive attention. in [82], huang et al. used 4 dl models (cnn, lstm, gru, and mlp) to train on and predict the covid-19 case data from 7 severely affected cities in china. the input of these dl models is the features of the covid-19 cases, including the numbers of confirmed cases, cured cases, and deaths. based on the input of the previous 5 days, each model predicts the number of covid-19 cases for the following few days. the architecture of the dl-based covid-19 outbreak prediction model is shown in fig. 11. figure 11. architecture of the covid-19 outbreak prediction model based on dl models. in [80, 81], hu et al. used ai methods such as mae and clustering algorithms to predict the number of confirmed covid-19 cases in different provinces and cities in china. in addition, they clustered 34 provinces and cities in china into 9 clusters based on the prediction results and further predicted the spread of covid-19 among provinces and cities. in [233], yang et al. used the seir model [101] and an lstm model to predict covid-19 in china.
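the sliding-window setup used by huang et al. [82] (the previous 5 days as input, the following day as target) reduces to constructing supervised pairs from a daily series; a minimal sketch with toy counts follows (the numbers are illustrative, not real case data).

```python
def make_windows(series, window=5, horizon=1):
    # Turn a daily case series into supervised (input, target) pairs:
    # the previous `window` days predict the value `horizon` days ahead.
    pairs = []
    for i in range(len(series) - window - horizon + 1):
        pairs.append((series[i:i + window], series[i + window + horizon - 1]))
    return pairs

# Toy daily confirmed-case counts.
daily_cases = [41, 45, 62, 121, 198, 291, 440, 571, 830]
pairs = make_windows(daily_cases, window=5, horizon=1)
```

the resulting pairs feed directly into any of the regressors named above (lstm, gru, mlp, svr, etc.).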
the population migration data and the latest covid-19 epidemiological data from baidu [16] were input into the seir model to derive the epidemic curves. in addition, they used sars data from 2003 to pretrain the lstm model to predict covid-19 cases for the following few days, in which epidemiological parameters, such as the transmission, incubation, and recovery probabilities and the number of deaths, were selected as input features. both the seir and lstm models predicted a daily infection peak of 4,000 in the first week of february. in [51, 52], fong et al. obtained early covid-19 epidemiological data from the nhc [139]. then, they used traditional time-series analysis methods (e.g., arima, exponential smoothing, and holt-winters), ml methods (e.g., kr, svm, and dt), and ai methods (e.g., pnn) to analyze and predict future outbreaks. in addition to china, the outbreak and spread of covid-19 in other countries (including the united states, italy, spain, iran, and switzerland) have also received widespread attention. in [3], ai et al. proposed an improved anfis method [92] to predict the number of covid-19 cases. the proposed system combines fuzzy logic and neural networks and uses an enhanced flower pollination algorithm (fpa) [232] for model parameter optimization and model training. in [168], rizk et al. proposed an improved multi-layer feed-forward neural network (isacl-mfnn) model, which uses an internal search algorithm (isa) to optimize model parameters and uses the cl strategy to enhance the isa performance. from the official covid-19 dataset reported by the who [216], data from january 22, 2020, to april 3, 2020, for the united states, italy, and spain were collected to train the isacl-mfnn model and to predict the confirmed cases within the next 10 days. in [62], giuliani et al. collected the numbers of infected people in italian provinces [144] and used the emtmgl model to simulate and predict the spatial and temporal distribution of covid-19 infection in italy.
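the seir dynamics referred to throughout this section can be sketched with a forward-euler integration of the standard odes; the parameter values and population sizes below are purely illustrative, not those calibrated in [233] or [101].

```python
def seir_step(s, e, i, r, beta, sigma, gamma, n, dt=1.0):
    # One forward-Euler step of the standard SEIR ODEs:
    #   ds/dt = -beta*s*i/n,  de/dt = beta*s*i/n - sigma*e,
    #   di/dt = sigma*e - gamma*i,  dr/dt = gamma*i
    new_inf = beta * s * i / n
    s2 = s + dt * (-new_inf)
    e2 = e + dt * (new_inf - sigma * e)
    i2 = i + dt * (sigma * e - gamma * i)
    r2 = r + dt * (gamma * i)
    return s2, e2, i2, r2

def simulate(days, s, e, i, r, beta, sigma, gamma):
    # Integrate day by day and record the infectious compartment.
    n = s + e + i + r
    infected = [i]
    for _ in range(days):
        s, e, i, r = seir_step(s, e, i, r, beta, sigma, gamma, n)
        infected.append(i)
    return infected

# Illustrative parameters: beta = transmission rate, sigma = 1/incubation
# period, gamma = 1/infectious period (all values hypothetical).
curve = simulate(days=120, s=10_000_000, e=100, i=50, r=0,
                 beta=0.5, sigma=1 / 5.2, gamma=1 / 10)
peak_day = curve.index(max(curve))
```

the epidemic curve rises, peaks as susceptibles are depleted, and declines; calibrating beta, sigma, and gamma against reported data is what the cited studies add on top of this skeleton.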
in [14], ayyoubzadeh et al. used real-time covid-19 epidemic data from google trends [201] and worldometer [218] to predict covid-19 cases in iran. they collected daily epidemic data, organized them as time series, and then used the lr and lstm models to make predictions, thereby obtaining the outbreak and spread trend of covid-19 in iran. in [129, 130], marini et al. developed an agent-based ai platform to predict the development of covid-19 in switzerland. the system takes the entire swiss population as input data to simulate and predict the spread of covid-19 in switzerland. it simulates people's daily trajectories, calibrated against micro-census data, and effectively predicts individual contacts and possible transmission routes. many studies have likewise focused on the prediction of the spread of covid-19 around the world. they collected large amounts of travel data, mobile phone data, and social media data and used ai methods to predict the potential transmission range and transmission routes of covid-19. in [110], lai et al. collected a large amount of travel and mobile phone data from [219] and constructed corresponding models to predict the transmission risk of covid-19 in different countries. on this basis, they established air travel network models between domestic cities and cities in other countries to predict high-risk cities at home and abroad. in [155], punn et al. used 2 ml models (svr [13] and pr [44]) and 3 dl regression models (dnn, lstm [66], and rnn) to predict real-time covid-19 cases. in [111], lampos et al. used an automatic crawling tool to obtain daily confirmed covid-19 case data and related articles from online media such as mediacloud [131], public health england (phe) [64], and the european centre for disease prevention and control (ecdc) [55].
they used the tl strategy to transfer the covid-19 model of a country where the disease had already spread to other countries still in the early stage of the epidemic curve, thus enabling epidemic prediction for the target country. in addition, companies such as microsoft bing [20], google [201], and baidu [16] have aggregated multiple available data sources and developed covid-19 global tracking systems that provide a visual tracking interface. in addition to ai methods, various methods based on statistics and epidemiology are used to predict the outbreak and spread of covid-19. in [70], he et al. collected the highest viral loads in the pharyngeal swabs of 94 patients with confirmed covid-19. they fitted a generalized additive model with identity links and smoothing splines to analyze the overall trend. a gamma distribution was fitted to the transmission-pair data to estimate the serial interval distribution. the results of the statistical analysis showed that patients with confirmed covid-19 reach the peak of virus shedding before or during symptom onset, and some transmission may occur before the initial symptoms. in [208], wang et al. determined a set of technical indicators (e.g., the number of infection cases in hospital, the daily infection rate, and the daily cure rate) that reflect the infection status of covid-19. next, they proposed a calculation method based on statistical theory to quantify the characteristics of each period and predict the turning point in the development of the epidemic. in addition, numerous studies based on the susceptible-infected-recovered (sir) and seir models have studied the spread of covid-19 from an epidemiological perspective; please see [25, 119, 169, 185, 195, 199, 234] for more information. when covid-19 appeared, most countries in the world adopted different forms of social control, social distancing, school closures, and lockdown measures to prevent the spread of the epidemic [203].
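a turning-point indicator in the spirit of wang et al. [208] can be sketched as follows; the "patience" rule and the toy counts are our own simplification, not their statistical method.

```python
def daily_rate(new_infections, current_infected):
    # Daily infection rate: new infections relative to active cases.
    return [n / c for n, c in zip(new_infections, current_infected)]

def turning_point(new_infections, patience=3):
    # Declare a turning point on the first day followed by `patience`
    # consecutive days of non-increasing new infections; return None
    # if the series never satisfies the rule.
    for day in range(len(new_infections) - patience):
        window = new_infections[day:day + patience + 1]
        if all(window[k + 1] <= window[k] for k in range(patience)):
            return day
    return None

# Toy daily new-case counts that rise, peak, then decline.
new_cases = [10, 25, 60, 140, 180, 210, 190, 160, 130, 100, 80]
tp = turning_point(new_cases, patience=3)
```

requiring several consecutive non-increasing days guards against declaring a turning point on a single-day reporting dip.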
ai technologies have been widely used in epidemic control and social management, including individual temperature detection, video tracking, contact tracing, and intelligent robots. many countries have used smart devices equipped with ai to detect suspicious persons in transportation hubs such as airports and train stations [40, 167]. for example, infrared cameras are used to scan for high temperatures in a crowd, and different ai methods perform efficient analysis to detect in real time whether an individual is wearing a mask. in addition, dl-based video tracking technology is used to detect and track suspected covid-19 patients in public places [31]. moreover, at the entrances and exits of cities, the identity information of each passing person is collected; ai-based systems are then used to efficiently query the travel history and trajectory of each passing individual to check whether they come from a region seriously affected by covid-19 [35, 197]. ai technologies are also used in contact tracing of patients with covid-19 [100]. for each patient with confirmed covid-19, personal data such as mobile phone positioning data, consumption records, and travel records may be integrated to identify the potential transmission trajectory [57]. in addition, when people are in social isolation, mobile phone positioning and ai frameworks can assist the government in better understanding the status of individuals [165]. moreover, intelligent robots are used to perform site disinfection and product transfer, and mobile phone positioning functions are used to detect and track the distribution and flow of personnel. another group of studies focused on the impact of various social control strategies on the spread of covid-19. the implementation and performance of ai greatly depend on large-scale available data and resources.
therefore, we compiled available public resources that can be used for covid-19 disease diagnosis, virology research, drug and vaccine development, and epidemic and transmission prediction. three types of data and resources were summarized: medical images, biological data, and informatics resources. we collected 17 groups of covid-19 medical images, such as cxr and ct images, from individual researchers and organizations. among them, the cxr image dataset published by cohen et al. [41] is widely cited; it is a collection of cxr images gathered from multiple references. in addition, many researchers uploaded cxr and ct images to kaggle [15, 112, 124, 138, 160] for covid-19 research. moreover, organizations such as the british society of thoracic imaging (bsti), eurorad, and radiopaedia have also released online cxr and ct images. table 6 displays a detailed description of the medical image data resources for covid-19. table 6. medical image data resources for covid-19 research.

source | data type | cited by refs.
zhao [239] | ct images | [239]
hrct [48] | ct images | [94]
armato [11] | ct images | [94]
coronacases [42] | ct images | -
medical segmentation [175] | ct images | -
cohen [41] | cxr images | [2, 9, 73, 121, 127, 137, 178, 209, 237]
wang [213] | cxr images | [237, 239]
covidx [24] | cxr images | [209]
adrian [159] | cxr images | [73]
covid-net [209] | cxr images | [209]
kermany [102] | cxr images | [9]
mendeley data [5] | cxr images | -
kaggle [15, 112, 124, 138, 160] | cxr and ct images | [2, 9, 121, 127, 137, 173, 178, 209]
bsti [140] | cxr and ct images | [127]
sirm [184] | cxr and ct images | -
eurorad [50] | cxr and ct images | -
radiopaedia [161] | cxr and ct images | -

we collected 10 biological data resources, such as ncbi, the protein data bank (pdb), uniprot, clarivate analytics integrity (cai), drug target commons (dtc), and virus-host db (vhdb), as shown in table 7. these resources provide abundant biological data, including gene sequences, proteins, drug molecules and compounds, and mirna sequences.
informatics resources such as covid-19 situation reports, dashboards, covid-19 case counts, and demographic data are gathered in table 8.

resource | data type | description | cited by refs.
vhdb [135] | genome sequences | virus sequences | [164]
pdb [149] | proteins | 3d shapes of proteins, nucleic acids, and assemblies | [19, 145]
uniprot [17] | proteins | sars-cov-2 protein entries and receptors | [143]
mirbase [108] | mirna sequences | human mature mirna sequences | [46]
zinc [245] | drug compounds | drug compounds and molecules | [78, 98, 136]
dtc [194] | drug molecules | drug molecules for 3c-like proteases | [18, 182]
cai [89] | drug discovery | knowledge-based drug discovery and development | [240]
bindingdb [120] | amino-acid sequences | amino-acid sequences of 3c-like proteases | [18, 182]
worldpop [219] | demographic data | spatial demographic and air travel data | [110]
ghddi [58] | community | drug discovery community | [79]
humdata [68] | community | community perceptions of covid-19 | -

we summarize the main challenges currently faced by ai against covid-19 and provide corresponding suggestions. at present, the application of ai in covid-19 research mainly faces four challenges: the lack of available large-scale training data, massive noisy data and rumors, limited knowledge at the intersection of computer science and medicine, and data privacy and human rights protection.
• lack of available large-scale training data. most ai methods rely on large-scale annotated training data, including medical images and various biological data. however, due to the rapid outbreak of covid-19, there are insufficient datasets available for ai. in addition, annotating training samples is very time-consuming and requires professional medical personnel.
• massive noisy data and rumors. with the highly developed mobile internet and social media, massive noisy information and fake news about covid-19 have been published on various online media without rigorous review.
however, ai algorithms seem to be powerless in judging and filtering such noisy and erroneous data. this problem limits the application and performance of ai, especially in epidemic prediction and transmission analysis.

• limited knowledge at the intersection of computer science and medicine. many ai scientists come from computer science, but applying ai in the covid-19 battle requires in-depth cooperation across computer science, medical imaging, bioinformatics, virology, and many other disciplines. it is therefore crucial to coordinate the cooperative work of researchers from different fields and to integrate the knowledge of multiple subjects to jointly deal with covid-19.

• data privacy and human rights protection. in the era of big data and ai, the cost of obtaining personal privacy data is very low. faced with public health issues such as covid-19, many governments want to obtain various types of personal information, including mobile phone positioning data, personal travel trajectory data, and patient disease data. how to effectively protect personal privacy and human rights during information acquisition and ai-based processing is an issue worthy of discussion and attention.

in addition to the applications investigated in this paper, ai can also contribute to the battle against covid-19 in the following 10 potential directions.

1. noncontact disease detection. in cxr and ct image detection, noncontact automatic image acquisition can effectively avoid the risk of infection between radiologists and patients during the covid-19 pandemic. ai can be used for patient posture positioning, standard section acquisition of cxr and ct images, and movement of camera equipment.

2. remote video diagnosis. ai and nlp technologies can be used to develop remote video diagnosis systems and chat robot systems that provide covid-19 disease consultation and preliminary diagnosis to the public.

3. patient prognosis management. ai technology (such as intelligent image and video analysis) can be used to automatically monitor patient behavior during follow-up monitoring and prognostic management, in addition to long-term tracking and management of patients with covid-19.

4. biological research. in the field of biological research, ai can be used to discover protein structures and features of the virus through accurate analysis of biomedical information, such as large-scale protein structures, gene sequences, and viral trajectories.

5. drug and vaccine development. ai can not only be used to discover potential drugs and vaccines but also to simulate the interactions between drugs and proteins and between vaccines and receptors, thereby predicting the potential responses to drugs and vaccines of covid-19 patients with different constitutions.

6. identification and filtering of fake news. ai can be used to reduce and eliminate fake news and noisy data on online social media platforms, providing reliable, correct, and scientific information about the covid-19 pandemic.

7. impact simulation and evaluation. various simulation models can use ai to analyze the impact of different social control strategies on disease transmission; they can then be used to explore more effective and scientific approaches to disease prevention and social control.

8. patient contact tracing. by constructing social relationship networks and knowledge graphs, ai can identify and track the trajectories of people in close contact with covid-19 patients, thereby accurately predicting and controlling the potential spread of the disease.

9. intelligent robots. intelligent robots are expected to be used in applications such as disinfection and cleaning in public places, product distribution, and patient care.

10. intelligent internet of things. ai is expected to be combined with the internet of things and deployed at customs, airports, railway stations, bus stations, and business centers.
in this case, suspected covid-19 cases and patients can be identified quickly through intelligent monitoring of the environment and personnel.

in this survey, we investigated the main scope and contributions of ai in combating covid-19. compared with the sars-cov pandemic of 2003 and the mers-cov outbreak of 2012, ai technologies have been successfully applied in almost every corner of the covid-19 battle. the applications of ai in covid-19 research can be summarized in four aspects: disease detection and diagnosis, virology research, drug and vaccine development, and epidemic and transmission prediction. among them, medical image analysis, drug discovery, and epidemic prediction are the main battlefields of ai in the fight against covid-19. we also summarized the currently available data and resources for ai-based covid-19 research, including medical imaging data, biological data, and informatics resources. finally, we highlighted the main challenges and potential directions in this field. this survey provided medical and ai researchers with a comprehensive view of the existing and potential contributions of ai in combating covid-19, with the goal of inspiring them to continue to maximize the advantages of ai and big data in the fight against this pandemic.

references
- the covid-19 pandemic calls for spatial distancing and social closeness: not for social distancing
- covid-caps: a capsule network-based framework for identification of covid-19 cases from x-ray images
- optimization method for forecasting confirmed cases of covid-19 in china
- artificial intelligence (ai) provided early detection of the coronavirus (covid-19) in china and will influence future urban health policy internationally
- augmented covid-19 x-ray images dataset. mendeley data
- basic local alignment search tool
- the proximal origin of sars-cov-2
- gapped sequence alignment using artificial neural networks: application to the mhc class i system
- covid-19: automatic detection from x-ray images utilizing transfer learning with convolutional neural networks
- covid-19: a novel coronavirus and a novel challenge for critical care
- the lung image database consortium (lidc) and image database resource initiative (idri): a completed reference database of lung nodules on ct scans
- the swiss-model workspace: a web-based environment for protein structure homology modelling
- support vector regression
- predicting covid-19 incidence using google trends and data mining techniques: a pilot study in iran
- covid-19 x-rays. website
- real-time covid-19 data. website
- the universal protein resource (uniprot)
- predicting commercially available antiviral drugs that may act on the novel coronavirus (sars-cov-2) through a drug-target interaction deep learning model
- announcing the worldwide protein data bank
- covid-19 tracker. website
- an ai epidemiologist sent the first warnings of the wuhan virus. website
- random forests
- de novo design of new chemical entities (nces) for sars-cov-2 using artificial intelligence
- covid-19 chest x-ray dataset initiative. website
- chinese and italian covid-19 outbreaks can be correctly described by a modified sird model. medrxiv
- artificial intelligence applied on chest x-ray can aid in the diagnosis of covid-19 infection: a first experience from lombardy, italy. medrxiv
- a familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster
- modelling transmission and control of the covid-19 pandemic in australia
- a comparative study of fine-tuning deep learning models for plant disease identification
- clinical characteristics and intrauterine vertical transmission potential of covid-19 infection in nine pregnant women: a retrospective review of medical records
- distributed deep learning model for intelligent video surveillance systems with edge computing
- deep learning-based model for detecting 2019 novel coronavirus pneumonia on high-resolution computed tomography: a prospective study. medrxiv
- deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs
- keep up with the latest coronavirus research
- covid-19 control in china during mass population movements at new year. the lancet
- fangcang shelter hospitals: a novel concept for responding to public health emergencies. the lancet
- xgboost: a scalable tree boosting system
- the sars-cov-2 vaccine pipeline: an overview
- xception: deep learning with depthwise separable convolutions
- in a time of coronavirus, china's investment in ai is paying off in a big way
- covid-19 image data collection
- ct images of confirmed covid-19 cases. mendeley data
- novel corona virus 2019 dataset. website
- approximate optimal designs for multivariate polynomial regression
- on the art of compiling and using 'drug-like' chemical fragment spaces
- computational analysis of microrna-mediated interactions in sars-cov-2 infection. biorxiv
- imagenet: a large-scale hierarchical image database
- building a reference multimedia database for interstitial lung diseases
- an interactive web-based dashboard to track covid-19 in real time
- images of covid-19 cases. mendeley data
- composite monte carlo decision making under high uncertainty of novel coronavirus epidemic using hybridized deep learning and fuzzy rule induction
- finding an accurate early forecasting model from small dataset: a case of 2019-ncov novel coronavirus outbreak
- genome sequencing data of sars-cov-2. website
- disease control and prevention. weekly influenza confirmed cases. website
- disease prevention and control. geographic distribution covid-19 cases worldwide. website
- chembl: a large-scale bioactivity database for drug discovery
- the role of close contacts tracking management in covid-19 prevention: a cluster investigation in jiaxing, china
- targeting covid-19: ghddi info sharing portal. website
- first known person-to-person transmission of severe acute respiratory syndrome coronavirus 2 (sars-cov-2) in the usa. the lancet
- structure of the respiratory syncytial virus polymerase complex
- gisaid: global initiative on sharing all influenza data. website
- modelling and predicting the spatio-temporal spread of coronavirus disease 2019 (covid-19) in italy
- automatic chemical design using a data-driven continuous representation of molecules
- coronavirus (covid-19) cases in the uk. website
- rapid ai development cycle for the coronavirus (covid-19) pandemic: initial results for automated detection and patient monitoring using deep learning ct image analysis
- lstm: a search space odyssey
- automatic x-ray covid-19 lung image classification system based on multi-level thresholding and support vector machine. medrxiv
- covid-19 pandemic. website
- deep residual learning for image recognition
- temporal dynamics in viral shedding and transmissibility of covid-19
- vaxign: the first web-based vaccine design program for reverse vaccinology and applications for vaccine development
- feasibility of controlling covid-19 outbreaks by isolation of cases and contacts. the lancet global health
- covidx-net: a framework of deep learning classifiers to diagnose covid-19 in x-ray images
- an effective ctl peptide vaccine for ebola zaire based on survivors' cd8+ targeting of a particular nucleocapsid protein epitope with potential implications for covid-19 vaccine design
- matrix capsules with em routing
- long short-term memory
- sars-cov-2 cell entry depends on ace2 and tmprss2 and is blocked by a clinically proven protease inhibitor
- large-scale ligand-based virtual screening for sars-cov-2 inhibitors using deep neural networks
- prediction of potential commercially inhibitors against sars-cov-2 by multi-task deep model
- artificial intelligence forecasting of covid-19 in china
- evaluating the effect of public health intervention on the global-wide spread trajectory of covid-19. medrxiv
- multiple-input deep convolutional neural network model for covid-19 forecasting in china. medrxiv
- clinical features of patients infected with 2019 novel coronavirus in wuhan, china. the lancet
- densely connected convolutional networks
- serial quantitative chest ct assessment of covid-19: deep-learning approach
- generative adversarial nets
- ellipro: an antibody epitope prediction tool. website
- clarivate analytics integrity. website
- active surveillance for covid-19 through artificial intelligence using concept of real-time speech-recognition mobile application to analyse cough sound
- mol2vec: unsupervised machine learning approach with chemical intuition
- anfis: adaptive-network-based fuzzy inference system
- chaos game representation of gene structure
- development and evaluation of an ai system for covid-19 diagnosis. medrxiv
- virology, epidemiology, pathogenesis, and control of covid-19
- computational predictions of protein structures associated with covid-19. website
- netmhcpan-4.0: improved peptide-mhc class i interaction predictions integrating eluted ligand and peptide binding affinity data
- identification of novel compounds against three targets of sars cov-2 coronavirus by combined virtual screening and supervised machine learning
- health security capacities in the context of covid-19 outbreak: an analysis of international health regulations annual report data from 182 countries. the lancet
- the efficacy of contact tracing for the containment of the 2019 novel coronavirus (covid-19). medrxiv
- modeling infectious diseases in humans and animals
- identifying medical diagnoses and treatable diseases by image-based deep learning
- a simple and multiplex loop-mediated isothermal amplification assay for rapid detection of sars-cov
- pubchem substance and compound databases
- social distancing strategies for curbing the covid-19 epidemic. medrxiv
- sars-cov-2 detection in patients with influenza-like illness
- interventions to mitigate early spread of sars-cov-2 in singapore: a modelling study
- mirbase: from microrna sequences to function
- imagenet classification with deep convolutional neural networks
- assessing spread risk of wuhan novel coronavirus within and beyond china
- tracking covid-19 using online search
- covid-19 x-rays. website
- therapeutic options for the 2019 novel coronavirus
- artificial intelligence distinguishes covid-19 from community acquired pneumonia on chest ct
- characterizing the propagation of situational information in social media during covid-19 epidemic: a case study on weibo
- preliminary assessment of the covid-19 outbreak using 3-staged model e-ishr
- volumetric medical image segmentation: a 3d deep coarse-to-fine framework and its adversarial examples
- hydroxychloroquine, a less toxic derivative of chloroquine, is effective in inhibiting sars-cov-2 infection in vitro
- covid-19 progression timeline and effectiveness of response-to-spread interventions across the united states. medrxiv
- bindingdb: a web-accessible database of experimentally determined protein-ligand binding affinities
- within the lack of covid-19 benchmark dataset: a novel gan with deep transfer learning for corona-virus detection in chest x-ray images
- development of a novel reverse transcription loop-mediated isothermal amplification method for rapid detection of sars-cov-2
- genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding
- chest x-ray images (pneumonia)
- visualizing data using t-sne
- international air transport association (iata). website
- diagnosing covid-19 pneumonia from x-ray and ct images using deep learning and transfer learning algorithms
- a novel ai-enabled framework to diagnose coronavirus covid-19 using smartphone embedded sensors: design study
- enhancing response preparedness to influenza epidemics: agent-based study of 2050 influenza season in switzerland. simulation modelling practice and theory
- covid-19 epidemic in switzerland: growth prediction and containment strategy using artificial intelligence and big data. medrxiv
- how ai is battling the coronavirus outbreak. website
- crispr-based surveillance for covid-19 using comprehensive machine learning design. biorxiv
- panther in 2013: modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees
- linking virus genomes with host taxonomy
- suggestions for second-pass anti-covid-19 drugs based on the artificial intelligence measures of molecular similarity, shape and pharmacophore distribution
- automatic detection of coronavirus disease (covid-19) using x-ray images and deep convolutional neural networks
- rsna pneumonia detection challenge. website
- of the people's republic of china. real-time covid-19 report. website
- covid-19 bsti imaging database. website
- polynomial neural networks architecture: analysis and design
- vaxign-ml: supervised machine learning reverse vaccinology model for improved prediction of bacterial protective antigens
- covid-19 coronavirus vaccine design using reverse vaccinology and machine learning
- the coronavirus datasets in italy. website
- role of changes in sars-cov-2 spike protein in the interaction with the human ace2 receptor: an in silico analysis
- characterization of spike glycoprotein of sars-cov-2 on virus entry and its immune cross-reactivity with sars-cov
- identification of a potential mechanism of acute kidney injury during the covid-19 outbreak: a study based on single-cell transcriptome analysis
- the paper news network. website
- covid-19/sars-cov-2 resources. website
- intensive care management of coronavirus disease 2019 (covid-19): challenges and recommendations. the lancet respiratory medicine
- zdock server: interactive docking prediction of protein-protein complexes and symmetric multimers
- predicting mortality risk in patients with covid-19 using artificial intelligence to help medical decision-making. medrxiv
- covid-19 vaccine candidates: prediction and validation of 174 sars-cov-2 epitopes. biorxiv
- interpretable deep learning in drug discovery
- covid-19 epidemic analysis using machine learning and deep learning algorithms. medrxiv
- machine learning-based ct radiomics model for predicting hospital stay in patients with pneumonia associated with sars-cov-2 infection: a multicenter study. medrxiv
- fighting against the common enemy of covid-19: a practice of building a community with a shared future for mankind
- personalized workflow to identify optimal t-cell epitopes for peptide-based vaccines against covid-19
- detecting covid-19 in x-ray images with keras, tensorflow, and deep learning. website
- novel corona virus 2019 dataset. website
- images of covid-19 cases. mendeley data
- epitope-based chimeric peptide vaccine design against s, m and e proteins of sars-cov-2, the etiologic agent of global pandemic covid-19: an in silico approach
- the role of artificial intelligence in management of critical covid-19 patients
- machine learning using intrinsic genomic signatures for rapid classification of novel pathogens: covid-19 case study. biorxiv
- identification of covid-19 can be quicker through artificial intelligence framework using a mobile phone-based survey in the populations when cities/towns are under quarantine
- computational immunology meets bioinformatics: the use of prediction tools for molecular binding in the simulation of the immune system
- drones and artificial intelligence to enforce social isolation during covid-19 outbreak
- covid-19 forecasting based on an improved interior search algorithm and multi-layer feed forward neural network
- mathematical modeling of epidemic diseases
- mobilenetv2: inverted residuals and linear bottlenecks
- ai-driven tools for coronavirus outbreak: need of active learning and cross-population train/test models on multitudinal/multimodal data
- the essential facts of wuhan novel coronavirus outbreak in china and epitope-based vaccine designing against 2019-ncov
- a machine learning model reveals older age and delayed hospitalization as predictors of mortality in patients with covid-19. medrxiv
- covid-19 and computer audition: an overview on what speech and sound analysis could contribute in the sars-cov-2 corona crisis
- covid-19 ct segmentation dataset. website
- improved protein structure prediction using potentials from deep learning
- protein structure prediction using multiple deep neural networks in casp13
- detection of coronavirus disease (covid-19) based on deep features
- lung infection quantification of covid-19 in ct images with deep learning
- large-scale screening of covid-19 from community acquired pneumonia using infection size-aware classification
- deep learning-based quantitative computed tomography model in predicting the severity of covid-19: a retrospective study in 196 patients
- self-attention based molecule representation for predicting drug-target interaction
- very deep convolutional networks for large-scale image recognition
- covid-19 database. website
- from a single host to global spread: the global mobility based modelling of the covid-19 pandemic implies higher infection and lower detection rates than current estimates
- shape-based generative modeling for de novo drug design
- deep learning enables accurate diagnosis of novel coronavirus (covid-19) with ct images. medrxiv
- development and evaluation of a deep learning model for protein-ligand binding affinity prediction
- inception-v4, inception-resnet and the impact of residual connections on learning
- going deeper with convolutions
- rethinking the inception architecture for computer vision
- ai-aided design of novel targeted covalent inhibitors against sars-cov-2. biorxiv
- severity assessment of coronavirus disease 2019 (covid-19) using quantitative features from chest ct images
- drug target commons 2.0: a community platform for systematic analysis of drug-target interaction profiles
- predicting the evolution of sars-cov-2 in portugal using an adapted sir model previously used in south korea for the mers outbreak
- breadth of concomitant immune responses prior to patient recovery: a case report of non-severe covid-19
- an investigation of transmission control measures during the first 50 days of the covid-19 epidemic in china
- temporal profiles of viral load in posterior oropharyngeal saliva samples and serum antibody responses during infection by sars-cov-2: an observational cohort study. the lancet infectious diseases
- susceptible-infected-recovered (sir) dynamics of covid-19 and economic impact
- of chloroquine and covid-19
- coronavirus search trends. website
- autodock vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading
- school closure and management practices during coronavirus outbreaks including covid-19: a rapid systematic review. the lancet child and adolescent health
- the immune epitope database 2.0
- atomnet: a deep convolutional neural network for bioactivity prediction in structure-based drug discovery
- unexpected receptor functional mimicry elucidates activation of coronavirus fusion
- structure, function, and antigenicity of the sars-cov-2 spike glycoprotein
- tracking and forecasting milepost moments of the epidemic in the early-outbreak: framework and applications to the covid-19. medrxiv
- covid-net: a tailored deep convolutional neural network design for detection of covid-19 cases from chest radiography images
- remdesivir and chloroquine effectively inhibit the recently emerged novel coronavirus (2019-ncov) in vitro
- structural definition of a neutralization-sensitive epitope on the mers-cov s1-ntd
- a deep learning algorithm using ct images to screen for corona virus disease (covid-19). medrxiv
- hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases
- abnormal respiratory patterns classifier may contribute to large-scale screening of people infected with covid-19 in an accurate and unobtrusive manner
- an integrated in silico immuno-genetic analytical platform provides insights into covid-19 serological and vaccine targets. biorxiv
- who. novel coronavirus 2019 (covid-19). website
- virological assessment of hospitalized patients with covid-2019
- covid-19 coronavirus pandemic. website
- the statistics datasets on holidays and air travel. website
- a new coronavirus associated with human respiratory disease in china
- estimating clinical severity of covid-19 from the transmission dynamics in wuhan, china
- nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study
- rapid and accurate identification of covid-19 infection through machine learning based on clinical available blood test results. medrxiv
- inhibition of sars-cov-2 (previously 2019-ncov) infection by a highly potent pan-coronavirus fusion inhibitor targeting its spike protein that harbors a high capacity to mediate membrane fusion
- epidemiological data from the covid-19 outbreak, real-time case information
- deep learning system to screen coronavirus disease 2019 pneumonia
- characteristics of pediatric sars-cov-2 infection and potential evidence for persistent fecal viral shedding
- a machine learning-based model for survival prediction in patients with severe covid-19 infection. medrxiv
- prediction of criticality in patients with severe covid-19 infection using three clinical features: a machine learning-based prognostic model with clinical data in wuhan. medrxiv
- design of wide-spectrum inhibitors targeting coronavirus main proteases
- clinical characteristics and imaging manifestations of the 2019 novel coronavirus disease (covid-19): a multi-center study in wenzhou city, zhejiang
- flower pollination algorithm for global optimization
- modified seir and ai prediction of the epidemics trend of covid-19 under public health interventions
- a model for covid-19 prediction in iran based on china parameters
- covid-19: review indigenous peoples' data
- the epidemiology, diagnosis and treatment of covid-19
- covid-19 screening on chest x-ray images using deep learning based anomaly detection
- progress and prospects on vaccine development against sars-cov-2. vaccines
- potential covid-2019 3c-like protease inhibitors designed using generative deep learning approaches
- deep learning-based detection for covid-19 from chest ct using weak label. medrxiv
- a pneumonia outbreak associated with a new coronavirus of probable bat origin
- network-based drug repurposing for novel coronavirus 2019-ncov/sars-cov-2
- unet++: a nested u-net architecture for medical image segmentation

key: cord-273429-dl6z8x9h authors: dandekar, r.; rackauckas, c.; barbastathis, g.
title: a machine learning aided global diagnostic and comparative tool to assess effect of quarantine control in covid-19 spread date: 2020-07-24 journal: nan doi: 10.1101/2020.07.23.20160697 sha: doc_id: 273429 cord_uid: dl6z8x9h

we have developed a globally applicable diagnostic covid-19 model by augmenting the classical sir epidemiological model with a neural network module. our model does not rely upon previous epidemics like sars/mers, and all parameters are optimized via machine learning algorithms employed on publicly available covid-19 data. the model decomposes the contributions to the infection time series to analyze and compare the role of quarantine control policies employed in highly affected regions of europe, north america, south america and asia in controlling the spread of the virus. for all continents considered, our results show a generally strong correlation between the strengthening of quarantine controls as learnt by the model and the actions taken by the regions' respective governments. finally, we have hosted our quarantine diagnosis results for the top 70 affected countries worldwide on a public platform, which can be used for informed decision making by public health officials and researchers alike.

the coronavirus respiratory disease 2019, originating from the virus "sars-cov-2", 1, 2 has led to a global pandemic, with 12,552,765 confirmed global cases in more than 200 countries as of july 12, 2020. 3 as the disease began to spread beyond its apparent origin in wuhan, the responses of local and national governments varied considerably. the evolution of infections has been similarly diverse, in some cases appearing to be contained and in others reaching catastrophic proportions. given the observed spatially and temporally diverse government responses and outcomes, the role played by the varying quarantine measures in different countries in shaping the infection growth curve is still not clear. with covid-19 data by country and worldwide now publicly available, there is an urgent need to use data-driven approaches to bridge this gap and to quantitatively estimate and compare the role of the quarantine policy measures implemented in several countries in curtailing the spread of the disease. as of this writing, more than 100 papers have been made available, 9 mostly in preprint form. existing models have one or more of the following limitations:

• lack of independent estimation: using parameters based on prior knowledge of sars/mers coronavirus epidemiology rather than derived independently from covid-19 data, 10 or fixing parameters such as the rate of detection or the nature of the government response before running the model. 11

• lack of global applicability: not implemented on a global scale.
12 • lack of interpretability: using several free/fitting parameters, making the model cumbersome and complicated for policy makers to reproduce and use. 13

in this paper, we propose a globally scalable, interpretable model with completely independent parameter estimation, through a novel approach: augmenting a first-principles-derived epidemiological model with a data-driven module implemented as a neural network. we leverage this model to quantify quarantine strengths and to analyze and compare the role of the quarantine control policies employed to control the virus effective reproduction number [13] [14] [15] [16] [17] [18] [19] in the european, north american, south american and asian continents.

in a classical and commonly used model, known as seir, [20] [21] [22] the population is divided into the susceptible s, exposed e, infected i and recovered r groups, and their relative growths and competition are represented as a set of coupled ordinary differential equations. the simpler sir model does not account for the exposed population e. these models cannot capture the large-scale effects of more granular interactions, such as the population's response to social distancing and quarantine policies. moreover, a major assumption of these models is that the rates of transition between population states are fixed. in our approach, we relax this assumption by estimating the time-dependent quarantine effect on virus exposure with a neural network that informs the infected variable i in the sir model. the trained model thus decomposes the effects, and the neural network encodes information about the quarantine strength function in the locale where the model is trained. in general, neural networks with arbitrary activation functions are universal approximators. [23] [24] [25] unbounded activation functions in particular, such as the rectified linear unit (relu), are known to be effective in approximating nonlinear functions with a finite set of parameters.
[26] [27] [28] thus, a neural network solution is attractive for approximating quarantine effects in combination with analytical epidemiological models. the downside is that the internal workings of a neural network are difficult to interpret. the recently emerging field of scientific machine learning 29 exploits conservation principles within a universal differential equation, 30 sir in our case, to mitigate overfitting and other related machine learning risks. in the present work, the neural network is trained from publicly available infection and population data for covid-19 for a specific region under study; details are in the experimental procedures section. thus, our proposed model is globally applicable and interpretable, with parameters learned from the current covid-19 data, and does not rely upon data from previous epidemics like sars/mers. the classic sir epidemiological model is a standard tool for basic analysis concerning the outbreak of epidemics. in this model, the entire population is divided into three sub-populations: susceptible s; infected i; and recovered r. the sub-populations' evolution is governed by the following system of three coupled nonlinear ordinary differential equations: ds/dt = −β s i/n, di/dt = β s i/n − γ i, dr/dt = γ i. here, β is the infection rate and γ is the recovery rate, and both are assumed to be constant in time. the total population n = s(t) + i(t) + r(t) is seen to remain constant as well; that is, births and deaths are neglected. the recovered population is to be interpreted as those who can no longer infect others; it therefore also includes individuals deceased due to the infection. the possibility of recovered individuals becoming reinfected is accounted for by seis models, 31 but we do not use such a model here, as the reinfection rate for covid-19 survivors is considered to be negligible as of now. the reproduction number r_t in the seir and sir models is defined as r_t = (β/γ) s(t)/n. an important assumption of the sir models is homogeneous mixing among the subpopulations.
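the sir dynamics and reproduction number described above can be sketched numerically. the following is a minimal forward-euler integration; the parameter values (beta, gamma, N, I0) are illustrative only, not fitted values from the study.

```python
def simulate_sir(beta, gamma, N, I0, days, dt=0.1):
    """Forward-Euler integration of the classic SIR system:
    dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I."""
    S, I, R = N - I0, I0, 0.0
    for _ in range(int(days / dt)):
        new_inf = beta * S * I / N   # new infections per unit time
        rec = gamma * I              # recoveries (incl. deceased) per unit time
        S -= new_inf * dt
        I += (new_inf - rec) * dt
        R += rec * dt
    return S, I, R

def reproduction_number(beta, gamma, S, N):
    # R_t = (beta/gamma) * S/N; the epidemic grows while R_t > 1.
    return (beta / gamma) * S / N

# Illustrative run with example values (gamma = 1/8 per day).
S, I, R = simulate_sir(beta=0.3, gamma=0.125, N=1e6, I0=500, days=120)
```

because the three rates cancel term by term, the total s + i + r stays at n throughout the integration, mirroring the neglect of births and deaths in the text.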
therefore, this model cannot account for social distancing or social network effects. additionally, the model assumes uniform susceptibility and disease progress for every individual, and that no spreading occurs through animals or other non-human means. alternatively, the sir model may be interpreted as quantifying the statistical expectations of the respective mean populations, while deviations from the model's assumptions contribute to statistical fluctuations around the mean. to study the effect of quarantine control globally, we start with the sir epidemiological model. figure 1a shows the schematic of the modified sir model, the qsir model, which we consider. we augment the sir model by introducing a time-varying quarantine strength rate term q(t) and a quarantined population t(t), which is prevented from having any further contact with the susceptible population. thus, the term i(t) denotes the infected population still having contact with the susceptibles, as in the standard sir model, while the term t(t) denotes the infected population who are effectively quarantined and isolated. we can thus write an expression for the quarantined infected population t(t) as dt(t)/dt = q(t) i(t) − δ t(t). here, we introduce an additional recovery rate δ which quantifies the rate of recovery of the quarantined population. based on the modified model, we define a covid spread parameter, in a similar way to the reproduction number defined in the sir model (4), as c_p = β / (γ + q(t)). c_p > 1 indicates that infections are being introduced into the population at a higher rate than they are being removed, leading to rapid spread of the disease. on the other hand, c_p < 1 indicates that the covid spread has been brought under control in the region of consideration. since q(t) does not follow from first principles and is highly dependent on local quarantine policies, we devised a neural network-based approach to approximate it.
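putting the quarantine compartment and the spread parameter together, a small numerical sketch follows. the exact placement of the q(t) and δ terms, and the β/(γ + q) form of c_p, are our reading of the text, not verbatim equations from the paper, and the parameter values are made up for illustration.

```python
def covid_spread_parameter(beta, gamma, Q):
    # C_p compares the introduction rate beta against the total removal
    # rate: recovery gamma plus quarantine strength Q (assumed form).
    return beta / (gamma + Q)

def simulate_qsir(beta, gamma, delta, Q, N, I0, T0, days, dt=0.05):
    """QSIR sketch with a constant quarantine strength Q: quarantined
    individuals T no longer contact susceptibles and recover at rate delta."""
    S, I, R, T = N - I0 - T0, I0, 0.0, T0
    for _ in range(int(days / dt)):
        new_inf = beta * S * I / N
        dI = new_inf - gamma * I - Q * I   # quarantining removes contact
        dT = Q * I - delta * T             # isolated pool, recovers at delta
        dR = gamma * I + delta * T
        S -= new_inf * dt
        I += dI * dt
        T += dT * dt
        R += dR * dt
    return S, I, R, T

# With Q large enough, C_p < 1 and the contacting-infected pool shrinks.
cp = covid_spread_parameter(beta=0.3, gamma=0.1, Q=0.5)
S, I, R, T = simulate_qsir(0.3, 0.1, 0.05, 0.5, 1e6, 500, 10, days=30)
```

as in the plain sir sketch, the four rates cancel pairwise, so the total population is conserved while infections are shifted into the quarantined and recovered pools.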
recently, it has been shown that neural networks can be used as function approximators to recover unknown constitutive relationships in a system of coupled ordinary differential equations. 30, 32 following this principle, we represent q(t) as an n layer-deep neural network with weights w_1, w_2, . . ., w_n, activation function r and input vector u = (s(t), i(t), r(t)), i.e. q(t) = w_n r(w_{n−1} r(. . . r(w_1 u))). for the implementation, we choose an n = 2-layer densely connected neural network with 10 units in the hidden layer and the relu activation function. we made this choice because sigmoidal activation functions were found to stagnate during training. the final model was described by 54 tunable parameters. the neural network architecture schematic is shown in figure 1b. the governing coupled ordinary differential equations for the qsir model are then ds/dt = −β s i/n, di/dt = β s i/n − (γ + q(t)) i, dr/dt = γ i + δ t, dt(t)/dt = q(t) i − δ t. more details about the model initialization and parameter estimation methods are given in the experimental procedures section. in all cases considered below, we trained the model using data starting from the dates when the 500th infection was recorded in each region and up to june 1 2020. in each case study, q(t) denotes the rate at which infected persons are effectively quarantined and isolated from the remaining population, and thus gives composite information about (a) the effective testing rate of the infected population as the disease progressed and (b) the intensity of the enforced quarantine as a function of time. to understand the nature of the evolution of q(t), we look at the time point when q(t) approximately shows an inflection point, or a ramp up point. an inflection point in q(t) indicates the time when the rate of increase of q(t), i.e. dq(t)/dt, was at its peak, while a ramp up point corresponds to a sudden intensification of the quarantine policies employed in the region under consideration. we define the quarantine efficiency, q_eff, as the increase in q(t) within a month following the detection of the 500th infected case in the region under consideration.
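the 2-layer network described above can be sketched directly: with 3 inputs, 10 hidden relu units and a scalar output it carries 3·10 + 10 + 10 + 1 = 51 weights and biases, which together with β, γ and δ accounts for the 54 tunable parameters mentioned in the text. the random initialization and the final clamp to non-negative values are our own illustrative choices, not details from the paper.

```python
import random

def relu(x):
    return x if x > 0.0 else 0.0

class QuarantineNet:
    """q(t) = W2 · relu(W1 · u + b1) + b2 with input u = (s(t), i(t), r(t))."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.W1 = [[rng.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(10)]
        self.b1 = [0.0] * 10
        self.W2 = [rng.uniform(-0.1, 0.1) for _ in range(10)]
        self.b2 = 0.0

    def n_params(self):
        # 30 first-layer weights + 10 biases + 10 output weights + 1 bias = 51
        return sum(len(row) for row in self.W1) + len(self.b1) + len(self.W2) + 1

    def __call__(self, s, i, r):
        u = (s, i, r)
        hidden = [relu(sum(w * x for w, x in zip(row, u)) + b)
                  for row, b in zip(self.W1, self.b1)]
        q = sum(w * h for w, h in zip(self.W2, hidden)) + self.b2
        return max(q, 0.0)  # quarantine strength assumed non-negative

net = QuarantineNet()
```

the 51 network parameters plus the three epidemiological rates β, γ and δ give exactly the 54 tunable parameters quoted above.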
thus the magnitude of q_eff shows how rapidly the infected individuals were prevented from coming into contact with the susceptibles in the first month following the detection of the 500th infected case; it thus contains composite information about the quarantine and lockdown strength, and about the testing and tracing protocols used to identify and isolate infected individuals. figure 2 shows the comparison of the model-estimated infected and recovered case counts with actual covid-19 data for the highest affected european countries as of 1 june 2020, namely: russia, uk, spain and italy, in that order. we find that, despite the small set of optimized parameters (note that the contact rate β and the recovery rate γ are fixed, and not functions of time), a reasonably good match is seen in all four cases. recovery rates are assumed to be constant in our model over the duration spanning the detection of the 500th infected case and june 1st, 2020. the average contact rate in spain and italy is seen to be higher than in russia and the uk over the considered duration of 2−3 months, possibly because russia and the uk were affected relatively late by the virus, which gave sufficient time for the enforcement of strict social distancing protocols prior to a widespread outbreak. for spain and italy, the quarantine efficiency and also the recovery rate are generally higher than for russia and the uk, possibly indicating more efficient testing, isolation, quarantine and hospital practices in spain and italy. this agrees well with the ineffectiveness of the testing, contact tracing and quarantine practices seen in the uk. 35 although the social distancing strength also varied with time, we do not focus on that aspect in the present study; it will be the subject of future studies. a higher quarantine efficiency combined with a higher recovery rate led spain and italy to bring down the covid spread parameter (defined in (6)), c_p, from > 1 to < 1 in 16 and 25 days,
respectively, as compared to 32 days for the uk and 42 days for russia (figure 4). figure 5 shows q_eff for the 23 highest affected european countries. we can see that q_eff in the western european regions is generally higher than in eastern europe. this can be attributed to the strong lockdown measures implemented in western countries like spain, italy, germany and france after the rise of infections seen first in italy and spain. 36 although countries like switzerland and turkey didn't enforce lockdowns as strict as those of their west european counterparts, they were generally successful in halting the infection count before it reached catastrophic proportions, due to strong testing and tracing protocols. 37, 38 subsequently, these countries also managed to identify potentially infected individuals and prevented them from coming into contact with susceptibles, giving them a high q_eff score as seen in figure 5. in contrast, our study also manages to identify countries like sweden, which had very limited lockdown measures, 39 with a low q_eff score as seen in figure 5. this strengthens the validity of our model in diagnosing information about the effectiveness of quarantine and isolation protocols in different countries, which agrees well with the actual protocols seen in these countries. figure 6 shows a reasonably good match between the model-estimated infected and recovered case counts and actual covid-19 data for the highest affected north american states (including states from mexico, the united states, and canada) as of 1 june 2020, namely: new york, new jersey, illinois and california. q(t) for new york and new jersey shows a ramp up point immediately in the week following the detection of the 500th case in these regions, i.e. on 19 march for new york and on 24 march for new jersey (figure 7). this matches well with the actual dates: 22 march in new york and 21 march in new jersey, when stay at home orders and isolation measures were enforced in these states.
a relatively slower rise of q(t) is seen for illinois, while california shows a ramp up about a week after the detection of the 500th case. although no significant difference is seen in the mean contact and recovery rates between the different us states, the quarantine efficiency in new york and new jersey is seen to be significantly higher than that of illinois and california (figure 16b), indicating the effectiveness of the rapidly deployed quarantine interventions in new york and new jersey. 40 owing to the high quarantine efficiency in new york and new jersey, these states were able to bring down the covid spread parameter, c_p, to less than 1 in 19 days (figure 8). on the other hand, although illinois and california came close to c_p = 1 after the 30 day and 20 day mark respectively, c_p still remained greater than 1 (figure 8), indicating that these states were still in the danger zone as of june 1, 2020. an important caveat to this result is the reporting of the recovered data. compared with europe, the recovery rates seen in north america are significantly lower (figures 16a,b). it should be noted that accurate reporting of recovery rates is likely to play a major role in this apparent difference. in our study, the recovered data include individuals who cannot further transmit infection; they thus include treated patients who are currently in a healthy state and also individuals who died due to the virus. since the quantification of deaths can be done in a robust manner, the death data are generally reported more accurately. however, there is no clear definition for quantifying the number of people who transitioned from infected to healthy. as a result, accurate and timely reporting of the recovered data shows significant variation between countries, with under-reporting of the recovered data being a common practice.
since the effective reproduction number calculation depends on the recovered case count, accurate data regarding the recovered count are vital to assess whether the infection has been curtailed in a particular region or not. thus, our results strongly indicate the need for each country to follow a consistent metric for estimating the recovered count robustly, which is vital for data-driven assessment of the pandemic spread. figure 9a shows the quarantine efficiency for 20 major us states spanning the whole country. figure 9b shows the comparison between a report published in the wall street journal on may 21 highlighting usa states based on their lockdown conditions, 41 and the quarantine efficiency magnitude in our study. the size of the circles represents the magnitude of the quarantine efficiency. the blue color indicates the states for which the quarantine efficiency was greater than the mean quarantine efficiency across all us states, while those in red indicate the opposite. our results indicate that the north-eastern and western states were much more responsive in implementing rapid quarantine measures in the month following early detection, as compared to the southern and central states. this matches the on-ground situation: a generally strong correlation is seen between the red circles in our study (states with lower quarantine efficiency) and the yellow regions in the wall street journal report 41 (states with reduced imposition of restrictions), and between the blue circles in our study (states with higher quarantine efficiency) and the blue regions in the wall street journal report 41 (states with a generally higher level of restrictions). this strengthens the validity of our approach, in which the quarantine efficiency is recovered through a trained neural network rooted in fundamental epidemiological equations.
figure 10 shows a reasonably good match between the model-estimated infected and recovered case counts and actual covid-19 data for the highest affected asian countries as of 1 june 2020, namely: india, china and south korea. q(t) shows a rapid ramp up in china and south korea (figure 11), which agrees well with the cusps in government interventions which took place in the weeks leading to and after the end of january 4 and february 42 for china and south korea, respectively. on the other hand, a slow build up of q(t) is seen for india, with no significant ramp up. this is reflected in the quarantine efficiency comparison (figure 16c), which is much higher for china and south korea compared to india. south korea shows a significantly lower contact rate than its asian counterparts, indicating strongly enforced and followed social distancing protocols. 43 no significant difference in the recovery rate is observed between the asian countries. owing to the high quarantine efficiency in china, and a high quarantine efficiency coupled with strongly enforced social distancing in south korea, these countries were able to bring down the covid spread parameter c_p from > 1 to < 1 in 21 and 13 days respectively, while it took 33 days in india (figure 12). figure 13 shows a reasonably good match between the model-estimated infected and recovered case counts and actual covid-19 data for the highest affected south american countries as of 1 june 2020, namely: brazil, chile and peru. for brazil, q(t) is seen to be approximately constant at ≈ 0 initially, with a ramp up around the 20 day mark, after which q(t) is seen to stagnate (figure 14a). the key difference in the covid progression in brazil compared to other nations is that the infected and the recovered (recovered healthy + dead in our study) counts are very close to one another as the disease progressed (figure 13).
owing to this, as the disease progressed, the new infected people introduced into the population were balanced by the infected people removed from the population, either by becoming healthy or by being deceased. this higher recovery rate, combined with a generally low quarantine efficiency and contact rate (figure 16d), manifests itself in the covid spread parameter for brazil being < 1 for almost the entire duration of the disease progression (figure 15a). for chile, q(t) is almost constant for the entire duration considered (figure 14b). thus, although government regulations were imposed swiftly following the initial detection of the virus, leading to a high initial magnitude of q(t), the government imposition became subsequently relaxed. this may be attributed to several political and social factors outside the scope of the present study. 44 for chile as well, the infected and recovered counts remain close to each other compared to other nations. a generally high quarantine magnitude coupled with a moderate recovery rate (figure 16d) leads to c_p being < 1 for the entire duration of disease progression (figure 15b). in peru, q(t) shows a very slow build up (figure 14c) with a very low magnitude. also, the recovered count is lower than the infected count compared to its south american counterparts (figure 13c). a low quarantine efficiency coupled with a low recovery rate (figure 16d) leaves peru in the danger zone (c_p > 1) for 48 days post detection of the 500th case (figure 15c). our model captures the infected and recovered counts for highly affected countries in europe, north america, asia and south america reasonably well, and is thus globally applicable.
along with capturing the evolution of infected and recovered data, the novel machine learning-aided epidemiological approach allows us to extract valuable information regarding the quarantine policies, the evolution of the covid spread parameter c_p, the mean contact rate (social distancing effectiveness), and the recovery rate. it thus becomes possible to compare across different countries, with the model serving as an important diagnostic tool. our results show a generally strong correlation between the strengthening of the quarantine controls, i.e. increasing q(t) as learnt by the neural network model; the actions taken by the regions' respective governments; and the decrease of the covid spread parameter c_p for all continents considered in the present study. based on the covid-19 data collected (details in the materials and methods section), we note that accurate and timely reporting of recovered data shows significant variation between countries, with under-reporting of the recovered data being a common practice. in the north american countries, for example, the recovered counts are significantly lower than in their european and asian counterparts. thus, our results strongly indicate the need for each country to follow a consistent metric for estimating the recovered count robustly, which is vital for data-driven assessment of the pandemic spread. the key highlights of our model are: (a) it is highly interpretable with few free parameters rooted in an epidemiological model, (b) it relies only on covid-19 data and not on previous epidemics and (c) it is highly flexible and adaptable to different compartmental modelling assumptions. in particular, our method can be readily extended to more complex compartmental models including hospitalization rates, testing rates and the distinction between symptomatic and asymptomatic individuals.
thus, the methodology presented in the study can be readily adapted to any province, state or country globally, making it a potentially useful tool for policy makers in the event of future outbreaks or a relapse in the current one. finally, we have hosted our quarantine diagnosis results for the top 70 affected countries worldwide on a public platform (https://covid19ml.org/ or https://rajdandekar.github.io/covid-quarantinestrength/), which can be used for informed decision making by public health officials and researchers alike. we believe that such a publicly available global tool will be of significant value for researchers who want to study the correlation between the quarantine strength evolution in a particular region and a wide range of metrics spanning from the mortality rate to the socio-economic impact of covid-19 in that region. currently, our model lacks forecasting abilities. in order to do robust forecasting based on the prior data available, the model needs to be further augmented through coupling with real-time metrics parameterizing social distancing, e.g. the publicly available apple mobility data. 45 this could be the subject of future studies. the starting point t = 0 for each simulation was the day at which 500 infected cases were crossed, i.e. i_0 ≈ 500. the number of susceptible individuals was assumed to be equal to the population of the considered region. also, in all simulations, the number of recovered individuals was initialized from data at t = 0 as defined above. the quarantined population t(t) is initialized to a small number, t(t = 0) ≈ 10. the time-resolved data for the infected, i_data, and recovered, r_data, for each locale considered are obtained from the center for systems science and engineering (csse) at johns hopkins university.
the neural network-augmented sir ode system was trained by minimizing the mean square error loss function l_nn(w, β, γ, δ) = Σ_t [log(i(t) + t(t)) − log(i_data(t))]² + [log(r(t)) − log(r_data(t))]², which includes the neural network's weights w. for most of the regions under consideration, w, β, γ, δ were optimized by minimizing the loss function given in (13). minimization was performed using local adjoint sensitivity analysis 32, 46 following a procedure similar to that outlined in a recent study, 30 with the adam optimizer 47 and a learning rate of 0.01. the number of iterations required for convergence varied with the region considered and generally ranged from 40,000−100,000. for regions with a low recovered count: all us states and the uk, we employed a two-stage optimization procedure to find the optimal w, β, γ, δ. in the first stage, (13) was minimized. for the second stage, we fix the optimal γ, δ found in the first stage and optimize the remaining parameters w, β based on a loss function defined only on the infected count, l(w, β) = Σ_t [log(i(t) + t(t)) − log(i_data(t))]². in the second stage, we don't include the recovered count r(t) in the loss function, since r(t) depends on γ, δ, which have already been optimized in the first stage. by placing more emphasis on minimizing the infected count, such a two-stage procedure leads to much more accurate model estimates when the recovered data count is low. the number of iterations required for convergence in both stages varied with the region considered and generally ranged from 30,000−100,000. preliminary versions of this work can be found at medrxiv 2020.04.03.20052084 and arxiv:2004.02752. data for the infected and recovered case counts in all regions were obtained from the center for systems science and engineering (csse) at johns hopkins university. all code files are available at https://github.com/rajdandekar/mit-global-covid-modelling-project-1.
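the two loss functions above can be written compactly as follows; the summation over daily data points is implied by the text, and the two-stage variant simply drops the recovered term.

```python
import math

def qsir_loss(I_model, T_model, R_model, I_data, R_data, infected_only=False):
    """Log-scale squared error between model trajectories and data.
    infected_only=True corresponds to the second stage of the two-stage
    procedure used for regions with low recovered counts."""
    loss = 0.0
    for i_m, t_m, i_d in zip(I_model, T_model, I_data):
        # the model's observable infected count is I(t) + T(t)
        loss += (math.log(i_m + t_m) - math.log(i_d)) ** 2
    if not infected_only:
        for r_m, r_d in zip(R_model, R_data):
            loss += (math.log(r_m) - math.log(r_d)) ** 2
    return loss
```

the log transform keeps early (small-count) and late (large-count) residuals on a comparable scale, which is presumably why the loss is defined on logarithms rather than raw counts.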
all results are publicly hosted at https://covid19ml.org/ or https://rajdandekar.github.io/covid-quarantinestrength/.
a familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster
coronavirus disease 2019 (covid-19) situation summary
coronavirus disease 2019 (covid-19) situation report - 174
what china's coronavirus response can teach the rest of the world
whose coronavirus strategy worked best? scientists hunt most effective policies
first case of 2019 novel coronavirus in the united states
hidden outbreaks spread through u.s. cities far earlier than americans knew, estimates say
coronavirus in latin america: what governments are doing to stop the spread
an aggregated dataset of clinical outcomes for covid-19 patients
the effect of travel restrictions on the spread of the 2019 novel coronavirus
forecasting covid-19 and analyzing the effect of government interventions
the effect of human mobility and control measures on the covid-19 epidemic in china
impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
novel coronavirus 2019-ncov: early estimation of epidemiological parameters and epidemic predictions
estimation of the transmission risk of the 2019-ncov and its implication for public health interventions
early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia
nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study
early dynamics of transmission and control of covid-19: a mathematical modelling study. the lancet infectious diseases
modelling the sars epidemic by a lattice-based monte-carlo simulation
extension and verification of the seir model on the 2009 influenza a (h1n1) pandemic in japan
forecasting epidemics through nonparametric estimation of time-dependent transmission rates using the seir model
approximations by superpositions of sigmoidal functions
approximation capabilities of multilayer feedforward networks
neural network with unbounded activation functions is universal approximator
deep sparse rectifier neural networks
maxout networks. 30th int. conf. mach. learn.
improving deep neural networks for lvcsr using rectified linear units and dropout
workshop report on basic research needs for scientific machine learning: core technologies for artificial intelligence
universal differential equations for scientific machine learning
analysis of a spatially extended nonlinear seis epidemic model with distinct incidence for exposed and infectives
diffeqflux.jl - a julia library for neural differential equations
spain orders nationwide lockdown to battle coronavirus. the guardian
italy extends coronavirus lockdown to entire country, imposing restrictions on 60 million people
how did britain get its coronavirus response so wrong? guardian
coronavirus: what are the lockdown measures across europe
what switzerland did right in the battle against coronavirus. marketwatch
(2020) coronavirus: how turkey took control of covid-19 emergency
sweden has become the world's cautionary tale
these states have some of the most drastic restrictions to combat the spread of coronavirus
a guide to state coronavirus reopenings and lockdowns
coronavirus cases have dropped sharply in south korea. what's the secret to its success
what's behind south korea's covid-19 exceptionalism
politics and poverty hinder covid-19 response in latin america
(2020) mobility trend report
adjoint sensitivity analysis for differential-algebraic equations: the adjoint dae system and its numerical solution
adam: a method for stochastic optimization
this effort was partially funded by the intelligence advanced research projects activity (iarpa). we are grateful to emma wang for help with some of the simulations, and to haluk akay, hyungseok kim and wujie wang for helpful discussions and suggestions. the authors declare no conflicts of interest. key: cord-126012-h7er0prc authors: diaz, victor hugo grisales; prado-rubio, oscar andres; willis, mark j. title: covid-19: forecasting mortality given mobility trend data and non-pharmaceutical interventions date: 2020-09-25 journal: nan doi: nan sha: doc_id: 126012 cord_uid: h7er0prc abstract: we develop a novel hybrid epidemiological model and a specific methodology for its calibration to distinguish and assess the impact of mobility restrictions (given by apple's mobility trends data) from other complementary non-pharmaceutical interventions (npis) used to control the spread of covid-19. using the calibrated model, we estimate that mobility restrictions contribute 47% (us states) and 47% (worldwide) of the overall suppression of the disease transmission rate, using data up to 13/08/2020. the forecast capacity of our model was evaluated by making four-week-ahead predictions. using data up to 30/06/20 for calibration, the mean absolute percentage error (mape) of the prediction of cumulative deceased individuals was 5.0% for the united states (51 states) and 6.7% worldwide (49 countries). this mape was reduced to 3.5% for the us and 3.8% worldwide using data up to 13/08/2020. we find that the mape was higher for the total confirmed cases, at 11.5% worldwide and 10.2% for the us states, using data up to 13/08/2020.
our calibrated model achieves an average r-squared value for cumulative confirmed and deceased cases of 0.992 using data up to 30/06/20 and 0.98 using data up to 13/08/20. non-pharmaceutical interventions (npis) issued by governments are successfully controlling covid-19 virus transmission rates. for example, it has been suggested that npis have prevented or delayed 61 million new confirmed cases in china, south korea, italy, iran, france, and the united states [1]. unfortunately, these policies are easing in several countries around the world and, consequently, global infected cases of covid-19 are still drastically increasing, with the number of daily new infected cases at more than two hundred thousand (20/09/2020), see [2]. as a result, new npis are constantly being issued by governments around the world to reduce rates of contagion. the introduction of any new npi needs to be cost-effective, as npis have also shown high social and economic costs, such as unemployment. to try to ensure that the most cost-effective npis are introduced, mathematical models are fundamental, as they can be used to understand and measure the impact of npis [3]. furthermore, mathematical models have been used to estimate the effective reproduction number, which is reduced through the application of npis [4]. this is an essential metric used by policymakers and epidemiologists, as exponential growth in active cases is still observed if this number is higher than one in any given period. for this reason, in [5, 6] we developed a mathematical model that estimates the effective reproduction number as a function of npis and time. this contribution aims to extend our previous work to estimate the dynamics of mortality as well as to calculate the cumulative active cases for many countries around the world.
at the same time, we evaluate the effectiveness of restrictions on mobility (i.e., walking, driving and transport) in reducing the disease transmission rate and hence controlling the cumulative numbers of infected and deceased individuals. to do this, we use mobility trends data provided by apple [7]. this is important, as mobility restrictions have been identified as the most significant of all npis [8, 9]. for example, it has been estimated that lockdowns in the us would save 1.7 million lives by october 2020, with a monetary mortality benefit of 8 trillion usd [10]. in addition, country-wide lockdowns have been found to reduce disease transmission rates by 75-87% in 11 european countries [8]. similarly, school closure, quarantine, and distance working have been estimated to have reduced the number of contagious individuals by 78.2%-99.3% [11]. however, there are also reports of lower effectiveness (a 47% reduction in contagion rates) being associated with mobility [9], which demonstrates the wide confidence bounds associated with the evaluation of the impact of this npi. in this work, a new compartmental model is proposed and calibrated to distinguish and measure the impact of mobility restrictions on the rate of disease suppression. in addition, we use the calibrated model to forecast, up to four weeks ahead, the number of cumulative infected cases and the resulting mortality rates. here, we demonstrate (to the best of our knowledge, for the first time for a large number of cases) that including both mobility trend data and npis is necessary to capture the dynamics and simultaneously forecast mortality accurately. as we use a hybrid compartmental model, the mathematical model developed here has the additional advantage that it can also be used to calculate the effective reproduction number at any particular time and for any country.
to generate our results, we use data from 49 countries and 51 u.s. states available from [12] and [2], respectively. in this contribution, our previous model [5] is extended to predict mortality and to include a term that estimates the reduction in the contagion rate given reported mobility data. due to these modifications, the proposed model is referred to as sird-mc (mobility-control). to develop the sird-mc model, individuals are compartmentalised into the number of susceptible (s(t)) relative to the total population (n), the number of infected (i(t)), the number of removed (r(t)), and the number of deceased (d(t)). the compartment r(t) is the sum of recovered and deceased individuals. we also include a term for patients that have symptoms that can lead to death (p(t)). this compartment is proposed to deal with different kinds of infected individuals, e.g. asymptomatic and symptomatic, as it has been shown that this is necessary in order to analyse the rates of contagion [13]. our sird-mc model consists of a set of ordinary differential equations for these compartments. in these equations, the removal rate of reported infectious individuals (or recovery rate), γ (day−1), is assumed constant and equal to 1/8 day−1, as this has been shown to be reasonable and effective [5, 9]. the transmission rate of the virus (β(t)) is assumed to be dependent on npis (this is described in detail in the next section). the model also includes an under-reporting parameter for removed individuals [5], and we assume that the rate constant associated with p(t) is proportional to the removal rate, with an estimated proportionality constant between zero and one. as only the compartment p(t) within our model can lead to death, the deceased individuals, d(t), are considered to be dependent on p(t). the total deceased individuals are modelled assuming a constant fatality rate (day−1). finally, the total confirmed cumulative cases (c(t)) are modelled by summing the infected and removed cases.
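a hedged sketch of how the compartments described above could be coupled follows. the routing of infected individuals into the symptomatic pool p(t) (a constant fraction c of the removal rate) and the fatality-rate coupling mu are our assumptions for illustration, not the paper's verbatim equations, and the parameter values are made up.

```python
def simulate_sird_mc(beta_of_t, gamma, c, mu, N, I0, days, dt=0.05):
    """Sketch of the SIRD-MC compartments: S susceptible, I infected,
    R removed, P symptomatic (can lead to death), D deceased.
    gamma = 1/8 per day as in the text; c (fraction routed into P) and
    mu (fatality rate) are illustrative assumed parameters."""
    S, I, R, P, D = N - I0, I0, 0.0, 0.0, 0.0
    for step in range(int(days / dt)):
        t = step * dt
        b = beta_of_t(t)            # NPI-dependent transmission rate
        new_inf = b * S * I / N
        rem = gamma * I             # removals (recovered + deceased)
        dS = -new_inf
        dI = new_inf - rem
        dR = rem
        dP = c * rem - mu * P       # symptomatic pool, assumed form
        dD = mu * P                 # deaths arise only from P
        S += dS * dt; I += dI * dt; R += dR * dt
        P += dP * dt; D += dD * dt
    C = I + R                       # total confirmed cumulative cases
    return S, I, R, P, D, C

# Constant transmission rate for illustration; gamma = 1/8 per day.
S, I, R, P, D, C = simulate_sird_mc(lambda t: 0.3, 1 / 8, 0.1, 0.05, 1e6, 100, 60)
```

the point of the sketch is the structure: the transmission rate enters as a function of time (so the npi model of the next section can drive it), and deaths accumulate only through the symptomatic pool.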
in our previous work, we introduced a time-varying term that accounts for a reduction in the virus transmission rate constant, using a first-order differential equation employing all npis as input [5] . we assumed a control signal that was a boolean variable, i.e. it could take the value of zero (npis off) or one (npis active), and the time-varying disease transmission rate was assessed through determination of the model parameters associated with the ode model. in this work, we modify this expression so that the rate of reduction of the disease transmission rate, assumed zero at the initial time, is driven by two terms: one associated with mobility restrictions and one representing all other npis, such as the use of masks, biosecurity protocols, or closure of schools. by including these two terms in the model, we can capture the effects of disease suppression with and without restrictions on mobility. given the estimated numerical values of these two model parameters, we can estimate the contribution of mobility restrictions to the overall reduction in the disease transmission rate as the ratio of the average mobility gain to the sum of the two average gains, where each average is taken over all us states or countries evaluated. mobility trends reports provided by apple were used as inputs to the model as the data is easily accessible, updated and conveniently normalised [7] . mobility trends for 49 countries and 51 u.s. states were averaged for each day using all reported mobility data (i.e. walking, driving, transport), converting the data into a representative number per day. first, the data is normalised to have values between zero (no mobility restrictions) and one (maximum mobility restriction). subsequently, we transform this data into a mobility signal [14, 15] . to validate our model, we made predictions up to four weeks ahead of the cumulative deceased individuals and number of active cases. 
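the normalisation step described above can be sketched as follows. the baseline of 100 mimics the apple-style reporting convention, and scaling by the peak observed reduction is an illustrative assumption, not necessarily the exact procedure used in the paper:

```python
# hypothetical normalisation of an apple-style mobility series: raw values are
# percentages relative to a baseline of 100, and the output maps 0 = no
# mobility restriction, 1 = maximum observed restriction, as in the text.
def normalise_mobility(raw, baseline=100.0):
    # fractional reduction below baseline, clipped at zero
    reduction = [max(0.0, (baseline - v) / baseline) for v in raw]
    peak = max(reduction) or 1.0          # avoid division by zero
    return [r / peak for r in reduction]

signal = normalise_mobility([100, 90, 60, 40, 55, 80])
```

the resulting signal then feeds the transmission-rate ode as its mobility input.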
for forecasting of the total cases, the initial conditions of the odes were specified as the values of the model compartments at a time t2, with the initial conditions for recovered and deceased individuals set accordingly, where t2 is 30 jun 2020, as we used data for model calibration up to this date. to compare the predictive capacity of our model in relation to the amount of data, we also use data up to 13 aug 2020 as t2 for model calibration. to estimate the confidence bounds in the predictions, we used the same initial conditions described above. the median absolute percentage error (mdape) and the mean absolute percentage error (mape) were calculated to report the predictive capability of our models. usually, the mape is estimated using an absolute percentage error (ape) calculated with the mean value of the prediction as reference; however, when calculated in this way, the mape is known not to be robust against outliers [16] . here, as an outlier-resistant measure, we estimated the ape three times for each experimental point using the three simulated values of a prediction (the mean value of the prediction and its respective maximum and minimum bounds). the lowest ape of the three is then selected for each experimental point to estimate the mape. by doing this, we can calculate the most likely mape while avoiding potential errors due to outliers. to perform model calibration and the subsequent validation, the first step is to tune the mobility model, which defines the sird-mc model input signal. this model is also required to perform the forecasting of mobility for model validation. the mobility restriction model consists of a linear representation where the slope of the curve can potentially change over five windows of time (see appendix a). as an example, in figure 1 we show the calibrated model and data up to 29 jun 2020 (blue) and forecasting of future mobility (data in black and model in light blue) up to four weeks ahead for brazil, italy, australia, and colombia. 
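the outlier-resistant mape described above can be sketched directly: for each observed point, the ape is computed against the mean prediction and its two bounds, and the smallest of the three is kept:

```python
# outlier-resistant mape as described in the text: for each observation,
# compute the absolute percentage error against the mean prediction and its
# lower and upper bounds, and keep the smallest of the three.
def robust_mape(observed, pred_mean, pred_lo, pred_hi):
    apes = []
    for y, m, lo, hi in zip(observed, pred_mean, pred_lo, pred_hi):
        candidates = [abs(y - p) / abs(y) for p in (m, lo, hi)]
        apes.append(min(candidates))
    return 100.0 * sum(apes) / len(apes)

err = robust_mape([100.0, 200.0], [110.0, 190.0], [95.0, 180.0], [120.0, 210.0])
```

for the toy numbers above, the per-point minima are 5% and 5%, so the robust mape is 5%.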
a reduction of the predictive capacity of the model is observed when new and strong mobility restrictions are introduced. as an example, new lockdown rules in melbourne (australia) were issued to stop the spread of the virus, and the resulting trend is beyond the predictive power of our model (see figure 1 ). for this reason, we limited our study to short-term predictions of up to four weeks. the average predictive power of the mobility model in all cases studied is shown in figure 2 . although the raw data is still highly noisy, the mean absolute error of the model was lower than 0.06. moreover, it may be observed that the forecasting error is lower than the calibration error, as 37 of the 49 countries studied do not have mobility restrictions. this is important for parameter identification, as the mobility signal covers the whole spectrum of mobility, i.e. the effect of mobility on the reduction of the contagion rates is observed in the data. although the mobility tendencies provided by apple (i.e. walking, driving, transport) are back to normal for most of the countries around the globe, some countries still show a reduction in mobility (22/07/2020), such as south korea, brazil, chile, argentina, south africa and australia, amongst others. disease suppression with and without mobility restrictions. considering the countries studied in this work and data up to 30 jun 2020, the estimated gains of the ode (7) were evaluated. in the 49 countries studied, the gain associated with mobility restrictions was on average 0.183 ± 0.054 and the gain associated with all other npis was 0.2050 ± 0.03 (see table 1 ), i.e. worldwide mobility restrictions contribute 47±14% of the overall reduction in the disease transmission rate of covid-19, whereas for the us states, mobility restrictions account for 54±14% of the overall reduction in the disease transmission rate. 
when data up to 13/08/2020 was used for calibration, our models indicate that mobility restrictions account for 47 ± 10% (us states) and 47 ± 12% (worldwide) of the overall reduction in the disease transmission rate. the estimated model parameters for each country and the us states are reported in appendix c. in table 1 we show all the model parameters, averaged over all countries and the us states. note that the time constant represents the time necessary to achieve 63.2% of the total change in contagion rates. worldwide, the time constant is 28.58 ± 4.16 days (data up to 30/06/2020). the parameter with the highest variation was the death rate for covid-19 patients with symptoms, as a small reduction in the associated rate parameter can lead to a large reduction in the death rate, i.e. it has an exponential effect on the number of patients with symptoms. the initial or basic reproductive number of the epidemic was found to be on average 3.0 ± 0.2 (worldwide) and 3.2 ± 0.17 (us states). note the narrow confidence bounds of our average basic reproduction number. in comparison, the average basic reproduction number for 11 european countries was estimated by [8] as 3.8. we estimated that the initial reproduction number (~3) was drastically reduced due to npis. however, most of the countries and us states up to 13 aug 2020 have shown an effective reproductive number higher than one, see fig. 3 . as confinement policies were found to be significant in controlling the effective reproduction number in most of the countries, the need for social distancing in countries with high transmission rates is evident in order to minimise the number of deaths. as expected, the average mape for us states and worldwide increases with the number of days ahead (see figure 5 and figure 6 ). in all cases considered, the mape for two weeks ahead was 2.4 ± 0.8% for cumulative deceased individuals ( figure 5 ) and 3.9 ± 0.9% ( figure 6 ) for cumulative confirmed cases. 
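two small computations above can be made concrete: the first-order meaning of the time constant (63.2% of the total change after one tau), and the mobility share obtained from the two gains. the gain values below are the reported worldwide averages; treating the share as a simple ratio of gains is the interpretation sketched earlier, not a quotation of the paper's exact formula:

```python
import math

# first-order response: after one time constant tau, the transmission-rate
# reduction reaches 1 - exp(-1), about 63.2% of its final value.
def reduction_fraction(t, tau):
    return 1.0 - math.exp(-t / tau)

# share of the overall reduction attributable to mobility, as a ratio of the
# mobility gain to the sum of both gains (illustrative interpretation).
def mobility_share(k_mob, k_npi):
    return k_mob / (k_mob + k_npi)

share = mobility_share(0.183, 0.205)   # reported worldwide average gains
```

with the reported averages, the ratio lands at roughly 0.47, matching the 47% figure quoted in the text.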
when the number of weeks ahead was increased from two to four, the mean mape increased by 2.5- and 3.5-fold for deceased individuals and confirmed cases, respectively. the mape for cumulative deaths was lower than 10% in 41 u.s. states and 39 countries. hence, for the forecasting of cumulative deaths, our model has an 80% probability of producing four-week-ahead predictions with a mean mape lower than 10%, whereas for confirmed cases this probability is 62%. using all data (calibration and validation), our model had an r-squared value of 0.992 (calibration data up to 30/06/2020) and 0.98 (calibration data up to 13/08/2020). a comparison with other models is given in table 2 ; these models were evaluated using 156 and 73 countries, respectively. *the mdape was estimated and reported by ihme [17] ; as [17] estimates mdape instead of mape, we also calculate this figure for comparative purposes. in this work, the mdape of sird-mc for all states was 5.9% (51 u.s. states and 49 countries worldwide). to estimate the predictive capacity once more data is gathered, we also performed forecasting using a more recently updated dataset (calibration using data up to 13/08/2020, see table 2 ). as expected given the increased amount of data, the predictive capacity of our model increases with time, as the mdape for deceased individuals and for confirmed cumulative cases reduces from 5.9% to 2.8% and from 13.6% to 11.2%, respectively. mobility restrictions imposed by governments are controversial as individual freedom is compromised for the greater good. lockdowns slow down the spread of the virus; this is a fact. however, the impact of mobility restrictions on the control of the spread of covid-19 reported in the literature varies over a wide range, between 47 and 99% [9] . here, using data up to 30 jun 2020 and our model (sird-mc), we demonstrated that mobility reductions could explain ~47% and ~54% of the reduction in the disease transmission rates in 49 countries and the us states, respectively. 
these results are similar to those obtained by [7] , where data from 26 countries was used to calibrate a hybrid model combining the susceptible-infected-recovered (sir) differential equations with gradient boosted trees (gbt). an augmented version of the seir compartmental model that uses apple's mobility trend data was proposed by [20] to estimate mortality and hospitalisations. in their work, they demonstrate that mobility data provides a leading indicator of contagion rates in eight us states. however, their model assumed that the change in the value of the effective reproduction number depended only on the reduction of mobility. in our work, we demonstrate that this is not the case, as other npis were found to reduce the spread of the virus by approximately 46% and 53%. to validate the predictive capacity of our model we considered the forecasting of cumulative active cases as well as the number of deceased individuals up to four weeks ahead, and provide several statistical indicators of the forecasting accuracy such as mape, mdape and r-squared. we found that our model has a lower mdape than the other models studied by [17] . at the same time, our calibrated model was able to capture the disease suppression dynamics with a high r-squared of 0.992 (calibration data up to 30/06/2020). hence, we believe that the modelling approach suggested in this paper to estimate mortality will be a useful tool that can assist decision-making processes. the authors declare that they have no competing interests. 
- the effect of large-scale anti-contagion policies on the covid-19 pandemic
- individual model forecasts can be misleading, but together they are useful
- the reproductive number of covid-19 is higher compared to sars coronavirus
- insights into the dynamics and control of covid-19 infection rates
- covid-19: mechanistic model calibration subject to active and varying non-pharmaceutical interventions
- estimating the effects of non-pharmaceutical interventions on covid-19 in europe
- no place like home: cross-national data analysis of the efficacy of social distancing during the covid-19 pandemic
- does social distancing matter?
- interventions to mitigate early spread of sars-cov-2 in singapore: a modelling study
- the introduction of population migration to seiar for covid-19 epidemic modeling with an efficient intervention strategy
- kinetic models in industrial biotechnology - improving cell factory performance
- forecasting efforts from prior epidemics and covid-19 predictions
- forecast accuracy in demand planning: a fast-moving consumer goods case study
- * ihme covid-19 model comparison team, predictive performance of international covid-19 mortality forecasting models
- covid-19 projections using machine learning
- los alamos national laboratory covid-19 confirmed and forecasted case data
- mobility trends provide a leading indicator of changes in sars-cov-2 transmission

this research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. key: cord-250288-obsl0nbf authors: yan, bingjie; tang, xiangyan; liu, boyi; wang, jun; zhou, yize; zheng, guopeng; zou, qi; lu, yao; tu, wenxuan title: an improved method for the fitting and prediction of the number of covid-19 confirmed cases based on lstm date: 2020-05-05 journal: nan doi: nan sha: doc_id: 250288 cord_uid: obsl0nbf new coronavirus disease (covid-19) has constituted a global pandemic and has spread to most countries and regions in the world. 
by understanding the development trend of a regional epidemic, the epidemic can be controlled through development policy. the common traditional mathematical differential equations and population prediction models have limitations for time-series population prediction, and can even produce large estimation errors. to address this issue, we propose an improved method for predicting confirmed cases based on an lstm (long short-term memory) neural network. this work compares the deviation between the experimental results of the improved lstm prediction model and those of numerical prediction models (such as the logistic and hill equations), with the real data as reference, and uses the goodness of fit to evaluate the fitting performance of the improvement. experiments show that the proposed approach has a smaller prediction deviation and a better fitting effect. compared with previous forecasting methods, the contributions of our proposed improvements are mainly the following: 1) we fully consider the spatiotemporal characteristics of the data, rather than single standardised data; 2) the improved parameter settings and evaluation indicators are more accurate for fitting and forecasting; 3) we consider the impact of the epidemic stage and conduct reasonable data processing for different stages. since the malthus population model was proposed, it has been widely used in various fields. later, related scholars added a population coefficient to the malthus model to improve it, and proposed the logistic model. this model makes population predictions more accurate and effective. for the prediction of regional population size and age structure, shorokhov, s. i. used a matrix form to realise population prediction through the comparison of immigration inflows [shorokhov, s. i. (2014) ]. similarly, different mathematical models can also predict the size of the world population. yuri s. 
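the logistic model mentioned above is one of the baselines the improved lstm is compared against. a minimal sketch of the logistic growth curve for cumulative case counts, with illustrative parameter values (carrying capacity k, growth rate r, inflection time t0):

```python
import math

# logistic growth curve often used as a baseline for cumulative case counts:
# c(t) = k / (1 + exp(-r * (t - t0))). the parameter values below are
# illustrative, not fitted to any real outbreak.
def logistic(t, k, r, t0):
    return k / (1.0 + math.exp(-r * (t - t0)))

curve = [logistic(t, k=80000.0, r=0.2, t0=30.0) for t in range(100)]
```

the curve is strictly increasing, passes through k/2 at the inflection time t0, and saturates at the carrying capacity k, which is why it is a natural baseline for cumulative epidemic counts.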
popkov and others used an entropy estimation algorithm to construct a stochastic prediction model [popkov y s, dubnov y a, and popkov a y. (2016) ]. the problem with the traditional logistic regression model is that when there are many features, the accuracy of its predictions decreases. therefore, it is not suitable for scenarios containing a large number of multi-type features or variables. in contrast, we propose a prediction method using the lstm network model from machine learning, which has a clear advantage in this field. building on population growth prediction models, kermack proposed the sir epidemic model. since then, the sir epidemic model has been widely applied. ning, z et al. predicted infectious disease cases through grey and differential equation models [ning, z. and lin, l. (2014) ]. this class of model is mainly used in epidemiology. a common use is to explore the risk factors of a disease or predict the probability of a disease based on risk factors, and so on. paul d. haemig used decades of database records to build mathematical models and predicted the number of patients with encephalitis in the following year [haemig p d, sjöstedt de luna s, grafström a et al. (2011) ]. for infectious diseases related to this article, there are also many mathematical models used for prediction. a. gray used stochastic differential equations to improve the sis epidemic model [gray a, greenhalgh d, hu l et al. (2011) ], and a new sir model was established to predict the development trend of avian influenza and human influenza [iwami s, takeuchi y, and liu x. (2007) ]. in recent years, infectious disease models have been continuously improved and developed. infectious disease models have been applied to prediction and prevention in medical settings and have played a substantial role in promoting them. one example combines an approximate bayesian algorithm with an infectious disease model to calculate risk probabilities [minter a and retkute r. (2019) ]. 
li q combined evolutionary game theory with an infectious disease prediction model including a vaccination strategy, and achieved good results in practical application [li q, li m c, lv l et al. (2017) ]. in response to different national conditions, relevant scholars have also compared real-time predictions of endemic infectious diseases based on lasso models across different countries [chen y, chu c w, chen m i c et al. (2018) ]; in addition, there are many practical models. on the one hand, the comparison of basic models enables the selection, trade-off and comparison of infectious disease models [funk s and king a a. (2020) ]. on the other hand, researchers combined local ili incidence and used a dynamically calibrated compartment model for real-time analysis and prediction of influenza outbreaks in belgium [miranda g h b, baetens j m, bossuyt n et al. (2019) ]. although the sir epidemic model produces good predictions, it has a significant problem: it is based on differential equations, so the process of solving the equations is very complicated. another drawback is that the prediction results are very sensitive to the initial-value conditions, so the robustness of the model is weak. in this respect, we use the error backpropagation characteristics of the lstm model to train on the data. traditional prediction methods are based on mathematical models; nowadays, machine learning prediction is more and more widely used. the combination of machine learning and traditional forecasting models for quantitative forecasting has mushroomed. at present, there are data processing methods using hierarchical learning to predict the future citation counts of papers [chakraborty t, kumar s, goyal p et al. (2014) ]. others have used machine learning for content and bibliometric measurement, predicting the citation counts of biomedical literature [fu l and aliferis c. (2010) ]. 
in addition, artificial neural networks have also been used to predict populations [folorunso o, akinwale a t, asiribo o e et al. (2010) ]. at the same time, because of the powerful generalisation ability of machine learning, traffic flow prediction models based on machine learning have recently been widely used. one existing method introduces a maximum entropy kalman filter to achieve traffic flow prediction [cai l, zhang z and yang j et al. (2019) ]. other studies use deep learning neural networks for traffic flow prediction, for example multi-intersection traffic flow prediction learning [shen z, wang w, shen q et al. (2019) ] and a hybrid traffic flow prediction method based on multi-mode deep learning [du s, li t, gong x et al. (2018) ]. in addition, machine learning has also become common in infectious disease model prediction. hani m. aburas used a neural network model to predict the proportion of patients diagnosed with dengue fever [aburas h m, cetiner b g and sari m. (2010) ]. at present, machine learning prediction models are often used to predict traffic flow and similar quantities; however, there is relatively little research on predicting case numbers. our research applies the lstm model to this prediction task and finds that the results are relatively good. regarding recent research on the development of covid-19 and its impact, many researchers have built prediction models based on the available data. zifeng yang et al. analysed the epidemic prevention measures of the chinese government and predicted the evolution of the covid-19 epidemic [yang z, zeng z, wang k et al. (2020) ]. they used an improved seis model and an ai network trained with the 2003 sars data to predict the next phase of the epidemic in china, and their prediction results proved relatively accurate. using real-time data from the web, dong e et al. 
built a web-based interactive dashboard that can track covid-19 in real time using the characteristics of internet data [dong e, du h and gardner l. (2020) ]. in addition, a dynamic stochastic general equilibrium (dsge) model uses data randomisation to assess the impact of the coronavirus on the tourism industry and achieves trend prediction [yang y, zhang h and chen x. (2020) ]. although there have been many recent studies on covid-19 prediction models, we are the first to use the lstm model to consider the end of the epidemic. the lstm model performs forward-in-time memory processing on a large amount of collected data; it considers actual factors and feature conditions, and outputs data associated with these factors. in this section, we will specifically introduce our data processing method, the construction of the lstm model, and the improved method. novel coronavirus pneumonia covid-19 is a highly contagious disease. we hope to use the data on the number of confirmed cases over the previous days to predict the growth trend of the number of confirmed cases in the following days. at the same time, whether the national government takes effective measures to cut off the spread of the virus is also an important factor, which affects when the inflection point appears and how long the virus continues to spread. therefore, we use the number of daily diagnoses in different countries, together with the time when each country adopted curfew-like measures, as the input data of the model. to increase the learning efficiency and the feature-extraction efficiency, it is necessary to preprocess the data and expose more features to facilitate model learning. we take the cumulative number of diagnoses, the number of new diagnoses, the daily growth rate of cumulative diagnoses, and whether the government has imposed a lockdown as the inputs for model learning. 
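the four input features just listed can be built with a short helper. the manually chosen maxima mirror the text's point about not trusting the observed maximum; the specific function name and the toy numbers below are illustrative:

```python
# feature construction as described: cumulative cases, daily new cases, the
# daily growth rate of cumulative cases, and a 0/1 lockdown flag. the first
# two features are scaled by manually chosen maxima so that values for
# future days can still fall inside [0, 1].
def build_features(cumulative, lockdown_flags, max_cum, max_new):
    feats = []
    for i in range(1, len(cumulative)):
        new = cumulative[i] - cumulative[i - 1]
        growth = new / cumulative[i - 1] if cumulative[i - 1] > 0 else 0.0
        feats.append([cumulative[i] / max_cum, new / max_new,
                      growth, float(lockdown_flags[i])])
    return feats

f = build_features([10, 15, 30, 33], [0, 0, 1, 1], max_cum=100, max_new=50)
```

each row is one day's four-dimensional input vector; the first day is dropped because the new-case and growth features need a previous day.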
simultaneously, the number of newly confirmed cases on the next day is used as the output, rather than the cumulative diagnoses of the next day, because the numerical range of the number of newly confirmed cases is relatively small, which is convenient for evaluating and comparing the loss function. for the first two features, minmaxscaler should not simply be used for standardisation, because future values may exceed the observed maximum. instead, we manually set the max value so that the input data lies within [0,1], which makes predictions for the next few days perform better. the lockdown status can be expressed directly as 0 or 1. long short-term memory networks are a special kind of recurrent neural network with better performance on time-series prediction. there is an incubation period for the new coronavirus, and using lstm for time-series prediction may uncover the influence of potential cases. the input to an lstm first passes through a forget gate, a sigmoid layer also called the forget gate layer, which combines the current input x_t with the previous hidden state h_{t−1}: f_t = σ(w_f [h_{t−1}, x_t] + b_f). next comes the lstm input gate. in this step, we determine the vector to be updated, i_t = σ(w_i [h_{t−1}, x_t] + b_i), and create a candidate update vector with a tanh activation function, g_t = tanh(w_g [h_{t−1}, x_t] + b_g). the state in the memory cell is then c_t = f_t c_{t−1} + i_t g_t. finally, an output gate o_t = σ(w_o [h_{t−1}, x_t] + b_o) decides what to output and updates the hidden state, h_t = o_t tanh(c_t), in a similar way to the previous two gates. in this way, we obtain a prediction of the future output. we found that the results of the data-fitting phase are ideal for regions with few cases, but for regions with a large number of cases, such as hubei, china, the fitting results are biased. in this respect, we propose the following methods to improve the results of the fitting phase, which also improve the prediction results. this type of problem mainly occurs in regions where the epidemic is serious. 
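the gate structure described above can be sketched with a scalar lstm cell. real layers use weight matrices over vectors; the scalar weights here (all set to 0.5) are purely illustrative:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# scalar sketch of the standard lstm gates: forget gate, input gate with a
# tanh candidate, cell-state update, and output gate. illustrative only.
def lstm_cell(x, h_prev, c_prev, w):
    f = sigmoid(w['f'] * x + w['uf'] * h_prev + w['bf'])    # forget gate
    i = sigmoid(w['i'] * x + w['ui'] * h_prev + w['bi'])    # input gate
    g = math.tanh(w['g'] * x + w['ug'] * h_prev + w['bg'])  # candidate
    c = f * c_prev + i * g                                  # cell state
    o = sigmoid(w['o'] * x + w['uo'] * h_prev + w['bo'])    # output gate
    h = o * math.tanh(c)                                    # hidden state
    return h, c

w = {k: 0.5 for k in ('f', 'uf', 'bf', 'i', 'ui', 'bi',
                      'g', 'ug', 'bg', 'o', 'uo', 'bo')}
h, c = lstm_cell(1.0, 0.0, 0.0, w)
```

because the output gate multiplies a tanh of the cell state, the hidden state stays bounded in (−1, 1), which is part of what makes lstm training stable over long sequences.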
because the base number of people infected with the virus is large, a relatively large number of new cases is predicted at times when, in the later stage of the epidemic, only a small number of new cases should appear. in response to this problem, we propose the following method to improve the prediction results of the ordinary lstm model. first of all, in the later stage of the epidemic, the prevention and control of the epidemic is relatively strong and the data fluctuation is small. simultaneously, the number of infected people and the number of diagnoses can be considered roughly equal, because diagnosed patients are isolated and can be considered no longer contagious. therefore, this stage should be handled under the conditions of a small population base. we therefore use the standard deviation of the last n days as the judgment condition to adjust the parameters for the number of confirmed cases. first, the data to be input is normalised with minmaxscaler based on itself, to unify the standard-deviation judgment criterion, and then the variance σ² is calculated. if the variance σ² is greater than a critical value, the data is reduced based on the population-base method: it is divided by a logarithm whose base can be calculated by a simple adjustment. the specific improved algorithm is as follows: because of the existence of an incubation period and asymptomatic carriers, novel coronavirus pneumonia has many potential influencing factors. however, discovering these potential effects requires many features, and the results obtained by fitting the trend with lstm alone may not be ideal. so we first pass the data through a fully connected neural network to further extract features as the input to the lstm layer. 
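one way the variance-triggered reduction above might look in code is sketched below. the threshold value, the log base, and the exact damping rule are assumptions for illustration; the paper's precise adjustment is not fully specified in the text:

```python
import math
import statistics

# sketch of the described adjustment: minmax-normalise the window to unify
# the variance criterion, and if the variance exceeds a critical value, damp
# the raw counts with a logarithm. threshold and base are illustrative.
def adjust_window(counts, threshold=0.05, base=10.0):
    lo, hi = min(counts), max(counts)
    norm = [(c - lo) / (hi - lo) if hi > lo else 0.0 for c in counts]
    if statistics.pvariance(norm) > threshold:
        return [math.log(c, base) if c > 1 else 0.0 for c in counts]
    return counts

adj = adjust_window([100, 5000, 300, 4000, 50])
```

a noisy window with a large case base gets compressed to log scale, while a quiet late-stage window passes through unchanged.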
the lstm layers output predictions for the different conditions of the various features, and a fully connected layer then integrates the predictions of the various features to obtain the desired result. at the same time, the incubation period of novel coronavirus pneumonia can be as long as 14 days, so the time-series sequences we input should theoretically cover more than 14 days. in this section, we will introduce our model fitting and prediction experiments, and explain our evaluation indicators and parameter settings. novel coronavirus cases: a dataset of the number of covid-19 confirmed cases, deaths and recoveries operated by the johns hopkins center for systems science and engineering (jhu csse) with the support of the esri living atlas team and the johns hopkins university applied physics laboratory (jhu apl). it's available online at https://github.com/cssegisanddata/covid-19 . covid-19 lockdown dates by country: a table of the lockdown dates of countries or provinces collected by jcyzag from the internet and published on the kaggle dataset. it's available online at https://www.kaggle.com/jcyzag/covid19-lockdown-dates-by-country. the goodness of fit of a statistical model describes how well it fits a set of observations. the measure of goodness of fit usually summarises the difference between the observed values and the values expected under the model. calculation method: denote the mean of the values to be fitted by ȳ, the expected value by ŷ_i, and the fitted value by y_i. the total sum of squares is sst = Σ_i (ŷ_i − ȳ)², and the regression sum of squares is ssr = Σ_i (y_i − ȳ)². from this calculation, the goodness of fit r² = ssr / sst can be obtained. we then set a training window on the training data and obtain a time series of data. 
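the goodness-of-fit calculation above translates directly into a few lines; the function name is illustrative:

```python
# goodness of fit as described: r2 = ssr / sst, with sst the total sum of
# squares of the observed values around their mean and ssr the regression
# sum of squares of the fitted values around that same mean.
def goodness_of_fit(observed, fitted):
    mean = sum(observed) / len(observed)
    sst = sum((y - mean) ** 2 for y in observed)
    ssr = sum((f - mean) ** 2 for f in fitted)
    return ssr / sst

r2 = goodness_of_fit([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```

a perfect fit gives r² = 1, and values closer to 1 indicate a better fit.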
our model uses a time-series sequence of up to 21 days for training, with the aim of discovering the influence of carriers in the incubation period on virus transmission. after normalisation, the absolute values of all input and output data are small, so a small learning rate is required for learning. as the number of training epochs increases, the learning rate should also decrease, so that the data can be fitted better. in order to show the results of data fitting and prediction, we use the trained model to predict the number of confirmed cases for the next 7 days. our model is built according to the following steps. first, features extracted from the input data are put into the lstm to obtain several time-series outputs. after obtaining the output of the lstm network, the features are integrated and merged in the last part. the model structure is shown in the corresponding figure. this part presents the results of our experiments. we compare our improved lstm prediction results with the results of number-prediction models such as the logistic and hill equations. at the same time, the fitting results of our improved fitting algorithm and of the unimproved lstm algorithm are also compared. during the training process, the learning rate decreases as the epoch increases; the loss convergence during training shows that the deviation can be controlled within 2%, which is better than the performance of the logistic and hill equations. we calculated the goodness of fit for the fitting results of the two models before and after improvement, as shown in tab 3. the prediction results of the improved model are closer to the true values, effectively reducing the late-stage prediction error due to the large case base. to address the problem of deviation and accuracy in predicting the number of confirmed cases with traditional methods, we propose an improved method based on an lstm neural network. 
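the 21-day training windows can be built with a simple sliding-window helper; the function name and toy series are illustrative:

```python
# sliding training windows as described: each sample holds window_size
# consecutive days of features, with the next day's value as the target.
def make_windows(series, targets, window_size=21):
    samples = []
    for i in range(len(series) - window_size):
        samples.append((series[i:i + window_size], targets[i + window_size]))
    return samples

data = list(range(30))
samples = make_windows(data, data, window_size=21)
```

for a 30-day series this yields 9 training samples, each pairing 21 consecutive days with the day that follows them.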
the prediction of the number of new coronavirus diagnoses can be regarded as time-series prediction, and the lstm model performs well on time-series prediction. because of the incubation period of the new coronavirus, time-series prediction using lstm may also uncover the influencing factors of potential cases. first of all, we use the 21-day case data of various countries and regions provided on the website, taking the cumulative number of diagnoses, the number of new diagnoses, the daily growth rate of cumulative diagnoses, and whether the government imposed a lockdown as the model inputs, standardised with minmaxscaler using manually set parameters. through a fully connected neural network, the features of the data are further extracted as input to the lstm layer. we build an lstm neural network and set the length of the time series, the learning rate, and the number of days to predict. in view of the large values in the training set, to reduce the fitting deviation we improved the lstm model: we used a unified standard deviation as the judgment criterion, reduced the data using a method based on the overall case base, and adjusted the data. this method creates a tanh-activated candidate update vector, obtains the state in the storage unit, and determines the output and updated state through the output gate, so that the model can output the predicted number of newly diagnosed cases for the next day. 
besides, according to the actual situation of the epidemic, we selected data from several countries and regions, and compared the deviation between the experimental results of the improved lstm prediction model and those of the numerical prediction models (such as the logistic and hill equations).
- dengue confirmed-cases prediction: a neural network model
- the meshless local petrov-galerkin (mlpg) method
- a noise-immune kalman filter for short-term traffic flow forecasting
- the utility of lasso-based models for real time forecasts of endemic infectious diseases: a cross country comparison
- an abnormal network flow feature sequence prediction approach for ddos attacks detection in big data environment
- savant syndrome-theories and empirical findings
- an interactive web-based dashboard to track covid-19 in real time. the lancet infectious diseases
- a hybrid method for traffic flow forecasting using multimodal deep learning
- population prediction using artificial neural network
- using content-based and bibliometric features for machine learning models to predict citation counts in the biomedical literature
- choices and trade-offs in inference with infectious disease models
- a stochastic differential equation sis epidemic model
- forecasting risk of tickborne encephalitis (tbe): using data from wildlife and climate to predict next year's number of human victims
- avian-human influenza epidemic model
- a new prediction model of infectious diseases with vaccination strategies based on evolutionary game theory
- approximate bayesian computation for infectious disease modelling
- real-time prediction of influenza outbreaks in belgium. epidemics
- the prediction of infectious diseases via a grey and differential equation model
- predicting concentration of pm10 using optimal parameters of deep neural network
- new method of randomized forecasting using entropy-robust estimation: application to the world population prediction. 
mathematics a novel learning method for multiintersections aware traffic flow forecasting the forecast of the size and age structure of the economically active population of altai territory and kemerovo region determination of the normal contact stiffness and integration time step for the finite element modeling of bristle-surface interaction public health emergency management and multi-source data technology in china coronavirus disease 2019 (covid-19) situation report 43 report of the who-china joint mission on coronavirus disease (covid-19) modified seir and ai prediction of the epidemics trend of covid-19 in china under public health interventions coronavirus pandemic and tourism: dynamic stocha stic general equilibrium modeling of infectious disease outbreak lifelong federated reinforcement learning: a learning architecture for navigation in cloud robotic systems federated imitation learning: a novel framework for cloud robotic systems with heterogeneous sensor data recognition of pyralidae insects using intelligent monitoring autonomous robot vehicle in natural farm scene design and implementation of a novel precision irrigation robot based on an intelligent path planning algorithm we perform statistical analysis on the prediction results and calculate the deviation value. it can be seen that the improved lstm model prediction results are better than the traditional logistic and hill equation prediction models in most cases. the average error key: cord-198449-cru40qp4 authors: carballosa, alejandro; mussa-juane, mariamo; munuzuri, alberto p. title: incorporating social opinion in the evolution of an epidemic spread date: 2020-07-09 journal: nan doi: nan sha: doc_id: 198449 cord_uid: cru40qp4 attempts to control the epidemic spread of covid19 in the different countries often involve imposing restrictions to the mobility of citizens. 
recent examples demonstrate that the effectiveness of these policies strongly depends on the willingness of the population to adhere to them, and this is a parameter that is difficult to measure and control. we demonstrate in this manuscript a systematic way to check the mood of a society and a way to incorporate it into dynamical models of epidemic propagation. we exemplify the process considering the case of spain, although the results and methodology can be directly extrapolated to other countries. both the number of interactions that an infected individual carries out while being sick and the reachability that this individual has within its network of human mobility have a key role in the propagation of highly contagious diseases. if we picture the population of a given city as a giant network of daily interactions, we would surely find highly clustered regions of interconnected nodes representing families, coworkers and circles of friends, but also several nodes that interconnect these different clustered regions, acting as bridges within the network and representing simple random encounters around the city or perhaps people working at customer-oriented jobs. it has been shown that the most effective way to control the virulent spread of a disease is to break down the connectivity of these networks of interactions by imposing social distancing and isolation measures on the population [1]. for these policies to succeed, however, the majority of the population needs to adhere to them willingly, since frequently these containment measures are not mandatory and significant parts of the population exploit gaps in the policies or even ignore them completely.
in diseases with a high basic reproduction number, i.e., the expected number of new cases directly generated by one infected case, as is the case of covid19, these individuals represent an important risk to the control of the epidemic, as they actually form the main core of exposed individuals during quarantining policies. in case of getting infected, they can easily spread the disease to their nearest connections in their limited but ongoing everyday interactions, reducing the effectiveness of the social distancing constraints and helping the propagation of the virus. containment measures and estimates of the degree of adhesion to these policies are especially important for diseases where there can be individuals that propagate the virus to a higher number of individuals than the average infected case. these are the so-called super-spreaders [2, 3] and are present in sars-like diseases such as covid19. recently, a class of super-spreaders was successfully incorporated in mathematical models [4]. regarding the usual epidemiological models based on compartments of populations, a viable option is to introduce a new compartment to account for the confined population [5]. again, this approach would depend on the adherence of the population to the confinement policies, and taking into account the rogue individuals that bypass the confinement measures is important to accurately characterize the infection curves and the prediction of short-term new cases of the disease, since they can be responsible for a dramatic spread. here, we propose a method that quantitatively measures the state of the public opinion and the degree of adhesion to an externally given policy. then, we incorporate it into a basic epidemic model to illustrate the effect of changes in the social network structure on the evolution of the epidemic. the process is as follows. we reconstruct a network describing the social situation of the spanish society at a given time based on data from social media.
this network is like a radiography of the social interactions of the population considered. then, a simple opinion model is incorporated into this network, which allows us to extract a probability distribution of how likely the society is to follow new opinions (or political directions) introduced in the net. this probability distribution is later included in a simple epidemic model computed along with different complex mobility networks where the virus is allowed to spread. the framework of mobility networks allows the explicit simulation of entire populations down to the scale of single individuals, modelling the structure of human interactions, mobility and contact patterns. these features make them a promising tool to study an epidemic spread (see [6] for a review), especially if we are interested in controlling the disease by means of altering the interaction patterns of individuals. at this point, we must highlight the difference between the two networks considered: one is collected from real data from social media and is used to gauge the mood of the collective society, while the other is completely in-silico and proposed as a first approximation to the physical mobility of a population. the study case considered to exemplify our results is the situation in spain. this country was hard-hit by the pandemic with a high death toll, and the government reacted by imposing a severe control of the population mobility that is still partially active. the policy worked and the epidemic is controlled; nevertheless, it has been difficult to estimate the level of adherence to those policies and the repercussions on the sickness evolution curve. this effect can also be determinant during the present transition to the so-called 'new normal'. the manuscript is organized as follows.
in section 2 we describe the construction of the social network from scratch using free data from twitter; the opinion model is also introduced here and its coupling to the epidemiological model is described. section 3 contains the main findings and computations of the presented models, and section 4 a summary and a brief discussion of the results, with conclusions and future perspectives. in order to generate a social network, we use twitter. we downloaded several networks of connections (using the tool nodexl [7]). introducing a word of interest, nodexl returns information about users that have tweeted a message containing the typed word and the connections between them. the topics of the different searches are irrelevant. in fact, we tried to choose neutral topics with the potential to engage many people independently of political commitment, age, or other distinctions. the importance of each subnet is that it reveals who is following whom and allows us to build a more complete network of connections once all the subnets are put together. each one of the downloaded networks has approximately 2000 nodes [8]. in this way, downloading as many of such subnets as possible gives us a more realistic map of the current state of the spanish twitter network and, we believe, a realistic approximation to the social interactions nationwide. we intended to download diverse, politically inoffensive networks. 'junction' accounts are needed to make sure that all sub-networks overlap. junction accounts are accounts that are part of several subnets and guarantee the connection between them. if these junction accounts did not exist, isolated local small networks might appear. see the supplementary information for the word-of-interest networks downloaded and overlapped. twitter, as a social network, changes in time [9], [10], [11] and is strongly affected by the current socio-political situation, so important variations in its configuration are expected with time.
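a minimal sketch of how the per-hashtag subnets might be merged into one directed following graph, with the junction accounts identified as those appearing in more than one subnet; the function name and data layout are assumptions for illustration, not the nodexl export format:

```python
from collections import defaultdict

def merge_subnets(subnets):
    """merge per-hashtag follower networks, each an edge list of
    (follower, followed) pairs, into one directed graph; accounts appearing
    in more than one subnet are the 'junction' accounts that keep the
    merged network connected."""
    graph = defaultdict(set)
    membership = defaultdict(set)
    for idx, edges in enumerate(subnets):
        for follower, followed in edges:
            graph[follower].add(followed)
            _ = graph[followed]          # make sure the followed node exists
            membership[follower].add(idx)
            membership[followed].add(idx)
    junctions = {n for n, subs in membership.items() if len(subs) > 1}
    return dict(graph), junctions
```

for example, merging the edge lists `[("a", "b"), ("b", "c")]` and `[("c", "d")]` yields `"c"` as the only junction account.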
specifically, when a major crisis, such as the current one, is ongoing. taking this into consideration, we analyze two social networks corresponding to different moments in time. one represents the social situation in october 2019 (with N = 17665 accounts), which describes a pre-epidemic social situation, and another from april 2020 (with N = 24337 accounts), which describes the mandatory-confinement period. the networks obtained are directed and the links mark which nodes are following which; thus, a node with high connectivity is following the opinions of many other nodes. the two social networks obtained with this protocol are illustrated in figure 1. a first observation of their topologies demonstrates that they fit a scale-free network with a power-law connectivity distribution, with exponents γ = 1.39 for the october'19 network and γ = 1.77 for the april'20 network [12]. the significantly different exponents demonstrate the different internal dynamics of both networks. we generate the graphs in (a) and (b) using the force atlas 2 algorithm from gephi [13]. force atlas 2 is a force-directed algorithm that simulates a physical system to organize the network through space relying on a balance of forces: nodes repel each other like charged particles, while links attract their nodes obeying hooke's law, so that nodes that are more distant exchange less information. we consider a simple opinion model based on the logistic equation [14] that has proved to be of use in other contexts [15, 16]. it is a two-variable dynamical model, with variables A and B accounting for the two different opinions. as A + B remains constant, we can use the normalization equation A + B = 1 and, thus, the system reduces to a single equation,

dA/dt = (ε + k)A - (k + c)A^2,

where ε is a time rate that modifies the rhythm of evolution of the variable A, k is a coupling constant and c controls the stationary value of A. this system has two fixed points, A0 = 0 and A0 = (ε + k)/(k + c), the latter being stable and A0 = 0 unstable.
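the exact notation of the opinion equation was lost in extraction; assuming a logistic-type reduced equation da/dt = (eps + k)a - (k + c)a^2, with the parameter values quoted in the supplementary information, a forward-euler sketch shows relaxation from a small perturbation to the stable fixed point (eps + k)/(k + c):

```python
def opinion_step(a, eps=0.0001, k=0.01, c=0.0001, dt=1.0):
    # forward-euler step of the logistic-type opinion equation,
    # with fixed points 0 (unstable) and (eps + k)/(k + c) (stable)
    return a + dt * ((eps + k) * a - (k + c) * a * a)

def relax(a0, steps=5000):
    # iterate the map until the opinion settles near a fixed point
    a = a0
    for _ in range(steps):
        a = opinion_step(a)
    return a
```

with these illustrative parameters the stable fixed point is 1.0, so a node starting at a = 0.01 drifts toward full agreement, while a = 0 stays at the unstable fixed point.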
we now consider that each node belongs to a network whose connections follow the distribution measured in the previous section. the dynamic equation becomes [17]

dA_i/dt = f(A_i) + (D/k_i) Σ_j L_ij A_j,   i = 1, …, N,

where each of the nodes obeys the internal dynamics given by f(A) while being coupled to the rest of the nodes with a strength D/k_i, D being a diffusion constant and k_i the connectivity degree of node i (the number of nodes each node is interacting with, also named the outdegree). note that this is a directed, non-symmetrical network where a link means that node i is following the tweets of node j. L is the laplacian matrix, the operator for diffusion in the discrete space. we can obtain the laplacian matrix from the connections established within the network as L = W - K, W being the adjacency matrix and K the diagonal matrix of degrees. notice that the mathematical definition of the laplacian matrix in some references has the opposite sign; we use the above definition, given by [17], in parallelism with fick's law and in order to keep a positive sign in our diffusive system. now, we proceed as follows. we consider that all the accounts (nodes in our network) are at the stable fixed point of the opinion model, with a 10% of random noise. then a subset of R accounts is forced to acquire a different opinion, A_i = 1 with a 10% of random noise, and we let the system evolve following the dynamical equations above. in this case, accounts are sorted by the number of followers, which is easily controllable. as the system evolves, some of the nodes shift their values closer to 1, which, in the context of this simplified opinion model, means that those nodes shifted their opinion toward the one leading the change. this process is repeated in order to gain statistical significance and, as a result, it provides the probability distribution of nodes eager to change opinion and adhere to the new politics.
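a small numpy sketch of the coupled dynamics with the sign convention described in the text (laplacian L = W - diag(outdegree), so the diffusion term carries a positive sign, in parallel with fick's law); the local dynamics f, the diffusion constant D and the example graph are illustrative:

```python
import numpy as np

def directed_laplacian(W):
    # L = W - diag(outdegree); W[i, j] = 1 if node i follows node j
    return W - np.diag(W.sum(axis=1))

def coupled_step(a, W, f, D=0.01, dt=0.1):
    # euler step of da_i/dt = f(a_i) + (D / k_i) * sum_j L_ij a_j
    L = directed_laplacian(W)
    k = np.maximum(W.sum(axis=1), 1.0)   # guard against zero outdegree
    return a + dt * (f(a) + (D / k) * (L @ a))
```

a useful sanity check of this convention: the rows of L sum to zero, so a uniform opinion state is left unchanged by the diffusion term.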
our epidemiological model is based on the classic sir model [18] and considers three different states for the population: susceptible (s), infected (i) and recovered or removed (r) individuals, with the transitions sketched in figure 2. here β represents the probability of infection and γ the probability of recovery. we assume that recovered individuals gain immunity and therefore cannot be infected again. we consider an extended model to account for the epidemic propagation where each node interacts with others in order to spread the virus. in this context we consider that each node belongs to a complex network whose topology describes the physical interactions between individuals. a node here means a single person or a set of individuals acting as a close group (i.e. families). the idea is that each infected node can spread the disease with a chance β to each of its connections with susceptible individuals; thus β becomes a control parameter of how many individuals an infected one can propagate the disease to at each time step. then, each infected individual has a chance γ of recovering from the disease. a first-order approach to a human mobility network is the watts-strogatz model [19], given its ability to produce a clustered graph where nearest nodes have a higher probability of being interconnected while keeping some chance of interaction with distant nodes (as in an erdös-renyi random graph [20]). according to this model, we generate a graph of N nodes, where each node is initially connected to its nearest neighbors in a ring topology and the connections are then randomly rewired toward distant nodes with a probability p. the closer this probability is to 1, the more the graph resembles a fully random network, while for p = 0 it remains a purely diffusive network.
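a stdlib sketch of the watts-strogatz construction just described, together with the monte-carlo infection/recovery loop used for the epidemic simulations below; all parameter values here are illustrative, not those of the paper:

```python
import random

def watts_strogatz(n, k, p, rng):
    """ring of n nodes linked to their k nearest neighbours on each side;
    every ring edge is rewired to a random distant node with probability p."""
    edges = set()
    for i in range(n):
        for j in range(1, k + 1):
            a, b = i, (i + j) % n
            if rng.random() < p:
                b = rng.randrange(n)
                while b == a or (a, b) in edges or (b, a) in edges:
                    b = rng.randrange(n)
            edges.add((a, b))
    neigh = {i: set() for i in range(n)}
    for a, b in edges:
        neigh[a].add(b)
        neigh[b].add(a)
    return neigh

def sir_monte_carlo(neigh, beta, gamma, i0, rng, max_steps=2000):
    """monte-carlo sir: each infected node tries to infect each susceptible
    neighbour with chance beta, then recovers with chance gamma."""
    state = {i: "S" for i in neigh}
    for i in rng.sample(sorted(neigh), i0):
        state[i] = "I"
    curve = []
    for _ in range(max_steps):
        infected = [i for i in neigh if state[i] == "I"]
        curve.append(len(infected))
        if not infected:
            break
        for i in infected:
            for j in neigh[i]:
                if state[j] == "S" and rng.random() < beta:
                    state[j] = "I"
            if rng.random() < gamma:
                state[i] = "R"
    return curve
```

the returned curve is the number of infected individuals per time step, from which the peak height and its timing can be read off.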
if we relate this ring-shaped network to a spatial distribution of individuals, when p is small the occurrence of random interactions with individuals far from our circle of neighbors is strongly suppressed, mimicking a situation with strict mobility restrictions where we are only allowed to interact with the individuals from our neighborhood. this feature makes the watts-strogatz model an even more suitable choice for the purposes of our study, since it allows us to impose further mobility restrictions on our individuals in a simple way. on the other hand, the effects of clustering on epidemic models in small-world networks are important and have already been studied [21] [22] [23] [24]. the network is initialized setting an initial number of nodes as infected while the rest are in the susceptible state and, then, the simulation starts. at each time step, the chance that each infected individual spreads the disease to each of its susceptible connections is evaluated by means of a monte carlo method [25]. then, the chance of each infected individual recovering is evaluated at the end of the time step in the same manner. this process is repeated until the pool of infected individuals has decreased to zero or a stopping criterion is achieved. the following step in our modelling is to include the results of the opinion model from the previous section in the epidemic spread model just described. first, from the outcome A of the opinion model, we build a probability density P(x̄), where x̄ = 1 - A represents the disagreement with the externally given opinion. these opinion values are assigned to each of the nodes in the watts-strogatz network following the distribution P(x̄). next, we introduce a modified infection parameter, which varies depending on the opinion value of each node.
it can be understood in terms of a weighted network modulated by the opinions: it is more likely that an infection occurs between two rogue individuals (higher values of x̄) than between two individuals who agree with the government confinement policies (x̄ equal or very close to zero). we introduce, then, the weight β′_ij = β · x̄_i · x̄_j, which accounts for the effective probability of infection between an infected node i and a susceptible node j. at each time step of the simulation, the infection chances are evaluated according to the value β′_ij of the connection, and the process is repeated until the pool of infected individuals has decreased to zero or the stopping criterion is achieved. in figure 3 we exemplify this process through a network diagram, where white, black and grey nodes represent susceptible, infected and recovered individuals respectively. black connections account for possible infections with chance β′_ij. to account for further complexity, this approach could be extrapolated to more complex epidemic models already presented in the literature [4, 6, 26]. nevertheless, for the sake of illustration, this model still preserves the main features of an epidemic spread without adding the additional complexity needed to account for real situations such as the covid19 case. following the previous protocol, we run the opinion model considering the two social networks analyzed. figure 4 shows the distribution of the final states of the variable A for the october'19 network (orange) and the april'20 network (green) when the new opinion is introduced in 30% of the total population (r = 30%). different percentages of the initial population r were considered but the results are equivalent (see figure s1 in the supplementary information). figure 4 clearly shows that the population in april'20 was more eager to follow the new opinion (political guidelines) compared with the situation in october'19.
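the opinion-modulated weight is simple to state in code; the function name is ours, but the formula follows the weighted-network description above, where the infection weight is proportional to both nodes' disagreement values:

```python
def effective_beta(beta, xbar_i, xbar_j):
    # effective infection probability between infected node i and
    # susceptible node j, modulated by each node's disagreement xbar
    return beta * xbar_i * xbar_j
```

note that a node fully agreeing with the policy (xbar = 0) blocks transmission along all of its links, while two rogue nodes keep a reduced but nonzero infection chance.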
in the pandemic scenario (network of april'20) it is noticeable that larger values of the opinion variable A are achieved, corresponding with the period of the quarantine. preferential states are also observed around A = 0, A = 0.5 and A = 1. note that the network of april'20 changes opinions more easily than that of october'19. during the sanitary crisis in spain, the government imposed heavy restrictions on the mobility of the population. to better account for this situation, we rescaled the probability density of disagreement opinions P(x̄) to values between 0 and 0.3, leading to the probability densities of figure 5. from here on, we shall refer to this maximum value of the rescaled probability density as the cutoff imposed on the probability density. note that this probability distribution is directly included in the mobility model as a probability to interact with other individuals; thus, this cutoff means that the government policy is enforced, removing up to 70% of the interactions, while the remaining 30% is controlled by the population's decision to adhere to the official opinion. in figure 6 we summarize the main results obtained from the incorporation of the opinion model into the epidemiological one. we established four different scenarios. in the first one we considered a theoretical situation where we imposed that around 70% of the population adopts social distancing measures, but left the other 30% in a situation where they either hold an opinion against the policies or have to move around interacting with the rest of the network for any reason (that is, x̄ = 0.3 for all the nodes). in contrast to this situation, we introduce the opinion distributions of the social networks of april'20 and october'19.
finally, we consider another theoretical population where at least 90% of the population adopts social distancing measures (note that in a real situation around 10% of the population occupies essential jobs and is thus still exposed to the virus). for this latter case the outbreak of the epidemic does not occur, so there is no peak of infection. note that the first and the last scenarios are completely in-silico, introduced for the sake of comparison. figure 6a shows the temporal evolution of the infected population in the first three of the above scenarios. the blue line shows the results without including an opinion model, considering that 70% of the population blindly follows the government mobility restrictions while the remaining 30% continues interacting as usual. the orange line shows the evolution including the opinion model with the probability distribution derived as in october'19. the green line is the evolution of the infected population considering the opinion model derived from the situation in april'20. note that the opinion model stated that the population in april'20 was more eager to follow changes in opinion than in october'19, and this is directly reflected in the curves of figure 6a. also note that as the population becomes more conscious and decides to adhere to the restriction-of-mobility policies, the maximum of the infection curve is delayed in time and its intensity is diminished. this figure clearly shows that the state of the opinion inferred from the social network analysis strongly influences the evolution of the epidemic. the results from the first theoretical case (blue curve) show clearly that the disease reaches practically all the rogue individuals (around 30% of the total population, which we set with the rescaling of the probability density), while the other two cases with real data show that further agreement with the given opinion results in flatter infection curves.
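the rescaling of the disagreement values onto [0, cutoff] used in these scenarios can be sketched as follows; the linear rescale is our assumption about how the density was mapped onto that interval:

```python
def rescale_to_cutoff(xbars, cutoff=0.3):
    """rescale disagreement values onto [0, cutoff]: the enforced policy is
    assumed to remove the remaining (1 - cutoff) fraction of interactions."""
    top = max(xbars)
    if top == 0:
        return [0.0] * len(xbars)
    return [cutoff * x / top for x in xbars]
```

with cutoff = 0.3 the most rogue node keeps at most 30% of its original interaction probability, matching the enforced-policy interpretation above.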
we have analyzed both the total number of infected individuals at the peaks and their location in simulation time, but, since our aim is to highlight the incorporation of the opinion model, we show in figures 6b and 6c the values of the maximum infection peak as well as the delay introduced in achieving this maximum, scaled with the corresponding values of the first case (blue line). we see that the difference in the degree of adhesion between the social networks yields approximately a further 12% reduction in the number of infected individuals at the peak, and a further delay of around 20% in the time at which this peak takes place. note that for the april'20 social network, a reduction of almost 50% of individuals is obtained for the peak of infection, and a similar value is achieved for the time delay of the peak. this clearly reflects the fact that a higher degree of adhesion is important to flatten the infection curve. finally, in the latter theoretical scenario, where we impose a cutoff of x̄ = 0.1, the outbreak of the epidemic does not occur, and thus there is no peak of infection. this is represented in figures 6b and 6c as a dash-filled bar indicating the absence of the said peak. changing the condition on the cutoff imposed on the variable x̄ can be of interest to model milder or stronger confinement scenarios, such as the different policies ruled in different countries. in figure 7 we show the infection peak statistics (maximum of the infection curve and time at maximum) for different values of the cutoff and for both social opinion networks. in both cases, the values are scaled with those from the theoretical scenario with all individuals having their opinion at the cutoff value. both measurements (figures 7a and 7b) are inversely proportional to the value of the cutoff. this effect can be understood in terms of the obtained probability densities.
for both networks (october'19 and april'20) we obtained that most of the nodes barely changed their opinion, and thus for increasing levels of the cutoff on x̄ these counts dominate the infection processes, so the difference between both networks is reduced. on the other hand, this highlights the importance of rogue individuals in situations with increasing levels of confinement policies, since for highly contagious diseases each infected individual propagates the disease rapidly. each infected individual matters, and the fewer connections he or she has, the harder it is for the virus to spread among the exposed individuals. note that for all the scenarios, the social network of april'20 represents the optimum situation in terms of infection peak reduction and its time delay. the case of the cutoff x̄ = 0.2 is particularly interesting: all simulations run for this cutoff show an almost non-existent peak. this is represented in figure 7a as almost a 100% reduction of the infection peak (the maximum value found on the infection curve was small but not zero), together with the corresponding value of the time delay (figure 7b). as discussed in the previous section, we are considering a watts-strogatz model for the mobility network. this type of network is characterized by a probability of rewiring (as introduced in the previous section) that establishes the number of distant connections for each individual in the network. all previous results were obtained considering a probability of rewiring of 0.25. figure 8 shows the variation of the maximum of the infection curve and the time of the maximum versus this parameter. the observed trend indicates that the higher the clustering (thus, the lower the probability of rewiring) the more difficult it is for the disease to spread along the network.
this result is supported by previous studies in the field, which show that clustering decreases the size of the epidemics and, in cases of extremely high clustering, the epidemic can die out within the clusters of population [21, 24]. this can be understood in terms of the average shortest path of the network [12], a measure of the network topology that gives the average minimum number of steps required to travel between any two nodes of the network. starting from the ring topology, where only the nearest neighbors are connected, the average shortest path between any two opposite nodes is dramatically reduced by the random rewirings. remember that these new links can be understood as short-cuts or long-distance connections within the network. since the infection process can only occur along active links between the nodes, it makes sense that the propagation is limited if fewer of these long-distance connections exist in the network. the average shortest path length decays extremely fast with increasing values of the random rewiring, and thus we see that the peak statistics are barely affected for random rewirings larger than 25%. if one is interested in further control of the disease, the connections with distant parts of the network must be minimized to values smaller than this fraction. regarding the performance of both opinion-biased epidemic cases, we found again a clear difference between the two of them. in the april'20 case, the outcome of the model always presents a more favorable situation to control the expansion of the epidemic, underlining the importance of personal adherence to isolation policies in controlling the evolution of the epidemic. we have parametrized the social situation of the spanish society at two different times with data collected from a microblogging social media platform (twitter.com).
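the average shortest path invoked in this argument can be computed with plain breadth-first search; a stdlib sketch:

```python
from collections import deque

def average_shortest_path(neigh):
    """mean bfs distance over all connected ordered pairs of nodes,
    for a graph given as a dict of node -> set of neighbours."""
    total = pairs = 0
    for s in neigh:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in neigh[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(d for t, d in dist.items() if t != s)
        pairs += len(dist) - 1
    return total / pairs
```

on a 6-node ring the average distance is 1.8, and adding a single chord (a short-cut, as created by rewiring) already lowers it, which is the mechanism behind the fast decay discussed above.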
the topology of these networks, combined with a simple opinion model, provides us with an estimate of how likely this society is to follow new opinions and change its behavioral habits. the first analysis presented here shows that the social situation in october 2019 differs significantly from that of april 2020. in fact, we have found that the latter is more likely to accept opinions or directions and, thus, follow government policies such as social distancing or confinement. the output of these opinion models was used to tune the mobility in an epidemic model, aiming to highlight the effect that the social 'mood' has on the pandemic evolution. the histogram of opinions was directly translated into a probability density of people choosing to follow or not the directions, modifying their exposure to being infected by the virus. although we exemplify the results with an over-simplified epidemic model (sir), the same protocol can be implemented in more complicated epidemic models. we show that the partial consensus of the social network, although not perfect, induces a significant impact on the infection curve, and that this impact is quantitatively stronger in the network of april 2020. our results can be incorporated into more sophisticated models used to study the evolution of the covid19. epidemic models generally fail to include the accurate effect of the society and its opinions on the propagation of epidemics. we propose here a way to monitor, almost in real time, the mood of the society and, therefore, include it in a dynamic epidemic model that is biased by the population's eagerness to follow the government policies. further analysis of the topology of the social network may also provide insights into how easily the network can be influenced and identify the critical nodes responsible for the collective behavior of the network.
in order to check the statistical accuracy and relevance of our networks, we considered different scenarios with more or fewer subnets (each subnet corresponding to a single hashtag) and estimated the exponent of the scale-free-network fit. this result is illustrated in figure s1a for the october'19 case and in figure s1b for the april'20 case. note that as the number of subnets (hashtags) is increased, the exponent converges. for 1 subnet all the exponents were calculated, while for n subnets just one combination is possible, so no deviation is shown. the distribution of the final states of the variable A for the october'19 network (orange) and the april'20 network (green), when the new opinion is introduced in three different percentages of the total population (r parameter), is shown in figure s2. note that in all cases the results are qualitatively equivalent and, once included in the opinion model, the results are similar. figure s2. distribution of the concentrations for the twitter network from october 2019 (orange) and april 2020 (green) for r = 20% (a), r = 30% (b) and r = 40% (c) of the initial accounts in the state A = 1 with a 10% of noise (ε = 0.0001, k = 0.01, c = 0.0001, A0 = 0.01, N = 20000). figure s3 shows the evolution of the number of infected individuals with time for the epidemic model biased with the opinion model of april 2020. results for different values of the x̄ cutoff are shown. note how for x̄ = 0.2 the peak of infection vanishes and the epidemic dies out due to its lack of ability to spread among the nodes. on the other hand, figure s4 shows, for different values of the cutoff on x̄, the comparison between the three cases presented in the main text (see figure 6): the theoretical scenario where the opinion is fixed at the cutoff value for all the nodes, and the epidemic model biased with the opinions of the october'19 and april'20 scenarios.
see how the difference between the theoretical scenario and the opinion-biased models diminishes with growing values of the cutoff on x̄. finally, figure s5 shows the effect that higher values of the rewiring probability of the watts-strogatz model have on the time evolution of the infected individuals. as shown in the main text, lower values of the rewiring probability have an important impact on the peak of infection, while values above p = 0.3 barely change the statistics of the said peak, or fall within the error of the measurements.
references:
sectoral effects of social distancing
one world, one health: the novel coronavirus covid-19 epidemic
the role of superspreaders in infectious disease
mathematical modeling of covid-19 transmission dynamics with a case study of wuhan
predictability: can the turning point and end of an expanding epidemic be precisely forecast?
epidemic processes on complex networks
nodexl: a free and open network overview, discovery and exploration add-in for excel
evolving centralities in temporal graphs: a twitter network analysis
analyzing temporal dynamics in twitter profiles for personalized recommendations in the social web
emerging topic detection on twitter based on temporal and social terms evaluation
gephi: an open source software for exploring and manipulating networks
recherches mathematiques sur la loi d'accroissement de la population
the coupled logistic map: a simple model for the effects of spatial heterogeneity on population dynamics
logistic map with memory from economic model
turing patterns in network-organized activator-inhibitor systems
mathematical epidemiology of infectious diseases: model building, analysis and interpretation
collective dynamics of small-world networks
on random graphs. publicationes mathematicae
epidemics and percolation in small-world networks
the effects of local spatial structure on epidemiological invasions
properties of highly clustered networks
critical behavior of propagation on small-world networks
metropolis, monte carlo and the maniac
infectious diseases in humans
this research is supported by the spanish ministerio de economía y competitividad and european regional development fund, research grant no. cov20/00617 and rti2018-097063-b-i00 aei/feder, ue; by xunta de galicia, research grant no. 2018-pg082, and the cretus strategic partnership, agrup2015/02, supported by xunta de galicia. all these programs are co-funded by feder (ue). we also acknowledge support from the portuguese foundation for science and technology (fct) within the project n. 147.
2. opinion distributions depending on the initial number of nodes with different opinion. the list of hashtags used to construct both networks is in table 1 for the october'19 case (column on the left) and for the april'20 scenario (right column). all hashtags used were neutral in the sense of political bias or age meaning.
april '20 #eleccionesgenerales28a #cuidaaquientecuida #eldebatedecisivolasexta #estevirusloparamosunidos #pactosarv #quedateconesp #rolandgarros #semanaencasayoigo #niunamenos #quedateencasa #selectividad2019 #superviviente2020 #anuncioeleccions28abril #autonomosabandonados #blindarelplaneta #renta2019 #diamundialdelabicicleta #encasaconsalvame #emergenciaclimatica27s #diamundialdelasalud #cuarentenaextendida #asinonuvigo #ahoratocalucharjuntos #house_party #encasaconsalvame apoyare_a_sanchez pleno_del_congreso viernes_de_dolores key: cord-213974-rtltf11w authors: lensink, keegan; laradji, issam; law, marco; barbano, paolo emilio; nicolaou, savvas; parker, william; haber, eldad title: segmentation of pulmonary opacification in chest ct scans of covid-19 patients date: 2020-07-07 journal: nan doi: nan sha: doc_id: 213974 cord_uid: rtltf11w the severe acute respiratory syndrome coronavirus 2 (sars-cov-2) has rapidly spread into a global pandemic. a form of pneumonia, presenting as opacities within a patient's lungs, is the most common presentation associated with this virus, and great attention has gone into how these changes relate to patient morbidity and mortality. in this work we provide open source models for the segmentation of patterns of pulmonary opacification on chest computed tomography (ct) scans which have been correlated with various stages and severities of infection. we have collected 663 chest ct scans of covid-19 patients from healthcare centers around the world, and created pixel-wise segmentation labels for nearly 25,000 slices that segment 6 different patterns of pulmonary opacification. we provide open source implementations and pre-trained weights for multiple segmentation models trained on our dataset. our best model achieves an opacity intersection-over-union score of 0.76 on our test set, demonstrates successful domain adaptation, and predicts the volume of opacification within 1.7% of expert radiologists.
additionally, we present an analysis of the inter-observer variability inherent to this task, and propose appropriate probabilistic methods. health systems around the world are overwhelmed and facing shortages of the essential equipment necessary to manage the symptoms of this disease. rapid screening is necessary to diagnose the disease and slow the spread, and effective tools are essential for prognostication in order to efficiently allocate resources to those who need them most. while rt-pcr has emerged as the standard screening protocol for covid-19 in many countries, the test has been shown to have high false negative rates due to its relatively low sensitivity, despite high specificity [28]. recent work has shown that the analysis of chest ct scans by trained radiologists increases the diagnostic sensitivity [1]. this is because the virus attacks and inhibits the alveoli of the lung, which fill with fluid in response, causing various forms of opacification within the lung when seen on computed tomography (ct) scans. due to an increase in density, these areas present on ct scans as increased attenuation with preserved bronchial and vascular markings, known as ground glass opacity (ggo). when the accumulation of fluid progresses to obscure bronchial and vascular regions on ct scans, it is known as consolidation. in addition to providing complementary diagnostic properties, the analysis of ct scans has great potential value for the prognostication of patients with covid-19. the percentage of well-aerated lung (wal) has emerged as a predictive metric for determining the prognosis of patients confirmed with covid-19, including admission to the icu and death [6]. the quantification of the percentage of wal is often done by visually estimating the volume of opacification relative to healthy lung, which is a time consuming process, or it can be roughly estimated automatically through attenuation values within the lung.
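the rough attenuation-based estimate described above can be sketched as follows; the -700 hu aeration cutoff and the function name are illustrative assumptions, not values taken from this paper:

```python
import numpy as np

def percent_wal(ct_hu, lung_mask, threshold_hu=-700.0):
    """rough percent well-aerated lung: the fraction of lung voxels whose
    attenuation stays below a cutoff (opacified tissue is denser, so its
    attenuation rises above the threshold)."""
    lung_voxels = ct_hu[lung_mask]
    if lung_voxels.size == 0:
        return 0.0
    return 100.0 * float(np.mean(lung_voxels < threshold_hu))
```

in practice the lung mask would come from a lung segmentation step, as described later in the paper.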
in addition to the percent of wal, which does not account for the various forms of opacification, expert interpretation of ct scans can provide insight into the severity of the infection by identifying various patterns of opacification (see table 1). the prevalence of these patterns, and the severity of the infection, can also correlate with different stages of the disease [15, 24]. therefore, automatic quantification of both the percentage wal and the opacification composition could enable efficient estimation of the stage of the disease, and even a glimpse at the risk for poor outcomes. standard diagnosis requires experienced radiologists and is highly time consuming. thus, there is a need to develop machine learning techniques to deal with the problem in a quantitative way to support the radiological team. the task is difficult for a number of reasons, and there is a limited amount of available work focused on this particular problem. first, privacy restrictions and labelling cost lead to a lack of available public data. as a result, there is only one small public dataset [20] known to us at the time of writing this manuscript. for the same reason, we are unable to publicly release the collected dataset at this time. second, as we show in this work, the segmentation process has a subjective component. indeed, the collected data is heavily influenced by the instructions provided to the annotators and, therefore, it is likely problematic to combine datasets collected under different labelling regimes. third, the cost of acquiring pixel-level segmentation labels is prohibitive. compared to common computer vision tasks such as the annotation of street scenes, this task is non-trivial and requires careful attention by highly skilled expert annotators. in this study, we found that pixel-level annotation required an average of 60 minutes per scan, leading to a total of roughly 660 highly trained person-hours to acquire the dataset.
from a machine learning point of view, the second point deserves special attention. as we show next, the segmentation task is especially difficult due to the subjectivity of the labels and the low inter-class variance. the regions of opacification are, by definition, hazy and without clearly defined borders. as such, there is an expected increased level of inter-observer variability. in addition, the patterns that we wish to differentiate have low inter-class variability. while the patterns described in table 1 are technically independent, they are not mutually exclusive in a given area, and distinguishing these complex patterns is very time consuming and difficult for even expert radiologists. given the challenges mentioned above, the goal of this work is to provide open source models for the segmentation of patterns of pulmonary opacification, which have been correlated with various stages and severities of covid-19 pneumonia. while the development of open source models has been hindered by the lack of publicly available data, we hope that by releasing our open source model and pretrained weights, we can enable healthcare centers and researchers around the world to develop tools for the effective diagnosis and prognostication of covid-19 on ct scans. in addition to our models, we hope to enable researchers by discussing the insights gained from our work into this difficult task, particularly related to the incorporation of uncertainty and the high inter-observer variability between annotators. we provide an open source software implementation 1 , with a training procedure using a small public dataset, and an online visualization tool 2 to easily view predictions on our private dataset. our contributions are as follows: 1.
we have collected 663 chest ct scans of patients with covid-19 pneumonia from healthcare centers around the world, and created pixel-wise segmentation labels for nearly 25,000 slices that segment 6 different forms of pulmonary opacification that have been correlated with stages and severities of covid-19. 2. we provide open source implementations and pretrained weights for multiple segmentation models trained on our dataset, and show that these models adapt to domains that are withheld from the training set. we hope that by making this publicly available we will ease the burden of development on healthcare centers around the world that have limited access to data. in this section we discuss the work that is most relevant for this paper. we start with semantic segmentation for ct scans on general medical problems, followed by semantic segmentation for covid-19. deep learning-based methods have been widely applied in medical image analysis to combat covid-19 [10, 13, 23]. they have been proposed to detect patients infected with covid-19 via radiological imaging. for example, covid-net [22] was proposed to detect covid-19 cases from chest radiography or x-ray images. an anomaly detection model [19] was designed to assist radiologists in analyzing the vast amounts of chest x-ray images. for ct imaging, a location-attention oriented model was employed to calculate the infection probability of covid-19. a weakly-supervised deep learning-based software system was developed in [26] using 3d ct volumes to detect covid-19. although plenty of ai systems have been proposed to provide assistance in covid-19 diagnostics for clinical practice, there are only a few related works [8], and no significant impact of ai on clinical outcomes has been shown as of yet. semantic segmentation for ct scans has been widely used for diagnosing lung diseases.
diagnosis is often based on segmenting different organs and lesions from chest ct slices, which can provide essential information for doctors to identify underlying disease processes. many methods exist that perform lung nodule segmentation. early algorithms used svms to extract features and detect nodule segmentations [13]. later, algorithms based on deep learning emerged [10]. some examples are central focused cnns [23] and gan-synthesized data to improve the training of a discriminative model for pathological lung segmentation. the latest methods are based on two deep networks to segment lung tumors from ct slices by adding multiple residual streams of varying resolutions, and on multi-task learning of joint classification and segmentation. semantic segmentation for covid-19: while covid-19 is a recent phenomenon, several methods have been proposed to analyze infected regions of covid-19 in the lungs. [8] proposed a semi-supervised learning algorithm for automatic covid-19 lung infection segmentation from ct scans. their algorithm leverages attention to enhance representations. similarly, [27] proposed to use spatial and channel attention to enhance representations, and [5] augment unet [17] with resnext [25] blocks and attention to improve its efficacy. although previous methods are accurate, their computational cost can be prohibitive. [3] similarly propose the segmentation of tomographic patterns from chest ct using a large annotated dataset, and compute measures of the severity of infection. we have worked with health centers around the world to retrospectively collect 663 ct scans of patients suspected of having covid-19. the dataset is composed of scans from health centers in canada, italy, south korea, iran, and saudi arabia, where each respective health center's research ethics board approved use of the dataset. the dataset represents a relatively equal distribution of sex, with 321 female scans and 324 male scans.
while all scans had a slice thickness of 1mm, the slice spacing varied depending on the healthcare centers' protocols for storing scans. we collected 112 scans with 1mm spacing, 84 scans with 5mm spacing, and 449 scans with 10mm spacing. ct scans are 3d volumes, where the x, y, and z axes are commonly used to refer to the anterior-posterior, lateral, and distal-proximal axes respectively. all studies were re-sampled prior to our collection to have an x and y resolution of 512 × 512, with a varied number of slices depending on the slice spacing and the length of the distal-proximal axis. studies with 10mm spacing generally have 30-60 slices, whereas studies with 1mm slice spacing have 150-300 slices. general and sub-specialist radiologists (medical doctors trained in medical imaging) determined the most clinically relevant classes for annotation of the data. in order to aid prognosis and diagnosis, we segmented 6 different patterns of pulmonary opacification seen in covid-19 pneumonia, as outlined in table 1. these patterns are composed of ggo, varying degrees of lobular septal thickening (the lobule is an anatomic unit of the lung) and/or consolidation, and are commonly used by radiologists to differentiate between different stages and severities of infection. each tomographic pattern is defined and identified by its distinguishing spatial characteristics as outlined in [9]. the aim of this project is to automate the segmentation of these 6 patterns, as the information can aid radiologists in diagnosis and in evaluating patient prognosis in terms of admission to the icu, the need for mechanical ventilation, and death. in addition to the patterns of pulmonary opacification we also annotated pleural effusion and lymphadenopathy, both of which are non-pulmonary findings that relate to the infection.
the annotation team was composed of a number of practicing staff and resident radiologists at vancouver general hospital, as well as numerous medical students at the university of british columbia. the medical students collected lung segmentation labels, while the radiologists and residents collected segmentation labels for the patterns of opacification. in order to compute the percent wal we first segmented the total lung volumes and then the opacities within the lung. a team of 12 expert radiologists and residents used an online annotation tool to segment the patterns of pulmonary opacification. the radiologists were trained on how to use the software and instructed to annotate every slice in 10mm spaced studies, and a portion of roughly 50 representative slices within the 5mm and 1mm spaced studies. as a result, the thinly spaced studies are partially labelled, as some slices were left unlabelled and some regions contain no labels. the dataset was split at the scan level in order to ensure that all slices from a scan were contained in the same set. due to the relatively small number of scans in the test set (84/663), it is composed of scans that were selected by an expert radiologist in order to ensure a clinically representative sample. these scans were selected such that the test set included a variety of presentations, including healthy lungs, as well as lungs that contained opacities at varying stages of the disease. in order to test the effectiveness of domain adaptation, we withheld all studies from 2 locations, italy and vancouver, for inclusion in the test set. following selection of the test set, the validation and training sets were randomly selected as 20% and 80% of the remaining studies, respectively. the final number of scans in each split from each region is presented in table 2. the labelling process for the test set differed slightly from that of the training set in order to increase the quality of the annotations.
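a scan-level split of the kind described above can be sketched as follows; this is a generic illustration with our own helper names, not the paper's actual splitting code:

```python
import random

def scan_level_split(slices_by_scan, val_frac=0.2, seed=0):
    """split so that all slices from one scan land in the same subset;
    the test scans are assumed to have been removed beforehand."""
    scans = sorted(slices_by_scan)
    random.Random(seed).shuffle(scans)
    n_val = int(round(val_frac * len(scans)))
    val_scans = set(scans[:n_val])
    train = [s for scan in scans[n_val:] for s in slices_by_scan[scan]]
    val = [s for scan in val_scans for s in slices_by_scan[scan]]
    return train, val
```

splitting at the scan level rather than the slice level prevents near-duplicate adjacent slices from leaking between training and validation.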
the instructions for the annotators stayed the same to limit bias; however, we ensured that every slice of each scan in the test set was labelled, in order to compute a more accurate estimate of the ground truth percent wal. in contrast, for the training set we prioritized getting slices labelled from more scans in order to increase the diversity of the training set, so we instructed the annotation team to label fewer slices from each study. this process allowed us to efficiently collect a high quality test set while balancing diversity in the training set. standard computer vision applications and datasets, such as camvid [2], contain very little variance when annotated by different annotators. this is because it is rather trivial to identify simple objects such as trees, houses, etc. however, this is not generally the case when working with medical datasets. for the problem at hand, there is significant inter-observer variability, similar to other studies presented in medical imaging [12], due to the fact that the segmentation task is non-trivial even for an expert. in an attempt to quantify this difference we have performed a study, collecting data from 12 different experts who annotated the same 43 slices, thus allowing us to compare and quantify the inter-observer variability present in our dataset. a visual comparison of the different annotations is presented in figures 2 and 4. qualitatively, we see that while there are differences between the radiologists, they are all generally focused on the same regions of the lung. we observe that there is large variability in the borders of the opacity, and even more so in the type of opacity. as we discuss next, these differences require attention when training a network. to see why traditional metrics may be insufficient to deal with this problem we compute the intersection over union (iou), which is a standard metric in semantic segmentation [7]. the results are presented in the form of a matrix in figure 3.
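the pair-wise iou matrix described above can be reproduced with a sketch like the following; the helper names and mask shapes are illustrative:

```python
import numpy as np

def iou(a, b):
    """intersection-over-union between two boolean masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return float(inter / union) if union else 1.0

def pairwise_iou(masks):
    """symmetric matrix of ious between every pair of annotators'
    masks (rows and columns index the annotators)."""
    n = len(masks)
    m = np.ones((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            m[i, j] = m[j, i] = iou(masks[i], masks[j])
    return m
```

comparing each annotator against the pixel-wise average label, rather than against individual peers, gives the second panel of the comparison.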
figure 2: comparison of the labels generated by each radiologist (columns) for 5 consecutive slices (rows) from the same study. figure 3: inter-observer opacity iou comparisons. figure 3a shows a pair-wise distance matrix between all 12 radiologists, and figure 3b shows the distance between each radiologist and the average prediction. we see that, by a large margin, each radiologist is more similar to the average prediction than to any of their peers. as can be seen in figure 3, the iou of each radiologist relative to their peers is rather poor. an ai system with a similar iou would typically be considered inoperable. the main reason that the iou is an inappropriate measure is that it assumes hard borders, whereas the problem at hand does not present such borders. furthermore, since the decision on the opacification type is highly subjective, results tend to yield even lower ious. thus, in the next section we discuss techniques to deal with the uncertainty in the data. by definition, ggo and consolidation are hazy cloud-like opacities that do not have clear boundaries due to their physical structure; therefore they are better represented by a continuous probability distribution rather than a discrete one. unfortunately, commonly available annotation tools do not make it easy to incorporate uncertainty into the labelling, nor is it time efficient to ask annotators to label in such a manner. in cases such as this, where segmentation is non-trivial and the object does not have a clear border, disagreement between hard labels should not simply be attributed to annotation error. instead, we view each label as a sample from the ground truth, which is an unknown underlying continuous probability distribution. this can be modeled by s_obs = s̄ + n(x, c), where s_obs is the observed segmentation obtained by the radiologist, s̄ is the average segmentation, and n(x, c) is a noise model that depends on the location x, e.g.
how close the region is to the edge of the object, and on the class of the object, c. the noise model is clearly correlated: different classes tend to be more correlated than others, as they present in more similar ways. it is also correlated in space, as pixels that are close to the boundary can depend on each other. while a comprehensive treatment of the problem is beyond the scope of this paper, we have been experimenting with non-parametric noise models as well as simple parametric noise models. in this paper we present a simple parametric approach that uses only the first and second moments of the data, estimated from the uncertainty study. to this end, we assume that each data point (pixel) represents a measurement from a gaussian statistic. we then use the kl-divergence measure to compare the probability obtained by the network to the probability parameterized by the gaussian distribution. this approach takes into consideration the first order statistics present in the data. we are currently collecting more data that compares different annotations of the same slice in order to develop a more comprehensive model that will allow us to train with noise models that are closer to realistic ones. in this section we describe the various deep learning approaches we have taken to solve the problem. although the problem is 3d by nature, in this early phase of development we have focused on developing 2d methods that segment each axial slice independently. while this is clearly suboptimal, as the z axis contains relevant information, we decided to first focus on 2d methods to set a baseline for all future approaches. training 3d models is underway as new data arrives, and results will be presented in the future. a lung window of −1000hu to 350hu is applied to each slice, followed by normalization of the pixel values in each slice using the mean, −653.2hu, and standard deviation, 628.5hu, computed across the entire training set.
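the preprocessing just described (lung window of −1000 hu to 350 hu, then standardization with the training-set statistics) can be sketched as:

```python
import numpy as np

LUNG_WINDOW = (-1000.0, 350.0)         # hu clipping range from the text
TRAIN_MEAN, TRAIN_STD = -653.2, 628.5  # training-set statistics from the text

def preprocess_slice(slice_hu):
    """clip a ct slice to the lung window, then standardize with the
    mean and standard deviation computed over the training set."""
    clipped = np.clip(slice_hu, *LUNG_WINDOW)
    return (clipped - TRAIN_MEAN) / TRAIN_STD
```

clipping first means that extreme values (e.g. air outside the body, bone) do not dominate the normalized range.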
in this initial work, we have grouped the 6 patterns into 3 clinically relevant groupings, as outlined in table 1. these groupings are created in order to increase the inter-class variation while maintaining clinical relevancy. in this initial phase we focus on training three popular 2d segmentation networks, which are a natural first step for this problem and do not require any special care given the wide variety of slice spacings in the dataset. we train a unet [18], a deeplabv3plus with a resnet50 backbone [4], and a pspnet with an inceptionresnetv2 [21] backbone. in order to compare models we select a variety of common metrics. firstly, we use the intersection-over-union to evaluate the accuracy of the segmentation for each class; although, as we have previously demonstrated, this metric does not capture the desired properties, it is commonly used and thus we compute it to compare with the known literature. in addition, we combine the probabilities for each opacity group together to compute the opacity iou, which we use to evaluate the ability of the model to distinguish between healthy lung and opacification. in order to more sensitively determine the accuracy of the model at computing the percent wal, we compute the ratio of the predicted volume of opacification to the ground truth, a metric we call the relative volume (rv). this metric is computed over the entire test set as rv = (t̄p + f̄p) / (t̄p + f̄n), where t̄p, f̄p, and f̄n are quantities integrated over the test set. this metric is much less sensitive than the iou, as it integrates the opacities, yet it is more sensitive than comparing the percent wal directly, because it does not depend on the lung volume, which is often much larger than the volume of opacification. therefore, unlike the iou, even if individual boundaries do not match, the overall relative volume should be correct.
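the rv metric above reduces to the total predicted opacity volume divided by the total ground-truth opacity volume; a minimal sketch (the helper names are ours):

```python
import numpy as np

def relative_volume(pred_masks, gt_masks):
    """rv = (tp + fp) / (tp + fn), with the counts integrated over all
    test-set masks; equal to predicted volume / ground-truth volume."""
    pred = np.asarray(pred_masks, dtype=bool)
    gt = np.asarray(gt_masks, dtype=bool)
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    return float(tp + fp) / float(tp + fn)
```

an rv of 1.0 means the total opacity volume is predicted exactly, even if individual boundaries disagree.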
furthermore, while the iou of an individual structure has no real clinical value, the rv has significant clinical value, and therefore estimating it correctly is a much more desirable goal. in our training we use the adam optimizer [14] with a batch size of 64 to minimize the weighted kl divergence loss for 30 epochs using a learning rate of 10^-1, which is decayed by a factor of 10 every 10 epochs. the loss is weighted using the complement of the probability of each class, computed from the training set. the model selected for evaluation on the test set is chosen using the binary opacity iou on the validation set. we visualize selected outputs of the model in figure 5, where we see that the qualitative results are visually correct. in the first figure we present a successful segmentation of pure ggo. while there are some variations in the border compared to the "ground truth" as estimated by a radiologist, all regions of opacification have been segmented by the model. it is important to note that, unlike other problems in computer vision where the ground truth is correct, it is impossible to know if the radiologist is better than our trained model given one label. further studies are needed in order to obtain a full quantitative assessment of the results. in the second row we see a prediction on a slice where the "ground truth" contains a combination of group 2 and group 3. this presentation is typical where regions of ggo have progressed in severity and inter/intra-lobular lines have formed, which is a defining characteristic of group 3. again, we see some disagreement between the prediction and the "ground truth" in terms of the exact border; however, the model has correctly identified each region of opacification. in qualitative terms, the prediction is equally viable to the ground truth, and the difference between our segmentation and the "ground truth" is similar to that obtained between segmentations by different radiologists.
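the weighted kl-divergence objective used in training can be sketched in numpy as follows; this is a framework-agnostic illustration with our own argument names, not the authors' actual training code:

```python
import numpy as np

def weighted_kl_loss(pred_probs, soft_targets, class_weights, eps=1e-8):
    """mean over pixels of sum_c w_c * t_c * (log t_c - log p_c): the
    kl divergence from soft target distributions to predictions,
    weighted per class (e.g. by the complement of class frequency)."""
    p = np.clip(pred_probs, eps, 1.0)
    t = np.clip(soft_targets, eps, 1.0)
    kl = t * (np.log(t) - np.log(p))  # shape (..., n_classes)
    return float(np.mean(np.sum(kl * class_weights, axis=-1)))
```

with soft (probabilistic) targets, the loss is zero only when the predicted distribution matches the target distribution, not merely when the argmax agrees.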
in the third row we show the effects of partial volume artifacts, which we consider to be the most common cause of false positives. in this case we see the effect of the diaphragm causing an opaque region in the anterior portion of the right lung; however, it is also common to see partial volume artifacts at the apex of the lungs. seeing as a radiologist would rule this out by viewing adjacent slices and recognizing the start of the diaphragm, a potential solution would be to include information from additional slices, such as in 3d approaches. in the bottom row we see the successful segmentation of consolidation in the posterior basal segment of the right lower lung lobe. we would like to highlight the region of the opacification that has been predicted to be group 3 (purple), but was labelled as group 4 (brown). in consultation with a second radiologist we have confirmed that the prediction can be seen as more correct than the label. the likely cause of this was previously discussed, where we show that more specific options are needed by the annotators. in this case, since the region of opacification was mainly group 4, it is likely that the annotator simply labelled the entire region as such. in order to compare the different deep learning models, quantitative results are presented in table 3. the unet proved to be the best segmentation model in terms of opacity iou, while the other models performed similarly, yet slightly worse. no single model stands out as having the best miou over the three opacity groups, and all scores are relatively low. given the relatively high opacity iou, this could be attributed to significant confusion between opacity groups. despite varied iou scores, the models all predicted clinically relevant relative volumes of opacity, with the unet and pspnet over-predicting by 1.7% and 4.4% respectively, and the deeplabv3plus under-predicting by 3.3%.
(d) in this example we see that the model is more specific than our annotator, and was able to differentiate a region of crazy paving from a larger region of consolidation. figure 5: predictions from the unet model on four slices from the test set. the first column shows the ct scan, the second column shows the ground truth annotation, and the last column shows the model predictions. despite the challenging task of segmenting pulmonary opacification on ct scans, these initial promising results already indicate that deep learning approaches can provide value in clinical situations. all models were able to achieve relative volume ratios within ±5% of the ground truth, enabling clinically relevant automatic estimation of the percent wal. we show qualitatively that our model is able to determine the pattern of opacification, which could provide timely and valuable clinical information to healthcare centers, as these patterns are associated with the severity of the infection. we do not see the large differences between model architectures observed in [8], suggesting that the quantity and diversity of data are more important than the network architecture, as proposed in [11]. success on the test set, which contains scans from regions that were held out of the training set, shows that the models are able to successfully transfer knowledge across domains. this is an important finding, as the sharing of pre-trained models may allow the community to circumvent the lack of publicly available data due to privacy restrictions. the best model achieves an opacity iou of 0.758 on the test set, which we note is slightly higher than the human-level performance we found when comparing each radiologist to the average prediction in section 2. given the analysis of the inter-observer variability, we believe these models have approached the upper bound of the accuracy for our labels, without including higher order statistics or better data.
thus, given a test set of similarly collected labels, we are not able to reliably evaluate quantitative performance using simple statistics beyond this point. we see that comparisons of qualitatively ideal annotations from expert annotators yield a wide range of iou scores. while it is noteworthy that the model's opacity iou is higher than the values we see between radiologists and the average label, because the human-level analysis comes from only one scan we do not believe it is possible to determine whether the model has surpassed human ability yet. the relatively poor quantitative performance in differentiating between patterns of opacification is likely due to a number of factors. firstly, in section 2 we see large disagreements between experts on the specific pattern of opacification in a given region, as shown in figure 4. while it is possible that annotator error is a factor, we believe that this could largely be rectified through an improved annotation procedure. internal patterns within the segmented regions can mix together like a mosaic of many patterns in a single segmented region, and therefore should not be characterized by a single number. a more appropriate description would be given by a partial volume or probability; however, creating such labels is difficult with the software tools commonly used for annotation. thus, currently, our labels do not always account for the variations in the type of pattern across regions, making the labels inherently noisy, which affects training and evaluation of the model. in future work we plan to adjust the labelling procedure to provide annotators with more specific tools and instructions, such that we are able to obtain more accurate labels. lastly, even with the groupings described in table 1, there is low inter-class variability between the patterns of opacification, meaning that even given excellent annotations the segmentation task is likely challenging.
this application demonstrates the need to further develop techniques that allow the use of noisy labels with a non-trivial noise model. a simple example is the use of covariance information, which is non-trivial to incorporate when using stochastic gradient descent. in this paper we have described our collection and annotation of ct scans of patients with covid-19 pneumonia for the segmentation of patterns of pulmonary opacification. we provide results using three popular 2d segmentation networks, showing that we are able to accurately compute the relative volume of opacity and estimate the composition of three clinically relevant pattern groups that have been linked to patient outcome. we show that our models are able to adapt to new domains by withholding two regions from training, which provides valuable insight into the possibility of a community-driven approach that is not hindered by the lack of publicly available data or restrictive privacy policies. we provide an analysis of the inter-observer variability that is present in this non-trivial task, and conclude that due to the physical structure of opacification and the low inter-class variability, improved success in this task requires the adoption of soft-labelling techniques and probabilistic models. to this end, we propose improved annotation procedures and noise-modelling techniques that allow for future work using continuous probability distributions as opposed to the more common discrete distributions used in computer vision.
references.
- correlation of chest ct and rt-pcr testing in coronavirus disease 2019 (covid-19) in china: a report of 1014 cases
- semantic object classes in video: a high-definition ground truth database. pattern recognition letters, xx(x):xx-xx
- quantification of tomographic patterns associated with covid-19 from chest ct
- rethinking atrous convolution for semantic image segmentation
- residual attention u-net for automated multi-class segmentation of covid-19 chest ct images
- well-aerated lung on admitting chest ct to predict adverse outcome in covid-19 pneumonia
- the pascal visual object classes (voc) challenge. ijcv
- inf-net: automatic covid-19 lung infection segmentation from ct images
- fleischner society: glossary of terms for thoracic imaging
- deep learning techniques for medical image segmentation: achievements and challenges
- automatic lung segmentation in routine imaging is a data diversity problem, not a methodology problem
- deep learning with noisy labels: exploring techniques and remedies in medical image analysis
- lung nodule segmentation and recognition using svm classifier and active contour modeling: a complete intelligent system
- adam: a method for stochastic optimization
- coronavirus disease (covid-19): spectrum of ct findings and temporal progression of the disease
- who coronavirus disease dashboard
- u-net: convolutional networks for biomedical image segmentation
- unsupervised anomaly detection with generative adversarial networks to guide marker discovery
- covid-19 ct segmentation dataset
- inception-v4, inception-resnet and the impact of residual connections on learning
- covid-net: a tailored deep convolutional neural network design for detection of covid-19 cases from chest x-ray images
- central focused convolutional neural networks: developing a data-driven model for lung nodule segmentation
- temporal changes of ct findings in 90 patients with covid-19 pneumonia: a longitudinal study
- aggregated residual transformations for deep neural networks
- deep learning-based detection for covid-19 from chest ct using weak label. medrxiv
- an automatic covid-19 ct segmentation based on u-net with attention mechanism
- coronavirus disease 2019 (covid-19): a perspective from china
acknowledgments. we thank brian lee and duncan ferguson for their support in collecting and creating the dataset, and vancouver coastal health research institute and the doctors, medical students, and healthcare centers who contributed to the creation of the dataset. for a comprehensive list please see our acknowledgments page. k.l. and e.h. are supported by the natural sciences and engineering research council of canada (nserc).
key: cord-004416-qw6tusd2 authors: krishna, smriti m.; omer, safraz mohamed; li, jiaze; morton, susan k.; jose, roby j.; golledge, jonathan title: development of a two-stage limb ischemia model to better simulate human peripheral artery disease date: 2020-02-26 journal: sci rep doi: 10.1038/s41598-020-60352-4 sha: doc_id: 4416 cord_uid: qw6tusd2
peripheral arterial disease (pad) develops due to the narrowing or blockage of arteries supplying blood to the lower limbs. surgical and endovascular interventions are the main treatments for advanced pad, but alternative and adjunctive medical therapies are needed. currently the main preclinical experimental model employed in pad research is based on induction of acute hind limb ischemia (hli) by a 1-stage procedure. since there are concerns regarding the ability to translate findings from this animal model to patients, we aimed to develop a novel clinically relevant animal model of pad. hli was induced in male apolipoprotein e deficient (apoe(−/−)) mice by a 2-stage procedure of initial gradual femoral artery occlusion by ameroid constrictors for 14 days and subsequent excision of the femoral artery. this 2-stage hli model was compared to the classical 1-stage hli model and sham controls. ischemia severity was assessed using laser doppler perfusion imaging (ldpi).
ambulatory ability was assessed using an open field test, a treadmill test and established scoring scales. molecular markers of angiogenesis and shear stress were assessed within gastrocnemius muscle tissue samples using quantitative polymerase chain reaction. hli was more severe in mice receiving the 2-stage compared to the 1-stage ischemia induction procedure as assessed by ldpi (p = 0.014), and this was reflected in a higher ischemic score (p = 0.004) and lower average distance travelled on a treadmill test (p = 0.045). mice undergoing the 2-stage hli also had lower expression of angiogenesis markers (vascular endothelial growth factor, p = 0.004; vascular endothelial growth factor receptor 2, p = 0.008) and of the shear stress response mechano-transducer transient receptor potential vanilloid 4 (p = 0.041) within gastrocnemius muscle samples, compared to animals having the 1-stage hli procedure. mice subjected to the 2-stage hli receiving an exercise program showed significantly greater improvement in their ambulatory ability on a treadmill test than a sedentary control group. this study describes a novel model of hli which leads to more severe and sustained ischemia than the conventionally used model. exercise therapy, which has established efficacy in pad patients, was also effective in this new model. this new model may be useful in the evaluation of potential novel pad therapies. major amputation, renal failure and death) and poor long-term durability [5][6][7][8]. there is great interest in developing novel medical therapies for the leg symptoms of pad. recent efforts have focused on stimulating the development of new blood vessels within the leg through angiogenesis or by encouraging the remodelling of existing small vessels into improved collateral channels (arteriogenesis).
promising results for novel treatments, such as viral vectors carrying angiogenesis-promoting agents and stem cells, in pre-clinical models of pad have not been consistently replicated in large clinical trials 9-12. in patients who have pad, atherosclerosis-associated arterial narrowing develops gradually over many years, allowing the legs to adjust to the gradual decrease in blood flow through compensatory mechanisms within the blood vessels and muscle fibres 13. in contrast, the most commonly used animal model for initial testing of novel therapies for pad is a model of acute blood supply interruption through ligation or excision of the femoral artery (referred to here as the 1-stage hind limb ischemia (hli) model) 14,15. previous studies report that the ligation and excision of the femoral artery in the 1-stage model lead to increased fluid shear stress within the limb collateral arteries, resulting in altered gene expression patterns through shear stress responsive elements which promote arterio- and angiogenesis 16-18. hind limb blood supply in this 1-stage model therefore usually naturally recovers over a period of approximately 4 weeks 15. this model does not therefore simulate the clinical presentation of pad. patients typically present with a history of acute exacerbation of chronic symptoms of leg pain on walking and have ongoing ischemic symptoms. the 1-stage hli model may therefore not be an ideal model to study therapeutic angiogenesis and arteriogenesis 14,19. another approach to inducing hli is the placement of an ameroid constrictor around the femoral artery to induce gradual occlusion 16,20,21. previous studies suggest that this approach leads to mild ischemia 20 and that blood flow recovery occurs within 2-3 weeks 16,21. furthermore, previous pre-clinical pad research has mainly focused on assessing hind limb blood supply, with limited assessment of ambulatory ability 14,15.
on the other hand, the assessment of novel treatments in pad patients usually involves measures of walking ability using treadmill or corridor walking tests 22,23. there is therefore a need for an increased focus on functional tests of the limb within clinically relevant rodent models. we hypothesised that the limb ischemia produced by the current 1-stage hli model would be more severe and sustained if the model was modified to include an initial, more slowly progressive arterial narrowing over 14 days prior to the induction of acute ischemia (i.e. a 2-stage model). our overall aim was to develop a more clinically relevant rodent model that incorporates stable ongoing limb ischemia in order to test therapeutic interventions. mice. male apolipoprotein e deficient (apoe −/−) mice (n = 118, obtained from animal resources centre, western australia) were used for the experiments. mice were housed in a temperature-controlled room (21 ± 1 °c) with an automatic 12:12-h light/dark cycle (07:00 to 19:00 hours). mice were singly housed in a clear individually-ventilated, temperature- and humidity-controlled cage system (aero ivc green line; tecniplast) with enrichment. all experiments were performed during the light phase (10:00-18:00 hours) and mice were fed with standard rodent chow and water ad libitum during the course of these experiments. approval for the animal studies was obtained from the institutional ethics committee (animal ethics committee, james cook university) and experimental work was performed in accordance with the institutional and ethical guidelines of james cook university, australia, conforming to the guide for the care and use of laboratory animals (national institutes of health, usa). hli models. the first phase of the study utilised two hli models: the most commonly used unilateral acute hli model (1-stage hli) 16,24,25 and the newly developed 2-stage hli model.
male apoe −/− mice aged 20 months were randomly divided into 4 groups as follows: group 1 = 1-stage hli model (n = 25), group 2 = 1-stage sham (n = 19), group 3 = 2-stage hli model (n = 25) and group 4 = 2-stage sham (n = 19). body weight and primary outcome measures were recorded at regular intervals as illustrated in fig. 1a. all functional assessments were performed in a subset of mice randomly selected from each experimental group (n = 5-7). the creation of the 1-stage hli model involved exposure of the left femoral artery through a vertical 0.5-1 cm skin incision under a stereotactic microscope (leica). the femoral artery and its side branches were then ligated with 6-0 silk sutures (ethicon) immediately distal to the inguinal ligament and proximal to the popliteal bifurcation before being excised (supplementary fig. s1a,b). femoral nerves were carefully preserved. the wound was irrigated with sterile saline and then the overlying skin was closed using 4-0 vicryl sutures (ethicon). post-operative pain was reduced using lignocaine (troy laboratories). a similar surgery without ligation or excision of the femoral artery was performed on the 1-stage sham controls. the 2-stage hli model was created using a 2-stage surgical procedure. the left femoral artery was exposed as described above and two custom-made miniature ameroid constrictors of 0.25 mm internal diameter (research instruments sw) were positioned on the artery. one was placed on the femoral artery immediately distal to the inguinal ligament and one was positioned proximal to the sapheno-popliteal bifurcation (supplementary fig. s1c). after 14 days, a new incision was made and the femoral artery ligated and excised, as described for the 1-stage hli model. a similar two-stage surgery was performed without placement of ameroids or ligation and excision of the femoral artery for the 2-stage sham controls. assessment of the effect of exercise training in the 2-stage hli model.
during the second phase of the study, the effect of an exercise program on male apoe −/− mice aged 6 months undergoing the 2-stage hli model (n = 30) was tested. mice subjected to 2-stage hli were randomly allocated to an exercise training or control group (n = 15 per group). mice in the exercise group received 180 to 200 m (between 30-45 mins of running wheel access) of exercise each day on a running wheel (8 station home cage running [...]). [figure 1 legend, continued: (d) the movement of animals was monitored and functional scores of the various groups were assessed according to the scoring criteria detailed in the materials and methods section. the 2-stage hli model showed reduced function throughout the experimental period compared to the 1-stage hli model. data shown as mean ± sem and analysed by repeated measures 2-way anova, with p value significant at ≤0.05. (e) graph showing modified ischemia scores. the animals were monitored and scored for signs of ischemia according to previously published criteria detailed in the materials and methods. the 2-stage hli model showed a higher level of ischemia compared to the 1-stage hli, depicted by lower scores throughout the study period. data shown as median ± sem and analysed by repeated measures 2-way anova, with p value significance set at ≤0.05.] previous work suggested that the hli typically resolved over the course of 28 days, with hind limb blood supply reaching a plateau between 21 and 28 days after surgery 28. the ldpi measurements were therefore performed at the following time points for the 1-stage hli model: day 0 prior to surgery, immediately after surgery, and days 3, 7, 14, 21 and 28 after surgery.
for the 2-stage hli model, the ldpi measurements were performed at the following time points: day 0 prior to the first operation (ameroid placement), immediately after the first operation, days 3, 7, and 14 after the first operation, immediately after the second operation (femoral artery excision), and 7, 14, 21 and 28 days after the second operation (i.e. 14, 21, 28, 35 and 42 days after the first operation). a schematic illustration of the experimental design is shown in fig. 1a. body mass was also measured on the same day as the ldpi measurements. in clinical practice pad patients are treated to improve pain-free walking capacity and to resolve rest pain and tissue loss (critical limb ischemia, cli). in clinical trials, these are usually investigated by walking tests and assessment of pain. similar to clinical trials, in the current study ambulatory ability was assessed with a treadmill test, voluntary physical activity was examined through an open field test and foot pain was estimated through a mechanical allodynia test. all outcomes were assessed by an assessor blinded to group allocation. semi-quantitative assessments of limb function and ischemia were performed at the same time points as the blood flow measurements (fig. 1a; supplementary tables s1, s2). limb function was assessed using the clinical use score (tarlov scale) as: 0 = no movement; 1 = barely perceptible movement, no weight bearing; 2 = frequent and vigorous movement, no weight bearing; 3 = supports weight, may take 1 or 2 steps; 4 = walks with only mild deficit; 5 = normal but slow walking; and 6 = full and fast walking 29,30. limb ischemia was scored using the ischemia scoring scale as previously reported: 0 = auto-amputation of leg; 1 = leg necrosis; 2 = foot necrosis; 3 = two or more toe discolorations; 4 = one toe discoloration; 5 = two or more nail discolorations; 6 = one nail discoloration; and 7 = no necrosis 30. all scoring was performed by two independent observers and found to be identical.
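the two ordinal scales above can be encoded directly; the sketch below is an illustrative helper (the dictionaries quote the published criteria, the names are ours):

```python
# tarlov-style clinical use score (0-6) and ischemia score (0-7), as quoted above
TARLOV_SCALE = {
    0: "no movement",
    1: "barely perceptible movement, no weight bearing",
    2: "frequent and vigorous movement, no weight bearing",
    3: "supports weight, may take 1 or 2 steps",
    4: "walks with only mild deficit",
    5: "normal but slow walking",
    6: "full and fast walking",
}
ISCHEMIA_SCALE = {
    0: "auto-amputation of leg",
    1: "leg necrosis",
    2: "foot necrosis",
    3: "two or more toe discolorations",
    4: "one toe discoloration",
    5: "two or more nail discolorations",
    6: "one nail discoloration",
    7: "no necrosis",
}

def describe(score: int, scale: dict) -> str:
    """map an observer's ordinal score to its published description."""
    if score not in scale:
        raise ValueError(f"score {score} is outside the scale")
    return scale[score]
```

note that on the ischemia scale a lower number means worse ischemia, so group comparisons of these scores must be interpreted accordingly.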
treadmill test. mice were run on a six lane exer-3/6 treadmill (columbus instruments) without incline. mice were acclimatised to the treadmill by ambulating on it at 5 m/min for 5 min once daily on three consecutive days prior to any testing. before each treadmill test, mice were fasted for 1 h. the speed of the treadmill was controlled using the software and calibrated using an inbuilt speedometer mounted on the treadmill platform. a treadmill test involved an initial warm-up at 5 m/min for 5 min followed by a progressive speed increase from 5 to 10 m/min in increments of 1 m/min. following this the treadmill speed remained at 10 m/min for up to a total running time of 20 min. during the test a stimulus grid of 3 hz was kept on until mouse exhaustion, as previously reported 31. exhaustion was defined as the mouse returning to the stimulus grid 10 times despite the 3 hz electrical stimulus to encourage walking on the belt. the treadmill software recorded the total distance walked by a mouse until exhaustion. a blinded observer supervised the experiment to assess outcomes. the treadmill belt and lanes were cleaned with water and 70% alcohol and dried with paper towel after each test to remove any body scent. treadmill testing was carried out before ameroid placement, 10 days after ameroid placement, and 1 and 3 weeks after completion of the 2-stage hli (fig. 1a). for the 1-stage hli model, treadmill testing was performed before ligation and excision of the femoral artery and 1 and 3 weeks after ischemia induction. voluntary physical activity test. the open field test is a common measure of voluntary physical activity in rodents, suggested to be similar to the 6-min walk test used in humans 32. to ensure consistency prior to the test, mice were brought to the testing room in their home cages at least 1 hr prior to the start of behavioural testing. the mice were fasted during the acclimatisation period and given free access to water under normal lighting.
the open field box was made of opaque plastic (40 × 40 × 40 cm), divided into an outer field (periphery) and a central field (20 × 20 cm) which was primarily used for analysis. mice were individually placed in the centre of the arena and movements of the mice were recorded using a video camera (logitech) supported by acquisition software (capture star ver. 1; cleversys inc) and analysed by the topscan lite software (high throughput version 2.0; cleversys inc). the test protocol used was identical for each mouse assessed. after each test the open field box was cleaned with water and 70% alcohol and dried with a paper towel to remove the body scent, which could be a cue to movement of the mice. room lighting, temperature, and noise levels were kept consistent for all tests. the mouse movements were recorded for 20 min, to mimic the short, timed nature of the 6-min walk test. rest time was recorded as a motion measure score <0.05 in the software, and average speed was calculated only for motion measure scores >0.05. total distance travelled (m), frequency of movement, time spent in the arena (s) and average velocity in the arena (mm/s) were measured. mechanical allodynia test. the paw pressure transducer and the pressure application measurement device (pam; ugo basile) comprise a non-invasive tool for measuring mechanical allodynia threshold and hypersensitivity in rodents. the pam device allows an accurate measurement of primary mechanical hypersensitivity in rodents 33,34. a gradually increasing squeeze force is applied across the joint at a rate of approximately 300 g/sec until a behavioural response (paw withdrawal, freezing of whisker movement, wriggling or vocalization) is observed, with a cut-off of 5 sec. the peak gram force (gf) applied immediately prior to limb withdrawal was recorded by the base unit, and this value was designated the limb withdrawal threshold (lwt).
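the open-field measures described above (total distance, time in motion, and average velocity, with rest defined as a motion score below 0.05) can be sketched from a tracked trajectory; this is an illustrative reimplementation, not the topscan software:

```python
import numpy as np

def open_field_metrics(xy_mm: np.ndarray, motion: np.ndarray, fps: float):
    """xy_mm: (n, 2) tracked positions in mm; motion: (n,) motion scores."""
    step = np.linalg.norm(np.diff(xy_mm, axis=0), axis=1)  # mm moved per frame
    total_distance_m = step.sum() / 1000.0
    moving = motion[1:] >= 0.05            # rest = motion score below 0.05
    time_in_motion_s = moving.sum() / fps
    # average velocity is computed over moving frames only, as in the text
    avg_velocity_mm_s = step[moving].mean() * fps if moving.any() else 0.0
    return total_distance_m, time_in_motion_s, avg_velocity_mm_s

# toy trajectory: 5 frames at 10 fps, 10 mm per frame, always in motion
xy = np.array([[0, 0], [10, 0], [20, 0], [30, 0], [40, 0]], dtype=float)
motion = np.ones(5)
dist_m, t_motion_s, vel = open_field_metrics(xy, motion, fps=10.0)
# dist_m == 0.04 m, t_motion_s == 0.4 s, vel == 100.0 mm/s
```

the frame rate and the 0.05 motion threshold are the only tuning points; everything else follows from the tracked positions.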
lwt was measured twice in both the ipsilateral and contralateral limbs by two independent observers. the measurements were averaged and presented as a ratio of the operated left limb to the un-operated right limb. blood tests. blood was collected by cardiac puncture at the completion of the experiments. platelet-poor plasma was separated as described previously 35,36. the plasma concentrations of interleukin (il)-6, interferon (ifn)-γ, monocyte chemoattractant protein-1 (mcp-1) and tumour necrosis factor (tnf)-α were determined using a cytometric bead array kit (cba, bd biosciences). the inflammatory markers were assessed in samples (n = 10/group) selected from each group using a random number generator. briefly, 50 μl of mixed capture beads, 50 μl of serially diluted standard or plasma sample and 50 μl of phycoerythrin (pe) detection reagent were incubated in the dark for 2 h in sample assay tubes. samples were then washed twice with 1 ml of the wash buffer, resuspended, and acquired on the cyan adp flow cytometer (beckman coulter). results were analysed and quantified by fcap array™ software (v3, bd biosciences). we previously reported this method to have good reproducibility, with an inter-assay coefficient of variation of 6-9% (n = 8-10) 36. total nitrate was measured in plasma samples by a nitrate/nitrite colorimetric assay kit following the manufacturer's protocol (inter-assay coefficient of variation 2.7%; cayman chemicals) as reported previously 37. briefly, nitrate was converted to nitrite using nitrate reductase. subsequently, addition of the griess reagents converted nitrite into a deep purple azo compound and the absorbance was measured at 540 nm using an omega plate reader. histological assessments.
low capillary density has been reported in the gastrocnemius muscle of pad patients and animal hli models and is associated with functional impairment 38,39. hence at the end of experiments gastrocnemius muscle samples were collected from the mice and stored in optimal cutting compound (oct, proscitech), which was progressively frozen in isopentane (sigma) suspended in liquid nitrogen. sections (5 µm-thick) were obtained on poly-l-lysine coated slides (proscitech) from each sample with muscle fibres oriented in the transverse direction. all histological assessments were performed on sections that were examined in a blinded fashion at 40x magnification. muscle fibre structure. fixed cryostat sections were stained with hematoxylin and eosin (h&e, proscitech) and examined at magnifications of 100x or 400x to assess the integrity of the tissues. degenerating muscle fibres were identified in the h&e stained sections by morphological assessment. assessors looked for the presence of mature skeletal muscle fibres (small peripheral nuclei) versus immature skeletal muscle myoblasts (large lobulated central nuclei) 40. muscle fibre number and size were examined in 5 separate fields in 4 distinct areas in each specimen. muscle fibrosis. the extent of skeletal muscle fibrosis was assessed by staining the cryostat sections (5 µm-thick) with picrosirius red (proscitech). briefly, tissue sections were stained and examined under a light microscope at 40x magnification and skeletal muscle fibrosis was analysed using image analysis software (zeiss axio imager z1). fibrosis was quantified as the percentage of fibrotic tissue present within the section (1 mm² tissue area) using a previously published protocol 40. immunohistochemistry and morphometric analysis of capillary and arteriolar density. these were performed as previously reported 41,42.
the gastrocnemius muscles from ischaemic and non-ischaemic hind-limbs were collected and embedded in oct compound (proscitech), frozen, and cut into 5 µm-thick sections. the slides were fixed at −20 °c in 95% ethanol for 1 hr. slides were washed three times in cold pbs with 1% horse serum (5 min/wash) and blocked overnight with 5% horse serum in pbs at 4 °c. immunohistochemistry was performed using primary antibodies against cd31 (1:100 dilution; abcam) and smooth muscle α-actin (α-sma, 1:200 dilution; abcam). bound primary antibodies were detected using appropriate secondary antibodies (biotinylated anti-goat igg and biotinylated anti-rat igg, all at 1:100 dilutions, vector labs) with avidin-biotin-peroxidase (vector labs) as described previously 43. pictures from four random areas of each section and three sections per mouse were taken using a digital camera (nikon eclipse sci epifluorescence microscope, nikon corporation) at 40× magnification. capillary and arteriolar densities were quantified by measuring the percentage of cd31 and α-sma staining out of the total area, as previously described. western blotting. gastrocnemius muscles were mainly harvested at the end of studies (i.e. 4 weeks after full ischemia induction). gastrocnemius muscles were also harvested from a subset of mice (n = 15) subjected to the 2-stage hli (n = 5/time-point) prior to and days 3 and 7 after ameroid placement. samples were frozen in liquid nitrogen and stored in oct compound (proscitech) at −80 °c. tissues were pulverised in ripa buffer (150 mm sodium chloride, 1.0% np-40 or triton x-100, 0.5% sodium deoxycholate, 0.1% sodium dodecyl sulfate, 50 mm tris, ph 8.0) with protease inhibitors (roche diagnostics, australia) and phostop (roche diagnostics, australia) to extract proteins, which were quantified using the bradford protein assay kit (biorad, usa). samples (25 μg of protein/lane) were loaded onto a 10% sds-polyacrylamide electrophoresis gel.
after electrophoresis (110 v, 90 min), the separated proteins were transferred (15 ma, 60 min) to a polyvinylidene difluoride membrane (biorad, usa). non-specific sites were blocked with 5% non-fat dry milk for 60 min, and the blots were then incubated with the following antibodies: anti-vascular endothelial growth factor (anti-vegf) [...]. mrna analysis by quantitative real-time pcr. at the end of the study gastrocnemius muscle samples were harvested, placed in rnalater (qiagen) and stored at −80 °c. samples (n = 10/group) were selected from each group using a random number generator and were processed for gene expression analysis. total rna was isolated using an rneasy mini kit (qiagen) according to the manufacturer's instructions and quantified spectrophotometrically using a nanodrop 2000. rna samples (20 ng) were subjected to quantitative real-time pcr (qrt-pcr) analysis of genes of interest using the quantitect sybr green one-step rt-pcr assay (qiagen). qrt-pcr was performed using primers for mouse vegf-r1 (ppm03066f), vegf-r2 (ppm03057a), trpv4 (ppm36070a), klf4 (ppm25088b) and gapdh (qt01658692). the relative expression of these genes was calculated using the concentration-ct standard curve method and normalized using the average expression of mouse gapdh for each sample with the rotor-gene q operating software (version 2.0.24), as previously reported 44,45. statistical analyses. all data were tested for normality using the d'agostino-pearson normality test. data with normal distribution were expressed as mean ± standard error of mean (sem) and analysed using parametric tests. non-normally distributed data were expressed as median and interquartile ranges (iqr) and analysed using non-parametric tests.
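the summary-statistic rule just described (mean ± sem for normal data, median with iqr otherwise) can be sketched as follows; the normality decision itself (d'agostino-pearson) is taken here as an input flag rather than recomputed:

```python
import numpy as np

def summarize(values, normal: bool):
    """return ('mean±sem', mean, sem) for normal data,
    ('median[iqr]', median, (q1, q3)) otherwise."""
    v = np.asarray(values, dtype=float)
    if normal:
        sem = v.std(ddof=1) / np.sqrt(len(v))  # standard error of the mean
        return ("mean±sem", v.mean(), sem)
    q1, med, q3 = np.percentile(v, [25, 50, 75])
    return ("median[iqr]", med, (q1, q3))

# hypothetical measurements from one group of 8 mice
kind, center, spread = summarize([2, 4, 4, 4, 5, 5, 7, 9], normal=True)
# center == 5.0; spread == 2/sqrt(7) ≈ 0.756
```

using the sample standard deviation (ddof=1) divided by the square root of n matches the usual sem definition reported alongside group means.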
statistical significance was determined using the unpaired student t test for comparison between two groups, or analysis of variance followed by student-newman-keuls post-hoc analysis for comparison between multiple groups. comparisons of the time course of ldpi indices, clinical scores, open field tests and treadmill exercise tests were performed by 2-way anova for repeated measures followed by bonferroni post-hoc analysis, or by a linear mixed-effects method using rstudio software. differences in the clinical ischemia score were determined by fisher's exact test. analyses were performed using prism (graphpad software, san diego, ca) or r software. a p value of ≤0.05 was considered to be statistically significant. mice undergoing 2-stage hli had more severe ischemia than those undergoing 1-stage hli. immediately after femoral artery excision, limb perfusion assessed by ldpi was similarly reduced in both hli models by approximately 70%. mice subjected to the 1-stage hli had more rapid recovery of hind limb perfusion than those subjected to the 2-stage procedure (p = 0.014, fig. 1b,c, supplementary fig. s2). by 28 days after ischemia induction, limb perfusion was similar in mice subjected to the 1-stage procedure and sham controls (p = 0.510) but still reduced in mice subjected to the 2-stage procedure by comparison to sham controls (p < 0.001). there was no change in overall body mass after ischemia induction (supplementary fig. s3). mice subjected to 2-stage hli showed more severely impaired hind limb use than those undergoing 1-stage hli. after ischemia induction, mice subjected to both methods of hli developed limb oedema, paleness of skin and occasional muscle necrosis. mice in all experimental groups exhibited a severe functional deficit after surgery (fig. 1d). functional score was significantly worse in mice subjected to the 2-stage hli than those subjected to the 1-stage hli (p = 0.014).
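for reference, a two-group comparison like those reported here can be sketched as an unpaired student t statistic with pooled variance (the numbers below are hypothetical; the paper used prism/r, and the p value additionally requires the t distribution):

```python
import numpy as np

def unpaired_t(a, b):
    """unpaired student t statistic (pooled variance) and degrees of freedom."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    na, nb = len(a), len(b)
    # pooled variance across the two groups
    sp2 = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    t = (a.mean() - b.mean()) / np.sqrt(sp2 * (1.0 / na + 1.0 / nb))
    return t, na + nb - 2

# hypothetical treadmill distances (m) for two groups of 5 mice
group_a = [520, 560, 540, 580, 600]
group_b = [300, 340, 320, 360, 280]
t_stat, df = unpaired_t(group_a, group_b)
# t_stat == 12.0, df == 8
```

the pooled-variance form assumes similar variances in the two groups; with unequal variances a welch correction would be the usual substitute.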
both hli models showed increased ischemia compared to their respective shams. there were no cases of auto-amputation or foot or limb necrosis (supplementary fig. s4). mice subjected to the 2-stage hli showed a significantly worse ischemic score compared to those subjected to the 1-stage hli (repeated measures 2-way anova, p = 0.004, fig. 1e). mice subjected to 2-stage hli had reduced treadmill performance. mice subjected to the 1-stage hli showed no significant difference in total distance travelled during the study period on a treadmill test when compared to shams (repeated measures 2-way anova, p = 0.818; fig. 2a). after the first procedure of the 2-stage model (ameroid placement), the treadmill ambulatory performance of mice was not significantly affected (supplementary fig. s5a). in contrast, mice subjected to 2-stage hli had a significant reduction in total distance travelled on the treadmill compared to their sham controls (p = 0.050) and mice subjected to 1-stage hli (p = 0.045; fig. 2a). [...] (supplementary fig. s5b-d). this reduction in physical activity was maintained after completing the 2-stage hli by comparison to sham controls and also mice subjected to 1-stage hli (fig. 2b). the reduction in physical activity of mice subjected to 2-stage hli was reflected in less distance travelled, less total time in motion and lower velocity of movement compared to sham controls (fig. 2c-e). when compared to the 1-stage hli, the 2-stage hli model showed a reduction in the total distance travelled in the open field arena (linear mixed-effects model, p = 0.036). mice subjected to hli had enhanced mechanical allodynia. mice subjected to both 1-stage and 2-stage hli showed significantly increased sensitivity to pressure compared to their respective sham controls, and there was no significant difference in pressure sensitivity between the two models (fig. 2f). hli induces systemic inflammation.
the plasma concentrations of the cytokines assessed were below the detectable ranges in both sham control groups (table 1). mice subjected to hli, irrespective of model, had plasma cytokine concentrations significantly higher than the sham controls, although levels were not significantly different between models (table 1). the plasma concentrations of nitric oxide (no) metabolites were higher in mice undergoing 1-stage hli than in the sham control group (p < 0.001) and in mice undergoing 2-stage hli (p = 0.031, table 1). myofibers that are healthy and functionally active are characterised by peripheral nuclei, while myofibers with central nuclei are immature and do not show optimal contraction 46. in gastrocnemius muscle samples removed from sham controls, myocytes were angular with a peripheral nucleus (fig. 3a). at day 28 after ischemia induction, mice subjected to 2-stage hli showed microscopic changes such as cellular swelling, focal necrosis and interstitial oedema. there were also numerous infiltrating inflammatory cells. gastrocnemius muscle samples from mice subjected to 1-stage hli showed a more homogeneous appearance, with all myocytes showing peripheral nuclei and limited inflammatory cell infiltration (fig. 3a). histological evaluation revealed that tissues of mice subjected to 1-stage hli had fewer immature myofibers than tissues from the 2-stage hli model. furthermore, muscle samples from mice subjected to 2-stage hli had prominent oedema, myofibre separation and multifocal neutrophilic infiltration. neutrophils were observed throughout the tissue sections. necrotic muscle fibres were prominent and formed confluent areas (fig. 3a). muscle fibrosis was assessed by picrosirius red staining, which suggested that 2-stage hli led to increased skeletal muscle fibrosis compared to 1-stage hli (p = 0.007) or sham controls (p = 0.001; fig. 3b,e). mice subjected to 2-stage hli had fewer hind limb collaterals.
consistent with the reduced perfusion assessed by ldpi, both arteriogenesis and angiogenesis were inhibited in the ischemic gastrocnemius muscles of the 2-stage hli model (fig. 3c,d). measurement of angiogenesis by cd31 immunostaining showed that capillary density was significantly reduced in samples from the 2-stage hli compared to the 1-stage hli (p = 0.002) or sham controls (p = 0.004; fig. 3). the arteriolar density was also significantly less in samples from the 2-stage hli model compared to the 1-stage hli (p = 0.001) or sham controls (p = 0.004; fig. 3). protein concentrations of angiogenesis and shear stress response markers were downregulated in the gastrocnemius muscles of mice undergoing 2-stage hli. the relative total vegf and vegfr-2 (but not vegfr-1, p-enos/enos and hif-α) protein levels in gastrocnemius tissue collected from mice 4 weeks after 2-stage hli were significantly less than in sham controls and mice undergoing 1-stage hli (fig. 4a-d, supplementary fig. s6). analysis of tissues from the 2-stage hli model prior to ameroid placement and at days 3 and 7 after ameroid placement suggested no significant changes in concentrations of trpv4 or vegf in response to ameroid constriction (supplementary fig. s7). at the end of the experiment, protein concentrations of trpv4 and klf4 were significantly less in the gastrocnemius muscle samples from mice undergoing 2-stage hli compared to sham controls and mice undergoing 1-stage hli (fig. 4e,f). qrt-pcr showed that the relative expressions of vegf-r2, trpv4 and klf4 in the gastrocnemius muscle of mice subjected to 2-stage hli were significantly lower than within the gastrocnemius muscle of mice undergoing 1-stage hli or in sham controls (fig. 4g-j). exercise training improved functional capacity but not limb perfusion in mice subjected to 2-stage hli.
supervised exercise training is an established method to improve functional capacity in pad patients 47, 48, and previous studies show that mice respond positively to exercise training 49, 50. in order to examine whether an established clinically effective therapy was effective in the novel animal model, the effect of exercise training (using a running wheel) on the functional capacity of mice subjected to 2-stage hli was assessed. exercise training was commenced 5 days after ischemia induction. exercise training did not affect limb perfusion assessed by ldpi (linear mixed effect test, p = 0.700; fig. 5b,c). [displaced figure legend: data from the ischemic limbs of the 1-stage and 2-stage hli models and the respective sham controls (n = 8/group, scale bars in all images = 25 µm); (e-g) quantitative bar graphs showing the effect of hli on (e) skeletal muscle fibrosis by picrosirius red staining, (f) angiogenesis by immunohistochemical staining against cd31 and (g) arteriogenesis by immunohistochemical staining against α-sma; all values are median and interquartile ranges (n = 8/group), with p value significance set at ≤0.05.] (2020) 10:3449 | https://doi.org/10.1038/s41598-020-60352-4 mice subjected to exercise training showed a significant increase in average treadmill walking capacity compared to sedentary controls (p = 0.003; fig. 5d). exercise training upregulated gastrocnemius muscle vegf and trpv4 levels. since the improvement in ambulatory performance as a result of exercise training could be due to enhanced angiogenesis, the expression of the angiogenesis and shear stress responsive proteins vegf, vegf-r1, vegf-r2, trpv4 and klf4 was assessed in the gastrocnemius muscles (fig. 6). vegf (p = 0.001, fig. 6b) and trpv4 (p = 0.027, fig. 6e), but not vegf-r1, vegf-r2 and klf4, were significantly upregulated following exercise training (fig. 6c,d,f).
this report describes the development of a novel model of hli which results in more severe and prolonged ischemia than the traditional model. mice subjected to the 2-stage hli had functional and ambulatory impairment and a positive response to exercise training, as has been reported for pad patients 51. previous rodent studies suggest that placing ameroid constrictors alone, without manipulating the femoral artery, results in mild ischemia 20, and blood flow recovers within 2 weeks 16. hence, the novel hli model was based on placement of two ameroid constrictors on the femoral artery to promote gradual occlusion, followed by excision of the intervening segment after 14 days to induce severe ischemia. a previous report suggests that recovery of hind limb blood flow is reduced in apoe −/− compared to c57bl/6j mice due to their limited collateral arteries, and hence apoe −/− mice were used in the current study 24. pad patients are generally older and exhibit metabolic derangements that limit angio- and arterio-genesis, and previous studies suggest that apoe −/− mice have delayed skeletal muscle healing 52, reduced angiogenesis responses and impaired functional recovery after hli, further supporting the rationale for choosing this mouse strain 26, 28, 53. mice subjected to 1-stage hli had rapid recovery of limb blood flow and reached a perfusion level similar to the sham controls within 28 days, as has been reported by other investigators 28, 54, 55. ligation and sudden excision of the femoral artery is believed to generate a pressure gradient between the proximal and distal ends of the occluded vessel, resulting in increased shear stress and a redirection of blood flow towards the collaterals and through numerous branches arising from the internal iliac artery, resulting in rapid improvement in blood flow 14, 56. ameroid constrictors have been shown to cause luminal occlusion within 14 days; however, blood flow slowly increases over the next 2-3 weeks 16, 21.
hence, in the new model, we superimposed a secondary acute event by excising the intervening femoral artery segment along with the ameroid constrictors. in contrast to the 1-stage hli, mice subjected to 2-stage hli had on-going limb ischemia and a prolonged functional deficit on both forced and voluntary ambulation tests. these findings support the value of the novel model for the testing of interventions aimed at achieving clinical improvements in pad patients. since angiogenesis is an inflammation-driven process 57, the concentrations of circulating cytokines were measured. these markers of systemic inflammation increased in response to hli in both models examined. twenty-eight days after hli induction, the concentrations of circulating cytokines were similar in the two models studied. gastrocnemius muscle samples obtained from the 2-stage hli model had marked neutrophilic infiltration. it has been previously reported that inflammatory cells accumulate in hypoxic tissues and promote angiogenesis 58. it is possible that the systemic concentrations of cytokines were not reflective of the level of inflammation within the hind limb. markers of angiogenesis and arteriogenesis in gastrocnemius tissue, such as cd31 and α-smc, were found to be less evident in the 2-stage than the 1-stage hli model. furthermore, gastrocnemius vegf and vegf-r2 protein levels and the amount of total plasma no metabolites were significantly lower in the 2-stage than the 1-stage hli model. vegf promotes angiogenesis 59 through binding to vegf-r2 60 expressed on endothelial cells 61. vegf induces the release of no 62, thereby promoting microvascular perfusion and endothelial progenitor cell mobilization [63] [64] [65]. endothelial cell-derived micrornas, such as mir-16, have also been implicated in controlling angiogenesis through inhibiting rho gdp dissociation inhibitor (rhogdi)-α, an important regulator of enos phosphorylation 66.
it appears likely that the low levels of pro-angiogenic markers in the 2-stage hli model reflect less activation of endothelium-dependent pro-angiogenesis signalling pathways, which were stimulated by collateral flow within the 1-stage model. shear stress promotes arteriogenesis by stimulating remodelling of collaterals 67, 68. endothelial cells transduce changes in shear stress into intracellular signals which promote expression of a distinct set of genes that can control the response to ischemia [69] [70] [71]. previous studies suggest that increased shear stress promotes phosphorylation and upregulation of mechano-sensors, such as trpv4 [72] [73] [74] [75]. it was postulated that the distinct ways of inducing hli in the two models studied might be reflected in different trpv4 expression. mice subjected to 2-stage hli had lower expression of trpv4 compared to those undergoing 1-stage hli. furthermore, in the 2-stage hli model, there was no change in trpv4 protein levels after ameroid placement, suggesting that ameroid constriction is a gradual process that does not lead to shear stress changes capable of stimulating mechano-sensors like trpv4. these findings also suggest that 2-stage hli results in more limited collateral flow than the 1-stage approach. the acute reduction in arterial pressure gradient following femoral artery excision in the 1-stage hli model is thought to be registered by endothelial shear stress response elements, resulting in upregulation of angiogenesis- and arteriogenesis-promoting genes 21. stimulation of trpv4, for example by 4α-phorbol 12,13-didecanoate, has been reported to promote increased no release and increased hind limb blood flow 73. [displaced figure legend: relative mrna expression of vegf-r1 (g), vegf-r2 (h), trpv4 (i) and klf4 (j) assessed in the gastrocnemius muscles of mice undergoing hli; quantitative real time pcr (qrt-pcr) was performed on extracted total mrna using specific primers and normalised to glyceraldehyde 3-phosphate dehydrogenase (gapdh) expression; data analysed by mann-whitney u test (n = 10 samples/group).] trpv4-deficient rodents have impaired vasodilatation [76] [77] [78] [79]. this supports the important role of trpv4 in promoting adaptation to acute limb ischemia. four weeks of exercise training led to an approximately 2-fold increase in treadmill ambulatory distance in the mice subjected to 2-stage hli, paralleling findings from pad patients [80] [81] [82] [83]. these findings suggest the novel 2-stage hli mouse model simulates the walking impairment experienced by patients. in support of the relevance of this model to patients, we found that exercise therapy improved treadmill performance without improving limb perfusion, a finding similar to that described in pad patients 47, 48, 84. since exercise training increases shear stress in pre-existing collaterals, we examined the expression of trpv4 and angiogenesis markers 85. compared to sedentary controls, mice undergoing exercise training showed increased total protein levels of vegf and trpv4, in accordance with previous reports showing enhanced expression of pro-angiogenesis markers after exercise training [86] [87] [88]. overall these findings suggest that flow-mediated upregulation of shear stress responsive genes is important in stimulating angiogenesis responses in the new model. this study has several strengths and weaknesses. the 1-stage hli model, which is usually utilised for pad research, has many limitations, including its disparate pathophysiological mechanisms compared to patient presentations, the temporary nature of the ischemia and its relative responsiveness to a variety of therapies which are not effective in patients 14.
this study suggests the novel 2-stage model has clear advantages over the 1-stage model, since ischemia is more severe and prolonged and does not naturally recover, making it suitable to assess the longer-term effects of potential therapies. the study also has some limitations. these include the absence of a detailed time course of the genes and proteins assessed and the absence of a definitive test of the role of shear stress response elements in the differences found between the models. the importance of shear stress response elements in the response to hli has, however, been reported in prior studies 85. it is therefore logical to assume similar mechanisms were involved in the current study given the gene expression differences demonstrated. the study also lacked the use of new high-resolution imaging systems, such as fluorescent microsphere angiography.

references (titles only, as preserved in the extraction):
- lower-limb arterial disease
- comparison of global estimates of prevalence and risk factors for peripheral artery disease in 2000 and 2010: a systematic review and analysis
- the global pandemic of peripheral artery disease
- global and regional burden of death and disability from peripheral artery disease: 21 world regions
- endovascular versus open revascularization for peripheral arterial disease
- outcomes of polytetrafluoroethylene-covered stent versus bare-metal stent in the primary treatment of severe iliac artery obstructive lesions
- trial of a paclitaxel-coated balloon for femoropopliteal artery disease
- drug-coated balloon versus standard percutaneous transluminal angioplasty for the treatment of superficial femoral and popliteal peripheral artery disease: 12-month results from the in.pact sfa randomized trial
- autologous cell therapy for peripheral arterial disease: systematic review and meta-analysis of randomized, nonrandomized, and noncontrolled studies
- growth factors for angiogenesis in peripheral arterial disease
- trends in the national outcomes and costs for claudication and limb threatening ischemia: angioplasty vs bypass graft
- the australian national workplace health project: design and baseline findings
- muscle fiber characteristics in patients with peripheral arterial disease
- evaluation of the clinical relevance and limitations of current pre-clinical models of peripheral artery disease
- the efficacy of extraembryonic stem cells in improving blood flow within animal models of lower limb ischaemia
- the effect of gradual or acute arterial occlusion on skeletal muscle blood flow, arteriogenesis, and inflammation in rat hindlimb ischemia
- blood flow and vascular gene expression: fluid shear stress as a modulator of endothelial phenotype
- biomechanical activation of vascular endothelium as a determinant of its functional phenotype
- a review of the pathophysiology and potential biomarkers for peripheral artery disease
- subacute limb ischemia induces skeletal muscle injury in genetically susceptible mice independent of vascular density
- cellular and molecular mechanisms regulating blood flow recovery in acute versus gradual femoral artery occlusion are distinct in the mouse
- physical activity during daily life and mortality in patients with peripheral arterial disease
- skeletal muscle perfusion in peripheral arterial disease: a novel end point for cardiovascular imaging
- impaired collateral vessel development associated with reduced expression of vascular endothelial growth factor in apoe −/− mice
- murine model of hindlimb ischemia
- delayed arteriogenesis in hypercholesterolemic mice
- vascular growth factor expression in a rat model of severe limb ischemia
- mouse model of angiogenesis
- critical function of bmx/etk in ischemia-mediated arteriogenesis and angiogenesis
- limb ischemia after iliac ligation in aged mice stimulates angiogenesis without arteriogenesis
- exercise performance and peripheral vascular insufficiency improve with ampk activation in high-fat diet-fed mice
- altered behavioural adaptation in mice with neural corticotrophin-releasing factor overexpression
- amitriptyline reverses hyperalgesia and improves associated mood-like disorders in a model of experimental monoarthritis
- validation of the digital pressure application measurement (pam) device for detection of primary mechanical hyperalgesia in rat and mouse antigen-induced knee joint arthritis
- a peptide antagonist of thrombospondin-1 promotes abdominal aortic aneurysm progression in the angiotensin ii-infused apolipoprotein-e-deficient mouse
- influence of apolipoprotein e, age and aortic site on calcium phosphate induced abdominal aortic aneurysm in mice
- impaired acetylcholine-induced endothelium-dependent aortic relaxation by caveolin-1 in angiotensin ii-infused apolipoprotein-e (apoe −/− ) knockout mice
- relationship between leg muscle capillary density and peak hyperemic blood flow with endurance capacity in peripheral artery disease
- chronically ischemic mouse skeletal muscle exhibits myopathy in association with mitochondrial dysfunction and oxidative damage
- detrimental effect of class-selective histone deacetylase inhibitors during tissue regeneration following hindlimb ischemia
- chronic sodium nitrite therapy augments ischemia-induced angiogenesis and arteriogenesis
- sildenafil promotes ischemia-induced angiogenesis through a pkg-dependent pathway
- fenofibrate increases high-density lipoprotein and sphingosine 1 phosphate concentrations limiting abdominal aortic aneurysm progression in a mouse model
- osteoprotegerin deficiency limits angiotensin ii-induced aortic dilatation and rupture in the apolipoprotein e-knockout mouse
- wnt signaling pathway inhibitor sclerostin inhibits angiotensin ii-induced aortic aneurysm and atherosclerosis
- nuclear positioning in muscle development and disease
- the effect of exercise on haemodynamics in intermittent claudication: a systematic review of randomized controlled trials
- exercise for intermittent claudication
- impaired aerobic capacity in hypercholesterolemic mice: partial reversal by exercise training
- hypercholesterolemia impairs exercise capacity in mice
- exercise for intermittent claudication
- apolipoprotein e −/− mice have delayed skeletal muscle healing after hind limb ischemia-reperfusion
- peripheral nerve ischemia: apolipoprotein e deficiency results in impaired functional recovery and reduction of associated intraneural angiogenic response
- differential necrosis despite similar perfusion in mouse strains after ischemia
- time course of skeletal muscle repair and gene expression following acute hind limb ischemia in mice
- promoting blood vessel growth in ischemic diseases: challenges in translating preclinical potential into clinical success
- inflammation and wound healing: the role of the macrophage
- post-ischaemic neovascularization and inflammation
- vegf receptor signalling - in control of vascular function
- glycaemic control improves perfusion recovery and vegfr2 protein expression in diabetic mice following experimental pad
- high affinity vegf binding and developmental expression suggest flk-1 as a major regulator of vasculogenesis and angiogenesis
- vascular endothelial growth factor/vascular permeability factor augments nitric oxide release from quiescent rabbit and human vascular endothelium
- recovery of disturbed endothelium-dependent flow in the collateral-perfused rabbit ischemic hindlimb after administration of vascular endothelial growth factor
- nitric oxide synthase modulates angiogenesis in response to tissue ischemia
- essential role of endothelial nitric oxide synthase for mobilization of stem and progenitor cells
- hindlimb ischemia impairs endothelial recovery and increases neointimal proliferation in the carotid artery
- hemodynamic stresses and structural remodeling of anastomosing arteriolar networks: design principles of collateral arterioles
- wall remodeling during luminal expansion of mesenteric arterial collaterals in the rat
- shear-induced force transmission in a multicomponent, multicell model of the endothelium
- oscillatory shear stress and hydrostatic pressure modulate cell-matrix attachment proteins in cultured endothelial cells
- effects of biaxial oscillatory shear stress on endothelial cell proliferation and morphology
- ultrastructure and molecular histology of rabbit hind-limb collateral artery growth (arteriogenesis)
- trpv4 induces collateral vessel growth during regeneration of the arterial circulation
- calcium-dependent signalling is essential during collateral growth in the pig hind limb-ischemia model
- effects of endogenous nitric oxide and of deta nonoate in arteriogenesis
- trpv4-dependent dilation of peripheral resistance arteries influences arterial pressure
- transient receptor potential vanilloid type 4-deficient mice exhibit impaired endothelium-dependent relaxation induced by acetylcholine in vitro and in vivo
- activation mediates flow-induced nitric oxide production in the rat thick ascending limb
- trpv4-mediated endothelial ca2+ influx and vasodilation in response to shear stress
- esc guidelines on the diagnosis and treatment of peripheral arterial diseases, in collaboration with the european society for vascular surgery (esvs): document covering atherosclerotic disease of extracranial carotid and vertebral, mesenteric, renal, upper and lower extremity arteries, endorsed by the european stroke organization (eso); the task force for the diagnosis and treatment of peripheral arterial diseases of the european society of cardiology (esc) and of the european society for vascular surgery (esvs)
- cilostazol for intermittent claudication
- a systematic review of the uptake and adherence rates to supervised exercise programs in patients with intermittent claudication
- exercise for intermittent claudication
- a systematic review of treatment of intermittent claudication in the lower extremities
- cardiovascular effects of exercise: role of endothelial shear stress
- exercise-induced expression of angiogenesis-related transcription and growth factors in human skeletal muscle
- the influence of physical training on the angiopoietin and vegf-a systems in human skeletal muscle
- exercise linked to transient increase in expression and activity of cation channels in newly formed hind-limb collaterals

this research was funded by a faculty administered grant and a research infrastructure block grant from james cook university and funding from the queensland government. jg holds a practitioner fellowship from the national health and medical research council, australia (1117061) and a senior clinical research fellowship from the queensland government. smo was supported by funding from the graduate research school and college of medicine, james cook university. we would like to acknowledge with thanks the help of prof. zoltan sarnyai (james cook university), who provided access to the open field test assessment facility, dr. joseph moxon (james cook university), who assisted with the lme used to analyse data from the exercise study, and dr. pacific huynh (james cook university), who provided the mouse images in figs. 1a and 5a.

s.m.k. led the design of the research, undertaking of experiments, interpretation of data and writing of the manuscript. s.m.o., j.l., s.m. and r.j.j. performed parts of the experiments, contributed to interpretation of the data and gave critical comments on the manuscript. j.g. contributed rationales for the studies, led funding applications, co-wrote the manuscript and contributed to project supervision and data interpretation. the authors declare no competing interests.

supplementary information is available for this paper at https://doi.org/10.1038/s41598-020-60352-4. correspondence and requests for materials should be addressed to j.g. reprints and permissions information is available at www.nature.com/reprints. publisher's note: springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
license, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the creative commons license, and indicate if changes were made. the images or other third party material in this article are included in the article's creative commons license, unless indicated otherwise in a credit line to the material. if material is not included in the article's creative commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. to view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

key: cord-262524-ununcin0
authors: bankhead, armand; mancini, emiliano; sims, amy c.; baric, ralph s.; mcweeney, shannon; sloot, peter m.a.
title: a simulation framework to investigate in vitro viral infection dynamics
date: 2011-12-31
journal: procedia computer science
doi: 10.1016/j.procs.2011.04.195
sha:
doc_id: 262524
cord_uid: ununcin0

abstract: virus infection is a complex biological phenomenon for which in vitro experiments provide a uniquely concise view, where data is often obtained from a single population of cells under controlled environmental conditions. nonetheless, data interpretation and real understanding of viral dynamics is still hampered by the sheer complexity of the various intertwined spatio-temporal processes. in this paper we present a tool to address these issues: a cellular automata model describing critical aspects of in vitro viral infections, taking into account spatial characteristics of virus spreading within a culture well. the aim of the model is to understand the key mechanisms of sars-cov infection dynamics during the first 24 hours post infection.
we interrogate the model using a latin hypercube sensitivity analysis to identify which mechanisms are critical to the observed infection of host cells and the release of measured virus particles. in the past ten years there has been a growing interest in modeling viral infections for the study and characterization of host infection dynamics. early mathematical models were typically based on ordinary differential equations (odes) and focused on extracting key parameters of the infection dynamics [1, 2]. later on, interest moved to investigating how spatial relations affect the system dynamics, thus moving to cellular automata (ca) models [3, 4, 5]. more recently the attention has shifted to viral respiratory diseases due to the increasing danger of pandemics such as the "spanish flu" that caused over 50 million deaths worldwide in 1918. near-pandemics such as severe acute respiratory syndrome (sars), resulting in 8,096 infected cases and a 9.6% death rate, have provided unfortunate reminders of their continued health risk and significant economic impact [6]. new research projects are being funded to investigate the different scales of threat, from the epidemiological spread of a virus in a population to the early phases of infection in a cell culture. among the papers published on this topic, in particular on the influenza virus [7, 8, 9], there are bocharov's ode model [10] and beauchemin's ca model [3, 4, 11]. both models described the dynamics of the influenza infection in the upper epithelial areas of the lungs and were validated by clinical data. although they take into account the effect of the immune system on the dynamics of the disease, describing the infection until its final outcome (more than a week after the first infection), neither analyzed which mechanisms were critical in the dynamics of the first and second rounds of infection.
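the latin hypercube design mentioned above stratifies each parameter's range into n equal intervals and draws exactly one sample per interval, so even small designs cover every parameter's full range. a minimal stdlib sketch (the function name and interface are illustrative, not the authors' implementation):

```python
import random

def latin_hypercube(n_samples, bounds, rng=None):
    """Latin hypercube sample: for each parameter, split its range into
    n_samples equal strata, draw one uniform point per stratum, then
    shuffle the strata so parameters are paired randomly across samples.

    bounds: list of (low, high) tuples, one per model parameter.
    Returns a list of n_samples parameter vectors (lists of floats).
    """
    rng = rng or random.Random()
    columns = []
    for low, high in bounds:
        # one draw inside each of the n equal strata of [0, 1) ...
        col = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(col)  # ... then decouple this column from the others
        columns.append([low + u * (high - low) for u in col])
    return [list(point) for point in zip(*columns)]
```

each sampled parameter vector can then be run through the model and the outputs (e.g. infected-cell counts or virus titer) correlated with the inputs to rank mechanism importance.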
to understand the key mechanisms of viral respiratory diseases, we focus instead on early phases of viral infection by using in vitro experiments observing lung cells up to 24 hours post infection (pi). the in vitro experiments provide measurements of virus titer, spatial characteristics of cell growth and, through green fluorescent protein (gfp) imaging, of the infection spread. our computational model focuses on simulating the early stages of a viral infection in a population of cells plated on a culture well. the choice of a ca model was natural since the in vitro infections being studied use host lung cancer cell lines that form a fixed mono-layer in which spatially dependent aspects of infection may be present [12, 13] . we developed this computational model using the multi-agent system visualization (masyv) platform [3] . in contrast to previous models, we explicitly focus on the dynamics of virus spread on a population of cells, supported by experimental data from an in vitro model system. we also explicitly model the infectious viral particles as discrete entities, whereas in previous models the infection of cells followed simple ca rules depending on the states of neighboring cells. these viral particles are released by infected cells according to a specific function based on time post infection, and move over the well with a random walk algorithm. this representation allows us to model the mechanisms of virus spread in an environment where the virus is not confined and can also infect cells not adjacent to the infected ones. in section 2 we describe the model design and its main features. we also describe the sars infection experimental data used to parameterize the model. in section 3 we present a sensitivity analysis that identifies the critical mechanisms characterizing the early phases of the infection. 
we also show that the model can explain the experimentally observed virus titer data and allows a deeper understanding of the infection dynamics in the in vitro experiments. the computational model is built using beauchemin's masyv platform. the software consists of a server providing i/o and supervisory services to the various client modules where the simulation is actually coded. our choice to use masyv was partially driven by flexible and powerful graphical visualization routines that facilitate comparison to images provided by the experimental collaborators. masyv has a c-based api and is open source allowing finalized custom models to be easily shared. we discuss novel contributions and differences from the previous modules. the original module details are presented in beauchemin et al. [3] . our model reproduces a viral infection on a population of cells plated on a culture well. in our client we consider, as the target of the viral infection, calu-3 cells that are a human airway epithelial cell line derived from human lung cancer. we model these host cells using a 130x130-site ca model where each site represents either one calu-3 cell or an empty space. at the beginning of the simulation each lattice site is initialized and labelled with "uninfected" or "empty" states as described below in section 2.2. uninfected cells are initially stochastically infected with virus through a first round of infection at the beginning of the simulation, described in section 2.3, and once infected progress through the following states: containing: initial infection state representing viral entry and hijacking of host cell mechanisms necessary for viral replication. expressing: cell is actively producing and assembling virus capsids and genomes internally, but has not begun releasing virion. 
infectious: assembled virions are being released from the host cell according to the release function (section 2.4). by examining the experimental viral titer data shown in figure 1 we derived the temporal delay of the state transition between containing and infectious. the ~10^3 pfu/ml viral titer measurements at time points 0, 4 and 7 hours post infection are residual virus left over from the initial infection after washing. we set the delay for release of new infectious viral particles to 7 hours to represent the jump in observed virus titer between 7 and 12 hours. cells release virus particles according to a viral release function, described below. virus particles are entities that diffuse from one lattice point to another according to a random walk algorithm (section 2.4). these free-floating virus particles stochastically infect uninfected cells in a second round of infection. the ca lattice is therefore like a tissue of immobile cells with infective virions moving over it. we represent only infectious viral particles so that their count in the model corresponds to the viral titer measured in terms of plaque forming units per ml (pfu/ml). a pfu is a measure of the number of viral particles capable of forming plaques per unit of volume, thus only infectious particles are counted in the viral titer. the ca lattice is updated synchronously and the boundary conditions for both the epithelial cells and viral particles are periodic, i.e., epithelial cells and viral particles that grow or move past the boundary of the virtual well wrap around to the opposite side. we use a honeycomb neighbourhood: each lattice site is considered adjacent to six sites, and cells can replicate only if a neighbouring site is empty or contains a dead cell. only uninfected cells are allowed to replicate. when cells on the lattice are initialized, all uninfected cells are stochastically assigned a time_to_death value between 1 and an arbitrary cell_lifespan variable.
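the periodic honeycomb neighbourhood and the virion random walk described above can be sketched as follows. this is an illustrative implementation using axial hex coordinates, not the masyv code; the function names are invented for the sketch.

```python
import random

L = 130  # lattice side, matching the 130x130 grid in the text

# Six axial-coordinate neighbour offsets of a honeycomb lattice.
HEX_OFFSETS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def hex_neighbors(q, r):
    """Periodic honeycomb neighbourhood: six adjacent sites, wrapping
    across the boundary of the virtual well."""
    return [((q + dq) % L, (r + dr) % L) for dq, dr in HEX_OFFSETS]


def random_walk(q, r, num_diff_steps, rng=random):
    """Move one virion: at each time step it performs num_diff_steps
    moves, each to a uniformly chosen neighbouring site (num_diff_steps
    is the free diffusion parameter from the text)."""
    for _ in range(num_diff_steps):
        q, r = rng.choice(hex_neighbors(q, r))
    return q, r
```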
cell_lifespan was chosen to be large enough that only a minority of cells (5%) die naturally within the simulated 24 hours. cell_lifespan is not affected by infection, as in vitro experimental observations showed no increase in cell death due to infection within 24 hours. in the laboratory, before an in vitro infection is performed, 10^6 cells are first plated in each of the 6 culture wells present in a dish and given time to settle down on the well surface to create a cell mono-layer before the virus is applied to the cell culture. in the plating process, a suspension of cells is plated onto the surface of the well; the cells adhere to the plastic in small clumps or as single cells, forming "islands" that are homogeneously distributed throughout the well. cells located on the inside of these islands (surrounded by other cells) do not replicate due to contact inhibition, whereas cells on the outer edges of each island replicate until they are completely surrounded by other cells. as a result each island continues to grow until there is no adjacent empty space on the well surface. to represent this behaviour we used a set of images (figure 2) that shows how cells are distributed on the culture well just before the infection is performed. the number of cells forming each isolated island was counted for each image, yielding a distribution of island sizes with an average of 116 cells per island and a standard deviation of 74 cells per island. this data was used to generate randomly positioned clusters of cells on the grid, each with a size drawn from the measured distribution. islands are placed iteratively until the simulated confluency matches the experimentally measured confluency. a cell line's doubling time is defined as the time required for the population of cells to double in number; we used a doubling time of 48 hours for calu-3 cells (ardizzoni lab, personal communication).
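the iterative island-seeding procedure described above might be sketched as below. the mean (116) and standard deviation (74) come from the text; modeling the measured island-size distribution as a normal draw truncated at 1 is an assumption of this sketch, and the target confluency value is a placeholder.

```python
import random


def sample_island_size(rng, mean=116.0, sd=74.0):
    """Draw an island size from the measured distribution (mean 116,
    sd 74 cells); a truncated normal is an assumption made here."""
    return max(1, int(round(rng.gauss(mean, sd))))


def seed_islands(target_confluency, lattice_sites, rng=None):
    """Place islands iteratively until the fraction of occupied sites
    reaches the experimentally measured confluency; returns the list
    of island sizes placed (positioning on the grid is omitted)."""
    rng = rng or random.Random(0)
    occupied, sizes = 0, []
    while occupied / lattice_sites < target_confluency:
        size = sample_island_size(rng)
        size = min(size, lattice_sites - occupied)  # never overfill the well
        sizes.append(size)
        occupied += size
    return sizes
```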
cell density directly affects local cell growth, since cells completely surrounded by other cells are not able to replicate. for this reason, in our model the initial spatial distribution of uninfected cells affects the overall cell growth on the virtual culture well. in order to obtain correct cell growth in our model we used a parameter called division_time that measures the time necessary for an uninfected cell to duplicate. we adjusted division_time to match the simulated doubling time with the experimentally-derived one. this was accomplished by analyzing the growth of uninfected cells over 48 hours as a function of time for different values of the parameter, which led us to fix the value of division_time at 10 hours. cells in the culture well are initially infected using a multiplicity of infection (moi) of 2 (see section 2.7.4). the moi is calculated as the ratio of infectious viral particles to the number of calu-3 target cells. by definition, the proportion of infected cells is given by the poisson distribution that describes the infection process [14]. using the experimental moi would theoretically give us the exact percentage of infected cells at the beginning of the experiment. however, even though the number of particles can be measured with good accuracy, not all the viral particles used in this first infection process are infective, and the estimate is derived using veroe6 cells. this leads to the definition of an effective moi, that is, the moi given by the number of particles able to truly infect the calu-3 host cells. initially, viral particles infect a proportion of the plated calu-3 cells before being washed away. we use the resulting proportion of infected cells estimated through the standard moi definition as the starting point of our simulation. after cells are initialized on the lattice, they are assigned an infected state according to equation 2, which describes the probability of at least one viral particle entering a cell.
in the equations below, n represents the number of virus particles and moi is the multiplicity of infection used in the experiment. equation 2, the probability of a cell being infected by 1 or more virus particles, is derived from equation 1, the probability of being infected by exactly n virus particles, commonly used to describe viral infection as a poisson process [14]. although the expected moi is experimentally known, as mentioned in the "experimental data" section it is often over-estimated. we add the free parameter infectiousness for two purposes: 1) to scale the over-estimated moi and 2) to predict the number of initially infected cells when this data is not available. in the sensitivity analysis we discuss the importance of this parameter.

p(n) = (moi^n * e^(-moi)) / n!   (1)

p(n >= 1) = 1 - e^(-moi)   (2)

after cells transition to an infectious state, virus particles are released with a probability described by the sigmoidal function shown in figure 4. the probability that one or more infectious viral particles are released in each time step is given by v(t) (equation 3). in this equation t represents the time since the cell was infected, and the parameters a, b, and c are derived by fitting an experimentally-derived viral release curve produced by ka-wai et al. [15]. since we had no direct measure of the amount of virions per cell released by sars-infected cells, we rescaled the sigmoidal function with the free parameter vir_release. when v(t) > 1, an infectious cell will release one or more virions with a probability of 1; the fractional part of v(t) is treated as the probability that one additional virus particle is released. infectious virions diffuse in the virtual well according to a simple random walk: for each virion in the virtual well, at each time step of the simulation we perform a number of random walk steps given by the free parameter num_diff_steps.
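the poisson initial-infection probability and the sigmoidal release rule described above can be sketched as follows. the sigmoid shape parameters (a, b, c) and vir_release here are placeholder values, not the fitted ones from the ka-wai et al. curve, and the use of infectiousness as a multiplicative moi rescaling is an assumption of this sketch.

```python
import math


def p_initial_infection(moi, infectiousness=1.0):
    """Probability a cell receives at least one virion when virions are
    Poisson-distributed with mean moi (eq. 2), with the free parameter
    infectiousness rescaling the over-estimated experimental MOI."""
    return 1.0 - math.exp(-moi * infectiousness)


def release_rate(t, a=1.0, b=9.0, c=1.5, vir_release=5.0):
    """Sigmoidal release v(t) vs. hours post infection, rescaled by the
    free parameter vir_release; a, b, c here are placeholders."""
    return vir_release * a / (1.0 + math.exp(-(t - b) / c))


def virions_released(t, rng):
    """floor(v) virions are released for sure; the fractional part of
    v(t) is the probability of one additional virion."""
    v = release_rate(t)
    n = int(v)
    if rng.random() < v - n:
        n += 1
    return n
```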
each viral particle has an equal chance to move to one of the six neighboring lattice sites, and at each time step performs num_diff_steps movements on the lattice. a free parameter is needed for the diffusion of virions because of the lack of data regarding the virion diffusion coefficient in the well under the specific conditions of the media used to cover the cell culture. viral titer measures only the infectious particles present in the media, using plaque forming units (pfu) per ml. for this reason we model only infectious particles [14]. in addition, it is more computationally efficient to release and diffuse only the infectious viral particles. once released, virus particles may infect cells with an uninfected state located at the same lattice site. we assume each viral particle independently infects a nearby cell with a probability following the binomial distribution. for each lattice site, the probability p_second_round (equation 4) is used to calculate whether or not a given uninfected cell is infected by n of the N total local virus particles. p_bp represents the probability of a virus-receptor binding event leading to a cell's infection by a single viral particle during a given model time step. we refer to the free parameter p_bp as binding_prob below. once a cell is successfully infected, the n viral particles are removed from the local virtual media located at the cell's lattice site. for these experiments we used a virus derived from our severe acute respiratory syndrome coronavirus (sars-cov) wild type infectious clone in which we engineered the green fluorescent protein (gfp) in place of open reading frame 7a/b. sars-cov gfp stock titers were calculated using standard plaque titration methods. briefly, confluent monolayers of veroe6 cells in 6-well plates were infected with serial 1:10 dilutions (usually 10^-1 to 10^-6) of stock virus for 1 hour at 37°c.
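the second-round infection rule described above, each local virion independently binding with probability p_bp (binding_prob), can be sketched as below; the function names are invented for illustration.

```python
import random


def infect_second_round(n_local_virions, binding_prob, rng):
    """Second-round infection at one lattice site: each of the N local
    virions independently binds with probability binding_prob (p_bp),
    so the number entering the cell is Binomial(N, p_bp). Returns the
    count n of binding virions, which are then removed from the local
    virtual media."""
    return sum(1 for _ in range(n_local_virions)
               if rng.random() < binding_prob)


def p_infected(n_local_virions, binding_prob):
    """Probability that at least one of the N local virions binds,
    i.e. 1 - (1 - p_bp)^N."""
    return 1.0 - (1.0 - binding_prob) ** n_local_virions
```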
the monolayers were covered with a solution of 0.8% low melting point agarose (seachem), 1x minimal essential media high glucose (mem, invitrogen), 10% fetal clone ii (hyclone), and 1x antibiotic/antimycotic (invitrogen), which solidifies, trapping the virus to allow cell-to-cell viral spread but not release of virions into the media. plates were incubated at 37°c for 36 hours and stained with neutral red for 2 to 5 hours; the stain was then removed, and the plaques (holes in the monolayer generated when viruses kill the host cells) were visualized and counted on a light box. stock titers were calculated as plaque forming units (pfu) per ml. calu-3 cells were plated at a density of 1x10^6 per well in 6-well plates in mem containing 10% fetal clone ii and 1x antibiotic/antimycotic. cells were incubated at 37°c at 5% co2 for 2 days prior to infection, with a media change 24 hours post plating. sars-cov gfp stocks (7.5x10^7) were used to infect each well at a multiplicity of infection of 2, i.e., the addition of 2 infectious virions per cell in each well. at each time point, 100 µl of media was harvested from each well to titer via plaque assay. (the number of cells per well was assumed to be 2x10^6 at 2 days post plating.) the images shown in figure 2 were taken using phase contrast microscopy at standard exposure times and 10x magnification. at the indicated times post infection, images of the infected and mock-infected calu-3 cells were taken in living cells in real time using fluorescent light to excite gfp [16]. at each time point, three to five fields of sars-gfp infected cells were assessed using imagej (http://rsbweb.nih.gov/ij/). in imagej, we gated the light signal to generate spots in each cell and used the spot counting algorithm to determine the total number of cells per field. gfp positive cells were then counted and averaged. for each candidate parameter set, θ, a simulation fitness (eq.
5) was calculated based on a comparison of two experimental measurements: virus titer and proportion of infected cells. both components of the simulation fitness, f(θ), are normalized by a maximum error (me) term to balance their contributions.

f_titer = sum_t [ log10(vt_exp(t)) - log10(α · vt_sim(t)) ]^2   (6)

randomized weight bootstrapping (1000 iterations) was used to determine me_titer, and the derivation of me_gfp is described below. equation 6 shows that f_titer is the sum of squared differences between the experimentally derived virus titer vt_exp (figure 1) and the scaled virus titer produced by the simulation, vt_sim. only the last three time points (7, 12, 24 hours) are compared because earlier time point titer values were skewed by residual virus. the simulation output vt_sim is scaled by α to account for differences in the number of simulated cells versus the number of in vitro cells. using a lattice size of 130x130, an average of 7,636 total starting simulated cells were produced; the virus titer was therefore scaled by a factor of α = 10^6/7,636, because the experimental virus titer was produced from a population of 10^6 in vitro cells. as described in the experimental data section 2.7, in vitro gfp measurements were used to quantify the proportion of infected cells at 12 hours post infection. this additional biological information was used to constrain our model's parameter space. the f_gfp portion of the simulation fitness (equation 7) is the absolute value of the difference between the simulated proportion of infected cells, i_sim, and the experimentally measured proportion infected, i_exp. since i_exp was measured to be 0.11 (on average, 11 of 100 cells are infected) at 12 hours, a maximum error me_gfp of 0.89 was used to normalize f_gfp between zero and one. to assess the importance of the model's free parameters to simulation outcome, a latin hypercube sampling (lhs) sensitivity analysis was performed [17, 18].
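the fitness terms described in the text can be sketched as follows; the scaling factor α = 10^6/7,636 and the gfp constants follow the text, while defaulting me_titer to 1 is a placeholder for the bootstrapped value, and combining the two terms by simple addition is an assumption of this sketch.

```python
import math

ALPHA = 1e6 / 7636.0   # rescale ~7,636 simulated cells to 10^6 in vitro cells
I_EXP = 0.11           # measured proportion infected at 12 h (GFP)
ME_GFP = 1.0 - I_EXP   # = 0.89, normalizes f_gfp to [0, 1]


def f_titer(vt_exp, vt_sim, me_titer=1.0):
    """Sum of squared log10 differences between experimental titer and
    alpha-scaled simulated titer at the compared time points (7, 12,
    24 h), normalized by a maximum-error term."""
    s = sum((math.log10(e) - math.log10(ALPHA * v)) ** 2
            for e, v in zip(vt_exp, vt_sim))
    return s / me_titer


def f_gfp(i_sim):
    """Normalized absolute difference in proportion of infected cells."""
    return abs(i_sim - I_EXP) / ME_GFP


def fitness(vt_exp, vt_sim, i_sim, me_titer=1.0):
    """Combined simulation fitness; lower is a better fit."""
    return f_titer(vt_exp, vt_sim, me_titer) + f_gfp(i_sim)
```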
this two-tiered sampling method was implemented by dividing each free parameter's range into 1000 intervals. the intervals for each parameter were shuffled without replacement and then sampled using a uniform distribution, resulting in 1000 samplings of the parameter space. the probability parameters infectiousness and binding_prob were sampled over their full range between 0 and 1. the maximum values for num_diff_steps and vir_release were arbitrarily chosen (12 and 30, respectively). a spearman rank correlation was then used to measure the statistical dependence, rho, between each free parameter and single-run simulation fitness. the following are free parameters (described above) contained in our model and assessed using lhs: we also examined how free parameters affect simulation output over time. using lhs sampling, we tested the relationships between the free parameters and two simulation outputs: virus titer and % of infected cells. figure 6(a) shows that infectiousness and vir_release are significant over 24 hours of simulated infection and that their correlations start to diverge at ~12 hours. again, binding_prob and num_diff_steps do not show a significant correlation with virus titer. however, the parameters necessary for the second round of infection do impact the proportion of infected cells, shown in figure 6(b), where we see an increase in the binding_prob and vir_release correlations between 7 and 12 hours, corresponding to a decrease in the infectiousness correlation. this decrease can be attributed to the second round of infection, in which a greater population of cells has been infected. the simultaneous increase in correlation of binding_prob and vir_release indicates that additional rounds of infection play a significant role in viral spread. although simplistic compared to in vivo model systems, the interpretation of in vitro experiments is still confounded by biological complexity and disparate data types.
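the two steps described above, stratified latin hypercube sampling and a spearman rank correlation, can be sketched with the standard library as below. this is a minimal illustration (no tie handling in the rank correlation), not the analysis code used in the text.

```python
import random


def latin_hypercube(ranges, n_samples, rng=None):
    """LHS: divide each parameter range into n_samples intervals,
    shuffle the intervals without replacement, then draw uniformly
    within each, giving n_samples stratified parameter tuples."""
    rng = rng or random.Random(0)
    columns = []
    for lo, hi in ranges:
        width = (hi - lo) / n_samples
        idx = list(range(n_samples))
        rng.shuffle(idx)  # each interval used exactly once
        columns.append([lo + (i + rng.random()) * width for i in idx])
    return list(zip(*columns))


def spearman_rho(xs, ys):
    """Spearman rank correlation (assumes no ties): Pearson correlation
    computed on the ranks of the two sequences."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mean = (n - 1) / 2.0
    cov = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    var = sum((a - mean) ** 2 for a in rx)
    return cov / var
```

in the analysis described above, rho would be computed between each sampled parameter column and the per-run simulation fitness.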
explanatory models are critical for understanding and hypothesis generation. this agent-based modeling framework may be used to investigate first- and second-round infection mechanisms using free parameters that can be tuned to allow the model to incorporate disparate types of experimental data. we also take into account spatial aspects of infection, including biases in culture well cell growth and the diffusion of infectious virions. virus titer data and gfp infectivity data from a sars infection of calu-3 cells are used as an example to illustrate the model's capacity to interpret experimentally derived data. lhs sensitivity analysis indicates that a small population of cells is initially infected and that additional rounds of infection are responsible for the virus titer measurements. a significant relationship between infectiousness and both the simulation fitness and the simulation outputs (virus titer and proportion of infected cells) indicates the importance of this parameter for the resulting infection dynamics. this result also demonstrates the importance of accurate cell and infectious virus particle counts for the moi calculation. finally, our model highlights the importance of the intracellular processes leading to virus release. one possible future step is to include additional detail regarding the intracellular processes of virus replication and move to a multi-scale spatio-temporal model. future work is planned to incorporate microarray data and make predictions regarding host response and expression in order to examine connections between infection state and signaling in the immune response. simulated annealing has been used to identify free parameters that fit the described model to virus titer data, and may be used to predict the number of infected cells or other un-measured data types to support experimental modeling efforts. these off-line predictions could then be used to interpolate a single-cell function representing host response post-infection.
we also plan to train the model with data from multiple virus strains to investigate how virus population and host response dynamics differ. finally, we also plan to investigate the effects of the initial spatial distribution of infected cells on viral pathogenesis for multiple virus strains.

references
[1] hiv-1 dynamics in vivo: virion clearance rate, infected cell lifespan, and viral generation time
[2] modelling viral and immune system dynamics
[3] a simple cellular automaton model for influenza a viral infections
[4] probing the effects of the well-mixed assumption on viral infection dynamics
[5] structured model of influenza virus replication in mdck cells
[6] sars cases with onset of illness from 1
[7] kinetics of influenza a virus infection in humans
[8] antiviral resistance and the control of pandemic influenza: the roles of stochasticity, evolution and model details
[9] modeling the viral dynamics of influenza a virus infection
[10] mathematical model of antiviral immune response iii. influenza a virus infection
[11] modeling influenza viral dynamics in tissue
[12] artificial immune systems
[13] multi-scale modelling in computational biomedicine
[14] modeling dynamic systems with cellular automata, chapter 21
[15] role of atp in influenza virus budding
[16] severe acute respiratory syndrome coronavirus infection of human ciliated airway epithelia: role of ciliated cells in viral spread in the conducting airways of the lungs
[17] a methodology for performing global uncertainty and sensitivity analysis in systems biology
[18] critique of and limitations on the use of expert judgements in accident consequence uncertainty analysis

acknowledgements: this work was made possible by funding from the national institute of allergy and infectious diseases, nih, department of health and human services contract # hhsn272200800060c. thanks to casey long for his work in plating calu-3 cells for sars-cov infection experiments. also thanks to dr. andrea cavazzoni of the unit of experimental oncology (university of parma) for the useful discussions.

key: cord-118731-h5au2h09 authors: adiga, aniruddha; chen, jiangzhuo; marathe, madhav; mortveit, henning; venkatramanan, srinivasan; vullikanti, anil title: data-driven modeling for different stages of pandemic response date: 2020-09-21 journal: nan doi: nan sha: doc_id: 118731 cord_uid: h5au2h09
the covid-19 pandemic has been unprecedented in terms of real-time collection and dissemination of a number of diverse datasets, ranging from disease outcomes, to mobility, behaviors, and socio-economic factors. the data sets have been critical from the perspective of disease modeling and analytics to support policymakers in real-time. in this overview article, we survey the data landscape around covid-19, with a focus on how such datasets have aided modeling and response through different stages so far in the pandemic. we also discuss some of the current challenges and the needs that will arise as we plan our way out of the pandemic. as the sars-cov-2 pandemic has demonstrated, the spread of a highly infectious disease is a complex dynamical process. a large number of factors are at play as infectious diseases spread, including variable individual susceptibility to the pathogen (e.g., by age and health conditions), variable individual behaviors (e.g., compliance with social distancing and the use of masks), differing response strategies implemented by governments (e.g., school and workplace closure policies and criteria for testing), and potential availability of pharmaceutical interventions. 
governments have been forced to respond to the rapidly changing dynamics of the pandemic, and are becoming increasingly reliant on different modeling and analytical techniques to understand, forecast, plan and respond. these include statistical methods and decision support methods using multi-agent models, such as: (i) forecasting epidemic outcomes (e.g., case counts, mortality and hospital demands) using a diverse set of data-driven methods, e.g., arima-type time series forecasting, bayesian techniques and deep learning [1-5]; (ii) disease surveillance [6, 7]; and (iii) counter-factual analysis of epidemics using multi-agent models [8-13]; indeed, the results of [11, 14] were very influential in the early decisions for lockdowns in a number of countries. the specific questions of interest change with the stage of the pandemic. in the pre-pandemic stage, the focus was on understanding how the outbreak started, the epidemic parameters, and the risk of importation to different regions. once outbreaks have started (the acceleration stage), the focus is on determining the growth rates, the differences in spatio-temporal characteristics, and testing bias. in the mitigation stage, the questions are focused on non-prophylactic interventions, such as school and workplace closures and other social-distancing strategies, determining the demand for healthcare resources, and testing and tracing. in the suppression stage, the focus shifts to using prophylactic interventions, combined with better tracing. these phases are not linear, and overlap with each other. for instance, the acceleration and mitigation stages of the pandemic might overlap spatially, temporally, as well as within certain social groups. different kinds of models are appropriate at different stages, and for addressing different kinds of questions. for instance, statistical and machine learning models are very useful in forecasting and short-term projections.
however, they are not very effective for longer-term projections, understanding the effects of different kinds of interventions, and counter-factual analysis. mechanistic models are very useful for such questions. simple compartmental models and their extensions, namely structured metapopulation models, are useful for several population-level questions. however, once the outbreak has spread and complex individual and community level behaviors are at play, multi-agent models are most effective, since they allow a more systematic representation of complex social interactions, individual and collective behavioral adaptation, and public policies. as with any mathematical modeling effort, data plays a big role in the utility of such models. until recently, data on infectious diseases were very hard to obtain due to various issues, such as the privacy and sensitivity of the data (since it is information about individual health) and the logistics of collecting such data. the data landscape during the sars-cov-2 pandemic has been very different: a large number of datasets are becoming available, ranging from disease outcomes (e.g., time series of the number of confirmed cases, deaths, and hospitalizations), some characteristics of their locations and demographics, healthcare infrastructure capacity (e.g., number of icu beds, number of healthcare personnel, and ventilators), and various kinds of behaviors (e.g., level of social distancing, usage of ppes); see [15-17] for comprehensive surveys of available datasets. however, using these datasets for developing good models and addressing important public health questions remains challenging. the goal of this article is to use the widely accepted stages of a pandemic as a guiding framework to highlight a few important problems that require attention in each of these stages.
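the simple compartmental models mentioned above can be illustrated with a minimal seir integrator. this is a generic sketch, not any model from the text; the parameter values (beta, sigma, gamma) are illustrative placeholders, not covid-19 estimates.

```python
def seir_step(s, e, i, r, beta, sigma, gamma, dt=0.1):
    """One Euler step of the standard SEIR model: transmission rate
    beta, incubation rate sigma (1/latent period), recovery rate
    gamma; compartments are population fractions summing to 1."""
    n = s + e + i + r
    new_exposed = beta * s * i / n
    ds = -new_exposed
    de = new_exposed - sigma * e
    di = sigma * e - gamma * i
    dr = gamma * i
    return s + ds * dt, e + de * dt, i + di * dt, r + dr * dt


def simulate_seir(days, beta=0.4, sigma=1 / 5.0, gamma=1 / 7.0,
                  i0=1e-4, dt=0.1):
    """Integrate the SEIR model from a small seed infection; returns
    the final (s, e, i, r) state."""
    s, e, i, r = 1.0 - i0, 0.0, i0, 0.0
    for _ in range(int(days / dt)):
        s, e, i, r = seir_step(s, e, i, r, beta, sigma, gamma, dt)
    return s, e, i, r
```

with these placeholder rates r0 = beta/gamma = 2.8, so an epidemic takes off and most of the population is eventually in the removed compartment.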
we will aim to provide a succinct model-agnostic formulation while identifying the key datasets needed, how they can be used, and the challenges arising in that process. we will also use sars-cov-2 as a case study unfolding in real time, and highlight some interesting peer-reviewed and preprint literature that pertains to each of these problems. an important point to note is the necessity of randomly sampled data, e.g., data needed to assess the number of active cases and the demographics of affected individuals. the census provides an excellent precedent: random sampling is the only way one can develop rigorous estimates of various epidemiologically relevant quantities. there have been numerous surveys of the different types of datasets available for sars-cov-2, e.g., [15-18], as well as of different kinds of modeling approaches. however, they do not describe how these models become relevant through the phases of pandemic response. an earlier attempt to summarize such response-driven modeling efforts, based on the 2009 h1n1 experience, can be found in [19]; this paper builds on their work and discusses these phases in the present context of the sars-cov-2 pandemic. although the paper touches upon different aspects of model-based decision making, we refer the readers to a companion article in the same special issue [20] for a focused review of models used for projection and forecasting. multiple organizations, including cdc and who, have their own frameworks for preparing for and planning the response to a pandemic. for instance, the pandemic intervals framework from cdc 1 describes the stages in the context of an influenza pandemic; these are illustrated in figure 1. these six stages span investigation, recognition and initiation in the early phase, followed by most of the disease spread occurring during the acceleration and deceleration stages. they also provide indicators for identifying when the pandemic has progressed from one stage to the next [21].
as envisioned, risk evaluation (i.e., using tools like the influenza risk assessment tool (irat) and the pandemic severity assessment framework (psaf)) and early case identification characterize the first three stages, while non-pharmaceutical interventions (npis) and available therapeutics become central to the acceleration stage. (figure 1: cdc pandemic intervals framework and who phases for influenza pandemic.) the deceleration is facilitated by mass vaccination programs, exhaustion of the susceptible population, or unsuitability of environmental conditions (such as weather). a similar framework is laid out in who's pandemic continuum 2 and phases of pandemic alert 3. while such frameworks aid in streamlining the response efforts of these organizations, they also enable effective messaging. to the best of our knowledge, there has not been a similar characterization of the mathematical modeling efforts that go hand in hand with supporting the response. for summarizing the key models, we consider four of the stages of pandemic response mentioned in section 2: pre-pandemic, acceleration, mitigation and suppression. here we provide the key problems in each stage, the datasets needed, the main tools and techniques used, and the pertinent challenges. we structure our discussion based on our experience with modeling the spread of covid-19 in the us, done in collaboration with local and federal agencies. • acceleration (section 5): this stage is relevant once the epidemic takes root within a country. there is usually a big lag in surveillance and response efforts, and the key questions are to model spread patterns at different spatio-temporal scales, and to derive short-term forecasts and projections. a broad class of datasets is used for developing models, including mobility, populations, land-use, and activities. these are combined with various kinds of time series data and covariates such as weather for forecasting.
• mitigation (section 6): in this stage, different interventions, which are mostly non-pharmaceutical in the case of a novel pathogen, are implemented by government agencies, once the outbreak has taken hold within the population. this stage involves understanding the impact of interventions on case counts and health infrastructure demands, taking individual behaviors into account. the additional datasets needed in this stage include those on behavioral changes and hospital capacities. • suppression (section 7): this stage involves designing methods to control the outbreak by contact tracing & isolation and vaccination. data on contact tracing, associated biases, vaccine production schedules, and compliance & hesitancy are needed in this stage. figure 2 gives an overview of this framework and summarizes the data needs in these stages. these stages also align well with the focus of the various modeling working groups organized by cdc which include epidemic parameter estimation, international spread risk, sub-national spread forecasting, impact of interventions, healthcare systems, and university modeling. in reality, one should note that these stages may overlap, and may vary based on geographical factors and response efforts. moreover, specific problems can be approached prospectively in earlier stages, or retrospectively during later stages. this framework is thus meant to be more conceptual than interpreted along a linear timeline. results from such stages are very useful for policymakers to guide real-time response. consider a novel pathogen emerging in human populations that is detected through early cases involving unusual symptoms or unknown etiology. such outbreaks are characterized by some kind of spillover event, mostly through zoonotic means, like in the case of covid-19 or past influenza pandemics (e.g., swine flu and avian flu). 
a similar scenario can occur when an incidence of a well-documented disease with no known vaccine or therapeutics emerges in some part of the world, causing severe outcomes or fatalities (e.g., ebola and zika). regardless of the development status of the country where the pathogen emerged, such outbreaks now carry the risk of causing a worldwide pandemic due to the global connectivity induced by human travel. two questions become relevant at this stage: what are the epidemiological attributes of this disease, and what are the risks of importation to a different country? while the first question involves biological and clinical investigations, the latter is more related to societal and environmental factors. one of the crucial tasks during early disease investigation is to ascertain the transmissibility and severity of the disease. these are important dimensions along which the pandemic potential is characterized, because together they determine the overall disease burden, as demonstrated within the pandemic severity assessment framework [22]. in addition to risk assessment for right-sizing the response, they are integral to developing meaningful disease models. formulation: let θ = {θ_t, θ_s} represent the transmission and severity parameters of interest. they can be further subdivided into sojourn time parameters θ_δ and transition probability parameters θ_p. here θ corresponds to a continuous-time markov chain (ctmc) on the disease states. the problem formulation can be represented as follows: given π(θ), the prior distribution on the disease parameters, and a dataset d, estimate the posterior distribution p(θ|d) over all possible values of θ. in a model-specific form, this can be expressed as p(θ|d, m), where m is a statistical, compartmental or agent-based disease model. in order to estimate the disease parameters sufficiently, a line list of individual confirmed cases is ideal.
Such datasets contain, for each record, the date of confirmation, possible date of onset, severity (hospitalization/ICU) status, and date of recovery/discharge/death. Furthermore, age and demographic/comorbidity information allows the development of models that are age- and risk-group stratified. One such crowdsourced line list was compiled during the early stages of COVID-19 [24] and later released by the CDC for US cases [25]. Data from detailed clinical investigations in other countries, such as China, South Korea, and Singapore, were also used to parameterize these models [26]. In the absence of such datasets, past parameter estimates for similar diseases (e.g., SARS, MERS) were used for early analyses.

Modeling approaches. For a model-agnostic approach, the delays and probabilities are obtained by various techniques, including Bayesian and ordinary least squares fitting to various delay distributions. For a particular disease model, they are estimated through model calibration techniques such as MCMC and particle filtering approaches. A summary of community estimates of various disease parameters is provided at https://github.com/midas-network/covid-19. Furthermore, such estimates allow the design of pandemic planning scenarios varying in levels of impact, as seen on the CDC scenarios page 4. See [27] [28] [29] for methods and results related to estimating COVID-19 disease parameters from real data. Current models use a large set of disease parameters for modeling COVID-19 dynamics; these can be broadly classified as transmission parameters and hospital resource parameters. For instance, in our work we currently use the parameters (with explanations) shown in Table 1.

Challenges. Often these parameters are model specific, and hence one needs to be careful when reusing parameter estimates from the literature.
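To make the Bayesian estimation p(θ|D) above concrete, here is a minimal sketch that grid-approximates the posterior mean of a single severity parameter (a case fatality ratio) under a uniform prior. The case and death counts are hypothetical; a real analysis would use a line list and a full CTMC or compartmental likelihood rather than a single binomial parameter.

```python
import math

# Hypothetical counts, not real surveillance data.
DEATHS, CASES = 14, 1000

def log_binom_lik(theta, deaths, cases):
    # Binomial log-likelihood (up to a constant) for severity parameter theta.
    return deaths * math.log(theta) + (cases - deaths) * math.log(1.0 - theta)

def posterior_mean(deaths, cases, grid_size=10_000):
    # Grid approximation of p(theta | D) under a uniform prior on (0, 1).
    thetas = [(i + 0.5) / grid_size for i in range(grid_size)]
    logw = [log_binom_lik(t, deaths, cases) for t in thetas]
    m = max(logw)                       # stabilize the exponentiation
    w = [math.exp(l - m) for l in logw]
    return sum(t * wi for t, wi in zip(thetas, w)) / sum(w)

cfr_mean = posterior_mean(DEATHS, CASES)
```

With a uniform prior this matches the conjugate Beta result, mean (d+1)/(n+2); the grid version is shown because it generalizes directly to non-conjugate likelihoods.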
They are related to, but not identifiable with respect to, population-level measures such as the basic reproductive number R_0 (or the effective reproductive number R_eff) and the doubling time, which allow tracking the rate of epidemic growth. The estimation is also hindered by inherent biases in the case ascertainment rate, reporting delays, and other gaps in the surveillance system. Aligning different data streams (e.g., outpatient surveillance, hospitalization rates, mortality records) is in itself challenging.

When a disease outbreak occurs in some part of the world, it is imperative for most countries to estimate their risk of importation through spatial proximity or international travel. Such measures are incredibly valuable in setting a timeline for preparation efforts and initiating health checks at the borders. Over centuries, pandemics have spread faster and faster across the globe, making it all the more important to characterize this risk as early as possible.

Formulation. Let C be the set of countries, and G = (C, E) an international network, where edges (often weighted and directed) in E represent some notion of connectivity. The importation risk problem can be formulated as follows: given c_o ∈ C, the country of origin with an initial case at time 0, and c_i, the country of interest, use G to estimate the expected time t_i for the first case to arrive in country c_i. In its probabilistic form, the same can be expressed as estimating the probability P_i(t) of seeing the first case in country c_i by time t.

Data needs. Assuming we have initial case reports from the origin country, the first dataset needed is a network that connects the countries of the world to represent human travel. The most common source of such information is airline network datasets, from sources such as IATA, OAG, and OpenFlights; [30] provides a systematic review of how airline passenger data has been used for infectious disease modeling.
These datasets may capture either static measures, such as the number of seats available or flight schedules, or a dynamic count of passengers per month along each itinerary. Since the latter has intrinsic delays in collection and reporting, it may not be representative during an ongoing pandemic. During such times, data on ongoing travel restrictions [31] becomes important to incorporate. Multi-modal traffic is also important to incorporate for countries that share land borders or have heavy maritime traffic. For diseases such as Zika, where establishment risk is more relevant, data on vector abundance or prevailing weather conditions are appropriate.

Modeling approaches. Simple structural measures on networks (such as degree or PageRank) can provide static indicators of the vulnerability of countries. By transforming the weighted, directed edges into probabilities, one can use simple contagion models (e.g., independent cascades) to simulate disease spread and empirically estimate the expected time of arrival. Global metapopulation models (GLEAM) that combine SEIR-type dynamics with an airline network have also been used in the past for estimating importation risk. Brockmann and Helbing [32] used a similar framework to quantify an effective distance on the network, which appeared to be well correlated with the time of arrival for multiple past pandemics; this has been extended to COVID-19 [8, 33]. In [34], the authors employ air travel volume obtained through IATA from ten major cities across China to rank various countries, along with the IDVI, to convey their vulnerability. The authors of [35] consider the task of forecasting the international and domestic spread of COVID-19 and employ Official Airline Group (OAG) data for determining air traffic to various countries, and [36] fits a generalized linear model for the observed number of cases in various countries as a function of air traffic volume obtained from OAG data, to determine countries with potential risk of under-detection.
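The effective-distance idea of Brockmann and Helbing [32] can be sketched in a few lines: each directed edge with flux fraction P_mn gets length 1 - ln(P_mn), so high-traffic routes are "short", and the effective distance to a country is the shortest such path from the origin. The three-node network below is hypothetical.

```python
import heapq
import math

# Hypothetical flux fractions P[m][n]: share of travelers leaving m who go to n.
P = {
    "A": {"B": 0.7, "C": 0.3},
    "B": {"C": 0.9, "A": 0.1},
    "C": {"A": 0.5, "B": 0.5},
}

def effective_distance(P, origin):
    # Each edge gets length 1 - ln(P_mn) > 0; the effective distance to a node
    # is the shortest such path from the origin (Dijkstra).
    dist = {origin: 0.0}
    pq = [(0.0, origin)]
    while pq:
        d, m = heapq.heappop(pq)
        if d > dist.get(m, math.inf):
            continue
        for n, p in P.get(m, {}).items():
            nd = d + 1.0 - math.log(p)
            if nd < dist.get(n, math.inf):
                dist[n] = nd
                heapq.heappush(pq, (nd, n))
    return dist

dist = effective_distance(P, "A")
```

Here the heavily trafficked route A→B ends up effectively closer than A→C, which is the qualitative property that correlated with observed arrival times.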
Also, [37] provides an Africa-specific case study of vulnerability and preparedness using data from the Civil Aviation Administration of China.

Challenges. Note that the arrival of an infected traveler will precede a local transmission event in a country; hence the former is more appropriate to quantify in the early stages. Also, the formulation is agnostic to whether it is the first infected arrival or the first detected case. However, in the real world, the former is difficult to observe, while the latter is influenced by security measures at ports of entry (land, sea, air) and the ease of identification for the pathogen. For instance, in the case of COVID-19, the long incubation period and the high likelihood of asymptomatic infection could have resulted in many infected travelers being missed by health checks at ports of entry. We also noticed potential administrative delays in reporting by multiple countries fearing travel restrictions.

As the epidemic takes root within a country, it may enter the acceleration phase. Depending on the testing infrastructure and the agility of the surveillance system, response efforts might lag or lead the rapid growth in the case rate. Under such a scenario, two crucial questions emerge, pertaining to how the disease may spread spatially and socially, and how the case rate may grow over time. Within the country, there is a need to model the spatial spread of the disease at different scales: state, county, and community levels. Similar to importation risk, such models may provide an estimate of when cases may emerge in different parts of the country. When coupled with vulnerability indicators (socioeconomic, demographic, co-morbidities), they provide a framework for assessing the heterogeneous impact the disease may have across the country. Detailed agent-based models for urban centers may help identify hotspots and potential case clusters that may emerge (e.g., correctional facilities, nursing homes, and food processing plants in the case of COVID-19).
Formulation. Given a population representation P at the appropriate scale and a disease model M per entity (individual or sub-region), model the disease spread under different assumptions about the underlying connectivity C and disease parameters θ. The result is a spatio-temporal spread model that produces Z_{s,t}, the time series of disease states over time for region s.

Data needs. Some of the common datasets needed by most modeling approaches include: (1) social and spatial representation, which includes census and population data, available from census departments (see, e.g., [38]) and LandScan [39]; (2) connectivity between regions (commuter, airline, road/rail/river), e.g., [30, 31]; (3) data on locations, including points of interest, e.g., OpenStreetMap [40]; and (4) activity data, e.g., the American Time Use Survey [41]. These datasets help capture where people reside, how they move around, and how they come into contact with each other. While some of these are static, more dynamic measures, such as GPS traces, become relevant as individuals change their behavior during a pandemic.

Modeling approaches. Different kinds of structured metapopulation models [8, 42-45] and agent-based models [46-50] have been used in the past to model sub-national spread; we refer to [13, 51, 52] for surveys of different modeling approaches. These models incorporate typical mixing patterns, which result from detailed activities and co-location (in the case of agent-based models) and from different modes of travel and commuting (in the case of metapopulation models).

Challenges. While metapopulation models can be built relatively rapidly, agent-based models are much harder: the datasets need to be assembled at a large scale, with detailed construction pipelines; see, e.g., [46-50].
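The structured metapopulation idea can be made concrete with a minimal two-patch SIR sketch: residents of each patch split their contacts across patches according to a mixing matrix. All numbers (populations, mixing fractions, rates) are hypothetical; real models use many patches and calibrated commuter/airline flows.

```python
# Two patches: patch 0 seeds the outbreak; mix[i][j] is the (hypothetical)
# fraction of patch i's contacts that occur in patch j.
def simulate(days=200, dt=0.1, beta=0.3, gamma=0.1):
    N = [1_000_000.0, 500_000.0]
    S = [N[0] - 10.0, N[1]]
    I = [10.0, 0.0]
    R = [0.0, 0.0]
    mix = [[0.95, 0.05], [0.10, 0.90]]
    for _ in range(int(days / dt)):
        lam = []
        for i in range(2):
            # Force of infection on residents of patch i, accumulated over the
            # locations j they visit, from everyone present at location j.
            f = 0.0
            for j in range(2):
                prev = (sum(mix[k][j] * I[k] for k in range(2)) /
                        sum(mix[k][j] * N[k] for k in range(2)))
                f += mix[i][j] * beta * prev
            lam.append(f)
        for i in range(2):
            new_inf = lam[i] * S[i] * dt
            rec = gamma * I[i] * dt
            S[i] -= new_inf
            I[i] += new_inf - rec
            R[i] += rec
    return S, I, R

S, I, R = simulate()
```

Even weak coupling (5-10% of contacts) is enough for the outbreak to spill from patch 0 into patch 1, which is the mechanism behind spatial arrival-time estimates.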
Since detailed individual activities drive the dynamics in agent-based models, schools and workplaces have to be modeled in order to make predictions meaningful. Such models will be reused at different stages of the outbreak, so they need to be generic enough to incorporate dynamically evolving disease information. Finally, a common challenge across modeling paradigms is the ability to calibrate to the dynamically evolving spatio-temporal data from the outbreak; this is especially challenging in the presence of reporting biases and data insufficiency issues.

Given the early growth of cases within the country (or sub-region), there is a need to quantify the rate of increase in comparable terms across the duration of the outbreak (accounting for the exponential nature of such processes). These estimates also serve as references when evaluating the impact of various interventions. As an extension, such methods and more sophisticated time series methods can be used to produce short-term forecasts of disease evolution.

Formulation. Given the disease time series data within the country, Z_{s,t}, up to a data horizon T, provide scale-independent growth rate measures G_s(T) and forecasts Ẑ_{s,u} for u ∈ [T, T + ΔT], where ΔT is the forecast horizon.

Data needs. Models at this stage require datasets such as: (1) time series data on different kinds of disease outcomes, including case counts, mortality, and hospitalizations, along with attributes such as age, gender, and location, e.g., [53-57]; (2) any associated data on reporting bias (total tests, test positivity rate) [58], which need to be incorporated into the models, as these biases can have a significant impact on the dynamics; and (3) exogenous regressors (mobility, weather), which have been shown to have a significant impact on other diseases, such as influenza, e.g., [59].
Modeling approaches. Even before building statistical or mechanistic time series forecasting methods, one can derive insights through analytical measures of the time series data. For instance, the effective reproductive number, estimated from the time series [60], can serve as a scale-independent metric to compare outbreaks across space and time. Additionally, multiple statistical methods, ranging from autoregressive models to deep learning techniques, can be applied to the time series data, with additional exogenous variables as input. While such methods perform reasonably for short-term targets, mechanistic approaches as described earlier can provide better long-term projections. Various ensembling techniques have also been developed in the recent past to combine such multi-model forecasts into a single robust forecast with better uncertainty quantification. One such effort that combines more than 30 methods for COVID-19 can be found at the COVID Forecasting Hub 5. We also point to the companion paper for more details on projection and forecasting models.

Challenges. Data on epidemic outcomes usually have many uncertainties and errors, including missing data, collection bias, and backfill. For forecasting tasks, these time series need to be near real time; otherwise one needs to do nowcasting as well as forecasting. Other exogenous regressors can provide valuable lead time, due to inherent delays in disease dynamics from exposure to case identification. Such frameworks need to be generalized to accommodate qualitative inputs on future policies (shutdowns, mask mandates, etc.), as well as behaviors, as we discuss in the next section.

Once the outbreak has taken hold within the population, local, state, and national governments attempt to mitigate and control its spread by considering different kinds of interventions. Unfortunately, as the COVID-19 pandemic has shown, there is often a significant delay in the time taken by governments to respond.
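Returning to the growth-rate measures above: the effective reproductive number can be estimated from incidence via the renewal equation, R_t = I_t / Σ_s w_s I_{t-s}, which is the idea behind EpiEstim-style estimators [60]. A minimal point-estimate sketch (real estimators add a Bayesian smoothing window) with a synthetic incidence series and a hypothetical serial-interval distribution:

```python
# w[s] is the probability that the infector-infectee interval is s days
# (lags 1..3 here); both the incidence and w are synthetic.
def r_eff(incidence, w):
    out = {}
    for t in range(1, len(incidence)):
        denom = sum(w[s] * incidence[t - s]
                    for s in range(1, min(t, len(w) - 1) + 1))
        if denom > 0:
            out[t] = incidence[t] / denom
    return out

inc = [round(10 * 1.2 ** t) for t in range(20)]  # 20% daily growth
w = [0.0, 0.2, 0.5, 0.3]
R = r_eff(inc, w)
```

For steady geometric growth at rate g, the estimate converges to 1 / Σ_s w_s g^{-s} (about 1.45 here), illustrating why R_t is a scale-independent summary of the same growth seen in raw counts.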
As a result, this delay has caused a large number of cases, a fraction of which lead to hospitalizations. Two key questions in this stage are: (1) how to evaluate different kinds of interventions and choose the most effective ones, and (2) how to estimate the healthcare infrastructure demand and how to mitigate it. The effectiveness of an intervention (e.g., social distancing) depends on how individuals respond to it and on the level of compliance. The health resource demand depends on the specific interventions which are implemented. As a result, these two questions are connected and require models which incorporate appropriate behavioral responses. In the initial stages, only non-prophylactic interventions are available, such as social distancing, school and workplace closures, and the use of PPE, since no vaccinations or antivirals are available. As mentioned above, such analyses are almost entirely model based, and the specific model depends on the nature of the intervention and the population being studied.

Formulation. Given a model, denoted abstractly as M, the general goals are: (1) to evaluate the impact of an intervention (e.g., school and workplace closures, and other social distancing strategies) on different epidemic outcomes (e.g., average outbreak size, peak size, and time to peak), and (2) to find the most effective intervention from a suite of interventions, under given resource constraints. The specific formulation depends crucially on the model and the type of intervention. Even for a single intervention, evaluating its impact is quite challenging, since there are a number of sources of uncertainty and a number of parameters associated with the intervention (e.g., when to start school closure, for how long, and how to restart). Therefore, finding uncertainty bounds is a key part of the problem.
Data needs. While all the data needs from the previous stages for developing a model remain, the representation of different kinds of behaviors is a crucial component of the models in this stage; this includes the use of PPE, compliance with social distancing measures, and the level of mobility. Statistics on such behaviors are available at a fairly detailed level (e.g., by county and by day) from multiple sources, such as: (1) the COVID-19 Impact Analysis Platform from the University of Maryland [56], which gives metrics related to social distancing activities, including the level of staying home, outside-county trips, and outside-state trips; (2) changes in mobility associated with different kinds of activities from Google [61] and other sources; and (3) survey data on different kinds of behaviors, such as the usage of masks [62].

Modeling approaches. As mentioned above, such analyses are almost entirely model based, including structured metapopulation models [8, 42-45] and agent-based models [46-50]. Different kinds of behaviors relevant to such interventions, including compliance with using PPE and compliance with social distancing guidelines, need to be incorporated into these models. Since there is a great deal of heterogeneity in such behaviors, it is conceptually easiest to incorporate them into agent-based models, since individual agents are represented. However, the calibration, simulation, and analysis of such models pose significant computational challenges. On the other hand, the simulation of metapopulation models is much easier, but such behaviors cannot be directly represented; instead, modelers have to estimate the effect of different behaviors on the disease model parameters, which can pose modeling challenges.

Challenges. There are a number of challenges in using data on behaviors, depending on the specific datasets.
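A minimal version of the intervention evaluation described above: in a compartmental model, a distancing measure can be encoded as a drop in the contact rate β at some start date, and its impact read off as the reduction in peak prevalence. The SEIR parameters and intervention schedule below are hypothetical, not calibrated estimates.

```python
# Euler-integrated SEIR; beta_fn(t) lets us encode an intervention as a
# time-dependent contact rate. All parameter values are hypothetical.
def seir_peak(beta_fn, days=300, dt=0.05, N=1_000_000.0):
    S, E, I, R = N - 1.0, 0.0, 1.0, 0.0
    sigma, gamma = 1 / 5.0, 1 / 7.0  # 5-day latency, 7-day infectious period
    peak, t = 0.0, 0.0
    for _ in range(int(days / dt)):
        new_e = beta_fn(t) * S * I / N * dt
        new_i = sigma * E * dt
        new_r = gamma * I * dt
        S -= new_e
        E += new_e - new_i
        I += new_i - new_r
        R += new_r
        peak = max(peak, I)
        t += dt
    return peak

baseline = seir_peak(lambda t: 0.4)                       # no intervention
distanced = seir_peak(lambda t: 0.4 if t < 40 else 0.15)  # distancing at day 40
```

Sweeping the start day and the reduced β, and re-running under sampled parameters, is a simple way to obtain the uncertainty bounds the formulation calls for.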
Much of the data available for COVID-19 is estimated through indirect sources, e.g., through cell phone and online activities and crowd-sourced platforms. This can provide large spatio-temporal datasets, but these have unknown biases and uncertainties. On the other hand, survey data is often more reliable and provides several covariates, but it is typically very sparse. Handling such uncertainties, performing rigorous sensitivity analysis, and incorporating the uncertainties into the analysis of the simulation outputs are important steps for modelers.

The COVID-19 pandemic has led to a significant increase in hospitalizations. Hospitals are typically optimized to run near capacity, so there have been fears that hospital capacities would not be adequate, especially in several countries in Asia, but also in some regions of the US. Nosocomial transmission could further increase this burden.

Formulation. The overall problem is to estimate the demand for hospital resources within a population; this includes the number of hospitalizations and more refined types of resources, such as ICUs, CCUs, medical personnel, and equipment such as ventilators. An important issue is whether the capacity of hospitals within the region would be overrun by the demand, when this is expected to happen, and how to design strategies to meet the demand, whether by augmenting the capacities at existing hospitals or by building new facilities. Timing is of the essence, and projections of when demand will exceed capacity are important for governments to plan.

The demand for hospitalization and other health resources can be estimated from the epidemic models mentioned earlier by incorporating suitable health states, e.g., [43, 63]; in addition to the inputs needed for setting up the models for case counts, datasets are needed for hospitalization rates and the durations of hospital stay, ICU care, and ventilation.
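Given projected admissions and a length-of-stay distribution, bed demand follows from a simple convolution: a patient admitted s days ago still occupies a bed with probability P(stay > s). The sketch below uses a hypothetical constant admission rate and a uniform 1-10 day stay; real projections would feed in model-based admission forecasts and empirical stay distributions.

```python
# los_pmf[k] = probability that the total length of stay is k days.
def bed_occupancy(admissions, los_pmf):
    # surv[s] = P(stay > s): a patient admitted s days ago is still in a bed.
    surv = [sum(los_pmf[s + 1:]) for s in range(len(los_pmf))]
    occ = []
    for t in range(len(admissions)):
        occ.append(sum(admissions[t - s] * surv[s]
                       for s in range(min(t + 1, len(surv)))))
    return occ

los = [0.0] + [0.1] * 10   # stays of 1..10 days, equally likely (mean 5.5)
adm = [10.0] * 30          # constant 10 admissions per day
occ = bed_occupancy(adm, los)
```

At steady state, occupancy equals admissions per day times the mean stay (Little's law), here 10 × 5.5 = 55 beds; comparing such a curve against regional capacity gives the deficit timing discussed above.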
The other important inputs for this component are hospital capacity and the referral regions (which represent where patients travel for hospitalization). Different public and commercial datasets provide such information, e.g., [64, 65].

Modeling approaches. Demand for health resources is typically incorporated into both metapopulation and agent-based models by having a fraction of the infectious individuals transition into a hospitalization state. An important issue to consider is what happens if there is a shortage of hospital capacity. Studying this requires modeling the hospital infrastructure, i.e., the different kinds of hospitals within the region and which hospital a patient goes to. There is typically limited data on this, and data on hospital referral regions, or a Voronoi tessellation, can be used. Understanding the regimes in which hospital demand exceeds capacity is an important question to study. Nosocomial transmission is typically much harder to study, since it requires more detailed modeling of processes within hospitals.

Challenges. There is a lot of uncertainty and variability in all the datasets involved in this process, making modeling difficult. For instance, forecasts of the number of cases and hospitalizations have huge uncertainty bounds over medium- or long-term horizons, which is exactly the kind of input necessary for understanding hospital demand and whether there would be any deficits.

The suppression stage involves methods to control the outbreak, including reducing the incidence rate and potentially leading to the eventual eradication of the disease. Eradication in the case of COVID-19 appears unlikely as of now; what is more likely is that it will become part of the seasonal human coronaviruses and will mutate continuously, much like the influenza virus.

The contact tracing problem refers to the ability to trace the neighbors of an infected individual.
Ideally, if one is successful, each neighbor of an infected individual would be identified and isolated from the larger population to reduce the growth of the pandemic. In some cases, each such neighbor could be tested to see if the individual has contracted the disease. Contact tracing is a workhorse of epidemiology and has been immensely successful in controlling slow-moving diseases. When combined with vaccination and other pharmaceutical interventions, it provides the best way to control and suppress an epidemic.

Formulation. The basic contact tracing problem is stated as follows: given a social contact network G(V, E), a subset of nodes S ⊂ V that are infected, and a subset S_1 ⊂ S of nodes identified as infected, find all neighbors of S. Here a neighbor means an individual who is likely to have had substantial contact with the infected person. One then tests them (if tests are available) and, following that, isolates these neighbors, vaccinates them, or administers antivirals. The measures of effectiveness for the problem include: (i) maximizing the size of S_1; (ii) maximizing the size of the set N(S_1) ⊆ N(S), i.e., the potential number of neighbors of the set S_1; (iii) doing this within a short period of time, so that these neighbors either do not become infectious, or the number of days they are infectious while still interacting in the community in a normal manner is minimized; (iv) eventually, reducing the incidence rate in the community: thus, if all the neighbors of S_1 cannot be identified, one aims to identify those individuals who, when isolated or treated, lead to a large impact; and (v) finally, verifying that these individuals indeed came in contact with the infected individuals and thus can be asked to isolate or be treated.

Data needs. Data needed for the contact tracing problem includes: (i) a line list of individuals who are currently known to be infected (this is needed in the case of human-based contact tracing).
In the real world, when deploying human contact tracers, one interviews all the individuals who are known to be infectious and reaches out to their contacts.

Modeling approaches. Human contact tracing is routinely done in epidemiology, and most states in the US have hired such contact tracers. They obtain the daily incidence report from the state health departments and then proceed to contact the individuals who are confirmed to be infected. Earlier, human contact tracers used to go from house to house and identify potential neighbors through a well-defined interview process. Although very effective, this is time consuming and labor intensive. Phones have been used extensively in the last 10-20 years, as they allow contact tracers to reach individuals more easily; they are helpful, but have the downside that it might be hard to reach all individuals. During the COVID-19 outbreak, for the first time, societies and governments have considered and deployed digital contact tracing tools [66-70]. These can be quite effective but also have certain weaknesses, including privacy, accuracy, and the limited market penetration of the digital apps.

Challenges. These include: (i) the inability to identify everyone who is infectious (the set S); this is virtually impossible for a COVID-19-like disease, unless the incidence rate has come down drastically, because many individuals are infected but asymptomatic; and (ii) identifying all contacts of S (or S_1); this is hard, since individuals cannot recall everyone they met, and certain people they were in close proximity to, for example in stores or at social events, are not known to the individuals in the set S. Furthermore, even if a person is able to identify their contacts, it is often hard to reach all of these individuals due to resource constraints (each human tracer can only contact a small number of individuals).
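The core graph operation in the formulation above, enumerating N(S_1) and prioritizing contacts by exposure, fits in a few lines. The contact network and identified-case set below are toy examples; real deployments work from interview data or app co-location logs, not an explicit graph.

```python
from collections import defaultdict

# Toy contact network; S1 is the subset of infected individuals identified so far.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]
adj = defaultdict(set)
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

def contacts_to_trace(adj, s1):
    # Count, for each non-case contact, how many identified cases they touch,
    # and trace the highest-exposure contacts first (cf. goal (iv) above).
    counts = defaultdict(int)
    for u in s1:
        for v in adj[u]:
            if v not in s1:
                counts[v] += 1
    return sorted(counts, key=lambda v: -counts[v])

order = contacts_to_trace(adj, {"a", "d"})
```

Here "c" is contacted first because it neighbors both identified cases, a simple proxy for the large-impact individuals the formulation asks tracers to prioritize under resource constraints.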
The overall goal of the vaccine allocation problem is to allocate vaccines efficiently and in a timely manner to reduce the overall burden of the pandemic.

Formulation. The basic version of the problem can be cast in a very simple manner (for networked models): given a graph G(V, E) and a budget B on the number of vaccines available, find a set S of size B to vaccinate so as to optimize a certain measure of effectiveness. The measure of effectiveness can involve: (i) minimizing the total number of individuals infected (or maximizing the total number of uninfected individuals); (ii) minimizing the total number of deaths (or maximizing the total number of deaths averted); (iii) optimizing the above quantities while keeping in mind certain equity and fairness criteria (across socio-demographic groups, e.g., age, race, income); (iv) taking into account the vaccine hesitancy of individuals; (v) taking into account the fact that vaccines are not all available at the start of the pandemic, and that when they do become available, one gets a limited number of doses each month; (vi) deciding how to share the stockpile between countries, states, and other organizations; and (vii) taking into account the efficacy of the vaccine.

Data needs. As in other problems, vaccine allocation problems need as input a good representation of the system; network-based, metapopulation-based, and compartmental mass-action models can be used. One other key input is the vaccine budget, i.e., the production schedule and timeline, which serves as the constraint for the allocation problem. Additional data on prevailing vaccine sentiment and past compliance with seasonal/neonatal vaccinations are useful for estimating coverage.
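A toy instance of the networked formulation above, using the simplest allocation heuristic from the network science literature: spend the budget B on the highest-degree nodes. The graph is hypothetical, and this is only a baseline; the optimal set generally depends on the dynamics and is found via simulation-based optimization.

```python
# Degree-based heuristic: vaccinate the B highest-degree nodes.
def allocate(adj, budget):
    return set(sorted(adj, key=lambda v: -len(adj[v]))[:budget])

adj = {
    "hub": {"a", "b", "c", "d"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub", "d"},
    "d": {"hub", "c"},
}
chosen = allocate(adj, 1)
```

On this star-like graph a budget of one vaccine goes to the hub, which disconnects most transmission paths; fairness constraints (iii) would be added as restrictions on which sets S are admissible.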
Modeling approaches. The problem has been studied actively in the literature; the network science community has focused on optimal allocation schemes, while the public health community has focused on using metapopulation models and assessing certain fixed allocation schemes based on socio-economic and demographic considerations. Game-theoretic approaches that try to understand the strategic behavior of individuals and organizations have also been studied.

Challenges. The problem is computationally challenging, and thus simulation-based optimization techniques are used most of the time. A challenge to the optimization approach comes from the fact that the optimal allocation scheme might be hard to compute or hard to implement. Other challenges include fairness criteria (e.g., the optimal set might be a specific group) and the multiple objectives that one needs to balance.

While the above sections provide an overview of salient modeling questions that arise during the key stages of a pandemic, mathematical and computational model development is equally if not more important as we approach the post-pandemic (or, more appropriately, inter-pandemic) phase. Often referred to as peacetime efforts, this phase allows modelers to retrospectively assess individual and collective models on how they performed during the pandemic. In order to encourage continued development and to identify data gaps, synthetic forecasting challenge exercises [71] may be conducted, in which multiple modeling groups are invited to forecast synthetic scenarios with varying levels of data availability. Another set of models that are quite relevant for policymakers during the winding-down stages are those that help assess the overall health burden and economic costs of the pandemic.
References

• epideep: exploiting embeddings for epidemic forecasting
• an arima model to forecast the spread and the final size of covid-2019 epidemic in italy (first version on ssrn 31 march)
• real-time epidemic forecasting: challenges and opportunities
• accuracy of real-time multi-model ensemble forecasts for seasonal influenza in the u.s.
• real-time forecasting of infectious disease dynamics with a stochastic semi-mechanistic model
• healthmap
• the use of social media in public health surveillance. western pacific surveillance and response journal (wpsar)
• the effect of travel restrictions on the spread of the 2019 novel coronavirus (covid-19) outbreak
• basic prediction methodology for covid-19: estimation and sensitivity considerations. medrxiv
• covid-19 outbreak on the diamond princess cruise ship: estimating the epidemic potential and effectiveness of public health countermeasures
• impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand. imperial college technical report
• modelling disease outbreaks in realistic urban social networks
• computational epidemiology
• forecasting covid-19 impact on hospital bed-days, icu-days, ventilator-days and deaths by us state in the next 4 months
• open data resources for fighting covid-19
• data-driven methods to monitor, model, forecast and control covid-19 pandemic: leveraging data science, epidemiology and control theory
• covid-19 datasets: a survey and future challenges. medrxiv
• mathematical modeling of epidemic diseases
• the use of mathematical models to inform influenza pandemic preparedness and response
• mathematical models for covid-19 pandemic: a comparative analysis
• updated preparedness and response framework for influenza pandemics
• novel framework for assessing epidemiologic effects of influenza epidemics and pandemics
• covid-19 pandemic planning scenarios
• epidemiological data from the covid-19 outbreak, real-time case information
• covid-19 case surveillance public use data, centers for disease control and prevention
• covid-19 patients' clinical characteristics, discharge rate, and fatality rate of meta-analysis
• estimating the generation interval for coronavirus disease (covid-19) based on symptom onset data
• the incubation period of coronavirus disease 2019 (covid-19) from publicly reported confirmed cases: estimation and application
• estimating clinical severity of covid-19 from the transmission dynamics in wuhan, china
• the use and reporting of airline passenger data for infectious disease modelling: a systematic review
• flight cancellations related to 2019-ncov (covid-19)
• the hidden geometry of complex, network-driven contagion phenomena
• potential for global spread of a novel coronavirus from china
• forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study. the lancet
• using predicted imports of 2019-ncov cases to determine locations that may not be identifying all imported cases. medrxiv
• preparedness and vulnerability of african countries against introductions of 2019-ncov. medrxiv
• creating synthetic baseline populations
• openstreetmap
• american time use survey
• multiscale mobility networks and the spatial spreading of infectious diseases
• optimizing spatial allocation of seasonal influenza vaccine under temporal constraints
• assessing the international spreading risk associated with the 2014 west african ebola outbreak
• spread of zika virus in the americas
• structure of social contact networks and their impact on epidemics
• generation and analysis of large synthetic social contact networks
• modelling disease outbreaks in realistic urban social networks
• containing pandemic influenza at the source
• report 9: impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
• the structure and function of complex networks
• a public data lake for analysis of covid-19 data
• midas network. midas 2019 novel coronavirus repository
• covid-19 data in the united states
• covid-19 impact analysis platform
• covid-19 surveillance dashboard
• the covid tracking project
• absolute humidity and the seasonal onset of influenza in the continental united states
• epiestim: a package to estimate time varying reproduction numbers from epidemic curves. r package version
• google covid-19 community mobility reports
• mask-wearing survey data
• impact of social distancing measures on coronavirus disease healthcare demand, central texas, usa
• current hospital capacity estimates, snapshot
• total hospital bed occupancy
• quantifying the effects of contact tracing, testing, and containment
• covid-19 epidemic in switzerland: on the importance of testing, contact tracing and isolation
• quantifying sars-cov-2 transmission suggests epidemic control with digital contact tracing
• isolation and contact tracing can tip the scale to containment of covid-19 in populations with social distancing. available at ssrn 3562458
• privacy sensitive protocols and mechanisms for mobile contact tracing
• the rapidd ebola forecasting challenge: synthesis and lessons learnt

Acknowledgments.
the authors would like to thank members of the biocomplexity covid-19 response team and network systems science and advanced computing (nssac) division for their thoughtful comments and suggestions related to epidemic modeling and response support. we thank members of the biocomplexity institute and initiative, university of virginia for useful discussions and suggestions. key: cord-175015-d2am45tu authors: moran, rosalyn j.; fagerholm, erik d.; cullen, maell; daunizeau, jean; richardson, mark p.; williams, steven; turkheimer, federico; leech, rob; friston, karl j. title: estimating required 'lockdown' cycles before immunity to sars-cov-2: model-based analyses of susceptible population sizes, 's0', in seven european countries including the uk and ireland date: 2020-04-09 journal: nan doi: nan sha: doc_id: 175015 cord_uid: d2am45tu we used bayesian model inversion to estimate epidemic parameters from the reported case and death rates from seven countries using data from late january 2020 to april 5th 2020. two distinct generative model types were employed: first, a continuous-time dynamical-systems implementation of a susceptible-exposed-infectious-recovered (seir) model and second, a partially observable markov decision process (mdp) or hidden markov model (hmm) implementation of an seir model. both models parameterise the size of the initial susceptible population (s0), as well as epidemic parameters. parameter estimation (data fitting) was performed using a standard bayesian scheme (variational laplace) designed to allow for latent unobservable states and uncertainty in model parameters. both models recapitulated the dynamics of transmissions and disease as given by case and death rates. the peaks of the current waves were predicted to be in the past for four countries (italy, spain, germany and switzerland) and to emerge in 0.5-2 weeks in ireland and 1-3 weeks in the uk.
for france, one model estimated the peak within the past week and the other in the future, in two weeks. crucially, maximum a posteriori (map) estimates of s0 for each country indicated effective population sizes of below 20% (of total population size), under both the continuous-time and hmm models. with a bayesian weighted average across all seven countries and both models, we estimated that 6.4% of the total population would be immune. from the two models the maximum percentage of the effective population was estimated at 19.6% of the total population for the uk, 16.7% for ireland, 11.4% for italy, 12.8% for spain, 18.8% for france, 4.7% for germany and 12.9% for switzerland. our results indicate that after the current wave, a large proportion of the total population will remain without immunity. as of early april 2020, the coronavirus pandemic has reached different epidemic stages across the world. france was the earliest affected country in europe, with its first reported cases on 24th january 2020 (reusken, broberg et al. 2020), with cases reported shortly after in germany, then the united kingdom, italy, spain, switzerland and in ireland on february 29th. subsequently, outbreaks have emerged across the european continent. the daily rates of new confirmed cases of the covid-19 virus (sars-cov-2) have begun to decrease in some of these countries; in particular, in italy and spain, with promising signs that extensive social distancing measures have been effective and that these countries have reached or are past 'the peak' of infections. epidemiological models that predict the progression of populations from susceptible (s) to exposed (e), infected (i) and recovered (r) (seir models (kermack and mckendrick 1927)) can be used to investigate the properties of these peaks, given the initial susceptibility of a population.
for sars-cov-2, no (or limited) immunity can be assumed a priori in humans, and thus the majority of the entire population is deemed susceptible (team 2020). several studies have developed and simulated seir models using epidemic parameters to 'nowcast' and forecast transmission (wu, leung et al. 2020). parameters of the model are being continuously improved and modified, such as reduced serial interval estimates (nishiura, linton et al. 2020, yang, zeng et al. 2020), initially derived from observed cases in the initial outbreak in wuhan, china (sun, chen et al. 2020, wang, liu et al. 2020, wang, wang et al. 2020). in most studies, these compartmental models are applied as dynamic generative (i.e., causal or mechanistic) models that assume a set of parameters and predict cases or clinical resources (moghadas, shoukat et al. 2020) and intervention effects (prem, liu et al. 2020, wells, sah et al. 2020). the (initial) susceptible size of a population (termed 's0') is assumed to be the size of a particular city, e.g. 10 million in wuhan (prem, liu et al. 2020), or, for a country, is assumed to comprise multiples of smaller city-sized outbreaks, e.g. 100k (ferguson, laydon et al. 2020). such models have lent important insight into the likely disease and clinical trajectories of countries as a whole, enabling planning and management for predicted numbers of cases requiring hospitalization and ventilation (moghadas, shoukat et al. 2020). given the lockdowns around europe, which likely averted larger case surges (wang, liu et al. 2020), we sought to investigate the current effective population size in seven european countries.
therefore, we used the seir model to determine the initial population size (s0) that was susceptible (i.e., would eventually become infected) at the beginning of the first wave, and thus to determine the levels of immunity that might exist in these countries after this wave (by assuming the susceptible population will eventually become infected and develop immunity). one approach to performing this inverse modelling is to apply dynamic causal modelling (friston, mattout et al. 2007), enabling the incorporation of prior values for parameters (e.g. the serial interval, incubation period or number of daily contacts) and prior uncertainty about these values. quarantine and social isolation are often explicitly accounted for in seir models (wearing, rohani et al. 2005, feng 2007, ridenhour, kowalik et al. 2018), making them appropriate for the current government-advised social distancing. importantly, two particular forms of this seir model (with social distancing) have recently been developed (friston, parr et al. 2020, moghadas, shoukat et al. 2020) that also account for deaths following hospitalization or treatment via ventilation within an intensive care unit (icu). specifically, they account for a potential time lag between becoming infected and developing acute respiratory distress. this makes these models putatively 'fit for purpose' when using death as well as case reporting data to fit or invert the models to recover (posterior) parameter values and estimate their uncertainty. we aimed to apply these two models to data from seven countries: ireland, the united kingdom, italy, spain, france, germany and switzerland. our goal was to estimate s0. one model (the 'ode model'; see methods and (moghadas, shoukat et al. 2020)) is based on a classical compartment model, where a person in an epidemic can occupy only one compartment or 'state' and moves from state to state: from susceptible to eventually (through intermediate states) either recovery or death.
the other model, the 'hmm model' (friston, parr et al. 2020), features several factors that change together, including where a person is located (out of the home vs. in the home, for example), as well as their infectious, testing and clinical status. we apply both models to daily case and death reports to assess whether there is convergence on estimates of the initially susceptible (i.e., effective) population sizes. data from a repository for the 2019 novel coronavirus at johns hopkins university center for systems science and engineering (jhu csse) were used (dong, du et al. 2020). using these date-stamped entries of daily reported cases and reported deaths, we extracted seven timeseries pairs for the countries (including all territories) of ireland, the united kingdom, italy, spain, germany, france and switzerland. data records from january 22nd to april 5th, 2020 were modelled. for the ordinary differential equation (ode) model, daily cases and daily accumulated deaths were fitted, while for the hmm model, daily cases and daily deaths were fitted, corresponding to the state equations (see (friston, parr et al. 2020)). ode model: a dynamic transmission model comprising a set of 12 coupled ordinary differential equations was adapted from moghadas et al (friston, parr et al. 2020). the original model included 12 states for four different age categories. we simplified the model structure by collapsing across age (see appendix a). the 12 states or compartments in this simplified model (flow function, figure 1) described susceptible (s) individuals who became infected with the disease through exposure (e) to other infected individuals. infected individuals comprised three categories: an asymptomatic or subclinical state (a), a symptomatic state who would not require hospitalization (inh) and a symptomatic state who would require hospitalization (ih).
each of these infected categories could also self-isolate, representing three more states defined by lower a priori contacts. people in states inh and a were assumed to recover, while those in state ih would transition to either hospitalized (h) or icu states (icu). from these states people would recover (r) or die (d), figure 1. time constants of the model included the incubation period, recovery period, time to self-isolate, time from symptom onset to hospitalization, time from icu admission to death, time from non-icu admission to death, length of stay in icu and length of hospital stay. parameters controlling the proportions that entered branching states (e.g. the proportion of all hospitalised cases admitted to icu) were also included (see appendix a for the full parameter list), as well as the transmission rate and contacts per day either within or without self-isolation. parameters were equipped with a priori values and optimisation was performed on log scale factors to ensure positivity (appendix a) of these proportions and rate constants. to link these odes to the observed data we employed an observer function which assumed a variable rate of case reporting for symptomatic (without requiring hospitalization) and asymptomatic individuals. a priori, we assumed that only 1% of infected individuals who were asymptomatic received tests, that 20% of symptomatic cases who do not require hospitalization receive tests, and that 100% of infected individuals who are hospitalised receive tests. the levels of 1% and 20% testing were free parameters in our model; the 100% for hospitalised tests was fixed. finally, 100% of deaths were assumed to be recorded. we also placed a prior on the initial number of individuals in each state. a priori, we assume 100 individuals in infected states. we tested two alternatives for s0: in the ode model (model: ode) we initialised s0 to 1 million x n_0 individuals, where n_0 = 1.
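the compartmental dynamics described above can be illustrated with a much-reduced seir sketch (susceptible, exposed, infected, recovered), integrated with a simple euler scheme and read out through a case-reporting observer. the latency and recovery priors below (1/5.2 and 1/4.6 per day) follow appendix a; the transmission rate and the 20% reporting fraction are illustrative assumptions, not the paper's fitted values:

```python
# much-reduced seir sketch with a case-reporting observer (illustrative values)
def seir(s_init, beta, kappa, gamma, report_frac, days, dt=0.1):
    n = s_init
    s, e, i, r = s_init - 100.0, 100.0, 0.0, 0.0  # prior: 100 initially infected
    per_day = round(1 / dt)
    reported = []
    for step in range(round(days / dt)):
        ds = -beta * s * i / n
        de = beta * s * i / n - kappa * e
        di = kappa * e - gamma * i
        dr = gamma * i
        s, e, i, r = s + ds * dt, e + de * dt, i + di * dt, r + dr * dt
        if (step + 1) % per_day == 0:         # sample once per day
            reported.append(report_frac * i)  # observer: only a fraction tested
    return reported

daily = seir(s_init=1e6, beta=0.9, kappa=1 / 5.2, gamma=1 / 4.6,
             report_frac=0.2, days=120)
peak_day = max(range(len(daily)), key=lambda d: daily[d])
```

the reported series rises and falls as susceptibles are depleted, which is the peak behaviour discussed throughout the paper; the full 12-state model adds hospitalization, icu and death branches on top of this skeleton.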
this parameter would be optimised for each individual dataset and so could accommodate total sizes; e.g. if n_0 = 4.9 a posteriori then the total population of ireland would be considered initially susceptible. we also tested a 'cities'-based version of the ode model (model: ode_city) that might recapitulate the death and case rates for each country. for this, we altered the observer function and imposed a prior of 1 million x n_0 individuals, where n_0 = 1. then we scaled the case and death rates by the population in millions (see appendix a for equations). for this model, if we obtained n_0 = 1 a posteriori then the total initial susceptible population would also correspond to the total population of ireland, but the epidemic dynamic would comprise 4.9 distinct outbreaks. see (friston, parr et al. 2020) for a complete description of the model. in brief, the model represents four factors describing location, infection status, test status and clinical status. within each factor people may transition among four states probabilistically. the transitions generate predicted outcomes; for example, the number of people newly infected who have tested positive or the number of people newly infected who will remain untested. the location factor describes if an individual is at home, at work, in a critical care unit (ccu) or deceased. similar to the early states in the ode model, the hmm has a second factor describing infection status: susceptible, infected, infectious or immune, where it is assumed that there is a progression from a state of susceptibility to immunity, through a period of (pre-contagious) infection to an infectious (contagious) status. the clinical status factor comprises asymptomatic, symptomatic, acute respiratory distress syndrome (ards) or deceased. finally, the fourth factor represents diagnostic status, where an individual can be untested or waiting for the results of a test that can either be positive or negative.
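the probabilistic transitions within one such four-state factor can be sketched as a discrete markov update, multiplying a probability vector by a column-stochastic transition matrix. the matrix entries below are purely illustrative placeholders, not the paper's fitted transition probabilities:

```python
# one-step markov update for a single hmm factor (here: infection status with
# states susceptible, infected, infectious, immune); values are illustrative
T = [  # T[j][i] = p(next state j | current state i); each column sums to 1
    [0.95, 0.00, 0.00, 0.00],  # remain susceptible
    [0.05, 0.70, 0.00, 0.00],  # become / remain (pre-contagious) infected
    [0.00, 0.30, 0.80, 0.00],  # progress to / remain infectious
    [0.00, 0.00, 0.20, 1.00],  # acquire immunity (absorbing)
]

def step(p):
    # multiply the probability vector by the transition matrix
    return [sum(T[j][i] * p[i] for i in range(4)) for j in range(4)]

p = [1.0, 0.0, 0.0, 0.0]  # everyone starts susceptible
for _ in range(60):       # sixty daily updates
    p = step(p)
```

because the immune state is absorbing, repeated updates move probability mass from susceptibility towards immunity, mirroring the progression assumed in the text; the full model couples four such factors so that transitions in one factor depend on the states of the others.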
as with the ode model, transitions amongst states are controlled by rate constants (inverse time constants) and non-negative probabilities. similar to the ode model above, we initialised (and set as priors) s0 to 1 million x n_0 individuals, where n_0 = 1. for the hmm and both ode models (ode and ode_city), to estimate the model parameters we employed a standard (variational laplace) bayesian scheme to optimise the parameters of the corresponding dcm (spm_nlsi_gn) (friston, mattout et al. 2007). the key aim of our analysis was to estimate the likely immunity after the current set of cases and deaths. to ascertain the initial susceptibility s0, we examined the posterior estimate from both model types and its bayesian credible intervals. however, first we examined the evidence for each model, relative to the worst performing model. we used two ode models, with different constructs for epidemic sizes/metapopulations. the first ode model (ode) assumed a prior of 1 million susceptible individuals (s0). the second ode model accounted for several effective populations of size 1 million (ode_city). the third model was the hmm model, which also assumed a prior of 1 million initial susceptible individuals. of all three models, ode_city was the worst performing model for all countries' data (figure 2a). from the two better performing models, we then estimated the effective population size of s0 = s(t=0) as a proportion of the total population (figure 2b). taking a bayesian average across all models and countries, the estimated proportion of people that were initially susceptible at the start of this outbreak, and thus immune at the end of the outbreak, was 6.4% of the total population of each country. the ode model produced consistently higher estimates of s0 at the end of the wave than the hmm.
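the bayesian average across models mentioned above can be sketched as standard bayesian model averaging: log-evidences are turned into posterior model probabilities via a softmax, and the s0 estimates are averaged under those weights. the log-evidence and s0 numbers below are hypothetical placeholders for illustration, not the figures reported in the paper:

```python
import math

# bayesian model averaging sketch; log-evidence and s0 values are hypothetical
log_evidence = {"ode": -120.0, "ode_city": -180.0, "hmm": -118.0}
s0_fraction = {"ode": 0.196, "ode_city": 0.30, "hmm": 0.11}

m = max(log_evidence.values())
w = {k: math.exp(v - m) for k, v in log_evidence.items()}  # subtract max for stability
z = sum(w.values())
posterior = {k: v / z for k, v in w.items()}  # posterior model probabilities

s0_bma = sum(posterior[k] * s0_fraction[k] for k in posterior)  # evidence-weighted s0
```

with these placeholder numbers the clearly worse model (here, the stand-in for ode_city) receives essentially zero posterior weight, which is the sense in which the worst performing model drops out of the average.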
these values suggest that after the current wave of cases, between 3 (lowest estimates, for ireland and the uk) and 12 (highest estimate, for germany) more cycles (with identical dynamics to those from jan 22nd) would be required to bring the total population to probable herd immunity levels (we assume herd immunity of 60%, figure 2b). we plot this fall in the susceptible state s (increase in immunity) over time, from the initial size s0, in figure 2c for the ode and hmm models separately for ireland and the uk (figure 2c). our model inversion procedure produced fits to the data that recapitulated the rates up to april 5th for both models (figure 3). systematic differences in future predictions were observed, however, between the ode and hmm models (though predictions were of similar orders of magnitude). for all countries the peak date and peak number of cases were higher for the ode model. however, both models exhibited peaks at dates in the past for four countries (italy, spain, germany and switzerland). for france the models were discrepant, with the ode model predicting a peak in the future on april 20th and the hmm model estimating that a peak had already occurred on april 7th. peaks in the future were expected for ireland and the uk. for ireland, the peak reported case rate predictions were estimated at april 9th for the hmm and april 23rd for the ode. the estimates of the number of daily cases at the peak were 720 cases and 392 cases for the ode and hmm models, respectively. for the uk, the peak case rate predictions were estimated at april 11th for the hmm and april 17th for the ode model. the peak case rates (i.e. tested cases) were estimated at 9304 daily cases for the ode and 5411 daily cases for the hmm models (figure 3). the cumulative deaths (figure 4) evinced relatively small discrepancies between the models, with the ode model predicting a larger cumulative death toll, of 1250 for ireland compared with 1008 deaths given by the hmm model.
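the cycle counts quoted at the start of this passage follow from simple arithmetic: if each wave immunises a further s0 fraction of the population, reaching a 60% herd-immunity threshold needs ceil((0.60 - s0) / s0) additional waves. using the per-country s0 estimates quoted in the text:

```python
import math

# back-of-envelope count of additional waves needed to reach herd immunity,
# assuming each wave immunises a further s0 fraction of the population
herd_threshold = 0.60
s0 = {"uk": 0.196, "ireland": 0.167, "italy": 0.114, "spain": 0.128,
      "france": 0.188, "germany": 0.047, "switzerland": 0.129}

extra_cycles = {c: math.ceil((herd_threshold - f) / f) for c, f in s0.items()}
```

this reproduces the quoted range: 3 further cycles for ireland and the uk (the lowest) and 12 for germany (the highest).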
for the uk the ode and hmm were remarkably consistent, predicting a cumulative death toll of 49296 and 49785, respectively (figure 4). in other european countries, however, the discrepancies between the model predictions were greater, in countries such as france, spain and switzerland, with the hmm suggesting considerably lower cumulative deaths. finally, to test the assumption that low s0 proportions of the population may be indicative of a 'next wave' or several 'next waves', we estimated the initial susceptible size from the initial peak of the spanish flu pandemic of 1918-1919, using data collated from approximately half of the united kingdom: i.e. a population of approximately 22 million. using the hmm model and variational laplace, we obtained fits to the data that capture the falling peak. here, we estimated that the effective or susceptible population size was s0 = 4.03% of the total population size (figure 5). though dramatically different in terms of hospital care, the general picture remains: large waves may be possible after low s0. we used a variational bayesian scheme (friston, mattout et al. 2007) to optimise the parameters of two distinctly constructed models of viral transmission (friston, parr et al. 2020, moghadas, shoukat et al. 2020). we optimised the parameters of these models based on daily reported cases and daily reports of death due to covid-19. we optimised the models using data acquired for seven european countries. both models were able to predict (i.e. fit) the current epidemic dynamics with plausible estimated trajectories. the models differed in their exact case rate predictions but predicted commensurate figures for the deaths in the united kingdom and ireland. how do these estimates relate to previous predictions of covid-19 deaths in the uk? it was predicted (ferguson, laydon et al. 2020) that without interventions 510,000 deaths could occur in the uk due to covid-19. this analysis (ferguson, laydon et al.
2020) also predicted that, even with an optimal mitigation scenario, these death rates would reduce only by one half, i.e. to 255,000. thus, the predicted death tolls of our models, ~50,000 in the current cycle, are in line with the predictions of mitigation effects, if we assume that several more cycles are possible. importantly, both models predicted that we are currently nearing or past the peak of daily case rates in all seven countries. however, the estimates suggest that after this cycle more than 80% of each country's total population in all countries studied remains susceptible. therefore, we assume that future cycles will occur. the predicted s0 was higher for the ode model relative to the hmm model: in turn, the ode model predicted a more prolonged cycle in the current period relative to the hmm model. this speaks to a trade-off between s0 and cycle times. assuming herd immunity requires 60% of the susceptible population to be immune (cohen and kupferschmidt 2020), one may conclude that further cycles are possible. however, that is not to say that populations within current outbreak areas may not reach herd immunity after the current cycle. yet, if this is the case (immunity is clustered in geographic or some other organisation of communities), then parts of the country, particularly those communities with high contact numbers that have not 'been involved' in the current cycle, may be more likely to participate in future cycles. and while it is obviously unrealistic to suppose that an additive linear effect of populations will emerge (sirakoulis, karafyllidis et al. 2000, eubank, guclu et al. 2004) (i.e., identically shaped cycles), given the complexity of contacts and population movement, our analysis may offer a rough guide to cycle immunity numbers. as with most scientific research at this time, the modelling described above was conducted with haste.
in line with the sentiments of the world health organization's dr mike ryan: 'perfection is the enemy of the good when it comes to emergency management. speed trumps perfection. and the problem in society we have at the moment is everyone is afraid of making a mistake. everyone is afraid of the consequence of error, but the greatest error is not to move, the greatest error is to be paralyzed by the fear of failure.' 1 therefore, we are grateful to the coding repositories listed below, where interested researchers can reproduce or nuance our analyses. for code and data see: https://github.com/rosalynmoran/covid-19.git 1 https://www.rev.com/blog/transcripts/world-health-organization-covid-19-update-march
appendix a
% states
s = x(1); % susceptible
e = x(2); % exposed
i_nh = x(3); % infected, will not require hospitalization
i_nh_si = x(4); % infected, will not require hospitalization - socially isolated
i_rh = x(5); % infected, will require hospitalization
i_rh_si = x(6); % infected, will require hospitalization - socially isolated
i_sc = x(7); % infected, asymptomatic - subclinical
i_sc_si = x(8); % infected, asymptomatic - subclinical - socially isolated
i_h = x(9); % hospitalized, not in icu
i_icu = x(10); % hospitalized in the icu
r = x(11); % recovered
d = x(12); % deaths
% parameters
n0 = (exp(p.n))*1e6;
beta = (0.09*exp(p.beta))*(10*exp(p.k)); % beta x contacts
betasi = (0.09*exp(p.beta))*(2*exp(p.k_si)); % contacts in isolation
gamma = (1/4.6)*exp(p.gamma); % recovery rate = 1/days_infection; 4.6 from pnas, 3 or 7 from lancet
kappa = (1/5.2)*exp(p.kappa); % latency rate = 1/(days incubation), assume 5.2
de_dt = beta*(s/n0)*(i_nh + i_rh) + alpha*beta*(s/n0)*i_sc + beta*(s/n0)*(i_nh_si + i_rh_si) + alpha*beta*(s/n0)*i_sc_si - kappa*e;
di_nh_dt = p_clin*(1-q)*(1-h)*kappa*e - (1-fi)*gamma*i_nh - (fi)*taui*i_nh;
di_nh_si_dt = p_clin*q*(1-h)*kappa*e - gamma*i_nh_si + (fi)*taui*i_nh;
di_rh_dt = p_clin*(1-q)*(h)*kappa*e - (1-fi)*delta*i_rh - fi*taui*i_rh;
di_rh_si_dt = p_clin*q*(h)*kappa*e - delta*i_rh_si + fi*taui*i_rh;
di_sc_dt = (1-p_clin)*kappa*e - (1-fa)*gamma*i_sc - (fa)*taua*i_sc;
di_sc_si_dt = (fa)*taua*i_sc - gamma*i_sc_si;
dh_dt = (1-c)*(1-fi)*delta*i_rh + (1-c)*delta*i_rh_si - (mh*muh + (1-mh)*phih)*i_h;
dicu_dt = c*(1-fi)*delta*i_rh + c*delta*i_rh_si - (mc*muc + (1-mc)*phic)*i_icu;
dr_dt = gamma*(i_sc + i_nh + i_sc_si + i_nh_si) + (1-mh)*phih*i_h + (1-mc)*phic*i_icu;
dd_dt = mh*muh*i_h + mc*muc*i_icu;
%% ode observer function
prop_asymp = 0.01*exp(p.cases_from_sc); % one percent of asymptomatic tested
prop_sympnh = 0.2*exp(p.cases_from_nh); % twenty percent of symptomatic, not hospitalised cases tested
prop_asymp = min([prop_asymp, 1]);
prop_sympnh = min([prop_sympnh, 1]);
cases = prop_asymp*(x(7) + x(8)) + prop_sympnh*(x(3) + x(4)) + x(5) + x(6);
deaths = x(12);
%% ode city observer function
cases = p.no_cities*(prop_asymp*(x(7) + x(8)) + prop_sympnh*(x(3) + x(4)) + x(5) + x(6));
deaths = p.no_cities*x(12);
references:
countries test tactics in 'war' against covid-19
modelling disease outbreaks in realistic urban social networks
report 9: impact of non-pharmaceutical interventions (npis) to reduce covid19 mortality and healthcare demand
dynamic causal modelling of covid-19
variational free energy and the laplace approximation
projecting hospital utilization during the covid-19 outbreaks in the united states
the effect of control strategies to reduce social mixing on outcomes of the covid-19 epidemic in wuhan, china: a modelling study
a cellular automaton model for the effects of population movement and vaccination on epidemic propagation
evolving epidemiology and impact of non-pharmaceutical interventions on the outbreak of coronavirus disease
impact of international travel and border control measures on the global spread of the novel 2019 coronavirus outbreak
nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study
key: cord-289917-2mxd7zxf authors: singh, brijesh p.; singh, gunjan title: modeling tempo of covid‐19 pandemic in india and significance of lockdown date: 2020-08-04 journal: j public aff doi: 10.1002/pa.2257 sha: doc_id: 289917 cord_uid: 2mxd7zxf a very special type of pneumonic disease that generated the covid‐19 pandemic was first identified in wuhan, china in december 2019 and is spreading all over the world. the ongoing outbreak presents a challenge for data scientists to model covid‐19, when the epidemiological characteristics of covid‐19 are yet to be fully explained. the uncertainty around covid‐19, with no vaccine and no effective medicine available to date, creates additional pressure on epidemiologists and policy makers. in such a crucial situation, it is very important to predict infected cases to support prevention of the disease and aid in the preparation of healthcare services. in this paper, we have tried to understand the spreading capability of covid‐19 in india, taking into account the lockdown period. the numbers of confirmed cases have increased in india and its states in the past few weeks. a differential equation based simple model has been used to understand the pattern of covid‐19 in india and some states. our findings suggest that the physical distancing and lockdown strategies implemented in india are successfully reducing the spread and that the tempo of pandemic growth has slowed in recent days.
the novel corona virus (covid-19) started from wuhan, china, and thus was initially known as the wuhan virus; it expanded its circle to south korea, japan, italy, iran, usa, france, spain, and finally spread to india. it is named novel because it is a never-before-seen mutation of an animal corona virus, but the certain source of this pandemic is still unidentified. it is said that the virus might be connected with a wet market (with seafood and live animals) from wuhan that was not complying with health and safety rules and regulations. the pandemic has been recorded in over 200 countries, territories, and areas, with about 3,000,000 confirmed cases and 200,000 deaths (who). the covid-19 is very similar in symptomatology to other viral respiratory infections. as it is a novel virus, the specific modes of transmission are not clearly known. originally it emerged from an animal source and then spread all over the world from person to person.
initially, there was speculation about the virus spreading while the infected person is not showing any symptoms, but that has not been scientifically confirmed. on march 11, 2020, who changed the status of the covid-19 emergency from a public health international emergency to a pandemic. nonetheless, the fatality rate of the current pandemic is on the rise (between 2-4%); relatively lower than the previous sars-cov (2002/2003) and mers-cov (2012) outbreaks (malik et al., 2020). thus, covid-19 has presented an unprecedented challenge before the entire world. symptoms of covid-19 are reported as cough, acute onset of fever and difficulty in breathing. out of all the cases that have been confirmed, up to 20% have been deemed to be severe. cases vary from mild forms to severe ones that can lead to serious medical conditions or even death. it is believed that symptoms may appear in 2-14 days, as the incubation period for the novel corona virus has not yet been confirmed. however, in india a minimum quarantine period of 14 days has been declared by the government for suspected cases. since it is a new type of virus, there is a lot of research being carried out across the world to understand the nature of the virus, the origins of its spread to humans, its structure, and possible cures / vaccines to treat covid-19. india also became a part of these research efforts after the first two confirmed cases were reported here on january 31, 2020. then screening of travelers at airports was started in india, chinese visas were immediately canceled, and those found affected by covid-19 were kept in quarantine centers (ministry of home affairs, government of india, advisory). in continuation, we take a look at a few of the interesting and important research efforts being carried out in india with respect to covid-19. icmr, india claims that sari patients with no record of international travel or contact with infected persons tested positive for covid-19.
hence it is important to optimize testing by developing strategies to identify potential cases that have a higher chance of being infected. since the availability of resources like testing kits, labs, health personnel etc. is limited in india relative to the size of its population, the most practical approach is to test symptomatic patients presenting to hospitals and hotspots, with aggressive testing to identify and contain local chains of transmission. in the absence of a definite treatment modality like a vaccine, physical distancing has been accepted globally as the most efficient strategy for reducing the severity of the disease and gaining control over it (ferguson et al., 2020; singh & adhikari, 2020). also, it is reported that india is well short of the who's recommendation of a minimum threshold of 2.28 skilled health professionals per 1,000 population (anand & fan, 2016). therefore, since march, various measures such as social distancing, lockdown, masking and regular hand washing have been implemented to prevent the spread of covid-19, but in the absence of a particular medicine or vaccine it is very important to predict how the infection is likely to develop among the population, to support prevention of the disease and aid in the preparation of healthcare services. this will also be helpful in estimating the health care requirements and sanctioning a measured allocation of resources. it is a well known fact that covid-19 has spread differently in different countries, so any planning for mounting a fresh response has to be adaptable and situation-specific. data obtained on the covid-19 outbreak have been studied by various researchers using different mathematical models (chang, harding, zachreson, cliff, & prokopenko, 2020; rao srinivasa arni, krantz steven, thomas, & ramesh, 2020).
many other studies (anastassopoulou, russo, tsakris, & siettos, 2020; corman et al., 2020; gamero, tamayo, & martinez-roman, 2020; huang et al., 2020; hui et al., 2020; rothe et al., 2020; singh, 2020) on this recent epidemic have reported many meaningful modeling results based on different principles of mathematics. most pandemics follow an exponential curve during the initial spread and eventually flatten out (junling, dushoff, bolker, & earn, 2014). the sir model is one of the best suited models for projecting the spread of infectious diseases like covid-19, where a person once recovered is not likely to become susceptible to the infection again (kermack & mckendrick, 1991). the susceptible-infectious-recovered (sir) compartment model (herbert, 2000) is used to include considerations for susceptible, infectious, and recovered or deceased individuals. these models have shown a significant predictive ability for the growth of covid-19 in india on a day-to-day basis so far. a recent study has shown that social distancing can reduce cases by up to 62% (mandal et al., 2020). further, time series models have been employed for predicting the incidence of covid-19 disease. as compared to other prediction models, for instance the support vector machine (svm) and the wavelet neural network (wnn), the arima model is more capable in the prediction of natural adversities (zhang, yang, cui, & chen, 2019). a time-dependent sir model has been defined to observe the undetectable infected persons with covid-19 (chen, lu, chang, & liu, 2020). a stochastic mathematical model (chatterjee, kaushik, arun, & subramanian, 2020) of the covid-19 epidemic has been studied in india. the logistic growth regression model has been used for the estimation of the final size and peak time of the corona virus epidemic in many countries of the world, and found results similar to those obtained by the sir model (batista, 2020). 
it is well known that the effects of social distancing become visible only after a few days from the lockdown, because the symptoms of covid-19 normally take some time to appear after infection. one estimate is that the peak infection is reached at the end of june 2020, with in excess of 150 million infective in india, and that the total number infected will be 900 million (singh & adhikari, 2020). other estimates indicate that, with a hard lockdown and continued social distancing, the peak total infections in india will be 97 million and the number of infective by september is likely to be over 1,100 million (schueller et al., 2020). due to the recent development of this pandemic, we are interested in addressing the following important issues about covid-19: 1. what is the expected time to stop new corona cases? 2. what is the expected maximum number of corona cases? 3. the significance of lockdown. in this paper, instead of developing a mathematical model for the pattern of spread of covid-19, an attempt has been made to resolve these issues for india. let us define a function called the tempo of disease as the first difference in natural logarithms of the cumulative corona positive cases on a day:

r_t = ln(p_t) - ln(p_(t-1))

where p_t and p_(t-1) are the numbers of cumulative corona positive cases for periods t and t-1, respectively. when p_t and p_(t-1) are equal, r_t becomes zero; if r_t stays at zero for a week, we can assume no new corona cases will appear further. in the initial phase of the disease spread the tempo of disease increases, but after some time, once preventive measures are taken, it decreases. since r_t is a function of time, its first differential is defined as follows. 
dr_t/dt = k (r_t - r_t*)    (1)

where r_t denotes the tempo, that is, the first difference in natural logarithms of the cumulative corona positive cases on a day, r_t* is the desired level of tempo (zero in this study), t denotes time and k is a constant of proportionality. equation 1 is an example of an ordinary differential equation that can be solved by the method of separating variables. with r_t* = 0, equation 1 can be written as

dr_t / r_t = k dt    (2)

integrating equation 2, we get

ln r_t = kt + c    (3)

where c is an arbitrary constant. taking the antilogarithms of both sides of equation 3 we have

r_t = e^(kt + c) = e^(kt) e^c, i.e. r_t = a e^(kt)    (4)

where a = e^c. equation 4 is the general solution of equation 1. if k is less than zero, equation 4 tells us how the corona positive cases will decrease over time until the tempo reaches zero. the values of a and k are estimated by the least squares procedure using the data sets. the paper used the series of daily cases from the website covid19india.org (coronavirus outbreak in india, n.d.). in this study the day-wise cumulative number of corona positive cases from april 1, 2020 to june 10, 2020 has been used to determine when the tempo of disease will become zero and what the size of the epidemic will be at that time. an attempt has also been made to understand the significance of lockdown in kerala and telangana. a large variation is observed in the percentage of confirmed cases among total tested cases, from 1.2 in bihar to 9.1 in maharashtra. in punjab the recovery rate is very low, whereas in karnataka the percentage confirmed among total tests is very low, meaning either that the disease prevalence is low or that the quality of the testing kits is not good. in gujarat and delhi it is 7.4 and 7.9% respectively. these percentages are low because testing is done in the hotspots only; if only the population of the hotspots were considered, these percentages would be higher. the government suggested and implemented social distancing and lockdown to control the spread of covid-19 in society. 
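the least squares estimation of a and k described above reduces to an ordinary linear regression once equation 4 is taken in logarithms (ln r_t = ln a + kt). the sketch below illustrates this on synthetic data standing in for the covid19india.org series; the function names are ours, not the paper's:

```python
import numpy as np

def fit_tempo(t, r):
    """Fit r_t = a * exp(k * t) by least squares on ln(r_t).

    t : day indices; r : strictly positive tempo values.
    A negative k means the tempo is decaying toward zero, as in equation 4.
    """
    t = np.asarray(t, dtype=float)
    log_r = np.log(np.asarray(r, dtype=float))
    k, log_a = np.polyfit(t, log_r, 1)  # slope and intercept of the log-linear fit
    return float(np.exp(log_a)), float(k)

# synthetic series generated from a known decay, to check the recovery
t = np.arange(30)
r = 0.08 * np.exp(-0.05 * t)
a_hat, k_hat = fit_tempo(t, r)
```

with a fitted k < 0, the day on which the tempo falls below any small threshold (the paper's criterion for "no new cases") follows directly from equation 4.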
in table 3 an attempt has been made to show the summary statistics of the tempo of disease, and table 5 reveals that lockdown significantly affects the spread of covid-19. figure 1 shows that the tempo of disease r_t is declining toward zero with time, more rapidly in kerala and telangana than in the other states, where it is declining slowly toward r_t = 0. covid-19 has been declared a pandemic by the who and has currently become a major global threat. prediction of a disease may help us to understand the factors affecting it and the steps that we can take to control it. the government of india has taken preventive measures such as a complete lockdown in the very early stage of the disease, physical distancing and case isolation. most importantly, many healthcare professionals are visiting each and every household in the hotspot areas across the country to trace and isolate infected persons to curtail the spread of the disease. in order to support the prevention of the disease and aid healthcare professionals, an attempt has been made to develop a simple model for the prediction of confirmed covid-19 cases and to utilize that model for forecasting future covid-19 cases in india. as per the model forecast, the confirmed cases are expected to gradually decrease in the coming weeks. it is also likely that efforts such as lockdown and physical distancing will influence when this predicted decline starts. on the basis of the considered data, one can predict that the final size of the corona virus pandemic in india will be around 1 million by the end of september. the exponential model used in this study is a data driven model; thus, its forecasts are only as reliable as the data and capture the dynamics of the pandemic to that extent. since the data change daily in real time, the predictions will change accordingly. hence, the results from this paper should be used only for qualitative understanding. brijesh p. singh https://orcid.org/0000-0002-2429-4758 the health workforce in india. geneva: who. 
human resources for health observer. series no. 16
data-based analysis, modelling and forecasting of the covid-19 outbreak
estimation of the final size of the second phase of the coronavirus covid 19 epidemic by the logistic model
modelling transmission and control of the covid-19 pandemic in australia
healthcare impact of covid-19 epidemic in india: a stochastic mathematical model
a time-dependent sir model for covid-19 with undetectable infected persons
detection of 2019 novel coronavirus (2019-ncov) by real-time rt-pcr
impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand. imperial college covid-19 response team
forecast of the evolution of the contagious disease caused by novel corona virus (2019-ncov) in china
the mathematics of infectious diseases
clinical features of patients infected with 2019 novel coronavirus in wuhan
the continuing 2019-ncov epidemic threat of novel coronaviruses to global health: the latest 2019 novel coronavirus outbreak in wuhan
estimating initial epidemic growth rates
contributions to the mathematical theory of epidemics-i
emerging coronavirus disease (covid-19), a pandemic public health emergency with animal linkages: current status update
prudent public health intervention strategies to control the coronavirus disease 2019 transmission in india: a mathematical model-based approach
model-based retrospective estimates for covid-19 or coronavirus in india: continued efforts required to contain the virus spread
transmission of 2019-ncov infection from an asymptomatic contact in germany
covid-19 in india: potential impact of the lockdown and other longer term policies
modeling and forecasting novel corona cases in india using truncated information: a mathematical approach
age-structured impact of social distancing on the covid-19 epidemic in india
comparison of the ability of arima, wnn and svm models for drought forecasting in the sanjiang plain

key: cord-264136-jjtsd4n3 authors: ferstad, 
johannes opsahl; gu, angela jessica; lee, raymond ye; thapa, isha; shin, andrew y; salomon, joshua a; glynn, peter; shah, nigam h; milstein, arnold; schulman, kevin; scheinker, david title: a model to forecast regional demand for covid-19 related hospital beds date: 2020-03-30 journal: nan doi: 10.1101/2020.03.26.20044842 sha: doc_id: 264136 cord_uid: jjtsd4n3 covid-19 threatens to overwhelm hospital facilities throughout the united states. we created an interactive, quantitative model that forecasts demand for covid-19 related hospitalization based on county-level population characteristics, data from the literature on covid-19, and data from online repositories. using this information as well as user inputs, the model estimates a time series of demand for intensive care beds and acute care beds as well as the availability of those beds. the online model is designed to be intuitive and interactive so that local leaders with limited technical or epidemiological expertise may make decisions based on a variety of scenarios. this complements high-level models designed for public consumption and technically sophisticated models designed for use by epidemiologists. the model is actively being used by several academic medical centers and policy makers, and we believe that broader access will continue to aid community and hospital leaders in their response to covid-19. link to online model: https://surf.stanford.edu/covid-19-tools/covid-19/ the us classified the coronavirus disease pandemic (covid-19) as a national emergency on march 14, 2020 . cumulative us cases surged beyond 122,000 on march 29, 2020. [2, 3] its rapid spread in china and italy quickly overwhelmed available hospital beds. [4, 5] county-level forecasts of demand for hospital beds based on data commonly available to hospitals would help guide us hospital efforts to anticipate and mitigate similar bed shortages. 
[6, 7] in order to plan their response, hospital and public health officials need to understand how many people in their area are likely to require hospitalization for covid-19; how these numbers compare to the number of available intensive care and acute care beds; and how to project the impact of social-distancing measures on utilization. since the majority of people with covid-19 are asymptomatic and the rates of cases requiring hospitalization differ significantly across age groups, answering these questions requires understanding the epidemic and accounting for the specific vulnerabilities of the local population. [8, 9] the numbers of people in each age group differ significantly across us counties, as do the available hospital resources [10, 11]. initial analyses of the potential impact of covid-19 have spurred governmental action at the federal and state level; however, significant differences remain in the policies implemented across states and counties. [9, 12] in order to help facilitate adequate and appropriate local responses, we developed a simple model to project the number of people in each county in the united states who are likely to require hospitalization as a result of covid-19, given the age distribution of the county per the us census. the model compares the projected number of individuals needing hospitalization to the publicly known numbers of available intensive and acute care beds, and allows users to model the impact of social distancing or other measures to slow the spread of the virus. the uncertainty surrounding the numbers of people infected and the rates of spread makes it difficult to evaluate the accuracy of projections generated by complex epidemiological models. the model presented here errs on the side of simplicity and transparency to allow non-specialist policymakers to fully understand the logic and uncertainty associated with the estimates. 
for each us county, the model accepts as an input the number of covid-19 hospitalizations and the associated doubling time, if these are available. if these are not available, the model imports the latest number of confirmed cases from the new york times online repository and accepts user-entered parameters for the ratio of total cases to confirmed cases (e.g., 5:1) [8, 23] and the covid-19 population-level doubling time (e.g., 7 days) [13]. the effects of interventions that mitigate the spread of infection (such as social distancing) are simulated with user-entered parameters in the form of a greater doubling time and a start date for that new doubling time. county-specific hospitalization rates are derived by combining age distributions from the us census [11] with age-group specific estimates of the case rates of severe symptoms, critical symptoms, and mortality (together, morbidity) derived from the imperial college covid-19 response team [9]. the default assumptions are that people are admitted to the hospital on the day they test positive (this assumption will change when testing begins for non-symptomatic people); that those with severe and critical symptoms spend, respectively, 12 days in acute care and 7 days in intensive care; and that 50% of each type of bed is available for covid-19+ patients. [14, 15] the numbers of patients requiring each type of bed are compared to the numbers of relevant beds derived from data from the american hospital association. [10] a detailed technical description of the model is available in our technical supplement. to facilitate use by hospital and public health officials, the model is deployed through an interactive online website that allows users to generate dynamic, static, and spatial estimates of the number and rate of severe cases, critical cases, and mortality for each county or group of counties. these data are displayed along with the number of intensive care and acute care hospital beds in the corresponding region. 
the urgency of the challenge and widespread data sharing have led to the development of numerous models of covid-19: models designed to share the most recent data on confirmed cases and deaths [16] ; technical models intended for use by epidemiologists [17, 18, 21] ; and models presenting high-level summaries of the potential for covid-19 associated bed demand to overwhelm hospital capacity [9, 19, 22] . the model presented here fits the needs of local hospital and government leaders, many of whom lack access to trained epidemiologists or data analysts. the website allows such leaders to study projections of the spread of covid-19 based on their county-level hospitalization information, if these are available, and otherwise presents projections based on data from the rest of the us and assumptions from the literature. the model, deployed as a website, is relatively easy to understand as the underlying dynamics are specified in terms of assumptions interpretable to those with no specialist training. as stakeholders gather data about the spread of covid-19 in their region, the model permits them to tailor inputs and generate new estimates in real time. as new epidemiological data become available, the model is updated by those maintaining it with new age-specific rates of severe disease, critical disease, mortality and their associated loss. for example, the first update was from case-rates derived from data on wuhan [20] to the case-rates derived from the imperial report. as expected, the model predicts us counties with large, older populations are likely to have the highest demand for hospital beds. counties with a high number of confirmed cases are likely to be first to see the demand for beds exceed hospital bed availability. furthermore, smaller counties with relatively few hospital beds are likely to see the demand for beds exceed hospital bed availability at lower fractions of the population being covid-19+ (see table 1 ). 
local leaders in these vulnerable communities should pay particular attention to the case growth rate if they seek to tailor their response rapidly to predicted downstream resource constraints. our results suggest, in line with basic epidemiological principles, that hospitalization rates are very sensitive to the contagion doubling time. despite the uncertainty about doubling time, the model demonstrates that, across a large set of assumptions, efforts to increase hospital bed capacity are not on their own likely to be sufficient to ensure that hospital capacity meets demand. when considering the limited healthcare resources anticipated, the need for strengthening collaborative operational planning and research between institutions and across the public and private sectors is paramount. while hospital leaders may be concerned with the impact of covid-19 infections at their facility, our model assesses the potential impact of the epidemic at a county and state level. assessing the regional impact can permit collaborative and coordinated planning between public health and hospital leaders facing an unprecedented challenge. regional and state leaders are better equipped to understand the potential implications, such as funding and cost-sharing, of this epidemic across counties within a state. these conversations are facilitated by the ability to study several counties in a region. one of the greatest challenges to continuously adjusting regional policies best tailored to meet the challenge of covid-19 is that healthcare and political leaders lack familiarity with the concept of exponential growth of a highly transmissible deadly pathogen, particularly one with a 3- to 14-day lag between infection and needing a hospital bed. forecasting demand for hospital resources a week to a few weeks ahead helps address that handicap. [this preprint is made available under a cc-by-nc-nd 4.0 international license; the author/funder, who has granted medrxiv a license to display the preprint in perpetuity, is the copyright holder for this preprint, which was not peer-reviewed. https://doi.org/10.1101/2020.03.26.20044842] there are several limitations to the model. first, projections related to the spread of a pandemic at this early stage are plagued with uncertainty about critical model parameters, e.g., the doubling time and the true number of cases in the population. to account for this, the model was designed to produce projections given current assumptions about parameters (with the option to update those variables as better data become available). second, although estimates of disease propagation follow first-order evidence, important input parameters which may affect hospitalization rates and bed availability may not have been considered in the model. for example, assumptions on shared healthcare resources (such as the use of pediatric beds for adult hospitalization overflow) are not modifiable by users but only by those maintaining the model. finally, the model estimates the effect of one intervention, or the summation of more than one intervention (by modification of the doubling time), but does not ascribe the individual impact of sequential interventions. in this report, we describe an online, real-time, interactive simulation model to facilitate local policy making and regional coordination by providing estimates of hospital bed demand and the impact of measures to slow the spread of the infection. 
the model accepts as an input the number of covid-19 hospitalizations and the associated doubling time, if these are available. if these are not available, for each us county, the model imports the latest total number of confirmed cases from the new york times online repository and accepts user-entered parameters for the ratio of total cases to confirmed cases (e.g., 10:1) [1, 9] and the covid-19 population-level doubling time (e.g., 7 days) [2]. following a simple exponential model, the total number of covid-19+ people on each subsequent day n is the product of the initial number of total cases and 2 raised to the power given by n divided by the doubling-time input parameter. users can simulate the effects of interventions that mitigate the spread of infection (such as social distancing) by entering new doubling times and dates for these interventions to take effect. if these are input, the number of covid-19+ people on subsequent days is calculated using the new doubling time with the previous formula. users are referred to online tools to estimate the changes in doubling time associated with the impact of social-distancing interventions. [3] calculation of hospitalization rates: county-specific age distributions are derived from the us census [4], and age-group specific estimates of the case rates of severe symptoms, critical symptoms, and mortality are derived from the imperial college covid-19 response team [5]. for each county, the proportion of each age group relative to the total population is calculated. severe and critical symptom case-rates for each age group are weighted by these proportions, and the hospitalization rate is calculated as the sum of these weighted severe and critical symptom case-rates. the model includes parameters for the hospital length of stay for covid-19+ patients with severe and critical symptoms, set to, respectively, 12 days in acute care and 7 days in intensive care. 
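the doubling-time arithmetic described above can be made concrete with a short sketch; the function and parameter names are illustrative, not taken from the published tool:

```python
def project_cases(initial_cases, doubling_time, days, interventions=None):
    """Total covid-19+ counts under the simple exponential model:
    cases(n) = initial_cases * 2 ** (n / doubling_time).

    `interventions` maps a day index to a new doubling time; from that
    day onward, growth continues from the current count at the new rate.
    """
    interventions = interventions or {}
    dt = float(doubling_time)
    counts, current = [], float(initial_cases)
    for n in range(days):
        if n in interventions:        # e.g. social distancing takes effect
            dt = float(interventions[n])
        counts.append(current)
        current *= 2.0 ** (1.0 / dt)  # one day of growth
    return counts

# 100 estimated total cases with a 7-day doubling time; a distancing
# intervention on day 10 raises the doubling time to 14 days
trajectory = project_cases(100, 7, 21, interventions={10: 14})
```

the intervention mechanism mirrors the text: the count is not reset, only the exponent's denominator changes from the intervention date on.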
by default, 50% of the available acute and icu beds are shown as the overall capacity for covid-19+ patients, but another parameter allows the user to adjust the number of each type of bed that is available for covid-19+ patients. [6, 7] the number of patients with severe and critical symptoms requiring hospitalization each day is calculated assuming that patients are admitted to the hospital on the day they test positive (this assumption will change when testing begins more broadly for non-symptomatic people) and are discharged after their length of stay. the cumulative number of patients with severe and critical symptoms requiring hospitalization is estimated as the product of the corresponding hospitalization rate and the cumulative number of covid+ people in the population. the number of patients in hospital beds on a specific day is defined as the cumulative number of patients admitted up to that day minus the corresponding cumulative number of patients discharged. hospital capacity at the county level is based on the number of hospital beds in each us county from the american hospital association. the acute care beds include general medical, surgical, and long-term acute care beds for adults. the intensive care unit beds include all adult intensive care unit beds other than neonatal [8]. as discussed above, the number of hospital beds available for covid-19+ patients is by default 50% of overall beds, but can be adjusted by the user. 
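the census rule above (cumulative admissions minus cumulative discharges, with a configurable covid-19 share of beds) can be sketched as follows; function names and the example numbers are illustrative:

```python
def beds_in_use(daily_admissions, length_of_stay):
    """Daily bed census: cumulative admissions minus cumulative discharges.

    A patient admitted on day d occupies a bed on days d .. d + length_of_stay - 1.
    """
    census, cum_admitted = [], 0
    for d in range(len(daily_admissions)):
        cum_admitted += daily_admissions[d]
        # everyone admitted before day d - length_of_stay + 1 has been discharged
        cum_discharged = sum(daily_admissions[:max(0, d - length_of_stay + 1)])
        census.append(cum_admitted - cum_discharged)
    return census

def covid_capacity(total_beds, share=0.5):
    """Beds assumed available for covid-19+ patients (50% by default)."""
    return int(total_beds * share)

# 10 severe admissions per day with the 12-day acute-care stay: the census
# ramps up, then plateaus at 120 once discharges balance admissions
acute_census = beds_in_use([10] * 30, 12)
```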
coronavirus disease 2019 (covid-19) situation report - 51
coronavirus covid-19 global cases
proclamation on declaring a national emergency concerning the novel coronavirus disease (covid-19) outbreak
characteristics of and important lessons from the coronavirus disease 2019 (covid-19) outbreak in china: summary of a report of 72 314 cases from the chinese center for disease control and prevention
critical care utilization for the covid-19 outbreak in lombardy, italy: early experience and forecast during an emergency response
priorities for the us health community responding to covid-19
u.s. lags in coronavirus testing after slow response to outbreak
substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov2)
impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
aha annual survey
american community survey 5-year data
white house takes new line after dire report on death toll
early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia
clinical course and outcomes of critically ill patients with sars-cov-2 pneumonia in wuhan, china: a single-centered, retrospective, observational study. the lancet respiratory medicine
clinical course and risk factors for mortality of adult inpatients with covid-19 in wuhan, china: a retrospective cohort study
an interactive web-based dashboard to track covid-19 in real time. the lancet infectious diseases
epidemic calculator
modeling covid-19 spread vs healthcare capacity
caring for covid-19 patients - can hospitals around the nation keep up?
the epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (covid-19) - china
covid-19 hospital impact model for epidemics
ihme covid-19 health service utilization forecasting team. 
forecasting covid-19 impact on hospital bed-days, icu-days, ventilator days and deaths by us state in the next 4 months
substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov2)
early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia
covid-19 hospital impact model for epidemics
american community survey 5-year data
impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
clinical course and outcomes of critically ill patients with sars-cov-2 pneumonia in wuhan, china: a single-centered, retrospective, observational study. the lancet respiratory medicine
clinical course and risk factors for mortality of adult inpatients with covid-19 in wuhan, china: a retrospective cohort study
aha annual survey
coronavirus (covid-19) data in the united states

we thank grace lee, teng zhang, jacqueline jil vallon, and francesca briganti for their help with this work. we also thank amber levine for her help testing and improving the tool.

key: cord-225640-l0z56qx4 authors: ghamizi, salah; rwemalika, renaud; veiber, lisa; cordy, maxime; bissyande, tegawende f.; papadakis, mike; klein, jacques; traon, yves le title: data-driven simulation and optimization for covid-19 exit strategies date: 2020-06-12 journal: nan doi: nan sha: doc_id: 225640 cord_uid: l0z56qx4 the rapid spread of the coronavirus sars-2 is a major challenge that has led almost all governments worldwide to take drastic measures to respond to the tragedy. chief among those measures is the massive lockdown of entire countries and cities, which beyond its global economic impact has created deep social and psychological tensions within populations. while the adopted mitigation measures (including the lockdown) have generally proven useful, policymakers are now facing a critical question: how and when to lift the mitigation measures? 
a carefully-planned exit strategy is indeed necessary to recover from the pandemic without risking a new outbreak. classically, exit strategies rely on mathematical modeling to predict the effect of public health interventions. such models are unfortunately known to be sensitive to some key parameters, which are usually set based on rules-of-thumb.in this paper, we propose to augment epidemiological forecasting with actual data-driven models that will learn to fine-tune predictions for different contexts (e.g., per country). we have therefore built a pandemic simulation and forecasting toolkit that combines a deep learning estimation of the epidemiological parameters of the disease in order to predict the cases and deaths, and a genetic algorithm component searching for optimal trade-offs/policies between constraints and objectives set by decision-makers. replaying pandemic evolution in various countries, we experimentally show that our approach yields predictions with much lower error rates than pure epidemiological models in 75% of the cases and achieves a 95% r2 score when the learning is transferred and tested on unseen countries. when used for forecasting, this approach provides actionable insights into the impact of individual measures and strategies. since the outbreak of the covid-19 pandemic, the world has been facing a human tragedy with overwhelmed healthcare systems and fears of economic collapses. in the absence of vaccines to immunize the population rapidly at scale, governments have implemented various non-pharmaceutical public health interventions such as social distancing and lockdowns. considering that the world health organisation (who) is foreseeing first clinical trials of vaccine for the end of the year 2020 [15] , decision-makers must carefully plan their exit strategies: measures that were put in place to contain the coronavirus spread must be methodically lifted to avoid the risk of precipitating new outbreaks. 
in this context, mathematical modelling offers public health planners frameworks to make predictions about the spread of emerging diseases and to assess the impact of possible mitigation strategies. this is particularly important when dealing with infectious diseases, such as covid-19, where mass interventions (e.g., screening, social distancing, and vaccination) can lead to effects at a population level, including herd immunity, changes in the infection rate or even changes in the pathogen ecology as a consequence of selective pressure. there are two main types of models: static cohort models and transmission dynamic models [3]. static models, typically relying on decision trees and markov processes, assume a force of infection that is independent of the proportion of the population that is infected, and are therefore of little use in response to highly infectious diseases like covid-19. transmission dynamic models, on the other hand, have a force of infection that varies depending on the proportion of the population which is infected. compared with static cohort models, transmission dynamic models are usually more complex to parameterize, requiring epidemiological information on the infectious disease as well as demographic and economic information about the affected population. different techniques exist to implement dynamic approaches. agent-based models (abm) are simulations composed of agents that interact with each other and their environment. because each agent can have its own rules, this type of approach can capture aggregate phenomena derived from the behavior of single agents. these models offer great explainability of the root causes leading to the propagation of a disease, but are computationally intensive to run and thus hardly applicable to large populations. indeed, the behaviour and the interactions of each type of agent need to be fully defined in order for the model to be useful. 
these rules are case-specific and are not transferable from one population to another. the most common approach to model the spread of infectious disease is the susceptible-infected-removed (sir) model and its extension seir (susceptible, exposed, infectious, and recovered). this is a state-based model in which every state expresses the degree of exposure of a population to the disease. it is also equation-based, with each equation defining the rate of transition from one state to the other. the seir model thus separates the population into four groups and simulates the evolution over time of each of the subpopulations. the transition rates are defined by the time scale over which an individual can transmit the disease, the time to recovery, and the number of newly infected people due to an infected individual. the key time-varying parameter is called the effective reproduction number (r_t), and expresses the number of people that can be contaminated by an infectious individual over a period of time. these methods depend on the validity of input parameters like the transition rates. while seir is a very powerful model, it presents a major limitation: it requires hyper-parameters that are hard to observe, such as the infection rate of an individual. in practice, seir parameters are manually set to fit the local observations of the considered population (e.g., country), and are not learnt from larger-scale observations. to circumvent this limitation of epidemiological models, researchers have recently started to take advantage of the advances made in machine learning (ml) in order to create models based on available large datasets [11, 14]. we name this family of approaches "ml-based", and our own work falls within it.
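the seir mechanics described here can be made concrete with a short sketch. this is a generic discrete-time (euler) seir step of our own construction, not any specific implementation from the literature; the transmission rate is derived from the effective reproduction number as r_t / t_inf, and the default incubation and infectious periods are illustrative.

```python
def seir_step(state, r_t, t_inc=3.0, t_inf=5.0, dt=1.0):
    """advance normalized (s, e, i, r) fractions by one euler step."""
    s, e, i, r = state
    beta = r_t / t_inf          # transmission rate implied by r_t
    new_exposed = beta * s * i  # susceptibles contacted by infectious people
    ds = -new_exposed
    de = new_exposed - e / t_inc
    di = e / t_inc - i / t_inf
    dr = i / t_inf
    return (s + dt * ds, e + dt * de, i + dt * di, r + dt * dr)

def simulate(state, r_t, days):
    for _ in range(days):
        state = seir_step(state, r_t)
    return state
```

with r_t above 1 and a mostly susceptible population the infected fraction grows, while with r_t below 1 it decays, which is the threshold behaviour the text describes.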
our first contribution is to devise a novel approach, dn-seir, that alleviates the manual tuning of the seir model by relying on large and trustworthy public datasets (large-scale observations) and machine learning (to learn the parameters' values for a given population). our approach combines seir with a machine learning predictor, based on a deep learning model, to estimate the effective reproduction number (r_t) over time. the machine learning predictor relies on demography and mobility features to predict an effective reproduction number. for each time increment, r_t is updated and used for the next day's computation. we evaluated our approach on twelve countries from all continents and showed that mixing demographic, mobility, and epidemiological data provides better forecasts for 9 out of 12 of the studied countries than purely epidemiological modelling. our second contribution is to exploit this online prediction of the effective reproduction number in a simulation tool for policymakers, which was recently advertised to the public (the tool is open sourced and available for reproduction at https://github.com/covid19-kdd/kdd20covid19/; the online tool and press release are anonymized for double-blind review). policymakers have to decide when to relax certain parts of society (workplaces, travels, schools...) and to what extent this may create a new epidemic wave that would flood the hospitals with critical cases. the simulator enables one to make such a strategic exit plan for a certain country and predict its impact in terms of hospitalizations, infected people and deaths. it is also designed to explore and optimize various exit strategies and constraints. we evaluate 3 common hand-crafted exit strategies and show that multi-objective genetic algorithms can find atypical strategies on the pareto front that minimize both the death toll and the economic impact.
machine learning approaches have been widely used to model and forecast former epidemics, especially to handle the large amount of data involved and the increasing complexity of the underlying epidemiological models. popular approaches remain regression trees and forests [8] and neural networks [4]. while there is a plethora of reports that tackle covid-19 forecasting, the peer-reviewed literature about ml and covid-19 is rather scarce. most approaches in public repositories tackle ml regression in combination with the sir epidemiological model [1] or its seir extension [9, 12, 16]. recent research about non-pharmaceutical interventions has focused on mobility data to combine epidemiological models and learning algorithms. in [13], vollmer et al. integrate mobility in a stochastic model. they focus on italy and suggest that the covid-19 transmission rate and mobility metrics are closely related, and that mobility should be closely monitored in the coming weeks and months. our approach relies on a similar intuition and combines learning from multiple countries with country-specific features. the transfer learning in our approach is able to separate the contribution of different measures like commuting to work, retail and recreation activities. we also show that using the output of our ml approach as a search fitness function leads to optimal exit strategies. our end goal is to provide policymakers with a tool to easily generate exit strategies and evaluate their impact. in particular, an exit strategy can be modeled as a schedule of measures (policy schedule) that will impact the way the disease spreads. we restrict the policy schedule to mobility levels.
as illustrated in figure 1, we propose to combine a genetic algorithm (to search for policy schedules), a deep learning model (to predict the evolution of the effective reproduction number induced by a given policy schedule) and an epidemiological model (to forecast, based on the computed effective reproduction numbers, the effect of the scheduled policies on public health over time, e.g., deaths and hospitalization occupancy). our three components work within a feedback loop. at each iteration, the genetic algorithm builds a population of policy schedules, which the deep learning and epidemiological models allow us to evaluate. in turn, this feedback is used to generate better schedules, optimizing health-related objectives (e.g., minimize total deaths) while satisfying hard constraints (e.g., never exceed hospitalization capacity). epidemiological models predict the state of a population struck by a pandemic over time, based on state transition parameters and the evolution of the effective reproduction number, r_t, of the disease. we use an extension of the seir model, the sei-hcrd compartmental model: susceptible (s) → exposed (e) → infectious (i) → removed (hospitalized (h), critical (c), recovered (rec), dead (d)). such a model can be defined by a system of ordinary differential equations in which s + e + i + h + c + rec + d is equal to the total population and r_t denotes the effective reproduction number over time. the sei-hcrd model involves several parameters. each parameter t_x denotes the transition time estimated to transit from one subpopulation to the next: t_inc is the average incubation period, t_inf is the average infectious period, t_hosp is the average hospitalization time in a normal state (i.e., until the patient recovers or enters a critical state), and t_crit is the average hospitalization time in a critical state (i.e., until death or recovery).
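the system of ordinary differential equations itself did not survive extraction, so the sketch below reconstructs a generic sei-hcrd step from the transition times above and the severity fractions m, c and f defined next in the text. the functional forms are our assumption of the standard compartmental formulation, not necessarily the authors' exact equations, and the default parameter values are illustrative.

```python
def sei_hcrd_step(state, r_t, t_inc=3.0, t_inf=5.0, t_hosp=4.0,
                  t_crit=14.0, m=0.8, c=0.1, f=0.3, dt=1.0):
    """one euler step of a normalized sei-hcrd system (fractions sum to 1)."""
    s, e, i, h, cr, rec, d = state
    beta = r_t / t_inf
    new_exposed = beta * s * i
    ds   = -new_exposed
    de   = new_exposed - e / t_inc
    di   = e / t_inc - i / t_inf
    dh   = (1 - m) * i / t_inf - h / t_hosp   # severe fraction is hospitalized
    dcr  = c * h / t_hosp - cr / t_crit       # fraction c of hospitalized turns critical
    drec = m * i / t_inf + (1 - c) * h / t_hosp + (1 - f) * cr / t_crit
    dd   = f * cr / t_crit                    # fraction f of critical cases dies
    deltas = (ds, de, di, dh, dcr, drec, dd)
    return tuple(x + dt * dx for x, dx in zip(state, deltas))
```

the derivative terms cancel pairwise, so the total population is conserved at every step, matching the constraint that s + e + i + h + c + rec + d equals the total population.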
the parameters m, c and f determine the severity of the infection: m is the percentage of infected individuals with non-severe symptoms (i.e., they are asymptomatic or have mild symptoms) and who, therefore, are not hospitalized; c is the percentage of hospitalized persons who will eventually enter a critical state; finally, f denotes the percentage of persons in the critical state who will pass away. table 1 reports the parameter values used in the sei-hcrd model. to set these parameters, we lean on liu et al.'s study [5] and assign them the constant values reported in table 1. then, given the time series of effective reproduction numbers over time, {r_t}, the sei-hcrd model computes the resulting impact on the population, including the number of deaths. instead of manually assigning fitted values to r_t, we propose to predict them from scheduled exit strategies using deep learning models. of course, the values of these features are country-specific and are largely impacted by the mitigation strategy of each country. for example, figure 2 shows the evolution of all features for luxembourg, italy and japan. we observe that italy and luxembourg have drastically reduced their activities, whereas japan does not exhibit a significant reduction in the case of schools and international travel. next, we clean the collected data. when some values (for a given feature) are missing, we fill the gap by interpolating between the closest days with available information. for each category of places, we smooth the corresponding feature over the 5, 10, 15, and 30 past days, resulting in four new features (mobility data: https://www.google.com/covid19/mobility/). we complete our dataset with demographic features and with the corresponding day of the week to take into account the weekly fluctuations of the data.
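the smoothing step can be sketched as a trailing moving average per window; the window lengths (5, 10, 15 and 30 past days) follow the text, while the helper name and the exact averaging convention at the start of the series are ours.

```python
def trailing_means(series, windows=(5, 10, 15, 30)):
    """return {window: smoothed series}, averaging each day over up to `window` past days."""
    smoothed = {}
    for w in windows:
        smoothed[w] = [
            sum(series[max(0, t - w + 1): t + 1]) / (t - max(0, t - w + 1) + 1)
            for t in range(len(series))
        ]
    return smoothed
```

each daily mobility feature thus yields four extra smoothed features, which is how the 32-feature input vectors described next are assembled.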
overall, our feature engineering process yields 4,625 inputs of 32 features each, which are recapitulated in a summary table. the model can be seen as a supervised predictor, taking as input the mobility and demographic features to predict an effective reproduction number, r_t, for each time index t. thus, in order to train it, we need to label the dataset with an r_t value for each day of the training period. since the real effective reproduction numbers are not known, we estimate them by fitting the sei-hcrd model to the real-world numbers of cases and deaths. since the r_t values are time-dependent, we represent them as a decaying function and seek the parameter values for this function that yield the fittest sei-hcrd model. we opted for the hill decay function because it has shown good results in the literature [10]. thus, r_t is given by r_0 / (1 + (t/l)^k), where r_0, k and l are parameters. we use the l-bfgs optimization method to find the values of these parameters that minimize the mean square error of the number of cases and deaths predicted by the sei-hcrd model. once these parameter values are found, they can be injected back into the hill decay function to generate a time series of the past values of r_t. the analysis of feature correlations shows that, when working on a country-by-country basis, all the mobility features are highly correlated (over 0.75) for some countries, as activity reductions and closures were enacted in most sectors at the same time. since co-linear features offer little to no information gain to the learner, we train the model on all countries to reduce the correlation between the different features. thus, the model is trained using all the countries at once and the train/test split is done randomly. once a training set with expected values (r_t values computed from the estimated hill decay function) is established, a supervised model can be trained to predict the future values of r_t using the mobility and demographic data as features.
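the hill decay and its fitting can be sketched as follows. the paper fits r_0, k and l with l-bfgs against predicted cases and deaths; to keep this sketch dependency-free, it instead recovers the parameters with a simple grid search directly against observed r_t values, which illustrates the same labelling idea under that simplifying assumption.

```python
def hill_rt(t, r0, k, l):
    """hill decay: r_t = r0 / (1 + (t / l)**k)."""
    return r0 / (1.0 + (t / l) ** k)

def fit_hill(ts, observed, r0_grid, k_grid, l_grid):
    """return the (r0, k, l) combination minimizing mean squared error on observed r_t."""
    best, best_err = None, float("inf")
    for r0 in r0_grid:
        for k in k_grid:
            for l in l_grid:
                err = sum((hill_rt(t, r0, k, l) - y) ** 2
                          for t, y in zip(ts, observed)) / len(ts)
                if err < best_err:
                    best, best_err = (r0, k, l), err
    return best
```

once fitted, evaluating `hill_rt` over past days yields the r_t labels used to supervise the predictor.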
to do so, we rely on a feed-forward neural network (ffnn). the architecture of the ffnn and its hyperparameters are optimized using a grid search to minimize the mean square error with cross-validation. the search leads to an architecture with 2 fully-connected hidden layers with, respectively, 1000 and 50 neurons. in addition to the ffnn, we also evaluate two other estimators. since the problem takes the form of a time series, we investigate the performance of a long short-term memory (lstm) network, which allows sequence-to-sequence transformation, hence learning from the features of the past days to predict the next days. we use a 15-day window to predict the values for the next 7 days. note that in this case, at each step, we use the computed values of r_t from the past iterations as input for the model in addition to the rest of the features. finally, we investigate one last approach, gradient boosting using 500 estimators. for each of the approaches, we evaluate the performance using two classical metrics, i.e., the coefficient of determination (r² score) and the root mean square error (rmse); the closer the r² score is to 1 the better, and the closer the rmse is to 0 the better. the train:test splits were performed with (1) a random split between train and test data and (2) a split holding out entire countries, to test transfer to unseen countries. our search method uses nsga-ii [2], an established genetic algorithm (ga) for multi-objective optimization that uses non-dominated sorting to find pareto-optimal solutions. solution space: any solution generated by nsga-ii is a policy schedule. a schedule consists of a list of vectors, each of which is associated with a mobility feature and encodes the value of the feature for each time index t. a value ranges from 0 (no restriction) to 100 (full lockdown). the indices t go from april 30 to september 30, with steps of 2 weeks. objectives: we use 2 fitness functions which represent the competing health and societal impacts of a policy schedule.
we quantify these impacts with the total number of deaths between april 30 and december 30 and the mean of the mobility feature values over the same period. the first objective must be minimized while the second is maximized. constraints: for a policy schedule to be an acceptable solution, we require that the number of critical cases never exceeds the hospitalization (icu) capacity of the country of interest. this is an important requirement for policymakers, as critical cases that cannot be properly hospitalized will likely result in additional deaths. selector: current pareto-front solutions are selected in priority. if there are more than the population size, they are filtered based on a crowding distance (here, we use the manhattan distance in the objective space). otherwise, we fill the population with non-optimal solutions, selected using binary tournament selection: pareto-dominant solutions are retained in priority; in case of non-dominance, the crowding distance to the pareto front is used for tie-breaking. we rely on pymoo to implement our nsga-ii search and use the library's default values for the remaining parameters, such as the mutation rate or crossover. our end goal is to provide decision-makers with a tool allowing them to easily generate exit strategies in order to evaluate their impact. to achieve this, we use a deep neural network as a proxy to evaluate the hyperparameters of a sei-hcrd model based on mobility and demographic data. however, to be useful, the neural network needs to capture enough information in the mobility and demographic data alone to make accurate predictions. hence, we formulate the first research question as follows: can we predict the effective reproduction number based on mobility and demographic data?
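the non-dominated sorting at the heart of nsga-ii reduces to a pareto-dominance test. in this sketch both objectives are encoded as minimization (e.g., deaths and negated mobility), which is our encoding choice, not necessarily the paper's:

```python
def dominates(a, b):
    """true if a is at least as good as b on all objectives and strictly better on one (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """keep only the solutions not dominated by any other solution."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]
```

the selector described above then ranks these non-dominated solutions first, falling back to crowding distance and binary tournaments for the rest of the population.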
ultimately, the proposed approach is intended to allow policymakers to evaluate and select exit strategies by analysing their impact on multiple aspects such as the number of deaths, possible overflow of healthcare capacities or the perturbation of economic activities. to evaluate the capacity of our approach to model such strategies, we evaluate it by predicting the impact of various popular scenarios. we conclude our investigation with a comparison of the impact of the popular scenarios against the impact of the ones proposed by the search algorithm. the algorithm minimizes the number of deaths and the socioeconomic impact generated by a diminution of activities (mobility) while avoiding the over-saturation of healthcare capacities. to evaluate the results, we compare them to the "naive" scenarios formulated in the previous research question and thus ask: how do the exit strategies proposed by the search algorithm perform against popular ones? (we rely on pymoo, www.pymoo.org, the most starred python ga library on github; fig. 3 shows the rmse between true cases and cases predicted by our dn-seir model and a fitted seir model.) comparison of a fitted seir model and our dn-seir model: we estimate the confidence interval by evaluating the mean and standard deviation of a bayesian ridge regressor on each element of the test set (using the same training set as the ffnn). we use a grid search over 3 values for each of its 2 hyper-parameters, α_init (0.1, 1, 1.9) and λ_init (1, 0.1, 0.01). we compare in table 3 the predicted cases of the time-dependent seir (fitted on past cases/deaths) and the dn-seir approach. the dn-seir approach has a lower error against the ground-truth case counts for 9 of the 12 countries in comparison with a time-regression approach. besides, the ground truth falls within the confidence interval of the dn-seir model for 10 of the 12 countries evaluated.
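the error metrics used throughout these comparisons admit compact standard definitions (r² closer to 1 is better, rmse closer to 0 is better); this is a plain restatement of the usual formulas, not code from the paper:

```python
import math

def rmse(y_true, y_pred):
    """root mean square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def r2_score(y_true, y_pred):
    """coefficient of determination: 1 - ss_res / ss_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot
```

note that a constant predictor equal to the mean of the ground truth scores exactly 0 on r², which is the usual baseline against which the 0.95 transfer score should be read.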
it is worth noting that while 5 of the 12 countries lie under 5% error, 9 of the 12 countries do not exceed 15% error. in figure 3, we evaluate the rmse between the predicted cases of the seir model and the dn-seir approach (with its optimistic and pessimistic boundaries) across all the countries of the dataset. the simulation spans 7 days (instead of 12 days for the results presented in table 3) and we compare the number of cases on april 29th, the last day of available mobility data for all countries. we use the wilcoxon signed-rank test to check whether two distributions are equal. the results show that for all three predictions, dn-seir, dn-seir max and dn-seir min, we can reject the hypothesis (p-values ≪ 0.05) that they are equal to the seir prediction. we then perform a vargha and delaney a12 test to analyze the effect size. we see that our approach generates a lower rmse, with a small effect size. these results indicate that even in the very short term (7 days), relying only on the mobility data yields better results than applying a regression over the past values. shap shows how the trends in the features, modelled in our approach using smoothed features, play a significant role in the final prediction, with a contribution opposing that of the associated daily feature. for the transit, retail, and park features, the trend values over 5, 10 and 15 days counter the impact of their respective daily values, yet on a lower scale. this indicates an inertia phenomenon that can be explained by the delay between the actual numbers (r_t, cases, deaths) and the reported ones, and also by the delay inherent to the epidemiological model. the comparison of the three countries also shows that the mobility features have a much higher impact on the prediction in italy and luxembourg than in japan, as their shap impact is much wider. this hints that other social distancing practices in japan (masks, for instance) could reduce the impact of mobility.
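the vargha and delaney a12 statistic used above has a direct definition: the probability that a value drawn from one sample exceeds a value drawn from the other, counting ties as one half. a minimal sketch (naming is ours):

```python
def a12(xs, ys):
    """vargha-delaney a12: p(x > y) + 0.5 * p(x == y) over all pairs."""
    greater = sum(1 for x in xs for y in ys if x > y)
    ties = sum(1 for x in xs for y in ys if x == y)
    return (greater + 0.5 * ties) / (len(xs) * len(ys))
```

values near 0.5 indicate a negligible effect, which is why a significant wilcoxon result is paired with a12 to judge whether the rmse improvement is practically small or large.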
our approach yields predictions with much lower errors than pure epidemiological models in 75% of the cases and achieves a 95% r² score when the learning is transferred and tested on unseen countries. in this section, we investigate four exit strategies for leaving the lockdowns that were taking place all over the world. the goal of an exit strategy is to allow a return to normal activities while minimizing the impact on the number of deaths and avoiding peaks in hospitalization that would saturate healthcare facilities. the strategies that we investigate are the following: • hard exit: all mobility activities are resumed to normal on may 11, 2020. • progressive exit: mobility activities are gradually restored, with an increase of 15% of the activity every 2 weeks until the pre-lockdown activity level is reached. • cyclic exit: every two weeks, activity is resumed to normal and then brought back to the lockdown situation. the process is repeated for 4 cycles, thus ending on 03/08/2020. • status quo: the current situation (as of april 30th) is maintained for the entire period. figure 5 shows the evolution of the r_t values for luxembourg, italy and japan. as expected, we see no evolution from the initial value in the status quo case, a cyclic fluctuation in the case of a cyclic exit, a soft increase of the r_t values when applying a progressive exit and, finally, an abrupt jump in the case of a hard exit. r_t typically reaches a plateau quite rapidly after a strategy is applied. the plateau depends on the mobility conditions; therefore, we see two plateaus in the results: one where all activities remain closed (status quo), and one that all the other strategies reach when the mobility levels are restored to their pre-lockdown baseline. we evaluate these strategies for the three countries luxembourg, italy and japan and obtain the results depicted in table 4.
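the four hand-crafted strategies can be encoded as period-by-period restriction schedules on the 0 (no restriction) to 100 (full lockdown) scale used by the search. the helper names, the biweekly granularity and the starting level of 100 are our illustrative choices, not the paper's exact encoding:

```python
def hard_exit(periods):
    """full lockdown lifted all at once at the first period."""
    return [0] * periods

def progressive_exit(periods, lockdown=100, step=15):
    """restriction level drops by `step` points each period until fully lifted."""
    return [max(0, lockdown - step * t) for t in range(1, periods + 1)]

def cyclic_exit(periods, lockdown=100):
    """alternate between fully open and full lockdown every period."""
    return [0 if t % 2 == 0 else lockdown for t in range(periods)]

def status_quo(periods, lockdown=100):
    """keep the current restriction level for the whole horizon."""
    return [lockdown] * periods
```

a genetic-algorithm solution is simply one such vector per mobility feature, which is what makes the hand-crafted and searched strategies directly comparable.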
we choose these three countries because they present differences in their demography, mitigation strategies and number of deaths attributed to covid-19. furthermore, they are amongst the few to provide reliable data about hospitalization capacities, hence allowing us to incorporate this information when looking for optimal policy schedules. we compute for each strategy the area under the curve (auc) of each mobility metric and provide the mean across all mobility values, and compare these hand-crafted strategies with the ones found by our genetic algorithm search. the search is run for 100 generations with a population size of 100 and the hard constraint that critical hospitalizations should not exceed the country's icu capacity (2,054 for italy, 1,822 for japan and 42 for luxembourg [6]). all strategies are evaluated between april 30th and september 30th. we report 2 metrics: the total deaths on september 30th and the mean area under the curve across all the mobility features over the 5 months. the latter reflects an economic objective that we need to maximize, while the former is a healthcare objective. we state in the table three strategies found on the pareto front: s1 is the pareto solution with the lowest death toll, s3 is the strategy with the highest mobility activity (and hence the highest death toll) and s2 has the median death toll. our study shows that progressive lift strategies yield a similar economic footprint as 2-week cyclic strategies with fewer casualties (7% fewer deaths for italy and 10 times fewer for luxembourg). our approach allows us to see drastic changes based on different exit strategies. the progressive strategy offers in our experiments a better outcome than a hard or a cyclic strategy. the search-based strategies (s1, s2, s3) perform better than the manual strategies on the death metric for italy and japan; for luxembourg, s1 performs as well as the progressive strategy.
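the area-under-curve metric over mobility series can be computed with a trapezoidal rule and averaged across features; this mirrors the description in the text, though the paper's exact normalization is not specified:

```python
def auc_trapezoid(values, dx=1.0):
    """trapezoidal area under a regularly sampled curve."""
    return sum((a + b) / 2.0 * dx for a, b in zip(values, values[1:]))

def mean_mobility_auc(feature_series):
    """average auc across mobility features (dict of feature name -> series)."""
    return sum(auc_trapezoid(v) for v in feature_series.values()) / len(feature_series)
```

a higher mean auc over activity levels corresponds to more economic activity, which is the quantity the search maximizes against the death toll.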
overall, our results show that the search for exit strategies can be guided by and restricted to the policy-makers' constraints (i.e., hospital capacity) and yields actionable strategies within constrained computation time. we ran the experiments 100 times (100 generations each) and the hypervolume of the pareto solutions converges within 80 generations. the search algorithm yields better strategies than the popular ones, both in terms of impact on activity and number of deaths, while ensuring that the healthcare facilities are not overwhelmed. in this paper, we studied dn-seir, a data-driven approach to evaluate the effective reproduction number of the covid-19 epidemic. in particular, we considered both manual and search-based mitigation strategies, with the aim of helping decision-makers in the evaluation and selection of exit strategies. to this end, we evaluated the state-of-the-art compartmental model (i.e., seir) and showed that our approach yields predictions closer to the ground truth. we also demonstrated that learning can transfer across different countries and that a simple ffnn provides accurate and interpretable predictions. finally, we proposed a search-based approach to evaluate and find optimal strategies that satisfy the constraints of the health facilities and achieve a quick economic recovery with limited casualties. our approach paves the way to automated strategy simulation and search, and provides a simple yet powerful tool for policymakers to tailor exit strategies to their context and priorities. we could go further than our current approach with better feature engineering or neural architecture search (with cnns or rnns). we could also extend the data-driven prediction of hyper-parameters beyond the effective reproduction number to all the epidemiological parameters, such as the hospitalization rate.
this would require having access to accurate hospitalization data across a large pool of countries and can be achieved in the near future as more countries share such data. finally, we could extend our technique to a finer-grained approach that takes into account age-specific or location-specific epidemiological models.

references:
[1] covid-19 outbreak prediction with machine learning.
[2] a fast and elitist multiobjective genetic algorithm: nsga-ii.
[3] modelling the epidemiology of infectious diseases for decision analysis.
[4] supervised forecasting of the range expansion of novel non-indigenous organisms: alien pest organisms and the 2009 h1n1 flu pandemic.
[5] liu et al. (with annelies wilder-smith and joacim rocklöv). 2020. the reproductive number of covid-19 is higher compared to sars coronavirus.
[6] coronavirus pandemic (covid-19).
[7] interpretable machine learning.
[8] long-term predictors of dengue outbreaks in bangladesh: a data mining approach.
[9] rajan gupta and saibal pal. 2020. seir and regression model based covid-19 outbreak predictions in india.
[10] towards a comprehensive simulation model of malaria epidemiology and control.
[11] sirnet: understanding social distancing measures with hybrid neural network model for covid-19 infectious spread.
[12] sirnet: understanding social distancing measures with hybrid neural network model for covid-19 infectious spread.
[13] a sub-national analysis of the rate of transmission of covid-19 in italy.
[14] using mobility to estimate the transmission intensity of covid-19 in italy: a subnational analysis with future scenarios.
[15] world health organisation. 2020. a coordinated global research roadmap: 2019 novel coronavirus. march.
[16] modified seir and ai prediction of the epidemics trend of covid-19 in china under public health interventions.

key: cord-264994-j8iawzp8 authors: fitzpatrick, meagan c.; bauch, chris t.; townsend, jeffrey p.; galvani, alison p.
title: modelling microbial infection to address global health challenges date: 2019-09-20 journal: nat microbiol doi: 10.1038/s41564-019-0565-8 sha: doc_id: 264994 cord_uid: j8iawzp8 the continued growth of the world's population and increased interconnectivity heighten the risk that infectious diseases pose for human health worldwide. epidemiological modelling is a tool that can be used to mitigate this risk by predicting disease spread or quantifying the impact of different intervention strategies on disease transmission dynamics. we illustrate how four decades of methodological advances and improved data quality have facilitated the contribution of modelling to address global health challenges, exemplified by models for the hiv crisis, emerging pathogens and pandemic preparedness. throughout, we discuss the importance of designing a model that is appropriate to the research question and the available data. we highlight pitfalls that can arise in model development, validation and interpretation. close collaboration between empiricists and modellers continues to improve the accuracy of predictions and the optimization of models for public health decision-making. microbial pathogens are responsible for more than 400 million years of life lost annually across the globe, a higher burden than either cancer or cardiovascular disease 1 . diseases that have long plagued humanity, such as malaria and tuberculosis, continue to impose a staggering toll. recent decades have also witnessed the emergence of new virulent pathogens, including human immunodeficiency virus (hiv), ebola virus, severe acute respiratory syndrome (sars) coronavirus, west nile virus and zika virus. the persistent global threat posed by microbial pathogens arises from the nonlinear mechanisms of disease transmission. that is, as the prevalence of a disease is reduced, the density of immune individuals drops, the density of susceptible individuals rises and disease is more likely to rebound.
the resultant temporal trajectories are difficult to predict without considering this nonlinear interplay. for instance, many microbial diseases exhibit periodic spikes in the number of cases that are unexplainable by pathogen natural history or environmental phenomena. by explicitly defining the nonlinear processes underlying infectious disease spread, transmission models illuminate these otherwise opaque systems. forty years ago, nature published a series of papers that launched the modern era of infectious disease modelling 2,3 . since that time, these methodologies have multiplied 4 . transmission models now employ a variety of approaches, ranging from agent-based simulations that represent each individual 5 to compartmental frameworks that group individuals by epidemiological status, such as infectiousness and immunity 2, 3 . accompanying the methodological innovations, however, are challenges regarding selection of appropriate model structures from among the wealth of possibilities 6 . at this anniversary of the publication of these landmark papers 2,3 , we reflect on contributions that transmission modelling has made to infectious disease science and control. through a series of case studies, we illustrate the overarching principles and challenges related to model design. with expanding computational capacity and new types of data, myriad opportunities have opened for transmission modelling to bolster evidence-based policy (box 1) 7, 8 . in all pursuits, modelling is most informative when conducted collaboratively with microbiologists, immunologists and epidemiologists. we offer this perspective as an entry point for non-modelling scientists to understand the power and flexibility of modelling, and as a foundation for the transdisciplinary conversations that bolster the field. even within the same disease system, the ideal model design depends on the specifics of the questions asked. 
here, we highlight a series of models focused on one of the defining infectious agents of our era: hiv. the virus has challenged science, medicine and public health at every scale, from its deft immune evasion to its death toll of more than 35 million over the last four decades 9 . we describe how clinical needs, research questions and data availability have shaped the design of hiv models across these scales. unless otherwise indicated, the term 'hiv' is inclusive of both hiv-1 and hiv-2. within-host models. at a within-host scale (table 1) , models can be used to simulate cellular interactions, immunological responses and treatment pharmacokinetics 10 . in such simulations, viral dynamics are often modelled using a compartmental structure, with the growth of one population, such as circulating virions, dependent on the size of another population, such as infected cells. for example, a seminal within-host model fit to viral load data by perelson et al. 11 revealed high turnover rates of hiv-1, counter to what was then the prevailing assumption that hiv-1 remained dormant during the asymptomatic 'latency' phase. the corollary to these high rates of viral turnover was that drug resistance would likely evolve rapidly under monotherapy. further analyses of this model indicated that a combination of at least three drugs was necessary to maintain drug sensitivity 12 . once combination therapy did become available, extension of the perelson et al. model demonstrated that the two-phase decline in viral load observed following treatment initiation was attributable to a reservoir of long-lived infected cells 13 . with this insight also came the realization that prolonged treatment would be necessary to suppress viral load. the incorporation of
stochasticity into this within-host framework allowed model fitting to 'viral blips', transient peaks in viral load even under antiretroviral treatment 14 . analysis of this data-driven stochastic model demonstrated that homeostatic proliferation maintained the infected cell reservoir and produced these viral blips, a finding that was later confirmed experimentally 15, 16 . the implication for clinical care was that intensified antiretroviral treatment would be unable to eliminate the latent reservoir of infected cells as had been hypothesized, sparing patients from potentially fruitless trials with such regimens. individual-based models. whereas the unit of interest for within-host modelling is an infected cell, the analogous unit for individual-based models is an infected person (table 1) 5, 17, 18 .
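the within-host viral-dynamics models described above can be sketched with the standard target-cell-limited formulation (dT/dt = lam - d*T - k*V*T, dI/dt = k*V*T - delta*I, dV/dt = p*I - c*V). this is a generic sketch of the model class, not the cited study's exact system, and the parameter values are purely illustrative.

```python
def viral_step(state, lam=100.0, d=0.1, k=1e-4, delta=0.5, p=10.0, c=3.0, dt=0.01):
    """one euler step of target-cell (t), infected-cell (i), virion (v) dynamics."""
    t, i, v = state
    dt_cells = lam - d * t - k * v * t   # target cells produced, die, get infected
    di = k * v * t - delta * i           # infected cells created, then cleared
    dv = p * i - c * v                   # virions produced by infected cells, cleared
    return (t + dt * dt_cells, i + dt * di, v + dt * dv)
```

fitting such a system to viral load data is what exposed the fast turnover of hiv-1: the clearance terms must be large to reproduce the observed decay of virus under therapy.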
Individual-based models are often used to explore the interplay between disease transmission and individual-level risk factors, such as comorbidities, sexual behaviours and age. Such models are capable of incorporating data with individual-level granularity, including those regarding contact patterns, patient treatment cascades and clinical outcomes. Individual-based models are uniquely suited for representing overlap in individual-level risk factors and translating the implications of this overlap for public health policy. For example, an individual-based model was recently used to demonstrate that the majority of HIV transmission among people who inject drugs in New York City is attributable to undiagnosed infections [18]. These modelling results underscore the urgency for the city to invest in more comprehensive screening and improved diagnostic practices.

Population models. Most commonly, models are created at the population scale, capturing the spread of a pathogen through a large group (Table 1). At this scale, compartmental models shift in focus from the pathogen to the host. Unlike individual-based models, compartmental models will aggregate individuals with a similar epidemiological status. For instance, the archetypical S-I-R model separates the entire population of interest into one of three categories: S, susceptible to infection; I, infected and infectious; or R, recovered and protected [19]. In practice, most models will have additional compartments or stratification beyond this simple structure. Age stratification is essential when either the disease risk or the intervention is age-specific. As an example, an age-stratified multipathogen model demonstrated that schistosomiasis prevention targeted to Zimbabwean schoolchildren could cost-effectively reduce HIV acquisition later in life [20].
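The archetypical S-I-R dynamics can be made concrete with a minimal sketch (this is a generic illustration, not a model from any study cited here). The transmission rate beta, recovery rate gamma and initial conditions are illustrative; the ratio beta/gamma is the basic reproduction number R0, here equal to 3.

```python
# Minimal S-I-R model integrated with simple Euler steps. S, I and R are
# population fractions; beta and gamma are illustrative placeholder rates.

def sir(beta=0.3, gamma=0.1, s0=0.999, i0=0.001, days=365, dt=0.01):
    s, i, r = s0, i0, 0.0
    peak = i
    for _ in range(int(days / dt)):
        new_inf = beta * s * i * dt    # S -> I transitions this step
        rec = gamma * i * dt           # I -> R transitions this step
        s, i, r = s - new_inf, i + new_inf - rec, r + rec
        peak = max(peak, i)
    return s, i, r, peak

s, i, r, peak = sir()
print(f"R0 = {0.3 / 0.1:.1f}, peak prevalence = {peak:.2%}, final size = {r:.2%}")
```

Even this toy version reproduces the qualitative facts the text relies on: an epidemic takes off only when R0 > 1, and the final attack rate falls short of 100% because susceptibles are depleted before everyone is infected.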
This framework was extended to additional countries with a range of age-specific disease prevalence and co-infection rates to assess the potential value of treating schistosomiasis in adults. Although adult treatment is not usually considered efficient, the model showed that it could be cost-effective in settings with high HIV prevalence [21]. These models strengthened the investment case for treatment of schistosomiasis, an otherwise neglected tropical disease.

Network models are also deployed to represent dynamics on the population scale (Table 1). These models impose a structure on contacts between hosts, unlike compartmental models, which assume that contacts are random among hosts within a compartment. In a network model, nodes represent individuals and the connections between nodes represent contacts through which infection may spread [22]. Sources for network parameterization may include surveys, partner notification services or phylogenetic tracing [23,24]. As with individual-based models, network models tend to require significant amounts of data to fully parameterize, but various computational and statistical methods have been developed to analyse the impact of uncertain parameter values on model predictions [25]. Network models are applied to discern the influence of contact structure on disease transmission and on the effectiveness of targeted intervention strategies. For instance, network models predicted that HIV would spread more quickly through sexual partnerships that are concurrent versus serially monogamous, even if the total numbers of sexual acts and partners remain constant [26]. The study prompted a more rigorous engagement of epidemiologists with sociological data to tailor interventions for specific settings [27].
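The node-and-edge structure described above can be sketched with a small stochastic simulation. The ring-plus-shortcuts network below is a generic stand-in for survey- or tracing-derived contact data, and the transmission probability is an illustrative assumption.

```python
import random

# Stochastic spread on an explicit contact network, in contrast with the
# random-mixing assumption of compartmental models. Network shape and the
# per-contact transmission probability are illustrative placeholders.

def make_network(n=500, k=4, shortcuts=50, rng=None):
    rng = rng or random.Random(1)
    nbrs = {v: set() for v in range(n)}
    for v in range(n):                      # ring lattice: k nearest neighbours
        for j in range(1, k // 2 + 1):
            nbrs[v].add((v + j) % n)
            nbrs[(v + j) % n].add(v)
    for _ in range(shortcuts):              # random long-range contacts
        a, b = rng.randrange(n), rng.randrange(n)
        if a != b:
            nbrs[a].add(b)
            nbrs[b].add(a)
    return nbrs

def epidemic_size(nbrs, p_transmit=0.5, rng=None):
    rng = rng or random.Random(2)
    infected, recovered = {0}, set()
    while infected:
        new = set()
        for v in infected:
            for u in nbrs[v]:               # each contact may transmit once
                if u not in infected and u not in recovered and u not in new:
                    if rng.random() < p_transmit:
                        new.add(u)
        recovered |= infected               # each case infectious for one step
        infected = new - recovered
    return len(recovered)

net = make_network()
size = epidemic_size(net)
print("final outbreak size:", size, "of", len(net))
```

Rewiring the shortcuts or concentrating edges in a high-degree cluster changes the outcome even when the total number of contacts is held fixed, which is exactly the concurrency effect the cited network studies formalize.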
Other network models have focused on the more rapid transmission within clusters of high-risk individuals and slower transmission to lower-risk clusters, a dynamic which explains discrepancies between observed incidence patterns and the expected pattern based on an assumption of homogeneous risks [28]. These studies both illustrate the importance of accounting for network-driven dynamics when individuals are highly aggregated with regard to their risk factors, and when appropriate data for parameterization are available.

Metapopulation models. Metapopulation models represent disease transmission at dual scales, considering not just the interactions of individuals, but also the relationships between groups of individuals, which are typically defined geographically (Table 1). Transmission intensity is often higher within groups than across groups, especially when the groups are spatially segregated [29]. One metapopulation model of HIV in mainland China considered transmission within and between provinces, driven by the mobility of migrant labourers [30]. The study suggested that HIV prevention resources could be most effectively targeted to provinces with the greatest initial incidence, as rising incidence in other provinces is driven more by migration from the high-burden provinces than by local transmission.

Box 1 | There are three principal objectives of modelling, all of which can inform public health policy.

Predicting disease spread. Models can be used to estimate the infectiousness of a pathogen within a given population. A fundamental concept is that of R0, the basic reproduction number, which quantifies the number of infections that would result from a single index case in a susceptible population. R0 governs the temporal trajectory of an outbreak and the scale of interventions required for its containment. Models may be used to infer R0 as well as forecast changes in R0 that could drive transitions in epidemic dynamics, such as the shift from sporadic outbreaks to sustained chains of transmission. Example: assessing real-time Zika risk in Texas [90].

Selecting among alternative control strategies. Simultaneous field trials of multiple infectious disease control options are often infeasible. Models can simulate a wide range of control strategies and thus optimize public health policies according to translational objectives and real-world constraints. Modelling can also extrapolate from the individual clinical outcomes of interventions or novel therapeutics to the population-level impacts. Extrapolating to the population level is essential to evaluate the indirect benefits of interventions, including a reduction in transmission, or unanticipated repercussions, such as evolution of resistance. Example: comparing antibiotic 'cycling' versus 'mixing' to minimize the evolution of antimicrobial resistance [107].

Hypothesis testing. It is often logistically or ethically infeasible to empirically test scientific hypotheses in the field or experimentally. Modelling can identify parsimonious explanations of observed phenomena, including complex outcomes that can arise from the nonlinear processes common in microbiological systems. Even simple models can be useful to help us understand dynamics that are common to many microbiological systems through identification of basic mechanisms that apply across a range of infections. By examining a new infectious agent through the lens of previously characterized systems, models provide insight into the ways that a particular microbial infection might follow or break from typical patterns. Example: investigating whether individual heterogeneity within social networks significantly impacts disease spread [22].
Given that the Chinese provinces with employment opportunities for migrants are also those with the heaviest burden of HIV, migrant workers who acquire HIV often do so in the province where they work. However, government policy requires migrants to return to their home province for treatment. The movement of these workers perpetuates the disease cycle, as new migrants move to fill the vacated jobs and themselves become exposed to elevated HIV risk. These results therefore call for reconsideration of provincial treatment restrictions.

Multinational models. Global policies, such as the treatment goals set by the Joint United Nations Programme on HIV/AIDS (UNAIDS), have been modelled on a global scale (Table 1) by considering the effectiveness of the policies for each nation. For example, a compartmental model was used to evaluate the potential impact of a partially efficacious HIV vaccine on the epidemiological trajectories in 127 countries that together constitute over 99% of the global burden [31]. The model was tailored to each country by fitting to country-specific incidence trends as well as diagnosis, treatment and viral suppression data. This model revealed that, even with efficacy as low as 50%, an HIV vaccine would avert millions of new infections worldwide, irrespective of whether ambitious treatment goals are met. These results identify the synergies between vaccination and treatment-as-prevention, and provide evidence to support continued investment in vaccine development [9,32].

From the cellular level to the population level, HIV modelling has led to improvements in drug formulations, clinical care and resource allocation. As scientific advances continue to bring pharmaceutical innovations, modelling will remain a useful tool for illuminating transmission dynamics and optimizing public health policy. HIV was not controlled before it became a pandemic, but our response to future outbreaks has the potential to be more timely [33].
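A metapopulation structure of the kind used for the provincial HIV model can be sketched as S-I-R dynamics within each patch, coupled by a small mobility term. The three patches, the coupling strength and the epidemiological rates below are illustrative assumptions, not values from the cited study.

```python
# Metapopulation sketch: S-I-R inside each patch (e.g., province), coupled by
# a mobility fraction that mixes a little infection pressure between patches.
# All parameter values are illustrative placeholders.

def metapop_sir(n_patches=3, beta=0.25, gamma=0.1, mobility=0.02,
                days=400, dt=0.05):
    S = [1.0] * n_patches
    I = [0.0] * n_patches
    R = [0.0] * n_patches
    S[0], I[0] = 0.999, 0.001          # patch 0 seeds the epidemic
    for _ in range(int(days / dt)):
        # force of infection mixes local prevalence with imported pressure
        force = []
        for p in range(n_patches):
            imported = sum(I[q] for q in range(n_patches) if q != p)
            force.append(beta * ((1 - mobility) * I[p]
                                 + mobility * imported / (n_patches - 1)))
        for p in range(n_patches):
            new_inf = force[p] * S[p] * dt
            rec = gamma * I[p] * dt
            S[p] -= new_inf
            I[p] += new_inf - rec
            R[p] += rec
    return R

final = metapop_sir()
print("final attack rates by patch:", [round(r, 3) for r in final])
```

Even a coupling as small as 2% eventually seeds full epidemics in the initially uninfected patches, which is why the cited study found imported cases, rather than local transmission, driving the rise in low-burden provinces.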
When diseases emerge in new settings, such as Ebola in West Africa and SARS in China, modelling can be rapidly deployed to inform and support response efforts (Fig. 1). Unfortunately, the urgency of public health decisions during such outbreaks tends to be accompanied by a sparsity of data with which to parameterize, calibrate and validate models. As detailed below, uncertainty analysis (a method of analysing how uncertainty in input parameters translates to uncertainty in model outcome variables) becomes all the more vital in these situations. Media attention regarding model predictions is often heightened during outbreaks, ironically at a time when modelling results are apt to be less robust than for well-characterized endemic diseases. We discuss the importance of careful communication regarding model recommendations and associated uncertainty to inform the public without fuelling excessive alarm. Despite these challenges, and especially if these challenges can be navigated, the timely assessment of a wide range of intervention scenarios made possible by modelling would be particularly valuable during infectious disease emergencies.

Ebola virus outbreaks. The 2014 Ebola virus outbreak struck a populous region near the border of Guinea and Sierra Leone, sparking a crisis in a resource-constrained area that had no prior experience with the virus. As the caseload mounted and disseminated geographically, it became apparent that the West African outbreak would be unprecedented in its devastation. Models were developed to estimate the potential size of the epidemic in the absence of intervention, demonstrating the urgent need for expanded action by the international community [34-36], and to calculate the scale of the required investment [37]. Initial control efforts included a militarily enforced quarantine of a Liberian neighbourhood in which Ebola was spreading.
Modelling analysis in collaboration with the Liberian Ministry of Health demonstrated that the quarantine was ineffective and possibly even counterproductive [38]. Connecting the microbiological and population scales, another modelling study integrated within-host viral load data over the course of Ebola infection and between-host transmission parameterized by contact-tracing data. The resulting dynamics highlighted the imperative to hospitalize most cases in isolation facilities within four days of symptom onset [39]. These modelling predictions were borne out by empirical observations. Early in the outbreak, when the incidence was precipitously growing, the average time to hospitalization in Liberia was above six days [40]. As contact tracing improved, the concomitant acceleration in hospitalization was found to be instrumental in turning the tide on the outbreak [40]. In another approach, phylogenetic analysis and transmission modelling were combined to estimate underreporting rates and social clustering of transmission [41]. This study informed public health authorities regarding the optimal scope and targeting of their efforts, which were central to stemming the epidemic. Although data can be scarce for emerging pathogens, modellers can exploit similarities with better-characterized disease systems to investigate the potential efficiency of different interventions (Box 1). As vaccine candidates became available against Ebola, ring vaccination was proposed based on the success of the strategy in eliminating smallpox [42], another microorganism whose transmission required close contact between individuals and for which peak infectiousness occurs after the appearance of symptoms. Compartmental models had suggested parameter combinations for which ring vaccination would be superior to mass vaccination [43], and methodological advances subsequently allowed for explicit incorporation of contact network data [44].
Modelling based on social and healthcare contact networks specific to West Africa supported implementation of ring vaccination [45], and the approach was adopted for the clinical trial of the vaccine [46]. In 2018, two independent outbreaks of Ebola erupted in the Democratic Republic of the Congo. During the initial outbreak in Équateur Province, modellers combined case reports with time series from previous outbreaks to generate projections of final epidemic size that could inform preparedness planning and allocation of resources [47]. Ring vaccination was again deployed, this time within two weeks of detecting the outbreak. A spatial model quantified the impact of vaccine on both the ultimate burden and geographic spread of Ebola, highlighting how even one week of additional delay would have substantially reduced the ability of vaccination to contain this outbreak [48]. The second outbreak was reported in August in the North Kivu Province. Armed conflict in this region has interfered with the ability of healthcare workers to conduct the necessary contact tracing, vaccination and treatment. As conditions make routine data collection difficult and even dangerous, modelling has the potential to provide crucial insights into the otherwise unobservable characteristics of this outbreak.

In contrast to the unexpected emergence of Ebola in a new setting, the influenza virus has repeatedly demonstrated its ability to cause pandemics. A pandemic is an event in which a pathogen creates epidemics across the entire globe. The 1918 pandemic killed an estimated 50 million people worldwide [49], exceeding the combined military and civilian casualties of World War I. While the 2% case-fatality rate of the 1918 strain was approximately 40 times higher than is typical for influenza [50], pathogenic strains with case-fatality rates exceeding 50% periodically emerge [51].
Modelling has illustrated how repeated zoonotic introductions impose selection for elevated human-to-human transmissibility, which thereby exacerbates the threat of a devastating influenza pandemic [52]. Such threats underscore the importance of surveillance systems and preparedness plans, which can be informed by modelling (Box 1). Transmission models are able to optimize surveillance systems, accelerate outbreak detection and improve forecasting [53-56]. For example, a spatial model integrating a variety of surveillance data streams and embedded in a user-friendly platform is currently implemented by the Texas Department of State Health Services to generate real-time influenza forecasts (http://flu.tacc.utexas.edu/). Modelling has also motivated the development of dynamic preparedness plans, which adapt in response to the unfolding events of a pandemic, as models identified that adaptive efforts would be more likely to contain an influenza pandemic than static policies chosen a priori [57]. Other pandemic influenza analyses used age-structured compartmental models to study the trade-off between targeting influenza vaccination to groups that transmit many infections but experience relatively low health burdens (for example, schoolchildren) versus groups that transmit fewer infections but experience greater health burdens (for example, the elderly) [58]. Such examples illustrate the insights that modelling has provided to the decision makers charged with maintaining readiness against simultaneously rare but catastrophic situations.

Modelling has also examined the impact of human behaviour, including vaccination decisions and social interactions, on the course of an epidemic. Public health interventions are not always sufficient to ensure disease control, as behavioural factors can thwart progress [59-62].
For example, reports in 1974 of potential neurological side effects from the whole-cell pertussis vaccine led to a steep decline in vaccine uptake throughout the UK, followed by a slow recovery (Fig. 2a) [63]. Vaccine uptake ebbed and flowed over the next two decades, with higher rates of vaccination in the wake of large pertussis outbreaks (Fig. 2b) [61,63]. Compartmental models analysing the interplay between vaccine uptake and disease dynamics confirmed the hypothesis that increases in vaccination were a response to the pertussis infection risk [61], and showed that incorporating this interplay can improve epidemiological forecasts. Network models extending these coupled disease-behaviour analyses have illustrated how the perceived risk of vaccination can have greater influence on vaccine uptake than disease incidence [64]. More recently, vaccine refusal has led to the resurgence of measles in the USA [62,65]. Researchers are turning to social media to gather information about attitudes toward vaccines and infectious diseases, and to glean clues about vaccinating behaviour [55,66,67]. For instance, signals that vaccine refusal is compromising elimination can be detected months or years in advance of disease resurgence by applying mathematical analysis of tipping points to social media data that have been classified on the basis of sentiment using machine learning algorithms [66].

[Fig. 1 caption, fragment: types of projection that can be generated include outbreak trajectories, disease burdens and economic impact. d, Probabilistic uncertainty analyses convey not only model projections of policy outcomes, but also quantification of confidence in the projections. e, As policies are adopted and the microbiological system is influenced accordingly, the model can be iteratively updated to reflect the shifting status quo, thereby progressively optimizing policies within an evolving system.]
These and other data science techniques might help public health authorities identify the specific communities that are at increased risk of future outbreaks. On shorter timescales, the near-instantaneous availability of social media data facilitates its integration into models developed for outbreak response [55,66]. Other behavioural factors that have been incorporated into transmission models include attendance at social gatherings, sexual behaviour and commuting patterns, elements which are also often affected by perceived infection risk [59,68,69].

Antimicrobial resistance. A substantial portion of the increase in human lifespan over the last century is attributable to antibiotics [70], but the emergence of pathogen strains that are resistant to antimicrobials threatens to reverse these gains. The extensive use and misuse of antibiotics has led to the evolution of multidrug-resistant, extensively drug-resistant and even pan-drug-resistant pathogens across the globe. Precariously, this evolution outpaces the development of new antibiotics. Mathematical modelling is being used to identify strategies to forestall the emergence and re-emergence of antimicrobial resistance [71,72]. Models are particularly valuable for comparing alternative strategies, such as administration of different antibiotics within the same hospital ward, temporal cycling of antibiotics and combination therapy [73-76]. High-performance computing now permits the rapid exploration of multidimensional parameter space. Models can thereby narrow an array of possible interventions down to a subset likely to have the highest impact, or optimize between trade-offs such as effectiveness and cost (Box 1). By contrast, expense, feasibility and ethical considerations may impose more limitations on in vivo investigations (Box 1).
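To make the cycling-versus-mixing comparison concrete, here is a deliberately stripped-down sketch, not the published hospital models: resistance to each drug is selected while that drug is prescribed and decays through a fitness cost otherwise, and a usage schedule distinguishes the two policies. All rates are invented for illustration.

```python
# Toy comparison of antibiotic 'cycling' versus 'mixing'. ra and rb are the
# frequencies of strains resistant to drugs A and B; selection acts while a
# drug is in use, a fitness cost acts while it is not. Rates are illustrative.

def resistance_trajectory(usage_a, s=0.08, cost=0.03, days=720, dt=0.1,
                          ra0=0.01, rb0=0.01):
    ra, rb = ra0, rb0
    total = 0.0
    for step in range(int(days / dt)):
        ua = usage_a(step * dt)        # fraction of patients on drug A now
        ra += (s * ua * ra * (1 - ra) - cost * (1 - ua) * ra) * dt
        rb += (s * (1 - ua) * rb * (1 - rb) - cost * ua * rb) * dt
        total += (ra + rb) * dt
    return total / days                # mean total resistance burden

mixing = resistance_trajectory(lambda t: 0.5)                        # half on each drug
cycling = resistance_trajectory(lambda t: 1.0 if (t // 90) % 2 == 0 else 0.0)  # 90-day cycles
print(f"mean resistance burden: mixing={mixing:.3f}, cycling={cycling:.3f}")
```

Which policy wins depends entirely on the selection and cost parameters, which is the point: models of this family are run across wide parameter sweeps, as described next, rather than at a single setting.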
Not only can models identify the optimal strategy for a given parameter set, but they can generate the probability that this intervention remains optimal across variation in the parameters. For example, an optimization routine combined with simulation of hospital-based interventions identified combination therapy as most likely to reduce antibiotic resistance [75]. As a complementary approach, modelling can incorporate economic considerations into these evaluations. A stochastic compartmental model showed that infection control specialists dedicated to promoting hand hygiene in hospitals are cost-effective for limiting the spread of antibiotic resistance [74]. Although most models of antibiotic resistance have focused on transmission in healthcare settings, the importance of antibiotic resistance in natural, agricultural and urban settings has been increasingly recognized [77-83]. For example, a metapopulation model of antimicrobial-resistant Clostridium difficile simulated its transmission within and between hospitals, long-term care facilities and the community. This model demonstrated that mitigating risk in the community has the potential to substantially avert hospital-onset cases by decreasing the number of patients with colonization at admission and thereby the transmission within hospitals [84]. This study illustrates how models can consider the entire ecosystem of infection to elucidate dynamics that might not be captured through focus on a single setting.

During the initial phase of an outbreak, the predictive power of models is often constrained by data scarcity. This challenge is exacerbated for outbreaks of novel emerging diseases, given that our understanding of the disease will rely on the unfolding epidemic (Fig. 1). Not only can the absence of data constrain model design, but sparse data require extensive sensitivity analyses to evaluate the robustness of conclusions.
Univariate sensitivity analyses, in which individual parameters are varied incrementally above and below a point estimate, can identify which parameters most influence model output (Box 1). Such comparisons reveal both salient gaps in knowledge and targets for preventing and mitigating the outbreak (Box 1) [85]. As an outbreak progresses, each day has the potential to provide more information about the new disease, including its duration of latency, the symptomatic period, infectiousness, transmission modalities, underreporting and the case-fatality rate. However, collecting detailed data to inform each of these parameters can strain resources when they are thinly spread during an emergency response. Sensitivity analysis can support clinicians and epidemiologists in prioritizing data collection efforts [86]. Parameterization challenges are compounded for complicated disease systems, such as vector-borne diseases. For example, models of Zika virus infection span both species and scales, as the disease trajectory is influenced by factors ranging from mosquito seasonality and mosquito abundance down to viral and immunological dynamics within human and mosquito hosts [87,88]. Adding to this complexity, the ecological parameters vary seasonally and geographically, heterogeneities that may be amplified by socioeconomic factors modulating human exposure to infected mosquitoes [89]. In the absence of the high-resolution data that would be ideal to tailor a mosquito-driven disease system to a given setting, uncertainty analysis can unify parameterization from disparate data sources. In contrast to univariate sensitivity analyses, uncertainty analysis simultaneously samples from empirical- or expert-informed distributions for many or all input parameters. Collaboration between modellers and disease experts is thus instrumental to ensuring the biological plausibility of these parameter distributions [90,91].
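The sampling-based uncertainty analysis just described can be sketched by drawing every input of a simple transmission model jointly from plausible ranges and summarizing the spread of the outcome. The S-I-R model and the parameter ranges below are illustrative assumptions, not estimates for any particular disease.

```python
import random

# Uncertainty analysis sketch: sample all inputs jointly from plausible
# ranges, run the model per draw, and report a central estimate plus a range.

def sir_final_size(beta, gamma, days=1000, dt=0.05):
    s, i, r = 0.999, 0.001, 0.0
    for _ in range(int(days / dt)):
        new_inf = beta * s * i * dt
        rec = gamma * i * dt
        s, i, r = s - new_inf, i + new_inf - rec, r + rec
    return r

rng = random.Random(0)
draws = sorted(
    sir_final_size(beta=rng.uniform(0.15, 0.45),   # assumed plausible range
                   gamma=rng.uniform(0.08, 0.12))  # assumed plausible range
    for _ in range(200)
)
median = draws[len(draws) // 2]
lo = draws[int(0.025 * len(draws))]                # 2.5th percentile
hi = draws[int(0.975 * len(draws))]                # 97.5th percentile
print(f"final size: median={median:.2f}, 95% interval=({lo:.2f}, {hi:.2f})")
```

Sampling all parameters jointly, rather than one at a time, is what distinguishes this from the univariate sensitivity analysis above, and it is why expert input on the plausibility of the distributions matters so much.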
The uncertainty analysis produces both a central point estimate and a range for each outcome, a combination which can inform stakeholders about the best-case and worst-case scenarios as well as the likelihood that an intervention will be successful [92-94].

In constructing models and communicating results, there are common pitfalls which can compromise the rigor and impact of the research. A pervasive pitfall is the incorporation of excessive model complexity, particularly through inclusion of more parameters than can be reliably parameterized from data. Intuition might suggest that a complex representation of a microbiological system would more closely represent reality. However, the predictive power of a model can be degraded if incorporating additional parameters only marginally improves the fit to data. This tendency results in complicated transmission models that overfit data in much the same way that complicated statistical regressions can overfit data, replicating not only the relevant trends but also the noise in a particular data set. These overfit models thus become less useful for prediction and generalization [6,95]. To guide appropriate model complexity and parameterization, modellers have used the mathematical theory of information to develop criteria which quantify the balance between realism and simplicity. Such criteria penalize additional parameters but reward substantial improvements in fit, thereby identifying the simplest model that can adequately fit the data [61,96,97]. These methods can be applied to select among models, or alternatively to calculate weighted average predictions across models. In a similar vein, modelling consortiums serve to address uncertainty surrounding model design [98-100]. In a consortium, several modelling groups develop their models independently, each applying their particular expertise and perspective.
For example, consortia of malaria modellers were convened to predict the effectiveness of interventions, including a vaccine candidate [101] and mass drug administration [102]. Congruence of output among models engenders confidence that model results are robust.

Another pitfall concerns the quality of data used to inform the model. Incompleteness of data has been an issue since 1766, when Daniel Bernoulli published a compartmental model of smallpox and acknowledged that more extended analyses would have been possible if the data had been age-stratified [103]. Even today, using data to develop models without knowledge of how the data were collected or the limitations of the data can be risky. Data collected for an alternative purpose can contain gaps or biases that are acceptable for the original research question, yet lead to incorrect conclusions when incorporated for another purpose in a specific model. In ideal circumstances, modellers would be involved in the design of the original study, ensuring both seamless integration of the results into the model and awareness on the part of the modeller with regard to data limitations. Failing that, it is very helpful for modellers to collaborate with scientists familiar with the details of empirical studies on which their results might depend. This lack of familiarity with the biases or incompleteness of data sources may be particularly dangerous in the era of digital data. 'Big data hubris' can blind researchers to the limitations of a dataset, such as being a large but unrepresentative sample of the general population, or the alteration of search engine algorithms partway through the data collection process [6]. Some of these limitations can be addressed by using digital data as a complement to traditional data sources.
In this way, the weakness of one data source (for example, the low sample size of traditional surveys or the bias in large digital datasets) can be compensated by the strengths of another (for example, the balanced representation of a small survey versus the large scale of digital data).

A final pitfall that often arises in the midst of an ongoing outbreak concerns the interpretation of epidemic projections. Initial models may assume an absence of intervention as a way to assess the potential consequences of inaction. Such projections may contribute to the mobilization of government resources towards control, as was the case during the West African Ebola outbreak [35,37,38]. In this respect, the projections are intended to make themselves obsolete [104]. In retrospect, and without knowledge of the initial purpose of the model, it may appear that the initial predictions were excessively pessimistic [105]. Additionally, people living in outbreak zones often change their behaviour to reduce infection risks, thereby mitigating disease spread through, for example, reducing social interactions or increasing vaccine uptake (Fig. 2) [59,61,66]. Thus, risk assessment constitutes a 'moving target' [105]. For example, input parameters estimated from contact tracing early in an outbreak could require adjustments to reflect these behaviour changes and accurately predict subsequent dynamics [106]. The need for proficient communication skills is heightened during an outbreak. This concern is particularly relevant when presenting sensitivity and uncertainty analyses. Although predictions at the extremes of sensitivity analyses also tend to be less probable than mid-range projections, there can be a temptation to focus on the most sensational model scenarios. Ensuing public pressure on the basis of misunderstood findings can cause unwarranted alarm and trigger counterproductive political decisions.
In both publications and media interactions, underscoring the improbability of extreme scenarios explored during sensitivity analysis, as well as how improved interventions turn a predictive model into a counterfactual one, may pre-empt this pitfall [33].

The role for modelling in supporting epidemiologists, public health officials and microbiologists has progressively expanded since the foundational publications forty years ago, in concert with the growing abundance and granularity of data as well as the refinement of quantitative approaches. Models have now been developed for virtually every human infectious disease, as well as for many that affect animals and plants, and have been applied across the globe. Interdisciplinary collaboration among empiricists, policymakers and modellers facilitates the development of scientifically grounded models for specific settings and generates results that will be actionable in the real world. Reciprocally, modelling results may guide the design of experiments and field studies by revealing key gaps in our understanding of microbiological systems. Furthermore, modelling is a feasible and cost-effective approach for identifying impactful policies prior to implementation decisions. Through all these avenues, epidemiological modelling galvanizes evidence-based action to alleviate disease burden and improve global health.

[Fig. 2 caption, fragment: vaccine uptake appears to be entrained by surges in infection incidence. Mathematical models can capture the interplay between natural and human dynamics exemplified in this dataset and a wide variety of other study systems.]
References.
- Global, regional, and national age-sex specific mortality for 264 causes of death, 1980-2016: a systematic analysis for the Global Burden of Disease Study
- Population biology of infectious diseases: part I
- Population biology of infectious diseases: part II
- Modeling infectious disease dynamics in the complex landscape of global health
- Formalizing the role of agent-based modeling in causal inference and epidemiology
- Big data. The parable of Google Flu: traps in big data analysis
- Predicting the effectiveness of prevention: a role for epidemiological modeling
- Bridging the gap between evidence and policy for infectious diseases: how models can aid public health decision-making
- Preventing acquisition of HIV is the only path to an AIDS-free generation
- Multiscale modelling in immunology: a review
- HIV-1 dynamics in vivo: virion clearance rate, infected cell life-span, and viral generation time
- Dynamics of HIV-1 and CD4+ lymphocytes in vivo
- Decay characteristics of HIV-1-infected compartments during combination therapy
- Modeling latently infected cell activation: viral and latent reservoir persistence, and viral blips in HIV-infected patients on potent therapy
- HIV reservoir size and persistence are driven by T cell survival and homeostatic proliferation
- Modeling the within-host dynamics of HIV infection
- Assessment of epidemic projections using recent HIV survey data in South Africa: a validation analysis of ten mathematical models of HIV epidemiology in the antiretroviral therapy era
- The risk of HIV transmission at each step of the HIV care continuum among people who inject drugs: a modeling study
- Infectious Diseases of Humans: Dynamics and Control
- Cost-effectiveness of a community-based intervention for reducing the transmission of Schistosoma haematobium and HIV in Africa
- Evaluating the potential impact of mass praziquantel administration for HIV prevention in Schistosoma haematobium high-risk communities
- When individual behaviour matters: homogeneous and network models in epidemiology
- Connecting the dots: network data and models in HIV epidemiology
- Detailed transmission network analysis of a large opiate-driven outbreak of HIV infection in the United States
- Bayesian inference of spreading processes on networks
- Concurrent partnerships and the spread of HIV
- Concurrency is more complex than it seems
- Network-related mechanisms may help explain long-term HIV-1 seroprevalence levels that remain high but do not approach population-group saturation
- Metapopulation dynamics of rabies and the efficacy of vaccination
- Predicting the HIV/AIDS epidemic and measuring the effect of mobility in mainland
- Effectiveness of UNAIDS targets and HIV vaccination across 127 countries
- An HIV vaccine is essential for ending the HIV/AIDS pandemic
- Opinion: mathematical models: a key tool for outbreak response
- Ebola virus disease in West Africa: the first 9 months of the epidemic and forward projections
- Estimating the future number of cases in the Ebola epidemic: Liberia and Sierra Leone
- Impact of bed capacity on spatiotemporal shifts in Ebola transmission
- Dynamics and control of Ebola virus transmission in Montserrado, Liberia: a mathematical modelling analysis
- Strategies for containing Ebola in West Africa
- Effect of Ebola progression on transmission and control in Liberia
- Interrupting Ebola transmission in Liberia through community-based initiatives
- Epidemiological and viral genomic sequence analysis of the 2014 Ebola outbreak reveals clustered transmission
- Selective epidemiologic control in smallpox eradication
- Emergency response to a smallpox attack: the case for mass vaccination
- The impact of contact tracing in clustered populations
- Harnessing case isolation and ring vaccination to control Ebola
- Efficacy and effectiveness of an rVSV-vectored vaccine in preventing Ebola virus disease: final results from the Guinea ring vaccination, open-label, cluster-randomised trial (Ebola Ça Suffit!)
- Projections of Ebola outbreak size and duration with and without vaccine use in Équateur
- Ebola vaccination in the Democratic Republic of the Congo
- Emergence and pandemic potential of swine-origin H1N1 influenza virus
- Estimated influenza illnesses, medical visits, hospitalizations, and deaths averted by vaccination in the United States
- Global epidemiology of avian influenza A H5N1 virus infection in humans, 1997-2015: a systematic review of individual case data
- The role of evolution in the emergence of infectious diseases
- Information technology and global surveillance of cases of 2009 H1N1 influenza
- Optimizing provider recruitment for influenza surveillance networks
- Influenza A (H7N9) and the importance of digital epidemiology
- Disease surveillance on complex social networks
- Optimizing infectious disease interventions during an emerging epidemic
- Optimizing influenza vaccine distribution
- Nine challenges in incorporating the dynamics of behaviour in infectious diseases models
- Evolutionary game theory and social learning can determine how vaccine scares unfold
- Department of Health & Human Services.
measles cases and outbreaks the pertussis vaccine controversy in great britain the impact of rare but severe vaccine adverse events on behaviour-disease dynamics: a network model measles outbreak: opposition to vaccine extends well beyond ultra-orthodox jews critical dynamics in population vaccinating behavior digital epidemiology multiscale mobility networks and the spatial spreading of infectious diseases capturing sexual contact patterns in modelling the spread of sexually transmitted infections: evidence using natsal-3 the perpetual challenge of antimicrobial resistance factors affecting the reversal of antimicrobial-drug resistance retrospective evidence for a biological cost of vancomycin resistance determinants in the absence of glycopeptide selective pressures multistrain models predict sequential multidrug treatment strategies to result in less antimicrobial resistance than combination treatment universal or targeted approach to prevent the transmission of extended-spectrum beta-lactamase-producing enterobacteriaceae in intensive care units: a cost-effectiveness analysis modeling antibiotic treatment in hospitals: a systematic approach shows benefits of combination therapy over cycling, mixing, and mono-drug therapies why sensitive bacteria are resistant to hospital infection control call of the wild: antibiotic resistance genes in natural environments antibiotic resistant bacteria are widespread in songbirds across rural and urban environments spatial and temporal distribution of antibiotic resistomes in a peri-urban area is associated significantly with anthropogenic activities impact of point sources on antibiotic resistance genes in the natural environment: a systematic review of the evidence investigating antibiotics, antibiotic resistance genes, and microbial contaminants in groundwater in relation to the proximity of urban areas systematic review: impact of point sources on antibioticresistant bacteria in the natural environment environmental 
factors influencing the development and spread of antibiotic resistance quantifying transmission of clostridium difficile within and outside healthcare settings optimal cost-effectiveness decisions: the role of the cost-effectiveness acceptability curve (ceac), the cost-effectiveness acceptability frontier (ceaf), and the expected value of perfection information (evpi) probabilistic uncertainty analysis of epidemiological modeling to guide public health intervention policy zika viral dynamics and shedding in rhesus and cynomolgus macaques evaluating vaccination strategies for zika virus in the americas epidemic dengue and dengue hemorrhagic fever at the texas-mexico border: results of a household-based seroepidemiologic survey assessing real-time zika risk in the united states one health approach to cost-effective rabies control in india cost-effectiveness of canine vaccination to prevent human rabies in rural tanzania cost-effectiveness of next-generation vaccines: the case of pertussis optimizing the impact of low-efficacy influenza vaccines reassessing google flu trends data for detection of seasonal and pandemic influenza: a comparative epidemiological study at three geographic scales a new look at the statistical model identification model selection in ecology and evolution direct and indirect effects of rotavirus vaccination: comparing predictions from transmission dynamic models data-driven models to predict the elimination of sleeping sickness in former equateur province of drc learning from multi-model comparisons: collaboration leads to insights, but limitations remain public health impact and cost-effectiveness of the rts, s/as01 malaria vaccine: a systematic comparison of predictions from four mathematical models role of mass drug administration in elimination of plasmodium falciparum malaria: a consensus modelling study bernoulli's epidemiological model revisited ebola: models do more than forecast models overestimate ebola cases coupled contagion 
key: cord-280640-0h3yv2m4 authors: phillips, christopher j.; schoen, robert e. title: screening for colorectal cancer in the age of simulation models: a historical lens date: 2020-07-16 journal: gastroenterology doi: 10.1053/j.gastro.2020.07.010 sha: doc_id: 280640 cord_uid: 0h3yv2m4 nan in late march, the washington post, citing unnamed officials, quoted dr. anthony s. fauci's response to the demands of modeling the novel coronavirus epidemic: "i've looked at all the models. i've spent a lot of time on the models. they don't tell you anything. you can't really rely upon models." 1 in this commentary, we report on the historical evolution of simulation modeling as applied to screening for crc.
we highlight how simulation modeling has come to be regarded by some as equivalent to evidence-based data, but we caution against overreach by relying too heavily on modeling relative to empirical testing. computer simulation models sit uncomfortably alongside evidence provided by controlled observational studies and formal randomized trials. simulations are at once experimental (researchers can adjust parameters and observe different outcomes) and yet also fundamentally non-empirical, in that the parameters, inputs, and assumptions need not have strong empirical bases of support. according to douglas owens of stanford university and the uspstf, the introduction of simulation modeling into colorectal cancer screening recommendations helped address questions that clinical trials had left unanswered. 7 but as eric winsberg's survey of the rise of computer simulation has noted, "it is a mistake to think of simulations simply as tools for unlocking hidden empirical content." 8 simulations do not produce empirical evidence; they enable the creation of "present futures," or alternative visions of the future, to help decision making in the present. 9 as recently as 1996, however, simulation models were not used at all in screening recommendations. the rapid rise in their importance over the following two decades, and their central role in shaping current screening recommendations, should occasion reflection about how and why simulation models have become an influential tool for medical decision making. though the introduction of electronic computers and algorithms into medicine broadly dates back to the 1950s, the idea of simulating the effects of screening with computers arose only in the early 1970s. 10 [12] [13] during the 1980s, computer simulations of screening regimens proliferated as researchers sought to understand their potential benefits and costs.
[14] [15] [16] until the twenty-first century, however, computer simulations were not seen as providing evidence definitive enough to shape screening recommendations. randomized controlled trials (rcts) continued to provide the highest level of empirical evidence for the uspstf and the canadian task force on preventive health care. 17 the 1996 guide to clinical preventive services of the uspstf did not rely even on established simulation models, and two years later uspstf representatives advocating for the use of models still acknowledged "important obstacles" because models were "invariably complex and involve numerous assumptions and subjective judgments." [18] [19] a major turning point was the development of the cancer intervention and surveillance modeling network (cisnet), whose colorectal models include the colorectal cancer simulated population model for incidence and natural history (crc-spin). models for crc screening were often based on those designed for breast and cervical cancer, but emerged later because empirical studies on the effects of colorectal cancer screening arrived later. the first colorectal cancer models were created in the 1990s, [20] [21] incorporating the first randomized controlled studies of crc screening. [22] [23] [24] [25] the resulting simulation models within the cisnet group are similar in many regards: all are stochastic microsimulation models (meaning that instead of cohorts they model an individual's probabilistic transition between disease states) and all take into account the natural history of colorectal cancer as well as surveillance, epidemiology, and end results (seer) incidence and mortality data.
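the individual-level logic of such stochastic microsimulation models can be sketched in a few lines of python. everything below is a toy illustration: the disease states and annual transition probabilities are invented for the example and are not the calibrated values of miscan-colon or any other cisnet model. it also shows why a quantity like adenoma dwell time is an output of the simulation rather than an input.

```python
import random

# Illustrative annual transition probabilities: these numbers are
# invented for this sketch, not taken from any real model.
P_ADENOMA = 0.02   # healthy -> adenoma, per year
P_PROGRESS = 0.05  # adenoma -> preclinical cancer, per year

def simulate_person(rng, max_age=100):
    """Walk one individual year by year through disease states.

    Returns the dwell time (years spent with an adenoma before it
    turns into cancer), or None if no cancer develops by max_age.
    Dwell time is an *output* of the run, not a parameter we set.
    """
    state, onset = "healthy", None
    for age in range(max_age):
        if state == "healthy" and rng.random() < P_ADENOMA:
            state, onset = "adenoma", age
        elif state == "adenoma" and rng.random() < P_PROGRESS:
            return age - onset
    return None

rng = random.Random(42)
runs = (simulate_person(rng) for _ in range(100_000))
dwell_times = [d for d in runs if d is not None]
mean_dwell = sum(dwell_times) / len(dwell_times)
print(f"persons developing cancer: {len(dwell_times)}")
print(f"mean adenoma dwell time: {mean_dwell:.1f} years")
```

because each run draws its own random transitions, repeated runs give a distribution of individual histories, which is the property the cisnet groups exploit when they aggregate simulated individuals into cohort-level estimates.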
in preparation for its 2008 recommendations on colorectal cancer screening, the uspstf augmented its systematic evidence review for the first time with models provided by cisnet researchers, prompting praise for the use of state-of-the-art technology but also concern that modelers had made some "surprising choices," including not quality-adjusting estimates of benefits or accounting for patient and institutional costs of adherence. 26 by 2016, the uspstf's crc recommendations relied heavily on cisnet researchers to provide information on the starting and stopping ages, as well as recommended intervals for screening methods. in particular, cisnet researchers modeled the experience of 40-year-old individuals with no previous cancer diagnosis, and aggregated those experiences over an entire hypothetical cohort to measure both the benefit (in life-years gained (lyg)) and the cost (in number of lifetime colonoscopies required) of screening. effectiveness was defined by an "efficient frontier," highlighting the strategies that provided the "largest incremental increase in lyg per additional colonoscopy performed." 27 they concluded "the strategies of colonoscopy every 10 years, annual fit (fecal immunochemical testing), sig (flexible sigmoidoscopy) every 10 years with annual fit, and ctc (computerized tomographic colonography) every 5 years performed from ages 50 to 75 years provided similar lyg and a comparable balance of benefit and screening burden." 28 all three models predicted that starting screening at age 45 would result in an increase in life-years gained, but the uspstf declined to recommend an earlier initiation age. they reasoned that the increase in life-years gained relative to the burden of additional colonoscopy exams was small.
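the "efficient frontier" selection described above is essentially a dominance filter over (burden, benefit) pairs. the sketch below shows the idea with made-up numbers; the strategy names echo those in the text, but the colonoscopy and lyg figures are invented for illustration and are not the task force's actual estimates.

```python
def efficient_frontier(strategies):
    """Keep strategies that are not dominated: walking through the
    options in order of increasing burden (lifetime colonoscopies),
    a strategy stays on the frontier only if it yields more
    life-years gained (LYG) than every cheaper strategy."""
    frontier, best_lyg = [], float("-inf")
    for name, colos, lyg in sorted(strategies, key=lambda s: (s[1], -s[2])):
        if lyg > best_lyg:
            frontier.append((name, colos, lyg))
            best_lyg = lyg
    return frontier

# Hypothetical (made-up) estimates per 1000 40-year-olds:
# (strategy, lifetime colonoscopies, life-years gained).
strategies = [
    ("COL every 10y", 4049, 270),
    ("FIT annual",    1757, 244),
    ("SIG 10y + FIT", 2620, 252),
    ("CTC every 5y",  1743, 248),
]
for s in efficient_frontier(strategies):
    print(s)
```

note that a full frontier analysis would also discard weakly dominated options on the concave portions of the burden-benefit curve; the simple filter above keeps every strategy that beats all cheaper ones, which is enough to convey the "lyg per additional colonoscopy" logic.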
combining an earlier starting age with an extension of the subsequent screening interval from every 10 to every 15 years would have maintained a similar estimate of required lifetime colonoscopies, but one of the three models (miscan-colon) predicted this solution would cause a loss in lyg. in the end, the uspstf considered these findings and concluded that "the evidence best supports a starting age of 50 years for the general population, noting the modest increase in life-years gained by starting screening earlier, the discordant findings across models for extending the screening interval when the age at which to begin screening is lowered, and the lack of empirical evidence in younger populations." 28 the american cancer society's review two years later addressed the question of initiation age by homing in on the discordant findings across models. in particular, they noted that the apparent outlier, the miscan-colon simulation model, would have agreed with the other two about the benefit of starting screening at age 45 if it had incorporated new information on the increased incidence of colorectal cancer in younger people. 29 all three cisnet models had used seer incidence rates taken from 1975-1979 (providing an incidence rate prior to the institution of appreciable screening activity, hence more reflective of the underlying natural history). the acs research team used miscan-colon to simulate "a cohort of adults aged 40 years in 2015, and assumed that this cohort had a 1.591-fold increased crc incidence across all ages compared with the original model." 30 the incidence multiplier was derived from comparing seer data from 1975 with more recent incidence rates. re-running the miscan-colon model with updated assumptions resulted in the acs guideline development group providing a qualified recommendation that screening begin at 45, while maintaining initiation at 50 as a strong recommendation.
[31] [32] in this instance, the use of a simulation model for crc screening recommendations engendered controversy rather than resolved questions. though there was indeed new epidemiological evidence on colorectal cancer incidence published between 2016 and 2018, 29 there were no new empirical studies about the effects of earlier screening. in fact, as the acs group acknowledged, there has never been any substantial research on the effects of starting screening before age 50 for colorectal cancer. 32 while the model predicts benefits in earlier detection of crc, a number of unintended consequences could potentially follow, 6 chief among them the diversion of resources away from higher-risk populations for whom the relative benefit of screening is substantially greater. 33 moreover, the agreement or disagreement of models is not a straightforward measure of the strength or weakness of evidence. it is difficult to know whether differences in predictions are due to differences in model design or model inputs. 34 two models which have been accurately validated using existing studies may have quite different predictions about future outcomes. a 2016 study by cisnet-crc modelers, for example, noted that the essential but unobservable process of transition from benign adenoma to malignant tumor, or "dwell-time," is simulated by all the models, but such a value is an output rather than an input or assumption. as a result, even if models are carefully calibrated to match existing data, information about how sensitive predictions are to different dwell-time distributions, for example, remains unknown. 35 given the expense of trials and the difficulty of combining multiple parameters simultaneously, such as age of initiation, screening interval, and preferred modality, modeling may indeed be an essential component for decision making.
the ease of modeling compared to large-scale clinical trials, as well as the ability of models to combine evidence from many different sources, also favors their use. models have significant limitations, however, from their imposition of assumptions about the natural history of disease, the screening process, and the behavior of individuals, to the possibility of producing results with misleading levels of precision. most importantly, no matter how elaborate the design of the "virtual laboratory" created by a model 36 or the quality of the "experiments" within, empirical studies remain essential. as has become abundantly clear in the case of statins 37 and hormone replacement therapy, 38 in the absence of large and unexpected effects only randomized controlled trials of adequate size performed in applicable settings can provide reliable information about an intervention's effectiveness. as dr. fauci said recently, "all models are just models. when you get new data, you change them." 39 new epidemiological data on incidence rates can certainly justify updating a model's assumptions, which then might lead the model to make different predictions about the effects of screening, but without formal randomized trials of the new intervention, we have no new empirical information about effectiveness. computer simulations have subtly expanded what counts as "evidence-based" medicine. balancing reliance on models with careful assessment of the systematic and empirical evidence is the core responsibility of organizations developing and delivering prescriptions for public policy. careful deliberation is required to avoid over-reliance on modeling. models are no substitute for "real-world data" 40 and their conclusions remain bound by the availability of rigorous empirical studies.
- experts and trump's advisers doubt white house's 240,000 coronavirus deaths estimate
- evolution of mathematical models of epidemics
- from colorectal cancer screening guidelines to headlines: beware!
- when should guidelines change? a clarion call for evidence regarding the benefits and risks of screening for colorectal cancer at earlier ages
- lowering the starting age for colorectal cancer screening to 45 years: who will come...and should they?
- potential intended and unintended consequences of recommending initiation of colorectal cancer screening at age 45 years
- bauchner: screening for colorectal cancer
- science in the age of computer simulation
- imagined futures: fictional expectations and capitalist dynamics
- biomedical computing: digitizing life in the united states
- a simulation system for screening procedures
- the paradox of authority: transformation of the uspstf under the affordable care act
- background and objectives of the u.s. preventive services task force
- the miscan simulation program for the evaluation of screening for disease
- a computer simulation model for the practical planning of cervical cancer screening programmes
- optimising the age, number of tests, and test interval for cervical screening in canada
- statistical models for cancer screening
- broadening the evidence base for evidence-based guidelines: a research agenda based on the work of the u.s. preventive services task force
- us preventive services task force. guide to preventive services
- screening for colorectal cancer
- the miscan-colon simulation model for the evaluation of colorectal cancer screening
- randomised controlled trial of faecal-occult-blood screening for colorectal cancer
- randomised study of screening for colorectal cancer with faecal-occult-blood test
- reducing mortality from colorectal cancer by screening for fecal occult blood. minnesota colon cancer control study
- the effect of fecal occult-blood screening on the incidence of colorectal cancer
- screening guidelines for colorectal cancer: a twice-told tale
- estimation of benefits, burden, and harms of colorectal cancer screening strategies: modeling study for the us preventive services task force
- us preventive services task force. screening for colorectal cancer: us preventive services task force recommendation statement
- colorectal cancer incidence patterns in the united states, 1974-2013
- the impact of the rising colorectal cancer incidence in young adults on the optimal age to start screening: microsimulation analysis i to inform the american cancer society colorectal cancer screening guideline
- optimizing colorectal cancer screening by race and sex: microsimulation analysis ii to inform the american cancer society colorectal cancer screening guideline
- colorectal cancer screening for average-risk adults
- cost-effectiveness and national effects of initiating colorectal cancer screening for average-risk persons at age 45 years instead of 50 years
- diversity of model approaches for breast cancer screening: a review of model assumptions by the cancer intervention and surveillance network (cisnet) breast cancer groups
- validation of models used to inform colorectal cancer screening guidelines: accuracy and implications
- introduction to the cancer intervention and surveillance modeling network (cisnet) breast cancer models
- interpretation of the evidence for the efficacy and safety of statin therapy
- when are randomised trials unnecessary? picking signal from noise
- the coronavirus in america: the year ahead. the new york times
- real-world evidence and real-world data for evaluating drug safety and effectiveness
key: cord-266189-b3b36d72 authors: dignum, frank; dignum, virginia; davidsson, paul; ghorbani, amineh; van der hurk, mijke; jensen, maarten; kammler, christian; lorig, fabian; ludescher, luis gustavo; melchior, alexander; mellema, rené; pastrav, cezara; vanhee, loïs; verhagen, harko title: analysing the combined health, social and economic impacts of the coronavirus pandemic using agent-based social simulation date: 2020-06-15 journal: minds mach (dordr) doi: 10.1007/s11023-020-09527-6 sha: doc_id: 266189 cord_uid: b3b36d72 during the covid-19 crisis there have been many difficult decisions governments and other decision makers had to make. e.g. do we go for a total lockdown or keep schools open? how many people, and which people, should be tested? although there are many good models from e.g. epidemiologists on the spread of the virus under certain conditions, these models do not directly translate into the interventions that can be taken by government. neither can these models help us understand the economic and/or social consequences of the interventions. however, effective and sustainable solutions need to take this combination of factors into account. in this paper, we propose an agent-based social simulation tool, assocc, that supports decision makers in understanding possible consequences of policy interventions by exploring the combined social, health and economic consequences of these interventions. in order to handle crises like the covid-19 crisis, governments and decision makers at all levels are pressed to make decisions in a very short time span, based on very limited information (rosenbaum 2020). where italy and later spain were criticized for not being quick enough to react to the spread of the sars-cov-2 virus, one wonders whether others would have made different decisions at that time.
although health is considered of prime importance, the interventions from governments are having huge economic impacts and possibly an even bigger socio-psychological impact. because the incubation time of the coronavirus is around two weeks, it will take at least two weeks before the effects of any restriction on movement are visible. that is, decisions will always be two weeks behind the actual situation. given this fact, many countries made a jump start and installed possibly more severe restrictions than necessary in order to stay ahead of the effect. however, it also means that in many countries the restrictions will last for several more months. extended lockdowns and restrictions on movement are leading to social stress, and the economic effects will have lasting consequences too. it can be expected that people will find unforeseen ways to circumvent the restrictions, and that attempts to violate them will increase, all potentially undoing their effects. at the same time, lockdowns are causing social stress, and unemployment in several countries is rising at an incredible speed, causing even more social unrest. this is pushing the timing and type of strategies to exit lockdown and lessen the current restrictions, leading many governments to consider the introduction of tracking apps or other means to limit spread once restrictions are (partially) lifted, in order to limit the risks of a second or third pandemic wave. how and when should restrictions be lifted? is it better to first re-start schools, or should transport and work take priority? what will be the effect of these strategies on the pandemic, but also on the economy and social life of people and countries? it is also clear that interests and possibilities are not equal for all countries. thus a preliminary lifting of restrictions in one country might adversely affect the spread in neighbouring countries,
unless all cross-border transport is halted, which is almost impossible given the economic and food dependencies between countries worldwide. the above considerations make clear that the health, socio-psychological and economic perspectives are tightly coupled and all have a huge influence on the way society copes with the covid-19 crisis. epidemiology (the study of the distribution and determinants of diseases in humans) is a cornerstone of public health, and shapes policy decisions and evidence-based practice by identifying risk factors for disease and targets for preventive healthcare. as such, it is not strange that during the initial phase of the covid-19 pandemic, decision makers got their main advice from epidemiologists. epidemiology assumes that risk factors are general, abstract and difficult for an individual to control. however, individual behavior has a direct influence on these risks, as became evident in the spread of the coronavirus. the study of individual and societal behaviors therefore needs to be taken into account alongside epidemiological studies, in order to be able to understand the combined effect of any policy measure across all aspects. this is especially serious if a restriction (or the lifting of a restriction) has an effect in one perspective that invalidates the assumptions made from another perspective (bénassy-quéré et al. 2020). there is thus a need for ways to couple the different perspectives and model the interdependencies, such that these become visible and understandable. this will facilitate balanced decision making where all perspectives can be taken into account. tools (like the one we propose) should thus facilitate the investigation of alternatives and highlight the fundamental choices to be made, rather than trying to give one solution.
in this paper, we present assocc (agent-based social simulation for the covid-19 crisis), an agent-based social simulation tool that supports decision makers in gaining insight into the possible effects of policies by showing their interdependencies, and as such makes clear which underlying dilemmas have to be addressed. such understanding can lead to more acceptable solutions, adapted to the situation of each country and its current socio-economic state, and sustainable from a long-term perspective. in the next section we will briefly discuss the current covid-19 crisis situation. in sect. 3, we will discuss the different perspectives on the consequences of the crisis, and show what is needed to connect these perspectives. in sect. 4, we describe the agent architecture at the core of the assocc approach, which brings together the epidemiologic, social and economic perspectives, and show how this is implemented in a workable agent architecture that can be used in a social simulation framework. in sect. 5, we describe the practical application of the social simulation framework assocc by exploring different example scenarios where the different perspectives are combined. conclusions are drawn in sect. 6. the covid-19 crisis is characterized by very emotional debates and an atmosphere of crisis and panic (khosravi 2020). when the pandemic spread from asia to europe, it took some time to realise its possible consequences and to identify appropriate measures to prevent them. the usa, too, seemed to ignore what was happening in asia and europe for some time, causing a considerable delay in introducing defensive policies when the pandemic finally reached the country. in a country like the usa, where little social security exists and the government is not prepared to invest money in preparing for disasters, the societal consequences are possibly even larger than in europe (hick et al. 2020).
this was also one of the lessons learned from hurricane katrina (executive-office 2006), but the lack of preparedness still remains. in the usa alone, by the end of march 2020, over 6.6 million americans had already filed for unemployment. 1 as the number of covid-19 cases increased, policies had to be made quickly in order to prevent the rapid spread of the virus from resulting in an overload of hospital and ic capacity. given the lack of data about the coronavirus at the time of decision making, epidemic models based on those of earlier influenza epidemics were leading in making decisions. there was simply not enough information and not enough possibility to take more focused measures that would target the right groups and still have the desired effect. the early days of the pandemic show a quick change of approaches as more became known about the coronavirus. e.g. the fact that in the early stages it was believed that asymptomatic carriers were not able to transmit the virus was influential in initial decisions concerning testing and contact tracing, and possibly contributed to the high speed at which the virus spread initially. currently many countries have introduced severe movement restrictions and full lockdown policies, with potentially huge social and economic consequences. however, also in these cases it is unclear which factors and motivations are leading the policy. in fact, an initial study by oxford university shows that there is little correlation between the severity of the spread of the coronavirus and the stringency of the policies in place in different countries. 2 the initial idea was that these measures would last for one or two months. however, at the moment there are already several countries speaking about a period of several months, extending at least until mid 2020 or even longer.
based on data from previous pandemics, initial economic policies were based on the expectation of getting back to normal within a limited amount of time, with many governments shouldering the costs for the current period. it is increasingly clear, however, that the impact may be way above what governments can cope with, and that a new 'normal' economy will need to be found (bénassy-quéré et al. 2020). above that, the international dependencies of the world economy require that countries coordinate their policies in order to achieve maximum effect, something that is notoriously difficult and has not been made easier by the recent attitude of the usa to put its own interests first and show little solidarity with other countries. from a sociological perspective there are not many theories that can be used to predict how the current situation will develop. however, some principles are clear. people have fundamental needs for affiliation and thus need to socialize. people can use the internet for some of these needs, but physical proximity to other people is a basic need and cannot be withheld for a long period without possibly severe consequences. keeping families locked up in their homes for long periods will also create problems of its own, even without considering the particular dangers for dysfunctional families and domestic violence victims. new practices need to be formed in which all members of the family experience less privacy and autonomy and have to adjust their daily practices to accommodate the new situation. this is possible for short periods, as tensions can be kept in check. however, over long periods this might lead to conflicts and consequent problems of distress, increased domestic violence, suicides, etc. (cluver et al. 2020). these experiences might also affect families and society in the long term. people will change their behavior permanently to avoid similar situations. thus e.g. people might be less inclined to travel, get close to other people, etc.
the effects of these changes can be more subtle, but have a long-lasting effect on the well-being of society. from the considerations above it should be clear that policies impact epidemics, economics and society differently, and that a policy that may be beneficial from one perspective may lead to disastrous consequences from another perspective. as the crisis progresses and with it its impact increases, decision makers need to be aware of the complexity of the combined impact of policies. means to support understanding this complexity are sorely needed, as are tools that enable the design and analysis of many 'what-if' scenarios and potential outcomes. in this section, we describe the epidemic, economic and social science models that are needed to support decision makers on policies concerning the covid-19 crisis, and the complexity of combining these models. epidemiological models are dominated by compartmental models, of which sir (cope et al. 2018), or seir, formulations are the most commonly used. in compartmental models people are divided into compartments, as depicted in fig. 1. seir shows how people start as being susceptible to a virus (s) and can then become exposed (e) to the virus. from that state they can become infected (i1). in that condition they can visit a doctor (o1) or just stay home (i2). from those states they can still visit a doctor, and in the end they are either recovered (r) or have passed away. once recovered, people can become susceptible again if immunity does not last. the spread of the virus is determined by the probabilities with which persons move from one state to the next. of course this figure gives a very simple picture of all the complexities involved in epidemic models. what is clear from this picture, however, is that people are treated as a kind of passive container of the virus that can infect others with some probability and can recover and become immune. there are no explicit actions that people undertake.
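the compartment dynamics sketched above can be written down as a small system of difference equations. the following is a minimal sketch, not the authors' model: it collapses the doctor-visit states into a single infected compartment and uses purely illustrative values for beta (transmission), sigma (incubation) and gamma (recovery) rather than fitted parameters.

```python
# Minimal SEIR sketch with forward-Euler integration.
# beta: transmission rate, sigma: 1/latent period, gamma: recovery rate.
# All parameter values below are illustrative assumptions, not fitted data.

def seir(s0, e0, i0, r0, beta, sigma, gamma, days, dt=0.1):
    n = s0 + e0 + i0 + r0          # total population, conserved throughout
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    for _ in range(int(days / dt)):
        new_exposed = beta * s * i / n * dt   # S -> E: contact with infected
        new_infected = sigma * e * dt          # E -> I: end of incubation
        new_recovered = gamma * i * dt         # I -> R: recovery (or death)
        s -= new_exposed
        e += new_exposed - new_infected
        i += new_infected - new_recovered
        r += new_recovered
    return s, e, i, r

# a population of 1000 with one initially exposed person, 120 simulated days
s, e, i, r = seir(999, 1, 0, 0, beta=0.5, sigma=0.2, gamma=0.1, days=120)
```

note that every person who leaves one compartment enters the next, so s + e + i + r stays equal to the initial population; this is the "passive container" character of the model discussed above.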
a model like the above is often combined with a social network model that shows the speed of the spread along certain dimensions. when a person who is central in the network gets infected, he will spread the virus in all directions and the epidemic spreads quickly as well (cope et al. 2018). these models are mathematical simplifications of infectious diseases, and allow for understanding how different situations may affect the outcome of the epidemic, by assuming that every person in the same compartment exhibits the same characteristics. that is, the model does not consider explicit human behaviour (as is also explained in the media nowadays) or the consequences of interventions on the actions of people (heesterbeek et al. 2015). these are either considered outside of the model, or are transformed into an estimated effect on the parameters of the seir model that is uniform for all individuals in one compartment. e.g. the effect of closing schools on the spread of the corona virus can be interpreted as a lowering of the factor in fig. 1 (meaning a lower number of places where the spread of the virus is possible and thus a lower probability for the virus to spread). however, what if the consequence of closing schools would be that children stay with their grandparents, or that they are brought together at the homes of other children, rotating the caring between families? in these cases, the number of places where the virus can spread may actually increase, which might outweigh the effect of closing the schools. it is possible to include all these factors in the probability factor. however, by doing so, we lose the actual causal links between the different factors, as these are not an explicit part of the model and therefore cannot be easily identified and adjusted. in economics there are many competing models and theories. without singling out a specific model, it can be said that in general economic models have difficulties in times of crisis (kirman 2010; colander et al.
2009). competing theories focus on different aspects of the economy and make different assumptions about the rest. the main issue all models struggle with (exactly like epidemiological models) is that of human behavior. often, economic models take a 'homo economicus' view on human behavior: a 'rational' individual that always goes for maximum utility or profit in all circumstances. however, we all know this is not always true, as can easily be illustrated by the ultimatum game, an experimental economics game in which one player proposes how to divide a sum of money, e.g. 100 dollars, with a second player. if the second player rejects this division, neither gets anything. rationally, the responding player should accept any offer, as it is more than nothing (which is what he gets when refusing). however, empirical studies show that people only accept what they perceive as fair offers (andersen et al. 2011); in our example, when more than 30-40 dollars are offered. so fairness apparently also is worth something! this neatly illustrates that people have more motivations for taking actions than the mere utility of those actions. again, such motivations and values can somehow be incorporated in economic models, but such an interpretation is not part of economic theory, i.e. economics does not directly provide an answer on how much fairness is worth. this will depend on the context and history. e.g. suppose an agent is pretty fair and offers the other agent 49 dollars, and the other agent accepts. the next time the same agent offers the other agent 45, but now the other agent refuses, because he feels he will get offered less and less. however, he might very well have accepted the 45 if it had been offered the first time. the above is just one example to show that economic models are very good at explaining and predicting some behavior of people, but do not include all motivations and the behaviors following from them.
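the ultimatum-game example can be made concrete in a few lines. the 35-dollar threshold below is an illustrative value picked from the 30-40 dollar range mentioned above, not an empirical estimate.

```python
# Ultimatum game sketch for a 100-dollar pot.
# A fairness-driven responder rejects "unfair" offers even though rejecting
# yields nothing, unlike a pure utility maximiser who accepts any positive
# offer. The 35-dollar threshold is an illustrative assumption.

def respond(offer, fairness_threshold=35):
    """Return (proposer_payoff, responder_payoff)."""
    if offer >= fairness_threshold:
        return 100 - offer, offer    # accepted: both get their share
    return 0, 0                      # rejected: neither gets anything

def respond_rational(offer):
    """A 'homo economicus' responder: anything beats nothing."""
    return (100 - offer, offer) if offer > 0 else (0, 0)
```

comparing the two responders on the same low offer makes the gap between utility maximisation and observed behavior visible: the rational responder takes 10 dollars, the fairness-driven one walks away.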
the third perspective that is relevant for modeling a pandemic is social science. in particular, social network analysis is often used to understand the possible ways a virus might spread (firestone et al. 2011). nowadays much of the work in this area is related to online social media networks, because a lot of data is easily available on such networks. however, for the spread of a virus the physical social networks (between friends, family, colleagues, etc.) are those of main interest. given that there is enough data to construct the physical social networks, they are a very good tool to determine which people might potentially be big virus spreaders. this can be due to their role (e.g. nurses in elderly care) or due to the fact that they have many contacts in different contexts (e.g. sports and culture) that are otherwise sparsely connected, so that they form bridges between densely connected communities. however, knowing which people or which roles one would like to target for controlling the spread does not mean one can devise effective policies. e.g. in the current pandemic it is clear that closing schools, as a possible bridge between communities, is acceptable, while closing hospitals or elderly care centres is not. so, again, additional aspects have to be included in these models in order to use the theory in practice. more semantics is needed for the nodes in the networks, as well as for what the links between the nodes consist of. do the links indicate the number of contacts per day? can people also change the network based on their perception of a situation, i.e. avoid contacts, have different types of contacts, establish new contacts, etc.? so, where social network analysis looks at people as nodes in a network, people are the ones that actually create, maintain and change the social network.
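the idea of spotting potential super spreaders as bridges between densely connected communities can be sketched with plain breadth-first search: a person is a bridge if removing them splits the contact network into more components than before. the toy network below (two small communities joined only through person 'c') is an assumption for illustration only.

```python
# Sketch: find "bridge" people in a physical contact network, i.e. nodes
# whose removal disconnects otherwise connected communities. Uses BFS only;
# the contact graph below is a toy example, not real data.
from collections import deque

def components_without(graph, removed):
    """Count connected components of the graph with one node removed."""
    seen, count = set(), 0
    for start in graph:
        if start == removed or start in seen:
            continue
        count += 1
        queue, _ = deque([start]), seen.add(start)
        while queue:
            for v in graph[queue.popleft()]:
                if v != removed and v not in seen:
                    seen.add(v)
                    queue.append(v)
    return count

def bridge_people(graph):
    base = components_without(graph, removed=None)
    return [n for n in graph if components_without(graph, n) > base]

# two communities {a, b} and {d, e}, linked only through 'c'
contacts = {
    'a': ['b', 'c'], 'b': ['a', 'c'], 'c': ['a', 'b', 'd', 'e'],
    'd': ['c', 'e'], 'e': ['c', 'd'],
}
```

as the text notes, identifying 'c' as the structural bridge says nothing about whether removing that link is socially acceptable, or whether new links would simply form around it.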
when a government tries to contain the spread of a virus, the social networks give a good indication of where that might be done most effectively, but how the people constituting the network will react to the policies is not included in social network theory. thus, whether new links will arise that bypass previous links, or whether other persons will take the place of so-called super spreaders, cannot be derived from this theory. given the above short discussions of the different modeling perspectives on the crisis, one can conclude that all perspectives include some assumptions about human behavior in their models. however, this behavior, and especially how people influence each other's behavior, is not part of any of these models. we propose to take human behavior as the central perspective and use it as a linking pin to connect the different perspectives, as illustrated in fig. 2. in the next section we discuss how the agent model from the assocc framework can fulfill this central role and couple the different perspectives. the design of the assocc framework is based on the fact that individuals always have to balance their needs over many contexts. in the research that we have done in the past twenty years we have come to the sketch in fig. 3, which illustrates how people manage this balancing act in their daily life. we assume that people have a value system that is reasonably consistent over time, contexts and domains. the value system is based on the schwartz value circumplex (schwartz 1994), which is quite universal. it depicts a number of basic values that everyone possesses and their relations. people differ in how they prioritize values rather than in which values they have. although priorities can differ between individuals, they are reasonably consistent within cultural groups. therefore the values can also be seen as linking individual drivers to the social group. we have used this abstract social science framework already in vanhee et al.
(2011), vanhée (2015), cranefield et al. (2017) and heidari et al. (2018). in order to use it in these simulations we have formalized and extended the framework such that it can be coupled to concrete behaviour preferences in each context. thus values give a stable guideline for behavior, and they will be kept satisfied to a certain degree whenever possible. thus, if "conservatism" is important to a person, she will, in general, direct her behavior towards things that benefit the preservation of the community. the second type of drivers of behavior in fig. 3 are the motives that all people have in common. this is based on the theory of mcclelland (1987). the four basic motives that are distinguished are achievement, affiliation, power and avoidance. the achievement motive drives us to progress from the current situation to something better (whatever "better" might mean). the affiliation motive drives us to be together with other people and socialize; thus we sometimes do things just to do them together with our friends and family. the power motive does not actually mean we want power over others, but rather that we want to be autonomous, i.e. able to do tasks without anyone's help. finally, the avoidance motive lets us avoid situations in which we do not know how to behave or what to expect from others. each of these motives is active all the time, and whenever possible it will drive concrete behavior. thus, i might go to my grandparents to ask a question rather than text them, just because i want to have a chat. the third type of element that determines behavior is the affordances that a context provides. these affordances determine what kind of behavior is available and also what type of behavior is salient. e.g. in a bar one often drinks alcohol: even though it is not obligatory, it is salient and also easily afforded. individuals have to balance their values, their motives and the affordances to determine what behaviour is most appropriate in each situation.
as one can imagine, this is quite tricky and would take too much time and energy if done in every situation from scratch. therefore, in human society we have developed social structures that standardize situations and behaviors, in order to package certain combinations that will be acceptable and usually good (even if not optimal). these social constructs are things like norms, conventions, social practices, organizations and institutions. note that these constructs give general guidelines or defaults for behavior, but are not physical restrictions on what is possible! implementing this whole architecture would be too inefficient for any social simulation. therefore we use it as a theoretical starting point, but translate it into a simpler model that is more efficient and scalable. thus, for the assocc simulation framework we fix the most important aspects of the values and motives described in the above architecture into a set of needs, illustrated in fig. 4. from the figure it can be seen that we model the values and motives as needs that deplete over time if nothing is done to satisfy them. the model prevents an agent from only looking at the need with the highest priority, and at other needs only when that need is completely satisfied. by calibrating the size, threshold and depletion rate of each need, we can calibrate and balance all the needs over a longer period, between different contexts and over several domains. e.g. using this model it becomes possible for an individual to decide whether it is more important to work a bit more or to go home and be with the family. this simple model is the crux behind combining health, wealth and social wellbeing in one simulation model (fig. 5). there is not a complete mapping, as not all values are of importance given the type of activities and choices that the agents can make. moreover, the achievement motive is not included explicitly.
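a minimal sketch of such a container model of needs could look as follows. this is not the actual assocc implementation: the need names, weights and depletion rates are illustrative assumptions, chosen only to show how depletion and weighted urgency interact.

```python
# Container model of needs (sketch): each need has a level that depletes
# over time and a weight (importance). The agent attends to the need with
# the highest weighted deficit, so an important-but-full need can lose out
# to a less important need that has run dry. All numbers are illustrative.

class Need:
    def __init__(self, name, weight, decay, level=1.0):
        self.name, self.weight, self.decay, self.level = name, weight, decay, level

    def tick(self):
        # needs deplete over time if nothing is done to satisfy them
        self.level = max(0.0, self.level - self.decay)

    def urgency(self):
        # weighted deficit: important and depleted needs score highest
        return self.weight * (1.0 - self.level)

def most_urgent(needs):
    for n in needs:
        n.tick()
    return max(needs, key=lambda n: n.urgency())

needs = [
    Need('safety', weight=0.9, decay=0.05),    # important, slow to deplete
    Need('belonging', weight=0.7, decay=0.20), # depletes fast in isolation
    Need('autonomy', weight=0.5, decay=0.10),
]
```

with these rates, belonging (fast-depleting) overtakes safety (heavier weight, slow-depleting) after a single tick, which mirrors the work-versus-family example in the text: the winning need changes over time even though the weights are fixed.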
individual agents are inherently driven to take actions because they need to fulfill their needs. longer-term kinds of achievement, like getting educated or getting rich, are not important in this context. some of the needs are subdivided. safety has the more concrete subneeds of food-safety, financial-survival, risk-avoidance, compliance and financial-safety. food-safety and financial-survival represent that individuals have to have enough food and money to survive. financial-safety means that one has some buffer to pay for more than the basic necessities. compliance indicates whether complying with norms is important or not. risk-avoidance in this context indicates whether people take actions that might get them infected, or avoid any of those at all costs. given this core model of the agent decision model for behavior, we can now describe the actual agent architecture as we have implemented it in assocc. we have developed a netlogo simulation consisting of a number (between 300 and 2500) of agents that exist in a grid. agents can move, perceive other agents, and decide on their actions based on their individual characteristics and their perception of the environment. the environment constrains the physical actions of the agents, but can also impose norms and regulations on their behavior. e.g. the agents must follow roads when moving between two places, but the environment can also describe rules of engagement, such as how many agents can occupy a certain location. through interaction, agents can take over characteristics from other agents, such as becoming infected with the coronavirus, or receive information. the main components of the simulation are: • agents: representing individuals. agents have needs and capabilities, but also personal characteristics such as risk aversion or the propensity to follow the law and recommendations from authorities. needs include health, wealth and belonging.
capabilities indicate for instance their jobs or family situations. agents need a minimum wealth value to survive, which they receive by working or through subsidies (or by living together with a working agent). in shops and workplaces, agents trade wealth for products and services. agents pay tax to a central government that then uses this money for subsidies and for the maintenance of public services such as hospitals and schools. • places: representing homes, shops, hospitals, workplaces, schools, airports and stations. by assigning agents to homes, different households can be represented: families, students rooming together, retirement homes, three-generation households and co-parenting divorced agents. the distribution of these households can be set in different combinations to analyse the situation in different cities or countries. • global functions: under this heading we capture the general seir model of the corona virus, which is used to give agents the status of infected, contagious, etc. this model also determines the contagiousness of places like home, transport, shops, etc., based on a factor that represents fixed properties of a place (like size, the time people spend there on average, and whether it is indoor or outdoor) and density (how many people are there at the same time). under the global functions we also capture economic rules that determine tax and subsidies from the government. finally, we also include here the social networks and groups. the social networks give information about normal behavior and also provide clusters of agents performing activities together. • policies: describing interventions that can be taken by decision makers, for instance social distancing, testing, or the closing of schools and workplaces. policies have complex effects on the health, wealth and well-being of all agents. policies can be extended in many different ways to provide an experimentation environment for decision makers.
agents can move between places and take the policies into consideration in their reasoning. as described in sect. 4, agents' decisions are based on the container model of needs. these needs are satisfied by doing activities and decay over time. needs may have different importance to each agent, but the overall assumption is that agents will try to satisfy their most important, least satisfied need at a given moment, given the context. the context determines which choices are available at any given moment. thus, e.g., if agents have to work in a shop they will (normally) go to work even if the need for safety is high. but if they have work that can be done at home as well, they have a choice between going to work or staying home to work; in that case, their need for safety can make them decide to stay home. most needs are composite. for instance, safety is built up of food-safety, financial-survival, risk-avoidance and compliance, listed here in order of importance, i.e. relevance to the agent's direct need to survive. e.g. the safety need is defined as the minimum of the first two, and a weighted mean of the rest, where the weights are the importances assigned to each subneed. here the satisfaction of food-safety is defined as having enough essential resources stocked up at home, such as food and medicine, measured over a two-week period (i.e. the need is fully satisfied if the agent has enough supplies for the coming two weeks, and decays from there). the only way to increase this need is by going shopping for essential resources. however, going shopping, i.e. leaving the house, may conflict with the need for safety, so the agent will need to balance these two needs in its decision to go shopping. agents with a high level of risk-avoidance will be more likely to try to avoid the disease and thus want to stay away from large groups of people. the need for belonging includes conformity, i.e.
the need to comply with norms and regulations, which is satisfied by taking actions that conform to the rules, such as staying inside during lockdown, or going to school or work if that is requested of them. conformity can have a negative value in the case that the agent decides to break a rule. the need for autonomy is satisfied when agents are able to follow their own plans; agents satisfy their need for autonomy when they are able to make an "autonomous" decision. lockdown policies block many of these actions, which means that when an agent reaches too low a level of autonomy it may decide to break the lockdown. however, to regulate this effect and not give agents too strong an incentive to break the lockdown, compliance is used as a regulating factor. the need for self-esteem is satisfied when the agent believes that its actions are accepted and even followed by other agents. finally, the need for survival represents an agent's need to rest if it is sick. this need can be satisfied by resting at home if the agent believes it is sick, and depletes if it does anything else while it believes itself to be sick. under this need we also fitted the conformity need: people will conform to what other people do if they are uncertain about the context and about which action is best to take. conforming to others is safe, as it is usually good to do what others do; thus it contributes to (social) survival. as can be seen above, agents in assocc have a wide range of needs. they range over the social, health and economic dimensions of society, and can therefore also show how interventions intended to remedy e.g. the spread of the virus can influence other dimensions and, due to these dependencies, fail to have the intended effect. in the next section we describe some example scenarios we have simulated in assocc, to illustrate the use, the range of possible scenarios and the importance of the approach.
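the composite safety need described above (the minimum of the two survival-critical subneeds, combined with an importance-weighted mean of the remaining ones) can be sketched as follows. combining the two parts with a final min() is one plausible reading of the text, and the subneed weights are illustrative, not the calibrated assocc values.

```python
# Composite safety need (sketch): min of the two survival-critical
# subneeds (food-safety, financial-survival), combined with an
# importance-weighted mean of the rest (risk-avoidance, compliance).
# Taking min() of the two parts is an interpretive assumption; the
# weights (0.6, 0.4) are illustrative.

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def safety(food, survival, risk_avoidance, compliance, weights=(0.6, 0.4)):
    critical = min(food, survival)                        # either can starve safety
    rest = weighted_mean([risk_avoidance, compliance], weights)
    return min(critical, rest)
```

the min over the critical pair captures that neither food nor money can compensate for the other: an agent with a full pantry but no income still has a low safety level, which is what pushes it back out to work or shop.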
in this section we briefly discuss two scenarios we have simulated with assocc. they address quite different aspects of the covid-19 crisis. the first is on the question whether schools should be closed and employees should work at home during the pandemic. the second investigates some economic effects of having a lockdown for several weeks or even months. both scenarios are built on the same model as described above, which shows the richness of the model and the possibilities for exploring new policies. typically, schools are places where both children and parents from a community gather, which potentially leads to spreading of the virus. when epidemics emerge, one common measure is the closure of schools. in the case of influenza-like illnesses it has been proven that this is an effective measure, since children are highly susceptible to such diseases. however, the covid-19 pandemic is different, as it seems to be less contagious for children. sweden, as one of the few european countries, has kept primary and junior schools (ages 0-16) open, based on the premise that closing schools would mean many parents with jobs that are vital to society, like healthcare personnel, would have to stay home and take care of their children instead of going to work. this would hamper efforts to stem the spread of the virus. in chang et al. (2020) it is already indicated that closing schools would not have much effect in the australian situation. in this scenario, we explore the potential consequences of keeping schools open on the spread of the virus, but also on social and economic aspects. we model the direct and indirect effect on the spread of the virus when schools are closed and people work from home. we assume that schools will be closed as soon as a certain number of infected people within the city has been reached. different thresholds have been tested, but in this paper schools are closed whenever one infected person is detected.
the scenario assumes that when children are staying at home, at least one adult should be at home to take care of them. this caregiver is assumed to work from home for the duration of the school closure. we will in particular look at the effect on the spreading of the coronavirus when schools are closed and parents thus work at home, compared to parents already working at home while children still go to school. in fig. 6 the results of closing the schools and working at home are shown. we show a graph with the health status of the population and the activities taking place. when only the schools are closed (the left two graphs), some people are forced to work at home too, as they have to take care of their children. the scale on the horizontal axis denotes time; four ticks make up one day, so the runs cover about 2-4 months (depending on whether any differences still appear after 2 months). the vertical axes show the number of agents, which is around 330. this number can vary slightly, as it depends on the household distribution and the way the households are generated. the results are based on 40 runs. although the number of people at home is of course much higher than the number of people in other places (the green line in the bottom plots) compared to the baseline simulation, which implies less spreading of the virus, the peak of infected people is higher (around 210 and 215 respectively). the peak is reached around the same time, i.e. 40 ticks (or 10 days). the number of people that died is 80 and 78 respectively. this means that the differences are not significant. we also see a difference in people going to non-essential shops (the black line in the activity graphs): in the graphs on the right, the last two peaks (saturday and sunday) show that people go out shopping in the weekend. the above findings show that the measures do not lower the peak as expected, but even increase that peak.
this can be explained by the fact that more people want to leave the house during the weekend, as they have worked at home during the week with less social interaction. at these public places typically people of different communities gather, while at work or school more or less the same group of people are present. thus closing the schools only increases the spreading of the virus. thus, without additional restrictions, like social distancing and closing restaurants, the effect of working at home or closing schools is surpassed by its side effect, namely people wanting to go out. the covid-19 pandemic has severe medical and social consequences, but it has vast economic consequences as well. a recession is expected, and predictions show that it can be way more severe than the recession following the banking crisis of 2009. thus governments are trying to minimize the economic impact of the pandemic and try to minimize the "financial issues" people and companies have by giving massive financial support. also, once the pandemic has been dealt with, "restarting" the economy is a big challenge, as many people have lost their jobs and economic activity has slowed to a minimum. with this scenario we demonstrate the relation between the pandemic, health measures and the economy. this shows the complex and interconnected nature of the situation. in many countries the government has locked down the country and closed down business operations. this cuts the cycle of income of companies, and subsequently either people become unemployed or governments temporarily take over paying the wages of the employees. in this scenario we show two different situations. the first shows what happens with a lockdown and no government support; the second shows the situation when government supports companies by taking over wages. when government closes all non-essential shops to prevent spread and orders a lockdown, the curve of the pandemic is really flattened.
within the time span of our simulation we see that we are still not experiencing many infections and hardly any deaths. note that the lockdown here starts when only one person is infected, so much earlier than in real life! people are almost all working at home. the essential shops seem to be doing well. this can be seen as hoarding behavior of people that want to make sure they have enough essential products and still have money to buy things. they do not buy anything non-essential, as those shops are all closed (and we have no on-line shopping in our simulation!). the non-essential shops do not go bankrupt in the simulation, because we assumed here that they could postpone all fixed costs, like loans, rent, etc. if these fixed costs are taken into account, all the non-essential shops will go bankrupt pretty quickly without government intervention. the velocity of the money flowing through the system decreases due to the lockdown (fig. 7). in the second situation, government takes over wages from all people working in the non-essential shops. overall we see capital increasing slightly. this can be explained by the fact that people cannot spend money on non-essential products and leisure activities (going to cafes, sport events, etc.) while the number of unemployed/unpaid people is relatively small (fig. 8). the economic activity is kept up by the government. however, the government reserves are quickly depleting, as less tax is coming in and more subsidies are paid. this situation is not sustainable! in this paper we have shown that in crises like the covid-19 crisis the interventions of government have to be made quickly and are often based on too little information and only partial insights into the consequences of these interventions. in most cases the health perspective is the leading perspective informing government decisions. however, the models used by epidemiologists lack the part on human behavior.
therefore they have to translate government interventions and the expected reactions from citizens into parameters of the epidemic model, without being able to check for interdependencies. we have shown how the assocc platform can be a good addition to current epidemiological and economic models, by putting the human behavior model central and connecting the health, wealth and social perspectives from that core. in general there are several advantages of using an agent-based tool like assocc. first, a tool like assocc explicitly avoids providing detailed predictions; rather, it supports investigating the consequences of government policies from all perspectives. by comparing variations of policies (like government subsidizing people or not in the economic scenario) it becomes clear what the consequences are and which fundamental choices have to be made. taking this stance avoids politicians being able to simply blame the model for giving wrong advice. technology should not give single solutions to decision makers, but should support decision makers in gaining a good insight into the consequences of their decisions and into the fundamental priorities that have to be weighed: e.g. the joy of life of elderly people against the risks of meeting their family and becoming infected, or the risk of contagion happening through schools versus the social disruption when schools are closed. due to the agent-based nature it is also possible to explain where results come from. e.g. closing schools might not have any positive effect on the spread of the virus due to all kinds of side effects. in the scenario we could see that people were going more to non-essential shops in the weekend, because they had to be at home more to take care of the children during the week. their need for belonging went up, and going to the shops was the only option left, as other leisure places were closed.
tracing this behavior back in this way helps to explain what the basic cause is and where in the chain something can be done if one wants to avoid this behavior. each step in these causal chains is based on a solid theory (social-psychological, economic or epidemiological in our case) and can thus be checked both on its plausibility and on its theoretical foundation. using agent-based models also makes it easier to adjust parameters in the simulation based on differences in demographics and cultural dimensions between countries, but also between regions or even neighbourhoods in big towns. therefore a more fine-grained analysis can be performed of the impact of government interventions on different parts of the country; e.g. rich and poor neighbourhoods, or urban and rural areas, can be affected in different ways by the same measure. finally, we should stress that we do not claim that agent-based models like those used in assocc should replace domain-specific models! rather, one could use assocc to simulate intervention scenarios to study the human reactions and their consequences. with a good insight into these dependencies, one can feed the domain-specific models with better information to make precise optimizations or predictions of the effect of that intervention. in this way the strengths of the different types of models can be combined rather than seen as competing. supplementary information: the full project is available at https://simassocc.org. the netlogo and gui code is available at https://github.com/lvanhee/covid-sim, including a few runnable scenarios described on the website.
acknowledgments: open access funding provided by umea university.
key: cord-259426-qbolo3k3 authors: tadesse, trhas; alemu, tadesse; amogne, getasew; endazenaw, getabalew; mamo, ephrem title: predictors of coronavirus disease 2019 (covid-19) prevention practices using health belief model among employees in addis ababa, ethiopia, 2020 date: 2020-10-22 journal: infect drug resist doi: 10.2147/idr.s275933 sha: doc_id: 259426 cord_uid: qbolo3k3 background: ethiopia has taken strict preventive measures against covid-19 to control its spread, to protect citizens, and to ensure their wellbeing. employees' adherence to preventive measures is influenced by their knowledge, perceived susceptibility, severity, benefit, barrier, cues to action, and self-efficacy. therefore, this study investigated the predictors of covid-19 prevention practice using the health belief model among employees in addis ababa, ethiopia, 2020. methods: a multicentre cross-sectional study design was used. a total of 628 employees selected by a systematic sampling method were included in this study. data were collected using a pretested self-administered questionnaire. summary statistics were calculated for each variable. a logistic regression model was used to measure the association between the outcome and the predictor variables. statistical significance was declared at p-value<0.05. direction and strength of association were expressed using or and 95% ci. results: of the 628 respondents, 432 (68.8%) had poor covid-19 prevention practice.
three hundred ninety-one (62.3%), 337 (53.7%), 312 (49.7%), 497 (79.1%), 303 (48.2%) and 299 (47.6%) of the respondents had high perceived susceptibility, severity, benefit, barrier, cues to action and self-efficacy towards covid-19 prevention practice, respectively. employees with a low level of perceived barriers were less likely to have poor covid-19 prevention practice than employees with a high level of perceived barriers [aor = 0.03, 95% ci (0.01, 0.05)]. similarly, employees with low cues to action and employees with a low level of self-efficacy practiced covid-19 prevention measures to a lesser extent than those with high cues to action and a high level of self-efficacy [aor = 0.05, 95% ci (0.026, 0.10)] and [aor = 0.08, 95% ci (0.04, 0.14)], respectively. conclusion: the proportion of employees with poor covid-19 prevention practice was high. income, perceived barrier, cues to action, and self-efficacy were significantly associated with covid-19 prevention practice. the globe is facing an extremely bizarre time, struggling to fight an enemy it has never seen before: the novel coronavirus disease (covid-19). sars-cov-2, the virus causing covid-19, was first reported in december 2019 as a cluster of acute respiratory illness (pneumonia of unknown cause) in wuhan, hubei province, china, from where it spread rapidly around the globe, involving more than 190 countries. the world health organization (who) declared the outbreak a public health emergency of international concern (pheic) on 30 january 2020 and a global pandemic on 11 march 2020. 1, 2 as of 22 april 2020, more than 2.57 million cases had been reported across 185 countries and territories, resulting in more than 178,558 deaths. about three-fourths (701,838) of the people with covid-19 had recovered, while about 52,262 of them were in a serious or critical condition. 3, 4 there were then more than 24,600 confirmed cases of coronavirus infection across the african continent, resulting in more than 1190 deaths.
similarly, about 116 cases and 3 deaths from covid-19 had been reported in ethiopia as of 22 april 2020. 5 covid-19 causes a range of respiratory symptoms including fever, fatigue, dry cough, and difficulty breathing. it may result in serious complications like ards and death, especially among elderly patients and patients with underlying medical conditions like heart disease, diabetes, hypertension, and asthma. a study done in china showed that patients with a severe form of covid-19 developed ards and required icu admission and oxygen therapy; at this stage of the disease, the mortality rate is high (15%). 6, 7 for five decades, the health belief model (hbm) has been one of the most widely used conceptual frameworks in health behavior. the hbm has been used both to explain change and maintenance of health-related behaviors and as a guiding framework for health behavior interventions. 8 it is believed that people will take action to prevent or control ill-health conditions like covid-19 if they regard themselves as susceptible to covid-19; if they believe it would have potentially serious consequences; if they believe that a course of action available to them, such as staying home, keeping social distance, or wearing a face mask, would help reduce either their susceptibility to the disease or the severity of the condition; and if they think that the likely barriers (or costs) of taking the action are outweighed by its benefits. 9 given that health motivation is its central focus, the health belief model is an ideal option for addressing behavioral problems that evoke health concerns. the model has been tested repeatedly in western countries and fits well for health behavior change studies, as well as serving as a planning model together with other health education and planning models, such as the precede-proceed model. to date, a vaccine and effective treatment are not available for covid-19.
in a situation like this, basic hygiene principles and aggressive public health measures are vitally important for preventing the spread of the disease and hence reducing its impact on the community. therefore, this study was aimed at assessing predictors of covid-19 prevention practice among employees in addis ababa, ethiopia, using the health belief model. the study was conducted to determine the predictors of covid-19 prevention practice among employees working in addis ababa, ethiopia, may 2020. addis ababa is the capital city of ethiopia, with a population of around 4.7 million. addis ababa has 10 administrative sub-cities and a total of 99 kebeles. 9 the study was done among employees selected from four organizations in addis ababa (ethiopian airlines, commercial bank of ethiopia, black lion hospital, and ethiopian telecommunication corporation), from may to june 2020. a multicentered cross-sectional study design was used to assess predictors of covid-19 prevention practices using the health belief model among employees in addis ababa, ethiopia, 2020. employees from the four stated organizations who were willing to participate were included in this study. employees with hearing and visual impairment were excluded. the sample size for the study was calculated using a single-proportion formula, assuming a 95% ci, 4% margin of error, and a 50% proportion of covid-19 prevention practice. after adding a 10% non-response rate, the final sample size for this study was 628. the sample size was proportionally allocated to each of the four organizations, and a systematic sampling method was then used to select the study participants from each organization. according to the available data during the study period, a total of 4396 active workers were available in the four selected organizations in addis ababa.
hence, by dividing the total number of active employees during the study period (4396) by the total sample size (628), a sampling interval (k) of 7 was obtained. the first employee was selected at random from each organization, and consecutive participants were selected as every seventh employee. participants were approached in their working area. variables: covid-19 prevention practice was the dependent variable. demographic variables, knowledge about covid-19, and the hbm constructs (perceived susceptibility, perceived severity, perceived benefit, perceived barrier, cues to action, and self-efficacy) were the independent variables. the questionnaire was developed by reviewing previous literature on covid-19 prevention practice and in consultation with experts from different fields, to check relevance and make necessary changes according to the study requirements. the questions were modified according to the suggestions received from the expert panel and the output of the pre-test. guidelines for layout, question design, formatting, and pretesting were followed. the questionnaire was used to gather employees' demographic data, knowledge about covid-19 and its prevention, health belief model constructs (perceived susceptibility, perceived severity, perceived benefit, perceived barrier, cues to action, and self-efficacy), and practice of covid-19 prevention. the spss version 23 software package was used to analyze the data. the collected data were entered into spss and data cleaning was undertaken before analysis. summary statistics such as frequencies, percentages, means, and standard deviations were calculated for each variable. a logistic regression model was used to measure the association between the outcome (covid-19 prevention practice) and the predictor variables (socio-demographic variables, knowledge, and the hbm constructs). statistical significance was declared at p-value<0.05.
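the systematic sampling procedure described above (interval k = population size divided by sample size, a random start within the first interval, then every k-th employee) can be sketched as follows; the function name and seed are illustrative, not part of the study:

```python
import random

def systematic_sample(population_size, sample_size, seed=0):
    """Systematic sampling: interval k = population // sample,
    a random start in [0, k), then every k-th unit after that."""
    k = population_size // sample_size        # sampling interval
    random.seed(seed)                         # illustrative fixed seed
    start = random.randrange(k)               # random start in first interval
    return k, list(range(start, population_size, k))

# with the study's figures: 4396 active workers, sample of 628
k, sample = systematic_sample(4396, 628)
print(k, len(sample))                         # interval of 7, 628 selected
```

since 4396 / 628 is exactly 7, every random start yields precisely 628 participants; with a non-divisible population the last interval would be shorter.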
direction and strength of association were expressed using or and 95% ci. a preliminary phase was conducted to assess the validity and reliability of the questionnaire before its use. initially, three ethiopian experts in the fields of epidemiology and research at ethiopian universities were asked to assess the degree to which the items in the questionnaire were relevant and could correctly measure predictors of covid-19 prevention practice using the health belief model, and corrections were made accordingly. then, the questionnaire was pretested on 30 participants who were later excluded from the study sample. these data were used to assess internal consistency reliability using cronbach's alpha. the results showed adequate internal consistency reliability (cronbach's alpha = 0.915 for perceived susceptibility, 0.773 for perceived severity, 0.954 for perceived benefit, 0.869 for perceived barrier, 0.806 for cues to action, and 0.986 for the self-efficacy questions). approval and ethical clearance were obtained from the institutional review board (irb) of universal medical and business college (umbc), in accordance with the principles embodied in the declaration of helsinki. official permission was also obtained from the principals of the four selected organizations before approaching the study participants. the objective and purpose of the study were clearly explained to the study subjects to obtain written informed consent before data collection. participants were also informed that they could discontinue or decline to participate in the study at any time. confidentiality of the information was maintained and the data were recorded anonymously throughout the study. knowledge of covid-19: knowledge of covid-19 was measured using 12 questions. each correct response was scored 1, and each incorrect response was scored 0.
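cronbach's alpha, used above to check internal consistency, is computed from item-level variances as k/(k-1) * (1 - sum of item variances / variance of total scores). a minimal sketch is below; the toy responses are invented for illustration and are not the study's data:

```python
from statistics import pvariance

def cronbach_alpha(items):
    """items: one list of responses per item, respondents in the same order.
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))."""
    k = len(items)                                        # number of items
    n = len(items[0])                                     # number of respondents
    totals = [sum(item[i] for item in items) for i in range(n)]
    item_var = sum(pvariance(item) for item in items)     # per-item spread
    return k / (k - 1) * (1 - item_var / pvariance(totals))

# toy data: three 5-point likert items, five respondents
items = [[5, 4, 4, 2, 3],
         [4, 4, 5, 1, 3],
         [5, 3, 4, 2, 2]]
print(round(cronbach_alpha(items), 3))
```

values above roughly 0.7, as reported for every subscale here, are conventionally read as adequate internal consistency.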
a total score of ≥9 out of 12 was considered as good knowledge, whereas a score <9 was considered as poor knowledge of covid-19 and its prevention. covid-19 prevention practices: the practice of covid-19 prevention was measured using eleven questions. each correct response in the practice category was scored 1, and each incorrect response was scored 0. a total score of ≥8 out of eleven was considered as good practice, whereas a score <8 was considered as poor practice of covid-19 prevention. 10 perceived susceptibility: one's belief regarding the chance of getting covid-19. respondents were asked eight questions (eg "i am not afraid of getting coronavirus infection") and described their level of agreement on a five-point likert scale, with the response options strongly disagree, disagree, neutral, agree, and strongly agree, scored from 1 to 5. subscale scores were obtained by summing the item scores and dividing by the total number of items. a score at or above the average score was indicative of high perceived susceptibility. 11 perceived severity: one's belief about how serious covid-19 and its sequelae are. respondents were asked six questions (eg "becoming coronavirus-infected is the worst thing that could happen to me") and described their level of agreement on the same five-point likert scale, scored from 1 to 5. subscale scores were obtained by summing the item scores and dividing by the total number of items. a score at or above the average score was indicative of high perceived severity.
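the binary scoring rule described above (one point per correct answer, then a good/poor split at the stated cutoff: ≥9 of 12 for knowledge, ≥8 of 11 for practice) can be sketched as follows; the answer key and responses are hypothetical:

```python
def score_items(responses, answer_key):
    """Score 1 for each response that matches the answer key, 0 otherwise."""
    return sum(r == expected for r, expected in zip(responses, answer_key))

def classify(score, cutoff):
    """'good' when the raw score reaches the study's cutoff
    (knowledge: >= 9 of 12; practice: >= 8 of 11)."""
    return "good" if score >= cutoff else "poor"

answer_key = [True] * 12                 # hypothetical 12-item knowledge key
responses = [True] * 9 + [False] * 3     # 9 correct out of 12
print(classify(score_items(responses, answer_key), cutoff=9))   # -> good
print(classify(7, cutoff=8))                                    # -> poor
```

the same two functions cover the practice scale by changing the number of items and the cutoff.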
12 perceived benefit: one's belief in the efficacy of covid-19 prevention practices, such as hand washing and social distancing, in reducing the risk of getting covid-19. respondents were asked ten questions (eg "washing hands frequently with soap and water or using alcohol-based hand rub kills the virus that causes covid-19") and described their level of agreement on the same five-point likert scale, scored from 1 to 5. subscale scores were obtained by summing the item scores and dividing by the total number of items. a score at or above the average score was indicative of high perceived benefit. 13 perceived barrier: one's belief about the tangible and psychological costs of practicing covid-19 prevention mechanisms, such as staying at home. respondents were asked six questions (eg "a face mask is hard to get") and described their level of agreement on the same five-point likert scale, scored from 1 to 5. subscale scores were obtained by summing the item scores and dividing by the total number of items. a score at or above the average score was indicative of a low level of perceived barrier. 14 cues to action: strategies that activate one's "readiness" to use covid-19 prevention practices. based on prior research (wilson et al, 1991), a 6-item yes/no scale was used to assess participants' exposure to cues that could influence them to engage in covid-19 prevention practice. a typical item is: "do you know someone with covid-19?" the sum of the scores ranged from 6 to 12; higher scores indicated exposure to more covid-19 information. the scale score was obtained by summing the item scores and dividing by the total number of items.
15 self-efficacy: one's confidence in one's ability to apply the covid-19 prevention practices recommended by who in different situations. respondents were asked five questions (eg "i feel confident that i could talk to any person about using a face mask") and described their level of agreement on the same five-point likert scale, scored from 1 to 5. subscale scores were obtained by summing the item scores and dividing by the total number of items. a score at or above the average score was indicative of a high level of self-efficacy. 16 a total of 628 employees working in four organizations in addis ababa were included in this study. about two-thirds of the study subjects, 414 (65.9%), were in the age category of 24-28 years, with a mean ± sd of 28.76 ± 5.10 years. the majority of the respondents, 434 (69.1%), were male, and more than half of them, 361 (57.5%), were single. the majority, 376 (59.9%) and 402 (64.0%), were degree holders by educational level and earned a monthly income of 2500-7499 birr, respectively. most of them, 247 (39.3%), were bank workers, while 131 (20.9%) were health workers (table 1). of the total of 628 respondents, 248 (39.5%) had poor knowledge about covid-19 (figure 1). all of the respondents had heard about the disease. more than half of them, 337 (53.7%), were not aware of the call center service number for seeking information about covid-19, and about half of the employees, 309 (49.2%), were aware of the main symptoms of covid-19, such as fever, dry cough, and difficulty breathing. two hundred ninety-seven (47.3%) of them believed that a person infected with covid-19 who has no symptoms cannot transmit the virus to others.
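the likert subscale scoring used for the hbm constructs above (sum the 1-5 item scores, average over the number of items, then split into high/low at the sample average) can be sketched as follows; the example responses and the sample mean are invented for illustration:

```python
from statistics import mean

# 5-point likert response options, scored 1 to 5 as in the study
LIKERT = {"strongly disagree": 1, "disagree": 2, "neutral": 3,
          "agree": 4, "strongly agree": 5}

def subscale_score(responses):
    """Average of the 1-5 item scores for one respondent's subscale."""
    return mean(LIKERT[r] for r in responses)

def classify_high_low(score, sample_mean):
    """'high' when at or above the sample average, per the study's rule."""
    return "high" if score >= sample_mean else "low"

# toy example for a 6-item subscale; sample_mean=3.2 is hypothetical
resp = ["agree", "agree", "neutral", "strongly agree", "agree", "disagree"]
s = subscale_score(resp)          # (4+4+3+5+4+2)/6
print(classify_high_low(s, sample_mean=3.2))
```

note that for the perceived-barrier subscale the study inverts the label, reading an above-average score as a low level of perceived barrier.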
close to half of them, 297 (47.3%), said that children and young adults do not need to take measures to prevent covid-19, and that people who have had contact with someone infected with covid-19 should be immediately quarantined. only 261 (41.6%) of them said that the length of quarantine for people who have had contact with covid-19 cases is 14 days (table 2). the major sources of information about covid-19 for the study subjects were government media (tv/radio) (69.6%), social media (67.7%), local sources such as posters and banners (59.1%), national sources (moh/ephi) (55.4%), and private media (tv/radio) (52.7%) (figure 2). bivariate and multivariate logistic regression models were carried out to determine the factors affecting the employees' knowledge of covid-19. only variables with a p-value ≤0.2 (age, level of education, occupation, income) were included in the multivariate regression. after adjusting for possible confounding factors with multivariate regression, age, level of education, and income were significantly associated with knowledge about covid-19, with a p-value <0.05. employees in the age category of 24-28 years were 2.75 times [aor = 2.75, 95% ci (1.72, 4.41)] more likely to have a poor level of knowledge about covid-19 than employees whose age was greater than 28 years. similarly, employees with a certificate-level education were 10.02 times [aor = 10.02, 95% ci (5.02, 19.99)] more likely to have a poor level of knowledge about covid-19 than employees with an educational level of degree and above. employees with a monthly income of 7500-12,499 birr were less likely to have a poor level of knowledge about covid-19 (table 3). three hundred ninety-one (62.3%) of the respondents had high perceived susceptibility to covid-19, while the remaining 237 (37.7%) had low perceived susceptibility to coronavirus infection, with a mean score ± sd of 14.65 ± 8.5 and a median value of 11.
concerning the perceived severity of the disease, 337 (53.7%) of the respondents had high perceived severity of coronavirus infection while the remaining 291 (46.3%) had low perceived severity; the mean score for perceived severity was 22.34, with a standard deviation of ±7.8 and a median value of 24. concerning the third component of the health belief model, about half of the respondents, 316 (50.3%), had low perceptions of the benefit of coronavirus infection prevention practice, while 312 (49.7%) had high perceived benefit, with a mean ± sd score of 34.0 ± 12.7 and a median value of 39. three hundred twenty-five (51.8%) of the respondents were exposed to few triggering factors (cues to action) for coronavirus infection prevention, with a mean ± sd score of 8.9 ± 2.3 and a median value of 8. four hundred ninety-seven (79.1%) of the respondents had a high perceived barrier and the remaining 131 (20.9%) had a low perceived barrier, with a mean ± sd score of 18.02 ± 7.3 and a median value of 21. about 329 (52.4%) had low self-efficacy towards coronavirus infection prevention, with a mean ± sd score of 15.6 ± 7.5 and a median value of 15 (table 4). of the total of 628 respondents, 432 (68.8%) had poor practice of covid-19 prevention. more than half of them, 58.3%, did not clean surfaces. only 39.8% of them covered their mouth and nose while sneezing and coughing, while 42.4% of them disposed of used tissues properly after coughing and sneezing. the majority, 60.8%, washed their hands frequently with soap and water for 20 seconds, and 58.9% of them cleaned their hands with alcohol-based sanitizer when water was not available. more than half, 55.7%, wore masks in public areas, but none of them kept their distance. more than two-thirds, 68.85%, avoided groups, and all of them stayed at home when they felt sick (figures 3 and 4).
a bivariate and a multivariate logistic regression model were carried out to determine the factors affecting the employees' practice of covid-19 prevention. only variables with a p-value ≤0.2 (sex, knowledge, perceived severity) were included in the multivariate regression. after adjusting for possible confounding factors with multivariate regression, income, perceived barrier, cues to action, and self-efficacy were significantly associated with the prevention practice of covid-19, with a p-value <0.05. employees with monthly incomes of 2500-7499 birr and 7500-12,499 birr were more likely to practice covid-19 prevention than employees with a monthly income of ≥12,500 birr [aor = 3.67, 95% ci (1.09, 12.42)] and [aor = 4.25, 95% ci (1.23, 14.65)], respectively. employees with a low level of perceived barriers were less likely to have poor covid-19 prevention practice than employees with a high level of perceived barriers [aor = 0.03, 95% ci (0.01, 0.05)]. similarly, employees with low cues to action and employees with a low level of self-efficacy practiced covid-19 prevention measures to a lesser extent than those with high cues to action and a high level of self-efficacy (table 5). covid-19 is an emerging infectious disease that poses a significant threat to public health. 17 given the severe threats imposed by covid-19 and the lack of a covid-19 vaccine, preventive measures play a vital role in decreasing infection rates and halting the spread of the disease. 18 this indicates the necessity for employees to practice the preventive and control measures, which is affected by socio-demographic characteristics, level of knowledge, perceived susceptibility, severity, benefit, barrier, cues to action, and self-efficacy. therefore, this study was the first to assess the predictors of coronavirus infection prevention practice among employees in addis ababa, ethiopia, using the health belief model. in this study, the level of good covid-19 prevention practice was 196 (31.2%).
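the paper reports adjusted odds ratios from a multivariate logistic model. as a simpler illustration of how an odds ratio and its 95% ci are formed, the sketch below computes a crude or from a hypothetical 2x2 table using the woolf (log-scale) interval; the counts are invented and this is not the study's adjusted estimate:

```python
from math import exp, log, sqrt

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio from a 2x2 table:
         exposed:   a with outcome, b without
         unexposed: c with outcome, d without
    Woolf 95% CI: exp(ln OR +/- z * SE), SE = sqrt(1/a + 1/b + 1/c + 1/d)."""
    or_ = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower, upper = exp(log(or_) - z * se), exp(log(or_) + z * se)
    return or_, lower, upper

# hypothetical counts: poor practice among two exposure groups
print(odds_ratio_ci(40, 60, 20, 80))
```

a ci that excludes 1 (as here, roughly 1.42 to 5.02 around an or of 2.67) corresponds to statistical significance at p < 0.05, matching how the study reports direction and strength of association.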
this was similar to a study conducted among residents of ethiopia. 19 however, it was lower than in a previous study done among health professionals in ethiopia, which found 63%, 20, 21 than in a study conducted among high-risk groups in addis ababa, ethiopia, which found 49%, 22 than in a study conducted in the kingdom of saudi arabia, 23 and than in a study conducted in hong kong, in which 77% of the participants reported good health practice for covid-19. 24 this discrepancy might be due to differences in the study populations. a perceived barrier is one of the components of the hbm that deals with the perception of barriers that hinder the performance of coronavirus infection prevention (ie availability and accessibility of water, the home environment, and the availability of electricity and internet connection). 25 the present study found that employees with a low level of perceived barriers were less likely to have poor covid-19 prevention practice than employees with a high level of perceived barriers [aor = 0.03, 95% ci (0.01, 0.05)]. this might be due to the finding that the proportion of households with soap and water for handwashing was 13%, that current levels of access to water and hand-washing facilities and the characteristics of the home environment are not conducive to effective implementation of basic prevention measures, including social distancing, 26 and that limited access to electricity and internet connection discourages working from home. 27 self-efficacy is one of the components of the hbm and refers to the level of a person's confidence in his or her ability to successfully perform the prevention mechanisms for covid-19.
8 the current study identified that employees with low cues to action and employees with a low level of self-efficacy practiced covid-19 prevention measures to a lesser extent than those with high cues to action and a high level of self-efficacy [aor = 0.05, 95% ci (0.026, 0.10)] and [aor = 0.08, 95% ci (0.04, 0.14)], respectively. this was in line with a study conducted among turkish adults 28 and with a study conducted in iran among hospital staff. 29 individuals who believe they are at low risk of developing covid-19 are more likely to engage in unhealthy, or risky, behaviors such as not wearing a face mask or not keeping social distance, 30 and the combination of perceived severity and perceived susceptibility is referred to as perceived threat, 31 which depends on knowledge about the covid-19 situation. 32 the hbm predicts that a higher perceived threat leads to a higher likelihood of engagement in health-promoting behaviors such as keeping social distance, properly wearing a face mask, and hand hygiene. however, the current study failed to show a significant association between covid-19 prevention practice and perceived severity or perceived susceptibility. this might be due to the knowledge gap found among the employees regarding the covid-19 situation, which was 40%. a study conducted in sudan to determine sudanese perceptions of covid-19 using the health belief model showed that low perceived susceptibility (beliefs about the likelihood of getting covid-19) and low perceived severity (beliefs about the seriousness of contracting covid-19, including its consequences) were 45% and 40%, respectively. 33 the former is slightly higher than in the current study, in which 37.7% of the employees had low perceived susceptibility; but the perceived severity was slightly lower, as 53.7% of the employees had low perceived severity. this difference might be due to differences in the study area and the source population.
in the current study, of a total of 628 respondents, 380 (60.5%) had good knowledge about covid-19. this was higher than in a study done in ethiopia, in which 52% of the participants had good knowledge of the transmission of covid-19, 22 and than in a study conducted in india, in which 39% of the participants had good perceived knowledge of preventive measures. 34 this study has some limitations. one limitation is bias arising from the cross-sectional study design: since the study took the information at specified time-points, cause-and-effect associations cannot be studied. different mechanisms were used to reduce potential bias in the study. in addition, a lack of sufficiently similar studies limited comparison of this study's findings with others. however, identifying gaps in knowledge, perceived susceptibility, severity, benefit, barrier, cues to action, self-efficacy, and practice can be used to develop effective interventions and establish baseline levels to set priorities for program managers. this study examined the predictors of covid-19 prevention practice using the health belief model among employees of addis ababa, ethiopia. a significant number of employees had poor knowledge about covid-19 and its prevention. the proportion with poor covid-19 prevention practice was also high. income, perceived barrier, cues to action, and self-efficacy were significantly associated with the prevention practice of covid-19, with a p-value <0.05. hence, policymakers and other concerned bodies should focus on these areas to improve the prevention practice of covid-19. consent to publish is not applicable for this manuscript because there are no individual data details such as images or videos. the results of this research were extracted from the data gathered and analyzed based on the stated methods and materials. there are no supplementary files.
the original data supporting this finding will be available at any time upon request.
references:
- wuhan municipal health and health commission's briefing on the current pneumonia epidemic situation in our city
- risk assessment: outbreak of acute respiratory syndrome associated with a novel coronavirus
- coronavirus disease 2019 (covid-19) situation report 93. data as received by who from national authorities by 10:00 cest
- centers for disease control and prevention (cdc). covidview summary ending on april 18, 2020. cdc 24/7: saving lives, protecting people
- covid-19 situation update for the who african region, external situation report. world health organization (who)
- 2019 novel coronavirus of pneumonia in wuhan, china: emerging attack and management strategies
- clinical features of patients infected with 2019 novel coronavirus in wuhan, china
- the health belief model as an explanatory framework in communication research: exploring parallel, serial, and moderated mediation
- applying the health belief model to assess prevention services among young adults
- key messages and actions for covid-19 prevention and control in schools
- using social and behavioural science to support covid-19 pandemic response
- the perceived severity of a disease and the impact of the vocabulary used to convey information: using rasch scaling in a simulated oncological scenario
- perceived benefits, perceived risk, and trust: influences on consumers' group buying behavior
- mental health and emotional impact of covid-19: applying health belief model for medical staff to general public of pakistan
- the behavioral assessment of visual neglect
- perceived susceptibility to illness and perceived benefits of preventive care: an exploration of behavioral theory constructs in a transcultural context
- short report on implications of covid-19 and emerging zoonotic infectious diseases for pastoralists and africa
- the effect of control strategies to reduce social mixing on outcomes of the covid-19 epidemic in wuhan, china: a modelling study
- the knowledge and practice towards covid-19 pandemic prevention among residents of ethiopia: an online cross-sectional study
- knowledge, attitude and practice of healthcare workers towards covid-19 and its prevention in ethiopia: a multicentre study
- perceptions and preventive practices towards covid-19 early in the outbreak among jimma university medical center visitors, southwest ethiopia
- knowledge, practice and associated factors towards the prevention of covid-19 among high-risk groups: a cross-sectional study in addis ababa
- knowledge, attitude and practice toward covid-19 among the public in the kingdom of saudi arabia: a cross-sectional study
- community responses during the early phase of the covid-19 epidemic in hong kong: risk perception, information exposure and preventive measures. medrxiv
- understanding green advertising attitude and behavioral intention: an application of the health belief model
- covid-19 prevention measures in ethiopia: current realities and prospects. strategy support program working paper 142
- ministerial briefing paper on evidence of the likely impact on educational outcomes of vulnerable children learning at home during covid-19. canberra: australian government, department of education, skills and employment
- covid-19 severity, self-efficacy, knowledge, preventive behaviors, and mental health in turkey. death stud
- factors associated with preventive behaviours of covid-19 among hospital staff in iran in 2020: an application of the protection motivation theory
- clustering of lifestyle risk behaviours and its determinants among school-going adolescents in a middle-income country: a cross-sectional study
- perceived severity and susceptibility towards leptospirosis infection in malaysia
- towards an effective health interventions design: an extension of the health belief model
- study of the sudanese perceptions of covid-19: applying the health belief model. 2020
- public perception and preparedness for the pandemic covid-19: a health belief model approach. clin epidemiol glob health
our heartfelt thanks go to universal medical and business college for funding the study. the researchers also wish to express their gratitude to the study subjects and to all those who lent their hands for the successful completion of this research. all authors made a significant contribution to the work reported, whether in the conception, study design, execution, acquisition of data, analysis, and interpretation, or in all these areas; took part in drafting, revising, or critically reviewing the article; gave final approval of the version to be published; agreed on the journal to which the article was submitted; and agree to be accountable for all aspects of the work. this research was funded by universal medical and business college, which had no other role in the manuscript. the authors affirm that there is no conflict of interest concerning the publication of this manuscript.
key: cord-252166-qah877pk authors: ekins, s; mestres, j; testa, b title: in silico pharmacology for drug discovery: applications to targets and beyond date: 2007-09-01 journal: british journal of pharmacology doi: 10.1038/sj.bjp.0707306 sha: doc_id: 252166 cord_uid: qah877pk computational (in silico) methods have been developed and widely applied to pharmacology hypothesis development and testing. these in silico methods include databases, quantitative structure-activity relationships, similarity searching, pharmacophores, homology models and other molecular modeling, machine learning, data mining, network analysis tools and data analysis tools that use a computer. such methods have seen frequent use in the discovery and optimization of novel molecules with affinity to a target, the clarification of absorption, distribution, metabolism, excretion and toxicity properties, as well as physicochemical characterization. the first part of this review discussed the methods that have been used for virtual ligand- and target-based screening and profiling to predict biological activity. the aim of this second part of the review is to illustrate some of the varied applications of in silico methods for pharmacology in terms of the targets addressed. we will also discuss some of the advantages and disadvantages of in silico methods with respect to in vitro and in vivo methods for pharmacology research. our conclusion is that the in silico pharmacology paradigm is ongoing and presents a rich array of opportunities that will assist in expediting the discovery of new targets, and ultimately lead to compounds with predicted biological activity for these novel targets. the first part of this review (ekins et al., 2007) has briefly described the history and development of a field that can be globally referred to as in silico pharmacology.
this included the development of methods and databases, quantitative structure-activity relationships (qsars), similarity searching, pharmacophores, homology models and other molecular modelling, machine learning, data mining, network analysis and data analysis tools that all use a computer. we have also previously introduced how some of these methods can be used for virtual ligand- and target-based screening and virtual affinity profiling. in this second part of the review, we will greatly expand on the applications of these methods to many different target proteins and complex properties, and discuss the pharmacological space covered by some of these in silico efforts. in the process, we will detail the success of in silico methods at identifying new pharmacologically active molecules for many targets and highlight the resulting enrichment factors when screening active druglike databases. we will finally discuss some of the advantages and disadvantages of in silico methods with respect to in vitro and in vivo methods for pharmacology research. the applicability of computational approaches to ligand and target space in which a lead molecule against one gene family member is used for another similar target (termed chemogenomics) (morphy et al., 2004; sharom et al., 2004) will be discussed thoroughly in an upcoming review in this journal from didier rognan (personal communication) and will be only briefly addressed here. however, there have been several attempts to establish relationships between molecular structure and broad biological activity and effects that should be considered (see also section 2.3.1 in ekins et al. (2007)) (kauvar et al., 1995, 1998b; kauvar and laborde, 1998a). for example, the work of fliri et al. (2005b) presented the biological spectra for a cross-section of the proteome. hierarchical clustering based on spectra similarity enabled a relationship between structure and bioactivity to be constructed.
this work was extended to identify agonist and antagonist profiles at various receptors, correctly classifying similar functional activity in the absence of drug target information (fliri et al., 2005c). interestingly, using ic 50 data as affinity fingerprints did not identify functional activity similarities between molecules, as this approach was suggested to introduce a pharmacophoric bias (fliri et al., 2005c). a similar probabilistic approach has also been applied by the same authors to link adverse effects for drugs (obtained from the drug labelling information) with biological spectra. for instance, clustering molecules by side effect profile showed that similar molecules had overlapping profiles, in the same way that they had similar biological spectra, linking preclinical with clinical effects (fliri et al., 2005a). this work offers the intriguing possibility of predicting a biospectra profile, possible functional activity and a side effect profile for a new molecule based on similarity alone. however, confidence in this approach would be greatly enhanced by further prospective testing with a large test set of drug-like molecules not used to generate the underlying signature database. a second group, also from pfizer, presented a global mapping of pharmacological space and in particular focused on a polypharmacology network of molecules with activity against multiple proteins (paolini et al., 2006). they have additionally generated bayesian binary models (for molecules active at <10 μm or inactive) for 698 targets using over 200 000 molecules with biological data (from their in-house collection and the literature), suggesting that they would be useful for predicting primary pharmacology. assessment of 617 approved oral drugs in two-dimensional (2d) molecular property space (molecular weight versus clogp) showed that many of them had clogp > 5 and mw > 500.
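the bayesian binary models mentioned above (active at <10 μm versus inactive) are typically naive bayes classifiers over binary structural fingerprints. the toy sketch below is a minimal bernoulli naive bayes of that kind, written in plain python; the fingerprints, labels and bit meanings are invented for illustration and are not from the paolini et al. data set.

```python
import math

def train_nb(fps, labels):
    """bernoulli naive bayes with laplace smoothing over binary
    fingerprints; label 1 = active (e.g. <10 um), 0 = inactive.
    returns a scoring function giving the log odds of activity."""
    n_bits = len(fps[0])
    counts = {0: [1] * n_bits, 1: [1] * n_bits}   # laplace-smoothed bit counts
    totals = {0: 2, 1: 2}                          # smoothed class totals
    for fp, y in zip(fps, labels):
        totals[y] += 1
        for i, bit in enumerate(fp):
            counts[y][i] += bit
    def log_odds(fp):
        score = math.log((totals[1] - 1) / (totals[0] - 1))  # class prior ratio
        for i, bit in enumerate(fp):
            p1, p0 = counts[1][i] / totals[1], counts[0][i] / totals[0]
            score += math.log(p1 / p0) if bit else math.log((1 - p1) / (1 - p0))
        return score
    return log_odds

# invented 2-bit fingerprints: bit 0 marks a substructure enriched in actives
fps    = [[1, 0], [1, 1], [0, 1], [0, 0]]
labels = [1, 1, 0, 0]
score = train_nb(fps, labels)
```

a positive log-odds score ranks a molecule as more active-like; in practice the fingerprints would be thousands of bits long and the training set the screening collection itself.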
in spite of this, their associated targets were potentially druggable but had yet to realize their potential (paolini et al., 2006). perhaps this work needs to be combined with that of fliri and others for its true potential to be realized, to enable simultaneous understanding and prediction of target, proteomic, functional activity and side effects. a recent analysis of over 12 000 anticancer molecules representing cancer medicinal chemistry space, using 48 molecular 2d descriptors followed by principal component analysis (pca), showed that they populated a different space, broader than hit-like space and orally available drug-like space. this would indicate that in order to find molecules for anticancer targets in commercially available databases, different rules are required from those widely used for drug-likeness, as the latter may unfortunately filter out possible clinical candidates (lloyd et al., 2006). methods to predict the potential biological targets for molecules from chemical structure alone have also been attempted using approaches different from those already described above. for example, one study used probabilistic neural networks with 24 atom-type descriptors to classify 799 molecules from the mdl drug data reports (mddr) database with activity against one of seven targets (g protein-coupled receptors (gpcrs), kinases, enzymes, nuclear hormone receptors and zinc peptidases), with excellent training, testing and prediction statistics (niwa, 2004). twenty-one targets related to depression were selected and molecules from the mddr database were used to create support vector machine (svm) classification models from atom-type descriptors (lepp et al., 2006). these models had satisfactory predictions and recall values between 45 and 90%, the molecules recovered being on average of low molecular weight (<300), and some were active against more than one model. it was suggested that general svm filters would be useful for virtual screening owing to their speed.
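a descriptor-matrix pca of the kind used in the anticancer chemistry-space analysis above can be sketched in a few lines of numpy: centre the n_molecules x n_descriptors matrix, diagonalize its covariance and project onto the leading eigenvectors. the random three-descriptor table below is synthetic (one column deliberately redundant), purely to show the mechanics.

```python
import numpy as np

def pca_project(X, k=2):
    """project a descriptor matrix X (n_molecules x n_descriptors)
    onto its first k principal components."""
    Xc = X - X.mean(axis=0)                    # centre each descriptor
    cov = np.cov(Xc, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:k]         # indices of the top-k components
    return Xc @ vecs[:, order], vals[order]

# synthetic descriptor table: two independent columns plus one redundant
# column, so two components carry essentially all of the variance
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 2))
X = np.hstack([base, 2.0 * base[:, :1]])
proj, top_var = pca_project(X, k=2)
total_var = np.linalg.eigvalsh(np.cov(X - X.mean(axis=0), rowvar=False)).sum()
explained = top_var.sum() / total_var
```

in a real chemistry-space map, descriptors would be standardized first (they have very different scales), and the 2d projection of each compound class would then be plotted against reference drug-like sets.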
others have used similarity searching of the mddr database against small numbers of reference inhibitors for several different targets and were able to show variable enrichment factors that were greater than random (hert et al., 2004). the structure-based alternative to understanding small molecule-protein interactions is to flexibly dock molecules into multiple proteins. a representative of this inverse docking approach is invdock, which was recently applied to identifying potential adverse reactions using a database of 147 proteins related to toxicities (dart). this method has been recently demonstrated with 11 marketed anti-hiv drugs, resulting in reasonable accuracy against dna polymerase beta and dna topoisomerase i (ji et al., 2006). the public availability of data on drugs and drug-like molecules may make the analyses described above possible for scientists outside the private sector. for example, chemical repositories such as drugbank (http://redpoll.pharmacy.ualberta.ca/drugbank/) (wishart et al., 2006), pubchem (http://pubchem.ncbi.nlm.nih.gov/), kidb (http://kidb.bioc.cwru.edu/) (roth et al., 2004; strachan et al., 2006) and others consist of a wealth of target and small molecule data that can be mined and used for computational pharmacology approaches. although much of the in silico pharmacology research to date has been focused on human targets, many of these databases contain data from other species that would also be useful for understanding species differences and promoting discovery of molecules for animal healthcare, as well as assisting in understanding the significance of toxicological findings for chemicals released into the environment. to exhaustively describe all of the proteins that have been computationally modelled under the auspices of in silico pharmacology would be impossible in the confines of this review.
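similarity searching and the enrichment factors it is judged by can both be stated compactly. below is a self-contained sketch: tanimoto similarity over fingerprints represented as sets of on-bits, compounds ranked by similarity to a known active, and the enrichment factor computed as the hit rate in the top fraction of the ranked list divided by the hit rate expected at random. the ten-compound "library" and its bit patterns are invented for illustration.

```python
def tanimoto(a, b):
    """tanimoto coefficient between fingerprints held as sets of on-bits."""
    inter = len(a & b)
    union = len(a) + len(b) - inter
    return inter / union if union else 0.0

def enrichment_factor(ranked_labels, fraction):
    """hit rate in the top fraction of a ranked list, divided by the
    hit rate expected from random selection."""
    n_top = max(1, int(len(ranked_labels) * fraction))
    top_rate = sum(ranked_labels[:n_top]) / n_top
    base_rate = sum(ranked_labels) / len(ranked_labels)
    return top_rate / base_rate

# invented ten-compound library: (fingerprint as on-bit set, 1 = known active)
query = {1, 4, 7, 9}
library = [({1, 4, 7, 9}, 1), ({1, 4, 7}, 1), ({2, 3}, 0), ({5, 6}, 0),
           ({0, 8}, 0), ({1, 2}, 0), ({3, 9}, 0), ({4, 5}, 0),
           ({6, 7}, 0), ({0, 5}, 0)]
ranked = sorted(library, key=lambda t: tanimoto(query, t[0]), reverse=True)
ef = enrichment_factor([y for _, y in ranked], fraction=0.2)
```

with both actives ranked at the top of a 20%-active library, the top-20% enrichment factor is 5, the maximum possible here; "greater than random" in the text corresponds to any value above 1.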
therefore, we will briefly overview the types of proteins that have been modelled and the methods used (see below and table 1; abbreviations used in table 1: ampa, α-amino-3-hydroxy-5-methyl-4-isoxazole propionate; cox, cyclooxygenase; cyp, cytochrome p450; hiv-1, human immunodeficiency virus type 1; lox, lipoxygenase). in addition, we will focus on and describe particular pharmacological applications with regard to virtual screening where novel ligands have been identified. the reader is highly encouraged to study an extensive review of success stories in computer-aided design, which covers a large number of proteins that have been targets for all manner of in silico methods (kubinyi, 2006), as well as other reviews that have dealt with the successes of individual methods (fujita, 1997; kurogi and guner, 2001a; guner et al., 2004). as described previously, computational approaches for drug discovery and development may have more impact if integrated (swaan and ekins, 2005), and we have previously attempted to show that computational methods have been broadly applied to virtually all important proteins in absorption, distribution, metabolism, excretion and toxicity (adme/tox) (ekins and swaan, 2004b). the aim of this paper is to provide an up-to-date review of all proteins and protein families addressed through current state-of-the-art in silico pharmacology methods. drug target examples. enzymes: the ubiquitin regulatory pathway, in which ubiquitin is conjugated and deconjugated with substrate proteins, represents a source of many potential targets for modulation of cancer and other diseases (santo et al., 2005; wong et al., 2003). the recent crystal structure of a mammalian de-ubiquitinating enzyme, hausp, which specifically de-ubiquitinates the ubiquitinated p53 protein, may also assist in drug development despite the peptidic nature of its substrate (hu et al., 2002).
novel non-peptidic inhibitors of the protease ubiquitin isopeptidase, which not only de-ubiquitinates p53 but other general ubiquitinated proteins as well, were discovered recently using a simple pharmacophore-based search of the national cancer institute (nci) database (mullally et al., 2001; mullally and fitzpatrick, 2002). these inhibitors had ic 50 values in the low micromolar range and caused cell death independent of the tumour suppressor p53, which is mutated in greater than 50% of all cancers (hence, p53 inhibition per se may not represent an optimal target for modulation). the ubiquitin isopeptidase inhibitors shikoccin, dibenzylideneacetone, curcumin and the more recently described punaglandins from coral indicate that a sterically accessible α,β-unsaturated ketone is essential for bioactivity (verbitski et al., 2004). all these molecules represent valuable leads for further chemical optimization. aromatase (cytochrome p450 (cyp)19) is a validated target for breast cancer. a ligand-based pharmacophore was generated with three non-steroidal inhibitors. this model could recognize known inhibitors from an in-house library and was further refined by the addition of molecular shape. the model was then used to search the nci database and molecules were scored with a quantitative catalyst hypo-refine (accelrys inc., san diego, ca, usa) model generated with 16 molecules. the hits were also filtered with other pharmacophores for toxicity-related proteins before testing. two out of the three compounds were ultimately found to be micromolar inhibitors (schuster et al., 2006). a structure-based catalyst pharmacophore was developed for acetylcholine esterase, which was subsequently used to search a natural product database. the strategy identified scopoletin and scopolin as hits, which were later shown to have moderate in vivo activity (rollinger et al., 2004).
the same database was also screened against cyclooxygenase (cox)-1 and cox-2 structure-based pharmacophores, leading to the identification of known cox inhibitors. these represent examples where a combination of ethnopharmacological and computational approaches may aid drug discovery (rollinger et al., 2005). a combined ligand-based and structure-based approach was taken to gain structural insights into the human 5-lipoxygenase (lox). a catalyst qualitative hiphop model was created with 16 different molecules, resulting in a five-feature pharmacophore. a homology model of the enzyme was based on two soybean lox enzymes and one rabbit lox enzyme. molecular docking was then used to update and refine the pharmacophore to a four-feature model that could also be visualized in the homology model of 5-lox. as a result of these models, amino-acid residues in the binding site were suggested as targets for site-directed mutagenesis, while virtual screening with the pharmacophore suggested compounds with a phenylthiourea or pyrimidine-5-carboxylate group for testing (charlier et al., 2006). homology models for the human 12-lox and 15-lox have also been used with the flexible ligand docking programme glide (schrödinger inc.) to perform virtual screening of 50 000 compounds. out of 20 compounds tested, 8 had inhibitory activity and several were in the low micromolar range (kenyon et al., 2006). more than 30 years of research on renin have not been enough to deliver a marketed drug that inhibits this enzyme. in spite of this, renin remains an attractive yet elusive target for hypertension (fisher and hollenberg, 2001; stanton, 2003). in this respect, application of structure-based design led to the identification of new non-peptidic inhibitors of human renin.
these molecules include aliskiren (rahuel et al., 2000; torres et al., 2003), piperidines, including ro-0661168 (oefner et al., 1999; vieira et al., 1999), and related 3,4-disubstituted piperidines (marki et al., 2001). interestingly, these piperidines bind to and stabilize a different conformer of the protein termed 'open renin' (bursavich and rich, 2002), whereas aliskiren binds to 'closed renin'. since these latter structure-based design efforts, there have been remarkably few published attempts at computer-aided design of novel renin inhibitors. a single early qsar was derived for a series of chain-modified peptide analogues of angiotensinogen. the activity of these molecules was found to correlate with kier's first-order molecular connectivity index descriptor and molecular weight, but not with lipophilicity as measured by logp (khadikar et al., 2005). another computational method for renin drug discovery used the de novo design software growmol, which could apparently regenerate 3,4-disubstituted piperidines in 1% of the grown structures (bursavich and rich, 2002). an attempt to use a catalyst pharmacophore to discover new renin inhibitors was described in the early 1990s (van drie, 1993). several novel molecules from the pomona database (an early three-dimensional (3d) molecule database) were found that mapped to a renin pharmacophore but apparently were not tested in vitro. more recently, a ligandfit docking study with a crystal structure of the 'open renin' form was able to detect 10 known inhibitors seeded in a library of 1000 compounds within the top 8.4% when using a consensus scoring function. four examples of high-scoring compounds that were not tested as inhibitors fulfilled the pharmacophore derived from the x-ray data, consisting of four hydrophobes, a hydrogen bond donor or positive ionizable feature, as well as excluded volumes (krovat and langer, 2004).
another study used similarity searching of the mddr database (over 100 000 compounds) with 10 renin inhibitors and was able to produce enrichment factors that were 17-fold greater than random (hert et al., 2004). genetic algorithms have also been used for class discrimination between renin inhibitors and non-inhibitors in a subset of the mddr using a small number of interpretable descriptors. among them, amide bond count, molecular weight and hydrogen bond donor counts were found to be much higher in renin inhibitors (ganguly et al., 2006). the recent publications on novel renin inhibitors represent a considerable amount of new information that could be used for further qsar model development and database searching efforts in order to derive novel starting scaffolds for optimization. cathepsin d is an aspartic protease found mainly in lysosomes, which may have a role in β-amyloid precursor protein release and hence may well be a target for alzheimer's disease. cathepsin d may also be elevated in breast cancer and ovarian cancer, hence a means to modulate this activity could be beneficial in these diseases. there has been a brief overview of cathepsin d in a comprehensive review of protease inhibitors (leung et al., 2000). a combination of a structure-based design algorithm and combinatorial chemistry has been successfully applied to finding novel molecules for cathepsin d in the nanomolar range (kick et al., 1997). structures based on pepstatin (a 3.8 pm inhibitor; baldwin et al., 1993) yielded a 6-7% hit rate. these molecules were tested in vitro using hippocampal slices and were shown to block the formation of hyperphosphorylated tau fragments (bi et al., 2000). there have been relatively few computational studies to date on cathepsin d and other related aspartic proteases such as renin and β-secretase.
one study has used molecular dynamics and free energy analyses (mm-pbsa) of cathepsin d inhibitor interactions to suggest new substitutions that may improve binding (huo et al., 2002). a genetic algorithm-based de novo design tool, adapt, has also been used to rediscover active cathepsin d molecules by placing key fragments in the correct positions (pegg et al., 2001). computational models may aid in the selection of novel ligands for protease inhibition that are non-peptidic and selective. using the structural features of eight published inhibitors for cathepsin d (huo et al., 2002), a five-feature pharmacophore was derived consisting of three hydrophobes and two hydrogen bond acceptors (r = 0.98). this pharmacophore was used to search a molecule database and selected 10 molecules out of the 11 441 present. in contrast, a similarity search at the 95% level using chemfinder (cambridgesoft, cambridge, ma, usa) suggested 16 different molecules. all of these were selected for testing in vitro. the pharmacophore produced four hits (40% hit rate) and the similarity search generated five hits (31% hit rate), where at least one replicate showed greater than 40% inhibition (ekins et al., 2004a). in silico evaluation of the adme properties for all active compounds estimated that the molecules would be well absorbed, although some were predicted to have solubility and cyp2d6 inhibition problems. pharmacophore- and structure-based approaches have been used to optimize an acyl urea hit for human glycogen phosphorylase. a catalyst hypogen five-feature pharmacophore was developed and used to guide further analogue synthesis. these compounds showed a good correlation with prediction (r = 0.71). an x-ray structure for one molecule was used to confirm the predicted binding conformation. ultimately, a comparative molecular field analysis (comfa) model was generated with all molecules synthesized and was found to be complementary to the x-ray structure.
the outcome of this study was a molecule with good cellular activity that could lower blood glucose levels in vivo in the rat (klabunde et al., 2005). the human sirtuin type 2, a target for controlling aging and some cancers, deacetylates α-tubulin and has been crystallized at high resolution. this structure has been used for docking the maybridge database and returned a small hit list from which 15 compounds were tested and 5 showed activity at the micromolar level (tervo et al., 2004). catechol o-methyltransferase is a target for parkinson's disease, and a crystal structure of the enzyme has been used to generate a homology model of the human enzyme. this model was used to dock, with flexx software, several catechins from tea and to understand the structure-activity relationship (sar) for these molecules and their metabolites, which had been tested in vitro. ultimately, the combination of in vitro and computational work indicated that the galloyl group on catechins, the distance from lys 144 on the enzyme and the reacting catecholic hydroxy group were important for inhibition. kinases: the kinases represent an attractive family of over 500 targets for the pharmaceutical industry, with several drugs approved recently. kinase space has been mapped using selectivity data for small molecules to create a chemogenomic dendrogram for 43 kinases, which showed the highly homologous kinases to be inhibited similarly by small molecules (vieth et al., 2004). virtual screening methods have been applied quite widely to kinases to date (fischer, 2004). the structure-based design method has produced new potent inhibitors of cdk1 starting from the highly similar apo cdk2 and the positioning of olomoucine. a few amino-acid residues were mutated to conform to the cdk1 sequence. macromodel was used to energy minimize molecules in the atp pocket and visual inspection suggested points for molecular modification on the ligand.
very quickly, design efforts guided ligand optimization to improve activity from 4.5 μm to 25 nm (furet et al., 2000). a more recent cdk1/cyclin b homology model was also used to manually dock ligands, which enabled progression from alsterpaullone, with an ic 50 of 35 nm, to a derivative with an ic 50 of 0.23 nm (kunick et al., 2005). a structure-based in silico screening method was pursued for the syk c-terminal sh2 domain using dock to find low molecular weight fragments for each binding site with millimolar binding affinity. the fragments were then linked to give molecules in the 38-350 μm range, a starting point for further lead optimization (niimi et al., 2001). a pseudoreceptor model was built with a set of 27 epidermal growth factor receptor (egfr) tyrosine kinase inhibitors with the flexible atom receptor model method. the top 15 models created had high r2 and q2 and were also validated with a six-molecule test set. the pseudoreceptor was also in accord with a crystal structure of cdk2 (peng et al., 2003). virtual screening using dock with the crystal structure of the lck sh2 domain was used to screen two million commercially available molecules. extensive filtering was required to reduce this to a manageable hit list using molecular weight and diversity. out of 196 compounds tested in vitro, 34 were inhibitory at 100 μm, while 2 had activities of 10 and 40 μm. fluorescence titrations of some of these compounds suggested that the k d values were in the low micromolar range (huang et al., 2004). the same group also took a similar approach to discover inhibitors of erk2 by screening 800 000 compounds computationally and testing 80 of them in vitro (hancock et al., 2005). five of these molecules inhibited cell proliferation and two were shown by fluorescence titration to bind erk2 with k d values in the low micromolar range.
in both cases, docking of the active molecules suggested orientations for verification by x-ray crystallography (hancock et al., 2005). the ligandscout method was used with bcr-abl tyrosine kinase to find sti-571 (imatinib, gleevec) in a single- and multiple-conformation database (wolber and langer, 2005). a structurally related three-substituted benzamidine derivative of sti-571 was suggested by structure-based design and, when manually docked into the binding site and energy minimized, was shown to form favourable interactions with a hydrophobic pocket. ck2 and pkd are part of the cop9 signalosome and can control the stability of p53 and c-jun, which are important for tumour development. curcumin, besides being an inhibitor of ubiquitin isopeptidase (mullally et al., 2001; mullally and fitzpatrick, 2002) and activator protein-1 (tsuchida et al., 2006), also inhibits ck2 and pkd. using curcumin and emodin as reference structures, 2d and 3d similarity searches of a database of over a million molecules retrieved 35 molecules. among them, seven possessed inhibitory activity. for example, piceatannol was more potent than curcumin against both ck2 and pkd, with ic 50 values of 2.5 and 0.5 μm, respectively (fullbeck et al., 2005). obviously, these examples suggest there has been some success in finding active molecules for kinases, but interestingly few of these studies account for selectivity toward other kinases. ultimately, for therapeutic success, activity toward several kinases (but selectivity toward others) may be required. drug-metabolizing enzymes and transporters: mathematical models describing quantitative structure-metabolism relationships were pioneered by hansch et al. (1968) using small sets of similar molecules and a few molecular descriptors. later, lewis and co-workers provided many qsar and homology models for the individual human cyps (lewis, 2000).
as more sophisticated computational modelling tools became available, we have seen growth in the number of available models (de groot and ekins, 2002b; de graaf et al., 2005; de groot, 2006) and in the size of the data sets they encompass. some more recent methods also incorporate water molecules into the binding sites when docking molecules into these enzymes; these may be important as hydrogen bond mediators with the binding site amino acids (lill et al., 2006). docking methods can also be useful for suggesting novel metabolites for drugs. a recent example used a homology model of cyp2d6 and docked metoclopramide as well as 19 other drugs, showing a good correlation between ic50 and docking score (r2 = 0.61). a novel aromatic n-hydroxy metabolite was suggested as the major metabolite and confirmed in vitro. now that several crystal structures of the mammalian cyps are available, they have been found to compare quite favourably to the prior computational models (rowland et al., 2006). however, for some enzymes such as cyp3a4, where there is both ligand and protein promiscuity, there may be difficulty in making reliable predictions with some computational approaches, such as docking with the available crystal structures (ekroos and sjogren, 2006). hence, multiple pharmacophores or models may be necessary for this and other enzymes (ekins et al., 1999a, b), as has been indicated by others more recently (mao et al., 2006). the udp-glucuronosyltransferases are a class of versatile so-called phase ii enzymes involved in the elimination of drugs, catalysing the conjugation of glucuronic acid to substrates bearing a suitable functional group. numerous qsar and pharmacophore models have been generated with relatively small data sets for the rat and human enzymes.
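as a sketch of how such a docking-score correlation is assessed (the numbers below are invented for illustration, not taken from the cyp2d6 study), the squared pearson correlation between docking scores and measured activities can be computed as follows:

```python
# hedged sketch: pearson r^2 between docking scores and pic50 values,
# the kind of statistic used to judge whether a homology-model docking
# protocol ranks ligands sensibly. all data below are hypothetical.

def pearson_r2(xs, ys):
    """squared pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

# invented docking scores (more negative = better) and pic50 values
scores = [-9.1, -8.4, -7.9, -7.2, -6.5]
pic50s = [7.8, 7.1, 6.9, 6.0, 5.2]
print(round(pearson_r2(scores, pic50s), 2))
```

a value near 1 indicates the docking scores track measured potency; a value near the 0.61 reported above would count as a good correlation for a homology-model protocol.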
the pharmacophores for the human ugt1a1, ugt1a4 and ugt1a9 all have in common two hydrophobes and a glucuronidation feature, while ugt1a9 has an additional hydrogen bond acceptor feature (sorich et al., 2004). sulfotransferases, a second class of conjugating enzymes, have been crystallized (dajani et al., 1999; gamage et al., 2003), and a qsar method has also been used to predict substrate affinity to sult1a3 (dajani et al., 1999). to the best of our knowledge, computational models for the other isozymes have not been developed. in general, conjugating enzymes have been infrequently targeted for in silico models; perhaps because of a paucity of in vitro data and the limited diversity of molecules tested, they have been less widely applied in industry. the computational modelling of drug transporters has been thoroughly reviewed by numerous groups (zhang et al., 2002a, b; chang and swaan, 2005) and will not be addressed here in detail. various transporter models have been applied to database searching to discover substrates and inhibitors (pleban et al., 2005; chang et al., 2006b) and to increase the efficiency of in vitro screening (chang et al., 2006a) or enrichment over random screening. a pharmacophore model of the na+/d-glucose co-transporter found in renal proximal tubules was derived indirectly from phlorizin analogues using the disco programme to superpose molecules. this enabled an estimate of the size of the binding site to be obtained. in contrast to more recent studies with transporter pharmacophores, this model was not tested or used for database searching (wielert-badt et al., 2000). receptors: there are more than 20 different families of receptors present in the plasma membrane, altogether representing over 1000 proteins of the receptorome (strachan et al., 2006). receptors have been widely used as drug targets and they have a wide array of potential ligands.
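to make the pharmacophore idea concrete, a minimal distance-based matching sketch follows; the feature labels ("HYD", "GLU"), coordinates and tolerance are all hypothetical and far simpler than what catalyst or disco actually do:

```python
# hedged sketch of three-point pharmacophore matching by inter-feature
# distances: a molecule matches if some type-preserving assignment of its
# features to the model's features keeps all pairwise distances within a
# tolerance. all geometry below is invented for illustration.
from itertools import permutations
from math import dist

def matches(model, mol_feats, tol=1.0):
    """model and mol_feats are lists of (feature_type, xyz) tuples.
    returns True if an assignment preserves feature types and all
    pairwise distances agree within tol (angstroms)."""
    types = [t for t, _ in model]
    for perm in permutations(mol_feats, len(model)):
        if [t for t, _ in perm] != types:
            continue
        ok = True
        for i in range(len(model)):
            for j in range(i + 1, len(model)):
                if abs(dist(model[i][1], model[j][1]) - dist(perm[i][1], perm[j][1])) > tol:
                    ok = False
        if ok:
            return True
    return False

# hypothetical ugt-like model: two hydrophobes and a glucuronidation site
model = [("HYD", (0.0, 0.0, 0.0)), ("HYD", (4.0, 0.0, 0.0)), ("GLU", (2.0, 3.0, 0.0))]
mol = [("GLU", (2.1, 3.2, 0.1)), ("HYD", (0.2, 0.1, 0.0)), ("HYD", (3.9, -0.1, 0.0))]
print(matches(model, mol))
```

real pharmacophore software additionally handles conformational flexibility and richer feature definitions; this sketch only shows the geometric core of the idea.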
however, it should be noted that to date we have characterized and found agonists and antagonists for only a small percentage of the receptorome. the α-amino-3-hydroxy-5-methyl-4-isoxazolepropionate receptor is central to many central nervous system (cns) pathologies, and ligands have been synthesized as anticonvulsants and neuroprotectants. there is currently no 3d structural information, and therefore a four-point catalyst hiphop pharmacophore was developed with 14 antagonists. this was then used to search the maybridge database and select eight compounds for testing, of which six were found to be active in vivo as anticonvulsants (barreca et al., 2003). serotonin plays a role in many physiological systems, from the cns to the intestinal wall. along with its many receptors, it has a major developmental function regulating cardiovascular morphogenesis. the 5-ht2 receptor family are g protein-coupled 7-transmembrane-spanning receptors, with 5-ht2b expressed in cardiovascular, gut and brain tissues, as well as in human carcinoid tumours (nebigil et al., 2000). in recent years, this receptor has been implicated in the valvular heart disease defects caused by the now banned 'fen-phen' treatment of patients. the primary metabolite, norfenfluramine, potently stimulates 5-ht2b (fitzgerald et al., 2000; rothman et al., 2000). computational modelling of this receptor has been limited to date. a traditional qsar study used a small number of tetrahydro-β-carboline derivatives as antagonists of the rat 5-ht2b contractile receptor in the rat stomach fundus (singh and kumar, 2001). a 3d-qsar with grid-golpe using 38 (aminoalkyl)benzo- and heterocycloalkanones as antagonists of the human receptor resulted in very poor model statistics, possibly owing to the limited range of activity measured and the fact that the data corresponded to a functional response that is likely more complex (brea et al., 2002). neither of these models was validated with external predictions.
on the basis of bacteriorhodopsin and rhodopsin, homology models for the mouse and human 5-ht2b receptor have been combined with site-directed mutagenesis. the bacteriorhodopsin structure provided the more reliable models, which confirmed an aromatic box hypothesis for ligand interaction with serotonin along transmembrane domains 3, 6 and 7 (manivet et al., 2002). a more recent 5-ht2b homology model, based on the rhodopsin-based model of the rat 5-ht2a, was used to determine the sites of interaction for norfenfluramine following molecular dynamics simulations. site-directed mutagenesis showed that val 2.53 was implicated in high-affinity binding through van der waals interactions with the ligand methyl groups (setola et al., 2005). there is certainly an opportunity to develop further qsar models for this receptor in order to rapidly screen libraries of molecules and identify undesirable potent inhibitors. the serotonin 5-ht1a receptor has been frequently modelled. for example, a conformational study of four ligands defined a pharmacophore of the antagonist site using sybyl (hibert et al., 1988). the model resulting from this active analogue approach was used in molecule design and predicted molecule stereospecificity. more recently, a series of over 700 homology models were iteratively created based on the crystal structure of bovine rhodopsin and in turn tuned by flexx docking of known ligands. the final model was used in a virtual screening simulation that was enriched with inhibitors compared with random selection, from which the authors suggested its utility for a real virtual screen (nowak et al., 2006). a homology model of the 5-ht1a receptor has also been used with dock to screen a library of 10 000 compounds seeded with 34 5-ht1a ligands. ninety percent of these active compounds were ranked in the top 1000 compounds (becker et al., 2006), representing a significant enrichment.
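the enrichment in such a seeded screen can be quantified with the standard enrichment factor; the calculation below is my own back-of-the-envelope arithmetic applied to the figures quoted above (34 actives in 10 000 compounds, 90% recovered in the top 1000), not a statistic reported by the authors:

```python
# enrichment factor: hit rate in the selected subset divided by the hit
# rate expected from random selection of the same number of compounds.

def enrichment_factor(actives_found, n_selected, actives_total, n_total):
    """ratio of observed to random hit rates for a ranked screen."""
    hit_rate = actives_found / n_selected
    random_rate = actives_total / n_total
    return hit_rate / random_rate

# 90% of 34 seeded actives recovered in the top 1000 of 10 000 compounds
ef = enrichment_factor(actives_found=round(0.9 * 34), n_selected=1000,
                       actives_total=34, n_total=10_000)
print(round(ef, 1))
```

an enrichment factor of roughly 9 at the top-10% cutoff is consistent with the "significant enrichment" described in the text.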
the same model was used to screen a library of 40 000 vendor compounds and select 78 for testing, of which 16 had activities below 5 µM, one possessing 1 nM affinity. structure-based in silico optimization was then performed to improve selectivity over other gpcrs and optimize the pharmacokinetic (pk) profile. however, as this proceeded, the molecules were found to have affinity for the human ether-a-go-go-related gene (herg) channel, which was subsequently assessed computationally using a homology model that pointed to adjusting the hydrophobicity. the resulting clinical candidate had good target and antitarget selectivity, and backup compounds were selected in the same way (becker et al., 2006). another early computer-aided pharmacophore, generated with sybyl using a set of selective and non-selective analogues, was used to design agonists for 5-ht1d as antimigraine agents with selectivity against 5-ht2a (linked to undesirable changes in blood pressure) (glen et al., 1995). a range of typical and atypical antipsychotics bind to the 5-ht6 receptor. based on the structure of bovine rhodopsin, homology models of the human and rodent 5-ht6 receptors were constructed and used to dock ligands known to exhibit species differences in binding (hirst et al., 2003). following sequence alignment, amino-acid residues were identified for mutation, and the rationalization of these mutations and their effects on ligand binding was obtained from the docking studies. the models generated were in good agreement with the in vitro data and could be used for further molecule design. this study was a good example of computational, molecular biology and traditional pharmacology methods being combined (hirst et al., 2003). the na+, k+-atpase is a receptor for cardiotonic steroids, which inhibit the atpase and cation transport and have inotropic actions.
although the effects of digitalis have been known for hundreds of years, a molecular understanding has remained absent until recently. a homology model was generated with the serca1a crystal structure and tested with nine cardiac glycosides (keenan et al., 2005). the model was also mutated to mimic the rat receptor and showed how ouabain would orient differently in these models, perhaps explaining the species difference in affinity. these models also suggested amino acids that could be experimentally mutated to validate the hypothesized binding site, although this has yet to be tested. the dopamine receptors have been implicated in parkinson's disease and schizophrenia. unfortunately, no crystal structure is currently available, and thus the search for new antagonists has used qsar models. a set of 48 compounds was used with four different qsar methods (comfa, simulated annealing-partial least squares (pls), k-nearest neighbours (knn) and svm), and training as well as testing statistics were generated. the svm and knn models were also used to mine compound databases of over 750 000 molecules, resulting in 54 consensus hits. five of these hits were known to bind the receptor and were not in the training set, while other suggested hits did not contain the catechol group normally seen in most dopamine inhibitors (oloff et al., 2005). the α1a-adrenergic receptor is a target for controlling vascular tone and is therefore useful for antihypertensive agents. a novel approach for ligand-based screening called multiple feature tree (mtree) describes the training set molecules as a feature tree descriptor derived from a topological molecular graph that is then aligned in a pairwise fashion (hessler et al., 2005). a set of six antagonists was used to derive a model with this method, which was compared with a catalyst pharmacophore model. both approaches identified a central positive ionizable feature flanked by hydrophobic regions at either end.
these two methods were compared for their ability to rank a database of over 47 000 molecules. within the top 1% of the database, mtree had an enrichment factor over twice that obtained with catalyst (hessler et al., 2005). nuclear receptors: nuclear receptors constitute a family of ligand-activated transcription factors of paramount importance for the pharmaceutical industry, since many of its members are often considered double-edged swords (shi, 2006). on the one hand, because of their important regulatory role in a variety of biological processes, mutations in nuclear receptors are associated with many common human diseases such as cancer, diabetes and osteoporosis, and thus they are considered highly relevant therapeutic targets. on the other hand, nuclear receptors also act as regulators of some of the cyp enzymes responsible for the metabolism of pharmaceutically relevant molecules, as well as of transporters that can mediate drug efflux, and thus they are also regarded as potential therapeutic antitargets (off-targets). examples of the use of target-based virtual screening to identify novel small molecule modulators of nuclear receptors have recently been reported. using the available structure of the oestrogen receptor subtype α (erα) in its antagonist conformation, a homology model of the retinoic acid receptor α (rarα) was constructed (schapira et al., 2000). using this homology model, virtual screening of a compound library led to the identification of two novel rarα antagonists in the micromolar range. the same approach was later applied to discover 14 novel and diverse micromolar antagonists of the thyroid hormone receptor (schapira et al., 2000). by means of a procedure designed particularly to select compounds fitting onto the lxxll peptide-binding surface of the oestrogen receptor, novel erα antagonists were identified (shao et al., 2004).
since poor displacement of 17β-estradiol was observed in the er-ligand competition assay, these compounds may represent new classes of erα antagonists, with the potential to provide an alternative to current anti-oestrogen therapies. the discovery of three low micromolar hits for erβ displaying over 100-fold binding selectivity with respect to erα was also recently reported using database screening (zhao and brinton, 2005). a final example reports the identification and optimization, based upon a pyrazol-4-ylbenzenesulfonamide, of a novel family of peroxisome proliferator-activated receptor-γ partial agonists with a good selectivity profile against the other subtypes of the same nuclear receptor group, after employing structure-based virtual screening. ion channels: therapeutically important channels include the voltage-gated ion channels for potassium, sodium and calcium that are present in the outer membrane of many different cells, such as those responsible for electrical excitability and signalling in nerve and muscle cells (terlau and stuhmer, 1998). these represent validated therapeutic targets for anaesthesia, cns and cardiovascular diseases (kang et al., 2001). a recent review has discussed the various qsar methods, such as pharmacophores, comfa, svm, 2d-qsar, genetic programming, self-organizing maps and recursive partitioning, that have been applied to most ion channels (aronov et al., 2006) in the absence of crystal structures. to date, l-type calcium channels and herg appear to have been the most extensively studied channels in this regard. in contrast, there are far fewer examples of computational models for the sodium channel. these three classes of ion channels have been studied because they represent either therapeutic targets or antitargets to be avoided. for example, one of many models for the herg potassium channel compared three different methods with the same set of molecules for training and a test set.
recursive partitioning, sammon maps and kohonen maps were used with atom path lengths. the average classification quality was high for both the training and test selections. the sammon mapping technique outperformed the kohonen maps in classifying compounds from the external test set. the quantitative predictions from recursive partitioning could be filtered using a tanimoto similarity measure (willett, 2003) to remove molecules markedly different from the training set. the path length descriptors can also be used to visualize the similarity of the molecules in the whole training set (figure 1a). in addition, a subset of molecules can be compared, with those highlighted in blue representing close neighbours and those in red being more distant (figure 1b). transcription factors: a cyclic decapeptide with activity against the ap-1 transcription factor was used to derive a 3d pharmacophore to which low-energy conformations of non-peptidic compounds were compared. new 1-thia-4-azaspiro[4.5]decane and benzophenone derivatives with activity in binding and cell-based assays were discovered as ap-1 inhibitors in a lead-hopping approach (tsuchida et al., 2006). antibacterials: twenty deoxythymidine monophosphate analogues were used along with docking to generate a pharmacophore for mycobacterium tuberculosis thymidine monophosphate kinase inhibitors with the catalyst software. a final model was used to screen a large database spiked with known inhibitors. the model was suggested to have an enrichment factor of 17, which is highly significant. in addition, the model was used to rapidly screen half a million compounds in an effort to discover new inhibitors (gopalakrishnan et al., 2005). antivirals: neuraminidase is a major surface protein in influenza virus. a structure-based approach was used to generate catalyst pharmacophores, and these in turn were used for a database search and aided the discovery of known inhibitors.
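the tanimoto-similarity filter described above can be sketched in a few lines; the bit-set fingerprints and the 0.3 cutoff below are invented for illustration (real work would use, for example, hashed path-length fingerprints):

```python
# hedged sketch of filtering predictions by tanimoto similarity to the
# training set: molecules whose nearest training-set neighbour falls
# below a cutoff are excluded from quantitative prediction.
# fingerprints are represented as sets of "on" bit indices; the bit
# patterns here are hypothetical.

def tanimoto(a, b):
    """tanimoto (jaccard) coefficient between two sets of fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def in_domain(query, training_fps, cutoff=0.3):
    """keep a prediction only if the query is close enough to training space."""
    return max(tanimoto(query, fp) for fp in training_fps) >= cutoff

training = [{1, 4, 7, 9}, {2, 4, 8}, {1, 3, 5, 9}]
similar = {1, 4, 7}          # shares most bits with the first fingerprint
dissimilar = {20, 21, 22}    # no overlap with any training molecule
print(in_domain(similar, training), in_domain(dissimilar, training))
```

this is the same similarity-to-training-set idea that reappears later in the discussion of applicability domains.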
the hit lists were also very selective (steindl and langer, 2004). human rhinovirus 3c protease is an antirhinitis target. a structure-based pharmacophore was developed initially around ag 7088, but this proved too restrictive. a second pharmacophore was developed from seven peptidic inhibitors using the catalyst hiphop method. this hypothesis was useful in searching the world drug index database to retrieve compounds with known antiviral activity, and several novel compounds with good fits to the pharmacophore were selected from other databases, indicating that they would be worth testing, although these ultimate validation data were not presented (steindl et al., 2005a). human rhinovirus coat protein is another target for antirhinitis. a combined pharmacophore, docking and pca-based clustering approach was used. a pharmacophore was generated from the structure and shape of a known inhibitor and tested for its ability to find known inhibitors in a database. ultimately, after screening the maybridge database, 10 compounds were suggested that were then docked and scored. six compounds were tested and found to inhibit viral growth. however, the majority of them were found to be cytotoxic or had poor solubility (steindl et al., 2005b). the ligand scout approach was tested on rhinovirus serotype 16 and was able to find known inhibitors in the pdb (wolber and langer, 2005). the sars coronavirus 3c-like proteinase has been addressed as a potential drug design target. a homology model was built and chemical databases were docked into it. a pharmacophore model and drug-like rules were used to narrow the hit list. forty compounds were tested and three were found with micromolar activity, the best being calmidazolium at 61 µM (liu et al., 2005), perhaps a starting point for further optimization.
a pharmacophore has also been developed to predict the hepatitis c virus rna-dependent rna polymerase inhibition of diketo acid derivatives. a catalyst hypogen model was derived with 40 molecules with activities spanning three log orders, resulting in a five-feature pharmacophore model. this was in turn tested with 19 compounds from the same data set as well as nine diketo acid derivatives, for which the predicted and experimental data were in good agreement (di santo et al., 2005). [figure 1: (a) a distance matrix plot of the 99-molecule herg training set, showing that the molecules are globally dissimilar, as the plot is primarily red. (b) a distance matrix plot of a subset of the training set showing molecules similar to astemizole. blue represents close molecules and red represents distant molecules, based on the chemtree path length descriptors (see colour scale).] other therapeutic targets: the integrin vla-4 (α4β1) is a target for autoimmune and inflammatory diseases such as asthma and rheumatoid arthritis. the search for antagonists has included using a catalyst pharmacophore derived from the x-ray crystal structure of a peptidic inhibitor (singh et al., 2002b). this was used to search a virtual database of compounds that could be made with reagents from the available chemicals directory. twelve compounds were then selected and synthesized, with resulting activities ranging between 1.3 nM and 20 µM. hence, a peptide was used to derive non-peptide inhibitors that were active in vivo. a second study by the same group used comfa with a set of 29 antagonists with activities from 1 to 662 nM to generate a model with good internal validation statistics, which was subsequently used to indicate favourable regions for molecule substituent changes (singh et al., 2002a). it is unclear whether the comfa model was also successful for the design of further molecules.
it is possible to use approved drugs as a starting point for drug discovery for other diseases. for example, the world health organization list of essential drugs has been searched in an attempt to find leads for prion diseases, using 2d tanimoto similarity or 3d searching with known inhibitors. this work has to date suggested compounds, yet they appear not to have been tested, so the approach has not been completely validated (lorenzen et al., 2005). protein-protein interactions are key components of cellular signalling cascades, the selective interruption of which would represent a sought-after therapeutic mechanism to modulate various diseases (tesmer, 2006). however, such pharmacological targets have been difficult for in silico methods to derive small molecule inhibitors for, owing to their generally quite shallow binding sites. the g-protein gβγ complex can regulate a number of signalling proteins via protein-protein interactions. the search for small molecules to interfere with the gβγ protein-protein interaction has been pursued using flexx docking and consensus scoring of 1990 molecules from the nci diversity set database (bonacci et al., 2006). after testing 85 compounds as inhibitors of the gβ1γ2-sirk peptide interaction, nine compounds were identified with ic50 values from 100 nM to 60 µM. further substructure searching was used to identify compounds similar to one of the most potent inhibitors in order to build a sar. these efforts may eventually lead to more potent lead compounds. up to this point, we have generally considered in silico pharmacology models that relate to a single target protein and the discovery of molecules as agonists, antagonists or with other biological activity, either after database searching and in vitro testing or following searching of databases seeded with molecules of known activity for the target. however, there are many complex properties that have been modelled in silico, and these will be briefly discussed here.
it should also be pointed out that, while several physicochemical properties such as clogp and water solubility have been extensively studied, with training sets in the thousands or tens of thousands of molecules, other complex properties have generally used much smaller training sets in the range of hundreds of molecules. for example, a measure of molecule clearance would be indicative of elimination half-life, which would naturally be of value for selecting candidates. the intrinsic clearance has therefore been used as a measure of the enzyme activity toward a compound, and this may involve multiple enzymes. some of the earliest models for this property include a comfa model of the cyp-mediated metabolism of chlorinated volatile organic compounds, likely representative of cyp2e1 (waller et al., 1996). a more generic set of molecules with clearance data derived from human hepatocytes has been used to predict human in vivo clearance using multiple linear regression, pca, pls and neural networks with leave-one-out cross-validation (schneider et al., 1999). microsomal and hepatocyte clearance data sets have also been used separately to generate catalyst pharmacophores, which were then tested by predicting the opposing data set. this method assumes there are some pharmacophore features intrinsic to the molecules that dictate intrinsic clearance (ekins and obach, 2000). a second complex property is the volume of distribution, which is a function of the extent of drug partitioning into tissue versus plasma; there have been several attempts at modelling this property (lombardo et al., 2002, 2004). this property, along with the plasma half-life, determines the appropriate dose of a drug. for example, 253 diverse drugs from the literature were used with eight molecular descriptors and the sammon and kohonen mapping methods. these models appeared to classify 80% of the compounds correctly (balakin et al., 2005).
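the pharmacokinetic relationship linking these properties can be made concrete: elimination half-life follows from clearance and volume of distribution via t1/2 = ln(2) · vd / cl. the parameter values below are illustrative only, not drawn from any study cited above:

```python
# standard one-compartment pharmacokinetic relationship:
#   half-life = ln(2) * volume of distribution / clearance
from math import log

def half_life_h(cl_l_per_h, vd_l):
    """elimination half-life in hours from clearance (l/h) and vd (l)."""
    return log(2) * vd_l / cl_l_per_h

# e.g. a hypothetical drug with vd = 70 l and cl = 7 l/h
print(round(half_life_h(7.0, 70.0), 1))
```

this is why a model that predicts clearance and one that predicts volume of distribution together constrain the half-life, and hence the dosing interval.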
recently, a set of 384 drugs with literature volume of distribution at steady-state data was used with a mixture discriminant analysis-random forest method and 31 molecular descriptors to generate a predictive model. this model was tested with 23 molecules, resulting in a geometric mean fold error of 1.78, which was comparable to the values for other predictions of this property from animal, in vitro or other methods (lombardo et al., 2006). a third property, the plasma half-life, determined by numerous adme properties, has also been modelled with sammon and kohonen maps using data for 458 drugs from the literature and four molecular descriptors. like the previously described volume of distribution models, these models appeared to classify 80% of the compounds correctly (balakin et al., 2005). a fourth complex property is renal clearance, which assumes that excretion of the unchanged drug takes place only by this route; hence it represents a method of monitoring the proportion of drug metabolized. in one set of published qsar models, 130 molecules were used with 62 volsurf or 37 molconn-z descriptors. the models were tested with 20 molecules, and one using soft independent modelling of class analogies and molconn-z descriptors obtained 85% correct classification between the two classes (0-20 and 20-100%) (doddareddy et al., 2006). a fifth example of a complex property is the protein-ligand interaction and the appropriate scoring functions for it, for which several methods have been developed, such as force fields, empirical and knowledge-based approaches (see also ekins et al., 2007). these are important in computational structure-based design methods for assessing virtual candidate molecules to select those likely to bind a protein with highest affinity (shimada, 2006).
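the geometric mean fold error quoted above (1.78 for the volume of distribution model) is the exponential of the mean absolute log ratio of predicted to observed values; the data below are invented simply to show the calculation:

```python
# geometric mean fold error (gmfe): 1.0 is a perfect prediction,
# 2.0 means predictions are 2-fold off on average.
from math import exp, log

def gmfe(predicted, observed):
    """geometric mean fold error between paired predictions and observations."""
    ratios = [abs(log(p / o)) for p, o in zip(predicted, observed)]
    return exp(sum(ratios) / len(ratios))

# hypothetical predicted vs observed vd values (litres)
pred = [10.0, 50.0, 2.0, 8.0]
obs = [20.0, 40.0, 2.0, 4.0]
print(round(gmfe(pred, obs), 2))
```

because the error is averaged on the log scale, over- and under-predictions of the same fold magnitude are penalized equally, which is the appropriate behaviour for a property spanning orders of magnitude.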
recently, a kernel partial least squares (k-pls) qsar approach has been used along with a genetic algorithm feature selection method for distance-dependent atom pair descriptors, applied to 61- or 105-molecule training sets with binding affinity data for the proteins they bind. bootstrapping, scrambling of the data and external test sets were used to test the models (deng et al., 2004). in essence, such k-pls qsar models across many proteins perhaps isolate the key molecular descriptors that relate to the highest affinity interactions. it will be interesting to see whether such models can continue to be generated with the much larger binding affinity data sets that are now available. a final example of a complex property is the vmax of an enzyme, which has been modelled on a few occasions (mager et al., 1982; hirashima et al., 1997; ghafourian and rashidi, 2001; sipila and taskinen, 2004). this value will depend on the properties of the compound in question and will be influenced by the steric properties of the active site as well as the ease of expulsion of the leaving group from the active site. balakin et al. (2004) have recently used neural network methods to model vmax data for n-dealkylation mediated by cyp2d6 and cyp3a4, using whole-molecule, centroid-of-reaction and leaving-group-related descriptors. these models were also used to predict small sets of molecules not included in training. ultimately, many other reactions and other enzymes will need to be evaluated. similarly, larger test sets are required for all the above complex property models to provide further confidence in their utility and applicability. uses of in silico pharmacology: we propose a general schema for in silico pharmacology, which is shown in figure 2. this demonstrates some of the key roles of the computational technologies that can assist pharmacology.
these roles include finding new antagonists or agonists for a target using an array of methods, in either the absence or the presence of a structure for the target. computational methods may also aid in understanding the underlying biology using networks/pathways based on annotated data (signalling cascades), determining the connectivity of a drug with its targets as a network to understand selectivity, integration with other models for pk/pd (pharmacokinetics/pharmacodynamics) and, ultimately, the emergence of systems in silico pharmacology. we have taken more of a pharmaceutical bias in this review, but we would argue that these methods are equally amenable to, and should be considered for, discovering new chemical probes for the academic pharmacologist, as opposed to lead molecules for optimization into drugs. some of the advantages of in silico pharmacology, and in silico methods in general, are the reduction in the number of molecules made and tested through database searching to find inhibitors or substrates, increased speed of experiments through reliable prediction of most pharmaceutical properties from molecule structure alone, and ultimately reductions in animal and reagent use. we must, however, consider the multi-objective optimization of numerous predicted properties, possibly weighting in silico pharmacology models by importance (or by confidence in the model and/or data), as well as by data set size and diversity. similarly, we should consider the disadvantages of in silico pharmacology methods: protein flexibility, molecule conformation and promiscuity all hinder accurate predictions. for example, even with the recent availability of crystal structures for several mammalian drug-metabolizing enzymes, there is still considerable difficulty in making reliable metabolism predictions.
our focus thus far has been on the creation of many in silico pharmacology models for human properties, yet as pharmacology uses animals for much in vivo testing and subcellular preparations from several species for in vitro experiments, we need models for other species, both to understand interspecies differences and to enable better scaling between them. a widely discussed disadvantage of in silico methods is the applicability domain of the model, which will now be discussed further. defining in silico model applicability domain: some of the in silico pharmacology methods that can be used have limitations similar to those of models used in other areas, such as those for predicting physicochemical and adme/tox properties. for example, models may be generated with a narrow homologous series of pharmacologically relevant molecules (local models) or with a structurally diverse range of molecules (global models). these two approaches have their respective pros and cons. the applicability domain of a local model may be much narrower than that of a global model, such that changing to a new chemical series will result in prediction failure. however, global models may also fail if the predicted molecule falls far enough away from representative molecules in the training set. these limitations are particularly specific to qsar models. in many of the in silico pharmacology model examples described above, the qsar models are local in nature, and this will limit lead hopping to new structural series, whereas global models may be more useful for this purpose. several papers have described the applicability domain of models and methods to calculate it in considerable detail (dimitrov et al., 2005; tetko et al., 2006). molecular similarity to training set compounds may be a reliable measure of prediction quality (sheridan et al., 2004), as demonstrated for a herg model.
to our knowledge, there has not been a specific analysis of the applicability domain for in silico pharmacology models (other than for those examples described above) to the same degree as there has been for physicochemical properties like solubility and logp. the applicability domain of pharmacophore models has not been addressed either, as the focus has primarily been on statistical qsar methods. as we shift toward hybrid or meta-computational methods (that integrate several modelling approaches and algorithms) for predicting from molecular structure the possible physicochemical and pharmacological properties, these could be used to provide prediction confidence by consensus. the docking methods with homology models for certain proteins of pharmacological interest could be used alongside qsar or pharmacophore models, if these are also available. there have been numerous occasions in the study of drug-metabolizing enzymes where qsar and homology models have been combined or used to validate each other (de groot et al., 2002a; de graaf et al., 2005; de groot, 2006). drug metabolism is a good example, as several simultaneous outcomes (for example, metabolites) often occur, a condition not normally found in other pharmacological assays, where a single set of conditions yields a single outcome. it is here that the classification into specific ('local') and comprehensive ('global') methods finds its clearest use (see figure 3), with local methods being applicable to simple biological systems such as a single enzyme or a single enzymatic activity (testa and krämer, 2006). the production of regioselective metabolites (for example, hydroxylation to a phenol and an alcohol) is usually predictable from such methods, but that of different routes (for example, oxidation versus glucuronidation) is not.
this is where global algorithms (that is, applicable to versatile biological systems) are most useful, in their potential capacity to encompass all or most metabolic reactions and offer predictions that are much closer to the in vivo situation. it is readily apparent that in a minority of papers we have found that computational approaches have resulted in predicted lead compounds for testing without the authors providing further experimental verification of biological activity (krovat and langer, 2004; langer et al., 2004; steindl and langer, 2004; gopalakrishnan et al., 2005; lorenzen et al., 2005; steindl et al., 2005a; amin and welsh, 2006). this is an interesting observation, as for many years computational studies were generally performed after synthesis of molecules, and essentially provided illustrative pictures and explanation of the data. now it appears we are seeing a shift in the other direction, as predictions are published for pharmacological activity without apparently requiring in vitro or in vivo experimental verification, as long as the models themselves are validated in some manner. as the models may only have a limited prediction domain, perhaps in future we will see some discussion of the predicted molecules and their distance from the training set, or some other measure of how far the predictions can be extended. many of the molecules identified by virtual screening techniques have not been tested in vitro to ensure that they are not false positives that may actually be involved in molecule aggregation. these types of molecules have been termed 'promiscuous inhibitors', occurring as micromolar inhibitors of several proteins (mcgovern and shoichet, 2003; seidler et al., 2003). a preliminary computational model was developed to help identify these potential promiscuous inhibitors (seidler et al., 2003).
from reviewing the literature, we suggest it would be worth researchers either implementing filters for 'promiscuous inhibitors' or performing rigorous experimental verification of their predicted bioactive molecules to rule out this possibility. it would certainly be very useful to know of the existence of targets that are difficult to model with different methods, as this currently appears to be a process of trial and error for each investigator. in summary, in this and the accompanying review (ekins et al., 2007), we have presented our interpretation of in silico pharmacology and described how the field has developed so far and is used for: discovery of molecules that bind to many different targets and display bioactivity, prediction of complex properties, and the understanding of the underlying metabolic and network interactions. while we have not explicitly discussed pk/pd, whole organ, cell or disease simulations in this review, we recognize they too are an important component of the computer-aided drug design approach (noble and colatsky, 2000; gomeni et al., 2001; kansal, 2004) and may be more widely integrated with other in silico pharmacology methods described previously. the brief history of in silico pharmacology has taken perhaps a rather predictable route, with computational models applied to many of the most important biological targets, where they have the capacity to be used to search large databases and quickly suggest molecules for testing. many of the examples we have presented have demonstrated significant enrichments over random selection of molecules, and so far these have been the most plentiful types of metrics that are routinely used to validate in silico models. the future of in silico pharmacology may be somewhat difficult to predict. while we are seeing a closer interaction between computational and in vitro approaches to date, will we see a similar relationship with in vivo studies in the future?
more broadly, will in silico pharmacology ever be able to entirely replace experimental approaches in vitro and even in vivo, as some animal rights activists want us to believe? the answer here can only be a clear and resounding 'no' (at least in the near future), for two irrefutable reasons. first, biological entities are nonlinear systems showing 'chaotic behaviour'. as such, there is no relation between the magnitude of the input and the magnitude of the output, with even the most minuscule differences between initial conditions rapidly translating into major differences in the output. and second, no computer programme, however 'complex and systems-like', will ever be able to fully model the complexity of biological systems. indeed, in the formulation of the mathematician gregory chaitin, biological systems are algorithmically incompressible, meaning that they cannot be modelled fully by an algorithm shorter than themselves. in the meantime, in silico pharmacology will likely become more complex, requiring some degree of integration of models, as we are seeing in the combined metabolism modelling approaches (figure 3). ultimately, to have a much broader impact, the in silico tools will need to become a part of every pharmacologist's tool kit, and this will require training in modelling and informatics, alongside the in vivo, in vitro and molecular skills. this should provide a realistic appreciation of what the different in silico methods can and cannot be expected to do with regard to the pharmacologist's aim of discovering new therapeutics.

in silico pharmacology for drug discovery, s ekins et al

references
a preliminary in silico lead series of 2-phthalimidinoglutaric acid analogues designed as mmp-3 inhibitors
applications of qsar methods to ion channels.
in: ekins s (ed) computational toxicology: risk assessment for pharmaceutical and environmental chemicals
quantitative structure-metabolism relationship modeling of the metabolic n-dealkylation rates
comprehensive computational assessment of adme properties using mapping techniques
crystal structures of native and inhibited forms of human cathepsin d: implications for lysosomal targeting and drug design
pharmacophore modeling as an efficient tool in the discovery of novel noncompetitive ampa receptor antagonists
an integrated in silico 3d model-driven discovery of a novel, potent, and selective amidosulfonamide 5-ht1a agonist (prx-00023) for the treatment of anxiety and depression
novel cathepsin d inhibitors block the formation of hyperphosphorylated tau fragments in hippocampus
differential targeting of gbetagamma-subunit signaling with small molecules
new serotonin 5-ht(2a), 5-ht(2b), and 5-ht(2c) receptor antagonists: synthesis, pharmacology, 3d-qsar, and molecular modeling of (aminoalkyl)benzo and heterocycloalkanones
designing non-peptide peptidomimetics in the 21st century: inhibitors targeting conformational ensembles
developing a dynamic pharmacophore model for hiv-1 integrase
rapid identification of p-glycoprotein substrates and inhibitors
pharmacophore-based discovery of ligands for drug transporters
computational approaches to modeling drug transporters
structural insights into human 5-lipoxygenase inhibition: combined ligand-based and target-based approach
inhibition of human liver catechol-o-methyltransferase by tea catechins and their metabolites: structure-activity relationship and molecular-modeling studies
x-ray crystal structure of human dopamine sulfotransferase, sult1a3
cytochrome p450 in silico: an integrative modeling approach
designing better drugs: predicting cytochrome p450 metabolism
development of a combined protein and pharmacophore model for cytochrome p450 2c9
pharmacophore modeling of cytochromes p450
generation of predictive pharmacophore models for ccr5 antagonists: study with piperidine- and piperazine-based compounds as a new class of hiv-1 entry inhibitors
predicting protein-ligand binding affinities using novel geometrical descriptors and machine-learning methods
simple but highly effective three-dimensional chemical-feature-based pharmacophore model for diketo acid derivatives as hepatitis c virus rna-dependent rna polymerase inhibitors
a stepwise approach for defining the applicability domain of sar and qsar models
in silico renal clearance model using classical volsurf approach
molecular docking and high-throughput screening for novel inhibitors of protein tyrosine phosphatase-1b
insights for human ether-a-go-go-related gene potassium channel inhibition using recursive partitioning, kohonen and sammon mapping techniques
applying computational and in vitro approaches to lead selection
three and four dimensional-quantitative structure activity relationship analyses of cyp3a4 inhibitors
three dimensional quantitative structure activity relationship (3d-qsar) analysis of cyp3a4 substrates
in silico pharmacology for drug discovery: methods for virtual ligand screening and profiling
techniques: application of systems biology to absorption, distribution, metabolism, excretion, and toxicity
three dimensional-quantitative structure activity relationship computational approaches of prediction of human in vitro intrinsic clearance
development of computational models for enzymes, transporters, channels and receptors relevant to adme/tox
structural basis for ligand promiscuity in cytochrome p450 3a4
the design of drug candidate molecules as selective inhibitors of therapeutically relevant protein kinases
is there a future for renin inhibitors
possible role of valvular serotonin 5-ht(2b) receptors in the cardiopathy associated with fenfluramine
analysis of drug-induced effect patterns to link structure and side effects of medicines
biological spectra analysis: linking biological activity profiles to molecular structure
biospectra analysis: model proteome characterizations for linking molecular structure and biological response
identification of nonpeptidic urotensin ii receptor antagonists by virtual screening based on a pharmacophore model derived from structure-activity relationships and nuclear magnetic resonance studies on urotensin ii
recent success stories leading to commercializable bioactive compounds with the aid of traditional qsar procedures
novel curcumin- and emodin-related compounds identified by in silico 2d/3d conformer screening induce apoptosis in tumor cells
structure-based design of potent cdk1 inhibitors derived from olomoucine
structure of a human carcinogen-converting enzyme, sult1a1
introducing the consensus modeling concept in genetic algorithms: application to interpretable discriminant analysis
quantitative study of the structural requirements of phthalazine/quinazoline derivatives for interaction with human liver aldehyde oxidase
computer-aided design and synthesis of 5-substituted tryptamines and their pharmacology at the 5-ht1d receptor: discovery of compounds with potential anti-migraine properties
computer-assisted drug development (cadd): an emerging technology for designing first-time-in-man and proof-of-concept studies from preclinical experiments
a virtual screening approach for thymidine monophosphate kinase inhibitors as antitubercular agents based on docking and pharmacophore models
combining structure-based drug design and pharmacophores
piperidine renin inhibitors: compounds with improved physicochemical properties
pharmacophore modeling and three dimensional database searching for drug design using catalyst: recent advances
identification of novel extracellular signal-regulated kinase docking domain inhibitors
structure-activity correlations in the metabolism of drugs
comparison of fingerprint-based methods for virtual screening using multiple bioactive reference structures
multiple-ligand-based virtual screening: methods and applications of the mtree approach
graphics computer-aided receptor mapping as a predictive tool for drug design: development of potent, selective, and stereospecific ligands for the 5-ht1a receptor
quantitative structure-activity studies of octopaminergic 2-(arylimino)thiazolidines and oxazolidines against the nervous system of periplaneta americana l
differences in the central nervous system distribution and pharmacology of the mouse 5-hydroxytryptamine-6 receptor compared with rat and human receptors investigated by radioligand binding, site-directed mutagenesis, and molecular modeling
crystal structure of a ubp-family deubiquitinating enzyme in isolation and in complex with ubiquitin aldehyde
identification of non-phosphate-containing small molecular weight inhibitors of the tyrosine kinase p56 lck sh2 domain via in silico screening against the py+3 binding site
molecular dynamics and free energy analyses of cathepsin d-inhibitor interactions: insight into structure-based ligand design
in silico search of putative adverse drug reaction related proteins as a potential tool for facilitating drug adverse effect prediction
identification of novel farnesyl protein transferase inhibitors using three-dimensional searching methods
interactions of a series of fluoroquinolone antibacterial drugs with the human cardiac k+ channel herg
modeling approaches to type 2 diabetes
predicting ligand binding to proteins by affinity fingerprinting
the diversity challenge in combinatorial chemistry
protein affinity map of chemical space
elucidation of the na+, k+-atpase digitalis binding site
novel human lipoxygenase inhibitors discovered using virtual screening with homology models
qsar studies on 1,2-dithiole-3-thiones: modeling of lipophilicity, quinone reductase specific activity, and production of growth hormone
structure-based design and combinatorial chemistry yield low nanomolar inhibitors of cathepsin d
acyl ureas as human liver glycogen phosphorylase inhibitors for the treatment of type 2 diabetes
development of novel edg3 antagonists using a 3d database search and their structure-activity relationships
impact of scoring functions on enrichment in docking-based virtual screening: an application study on renin inhibitors
computer applications in pharmaceutical research and development
structure-aided optimization of kinase inhibitors derived from alsterpaullone
pharmacophore modeling and three-dimensional database searching for drug design using catalyst
discovery of novel mesangial cell proliferation three-dimensional database searching method
lead identification for modulators of multidrug resistance based on in silico screening with a pharmacophoric feature model
screening for new antidepressant leads of multiple activities by support vector machines
protease inhibitors: current status and future prospects
on the recognition of mammalian microsomal cytochrome p450 substrates and their characteristics
prediction of small-molecule binding to cytochrome p450 3a4: flexible docking combined with multidimensional qsar
virtual screening of novel noncovalent inhibitors for sars-cov 3c-like proteinase
oncology exploration: chartering cancer medicinal chemistry space
a hybrid mixture discriminant analysis-random forest computational model for the prediction of volume of distribution of drugs in human
prediction of human volume of distribution values for neutral and basic drugs. 2. extended data set and leave-class-out statistics
prediction of volume of distribution values in humans for neutral and basic drugs using physicochemical measurements and plasma protein binding
in silico screening of drug databases for tse inhibitors
structure-based drug design of a novel family of ppargamma partial agonists: virtual screening, x-ray crystallography, and in vitro/in vivo biological activities
judging models in qsar- and lfe-like studies if there are no replications: correlation of dipeptidyl peptidase iv hydrolytic activities of l-alanyl-l-alanine phenylamides
the serotonin binding site of human and murine 5-ht2b receptors: molecular modeling and site-directed mutagenesis
qsar modeling of in vitro inhibition of cytochrome p450 3a4
piperidine renin inhibitors: from leads to drugs
a common mechanism underlying promiscuous inhibitors from virtual and high-throughput screening
kinase inhibitors: not just for kinases anymore
from magic bullets to designed multiple ligands
pharmacophore model for novel inhibitors of ubiquitin isopeptidases that induce p53-independent cell death
cyclopentenone prostaglandins of the j series inhibit the ubiquitin isopeptidase activity of the proteasome pathway
serotonin 2b receptor is required for heart development
hiv-1 integrase pharmacophore: discovery of inhibitors through three-dimensional database searching
design and synthesis of non-peptidic inhibitors for the syk c-terminal sh2 domain based on structure-based in-silico screening
prediction of biological targets using probabilistic neural networks and atom-type descriptors
a return to rational drug discovery: computer-based models of cells, organs and systems in drug target identification
homology modeling of the serotonin 5-ht1a receptor using automated docking of bioactive compounds with defined geometry
computational tools for the analysis and visualization of multiple protein-ligand complexes
renin inhibition by substituted piperidines: a novel paradigm for the inhibition of monomeric aspartic proteinases?
application of validated qsar models of d1 dopaminergic antagonists for database mining
global mapping of pharmacological space
a genetic algorithm for structure-based de novo design
3d-qsar and receptor modeling of tyrosine kinase inhibitors with flexible atom receptor model (flarm)
targeting drug-efflux pumps - a pharmacoinformatic approach
structure-based drug design: the discovery of novel nonpeptide orally active inhibitors of human renin
discovering cox-inhibiting constituents of morus root bark: activity-guided versus computer-aided methods
acetylcholinesterase inhibitory activity of scopolin and scopoletin discovered by virtual screening of natural products
screening the receptorome to discover the molecular targets for plant-derived psychoactive compounds: a novel approach for cns drug discovery
evidence for possible involvement of 5-ht(2b) receptors in the cardiac valvulopathy associated with fenfluramine and other serotonergic medications
crystal structure of human cytochrome p450 2d6
rational discovery of novel nuclear hormone receptor antagonists
combining in vitro and in vivo pharmacokinetic data for prediction of hepatic drug clearance in humans by artificial neural networks and multivariate statistical techniques
pharmacophore modeling and in silico screening for new p450 19 (aromatase) inhibitors
identification and prediction of promiscuous aggregating inhibitors among known drugs
molecular determinants for the interaction of the valvulopathic anorexigen norfenfluramine with the 5-ht2b receptor
identification of novel estrogen receptor alpha antagonists
from large networks to small molecules
similarity to molecules in the training set is a good discriminator for prediction accuracy in qsar
orphan nuclear receptors, excellent targets of drug discovery
the challenges of making useful protein-ligand free energy predictions for drug discovery. in: ekins s (ed) computer applications in pharmaceutical research and development
3d qsar (comfa) of a series of potent and highly selective vla-4 antagonists
identification of potent and novel alpha4beta1 antagonists using in silico screening
quantitative structure-activity relationship study on tetrahydro-beta-carboline antagonists of the serotonin 2b (5ht2b) contractile receptor in the rat stomach fundus
comfa modeling of human catechol o-methyltransferase enzyme kinetics
development of biologically active compounds by combining 3d qsar and structure-based design methods
towards integrated adme prediction: past, present and future directions for modelling metabolism by udp-glucuronosyltransferases
multiple pharmacophores for the investigation of human udp-glucuronosyltransferase isoform substrate selectivity
evaluation of a novel shape-based computational filter for lead evolution: application to thrombin inhibitors
potential of renin inhibition in cardiovascular disease
human rhinovirus 3c protease: generation of pharmacophore models for peptidic and nonpeptidic inhibitors and their application in virtual screening
influenza virus neuraminidase inhibitors: generation and comparison of structure-based and common feature pharmacophore hypotheses and their application in virtual screening
pharmacophore modeling, docking, and principal component analysis based clustering: combined computer-assisted approaches to identify new inhibitors of the human rhinovirus coat protein
screening the receptorome: an efficient approach for drug discovery and target validation
reengineering the pharmaceutical industry by crash-testing molecules
structure and function of voltage-gated ion channels
an in silico approach to discovering novel inhibitors of human sirtuin type 2
hitting the hot spots of cell signaling cascades
the biochemistry of drug metabolism - an introduction. part 1: principles and overview
can we estimate the accuracy of adme-tox predictions?
solution structure of cnerg1 (ergtoxin), a herg specific scorpion toxin
discovery of nonpeptidic small-molecule ap-1 inhibitors: lead hopping based on a three-dimensional pharmacophore model
protein-structure-based drug discovery of renin inhibitors
punaglandins, chlorinated prostaglandins, function as potent michael receptors to inhibit ubiquitin isopeptidase activity
substituted piperidines - highly potent renin inhibitors due to induced fit adaption of the active site
kinomics - structural biology and chemogenomics of kinase inhibitors and targets
modeling the cytochrome p450-mediated metabolism of chlorinated volatile organic compounds
the discovery of novel, structurally diverse protein kinase c agonists through computer 3d-database pharmacophore search. molecular modeling studies
probing the conformation of the sugar transport inhibitor phlorizin by 2d-nmr, molecular dynamics studies, and pharmacophore analysis
similarity-based approaches to virtual screening
drugbank: a comprehensive resource for in silico drug discovery and exploration
ligandscout: 3-d pharmacophores derived from protein-bound ligands and their use as virtual screening filters
drug discovery in the ubiquitin regulatory pathway
in silico prediction of drug binding to cyp2d6: identification of a new metabolite of metoclopramide
structural biology and function of solute transporters: implications for identifying and designing substrates
modeling of active transport systems
structure-based virtual screening for plant-based erbeta-selective ligands as potential preventative therapy against age-related neurodegenerative diseases

acknowledgements
se gratefully acknowledges dr cheng chang and dr peter w swaan (university of maryland), dr konstantin v balakin (chemical diversity inc.) for in silico pharmacology collaborations over the past several years and dr hugo kubinyi for his insightful efforts in tabulating the successful applications of in silico approaches, which was inspirational.
goldenhelix inc. graciously provided chemtree. se kindly acknowledges dr maggie az hupcey for her support. jm acknowledges the research funding provided by the spanish ministerio de educación y ciencia (project reference bio2005-04171) and the instituto de salud carlos iii. owing to limited space, it was not possible to cite all in silico pharmacology-related papers; our sincere apologies to those omitted. the authors state no conflict of interest.

key: cord-193856-6vs16mq3 authors: zhou, tongxin; wang, yingfei; yan, lu; tan, yong title: spoiled for choice? personalized recommendation for healthcare decisions: a multi-armed bandit approach date: 2020-09-13 journal: nan doi: nan sha: doc_id: 193856 cord_uid: 6vs16mq3

online healthcare communities provide users with various healthcare interventions to promote healthy behavior and improve adherence. when faced with too many intervention choices, however, individuals may find it difficult to decide which option to take, especially when they lack the experience or knowledge to evaluate different options. the choice overload issue may negatively affect users' engagement in health management. in this study, we take a design-science perspective to propose a recommendation framework that helps users to select healthcare interventions. taking into account that users' health behaviors can be highly dynamic and diverse, we propose a multi-armed bandit (mab)-driven recommendation framework, which enables us to adaptively learn users' preference variations while promoting recommendation diversity in the meantime. to better adapt an mab to the healthcare context, we synthesize two innovative model components based on prominent health theories. the first component is a deep-learning-based feature engineering procedure, which is designed to learn crucial recommendation contexts in regard to users' sequential health histories, health-management experiences, preferences, and intrinsic attributes of healthcare interventions.
the second component is a diversity constraint, which structurally diversifies recommendations in different dimensions to provide users with well-rounded support. we apply our approach to an online weight management context and evaluate it rigorously through a series of experiments. our results demonstrate that each of the design components is effective and that our recommendation design outperforms a wide range of state-of-the-art recommendation systems. our study contributes to the research on the application of business intelligence and has implications for multiple stakeholders, including online healthcare platforms, policymakers, and users. internet technologies enable information to be generated and disseminated at almost no cost, which accelerates the growth of information in online environments. social media platforms, for example, allow users to share abundant content, including blogs, music, videos, and other formats, which individuals can freely choose to consume. although various information options increase individuals' choice opportunities, having too many choices can be overwhelming and sometimes even confusing. often, individuals experience difficulties in spotting content that is truly relevant to themselves or in which they are indeed interested (konstan and riedl 2012; ricci et al. 2015). such a choice overload issue can diminish users' experience and create barriers to individuals' engagement in online platforms. online healthcare communities (ohcs), which are social-media-based platforms that gather users with similar health-management interests, are no exception. due to their easy access, ohcs are increasingly being used by individuals to learn about their illness, become familiar with treatment routines, and connect with others in similar circumstances. typical ohcs provide users with various healthcare interventions to promote healthy behavior and improve adherence.
examples include behavioral treatment programs or plans that help individuals to establish healthy habits in regard to diet and physical exercise. during the recent covid-19 pandemic, for example, individuals often engage in online work-out activities to relieve stress and stay healthy (pew research center 2020). individuals can freely choose interventions in which to participate in an online environment. when faced with too many choices, however, individuals may find it difficult to decide which option to take, as they may not know what would work or even what to expect, especially when they are not healthcare professionals and do not have adequate experience in evaluating each choice. as a result, they may fall into analysis paralysis (oulasvirta et al. 2009) and fail to engage in any health-management activities. this may harm their self-intervention adherence and outcome (nutting et al. 2011; snyderman and dinan 2010). the choice overload issue significantly affects one's participation experience or outcome in ohcs, leading to a pressing demand for services that can better fit individuals' healthcare needs. therefore, in this study, we aim to follow the design-science paradigm to develop a personalized healthcare recommendation system as a means to support individuals' engagement in health management. recommendation systems are intelligence-based algorithms that can help users to filter information and discover alternatives that they might not have found otherwise (konstan and riedl 2012; vozalis and margaritis 2003). existing recommendation systems deploy various approaches to learn users' preferences from user-behavior data, such as collaborative filtering, content-based filtering, and hybrid models. research has shown that recommendation systems can effectively improve business performance and customer experience (konstan and riedl 2012; pu et al. 2011) in ecommerce settings.
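to make concrete how such conventional systems exploit historical behavior data, the following is a minimal item-based collaborative-filtering sketch. the rating matrix and the similarity-weighted prediction rule are illustrative assumptions for the example, not the implementations of the systems cited above.

```python
import math

# Minimal item-based collaborative filtering (illustrative only).
# Rows are users, columns are items; 0 means "not yet rated".
ratings = [
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
]

def cosine(a, b):
    """Cosine similarity between two rating vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def item_vector(j):
    """Column j of the rating matrix: all users' ratings of item j."""
    return [row[j] for row in ratings]

def predict(user, item):
    """Similarity-weighted average of the user's ratings on other items."""
    num = den = 0.0
    for j, r in enumerate(ratings[user]):
        if j != item and r > 0:
            s = cosine(item_vector(item), item_vector(j))
            num += s * r
            den += s
    return num / den if den else 0.0

print(predict(0, 2))  # predicted rating of item 2 for user 0
```

note that a predictor of this kind can only reproduce patterns already present in the history, which is exactly the limitation the next paragraph discusses.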
despite their extensive use in ecommerce settings, whether and how recommendation systems can be integrated with online healthcare platforms has received little attention and remains largely underexplored. there are several unique patterns associated with users' health behaviors that create challenges in healthcare recommendations. first, previous health studies suggest that individuals' health behaviors are frequently affected by their evolving health status and health-management experiences (johnson et al. 2002; king et al. 2006; yan and tan 2014). thus, individuals' healthcare needs can exhibit strong temporal dynamics. second, individuals' health management usually contains multi-dimensional effort, as promoting health requires individuals to make a series of changes in all aspects of motivation and lifestyle. for instance, in weight management, individuals need to jointly monitor and manage different behavioral aspects, such as dietary behaviors and participation in physical activities. these patterns indicate that individuals' healthcare needs can be diverse, as individuals may need support for each type of health-management activity. given these unique patterns of individuals' health behaviors, conventional recommendation systems that are proven effective in ecommerce settings may not be effective in the healthcare context. this is because these algorithms generally exploit historical data to learn users' preferences. as individuals' health behaviors are continually changing, their health-behavior variations may not be fully captured by the historical data, especially when individuals' health-behavior data remain limited. thus, a mere exploitation of historical data may not be sufficient in healthcare recommendations. in addition, when individuals do not have well-established preferences about healthcare interventions, they may dynamically form their preferences based on the recommended items.
the conventional recommendation systems do not take into account such interactions between users and recommendations and, thus, may not be effective in improving long-term recommendation performance (liu et al. 2019). finally, conventional recommendation systems are generally shown to over-specialize recommendations (fleder and hosanagar 2009; pariser 2011). thus, they may not well support users' diverse healthcare needs. these research gaps motivate us to propose a novel recommendation design that utilizes a multi-armed bandit (mab) as the main building block. an mab is an online-learning framework in statistics and machine learning for solving decision-making problems in noisy or changing environments (auer et al. 2002; chapelle and li 2011). specifically, when decision-makers (e.g., service providers) do not know the outcome of an action (e.g., recommendation), an mab can help them to sequentially select choice alternatives while actively gathering information on each alternative's expected payoff (zeng et al. 2016). in this process, an mab strikes a balance between exploiting the learned knowledge to gain immediate rewards (reusing a highly rewarding alternative from the past) and exploring potential better alternatives (trying new or less-used alternatives to gather more information), which is known as the "exploitation-versus-exploration" tradeoff. by doing so, an mab aims to maximize the cumulative reward during the entire decision-making period. in the healthcare recommendation context, service providers tend to have little knowledge about users' healthcare preferences, as individuals may constantly change their health behaviors and healthcare needs. thus, the mab framework can be used in such a setting to efficiently guide the learning of users' changing healthcare needs. 
in addition, through the exploration process, an mab framework can promote the discovery of users' diverse healthcare needs, which may not be revealed by their historical behavior data. to better adapt an mab to the healthcare-recommendation context, we follow prominent health-behavior theories to further extend and enhance a standard mab by synthesizing two model components, deep-learning-based feature engineering and a diversity constraint. first, we design and implement two deep-learning models to extract user embeddings and item embeddings, which enables us to capture information that is critical to a healthcare decision-making context, such as users' health histories and health-behavior sequences (johnson et al. 2002; king et al. 2006; yan and tan 2014) and intrinsic attributes of healthcare interventions. taken together, the constructed user embeddings and item embeddings help to improve the personalization and contextualization of healthcare recommendations. the second model component is incorporated based on social cognitive theory (sct) (bandura 1991; bandura 1998). sct proposes a classic paradigm for understanding individuals' personal-influence-based health-management behaviors. based on sct, we theorize the major dimensions of health management, and we use a diversity constraint to ensure that recommendations are structurally diversified along each of the health-management dimensions, so that individuals are provided with well-rounded support. to this end, we propose a thompson sampling (ts)-based algorithm to solve this constrained recommendation task. our proposed recommendation framework is evaluated through a series of experiments, using data collected from a leading non-commercial online weight-loss platform in the united states. 
the focal platform provides weight-loss challenges to users, which are structured behavioral treatment programs to help users to manage short-term weight-loss goals, such as changing a dietary behavior, increasing physical exercise, and reducing weight in certain periods. we apply our recommendation framework to this weight-management setting to help users to find the most relevant challenges as a means to improve their engagement in weight-management activities. our evaluation results suggest that each of our proposed model components is effective and that our recommendation framework significantly outperforms a wide range of benchmark models, including ucb, ε-greedy, and state-of-the-art conventional recommendation systems, such as context-aware collaborative filtering (cacf), probabilistic matrix factorization (pmf), and content-based filtering (cb). in addition, we demonstrate that our recommendation framework can more effectively learn the dynamics and the diversity distribution in users' challenge choices. from users' perspectives, we find that our recommendation design can serve to benefit a larger user population on the platform. finally, we take a further step to evaluate our recommendation performance with respect to users' weight-loss outcomes. the evaluation results suggest that our proposed recommendation design can help users to achieve the highest average weight-loss rate compared to the benchmark models. our study makes several key contributions to the literature and practice. first, one major contribution of our study is the proposed healthcare recommendation framework, which demonstrates that prescriptive analytics can be integrated via a design-science artifact (abbasi et al. 2016; chen et al. 2012) to provide decision-making support for individuals' health management. 
the novel aspects of our recommendation framework include (1) a deep-learning-based feature engineering procedure, (2) a domain-knowledge-driven diversity constraint, and (3) a customized online-learning scheme. to the best of our knowledge, our study is among the first to combine an mab with deep context representations and to introduce recommendation constraints for diversity promotion. second, from a practical perspective, our recommendation framework can be applied to address real-world challenges in healthcare recommendations. online healthcare platforms can adopt our recommendation design to improve users' health-management experience on the platform. finally, the design of our recommendation framework can be further generalized to settings beyond healthcare. the online-learning scheme of an mab enables decision-makers to adaptively adjust their strategies to minimize opportunity cost, and the deep-learning-based feature engineering procedure can help decision-makers to better understand the context-dependency of their decision results. our study is related primarily to two streams of literature, that is, individuals' health management and recommendation systems. in the following, we first review prominent health-behavior theories to identify the unique behavior patterns associated with individuals' health management. this discussion provides the theoretical foundation for our recommendation design. we then review the existing recommendation algorithms and discuss their limitations in delivering healthcare recommendations with respect to individuals' health-behavior patterns. finally, we introduce an online-learning framework, the mab, which has gathered increasing attention in the literature for its capability of solving decision-making problems under uncertainty. we explain how an mab framework can be implemented to capture individuals' health-behavior patterns in the healthcare recommendation process. 
individuals' lifestyles play a significant role in affecting their quality of health. poor health behaviors, such as smoking, alcohol abuse, and sedentary living habits, have been shown to be associated with multiple health risks (cdc 2019). thus, the management of personal health usually requires individuals to invest effort into making a health-behavior change. for example, in managing a chronic condition, such as obesity or type 2 diabetes, patients need to continually self-regulate their ongoing lifestyle in regard to dietary behaviors and participation in physical activities. researchers have found that patients' active engagement in health management is generally associated with improved adherence to treatment plans and better health outcomes (nutting et al. 2011; snyderman and dinan 2010). according to prior health-behavior studies and theories (bandura 1991; bandura 1998; johnson et al. 2002), individuals' health management may exhibit unique patterns, such as behavior dynamics and diversity. these patterns play a decisive role in shaping individuals' preferences for healthcare interventions and affect the design of healthcare recommendation systems. in the following, we introduce several prominent health-behavior theories to motivate our recommendation design. previous health studies have generally depicted individuals' health management as a dynamic process. johnson et al. (2002) suggested that, in the process of health management, individuals may frequently adapt their health behaviors based on their personal health condition and health-management experiences, such as treatment compliance, self-monitoring, and healthcare-knowledge seeking. in addition, the social environment may dynamically transform individuals' health behaviors by affecting mental well-being (king et al. 2006; yan and tan 2014). 
for instance, the exchange of emotional or informational support among peers may encourage optimism and self-esteem of individuals (dimatteo 2004), which can help them to better comply with a treatment plan and make a behavior change (johnson and wardle 2011; krukowski et al. 2008; wang et al. 2012). together, these studies indicate that individuals' health behaviors need to be understood with respect to specific health and social contexts. as health and social contexts may evolve with time, individuals' health behaviors generally exhibit strong temporal dynamics. psychosocial theories have extended our understanding of how cognitive and social factors contribute to personal health, among which sct (bandura 1991; bandura 1998) is widely used in the health literature to describe individuals' health-management behaviors. sct proposes a personal-influence-based self-regulation model, in which individuals exert control over their motivation and behaviors to achieve better health outcomes. the theory suggests that individuals' self-regulation contains multi-dimensional effort. first, individuals need to set proper health goals to motivate themselves toward a desirable health outcome. second, individuals need to operationalize their goals into actual behavioral aspects so that they can gain behavior-management skills and strategies to tackle challenges and fulfill expectations effectively. depending on health contexts, individuals may need to attend to different behavioral aspects at the same time. in weight management, for example, individuals need to manage both their dietary behaviors and physical activities to control their calorie intake and expenditure. based on sct, there are two major health-management dimensions: outcome-oriented dimension(s) and behavior-oriented dimension(s). the former influences individuals' motivation for health behaviors, whereas the latter affects the course of behavior execution. 
corresponding to these dimensions, individuals may need different types of support to guide them through the self-regulation process. for instance, individuals may need instructions on setting reasonable health goals to help them understand and manage their progress toward a targeted health condition; as well, they may need suggestions on how to cope with difficulties in the process of establishing health-behavior routines. these patterns indicate that individuals' healthcare needs can be diverse. the dynamic and multifaceted nature of health management has brought new challenges in healthcare recommendations. in this section, we review the existing recommendation systems to discuss the research gaps associated with conventional recommendation schemes that have impeded them from addressing individuals' unique health-behavior patterns. recommendation systems are intelligence-based decision-making algorithms that can help users to filter information or product choices based on their own preferences or interests, especially when there is information or product overload (konstan and riedl 2012; vozalis and margaritis 2003). during the last few decades, recommendation systems have garnered considerable attention from both academia and industry for their capability in delivering personalized services and generating benefits for service providers and customers (isinkaye et al. 2015; pathak et al. 2010; pu et al. 2011). in the literature, a large body of research has focused on batch-learning-based recommendation systems, such as collaborative filtering, content-based filtering, and hybrid models. these recommendation systems generally adopt a "first learn, then earn" recommendation scheme. that is, they first learn users' preference patterns based on a series of historical data, and then they fully exploit the learned knowledge to make future recommendations. 
for example, collaborative filtering makes recommendations based on similarities in users' item-selection histories (adomavicius and tuzhilin 2005; sedhain et al. 2014), and content-based filtering leverages the content attributes of users' previously selected items (bieliková et al. 2012; pon et al. 2007). previous studies have proposed a variety of techniques to learn users' preference patterns from historical data, such as context-aware recommendation systems (cars) that model the contextual dependency of users' behaviors, and model-based techniques that learn latent user representations. the "first learn, then earn" scheme, however, is based on the assumption that users' preferences have a static pattern that can be well represented by the historical data (adomavicius and tuzhilin 2005; sahoo et al. 2012). when users' preferences are constantly changing, such recommendation methods may become less effective in adapting to individuals' behavior dynamics, as it is likely that individuals' preference patterns will not be fully captured by the data. in addition, prior studies have generally shown that batch-learning-based models tend to over-specialize recommendations in the long run (yu et al. 2009), as they tend to focus on well-known items that have already accumulated adequate historical information, whereas items with limited historical data will be overlooked (fleder and hosanagar 2009; pariser 2011). as a result, these models can be ineffective in satisfying individuals' diverse healthcare interests. these research gaps motivate us to propose an online-learning scheme, i.e., the multi-armed bandit (mab), to address the dynamics and diversity in individuals' health behaviors to improve healthcare recommendations. in most real-world decision-making scenarios, decision-makers usually do not know the expected utility of an action and can learn only from experience (cohen et al. 2007; mehlhorn et al. 2015; speekenbrink and konstantinidis 2015). 
in statistics and machine learning, the multi-armed bandit (mab) has been proposed to explicitly formulate such decision-making scenarios under uncertainty (auer et al. 2002; gittins 1979). specifically, an mab models a sequential decision-making problem in which the underlying reward distribution for each action is unknown, and data can be obtained in a sequential order to update knowledge of the reward distribution. the rationale of an mab algorithm is to adaptively learn the reward associated with each action while gathering as much reward as possible during the entire decision-making process, that is, earning while learning (misra et al. 2019). in order to do so, an mab strikes a balance between exploration and exploitation (kim and lim 2015; li et al. 2010; tang et al. 2014). that is, on the one hand, an mab reuses highly rewarding alternatives from the past to ensure explicit short-term rewards, that is, "exploiting" the environment (cohen et al. 2007; mehlhorn et al. 2015); on the other hand, it takes actions to learn the outcome associated with the less-explored alternatives to minimize opportunity cost, that is, "exploring" the environment (cohen et al. 2007; speekenbrink and konstantinidis 2015). it is worth noting that the online-learning scheme stands in contrast to batch-learning algorithms. the former actively collects data to learn the environment, with a forward-looking goal of maximizing long-term rewards. in other words, online learning may deviate from the current "best" knowledge from time to time in exchange for potentially better learning performance and higher rewards collected in the future. in contrast, batch-learning algorithms fully exploit the current knowledge without exploring potentially better opportunities that are not shown in the historical data. 
as such, they tend to interact with the environment in a passive and myopic manner and may not learn effectively when the environment contains many uncertainties that cannot be represented by current data. research has shown that online-learning algorithms, such as mabs, are suitable for tackling decision-making problems in noisy and changing environments (speekenbrink and konstantinidis 2015). for example, misra et al. (2019) applied an mab to a pricing problem in which the volume of demand was uncertain. schwartz et al. (2017) used an mab to improve advertising design when online advertisers were not able to identify targeted users. in our healthcare-recommendation context, service providers (e.g., online healthcare platforms) usually have little knowledge of users' healthcare needs or preferences, especially when users frequently change their behavior patterns. an mab can be used in such a setting to help service providers to effectively explore users' preference variations while improving users' online engagement during the process. in addition, through exploration, an mab increases choice stochasticity and, thus, can better promote recommendation diversity (qin et al. 2014). despite these advantages, mabs are seldom studied in healthcare recommendation problems. in this study, we enrich the healthcare recommendation literature by designing an mab-driven framework for providing personalized healthcare interventions. we propose a deep-learning and diversity-enhanced mab framework for recommending healthcare interventions to address the challenges and research gaps presented in the previous section. first, we adopt an mab as the main building block of our framework, as it can effectively explore variations in users' healthcare preferences and promote recommendation diversity at the same time. 
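the earning-while-learning loop described above can be illustrated with a minimal bernoulli bandit; the three arm reward probabilities, the horizon, and the random seed below are illustrative assumptions, not values from our platform.

```python
import random

def thompson_sampling(true_probs, rounds, seed=0):
    """minimal bernoulli thompson sampling with beta(1, 1) priors per arm."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    alpha = [1] * n_arms  # prior successes + 1
    beta = [1] * n_arms   # prior failures + 1
    pulls = [0] * n_arms
    total_reward = 0
    for _ in range(rounds):
        # sample a plausible reward rate per arm (exploration), then play
        # the arm whose sample is highest (exploitation).
        samples = [rng.betavariate(alpha[k], beta[k]) for k in range(n_arms)]
        arm = max(range(n_arms), key=lambda k: samples[k])
        reward = 1 if rng.random() < true_probs[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward
    return pulls, total_reward

pulls, reward = thompson_sampling([0.2, 0.5, 0.8], rounds=2000)
```

over 2000 rounds the sampler concentrates its pulls on the best arm while still occasionally trying the others, which is exactly the exploitation-versus-exploration balance discussed above.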
to better adapt an mab to the healthcare recommendation setting, we then further enhance our framework by synthesizing two model components, that is, deep-learning-based feature engineering and a diversity constraint. as suggested by prior health studies (johnson et al. 2002; king et al. 2006; yan and tan 2014), individuals' health behaviors are dynamically affected by a series of contexts, including their evolving health status, health-management experiences, and social context. based on these studies, the sequential information embedded in individuals' health histories and health-behavior paths can play an essential role in shaping individuals' health behaviors. deep-learning models can effectively capture patterns from dynamic temporal sequences and extract complex synergies between different features, thereby enabling an enhanced representation of variations in individuals' health behaviors. we thus incorporate a deep-learning-based feature engineering procedure to improve recommendation personalization and contextualization. in addition, sct suggests that individuals' health management may contain multi-dimensional effort. the diversity constraint helps us to structurally diversify recommendations along each theory-driven health-management dimension so that individuals are provided with well-rounded support. in figure 1, we provide a graphical illustration of our recommendation design. each construct of the recommendation framework is intended to enhance healthcare recommendation performance. in a recommendation cycle, we first use deep-learning models to construct representations for users and items, i.e., the user embeddings and item embeddings. we then use the constructed embeddings to capture the contextual features of the recommendation environment, which enables us to learn the context-dependency of the recommendation results and generalize users' feedback. 
the mab algorithm, shown on the right side of figure 1, adaptively learns users' preferences by balancing the exploitation-versus-exploration tradeoff. the diversity constraint seeks to diversify the recommendations along the theorized health-management dimensions. we elaborate on each of these constructs in the remainder of this section. ohcs provide healthcare interventions to encourage and instruct individuals' health behaviors. in table 1, we provide several examples of typical healthcare interventions provided in ohcs. we consider the setting in which an online healthcare platform provides intervention suggestions to users on a regular (e.g., weekly) basis. it is unlikely that every individual user will prefer the same interventions, and there is interpersonal heterogeneity in terms of which interventions to adopt and to what extent. the goal of the platform is to adaptively suggest k interventions with the highest chance of improving individuals' engagement in their healthcare management. formally, let t be the number of recommendation periods and i be the number of users on the platform. suppose that, in each period, the platform provides each user with k alternatives drawn from the full set of available interventions. at the end of each period, the platform receives users' feedback on the recommendations, that is, whether they have adopted or engaged in the recommended interventions. let r_t(i, k) denote user i's feedback on item k. users' feedback serves as a reward for the platform's recommendation decisions; the platform may use the information in users' feedback to update its knowledge about users' preferences and adjust its subsequent recommendations. we formulate the above recommendation problem as a contextual mab. 
in a contextual algorithm, the decision of choice is leveraged upon a set of contextual features of the environment, such as the attributes of choice alternatives and user characteristics, so that the algorithm can exploit the similarity between choice alternatives and deliver online personalized recommendations (zeng et al. 2016). contextual mabs learn to map the contexts into appropriate actions (greenewald et al. 2017). thus, they are able to personalize recommendations based on specific situations. in addition, as the pool of users and healthcare interventions will likely undergo frequent changes, it is desirable to learn a feature-based model that can generalize users' behavior histories to user-item pairs that have never or rarely occurred in the past. to this end, in the healthcare recommendation context, we consider two sets of contextual information: individuals' health-management contexts x_it and attributes of healthcare interventions z_k. we assume that users' feedback, i.e., the intervention engagement decision r_t(i, k), is stochastically generated by an underlying probability that depends on the contexts. we model this probability as a logistic function: p(r_t(i, k) = 1) = 1 / (1 + exp(-(θ*)' c_it,k)), where c_it,k denotes the combined context vector of x_it and z_k, and θ* denotes the underlying coefficient vector, which can be learned adaptively in the recommendation process. the objective for the online healthcare platform is to maximize the expected cumulative user engagement during the entire course of recommendation, i.e., max Σ_t Σ_i Σ_{k in s_it} e[r_t(i, k)]. (2) modern recommendation systems should be well-diversified, motivated by the principle that recommending redundant items leads to diminishing returns on utility. in the context of healthcare recommendations, the major health-management dimensions that we identify based on sct include the outcome-oriented dimension(s) and behavior-oriented dimension(s). whereas the former helps individuals to gain outcome-driven motivation, the latter enables them to acquire health-management skills and strategies. 
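the logistic engagement model can be sketched as follows; the coefficient values and the user/item contexts are made-up illustrations of θ*, x_it, and z_k, not learned parameters.

```python
import math

def engagement_prob(theta, x_user, z_item):
    """p(r_t(i, k) = 1) = logistic(theta . [x_user, z_item])."""
    c = x_user + z_item  # combined context vector c_it,k
    score = sum(t * f for t, f in zip(theta, c))
    return 1.0 / (1.0 + math.exp(-score))

# illustrative (made-up) coefficients and context features.
theta = [0.8, -0.4, 1.2, 0.5]   # stands in for theta*
x_user = [1.0, 0.3]             # user health-management context x_it
z_item = [0.5, 1.0]             # intervention attributes z_k
p = engagement_prob(theta, x_user, z_item)
```

with these toy values the linear score is 1.78, so the predicted engagement probability is about 0.86; the bandit's job is to learn theta adaptively from observed feedback.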
thus, to provide individuals with well-rounded support, recommendations need to cover each of the health-management dimensions. although an mab framework is able to promote recommendation diversity through the exploration process, we further incorporate a diversity constraint into the mab to ensure that the exploration is conducted in guided directions and that the recommendations are structurally diversified along each of the health-management dimensions. formally, our diversity constraint can be expressed as d: s ∩ dim_outcome ≠ ∅ and s ∩ dim_behavior ≠ ∅, where s denotes the recommendation set, dim_outcome denotes the outcome-oriented dimension(s), and dim_behavior denotes the behavior-oriented dimension(s). we subject the optimization problem in (2) to the diversity constraint d to ensure that the recommendation set s_it contains suggestions for each health-management dimension. to solve this constrained recommendation task, we propose an algorithm adapted from thompson sampling (ts). ts is a machine-learning algorithm that addresses the exploitation-versus-exploration tradeoff presented in a bandit problem. ts is best understood in a bayesian setting, in which it computes the posterior distribution of the unknown parameters θ in the likelihood function, given the realized stochastic feedback. the rationale of ts is to encourage exploration through probability matching. that is, in each round, a ts algorithm randomly draws alternatives according to their probability of being optimal. research has shown that ts generally has better empirical performance than alternative bandit algorithms, such as ucb and ε-greedy (chapelle and li 2011). our algorithm extends an ordinary ts by integrating a constrained optimization problem to solve for the optimal recommendation decisions subject to the diversity constraint. we present the details of our algorithm below. a ts algorithm with diversity constraint. input: prior mean m_j and prior variance u_j for each parameter θ_j, j = 1, 2, ..., d. 
step 1 (sampling): for each parameter θ_j, draw a sample from the posterior distribution n(m_j, u_j). step 2 (optimization): solve the optimization problem in (2) under the sampled parameters, with binary decision variables indicating which items to recommend, subject to the recommendation size constraint and the diversity constraint d. step 3 (update): observe a new batch of data (c_it,k, r_t(i, k)) for the recommended items; update the posterior mean m_j and the posterior variance u_j for each parameter from the observed batch. to improve the characterization of individuals' health-management contexts and enhance recommendation personalization, we design a deep-learning model to construct user embeddings. specifically, our user-embedding model leverages information on users' attribute features (e.g., gender, age, etc.), health-status trajectories, and health-management behavioral sequences. for each user, the attribute variables usually remain unchanged over time, whereas health status and behavioral sequences will vary with time. hence, the user embeddings depend on both user i and time t to reflect the evolving dynamics. to properly guide the learning on these aspects, we propose a novel wide-and-deep neural network. the wide-and-deep structure was originally proposed for user-response modeling in mobile apps. it combines two branches of user features (i.e., a "wide" branch and a "deep" branch) to facilitate user representation learning (cheng et al. 2016). in this study, we design a "wide" branch to process users' attribute features, taking into account that certain intervention suggestions can be more actionable for specific users given their personal attributes, and we apply a fully-connected structure to account for possible interactions among the attribute features. we then use a "deep" branch to learn sequence features, such as users' health-status trajectories, historical healthcare-intervention adoptions, and other health-management experiences in regard to self-monitoring activities and social behaviors. the sequence of historical intervention adoptions is included to capture the dynamics in users' preferences. together with the health-status trajectories, it captures users' evolving health histories and the corresponding changing healthcare preferences. 
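the three steps of the algorithm can be sketched with a diagonal-gaussian posterior per coefficient. the map-plus-curvature (laplace-style) posterior update and the brute-force constrained selection below are assumptions for illustration, since the exact update formulas and solver are not reproduced here; the item contexts, dimension labels, and toy feedback batch are made up.

```python
import math
import random
from itertools import combinations

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

class DiverseLogisticTS:
    """ts sketch: independent gaussian posterior n(m_j, u_j) per coefficient
    of the logistic reward model, with a diversity-constrained selection."""

    def __init__(self, dim, seed=0):
        self.m = [0.0] * dim   # posterior means m_j
        self.u = [1.0] * dim   # posterior variances u_j
        self.rng = random.Random(seed)

    def sample_theta(self):
        # step 1 (sampling): draw a plausible coefficient vector.
        return [self.rng.gauss(m, math.sqrt(u)) for m, u in zip(self.m, self.u)]

    def select(self, contexts, item_dim, k):
        # step 2 (optimization): brute-force the k-item set with the best
        # sampled score that covers every health-management dimension.
        theta = self.sample_theta()
        score = {i: sigmoid(sum(t * c for t, c in zip(theta, x)))
                 for i, x in contexts.items()}
        dims = set(item_dim.values())
        best, best_val = None, float("-inf")
        for subset in combinations(contexts, k):
            if {item_dim[i] for i in subset} != dims:
                continue  # violates the diversity constraint d
            val = sum(score[i] for i in subset)
            if val > best_val:
                best, best_val = set(subset), val
        return best

    def update(self, batch, steps=300, lr=0.05):
        # step 3 (update): map estimate of the new posterior mean by gradient
        # descent on the gaussian-regularized logistic loss, then a
        # curvature-based variance update at the map point.
        prior_m, prior_u = self.m[:], self.u[:]
        w = self.m[:]
        for _ in range(steps):
            g = [(w[j] - prior_m[j]) / prior_u[j] for j in range(len(w))]
            for x, r in batch:
                p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
                for j, xj in enumerate(x):
                    g[j] += (p - r) * xj
            w = [wj - lr * gj for wj, gj in zip(w, g)]
        self.m = w
        for j in range(len(w)):
            h = 1.0 / prior_u[j]
            for x, r in batch:
                p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
                h += x[j] * x[j] * p * (1.0 - p)
            self.u[j] = 1.0 / h

# toy demo: engagement happens mostly for contexts on the first axis.
ts = DiverseLogisticTS(dim=2)
batch = [([1, 0], 1)] * 8 + [([1, 0], 0)] * 2 + [([0, 1], 1)] * 2 + [([0, 1], 0)] * 8
ts.update(batch)
contexts = {"a": [1, 0], "b": [0.9, 0], "c": [0, 1], "d": [0, 0.9]}
item_dim = {"a": "outcome", "b": "outcome", "c": "behavior", "d": "behavior"}
chosen = ts.select(contexts, item_dim, k=2)
```

after one update the posterior mean tilts toward the first context feature and the variances shrink, while the selection step always returns a set covering both the outcome-oriented and behavior-oriented dimensions.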
in addition, based on the prior health theories mentioned in section 2.1, health-management experiences, such as social support and self-monitoring activities, may also affect individuals' health behaviors and thus influence their preferences for healthcare interventions. therefore, we further include related behavior paths to capture the effect of health-management experiences on users' intervention-adoption behaviors. we use long short-term memory (lstm) with a self-attention mechanism to capture the dynamically changing patterns in these features and their correlation with adopted healthcare interventions. to address the fact that different sequence features have different dimension scales (e.g., number of social activities vs. numerical intervention attributes), we propose a self-organizing lstm module and add a balancer in the lstm cell to tackle unbalanced weights between different input features. the output of the deep model is the embeddings of the adopted healthcare interventions, with the loss function defined as the cosine distance between the last hidden layer (the user embedding) and the item embedding. in addition, we enhance the learning process of the deep branch with an auxiliary loss function that takes the healthcare outcome as its goal. we use the auxiliary loss function to incorporate the effects of healthcare outcomes on individuals' health behaviors and their preferences for healthcare interventions. this is unique to the healthcare recommendation context, in which individuals' health behaviors are fundamentally driven by the goal of optimizing health outcomes. meanwhile, the auxiliary loss branch also guides the neural network to properly extract signals from the sequence features, as, otherwise, the gradient flow will not be balanced and the shallow structure will dominate the gradient flow. an illustration of the proposed deep-learning architecture for user-embedding construction is provided in figure 2. 
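the training objective of the user-embedding model (a cosine-distance matching loss plus an auxiliary outcome loss) can be sketched as follows; the auxiliary weight, the squared-error form of the outcome term, and the toy embeddings are illustrative assumptions, not the paper's reported configuration.

```python
import math

def cosine_distance(a, b):
    """matching loss: cosine distance between the user embedding (last
    hidden layer) and the embedding of the adopted intervention."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

def total_loss(user_emb, item_emb, outcome_pred, outcome_true, aux_weight=0.5):
    """combined objective: matching loss plus a weighted auxiliary term on
    the healthcare outcome (aux_weight is a made-up value)."""
    matching = cosine_distance(user_emb, item_emb)
    auxiliary = (outcome_pred - outcome_true) ** 2
    return matching + aux_weight * auxiliary

# perfectly aligned toy embeddings: only the auxiliary term contributes.
loss = total_loss([0.6, 0.8], [0.6, 0.8], outcome_pred=1.8, outcome_true=2.0)
```

the auxiliary branch keeps a gradient signal flowing through the deep sequence layers even when the matching loss is already small, which is the balancing role described above.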
each healthcare intervention is typically defined by a short title that highlights the main features of the intervention and a description or instruction that describes the detailed execution procedure. to capture the semantics embedded in this information, we propose a hybrid model, in which we apply lstm to learn the semantics of the intervention descriptions, and we use average token-level embedding to extract signals from the intervention titles. the outputs are the meta attributes of the intervention (e.g., duration, category, and/or intensity of the intervention). we further fine-tune the token-level embedding in the representation-learning procedure to ensure low information loss. in sum, the user embeddings and item embeddings help us to learn key contextual information for healthcare recommendations concerning users' health-behavior contexts and item attributes, which are then used as the input of our bandit recommendation model. due to the space limit, more details on our user-embedding and item-embedding models are provided in appendix a1. to evaluate the performance of our recommendation framework, we collected data from a leading non-commercial online weight-loss platform in the united states. to help users to establish a healthy living style, the focal platform provides weight-loss challenges, which are behavioral treatment programs that help users to focus on a specific weight-loss goal in a short time period. examples of weight-loss challenges include diet-oriented challenges, such as "cut off processed carbs and include 100g of mixed veg in every meal," and activity-oriented challenges, such as "30 minutes of jogging every day." the diet-oriented and activity-oriented challenges provide behavioral guidance for individuals' weight-management routine. users can also find weight-loss-oriented challenges that help them to set goals directly for weight changes, such as losing a certain amount of weight during specific periods. participation in weight-loss-oriented challenges can help users to establish outcome-driven motivation. 
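the average token-level embedding used for intervention titles can be sketched as follows; the token lookup table below is a toy, made-up example rather than learned embeddings.

```python
def title_embedding(title, token_vectors, dim=3):
    """average token-level embedding for an intervention title; tokens
    missing from the (toy) lookup table are skipped."""
    tokens = [t for t in title.lower().split() if t in token_vectors]
    if not tokens:
        return [0.0] * dim  # no known tokens: zero vector
    summed = [0.0] * dim
    for t in tokens:
        for j, v in enumerate(token_vectors[t]):
            summed[j] += v
    return [s / len(tokens) for s in summed]

# toy 3-dimensional token vectors (illustrative only).
vecs = {"jogging": [1.0, 0.0, 0.0], "daily": [0.0, 1.0, 0.0], "30": [0.0, 0.0, 1.0]}
emb = title_embedding("30 minutes of jogging daily", vecs)
```

in the full model this average would be combined with the lstm encoding of the description and fine-tuned against the intervention's meta attributes.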
each weight-loss challenge is defined by a short title and a description that contains information on the challenge goal, duration, and instructions. users can choose to join any challenge as long as its starting date has not passed. in appendix a2, we provide a screenshot of the challenge webpage to show how users can retrieve challenge information from the online platform. the focal platform does not incorporate any recommendation system to facilitate users' challenge selection. during our investigation period, there were more than 100 challenges provided to users. users likely find it difficult to select challenges to join, as they may lack the ability to discern, from various choice alternatives, which challenges are suitable for their weight management. in addition, a significant search cost is incurred, as users need to spend time reading the challenge descriptions and/or instructions before deciding which challenge to join. these problems can potentially be solved by providing personalized challenge recommendations to support healthy behaviors. our investigation of weight-loss challenge recommendations can help to improve the match between individuals and weight-loss challenges and, thus, improve individuals' weight-management performance. from the platform's perspective, the recommendations can enhance users' participation experience and, thus, contribute to user retention and platform sustainability. we collected three datasets to support our investigation of recommendation performance. the first dataset contains descriptive information for each challenge provided on the platform. during our data collection window, 2 there were 165 challenges provided on the platform in total. for each challenge, we collected the title, description, and duration. on average, users can choose from about 50 challenges each week.
the second dataset is users' challenge-selection histories; that is, we recorded for each user the challenge(s) that he or she selected per week. this dataset enables us to learn users' preferences for weight-loss challenges. the third dataset contains auxiliary information for each user, such as gender, age, membership duration, initial weight when first joining the platform, online weigh-in activities, the number of friends, and the posts published in the community forum. we use this information to capture users' heterogeneous weight-management contexts. in particular, gender and age are two factors that directly affect individuals' weight status. membership duration measures users' overall weight-management experience on the platform. initial weight and weekly weigh-in records help us measure individuals' weight-loss status and health histories. the number of friends and forum posts provide proxies for the amount of social support available to individuals (shumaker and brownell 1984; yan 2018); thus, we use them to capture the social contexts of users' weight management. we provide a summary of key data statistics in appendix a3. users on the focal platform, on average, chose two challenges per week. when users chose any challenge, they chose multiple challenges about 70% of the time. as noted, there are three major challenge types on the platform: weight-loss oriented, diet oriented, and exercise oriented. we find that users tend to choose different types of challenges whenever they choose multiple challenges. specifically, users choose more than one challenge type 92% of the time when they choose multiple challenges, and they choose all three types of challenges about 51% of the time. these results indicate the existence of diversity in users' preferences for weight-loss challenges. in addition, we find that users' selection of challenge types drifts over time.
that is, users may have preferred certain challenge types at the beginning of a time period and gradually shift to other challenge types as time goes by. these findings provide evidence for the dynamics of users' preferences, which may be due to users' transitions between different weight-loss statuses, in which they need different types of support. it is also likely that users gradually establish their personal tastes in regard to weight-loss challenges during the process of challenge participation. these findings thus provide support for our recommendation design. weight-loss challenges are presented in a textual format with a title and a description. they aim to help users to manage short-term weight-loss goals, such as changing a dietary behavior, increasing physical exercise, and reducing weight. goal setting can reinforce individuals' motivation, and well-structured goal formulation can have positive and directional effects on individuals' task performance (macleod 2012; locke and latham 1990). the smart metric (i.e., specific, measurable, attainable, relevant, and time-bound) has been widely used as a gold standard in areas such as education and healthcare for assessing the quality of goals (doran 1981; ogbeiwi 2018). this metric can help individuals to clearly identify the direction for logical action planning and implementation (ogbeiwi 2017; ogbeiwi 2018). thus, smart-related goal characteristics can influence how individuals perceive the effectiveness of a goal and affect their choice-making behaviors in deciding which goal to pursue. we use the smart metric to characterize each challenge based on the challenge description data. as the number of challenges is large and users' challenge-selection data are comparatively sparse, we need to quantify the similarities among challenges, and the smart-based features can properly guide our calibration of challenge similarity.
in particular, corresponding to the goal-setting dimensions specified by the smart metric, we construct the following meta attributes for each challenge: whether the challenge is specifically defined (specificity), whether the challenge goal is measurable (measurability), the intensity level of the challenge (attainability), whether the challenge is related to diet or physical activity (relevancy), and the time span of the challenge (duration). in table 2, we provide a summary of the annotated challenge meta attributes, which will be used for learning the challenge-embedding representation and the downstream recommendation task.

table 2. annotated challenge meta attributes:
- specificity: whether a challenge is specifically defined (0 or 1)
- measurability: whether a challenge goal is measurable (0 or 1)
- diet: whether a challenge is related to dietary behaviors (0 or 1)
- intensity_diet: intensity level for a diet-oriented challenge (l, m, h)
- activity: whether a challenge is related to physical activities (0 or 1)
- intensity_activity: intensity level for an activity-oriented challenge (l, m, h)
- weight_loss: whether a challenge contains a goal for weight changes (0 or 1)
- intensity_weight_loss: intensity level for a weight-loss-oriented challenge (l, m, h)
- motivational: whether a challenge contains motivational words/sentences (0 or 1)
- self_monitoring: whether a challenge requires individuals to regularly monitor and report their weight-loss progress, e.g., body weight, daily diet, running mileage (0 or 1)
- duration: time span (in weeks) of a challenge

in addition to the smart-based attributes, we consider two other features that may affect individuals' engagement in challenge participation: motivational and self-monitoring. motivational characterizes challenges from the perspective of the goal statement, which has been suggested to be important in helping individuals to build up inner motivation (locke and latham 1990). self-monitoring is an important step in goal fulfillment, as it helps individuals to process their performance toward goal achievement (bandura 1991).
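a minimal sketch of how the table 2 meta attributes could be encoded as a numeric feature vector; the field names follow table 2, while the l/m/h-to-number coding is an assumption:

```python
# map intensity levels to ordinal values ("none" for non-applicable)
INTENSITY = {"none": 0, "l": 1, "m": 2, "h": 3}

def encode_challenge(attrs):
    # attrs: dict with the table 2 fields; returns a fixed-order
    # numeric vector usable as input features for downstream models
    return [
        attrs["specificity"],                     # 0 or 1
        attrs["measurability"],                   # 0 or 1
        attrs["diet"],                            # 0 or 1
        INTENSITY[attrs["intensity_diet"]],
        attrs["activity"],                        # 0 or 1
        INTENSITY[attrs["intensity_activity"]],
        attrs["weight_loss"],                     # 0 or 1
        INTENSITY[attrs["intensity_weight_loss"]],
        attrs["motivational"],                    # 0 or 1
        attrs["self_monitoring"],                 # 0 or 1
        attrs["duration"],                        # weeks
    ]
```

for example, the activity-oriented challenge "30 minutes of jogging every day" could be annotated as specific, measurable, activity-related with medium intensity, and one week long.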
we use self-monitoring to indicate whether a challenge encourages individuals to regularly monitor and report their weight-loss progress. the detailed annotation procedure is provided in appendix a4. as noted in section 3.3, our user-embedding model adopts a wide-and-deep network structure, in which we use the wide branch to capture users' attribute features and the deep branch to capture the sequence features. in our evaluation context, users' attribute features include gender, age, initial weight, and membership duration. the sequence features include three parts. the first part captures users' health status, that is, their historical weight variations. the second part is the sequence of historical challenges chosen by individuals, which we use to account for users' personal tastes. the third part concerns users' other behavioral sequences, such as their past social activities (e.g., establish friendships with other users and publish forum posts) and self-monitoring activities (e.g., weigh-in). the auxiliary loss head is designed to measure the weight loss in the next time period, where we choose a combined loss of mse for absolute value prediction and cross-entropy loss for weight-loss sign prediction. we present the detailed network structure and the loss functions in appendix a1. for our challenge-embedding model, we use challenge name and description as the inputs, and the annotated challenge meta attributes based on the smart metric are the outputs. the annotated challenge meta attributes help us to depict the key characteristics of a weight-loss-related goal; thus, they provide a good standard for calibrating challenge similarity in our focal context. in operationalizing our diversity constraint, we identify weight loss as the outcome-oriented dimension, as it is the health goal that individuals aim to achieve in our focal context. 
we identify diet and physical exercise as two behavior-oriented dimensions, as they are two essential behavioral-regulation aspects in weight management. with respect to these dimensions, we ensure that our recommendations cover all three challenge types, i.e., weight-loss oriented, diet oriented, and exercise oriented. therefore, our diversity constraint is specified as s_it ∩ dim ≠ ∅ for each dim ∈ {dim_weight-loss, dim_diet, dim_exercise}, where s_it represents the recommended challenge set, dim_weight-loss represents the weight-loss-oriented dimension, dim_diet denotes the diet-oriented dimension, and dim_exercise denotes the exercise-oriented dimension. we apply our recommendation framework to the weight-management context described in section 4, with the aim of promoting users' engagement in weight-loss challenges on the platform. in particular, we implement the algorithm introduced in section 3.2 to offer top-k challenges to users on a weekly basis. 3 the time span of the recommendation is 16 weeks (i.e., the same as our data collection window). to demonstrate the effectiveness of our recommendation design, we follow the design-science paradigm to rigorously evaluate our recommendation framework through a series of experiments. we first examine the effectiveness of our deep-learning embeddings in capturing user characteristics and challenge attributes. we then apply different evaluation approaches to test each of our model components as well as to compare our model against state-of-the-art recommendation systems. to show how the construction of deep-learning models improves feature engineering, we use t-sne to visualize the embeddings in a two-dimensional space (maaten and hinton 2008). t-sne is a nonlinear dimensionality reduction technique that is well suited for deconstructing high-dimensional data. it can project high-dimensional vectors into lower dimensions without changing the data structure, which helps us to understand data patterns in a more intuitive way.
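the diversity (coverage) constraint specified earlier in this section can be sketched as a simple check over a recommended set; the type labels and the item-to-type mapping are hypothetical:

```python
def satisfies_diversity(recommended, challenge_type):
    # check that the recommended top-k set covers all three
    # dimensions; challenge_type maps each challenge id to its type
    covered = {challenge_type[c] for c in recommended}
    return {"weight_loss", "diet", "exercise"} <= covered
```

in a bandit loop, such a predicate could be used to filter or re-rank candidate sets so that every weekly recommendation spans the three weight-management dimensions.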
in a t-sne plot, two points that are close to each other indicate that the corresponding embedding representation vectors are similar. we provide the visualization results for challenge embeddings and user embeddings in figures 3 and 4, respectively. specifically, we present the t-sne plot for our constructed challenge embeddings in figure 3(1). in figures 3(2) and 3(3), we provide the t-sne plots for two state-of-the-art word-embedding deep-learning models, bert and fasttext. we use these two models as benchmarks for evaluating the performance of our proposed challenge-embedding model. in the plot, we use different colors to indicate each challenge type, such as weight-loss oriented, diet oriented, and exercise oriented. we use the color degree to denote the intensity level of a challenge: a deeper color indicates a challenge of higher intensity. for example, diet-oriented challenges are denoted by orange points, and among the points, there are three color degrees: light orange, dark orange, and orange-red, representing low-, medium-, and high-intensity levels, respectively. we find that, in figure 3(1), challenges that belong to the same type are tightly clustered. in addition, within each challenge-type cluster, challenges of the same intensity level tend to be close to each other. these patterns indicate that our challenge-embedding model can well capture intrinsic challenge attributes, especially the ones that are key to goal-setting theory (doran 1981). in comparison, the benchmark models cannot clearly distinguish these challenge patterns.

(figure 3 panels: (1) proposed; (2) bert; (3) fasttext.)

the t-sne plot for our constructed user embeddings is presented in figure 4(1), where points are distinguished by users' gender, age, and weight-loss status. note that these features are most directly related to our weight-loss context; in particular, gender and age are two demographics that directly affect users' body weight, and weight-loss status reflects users' in-period weight variations.
we compare our proposed user-embedding model with two benchmark models, collab_learner and tabular. the results are presented in figures 4(2) and 4(3), respectively. the collab_learner model and the tabular model are two encapsulated python learners provided in the fastai library. in particular, the collab_learner model learns user representations from the historical challenge-selection data; however, it is not able to incorporate users' personal characteristics. the tabular model is a deep-learning model that learns user embeddings based on users' tabular attributes, such as gender and age, but does not extract signals from sequential data, such as users' health histories and behavior paths. as shown, these two models do not perform as well as our model, as no explicit user pattern is displayed.

(figure 4 panels: (1) proposed; (2) collab_learner; (3) tabular; (4) sampled users.)

finally, the sequence shape of the user clusters produced by our model in figure 4(1) motivates us to further investigate granular individual-level patterns, as the points in a sequence are likely generated by the same or similar users. we randomly sampled several individual users and plotted their corresponding embeddings in figure 4(4). we find that the embeddings of the same user locate close to each other and tend to be concatenated into a trajectory and that the embeddings of different users are located relatively far apart. these results show that our proposed user-embedding model can effectively capture the sequential patterns in users' behaviors. we conduct an ablation analysis to compare our model with a series of baseline mabs. this allows us to show that each of our model components (i.e., the deep-learning-based feature engineering procedure and the diversity constraint) is effective in helping users to find relevant challenges. the baseline mabs are the counterpart mab models that partially incorporate or do not incorporate the proposed model components.
specifically, the baseline mabs that we investigate include the mab without user embeddings, the mab without challenge embeddings, the mab without either embedding, the mab without the diversity constraint, and the mab without either embeddings or constraint. when user embeddings are not incorporated, we use users' attribute features (e.g., gender, age, etc.) to account for recommendation personalization. when challenge embeddings are not incorporated, we use the annotated challenge features to capture the inherent attributes of challenges, namely, the variables listed in table 2. evaluating an explore/exploit policy is difficult because we typically do not know the reward of an action that was not chosen. possible solutions include doubly-robust estimation (dudík et al. 2011), offline precision evaluated by preference set (qin et al. 2014), and simulation. the first evaluation approach, doubly-robust estimation, is an offline data evaluation approach that utilizes pre-collected historical data to evaluate policy performance. the historical data are assumed to contain three sets of information: action, context, and reward. as the data are pre-collected, we are able to observe only the rewards for the chosen actions in the data. to adjust for the potential bias caused by the data collection process, the doubly-robust estimator combines two policy evaluation methods, direct simulation (ds) and inverse propensity score (ips). formally, let g denote the offline dataset, which contains action a, context v, and reward r. in our context, action a refers to the platform's provision of a challenge in the data, context v includes the individual's weight-management context x_it and the challenge features, and r is the user's feedback, that is, whether the user selects the challenge. let s_it denote the set of challenges recommended to user i at week t, with |s_it| = k.
our doubly-robust estimator can be expressed as follows:

v̂_dr = (1/|g|) Σ_{(v,a,r) ∈ g} [ r̂(v,a) + (r − r̂(v,a)) · 1{a ∈ s_it} / p̂ ],

where r̂ is a reward simulator and p̂ is the propensity of challenge provision in the data. the rationale of this method is that, when data are not available, the method uses a pre-trained reward predictor to simulate the reward; otherwise, it applies a correction to the reward predictor using the actual data. the second evaluation method measures recommendation precision. following previous studies (qin et al. 2014; qin and zhu 2013), we construct each user's preference set as the set of challenges selected by the user in the data. the recommendation precision is thus the overlap ratio between the recommendation set and the preference set. finally, in light of previous theoretical mab studies (hertz et al. 2018; sani et al. 2012), we examine our model performance through a simulated environment. specifically, we construct a logistic predictor for users' binary challenge-selection decisions, that is, pr(select) = 1 / (1 + exp(−ζ · v)), where v is a concatenation of user embeddings and challenge embeddings. the weight vector ζ could have been chosen arbitrarily, but it was in fact a perturbed version of the weight vector trained on a randomly constructed training set (nguyen et al. 2017), and the performance evaluation is conducted on a test set. this simulator is omniscient, in the sense of having full knowledge of users' preferences and the actual amount of reward accrued by recommendations. we provide the details of these evaluation approaches in appendix a6. the evaluation results are provided in table 3. each value in the table represents users' average selection rate during the entire recommendation course. a superscripted asterisk denotes that a benchmark model performs significantly worse than the proposed model. our results show that the mabs that include only one of the components have an inferior performance (i.e., a lower average challenge-selection rate) than our proposed model.
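a minimal sketch of the doubly-robust estimator described above, following the general form in dudík et al. (2011); the function names and the set-valued policy interface are illustrative assumptions:

```python
def doubly_robust_value(logs, reward_model, policy_set):
    # doubly-robust off-policy value estimate over logged tuples
    # (context, action, reward, propensity); reward_model(context,
    # action) is a pre-trained reward simulator, and
    # policy_set(context) returns the set of actions the evaluated
    # policy would recommend in that context
    total = 0.0
    for context, action, reward, propensity in logs:
        r_hat = reward_model(context, action)
        indicator = 1.0 if action in policy_set(context) else 0.0
        total += r_hat + indicator * (reward - r_hat) / propensity
    return total / len(logs)
```

when the reward model is accurate, the correction term vanishes; when the policy matches the logged action, the actual observed reward corrects any model bias, which is what makes the estimator "doubly" robust.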
specifically, the mab without the diversity constraint is shown to be significantly worse by the doubly-robust estimation and simulation methods. the mab without user embeddings or challenge embeddings (or both) is shown to have a worse performance by all three evaluation methods. finally, we find that the mab without either embeddings or the diversity constraint performs worse than the mabs that partially incorporate the model components. these results indicate that each of our proposed model components is effective. as compared to users' attribute features, user embeddings can better capture the sequential information embedded in users' health histories and behavior paths and, thus, are more effective. challenge embeddings are more effective than the annotated challenge features, as they are able to extract semantic information from the textual challenge descriptions. in addition, as most annotated challenge features are categorical and one-hot encoded, they may not provide much information for the learning process. in comparison, challenge embeddings can better calibrate the similarity among challenges and make the learning more effective.

[table 3 values (excerpt): 0.4583***, 0.5462***, 0.4172**. note: an asterisk in superscript denotes that a benchmark model performs significantly worse than the proposed model. significance levels are: * p < 0.1, ** p < 0.05, *** p < 0.01.]

in figure 5, we plot the recommendation performance across time to show the learning curve of each model. the x-axis denotes recommendation rounds, and the y-axis denotes the average challenge-selection rate up to round t. it is shown that our model achieves the highest learning rate across all periods, regardless of the evaluation approach taken. that is, our model can boost the average challenge-selection rate faster than the benchmark models can.
for example, when evaluated by simulation, our model increases the average challenge-selection rate from approximately 0.39 to approximately 0.45 after the 16-week recommendation phase, which is a 15% increase. this is followed by the mab with no diversity constraint (~13%), the mab with no user embeddings (~11%), the mab with no challenge embeddings (~10%), the mab without either embedding (~8%), and the mab without either the diversity constraint or embeddings (~8%). these results further highlight the effectiveness of our deep-learning-based feature engineering and diversity constraint in the learning procedure. we compare our proposed recommendation framework against a wide range of benchmark models. in particular, for benchmark bandit models, we consider ucb and ε-greedy, which are two classic online-learning methods for solving the "exploitation-versus-exploration" tradeoff. for batch-learning-based models, we consider a variety of collaborative filtering methods, such as context-aware recommenders and matrix-factorization-based models. we also consider content-based filtering and hybrid filtering. content-based filtering is able to offer recommendations based on item features. hybrid filtering further combines content-based filtering with collaborative filtering to incorporate information embedded in users' challenge-selection histories. as the batch-learning-based models make recommendations by exploiting users' item-selection histories, we use the first four weeks of data to train the algorithms. finally, we consider pure exploitation and pure exploration, which are two recommendation schemes that do not seek a balance between exploitation and exploration. we summarize our benchmark models in table 4. the implementation details of these benchmark models are provided in appendix a7.
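the ε-greedy benchmark mentioned above can be sketched in a few lines; the arm-reward bookkeeping is assumed to be handled elsewhere:

```python
import random

def epsilon_greedy(avg_reward, epsilon=0.1, rng=random):
    # one ε-greedy step: exploit the arm with the highest average
    # reward with probability 1 - ε, otherwise explore a random arm;
    # avg_reward maps each arm to its current mean observed reward
    arms = list(avg_reward)
    if rng.random() < epsilon:
        return rng.choice(arms)
    return max(arms, key=avg_reward.get)
```

after each round, the chosen arm's running mean would be updated with the observed feedback, gradually sharpening the exploit step.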
similarly, we calibrate recommendation performance by users' average challenge-selection rate and evaluate it through the aforementioned three evaluation approaches, i.e., (1) doubly-robust estimation, (2) offline precision, and (3) the omniscient simulator. our evaluation results are provided in table 5. the results show that our proposed model has the best recommendation performance under all evaluation measures. it achieves an average challenge-selection rate of 51.15% when evaluated by doubly-robust estimation; 60.54%, by offline precision; and 44.67%, by simulation. ucb and ε-greedy are shown to have comparable performance (~42% for doubly-robust estimation, ~52% for offline precision, and ~39% for simulation), but both perform significantly worse than our model. batch-learning-based models generally do not perform well (mainly below 40%). the performance inferiority may be due to users' dynamic preferences for weight-loss challenges. in other words, the batch-learning-based models assume that users' preferences can be well represented by their past behavior patterns (adomavicius and tuzhilin 2005; sahoo et al. 2012); thus, these models can be biased when users' preferences contain dynamic patterns.

table 4. benchmark models:
- ε-greedy: a bandit algorithm that chooses the arm with the seemingly highest average reward with probability 1 − ε and explores a random arm with probability ε;
- cacf: context-aware collaborative filtering, which incorporates the contexts of users' item selections as weights into a normal collaborative filtering procedure (chen 2005);
- scf: social collaborative filtering, which formulates a neighborhood-based method for cold-start collaborative filtering in a generalized matrix algebra framework (sedhain et al. 2014);
- pmf: probabilistic matrix factorization, a model-based collaborative filtering approach that uses matrix factorization under a probabilistic framework to estimate user-item interactions (mnih and salakhutdinov 2008);
- camf: context-aware matrix factorization, which is an extension of the classic matrix factorization approach for incorporating contextual information (baltrunas et al. 2011);
- cb: content-based filtering, an approach to offer recommendations based on content similarities of items (bieliková et al. 2012; pon et al. 2007);
- hybrid_pure: a hybrid model that combines pure collaborative filtering with cb using mixed hybridization (burke 2002);
- hybrid_cacf: a hybrid model that combines cacf and cb using mixed hybridization;
- pure exploitation: a model that selects the best option given current knowledge;
- pure exploration: a model that fully randomizes recommendations.

finally, our results show that pure exploitation and pure exploration achieve worse recommendation performance as compared to our model. this performance gap indicates the importance and necessity of balancing the exploitation-versus-exploration tradeoff. a pure exploitation method may become stuck in a poor local optimum. pure exploration, in contrast, over-explores users' preferences; it fully randomizes the recommendations without utilizing or learning from information embedded in users' past behaviors. note that pure exploitation and pure exploration can often be seen in the design of a/b testing. specifically, in an a/b test, experimenters first spend a short time period on pure exploration, whereby they randomly assign users to different groups to examine the performance of policy variants. they then engage in a long period of pure exploitation, assigning all of the users to the group that achieves the best performance. in practice, the pure exploratory phase can be expensive or even infeasible to implement.
for example, in a health-management context, it is usually infeasible to arbitrarily assign individuals to a treatment plan. instead of two distinct periods of pure exploration and pure exploitation, a bandit-driven design adaptively combines exploration and exploitation. thus, it can reduce the opportunity cost incurred in the exploratory phase and help service providers to achieve better performance.

[table 5 values (excerpt): 0.4617***, 0.5690***, 0.4006***. note: an asterisk in superscript denotes that a benchmark model performs significantly worse than the proposed model. significance levels are: * p < 0.1, ** p < 0.05, *** p < 0.01.]

in a typical healthcare context, individuals' behavior patterns likely change over time (johnson et al. 2002; king et al. 2006). in this experiment, we investigate the recommendation results for the users whose choices tend to vary considerably, as a means to examine whether our recommendation framework can well capture the dynamics in users' preferences. from the test-user set, we select the 30 users whose challenge choices have the largest embedding variance. 5 we implement our recommendation algorithm for the selected users and compare the recommendation performance with that of using the full test-user set. table 6 presents the results for the new test-user set. our model is shown to outperform all of the benchmark models under all evaluation measures. in particular, our model achieves an average challenge-selection rate of 53.44% when evaluated by doubly-robust estimation; 60.90%, by offline precision; and 47.56%, by simulation. to better show the performance variations, we plot the differences between the new recommendation results and the original results on the full test set in figure 6. here, we present the performance changes measured by doubly-robust estimation. the performance changes measured by offline precision and simulation are similar and are provided in appendix a8.
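the selection of dynamic users by embedding variance (defined in footnote 5 as the minimum of the element-wise variances) might be sketched as follows; the data structures are hypothetical:

```python
import numpy as np

def embedding_variance(embeddings):
    # per footnote 5: the variance of a set of embedding vectors is
    # the minimum of the element-wise variances across the set
    return float(np.var(np.asarray(embeddings, dtype=float), axis=0).min())

def most_dynamic_users(user_embeddings, n=30):
    # user_embeddings: user id -> list of chosen-challenge embeddings;
    # returns the n users with the largest embedding variance
    ranked = sorted(user_embeddings,
                    key=lambda u: embedding_variance(user_embeddings[u]),
                    reverse=True)
    return ranked[:n]
```
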
we find that bandit-driven models, such as our proposed model, ucb, and ε-greedy, show different degrees of performance increase. (footnote 5: as challenge embeddings are vectors, we define the variance of a set of embeddings as the minimum of the element-wise variances.) in contrast, batch-learning-based models generally show a performance decrease. these results suggest that the advantage of the online-learning scheme is further strengthened when evaluated on dynamic users whose preferences tend to vary frequently. the performance decrease of batch-learning-based models indicates that the "first learn, then earn" recommendation scheme may perform even worse when users' preferences exhibit strong dynamics. this may be because the dynamic patterns in users' preferences cannot be fully captured by the historical data. bandit-driven models, in contrast, are able to actively collect users' feedback on recommendations and, thus, can capture changes in users' preferences more promptly.

[table 6 values (excerpt): 0.3942***, 0.2250***, 0.4308***. note: an asterisk in superscript denotes that a benchmark model performs significantly worse than the proposed model. significance levels are: * p < 0.1, ** p < 0.05, *** p < 0.01.]

(figure 6. performance variation for dynamic users.)

5.5 experiment 5: diversity analysis

in this experiment, we examine whether our recommendation framework can effectively learn users' diverse challenge preferences in the data. specifically, for our proposed model and each of the benchmark models, we calculate the recommendation frequency for each challenge type to construct a diversity distribution. we then compare the recommendation frequencies with users' challenge-selection frequencies in the data, which reveal users' true preferences over challenge types. we use the jensen-shannon divergence (jsd) to measure the similarity between two diversity distributions (endres and schindelin 2003; fuglede and topsoe 2004). a small jsd value indicates high similarity between two diversity distributions.
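the jensen-shannon divergence between two diversity distributions can be computed as follows (base-2 logs, so the value is bounded in [0, 1]); this is the standard formulation, not code from the paper:

```python
import numpy as np

def kl(p, q):
    # kullback-leibler divergence in bits, skipping zero-mass entries
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def jsd(p, q):
    # jensen-shannon divergence between two (unnormalized) frequency
    # vectors over challenge types; symmetric, and 0 iff p == q
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

here p could be the observed challenge-selection frequencies over the three types and q a model's recommendation frequencies; identical distributions give a jsd of 0, matching the interpretation in the text.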
in figure 7, we visualize the diversity distribution for each recommendation model and present the corresponding jsd values. the first bar in the figure represents the diversity distribution in users' challenge-selection histories observed in the data. the second bar represents the diversity distribution in the recommendations provided by our proposed model. as can be seen, the first two bars are very similar to each other, indicating that the recommendations produced by our model are diversified in a way that is similar to users' actual challenge-selection histories. the recommendation diversity distributions produced by the benchmark models generally show a larger difference from the observed challenge-selection data. for example, the bar that corresponds to cacf is quite different from the first bar. these observations are confirmed by our jsd results, which show that our proposed model has the smallest jsd value (i.e., 0.03) across all of the benchmark models. combined with the results of the earlier-discussed experiments, these findings provide evidence that our proposed recommendation framework can well support users' diverse healthcare preferences and that the diversity constraint can further guide the recommendation system to explore along each of the weight-management dimensions and, thus, improve learning efficiency. in this experiment, we aim to examine whether our recommendation framework can benefit more users. we define user improvement as the percentage of users who receive more preferred items from a focal recommendation algorithm than from a baseline algorithm. we use probabilistic matrix factorization (pmf) as our baseline algorithm. pmf models hidden user representations based on users' challenge-selection histories. it does not, however, incorporate contextual information about users' selection behaviors, and it is batch-learning-based.
thus, by comparing with pmf, we are able to assess the value of the recommendation context along with the online-learning scheme. we present our results for user improvement in figure 8. it is shown that our proposed recommendation approach has the highest user improvement rate (~76.19%), suggesting that approximately 76% of users receive more preferred challenges from our recommendation framework than from pmf. comparatively, the other recommendation approaches have lower user improvement rates (all below 70%). these results indicate that our recommendation framework can serve to improve a larger user population on the platform.

the preceding experiments evaluate the effectiveness of our recommendation framework in improving users' challenge-selection rates. in the healthcare context, it is also important that service providers take further steps to evaluate the corresponding health-related outcomes. although necessary, increasing users' engagement in interventions may not be enough, as it does not directly guarantee an improvement in health. therefore, in this experiment, we further examine our recommendation performance in improving users' weight-loss outcomes. in particular, we explore a different recommendation target: users' in-period weight-loss rate. that is, we use users' in-period weight-loss rate as feedback to guide the learning of our recommendation framework. different from the prior experiments, generating weight-loss feedback requires two steps of simulation. first, we use the logistic predictor described in section 5.2 to simulate whether users will choose a particular item. second, we simulate users' in-period weight-loss status (e.g., weight gain or non-gain) based on their choices. we use users' weigh-in data to train a logistic predictor for this simulation, which takes user embeddings and the average challenge embeddings of users' choices as the input variables and users' weight-loss status as the prediction target.
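the two-step feedback simulation can be sketched as follows; the embedding dimensions and logistic weights here are hypothetical stand-ins for the predictors that the paper trains on the platform's selection and weigh-in data:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# hypothetical learned weights for the two logistic predictors
dim = 8                        # user embedding (4) + challenge embedding (4)
w_choice = rng.normal(size=dim)
w_weight = rng.normal(size=dim)

def weight_loss_feedback(user_emb, challenge_emb):
    """two-step simulation: (1) does the user select the recommended
    challenge? (2) if selected, is the in-period outcome a non-gain?"""
    x = np.concatenate([user_emb, challenge_emb])
    if rng.random() >= sigmoid(w_choice @ x):  # step 1: selection
        return 0                               # no engagement, no reward
    return int(rng.random() < sigmoid(w_weight @ x))  # step 2: non-gain

fb = weight_loss_feedback(rng.normal(size=4), rng.normal(size=4))
```

the returned 0/1 feedback would then play the role of the bandit reward when the recommendation target is the in-period weight-loss rate.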
as shown in table 7, our proposed recommendation framework has the best performance under the new recommendation target, achieving an average in-period weight-loss rate of 75.64%. in contrast, the benchmark models achieve significantly worse average in-period weight-loss rates. these results indicate that our recommendation design is effective not only in helping users to find preferable challenges to engage in but also in further helping them to achieve better weight-loss performance.

table 7 (excerpt):
0.6507 ***
pure exploitation and pure exploration:
pure exploitation 0.7035 ***
pure exploration 0.4006 ***
note: asterisk in superscript denotes that a benchmark model performs significantly worse than the proposed model. significance levels are: * p < 0.1, ** p < 0.05, *** p < 0.01.

in this study, we take a design-science perspective to develop a novel recommendation framework for providing personalized healthcare recommendations to users on online healthcare platforms. the design of our recommendation framework is motivated by several unique patterns in individuals' health behaviors. first, due to the evolving process of health management, users' health behaviors may continuously change with time. second, users may need multiplex healthcare information to manage different health aspects. these characteristics indicate that users' healthcare preferences can be dynamic and diverse. to this end, we propose a deep-learning and diversity-enhanced mab framework. the mab is able to adaptively learn users' changing behavior patterns while promoting diversity along the exploration process. to better adapt an mab to the healthcare recommendation context, we further synthesize two model components into our framework based on prominent health-behavior theories.
the first component is a deep-learning-based feature-construction procedure aimed at capturing important healthcare recommendation contexts, such as a healthcare intervention's inherent attributes and individuals' evolving health conditions and health-behavior paths. the second component is a diversity constraint, which we use to ensure that recommendations are provided in each of the major health-management dimensions, so that individuals can receive well-rounded support for their health management. we conduct a series of experiments to test each of our model components as well as to compare our model against state-of-the-art recommendation systems, using data collected from a representative online weight-loss platform. the results of the experiments provide strong evidence for the effectiveness of our proposed recommendation framework. our study contributes to the emerging literature on the application of business intelligence (abbasi et al. 2016; chen et al. 2012). we demonstrate that prescriptive analytics can be integrated with it artifacts to generate applicable insights. in particular, the innovative healthcare recommendation framework that we developed makes an important contribution to the literature on recommendation systems and online healthcare systems. to the best of our knowledge, we are among the first to combine mab models with deep-learning-based embeddings to improve the characterization of recommendation contexts. in addition, the inclusion of diversity constraints demonstrates a way of promoting recommendation diversity along pre-designed dimensions. this innovation can be of significance in professional industries, in which domain expertise needs to be incorporated to guide recommendation diversification. from a practical perspective, our recommendation framework can be used to address real-world challenges in healthcare recommendation problems.
the effectiveness of our framework, as demonstrated by our results, implies great potential for using our recommendation design to provide users with tailored engagement suggestions. although our recommendation design is proposed for assisting individuals' engagement in online health management, the framework can be extended to broader problem settings. for example, the combination of an mab and deep-learning-based feature engineering can be used to solve other healthcare problems, such as drug discovery, disease diagnosis, clinical trials, and therapy development. decision making in such healthcare problems usually involves complex contextual knowledge (e.g., drug structures, patient health histories, symptom development paths), and decision-makers usually do not have full knowledge of the environment (e.g., whether a drug is effective). the online-learning framework of an mab can help decision-makers to better cope with uncertainties in the healthcare environment, and deep-learning models can be combined to improve the characterization of decision-making contexts. in addition, we demonstrate a way to formulate recommendation constraints, which can be used to incorporate domain expertise to guide the recommendation procedure. our recommendation framework can also be extended to fields beyond healthcare. real-world decision-making problems, such as financial investment, product pricing, and marketing, usually contain different levels of uncertainty. the uncertainty may arise because decision-makers have not gathered enough data to guide their decision making, or because the decision-making environment is frequently changing (e.g., market instability, technology change, policy-environment fluctuation). thus, it is of practical importance to develop an adaptive decision-making framework that can respond well to the uncertainty in the environment.
decision-makers may consider combining an mab with deep-learning embeddings to learn the context-dependency of their decision results while adaptively adjusting their strategies to minimize opportunity cost. in addition, in real-world recommendation problems, it is usually desirable to recommend diversified content to maximize the coverage of the information that users find interesting and to improve their engagement experience. our formulation of the diversity constraint can be used to strengthen recommendation diversification along theory-guided dimensions in multiple application areas.

references:
• big data research in information systems: toward an inclusive research agenda
• toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions
• finite-time analysis of the multiarmed bandit problem
• matrix factorization techniques for context aware recommendation
• social cognitive theory of self-regulation
• health promotion from the perspective of social cognitive theory
• effective hierarchical vector-based news representation for personalized recommendation
• hybrid recommender systems: survey and experiments
• an empirical evaluation of thompson sampling
• context-aware collaborative filtering system: predicting the user's preference in the ubiquitous computing environment
• business intelligence and analytics: from big data to big impact
• wide & deep learning for recommender systems
• should i stay or should i go? how the human brain manages the trade-off between exploitation and exploration
• social support and patient adherence to medical treatment: a meta-analysis
• there's a smart way to write management's goals and objectives
• doubly robust policy evaluation and learning
• a new metric for probability distributions
• blockbuster culture's next rise or fall: the impact of recommender systems on sales diversity
• the answer to weight loss is easy - doing it is hard!
• jensen-shannon divergence and hilbert space embedding
• bandit processes and dynamic allocation indices
• action centered contextual bandits
• maintenance of lost weight and long-term management of obesity
• stochastic satisficing account of confidence in uncertain value-based decisions
• recommendation systems: principles, methods and evaluation
• the association between weight loss and engagement with a web-based food and exercise diary in a commercial weight loss programme: a retrospective analysis
• understanding variation in chronic disease outcomes
• robust multiarmed bandit problems
• social support processes and the adaptation of individuals with chronic disabilities
• recommender systems: from algorithms to user experience
• internet-based weight control: the relationship between web features and weight loss
• making smart goals smarter
• a contextual-bandit approach to personalized news article recommendation
• diversity-promoting deep reinforcement learning for interactive recommendation
• a theory of goal setting & task performance
• visualizing data using t-sne
• unpacking the exploration-exploitation tradeoff: a synthesis of human and
• dynamic online pricing with incomplete information using multiarmed bandit experiments
• probabilistic matrix factorization
• reinforcement learning for bandit neural machine translation with simulated human feedback
• transforming physician practices to patient-centered medical homes: lessons from the national demonstration project
• why written objectives need to be really smart
• general concepts of goals and goal-setting in healthcare: a narrative review
• when more is less: the paradox of choice in search engine use
• the filter bubble: how the new personalized web is changing what we read and how we think
• empirical analysis of the impact of recommender systems on sales
• from virtual parties to ordering food, how americans are using the internet during covid-19
• tracking multiple topics for finding interesting articles
• user-centric evaluation framework for recommender systems
• contextual combinatorial bandit and its application on diversified online recommendation
• promoting diversity in recommendation by entropy regularizer
• recommender systems: introduction and challenges. recommender systems handbook
• a hidden markov model for collaborative filtering
• risk-aversion in multi-armed bandits
• customer acquisition via display advertising using multi-armed bandit experiments
• social collaborative filtering for cold-start recommendations
• toward a theory of social support: closing conceptual gaps
• improving health by taking it personally
• uncertainty and exploration in a restless bandit problem
• ensemble contextual bandits for personalized recommendation
• analysis of recommender systems algorithms
• to stay or leave?: the relationship of emotional and informational support to commitment in online health support groups
• obesity and overweight
• good intentions, bad outcomes: the effects of mismatches between social support and health outcomes in an online weight loss community
• feeling blue? go online: an empirical study of social support among patients
• online context-aware recommendation with time varying multi-armed bandit
• deep learning based recommender system: a survey and new perspectives

key: cord-280064-rz8cglyt authors: gwizdałła, tomasz title: viral disease spreading in grouped population date: 2020-08-27 journal: comput methods programs biomed doi: 10.1016/j.cmpb.2020.105715 sha: doc_id: 280064 cord_uid: rz8cglyt

background and objective: the currently active covid-19 pandemic has increased, among other things, public interest in the computational techniques enabling the study of disease-spreading processes. thus far, numerous approaches have been used to study the development of epidemics, with special attention paid to the identification of crucial elements that can strengthen or weaken the dynamics of the process. the main thread of this research is associated with the use of the ordinary differential equations method.
there also exist several approaches based on the analysis of flows in the cellular automata (ca) framework. methods: in this paper, we propose a new approach to disease-spread modeling. we start by creating a network that reproduces contacts between individuals in a community. this assumption makes the presented model significantly different from the ones currently dominant in the field. it also changes the approach to the act of infection. usually, some parameters that describe the rate of new infections are considered, taking into account those infected in the previous time slot. with our model, we can individualize this process, considering each contact individually. results: the typical outputs from calculations of this type are epidemic curves. in our model, besides presenting the average curves, we show the deviations or ranges for particular results obtained in different simulation runs, which usually lead to significantly different results. this observation is the effect of the probabilistic character of the infection process, which can impact, in different runs, individuals of different significance to the community. we can also easily present the effects of different types of intervention. the effects are studied for different methods used to create the graph representing a community, which can correspond to different social bonds. conclusions: we see the potential usefulness of the proposition in the detailed study of epidemic development for specific environments and communities. the ease of entering new parameters enables the analysis of several specific scenarios for different contagious diseases.

the recent pandemic related to the worldwide expansion of the covid-19 coronavirus redirected the attention of scientists and entire societies to techniques that can be helpful when analyzing and predicting the spread of diseases in communities. viruses are always present in our environment, and this presence usually attracts little attention.
for example, we are so familiar with most influenza virus mutations that the epidemics caused by them are widely considered mainly in economic discussions. only occasionally, when some particularly aggressive mutations emerge or during pandemics caused by some coronaviruses (sars, mers), is the public alerted about the potential danger. with the knowledge about possible directions of disease transfer and the number of people at risk, the identification of the most vulnerable groups is crucial. in 1927, kermack and mackendrick published their paper [1], which is now considered the first approach to the mathematical modeling of epidemic processes. they introduced the crucial concepts by classifying members of a population, when considering an illness, as susceptible or infected and tried to find the relationships between these classes with the help of differential calculus. this division is also used today, and the number of applications based on their approach, using ordinary differential equations (odes), continues to increase. we want to emphasize a topic that is difficult to address when using the ode approach - the inclusion of stochastic effects. indeed, there are models in which some external force is used - either periodic, e.g., seasonality [2], or purely stochastic [3], the latter for the sis model, which assumes that an individual recovering from the disease is again susceptible. in our approach, we follow another path, related to the analysis of the distribution of particular groups in real human communities. this approach is often related to the cellular automata model. the typical approach can be found, e.g., in sirakoulis' paper [4]. in that model, the population is distributed over a two-dimensional space, and this space is, as is typical for ca, divided into squares. the spread of disease is modeled by deterministic rules describing the transitions between states.
many investigations use similar models; e.g., the famous model proposed by ferguson's group [5] uses a much softer division to generate the areas. we pay special attention to this model since it was later widely used to study the possibilities of preventing disease. in [6, 7], the expected results of different intervention procedures during influenza epidemics in asia and the united states, respectively, are shown. the same model has recently been applied to the covid-19 pandemic [8]. great britain and the united states were the areas of application, but according to the authors' remarks, every high-income country can be studied with this model. a similar approach using ca can also be observed in [9], where the transfer between cells is the main factor supporting the spread of disease. the exchange of people is also a crucial feature of the model presented by holko [10]. the results there are reproduced for the whole area of poland, divided into 36 × 36 squares, showing the possible spread of an influenza epidemic. in our earlier paper [11], we showed, taking into account the same area of inhabitance as in this paper, that the model requires special attention, primarily due to the existence of so-called size effects. although similar to the approaches based on cellular automata, our approach is different, mainly due to the change in the definition of the topology of interpersonal links. we do not consider the aggregate number of individuals in a particular state and a particular area; instead, we create direct links between them. thus, we can regard our model as an agent-based model. after defining the topology and the set of features characterizing agents, we can individualize their behavior. in this paper, we use the barabasi-albert (ba) model for the creation of a community graph [12, 13].
this approach currently seems to be the most popular among the numerous attempts to model communities, dating from the seminal paper on random graphs by erdos and renyi [14]. the main property of the ba model is that it leads to a power-law distribution of nodes according to their degree (the scale-free property), which is typical of communities. therefore, several real-world networks can be described by a ba model, e.g., the world wide web, actor or scientific collaborations, or even the e. coli metabolism [15, 16]. several authors have already proposed the use of the ba network [17, 18, 19, 20], but their models concentrate on other problems. for simplicity, we do not introduce differences in the behavior of agents. there are also papers where some of the concepts used in our paper were considered. ramos [21] studied the case of grouping on some form of a two-dimensional grid. balcan [22] studied the role of hubs, well known from complex network theory, in outbreak prevention by their identification and vaccination. hellewell [23] studied a graph of direct links, where a negative binomial distribution determined the probability of the creation of links. among the papers related to the current covid-19 pandemic, we want to pay attention especially to those that can be used in our calculations or give additional ideas for disease analysis. we mention here the reports concerning the incubation time [24, 25, 26] and the ones that present information about important parameters, such as the basic reproduction number r_0 [27, 28]. boldog's paper [27] also draws interesting conclusions concerning the potential risks for particular countries when taking into account their connections with china. this paper is organized as follows: in the next section, we present the model, emphasizing two problems - the construction of a graph describing a society and the procedure of the transfer of illness.
we also justify the idea of division into groups based on some, generally arbitrary, factor. in the section devoted to the presentation of results, we concentrate on the epidemic curves, which are presented in two forms, i.e., the number of new cases and the number of recovered persons (in the absence of a mortality rate), and on the analysis of intervention, considered as the minimization of the number of contacts between neighbors in the network. we also show the effect of using different procedures on the possibility of people in different groups becoming sick.

the most popular way to model the disease-spreading process is the method of kermack [1], which is based on individuals being assigned to one of several groups. in their original paper, the authors did not use the contemporary terminology, but later, the acronym sir began to be used. in the sir model, the members of a population can be classified as belonging to one of three groups concerning a disease: susceptible (s) - those who can become ill; infective (i) - those who can infect others; and recovered (r) - those who are permanently immune. the set of ordinary differential equations (1) determines the number (or fraction) of individuals in a particular state and at a particular time:

ds/dt = -a s i,   di/dt = a s i - b i,   dr/dt = b i.   (1)

in the above formula, s, i, and r are the numbers of people in the respective phases of illness, a is the contact rate, and b is the inverse of the infectious period. this approach was later extended by including a fourth phase - exposed (e) - between phases s and i. this fourth phase describes the fraction of the population in the latent (incubation) phase of an illness. this addition enables us to include more realistic processes when considering the majority of infectious diseases. the set of odes now takes the form:

ds/dt = µ n - β s i / n - µ s,
de/dt = β s i / n - (δ + µ) e,
di/dt = δ e - (γ + µ) i,
dr/dt = γ i - µ r.   (2)

some parameters in formula (2) are the same as in (1): β is the contact rate, and γ is the inverse of the infectious period.
additionally, we have to consider n(t) - the total number of people in a community, satisfying the condition n(t) = s(t) + e(t) + i(t) + r(t) = const - and some new parameters, such as δ, which is the inverse of the latent period, and µ, i.e., the mortality (and also birth) rate. the equality of the mortality and birth rates is the condition enabling us to consider a constant n(t). we should emphasize that both of the above sets of equations are exemplary, since their detailed forms depend strongly on the system's assumptions, but they present the idea of the calculations. the presented models also offer many opportunities to be modified or expanded. we can mention here, e.g., seis, where no immunity is assumed after a disease is passed, or seijr, where an additional j phase (isolated individuals with treatment) is placed between the i and r phases. this continuous approach enables easy calculation of one of the most interesting values describing the potential effect of an outbreak, i.e., the basic reproduction number, which is a simple function of the ode parameters:

r_0 = β δ / ((δ + µ)(γ + µ)).   (3)

although continuous models based on the odes give many interesting and practical results, it is well known [3] that there exists a large stochastic effect in the epidemic process. therefore, we propose considering epidemics in a more individual manner. we start from two assumptions:

• every infection is a result of the interaction between infected and susceptible individuals,
• we have to find the network of interpersonal connections first and then model the spread of an illness.

the second of these two points distinguishes our proposition from some of the abovementioned approaches, which are based mainly on the cellular automata methodology, where the particular cells in a two-dimensional lattice correspond to different sections covering the whole real area under investigation, for example, certain parts of cities or countries (see, e.g., [20, 10]).
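the seir system of formula 2 can be integrated numerically; below is a minimal forward-euler sketch in python, where the rate values (β, δ, γ, µ) are illustrative choices of ours, not fitted to data:

```python
# hypothetical rates per day: contact beta, latent rate delta (= 1/t_e),
# recovery rate gamma (= 1/t_i), birth/death rate mu; fractions, so n = 1
beta, delta, gamma, mu = 0.5, 0.5, 0.25, 0.0
s, e, i, r = 0.999, 0.0, 0.001, 0.0   # one "patient 0" per thousand
dt = 0.1                              # euler step in days
for _ in range(int(160 / dt)):
    new_exposed = beta * s * i        # force of infection times susceptibles
    ds = mu - new_exposed - mu * s
    de = new_exposed - (delta + mu) * e
    di = delta * e - (gamma + mu) * i
    dr = gamma * i - mu * r
    s, e, i, r = s + dt * ds, e + dt * de, i + dt * di, r + dt * dr
print(f"final recovered fraction: {r:.3f}")
```

with these values r_0 = βδ/((δ+µ)(γ+µ)) = 2, so a large outbreak occurs; note that n(t) = s + e + i + r stays constant by construction.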
we instead numerically follow a scheme in which the possibility of every individual interaction is considered separately. the fundamental idea of the ba model is the observation regarding the growth of the network. instead of creating links between already existing nodes only, the preferential character of this growth is assumed, which governs the creation of a graph corresponding to a community network. in short, the individuals are added to the existing graph one after the other, and during this process, the straightforward formula

p(k_i) = k_i / Σ_j k_j   (4)

is used when determining the probability of linking a new node to an existing node indexed by i. k_i here is the degree of node i, and the summation in the denominator runs over all existing nodes. we present the results for two cases of graph modeling. the first one is the pure ba model, as known from the seminal papers. however, we also propose a modified approach, which is related to the fact that every individual is a member of different groups. as a group, we understand here different subsets of a community described by some common interest or feature. good examples include the place of residence or the workplace. even if we do not consider someone connected in the sense of a typical social network, we can more easily meet him/her in a local store or the company hallway. this is why we decided to modify the probability given by formula (4), trying to take the mentioned effects into account. the solution that imposes itself is to change the relative probabilities of acceptance for links connecting nodes/individuals belonging to the same group when compared to those from other groups. however, no clear evidence exists regarding how we should introduce these changes. it could strongly depend on several additional details; for example, an individual living in a block of flats and shopping at the supermarket can accidentally meet many more unknown people than a resident of a cottage in the suburbs, who shops only at a local store.
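the preferential-attachment growth of formula 4 can be sketched compactly; the pool-list trick below (each node appears once per unit of degree) makes degree-proportional sampling cheap, and the sizes and seed are arbitrary choices of ours:

```python
import random
random.seed(1)

def ba_graph(n, m=2):
    """grow a graph by preferential attachment: each new node links to m
    distinct existing nodes chosen with probability proportional to their
    degree, as in formula 4."""
    edges = [(0, 1)]   # start from a single linked pair
    pool = [0, 1]      # node id repeated once per unit of degree
    for new in range(2, n):
        targets = set()
        while len(targets) < min(m, new):
            targets.add(random.choice(pool))  # p proportional to k_i
        for t in targets:
            edges.append((new, t))
            pool.extend([new, t])
    return edges

edges = ba_graph(1000)
degree = {}
for a, b in edges:
    degree[a] = degree.get(a, 0) + 1
    degree[b] = degree.get(b, 0) + 1
print("max degree:", max(degree.values()))
```

the heavy right tail of the resulting degree distribution (a few highly connected hubs) is the scale-free property discussed above.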
since we have to aggregate all these possibilities, we decided to use a mechanism where the probability of connecting nodes is tripled for nodes belonging to the same group, while for different groups, it is divided by 2:

p_same(k_i) = 3 k_i / Σ_j k_j,   p_diff(k_i) = 0.5 k_i / Σ_j k_j.   (5)

this procedure has a different significance in different phases of the creation of a graph of connections. in the early phase, the values of the probabilities (4) are relatively large; thus, almost all nodes in the same group are connected (the probability is close to 1), and the decreased probability for different groups starts to play an important role just after the initialization. the nodes added later usually have smaller degrees; hence, both formulas (5) have comparable significance. the choice of the particular multipliers (3 and 0.5) in equations (5) has no particular background. we want to clearly distinguish between the case where belonging to some environment strongly influences an individual's ability to be connected with another person and that where no particular preferences exist. in this paper, we use three types of graphs describing communities:

• the pure ba graph;
• a graph based on assignment to one of four groups with completely artificial sizes;
• a graph based on assignment to one of 16 groups with sizes based on the inhabitance of particular areas of a selected medium-sized city in poland.

we create the four groups mentioned in the second of the above points to introduce visible differences in their sizes. the idea is to observe the effects of the proposed division in a simplified case, and the percentages of people in the successive groups are {0.1, 0.2, 0.3, 0.4}. the second division is based on a real analysis of our city - łódź (lodz). easily accessible information [29] shows that we can approximate the city's roughly symmetric shape as a square with an edge length close to 16 km. the number of inhabitants of lodz can be estimated to be approximately 700000.
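the group-modified attachment (within-group weight tripled, cross-group weight halved) can be sketched as follows; the 3x/0.5x factors and the {0.1, 0.2, 0.3, 0.4} group shares come from the text, while one link per new node and the seed are simplifications of ours:

```python
import numpy as np
rng = np.random.default_rng(3)

def grouped_ba(n, group_shares, same=3.0, diff=0.5):
    """ba-style growth where the degree-proportional attachment weight of
    an existing node is tripled if it shares a group with the incoming
    node and halved otherwise (one link per new node for brevity)."""
    group = rng.choice(len(group_shares), size=n, p=np.asarray(group_shares))
    degree = np.zeros(n)
    degree[:2] = 1.0
    edges = [(0, 1)]
    for new in range(2, n):
        w = degree[:new] * np.where(group[:new] == group[new], same, diff)
        target = int(rng.choice(new, p=w / w.sum()))
        edges.append((new, target))
        degree[target] += 1
        degree[new] = 1
    return edges, group

edges, group = grouped_ba(2000, [0.1, 0.2, 0.3, 0.4])
same_frac = float(np.mean([group[a] == group[b] for a, b in edges]))
print(f"fraction of within-group links: {same_frac:.2f}")
```

without the bias, the expected within-group fraction would be roughly the sum of squared shares (0.30); the 6x relative weight pushes it well above that.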
the large square is further divided into 16 smaller squares (4 rows by 4 columns). for this approach, we can establish the number of inhabitants in particular parts of the city. the results of this division are schematically presented in fig. 1 (a). the shades correspond to the number of inhabitants in a particular area, and the numbers inside the squares are the populations of the groups in thousands. to show the distribution of people in the studied case in more detail, we also added fig. 1 (b), where a similar division is shown for a system of 16 × 16 = 256 smaller groups. it is a fundamental assumption that this community is isolated; thus, we do not have the chance for potential secondary outbreaks, although it would not be difficult to introduce this effect. the modification of the probability distribution when creating the connections is shown in fig. 2. we show the distributions for the three types of graphs mentioned above when creating them for 70000 individuals. we can observe that the algorithm creates fewer than 10^-4 of the pairs with a probability greater than 0.01. the differences are visible, but it may seem that they are not able to produce a significant effect on the spreading process. there is also no significant difference between the distributions for different divisions (4 groups or 16 groups). the process of disease transfer is, in reality, difficult to describe in the language of mathematical formulas. when considering the results of direct contact, we have to take into account many details, such as the duration of contact or the distance between individuals. indeed, there are also many features of medical origin, such as the individual immune system or blood group [30]. due to these problems, a much more simplified approach is typically used. for example, in schimit's paper [31], the probability of infection depends on the number of infected neighbors (ν).
in our approach, we consider the process of possible infection separately during every step and for every possible pair of neighbors. there are two probabilities that form the basis of our model:

• p_m - the probability of meeting (contact),
• p_i - the probability of infection.

the first value (p_m) is related to the social preconditions of infection - without contact with a pathogen, we cannot become ill. therefore, we have to define the value that describes the probability that two individuals meet during the time corresponding to one time step. thus, p_m describes the possibility of a situation in which contact is long enough to enable the transmission of illness. the second value (p_i) is related to the medical processes of infection and should, in general, correspond to the characteristics of the pathogen causing the disease. in the simplification adopted to meet the current paper's needs, this value is the probability of transferring the illness during a meeting, which occurs with probability p_m. in general, this value can correspond to many processes related to pathogen transfer between two individuals. as a result, this method enables distinguishing the influence of processes of different origin on the result of a simulation. by changing p_m, we can introduce, e.g., the effect of a quarantine, and by changing p_i, we can distinguish different diseases. it is very important to mention here that p_i does not change during a simulation run. such a change would be a good model for the reaction of a pathogen to environmental conditions, such as the temperature, humidity, or uv radiation level. the crucial problem when studying the expansion of an epidemic is to parametrize the seir model by introducing into the calculations the realistic times that an individual spends in the exposed (t_e) and infectious (t_i) states.
in this paper, we show the results for two pairs of times, making it possible to distinguish between two types of infections. as the first case, we choose times typical for influenza (t_e = 2, t_i = 4), following the cases studied in [32, 10]. for the second case, we choose times that best resemble the covid-19 data. since some preliminary data exist [33, 26, 25], we decide to set (t_e = 0, t_i = 14). finally, the scheme of disease transfer is as follows: • we start from exactly one "patient 0". • the location of this patient varies between simulation runs, but it is always unambiguously defined. usually, "patient 0" is the hub of the selected group, and sometimes it is a less connected node. detailed information about him/her is always given when describing a particular result. • one time step is equivalent to one day. • during the day, an individual in group i (infectious) can transfer the disease to each of his/her connected individuals with total probability p_t = p_m * p_i. • after the infectious time, an individual becomes recovered (r) and does not return to the susceptible group (s). • the mortality and birth rates are assumed to be 0. in table 1, we present a summary of the simulation parameters. three fields in the table need further explanation. unless stated otherwise, the outbreak starts in the so-called hub of the most populous group. in the concept of social networks, hubs are the nodes characterized by the highest degree. this means that the selection of the initial point of the outbreak is not random but is assigned to a node with a relatively high possibility of spreading the disease. we also decide to propose a model of intervention in the form of the separation of individuals. the main idea of the prevention enforced by authorities during the covid-19 outbreak is to keep people at home.
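the daily transfer scheme above can be sketched as a single update step on the contact graph (python for illustration; the value p_i = 0.25 is an assumed placeholder, and the function layout is ours, not the authors'):

```python
import random

P_M, P_I = 0.1, 0.25   # meeting and infection probabilities (P_I is illustrative)
T_E, T_I = 2, 4        # influenza-like exposed and infectious times (days)

def step(states, timers, adjacency, rng):
    """One day of the disease-transfer scheme: every infectious (I) node may
    infect each susceptible (S) neighbour with total probability P_M * P_I;
    exposed (E) nodes progress to I, infectious nodes to recovered (R)."""
    new_states = dict(states)
    for node, st in states.items():
        if st == "I":
            for nb in adjacency[node]:
                if states[nb] == "S" and rng.random() < P_M * P_I:
                    new_states[nb] = "E"
                    timers[nb] = T_E
            timers[node] -= 1
            if timers[node] <= 0:
                new_states[node] = "R"      # recovered never return to S
        elif st == "E":
            timers[node] -= 1
            if timers[node] <= 0:
                new_states[node] = "I"
                timers[node] = T_I
    states.update(new_states)
```

one call advances the whole network by one day; running it in a loop and counting new "E" assignments yields the daily epidemic curve.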
certainly, for most of the population, it is not possible to completely refrain from leaving places of isolation; hence, they significantly decrease the number and intensity of outside contacts. we model this by decreasing the p_m factor. if we enforce an intervention, every day after its start, p_m is halved (p_m,t+1 = 0.5 * p_m,t, where t enumerates the time steps) until it reaches a value lower by one order of magnitude than the starting value p_m = 0.1. for the initial calculations, we assume that the intervention, if it occurs, starts on the 10th day. in this section, we present the results of calculations made for the model presented above. we start the analysis from information on the size of the graphs used in particular simulations. two values are chosen, i.e., 7000 and 70000, with several reasons justifying these choices. first, we can expect the existence of a size effect. the size effect is typical for dynamical system simulations and manifests itself in the dependence of the results on the size of the sample. we have observed this effect for disease spreading models [11], and we can also expect its existence for these calculations. the choice of the particular values is made to achieve easy scaling and comparison with the number of nodes in the sections of fig. 1. thus, the real number of 3700 persons in the upper-left section corresponds to 37 or 370 for 7000 and 70000, respectively. a smaller number of nodes also makes the calculations faster and, therefore, enables collecting more extensive statistics. the most time-consuming process is the creation of the network; its time complexity is o(n^2). in figs. 3 and 4, we show the epidemic curves and the cumulative epidemic curves for selected cases described by the parameters listed in table 1. in fig. 3, we show the daily number of new cases. in fig. 4, as cumulative data, we consider all individuals who passed the infectious state (i). for simplicity, we call them recovered in the figure.
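the halving rule for p_m can be written as a small helper (a sketch; the floor at one-tenth of the starting value follows the description above):

```python
def intervention_pm(p_start, days_after_start):
    """Contact probability under intervention: p_m halves every day after
    the intervention starts, until it is one order of magnitude below the
    starting value, where it stays."""
    floor = p_start / 10.0
    p = p_start
    for _ in range(days_after_start):
        p = max(0.5 * p, floor)
    return p
```

with p_start = 0.1, the floor of 0.01 is reached after four days of intervention.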
the results presented in the plots are averaged over ten runs for every set of parameters. this procedure makes all curves smoother, and all statistical effects disappear; we will return to them later. we prepare both figures in the same style. having the data for two illnesses and two cases related to reducing the contact probability (denoted as a form of intervention), we arrange them in such a way that the upper plots correspond to the influenza-related data and the lower ones to the covid-19-related data. the left plots show the results without any intervention, while the right ones correspond to the inclusion of the social-distancing effect obtained by decreasing the p_m parameter. the pure epidemic curves (fig. 3) show that every factor included in the simulation parameters can influence the course of the epidemic. since we show percentages, we can directly compare all curves. the most visible differences can be observed for the influenza-related data. as one can expect, when starting the disease in the hub of the most populated group (described as a hub), we obtain the highest rate of total infected persons. this result is, however, obtained for the particular division into four groups. with this division, the increase in connection probability plays such a significant role that it causes a large increase in the number of persons with illness. [figure 4: the course of cumulative epidemic curves for the same sets of parameters as in fig. 3. the organization of the image is the same as in fig. 3.] interestingly, the same case considered for the pure ba model or for a larger number of groups (16) does not cause this significant effect. it can also be observed that, for the smaller groups, the maximum of the epidemic takes place earlier than for the corresponding larger number of groups. this effect is visible for all pairs of curves prepared for the same set of parameters but different sizes.
it is also essential that the maximum percentages for different parameters can differ by approximately an order of magnitude, and the epidemic can last a long time. indeed, it never fully expires, as is known for influenza, although with varying intensity. when looking at the plots for influenza with intervention, we can estimate the progress of influenza when we assume that restrictions similar to those introduced during covid-19 are implemented on the 10th day from the start of the outbreak. we can expect that in approximately three weeks (10 days after implementing social distancing), the disease will disappear. indeed, we can consider this situation as a thought experiment, since no government would impose such restrictions due to the seasonal flu. when looking at the covid-19-related data in the lower plots of fig. 3, we notice that the differences here are visibly less important. as we observed earlier, the maxima of the curves for smaller sizes occur slightly earlier than for the corresponding curves for larger sizes. this difference is, however, visibly smaller. the longer infectious time causes the disease to be more aggressive and faster. for almost all cases, after approximately three months, the disease disappears. the number of people who have spread the disease can be seen in fig. 4. the final number of ill persons reaches a value in the interval [57%, 73%]. this result is certainly unacceptable, especially taking into account the fact that the real mortality index for covid-19 is reported to be approximately 5%. the influence of the intervention on the tenth day is shown in fig. 3 (d). the crucial observation is that the disease distinctly decreases its intensity after several dozen days. this profile is similar to the curves obtained, for example, in some western european countries (see, e.g., https://www.worldometers.info/coronavirus for the netherlands or spain).
the disease stays significantly weaker after approximately two months from the first infection. indeed, these systems are not closed ones. we can show that the size effect is present in this particular type of calculation. except for the data for covid-19 parameters without intervention, every other case strongly depends on community size. the most straightforward way to notice this effect is to observe the curves drawn with the same line type but different colors. this means that the spread of disease depends strongly not only on the relative ratio of connections inside and outside groups but also on the absolute number of links. some extended calculations may be needed, especially to describe the influenza-type epidemic. the interesting effect that we can observe is the change in characteristics with the increase in the number of groups. we can see that the total number of ill persons for the division into four groups is usually higher than for the pure ba model and later decreases when dividing the community into 16 groups. in fig. 4 , this relation is observed, e.g., when comparing the plots for size = 7000, initialized in the hub of a large group, and 3 models described as ba, 4gr, and 16gr. we think that the reason for this property is the fact that we strongly support the creation of links inside groups (see eq. 5). thus, if there is a small number of relatively numerous groups, the average number of links could increase, strengthening the effect of disease transfer. we should emphasize the magnitude of these changes, which can make some results even ten times greater than others. we also tested the significance of the choice of the first infected individual. to prevent the plot from being unreadable, we limited the number of different cases to just two. for the first one, for which most of the calculations are made, "patient 0" is in the most populous group. 
for the second one, described as "hub other group", "patient 0" is also a hub but is located in a less populated group. when analyzing this feature, the difference in the final number of ill persons can be up to approximately 30% higher than when starting from the more populated subgroup. considering once more the effect of intervention (see plots (b) and (d) in figs. 3 and 4), we can observe that, with the intervention included, the duration of the epidemic does not depend strongly on the parameters of the model. some dispersion certainly exists but is not significant. to show some characteristics, we sampled selected averaged values for particular simulation types in table 2. we choose to present three points in time: • the time when the cumulative curve changes its character for all types of simulation (a-d); the data are averaged over the whole set of parameters, and the averages (along with standard deviations) are shown. the results show that, for the virus with the longer infectious period, the duration is shorter (but it certainly causes many more cases) than for the one with shorter times t_e and t_i. we must pay attention to the fact that the results here are strongly correlated with the model used. the size effect is observable and is characterized by a stronger influence of the intervention for larger samples, and the division into 16 groups prevents the spread of the disease. these data also give us information about the final phase of an epidemic. the most important result is shown in plot (d). even with a fast, radical, and widely accepted mechanism of intervention, we can still expect that after ten weeks, approximately 1 person in every 100000 (1 per mil, due to t_0.999 being approximately 1 percent of the total number of ill persons) can get sick. one of the most important questions when analyzing an epidemic is the significance of early intervention. the assumed start time of the interventions above was ten days.
this is a very short period and, for covid-19, is within the estimates of the incubation period. hence, we decided to perform calculations assuming different values of the delay. the results, averaged over 25 runs, are shown in fig. 5. [figure 5: the effect of the delay of intervention. plot (a) is prepared for the data used for influenza (t_e = 2, t_i = 4); plot (b), for the covid-19-related parameters (t_e = 0, t_i = 14). we perform the calculations for a sample size equal to 7000 and with the start of the outbreak in the hub of the most populous group. then, we prepare plots for three cases: the pure ba model, four groups, and 16 groups. the descriptions are the same as in the earlier figures, i.e., ba, 4gr, and 16gr, respectively. the bars correspond to the range obtained during different runs for particular parameters.] the results confirm that stochastic effects are much more important for influenza-like diseases. the differences in the numbers of recovered persons depend very strongly on the graph creation method, following the order noted when analyzing the epidemic curves. the introduction of groups initially leads to a worsening of the results (an increase in the number of infected persons) before an improvement occurs (fewer ill persons when divided into 16 groups). however, the dispersion of the results is so significant here that in every set of 25 runs, a run exists in which the disease is not transferred to other individuals. the more important conclusion comes from plot (b) of fig. 5. although the results for the pure ba model and those for the sample divided into 4 groups are almost indistinguishable on the logarithmic scale, these results differ by a factor of 2-3 in the same manner as in fig. 4. the very important observation, however, is that if the assumption about much stronger links inside groups is realistic, then by decreasing the frequency of contact between people, we can reduce the number of infected persons by up to an order of magnitude. this effect leads to the conclusion that greater social segregation can slow down or even stop an epidemic. in this case, the stochastic effects are much stronger than for the case of a smaller number of groups, which can decrease this ratio by another two orders of magnitude. finally, we present the effect of taking into account the stronger links inside groups related to the real-space separation. as we wrote earlier, the percentages of individuals in the groups, when divided into 16 groups, correspond to the inhabitants of the 16 parts of lodz. the plots in fig. 6 show the percentage of people in successive areas selected from the city's total area. the outbreak starts, as usual in our calculations, in the most populous cell. in the case considered, it is the cell in the third row and third column (see fig. 1). one may observe that the grouping causes a larger distinction between the characteristics obtained for different areas. in the suburbs, the rate of sickness is significantly lower, except for cases where it is passed to a local hub (see the lowest-right squares in plots (d) and (h)). it is essential to mention here that, unlike for links within groups, there is no preference in the community graph for connecting individuals from neighboring squares. when isolated, people in less urbanized areas have an approximately ten times lower probability of getting sick. we also emphasize an interesting effect observed when comparing some of the plots. in fig. 6, we have four pairs of plots prepared for the same disease and intervention option but for different divisions. they are as follows: (a,c), (b,d), (e,g), and (f,h). although the plots in the pairs are usually different, one pair is visibly similar. a difference in the scale certainly exists. we can show that the pair (e,g), corresponding to the covid-19 parameters without intervention, results in the same pattern of illness in particular areas.
since the corresponding plots (f and h) with intervention are distinctly different, we can say that for this aggressive disease, an accurate description of the relations inside the community graph is more important for the correct prediction of the possible course of the disease. every epidemic has two components: a social one and a medical one. we presented a model that integrates both components by considering the spread of disease as the effect of individual acts. these acts correspond to the direct transfer of pathogens between two individuals in a community network, modeled using popular social network models. the study shows the influence of the community structure (the structure of links with respect to affiliation to particular groups). these results show the great significance of knowledge about a society when anticipating the potential course of an epidemic. the formulation of the model enables easy calculation of one of the basic values related to epidemic progression - the basic reproduction number, r_0. following the formula given in [34], i.e., r_0 = k*b*d, we can easily adapt the notions used in our paper. because d is the duration of infectiousness, it corresponds to t_i. b is the probability of transmission by contact and equals p_i. the determination of k - the number of contacts of infectious persons per unit time - needs some explanation. it can be calculated by analyzing the average node degree of the constructed graph, <k_n>. these calculations are not presented in this paper, but the estimation gives an interval (dependent on the size and model) from 6 to 8.4. this allows us to calculate k = <k_n> * p_m. finally, the values of r_0 can be estimated as being between 0.6 and 0.84 for influenza and between 2.1 and 2.94 for covid-19. these values correspond very well to the summary given for covid-19 in boldog's paper [27]. we can also easily calculate the effective reproduction number (r).
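the r_0 = k*b*d adaptation above can be checked numerically (a sketch; p_i = 0.25 is back-solved here so that the quoted ranges are reproduced, and is an assumption rather than a value stated explicitly in the text):

```python
def r0_estimate(mean_degree, p_m, p_i, t_i):
    """Basic reproduction number following r_0 = k*b*d,
    with k = <k_n> * p_m, b = p_i, and d = t_i."""
    return mean_degree * p_m * p_i * t_i

# <k_n> ranges from 6 to 8.4 depending on size and model; p_m = 0.1.
flu = (r0_estimate(6.0, 0.1, 0.25, 4), r0_estimate(8.4, 0.1, 0.25, 4))
cov = (r0_estimate(6.0, 0.1, 0.25, 14), r0_estimate(8.4, 0.1, 0.25, 14))
```

with these inputs, the influenza range is 0.6-0.84 and the covid-19 range is 2.1-2.94, matching the estimates quoted above.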
usually, it is obtained by multiplying r_0 by the fraction of individuals in the s state. here, we can additionally multiply by the p_m parameter; as a result, we can observe the effect of the intervention on r and estimate the time at which r passes the critical value r = 1. in our opinion, a very important property of the proposed model is the natural way in which it includes the stochastic character of the disease-spreading process. by applying the random approach to every possible contact, we can estimate not only the average profiles of the epidemic curves but also the range in which the real number of ill persons can lie. we can try to find the crucial nodes/individuals whose identification would allow us to reduce the range of an epidemic, and we can also try to estimate the possible result of this identification. importantly, by individualizing the features of the agents, we can also individualize the transfer of illness between them. the model enables the easy introduction of different forms of intervention as well as new potential outbreaks. considering intervention, we show the results of its typical form - isolation and quarantine. other procedures, e.g., vaccination, can be introduced in a very simple way, and further forms, as shown in [35], can be added to the model. when introducing a new illness, we have to change the attribute of a selected (either intentionally or randomly chosen) individual. another important piece of information that can be estimated from the presented model is the anticipated epidemic duration. by trying to recognize the inflection point (time), we can determine the time at which the disease will be significantly impaired. in our calculations, where the epidemic is considered over once the number of ill persons falls to one per one thousand of the total, this time reaches a value approximately three times greater than the time between the outbreak and the inflection.
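the effect of the intervention on r can be sketched as follows (the daily halving and the one-order-of-magnitude floor for p_m follow the earlier description; the helper names are ours):

```python
def effective_r(r0, susceptible_fraction, pm_scale=1.0):
    """Effective reproduction number: r = r_0 * susceptible fraction,
    additionally scaled by the current reduction of the contact probability p_m."""
    return r0 * susceptible_fraction * pm_scale

def days_until_subcritical(r0, susceptible_fraction=1.0):
    """Days of p_m-halving intervention needed to push r below 1; the scale
    is floored at 0.1 of its initial value, as in the main text.
    Returns None if the floor is reached while r is still >= 1."""
    scale, day = 1.0, 0
    while effective_r(r0, susceptible_fraction, scale) >= 1.0:
        if scale <= 0.1:
            return None
        scale = max(0.5 * scale, 0.1)
        day += 1
    return day
```

for the upper covid-19 estimate r_0 = 2.94 in a fully susceptible population, two days of halving already bring r below the critical value 1.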
[1] a contribution to the mathematical theory of epidemics
[2] seasonality and period-doubling bifurcations in an epidemic model
[3] hamiltonian dynamics of the sis epidemic model with stochastic fluctuations
[4] a cellular automaton model for the effects of population movement and vaccination on epidemic propagation
[5] strategies for containing an emerging influenza pandemic in southeast asia
[6] strategies for mitigating an influenza pandemic
[7] cooley, modeling targeted layered containment of an influenza pandemic in the united states
[8] impact of non-pharmaceutical interventions (npis) to reduce covid19 mortality and healthcare demand
[9] modeling epidemics using cellular automata
[10] epidemiological modeling with a population density map-based cellular automata simulation system
[11] size effect in cellular automata based disease spreading model
[12] emergence of scaling in random networks
[13] statistical mechanics of complex networks
[14] on random graphs i
[15] the largescale organization of metabolic networks
[16] the metabolic world of escherichia coli is not small
[17] network structure and the biology of populations
[18] complex networks: structure and dynamics
[19] on analytical approaches to epidemics on networks
[20] simulation of the spread of infectious diseases in a geographical environment
[21] disease spreading on populations structured by groups
[22] seasonal transmission potential and activity peaks of the new influenza a(h1n1): a monte carlo likelihood analysis based on human mobility
[23] for the mathematical modelling of infectious diseases covid-19 working group, feasibility of controlling covid-19 outbreaks by isolation of cases and contacts
[24] estimate the incubation period of coronavirus 2019 (covid-19)
[25] incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: a statistical analysis of publicly available case data
[26] the incubation period of 2019-ncov from publicly reported confirmed cases: estimation and application
[27] risk assessment of novel coronavirus covid-19 outbreaks outside china
[28] modelling the epidemic trend of the 2019 novel coronavirus outbreak in china
[29] map of districts in lodz
[30] relationship between the abo blood group and the covid-19 susceptibility
[31] disease spreading in complex networks: a numerical study with principal component analysis
[32] realistic distributions of infectious periods in epidemic models: changing patterns of persistence and dynamics
[33] estimate the incubation period of coronavirus 2019 (covid-19), medrxiv
[34] transmission dynamics and control of severe acute respiratory syndrome
[35] managing epidemics: key facts about major deadly diseases, world health organization
key: cord-225347-lnzz2chk authors: chakraborty, tanujit; ghosh, indrajit; mahajan, tirna; arora, tejasvi title: nowcasting of covid-19 confirmed cases: foundations, trends, and challenges date: 2020-10-10 journal: nan doi: nan sha: doc_id: 225347 cord_uid: lnzz2chk
the coronavirus disease 2019 (covid-19) has become a public health emergency of international concern affecting more than 200 countries and territories worldwide. as of september 30, 2020, it has caused a pandemic outbreak with more than 33 million confirmed infections and more than 1 million reported deaths worldwide. several statistical, machine learning, and hybrid models have previously tried to forecast covid-19 confirmed cases for profoundly affected countries. due to extreme uncertainty and nonstationarity in the time series data, forecasting of covid-19 confirmed cases has become a very challenging job. for univariate time series forecasting, there are various statistical and machine learning models available in the literature. but epidemic forecasting has a dubious track record.
its failures became more prominent due to insufficient data input, flaws in modeling assumptions, high sensitivity of estimates, lack of incorporation of epidemiological features, inadequate past evidence on the effects of available interventions, lack of transparency, errors, lack of determinacy, and lack of expertise in crucial disciplines. this chapter focuses on assessing different short-term forecasting models that can forecast the daily covid-19 cases for various countries. in the form of an empirical study on forecasting accuracy, this chapter provides evidence to show that there is no universal method available that can accurately forecast pandemic data. still, forecasters' predictions are useful for the effective allocation of healthcare resources and will act as an early-warning system for government policymakers. in december 2019, clusters of pneumonia cases caused by the novel coronavirus were identified in wuhan, hubei province, china [58, 48], almost a hundred years after the 1918 spanish flu [118]. soon after the emergence of the novel beta coronavirus, the world health organization (who) characterized this contagious disease as a "global pandemic" due to its rapid spread worldwide [99]. many scientists have attempted to make forecasts about its impact. however, despite involving many excellent modelers, the best intentions, and highly sophisticated tools, forecasting covid-19 pandemics is harder [64], and this is primarily due to the following major factors: - very little data are available; - limited understanding of the factors that contribute to the spread; - model accuracy is constrained by our knowledge of the virus; however,
with an emerging disease such as covid-19, many transmission-related biologic features are hard to measure and remain unknown; -the most obvious source of uncertainty affecting all models is that we don't know how many people are or have been infected; -ongoing issues with virologic testing mean that we are certainly missing a substantial number of cases, so models fitted to confirmed cases are likely to be highly uncertain [55] ; -the problem of using confirmed cases to fit models is further complicated because the fraction of confirmed cases is spatially heterogeneous and time-varying [123] ; -finally, many parameters associated with covid-19 transmission are poorly understood. amid enormous uncertainty about the future of the covid-19 pandemic, statistical, machine learning, and epidemiological models are critical forecasting tools for policymakers, clinicians, and public health practitioners [26, 76, 126, 36, 73, 130] . covid-19 modeling studies generally follow one of two general approaches that we will refer to as forecasting models and mechanistic models. although there are hybrid approaches, these two model types tend to address different questions on different time scales, and they deal differently with uncertainty [25] . compartmental epidemiological models have been developed over nearly a century and are well tested on data from past epidemics. these models are based on modeling the actual infection process and are useful for predicting long-term trajectories of the epidemic curves [25] . short-term forecasting models are often statistical, fitting a line or curve to data and extrapolating from there -like seeing a pattern in a sequence of numbers and guessing the next number, without incorporating the process that produces the pattern [23, 24, 26] . well constructed statistical frameworks can be used for short-term forecasts, using machine learning or regression. 
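as a concrete illustration of the compartmental approach described above, here is a minimal forward-euler sketch of the seir model (parameter values are illustrative placeholders, not fitted to covid-19 data):

```python
def seir_simulate(beta, sigma, gamma, s0, e0, i0, r0, days, dt=0.1):
    """Forward-Euler integration of the classic deterministic SEIR model:
    ds/dt = -beta*s*i, de/dt = beta*s*i - sigma*e,
    di/dt = sigma*e - gamma*i, dr/dt = gamma*i,
    with compartments expressed as fractions of a closed population."""
    s, e, i, r = s0, e0, i0, r0
    for _ in range(int(days / dt)):
        new_inf = beta * s * i              # new infections per unit time
        s, e, i, r = (s - dt * new_inf,
                      e + dt * (new_inf - sigma * e),
                      i + dt * (sigma * e - gamma * i),
                      r + dt * gamma * i)
    return s, e, i, r
```

because the four derivatives sum to zero, the total population fraction is conserved up to floating-point error, which is a useful sanity check for any implementation.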
in statistical models, the uncertainty of the prediction is generally presented as statistically computed prediction intervals around an estimate [51, 65]. given that what happens a month from now will depend on what happens in the interim, the estimated uncertainty should increase as we look further into the future. these models yield quantitative projections that policymakers may need to allocate resources or plan interventions in the short term. forecasting time series datasets has been a traditional research topic for decades, and various models have been developed to improve forecasting accuracy [27, 10, 49]. there are numerous methods available to forecast time series, including traditional statistical models and machine learning algorithms, providing many options for modelers working on epidemiological forecasting [24, 25, 20, 26, 80, 22, 97]. many research efforts have focused on developing a universal forecasting model but have failed, which is also evident from the "no free lunch" theorem [125]. this chapter focuses on assessing popularly used short-term forecasting (nowcasting) models for covid-19 from an empirical perspective. the findings of this chapter will fill a gap in the literature on the nowcasting of covid-19 by comparing various forecasting methods, understanding the global characteristics of pandemic data, and discovering real challenges for pandemic forecasters. the upcoming sections present a collection of recent findings on covid-19 forecasting. additionally, twenty nowcasting (statistical, machine learning, and hybrid) models are assessed for five countries: the united states of america (usa), india, brazil, russia, and peru. finally, some recommendations for policy-making decisions and limitations of these forecasting tools have been discussed.
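the widening of prediction intervals with the forecast horizon can be illustrated with a naive random-walk-with-drift forecaster (a hedged sketch; this is not one of the twenty models assessed in the chapter):

```python
import math

def naive_forecast_intervals(series, horizon, z=1.96):
    """Random-walk-with-drift point forecasts with Gaussian prediction
    intervals that widen like sqrt(h) as the horizon h grows."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    drift = sum(diffs) / len(diffs)
    var = sum((d - drift) ** 2 for d in diffs) / max(len(diffs) - 1, 1)
    sigma = math.sqrt(var)
    last = series[-1]
    return [(last + h * drift - z * sigma * math.sqrt(h),   # lower bound
             last + h * drift,                              # point forecast
             last + h * drift + z * sigma * math.sqrt(h))   # upper bound
            for h in range(1, horizon + 1)]
```

the sqrt(h) factor makes the interval at two steps ahead wider than at one step ahead, formalizing the point that estimated uncertainty should grow with the horizon.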
researchers face unprecedented challenges during this global pandemic in forecasting future real-time cases with traditional mathematical, statistical, forecasting, and machine learning tools [76, 126, 36, 73, 130]. studies in march using simple yet powerful forecasting methods, such as the exponential smoothing model, predicted cases ten days ahead with reasonable forecast error despite a positive bias [91]. early linear and exponential model forecasts, aimed at better preparation regarding hospital beds, icu admission estimation, resource allocation, and emergency funding, and at proposing strong containment measures, were conducted [46] and projected about 869 and 14542 icu admissions in italy for march 20, 2020. health-care workers had to endure immense mental stress, left with the formidable choice of prioritizing young and healthy adults over the elderly for the allocation of life support, mostly unwillingly ignoring those who are extremely unlikely to survive [35, 100]. real estimates of mortality with a 14-day delay demonstrated an underestimation of the covid-19 outbreak and indicated a grave future, with a global case fatality rate (cfr) of 5.7% in march [13]. contact tracing, quarantine, and isolation efforts have a differential effect on covid-19 mortality among countries. even though the cfr of covid-19 seems lower than that of other deadly epidemics, there are concerns about it eventually returning like the seasonal flu, causing a second wave or a future pandemic [89, 95]. mechanistic models, like susceptible-exposed-infectious-recovered (seir) frameworks, try to mimic the way covid-19 spreads and are used to forecast or simulate future transmission scenarios under various assumptions about the parameters governing transmission, disease, and immunity [56, 52, 8, 29, 77]. mechanistic modeling is one of the only ways to explore possible long-term epidemiologic outcomes [7]. for example, the model from ferguson et al.
[40] that has been used to guide policy responses in the united states and britain examines how many covid-19 deaths may occur over the next two years under various social distancing measures. kissler et al. [70] ask whether we can expect seasonal, recurrent epidemics if immunity against novel coronavirus functions similarly to immunity against the milder coronaviruses that we transmit seasonally. in a detailed mechanistic model of boston-area transmission, aleta et al. [4] simulate various lockdown "exit strategies". these models are a way to formalize what we know about the viral transmission and explore possible futures of a system that involves nonlinear interactions, which is almost impossible to do using intuition alone [54, 85] . although these epidemiological models are useful for estimating the dynamics of transmission, targeting resources, and evaluating the impact of intervention strategies, the models require parameters and depend on many assumptions. several statistical and machine learning methods for real-time forecasting of the new and cumulative confirmed cases of covid-19 are developed to overcome limitations of the epidemiological model approaches and assist public health planning and policy-making [25, 91, 6, 26, 23] . real-time forecasting with foretelling predictions is required to reach a statistically validated conjecture in this current health crisis. some of the leading-edge research concerning real-time projections of covid-19 confirmed cases, recovered cases, and mortality using statistical, machine learning, and mathematical time series modeling are given in table 1 . a univariate time series is the simplest form of temporal data and is a sequence of real numbers collected regularly over time, where each number represents a value [28, 18] . 
there are broadly two major steps involved in univariate time series forecasting [60] : (i) studying the global characteristics of the time series data, and (ii) analyzing the data with the 'best-fitted' forecasting model. understanding the global characteristics of pandemic confirmed-cases data can help forecasters determine what kind of forecasting method will be appropriate for the given situation [120] . as such, we aim to perform a meaningful data analysis, including the study of time series characteristics, to provide a suitable and comprehensive knowledge foundation for the subsequent step of selecting an apt forecasting method. thus, we take the path of using statistical measures to understand pandemic time series characteristics and to assist method selection and data analysis. these characteristics carry summarized information about the time series, capturing the 'global picture' of the datasets. based on the recommendations of [32, 122, 75, 74] , we study several classical and advanced time series characteristics of covid-19 data. this study considers eight global characteristics of the time series: periodicity, stationarity, serial correlation, skewness, kurtosis, nonlinearity, long-term dependence, and chaos. this collection of measures provides quantified descriptions and gives a rich portrait of the nature of the pandemic time series. a brief description of these statistical and advanced time-series measures is given below. a seasonal pattern exists when a time series is influenced by seasonal factors, such as the month of the year or the day of the week. the seasonality of a time series is defined as a pattern that repeats itself over fixed intervals of time [18] . in general, seasonality can be found by identifying a large autocorrelation coefficient or a large partial autocorrelation coefficient at the seasonal lag.
since periodicity is very important for determining seasonality and examining the cyclic pattern of a time series, periodicity feature extraction becomes a necessity. unfortunately, many time series from different domains do not have a known frequency or a regular periodicity. seasonal time series are sometimes also called cyclic series, although there is a significant distinction between them: cyclic data have varying frequency lengths, whereas seasonality has a fixed length over each period. for time series with no seasonal pattern, the frequency is set to 1. seasonality is tested using the 'stl' function within the "stats" package in r statistical software [60] . stationarity is the most fundamental statistical property tested for in time series analysis because most statistical models require that the underlying generating process be stationary [27] . stationarity means that a time series (or rather the process rendering it) does not change over time. in statistics, a unit root test assesses whether a time series variable is non-stationary and possesses a unit root [93] . the null hypothesis is generally defined as the presence of a unit root, and the alternative hypothesis is either stationarity, trend stationarity, or an explosive root, depending on the test used. in econometrics, the kwiatkowski-phillips-schmidt-shin (kpss) test is used for testing the null hypothesis that an observable time series is stationary around a deterministic trend (that is, trend-stationary) against the alternative of a unit root [108] . the kpss test is done using the 'kpss.test' function within the "tseries" package in r statistical software [117] . serial correlation is the relationship between a variable and a lagged version of itself over various time intervals. serial correlation occurs in time-series studies when the errors associated with a given time period carry over into future time periods [18] .
we have used the box-pierce statistic [19] in our approach to estimate the serial correlation measure and to extract this measure from the covid-19 data. the box-pierce statistic was designed by box and pierce in 1970 for testing residuals from a forecast model [122] . it is a common portmanteau test for computing the measure. the mathematical formula of the box-pierce statistic is as follows:
$$ Q = n \sum_{k=1}^{h} r_k^2, $$
where $n$ is the length of the time series, $h$ is the maximum lag being considered (usually $h$ is chosen as 20), and $r_k$ is the autocorrelation function at lag $k$. the portmanteau test is done using the 'box.test' function within the "stats" package in r statistical software [63] . nonlinear time series models have been used extensively to model complex dynamics not adequately represented by linear models [67] . nonlinearity is one important time series characteristic for determining the selection of an appropriate forecasting method [115] . there are many approaches to testing for nonlinearity in time series models, including a nonparametric kernel test and a neural network test [119] . in comparative studies between these two approaches, the neural network test has been reported to have better reliability [122] . in this research, we used teräsvirta's neural network test [112] for measuring the nonlinearity of the time series data. it has been widely accepted and reported that it can correctly model the nonlinear structure of the data [113] . it is a test for neglected nonlinearity, likely to have power against a range of alternatives based on the nn model (augmented single-hidden-layer feedforward neural network model). the test statistic takes large values when the series is nonlinear and values near zero when the series is linear. the test is done using the 'nonlinearitytest' function within the "nonlineartseries" package in r statistical software [42] . skewness is a measure of symmetry, or more precisely, the lack of symmetry.
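the box-pierce statistic $Q = n \sum_{k=1}^{h} r_k^2$ described above is simple enough to compute directly; this numpy/scipy sketch stands in for r's 'box.test' (statsmodels' `acorr_ljungbox(..., boxpierce=True)` offers the same test):

```python
# minimal box-pierce portmanteau test: q = n * sum_{k=1}^{h} r_k^2,
# compared against a chi-squared distribution with h degrees of freedom.
import numpy as np
from scipy.stats import chi2

def box_pierce(y, h=20):
    y = np.asarray(y, dtype=float)
    n = len(y)
    yc = y - y.mean()
    denom = np.sum(yc ** 2)
    # sample autocorrelations r_1 .. r_h
    r = np.array([np.sum(yc[k:] * yc[:-k]) / denom for k in range(1, h + 1)])
    q = n * np.sum(r ** 2)
    p_value = chi2.sf(q, df=h)  # null: the series is independent (white noise)
    return q, p_value

rng = np.random.default_rng(0)
ar1 = np.zeros(300)
for t in range(1, 300):  # strongly autocorrelated ar(1) series
    ar1[t] = 0.8 * ar1[t - 1] + rng.normal()
q, p = box_pierce(ar1)
print(f"q = {q:.1f}, p = {p:.2g}")
```

for this persistent ar(1) example the statistic is far above the chi-squared critical value, so the independence null is rejected, mirroring the serial-correlation finding reported for the covid-19 series.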
a distribution, or dataset, is symmetric if it looks the same to the left and the right of the center point [122] . a skewness measure is used to characterize the degree of asymmetry of values around the mean value [83] . for univariate data $y_t$, the skewness coefficient is
$$ S = \frac{1}{n \sigma^3} \sum_{t=1}^{n} (y_t - \bar{y})^3, $$
where $\bar{y}$ is the mean, $\sigma$ is the standard deviation, and $n$ is the number of data points. the skewness for a normal distribution is zero, and any symmetric data should have skewness near zero. negative values of the skewness indicate data that are skewed left, and positive values indicate data that are skewed right. in other words, left skewness means that the left tail is heavier than the right tail; similarly, right skewness means the right tail is heavier than the left tail [69] . skewness is calculated using the 'skewness' function within the "e1071" package in r statistical software [81] . kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution [83] . a dataset with high kurtosis tends to have a distinct peak near the mean, decline rather rapidly, and have heavy tails. datasets with low kurtosis tend to have a flat top near the mean rather than a sharp peak. for a univariate time series $y_t$, the kurtosis coefficient is
$$ K = \frac{1}{n \sigma^4} \sum_{t=1}^{n} (y_t - \bar{y})^4. $$
the kurtosis for a standard normal distribution is 3; therefore, the excess kurtosis is defined as $K - 3$, so the standard normal distribution has an excess kurtosis of zero. positive excess kurtosis indicates a 'peaked' distribution and negative excess kurtosis indicates a 'flat' distribution [47] . kurtosis is calculated using the 'kurtosis' function within the "performance-analytics" package in r statistical software [90] . processes with long-range dependence have attracted a good deal of attention from a probabilistic perspective in time series analysis [98] .
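the skewness and kurtosis coefficients above can be computed directly in python; scipy is used here as a stand-in for the "e1071" and "performanceanalytics" r packages named in the text, and the growth curve below is a synthetic illustration, not one of the paper's datasets:

```python
# skewness and excess kurtosis of an exponentially growing series, which
# loosely mimics early epidemic counts: right-skewed and heavy-tailed.
import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(1)
cases = np.exp(0.05 * np.arange(120)) + rng.normal(scale=0.5, size=120)

s = skew(cases)      # > 0: the right tail is heavier (right-skewed)
k = kurtosis(cases)  # scipy returns excess kurtosis, i.e. the k - 3 above
print(f"skewness = {s:.2f}, excess kurtosis = {k:.2f}")
```

scipy's `kurtosis` defaults to fisher's definition, so it reports the excess kurtosis directly, matching the $K - 3$ convention used in the text.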
given the increasing importance of 'self-similarity' or 'long-range dependence' as a time series characteristic, we include this feature in the group of pandemic data characteristics. the definition of self-similarity is most related to the self-similarity parameter, also called the hurst exponent ($h$) [15] . the class of autoregressive fractionally integrated moving average (arfima) processes [44] provides a good estimation method for computing $h$. in an arima$(p, d, q)$ model, $p$ is the order of the ar part, $d$ is the degree of first differencing involved, and $q$ is the order of the ma part. if the time series is suspected of exhibiting long-range dependence, the parameter $d$ may be replaced by certain non-integer values in the arfima model [21] . we fit an arfima$(0, d, 0)$ model by maximum likelihood, which is approximated using the fast and accurate method of haslett and raftery [50] . we then estimate the hurst parameter using the relation $h = d + 0.5$. the self-similarity feature can only be detected in the raw data of the time series. the value of $h$ can be obtained using the 'hurstexp' function within the "pracma" package in r statistical software [16] . many systems in nature that were previously considered random processes are now categorized as chaotic systems. for several years, lyapunov characteristic exponents have been of interest in the study of dynamical systems as a way to characterize quantitatively their stochasticity properties, related essentially to the exponential divergence of nearby orbits [39] . nonlinear dynamical systems often exhibit chaos, characterized by sensitive dependence on initial values, or more precisely by a positive lyapunov exponent (le) [38] . recognizing and quantifying chaos in time series are essential steps toward understanding the nature of random behavior and revealing the extent to which short-term forecasts may be improved [53] . the le, as a measure of the divergence of nearby trajectories, has been used to quantify chaos by giving it a quantitative value [14] .
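a rough hurst-exponent sketch follows. note the paper estimates $h$ via an arfima$(0, d, 0)$ fit ($h = d + 0.5$); the classical rescaled-range (r/s) estimator below is a different, simpler estimator chosen only because it fits in a few lines of numpy:

```python
# rescaled-range (r/s) hurst estimate: the slope of log(r/s) vs log(window
# size) approximates h; values near 1 indicate a strongly trending series.
import numpy as np

def hurst_rs(y, min_chunk=8):
    y = np.asarray(y, dtype=float)
    n = len(y)
    sizes, rs_vals = [], []
    size = min_chunk
    while size <= n // 2:
        rs_per_chunk = []
        for start in range(0, n - size + 1, size):
            chunk = y[start:start + size]
            dev = np.cumsum(chunk - chunk.mean())
            r = dev.max() - dev.min()   # range of cumulative deviations
            s = chunk.std()
            if s > 0:
                rs_per_chunk.append(r / s)
        sizes.append(size)
        rs_vals.append(np.mean(rs_per_chunk))
        size *= 2
    slope, _ = np.polyfit(np.log(sizes), np.log(rs_vals), 1)
    return slope

rng = np.random.default_rng(7)
trend = np.cumsum(np.abs(rng.normal(size=1024)))  # persistent, trending series
print(f"h ≈ {hurst_rs(trend):.2f}")
```

on a persistent series like this one, the estimate lands near one, the same qualitative behaviour the paper reports for all five covid-19 datasets.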
the algorithm for computing the le from a time series is applied to continuous dynamical systems in an n-dimensional phase space [101] . the le is calculated using the 'lyapunov exponent' function within the "tserieschaos" package in r statistical software [9] . time series forecasting models work by taking a series of historical observations and extrapolating future patterns. these work well when the data are accurate and the future is similar to the past. forecasting tools are designed to predict possible future alternatives and to help current planning and decision making [10] . there are essentially three general approaches to forecasting a time series [82] : 1. generating forecasts from an individual model; 2. combining forecasts from many models (forecast model averaging); 3. hybrid experts for time series forecasting. single (individual) forecasting models are either traditional statistical methods or modern machine learning tools. we study ten popularly used single forecasting models from the classical time series, advanced statistics, and machine learning literature. there has been a vast literature on forecast combinations, motivated by the seminal work of bates & granger [12] and followed by a plethora of empirical applications showing that combination forecasts are often superior to their counterparts (see, for example, [17, 114] ). combining forecasts using a weighted average is considered a successful way of hedging against the risk of selecting a misspecified model [31] . a significant challenge lies in choosing an appropriate set of weights, and many attempts to do this have performed worse than simply using equal weights, something that has become known as the "forecast combination puzzle" (see, for example, [109] ). to overcome this, hybrid models became popular with the seminal work of zhang [127] and were further extended for epidemic forecasting in [24, 26, 23] . the forecasting methods can be briefly reviewed and organized in the architecture shown in figure 1 .
the autoregressive integrated moving average (arima) model is one of the well-known linear models in time-series forecasting, developed in the early 1970s [18] . it is widely used to track linear tendencies in stationary time-series data. it is denoted by arima$(p, d, q)$, where the three components have significant meanings: the parameters $p$ and $q$ represent the orders of the ar and ma parts, respectively, and $d$ denotes the level of differencing applied to convert nonstationary data into a stationary time series [78] . the arima model can be mathematically expressed as follows:
$$ y_t = \beta_1 y_{t-1} + \beta_2 y_{t-2} + \cdots + \beta_p y_{t-p} + \epsilon_t - \alpha_1 \epsilon_{t-1} - \alpha_2 \epsilon_{t-2} - \cdots - \alpha_q \epsilon_{t-q}, $$
where $y_t$ denotes the actual value of the variable at time $t$, $\epsilon_t$ denotes the random error at time $t$, and $\beta_i$ and $\alpha_j$ are the coefficients of the model. some necessary steps to be followed for any given time-series dataset to build an arima model are as follows: -identification of the model (achieving stationarity); -use of the autocorrelation function (acf) and partial acf plots to select the ar and ma model parameters, respectively, and estimation of the model parameters for the arima model; -selection of the 'best-fitted' forecasting model using the akaike information criterion (aic) or the bayesian information criterion (bic). finally, one checks the model diagnostics to measure its performance. an implementation in r statistical software is available using the 'auto.arima' function under the "forecast" package, which returns the 'best' arima model according to either aic or bic values [61] . wavelet analysis is a mathematical tool that can reveal information within signals in both the time and scale (frequency) domains. this property overcomes the primary drawback of fourier analysis: the wavelet transform maps the original signal data (especially in the time domain) into a different domain for data analysis and processing. wavelet-based models are most suitable for nonstationary data, unlike the standard arima model.
most epidemic time-series datasets are nonstationary; therefore, wavelet transforms are used as a forecasting model for these datasets [26] . when conducting wavelet analysis in the context of time series analysis [5] , the selection of the optimal number of decomposition levels is vital to determining the performance of the model in the wavelet domain. the formula $W_L = \mathrm{int}[\log(n)]$ is used to select the number of decomposition levels, where $n$ is the time-series length. the wavelet-based arima (warima) model transforms the time series data using a hybrid maximal overlap discrete wavelet transform (modwt) algorithm with a 'haar' filter [88] . daubechies wavelets can capture recurring events across the observed time series in ways that most other time series prediction models cannot recognize. the necessary steps of a wavelet-based forecasting model, as defined by [5] , are as follows. firstly, the daubechies wavelet transformation with a chosen decomposition level is applied to the nonstationary time series data. secondly, the series is reconstructed by removing the high-frequency component, using the wavelet denoising method. lastly, an appropriate arima model is applied to the reconstructed series to generate out-of-sample forecasts of the given time series data. wavelets were first considered as a family of functions by morlet [121] , constructed from the translations and dilations of a single function, which is called the "mother wavelet". these wavelets are defined as follows:
$$ \phi_{m,n}(t) = \frac{1}{\sqrt{|m|}} \, \phi\!\left(\frac{t-n}{m}\right), \qquad m, n \in \mathbb{R}, \; m \neq 0, $$
where the parameter $m \, (\neq 0)$ is denoted the scaling parameter or scale, and it measures the degree of compression. the parameter $n$ is used to determine the time location of the wavelet, and it is called the translation parameter.
if $|m| < 1$, then the wavelet in $m$ is a compressed version (with smaller support in the time domain) of the mother wavelet and corresponds primarily to higher frequencies, while if $|m| > 1$, then $\phi_{m,n}(t)$ has a larger time width than $\phi(t)$ and corresponds to lower frequencies. hence wavelets have a time width adapted to their frequencies, which is the main reason behind the success of the morlet wavelets in signal processing and time-frequency signal analysis [86] . an implementation of the warima model is available using the 'waveletfittingarma' function under the "waveletarima" package in r statistical software [87] . fractionally autoregressive integrated moving average or autoregressive fractionally integrated moving average (arfima) models are a generalized version of the arima model in time series forecasting that allows non-integer values of the differencing parameter [44] . it may sometimes happen that our time-series data are not stationary, but differencing with an integer value of the parameter $d$ may over-difference them. to overcome this problem, it is necessary to difference the time series data using a fractional value. these models are useful for modeling time series whose deviations from the long-run mean decay more slowly than exponentially; such models can deal with time-series data having long memory [94] . arfima models can be mathematically expressed as follows:
$$ \phi(B)\,(1-B)^d\, y_t = \theta(B)\, \epsilon_t, $$
where $B$ is the backshift operator, $\phi(B)$ and $\theta(B)$ are the ar and ma polynomials of orders $p$ and $q$, respectively, and $d$ is the differencing term (allowed to take non-integer values). an r implementation of the arfima model is available with the 'arfima' function under the "forecast" package [61] . an arfima$(p, d, q)$ model is selected and estimated automatically using the hyndman-khandakar (2008) [59] algorithm to select $p$ and $q$, and the haslett and raftery (1989) [50] algorithm to estimate the parameters, including $d$. exponential smoothing state space methods are very effective in time series forecasting.
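the fractional differencing operator $(1-B)^d$ at the heart of arfima expands into an infinite series of lag weights; this small sketch computes the leading weights via the binomial recursion $w_k = w_{k-1}\,(k-1-d)/k$:

```python
# weights of (1 - b)^d: for fractional d they decay slowly (long memory),
# while for integer d the series truncates to the familiar difference filter.
import numpy as np

def frac_diff_weights(d, n_weights):
    w = [1.0]
    for k in range(1, n_weights):
        w.append(w[-1] * (k - 1 - d) / k)
    return np.array(w)

print(np.round(frac_diff_weights(d=0.4, n_weights=6), 4))
# integer d truncates: d=1 gives [1, -1, 0, 0], the ordinary first difference
print(frac_diff_weights(d=1.0, n_weights=4))
```

the slow decay of the weights for $0 < d < 0.5$ is exactly the long-memory behaviour, deviations from the long-run mean fading more slowly than exponentially, that the text attributes to arfima.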
exponential smoothing was proposed in the late 1950s [124] and has motivated some of the most successful forecasting methods. forecasts produced using exponential smoothing methods are weighted averages of past observations, with the weights decaying exponentially as the observations get older. the ets models belong to the family of state-space models and consist of three components: an error component (e), a trend component (t), and a seasonal component (s). this method is used to forecast univariate time series data. each model consists of a measurement equation that describes the observed data, and some state equations that describe how the unobserved components or states (level, trend, seasonal) change over time [60, 59] ; hence, these are referred to as state-space models. an r implementation of the model is available in the 'ets' function under the "forecast" package [61] . as an extension of the autoregressive model, the self-exciting threshold autoregressive (setar) model is used to model time series data, in order to allow for a higher degree of flexibility in the model parameters through regime-switching behaviour [116] . given time-series data $y_t$, the setar model is used to predict future values, assuming that the behavior of the time series changes once the series enters a different regime. the switch from one regime to another depends on the past values of the series. the model consists of $k$ autoregressive (ar) parts, each for a different regime. the model is usually denoted as setar$(k, p)$, where $k$ is the number of thresholds, there are $k+1$ regimes in the model, and $p$ is the order of the autoregressive part. for example, suppose an ar(1) model is assumed in both regimes; then a 2-regime setar model is given by [41] :
$$ y_t = \begin{cases} \phi_{0,1} + \phi_{1,1}\, y_{t-1} + \epsilon_t & \text{if } y_{t-1} \leq c, \\ \phi_{0,2} + \phi_{1,2}\, y_{t-1} + \epsilon_t & \text{if } y_{t-1} > c, \end{cases} $$
where, for the moment, the $\epsilon_t$ are assumed to be an i.i.d. white noise sequence conditional upon the history of the time series, and $c$ is the threshold value.
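the 2-regime setar recursion above is easy to see in action; the coefficients and threshold below are illustrative choices (the paper fits such models with r's "tsdyn" package, not this sketch):

```python
# simulate a setar(1, 1) process: the ar dynamics switch depending on
# whether the previous value y_{t-1} is below or above the threshold c.
import numpy as np

def simulate_setar(n, c=0.0, seed=5):
    rng = np.random.default_rng(seed)
    y = np.zeros(n)
    for t in range(1, n):
        eps = rng.normal(scale=0.5)
        if y[t - 1] <= c:                 # lower regime
            y[t] = 1.0 + 0.6 * y[t - 1] + eps
        else:                             # upper regime
            y[t] = -0.5 + 0.8 * y[t - 1] + eps
    return y

y = simulate_setar(500)
print(f"mean = {y.mean():.2f}, share in upper regime = {(y > 0).mean():.2f}")
```

because both regimes have ar coefficients below one in absolute value, the simulated path stays bounded while repeatedly crossing the threshold.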
the setar model assumes that the border between the two regimes is given by a specific value of the threshold variable $y_{t-1}$. the model can be implemented using the 'setar' function under the "tsdyn" package in r [34] . bayesian statistics has many applications in statistical techniques such as regression, classification, clustering, and time series analysis. scott and varian [105] used structural time series models to show how google search data can be used to improve short-term forecasts of economic time series. in the structural time series model, the observation at time $t$, $y_t$, is defined as follows:
$$ y_t = x_t^{T} \beta_t + \epsilon_t, $$
where $\beta_t$ is the vector of latent variables, $x_t$ is the vector of model parameters, and the $\epsilon_t$ are assumed to follow normal distributions with zero mean and variance $H_t$. in addition, $\beta_t$ evolves as follows:
$$ \beta_{t+1} = T_t \beta_t + \delta_t, $$
where the $\delta_t$ are assumed to follow normal distributions with zero mean and variance $Q_t$, and $T_t$ is the transition matrix. a gaussian distribution is selected as the prior of the bsts model since we use occurrence frequency values ranging from 0 to $\infty$ [66] . an r implementation is available under the "bsts" package [103] , where one can add local linear trend and seasonal components as required. the state specification is passed as an argument to the 'bsts' function, along with the data and the desired number of markov chain monte carlo (mcmc) iterations, and the model is fit using an mcmc algorithm [104] . the 'theta method' or 'theta model' is a univariate time series forecasting technique that performed particularly well in the m3 forecasting competition and is of interest to forecasters [11] . the method decomposes the original data into two or more lines, called theta lines, and extrapolates them using forecasting models; finally, the predictions are combined to obtain the final forecasts. the theta lines can be estimated by simply modifying the 'curvatures' of the original time series [110] .
this change is obtained from a coefficient, called the $\theta$ coefficient, which is applied directly to the second differences of the time series:
$$ y''_{\mathrm{new}}(\theta) = \theta\, y''_t, \qquad \text{where } y''_t = y_t - 2 y_{t-1} + y_{t-2} $$
at time $t$, for $t = 3, 4, \cdots, n$, and $\{y_1, y_2, \cdots, y_n\}$ denotes the observed univariate time series. in practice, the coefficient $\theta$ can be considered a transformation parameter that creates a series with the same mean and slope as the original data but a different variance. now, eqn. (2) is a second-order difference equation and has a solution of the following form [62] :
$$ y_{\mathrm{new}}(\theta) = a_\theta + b_\theta (t - 1) + \theta\, y_t, $$
where $a_\theta$ and $b_\theta$ are constants and $t = 1, 2, \cdots, n$. thus, $y_{\mathrm{new}}(\theta)$ is equivalent to a linear function of $y_t$ with a linear trend added. the values of $a_\theta$ and $b_\theta$ are computed by minimizing the sum of squared differences:
$$ \sum_{t=1}^{n} \left[ y_t - y_{\mathrm{new}}(\theta) \right]^2 = \sum_{t=1}^{n} \left[ (1 - \theta)\, y_t - a_\theta - b_\theta (t - 1) \right]^2. $$
forecasts from the theta model are obtained by a weighted average of forecasts of $y_{\mathrm{new}}(\theta)$ for different values of $\theta$. also, the prediction intervals and likelihood-based estimation of the parameters can be obtained based on a state-space model, as demonstrated in [62] . an r implementation of the theta model is possible with the 'thetaf' function in the "forecast" package [61] . the main objective of the tbats model is to deal with complex seasonal patterns using exponential smoothing [33] . the name is an acronym for the key features of the model: trigonometric seasonality (t), box-cox transformation (b), arma errors (a), trend (t), and seasonal (s) components. tbats makes it easy for users to handle data with multiple seasonal patterns. this model is preferable when the seasonality changes over time [60] . the observation equation of the tbats model can be described as follows:
$$ y_t^{(\mu)} = l_{t-1} + \phi\, b_{t-1} + \sum_{i=1}^{T} s_{t-m_i}^{(i)} + d_t, $$
where $y_t^{(\mu)}$ is the time series at time point $t$ (box-cox transformed), $s_t^{(i)}$ is the $i$-th seasonal component, $l_t$ is the local level, $b_t$ is the trend with damping parameter $\phi$, $d_t$ is the arma$(p, q)$ process for the residuals, and $e_t$ is gaussian white noise.
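the classical theta decomposition described above can be sketched in a few lines of numpy: the $\theta = 0$ line is the fitted linear trend, the $\theta = 2$ line (with doubled curvature) is extrapolated with simple exponential smoothing, and the two forecasts are averaged with equal weights (the 'thetaf'-style default; the smoothing constant here is an illustrative assumption):

```python
# classical theta method: average the extrapolated theta-0 (linear trend)
# and theta-2 (ses-forecast) lines to produce the final forecast.
import numpy as np

def theta_forecast(y, h, alpha=0.5):
    y = np.asarray(y, dtype=float)
    n = len(y)
    t = np.arange(n)
    b, a = np.polyfit(t, y, 1)           # theta = 0 line: trend a + b*t
    trend_fc = a + b * np.arange(n, n + h)
    theta2 = 2.0 * y - (a + b * t)       # theta = 2 line, doubled curvature
    level = theta2[0]
    for value in theta2[1:]:             # simple exponential smoothing (ses)
        level = alpha * value + (1 - alpha) * level
    ses_fc = np.full(h, level)           # ses forecasts are flat
    return 0.5 * (trend_fc + ses_fc)     # equal-weight combination

y = 10 + 2.0 * np.arange(50) + np.sin(np.arange(50))
print(np.round(theta_forecast(y, h=3), 2))
```

on this near-linear series the combined forecast continues the trend while the ses component anchors it to the recent level.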
the tbats model can be implemented using the 'tbats' function under the "forecast" package in r statistical software [61] . forecasting with artificial neural networks (ann) has received increasing interest in various research and applied domains since the late 1990s, and it has been given special attention in epidemiological forecasting [92] . multi-layered feed-forward neural networks with back-propagation learning rules are the most widely used models for classification and prediction problems [129] . in a simple feed-forward neural net there is a single hidden layer between the input and output layers, with weights connecting the layers. denote by $\omega_{ji}$ the weights between the input layer and the hidden layer, and by $\nu_{kj}$ the weights between the hidden and output layers. based on the given inputs $x_i$, the neuron's net input is calculated as the weighted sum of its inputs, and the output of the neuron, $y_j$, is given by a sigmoidal function of the magnitude of this net input [43] . for the $j$-th hidden neuron, the net input and output are
$$ net_j^h = \sum_{i} \omega_{ji}\, x_i \quad \text{and} \quad y_j = f(net_j^h), $$
and for the $k$-th output neuron,
$$ net_k^o = \sum_{j=1}^{J} \nu_{kj}\, y_j \quad \text{and} \quad o_k = f(net_k^o), \quad \text{with} \quad f(net) = \frac{1}{1 + e^{-\lambda\, net}}, $$
where $\lambda \in (0, 1)$ is a parameter used to control the gradient of the function and $J$ is the number of neurons in the hidden layer. the back-propagation [102] learning algorithm is the most commonly used technique in ann. in the error back-propagation step, the weights in the ann are updated by minimizing
$$ E_p = \frac{1}{2} \sum_{k} (d_{pk} - o_{pk})^2, $$
where $d_{pk}$ is the desired output of neuron $k$ for input pattern $p$. a common formula for the number of neurons in the hidden layer is $h = \frac{i + j}{2} + \sqrt{d}$, where $i$ and $j$ are the numbers of input and output neurons, respectively, and $d$ denotes the number of training patterns [128] . the application of ann to time series data is possible with the 'mlp' function under the "nnfor" package in r [72] . the autoregressive neural network (arnn) received attention in the time series literature in the late 1990s [37] .
the architecture of a simple feedforward neural network can be described as a network of neurons arranged in an input layer, a hidden layer, and an output layer in a prescribed order. each layer passes information to the next layer using weights that are obtained via a learning algorithm [128] . the arnn model is a modification of the simple ann model especially designed for prediction problems on time series datasets [37] . the arnn model uses a pre-specified number of lagged values of the time series as inputs, and the number of hidden neurons in its architecture is also fixed [60] . the arnn$(p, k)$ model uses $p$ lagged inputs of the time series data in a one-hidden-layer feedforward neural network with $k$ hidden units in the hidden layer. let $x$ denote the $p$-lagged inputs and $f$ be a neural network with the following architecture:
$$ f(x) = c_0 + \sum_{j=1}^{k} w_j\, \phi\!\left(a_j + b_j^{\prime} x\right), $$
where $c_0, a_j, w_j$ are connecting weights, $b_j$ are $p$-dimensional weight vectors, and $\phi$ is a bounded nonlinear sigmoidal function (e.g., the logistic squasher function or the hyperbolic tangent activation function). these weights are trained using gradient descent back-propagation [102] . a standard ann faces the dilemma of choosing the number of hidden neurons in the hidden layer, and the optimal choice is unknown. for the arnn model, however, we adopt the formula $k = [(p+1)/2]$ for non-seasonal time series data, where $p$ is the number of lagged inputs in an autoregressive model [60] . the arnn model can be applied using the 'nnetar' function available in the r statistical package "forecast" [61] . the idea of ensemble time series forecasts was given by bates and granger (1969) in their seminal work [12] . forecasts generated from arima, ets, theta, arnn, and warima models can be combined with equal weights, weights based on in-sample errors, or cross-validated weights. in the ensemble framework, cross-validation for time series data with user-supplied models and forecasting functions is also possible in order to evaluate model accuracy [106] .
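the ann/arnn autoregression described above can be sketched by embedding the series into $p$ lagged inputs and training a single-hidden-layer network with $k = [(p+1)/2]$ hidden neurons, the arnn rule quoted in the text; sklearn's `MLPRegressor` stands in here for r's 'nnetar'/'mlp' functions (an assumption of this example):

```python
# arnn-style autoregression: p lagged inputs -> one hidden layer with
# k = (p + 1) // 2 neurons -> next value of the series.
import numpy as np
from sklearn.neural_network import MLPRegressor

def lag_embed(y, p):
    X = np.column_stack([y[i:len(y) - p + i] for i in range(p)])
    return X, y[p:]

rng = np.random.default_rng(21)
t = np.arange(400)
y = np.sin(0.2 * t) + 0.05 * rng.normal(size=400)  # noisy oscillating series

p = 6
X, target = lag_embed(y, p)
net = MLPRegressor(hidden_layer_sizes=((p + 1) // 2,), solver="lbfgs",
                   max_iter=2000, random_state=0)
net.fit(X[:-50], target[:-50])
print(f"test r^2 = {net.score(X[-50:], target[-50:]):.3f}")
```

holding out the last 50 points mimics the train/test split used for the covid-19 series later in the paper.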
combining several candidate models can hedge against an incorrect model specification. bates and granger (1969) [12] suggested such an approach and observed, somewhat surprisingly, that the combined forecast can even outperform the single best component forecast. while combination weights selected equally or proportionally to past model errors are possible approaches, many more sophisticated combination schemes have been suggested. for example, rather than normalizing weights to sum to unity, unconstrained and even negative weights could be possible [45] . the simple equal-weights combination might appear woefully obsolete and probably non-competitive compared to the multitude of sophisticated combination approaches or advanced machine learning and neural network forecasting models, especially in the age of big data. however, such simple combinations can still be competitive, particularly for pandemic time series [106] . a flow diagram of the ensemble method is presented in figure 2 . the ensemble method of [12] produces forecasts out to a horizon $h$ by applying a weight $w_m$ to each of the $n$ model forecasts in the ensemble. the ensemble forecast $f(i)$ for time horizon $1 \leq i \leq h$, with individual component model forecasts $f_m(i)$, is then
$$ f(i) = \sum_{m=1}^{n} w_m\, f_m(i). $$
the weights can be determined in several ways (for example, supplied by the user, set equally, determined by in-sample errors, or determined by cross-validation). the "forecasthybrid" package in r includes these component models in order to enhance the "forecast" package base models with easy ensembling (e.g., the 'hybridmodel' function in r statistical software) [107] . the idea of hybridizing time series models and combining different forecasts was first introduced by zhang [127] and further extended by [68, 24, 26, 23] .
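the combination $f(i) = \sum_{m} w_m f_m(i)$ above is a one-liner once the weights are fixed; this sketch uses weights inversely proportional to each model's in-sample error, one of the weighting schemes the text lists, with hypothetical forecasts and rmse values for illustration:

```python
# weighted ensemble of model forecasts: weights are normalized inverse
# in-sample errors, so more accurate models contribute more.
import numpy as np

def ensemble_forecast(forecasts, in_sample_errors):
    forecasts = np.asarray(forecasts, dtype=float)   # shape: (n_models, h)
    inv = 1.0 / np.asarray(in_sample_errors, dtype=float)
    w = inv / inv.sum()                              # weights sum to one
    return w @ forecasts

# hypothetical 3-model, 4-step-ahead forecasts and their in-sample rmse
f_m = [[100, 105, 110, 115],
       [ 98, 103, 109, 114],
       [110, 118, 125, 133]]
rmse = [2.0, 2.5, 8.0]
print(np.round(ensemble_forecast(f_m, rmse), 2))
```

setting all errors equal recovers the simple equal-weights combination that, as noted above, remains surprisingly competitive.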
the hybrid forecasting models are based on an error re-modeling approach, and there are broadly two types of error calculations popular in the literature, which are given below [84, 30] . in the additive error model, the forecaster treats the expert's estimate as a variable, $\hat{y}_t$, and thinks of it as the sum of two terms:
$$ \hat{y}_t = y_t + e_t, $$
where $y_t$ is the true value and $e_t$ is the additive error term. in the multiplicative error model, the forecaster treats the expert's estimate $\hat{y}_t$ as the product of two terms:
$$ \hat{y}_t = y_t \times e_t, $$
where $y_t$ is the true value and $e_t$ is the multiplicative error term. now, even if the relationship is of product type, in the log-log scale it becomes additive. hence, without loss of generality, we may assume the relationship to be additive and expect the (additive) errors of a forecasting model to be random shocks [23] . these hybrid models are useful for complex correlation structures where only a limited amount of knowledge is available about the data generating process. a simple example is the daily confirmed covid-19 cases for various countries, where very little is known about the structural properties of the current pandemic. the mathematical formulation of the proposed hybrid model ($z_t$) is as follows:
$$ z_t = l_t + n_t, $$
where $l_t$ is the linear part and $n_t$ is the nonlinear part of the hybrid model. we can estimate both $l_t$ and $n_t$ from the available time series data. let $\hat{l}_t$ be the forecast value of the linear model (e.g., arima) at time $t$ and $\epsilon_t$ represent the error residuals at time $t$ obtained from the linear model. then, we write
$$ \epsilon_t = y_t - \hat{l}_t. $$
these left-out residuals are further modeled by a nonlinear model (e.g., ann or arnn) and can be represented as follows:
$$ \epsilon_t = f(\epsilon_{t-1}, \epsilon_{t-2}, \cdots, \epsilon_{t-p}) + \varepsilon_t, $$
where $f$ is a nonlinear function modeled by the ann or arnn model as defined in eqn. (5), and $\varepsilon_t$ is supposed to be the random shocks. therefore, the combined forecast can be obtained as follows:
$$ \hat{z}_t = \hat{l}_t + \hat{n}_t, $$
where $\hat{n}_t$ is the forecast value of the nonlinear time series model.
an overall flow diagram of the proposed hybrid model is given in figure 3 . in the hybrid model, a nonlinear model is applied in the second stage to re-model the left-over autocorrelations in the residuals, which the linear model could not capture. thus, this can be considered an error re-modeling approach. this is important because, due to model misspecification and disturbances in the pandemic rate time series, the linear models may fail to produce white-noise behavior in the forecast residuals. thus, hybrid approaches can eventually improve the predictions for epidemiological forecasting problems, as shown in [24, 26, 23] . these hybrid models only assume that the linear and nonlinear components of the epidemic time series can be separated individually. the implementations of the hybrid models used in this study are available in [1] . five covid-19 time series datasets, for the usa, india, russia, brazil, and peru, are considered for assessing twenty forecasting models (individual, ensemble, and hybrid). the datasets are mostly nonlinear, nonstationary, and non-gaussian in nature. we have used root mean square error (rmse), mean absolute error (mae), mean absolute percentage error (mape), and symmetric mape (smape) to evaluate the predictive performance of the models used in this study. since the number of data points in these datasets is limited, advanced deep learning techniques would over-fit the datasets [51] . we use publicly available datasets to compare the various forecasting frameworks. covid-19 cases of the five countries with the highest numbers of cases were collected [2, 3] . the datasets and their descriptions are presented in table 2 . the characteristics of these five time series were examined using the hurst exponent, the kpss test, the terasvirta test, and the other measures described in section 3.
the hurst exponent (denoted by h), which ranges between zero and one, is calculated to measure the long-range dependency in a time series and provides a measure of long-term nonlinearity. for values of h near zero, the time series under consideration is mean-reverting: an increase in the value will be followed by a decrease in the series, and vice versa. when h is close to 0.5, the series has no autocorrelation with past values; such series are often called brownian motion. when h is near one, an increase or decrease in the value is most likely to be followed by a similar movement in the future. all five covid-19 datasets in this study possess hurst exponent values near one, which indicates that these time series are strongly persistent: an increase tends to be followed by another increase, and a decrease by another decrease. kpss tests are performed to examine the stationarity of a given time series. the null hypothesis of the kpss test is that the time series is stationary; thus, the series is nonstationary when the p-value is less than a threshold. from table 3, all five datasets can be characterized as non-stationary, as the p-value < 0.01 in each instance. the terasvirta test examines the linearity of a time series against the alternative that a nonlinear process has generated the series. it is observed that the usa, russia, and peru covid-19 datasets are likely to follow a nonlinear trend, whereas the india and brazil datasets show some linear trends. further, we examine serial correlation, skewness, kurtosis, and the maximum lyapunov exponent for the five covid-19 datasets. the results are reported in table 4. the serial correlation of the datasets is computed using the box-pierce test statistic for the null hypothesis of independence in a given time series. the p-values for each of the datasets were found to be below the significance level (see table 4), so the null hypothesis of independence is rejected, indicating that these covid-19 datasets exhibit serial correlation at lag one.
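the hurst exponent described above is commonly estimated by rescaled-range (R/S) analysis; the following is a naive numpy sketch of that procedure (the study's actual computation, e.g. via an r package such as 'pracma', may differ in detail):

```python
import numpy as np

def hurst_rs(x, min_chunk=8):
    """naive rescaled-range (R/S) estimate of the hurst exponent:
    average R/S over non-overlapping chunks at several scales, then
    fit the log-log slope of R/S against chunk size."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    sizes, rs = [], []
    size = min_chunk
    while size <= n // 2:
        vals = []
        for start in range(0, n - size + 1, size):
            chunk = x[start:start + size]
            dev = np.cumsum(chunk - chunk.mean())   # cumulative deviations
            r = dev.max() - dev.min()               # range of deviations
            s = chunk.std()                         # chunk standard deviation
            if s > 0:
                vals.append(r / s)
        if vals:
            sizes.append(size)
            rs.append(np.mean(vals))
        size *= 2
    slope, _ = np.polyfit(np.log(sizes), np.log(rs), 1)
    return slope
```

for white noise the estimate comes out near 0.5 (with some small-sample bias), while a strongly trending series yields values near one, matching the interpretation given in the text.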
skewness for the russia covid-19 dataset is found to be negative, whereas the other four datasets are positively skewed. this means that for the russia dataset the left tail is heavier than the right tail, while for the other four datasets the right tail is heavier than the left tail. the kurtosis value for the india dataset is found to be positive, while the other four datasets have negative kurtosis values. therefore, the covid-19 dataset of india tends to have a peaked distribution, and the other four datasets may have flat distributions. we observe that each of the five datasets is non-chaotic in nature, i.e., the maximum lyapunov exponents are less than unity. a summary of the implementation tools is presented in table 5. we used four popular accuracy metrics to evaluate the performance of different time series forecasting models. the expressions of these metrics are given below: rmse = sqrt((1/n) Σ_i (y_i − ŷ_i)²); mae = (1/n) Σ_i |y_i − ŷ_i|; mape = (100/n) Σ_i |y_i − ŷ_i| / |y_i|; smape = (100/n) Σ_i 2|y_i − ŷ_i| / (|y_i| + |ŷ_i|); where y_i are the actual series values, ŷ_i are the predictions by different models, and n represents the number of data points of the time series. the model with the smallest values of these accuracy metrics is the best forecasting model. this subsection is devoted to the experimental analysis of confirmed covid-19 cases using different time series forecasting models. the test period is chosen to be 15 days and 30 days, whereas the rest of the data is used as training data (see table 2). in the first columns of tables 6 and 7, we present the training data and test data for the usa, india, brazil, russia and peru. the autocorrelation function (acf) and partial autocorrelation function (pacf) plots are also depicted for the training period of each of the five countries in tables 6 and 7. acf and pacf plots are generated after applying the required number of differencing steps to each training data using the r function 'diff'. the required order of differencing is obtained using the r function 'ndiffs', which estimates the number of differences required to make a given time series stationary.
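the four metrics just defined are straightforward to compute; a small numpy helper (an illustration only — the study itself works in r) is:

```python
import numpy as np

def accuracy_metrics(y, yhat):
    """rmse, mae, mape and smape for actual values y and forecasts yhat."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    err = y - yhat
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    mape = 100 * np.mean(np.abs(err) / np.abs(y))                     # % error vs actual
    smape = 100 * np.mean(2 * np.abs(err) / (np.abs(y) + np.abs(yhat)))  # symmetric variant
    return {"rmse": rmse, "mae": mae, "mape": mape, "smape": smape}
```

note that mape is undefined when an actual value is zero, which is why smape is often reported alongside it for count series that can touch zero.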
the integer-valued order of differencing is then used as the value of 'd' in the arima(p, d, q) model. the other two parameters, 'p' and 'q', are obtained from the acf and pacf plots, respectively (see tables 6 and 7). however, we choose the 'best' fitted arima model using the aic value for each training dataset. table 6 presents the training data (black colored) and test data (red colored) and the corresponding acf and pacf plots for the five time-series datasets. further, we checked twenty different forecasting models as competitors for the short-term forecasting of covid-19 confirmed cases in five countries. 15-days and 30-days ahead forecasts were generated for each model, and accuracy metrics were computed to determine the best predictive models. from the ten popular single models, we choose the best one based on the accuracy metrics. on the other hand, one hybrid/ensemble model is selected from the rest of the ten models. the best-fitted arima parameters, ets, arnn, and arfima models for each country are reported in the respective tables. table 7 presents the training data (black colored) and test data (red colored) and the corresponding plots for the five datasets. twenty forecasting models are implemented on these pandemic time-series datasets. table 5 gives the essential details about the functions and packages required for implementation. results for usa covid-19 data: among the single models, arima(2, 1, 4) performs best in terms of accuracy metrics for 15-days ahead forecasts. tbats and arnn(16, 8) also have competitive accuracy metrics. the hybrid arima-arnn model improves the earlier arima forecasts and has the best accuracy among all hybrid/ensemble models (see table 8). hybrid arima-warima also does a good job and improves the arima model forecasts. in-sample and out-of-sample forecasts obtained from the arima and hybrid arima-arnn models are depicted in fig. 4(a). out-of-sample forecasts are generated using the whole dataset as training data.
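the aic-based order selection described above can be illustrated on the autoregressive part alone. the sketch below fits AR(p) by ordinary least squares for several candidate orders and keeps the order with the lowest aic; it is a toy stand-in for the full arima search performed in r, not the study's code:

```python
import numpy as np

def ar_aic(y, max_p=5):
    """fit AR(p) with intercept by least squares for p = 1..max_p and score
    each fit with aic; returns the best order and the aic of every order.
    note: each fit uses a slightly different effective sample (n - p), which
    is acceptable for a sketch but not for a rigorous comparison."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    aics = {}
    for p in range(1, max_p + 1):
        # lagged design matrix: column k holds y_{t-(k+1)} for t = p..n-1
        X = np.column_stack([y[p - k - 1:n - k - 1] for k in range(p)])
        X = np.column_stack([X, np.ones(n - p)])
        coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
        sse = np.sum((y[p:] - X @ coef) ** 2)
        m = n - p
        aics[p] = m * np.log(sse / m) + 2 * (p + 2)  # p lags + intercept + variance
    best = min(aics, key=aics.get)
    return best, aics
```

on data simulated from an AR(2) process, the aic of the order-2 fit drops well below that of the order-1 fit, so the selection never stops at p = 1.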
arfima(2,0,0) is found to have the best accuracy metrics for 30-days ahead forecasts among single forecasting models. bsts and setar also have good agreement with the test data in terms of accuracy metrics. the hybrid arima-warima model has the best accuracy among all hybrid/ensemble models (see table 9). in-sample and out-of-sample forecasts obtained from the arfima and hybrid arima-warima models are depicted in fig. 4(b). results for india covid-19 data: among the single models, ann performs best in terms of accuracy metrics for 15-days ahead forecasts. arima(1,2,5) also has competitive accuracy metrics in the test period. the hybrid arima-arnn model improves the arima(1,2,5) forecasts and has the best accuracy among all hybrid/ensemble models (see table 10). hybrid arima-ann and hybrid arima-warima also do a good job and improve the arima model forecasts. in-sample and out-of-sample forecasts obtained from the ann and hybrid arima-arnn models are depicted in fig. 5(a). out-of-sample forecasts are generated using the whole dataset as training data (see fig. 5). ann is found to have the best accuracy metrics for 30-days ahead forecasts among single forecasting models for india covid-19 data. the ensemble ann-arnn-warima model has the best accuracy among all hybrid/ensemble models (see table 11). in-sample and out-of-sample forecasts obtained from the ann and ensemble ann-arnn-warima models are depicted in fig. 5(b). results for brazil covid-19 data: among the single models, setar performs best in terms of accuracy metrics for 15-days ahead forecasts. the ensemble ets-theta-arnn (efn) model has the best accuracy among all hybrid/ensemble models (see table 12). in-sample and out-of-sample forecasts obtained from the setar and ensemble efn models are depicted in fig. 6(a). warima is found to have the best accuracy metrics for 30-days ahead forecasts among single forecasting models for brazil covid-19 data.
the hybrid warima-ann model has the best accuracy among all hybrid/ensemble models (see table 13). in-sample and out-of-sample forecasts obtained from the warima and hybrid warima-ann models are depicted in fig. 6(b). results for russia covid-19 data: bsts performs best among single models in terms of accuracy metrics for 15-days ahead forecasts on the russia covid-19 data. theta and arnn(3,2) also show competitive accuracy measures. the ensemble ets-theta-arnn (efn) model has the best accuracy among all hybrid/ensemble models (see table 14). the ensemble arima-ets-arnn and ensemble arima-theta-arnn models also perform well in the test period. in-sample and out-of-sample forecasts obtained from the bsts and ensemble efn models are depicted in fig. 7(a). setar is found to have the best accuracy metrics for 30-days ahead forecasts among single forecasting models for russia covid-19 data. the ensemble arima-theta-arnn (afn) model has the best accuracy among all hybrid/ensemble models (see table 15). all five ensemble models show promising results for this dataset. in-sample and out-of-sample forecasts obtained from the setar and ensemble afn models are depicted in fig. 7(b). results for peru covid-19 data: warima and arfima(2,0.09,1) perform better than the other single models for 15-days ahead forecasts in peru. the hybrid warima-arnn model improves the warima forecasts and has the best accuracy among all hybrid/ensemble models (see table 16). in-sample and out-of-sample forecasts obtained from the warima and hybrid warima-arnn models are depicted in fig. 8(a). arfima(2,0,0) and ann show competitive accuracy metrics for 30-days ahead forecasts among single forecasting models for peru covid-19 data. the ensemble ann-arnn-warima (aaw) model has the best accuracy among all hybrid/ensemble models (see table 17). in-sample and out-of-sample forecasts obtained from the arfima(2,0,0) and ensemble aaw models are depicted in fig. 8(b).
results from all five datasets reveal that none of the forecasting models performs uniformly well, and therefore one should carefully select the appropriate forecasting model when dealing with covid-19 datasets. in this study, we assessed several statistical, machine learning, and composite models on the confirmed covid-19 cases of the five countries with the highest number of cases. thus, covid-19 cases in the usa, followed by india, brazil, russia, and peru, are considered. the datasets mostly exhibit nonlinear and nonstationary behavior. twenty forecasting models were applied to the five datasets, and an empirical study is presented here. the empirical findings suggest that no universal method exists that can outperform every other model for all the datasets in covid-19 nowcasting. still, the future forecasts obtained from the models with the best accuracy will be useful in decision- and policy-making for government officials and policymakers to allocate adequate health care resources for the coming days in responding to the crisis. however, we recommend updating the datasets regularly and comparing the accuracy metrics to obtain the best model. as it is evident from this empirical study that no model can perform consistently as the best forecasting model, one must update the datasets regularly to generate useful forecasts. time series of epidemics can oscillate heavily due to various epidemiological factors, and these fluctuations are challenging to capture adequately for precise forecasting. among the five countries, all except brazil and peru are predicted to face a diminishing trend in the number of new confirmed covid-19 cases. based on both of the short-term out-of-sample forecasts reported in this study, the lockdown and shutdown periods can be adjusted accordingly to handle the uncertain and vulnerable situations of the covid-19 pandemic.
authorities and health care providers can modify their planning for stockpiles and hospital beds, depending on these covid-19 pandemic forecasts. models are constrained by what we know and what we assume, but, used appropriately and with an understanding of these limitations, they can and should help guide us through this pandemic. since purely statistical approaches do not account for how transmission occurs, they are generally not well suited for long-term predictions about epidemiological dynamics (such as when the peak will occur and whether resurgence will happen) or inference about intervention efficacy. several forecasting models, therefore, limit their projections to two weeks or a month ahead. in this research, we have focused on analyzing the nature of the covid-19 time series data and understanding the data characteristics of the time series. this empirical work studied a wide range of statistical forecasting methods and machine learning algorithms. we have also presented more systematic representations of the single, ensemble, and hybrid approaches available for epidemic forecasting. this quantitative study could be used to assess and forecast covid-19 confirmed cases, which will benefit epidemiologists and modelers in their real-world applications. considering the scope of the study, we present a list of challenges of (short-term) pandemic forecasting with the forecasting tools presented in this chapter:
- collect more data on the factors that contribute to daily confirmed cases of covid-19.
- model the entire predictive distribution, with particular focus on accurately quantifying uncertainty [55].
- there is no universal model that can generate 'best' short-term forecasts of covid-19 confirmed cases.
- continuously monitor the performance of any model against real data and either re-adjust or discard models based on accruing evidence.
- developing models in real-time for a novel virus, with poor quality data, is a formidable task and a real challenge for epidemic forecasters.
- epidemiological estimates and compartmental models can be useful for long-term pandemic trajectory prediction, but they often rely on unrealistic assumptions [64].
- future research is needed to collect, clean, and curate data and to develop a coherent approach to evaluate the suitability of models with regard to covid-19 predictions and forecast uncertainties.
for the sake of repeatability and reproducibility of this study, all codes and data sets are made available at https://github.com/indrajitg-r/forecasting-covid-19-cases.
references:
- github repository
- our world in data
- worldometers data repository
- modeling the impact of social distancing, testing, contact tracing and household quarantine on second-wave scenarios of the covid-19 epidemic. medrxiv
- forecasting time series using wavelets
- athanasios tsakris, and constantinos siettos. data-based analysis, modelling and forecasting of the covid-19 outbreak
- infectious diseases of humans: dynamics and control
- stability analysis and numerical simulation of seir model for pandemic covid-19 spread in indonesia
- principles of forecasting: a handbook for researchers and practitioners
- the theta model: a decomposition approach to forecasting
- the combination of forecasts
- real estimates of mortality following covid-19 infection.
- the lancet infectious diseases
- lyapunov characteristic exponents for smooth dynamical systems and for hamiltonian systems; a method for computing all of them
- long-term storage: an experimental study
- package 'pracma'
- the combination of forecasts: a bayesian approach
- time series analysis: forecasting and control
- distribution of residual autocorrelations in autoregressive-integrated moving average time series models
- refining the global spatial limits of dengue virus transmission by evidence-based consensus
- time series: theory and methods
- ensemble method for dengue prediction
- theta autoregressive neural network model for covid-19 outbreak predictions. medrxiv
- forecasting dengue epidemics using a hybrid methodology
- an integrated deterministic-stochastic approach for predicting the long-term trajectories of covid-19. medrxiv
- real-time forecasts and risk assessment of novel coronavirus (covid-19) cases: a data-driven analysis
- time-series forecasting
- the analysis of time series: an introduction
- a time-dependent sir model for covid-19 with undetectable infected persons
- multiplicative error modeling approach for time series forecasting
- combining forecasts: a review and annotated bibliography
- 25 years of time series forecasting
- forecasting time series with complex seasonal patterns using exponential smoothing
- fair allocation of scarce medical resources in the time of covid-19
- analysis and forecast of covid-19 spreading in china, italy and france
- time series forecasting with neural networks: a comparative study using the air line data
- chaotic attractors of an infinite-dimensional dynamical system
- predicting chaotic time series. physical review letters
- impact of nonpharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
- non-linear time series models in empirical finance
- nonlineartseries: nonlinear time series analysis
- deep learning
- an introduction to long-memory time series models and fractional differencing
- improved methods of combining forecasts
- critical care utilization for the covid-19 outbreak in lombardy, italy: early experience and forecast during an emergency response
- measuring skewness and kurtosis
- clinical characteristics of coronavirus disease 2019 in china
- business forecasting
- space-time modelling with long-memory dependence: assessing ireland's wind power resource
- the elements of statistical learning: data mining, inference, and prediction
- seir modeling of the covid-19 and its dynamics
- practical implementation of nonlinear time series methods: the tisean package. chaos: an interdisciplinary
- feasibility of controlling covid-19 outbreaks by isolation of cases and contacts. the lancet global health
- wrong but useful-what covid-19 epidemiologic models can and cannot tell us
- the effectiveness of quarantine of wuhan city against the corona virus disease 2019 (covid-19): a well-mixed seir model analysis
- artificial intelligence forecasting of covid-19 in china
- clinical features of patients infected with 2019 novel coronavirus in wuhan, china. the lancet
- forecasting with exponential smoothing: the state space approach
- forecasting: principles and practice
- unmasking the theta method
- automatic time series for forecasting: the forecast package for r. number 6/07. monash university, department of econometrics and business statistics
- forecasting for covid-19 has failed
- an introduction to statistical learning
- multivariate bayesian structural time series model
- nonlinear time series analysis
- an artificial neural network (p, d, q) model for time-series forecasting. expert systems with applications
- statistical notes for clinical researchers: assessing normal distribution (2) using skewness and kurtosis
- projecting the transmission dynamics of sars-cov-2 through the postpandemic period
- nnfor: time series forecasting with neural networks
- early dynamics of transmission and control of covid-19: a mathematical modelling study. the lancet infectious diseases
- metalearning: a survey of trends and technologies
- meta-learning for time series forecasting and forecast combination
- trend and forecasting of the covid-19 outbreak in china
- the end of social confinement and covid-19 re-emergence risk
- arma models and the box-jenkins methodology
- time series modelling to forecast the confirmed and recovered cases of covid-19
- global spread of dengue virus types: mapping the 70 year history
- fforma: feature-based forecast model averaging
- introduction to the theory of statistics
- the assessment of probability distributions from expert opinions with an application to seismic fragility curves
- social contacts and mixing patterns relevant to the spread of infectious diseases
- comparative study of wavelet-arima and wavelet-ann models for temperature time series data in northeastern bangladesh
- wavelet methods for time series analysis
- comparing sars-cov-2 with sars-cov and influenza pandemics. the lancet infectious diseases
- forecasting the novel coronavirus covid-19
- a review of epidemic forecasting using artificial neural networks
- testing for a unit root in time series regression
- beta autoregressive fractionally integrated moving average models
- the many estimates of the covid-19 case fatality rate. the lancet infectious diseases
- predictions, role of interventions and effects of a historic national lockdown in india's response to the covid-19 pandemic: data science call to arms. harvard data science review
- short-term forecasting covid-19 cumulative confirmed cases: perspectives for brazil
- log-periodogram regression of time series with long range dependence. the annals of statistics
- real-time forecasts of the covid-19 epidemic in china from february 5th to february 24th
- facing covid-19 in italy-ethics, logistics, and therapeutics on the epidemic's front line
- a practical method for calculating largest lyapunov exponents from small data sets. physica d: nonlinear phenomena
- learning internal representations by error propagation
- depends boom-spikeslab, and linking to boom. package 'bsts'. 2020
- bayesian variable selection for nowcasting economic time series
- predicting the present with bayesian structural time series
- fast and accurate yearly time series forecasting with forecast combinations
- forecasthybrid: convenient functions for ensemble time series forecasts
- the kpss stationarity test as a unit root test
- a simple explanation of the forecast combination puzzle
- generalizing the theta method for automatic forecasting
- a machine learning forecasting model for covid-19 pandemic in india
- power of the neural network linearity test
- linear models, smooth transition autoregressions, and neural networks for forecasting macroeconomic time series: a re-examination
- forecast combinations. handbook of economic forecasting
- nonlinear time series analysis since 1990: some personal reflections
- non-linear time series: a dynamical system approach
- tseries: time series analysis and computational finance.
- r package version 0
- the 1918 "spanish flu" in spain
- nonlinearity tests for time series
- time series and forecasting: brief history and future research
- multiple time scales analysis of hydrological time series with wavelet transform
- rule induction for forecasting method selection: meta-learning the characteristics of univariate time series
- forecasting sales by exponentially weighted moving averages
- no free lunch theorems for optimization
- nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study
- time series forecasting using a hybrid arima and neural network model
- neural network forecasting for seasonal and trend time series
- forecasting with artificial neural networks: the state of the art
- estimation of local novel coronavirus (covid-19) cases in wuhan, china from off-site reported cases and population flow data from different sources. medrxiv
key: cord-255557-k0xat0u7
authors: mao, liang; wu, xiao; huang, zhuojie; tatem, andrew j.
title: modeling monthly flows of global air travel passengers: an open-access data resource
date: 2015-10-31
journal: journal of transport geography
doi: 10.1016/j.jtrangeo.2015.08.017
sha:
doc_id: 255557
cord_uid: k0xat0u7
abstract: the global flow of air travel passengers varies over time and space, but analyses of these dynamics and their integration into applications in the fields of economics, epidemiology and migration, for example, have been constrained by a lack of data, given that air passenger flow data are often difficult and expensive to obtain. here, these dynamics are modeled at a monthly scale to provide an open-access spatio-temporally resolved data source for research purposes (www.vbd-air.com/data). by refining an annual-scale model of huang et al. (2013), we developed a set of poisson regression models to predict monthly passenger volumes between directly connected airports during 2010.
the models not only performed well in the united states, with an overall accuracy of 93%, but also showed reasonable confidence in estimating air passenger volumes in other regions of the world. using the model outcomes, this research studied the spatio-temporal dynamics in the world airline network (wan) that previous analyses were unable to capture. findings on the monthly variation of the wan offer new knowledge for dynamic planning and strategy design to address global issues, such as disease pandemics and climate change. the worldwide airline network (wan) has played a critical role in contracting human societies into a global village through the rapid transport of people, commodities and information over long distances. every year, on average, 700 million passengers and $6.4 trillion worth of goods are carried by air (guimerà et al., 2005; tyler, 2013). its tremendous impacts on the global socio-economy have drawn increasing attention from a variety of research fields, such as regional studies, international trade, and transportation management (mahutga et al., 2010; o'kelly and miller, 1994). recent reports have also shown that the wan is both directly and indirectly responsible for the inter- and intra-continental spread of diseases, such as severe acute respiratory syndrome (sars), dengue fever, and novel h1n1 influenza (ciesin, 2014; khan et al., 2009; lemey et al., 2014; mangili and gendreau, 2005; tatem et al., 2012), as well as the spread of invasive species (liebhold et al., 2006; tatem, 2009). as the wan continues to expand at an exceptional rate, knowledge about its characteristics and evolution is crucial for global economic development and disease control (bogoch et al., 2015; millard-ball and schipper, 2011; o'connor, 2003), among other factors. as research interest in the wan continues to grow, the availability and completeness of air passenger flow data have become key obstacles.
to date, the available data sources can be summarized into three categories. the first category refers to commercial providers of worldwide aviation data, such as the international air transport association (iata) and the official airline guide (oag). both the oag and iata have complete passenger origin and destination records for sale, but the price can amount to tens of thousands of us dollars, along with rigorous restrictions for users. researchers may need to spend a fortune to obtain these data and be prohibited from sharing them with others. as an alternative, the second category of data sources follows the recent movement toward open data, the idea that certain data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control. there are a few open-access data sources concerning the global airline network, but all of them have limitations. the skyscanner for business (http://business.skyscanner.net/portal/en-gb) offers online api services to access live pricing and airfare search history, but these data are not directly related to the real passenger origins and destinations. the openflights organization (openflight.org) offers free downloadable datasets, but they are limited to airports and routes, with no details on passenger flows. the sky explore (http://geog-curaosgeo.asc.ohio-state.edu/t100/web/main.html) presents a webgis interface to map real passenger flows from multiple data sources, but its data coverage is confined to north america and europe. rather than releasing real data, the third category of data sources comes from statistical models of air passenger flows (grosche et al., 2007; johansson et al., 2011; long, 1970; wei and hansen, 2006), with the modeled flows published for open access (huang et al., 2013).
these existing models, however, are limited to predicting annual aggregates of air passenger volumes, and thus represent the wan as a static structure during a year. in reality, the wan is dynamic over time and space, given that passenger volumes fluctuate by month and flight routes open/close by season (feuerberg, 2008; grubesic et al., 2009). the data aggregation in current models hides the network dynamics, which could otherwise provide insights into the spatio-temporal patterns of various global processes, such as disease spread and labor migration. for example, many diseases, such as the flu and dengue fever, have seasonal dissemination patterns, and annually summarized air traffic data are not appropriate for predicting global disease dispersion. temporary labor migration also varies to fit fluctuating job markets, such as tourism and agriculture, and hence monthly air traffic would offer a more reliable estimation than annual aggregates. to date, few studies have been devoted to modeling temporally-resolved air passenger flows over the wan. as a result, little analysis has been conducted on the fine-scale spatio-temporal variation of the wan within a year. the purpose of this article is two-fold. first, we refined the existing models developed by huang et al. (2013) to a finer temporal scale and predicted the monthly air passenger flows between directly connected airports worldwide. we also release the modeled monthly flows online for open access. second, we attempt to understand the monthly wan as a dynamic network by measuring the variation of air passenger flows by month, by route, and by airport. the wan was conceptualized as a collection of nodes and links, where nodes represent airports and links represent flight routes between airports.
many empirical studies have shown that air passenger volume is proportional to the population sizes of the origin and destination cities, and inversely proportional to the geographic distance between them, similar to isaac newton's law of gravitational interaction (grosche et al., 2007; long, 1970; matsumoto, 2007). a gravity model, thus, can be utilized to estimate the air passenger volume between any pair of nodes. our model views the air passenger flow as an outcome of spatial interactions between a pair of origin and destination airports, which can be formulated as a multiplicative function of node and link characteristics, as shown in eq. (1): p_ij(t) = s(t) · ∏_k node_i,k(t)^α_k(t) · ∏_k node_j,k(t)^β_k(t) · ∏_l route_ij,l(t)^γ_l(t), where p_ij(t) denotes the number of air passengers from airport i to j during month t (t = 1, 2, …, 12), s(t) is a scaling constant, node_i,k(t) and node_j,k(t) represent the kth characteristic during month t of the origin airport i and destination airport j, respectively, for instance, a socio-economic, demographic, meteorological or network characteristic (table 1), and route_ij,l(t) is the lth measurement of the linkage between airports i and j during month t, for example, the great circle distance, seat capacity, flight frequency, or link type (table 1). the coefficients α_k(t), β_k(t), and γ_l(t) are to be estimated. for model development, we obtained monthly air passenger numbers from the us air carrier statistics t-100 domestic and international segments (www.transtats.bts.gov). these datasets contain market data reported by both us and foreign air carriers, including carrier, origin, and destination for enplaned passengers, freight and mail, when at least one point of service is in the us or one of its territories. the dataset of the year 2010 was selected to match the collection time of other data sources, such as the census data.
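to make the multiplicative structure of eq. (1) concrete, the sketch below evaluates a gravity-type flow for one airport pair; the characteristic values and coefficients passed in are purely illustrative (made up), not the fitted model:

```python
def gravity_flow(s, node_i, node_j, route_ij, alpha, beta, gamma):
    """evaluate a multiplicative gravity flow: a scaling constant times
    powered origin-node, destination-node and route characteristics."""
    p = s
    for v, a in zip(node_i, alpha):     # origin characteristics ^ alpha_k
        p *= v ** a
    for v, b in zip(node_j, beta):      # destination characteristics ^ beta_k
        p *= v ** b
    for v, g in zip(route_ij, gamma):   # route characteristics ^ gamma_l
        p *= v ** g
    return p
```

with an inverse-distance route characteristic and positive population exponents, the flow grows with the two populations and shrinks with distance, which is the behavior the surrounding text describes.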
the market data were tabulated into 141,182 records with information concerning the origin airport, destination airport, actual passenger number, and month. in addition, the 2010 total passenger numbers for the top 100 international airports across the world by passenger volume were downloaded from the aci website (airports council international, http://www.aci.aero/data-centre/annual-traffic-data/passengers/2010-final). this dataset was used as an extra independent source for model validation. information on a total of 3416 airports across the world was obtained from the 2010 flightstats database (www.flightstats.com), including names, codes, and geographic coordinates (latitudes and longitudes). to connect these airports into a network, flight routes were further derived from the 2010 scheduled flight capacity dataset purchased from the oag (www.oag.com). this dataset provides information on direct links (where a commercial flight is scheduled) between origin and destination airports, flight distances, and seat capacity by month for 2010. the airport and route datasets were utilized to compute geographic distances between airports, construct network graphs for each month in 2010, and derive network measurements, such as the in-degree, out-degree, and betweenness centrality. the population data were obtained from the most recent gridded population of the world, version 4 (gpwv4), released by the center for international earth science information network (ciesin, 2014).
table 1. predictors for the monthly air passenger flows.
node characteristics (node_i,k or node_j,k):
- pop_i: the population size of airport i
- ppp_i: the purchasing power index where airport i serves
- in-degree_i: the number of incoming links airport i has in the airport network
- out-degree_i: the number of outgoing links airport i has in the airport network
- capacity_in_i: total incoming capacity of airport i
- capacity_out_i: total outgoing capacity of airport i
- betweenness centrality_i (a): the number of shortest paths going through airport i
- monthly average temperature of airport i
- monthly average humidity of airport i
- monthly average precipitation of airport i

route characteristics (route_ij,l):
- inversed_distance_ij: the inverse of the great circle distance between airport i and j
- the total seat capacity of routes between airport i and j
- whether the airports i and j are in the same country or not

(a) a more detailed calculation of betweenness centrality can be found in the supplement.

the gpwv4 is a minimally-modeled gridded population data set (30 arc-second resolution) that incorporates census population data from the 2010 round of censuses. to extract the population size served by an airport, a 200 km buffer was created to reflect the upper distance limit of the catchment area by two-hour ground travel to the airport (lieshout, 2012). the buffer zone was superimposed onto the gridded population dataset and a zonal aggregation was performed to extract the potential serviceable population in the catchment area of the airport. for the economic development around an airport, a gridded map with a cell resolution of 1° × 1° was obtained from the geographically based economic database (g-econ, http://gecon.yale.edu/). each grid cell shows the purchasing power parity (ppp) at the cell location (nordhaus, 2006). the ppp value closest to an airport was extracted and divided by the population in the grid cell to estimate the ppp value per capita for that airport.
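the buffer-and-aggregate step described above can be sketched in a few lines of numpy: cells of a gridded population raster whose centers fall within a 200 km great-circle radius of the airport are summed. the grid, cell values, and airport location below are toy illustrations, not the gpwv4 data.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

def catchment_population(pop_grid, lats, lons, airport_lat, airport_lon,
                         radius_km=200.0):
    """Sum population of grid cells whose centers lie within radius_km."""
    lat_mesh, lon_mesh = np.meshgrid(lats, lons, indexing="ij")
    dist = haversine_km(lat_mesh, lon_mesh, airport_lat, airport_lon)
    return float(pop_grid[dist <= radius_km].sum())

# toy 5x5 one-degree grid centered on the airport, one person per cell
lats = np.arange(-2.0, 3.0)          # -2 .. 2 degrees latitude
lons = np.arange(-2.0, 3.0)
pop = np.ones((5, 5))
print(catchment_population(pop, lats, lons, 0.0, 0.0))  # → 9.0
```

at the equator one degree is roughly 111 km, so only the 3 × 3 block of cell centers around the airport falls inside the 200 km buffer.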
considering local climate as a driver of air travel, for example through tourism (bogoch et al., 2015), three climatic variables were selected as airport characteristics, namely, the monthly average precipitation, temperature, and humidity. the data were downloaded from the worldclim website (www.worldclim.org) as gridded surfaces of 1 × 1 km spatial resolution (hijmans et al., 2005). airports were superimposed onto the grids to extract the three climatic variables. to identify the best-fit model, the general gravity model (eq. (1)) was transformed into three types of model specifications. the first model was a log-normal model proposed by balcan et al. (2009), assuming that the natural log of the monthly air passenger volume follows a normal distribution. a logarithm transformation was performed on each numerical variable to maintain the configuration of the gravity model. to improve performance, the model also considered interaction terms between origin and destination nodes, denoted as interaction_ij, for example, the product between the total populations of the origin and destination nodes. the first model took the form of eq. (2). the second model assumed that the monthly air passenger volume follows a poisson distribution, as it is a count. eq. (1) was transformed into a simple poisson regression model (eq. (3)), which has been used in the previous model developed by johansson et al. (2011). since the passenger numbers on a flight can never exceed the seat capacity, it is more appropriate to set the seat capacity as an offset, rather than a regular covariate, with its coefficient constrained to 1. further, a dispersion parameter was added to account for the potential over-dispersion problem. this problem arises because the poisson distribution confines its variance to be equal to its mean; for count data, the observed variance can in fact be greater than the mean, which is known as over-dispersion. as formulated in eq. (3), adding a dispersion parameter ϕ allows the variance to differ from the mean value μ, which may produce a better fit. if the estimated ϕ is close to 1, there is probably no over-dispersion problem, and vice versa. the third model is a negative binomial model, which is often chosen when the poisson regression has a poor fit. this model takes the same form as eq. (3) except that p_ij(t) follows a negative binomial distribution instead of the poisson distribution. it can also accommodate the over-dispersion problem for count data under some circumstances. for each model and each month, the us air passenger data and all covariates were input into sas 9.3 to estimate the model coefficients. the evaluation of model performance included a cross-validation within the us to select the best model, and a validation beyond the us to gauge the model accuracy. the cross-validation was performed as follows: one tenth of the observations were randomly selected and held out as a testing dataset; the rest of the observations were treated as a training set for model fitting; after the model was built, predictions were made on the testing dataset and errors were obtained by computing the differences between the predicted and observed values. this process was repeated 10 times, and the root mean squared error (rmse) and the mean absolute error (mae) were computed as evaluation criteria. lower values of rmse and mae indicate a better fit of the model. the model with the smallest rmses and maes was chosen for the subsequent spatio-temporal analysis. to investigate the model's predictive power beyond the us, the best-fit model was expanded to estimate worldwide passenger flows. the model-predicted passenger volumes were compared with those observed at the top 100 world airports reported by the airports council international (aci). we only considered the top 100 world airports here, because the aci only releases these data freely.
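the offset construction described above (seat capacity entering the log link with its coefficient fixed to 1) can be sketched with a plain-numpy poisson regression fitted by iteratively reweighted least squares. this is a generic glm sketch on synthetic data, not the sas 9.3 procedure the authors used, and the covariate names are illustrative.

```python
import numpy as np

def poisson_irls(X, y, offset, n_iter=30):
    """Fit log E[y] = X @ beta + offset by iteratively reweighted least squares."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta + offset
        mu = np.exp(eta)
        # working response and weights of the Poisson GLM with log link
        z = eta - offset + (y - mu) / mu
        W = mu
        beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
    return beta

rng = np.random.default_rng(0)
n = 5000
x1 = rng.normal(size=n)                  # e.g. log population product
x2 = rng.normal(size=n)                  # e.g. log inverse distance
X = np.column_stack([np.ones(n), x1, x2])
offset = np.log(rng.integers(100, 500, size=n))  # log seat capacity, coef fixed at 1
true_beta = np.array([0.5, 0.3, -0.2])
y = rng.poisson(np.exp(X @ true_beta + offset))
print(np.round(poisson_irls(X, y, offset), 2))   # recovers true_beta closely
```

because the offset enters with a fixed unit coefficient, the fitted model predicts a passenger count per unit of seat capacity, which is what keeps predictions from exceeding the seats on offer.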
the pearson correlation coefficient and the spearman's rank correlation coefficient were employed for validation. for each month, the log-normal model produced the greatest rmse and mae, followed by the negative binomial model and then the poisson model (table 2). for this reason, the poisson model was selected as the best-fit model for the subsequent analysis. given that the average number of transported passengers was 9239 per month per route, the maes suggest that the average prediction error was roughly ±7% and hence the average model accuracy was 93%. the diagnostic plots (fig. 1) compare the estimated monthly passenger volumes with the corresponding observations. a majority of paired values fall close to the 45° line, indicating a good fit of the model (fig. 1a). with regard to the error distribution (fig. 1b), the fitted curve almost coincides with the horizontal line (y = 0), showing that the errors scatter randomly with no obvious biases or unusual patterns, and hence the model is appropriate. the spatial distribution of prediction uncertainty by route was also estimated and reported as a map in the supplementary material (fig. s1). in addition to the cross-validation, the monthly predictions were also aggregated into an annual total for each airport and compared with the observed annual passengers reported by the airports council international (aci, 2011). fig. 2 shows a high level of consistency between our predictions and the actual observations (correlation coefficients > 0.95), in terms of both magnitude and ranking order. although the model was built on us data, it can reasonably predict air passenger volumes in other regions. the estimated coefficients of the poisson model by month are provided in the supplementary material. in general, most variables were statistically significant, with a few being significant throughout all 12 months while some were only significant in specific months of 2010.
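the evaluation criteria used above (rmse, mae, pearson and spearman correlations) can be reproduced with a few lines of numpy; the spearman coefficient is simply the pearson coefficient of the rank-transformed data (this minimal ranking ignores ties, which production implementations handle with midranks). the observations and predictions below are illustrative.

```python
import numpy as np

def rmse(obs, pred):
    return float(np.sqrt(np.mean((np.asarray(obs) - np.asarray(pred)) ** 2)))

def mae(obs, pred):
    return float(np.mean(np.abs(np.asarray(obs) - np.asarray(pred))))

def pearson(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    x, y = x - x.mean(), y - y.mean()
    return float((x @ y) / np.sqrt((x @ x) * (y @ y)))

def spearman(x, y):
    rank = lambda v: np.argsort(np.argsort(v)).astype(float)  # no tie handling
    return pearson(rank(x), rank(y))

obs  = np.array([100.0, 400.0, 900.0, 1600.0])
pred = np.array([110.0, 380.0, 950.0, 1500.0])
print(rmse(obs, pred), mae(obs, pred))   # ≈ 57.01 and 45.0
print(pearson(obs, np.sqrt(obs)))        # < 1: relation is nonlinear
print(spearman(obs, np.sqrt(obs)))       # ≈ 1.0: relation is monotone
```

the last two lines show why both coefficients are reported: spearman rewards the correct ranking order even when the magnitudes are nonlinearly related.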
for example, the inverse distance between two airports was statistically significant in all monthly models (fig. 3a), and shows a negative association with air passengers, reflecting the distance-decay effect of 'gravity'. the interaction term between the incoming capacities of origin and destination airports was positively associated with the air passenger volumes throughout the 12 months (fig. 3b), implying more passengers if there were larger airports at both ends of the journey. the temperature at the destination is another driver of air travel (fig. 3c). during cold months in the northern hemisphere, such as from january to april and from october to december, the temperature was positively related to the number of air passengers, suggesting that people tend to fly to warmer places. however, the temperature had negative or no associations with air travelers during the summer months in the northern hemisphere, from may to september. lastly, the scale parameter, defined as the square root of the dispersion parameter ϕ in eq. (3), was estimated to range from 13 to 26 over the 12 months. the rising trend in fig. 3d implies that the wan had an increasingly heterogeneous structure over the 12 months, as the passenger volume varied more widely around its mean value. the predicted monthly air passengers of the wan are included in supplementary material ii (data and video clip), and also published online as part of the vector-borne disease airline importation risk (vbd-air) project (www.vbd-air.com/data/) for free download.
fig. 4. estimated monthly variation of the wan in terms of its a) flight routes, b) passenger volume, c) airport rank by flight connections, and d) airport rank by passenger throughput.
fig. 4 shows the monthly variation of the wan in terms of its flight routes, passenger volume, and the role of airports. based on the model estimates, the monthly variations in air passenger flows can be roughly divided into three stages in 2010.
the first stage spanned january to march, characterized by the greatest number of flight routes, but the shortest average flight distance and a low passenger volume (fig. 4a and b). these statistics suggest that stage 1 was dominated by short-range flights and a low passenger volume per route. the second stage (from april to october) was the peak season of the entire year, with the largest number of passengers. it also featured a substantial decrease in flight routes, but an increase in average flight distance. with fewer operating routes, the number of passengers per route was larger than in stage 1, implying higher seat occupancy rates or larger carriers. the third stage covered the last two months of the year and was characterized by the fewest passengers of the year. there was another significant drop in the number of flight routes, but the average flight distance per route remained longer than in stage 1. the varying roles of airports over the months are also of interest, as shown in fig. 4c and d. fra (frankfurt) was the most connected airport (with the greatest number of flight routes connected to it) in most months, while atl (atlanta) was the busiest airport (with the greatest passenger throughput), showing a concentration of world air travel activities in europe and north america. with regard to flight connections, the top 10 airports in europe (e.g., fra and cdg) had higher ranks in the early months of the year, while in later months their roles were gradually taken over by airports in the us, such as atl and ord (chicago). this might be related to the flight route reduction shown in fig. 4a (more discussion later). pek (beijing) was the 2nd busiest airport during stages 1 and 3, but it was replaced by lhr (london) in stage 2, suggesting a remarkable fall and rise of passenger numbers in east asia and europe between stages (more discussion later). to further explain the temporal variations in fig.
4, the air passenger flows were mapped geographically to depict where the increases and decreases took place between the three stages of 2010. as shown in figs. 5 and 6, most changes were concentrated in the northern hemisphere, because the majority of the world's population is distributed there. from stage 1 (january to march) to stage 2 (april to october), significant decreases in travelers were clustered in short-haul flights within east and southeast asia (fig. 5a). meanwhile, a noticeable increase in long-distance travelers was seen within the us and between europe and other continents (fig. 5b). a possible reason is that stage 2 spanned late spring to fall in the northern hemisphere, which had longer daytimes and warmer weather, and thus encouraged human activities (e.g., tourism) as well as long-distance travel. to increase profits, many short-haul flight routes were likely closed temporarily so that the air carriers could be redistributed to serve medium- or long-haul routes, for example, the seasonal air routes between philadelphia (usa) and barcelona (spain), and between atlanta (usa) and athens (greece). in addition, the declining economic trends at the time of the data used to construct the model were likely another reason for the reduced number of passengers, given their tremendous impact on the airline industry. the global economy was contracting in 2010, and therefore it is very possible that some variations were attributable to the economic decline. from stage 2 (april to october) to stage 3 (november to december), there was an apparent shrinkage of passenger volumes within europe, between europe and north america, and between europe and asia (fig. 6a). the routes with reduced passengers were primarily long-haul flights, possibly because the winter weather in the northern hemisphere discouraged demand for trans-continental travel.
only a few flight routes had an increased passenger volume, and a majority of them were medium-haul flights toward tropical regions, for example, the caribbean islands and southeast asia. there were several limitations of this modeling work that may introduce uncertainties in estimation and bias in interpretation. first, the flight datasets used in model building only record passenger information on direct links between airports. although the model is capable of estimating the flows between airports separated by 2 or more stops, the estimates could be biased to a certain degree. to address this issue, additional datasets with passenger transfer information, such as flight ticket databases, could be further included in the model. second, a simple 200 km catchment area was set to estimate the population size served by an airport. the realistic travel time/distance to an airport could vary by airport size, local economic size, population density, travel means, etc. more sophistication should be introduced to delineate the catchment area and better reflect the population sizes at the origin and destination airports. third, the number of air passengers was assumed to be independent between flight routes to satisfy the assumption of linear regression models. this route independence could be problematic, because passenger numbers could be related when one route is connected to another. in future research, the airline origin and destination survey (db1b) data, which record itinerary samples, can be further included in our model to account for multiple-stop travels and dependence between flights. lastly, the selection of months as the basic temporal unit for modeling is an ad hoc criterion that may impose new problems. for instance, air traffic often peaks during long holiday periods and school breaks that may cross months. such fine-grained peaks might be evened out in the monthly model and maps.
the global flow of air travel passengers varies over time and space, but analyses of these dynamics and their integration into applications in the fields of regional studies, epidemiology and migration, for example, have been constrained by a lack of data, given that air passenger flow data are often difficult and expensive to obtain. here, these dynamics are modeled at a monthly scale to provide an open-access, spatiotemporally resolved data source for research purposes. the contributions of this research cover two aspects. first, poisson regression models were developed to predict monthly passenger volumes in the wan, which refine the previous annual-scale models. the models not only performed well in the united states, but also showed good accuracy in estimating air passenger volumes in other regions. the proposed modeling approach can be extended to other years, too, if data for those years are available. second, the models and estimates are all shared online for researchers to further reveal the monthly characteristics of the wan that previous analyses were unable to capture. existing studies have devised various tools to understand the yearly or quarterly evolution of the wan (feuerberg, 2008; grubesic et al., 2009; o'connor, 2003). these tools help identify the global roles of cities, evaluate populations' accessibility to air travel, and predict future geographic patterns. it would be interesting to reexamine these topics with the same tools, but through a closer lens at the monthly scale. for instance, cities that play important roles in some months, but not the entire year, can be revealed. the effects of seasonal flight routes on the wan structure can be investigated. the accessibility to air travel can be assessed in a spatial and temporal manner. such fine-grained analyses of the wan would offer new knowledge for regional planning or dynamic strategy design.
for example, those cities that are temporarily important for some months in the wan could be fast-growing nodes in future regional development and are worth the attention of urban planners. the monthly assessment of accessibility to air travel may suggest dynamic airfare strategies to mitigate local and regional biases in time and costs. the world health organization can also identify possible high-risk routes for the next (few) month(s) according to the monthly wan structure and disease prevalence, and then optimally focus its control efforts (e.g., airport surveillance) on these routes.

references:
- multiscale mobility networks and the spatial spreading of infectious diseases
- assessment of the potential for international dissemination of ebola virus via commercial air travel during the 2014 west african outbreak
- gridded population of the world, version 4 (gpwv4), preliminary release
- mapping world city networks through airline flows: context, relevance, and problems
- airline data for global city network research: reviewing and refining existing approaches
- uncovering trends in seasonal transportation data: new tool analyzes airline revenue passenger-mile trend data
- gravity models for airline passenger volume estimation
- spatio-temporal fluctuations in the global airport hierarchies
- the worldwide air transportation network: anomalous centrality, community structure, and cities' global roles
- very high resolution interpolated climate surfaces for global land areas
- an open-access modeled passenger flow matrix for the global air network in 2010
- on the treatment of airline travelers in mathematical models
- spread of a novel influenza a (h1n1) virus via global airline transportation
- unifying viral genetics and human transportation data to predict the global transmission dynamics of human influenza h3n2
- airline baggage as a pathway for alien insect species invading the united states
- measuring the size of an airport's catchment area
- air travel, spatial structure, and gravity models
- economic globalisation and the structure of the world city system: the case of airline passenger data
- transmission of infectious diseases during commercial air travel
- international air network structures and air traffic density of world cities
- are we reaching peak travel? trends in passenger transport in eight industrialized countries
- geography and macroeconomics: new data and new findings
- global air travel: toward concentration or dispersal?
- the hub network design problem: a review and synthesis
- the worldwide airline network and the dispersal of exotic species
- air travel and vector-borne disease movement
- international air transport association, cape town
- an aggregate demand model for air passenger traffic in the hub-and-spoke network

this research was funded by the national academies national academy of sciences airport cooperative research program board (#acrp a02-20). supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.jtrangeo.2015.08.017.

key: cord-262966-8b1esll4
authors: huang, ganyu; pan, qiaoyi; zhao, shuangying; gao, yucen; gao, xiaofeng
title: prediction of covid-19 outbreak in china and optimal return date for university students based on propagation dynamics
date: 2020-04-07
journal: j shanghai jiaotong univ sci
doi: 10.1007/s12204-020-2167-2
sha:
doc_id: 262966 cord_uid: 8b1esll4

on 12 december 2019, a novel coronavirus disease, named covid-19, began to spread around the world from wuhan, china. it is useful and urgent to consider the future trend of this outbreak. we establish the 4+1 penta-group model to predict the development of the covid-19 outbreak. in this model, we use the collected data to calibrate the parameters, and let the recovery rate and mortality change according to the actual situation.
furthermore, we propose the bat model, which is composed of three parts: simulation of the return rush (back), the analytic hierarchy process (ahp) method, and the technique for order preference by similarity to an ideal solution (topsis) method, to figure out the best return date for university students. we also discuss the impacts of some factors that may occur in the future, such as secondary infection, the emergence of effective drugs, and population flow from korea to china.

nomenclature:
c - the average number of contacts of an exposed person without isolation each day
n - the number of individuals
n_d - the death toll
n_e - the number of exposed individuals
n_i - the number of infectious individuals
n_r - the number of recovered individuals
n_s - the number of susceptible individuals
n - the total population of china
p - intensity of isolation for exposed individuals
r - correlation coefficient
r^2 - coefficient of determination
t_0 - moment when the government began to take measures
t - outbreak duration
α - incubation rate
β - infectious rate of contacts of an exposed person
γ - recovery rate
μ - pneumonia mortality

0 introduction
on 12 december 2019, the first patient with unexplained pneumonia was admitted to the hospital in wuhan. since then, a novel coronavirus disease, named covid-19, has spread around the world, and the number of infected patients has been growing exponentially. we know that the novel coronavirus has a certain infectivity and a good affinity with human respiratory tract cells. it can also be transmitted from person to person. thus, it is useful and urgent to predict the course of this outbreak with mathematical modeling. besides, students are still staying at home to prevent the spread of covid-19. here we establish a model to predict the spread of covid-19 and infer the most suitable return date for university students. traditionally, compartment models are used to predict outbreaks of infectious diseases. before building our model, we give an overview of related work.
we classify the related work into two categories: models with an external floating population and those without. in these papers, the conventional models are the seir, seiar and seijr models. they all have their advantages. for example, the seir model is easy to implement, and the seijr model accurately separates isolated individuals from the other groups. however, there is a common disadvantage: the results of long-term prediction are not accurate because these models cannot fit the real situation over a long period. in this paper, we consider some factors that influence the covid-19 outbreak over a long time and establish the 4+1 penta-group model. on the basis of the traditional seir model, we add a compartment, i.e., dead individuals, and take into account some parameters, such as the moment when the government began to take measures and the intensity of isolation for exposed individuals. moreover, for some parameters used in our model, we use the collected data to calibrate them in order to come as close as possible to the actual situation. then, we establish an analytical model using simulation of the return rush (back), the analytic hierarchy process (ahp) method and the technique for order preference by similarity to an ideal solution (topsis) method, called the bat model. through the combination of these two models, we can predict the development of the epidemic and weigh the pros and cons of different return dates. existing work on prediction of the covid-19 outbreak can be classified into models without an external floating population and those with one. previous related work adopts differential equations as the basic form for simulation. the types of models without an external floating population are, approximately, the seir, seiar and seijr models. the comparison of these models and our model is shown in fig.
1.

seir: as a traditional infectious disease model, the seir model describes the relationship between susceptible individuals, exposed individuals, infectious individuals, and recovered individuals. fan et al. [1], geng et al. [2], and zhou et al. [3] proposed some of the most classic seir models. they directly applied the traditional seir model [4] without any changes to simulate the outbreak in wuhan and other areas. the seir model takes little account of the actual situation, so its long-term predictions are far from the actual values.

seijr: the seijr model is roughly the same as the traditional seir model, but the population is divided into susceptible individuals, asymptomatic individuals during the incubation period, infectious individuals with symptoms, isolated individuals under treatment, and recovered individuals. read et al. [5] adopted this idea. this model accurately separates isolated individuals from the other populations, and it is more realistic about the status quo. nevertheless, precise data on each individual are hard to collect, making it difficult to calibrate the parameters. therefore, its long-term predictions are far from the actual values.

seiar: in the seiar model, the difference from the seijr model is that there are no isolated individuals but asymptomatic individuals. bai et al. [6] followed this approach, and this model has similar characteristics to the seijr model. in addition to these, we find an seir-based model with an external floating population, which considers the zoonotic force of infection and the daily number of travelers. wu et al. [7] adopted this model. the simulation is already very close to reality at the beginning of the epidemic. however, as the outbreak progresses, it drifts away from reality, which makes it unsuitable for long-term forecasting. all of these models have their strengths, but none of them does well in long-term predictions due to the parameters or model accuracy.
based on the above experience, our 4+1 penta-group model takes into account the long-term nature of the outbreak. we add the dead individuals, whose precise data can be collected to calibrate the mortality, to our model, and we also add the time of isolation initiation and the intensity of isolation to the model, given the long-term impact of the measures that the government has taken. in general, we establish a model that can predict the long-term situation of the outbreak. we establish two models to predict the spread of covid-19 and figure out the most suitable return time. first, we establish the 4+1 penta-group model to predict the future trend of the covid-19 outbreak in china. then, we propose the bat model to simulate the return rush and consider some factors to obtain the best return time. for the 4+1 penta-group model, we assume that there is no floating population, considering that the government has issued notices and taken measures to keep people at home. moreover, we assume that the recovery rate is positively correlated with the level of medical care, and the mortality is negatively correlated with the level of medical care. the 4+1 penta-group model includes six sub-models that describe the flow relationship of each population [8]. it reflects the complete changes of five groups of people and their overall relationship. in this paper, we use differential equations to simulate the flow of people [9]. the first model, which reflects the change in susceptible individuals, is eq. (1); the second, which reflects the change in exposed individuals, is eq. (2); the third, which reflects the change in infectious individuals, is eq. (3); and the fourth, which reflects the change in recovered individuals, is eq. (4). in this model, we separate n_d from n_r to predict the death toll precisely. moreover, we will discuss the effect of secondary infection in subsection 4.1. this new compartment also benefits that section, because secondary infections can never occur among dead individuals.
the fifth model, which reflects the change in dead individuals, is eq. (5), and the sixth, which reflects the overall relationship, is eq. (6). the bat model is composed of three parts: simulation of the return rush (back), the ahp method and the topsis method. first of all, we assume that the infectious rate is positively correlated with the population density, measured by the per capita floor space. on the first day of the spring festival travel rush, before the extended leave notice, people were sent off at shanghai hongqiao railway station. moreover, on 5 february 2020, 50 000 people were at the same place. without the extended holidays, the per capita area of 440 million people would change by a factor of m = 1.942 8, estimated from the data of shanghai hongqiao railway station. then we derive the impacted infectious rate and the number of contacts, which would quadruple: β′ = mβ and c′ = 4c. by replacing the original parameters, we obtain the modified model. here we take into account the impact of both the epidemic situation and the delay in returning to work and school, and use the ahp method to obtain the weight of each factor. the factors considered in our analysis are the end date of the outbreak, the total number of cases, the economic impact, and students' graduation. we establish a weight matrix for these factors, as shown in fig. 2. then we set up a scoring system to obtain the best return date using the topsis method. when constructing the judgment matrix, we compare factors in pairs using the consistent matrix method [10-11]. a relative scale is adopted to minimize the difficulty of comparing various factors with different properties and to improve accuracy. through the analysis of the questionnaire results, the importance levels of the four relevant factors are determined with a score-based method, from equally important (denoted as 1) to extremely more important (denoted as 9), as shown in fig. 2.
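a sketch of how ahp weights can be extracted from a pairwise comparison matrix such as the one in fig. 2: the weight vector is the normalized principal eigenvector, and the consistency ratio checks that the pairwise judgments are not too contradictory. the matrix entries below are illustrative, since the full fig. 2 matrix is not reproduced in the text.

```python
import numpy as np

# random index for the consistency ratio (Saaty), indexed by matrix size
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12}

def ahp_weights(A, tol=1e-12):
    """Principal-eigenvector weights of a reciprocal pairwise matrix A."""
    n = A.shape[0]
    w = np.ones(n) / n
    for _ in range(1000):                 # power iteration
        w_new = A @ w
        w_new /= w_new.sum()
        done = np.max(np.abs(w_new - w)) < tol
        w = w_new
        if done:
            break
    lam = (A @ w / w).mean()              # principal eigenvalue estimate
    ci = (lam - n) / (n - 1)              # consistency index
    return w, ci / RI[n]                  # weights, consistency ratio

# illustrative 4x4 matrix: end date, total cases, economic impact, graduation
A = np.array([[1.0, 1/5, 1/4, 3.0],
              [5.0, 1.0, 2.0, 7.0],
              [4.0, 1/2, 1.0, 6.0],
              [1/3, 1/7, 1/6, 1.0]])
w, cr = ahp_weights(A)
print(np.round(w, 3), round(cr, 3))   # cr < 0.1 means acceptable consistency
```

with these example judgments the total number of cases receives the largest weight and graduation the smallest, and the consistency ratio stays well below the conventional 0.1 threshold.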
in terms of importance, relative to the end time, the scores of the number of cases, the economic impact and the students' graduation are 5, 4 and 0.33, respectively. also, we set up a scoring system to obtain the best return date using the topsis method [12-13]. we first determine that both the economic impact and the students' graduation are related to the return time. it is evident that a delayed return will affect those students' graduation while harming the economy. then, the end time of the outbreak is subtracted from the original case to get a 3 × 3 matrix. furthermore, we normalize this matrix and add up the products of the weight and the value of each factor in the matrix. we consider the alternative with the largest sum to have the highest score. comparing those scores, we finally derive the most suitable return date with the bat model. to realize the prediction of the future situation of the outbreak, we need to determine the specific values of the parameters in the 4+1 penta-group model [14]. we set the time of the emergence of the first case as t = 1 d. according to the time of the emergence of much news about covid-19 and the time when people started to pay attention, we set t_0 as 42 d (23 january 2020). the mean time from symptom onset to isolation is 6.138 8 d (interval: 5.967 6-6.320 0 d) [15], from which we obtain 1/α = 6.138 8. under traffic control, considering the fact that people in hubei were forbidden to leave hubei, and that the city's buses, subways, ferries, and long-distance passenger transport were suspended from 10:00 am on 23 january 2020, we assume that each person has daily contact only with family members at home, i.e., a census household size of 3.1 people per household (from the national data).
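the ahp-weighted topsis scoring described above can be sketched as follows: the decision matrix is column-normalized and weighted, and each alternative is ranked by its relative closeness to the ideal best and ideal worst solutions. the candidate return dates and criterion values below are hypothetical, not the authors' 46 evaluated dates.

```python
import numpy as np

def topsis(M, weights, benefit):
    """Rank alternatives (rows of M) by closeness to the ideal solution.
    benefit[j] is True if larger values of criterion j are better."""
    V = M / np.linalg.norm(M, axis=0)          # vector normalization
    V = V * weights                            # apply criterion weights
    best  = np.where(benefit, V.max(axis=0), V.min(axis=0))
    worst = np.where(benefit, V.min(axis=0), V.max(axis=0))
    d_best  = np.linalg.norm(V - best, axis=1)
    d_worst = np.linalg.norm(V - worst, axis=1)
    return d_worst / (d_best + d_worst)        # closeness score in [0, 1]

# three hypothetical return dates scored on: days until outbreak end (cost),
# projected extra cases (cost), economic loss (cost), graduation delay (cost)
M = np.array([[60.0, 500.0, 2.0,  0.0],
              [30.0, 200.0, 5.0, 10.0],
              [10.0,  50.0, 9.0, 30.0]])
weights = np.array([0.10, 0.45, 0.35, 0.10])   # e.g. from the AHP step
scores = topsis(M, weights, benefit=np.array([False, False, False, False]))
print(np.round(scores, 3), "best:", scores.argmax())
```

in this toy instance the middle date wins: it trades a moderate case count against a moderate economic loss, which is exactly the kind of compromise the scoring system is meant to surface.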
the total population of china is n = 1 400 050 000, while the population of hubei is 59 170 000, so the average number of contacts is c = 20, calculated as a decentralized average and considered to be the number of people exposed to an exposed person without isolation. for the recovery rate and the pneumonia mortality, we apply the latest data to eqs. (4) and (5) and obtain their variation over time; these two parameters depend on the time t. at last, we use the nonlinear least squares method to get the β and p that best match the actual data: when the sum of the squared deviations of the predicted values from the actual data is smallest, we obtain β = 0.0195 and p = 0.052. until now, we have all the parameters, and we then set the initial conditions. with the initial conditions brought in, we obtain the predicted results of the infections of the covid-19 outbreak in china, as shown in fig. 4. to verify the validity of our model, we use the coefficient of determination and the correlation coefficient between the predicted curve and the actual data as the reliability evaluation criteria of the model. the closer the two coefficients are to 1, the higher the correlation, and hence the higher the reliability of the predicted results. the final results are r² = 0.90529 and r = 0.99538; both are greater than 0.9, so the correlation is very high. our model can thus be considered highly reliable. from the prediction results, by the seven-day rule of no new infections, we conclude that the epidemic will end on 2 may 2020, with a total number of cases of 103 321. the number of exposed individuals peaked at 22 540 on 8 february 2020, and the number of infectious individuals peaked at 58 125 on 22 february 2020. we assume that the general return rush lasts for seven days, and use the bat model to evaluate 46 candidate return dates (from march 1 to april 15).
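the two reliability criteria quoted above (r² = 0.90529 and r = 0.99538) can be computed as follows; the series here are toy placeholders, not the fitted chinese case data:

```python
import math

def r_squared(actual, predicted):
    """coefficient of determination between a fitted curve and the data."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def pearson_r(x, y):
    """linear correlation coefficient between two series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

actual = [100.0, 140.0, 190.0, 260.0]    # toy daily counts, not the real data
predicted = [105.0, 138.0, 195.0, 250.0]
r2 = r_squared(actual, predicted)
r = pearson_r(actual, predicted)
```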
finally, we obtain that the best time to return to work or study is from march 15 to march 22, which means universities would start on march 23. judging from the prediction results, we cannot relax our vigilance at this stage but should wait patiently for the epidemic to end, although the epidemic situation has passed its inflection point. in the following analysis, we assume that the government and the people will continue to take strict precautionary measures. assuming that the cured all become susceptible individuals again and share the same infectious rate, while the other equations remain the same, we rewrite eqs. (1) and (6) to account for the presence of secondary infections; the new framework is shown in fig. 5. we apply eqs. (9) and (10) to get the results shown in fig. 6. the prediction model gives the outbreak prediction results while simultaneously evaluating the prediction of secondary infection at different times, and finally gets almost the same result, from which we know that there is no need to panic if secondary infection occurs: as long as we continue to take protective measures, the overall situation will not be significantly affected. predictably, the emergence of effective therapies and specific drugs would significantly improve γ and reduce μ. furthermore, we assume that the influence of the emergence of effective drugs on the two parameters has the following three conditions: only the recovery rate increases, only the pneumonia mortality decreases, or both the recovery rate and the pneumonia mortality change. here, we increase the cure rate to 0.5 and reduce the pneumonia mortality by a tenth. we take three values as the possible times for effective drugs to appear: t = 80 d (march 1, 2020), t = 90 d (march 11, 2020), and t = 100 d (march 21, 2020), and get the results shown in table 1.
while the improvement of the cure rate and the reduction of the mortality will increase the number of cured people and reduce the number of deaths, a delay in the appearance of effective treatments and drugs will increase the number of deaths and decrease the number of cured individuals. these two parameters have a relatively small impact on the end date of the outbreak and on the total number of infections. we take the exchange between the first-tier city of shanghai and the republic of korea as an example to discuss the influence of the population flow from japan and south korea on the covid-19 outbreak in china. south korea suspended most flights to china, according to statements from major korean airlines. we regard february 4, the date of the announcement, as the time node; six routes remained open after the time node, meaning that about 20 flights could still be taken, whereas the previous 59 routes had about 33 flights. while the numbers of flights from shanghai to korea are 16 and 30, we determine that the korean authorities allow only asymptomatic passengers on flights both before and after the decision. moreover, in the context of a large population, we use foreign predictive case data to figure out the ratio of exposed people to susceptible people. on the basis of this ratio, the number of exposed and susceptible people arriving in china by plane is estimated; it is then added to the corresponding groups in shanghai to figure out the final prediction result. korean air usually uses the boeing 777, which can carry between 305 and 550 people. we take the mid-value of 428 people and obtain that the daily population flows of korea and shanghai around february 4 are 1 712 and 428, respectively.
supposing the daily population flow is a, and the proportion of the latent population in the sum of the latent and susceptible populations is k, we write the corresponding expressions for shanghai. using the previous model, we change the data to the infection data of shanghai, obtain the results, and compare them with the predicted results that do not consider the flow between korea and shanghai. by comparison, we find almost no difference: the results differ only in the second digit after the decimal point. therefore, we conclude that although the news will strongly report the entry of sick foreigners, it has a relatively small impact on overall epidemic control. we should not be panicked by such reports, nor should we relax because there are no reports; we should pay more attention to protection against the exposed individuals, the main source of infection. in this paper, we establish the 4+1 penta-group model to predict the spread of covid-19 and infer the most suitable return date for university students using the bat model. firstly, we develop a basic seir-based model for predicting the spread of new viruses. secondly, with the help of methods such as ahp and topsis, and taking various factors into account, we establish the bat model to obtain an optimal time for the return to work and study, which is of great practical significance. our estimates perform much better in the long run than the estimates of other forecasting models. our approach is also innovative in introducing the isolation strength parameter and the isolation time, and in adding a dead population on top of the seir model. the time-varying mortality and cure rates are obtained by fitting rather than being held constant, which is more realistic.
[1] seir-based novel pneumonia transmission model and inflection point prediction analysis
[2] analysis of the role of current prevention and control measures in the epidemic of new coronavirus based on seir model
[3] preliminary prediction of the basic reproduction number of the wuhan novel coronavirus 2019-ncov
[4] global stability analysis on one type of seir epidemic model with floating population
[5] novel coronavirus 2019-ncov: early estimation of epidemiological parameters and epidemic predictions
[6] early transmission dynamics of novel coronavirus pneumonia epidemic in shaanxi province
[7] nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study
[8] study on the stability of infectious disease dynamics model
[9] survey of transmission models of infectious diseases
[10] research on computation methods of ahp weight vector and its applications
[11] fuzzy analytic hierarchy process
[12] a review of the comprehensive multiindex evaluation method
[13] the improved method for topsis in comprehensive evaluation
[14] parameter identification for a stochastic seirs epidemic model: case study influenza
[15] modelling the epidemic trend of the 2019-ncov outbreak in hubei province

key: cord-276782-3fpmatkb authors: garbey, m.; joerger, g.; furr, s.; fikfak, v. title: a model of workflow in the hospital during a pandemic to assist management date: 2020-05-02 journal: nan doi: 10.1101/2020.04.28.20083154 sha: doc_id: 276782 cord_uid: 3fpmatkb we present a computational model of workflow in the hospital during a pandemic. the objective is to assist management in anticipating the load of each care unit, such as the icu, or ordering supplies, such as personal protective equipment, but also to retrieve key parameters that measure the performance of the health system facing a new crisis. the model was fitted with good accuracy to france's data set that gives information on hospitalized patients and is provided online by the french government.
the goal of this work is both practical, in offering hospital management a tool to deal with the present crisis of covid-19, and conceptual, in illustrating the benefit of computational science during a pandemic. intensive care units (icu) are, or are going to be, severely strained. healthcare professionals are often forced to make difficult decisions in patient care and resource allocation. patient profiles might be out of the ordinary routine of the hospital, and the workflow must be different. end-to-end, on-demand visibility with identification of real constraints is needed by senior management. a manager may have simple but essential questions, such as: how many beds do i need on the floor, how many beds are available in the critical care unit, how many supplies should be ordered to take care of our patients and protect our staff from infection, how long will the facility have to work at maximum capacity, is there enough staff to hold this workload long enough, are we doing well with patient outcomes, etc. epidemic models can forecast the number of people who are going to be symptomatic enough to require hospitalization, and this approach has been quickly applied to covid-19 with success [20, 25]. in the case of the covid-19 pandemic, it is particularly difficult because large numbers of infected people are asymptomatic; consequently, the basic reproduction number r0 of covid-19 is still under active debate. on the hospital workflow side, while there is a large amount of work on this topic [23], one of the difficulties is to assess the death rate of patients hospitalized at the beginning of the pandemic, because the length of stay (los) is rather long and the disease is still not well understood [2, 16, 26]. every hospital has to adapt to the new crisis as it arrives, so clinical practice may vary greatly from one institution to another.
a number of guidelines and detailed reports have been quickly issued to support the health community, but it takes time to standardize the healthcare process [3, 5, 13, 24]. our goal in this paper is to come up with a simple and robust mathematical framework that is easy to use and that supports the management of the patient workflow during a pandemic. such a model should operate on a relatively limited data set that reports daily on quantities such as the number of patients admitted for hospitalization, and that is simple in its acquisition. much more can be done with the patient electronic records that detail patient comorbidities and chronic conditions, provided that the disease of the pandemic is well understood. we have used a markov process description of the workflow's graph, with probabilities governing the patient transitions from one care unit to another, as well as a simple statistical model of the patient los at each stage. we will show that, with a minimum number of parameters fitted on the time series listed above over a period of a few weeks, one may start to assemble the information needed to assist senior management in getting answers and identifying real constraints, reducing speculation and misallocation of resources. this work is our first iteration toward a very ambitious goal: as data becomes available, the quality and level of detail of the modeling should keep improving to achieve better results. it is our hope that such an effort, among many others, will once again prove how much digital health can benefit from computational science to improve patient care. the paper is organized as follows: section 2 describes our method to construct the model and details the choices we made to work with the data set at hand; section 3 gives the main results and the solution to our initial goal of supporting management; section 4 discusses the benefits and limitations of our method and concludes with further potential developments.
because of the sparsity of the data available to construct a predictive model during a pandemic crisis, we are going to use a very simple model that reproduces the workflow of table 1. let us start with a brief description of the standard patient workflow (see figure 1) with respect to disease progression (see figure 2). the patient moves from one care unit to another according to his/her condition. the first two steps are registration and diagnostics, which in principle should be a relatively quick process. the patients who stay in the hospital, because their health condition justifies a longer stay, are first put in a ward unit for further assessment and treatment. this step is where a number of medical imaging procedures start, involving either a chest ct scan in the imaging center or a chest x-ray with a mobile unit.

(table 1: probability of transition for the patient, in reference to the workflow of figure 1.)

meanwhile, significant biological lab work starts to grade the patient's condition more precisely and continues during the patient's stay. these resources, i.e. imaging and lab work, are typically shared by all patients in the hospital and may therefore slow down the process. for simplicity, and in the absence of an adequate data set for validation, we neglect these constraints. some of the patients who receive medical attention do well with conservative management only and can be discharged home after a few days. for others, the health condition may deteriorate, and those patients will need to be moved to the imu for a higher level of care and/or transferred to the icu for ongoing monitoring and mechanical ventilation. the imu and icu require extensive supplies and resources. it is often mentioned that the number of available ventilators is critical to icu function.
however, it is not the only limiting factor: patients under mechanical ventilation need sedation and might be connected to a number of additional systems to deal with organ failures. once again, for simplicity and because of the lack of input data, our model will not take these bottlenecks into account; there is no technical difficulty in adding those constraints to the mathematical model, thanks to our bottom-up description of the workflow as in [10, 15]. additional steps can be recovery, for the patient doing well, or unfortunately palliative care, when the patient is not responsive to treatment. there are many exceptions and singularities to these standard paths: for example, a patient may go directly from admission to the icu when their condition is too unstable, and in some hospitals the floor might be shared by patients recovering from covid-19 and palliative care patients. despite this, we separate these functional units in our model to clarify the workflow process according to what each patient stage requires in terms of resources and time to deliver adequate care. to summarize, a simple workflow graph is created, and the main requirements are to know (i) the probability that a patient goes from one care unit to another and (ii) a statistical estimate of how long the patient should stay in each care unit before moving on [8, 17]. our model follows a markov process for (i): there is a probability associated with each branch of the graph, summarized in table 1. with respect to (ii), we use a lognormal distribution that can be reconstructed from the parameters listed in table 2 and table 3. this simple framework allows us to construct a generic model
these time series can be fitted to the existing data the hospital obtains over a period of a few weeks, in order to retrieve the performance parameters of table 1. once the model is calibrated, it can be used to extrapolate the load of each care unit over the next few days and to anticipate the needs in staff and supplies (see table 4). this discrete model is stochastic, so one needs to run many simulations to build a statistical estimate of such quantities. it is appropriate to retrieve the unknown parameters of the model using a form of stochastic optimization, such as a genetic algorithm, since the model workflow process, like the one in the hospital, is discrete, noisy, and nonlinear.

table 4. number of staff required at each care unit, per beds, in reference to the workflow of figure 1:

           floor     imu       icu      recovery   palliative
  nurse    5 beds    2 beds    2 beds   4 beds     3 beds
  md       10 beds   10 beds   6 beds   10 beds    10 beds

let us describe the data set we are using to construct our model. the french government has kindly decided to release the records of most public hospitals around the country during the covid-19 crisis. from this excel file, we can easily recover the number of patients staying in hospitals, the number of patients in icu, the number of patients healed and discharged, and the number of patients dying in a medical institution. those numbers are updated daily and go back to march 18, 2020 [27]. we will extensively use this french data set (fds) to identify the missing parameters of our model. the number of parameters of our model is relatively large: about one parameter for each branch of the graph minus the number of nodes for (i), and two parameters for the lognormal distribution of (ii) in each care unit. to avoid over-fitting, one should come up with a strategy that lowers the number of unknowns, based either on the literature or on hypotheses that can be validated otherwise.
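the stochastic workflow described above, markov transitions between care units with lognormal lengths of stay, can be sketched as follows; the unit names, transition probabilities and lognormal parameters are illustrative placeholders, not the calibrated values of tables 1 to 3:

```python
import random

# illustrative two-unit workflow: a ward and an icu, with absorbing
# "discharged" and "dead" states; probabilities are NOT the paper's alphas.
TRANSITIONS = {
    "ward": [("discharged", 0.6), ("icu", 0.3), ("dead", 0.1)],
    "icu":  [("ward", 0.5), ("dead", 0.5)],
}
LOS = {  # (mu, sigma) of log(length of stay in days), illustrative
    "ward": (1.6, 0.5),
    "icu":  (2.3, 0.6),
}

def simulate_patient(rng, start="ward", max_steps=20):
    """follow one patient until an absorbing state; return (path, total days)."""
    unit, days, path = start, 0.0, [start]
    for _ in range(max_steps):
        if unit not in TRANSITIONS:
            break  # absorbing state reached: discharged or dead
        days += rng.lognormvariate(*LOS[unit])
        r, acc = rng.random(), 0.0
        for nxt, p in TRANSITIONS[unit]:
            acc += p
            if r < acc:
                break
        unit = nxt  # last entry acts as fallback against rounding
        path.append(unit)
    return path, days

rng = random.Random(0)
runs = [simulate_patient(rng) for _ in range(2000)]
death_rate = sum(path[-1] == "dead" for path, _ in runs) / len(runs)
```

running many such patients and binning their unit occupancy by day is the basic mechanism behind the load curves and uncertainty bands described in the text.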
we are going to describe hereafter the rationale for our choices, to the best of our knowledge, and further discuss some of the limitations of our model in section 4. first of all, a lognormal distribution of the duration of each step of the process may be justified as follows. biological processes, such as incubation and recovery, are often described as such [18, 19]: first, the patient's condition is indeed dominated by his/her biological time; second, medical procedures, with their associated time lags and delays, are also often best described as lognormal processes [10, 22], with a long tail. this is not in contradiction with the fact that the patient los in the hospital may not be ideally described by a simple exponential distribution or similar: overall, the los adds up the time distributions of each step of a markov process and might be described as the convolution of the probability distributions of each step [14]. now let us review the parameters of table 1, which give the probability of transition from one unit to another, in order to rationalize the construction of our generic model. one can first list the following constraints, assuming that all possible paths are exhaustively listed in the workflow of figure 1, so that we have:

α2 + α3 + α4 = 1, α5 + α6 + α7 = 1, α8 + α9 = 1, α10 + α11 = 1, α12 + α13 = 1. (1)

overall, the death rate and recovery rate of patients who are staying in the hospital should be within acceptable limits.

(this preprint is made available under a cc-by-nc-nd 4.0 international license; the copyright holder, the author/funder, has granted medrxiv a license to display the preprint in perpetuity. this version was posted may 2, 2020. https://doi.org/10.1101/2020.04.28.20083154)
technically, the death rate of hospitalized patients is βd; similarly, the recovery rate of hospitalized patients is βh = 1 − βd. βd is difficult to assess with a pandemic that has just started: as a matter of fact, most infected patients are still in the hospital and their outcome may not be clear. we therefore look for lower and upper bounds on βd that limit our search. in france, as of april 17, 2020, the number of deaths in hospitals was 11,842 patients and the number of recovered was 35,983 patients [27]. assuming that the proportion of deaths versus recoveries will be about the same for the patients who are still ill, the death rate of hospitalized patients should be around 25%. finally, according to [26], an early estimate of the death rate for hospitalized patients in wuhan, china, based on a case series of 191 patients, was 54/191 = 28%. we restrict ourselves to the model matching the fds with a [10%, 40%] death rate interval, that is, 0.10 ≤ βd ≤ 0.40. according to several reports, including the icnarc one mentioned above, it is expected that about 50% of the patients in icu die. [6] provides much further detail on the probability of survival of patients with ards under mechanical ventilation as a function of the day of its start; it shows that about 25% of the patients in icu die during the first few days from severe complications. we therefore introduce an artificial two-phase decomposition of the patient stay in the icu, to bypass the limitation of a single lognormal distribution, which may not be an adequate model of the los in this unit according to the clinical studies of [6]: a short phase one, with mortality driven by α8, and a longer phase two, with mortality driven by α10. there are also a few parameters in table 1 that should have little to no effect on the statistics when matching our model to the fds.
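the 25% estimate above follows directly from the quoted french counts:

```python
# quick check of the ~25% death-rate estimate from the cited french totals
deaths, recovered = 11_842, 35_983  # hospital counts as of april 17, 2020 [27]
beta_d = deaths / (deaths + recovered)  # death rate of hospitalized patients
beta_h = 1.0 - beta_d                   # recovery rate of hospitalized patients
```

beta_d comes out at about 0.248, inside the [10%, 40%] search interval and close to the 28% wuhan case-series estimate.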
the fds is based on hospitalized patients, so α1 cannot be recovered from this data set. according to the fds, about 30% of the patients who show up at the emergency room (er) return home [27]; we will choose α1 = 0.3. according to dr. m. mueller [29], 25% of the patients who are not responsive to treatment may leave palliative care alive and be discharged home. this may vary depending on each country's or hospital's policy; because patients with covid-19 in palliative care are still very contagious, we assume they stay in the hospital until the end, and we choose α12 = 0 for all our calculations. to sum up, our model essentially needs the calibration of 6 parameters, namely a = (α2, α3, α5, α6, α8, α10), under the set of constraints (1), (2), (3). let us denote f_admission(jd) the number of patients admitted per day jd ∈ 1..n to the hospital who have a positive diagnosis and must stay in the hospital. we find a as the solution of the minimization problem of the weighted norm (4), where f_s is the mean of a large number of runs of the model. this number of runs is set large enough that the solution of the optimization problem is independent of it. as mentioned above, we use a genetic algorithm to solve this minimization problem. the weight factors (γ1, γ2, γ3) in (4) can be set equal or unequal, to favor the quality of the fit for one of the variables, such as the number of patients in the icu, which is critical to management.
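the objective being minimized can be sketched as a γ-weighted sum of squared misfits between the mean simulated series f_s and the observed series; the toy arrays below stand in for the icu, recovery and death time series:

```python
def weighted_norm(sim, data, gammas):
    """eq. (4)-style objective: gamma-weighted sum of squared differences
    between simulated and observed time series (toy illustration)."""
    total = 0.0
    for g, s, d in zip(gammas, sim, data):
        total += g * sum((si - di) ** 2 for si, di in zip(s, d))
    return total

sim  = [[10, 12], [5, 6], [1, 2]]  # toy simulated series (icu, recovered, dead)
data = [[11, 12], [5, 7], [1, 2]]  # toy observed series
J = weighted_norm(sim, data, gammas=(1.0, 1.0, 1.0))
```

a genetic algorithm, as in the paper, would repeatedly evaluate this objective at candidate α vectors satisfying the constraints and keep the best-scoring candidates.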
we construct a 204 lognormal distribution of duration for the patient stay in such a way that about 90% of 205 the patients' stay will be within a coarse approximation [p, q] listed in these tables. the choice of the parameters in table 2 and table 3 might be easier to come up with. 207 one of the most remarkable features is that patients with covid-19 who stay in the 208 icu can be longer than usual [1] . the los in palliative care was set according to dr. 209 m. mueller's data [29] . we have used extensively [2, 26] , as well as the feedback from 210 clinicians in the field to estimate the interval of variation for the parameters [p, q] the 211 best we could. we used a fairly large interval since it can be observed that the standard 212 deviation for los in each care unit is large as described in this report from the imperial 213 college london covid-19 response team [30] . one may fine tune the interval value 214 [p, q] if needed in the fitting process of the model to the data set of time series available. 215 to distinguish those unknown parameters that are important from those who are 216 less significant, we run linear sensitivity analysis for each of our results. this method is 217 used to confirm that the time window parameters of table 2 and table 3 have a 218 secondary effect on the quality of the model fitting. finally, we derive from our model some predictions on staffing and supplies for the 220 next week or so, as well as the load foreseen for each care unit. the nature of the 221 stochastic simulation automatically gives an uncertainty estimate on these predictions 222 that increases as time grows. to compute supplies such as personal protection kits, we 223 can use some adaptation of the reference of the cdc web site [31] that was constructed 224 for ebola. our software can then be used to feed the stock management scheme 225 implemented by cdc for covid-19 [32] . 
in this paper, we use a gross estimate of two personal protective equipment (ppe) kits per shift and per staff member, for simplicity. table 4 gives a gross approximation of the number of nurses and staff per bed in each unit. those figures depend on the crisis situation and might differ from country to country [5]. in order to take into account the fact that staff and supplies are limited and require hard management choices during a pandemic crisis, we tested the model further against the scenario of a shortage of nurses, who are essential in intensive care units. to introduce a risk factor due to the shortage of nurses, we have extrapolated from [9]. to get a continuous approximation, we assume that the shortage of nurses has a linear effect, and use linear interpolation for shortages from 0% to a 40% maximum. this is certainly a gross approximation, but we felt that it was important to bring awareness to those effects with a simulation tool. we present our results in the next section. let us first report on the model fitting with the fds. we sum the number of admissions, patients in icu, recoveries and deaths over the whole country of france, in order to get a robust data set that averages out the noise of the data. we calibrated the model to this largest data set, which covers the period 3/18/20 to 4/11/20, and found a death rate of about 25%. this result is in agreement with the estimate we made in section 2, as of april 17, 2020 [27]; the calibrated parameters are reported in tables 1 to 3. all numbers have been scaled by a factor to represent an average hospital size. the sensitivity analysis on the unknown vector a is reported in figure 3.
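the linear risk interpolation described above can be sketched as follows; the 30% maximum risk increase is an illustrative placeholder, not the value extrapolated from [9]:

```python
# hypothetical multiplicative risk factor on adverse patient outcomes,
# growing linearly with the nurse shortage and capped at a 40% shortage;
# the 1.0 -> 1.3 range is illustrative, not taken from [9].
def shortage_risk(shortage, max_shortage=0.40, max_risk=1.3):
    s = min(max(shortage, 0.0), max_shortage)  # clamp to [0, max_shortage]
    return 1.0 + (max_risk - 1.0) * s / max_shortage
```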
we observed that the number of patient admissions is not a smooth curve: typically, sundays have less activity, with fewer patients discharged than on weekdays. however, the model fit seems adequate and robust to a small variation of the parameters. the logic of the influence of the parameters is simple, α2 being the most important one for all outputs; each of the six parameters seems to have some significant influence for at least one of the outputs. in order to compare the results obtained with designated subsets of the fds that correspond to the hospitals in paris and the hospitals in alsace, we ran the simulation with the exact same set of parameters found for the data set of the whole country. alsace was the busiest cluster at the beginning of the pandemic, followed later by paris and ile de france. the results for alsace are reported in figure 6 and figure 7. we observed a fairly large difference in the model's prediction of the number of patients under mechanical ventilation: it seems that, at the peak of the pandemic in alsace, the number of patients under mechanical ventilation was lower in reality than in the model. one possible factor would be a shortage of available beds in the icu. on the other hand, the number of deaths did not go significantly higher than expected.
a better explanation might be the fact that a fairly large number of patients in critical condition were transferred to hospitals in different parts of the country or in neighboring countries: according to local newspapers, more than 110 patients from alsace have been transferred [33]. this seems coherent with our results: the scaling factor for the alsace data set, chosen to get a maximum hospitalization rate of about 50 patients per day, is 6; the overshoot of the icu prediction is about 20 in figure 6; the total maximum overshoot is therefore about 120; and considering that the average los in the icu (see table 1) is roughly 12 days, our model still seems to give an adequate approximation. unfortunately, we do not have enough information to add this new patient path to the workflow of figure 1. this phenomenon is less present in the results for the paris data set, but it is still there (see figure 8 and figure 9). one can indeed refine the parameter fitting to be specific to alsace and paris, in order to reflect that the clinical decision process in the workflow, i.e. the parameters of tables 1 to 3, might be sensitive to how much the local system is under stress, but we should then take into account the numbers of transferred patients, which are not negligible. next, let us describe the use of our model to assist daily management in the hospital during the pandemic. one key factor is to anticipate the load of each care unit and the required resources, either to match the increase in the number of patients or to reallocate resources to other patients who have seen their surgery postponed. we choose a hypothetical scenario that might occur if the confinement conditions that contain the pandemic are lifted too early: we assume that the hospital has a nominal low flux of patients from weeks 1 to 7, and that a recurrence with a daily 20% increase of new incoming patients occurs in week 8.
Figure 10 shows the dynamics of the load of each care unit, in particular the large delay in the number of patients in the ICU, which is the last unit to become saturated. The black curves are a simulation of the previous week's load (week 7), while the red curves are the prediction for the following week (week 8). As an illustration of the capability of the model, Figure 10 and Figure 11 provide an estimate of the growth of the resources needed to face the new patient wave. A number of decisions would have to be made in regard to patient care. Figure 12 compares the patient output with and without a shortage of nurses. Those results are speculative, since it is difficult to quantify the risk for patients beyond the results published in [9] and [8]. It is our hope that the data accumulated during crises such as the present episode of COVID-19 will give mathematical modelling the basis to make this estimate rigorously in future work.

There are many mathematical models available, even for the present crisis; see [25] and [26]. It might be difficult to assess the basic reproduction number R0, which is under active debate.
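The hypothetical resurgence scenario above can be sketched as an admission curve. The 20% daily increase in week 8 comes from the text; the baseline flux of 2 patients per day is an assumed placeholder:

```python
# Sketch of the hypothetical scenario: a nominal low, constant flux of new
# patients in weeks 1-7, then a resurgence with a 20% daily increase in week 8.
def admissions(day, baseline=2.0, growth=0.20, resurgence_start=49):
    """Daily number of new patients entering the workflow model."""
    if day < resurgence_start:          # weeks 1-7: nominal load
        return baseline
    # week 8 onwards: exponential growth, 20% more patients each day
    return baseline * (1.0 + growth) ** (day - resurgence_start)

curve = [admissions(d) for d in range(56)]  # 8 weeks of daily admissions
```

Feeding such a curve into the workflow model produces the unit-by-unit load predictions of Figures 10 and 11.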
It is probably even more difficult to assess the exact impact of global confinement, or of targeted confinement, on the parameters that characterize the pandemic model. We should, however, be able to use our model to test whether the effects on the most critical resources, such as ICU beds and delays in care, are linearly or nonlinearly related to those parameters. Let us use the most simplistic ordinary differential equation epidemiology model: the function i(t) is used as the input of our workflow model and represents the number of patients admitted to the hospital. We test the influence of the transmission rate on the number of ICU beds over a 16-week period. Figure 13 shows that the maximum number of ICU beds required during the epidemic is significantly higher when admission goes up to 50 patients per day. This is a significant load for any hospital system, because all patients suffer from the same disease and cannot be triaged using the existing departmental structure. The hospital system needs to recruit resources quickly enough to deliver quality patient care while keeping the staff safe from infection.

There are many ways of developing such a mathematical model. We chose a Markov process that can augment a workflow graph provided by the clinicians, and used a simple statistical model for the LOS of the patient at each stage, corresponding to a graph node. A number of variations in the model construction are available: for example, changing the probability distribution of the LOS for specific stages with a model more sophisticated than the lognormal, or decomposing the graph nodes into subgraphs of the workflow with more details.
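Taking the "most simplistic" ODE model to be a standard SIR system, a minimal sketch (forward-Euler integration, with assumed parameter values not taken from the paper) illustrates the experiment described above: the peak of i(t), and hence the peak ICU load, grows sharply with the transmission rate:

```python
# Minimal SIR sketch used to generate i(t), the input of the workflow model,
# over a 16-week period. beta is the transmission rate, gamma an assumed
# recovery rate; all parameter values here are illustrative placeholders.
def sir_peak_infected(beta, gamma=1 / 12.0, days=112, n=1_000_000, i0=10):
    s, i, r = n - i0, float(i0), 0.0
    peak = i
    for _ in range(days):                # forward-Euler step of one day
        new_inf = beta * s * i / n       # transmission term
        new_rec = gamma * i              # recovery term
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        peak = max(peak, i)
    return peak
```

Comparing `sir_peak_infected(0.25)` with `sir_peak_infected(0.15)` shows a much higher peak of simultaneous infections for the larger transmission rate, which is the qualitative effect on maximum ICU demand discussed around Figure 13.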
In particular, the ICU supports different paths of medical care depending on patient conditions. Because of the sparsity of the data on hand, we kept the model as simple as possible, and we were able to fit the French data set with good accuracy. Using this approach, we could:

• recover important parameters that are characteristic of the workflow, such as the probability for a patient to transition from one unit to another, and important patient outcomes such as the healing rate or the death rate;

• on a pragmatic side, use the model to assist the senior manager in answering his or her questions as listed in our introduction: how many beds do I need on the floor, how is this affecting patient outcomes, do we need to transfer patients to a different facility, and so on?

(Preprint doi: https://doi.org/10.1101/2020.04.28.20083154.)

There are a number of limitations to our approach. The smaller the hospital, the less predictable the outcome will be. With time, the characteristics of the population of patients who show up at the ER may change, and the management of the pandemic by the governing organizations will evolve. One can imagine, for example, that systematic testing would provide early diagnostics and impact the performance of the health system, as shown by the statistics of countries that were early adopters of that strategy. Because of the heterogeneity of the patient population and of disease patterns that depend heavily on patient characteristics, our next step in improving this model would be to include the patients' medical history listed in the electronic medical record.
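The modelling approach summarized above (a Markov process over the clinicians' workflow graph, with a lognormal LOS at each node) can be sketched as follows. The graph, transition probabilities and LOS parameters below are invented placeholders, not the fitted values of Tables 1 to 3:

```python
import random

# Hypothetical workflow graph: each node has a lognormal LOS (mu, sigma of
# log-days) and a transition distribution over successor nodes.
WORKFLOW = {
    "ER":   {"los": (0.0, 0.5), "next": [("ward", 0.7), ("ICU", 0.3)]},
    "ward": {"los": (1.8, 0.6), "next": [("discharged", 0.9), ("ICU", 0.1)]},
    "ICU":  {"los": (2.5, 0.7), "next": [("ward", 0.8), ("deceased", 0.2)]},
}

def simulate_patient(rng):
    """Draw one patient path (list of (unit, los_days)) through the graph."""
    node, path = "ER", []
    while node in WORKFLOW:
        mu, sigma = WORKFLOW[node]["los"]
        path.append((node, rng.lognormvariate(mu, sigma)))  # stay at this node
        units, probs = zip(*WORKFLOW[node]["next"])
        node = rng.choices(units, weights=probs)[0]         # Markov transition
    return path, node  # node is now a terminal state: discharged or deceased

rng = random.Random(0)
paths = [simulate_patient(rng) for _ in range(1000)]
death_rate = sum(outcome == "deceased" for _, outcome in paths) / len(paths)
```

Aggregating many such sampled paths per day of the admission curve yields the unit-by-unit occupancy that the paper fits to the data; the transition weights and LOS parameters are what the fitting procedure recovers.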
Above all, any model of workflow, especially during a pandemic, should be aware of the human factor. Staff can get sick or burn out during a pandemic, and there should be a number of strategies to compute that risk and enter it into the constraints imposed on the health care system [4, 11, 12, 21]. Further, human behaviour and decision processes change under stress, for economic or psychological reasons.

References
- COVID-19 in critically ill patients in the Seattle region: case series.
- Timsit. Severe SARS-CoV-2 infections: practical considerations and management strategy for intensivists.
- Centers for Disease Control and Prevention: 2019 novel coronavirus.
- Overcrowding and understaffing in modern health-care systems: key determinants in meticillin-resistant Staphylococcus aureus transmission.
- DoD COVID-19 practice management guide.
- Characteristics and outcomes in adult patients receiving mechanical ventilation (JAMA).
- Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand.
- Temime. Management of nurse shortage and its impact on pathogen dissemination in the intensive care unit (Epidemics).
- The returns to nursing: evidence from a parental leave program (Working Paper 23174).
- Multiscale modeling of surgical flow in a large operating room suite: understanding the mechanism of accumulation of delays in clinical practice.
- An agent-based and spatially explicit model of pathogen dissemination in the intensive care unit.
- The effect of workload on infection risk in critically ill patients.
- Modelling hospital length of stay using convolutive mixture distributions.
- A cyber-physical system to improve the management of a large suite of operating rooms.
- A predictive model for the early identification of patients at risk for a prolonged intensive care unit length of stay.
- The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: estimation and application.
- Markus Abbt. Log-normal distributions across the sciences: keys and clues (BioScience).
- Early efforts in modeling the incubation period of infectious diseases with an acute course of illness.
- Multi-city modeling of epidemics using spatial networks: application to 2019-nCoV (COVID-19) coronavirus in India.
- Understaffing, overcrowding, inappropriate nurse:ventilated-patient ratio and nosocomial infections: which parameter is the best reflection of deficits?
- Modeling the uncertainty of surgical procedure times: comparison of log-normal and normal models.
- Traversing the many paths of workflow research: developing a conceptual framework of workflow terminology through a systematic literature review.
- 2019-nCoV situation reports.
- A mathematical model for the novel coronavirus epidemic in …
- Discovery of a novel coronavirus associated with the recent pneumonia outbreak in humans and its potential bat origin (bioRxiv).
- ICNARC report on COVID-19 in critical care.
- Palliative and end of life care for patients with respiratory diseases.
- Symptom progression of COVID-19.
- Estimated personal protective equipment (PPE) needed for healthcare facilities.
- Personal protective equipment (PPE) burn rate calculator.
- Où les patients alsaciens atteints du Covid-19 ont-ils été transférés ? [Where have the patients from Alsace with COVID-19 been transferred?]

Acknowledgements
We would like to thank Patrick Doolan for sharing his views with us on management and risk evaluation, from his great experience acquired in the energy sector.

Declarations of interest: none.

key: cord-276218-dcg9oq6y
authors: Kim, Jihoon; Koo, Bon-Kyoung; Knoblich, Juergen A.
title: Human organoids: model systems for human biology and medicine
date: 2020-07-07
journal: Nat Rev Mol Cell Biol
doi: 10.1038/s41580-020-0259-3
doc_id: 276218
cord_uid: dcg9oq6y

The historical reliance of biological research on the use of animal models has sometimes made it challenging to address questions that are specific to the understanding of human biology and disease.
But with the advent of human organoids — which are stem cell-derived 3D culture systems — it is now possible to re-create the architecture and physiology of human organs in remarkable detail. Human organoids provide unique opportunities for the study of human disease and complement animal models. Human organoids have been used to study infectious diseases, genetic disorders and cancers through the genetic engineering of human stem cells, as well as directly when organoids are generated from patient biopsy samples. This Review discusses the applications, advantages and disadvantages of human organoids as models of development and disease, and outlines the challenges that have to be overcome for organoids to be able to substantially reduce the need for animal experiments.

The use of classical cell line and animal model systems in biomedical research during the late twentieth and early twenty-first centuries has been successful in many areas, such as improving our understanding of cellular signalling pathways, identifying potential drug targets and guiding the design of candidate drugs for pathologies including cancer and infectious disease. The value of the achievements these model systems have made possible is proved by the fact that their use has become near universal in biomedical research today. Historically, the investigation of disease mechanisms in animal models has progressed along a common discovery pipeline, whereby biological processes were initially investigated by genetic screens in invertebrates, followed by an analysis of evolutionary conservation in mammalian model systems, eventually leading to clinical translation to humans. The common principles of animal development and organ physiology derived from this approach have led to a detailed mechanistic understanding of many human diseases. Nevertheless, extrapolating results from model systems to humans has become a major bottleneck in the drug discovery process.
Furthermore, recent studies have identified biological processes that are specific to the human body and cannot be modelled in other animals. These include, for example, brain development, metabolism and the testing of drug efficacy. The emergence of human in vitro 3D cell culture approaches using stem cells from different organs has therefore received widespread attention as having the potential to overcome these limitations. Attempts to model the biology of human organs — including the differentiation of human stem cells in 2D, in either the presence or absence of a 3D matrix; bio-printing of human cells; and the culture of cells in a microfluidic device ('organ-on-a-chip') [1] — were made prior to the emergence of organoids and have shown some potential for drug screening or human disease research. However, organoids are unique in that they are self-organizing, 3D culture systems that are highly similar to — and in some cases histologically indistinguishable from — actual human organs [2-8]. One feature that is common to all organoids is that they are generated from pluripotent stem cells (PSCs) or adult stem cells (ADSCs; also known as tissue stem cells) by mimicking human development or organ regeneration in vitro [9]. The analysis of organoid formation can thus provide valuable information about the mechanisms underlying human development and organ regeneration, highlighting their value for basic biological research in addition to their potential application in pharmaceutical drug testing and molecular medicine. The potential of organoids to complement existing model systems and extend basic biological research, medical research and drug discovery into a more physiologically relevant human setting is becoming ever more widely appreciated. However, the development of organoid technology is still in its infancy in comparison to established cell line and animal models, with challenges still to overcome.
In this Review, we describe examples of PSC-derived and ADSC-derived organoid models that have shown great potential and promise in modelling human disease for a broad spectrum of life stages, from early development through to adulthood. Many excellent reviews have already described the various organoid model systems [9-14], highlighted their relevance for disease modelling [15,16] and outlined organoid-associated protocols [17]. This Review will instead discuss organoids as a novel model system for understanding human development and genetics that complements, and in the future might reduce, the use of animal models. We emphasize the differences between the mouse and human systems, evaluating the advantages and disadvantages of animal models and human organoid models. We also highlight the tools and methodologies that are available for human organoid studies, in the hope that this information will lead basic biologists to consider using this emerging platform for the study of human pathophysiology.

The diversity of biological model systems
Building on the fundamental idea that biological mechanisms are conserved throughout evolution, biomedical research has traditionally focused on only a handful of model organisms. These models are typically robust, fast-growing species that generate a large number of offspring in a short time and can be propagated at low cost in the laboratory. Researchers benefit greatly from the use of an established model organism, as there are already detailed descriptions of its development and physiology, a large body of experimental techniques and many community resources, including reagents and species-specific databases.
While some model organisms, such as yeast (Saccharomyces cerevisiae), fruit flies (Drosophila melanogaster) and mice (Mus musculus), have a long tradition of use in research, other model systems have been established more recently but have nevertheless helped broaden our knowledge by presenting us with a new set of experimental tools. These include the worm Caenorhabditis elegans, whose reproducible cell lineage allows unparalleled analysis of cell fate decisions [18], and the zebrafish, Danio rerio, the first vertebrate system in which saturation loss-of-function mutagenesis screening was performed [19,20]. C. elegans became a widely used experimental model from the mid 1970s, whereas the first large-scale genetic screens in D. rerio were conducted in the 1990s [21,22]. However, although it is extensive, the information obtained from animal models does not completely reflect human physiology. In this respect, human organoids (which emerged in the early 2010s) [3,5,23,24] can be seen as a novel experimental model that bridges the gap between animal models and human beings.

The most commonly used model organisms currently are S. cerevisiae, C. elegans, D. melanogaster, zebrafish and the common house mouse. In addition, for cancer studies, patient-derived xenografts (PDX) and cancer cell lines are also used. Each model has its own unique advantages and limitations (Fig. 1). For example, 60-80% of genes in C. elegans have human orthologues, and about 42% of human disease genes have orthologues in C. elegans [25-27]. Moreover, many signalling pathways are evolutionarily conserved in this species. The largely deterministic nature of C. elegans development makes it an ideal model for cell lineage analysis and has led to the discovery of key cell death pathways; yet pattern formation in C. elegans is different from that in other organisms, and the mechanistic principles of its development are often not fully conserved.
In addition, other aspects of the physiology of C. elegans, such as its lifespan, metabolism and immune system, are very different from human physiology. Among mammalian model systems, the availability of highly advanced genetic tools and of germline-competent mouse embryonic stem cells (ESCs) that enable the generation of genetically engineered mouse models makes the mouse the most preferred system. Nevertheless, limitations remain with regard to the physical inaccessibility of early development (as compared with other, egg-laying models with visible development), the requirement for expensive animal facilities in which to maintain the mice, and some important differences between mouse and human physiology (Box 1). The zebrafish was developed as a vertebrate model for large-scale genetic studies in the 1990s, to overcome some of the limitations of the mouse model: the transparent zebrafish embryo and this model's potential to reproduce in huge numbers with a manageable price tag allowed zebrafish researchers to uncover fundamental principles of early embryonic development. However, the zebrafish model suffers from a hugely complex genome, a lack of comprehensive in vitro culture systems and the fact that it shares only a limited degree of biological similarity with humans. Over the past 30 years, the use of these various, equally valuable animal model systems, as well as 2D human cancer cell lines, has led to an explosion of knowledge about human development and mechanisms of disease, but it has also revealed the limitations of these same model systems in emulating human pathophysiology. A number of biological phenomena that are specific to humans are not amenable to being reproduced in animal models. The human brain, for example, is far more complex than its mouse counterpart, owing partly to human-specific developmental events and mechanisms [28].
Neurons in the human cortex, for example, arise from a cell type, outer radial glia, that is not present — or is present only in minute numbers — in rodents [28]. Human physiology is also profoundly different from that of the mouse model system: it is perhaps unsurprising that there are huge differences in metabolism between humans and laboratory models, given that humans develop far more slowly than the models [29]. Common drugs such as ibuprofen [30] and warfarin [31] are metabolized in the liver in such different manners in humans and rodents that, in humans, ibuprofen is prescribed for pain, fever and inflammation, and warfarin is prescribed as an anticoagulation drug, whereas both drugs are toxic to rats. Finally, and in contrast to all animal models, humans are not inbred. Understanding human genetic diversity and its influence on disease onset and progression and on drug responses is a prerequisite for developing personalized medical treatments, and it requires the establishment of human-specific model systems. The advent of human induced pluripotent stem cell (iPSC) technology and of diverse human ADSC culture methods has made it possible, for the first time, to generate laboratory models specific to an individual [32]. Reprogramming somatic cells into iPSCs has become a routine laboratory procedure, but generating disease models from those cell lines remains challenging. Early approaches focused on the differentiation of iPSCs into one type of cell (for example, neurons or cardiomyocytes) and their culture in 2D [33,34]. More recently, culture methods have been developed to mimic in vivo organ development in 3D, allowing more complex tissue structures and diverse cell types to be modelled simultaneously [1,9,35-39]. In this methodology, human iPSCs are sequentially exposed to a course of differentiation cues in order to simulate the stages of a human developmental process.
During this process, differentiated iPSCs aggregate to form first an organ bud, and later organoids, that faithfully mimic the mature organ structure, including multiple cell types and the interactions between them. Human ADSC-derived organoids have also emerged as an alternative organoid system, consisting of a simpler structure composed mainly of the cell types found in the epithelium [3]. In contrast to the complicated process of iPSC reprogramming followed by differentiation to the required organ type, these organoids can be generated from biopsies isolated directly from the organ of interest or from diseased patient tissue. However, the establishment of human ADSC-derived organoids is limited by accessibility to the tissue and by prior knowledge of the culture conditions for that tissue, while an iPSC line, once established from a patient, can be used to repeatedly generate different tissue models without any time limit (that is, beyond the patient's lifespan) [40]. In summary, organoids constitute an improvement in the generation of model systems, with cell types that more closely resemble, and are in conditions more similar to, those found in the human body [7,41]. Diverse organoid systems provide a useful variety in their complexity and modelling capability, ranging from the simple epithelial structures derived from ADSCs [3] to more complex cultures

Box 1 | Differences between human and mouse stem cells
The difference between mice and humans is evident in the culturing of their stem cells. Pluripotent mouse embryonic stem cells (ESCs) were established in the early 1980s [45,46]. Leukaemia inhibitory factor (LIF) was the first pluripotent stem cell (PSC) factor identified, with the later discovery that 2i (the kinase inhibitors PD0325901 and CHIR99021) and LIF (together termed 2iLIF) are necessary as well as sufficient for the culture of ESCs from all mouse strains [176-179].
The first human PSCs to be successfully cultured were human ESCs, equivalent to mouse epiblast stem cells, which have lost germ-cell potential [44,180,181]. Initial cultures were propagated in media based on basic fibroblast growth factor, but for decades researchers continued to search for appropriate culture conditions in which to maintain human PSCs in the naive, pluripotent state [182-188]. Universally agreed-upon culture conditions for the maintenance of naive human PSC cultures remain elusive, with various reports suggesting different medium compositions, including the use of a PKC inhibitor (Gö6983), ROCK inhibitor (Y-27632), MEK inhibitor (PD98059), GSK inhibitor (IM12), BRAF inhibitor (SB590885), LCK/SRC inhibitor (WH-4-023) or WNT5A [182,184,189,190]. A similar fundamental discrepancy between mouse and human cultures, in their requirements for additional media components, has also been observed in a number of adult stem cell (ADSC)-derived organoid culture systems. However, one commonality appears to be that inhibitors of TGFβ and p38 signalling must be added to the growth factors used in murine ADSC cultures in order to enable the long-term culture of human ADSCs [9]. For example, mouse colonic organoids require epidermal growth factor, noggin, R-spondin and WNT as growth factors, while human colonic organoids additionally require two inhibitors, A83-01 and SB202190 (which block TGFβ and p38 signalling, respectively) [3], in order to attain a similar level of longevity and robustness. The differences between mouse and human systems in both PSC-derived and ADSC-derived organoid cultures suggest rather surprising fundamental differences between these species in the intrinsic signalling pathways that are activated in their stem cells and that are required for stem cell identity, and not just differences during development or during an immune response.
This confirms the need for human-based laboratory model systems in order to fully understand human development and pathophysiology.

Glossary
Outer radial glia: the part of the neocortex where neuronal progenitors reside.
Inbred: a breed with genetically closely related individuals.
Induced pluripotent stem cell (iPSC): a pluripotent stem cell reprogrammed from somatic cells.

generated from multiple germ layers [6], obtained via multistep PSC differentiation protocols [42,43]. It is expected that, in the near future, the variety of organoid models available, coupled with advances in the technology, will provide a series of powerful and efficient platforms for studying human development, physiology and disease.

The past years have seen substantial progress in the development and use of human-cell-based model systems for the study of human biology, including both pluripotent and ADSC-based systems.

The introduction of human iPSC technology
The first human PSCs, known as human ESCs, were reported in 1998 (ref. 44), nearly two decades after the initial discovery of murine pluripotent cell lines in 1981 (refs 45,46). The human stem cell lines were predicted to be useful for the study of human developmental biology, drug discovery and cell therapy, in spite of the technical limitations imposed by the lack of knowledge of appropriate differentiation protocols at that time. As the generation of human PSC lines required the sacrifice of human embryos at the blastocyst stage, strong ethical concerns were raised [47]. In 2007 the debate around human PSCs was largely circumvented by the introduction of human iPSC technology, which converts an ordinary differentiated fibroblast from an adult human into a pluripotent cell [33]. Reprogramming to a pluripotent state is usually achieved by forced expression of a specific set of transcription factors [33], and the pluripotent cells can then be differentiated into specific cell types.
iPSC technology not only allowed the generation of patient-specific stem cells but also enabled researchers to work with an unlimited supply of human stem cells and stem cell-derived tissues. Furthermore, iPSC technology allows the banking of patient-derived stem cells [48]. There are still technical limitations to the use of iPSCs — for example, the use of oncogenes for reprogramming, the genetically unstable nature of the reprogramming process and the low efficiency of reprogramming. All these technical hurdles made it difficult to obtain error-free iPSC lines from patients [49], but these limitations have been at least partly overcome by the use of non-integrating vectors and of standardized quality control protocols to avoid or screen out unwanted genetic alterations [49]. Line-to-line variability caused by the genetic heterogeneity of humans has also been resolved by the use of isogenic controls generated by genetic engineering using CRISPR-Cas9 technology [50]. These improvements have enabled researchers to use iPSC-derived specialized cell types — for example, neurons, cardiomyocytes, haematopoietic progenitor cells and pancreatic β-cells — in disease modelling and drug screening [51]. However, just over a decade after their introduction, 3D organ culture methods have greatly increased the utility of human iPSCs.

iPSC-derived organoid models
Human PSC-derived organoids are generated by guided differentiation protocols that mimic developmental processes identified through previous work, both in vitro and in vivo (Fig. 2). As our knowledge of human development is very limited, most early studies that aimed at producing organoids with properties similar to those of human tissues were based on parallels drawn from mouse development. In principle, to generate an organoid, the entire process of organ development from PSCs should be faithfully mimicked.
In reality, it is nearly impossible in vitro to provide all the biochemical cues that drive cell differentiation and 3D tissue assembly and organization at precisely the right times, places and concentrations at which they would occur during embryonic development. Fortunately, cells in vitro tend to follow a semi-autonomous differentiation trajectory, as they do in vivo, and three main types of protocols are utilized to generate functional organoids. In the case of brain organoids, human PSCs are initially guided to differentiate into embryoid bodies before further differentiation towards the neuroectodermal lineage. Once the cell aggregates contain the developmental precursors for brain tissue patterning, the rest of the developmental steps occur spontaneously in a spinning bioreactor [5,52]. For the development of liver primordia, guided differentiation to hepatoblasts (hepatocyte precursors) was not sufficient to form a complete set of organ precursors, which requires cells from different lineages. It was known from studies of mouse hepatogenesis that cell-to-cell communication between endothelial, mesenchymal and hepatic endoderm cells is important, so based on this knowledge the first human liver primordia were created by mixing human PSC-derived hepatoblasts, mesenchymal cells and endothelial cells. Through self-condensation and organization, these three cell types assemble in vitro to make an aggregate that mimics the architecture of the developing human liver primordium [23,53]. Other organoids require more precise, lengthy protocols in order to acquire the appropriate progenitor cell types for the target epithelium.
Many organoids that model endoderm-derived organs (such as oesophagus, stomach, colon, intestine and lung) undergo stepwise differentiation protocols, in which the timing, concentration and combination of specific growth factors and chemical inhibitors for modulating key developmental signalling pathways are crucial to developing the desired epithelial tissue in a manner analogous to fetal development [24,43,54-57]. In all cases, the process of organoid formation involves three crucial steps. First, key signalling pathways regulating developmental patterning are activated or inhibited (using commercially available morphogens and signalling inhibitors) in order to establish a correct regional identity during stem cell differentiation; usually this is achieved through induction of the signalling events that have been identified in the mouse as establishing cell fates in vivo. Second, media formulations that allow proper terminal differentiation of the desired cell types within the organoid are developed, generally following established methods of 2D culture or inspired by the murine developmental process. Finally, cultures are grown in a way that allows their expansion in three dimensions, which is achieved either by aggregating cells into 3D structures or by embedding the cultures into a 3D matrix.

Glossary
Isogenic controls: control cell lines with the same genetic background as experimental cells.
Embryoid bodies: three-dimensional pluripotent stem cell aggregates.
Spinning bioreactor: a culture system in which nutrients are supplied while cells are agitated.

ADSCs can be cultured in the presence of niche factors that function to maintain the stem cells in an undifferentiated state while allowing stem cell differentiation. As a result, the cultured stem cells can generate ADSC-derived organoids, which are composed of an epithelial monolayer that mimics the 3D architecture and contains the cell types of the desired organ to be modelled [58].
the first steps in the development of adsc-derived organoids, however, are to identify and isolate the appropriate population of adult stem cells and to understand their niche requirements. obtaining the first mouse intestinal organoid cultures required not only the identification of the intestinal stem cell population expressing the selective marker for these cells, the leucine-rich repeat-containing g protein-coupled receptor 5 (lgr5), but also the essential niche factors to support stem cell activity 59 . the re-creation of the intestinal stem cell niche in vitro was inspired by mouse genetic studies that had shown that epithelial proliferation and stem cell self-renewal are dependent on epidermal growth factor (egf) and wnt activity, while differentiation is controlled by bone morphogenetic protein (bmp) signalling. after it was discovered that sb202190 prevents secretory lineage differentiation, this inhibitor was replaced by the addition of insulin-like growth factor 1 (igf1) and fibroblast growth factor 2 (fgf2) 4 ; this modification enables concurrent multilineage differentiation and self-renewal in human gut organoids. all these changes and refinements to culture conditions, which are the results of another decade of research, improved the quality of human gut organoid cultures, enabling researchers to produce organoids that more closely resemble tissue in vivo. nevertheless, mouse organoids remain more similar to the in vivo tissue of origin than do human organoids. generation of other adsc-derived organoids. using the procedure described above, multiple types of organoids from various tissues have been generated by modifying the basic medium that was initially developed for intestinal organoid culture (fig. 2; fig. 3). in most cases, a mouse organoid culture system was first established and then adapted to human cells. mouse colonic organoids can be grown through the addition of wnt3a 3 . 
mouse stomach pyloric organoids and corpus organoids were grown with only slight modifications to the original protocol for mouse intestinal organoids 60, 61 . mouse liver and pancreas ductal organoids were also obtained similarly from injury-activated lgr5-positive progenitors 62, 63 . the number of organoids generated from diverse mouse tissues and organs is growing, and for each, previous knowledge of the signalling processes that comprise the organ-specific or tissue-specific stem cell niche environment has been key to developing the appropriate protocols. adsc-derived human organoids have also become widely available (fig. 3) . they have been generated from almost all endoderm-derived tissues (intestine, colon, stomach, liver, pancreas, lung, bladder and so forth) 3,7,41,64-70 and from gender-specific tissues (prostate, endometrium, fallopian tube and mammary gland) 8,71-75 . additional components frequently required for the growth of human cultures, in comparison to the mouse cultures, are a tgfβ pathway inhibitor (such as a83-01) and a p38 mapk inhibitor (such as sb202190) 9 . as for mouse organoids, human organoids can be derived from minimal amounts of tissue biopsies and can be cultured indefinitely, thus forming the basis for the building of living biobanks, which are an important resource in biomedical research. following the establishment of human stem cell-based organoids, various human diseases have been modelled using these systems. for example, human organoids have been used to study infectious diseases, inheritable genetic disorders and cancer. moreover, with the advent of various genetic-engineering tools, pathogenic genes and mutations can be directly tested in organoids derived from cells isolated from healthy donors, which enables us to perform human genetic studies in controlled genetic backgrounds. the human brain is the most highly developed brain in the animal kingdom and the most complex organ in our body. 
because of its distinct complexity, it has been very difficult to model human brain development using animal systems. human brain organoids, however, have now been successfully used to model human brain development and disease. the first disease studied in this model was microcephaly; unlike rodent models, human brain organoids with a mutation in cdk5rap2 display a significantly reduced size, with a smaller number of progenitors that undergo premature neurogenesis 5 . cdk5rap2 is localized at the centrosome and is required for correct orientation of the mitotic spindle, so that a mis-orientation of the cleavage plane prevents the expansion of the progenitor pool by symmetric division in the patient-derived organoids. this human experimental system was also crucial in determining the relationship between the zika virus (zikv) and microcephaly. during the recent zikv outbreak, a link was initially suggested between neurological disorders, including microcephaly, and zikv [76] [77] [78] [79] . nevertheless, even though zikv could be detected in microcephalic fetal tissues, the causal relationship and the mechanism by which zikv might compromise brain development were not clear. a number of studies have used 3d human stem cell-derived systems, including neurosphere culture and brain organoid models, to reveal the effect of zikv infection on human brain development 80, 81 . thus, the human organoid model provides a physiologically relevant platform for studying zikv pathogenesis. further studies with human brain organoids have determined that brazilian zikv strains possess increased virulence as compared to an african strain 82 , and that the ns2a protein of zikv causes decreased neural stem cell proliferation 83 . 
various chemical compounds that alleviate the hypomorphic effect of zikv have also been identified using the human brain organoid system 84 . these studies highlight the importance of human brain organoids as a model system in understanding the pathology and mechanistic basis of genetic and infectious diseases, as well as in discovering potential therapeutic reagents. as exemplified by the zikv studies above, a human model system is preferable to animal models when studying infectious disease pathogenesis, because pathogens often have a narrow species or tissue tropism; that is, they infect only certain species, and sometimes only specific cell types. for decades there was no method to culture human noroviruses, which cause acute gastroenteritis and can be fatal in young children or in old or immunocompromised individuals 85 . it recently became possible to successfully grow norovirus in culture by using a human intestinal organoid-derived epithelial monolayer, which led to the finding that certain viral strains specifically require both certain bile components and the activity of the host galactoside 2-alpha-l-fucosyltransferase 2 (fut2) enzyme for infection 86 , a requirement that mirrors the patterns seen in epidemiological studies. rotaviruses, which also cause diarrhoeal disease 87 , provide another example of human intestinal organoids being used to cultivate patient-derived viral strains: rapid replication of the virus was detected within one day of inoculation. an antiviral cytokine, a vp7-neutralizing antibody and ribavirin were demonstrated to suppress viral replication in the organoid system 88, 89 . human airway organoids are suitable models for the study of respiratory viruses. respiratory syncytial virus (rsv) is a single-stranded rna virus that is a significant threat to young infants and the elderly 41 . 
it was shown to successfully infect human airway organoids and to cause dramatic changes to epithelial morphology and function: the ns2 protein of rsv was shown to cause increased motility of airway epithelial cells 41 . the human airway organoid system has also been used to assess the infectivity of animal-borne influenza viruses [90] [91] [92] . the diversity of two viral surface molecules -haemagglutinin and neuraminidase -has previously made it difficult to predict the infectivity of these viruses in humans 93 . the human airway organoids faithfully simulate human airway epithelium, and so provide a platform for rapidly testing the infectivity of newly emerging influenza viruses. organoids have also been used to co-cultivate human epithelia with bacteria (for example, helicobacter pylori) and with protozoan parasites (for example, cryptosporidium) 64, [94] [95] [96] . the list of pathogens that can be grown in the surrogate human organ system is growing rapidly. we expect that, in the near future, numerous microbiome strains from the human gut or glands will be analysed by using organoids to study infectious diseases in a human cell context, and that organoid infection models will be used as screening platforms to identify novel drug candidates (box 2). human organoids are important resources for precision medicine. for example, they can be useful for selecting an appropriate drug for patients with genetic diseases or cancer. cystic fibrosis (cf) is a relatively common genetic disease with approximately 90,000 patients worldwide. although loss of function of the cftr gene, encoding a chloride channel regulating the mucosal environment, is the main cause, nearly 2,000 mutations have been reported that affect the function or expression of cftr, making this a genetically heterogeneous disease 97 . 
while two drugs - vx-770 (a cftr potentiator that improves chloride-channel activity) and vx-809 (a cftr corrector that helps mutant protein folding) - are already available for patients with cf who carry specific cftr mutations in combination with the common mutation f508del, evaluating the efficacy of these drugs or discovering new drugs for patients with different mutations remains challenging. rectal organoids isolated from small endoscopic biopsies have been used in forskolin-induced swelling (fis) assays [98] [99] [100] to assess forskolin-camp signalling-induced cftr channel activity, and it was found that the assay can faithfully predict patient responses to individual drugs and to combined treatments 99, 100 . it was also found that with this relatively simple in vitro assay it is possible to identify which patients with rare mutations would respond to, and could therefore be treated with, existing therapies 99,100 , which could be life-changing for those individuals. the rectal-organoid-based fis assay screen is currently being performed by hubrecht organoid technology for patients in the netherlands with cf. 

numerous epidemics and pandemics have threatened human health in history (of which a few examples are plague (yersinia pestis), spanish flu, smallpox, aids/hiv, ebola virus, severe acute respiratory syndrome (sars)-coronavirus (cov) and zika virus (zikv); who, disease outbreaks), heavily disrupting human society. the emergence of the novel human coronavirus sars-cov-2 was first reported in wuhan, china, at the end of 2019 (ref. 191 ). sars-cov-2, reported to have crossed the species barrier, causes coronavirus disease-19 in humans 192 . owing to the lack of targeted antiviral medicines and a vaccine against covid-19, the virus has spread globally and the who has declared covid-19 a pandemic (who, coronavirus disease (covid-19) pandemic). developing a vaccine and therapeutics against covid-19 has become a priority, and human organoids can play a crucial role in advancing research, by providing human cells from different organs that are susceptible to sars-cov-2 infection. several research groups have successfully shown that sars-cov-2 can infect and propagate in primary human organoids derived from liver and gut, as well as in pluripotent stem cell-derived organoids modelling blood vessels and the kidney [193] [194] [195] . sars-cov-2 uses the same protein, the viral receptor angiotensin-converting enzyme 2 (ace2), as sars-cov to enter host cells. ace2 is expressed in multiple tissues and cell types, enabling the virus to infect other tissues beyond the lung [196] [197] [198] [199] . the ability of sars-cov-2 to infect different organs was modelled with human organoids [193] [194] [195] . moreover, it has been shown that the viral infection of human blood vessel and kidney organoids can be inhibited by human recombinant soluble ace2, a drug candidate initially developed for sars-cov 194 . structural studies of the spike proteins of sars-cov-2 and sars-cov bound to ace2 predicted a similar result [200] [201] [202] . these studies have provided additional evidence that human organoids can serve as an effective research platform for the study of human diseases, including the outbreak of new viruses such as sars-cov-2 and zikv. 

human organoids also play a prominent role in enhancing our understanding of human cancers. 
traditionally, human cancer cell lines, mouse cancer models or patient-derived xenografts (pdx) have been the main experimental platforms for studies of human cancer. however, with the advent of adsc-derived organoid technology, transformed cancer tissues have been grown in vitro as cancer organoids (also known as tumouroids or canceroids), demonstrating that this technology is applicable to diseased tissue as well as to normal epithelia. patient samples from colon 101-105 , brain 106, 107 , prostate 108 , pancreas [109] [110] [111] , liver 112 , breast 113 , bladder 68 , stomach [114] [115] [116] , oesophageal 117 , endometrial 118 and lung 41,119 cancers have readily been cultured as cancer organoid models. recently, an analysis of drug responses in patients and in their matched cancer organoids led to the conclusion that responses to the drugs are highly similar in the two settings: a drug with no antitumour activity in the organoids did not demonstrate efficacy in the matched patient, and drugs that showed an effect in the organoid cultures were matched by a patient response in close to 90% of cases 120 . this initial study has been corroborated by several studies with larger cohorts 105, 121, 122 . the sample size and cancer types analysed remain limited, and more rigorous investigation will be needed before cancer organoids can be routinely adopted as in vitro patient 'avatars'. nevertheless, the rapidly increasing number of patient-derived cancer organoids and their use in xenograft formation and molecular profiling have already accelerated cancer research, and patient-derived organoid models in the future may provide an in vitro screening platform to predict the best therapeutic options for individual patients. capecchi reported highly frequent homologous recombination events in mouse escs 123 . by contrast, homologous recombination was found to be an extremely rare event in human pscs 124 . 
later studies reported that double-strand breaks (dsbs) facilitate homologous recombination events as part of the dna repair mechanism in human cell lines 125 . much effort has been directed at establishing an efficient method for generating dsbs at a specific target locus, in order to harness this repair machinery for introducing desired genetic changes, such as a repair sequence or a pathogenic mutation, into the target locus. this was initially accomplished by the use of meganucleases (endonucleases with a long recognition site (12-40 bp)), zinc finger nucleases 126 and transcription activator-like effector nucleases 127 , but with varying levels of target specificity and activity. the development of the crispr-cas9 endonuclease technology has made diverse methods of genetic engineering readily available to all researchers [128] [129] [130] [131] . unlike the previous technologies, the cas9 endonuclease is guided to the genomic sequence of interest by a guide rna (grna), where it generates a dsb, making the system highly versatile and easy to apply 132 . there are numerous examples of genetic engineering of human pscs using this system, including the generation of isogenic cell lines with specific mutations. such isogenic cell lines serve as an important control for genetic analysis in human pscs, owing to the strong phenotypic variation among different cell lines. furthermore, large-scale loss-of-function analyses have been performed in many human cancer cell lines and human pscs through the multiplexing of grnas in retroviral or lentiviral vectors, to provide genome-wide targeting coverage. combining organoid systems with crispr-cas9 genome editing broadens the applications of the organoid system in many ways. for example, this system has been used to model monogenic disorders such as cf. 
human gut organoids with f508del, causing misfolded cftr channel protein that leads to rapid degradation, were precisely corrected into the normal sequence by crispr-cas9; the engineered organoids with the corrected cftr amino acid sequence showed restored channel activity of cftr in vitro 133 , showing the causal relationship between the mutation and disease phenotype, as well as the possibility of using a similar strategy to generate autologous organoids for transplantation. crispr-cas9 technology has also been employed to identify and introduce oncogenic driver mutations to normal epithelial organoids. two groups have independently reported the minimum set of mutations that can closely model a metastatic human colon cancer 134, 135 . simultaneous knockout of multiple genes or conditional gene knockout strategies have also been developed for adsc-derived organoids and for pscs that can be used for psc-derived organoids [136] [137] [138] . crispr grna library screening analysis has been performed in multiple labs to maximize the utility of human organoids in genetic studies 139, 140 . these developments in genetic engineering, combined with human psc and human organoid technologies, have opened new opportunities for studies of human genetics, as it is possible to perform genetic experiments in small human organ models that closely reflect human physiology. organoids should largely be considered as a model system currently under development: much of the current excitement around the technology is based on their enormous potential rather than on what has already been achieved. nevertheless, organoids could potentially revolutionize disease research in a profound manner (fig. 4) . generating animal models for a specific disease requires pre-existing insight into either the causative conditions or the genes involved. 
animal models are typically created by applying harmful conditions to animals or by manipulating the genes responsible for the disorder, while organoid models can be directly generated from affected patients without prior knowledge of the specific genes responsible. this is particularly relevant for multigenic disorders such as inflammatory bowel disease, provided that the pathology is caused by the affected epithelium, and for cancers, where cancer organoids can be directly isolated from the patient [141] [142] [143] [144] [145] [146] . human organoid cultures have a number of potential benefits over animal models (box 3): organoids provide faster and more robust outcomes, are more readily accessible and provide both a more accurate representation of human tissue and a larger quantity of material to work with than animal models do. the mouse is one of the animal model systems most frequently used to explore human biology and disease, owing to its similarity to humans, in comparison with other animal models, and to the ability to generate transgenic and knockout mouse strains. however, the generation of a conventional transgenic mouse model to address questions regarding human disease generally takes more than a year, even with the technological advance of crispr-cas9-mediated precision genome editing [147] [148] [149] . furthermore, differences in microbiota and pathogen composition between animal models and humans, as well as the failure of certain phenomena observed in mice to translate directly to humans, limit the utility of animal models in human disease research 150 . 
fig. 4 legend (partial): (2) biobanking, whereby samples obtained from patients can be used to generate patient-derived organoids and stored as a resource for future research; (3) disease modelling, to understand the mechanisms of human diseases such as infectious diseases, inheritable genetic disorders and cancer using various laboratory techniques, including omics and drug-screening analyses; and (4) precision medicine, in which patient-derived organoids can be used to predict response to drugs and as resources for regenerative medicine coupled with genetic engineering. ecm, extracellular matrix. 

human stem cell-derived organoid culture is widely expected to bridge the remaining gaps between animal models and humans, primarily because the source material for organoid culture is a stem cell derived from a human 35, 40 . the length of time required to establish the research platform is also shorter for the organoid model than for animal models: a human organoid culture can be established within a few weeks or months with a high success rate, thus supporting the use of patient-derived organoids in personalized medicine to provide robust personalized data, including individual mutation profiles and drug responses. organoid handling is also relatively easy, with researchers being able to handle a large number of organoid lines at the same time, although the initial determination of culture conditions for a new tissue type is rather complicated. variability and single-cell profiling. in spite of the many advantages detailed above, human organoid culture remains under development, and numerous efforts to advance the technology are still in progress. one of the most pressing issues in organoid technology is the variability of the system: individual reports have described contrasting methods to generate organoids from stem cells, but a widely accepted, standardized protocol is still missing. 
for instance, human gut organoids derived in different laboratories, using human adscs and human pscs, are different, even though both are termed 'gut organoids' 2,3,24 . the possibility of further improving an already well-defined method was recently demonstrated when more refined culture conditions were shown to provide better cellular diversity and culture efficiency 4 . the lack of widely accepted, standardized protocols and guidance is an important issue to overcome in order to reduce the variability of the system from group to group. a collective effort should be made to set clear guidelines and ways to assess the quality and validity of each system. single-cell profiling technologies for transcriptome and epigenome analysis may be key in terms of highly accurate assays suitable for this purpose. with these assays, it would be possible to compare every cell type present in the organoids with their in vivo counterpart at the level of transcriptome and epigenome. in addition, individual differences such as age and genetic background might introduce further variability to the human organoid system, which, while potentially challenging, could also present an opportunity to assess the role of person-to-person variability in human biology. additional considerations to be addressed include the modelling of cell-to-cell communication with stromal cell populations and the development of vasculature in organoid systems, although vascularization has been observed in some cases upon transplantation 23, 151 . significant progress has recently been achieved with blood vessel organoids 152 , which have great potential to impact research into vascular diseases such as diabetes. however, despite steady progress, vascularization still remains a difficult obstacle to overcome. 
several reports already exist of organoid co-culture systems containing pathogens such as bacteria, viruses and parasites 153, 154 ; moreover, organoid co-cultures with mesenchymal and/or immune cell populations have shown exciting progress in recent years and are of great interest as a means to better model human disease 23, [155] [156] [157] [158] [159] [160] [161] [162] . nevertheless, organoid systems already contain a large degree of complexity, and it is important to acknowledge the challenge of adding additional components to an already complex system. both simple and complex organoid systems have their pros and cons, and therefore it is most important to use the most appropriate level of complexity for a given study. for example, cancer organoids containing only the epithelial component can be sufficient to test the efficacy of most cancer drugs. however, for immuno-oncology therapies or for an assessment of drug metabolism and availability to the target organ, more complex systems containing immune cells, mesenchymal cells and/or endothelial cells together with the organoids will be necessary. unfortunately, standardized protocols or guidelines concerning these matters are still lacking, so individual researchers are left to determine the most appropriate system for themselves. we foresee that full standardization of many of these protocols will be achieved in the near future, in a manner similar to the standardization that increased both the quality and reproducibility of human ipsc technology. assessing interactions with other organs and the environment. one clear drawback of organoid systems is the lack of interorgan communication. human organoid systems fundamentally mimic a part of the human body, not the entire body. therefore, human organoids are limited to the reproduction of organ-specific or tissue-specific microphysiology, a limitation to bear in mind prior to entering this exciting new field. 
however, efforts are already in progress to overcome this limitation. for example, multiple organoids have been connected in order to study communication between the liver, pancreas and gastrointestinal tract 163 ; cell migration between the developing forebrain and hindbrain [164] [165] [166] ; and the interaction between the brain and hormone-producing organs 167 . the development of tools to help us model organ-level communication will progress, although this capacity is likely to lag in comparison to progress in other areas of the field, owing to the complex nature of the studies undertaken. 

box 3 | advantages:
• human-derived: human organoids represent human physiology, rather than being a 'human-like' or 'similar' system.
• rapid: adult stem cell-derived and pluripotent stem cell-derived organoids can be established rapidly and easily.
• robustness: once established, scale-up is usually possible for large-scale genomic screening and drug screening.
• genetic manipulation: most modern genetic engineering tools can be applied to induced pluripotent stem cells or directly to organoid systems.
• personalization: induced pluripotent stem cells and organoids can be obtained from individuals.

box 3 | limitations:
• cellular components: the microenvironment is sometimes lacking, particularly in adult stem cell-derived organoids. co-culture systems with other cell types are not firmly established.
• standardization: protocols for organoid establishment and quality control are not globally standardized.
• relatively costly: organoids cost less than mouse or fish models, but they are relatively expensive compared to traditional cell lines, fly, yeast or worm models.
• scale: studies at the level of whole organs are difficult.
• heterogeneity: owing to diversity between individuals and protocols, outcomes may vary from group to group.
of note, efforts to bring together the fields of organoid research and organ-on-a-chip research are particularly exciting, potentially resulting in an 'organoid-on-a-chip' technology 168, 169 . we foresee, for example, the potential generation of a chamber device that enables the separate culture of distinct organoid types, thus preventing the uncontrolled fusion of organoids while permitting organoid-organoid communication. finally, the effect of extracellular matrix composition on organoid culture is yet to be defined; uncertainty in the composition of the extracellular matrix can heavily influence the outcomes of chemical or genetic screening of human organoids. diverse efforts have been made, with some impressive successes in recent years [170] [171] [172] [173] [174] [175] . this obstacle should be overcome not only as a means to produce more robust human model systems, but also to allow the translation of human organoid technology to regenerative medicine, where 'good manufacturing practice' requires all raw materials, including matrix materials, to be fully defined. furthermore, work is ongoing towards the development of organoid culture platforms for large-scale production, organoid-based high-content screening platforms and micro-organoids-on-a-chip as miniature, finely controlled systems. for all these systems, it will be extremely important to know how to manufacture a synthetic, versatile extracellular matrix. despite the remaining challenges, human organoids hold great potential in clinical translational research, owing to the advantages outlined above and to rapid, ongoing technology development. from the initial full 'laboratory life cycle', which started with the isolation of patient samples, to the establishment of organoids and their cryopreservation, organoid technology has expanded to embrace genetic manipulation, various omics and drug-screening analyses and diverse co-culture systems with viruses, bacteria and parasites (fig. 4) . 
thus, technologies and experimental procedures that were developed in other model systems can now be applied to human organoid systems, which will accelerate our understanding of human biology and enable us to validate hypotheses and models generated from studies in animal model systems. given the rapid technical advances in the field, we believe that human organoid systems will provide unprecedented opportunities to improve human health. 

references
progress and potential in organoid research
single lgr5 stem cells build crypt-villus structures in vitro without a mesenchymal niche (provides the first example of organoids derived from adscs isolated from mouse gut)
long-term expansion of epithelial organoids from human colon, adenoma, adenocarcinoma, and barrett's epithelium
human intestinal organoids maintain self-renewal capacity and cellular diversity in niche-inspired culture condition
cerebral organoids model human brain development and microcephaly (reports that the complexity of human brain development can be modelled by human psc-derived organoids)
kidney organoids from human ips cells contain multiple lineages and model human nephrogenesis
long-term expansion of functional mouse and human hepatocytes as 3d organoids
long-term, hormone-responsive organoid cultures of human endometrium in a chemically defined medium
(in this review, huch and koo summarize the development of various organoid culture systems and compare mouse and human systems)
organoids: a historical perspective of thinking in three dimensions
organogenesis in a dish: modeling development and disease using organoid technologies
dishing out mini-brains: current progress and future prospects in brain organoid research
organoids: modeling development and the stem cell niche in a dish
modeling development and disease with organoids
organoid models for translational pancreatic cancer research
organoids as an in vitro model of human development and disease
stem cell models of human brain development
the embryonic cell lineage of the nematode caenorhabditis elegans
large-scale mutagenesis in the zebrafish: in search of genes controlling development in a vertebrate
large scale genetics in a small vertebrate, the zebrafish
the zebrafish issue of development
the genetics of caenorhabditis elegans
(this is the first report of organ bud formation through self-condensation of cells from different lineages)
directed differentiation of human pluripotent stem cells into intestinal tissue in vitro (identifies a step-by-step procedure to generate human intestinal organoids derived from pscs)
wormbase: a multi-species resource for nematode biology and genomics
the worm in us - caenorhabditis elegans as a model of human disease
genome-wide rnai screens in caenorhabditis elegans: impact on cancer research
development and evolution of the human neocortex
metabolic costs and evolutionary implications of human brain development
predictability of metabolism of ibuprofen and naproxen using chimeric mice with human hepatocytes
cyp2c9-catalyzed metabolism of s-warfarin to 7-hydroxywarfarin in vivo and in vitro in chimeric mice with humanized liver
pluripotent stem cell-derived organoids: using principles of developmental biology to grow human tissues in a dish
induction of pluripotent stem cells from adult human fibroblasts by defined factors
induced pluripotent stem cell lines derived from human somatic cells
a critical look: challenges in differentiating human pluripotent stem cells into desired cell types and organoids
disease modeling in stem cell-derived 3d organoid systems
organoid and assembloid technologies for investigating cellular crosstalk in human brain development and disease
human kidney organoids: progress and remaining challenges
liver organoids: from basic research to therapeutic applications
disease modelling in human organoids
long-term expanding human airway organoids for disease modeling
stepwise differentiation of pluripotent stem cells into retinal cells
modelling human development and disease in pluripotent stem-cell-derived gastric organoids
embryonic stem cell lines derived from human blastocysts
establishment in culture of pluripotential cells from mouse embryos
isolation of a pluripotent cell line from early mouse embryos cultured in medium conditioned by teratocarcinoma stem cells
human embryonic stem cells: research, ethics and policy
human ipsc banking: barriers and opportunities
advances in pluripotent stem cells: history, mechanisms, technologies, and applications
stem cells, genome editing, and the path to translational medicine
pluripotent stem cells in disease modelling and drug discovery
guided self-organization and cortical plate formation in human brain organoids
generation of a vascularized and functional human liver from an ipsc-derived organ bud transplant
3d modeling of esophageal development using human psc-derived basal progenitors reveals a critical role for notch signaling
esophageal organoids from human pluripotent stem cells delineate sox2 functions during esophageal specification
wnt/beta-catenin promotes gastric fundus specification in mice and humans
in vitro generation of human pluripotent stem cell derived lung organoids
the intestinal stem cell
identification of stem cells in small intestine and colon by marker gene lgr5
differentiated troy+ chief cells act as reserve stem cells to generate all lineages of the stomach epithelium
lgr5+ve stem cells drive self-renewal in the stomach and build long-lived gastric units in vitro
in vitro expansion of single lgr5+ liver stem cells induced by wnt-driven regeneration
unlimited in vitro expansion of adult bi-potent pancreas progenitors through the lgr5/r-spondin axis
in vitro expansion of human gastric epithelial stem cells and their responses to bacterial infection
a novel human gastric primary cell culture system for modelling helicobacter pylori infection in vitro
long-term culture of genome-stable bipotent stem cells from adult human liver
expansion of adult human pancreatic tissue yields organoids harboring progenitor cells with endocrine differentiation potential
tumor evolution and drug response in patient-derived organoid models of bladder cancer
basal cells as stem cells of the mouse trachea and human airway epithelium
reconstruction of the mouse extrahepatic biliary tree using primary human extrahepatic cholangiocyte organoids
development of organoids from mouse and human endometrium showing endometrial epithelium physiology and long-term expandability
quantification of regenerative potential in primary human mammary epithelial cells
identification of multipotent luminal progenitor cells in human prostate organoid cultures
single luminal epithelial progenitors can generate prostate organoids in culture
the notch and wnt pathways regulate stemness and differentiation in human fallopian tube organoids
a developmental and genetic classification for malformations of cortical development: update 2012
zika virus and microcephaly: why is this situation a pheic?
detection and sequencing of zika virus from amniotic fluid of fetuses with microcephaly in brazil: a case study zika virus associated with microcephaly zika virus depletes neural progenitors in human cerebral organoids through activation of the innate immune receptor tlr3 zika virus impairs growth in human neurospheres and brain organoids show the utility of complex brain organoids for translational zika virus research the brazilian zika virus strain causes birth defects in experimental models zika-virus-encoded ns2a disrupts mammalian cortical neurogenesis by degrading adherens junction proteins identification of small-molecule inhibitors of zika virus infection and induced neural cell death via a drug repurposing screen epidemiology of human noroviruses and updates on vaccine development replication of human noroviruses in stem cell-derived human enteroids demonstrate that organoid culture systems can support research on difficult pathogens that previously could not be cultivated rotavirus vaccines human intestinal enteroids: a new model to study human rotavirus infection, host restriction, and pathophysiology modeling rotavirus infection and antiviral therapy using primary intestinal organoids the emergence of influenza a h7n9 in human beings 16 years after influenza a h5n1: a tale of two cities differentiated human airway organoids to assess infectivity of emerging influenza virus influenza viruses en route from birds to man influenza virus neuraminidase structure and functions modeling infectious diseases and host-microbe interactions in gastrointestinal organoids persistence and toxin production by clostridium difficile within human intestinal organoids result in disruption of epithelial paracellular barrier function modelling cryptosporidium infection in human small intestinal and lung organoids recent strategic advances in cftr drug discovery: an overview a functional cftr assay using primary cystic fibrosis intestinal organoids report the use of organoids 
in precision medicine for patients with cystic fibrosis characterizing responses to cftr-modulating drugs using rectal organoids derived from subjects with cystic fibrosis rectal organoids enable personalized treatment of cystic fibrosis prospective derivation of a living organoid biobank of colorectal cancer patients a colorectal tumor organoid library demonstrates progressive loss of niche factor requirements during tumorigenesis preserved genetic diversity in organoids cultured from biopsies of human colorectal cancer metastases patient-derived colorectal cancer organoids upregulate revival stem cell marker genes following chemotherapeutic treatment patient-derived organoids can predict response to chemotherapy in metastatic colorectal cancer patients a patient-derived glioblastoma organoid model and biobank recapitulates interand intra-tumoral heterogeneity patient-derived organoids (pdos) as a novel in vitro model for neuroblastoma tumours organoid cultures derived from patients with advanced prostate cancer organoid models of human and mouse ductal pancreatic cancer pancreatic cancer organoids recapitulate disease and allow personalized drug screening human pancreatic tumor organoids reveal loss of stem cell niche factor dependence during disease progression human primary liver cancer-derived organoid cultures for disease modeling and drug screening a living biobank of breast cancer organoids captures disease heterogeneity human gastric cancer modelling using organoids a comprehensive human gastric cancer organoid biobank captures tumor subtype heterogeneity and enables therapeutic screening divergent routes toward wnt and r-spondin niche independency during human gastric carcinogenesis organoid cultures recapitulate esophageal adenocarcinoma heterogeneity providing a model for clonality studies and precision therapeutics patient-derived organoids from endometrial disease capture clinical heterogeneity and are amenable to drug screening patient-derived lung 
cancer organoids as in vitro cancer models for therapeutic screening patient-derived organoids model treatment response of metastatic gastrointestinal cancers a rectal cancer organoid platform to study individual responses to chemoradiation patient-derived organoids predict chemoradiation responses of locally advanced rectal cancer site-directed mutagenesis by gene targeting in mouse embryoderived stem cells gene targeting in human pluripotent cells chimeric nucleases stimulate gene targeting in human cells enhancing gene targeting with designed zinc finger nucleases a tale nuclease architecture for efficient genome editing rna-guided genetic silencing systems in bacteria and archaea multiplex genome engineering using crispr/cas systems rna-guided human genome engineering via cas9 targeted genome engineering in human cells with the cas9 rna-guided endonuclease the next generation of crispr-cas technologies and applications functional repair of cftr by crispr/cas9 in intestinal stem cell organoids of cystic fibrosis patients report the first study to apply crispr-cas9-based gene correction in an organoid system sequential cancer mutations in cultured human intestinal stem cells modeling colorectal cancer using crispr-cas9-mediated engineering of human intestinal organoids one-step generation of conditional and reversible gene knockouts a protocol for multiple gene knockout in mouse small intestinal organoids using a crispr-concatemer simultaneous paralogue knockout using a crispr-concatemer in mouse small intestinal organoids pooled in vitro and in vivo crispr-cas9 screening identifies tumor suppressors in human colon organoids genome-scale crispr screening in human intestinal organoids identifies drivers of tgf-beta resistance alterations in the epithelial stem cell compartment could contribute to permanent changes in the mucosa of patients with ulcerative colitis dna methylation defines regional identity of human intestinal epithelial organoids and undergoes 
dynamic changes during development single cell analysis of crohn's disease patient-derived small intestinal organoids reveals disease activity-dependent modification of stem cell properties dna methylation and transcription patterns in intestinal epithelial cells from pediatric patients with inflammatory bowel diseases differentiate disease subtypes and associate with outcome organoids in cancer research somatic inflammatory gene mutations in human ulcerative colitis epithelium one-step generation of mice carrying mutations in multiple genes by crispr/cas-mediated genome engineering generating genetically modified mice using crispr/cas-mediated genome engineering one-step generation of mice carrying reporter and conditional alleles by crispr/ cas-mediated genome engineering exploring host-microbiota interactions in animal models and humans vascularized and complex organ buds from diverse tissues via mesenchymal cell-driven condensation human blood vessel organoids as a model of diabetic vasculopathy organoids in immunological research modeling host-virus interactions in viral infectious diseases using stem-cellderived systems and crispr/cas9 technology human fetal tnf-alpha-cytokineproducing cd4 + effector memory t cells promote intestinal development and mediate inflammation early in life organoid modeling of the tumor immune microenvironment generation of tumor-reactive t cells by co-culture of peripheral blood lymphocytes and tumor organoids 3d model for car-mediated cytotoxicity using patient-derived colorectal cancer organoids co-culture with intestinal epithelial organoids allows efficient expansion and motility analysis of intraepithelial lymphocytes a primary human macrophage-enteroid co-culture model to investigate mucosal gut physiology and host-pathogen interactions mesenchymal stem cells increase alveolar differentiation in lung progenitor organoid cultures anatomically and functionally distinct lung mesenchymal populations marked by lgr5 and lgr6 
modelling human hepato-biliarypancreatic organogenesis from the foregut-midgut boundary fused cerebral organoids model interactions between brain regions fusion of regionally specified hpsc-derived organoids models human brain development and interneuron migration assembly of functionally integrated human forebrain spheroids understanding the role of steroids in typical and atypical brain development: advantages of using a "brain in a dish" approach towards a human-on-chip: culturing multiple cell types on a chip with compartmentalized microenvironments multisensor-integrated organs-onchips platform for automated and continual in situ monitoring of organoid behaviors extracellular matrix hydrogel derived from decellularized tissues enables endodermal organoid culture development of collagen-based 3d matrix for gastrointestinal tract-derived organoid culture peg-4mal hydrogels for human organoid generation, culture, and in vivo delivery mechanically and chemically defined hydrogel matrices for patient-derived colorectal tumor organoid culture designer matrices for intestinal stem cell and organoid culture growth of epithelial organoids in a defined hydrogel inhibition of pluripotential embryonic stem cell differentiation by purified polypeptides myeloid leukaemia inhibitory factor maintains the developmental potential of embryonic stem cells the ground state of embryonic stem cell self-renewal validated germline-competent embryonic stem cell lines from nonobese diabetic mice new cell lines from mouse epiblast share defining features with human embryonic stem cells derivation of pluripotent epiblast stem cells from mammalian embryos derivation of novel human ground state naive pluripotent stem cells induction of a human pluripotent state with distinct regulatory circuitry that resembles preimplantation epiblast derivation of naive human embryonic stem cells systematic identification of culture conditions for induction and maintenance of naive human pluripotency 
epigenetic resetting of human pluripotency naive pluripotent stem cells derived directly from isolated cells of the human inner cell mass resetting transcription factor control circuitry toward ground-state pluripotency in human systematic identification of culture conditions for induction and maintenance of naive human pluripotency application of small molecules favoring naive pluripotency during human embryonic stem cell derivation clinical features of patients infected with 2019 novel coronavirus in wuhan coronaviridae study group of the international committee on taxonomy of viruses. the species severe acute respiratory syndrome-related coronavirus: classifying 2019-ncov and naming it sars-cov-2 recapitulation of sars-cov-2 infection and cholangiocyte damage with human liver ductal organoids inhibition of sars-cov-2 infections in engineered human tissues using clinical-grade soluble human ace2 sars-cov-2 productively infects human gut enterocytes a pneumonia outbreak associated with a new coronavirus of probable bat origin sars-cov-2 entry factors are highly expressed in nasal epithelial cells together with innate immune genes angiotensin-converting enzyme 2 is a functional receptor for the sars coronavirus sars-cov-2 cell entry depends on ace2 and tmprss2 and is blocked by a clinically proven protease inhibitor structure, function, and antigenicity of the sars-cov-2 spike glycoprotein structural basis for the recognition of sars-cov-2 by full-length human ace2 structural basis of receptor recognition by sars-cov-2 the authors thank a. kavirayani and c. hindley for proofreading, and s. bae for initial design of the artwork. work in j.a.k.'s laboratory is supported by the austrian academy of sciences, the austrian science fund (grant z_153_b09) and an advanced grant from the european research council. 
b.-k.k.'s laboratory is supported by the austrian academy of sciences, the european research council, the human frontier science program and the interpark bio-convergence center grant program. the authors contributed equally to the writing and revisions of the article. the authors declare no competing interests. nature reviews molecular cell biology thanks h. clevers, p. liberali and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

key: cord-171231-m54moffr authors: habli, ibrahim; alexander, rob; hawkins, richard; sujan, mark; mcdermid, john; picardi, chiara; lawton, tom title: enhancing covid-19 decision-making by creating an assurance case for simulation models date: 2020-05-17 journal: nan doi: nan sha: doc_id: 171231 cord_uid: m54moffr

simulation models have been informing the covid-19 policy-making process. these models, therefore, have significant influence on risk of societal harms. but how clearly are the underlying modelling assumptions and limitations communicated so that decision-makers can readily understand them? when making claims about risk in safety-critical systems, it is common practice to produce an assurance case, which is a structured argument supported by evidence with the aim to assess how confident we should be in our risk-based decisions. we argue that any covid-19 simulation model that is used to guide critical policy decisions would benefit from being supported with such a case to explain how, and to what extent, the evidence from the simulation can be relied on to substantiate policy conclusions.
this would enable a critical review of the implicit assumptions and inherent uncertainty in modelling, and would give the overall decision-making process greater transparency and accountability.

the epidemiological simulation models that have been informing the covid-19 policy-making process should be viewed as safety-critical systems. these models have direct and significant influence on the policies and decisions that aim to reduce the risk posed by the virus to public health [1] . however, a recent systematic review of 31 diagnostic and prediction models for covid-19 concluded that, at present, none of these models could be recommended for practical use to inform critical policy decisions [2] . it is, therefore, vital that decision-makers are aware of the assumptions made in these models and that they can reflect on the limitations of the models in relation to practical decisions about the management of the pandemic. similar to engineered safety-critical systems, e.g. flight control software or pacemakers, the rigour and transparency with which these simulation models are developed should be proportionate to their criticality to, and influence on, public health policy - this is true for covid-19 but also holds for other models used to support such critical decision-making. in safety-critical systems engineering it is common practice to produce an assurance case - a structured, explicit argument supported by evidence [3] . such cases are a primary means by which confidence in the safety of the system is communicated to, and scrutinised by, the diverse stakeholders, including regulators and policy makers. we believe it is important to support the covid-19 simulation models with an assurance case that explains how, and the extent to which, the resulting evidence supports and substantiates the policy conclusions.
we argue that such a case has the potential to enable a wider understanding, and a critical review, of the expected benefits, limitations and assumptions that underpin the development of the simulation models and the extent to which these issues, including the different sources of uncertainty, are considered in the policy decision-making process. the use of assurance cases is a long-established practice in the safety-critical domain. particularly in the uk, the development of an assurance case is a mandatory requirement in key sectors such as defence, nuclear and rail [4] . more pertinently, in the nhs, compliance with the clinical safety standards dcb0129 and dcb0160 requires an assurance case for health it systems [5] . an assurance case may consider different critical properties of a system. in this paper, we focus on safety. an assurance case is primarily used to communicate, support and critically evaluate a safety claim about risk-based decisions to commission a system or a service. data from modelling, simulation, testing and in-service usage provides the evidence base for such a claim. however, this evidence is rarely conclusive. it entails different sources of uncertainty and hinges on technical, organisational and social assumptions. further, in making risk-based decisions, tradeoffs are inevitable, e.g. between safety, privacy and costs. justifications for these decisions have to be communicated to and accepted by the relevant stakeholders, e.g. by individuals or groups whose privacy might be reduced in return for an improvement in safety. to this end, a structured argument is used to explain the extent to which the evidence supports the safety claim, given the many sources of uncertainty, assumptions and tradeoffs. the argument is structured in the sense that it should make these issues and the way in which they relate to each other explicit for the different stakeholders to critically review, modify, accept or reject. 
the more complex and novel the system and its context are, the more significant the role of the assurance argument is in informing the risk-based decision-making process. healthcare and public health interventions form a complex and adaptive set of interacting systems in which risk-based and evidence-based decisions would benefit from explicit and clear explanation by means of structured arguments. a study by the health foundation on the use of assurance cases in healthcare highlighted some potential benefits [6] , primarily:

1. promoting structured thinking about risk among clinicians and fostering multidisciplinary communication about safety;
2. integrating evidence sources;
3. aiding communication among stakeholders; and
4. making the implicit explicit.

as a highly salient example, take report 9 by the imperial college covid-19 response team ("the impact of non-pharmaceutical interventions (npis) on the reduction of covid-19 mortality and healthcare demand") [7] . this is an example of microsimulation providing primary evidence which has significant policy implications. we can view a policy assurance case as an integration of the following, as illustrated in figure 1 :

a. data from microsimulation providing scientific evidence;
b. scientific claims, often referred to as "scientific advice", concerning the effect of the different public health strategies based on the evidence; and
c. policy claims concerning the chosen public health strategy based on the scientific claims, but also taking into account national values, policy goals, etc.

the relationships between the above can be established through the following arguments:

d. scientific argument explaining the extent to which the microsimulation evidence (a) supports the scientific claims (b);
e. policy argument explaining the extent to which the scientific claims (b) are sufficient to support the policy claims (c); and
f. confidence argument in the trustworthiness of the microsimulation evidence (a).
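the a-f decomposition above is essentially a small traceability graph. as a minimal sketch (all node names below are invented for illustration, not taken from report 9), it could be represented and checked mechanically, e.g. to confirm that every policy claim ultimately rests on some item of evidence:

```python
# illustrative sketch only: the policy assurance case of figure 1 as a
# simple traceability graph. all node names are hypothetical.

# (a) evidence, (b) scientific claims, (c) policy claims
evidence = {"npi_microsimulation_results"}
policy_claims = {"adopt_suppression_strategy"}

# (d) scientific argument, (e) policy argument, (f) confidence argument,
# each recorded as "conclusion -> set of premises it is supported by"
support = {
    # d: evidence supports the scientific claim
    "suppression_reduces_icu_demand": {"npi_microsimulation_results"},
    # e: scientific claim plus societal considerations support the policy claim
    "adopt_suppression_strategy": {"suppression_reduces_icu_demand",
                                   "national_values_and_policy_goals"},
    # f: confidence in the trustworthiness of the evidence itself
    "npi_microsimulation_results": {"model_validation_and_code_review"},
}

def traces_to_evidence(claim, seen=()):
    """True if a claim is ultimately grounded in at least one evidence item."""
    if claim in evidence:
        return True
    if claim in seen:  # guard against cycles in the argument graph
        return False
    premises = support.get(claim, set())
    return any(traces_to_evidence(p, seen + (claim,)) for p in premises)

for c in policy_claims:
    print(c, "grounded in evidence:", traces_to_evidence(c))
```

even this toy traversal makes the point of the paper concrete: the "national values and policy goals" node has no grounding in simulation evidence, and a structured case makes that visible rather than implicit.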
simulation models are engineering artefacts. as such, they should be systematically specified, implemented and tested. the rigour with which this is performed should be proportionate to the criticality of these models to the decision-making process. for example, the covid-19 model used by the imperial team is based on a modified individual-based simulation that was developed to support pandemic influenza planning. models can be built for a specific purpose and therefore a confidence argument would need to justify the suitability of the model for the new context, including the continued validity of the original parameters. this is important since ad hoc reuse and modification have been associated with catastrophic accidents in other safety-critical domains (e.g. the recent boeing 737 max accidents [8] ). as thimbleby recently argued [9] , the quality of the software design and code of the simulator is an important factor, particularly its amenability to inspection and testing. for instance, neil ferguson, the lead author of the imperial report, stated the following: "for me the code is not a mess, but it's all in my head, completely undocumented. nobody would be able to use it… and i don't have the bandwidth to support individual users" [10] . in a safety-critical context, this would significantly undermine confidence in the simulation results. we can note that this is defensible, in context: the imperial team was working under tight timescales and (longer-term) it does plan to make the simulation program publicly available. we hope that this will be combined with the actual source code to enable wider replication and evaluation of the evidence. the validity of the simulation results hinges on large uncertainties and many societal assumptions, e.g. about population behavioural changes. in large part, this is because covid-19 is a novel virus, with relatively little reliable data on transmission rates.
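the testing concern raised here can be made concrete. one basic property that any stochastic simulator can be held to is seeded reproducibility: with the random seed fixed, reruns must be identical. the sketch below uses a toy sir step invented for illustration, not the imperial model's code:

```python
# illustrative only: a seeded regression test of the kind that makes a
# stochastic simulator amenable to inspection and testing.
import random

def sir_step(s, i, r, beta, gamma, rng):
    """one stochastic step of a toy susceptible-infected-removed model."""
    n = s + i + r
    new_inf = sum(rng.random() < beta * i / n for _ in range(s))
    new_rec = sum(rng.random() < gamma for _ in range(i))
    return s - new_inf, i + new_inf - new_rec, r + new_rec

def run(seed, steps=10):
    rng = random.Random(seed)      # all randomness flows from one seeded source
    state = (990, 10, 0)
    for _ in range(steps):
        state = sir_step(*state, beta=0.3, gamma=0.1, rng=rng)
    return state

# two cheap, reviewer-checkable properties that need no epidemiological
# domain knowledge: bit-identical reruns, and conservation of population
assert run(seed=42) == run(seed=42)
assert sum(run(seed=42)) == 990 + 10 + 0
```

tests of this kind do not validate the epidemiology, but they are exactly the sort of independently checkable evidence a confidence argument can cite.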
the developers of the model made many of these assumptions explicit through listing the corresponding parameters and where data exists to support the chosen parameter values. this should enable an independent assessment and evolution of the model. for example, the report states an assumption that 30% of covid-19 patients who are hospitalised will require critical care (invasive mechanical ventilation) based on early reports from cases in the uk, china and italy. we now know that this was a significant overestimate due to a combination of miscommunication ("critical care" in many other countries includes non-invasive measures such as continuous positive airway pressure devices) and the effects of the initial official uk advice to "intubate early". given the novelty of the virus and the large uncertainties around the design of the model and its underpinning data, the transition from the simulation results to the overall scientific claim, i.e. scientific advice or conclusions, is not straightforward [11] . we recreated the scientific argument using a structured argumentation notation, the goal structuring notation (gsn) [12] . gsn is widely used in safety-critical domains for creating structured assurance arguments. figure 2 shows a simplified example of how a structured argument can be used to capture part of the scientific argument through identifying the claims that are made, the evidence that supports those claims, and the relationships between them. structured arguments also help to ensure that the key assumptions that are made are documented, e.g. exclusion of economic and non-covid-19 health implications. in figure 2 , the results of modelling for different npis are used as evidence to support claims about the positive and negative effects of a covid-19 mitigation strategy. 
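a gsn fragment like the one described for figure 2 can also be captured as plain data, which makes gaps such as undeveloped goals easy to spot; the node texts below are hypothetical paraphrases for illustration, not the report's wording:

```python
# hypothetical sketch of the figure 2 fragment in gsn terms; element kinds
# (goal, solution, assumption) follow the standard notation, node texts are invented.
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str          # "Goal", "Solution", "Assumption", "Context"
    text: str
    supported_by: list = field(default_factory=list)

mitigation_results = Node("Solution", "microsimulation results for mitigation npis")
g_positive = Node("Goal",
                  "mitigation reduces peak healthcare demand",
                  supported_by=[mitigation_results])
g_negative = Node("Goal",
                  "mitigation still leaves critical-care capacity exceeded",
                  supported_by=[mitigation_results])
a_scope = Node("Assumption",
               "economic and non-covid-19 health implications are excluded")

def undeveloped_goals(nodes):
    """goals with no supporting evidence are undeveloped and need attention."""
    return [n.text for n in nodes if n.kind == "Goal" and not n.supported_by]

print(undeveloped_goals([g_positive, g_negative, a_scope]))  # -> []
```

documenting the scope assumption as an explicit node, rather than leaving it implicit in prose, is precisely what the structured argument adds over the raw report.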
a confidence argument in the trustworthiness of the microsimulation evidence is developed further (referenced but not shown in figure 2) and considers justification of the way in which the model was adapted and tested, including the choice of parameters. moving from scientific advice and evidence to a policy decision requires that policy makers consider assumptions, risk acceptance beliefs and tradeoffs (such as between economic and medical impact) that are often not directly amenable to rigorous scientific examination [11] . the transition from a scientific claim to a policy one should therefore involve a complex and diverse policy argument that builds on the scientific claims, but also brings to bear these additional considerations [13] . imperial college report 9 contains some explicit policy claims, but it does not contain a policy argument [7] . a good policy argument should justify the reliance on particular sources of scientific advice and models, and acknowledge the extent to which the underlying sources of uncertainty in the evidence were considered. alternative scientific claims based on different (potentially conflicting) models should also be considered where available. the policy argument should make clear how tradeoffs were made and how evidence concerning the economic, legal and ethical implications of the chosen policy was generated and appraised. in the covid-19 context, such evidence should also incorporate an estimation of non-covid-19 health harms, e.g. potential delays in cancer diagnosis and treatment. highlighting the different aspects of the policy argument should ensure clarity about different accountabilities: (1) the accountability of the scientists to base their scientific advice on data and results that have been generated in a trustworthy manner; and (2) the accountability of the policy makers to appraise the different items of evidence and clarify the basis on which the policy was established.
our society is currently placing great weight on simulation models of covid-19 effects. although such models are essential for dealing with the pandemic, it is hard to know which we should trust, to what extent, and under what conditions. we need, therefore, to make an interdisciplinary effort to understand these models, and to support that effort we should use assurance cases to capture our arguments of validity. in such an effort, epidemiologists and health data scientists will have a central role, but they will need support from software engineers, including those with safety-critical software experience. working together, such collaborations will be able to create standards for developing, testing and maintaining these models in a consistent, rigorous and auditable manner. they will be able to build assurance cases that communicate the uncertainty, assumptions and tradeoffs to a wide variety of stakeholders. this knowledge will then aid policymakers in using pandemic models in exactly the ways that they are useful, and not in the ways that they are not.

special report: the simulations driving the world's response to covid-19
prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal. bmj
should healthcare providers do safety cases? lessons from a cross-industry review of safety case practices. safety science
safety and assurance cases: past, present and possible future - an adelard perspective
what is the safety case for health it? a study of assurance practices in england
using safety cases in industry and healthcare. the health foundation
impact of non-pharmaceutical interventions (npis) to reduce covid19 mortality and healthcare demand
the boeing 737 max accidents. 28th safety-critical systems symposium
a call for professional software engineering to be used in epidemic modelling
no 10's infection guru recruits game developers to build coronavirus pandemic model
four principles to make evidence synthesis more useful for policy
risk assessment and risk management: review of recent advances on their foundation

key: cord-258316-uiusqr59 authors: spil, ton a.m.; romijnders, vincent; sundaram, david; wickramasinghe, nilmini; kijl, björn title: are serious games too serious? diffusion of wearable technologies and the creation of a diffusion of serious games model date: 2020-08-18 journal: int j inf manage doi: 10.1016/j.ijinfomgt.2020.102202 sha: doc_id: 258316 cord_uid: uiusqr59

today globally, more people die from chronic diseases than from war and terrorism. this is not due to aging alone but also because we lead unhealthy lifestyles with little or no exercise and typically consume food with poor nutritional content. this paper proffers the design science research method to create an artefact that can help people study the diffusion of serious games. the ultimate goal of the study is to create a serious game that can help people to improve their balance in physical exercise, nutrition and well-being. to do this, we first conducted 97 interviews to study whether wearables can be used for gathering health data. analysis indicates that designers, manufacturers, and developers of wearables and associated software and apps should make their devices reliable, relevant, and user friendly. to increase the diffusion, adoption, and habitual usage of wearables, key issues such as privacy and security need to be addressed as well. then, we created a paper prototype and conducted a further 32 interviews to validate the first prototype of the game, especially with respect to the diffusion possibilities of the game. results are positive from a formal technology acceptance point of view, showing relevance and usefulness.
but informally, in the open questions, some limitations also became visible. in particular, ease of use is extremely important for acceptance, and calling the artefact a game can in fact be an obstruction. moreover, the artefact should not be patronizing, and age differences can also pose problems - hence the title's warning not to make the serious game too serious. future research plans to address these problems in the next iteration, while the future implementation plan seeks big platforms or companies to diffuse the serious game. a key theoretical contribution of this research is the identification of habit as a potential dependent variable for the intention to use wearables and the development of a diffusion model for serious games. the hedonic perspective is added to the model, as are trust and perceived risks. this model ends the cycle of critical design with an improvement of theory as a result, contributing to the societal goal of decreasing obesity and diabetes.

mobile health solutions, including those with the ability to provide healthcare delivery, advice and access to healthcare information, are rapidly gaining prominence (american diabetes association, 2008). this is largely due to increases in computing power and developments with smart phone capabilities and technologies (global mobile statistics, 2014). there are many benefits of mobile health solutions; namely convenience, a low or negligible learning curve, and accessibility essentially 24/7 (steinhubl, muse, & topol, 2015). mobile health technologies tend to support both health and wellness aspects across all age groups, genders and ethnicities, and this makes them particularly popular with consumers (markoff, 2011). hence, we are witnessing mobile solutions to support diet and exercise activities, management and empowerment for people with chronic conditions such as diabetes, as well as mental wellness and behaviour support (global mobile statistics, 2014).
the use of self-tracking wearable technology has increased in popularity and is now being used as a means of optimising the health, fitness, and well-being of individuals and even groups. the widespread diffusion and adoption of wearables requires the development of rich and robust lenses to conceptualise and understand the drivers of their success (benbunan-fich, 2018). in this research, we define wearables as devices worn by individuals which monitor variables such as steps taken, heart rate, speed, pace, distance, calories burnt, hours slept, quality of sleep, and even dietary information. the sale of wearables is soaring and is set to grow by 27 % in 2020 (https://techcrunch.com/2019/10/30/wearable-spending-forecasted-to-increase-27-in-2020/). 23 million wearables were sold in 2016 worldwide and total sales are expected to grow to 213 million in 2020. while the diffusion of wearables is high, the long-term adoption of wearables and the apps on them is low: the rates of abandonment of wearables and of the apps on them are substantial. hence there is a strong need to investigate the users of wearables and find out (a) what would help make wearables and the apps on them a 'success' and (b) why the adoption of wearables and the apps on them is a 'failure' so far. serious games is a term used to describe the development of games specifically designed to achieve some change in the player. this could be a change in knowledge, attitude, physical ability, cognitive ability, health, or mental wellbeing. mccallum (2012) identified three types of health games: games focussing on physical health, cognitive health, and social and emotional health. as noted by hamari and keronen (2017), games are increasingly being employed for a variety of purposes, yet the literature is scattered and there is a lack of a clear and reliable understanding of why games are being used and what their benefits are.
moreover, it is still to be established how they are placed with respect to the utilitarian-hedonic continuum of information systems (hamari & keronen, 2017). further, they note that, on reviewing 48 studies, they found that because games intended for instrumental use are rated high regarding enjoyment and usefulness, such games are multi-purpose information systems which rely on both hedonic factors and the pursuit of instrumental outcomes. this paper proffers the design science research method to create an artefact that can help people to improve their balance in physical exercise, nutrition and well-being. specifically, the wearable incorporates serious games to investigate the research question "how can we create a qualitative diffusion model for serious games?" this study will focus on the motivational purpose of serious games. the paper is structured as follows: in the following section we first look at the adoption and diffusion literature, then focus on the diffusion of wearables and thereafter on the diffusion of serious games. in section 3 we explain our research method. section 4 mirrors the literature review and discusses the interview results in terms of the diffusion of wearables and the adoption of serious games. in section 5 we discuss the theoretical and practical implications of our study and results. section 6 closes with conclusions. in this section we first present relevant aspects of the adoption and diffusion literature to create the basis for the interview model. this is followed by specific wearable diffusion issues from the literature, and finally we discuss the diffusion of serious games from a theoretical perspective. for the interviews we made use of the use it model (landeweerd, spil, & klein, 2013), a qualitative research model derived from the utaut model (venkatesh et al., 2003), the diffusion of innovation model (rogers, 1983) and the is success model from delone and mclean (2003). fig.
1 illustrates the use it model that integrates the four determinants of success of ict. the use it model makes a distinction between the process and the product of innovation, as rogers (1995) does. the domains are taken from delone and mclean (2003), who show the net benefits of an information system in the user domain (relevance) and the information, system and service quality in the information technology domain. we used a grounded literature review using the approach of wolfswinkel, furtmueller, and wilderom (2013). the key steps in the process are define, search, select, analyse and present. appendix 1 illustrates the summarised metadata for this grounded literature review. the main dissatisfaction when using a wearable is not being able to fulfil the expectations of the users in terms of fit, comfort, form factor, selectability, adaptability, and overall utility (coorevits & coenen, 2016). this could be due to the limited focus of designers and developers on the needs of the user. nascimento, oliveira, and tam (2018) revealed that satisfaction affected the intention to continue to use, in particular among those who were not power users and had a low level of habit. consumers may actually 'have inflated expectations about the ability of wearables to change nutritional habits'. furthermore, they mention that consumers may have specific needs, such as diet needs, that are neither captured well nor displayed by the dashboards of wearables (canhoto & arp, 2017). buchwald, urbach, and von entreß-fürsteneck (2018) speak about satisfaction as well as dissatisfaction as important metrics in understanding the continuance and discontinuance of self-tracking devices. they use the hygiene theory of herzberg and suggest that while hygiene factors can cause dissatisfaction, they may not necessarily cause satisfaction. for example, the unreliability of the system creates and fosters an intention to discontinue, but its absence does not contribute to the formation of an intention to continue.
'experience with technology is a key parameter in consumers' adoption' (kalantari, 2017, p. 301). in the context of self-tracking technology, kari, koivunen, frank, makkonen, and moilanen (2016) found that critical experiences promote or hinder adoption and can thereby lead to rejection during implementation. they also found that prior experiences with self-tracking technologies had an influence on the performance expectancy of new technologies. in the context of post-adoption and sustained use, hands-on experience with the technology influences habit and use, and habit in turn influences behavioural intention and use behaviour (venkatesh, thong, & xu, 2012). limayem, hirt, and cheung (2007) refer to habit as ''the extent to which people tend to perform behaviors (use is) automatically because of learning'' (p. 705). they discuss four conditions that are likely to form is habits: 1) frequent repetition, 2) extent of satisfaction with outcomes, 3) relatively stable contexts and 4) comprehensiveness of usage of the is system. prior frequency of behaviour is important for habit strength. rogers (1995) does not use the term habit but shows the importance of the institutionalisation of a new innovation. while experience is necessary for forming a habit, it is not in itself a sufficient condition (venkatesh et al., 2012). wearables have specific characteristics; due to the novelty of the technology, habit could be an important factor in technology acceptance (polites & karahanna, 2012). users seem to have problems keeping activity trackers on their person. they remove them to engage in activities such as showering, washing dishes, etc. (shih, han, poole, rosson, & carroll, 2015). there also seems to be a trade-off in terms of the size of the wearable. smaller ones are easy to wear and carry but also easy to forget and more fragile, while larger ones cannot be forgotten but are inconvenient to carry and bulky.
fig. 1. the use it model (spil, schuring, & michel-verkerke, 2004).
however, respondents did not seem to have problems remembering to carry car keys, mobile phones, and wallets. shih et al. (2015) believe that with longer adoption periods wearables will become a part of users' daily (activity) routines. individuals have to (a) prepare the wearable, such as charging it, (b) make sure the gps is working, (c) turn it on and (d) finally remember to bring it (lupton, pink, heyes labond, & sumartojo, 2018). while some of these could become part of a routine/habit, other aspects required constant attention and vigilance. fritz, huang, murphy, and zimmermann (2014) did a longitudinal study on fitness trackers using wearable devices in three different continents. the wearables became a part of the participants and they felt strange when they took them off. however, most of them lost interest when the novelty wore off and the monitoring became routine. once they crossed the learning curve and were able to estimate their steps and/or calories without the device by themselves, the wearables became obsolete. research interest in serious games has increased over the last decade and we now find a few similar definitions of serious games in existing literature: "serious games are games that do not have entertainment, enjoyment or fun as their primary purpose" (michael & chen, 2005). "serious games have an objective to use the entertaining quality of the game for training, education, health, public policy, and strategic communication objectives." the combination of these two definitions is maybe the best way of defining serious games. the purpose differs from entertainment-oriented video games, but this does not mean that it cannot be fun or joyful to play.
in addition, marsh (2011) argues that not all videogame characteristics, such as challenge, fun and play, are appropriate descriptions or labels for all serious games. we define serious games as: "playful acts that do not have entertainment, enjoyment or fun as their primary purpose but have training, education, health, public policy, and strategic communication objectives". over the recent past gamification has increased in popularity (koivisto & hamari, 2018). gamification refers to designing information systems to afford similar experiences and motivations as games do, and consequently attempting to affect user behaviour (koivisto & hamari, 2018). the authors reviewed 819 studies and found that while the results in general lean towards positive findings about the effectiveness of gamification, the amount of mixed results is sufficient to urge caution (koivisto & hamari, 2018). furthermore, education, health and crowdsourcing, as well as points, badges and leaderboards, persist as the most common contexts and ways of implementing gamification (koivisto & hamari, 2018). taken together, their findings suggest that there is much room to design more suitable serious games to support specific goals. this would also suggest that more incorporation of co-design and user-centred design in gamification for healthcare could prove to be a critical success factor in the uptake and continued use of these games in this context. deterding, dixon, khaled, and nacke (2011) divide the world into a playing world and a non-playing context. in this paper we instead see an entertainment-oriented and a goal-oriented approach to game design. in between we see entertainment games that are partially used for real-life purposes and real-life games that become more fun. garris, ahlers, and driskell (2002) developed a game model in which they call the in-between group instructional games.
they clearly describe a scale from video games to game-based learning and state that "there is little consensus on game features that support learning" (p. 442). hamari et al. (2016) focus on challenge and skill (flow) and on engagement and immersion in perceived learning. they conclude that serious games must challenge and engage the players for better learning. kiili (2005) also focuses on flow. flow is seen as the challenges versus the capabilities of the player. an interesting aspect is that feedback is part of both the flow task and the flow artefact, indicating that adding feedback to entertainment games can give a learning effect. transformational learning can be the bridge, communicating the power of games (barab et al., 2012). furthermore, the positioning of the person and the content are closely linked (sousa et al., 2018), where the positioning context can be derived from dialogues and narratives. most researchers focused their studies on different purposes of serious games. there has been a lot of research into the effectiveness of serious games in teaching-learning related processes. for example, buchinger and hounsell (2018) reviewed a list of collaborative-competitive serious games in the teaching-learning process. they mentioned nine important design features: intra-player interaction, synchronization, roles, resources, score, challenge, reward, artificial intelligence and operationalization. this study will focus on the motivational purpose of serious games to create consciousness and behaviour change. in the next section we introduce the adopted research method. in line with the set of principles for conducting critical research in information systems as discussed by myers and klein (2011), our research consists of elements of critique as well as of transformation.
we question the actual adoption and effectiveness of wearables and serious games - the principle of revealing and challenging prevailing beliefs and social practices - by making use of the it adoption model as discussed in the previous section, based on insights from innovation and adoption researchers like davis, bagozzi, and warshaw (1989), delone and mclean (2003), rogers (1983) and venkatesh et al. (2003) - the principle of using core concepts from critical social theorists. we study how the adoption of serious wearable games can be improved - the principle of taking a value position - in order to help improve health on both an individual and a societal level - the principles of individual emancipation and improvements in society - and try to improve diffusion models for serious games by identifying habit as a potential dependent variable for the intention to use wearables - the principle of improvements in social theories. we use elements of critique (myers & klein, 2011) such as the principle of using core concepts from critical social theorists, dating back to ajzen and fishbein (1980) and bandura (1977), leading toward the theory of planned behavior. the emphasis on relevance in the interview method used leads to the value position that critical theorists advocate. finally, the principle of revealing and challenging prevailing beliefs and social practices is well established in this paper by choosing a societal problem (obesity and diabetes) and exploring behavioural change with the help of wearables and serious games. the elements of transformation (myers & klein, 2011) are also studied in this paper. the principle of individual emancipation is studied with efficacy (bandura, 1977), which is used in the utaut model (venkatesh et al., 2003). this study is aimed at health improvements in society and provides a theory improvement with a new model for the diffusion of serious games. for attaining these research goals, a mixed method approach was adopted.
in order to design our mobile health serious game, we made use of an adapted version of the design science research method (dsrm) process model, based on the work of peffers et al. (2007). after performing the literature study, we conducted 97 semi-structured interviews with actual owners and users of wearables. the initial group of interviewees was very diverse, with users of different ages, backgrounds and education levels. in order to make the results more generalizable and the interview sample more homogeneous, we made use of the so-called drill-down technique. this was accomplished by focusing on interviewees that can be regarded as the most active users, i.e. millennials (between 18 and 34 years old), who are far more likely to own wearables than older adults and who use wrist-worn wearables for general health and fitness purposes. in order to focus our analysis on a homogeneous group of early adopters (rogers, 1983; yin, 2013), we developed a subset of 20 interviews which were analysed in depth. the majority of the interviewees (a) were highly educated, (b) had experience with technology in general and ict in particular and (c) were willing to voluntarily adopt new technologies such as wearables. the other 77 interviews were used for the requirements of the wearables and the serious game for diabetes and obesity. we analysed the qualitative interview data by doing a sentiment analysis through coding (huberman & miles, 1994). we divided our analysis into three different phases: data reduction, data display and drawing conclusions/verification. after getting a better insight into the adoption of wearables based on the sentiment analysis, we designed a specific game artefact based upon interviews with 97 potential users across a wide age range and demonstrated the artefact in a student environment during a 10-week testing period.
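the three analysis phases described above (data reduction, data display, drawing conclusions/verification) can be sketched as a small coding pipeline. the sketch below is illustrative only: the fragment texts, code labels and sentiment values are hypothetical examples, not taken from the actual interview data of this study.

```python
from collections import Counter

# data reduction: raw transcripts are reduced to coded fragments,
# each recorded as (interviewee id, code label, sentiment).
# these example fragments are hypothetical.
fragments = [
    (1, "ease of use", "positive"),
    (1, "privacy", "negative"),
    (2, "ease of use", "positive"),
    (2, "relevance", "positive"),
    (3, "privacy", "negative"),
    (3, "ease of use", "negative"),
]

def display_counts(fragments):
    """data display: tally sentiment per code across all fragments."""
    return dict(Counter((code, sentiment) for _, code, sentiment in fragments))

def share_mentioning(fragments, code, n_interviewees):
    """drawing conclusions: share of interviewees who mention a code at all."""
    who = {i for i, c, _ in fragments if c == code}
    return len(who) / n_interviewees

counts = display_counts(fragments)
print(counts[("ease of use", "positive")])        # two positive mentions
print(share_mentioning(fragments, "privacy", 3))  # share of interviewees raising privacy
```

a tally like this is what turns interview quotes into the percentages reported in the results section, while the quotes themselves remain available for the emotional analysis.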
before doing a second iteration, we conducted another 32 interviews to validate the first prototype of the game, especially regarding the diffusion possibilities of the game. in sum, our study can be regarded as a qualitative study of the diffusion of wearables and serious games; we did not focus on the specifics of the serious game itself (i.e. its user interface). the following section describes the objective descriptive data as given by the interviewees. this is followed in the subsequent section by a sentiment analysis and a comparison to the literature. around 50 % of the interviewees had a smartwatch and 'apple' was the most mentioned brand. around 25 % possessed some form of bracelet. pedometers, sport watches, pebble and fitbit made up the rest. the primary purpose of using wearables seems to be the monitoring of steps and heart rate (fig. 2). four out of seven respondents use the heart-rate function for sport/movement, with running the most mentioned sport. analysis of sleep was mentioned by three interviewees, of whom two were interested in the amount of sleep while the third was interested in the rhythm of sleep. nearly twenty-five percent of the interviewees mentioned that they would like to have an extension of their smartphone as part of their wearable. two interviewees mentioned that they want a standalone device with its own internet access and its own gps. two mentioned that they would like to be able to monitor blood pressure. the following items were mentioned as extra features that users would like to have: bmi, weight, scanning food instead of filling it in, body temperature, a health app giving prescriptive advice about a certain disease/disorder, monitoring health in order to change behavior, and the amount of alcohol in the blood. a fitbit user also mentioned wanting more functionality with regard to movement. essentially there was consensus that wearables needed to be more comprehensive and standalone.
when queried on the crucial factors for the use of wearables, nearly twenty-five percent of the interviewees identified additional value and ease of use as important (fig. 3). twenty percent of the respondents mentioned reliable data and personal interest, either in new technology or from a hobby point of view. the lifespan of the battery was of importance to fifteen percent. only ten percent mentioned health, communication, behaviour change and a stand-alone device as important. the adoption of serious games is predicated on a number of factors, as illustrated in fig. 4 below. although the structured analysis is very positive, showing a high probability of diffusion, the emotional analysis shows some limitations that the next design must overcome. the emotional analysis is given in quotes: • "i wonder why it cannot be a normal app and has to be a game" • "i think it looks a bit childish" • "i do not want the game to treat me like a child" • "i already know that i need to exercise more" • "cheating is easy" the analysis shows that the next iteration should take care of age differences and preferences, and that the interaction with the game should be as easy as possible. we should reconsider whether to call it a game or an app. in fig. 4 only the key areas are mentioned. just as in most tam (technology acceptance model) studies, perceived usefulness is important: in this case 80 % of the interviewees say that staying healthy is very relevant and that, if the game contributes to that, they would use the game. ease of use is on the same level (80 %) but is relatively more important because so many interviewees mentioned it more than once. only 40 % of the interviewees were concerned with privacy. 60 % of the interviewees mentioned that measuring and using physical activity in the game would be good and easy to accomplish. on nutrition and relaxation they were not that sure, both in measuring it and in using it.
the research question under consideration was "how can we create a qualitative diffusion model for serious games?" to answer this research question we did a systematic literature review that confirmed relevance as the most important determinant of the diffusion of both wearables and serious games. what is unique and interesting about our research findings is the notion of habit as a new determinant for the diffusion of serious games. the institutionalisation of the serious game is an important factor in making the serious game a lasting success. the dynamics of the game will improve the flow and will prevent the treatment from becoming boring. the first study on wearables confirmed the importance of the habit determinant, and in table 2 we build a new proposition for serious games. next to that, we propose perceived enjoyment as a determinant for the successful diffusion of serious games. finally, we use the concept of information quality of delone and mclean to address the importance of learning and feedback in serious games. moreover, our findings contribute to both theory and practice as follows. the diffusion of wearables is hindered by the perception of a lack of relevance by users and a lack of relative advantage. there is the potential to add new features and/or functionalities to these wearables, such as blood pressure, temperature, and even blood sugar measurement in the future. this may enhance the perception of relevance. however, the more information is captured, the more security and privacy issues arise. currently wearables just provide descriptive facts. yet, in the opinion of the authors, for wearables to be really effective they need to go beyond descriptive information and provide prescriptive information that will allow users/wearers to take action.
from a serious gaming point of view, we can conclude that staying healthy is the most relevant factor, and the perceived usefulness should address this for the user to adopt this serious game. the main idea from the interviews and the first design cycle is to "improve health by having fun". the fun element is not elaborated yet, because we think this is more the domain of professional game designers than of e-health researchers. still, fun or enjoyment is of major importance for the game. the new diffusion model for serious games can help researchers to study whether a specific serious game is likely to diffuse in the target group. section 5.2 shows how this can be done in practice. one of the key factors for discontinuing the use of wearables is the presence of errors and a lack of reliability. while organizations can depend on it-service departments and/or external contractors to solve bugs, errors and reliability issues, this is not the case for personal/individual ict such as wearables. it is expected that personal icts are accurate and reliable, and it is left to the individual to solve problems; unfortunately, most individuals do not have the knowledge, will, or time to troubleshoot and solve problems and issues that may arise with personal ict. overall, the results indicated that people were neutral to positive with regard to sharing the information, body data, habits, addictions, and living-environment data that the wearable provided for diagnosis and statistical research. the extent to which they are willing to share their data depends on several factors. while people believe wearables can be hacked, their opinion is divided as to whether their privacy is at stake. from a monitoring point of view, nutrition is the hardest factor to measure. it is subjective and cannot be done in an automated way. suggestions from the literature and the interviews are to use speech recognition and imaging to make monitoring as easy as possible.
when analysing the outcomes of the interviews, the specific reasons why some people do not adopt or habitually use the wearable are not very clear. however, what became clear was that users of simple, unsophisticated models did not develop the habit of using wearables every day or throughout the day. in conclusion, from the literature and from the validation of the game design, one thing is very clear: the game (if we call it a game) should be simple and "stupid". ease of use is mentioned throughout all interviews and is clearly warned about in the literature, as shown in the background section. hence the title: not making the serious game too serious, and limiting the amount of feedback. serious games increasingly blur the boundary between hedonic and utilitarian information systems (berfine koese, morschheuser, & hamari, 2019). in and of itself this may not be an issue, but it does mean that users may perceive the purpose of the same system differently, ranging from pure utility to pure play (berfine koese et al., 2019). this could help explain why some of the serious games in healthcare are not as successful on a larger scale as initially expected. further, it may indicate that ensuring a more consistent understanding of the purpose of the game amongst users could be significant in improving the success of the specific game with regard to its particular healthcare benefit.
from the 562 games reviewed by the authors, not all of which were healthcare focused (berfine koese et al., 2019), they found that the more fun-oriented users perceived the system to be, the more enjoyment affects continued and discontinued use intentions, and the less ease of use affects the continued use intention (berfine koese et al., 2019); hence, users' conceptions of the system are an important influential aspect of system use and should particularly be considered when designing modern multi-purposed gamified information systems which have a specific purpose or focus, such as in healthcare contexts. our study confirms that these user conceptions from the emotional analyses seem troublesome, although from a quantitative analysis there seem to be no problems for diffusion. it is therefore important to take a qualitative perspective. our paper has also served to contribute to developing further the application of dsrm in healthcare contexts; in particular, we have incorporated aspects around privacy/security, which are essential considerations when designing healthcare-related solutions. in table 2 they are shown as trust and perceived risks. we have also included a consideration for hedonic aspects, i.e. perceived enjoyment (van der heijden, 2004), with the game still subscribing to its utilitarian goal of supporting a critical healthcare need. we note that to date these two aspects - privacy/security and hedonic aspects - have not been incorporated into dsrm. for example, while brooks and el-gayar (2015) adopted a dsrm approach to examine the implementation of electronic health records, neither of these two elements was part of their consideration. we suggest that privacy/security and hedonic aspects are useful to incorporate, as shown in table 2 and appendix 2, thereby extending dsrm when applied in healthcare contexts and probably also in other contexts.
the extensive and critical use of the dsrm method in this paper justifies a generalisation of the findings on both the diffusion of wearables and the future diffusion of a serious game for diabetes and obesity, which is the ultimate goal of this study. hence our research results serve to build a new theoretical model that can be used to predict whether a serious game will diffuse in society and start to reach behavioural and motivational objectives. we use kalantari's (2017) perspective to study wearable technologies. table 1 shows the comparison and analysis of both interview studies. it is followed by the building of a new model for the diffusion of serious games specifically. finally, we can build upon this analysis to develop a new model for qualitative design science studies of the diffusion of serious games.
table 1. the success factors analysed with the use it construct.
• perceived compatibility (process): all interviewees have either a smart watch, sports watch, fitbit or pedometer. all interviewees have internet online. all interviewees use a digital device on which an app can function.
• perceived usefulness: sport is at the top, closely followed by health. staying healthy is the most relevant subject, mentioned in 80 % of the interviews. relevance or additional value is a big theme, mentioned by 50 % of respondents.
• perceived usability: ease of use is mentioned in more than 80 % of the interviews as important for the success of the game.
• information quality: among younger people, the primary appeal is fitness optimization; older people seek to enhance their health and wellbeing and to extend their life. measuring physical activity is the most mentioned functionality: 60 % of the interviewees already do it, they want it to be easier, and the other 40 % expect to use it if provided. most respondents were positive about their enhanced insight and ability to monitor their health indicators, but they were divided regarding the enhancement of their personal health because of wearables. measuring nutrition is seen as difficult and only useful if it can be done in an easy way. measuring sleep and stress was done by just a few of the interviewees and needs further study.
• service quality / system quality / perceived risks: privacy and security on wearables do not appear to be a serious concern for the developers of the wearables and the apps on them, nor for the users. most interviewees think they are going to use the game when it improves their health. only 20 % see privacy risks.
• trust / social and personal influence: reliability is a big theme, mentioned by almost 50 % of the respondents. nearly all interviewees state that they are willing to spend some time using the serious game. a minor theme concerns the willingness of people to provide health data: they see the wearable more as a personal tool for their own use than as a healthcare system tool. many interviewees state that peer pressure might help them to stay on track with their health objectives.
from theory (section 2.3) we derive the horizontal axis with fun, feedback and flow. next to perceived usefulness (davis et al., 1989) we add the hedonic perceived enjoyment (van der heijden, 2004). from section 2.2 we add the determinant habit. this determinant is specific for innovations that have to be used many times and should be validated in a future quantitative study in an extended tam or utaut. a proposition would be: habit has a positive and significant impact on the user's intention to use a serious game in healthcare. for requirements we did not use new notions but used the is success model of delone and mclean (2003) and the tam model (davis et al., 1989).
we determined an overlap between resources and resistance, took these elements together and renamed them reliability. trust and perceived risks were already added in the use it model (landeweerd et al., 2013). this can be a relevant addition to the utaut model; yu (2012) labelled it perceived credibility. the results in this table should not be compared with evaluation studies of serious games, where learning and behaviour play an important role (petri & gresse von wangenheim, 2016); the table focuses on the diffusion of serious games in a qualitative way. in appendix 2 we elaborate the content of this model into interview questions. we start the interview with process questions to check the compatibility (rogers, 1983) of the new serious game and to get to know the interviewee. the interview model in appendix 2 addresses all factors found in table 2 above. this interview model can be used by practitioners who want to develop a new serious game in healthcare environments. the interview takes approximately one hour, and the number of interviews will depend on the variety of the user (player) group. for each homogeneous group at least one interview should be done, but preferably two or more. serious games have evolved from being 'games' to just being 'serious'. they have become so serious that they are devoid of fun. this has had a significant impact on the adoption and use of serious games. to address this problem we have proposed a model for the adoption of serious games whose central tenets are fun, feedback, and flow. equally important are three more elements of the model, namely relevance, reliability, and fulfilment of requirements. these six elements together will enhance enjoyment, usefulness, trust, and quality, and ultimately lead to the diffusion and adoption of serious games.
but the most important outcome the model hopes to achieve is the design of serious games that lead to the transformation of individuals, the reduction of bad habits and the instilling of good habits. when we consider the various stakeholders in this space it becomes quite clear that at the heart of it is (or should be) the customer, namely the end user of the serious game (fig. 5). influencing the end user, and being influenced by the end users, are the game development studios, designers of serious games, and researchers of serious games. we also recognise the mutually reinforcing roles of all four stakeholders (sein et al., 2011) and the reciprocal shaping of both the artefact and the stakeholders. the practical implications of our research apply to all four stakeholders. what we have witnessed in the covid-19 pandemic has underscored more than ever the importance of health and wellness, and above all of keeping healthy. moreover, it has shown that individuals need to take more responsibility in monitoring and managing their health and wellness, supported by mobile and wearable technology aids. currently there are over 300,000 apps to support and assist patients with diabetes; however, these solutions have poor uptake and even less sustained use (jimenez, lum, & car, 2019). a key reason for this is the engagement of the user and the ability of the solution to sustain behaviour change. gamification has been shown to assist with increasing user engagement and sustained usage, but incorporating aspects of gamification for health and wellbeing is still in its infancy (johnson et al., 2016; spil, sunyaev, thiebes, & van baalen, 2017). our study has served to highlight critical aspects that need to be considered when designing a specific serious game, focusing on the reliability of the game, its requirements and its relevance, combined with ensuring the solution is fun, provides the correct level of feedback, and has an appropriate flow.
educating the users regarding the purpose of the game seems to be crucial for its success in terms of health benefits. obviously, ease of use and perceived usefulness are also critical in the adoption of the games. thus, our model provides a suitable rubric for game designers, so that they can develop wearable and mobile solutions to address a specific health or wellness aspect with confidence, knowing it will have a high likelihood of uptake and sustained use. from the perspective of practice, the business or financial angle is of equal importance: games with poor uptake and even poorer sustained use are not financially viable for developers and companies, and do not help to address escalating healthcare costs either. the diffusion, adoption, and retention of serious games by end users is of great concern to the game development studios. working together with all stakeholders, leading to the transformation of individuals, families, and communities, should be their primary goal and vision. depending on the health system, the studios need to work collaboratively with insurance companies as well as the health sector (public and private) to reduce the cost of health care, improve health and wellbeing outcomes, and enhance their enjoyment. considering the security and privacy concerns involved, a key challenge for the studios is to gain the trust of the stakeholders, in particular the end users and sponsoring or funding agencies such as the government and health sector. finally, the artefacts we have created as part of this research (the prototypes, the models, and the instruments themselves) can become the foundation for future research by other researchers of serious games. these were enumerated in section 5.1 above. finally, we want to stress that designing a serious game is not the holy grail for making the world healthy.
staying healthy is multifaceted, and a game alone is not going to solve the many health related problems ahead. in combination with many other initiatives, though, it can help to make the world a little bit better. moreover, the adoption of serious games can help to make wearable devices more relevant, more reliable and easier to use. a limitation of this paper is that the study was done in a well developed country; although the authors are confident the model can be used in less developed countries, this has not been tested. a first validation of the interview framework was done with 32 interviews and a specific prototype of a serious game on obesity and diabetes. future study is needed to validate its use for serious games in general and for serious games in underdeveloped countries. the identification of habit as a potential determinant of the intention to use wearables, and the development of a diffusion model for serious games based on this insight, can be seen as the most important theoretical contributions of our research. more specifically, we found during our interviews and validation design cycle that serious health games should "improve health by having fun". this aspect seems to be a critical design issue, and we therefore propose to include hedonic aspects, like perceived enjoyment, next to privacy/security related aspects in the diffusion model for serious games, the artefact made in this design science research method. we think that focusing on these aspects when developing a health related serious game may improve its diffusion and, as a consequence, may help to improve health on both an individual and a societal level. our research question was: how can we create a qualitative diffusion model for serious games? with the critical elements found in the section above we created a new diffusion model for serious games and tested it with 32 interviews.
we are confident that this model can improve the diffusion of serious games in healthcare and hope it will be applied in many successful future projects.
references:
- understanding attitudes and predicting social behavior
- nutrition recommendations and interventions for diabetes: a position statement of the american diabetes association
- self-efficacy: toward a unifying theory of behavioral change
- game-based curriculum and transformational play: designing to meaningfully positioning person, content, and context
- an affordance lens for wearable information systems
- is it a tool or a toy? how user's conception of a system's purpose affects their experience and use
- a framework for developing a domain specific business intelligence maturity model: application to healthcare
- guidelines for designing and using collaborative-competitive serious games
- insights into personal ict use: understanding continuance and discontinuance of wearable self-tracking devices
- exploring the factors that support adoption and sustained use of health and fitness wearables
- the rise and fall of wearable fitness trackers
- user acceptance of computer technology: a comparison of two theoretical models
- the delone and mclean model of information systems success: a ten-year update
- from game design elements to gamefulness: defining gamification
- persuasive technology in the real world: a study of long-term use of activity sensing devices for fitness
- games, motivation, and learning: a research and practice model
- global mobile statistics 2014 part a: mobile subscribers; handset market share; mobile operators
- why do people play games? a meta-analysis
- challenging games help students learn: an empirical study on engagement, flow and immersion in game-based learning
- data management and analysis methods
- examining diabetes management apps recommended from a google search: content analysis
- consumers' adoption of wearable technologies: literature review, synthesis, and future research agenda
- critical experiences during the implementation of a self-tracking technology
- digital game-based learning: towards an experiential gaming model (the internet and higher education)
- the rise of motivational information systems: a review of gamification research
- the success of google search, the failure of google health and the future of google plus
- grand successes and failures in it, public and private sectors: ifip wg 8.6 international working conference on transfer and diffusion of it (tdit)
- how habit limits the predictive power of intention: the case of information systems continuance
- personal data contexts, data sense, and self-tracking cycling
- the ipad in your hand: as fast as a supercomputer of yore
- serious games continuum: between games for purpose and experiential environments for purpose
- serious games: games that educate, train, and inform
- gamification and serious games for personalized health
- a set of principles for conducting critical research in information systems
- wearable technology: what explains continuance intention in smartwatches?
- a design science research methodology for information systems research
- how to evaluate educational games: a systematic literature review
- shackled to the status quo: the inhibiting effects of incumbent system habit, switching costs, and inertia on new system acceptance
- lessons for guidelines from the diffusion of innovations
- diffusion of innovations
- action design research
- use and adoption challenges of wearable activity trackers
- zombies and ethical theories: exploring transformational play as a framework for teaching with videogames (learning, culture and social interaction)
- electronic prescription system: do the professionals use it?
- the adoption of wearables for a healthy lifestyle: can gamification help?
- user acceptance of hedonic information systems
- consumer acceptance and use of information technology: extending the unified theory of acceptance and use of technology
- user acceptance of information technology: toward a unified view
- using grounded theory as a method for rigorously reviewing literature
- case study research
- factors affecting individuals to adopt mobile banking: empirical evidence from the utaut model
key: cord-229937-fy90oebs authors: amaro, j. e.; dudouet, j.; orce, j. n. title: global analysis of the covid-19 pandemic using simple epidemiological models date: 2020-05-14 journal: nan doi: nan sha: doc_id: 229937 cord_uid: fy90oebs
several analytical models have been used in this work to describe the evolution of death cases arising from coronavirus (covid-19). the death or 'd' model is a simplified version of the sir (susceptible-infected-recovered) model, which assumes no recovery over time and allows the transmission-dynamics equations to be solved analytically. the d model can be extended to describe various focuses of infection, which may account for the original pandemic (d1), the lockdown (d2) and other effects (dn). the evolution of the covid-19 pandemic in several countries (china, spain, italy, france, uk, iran, usa and germany) shows a similar behavior in accord with the d-model trend, characterized by a rapid increase of death cases followed by a slow decline, which are affected by the earliness and efficiency of the lockdown effect. these results are in agreement with more accurate calculations using the extended sir model with a parametrized solution and more sophisticated monte carlo grid simulations, which predict similar trends and indicate a common evolution of the pandemic with universal parameters.
the sir (susceptible-infected-recovered) model is widely used as a first-order approximation to the viral spreading of contagious epidemics [1] , mass immunization planning [2, 3] , marketing, informatics and social networks [4] . its cornerstone is the so-called "mass-action" principle introduced by hamer, which assumes that the course of an epidemic depends on the rate of contact between susceptible and infected individuals [5] . this idea was extended to a continuous-time framework by ross in his pioneering work on malaria transmission dynamics [6] [7] [8] , and finally put into its classic mathematical form by kermack and mckendrick [9] . the sir model was further developed by kendall, who provided a spatial generalization of the kermack and mckendrick model in a closed population [10] (i.e. neglecting the effects of spatial migration), and bartlett, who - after investigating the connection between the periodicity of measles epidemics and community size - predicted a traveling wave of infection moving out from the initial source of infection [11, 12] . more recent implementations have considered the typical incubation period of the disease and the spatial migration of the population. the pandemic has ignited the submission of multiple manuscripts in recent weeks. most statistical distributions used to estimate disease occurrence are of the binomial, poisson, gaussian, fermi or exponential types. despite their intrinsic differences, these distributions generally lead to similar results, assuming independence and homogeneity of disease risks [13] . in this work, we propose a simple and easy-to-use epidemiological model - the death or d model [14] - that can be compared with data in order to investigate the evolution of the infection and deviations from the predicted trends. the d model is a simplified version of the sir model with analytical solutions under the assumption of no recovery - at least during the time of the pandemic.
we apply it globally to countries where the covid-19 coronavirus has spread widely and caused thousands of deaths [15, 16] . additionally, d-model calculations are benchmarked against more sophisticated and reliable calculations using the extended sir (esir) and monte carlo planck (mcp) models - also developed in this work - which provide similar results, but allow for a more coherent spatial-time disentanglement of the various effects present during a pandemic. a similar esir model has recently been proposed by squillante and collaborators for infected individuals as a function of time, based on the ising model - which describes ferromagnetism in statistical mechanics - and a fermi-dirac distribution [17] . this model also reproduces a posteriori the covid-19 data for infestations in china as well as other pandemics such as ebola, sars, and influenza a/h1n1. the sir model considers the three possible states of the members of a closed population affected by a contagious disease. it is, therefore, characterized by a system of three coupled non-linear ordinary differential equations [18] , which involve three time-dependent functions:
• susceptible individuals, s(t), at risk of becoming infected by the disease;
• infected individuals, i(t);
• recovered or removed individuals, r(t), who were infected and have either developed immunity or died.
the sir model describes well a viral disease, where individuals typically go from the susceptible class s to the infected class i, and finally to the removed class r. recovered individuals cannot go back to the susceptible or infected classes, as is, potentially, the case for bacterial infections. the resulting transmission-dynamics system for a closed population is described by
ds/dt = −λ s i, (1)
di/dt = λ s i − β i, (2)
dr/dt = β i, (3)
s + i + r = n, (4)
where λ > 0 is the transmission or spreading rate, β > 0 is the removal rate and n is the fixed population size, which implies that the model neglects the effects of spatial migration.
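the closed-population sir system described above (written here in its standard kermack-mckendrick form, with transmission rate λ and removal rate β as defined in the text) can be integrated numerically in a few lines. the sketch below uses a simple forward-euler scheme; the values of λ, β and n are purely illustrative, not fitted values from this work:

```python
import math

def sir_step(s, i, r, lam, beta, dt):
    """one forward-euler step of ds/dt = -lam*s*i, di/dt = lam*s*i - beta*i,
    dr/dt = beta*i; the three increments sum to zero (closed population)."""
    ds = -lam * s * i
    di = lam * s * i - beta * i
    dr = beta * i
    return s + ds * dt, i + di * dt, r + dr * dt

def simulate_sir(n=1_000_000, i0=10.0, lam=3e-7, beta=0.05,
                 days=200, steps_per_day=50):
    # illustrative rates: lam*n = 0.3/day and beta = 0.05/day
    s, i, r = n - i0, i0, 0.0
    dt = 1.0 / steps_per_day
    peak_i = i0
    for _ in range(days * steps_per_day):
        s, i, r = sir_step(s, i, r, lam, beta, dt)
        peak_i = max(peak_i, i)
    return s, i, r, peak_i
```

since s + i + r is conserved by construction, the total population provides a quick sanity check of the integration.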
currently, there is no vaccination available for covid-19, and the only way to reduce the transmission or infection rate λ - which is often referred to as "flattening the curve" - is by implementing strong social distancing and hygiene measures. the system is reduced to a first-order differential equation, which does not possess an explicit solution, but can be solved numerically. the sir model can then be parametrized using actual infection data to solve i(t), in order to investigate the evolution of the disease. in the d model, we make the drastic assumption of no recovery in order to obtain an analytical formula that describes - instead of infestations - the death evolution by covid-19. this can be useful as a fast method to foresee the global behavior as a first approach, before applying more sophisticated methods. we shall see that the resulting d model describes well enough the data of the current pandemic in different countries. the main assumption of the d model is the absence of recovery from coronavirus, i.e. r(t) = 0, at least during the pandemic time interval. this assumption may be reasonable if the spreading time of the pandemic is much faster than the recovery time, i.e. λ ≫ β. the sir equations are then reduced to the single equation of the well-known si model, which represents the simplest mathematical form of all disease models, where the infection rate is proportional to both the infected individuals i and the susceptible individuals n − i,
di/dt = λ i (n − i). (5)
equation 5 is trivially solved by multiplying by dt and dividing by (n − i)i; integrating between an initial time t = 0 and a final time t we obtain
ln[i/(n − i)] − ln[i0/(n − i0)] = λ n t,
where i0 = i(0). taking the exponential on both sides and solving the resulting algebraic equation, we obtain the solution i(t), which can be written in the form
i(t) = n / (1 + (1/c) e^{−t/b}),
where we have defined the constants
b = 1/(λ n), c = i0/(n − i0).
the parameter b is the characteristic evolution time of the initial exponential increase of the pandemic. the constant c is the initial infestation rate with respect to the total population n.
assuming c ≪ 1, eq. 11 yields c ≃ i0/n. in order to predict the number of deaths in the d model we assume that the number of deaths at some time t is proportional to the infestation at some former time τ, that is,
d(t) = μ i(t − τ), (14)
where μ is the death rate and τ is the death time. with this assumption we can finally write the d-model equation as
d(t) = a / (c + e^{−t/b}), (15)
where a = μ i0 e^{−τ/b}, c = c e^{−τ/b} (relabelling the constant), and a/c yields the total number of deaths predicted by the model. this is the final equation of the d model, which presents a similar shape to the well-known woods-saxon potential for the nucleons inside the atomic nucleus, or to the bacterial growth curve. the remaining parameters μ, τ, i0 and n are embedded in the parameters a, b, c, which represent space-time averages and can be fitted to the timely available data. in fig. 1 , we present the fit of the d model to the covid-19 death data for china, where its evolution has apparently been controlled and the d function has reached the plateau zone, with few increments over time, or fluctuations that are beyond the model assumptions. this plot shows the duration of the pandemic - about two months to reach the top end of the curve - and the agreement, despite the crude assumptions, between the data and the evolution trend described by the d model. this agreement encourages the application of the d model to other countries in order to investigate the different trends. in order to get insight into the stability and uncertainty of our predictions, fig. 2 shows the evolution of a, b, and c and other model predictions from fits to the daily data in spain. the meaning of these quantities is explained below:
• the parameter a is the theoretical number of deaths on the day corresponding to t = 0. in general, it differs from the experimental value and can be interpreted as the expected number of deaths that day. note that experimental data may be subject to unknown systematic errors and different counting methods.
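the d-model expression and the quantities derived from it are straightforward to code. from the asymptotic total a/c and the peak position −b ln c given in the text, the cumulative curve takes the logistic form d(t) = a/(c + e^{−t/b}); the parameter values below are illustrative, not fits to any country:

```python
import math

def d_model(t, a, b, c):
    """cumulative deaths in the d model: d(t) = a / (c + exp(-t/b))."""
    return a / (c + math.exp(-t / b))

a, b, c = 200.0, 6.0, 0.01                 # illustrative parameters
total_deaths = a / c                        # asymptotic value d(t -> infinity)
t_peak = -b * math.log(c)                   # day of the deaths/day maximum
half_at_peak = d_model(t_peak, a, b, c)     # equals (a/c)/2
```

on the peak day t_max = −b ln c the cumulative count equals half of the asymptotic total, which provides a quick consistency check when fitting.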
• the parameter b, as mentioned above, is the characteristic evolution time. during the initial exponential behavior, d(t) grows as e^{t/b}, so b is the e-folding time and the number of deaths doubles every b ln 2 days. moreover, 1/b is proportional to the slope of the almost linear behavior in the mid region of the d function. that behavior can be obtained by a taylor expansion around t0 = −b ln c and is given by
d(t) ≃ (a/2c) [1 + (t − t0)/(2b)].
• the parameter c is called the inverse dead factor because d(t → ∞) = a/c provides the asymptotic or expected total number of deaths.
figure 2 shows the stable trend of the parameters between days 19 to 24 (corresponding to march 27-30), right before the peak of death cases, which occurred in spain around april 1. such stability validates the d-model predictions during this time. however, a rapid change of the parameters is observed, especially for a, once the peak is reached, drastically changing the prediction of the total number of deaths given by a/c. this sudden change results in the slowing down of deaths per day and longer time predictions t95 and t99. the parameters of the d model correspond to average values over time of the interaction coefficients between individuals, i.e. they are sensitive to additional external effects on the pandemic evolution. these may include the lockdown effect imposed in spain on march 14 and other effects such as new sources of infection or a sudden increase of the total susceptible individuals due to social migration and large mass gatherings [20] . it is not possible to identify a specific cause because its effects are blurred by the stochastic evolution of the pandemic, which is why any reliable forecast presents large errors. one can also determine deaths/day rates by applying the first derivative to eq. 15, which allows for a determination of the pandemic's peak and evolution after its turning point.
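the times t95 and t99 mentioned above (the days on which the cumulative deaths reach 95% and 99% of the expected total a/c) follow directly from inverting the logistic form of the d model; a short sketch with illustrative parameter values:

```python
import math

def t_fraction(q, b, c):
    """day on which d(t) = q * (a/c), obtained by inverting
    d(t) = a/(c + exp(-t/b)):  t_q = b * ln( q / (c * (1 - q)) )."""
    return b * math.log(q / (c * (1.0 - q)))

b, c = 6.0, 0.01                # illustrative parameters
t95 = t_fraction(0.95, b, c)
t99 = t_fraction(0.99, b, c)
t_peak = -b * math.log(c)       # for comparison: the deaths/day maximum
```

note that the parameter a cancels in the fraction d(t)/(a/c), so these times depend only on b and c.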
the d model describes well the cumulative deaths because the sum of discrete data reduces the fluctuations, in the same way as the integral of a discontinuous function is a continuous function. however, the daily data required for d′ have large fluctuations - both statistical and systematic - which normally give a slightly different set of parameters when compared with the d model. using the d model fitted to cumulative deaths allows deaths/day to be computed either analytically,
d′(t) = (a/b) e^{−t/b} / (c + e^{−t/b})², (18)
or as a finite difference,
[d(t) − d(t − ∆t)] / ∆t, (19)
where ∆t = 1 day. figure 3 shows that eqs. 18 and 19 yield similar parameters, as the time increment is small enough compared with the time evolution of the d(t) function. hence, the first derivative d′(t) can be used to describe deaths per day. in addition, fig. 4 shows that the parameters may be different for the d and d′ functions using cumulative and daily deaths, respectively, as shown for spain on april 5. it is also important to note that b is directly proportional to the full width at half maximum (fwhm) of the d′(t) distribution,
fwhm = 2 ln(3 + 2√2) b ≃ 3.53 b.
the b parameter presents typical values between 4 and 10 for most countries undergoing the initial exponential phase, which yields a minimum and maximum fwhm of 14 and 35 days, respectively.
c. dn model with two or more channels of infection
some models [21] include changes in the transmission rate due to various interventions implemented to contain the outbreak. the simple d model does not allow this explicitly, but changes in the spread can be taken into account by considering the total d or dn function as the sum of two or more independent d functions with different parameters, which may reveal the existence of several independent sources, or virus channels. an example is shown in fig. 5 , where the two-channel function has been fitted with six parameters to the spanish data up to april 13.
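the proportionality between b and the fwhm of the deaths/day curve quoted above can be checked numerically; the analytic value for the logistic d model, obtained by solving d′(t) = d′max/2, is fwhm = 2 ln(3 + 2√2) b ≈ 3.53 b (parameters below are illustrative):

```python
import math

def d_prime(t, a, b, c):
    """deaths per day: the analytic derivative of d(t) = a/(c + exp(-t/b))."""
    x = math.exp(-t / b)
    return (a / b) * x / (c + x) ** 2

a, b, c = 200.0, 6.0, 0.01
t0 = -b * math.log(c)                 # peak position
peak = d_prime(t0, a, b, c)           # analytic maximum: a/(4*b*c)

# locate the half-maximum crossings on a fine grid around the peak
grid = [t0 + 0.001 * k for k in range(-20000, 20001)]
above = [t for t in grid if d_prime(t, a, b, c) >= peak / 2]
fwhm_numeric = above[-1] - above[0]
fwhm_analytic = 2.0 * math.log(3.0 + 2.0 * math.sqrt(2.0)) * b
```

for b between 4 and 10 this relation reproduces the quoted 14-35 day range for the width of the daily-deaths curve.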
the fit reveals a second, smaller death peak, which substantially increases the number of deaths per day and the duration of the pandemic. this is equivalent to adding a second, independent source of infection several weeks after the initial pandemic. the second peak may as well represent a second pandemic phase reflecting the effects of quarantine during the descending part of the curve. additionally, the cumulative d function can also be computed with a two-channel function, which provides, as shown in fig. 6 , a more accurate prediction for the total number of deaths and clearly illustrates the separate effect of both source peaks. it is interesting to note that for large t, a ≈ a2, c ≈ c2 and b2 ≈ 2b. in such a case, the total number of deaths expected during the pandemic is given by d2(∞) = 2a/c. the d model can also be used to estimate i(t) using the initial values i0 = i(0) and the total number of susceptible people n = s(0). the initial value of n is unknown, and not necessarily equal to the population of the whole country, since the pandemic started in localized areas. here, we shall assume n = 10^6, although plausible values of n can be tens of millions. note that the no-recovery assumption of the d model is unrealistic, and this calculation only provides an estimate of the number of individuals that were infected at some time, independently of whether they recovered or not. from the definition of d(t) in eq. 14, the following relations between the parameters of the model can be extracted:
a = μ i0 e^{−τ/b}, c = [i0/(n − i0)] e^{−τ/b}, b = 1/(λ n).
solving the first two equations for μ and i0 we obtain
μ = a/[c (n − i0)] ≃ a/(c n), i0 = n c e^{τ/b} / (1 + c e^{τ/b}).
hence, μ can be computed by knowing n; however, to obtain i0 one needs to know the death time τ. this has been estimated to be about 15 to 20 days for covid-19 cases, which can be used to compute two estimates of i(t). these are given in fig. 7 for the case of spain. since there is no recovery in the d model, the total number of infected people is i ∼ n for large t, i.e. n = 10^6 in our case. in fig.
7 we also show the ratio d/i, which also depends on n and τ. for n = 10^6, the ratio d/i increases similarly to the separate functions d and i between the initial and final values; these results depend on the total susceptible population n. however, the ratio of infected with respect to susceptibles, i/n, is independent of n. this function depends only on τ and is shown in the bottom panel of fig. 8 for τ = 15 and 20 days, which reveals the rapid spread of the pandemic. accordingly, between 10% and 30% of the susceptibles were infected on march 7, and one month later (april 6), when the fit was made, all susceptibles had been infected. this does not mean that the full population of the country got infected, since the number n is unknown and, for instance, excludes individuals in isolated regions, and it may additionally change because of spatial migration, not considered in the model. d-model predictions can be compared with more realistic results given by the complete sir model [9, 11] , which is characterized by eqs. 1, 2, 3 and 4 with initial conditions r(0) = 0, i(0) = i0, s(0) = n − i0. the sir system of dynamical equations can be reduced to a non-linear differential equation. working with the fractions s, i, r of the total population n, and first dividing eq. 1 by eq. 3, one obtains
ds/dr = −(λn/β) s,
which yields the following exponential relation between the susceptible and the removed functions,
s = s0 e^{−λn r/β}.
moreover, eq. 4 provides a relation between the infected and the removed functions, i = 1 − r − s, which yields, by inserting into eq. 3, the final sir differential equation
dr/dt = β (1 − r − s0 e^{−λn r/β}). (38)
in order to obtain r(t) we only need to solve this first-order differential equation with the initial condition r(0) = 0, so that s + i + r = 1; the resulting r(t) can be obtained numerically, or by approximate methods in some cases. in ref. [9] , a solution was found for small values of the exponent λn r/β. for the coronavirus pandemic, however, this number is expected to increase and be close to one at the pandemic end. at this point, we propose a modification of the standard sir model. instead of solving eq.
38 numerically and fitting the parameters to data, the solution can be parametrized as
r(t) = a / (c + e^{−t/b}),
which presents the same functional form as the d model and, conveniently, provides a faster way to fit the model parameters by avoiding the numerical problem of solving eq. 38. in fact, numerical solutions of the sir model present a similar step function for r(t). additionally, one can assume that d(t) is proportional to r(t) and can also be written in the same form, where a2, c2 = s0 and b2 = β/(λn) are unknown parameters to be fitted to deaths-per-day data, together with the three parameters of the r(t) function: a, b, c. figure 9 shows fits of the esir model to daily deaths in spain during the coronavirus spread. fitting without any boundary condition on the number of deaths (left panel) does not yield an exact solution of the sir differential equation. a way to solve this problem is to impose the condition d′(∞) = 0, as the number of deaths must stop at some time. numerically, it is enough to choose a small value of d′(t) for an arbitrarily large t. the middle and right panels of fig. 9 show different boundary conditions, d′(100) = 10 and d′(100) = 5, respectively, which yield the same results and the expected behavior for a viral disease spreading and declining. it is also consistently observed (e.g. see the middle and right panels of fig. 9 ) that at large t, r(t) → a/c ≈ 1, which essentially means that most of the susceptible population n recovers, as we previously inferred from the d model. this, together with the fact that c2 can always be adjusted to 1, leaves the esir model with essentially 4 free parameters to fit to the daily death data, i.e. the same number of parameters as the original sir model. as shown in fig. 10 , esir fits reproduce well the long flattening behavior observed in uk, usa, germany or iran, whereas they fail to reproduce the more pronounced double-peak structure typically observed in countries like france, italy, spain or belgium.
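the first-order equation for the removed fraction r(t) can be integrated in a few lines. the sketch below assumes the population-fraction form dr/dt = β(1 − r − s0 e^{−λn r/β}), consistent with the exponent λn r/β quoted in the text, and uses an illustrative ratio λn/β = 3:

```python
import math

def solve_r(lam_n=0.3, beta=0.1, i0_frac=1e-5, days=400, dt=0.01):
    """euler integration of dr/dt = beta*(1 - r - s0*exp(-lam_n*r/beta)),
    with s0 = 1 - i0_frac and r(0) = 0 (all quantities are fractions of n)."""
    s0 = 1.0 - i0_frac
    r = 0.0
    for _ in range(int(days / dt)):
        r += beta * (1.0 - r - s0 * math.exp(-lam_n * r / beta)) * dt
    return r, s0

r_inf, s0 = solve_r()
# at the end of the outbreak r approaches the fixed point of the equation,
# i.e. the final-size relation 1 - r = s0*exp(-lam_n*r/beta)
residual = (1.0 - r_inf) - s0 * math.exp(-0.3 * r_inf / 0.1)
```

the numerical solution indeed shows the step-like shape mentioned in the text, which motivates the logistic parametrization of r(t).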
as previously done with the d model, one can also expand the esir model to take lockdown effects into account and accommodate this apparent failure. the esir2 model is thus proposed, where we have assumed that a = a′ and c = 2a to accommodate r(∞) → 1 and c2 = 1. hence, we are left with five free parameters. finally, fig. 11 shows the comparison between the esir2 and d′2 fits to real data for countries where covid-19 has widely spread: belgium, usa, france, germany, iran, italy, spain and uk. death data are taken from refs. [19, 22, 23] and consider 7-day average smoothing to correct for anomalies in data collection, such as the typical weekend staggering observed in various countries, where weekend data are counted at the beginning of the next week. real error intervals are extracted from the correlation matrix. as discussed in section 2.3, the reduced d′2 model has been used with a = a2 and c = c2. although arising from different assumptions, both models provide similar data descriptions and predictions, with slightly better values of χ² per degree of freedom for the esir2 model. it is also interesting to note that the reduced esir2 model with five parameters yields similar results to the full esir2 model with eight parameters. as data become available, daily predictions vary for both the esir2 and d′2 models. this is because the model parameters are actually statistical averages over space-time of the properties of the complex system. no model is able to predict changes over time of these properties if the physical causes of these changes are not included. the values of the model parameters are only well defined when the disease spread is coming to an end and time changes in the parameters have little influence. more sophisticated calculations can be compared with the esir2 and d′2 predictions.
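the two-channel d2 construction used in these comparisons is just the sum of two d functions with independent parameters; a minimal sketch (the parameters below are chosen for illustration, not the six-parameter fit to the spanish data):

```python
import math

def d_model(t, a, b, c):
    return a / (c + math.exp(-t / b))

def d2_model(t, p1, p2):
    """two-channel d model: the sum of two independent d functions,
    e.g. an initial outbreak plus a delayed, smaller second source."""
    return d_model(t, *p1) + d_model(t, *p2)

p1 = (150.0, 5.0, 0.02)    # first channel: earlier and narrower (a, b, c)
p2 = (80.0, 10.0, 0.001)   # second channel: later and wider

total = p1[0] / p1[2] + p2[0] / p2[2]   # d2(infinity) = a1/c1 + a2/c2
peak1 = -p1[1] * math.log(p1[2])        # peak day of channel 1
peak2 = -p2[1] * math.log(p2[2])        # peak day of channel 2
```

the separation peak2 − peak1 between the two maxima is the quantity interpreted in the text as the delay between the effective pandemic onset and the lockdown effect.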
in particular, monte carlo (mc) simulations have also been performed in this work for the spanish case [24], which consist of a lattice of cells that can be in four different states: susceptible, infected, recovered or dead. an infected cell can transmit the disease to any other susceptible cell within some random range r. the transmission mechanism follows principles of nuclear physics for the interaction of a particle with a target. each infected particle interacts a number n of times over the interaction region, according to its energy. the number of interactions is proportional to the interaction cross section σ and to the target surface density ρ. the discrete energy follows a planck distribution law depending on the 'temperature' of the system. for any interaction, an infection probability is applied. finally, time-dependent recovery and death probabilities are also applied. the resulting virus spread for different sets of parameters can be adjusted from covid-19 pandemic data. in addition, parameters can be made time dependent in order to investigate, for instance, the effect of an early lockdown or of large mass gatherings at the rise of the pandemic. as shown in fig. 12, our mc simulations present results similar to the d′ 2 model, which validates the use of the simple d-model as a first-order approximation. more details on the mc simulation will be presented in a separate manuscript [24]. interestingly, mc simulations follow the data trend up to may 11 without any changes in the parameters for nearly two weeks. an app for android devices, where the monte carlo planck model has been implemented to visualize the simulation, is available from ref. [25]. in order to investigate the universality of the pandemic, it is interesting to compare all countries by plotting the d model in terms of the variable (t − t 0 )/b, where t 0 is the maximum of the daily curve given by t max = −b ln(c). by shifting eq.
15 by t max = −b ln(c) and dividing by the asymptotic value a/c, the normalized d function is given by, the left of fig. 13 shows similar trends for the normalized d curves of different countries, which suggests a universal behavior of the covid-19 pandemic. only iran seems to deviate slightly from the global trend, which may indicate an early and more effective initial lockdown. a similar approach can be applied to the daily data using the peaks. the global models considered in this work present some differences with respect to other existing models. first, in this work we have tried to keep the models as simple as possible. this allows us to use theoretically-inspired analytical expressions or semi-empirical formulae to perform the data analysis. the use of semi-empirical expressions for describing physical phenomena is recurrent in physics. one of the most famous is the semi-empirical mass formula from nuclear physics. of course the free parameters need to be fitted from known data, but this allowed predictions to be obtained for unknown elements. in our case we were inspired by the well-known statistical sir-kind models, slightly modified to obtain analytical expressions that carry the leading time dependence. we have found that the d and d 2 models allow a fast and efficient analysis of the pandemic in its initial and advanced stages. our results show that the time dependence of the pandemic parameters due to the lockdown can be effectively simulated by the sum of two d-functions with different widths and heights and centered at different times. the distance between the maxima of the two d-functions should be a measure of the time between the effective pandemic beginning and the lockdown. in the spanish case this is about 20 days. taking into account that the lockdown started on march 14, this marks the pandemic starting time as about february 22. had the lockdown started on that date, the deaths would have been greatly reduced.
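the collapse onto a universal curve can be verified numerically. with the same assumed sigmoid form d(t) = a/(c + e^(−t/b)), shifting time by t_max = −b ln(c), rescaling by b and normalizing the daily curve by its peak value removes all three parameters, so curves for any (a, b, c) coincide:

```python
import numpy as np

def d_daily(t, a, b, c):          # daily curve of the assumed sigmoid d(t)
    e = np.exp(-t / b)
    return (a / b) * e / (c + e) ** 2

x = np.linspace(-3, 5, 200)       # normalized time (t - t0)/b
curves = []
for a, b, c in [(1.0e4, 5.0, 0.02), (5.0e4, 9.0, 0.005)]:
    t0 = -b * np.log(c)           # peak position t_max = -b ln(c)
    t = t0 + b * x                # sample at the same normalized times
    y = d_daily(t, a, b, c) / d_daily(t0, a, b, c)
    curves.append(y)

# both normalized curves reduce to y(x) = 4 e^(-x) / (1 + e^(-x))^2,
# independent of (a, b, c) -- the universal shape suggested in the text
```

this parameter-free shape is what the comparison of normalized country curves in fig. 13 probes.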
the smooth blending between the two peaks provides a transition between the two statistical regimes (or physical phases) with and without lockdown. the monte carlo simulation results are in agreement with our previous analysis with the d and d 2 models. the monte carlo generates events in a population of individuals in a lattice or grid of cells. we simulate the movement of individuals outside of the cells and interactions with the susceptible individuals within a finite range. the random events follow statistical distributions based on the exponential laws of statistical mechanics for a system of interacting particles, driven by macroscopic magnitudes such as the temperature, and interaction probabilities between individuals, which can be related to interaction cross sections. the monte carlo simulation spreads the virus in space-time, and also allows space-time dependence of the parameters. in this work we have made the simplest assumptions, only allowing for a lockdown effect by reducing the range of the interaction starting on a fixed day. this simple modification allowed us to reproduce the spanish deaths-per-day curve nicely. the lockdown produces a relatively long broadening of the curve and a slow decay. similar mc calculations can be performed for several countries to infer the devastating effect of a late lockdown as compared with early lockdown measures. the latter is the case of south africa and other countries, which have not reached the exponential growth phase. the death and extended sir models are simple enough to provide fast estimations of pandemic evolution by fitting space-time average parameters, and present a good first-order approximation to understand secondary effects during the pandemic, such as lockdowns and population migrations, which may help to control the disease. similar models are available [17, 26], but challenges in epidemiological modeling remain [27] [28] [29] [30].
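a toy version of the lattice monte carlo described above can be sketched as follows. the grid size, interaction range and all probabilities are illustrative assumptions, and the planck-distributed interaction energies of the full model are replaced here by a fixed number of contacts per infected cell:

```python
import random

# cell states: susceptible (0), infected (1), recovered (2), dead (3);
# all numerical values below are illustrative, not the paper's parameters
random.seed(0)
SIZE, RANGE, N_CONTACTS = 50, 3, 4
P_INFECT, P_RECOVER, P_DIE = 0.3, 0.1, 0.01

grid = [[0] * SIZE for _ in range(SIZE)]
grid[SIZE // 2][SIZE // 2] = 1          # one initial infected cell

def step(grid):
    new = [row[:] for row in grid]
    for x in range(SIZE):
        for y in range(SIZE):
            if grid[x][y] != 1:
                continue
            for _ in range(N_CONTACTS):  # contacts within a finite range
                dx = random.randint(-RANGE, RANGE)
                dy = random.randint(-RANGE, RANGE)
                tx, ty = (x + dx) % SIZE, (y + dy) % SIZE
                if grid[tx][ty] == 0 and random.random() < P_INFECT:
                    new[tx][ty] = 1
            u = random.random()          # recovery / death per time step
            if u < P_DIE:
                new[x][y] = 3
            elif u < P_DIE + P_RECOVER:
                new[x][y] = 2
    return new

for _ in range(30):
    grid = step(grid)
```

a lockdown can be mimicked, as in the text, by shrinking `RANGE` from a fixed day onward, which broadens and slows the deaths-per-day curve.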
this is a very complex system, which involves many degrees of freedom and millions of people, and even assuming consistent disease reporting - which is rarely the case - an important open question remains: can any model predict the evolution of an epidemic from partial data? or, similarly, is it possible, at any given time and with the data available, to measure the validity of an epidemic growth curve? we finally hope that we have added new insightful ideas with the death, extended sir and monte carlo models, which can now be applied to any country that has followed the initial exponential pandemic growth. discussion: the kermack-mckendrick epidemic threshold theorem stability analysis of sir model with vaccination seasonality and the effectiveness of mass vaccination application of sir epidemiological model: new trends epidemic disease in england -the evidence of variability and of persistency of type report on the prevention of malaria in mauritius an application of the theory of probabilities to the study of a priori pathometry. -part i an application of the theory of probabilities to the study of a priori pathometry.-part iii a contribution to the mathematical theory of epidemics discussion of measles periodicity and community size by measles periodicity and community size deterministic and stochastic models for recurrent epidemics basic models for disease occurrence in epidemiology the d model for deaths by covid-19 the continuing 2019-ncov epidemic threat of novel coronaviruses to global health the latest 2019 novel coronavirus outbreak in wuhan, china who.
coronavirus disease 2019 attacking the covid-19 with the ising-model and the fermi-dirac distribution function the sir model and the foundations of public health impact of non-pharmaceutical interventions against covid-19 in europe: a quasi-experimental study inferring covid-19 spreading rates and potential change points for case number forecasts special issue on challenges in modelling infectious disease dynamics modeling infectious disease dynamics in the complex landscape of global health mathematical epidemiology: past, present, and future true epidemic growth construction through harmonic analysis the authors thank useful comments from emmanuel clément, araceli lopez-martens, david jenkins, ramon wyss, liam gaffney and hans fynbo. this work was supported by the spanish ministerio de economía y competitividad and european feder funds (grant fis2017-85053-c2-1-p), junta de andalucía (grant fqm-225) and the south african national research foundation (nrf) under grant 93500. key: cord-289496-d8ac6l6o authors: chen, min; li, miao; hao, yixue; gharavi, hamid; liu, zhongchuan; hu, long; wang, lin title: the introduction of population migration to seiar for covid-19 epidemic modelling with an efficient intervention strategy date: 2020-08-06 journal: inf fusion doi: 10.1016/j.inffus.2020.08.002 sha: doc_id: 289496 cord_uid: d8ac6l6o in this paper, we present a mathematical model of an infectious disease according to the characteristics of the covid-19 pandemic. the proposed enhanced model, which will be referred to as the seiar (susceptible-exposed-infectious-asymptomatic-recovered) model with population migration, is motivated by the crucial role that asymptomatic infected individuals, as well as population movements, can play in spreading the virus. in the model, the number of infected individuals and the basic reproduction number are compared under the influence of different intervention policies.
the experimental simulation results show the impact of social distancing and migration-in rates on reducing the total number of infections and the basic reproduction number. the importance of controlling the number of people migrating in, and of policies restricting residents' movements, in preventing the spread of the covid-19 pandemic is then verified. the outbreak of novel coronavirus (covid-19) pneumonia continues to increase worldwide, causing a large number of casualties and global economic crises. the outbreak has put many countries into an emergency response phase. at present, research on developing a vaccine for the coronavirus is still in its preliminary stages and there are no specific drugs for efficient treatment of infected patients. the treatments that are available can only assist a patient's own immune response. under these conditions and based on the characteristics of the covid-19 pandemic, it is vital to develop a new mathematical model according to the characteristics of covid-19. with such a model, it would be possible to analyze the pathology and dynamic transmission of infectious diseases, and also find an optimal decision for preventing the spread of the virus. however, well-established models, such as the susceptible-exposed-infectious-recovered (seir) model [1], cannot be considered for covid-19 due to the unique characteristics of the virus. more importantly, the spread of covid-19 is not limited to symptomatic individuals, but extends to those who do not show any symptoms (i.e., asymptomatic individuals). more specifically, unlike previous seir models, a new model has to take into consideration that covid-19 exposure consists of two types: symptomatic and asymptomatic. by incorporating both types in a new model, it would then be possible to analyze the pathology and dynamic transmission of infectious diseases and then find an optimal solution to prevent the spread of the virus.
thus, in this paper, we incorporate asymptomatic infections and population migration into seir and introduce a new model, which will be referred to as the seiar (susceptible-exposed-infectious-asymptomatic-recovered) model with population migration. based on the proposed seiar model with the unique epidemiological characteristics of covid-19, we evaluate differences in the asymptotic stability of the model with and without intervention. we then present our experimental results, which include verifying the effectiveness of interventions (e.g., social distancing and immigration control) in mitigating the spread of covid-19. in summary, the main contributions of this paper include: • by establishing a mathematical model of an infectious disease based on the characteristics of covid-19 (such as the effect of asymptomatic infected patients in spreading the virus), the applicability of the modified model is proved. • by calculating the curve of the number of infections based on the euler numerical method, we analyze and compare the effects of different anti-epidemic policies on reducing the number of infections. • by calculating the basic reproduction number of the model, we verify that reducing mobility rates, i.e., an immigration ban, can indeed slow down virus transmission. the rest of the paper is organized as follows. the related works are introduced in section 2. we develop a mathematical model to describe the transmission of covid-19 and calculate the basic reproduction number for this model in section 3. then, we use a simulation to calculate the impact of quarantine measures and immigration restrictions on the transmission of infectious diseases in section 4. we also discuss our future work in section 5. finally, section 6 concludes this paper. the traditional model of infectious diseases is based on the population classification of infectious diseases.
presently, well-established mathematical models have been widely used to project the propagation of various epidemics, such as susceptible-infectious-recovered (sir) and seir [2]. kermack et al. proposed an sir-based model to project the expected epidemic scenarios for the plague [3]. in the case of influenza, hethcote et al. introduced a so-called fourth population classification (i.e., the exposed person) [4]. this type of population is a potential infection population; it does not yet have the ability to infect others. the extended epidemic model is called the seir model [3]. currently, the seir model is applicable to influenza, aids, and other epidemics. for example, in [5], syafruddin et al. studied the differential equations of the transmission dynamics of dengue fever in selangor, malaysia. they proved that the seir model shows good agreement between theoretical and practical calculations. in terms of the global stability of the sir and seir models, abtae et al. applied a lyapunov function to determine the global asymptotic stability conditions of the disease-free and endemic equilibria and concluded that the stability of seir is better than that of sir [6]. moreover, li et al. proposed an infectious disease model where individuals in the exposed state and those in the recovered state are infectious [6]. however, according to the infectivity characteristics of covid-19, the seir model is not applicable for describing the actual infectious process of covid-19. currently, based on a simple mathematical model and limited data, zhong et al. [7] used the early epidemic data of wuhan to estimate the infection rate, mortality, and other parameters. these parameters were then fed into the sir model to predict the outbreak of covid-19 in china. the authors did not consider the difference between the infection rate and mortality rate among various cities. for instance, patients in the exposed period were omitted.
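for reference, the basic sir dynamics reviewed above can be sketched with a few lines of forward euler integration; the rates below are illustrative textbook values, not taken from any of the cited works:

```python
# minimal textbook sir sketch: ds/dt = -beta*s*i/n,
# di/dt = beta*s*i/n - gamma*i, dr/dt = gamma*i
# (illustrative rates: r0 = beta/gamma = 4)
beta, gamma, n = 0.4, 0.1, 1000.0
s, i, r = 999.0, 1.0, 0.0
dt = 0.1
for _ in range(int(200 / dt)):
    new_inf = beta * s * i / n           # incidence this instant
    s, i, r = (s - dt * new_inf,
               i + dt * (new_inf - gamma * i),
               r + dt * gamma * i)
```

the three flows sum to zero at every step, so s + i + r stays equal to the total population n, which is the defining property of compartmental models.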
they also ignored the impact of population migration. in [8], iwata et al. investigated the possibility of a potential outbreak caused by imported cases in a community outside china. moreover, the authors in this paper put different coefficients into the mathematical model and obtained sets of results which were not representative. however, there are many deficiencies in recent studies [9]. firstly, there were not enough data to know with certainty the origin, the rule of infection, and the mode of infection of covid-19, while it was still in the early stage of the outbreak. secondly, the migration of the population and the dynamic change of parameters have been ignored. thus, based on the above analysis, previous seir-based models cannot be directly applied to covid-19. this is due to the fact that while individuals in the exposed state of covid-19 are infectious, there is no evidence to prove that individuals in the recovered state are still infectious. so, we need to enhance the seir model by considering only the exposed state as being infectious and not the recovered state. in order to build such a model, we set up two infection rates: one is the infection rate of the exposed person and the other is the infection rate of the infected person. in addition, after establishing the model, we consider the impact of the exposed period of covid-19 on the spread of the epidemic. in the next section, we will describe the specific seiar model. in this paper, we propose a new epidemic model that takes asymptomatic cases and immigration into consideration. because of the change of contact and migration rates, we incorporate these two parameters as functions of time in our model. according to the known characteristics of covid-19, we can divide the regional population into the following five categories: susceptible group (s), exposed group (e), infected (symptomatic infection) group (i), asymptomatic infected group (a), or recovered group (r), as illustrated in fig. 1.
furthermore, according to the report of the world health organization, the general transmission process is as follows: • if susceptible persons contact asymptomatic or symptomatic infected people, their status changes to exposed. • most of the exposed people are transformed into infected or asymptomatic infected after the exposed period. it should be noted that the exposed period of covid-19 is 5.2 days. • infected and asymptomatic people are moved into convalescence after rehabilitation, because they are expected to develop a certain immunity within an unknown period of time. thus, we introduce the transformation relationship between these five groups as shown in fig. 1. here, we consider the existence of asymptomatic infected people and immigration. note that asymptomatic infected people will not die of the covid-19 infection, but infected people may. moreover, the population density of these categories is denoted by s(t), e(t), i(t), a(t) and r(t) at time t, respectively. let n(t) denote the total number of people at time t; then, n(t) is the sum of the five groups, and after the exposed period exposed individuals develop either symptomatic infection or asymptomatic infection relative to those who have symptoms. furthermore, the probability of a symptomatic patient recovering is δ 1 and the recovery probability of asymptomatic patients is δ 2 . we set the probability of mortality due to covid-19 to η and the natural mortality rate to θ. based on the above parameters, we propose the seiar model with population migration using the following ordinary differential equations (odes): where λ 1 denotes the migration-in rate, and λ 2 denotes the migration-out rate. the above parameters (α, ρ, χ, θ, β 1 , β 2 , δ 1 , δ 2 , η, λ 1 , λ 2 ) could be estimated by machine learning algorithms; in this paper, for the sake of simplicity, we use existing work to estimate them. furthermore, we utilize the euler numerical method to solve the odes (2). the basic reproduction number r 0 is an important indicator to describe the epidemic dynamics.
it refers to the expected number of cases directly generated by one case in a population where all individuals are susceptible to infection. in this paper, we use the next generation matrix [10], [11] to derive a formula for the control reproduction number when control measures are in force. to be specific, we first divide the population into five categories according to the state of infection. then, we denote by j i (x) the rate of new infections in category i. let x = (x 1 , x 2 , x 3 , x 4 , x 5 ) be the combination of crowd categories, where x i represents the number of people in each category, and let v i (x) be the rate at which other categories are converted to category i. according to the mathematical model of infectious diseases, we can obtain j(x) and v(x) as follows: and (4) furthermore, we define j and v as the jacobian matrices of the functions j(x) and v(x). then, we can obtain the jacobian matrices of (3) and (4) with respect to x, which can be expressed as follows: and where f and v are 3 × 3 matrices, with 1 ≤ i, j ≤ 3, and the disease-free equilibrium is x 0 = (0, s 0 , 0, 0, 0). x 0 is a locally asymptotically stable equilibrium solution of (2). it is clear that (0, 1, 0, 0, 0) is a disease-free equilibrium of (2) [12]. thus, we obtain the next generation matrix of covid-19 as g = f v −1 . therefore, the basic reproduction number r 0 of the seiar model can be obtained by calculating the spectral radius of g as follows: in epidemiology, the basic reproduction number is the average number of people who are infected by a single case of an infectious disease. the basic reproduction number can be used to determine how easy it is to control an epidemic. if r 0 < 1, the infectious disease will gradually disappear. if r 0 > 1, the infection can spread exponentially, but it generally does not last forever, as the number of people likely to be infected slowly declines.
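numerically, r 0 as the spectral radius of g = f v⁻¹ can be sketched as follows. since the text elides the explicit matrices, the entries of f (new infections) and v (transfers) below are a plausible reconstruction for the infected compartments (e, i, a) under an assumed force of infection α ρ s 0 (i + a), and should not be read as the paper's exact matrices:

```python
import numpy as np

# parameter values quoted in the text; f and v are a plausible
# reconstruction for the ordered infected compartments (e, i, a)
alpha, rho, s0 = 140.0, 2.1011e-5, 9999.0
chi, beta1 = 1 / 5.2, 0.86834
beta2 = 1 - beta1
delta1, delta2, eta, theta = 0.13029, 0.1, 0.2, 7.14e-3

F = np.array([[0.0, alpha * rho * s0, alpha * rho * s0],
              [0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])       # new infections enter e only
V = np.array([[chi + theta, 0.0, 0.0],
              [-beta1 * chi, delta1 + eta + theta, 0.0],
              [-beta2 * chi, 0.0, delta2 + theta]])

G = F @ np.linalg.inv(V)              # next generation matrix g = f v^-1
r0 = max(abs(np.linalg.eigvals(G)))   # spectral radius of g
```

the same machinery gives the time-varying reproduction number of fig. 6 when α is replaced by a time-dependent contact rate.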
some people may die of the infection, while others may recover and develop immunity. if r 0 = 1, the infectious disease can become endemic in the population. in this paper, we mainly study the impact of different effective epidemic prevention measures on controlling the epidemic situation. the covid-19 outbreak in wuhan has been well investigated; thus, we use the parameters of tang et al. in [13], [14] directly. we mainly analyze the infection data of wuhan, china, which is consistent with the infection trend of our early experimental area, so we can use the parameters in this study directly. specifically, we set the total number of people in the experimental area to 10000. the contact rate of susceptible individuals and infected people is α = 140. the disease incidence probability of each contact transmission is ρ = 2.1011 × 10 −5 . the exposed period is γ = 1/χ = 5.2 [15]. after the exposed period, the probability of the exposed people becoming infected is β 1 = 0.86834, and the probability of exposed infection becoming asymptomatic infection is β 2 = 1 − 0.86834. the probability of recovery is δ 1 = 0.13029, and the recovery probability of asymptomatic patients is δ 2 = 0.1. the probability of death due to covid-19 is η = 0.2. the natural mortality rate is θ = 7.14 × 10 −3 . furthermore, we set the initial values s = 9999, e = 0, i = 1, a = 0, r = 0. in order to verify the feasibility of the model, we used real data from wuhan for simulation. we set the initial experimental data as: s = 11081000, e = 0, i = 1, a = 0, r = 0. the first case was detected on december 6, 2019, according to china's national health commission. on january 24, 2020, wuhan implemented a strict lock-down policy for the population. in this way, from case detection to day 50, the contact rate drops significantly; we set it to α 0 = 5. the experimental results are shown in fig. 2.
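with the parameter values just listed, the euler scheme mentioned above can be sketched as below. since the odes themselves are not reproduced in the text, the right-hand side is a plausible reconstruction from the compartment descriptions (force of infection α ρ s (i + a), incubation rate χ split by β1/β2, recoveries δ1/δ2, covid-19 mortality η, natural mortality θ, migration λ1/λ2), not the paper's exact equations:

```python
import numpy as np

# parameter values quoted in the text (tang et al.); the right-hand side
# is a plausible reconstruction of the seiar odes
alpha, rho = 140.0, 2.1011e-5
chi = 1 / 5.2                        # inverse of the exposed period
beta1 = 0.86834                      # exposed -> symptomatic
beta2 = 1 - beta1                    # exposed -> asymptomatic
delta1, delta2 = 0.13029, 0.1        # recovery rates
eta, theta = 0.2, 7.14e-3            # covid-19 and natural mortality
lam_in, lam_out = 0.0, 0.0           # migration switched off in this run

def euler_step(state, dt):
    s, e, i, a, r = state
    n = s + e + i + a + r
    # assumed force of infection, with a guard against euler overshoot
    new_exposed = min(alpha * rho * max(s, 0.0) * (i + a), max(s, 0.0) / dt)
    ds = lam_in * n - new_exposed - (theta + lam_out) * s
    de = new_exposed - (chi + theta + lam_out) * e
    di = beta1 * chi * e - (delta1 + eta + theta + lam_out) * i
    da = beta2 * chi * e - (delta2 + theta + lam_out) * a
    dr = delta1 * i + delta2 * a - (theta + lam_out) * r
    return tuple(x + dt * dx for x, dx in zip(state, (ds, de, di, da, dr)))

state = (9999.0, 0.0, 1.0, 0.0, 0.0)  # initial values from the text
dt = 0.1
for _ in range(int(140 / dt)):        # integrate 140 days
    state = euler_step(state, dt)
```

swapping the constant α for a time-dependent contact-rate function reproduces the intervention scenarios discussed next.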
we found that the daily increase of infections was different from the actual reports in wuhan. some patients were not detected due to the limited number of test reagents in the early stage of the infection. the number of reported patients increased suddenly after the number of test reagents increased. therefore, the simulated daily increase in the number of infections is reasonable. according to the experimental results, the cumulative number of infections increases every day, and after about 80 days, the cumulative number of infections is basically unchanged. this phenomenon is consistent with the report of the wuhan health commission, which fully proves the applicability of our model. in order to control the spread of covid-19, measures such as staying home should be considered to reduce the contact rate. thus, in this paper, we set three contact rates: (i) a constant contact rate α = α 0 , (ii) an exponentially decreasing contact rate α(t) = (α 0 − α min )e −r 1 t + α min , and (iii) a piecewise exponential contact rate, which is as follows: (11) where α min represents the minimum contact rate after implementation of the policy. in equation (10), l is the duration of the ban in days. in equation (11), l 0 is the day the ban begins and l is the day the ban ends. quarantine can keep people at a social distance, and thus change the contact rate. therefore, we can define different values for the contact rate according to different epidemic prevention measures. firstly, if the authorities do not take any preventive measures, we assume that the contact rate is constant. otherwise, we assume that the contact rate decreases exponentially. over time, when people become less alert or the authorities end the quarantine too early, we set the contact rate to increase exponentially. the contact rate satisfies equation (10). in fig.
3, in order to reduce the impact of the migration-in rate on the test area, we assume that no infected persons will migrate into that area within the next 140 days. moreover, we set α min = 2, r 1 = 1.3768. in fig. 3, the experimental results show that the number of new symptomatic infections per day increases at the fastest pace in areas without quarantine measures (i.e., not isolated), and the cumulative number of symptomatic infections is the largest. the larger the value of l, the larger the cumulative number of symptomatic infections, because the sooner quarantine measures are taken, the less contact people will have, i.e., the lower the contact rate. if these measures are implemented only on day 50 (i.e., l = 50), the quarantine measures will have little effect. the results of this trial prove that if we want to mitigate the epidemic as soon as possible and reduce the number of infections during the epidemic, it is essential to implement strict guidelines as early as possible. secondly, as shown in fig. 4, the contact rate satisfies equation (11). we begin quarantine measures 20 days after the first case was found, i.e., l 0 = 20. the quarantine ended on the 40th (i.e., l = 40), 50th (i.e., l = 50), and 60th (i.e., l = 60) days in the experimental area. when quarantine measures are ended, the contact rate increases exponentially. according to the curves for no isolation, l = 40 and l = 50, there is little difference between premature termination of the quarantine and the failure to take preventive measures. the daily number of new infections had already stopped rising when the quarantine was terminated on the 60th day, and there is no significant difference between this and the continuous implementation of the quarantine measures. the analysis shows that the government should neither end the quarantine measures prematurely nor maintain the measures for too long. according to the infection data, the time for which quarantine measures are needed can be reasonably estimated.
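the quarantine-timing comparison of figs. 3-4 can be sketched by coupling an euler stepper to a time-dependent contact rate. both the ode right-hand side and the incidence term below are assumptions reconstructed from the text (here a frequency-dependent force α(t) p s (i + a)/n with an illustrative per-contact probability p, chosen only so that the timing of the quarantine visibly matters); the run only checks the qualitative ordering that earlier quarantine gives no more cumulative infections:

```python
import math

# rates quoted in the text, plus an illustrative per-contact probability p
chi = 1 / 5.2
beta1, beta2 = 0.86834, 1 - 0.86834
delta1, delta2, eta, theta = 0.13029, 0.1, 0.2, 7.14e-3
alpha0, alpha_min, r1, p = 140.0, 2.0, 1.3768, 0.02

def contact_rate(t, l_start):
    """alpha0 before the quarantine starts on day l_start, then an
    exponential decrease toward alpha_min (cf. eq. (10) of the text)."""
    if t < l_start:
        return alpha0
    return alpha_min + (alpha0 - alpha_min) * math.exp(-r1 * (t - l_start))

def cumulative_symptomatic(l_start, days=140, dt=0.05):
    s, e, i, a, r = 9999.0, 0.0, 1.0, 0.0, 0.0
    cum = 0.0
    for k in range(int(days / dt)):
        t = k * dt
        n = s + e + i + a + r
        new_exp = contact_rate(t, l_start) * p * max(s, 0.0) * (i + a) / n
        new_sym, new_asym = beta1 * chi * e, beta2 * chi * e
        cum += new_sym * dt                       # cumulative e -> i flow
        ds = -new_exp - theta * s
        de = new_exp - (chi + theta) * e
        di = new_sym - (delta1 + eta + theta) * i
        da = new_asym - (delta2 + theta) * a
        dr = delta1 * i + delta2 * a - theta * r
        s, e, i, a, r = (s + dt * ds, e + dt * de, i + dt * di,
                         a + dt * da, r + dt * dr)
    return cum

early, late, never = (cumulative_symptomatic(l) for l in (20, 50, math.inf))
```

as in the figures, the later the quarantine starts (`math.inf` meaning no quarantine at all), the larger the cumulative number of symptomatic infections.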
in our next experiment, we set different migration-in rates. firstly, λ 1 = 0 represents a blockade on entering or exiting the epidemic area. in addition, to compare the impact of the immigration rate on our infection model, we set the immigration-in rate to λ 1 = 0.01 and λ 1 = 0.007. the results, which are shown in fig. 5 for the symptomatic population, indicate that the total number of infected people in the experimental area decreases with a reduction in the immigration-in rate. additionally, the lower the immigration-in rate, the faster it will reach the peak number of infected people. the basic reproduction number r 0 can be used as a parameter to measure the rate of infection. as shown in fig. 6, we set different mobility rates, i.e., λ 1 = 0.00714, λ 1 = 0.001 and λ 1 = 0. we make the contact rate of each curve decrease exponentially. according to fig. 6, in the initial stage, the basic reproduction number is greater than 1. as time goes on, the contact rate decreases, which makes the basic reproduction number decrease, and it ends up being less than 1. in addition, comparing the three curves, the smaller the mobility rate, the smaller the basic reproduction number. these experimental results indicate that quarantine measures are conducive to controlling the spread of infectious diseases. this verifies the validity of the seiar model by experimentation. in the future, there remain several interesting topics, which are as follows: medical resource allocation: due to the different severities of the covid-19 epidemic in different areas, the utilization of medical resources in different regions is also different. for example, in severe areas, doctors, nurses, protective materials and drugs are in serious shortage. however, there is a surplus of medical supplies in areas with less severe outbreaks. therefore, we need to establish a scheduling scheme to transfer the surplus medical materials from areas with mild epidemics to areas with serious epidemics.
in order to make an optimal scheduling scheme, we need to establish a multi-objective optimization problem: the objective function is to minimize the mortality of the infected patients in the experimental area and to minimize the cost of deployment. the constraint is that cities should share medical resources with other cities while ensuring that the medical resources of their own city remain self-sufficient. the allocated materials accepted by the city receiving the medical resources cannot exceed the amount of medical resources it needs. thus, we need to build a hospital-to-hospital scheduling model. user mobility control for economic recovery: the epidemic has seriously affected the economies of all countries. achieving economic recovery while controlling the covid-19 epidemic is a challenging problem. from the perspective of existing research, user mobility is an important factor affecting both epidemic prevention and economic recovery. thus, we plan to apply user mobility control to achieve economic recovery on the basis of epidemic control. in particular, some novel approaches based on edge intelligence could be considered [16], especially advanced mobile networking and healthcare systems [17]. for example, we take the income of the hospitality industry and the number of people infected by the virus as the targets, and establish a multi-objective optimization problem. the objective function is to maximize the income of the catering industry and minimize the number of people infected. there exists a trade-off, because an increase in the number of infected people will restrict people from going out, so the income of the hospitality industry will decrease. in order to improve the income of this industry, we would increase the number of people going out, but at the expense of increasing the number of infected people. thus, we can allow an appropriate level of user mobility, and realize the profit of the catering industry on the basis of controlling the epidemic.
deep learning for population migration: in order to capture the dynamic mobility of the population in an experimental region, we can divide the movement of a population into two parts, i.e., the number of people moving into the area, called the migrate-in, and those moving out, called the migrate-out. according to previous studies, the amount of migration in any region is continuous from a time perspective. for example, the migrate-in amount has a strong correlation with the migrate-in amount at the previous moment. moreover, population movement is affected by covid-19. in this paper, we give simple parameters to describe the influence of immigration and emigration on covid-19. in future work, we will use deep learning [18], [19] approaches to build a spatial and temporal population migration model, to better predict the impact of population migration on the covid-19 epidemic. based on the results of our simulations, some interventions are needed to control the spread of covid-19. home confinement reduces person-to-person contact and can effectively reduce the rate of transmission and the number of people infected. in addition, the earlier measures to control person-to-person contact are implemented, the faster the spread of the virus is controlled. moreover, these measures must not be terminated prematurely. the experimental results of the study on immigration rates also prove the importance of implementing immigration control.
references:
[1] statistical inference in a stochastic epidemic seir model with control intervention: ebola as a case study
[2] modeling the effects of prevention and early diagnosis on hiv/aids infection diffusion
[3] a contribution to the mathematical theory of epidemics
[4] the mathematics of infectious diseases
[5] 2018 international scientific-practical conference problems of infocommunications
[6] global stability for delay sir and seir epidemic models with saturated incidence rates
[7] early prediction of the 2019 novel coronavirus outbreak in the mainland china based on simple mathematical model
[8] a simulation on potential secondary spread of novel coronavirus in an exported country using a stochastic epidemic seir model
[9] the effect of human mobility and control measures on the covid-19 epidemic in china
[10] reproduction numbers and subthreshold endemic equilibria for compartmental models of disease transmission
[11] media impact switching surface during an infectious disease outbreak
[12] global dynamics of a class of seirs epidemic models in a periodic environment
[13] an updated estimation of the risk of transmission of the novel coronavirus (2019-ncov)
[14] fitting and forecasting the trend of covid-19 by seir(+caq) dynamic model
[15] clinical characteristics of coronavirus disease 2019 in china
[16] edge intelligence in the cognitive internet of things: improving sensitivity and interactivity
[17] pea: parallel electrocardiogram-based authentication for smart healthcare systems
[18] automatic virtual network embedding: a deep reinforcement learning approach with graph convolutional networks
[19] an intelligent anomaly detection scheme for micro-services architectures with temporal and spatial data analysis

this work is supported by the national key r&d program of china under grant 2018yfc1314600.

dear editor: i am pleased to submit a research paper entitled "the introduction of population migration to seiar for covid-19 epidemic modelling with an efficient intervention strategy".
we believe that this manuscript is appropriate for publication in the special issue on "advances in multi-source information fusion for epidemic diseases"; thus, we submit it to this issue. to the best of our knowledge, the named authors have no conflict of interest, financial or otherwise. sincerely,

key: cord-174692-ljph6cao title: deterministic models in epidemiology: from modeling to implementation authors: dadlani, aresh; afolabi, richard o.; jung, hyoyoung; sohraby, khosrow; kim, kiseon date: 2020-04-07 journal: nan doi: nan sha: doc_id: 174692 cord_uid: ljph6cao the abrupt outbreak and transmission of biological diseases has always been a long-time concern of humankind. for long, mathematical modeling has served as a simple and yet efficient tool to investigate, predict, and control the spread of communicable diseases through individuals. a myriad of works on epidemic models and their variants have been reported in the literature. for better prediction of the dynamics of a particular disease, it is important to adopt the most suitable model. in this paper, we study some of the widely-appreciated deterministic epidemic models in which the population is divided into compartments based on the health status of each individual. in particular, we provide a demographic classification of such models and study each of them in terms of mathematical formulation, near-equilibrium-point stability properties, and disease outbreak threshold conditions (basic reproduction ratio). furthermore, we discuss the various influential factors that need to be considered during epidemic modeling. the main objective of this article is to provide a basic understanding of the mathematical complexity incurred in deterministic epidemic models with the aid of graphical illustrations obtained through implementation. throughout recorded history, human populations have always been haunted by the emergence and re-emergence of infectious diseases.
several lives have been lost due to lack of knowledge on the dynamical behavior of epidemic outbreaks of contagious diseases and measures to confront them [1], [2]. for decades, scientists have toiled to understand the transmission characteristics of such diseases so as to devise control strategies to prevent further spread of infection. the field of science that studies such epidemic diseases, and in particular the factors that influence the incidence, distribution, and control of infectious diseases in human populations, is called epidemiology. in this regard, mathematical modeling has proven useful in analyzing, predicting, evaluating, detecting, and implementing efficient control programs. such analytical models, accompanied by computer simulations, serve as experimental tools for building, testing, and assessing theories and for understanding the relationship between the various parametric values involved.

the practical use of epidemic models depends on how closely they realize actual biological diseases in the real world. to keep the models simple and tractable, many assumptions and relaxations are made at each level of the process. however, even such simplified models often pose significant questions regarding the underlying mechanisms of infection spread and possible control approaches. hence, adopting the apt epidemic model for prediction of a real phenomenon is of great importance.

models that are useful in the study of infectious diseases at the population scale can be broadly classified into two types: deterministic and stochastic. early models that were developed to study specific diseases such as tuberculosis, measles, and rubella were deterministic in nature. in deterministic models, the large population is divided into smaller groups called compartments (or classes), where each group represents a specific stage of the epidemic.
such models, often formulated in terms of a system of differential equations (in continuous time) or difference equations (in discrete time), attempt to explain what happens on average at the population scale. a solution of a deterministic model is a function of time or space and is generally uniquely dependent on the initial data. on the other hand, a stochastic model is formulated in terms of a stochastic process which, in turn, is a set of random variables, x(t; ω) ≡ x(t), defined as {x(t; ω) | t ∈ t and ω ∈ ω}, where t and ω represent time and a common sample space, respectively. the solution of a stochastic model is a probability distribution for each of the random variables. such models capture the variability inherent in demographic and environmental processes and are useful for small population sizes. more specifically, they allow each individual in the population to be followed on a chance basis [3], [4]. discrete-time markov chain (dtmc), continuous-time markov chain (ctmc), and stochastic differential equation (sde) models are three types of stochastic modeling processes which have been covered in depth in [5].

figure 1 shows the different classes under which epidemic models have been studied in the literature. the connecting blue lines in the figure highlight the main scope of this report. needless to say, both deterministic and stochastic epidemic models have their own applications. deterministic models are used to address questions such as: what fraction of individuals would be infected in an epidemic outbreak? what conditions should be satisfied to prevent and control an epidemic? what happens if individuals are mixed non-homogeneously? and so on [6]. while such models are preferable for studying a large population, stochastic epidemic models are useful for a small community and answer questions such as: how long is the disease likely to persist? what is the probability of a major outbreak? and the like [5].
hence, stochastic epidemic models are generalized forms of their simpler deterministic counterparts. however, unlike deterministic models, stochastic models can be laborious to set up and may need many simulation runs to yield useful predictions. they can become mathematically very complex and result in misperception of the dynamics [3]. to this end, we focus on some widely-used deterministic models which are relatively easier to conceive, set up, and implement using the various computer softwares at disposal.

figure 1: classification of various classes of epidemic models.

deterministic epidemiology is believed to have started in the early twentieth century [7]. in 1906, hamer was the first to assume that the incidence (number of new cases per unit time) is proportional to the product of the number of susceptible and infective individuals in his model for measles epidemics [8]. the exponential growth in mathematical epidemiology was boosted by the acclaimed work of kermack and mckendrick, published in 1927 [9]. this paper laid out a foundation for modeling infections where all members of the population are assumed to be initially equally susceptible to the disease and to confer complete immunity only after recovery. after decades of neglect, the kermack-mckendrick model was brought back to prominence by anderson et al. in 1979 [10]. since then, several models have been developed addressing aspects such as passive and disease-acquired immunity, vaccination, quarantine, vertical transmission, disease vectors, age structure, social and sexual mixing groups, as well as chemotherapy [11, 12, 13, 14]. improved models have also been designed for diseases such as measles, chickenpox, smallpox, whooping cough, malaria, rabies, diphtheria, filariasis, herpes, syphilis, and hiv/aids [14], [15]. the main objective of this article is to help the reader gain insight into the basics of deterministic compartmental modeling through implementation.
in this work, we formulate some well-known models and derive their steady-state solutions. since the models under study are non-linear in nature, we investigate their qualitative behavior near their corresponding equilibria using the linearization method [16]. all models discussed in this paper have been implemented using wolfram mathematica [17], the codes of which are freely available.

the remainder of this article is structured as follows. section ii provides a demographic classification of deterministic models along with the notations and assumptions that will be used throughout the paper. in section iii, we present the classical epidemic model known as the susceptible-infected-recovered (sir) model, which forms the basis for the extended models that follow in sections iv to vii, accompanied by their implementation results. in section viii, we present some additional factors that impact the behavior of epidemic models, followed by some concluding remarks in section ix.

in order to analyze the structure of epidemic models as well as the relation between their structure and the resulting dynamics, it is important to classify models as clearly and simply as possible. in our study, we classify and study compartmental models based upon demography, or vital dynamics. demography relates to the study of characteristics of human populations, such as birth, death, incidence of disease, and so on. epidemic models with vital dynamics consider an open population with births and deaths, while models without vital dynamics have a closed and fixed population with no demographic turnover. for better realization, we assume that the law of mass action holds for all models in this paper. this law states that if individuals in a population mix homogeneously, then the encounters between infected and susceptible individuals occur at a rate proportional to their respective numbers in the population [18].
in other words, the rate at which the susceptible population becomes infected is directly proportional to the product of the sizes of the two populations. table 1 summarizes the notations that will be used in model derivation henceforth. note that s, i, r, and e are used to represent the compartments in the epidemic model as well as the proportions of the corresponding compartments at any time instant t.

3 the basic sir model

in their first paper, kermack and mckendrick created a model in which the population is divided into three compartments: susceptible (s), infected (i), and recovered (r) [9], as illustrated in figure 2. they assumed that all individuals are mutually equally susceptible to the disease and that complete immunity is obtained merely after recovery from infection. moreover, they also assumed that the duration of the disease is the same as the duration of infection, with constant transmission and recovery rates. based upon the demographic classification, the epidemic and endemic sir models are studied below.

for a closed population of size n, we assume that the mixing of individuals is homogeneous and that the law of mass action holds. also, for large classes of communicable diseases, it is more realistic to consider a force of infection that depends on the fraction of infected population with respect to the total constant population n, rather than on the absolute number of infectious subjects. based upon this assumption, the standard disease incidence rate is defined as βsi/n and the overall rate of recovery is given as γi. in spite of the above simplifying assumptions, the resulting non-linear system does not admit a closed-form solution. nevertheless, we shall see how significant results can be derived analytically. figure 2 can be translated into the following set of differential equations:

ds/dt = −βsi/n, (1)
di/dt = βsi/n − γi, (2)
dr/dt = γi. (3)

summing up (1), (2), and (3) yields zero, which implies that the population is of constant size with s + i + r = n.
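the epidemic sir system described above is easy to integrate numerically. the following is a minimal sketch in python (the paper itself uses wolfram mathematica) using forward euler; the step size, horizon, and parameter values are illustrative choices of ours, not from the paper.

```python
# minimal forward-euler integration of the epidemic sir model with
# standard incidence beta*s*i/n and recovery rate gamma*i.
# dt, t_end, and the parameters below are illustrative assumptions.
def simulate_sir(beta, gamma, n=1.0, i0=1e-3, dt=0.01, t_end=200.0):
    s, i, r = n - i0, i0, 0.0
    for _ in range(int(t_end / dt)):
        new_inf = beta * s * i / n      # incidence term shared by ds/dt and di/dt
        ds = -new_inf                   # equation for ds/dt
        di = new_inf - gamma * i        # equation for di/dt
        dr = gamma * i                  # equation for dr/dt
        s, i, r = s + dt * ds, i + dt * di, r + dt * dr
    return s, i, r

s_end, i_end, r_end = simulate_sir(beta=0.5, gamma=0.25)   # r0 = 2: major outbreak
s_sub, i_sub, r_sub = simulate_sir(beta=0.2, gamma=0.25)   # r0 = 0.8: dies out
```

since ds/dt + di/dt + dr/dt = 0, the sum s + i + r stays at n throughout the integration (up to floating-point error), and the two runs illustrate the threshold behavior around r0 = 1.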
dividing (2) by (1) gives:

di/ds = −1 + γn/(βs). (4)

assume that the population is susceptible up to time zero, at which a relatively small number, i(0), become infected. thus, at t = 0, s(0) = n − i(0) and r(0) = 0. as time approaches infinity, lim t→∞ i(t) = 0, lim t→∞ s(t) = s(∞), and the number of individuals that have been infected is s(0) − s(∞). integrating (4) leads to:

i(t) + s(t) − (γn/β) ln s(t) = c, (5)

where c is constant and s(∞) is the proportion of susceptibles at the end of the epidemic. since the initial infection is small, (5) further reduces to:

s(∞) − (γn/β) ln s(∞) = c′, (6)

where c′ denotes some constant. defining r0 = β/γ as the basic reproduction ratio, we see in figure 3 that for r0 > 1, a small infection to the population would create an epidemic (figure 3a shows the case r0 = 0.7 with β = 0.7 and γ = 1). r0 describes the total number of secondary infections produced when one infected individual is introduced into a disease-free population. the importance of the role of r0 can be seen by rewriting (2) as follows:

di/dt = γi(r0 s/n − 1). (7)

in order to avoid an epidemic, (7) should be non-positive. this is possible only if r0 s(0)/n ≤ 1. otherwise, (7) is positive and thus, there will be an epidemic outbreak. figure 3 illustrates the limiting values of the s, i, and r compartments for different values of r0, where n is normalized to 1.

inclusion of demographic dynamics may permit a disease to persist in a population in the long term. a disease is said to be endemic if it remains in a population for over a decade or two. due to the long time period involved, an endemic disease model must include births as a source of new susceptibles and natural deaths in each compartment. in our study of the endemic sir model, we consider constant birth and death rates. using the notations in table 1, the scheme in figure 4 (the basic sir model with vital dynamics) can be expressed mathematically as:

ds/dt = bn − βsi/n − µs, (8)
di/dt = βsi/n − (γ + µ)i, (9)
dr/dt = γi − µr. (10)

assuming that b equals µ, we can easily see that the sum of the above three equations yields zero when s + i + r = n holds in a non-varying population.
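the relationship between s(0), s(∞), and r0 in the epidemic sir model can be checked numerically. integrating di/ds gives the classical final-size relation s(∞)/n = (s(0)/n) · exp(−r0 (1 − s(∞)/n)), an implicit equation that the sketch below (our illustration, in python rather than the paper's mathematica; tolerances and starting guess are assumptions) solves by fixed-point iteration.

```python
import math

# fixed-point iteration for the sir final-size relation, in fractions of n:
#   s_inf = s0 * exp(-r0 * (1 - s_inf))
# starting guess, tolerance, and iteration cap are illustrative choices.
def final_susceptible_fraction(r0, s0=0.999, tol=1e-12, max_iter=10_000):
    s = 0.5  # arbitrary starting guess in (0, 1)
    for _ in range(max_iter):
        s_next = s0 * math.exp(-r0 * (1.0 - s))
        if abs(s_next - s) < tol:
            return s_next
        s = s_next
    return s

s_inf = final_susceptible_fraction(r0=2.0)
```

for r0 = 2 and a nearly fully susceptible population, roughly a fifth of the population escapes infection, consistent with the threshold picture above: the larger r0, the smaller s(∞).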
moreover, we observe that the average time of an infection is 1/(γ + µ), and since the infectious individuals infect others at rate β, r0 is defined as β/(γ + µ). by setting the left-hand sides of (8)-(10) to zero and solving for s, i, and r, we obtain the following two steady states (or equilibrium points) [16]:

e1: (s*, i*) = (n, 0), e2: (s*, i*) = ((c1 + c2)n, c1(r0 − 1)n), (11)

where c1 = µ/β and c2 = γ/β. points e1 and e2 denote the disease-free equilibrium (dfe) and endemic equilibrium (ee) points, respectively. figure 5a depicts the system in a disease-free steady state when r0 ≤ 1, whereas figure 5b shows the occurrence of an endemic as the infected population reaches a limiting value of 0.225 when r0 > 1. the stability of the system is driven by r0, as can be observed by linearizing the system of equations at these points. the local stability of the model at these equilibrium points is analyzed via linearization. the jacobian matrix for (8) and (9) is given as:

j = [−βi/n − µ, −βs/n; βi/n, βs/n − (γ + µ)]. (12)

evaluating the above matrix at e1 and solving the characteristic equation, det(j − λi) = 0, where i is the identity matrix of size 2, results in the following pair of eigenvalues:

λ1 = −µ, λ2 = β − (γ + µ). (13)

in order to be a stable node, both eigenvalues should be negative. therefore, e1 is stable when β < γ + µ (or equivalently r0 < 1) and unstable when β > γ + µ. similarly, the jacobian matrix evaluated at e2 is as below:

j = [−µr0, −(γ + µ); µ(r0 − 1), 0]. (14)

one can easily observe that the determinant of (14) is positive as long as r0 > 1. hence, the endemic equilibrium is stable only when r0 > 1. the s-i phase portrait in figure 6 shows how the model approaches the dfe and ee points with different initial values for s(0) and i(0). for the sake of simplicity, n has been normalized to 1. as depicted in figure 6a, for r0 = 0.5, the system eventually ends up at (1, 0), irrespective of the initial values of s(0) and r(0). on the other hand, an endemic occurs at (0.625, 0.225) for r0 = 1.6, as in figure 6b.
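the endemic equilibrium and its local stability can be verified directly from the jacobian of the endemic sir model. the sketch below (python, not the paper's mathematica) uses parameter values of our own choosing (β = 0.4, γ = 0.10, µ = 0.15) that happen to give r0 = 1.6 and the endemic point (0.625, 0.225) quoted in the text; the paper's actual parameters for that figure are not stated.

```python
import cmath

# endemic sir steady state (n = 1): s* = (gamma + mu)/beta, i* = mu*(r0 - 1)/beta,
# followed by the eigenvalues of the 2x2 jacobian of (ds/dt, di/dt) at that point.
# parameter values are illustrative assumptions giving r0 = 1.6.
def endemic_sir_equilibrium(beta, gamma, mu, n=1.0):
    r0 = beta / (gamma + mu)
    s_star = (gamma + mu) / beta * n        # equals n / r0
    i_star = mu * (r0 - 1.0) / beta * n     # from setting ds/dt = 0
    return r0, s_star, i_star

def jacobian_eigenvalues(beta, gamma, mu, s_star, i_star, n=1.0):
    a11 = -beta * i_star / n - mu           # d(ds/dt)/ds
    a12 = -beta * s_star / n                # d(ds/dt)/di
    a21 = beta * i_star / n                 # d(di/dt)/ds
    a22 = beta * s_star / n - (gamma + mu)  # d(di/dt)/di (zero at the endemic point)
    tr, det = a11 + a22, a11 * a22 - a12 * a21
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + disc) / 2.0, (tr - disc) / 2.0

r0, s_star, i_star = endemic_sir_equilibrium(beta=0.4, gamma=0.10, mu=0.15)
lam1, lam2 = jacobian_eigenvalues(0.4, 0.10, 0.15, s_star, i_star)
```

both eigenvalues have negative real part, confirming that the endemic equilibrium is locally stable for r0 > 1.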
the phenomenon in which a parameter variation causes the stability of an equilibrium to change is known as bifurcation [16]. in continuous systems, this corresponds to the real part of an eigenvalue of an equilibrium passing through zero. with the basic reproduction number as the bifurcation parameter, figure 7 shows a transcritical bifurcation where the equilibrium points persist through the bifurcation but their stability properties change. hence, we can conclude that e1 is stable for r0 < 1, while e2 is stable for r0 > 1. a few recent studies have revealed interesting bifurcation behaviors in sir models incorporating factors such as varying immunity period, saturated treatment, and vaccination. we refer the interested reader to [19, 20, 21, 22] and the references therein for more details.

for viral diseases, such as measles and chickenpox, where the recovered individuals in general gain immunity against the virus, the sir model is applicable. however, there exist certain bacterial diseases, such as gonorrhoea and encephalitis, that do not confer immunity. in such diseases, an infectious individual is allowed to recover from the infection and return unprotected to the susceptible class, where he/she is prone to getting infected again. cases as such can be modeled using the susceptible-infected-susceptible (sis) model shown in figure 8, where the model variables are defined in table 1. in a fixed population, where there is no birth or death and individuals recover from the disease at the per capita rate γ, the simplest form of the model in figure 8 is given by:

ds/dt = −βsi/n + γi, (16)
di/dt = βsi/n − γi. (17)

by substituting s = n − i in (17), the system above can be reduced to:

di/dt = (β − γ)i − (β/n)i². (18)

solving (18) analytically with i(0) = i0 gives the solution for the complete system at time t as follows:

i(t) = n(1 − γ/β) c e^((β−γ)t) / (1 + c e^((β−γ)t)), (19)

which, with the constant c determined by i(0) = i0, becomes:

i(t) = (β − γ)i0 n / (βi0 + (n(β − γ) − βi0) e^(−(β−γ)t)). (20)

the behavior of the system in the long term can be inferred by looking at the possible values of (β − γ) that make (20) feasible. if (β − γ) > 0, then e^(−(β−γ)t) → 0 as t → ∞. this can be written as:

lim t→∞ i(t) = n(1 − γ/β). (21)

if (β − γ) < 0, then e^(−(β−γ)t) → ∞ as t → ∞ and thus, lim t→∞ i(t) = 0.
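the logistic character of the reduced sis equation di/dt = (β − γ)i − (β/n)i² makes it easy to cross-check the closed-form solution against a direct numerical integration. the comparison below is our own sketch in python (not the paper's mathematica); the parameters, step size, and tolerance are illustrative assumptions.

```python
import math

# closed-form logistic solution of the reduced sis equation versus a
# forward-euler integration of the same equation (n = 1).
# parameter values and dt are illustrative assumptions.
def sis_closed_form(t, beta, gamma, i0, n=1.0):
    k = n * (1.0 - gamma / beta)   # limiting prevalence for beta > gamma
    return k / (1.0 + (k / i0 - 1.0) * math.exp(-(beta - gamma) * t))

def sis_euler(t_end, beta, gamma, i0, n=1.0, dt=1e-4):
    i = i0
    for _ in range(int(t_end / dt)):
        i += dt * ((beta - gamma) * i - beta * i * i / n)
    return i

beta, gamma, i0 = 0.8, 0.3, 0.01
analytic = sis_closed_form(20.0, beta, gamma, i0)
numeric = sis_euler(20.0, beta, gamma, i0)
```

with β > γ the two curves agree and both approach the limiting prevalence n(1 − γ/β) = 0.625, matching the long-term behavior stated in the text.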
there exist two equilibrium points for this model, which can be obtained by setting di/dt = 0 in (18) and solving for s and i. with r0 = β/γ, we get:

e1: (s*, i*) = (n, 0), e2: (s*, i*) = (n/r0, n(1 − 1/r0)), (22)

where e1 and e2 denote the dfe and ee points, respectively. in terms of the basic reproduction number, if r0 ≤ 1, the pathogen dies out, as illustrated in figure 9a, because the infection in one individual cannot replace itself. if r0 > 1, an existing infectious individual leads to more than one infection, thus spreading the pathogen in the population, as seen in figure 9b. the jacobian matrix constructed from (16) and (17) is as follows:

j = [−βi/n, −βs/n + γ; βi/n, βs/n − γ]. (23)

linear stability analysis for e1 is done by solving the corresponding characteristic equation to obtain the following pair of eigenvalues:

λ1 = 0, λ2 = β − γ. (24)

the stability of e1 depends on the value taken by λ2. the equilibrium point is a stable dfe if β < γ (or equivalently r0 < 1) and unstable if β > γ (or r0 > 1). similarly, the eigenvalues of the characteristic equation for e2 are:

λ1 = 0, λ2 = −(β − γ). (25)

in this case, the ee point is stable if β > γ and unstable if β < γ. figure 10 depicts the vector plots for examples where e1 and e2 are unstable. in figure 10a, the system converges to some state (highlighted in red) other than (1, 0) when r0 = 2.5. likewise, for r0 < 1, the system converges to an invalid state, as illustrated in figure 10b. at β = γ, or equivalently r0 = 1, a bifurcation occurs as the two equilibria collide (dfe equals ee) and exchange stability [7]. the forward bifurcation occurring at this threshold condition is similar to the one seen previously in section iii.

the sis model with a varying population of constant size is shown in figure 11. the corresponding system of differential equations for such a model is given below, where b and µ are assumed to be equal and s + i = n:

ds/dt = bn − βsi/n + γi − µs, (26)
di/dt = βsi/n − (γ + µ)i. (27)

to find the equilibrium points of the system, we set (26) and (27) to zero and obtain:

e1: (s*, i*) = (n, 0), e2: (s*, i*) = (n/r0, n(1 − 1/r0)), (28)

where r0 is β/(γ + µ). figure 12 shows the system behavior for different values of r0.
in figure 12a, since r0 is less than 1, the disease dies out and the system enters the disease-free steady state. the same happens at r0 = 1, where the two equilibria meet. for r0 greater than 1, the disease does not die out but instead remains in the population as an endemic with a limiting value. this can be seen in figure 12b, where an endemic occurs at (s(t), i(t)) = (0.56, 0.44) for r0 = 1.8. the corresponding jacobian matrix for this model is given as:

j = [−βi/n − µ, −βs/n + γ; βi/n, βs/n − (γ + µ)]. (29)

the eigenvalues of j in (29) for e1 and e2 are deduced as follows:

e1: λ1 = −µ, λ2 = β − (γ + µ); e2: λ1 = −µ, λ2 = −(β − (γ + µ)). (30)

linear stability analysis reveals that the disease-free equilibrium (e1) is asymptotically stable if β − (γ + µ) ≤ 0 (or r0 ≤ 1) and unstable otherwise [23]. similarly, the endemic steady state is asymptotically stable if r0 > 1. figure 13 portrays the stability of e1 and e2 for different values of r0. in figure 13a, with r0 = 0.6, the system converges to e1 = (1, 0), where it is stable. in the same manner, as in figure 13b, the system eventually ends up at e2 = (0.56, 0.44) for r0 = 1.8, which is a valid endemic state. the forward bifurcation for the simple sis model with demographic turnover and r0 as the bifurcation parameter occurs at r0 = 1, as studied earlier. nevertheless, such models in the presence of additional factors reveal interesting behaviors. some recent works on the sis model that exhibit bifurcations consider factors such as a non-constant contact rate leading to multiple stable equilibria [24], non-linear birth rate [25], treatment [26], and time delay [27].

the sirs model is an extension of the basic sir model in which individuals recover with immunity to the disease and become susceptible again some time after recovering. influenza is a contagious viral disease that is usually studied using this model. in what follows, we investigate the model in both the absence and the presence of demographic turnover.

figure 14: the sirs model without vital dynamics.
the system of differential equations describing the sirs flow diagram in fig. 14 is as below:

ds/dt = −βsi/n + νr, (32)
di/dt = βsi/n − γi, (33)
dr/dt = γi − νr. (34)

with the three compartments summing up to n, we obtain the following two equilibrium points, where e1 and e2 denote the dfe and ee, respectively. with c1 = ν/(γ + ν), c2 = γ/(γ + ν), and r0 defined as β/γ, we have:

e1: (s*, i*, r*) = (n, 0, 0), e2: (s*, i*, r*) = (n/r0, c1 n(1 − 1/r0), c2 n(1 − 1/r0)). (35)

as illustrated in fig. 15, the infection dies out and reaches the disease-free steady state for r0 ≤ 1 and stays as an endemic when r0 > 1. the corresponding jacobian matrix of the system is obtained by substituting r with n − s − i in (32). hence, at e1, the matrix j yields the following two eigenvalues:

λ1 = −ν, λ2 = β − γ. (36)

therefore, e1 is a stable node if λ2 ≤ 0 or r0 ≤ 1, and is a saddle point (unstable) if λ2 > 0 or r0 > 1. however, analyzing the stability of e2 requires more care due to the structural complexity of the corresponding eigenvalues, given as:

λ1,2 = (−ν(β + ν) ± √((ν(β + ν))² − ν(β − γ)(2(γ + ν))²)) / (2(γ + ν)). (37)

for the sake of clarity, let us denote ν(β + ν) and 2(γ + ν) as c and b, respectively. rewriting the above eigenvalues gives:

λ1,2 = (−c ± √(c² − ν(β − γ)b²)) / b. (38)

also, let us represent c² − ν(β − γ)b² as δ and ν(β − γ)b² as ζ. in order to study the stability of e2, we need to consider the following two cases:

• case a: when δ ≥ 0, depending on the possible values that (β − γ) can take, we have the following two conditions: (i) if β − γ ≤ 0 or r0 ≤ 1, then ζ ≤ 0, which implies that the magnitude of δ is always greater than or equal to c². under this condition, the eigenvalues will always be real with opposite signs. hence, the equilibrium point e2 would be a saddle point, as portrayed in figure 16a, where the parametric values are the same as those in figure 15a. (ii) if β − γ > 0 or r0 > 1, then ζ > 0 and thus, the magnitude of δ would always be less than c². in this case, λ1 and λ2 will always be real values with negative signs. as shown in figure 16b, e2 converges to a stable node with the same parametric values as given in figure 15b.
here, δ = 0.019, which is less than c² = 1.519 and thus results in a stable node at (0.333, 0.539).

• case b: when δ < 0, β − γ should always be greater than zero and thus, the eigenvalues would be complex conjugates. since the real parts of the eigenvalues are negative, the equilibrium would be a stable focus point, for which the following condition holds:

ν(β − γ)b² > c². (39)

as an example for this case, with r0 = 2, β = 0.7, γ = 0.35, and ν = 0.25, the stable focus occurs at (0.5, 0.208), as depicted in figure 16c.

as in figure 17 (the sirs model with vital dynamics), the sirs model with standard incidence can be simply expressed as the following set of differential equations [23]:

ds/dt = bn − βsi/n − µs + νr, (41)
di/dt = βsi/n − (γ + µ)i, (42)
dr/dt = γi − (ν + µ)r. (43)

it is worth mentioning that 1/γ and 1/ν can be regarded as the mean infectious period and the mean immune period, respectively. with ν = 0, the model reduces to an sir model with no transition from class r to class s due to life-long immunity. the system has a dfe point and a unique ee point, denoted by e1 and e2, respectively, as given below:

e1: (s*, i*, r*) = (n, 0, 0), e2: (s*, i*, r*) = (n/r0, c1 n(1 − 1/r0), c2 n(1 − 1/r0)), (44)

where r0 is given by β/(γ + µ), c1 = (ν + µ)/(γ + ν + µ), c2 = γ/(γ + ν + µ), and b is assumed to be equal to µ. it should be noted that the basic reproduction number does not depend on the loss-of-immunity rate (ν). the system reaches the dfe steady state for r0 = 0.28, as shown in figure 18a. here, the parametric values are taken to be β = 0.04, µ = 0.043, γ = 0.1, and ν = 0.01. on the other hand, figure 18b illustrates an example for which r0 > 1. in this case, the system reaches the endemic state (0.625, 0.125, 0.25) for r0 = 1.6. the following jacobian matrix is obtained from (41) and (42), after substituting r with n − s − i:

j = [−βi/n − (µ + ν), −βs/n − ν; βi/n, βs/n − (γ + µ)]. (45)

evaluating (45) at e1 and solving its corresponding characteristic equation yields the following pair of eigenvalues:

λ1 = −(µ + ν), λ2 = β − (γ + µ). (46)

we see that the disease-free equilibrium (e1) is stable if λ2 ≤ 0, i.e. β − (γ + µ) ≤ 0 or r0 ≤ 1, and unstable otherwise.
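the stable-focus case of the sirs model (no vital dynamics) can be reproduced numerically from the 2-d jacobian with r eliminated as r = n − s − i. the sketch below is our own python illustration (the paper uses mathematica) and uses the example quoted above: r0 = 2, β = 0.7, γ = 0.35, ν = 0.25, with n normalized to 1.

```python
import cmath

# endemic equilibrium of the sirs model without vital dynamics (n = 1),
# classified via the eigenvalues of the jacobian of (ds/dt, di/dt)
# after eliminating r = 1 - s - i. complex eigenvalues with negative
# real part indicate a stable focus.
def sirs_endemic_eigenvalues(beta, gamma, nu):
    s_star = gamma / beta                              # s* = 1 / r0
    i_star = nu * (1.0 - s_star) / (gamma + nu)        # i* from dr/dt balance
    a11 = -beta * i_star - nu                          # d(ds/dt)/ds
    a12 = -beta * s_star - nu                          # d(ds/dt)/di
    a21 = beta * i_star                                # d(di/dt)/ds
    a22 = beta * s_star - gamma                        # zero at the endemic point
    tr, det = a11 + a22, a11 * a22 - a12 * a21
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return s_star, i_star, (tr + disc) / 2.0, (tr - disc) / 2.0

s_star, i_star, lam1, lam2 = sirs_endemic_eigenvalues(0.7, 0.35, 0.25)
```

the equilibrium comes out at (0.5, 0.208) with a complex-conjugate eigenvalue pair of negative real part, matching the stable focus of figure 16c.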
similarly, on finding the eigenvalues for e2, we see that this unique endemic equilibrium point is stable for r0 > 1. the instability of the equilibrium points can be seen in figure 19. for r0 = 1.6, the vector plot in figure 19a shows how the system does not converge to (s*, i*) = (1, 0). in the same manner, figure 19b illustrates the instability of e2 at (s*, i*) = (3.575, −0.892) when r0 is less than unity. with the basic reproduction ratio as the bifurcation parameter, the system yields a transcritical forward bifurcation at r0 = 1. more interesting behaviors have been reported when the model is studied under factors such as stage structure [28] and non-linear incidence rates [29], [30].

hitherto, we have dealt with models comprising the s, i, and r compartments. in these models, the infected individuals become infectious immediately. in the next two models, an exposed compartment, in which all the individuals have been infected but are not yet infectious, is introduced. such models take into consideration the latent period of the disease, resulting in an additional compartment denoted by e(t). the progression rate coefficient from compartment e to i is given as ε, such that 1/ε is the mean latent period. several other models with a latent period, such as seir and seirs, have also been reported in the literature. however, such models are beyond the scope of this article. interested readers can refer to [14], [15], and [31] for more on epidemic models beyond two dimensions.

6 the sei model

unlike sir models, the susceptible-exposed-infected (sei) model assumes that a susceptible individual first undergoes a latent (or exposed) period before becoming infectious [32]. one example of this model is the transmission of severe acute respiratory syndrome (sars) coronavirus. for a fixed population of size n, the following differential equations describe the flow diagram in fig.
20:

ds/dt = −βsi/n, (47)
de/dt = βsi/n − εe, (48)
di/dt = εe. (49)

the system should be analyzed asymptotically, as it does not have a closed-form solution. we shall see that the population converges into a single compartment due to the straightforward nature of the system. in what follows, we investigate the system behavior in terms of β and ε.

• case a: when β ≠ 0 and ε ≠ 0, depending on the initial values of e(0) and i(0), the system approaches two different equilibrium points, as below: (i) if e(0) = 0 and i(0) = 0, the system remains in the following disease-free equilibrium:

(s*, e*, i*) = (n, 0, 0). (50)

this scenario is illustrated in figure 21a, where the complete population is susceptible at all times for n = 1, β = 0.8, ε = 0.5, and s(0) = 1. (ii) if e(0) ≠ 0 or i(0) ≠ 0, then the system approaches the following equilibrium point in the long term:

(s*, e*, i*) = (0, 0, n). (51)

to prove this, consider the solution of (49), which is:

i(t) = i(0) + ε ∫_0^t e(τ)dτ. (52)

i(t) is a monotonically increasing function when e(t) > 0. since all compartments are always non-negative, e(t) is greater than 0 when e(t) ≠ 0. additionally, on solving (47), we get:

s(t) = s(0) e^(−(β/n) ∫_0^t i(τ)dτ). (53)

when i(t) > 0, we see that s(t) is a monotonically decreasing function. unless e(t) = 0 for all t, i(t) is a monotonically increasing function, and lim t→∞ s(t) = 0 as lim t→∞ ∫_0^t i(τ)dτ = ∞. in terms of e(t), from (48), we see that once s = 0 the solution e(t) = ce^(−εt) goes to 0 as t approaches infinity. in summary, if the condition e(0) ≠ 0 or i(0) ≠ 0 is satisfied, then lim t→∞ s(t) = 0. since s = 0, lim t→∞ e(t) = lim t→∞ ce^(−εt) = 0. consequently, lim t→∞ i(t) = n. this case is depicted in figure 21b.
• case b: when β = 0 and ε ≠ 0, the reduced system yields the following solution:

s(t) = s(0), e(t) = e(0)e^(−εt), i(t) = i(0) + e(0)(1 − e^(−εt)).

since lim t→∞ e(t) = 0, the system results in the following equilibrium solution, as shown in figure 21c, where the state of the system changes from the initial condition (s(0), e(0), i(0)) = (0.75, 0.24, 0.01) to the steady state (0.75, 0, 0.25) for ε = 0.5:

(s*, e*, i*) = (s(0), 0, e(0) + i(0)).

• case d: when β ≠ 0 and ε = 0, the solution of the reduced system of differential equations is given as below:

s(t) = s(0)e^(−βi(0)t/n), e(t) = e(0) + s(0)(1 − e^(−βi(0)t/n)), i(t) = i(0).

with b = µ in a birth-death population of total size n, the model illustrated in fig. 22 can be written as follows:

ds/dt = bn − βsi/n − µs, (60)
de/dt = βsi/n − (ε + µ)e, (61)
di/dt = εe − µi. (62)

the two sets of equilibrium points obtained by setting the left-hand sides of (60)-(62) to zero and solving for s, e, and i are:

e1: (s*, e*, i*) = (n, 0, 0), e2: (s*, e*, i*) = (n/r0, c1 n(1 − 1/r0), c2 n(1 − 1/r0)),

with c1 and c2 defined as µ/(ε + µ) and ε/(ε + µ), respectively, and r0 = βε/(µ(ε + µ)). the dfe and ee steady states are e1 and e2, respectively. with β = 0.25, ε = 0.4, and µ = 0.2, figure 23a depicts the disease-free steady state of the system, as r0 = 0.833. likewise, figure 23b shows how the system reaches the endemic equilibrium for r0 = 1.7 when β = 0.54, ε = 0.5, and µ = 0.22. considering (60) and (62), with e substituted by n − s − i, the jacobian matrix for the system is as given below:

j = [−βi/n − µ, −βs/n; −ε, −(ε + µ)].

evaluating the matrix at e1 and solving the characteristic equation gives the following two eigenvalues:

λ1 = (−(ε + 2µ) − √(ε(4β + ε)))/2, λ2 = (−(ε + 2µ) + √(ε(4β + ε)))/2.

for e1 to be a stable node, both eigenvalues should be negative, which means that λ2 should be less than zero. hence, ε(4β + ε) should be less than (ε + 2µ)², which implies that r0 < 1. in other words, e1 is a stable node if r0 < 1 and is a saddle point if the eigenvalues have opposite signs. doing the same for e2, we get a corresponding pair of eigenvalues; similar to the above case, if r0 > 1, then both eigenvalues are negative and thus, e2 would be a stable node. otherwise, e2 would be a saddle point. in the simplest case, with r0 as the bifurcation parameter, the system exhibits a forward transcritical bifurcation as it switches between the two equilibria.
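the basic reproduction ratio and dfe stability of the sei model with vital dynamics can be checked against the two parameter sets quoted above. the sketch below is our python illustration (not the paper's mathematica); the eigenvalue expression is written out from the stability condition ε(4β + ε) < (ε + 2µ)² stated in the text.

```python
import math

# sei model with vital dynamics: r0 = beta*eps / (mu*(eps + mu)) and the
# dfe eigenvalues lambda = (-(eps + 2*mu) +/- sqrt(eps*(4*beta + eps))) / 2,
# which are both negative exactly when eps*(4*beta + eps) < (eps + 2*mu)^2.
def sei_r0(beta, eps, mu):
    return beta * eps / (mu * (eps + mu))

def sei_dfe_eigenvalues(beta, eps, mu):
    root = math.sqrt(eps * (4.0 * beta + eps))
    return (-(eps + 2.0 * mu) + root) / 2.0, (-(eps + 2.0 * mu) - root) / 2.0

r0_low = sei_r0(beta=0.25, eps=0.4, mu=0.2)     # text example: r0 = 0.833
r0_high = sei_r0(beta=0.54, eps=0.5, mu=0.22)   # text example: r0 = 1.7
lam1_low, lam2_low = sei_dfe_eigenvalues(0.25, 0.4, 0.2)
lam1_high, lam2_high = sei_dfe_eigenvalues(0.54, 0.5, 0.22)
```

for the first parameter set both eigenvalues are negative (stable dfe, r0 < 1), while for the second the larger eigenvalue is positive (unstable dfe, r0 > 1), consistent with figures 23a and 23b.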
However, not much has been done in the bifurcation analysis of such models in the presence of other factors. The susceptible-exposed-infected-susceptible (SEIS) model is an extension of the SEI model such that the individual does not remain infected forever, but instead recovers and returns to being susceptible again. Many sexually transmitted diseases (STDs) and chlamydial infections are known to result in little or no acquired immunity following recovery [10]; in such cases, this model may serve as a suitable choice. The dynamical transfer of hosts depicted in Figure 24 can be formulated as follows, where N = S + E + I. On solving (67)-(69) for S, E, and I, we obtain E1 and E2, which represent the disease-free and endemic equilibrium points, respectively, where c1 = γ/(ε + γ), c2 = ε/(ε + γ), and R0 = β/γ. As shown in Figure 25, the system converges to E1 when R0 ≤ 1 and to E2 when R0 > 1. The asymptotic behavior of the model can be analyzed by studying the stability conditions of the system near its equilibrium points. The Jacobian matrix is formed by using (67) and (69). At E1, the matrix J yields two eigenvalues; once again, in order for E1 to be stable, both eigenvalues should be negative, and since λ1 is already negative, λ2 should be less than zero. Hence, for λ2 to be negative, −γ − ε + √((γ − ε)² + 4βε) should be negative, which on further simplification implies that R0 < 1. Therefore, E1 is a stable DFE when R0 < 1 and is unstable when R0 > 1. Similarly, calculating the eigenvalues of J evaluated at E2 results in another pair of eigenvalues. Since λ1 is always negative, the stability of E2 depends on λ2: for λ2 < 0, we observe that E2 is a stable EE; on the contrary, when λ2 > 0 (or equivalently, R0 < 1), the equilibrium point is unstable. This is clearly shown in Figure 26, where the stability of the system at the equilibrium points depends on R0; Figure 26a shows a stable endemic equilibrium.
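The convergence behavior in Figure 25 (to E1 for R0 ≤ 1 with R0 = β/γ, to E2 otherwise) can be reproduced by direct integration. A sketch, assuming the standard SEIS equations S' = −βSI/N + γI, E' = βSI/N − εE, I' = εE − γI; all parameter values in the usage below are illustrative:

```python
def simulate_seis(beta, gamma, eps, n=1.0, e0=0.01, i0=0.01,
                  dt=0.01, t_max=1500.0):
    """Forward-Euler integration of the SEIS system
    S' = -beta*S*I/N + gamma*I, E' = beta*S*I/N - eps*E, I' = eps*E - gamma*I."""
    s, e, i = n - e0 - i0, e0, i0
    for _ in range(int(t_max / dt)):
        inc = beta * s * i / n  # incidence from the old state
        # simultaneous update keeps S + E + I exactly conserved
        s, e, i = (s + dt * (gamma * i - inc),
                   e + dt * (inc - eps * e),
                   i + dt * (eps * e - gamma * i))
    return s, e, i
```

With β = 0.5, γ = 0.25, ε = 0.4 (so R0 = 2), the run settles at the endemic point S* = N/R0, I* = c2·N(1 − 1/R0); with β = 0.2 (R0 = 0.8), the infection dies out.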
As observed in previous cases, the model results in a forward bifurcation at R0 = 1 when switching from one steady state to another; thus, the stability of E1 is persistent at R0 = 1. In this subsection, we consider the SEIS model for a population of size N with constant birth and death rates. The set of equations given below refers to the scheme illustrated in Figure 27. The following two equilibrium points are calculated by setting b = µ and the time-derivatives in (74)-(76) to zero, and solving for S, E, and I: E1: (S*, E*, I*) = (N, 0, 0), and E2, where c1, c2, and R0 are defined as (γ + µ)/(γ + ε + µ), ε/(γ + ε + µ), and βε/((ε + µ)(γ + µ)), respectively. For R0 ≤ 1, the system approaches E1, which is disease-free, while for R0 > 1, it ends up at E2. The Jacobian matrix for this system is given in (78). By evaluating the matrix in (78) at E1, we see that the equilibrium point is a stable disease-free equilibrium when both of its eigenvalues are negative and is unstable if the eigenvalues have opposite signs; in terms of R0, E1 is stable if R0 < 1 and unstable if R0 > 1. Doing the same for E2 yields a more complex pair of eigenvalues; however, on simplifying them, one can easily conclude that E2 is stable when R0 > 1 and unstable otherwise. At R0 = 1, the system changes from E1 to E2, resulting in a forward bifurcation. A few works that reveal interesting bifurcation behaviors in SEIS models can be found in [33, 34, 35]. Occurrences of certain events in nature and society may influence the behavior of the epidemic models seen so far. To guarantee that the model of choice mimics its real-world counterpart, such influential factors should be taken into consideration. Many of these factors result in models with additional compartments, which makes them more complicated for mathematical analysis.
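The eigenvalue sign condition at E1 for the vital-dynamics SEIS model reduces to a 2×2 determinant test. A sketch, assuming the (E, I) subsystem Jacobian at the DFE is [[−(ε+µ), β], [ε, −(γ+µ)]] (a standard linearization at S = N, stated here as an assumption since (78) is not reproduced in the text); the parameter values in the checks are illustrative:

```python
import math

def seis_r0(beta, eps, gamma, mu):
    """R0 = beta*eps / ((eps + mu)*(gamma + mu)) for SEIS with vital dynamics."""
    return beta * eps / ((eps + mu) * (gamma + mu))

def dfe_eigs(beta, eps, gamma, mu):
    """Eigenvalues of the 2x2 Jacobian [[-(eps+mu), beta], [eps, -(gamma+mu)]];
    the discriminant (a-d)^2 + 4*beta*eps is positive, so both are real."""
    tr = -(eps + mu) - (gamma + mu)
    det = (eps + mu) * (gamma + mu) - beta * eps
    disc = math.sqrt(tr * tr - 4.0 * det)
    return (tr - disc) / 2.0, (tr + disc) / 2.0
```

Since the trace is always negative, the larger eigenvalue is negative exactly when the determinant is positive, i.e. when R0 < 1, matching the threshold statement above.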
In this section, a concise description of some of the most prominent factors considered in epidemic modeling is provided. The time period between exposure and the onset of infectiousness is defined as the latent period. This is slightly different from the definition of the incubation period, which is the time interval between exposure and the appearance of the first symptom of the disease in question. Thus, the latent period could be shorter or even longer than the incubation period. As seen before, SEI and SEIS are two examples of models with a latent period. Models with higher dimensions such as SEIR, SEIRS, MSEIR, and MSEIRS have also been reported in the literature [36]. However, since these models cannot be reduced to planar differential equation systems due to their complexity, only a few complete analytic results have been obtained. In the absence of vaccination for an outbreak of a new disease, isolation of diagnosed infectives and quarantine of people who are suspected of having been infected are some of the few control measures available. Models such as SIQS and SIQR with a quarantined compartment, denoted by Q(t), assume that all infectives go through the quarantined compartment before recovering or becoming susceptible again [37]. However, vaccination, if available, is one of the most cost-effective methods of preventing disease spread in a population. We refer the reader to [14], [15], and [38] for more on models with vaccination and vaccine efficacy. Models with time delay deal with the fact that the dynamic behavior of disease transmission at time t depends not only on the current state but also on the states at previous times [31], [39]. Time delays are of two types, namely, discrete (or fixed) delay and continuous (or distributed) delay. In the case of a discrete time delay, the behavior of the model at time t depends on the state at time t − τ as well, where τ is some fixed constant.
As an example, with τ being the latent period for some disease, the number of infectives at time t also depends on the number of infectives at time t − τ. On the contrary, the behavior of a model with continuous delay at time t depends on the states during the whole period prior to t. Since individuals in different age groups have different infection and mortality rates, age is considered to be an important characteristic in modeling infectious diseases. Mostly, cases of sexually transmitted diseases (STDs) such as AIDS occur in younger individuals, as they tend to be more active within or between populations. Likewise, malaria is responsible for nearly half of the deaths of infants under the age of 5, due to their weak immune systems. Hence, such facts highlight the importance of age structure in epidemic modeling. Age-structured models are broadly classified into three types, namely, discrete [40], continuous [41], and age groups or stages [42]. In epidemiology, multi-group models describe the spread of infectious diseases in heterogeneous populations, where each heterogeneous host population can be divided into several homogeneous groups in terms of geographic distributions, modes of transmission, and contact patterns. One of the pioneering models with multiple groups was investigated by Lajmanovich et al. in [43] for the transmission of gonorrhea. However, recent studies include differentiation of susceptibility to infection (DS) due to genetic variation of susceptible individuals, variation in infectiousness, and disease spread in competing populations [31]. A central assumption in the classical models seen so far is that the rate of new infections is proportional to the mass action term; in these models, we assumed that the infected and susceptible individuals mix homogeneously.
An increasingly important issue in epidemiology is how to extend these classical formulations to adequately describe the spatial heterogeneity in the distribution of susceptible and infected people, and in the parameters of the spread of the infection, observed in both experimental data and computer simulations. Diffusion or migration of individuals in space is of two types: migration among different patches and continuous diffusion in space. In the former type, the migration of individuals between patches depends on the connectivity of the patches; models with migration between two patches [44] and n patches [45] have been reported in the literature. The latter type, on the other hand, takes into account the fact that the distribution of individuals and their interactions depend not only on the time t but also on the location in a given space. Most of the classical epidemic models admit threshold dynamics, i.e., a DFE is stable if R0 < 1 and an EE is stable if R0 > 1. However, Capasso et al. [46] showed that it is possible for a DFE and an EE to be stable simultaneously. Furthermore, periodic oscillations have been observed in the incidence of various diseases, including mumps, chickenpox, influenza, and the like. The question that arises is: why are classical epidemic models unable to capture these periodic phenomena? The main reason is the nature of the force of infection. Classical models frequently use mass action incidence and standard incidence, which imply that the contact rate and infection probability per contact are constant in time. Nonetheless, it is more realistic (with added complexity) to consider the force of infection as a periodic function in time. As a simple example, consider the SIR model with a periodic incidence function f(I, t), and with birth and death rates taken to be µ, as given in [47]. In recent years, much attention has been given to the study and analysis of chaotic behavior in epidemic models with non-linear infection forces.
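The effect of a periodic force of infection can be explored numerically. A sketch, assuming one specific seasonal form f(I, t) = β0(1 + a·cos(2πt))SI (an illustrative choice, not necessarily the incidence used in [47]), with balanced birth and death rate µ and N = 1:

```python
import math

def simulate_forced_sir(beta0=0.5, a=0.2, gamma=0.1, mu=0.02,
                        s0=0.9, i0=0.1, dt=0.005, t_max=400.0):
    """Forward-Euler SIR with periodic transmission beta(t) = beta0*(1 + a*cos(2*pi*t)):
    S' = mu - beta(t)*S*I - mu*S, I' = beta(t)*S*I - (gamma+mu)*I, R' = gamma*I - mu*R."""
    s, i, r = s0, i0, 1.0 - s0 - i0
    t = 0.0
    for _ in range(int(t_max / dt)):
        beta = beta0 * (1.0 + a * math.cos(2.0 * math.pi * t))
        inc = beta * s * i
        # simultaneous update preserves S + I + R = 1
        s, i, r = (s + dt * (mu - inc - mu * s),
                   i + dt * (inc - (gamma + mu) * i),
                   r + dt * (gamma * i - mu * r))
        t += dt
    return s, i, r
```

With mean R0 = β0/(γ + µ) > 1, the infected fraction persists and oscillates around the endemic level of the unforced model instead of settling to a constant.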
Mathematical modeling of communicable diseases has received considerable attention over the last fifty years, and a wide range of studies on epidemic models has been reported in the literature. However, a comprehensive study on understanding the dynamics of simple deterministic models through implementation has been lacking. Aiming at filling such a gap, this work introduced some widely appreciated epidemic models and studied each in terms of mathematical formulation, stability analysis near the equilibrium points, and threshold dynamics with the aid of Mathematica. In addition, important factors that may be considered for better modeling were also presented. We believe that this article would serve as a good starting point for readers new to this research area and/or with little mathematical background.

References:
- Infectious diseases and human population history
- Viruses, plagues, and history
- Deterministic modeling of infectious diseases: theory and methods
- Stochastic epidemic models: a survey
- An introduction to stochastic epidemic models
- Compartmental models for epidemics
- Department of Mathematics, University of British Columbia
- The mathematics of infectious diseases
- Epidemic disease in England: the evidence of variability and of persistency of type
- A contribution to the mathematical theory of epidemics
- Population biology of infectious diseases: Part I
- Epidemiological models with age structure, proportionate mixing, and cross-immunity
- A thousand and one epidemic models
- Modeling infectious diseases in humans and animals
- An introduction to infectious disease modeling
- Nonlinear systems
- Infectious diseases of humans: dynamics and control
- Bifurcation thresholds in an SIR model with information-dependent vaccination
- Bifurcation analysis in an SIR epidemic model with birth pulse and pulse vaccination
- Stability and bifurcations in an epidemic model with varying immunity period
- Qualitative and bifurcation analysis using an SIR model with a saturated treatment function
- On the global stability of SIS, SIR and SIRS epidemic models with standard incidence
- A simple SIS epidemic model with a backward bifurcation
- Bifurcation analysis of an SIS epidemic model with nonlinear birth rate
- Backward bifurcation in simple SIS model
- Bifurcation and chaos in SIS epidemic model
- Stability of Hopf bifurcation of a delayed SIRS epidemic model with stage structure
- Bifurcation analysis of an SIRS epidemic model with generalized incidence
- Bifurcations of an SIRS epidemic model with nonlinear incidence rate
- Dynamical modeling and analysis of epidemics
- Population dynamics of fox rabies in Europe
- Some epidemiological models with nonlinear incidence
- Complex dynamics of discrete SEIS models with simple demography
- Dynamic analysis of an SEIS model with bilinear incidence rate
- Mathematical understanding of infectious disease dynamics
- Effects of quarantine in six endemic models for infectious diseases
- Global results for an epidemic model with vaccination that exhibits backward bifurcation
- Time delays in epidemic models: modeling and numerical considerations
- Dynamics of a discrete age-structured SIS models
- Continuous-time age-structured models in population dynamics and epidemiology
- An age-structured model for pertussis transmission
- A deterministic model for gonorrhea in a nonhomogeneous population
- Qualitative analysis for communicable disease models
- An epidemic model in a patchy environment
- Analysis of a reaction-diffusion system modeling man-environment-man epidemics
- Melnikov analysis of chaos in an epidemiological model with almost periodic incidence rates

key: cord-283092-t3yqsac3 authors: shah, kamal; abdeljawad, thabet; mahariq, ibrahim; jarad, fahd title: qualitative analysis of a mathematical model in the time of covid-19 date: 2020-05-25 journal: biomed res int doi: 10.1155/2020/5098598 sha: doc_id: 283092 cord_uid: t3yqsac3

In this article, a qualitative analysis of the mathematical model of the novel coronavirus named COVID-19 under a nonsingular derivative of fractional order is considered. The concerned model is composed of two compartments, namely, healthy and infected. Under the new nonsingular derivative, we first establish some sufficient conditions for the existence and uniqueness of a solution to the model under consideration. Because the dynamics of the phenomenon are described by a mathematical model, the existence of its solution must be guaranteed; therefore, via the classical fixed point theory, we establish the required results. Also, we present stability results of Ulam's type by using the tools of nonlinear analysis. For the semianalytical results, we extend the usual Laplace transform coupled with the Adomian decomposition method to obtain approximate solutions for the corresponding compartments of the considered model. Finally, in order to support our study, graphical interpretations are provided to illustrate the results by using some numerical values for the corresponding parameters of the model.
Mathematical models are powerful tools to study various physical phenomena of real-world problems. The respective idea was initiated by Bernoulli in 1776. After that, the first mathematical model of infectious disease was formulated in 1927 by McKendrick and Kermack. Following that, this area received considerable attention, and many models describing numerous physical or biological processes were formed; the reader may refer to [1-5] for more information about some models. By using mathematical models for the description of infectious diseases, we can obtain information about the transmission of a disease in a community, its mortality rates, and how to control it. Therefore, this area has been developed very well in the last few decades (see [6-9]). Several outbreaks in the form of pandemics have occurred, such as those in 1920 and 1967, in which more than 100 million people died. At the start of this century, there also occurred some outbreaks in Saudi Arabia, China, and Mexico, in which thousands of people lost their lives. But due to the rapid advancement of medical science, vaccines were prepared that made those diseases curable. At the end of 2019, a serious outbreak occurred in Hubei province of China due to a virus of the corona family, which has been named the novel COVID-19. This outbreak is in progress, and more than three million people have been infected in almost every country of the globe. Nearly 0.22 million people have died due to this disease. Three months have passed, but, to date, neither a proper cure nor a suitable vaccine has been prepared [10]. The World Health Organization (WHO) reported the presence of a novel coronavirus (2019-nCoV) in Wuhan city, Hubei province of China, on 31 December 2019. The virus which caused this infection belongs to the same family as SARS. Investigating the present literature, there are many theories behind the origin of the virus.
Some researchers have suggested that it passed from bats to humans due to the unlawful trade of animals in a seafood market in Wuhan. The concerned virus has also been identified in pangolins and in dogs. Accordingly, many infected cases claimed that they had been working in a local fish and wild animal market in Wuhan, from where they caught the coronavirus-19 infection. After that, researchers confirmed that the wide spread of the disease is due to person-to-person contact. Also, the mentioned city of the Republic of China is a great trade center, from where the infection was transferred to many countries of the world through immigration; for details, we refer to [11, 12]. Keeping in mind what we have mentioned above, numerous researchers started to model the disease to figure out its properties in different ways. Recently, the researchers in [13] constructed the following mathematical model under the ordinary derivative, as a modification of some previously studied prey-predator models [14-16]: In the above model, h stands for healthy individuals, i for infected individuals, and β for the infection rate, while the rate of immigration of healthy individuals from one place to another is denoted by α. Further, the immigration rate of infected individuals is γ, the death rate is denoted by δ, and the cure rate by ρ. Since immigration of people is also a major cause of the spread of this disease, it is pertinent to check the impact of the immigration of individuals on the transmission dynamics of the current disease. On the other hand, such a study may help in forming some precautionary measures to protect more people from this infection. The study of mathematical models under fractional derivatives instead of the usual ordinary derivatives produces more significant results, which are more helpful for understanding.
In fact, numerous fractional order derivatives have been introduced and used in the literature, including the Caputo and Riemann-Liouville derivatives, which are the most popular differential operators. There is a large number of applications of fractional calculus in real-world problems; see [17-23]. Recently, some authors replaced the singular kernel in classical nonlocal fractional derivatives by a nonsingular kernel of Mittag-Leffler type; for details, see [24, 25] and the references therein. It is remarkable that fractional derivatives are in fact defined by means of convolutions, which contain ordinary derivatives as a special case. Further, the geometry of fractional derivatives tells us about the accumulation of the whole function. Actually, fractional operators, whether with singular or nonsingular kernels, are nonlocal with a memory effect, unlike ordinary differential operators, which are local in nature. Fractional order operators involving Mittag-Leffler kernels have proved to be as practical and efficient as the classical nonlocal fractional operators; see [25-29]. Further, investigating dynamical problems under fractional derivatives instead of integer order derivatives produces the global dynamics of the concerned problems, which include the integer order derivative as a special case [30-36]. Inspired by the aforesaid discussion, in this paper we investigate the COVID-19 model (1) under the new type of derivative as model (2), where ABC_D^θ_0 stands for the Atangana-Baleanu-Caputo (ABC) derivative of order 1 > θ > 0. We shall analyze the above model from various aspects, including existence theory and series-type solutions. We shall also investigate some stability results of Ulam's type for the considered model. For the existence theory, we use the classical fixed point theorems of Krasnoselskii and Banach.
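The Mittag-Leffler function E_θ that underlies these nonsingular kernels can be evaluated directly from its one-parameter series E_θ(z) = Σ_{k≥0} z^k / Γ(θk + 1). A minimal sketch by direct truncation (the truncation length is an illustrative choice, adequate for moderate |z| but not for large arguments):

```python
import math

def mittag_leffler(theta, z, terms=120):
    """One-parameter Mittag-Leffler function E_theta(z),
    evaluated by truncating the series sum_k z**k / Gamma(theta*k + 1)."""
    return sum(z ** k / math.gamma(theta * k + 1) for k in range(terms))
```

Sanity checks: E_1(z) recovers exp(z), and E_{1/2}(z) satisfies the classical identity E_{1/2}(z) = exp(z^2)·erfc(−z).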
In addition to the series-type solutions, we shall use the integral transform of Laplace and the decomposition technique of Adomian. Numerical interpretations are given via graphs to demonstrate the obtained results. Also, it is necessary that the right-hand side of the above COVID-19 ABC-model vanish at 0 (see Theorem 3.1 in [30]). 1.1. Organization of the paper. Section 1 is devoted to the introduction of the paper. In Section 2, some fundamental results are given. In Section 3, we establish the existence results, while in Section 4, the required analytical results are constructed. Section 5 is related to the graphical presentation of the results and their discussion. In Section 6, we provide a brief conclusion and some future directions. Here, we provide some necessary results that may be found in [29] and the references therein, such as [24, 25]. Definition 2.1. If φ ∈ H1(0, T) and θ ∈ (0, 1], then the ABC derivative is defined by ABC_D^θ_0 φ(t) = (K(θ)/(1 − θ)) ∫_0^t φ'(y) E_θ[(−θ/(1 − θ))(t − y)^θ] dy. We remark that if we replace E_θ[(−θ/(1 − θ))(t − y)^θ] by E_1 = exp[(−θ/(1 − θ))(t − y)], then we get the so-called Caputo-Fabrizio differential operator. Further, it is to be noted that K(θ) is known as the normalization function, which satisfies K(0) = K(1) = 1. Also, E_θ stands for the famous special function called Mittag-Leffler, which is a generalization of the exponential function [17-19]. Definition 2.2. Let φ ∈ L[0, T]; then the corresponding integral in the ABC sense is given by Lemma 2.3 (see Proposition 3 in [28]). The solution of the given problem for 1 > θ > 0 is provided by Definition 2.4. The Laplace transform of the ABC derivative of a function φ(t) is defined by Note: for the qualitative analysis, we define the Banach space Z = X × X, where X = C[0, T], under the norm ‖(h, i)‖ = max_{t∈[0,T]} [|h(t)| + |i(t)|]. The following fixed point theorem will be used to proceed to our main results. Theorem 2.5 [36].
Let B be a convex subset of Z and assume that F and G are two operators with (1) F(h, i) + G(h, i) ∈ B for every (h, i) ∈ B, (2) F a contraction, and (3) G continuous and compact. Then the operator equation F(h, i) + G(h, i) = (h, i) has at least one solution. Here, we are going to discuss the existence and uniqueness of the solution for our main model. Let us write model (2) as a fixed-point problem; if we apply the fractional integral AB_I^θ_0 of order θ on both sides of (9) and make use of Lemma 2.3 together with the initial conditions, we obtain the equivalent integral equations. To derive the existence and uniqueness, we impose some growth conditions on the nonlinear functions f, g. Under the continuity of f, g together with assumption (A2), system (5) has at least one solution. Proof. With the help of Krasnoselskii's fixed point theorem, we shall prove the existence result. We define the operators F = (F1, F2) and G = (G1, G2) by using (6); from (8) and (9), one obtains a bound which implies that F is a contraction. Let us define a closed subset B of Z. For G to be compact and continuous, let (h, i) ∈ B be arbitrary; from (10), we obtain a bound, and hence G is bounded. Next, we show that G is equicontinuous. Let t1 < t2 ∈ [0, T]; as t1 → t2, the right-hand sides of the resulting estimates tend to zero, and consequently we claim that G is an equicontinuous operator. By the Arzelà-Ascoli theorem, the operator G is completely continuous and, as already proved, uniformly bounded. Hence, G is relatively compact. By Krasnoselskii's fixed point theorem, the given system has at least one solution. Next, we establish a result about the uniqueness of the solution, with max{L_f, L_g} = L. Proof. Define the operator P = (P1, P2): Z → Z using (6); in the same fashion, from (16) and (17), we obtain an estimate showing that P is a contraction.
By the Banach contraction theorem, the considered system has a unique solution. Next, we give a result about Ulam-Hyers stability under the condition |a + b| < 1. Proof. Let (h, i) ∈ Z be any solution of model (2) and (h̄, ī) ∈ Z the unique solution of the same model; then we obtain an estimate in which a and b are given as in (19). Hence, the solution of the given system is Ulam-Hyers stable, since the eigenvalues of the square matrix are λ1 = 0 and λ2 = a + b, and the spectral radius of the matrix is max{|λ_i| : i = 1, 2} = |a + b| < 1. In this section, we apply the proposed novel analytical method to find the series-type solution of the suggested model (2). To this end, we take the Laplace transform of both sides of (2) and use the initial conditions; after rearranging the terms in (34), we seek the required solution in the form of an infinite series, taking the unknown solutions as series expansions. Further, the nonlinear term h(t)i(t) in system (2) may be decomposed in terms of Adomian polynomials, of which we compute a few terms for n = 0, 1, 2, …, and so on. Plugging the above series-type representation into (21) and comparing terms on both sides of (40), then taking the inverse Laplace transform on both sides, we obtain, after computation, a few terms of the series solution, and so on; in this way, the remaining terms are generated. Now, we take various values for the parameters, taken from [16], as α = 0.0, β = 0.03, γ = 0.05, δ = 0.05, and ρ = 0.05, and consider a random community where the total population is divided in such a way that 70 percent of the population is healthy and 30 percent is infected, that is, h(0) = 0.7, i(0) = 0.3. Clearly, using these values in model (2), we have L_f = 0.03, L_g = 0.03, K(θ) = 1, from which we have L = 0.03. Hence, the condition for the existence of at least one solution holds by Theorem 3.1. Also, the condition of Theorem 3.3 is valid under a suitable value of T.
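For the product nonlinearity h(t)i(t), the Adomian polynomials reduce to the Cauchy-product form A_n = Σ_{k=0}^{n} h_k · i_{n−k}, where h_k, i_k are the series components. A small sketch (the numeric component values in the test are arbitrary illustrations, not the model's actual series terms):

```python
def adomian_product(h, i):
    """Adomian polynomials A_n for the nonlinearity N(h, i) = h*i,
    which for a product reduce to the Cauchy product
    A_n = sum_{k=0}^{n} h[k] * i[n-k]."""
    n = min(len(h), len(i))
    return [sum(h[k] * i[m - k] for k in range(m + 1)) for m in range(n)]
```

For example, A_0 = h_0 i_0, A_1 = h_0 i_1 + h_1 i_0, and A_2 = h_0 i_2 + h_1 i_1 + h_2 i_0, matching the pattern of the few terms computed in the text.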
In the current situation, the solution is going to become stable. Further, taking K(θ) = 1, we compute a few terms from (43) of the series solution up to four terms, and so on. We plot the solutions (43) for different fractional orders by using MATLAB in Figures 1-4, evaluating the solution up to twenty terms. From Figure 1, we see that when the rate of healthy immigrants is zero, meaning the protection rate is increasing, the population of the infected class is decreasing while the population of the healthy class is increasing at different rates due to the fractional order derivative. As the order increases, the growth rate of the healthy class increases and thus becomes stable sooner, as compared to smaller fractional orders. On the other hand, the decay of the infected class is fastest at small fractional orders as compared to large orders. Thus, in this case, stability is achieved first at the smallest fractional order derivative rather than at the greater order. Further, at the given values of the parameters, the infection will take between 220 and 250 days to vanish in a locality. In Figure 2, in the presence of immigration and a lower protection rate, we plot the solution corresponding to different fractional orders; we see that infection is increasing while the population of the healthy class is decreasing at various rates due to the fractional order. From Figure 3, when we involve immigration of the infected class and cease immigration for the healthy class, we see that the population density of the infected class goes up at different rates due to the fractional order derivative during the first 250 days. On the other hand, the healthy population is unstable in the first 50 days; that is, it increases and then decreases suddenly. To achieve a stable position, it requires nearly 110 days.
This means that immigration of the infected population from one place to another will cause instability in the healthy population of a community. From Figure 4, we see that the steady increase in both populations is due to immigration of the healthy population combined with a strong protection rate, at different fractional orders. In this article, we have examined a population model of the novel COVID-19 under ABC fractional order derivatives. We have proved sufficient results about the existence and uniqueness of a solution for the considered model and proved that it has at least one solution. Hence, fixed point theory always works as an effective tool that can be used to check the existence and uniqueness of solutions of various physical problems. A stability result has also been established. Through a novel method, we have derived approximate solutions for the corresponding compartments of the model under investigation. Further, some numerical results have been presented for different fractional orders through MATLAB by taking various values of the immigration rates. We observed that as the immigration of the infected class increases, it causes a decrease in the healthy population, and hence the population of infected people increases. Therefore, an important factor which increases the infection of the current outbreak is free immigration. When people do not avoid unnecessary traveling from one place to another, there is a greater chance of infection. If this term decreases, then infection may be sufficiently decreased in a population. If the immigration of infected people is strictly controlled in a society, then we may protect our society from further hazard. In the future, the concerned model may be further extended by involving an exposed class, a recovered class, and an asymptomatically infected class to form a five-compartment model.
This new model will further produce more significant information, on the basis of which better control policies and procedures may be designed to save our society from this infection.

No data were used to support this study. There exists no competing interest regarding this research work.

References:
- Semianalytical study of pine wilt disease model with convex rate under Caputo-Fabrizio fractional order derivative
- Dynamic behavior of leptospirosis disease with saturated incidence rate
- Global stability analysis and control of leptospirosis
- Complex mathematical models of biology at the nanoscale
- Mathematical oncology: cancer summed up
- A pneumonia outbreak associated with a new coronavirus of probable bat origin
- Pneumonia of unknown aetiology in Wuhan, China: potential for international spread via commercial air travel
- Outbreak of pneumonia of unknown etiology in Wuhan, China: the mystery and the miracle
- Cross-species transmission of the newly identified coronavirus 2019-nCoV
- Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia
- On the Volterra and other nonlinear models of interacting populations
- Contribution to the theory of periodic reactions
- A mathematical model in the time of COVID-19, a preprint
- Fractional differential equations, Mathematics in Science and Engineering
- Theory of fractional dynamic systems
- Applications of fractional calculus in physics
- Applications of fractional calculus to dynamic problems of linear and nonlinear hereditary mechanics of solids
- Theory and application of fractional differential equations
- Nagumo-type uniqueness result for fractional differential equations
- An introduction to the fractional calculus and fractional differential equations
- Integration by parts and its applications of a new nonlocal fractional derivative with Mittag-Leffler nonsingular kernel
- New fractional derivative with non-local and non-singular kernel
- A fractional mathematical model of breast cancer competition model
- Electrical circuits RC, LC, and RL described by Atangana-Baleanu fractional derivatives
- Discrete fractional differences with nonsingular discrete Mittag-Leffler kernels
- Fractional derivatives with Mittag-Leffler kernel: trends and applications in science and engineering
- On a class of ordinary differential equations in the frame of Atangana-Baleanu fractional derivative
- New fractional derivative with non-singular kernel for deriving Legendre spectral collocation method
- On exact solutions for time-fractional Korteweg-de Vries and Korteweg-de Vries-Burgers equations
- Using homotopy analysis transform method on a new modified fractional analysis of Nagumo equation
- A comparative study on solving fractional cubic isothermal auto-catalytic chemical system via new efficient technique
- Numerical treatment for studying the blood ethanol concentration systems with different forms of fractional derivatives
- A fixed-point theorem of Krasnoselskii

All authors have read and approved the final revised version of this paper. The second and third authors will financially support this article from their own sources. All authors have equal contribution in this work. The first and second authors were involved in the analysis, while the last two authors have done the verification and checking of the formal analysis of this paper.

key: cord-277128-g90hp8j7 authors: rajendran, sukumar; jayagopal, prabhu title: accessing covid19 epidemic outbreak in tamilnadu and the impact of lockdown through epidemiological models and dynamic systems date: 2020-09-17 journal: measurement (lond) doi: 10.1016/j.measurement.2020.108432 sha: doc_id: 277128 cord_uid: g90hp8j7

Despite having a small-footprint origin, COVID-19 has expanded its clutches to become a global pandemic with severe consequences threatening the survival of the human species, even as international communities close their corridors to reduce its exponential spread.
the need to study the patterns of transmission and spread gains utmost importance at the grass-root level of the social structure. the aim is to determine the impact of lockdown and social distancing in tamilnadu through epidemiological models, forecasting the "effective reproductive number" (r0) and determining the significance of the transmission rate in tamilnadu after the first covid19 case confirmation on march 07, 2020. web scraping techniques are utilized to extract data from different online sources to determine the probable transmission rate into tamilnadu from the rest of the indian states. the different epidemiological models (sir, seir) are compared in forecasting and assessing the current and future spread of covid-19. the r0 value shows a high spike in densely populated districts, with a probable flattening of the curve due to lockdown and a rapid rise after the relaxation of lockdown. as of june 03, 2020, there were 25,872 confirmed cases and 208 deaths in tamilnadu after two and a half months of lockdown with minimal exceptions. according to the information published online by the tamilnadu state government on june 03, 2020, the fatality rate is 1.8% (208/11345 = 1.8%); with those aged 0-12 at 1437, 13-60 at 21899, and 60+ at 2536, the risk of symptomatic infection increases with age and comorbid conditions. the thrust of this global pandemic is disrupting the natural flow of human existence via covid-19 of the "common cold" coronavirus family, following its earlier relatives severe acute respiratory syndrome (sars) (2001-2003) and middle east respiratory syndrome (mers) (2012-2015). sars-cov and covid19 spread from infected civets and bats, while mers-cov originated from dromedary camels, resulting in epidemics as corroborated by scientific research. this paper intends to provide a detailed view of the different mathematical models in predicting the spread of covid19.
modeling provides a detailed view in determining the spread of covid19, taking into account the influences of provincial factors, thereby providing a fundamental understanding through quantitative and qualitative inferences during these uncertain times. the model utilizes population dynamics and conditional dependencies such as new cases, deaths, social distancing, and herd immunity over a stipulated time period to simulate probable outcomes. the social stigma determines the rate of spread and is specific to the region, religious practices, and the different structured congregations of the masses. a thorough study of different epidemiological models is presented in the subsequent section, along with a comparative analysis of the various factors and socio-economic needs that influence the models' capabilities in determining the spread of the epidemic. the epidemiological model provides a system to define and determine interventions based on statistics of collected data, predicting the probability of determining the development, prevention, and control of diseases. epidemiology models are classified into respective categories based on chance variations (stochastic & deterministic), time (continuous & discrete intervals), space (spatial & non-spatial), and population (homogeneous & heterogeneous) to predict the dispersion of diseases (hethcote, 2000). the factors governing the spread are infectious agents, modes of transmission, susceptibility, and immunity (chowell et al., 2006). the case fatality rate (cfr) is highly variable and increases with severe respiratory symptoms in adults with comorbid conditions. in the present scenario specified in this paper, there is currently no specific vaccine available for covid19, despite the quarantines and lockdowns imposed. the authors also specify a need to accelerate protocols for rapid point-of-care diagnostic testing and effective personal care in preventing the transmission and spread of the disease.
the authors provide a time-dependent model more adaptive than the existing models, but with a way of further improving the results and predictions by taking into account the disease propagation probabilities and transmission rates (del rio & malani, 2020). the portals and dashboards created by the who (organisation, n.d.) and tamilnadu utilize a language library (python) designed for covid19. the reed-frost theory (abbey, 1952) represents soper's equation as follows:

c(t+1) = s(t) (1 - q^c(t))   (1)

the proposed equation (1) helps in determining the cases in successive time periods, where c(t) → number of cases at time t, s(t) → number of susceptibles, q → probability of an individual having no contact with a given case, and 1 - q^c(t) → probability of an individual having contact with at least one of the c(t) cases. the reed-frost theory (abbey, 1952) further states that

s(t+1) = s(t) - c(t+1)   (2)

and for two populations a and b, the theory states in equation (4) that an individual is not infected from the cases in population b with probability

q_b^c_b(t)   (4)

but is infected from any one of the cases in either of population a or b with probability

1 - q_a^c_a(t) q_b^c_b(t)   (5)

the cases infected from the population a in time t+1 are

c_a(t+1) = s_a(t) (1 - q_a^c_a(t) q_b^c_b(t))   (6)

and the cases infected from the population b in time t+1 are

c_b(t+1) = s_b(t) (1 - q_a^c_a(t) q_b^c_b(t))   (7)

both equations (6) & (7) determine the introduction of cases into the population, which forms the initial spread for epidemics. the compartmental structure of the different epidemiological models si, sis, sir, sirs, sei, seis, seir, seirs, mseir, mseirs is represented in figure 1. the dynamics of covid19 are mapped into a compartmental model that generates the mean-field approximations to ensemble or population dynamics. four latent factors determine the distributions: i. location, . . . , the transmission probability, iv. diagnostic and testing. a master equation governs the dynamic part of the dynamic causal model. case fatality rate (cfr) = number of people who died / number of confirmed cases. the reproductive number (r 0 ), called "r naught", defines the average number of secondary infections caused by an infected individual introduced into an uninfected population. a value of r 0 > 1 signifies an epidemic, while r 0 < 1 defines the disease as being eliminated.
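as an illustrative aside, the reed-frost recursion of equations (1)-(2) can be sketched in a few lines of python; this is the deterministic (expected-value) form, and the numbers used are illustrative, not the paper's data.

```python
def reed_frost(s0, c0, q, steps):
    """Deterministic (expected-value) Reed-Frost chain:
       c_{t+1} = s_t * (1 - q**c_t)   (eq. 1)
       s_{t+1} = s_t - c_{t+1}        (eq. 2)
    where q is the probability of having no contact with a given case."""
    s, c = float(s0), float(c0)
    history = [(c, s)]
    for _ in range(steps):
        new_c = s * (1.0 - q ** c)  # chance of contact with at least one case
        s -= new_c
        c = new_c
        history.append((c, s))
    return history
```

for instance, with 100 susceptibles, one initial case, and q = 0.98, the expected second generation is 100 * (1 - 0.98) = 2 cases.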
there has been a high spread from the interstate and intercontinental travellers to tamilnadu state. air & rail travel has eased the transmission and spread of the covid19 disease in the rural and urban districts of tamilnadu. one way to measure the clustered spread of covid19 is to determine the geolocation of the infected individual as the individual enters a community with zero infected individuals (sugiura, 1978). in order to measure the intensity of the spread, each individual is measured as the centroid of the cluster (li et al., 2020). the dependent features are the age, the embarking source, and the destination vectors (chowell et al., 2006). the specific clusters identified in tamilnadu are the delhi cluster and the koyambedu cluster (majumder & mandl, 2020); the data defined for each cluster is as shown in the table. the age-structured epidemiological models are differential equations specific to the changing size and age structure of a population over time. these models work on continuous age and on age groups, as partial and ordinary differential equations respectively. the continuous age model, a demographic model with continuous age, is determined by a partial differential equation for the population growth. this population model is called the lotka-mckendrick model (barbu et al., 2001):

∂p(t,a)/∂t + ∂p(t,a)/∂a + μ(a) p(t,a) = 0,  p(t,0) = ∫ β(a) p(t,a) da,  p(0,a) = p 0 (a),

where p(t,a) is the population density of age a at time t, μ(a) is the age-specific mortality, β(a) is the age-specific fertility, and p 0 (a) is the initial population age distribution. the classic kermack-mckendrick model extends to class-age-structured model equations; this epidemiological model assumes that there is a steady age distribution in the total population size, giving a demographic model with age groups. the government of india and the state of tamilnadu implemented lockdown in four periods, lockdown 1.0 to 4.0, from march till may, further extending it with leniency. this lockdown has reduced the rapid spread of covid19 through social distancing, reducing fatalities.
there are three main transmission modes: physical contact, respiratory droplets released through sneezing or coughing, and indirect contact with surfaces handled by an infected person. in the first stage, the probable transmission is from travelling individuals arriving from different geographical locations; the sources are highly identifiable, and isolation is high. the isolated individuals infect a small cluster of their surroundings, thereby creating sporadic pockets of infection, with each individual identified within the local cluster. in the next stage, the source is not identifiable, and vast masses of people are infected. the infection spreads through random patterns where random members start to get infected; this denotes the travel of individuals across state, district, and international locations. in the final stage, the spread becomes an epidemic with a large number of infected people and an increase in the number of deaths. after a while, the community starts to develop immunity to this specific strain of the virus, thereby eventually reducing transmission and death. there is also a possibility of the virus mutating and generating a second wave. the mapping of the transmission of covid19 is done through contact tracing, thereby isolating individuals infected by the epidemic at different epicentres of the society. lockdown reduced the spread of covid19 in tamilnadu, as represented in dotted lines in the figure below.

[figure: daily covid19 cases in tamilnadu from 3-18-2020 to 6-2-2020, with the lockdown period marked in dotted lines]

the attack rate contributes to the different cycles and the modes of the transmission of covid19. the implication depends on the population cluster size and the number of infected cases.
secondary attack rate: the sar is the infectability rate of individuals already infected by covid19 and recovered. the sar contributes to identifying whether recovered patients are immunized for a lifetime or only for a few months. the infected person enters the population and transmits to each person with probability p, with k the number of people re-infected while being contagious, for the transmission of secondary infections (kharroubi, 2020) through transmitting contact (efimov & ushirobira, 2020). given birth and death rates, the probabilistic model processes through different nodes as infected and recovered (yan et al., 2020), and the compartmental structures compared are the si model, sis model, sir model, and seir model. the figure below shows the indian state/ut-wise details of active cases of the covid19 pandemic in india till june 04. in this proposed study of covid19 infection in tamilnadu, a dataset of district-wise cases till 31 may 2020 is used. the status of individual district details is extracted from the health & family welfare department, government of tamilnadu. the district-wise details of the covid19 status are displayed in the sample table 2 (confirmed = active + recovered + deaths):

s.no | district | confirmed | active | recovered | deaths
27 | thirupathur | 32 | 28 | 4 | 0
28 | thiruvallur | 948 | 603 | 334 | 11
29 | thiruvannamalai | 419 | 144 | 273 | 2
30 | thiruvarur | 47 | 33 | 14 | 0
31 | thoothukudi | 226 | 135 | 89 | 2
32 | tirunelveli | 352 | 211 | 140 | 1
33 | tiruppur | 114 | 114 | 0 | 0
34 | trichy | 88 | 70 | 18 | 0
35 | vellore | 43 | | |

the different statistical measure models have evolved to predict the transmission of infectious viruses/bacteria. the different measures taken into account depend on the geographical structure, climatic condition, and region-specific human practices. infectious pathogens follow specific traits while infecting and spreading from host to host, creating a life cycle. the transmission rate and intensity may vary between interspecies transmission and human transmission, depending on whether the transmission is direct or intermediary.
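the si/sis/sir/seir family compared above can be sketched with a simple forward-euler discretization; the step functions below are a minimal illustration with made-up parameters, not the models fitted in this study.

```python
def sir_step(s, i, r, beta, gamma, n, dt):
    # one forward-Euler step of the classic SIR system
    new_inf = beta * s * i / n * dt   # new infections
    new_rec = gamma * i * dt          # new recoveries
    return s - new_inf, i + new_inf - new_rec, r + new_rec

def seir_step(s, e, i, r, beta, sigma, gamma, n, dt):
    # SEIR adds an exposed (incubating) compartment with rate sigma
    new_exp = beta * s * i / n * dt
    new_inf = sigma * e * dt
    new_rec = gamma * i * dt
    return s - new_exp, e + new_exp - new_inf, i + new_inf - new_rec, r + new_rec

def run_sir(beta, gamma, s0, i0, days, dt=0.05):
    s, i, r, n = float(s0), float(i0), 0.0, s0 + i0
    for _ in range(int(days / dt)):
        s, i, r = sir_step(s, i, r, beta, gamma, n, dt)
    return s, i, r
```

with beta/gamma = r0 = 2.5 an outbreak runs its course, while r0 = 0.5 fades out, matching the threshold behaviour described in the text.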
the infection rate is very low at the initial transmission between interspecies, while there is an exponential increase in human transmission. in the study of covid19 spread in tamilnadu, the transmission of infection from infected to susceptible persons is at 5,40,405. the recovery rate and the comorbidity rate of covid19 vary predominantly as geographical and climatic changes influence the spread. the attack rate and the susceptibility rate rapidly increase due to the adaptation of the virus strain to the individual's genetic morphology. susceptible individuals are isolated, with the impact of lockdown reducing the susceptibility rate of covid19. the time in which the infectious disease doubles in number follows n(t) = n0 · 2^(t/td), where n0 is the initial number of infected cases, td is the doubling period of the cases, and n(t) is the number of infected cases at time t. the doubling time is reduced to seven days as the spread of the infection amplifies with the different strains and clusters of infection. the peak period of the infection is gradually delayed as the imposition of the lockdown reduces the transmission cycle of covid19. this model represents the sir epidemic model for indian states with different geographical regions. the data is collected for a period of two and a half months and pre-processed to remove the missing and noisy data [13][19]. the transformation from raw data to geospatial data is possible through qgis, and the shapefiles created are mapped through arcgis [5][12]. a sample of the mapped data is shown in figure 14. the daily testing statistics of covid19 per lakh population in tamilnadu stood at 865 as on 15-06-2020, with 6,42,201 persons tested and 7,10,599 samples tested, and at 3760 as on 08-08-2020, with 29,75,657 persons tested.
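the doubling relation n(t) = n0 · 2^(t/td) used above can be inverted to estimate the doubling period from two case counts; a minimal sketch with illustrative numbers:

```python
import math

def doubling_time(n0, nt, t):
    # invert n(t) = n0 * 2**(t / td)  =>  td = t * ln(2) / ln(nt / n0)
    return t * math.log(2.0) / math.log(nt / n0)
```

for example, growth from 100 to 800 cases over 21 days gives a doubling period of 7 days.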
based on the data provided by the government of tamilnadu, the sir model, compared with the other compartmental models, projects that the outbreak will peak at the end of june and will start decreasing towards the end of august. based on this model and the different parameters, the lockdown, social distancing, and the application of face masks have postponed the infectious rate of transmission from may 2020 to june 2020. the proposed model compares age/gender-wise social distancing through the dynamic models. the model specifies the doubling time and the attack rate at which the infection spreads. the inclusion of the tamilnadu state district-wise rate of attack is necessary to determine the rate of spread of covid19 at the district, town, and village panchayat levels. the flattening of the curve is ascertained by the end of august in tamilnadu, as the r 0 values are at an inflection point where the curve attains a period of downtrend as herd immunity increases and social distancing and personal protection become a daily ritual. the estimated values of the data should be interpreted with caution, as the data may vary based on the climatic condition, geographical variations, and frequential history of susceptibility to infectious diseases. the important findings are to impose lockdowns that suppress the transmission of the disease and to isolate individuals with comorbidity. the covid19 r 0 value varies considerably for different models; moreover, the reinfection rate of the already infected is unknown, and the future impact of subsequent fatalities is unknown. the mathematical numerical analysis conducted in the paper is more adaptive than the traditional models. the model can be extended to study the infective rate and the asymptomatic infectious rate by including both the infected and the asymptomatic undetected individuals.
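the r0 threshold behaviour referenced here (growth for r0 > 1, die-out for r0 < 1) can be illustrated with a simple branching approximation in which each case generates r0 new cases per generation on average; this sketch is illustrative and separate from the paper's fitted models.

```python
def expected_generations(r0, g0=1.0, generations=10):
    # expected new cases per generation: g_{t+1} = r0 * g_t
    sizes = [g0]
    for _ in range(generations):
        sizes.append(sizes[-1] * r0)
    return sizes
```

with r0 = 2 a single case grows to 1, 2, 4, 8, ... cases per generation, while r0 = 0.8 decays toward zero.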
the model can be extended to other states characterized by similar geographic and climatic conditions and a closer reproductive rate. an examination of the reed-frost theory of epidemics on the controllability of the lotka-mckendrick model of population dynamics a time-dependent sir model for covid-19 with undetectable infected persons mathematical applications associated with the deliberate release of infectious agents covid-19: new insights on a rapidly changing epidemic on an interval prediction of covid-19 development based on a seir epidemic model the mathematics of infectious diseases modeling and predicting the spread of covid-19 in lebanon: a bayesian perspective predicting the epidemic trend of covid-19 in china and across the world using the machine learning approach early transmissibility assessment of a novel coronavirus in wuhan ministry of health and family welfare (n.d.), 2020 world health organisation covid19 (n.d.), 2020 further analysts of the data by akaike's information criterion and the finite corrections an interpretable mortality prediction model for covid-19 patients. writing: original draft preparation. we authors confirm that there is no conflict of interest associated with the research work carried out by any means. further, we confirm that the article has not been submitted elsewhere to any journal or conference and is solely set for the measurement journal, elsevier publications, for the review process. in addition, no funding resources are available for this research activity.
key: cord-266090-f40v4039 authors: gao, wei; baskonus, haci mehmet; shi, li title: new investigation of bats-hosts-reservoir-people coronavirus model and application to 2019-ncov system date: 2020-08-03 journal: adv differ equ doi: 10.1186/s13662-020-02831-6 sha: doc_id: 266090 cord_uid: f40v4039 according to the report presented by the world health organization, a new member of the virus family, namely the coronavirus, shortly 2019-ncov, which arose in wuhan, china, on january 7, 2020, has been introduced to the literature. the main aim of this paper is to investigate and find the optimal values for better understanding the mathematical model of the transfer of 2019-ncov from the reservoir to people. this model, named the bats-hosts-reservoir-people coronavirus (bhrpc) model, is based on bats as the essential animal hosts. by using a powerful numerical method we obtain simulations of its spreading under suitably chosen parameters. the obtained results show the effectiveness of the theoretical method considered for the governing system and also shed much light on the dynamic behavior of the bats-hosts-reservoir-people transmission network coronavirus model. in december 2019, the symptoms of 2019-ncov-infected patients were identified as fever, cough, breathing difficulties, and some others. due to the long incubation period and mild symptoms, suspected infected people need to be observed for around 14 days. to reduce population flow and restrict the spread of the virus, corresponding virus dissemination control policies and relevant actions are being carried out at different levels. on january 23, wuhan was locked down by strict restrictions on transport, and soon some other provinces announced lockdowns as well [3] . chinese people were advised to stay at home and avoid gatherings, assemblies, celebrations, visits, and so on to reduce the virus dissemination. people were required to wear respirators in public areas.
some researchers pointed out effective ways to control the spread of the infectious virus, including school closure, case isolation, household quarantine, internal travel restrictions, and border control, which were proved to be helpful for the delay or reduction of virus infections [4] [5] [6] . almost all economic activities in some countries have been paused, which caused countless damage to human lives and development, and also a large amount of financial pressure on these countries. globally, australia and new zealand were the first to release regulations banning travellers who had been to china in the past 14 days. from december 2019 up to now, the 2019-ncov virus has suddenly broken out in a flood of other countries. italy, japan, south korea, spain, and other countries were under the cloud of the virus as well, and some of them started to lock down cities, restrict transport, close schools, and so on. the whole world has been damaged by 2019-ncov and is in combat against the coronavirus. some other researchers devoted themselves to drug and vaccine developments; however, there is as yet no way to effectively eliminate the virus in the human body. a large number of researchers have studied 2019-ncov from a wide range of perspectives, including diverse infectious diseases, microbiology, virology, the respiratory system, biochemistry and molecular biology, immunology, public environmental occupational health, genetics and heredity, veterinary sciences, environmental sciences and ecology, and pathology. most of these studies are conducted by the usa and china, followed by saudi arabia, south korea, and germany. according to tian [7] and others, the difference in rbd between sars-cov and 2019-ncov is of significance for the cross-reactivity of neutralizing antibodies, and a sars-cov-specific human monoclonal antibody, cr3022, could bind potently with the 2019-ncov rbd (kd of 6.3 nm). as for the origins of the 2019-ncov virus, benvenuto et al.
[8] held that 2019-ncov could be considered a coronavirus distinct from the sars virus, probably transmitted from bats or another host, where mutations conferred upon it the ability to infect humans; they also proposed a preliminary evolutionary and molecular epidemiological analysis of this new virus. considering the high genetic similarity between 2019-ncov and the severe acute respiratory syndrome coronavirus (sars-cov), and leveraging existing immunological studies of sars-cov, ahmed, quadeer, and mckay [9] devoted themselves to gaining insights for vaccine design against 2019-ncov. with the increasing virus spread and the ongoing related research both domestic and international, one question still hinders humanity's knowledge of the 2019-ncov: what is the original source of such a virus, and how can it transmit to humans? in this paper, we intend to study a mathematical model called the bats-hosts-reservoir-people coronavirus (bhrpc) model for the transfer of 2019-ncov from the reservoir to people. by using a powerful numerical method we obtain its spreading simulations under suitably chosen parameters. the obtained results show the effectiveness of the theoretical method for the governing system and also shed much light on the dynamic behavior of the bats-hosts-reservoir-people transmission network coronavirus model. in this regard, more recently, some experts have investigated important nonlinear models arising in real-world problems [10] [11] [12] [13] [14] [15] [16] . one such problem has been mathematically developed by chen et al. for simulating the phase-based transmissibility of a novel coronavirus, 2019-ncov [17] , defined by a system (1) of ordinary differential equations in six compartments, where n p , m p , b p , κ, b w , δ p , ε, c are real nonzero constants. the initial conditions for this system are shortly given by eq. (2). equation (1) is used to describe the phase-based transmissibility of a novel coronavirus from source to people. in eq.
(1), u is the susceptible people, v symbolizes the exposed people, y is the symptomatic infected people, f is the asymptomatic infected people, r is the removed people (recovered and died people), n p is the birth rate, m p is the death rate of people, w is the reservoir (the seafood area), 1/ω b is the incubation period of bat infection, and 1/γ b is the infectious period of bat infection [17] . khan et al. [10] have investigated the endemic equilibria, stability, and global sensitivity of system (1). if r 0 < 1, then system (1) is locally asymptotically stable [10, 18] , and the outbreak will fade away [18] . when r 0 > 1, the outbreak will occur [18] , and the system is not stable. the data used for system (1) are for wuhan, china [17, 19] . moreover, mathematical analysis and applications of dengue fever outbreak and epidemiology in the fractional sense have been investigated [20, 21] . recently, a numerical scheme based on the newton polynomial has been applied successfully to observe important properties of the spread of covid-19 with new fractal-fractional operators [22] . some close relationships of covid-19 with hiv have been presented in [23] . many applications of fractional- or integer-order mathematical models explaining more detailed information about real-world problems have been presented in a detailed manner. in this paper, we investigate the numerical distributions of 2019-ncov according to time with the help of several approximating terms of vim. in 1999, vim, one of the most powerful numerical methods, was first developed by he [51] [52] [53] [54] for numerical investigation; it overcomes the difficulties of the perturbation and adomian decomposition methods. later, wazwaz applied vim to investigating linear and nonlinear wave equations along with wave-like equations [55] and the laplace equation [56] . moreover, many applications of vim have been observed for various models [14] [15] [16] [17] [51] [52] [53] [54] [55] [56] [57] [58] [59] [60] .
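as a toy illustration of the vim just mentioned, consider u' + u = 0 with u(0) = 1 and lagrange multiplier λ = -1: the successive corrections reproduce the taylor coefficients of e^(-t). this sketch uses exact rational arithmetic and is separate from the paper's six-compartment computation.

```python
from fractions import Fraction

def derivative(c):
    # d/dt of the polynomial c[0] + c[1]*t + c[2]*t^2 + ...
    return [Fraction(k) * c[k] for k in range(1, len(c))]

def integral(c):
    # definite integral from 0 to t (constant term is zero)
    return [Fraction(0)] + [c[k] / Fraction(k + 1) for k in range(len(c))]

def add(a, b):
    n = max(len(a), len(b))
    a = a + [Fraction(0)] * (n - len(a))
    b = b + [Fraction(0)] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

def vim_step(u):
    # u_{n+1}(t) = u_n(t) - ∫_0^t (u_n'(s) + u_n(s)) ds, i.e. λ = -1
    residual = add(derivative(u), u)
    return add(u, [-c for c in integral(residual)])

u = [Fraction(1)]          # u_0(t) = 1 satisfies u(0) = 1
for _ in range(3):
    u = vim_step(u)
# u is now [1, -1, 1/2, -1/6], the Taylor series of e^(-t) up to t^3
```

each correction extends the expansion by one exact term, mirroring how the paper's second-iteration components gain a t^2 term on top of the first-order ones.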
we consider differential equations of the form

l u + n u = f(x),   (3)

where l and n are linear and nonlinear operators, respectively [55] , and f(x) is a source inhomogeneous term. according to the basic concepts of vim presented by he, we construct the following iteration formula for eq. (3) [51] [52] [53] [54] :

u n+1 (x) = u n (x) + ∫ 0 x λ(s) [l u n (s) + n ũ n (s) - f(s)] ds,   (4)

where the parameter λ is a general lagrange multiplier, which can be optimally identified via the variational theory, the subscript n denotes the nth-order approximation, and ũ n is considered as a restricted variation, which means δũ n = 0. clearly, the main steps of vim first require the determination of the lagrange multiplier λ, which needs to be optimally identified. once λ is determined, the successive approximations u n+1 , n ≥ 0, of the solution u are obtained upon using a suitably selected function u 0 , which satisfies the boundary conditions. then the solution is given by

u = lim n→∞ u n .   (5)

in this subsection, by using vim we numerically investigate the bats-hosts-reservoir-people coronavirus model. according to the vim iteration structure, we can write eq. (1) in the iteration form (6), where k = 0, 1, 2, 3, . . . . it produces the stationary condition λ = -1. substituting eq. (7) into eq. (6), we find the iteration equation (8). with the help of some computational software algorithm, by considering eq.
(2) with initial values, we get the first approximate components of the system for k = 0. thus we get the first approach of u n , v n , y n , f n , r n , w n as follows: u 1 (t) = β 1 + τ 1 t, v 1 (t) = β 2 + τ 2 t, y 1 (t) = β 3 + τ 3 t, f 1 (t) = β 4 + τ 4 t, r 1 (t) = β 5 + τ 5 t, w 1 (t) = β 10 + τ 6 t, where, for simplicity, we have taken the τ i as shorthand for the corresponding combinations of the parameters and initial values. now we obtain the second components of the variables u n , v n , y n , f n , r n , w n for k = 1: y 2 (t) = β 3 + (τ 3 + τ 13 )t + (τ 14 /2)t^2 , f 2 (t) = β 4 + (τ 4 + τ 15 )t + (τ 16 /2)t^2 , r 2 (t) = β 5 + (τ 5 + τ 17 )t + (τ 18 /2)t^2 , w 2 (t) = β 10 + (τ 6 + τ 19 )t + (τ 20 /2)t^2 , where τ 14 = τ 2 (-1 + δ p )ω p + τ 3 (γ p + m p ). the remaining components of the iteration formula (5) can be found in the same manner using a similar algorithm via various computational schemes. in this work, we observe the spreading rate of the bats-hosts-reservoir-people coronavirus model from reservoir to people under suitably chosen values of the parameters, reported by experts in the wuhan area of china. we obtain two-dimensional simulations of the second terms u 2 , v 2 , y 2 , f 2 , r 2 , w 2 of u n , v n , y n , f n , r n , w n as in fig. 1. in this paper, we have successfully applied vim to the numerical investigation of the 2019-ncov model. this method is based on a series of solution terms of the iteration. already with the second terms of the iteration, we have obtained numerical results for system (10). under suitably chosen values of the parameters, reported by who, we have plotted the numerical results.

[figure 1: two-dimensional surfaces of system (10)]
[figure 2: two-dimensional surfaces of susceptible people and exposed people of system (10)]
[figure 3: two-dimensional surfaces of symptomatic infected people and asymptomatic infected people of system (10)]
[figure 4: two-dimensional surfaces of removed people (recovered or died people) and reservoir (the seafood area) of system (10)]

according to figs.
1, 2, 3, 4, we observed that vim produces similar distributions for susceptible people, which increase exponentially, when we compare the simulated data with fig. 2 in [17] . moreover, we can say that each class of system (1) also exhibits the estimated behaviors. furthermore, the spread of the 2019-ncov within susceptible people u 2 is faster than within the others, such as v 2 , y 2 , f 2 , r 2 , w 2 . finally, from the second terms of the proposed algorithm we can observe that susceptible people will affect more and more people from all over the world. as a future direction of this concept, the application of powerful projected tools may be studied. these would produce more comprehensive results on the mathematical system of 2019-ncov. genomic characterization of the 2019 novel human-pathogenic coronavirus isolated from patients with acute respiratory disease in wuhan national health commission of the people's republic of china. top expert: disease spread won't be on scale of sars 2020 south china morning post. china coronavirus: three cities join wuhan in quarantine lockdown as beijing tries to contain deadly outbreak 2020 strategies for mitigating an influenza pandemic effectiveness of control measures during the sars epidemic in beijing: a comparison of the rt curve and the epidemic curve school closure to reduce influenza transmission potent binding of 2019 novel coronavirus spike protein by a sars coronavirus-specific human monoclonal antibody the 2019-new coronavirus epidemic: evidence for virus evolution preliminary identification of potential vaccine targets for the covid-19 coronavirus (sars-cov-2) based on sars-cov immunological studies modeling the dynamics of novel coronavirus (2019-ncov) with fractional derivative is the world ready for the coronavirus? editorial.
the new york times china virus death toll rises to 41, more than 1,300 infected worldwide new approach for the model describing the deathly disease in pregnant women using mittag-leffler function mathematical analysis and computational experiments for an epidemic system with nonlocal and nonsingular derivative modeling the mechanics of viral kinetics under immune control during primary infection of hiv-1 with treatment in fractional order new numerical simulations for some real world problems with atangana-baleanu fractional derivative a mathematical model for simulating the phase-based transmissibility of a novel coronavirus reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia mathematical analysis of dengue fever outbreak by novel fractional operators with field data application of fractional calculus to epidemiology modelling the spread of covid-19 with new fractal-fractional operators: can the lockdown save mankind before vaccination? 
hiv and shifting epicenters for covid-19, an alert for some countries a generalization of truncated m-fractional derivative and applications to fractional differential equations complex surfaces to the fractional (2 + 1)-dimensional boussinesq dynamical model with local m-derivative a review on harmonic wavelets and their fractional extension local fractional homotopy perturbation method for solving non-homogeneous heat conduction equations in fractal domains solitons and other solutions of (3 + 1)-dimensional space-time fractional modified kdv-zakharov-kuznetsov equation on the fractal geometry of dna by the binary image analysis application of local fractional series expansion method to solve klein-gordon equations on cantor sets fractional dynamics complex solitons in the conformable (2 + 1)-dimensional ablowitz-kaup-newell-segur equation approximate solutions of the time fractional kadomtsev-petviashvili equation with conformable derivative. erzincan univ fractal boundary value problems for integral and differential equations with local fractional operators review of numerical methods for numilpt with computational accuracy assessment for fractional calculus a new technology for solving diffusion and heat equations a fractional epidemiological model for computer viruses pertaining to a new fractional derivative cancer treatment model with the caputo-fabrizio fractional derivative an efficient analytical approach for fractional equal width equations describing hydro-magnetic waves in cold plasma new numerical results for the time-fractional phi-four equation using a novel analytical approach optical solitons to the fractional schrödinger-hirota equation new wave solutions of time fractional kadomtsev-petviashvili equation arising in the evolution of nonlinear long waves of small amplitude. 
key: cord-164964-vcxx1s6k authors: kharkwal, himanshu; olson, dakota; huang, jiali; mohan, abhiraj; mani, ankur; srivastava, jaideep title: university operations during a pandemic: a flexible decision analysis toolkit date: 2020-10-20 journal: nan doi: nan sha: doc_id: 164964 cord_uid: vcxx1s6k modeling infection spread during pandemics is not
new, with models using past data to tune simulation parameters for predictions. these help understand the healthcare burden posed by a pandemic and respond accordingly. however, the problem of how college/university campuses should function during a pandemic is new for the following reasons: (i) social contact in colleges is structured and can be engineered for chosen objectives, (ii) the last pandemic to cause such societal disruption was over 100 years ago, when higher education was not a critical part of society, (iii) not much was known then about the causes of pandemics, and hence effective ways of operating safely were not known, and (iv) today, with distance learning, remote operation of an academic institution is possible. our approach is unique in presenting a flexible simulation system, containing a suite of model libraries, one for each major component. the system integrates agent based modeling (abm) and a stochastic network approach, and models the interactions among individual entities, e.g., students, instructors, classrooms, residences, etc. in great detail. for each decision to be made, the system can be used to predict the impact of various choices, and thus enable the administrator to make informed decisions. while current approaches are good for infection modeling, they lack accuracy in social contact modeling. our abm approach, combined with ideas from network science, presents a novel approach to contact modeling. a detailed case study of the university of minnesota's sunrise plan is presented. for each decision made, its impact was assessed, and the results were used to obtain a measure of confidence. we believe this flexible tool can be a valuable asset for various kinds of organizations to assess their infection risks in pandemic-time operations, including middle and high schools, factories, warehouses, and small/medium sized businesses. as the events of 2020 have shown, pandemics due to novel viruses can lead to unimaginable disruption in society [1].
the impact has been two-fold: the direct impact of the pandemic on physical health and mortality, and the indirect impact of lock-downs and social distancing on mental health [1, 2, 3] and the economy [2]. a specific example of a critical societal function facing disruption is higher education, especially for starting freshmen, for whom an important formative experience, namely transitioning from home to an independent life, has been severely disrupted [4]. higher education in the us contributed an estimated $528 billion to the national gross domestic product (gdp) in 2020 [5] and employed roughly 3 million people [6]. disruptions in the education sector have long-term ramifications in terms of an inadequately prepared workforce for the future [7]. by mid-march 2020 most colleges and universities across the us either cancelled in-person classes or shifted to remote-only instruction, and this mode of instruction may continue for an unknown amount of time. in a recent survey of nearly 3,000 institutions [4], only 21.3% said that they are considering a fully or primarily in-person model for fall 2020 and beyond. the campus environment has some unique features as compared to other places. figure 1 shows the types of interactions on a university campus, namely groups, queues and rivers. groups can further be classified into on-campus interactions, which can be monitored and controlled, and off-campus interactions, which cannot be monitored. the former include classrooms, study areas, and student life activities under the university's purview, e.g. dorms, extra-mural sports, clubs, etc. off-campus activities include private housing, social activities, grocery shopping, and a myriad of other life activities. queues appear at various kinds of service locations on campus, including those offered by the university, e.g. bookstores, student services, etc., as well as those offered by others, e.g. cafes, banks, etc. rivers include pathways where students cross each other.
this classification of interactions allows for more accurate modeling of disease spread. given the nature of the pandemic, groups are the riskiest type of interaction, involving a sufficiently large number of people in close proximity for long periods of time. further, it is comparatively easier to reduce community transmission in queues, using appointment-only service, and in rivers, using one-way rivers, ventilation, masking and physical distancing rules. thus, among all the different types of interactions, the on-campus group interactions are the ones that carry high community transmission risk yet are observable and controllable. therefore, it is reasonable to have a sophisticated model for community transmission through on-campus group interactions, in particular classrooms, and a simple model of infections through other types of interactions, consistent with related work [25]. in our work, we include a detailed stochastic network model of classes, especially since there is growing evidence of virus transmission through aerosol spread [26, 27]. since classes have fixed schedules, they can be modeled as processes happening at specific times, with batched arrivals of students and instructors. classrooms are assumed to be completely cleaned and sanitized between any two consecutive classes; thus transmission is confined within classes, and mixing across two classes arises only from people shared between them, not from the shared classroom. infections outside the classroom settings are random, with the chance of infection depending upon the community prevalence rates [25]. based on our review of various discussions in the media and other sources over the past 6 months, and confirmed by our interactions with university administration, we identify the following key decisions (dimensions) to be considered: 1. masking: mask types and compliance requirements for students and instructors. 2. physical distancing: student classroom density, in sqft/student, based on the physical distance between students. 3.
class modality: in-person or online classes. 4. testing: testing policy, i.e. symptomatic or asymptomatic, and whether to do contact tracing. any operational policy consists of a set of choices, one for each of the decisions outlined above. an example of a policy is the university of minnesota's sunrise plan [28], details of which are presented in section 4. the university administration needs to evaluate the cost of implementing a policy and the benefits obtained from it. implementation costs for policies are usually estimated based upon predictions of behavior. for example, class modality decisions can incur technology, infrastructure and support costs, as well as revenue loss due to changes in student enrollment. our work provides a method of estimating the benefits of the operational decisions and policies. in this paper, we present a flexible framework that allows for various models of human behavior and socialization patterns, decisions and choices, infection transfer models, disease progression models, etc. to simulate the outcomes under different policies. a simulation to evaluate policy impact requires several models. for the problem of evaluating university operations policy, four models are key, namely (i) a model of social behavior and interaction patterns among people, (ii) a model of infection transfer from person to person, which can be direct (person-to-person contact) or indirect (contact intermediated by temporally and spatially co-located visits to a location where infected people deposit the infection, and susceptible people pick it up), (iii) a model of disease progression in an individual once infected, and (iv) the management policy being used. once a set of models is selected from the model library and a management policy is chosen, a scenario needs to be executed to assess the impact.
in our simulation framework there are three components that achieve this, namely the person-location visit generator, the infection transfer generator, and the disease progression generator. each of these uses the corresponding model selected for the execution. for the infection transfer generator and the disease progression generator, several models exist, and we give brief descriptions of one of each kind in section 2.2 and section 2.3. for the person-location visit generator, no good models exist, and based on our ongoing work we propose to build new ones. these are described in detail in section 2.1. the framework we developed is flexible, since it allows a policy analyst to experiment with various types of models and policies. in addition, it is expandable, because new kinds of models and policies, as well as metrics and visualizations, can be added. new model of social contact: one important contribution of the paper is a new model of social contact. disease spread depends upon the pathogen properties as well as the contact structure in the population. we introduce a people-place network model for social interactions that replaces the unstructured population model used for pandemic modeling. the disease spread prediction models for covid-19 so far have ignored the heterogeneity and randomness in the contact structure of the population [14, 12, 11, 21]. these models are based upon variations of compartmental models [29, 30, 31, 32, 33, 34] that assume populations with homogeneous interactions and give rise to simple ordinary differential equations. such models were originally developed by medical doctors in the early twentieth century and later studied by mathematicians, engineers and social scientists. there is limited use of structure in interactions, only at a coarse level in [1] based upon [35].
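the interplay of the three generator components can be sketched as a minimal simulation loop; all interfaces below (visit tuples, a flat per-contact transmission probability, and the function names) are illustrative assumptions, not the toolkit's actual api:

```python
import random

# Minimal sketch of composing the three simulation components.

def visit_generator(edges, attendance=1.0, rng=random):
    """Person-location visit generator: sample the day's visits."""
    return [e for e in edges if rng.random() < attendance]

def infection_transfer(visits, infected, p_transmit, rng=random):
    """Infection transfer generator: susceptibles co-located with an
    infected person may become infected (flat probability here)."""
    by_location = {}
    for person, loc in visits:
        by_location.setdefault(loc, []).append(person)
    newly = set()
    for people in by_location.values():
        if any(p in infected for p in people):
            for p in people:
                if p not in infected and rng.random() < p_transmit:
                    newly.add(p)
    return newly

def run(edges, seed_infected, days, p_transmit=0.1, attendance=0.9):
    infected = set(seed_infected)
    for _ in range(days):
        visits = visit_generator(edges, attendance)
        infected |= infection_transfer(visits, infected, p_transmit)
        # a disease progression generator would advance per-agent states here
    return infected
```

the real toolkit swaps in calibrated models for each step; the loop structure is what matters here.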
case study of a major university's policy: over summer 2020, we interacted with the university of minnesota administration to track various decisions, and analyzed a range of choices for each decision, to help inform the decisions. we believe this case study, in addition to showing the usefulness of our approach, also provides helpful guidance for the future. the rest of the paper is organized as follows: section 2 describes the design of the toolkit, section 3 provides an evaluation of the choices for each decision and its impact, section 4 provides a detailed case study of the university of minnesota's sunrise plan, and section 5 concludes the paper, with potential directions for future work. we now present a flexible stochastic simulation framework for evaluating university operational decisions that contains libraries for (i) structural and behavioral models of human contact calibrated using real data, (ii) disease transmission models in buildings, and (iii) disease progression models. current covid-19 spread models are strong on modeling infection spread and disease progression, but relatively weak in modeling human behavior and social contact in various campus activities, e.g. attending classes. this leads to estimates that have a high degree of deviation from reality [13, 1]. the proposed approach uses an agent based model (abm) for human interaction, and stochastic models for physical and biological processes. the abm models human interactions as a network, which provides better predictions of disease spread than traditional sir and seir models, which are population based. given the airborne transmission mode of covid-19, i.e. aerosolized droplets containing the pathogen stay suspended in air for long periods of time [26, 36], indirect human contact must be modeled in addition to direct contact. therefore, we introduce a person-location network, a bipartite graph that captures human interaction indirectly through locations visited.
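one way to see how a person-location bipartite graph captures indirect contact is to project it onto a person-person contact map via shared locations; the edge-list representation here is an illustrative assumption:

```python
from collections import defaultdict

# Sketch: derive indirect person-person contacts from a person-location
# bipartite network by grouping people who visit the same location
# (a one-mode projection of the bipartite graph).

def project_contacts(edges):
    """edges: iterable of (person, location) pairs.
    Returns a dict mapping each person to the set of people who share
    at least one location with them."""
    visitors = defaultdict(set)
    for person, loc in edges:
        visitors[loc].add(person)
    contacts = defaultdict(set)
    for people in visitors.values():
        for p in people:
            contacts[p] |= people - {p}
    return dict(contacts)
```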
figure 2 shows the architecture of the simulation framework, consisting of three system components: the person-location visit generator, the infection transfer generator, and the disease progression generator. there exist several models for each of these components, developed at different times as knowledge about the disease evolved, along with available data such as the list of courses for fall 2020, course selections, mask use policy, number of in-person courses, and number of students, faculty, and staff on campus. in the following, we describe the latest models which we have implemented for each component. the person-location visit generator creates a sequence of visits, i.e. events of individuals visiting locations. in the following, we define the network and describe the generation process of the network and the event sequence. the basis of the person-location visit generator is the person-location bipartite network g(p, l), with p and l the sets of people and locations, respectively. this network captures all connections between people and locations, i.e. the existence of an edge between i and j means that person i visits location j at some point. in the simulation, this network g(p, l) is a realization from some random network generation process. network generation: to generate the network g(p, l), we consider the case where there are n people and m locations in the bipartite network, i.e. |p| = n and |l| = m. input to the generation process includes the sets of nodes p and l, the degree sequence of nodes in p, denoted as d_1, ..., d_n, and the degree sequence of nodes in l, denoted as w_1, ..., w_m. these two degree sequences can either be obtained from data, or generated as random samples from certain degree distributions. note that the degree distributions for the people side and the location side can be different.
also, we may need additional adjustments of the two degree sequences to ensure that d_1 + ... + d_n = w_1 + ... + w_m, so that they are valid sequences for the bipartite network. the bipartite network is then generated as a realization from the configuration model [37, 38] with the given sets of nodes p and l, and desired degree sequences d_1, ..., d_n (for nodes in p) and w_1, ..., w_m (for nodes in l). the pseudo code is highlighted in algorithm 1. this algorithm returns the person-location bipartite network g(p, l). input: sets of nodes p and l; degree sequences d_1, ..., d_n (for nodes in p) and w_1, ..., w_m (for nodes in l). output: the generated person-location bipartite network g(p, l). initialize s to the total number of half-edges on each side; while s > 0: choose one half-edge from the people side and one half-edge from the location side, both uniformly at random across all unmatched half-edges on the same side; connect the two half-edges to form an edge, and add it to the edge set of g(p, l); s = s - 1. end. event sequence: the input data for our simulation is an event sequence g(p, l, t), where t is the discretized time range for the whole simulation. for each time t ∈ t, g(p, l, t) represents the actual visits between people and locations at time t and is a subgraph of g(p, l). we can see that g(p, l, t) is still a bipartite network. g(p, l, t) can be obtained from data or generated as random samples. one simple way to create g(p, l, t) as random samples is: for each time t, we sample the set of edges in g(p, l) with probability p uniformly at random, and denote the resulting subgraph as g(p, l, t). the parameter p captures the sociability of people and locations. within a region, certain areas may have a higher sociability factor p, while other areas may have a lower sociability factor p. many factors, such as user behaviors, regional characteristics and customs, and geographic and weather conditions, can be integrated into this parameter p.
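the half-edge matching of algorithm 1 and the per-time edge sampling with sociability factor p can be sketched as follows (a raw configuration model, so multi-edges are possible):

```python
import random

# Sketch of bipartite configuration-model generation: one half-edge stub
# per unit of degree on each side; shuffling both stub lists and pairing
# them is equivalent to matching half-edges uniformly at random.

def bipartite_configuration(p_degrees, l_degrees, rng=random):
    assert sum(p_degrees) == sum(l_degrees), "degree sums must match"
    p_stubs = [i for i, d in enumerate(p_degrees) for _ in range(d)]
    l_stubs = [j for j, w in enumerate(l_degrees) for _ in range(w)]
    rng.shuffle(p_stubs)
    rng.shuffle(l_stubs)
    return list(zip(p_stubs, l_stubs))  # edges (person index, location index)

def sample_day(edges, p, rng=random):
    """Subgraph G(P, L, t): keep each edge independently with probability p."""
    return [e for e in edges if rng.random() < p]
```

by construction, node i on the people side ends up with exactly d_i incident edges (counting multiplicity), and likewise for the location side.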
more importantly, p can also be modified to capture the impact of public policies due to the outbreak of covid-19. for example, shutdown or reduced operations of businesses, as well as shelter-in-place, can be modeled as reducing the value of the sociability factor p. since we are particularly interested in the use case of university re-opening, we present the campus-specific person-location visit generator in this section. as discussed in section 1.1, by assumption, we only model the interactions within classes. in modeling a campus, we generate a student/instructor-class bipartite network g(s ∪ i, c), where s denotes the set of students, i denotes the set of instructors, and c denotes the set of classes. g(s ∪ i, c) models which student is taking which class as well as which instructor is teaching which class. each student has a student profile indicating their department and academic level. similarly, each class has a class profile indicating its department and difficulty level. each instructor is assigned to exactly one class, and this forms the instructor-class part of the bipartite network. the student-class part of the bipartite network is generated by a modified configuration model, where students are assigned to classes following certain restrictions. we assume that each student chooses 2 to 5 classes (this forms the degree sequence of the students) subject to the capacities of the classes (this forms the degree sequence of the classes). with probability p_1, students choose classes within their own department with difficulty levels matching their academic levels; with probability p_2, students choose classes within their own department with difficulty levels not matching their academic levels; with probability p_3, students choose classes outside their own department. in general, p_1 > p_2 > p_3, and we also require p_1 + p_2 + p_3 = 1. these restrictions can be imposed by integrating networks from multiple configuration model processes.
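the restricted class-choice process can be sketched as below; the student and class profile fields, the probability values, and the capacity handling are simplified illustrative assumptions:

```python
import random

# Sketch of department/level-restricted class choice: each student draws
# 2-5 classes, each from their own department at a matching level (p1),
# own department at a non-matching level (p2), or another department (p3).

def choose_classes(student, classes, p1=0.7, p2=0.2, p3=0.1, rng=random):
    """student: dict with 'dept' and 'level'.
    classes: list of dicts with 'id', 'dept', 'level', 'capacity'."""
    chosen = []
    for _ in range(rng.randint(2, 5)):
        r = rng.random()
        if r < p1:
            pool = [c for c in classes if c["dept"] == student["dept"]
                    and c["level"] == student["level"]]
        elif r < p1 + p2:
            pool = [c for c in classes if c["dept"] == student["dept"]
                    and c["level"] != student["level"]]
        else:
            pool = [c for c in classes if c["dept"] != student["dept"]]
        pool = [c for c in pool if c["capacity"] > 0 and c not in chosen]
        if pool:  # skip the draw if no eligible class remains
            c = rng.choice(pool)
            c["capacity"] -= 1
            chosen.append(c)
    return [c["id"] for c in chosen]
```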
we point out that generating the network g(s ∪ i, c) from such random processes is useful for analysis prior to course enrollment; if we have the exact student-class enrollment data and instructor-class data, then we can create a deterministic bipartite network g(s ∪ i, c) from the data. figure 3 shows the topology of the bipartite network g(s ∪ i, c). based on the bipartite network g(s ∪ i, c) and the teaching schedule of all classes throughout the semester, a visit schedule specifying who (students and instructors) visits which class and when is created. if we assume that students always attend classes, then this visit schedule becomes the natural event sequence g(s ∪ i, c, t) for the simulation, where t denotes the set of days within the simulated academic semester. otherwise, we can introduce an attendance rate p to generate the event sequence. the attendance rate p captures the students' attendance behavior: with probability p, a student attends a scheduled class; with probability 1 − p, a student skips a scheduled class. for day t, we sample the set of edges in the visit schedule associated with that day with probability p uniformly at random, and denote the resulting graph as g(s ∪ i, c, t). the event sequence g(s ∪ i, c, t) can be obtained as {g(s ∪ i, c, t) : t ∈ t}. the infection transfer generator generates a sequence of infections using the event sequence generated by the person-location visit generator and a disease transmission model. the probability of infection is computed using the wells-riley equation [39, 40], commonly used for modeling indoor airborne disease transmission. the wells-riley equation calculates the probability of each susceptible individual getting infected when indoors (as in classrooms) with other infectious people.
the probability of infection depends upon the physical environment, including room volume, ventilation rate, time spent in the room, pulmonary ventilation rate, and the infectiousness of the disease (quanta of pathogen). the probability calculation by the wells-riley equation is given by the following: P = C/S = 1 − exp(−I p q t / Q), (1) where P is the probability of infection, C is the number of newly infected people, S is the number of susceptible people, I is the number of infectors, p is the pulmonary ventilation rate of a susceptible person (m^3/h), Q is the room ventilation rate (m^3/h), q is the quantum generation rate (quanta/h), and t is the exposure time (h). the equation demonstrates how changes to both the physical environment and infection control procedures may potentially impact the spread of airborne infections in indoor environments such as classrooms. the original wells-riley equation is used for fast-moving infections. it assumes that, while a group of people share an indoor environment, susceptible individuals exposed to the pathogen produced by infected individuals may themselves become infected and start adding pathogen to the environment. this is unlikely in classroom settings, because the class time is much smaller than the usual incubation period of covid-19. in addition, the disease quantum generation rate is very small, making the infection transfer within a classroom slow-moving. we introduce a simple solution to this problem. we use the first-order taylor approximation of the wells-riley equation (1) to model the transmission probability. we ignore the higher-order terms in the taylor series, which are negligible because of the high incubation time and low quantum generation rate of covid-19.
in particular, we use the following equation to model the probability: P = C/S ≈ I p q t / Q. (2) this approximation of the wells-riley equation is extremely close to the original equation, because the exposure time in classrooms is small, and a linear approximation of an exponential function is very accurate when the argument of the function is small. the disease progression generator library generates the disease state transitions of each agent. the key states of the library include susceptible, infected, transmitting/infectious, asymptomatic, symptomatic, severely ill, dead, and recovered, which are highlighted in figure 4. it shows a standard finite state epidemiological model of disease progression states [8], where arrows imply the direction of change of state. our simulator follows this epidemiological model and obtains the distribution of time spent at different states from the existing literature. the transition from the susceptible to the infected state primarily occurs according to the infection transfer generator and is probabilistically determined by the first-order taylor approximation of the wells-riley equation (2) inside the campus; it also happens spontaneously through outside infection, whose rate is a variable that depends upon the disease prevalence around the campus. once infected, the agent starts his/her incubation period, which follows a weibull distribution weibull(0.11, 1.97) with a mean of 8.29 days and a median of 7.76 days, based on [41]. people become infectious and start transmitting the virus 1-3 days (uniformly distributed) before their incubation period ends [42]. once an agent starts transmitting, it is considered asymptomatic (this includes agents that are pre-symptomatic). when the incubation period ends, the agent either remains asymptomatic or transitions to being symptomatic with 65% probability [43]. we assume that symptomatic or asymptomatic agents remain contagious until recovered.
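the exact wells-riley probability (1) and its first-order approximation (2) can be sketched and compared numerically as follows (variable names follow the text; `big_q` stands in for the room ventilation rate Q):

```python
import math

# Wells-Riley infection probability and its linearization.
# i: number of infectors, p: pulmonary ventilation rate (m^3/h),
# q: quantum generation rate (quanta/h), big_q: room ventilation
# rate (m^3/h), t: exposure time (h).

def wells_riley(i, p, q, big_q, t):
    return 1.0 - math.exp(-i * p * q * t / big_q)

def wells_riley_linear(i, p, q, big_q, t):
    # first-order Taylor expansion of 1 - e^{-x} around x = 0
    return i * p * q * t / big_q
```

with the parameter values used later in the text (p = 0.48 m^3/h, q = 20 quanta/h) and a realistic room ventilation rate, the exponent is small for typical class durations, so the two expressions agree closely; the room ventilation value in the test below is only an illustrative assumption.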
the distribution of this period has a mean of 7.8 days and is best fit by the gamma distribution gamma(3, 7.8/3) [44]. these are parametric values that can be modified as our understanding of the disease changes or as different testing policies are implemented. asymptomatic or symptomatic agents with incorrect test results (the false negative rate is assumed to be 3.3% of total tested, based on the test mentioned in section 3.3.4) continue spreading the disease at the locations they visit and eventually transition to the recovered state at the end of their contagious period [45]. agents with positive test results are taken out of the simulation and put in effective quarantine for the number of days left in their contagious period. a portion of them, defined by a probabilistic parameter, develop severe illness or die, and do not come back on campus for the rest of the semester. those without severe illness or mortality are put back into the simulation in the recovered state once their quarantine period ends; they retain that state for the remainder of the semester and thus cannot get infected again. we base this on the informed assumption that antibody immunity lasts for three months, which is longer than the entirety of our semester [46]. we point out that our disease progression generator library is general enough to model the complete disease progression. for example, we can also include states like hospitalized, shortness of breath, respirator, icu, dead, etc. in the library and perform further analysis. however, controlling the spread of the disease is the primary concern for universities, therefore we omit those states and focus on the infection and transmission of the disease.
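a minimal sketch of the disease progression transitions described above; the severe-illness probability and the omission of the timing distributions (the weibull incubation fit [41] and the gamma contagious-period fit [44]) are illustrative simplifications:

```python
import random

# Sketch of the disease progression state machine of figure 4, kept to
# the state sequence only; durations would be sampled from the fitted
# distributions in the text.

STATES = {"susceptible", "infected", "asymptomatic", "symptomatic",
          "recovered", "removed"}

def progress(rng=random, p_symptomatic=0.65, p_severe=0.05):
    """Return one sampled state trajectory for a newly infected agent.
    p_severe (severe illness / mortality) is an illustrative placeholder."""
    path = ["infected", "asymptomatic"]  # pre-symptomatic agents count as asymptomatic
    if rng.random() < p_symptomatic:     # 65% turn symptomatic [43]
        path.append("symptomatic")
    # agents remain contagious until the end of the contagious period,
    # then either leave the system (severe illness / mortality) or recover
    path.append("removed" if rng.random() < p_severe else "recovered")
    return path
```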
for this study, we analyze the cumulative number of infected students due to community transmission of covid-19 in section 3; hence the fraction of agents who leave the system (severe illness or mortality) or recover is immaterial for our simulations, because neither of these states impacts new infections. recovered patients are immune and do not act as vectors, whereas patients who leave are isolated from the system. the simulator, available on github, was used to evaluate the effectiveness of different operational interventions for reducing infections during the semester. the university of minnesota is chosen to represent many big universities in the country. the following describes the data and assumptions used for the experiment in section 3.1 and section 3.2 and analyzes the impact of different operational interventions in section 3.3. the simulator uses actual student class enrollment data from the university of minnesota and contains 46,782 students, 5,570 classes and 5,570 instructors. these are from the fall 2019 official enrollment statistics report. we consider students from each department except the college of continuing and professional studies, since students from that college primarily take online courses. thus, our event sequence is g(s ∪ i, c, t), where |s| = 46782, |i| = 5570, |c| = 5570, and |t| = 7 × 12 = 84 days. each student can enroll in 2 to 5 courses as per university guidelines to maintain student status, which defines the degree of each student in the network g(s ∪ i, c). in the results we present, we assume that students can only take classes within departments of their own school/college. fall 2019 registration information of umn is used to provide the set of classes offered, the number of students enrolled in each class, timings, and the instructors to the simulation. with such information, we construct the person-location bipartite network and event sequence according to section 2.1.
although the current focus is on the pandemic operations of a major university, the framework is flexible enough to analyze the spread of infectious diseases involving human interactions on a big campus of any kind, given relevant models and parameters. as the understanding of the disease develops, we regularly update the parameters involved in the framework based on current studies. below we discuss some of the major parameters used in the framework. initial infection: we start our simulation with a sample of the population being initially infected. for our analysis, we assume this value to be 1%, according to minnesota's weekly covid-19 reports [47]. we randomly select the initially infected people in the simulation, and uniformly distribute them into different groups based on the number of days since they were infected (maximum 5 days). outside transmission: students lead a significant portion of their lives outside the university, and this is not precisely modeled in the simulation, especially since it cannot be controlled. further, 75%-80% of students live in off-campus private housing, details of which are outside of the university's purview. detailed modeling of this has not been done. instead, an assumption is made that 5 non-quarantined susceptible students are spontaneously infected every day due to presumed transmission from non-university contact. this is consistent with the analysis by the upenn/swarthmore team [25]. we expect modelers to have a better estimate of this parameter as the semester progresses. indoor transmission: we consider indoor transmission of covid-19 to be airborne, based on the assumption that tables, chairs, equipment and other surfaces inside the classrooms are being systematically sanitized by the university cleaning staff. we use an approximation of the wells-riley equation (2), which already has well defined parameters for airborne transmission inside a classroom.
the pulmonary ventilation rate of a susceptible person, defined as p, is set at 0.48 m^3/h [39]. the quantum generation rate q, or the amount of infection produced by a covid-19 patient per hour, is assumed to be 20 quanta/h [40]. we also assume that different types of masks have various efficiencies in filtering the quanta generated (covid-19 virus released) and pulmonary ventilation (air intake). this is presented in the detailed analysis of masking impact in section 3.3. the room ventilation rate Q is assumed to be the product of the ventilation rate 4 ac/h [39] and the room volume (explained in section 3.3.2). in this section we analyze the effects of individual policy dimensions, including masking, physical distancing, class modality, and testing. specifically, we estimate the cumulative number of infected students due to the community transmission of covid-19 within the university campus. to analyze the impact of each policy dimension, we vary the values of the parameters associated with that policy dimension, keeping the other dimensions as low as possible (i.e., equivalent to pre-covid-19 times). for each set of simulation parameters we ran 1,000 simulations and plot the mean of the cumulative number of infected students with the 95% confidence interval for the mean. across different simulations the results are concentrated and the standard deviations are relatively small compared to the mean values; thus, the 95% confidence intervals are close to the mean values. the masking policy is an aggregation of four parameters, namely student mask type, student mask compliance, instructor mask type, and instructor mask compliance. we study the impact of this policy from two aspects, namely the student mask compliance (the percentage of the student population that wears a mask) and the student mask types. for the study of student mask compliance, students and instructors are assumed to be wearing cloth masks. the instructor mask compliance is set to 100%, while the student mask compliance is set as a variable.
for the study of student mask types, we consider three types of masks, namely cloth masks, medical masks, and n95 masks. the effectiveness of different mask types was modeled based on a study in [48], comparing the filtration efficiency of small aerosols for different types of masks. the study [48] found that, on average, the n95 mask filter efficiency was 95%, the medical mask filter efficiency was 55%, and the general cloth mask filter efficiency was 38%. we also assume that all classes are in-person, physical distancing is 2 feet, and only symptomatic people are tested. figure 5 shows the impact of different student mask compliance levels on the cumulative number of infected students due to community transmission within campus. a decrease of 23.25% in cumulative infected students can be observed when there is a change from no student mask compliance to strict mask policy adherence. in figure 6, we fix the student mask compliance at 100% and vary the mask types. we observe that the regular use of n95 masks by students would significantly reduce the spread. however, it is also expensive for them to use these masks every day (figure 6: impact of different mask types on cumulative infected students due to the community transmission of covid-19 within the university campus). covid-19 spreads among people in close proximity for sufficient time. in our simulation, we use the physical distance as a radius to calculate the area of a circle, which acts as a substitute for the area per student. this value is used to determine the room volume of a class based on the total number of students attending the class. the product of the room volume and the ventilation rate parameter (4 ac/h) is used as the room ventilation rate parameter in the wells-riley equation (2). the centers for disease control and prevention (cdc) recommends a physical distance of 6 feet [49]. to analyze the physical distancing policy, we vary the physical distance from 2 feet (personal space [50] in normal times) to 6 feet.
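one way to combine the mask efficiencies and the distance-based room volume described above is sketched below. the ceiling height is an assumption (the text derives the room volume from floor area but does not state the height used), and the way the filter efficiencies scale q and p is one plausible convention, not necessarily the authors'.

```python
import math

MASK_FILTER_EFF = {"none": 0.0, "cloth": 0.38, "medical": 0.55, "n95": 0.95}
FT_TO_M = 0.3048
CEILING_HEIGHT_M = 3.0   # assumed; not stated in the text

def room_volume_m3(n_students, distance_ft):
    """Area per student = circle of radius `distance_ft`; the room volume
    scales that floor area by an assumed ceiling height."""
    area_m2 = n_students * math.pi * (distance_ft * FT_TO_M) ** 2
    return area_m2 * CEILING_HEIGHT_M

def masked_quanta_and_intake(q=20.0, p=0.48,
                             infector_mask="cloth", susceptible_mask="cloth"):
    """Scale quanta generation by the infector's mask and air intake by the
    susceptible's mask, using the filter efficiencies from [48]."""
    q_eff = q * (1.0 - MASK_FILTER_EFF[infector_mask])
    p_eff = p * (1.0 - MASK_FILTER_EFF[susceptible_mask])
    return q_eff, p_eff

q_eff, p_eff = masked_quanta_and_intake(infector_mask="n95", susceptible_mask="n95")
```

the adjusted q and p, together with the distance-dependent room volume, would then be fed into the wells-riley calculation.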
we analyze the physical distancing dimension by assuming all classes are scheduled in-person, there is no adherence to mask wearing among students, and only symptomatic people are tested. as seen in figure 7, an increase from 2 to 6 feet in physical distance can decrease the cumulative infected by 70.50%, which suggests that it is crucial to avoid close contact with other people even if they are not showing any symptoms. large gatherings, like classes, increase the likelihood of spreading the virus [51, 52]. therefore, we analyze the impact of class modality by varying the maximum class size. we set the mask compliance among students to 0%, physical distancing is reduced to 2 feet, and only symptomatic infected people are tested. in particular, we vary the maximum in-person class size to 30, 60 or all in-person, as shown in figure 8. a change from all classes being in-person to restricting classes with more than 30 students to be online can bring a decrease of 72.99% in cumulative infected. large classes act as hubs where typically students from various departments study together, which increases their potential exposure to other students. consequently, it creates short paths between students of different departments (communities) [53], which exacerbates the community spread. therefore, avoiding large classes can be very effective in controlling the spread of the disease. in the simulation, we can implement testing policies to detect infection spread in the network. on a given day, each person has a test state of tested positive, tested negative, or not tested. the testing capacity each day is assumed to be limited due to constraints from time, labor, money, manufacture of testing supplies, etc. available tests are preferentially used to test symptomatic people. people who turn symptomatic go for testing the next day.
based on the testing results on each day, we then create a list of classes attended by the positively tested people, called the 'contact traced' (ct) classes. the rest of the classes come under 'non-contact traced' (nct) classes. we can implement various policies for these classes. in the simulation, we use the test accuracy statistics from a publicly available infection test manufactured by inbios, with 96.7% sensitivity and 100% specificity. if an individual has a false negative test result, he/she can still attend classes and spread the virus. if an individual has a false positive test result, he/she will be quarantined in the simulation. we also define a 'testing gap day' parameter which stops people from being retested within a particular time frame. in the simulation we set the 'testing gap day' to 3 days, because this is the average number of days after which an infected person becomes infectious. this value can be changed depending upon the availability of tests and other policies. in addition, we set the mask compliance among students to 0%, physical distancing to 2 feet, and all classes to be held in-person in our simulation. to study this policy dimension, we vary the test capacity over 2,000, 5,000 and 10,000 tests per day. as shown in figure 9, 5x testing capacity (from 2,000 to 10,000) can lead to a 36.45% decrease in cumulative infected, which is less effective compared to the other individual dimensions we studied. the higher education community finds itself in uncharted territory due to covid-19, which has limited the functioning of the community. operational policies being implemented also carry social and monetary costs, which are presently unclear, but will impact the future. thus, implementing cost-effective policies with a high impact on controlling the spread of the disease is extremely important. masking and physical distancing are economically cheaper to implement.
however, as shown in figures 5 and 7, when these policies are implemented in isolation, 71.57% and 27.50% of the student population, respectively, still gets infected. these two policies also have a high social cost of implementation. in march, when understanding of the pandemic was nascent, policies such as self-isolation and social distancing were recommended to flatten the curve. feelings of loneliness and isolation that are exacerbated during social distancing have caused mental health to suffer and led to increased substance use and elevated suicidal ideation [54, 55]. masking as a voluntary policy would likely lead to insufficient compliance, would be perceived as less fair, and could intensify stigmatization such as negative emotional responses, social labeling, or prejudicial attitudes [56]. on the other hand, policies like mass testing and shifting classes with huge enrollments online bear a financial burden and may not be as effective as expected in isolation, unless all other recommendations are being strictly followed by the students. scientists at the university of illinois developed a quick, inexpensive saliva test and started doing 10,000 to 15,000 tests per day. however, mass testing and contact tracing alone do not guarantee control over the spread, as shown in our simulation results in section 3.3.4, which also result in 58.3% of the population being infected by the end of the semester. students defying quarantine/self-isolation after testing positive have also led to another spike in infections [57]. these policies, when implemented individually, do not seem to be enough to control the disease spread, but they can work well in conjunction with each other. we present a case study of a combination of these policies in the next section. figure 10: sunrise plan decision timeline.
here we present a detailed case study of the impact of various decisions taken by the university of minnesota, one of the largest universities in the country, as it went through the process of developing and implementing its sunrise plan. the sunrise plan is summarized in figure 10. under umn's sunrise plan, recommendations for the use of masks and social distancing were made as early as may 1. cloth masks are being provided to all students and employees who are on campus and are required in specific settings. as of june 22, based on the physical distancing requirements outlined by the minnesota department of health [58] and umn's own medical and public health experts, a decision to maintain 6 feet of physical distance in general-purpose classrooms was adopted. course delivery modality was entered into the university course scheduling system by july 2. it was also decided that any in-person class meetings will end right before thanksgiving (nov 26), and the remainder of the class meetings, including final exams, will be entirely online. led by umn's health emergency response office (hero), a testing and tracing advisory team was formed, and the team introduced an elaborate testing policy called mtest on july 30. on aug 21, it was announced that all classes will be wholly online for at least the first two weeks of the fall semester. as the university considered and implemented alternative policies over time, we modeled the impacts of those decisions, including requiring masks, the extent of physical distancing in classrooms, introduction of class modalities to designate classes as in-person, blended, remote, or online, and testing protocols. this was done over a roughly 4-month period from may 1st to august 21st, 2020. as each decision was considered, the impact of various alternatives for it was evaluated.
since decisions were made one at a time, in evaluating the impact of downstream decisions, the system also incorporated the earlier decisions made in the sunrise plan. in table 1 and figure 11, we show the combined results of various policies adopted at different stages of the sunrise plan. the results can be applied to get a measure of confidence for each decision. our results show that even just proper adherence to masking and physical distancing can bring a drastic difference in the cumulative infected in comparison to individual policies. covid-19 has presented administrators of higher education institutions with a completely new problem, i.e. what are the right decisions to make for operating an educational campus during the pandemic, such that educational objectives can be met while ensuring health safety for the entire community. the history of this problem is short, i.e. only around 6 months, since the last pandemic with this level of societal impact, i.e. the spanish flu of 1918, happened at a time when higher education was not much of an integral part of society. this paper is among the first set of efforts to address this problem. our approach is unique in that it presents a flexible simulation system that contains a suite of model libraries, one for each major system component. the simulation system merges agent based modeling (abm) and a stochastic network approach, and models the interactions of individual entities, e.g. students, instructors, classrooms, residences, etc. in great detail. for each decision to be made by administrators, the system can be used to predict the impact of various choices, and thus enable the administrator to make a suitable decision. a detailed case study of the university of minnesota's sunrise plan [28] was presented. specifically, as various decisions were made in sequence, their impact was assessed using this system. the results were used to get a measure of confidence in each decision.
we believe this flexible tool can be a valuable asset for various kinds of organizations to assess their infection risks in pandemic-time operations.
economists see uneven jobs recovery
mental health and the covid-19 pandemic
here's our list of colleges' reopening models
projections of education statistics to 2026. nces 2018-019
projected number of participants in educational institutions, by level and control of institution
education during covid-19 and beyond
the sir model for spread of disease: the differential equation model. loci. (originally convergence)
modeling the spatial spread of infectious diseases: the global epidemic and mobility computational model
transims: transportation analysis and simulation system
forecasting covid-19 impact on hospital bed-days, icu-days, ventilator-days and deaths by us state in the next 4 months
forecasting the impact of the first wave of the covid-19 pandemic on hospital demand and deaths for the usa and european economic area countries. medrxiv
covid act now. america's covid warning system
impact of non-pharmaceutical interventions (npis) to reduce covid19 mortality and healthcare demand
pattern of early human-to-human transmission of wuhan 2019-ncov. biorxiv
early dynamics of transmission and control of covid-19: a mathematical modelling study. the lancet infectious diseases
transmission interval estimates suggest pre-symptomatic spread of covid-19
feasibility of controlling covid-19 outbreaks by isolation of cases and contacts. the lancet global health
nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study
covid-19 hospital impact model for epidemics (chime). penn medicine, the university of pennsylvania
responding to covid-19 - a once-in-a-century pandemic
policies and guidelines for covid-19 preparedness: experiences from the university of washington
covid-19: epidemiology, evolution, and cross-disciplinary perspectives
simulating covid-19 in a university environment
consideration of the aerosol transmission for covid-19 and public health
evidence of short-range aerosol transmission of sars-cov-2 and call for universal airborne precautions for anesthesiologists during the covid-19 pandemic
a dynamic model of market share and sales behavior
diffusion of innovations
discussion: the kermack-mckendrick epidemic threshold theorem
three basic epidemiological models
the mathematics of infectious diseases
nonlinear dynamics and chaos with student solutions manual: with applications to physics, biology, chemistry, and engineering
multiscale mobility networks and the spatial spreading of infectious diseases
aerosol and surface stability of sars-cov-2 as compared with sars-cov-1
the asymptotic number of labeled graphs with given degree sequences
a critical point for random graphs with a given degree sequence. random structures & algorithms
modelling the transmission of airborne infections in enclosed spaces
association of infected probability of covid-19 with ventilation rates in confined spaces: a wells-riley equation based investigation. medrxiv
estimation of incubation period distribution of covid-19 using disease onset forward time: a novel cross-sectional and forward follow-up study. medrxiv
temporal dynamics in viral shedding and transmissibility of covid-19
covid-19 pandemic planning scenarios
modeling the impact of social distancing measures on the spread of sars-cov-2 in minnesota
sars-cov-2 infection induces robust
minnesota department of health. minnesota department of health weekly covid-19 report
comparison of filtration efficiency and pressure drop in anti-yellow sand masks, quarantine masks, medical masks, general masks, and handkerchiefs
centers for disease control and prevention. social distancing
the hidden dimension / edward twitchell hall
public health response to the initiation and spread of pandemic covid-19 in the united states
clustering and superspreading potential of severe acute respiratory syndrome coronavirus 2 (sars-cov-2) infections in hong kong
the small-world network of college classes: implications for epidemic spread on a university campus
mental health, substance use, and suicidal ideation during the covid-19 pandemic - united states
the impact of the covid-19 pandemic on suicide rates
social and behavioral consequences of mask policies during the covid-19 pandemic
a university had a great coronavirus plan, but students partied on
minnesota department of health
the authors acknowledge krishnamurthy iyer, kelly searle, and rachel croson for their constructive feedback at various stages of this research.
key: cord-299312-asc120pn authors: khoshnaw, sarbaz h.a.; shahzad, muhammad; ali, mehboob; sultan, faisal title: a quantitative and qualitative analysis of the covid–19 pandemic model date: 2020-05-25 journal: chaos solitons fractals doi: 10.1016/j.chaos.2020.109932 sha: doc_id: 299312 cord_uid: asc120pn global efforts around the world are focused on discussing several health care strategies for minimizing the impact of the new coronavirus (covid-19) on the community. it is clear that this virus has become a public health threat and is spreading easily among individuals. mathematical models with computational simulations are effective tools that help global efforts to estimate key transmission parameters and further improvements for controlling this disease. this is an infectious disease and can be modeled as a system of non-linear differential equations with reaction rates.
this work reviews and develops some suggested models for the covid-19 that can address important questions about global health care and suggest important notes. then, we suggest an updated model that includes a system of differential equations with transmission parameters. some key computational simulations and sensitivity analysis are investigated. also, the local sensitivities for each model state concerning the model parameters are computed using three different techniques: non-normalizations, half normalizations, and full normalizations. results based on the computational simulations show that the model dynamics are significantly changed for different key model parameters. interestingly, we identify that transition rates between asymptomatic infected with both reported and unreported symptomatic infected individuals are very sensitive parameters concerning model variables in spreading this disease. this helps international efforts to reduce the number of infected individuals from the disease and to prevent the propagation of the new coronavirus more widely in the community. another novelty of this paper is the identification of the critical model parameters, which makes it easy to be used by biologists with less knowledge of mathematical modeling and also facilitates the improvement of the model for future development theoretically and practically.
the idea of chemical kinetic theory is an important approach for understanding and representing the biological process in terms of model equations. the important assumptions to build such models are model states, parameters, and equations. this is because it helps the investigation of mathematical modeling effectively and easily [25-35]. individuals in the reported symptomatic infected group are infected, show symptoms of covid-19, and are detected by the government, either from a rapid test or from voluntary action to report to the hospital. we assume that all individuals in this class will get a specific treatment and supervision, whether it's through monitored isolation or treatment in the hospital. 5. recovered group (r). this group presents individuals who have recovered from covid-19 and have temporary immunity. the transmission diagram which illustrates the interaction between each group is described in figure 1. furthermore, the model's constant parameters and initial states with their definitions are described in table 1.
using equations (2)-(4), the model dynamics are described by the following system of non-linear ordinary differential equations. the model initial populations are expressed in the following equation. computational simulations for the model states given in system (9) show that the parameter has an effective role in the dynamics of the model states. 3. the transition rate has also affected asymptomatic infected people, reported symptomatic infected people, and unreported symptomatic infected people. it can be seen that the model dynamics for such states become more flat when the value of the parameter is increased. this is an important key element for controlling this disease. the used model in this paper has been further improved based on the computational results using matlab for different initial populations and parameters. some main results can help in understanding the suggested model more widely and effectively. by using computational simulation, we identify some key critical parameters that have a great role in spreading this virus among the model classes. one of the identified key parameters is the transmission rate between asymptomatic infected and reported symptomatic individuals. this is an important finding in the understanding of the covid-19 and how this virus spreads more quickly. some other critical model parameters have been investigated in this paper. for example, the transmission parameter between asymptomatic infected and unreported symptomatic individuals has a great impact on the dynamics of the model states. besides, these findings provide additional information about estimations and predictions for the number of infected individuals.
accordingly, our results in identifying key parameters are broadly consistent with clinical and biological findings. remaining issues are subject to sensitivity analysis. this is also an important issue that can be further studied. we have applied the idea of local sensitivity to calculate the sensitivity of each model state concerning the model parameters for the updated model of the covid-19. three different techniques are investigated, which are non-normalizations, half normalizations, and full normalizations. these provide us an important step forward to identify critical model elements. by using local sensitivity approaches we concluded that almost all model states are sensitive to the critical model parameters { }. this becomes a great step forward and helps international attempts regarding the covid-19 pandemic outbreak. this may help to reduce the number of infected individuals from the disease and to prevent the coronavirus more widely in the community. it can be concluded that the identified factors can be controlled to reduce the number of infected individuals. overall, our results demonstrate a strong effect of the key critical parameters on the spreading of covid-19. therefore, based on the effect of each involved parameter on the model states, more suggestions and interventions can be proposed for controlling the covid-19 disease. that will be useful for any interventions and vaccination programs. accordingly, the healthcare communities should pay more attention to the quarantine places for controlling this disease more effectively. it can be strongly suggested that anyone in the quarantine places should be separated from the others and should use only their separate equipment, bedroom, and toilet to prevent the transmission of the virus through the touching of shared surfaces.
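the three scalings of local sensitivity mentioned above can be illustrated with a finite-difference sketch. the normalization conventions below (non-normalized dy/dp; half-normalized (1/y) dy/dp; full-normalized (p/y) dy/dp) follow one common usage and may differ from the paper's exact definitions; the demo model is a toy function, not the covid-19 model itself.

```python
def local_sensitivities(f, params, name, y_index=0, h=1e-6):
    """Central-difference local sensitivity of y = f(params)[y_index] with
    respect to parameter `name`, returned in three scalings."""
    p0 = params[name]
    up = dict(params, **{name: p0 * (1 + h)})
    dn = dict(params, **{name: p0 * (1 - h)})
    dy_dp = (f(up)[y_index] - f(dn)[y_index]) / (2 * p0 * h)
    y0 = f(params)[y_index]
    return {"non": dy_dp,               # dy/dp
            "half": dy_dp / y0,         # (1/y) dy/dp
            "full": p0 * dy_dp / y0}    # (p/y) dy/dp, dimensionless

# toy model y = beta^2 at beta = 3: dy/dp = 6, half = 2/3, full = 2
demo = local_sensitivities(lambda p: [p["beta"] ** 2], {"beta": 3.0}, "beta")
```

for the epidemic model, `f` would run the ode solver and return the state trajectory at the time of interest.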
another suggestion is that reducing the contact between the asymptomatic-symptomatic groups and the susceptible groups effectively minimizes the number of infected people. it seems necessary to plan a certain strategy to put the asymptomatic infected individuals in quarantine places sooner rather than later. future research on identifying key critical elements might extend the explanations of the new covid-19 more widely. it will be important that future research investigates more suggested transmissions between the model groups. for example, the model can be further improved by adding two transmission paths, one between unreported symptomatic infected and reported symptomatic infected individuals, the other between asymptomatic infected and recovered individuals.
history and recent advances in coronavirus discovery
human coronaviruses: insights into environmental resistance and its influence on the development of new antiseptic strategies
a novel coronavirus from patients with pneumonia in china
early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia
clinical features of patients infected with 2019 novel coronavirus in wuhan
pneumonia of unknown etiology in wuhan, china: potential for international spread via commercial air travel
nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study
estimation of the reproductive number of novel coronavirus (covid-19) and the probable outbreak size on the diamond princess cruise ship: a data-driven analysis
a discrete stochastic model of the covid-19 outbreak: forecast and control
estimation of the transmission risk of the 2019-ncov and its implication for public health interventions
a mathematical model for simulating the phase-based transmissibility of a novel coronavirus
an updated estimation of the risk of transmission of the novel coronavirus (2019-ncov)
serial intervals of respiratory infectious diseases: a systematic review and analysis
serial interval of novel coronavirus (covid-19) infections
the serial interval of covid-19 from publicly reported confirmed cases
novel coronavirus patients' clinical characteristics, discharge rate and fatality rate of meta-analysis
mathematical modelling of covid-19 transmission and mitigation strategies in the population of ontario
response strategies for covid-19 epidemics in african settings: a mathematical modelling study
the effect of control strategies to reduce social mixing on outcomes of the covid-19 epidemic in wuhan, china: a modelling study
mathematical recommendations to fight against covid-19
feasibility of controlling covid-19 outbreaks by isolation of cases and contacts
a mathematical modelling approach in the spread of the novel 2019 coronavirus sars-cov-2 (covid-19) pandemic
the reproductive number of covid-19 is higher compared to sars coronavirus
model reductions in biochemical reaction networks
mathematical modelling for covid-19 in predicting future behaviours and sensitivity analysis
balancing the chemical equations and their steady-state approximations in the complex reaction mechanism: linear algebra techniques. applied nanoscience
invariant manifold for single and multi-route reaction mechanisms
physical assessments on chemically reacting species and reduction schemes for the approximation of invariant manifolds
complex reaction and dynamics
modeling multi-route reaction mechanism for surfaces: a mathematical and computational approach
slow invariant manifold assessments in the multi-route reaction mechanism
model reduction of biochemical reactions networks by tropical analysis methods. mathematical modelling of natural phenomena
methods of model reduction for large-scale biological systems: a survey of current methods and trends
model reduction of chemical reaction systems using elimination
complex systems: from chemistry to systems biology
systems biology: simulation of dynamic network states
sensitivity analysis in chemical kinetics
sensitivity analysis approaches applied to systems biology models
mathematical model for the ebola virus disease
global sensitivity analysis challenges in biological systems modeling. industrial and engineering
topological sensitivity analysis for systems biology
sensitivity analysis of an infectious disease model
a mathematical modelling approach for childhood vaccination with some computational simulations
identifying critical parameters in sir model for spread of disease
a discrete stochastic model of the covid-19 outbreak: forecast and control
estimation of the transmission risk of the 2019-ncov and its implication for public health interventions
early dynamics of transmission and control of covid-19: a mathematical modelling study. the lancet infectious diseases
a mathematical model for simulating the phase-based transmissibility of a novel coronavirus
an updated estimation of the risk of transmission of the novel coronavirus
novel coronavirus patients' clinical characteristics, discharge rate and fatality rate of meta-analysis
why is it difficult to accurately predict the covid-19 epidemic?
prediction of the epidemic peak of coronavirus disease in japan
effects of media reporting on mitigating spread of covid-19 in the early phase of the outbreak
understanding unreported cases in the covid-19 epidemic outbreak in wuhan, china, and the importance of major public health interventions
key: cord-267150-hf0jtfmx authors: gupta, rajan; pandey, gaurav; chaudhary, poonam; pal, saibal kumar title: seir and regression model based covid-19 outbreak predictions in india date: 2020-04-03 journal: nan doi: 10.1101/2020.04.01.20049825 sha: doc_id: 267150 cord_uid: hf0jtfmx covid-19 pandemic has become a major threat to the country. to date, no well-tested medication or antidote is available to cure this disease. according to who reports, covid-19 is a severe acute respiratory syndrome which is transmitted through respiratory droplets and contact routes. analysis of this disease requires major attention by the government to take the necessary steps in reducing the effect of this global pandemic. in this study, the outbreak of this disease has been analysed for india till 30th march 2020 and predictions have been made for the number of cases for the next 2 weeks. the seir model and a regression model have been used for predictions based on the data collected from the johns hopkins university repository over the period 30th january 2020 to 30th march 2020. the performance of the models was evaluated using rmsle, achieving 1.52 for the seir model and 1.75 for the regression model. the rmsle error rate between the seir model and the regression model was found to be 2.01. also, the value of r0, which represents the spread of the disease, was calculated to be 2.02. expected cases may rise to between 5,000 and 6,000 in the next two weeks. this study will help the government and doctors in preparing their plans for the next two weeks. based on the predictions for short-term intervals, these models can be tuned for forecasting over long-term intervals.
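the rmsle metric reported in the abstract (1.52 for the seir model, 1.75 for the regression model) is standard and can be computed as below; the sample numbers are illustrative only, not the paper's data.

```python
import math

def rmsle(predicted, actual):
    """Root mean squared logarithmic error:
    sqrt(mean((log(1 + pred) - log(1 + actual))^2))."""
    assert len(predicted) == len(actual) and predicted
    sq = [(math.log1p(p) - math.log1p(a)) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(sq) / len(sq))

zero_err = rmsle([10, 20], [10, 20])     # perfect forecast -> 0
err = rmsle([120, 260], [100, 300])      # illustrative forecast vs. actual
```

the logarithmic form penalizes relative rather than absolute errors, which suits case counts that grow exponentially.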
the covid-19 (sars-cov-2) pandemic is a major global health threat. the novel covid-19 has been reported as the most detrimental respiratory virus since the 1918 h1n1 influenza pandemic. according to the world health organization (who) covid-19 situation report [1], as of march 27, 2020, a total of 509,164 confirmed cases and 23,335 deaths had been reported across the world. global spread has been rapid, with 170 countries now having reported at least one case. coronavirus disease 2019 (covid-19) is an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (sars-cov-2). coronaviruses belong to a family of viruses responsible for illnesses ranging from the common cold to deadly diseases such as severe acute respiratory syndrome (sars) and middle east respiratory syndrome (mers), which were first discovered in china [2002] and saudi arabia [2012], respectively. the 2019 novel coronavirus, better known as covid-19, was reported in wuhan, china for the very first time on 31st december 2019. according to jiang et al. [3], the fatality rate for this virus has been estimated to be 4.5%, but for the age group 70-79 this has gone up to 8.0%, while for those >80 it has been noted to be 14.8%. this has led to elderly persons above the age of 50 with underlying diseases like diabetes, parkinson's disease and cardiovascular disease being considered at the highest risk. symptoms of this disease can take 2-14 days to appear and can range from fever, cough and shortness of breath to pneumonia, kidney failure and even death [1]. transmission is person to person via respiratory droplets among close contacts, with the average number of people infected by a patient being 1.5-3.5, but the virus is not considered airborne [2]. there exists a large body of evidence where machine learning algorithms have proven to give efficient predictions in healthcare [4-6]. nsoesie et al. [7] have provided a systematic review of approaches used to forecast the dynamics of influenza pandemics. they have reviewed research papers based on deterministic mass action models, regression models, prediction rules, bayesian networks, seir models, arima forecasting models, etc. recent studies on covid-19 include only exploratory analysis of the available limited data [8-10]. an effective and well-tested vaccine against covid-19 has not been invented, and hence a key part of managing this pandemic is to decrease the epidemic peak, also known as flattening the epidemic curve. the role of data scientists and data mining researchers is to integrate the related data and technology to better understand the virus and its characteristics, which can help in taking the right decisions and making a concrete plan of action. it will lead to a bigger picture in taking aggressive measures in developing infrastructure, facilities and vaccines, and in restraining similar epidemics in the future. the objectives of the current study are as follows. 1. finding the rate of spread of the disease in india. 2. developing a mathematical seir (susceptible, exposed, infectious, recovered) model to evaluate the spread of the disease. 3. prediction of the covid-19 outbreak using seir and regression models. after presenting the background in section i, section ii presents the methodology and models used in this study. section iii covers the analysis, experimental results and performance evaluation. a discussion is provided in section iv, followed by the conclusion in section v.
they reviewed research papers based on deterministic mass action models, regression models, prediction rules, bayesian networks, the seir model, arima forecasting models, etc. recent studies on covid-19 include only exploratory analyses of the available limited data [8] [9] [10]. an effective and well-tested vaccine against covid-19 has not yet been developed, and hence a key part of managing this pandemic is to decrease the epidemic peak, also known as flattening the epidemic curve. the role of data scientists and data mining researchers is to integrate the related data and technology to better understand the virus and its characteristics, which can help in taking the right decisions and forming a concrete plan of action. it will also contribute to the bigger picture of taking aggressive measures in developing infrastructure, facilities and vaccines, and restraining similar epidemics in the future. the objectives of the current study are as follows. 1. finding the rate of spread of the disease in india. 2. developing a mathematical seir (susceptible, exposed, infectious, recovered) model to evaluate the spread of the disease. 3. predicting the covid-19 outbreak using seir and regression models. after presenting background in section i, section ii presents the methodology and the models used in this study. section iii covers the analysis, experimental results and performance evaluation. a discussion is provided in section iv, followed by the conclusion in section v. time series data provided by johns hopkins university, usa has been used for the empirical analysis [12]. the time period of the data is from 30/01/2020 to 30/03/2020. the data include confirmed cases, death cases and recovered cases for all countries. however, this paper focuses only on india's data for the analysis and prediction of covid-19 confirmed patients. the fact that india accounts for approximately 17.7% of the world's population, yet to date has fewer than 1 covid-19 case per million, is the motivation behind this research.
for the analysis and prediction of the number of covid-19 patients in india, the following models have been used. mathematical models can be designed to simulate the effect of disease at many levels, from within-host models (i.e., the interaction of the virus with the cells of the host) to metapopulation models (i.e., how the disease spreads among geographically separated populations). the most important part of such a model is calculating the r0 value. the value of r0 indicates the contagiousness of the disease, and estimating it is a fundamental goal of epidemiologists studying a new outbreak. in simple terms, r0 is the average number of people who can be infected by a single infected person over the course of their infection. if r0 < 1, the spread is expected to stop. if r0 = 1, the spread is stable or endemic. if r0 > 1, the spread is increasing in the absence of intervention, as shown in figure 1. equation (1) calculates the percentage of the population that needs to be vaccinated to stabilize the spread of the disease. (this preprint is made available under a cc-by-nc-nd 4.0 international license; the author/funder has granted medrxiv a license to display the preprint in perpetuity. https://doi.org/10.1101/2020.04.01.20049825) the r0 value of covid-19 for india, calculated from earlier data, has been reported in the range 1.5-4 [11]. assuming a target r0 of 0.5, the lower limit of the population required to be vaccinated is 919.7 million and the upper limit is 1.2071 billion. this remains a very vague figure, so calculating the value of r0 remains an important task. the seir model has four main components, viz. susceptible (s), exposed (e), infected (i) and recovered (r), as shown in figure 2.
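equation (1) is not reproduced in this extract, but the vaccination figures quoted above are consistent with the common threshold formula p = 1 − r_target / r0 applied to an indian population of roughly 1.38 billion. the following is a minimal sketch under that assumption; the function and variable names are ours, not the paper's:

```python
def vaccination_fraction(r0, r_target=0.5):
    """fraction of the population to vaccinate to push the effective
    reproduction number down from r0 to r_target (assumed form of eq. 1)."""
    return 1.0 - r_target / r0

population = 1.3795e9  # assumed population of india, consistent with the text

# reported range of r0 for india: 1.5 - 4 [11]
lower = vaccination_fraction(1.5) * population  # roughly 919.7 million
upper = vaccination_fraction(4.0) * population  # roughly 1.2071 billion
```

with r0 = 1.5 only two thirds of the population would need immunity, while r0 = 4 pushes the requirement to 87.5%, which is why narrowing down r0 matters so much for planning.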
s is the fraction of susceptible individuals (those able to contract the disease), e is the fraction of exposed individuals (those who have been infected but are not yet infectious), i is the fraction of infective individuals (those capable of transmitting the disease) and r is the fraction of recovered individuals (those who have become immune). in figure 2, the infectious rate represents the probability of the disease being transmitted to a susceptible person by an infectious person; the incubation rate is the latent rate at which an exposed person becomes infectious; the recovery rate is determined by 1/d (where d is the duration of infection); and the last rate is that at which recovered people become susceptible again due to low immunity or other health related issues. the ordinary differential equations (odes) for these four compartments are shown in equations 2 to 5. here, n = s + e + i + r is the total population. now, we can calculate the value of r0 using the formula in equation 6. the values of the rate parameters can be calculated from the ordinary differential equations for the four components of the seir model. to describe the spread of covid-19 using the seir model, a few considerations and assumptions were made due to the limited availability of data. they are listed as follows. 1. the numbers of births and deaths remain the same. 2. the reciprocal of the incubation rate is the latent period of the disease, and the reciprocal of the recovery rate is the infectious period. 3.
a recovered person does not become sick again during the calculation period. now, taking 70% of india's population, approximately 966 million, as the susceptible class (s), and assuming that only 1 person was infected initially, with an average incubation period of 5.2 days, an average infectious period of 2.9 days and r0 equal to 4, the seir model without intervention is shown in figure 3 under the assumptions mentioned above. in figure 3, we can observe that the number of susceptible individuals decreases by 80% in the first 100 days under the listed assumptions. the algorithm for the seir model is shown as follows. output: s_out, e_out, i_out, r_out. regression models are statistical processes used to estimate or predict a target (dependent) variable on the basis of independent variables. regression has many variants, such as linear regression, ridge regression, stepwise regression, polynomial regression, etc. this study has used linear regression and polynomial regression for the prediction of covid-19 cases. linear regression is a simple model used to find the relation between a dependent and an independent variable. it uses the values of the intercept and slope to predict the output variable. equation 7 shows the relationship between the dependent and independent variable in a linear regression model: the two parameters represent the intercept and the slope respectively, and the remaining term is the error. this produces a straight line and is mostly used for predictive analysis. to make linear regression more accurate, we minimize the sum of squared residuals between the predicted and actual values.
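the residual-minimization step just described has a closed-form solution for simple linear regression; a minimal sketch (our own helper, not the paper's code):

```python
def fit_linear(xs, ys):
    """ordinary least squares for y = intercept + slope * x,
    minimizing the sum of squared residuals in closed form."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return intercept, slope

# hypothetical example: day index vs. cumulative case counts
days = [1, 2, 3, 4, 5]
cases = [3, 5, 7, 9, 11]          # exactly linear: 1 + 2 * day
b0, b1 = fit_linear(days, cases)  # recovers intercept 1, slope 2
```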
polynomial regression is a special type of regression which models a curvilinear relationship between the dependent and independent values. equation 8 shows the relationship between the dependent and independent variable in polynomial regression: x is the independent variable, the bias term is the intercept, the remaining weights (partial coefficients) are assigned to the polynomial predictors, and n is the degree of the polynomial. the polynomial regression used in this study transforms the data into polynomial features and then applies linear regression to fit the parameters. a polynomial regression with degree equal to 1 is a linear regression. choosing the degree is a challenging task: if the degree of the polynomial is too low, the model will not be able to fit the data properly, and if it is greater than necessary, it will overfit the training data. in india, the first case of covid-19 was reported on 30th january 2020. during the month of february, the number of reported cases was 3 and remained constant for the entire month. the major rise in the spread of the disease started in march 2020. figures 4 and 5 show the change in confirmed cases and death cases from 22nd jan 2020 to 30th mar 2020. data from march show a significant change in the spread of the disease. in the current analysis, we have used data up to 25th march 2020 as our training data and data from 25th march 2020 to 30th march 2020 as the test/evaluation data. before applying the prediction models, we analysed the time series training data to check whether the models could fit the data or not.
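the transform-then-fit idea described above (polynomial features followed by a linear least-squares fit) can be sketched as follows; this is our own small solver using the normal equations, not the paper's implementation:

```python
def fit_polynomial(xs, ys, degree):
    """transform x into polynomial features [1, x, ..., x^degree],
    then solve the least-squares normal equations (X^T X) w = X^T y
    by gaussian elimination. returns [w0, w1, ..., w_degree]."""
    m = degree + 1
    X = [[x ** j for j in range(m)] for x in xs]  # polynomial feature matrix
    A = [[sum(X[r][i] * X[r][j] for r in range(len(xs))) for j in range(m)]
         for i in range(m)]
    b = [sum(X[r][i] * ys[r] for r in range(len(xs))) for i in range(m)]
    # gaussian elimination with partial pivoting
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # back substitution
    w = [0.0] * m
    for i in range(m - 1, -1, -1):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, m))) / A[i][i]
    return w

xs = [0, 1, 2, 3, 4, 5]
ys = [1 + 2 * x + 3 * x * x for x in xs]  # exact quadratic data
w = fit_polynomial(xs, ys, degree=2)      # recovers [1, 2, 3]
```

note that `fit_polynomial(xs, ys, 1)` reduces to ordinary linear regression, matching the remark that degree 1 is the linear special case.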
figure 6 shows the confirmed cases on a log10 scale for the last 15 days of the training data. the reason for using the last 15 days of training data is that they show the major growth in confirmed cases, as shown in figure 4. we then trained the seir model on the data, and the resulting line of fit is shown in figure 7. while applying the seir model, we accounted for interventions such as the quarantine and lockdown announced by the government of india during this period, so we applied a decay function to lower the number of confirmed cases during prediction. we used hill decay in our model, which is a half-decay function; its formula is given in equation 9, where l describes the rate of decay, t is time and k is a shape parameter (dimensionless). half-decay functions never reach zero and have half their original efficacy at time l. since we know covid-19 is contagious and the rate of transfer of the disease from an infected person to a susceptible person is 2.02, we need to predict the rate at which the disease can grow. table 1 shows the prediction results using the two models. in both models, we have used 25 days of training data, as there was no significant trend in india before march 2020. to check the performance of the models used in this study, we used the root mean squared log error (rmsle); the rmsle value was 1.52 for the seir model and 1.75 for the regression model. the rmsle error rate between the seir model and the regression model was found to be 2.01.
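equation 9 is not reproduced in this extract; a common hill-decay form consistent with the description (equal to 1 at t = 0, 0.5 at t = l, never reaching zero) is 1/(1 + (t/l)^k). the sketch below combines that assumed form with a forward-euler seir integration (working in absolute counts rather than fractions) and the rmsle metric; the decay parameters and all names are ours, not the paper's:

```python
import math

def hill_decay(t, L, k):
    """assumed form of eq. 9: 1 at t=0, 0.5 at t=L, approaching 0 slowly."""
    return 1.0 / (1.0 + (t / L) ** k)

def rmsle(actual, predicted):
    """root mean squared log error between two series."""
    return math.sqrt(sum((math.log1p(a) - math.log1p(p)) ** 2
                         for a, p in zip(actual, predicted)) / len(actual))

def simulate_seir(n, i0, r0, incubation_days, infectious_days, days,
                  L=30.0, k=2.0):
    """forward-euler seir integration with the transmission rate
    damped over time by hill decay (intervention effect)."""
    gamma = 1.0 / infectious_days   # recovery rate
    sigma = 1.0 / incubation_days   # incubation rate
    beta0 = r0 * gamma              # baseline transmission rate
    s, e, i, r = n - i0, 0.0, float(i0), 0.0
    history = [(s, e, i, r)]
    for t in range(days):
        beta = beta0 * hill_decay(t, L, k)
        new_inf = beta * s * i / n  # flow from susceptible to exposed
        ds = -new_inf
        de = new_inf - sigma * e
        di = sigma * e - gamma * i
        dr = gamma * i
        s, e, i, r = s + ds, e + de, i + di, r + dr
        history.append((s, e, i, r))
    return history

# parameters quoted in the text: 966 million susceptible, one initial
# infection, incubation 5.2 days, infectious period 2.9 days, r0 = 4
hist = simulate_seir(n=966e6, i0=1, r0=4.0,
                     incubation_days=5.2, infectious_days=2.9, days=100)
```

the decay factor multiplies the transmission rate, so the simulated epidemic grows more slowly as the lockdown takes hold, which is the effect the text attributes to the decay-based intervention.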
the current trend suggests that a linear trend will continue over the next few days, as the control mechanisms adopted by the government of india are fairly strict and working well for the time being. also, with linear trends, recovering patients can be managed easily and the death rate can be controlled as well. the case counts in the current study may explode exponentially, as shown by gupta and pal [13], if stringent control measures are not taken by the government. hospital provisioning and medical facility enhancement work should continue at a very rapid pace to prepare the country for exponential growth, if it occurs. however, with the current interventions and preparations, the government of india is looking to flatten the curve. during the prediction of the confirmed cases shown in table 1, there were a few challenges associated with the data. the data were not stationary and showed exponential growth after 40 days from 22nd jan 2020, as shown in figure 4. overfitting remains a major problem with disease-spread time series data. in this model, we have addressed the overfitting problem using a decay-based intervention. another problem faced in this study was the shortage of training data. data for 25 days were used for training and 5 days of data for validation, based on which the number of confirmed cases for the next 14 days was predicted. the training data are very limited for any machine learning model to train on. also, rapid changes in the number of infected cases occurred in mid-march. the seir model shows an advantage, as it does not grow exponentially with time but instead incorporates intervention methods over time; for intervention, a hill decay model was used. in the case of a regression model, different features can be used for decay or intervention, e.g. the number of recovered cases, but the growth of the regression line still remains a problem.
for a regression model, we always need to retrain the model after some time as the trend in the data changes. the seir model also relies on assumptions such as the size of the susceptible class, for which we have taken 70% of the population. in this study, we have only predicted the number of confirmed cases. in predicting the number of death cases we faced many problems of data non-stationarity; with limited data, the model was not able to predict the number of death cases properly. we have used only time series data for confirmed cases and death cases in this study. using other data related to weather, the geographic layout of the country, state-level population and governance parameters, the model's prediction accuracy could be further improved. in this study, two models, seir and regression, were used to analyse and predict the change in the spread of the covid-19 disease. we analysed the data and found that the number of cases per million in india was less than 0.5 up to 30th march 2020. then, with the help of the seir model, the value of r0 was computed to be 2.02. we also predicted the number of confirmed covid-19 cases for the next 14 days, from 31st march 2020 to 13th april 2020. during performance evaluation, the rmsle value was 1.52 for the seir model and 1.75 for the regression model. the results obtained in this study are based on training data up to 30th march 2020. further, looking at the trend, there is certainly going to be an increase in the number of cases.
doctors, health workers and people involved in providing essential services have to be protected in accordance with prescribed medical norms. community spread in the future, due to the carelessness of individuals as well as groups, can exponentially increase the number of cases. the peak is yet to come, hence the government has to be extra vigilant and enforce strict measures. in addition, the provision of medical facilities across the country has to be aggressively enhanced. in the future, an automated algorithm can be developed to fetch data at regular intervals and automatically predict the number of cases on weekly and biweekly horizons. in this way, government and hospital facilities can also maintain a check on the supply and the medical assistance / isolation required for new patients.

references:
1. world health organization: covid-19 situation report.
2. characteristics of and important lessons from the coronavirus disease 2019 (covid-19) outbreak in china: summary of a report of 72 314 cases from the chinese center for disease control and prevention.
3. review of the clinical characteristics of coronavirus disease 2019 (covid-19).
4. predicting hepatitis b virus-positive metastatic hepatocellular carcinomas using gene expression profiling and supervised machine learning.
5. controlling testing volume for respiratory viruses using machine learning and text mining.
6. volatile fingerprinting of human respiratory viruses from cell culture.
7. a systematic review of studies on forecasting the dynamics of influenza outbreaks.
8. investigating a serious challenge in the sustainable development process: analysis of confirmed cases of covid-19 (new type of coronavirus) through a binary classification using artificial intelligence and regression analysis.
9. a serological survey of canine respiratory coronavirus in new zealand.
10. risk factors associated with acute respiratory distress syndrome and death in patients with coronavirus disease.
11. prudent public health intervention strategies to control the coronavirus disease 2019 transmission in india: a mathematical model-based approach.
12. an interactive web-based dashboard to track covid-19 in real time.
13. trend analysis and forecasting of covid-19 outbreak in india.

key: cord-263620-9rvlnqxk authors: li, zhi-chun; huang, hai-jun; yang, hai title: fifty years of the bottleneck model: a bibliometric review and future research directions date: 2020-09-30 journal: transportation research part b: methodological doi: 10.1016/j.trb.2020.06.009 sha: doc_id: 263620 cord_uid: 9rvlnqxk abstract the bottleneck model introduced by vickrey in 1969 has been recognized as a benchmark representation of peak-period traffic congestion due to its ability to capture the essence of congestion dynamics in a simple and tractable way. this paper aims to provide a 50th anniversary review of bottleneck model research since its inception. a bibliometric analysis approach is adopted to identify the distribution of all journal publications, influential papers, top contributing authors, and leading topics over the past half century. the literature is classified according to recurring themes into travel behavior analysis, demand-side strategies, supply-side strategies, and joint strategies of demand and supply sides. for each theme, typical extended models developed to date are surveyed. some potential directions for further studies are discussed. the bottleneck model was first introduced by vickrey in 1969, aiming to address the departure time choices of commuters on a bottleneck-constrained highway during the morning rush hours. in this model, all individuals are assumed to have an identical preferred time to arrive at their destination, and to incur a schedule delay cost proportional to the amount of time by which they arrive early or late. commuters choose their departure time to minimize their own travel cost, based on a trade-off between the bottleneck congestion delay cost and the schedule delay cost of early or late arrival.
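the trade-off just described yields, in the standard α-β-γ formulation of the model (α the unit cost of queuing time, β of early arrival, γ of late arrival, with γ > α > β), well-known closed-form results for the no-toll equilibrium. the sketch below reproduces those textbook formulas with illustrative numbers of our own choosing; it is not code from this review:

```python
def bottleneck_equilibrium(N, s, t_star, alpha, beta, gamma):
    """no-toll equilibrium of vickrey's bottleneck model: N commuters,
    bottleneck capacity s (veh/h), common desired arrival time t_star.
    returns the departure window and the identical equilibrium trip cost."""
    assert gamma > alpha > beta > 0, "standard assumption gamma > alpha > beta"
    peak = N / s                                       # rush-hour duration
    t_start = t_star - gamma / (beta + gamma) * peak   # first departure
    t_end = t_start + peak                             # last departure
    delta = beta * gamma / (beta + gamma)
    cost = delta * N / s                               # equilibrium trip cost
    return t_start, t_end, cost

# illustrative numbers (ours): 9000 commuters, capacity 3000 veh/h,
# desired arrival at 9:00, alpha = 10, beta = 5, gamma = 20 ($/h)
t0, t1, c = bottleneck_equilibrium(9000, 3000, 9.0, 10.0, 5.0, 20.0)

# the first and last drivers face no queue, so their costs are pure
# schedule delay; in equilibrium these equal the common trip cost:
assert abs(5.0 * (9.0 - t0) - c) < 1e-9   # early-arrival cost at t0
assert abs(20.0 * (t1 - 9.0) - c) < 1e-9  # late-arrival cost at t1
```

the equal-cost property checked by the assertions is exactly the departure-time equilibrium condition described in the text: no commuter can lower their cost by departing at a different time.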
this model captures the formation and dissipation of the queue behind the bottleneck in a simple and tractable way, making it a benchmark representation of the dynamics of peak-period traffic congestion. the past 50 years (from 1969 to 2019) have witnessed significant progress in bottleneck model research since the pioneering work of vickrey (1969). many insights into the features of peak-period traffic congestion have been obtained via the bottleneck model. these insights cover various aspects, such as behavioral analysis (e.g., the nature of the shifting peak, the inefficiency of unpriced equilibria, behavioral differences among heterogeneous commuters, the connection between morning and evening commutes, and the effects of commuter scheduling preferences), demand management (e.g., congestion / emission / parking pricing and tradable credit schemes, and the relationship between bottleneck congestion tolling and urban structure), and supply management (e.g., bottleneck / parking capacity expansion). these insights also play an important role in understanding the essence of commuters' travel behavior during morning/evening peak periods, and in evaluating and designing reasonable transport policies for alleviating peak-period traffic congestion. to date, there have been a few reviews on the topic of bottleneck models and their variations (e.g., arnott et al., 1998; lindsey and verhoef, 2001; small, 1992; small, 2015). these early reviews appeared in different years, each tracking the development of the bottleneck models on selected specific topics. because the research area is still growing and new disruptive trends of automation and sharing in mobility are emerging, it is timely to provide a state-of-the-art review of this area, particularly on the occasion of its 50th anniversary. this paper attempts to provide a systematic and critical review that differs from previous reviews in several aspects.
first, it is meaningful to conduct a bibliometric study of the large body of literature to mark the 50th anniversary of vickrey's bottleneck model. to do so, we carry out a literature review analyzing the research progress on the bottleneck model throughout the past half century. the review tries to cover all relevant topics published in journals, rather than the specific ones covered in previous review papers. second, a bibliometric analysis approach is adopted that can trace the footprints underlying the scholarly publications by constructing network connections among publications, journals, researchers, and keywords. with the aid of visualization techniques (e.g., the vosviewer software), the bibliometric approach can map the landscape of the knowledge domain of bottleneck model studies, allowing us to clearly identify the distribution of publications by journal, influential papers, top contributing authors, and leading topics. third, based on the bibliometric analysis, a critical review of previous relevant studies is provided, together with discussions of current research gaps and opportunities. it is noted that a bottleneck system consists of the following elements: users, the authority (or the government), and the bottleneck (i.e., the transport infrastructure). from the perspectives of these elements, we categorize the literature into four classes: travel behavior analysis, demand-side strategies, supply-side strategies, and joint strategies of demand and supply sides. travel behavior analysis, from the users' perspective, focuses on the equilibrium analysis of commuters' travel choice behavior, such as the choices of departure time, route, mode, and/or parking. demand-side strategies, from the government's perspective, refer to travel demand management strategies, such as congestion / emission / parking pricing and tradable credit schemes.
supply-side strategies, from the transport infrastructure's perspective, include such topics as bottleneck capacity expansion and parking capacity design. joint strategies of demand and supply sides, from both the government's and the transport infrastructure's perspectives, are a hybrid of demand-side and supply-side strategies. for each theme, typical models proposed in previous studies are reviewed. the remainder of this paper is organized as follows. in the next section, a bibliometric study is conducted. section 3 presents a literature review based on the literature categorization. in section 4, some potential directions for further studies are discussed. finally, section 5 concludes the paper. this section provides a general bibliometric analysis of the bottleneck model studies. bibliometric analysis uses quantitative methods to classify bibliometric data and build representative summaries. it has been recognized as a useful approach for analyzing the performance of journals, institutes and authors, as well as the characteristics of research fields or topics. with the aid of visualization techniques (e.g., the vosviewer software), bibliometric networks, such as the co-citation network, co-authorship network, and keyword co-occurrence network, can be constructed and visually presented. to measure the influence of publications, authors and journals, various bibliometric indicators are considered, including the number of publications, total citations, and citations per paper. to collect the publication data since 1969, we search three well-recognized journal databases or search engines, namely the web of science core collection, scopus, and google scholar, using such topics or keywords as bottleneck, bottleneck model(s), morning commute or commuting, and bottleneck congestion. we further retrieve literature by tracking the references cited by the papers found in the three databases.
in particular, we check all references citing the original work of vickrey (1969), entitled "congestion theory and transport investment". after repeated sifting and checking, a total of 232 relevant papers published during the period 1969-2019 are finally retrieved. during the 20 years 1990-2009, this topic received growing attention, with a total of 68 relevant papers published; the number of relevant publications per five-year period exceeds 10, more than the total number of publications during the first 15 years (1969-1984). during the past 10 years, from 2010 to 2019, this topic attracted further increasing interest, and a total of 149 relevant papers were published, accounting for 64.2% of the total number of publications in the past 50 years. in particular, the largest amount emerges in the most recent 5 years, 2015-2019, with 82 publications (about 35.3% of the total). this continued growing tendency clearly shows that the bottleneck model is still an important and hot research topic in the field of transportation, and this tendency is expected to continue in the coming years. table 1 shows the top 15 journals (out of a total of 38 journals) by the number of published related papers. it can be seen that these journals mainly belong to the "transportation" and "economics" categories in terms of the journal categories in the jcr (journal citation reports) published by thomson reuters. transportation research part b (tr-b for short, a leading journal in the transportation field) leads table 1 with 75 papers (accounting for 32.3% of the total number of publications), followed by journal of urban economics (jue, a leading journal in the urban economics field) with 27 papers (accounting for 11.6%). the total percentage of papers published in these two journals (a total of 102 papers) reaches nearly half of the total number of publications (about 44.0%). the entries of table 1 recoverable from ranks 5 to 15 are: 5. transportation research part c (10); 6. economics of transportation (9); 7. transportation research record (9); 8. transportation research part e (8); 9. regional science and urban economics (7); 10. journal of transport economics and policy (6); 11. american economic review (5); 12. transportmetrica a: transport science (5); 13. applied economics (4); 14. journal of public economics (4); 15. transportmetrica b: transport dynamics (4). the number of papers published in each of transportation science (ts) and transportation research part a and part c (tr-a, tr-c) reaches 10 or more. notably, as a young journal founded in 2012, economics of transportation published 9 papers on this topic. in order to look at the co-citation relations among the journals publishing the 232 papers, a bibliographic coupling of the 38 journals is conducted, as shown in fig. 2. the size of a solid circle (or vertex) represents the number of publications related to the topic of the bottleneck model in a journal. a line between circles represents a co-citation relationship between journals. the color of a line represents the cluster of journals, such as the journal categories of economics or transportation. the width of the lines between circles represents the co-citation degree or intensity between journals (i.e., the total number of co-citations of the documents in the journals concerned). specifically, a thick line means a strong co-citation degree between journals, and vice versa. it can be seen that the papers published in tr-b are strongly co-cited with those published in tr-a, tr-c, tr-e, jue, ts, trr (transportation research record), economics of transportation, transportmetrica a, jtep (journal of transport economics and policy), networks and spatial economics, and journal of public economics.
the papers published in jue have strong co-citation relations with those published in tr-b, rsue (regional science and urban economics), and economics of transportation. we now look at the most influential papers on the topic of the bottleneck model during the past five decades, as determined by total citations or average citations per year. it should be pointed out that in this paper, citation counts are based on the sci/ssci citation databases; here, sci/ssci means the science citation index expanded and the social science citation index in the web of science core collection. table 2 shows the top 50 most influential papers, each having more than 50 citations. the top 3 most influential papers are vickrey (1969), small (1982) and adl (1993a) in terms of the total number of citations. here, "adl" refers to the first letters of the surnames of three scholars, i.e., arnott r, de palma a, and lindsey r. all 3 most-cited papers are from american economic review (aer), which is a well-recognized top economics journal. in particular, the pioneering work of vickrey (1969) is the most influential paper, with the highest total citations of 944 and the highest average citations of 18.51 per year. there are 18 papers each having more than 100 citations, and 9 papers each having an average of no less than 10 citations per year. it should be pointed out that the work of fosgerau and karlstrom (2010), entitled "the value of reliability", has had a total of 140 citations and ranks second in terms of average citations per year, despite its short publication history. 15 of the top 50 papers are from tr-b, 8 from ts, 5 from jue, and 5 from tr-a. in terms of total citations, 5 of the top 10 most influential papers were written by adl.
in terms of average citations per year, 5 of the top 10 most influential papers were published in tr-b, with publication years between 2010 and 2013 and research focuses on tradable credit schemes (nie and yin, 2013; xiao et al., 2013) and the values of travel time and its variance (fosgerau and karlstrom, 2010; fosgerau and engelson, 2011). in order to understand the co-citation relationships among authors in bottleneck model research, fig. 3 shows the bibliographic coupling network of authors, in which a solid circle represents a researcher and an edge represents the co-citation between a pair of researchers. the size of a solid circle represents the number of papers published by a researcher, and the width of an edge represents the co-citation intensity between the studies of that pair of authors. as far as the total number of related publications is concerned, the authors de palma a, lindsey r, huang hj, yang h, fosgerau m, arnott r, verhoef et, zhang hm, liu w, and van den berg vac are the most productive and influential top 10 authors (see also table 3), as they are associated with large circles. table 3 further shows the top 23 influential authors in terms of the total number of publications (no fewer than 5 papers each), together with the total number of citations and the average citations per paper. the research institute and country/area of each author are also indicated in this table. de palma a leads the list in the total number of publications, with 31 publications, followed by lindsey r and huang hj, each with 23 publications. 10 authors have more than 10 publications. small ka, adl, and daganzo cf are among the top 5 most influential authors in terms of average citations per year. de palma a leads the table in the total number of citations (2185), and small ka leads the list in average citations per paper, with an average of 114 citations per paper.
the other author having more than 100 citations per paper is arnott r, reaching an average of 101 citations per paper. in order to identify the research hotspots in the bottleneck model research, fig. 4 shows the bibliographic coupling network of keywords, in which the size of a solid circle represents the number of occurrences of a keyword, and the width of a line represents the co-occurrence degree of the two keywords connected by that line. one can find some high-frequency keywords in fig. 4, such as "traffic congestion", "bottleneck model", "transportation", "commuting", "morning commute", "travel time", "costs", "travel behavior", "traffic control", "numerical model", "traffic management", "scheduling", "departure time choice", "user equilibrium", "parking", "road pricing", "congestion pricing", "travel time variabilities", and "heterogeneity". in order to make the review clearer, a cluster analysis of the 232 papers is conducted. we categorize the 232 papers into four classes in terms of their research focuses: travel behavior analysis, travel demand management (i.e., demand-side strategies), infrastructure operations and management (i.e., supply-side strategies), and joint strategies of demand and supply sides. the travel behavior analysis mainly focuses on the analysis of the trip and/or activity scheduling behavior of travelers through building various travel choice behavior models, covering departure time / route / parking / mode choices, morning vs evening commutes, piecewise constant vs time-varying scheduling preferences, normal congestion vs hypercongestion, homogeneous vs heterogeneous users, individual vs household travel, deterministic vs stochastic situations, single vs multiple bottlenecks, and the analytical approach vs the dta (dynamic traffic assignment) approach.
travel demand management focuses on a set of strategies and policies to reduce travel demand, or to redistribute the demand in space and/or time, including congestion / emission / parking pricing and their effects on urban system. infrastructure operations and management means to determine the optimal capacity or service level of infrastructure elements (e.g., road bottleneck, parking lot, airport, port). joint strategies are a hybrid of both demand-side and supply-side strategies. among these modules, travel behavior analysis is a basis of the travel demand management studies and the infrastructure operations and management studies. the travel demand management strategies and the infrastructure operations and management strategies interplay through demand-supply interaction. the interrelationships among them are shown in fig. 5 . the shaded part in fig. 5 represents the joint strategies of demand and supply management. in the following section, we will provide a systematic review of the bottleneck model studies published in the past half century based on the classification in fig. 5 . the classical vickrey's bottleneck model aims to model the departure time choice behavior of commuters during the morning commute. for the convenience of readers, the detailed formulation of the classical bottleneck model is provided in appendix. in this subsection, some basic assumptions underlying this model are presented. various extensions to relax these assumptions are then reviewed. these extensions include considerations of other travel choice dimensions (e.g., route / parking / mode choices), morning-evening commutes, time-varying scheduling preferences, vehicle physical length in queue and hypercongestion, heterogeneous users, household travel and carpooling, stochastic models and information, multiple bottlenecks, and dta-approach bottlenecks. 
the classical vickrey's bottleneck model, as a stylized representation of the dynamics of traffic congestion, has been widely recognized as an important tool for modeling the formation and dissipation of queuing at a bottleneck in rush hours. in the model, it is assumed that homogeneous commuters travel from a single origin (home) to a single destination (workplace) along a single road that has a bottleneck with a fixed capacity during the morning rush hours. all commuters choose their departure time based on a trade-off between the bottleneck queuing delay and the schedule delay of arriving early or late. equilibrium is reached when no individual has an incentive to alter his/her departure time. the attractiveness of vickrey's bottleneck model lies in its ability to derive closed-form solutions for the equilibrium departure interval (i.e., the departure times of the first and last commuters from home), equilibrium departure rate, equilibrium queuing delay at the bottleneck, and equilibrium cumulative departures and arrivals. the derivations of these analytical solutions are built on some strong assumptions, stated as follows. 1) departure time choice for morning commute. the classical bottleneck model involves only the departure time choice dimension for the morning commute. other travel choice dimensions, such as route, mode and parking choices, and the evening commute are not taken into account. in reality, commuters may also decide on their travel route and/or travel mode besides departure time, subject to parking capacity constraints. moreover, their travel decisions are usually based on day-long schedules, not on morning or evening activity schedules only. some studies, e.g., de palma and lindsey (2002a), zhang et al., (2005), and li et al., (2014), showed that commuters' morning and evening departure-time decisions are interdependent under some conditions, and that the morning and evening departure patterns for specific individuals are not symmetric.
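the closed-form equilibrium described above can be sketched numerically. the snippet below (a minimal sketch, using the standard textbook formulas for the no-toll equilibrium under piecewise constant α-β-γ preferences; all parameter values are hypothetical) computes the equilibrium departure interval and the common trip cost.

```python
# sketch of the closed-form no-toll equilibrium of the classical bottleneck
# model under piecewise constant alpha-beta-gamma scheduling preferences;
# textbook formulas, with purely hypothetical parameter values below.

def vickrey_equilibrium(N, s, alpha, beta, gamma, t_star):
    """N commuters, bottleneck capacity s (veh/h), value of travel time alpha,
    schedule-delay values beta (early) and gamma (late), and common desired
    arrival time t_star (h). requires alpha > beta and gamma > 0."""
    assert alpha > beta > 0 and gamma > 0
    delta = beta * gamma / (beta + gamma)                # composite schedule-delay parameter
    t_first = t_star - gamma * N / ((beta + gamma) * s)  # first departure from home
    t_last = t_star + beta * N / ((beta + gamma) * s)    # last departure from home
    cost = delta * N / s                                 # identical equilibrium trip cost
    return t_first, t_last, cost

t0, t1, c = vickrey_equilibrium(N=6000, s=4000, alpha=10.0, beta=5.0, gamma=20.0, t_star=9.0)
print(f"rush hour runs from {t0:.2f} to {t1:.2f} (length N/s = {t1 - t0:.2f} h), "
      f"equilibrium cost {c:.2f} per commuter")
```

note that the rush-hour length equals N/s regardless of preferences, while the equilibrium cost depends only on the composite parameter δ = βγ/(β+γ): the value of travel time α affects how queuing and schedule delay split, not the total cost.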
it is, thus, necessary to consider multiple travel choice dimensions of commuters on a whole-day basis. 2) piecewise constant scheduling preferences. vickrey's bottleneck model assumes that the value of travel time and the values of schedule delay of arriving early or late are constants, usually denoted by the three parameters α, β and γ, respectively. however, some empirical studies have confirmed that the marginal utility of time for performing an activity at a certain location changes over time (see, e.g., tseng and verhoef, 2008; jenelius et al., 2011; hjorth et al., 2013; peer and verhoef, 2013; peer et al., 2015). it is therefore meaningful to relax the assumption of piecewise constant scheduling preferences to develop a scheduling model with time-varying marginal activity utilities. 3) normal congestion with point queue (or vertical queue). vickrey's bottleneck model assumes that traffic flow does not fall under heavily congested conditions and that flow increases with density, which is called "normal (or ordinary) congestion". however, in reality, a phenomenon of hypercongestion (i.e., flow decreases with density) may occur in the downtown areas of major cities during rush hours. on the other hand, the point-queue or vertical-queue assumption holds that any vehicle that has to queue before passing through a bottleneck is stacked in a vertical pile at the bottleneck, i.e., vehicles stack vertically and queues take place at a point. a vertical queue does not occupy any road space and has no influence on upstream approaching vehicles. however, in reality, vehicles have physical lengths, which influence the movements of vehicles at the bottleneck and thus the queuing delays; in particular, queue spillback may block the upstream link. 4) homogeneous individuals. the traditional bottleneck models mainly focus on individuals' travel choice behavior, and assume that the travel choice decision of an individual in a household is independent of that of other individuals.
however, in reality, the interdependencies between household members (e.g., due to limited number of cars) indeed influence the activity schedules of household members. the classical bottleneck model also assumes that all commuters are homogeneous, i.e., they have the same desired arrival time and the same values of travel time and schedule delay. however, some studies have shown that there are big differences in the travel choice behavior of heterogeneous commuters due to their different travel preferences. there is thus a need to consider the heterogeneity of users. 5) deterministic model. a bottleneck system is in general a dynamic and stochastic system. the dynamicity and stochasticity result from various random events, ranging from non-recurrent random incidents, such as traffic accident, vehicle breakdown, signal failure, adverse weather and earthquake, to recurrent fluctuations in travel demand and capacity by time of day, day of week, and season. the travel time or queuing delay at the bottleneck is thus a stochastic variable. furthermore, commuters may not have perfect information about the traffic condition, and thus cannot perceive the travel time accurately, leading them to make travel choice decisions somewhat haphazardly. 6) only one bottleneck. the classical vickrey's bottleneck model assumes that there is only one single bottleneck on the highway connecting commuters' home and workplace. however, in reality one commuter often traverses multiple bottlenecks on his/her way to work, e.g., a y-shaped highway corridor with upstream and downstream bottlenecks. it is thus worthwhile to extend the single-bottleneck model to a multi-bottleneck case. 7) analytical approach. the classical bottleneck model has well-defined analytical solutions, as shown in appendix. this is because it tackles a single bottleneck only. 
in order to promote the realistic applicability of the model, it is necessary to extend the analytical bottleneck model to a general network with many links and many od (origin-destination) pairs. to do this, a point-queue dta approach (i.e., treating the queues on congested links in the network as point bottlenecks) and a traffic simulation technique may be adopted. the aforementioned assumptions play a significant role in deriving the analytical solutions of the bottleneck model and in revealing the nature of congestion dynamics. however, they also restrict the model's explanatory power and its applications to the general case, because they ignore many realistic characteristics of the traffic system. to strengthen the realism of the bottleneck model, these assumptions have been relaxed in the literature through various extensions, which are reviewed in turn as follows. the classical vickrey's bottleneck model concerns only the departure time choice of commuters. in the literature, some extensions have been made to incorporate other travel choice dimensions, such as route / parking / mode choices. in terms of the route choice dimension, arnott et al., (1990b) presented a simultaneous departure time and route choice model for a network with one od pair and two parallel routes. it showed that at the no-toll equilibrium, the number of users on each route coincides with that in the social optimum. optimal uniform and step tolls divert users towards longer routes, but only slightly. an optimal time-varying toll eliminates queuing without affecting route usage. arnott et al., (1992) and liu and nie (2011) proposed multi-class departure time and route choice models for identifying the behavioral differences between user classes. siu and lo (2014), among others, further addressed the simultaneous departure time and route choice problem in a bottleneck system with uncertain route travel time.
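the point-queue idealization mentioned above (treating each congested link as a point bottleneck that occupies no road space) can be sketched in a few lines; the inflow profile below is a made-up illustration, not data from any cited study.

```python
# minimal point-queue (vertical-queue) dynamics as assumed in the bottleneck
# model and in point-queue dta: the queue absorbs any inflow above capacity
# and occupies no road space. the inflow profile below is made up.

def point_queue(inflows, capacity, dt=1.0):
    """evolve the queue length (vehicles) for a sequence of inflow rates
    (vehicles per unit time) at a bottleneck of fixed capacity."""
    queue, history = 0.0, []
    for rate in inflows:
        queue = max(0.0, queue + (rate - capacity) * dt)  # no spillback, no vehicle length
        history.append(queue)
    return history

inflow = [30, 50, 70, 50, 30, 10, 0, 0]          # hypothetical departure rates
print(point_queue(inflow, capacity=40))
# the queue grows while inflow exceeds 40 and then drains at the capacity rate
```

because the queue is a pure accumulator, vehicles behind it are unaffected; this is exactly the simplification that the physical-length and spillback extensions reviewed later relax.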
recently, kim (2019) empirically estimated the social cost of traffic congestion in the us using a simultaneous departure time and route choice bottleneck model. it was shown that the annual cost of congestion borne by all us commuters is about 29 billion dollars. these simultaneous departure time and route choice models have derived some insights into the nature of commuters' route choice behavior during the morning commute. however, these studies usually considered only one od pair and an auto-only bottleneck system. they also assumed that total travel demand was fixed (or inelastic), and parking, as a source of congestion at the destination, was ignored. besides the route choice dimension, the parking dimension has also been incorporated in bottleneck model studies, as shown in table 4. it can be seen in table 4 that most of the parking studies considered the morning commuting problem from home to work through a bottleneck-constrained route. some exceptions are zhang et al. (2008, 2019), who studied the integrated morning and evening commutes through one two-way route with one bottleneck each way. some studies, such as arnott et al. (1991b), zhang et al. (2008, 2019), and liu (2018), made a strong assumption about the parking order, i.e., commuters park outwards from (or inwards to) the cbd. qian et al. (2012) divided the parking lots into two discrete classes: a closer and a farther parking cluster. others, like yang et al. (2013) and liu et al. (2014a), considered parking reservation issues without/with expiration time and the effects of parking space constraints on departure time and parking location choices. liu and geroliminis (2016) examined the effects of cruising-for-parking on commuters' departure time choices using an mfd (macroscopic fundamental diagram) approach.
however, all these studies did not address the parking duration issue (lam et al., 2006; li et al., 2008), which directly affects parking turnover and thus the real-time number of available parking spaces in a parking lot. in addition, the bottleneck model has also been extended to consider the mode choice dimension. for the convenience of readers, we summarize in table 5 some principal contributions to multi-modal bottleneck problems (in the table, "×" represents "no" and "√" represents "yes"). it can be noted that most studies considered two physically separated modes (auto and rail), and thus cannot consider the congestion interaction between modes. moreover, some studies, such as tabuchi (1993), danielis and marcucci (2002), and gonzales and daganzo (2012), ignored the effects of passenger crowding discomfort in transit vehicles on commuters' travel choices. however, other studies, such as huang (2000, 2002), huang et al., (2007), and de palma et al. (2017), showed that in-vehicle passenger crowding discomfort has a significant effect on passengers' travel choices. in order to achieve a social-optimum system, in-vehicle passenger crowding in transit vehicles should be incorporated in transit service optimization, together with passenger wait time at transit stops due to insufficient vehicle capacity. as previously stated, the standard bottleneck model focuses mainly on morning commuting problems, and little attention has been paid to evening or day-long commuting problems. this may be because the evening commute is usually seen as a symmetric reverse process of the morning commute.
some studies, such as vickrey (1973), de palma and lindsey (2002a), gonzales and daganzo (2013), and li et al., (2014), have shown that the morning and evening equilibrium departure patterns are not symmetric under some conditions, e.g., when the bottleneck system has multiple alternative travel modes, or when commuters are heterogeneous in terms of their preferred work start/end times and/or their values of travel time and schedule delay. although investigation of the morning and evening commuting problems in isolation may provide some important insights, in reality commuters usually make travel decisions based on their day-long activity schedules. to date, only a few published papers have involved the analysis of day-long commuting problems. for example, zhang et al., (2008) presented an integrated day-long commuting model that links the morning and evening commuting trips via parking location choice. gonzales and daganzo (2013) incorporated the mode choice dimension in the integrated morning and evening commuting problem. daganzo (2013) further examined the two-mode day-long commuting problem when the wish times of arriving at and departing from the workplace follow a continuous distribution. the day-long commuting models mentioned above adopted a trip-based modeling approach, and thus the time allocations of commuters to activities and travel during a day cannot be properly addressed. different from the trip-based morning-evening commuting models, zhang et al., (2005) presented a day-long activity-travel scheduling model to address commuters' time allocations among activities and travel during a day. their model connects the home-to-work commute in the morning and the work-to-home commute in the evening via work duration. li et al., (2014) investigated the properties of the day-long activity-travel scheduling model.
they presented a sufficient and necessary condition for interdependence between the morning and evening departure-time decisions, i.e., the marginal utility of the work activity is not a constant, but depends on both the clock time of day and the work duration (implying a flexible work-hour scheme). recently, zhang et al., (2019) further investigated autonomous-vehicle-oriented morning-evening commuting and parking problems. these previous studies usually considered a simple activity chain, namely the home-work-home chain. however, in reality commuters may engage in other activities before work (e.g., taking the kid to school) or after work (e.g., shopping or recreation). it is thus meaningful to incorporate other activity participation in the day-long activity-travel scheduling model. vickrey's bottleneck model assumes that the value of travel time and the values of schedule delays of arriving early and late are constants α, β, and γ, respectively (see eq. (a1) in the appendix). this assumption has been widely adopted in various extensions or variations of vickrey's bottleneck model. however, some previous empirical studies have confirmed that the marginal activity utility varies in time and space. vickrey (1973) formulated a departure time choice model for the morning commuting problem in which the utilities derived from time spent at home and at work are linear functions of time. tseng and verhoef (2008) estimated a scheduling model of the morning commuting problem in which marginal utilities vary nonlinearly over the time of day. jenelius et al., (2011) explored the effects of activity scheduling flexibility and interdependencies between different segments in a daily trip chain on delay cost and value of time.
hjorth et al., (2013) empirically estimated different types of activity scheduling preference functions (including const-step, const-affine and const-exp formulations) and compared them to a more general form (an exp-exp formulation) with regard to model fit, based on stated preference survey data collected from car commuters traveling in the morning peak in the city of stockholm. abegaz et al., (2017) used stated preference data to compare the valuation of travel time variability under a structural model where trip-timing preferences are defined in terms of time-dependent utility rates (i.e., a "slope model") against its reduced-form model where departure time is assumed to be optimally chosen. fosgerau and small (2017) presented a dynamic model of traffic congestion in which scheduling preferences are endogenously determined. this differs from traditional activity scheduling models, in which the scheduling preferences are assumed to be exogenously given. a further study developed an activity-based bottleneck model for investigating the step tolling problem, in which the activity scheduling utilities of commuters at home and at work vary by the time of day. it showed that ignoring the preference heterogeneity of commuters would underestimate the efficacy of a step toll. recently, li and huang (2018) investigated the user equilibrium problem of a single-entry traffic corridor with continuous scheduling preferences. the results showed that the introduction of continuous scheduling preferences makes the inflow rate of early arrivals first increase and then decrease. even though the introduction of continuous scheduling preferences can smooth the departure rate of commuters and make the user equilibrium flow pattern more stable, a series of shock waves still exist due to discontinuities in departure rates or a sharply decreasing inflow rate at the entry point of the corridor.
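the "slope model" idea discussed above can be made concrete with a small numerical sketch: trip-timing preferences are expressed as time-dependent marginal utility rates at home, h(t), and at work, w(t), instead of constant α-β-γ parameters. the linear rate functions and the travel time below are hypothetical choices for illustration, not estimates from any cited study.

```python
# "slope model" sketch: trip-timing preferences expressed as time-dependent
# marginal utility rates at home, h(t), and at work, w(t), instead of constant
# alpha-beta-gamma parameters. rates and travel time are hypothetical.

def scheduling_utility(t_dep, T, h, w, day_end=12.0, n=12000):
    """utility of departing home at t_dep with fixed travel time T: time before
    departure accrues h(t), time after arrival accrues w(t), travel time none."""
    step = day_end / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * step                 # midpoint rule over the day
        if t < t_dep:
            total += h(t) * step             # still at home
        elif t >= t_dep + T:
            total += w(t) * step             # already at work
    return total

h = lambda t: max(8.0 - 0.5 * t, 0.0)        # utility rate at home falls over the morning
w = lambda t: min(0.6 * t, 8.0)              # utility rate at work rises over the morning
candidates = [5.0 + 0.5 * i for i in range(11)]
best = max(candidates, key=lambda td: scheduling_utility(td, 0.5, h, w))
print(f"utility-maximizing departure time: {best}")
```

the optimal departure satisfies the first-order condition h(t_dep) = w(t_dep + T), i.e., the commuter leaves when the marginal utility of another minute at home drops to the marginal utility of a minute at work on arrival.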
these aforementioned studies mainly focused on evaluating or comparing the effects of different forms of the activity scheduling preference functions. other important factors were ignored. for example, the marginal utilities of commuters may change with gender, travel mode, income level, and so on. it is meaningful to reveal the effects of these heterogeneities on the activity scheduling preferences and to empirically calibrate the scheduling preference functions of various activities through field surveys. the point-queue assumption in vickrey's bottleneck model significantly facilitates the calculation of queuing delay at the bottleneck. however, it cannot account for the influence of vehicle queuing on upstream approaching vehicles because it ignores the physical lengths of vehicles in the queue. lago and daganzo (2007) investigated two important aspects: queue spillovers caused by insufficient road space, and merging interactions caused by the convergence of trips in a two-origin and single-destination network with limited storage space. they obtained some unexpected findings, e.g., that ramp metering is beneficial, and that providing more freeway storage is counterproductive. chen et al., (2019) explored the impact of queue-length-dependent capacity on travelers' departure time choices in the morning commute problem. it showed that multiple equilibria and even a continuum of equilibria may exist, and the equilibrium cost may be a locally decreasing function of the number of users. the standard model for analyzing traffic congestion with vehicle queue length consideration usually incorporates a relationship between the volume, speed and density of traffic flow. there is a well-defined inverse-u-shaped relationship between traffic volume and density. however, most traffic flow models focus on the situation of "ordinary (or normal) congestion", in which traffic volume increases as traffic density increases (or travel speed decreases as traffic volume increases).
this is because it is believed that traffic flow does not fall under heavily congested conditions. however, in reality, the phenomenon of hypercongestion may occur, especially in the downtown areas of major cities during rush hours. hypercongestion refers to traffic jam situations where traffic volume decreases as traffic density increases. small and chu (2003) presented tractable models for handling demand fluctuations on a straight uniform highway and in a dense street network located in a central business district (cbd). for the cbd model, they employed an empirical speed-density relationship for dallas, texas to characterize hypercongested conditions. arnott (2013) presented a bathtub model of downtown rush-hour traffic congestion that captures the hypercongestion phenomenon. it was shown that when demand is high relative to capacity, applying an optimal time-varying toll can generate benefits that may be considerably larger than those obtained from standard models and that exceed the toll revenue collected. fosgerau and small (2013) combined a variable-capacity bottleneck with α-β-γ scheduling preferences for a special case with only two possible levels of capacity. it showed that the marginal cost of adding a traveler is especially sensitive to the low level of capacity, and that under hypercongestion the policies (an optimal toll, a coarse toll, and metering) can be designed so that travelers gain even without considering any toll revenue. fosgerau (2015) extended the bathtub model to assess the effects of road pricing, transit provision and traffic management policies under hypercongestion. it showed that the unregulated nash equilibrium is also the social optimum among a wide range of potential outcomes, and that any reasonable road pricing scheme would be welfare-decreasing when the speed of the alternative transit mode is high enough that hypercongestion does not occur in equilibrium.
large welfare gains can be achieved through road pricing when there is hypercongestion and travelers are heterogeneous. gonzales (2015) further considered the hypercongestion issue in a multi-modal context. it showed that hypercongestion may arise when modes are not priced, a stable steady equilibrium state can emerge when cars and high-capacity transit are used simultaneously, and there always exist fixed coordinated prices (i.e., fixed difference of prices) for cars and transit to achieve a stable equilibrium state without hypercongestion. in order to derive a closed-form solution for no-toll equilibrium for hypercongestion, arnott et al., (2016) proposed a special bathtub model through adapting the simplest bottleneck model to an isotropic downtown area where the congestion technology entails velocity being a negative linear function of traffic density. liu and geroliminis (2016) adopted an mfd approach to model the hypercongestion effects of cruising-for-parking in a congested downtown network. in addition to the hypercongestion issues, the traffic flow model that describes the relationship between velocity and density has also been extended for investigation of continuum corridor problems (see, e.g., arnott and depalma, 2011 ; depalma and arnott, 2012 ; li and huang, 2017 ; lamotte and geroliminis, 2018 ) . it should be mentioned that the tractability is a major challenge for the models with hypercongestion consideration. this is because the travel time of a traveler is determined by the decisions of other travelers throughout the duration of the trip, as pointed out in fosgerau and small (2013) . therefore, it is difficult to derive analytical solutions for a general case with general scheduling preferences, heterogeneous users, or travel time uncertainty. a simulation method has thus to be used. 
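the hypercongestion mechanism reviewed above (and the "velocity as a negative linear function of density" technology of arnott et al., 2016) can be illustrated with a greenshields-type relation; the free-flow speed and jam density below are illustrative numbers, not values from any cited study.

```python
# greenshields-type speed-density relation often used in bathtub models of
# downtown congestion: speed falls linearly with density, so flow k * v(k) is
# inverse-u-shaped and "hypercongestion" (flow falling as density rises)
# occurs beyond the critical density. parameter values are illustrative.

def flow(k, v_free=60.0, k_jam=200.0):
    """flow (veh/h) at density k (veh/km) under v(k) = v_free * (1 - k / k_jam)."""
    return k * v_free * (1.0 - k / k_jam)

k_crit = 200.0 / 2.0     # flow peaks at half the jam density
print(flow(50.0), flow(100.0), flow(150.0))
# normal congestion below k_crit, capacity at k_crit, hypercongestion above it
```

densities of 50 and 150 veh/km yield the same flow but very different speeds, which is why hypercongested states waste capacity and why metering or tolling can make travelers better off even before toll revenue is counted.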
the standard bottleneck model has a strong assumption that commuters are homogeneous, i.e., all commuters have the same preference for arriving early or late and an identical value of time. this assumption has been relaxed in the literature to consider the heterogeneity of commuters, such as heterogeneities in travel preferences and work start time (e.g., flexible or staggered work hours). the heterogeneity may be represented in discrete or continuous form. the discrete type of heterogeneity means that all commuters are divided into several groups, and the commuters in one group are assumed to have the same preference and work start time. the continuous type of heterogeneity assumes a continuous distribution for the preference and/or work start time. considering user heterogeneity is important for achieving accurate estimates of the welfare effects of various policy measures, such as congestion pricing, ramp metering, capacity investment, and flexible work schedules. table 6 provides a summary of bottleneck model studies involving heterogeneous users. it can be seen that the existing studies mainly focused on the case of piecewise constant scheduling preferences (i.e., α-β-γ preferences), and considered users' heterogeneities in the following ways: (i) identical preferred arrival time and discretely / continuously distributed scheduling preference parameters; (ii) discretely distributed preferred arrival time and identical / discretely distributed scheduling preference parameters; and (iii) continuously (including uniformly) distributed preferred arrival time and identical / continuously distributed scheduling preference parameters. however, the cases of discretely (continuously) distributed preferred arrival time but continuously (discretely) distributed scheduling preference parameters have not been investigated yet, which provides a research opportunity for further study.
it can also be seen that most studies concerned proportional heterogeneity (i.e., all commuters have the same ratios β/α and γ/α), which helps derive the departure order of different commuter groups and analytical equilibrium solutions. some studies, such as newell (1987), lindsey (2004), ramadurai et al., (2010), doan et al., (2011), and liu et al., (2015c), have relaxed this assumption to consider a general heterogeneity structure (i.e., α, β and γ are allowed to vary independently). some properties of the model (e.g., the existence and uniqueness of the solution) have been discussed. however, it is difficult to derive an analytical solution of the model, which poses the challenge of designing an efficient solution algorithm. in addition, the marginal utility of an activity generally varies over time, as previously stated. it is thus necessary to relax the assumption of α-β-γ scheduling preferences to consider the case of time-varying scheduling preferences. most of the previous bottleneck model studies focused on individual-based trips, and assumed that each household member makes activity-travel scheduling decisions independently. however, in reality, a large number of morning commute trips are indeed household-based travel, i.e., a multi-person trip among household members rather than a single-person trip. the interdependency between household members could influence the activity participation of each household member. therefore, intra-household interaction should be considered in activity-travel scheduling models. de palma et al. (2015) proposed a variant of vickrey's bottleneck model of the morning commute, in which individuals live as couples and value the time at home more when together than when alone.
the results showed that the cost of congestion is higher for couples than for single individuals, because the cost of arriving early rises proportionally more than the cost of arriving late decreases. the costs can be even higher if spouses collaborate with each other when choosing their departure times. jia et al., (2016) explored the departure time choice problem of household travel (commuter and children) in a home-school-work trip chain with two preferred arrival times (a school start time and a work start time). liu et al., (2017b) further considered a hybrid of household travel (home-school-work trip chain) and individual travel (home-work trip). the findings showed that by appropriately coordinating the schedules of work and school, the traffic congestion at the highway bottleneck and thus the total travel cost can be reduced. zhang et al., (2017) further investigated and compared the morning commuting equilibrium solutions in the "school near workplace" and "school near home" networks. it was shown that the dynamic commuting equilibrium solution is significantly affected by school locations, and in the "school near home" network, households always arrive at school no later than the desired school arrival time. these abovementioned studies considered the morning trip-timing decisions of couples (de palma et al., 2015) and of a parent and his/her children (jia et al., 2016; liu et al., 2017b; zhang et al., 2017). however, for a family with two workers (husband and wife), the couple must decide who takes the children to school in the morning and when, and who brings them back home in the evening. a parent has to trade off not only his/her own schedule convenience with that of his/her spouse, but also the schedules of the children. it is therefore meaningful to address the morning-evening activity scheduling issues with intra-household interaction taken into consideration.
carpooling or ridesharing refers to the case in which multiple persons travel together in an auto by sharing the cost. with carpooling, the seat capacity of an auto can be utilized more efficiently, and the average individual travel costs, such as fuel cost, toll, and the stress of driving, are reduced. carpooling is also recognized as a more environmentally friendly and sustainable way to commute, as it reduces vehicular carbon emissions as well as the need for parking spaces. recently, xiao et al., (2016) incorporated carpooling behavior in the morning commute problem, considering a parking space constraint at the destination. three modes, namely solo-driving, carpooling, and transit, were considered. it was shown that the departure period of solo drivers covers the departure period of carpoolers, and that as the number of parking spaces decreases, the number of solo drivers decreases gradually, while the number of carpoolers first increases and then decreases. liu and li (2017) examined the morning commute problem in the presence of a ridesharing program. commuters simultaneously choose their departure time from home and their role in the program (solo driver, ridesharing driver or ridesharing rider). ma and zhang (2017) further explored the dynamic ridesharing problem on a highway with a single bottleneck together with parking. they designed schemes with different ridesharing payments and shared parking prices. a recent study presented a variable-ratio charging-compensation scheme to investigate the dynamic ridesharing problem using a bottleneck model approach. different objectives of the platform were considered, including minimization of system disutility, maximization of platform profit, and minimization of system disutility subject to zero platform profit. yu et al., (2019) incorporated users' heterogeneities in the carpooling problem based on the traditional bottleneck model, and revealed the effects of heterogeneities on the efficiency of carpool subsidization.
all the aforementioned studies are based on a corridor with a single bottleneck, and thus cannot consider the interaction between flows of different links in the network (i.e., network effects). extending the single-bottleneck model to a general network with multiple od pairs and multiple bottlenecks could help deepen understanding of the effects of carpooling services on the urban transport system. carpooling services can be implemented through mobile platforms on which passengers can call for riding services and drivers can respond to the service requests. the existing related studies considered a single carpooling platform; it would be meaningful to examine the competition and/or collaboration among multiple carpooling platforms. transportation systems are stochastic, dynamic and nonlinear systems owing to various disturbance factors on the supply side and/or demand side, such as traffic accidents, bad weather, and within-day and/or day-to-day demand variations. considering the impacts of stochastic factors on transportation systems has important implications for promoting the resilience and reliability of these systems. table 7 lists some major studies incorporating uncertainty effects in bottleneck problems. it can be noted that most previous studies focused on supply uncertainty caused by travel time variation or capacity randomness. it is somewhat surprising that no studies take into account the effects of uncertainty on the demand side in bottleneck problems, though there are a few publications involving joint fluctuations on both the supply and demand sides (e.g., arnott et al., 1999 ; fosgerau, 2010 ) . in order to model the uncertainty effects in a bottleneck system, a probability distribution function needs to be specified for the random variables concerned.
in this regard, most studies adopted a general distribution, while a few studies adopted specific distributions, such as the uniform distribution (e.g., xiao et al., 2014 ; wang and xu, 2016 ; zhang et al., 2018 ) , the exponential distribution ( tian and huang, 2015 ) , uniform and exponential distributions ( noland et al., 1995 , 1998 ) , and the gumbel distribution ( xiao and fukuda, 2015 ) . in terms of modeling method, the expectation value model is usually adopted in stochastic optimization problems. however, in order to capture the risky attitudes of travelers towards random fluctuations on the demand and/or supply sides, some studies have also incorporated the effects of travel time variation in the objective functions of their models, such as fosgerau (2010) , borjesson et al., (2012) , engelson and fosgerau (2016) , and xiao et al., (2017) . it should be mentioned that in the field of travel time variability or reliability, fosgerau and his collaborators have produced a number of works using a model of time-varying scheduling preferences. for instance, they presented a new measure of travel time variability and explored the relationships among different measures of the cost of travel time variability ( engelson and fosgerau, 2016 ) . they also derived the value of travel time variability ( fosgerau and fukuda, 2012 ) , and revealed the relationship between the mean and variance of travel delay in dynamic queues with random capacity and demand ( fosgerau, 2010 ) . for a systematic review of the values of travel time and travel time reliability, the interested reader may refer to small (2012) . it should be pointed out that these previous studies are mainly based on the expectation value (risk-neutral) model or the mean-variance (risk-averse) model. it is meaningful to consider other measures of risk, such as var (value at risk) and cvar (conditional value at risk), which could lead to a different value of travel time variability.
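the contrast between these risk measures can be made concrete with a small numerical sketch. the code below is illustrative only: the exponential delay distribution, the mean-variance weight, and the 95% risk level are assumptions for demonstration, not values drawn from the cited studies.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical random travel delays (minutes); distribution assumed for illustration
delays = rng.exponential(scale=10.0, size=100_000)

# mean-variance (risk-averse) cost: mean + lam * std, with an assumed weight lam
lam = 0.5
mean_variance_cost = delays.mean() + lam * delays.std()

# CVaR at level alpha: expected delay conditional on being in the worst (1 - alpha) tail
alpha = 0.95
var_alpha = np.quantile(delays, alpha)           # value at risk (95th percentile)
cvar_alpha = delays[delays >= var_alpha].mean()  # conditional value at risk

print(f"mean delay    : {delays.mean():.2f}")
print(f"mean-variance : {mean_variance_cost:.2f}")
print(f"VaR(95%)      : {var_alpha:.2f}")
print(f"CVaR(95%)     : {cvar_alpha:.2f}")
```

because cvar weights only the tail of the distribution, a cvar-based traveler is more pessimistic than a mean-variance one, so the implied value of travel time variability differs between the two criteria.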
moreover, these previous studies assumed that the scheduling preferences are exogenously given. incorporating endogenous scheduling preferences, as presented in fosgerau and small (2017) , is also an important direction for further studies. obviously, variability or uncertainty on the supply and/or demand sides involves a lack of information about how a stochastic process is realized. its analysis therefore naturally invites consideration of the effects of information provision. travelers may have only partial information about traffic conditions before or during a trip. with the aid of various information and communication technologies (e.g., global navigation satellite systems, global positioning systems), real-time traffic information can be collected and disseminated efficiently. travelers can adjust their activity and travel schedules through day-to-day learning and traffic information guidance. in this regard, noland (1997) examined the congestion effects of providing commuters with pre-trip information and found that information provision does not necessarily bring benefits to the commuters using the information. ziegelmeyer et al., (2008) investigated the impact of public information about past departure rates on congestion level and travel cost based on a learning model and the adl's bottleneck model. liu et al., (2017a) considered the effect of travelers' inertia in the day-to-day behavioral adjustment due to traffic information updating. khan and amin (2018) studied the effects of heterogeneous information (market penetration and accuracy) on traffic congestion. another study explored the impact of the cost of information provision on the information quality provision strategy. zhu et al., (2019) examined the day-to-day departure time adjustment process of travelers with bounded rationality based on long-term historical knowledge (or short-term travel experience) and real-time information provision.
however, all the aforementioned studies considered only a simplified case with a single route, which may not capture the full impact of information on traffic congestion; it is thus necessary to extend them to a general network with multiple routes in a further study. some studies have relaxed the assumption that commuters pass through only one bottleneck during the commuting peak period to consider the case of passing through multiple bottlenecks during a trip. kuwahara (1990) analyzed the equilibrium queuing patterns at a two-tandem bottleneck on one freeway, for which some commuters may pass through both bottlenecks. arnott et al., (1993b) studied a y-shaped highway corridor with two upstream bottlenecks and one downstream bottleneck. they found that expanding the capacity of one of the upstream bottlenecks can raise total travel cost (i.e., a paradox occurs), and that metering access to reduce effective upstream capacity can improve efficiency. the optimal capacity for an upstream bottleneck is equal to, or smaller than, the optimal capacity for the downstream bottleneck. kim (1999) further analyzed the dynamic equilibrium queuing patterns for a two-tandem bottleneck with two origins and one destination. it was found that in some cases a queue does not occur at the upstream bottleneck, since the departure rate there is always equal to its capacity at equilibrium. in order to avoid traffic congestion at a two-tandem bottleneck, the downstream bottleneck should be enlarged prior to the upstream one. a later study further demonstrated the bottleneck paradox phenomenon by an experimental method in a y-shaped bottleneck network with two groups of commuters: the commuters in group one pass through only the downstream bottleneck, whereas the commuters in group two must pass through both upstream and downstream bottlenecks. it was found that the observed departure times at the aggregate level are in close agreement with the equilibrium solution.
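the tandem-bottleneck logic above can be sketched with a deterministic point-queue model. the code below is a minimal illustration under assumed capacities and an assumed inflow profile (none of the values come from the cited studies): the upstream outflow feeds the downstream bottleneck, and enlarging the downstream capacity beyond the upstream outflow rate removes the downstream queue, consistent with the intuition that the downstream bottleneck should be enlarged first.

```python
def point_queue(inflow, capacity, dt=1.0):
    """deterministic point-queue: returns (queue trajectory, outflow profile)."""
    queue, queues, outflows = 0.0, [], []
    for r in inflow:
        demand = r + queue / dt            # vehicles wanting to pass this step
        out = min(demand, capacity)
        queue = max(queue + (r - out) * dt, 0.0)
        queues.append(queue)
        outflows.append(out)
    return queues, outflows

# hypothetical inflow: 60 time steps at 40 veh/step into a two-tandem corridor
inflow = [40.0] * 60 + [0.0] * 60

s_up, s_down = 30.0, 25.0                  # assumed capacities (veh/step)
q1, out1 = point_queue(inflow, s_up)
q2, out2 = point_queue(out1, s_down)       # upstream outflow feeds downstream

print("max upstream queue  :", max(q1))
print("max downstream queue:", max(q2))

# if the downstream bottleneck is enlarged beyond s_up, its queue vanishes:
q2b, _ = point_queue(out1, 35.0)
print("downstream queue after expansion:", max(q2b))
```

with these numbers the upstream queue peaks at 600 vehicles and the downstream queue at 400; expanding the downstream capacity above the upstream capacity of 30 veh/step eliminates the downstream queue entirely.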
akamatsu et al., (2015) discussed the existence and uniqueness of the solution of the departure-time choice equilibrium for a corridor with multiple discrete bottlenecks and heterogeneous users. these previous related studies mainly focused on specific configurations (e.g., a two-tandem or y-shaped bottleneck structure), and thus the results obtained might not be applicable to a general network; further investigations on general bottleneck networks are needed. the bottleneck models presented in the aforementioned literature usually adopted analytical approaches, because they treated only simple cases with one or two routes. in order to apply the bottleneck models to real large-scale networks, a dta-based bottleneck modeling approach was presented, inspired by vickrey's bottleneck model. in this approach, the usual components of vickrey's bottleneck model are applied separately to the links in the network. in this regard, de palma and his colleagues have developed a dynamic network model, called metropolis, in which the travel mode, departure time and route choices can be endogenously determined. metropolis has been implemented both with a vertical queue for each link (i.e., the physical length of a queue is not considered), and with a horizontal queue, which means the queue on one link can affect other links (i.e., queue spillback effects). the model is solved using microsimulation, and has been applied to evaluate various policies, such as congestion pricing (see de palma and lindsey, 2006 ; de palma et al., 2005 , 2008 ) . for more details of metropolis, please refer to de palma et al. (1997) and de palma and marchal (2002) . besides the aforementioned various topics, some other topics related to the bottleneck model have also been studied, summarized as follows. i. properties of the equilibrium solution. smith (1984) showed the existence of the user equilibrium solution for a single-bottleneck model with homogeneous users.
daganzo (1985) proved the uniqueness of the user equilibrium solution. newell (1987) extended the analysis to the case of heterogeneous linear schedule delay functions. an elastic demand version of the bottleneck model was analyzed in arnott et al., (1993a) . ii. variations of model formulation and solution algorithm. de palma et al. (1983) proposed a stochastic departure time choice logit model to consider commuters' perception errors of utility. han et al., (2013a , 2013b ) reformulated vickrey's bottleneck model as a partial differential equation formulation. otsubo and rapoport (2008) presented a discrete version of vickrey's bottleneck model and a solution algorithm for computing the equilibrium solution. nie and zhang (2009) proposed numerical solution procedures for the morning commute problem. guo and sun (2019) considered personal perception in the travel cost function, aiming to incorporate commuters' psychological tastes towards early arrival at the workplace in the bottleneck model. iii. doubly dynamic adjustments (day-to-day and within-day dynamics). ben-akiva et al., (1984) presented a dynamic simulation model to describe the evolution of queues and delays from day to day. ben-akiva et al., (1991) further presented a framework for evaluating the effects of traffic information systems based on the doubly dynamic adjustment model incorporating drivers' information acquisition and integration. guo et al., (2018) considered the bounded rationality factor, due to individuals' limited cognitive levels and imperfect information, in the doubly dynamic bottleneck model. iv. time-varying bottleneck capacity. the classical bottleneck model usually assumes a constant or time-invariant bottleneck capacity.
using optimal control theory, yang and huang (1997) presented a bottleneck model with queue-dependent capacity and elastic demand for the design of time-varying toll schemes, and found that queues must not be eliminated in the optimal state of the system. zhang et al., (2010) presented another bottleneck model in which the bottleneck capacity varies exogenously over time, in discrete steps. they derived user equilibrium and system optimal traffic patterns with (exogenously) time-varying capacities and the optimal tolls leading to the system optimum pattern. v. ramp metering. arnott et al., (1993b) suggested an optimal metering policy to improve the efficiency of a y-shaped bottleneck system. o'dea (1999) found that in the bottleneck model, metering can produce a sizable benefit and should not be regarded as a substitute for congestion pricing. shen and zhang (2010) designed a pareto-improving metering strategy for a multi-ramp linear freeway based on an analysis of the priority order of the ramps. these studies mainly focused on a simple bottleneck system or a freeway; extending them to a general network is left for a further study. queuing delay is a pure deadweight loss for society and results in inefficient use of transportation infrastructure. in order to make efficient use of transportation resources, congestion pricing has been widely suggested as a viable measure to internalize the externalities caused by queuing at the bottleneck and thereby relieve peak-period traffic congestion. congestion pricing schemes are generally based on the economic theory of marginal cost pricing and constitute a mechanism to improve social benefit. for comprehensive reviews of congestion pricing, readers can refer to lindsey et al., (2012) , van den berg (2012) , and fosgerau and van dender (2013) . a substantial stream of research has been conducted on bottleneck congestion pricing. table 8 provides a summary of the bottleneck congestion pricing studies.
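as background for the pricing discussion, the no-toll equilibrium of the basic vickrey model with piecewise constant scheduling preferences has a well-known closed form, which the sketch below computes under illustrative parameter values (the alpha, beta, gamma, demand and capacity numbers are assumptions for demonstration, not taken from any cited study). in this textbook case the equilibrium trip cost is equalized across commuters, half of the total cost is pure queuing loss, and the first-best time-varying (fine) toll mimics the queuing-delay cost, eliminating the queue and halving total social cost.

```python
# closed-form no-toll equilibrium of vickrey's bottleneck model with
# piecewise constant alpha-beta-gamma scheduling preferences.
# all parameter values below are illustrative assumptions.

alpha, beta, gamma = 6.4, 3.9, 15.21   # $/h: value of time, early and late penalties
N, s = 9000.0, 3600.0                  # commuters; bottleneck capacity (veh/h)
t_star = 9.0                           # desired arrival time (9:00 am, decimal hours)

delta = beta * gamma / (beta + gamma)  # composite scheduling parameter

# peak period: first and last departure times (its length is exactly N/s)
t_start = t_star - (gamma / (beta + gamma)) * N / s
t_end   = t_star + (beta  / (beta + gamma)) * N / s

# equalized equilibrium trip cost and its aggregate split
eq_cost      = delta * N / s           # per commuter, equalized at equilibrium
total_cost   = delta * N**2 / s        # all commuters
queuing_cost = total_cost / 2          # half of total cost is pure queuing loss
peak_toll    = delta * N / s           # maximum of the first-best fine toll

print(f"peak period   : [{t_start:.3f}, {t_end:.3f}] h")
print(f"cost/commuter : ${eq_cost:.2f}")
print(f"total cost    : ${total_cost:.0f} (queuing share ${queuing_cost:.0f})")
print(f"peak fine toll: ${peak_toll:.2f}")
```

since the fine toll replaces the queuing delay one-for-one, it converts a deadweight loss into toll revenue without changing any commuter's trip cost, which is the sense in which marginal cost pricing improves social benefit here.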
it should be pointed out that multi-modal bottleneck tolling studies are not shown in table 8 ; readers can refer to table 5 . it can be seen in table 8 that many of the existing studies focused on the topics of step tolling, users' heterogeneity, and tradable credit schemes. based on different assumptions, the step bottleneck tolling studies can be classified into three main categories: the adl model of arnott et al. (1990a, 1993a, 1998), the laih model of laih (1994, 2004), and the braking model of lindsey et al., (2012) and xiao et al., (2012) . the laih model implicitly assumed that separate queues exist for tolled users and untolled users who arrive before the toll is turned off. despite this strong assumption, the laih model is useful for estimating the approximate efficiency of a multi-step toll scheme. the adl model assumed that a mass of commuters departs just after the toll is lifted. the braking model considered that as the end of the tolling period approaches, drivers have an incentive to stop before reaching the tolling point and wait until the toll is switched off. the congestion pricing studies shown in table 8 usually fall into the family of piecewise constant α-β-γ preferences, and focus on the case of normally recurrent traffic congestion. later work further investigated the congestion tolling problems in a framework of time-varying scheduling preferences, comparing single-step and multi-step toll schemes with linear time-varying and piecewise constant marginal activity utilities. arnott (2013) and fosgerau (2015) incorporated the hypercongestion phenomenon into congestion tolling problems using the bathtub model. vehicular use also causes an environmental externality, besides the congestion externality. in order to control vehicle pollution emissions and improve air quality, emission tax policies have been suggested.
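the approximate efficiency of step tolling under the laih model's separate-queue assumption can be sketched numerically. the fraction m/(m+1) of no-toll queuing cost removed by an optimal m-step toll is the result commonly attributed to the laih model; the parameter values below are assumptions for illustration only.

```python
# illustrative sketch of multi-step toll efficiency in the laih model.
# under its separate-queue assumption, an optimal m-step toll is reported
# to remove a fraction m/(m+1) of the no-toll queuing cost (assumed here).
beta, gamma = 3.9, 15.21               # $/h: early and late penalties (assumed)
N, s = 9000.0, 3600.0                  # commuters; capacity in veh/h (assumed)

delta = beta * gamma / (beta + gamma)  # composite scheduling parameter
queuing_cost = delta * N**2 / (2 * s)  # no-toll total queuing cost

for m in (1, 2, 3, 10):
    saved = m / (m + 1) * queuing_cost
    print(f"optimal {m:2d}-step toll removes ${saved:,.0f} of ${queuing_cost:,.0f}")
```

the returns diminish quickly: a single step already recovers half of the queuing loss, while going from 3 to 10 steps adds comparatively little, which is why coarse tolls are often viewed as practical substitutes for the fine toll.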
table 8. summary of the bottleneck congestion pricing studies:
- social optimum toll (arnott, 1986; bernstein and muller, 1993; mun, 1999; daganzo and garcia, 2000): time-varying toll to totally eliminate the bottleneck queue during the morning peak; pareto-improving time-varying toll in a time window; minimize total social cost or user cost.
- elastic travel demand (braid, 1989; arnott et al., 1993a; yang and huang, 1997): consider the elasticity of travel demand to travel cost; maximize total social surplus.
- step tolling (laih, 1994, 2004; arnott et al., 1990a, 1993a; fosgerau, 2011; lindsey et al., 2012; van den berg, 2012; gonzales and christofa, 2014; knockaert et al., 2016; ren et al., 2016; bao et al., 2017; xu et al., 2019): single-step or multi-step tolling scheme as a substitute for the time-varying tolling scheme; typical models: adl model, laih model, braking model.
- route or lane substitute (braid, 1996; hall, 2018): two routes, one tolled and one free; a portion of lanes are tolled.
- heterogeneous travelers (cohen, 1987; arnott et al., 1992, 1994).
- reward scheme (rouwendal et al., 2012; yang and tang, 2018): a reward scheme means a subsidy instead of a penalty of tolling; fare-reward scheme for transit users.
- regulatory regime of bottleneck (de palma and lindsey, 2002b, 2008; fu et al., 2018): private regime (profit maximization); public regime (welfare maximization); mixed regime.
- stochastic environments (yao et al., 2010; hall and savage, 2019): stochastic toll; stochastic capacity.

bulteau (2016) proposed a microeconomic model of an urban toll system to internalize the negative externality effects (congestion and pollution) generated by vehicular use. in the proposed model, two modes of transportation (i.e., cars and public transport) were taken into account, and the vehicle emission rate was implicitly assumed to be a constant. based on the bottleneck congestion model of arnott et al. (1990a, 1993a), three alternative toll schemes were compared: a fine toll (time-varying toll), a coarse toll (varying between the peak period and the off-peak period), and a uniform toll (constant over time). the policy of redistributing the gains from the urban toll to public transport was also evaluated. liu et al., (2015b) presented a variable speed limit scheme to reduce total traffic emissions and travel costs based on vickrey's bottleneck model and a constant vehicle emission rate assumption, and evaluated its effectiveness in improving the traffic flow efficiency of the bottleneck system. these studies only considered one od pair with one route and assumed a constant vehicle emission rate. it is meaningful to extend these studies to incorporate network effects and vehicle emissions that change with vehicle type and speed. tradable credit schemes have recently been advocated as a useful tool for regulating the externalities caused by vehicular use and as a promising substitute for congestion pricing schemes. this is because such schemes do not involve money transfer from the public to the government, which can significantly increase the public acceptability of the scheme ( yang and wang, 2011 ) .
one line of work studied various parking permit schemes in a many-to-one network, in which each origin is connected to a single destination by a bottleneck-constrained highway and a parallel transit line, comparing three parking permit distribution schemes for commuters living in different origins: uniform, pareto-improving, and system optimum schemes. liu et al., (2014a, b) further developed tradable parking permit schemes to realize parking reservations for homogeneous or heterogeneous commuters in terms of their values of time. nie and yin (2013) proposed a general analytical framework for the design of system optimal tradable credit schemes and the analysis of the efficiency of tradable credit schemes for a two-route system. their results showed that the tradable credit scheme could provide substantial efficiency gains for a wide range of scenarios. tian et al., (2013) examined the efficiency of a tradable credit scheme in a competitive highway/transit network with continuous heterogeneity in terms of individuals' values of time. xiao et al., (2013) explored the efficiency and effectiveness of a tradable credit system with identical and non-identical commuters. credits are tradable between the commuters, and the credit price is determined by a competitive market. the credit system consists of a time-varying credit charged at the bottleneck and an initial credit distribution to the commuters. nie (2015) proposed a market-based tradable credit scheme for managing traffic congestion at critical bottlenecks (e.g., bridges and tunnels). it was assumed that users who avoid traveling in the peak-time window are rewarded with mobility credits, while those who do not pay a congestion toll in the form of credits or cash, and travelers may trade their credits with each other. it was shown that the best choice of the rewarding-charging ratio is 1, i.e., each peak-time user is charged one credit and each off-peak user is awarded one credit.
shirmohammadi and yin (2016) designed a tradable credit scheme to keep the queue length at the bottleneck below a threshold specified by the authority. sakai et al., (2017) proposed a model for designing a pareto-improving pricing scheme with bottleneck permits for a v-shaped two-to-one merging bottleneck. they showed that the first-best pricing scheme for this v-shaped network does not always achieve a pareto improvement, because the cost of one group of drivers is increased by the permit pricing. xiao et al., (2019) presented two tradable parking permit schemes for a corridor system with three alternative travel modes, i.e., transit, driving alone and carpooling, when the parking supply at the destination is insufficient. it was found that the prices of parking permits, regardless of whether the trip is completed as a carpool or not, decrease with the parking supply, and the price that a solo driver should pay is higher than that a carpooler should pay. the tradable uniform parking permit scheme is more efficient than the tradable differentiated parking permit scheme for solo-driving and carpooling travelers. it can be noted that these abovementioned studies mainly focused on a single od pair with one or two routes, and the transaction costs for trading the credits were usually ignored. it is thus important to look at the impacts of network effects and transaction costs on the effectiveness of tradable credit schemes. in reality, markets tend to attract speculators, so it is also meaningful to look at the effects of collusive behavior among credit purchasers. in addition, when the parking supply is insufficient at the destination, one may first park his/her car at a park-and-ride lot and then transfer to a transit vehicle to reach the final destination. it is therefore necessary to extend the existing models to incorporate park-and-ride services in a further investigation.
the redistribution of toll revenue is an important factor influencing the public acceptability of toll schemes and thus their practical implementation. adler and cetin (2001) developed an analytical model for a two-node two-route network, aiming to explore a direct redistribution approach in which money collected from the drivers on a more desirable route is directly transferred to the users on a less desirable route. it was shown that this toll collection and subsidization model would reduce the travel cost for all travelers and totally eliminate the waiting time in the queue; compared with the socially optimal solution, the direct redistribution model yields almost identical results. mirabel and reymond (2011) analyzed the impact of toll redistribution on total cost and on the modal split between railroad and road based on the two-mode model of tabuchi (1993) . in their model, it was assumed that toll revenue from the road was redistributed to public transport, and two kinds of road toll regimes were considered, i.e., a fine toll and a uniform toll. it was shown that a toll policy is more efficient as long as toll revenue is directed towards public transport when the railroad fare is equal to average cost. these previous studies mainly focused on the redistribution of toll revenue for public transport improvement purposes. it will be meaningful to take into account other uses, such as transportation infrastructure investment and fiscal revenue, and to determine the optimal redistribution proportion among different uses. although the first-best time-varying toll may eliminate queuing completely, congestion toll schemes may not be politically feasible. parking charging can be considered as a possible substitute for congestion tolling, because parking charges are much easier to implement than congestion tolls. similar to congestion tolls, parking charges may be used to disperse demand over time so as to reduce congestion and gain efficiency.
zhang et al., (2008) presented a morning-evening commuting model to determine a location-dependent parking fee scheme that optimizes the commuters' morning and evening commuting patterns. another study analyzed the regulatory schemes of the parking market, namely price-ceiling and quantity tax/subsidy schemes; it was shown that both price-ceiling and quantity tax/subsidy regulations can efficiently reduce system cost and commuter cost under certain conditions, and help ensure the stability of the parking market. fosgerau and de palma (2013) determined the optimal parking charges and evaluated the benefits of parking pricing as an alternative to congestion tolls. zhang and van wee (2011) proposed a duration-dependent parking fee scheme, and compared it with three other pricing regimes: no charging, optimal time-varying road tolls, and a combination of optimal time-varying road tolls and location-dependent parking fees. ma and zhang (2017) derived dynamic parking charges for a bottleneck system with ridesharing, in which all travelers were assumed to participate in the ridesharing program, i.e., a traveler was either a driver or a passenger. as a substitute for parking pricing, parking permit schemes have also been studied in the literature ( liu et al., 2014b ; xiao et al., 2019 ) , as presented in subsection 3.2.3 . the aforementioned studies did not consider commuters' time spent on searching for available parking spaces. the search for parking spaces comprises a wasteful commuting component that contributes to traffic congestion, and thus should be considered in the commuting cost. on the other hand, parking facilities are usually supplied by both private firms and the public sector. it will be interesting to examine this mixed market and compare it with the extreme cases of either private-only or public-only parking provision regimes. in addition, a mixed market consisting of solo-driving and ridesharing should be investigated for analyzing the effects of parking pricing on the market.
ship queuing and waiting at a general anchorage to enter a berth under port congestion are similar to autos queuing and waiting at a road bottleneck. the congestion pricing concept for a road bottleneck has therefore been extended to address port congestion pricing issues. in this regard, laih and his collaborators have undertaken a number of studies (see laih and hung, 2004 ; laih et al., 2007 , 2015 ; laih and chen, 2008 , 2009 ; laih and sun, 2013 ) . they derived optimal time-varying and/or step toll schemes to eliminate or decrease port congestion. by levying port congestion tolls, the departure schedules of container ships can be rationally changed, and thus the arrival times of container ships at a busy port can be smoothed or dispersed; as a result, the queuing delays of container ships waiting for port entry decrease. they also derived the resultant changes of container ships' departure schedules after levying port congestion tolls. however, they did not consider the redistribution of port congestion charges, which could help promote the public acceptability of a port congestion charging scheme. in the literature, there are also some studies on airport congestion pricing issues. for example, daniel (1995) proposed an airport runway congestion pricing model (i.e., a bottleneck model with time-dependent stochastic queuing) for estimating congestion prices and capacities for large hub airports. the proposed stochastic bottleneck model combines stochastic queuing, time-varying traffic rates, and intertemporal adjustment of traffic in response to queuing delays and fees. daniel and pahwa (2000) showed that the stochastic bottleneck model of daniel (1995) can generate more realistic traffic patterns than earlier models, such as the deterministic bottleneck model of vickrey (1969) . daniel and harback (2008, 2009) adopted the stochastic bottleneck model to address the airport congestion pricing issues for 27 major us hub airports.
daniel (2011) further determined the equilibrium congestion pricing schedules, traffic rates, queuing delays, layover times, and connection times by time of day for four canadian airports (toronto, vancouver, calgary, and montreal). daniel (2014) examined the efficiency and practicality of airport slot constraints using a deterministic bottleneck model of landing and takeoff queues. it was shown that slot constraints at us airports would be ineffective, and that effective slot constraints require many narrow slot windows. silva et al. (2014) studied airlines' interactions and scheduling behavior, together with airport pricing, using a combination of a deterministic bottleneck model and a vertical structure model that explicitly considers the roles of airlines and passengers. wan et al. (2015) treated terminal congestion and runway congestion separately, and studied the implications for the design of optimal airport charges and/or terminal capacity investment. to capture the difference between these two types of congestion, they adopted a deterministic bottleneck model for the terminal and a conventional congestion model for the runways. they showed that welfare-optimal uniform airfares do not yield the first-best outcome; the first-best fares charged to business passengers are higher than the leisure passengers' fares if and only if the relative schedule-delay cost of business passengers is higher than that of leisure passengers. these airport pricing studies usually focused on a single airport, and did not consider the effects of airport pricing on the competition and collaboration among regional airports (e.g., the airports of hong kong, guangzhou, shenzhen, zhuhai, and macao in the greater bay area of china), which deserve a further study. queuing delays at the bottleneck during the morning and evening commutes may be an important factor influencing household residential location choice, which shapes the urban spatial structure of a city ( mun et al., 2005 ) .
arnott (1998) incorporated the departure time choice into a model of urban spatial structure by using vickrey's bottleneck model. it was shown that in contrast to the standard static model (without time dimension), congestion tolling in the bottleneck model can cause urban form to become less concentrated, and thus may have less pronounced effects on urban spatial structure than was previously thought. fosgerau and de palma (2012) introduced spatial heterogeneity into the bottleneck model via considering dynamic congestion in an urban setting where trip origins are spatially distributed. it was shown that at equilibrium, travelers sort according to their distances to the destination; the queue is always unimodal regardless of the spatial distribution of trip origins; and the travelers located beyond a critical distance from the cbd tend to gain from tolling, even when toll revenue is not redistributed, while nearby travelers lose. gubins and verhoef (2014) considered a monocentric city with a traffic bottleneck located at the entrance to the cbd. the commuters' departure times, household residential locations, and lot sizes are all endogenously determined. they showed that road pricing may lead to urban sprawl, even when the collected toll revenue is not redistributed back to the city inhabitants. takayama and kuwahara (2017) further developed a model considering commuters' heterogeneity, departure time and residential location choices in a monocentric city with a single bottleneck. the results showed that commuters sort themselves temporally and spatially according to their values of time and schedule delay flexibility. imposing a congestion toll without redistributing toll revenue causes the physical expansion of the city, which is opposite to the results of traditional location models. 
franco (2017) examined the effects of changes in downtown parking supply on urban welfare, mode choice and urban spatial structure using a general spatial equilibrium model of a closed monocentric city with two transport modes, endogenous residential parking supply and bottleneck congestion at the cbd. xu et al. (2018) presented an integrated model of urban spatial structure and traffic congestion for a two-zone monocentric city in which the two zones are connected by a congested highway and a crowded railway. the commuters' departure time and mode choices are governed by a bottleneck model, and the endogenous interactions between travel and residential relocation choices are analyzed. fosgerau et al. (2018) presented a model unifying the bottleneck model and the monocentric city model. the model generates a number of new insights regarding the interaction between congestion dynamics and urban spatial equilibrium. unlike the traditional static congested city models, their model leads to an optimal city that is less dense in the center and denser in the suburbs than the city at the laissez-faire equilibrium. this result is similar to that in gubins and verhoef (2014). vandyck and rutherford (2018) developed a spatial general equilibrium model to study economy-wide and distributional implications of congestion pricing in the presence of agglomeration externalities and unemployment. fosgerau and kim (2019) presented a new monocentric city framework that combines a discrete urban space with multiple vickrey-type bottlenecks. they confirmed empirically the relationship between residential location choice and trip-timing choice, i.e., commuters traveling a longer distance tend to arrive at work early or late (i.e., at off-peak times), while commuters with a shorter distance tend to arrive at the peak time.
the aforementioned studies considered the role of households' residential location decisions in shaping urban spatial structure, but ignored the role of firms' location decisions. in further studies, the effects of both households' and firms' location decisions should be simultaneously considered in the analysis of urban spatial equilibrium. the classical vickrey bottleneck model has also been employed to address transit passenger travel choice behavior and transit system optimization issues. kraus and yoshida (2002) incorporated the commuter's time-of-use decision into a model of transit pricing and transit service optimization, in which waiting time at a transit stop was treated analogously to queuing time at the highway bottleneck. it was shown that increased ridership leads to higher average user cost, and the relationship between service frequency and ridership does not conform to the well-known square root principle. yoshida (2008) further studied the effects of passengers' queuing rules at transit stops (including first-in-first-out and random-access queuing) on mass-transit policies, such as the number of trains and runs, scheduling, and pricing. the results showed that when the shadow value of a unit of waiting time exceeds that of a unit of time late for work, the passengers' queuing discipline does not have any effect on the optimal or second-best mass-transit policy. otherwise, the aggregate travel cost with random-access queuing is lower than that with first-in-first-out. tian et al. (2007) analyzed the equilibrium properties of commuters' trip timing during the morning commute on a many-to-one linear corridor transit system, considering the in-vehicle passenger crowding effect and schedule delay cost. monchambert and de palma (2014) considered a bi-modal competitive system, consisting of a public transport mode (bus), which may be unreliable, and an alternative mode (taxi).
the results showed that the public transport service reliability at the competitive equilibrium increases with the taxi fare, and the public transport service reliability and thus patronage at equilibrium are lower than those at the first-best social optimum. de palma et al. investigated trip-timing decisions of rail transit users who trade off in-vehicle passenger crowding costs and the disutility of traveling early or late. three fare regimes, namely no fare, an optimal uniform fare, and an optimal time-dependent fare, were studied and compared, together with determination of the optimal long-run number and capacities of trains. wang et al. (2017) designed policies of transit subsidies (including cost and passenger subsidies) from either government funding or road toll revenue to circumvent the downs-thomson paradox appearing in a competitive highway/transit system. yang and tang (2018) proposed a fare-reward scheme for managing rail transit peak-hour congestion with homogeneous commuters, in which a commuter is rewarded with one free trip during pre-specified shoulder periods after taking a certain number of paid trips during the peak hours. such a fare-reward scheme aims to shift commuters' departure times so as to reduce their queuing at stations in an incentive-compatible manner while keeping the transit operator's revenue intact. tang et al. (2019) further considered heterogeneous commuters, in terms of commuters' scheduling flexibility (i.e., arrival time flexibility interval), and proposed an incentive-based hybrid fare scheme, which combines the fare-reward scheme with a non-rewarding uniform fare scheme. it was shown that the hybrid fare scheme can create a revenue-preserving win-win-win situation for the transit operator, flexible commuters and non-flexible commuters.
these previous studies have provided many insights into understanding the travel choice behavior of transit passengers, the operations and scheduling of transit services, and the effects of various transit policies, such as transit service pricing and subsidies. however, they usually consider the transit mode only or two physically isolated modes (e.g., auto and rail). in reality, auto and bus share the same roadway, and thus the interaction between them cannot be ignored. the congestion externality caused by intermodal interaction should be considered in transit fare pricing, together with the in-vehicle crowding externality in transit vehicles. arnott et al. (1993a) addressed the capacity expansion issue of a road bottleneck with homogeneous commuters. it was shown that the self-financing result (i.e., toll revenue exactly covers capital cost) holds even when the variation of the toll by time of day is constrained (e.g., a coarse toll). arnott and kraus (1995) investigated under what circumstances the first-best pricing and investment rules (i.e., the first-best self-financing rule, or trip price equals marginal cost) for a congestible bottleneck facility apply when both the time variation of the congestion charge is constrained and users differ in unobservable characteristics, so that the same congestion charge must be applied to heterogeneous users, in terms of work start time or value of time. their findings indicated that the first-best self-financing rule holds if the congestion externality is anonymous, i.e., independent of user type. thus, marginal cost pricing of a congestible facility is feasible even if users differ in observationally indistinguishable ways, when a completely flexible toll is employed. but when there are constraints on the time variation of the toll (e.g., a uniform toll), marginal cost pricing is infeasible and a variant of ramsey pricing is (second-best) optimal.
liu et al. (2015a) designed a highway use reservation system to allocate highway space to potential users at different time intervals. they also evaluated the efficiency of the reservation system. lamotte et al. (2017) addressed the capacity allocation issue of a road between two vehicle types (i.e., conventional and bookable autonomous vehicles), using a variant of the bottleneck model. these studies usually assumed a fixed total travel demand, a single travel mode and a deterministic environment. in further studies, these assumptions can be relaxed to consider elastic demand, multiple travel modes and/or stochastic situations. qian et al. (2012) investigated the design problems of parking capacity, parking fee, and access time when all parking lots in the parking market are operated by multiple profit-driven private operators or by a welfare-driven social planner. franco (2017) examined how changes in cbd parking supply affect residential land rents, residential parking supply, mode choice, welfare, air pollution, the share of auto users, population densities and city size, and whether the self-financing theorem holds in the context of the urban spatial model. liu (2018) presented an equilibrium model of departure time and parking location choices for optimizing the parking supply that minimizes the total system cost (i.e., the sum of travel cost and the social cost of parking supply) under either a user equilibrium or system optimum pattern. he found that the optimal planning of parking with autonomous vehicles is significantly different from that without autonomous vehicles. zhang et al. (2019) further analyzed the optimal parking supply strategy for autonomous vehicles to minimize the total system cost based on an integrated morning-evening commuting model.
these previous studies did not consider the competition between different parking types (e.g., on-street and off-street) or parking facility ownership issues (private and public), which can be considered in further studies. in the literature, there are a few studies involving joint strategies of capacity investment and demand management. for instance, arnott et al. (1994) explored the welfare effects of a toll-financed capacity expansion (i.e., toll revenues are used to finance transport investment) using a bottleneck model with user heterogeneity. it was shown that if the initial capacity is sufficiently small, a toll-financed expansion leaves all drivers better off. xiao et al. (2012) studied the feasibility of expanding bottleneck capacity with toll revenue. the results showed that if the revenue generated by the optimal flat toll is used to finance the capacity expansion, the trip cost of each commuter is reduced in the long run. however, the revenue from the optimal flat toll can never cover the capital cost of constructing the optimal capacity for minimizing the total system cost under constant returns. qian et al. (2012) derived the optimal parking capacity, fee, and access time which altogether yield the minimum total social cost. wan et al. (2015) investigated the joint impacts of airport terminal capacity expansion and a time-varying terminal fine toll on passenger demand (including business and leisure passengers) and the airport system. these previous studies usually assumed constant returns to scale and piecewise constant scheduling preferences, which can be extended to consider other returns to scale (e.g., increasing) and time-varying scheduling preferences. in the previous subsections, we have reviewed the literature on bottleneck model studies from the perspectives of travel behavior analysis, demand-side strategies, supply-side strategies, and joint strategies of both demand and supply sides.
in spite of the broad extensions conducted since the pioneering work of vickrey (1969), there are still some limitations in the existing related studies, summarized as follows. (i) as shown in section 3.1, various strong assumptions are often made in the related studies, aiming to simplify the model and derive analytical solutions. such simplicity may lead to a large deviation of the model results from the actual values, and thus restricts the explanatory power and real applications of the model. in order to capture more realism, it is necessary to relax these assumptions in further studies. (ii) the existing studies have mainly focused on the topics of travel behavior analysis and demand-side strategies (particularly on congestion tolling). however, only limited attention has been paid to the topics of supply-side strategies (e.g., financing modes for capacity expansion due to fiscal deficit) and joint strategies (e.g., using congestion tolls to finance capacity expansion). the disposition of toll revenue also lacks adequate research. these topics provide potential research opportunities for further studies. (iii) the driving effects of information technology innovation on social development, such as the sharing economy and smart mobility, are seldom incorporated in previous related studies. the rapid development of new technologies has been bringing about significant social reform, which is changing people's behavior and reshaping urban development. by incorporating these factors causing social changes, the bottleneck model could continue to provide new theoretical insights. according to the literature review and the analysis of the limitations of existing related studies presented in the previous section, one can identify some new and important gaps and opportunities for further studies, presented as follows. one solution to bottleneck congestion is to expand the capacity of the bottleneck.
such an expansion requires a huge capital investment, which imposes a heavy financial burden on the local authority. in order to broaden the range of fiscal sources for bottleneck capacity expansion, various franchising programs, such as build-operate-transfer (bot) or public-private partnership (ppp) projects, have been implemented in practice to encourage the private sector to invest in massive transit projects. in a bot contract, the private investor negotiates with the government to finance, design, construct, and operate transportation infrastructure for a certain period (i.e., a concession period). upon the expiration of the concession period, the government will take over the infrastructure. a ppp contract, as another procurement model for public projects or services, implies a collaborative agreement between the private sector and the government targeted at financing, designing, implementing and operating infrastructure and services. partnerships between the private sector and the government provide advantages to both parties. the technology and innovation of the private sector can help provide better public services through improved operational efficiency. the government provides the private sector with incentives to deliver projects on time and within budget. the ppp contract specifies the rights and obligations of each party, embodying the risk and revenue allocations between the parties. it is important to address the bot or ppp contract design issues of bottleneck capacity expansion, particularly under a shortage of funds. congestion pricing schemes have been operating for years in a few countries and regions, such as singapore, london, stockholm, and milan. such schemes have not yet been implemented worldwide due to low public acceptance, which is caused by the following factors: privacy, equity, complexity, and uncertainty ( gu et al., 2018 ). the privacy issue means that the itineraries of travelers are recorded by the charging facilities at different locations.
the equity issue implies that congestion pricing deters the poor from using road facilities and makes road resources a privilege of the rich. the complexity issue concerns the desire for a simple and well-understood proposal for the calculation of congestion charges. the uncertainty issue includes the uncertainty in the effectiveness of the proposed scheme, and the uncertainty in revenue allocation. in order to improve public acceptance of congestion pricing policy, the redistribution of toll revenue from congestion pricing is a critical issue. the government should make a reasonable allocation scheme for toll revenue to improve people's livelihood, such as expanding road capacity, improving public transit services, and reducing taxation. to achieve strong public support, the details of the use of toll revenue should be publicized to society. it is well known that the main economic principle behind congestion pricing is to internalize the congestion externality caused by transportation. transportation also contributes to environmental externality due to vehicular pollution emissions, besides the congestion externality. in order to control air pollution levels and improve air quality, clean air action programs have been launched in some large chinese cities, such as beijing and shanghai. the measures adopted in these programs include subsidizing the use of clean energy (e.g., electric or natural gas vehicles), retrofitting old motorized vehicles, and purifying vehicular pollutant emissions (e.g., free provision of vehicular exhaust purifiers). to achieve financial sustainability, it is proposed to levy emission taxes and redistribute part of the tax revenue to fund the aforementioned programs. therefore, further studies can focus on how to redistribute the emission tax revenue, which will affect the practical implementation of an emission pricing scheme and the public acceptance of this scheme.
auto sharing or ridesharing, as an emerging hot topic in the field of transportation, may have a significant effect on auto ownership rationing. it is expected that the implementation of ridesharing has the potential to reduce the maximum number of autos and parking spaces required in the transportation system, which affects the traffic congestion level and the residential location choice and thus the spatial distribution of residents in the urban system. in the ridesharing service system, the platform for ridesharing (e.g., didi or uber) plays an important role in matching shared autos and passengers, and the fleet size, service price or subsidy for ridesharing can help adjust the shared auto utilization rate, balance the modal split, and thus relieve the traffic congestion level of the system. further studies can, therefore, be made to consider the relationships among ridesharing, auto ownership rationing, and urban spatial structure, and to investigate the fleet size, pricing or subsidizing problem of ridesharing in a competitive multi-modal transportation system. the competition and collaboration between different ridesharing platforms and between ridesharing platforms and public transit are also important directions for future study. it is widely recognized that the rapid developments of information and communication technologies have significantly changed people's learning, work and life styles. for example, telecommuting or teleworking, as an alternative work arrangement, has become a growing trend in the information age. telecommuting will drive people away from workplaces, and thus save office space in urban areas and shift household residential location choices farther from workplaces, leading to a more spread-out city. it will also reduce the number of work trips and thus the demand for ground transportation, leading to reductions in energy consumption, traffic congestion and air pollution.
however, telecommuting reduces the chance and time for teamwork and face-to-face communication. as a result, team productivity may actually suffer, which hurts the productivity of the individual's firm and the urban economy. despite these two sides, telecommuting has recently become a major working mode for various professions due to the outbreak of covid-19 across the globe, making people more aware of its importance. on the other hand, rapid developments of new technologies also change the mobility of people and goods. it is believed that the emerging 5g and self-driving technologies will revolutionize the transportation industry. the 5g technology will enable road users and transportation infrastructure to communicate with everything else on the road. the self-driving technology can drive vehicles automatically, so car users do not need to carry out the driving task and can spend their in-vehicle time in autonomous cars on work or leisure activities, yielding extra activity utility. the end-to-end connectivity across the city with the 5g technology allows autonomous vehicles to drive close to each other through cooperating and platooning technologies, thus leading to increased network capacity and decreased traffic congestion in the peak period. the 5g technology can also alert autonomous vehicles to changes in traffic conditions, such as collisions, weather and traffic accidents, through direct and real-time communication from vehicle to vehicle, leading to increased safety and reliability on the road. it is thus necessary to investigate the effects of the new technology revolution on the movement behavior of people and goods and to design an efficient and sustainable urban system. the goal of this paper is to undertake a broad literature review of the bottleneck model research over the past half century.
the review undertaken in this paper uses a bibliometric analysis approach, in which the literature data of a total of 232 relevant papers are extracted from three well-recognized journal databases or search engines, namely web of science core collection, scopus, and google scholar. this analysis identifies the leading topics, top contributing authors, influential papers, and distributions of publications by journal, allowing readers to track how and where the literature has evolved. the literature is classified in terms of recurring themes into four main categories: travel behavior analysis, demand-side strategies, supply-side strategies, and joint strategies of demand and supply sides. for each theme, typical models proposed in previous studies are reviewed. based on this systematic review, we have identified some main gaps and opportunities in the bottleneck model research, which provide potential avenues for future research in this important and exciting area. by incorporating technological progress in the new digital era, the bottleneck model research keeps pace with the times and thus continues to contribute to new theoretical development. we are grateful to professor robin lindsey and three anonymous referees for their helpful comments and suggestions on earlier versions of the paper. the work described in this paper was jointly supported by grants from the national key research and development program of china (2018yfb1600900), the national natural science foundation of china (71525003, 71890970/71890974), and the nsfc-eu joint research project (71961137001). the second author made a presentation entitled "the bottleneck and corridor problems" at the international workshop on transport modeling held in auckland, new zealand, on january 8-11, 2019, and had a heated discussion with the other two authors of this paper. this discussion led them to write a 50th anniversary review of the bottleneck models, as presented in this paper.
however, the opinions expressed here are those of the authors themselves. the classical bottleneck model describes the departure time choice of commuters during the morning commute. every morning, n homogeneous commuters travel from home to work along a highway containing a bottleneck with a capacity s. to simplify the analysis, all commuters want to reach the workplace at an identical preferred arrival time t*. (one can also consider a preferred arrival time window [t* − Δ, t* + Δ], where Δ is a measure of work start time flexibility. no schedule delay penalty is incurred if a commuter reaches the destination within the time window; otherwise, a schedule delay penalty takes place. for example, vickrey (1969) assumed a uniform distribution of t* over an interval, and hendrickson and kocur (1981) generalized it to a general distribution.) without loss of generality, the free-flow travel time from home to work is assumed to be zero. thus, a commuter arrives at the bottleneck immediately after leaving home and arrives at the workplace immediately after leaving the bottleneck. when the arrival rate at the bottleneck exceeds the bottleneck's capacity, a queue develops. those who arrive early or late face a schedule delay cost. commuters choose their departure times based on a trade-off between the bottleneck congestion and the schedule delay cost. let c(t) denote the travel cost of commuters departing from home at time t. it is composed of the queuing delay cost at the bottleneck and the schedule delay cost of arriving early or late. let T(t) be the queuing delay at the bottleneck at time t. c(t) is then given as

c(t) = αT(t) + β·max(0, t* − t − T(t)) + γ·max(0, t + T(t) − t*), (a1)

where α is the unit cost of travel time, β is the unit cost of arriving early, and γ is the unit cost of arriving late. according to the empirical study of small (1982), the relationship γ > α > β should hold. the queuing delay T(t) equals the queue length D(t) divided by the bottleneck capacity s, i.e., T(t) = D(t)/s, where D(t) is the difference between the cumulative arrivals and cumulative departures by that time, i.e.,

D(t) = ∫_{t_q}^{t} r(u) du − s(t − t_q), (a2)

where r(t) is the departure rate of commuters from home at time t and t_q is the time at which the queue begins. at the equilibrium, all commuters have the same travel cost c(t) regardless of their departure time. this means dc(t)/dt = 0, ∀ t ∈ (t_q, t_q′), where t_q′ is the time when the queue ends. one can thus derive the equilibrium departure rate r(t) as

r(t) = αs/(α − β) for t ∈ [t_q, t̃], and r(t) = αs/(α + γ) for t ∈ (t̃, t_q′], (a3)

where t̃ is the departure time from home at which a commuter can arrive at the workplace punctually, i.e., t̃ + T(t̃) = t*. eq. (a3) shows that the equilibrium departure rate curve is piecewise constant. in the morning peak period (t_q, t_q′), the capacity of the bottleneck is fully utilized, and thus t_q′ − t_q = n/s holds. at the equilibrium, the first and last commuters do not face a queue, their queuing delays are zero, and their schedule delay costs must thus be equal, expressed as

β(t* − t_q) = γ(t_q′ − t*). (a4)

from eq. (a4), t_q′ − t_q = n/s and t̃ + T(t̃) = t*, one obtains

t_q = t* − (γ/(β + γ))(n/s), t_q′ = t* + (β/(β + γ))(n/s). (a5)

the resultant equilibrium travel cost is c = (βγ/(β + γ))(n/s). from the equilibrium condition c(t) = c(t_q) = c(t_q′) and eqs. (a1) and (a4), one can derive the queuing delay as

T(t) = (β/(α − β))(t − t_q) for t ∈ [t_q, t̃], and T(t) = (γ/(α + γ))(t_q′ − t) for t ∈ [t̃, t_q′]. (a6)

eq. (a6) shows that a queue builds up linearly from t_q to t̃ and then dissipates linearly until it disappears at t_q′. this means that the queuing delay curve is piecewise linear.
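the closed-form equilibrium of the classical bottleneck model lends itself to a direct numerical check. the following is a minimal python sketch (not from the paper): the unit costs alpha, beta, gamma follow small's empirical ordering γ > α > β, while the demand n, capacity s and preferred arrival time t* are illustrative assumptions.

```python
# closed-form equilibrium of the classical vickrey bottleneck model.
# all parameter values are illustrative assumptions; alpha, beta, gamma
# respect the empirical ordering gamma > alpha > beta (small, 1982).

alpha, beta, gamma = 6.4, 3.9, 15.21  # unit costs: travel time, earliness, lateness
n, s = 9000.0, 3600.0                 # commuters, bottleneck capacity (veh/h)
t_star = 9.0                          # common preferred arrival time (hours)

peak = n / s                                    # peak length, t_q' - t_q
t_q = t_star - gamma / (beta + gamma) * peak    # queue starts
t_qp = t_star + beta / (beta + gamma) * peak    # queue ends
cost = beta * gamma / (beta + gamma) * peak     # common equilibrium trip cost
# on-time departure time t~, solving t~ + T(t~) = t* on the early branch
t_tilde = ((alpha - beta) * t_star + beta * t_q) / alpha

def r(t):
    """equilibrium departure rate from home at time t (piecewise constant)."""
    if t_q <= t <= t_tilde:
        return alpha * s / (alpha - beta)   # early phase: rate exceeds capacity
    if t_tilde < t <= t_qp:
        return alpha * s / (alpha + gamma)  # late phase: rate below capacity
    return 0.0

def T(t):
    """equilibrium queuing delay for a departure at time t (piecewise linear)."""
    if t_q <= t <= t_tilde:
        return beta / (alpha - beta) * (t - t_q)
    if t_tilde < t <= t_qp:
        return gamma / (alpha + gamma) * (t_qp - t)
    return 0.0

# sanity checks: first and last commuters face no queue, and both pay the
# same equilibrium trip cost as pure schedule delay cost
assert abs(T(t_q)) < 1e-9 and abs(T(t_qp)) < 1e-9
assert abs(beta * (t_star - t_q) - cost) < 1e-9
assert abs(gamma * (t_qp - t_star) - cost) < 1e-9
# all n commuters depart during the peak (flow conservation)
served = r(t_q) * (t_tilde - t_q) + r(t_qp) * (t_qp - t_tilde)
assert abs(served - n) < 1e-6
print(round(t_q, 3), round(t_qp, 3), round(cost, 3))
```

under these assumed values the peak lasts n/s = 2.5 hours, and the sketch confirms that every commuter, whatever the departure time, incurs the same trip cost.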
testing the slope model of scheduling preferences on stated preference data a direct redistribution model of congestion pricing the corridor problem with discrete multiple bottlenecks analytical equilibrium of bicriterion choices with heterogeneous user preferences: application to the morning commute problem congestion tolling and urban spatial structure a bathtub model of downtown traffic congestion schedule delay and departure time decisions with heterogeneous commuters economics of a bottleneck departure time and route choice for the morning commute does providing information to drivers reduce traffic congestion? a temporal and spatial equilibrium analysis of commuter parking route choice with heterogeneous drivers and group-specific congestion costs a structural model of peak-period congestion: a traffic bottleneck with elastic demand properties of dynamic traffic equilibrium involving bottlenecks, including a paradox and metering the welfare effects of congestion tolls with heterogeneous commuters road pricing, traffic congestion and the environment: issues of efficiency and social feasibility information and time-of-usage decisions in the bottleneck model with stochastic capacity and demand the corridor problem: preliminary results on the no-toll equilibrium equilibrium traffic dynamics in a bathtub model: a special case financing capacity in the bottleneck model regulating dynamic congestion externalities with tradable credit schemes: does a unique equilibrium exist? 
transportation investigation of the traffic congestion during public holiday and the impact of the toll-exemption policy dynamic model of peak period congestion dynamic model of peak period traffic congestion with elastic arrival rates dynamic network models and driver information systems understanding the competing short-run objectives of peak period road pricing valuations of travel time variability in scheduling versus mean-variance models uniform versus peak-load pricing of a bottleneck with elastic demand peak-load pricing of a transportation route with an unpriced substitute partial peak-load pricing of a transportation bottleneck with homogeneous and heterogeneous values of time revisiting the bottleneck congestion model by considering environmental costs and a modal policy solving the step-tolled bottleneck model with general user heterogeneity optimal multi-step toll design under general user heterogeneity morning commute problem with queue-length-dependent bottleneck capacity endogenous trip scheduling: the henderson approach reformulated and compared with the vickrey approach commuter welfare under peak-period congestion tolls: who gains and who loses? the marginal social cost of travel time variability the uniqueness of a time-dependent equilibrium distribution of arrivals at a single bottleneck system optimum and pricing for the day-long commute with distributed demand, autos and transit a pareto improving strategy for the time-dependent morning commute problem congestion pricing and capacity of large hub airports: a bottleneck model with stochastic queues congestion pricing of canadian airports the untolled problems with airport slot constraints (when) do hub airlines internalize their self-imposed congestion delays pricing the major us hub airports 2000.
comparison of three empirical models of airport congestion pricing departure times in y-shaped traffic networks with multiple bottlenecks bottleneck road congestion pricing with a competing railroad service stochastic equilibrium model of peak period traffic congestion congestion pricing on a road network: a study using the dynamic equilibrium simulator metropolis private toll roads: competition under various ownership regimes comparison of morning and evening commutes in the vickrey bottleneck model private roads, competition, and incentives to adopt time-based congestion tolling modelling and evaluation of road pricing in paris the economics of crowding in rail transit trip-timing decisions and congestion with household scheduling preferences private operators and time-of-day tolling on a congested road network real cases applications of the fully dynamic metropolis tool-box: an advocacy for large-scale mesoscopic transportation systems metropolis: modular system for dynamic traffic simulation morning commute in a single-entry traffic corridor with no late arrivals on the existence of pricing strategies in the discrete time heterogeneous single bottleneck model additive measures of travel time variability the cost of travel time variability: three measures with properties on the relation between the mean and variance of delay in dynamic queues with random capacity and demand how a fast lane may replace a congestion toll congestion in the bathtub congestion in a city with a central bottleneck the dynamics of urban traffic congestion and the price of parking the value of travel time variance commuting for meetings valuing travel time variability: characteristics of the travel time distribution on an urban road travel time variability and rational inattention the value of reliability commuting and land use in a city with bottlenecks: theory and evidence vickrey meets alonso: commute scheduling and congestion in a monocentric city trip-timing decisions with traffic 
incidents hypercongestion in downtown metropolis endogenous scheduling preferences and congestion road pricing with complications downtown parking supply, work-trip mode choice and urban spatial structure private road supply in networks with heterogeneous users coordinated pricing for cars and transit in cities with hypercongestion empirical assessment of bottleneck congestion with a constant and peak toll: san francisco-oakland bay bridge morning commute with competing modes and distributed demand: user equilibrium, system optimum, and pricing the evening commute with cars and transit: duality results and user equilibrium for the combined morning and evening peaks congestion pricing practices and public acceptance: a review of evidence dynamic bottleneck congestion and residential land use in the monocentric city day-to-day departure time choice under bounded rationality in the bottleneck model modeling the morning commute problem in a bottleneck model based on personal perception pareto improvements from lexus lanes: the effects of pricing a portion of the lanes on congested highways tolling roads to improve reliability a partial differential equation formulation of vickrey's bottleneck model, part i: methodology and theoretical analysis a partial differential equation formulation of vickrey's bottleneck model, part ii: numerical analysis and computation schedule delay and departure time decisions in a deterministic model estimating exponential scheduling preferences 2000.
fares and tolls in a competitive system with transit and highway: the case with two groups of commuters pricing and logit-based mode choice models of a transit and highway system with elastic demand modal split and commuting pattern on a bottleneck-constrained highway optimal utilization of a transport system with auto/transit parallel modes the value of travel time variability with trip chains, flexible scheduling and correlated travel times traveler delay costs and value of time with trip chains, flexible activity scheduling and information traffic managements for household travels in congested morning commute bottleneck model with heterogeneous information estimating the social cost of congestion using the bottleneck model bottleneck congestion: differentiating the coarse charge the user costs of air travel delay variability a new look at the two-mode problem the commuter's time-of-use decision and optimal pricing and service in urban mass transit equilibrium queueing patterns at a two-tandem bottleneck during the morning peak spillovers, merging traffic and the morning commute queueing at a bottleneck with single-and multi-step tolls effects of the optimal step toll scheme on equilibrium commuter behaviour economics on the optimal n-step toll scheme for a queuing port economics on the optimal port queuing pricing to bulk ships the optimal step toll scheme for heavily congested ports effects of the optimal port queuing pricing on arrival decisions for container ships effects of the optimal n-step toll scheme on bulk carriers queuing for multiple berths at a busy port optimal non-queuing pricing for the suez canal modeling time-dependent travel choice problems in road networks with multiple user classes and multiple parking facilities on the use of reservation-based autonomous vehicles for demand management the morning commute in urban areas with heterogeneous trip lengths user equilibrium in a bottleneck under multipeak distribution of preferred arrival time 
morning commute in a single-entry traffic corridor with early and late arrivals user equilibrium of a single-entry traffic corridor with continuous scheduling preference bottleneck model revisited: an activity-based perspective step tolling in an activity-based bottleneck model reliability evaluation for stochastic and time-dependent networks with multiple parking facilities existence, uniqueness, and trip cost function properties of user equilibrium in the bottleneck model with multiple user classes equilibrium in a dynamic model of congestion with large and small users step tolling with bottleneck queuing congestion handbook of transport systems and traffic control optimal information provision at bottleneck equilibrium with risk-averse travelers an equilibrium analysis of commuter parking in the era of autonomous vehicles modeling the morning commute for urban networks with cruising-for-parking: an mfd approach interactive travel choices and traffic forecast in a doubly dynamical system with user inertia and information provision expirable parking reservations for managing morning commute with parking space constraints efficiency of a highway use reservation system for morning commute a novel permit scheme for managing parking competition and bottleneck congestion effectiveness of variable speed limits considering commuters' long-term response managing morning commute with parking space constraints in the case of a bi-modal many-to-one network modeling and managing morning commute with both household and individual travels pricing scheme design of ridesharing program in morning commute problem departure time and route choices in bottleneck equilibrium under risk and ambiguity morning commute problem considering route choice, user heterogeneity and alternative system optima a semi-analytical approach for solving the bottleneck model with general user heterogeneity the morning commute problem with ridesharing and dynamic parking charges bottleneck congestion 
pricing and modal split: redistribution of toll revenue public transport reliability and commuter strategy peak-load pricing of a bottleneck with traffic jam optimal cordon pricing in a non-monocentric city flextime, traffic congestion and urban productivity the morning commute for nonidentical travelers traffic flow for the morning commute a new tradable credit scheme for the morning commute problem managing rush hour travel choices with tradable credit scheme numerical solution procedures for the morning commute problem commuter responses to travel time uncertainty under congested conditions: expected costs and the provision of information travel-time uncertainty, departure time choice, and the cost of morning commutes simulating travel reliability optimal metering in the bottleneck congestion model vickrey's model of traffic congestion discretized equilibrium at a bottleneck when long-run and short-run scheduling preferences diverge long-run versus short-run perspectives on consumer scheduling: evidence from a revealed-preference experiment among peak-hour road commuters the economics of parking provision for the morning commute managing morning commute traffic with parking modeling multi-modal morning commute in a one-to-one corridor network the morning commute problem with heterogeneous travellers: the case of continuously distributed parameters linear complementarity formulation for single bottleneck model with heterogeneous commuters a single-step-toll equilibrium for the bottleneck model with dropped capacity give or take? 
rewards versus charges for a congested bottleneck pareto-improving social optimal pricing schemes based on bottleneck permits for managing congestion at a merging section pareto-improving ramp metering strategies for reducing congestion in the morning commute tradable credit scheme to control bottleneck queue length on the existence and uniqueness of equilibrium in the bottleneck model with atomic users airlines' strategic interactions and airport pricing in a dynamic bottleneck model of congestion punctuality-based departure time scheduling under stochastic bottleneck capacity: formulation and equilibrium punctuality-based route and departure time choice the scheduling of consumer activities: work trips trip scheduling in urban transportation analysis valuation of travel time the bottleneck model: an assessment and interpretation the existence of a time-dependent equilibrium distribution of arrivals at a single bottleneck bottleneck congestion and modal split bottleneck congestion and distribution of work start times: the economics of staggered work hours revisited bottleneck congestion and residential location of heterogeneous commuters a pareto-improving and revenue-neutral scheme to manage mass transit congestion with heterogeneous commuters modeling the modal split and trip scheduling with commuters' uncertainty expectation the morning commute problem with endogenous shared autonomous vehicle penetration and parking space constraint tradable credit schemes for managing bottleneck congestion and modal split with heterogeneous users equilibrium properties of the morning peak-period commuting in a many-to-one mass transit system step-tolling with price-sensitive demand: why more steps in the toll make the consumer better off congestion tolling in the bottleneck model with heterogeneous values of time winning or losing from dynamic bottleneck congestion pricing? 
the distributional effects of road pricing with heterogeneity in values of time and schedule delay congestion pricing in a road and rail network with heterogeneous values of time and schedule delay autonomous cars and dynamic bottleneck congestion: the effects on capacity, value of time and preference heterogeneity multiclass continuous-time equilibrium model for departure time choice on single-bottleneck network regional labor markets, commuting, and the economic impact of road pricing visualizing bibliometric networks congestion theory and transport investment pricing, metering, and efficiently using urban transportation facilities a smart local moving algorithm for large-scale modularity-based community detection airport congestion pricing and terminal investment: effects of terminal congestion, passenger types, and concessions equilibrium trip scheduling in single bottleneck traffic flows considering multi-class travellers and uncertainty: a complementarity formulation e-hailing ride-sourcing systems: a framework and review dynamic ridesharing with variable-ratio charging-compensation scheme for morning commute overcoming the downs-thomson paradox by transit subsidy policies equilibrium and modal split in a competitive highway/transit system under different road-use pricing strategies an ordinary differential equation formulation of the bottleneck model with user heterogeneity the morning commute problem with coarse toll and nonidentical commuters managing bottleneck congestion with tradable credits the morning commute under flat toll and tactical waiting congestion behavior and tolls in a bottleneck model with stochastic capacity stochastic bottleneck capacity, merging traffic and morning commute on the morning commute problem with carpooling behavior under parking space constraint tradable permit schemes for managing morning commute with carpool under parking space constraint the valuation of travel time reliability: does congestion matter? 
on the cost of misperceived travel time variability constrained optimization for bottleneck coarse tolling pareto-improving policies for an idealized two-zone city served by two congestible modes analysis of the time-varying pricing of a bottleneck with elastic demand using optimal control theory mathematical and economic theory of road pricing on the morning commute problem with bottleneck congestion and parking space constraints managing network mobility with tradable credits managing rail transit peak-hour congestion with a fare-reward scheme congestion derivatives for a traffic bottleneck congestion derivatives for a traffic bottleneck with heterogeneous commuters commuter arrivals and optimal service in mass transit: does queuing behavior at transit stops matter? carpooling with heterogeneous users in the bottleneck model a new look at the morning commute with household shared-ride: how does school location play a role? impact of capacity drop on commuting systems under uncertainty integrated daily commuting patterns and optimal road tolls and parking fees in a linear city modelling and managing the integrated morning-evening commuting and parking patterns under the fully autonomous vehicle environment efficiency comparison of various parking charge schemes considering daily travel cost in a linear city improving travel efficiency by parking permits distribution and trading integrated scheduling of daily work activities and morning-evening commutes with bottleneck congestion analysis of user equilibrium traffic patterns on bottlenecks with time-varying capacities and their applications optimal official work start times in activity-based bottleneck models with staggered work hours day-to-day evolution of departure time choice in stochastic capacity bottleneck models with bounded rationality and various information perceptions road traffic congestion and public information: an experimental investigation key: cord-295116-eo887olu authors: chimmula, vinay kumar 
reddy; zhang, lei title: time series forecasting of covid-19 transmission in canada using lstm networks date: 2020-05-08 journal: chaos solitons fractals doi: 10.1016/j.chaos.2020.109864 sha: doc_id: 295116 cord_uid: eo887olu on march 11th 2020, the world health organization (who) declared the 2019 novel corona virus a global pandemic. corona virus, also known as covid-19, first originated in wuhan, hubei province in china around december 2019 and spread all over the world within a few weeks. based on the public datasets provided by johns hopkins university and the canadian health authority, we have developed a forecasting model of the covid-19 outbreak in canada using state-of-the-art deep learning (dl) models. in this novel research, we evaluated the key features to predict the trends and possible stopping time of the current covid-19 outbreak in canada and around the world. in this paper we present long short-term memory (lstm) networks, a deep learning approach, to forecast future covid-19 cases. based on the results of our lstm network, we predict that the possible ending point of this outbreak will be around june 2020. in addition, we compared the transmission rates of canada with those of italy and the usa. we also present the 2, 4, 6, 8, 10, 12 and 14th-day predictions for 2 successive days. our forecasts in this paper are based on the data available until march 31, 2020. to the best of our knowledge, this is one of the few studies to use lstm networks to forecast infectious diseases. every infectious disease outbreak exhibits certain patterns, and such patterns need to be identified based on the transmission dynamics of the outbreak. intervening measures to eradicate such infectious diseases rely on the methods used to evaluate the outbreak when it occurs. an outbreak in a country or province usually occurs at different levels of magnitude with respect to time, i.e. seasonal changes and adaptation of the virus over time.
the patterns exhibited in such scenarios are usually non-linear in nature, and this motivates us to design a system that can capture such non-linear dynamic changes. with the help of these non-linear systems, we can describe the transmission of such infectious diseases. in [1] [2] a transmission model for malaria and in [3] a mathematical model for analysing the dynamics of tuberculosis have been developed to study transmission using mathematical models. in [4] a laplacian-based decomposition is used to solve the non-linear parameters in a pine wilt disease model. a modified sirs model in [5] successfully helped to control the syncytial virus in infants. similarly, the mathematical models presented in [6], [7] helped clinicians to better understand the characteristics of the human liver and the transmission of a dengue outbreak. most of the data-driven approaches used in previous studies [8] are linear methods and often neglect the temporal components in the data. they depend upon regression without non-linear functions and fail to capture the underlying dynamics; auto-regressive (ar) methods depend overwhelmingly on assumptions, and such models are difficult to use for forecasting real-time transmission rates. a wide range of statistical and mathematical models [9] [10] have been proposed to model the transmission dynamics of the current covid-19 epidemic. in many cases, these models are not able to fit the given data well, and accuracy is also low when predicting the growth of covid-19 transmission. r0 is a popular statistical method specifically used to model an infectious disease, often referred to as the 'reproduction number' because the infection reproduces itself with respect to time. r0 forecasts the number of people who can catch the infection from an infected person. in this model, an extra weight is applied to a person who has never had the current disease nor been vaccinated. if the value of r0 of a disease is 10, then each infected person will spread the disease to 10 other people surrounding him.
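the generation-by-generation growth implied by a fixed reproduction number can be sketched as follows. this is an illustration of the r0 definition above only, not the authors' model; it assumes a fully susceptible population and no interventions, and the function name is ours.

```python
# illustrative only: naive growth implied by a constant reproduction
# number r0, assuming a fully susceptible population (not the paper's
# forecasting model).
def cases_after_generations(r0: float, generations: int, seed: int = 1) -> float:
    """new cases produced in the n-th transmission generation."""
    return seed * r0 ** generations

# with r0 = 10, one case yields 10 cases in the next generation
# and 100 in the generation after that.
print(cases_after_generations(10, 1))  # 10
print(cases_after_generations(10, 2))  # 100
```

this also shows why even a modest reduction in r0 (e.g. through lockdowns) compounds over successive generations.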
in [11] the authors used the r0 method to find the infection rate of the novel virus on the diamond princess cruise ship. however, with such a method it is difficult to find the starting point of the infectious disease by identifying patient zero and the people he interacted with during his incubation period. it is worth noting that the mathematical models presented in [12], [13], [14] can be used to solve the complex non-linear patterns of infectious diseases. even though these epidemiological models are good at capturing the vital components of an infectious disease, their parameters require several assumptions. such hypothesized parameters will not fit the data perfectly, and the precision of such models will be low. meanwhile, in engineering applications [15], model parameters are calculated with the help of real-time data. a similar approach was used in this research to find the model parameters instead of relying on assumptions. in order to overcome the barriers of statistical approaches, we developed a deep learning based network to predict the real-time transmission. our model could help public health care providers and policy makers to make the necessary arrangements to tackle the rush of potential covid-19 patients. this experiment is based on the data sets of confirmed covid-19 cases available until march 31, 2020. artificial intelligence and mobile computing are among the key factors for the success of technology in health care systems [16]. in the world of smart devices, data is being generated in an unprecedented way, which has promoted the role of machine learning in healthcare [16]. the world today is more connected than ever before, and this has helped countries share real-time infection data. the distinctive features of artificial intelligence are its flexibility, domain adaptation, and the fact that it is economical to integrate with existing systems.
over the last few weeks, many researchers have come up with several mathematical models to predict the transmission of the novel corona virus [17] [18]. the major drawbacks of the existing models are that they are linear, neglect temporal structure, and rely on several assumptions while modelling the network. first of all, the covid-19 data form a time series, and it is highly recommended to use sequential networks to extract patterns from them. second, the data we are dealing with are dynamic in nature, so results from statistical and epidemiological models are often vague [19] [20]. in [21], [22], [23], [24] researchers used deep learning based lstm networks to forecast covid-19 infections. the lstm models used in those works are not able to represent the spatio-temporal components simultaneously. in this paper we address this problem by modifying the internal connections. in our modified lstm cells, we have established alternative connections between the input and output cells. this type of connection not only helps the network preserve spatio-temporal components, but also transfers historical information to the next units. in this paper, we made an effort to predict the outbreak of covid-19 based on past transmission data. first of all, the coherence of the input data needs to be analyzed in order to find the key feature, i.e. the number of new cases reported with respect to the previous day's infections. after selecting the key parameters of the network, several experiments were conducted to find the optimal model that can predict future infections with minimum error. previous studies on covid-19 predictions did not consider the recovery rate while developing their models. in this research, we considered the recovery rate as one of the features while building our model. from the design point of view, when a crisis occurs, algorithms tend to assign high probability to recent observations and completely neglect previous information, which leads to biased predictions.
we addressed this issue and solved it by using lstm networks. our results are expected to alert the public health care providers of canada to prepare themselves for the crisis against covid-19. with the help of this real-time forecasting tool, front-line clinical staff will be alerted before the crisis. the rest of this paper is structured as follows: section ii describes the methods, datasets and lstm models used in this paper. in section iii, we discuss our findings, and in section iv, conclusions and future work are discussed. the covid-19 data used in this research were collected from johns hopkins university and the canadian health authority, and include the number of confirmed cases until march 31, 2020. the data set also includes the number of fatalities and recovered patients by the end of each day. the dataset is available in time series format with date, month and year, so that the temporal components are not neglected. a wavelet transformation [25] is applied to preserve the time-frequency components; it also mitigates the random noise in the dataset. the fundamental point in representing and forecasting the trends of the current outbreak is to select suitable functions to fit the data. the covid-19 dataset is divided into a training set (80%), on which our models are trained, and a testing set (20%) to test the performance of the model. a large part of real-world datasets are temporal in nature. due to their distinctive properties, there are numerous unsolved problems with a wide range of applications. data collected over regular intervals of time is called time-series (ts) data, and each data point is equally spaced over time. ts prediction is the method of forecasting upcoming trends/patterns of a given historical dataset with temporal features. in order to forecast covid-19 transmission, the input data must retain its temporal components, which makes the approach different from traditional regression.
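the 80/20 split described above can be sketched as a chronological cut: for time series data the split must preserve order (no shuffling), so the most recent 20% of days becomes the test set. the function name and the sample counts are illustrative, not taken from the paper.

```python
# minimal sketch of an 80/20 chronological train/test split for a
# time series: the earliest 80% of observations train the model,
# the latest 20% test it (order preserved, no shuffling).
def train_test_split_ts(series, train_frac=0.8):
    cut = int(len(series) * train_frac)
    return series[:cut], series[cut:]

daily_cases = [1, 1, 2, 4, 8, 11, 20, 34, 59, 103]  # hypothetical counts
train, test = train_test_split_ts(daily_cases)
print(len(train), len(test))  # 8 2
```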
a time series (ts) can be broken down into trend, seasonality and error. a trend in a ts can be observed when a certain pattern repeats at regular intervals of time due to external factors like lockdown of the country, mandatory social distancing, quarantines etc. in many real-world scenarios, either trend or seasonality is absent. after finding the nature of the ts, the appropriate forecasting methods can be applied. a given ts is broadly classified into 2 categories, i.e. stationary and non-stationary. a series is said to be stationary if it does not depend on time components like trend and seasonality effects. the mean and variance of such a series are constant with respect to time. a stationary ts is easier to analyze and yields skilful forecasts. a ts is said to be non-stationary if it has trend or seasonality effects and changes with respect to time. statistical properties like mean, variance and standard deviation also change with respect to time. in order to check the nature (stationarity or non-stationarity) of the given covid-19 dataset, we performed the augmented dickey-fuller (adf) test [26] on the input data. the adf test is the standard unit root test to find the impact of trends on the data, and its results are interpreted by observing the p-value of the test. if p is below 5% (0.05), the null hypothesis of a unit root is rejected and the series is regarded as stationary. if p is greater than 0.05, the input data have a unit root and the series is regarded as non-stationary. before diving into the model architecture, it is crucial to explain the internal mechanisms of lstm networks and the reasons for using them instead of traditional recurrent neural networks. recurrent lstm networks have the capability to address the limitations of traditional time series forecasting techniques by adapting to the nonlinearities of the given covid-19 dataset and can achieve state-of-the-art results on temporal data.
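the adf decision rule described above can be encoded directly. this sketch only expresses the p-value threshold logic; in practice the p-value itself would come from a library routine such as `statsmodels.tsa.stattools.adfuller`, and the helper name here is ours.

```python
# decision rule for the augmented dickey-fuller test: the null
# hypothesis is that the series has a unit root (non-stationary);
# a p-value below alpha rejects it, so the series is treated as
# stationary. (the p-value would come from e.g.
# statsmodels.tsa.stattools.adfuller in practice.)
def is_stationary(adf_p_value: float, alpha: float = 0.05) -> bool:
    return adf_p_value < alpha

print(is_stationary(0.01))  # True  -> reject unit root, stationary
print(is_stationary(0.30))  # False -> unit root, non-stationary
```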
each block of the lstm operates at a different time step and passes its output to the next block until the final lstm block generates the sequential output. as of this writing, rnns with lstm blocks are among the most efficient algorithms to build a time series sequential model. the fundamental component of lstm networks is the memory block, which was invented to tackle vanishing gradients by memorizing network parameters for long durations. the memory block in the lstm architecture is similar to the differential storage systems of a digital system. gates in the lstm help in processing the information with the help of the activation function (sigmoid), whose output lies between 0 and 1. the reason for using the sigmoid activation function is that we need to pass only positive values to the next gates to get a clear output. the 3 gates of the lstm network are represented by the following equations:

i_t = sigma(W_i [h_{t-1}, x_t] + b_i)
f_t = sigma(W_f [h_{t-1}, x_t] + b_f)
o_t = sigma(W_o [h_{t-1}, x_t] + b_o)

where i_t is the function of the input gate, f_t the function of the forget gate, o_t the function of the output gate, W_x the coefficients of the neurons at gate (x), h_{t-1} the result from the previous time step, x_t the input to the current function at time step t, and b_x the bias of the neurons at gate (x). the input gate in the first equation gives the information that needs to be stored in the cell state. the second equation discards information based on the forget gate activation output. the third equation, for the output gate, combines the information from the cell state and the output of the forget gate at time step t to generate the output. the internal block diagram of the lstm block used in this study is shown in figure 1. the motivation behind initiating self-loops is to create a path so that gradients or weights can be shared for long durations. this is especially useful when modelling deep networks, where the vanishing gradient is a frequent issue to deal with. by adjusting the weights of the self-looped gates, we can adjust the time scale to detect dynamically changing parameters. using the above techniques, lstms are able to produce the state-of-the-art results in [27].
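a single lstm step with the three gates above can be sketched with scalar weights. this is the textbook cell update, not the paper's modified cell (which adds extra input-output connections); the weight values are arbitrary placeholders for illustration.

```python
import math

def sigmoid(z):
    # squashes gate activations into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W):
    """one scalar lstm step; W maps each gate to a (w, u, b) triple.
    standard cell, illustrative weights -- not the trained model."""
    i_t = sigmoid(W['i'][0] * x_t + W['i'][1] * h_prev + W['i'][2])    # input gate
    f_t = sigmoid(W['f'][0] * x_t + W['f'][1] * h_prev + W['f'][2])    # forget gate
    o_t = sigmoid(W['o'][0] * x_t + W['o'][1] * h_prev + W['o'][2])    # output gate
    g_t = math.tanh(W['g'][0] * x_t + W['g'][1] * h_prev + W['g'][2])  # candidate
    c_t = f_t * c_prev + i_t * g_t   # new cell state ("memory")
    h_t = o_t * math.tanh(c_t)       # new hidden state / block output
    return h_t, c_t

W = {k: (0.5, 0.1, 0.0) for k in 'ifog'}
h, c = lstm_step(x_t=1.0, h_prev=0.0, c_prev=0.0, W=W)
```

the self-loop `c_t = f_t * c_prev + i_t * g_t` is the path that lets gradients flow over long durations, which is why the cell resists vanishing gradients.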
the network architecture used in this study is shown in figure 2. the methods used in this study are based on data-guided approaches and are completely different from previous studies. our approaches and predictive outcomes will provide assistance for restricting the infections and the possible elimination of the current covid-19 pandemic. we trained our network with data until march 31, 2020 reported by the canadian health authority. in this study we found that policies or decisions taken by the government will greatly affect the current outbreak. several studies on forecasting covid-19 transmission are based on the r0 method; however, they did not include a sensitivity analysis to find the important features. we examined our model predictions using the mean square error (mse). in figure 4 we plot the total number of confirmed cases and the forecasted covid-19 cases in canada as a function of time. from the figure we can observe that canada has not witnessed its peak yet, and it is expected that the number of cases will soon increase exponentially despite the social distancing. although our model achieved better performance when compared with other forecasting models, it is unfortunate that transmissions are following an increasing trend. the rates of infection in the usa, italy and spain are growing exponentially; meanwhile, the number of infections in canada is increasing linearly, as shown in figure 3. if canadians follow the regulations strictly, the number of confirmed cases will soon decline. in our lstm model-1 we trained and tested our network on the canadian dataset; the rmse error is 34.83 with an accuracy of 93.4% for short-term predictions in canada. meanwhile, based on our testing/validation dataset, the rmse error is about 45.70 with an accuracy of 92.67% for long-term predictions. the predictions of the lstm model are shown in figure 4 with a solid red line. they show that our model was able to capture the dynamics of the transmission with minimum loss.
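the rmse metric used to score the forecasts above is straightforward to compute; this sketch shows the formula only, with hypothetical numbers rather than the paper's validation data.

```python
# root mean square error between observed and forecasted case counts:
# sqrt of the mean of squared residuals.
def rmse(y_true, y_pred):
    n = len(y_true)
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n) ** 0.5

print(rmse([100, 120, 150], [100, 120, 150]))  # 0.0 (perfect forecast)
print(rmse([100, 120], [103, 116]))            # sqrt(12.5) ~= 3.54
```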
from figure 4 we can say that canada witnessed linear growth in cases until march 16, 2020 after its first confirmed case. the current epidemic in canada is predicted to continue until june 2020. our second lstm model-2 is trained on the italian dataset to predict short-term and long-term infections in canada. for short-term predictions, the rmse error is about 51.46, which is higher than in the previous model. according to this second model, within 10 days canada is expected to see exponential growth of confirmed cases. it was a challenging task to forecast the dynamics of transmission based on a small dataset. even though the covid-19 outbreak started in canada around early january, consistent epidemiological data were not released until early february. because of the small dataset, several statistical models struggled to select optimal parameters, and several unknown variables led to uncertainty in their predictions. the lstm model differs from statistical methods in many ways; for instance, the proposed lstm network fits the real-time data without any assumptions when selecting hyperparameters. it was able to overcome the parameter assumptions using cross validation and achieved better performance by reducing the uncertainty. after reaching the inflection point, the recovery rate will start to decrease rapidly and the death rate may increase at the same time, as shown in figure 5. in order to find the trend of the infections, we decomposed the given series; the trend of infections is increasing with respect to time. further, the number of infections followed an increasing trend from sunday to tuesday and a decreasing trend until saturday, as shown in figure 6. we are still in a stage of dilemma about the current situation of covid-19 because the accuracy of our estimates is bounded by a lot of external factors.
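one simple way to expose the trend component mentioned above is a centered moving average, which smooths out the within-week (sunday-tuesday vs. midweek) pattern and leaves the underlying trend. the paper does not specify its decomposition method, so this is an assumed sketch with a weekly window and hypothetical counts.

```python
# centered 7-day moving average: smooths the weekly cycle so the
# remaining values approximate the trend component of the series.
# window=7 assumes a weekly pattern, as the text suggests.
def moving_average_trend(series, window=7):
    half = window // 2
    return [sum(series[i - half:i + half + 1]) / window
            for i in range(half, len(series) - half)]

cases = [3, 5, 4, 6, 8, 7, 9, 12, 11, 14, 16, 15, 18, 21]  # hypothetical
trend = moving_average_trend(cases)
# the smoothed values increase from start to end -> a rising trend
```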
so, it is recommended to conduct a follow-up study after this experiment to be more precise about the dynamics of this novel infectious disease. the actual number of cases might be higher than the cases reported by the government because of the backlog of test results, and some people will be immune before even being tested. all the above factors may lead to discrepancies in our model estimations. even though we addressed data imbalance by using statistical methods like interpolation and re-sampling, we could not represent patients who are in the incubation period or not tested. another problem while modelling the current pandemic is people travelling between the provinces. based on our sensitivity analysis, our projections may go down if the current trials on potential vaccines achieve fruitful results. finally, in order to minimize the bias of our training algorithm, we introduced regularization. further, by training our network inversely, we found that the outbreak in canada started around early january but was not reported until the last week of january. even without knowledge of the 1st case, our inverse training will help governments to better understand the outbreak of covid-19 and help them to prevent such outbreaks in future. the patterns from the data reveal that the prompt and effective approaches taken by canadian public health authorities to minimize human exposure are showing a positive impact when compared with other countries like the usa and italy (figure 3). the rate of transmission in canada is following a linear trend, while the usa is witnessing an exponential growth of transmissions.

figure 4: predictions of the lstm model on current exposed and infectious cases (red solid line). the red dotted lines represent the sudden changes from where the number of infections started following an exponential trend. the black dotted lines represent the training data (available confirmed cases).
however, it is too early to draw conclusions about the current epidemic. after simulations and data fitting, our model predicted canada would reach its peak within 2 weeks from now. however, the current outbreak resembles the early-20th-century spanish flu [28], which killed millions of people and lasted for 2 years. based on our model simulations, the current covid-19 pandemic is expected to end within 3 months from now. due to some unreported cases, a small number of infection clusters may appear until december 2020. however, recent technological improvements and international cooperation between countries may even reduce the duration of the current pandemic. to sum up, this is the first study to model infectious disease transmission in order to predict the gravity of covid-19 in canada using deep learning approaches. based on our current findings, provinces that implemented social distancing guidelines before the pandemic have fewer confirmed cases than other provinces (figure 3). for instance, saskatchewan issued social distancing guidelines 2 weeks ahead of quebec, which has half of the confirmed cases in canada. our results could help the canadian government to monitor the current situation and use our forecasts to prevent further transmissions.
references cited in this study:
1. malaria transmission dynamics of the anopheles mosquito in kumasi, ghana
2. bifurcation analysis of a mathematical model for malaria transmission
3. mathematical analysis of the transmission dynamics of hiv/tb coinfection in the presence of treatment
4. semianalytical study of pine wilt disease model with convex rate under caputo-fabrizio fractional order derivative
5. a new fractional hrsv model and its optimal control: a non-singular operator approach
6. a new study on the mathematical modelling of human liver with caputo-fabrizio fractional derivative
7. a new fractional modelling and control strategy for the outbreak of dengue fever
8. bridging the gap between evidence and policy for infectious diseases: how models can aid public health decision-making
9. application of the arima model on the covid-2019 epidemic dataset
10. forecasting of covid-19 confirmed cases in different countries with arima models
11. estimation of the reproductive number of novel coronavirus (covid-19) and the probable outbreak size on the diamond princess cruise ship: a data-driven analysis
12. the fractional features of a harmonic oscillator with position-dependent mass
13. new aspects of time fractional optimal control problems within operators with nonsingular kernel
14. a new feature of the fractional euler-lagrange equations for a coupled oscillator using a nonsingular operator approach
15. deep learning for real-time gravitational wave detection and parameter estimation: results with advanced ligo data
16. the "inconvenient truth" about ai in healthcare
17. preliminary estimation of the basic reproduction number of novel coronavirus (2019-ncov) in china, from 2019 to 2020: a data-driven analysis in the early phase of the outbreak
18. transmission potential and severity of covid-19 in south korea
19. updating of covariates and choice of time origin in survival analysis: problems with vaguely defined disease states
20. strong consistency of least-squares estimation in linear regression models with vague concepts
21. machine learning approach for confirmation of covid-19 cases: positive, negative, death and release
22. multiple-input deep convolutional neural network model for covid-19 forecasting in china
23. prediction for the spread of covid-19 in india and effectiveness of preventive measures
24. neural network based country wise risk prediction of covid-19
25. wavelet transform domain filters: a spatially selective noise filtration technique
26. lag order and critical values of the augmented dickey-fuller test
27. insights into lstm fully convolutional networks for time series classification
28. a pandemic warning?
no funding was received for this work. vinay kumar reddy chimmula: conceptualization of this study, methodology, software, writing (original draft preparation), critical revision of the manuscript for important intellectual content. lei zhang: data curation, critical revision of the manuscript for important intellectual content, supervision and material support, regular feedback after each update.
declaration of interest: the authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest (such as honoraria; educational grants; participation in speakers' bureaus; membership, employment, consultancies, stock ownership, or other equity interest; and expert testimony or patent-licensing arrangements) or non-financial interest (such as personal or professional relationships, affiliations, knowledge or beliefs) in the subject matter or materials discussed in this manuscript. no conflict of interest exists, and there has been no significant financial support for this work that could have influenced its outcome.
all persons who meet authorship criteria are listed as authors, and all authors certify that they have participated sufficiently in the work to take public responsibility for the content, including participation in the concept, design, analysis, writing, or revision of the manuscript. furthermore, each author certifies that this material or similar material has not been and will not be submitted to or published in any other publication before its appearance in the hong kong journal of occupational therapy. specific contributions: conception and design of study: vkr chimmula. acquisition of data: vkr chimmula. analysis and/or interpretation of data: vkr chimmula; l zhang. drafting the manuscript: vkr chimmula. revising the manuscript critically for important intellectual content: l zhang; vkr chimmula. approval of the version of the manuscript to be published: vkr chimmula, l zhang.
key: cord-268779-qbn3i2nq authors: alrasheed, hend; althnian, alhanoof; kurdi, heba; al-mgren, heila; alharbi, sulaiman title: covid-19 spread in saudi arabia: modeling, simulation and analysis date: 2020-10-23 journal: int j environ res public health doi: 10.3390/ijerph17217744 sha: doc_id: 268779 cord_uid: qbn3i2nq
the novel coronavirus, severe acute respiratory syndrome coronavirus 2 (sars-cov-2), has resulted in an ongoing pandemic and has affected over 200 countries around the world.
mathematical epidemic models can be used to predict the course of an epidemic and develop methods for controlling it. as social contact is a key factor in disease spreading, modeling epidemics on contact networks has been increasingly used. in this work, we propose a simulation model for the spread of coronavirus disease 2019 (covid-19) in saudi arabia using a network-based epidemic model. we generated a contact network that captures realistic social behaviors and dynamics of individuals in saudi arabia. the proposed model was used to evaluate the effectiveness of the control measures employed by the saudi government, to predict the future dynamics of the disease in saudi arabia according to different scenarios, and to investigate multiple vaccination strategies. our results suggest that saudi arabia would have faced a nationwide peak of the outbreak on 21 april 2020 with a total of approximately 26 million infections had it not imposed strict control measures. the results also indicate that social distancing plays a crucial role in determining the future local dynamics of the epidemic. our results also show that the closure of schools and mosques had the maximum impact on delaying the epidemic peak and slowing down the infection rate. if a vaccine does not become available and no social distancing is practiced from 10 june 2020, our predictions suggest that the epidemic will end in saudi arabia at the beginning of november with over 13 million infected individuals, and it may take only 15 days to end the epidemic after 70% of the population receive a vaccine. coronaviruses, a genus of the coronaviridae family, are enveloped viruses with a large plus-stranded rna genome. the genomic rna is 27-32 kb in size and is capped and polyadenylated. three serologically distinct groups of coronaviruses have been described, with viruses in each group characterized by their host range and genome sequence.
coronaviruses belong to a large family of viruses known to cause illnesses ranging from the common cold to more severe diseases, such as middle east respiratory syndrome (mers) and severe acute respiratory syndrome (sars). a novel coronavirus, sars-coronavirus-2 (sars-cov-2), was identified in december 2019 in wuhan, china, as a coronavirus that had not been previously identified in humans; the disease it causes is known as coronavirus disease 2019 (covid-19). since its identification, sars-cov-2 has spread rapidly, affecting over 200 countries and causing the 2019/2020 coronavirus pandemic. it was declared a public health emergency of international concern on 30 january 2020 by the world health organization (who). to date, many countries and regions have implemented lockdown measures and strict social distancing to limit the propagation of the virus. from a strategic and healthcare management perspective, the propagation pattern of the disease and the prediction of its spread over time are of great importance, as they can save lives and minimize the social and economic consequences. epidemiological modeling is a powerful tool that can help in understanding disease spread, control, and prevention. different mathematical epidemic models have been used in the literature, including statistical models [1], mathematical models [2][3][4][5][6][7][8][9][10], and network-based models [11][12][13]. mathematical epidemic models are used to predict the course of an epidemic and develop methods for controlling it by comparing different possible scenarios based on the observed data. one of the widely used models is the susceptible-infected-recovered (sir) model [14,15], where individuals are assigned into three compartments, i.e., susceptible (s), infected (i), and recovered (r). each individual belongs to one compartment and changes his/her state over time. an individual can transition from susceptible to infected with a specific infection rate.
each individual can also transition from infected to recovered according to a specific recovery rate. this simple epidemic model works well for a homogeneous population that exhibits similar contact patterns, with contact probabilities between any two individuals considered to be equal. however, recent research has shown that the contact patterns in a real population are heterogeneous [16] . as bidirectional social contacts are key factors in disease spreading, modeling epidemics on contact networks has been increasingly used to understand disease transmission and evaluate the impact of potential disease control [17] [18] [19] . this is because contact relationships between individuals that allow infection propagation naturally define a network. hence, understanding the contact network structure can improve the predictions of the infection distribution among individuals and allow the simulation of the full epidemic dynamics. networks allow the modeling and simulation of disease control measures by manipulating the connections among different individuals. in this work, we propose a simulation model for the spread of covid-19 in saudi arabia using a network-based sir epidemic model. we first generated a contact network that captured the realistic social behaviors and dynamics of individuals in the population of saudi arabia. we aimed to match the model simulations with empirical data and then used the model to evaluate the effectiveness of the control measures employed by the saudi government, to predict the future dynamics of the disease in saudi arabia according to different scenarios, and to predict the percentage of individuals that must be vaccinated to stop the outbreak (when a vaccine becomes available). modeling the spread of covid-19 in saudi arabia has been discussed in the literature [20] [21] [22] [23] ; however, no studies used a network-based model that captured the social and dynamic properties that are intrinsic to saudi society. 
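the network-based sir dynamics described above can be sketched as a per-step simulation in which infection travels only along contact edges, which is the key difference from homogeneous-mixing compartmental models. the code below is a minimal illustration; the node set, edge list, and probabilities are made up, not taken from the paper:

```python
import random

# hedged sketch (not the authors' code) of one discrete time step of an sir
# epidemic on a contact network given as an adjacency list. p_inf and p_rec
# are illustrative per-step probabilities.
def sir_network_step(adj, state, p_inf, p_rec, rng):
    """adj: node -> list of neighbours; state: node -> 'S'/'I'/'R'."""
    nxt = dict(state)
    for node, st in state.items():
        if st != 'I':
            continue
        for nb in adj[node]:                 # try to infect susceptible neighbours
            if state[nb] == 'S' and rng.random() < p_inf:
                nxt[nb] = 'I'
        if rng.random() < p_rec:             # an infected node may then recover
            nxt[node] = 'R'
    return nxt

# tiny demo network: a triangle plus a pendant node
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
state = {0: 'I', 1: 'S', 2: 'S', 3: 'S'}
rng = random.Random(42)
for _ in range(20):
    state = sir_network_step(adj, state, p_inf=0.3, p_rec=0.1, rng=rng)
print(sorted(state.values()))
```

newly infected nodes are written to `nxt` but read from `state`, so an infection cannot propagate two hops within a single step, matching the usual discrete-time semantics.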
further, control measures, such as school closures, mosque closures, domestic flight shutdowns, and curfews, were not considered. the proposed model is used to explain how social measures, such as social distancing and regional lockdowns, influence the model parameters, which, in turn, change the number of infected cases over time. the proposed model considers the dynamic nature of individual contact behaviors and the variations in susceptibility and infectivity between individuals. the main contributions of the work can be summarized as follows:
1. we built a model for contact networks that captures the social properties and dynamics intrinsic to saudi arabia's society. a set of attributes was defined for each node (representing each individual), including age, gender, nationality, and location. this is important as network structure and node attributes are crucial factors in the covid-19 epidemic spreading process.
2. we built a network simulation model of the spread of covid-19 in saudi arabia using the widely adopted sir model. using our network simulations, we analyzed the processes by which covid-19 spreads.
3. we analyzed the effectiveness of the response of saudi authorities using our network simulations.
4. we predicted the future dynamics of the disease in saudi arabia under different scenarios.
5. we investigated the effectiveness of different vaccination strategies.
in this work, we evaluated the effectiveness of saudi arabia's control measures on the epidemic dynamics. our results showed that strict local control measures, such as school closures, mosque closures, and flight shutdowns, play an important role in controlling the spread of the disease. in particular, mosque closures have the greatest impact on decreasing the transmission rate of the disease. our key results are in agreement with previous findings in china [11,24,25] and in the united states [11].
our model suggests that saudi arabia would have faced the peak of the outbreak on 21 april 2020 with a total of about 26 million infections if it had not imposed the control measures. this illustrates the importance of employing strict measures for flattening the epidemic curve of the infection and reducing the size of the epidemic. the strict social measures delay the peak of infection and minimize its period. altogether, these effects limit the burden on the healthcare system and prevent it from being overwhelmed. we also predicted the future dynamics of the outbreak in saudi arabia for the upcoming six months using multiple scenarios. according to the current data, the proposed model suggested that the peak would be roughly at the beginning of july, reaching a peak of 0.5% of the population if people did not practice strict social distancing. the peak represents the highest number of daily infections. using our simulations, we also computed the percentage of people that must be vaccinated to stop the epidemic. our results suggest that the outbreak can be contained by increasing the percentage of the vaccinated population (but without resorting to mass vaccination of the population). according to our results, the proposed simulation model provides insights that reflect the dynamic behavior of covid-19 under different scenarios. the results can guide the local healthcare system for making decisions during the critical periods of the epidemic. the rest of the paper is organized as follows. in section 2, we discuss related literature works, and, in section 3, we describe the method, including the contact network generation model, the data, and the network simulation model. in sections 4 and 5, we present and discuss the simulation results. finally, section 6 concludes the work. the epidemic progression of covid-19 has received increased attention from the research community since its outbreak in late 2019. 
the importance of understanding the virus transmission dynamics and further predicting the epidemic curve for public policy healthcare control measures has prompted multiple modeling efforts to control the outbreak [26, 27] . existing contributions in the epidemiological modeling of covid-19 include different types of models, such as statistical models [1] , mathematical models [2] [3] [4] [5] [6] [7] [8] [9] [10] , network-based models [11] [12] [13] , and phenomenological models [28] . due to their conceptual and mathematical simplicity, mathematical models, especially sir compartmental models, have long been popular in modeling epidemic dynamics [29, 30 ]. an sir model describes the spread of a disease in a population, where individuals are assigned into three compartments: susceptible (s), infected (i), and recovered (r) [14, 15] . however, previous studies [16] reported that compartmental models lack explicit modeling of contact structures among individuals, which play a crucial role in understanding and modeling the dynamics of the spread of directly transmissible diseases. compartmental models assume homogenous mixing, where all individuals are equally likely to encounter infection, which may not reflect reality [31] [32] [33] . manzo [16] argued that a major problem with these kinds of compartmental models is that they can only be used with population-wide interventions because they do not model the topology of realistic social interactions. for these reasons, network-based models have been considered as an alternative for the epidemiological modeling of directly transmissible diseases [17] [18] [19] . in such models, an infection may only spread over an arc between two nodes (or individuals) in the network that represents a contact. in the literature, several studies have addressed the deficiencies of previous compartmental models by extending sir-type models on a generated contact network [19] . 
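the sir compartmental model discussed above reduces to three coupled rate equations, ds/dt = -beta*s*i/n, di/dt = beta*s*i/n - gamma*i, dr/dt = gamma*i, which can be integrated numerically. a minimal sketch with illustrative (not fitted) parameters:

```python
# minimal numerical sketch of the classic sir compartmental model, integrated
# with a simple euler step. beta (infection rate) and gamma (recovery rate)
# below are illustrative values, not parameters fitted by any of the papers
# cited above.
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r                        # total population is conserved
    for _ in range(int(days / dt)):
        new_inf = beta * s * i / n * dt  # S -> I flow this step
        new_rec = gamma * i * dt         # I -> R flow this step
        s -= new_inf
        i += new_inf - new_rec
        r += new_rec
    return s, i, r

s, i, r = simulate_sir(beta=0.3, gamma=0.1, s0=9999, i0=1, r0=0, days=200)
print(round(s), round(i), round(r))
```

with beta/gamma = 3 the epidemic burns through most of the population, illustrating why homogeneous mixing tends to overestimate spread relative to the network models described next.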
for instance, salathe and jones [34] adopted this approach to study the effect of community structure on the epidemic dynamics of infectious disease and immunization intervention. volz [35] modeled sir dynamics on a static random network, which represents the population structure of susceptible and infected individuals and their contact patterns with an arbitrary degree distribution. the authors extended their work in [36] to cover a dynamic random network because contact patterns are inherently dynamic such that individuals tend to make and break relationships over time. miller et al. [30] proposed an edge-based compartmental model, which unlike compartmental models, assumes a heterogeneous contact rate and considers the partnership duration. read and keeling [37] investigated how local or global transmission routes in a contact network may affect the evolutionary selection of the transmission rate and infectious period, which determines the transmission dynamics of infectious diseases. ball et al. [38] proposed a stochastic sir network epidemic model with preventive dropping, where a susceptible individual can practice social distancing by removing its edge to an infectious individual. due to the importance of social mixing patterns on modeling epidemic dynamics and evaluating the employed control measures, many research efforts have been made to estimate the patterns in different countries [39] [40] [41] [42] . despite the success of network-based models, several published studies on covid-19 modeling, including those supporting policy decision making, have focused on compartmental models [2] [3] [4] [5] [6] [7] [8] [9] [10] [20] [21] [22] [23] 43] . manzo [16] urged researchers to direct their efforts toward network-based sir models and to start discussing a large-scale collection of empirical network data to foster such models. ferguson et al. 
[13] used a network-based model to study the impact of non-pharmaceutical interventions on reducing the spread of covid-19 to advise policymaking in the uk and other countries. the authors adopted an individual-based simulation model published in [44,45], where spatial details were included, such as the household, school, workplace, and the wider community. the authors used real data to define multiple attributes of the model, including age and household size distributions, average class sizes, staff-student ratios, and workplace size. peirlinck et al. [11] evaluated the effectiveness of intervention strategies and predicted the outbreak peak in china and the us. the authors modeled the covid-19 outbreak dynamics by combining a network model, where the nodes represent states and the edges represent connections between them, with a susceptible (s), exposed (e), infected (i), and recovered (r), i.e., seir, epidemic model. in their study, liu et al. [12] developed a contact network and a model without contact to simulate the covid-19 outbreak on the diamond princess cruise ship in two stages. the first stage was unprotected contact, and the second stage was divided into two scenarios: protected contact and airborne spread of the virus. the authors designed a small-world network-based chain-binomial model [46,47] for the unprotected contact stage, a contact network epidemic model for the protected contact of the crew stage, and a no-contact susceptible and infected model (ncsi) for the airborne spread of the passenger stage. they used bayesian inference and metropolis-hastings sampling to estimate the model parameters. several existing contributions modeled the covid-19 outbreak in saudi arabia using different models [20][21][22][23]. for instance, alboaneen et al. [20] predicted that saudi arabia would have a maximum of approximately 79,000 total cases using logistic growth and sir models. in [21], alharbi et al.
found that the sir model provided the best fit to the data compared to the generalized logistic, richards, and gompertz models. their results predicted that the total number of infected cases would reach 359,794 and that the pandemic would end by early september 2020. aletreby et al. [22] predicted that the pandemic would peak by the end of july 2020. further, the work in [23] used the sir model to predict future trends and compare the impact of control measures taken by saudi arabia and the united kingdom on the outcomes of covid-19 pandemic. their results indicated that early extreme measures imposed by the saudi authority played a major role in reducing the spread of the disease, compared to the uk. although there are some contributions that discussed covid-19 in saudi arabia [48] [49] [50] [51] [52] and others modeled the epidemic dynamics of the covid-19 outbreak in the country using different models, such as sir [20, 21, 23] , seir [22] , logistic growth [20] , and generalized logistic, richards, and gompertz models [21] , none have used a network-based model or considered the social properties and dynamics intrinsic to saudi arabia's society. control measures, such as school closures, mosque closures, domestic flight shutdowns, and curfews, were not considered. this work seeks to fill that gap by investigating the spread of covid-19 in saudi arabia using a network-based epidemic simulation model. the first positive covid-19 case in saudi arabia was confirmed on 2 march 2020 with more cases sporadically appearing in the following few weeks [53] [54] [55] . according to the saudi ministry of health [56] , the vast majority of infected people were home-comers from high-risk regions and their immediate contacts [53,57,58]. the proposed simulation is a stochastic discrete network-based model that explicitly represents individuals and their interactions. 
first, we created a synthetic contact network that matches the essential structural properties of saudi arabia's society. the synthetic population was constructed to statistically match the population demographics of saudi arabia. secondly, we modeled the spread of covid-19 in saudi arabia using a classic sir model. finally, we conducted the contact network generation, simulation, and all analyses using the python-based networkx library [59]. the generated network dataset, model parameters, and population demographics data are all available at https://github.com/halrashe/covid-19_sa_simulation. to simulate the spread of covid-19 in saudi arabia, we generated a contact network using the intrinsic properties and dynamics of saudi arabia's society. we preserved the saudi-related demographics and social features that are essential for the transmission of infection. therefore, our network generation model captures key individual and social aspects. first, the network captures the properties of individuals by assigning a set of attributes to each node, including age group, gender, citizenship, and location. secondly, the network conforms to several observed contact behaviors among individuals, such as location and age assortativity [60]. we used data from the saudi general authority for statistics [61] to assign the distribution of individuals for each attribute [62,63] (see figure a1 and table a1 in the appendix). simulating the full population is computationally challenging [32,64]; therefore, a contact network with a population of n = 10,500 individuals was generated with the given age group, gender, citizenship, and location distributions. the geographic locations used to construct the network corresponded to the 15 main administrative regions in saudi arabia. node connections represent contacts that may take place before and during the period of the epidemic. three connection types between node pairs were used in the network: familial, social, and random.
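attribute assignment of the kind described above can be sketched as sampling each node's attributes from marginal distributions; the category labels and weights below are placeholders, not the gastat figures used by the authors:

```python
import random

# sketch of building a synthetic population: each node gets an age group,
# gender, citizenship flag, and region drawn from marginal distributions.
# all weights here are made up for illustration.
rng = random.Random(0)
AGE_GROUPS = ["0-14", "15-34", "35-54", "55+"]
AGE_W = [0.25, 0.35, 0.28, 0.12]
REGIONS = ["riyadh", "makkah", "eastern", "other"]   # the paper uses 15 regions
REGION_W = [0.25, 0.26, 0.15, 0.34]

def make_population(n):
    return [{
        "id": k,
        "age": rng.choices(AGE_GROUPS, AGE_W)[0],
        "gender": rng.choices(["m", "f"], [0.58, 0.42])[0],
        "citizen": rng.random() < 0.62,              # hypothetical citizen share
        "region": rng.choices(REGIONS, REGION_W)[0],
    } for k in range(n)]

pop = make_population(10500)                          # network size from the paper
print(len(pop))
```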
see figure 1 for a schematic of the network. we define our undirected and unweighted contact network g = (v,e), where v represents the set of individuals in the population and e represents the contact relationships between them. in the contact network g, each individual belongs to a household, and the household sizes correspond to the values for saudi arabia reported in [65]. each household is represented as a complete graph in which every node is connected to every other node by a familial edge. nodes from two different households can be linked in two ways, i.e., based on similarity (social edges) or at random (random edges). nodes are linked with social edges with a probability proportional to their similarity (i.e., a higher node similarity implies a higher chance of connection in the contact network). two nodes are considered similar when they exhibit similar attributes. the similarity of two nodes u and v, denoted similarity(u,v), is computed using the scaled euclidean distance between the two node vectors based on their attributes. let u and v be the vectors corresponding to nodes u and v. we first construct the vectors of the two nodes (the vector length is equal to the number of attributes describing each node, which is 4 in this case). if the two nodes have different values for an attribute, the corresponding elements of the two vectors are set to 0 and 1, respectively; otherwise, both corresponding elements are set to 0. the similarity is then obtained from this weighted distance, where a and c are weight constants such that 3a + c = 1 and c >> a. the goal here is to assign the location attribute a larger weight because it plays the most important role in deciding the contact relationship among node pairs. if two nodes are not similar, then they may be connected randomly, with a probability of p + loc if they both belong to the same location and with a probability of p + random if they belong to different locations (p + random << p + loc).
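since the similarity equation itself did not survive extraction, the sketch below shows one plausible reading of the description above: binary per-attribute differences weighted by a (age, gender, citizenship) and c (location), combined as a scaled euclidean distance and converted to a similarity. the values a = 0.05 and c = 0.85 satisfy 3a + c = 1 with c >> a but are otherwise illustrative, as are the attribute names:

```python
# one plausible reading (an assumption, since the formula was lost in
# extraction) of similarity(u, v): squared binary attribute differences are
# weighted by a for age/gender/citizenship and c for location, summed, and
# the resulting scaled euclidean distance is turned into a similarity score.
A, C = 0.05, 0.85        # satisfy 3a + c = 1 with c >> a (illustrative values)

def similarity(u, v):
    weights = {"age": A, "gender": A, "citizen": A, "region": C}
    # each attribute contributes its weight if the two values differ, else 0
    dist_sq = sum(w for attr, w in weights.items() if u[attr] != v[attr])
    return 1.0 - dist_sq ** 0.5

u = {"age": "15-34", "gender": "m", "citizen": True, "region": "riyadh"}
v = {"age": "35-54", "gender": "m", "citizen": True, "region": "riyadh"}
print(round(similarity(u, v), 3))   # differs only in age -> 1 - sqrt(a)
```

with this weighting, a pair in the same location stays highly similar even when other attributes differ, which matches the stated goal of letting location dominate the connection rule.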
each edge eu,v connecting node u and v has a type attribute describing its formation. here, we use three edge types. the first one is familial when the two nodes u and v belong to the same household. the second is social when eu,v is formed as a result of the similarity between u and v. the third one is random when eu,v is formed completely at random. social edges represent contact relationships as a result of sharing school, work, interests, and neighborhoods. random edges represent contact relationships that occur as a result of coming into contact with another individual in a public place, a taxi, an airport, etc., or due to social contact that is not based on similarity. to make the model more realistic, a set of random edges is removed from the network (for example, not all familial relationships resemble infection-leading forms of contact) based on the edge type. a familial edge is removed with a probability of p â�� familial, a social edge is removed with a probability of p â�� social, and a random edge is removed with a probability of p â�� random such that p â�� familial << p â�� social << p â�� random. algorithm 1 shows the contact network generation algorithm. figure 2 shows the main properties of the contact network used in the simulation. nodes are linked with social edges with a probability proportional to their similarity (i.e., a higher node similarity implies a higher chance of connection in the contact network). two nodes are considered similar when they exhibit similar attributes. the similarity of two nodes u and v, denoted as similarity(u,v), is computed using the scaled euclidean distance between the two node vectors based on their attributes. let u and v be the vectors corresponding to nodes u and v. we first construct the vectors of the two nodes (the vector length is equal to the number of attributes describing each node, which is 4 in this case). 
Algorithm 1 Contact network generation
1: create household clusters (complete graphs) with the given average sizes
2: type(e_uv) ← familial, ∀ e_uv ∈ E
3: for each pair of non-neighboring nodes u, v do
4:   if similarity(u, v) > t then  {t is the node-pair similarity threshold}
5:     E ← E ∪ {e_uv} with probability p⁺_social
6:     type(e_uv) ← social
7:   else
8:     if location(u) = location(v) then
9:       E ← E ∪ {e_uv} with probability p⁺_loc
10:      type(e_uv) ← random
11:    else
12:      E ← E ∪ {e_uv} with probability p⁺_random
13:      type(e_uv) ← random
14: for each edge e_uv ∈ E do
15:   if type(e_uv) = familial then
16:     E ← E − {e_uv} with probability p⁻_familial
17:   if type(e_uv) = social then
18:     E ← E − {e_uv} with probability p⁻_social
19:   if type(e_uv) = random then
20:     E ← E − {e_uv} with probability p⁻_random
Each of the square-shaped regions in the similarity matrix in Figure 2b is formed because of the citizenship attribute. The bottom-left region corresponds to Saudi individuals, and the other two correspond to non-Saudi individuals. Non-Saudi individuals are partitioned into two groups because two patterns of contact were identified among them. Due to the model's stochasticity, similarity alone does not control edge formation (see the adjacency matrix in Figure 2b). Table 1 lists the structural properties of the underlying contact graph, which may have a significant impact on the dynamics of the disease [66]. The network density is 0 for a network with no edges and 1 for a network with all possible edges. Our contact network had a density of 0.036, revealing that it is a sparse yet connected network (the number of connected components is one). The node degree is the number of contacts an individual node has, which provides a quantitative measure of the node's role in the disease transmission process (Figure 2c). The maximum degree identifies the most active node (or nodes) in the network, representing individuals who contact a large number of people, such as sales workers and delivery and taxi drivers in highly populated locations (e.g., Riyadh).
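The generation procedure of Algorithm 1 can be sketched in plain Python. This is a toy rendering under stated assumptions: the two-household population, the probability values, and the helper names are illustrative, not the paper's implementation.

```python
import itertools
import random

def generate_contact_network(nodes, households, locations, similarity,
                             t=0.5, p_add=None, p_drop=None, seed=0):
    """Toy contact-network generator: household cliques, similarity-driven
    social edges, random edges, then type-dependent pruning.
    `p_add`/`p_drop` map edge types to addition/removal probabilities."""
    rng = random.Random(seed)
    p_add = p_add or {"social": 0.8, "loc": 0.1, "random": 0.01}
    p_drop = p_drop or {"familial": 0.01, "social": 0.1, "random": 0.5}
    edge_type = {}
    # Steps 1-2: each household is a clique of familial edges.
    for members in households.values():
        for u, v in itertools.combinations(members, 2):
            edge_type[frozenset((u, v))] = "familial"
    # Steps 3-13: link non-neighboring pairs by similarity or at random.
    for u, v in itertools.combinations(nodes, 2):
        e = frozenset((u, v))
        if e in edge_type:
            continue
        if similarity(u, v) > t:
            if rng.random() < p_add["social"]:
                edge_type[e] = "social"
        else:
            p = p_add["loc"] if locations[u] == locations[v] else p_add["random"]
            if rng.random() < p:
                edge_type[e] = "random"
    # Steps 14-20: prune each edge with a type-dependent probability.
    return {e: k for e, k in edge_type.items() if rng.random() >= p_drop[k]}

net = generate_contact_network(list(range(6)), {0: [0, 1, 2], 1: [3, 4, 5]},
                               {i: "riyadh" for i in range(6)},
                               similarity=lambda u, v: 1.0)
```

The returned dict maps each surviving edge (a frozenset of two node ids) to its type, which is what the removal step and the control-measure machinery later key on.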
In addition, the network exhibits a small-world property, with a high clustering coefficient and a short average path length and diameter. The network also shows a strong community structure. The modularity value [67] ranges between −1 and 1 and measures the quality of communities (higher modularity indicates a stronger community structure). Our contact network had 14 communities, each of which corresponded to a location (this is expected from the generation model used to create the network). Generally, our contact network structure matches the properties of other contact networks [36,68,69]. However, unlike other contact network generation models, we did not assume any network properties in advance [64,70-72]. Table 1 summarizes the contact network properties. The transmission dynamics of COVID-19 depend on the structure of the underlying contact network and on individual susceptibilities. The susceptibility defines how likely an individual is to become infected if he or she comes into contact with an infected individual.
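The structural measures reported in Table 1 are standard graph statistics. As a minimal illustration (the four-node adjacency list is a made-up example, not the paper's network), density and degrees can be computed directly from an adjacency dict:

```python
def density(adj):
    """Edge density of an undirected graph given as {node: set(neighbours)}:
    the fraction of all possible edges that are present."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2  # each edge counted twice
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0

def degrees(adj):
    """Number of contacts of each individual node."""
    return {u: len(nbrs) for u, nbrs in adj.items()}

# Toy graph: a triangle (0, 1, 2) plus a pendant node 3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
```

On the real network these quantities give the 0.036 density and the maximum-degree nodes discussed above.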
Since it is unknown which attributes of an individual determine his or her susceptibility, we used statistical tests on a real COVID-19 patient dataset to identify them. To this end, we requested and received data about the patients in Saudi Arabia from the Saudi Ministry of Health. The data consist of records of all individuals who were tested for COVID-19 by nasopharyngeal swab in Saudi Arabia between 2 March 2020 and 25 April 2020. Several data cleaning steps were applied to the dataset before testing. The characteristics of the final dataset are shown in Figure A2 and Table A2 in the Appendix. As can be seen in Table A2, the dataset is unbalanced because the majority of the cases are negative. Therefore, we oversampled the positive class using the synthetic minority over-sampling technique (SMOTE) in Python [73]. We then applied Pearson's chi-square statistical hypothesis test to both the original unbalanced dataset and the balanced dataset. The chi-square test was used to assess whether there was a statistically significant relationship between each attribute (i.e., age, gender, citizenship, and location; the independent variables) and the test result (the dependent variable). This is a well-known feature selection technique in machine learning [74]. Our goal was to determine which attributes contribute to an individual's susceptibility. The resulting p-values for the attributes are presented in Table 2. All p-values were < 0.05, which implies a significant relationship; therefore, all attributes were included in the estimation of an individual's susceptibility. Let G = (V, E) denote the contact network defined in Section 3.1. To simulate the spread of COVID-19 in Saudi Arabia, we ran a standard SIR epidemic model on our contact network. According to this model, each node u has a state state(u) that is either susceptible, infected, or recovered (immune).
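The chi-square test of independence used here is simple enough to state concretely. The sketch below (pure Python; the contingency counts are a made-up illustration, not the ministry dataset) computes Pearson's statistic for a 2x2 attribute-vs-result table; for one degree of freedom the p-value reduces to erfc(sqrt(x/2)):

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 contingency
    table [[a, b], [c, d]]. Returns (statistic, p_value); with one
    degree of freedom, p = erfc(sqrt(statistic / 2))."""
    n = sum(sum(row) for row in table)
    col = [table[0][j] + table[1][j] for j in (0, 1)]
    stat = 0.0
    for i in (0, 1):
        for j in (0, 1):
            expected = sum(table[i]) * col[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat, math.erfc(math.sqrt(stat / 2))

# Made-up counts: rows = gender (M, F), columns = test result (+, -).
stat, p = chi_square_2x2([[30, 70], [10, 90]])
```

A p-value below 0.05, as for this toy table, is the criterion the text uses to retain an attribute for the susceptibility estimate. For multi-level attributes such as location, the same statistic generalizes to an r x c table with (r − 1)(c − 1) degrees of freedom.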
Transitions are only allowed from susceptible to infected and from infected to recovered. The SIR model is a reasonable representation for COVID-19, which is assumed (as of this writing) to confer full immunity after recovery [75]. The epidemiology of COVID-19 and its clinical characteristics are not fully known; therefore, we relied heavily on recently available data [54,55] for disease transmission. Based on the analysis in Section 3.2, we identified four main attributes that play a role in the transmission of infection: age, gender, citizenship, and location. Accordingly, each node u was assigned a susceptibility value susceptibility(u) describing its risk of infection. Transmission from an infected node v to a susceptible node u occurs with a probability proportional to the susceptibility of node u; i.e., p_uv = susceptibility(u), where state(v) = infected. To find each susceptibility value, we extracted all possible events (attribute value combinations) from the available records and calculated the probability of each compound event. Figure A3 and Table A3 list the node susceptibility values. Initially (at time 0), the population is fully susceptible, with a single infected individual. The infected individual was chosen to have the same attributes (i.e., age group, gender, citizenship, and location) as the first recorded case in Saudi Arabia (a 40-year-old male from the Eastern region). Thereafter, the infection progresses over the contact network for several iterations (each iteration corresponds to one day). The incubation period was set to 14 days, the maximum incubation period recorded for COVID-19 [76], and the recovery rate was set to 0.2 (see Table 3). The major control measures employed by the Saudi government were implemented in the model, including school closures, mosque closures, domestic flight shutdowns, and in-home curfews.
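The transmission step described above can be sketched as a discrete-time SIR process on the adjacency structure. A minimal sketch, assuming a per-day recovery probability and per-node susceptibilities as transmission probabilities (the toy path network and parameter values are illustrative):

```python
import random

def run_sir(adj, susceptibility, patient_zero, recovery_rate=0.2,
            steps=100, seed=0):
    """Discrete-time SIR on a contact network {node: set(neighbours)}.
    Each day, a susceptible node u is infected by an infected neighbour
    with probability susceptibility[u]; infected nodes recover with
    probability recovery_rate per day."""
    rng = random.Random(seed)
    state = {u: "S" for u in adj}
    state[patient_zero] = "I"
    for _ in range(steps):
        infected = [u for u in adj if state[u] == "I"]  # snapshot for this day
        if not infected:
            break  # epidemic over
        for v in infected:
            for u in adj[v]:
                if state[u] == "S" and rng.random() < susceptibility[u]:
                    state[u] = "I"  # transmits from the next day on
            if rng.random() < recovery_rate:
                state[v] = "R"
        # (an I -> R transition never reverts: recovery confers immunity)
    return state

# Toy network: a path of four nodes, everyone fully susceptible.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
final = run_sir(adj, {u: 1.0 for u in adj}, patient_zero=0)
```

With susceptibility 1.0 everywhere, the infection deterministically traverses the path, so every node ends the run as infected or recovered.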
The model also implements social distancing, ground screening, partial business reopening, and business as usual. The major control measures, their dates, and the assumed compliance rates used in the model are listed in Table 4. In some cases, control measures are not enough to prevent contact; for example, school friends can meet outside of school, and people can still travel by car to meet. In Table 4, the compliance rate of 35% for ground screening represents the percentage of people who were infected and were detected only as a result of the ground screening. Business as usual refers to the full reopening of businesses, where we assume that contact relationships are restored and social distancing is the only measure that affects the susceptibility of individuals. The compliance rates that produced the simulation curves closest to the actual curve were selected. Control measures were introduced by removing edges between the relevant nodes at a specific compliance rate. For example, school closures resulted in removing edges among node pairs who shared the same location and age group; edges were removed with a specific probability and among a specific percentage of the relevant nodes. Conversely, partial business reopening and business as usual result in adding previously removed edges between a given set of nodes with a given probability. Finally, we implemented social distancing as a reduction in the probability of infection (decreasing node susceptibility). To establish the simulation model parameters, we used the empirical data of confirmed COVID-19 cases in Saudi Arabia for the period from 2 March 2020 (the first confirmed case) until 11 May 2020. The model parameters are listed in Table 3. We compare the actual and simulated results of the daily and cumulative new infected cases in Figures 3 and 4, respectively. Note that all simulation results correspond to averages over 10 simulations and were scaled to the actual number of infected cases.
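The edge-removal mechanics of a closure can be made concrete with a short sketch. Here a closure is any predicate on a node pair (e.g., both school-age in the same location), applied with a compliance rate; the attribute names and toy network are illustrative assumptions:

```python
import random

def apply_closure(adj, attrs, match, compliance=0.8, seed=0):
    """Remove each edge whose endpoints satisfy `match(attrs[u], attrs[v])`
    (e.g., same location and school-age group) with probability
    `compliance`. Returns a new adjacency dict; the input is untouched."""
    rng = random.Random(seed)
    new = {u: set(nbrs) for u, nbrs in adj.items()}
    for u in adj:
        for v in list(adj[u]):
            if u < v and match(attrs[u], attrs[v]) and rng.random() < compliance:
                new[u].discard(v)
                new[v].discard(u)
    return new

attrs = {0: {"age": "school"}, 1: {"age": "school"}, 2: {"age": "adult"}}
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
# A school closure with full compliance severs the school-school edge only.
closed = apply_closure(adj, attrs,
                       lambda a, b: a["age"] == b["age"] == "school",
                       compliance=1.0)
```

Reopenings would run the inverse operation (re-adding removed edges with a given probability), and social distancing would instead scale down the susceptibility values rather than touch the edge set.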
It can be seen from the figures that our model fits the reported data well. To further confirm the fit, we predicted the daily cases for the period from 12 May 2020 to 18 June 2020 and compared them with the available actual data (see Figure 5). Regulations imposed after 31 May are not implemented in the model, which may explain the overestimation of the simulated curve around 31 May. The values of the mean absolute percentage error (MAPE) and the symmetric mean absolute percentage error (sMAPE) for the prediction were 17.7% and 14.6%, respectively. Outlier values were removed due to the extreme sensitivity of these error measures to outliers [77,78]; before removing the outliers, the MAPE and sMAPE were 32.9% and 19.0%, respectively.
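The two error measures are standard and easy to state. A minimal sketch with made-up daily-case numbers (not the paper's data):

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs((a - p) / a)
                     for a, p in zip(actual, predicted)) / len(actual)

def smape(actual, predicted):
    """Symmetric MAPE, in percent (bounded above by 200%)."""
    return 100 * sum(2 * abs(p - a) / (abs(a) + abs(p))
                     for a, p in zip(actual, predicted)) / len(actual)

actual, predicted = [100, 200, 400], [110, 180, 400]
```

Both measures blow up on days with few actual cases (a small denominator), which is why outlier days move the scores so sharply, as reported above.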
We further simulated and analyzed the effect of the selected Saudi control measures and their timings. Then, we predicted the disease dynamics and measured the effect of vaccination. To determine the efficacy of the control measures imposed in Saudi Arabia, we simulated the epidemic without each measure individually for the period from 2 March 2020 to 11 May 2020 and compared the resulting simulation curve with the original epidemic curve.
The results of this analysis provide an estimate of the number of new cases that were prevented by each control measure. Figure 6 illustrates the epidemic curves produced by not implementing each of the major control measures (i.e., school closures, mosque closures, domestic flight shutdowns, and curfews) imposed by the Saudi government. The figure shows that removing any of the control measures caused the epidemic curve to reach its peak earlier than the actual curve. When the school closures measure was not implemented, the maximum percentage increase in the number of daily cases was 104%, and the curve peaked earlier than the other curves. Not implementing mosque closures and curfews also caused the curve to peak early compared to the actual curve: cancelling mosque closures caused a 113% maximum percentage increase, while cancelling curfews caused only a 23% increase. Cancelling flight shutdowns resulted in an 83% maximum percentage increase in the number of daily cases. To assess the impact of the selected date of each control measure, we simulated the epidemic with a late effective date for each measure. The results are shown in Figure 7, where Figure 7a shows the impact of delaying school closures, Figure 7b mosque closures, Figure 7c domestic flight shutdowns, and Figure 7d curfews; Figure 7e shows the actual curve for ease of comparison. Each panel also shows the percentage increase in the infection rate and in the total number of infected cases when the corresponding control measure was delayed. The figure shows that delaying any of the control measures caused an increase in the infection rate and in the total number of infected cases. When the mosque closures measure was delayed, the total number of infected cases increased by 173% and the infection rate increased by 128%. Delaying curfews caused a 113% increase in the total number of infected cases and a 45% increase in the infection rate. When school closures or flight shutdowns were delayed, the infection rate increased by 35% and 37%, respectively, and the total number of infected cases increased by 17% and 49%, respectively.
We used the proposed model to predict the future dynamics of the outbreak in Saudi Arabia for the upcoming period of six months (from 12 May to 31 December) under three scenarios representing multiple levels of adherence to the social distancing recommendations after 31 May 2020 (the announced business-as-usual date). Figure 8 shows the number of infected individuals per day for all scenarios. In particular, Figure 8a-c shows the epidemic dynamics with poor (0% of the population), moderate (50% of the population), and strong (75% of the population) compliance with social distancing, respectively; the level of adherence is defined by the percentage of people who practice social distancing. For comparison purposes, we also predicted the dynamics of the outbreak when no control measures or social distancing were imposed during the whole pandemic period starting from 2 March. The red curve in Figure 8 shows that if no control measures had been imposed, the peak of the infection was predicted to be about 1.6% on 21 April 2020 (the peak refers to the highest number of daily infections), with a total outbreak size of 80% of the population, and the epidemic would have ended at the end of August 2020. Figure 8a shows the disease dynamics when social distancing adherence is poor (0% of the population practicing social distancing) after the business return on 31 May 2020. The simulation results suggest that there would be two peaks, at roughly the beginning of July and the middle of August. The peak of the infection would be 0.5%, with a total outbreak size of about 47% of the population. According to this scenario, our model suggests that the epidemic would end at the beginning of November 2020 with over 13 million infected individuals; the end of the epidemic is measured according to the number of active cases (i.e., when the number of active cases is close to zero). The epidemic curve in Figure 8b shows the disease dynamics when social distancing is practiced moderately (about 50% of the population practicing social distancing).
The figure suggests that the first peak would remain below 0.4%, the second peak would be avoided, and the total number of infected individuals would be about 33% of the population. In Figure 8c, the epidemic curve suggests that when most people practice social distancing (75% of the population), the total number of infected individuals would decrease to about 25%. We next explored the dynamics of the epidemic if part of the population is vaccinated. This is helpful for understanding what percentage of the population must be vaccinated to stop the epidemic. We considered four scenarios in which 0%, 30%, 50%, and 70% of the population was vaccinated. In the proposed model, vaccination is represented by removing the edges between a node u and part of its neighbor nodes. We show the epidemic curves and the results of the multiple vaccination scenarios in Figure 9 and Table 5, respectively. We assumed that a vaccine would become available on 10 June 2020 (the date was chosen to make the differences easily visible on the plot). Before this date, all control measures are imposed with the compliance rates shown in Table 4. However, irrespective of the dates, the insights available in this simulation are useful for whenever a vaccine becomes available. Figure 9. Epidemic curves of multiple vaccination scenarios. Curves are smoothed using a Savitzky-Golay filter [79] (window length of 31 and a degree-3 polynomial). Figure 9 and Table 5 suggest that the outbreak and peak sizes are inversely proportional to the percentage of the population vaccinated. Further, we observe that the higher the percentage of the population vaccinated, the earlier the epidemic peaks and ends. For example, when 30% of the population is vaccinated, the peak occurs on 1 July 2020 and the epidemic ends on 4 November 2020; when 70% is vaccinated, the peak occurs on 30 May 2020 and the epidemic ends on 25 June 2020.
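The Savitzky-Golay smoothing used for Figure 9 fits a low-degree polynomial to a sliding window and evaluates it at the window's center. The paper presumably used an off-the-shelf implementation such as scipy.signal.savgol_filter; the minimal NumPy sketch below simply clips windows at the boundaries, and the window/degree choice is illustrative:

```python
import numpy as np

def savgol_smooth(y, window=31, degree=3):
    """Minimal Savitzky-Golay smoothing: least-squares fit of a `degree`
    polynomial to a centered window around each point, evaluated at the
    window's center. Windows are clipped at the series boundaries."""
    y = np.asarray(y, dtype=float)
    half = window // 2
    out = np.empty_like(y)
    for i in range(len(y)):
        lo, hi = max(0, i - half), min(len(y), i + half + 1)
        x = np.arange(lo, hi)
        coeffs = np.polyfit(x, y[lo:hi], min(degree, len(x) - 1))
        out[i] = np.polyval(coeffs, i)
    return out
```

Because the local fit is a degree-3 polynomial, any signal that is itself polynomial of degree at most 3 passes through unchanged, while day-to-day reporting noise is averaged out.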
The proposed network model allows the analysis and evaluation of the various control measures used to slow or prevent the transmission of COVID-19 in Saudi Arabia, as well as of the timing of each measure. Moreover, the model can be used to predict the future dynamics of the outbreak in Saudi Arabia with and without the availability of vaccination. The results presented in Section 4.1 show the epidemic curves resulting from not implementing each of the four major control measures employed by the Saudi government. The results reveal several important pieces of information. First, they suggest that all of the employed control measures played a significant role in delaying the peak of the epidemic, where the peak represents the highest number of daily infections. This can be seen by comparing the dashed vertical lines on the curves, which mark the day on which the number of new cases reached its maximum (compared to the actual curve). Second, it is apparent from the top curve in Figure 6 that implementing school closures had the maximum impact, because cancelling school closures caused the curve to reach its peak earlier than the other three curves. Third, the results suggest that the employed measures also played an important role in slowing down the infection rate. For example, the maximum percentage increase in the number of cases in the original curve was 49%, whereas it was 104%, 113%, and 83% without implementing school closures, mosque closures, and flight shutdowns, respectively. Our results are in agreement with previous findings in China [11,24,25] and in the United States [11].
For example, the authors in [11] suggested that community mitigation actions such as isolation of infectious individuals, quarantine of close contacts, and travel restrictions affect COVID-19 infection rates. Cancelling curfews appeared to have the minimum impact on slowing down the infection rate compared to the other measures, as it caused only a 23% increase in the number of cases (red curve in Figure 6). This is likely because the assumed compliance rate for this measure was only 50% (see Table 4), the lowest among the measures. The time at which control measures are implemented against an epidemic is critical. In Section 4.2, we presented the epidemic curves after changing the effective date of each of the employed control measures. The results show that delaying any one of the measures increased the total number of infected cases compared to when all measures were implemented on time (the actual curve). For instance, when the effective date of the school closures measure was delayed by 14 days, the total number of infected cases increased by 17%. The highest increases in the number of infected cases were seen when mosque closures and curfews were delayed (173% and 113%, respectively). Similar trends can be observed with respect to the infection rate, as delaying mosque closures and curfews caused the maximum percentage increases (128% and 45%, respectively). This can be explained by the susceptibility of the nodes affected by each control measure. For example, mosque closures mainly affect contact relationships (edges) among adult males, and the node susceptibilities of adult male individuals are generally higher than those of other nodes (Table A3 and Section 3.2). Delaying flight shutdowns had the least impact on the total number of infected cases (a 4% increase compared to the actual curve).
this can be explained by the low number of edges that connect individuals from different locations (figure 2(d)). the model was used to predict the future dynamics of covid-19 in saudi arabia for the upcoming period of six months with and without control measures. the results are shown in section 4.3. a number of observations can be made. first, the size of the peak when no control measures were imposed would be disastrous, as it would result in a total of over 26 million infected individuals, which would overwhelm the healthcare system. we next compare the three scenarios in which control measures are imposed. first, if social distancing adherence is poor, the two peaks, occurring at roughly the beginning of july and the middle of august, would result in over 13 million infected individuals by the end of the epidemic (at the beginning of november 2020). this number is half the number of infected individuals when no control measures were imposed. second, social distancing significantly decreased the infection peak and the total number of infections. our results contradict earlier findings [20, 21, 80, 81]. for example, [80] predicted that the 99% end of the pandemic in saudi arabia should have occurred on 30 may 2020. in [20], the authors predicted that the final phase of the outbreak would occur by the end of june 2020 with a total of 79,000 infected individuals. further, the work in [21] predicted that the pandemic would end by early september 2020 with a total of 359,794 infected individuals. in comparing the two scenarios (without and with control measures), it is clear that employing different control measures is crucial for flattening the epidemic curve and reducing the final size of the epidemic. these measures also prolong the peak period and reduce the peak height, which is crucial to avoid overwhelming the healthcare system. the model was used to predict the disease dynamics under multiple vaccination scenarios.
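the vaccination scenarios discussed above can be reproduced in miniature with a stochastic sir simulation on a contact network, in which vaccinated nodes start in the removed state. the sketch below is illustrative only: the random network, the per-contact transmission probability `beta`, the recovery probability `gamma` and the seeding are assumptions, not the parameters of the saudi model.

```python
import random

def simulate_sir(adj, beta=0.05, gamma=0.1, vacc_frac=0.0, seeds=5,
                 steps=200, seed=42):
    """discrete-time stochastic sir on a contact network; vaccinated
    nodes start in the removed (immune) state."""
    rng = random.Random(seed)
    n = len(adj)
    status = ["S"] * n
    for v in rng.sample(range(n), int(vacc_frac * n)):
        status[v] = "R"                         # vaccinated = immune
    susceptible = [i for i in range(n) if status[i] == "S"]
    for s in rng.sample(susceptible, seeds):
        status[s] = "I"
    total_infected = seeds
    for _ in range(steps):
        infected = [i for i in range(n) if status[i] == "I"]
        if not infected:
            break
        newly = set()
        for i in infected:                      # each edge transmits w.p. beta
            for j in adj[i]:
                if status[j] == "S" and rng.random() < beta:
                    newly.add(j)
        for i in infected:                      # each case recovers w.p. gamma
            if rng.random() < gamma:
                status[i] = "R"
        for j in newly:
            status[j] = "I"
        total_infected += len(newly)
    return total_infected

# toy random contact network with roughly 8 contacts per individual (assumed)
g = random.Random(0)
n = 2000
adj = [[] for _ in range(n)]
for _ in range(4 * n):
    a, b = g.randrange(n), g.randrange(n)
    if a != b:
        adj[a].append(b)
        adj[b].append(a)

for frac in (0.0, 0.3, 0.7):
    print(f"vaccinated {frac:.0%}: total infected {simulate_sir(adj, vacc_frac=frac)}")
```

raising the vaccinated fraction shrinks the susceptible pool and hence the final outbreak size, which is the qualitative pattern reported for the 30%/50%/70% scenarios.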
the results (section 4.4) suggest that the epidemic will end in saudi arabia on 4 november 2020 if no one in the population is vaccinated (i.e., if no vaccination is available). at that point, a sufficient share of the population would have developed immunity to the disease because they previously had the virus and had recovered. however, the results showed that, in this scenario, around 41% of the population may become infected, which is equivalent to over 13 million individuals, and the epidemic may reach its peak on 1 july 2020 with over a hundred thousand individuals infected. in the best-case scenario, when 70% of the population is vaccinated on 10 june, the results suggest that it may take only 15 days to end the epidemic, with an outbreak size of 13%. this period increases by almost two months when only 50% of the population is vaccinated. when 30% of the population is vaccinated, the results show that the epidemic may end in late september. note that specific recommendations for vaccination may consider multiple factors analyzed in this section, such as the outbreak size, peak size, and pandemic end date, but may also consider other factors, such as the vaccination cost and the number of critical cases. the proposed network generation and simulation models are part of an effort to create an accurate simulation of the spread of covid-19 in saudi arabia. however, the findings in this work are subject to several limitations. as with all models, the quality of our model depends on the quality of the underlying data. this includes the contact patterns, data on infected cases, and pathogen data. inadequate and missing data were replaced with assumptions and simplifications. for example, in the proposed contact network generation model, the contact patterns and edge formation among individuals were simplified to three contact types with corresponding assumed probabilities.
therefore, a greater focus on realistic contact patterns in saudi arabia using a social contact survey could produce interesting findings that could enhance the accuracy of our model. moreover, the contact network used to simulate the disease was static. a dynamic network, in which nodes and edges are added and removed over time due to birth, death, and quarantine, would be more realistic for representing contact relationships among individuals; this is left for future work. our predictions also include inherent uncertainty, as the model parameters were derived from limited clinical data. for example, the node susceptibilities were based on limited data (records from 2 march to 25 april). in addition, the population's actual compliance with the recommended control measures is unknown. therefore, the compliance rates used in the model were assumed. more information on population compliance rates would help improve the accuracy of the model. the goal of this work was to model and analyze the spread of covid-19 in saudi arabia using a network-based epidemic model. first, we generated a realistic contact network of individuals in saudi arabia. then, we used the sir model to simulate the spread of covid-19. the proposed model accounted for the dynamic nature of individual contact behaviors and the variations in susceptibility between individuals. the proposed simulation model was used to evaluate the effectiveness of the employed saudi control measures and their timings on the dynamics of the epidemic, and to predict the future dynamics of the outbreak in saudi arabia. the model was also used to calculate the percentage of people that need to be vaccinated to stop the epidemic. funding: this research received no external funding. table b1. distribution of nodes by age group, gender, citizenship, and location. the age distribution of individuals was based on citizenship and gender but is approximated here.
similarly, the gender distribution was also based on citizenship but is approximated here.

references (titles only; the bibliography was flattened during extraction and the original numbering is lost):
- application of the arima model on the covid-2019 epidemic dataset. data brief 2020, 105340
- modified seir and ai prediction of the epidemics trend of covid-19 in china under public health interventions
- a conceptual model for the coronavirus disease 2019 (covid-19) outbreak in wuhan, china with individual reaction and governmental action
- estimation of the transmission risk of the 2019-ncov and its implication for public health interventions
- analysis and forecast of covid-19 spreading in china, italy and france
- transmission dynamics of the covid-19 outbreak and effectiveness of government interventions: a data-driven analysis
- interrupting covid-19 transmission by implementing enhanced traffic control bundling: implications for global prevention and control efforts
- feasibility of controlling covid-19 outbreaks by isolation of cases and contacts
- modeling the epidemic dynamics and control of covid-19 outbreak in china
- effect of delay in diagnosis on transmission of covid-19
- outbreak dynamics of covid-19 in china and the united states
- using the contact network model and metropolis-hastings sampling to reconstruct the covid-19 spread on the diamond princess
- impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
- a contribution to the mathematical theory of epidemics
- infectious diseases of humans: dynamics and control
- complex social networks are missing in the dominant covid-19 epidemic models
- networks and epidemic models
- networks and the epidemiology of infectious disease
- epidemic spread in networks: existing methods and current challenges
- predicting the epidemiological outbreak of the coronavirus disease 2019 (covid-19) in saudi arabia
- epidemiological modeling of covid-19 in saudi arabia: spread projection, awareness, and impact of treatment
- dynamics of sars-cov-2 outbreak in the kingdom of saudi arabia: a predictive model
- covid-19 outcomes in saudi arabia and the uk: a tale of two kingdoms
- the effectiveness of quarantine of wuhan city against the corona virus disease 2019 (covid-19): a well-mixed seir model analysis
- the effect of human mobility and control measures on the covid-19 epidemic in china
- covid-19: knowns, unknowns, and questions. msphere 2020, 5
- sars-cov-2 and covid-19: the most important research questions
- real-time forecasts of the covid-19 epidemic in china from
- contact network epidemiology: bond percolation applied to infectious disease prediction and control
- edge-based compartmental modeling for infectious disease spread
- when individual behaviour matters: homogeneous and network models in epidemiology
- network theory and sars: predicting outbreak diversity
- modeling individual vulnerability to communicable diseases: a framework and design
- dynamics and control of diseases in networks with community structure
- sir dynamics in random networks with heterogeneous connectivity
- reality mining: sensing complex social systems
- disease evolution on networks: the role of contact structure
- a stochastic sir network epidemic model with preventive dropping of edges
- social contacts and mixing patterns relevant to the spread of infectious diseases
- the effect of network mixing patterns on epidemic dynamics and the efficacy of disease contact tracing
- hens, n. a household-based study of contact networks relevant for the spread of infectious diseases in the highlands of peru
- estimating contact patterns relevant to the spread of infectious diseases in russia
- the effect of control strategies to reduce social mixing on outcomes of the covid-19 epidemic in wuhan, china: a modeling study
- strategies for mitigating an influenza pandemic
- modeling targeted layered containment of an influenza pandemic in the united states
- collective dynamics of small-world networks
- an examination of the reed-frost theory of epidemics
- the cancellation of mass gatherings (mgs)? decision making in the time of covid-19
- covid-19 in the shadows of mers-cov in the kingdom of saudi arabia
- covid-19-the role of mass gatherings
- covid-19: preparing for superspreader potential among umrah pilgrims to saudi arabia
- saudi arabia's drastic measures to curb the covid-19 outbreak: temporary suspension of the umrah pilgrimage
- covid-19) disease interactive dashboard
- humanitarian data exchange-novel coronavirus (covid-19) cases data
- projecting social contact matrices in 152 countries using contact surveys and demographic data
- population based on gender
- population distribution over the main administrative regions based on gender and nationality
- chains of affection: the structure of adolescent romantic and sexual networks
- family demographic transition in saudi arabia: emerging issues and concerns
- the structure and function of complex networks. siam rev
- finding and evaluating community structure in networks
- modeling disease outbreaks in realistic urban social networks
- an efficient immunization strategy for community networks
- susceptible-infected-recovered epidemics in dynamic contact networks
- generating realistic scaled complex networks
- synthetic minority over-sampling technique
- feature selection for classification: a review. data classif
- covid-19 and postinfection immunity: limited evidence, many remaining questions
- the incubation period of coronavirus disease 2019 (covid-19) from publicly reported confirmed cases: estimation and application
- a new accuracy measure based on bounded relative error for time series forecasting
- principles of forecasting: a handbook for researchers and practitioners
- smoothing and differentiation of data by simplified least squares procedures
- when will covid-19 end? data-driven prediction
- epidemic situation and forecasting of covid-19 in saudi arabia using the sir model

key: cord-280722-glcifqyp authors: rios, v.; gianmoena, l. title: is there a link between temperatures and covid-19 contagions? evidence from italy date: 2020-05-19 journal: nan doi: 10.1101/2020.05.13.20101261 sha: doc_id: 280722 cord_uid: glcifqyp

this study analyzes the link between temperatures and covid-19 contagions in a sample of italian regions during the period from february 24 to april 15. to that end, bayesian model averaging techniques are used to assess the relevance of temperatures together with a set of additional climate, environmental, demographic, social and policy factors. the robustness of individual covariates is measured through posterior inclusion probabilities. the empirical analysis provides conclusive evidence on the role played by temperatures, which appear as the most relevant determinant of contagions. this finding is robust to (i) the prior distribution elicitation, (ii) the procedure used to assign weights to the regressors, (iii) the presence of measurement errors in official data due to under-reporting, (iv) the employment of different metrics of temperature, and (v) the inclusion of additional correlates. in a second step, relative importance metrics that perform an accurate partitioning of the r2 of the model are calculated. the results of this approach support the evidence of the model averaging analysis, given that temperature is the top driver, explaining 45% of regional contagion disparities. the set of policy-related factors appears at a second level of importance, whereas factors related to the degree of social connectedness or the demographic characteristics are less relevant. the coronavirus disease 2019 (covid-19) pandemic started on december 1, 2019 in wuhan city, central china, when a cluster of pneumonia cases of unknown cause was reported, mainly linked to workers in the wuhan south china wholesale seafood market (who, 2020a). after the epidemic outbreak in china, on 30 january 2020, the world health organization (who) director-general dr.
tedros declared the outbreak of the novel severe acute respiratory syndrome coronavirus (sars-cov-2) and the covid-19 disease "a public health emergency of international concern". the reason is that sars-cov-2 is a new pathogen that attacks the respiratory system and can lead to acute respiratory failure or death (bedford et al., 2020; phua et al., 2020; russell et al., 2020). despite travel restrictions, border controls, and quarantine measures in china, the epidemic spread worldwide. on 11 march 2020, the who declared covid-19 "a pandemic disease" after it had affected 118,610 people over 114 countries in a matter of weeks. five weeks later, by april 15, more than 2 million people were infected in 185 countries. the size and strength of the epidemic outbreak has threatened the ability of many healthcare systems all over the world to cope with a shock of unprecedented magnitude and has produced a severe health crisis. in addition, although the economic impacts of the covid-19 crisis are as yet uncertain, imf forecasts project that the "great lockdown" employed by governments in many countries to contain the disease is likely to cause a deeper recession than that of the 2008-2009 financial crisis, as the global economy is expected to contract sharply by 3 percent in 2020 (imf, 2020). given that governments all over the world are highly concerned with the threat of covid-19, improving our understanding of the drivers of the geographical variation in covid-19 contagions is of major importance from a policy perspective. however, to date, little is known about the determinants of regional covid-19 contagion differentials.
from an empirical perspective, the strand of literature analyzing its determinants has highlighted the relevance of different factors in shaping regional reactions to the covid-19 epidemic, such as air pollution (aqr, 2020a; pansini and fonaca, 2020; setti et al., 2020; wu et al., 2020), social mobility and connectedness (arenas et al., 2020; pluchino et al., 2020; kuchler et al., 2020), population density (fang and wahba, 2020; aqr, 2020b), the level of policy stringency and the timing of lockdowns (orea and alvarez, 2020; casares and khan, 2020), or the effects of proactive testing (wang et al., 2020b; romer, 2020), among others. in this regard, one of the issues that has received most attention is the possible link between climatic factors and contagions (yao et al., 2020; bukhari and jameel, 2020; sajadi et al., 2020; ma et al., 2020; oliveiros et al., 2020; wang et al., 2020c), as there is the hope that rising temperatures and humidity during the summer season could reduce the sars-cov-2 transmission rate, providing time for healthcare system recovery, drug and vaccine development, and a return to economic activity. however, the results are not conclusive, as they virtually fit all possibilities and reach diverging conclusions. yao et al. (2020) and bukhari and jameel (2020) find no evidence of any statistical relationship between temperatures and contagions; ma et al. (2020), oliveiros et al. (2020), sajadi et al. (2020) and wang et al. (2020c) observe a negative effect, whereas in merow and urban (2020) and pedrosa (2020) a positive relationship is obtained. these studies represent substantial progress in understanding the link between climate and the spread of covid-19, but an important methodological problem present in most of them is that they have employed very limited sets of variables to analyze this phenomenon and have ignored the uncertainty surrounding the true model or data generating process (dgp) underlying covid-19 contagions.
from an econometric perspective, ignoring model uncertainty and omitting relevant explanatory variables that could affect covid-19 patterns is a major concern, given that such estimates may be unreliable (moral-benito, 2015; steel, 2017). moreover, since it is often not clear a priori which set of variables is part of the "true" regression model, a naive approach that ignores specification and data uncertainty may result in biased estimates, overconfident (too narrow) standard errors and misleading inference and predictions. a second issue with existing analyses of covid-19 is that they fail to derive a ranking of the various factors in terms of their importance, thus hampering consensus on which policies could be implemented to increase resilience against this pandemic disease. third, there is evidence of large cross-country differentials in the percentage of covid-19 cases detected by each country, which makes it difficult to compare existing cross-country incidence series (russell et al., 2020). these measurement errors are likely to plague and contaminate any cross-country analysis, and they might be aggravated if the ability to test relative to the actual contagions is time-varying, or if there is a varying degree of under-reporting in the number of deaths due to the disease. to solve the aforementioned problems, this study contributes to the literature investigating the linkages between climate and covid-19 contagions in three major aspects. first, we investigate the effects of temperatures by means of model averaging techniques. this approach allows us to produce a probabilistic ranking of importance through the computation of posterior inclusion probabilities (pips) for the different variables.
the set of potential determinants considered at the regional level includes: (i) institutional factors and variables related to the regional authorities' policy response to the epidemic, (ii) demographic factors, (iii) climatic, geographic and environmental characteristics, and (iv) variables that capture the degree of social connectedness and mobility of the population. thus, compared with the limited set of regressors analyzed in the existing empirical literature, this study assesses model uncertainty over a larger set of potential covid-19 determinants while minimizing the omitted variable bias and multicollinearity concerns that could arise in a single-model regression framework. hence, unlike previous studies using a small fraction of the information available in the data set to draw conclusions, we explore the model space formed by all the possible model combinations implied by the set of potential drivers. in addition, a posterior jointness analysis following doppelhofer and weeks (2009) is carried out with the aim of increasing our understanding of the factors that might behave as complements or substitutes to the climate and policy effects on covid-19 contagions. second, to complement the bma analysis, in a second phase we calculate relative importance metrics for regression models, which allow us to consider all possible causal patterns and channels among the explanatory factors (johnson and lebreton, 2004; gromping, 2007). these metrics perform a decomposition of the r2 of the regression model, enabling a detailed analysis of the relative contribution of each variable to the explained variability of regional contagion differentials. the main advantage of this approach is that it allows us to disentangle the importance of each factor while taking into account its multivariate interactions with the other factors.
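the r2 partitioning described above can be illustrated with the lmg (shapley-value) decomposition, which averages each regressor's marginal contribution to r2 over all orders of entry into the model. the brute-force sketch below is only feasible for a handful of regressors, and the data are synthetic, not the paper's regional dataset.

```python
import itertools
import math
import numpy as np

def r2(X, y, cols):
    """r-squared of an ols fit of y on an intercept plus X[:, cols]."""
    n = len(y)
    Z = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    tss = (y - y.mean()) @ (y - y.mean())
    return 1.0 - (resid @ resid) / tss

def lmg(X, y):
    """lmg / shapley decomposition of r2: for every ordering of the
    regressors, record each one's marginal r2 gain when it enters,
    then average over all k! orderings."""
    k = X.shape[1]
    shares = np.zeros(k)
    for order in itertools.permutations(range(k)):
        included, prev = [], 0.0
        for j in order:
            included.append(j)
            cur = r2(X, y, included)
            shares[j] += cur - prev
            prev = cur
    return shares / math.factorial(k)

# synthetic example: x0 dominates, x1 matters less, x2 is noise
rng = np.random.default_rng(1)
n = 300
X = rng.normal(size=(n, 3))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(size=n)
shares = lmg(X, y)
print(shares, shares.sum(), r2(X, y, [0, 1, 2]))
```

by construction the shares telescope: they sum exactly to the full-model r2, which is what makes statements like "temperature explains 45% of regional contagion disparities" well defined.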
a third point is that instead of employing a non-homogeneous large sample, with observations that are difficult to compare (which usually comes with the additional cost of having a limited set of explanatory factors with which to investigate regional differentials), here we focus our attention on a small sample of 21 nuts-2 italian regions. this research design has some advantages with respect to a "large n" design. the first is that the italian civil protection ministry (cpm) provides both longer time series and more consistent, reliable and homogeneous regional data on the magnitudes of the health crisis (i.e., number of contagions, deaths, tests, etc.) than other advanced countries. [footnote 2: an example of the issues that arise when registering and compiling data in a similarly developed and decentralized country is that of spain. in spain, on april 2, the spanish health ministry published a note correcting its daily reports, recognizing that some regions were counting hospitalizations differently from others, making it impossible to compare the existing regional time series (shm, 2020). similar problems have occurred with the incidence data. until april 15, active cases included pcr-confirmed cases for all regions and serological tests for some regions. starting on april 15, a change of methodology led to a split of both figures for some regions; by april 18 most had switched to the new methodology except for galicia, which made the switch from april 28 onwards. detailed press reviews on counting problems across regions on other variables, such as the number of deaths in spain, are (mouzo, 2020; galaup and sanchéz, 2020).] second, the fact that italy was one of the countries hit first and hardest by the covid-19 pandemic implies that, by the time of performing this analysis, we can analyze the full evolution of the first wave of the epidemic, given that abrupt changes in the data that could affect the conclusions are not likely to occur in the near future. third, by exploiting regional variation within a single country, we minimize the potential biases and problems of data comparability due to under-reporting differentials across countries and across time in both the incidence and death data (see russell et al., 2020 for a discussion). nonetheless, as the issue of under-reporting of cases is likely to create measurement errors even in a relatively homogeneous sample, we verify whether the results obtained are robust to this potential problem. [footnote 3: to that end, we correct the time series of reported cases using a delay-adjusted crude fatality ratio (cfr).] the study is organized as follows. section 2, which follows this introduction, reviews the theoretical mechanisms and the empirical evidence linking climate to covid-19 contagions. section 3 describes the data on regional contagions and climatic factors and provides preliminary evidence on the link between these variables. section 4 describes the data set used in this study and the various factors considered in the analysis. in section 5, the bma econometric modeling framework is presented. the empirical findings and robustness checks are presented in section 6, while section 7 discusses the policy implications that can be derived from this research and offers the main conclusions of the study.

2 why should climate matter for covid-19 contagions?

the linkages between viral disease transmission and climatic factors have been well documented in a large number of cases (ncr, 2001). disease agents and their vectors have specific environments that are optimal for growth, survival, transport, and dissemination. factors such as precipitation, temperature, humidity, and ultraviolet radiation intensity are part of that environment.
indeed, each of these climatic factors can have a different impact on the epidemiology of various infectious diseases. for instance, in some viral diseases, such as dengue or malaria, higher temperatures favor the spread of the disease, whereas in the case of influenza, higher temperatures and humidity decrease the efficiency of viral transmission (lowen and steel, 2014). influenza, which is transmitted person to person in aerosol droplets, typically through coughing and sneezing, is transmitted more efficiently at cold temperatures and in dry conditions. the specific mechanisms through which cold temperatures increase the diffusion of influenza are related to the susceptibility conditions of the host or the physical properties of the virion envelope, among others. in this regard, laboratory studies analyzing both the temperature and humidity effects on the viability of sars-cov, which belongs to the same family as sars-cov-2, show that higher temperatures and humidity decrease its viability, as occurs with influenza (chan et al., 2011). in the case of sars-cov-2, which is also assumed to be propagated by droplets (who, 2020b), two main types of human-to-human transmission are likely to be at work. the first hypothesis suggests that viral particles emitted from the respiratory tract of an infected individual may land on a surface; another person could touch that object and then touch their nose, mouth or eyes, allowing the virus to sneak into the body and thereby infecting the second person (who, 2020b). the second is that of aerosol transmission. aerosol transmission is biologically plausible when infectious aerosols are generated by or from an infectious person, the pathogen remains viable in the environment for some period of time, and the target tissues in which the pathogen initiates infection are accessible to the aerosol.
thus, it is possible that people emit virus particles in a range of sizes, some of them small enough to be considered aerosols, which can stay suspended in the air for hours and then be inhaled by others (van doremalen et al., 2020). the efficiency of these transmission processes is likely to be mediated by climatic factors (duan et al., 2003; lowen and steel, 2014). a first mediation mechanism is that of temperature and heat. the reason is that colder environments may increase viral transmission, given that lower temperatures decrease metabolic functions and mucus secretion, which act as a defense barrier in hosts. hair-like organelles outside of cells that line the body's airways, called cilia, do not function as well in dry and cold conditions, and they cannot expel viral particles as well as they otherwise would. second, temperatures can affect transmission through the vehicle itself, the respiratory droplet, as the length of time a droplet remains airborne depends on its size (lowen and steel, 2014). in the winter season, when cold, dry air comes indoors and is warmed, the relative humidity indoors drops sharply, which makes it easier for airborne viral particles to travel. the intuition here is that heating systems increase indoor temperatures and dry out the air, favoring the evaporation of droplets. in such circumstances, the stronger the evaporation, the smaller the diameter of the droplet, thereby enhancing airborne suspension and the possibility of aerosol transmission in closed spaces. a third mechanism linking climatic factors to contagions refers to the impacts exerted by sunlight and ultraviolet (uv) radiation (merow and urban, 2020). this is because there is evidence suggesting that uv radiation might increase human defenses and could reduce virus stability. sunlight contains different types of uv, which might degrade the genetic material of viral particles. indeed, as shown by duan et al.
(2003), uv radiation effectively kills viruses such as sars-cov. likewise, sunlight and uv radiation increase vitamin d in humans, which enhances their immune system, reducing the likelihood of experiencing respiratory tract infections (canell et al., 2006; martineau et al., 2017). taken together, the theoretical arguments outlined above suggest that in regions with higher temperatures, higher humidity and more abundant fluxes of sunlight or uv, the efficiency of viral transmission could have been lower. however, at the empirical level, it is not yet known whether covid-19 propagates more slowly in these environments. table (1) summarizes the evidence provided by the empirical literature analyzing climate effects on covid-19. although the number of studies finding a negative sign for temperatures and humidity seems to dominate those that find either non-significant or positive effects, empirical research on the relationship between these variables and covid-19 has so far been limited and generally reaches diverging conclusions. the synthesis provided in table (1) shows that there are at least four studies finding that temperatures reduce covid-19 transmission, two observing that the range of temperatures does not matter, and two obtaining that temperatures could in fact have a positive impact on contagions. in the case of uv there are just two studies, one with a non-significant link and the other with a negative effect, whereas in the case of humidity, three of them report a reducing effect on covid-19 viral transmission and the other three a positive one. specifically, ma et al. (2020) investigate the effect of temperatures and humidity on deaths in wuhan by means of a time-series generalized additive model (gam).
by exploiting time variation in temperature data for wuhan from the 20th of january to the 29th of february, and controlling for air pollutants, they find a negative and statistically significant effect of ambient temperatures and humidity. sajadi et al. (2020) investigate the link between climatic factors within a sample of 50 cities from different countries all over the world by means of linear regression techniques, finding that temperatures 2 meters above the surface exert a negative and statistically significant effect on the total number of contagions, whereas humidity increases them. two other studies focusing on cross-sectional variation among chinese cities find a negative and statistically significant link between temperatures and contagion speed. oliveiros et al. (2020) use data from 31 chinese provinces and linear regression modeling. in their study they observe a negative link with median temperatures after controlling for humidity, wind speed and precipitation. on the other hand, wang et al. (2020c) observe a negative link between temperatures and the reproduction number of covid-19 in 100 chinese cities for the pre-intervention period, after controlling for relative humidity, gdp per capita, population density, the number of hospital beds and the share of population over 65. the effect of humidity in this analysis appears to be detrimental to contagions. as opposed to these two papers, yao et al. (2020), within a sample of 232 chinese cities and after adjusting for relative humidity and uv, report that temperatures held no significant association with the cumulative incidence rate. the descriptive analysis of bukhari and jameel (2020) finds that 90% of contagions occurred within a wide range of temperatures and humidity, between 3 and 17 degrees celsius, and suggests that reduced spreading due to environmental factors would be limited in most of northern europe and north america in the summer season.
Merow and Urban (2020), using a panel of 128 countries with country fixed effects and daily data on contagions, obtain a positive effect of the 14-day lag of mean temperatures on the growth rate of contagions, once the proportion of elderly, relative humidity and ultraviolet light are taken into account. Similarly, Pedrosa (2020) finds a non-robust positive effect of temperatures in a sample of US states and in a sample of 110 countries by looking at the early growth rate of contagions. The reasons for this diversity of results have to do with the fact that these contributions differ considerably in terms of sample composition, study period, the indicators used to measure the impact of the epidemic and the temperatures and, more importantly, the econometric and statistical approach employed to perform inference. Moreover, as discussed in Section (1), a drawback of the existing literature is that it has focused on a limited number of variables and explanatory factors, which casts doubt on the validity of these findings (e.g., Yao et al. (2020) and Sajadi et al. (2020) do not include any covariates). Accordingly, further empirical research is required to clarify the nature of the link between climatic factors and COVID-19 incidence at the regional level. This section provides preliminary evidence on the linkages between climate and COVID-19 incidence in Italian regions. To that end, we first describe the official data on incidence at the regional level in Italy, up to April 15. Data were collected daily at 12 p.m. (GMT+2) from February 24, 2020 to April 16, 2020 from the Italian Ministry of Civil Protection (MCP). The analysis considers NUTS-2 level regions rather than other possible alternatives for various reasons. Firstly, the use of data at the NUTS-2 aggregation level allows us to build a homogeneous database on a larger number of potential drivers of contagions using distinct sources.
Secondly, NUTS-2 is the territorial unit most commonly employed in the literature on regional analysis, and it is particularly relevant in terms of Italian health-care regional policy, given that health competences in Italy are decentralized (Golinelli et al., 2017). 4 However, before continuing, it is necessary to stress that the quality of the available contagion data on COVID-19 is far from optimal, since the time series on infections are affected by a severe share of under-reported cases, and Italy is no exception in this regard (see Russell et al., 2020). The reason is that only the more severe (non-asymptomatic and non-mild) cases get sick enough to seek medical help and be officially diagnosed. Therefore, the quality of the data on contagions depends to a large extent on the fraction of cases that are severe enough to both lead to medical attention and be tested. Hence, by using this metric on contagions, we are measuring "two phenomena at the same time": the infected people and the authorities' ability to detect them. Note that although this point could be a major drawback in cross-country comparative research, it is reasonable to assume that within a country these measurement errors do not differ significantly across regions, and as a consequence these concerns are of less importance in this context. 5 To characterize the dynamics of COVID-19 incidence at the regional level, we begin by analyzing changes in the evolution of the official time series of reported contagions per capita. Figure (1) shows the evolution of cumulative contagions per 100,000 inhabitants from February 24, 2020 to April 15, 2020. As observed, the figures increased substantially during March 2020, such that by March 21 all the regions in the country had more than 10 COVID-19 contagions per 100,000 inhabitants.
As shown in Figure (1), the region that increased its number of infections most rapidly, and the first to cross the threshold of 10 cases per 100,000 inhabitants, was Lombardy, on March 2. By March 24, Lombardy had more than 300 cases, and from April 5 onwards its figures lay in the interval from 500 to 700 cases per 100,000 inhabitants. The evolution of Lombardy was closely followed by the regions of Marche, Emilia-Romagna, Bolzano and Aosta Valley, which also experienced marked increases, and before March 22 all the neighbors of Lombardy located in the north of the country reported more than 100 cases per 100,000 inhabitants. 6 This pattern of contagions confirms a rapid cross-regional diffusion, such that most of the Italian regions reached their peak of contagions per capita during the last ten days of March and stabilized their growth rates during the first half of April. 7 To carry out our research and explore the effects of temperatures we also need climate data. Given that there is a time lag between person-to-person transmission, the development of symptoms, the realization of the test and the later confirmation of test results, we resort to a lagged value of temperatures to capture the fact that, if present, climatic effects should have taken some time to exert an impact on contagions. For this reason, we employ the average February temperature at each regional centroid, obtained by averaging over the time series of daily mean temperatures at 2 meters above the surface of the earth. The meteorological data were taken from the NASA Prediction of Worldwide Energy Resources (NASA-POWER) v8 GIS database. Figure (2), panel (a), shows the geographical distribution of total contagions per 100,000 inhabitants for each region on April 15. An interesting fact shown in Figure (2) is that the geographical distribution of infections has been clearly asymmetric across regions.
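The lagged climate regressor described above, the February average of a regional centroid's daily mean temperatures, amounts to a simple filtered average. The sketch below uses invented dates and values, not NASA-POWER data:

```python
# Sketch: collapse a daily mean-temperature series into the single February
# average used as the lagged climate regressor. Values are illustrative.
from datetime import date

def february_mean(daily_temps):
    """daily_temps: dict mapping datetime.date -> mean temp at 2 m (Celsius)."""
    feb = [t for d, t in daily_temps.items() if d.month == 2]
    return sum(feb) / len(feb)

# Hypothetical centroid series spanning late January through early March 2020.
series = {date(2020, 1, 31): 2.0,
          date(2020, 2, 1): 4.0,
          date(2020, 2, 15): 6.0,
          date(2020, 2, 29): 8.0,
          date(2020, 3, 1): 10.0}

print(february_mean(series))  # only the three February days enter the average
```

Only the February observations contribute, so here the regressor is (4 + 6 + 8) / 3 = 6.0.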
On the other hand, the information on temperatures is shown in panel (b) of Figure (2). As the color scale in each panel reflects the value of the variable being mapped, inspection of Figure (2) reveals that higher cumulative incidence levels up to April 15 occurred in regions with lower temperatures in February. 8 To further investigate this link, Figure (3) provides a graphical illustration of the association between average daily temperatures and COVID-19 cumulative incidence per 100,000 inhabitants by April 15. (8: The corresponding regional NUTS-2 codes and regional names shown in the maps are: ITC1 Piedmont, ITC2 Aosta Valley, ITC3 Liguria, ITC4 Lombardy, ITF1 Abruzzo, ITF2 Molise, ITF3 Campania, ITF4 Apulia, ITF5 Basilicata, ITF6 Calabria, ITG1 Sicily, ITG2 Sardinia, ITH1 Autonomous Province of Bolzano, ITH2 Autonomous Province of Trento, ITH3 Veneto, ITH4 Friuli-Venezia Giulia, ITH5 Emilia-Romagna, ITI1 Toscana, ITI2 Umbria, ITI3 Marche, ITI4 Lazio.) The scatter plot suggests the existence of a negative relationship between the magnitude of cumulative contagions and temperatures. This means that regions with higher temperatures tend to have lower levels of infections, while regions with lower temperatures are, on average, characterized by a higher incidence. Indeed, the pairwise correlation between the two variables is statistically significant (ρ = −0.723 with p-value = 0.00). 9
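A pairwise correlation of this kind is a plain Pearson computation. The sketch below uses invented region-level pairs, not the paper's data, so the resulting coefficient only mirrors the reported sign, not its value:

```python
import math

def pearson_r(x, y):
    # Pearson correlation: covariance divided by the product of the
    # standard deviations (computed here without any libraries).
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy region-level pairs (February temperature, cumulative cases per 100k):
temps = [2.0, 4.0, 6.0, 8.0, 10.0]
cases = [700.0, 500.0, 420.0, 300.0, 150.0]
r = pearson_r(temps, cases)
print(round(r, 3))  # strongly negative, mirroring the sign reported in the text
```

With real data one would also attach a p-value (e.g., via a t-test on r), which the pure-Python sketch omits.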
Nonetheless, the information provided by Figure (3) should be treated with caution, as the observed connection between temperatures and the regional epidemic size may simply be a spurious correlation resulting from an ecological fallacy or from the omission of other variables affecting both climate and infections. In view of this potential problem, in Section (5) we develop a more appropriate statistical analysis of the link between temperatures and regional COVID-19 cases. To produce a comprehensive ranking of the importance of the diverse potential factors influencing contagion differentials during the epidemic outbreak, we consider a variety of candidate factors. Specifically, we consider (i) institutional characteristics and variables related to the policy response during the epidemic, (ii) demographic characteristics, (iii) climate and environmental characteristics and (iv) the degree of social connectedness among the population. We now describe these controls and provide a brief conceptual justification for their inclusion in the analysis. Table (A2) in the appendix presents the detailed definitions and sources of all the control variables used in the paper. The first variable considered is (i) the ability of regional authorities to perform tests and detect infections at an early stage. We expect this variable to have decreased contagions, given that an early and correct diagnosis of the infection may have helped to identify potential carriers, isolate them properly and reduce the spread of the epidemic (Romer, 2020; Wang et al., 2020b). 10 To proxy the ability to monitor the epidemic in advance, we use the number of tests per 100,000 inhabitants during the first week of the epidemic outbreak, from February 24 to March 1. A second factor included refers to (ii) the time delay, or policy lag, in the application of regional lockdowns to limit viral transmission.
Given that during the early stages of the COVID-19 outbreak the number of new cases was growing at an exponential rate, time lags in the adoption of regional lockdowns relative to the stage of the epidemic outbreak may have helped to increase diffusion and total pandemic size (Orea and Alvarez, 2020; Casares and Khan, 2020). In Italy, early lockdowns were adopted by March 8 in the regions of Lombardy, Marche, Emilia-Romagna, Veneto and Piedmont, covering 16.46 million inhabitants. Two days later, a nation-wide lockdown was extended to all the regions and the whole population. We proxy the lag in the policy response by calculating the number of days with more than 1 case per 100,000 inhabitants prior to the lockdown of the region. Finally, we consider (iii) the ratio of health expenditures to GDP, because regions with a higher expenditure are more likely to have experienced fewer shortages of equipment and trained manpower to implement tests, carry out treatments and, ultimately, minimize within-hospital contagion amplification, also known as "nosocomial transmission" (Bedford et al., 2020). In addition, regions devoting a higher share of their GDP to health-care services may have been able to perform intensified contact tracing and locate cases to stop onward transmission. Thus, we expect a negative effect of higher health expenditures. Socio-demographic factors may also play a relevant role in determining reported contagions per capita. In particular, the age composition of the population is expected to have an impact on contagions. To control for differences in the demographic structure, we consider (iv) the share of population under 35 years old and (v) the share of population above 75 years old.
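The policy-lag proxy just described (days with more than 1 case per 100,000 inhabitants before the regional lockdown) reduces to a simple count over the daily series. A minimal sketch with hypothetical values:

```python
# Sketch of the policy-lag proxy: count the days on which the regional
# incidence exceeded the threshold before the lockdown date. Toy data only.
from datetime import date

def policy_lag_days(daily_cases_per_100k, lockdown_day, threshold=1.0):
    return sum(1 for d, c in daily_cases_per_100k.items()
               if c > threshold and d < lockdown_day)

# Hypothetical regional series (cases per 100,000 inhabitants):
series = {date(2020, 2, 28): 0.5,
          date(2020, 3, 1): 1.2,
          date(2020, 3, 3): 2.4,
          date(2020, 3, 6): 5.0,
          date(2020, 3, 9): 9.0}

print(policy_lag_days(series, date(2020, 3, 8)))  # three qualifying days
```

A larger count means the region spent more above-threshold days without a lockdown, i.e., a slower policy response.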
Given that there is evidence that the effects of COVID-19 are much more severe in older populations (Verity et al., 2020), and because of the limited ability to perform tests on the population, regional authorities may have prioritized detection among the elderly relative to the younger (who in many cases could have been asymptomatic carriers). For this reason, we expect regions with a higher share of elderly population to have reported more cases per capita than regions with a younger population. We also consider the fact that different degrees of social connectivity could have an impact on the spread of the disease and the probability of importing infections from abroad (Charaudeau et al., 2014). To control for regional differentials in this regard, we first consider (vi) the degree of social mobility of the population, which is expected to increase contagions (Arenas et al., 2020; Pluchino et al., 2020; Kuchler et al., 2020). Following Pluchino et al. (2020), we measure the mobility of the population as the ratio between the sum of commuting flows (incoming and outgoing) for each municipality in a region and the population employed in that municipality, using data from the Italian Ministry of Economic Policy Planning and Coordination; the regional value is obtained by averaging over all municipalities within the region. A second candidate factor to explain these differentials is (vii) population density, because transmission of SARS-CoV-2 through the population operates via close person-to-person contact, most readily by respiratory droplets, but also through contact with contaminated surfaces (WHO, 2020b) or aerosols (van Doremalen et al., 2020). Thus, compared to low-density areas, a higher population density could favor interaction and personal contact, reinforcing the transmission of the virus (AQR, 2020b).
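The mobility proxy described above (per-municipality commuting flows over employed population, averaged within the region) can be sketched as follows; the municipal figures are invented, and the exact aggregation in Pluchino et al. (2020) may differ in detail:

```python
# Sketch of the mobility proxy: for each municipality, the ratio of
# (incoming + outgoing commuting flows) to employed population, then the
# unweighted regional mean over municipalities. Toy numbers only.
municipalities = [
    {"incoming": 500, "outgoing": 700, "employed": 2000},
    {"incoming": 100, "outgoing": 300, "employed": 1000},
]

ratios = [(m["incoming"] + m["outgoing"]) / m["employed"]
          for m in municipalities]
regional_mobility = sum(ratios) / len(ratios)
print(regional_mobility)  # mean of the per-municipality ratios 0.6 and 0.4
```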
Nevertheless, this argument should be taken with caution, as some extremely dense cities, such as Singapore, Seoul and Shanghai, have outperformed many less-populated places in combating the coronavirus. As a point of fact, Fang and Wahba (2020) find no statistical relation between density and contagions in a sample of Chinese cities. The third group of factors likely to explain contagion differentials refers to the climate and environmental conditions of the region. Air pollution is considered a factor that may exacerbate the vulnerability of populations by worsening chronic lung and heart infections (Cui et al., 2003). In the context of COVID-19, existing studies have failed to establish a clear causal link between particulate pollution, contagion and the subsequent health damage (Pansini and Fonaca, 2020; Wu et al., 2020; Wang et al., 2020a). However, the intuition for a positive link is that airborne particles may have been able to serve as carriers for the pathogen (Setti et al., 2020); whether the virus remains viable after hitching a ride on pollution particles, or whether it can do so in sufficient amounts to cause infection, is unclear. In any case, to control for this potential determinant we account for (viii) environmental quality differentials. We measure environmental quality by means of the share of population exposed to more than 15 micrograms/m3 of particulate matter (PM2.5), using data from the OECD. Our expectation is that higher pollution may have exerted a positive effect on the pandemic size. Makinen et al. (2009) find that, in general, when the average temperature drops, the estimated risk of lower respiratory tract infections increases.
Apart from the potential role of (ix) temperatures, another climatic factor that has received considerable attention in the literature is (x) relative humidity (Oliveiros et al., 2020; Sajadi et al., 2020; Wang et al., 2020c). As discussed in Section (2), we expect regions with lower temperatures and lower humidity to have a higher incidence, as these two factors may have raised the efficiency of viral transmission by decreasing host resistance in outdoor environments and increasing the stability of airborne viral particles in indoor spaces. In empirical COVID-19 research, although some variations of a baseline model are often reported, basing inference on a single model has become common practice. Typically, researchers draw their conclusions from this single model, acting as if it were the true model. Nevertheless, this procedure uses a minimal fraction of the information in the data set and understates the real uncertainty associated with the specification of the empirical model (see Moral-Benito, 2015; Steel, 2017). To address this concern, in our analysis of the drivers of COVID-19 regional differentials we employ the Bayesian model averaging approach. We begin by considering the following regression model: y = α ι_N + X β + ε, (1) where y denotes an N × 1 vector of observations of the cumulative total infections per capita from February 24 to April 15 for each region i = 1, ..., N; α is the constant term; ι_N is an N × 1 vector of ones; X is an N × K matrix of regional explanatory variables with associated response parameters β contained in a K × 1 vector; and ε = (ε_1, ..., ε_N)′ is a vector of i.i.d. disturbances whose elements have zero mean and finite variance σ².
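As a minimal illustration of equation (1), the single-regressor special case admits a closed-form least-squares fit. The data below are a toy example, not the regional data set:

```python
# Minimal sketch of fitting y = alpha + beta * x + eps by ordinary least
# squares for one regressor, using the closed-form normal-equation solution.
def ols_simple(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    beta = (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))
    alpha = my - beta * mx
    return alpha, beta

x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]   # exactly y = 1 + 2x, so no residual error
print(ols_simple(x, y))    # recovers (alpha, beta) = (1.0, 2.0)
```

Each candidate sub-model in the BMA exercise is a multivariate version of this fit restricted to a subset of the K regressors.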
Bayesian model averaging (BMA) techniques have previously been applied to investigate the determinants of complex processes in the field of regional science (Crespo-Cuaresma et al., 2014; Hortas-Rico and Rios, 2019; Rios and Gianmoena, 2020), and a large literature on BMA in regression models already exists (see Moral-Benito, 2015, and Steel, 2017, for detailed reviews). To get an intuition for the BMA approach employed here to learn about the nature of COVID-19 regional disparities, notice that for any set of K possible explanatory variables the total number of possible models is 2^K, indexed by k ∈ {1, ..., 2^K}. This implies there are 2^K sub-structures of the model in equation (1), given by subsets of coefficients η_k = (α, β_k) and combinations of regressors X_k. 11 Model averaging techniques solve the question of variable importance and of the effect of each regressor h by estimating all the candidate models implied by the combinations of regressors in X (or a relevant sample of them) and computing a weighted average of all the estimates of the corresponding parameter of x_h (where the sub-index h denotes a single regressor, not a model or a combination of regressors k). By proceeding in this way, the estimates account both for the uncertainty associated with a parameter estimate conditional on a given model and for the uncertainty of the parameter estimate across different models. Following the Bayesian logic, the posterior for the parameters η_k calculated using model M_k is written as g(η_k | y, X, M_k) ∝ f(y, X | η_k, M_k) g(η_k | M_k), where g(η_k | y, X, M_k) is the posterior, f(y, X | η_k, M_k) is the likelihood and g(η_k | M_k) is the prior. The key metrics in BMA analysis are the posterior mean (PM) of the distribution of η, E(η | y, X) = Σ_{k=1}^{2^K} P(M_k | y, X) E(η_k | y, X, M_k), and the posterior standard deviation (PSD), PSD = [Var(η | y, X)]^{1/2}. 11 We consider 10 potential explanatory variables in our baseline analysis; thus, the cardinality of the model space in this context is 2^10 = 1024 models, based on different combinations of regressors.
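The two ingredients just described, enumerating the 2^K model space and probability-weighting the conditional estimates, can be sketched as follows. The posterior model probabilities and conditional moments below are illustrative numbers, not estimates from the paper:

```python
import math
from itertools import combinations

# Enumerate the model space: with K candidate regressors there are 2**K
# sub-models, one per subset of included variables.
K = 10
model_space = [m for r in range(K + 1)
               for m in combinations(range(K), r)]

# BMA point estimates for one parameter eta_h: a posterior-probability-
# weighted average of the conditional estimates across three toy models.
post_prob = [0.5, 0.3, 0.2]            # P(M_k | y, X), illustrative
cond_mean = [-30.0, -40.0, -35.0]      # E(eta_h | y, X, M_k)
cond_var = [4.0, 9.0, 6.0]             # Var(eta_h | y, X, M_k)

pm = sum(p * m for p, m in zip(post_prob, cond_mean))
# Law of total variance: E[Var] + Var[E], computed via second moments.
second = sum(p * (v + m * m) for p, m, v in zip(post_prob, cond_mean, cond_var))
psd = math.sqrt(second - pm * pm)
print(len(model_space), pm, round(psd, 2))  # 1024 models; PM and PSD follow
```

The PSD is larger than any single model's conditional standard deviation would suggest, because it also carries the across-model spread of the conditional means.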
In Section (6.2.4), 15 regressors are employed, which implies a model space consisting of 2^15 = 32,768 possible models. The variance Var(η | y, X) is given by Var(η | y, X) = Σ_{k=1}^{2^K} P(M_k | y, X) [Var(η_k | y, X, M_k) + E(η_k | y, X, M_k)²] − E(η | y, X)². To derive these metrics, it is necessary to calculate the posterior model probability P(M_k | y, X) of each of the sub-models M_k. These can be obtained as P(M_k | y, X) = p(y, X | M_k) P(M_k) / Σ_{j=1}^{2^K} p(y, X | M_j) P(M_j), where p(y, X | M_k) is the marginal likelihood and P(M_k) is the prior model probability. The marginal likelihood of model k is calculated as p(y, X | M_k) = ∫ p(y, X | η_k, σ², M_k) p(η_k, σ² | g) dη_k dσ², where p(y, X | η_k, σ², M_k) is the likelihood of model k and p(η_k, σ² | g) is the prior distribution of the parameters in model M_k conditional on g, the Zellner g-prior hyper-parameter. In addition, the BMA framework can be extended to generate probabilistic statements on the relevance of the various regressors, using the posterior inclusion probability (PIP) of a variable h, PIP(h) = P(η_h ≠ 0 | y, X) = Σ_{k: x_h ∈ M_k} P(M_k | y, X), and the conditional posterior positivity of h, P(η_h > 0 | y, X, η_h ≠ 0), where values of the conditional positivity close to 1 indicate that the parameter is positive in the vast majority of considered models and values close to 0 indicate that the effect on the dependent variable is negative. The calculation of the previous metrics in the BMA approach requires defining priors on the model space and priors on the parameter space. We use a model-specific Zellner g-prior based on the Bayesian Risk Inflation Criterion (BRIC), whereas we use a calibrated binomial prior on the model space to give the same a priori probability to each model and variable.
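A PIP is simply the total posterior mass of the models that include a regressor, and the conditional positivity re-weights the per-model sign probabilities over that mass. A toy two-regressor sketch, with invented probabilities:

```python
# Toy posterior model probabilities over the four models spanned by two
# regressors "a" and "b" (numbers are illustrative, not estimated).
models = [
    (frozenset(), 0.10),
    (frozenset({"a"}), 0.40),
    (frozenset({"b"}), 0.15),
    (frozenset({"a", "b"}), 0.35),
]

def pip(h):
    # PIP(h): sum of P(M_k | y, X) over the models that include h.
    return sum(p for included, p in models if h in included)

def positivity(h, cond_sign_prob):
    # Conditional positivity: inclusion-mass-weighted share of models in
    # which the coefficient of h is positive; cond_sign_prob maps the model
    # (as a frozenset) to P(eta_h > 0 | y, X, M_k).
    mass = pip(h)
    return sum(p * cond_sign_prob[included]
               for included, p in models if h in included) / mass

# Suppose the coefficient of "a" is negative in every model that includes it:
signs = {frozenset({"a"}): 0.0, frozenset({"a", "b"}): 0.0}
print(pip("a"), positivity("a", signs))  # high PIP, positivity of 0.0
```

A positivity of 0.0, as for temperatures in the paper's Table (2), means the sign is negative regardless of which models the variable enters.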
Following early discussions of Bayesian measures of variable importance, the PIPs in equation (8) have become a standard tool for interpreting the results in econometric applications of BMA (Doppelhofer and Weeks, 2009). However, although they provide valuable insight into the overall importance of a single variable, they neglect the interdependence of the inclusion and exclusion of variables. Thus, PIPs do not help to establish whether the importance of a variable is evenly spread out across all model specifications or is specific to a certain combination of explanatory variables. To gain insight into the interdependence of the inclusion of sets of different variables, several studies investigate the joint posterior of pairs of variables. Jointness reveals generally unknown forms of dependence: positive jointness implies that regressors are complements, representing distinct but mutually reinforcing effects, whereas negative jointness implies that explanatory variables are substitutes and capture similar underlying effects. To analyze this issue, we follow Doppelhofer and Weeks (2009), who propose the use of the log of a cross-product ratio of inclusion probabilities. For any pair of arbitrary variables a and b from the set of K potential variables, we calculate their bivariate jointness as J_ab = ln[(P(ab) P(¬a¬b)) / (P(a¬b) P(¬ab))], where values of the J statistic above 1 are considered evidence of significant complementarity, values below -1 suggest significant substitutability, and values between -1 and 1 suggest independence. Here ¬a indicates that variable a is excluded, so that, for example, P(¬a¬b) denotes the posterior probability of models in which neither a nor b occurs, and so on. Table (2) reports the results obtained from the BMA analysis. However, before continuing with the discussion of the results, it is worth mentioning the problems that the methodology applied here can solve and those that may persist, affecting the quality of the estimates.
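The jointness statistic is a one-line computation once the four joint inclusion probabilities of a pair of regressors are known. The two patterns below are invented to illustrate the complement and substitute cases:

```python
import math

def jointness(p_both, p_a_only, p_b_only, p_neither):
    # Doppelhofer-Weeks jointness: log cross-product ratio of the posterior
    # inclusion pattern of two regressors a and b.
    return math.log((p_both * p_neither) / (p_a_only * p_b_only))

# Illustrative inclusion patterns (each set of probabilities sums to 1):
complements = jointness(0.40, 0.10, 0.10, 0.40)   # a and b mostly co-appear
substitutes = jointness(0.05, 0.45, 0.45, 0.05)   # a and b rarely co-appear
print(round(complements, 2), round(substitutes, 2))
```

The first pattern gives ln(16) ≈ 2.77 > 1 (significant complementarity), the second ln(1/81) ≈ -4.39 < -1 (significant substitutability), matching the thresholds stated in the text.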
The strong point of the BMA methodology employed here is that it accounts for the uncertainty of the parameter estimates across different models while controlling for omitted-variable bias and reducing multicollinearity problems (Moral-Benito, 2015; Steel, 2017). (12: In particular, the g-prior hyper-parameter takes the value g_k = max(N, K²). The binomial prior on the model space regulates prior model probabilities, where each covariate k is included in the model with probability of success φ. We set φ = 0.5, which implies a prior model size of 5 regressors.) Nevertheless, it does not correct for the potential negative effect of endogeneity generated by reverse causal relationships, or for measurement errors. The issue of reverse causality is likely to be at work in the case of the proactive-testing proxy variable, whereas measurement errors can be an issue in the dependent variable, the number of contagions per 100,000 inhabitants. Therefore, to minimize the potential problems caused by reverse causality and the lack of an appropriate instrument, tests are measured with respect to the total population during the first week of the epidemic (i.e., six weeks before the measurement of the dependent variable). On the other hand, the issue of measurement errors in the number of contagions is addressed in detail in Section (6.2.2). We use the PIPs of the different variables to classify the evidence of robustness of the different contagion drivers, such that regressors with PIPs above the a priori inclusion probability (APIP) are considered relevant determinants, and variables with PIP < APIP as irrelevant ones.
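The prior settings in footnote 12 pin down two numbers. Assuming the sample consists of N = 21 regional observations (the NUTS-2 units listed earlier; an inference, not a figure stated in this passage) and K = 10 regressors:

```python
# Hyper-parameters implied by footnote 12 under the assumption N = 21, K = 10.
N, K = 21, 10
g = max(N, K ** 2)            # BRIC rule: g_k = max(N, K^2)
phi = 0.5                     # prior inclusion probability of each regressor
prior_model_size = phi * K    # implied prior mean model size
print(g, prior_model_size)    # g = 100; prior mean size = 5 regressors
```

With K² = 100 > N, the BRIC rule reduces to g = K² here, and φ = 0.5 yields the prior model size of 5 regressors quoted in the footnote and used as the benchmark against the posterior mean size of 3.4.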
From a Bayesian perspective, variables with higher PIPs than others reflect a higher importance, as they are more likely to be part of the data-generating process (DGP), and as such they can be considered a relevant piece in explaining contagion differentials across Italian regions. 13 As observed in column (1) of Table (2), there is a small group of top variables forming the group of important determinants: the mean temperatures (97.5%) and the policy lags (88%), which are both statistically significant at the 1% level. On the frontier of relevance/irrelevance we also find the variable measuring relative humidity conditions, with a PIP of 44% but with a statistically significant effect on contagions. It is somewhat surprising that the remaining factors considered, although they display the expected signs and are mostly consistent with those obtained in the literature (i.e., a positive link between contagions and the share of old population, social mobility, population density and air pollution, and a negative one for the share of young population, health spending and tests), do not appear to be relevant in explaining regional differentials across Italian regions. Overall, our findings suggest that by focusing only on the role played by climate conditions and the timing of the policy response, we have enough information to explain why some regions have been affected more than others. We now turn our attention to the model-averaged estimates of our regional-level variables, as they provide the basis for posterior inference on the parameters. Model-averaged estimates of each variable are constructed using all possible model combinations implied by our set of variables and computing a probabilistically weighted average of all the estimates of the corresponding parameter. The results displayed in Table (2) correspond to the estimation of all the models in the model space including any combination of the 10 variables. 14
(13: A common concern in BMA is that PIPs may simply reflect the strength of the correlations between the explanatory variables and the dependent variable. However, this is not an issue here, since the correlation between the PIPs and the pairwise correlations between contagions and the set of potential drivers is about 0.43 in this context. This suggests that the probabilistic importance ranking obtained from the analysis performed here contains valuable information that cannot be learned from the mere computation of correlations.) (14: We do not resort to conventional Markov chain Monte Carlo model composition (MC³) algorithms, as the enumeration of all the models is computationally feasible in this context.) Notes to Table (2): column (1) reports the posterior inclusion probability; columns (2) and (3) reflect the conditional posterior mean and the standardized conditional posterior mean of the linear marginal effect of each variable, respectively; column (4) is the sign certainty probability, a measure of our posterior confidence in the sign of the coefficient; standard deviations in parentheses; * significant at the 10% level, ** significant at the 5% level, *** significant at the 1% level. The estimated posterior model size has a mean of 3.4 variables, lower than its expected a priori size of 5 variables, which is in line with the finding that very few candidate determinants are relevant and with a parsimonious model specification. As regards our variable of interest, we find that temperatures exert a negative impact on regional contagion outcomes, with a posterior mean of -34.296 and a standardized mean effect of -0.683.
This means that an increase of 1 degree Celsius in the mean regional temperature during February decreased the number of total contagions per 100,000 inhabitants by April 15 by 34.29. Alternatively, the estimated standardized posterior mean coefficient in column (3) reveals that, on average, a 1-standard-deviation shock to temperatures (i.e., a change of about 4.3 degrees Celsius, which is the difference between the average temperatures of Lombardy (ITC4) and Calabria (ITF6)) had the effect of decreasing the total number of contagions per 100,000 inhabitants by 147.54, which is the difference between the highly affected region of Marche (ITI3) and Tuscany (ITI1). Moreover, as shown in column (4), the posterior sign positivity of temperatures is 0.00%, which implies that the parameter estimate is negative irrespective of the model in which the variable appears. This negative effect of temperatures on regional contagions is in line with previous literature and supports the findings of Ma et al. (2020), Oliveiros et al. (2020), Sajadi et al. (2020) and Wang et al. (2020c), while contradicting those of Merow and Urban (2020), Pedrosa (2020) and Yao et al. (2020). As explained before, the rationale for a negative effect of temperatures on the number of contagions has to do with the increased susceptibility of hosts due to a slower metabolism lowering their defenses, and with the fact that in winter, when cold air comes indoors and is warmed by our heating systems, the relative humidity indoors drops sharply, making it easier for airborne viral particles to travel and be transmitted from one person to another as an aerosol. We also find that the effect of relative humidity helps to explain part of the observed regional heterogeneity. In particular, the estimated effect is negative in all the models, which suggests that regions with higher outdoor humidity experienced lower contagion rates. This finding supports the previous results of Wang et al.
(2020c) and Ma et al. (2020). The finding of a positive and statistically significant impact of a late policy response on the number of infections is also in line with the results obtained in other studies (Orea and Alvarez, 2020; Alvarez et al., 2020). Specifically, we find that each additional day in which the lockdown was not implemented, counted from the date on which 1 case per 100,000 inhabitants was detected, meant an increase of 28.230 cases per 100,000 inhabitants by April 15. To put this figure in context, these estimates imply that in Lombardy, implementing the full lockdown on March 1 instead of March 8 would have decreased the total number of cases by 32%, lowering the incidence toll from 62,153 cases to 42,226 cases. This result is lower than the estimates provided in the empirical study of Orea and Alvarez (2020), who find that for Spanish regions the implementation of the lockdown one week before the date it occurred would have reduced cases by approximately 50%. This difference in magnitudes can be explained by the fact that, even though Italy was one of the first countries to implement a lockdown, in relative terms the date on which this policy measure took place was late, such that the potential for flattening the epidemic curve was lower. To complement previous results and to gain further insights that might be useful for policy-making, we carry out an analysis of the posterior jointness to detect PIP dependencies among regressors. Table (3) reports the posterior jointness relationships of the different determinants included in the analysis, as calculated by the metric proposed by Doppelhofer and Weeks (2009).
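The Lombardy counterfactual above can be checked with back-of-the-envelope arithmetic; the population figure of roughly 10.06 million is an outside approximation, not a number taken from the paper:

```python
# Back-of-the-envelope version of the Lombardy counterfactual: 28.230 extra
# cases per 100,000 inhabitants per day of lockdown delay, a 7-day earlier
# lockdown (March 1 vs March 8), and an assumed Lombardy population of
# about 10.06 million (an approximation, not a figure from the paper).
cases_per_100k_per_day = 28.230
days_earlier = 7
population_in_100k = 10.06e6 / 1e5

avoided_cases = cases_per_100k_per_day * days_earlier * population_in_100k
reduction_share = avoided_cases / 62153   # observed cumulative cases
print(round(avoided_cases), round(reduction_share, 2))
```

The avoided-case count lands close to the 62,153 − 42,226 = 19,927 gap quoted in the text, and the implied reduction share is about 32%, consistent with the stated figure.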
The investigation of jointness is relevant from a policy-making perspective, as it helps to better understand which regional factors may reinforce or hamper the efforts of increasing resilience against the COVID-19 epidemic outbreak. The main findings are as follows. We find evidence of both positive and negative jointness among contagion determinants. Significant positive jointness (with J > 1) is not restricted to variables considered significant by the PIPs. In this respect, a more complex explanation emerges, given that variables such as the health spending ratio to GDP, which seems to decrease contagions, appear to be a strong substitute of policy lags and temperatures. This suggests that in the absence of a warm climate or a fast policy response, a higher health-spending-to-GDP ratio could become a relevant factor to decrease contagions. This can be attributed to both the higher ability of monitoring and isolating cases and the possibility of reducing "nosocomial transmission" because of increased access to protection equipment. Other codependencies with respect to the policy lags are worth mentioning. Conditional on a delayed implementation of the lockdowns, we observe that a pro-active testing policy to detect infections becomes a more relevant determinant and may help to decrease contagions. In addition to this, we find a high degree of substitutability between policy lags and social mobility, which suggests that restrictions to mobility may produce an effect similar to that of the lockdown of the entire region. Therefore, the results of this posterior jointness analysis reveal that there is room for policy-makers to increase resilience against the epidemic. Place-based oriented policies aiming at increasing health-care capacities and resources, or investments that help workers and firms to adopt teleworking practices that reduce social mobility, might be important for the re-activation of the economy while keeping the number of contagions down. We now investigate whether the finding of temperatures being the most relevant determinant of contagion differentials across Italian regions is robust to changes in the set-up of the analysis. Specifically, in Section (6.2.1) we first analyze whether the results regarding the probabilistic importance of the variables are driven by the prior distributions placed on the parameters and the distribution of the model size. In Section (6.2.2) we check whether the rankings are robust to the importance weighting procedure. To that end, we develop a novel frequentist model averaging framework extending Hansen and Racine (2012), which produces importance rankings using jackknife model averaged variable weights. In Section (6.2.3) we study whether correcting for measurement errors in the number of contagions affects the validity of previous findings. In addition, we analyze whether the results are affected by using alternative measurements of temperatures, such as the minimum or the maximum temperatures, or whether using data from a different source matters for the validity of our conclusions. Finally, in Section (6.2.4) we check whether the rankings are robust to the extension of the model space by increasing the number of candidate determinants. An implication of the Bayesian model averaging approach is that inferences drawn on the effects of the various factors and their importance depend on the prior distributions assigned to the model parameters and on the model space.
To verify that the conclusions do not depend on subjective prior information implied by the elicitation of our prior distributions on the parameters and the model space, we now discuss the results of the different robustness checks performed along this line. We first check the sensitivity of our results to the definition of the binomial prior on the model space, with φ = 0.5, which implies a-priori inclusion probabilities (APIPs) of 50% for the variables. Thus, we depart from the baseline specification and set the prior model size to 6, 7 and 8 regressors respectively, by adjusting φ in each case. As shown in Table (4), increasing the prior model size has a stronger effect on the PIPs than the g-prior, since the employment of priors favoring large model sizes slightly increases the PIPs of most of the potential determinants. However, when comparing the APIPs of each variable with the estimated PIPs implied by the data, we do not find any significant change in our previous results. Temperatures and policy lags remain the most robust determinants of contagion differentials, with PIPs that are always above 90%. 15 For an alternative graphical representation of the results in Table (… The only difference appears if one holds an a-priori view of COVID-19 contagions being driven by a larger set of potential variables, for prior model sizes equal to 7 and 8 (i.e., assuming variables have APIPs of 70% and 80%, respectively).
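The relation between the binomial model prior parameter φ, the prior mean model size and the APIPs used in this check can be sketched as follows; K = 10 candidate regressors is assumed here, matching the model space described in the table notes:

```python
# Binomial model prior: each of K candidate regressors enters a model
# independently with probability phi, so the prior mean model size is K*phi
# and the a-priori inclusion probability (APIP) of every variable equals phi.

def phi_for_prior_size(mean_size, K):
    """phi implied by a desired prior mean model size."""
    return mean_size / K

K = 10  # assumed number of candidate determinants
for m in (5, 6, 7, 8):
    phi = phi_for_prior_size(m, K)
    print(f"prior mean model size {m} -> phi = {phi:.2f}, APIP = {phi:.0%}")
```

This reproduces the correspondence stated in the text: a prior mean model size of 5 with K = 10 gives φ = 0.5 (APIP 50%), while sizes 7 and 8 give APIPs of 70% and 80%.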
In this context, one should also consider relative humidity as a relevant driver of contagion differentials. However, the categorization of all the other variables as "unimportant factors" does not change. Secondly, we consider a variety of different g-prior specifications on the parameters while holding fixed φ = 0.5. These prior distributions differ from our baseline g-prior, the BRIC, which sets g = max(N, K²). In this group of g-prior distributions we consider (i) the unit information prior (UIP), which sets g = N; (ii) the risk inflation criterion prior (RIC), where g = K²; and (iii) the empirical Bayes prior (EBL), which is a model-k-specific g-prior estimated via maximum likelihood. Finally, we consider the hyper-g prior, which relies on a Beta prior on the shrinkage factor of the form g/(1+g) ∼ Beta(1, a/2 − 1), where in this specific case a = 3.5. In this context, temperatures and policy lags always appear to be the most important variables. As regards the role of relative humidity, we find that when employing the UIP, EBL or hyper-g priors, relative humidity displays PIPs above the APIPs. Taken together, the results of these robustness checks suggest that the main results are not driven by prior information, and that only if one is willing to accept more candidate determinants as part of the explanation of cross-regional differentials may the role of humidity change. We now investigate whether a frequentist model averaging approach produces similar results. Despite the similarity in spirit and objectives between frequentist model averaging (FMA) and BMA, the two techniques differ in their approach to inference (Moral-Benito, 2015). As explained by Steel (2017), FMA analyses tend to focus on estimators and their properties, and do not require a prior on the parameters. In FMA, the parameters are treated as fixed, yet unknown, and are not assigned any probabilistic interpretation associated with prior knowledge or learning from data.
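Before turning to the FMA, the fixed g-prior choices compared above can be summarized in a short sketch (the data-dependent EBL and hyper-g priors are omitted, as their g values depend on the model and data); the dimensions n and K below are illustrative assumptions, not the paper's exact sample:

```python
def g_prior(name, n, K):
    """Fixed g-prior values discussed in the text (sketch only; the EBL and
    hyper-g priors are model/data dependent and therefore omitted)."""
    if name == "BRIC":   # benchmark prior: g = max(N, K^2)
        return max(n, K ** 2)
    if name == "UIP":    # unit information prior: g = N
        return n
    if name == "RIC":    # risk inflation criterion prior: g = K^2
        return K ** 2
    raise ValueError(name)

# Assumed dimensions: ~20 Italian regions, 10 candidate regressors.
n, K = 20, 10
for name in ("BRIC", "UIP", "RIC"):
    print(name, g_prior(name, n, K))
```

With K² > N, as here, BRIC and RIC coincide, which is why the text singles out the UIP, EBL and hyper-g priors as the specifications that actually shrink differently from the baseline.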
To date, the empirical literature on FMA has mainly focused on forecast combinations (Steel, 2017). However, as we show below, this approach can also be employed to derive metrics of great interest to analyze the importance of the potential drivers of COVID-19 differentials. Let the FMA estimator of η be given by:

$\hat{\eta}^{FMA} = \sum_{k=1}^{2^K} \omega_k^m \hat{\eta}_k$,

where $0 \leq \omega_k^m \leq 1$ and $\sum_{k=1}^{2^K} \omega_k^m = 1$, such that it integrates both model selection and parameter estimation. Following Buckland et al. (1997), the estimated standard error in FMA is given by:

$se\left(\hat{\eta}^{FMA}\right) = \sum_{k=1}^{2^K} \omega_k^m \sqrt{\hat{\tau}_k + \hat{b}_k^2}$,

where $\hat{\tau}_k$ is the estimated variance of the parameters in model k and $\hat{b}_k^2 = \left(\hat{\eta}_k^m - \hat{\eta}^{FMA}\right)^2$ is a term that captures the uncertainty across different models. As discussed by Moral-Benito (2015), FMA estimators crucially depend on the weights selected for estimation, as their asymptotic properties may differ substantially. In this analysis we consider the approach of Hansen and Racine (2012), who suggest a weighting procedure based on the minimization of a cross-validation criterion, also known as jackknife model averaging (JMA), that is appropriate for general linear models with heteroskedasticity. Cross-validated optimal weights in this context are obtained by solving:

$\hat{\omega} = \arg\min_{\omega} CV(\omega) = \frac{1}{n}\, \tilde{e}(\omega)' \tilde{e}(\omega)$,

where $\tilde{e}(\omega) = y - \tilde{\mu}(\omega)$, $\tilde{\mu}(\omega) = \sum_{k=1}^{2^K} \omega_k X_k \tilde{\eta}_k$, and $\tilde{\eta}_k$ denotes the jackknife least-squares estimator of model k. In the FMA context it is possible to sum over the model weights $\omega_k^m$ to derive a metric of variable importance similar to that of the PIPs in the BMA: the frequentist variable weight (FVW).
That is, for a variable h we calculate the FVW as the sum of the weights of the models including variable h:

$FVW_h = \sum_{k:\, x_h \in M_k} \omega_k^m$.

As FVWs have never been used before to assess importance, in the online Appendix C we provide the results of a Monte Carlo simulation study showing that the variable weights can be used to generate an accurate ranking of the relative importance of a regressor. In addition, to complement this metric, we also calculate the frequency of t-stats above 1.96, which corresponds to a significant covariate at the 5% level, as in Sala-i-Martin et al. (2004). Table (5) shows the results derived from the jackknife model averaging analysis. Column (1) reports the FVWs, column (2) the standardized coefficient estimate, and column (3) the fraction of models where the t-stats of the variable were above 1.96 and the variable therefore appeared to be significant at the 5% level. The main result of temperatures being the most important determinant of COVID-19 regional contagion differentials is confirmed in this context, as the FVW of the variable is 85%, clearly above the FVWs of the other predictors included. Importantly, from a frequentist inference perspective, the parameter of temperatures appears to be significant at the 5% level in all the models, which reinforces our view of this variable being the most important driver. The most remarkable difference in this context is that a 1-standard-deviation shock in temperatures is estimated to produce a decrease of 0.536 standard deviations in contagions, a slightly lower magnitude than that obtained in the BMA.
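The JMA weighting and the resulting FVWs can be sketched on synthetic data with two regressors. This is only an illustration of the Hansen–Racine idea, not the paper's MATLAB implementation: the data-generating process is invented, and a coarse grid search over the weight simplex stands in for the quadratic program:

```python
import numpy as np
from itertools import combinations, product

rng = np.random.default_rng(0)

# Synthetic data (hypothetical): y depends on x0 only; x1 is pure noise.
n = 60
X = rng.normal(size=(n, 2))
y = 1.5 * X[:, 0] + rng.normal(size=n)

# Model space: all non-empty subsets of the two candidate regressors.
models = [s for r in (1, 2) for s in combinations(range(2), r)]

# Jackknife (leave-one-out) residuals per model via the hat-matrix shortcut
# e_loo_i = e_i / (1 - h_ii).
E = np.empty((n, len(models)))
for m, subset in enumerate(models):
    Xm = X[:, list(subset)]
    H = Xm @ np.linalg.solve(Xm.T @ Xm, Xm.T)
    E[:, m] = (y - H @ y) / (1 - np.diag(H))

# JMA weights: minimise the CV criterion ||E w||^2 / n over the simplex.
best, w = np.inf, None
for w0, w1 in product(np.linspace(0, 1, 101), repeat=2):
    if w0 + w1 > 1 + 1e-12:
        continue
    cand = np.array([w0, w1, max(0.0, 1 - w0 - w1)])
    cv = np.sum((E @ cand) ** 2) / n
    if cv < best:
        best, w = cv, cand

# Frequentist variable weight: total weight of the models containing h.
fvw = [sum(w[m] for m, s in enumerate(models) if h in s) for h in range(2)]
print("JMA weights:", np.round(w, 2), " FVW:", np.round(fvw, 2))
```

Because y is driven by x0, the models containing x0 absorb essentially all the cross-validated weight, so the FVW of x0 approaches 1 while the FVW of the noise regressor stays well below it.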
Notes: the results reported correspond to the estimation of all the models in the model space including any combination of the 10 variables. The jackknife (or cross-validation) choice of weight vector that solves equation (13) is obtained using the quadprog command in MATLAB. The MATLAB codes that extend the GMA function developed by Hansen and Racine (2012) to compute FVWs and standard errors can be provided upon request. Standard deviations in parentheses; * significant at 10% level, ** significant at 5% level, *** significant at 1% level. As regards the other variables, the result of FVWs > 0.5 in column (1) holds for the policy lag variable and for relative humidity. Nevertheless, in the FMA two additional variables display FVWs > 0.5: the tests performed and the degree of social mobility of the population. The effect of testing appears to be weakly significant and negatively related to the size of the epidemic, whereas that of social mobility is positive and significant at conventional 5% levels. Nevertheless, when looking at the fraction of t-stats in which these factors are significant, we find that they are only significant in 11% and 24.6% of the models, respectively. Therefore, the models where these two variables did not appear to be relevant were precisely those with very large variances and noisy parameter estimates. The opposite occurs for the health spending variable, which only obtains an FVW of 3.4%, but whose negative impact on contagions appears to be highly significant.
The results stemming from the FMA analysis confirm that temperatures are the key factor explaining differentials but, interestingly, they also point out that policy variables like health spending and social mobility, which were discarded by the PIPs in Table (2), may indeed have played a more relevant role than previously suggested. Therefore, the results of the FMA do not alter our main result, but place more emphasis on the epidemic management and actions of the policy-makers than the BMA. A third concern with the validity of previous results is that measurement errors are likely to be present in our key dependent variable, given that a large share of cases may not have been detected. To address this issue, we estimate the percentage of symptomatic COVID-19 cases reported in the Italian regions using case fatality ratio estimates, correcting for delays between confirmation and death following Russell et al. (2020).16 These authors propose to proceed in this way because, in real time, the division of cumulative deaths (D_t) by cases (C_t) leads to a biased estimate of the case fatality ratio (CFR), since this calculation does not account for (i) delays from confirmation of a case to death, and (ii) under-reporting of cases. The approach used here to obtain an accurate estimate of the size of regional infections consists of two steps. First, we use an estimated distribution of the delay from hospitalisation to death for known cases that are fatal, which allows us to adjust the naive estimates of the CFR to account for these delays.17 However, this corrected CFR (cCFR) does not account for under-reporting. Given that the best available estimates of the CFR (after controlling for under-reporting) from large studies in China and South Korea are in the 1%-1.5% range, we assume a baseline CFR of 1.4% for our analysis. Thus, if a region has a cCFR that is higher than this threshold value, it means that only a fraction of cases are being reported.
In such a case, the excess of the observed cCFR can be attributed to under-reporting. Specifically, under-reporting u of known cases is calculated for each region i and date t as:

$u_{it} = \frac{\sum_{s=0}^{t} \sum_{j=0}^{s} c_{i,s-j}\, f_j}{\sum_{s=0}^{t} c_{i,s}}$,

where $u_{it}$ represents the underestimation of the proportion of cases with known outcomes, $c_{it}$ is the daily incidence at time t and $f_t$ is the proportion of cases with a delay of t periods between confirmation and death. We obtain the cCFR as $cCFR_{it} = D_{it} / (u_{it} C_{it})$ and, once we have the $cCFR_{it}$ in hand, we estimate reporting as $r_{it} = 1.4\% / cCFR_{it}$. This allows us to calculate new daily and cumulative incidence time series as $\hat{c}_{it} = c_{it} / r_{it}$ and $\hat{C}_{it} = C_{it} / r_{it}$. 16 We do not attempt to estimate both symptomatic and asymptomatic cases, as the precise share of asymptomatic cases is not yet clear and, in the context of the approach set out by Russell et al. (2020), this adjustment would just change the estimation by a scalar k, leaving the ordering and the variability unchanged. 17 We assumed the delay from confirmation to death followed the same distribution as the estimated hospitalisation-to-death delay, based on data from the COVID-19 outbreak in Wuhan, China, between the 17th December 2019 and the 22nd January 2020. Thus, the distribution used is a lognormal with a mean delay of 13 days and a standard deviation of 12.7 days. The results obtained after performing the BMA with this corrected time series of contagions $\hat{c}_{it}$ for each region are shown in Table (6). As shown, the results using this new dependent variable confirm that temperatures are the most important determinant. In fact, when using this correction, temperatures appear to be the only variable with a posterior probability above the APIP (i.e., PIP = 85.4% > APIP = 50%).
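The two-step correction can be sketched on synthetic data. The lognormal delay parameters (mean 13 days, sd 12.7 days) and the 1.4% baseline CFR come from the text; the epidemic curve and the 25% true reporting rate below are invented for the illustration:

```python
import numpy as np

# Discretised lognormal confirmation-to-death delay: mean 13 d, sd 12.7 d.
mean, sd = 13.0, 12.7
s2 = np.log(1 + (sd / mean) ** 2)
mu = np.log(mean) - s2 / 2
d = np.arange(1, 61)
f = np.exp(-(np.log(d) - mu) ** 2 / (2 * s2)) / (d * np.sqrt(2 * np.pi * s2))
f /= f.sum()                       # delay distribution f_t over 60 days

# Synthetic region: exponential epidemic; only 25% of true cases reported.
T = 50
true_cases = 20 * np.exp(0.15 * np.arange(T))
reported = 0.25 * true_cases
baseline_cfr = 0.014
deaths = baseline_cfr * np.convolve(true_cases, f)[:T]   # deaths lag cases

# Step 1: u_t -- share of reported cases whose (potential) death would
# already be observable at time t, given the delay distribution.
known = np.cumsum(np.convolve(reported, f)[:T])
u = known / np.cumsum(reported)

# Step 2: delay-corrected CFR, implied reporting rate, corrected series.
ccfr = np.cumsum(deaths) / known          # D_t / (u_t * C_t): known = u * C
reporting = np.minimum(1.0, baseline_cfr / ccfr)
corrected = np.cumsum(reported) / reporting
print(f"estimated reporting rate at T: {reporting[-1]:.2f}")
```

On this synthetic example the procedure recovers the assumed 25% reporting rate, and the corrected cumulative series scales the observed counts back up to the true epidemic size.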
The estimated posterior mean impact of a one-degree-Celsius increase is in this case 21.8 times higher than that obtained when using the official data in Table (2). This can be explained by the fact that the estimated under-reporting of cases in Italian regions is quite high, ranging between 81.45% in Lombardy and 98.26% in Molise. Nevertheless, the standardized posterior mean estimate is -0.613 standard deviations (in line with the baseline model) and the posterior sign positivity is 0%, which implies that the statistically significant negative link is robust to all model specifications in this context as well.18 18 We also provide the corresponding estimates of the BMA robustness checks shown in Tables (7) and (8) when using the corrected contagion series in Tables (A3) and (A4) of Appendix A. Notes: the dependent variable in all regressions is the total number of contagions per 100,000 inhabitants after correcting for under-reporting using equation (17). The results reported correspond to the estimation of all the models in the model space including any combination of the 10 variables. We use a fixed BRIC g-prior on the parameters and a binomial model prior with a prior mean model size equal to 5 in all cases. Variables are ranked by column (1), the posterior inclusion probability. Columns (2) to (3) report the conditional posterior mean and the standardized conditional posterior mean for the linear marginal effects of each variable, respectively. Column (4) is the sign certainty probability, a measure of our posterior confidence in the sign of the coefficient. Standard deviations in parentheses; * significant at 10% level, ** significant at 5% level, *** significant at 1% level. Another issue that deserves attention and that might affect the validity and interpretation of our previous results is that of the measurement of temperatures, as it could be that the maximum or the minimum daily temperatures are those favoring or hampering viral transmission. Although they are correlated with the mean temperatures, they imply upper and lower bounds, potentially producing a different impact on contagions. Table (7) reports the results obtained when using these two measurements of temperatures. We use the average maximum and minimum temperatures 2 meters above the surface at the centroid of each region, using data taken from the NASA-POWER v8 GIS database. The results of the benchmark analysis are shown in columns (1) to (3) for the sake of comparison, while columns (4) to (6) and columns (7) to (9) display those obtained for the minimum and maximum temperatures, respectively. As observed, both minimum and maximum temperatures also appear to be more relevant than the other determinants, with PIPs of 95.2% and 99.4%. In either case, the estimated standardized coefficient is always negative across all model specifications, in the range of -0.66 to -0.71 standard deviations. This means that the most extreme daily values do not alter our results. To complement this information we use a different metric of temperatures, obtained from the online climate portal "ilmeteo.it", which reports historical monthly measurements of minimum and maximum temperatures for different provinces and meteorological stations scattered within the boundaries of each of the regions.
This metric departs from the meteorological measurement performed at the centroid of the region, given that in this case the average maximum and minimum temperatures for all the spatial data points within each region are used to produce a regional estimate of mean temperatures. Again, when using this alternative measurement of temperatures, we find it to be the most relevant one, with a PIP of 74.3%. Therefore, the main results reported in Table (2) are robust to these alternative measurements. As mentioned above, our baseline specification includes different controls based on the literature analyzing COVID-19 differentials. However, one may still argue that the effect of temperatures on the dependent variable could simply be capturing the effect of some omitted correlated determinant of contagions. In order to investigate this issue further, we now extend our baseline specification by including different covariates that may be associated with both temperatures and contagions at the regional level. In particular, we begin by considering the role played by the distance to the epicenter of the epidemic, which in Italy was located in Lombardy. In analogy with the diffusion of the impacts of an earthquake, we expect that a greater regional distance to the epicenter of the pandemic may have reduced the share of imported contagions and, as a consequence, the final epidemic size. This intuition is supported by the work of Fang and Wahba (2020), where distance to Wuhan is observed to be an important predictor of contagion differentials among Chinese cities. In addition, controlling for distance to Lombardy is relevant in the Italian context, given that regions in the south of the country or in the Mediterranean islands, far away from Lombardy, tend to have better climatic conditions.
We also control for the presence of comorbidities in the population, as there is evidence suggesting that COVID-19 infections exert a higher impact on those suffering from other health problems, such that their clinical evolution is usually worse (Guan et al., 2020). To account for these differentials, we aggregate into a composite index the annual number of patients discharged by hospitals with respiratory diseases, neoplasms or diseases of the circulatory system. In addition, we consider differences in regional quality of government. Previous studies have shown that regional quality-of-government differentials are a robust determinant explaining regional resilience during the downturn of the Great Recession (Rios and Gianmoena, 2020). Notes: the dependent variable in all regressions is the total number of contagions per 100,000 inhabitants from the Italian CPM. The results reported correspond to the estimation of all the models in the model space including any combination of the 10 variables. We use a fixed BRIC g-prior on the parameters and a binomial model prior with a prior mean model size equal to 5 in all cases. Variables are ranked by column (1), the posterior inclusion probability of the baseline specification. Standard deviations in parentheses; * significant at 10% level, ** significant at 5% level, *** significant at 1% level.
In the context of a health crisis, a higher quality of government is expected to influence not only the provision and the quality of health-care services, but also the adaptability of institutions to a changing and turbulent environment, which in turn may have helped to buffer the epidemic shock. Hence, to control for differentials across regions in the quality of government, we use the composite indicator developed by Charron et al. (2014).19 Moreover, as discussed in Section (2), it is possible that what matters for SARS-CoV-2 viral transmission are not the temperatures per se, but the levels of sunlight and UV radiation, which are deemed to degrade viral genetic material and increase host defenses through enhanced production of vitamins (Canell et al., 2006; Martineau et al., 2017). Because there is a strong correlation between regional temperatures and solar radiation, we disentangle the effects of these two variables by including the mean solar radiation flux at the centroid of the region, using data taken from the NASA-POWER v8 database. Finally, we consider the potential effect of infrastructure density, measured as the number of road and railway kilometers per squared kilometer. This variable is expected to capture differences in the degree of social mobility and connectedness within the region. The reason for including this variable is that, despite the fact that the first outbreak could have struck anywhere, regions with large cities like Milan or Rome should be biased toward attracting outbreaks, mainly because they host commercial hubs and large transport nodes with a massive influx of tourism and travelers. Likewise, the use of public transport may have been a key factor in the rapid spread of COVID-19 (AQR, 2020b). Table (8) shows the results obtained when the BMA analysis is performed again including these additional controls.
As can be seen, only the temperatures, the policy lags, the distance to Lombardy and the relative humidity display statistically significant effects at the 5% level. Moreover, we find that the most relevant factor is regional temperatures, with a PIP of 92.9% and a coefficient that continues to be negative and statistically significant in all cases. Therefore, the newly-added controls do not alter the main result of the paper.20 19 The EQI is built upon three different pillars that refer to the degree of impartiality, corruption and quality of public services, and it is based on survey data about the perceptions and experiences of European citizens regarding the quality, impartiality and level of corruption in education, public health care and law enforcement. 20 Figure A3 in the supplementary material appendix displays cumulative model probability plots providing evidence of a strong concentration of probability density in the top 100 models. The top 100 highest-probability models cover approximately 10% of the model space and concentrate 89.76% of the posterior probability mass. Notes: the dependent variable in all regressions is the total number of contagions per 100,000 inhabitants. The results reported correspond to the estimation of the top 100 models from all the possible regressions including any combination of the 15 variables. We use the reversible-jump MC³ sampler algorithm implemented by Zeugner and Feldkircher (2015) in R. We use a fixed BRIC g-prior on the parameters and a binomial model prior with a prior mean model size equal to 7.5. Variables are ranked by column (1), the posterior inclusion probability.
Standard errors in parentheses; * significant at 10% level, ** significant at 5% level, *** significant at 1% level. The results obtained so far imply that temperatures are the most robust determinant of cross-regional contagion differentials in Italy. We now investigate this issue from a different perspective. As a complement to the Bayesian and frequentist model averaging approaches developed before, we explore the relative importance of the various factors that could affect contagions by means of methods that partition the proportion of the model's explained variance among multiple predictors (i.e., the R² of the model). The term relative importance here refers to the contribution a variable makes to the prediction of contagions by itself and in combination with other predictor variables, which differs from classical statistical inference, where a variable may explain only a small proportion of predictable variance and yet be considered very meaningful (Johnson and LeBreton, 2004). To understand the intuition behind this approach, recall that in the context of a cross-sectional regression with n = 1, …, N, the R² informs on the model's explained variability across spatial units (regions). Thus, decompositions of the relative importance of a factor x_k tell us the percentage of explained disparities across spatial units due to factor k. In a linear regression framework with K explanatory factors, the variance of the model is given by:

$Var(y) = \sum_{k=1}^{K} \beta_k^2 v_k + 2 \sum_{j<k} \beta_j \beta_k \rho_{j,k} \sqrt{v_j v_k} + \sigma^2$,

where $\beta_k$ stands for the parameter of variable k, $v_k$ for the regressor variance, $\rho_{j,k}$ is the inter-regressor correlation and $\sigma^2$ denotes the unexplained variance. As explained by Gromping (2007), variable importance methods that decompose the R² have to decompose the explained variance of the model (i.e., the first two summands above). However, these variances can only be uniquely partitioned in the case of uncorrelated regressors. Whenever $\rho_{j,k} \neq 0$, different methods lead to different results.
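One resolution of this non-uniqueness is to average each regressor's order-dependent R² increment over all orderings in which it can enter the model (the LMG approach). A minimal sketch on synthetic data with three regressors follows; the data-generating process is hypothetical:

```python
import numpy as np
from itertools import permutations

rng = np.random.default_rng(1)

def r2(X, y, cols):
    """R^2 of an OLS fit of y on the columns `cols` of X (plus intercept)."""
    if not cols:
        return 0.0
    Z = np.column_stack([np.ones(len(y)), X[:, list(cols)]])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return 1 - (y - Z @ beta).var() / y.var()

def lmg(X, y):
    """Lindeman-Merenda-Gold shares: average each regressor's R^2 increment
    over all orderings in which it can enter the model."""
    K = X.shape[1]
    shares = np.zeros(K)
    orders = list(permutations(range(K)))
    for order in orders:
        entered = []
        for k in order:
            shares[k] += r2(X, y, entered + [k]) - r2(X, y, entered)
            entered.append(k)
    return shares / len(orders)

# Toy data (hypothetical): x0 matters most, x1 a little, x2 not at all.
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=200)
shares = lmg(X, y)
print("LMG shares:", np.round(shares, 3), " total:", round(shares.sum(), 3))
```

By construction the increments telescope within each ordering, so the shares sum exactly to the full-model R²: the decomposition allocates all explained variance, with the dominant regressor receiving the largest share.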
Given that the increase in the R² allocated to a certain factor x_k depends on which other variables are already in the model when x_k is added, Gromping (2007) proposes to use the Lindeman-Merenda-Gold (LMG) method, which consists of averaging such order-dependent R² allocations over all p! orderings to produce a fair, unique assessment of importance. The proportional marginal variance decomposition (PMVD) approach proposed by Feldman (2005) is a modified version with data-dependent weights penalizing variables that are not statistically significant. Others, like Zuber and Strimmer (2010, 2011), have proposed to by-pass the need of estimating all possible models by decorrelating the variables with a Mahalanobis transformation, and suggest looking directly at correlation-adjusted (marginal) correlation (CAR) scores, whereas Genizi (1993) proposed a similar metric that can be viewed as a weighted average of squared CAR scores. To study the relative contribution of the various factors affecting contagion, we employ the LMG method (Gromping, 2007), the PMVD (Feldman, 2005), and the Genizi and CAR scores (Genizi, 1993; Zuber and Strimmer, 2010, 2011). The formulas employed to obtain the decompositions implied by these metrics are detailed in Appendix D. Table (9) reports the estimated relative importance metrics for each group of variables and for each variable alone. As can be seen, the variables that belong to the climate, environmental and geography factors explain 63.4% of the variability in contagions.
within this group, leaving the role of temperatures aside, the distance to the epicenter of the epidemic outbreak (7.2%) and solar radiation (6.2%) are the most important ones. relative humidity differentials explain just 3.3% of the variability in contagions, which contrasts with the higher importance attributed to this variable in the bma and fma approaches. however, the low importance attributed by the relative importance analysis to the variability in air pollution (1.4%) is in agreement with the previous approaches. the second group of factors in terms of importance are the institutional factors and those related to the policy response to the epidemic, which together explain 23.4% of disparities. the individual rankings of all the factors included in this group are above the demographic and social connectedness ones. after the policy lag variable, which is ranked in second place, the share of health spending in gdp (6.7%) is ranked as the fourth variable, whereas the quality of regional government (5.1%), in sixth place, explains a non-negligible share of contagion variation. the tests performed relative to the population are ranked in eighth place, explaining 3.31% of the model's fit to the data. to a much lesser degree of importance, we find the demographic characteristics (6.8%) and the social connectedness factors (6.4%). among the demographic variables, the shares are quite equally distributed, since the share of young population (2.6%), the share of old population (2.5%) and the regional comorbidity index (1.7%) account for a similar portion of the r². regarding the factors that capture regional differences in social connectedness, the most prominent variable is the degree of social mobility (3.1%), above population density (2.4%) and infrastructure density (0.9%).
taken together, the results suggest that although institutional settings and political interventions have been important, regional contagion outcomes in italy have been largely determined by factors outside the scope of regional policy-makers. this study has examined the determinants of regional covid-19 contagion differentials in italy during the period ranging from february 24 to april 15, covering the first wave of the epidemic outbreak. the key contribution of this analysis is methodological, given that we consider the effect of a greater number of determinants than previous studies and we employ bma techniques to account for model uncertainty in a cross-regional contagion regression framework. within the bma, we compute the pips for the different indicators to generate a probabilistic ranking of relevance for the various contagion determinants. the analysis reveals that temperatures are the top determinant shaping regional reactions to the epidemic outbreak in italy, with a pip of 97.9%. we find that the negative and statistically significant effect on viral transmission is robust to all the model specifications. the most plausible explanation for this link is that lower temperatures may have increased the efficiency of viral propagation by decreasing host defenses and favoring the conditions for aerosol transmission in closed spaces.
the result of a negative relationship between temperatures and contagions is robust to (i) different forms of prior distribution elicitation, (ii) the procedure to assign importance weights to the regressors, (iii) the presence of measurement errors in official data due to under-reporting, (iv) the employment of alternative definitions of temperatures, and (v) the inclusion of additional covariates. in a second step, we calculate relative importance metrics that allow us to perform an accurate partitioning of the model's fit to the data. the results show that climate and environmental factors explain 63% of regional differences. specifically, we find that temperatures alone are responsible for 45.4% of the observed disparities in contagions. the set of factors related to the institutional setting and the policy response to the epidemic crisis appears at a second level of importance, whereas factors related to the degree of social connectedness or the demographic characteristics are less relevant in this context. although the limited spatial coverage of the analysis implies that our main findings should be taken with caution and cannot be extrapolated elsewhere, the results of the paper raise potentially important policy implications, especially at a time in which there is an active public debate on the most appropriate instruments to reduce the impact of the epidemic outbreak on regional economies. first, although our analysis suggests that higher temperatures may have contributed to decreasing viral transmission efficiency, this does not imply that contagions will not occur during the summer season or in warmer environments. in fact, given the wide range of average temperatures at which contagions occurred in italy, from −4 to 10 degrees celsius, our findings suggest that transmission is likely to occur in diverse environments.
the strong importance that our modeling exercise attributes to temperatures and to delays in lockdown policies indicates that, in the absence of a vaccine, the combined effect of a relaxation of the lockdown measures and the expected drop in temperatures during the fall and/or the following winter could re-create the conditions for a second epidemic wave. it is precisely in the hypothetical context of a second epidemic wave where the results obtained here regarding the role played by some policy-related variables may be useful and help to guide policy-making. an interesting result, revealed by the posterior jointness analysis, is that there is an important degree of substitutability between temperatures, lockdowns, and regional health spending as a share of gdp. therefore, the negative effect on contagions exerted by health spending, and the fact that in the absence of lockdowns this variable would have been considered much more relevant from a probabilistic point of view, can be interpreted as evidence in favor of the need to increase the share of economic resources devoted to robustifying the health-care system. since regional lockdowns are very costly from an economic point of view, our results point to health-care system investments as a less costly but effective policy to deal with the epidemic. a similar implication can be derived for the degree of social mobility, as it has been observed to increase contagions. given that this variable also appears to be a substitute for lockdowns, policies aiming at reducing the level of social mobility, as is the case with adopting teleworking (working from home), could help to reduce daily commutes and the significant possibility of transmission in office environments.
thus, the expansion of teleworking may help reduce costs to the economy and society. finally, we would like to highlight some future empirical avenues of research that have a strong potential to provide a deeper understanding of the linkages between climate and covid-19 contagions. from the methodological point of view, one option is the extension of the bma framework employed here to the context of a dynamic spatial panel data model including fixed effects (lee and yu, 2010) or adaptive fixed effects (shi and lee, 2017). this would allow us to investigate probabilistic importance by exploiting not only cross-regional variation but also time variation while controlling for unobserved heterogeneity. a second avenue of research consists of the development of a homogeneous and comparable cross-country regional dataset of contagions. this type of study will require the employment of correction methods such as the one we have implemented here. in the european context, high-frequency mortality monitoring data could allow the correction of under-reporting in deaths, which could be useful to further refine the methodology of russell et al. (2020). (medrxiv preprint: https://doi.org/10.1101/2020.05.13.20101261)
figure a1: the link between relative humidity and contagions.

figure a2: the role of g-priors and model size priors.

notes: the dependent variable in all regressions is the total number of contagions per 100,000 inhabitants after correcting for under-reporting. the results reported correspond to the estimation of the whole model space formed by models including any combination of the 10 variables. we use a fixed bric g-prior on the parameters and a binomial model prior with prior mean model size equal to 5 in all cases. variables are ranked by column (1), the posterior inclusion probability.
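the binomial model prior mentioned in the notes can be made concrete with a short sketch (hypothetical code of ours, not from the paper): if each of the k candidate regressors enters a model independently with probability φ, the prior model size is binomial with mean k·φ, so a prior mean model size of 5 over 10 variables corresponds to φ = 0.5:

```python
from math import comb

def binomial_model_prior(k_model, K, phi):
    """Prior probability of a specific model that includes k_model of the
    K candidate regressors, each entering independently with probability phi."""
    return phi**k_model * (1.0 - phi) ** (K - k_model)

K = 10
prior_mean_size = 5.0
phi = prior_mean_size / K   # phi = 0.5 yields a prior mean model size of 5

# Sanity checks: probabilities over all 2^K models sum to one, and the
# implied prior mean model size equals K * phi
total = sum(comb(K, k) * binomial_model_prior(k, K, phi) for k in range(K + 1))
mean_size = sum(k * comb(K, k) * binomial_model_prior(k, K, phi) for k in range(K + 1))
```

tightening φ below 0.5 shifts prior mass toward smaller models, which is one common way to encode a preference for parsimony in bma exercises.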
* significant at the 10% level, ** significant at the 5% level, *** significant at the 1% level. the group of top regressors, formed by the constant term, z_1 and z_2, always displays weights above the medium-importance and low-importance regressors, irrespective of the signal-to-noise ratio and the size of the selection matrix s. their variable weights are in most cases 99.9% or 100%. similarly, the medium-relevance regressors appear above the low-importance ones in most of the scenarios, which implies a high level of accuracy. the variance of the residual does not appear to significantly affect the quality of the rankings based on fma weights. however, the quality of the rankings appears to depend on the size of the sample of models s considered to derive the weights. this can be seen by comparing the quality of the rankings produced with s = 1,000 and s = 250. when the selection matrix s considers only a low number of models (i.e., 250 models), a low-importance regressor such as z_8 may obtain a variable weight within the range of the medium-importance regressors z_3 to z_6. despite this exception, we find that, overall, the employment of fma is accurate at ranking the groups of regressors by their importance. the lmg method assigns to each regressor x_k the following share:

φ_lmg(x_k) = (1/p!) Σ_{r ∈ permut} svar({x_k} | r)

where svar denotes the sequentially added explained variance as defined above.
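a direct, brute-force implementation of this average over orderings might look as follows in python (a toy sketch of ours with simulated data; svar is computed as the sequential r² gain from adding each regressor):

```python
import numpy as np
from itertools import permutations

def r2(X, y, idx):
    """R^2 of an OLS fit of y on an intercept plus the columns of X in idx."""
    Z = np.column_stack([np.ones(len(y))] + [X[:, j] for j in idx])
    resid = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return 1.0 - resid.var() / y.var()

def lmg_shares(X, y):
    """LMG: average the sequential R^2 gain of each regressor over all p! orderings."""
    p = X.shape[1]
    shares = np.zeros(p)
    orders = list(permutations(range(p)))
    for order in orders:
        preceding = []
        for k in order:
            # svar({x_k} | r): gain from adding x_k to the regressors preceding it
            shares[k] += r2(X, y, preceding + [k]) - r2(X, y, preceding)
            preceding.append(k)
    return shares / len(orders)

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 3))
y = 2.0 * X[:, 0] + X[:, 1] + rng.normal(size=500)
shares = lmg_shares(X, y)   # the shares sum exactly to the full-model R^2
```

since each ordering's gains telescope to the full-model r², the shares always add up to it; the pmvd variant replaces the uniform 1/p! weights with the data-dependent weights p(r).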
thus, the share φ_k assigned to regressor k is the average, over model sizes i, of the average improvement in explained variance obtained when adding regressor k to a model of size i that did not contain k. hence, the lmg metric performs an r² decomposition by averaging the marginal contributions of the independent variables over all orderings of the variables, using sequential sums of squares from the linear model, the size of which depends on the order of the regressors in each particular model. on the other hand, the pmvd method assigns to each regressor x_k the following share:

φ_pmvd(x_k) = Σ_{r ∈ permut} p(r) svar({x_k} | r)

where p(r) denotes the data-dependent pmvd weights. the idea is to use the sequences svar({x_{r(k+1)}, . . . , x_{r(p)}} | {x_{r(1)}, . . . , x_{r(k)}}) for all k = 1, . . . , p − 1 to determine the weights of the orderings r, which are proportional to:

l(r) = Π_{k=1}^{p−1} svar({x_{r(k+1)}, . . . , x_{r(p)}} | {x_{r(1)}, . . . , x_{r(k)}})^{−1}

such that p(r) = l(r) / Σ_{r ∈ permut} l(r). finally, to check the robustness of our results, we also compute two alternative metrics of relative importance: (i) the genizi (1993) scores and (ii) the car scores (zuber and strimmer, 2011). the weights associated with the genizi (1993) and car measures are built from the vector of correlation-adjusted marginal correlations ω = p^{−1/2} p_yx, where p is the correlation matrix of the regressors and p_yx the vector of their marginal correlations with the dependent variable; the car weights are the squared entries of ω, and the genizi weights can be viewed as a weighted average of squared car scores.

references

a simple planning problem for covid-19 lockdown (no. w26981)
the effect of air pollution on the spread of covid-19 in the catalan territory
the effect of population density on the spread of covid-19 in the catalan territory
a mathematical model for the spatiotemporal epidemic spreading of covid19
the resource curse revisited: a bayesian model averaging approach
covid-19: towards controlling of a pandemic
will coronavirus pandemic diminish by summer
model selection: an integral part of inference
epidemic influenza and vitamin d
a dynamic model of covid-19: contagion and implications of isolation enforcement
the effects of temperature and relative humidity on the viability of the sars coronavirus
commuter mobility and the spread of infectious diseases: application to influenza in france
regional governance matters: quality of government within european union member states
the determinants of economic growth in european regions
jointness of growth determinants
stability of sars coronavirus in human specimens and environment and its sensitivity to heating and uv irradiation
urban density is not an enemy in the coronavirus fight: evidence from china
relative importance and value
los diferentes criterios con los que se recogen los datos impiden conocer la dimensión de la epidemia en españa. retrieved
decomposition of r2 in multiple regression with correlated regressors
health expenditure and all-cause mortality in the galaxy of italian regional healthcare systems: a 15-year panel data analysis. applied health economics and health policy
estimators of relative importance in linear regression based on variance decomposition
comorbidity and its impact on 1590 patients with covid-19 in china: a nationwide analysis
jackknife model averaging
the drivers of local income inequality: a spatial bayesian model-averaging approach
world economic outlook
history and use of relative importance indices in organizational research
the geographic spread of covid-19 correlates with structure of social networks as measured by facebook (no. w26990)
a spatial dynamic panel data model with both time and individual effects
roles of humidity and temperature in shaping influenza seasonality
cold temperature and low humidity are associated with increased occurrence of respiratory tract infections
effects of temperature variation and humidity on the death of covid-19 in wuhan
vitamin d supplementation to prevent acute respiratory tract infections: systematic review and meta-analysis of individual participant data
seasonality and uncertainty in covid-19 growth rates. medrxiv
model averaging in economics: an overview
(april 16) cataluña cambia la forma de contar casos y hace aflorar 3.242 fallecidos más con coronavirus
under the weather: climate, ecosystems, and infectious disease
how effective has the spanish lockdown been to battle covid-19? a spatial analysis of the coronavirus propagation across provinces
role of temperature and humidity in the modulation of the doubling time of covid-19 cases
initial evidence of higher morbidity and mortality due to sars-cov-2 in regions with lower air quality. medrxiv
the dynamics of covid-19: weather, demographics and infection timeline. medrxiv
intensive care management of coronavirus disease 2019 (covid-19): challenges and recommendations. the lancet respiratory medicine
a novel methodology for epidemic risk assessment: the case of covid-19 outbreak in italy
covid-19 and italy: what next
the link between quality of government and regional resilience in europe
even a bad test can help guide the decision to isolate: covid simulations part 3
using a delay-adjusted case fatality ratio to estimate under-reporting
temperature and latitude analysis to predict potential spread and seasonality for covid-19
determinants of long-term growth: a bayesian averaging of classical estimates (bace) approach
relazione circa l'effetto dell'inquinamento da particolato atmosferico e la diffusione di virus nella popolazione
spatial dynamic panel data models with interactive fixed effects
actualización nº 63
model averaging and its use in economics
relationship between sunshine duration and temperature trends across europe since the second half of the twentieth century
aerosol and surface stability of hcov-19 (sars-cov-2) compared to sars-cov-1
estimates of the severity of coronavirus disease 2019: a model-based analysis. the lancet infectious diseases
an effect assessment of airborne particulate matter pollution on covid-19: a multi-city study in china
response to covid-19 in taiwan: big data analytics, new technology, and proactive testing
high temperature and high humidity reduce the transmission of covid-19
does comorbidity increase the risk of patients with covid-19: evidence from meta-analysis
exposure to air pollution and covid-19 mortality in the united states
report of the who-china joint mission on coronavirus disease
modes of transmission of virus causing covid-19: implications for ipc precaution recommendations. scientific brief
no association of covid-19 transmission with temperature or uv radiation in chinese cities
bayesian model averaging employing fixed and flexible priors: the bms package for r
high-dimensional regression and variable selection using car scores

notes: the dependent variable in the regressions is the total number of contagions per 100,000 inhabitants. columns (1) to (3) reflect estimates for official contagions, whereas columns (4) to (6) report those obtained after correcting for under-reporting. the results reported correspond to the estimation of the top 100 models from all the possible regressions including any combination of the 15 variables. we use a fixed bric g-prior on the parameters and a binomial model prior with prior mean model size equal to 7.5 in all cases. variables are ranked by column (1), the posterior inclusion probability.

this section presents the results of monte carlo experiments that investigate the performance of the frequentist model averaging approach outlined in section (6.2.2) when generating rankings of variable importance. in the monte carlo experiments, data on the potential explanatory variables in z = [ι_n, z_1, . . . , z_9] are generated from independent standard normal distributions. in the simulation study, z is of dimension n × 10, and we consider a sample size of n = 100. the dependent variable y is generated according to the following data-generating process:

y = zβ + ε, ε ∼ n(0, σ² i_n) (19)

notice that by adjusting σ² it is possible to control the signal-to-noise ratio in the model. values for σ² are set to 1, 2, 5 and 10 in the different experiments, to assess the ability of the fma approach to rank correctly the importance of z_k ∈ z under different scenarios for the r². the true slope coefficients β_k capturing the effect of z_k in the data-generating process given by equation (19) are generated in a way such that the constant term and the first two (nonconstant) covariates z_1 and z_2 are likely to be important regressors in the model.
specifically, the corresponding coefficients are drawn from a normal distribution with mean zero and a relatively large variance of 10. the parameters corresponding to z_3, z_4, z_5 and z_6 are similarly drawn from a normal distribution with mean zero; however, the variance for these parameters is set to a low level of 0.1, implying that these covariates are less likely to be important. the variance of the parameters related to z_7, z_8 and z_9 is set to 0.00001 to reflect that these variables have a negligible influence on y. we focus on the jackknife fma approach based on a selection matrix s of dimension s × (k + 1), for different sizes s. due to the computational burden, we use s instead of estimating all 2^(k+1) models, as in many empirical applications the enumeration of all the models might be quite time-consuming or infeasible. the design of s goes as follows. each of the entries of s is filled by drawing from a binomial distribution adjusted such that the expected probability of z_k being included in a model is 50% (i.e., by setting φ = 0.5 in p(m) = φ^(k_m) (1 − φ)^(k+1−k_m), which governs the probability of inclusion of each z_k in each row of s). this implies that, on average, the number of regressors in each model is 5 (i.e., each row s_i,* has 5 ones and 5 zeros). as the total number of model combinations with k + 1 = 10 is 1,024, and in applied analyses with large numbers of regressors the storage of the full model space may require infeasible amounts of ram, we just consider scenarios where s < 2^(k+1); in particular, we consider s = 250, 500 and 1,000. let the variance of the dependent variable y be given by σ²_y, let the covariance matrix of the set of regressors contained in x be denoted by σ, and the covariance of y and the covariates by σ_yx. let p denote the matrix of correlations among the regressors and p_yx the marginal correlations between the regressors and y, such that σ = v^(1/2) p v^(1/2) and σ_yx = σ_y v^(1/2) p_yx, where v = diag(var(x_1), . . . , var(x_p)).
defining the correlation between the model estimates ŷ and y as ω = corr(y, ŷ), the squared multiple correlation coefficient can be expressed as:

Ω = ω² = p′_yx p^(−1) p_yx

then, the unexplained variance can be written as σ²_y (1 − Ω), and the explained variance of a model with the regressors x_k with indices in the set s as evar(s) = σ²_y Ω_s. finally, the sequentially added explained variance when adding the regressors with indices in m to a model that already contains the regressors with indices in s is svar(m|s) = σ²_y Ω_{m∪s} − σ²_y Ω_s. this implies that the true coefficient of determination is given by Ω evaluated at the full set of regressors. with these definitions in hand, for any model with p regressors the r² can be expressed as r² = Σ_{k=1}^{p} φ_m(x_k), where m denotes the decomposition method.

key: cord-265299-oovkoiyj
authors: hickman, d.l.; johnson, j.; vemulapalli, t.h.; crisler, j.r.; shepherd, r.
title: commonly used animal models
date: 2016-11-25
journal: principles of animal research for graduate and undergraduate students
doi: 10.1016/b978-0-12-802151-4.00007-4
sha: doc_id: 265299 cord_uid: oovkoiyj

this chapter provides an introduction to animals that are commonly used for research. it presents information on basic care topics such as biology, behavior, housing, feeding, sexing, and breeding of these animals. the chapter provides some insight into the reasons why these animals are used in research. it also gives an overview of techniques that can be utilized to collect blood or to administer drugs or medicine. each section concludes with a brief description of how to recognize abnormal signs, in addition to lists of various diseases.

the mouse is a small mammal that belongs to the order rodentia (fig. 7.1). the house mouse of north america and europe, mus musculus, is the species most commonly used for biomedical research. it is likely that the mouse originated in eurasia and utilized its commensal relationship with humans to spread to other continents as humans explored and colonized.
mouse fanciers around the turn of the 20th century are the source of the majority of the laboratory mice that are in use today. a summary of the overarching categories of mouse models that are available is presented in table 7.1. advantages of the mouse include its physiology and the possibility of breeding genetically manipulated mice and mice that have spontaneous mutations. mice have been used as research subjects for studies ranging from biology to psychology to engineering. they are used to model human diseases for the purpose of finding treatments or cures. some of the diseases they model include hypertension, diabetes, cataracts, obesity, seizures, respiratory problems, deafness, parkinson's disease, alzheimer's disease, various cancers, cystic fibrosis, acquired immunodeficiency syndrome (aids), heart disease, muscular dystrophy, and spinal cord injuries. mice are also used in behavioral, sensory, aging, nutrition, and genetic studies. this list is in no way complete, as geneticists, biologists, and other scientists are rapidly finding new uses for the domestic mouse in research. mice are mammals, and their organ systems are very similar to organ systems in humans in terms of shape, structure, and physiology. basic physiologic data are presented in table 7.2. mice have very long loops of henle in the kidneys, thus allowing for maximal concentration of their urine. as a result, urine output in mice usually consists of only a drop or two of highly concentrated urine at a time. they also excrete large amounts of protein in their urine, with sexually mature male mice excreting the largest levels of protein, possibly as pheromones. mice have only two types of teeth, incisors and molars. the incisors are open-rooted and erupt (i.e., grow) continuously throughout their lives. this predisposes mice to malocclusion if they are not given feeds or objects, such as nylon bones, to help wear down the teeth during mastication. the molars are rooted and, thus, do not continuously erupt.
the stomach has two compartments, with the proximal portion completely keratinized and the distal portion entirely glandular. their intestines are simple, but the rectum is very short (1–2 mm) and hence is prone to prolapse, especially if the animal has colitis. the gastrointestinal flora consists of more than 100 species of bacteria that form a complex ecosystem that aids the digestion and health of the mouse. mice have no sweat glands but have a relatively large surface area per gram of body weight. this results in dramatic changes in physiology and behavior in response to fluctuations in ambient temperature. when too cold, mice respond by nonshivering thermogenesis (i.e., metabolism of brown adipose tissue). in addition to the lack of sweat glands, they cannot pant or produce large amounts of saliva to aid in cooling their body temperature. therefore, when exposed to very hot situations, mice increase the blood flow to their ears to maximize heat loss; and in the wild, they move into their burrows, which are at cooler temperatures. the thermoneutral zone, the range of ambient temperatures at which the mouse does not have to perform regulatory changes in metabolic heat production or evaporative heat loss to maintain its core temperature, is about 85.3–86.9 °f (29.6–30.5 °c). the female reproductive system is composed of paired ovaries and oviducts, the uterus, cervix, vagina, clitoris, and paired clitoral glands. pregnant female mice have hemochorial placentation, similar to humans (i.e., maternal blood is in direct contact with the chorion, the outermost layer of the fetal placental membranes). the female mouse also has five pairs of mammary glands. the male reproductive system consists of paired testes, the penis, and associated sexual ducts and glands. the inguinal canals are open in the male mouse, and the testes can retract easily into the abdominal cavity. both sexes have well-developed preputial glands, which can become infected.
males have a number of accessory sex glands, including large seminal vesicles, coagulating glands, and a prostate. secretions from these glands make up a large part of the mouse's ejaculate. when mice ejaculate, the semen forms a coagulum, or copulatory plug. mice breed continuously throughout the year unless conditions are very unfavorable to them (e.g., lack of food). their reproductive potential can be affected by a number of external influences such as noise, diet, light cycles, population density, or cage environment. genotype also can affect reproductive performance, as it is common knowledge that some inbred strains of mice are poor breeders, and if pups are born they may receive poor maternal care. additional reproductive physiologic data are presented in table 7.3. mice can be bred using a one-on-one system (one male to one female; monogamous) or in a harem mating system (polygamous mating). in a monogamous system, the male and female are always left together, but at weaning the pups are removed from the cage. this system allows for maximal use of the postpartum estrus and the maximum number of litters for the females involved, and facilitates recordkeeping and monitoring of specific breeders in the colony. in a harem mating system, multiple males are placed with multiple females, usually at a ratio of one male to two to six females. usually, females are removed to separate cages just before parturition, and the postpartum estrus is underutilized. mouse pups are born hairless, blind, and deaf and require extensive parental care that is provided mainly by the mother. due to the ruddy coloration of the skin of the hairless pups, they are also known as "pinkies." while mouse pups can increase their body temperature through the metabolism of brown fat stores, they are unable to adequately conserve body heat until they develop an adequate fur coat.
thus, inclusion of nesting materials in the cage is highly recommended, as huddling inside the nest can provide much-needed warmth and safeguard against temperature-associated neonatal losses. the most reliable method for determining the sex of a mouse is by measuring the length of the anogenital distance, i.e., the distance from the anus to the genitalia. this distance can be measured with a ruler, or animals can be assessed side by side with their rear ends held up by their tails. the anogenital distance is longer in males than in females. in sexually mature animals, one can also determine the sex of mice by the presence or absence of testicles in a testicular sac. mice are social creatures and can be group housed easily. their main method of communication is via pheromones. they use these olfactory cues to establish a pecking order (i.e., a hierarchical system of social organization). these chemicals are so important that when cage environments are changed, such as by simple cleaning or with bedding changes, a bout of fighting may occur until scent marking of the cage is completed as a way to reestablish the pecking order and social organization in that cage. pheromones also play a vital role in the reproduction of these animals. this is demonstrated by the whitten effect and the bruce effect. the whitten effect occurs when a group of female mice that are not cycling are exposed to male urine, which contains a large quantity of pheromones; the females will all resume cycling as a group soon after the introduction of the male. in contrast, the bruce effect is characterized by abortion of litters when pregnant females are exposed to the urine of a strange male. as with most rodents, mice are nocturnal animals, exhibiting peak levels of activity at night. because mice are a prey species, they display thigmotactic behavior, or wall hugging; they avoid open spaces where they might be easily caught by predators.
despite this, mice are very curious about any new objects in their territory and will often examine them at length. mice not only have poorly developed eyesight but are also color blind. a number of inbred strains (e.g., fvb/n and c3h/he) are functionally blind by weaning. they rely on their very sensitive hearing to escape detection and on their senses of smell and taste to detect food (and possibly avoid poisons). mice can hear over a range of frequencies between 0.5 and 120 khz; however, normal mice are most sensitive to frequencies of 12–24 khz. it is important to note that some inbred strains of mice, e.g., c57bl/6, suffer significant hearing loss before 1 year of age. mice can climb, swim, and jump (up to a foot), though they normally prefer to avoid swimming if possible. under certain conditions they display stereotypies, which are obsessive-compulsive behaviors. these behaviors may be strain-related, environment-related, or study-related and include wire gnawing, circling, jumping, and aggression. the use of environmental enrichment items such as cardboard tubes or other structures offers the animal an area for retreat from cage mates and adds complexity to the environment. aggression is another important behavior that commonly occurs in group-housed male mice. it can also occur in group-housed females and mixed-sex cages. indications that there is an aggressive animal in the cage include bite wounds on the tail, rump, ears, and shoulders of mice (fig. 7.2). the wounds can be severe enough to cause significant blood loss and abscess formation at bite sites. aggression has been shown to be influenced by strain, age, and prior encounters. in terms of strains, the more aggressive strains are balb/c, c57bl/10, c57bl/6, dba/2, and outbred swiss.
methods to prevent or reduce aggression include use of properly designed enrichment devices; provision of adequate space and shelter for each animal; grouping of mice before they reach puberty; use of docile strains; and removal of dominant animals as soon as possible. a common manifestation of social organization in group-housed mice is barbering, a behavior in which a dominant mouse will trim, by chewing, the hair or whiskers of other mice in the cage. barbering is also sometimes referred to as the dalila effect. it is usually instigated by a dominant male or female mouse. the dominant animals retain their whiskers and full hair coats, while their cage mates have "shaved" faces and bodies (fig. 7.3). although barbering does not generally result in any physical harm to the animal, removing the dominant mouse (the nonbarbered one) from the cage is a good approach to control. general types of housing for mice in a laboratory setting include: conventional, specific pathogen free (spf), and germ free. in conventional housing, no attempt is made to keep out adventitious microbial and parasitic organisms. mice housed in this manner can be found in open-topped cages. room air, along with any airborne contaminants, is allowed to freely circulate into the mouse's cage. in addition, the food and water are not sterilized, though it should be noted that microbial contaminants may enter the mouse population in this way. spf mice are raised in barrier conditions to ensure that they remain free of a specific list of pathogens. care is taken to ensure that adventitious microbes and parasites are excluded from the animals. spf mice are typically raised in specialized caging such as microisolator cages. these cages contain a 0.22 μm filter top that aids in the exclusion of microbes and parasites. individually ventilated caging systems include a rack of microisolator cages, each of which receives a filtered air supply (fig. 7.4).
under spf conditions, everything that comes into contact with the animal should be sterilized or disinfected. this includes, but is not limited to, the water, food, bedding, and caging. special care must be taken by anyone handling the mice, including researchers, to ensure that handling and experimental procedures do not introduce potential pathogens into the colony. thus, all handling and procedures done on spf mice are often performed under hepa-filtered air conditions, such as within a biosafety cabinet. placing the mouse in an unfiltered environment ("room" air), even for a moment, is enough to potentially colonize the mouse with a whole host of adventitious microorganisms, thus destroying its spf status. once a contaminated mouse is placed back into the colony, the entire colony is at risk for infection. germ-free, or axenic, mice are raised to contain no microbes or parasites whatsoever. raising germ-free mice requires strict barrier maintenance. usually, this requires the use of flexible film isolators, which provide hepa-filtered air to the mice within the isolator. additionally, any materials (e.g., food, water, and bedding) must be sterilized or thoroughly disinfected prior to being moved into the isolator unit so as not to contaminate the living space of the germ-free rodents with adventitious microorganisms. all procedures performed within the flexible film isolator must utilize strict aseptic technique for the same reason. in addition to the microbiological environment of the animal's housing system, mice need to be housed within specific environmental parameters; otherwise, they may experience stress. the guide for the care and use of laboratory animals, 8th edition (national research council, 2011) is an internationally accepted document that outlines and discusses globally accepted environmental parameters for housing different species of animals, including the mouse.
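as a rough illustration of how such macroenvironmental parameters might be checked programmatically, the sketch below compares a room reading against commonly cited guide ranges for mice (roughly 20–26 °c dry-bulb temperature and 30–70% relative humidity). the exact numbers, the `GUIDE_MOUSE_RANGES` name, and the `out_of_range` helper are illustrative assumptions, not values quoted from this chapter's tables.

```python
# illustrative sketch only: the ranges below are commonly cited guide
# (8th ed.) macroenvironment values for mice, not quoted from this chapter.
GUIDE_MOUSE_RANGES = {
    "temperature_c": (20.0, 26.0),        # dry-bulb room temperature
    "relative_humidity_pct": (30.0, 70.0),
}

def out_of_range(readings, ranges=GUIDE_MOUSE_RANGES):
    """Return the parameters whose readings fall outside the given ranges."""
    flagged = {}
    for name, value in readings.items():
        lo, hi = ranges[name]
        if not (lo <= value <= hi):
            flagged[name] = value
    return flagged

# a room at 27.5 °C would be flagged as too warm for mice
print(out_of_range({"temperature_c": 27.5, "relative_humidity_pct": 45.0}))
# → {'temperature_c': 27.5}
```

in practice such limits are enforced by building automation systems rather than ad hoc scripts; the point is only that the guide specifies checkable ranges per species.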
table 7.4 outlines the specific environmental requirements listed in this document for housing mice. mice are omnivorous and coprophagic; consumption of their own feces may account for at least one-third of their dietary intake. in the laboratory setting mice are fed a clean, wholesome, and nutritious pelleted rodent diet ad libitum. there are many commercially formulated diets for the various stages of life and for animals with specific induced diseases such as diabetes mellitus or hypertension. these diets may be available in autoclavable or irradiated forms to prevent transmission of disease via contaminated feed. there are also a variety of "pet" treats available for mice; however, treats should not make up more than 5–10% of the daily diet. mice should be provided with a continuous supply of water daily. if the animals do not get enough water, their food consumption will decrease, and the animals will look scruffy and unhealthy. mice can be provided with water from water bottles or pouches, automatic watering systems with nipples, or water-based gel packs. some general signs of ill health include: weight loss, depression or lethargy, anorexia, obesity, diarrhea, scruffiness or ruffled coats, abnormal breathing, sneezing, weakness, dehydration, enlarged abdomen, discolorations (e.g., yellow in jaundiced animals or very pale in anemic animals), masses or swellings, and abnormal posture or gait. body condition scoring is an objective measure of how fat or thin an animal is and can be used for accurate determination of endpoints in studies where animals are expected to lose or gain weight (fig. 7.5) (ullman-cullere and foltz, 1999). some of the more commonly found diseases of mice are presented in table 7.5, and noninfectious disorders in table 7.6. based on their genetic and physiologic makeup, mice can be either immunocompetent or immunodeficient.
immunocompetent means that the mouse has a normally functioning immune system and can mount an immune response to any insult or injury. in contrast, immunodeficient means that one or more components of the mouse's immune system are not functioning normally, so the animal cannot mount an adequate immune response and is more susceptible to infectious disease. immunosuppressed mice have a complete immune system, but because of a drug, chemical, or disease state, the immunological response is attenuated. rats and humans have a long history of coexistence. the origins of the laboratory rat, also known as the norway rat, stretch back centuries to the areas of modern-day china and mongolia (burt, 2006; song et al., 2014). the dispersal of the norway rat has occurred across the centuries, and its natural habitat stretches from the mediterranean across southeast asia and down into australia and new guinea. unfortunately, most people associate rats with disease and destruction. throughout history, outbreaks of bubonic plague, typhus, and hantaviruses have had an unwitting accomplice in the rat (zinsser, 1935; benedictow, 2004; firth et al., 2014). over the centuries, rats have also been used for food (e.g., in imperial china), companionship, and sport (gorn and goldstein, 2004; hopkins et al., 2004; burt, 2006). ratting, a vicious blood sport in which people laid bets on the dog that could kill the most rats in a given period of time, was especially popular in both victorian england and the american underworld (thomas and mayhew, 1998; gorn and goldstein, 2004). at the turn of the 20th century, breeding rats as a hobby or for companionship (i.e., "fancy rats") was recognized by the addition of "rat" to both the name and mission of the national mouse club in the united kingdom (american fancy rat and mouse association, 2014). however, as interest in pet rats waned over the following years, the club reorganized and dropped "rat" from its name.
a similar club, the american fancy rat and mouse association, was founded in the united states in 1983 (american fancy rat and mouse association, 2014). the first recorded use of rats as research subjects occurred in 1828 (hedrich, 2000), and the first known rat breeding experiments occurred in the late 1800s (lindsey and baker, 2006). the first major effort to perform research in the united states using laboratory rats occurred at the wistar institute of philadelphia, the oldest independent research institute in the united states, in 1894 (lindsey and baker, 2006). rattus norvegicus constitutes one of the most commonly used laboratory species (fig. 7.6), second only to the laboratory mouse. because rats and mice are not included under the animal welfare act regulations, the precise number of these species used per year within the united states is unavailable. however, examining the data collected within the european union can give some indication of their use relative to other common laboratory animal species. in 2011, rats accounted for just under 14% (1.6 million) of the total animals (11.5 million) used in research within the european union (european commission, 2013). this contrasts with mice, which constituted 61% (6.9 million) of the total animals used within the european union (european commission, 2013). rats possess a number of qualities that make them a highly suitable and much preferred animal model. as with mice, these traits include relatively small size; known genetic background; short generation time; similarities to human disease conditions; and known microbial status. their tractable nature makes them easier to handle in a laboratory setting than many other rodents. rats rarely bite their handlers unless extremely stressed or in pain.
rats have been used as animal models in numerous areas of research, from space exploration to more basic scientific questions regarding nutrition, genetics, immunology, neurology, infectious disease, metabolic disease, and behavior. perhaps their largest use is in drug discovery, efficacy, and toxicity studies. in the united states, the approval of any new drug for use in humans or animals usually necessitates that toxicity testing be done in at least one small animal species (e.g., rodents) and one large animal or target species (e.g., dog, nonhuman primate). there are known physiologic differences between the numerous outbred stocks and inbred strains of rats. the rat genome database (rgd) is an extensive, free resource filled with information regarding the different phenotypes, models, and genomic tools used in rat research (laulederkind et al., 2013). vendors of commercially available rat stocks and strains are often good resources for normal physiologic data on these strains. many provide stock- and strain-specific data directly on their websites, such as growth curves, complete blood count and serum biochemistry panels, and spontaneous lesions seen on histopathology. normal physiologic reference values are summarized in table 7.7. sexual dimorphism exists between male and female rats. sexing of adult rats is most easily done by examining the perineal area of the rat and identifying the external reproductive structures such as the penis, testes, or vagina. in addition, male rats are typically larger and weigh significantly more than their age- and strain-matched female counterparts. sexing of rat pups is most easily performed by examining the distance between the anus and genital opening in the pup. males have a greater anogenital distance than females. male rats possess paired testicles that descend from the abdomen into the scrotal sac at approximately 15 days of age (russell, 1992).
due to the lack of closure of the inguinal rings, the testes may be retracted into the abdominal cavity even in the adult. the male rat also possesses a number of accessory sex glands. a four-lobed prostate is present along with four other paired glands: the seminal vesicles, coagulating glands, ampullary glands, and bulbourethral glands (noted in some texts by the older name, cowper's glands) (popesko et al., 1992). due to the unusual bihorned shape of the closely associated coagulating glands and seminal vesicles, individuals unfamiliar with male rodent anatomy may initially mistake these structures for the female uterus. however, the apices of these glands are freely mobile and easily exteriorized from the abdominal cavity, unlike the uterus, which is attached to the dorsal body wall bilaterally via the paired ovaries and their respective ovarian ligaments. the reproductive anatomy of the female rat contains some distinct features. the uterus of the female rat is classified as a duplex uterus, because the vagina is separated from the uterus by two individual cervices, with each cervix leading to a separate uterine horn (popesko et al., 1992). the placentation of the pregnant rat is hemotrichorial (three layers) rather than the hemomonochorial (single layer) placentation present in humans (wooding and burton, 2008). the rest of the reproductive anatomy (e.g., ovary, oviduct) is structurally and functionally similar to that of other mammals. a summary of basic reproductive physiology is presented in table 7.8. rat pups are born hairless, blind, and deaf and require extensive parental care that is provided mainly by the mother. as with mice, the skin of hairless rat pups has a pink coloration, so they are also referred to as "pinkies." the inclusion of nesting materials in the cage is recommended to assist the rat pups with thermal regulation until they have a full hair coat (whishaw and kolb, 2005).
like other rodents, rattus norvegicus is a nocturnal species with the highest level of activity occurring during the dark phase. behaviors exhibited by rats include grooming, nesting, eating, and other social behaviors. nesting behavior serves several purposes among rats and mice (gaskill et al., 2012, 2013a). nests allow for better thermoregulatory control within a given environment as well as protection against predation (gaskill et al., 2013c). recent work in mice suggests that energy not diverted to thermoregulation can be shunted to other activities, as seen via improved feed conversion and breeding performance (gaskill et al., 2013c). however, anecdotal evidence suggests that nest building in rats is largely a learned behavior, and there appears to be a developmental period in young rats during which exposure to nesting materials leads them to begin building at least rudimentary nests (gaskill, 2014). at a minimum, rats benefit from having a structural shelter or nest box in which they may rest away from prying eyes (fig. 7.7). rats emit alarm vocalizations during times of distress. however, these negatively associated vocalizations typically register in the ultrasonic wavelengths (approximately 22 khz), well outside the human hearing range (burman et al., 2007; parsana et al., 2012). rodents may also emit high-pitched audible vocalizations when extremely alarmed, distressed, or in pain (jourdan et al., 1995; han et al., 2005). as discussed in chapter 5, rats exhibiting abnormal behaviors and stereotypies can introduce variables into research findings and should not be used in a study unless abnormal behavior is itself the object of study (baenninger, 1967; callard et al., 2000; garner and mason, 2002; cabib, 2005; ibi et al., 2008). examples of stereotypies seen in rats include: bar-gnawing, pawing behavior, repetitive circling, and backflipping.
it is critical that rats be provided some form of environmental enrichment to stimulate positive species-typical behaviors. the housing of rats in a laboratory setting is similar to that described previously for mice: conventional, spf, and germ free. as rats are social animals, at minimum they should be housed in pairs whenever possible. there is a preponderance of evidence showing differences in affiliative versus aggressive behavior, biochemical changes, and changes in learning between rats raised and housed in social isolation versus those housed in social groups (baenninger, 1967; einon and morgan, 1977; robbins et al., 1996). enrichment items, such as a hut, nesting box, or similar type of shelter, may be included in the cage to provide a visual barrier between the rat and the rest of the animal room. evidence suggests that rats prefer shelters made from opaque plastic (patterson-kane, 2003). in the wild, rats also spend a lot of time chewing, either for eating or for manipulating objects for nest building. providing objects made of safe materials in the cage allows the rats to exhibit this natural behavior and encourages the normal wear of the rat's incisors, minimizing the incidence of malocclusion of the teeth. rodents can benefit from frequent gentle handling by the researcher and animal care staff. this concept is known as "gentling" and has been demonstrated to reduce the stress experienced by rats during experimental handling and procedures (hirsjarvi et al., 1990; van bergeijk et al., 1990). another positive interaction between humans and rats is found in the "tickling" of rodents. based on ultrasonic vocalization data, rodents find tickling a pleasurable experience (burgdorf and panksepp, 2001; panksepp, 2007; hori et al., 2014). tickling may also decrease the stress response seen in rodents after experimental manipulations like intraperitoneal injections (cloutier et al., 2014). a multilevel cage with an intracage shelter.
this style of caging provides opportunities for exercise for the rats. photo provided by melissa swan. as for mice, some general signs of ill health include: weight loss, depression or lethargy, anorexia, obesity, diarrhea, scruffiness or ruffled coats, abnormal breathing, sneezing, weakness, dehydration, enlarged abdomen, discolorations (e.g., yellow in jaundiced animals or very pale in anemic animals), masses or swellings, and abnormal posture or gait. when assessing animals, a body condition score can be used as an objective measure of how fat or thin the animal is, and it allows for the accurate determination of endpoints in studies where animals are expected to lose or gain weight (hickman and swan, 2010) (fig. 7.8). some of the more commonly found diseases of rats are presented in table 7.9. the ancestral home of the european rabbit (oryctolagus cuniculus) is the iberian peninsula (hardy et al., 1995). the earliest archeological evidence of the coexistence of humans and rabbits can be found in excavation sites dated at approximately 120,000 years ago in nice, france (dickenson, 2013). in antiquity, romans used rabbits as a food source and are thought to be responsible for their dispersal throughout europe, although there is no evidence that they attempted to actually domesticate them (dickenson, 2013). domestication and selective breeding are thought to have begun in france in the middle ages, where monks began to breed rabbits in their monasteries (dickenson, 2013). the rabbits were kept confined in enclosures called "clapiers" (dickenson, 2013). they were kept largely as a source of food for the monks, especially since 600 ce, when pope gregory i officially classified them as "fish" and thus eligible to be eaten during lent. however, rabbit wool production soon became a welcome by-product of these domestication efforts. european rabbits have been used in research since the middle of the 19th century.
early work with the species concentrated on the comparative anatomy of the rabbit relative to other species, such as the frog, and on the unique features of the rabbit's heart and circulatory system (champneys, 1874; roy, 1879; smith, 1891). louis pasteur used rabbits in a series of experiments that led to the development of the world's first rabies vaccine (rappuoli, 2014). while there are numerous so-called "fancy" breeds of rabbits available in the pet trade, the list of breeds routinely used in research is much shorter. the new zealand white (nzw) rabbit is the breed most frequently used in research (fig. 7.9). the california and dutch-belted rabbit breeds are also occasionally used. researchers have developed genetically inbred rabbit strains for particular research applications. for example, the watanabe heritable hyperlipidemic (whhl) rabbit and the myocardial infarction-prone whhl rabbit (whhlmi), both developed by researchers in japan, are used to explore diseases associated with dyslipidemia such as atherosclerosis (shiomi et al., 2003; shiomi and ito, 2009). rabbits have been used as a model of human pregnancy and for the production of polyclonal antibodies for use in immunology research (hanly et al., 1995; ema et al., 2010; ito et al., 2011; fischer et al., 2012). rabbits are also routinely used in many other areas of biomedical research (southard et al., 2000; arslan et al., 2003; mcmahon et al., 2005; castaneda et al., 2008; habjanec et al., 2008; manabe et al., 2008; xiangdong et al., 2011; panda et al., 2014; sriram et al., 2014; wei et al., 2014; zhou et al., 2014). the production of polyclonal antibodies is preferentially performed in the rabbit due to its relatively large blood volume compared to rodents (hanly et al., 1995). their tractable nature and larger body size also make them suitable for surgical implantation of biomedical devices (gotfredsen et al., 1995; swindle et al., 2005; ronisz et al., 2013).
additionally, rabbits are a favored model in pharmacologic studies for teratogenicity testing of novel pharmaceutical compounds (gibson et al., 1966; lloyd et al., 1999; foote and carney, 2000; jiangbo et al., 2009; oi et al., 2011). while much of the anatomy of the rabbit is similar to that of other mammalian species, there are a number of key differences. for example, the skin of the rabbit is quite thin and fragile. care should be taken when restraining a rabbit or shaving a rabbit's fur (e.g., in preparation for surgery) to avoid tearing the skin. unlike rodents and other laboratory animals, rabbits do not have pads on their feet; rather, the plantar surface is covered with fur (quesenberry and carpenter, 2012). new zealand white rabbits are commonly used in research. photo provided by kay stewart. the long ears of the rabbit serve several purposes. the most obvious is hearing. in addition, the central ear artery and marginal ear veins are easily accessible for both intravenous administration and blood sampling (diehl et al., 2001; parasuraman et al., 2010) (fig. 7.10). the ears also serve as a means of thermoregulation, as excess heat may be exchanged across their large surface area (sohn and couto, 2012). the skin of rabbits lacks sweat glands, so rabbits are unable to sweat, and panting is insufficient to dissipate excess heat (sohn and couto, 2012). thus, the ears play a vital role in maintaining proper body temperature. other unique features of the skin and adnexa are the presence of chin and inguinal glands used in scent marking. additionally, the female rabbit (doe) is noted by the presence of a large skin fold filled with subcutaneous fat just under the chin (the dewlap) (sohn and couto, 2012). the skeleton of the rabbit makes up only 8% of body weight (brewer, 2006). this is in contrast to other similarly sized mammals; for example, the cat skeleton makes up 12–13% of body weight (brewer, 2006).
the small skeletal mass of the rabbit, coupled with strong back muscles, means that the back is prone to traumatic fracture (meredith and richardson, 2015). proper holding and restraint techniques are necessary to avoid this undesirable outcome. there are several unique features of both the respiratory and cardiovascular systems of rabbits. for example, rabbits are obligate nose breathers (varga, 2014). this is especially important during procedures involving anesthesia and placement of an endotracheal tube. with respect to the cardiovascular system, the rabbit heart is unique in that the right atrioventricular (av) valve has only two leaflets instead of three (brewer, 2006). additionally, due to the similarity to humans with respect to the neural anatomy of the ventricles, the rabbit is the species of choice for purkinje fiber research (brewer, 2006). rabbit teeth are "open-rooted," meaning that they continue to erupt and grow throughout life. this applies to all of the teeth in the rabbit dental arcade (i.e., incisors, premolars, and molars; rabbits do not have canines). this contrasts with rodents, in which the incisors are the only open-rooted (or hypsodontic) teeth (sohn and couto, 2012). thus, rabbit teeth are subject to overgrowth. another unique feature of rabbit dentition that sets them apart from rodents is the presence of a second set of incisors just behind the first set of upper incisors, known as "peg" teeth (sohn and couto, 2012). these are thought to aid in tearing off the succulent leaves of plants while grazing. as an obligate herbivore, the rabbit has a gastrointestinal tract that differs greatly from that of carnivores and omnivores. rabbits require a high-fiber diet containing between 14 and 20% fiber (sohn and couto, 2012). the small intestine is divided into three main sections: the duodenum, jejunum, and ileum. the ileum connects to the cecum via a structure called the sacculus rotundus.
the presence of lymph follicles suggests that the sacculus rotundus has immunological functions; it is sometimes referred to as the ileocecal "tonsil" (jenkins, 2000). in the rabbit, gut-associated lymphoid tissue is also present in the small intestine and the vermiform appendix (lanning et al., 2000). the cecum, a large distensible outpouching of the large intestine, holds up to an estimated 40% of the total ingesta (sohn and couto, 2012). rabbits are considered to be "hindgut fermenters": bacteria present in the cecum ferment the digestible fiber found within the diet. the product of this fermentative process becomes cecotrophs (also known as "night feces"). cecotrophs are excreted roughly 8 h after the initial foodstuffs are ingested (sohn and couto, 2012). they are softer and more mucoid in appearance than the hard, dry "day feces" produced just 4 h after consuming food (sohn and couto, 2012). the bulk of day feces consists of the indigestible fiber found in the diet. the sorting of foodstuffs destined to become either day feces or cecotrophs, and the timing of their relative excretion, is largely dependent upon the neural input of the fusus coli, also termed the "pacemaker" of the colon (sohn and couto, 2012). the fusus coli is anatomically located between the ascending and transverse colons of the rabbit (popesko, 1992). consumption of cecotrophs is an important part of the digestive process in rabbits, as cecotrophs are rich in b vitamins, such as niacin and b12, and vitamin k (hörnicke, 1981). while cecotrophs are known colloquially as "night feces," rabbits produce and eat them at all hours of the day (sohn and couto, 2012). rabbits are agile enough to eat these feces directly from the anus (sohn and couto, 2012). researchers performing digestive research (e.g., fecal collection via metabolism cages) should keep this in mind. sexing of adult rabbits is aided by the sexual dimorphism present in the species.
mature females are readily identified by the presence of the dewlap (sohn and couto, 2012). females have 8–10 nipples, while in males these nipples are present but rudimentary (sohn and couto, 2012). nzw rabbits reach sexual maturity between 5 and 7 months (suckow et al., 2002). reproductive data are summarized in table 7.10. mature bucks have paired testicles enclosed in paired hairless scrotal sacs (sohn and couto, 2012). as in rodents, the inguinal rings do not close. accessory sex glands include several bilobed organs: the seminal vesicle, vesicular gland, prostate, and paraprostatic gland. the bulbourethral glands of bucks are paired (sohn and couto, 2012). female rabbits are induced ovulators (dal bosco et al., 2011; sohn and couto, 2012); that is, the egg is not ovulated spontaneously from the ovary; rather, physical stimulation via copulation is required. ovulation occurs approximately 10 h postcopulation (sohn and couto, 2012). because they are induced ovulators, does do not have a defined estrous cycle. rather, they have periods of sexual receptivity lasting approximately 14–16 days followed by 1–2 days of nonreceptivity. nonfertile matings may result in a period of pseudopregnancy of up to 15–16 days (sohn and couto, 2012). fertile matings result in pregnancy lasting 31–32 days (sohn and couto, 2012). the placenta of rabbits is classified as hemodichorial; this is in contrast to humans, which have a hemomonochorial placenta (furukawa et al., 2014). birthing (i.e., parturition; also known as "kindling") occurs most often during the early morning hours (sohn and couto, 2012). kits are born deaf and blind; by 7 and 10 days of age they can hear and see, respectively (quesenberry and carpenter, 2012). amazingly, does suckle their young daily, usually during the dark phase, and for only approximately 5–6 min (sohn and couto, 2012). the kits are able to drink about 30% of their entire body weight in that time.
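as a small worked example of the reproductive timeline above, the sketch below estimates an expected kindling window from a known mating date using the 31–32 day gestation length cited above; the `kindling_window` helper and the use of python's `datetime` module are illustrative assumptions, not part of the source.

```python
from datetime import date, timedelta

# rabbit gestation length cited above: 31-32 days after a fertile mating
GESTATION_DAYS = (31, 32)

def kindling_window(mating_date):
    """Return the (earliest, latest) expected kindling dates for a mating."""
    lo, hi = GESTATION_DAYS
    return (mating_date + timedelta(days=lo),
            mating_date + timedelta(days=hi))

early, late = kindling_window(date(2024, 3, 1))
print(early, late)  # expected window: 2024-04-01 to 2024-04-02
```

a colony manager would use such a window, for example, to ensure a nest box is in place before the earliest expected kindling date.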
wild and domesticated does both follow this nursing behavior. rabbit kits may be weaned between 5 and 8 weeks of age (suckow et al., 2002). earlier weaning should not be attempted, as there may be profound detrimental effects on the functioning of the gastrointestinal system (bivolarski and vachkova, 2014). rabbits are very social, nocturnal creatures. scent marking is a normal and important part of their behavioral repertoire. rabbits will rub the secretions from their chin scent glands against inanimate objects, other rabbits, and human handlers in a process called "chinning" (sohn and couto, 2012). dominance hierarchies are established behaviorally. dominant animals may mount, "barber," or scent-mark subordinates (sohn and couto, 2012). barbering is the act of chewing the hair of a subordinate animal, usually on the neck and back in the case of rabbits, very close to the skin, giving the appearance of having been cut or "barbered" (bays, 2006). rabbits will "thump" one or both back feet on the ground when frightened or as an alarm call to other rabbits (bays, 2006). highly stressed rabbits may actually emit a loud, piercing scream, especially when roughly caught by an untrained individual (bays, 2006). it is important to approach rabbits calmly and quietly. relaxed, content rabbits may be heard making a purring sound (bays, 2006). rabbits benefit from repeated, positive interactions with people, similar to the concept of "gentling" in rats (see section 7.2). changes in behavior are often the first indication that an animal is in pain. given that rabbits are a prey species, it is, evolutionarily speaking, not in a rabbit's best interest to display signs of pain. thus, these behaviors are most often quite subtle in nature and may be easily missed if particular attention is not paid. the first sign often seen in a rabbit experiencing pain is a decreased appetite resulting in little to no food intake (sohn and couto, 2012).
rabbits will often grind their teeth (i.e., bruxism) when experiencing pain (sohn and couto, 2012). other rabbits may simply appear very dull and inactive. as with rodents, rabbits can develop stereotypies. due to the sensitivity of the rabbit's nose and lips, many stereotypies involve chewing behaviors. bar chewing, chewing on the water bottle, and self-barbering are all stereotypical behaviors seen in rabbits (gunn and morton, 1995; chu et al., 2004). in addition, rabbits may engage in "nose sliding" against solid surfaces like the cage walls and in head swaying (sohn and couto, 2012). animals that exhibit stereotypies do not make good research animals. efforts should be made, where possible, to prevent these behaviors through the use of environmental enrichment. enrichment may be in the form of chew-resistant objects (such as plastic dumbbells and stainless steel rattles) and food treats (poggiagliolmi et al., 2011). as a prey species, rabbits benefit from the inclusion of a hut in the cage, or at least a visual barrier, into which they may retreat when psychologically stressed (baumans, 2005). breeding females should always have access to a nest box to allow for the necessary expression of normal nesting behavior (baumans, 2005). being social creatures, rabbits should ideally be housed in compatible pairs or trios unless contraindicated by the research objectives or by incompatibility of the animals (sohn and couto, 2012). stable social groups formed shortly after weaning, where animals are not added or removed, are most beneficial (boers et al., 2002). structurally, rabbits benefit from housing that has both adequate vertical and horizontal space (boers et al., 2002). one recommendation on the space requirements of laboratory rabbits stipulates 16 in. as the minimum vertical cage height (national research council, 2011).
at a minimum, rabbits must be able to comfortably sit upright in the cage without their ears bending over (national research council, 2011). laboratory rabbits are typically housed in easily sanitized stainless steel cage racks (fig. 7.11). slatted flooring allows urine and feces to fall through the slats onto special pans fixed below the cage, thus providing for easier sanitation of the cages. however, care must be taken that the slats are of sufficient width so as to prevent a condition known as bumblefoot (see section 7.3.7). dog runs with elevated, slatted flooring or a solid floor with bedding have also been used with success by some investigators for group-housed rabbits (personal observations). again, attention should be paid to the flooring and its effect on foot health. commercially available research diets specifically formulated for rabbits are available. these diets are preferred to so-called "natural diets" and the feeding of individual vegetables. this is because rabbits tend to be very selective eaters, which can lead to nutritional imbalances (fraser and girling, 2009). additionally, the use of fresh vegetables may lead to the introduction of unwanted pathogens like salmonella (varga, 2014). commercial diets are available in maintenance and reproductive-performance dietary formulations, as well as presterilized diets for rabbits housed under spf conditions. rabbits are very easily heat stressed and thus must be kept at significantly lower temperatures than other laboratory animals like rats and mice. noise is another significant stressor to rabbits (verga et al., 2007). sudden, high-pitched, sharp noises are most disruptive. however, in general, noise within the animal rooms should be avoided as much as possible. for this reason, rabbits should not be housed, even temporarily for short procedures, near areas of high noise. problems in rabbits related to the gastrointestinal system are relatively common. these problems can become serious very quickly.
therefore, it is critical that abnormalities seen (e.g., a rabbit not eating or abnormal feces) be reported to the veterinary care staff immediately. even if a researcher is unsure whether there is a problem, it is best to report suspicions, because without prompt intervention seemingly minor problems can escalate to potentially life-threatening conditions. some of the most commonly seen clinical conditions in rabbits are summarized in table 7.11.

figure 7.11 example of rabbit caging for a laboratory setting. photo provided by deb hickman.

table 7.11 (excerpt) commonly seen clinical conditions in rabbits:
- gastrointestinal upset: generally secondary to the use of broad-spectrum antibiotics, such as penicillin
- dental malocclusion: misalignment and subsequent overgrowth of the continuously growing teeth
- pododermatitis ("bumblefoot"): infection of the underside of the feet
- pasteurella multocida: common cause of respiratory infections and abscesses

the zebrafish, danio rerio of the cyprinidae family, is a small, dark blue and yellow striped, shoaling, teleost fish, popular among aquarium enthusiasts, and increasingly among the research community (fig. 7.12). the adult fish are 4–5 cm in length, with an incomplete lateral line and two pairs of barbels (laale, 1977). males have larger anal fins and more yellow coloration; females have a small genital papilla just rostral to the anal fin (laale, 1977; creaser, 1934). zebrafish are hardy, freshwater fish originating from a tropical region with an annual monsoon season. the fish are generally found among slow-moving waters of rivers, streams, and wetlands across the south asia region of india, bangladesh, and nepal (engeszer et al., 2007; spence et al., 2008). the waters tend to be shallow and relatively clear, with substrates of clay, silt, or stone of varying size (mcclure et al., 2006; engeszer et al., 2007). the fish feed mostly on insects and plankton, with evidence of feeding along the water column as well as the water surface (mcclure et al., 2006; engeszer et al., 2007; spence et al., 2008).

figure 7.12 male zebrafish have a more streamlined body with darker blue stripes, while the females have a white protruding belly. photo provided by kay stewart.

the small size of zebrafish, the ease of keeping large numbers, frequent spawning, large egg clutches, translucent nonadherent eggs, rapid development, and the complete sequencing of the zebrafish genome are all key components that make the zebrafish an attractive research model. interestingly, approximately 70% of zebrafish genes have at least one orthologous human gene (howe et al., 2013). publications on the use of zebrafish in research are cited as early as the 1930s (creaser, 1934). until the early 1970s, the use of zebrafish stayed fairly low, with the number of articles published staying below 20 per year. in the mid-1970s, publications increased to about 40 per year, doubling again in the 1980s, increasing to almost 200 articles per year in the early 1990s, and rapidly expanding to 1929 publications by 2012. developmental biology was the initial focus of zebrafish research use. however, in recent years, use of the zebrafish in research related to biochemistry and molecular biology, cell biology, neurological sciences, and genetics has been rapidly increasing. zebrafish are known to live for only a year in the wild (spence et al., 2008). for most of the year, the fish reside in shallow streams. with the onset of monsoon rains, they move to flooded, highly vegetated shallow wetlands and floodplains, including rice paddies, with little to no current and often silt bottoms, for spawning (engeszer et al., 2007). the offspring then develop in these waters until the seasonal waters diminish (engeszer et al., 2007). zebrafish mature rapidly, reaching sexual maturity as early as 2 months postfertilization.
the zebrafish continues to grow throughout life, and life is much longer in captivity, with a mean lifespan of 3.5 years (gerhard et al., 2002). in nature, spawning behavior occurs within small groups of three to seven fish. males within the group pursue females, with spawning occurring along the substrate (spence et al., 2008). similar behaviors are noted in laboratory zebrafish, with spawning often occurring with the first light of day. courtship behavior involves a rapid chase of the female, the male swimming around the female, nudging her, or swimming back and forth, working the female to the spawning site. interestingly, zebrafish prefer spawning near artificial plants. once there, the male remains close to the female, extending his fins to bring his genital pore in line with the female's. the male may also rapidly undulate his tail against the side of the female to initiate egg release by the female, coinciding with sperm discharge by the male. the female produces eggs in batches of 5–20 over several encounters with the male for up to an hour. most eggs are released within the first 30 min, with a peak in production during the first 10 min (darrow and harris, 2004; spence et al., 2008). zebrafish produce large clutches of eggs, from 150 to 400 eggs per clutch (laale, 1977). the eggs, approximately 0.7 mm in diameter, are transparent and protected within a chorionic membrane (kimmel et al., 1995). first body movements and beginning stages of organ development occur 10–24 h postfertilization (kimmel et al., 1995). as development continues, the larva hatches from the egg two to three days postfertilization (kimmel et al., 1995). the early larva has special secretory cells within multicellular regions of the head epidermis that allow the larvae to attach to various hard surfaces and plants until the swim bladder inflates 4 or 5 days postfertilization (laale, 1977; kimmel et al., 1995).
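the developmental milestones above fall on a compact timeline, which can be sketched as a simple lookup table. a minimal sketch, using only the milestones stated in the text (the stage boundaries use the upper end of each reported range; the helper function and its names are illustrative):

```python
# zebrafish developmental milestones from the text, keyed by the upper
# bound of each stage window in days postfertilization (dpf).
MILESTONES = [
    (1.0,  "embryo: first body movements, early organogenesis (10-24 h)"),
    (3.0,  "hatching from the chorion (2-3 dpf)"),
    (5.0,  "swim bladder inflation; larva leaves hard surfaces (4-5 dpf)"),
    (60.0, "larval/juvenile growth toward sexual maturity (~2 months)"),
]

def stage_at(dpf):
    """return the milestone window containing a given age in dpf."""
    for upper_bound, label in MILESTONES:
        if dpf <= upper_bound:
            return label
    return "sexually mature adult (>~2 months)"

print(stage_at(2.5))   # falls in the hatching window
print(stage_at(70))    # past all listed windows
```

the same structure extends naturally to the husbandry transitions described later (feed changes at 8–15 dpf, juvenile housing around 29 dpf).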
once the swim bladder inflates, the fish can maneuver through the water column. in captivity, zebrafish can breed year round. the presence of males, or even just male pheromones, is needed to induce ovulation (gerlach, 2006). if females are housed away from males for an extended period, they can retain eggs, resulting in egg-associated inflammation, which can be lethal (kent et al., 2012a). to accommodate the fish life cycle, zebrafish are typically housed in static spawning cages to allow for fertilized egg production. spawning cages include a housing tank containing a clear slotted-bottom insert and a plastic plant. the insert is often placed in the holding tank at an angle to create a shallow region for spawning, and the slotted bottom of the insert allows for ease of egg collection (lawrence and mason, 2012; nasiadka and clark, 2012) (fig. 7.13). the embryos are then incubated at around 28.5 °C in a petri dish for at least 3–4 days postfertilization (wilson, 2012). the fish are then kept in static or slow-water-flow containment and can be fed paramecium, rotifers, a powdered food, or a combination of these feed types. unfortunately, other than the need for essential fatty acids in their diet, little is yet known about the nutritional requirements of zebrafish. zebrafish in research settings are typically fed live feed like artemia (brine shrimp), rotifers, and bloodworms (chironomid larvae), commercial feed, or a combination of all (lawrence, 2007). the size of the feed must suit the gape size of the larvae, approximately 100 μm (lawrence, 2007; wilson, 2012).

figure 7.13 example of a zebrafish spawning system. the system is designed to allow eggs to fall beneath a slotted insert to the bottom of the tank as a way to prevent the adult fish from consuming the eggs. photo provided by robin crisler.
water flow and feed size increase with development, with transition of the feed to artemia (brine shrimp) and/or use of a larger-particle commercial feed during days 8–15 postfertilization (wilson, 2012). once the juvenile stage is reached, around 29 days postfertilization, the fish are housed more like adult fish, with more frequent feeding and slower water flow to accommodate their remaining development and smaller size, respectively (wilson, 2012). as early as 2 months of age the fish are sexually mature. adult zebrafish can be housed in traditional glass aquaria or in elaborate computerized and automated systems that monitor and control water quality parameters such as temperature (typically 28.5 °C), ph, water hardness, salinity, dissolved oxygen, and nitrogenous wastes (lawrence, 2007; lawrence and mason, 2012). whether maintained manually or by computer, these parameters are important to monitor and maintain at appropriate levels to maximize the health of the fish. poor water quality can lead to disease in the fish (kent et al., 2012b). many of the organisms that cause disease in zebrafish are opportunists in the environment and remain subclinical until the fish is stressed, often due to problems with husbandry. appropriately maintained housing, combined with healthy water quality, avoidance of overcrowding, and a functional quarantine and health surveillance program, are key components of avoiding stress and disease. to date, there are no viruses documented in zebrafish as naturally occurring disease concerns (kent et al., 2012b). mycobacterium infections are the most frequently documented bacterial infections (kent et al., 2012a,b). class reptilia is made up of four orders classified as chelonia, rhynchocephalia, squamata, and crocodilia (frye, 1991).
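the parameter-monitoring logic that automated housing systems apply can be sketched as a simple range check. a minimal sketch, assuming placeholder acceptable ranges: only the 28.5 °C set point comes from the text; the ph and ammonia limits below are illustrative assumptions, not husbandry recommendations:

```python
# illustrative water quality check for a zebrafish system.
# ranges other than the 28.5 °C set point are placeholder assumptions.
ACCEPTABLE = {
    "temp_c":       (27.0, 29.5),   # text gives 28.5 °C as typical
    "ph":           (6.8, 7.5),     # assumed placeholder range
    "ammonia_mg_l": (0.0, 0.02),    # assumed placeholder range
}

def check_water(sample):
    """return the list of parameters missing or outside their range."""
    out_of_range = []
    for param, (lo, hi) in ACCEPTABLE.items():
        value = sample.get(param)
        if value is None or not (lo <= value <= hi):
            out_of_range.append(param)
    return out_of_range

reading = {"temp_c": 28.5, "ph": 7.2, "ammonia_mg_l": 0.10}
print(check_water(reading))  # the ammonia reading exceeds its placeholder limit
```

whether checks run on a manual log sheet or in an automated controller, the design point is the same: every parameter the text lists should be compared against an explicit acceptable range, and any excursion flagged for follow-up.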
in class amphibia, animals more commonly encountered in the research setting are in the order anura, containing frogs and toads (such as xenopus, bufo, rana, hyla, and dendrobates spp.), and in the order caudata, containing salamanders such as the tiger salamander, ambystoma tigrinum, and the axolotl, ambystoma mexicanum (national research council, 1974) (figs. 7.14 and 7.15). snakes and lizards are in class reptilia, order squamata; chelonians (turtles, tortoises, terrapins) are in order chelonia; and alligators, caimans, and crocodiles are in order crocodilia. in contrast to research in mammals, there is a tendency for reptile and amphibian research to be more oriented to studying evolution and ecology as opposed to basic science evaluating models of human disease (pough, 1991). salamanders and frogs are important for studying embryonic development, metamorphosis, regeneration, physiology, and climate change (burggren and warburton, 2007; hopkins, 2007; pough, 2007). reptiles are often studied because of their simpler cardiovascular systems as well as for evaluating mechanisms of immune responses, hormonal controls, and unique reproduction methods such as parthenogenesis (frye, 1991). of the amphibians, xenopus laevis (south african clawed frog) and xenopus tropicalis (western clawed frog) are commonly studied in the research setting. x. laevis is a prominent research model in comparative medicine and developmental studies, and is the most commonly studied species in the genus xenopus (denardo, 1995; schultz and dawson, 2003; o'rourke, 2007).

figure 7.14 a commonly used amphibian in research is the axolotl, ambystoma mexicanum. provided by chris konz.

figure 7.15 african clawed frogs, xenopus laevis, a commonly used amphibian. provided by randalyn shepherd.
advantages include large-sized eggs for ease of observing embryo development, as well as the wealth of published literature in areas of research such as evolution, neurobiology, regeneration, endocrinology, and toxicology (koustubhan et al., 2008; gibbs et al., 2011). rana catesbeiana (bullfrogs) have been used for developmental and toxicological studies, and for infectious disease study of the chytrid fungus batrachochytrium dendrobatidis (alworth and vazquez, 2009). a. mexicanum, in particular, is studied to understand the regenerative ability of the blastema of amputated limbs at the molecular level (gresens, 2004; rao et al., 2014). ambystoma tigrinum has been studied in regard to general amphibian decline in north america, environmental contaminants such as pesticides, and the effects of infection with ambystoma tigrinum virus (sheafor et al., 2008; kerby and storfer, 2009; chen and robert, 2011; kerby et al., 2011). a variety of snakes, crocodiles, lizards, and turtles have been studied in research. for example, anolis carolinensis (the green anole) has been used for the study of reproduction biology (lovern et al., 2004). caiman crocodilus and alligator mississippiensis (crocodilians) and trachemys scripta elegans (red-eared sliders) represent a few other examples of reptiles used in research (o'rourke and schumacher, 2002). amphibians and reptiles are considered to be ectotherms (greene, 1995). unlike mammals and birds, ectotherms are unable to internally regulate body temperatures above that of the ambient environment through metabolism and require complex behavioral and thermoregulatory adaptations to regulate temperature (pough, 1991; seebacher and franklin, 2005). in captivity, ectotherms typically require supplemental sources of heat to mimic the thermoregulatory effects of basking in the sun. some amphibians and reptiles are aquatic (xenopus frog spp.) whereas others are semiaquatic or terrestrial. x. laevis and x.
tropicalis are from geographically distinct areas and have different temperature requirements depending on life stage, with x. tropicalis adults generally at around 25 °C in their natural habitat versus about 20 °C for x. laevis (tinsley et al., 2010). the skin of amphibians is permeable to water, and some adults (semiterrestrial tree frogs in family hylidae, arboreal and terrestrial toads in bufonidae) may receive a significant portion of their daily water requirement via absorption through a vascular-rich region of the pelvic area termed the pelvic patch (pough, 2007; ogushi et al., 2010). the skin of some amphibians contains toxins which can cause arrhythmias in human handlers, for example, alkaloids from dendrobatid frogs and bufotoxins from toads of the genus bufo (denardo, 1995). the toxins serve to keep predators away but, as with xenopus, may harm the animals themselves through continued direct contact or diffusion through the water (tinsley et al., 2010; chum et al., 2013). the skin of amphibians is easily damaged; thus, to protect the animal during handling, powder-free gloves should be worn (gentz, 2007). researchers and animal care providers should investigate the natural environment of each species within their care and critically evaluate what features are required for normal behavior and physiology in order to provide the essential elements in the research setting (pough, 1991). in the wild, amphibians and reptiles live in ecological environments that span a range of diversity from tropical forest areas to dry desert. they may be arboreal, aquatic, or terrestrial. they are often secretive when in natural habitats, preferring to hide under vegetation or in crevices. parameters from the natural habitat to evaluate include temperature, humidity, nutritional requirements, natural diet, nocturnal versus diurnal behavior, and housing density.
temperature and lighting gradients should be established so animals can choose to move toward or away from the heat source as a way to avoid overheating. most amphibian species in the wild are nocturnal (pough, 2007; tinsley et al., 2010). amphibians and reptiles are sensitive to chemicals in the environment. water quality parameters (such as ph, hardness, ammonia, nitrate/nitrite, salinity, and conductivity) should be regularly monitored. chlorine and chloramines are often present in municipal water supplies and are toxic to aquatic species. water should be treated prior to use for aquatic species with an agent like sodium thiosulfate, since chloramine does not readily dissipate (browne et al., 2007). ammonia is a breakdown product of the reaction between chloramine and sodium thiosulfate and is a concern for aquatic animals (browne et al., 2007; koustubhan et al., 2008; o'rourke and schultz, 2002). a wide variety of caging materials may be used for housing, such as glass, plastic, stainless steel, or fiberglass, but they should be free of contaminants or harmful chemicals like bisphenol a that could leach from the caging into the water (levy et al., 2004; browne et al., 2007; bhandari et al., 2015). agents used to sanitize caging should be chosen to minimize the likelihood of harmful residues. environmental enrichment should be provided to encourage natural behaviors and can include providing cage mates for social interaction, cage accessories that serve as hiding spots or shelters (fig. 7.16), as well as providing a variety of food treats in changing locations for foraging opportunities (hurme et al., 2003). scents, sounds, and color choices may also be incorporated into enrichment strategies provided that they are carefully evaluated to ensure that they are beneficial and do not cause stress. for example, the tortoise chelonoidis denticulata may show a color preference for red-colored enrichment items (passos et al., 2014).
pvc tubes are another example of enrichment that has been provided to x. laevis for use as hiding cover (koustubhan et al., 2008). some species may require haul-out ramps, areas for sun basking, floating rest areas, or enrichment devices along the water's surface to help prevent drowning. one should consider the possibility of ingestion, as reptiles and amphibians may attempt to consume the substrates provided to them. the degree to which amphibians are social varies significantly depending on the species and is not always well understood. they use visual and olfactory discrimination to help them find food, forage, and avoid predators (vitt and caldwell, 2014). both in the wild and in captivity, reptiles and amphibians may exhibit excitatory behavior when fed (sometimes described as a "feeding frenzy"), which may result in animal injury where animals are in close proximity (divers and mader, 2006; tinsley et al., 2010). overcrowded tanks can result in competition for food and subsequent trauma. thus, when placed together for the first time, animals should always be observed for compatibility, and only members of the same species should be housed together. many reptiles and amphibians are escape artists, and prevention of escape and injury is a critical factor when considering housing design. species that are prone to jumping must have secured lids on their enclosures. the diets of amphibians and reptiles are highly variable in the wild and are species dependent. commercially prepared pelleted diets may be available and accepted by reptiles and aquatic amphibians; however, terrestrial amphibians and many reptiles may prefer live diets (pough, 2007). it is not unusual for some species in nature to fast for several days between meals (pough, 1991).

figure 7.16 use of a rabbit feeder for xenopus enrichment. photo by randalyn shepherd.
consultation with those experienced at successfully housing and feeding the species in question (zoos, nutritionists, herpetologists) is recommended. there are many different types of infectious agents, such as bacteria, viruses, fungi, and parasites, that can cause health problems in amphibians and reptiles, in addition to noninfectious conditions such as those resulting from nutritional imbalances, metabolic disease, neoplasia, trauma, and other spontaneous maladies. although significant advances in knowledge have been made over the past 100 years regarding disease in these species, much still remains unknown. it is not possible to go into detail here, but there are excellent reference texts for diseases in amphibians and reptiles that can be consulted (jacobson, 2007; frye, 1991; wright and whitaker, 2001). from a taxonomic standpoint, birds are placed into class aves, which includes multiple orders based on anatomical, physiological, and genetic characteristics. passeriformes is the largest order and contains songbirds and perching birds such as the finch, canary, and cardinal (fig. 7.17). order columbiformes contains pigeons and doves; order psittaciformes contains budgies and parrots such as the african gray; and order galliformes contains domestic fowl such as the chicken and quail (proctor and lynch, 1993; ritchie et al., 1994) (fig. 7.18). birds have been used as research models of human disease and are important in the evaluation of aging, memory, parasitology, atherosclerosis, reproduction, and infectious disease, among other topics (austad, 1997, 2011; maekawa et al., 2014). the genomes of several avian species have now been sequenced (jarvis et al., 2014). historically, chickens (gallus domesticus) are the most common bird species studied in biomedical and agricultural research and are a classic model in areas such as immunology, virology, infectious disease, embryology, and toxicology (scanes and mcnabb, 2003; kaiser, 2012).
chickens are also studied to evaluate reproductive development and retinal disease. embryonated chicken eggs have been used to commercially produce vaccines (such as for human influenza), studied for developmental analysis, and are now being treated with viral vectors like lentivirus to produce transgenic embryos. inbred lines with improved disease resistance are being developed, and transgenic technology may in the future allow embryos to be used as bioreactors to produce therapeutic proteins of interest and potentially to generate transgenic chickens with improved resistance to pathogens (bacon et al., 2000; scott et al., 2010). because chickens develop spontaneous ovarian cancers at an incidence of up to 35%, they are also a prominent model of ovarian cancer in humans (bahr and wolf, 2012; hawkridge, 2014). quail (coturnix coturnix and coturnix japonica) have been studied in many of the same research disciplines as chickens, but offer advantages because of their smaller size and because they are among the shortest-lived bird species (austad, 1997). japanese quail (c. japonica) have been selected as a model to evaluate reproductive biology and social behaviors such as mate selection because they readily show sexual behavior in captivity (ball and balthazart, 2010). as with the chicken, methods to study transgenic quail are now becoming available and offer a useful tool to study gene function (seidl et al., 2013). of the psittaciformes, amazon parrots and budgies (melopsittacus undulatus) are among the most commonly studied, with research topics including veterinary medicine, diagnostics, behavior, cognition, aging, and sensory studies (austad, 2011; kalmar et al., 2010).

figure 7.17 the zebra finch is a common avian species used in research. from http://www.redorbit.com/news/science/1112751282/male-zebra-finches-fake-song-121912/.

figure 7.18 the domesticated chicken commonly used in research. provided by kay stewart.
the african gray parrot has been studied for its cognition and communication abilities (hesse and potter, 2004; harrington, 2014). of the passerines studied in laboratory research, the most commonly evaluated include the zebra finch (taeniopygia guttata), european starling (sturnus vulgaris and sturnus roseus), and house sparrow (passer domesticus) (bateson and feenders, 2010). zebra finches and other songbirds are commonly studied in regard to aging and neurogenesis, in addition to speech, learning, and memory, because of their ability to learn and communicate intricate bird songs (harding, 2004; scott et al., 2010; austad, 2011; mello, 2014). the most popular songbird species for neurobiological research include the zebra finch, canary, and other types of small finches such as lonchura striata domestica (schmidt, 2010). zebra finches are favored in research settings since they are easy to house due to their small size, compatible in groups, and prolific breeders. they are also studied for biologic features such as sexual dimorphism, year-round singing in captivity, an age-dependent period of song-learning propensity, and ease of measurement with respect to their bird song (fee and scharff, 2010; mello, 2014). pigeons (columba livia) have been evaluated in areas such as comparative psychology, neuroanatomy, neuroendocrinology, and atherosclerosis (santerre et al., 1972; austad, 1997; shanahan et al., 2013). they are studied to understand the navigational skills and memory that allow homing, as well as their vision and discrimination ability. barn owls (tyto alba) are an example of a nocturnal avian species and are studied for neuroanatomy, vision, hearing, and understanding of learning mechanisms during auditory space mapping (pena and debello, 2010; rosania, 2014). birds are warm-blooded vertebrates that have feathers for flight and plumage.
their respiratory system includes avascular air sacs, some of which attach to the lung and bronchi but do not serve as sites for gas exchange as does the lung (maina, 2006; ritchie et al., 1994). air sacs serve as internal compartments which hold air and facilitate internal air passage, allowing birds to have a continuous flow of large volumes of air through the lungs as a way to increase oxygen exchange capacity and efficiency. birds lack a functional diaphragm and use muscles of the thorax to assist with respiration (ritchie et al., 1994). care must be taken to ensure that the use of physical restraint does not interfere with respiratory movement or cause the bird to struggle or become stressed. the skeletal system includes pneumatic bones, which are lined with air sac epithelium and are considered pneumatized by connection to the respiratory system (frandson et al., 2009). the specific bones which are pneumatized depend on the species but typically include the humerus, cervical vertebrae, sternum, sternal ribs, and sometimes the femur (ritchie et al., 1994). the esophagus in birds leads to the crop, an outpocketing where food is held temporarily, and then continues to the proventriculus (also called the true stomach), which produces enzymes to break down food. food travels from the proventriculus to the ventriculus (gizzard) and then on into the small and large intestines. the presence or absence of a gallbladder is species dependent (tully et al., 2009; kalmar et al., 2010). the rectum and urinary tract terminate in the cloaca, resulting in excreta in which the fecal portion of waste is mixed with urate (the white and/or creamy component). there are many additional unique and complex anatomic and physiologic adaptations of birds; other excellent references are available in the literature (scanes, 2015). housing requirements of birds held in captivity vary significantly depending on the particular species.
basic parameters that apply to all birds include the necessity to provide an enclosure which is safe and permits species-specific behaviors to the greatest extent possible. consideration should be given to ensure that the type of structure is nontoxic, as some birds such as parrots have a powerful beak with the ability to chew through substrates. enclosures may be made of metals or durable plastic, but it is important to note that zinc wire, as well as leaded paint, can be toxic to birds and is best avoided. bar spacing on caging should be appropriate to prevent escape and injury based on the size of the bird. caging size varies and can range from large aviaries, where full short-distance flight is possible, to individual housing in smaller cages where flight may not be feasible. use of environmental enrichment and provision of opportunities for interaction are important to include as part of the cage structure, complexity, and social dynamic. some types of birds are considered social and polygamous and benefit from group housing, whereas others, such as those that pair-bond (such as new world quail), may prefer housing with a single mate (ritchie et al., 1994). some species, genders, or individuals show aggression and may not be compatible. for example, sexually active male quail may injure each other and are generally considered incompatible (huss et al., 2008). to help reduce aggression, housing densities should be kept low and multiple points of access to resources, such as feed and perches, should be provided. enrichment in the form of manipulanda can take the form of toys and food items. some types of birds demonstrate foraging behavior in nature and may like to manipulate their feed. parrots, for example, typically grasp their food with their feet and may peel or strip the outer portion of the foodstuff prior to ingesting it. toys should be size appropriate for the species, easily sanitized, free from sharp edges, and replaced once wear shows.
birds can become easily caught in items that hang from the cage, and as toys deteriorate they can become a hazard. for example, rope toys may begin to fray, causing entrapment; and some types of toys contain weights which pose a choking hazard or may be made of toxic materials such as lead. some types of birds spend considerable time perching and require perches, which vary in diameter, for comfort and to prevent pressure sores from developing on their feet. the respiratory system of the bird is very sensitive, and caution must be taken by animal care staff to avoid exposure of birds to aerosols from chemicals that may arise from disinfectants used in the laboratory animal facility. scented cleaners, perfumes, hairspray, and emissions from teflon-coated materials are all examples of products which can be especially harmful to birds and may cause death. feeding requirements vary by species and life stage, but commercial pelleted diets designed to meet the nutritional needs can generally be provided. although many birds are seed eaters, a diet of seeds alone is unlikely to provide adequate or balanced nutrition. many birds have a requirement for dietary calcium, especially those that are reproductively active, and should be provided with calcium supplementation in the form of soluble grit such as cuttlebone or crushed oyster shells (sandmeier and coutteel, 2006; tully et al., 2009). birds often display neophobic behavior and may require long acclimation periods before fully accepting novel foodstuffs. for this reason, dietary changes should not be made abruptly and daily intake should be closely monitored. for birds in the laboratory setting, clean, fresh water should be provided daily either by use of nonbreakable bowls or sipper tubes. water intake will vary by species and environmental housing conditions. birds can mask disease and are easily stressed.
it is best to first observe the bird in its normal home environment whenever possible and only perform restraint for physical exam or collection procedures when indicated. general indications of sickness may include decreased appetite, depressed behavior, loose stools, distended abdomen, ruffled feathers or unkempt appearance, skin lesions, open-mouth breathing, abnormal respiratory sounds such as wheezing or sneezing, or signs of dehydration such as reduced skin turgor and sunken eyes. a healthy bird should have well-groomed feathers, appear alert, active and inquisitive, and should show species-typical behaviors. its eyes should be clear and bright. no evidence of discharge should be present from the eyes, nares, mouth, or urogenital area. numerous types of infectious (example, fig. 7.19) and noninfectious disease presentations are described in birds. [fig. 7.19: example of skin pox on the feet of a dark-eyed junco (junco hyemalis). photo from randalyn shepherd.] additional reference resources should be consulted for in-depth information (ritchie et al., 1994; tully et al., 2009; doneley, 2010). to provide the reader a broader view of animal use in research, descriptions of some less commonly used small mammal models follow. guinea pigs (cavia porcellus) are rodents, related to porcupines and chinchillas in the suborder hystricomorpha (fig. 7.20). they originate from the mountain and grassland regions along the mid-range of the andes mountains in south america. they are small, stocky, nonburrowing, crepuscular herbivores with short legs and little to no tail, ranging from 700 to 1200 g, females being smaller than males (harkness et al., 2010). guinea pigs have a long-standing historical role in research stretching as far back as the 1600s, when they were first used in anatomical studies (pritt, 2012). further, they were used by louis pasteur and robert koch in their
investigations of infectious disease, and have contributed to the work of several nobel prize-worthy studies (pritt, 2012). specifically, the guinea pig has been used as a model for infectious diseases such as tuberculosis, legionnaires disease, sexually transmitted diseases such as chlamydia and syphilis, and one of the more common causes of nosocomial infections in people, staphylococcus aureus (padilla-carlin et al., 2008). guinea pigs have also been useful tools in researching cholesterol metabolism, asthma, fetal and placental development and aspects of childbirth, as well as alzheimer's disease (bahr and wolf, 2012). guinea pigs have many similarities to humans hormonally, immunologically, and physiologically. unlike other rodents, and more like primates (including people), guinea pigs are prone to scurvy if they do not receive adequate vitamin c, typically in their diet (gresham et al., 2012). guinea pigs are housed similarly to other rodents, although they require more room than the smaller rodents. hamsters are of the rodentia order, suborder myomorpha along with the mouse and the rat. there are over 24 species of hamsters described in the literature, with the most common hamster used in research being the golden or syrian hamster, mesocricetus auratus (harkness et al., 2010) (fig. 7.21). originating from the northwest region of syria, golden hamsters are thought to be descendants of only three or four littermates collected from syria in 1930 (adler, 1948; smith, 2012). as their name implies, the typical wild-type coat is reddish gold along their dorsum, with a gray underside. they are granivores and insectivores, weighing 85–150 g, females weighing more than males, with short legs and short tail, and large cheek pouches (harkness et al., 2010). specific anatomical and physiological features including their susceptibility to disease and infection make them a useful model for study.
initially hamsters were utilized in studies of infectious disease, parasitology and dental disease, transitioning into cancer research in the 1960s (smith, 2012). hamsters are still used in many areas of research, including investigations into metabolic diseases like diabetes mellitus (hein et al., 2013), cardiovascular disease (russell and proctor, 2006), reproductive endocrinology (ancel et al., 2012), and oncology (tysome et al., 2012). hamsters have also been used as models for infectious disease associated with bacteria, parasites, and viruses, such as leptospirosis (harris et al., 2011), leishmaniasis (gomes-silva et al., 2013), and severe acute respiratory syndrome (sars) and ebola viruses (roberts et al., 2010; wahl-jensen et al., 2012). other species of hamsters have been used in research. for example, chinese and african hamsters have been used for investigations into diabetes mellitus (kumar et al., 2012); european and turkish hamsters have been useful to evaluate aspects of hibernation (batavia et al., 2013); and siberian and turkish hamsters have been used to study circadian rhythm and pineal gland activity (butler et al., 2008) (fig. 7.22). [fig. 7.22: siberian hamsters. photo from greg demas.] chinchillas (fig. 7.23) are in the order rodentia, suborder hystricomorpha, as are the guinea pig and the degu. [fig. 7.23: chinchilla. photo from bill shofner jr.] there are two species: the long-tailed chinchilla, chinchilla lanigera, and the short-tailed chinchilla, chinchilla chinchilla. chinchillas originate from the andes mountains of south america (martin et al., 2012). they are 400–800 g in size, females weighing more than males, with compact bodies and long, strong hind limbs and dense fur coats (alworth et al., 2012). the lushness of the coat is what led them close to extinction in the wild due to excessive hunting in the early to mid-1900s (jimenez, 1996). the chinchilla has a large head, large eyes and ears.
the large inner ear anatomy is of specific note as chinchillas are the traditional model for auditory studies (shofner and chaney, 2013) and otitis media (morton et al., 2012). the gerbil is a rodent, suborder myomorpha, used in research. there are over 100 species of gerbil-like rodents documented, but the mongolian gerbil (meriones unguiculatus) is the species most commonly used in the united states (fig. 7.24). [fig. 7.24: gerbil. photo used with permission of american association for laboratory animal science.] mongolian gerbils originate from a desert terrain in mongolia and northeast china. they are long-tailed, burrowing, herbivorous rodents, 55–130 g in size, males being larger than females (harkness et al., 2010). due to anatomical variations in the blood supply to the brain in an anatomical region known as the "circle of willis," gerbils have been used most notably as a model for cerebral ischemia or stroke (small and buchan, 2000). an interesting animal model to note among the small mammals is the nine-banded armadillo (dasypus novemcinctus), a new world mammal ranging from the southeastern half of north america, extending south through the americas to the northern region of argentina (balamayooran et al., 2015). armadillos have a banded carapace, and, importantly, a low core body temperature of 33–35 °c. the breeding season is in the summer, but embryo implantation is delayed until late fall, at which point identical quadruplicates are always formed (balamayooran et al., 2015). the armadillo's low body temperature, and susceptibility and physiologic response to the infectious organism, mycobacterium leprae, have made it an ideal model for studying leprosy (balamayooran et al., 2015). the consistent polyembryony of the species has also made the animal a model of interest in understanding various aspects of twinning (blickstein and keith, 2007). choosing the correct animal model is an essential component to the success of biomedical research.
each species used in biomedical research must be provided with adequate housing and care to ensure the well-being of the animals. because good science and good animal care go hand in hand, it is important to understand and address the biological and behavioral needs of the animals being studied.

references

origin of the golden hamster cricetus auratus as a laboratory animal
chinchillas: anatomy, physiology and behavior
a novel system for individually housing bullfrogs
american fancy rat and mouse association
the effects of osteoporosis on distraction osteogenesis: an experimental study in an ovariectomised rabbit model
birds as models of aging in biomedical research
a review of the development of chicken lines to resolve genes determining resistance to diseases
comparison of behavioural development in socially isolated and grouped rats
the armadillo as an animal model and reservoir host for mycobacterium leprae
japanese quail as a model system for studying the neuroendocrine control of reproductive and social behaviors
the effects of day length, hibernation, and ambient temperature on incisor dentin in the turkish hamster (mesocricetus brandti)
environmental enrichment for laboratory rodents and rabbits: requirements of rodents, rabbits, and research
the black death, 1346–1353: the complete history
effects of the environmental estrogenic contaminants bisphenol a and 17alpha-ethinyl estradiol on sexual development and adult behaviors in aquatic wildlife species
morphological and functional events associated to weaning in rabbits
on the possible cause of monozygotic twinning: lessons from the 9-banded armadillo and from assisted reproduction
biology of the rabbit
facility design and associated services for the study of amphibians
tickling induces reward in adolescent rats
ultrasonic vocalizations as indicators of welfare for laboratory rats (rattus norvegicus)
a melatonin-independent seasonal timer induces neuroendocrine refractoriness to short day lengths
repetitive backflipping behaviour in captive roof rats (rattus rattus) and the effects of cage enrichment
characterization of a new experimental model of osteoporosis in rabbits
1874. the septum atriorum of the frog and the rabbit
antiviral immunity in amphibians
a behavioral comparison of new zealand white rabbits (oryctolagus cuniculus) housed individually or in pairs in conventional laboratory cages
the social buffering effect of playful handling on responses to repeated intraperitoneal injections in laboratory rats
the technic of handling the zebra fish (brachydanio rerio) for the production of eggs which are favorable for embryological research and are available at any specified time throughout the year
ovulation induction in rabbit does: current knowledge and perspectives
characterization and development of courtship in zebrafish, danio rerio
amphibians as laboratory animals
a good practice guide to the administration of substances and removal of blood, including routes and volumes
avian medicine and surgery in practice
a critical period for social isolation in the rat
reproductive and developmental toxicity of hydrofluorocarbons used as refrigerants
zebrafish in the wild: a review of natural history and new notes from the field
the songbird as a model for the generation and learning of complex sequential behaviors
detection of zoonotic pathogens and characterization of novel viruses carried by commensal rattus norvegicus
rabbit as a reproductive model for human health
the rabbit as a model for reproductive and developmental toxicity studies
anatomy and physiology of farm animals
rabbit medicine and surgery for veterinary nurses
biomedical and surgical aspects of captive reptile husbandry
a comparison of the histological structure of the placenta in experimental animals
social and husbandry factors affecting the prevalence and severity of barbering ('whisker trimming') by laboratory mice
evidence for a relationship between cage stereotypies and behavioural disinhibition in laboratory rodents
heat or insulation: behavioral titration of mouse preference for warmth or access to a nest
impact of nesting material on mouse body temperature and physiology
nest building as an indicator of health and welfare in laboratory mice
energy reallocation to breeding performance through improved nest building in laboratory mice
medicine and surgery of amphibians
pheromonal regulation of reproductive success in female zebrafish: female suppression and male enhancement
metamorphosis and the regenerative capacity of spinal cord axons in xenopus laevis
golden hamster (mesocricetus auratus) as an experimental model for leishmania (viannia) braziliensis infection
nonavian reptiles as laboratory animals
an introduction to the mexican axolotl (ambystoma mexicanum)
guinea pigs: management, husbandry and colony health
inventory of the behaviour of new zealand white rabbits in laboratory cages
comparison of mouse and rabbit model for the assessment of strong pgm-containing oil-based adjuvants
computerized analysis of audible and ultrasonic vocalizations of rats as a standardized measure of pain-related behavior
review of polyclonal antibody production procedures in mammals and poultry
learning from bird brains: how the study of songbird brains revolutionized neuroscience
rabbit mitochondrial dna diversity from prehistoric to modern times
harkness and wagner's biology and medicine of rabbits and rodents
speaking of psittacine research
in vitro and in vivo activity of first generation cephalosporins against leptospira
the chicken model of spontaneous ovarian cancer
glp-1 and glp-2 as yin and yang of intestinal lipoprotein production: evidence for predominance of glp-2-stimulated postprandial lipemia in normal and insulin-resistant states
a behavioral look at the training of alex: a review of pepperberg's the alex studies: cognitive and communicative abilities of grey parrots. anal. verbal behav
use of a body condition score technique to assess health status in a rat model of polycystic kidney disease
extreme cuisine: the weird & wonderful foods that people eat
amphibians as models for studying environmental change
tickling during adolescence alters fear-related and cognitive behaviors in rats after prolonged isolation
utilization of caecal digesta by caecotrophy (soft faeces ingestion) in the rabbit
the zebrafish reference genome sequence and its relationship to the human genome
environmental enrichment for dendrobatid frogs
japanese quail (coturnix japonica) as a laboratory animal model
social isolation rearing-induced impairment of the hippocampal neurogenesis is associated with deficits in spatial memory and emotion-related behaviors in juvenile mice
teratogenic effects of thalidomide: molecular mechanisms
infectious diseases and pathology of reptiles: color atlas and text
effect of astragaloside iv on the embryo-fetal development of sprague-dawley rats and new zealand white rabbits
the extirpation and current status of wild chinchillas chinchilla lanigera and c. brevicaudata
audible and ultrasonic vocalization elicited by single electrical nociceptive stimuli to the tail in the rat
the long view: a bright past, a brighter future? forty years of chicken immunology pre- and post-genome
guidelines and ethical considerations for housing and management of psittacine birds used in research
documented and potential research impacts of subclinical diseases in zebrafish
diseases of zebrafish in research facilities
combined effects of virus, pesticide, and predator cue on the larval tiger salamander (ambystoma tigrinum)
combined effects of atrazine and chlorpyrifos on susceptibility of the tiger salamander to ambystoma tigrinum virus
stages of embryonic development of the zebrafish
establishing and maintaining a xenopus laevis colony for research laboratories
acute and chronic animal models for the evaluation of anti-diabetic agents
the biology and use of zebrafish, brachydanio rerio in fisheries research. a literature review
development of the antibody repertoire in rabbit: gut-associated lymphoid tissue, microbes, and selection
the husbandry of zebrafish (danio rerio): a review
generation time of zebrafish (danio rerio) and medakas (oryzias latipes) housed in the same aquaculture facility
zebrafish housing systems: a review of basic operating principles and considerations for design and functionality
historical perspectives
the effects of methotrexate on pregnancy, fertility and lactation
the green anole (anolis carolinensis): a reptilian model for laboratory studies of reproductive morphology and behavior
the mechanisms underlying sexual differentiation of behavior and physiology in mammals and birds: relative contributions of sex steroids and sex chromosomes
development, structure, and function of a novel respiratory organ, the lung-air sac system of birds: to go where no other vertebrate has gone
the aerosol rabbit model of tb latency, reactivation and immune reconstitution inflammatory syndrome
chinchillas: taxonomy and history
notes on the natural diet and habitat of eight danionin fishes, including the zebrafish danio rerio
animal models of atherosclerosis progression: current concepts
the zebra finch, taeniopygia guttata: an avian model for investigating the neurobiological basis of vocal learning
neurological diseases of rabbits and rodents
a functional tonb gene is required for both virulence and competitive fitness in a chinchilla model of haemophilus influenzae otitis media
a rabbit model of non-typhoidal salmonella bacteremia
neuroevolutionary sources of laughter and social joy: modeling primal human laughter in laboratory rats
blood sample collection in small laboratory animals
positive and negative ultrasonic social signals elicit opposing firing patterns in rat amygdala
enriching tortoises: assessing color preference
shelter enrichment for rats
environmental enrichment of new zealand white rabbits living in laboratory cages
recommendations for the care of amphibians and reptiles in academic institutions
manual of ornithology: avian structure & function
ferrets, rabbits, and rodents: clinical medicine and surgery
proteomic analysis of fibroblastema formation in regenerating hind limbs of xenopus laevis froglets and comparison to axolotl
inner workings: 1885, the first rabies vaccination in humans
avian medicine: principles and application
behavioural and neurochemical effects of early social deprivation in the rat
immunogenicity and protective efficacy in mice and hamsters of a beta-propiolactone inactivated whole virus sars-cov vaccine. ny) 43, 157
roy, c.s., 1879. the form of the pulse-wave: as studied in the carotid of the rabbit
small animal models of cardiovascular disease: tools for the study of the roles of metabolic syndrome, dyslipidemia, and atherosclerosis
normal development of the testes
management of canaries, finches, and mynahs
spontaneous atherosclerosis in pigeons. a model system for studying metabolic parameters associated with atherogenesis
an iacuc perspective on songbirds and their use in neurobiological research
housing and husbandry of xenopus for oocyte production
applications of avian transgenesis
physiological mechanisms of thermoregulation in reptiles: a review
transgenic quail as a model for research in the avian nervous system: a comparative study of the auditory brainstem
large-scale network organization in the avian forebrain: a connectivity matrix and theoretical analysis
antimicrobial peptide defenses in the salamander
the watanabe heritable hyperlipidemic (whhl) rabbit, its characteristics and history of development: a tribute to the late dr
development of an animal model for spontaneous myocardial infarction (whhlmi rabbit)
processing pitch in a nonhuman mammal (chinchilla laniger)
abnormal arrangement of the right subclavian artery in a rabbit
anatomy, physiology, and behavior
mitochondrial dna phylogeography of the norway rat
mandibular bone density and fractal dimension in rabbits with induced osteoporosis. oral surg
the behaviour and ecology of the zebrafish, danio rerio
assessment of anti-scarring therapies in ex vivo organ cultured rabbit corneas
vascular access port (vap) usage in large animal species
the victorian underworld
amphibians, with special reference to xenopus
handbook of avian medicine
a novel therapeutic regimen to eradicate established solid tumors with an effective induction of tumor-specific immunity
effects of husbandry and management systems on physiology and behaviour of farmed and laboratory rabbits
herpetology: an introductory biology of amphibians and reptiles
use of the syrian hamster as a new model of ebola virus disease and other viral hemorrhagic fevers
pseudomonas aeruginosa infectious keratitis in a high oxygen transmissible rigid contact lens rabbit model. invest. ophthalmol
amphibian medicine and captive husbandry
animal models for the atherosclerosis research: a review
construction of corneal epithelium with human amniotic epithelial cells and repair of limbal deficiency in rabbit models
rats, lice, and history

key: cord-270249-miys1fve
authors: liu, xianbo; zheng, xie; balachandran, balakumar
title: covid-19: data-driven dynamics, statistical and distributed delay models, and observations
date: 2020-08-06
journal: nonlinear dyn
doi: 10.1007/s11071-020-05863-5
sha:
doc_id: 270249
cord_uid: miys1fve

covid-19 was declared as a pandemic by the world health organization on march 11, 2020. here, the dynamics of this epidemic is studied by using a generalized logistic function model and extended compartmental models with and without delays. for a chosen population, it is shown as to how forecasting may be done on the spreading of the infection by using a generalized logistic function model, which can be interpreted as a basic compartmental model. in an extended compartmental model, which is a modified form of the seiqr model, the population is divided into susceptible, exposed, infectious, quarantined, and removed (recovered or dead) compartments, and a set of delay integral equations is used to describe the system dynamics. time-varying infection rates are allowed in the model to capture the responses to control measures taken, and distributed delay distributions are used to capture variability in individual responses to an infection. the constructed extended compartmental model is a nonlinear dynamical system with distributed delays and time-varying parameters. the critical role of data is elucidated, and it is discussed as to how the compartmental model can be used to capture responses to various measures including quarantining. data for different parts of the world are considered, and comparisons are also made in terms of the reproductive number.
the obtained results can be useful for furthering the understanding of disease dynamics as well as for planning purposes. since the first reported case of novel coronavirus (sars-cov-2) in wuhan, china, toward the end of 2019, this highly infectious disease first spread rapidly within china and to its neighboring countries [1, 2]. after china, the next confirmed cases occurred in japan, south korea [3], and thailand in late january 2020, and soon thereafter, the usa reported its first case in washington state. in february, europe faced its first outbreak in italy [4], with churches and schools closing immediately thereafter and towns being locked down [5]. by early march, following the world health organization's declaration of the coronavirus as a pandemic, the usa imposed a ban on travelers from europe and declared a national emergency on march 13. as of may 4, more than 253,000 people had died from covid-19, with 3.6 million infections confirmed in more than 180 countries and territories globally [6]. the usa has taken the hardest hit from the pandemic, with 1.22 million confirmed cases and more than 70,000 deaths, at the time of writing of this paper. the goal of this work has been to look at the data available on the number of infections and study the evolution of the infection dynamics. in order to facilitate the authors' quest to pursue data-driven dynamics based on infection data, a statistical approach based on the generalized logistic function has been taken along with studies of a delay differential system. the logistic function, which was introduced by pierre verhulst for population growth modeling [7], is now widely used in various areas of science and engineering. applications include friction modeling in mechanical systems [8], activation function in neural networks [9], and infectious disease spreading in biological systems [10].
the generalized logistic function, which has an asymmetric form in between the lower and upper horizontal asymptotes, was used by richards [11] for modeling plant growth. this function has been recently used for studying disease dynamics [12]. as for the studies in spreading dynamics, logistic functions and generalized logistic functions are essentially compartmental models [13] with a susceptible state and an infected state (si model). in reality, there exists a latent period, when people are infected but not yet infectious. consideration of this aspect leads to the seir model [14], in which one has susceptible, exposed, infectious, and removed (recovered or dead) states. moreover, if mitigation measures are applied, a new state of quarantine needs to be considered, which results in the seiqr model. however, in all of the aforementioned models, the derivatives of the different states are only dependent on their current values and assumed to be uniformly distributed. here, these models are extended through introduction of distributed time delays. delay differential equations (ddes) arise often in various science and engineering applications, for instance, in mechanical engineering [15–17]. different from ordinary differential equations, to solve ddes, one needs information on current states and past states over time intervals in the past [18]. ddes are critical for modeling the spreading of covid-19, since time delays can be used to capture the durations of the latent, quarantine, and recovery periods. an seir model with two constant delays can be found in the work of [19], a study of the global behavior of seir with delay can be found in the studies [20, 21], and an sir model with delays has been studied in [22]. furthermore, stability studies on the solutions of an sir model can be found in [23, 24], and solutions of an seir model can be found in the work of [25].
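to illustrate the history dependence that distinguishes ddes from odes, a minimal sir-type sketch with one constant delay τ (standing in for a latent period) can be integrated with a forward euler scheme that keeps a history buffer; the model form and all parameter values below are illustrative assumptions, not the distributed-delay seiqr model developed in the paper:

```python
import numpy as np

def sir_with_delay(beta=0.3, gamma=0.1, tau=5.0, n=1.0, i0=1e-4,
                   t_end=200.0, dt=0.1):
    """forward euler for a delayed sir sketch:
       s'(t) = -beta * s(t) * i(t - tau) / n
       i'(t) =  beta * s(t) * i(t - tau) / n - gamma * i(t)
    the array i[:] doubles as the history; i(t) = i0 is assumed for t <= 0."""
    steps = int(t_end / dt)
    lag = int(tau / dt)                          # delay expressed in euler steps
    s = np.empty(steps + 1)
    i = np.empty(steps + 1)
    s[0], i[0] = n - i0, i0
    for k in range(steps):
        i_lag = i[k - lag] if k >= lag else i0   # past state required by the dde
        new_inf = beta * s[k] * i_lag / n * dt
        new_inf = min(new_inf, s[k])             # cannot infect more than s
        s[k + 1] = s[k] - new_inf
        i[k + 1] = i[k] + new_inf - gamma * i[k] * dt
    return s, i

s, i = sir_with_delay()
```

the point of the sketch is only that advancing the state at time t requires the stored value i(t − τ), which is exactly the extra information an ode solver does not need.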
a primary objective of this paper is to use data-driven dynamics to further the current understanding of key aspects and features associated with the covid-19 outbreak, as well as to help assess the viability of control strategies that are applicable for mitigation, for example "flattening the curve." the remainder of the paper is organized as follows. in sect. 2, based on the susceptible-infected (si) population model, the s-shaped logistic and generalized logistic functions are introduced to describe the outbreak of covid-19 among various countries: china, italy, germany, and the usa. with the goal of finding the inflection points for these countries, parameters of the s-curve functions for different countries are identified using an optimization method. in sect. 3, by considering the incubation period and the quarantine state, an improved seiqr model with distributed time delays is developed. the time delay is employed here to capture the time gap between different (compartmental) states in the new model. with data of confirmed cases, parameters are identified and simulations together with predictions for different geographic regions are presented. with the developed models, potential mitigation strategies for different scenarios are proposed. finally, concluding remarks are provided.

in this section, the infection data of covid-19 for the us, as well as different countries from all over the world, are gathered by using a web-read function of matlab from the website of the covid tracking project [26] and worldometers [27], respectively. the determination of unknown parameters in the generalized logistic function is driven by the data of the total confirmed number of positive cases of covid-19, and a nonlinear regression algorithm is used. then, the inflection (peak) point of covid-19 is predicted and analyzed for different countries and regions.
followed by that, considering each of the regions as an element of the overall global system, a composite global model is constructed.

the susceptible-infectious (si) model, which is also called the simple epidemic model (sem), is the simplest form of all epidemic models, as shown in fig. 2a. the evolution of the infection number in an si model is also known as logistic growth and results in a logistic function (sigmoid function). the logistic function is centrosymmetric about the inflection point, which means the decreasing (saturation) period mirrors the growing period. however, for covid-19, from the current data, it can be gleaned that the infection growth occurs as a short outburst, followed by a long saturation period. in this scenario, the logistic function is no longer an appropriate descriptive function, since the saturation period needs to be slowed down. the generalized logistic function, also known as richards' curve, originally developed for plant growth modeling, is an extension of the logistic or sigmoid functions, allowing for more flexible s-shaped curves. by varying only one parameter in the generalized logistic function [11], the saturation period of the pandemic can be controlled, which makes the model and prediction easier and more realistic. hence, based on current data, the authors choose the generalized logistic function here to capture the infection growth dynamics of covid-19. the generalized logistic function

I(t) = L / (1 + e^(-k(t - T)))^(1/ν)    (1)

is the solution of richards' differential equation (rde), a nonlinear dynamical system, which can be written as

dI/dt = β I [1 - (I/N)^ν]    (2)

and eq. (1) is obtained from eq. (2) via direct integration, with the upper asymptote L playing the role of N and k = βν. in eqs. (1), (2), S = S(t) is the number of susceptible individuals in a population, while I = I(t) is the number of infectious individuals. N = S(t) + I(t) is a constant, which typically represents the total population.
the infection rate β > 0 represents the rate of spread of a pandemic and is the probability of transmitting disease from an infected individual to a susceptible individual. ν > 0 is the parameter affecting the position of the inflection point as well as the saturation period. for ν ≡ 1, one has a symmetrical s-curve about the inflection point, that is, the logistic function. for ν > 1, the saturation period is shorter and the result is a faster saturation compared to that noted with the logistic function. for 0 < ν < 1, the saturation period is longer than that obtained with the logistic function. therefore, for the covid-19 outbreak, it is expected that with the generalized logistic function and an optimized 0 < ν < 1, one will have a better match with the data. as shown in fig. 2b, the circles represent the number of confirmed positive cases in the usa from march 1, 2020, to may 4, 2020. the dotted line in blue is the least-squares fitting done by using the logistic function (ν = 1), while the dashed line in orange represents the least-squares fitting carried out by using the generalized logistic function with an optimized ν = 0.011. the goodness of fit of the models for the covid-19 data shows that the root mean squared error (standard error) of the generalized logistic function is 1.15 × 10^4, which is much smaller than that for the logistic function (3.7 × 10^4). the generalized logistic function provides a better fit because it is unbiased in both the growth period and the saturation period and produces smaller residuals. four parameters L, k, T, and ν in the generalized logistic function need to be identified from the covid-19 data by using nonlinear regression algorithms.

the generalized logistic function eq. (1) has four unknown parameters L, k, T, and ν, which should be identified and optimized based on the covid-19 data.
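a minimal numerical sketch of the generalized logistic function and of the role of ν may help here; the parametrization I(t) = L / (1 + exp(-k(t - T)))^(1/ν) and all parameter values below are illustrative assumptions, consistent with the four parameters L, k, T, and ν used for fitting:

```python
import math

def gen_logistic(t, L, k, T, nu):
    """richards curve I(t) = L / (1 + exp(-k*(t - T)))**(1/nu).
    L: upper asymptote; k: growth-rate parameter; T: time shift;
    nu: shape parameter (nu = 1 recovers the symmetric logistic)."""
    return L / (1.0 + math.exp(-k * (t - T))) ** (1.0 / nu)

# the smaller nu is, the longer the curve lingers below its asymptote L,
# i.e., the longer the saturation period
late_gap_logistic = 1.0 - gen_logistic(40.0, 1.0, 0.2, 0.0, 1.0)   # nu = 1
late_gap_slow     = 1.0 - gen_logistic(40.0, 1.0, 0.2, 0.0, 0.2)   # 0 < nu < 1
```

evaluating both gaps at the same late time shows the 0 < ν < 1 curve sitting farther from its asymptote, which is the slowed-down saturation described in the text.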
the nonlinear regression algorithm adopted here is a form of regression analysis in which the data can be modeled by any nonlinear function, such as the generalized logistic function. the objective of the nonlinear regression algorithm is to find an optimized point in the four-dimensional parameter space. although both linear and nonlinear regression algorithms are integrated into the curve fitting toolbox of matlab r2018a, finding the global optimum in the four-dimensional parameter space is still not an easy problem due to the nonlinearity of the generalized logistic function. therefore, to avoid a local optimum, the parameter identification processes with eq. (1) are carried out as shown in fig. 3. in each loop illustrated in the figure, the nonlinear regression algorithm is used to find a local minimum starting from widely varying (random) initial values of the parameters. finally, the most extreme of these local minima is chosen as the global minimum when the termination criteria are satisfied. here, the authors have used a maximum of 50 loops and a minimum coefficient of determination (r-squared value) of 0.99 as the termination criteria. if either criterion is satisfied, the identification process is terminated and the optimized parameters are picked according to the minimum standard error between the data and the model prediction. it should be noted that the proposed termination criterion cannot guarantee that a global optimum will be found, but the process is expected to greatly increase the probability of realizing a global optimum instead of a local one. based on the parameter identification approach described in this section, the covid-19 infection dynamics for several countries from north america, south america, europe, and asia are found to be captured well by using the generalized logistic function (fig. 4).
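the multi-start loop of fig. 3 can be sketched as follows. this is a simplified stand-in: the crude random-perturbation refiner below replaces the toolbox's nonlinear regression solver, and the synthetic data, start ranges, and seed are illustrative assumptions; only the multi-start structure (at most 50 loops, early stop at r² ≥ 0.99) follows the text.

```python
import math
import random

def gen_logistic(t, L, k, T, nu):
    return L * (1.0 + nu * math.exp(-k * (t - T))) ** (-1.0 / nu)

def r_squared(params, ts, ys):
    mean_y = sum(ys) / len(ys)
    sse = sum((y - gen_logistic(t, *params)) ** 2 for t, y in zip(ts, ys))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - sse / sst

def local_refine(params, ts, ys, iters=400, rng=None):
    """Crude local optimizer: random multiplicative perturbations with a
    decaying step size, accepting only improvements."""
    rng = rng or random
    best = list(params)
    best_r2 = r_squared(best, ts, ys)
    step = 0.3
    for _ in range(iters):
        cand = [p * (1.0 + step * rng.uniform(-1, 1)) for p in best]
        r2 = r_squared(cand, ts, ys)
        if r2 > best_r2:
            best, best_r2 = cand, r2
        step *= 0.995
    return best, best_r2

def identify(ts, ys, max_loops=50, r2_min=0.99, seed=1):
    """Multi-start identification as in fig. 3: random restarts, keep the
    best local optimum, stop early once R^2 >= r2_min."""
    rng = random.Random(seed)
    best, best_r2 = None, -math.inf
    for _ in range(max_loops):
        start = [max(ys) * rng.uniform(1.0, 2.0),   # L
                 rng.uniform(0.05, 0.5),            # k
                 rng.uniform(5.0, 60.0),            # T
                 rng.uniform(0.05, 2.0)]            # nu
        cand, r2 = local_refine(start, ts, ys, rng=rng)
        if r2 > best_r2:
            best, best_r2 = cand, r2
        if best_r2 >= r2_min:
            break
    return best, best_r2

# synthetic "data" generated from known parameters
ts = list(range(0, 61))
ys = [gen_logistic(t, 1000.0, 0.25, 30.0, 0.4) for t in ts]
params, r2 = identify(ts, ys)
```

on noiseless synthetic data the loop typically terminates early with a high r-squared value; with real, noisy case counts the restart ranges and the local solver matter much more.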
in this figure, the abscissa is the total number of test-positive cases, while the ordinate is the daily increase in the number of positive cases, that is, the derivative of the total cases. hence, fig. 4 can be considered as representative of the phase portraits of the dynamical system. from this plot with logarithmic coordinates, one can discern two periods of the covid-19 pandemic: a) the exponential growth period and b) the saturation period. besides, the time-domain covid-19 dynamics of china, south korea, italy, and the usa are illustrated in fig. 5a-d, respectively. in these figures, the left axis corresponds to the total number of test-positive cases, while the right axis corresponds to the daily increase in the number of positive cases. the results show that the generalized logistic function predictions are consistent with the covid-19 data of china, italy, and the usa with an r-squared value greater than 0.995; however, the r-squared value for the south korea data is 0.981, which is indicative of a rather discernible fitting error for the generalized logistic function in this case, as shown in fig. 5b. this is mainly because the generalized logistic function cannot capture the plateau from march 10 to april 5 in the daily increments data for south korea. the inflection points of the total cases, that is, the peaks of the daily increments, are identified with good consistency from the data. the identified peak of covid-19 in china is on february 7, while the peaks of south korea, italy, and the usa are 25 days, 51 days, and 67 days later, respectively. additional model-predicted curves for countries and regions all over the world are shown in fig. 6. for this figure, the countries and regions with more than 20,000 total confirmed cases by may 4 were chosen. it should be noted that the authors have treated each state of the usa as an individual region due to the large infected population in the usa.
based on the parameter identification approach, through the curves in this figure, the authors show the model-identified (predicted) daily increment cases from late january 2020, when the pandemic outbreak started in china, to the middle of august 2020. from these plots, it can be seen clearly that there are remarkable time lags among different regions for the outbreaks of the covid-19 pandemic. these remarkable time lags indicate that simple models with low degrees of freedom (dof) are not sufficient to capture the global dynamics of the covid-19 infections from all over the world. to show the distribution of the outbreak dates of the pandemic, histograms of the identified peak dates are presented in fig. 7. this histogram plot illustrates the outbreak of the pandemic from one region to another in quick succession after china and south korea. at the time of writing of the paper, the prediction was that all the countries and regions worldwide would have experienced the peak (at least, the first peak) of the covid-19 pandemic by july 1, 2020. given the significant time lags among different countries and regions for the outbreak of covid-19, it is natural to expect the global dynamics of the pandemic to be governed by a high-dimensional system; that is, it is not conceivable to capture the global infection dynamics by using a simple model with low degrees of freedom. hence, inspired by the finite element method (fem), which is widely used in mechanics, and considering each of the regions as an element of the overall global dynamic system, a composite global model is established to capture the combined dynamics of countries and regions all over the world. as shown in fig. 1, the parameter identification approach (fig. 3) is utilized for each elemental sub-model, following which the composite global model is constructed by assembling all of the 148 sub-models from different regions all over the world.
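the assembly step can be sketched as a plain superposition of sub-model daily-increment curves. the three (L, k, T, ν) tuples below are made-up placeholders standing in for identified regional sub-models, not values from the paper:

```python
import math

def gen_logistic(t, L, k, T, nu):
    return L * (1.0 + nu * math.exp(-k * (t - T))) ** (-1.0 / nu)

def daily_increment(t, params):
    """Daily new cases of one sub-model, as a one-day finite difference."""
    return gen_logistic(t, *params) - gen_logistic(t - 1.0, *params)

def global_daily(t, submodels):
    """Composite global model: the sum over all elemental sub-models."""
    return sum(daily_increment(t, p) for p in submodels)

# hypothetical sub-models for three regions with staggered outbreaks
submodels = [(80000.0, 0.25, 40.0, 0.5),
             (200000.0, 0.20, 65.0, 0.3),
             (1200000.0, 0.12, 100.0, 0.2)]
total_final = sum(p[0] for p in submodels)
```

because each element saturates at its own L, the composite total saturates at the sum of the L values, while the staggered T values produce the multiple waves discussed below.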
thus, the global model for the covid-19 dynamics at hand is a high-dimensional differential system with 148 sub-models. the outcome of the covid-19 dynamics modeling carried out by using a single generalized logistic function is shown in fig. 8. the notable deviation between the covid-19 data and the model prediction with the single generalized logistic function confirms the earlier statement that a low-dimensional model cannot be used to capture the global dynamics. by contrast, the outcome of the composite global model shown in fig. 9, which is comprised of 148 identified sub-models, matches the worldwide covid-19 data with good consistency for both the total number of infection cases and the daily increments. the prediction from the model with a single generalized logistic function is that the daily increment cases of covid-19 will drop to 10 thousand by late july; on the other hand, the prediction with the composite global model is that one will have daily increments of more than 100 thousand until august, which is worrisome. with the composite global model, one can see from fig. 9 that there have been two waves of the covid-19 pandemic spreading across the globe. the first wave started in late january and ended in late february in wuhan, china. subsequently, the second wave started mainly in europe and pushed to a surge in the usa. this surge, which has persisted for a long period in the usa, is being followed by one in which the spreading is occurring in russia, brazil, and india. it is important to note that all of the predictions made in this section are based on both the generalized logistic function and the covid-19 data (positive tested cases). the elemental sub-model of the system is kept quite simple to avoid any over-fitting of the system dynamics.
however, with this data-driven prediction, it is important to keep in mind that consideration has not been given to other factors, such as the rising temperature due to seasonal changes, the differences and coupling between the southern hemisphere and the northern hemisphere, and the measures and policies taken to counter the virus in different regions. to study the effects of these factors on the pandemic, a more sophisticated model needs to be established, taking into account a range of aspects, including but not limited to different perspectives on infection dynamics, control measures, viral transmission dynamics, stability of the dynamics, long-term predictions, and so forth. in this section, an improved epidemic model with time-varying parameters and distributed time delays is proposed based on the seiqr model. with this model, quantitative analysis of control measures and quarantining is conducted. based on the global covid-19 data, some key aspects and parameters that are reflective of the effects of the measures and policies taken by different countries are identified and discussed. the seiqr model is a compartment model widely used for epidemiological modeling of disease propagation [13, 28] as well as of computer viruses in the internet [29]. let it be supposed that the total population n is divided into five compartments: s(t), susceptible individuals; e(t), exposed individuals; i(t), infectious individuals; q(t), quarantined cases, under the assumption that quarantining leads to loss of infectivity; and r(t), removed cases (recovered or dead). as depicted in the flow diagram of fig. 10, transmissions occur among these compartments during the covid-19 pandemic. the dynamics of the transmission can be described as follows. individuals in the infectious (i) stage can transmit the infection to their neighbors through contacts. individuals in the susceptible (s) stage become infected, and they are assigned to the exposed (e) stage immediately once contacts have occurred. all the exposed (e) individuals are assumed to become infectious (i) after a (random) latency period ranging from 2 to 14 days.
the infectious (i) individuals might be quarantined (q) and isolated once they show symptoms and are tested. finally, quarantined (q) individuals are assigned to the removed (r) stage after some time. however, according to recent reports [30, 31], asymptomatically infected individuals widely exist, and they can transmit the virus. consequently, a secondary path of viral transmission needs to be considered; that is, infectious (i) individuals assigned to the removed (r) stage without being quarantined or isolated. it is notable that transmissions among these compartments take some random period for any exposed individual, and these time lags during the transmissions result in multiple distributed delays in the system. hence, based on the proposed flow diagram of fig. 10, the dynamical transfer flows among the s, e, i, q, and r compartments can be written in the form of differential equations as follows:

\dot{S} = -\delta_{se}, \quad \dot{E} = \delta_{se} - \delta_{ei}, \quad \dot{I} = \delta_{ei} - \delta_{iq} - \delta_{ir}, \quad \dot{Q} = \delta_{iq} - \delta_{qr}, \quad \dot{R} = \delta_{ir} + \delta_{qr}. \qquad (3)

here, (˙) is the derivative operation with respect to time t. δ_jk is the flow rate from compartment j to compartment k; that is, the subscripts se, ei, iq, ir, and qr denote the flow rates from s to e, e to i, i to q, i to r, and q to r, respectively. the flow rates δ_jk are nonlinear functions, which not only depend on the current state variables but are also related to the past states of the system. thus, time-delay effects are introduced into the system. intuitively, the time delays here arise from the period of incubation, the period of infection, and the period of quarantine. according to the flow diagram of fig. 10, the flow rates among the states δ_jk (for jk = se, ei, iq, ir, qr) can be written as

\delta_{se}(t) = \beta(t) \frac{S(t) I(t)}{N}, \quad
\delta_{ei}(t) = \int_0^{2\bar{\tau}_{ei}} p_{ei}(\tau)\, \delta_{se}(t-\tau)\, d\tau, \quad
\delta_{iq}(t) = \zeta \int_0^{2\bar{\tau}_{iq}} p_{iq}(\tau)\, \delta_{ei}(t-\tau)\, d\tau,

\delta_{ir}(t) = (1-\zeta) \int_0^{2\bar{\tau}_{ir}} p_{ir}(\tau)\, \delta_{ei}(t-\tau)\, d\tau, \quad
\delta_{qr}(t) = \int_0^{2\bar{\tau}_{qr}} p_{qr}(\tau)\, \delta_{iq}(t-\tau)\, d\tau. \qquad (4)

here, β is the infection rate, which is the average number of effective contacts of each infectious individual with others per unit time. β can be a time-dependent variable, that is, β(t), controlled by taking measures against the epidemic. ζ is the quarantine rate, which is assumed to be a constant in this model.
besides these two parameters, there are four distributed delays in the integral equation system eqs. (4): the delay period from e to i (τ_ei), from i to q (τ_iq), from i to r (τ_ir), and from q to r (τ_qr). for different individuals, the period from one state to another can be very different. however, in the statistics of a large population, the periods of incubation, infection, quarantine, and recovery should follow some form of distribution. this has prompted the authors to consider a novel model with distributed delays in this research. in the literature [32, 33], the weibull distribution, gamma distribution, normal distribution, and other positively skewed distributions are widely used. without loss of generality, the authors assume all of the distributed time delays τ_jk (for jk = ei, iq, ir, qr) to follow the normal distribution, as shown in fig. 11. hence, the probability density function p_jk(τ_jk) in eqs. (4) follows the normal distribution with a mean value τ̄_jk and a standard deviation σ_jk, and this distribution is written as

p_{jk}(\tau_{jk}) = \frac{1}{\sigma_{jk}\sqrt{2\pi}} \exp\left[ -\frac{(\tau_{jk} - \bar{\tau}_{jk})^2}{2\sigma_{jk}^2} \right], \quad jk = ei, iq, ir, qr. \qquad (5)

the domain of the probability density function p_jk(τ_jk) above is all real numbers spanning −∞ to +∞. however, the integral interval of the distributed delay in eqs. (4) is chosen to be [0, 2τ̄_jk] to exclude the nonphysical interval less than zero and to make the numerical integration feasible. it should be noted that this truncation causes the integral of the density function over the entire integration interval, that is, the area of the shaded region in fig. 11, to be less than 1. to normalize the probability density function p_jk in eq. (5), a compensation approach can be used by dividing the density function by the shaded area, for instance, dividing p_jk by a compensation factor 0.68 for σ_jk = τ̄_jk, by 0.95 for σ_jk = τ̄_jk/2, and by 0.997 for σ_jk = τ̄_jk/3. here, the authors have chosen σ_jk = τ̄_jk/4, with a compensation factor 0.9999, and the resulting probability density function is shown in fig. 11. the proposed improved seiqr model at hand, given by eqs. (3), (4), is a time-varying system with multiple distributed delays. in eqs. (4), δ_se is the transition rate from the compartment of susceptible individuals to the compartment of exposed individuals, and it is also called the force of infection [34]. for a susceptible individual, the transition from s to e is instantaneous once the individual makes an effective contact with an infectious individual. the remaining four equations in eqs. (4), which respectively represent the flow rates of the transitions from e to i, i to q, i to r, and q to r, are delay integral equations. these four delay integral equations introduce four distributed delays τ_jk into the system. the distributed delays τ_jk obey normal distributions with mean values τ̄_jk and standard deviations σ_jk. due to the time delays, the system dimension is infinite, as the continuity and distribution of the delays introduce an infinite number of time delays into the system. hence, analytical solutions for the proposed system are impossible, and numerical simulation is conducted based on discretization methods [35]. compared with the most recent epidemic model with discrete delays given in [36], the proposed model with distributed delays takes the individual differences in symptoms into consideration. this is expected to result in a more realistic dynamic response to the virus and help improve the epidemic model. the proposed dynamical system, given by eqs. (3), (4), is a set of time-varying nonlinear differential equations with multiple distributed delays. due to the lack of a universal solver for this specific problem, numerical studies have been conducted with both the conventional seiqr model and the improved seiqr model with distributed delays, as shown in fig. 12. for this comparison, the quarantine rate ζ is set to zero; this degenerates the model to the seir model without quarantine. the population n is set to 10 thousand.
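the compensation factors quoted above follow directly from the mass of the normal distribution on the symmetric interval [0, 2τ̄_jk], which equals erf(τ̄_jk / (σ_jk √2)); a short check using only the standard library:

```python
import math

def truncation_mass(tau_bar, sigma):
    """Probability mass of a normal distribution with mean tau_bar and
    standard deviation sigma on [0, 2*tau_bar].  Because the interval is
    symmetric about the mean, the mass is erf(tau_bar / (sigma*sqrt(2)))."""
    return math.erf(tau_bar / (sigma * math.sqrt(2.0)))

# compensation factors for sigma = tau_bar, tau_bar/2, tau_bar/3, tau_bar/4
factors = [truncation_mass(5.0, 5.0 / m) for m in (1, 2, 3, 4)]
```

the four values reproduce the 0.68, 0.95, 0.997, and 0.9999 compensation factors quoted in the text; dividing p_jk by the corresponding factor renormalizes the truncated density to unit mass.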
the mean values of the distributed delays are set as follows: τ̄_ei = 5 days, τ̄_iq = 4 days, τ̄_qr = 11 days, and τ̄_ir = 15 days. the standard deviation is set to one-fourth of the mean value of each distributed delay; that is, σ_jk = τ̄_jk/4. as shown in fig. 12a, the infection rate β(t) is set to 0.8 initially and drops to 0.1 on the 30th day (the vertical dot-dashed line in the figure), which represents the day on which measures taken to counter the epidemic are put in place. in figs. 12b, c, the y-axis on the left is the accumulated total of infected cases, while the y-axis on the right is the daily increment in infection cases. they can be written as

\text{total infected cases}(t) = N - S(t), \qquad \text{daily increments of infection}(t) = -\dot{S}(t) = \delta_{se}(t). \qquad (6)

the dynamic responses obtained with the conventional seiqr model (fig. 12b) and the improved model (fig. 12c) are found to be consistent with each other before the measures are taken. however, there are significant differences between the dynamics of these two models after the measures are put in place on the 30th day. in the conventional seiqr model, an immediate change in the predicted response can be observed with the drop of β(t), resulting in a nonsmooth peak. with this model, one notes that the measures taken to counter the virus have an immediate effect on the daily increase in infection cases. intuitively, this immediate response is not realistic, and it does not agree with the covid-19 data either. in contrast, with the improved model, the response continues to increase until it reaches a smooth peak in the daily increase in cases 3.8 days later than the date on which the measures were initiated. this delayed response is more realistic and conforms with the covid-19 dynamics worldwide: for instance, in the covid-19 data from wuhan, china, one notes a peak of daily increments on february 4, 2020, while the city was locked down on january 23, 2020. besides, with the improved model, one notes a small bump around the fortieth day.
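the distributed-delay system can be simulated by discretizing time (here with a one-day step) and replacing each delay integral with a discrete convolution against a truncated, renormalized normal kernel. the sketch below uses ζ = 0 (so the model degenerates to a delayed seir) and illustrative values β_0 = 0.3, β_1 = 0.04 rather than the paper's figure-12 parameters; it reproduces the qualitative effect discussed above, namely that the daily increment of newly infectious cases keeps rising for several days after β(t) drops.

```python
import math

def normal_kernel(tau_bar, sigma):
    """Discretized, renormalized normal density on [0, 2*tau_bar]."""
    w = [math.exp(-((tau - tau_bar) ** 2) / (2.0 * sigma ** 2))
         for tau in range(2 * int(tau_bar) + 1)]
    s = sum(w)
    return [x / s for x in w]

def simulate(days=120, N=10000.0, beta0=0.3, beta1=0.04, t_measure=30,
             zeta=0.0, tau_ei=5, tau_iq=4, tau_ir=15, tau_qr=11):
    p_ei = normal_kernel(tau_ei, tau_ei / 4.0)
    p_iq = normal_kernel(tau_iq, tau_iq / 4.0)
    p_ir = normal_kernel(tau_ir, tau_ir / 4.0)
    p_qr = normal_kernel(tau_qr, tau_qr / 4.0)
    S, E, I, Q, R = N - 5.0, 0.0, 5.0, 0.0, 0.0
    d_se, d_ei, d_iq = [], [], []          # histories of flow rates
    daily_new_infectious = []

    def conv(hist, kernel, t):
        # discrete counterpart of the delay integrals in eqs. (4)
        return sum(kernel[tau] * hist[t - tau]
                   for tau in range(len(kernel)) if t - tau >= 0)

    for t in range(days):
        beta = beta0 if t < t_measure else beta1
        d_se.append(beta * S * I / N)
        d_ei.append(conv(d_se, p_ei, t))
        d_iq.append(zeta * conv(d_ei, p_iq, t))
        d_ir = (1.0 - zeta) * conv(d_ei, p_ir, t)
        d_qr = conv(d_iq, p_qr, t)
        S -= d_se[t]
        E += d_se[t] - d_ei[t]
        I += d_ei[t] - d_iq[t] - d_ir
        Q += d_iq[t] - d_qr
        R += d_ir + d_qr
        daily_new_infectious.append(d_ei[t])
    return daily_new_infectious, (S, E, I, Q, R)

inc, state = simulate()
peak_day = inc.index(max(inc))
```

in contrast to a delay-free seiqr model, where the daily increments peak exactly on the day β(t) drops, here the peak lags the intervention by roughly the mean incubation delay, which is the smooth, delayed response described in the text.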
this feature matches well the covid-19 dynamics in many countries and regions, such as south korea. in summary, when the system is time-invariant, both the conventional seiqr model and the improved model with distributed delays can capture the dynamics of the epidemic well. when the system is time-varying, which is believed to be the scenario with the covid-19 pandemic, the improved seiqr model with distributed delays is more realistic and has significant advantages in capturing the system responses compared to the conventional seiqr model. infection control measures to reduce the transmission of covid-19 include universal source control, for example, covering the nose and mouth to contain respiratory secretions, early quarantine, identification and isolation of patients with suspected disease, use of appropriate personal protective equipment, and environmental disinfection [38, 39]. in the proposed model of eqs. (3), (4), quarantine is used to separate individuals who have been exposed to covid-19 and confirmed to be positive by testing away from others. quarantine helps prevent the spread of the disease by isolating infectious individuals from others. depending on the policies in different regions, quarantined people may be in isolation or may just stay at home. here, the quarantined individuals are those who have tested positive and are isolated from others. an infectious individual loses infectivity once that person is quarantined. the quarantine rate ζ is the ratio of quarantined cases to infectious cases. extensive covid-19 testing and screening can increase the quarantine rate ζ, which requires more coronavirus testing according to the world health organization. the quarantine rate ζ and the infection rate β are the only two parameters that can be used to control the spreading of the virus in the improved seiqr model with distributed time delays, given by eqs. (3), (4).
in this subsection, control of the infection spread is studied by tuning the infection rate β and the quarantine rate ζ. two variables, namely the "total confirmed cases" and the "daily increments of confirmed cases," are introduced into the improved seiqr model, since the states s, e, i, q, and r in eq. (3) are almost impossible to observe directly in the real world. the total number of confirmed cases and its derivative, that is, the daily increments in confirmed cases, are the most widely used data in epidemiology for covid-19. according to the proposed model with distributed delays (fig. 10), they can be written as

\text{total confirmed cases}(t) = \int_0^{t} \delta_{iq}(s)\, ds, \qquad \text{daily increments of confirmed cases}(t) = \delta_{iq}(t).

the total cases and daily increments defined above include only the infectious individuals confirmed by tests, and they are observable variables, which distinguishes them from the total infected cases and daily increments of infection, as given in eq. (6). to show the effectiveness of quarantining in preventing an epidemic from spreading, a quantitative analysis of epidemic control is carried out. as shown in fig. 13, as the quarantine rate ζ is increased from 10 to 50% and then 100%, the number of total infected cases (dashed blue lines) shows a slight drop from almost 1 million to 0.8 million, while the daily increments of infections (dashed orange lines) show significant drops from 56k per day to 22k per day. however, both the total confirmed cases and the daily increments of confirmed cases increase, due to the increase of covid-19 testing in quarantine. the results reveal that the total number of infected cases does not drop significantly with more quarantining, but the peak of the daily infections of an epidemic can be slowed down by increasing the quarantine rate. more comparisons of the daily increments of infections for different quarantine rates ζ (0% to 100%) have been carried out, and the corresponding results are shown in fig. 14.
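the qualitative effect of the quarantine rate can be reproduced even with a conventional (delay-free) seiqr system; a sketch with assumed, illustrative rate constants (α for e→i, γ for i→r, η for q→r — not the paper's identified values):

```python
def seiqr_peaks(beta=0.5, zeta=0.05, alpha=0.2, gamma=1.0 / 15, eta=1.0 / 11,
                N=10000.0, days=400, dt=0.1):
    """Euler integration of a conventional seiqr model; returns the peak
    daily-infection rate (beta*S*I/N) and the day on which it occurs."""
    S, E, I, Q, R = N - 5.0, 0.0, 5.0, 0.0, 0.0
    peak, peak_t = 0.0, 0.0
    for n in range(int(days / dt)):
        new_inf = beta * S * I / N
        dS = -new_inf
        dE = new_inf - alpha * E
        dI = alpha * E - (zeta + gamma) * I   # quarantine removes infectivity
        dQ = zeta * I - eta * Q
        dR = gamma * I + eta * Q
        S += dS * dt; E += dE * dt; I += dI * dt; Q += dQ * dt; R += dR * dt
        if new_inf > peak:
            peak, peak_t = new_inf, n * dt
    return peak, peak_t

low = seiqr_peaks(zeta=0.05)    # little quarantining
high = seiqr_peaks(zeta=0.30)   # aggressive quarantining
```

raising ζ shortens the effective infectious period, which both lowers and delays the peak of daily infections, that is, it "flattens the curve" in the sense discussed next.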
the curves show that an increase in the quarantine rate ζ can "flatten the curve" [40], which can be helpful from the standpoint of a healthcare system. with the effects of "flattening the curve" as illustrated in this figure, the peak value of the daily increments of infection is reduced from 50k per day to 13k per day, and the peak position is also delayed by 17 days, allowing more time for healthcare capacity to increase and better cope with the patient load. besides the increase in the quarantine rate, decreasing the infection rate β is the other approach to control the spreading of an epidemic. an effective decrease in the infection rate β is generally due to control interventions, including the lockdown of a city or region, stay-at-home orders, social distancing measures, wearing masks in public places, and so on. a lot of measures, policies, and orders have been put in place by local municipalities, governments, and state governments worldwide since the outbreak of covid-19. to capture the effects of these measures, the authors have introduced a time-varying infection rate β(t) into the covid-19 model. as shown in figs. 15a, c, e, a piecewise linear function is used to describe the time-varying infection rate β(t). the drop of β(t) on the 26th day is reflective of the measures and orders that are taken to counter the epidemic. here, it is assumed that it takes seven days for the measures and directives to be completely activated; this results in a slope during the linear drop of the infection rate β(t), as seen in the figures. in figs. 15a, b, the infection rate drops from β_0 = 1 to β_1 = 0.13 after measures are taken against the epidemic. from the predicted system response, it can be observed that the daily increments of infection continue to increase until a local peak is reached 8 days after the measures are initiated. then, after a short period (five days) of decline, the daily increments of infection start increasing again and the system diverges.
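the piecewise linear infection rate described above can be written as a small helper function (a sketch assuming, as in the text, a seven-day activation ramp; the variable names are illustrative):

```python
def beta_t(t, beta0, beta1, t_m, ramp=7.0):
    """Time-varying infection rate: beta0 before day t_m, a linear drop
    over `ramp` days while the measures take effect, beta1 thereafter."""
    if t < t_m:
        return beta0
    if t >= t_m + ramp:
        return beta1
    return beta0 + (beta1 - beta0) * (t - t_m) / ramp
```

for example, with t_m = 26 the rate is beta0 up to day 26, beta1 from day 33 onward, and halfway between the two at day 29.5.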
such a rising tail can overwhelm health systems, leading to high fatalities and, eventually, herd immunity. in figs. 15c, d, with the infection rate dropping to β_1 = 0.09, the system response can be said to be convergent, and the daily increments of infection drop slowly but with a very long tail. in this scenario, the daily increments of infection are flattened by taking measures, but the total number of infected cases continues to increase almost linearly, and this can still lead to a tremendous number of infected cases after a very long saturation period. in figs. 15e, f, with the infection rate dropping to β_1 = 0.025, the daily increments of infection drop rapidly with a very short tail, and there is a quick saturation in the total number of infected cases. for the control of covid-19, this is the best scenario, since the epidemic can be ended within one month after taking measures. however, dropping the infection rate to 0.025 is a great challenge for both the government and the people. with the results of figs. 15b, d, f, the authors have illustrated that slightly different infection rates lead to quite different consequences. an undesirable scenario is one with herd immunity. in this case, one can have a breakdown of healthcare systems and high fatalities. a slightly better scenario is the slow-drop case with a very long tail. in this case, the pandemic is under control and will eventually converge. however, the suffering can be prolonged for a long time. relatively speaking, the best scenario is the rapid-drop situation with a short tail. in this case, the pandemic is ended quickly, within about one month after taking measures, and the suffering is not as prolonged as in the previous situation.
by examining the current covid-19 data from all over the world, one can notice that all the covid-19 curves in different countries and regions can be categorized into the three types mentioned above, and the associated dynamics can also be explained by the seiqr model with distributed delays and time-varying infection rates. the nonlinear distributed delay system given by eqs. (3), (4) provides a feasible mathematical framework to describe the evolution of the covid-19 infection dynamics as well as to model the control performance associated with quarantine and other measures. to have a better understanding of the covid-19 dynamics and to forecast an epidemic's evolution with high confidence, identification of the nonlinear dynamical system from the time series of covid-19 data is highly important. based on the nonlinear regression algorithm and the parameter identification approach described in fig. 3, a similar approach is used to fit the dynamics of the proposed epidemic model eqs. (3), (4) to the covid-19 data. based on the data from the web sites of the covid tracking project [26] and worldometers [27], the following parameters are identified from the data: β_0, the original infection rate before any measures are taken; β_1, the infection rate after measures are taken; t_m, the start date for the implementation of measures; ζ, the quarantine rate; r_0, the original reproduction number before any measures are taken; and r_1, the reproduction number after measures are taken. here, the reproduction numbers r_0 and r_1 are deduced from the stability of the delay system eqs. (3), (4), after linearization and discretization of the continuous distributed delays. the reproduction number is a floquet multiplier of the simplified system of eqs. (3), (4), and it can be approximated in terms of the infection rate and the mean delay periods, where r_{0,1} is used to denote r_0 or r_1 and β_{0,1} is used to denote β_0 or β_1.
the floquet multiplier r_{0,1} determines the stability of the solution obtained for the epidemic dynamics; that is, if the magnitude of r_{0,1} is greater than 1, the system is unstable and the consequence is herd immunity, which is an undesired scenario, as mentioned in the last paragraph of sect. 3.3. if the magnitude of r_{0,1} is 1, the system is critically stable. if the magnitude of r_{0,1} is less than 1, the system is stable. for the covid-19 pandemic, the initial reproduction numbers r_0 before measures are taken are always greater than 1 (generally between 3 and 4), as identified in table 1. the reproduction number after measures are taken, r_1, is the key to determining the evolution of the epidemic dynamics. the smaller the magnitude of r_1, the faster the covid-19 spreading ends. with the parameters identified from the covid-19 data, as listed in table 1, the predicted data-driven dynamics of covid-19 in these countries are shown in fig. 16. the results illustrate excellent consistency between the covid-19 data and the predictions from the constructed seiqr model with distributed delays. the proposed model can capture most of the features of the covid-19 dynamics observed in the data, such as the bump in the daily increments of south korea from march 9, 2020, to march 25, 2020, and the continuous linear increase in the total infection cases in the usa and uk. the identified date t_m on which the measures were initiated also matches reality; for example, the identified date t_m for china is january 31, 2020, while the actual period of the lockdown of cities and the stay-at-home order issued by the chinese government is in the window of january 23, 2020-january 28, 2020. from the results shown in fig. 16 and table 1, one can also assess the effectiveness of the measures taken to counter covid-19 in different countries.
as listed in table 1, the r_1 numbers of china and south korea indicate that these countries undertook measures that helped best control the covid-19 spread, with a reproduction number r_1 < 0.5. the covid-19 control measures in france, germany, italy, and spain were also effective, but with a slightly higher reproduction number r_1. comparatively speaking, the covid-19 control measures in the usa and uk have not been as effective, at the time of writing this paper, as those undertaken in the previously mentioned countries, since the reproduction number r_1 is close to 1; that is, the system is critically stable. in this scenario, the daily infected cases will continue to drop gradually with a very long tail, and one can reach a tremendously high number of total infected cases, as shown in figs. 16g, h. these results suggest that the usa and uk will experience covid-19 spreading for an extended period of time. switching to the southern hemisphere, the reproduction number r_1 of brazil is still around 2. this means an exponential outbreak of the covid-19 pandemic. if this reproduction number is not brought under 1 in the following weeks, the scenario can potentially lead to herd immunity, as shown in fig. 16i. this is not a welcome situation for healthcare systems. in this work, the spreading of covid-19 among different geographical regions worldwide has been modeled and studied based on the concepts of data-driven dynamical systems. first, the authors have used generalized logistic functions to study the local infection spreading in different regions. based on a nonlinear regression algorithm, the statistical fit of the generalized logistic model has been optimized in a four-dimensional parameter space, and the system dynamics prediction and forecasting are driven by the covid-19 data.
subsequently, inspired by the notion of the finite element method from mechanics, a composite global model with 148 elements (sub-models for different regions) is established. in this composite global model, each region worldwide is regarded as an element of the overall global system, and the global model construction is based on covid-19 data from all over the world. this construction of a global model based on generalized-logistic-function-based statistical models for local regions, and the use of this model to make predictions and forecasts, is one of the original contributions of this work. this methodology may be employed for studies of the global spreading of other pandemics as well. as an extension of the generalized logistic function model, which in essence is a two-compartment model, an extended compartment model is constructed based on the seiqr model. in this model, both time-varying parameters, such as time-varying infection rates, and distributed time delays, which reflect the differences in individual responses to an infection, are introduced. this is another important and original aspect of this work. with this model, one is able to quantitatively assess the effectiveness of different control measures taken, such as the lockdown of regions and quarantining. again, the methodology employed here may be adopted for studies of other epidemics. based on the covid-19 data, some key parameters, which can reflect the effects of the measures and policies taken in different regions and different countries, have been identified and discussed. based on the observed data-driven dynamics, the following remarks are made. (i) there are significant time lags among different regions for the outbreak of the covid-19 pandemic, as shown in figs. 6 and 7. based on the current data and situation, from the generalized logistic model prediction, one can glean that most of the countries and regions worldwide would have passed the infection peak of covid-19 by july.
for understanding the global dynamics of the covid-19, the proposed composite global model is attractive compared to a model with a low number of degrees of freedom. the composite global model has helped capture two or more waves of covid-19 sweeping the globe. the first phase came to notice in late january in wuhan, china. subsequently, the second phase started in europe and moved to the usa. the next spreading phase is expected to be dominated by the dynamics in the usa, russia, brazil, and india, as shown in fig. 9. (ii) the improved seiqr model with a time-varying infection rate and distributed delays is found to be better suited for understanding covid-19 infection dynamics with and without control measures. with this model's predictions, one is able to capture not only the effectiveness of the control measures but also anomalies such as the bump seen in the south korea data. based on the prediction comparison with available covid-19 infection data, it is clear that with just a conventional seiqr model, one is not able to capture the covid-19 dynamics well. with the new distributed delay model presented here, one can also understand how measures such as quarantining can lead to observations such as the "flattening of the curve" (fig. 14). it is also shown that slight differences in infection rates can lead to quite different consequences. to control the covid-19 infection dynamics, the measures taken against the epidemic need to be strict enough to guarantee that the infection rate β 1 is less than a critical value, as shown in fig. 15. (iii) based on the data-driven covid-19 dynamics studied with the distributed delay model, it is evident that the measures taken in countries such as china and south korea were effective in dropping the reproduction number r 1 below 0.5. with regard to europe, in france, germany, italy, and spain, after the measures that were taken, the r 1 value dropped to the range of 0.6 to 0.8.
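a minimal version of a seiqr compartment model with a time-varying infection rate (but without the paper's distributed delays) can be sketched with explicit euler steps; every rate below (beta values, sigma, gamma, q, lam, lockdown day, population) is an assumption for illustration only.

```python
# Minimal SEIQR sketch with a time-varying infection rate (Euler steps).
# S: susceptible, E: exposed, I: infectious, Q: quarantined, R: removed.

def beta(t, lockdown_day=30.0, beta0=0.6, beta1=0.15):
    """Piecewise-constant infection rate: control measures reduce beta."""
    return beta0 if t < lockdown_day else beta1

def simulate(days=200, dt=0.1, N=1e6):
    sigma, gamma, q, lam = 0.2, 0.1, 0.05, 0.07  # progression/recovery/quarantine rates
    S, E, I, Q, R = N - 10.0, 0.0, 10.0, 0.0, 0.0
    t = 0.0
    while t < days:
        new_inf = beta(t) * S * I / N
        dS = -new_inf
        dE = new_inf - sigma * E
        dI = sigma * E - (gamma + q) * I
        dQ = q * I - lam * Q
        dR = gamma * I + lam * Q
        S += dt * dS; E += dt * dE; I += dt * dI; Q += dt * dQ; R += dt * dR
        t += dt
    return S, E, I, Q, R

S, E, I, Q, R = simulate()
print(S + E + I + Q + R)   # total population is conserved
```

lowering beta1 (stricter measures) below the critical value discussed in the text ends the outbreak, while beta1 near the removal rate gives the long-tailed dynamics of point (iii).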
in the usa and uk, this reproduction number is close to 1. a reproduction number less than 0.5 is desirable, as it means a swift end to the spreading of the epidemic. in general, a number below 1 indicates system stability. on the other hand, when the reproduction number is close to 1, where one has critical stability, the covid-19 infection dynamics can persist for an extended period of time and even lead to herd immunity. in the southern hemisphere, for the current data from brazil, one has a reproduction number around 2, which is highly undesirable. one needs to temper the observations made in this work by noting that many aspects, such as, for example, seasonal variations in temperature, are not captured in the modeling undertaken here. these additional aspects need to be considered as well for appropriate use of the findings from the current work.
conflict of interest: the authors declare that they have no conflict of interest.

key: cord-240372-39yqeux4 authors: costa, kleyton vieira sales da; silva, felipe leite coelho da; coelho, josiane da silva cordeiro title: forecasting quarterly brazilian gdp: univariate models approach date: 2020-10-26 journal: nan doi: nan sha: doc_id: 240372 cord_uid: 39yqeux4 gross domestic product (gdp) is an important economic indicator that aggregates useful information to assist economic agents and policymakers in their decision-making process. in this context, gdp forecasting becomes a powerful decision optimization tool in several areas. in order to contribute in this direction, we investigated the efficiency of classical time series models and the class of state-space models, applied to the brazilian gross domestic product. the models used were: a seasonal autoregressive integrated moving average (sarima) and a holt-winters method, which are classical time series models; and the dynamic linear model, a state-space model. based on statistical metrics of model comparison, the dynamic linear model presented the best forecasting and fit performance for the analyzed period, also incorporating the growth rate structure significantly. the economic activity of a country can be influenced by several factors that lead economic agents to change their consumption and investment decisions, in addition to impacting other results, such as inflation and unemployment.
such factors, or shocks, result from changes in economic policies, in the level of production technology, from meteorological changes, etc. the gross domestic product (gdp) is one of the main indexes for measuring the level of economic activity, and the forecast of its trajectory provides useful information concerning the future economic trend in the short term, acting as an object for the expectation of economic behavior. significant impacts on economic activity arise through crises. they are a dysfunction inherent in the free market system. through the development of information transmission technologies and the global integration of markets, the scope and frequency of these dysfunctions have been expanded. beginning in the second quarter of 2014, the brazilian economic crisis is still the subject of many analyses, with no consensus on the generating variables or on their consequences. in the second quarter of 2016, the gdp growth rate accumulated over four quarters had reached the lowest level of the last two decades (-4.6%). the data show that the recovery (after a significant drop) was not complete, followed by a period of stagnation in the country's growth rate. paula and pires (2017) analyzed the ineffectiveness of counter-cyclical policies -between 2011 and 2014 -as a result of problems in the coordination of macroeconomic policy, and also of the occurrence of exogenous shocks, such as the deterioration of trade terms and the water crisis that occurred in the period. filho (2017) argues that the origin of the brazilian economic crisis was a series of supply and demand shocks that (mostly) were caused by wrong public policies, contributing to the reduction of the growth potential of the brazilian economy and to the increase in the tax cost. according to feijó and ramos (2013), the most relevant aggregates that derive from the system of national accounts are the measures of product, income, and expenditure.
the macroeconomic aggregates are statistical constructions that synthesize the productive effort of a given country or region and its possible consequences on the generation of income and expenditure for a specific period of time. by definition, the gdp of a country or region represents the production (footnote 1) of all production units of the economy -government, self-employed workers, companies, etc. -in a given period, usually a quarter or a year, at market prices (footnote 2). (footnote 1: the socially organized economic activity that aims to create goods and services to be traded on the market and/or achieved by means of factors of production (land, capital and labor) traded on the market (ibge, 2016). footnote 2: economic transactions with observed or imputed market value.) blanchard and johnson (2017) present two ways of interpreting gdp. the so-called nominal gdp is defined as the sum of the quantities of final goods multiplied by the current prices of the goods, that is, considering the inflationary effect during the calculation period. real gdp takes constant prices into account and sets a given year as a base, excluding the effect of price increases. restricting oneself to gdp as an instrument for efficiently measuring the quality of life of the population has theoretical and practical limitations. growth in production is not a sufficient condition for improving well-being (education, health, culture, security, etc.), because the quality of economic growth is not part of the scope defined for the calculation of gdp. there is the possibility of an expansion sustained by war expenditures (production of supplies and weapons, construction of military installations, etc.) or by the reconstruction of a region affected by natural disasters (hurricanes, earthquakes, floods, etc.), which are reasonably understood as issues that do not promote economic and social well-being, nor are motivated to do so.
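the nominal-versus-real distinction above amounts to valuing the same quantities at current prices versus base-year prices; a sketch with made-up quantities and prices for two goods over two years:

```python
# Nominal vs. real GDP with fictitious data. Real GDP values current-year
# quantities at base-year (year-0) prices, removing the inflationary effect.

goods = ["bread", "cars"]
qty   = {0: [100.0, 10.0], 1: [105.0, 12.0]}       # quantities per year
price = {0: [2.0, 30_000.0], 1: [2.2, 33_000.0]}   # current prices per year

def nominal_gdp(year):
    return sum(q * p for q, p in zip(qty[year], price[year]))

def real_gdp(year, base=0):
    return sum(q * p for q, p in zip(qty[year], price[base]))

print(nominal_gdp(1), real_gdp(1))  # price increases inflate nominal, not real
```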
angus deaton, winner of the 2015 nobel prize in economics, says: if crime goes up, and we spend more on prisons, gdp will be higher. if we neglect climate change, and spend more and more on cleaning up and repairing after storms, gdp will go up, not down; we count the repairs but ignore the destruction. (deaton, 2013) time series analysis has proven to be an effective tool for understanding the behavior patterns of a dataset distributed sequentially over time, with a wide range of models for the purpose of analyzing and predicting trends and seasonality. the seasonal autoregressive integrated moving average (sarima) and the holt-winters method are considered classical time series models, and the class of dynamic linear models is part of the bayesian approach. regarding contributions that use classical models: the analysis constructed by abonazel and abd-elftah (2019) for egypt's annual gdp between the years 1965 and 2016, with a forecast of ten years ahead (2017 to 2026), presented results that pointed to the country's gdp growth during the period under analysis; wabomba et al. (2016) estimated kenya's gdp between 2013 and 2017, and the result obtained was significant growth in the kenyan economy in the period; agrawal (2018) modeled the series of india's real gdp growth rate from 1996 to 2017, and on the analyzed data the arima model did not show more significant results than other models -the author also used the holt-winters model and a linear trend, both showing results similar to each other; and da silva et al. (2020) found significant results using arimax and sarimax models (which take exogenous variables into account) for the forecast of brazilian annual and quarterly real gdp for the year 2019. regarding the bayesian approach and the class of state-space models, piccoli (2015) analyzed four dynamic linear models to identify the one with the best forecasting capacity for nominal gdp in the united states.
the best results were obtained using a multivariate sutse model (seemingly unrelated time series equations) that considered as variables the nominal gdp, the industry production index, the consumer price index (inflation), and the quarterly interest rate for us treasury bills; rees et al. (2015) built new measures for australia's gdp growth, using state-space methods -the results found have a high correlation with the officially published figures for gdp growth, but the measures are less volatile, easier to predict, and achieved good results in nowcasting; issler and notini (2016) estimated brazilian real monthly gdp with a state-space representation and also found good results in forecasting when compared with the central bank economic activity index (ibc-br); migon et al. (1993) developed a study on the performance of bayesian dynamic models applied to a set of brazilian macroeconomic time series (industrial productivity index, the balance of trade, components of gdp, and others) between 1970 and 1990 -the comparison was made between dynamic models and classical structural models, and the results obtained indicate that the bayesian approach performed similarly to the classical approach. another applied study was developed by baurle et al. (2020), with the aim of forecasting gdp in the euro area and switzerland with a bayesian vector autoregressive (bvar) structure and a factor model structure; they found evidence that the factor model structure performs satisfactorily. another accepted approach to gdp forecasting is macroeconomic projection based on leading indicators. garnitz et al. (2019) applied this strategy to forecast gdp growth in forty-four countries, including brazil. one of the results found indicates that the forecasts can be improved by adding world economic survey (wes) indicators of the three main trading partners of each country.
the aim of this work is to investigate a suitable time series model to describe and forecast brazilian gdp, also investigating the fit of these models to the dynamics between periods of economic growth and recession. for this purpose, different classes of time series models are compared. thus, the chosen models were the holt-winters method, sarima, and the dynamic linear model. in the literature, there are some applications regarding these models, but no comparative studies were found using the models adopted in this work. this work is organized as follows: section 2 describes the methodology. section 3 presents the results and discussion, and, finally, the last section provides the main conclusions and some possibilities for future research. in what follows, we outline the data and the empirical approach used to fit and forecast the time series of brazilian gross domestic product between the years 1996 and 2019, at 1995 prices. this section also defines the models that were investigated. care was also taken to ensure that the references used in the definition of models and metrics correspond to studies and authors widely used and of quality proven by the academic community. the quality of the data used in an empirical analysis is a fundamental element for the quality of the results. a factor that contributes to the empirical analysis of gdp is the vast documentation made available by government agencies. for that, we obtained the time series from the ibge automatic recovery system (ibge, 2020). the data used for the analysis are quarterly and comprise the first quarter of 1996 until the fourth quarter of 2019. the data values used in this article are in brazilian reais (brl). statistical analyses, as well as graphic representations, were built using the open-source software r (r core team, 2020). the united nations (2010) says that gdp derives from the concept of value added.
therefore, gdp is the sum of the gross value added of all resident producer units plus that part of taxes on products, less subsidies on products. gdp is also equal to the sum of the final uses of goods and services measured at purchasers' prices, less the value of imports of goods and services. and gdp is also equal to the sum of primary incomes distributed by resident producer units. according to feijó and ramos (2013), gdp can be calculated in three different ways, all part of the accounting identity (production = income = expenditure) guiding the national accounts. the perspective of production is calculated by summing the added values of economic activities plus taxes, net of subsidies, on products. that is,

gdp = gva + t − sub, with gva = production − ic,

where gva is gross value added, ic is intermediate consumption, t are taxes on products, and sub are subsidies on products. the income perspective is obtained by adding the remunerations of the factors of production. labor is remunerated by wages, loan capital is remunerated by interest, venture capital is remunerated by profit, and ownership of production goods ("land") is remunerated by rent. that is,

gdp = w + gos + t − sub,

where w are wages, gos is the gross operating surplus (sum of interest, profit, and rent), t are taxes on products, and sub are subsidies on products. the time series used in this work was built from the perspective of expenditure. it is calculated as the sum of household consumption, investment, government spending, and net exports. that is,

gdp = c + i + g + ne,

where c is household consumption, i is investment (gross fixed capital formation plus stock variation), g is government consumption, and ne are net exports (exports less imports). as described in cowpertwait and metcalfe (2009), the holt-winters method was proposed by holt (1957) and winters (1960), using exponentially weighted moving averages to update the estimates needed for the seasonal adjustment of the mean (trend) and seasonality.
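the three accounting perspectives above must agree by construction; a sketch with made-up values (in brl) built to satisfy production = income = expenditure:

```python
# The three GDP identities with fictitious, mutually consistent values.

production = 1_500.0   # gross output of all resident producer units
ic  = 600.0            # intermediate consumption
t   = 150.0            # taxes on products
sub = 50.0             # subsidies on products

gva = production - ic
gdp_production = gva + t - sub

w, gos = 500.0, 400.0  # wages; gross operating surplus (interest + profit + rent)
gdp_income = w + gos + t - sub

c, i, g, ne = 550.0, 200.0, 230.0, 20.0  # consumption, investment, government, net exports
gdp_expenditure = c + i + g + ne

print(gdp_production, gdp_income, gdp_expenditure)
```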
the method has two variations, each with four equations: one forecast equation and three smoothing equations. following hyndman and athanasopoulos (2018), the additive method equations are described as

ŷ t+h|t = ℓ t + h b t + s t+h−m(k+1) ,
ℓ t = α(y t − s t−m ) + (1 − α)(ℓ t−1 + b t−1 ),
b t = β(ℓ t − ℓ t−1 ) + (1 − β)b t−1 ,
s t = γ(y t − ℓ t−1 − b t−1 ) + (1 − γ)s t−m ,

where ŷ t+h|t is the forecast equation, and ℓ t , b t , and s t are respectively the level, trend, and seasonality equations, with corresponding smoothing parameters α, β, and γ. the parameter m denotes the frequency of the seasonality; for quarterly data, m = 4. finally, k is the integer part of (h − 1)/m, which ensures that the estimates of the seasonal indices used for forecasting come from the final year of the sample. for the multiplicative method, the same equations ℓ t , b t , and s t are defined, but the structure changes because, instead of adding the equations in ŷ t+h|t , the sum of the level and trend equations is multiplied by the seasonality equation. box & jenkins models determine the proper stochastic process to represent a given time series by passing white noise through a linear filter (morettin and toloi, 2018). the model used was sarima, seeking to incorporate the seasonality component present in the data under analysis. the sarima of order (p, d, q) × (P, D, Q) s is defined by

φ(b) Φ(b s ) ∇ d ∇ s D y t = θ(b) Θ(b s ) α t ,

where θ(b) is the moving average operator of order q, φ(b) is the autoregressive operator of order p, Φ(b s ) is the seasonal autoregressive operator of order P, Θ(b s ) is the seasonal moving average operator of order Q, ∇ d is the simple difference operator, ∇ s D is the seasonal difference operator, and α t is the noise. dynamic linear models are an important class of state-space models. broadly used in the last decades, they have a high degree of efficiency for the analysis and forecasting of time series, providing flexibility and applicability through an elegant and robust probabilistic apparatus.
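the additive holt-winters recursions above can be implemented directly; this pure-python sketch uses a simple first-year initialization of level, trend, and seasonal indices (an assumption, not the paper's initialization) and checks itself on a toy quarterly series with a known trend and seasonal pattern.

```python
# Additive Holt-Winters: level/trend/seasonal smoothing plus h-step forecast.

def holt_winters_additive(y, m, alpha, beta, gamma, h):
    # crude initialization from the first two seasons
    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    season = [y[i] - level for i in range(m)]
    for t in range(m, len(y)):
        last_level, last_trend = level, trend
        level = alpha * (y[t] - season[t % m]) + (1 - alpha) * (last_level + last_trend)
        trend = beta * (level - last_level) + (1 - beta) * last_trend
        season[t % m] = gamma * (y[t] - last_level - last_trend) + (1 - gamma) * season[t % m]
    n = len(y)
    return [level + (j + 1) * trend + season[(n + j) % m] for j in range(h)]

# quarterly toy series: linear trend + fixed seasonal pattern
y = [10 + 0.5 * t + [2, -1, 3, -4][t % 4] for t in range(40)]
fc = holt_winters_additive(y, m=4, alpha=0.3, beta=0.1, gamma=0.2, h=4)
print([round(v, 2) for v in fc])
```

on this noiseless series the recursions converge to the true level, trend, and seasonal indices, so the four forecasts track the continuation of the pattern.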
the estimation and inference challenges are solved by recursive algorithms, which follow the bayesian approach, calculating conditional distributions of the quantities of interest given the observed information. considering a series affected over time by dynamic and random deformations, they accommodate seasonal or regressive components. this work uses contributions from west and harrison (1997) and laine (2019): an observation equation, a system equation, and initial information given by

y t = f t θ t + v t , v t ∼ n(0, v t ),
θ t = g t θ t−1 + w t , w t ∼ n(0, w t ),
(θ 0 | d 0 ) ∼ n(m 0 , c 0 ),

where f t and g t are known matrices; v t and w t are two sequences of independent noises, with mean zero and known covariance matrices v t and w t , respectively. d t is the current information set; m 0 and c 0 contain the relevant prior information. regarding the future, in the usual statistical sense, given d t , (m t , c t ) is sufficient for (y t+1 , θ t+1 , . . . , y t+k , θ t+k ). to take growth and seasonality into account, it is defined θ t = (µ t , β t , γ t , γ t−1 , γ t−2 ), where µ t is the current level, β t is the slope of the trend, and γ t , γ t−1 , γ t−2 are the seasonal components. the selection of the most suitable forecasting model was made through the contributions of hyndman and koehler (2006), armstrong (2001), morettin and toloi (2018), and ahlburg (1984), using the following metrics: i. square root of the mean squared error (rmse); ii. mean absolute error (mae); iii. mean absolute percentage error (mape); and iv. theil's inequality coefficient (u-theil). the first two metrics are widely used measures whose scale depends on the scale of the data. the third metric has the advantage of being scale-independent and is therefore frequently used to compare forecast performance across different data sets. the last metric can improve the accuracy of a forecast through theil's decomposition of the forecast error into bias, regression, and disturbance proportions and its associated linear correction procedure.
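the four comparison metrics listed above are straightforward to compute; the sketch below implements them in pure python. u-theil is implemented in its common form -- the ratio of the model's root squared error to that of the naive forecast (the value at t − 1) -- which is one possible reading of the definition, so treat that choice as an assumption.

```python
import math

# RMSE, MAE, MAPE, and a common form of Theil's U for fitted/forecast values.

def rmse(y, f):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, f)) / len(y))

def mae(y, f):
    return sum(abs(a - b) for a, b in zip(y, f)) / len(y)

def mape(y, f):
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y, f)) / len(y)

def u_theil(y, f):
    # compares model errors (from t = 1 on) against the naive forecast y[t-1];
    # u < 1 means the model beats the naive prediction
    model = sum((y[t] - f[t]) ** 2 for t in range(1, len(y)))
    naive = sum((y[t] - y[t - 1]) ** 2 for t in range(1, len(y)))
    return math.sqrt(model / naive)

y = [100.0, 102.0, 101.0, 105.0]   # made-up observed values
f = [100.0, 101.0, 102.0, 104.0]   # made-up fitted values
print(rmse(y, f), mae(y, f), round(mape(y, f), 3), u_theil(y, f))
```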
this section presents the results obtained using the holt-winters additive method, sarima, and dynamic linear models to fit the data of interest. for each model, the observed and predicted values were plotted, along with the 95% confidence interval for the predicted values. graphics are effective tools for understanding the behavior of the series and whether the models generate reasonable fits and predictions in relation to the observed data. compared to the multiplicative holt-winters method, the additive formulation was considered the most appropriate, taking the sum of squared errors into account. figure 1 shows the fit of the additive method to the brazilian quarterly gdp data in the period 1996 to 2016 and the forecast between the years 2017 and 2019. it is observed that the model was able to fit the data reasonably. the fit also occurred significantly in periods of strong recession, such as the international financial crisis of 2008 and the recession of the brazilian economy between the second quarter of 2014 and the fourth quarter of 2017. to apply the sarima model, the behavior of the autocorrelation (acf) and partial autocorrelation (pacf) functions was verified. in figure 2(a), it is possible to see a slow decay of the autocorrelation function to zero. this behavior indicates the non-stationarity of the series, which needs to be differenced in order to make it stationary. we used an algorithm to generate sixteen sarima models following the principle of parsimony. among the generated models, the structure with the best results was the sarima(0, 1, 1) × (0, 1, 1) 4 , with metrics: akaike information criterion (-438.58); sum of squared errors (0.01686); and ljung-box test (p-value = 0.96). figure 4 shows the model fitted to the brazilian quarterly gdp data for the period 1996 to 2016 and the forecast between the years 2017 and 2019. it is observed that this model also fits the data reasonably.
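the stationarity check described above -- a slowly decaying acf for a trending series versus a quickly vanishing one after differencing -- can be sketched with a sample autocorrelation function; the toy series is an assumption for illustration.

```python
# Sample ACF and first differencing: a trending series has a slowly decaying
# ACF (non-stationarity); its first difference does not.

def acf(y, lag):
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y)
    cov = sum((y[t] - mean) * (y[t - lag] - mean) for t in range(lag, n))
    return cov / var

# toy series: linear trend plus an alternating component
y = [0.5 * t + (1 if t % 2 == 0 else -1) for t in range(200)]
d = [y[t] - y[t - 1] for t in range(1, 200)]   # first difference

print(round(acf(y, 1), 3), round(acf(y, 10), 3), round(acf(d, 1), 3))
```

the trend keeps acf(y, k) high even at large lags, while the differenced series shows no slow decay -- the same diagnostic used before fitting the sarima model.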
this includes periods when economic shocks occurred, such as the crises mentioned. in this work, the dynamic regression matrix f t and the evolution matrix g t of the model follow the standard linear-growth plus seasonal-factor construction for the state vector θ t = (µ t , β t , γ t , γ t−1 , γ t−2 ):

f t = (1, 0, 1, 0, 0),
g t = [[1, 1, 0, 0, 0], [0, 1, 0, 0, 0], [0, 0, −1, −1, −1], [0, 0, 1, 0, 0], [0, 0, 0, 1, 0]].

for the study, it was assumed that the observational variance is v t = σ 2 , and that the covariance matrix of the system w t is a diagonal matrix given by w t = diag(σ 2 µ , σ 2 β , σ 2 γ , 0, 0). these unknown variances were also estimated using bayesian inference. thus, to complete the specification of the model, independent inverse gamma prior distributions were assumed, with means a, a θ 1 , a θ 2 , a θ 3 and variances b, b θ 1 , b θ 2 , b θ 3 , respectively, fixed at known values. therefore, by using the unobservable states as latent variables, a gibbs sampler can be run on the basis of the resulting full conditional densities; the full conditional density of the states is a normal distribution, and it is covered in the dlm package used (petris, 2010). from the gibbs sampler, 5000 iterations were generated for each parameter (the model variances), out of which the 1000 initial iterations were considered as the burn-in period and discarded. hence, the remaining iterations were used to compose the posterior samples of the estimated variances. posterior estimates of the four unknown variances, from the gibbs sampler output, can be seen in figure 5. the rmse, mape, mae, and u-theil were calculated for the fitted values of the models, and the results are shown in table 1. the rmse and mape metrics were investigated for the forecast values, and the results are shown in table 2. it is observed that the better results were given by the sarima(0, 1, 1) × (0, 1, 1) 4 and dynamic linear models, the latter being the one that best fits the series of brazilian gdp, at 1995 prices, having achieved the lowest values in all metrics for fitted and forecast values.
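the state vector θ t = (µ t , β t , γ t , γ t−1 , γ t−2 ) described earlier implies a linear-growth block for (µ t , β t ) and a quarterly seasonal-factor block for the γ's; the sketch below evolves that state one full year, noise-free, and checks that the seasonal components cycle with period four. the matrix forms are the standard dlm construction and should be read as an assumption, since the paper's exact specification was garbled in extraction.

```python
# Noise-free state evolution theta_t = G theta_{t-1} for the assumed
# linear-growth + seasonal-factor DLM.

def matvec(G, x):
    return [sum(g * v for g, v in zip(row, x)) for row in G]

G = [
    [1, 1, 0, 0, 0],     # mu_t    = mu_{t-1} + beta_{t-1}
    [0, 1, 0, 0, 0],     # beta_t  = beta_{t-1}
    [0, 0, -1, -1, -1],  # gamma_t = -(sum of last three seasonal components)
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
]
F = [1, 0, 1, 0, 0]      # observation: level + current seasonal

theta = [100.0, 1.0, 2.0, -1.0, 3.0]  # (mu, beta, gamma_t, gamma_{t-1}, gamma_{t-2})
for _ in range(4):                     # evolve one full year
    theta = matvec(G, theta)

print(theta)  # level grew by 4*beta; seasonal pattern returned to its start
```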
as the dynamic linear model was chosen as the best model under the fit and prediction criteria, table 3 shows the comparison between the growth rate of the brazilian gdp according to the observed values and the growth rate of the values predicted by the dynamic linear model. figure 7 shows that the model proposed in this study obtained satisfactory results when the observed and predicted growth rates are compared. the projection data also maintained the tendency of brazilian economic growth to stagnate in the analyzed period. therefore, economic policies were not effective for a consistent recovery in the short and medium term. understanding the behavior of gdp is a topic of study and discussion for society and the academic community. in the present work, we proposed the application of the holt-winters additive method, sarima, and the dynamic linear model with the aim of forecasting the behavior of brazilian quarterly gdp, at 1995 prices. the data comprise the period between the first quarter of 1996 and the fourth quarter of 2019. theil's inequality coefficient (u-theil) shows that the models used in the study are better than the naive prediction, i.e., the forecast at time t given by the value observed at t − 1. both the analyzed series and the models' forecasts show the necessity of sustained growth in a market economy. by the metrics rmse, mae, mape, and u-theil, it appears that the dynamic linear model presented the best fit to the data and efficient forecast performance, with a mape of 0.839. we find evidence in this study that corroborates the observed stagnation of the brazilian economy after the crisis period that started in the second quarter of 2014. therefore, the dynamic linear model proved to be efficient for forecasting and fitting gdp data even in the presence of economic shocks.
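the growth-rate comparison above is computed from the level series; a minimal sketch of quarter-over-same-quarter growth (with made-up observed and predicted levels):

```python
# Year-over-year quarterly growth rates from level series, in percent.

def yoy_growth(series, m=4):
    """Growth versus the same quarter of the previous year."""
    return [100.0 * (series[t] / series[t - m] - 1.0) for t in range(m, len(series))]

observed  = [100.0, 98.0, 101.0, 103.0, 102.0, 99.5, 102.0, 104.5]
predicted = [100.0, 98.0, 101.0, 103.0, 101.5, 99.0, 102.5, 104.0]

g_obs = yoy_growth(observed)
g_pred = yoy_growth(predicted)
print([round(v, 2) for v in g_obs], [round(v, 2) for v in g_pred])
```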
from the pandemic caused by covid-19 in 2020 and its economic and human consequences, the time series forecasting models must be adjusted so that they can adapt to a significant exogenous and structural economic shock. this means that the model was probably able to capture the data structure and generate forecasts effectively.

references:
forecasting egyptian gdp using arima models
gdp modelling and forecasting using arima: an empirical study from india
forecast evaluation and improvement using theil's decomposition
principles of forecasting: a handbook for researchers and practitioners
forecasting the production side of gdp
uso de ferramentas econométricas para modelar e estimar o pib do brasil
the great escape: health, wealth, and the origins of inequality
a crise econômica de
forecasting gdp all over the world using leading indicators based on comprehensive survey data
forecasting seasonals and trends by exponentially weighted moving averages
forecasting: principles and practice
another look at measures of forecast accuracy
brasil: ano de referência 2010. ibge
sistema ibge de recuperação automática - sidra
estimating brazilian monthly gdp: a state-space approach
introduction to dynamic linear models for time series analysis
modelos bayesianos univariados aplicados à previsão de séries econômicas
análise de séries temporais: modelos lineares univariados
crise e perspectivas para a economia brasileira
an r package for dynamic linear models
dynamic linear models with r
identification of a dynamic linear model for the american gdp
r: a language and environment for statistical computing. r foundation for statistical computing
a state-space approach to australian gross domestic product measurement
system of national accounts
modeling and forecasting kenyan gdp using autoregressive integrated moving average (arima) models
bayesian forecasting and dynamic models
forecasting sales by exponentially weighted moving averages

key: cord-248050-apjwnwky authors: vrugt, michael te; bickmann, jens; wittkowski, raphael title: effects of social distancing and isolation on epidemic spreading: a dynamical density functional theory model date: 2020-03-31 journal: nan doi: nan sha: doc_id: 248050 cord_uid: apjwnwky for preventing the spread of epidemics such as the coronavirus disease covid-19, social distancing and the isolation of infected persons are crucial. however, existing reaction-diffusion equations for epidemic spreading are incapable of describing these effects. we present an extended model for disease spread based on combining an sir model with a dynamical density functional theory in which social distancing and the isolation of infected persons are explicitly taken into account. the model shows interesting nonequilibrium phase separation associated with a reduction of the number of infections, and allows for new insights into the control of pandemics. controlling the spread of infectious diseases, such as the plague [1, 2] or the spanish flu [3], has been an important topic throughout human history [4]. currently, it is of particular interest due to the worldwide outbreak of the coronavirus disease 2019 (covid-19) induced by the novel coronavirus sars-cov-2 [5-10]. the spread of this disease is difficult to control, since the majority of infections are not detected [11]. due to the lack of vaccines, attempts to control the pandemic have mainly focused on social distancing [12-15] and quarantine [16, 17], i.e., the general reduction of social interactions, and in particular the isolation of persons with actual or suspected infection.
while political decisions on such measures require a way of predicting their effects, existing theories do not explicitly take them into account. in this article, we present a dynamical density functional theory (ddft) [18-21] for epidemic spreading that allows us to model the effect of social distancing and isolation on infection numbers. a quantitative understanding of disease spreading can be gained from mathematical models [22-27]. a well-known theory for epidemic dynamics is the sir model [28], which has already been applied to the current coronavirus outbreak [29-31]. it is a reaction model that describes the numbers of susceptible s, infected i, and recovered r individuals as functions of time t:

ds/dt = −c s i ,   (1)
di/dt = c s i − w i ,   (2)
dr/dt = w i .   (3)

susceptible individuals get the disease when meeting infected individuals, at a rate c. infected persons recover from the disease at a rate w. when persons have recovered, they are immune to the disease. a drawback of this model is that it describes a spatially homogeneous dynamics, i.e., it does not take into account the fact that healthy and infected persons are not distributed homogeneously in space, even though this fact can have a significant influence on the pandemic [32, 33]. to allow for spatial dynamics, disease-spreading theories such as the sir model have been extended to reaction-diffusion equations [34-44], where a term d φ ∇ 2 φ with diffusion constant d φ is added on the right-hand side of the dynamical equation for φ = s, i, r. reaction-diffusion equations, however, still have the problem that -being based on the standard diffusion equation -they do not take into account particle interactions other than the reactions. this issue arises, e.g., in chemical reactions in crowded environments such as the inside of a cell. in this case, the reactants, which are not pointlike, cannot move freely, which prevents them from meeting and thus from reacting.
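the sir model described above can be integrated with simple euler steps; the rates c and w and the initial condition below are made-up illustration values, with s, i, r taken as fractions of the population.

```python
# Euler integration of the SIR model: ds/dt = -c*s*i, di/dt = c*s*i - w*i,
# dr/dt = w*i. With c/w = 4 most of the population ends up in r.

def sir(c=0.4, w=0.1, i0=1e-3, days=300, dt=0.01):
    s, i, r = 1.0 - i0, i0, 0.0
    for _ in range(int(days / dt)):
        ds = -c * s * i
        di = c * s * i - w * i
        dr = w * i
        s += dt * ds
        i += dt * di
        r += dt * dr
    return s, i, r

s, i, r = sir()
print(s, i, r)   # s + i + r stays 1 (no births, deaths, or migration)
```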
to get an improved model, one can make use of the fact that the diffusion equation is a special case of ddft. in this theory, the time evolution of a density field ρ( r, t) with spatial variable r is given by with a mobility γ and a free energy f . note that we have written eq. (4) without noise terms, which implies that ρ( r, t) denotes an ensemble average [45] . the free energy is given by f = f id + f exc + f ext . its first contribution is the ideal gas free energy f id = β −1 ∫ d d r ρ( r, t)(ln(ρ( r, t)λ d ) − 1) corresponding to a system of noninteracting particles with the inverse temperature β, number of spatial dimensions d, and thermal de broglie wavelength λ. if this is the only contribution, eq. (4) reduces to the standard diffusion equation with d = γβ −1 . the second contribution is the excess free energy f exc , which takes the effect of particle interactions into account. it is typically not known exactly and has to be approximated. the third contribution f ext incorporates the effect of an external potential u ext ( r, t). ddft can be extended to mixtures [46] [47] [48] [49] , which makes it applicable to chemical reactions. while ddft is not an exact theory (it is based on the assumption that the density is the only slow variable in the system [50, 51] ), it is nevertheless a significant improvement compared to the standard diffusion equation, as it allows to incorporate the effects of particle interactions and generally shows excellent agreement with microscopic simulations. in particular, it allows to incorporate the effects of particle interactions such as crowding in reaction-diffusion equations. this is done by replacing the diffusion term d ∇ 2 φ( r, t) in the standard reaction-diffusion model with the right-hand side of the ddft equation (4) [52] [53] [54] . 
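the ddft evolution equation referred to above as eq. (4) did not survive extraction; a standard form consistent with the surrounding description (mobility Γ, free energy F, inverse temperature β) is:

```latex
\partial_t \rho(\vec{r},t)
  = \Gamma \, \vec{\nabla} \cdot \left[ \rho(\vec{r},t)\,
    \vec{\nabla} \frac{\delta F}{\delta \rho(\vec{r},t)} \right],
\qquad
F = F_{\mathrm{id}} + F_{\mathrm{exc}} + F_{\mathrm{ext}},
\qquad
F_{\mathrm{id}} = \beta^{-1} \int \mathrm{d}^d r \,
  \rho(\vec{r},t)\left[\ln\!\big(\rho(\vec{r},t)\Lambda^d\big) - 1\right].
```

for F = F_id alone, δF/δρ = β⁻¹ ln(ρΛ^d), so ρ∇(δF/δρ) = β⁻¹∇ρ and the equation reduces to the standard diffusion equation ∂_t ρ = d∇²ρ with d = Γβ⁻¹, as stated in the text.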
thus, given that its equilibrium counterpart, static density functional theory, has already been used to model crowds [55] , ddft is a very promising approach for the development of extended models for epidemic spreading. however, despite the successes of ddft in other biological contexts such as cancer growth [56, 57] , protein adsorption [58] [59] [60] [61] , ecology [62] , or active matter [63] [64] [65] [66] [67] [68] [69] [70] [71] [72] [73] [74] [75] , no attempts have been made to apply ddft to epidemic spreading (or other types of socio-economic dynamics). in this work, we use the idea of a reaction-diffusion ddft to extend the sir model given by eqs. (1)-(3) to a (probably spatially inhomogeneous) system of interacting persons, which compared to existing methods allows the incorporation of social interactions and social distancing. ddft describes the diffusive relaxation of an interacting system and is thus appropriate if we make the plausible approximation that the underlying diffusion behavior of persons is markovian [76] and ergodic [77] . using the mori-zwanzig formalism [78] [79] [80] [81] [82] , one can connect the ddft model and its coefficients to the dynamics of the individual persons [50, 51, 83] . the extended model reads note that we use different mobilities γ s , γ i , and γ r for the different fields s, i, and r, which allows to model the fact that infected persons, who might be in quarantine, move less than healthy persons. for generality, we have added a term −mi on the right-hand side of eq. (6) to allow for death of infected persons, which occurs at a rate m (cf. sird model [84, 85] ). since we are mainly interested in how fast the infection spreads, we will set m = 0 in the following. in this case, since the total number of persons is constant, one can easily show that is a conserved current. 
the ideal gas term f id in the free energy corresponds to a system of noninteracting persons and ensures that standard reaction-diffusion models for disease spreading [35] arise as a limiting case. the temperature measures the intensity of motion of the persons. a normal social life corresponds to an average temperature, while the restrictions associated with a pandemic will lead to a lower temperature. moreover, the temperature can be position-dependent if the epidemic is dealt with differently in different places. the excess free energy f exc describes interactions. this is crucial here, as it allows to model effects of social distancing and self-isolation via a repulsive potential between the different persons. social distancing is a repulsion between healthy persons, while self-isolation corresponds to a stronger repulsive potential between infected persons and other persons. thus, we set f exc = f sd + f si with f sd describing social distancing and f si self-isolation. note that effects of such a repulsive interaction are not necessarily covered by a general reduction of the diffusivity in existing reaction-diffusion models. for example, if people practice social distancing, they will keep a certain distance (6 feet is recommended [86] ) in places such as supermarkets, where persons accumulate even during a pandemic, or if people live in crowded environments, as was the case on the ship "diamond princess" [16] . in our model, when two particles approach each other, which still happens even at lower temperatures, repulsive interactions will reduce the probability of a collision and thus of an infection. existing models can only incorporate this in an effective way as a reduction of the transmission rate c, which implies, however, that properties of the disease (how infectious is it?) and measures implemented against it (do people stay away from each other?) cannot be modelled independently. 
furthermore, interactions allow for the emergence of spatio-temporal patterns. the final contribution is the external potential u ext . in general, it allows to incorporate effects of confinement into ddft. here, it corresponds to things such as externally imposed restrictions of movement. travel bans or the isolation of a region with high rates of infection enter the model as potential wells. the advantage of our model compared to the standard sir theory is that it allows -in a way that is computationally much less expensive than "microscopic" simulations, since the computational cost is independent of the number of persons [87] -to study the way in which different actions affect how the disease spreads. for example, people staying at home corresponds to reducing the temperature, quarantine measures correspond to a strongly repulsive potential between infected and healthy persons, and mass events correspond to attractive potentials. specifically, we assume that both types of interactions can be modelled via gaussian pair potentials, depending on the parameters c sd and c si determining the strength and σ sd and σ si determining the range of the interactions. combining this assumption with a ramakrishnan-yussouff approximation [88] for the excess free energy and a debye-hückel approximation [89] for the two-body correlation, we get the specific sir-ddft model with the diffusion coefficients d φ = γ φ β −1 for φ = s, i, r, the kernels k i ( r) = exp(−σ i r 2 ) for i = sd, si, and the spatial convolution . a possible generalization is discussed in the supplemental material. we begin our investigation with a linear stability analysis of this model, using a general pair potential, in order to determine whether a homogeneous state with i = 0, which is always a fixed point, is stable. the full calculation is given in the supplemental material. 
in the simple sir model, the s-r plane in phase space (these are the states where everyone is healthy) becomes unstable when cs 0 > w, where s 0 is the initial number of susceptible persons. thus, the pandemic cannot break out if persons recover faster than they are able to infect others. a linear stability analysis of the full model, performed under the assumption that the initial number of immune persons r 0 is small (which corresponds to a new disease), gives the eigenvalue λ 1 = cs 0 − w − d i k 2 with the wavenumber k, such that this instability criterion still holds when interactions are present. this means that social distancing cannot stabilize a state without infected persons, and can thus not prevent the outbreak of a disease. as reported in the literature [35] , the marginal stability hypothesis [90] [91] [92] [93] [94] gives, based on this dispersion, a front propagation speed of v = 2 √(d i (cs 0 − w)). however, there are two additional eigenvalues λ 2/3 = (−d j + j 0 γ j u sd ĥ d (k))k 2 with j = s, r and the fourier transformed social distancing potential u sd ĥ d (k), associated with instabilities due to interactions. front speeds for dispersions of this form have been calculated by archer et al. [92] . if both epidemic and interaction modes are unstable, the fronts might interfere, leading to interesting results depending on their different speeds. for a further analysis, we solved eqs. (9)-(11) numerically. we assume x and t to be dimensionless, such that all model parameters can be dimensionless too. the calculation was done in one spatial dimension on the domain x ∈ [0, 1] with periodic boundary conditions, using an explicit finite-difference scheme with step size dx = 0.005 (individual simulations) or dx = 0.01 (parameter scan) and adaptive time steps. as an initial condition, we use a gaussian peak with amplitude 1 and variance 50 −2 centered at x = 0.5 for s(x, 0), i(x, 0) = 0.001s(x, 0), and r(x, 0) = 0. 
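the dispersion relation and front speed of the infection mode can be turned into a small numerical check; the following sketch (not the authors' code) uses the parameter values c = 1, w = 0.1, and d i = 0.01 quoted for the simulations.

```python
import math

def epidemic_eigenvalue(k, c, s0, w, d_i):
    # dispersion of the infection mode quoted in the text:
    # lambda_1(k) = c*s0 - w - d_i * k**2
    return c * s0 - w - d_i * k ** 2

def front_speed(c, s0, w, d_i):
    # marginal-stability front speed for this dispersion:
    # v = 2 * sqrt(d_i * (c*s0 - w)), defined only when c*s0 > w
    growth = c * s0 - w
    if growth <= 0:
        return 0.0  # no outbreak, hence no propagating infection front
    return 2.0 * math.sqrt(d_i * growth)

v = front_speed(c=1.0, s0=1.0, w=0.1, d_i=0.01)  # parameters used in the text
```

for these values the long-wavelength eigenvalue is positive (outbreak) while short wavelengths are damped by diffusion, consistent with the criterion cs 0 > w above.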
since the effect of the parameters c and w on the dynamics is known from previous studies of the sir model, we fix their values to c = 1 and w = 0.1 to allow for an outbreak. moreover, we set γ s = γ i = γ r = 1, d s = d i = d r = 0.01, and σ sd = σ si = 100. the relevant control parameters are c sd and c si , which control the effects of social interactions that are the new aspect of our model. we assume these parameters to be ≤ 0, which corresponds to repulsive interactions. measures implemented against a pandemic will typically have two aims: reduction of the total number of infected persons, i.e., making sure that the final number of noninfected persons s ∞ = lim t→∞ s(t) is large, and reduction of the maximum number of infected persons i max for keeping the spread within the capacities of the healthcare system. using parameter scans, we can test whether social distancing and self-isolation can achieve those effects. as can be seen from the phase diagrams for the sir-ddft model shown in fig. 1 , there is a clear phase boundary between the upper left corner, where low values of i max and high values of s ∞ show that the spread of the disease has been significantly reduced, and the rest of the phase diagram, where the disease spreads in essentially the same way as in the model without social distancing. since all simulations were performed with parameters of c and w that correspond to a disease outbreak in the usual sir model, this shows that a reduction of social interactions can significantly inhibit epidemic spreading, and that the sir-ddft model is capable of demonstrating these effects. the phase boundary shows that, for a reduction of spreading by social measures, two conditions have to be satisfied. first, |c si | has to be sufficiently large. second, |c si | has to be, by a certain amount, larger than |c sd |. 
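the reaction-diffusion part of the numerical setup just described (periodic domain [0, 1], explicit finite differences, gaussian initial condition, c = 1, w = 0.1, d = 0.01) can be sketched as follows. this is an illustration of the scheme only, not the authors' code: the interaction terms are omitted (c sd = c si = 0), a fixed time step replaces the adaptive one, and no attempt is made to reproduce the paper's figures. mass conservation and periodic boundaries carry over exactly.

```python
import math

def laplacian(f, dx):
    # discrete laplacian with periodic boundary conditions
    n = len(f)
    return [(f[(j - 1) % n] - 2.0 * f[j] + f[(j + 1) % n]) / dx ** 2
            for j in range(n)]

def simulate(c=1.0, w=0.1, d=0.01, dx=0.01, dt=1e-3, steps=2000):
    n = int(round(1.0 / dx))
    xs = [j * dx for j in range(n)]
    var = 50.0 ** -2  # variance of the initial gaussian peak
    s = [math.exp(-((x - 0.5) ** 2) / (2.0 * var)) for x in xs]
    i = [0.001 * sj for sj in s]
    r = [0.0] * n
    for _ in range(steps):
        ls, li, lr = laplacian(s, dx), laplacian(i, dx), laplacian(r, dx)
        for j in range(n):
            infect = c * s[j] * i[j]
            recover = w * i[j]
            ds = (d * ls[j] - infect) * dt
            di = (d * li[j] + infect - recover) * dt
            dr = (d * lr[j] + recover) * dt
            s[j] += ds
            i[j] += di
            r[j] += dr
    return s, i, r

s, i, r = simulate()
total = sum(sj + ij + rj for sj, ij, rj in zip(s, i, r))
```

note the stability constraint of the explicit scheme, d·dt/dx² ≤ 1/2 (here 0.1), which also keeps all fields non-negative.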
within our physical model of repulsively interacting particles, this arises from the fact that if healthy "particles" are repelled more strongly by other healthy particles than by infected ones, they will spend more time near infected particles and thus are more likely to be infected themselves. physically, |c si | > |c sd | is thus a very reasonable condition given that infected persons, at least once they develop symptoms, will be isolated more strongly than healthy persons. figure 2 shows the time evolution of the total numbers s(t), i(t), and r(t) of susceptible, infected, and recovered persons, respectively, for the cases without interactions (usual sir model with diffusion) and with interactions (our model). if no interactions are present (i.e., c si = c sd = 0), i(t) reaches a maximum value of about 0.4 and the pandemic is over at time t ≈ 100. in the case with interactions (we choose c si = 2c sd = −10, i.e., parameter values inside the social isolation phase), the maximum is significantly reduced to a value of about 0.1. the final value of r(t), which measures the total number of persons that have been infected during the pandemic, decreases from about 1.0 to about 0.8. moreover, it takes significantly longer (until time t ≈ 200) for the pandemic to end. this demonstrates that social distancing and self-isolation have the effects they are supposed to have, i.e., to flatten the curve i(t) in such a way that the healthcare system is able to take care of all cases. the theoretical predictions for the effects of quarantine on the course of i(t) (sharp rise, followed by a bend and a flat curve) are in good qualitative agreement with recent data from china [95, 96] , where strict regulations were implemented to control the covid-19 spread [17] . to explain the observed phenomena, it is helpful to analyze the spatial distribution of susceptible and infected persons during the pandemic. figure 3 visualizes i(x, t) with x = ( r) 1 . 
interestingly, during the time interval where the pandemic is present, a phase separation can be observed in which the infected persons accumulate at certain spots separated from the susceptible persons. (as this effect is reminiscent of measures that used to be implemented against the spread of leprosy, we refer to these spots as "leper colonies".) this phase separation is a consequence of the interactions. since the formation of leper colonies reduces the spatial overlap of the functions i(x, t) and s(x, t), i.e., the amount of contacts between infected and susceptible persons, the total number of infections is reduced. the leper colony transition is an interesting type of nonequilibrium phase behavior in its own right. recall that we have motivated the sir-ddft model based on theories for nonideal chemical reactions. it is thus very likely that effects similar to the ones observed here can be found in chemistry. in this case, they would imply that particle interactions can significantly affect the amount of a certain substance that is produced within a chemical reaction, and that such reactions are accompanied by new types of (transient) pattern formation. in summary, we have presented a ddft-based extension of the usual models for epidemic spreading that allows to incorporate social interactions, in particular in the form of self-isolation and social distancing. this has allowed us to analyze the effect of these measures on the spatio-temporal evolution of pandemics. given the importance of the reduction of social interactions for the control of pandemics, the model provides a highly useful new tool for predicting epidemics and deciding how to react to them. moreover, it shows an interesting phase behavior relevant for future work on ddft and nonideal chemical reactions. a possible extension of our model is the incorporation of fractional derivatives [97, 98] . 
furthermore, enhanced simulations in two spatial dimensions could show interesting pattern formation effects associated with leper colony formation. here, we present a possible generalization of our model. in the main text, we have used the decomposition which gives the excess free energy (i.e., the contribution from interactions) as a sum of social distancing and self-isolation. instead, one can use the form in this case, social distancing remains unaffected. however, there are now two terms f iso and f ill determining the way infected persons interact with others. f iso is the isolation term, which corresponds to a repulsive interaction between infected and healthy individuals. the term f ill models the interaction of infected persons with other infected persons. this can have various forms. they repel each other if they practice social distancing or self-isolation, but they can also attract each other (e.g., if they intentionally accumulate in a hospital or quarantine station). assuming that the interaction corresponding to f ill is also gaussian, i.e., with the parameters c iso and c ill for the strength and σ iso and σ ill for the range of the infected-noninfected and infected-infected interactions, respectively, the model given by eqs. (9)-(11) in the main text generalizes to with the kernels k ill ( r) = exp(−σ ill r 2 ) (eq. (s11)) and k sd as defined in the main text. for c iso = c ill = c si and σ iso = σ ill = σ si , the standard case is recovered. the general model can also allow for attractive interactions between infected persons, or simply for a reduction of the repulsion between them (resulting from the fact that they are already ill). here, we perform a linear stability analysis of the extended model given by eqs. (5)-(7) from the main text. for the excess free energy, we use the combined ramakrishnan-yussouff-debye-hückel approximation as in eqs. 
(9)(11) , but now with general two-body potentials u sd h d (x − x ) for social distancing and u si h i (x − x ) for self-isolation. in one spatial dimension, we obtain (s14) any homogeneous state with s = s 0 , r = r 0 , and i = 0, where s 0 and r 0 are constants, will be a fixed point. we consider fields s = s 0 +s and r = r 0 +r with small perturbationss andr and linearize in the perturbations. this results in (s17) we now drop the tilde and make the ansatz s = s 1 exp(λt − ikx), i = i 1 exp(λt − ikx), and r = r 1 exp(λt − ikx). this gives the eigenvalue equation here,ĥ d (k) andĥ i (k) are the fourier transforms of h d (x − x ) and h i (x − x ), respectively. the corresponding characteristic polynomial reads (s19) rather than solving this third-order polynomial in λ exactly, we consider the limit of long wavelengths. for k = 0, which corresponds to the usual sir model given by eqs. (1)-(3) in the main text if we assume k 2ĥ d (k) = 0, eq. (s19) simplifies to which has the solutions this means that the epidemic will start growing when cs 0 > w, since in this case there is a positive eigenvalue. when interpreting this result, one should take into account that, since a susceptible person that has been infected cannot become susceptible again, the system will, after a small perturbation, not go back to the same state as before even if w > cs 0 . actually, we have tested the linear stability of the s-r plane in phase space, and the fact that any parameter combination of s 0 and r 0 with i = 0 is a solution of the sir model is reflected by the existence of the eigenvalue λ = 0 with algebraic multiplicity 2 (a perturbation within the s-r plane will obviously not lead to an outbreak). next, we consider the case k = 0, but assume that we can neglect the term s 0 r 0 k 4 u 2 sdĥ 2 d (k)γ s γ r in eq. (s19). 
this corresponds to assuming either r 0 = 0 (i.e., we consider the beginning of an outbreak of a new disease that no one is yet immune against) or small k (such that terms of order k 4 can be neglected). then, eq. (s19) gives we can immediately read off the solutions the result for λ 1 shows that the initial state still becomes unstable for cs 0 > w, i.e., the interactions cannot stabilize a state without infected persons that would be unstable otherwise. the eigenvalues λ 2 and λ 3 , which were 0 in the long-wavelength limit, now describe the dispersion due to interparticle interactions that may lead to instabilities not related to disease outbreak. for determining the propagation speed of fronts, we can use the marginal stability hypothesis [90] [91] [92] [93] [94] . we transform to the co-moving frame that has velocity v and assume that the growth rate in this frame is zero at the leading edge. thereby, we obtain for a general dispersion λ(k) the equations iv + dλ/dk = 0, the solution of these equations is which is in agreement with results from the literature [35] . front speeds for dispersions of the form (s25) and (s26) can be found in ref. [92] . 
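written out explicitly, the marginal stability conditions for a general dispersion λ(k) and front speed v read (a reconstruction consistent with the main text, not a verbatim copy of the supplemental equations):

```latex
i v + \left. \frac{\mathrm{d}\lambda}{\mathrm{d}k} \right|_{k=k^{*}} = 0,
\qquad
\operatorname{Re}\!\left( \lambda(k^{*}) + i v k^{*} \right) = 0 .
```

inserting the epidemic dispersion λ₁(k) = cs₀ − w − d i k², the first condition gives k* = iv/(2d i ); substituting into the second yields cs₀ − w + v²/(4d i ) − v²/(2d i ) = 0, i.e., v = 2√(d i (cs₀ − w)), the front speed quoted in the main text.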
* corresponding author: raphael.wittkowski@uni-muenster yersinia pestisetiologic agent of plague plague spanish flu outdid wwi in number of lives claimed oxford textbook of infectious disease control: a geographical analysis from medieval quarantine to global eradication a new coronavirus associated with human respiratory disease in china a pneumonia outbreak associated with a new coronavirus of probable bat origin origin and evolution of pathogenic coronaviruses a novel coronavirus outbreak of global health concern early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia a novel coronavirus from patients with pneumonia in china substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov2) covid-19 and rationally layered social distancing pre-symptomatic transmission in the evolution of the covid-19 pandemic impact of self-imposed prevention measures and shortterm government intervention on mitigating and delaying a covid-19 epidemic impact of nonpharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand transmission potential of the novel coronavirus (covid-19) onboard the diamond princess cruises ship the positive impact of lockdown in wuhan on containing the covid-19 outbreak in china a dynamical extension of the density functional theory dynamic density functional theory of fluids dynamical density functional theory and its application to spinodal decomposition nonlinear diffusion and density functional theory avalanche outbreaks emerging in cooperative contagions evolution and emergence of infectious diseases in theoretical and real-world networks the physics of spreading processes in multilayer networks critical regimes driven by recurrent mobility patterns of reaction-diffusion processes in networks exact solution of generalized cooperative susceptibleinfected-removed (sir) dynamics markovian approach to tackle the interaction of simultaneous diseases a 
contribution to the mathematical theory of epidemics statistics based predictions of coronavirus 2019-ncov spreading in mainland china a simple stochastic sir model for covid-19 infection dynamics for karnataka: learning from europe predicting covid-19 distribution in mexico through a discrete and time-dependent markov chain and an sir-like model correlation between travellers departing from wuhan before the spring festival and subsequent spread of covid-19 to all provinces in china characterizing the dynamics underlying global spread of epidemics continuum description of a contact infection spread in a sir model infection fronts in contact disease spread natural human mobility patterns and spatial spread of infectious diseases complex dynamics of a reaction-diffusion epidemic model a reaction-diffusion system modeling the spread of resistance to an antimalarial drug asymptotic profiles of the steady states for an sis epidemic reaction-diffusion model global stability of the steady states of an sis epidemic reaction-diffusion model a sis reactiondiffusion-advection model in a low-risk and high-risk domain reaction-diffusion processes and metapopulation models in heterogeneous networks exponential stability of traveling fronts in a diffusion epidemic system with delay a fully adaptive numerical approximation for a two-dimensional epidemic model with nonlinear cross-diffusion dynamical density functional theory for interacting brownian particles: stochastic or deterministic? 
dynamical density functional theory: binary phase-separating colloidal fluid in a cavity selectivity in binary fluid mixtures: static and dynamical properties extended dynamical density functional theory for colloidal mixtures with temperature gradients multi-species dynamical density functional theory derivation of dynamical density functional theory using the projection operator technique projection operators in statistical mechanics: a pedagogical approach mechanism for the stabilization of protein clusters above the solubility curve: the role of non-ideal chemical reactions mechanism for the stabilization of protein clusters above the solubility curve development of reaction-diffusion dft and its application to catalytic oxidation of no in porous materials density-functional fluctuation theory of crowds dynamic density functional theory of solid tumor growth: preliminary models dynamical density-functional-theory-based modeling of tissue dynamics: application to tumor growth kinetics and thermodynamics of protein adsorption: a generalized molecular theoretical approach competitive adsorption in model charged protein mixtures: equilibrium isotherms and kinetics behavior dynamic density functional theory of protein adsorption on polymer-coated nanoparticles competitive adsorption of multiple proteins to nanoparticles: the vroman effect revisited optimizing the search for resources by sharing information: mongolian gazelles as a case study aggregation of selfpropelled colloidal rods near confining walls dynamical density functional theory for colloidal particles with arbitrary shape dynamical density functional theory for microswimmers dynamical density functional theory for circle swimmers particle-scale statistical theory for hydrodynamically induced polar ordering in microswimmer suspensions multi-species dynamical density functional theory for microswimmers: derivation, orientational ordering, trapping potentials, and shear cells active colloidal suspensions 
exhibit polar order under gravity active brownian particles in two-dimensional traps active crystals and their stability active brownian particles at interfaces: an effective equilibrium approach effective equilibrium states in the colored-noise model for active matter i. pairwise forces in the fox and unified colored noise approximations effective equilibrium states in the colorednoise model for active matter ii. a unified framework for phase equilibria, structure and mechanical properties effective interactions in active brownian suspensions the five problems of irreversibility particleconserving dynamics on the single-particle level on quantum theory of transport phenomena: steady diffusion transport, collective motion, and brownian motion ensemble method in the theory of irreversibility projection operator techniques in nonequilibrium statistical mechanics mori-zwanzig projection operator formalism for far-from-equilibrium systems with time-dependent hamiltonians microscopic derivation of time-dependent density functional methods investigation of epidemic spreading process on multiplex networks by incorporating fatal properties a simple mathematical model for ebola in africa should, and how can, exercise be done during a coronavirus outbreak? an interview with dr. jeffrey a. 
woods sedimentation of a two-dimensional colloidal mixture exhibiting liquid-liquid and gas-liquid phase separation: a dynamical density functional theory study first-principles order-parameter theory of freezing theory of simple liquids: with applications to soft matter propagating pattern selection pattern propagation in nonlinear dissipative systems solidification fronts in supercooled liquids: how rapid fronts can lead to disordered glassy solids solidification in soft-core fluids: disordered solids from fast solidification fronts generation of defects and disorder from deeply quenching a liquid to form a solid coronavirus covid-19 global cases by the center for systems science and engineering at johns hopkins university uk warns fifth of workforce could be off sick from coronavirus at its peak; army prepared dynamics of ebola disease in the framework of different fractional derivatives fractional derivatives applied to mseir problems: comparative study with real world data key: cord-291180-xurmzmwj authors: lin, xiaoqian; li, xiu; lin, xubo title: a review on applications of computational methods in drug screening and design date: 2020-03-18 journal: molecules doi: 10.3390/molecules25061375 sha: doc_id: 291180 cord_uid: xurmzmwj drug development is one of the most significant processes in the pharmaceutical industry. various computational methods have dramatically reduced the time and cost of drug discovery. in this review, we firstly discussed roles of multiscale biomolecular simulations in identifying drug binding sites on the target macromolecule and elucidating drug action mechanisms. then, virtual screening methods (e.g., molecular docking, pharmacophore modeling, and qsar) as well as structure- and ligand-based classical/de novo drug design were introduced and discussed. last, we explored the development of machine learning methods and their applications in aforementioned computational methods to speed up the drug discovery process. 
also, several application examples of combining various methods were discussed. a combination of different methods to jointly solve the tough problem at different scales and dimensions will be an inevitable trend in drug screening and design. with the rapid development of computer hardware, software, and algorithms, drug screening and design have benefited much from various computational methods, which greatly reduce the time and cost of drug development. in general, bioinformatics can help reveal the key genes from a massive amount of genomic data [1, 2] and thus provide possible target proteins for drug screening and design. as a supplement to experiments, protein structure prediction methods can provide protein structures with reasonable precision [3] . biomolecular simulations with multiscale models allow for investigations of both structural and thermodynamic features of target proteins on different levels [4] , which are useful for identifying drug binding sites and elucidating drug action mechanisms. virtual screening then searches chemical libraries to provide possible drug candidates based on drug binding sites on target proteins [5] [6] [7] . with a greatly reduced number of possible drug candidates, in-vitro cell experiments can further evaluate the efficacy of these molecules. in addition to virtual screening, de novo drug design methods [8] , which generate synthesizable small molecules with high binding affinity, provide another type of computer-aided drug design direction. artificial intelligence, e.g., machine learning and deep learning, is playing more and more important roles in the aforementioned computational methods and thus drug development [9] [10] [11] . in this review, we will focus on developments of the last four computational methods as well as their applications in drug screening and design. predicted by the transition state theory. 
hence, they used isomorphism between probabilities obtained from the mc process and probability factors obtained from transition state theory [36] , and converted the mc process to a time-dependent simulation with additional simplified modifications [37] . the design, discovery, and development of drugs are complex processes involving many different fields of knowledge and are considered a time-consuming and laborious inter-disciplinary work [38] [39] [40] [41] . different drug design methods and virtual screening will be very useful to design and find rational drug molecules based on the target macromolecule that interacts with the drug and thus speed up the whole drug discovery process. here, we will discuss structure-based drug design, ligand-based drug design, and virtual screening. structure-based drug design must be performed with available structural models of the target proteins, which are provided by x-ray diffraction, nuclear magnetic resonance (nmr) or molecular simulation (homologous protein modeling, etc.) [42] [43] [44] [45] [46] . keeping in mind the complexity of cancers which show diverse phenotypes and multiple etiologies, a one-size-fits-all drug design strategy for the development of cancer chemotherapeutics does not yield successful results. lately, arjmand et al. [47] adopted a series of methods, such as the combination of x-ray crystal structures and molecular docking, to design, synthesize, and characterize novel chromone based-copper(ii) antitumor inhibitors. in general, after obtaining the structure of the receptor macromolecule by x-crystal single-crystal diffraction technique or multi-dimensional nmr, molecular modeling software can be used to analyze the physicochemical properties of drug binding sites on the receptor, especially including electrostatic field, hydrophobic field, hydrogen bond, and key residues. 
then, a small molecule database is searched, or drug design techniques are used, to identify suitable molecules whose shapes match the binding sites of the receptor and whose binding affinity is high. these molecules are then synthesized and their biological activities are tested for further drug development. in short, structure-based drug design plays an extremely important role in drug design. unlike structure-based drug design, ligand-based drug design does not search small molecule libraries. instead, it relies on knowledge of known molecules binding to the target macromolecule of interest. using these known molecules, a pharmacophore model that defines the minimum necessary structural characteristics a molecule must possess in order to bind to the target can be derived [48, 49]. then, this model can be further used to design new molecular entities that interact with the target. on the other hand, ligand-based drug design can also use quantitative structure-activity relationships (qsar) [50, 51], in which a correlation between calculated properties of molecules and their experimentally determined biological activity is derived, to predict the activity of new analogs. both the pharmacophore model and the qsar model will be discussed in detail in the following sections. in recent years, the rapid development of computational resources and small molecule databases has led to major breakthroughs in the development of lead compounds. as the number of new drug targets increases exponentially, computational methods are increasingly being used to accelerate the drug discovery process. this has led to the increased use of computer-assisted drug design and chemical bioinformatics techniques such as high-throughput docking, homology search and pharmacophore search in databases for virtual screening (vs) [51]. virtual screening is an important part of computer-aided drug design methods. 
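as a toy illustration of the kind of cheap pre-filter a virtual screening pipeline applies before the more expensive steps, the sketch below encodes lipinski's rule of five over a hypothetical mini-library; the molecule names and property values are invented, and a real pipeline would compute descriptors with a cheminformatics toolkit rather than hard-coding them.

```python
# lipinski rule-of-five pre-filter: a toy virtual-screening step that
# discards molecules unlikely to be orally bioavailable before docking.
# all property values below are invented for illustration.

def passes_rule_of_five(mol):
    """a molecule 'passes' if it violates at most one of the four rules."""
    violations = 0
    if mol["mol_weight"] > 500:    # molecular weight <= 500 Da
        violations += 1
    if mol["logp"] > 5:            # octanol-water partition coefficient <= 5
        violations += 1
    if mol["h_donors"] > 5:        # hydrogen-bond donors <= 5
        violations += 1
    if mol["h_acceptors"] > 10:    # hydrogen-bond acceptors <= 10
        violations += 1
    return violations <= 1

# hypothetical mini-library
library = [
    {"name": "cand_a", "mol_weight": 320.0, "logp": 2.1, "h_donors": 2, "h_acceptors": 5},
    {"name": "cand_b", "mol_weight": 612.0, "logp": 6.3, "h_donors": 4, "h_acceptors": 9},
    {"name": "cand_c", "mol_weight": 455.0, "logp": 4.8, "h_donors": 1, "h_acceptors": 7},
]

hits = [m["name"] for m in library if passes_rule_of_five(m)]
print(hits)  # cand_b violates two rules and is filtered out
```

the point of the sketch is only the shape of the workflow: a very fast property filter shrinks the library before any binding-site calculation is attempted.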
it may be the cheapest way to identify potential lead compounds, and many successful cases have been achieved with this technology. the primary technique for identifying new lead compounds in drug discovery is to physically screen large chemical libraries against biological targets. in experiments, high-throughput screening identifies active molecules by performing separate biochemical analyses of more than one million compounds. however, this technology involves significant costs and time. therefore, a cheaper and more efficient computational method came into being, namely virtual high-throughput screening. the method has been widely used in the early development of new drugs. the main purpose is to identify novel active small molecule structures from large compound libraries, which is consistent with the purpose of high-throughput screening. the difference is that virtual screening can save a lot of experimental cost by significantly reducing the number of compounds for which pharmacological activity must be measured, while high-throughput screening needs to perform experiments on all compounds in the database. here, we will discuss common methods of virtual screening. molecular docking, which predicts interaction patterns between proteins and small molecules as well as between proteins and proteins to evaluate the binding between two molecules [52], is widely used in the field of drug screening and design. the theoretical basis is that the process of ligand and receptor recognition relies on spatial shape matching and energy matching, which is the theory of "induced fit". determining the correct binding conformation of small molecule ligands and protein receptors in the formation of complex structures is the basis for drug design and for studying drug action mechanisms. molecular docking can be roughly divided into rigid docking, semi-flexible docking and flexible docking. in rigid docking, the structure of the molecules does not change. 
the calculation method is relatively simple and mainly studies the degree of conformational matching, so it is more suitable for studying macromolecular systems, such as protein-protein and protein-nucleic acid systems. in semi-flexible docking, the conformation of the molecules can vary within a certain range, so it is more suitable for the interaction between proteins and small molecules [53]. in general, the structure of the small molecule can change freely, while the macromolecule remains rigid or retains some rotatable amino acid residues to ensure computational efficiency. in flexible docking, the conformation of the simulated system is free to change, thus consuming more computing resources while improving accuracy. moreover, the establishment of binding sites is very important in molecular docking methods. collins [54] was the first to successfully determine binding sites on the surface of proteins using a multi-scale algorithm and to perform flexible docking of molecules, which greatly promoted the development of molecular docking. a pharmacophore is an abstract description of the molecular features necessary for molecular recognition of a ligand by a biological macromolecule, which explains how structurally diverse ligands can bind to a common receptor site. when a drug molecule interacts with a target macromolecule, it produces a geometrically and energetically matched active conformation with the target. medicinal chemists found that different chemical groups in drug molecules have different effects on activity: changes to some groups have a great influence on the interaction between drugs and targets, while others have little effect [55]. moreover, it was found that molecules with the same activity tend to share some of the same characteristics. therefore, in 1909, ehrlich proposed the concept of the pharmacophore, which referred to the molecular framework of atoms with the essential characteristics for activity [56]. 
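the pose-evaluation step shared by all three docking flavours described above can be caricatured with a toy pairwise energy function. the sketch below sums a 12-6 lennard-jones term and a coulomb term over ligand-receptor atom pairs to rank two rigid poses; the atoms, charges, and parameters (epsilon, sigma, dielectric) are invented, and real docking programs use far more elaborate scoring functions.

```python
import math

# toy rigid-docking pose score: sum of a 12-6 lennard-jones term and a
# coulomb term over all ligand-receptor atom pairs. atoms are
# (x, y, z, charge) tuples; parameters are illustrative only.

EPS, SIGMA, DIELECTRIC = 0.2, 3.5, 80.0

def pair_energy(a, b):
    dx, dy, dz = a[0] - b[0], a[1] - b[1], a[2] - b[2]
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    lj = 4 * EPS * ((SIGMA / r) ** 12 - (SIGMA / r) ** 6)
    coulomb = 332.0 * a[3] * b[3] / (DIELECTRIC * r)  # kcal/mol unit factor
    return lj + coulomb

def pose_score(ligand, receptor):
    """lower (more negative) scores indicate a more favourable pose."""
    return sum(pair_energy(a, b) for a in ligand for b in receptor)

receptor = [(0.0, 0.0, 0.0, -0.5), (1.5, 0.0, 0.0, 0.3)]
pose_close = [(0.0, 4.0, 0.0, 0.4)]  # same ligand atom, two candidate poses
pose_far = [(0.0, 9.0, 0.0, 0.4)]

# the closer pose scores lower (better) than the distant one
print(pose_score(pose_close, receptor), pose_score(pose_far, receptor))
```

in a real code the same ranking idea is applied over many thousands of generated poses, with the conformational freedom (rigid, semi-flexible, flexible) determining which coordinates are allowed to move between evaluations.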
in 1977, gund further clarified the concept of the pharmacophore as the set of structural features in a molecule that is recognized at a receptor site and is responsible for the molecule's biological activity [57]. there are two main methods for the identification of pharmacophores. on one hand, if the target structure is available, the possible pharmacophore structure can be inferred by analyzing the mode of action of the receptor and the drug molecule. on the other hand, when the structure of the target is unknown or the action mechanism is still unclear, a series of compounds is studied, and information on the groups that play a key role in the activity of the compounds is summarized by means of conformational analysis and molecular superposition [58]. active compounds that are suitable for constructing the model are selected in the pharmacophore recognition process. then, conformational analysis is used to find the binding conformation of each molecule and to determine the pharmacophore [59]. in recent years, with the development of compound databases and computer technology, the virtual screening of databases using pharmacophore models has been widely used and has become one of the important means of discovering lead compounds. qsar is a quantitative study of the interactions between small organic molecules and biological macromolecules. it derives a correlation between calculated properties of molecules (e.g., absorption, distribution, and metabolism of small organic molecules in living organisms) and their experimentally determined biological activity [51]. when the receptor structure is unknown, the qsar method is the most accurate and effective method for drug design. drug discovery often involves the use of qsar to identify chemical structures that could have good inhibitory effects on specific targets and low toxicity (non-specific activity). 
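the screening use of a pharmacophore model described above reduces, at its core, to checking whether some assignment of a conformer's features satisfies a set of inter-feature distance constraints. the sketch below does exactly that for a hypothetical three-feature model; the feature types, coordinates, and tolerances are all invented for illustration.

```python
import itertools
import math

# toy pharmacophore match: the model lists required feature types plus
# pairwise distance constraints (angstroms, with a tolerance). a conformer
# matches if some assignment of its features satisfies every constraint.

MODEL = {
    "features": ["donor", "acceptor", "aromatic"],
    # (feature_i, feature_j, ideal_distance, tolerance)
    "constraints": [(0, 1, 5.0, 1.0), (0, 2, 7.0, 1.0), (1, 2, 4.0, 1.0)],
}

def matches(conformer_features):
    """conformer_features: list of (type, (x, y, z)) tuples."""
    by_type = {}
    for ftype, pos in conformer_features:
        by_type.setdefault(ftype, []).append(pos)
    # one candidate position pool per required feature type
    pools = [by_type.get(t, []) for t in MODEL["features"]]
    for assignment in itertools.product(*pools):
        if all(abs(math.dist(assignment[i], assignment[j]) - d) <= tol
               for i, j, d, tol in MODEL["constraints"]):
            return True
    return False

# hypothetical conformer whose feature geometry fits the model
ligand = [("donor", (0.0, 0.0, 0.0)),
          ("acceptor", (5.2, 0.0, 0.0)),
          ("aromatic", (5.8, 3.9, 0.0))]
print(matches(ligand))
```

database screening then amounts to calling `matches` on precomputed low-energy conformers of each library compound and keeping the compounds with at least one matching conformer.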
with the further development of structure-activity relationship theory and statistical methods, 3d structural information was introduced into the qsar method in the 1980s, namely 3d-qsar. since the 1990s, with the improvement of computing power and the accurate determination of the 3d structures of many biomacromolecules, structure-based drug design has gradually displaced quantitative structure-activity relationships from the dominant position in the field of drug design, but qsar, with the advantages of a small amount of calculation and good predictive ability [60], still plays an important role in pharmaceutical research. based on the 3d structural characteristics of ligands and targets, 3d-qsar explores the 3d conformations of bioactive molecules, accurately reflects the energy changes and patterns of interactions between bioactive molecules and receptors, and reveals the mechanism of drug-receptor interactions. the physicochemical parameters and 3d structural parameters of a series of drugs are fitted to a quantitative relationship; then, the structures of new compounds are predicted and optimized. in short, 3d-qsar is a research method combining qsar with computational chemistry and molecular graphics. it is a powerful tool for studying the interactions between drugs and target macromolecules, speculating about the binding mode at the target, establishing structure-activity relationships, and designing drugs. computer-based de novo design methods aim mainly at generating small molecule compounds with ideal physicochemical and pharmacological properties. in the past decades, fragment-based drug discovery has appeared as a novel concept with good prospects for improving lead optimization, in order to decrease clinical attrition rates in drug design. it is an approach that uses small molecular fragments to probe biomolecular targets [61]. 
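the fitting step described above — relating physicochemical parameters of a drug series to activity — can be shown in its simplest classical (hansch-type, one-descriptor) form. the sketch below fits activity against logp by ordinary least squares and predicts an untested analog; the (logp, pIC50) pairs are fabricated purely to illustrate the workflow, and 3d-qsar replaces this single descriptor with many field-based ones.

```python
# minimal hansch-type qsar: fit activity = slope * logp + intercept by
# ordinary least squares, then predict an untested analog. the training
# pairs are invented for illustration.

training = [(1.0, 4.1), (2.0, 5.0), (3.0, 6.1), (4.0, 6.9)]  # (logp, pIC50)

n = len(training)
sx = sum(x for x, _ in training)
sy = sum(y for _, y in training)
sxx = sum(x * x for x, _ in training)
sxy = sum(x * y for x, y in training)

# closed-form least-squares solution for a single descriptor
slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
intercept = (sy - slope * sx) / n

def predict(logp):
    return slope * logp + intercept

print(predict(2.5))  # interpolated activity for a new analog
```

the same pipeline generalizes to many descriptors by solving the multivariate least-squares problem, which is what 3d-qsar techniques do over grids of interaction-field values.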
fragment-based de novo design has achieved long-term clinical success [62]. despite the fact that modern drug discovery has had some success in delivering effective drugs, drug design is hampered by several factors, such as the tremendous chemical space to be explored for drug molecules [63]. further, as data in biology, chemistry, and clinical medicine grow rapidly, it is clear that drug design should be addressed with multiscale optimization methods and should draw on data beyond the molecular level [64]. thus, it is essential to discuss the role of multiscale models in drug discovery, and how they have predicted multiple biological properties for different biological targets. accordingly, we discuss the combined application of the concept of fragment-based de novo design and multiscale modeling. the fragment-based de novo design method starts with small building blocks. the initial molecular building blocks with desired properties are either elaborated upon (growing), directly connected (joining), or connected by a linker (linking). this process can be iterated until one or more molecules with the desired properties are obtained. there are two types of methods, namely structure-based and ligand-based methods [65]. the structure-based de novo design method searches for novel ligands by using the 3d structural information of the protein target; ligands are usually constructed directly in the target protein binding site and evaluated by calculating the interaction energy of the target protein with the ligand. in the ligand-based de novo design method, by contrast, the structure of the protein target is unknown, and the new molecule is suggested based on structures analogous to known ligand molecules. nowadays, proteochemometrics has emerged as a relatively new discipline for drug discovery. in this field, qsar analysis is a powerful tool for efficient virtual screening, which captures the physicochemical properties of various compounds. 
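the iterative growing loop described above — attach a building block, re-score, repeat until nothing improves — can be sketched greedily. the fragments, their scores, and the diminishing-returns rule below are hypothetical stand-ins for a real scoring function such as a docking score.

```python
# greedy sketch of fragment "growing": starting from a core, repeatedly
# attach whichever building block most improves a toy score, stopping when
# no attachment helps. fragments and scores are invented placeholders.

FRAGMENT_SCORES = {"amide": 1.2, "phenyl": 0.8, "hydroxyl": 0.5, "nitro": -0.4}
MAX_ATTACHMENTS = 2  # pretend the core exposes two attachment points

def score(molecule):
    # diminishing returns: each extra copy of a fragment contributes less
    total, seen = 0.0, {}
    for frag in molecule["fragments"]:
        seen[frag] = seen.get(frag, 0) + 1
        total += FRAGMENT_SCORES[frag] / seen[frag]
    return total

def grow(core):
    molecule = {"core": core, "fragments": []}
    while len(molecule["fragments"]) < MAX_ATTACHMENTS:
        best_frag, best_gain = None, 0.0
        for frag in FRAGMENT_SCORES:
            trial = {"core": core, "fragments": molecule["fragments"] + [frag]}
            gain = score(trial) - score(molecule)
            if gain > best_gain:
                best_frag, best_gain = frag, gain
        if best_frag is None:
            break  # no fragment improves the score
        molecule["fragments"].append(best_frag)
    return molecule

designed = grow("benzimidazole")
print(designed["fragments"])
```

joining and linking fit the same loop: they differ only in how the trial molecule is assembled before re-scoring, not in the accept/reject logic.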
compared to classical qsar, qm calculations use reactivity descriptors in ligand-based qsars, which provides an implicit model and calculates exact enthalpy contributions of protein-ligand interactions. in contrast, ab initio fragment molecular orbital calculations in structure-based qsars yield an explicit model and a clear enthalpy, capturing changes in the binding energy under different additional conditions. moreover, the free energy contribution of ligand-target complex formation can be calculated in both structure-based and ligand-based qsar models. by using a large number of ligand-target complexes to study changes in binding affinity, more accurate optimization steps can be conducted based on models with good prediction and interpretation [66]. the key to any qsar model is how to accurately describe the molecule, and qm approaches provide a better understanding of the molecular and structural characteristics of ligands and drugs, so as to solve problems in drug discovery. qsar models at different scales are built according to the computational precision required; multiscale-qsar mainly concerns the structural description of the training set and involves both small molecules and macromolecules [67, 68]. at the micro-, meso- and macroscopic scales, different molecular approaches are used. qm approaches are often used to perform precise calculations at the microscale, as in atom-based qsar. molecular force fields focus on mesoscopic-scale simulation, as in fragment-based qsar. coarse-grained studies are mainly performed at the macroscopic scale, as in macromolecule-based qsar and cell-based qsar. moreover, the multiscale character can also be reflected in the different dimensions used in different qsar models. comfa is a 3d fragment-based qsar technique, which can accomplish scaffold hopping and r-group substitution, providing different structures for new drug design. 
besides, derived from proteochemometrics (pcm) [69], the 2.5d kinochemometrics (kcm) approach, which uses 3d descriptors for protein kinases and 2d fingerprints for ligands, can greatly increase efficiency as well as precision compared to traditional 3d qsar methods [70]. multiscale qsar provides effective predictions for drug design, integrating qsar more systematically and applying all existing qsar methods effectively. multiscale de novo drug design is a novel concept that combines qsar models, qm calculations [71] and fragment-based drug discovery (fbdd) [72]. here, the importance of explicit molecular descriptors is shown in a model from a molecular structural point of view through qm calculations. by assembling reasonable molecular fragments, the objective of the drug design method is to produce novel molecules that display the highest biological activity, the best absorption, distribution, metabolism, and elimination (adme) properties and the lowest toxicity in different environments, which fall within the application range of qsar models. multiscale de novo drug design methods can efficiently handle large amounts of biochemical/clinical data and extract the chemical characteristics needed to improve the properties of drug molecules. it is considered to be a more effective and safer way to discover new therapeutic agents. in the process of drug discovery, machine intelligence methods have been applied in most of the above-mentioned computational methods over the past few decades [73]. with the booming era of "big" data, machine learning methods have developed into deep learning approaches, which are a more efficient way for drug designers to deal with important biological properties from large compound databases. here, we introduce applications of machine learning methods in qsar analysis as well as recent advances in deep learning methods. decision trees (dts) are a simple, interpretable and predictive machine learning method. 
ordinarily, building a decision tree involves two fundamental steps: selecting properties and pruning. the selected properties are treated as internal nodes, the branches represent test results on the molecule, and the leaf nodes are classification labels. in order to limit the complexity of the decision tree, a pruning procedure is used to prune the established tree. the dt is a typical classification algorithm and is widely used in disease prediction and auxiliary diagnosis, such as management decision-making, classification models for metabolic disorders, and data mining of diabetes [74]. abdul et al. [75] developed a task-based chemical toxicity prediction framework and used a decision tree to obtain an optimum number of features from a collection of thousands, which effectively helps chemists perform prescreening of toxic compounds. the artificial neural network (ann) achieves problem-solving by mimicking brain function. just as the brain applies information obtained from past experience to solve new problems, a neural network can construct a system of "neurons" that reaches new decisions, classifications, and predictions based on previous experience. the processing element is similar to a neuron, and massive numbers of processing elements are organized in layers of three types: input, hidden, and output layers. the ann benefits from high self-organization, robustness, and fault tolerance, and has been widely applied in prognosis evaluation and early prevention of diseases. lorenzo et al. [76] used an interpretable ann to predict biophysical properties of therapeutic monoclonal antibodies, including melting temperature, aggregation onset temperature and interaction parameters, as a function of ph and salt concentration from the amino acid composition. artificial neural networks had their first heyday in molecular informatics and drug discovery approximately two decades ago. 
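the input -> hidden -> output structure described above can be made concrete with a minimal forward pass. in the sketch below the weights are fixed by hand purely to show the mechanics; a real model would learn them from training data, and the two inputs stand in for normalised molecular descriptors.

```python
import math

# minimal feedforward network with one hidden layer: each hidden "neuron"
# takes a weighted sum of the inputs plus a bias, passes it through a
# sigmoid, and the output neuron does the same over the hidden activations.
# weights and biases are hand-picked placeholders, not trained values.

W_HIDDEN = [[0.5, -0.3], [0.8, 0.2], [-0.6, 0.9]]  # 3 hidden neurons, 2 inputs
B_HIDDEN = [0.1, -0.2, 0.05]
W_OUT = [1.0, -0.7, 0.4]                            # 1 output neuron
B_OUT = 0.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs):
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
              for ws, b in zip(W_HIDDEN, B_HIDDEN)]
    return sigmoid(sum(w * h for w, h in zip(W_OUT, hidden)) + B_OUT)

# e.g. two normalised molecular descriptors in, a predicted probability out
print(forward([0.4, 0.9]))
```

training consists of adjusting `W_HIDDEN`, `B_HIDDEN`, `W_OUT`, and `B_OUT` to reduce the error of this forward pass on known examples, which is the "previous experience" the paragraph above refers to.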
currently, we are witnessing renewed interest in adapting advanced neural network architectures for pharmaceutical research by borrowing ideas from the deep learning field. compared with some other fields in the life sciences, their applications in drug discovery are still limited. the support vector machine (svm) is one of the most promising machine learning methods; it can use molecular descriptors to construct predictive qsar models and deal with high-dimensional datasets. ann and multiple linear regression analysis have been used to construct linear and nonlinear models, which were then compared with the results gained by svm. for linear models, the svm approach separates the different classes of points in the mapped space by maximizing the margin between the categories [77]. further, for nonlinear models, svms use kernel mappings to transform the data into a high-dimensional space for linear classification. at present, the svm approach is widely used in modeling at different scales for drug discovery [78]. the k nearest neighbor (knn) algorithm is one of the simplest and most intuitive of all machine learning methods, and is usually used jointly with other selection algorithms in the feature space. it is used for classification and regression based on instance learning. normally, a molecule is classified by the votes of its closest neighbors and assigned to the most common class among them. here, the value of k is the number of closest neighbors. in ligand-based virtual screening, knn can be viewed as an extension of chemical similarity searching to supervised learning, with the top search results predicting the best bioactivities. weber et al. [79] tried two machine learning classification algorithms (knn and rf) to analyze genotype-phenotype datasets of hiv protease and reverse transcriptase (rt). 
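the knn-as-similarity-search view described above can be sketched directly over binary fingerprints: rank training molecules by tanimoto similarity to the query and take a majority vote of the top k. the fingerprints and activity labels below are toy data invented for illustration.

```python
# knn over binary fingerprints with tanimoto similarity: a query molecule
# is labelled by majority vote of its k most similar neighbours.
# fingerprints (sets of "on" bits) and labels are invented toy data.

def tanimoto(fp1, fp2):
    """tanimoto similarity between two sets of 'on' fingerprint bits."""
    inter = len(fp1 & fp2)
    union = len(fp1 | fp2)
    return inter / union if union else 0.0

TRAINING = [  # (fingerprint, active?)
    ({1, 2, 3, 7}, True),
    ({1, 2, 4, 7}, True),
    ({5, 6, 8, 9}, False),
    ({5, 6, 8, 10}, False),
    ({2, 3, 7, 9}, True),
]

def predict(fp, k=3):
    neighbours = sorted(TRAINING, key=lambda ex: tanimoto(fp, ex[0]),
                        reverse=True)[:k]
    votes = sum(1 for _, active in neighbours if active)
    return votes > k // 2  # majority vote of the k nearest neighbours

print(predict({1, 2, 3, 9}))   # resembles the active cluster
print(predict({5, 6, 9, 10}))  # resembles the inactive cluster
```

note that knn stores the training set verbatim and does all its work at query time, which is why it pairs naturally with the similarity searching already used in ligand-based screening.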
as a result, both algorithms had high accuracy in predicting drug resistance for protease and rt inhibitors. the random forest (rf) is an ensemble learning approach involving the building of multiple dts from the training examples. similar to knn, it is used for both classification and regression [80]. compared to dts, the rf is much less likely to over-fit the data, and it has been used for bioactivity data classification [81], toxicity modeling [82], and drug target prediction [83], among others. wang et al. [84] used the rf approach to model protein-ligand binding affinity on 170 hiv-1 protease complexes, 110 trypsin complexes, and 126 carbonic anhydrase complexes, which demonstrated that individual representation and model construction for each protein family is a more reasonable way to predict the affinity of one particular protein family. currently, multiscale models can predict the toxicity, activity, and adme properties of different protein and microbial targets by integrating different genomic and proteomic data. cheminformatics has played an important role in rationalizing drug discovery. the qsar model has become the main auxiliary tool for virtual screening of various pharmacological characteristics. although the qsar model has been widely used in the search for and design of new drugs, classical qsar models can only predict the activity and toxicity of one biomolecule against one certain target. however, multi-target qsar (mt-qsar) can be used to carry out rational drug design against multiple targets, which provides a better way to understand the pharmacological characteristics of molecules, including antibacterial activity and toxicity. furthermore, unified multitasking models based on quantitative structure-biological effect relationships (mtk-qsber) have been used in many studies. 
these models were built with anns and topological indices; they can predict biological activity and toxicity correctly and classify compounds under experimental conditions. meanwhile, these models use perturbation models to form structure-activity relationships between the site of infection and the drug, such as the ptml model [85] and the chembl-based models [86], which have been applied widely in infectious diseases [71], immunology [85], and cancer [87]. currently, the mtk-qsber model is able to carry out the in-silico design and virtual screening of antibacterial drugs efficiently, and these antibacterial drugs have good biosafety. these methods provide a powerful tool for in silico screening of reasonable drugs. the deep learning network is a concept closely related to the ann, based on layered learning, that is, multiple levels of learning ranging from low to high. even when molecular descriptors are not pre-selected, the deep learning method will automatically learn representations from original, high-dimensional data [88]. this allows deep learning to be applied to model building in drug discovery [89]. convolutional neural networks (cnns) are the most commonly used; they have made great progress in the computer vision community [90] and have been applied in drug design fields including de novo drug molecule identification, protein engineering and gene expression analysis. with the rapid development of deep learning concepts such as the cnn, the molecular modeler's tool box has been equipped with potentially game-changing methods. judging from the success of recent pioneering studies, we are convinced that modern deep learning architectures will be useful for the coming age of big data analysis in pharmaceutical research, toxicity prediction, genome mining and chemogenomic applications, to name only some of the immediate areas of application. kiminori et al. 
[91] developed a fundamental technology that can predict the resistance of free cancer cells to fluorinated pyrimidine anticancer drugs by deep learning from morphological image data. cai et al. [92] developed a deep learning approach, termed deepherg (deep human ether-a-go-go-related gene), for the prediction of herg blockers among small molecules in drug discovery and post-marketing surveillance. the group found that deepherg models built by a multitask deep neural network (dnn) algorithm outperformed those built by single-task dnn, svm and rf. nowadays, drug development usually includes artificial intelligence-based (ai-based) techniques. most ai applications concentrate only on limited tasks. moreover, current ai can only address patients' specific problems; it cannot make subjective inferences, as doctors do, from the overall physical context of a patient. as a subfield of ai, ml can be used successfully when trained on high-quality examples. however, this process is very time-consuming and costly. the development of ml techniques and the application of existing algorithms to process massive amounts of digital data result in higher requirements for computer hardware, which also increases the clinical cost. dl, which is in turn a subset of ml, can process big data and learn patterns through layers of neurons. however, it is difficult to understand how each decision is reached by the algorithm. ml methods have achieved great success in the field of chemoinformatics for designing and discovering new drugs. an important innovation is the combination of ml methods and big data analysis to predict more extensive biological features. it is vital to discover safer and more efficient drugs by integrating structural and genetic information and pharmacological data from the molecular to the organism scale [93]. 
in addition, dl approaches have proven to be a promising way of efficiently learning from a large variety of datasets for modern novel drug discovery. multiscale modeling of drugs in an excitable system is critical because experiments at a single system scale cannot reveal the underlying effects of multiple drug interactions. a computationally based approach to predicting the emergent effects of drugs on excitatory rhythms may form an interactive technology-driven process for the drug and disease screening industry, research and development academia, and the patient-oriented medical clinic. there are potentially far-reaching implications because the millions of people affected by arrhythmia each year will benefit from improved risk stratification of drug-based interventions. much progress has been made in developing multiscale computational modeling and simulation approaches for predicting the effects of cardiac ion channel blocking drugs. structural modeling of ion channel interactions with drugs is a critical approach for current and future drug discovery efforts. modeling of drug receptor sites within an ion channel structure can be useful to identify key drug-channel interaction sites. drug interactions with cardiac ion channels have been modeled at the atomic scale in simulated docking and md simulations, as well as at the level of channel function to simulate drug effects on channel behavior [32, [94][95][96][97][98][99][100]. structural modeling of drug-channel interactions at the atomic scale may ultimately allow for the design of novel high-affinity and subtype-selective drugs for specific targeting of receptors in cardiac and neurological disorders. the world health organization (who) has stated that cancer remains one of the most dangerous diseases today. considering that cancer is a multifactorial disease, there is increasing interest in multi-target compounds that can target multiple intracellular pathways. 
however, the study of large data sets for the analysis of anticancer compounds is difficult, given the large amount and high complexity of the data. for example, the chembl database [101] compiles big datasets of very heterogeneous preclinical assays. bediaga et al. [87] reported a ptml-lda model of the chembl dataset for the preclinical determination of anticancer compounds. ptml is a model that combines perturbation theory (pt) ideas and ml methods to solve such problems. they compared this model with other ptml models reported by speck-planche et al. [72, [102][103][104] and concluded that theirs is the only one that can predict activity against multiple cancers. speck-planche et al. also derived a multi-task (mtk) chemoinformatic model combining broto-moreau autocorrelations with an ann from a dataset containing 1933 peptide cases. this model was used to virtually design and screen peptides with potential anti-cancer activity against different cancer cell lines and low cytotoxicity toward a variety of healthy mammalian cells, and it shows accuracy greater than 92% in both the training and prediction (test) sets. in addition, due to the inherent complexity of tumors, it is necessary to analyze their growth at different scales, covering phenomena that occur at spatial scales from tissue down to molecular lengths. the complexity of cancer development manifests at at least three distinguishable scales that can be described by mathematical models, namely the microscale, mesoscale, and macroscale. wang et al. conducted a number of studies on how to use multiscale models for the identification and combination therapy of drug targets [105][106][107][108][109][110][111]. this method is based on quantifying the relationship between intracellular epidermal growth factor receptor (egfr) signaling kinetics, extracellular epidermal growth factor (egf) stimulation in lung cancer, and multicellular growth. 
the multiscale modeling of tumors combined with systems pharmacology will contribute to the development of practical smart drugs. it will produce a comprehensive system-level approach to determine the dynamics and effects of existing and new drugs in preclinical trials, model organisms and individual patients. in addition, mathematical and computational studies provide a better way to understand the many factors that influence the effects of drugs, thus helping to uncover better ways to interfere therapeutically with disease. multiscale models can also be used to identify pathophysiological processes to allow disease staging. in many cases, like cancer, treatments vary depending on the stage of the disease. the model can help determine prognosis, an important clinical determination that can help identify the right type of medication to be administered or discovered. several models focus on the neuronal network level, including cutsuridis and moustafa for alzheimer's disease [112], and lytton for epilepsy [113]. the ann is a class of ml techniques that can be used for clinical analysis of big data, including data related to drug testing, which is critical for drug discovery. in addition, anastasio [114] introduced process algebra, a computer technology widely used to analyze complex computing systems, applied here to computational neurology. sirci et al. [115] described how network theory is used to identify similarities and differences between different pharmacological agents. in this type of study, each drug is a node, and the edges between drugs represent the chemical and transcription-based interactions that characterize the drugs. in addition, ferreira da costa et al. [116] reported the first ptml (pt + ml) study of a large chembl dataset for preclinical determinations of compounds targeting dopamine pathway proteins. 
molecular docking or ml models can be used for a specific protein, but these models cannot handle the large and complex data sets of preclinical assays reported in public databases. pt models, on the other hand, allow us to predict the properties of a query compound or molecular system in an experimental analysis with multiple boundary conditions based on previously known reference cases. in their work, the best ptml model found in the training and external validation series had an accuracy of 70-91%. hansch's model is a classic method for solving quantitative structure-binding relationships (qsbr) in pharmacology and medicinal chemistry. abeijon et al. [117] developed a new pt-qsbr hansch model based on pt and qsbr methods for a large number of drugs reported in chembl, focusing on a protein expressed in the hippocampus of the brain of alzheimer's disease (ad) patients. now, by decomposing how risks and causes combine in complex systems to produce disease, and how to prevent or ameliorate these diseases through multi-stage, multi-target, multi-drug techniques, multiscale modeling is gradually being mastered. from aids, hepatitis c, influenza, and other disease-related viruses to the current 2019-ncov, we have been working hard to develop antiviral drugs targeting them. however, the unique structure and proliferation of viruses pose a natural challenge for drug development. viruses do not have their own cellular structure and metabolic system, and must replicate and proliferate in host cells. therefore, it is difficult to find compounds that target only viral targets without affecting the normal function of host cells. at present, the main way that some antiviral drugs work is to inhibit viral replication. however, many of the tools used for virus replication come from human cells, such as ribosomes, and the corresponding antiviral drugs will also bring serious side effects to the human body. 
therefore, drug discovery requires the introduction of multiscale models to screen out drugs that can inhibit viral replication while reducing the damage to the human body. so far, retroviral infections, such as hiv, are incurable diseases. chembl manages big data through complex datasets, which makes the information difficult to analyze because these datasets describe numerous features for predicting new drugs for retroviral infections. without a proper model, it is impossible to make full use of these features. hence, vásquez-domínguez et al. [118] proposed a ptml model for the chembl dataset, which can be efficiently used for preclinical experimental analysis of antiretroviral compounds. the pt operator is based on a multi-conditional moving average, which combines different functions and simplifies the management of all data. the ptml model they proposed was the first to consider multiple features combined with preclinical experimental antiretroviral tests. in order to simultaneously explore antibacterial activity against gram-negative pathogens and in vitro safety related to absorption, distribution, metabolism, elimination, and toxicity (admet), speck-planche et al. [119] further proposed the first mtk-qsber model. the accuracy of this model in both the training and prediction (test) sets is higher than 97%. they also developed a chemoinformatic model for simultaneous prediction of anti-cocci activities and in vitro safety [71] . the best model displayed accuracies around 93% in both training and prediction (test) sets. additionally, focusing on anti-hepatitis c virus (hcv) activity, the accuracy shown in the training and prediction (test) sets is higher than 95% using this model [120] . cytotoxicity is one of the main concerns in the early development of peptide-based drugs. kleandrova et al.
[121] introduced the first multi-task processing (mtk) computational model focused on predicting both antibacterial activity and peptide cytotoxicity. gonzalez-diaz et al. [122] developed a model called lnn-alma to generate complex networks of aids prevalence with respect to the preclinical activity of anti-hiv drugs. multiscale models are also imperfect and have their limitations. models are expressions and simplifications of real life. no model can represent everything that can happen in a system. all models contain specific assumptions, and models vary widely in their comprehensiveness, quality, and utility. in other words, each model can only solve limited problems. hence, we need to integrate different computational models and data in order to make full use of them. computational methods have come to play significant roles in drug screening and design. multiscale biomolecular simulations can help identify the drug binding sites on target macromolecules and elucidate drug action mechanisms. virtual screening can efficiently search massive chemical databases for lead compounds. de novo drug design provides an alternative, powerful way to design drug molecules from scratch using building blocks summarized and abstracted from previous successful drug discovery. ml is revolutionizing most computational methods in drug screening and design, which may greatly improve efficiency and precision in the big data era. as we frequently emphasize, different models and efficient algorithms (e.g., dimensionality reduction) need to be integrated properly to achieve the comprehensive study of biological processes at multiple scales as well as accurate and effective drug screening and design. the integrated computational methods will accelerate drug development and help identify effective therapies with novel action mechanisms that can ultimately be applied to a variety of complex biological systems. the authors declare no conflict of interest.
[1] prediction of drug-target interaction networks from the integration of chemical and genomic spaces
[2] properties and identification of human protein drug targets
[3] critical assessment of methods of protein structure prediction (casp)-round xii
[4] multiscale modeling of biomolecular systems: in serial and in parallel
[5] virtual screening of chemical libraries
[6] computational protein-ligand docking and virtual drug screening with the autodock suite
[7] rapid virtual screening of enantioselective catalysts using catvs
[8] automated de novo drug design: are we nearly there yet?
[9] deep reinforcement learning for de novo drug design
[10] machine learning for molecular modelling in drug design. biomolecules
[11] machine learning based dimensionality reduction facilitates ligand diffusion paths assessment: a case of cytochrome p450cam
[12] development of multiscale models for complex chemical systems: from h + h2 to biomolecules (nobel lecture)
[13] the many roles of computation in drug discovery
[14] role of molecular dynamics and related methods in drug discovery
[15] advancing drug discovery through enhanced free energy calculations
[16] excited state properties of non-doped thermally activated delayed fluorescence emitters with aggregation-induced emission: a qm/mm study
[17] exploring the dependence of qm/mm calculations of enzyme catalysis on the size of the qm region
[18] spectroscopy in complex environments from qm-mm simulations
[19] peptide folding kinetics from replica exchange molecular dynamics
[20] structural characterization of λ-repressor folding from all-atom molecular dynamics simulations
[21] macrolide antibiotics allosterically predispose the ribosome for translation arrest
[22] current tools and methods in molecular dynamics (md) simulations for drug design
[23] modeling structural dynamics of biomolecular complexes by coarse-grained molecular simulations
[24] the impact of molecular dynamics on drug design: applications for the characterization of ligand-macromolecule complexes
[25] molecular dynamics simulations and drug discovery
[26] the future of molecular dynamics simulations in drug discovery
[27] identification of drug binding sites and action mechanisms with molecular dynamics simulations
[28] assessing the performance of the mm/pbsa and mm/gbsa methods. 1. the accuracy of binding free energy calculations based on molecular dynamics simulations
[29] physical properties of the hiv-1 capsid from all-atom molecular dynamics simulations
[30] biomolecular interactions modulate macromolecular structure and dynamics in atomistic model of a bacterial cytoplasm
[31] online tools for protein ensemble pocket detection and tracking
[32] multiscale modeling in the clinic: drug design and development
[33] multiscale methods in drug design bridge chemical and biological complexity in the search for cures
[34] recent advances in fragment-based computational drug design: tackling simultaneous targets/biological effects
[35] monte carlo simulations of proton pumps: on the working principles of the biological valve that controls proton pumping in cytochrome c oxidase
[36] multiscale simulations of protein landscapes: using coarse-grained models as reference potentials to full explicit models
[37] realistic simulations of proton transport along the gramicidin channel: demonstrating the importance of solvation effects
[38] diverse strategies in drug discovery and development
[39] good practices in model-informed drug discovery and development: practice, application, and documentation
[40] advances in computational structure-based drug design and application in drug discovery
[41] expediting the design, discovery and development of anticancer drugs using computational approaches
[42] identification of potential crac channel inhibitors: pharmacophore mapping, 3d-qsar modelling, and molecular docking approach
[43] homology model versus x-ray structure in receptor-based drug design: a retrospective analysis with the dopamine d3 receptor
[44] new insights for drug design from the x-ray crystallographic structures of g-protein-coupled receptors
[45] an improved receptor-based pharmacophore generation algorithm guided by atomic chemical characteristics and hybridization types
[46] x-ray crystallographic structure of a teixobactin analogue reveals key interactions of the teixobactin pharmacophore
[47] design, synthesis and characterization of novel chromone based-copper(ii) antitumor agents with n,n-donor ligands: comparative dna/rna binding profile and cytotoxicity
[48] pharmacophore modeling and applications in drug discovery: challenges and recent advances
[49] searching for potential mtor inhibitors: ligand-based drug design, docking and molecular dynamics studies of rapamycin binding site
[50] best practices for qsar model development, validation, and exploitation
[51] rational drug design of antineoplastic agents using 3d-qsar, cheminformatic, and virtual screening approaches
[52] molecular docking and structure-based drug design strategies
[53] molecular docking as a popular tool in drug design, an in silico travel
[54] 1h-nmr of rh(nh3)4phi3+ bound to d(tggcca)2: classical intercalation by a nonclassical octahedral metallointercalator
[55] 3d pharmacophore modeling techniques in computer-aided molecular design using ligandscout
[56] über den jetzigen stand der chemotherapie [on the current state of chemotherapy]
[57] three-dimensional pharmacophoric pattern searching
[58] pharmacophore models and pharmacophore-based virtual screening: concepts and applications exemplified on hydroxysteroid dehydrogenases
[59] pharmacophore-based virtual screening
[60] identification of potential tumour-associated carbonic anhydrase isozyme ix inhibitors: atom-based 3d-qsar modelling, pharmacophore-based virtual screening and molecular docking studies
[61] revealing the macromolecular targets of complex natural products
[62] multi-objective molecular de novo design by adaptive fragment prioritization
[63] click chemistry for drug discovery
[64] legacy data sharing to improve drug safety assessment: the etox project
[65] multi-objective optimization methods in de novo drug design. mini rev
[66] using local models to improve (q)sar predictivity
[67] a multiscale simulation system for the prediction of drug-induced cardiotoxicity
[68] multiscale quantum chemical approaches to qsar modeling and drug design
[69] polypharmacology modelling using proteochemometrics (pcm): recent methodological developments, applications to target families, and future prospects
[70] prediction of protein kinase-ligand interactions through 2.5d kinochemometrics
[71] chemoinformatics for medicinal chemistry: in silico model to enable the discovery of potent and safer anti-cocci agents
[72] fragment-based in silico modeling of multi-target inhibitors against breast cancer-related proteins
[73] quantitative structure-activity relationship: promising advances in drug discovery platforms
[74] machine learning method for knowledge discovery experimented with otoneurological data
[75] efficient toxicity prediction via simple features using shallow neural networks and decision trees
[76] application of interpretable artificial neural networks to early monoclonal antibodies development
[77] computational prediction of anti hiv-1 peptides and in vitro evaluation of anti hiv-1 activity of hiv-1 p24-derived peptides
[78] in silico de novo design of novel nnrtis: a bio-molecular modelling approach
[79] automated prediction of hiv drug resistance from genotype data
[80] cct244747 is a novel potent and selective chk1 inhibitor with oral efficacy alone and in combination with genotoxic anticancer drugs
[81] qsar based model for discriminating egfr inhibitors and non-inhibitors using random forest
[82] using random forest and decision tree models for a new vehicle prediction approach in computational toxicology
[83] identification of human drug targets using machine-learning algorithms
[84] a comparative study of family-specific protein-ligand complex affinity prediction based on random forest approach
[85] ptml model for proteome mining of b-cell epitopes and theoretical-experimental study of bm86 protein sequences from colima
[86] computational modeling in nanomedicine: prediction of multiple antibacterial profiles of nanoparticles using a quantitative structure-activity relationship perturbation model
[87] ptml combinatorial model of chembl compounds assays for multiple types of cancer
[88] big data deep learning: challenges and perspectives
[89] deep learning in neural networks: an overview
[90] imagenet classification with deep convolutional neural networks
[91] deep learning recognizes ftd-resistant isolated cancer cells of colon cancer
[92] deep learning-based prediction of drug-induced cardiotoxicity
[93] data integration: challenges for drug discovery
[94] a computational model of induced pluripotent stem-cell derived cardiomyocytes incorporating experimental variability from multiple data sources
[95] multi-scale modeling of the cardiovascular system: disease development, progression, and clinical intervention
[96] binding pocket optimization by computational protein design
[97] molecular mechanism matters: benefits of mechanistic computational models for drug development
[98] parameterization for in-silico modeling of ion channel interactions with drugs
[99] a molecularly detailed nav1.5 model reveals a new class i antiarrhythmic target
[100] a computational model to predict the effects of class i anti-arrhythmic drugs on ventricular rhythms
[101] cibrián-uhalte, e. the chembl database in 2017
[102] chemoinformatics in anti-cancer chemotherapy: multi-target qsar model for the in silico discovery of anti-breast cancer agents
[103] rational drug design for anti-cancer chemotherapy: multi-target qsar models for the in silico discovery of anti-colorectal cancer agents
[104] unified multi-target approach for the rational in silico design of anti-bladder cancer agents. anti-cancer agent
[105] multiscale modeling of ductal carcinoma in situ
[106] a multiscale agent-based model of ductal carcinoma in situ
[107] mathematical modeling in cancer nanomedicine: a review
[108] mathematical modeling in cancer drug discovery
[109] mathematical modeling of tumor organoids: toward personalized medicine
[110] an in silico approach for assessing drug efficacy within a tumor tissue
[111] current advances in mathematical modeling of anti-cancer drug penetration into tumor tissues
[112] multiscale models of pharmacological, immunological and neurostimulation treatments in alzheimer's disease
[113] computer modeling of epilepsy: opportunities for drug discovery
[114] modeling neurological disease processes using process algebra
[115] computational drug networks: a computational approach to elucidate drug mode of action and to facilitate drug repositioning for neurodegenerative diseases
[116] gonzález-díaz, h. perturbation theory/machine learning model of chembl data for dopamine targets: docking, synthesis, and assay of new l-prolyl-l-leucyl-glycinamide peptidomimetics
[117] multi-target mining of alzheimer disease proteome with hansch's qsbr-perturbation theory and experimental-theoretic study of new thiophene isosters of rasagiline
[118] multioutput perturbation-theory machine learning (ptml) model of chembl data for antiretroviral compounds
[119] de novo computational design of compounds virtually displaying potent antibacterial activity and desirable in vitro admet profiles
[120] speeding up early drug discovery in antiviral research: a fragment-based in silico approach for the design of virtual anti-hepatitis c leads
[121] enabling the discovery and virtual screening of potent and safe antimicrobial peptides. simultaneous prediction of antibacterial activity and cytotoxicity
[122] ann multiscale model of anti-hiv drugs activity vs. aids prevalence in the us at county level based on information indices of molecular graphs and social networks
key: cord-300570-xes201g7 authors: patwardhan, j.
title: predictions for europe for the covid-19 pandemic after lockdown was lifted using an sir model date: 2020-10-06 journal: nan doi: 10.1101/2020.10.03.20206359 sha: doc_id: 300570 cord_uid: xes201g7 i analyze a simplified sir model developed from a paper written by gyan bhanot and charles de lisi in may of 2020 to find the successes and limitations of their predictions. in particular, i study the predicted cases and deaths fitted to data from march and their potential application to data in september. the data is observed to fit the model as predicted until around 150 days after december 31, 2019, after which many countries lifted their lockdowns and began to reopen. a plateau in cases followed by an increase approximately 1.5 months later is also observed. in terms of deaths, the data fits the shape of the model, but the model mostly underestimates the death toll after around 160 days. an analysis of the residuals is provided to locate the precise date of each country's departure from its accepted data estimates, and each data point is compared to its predicted value using a z-test to determine whether the observation can fit the given model. the observed behavior is matched to policy measures taken in each country to attach an explanation to these observations. i notice that international reopening results in a sharp increase in cases, and aim to plot this new growth in cases and predict when the pandemic will end for each country. the novel sars-cov-2 coronavirus, first appearing in wuhan, china around december 31, 2019, has caused an ongoing worldwide pandemic. it differs from the initial sars-cov virus strain in a multitude of ways, from its only 79 percent similarity to the original [8] to a wider range of mortality rates. while the sars-cov strain had a mortality rate of about 9.6 percent [6] , this new coronavirus has death rates ranging from 13.9 percent in italy to 3.1 percent in the us [5] .
despite this, it appears to have a transmission rate more favorable to its spread and a prolonged latency period. since the onset of a global lockdown earlier in the year, many countries throughout the world have begun reopening, albeit not in full. nonetheless, the measures taken by governments during quarantine have significantly eased, although many countries continue to encourage proper social distancing measures, from wearing masks to staying 6 feet apart. around 140 days after december 31, 2019, an sir model [10] was created to model the flow of this novel coronavirus in the european countries of the netherlands, denmark, sweden, norway, the uk, spain, germany, italy, and france [4] . the lack of a common policy across the world has led to different responses and outbreaks in different countries, as we will come to see. varying reopening policies among countries have resulted in varying levels of success in virus containment, with some countries almost entirely shutting down the virus and other countries seeing a plateau and subsequent growth in cases. in my study, i focus on the two differential equations modelling the s and i parts of the sir model, meaning susceptible and infected, respectively. this uses the variables x1, denoting s, x2, denoting i, α, the transmission rate (number of infections per day per contact), γ, the rate at which individuals leave the infected population, n, the total individuals in an interacting pool of susceptibles, δ, the fraction of individuals who die after being infected, and p, the maximum value of x2. the fits for the number of deaths are found by scaling the cases by δ and shifting the graph forward by a certain number of days. all of the relevant equations used to build the model are taken from the paper that develops the model [4] .
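the two-compartment s/i system described above can be sketched numerically. this is a minimal forward-euler integration of the standard s-i reduction implied by the variable definitions (dx1/dt = -α·x1·x2/n, dx2/dt = α·x1·x2/n - γ·x2); the parameter values below are illustrative placeholders, not the paper's fitted values.

```python
# Minimal sketch of the two-equation S-I core of the SIR model:
#   dx1/dt = -alpha * x1 * x2 / n          (susceptibles, x1)
#   dx2/dt =  alpha * x1 * x2 / n - gamma * x2   (infected, x2)
# Parameter values are illustrative, not the paper's fits.

def simulate_si(alpha, gamma, n, x2_0, days, dt=0.1):
    """Forward-Euler integration; returns daily samples of (x1, x2)."""
    x1, x2 = n - x2_0, x2_0
    out = [(x1, x2)]
    steps_per_day = int(round(1 / dt))
    for _ in range(days):
        for _ in range(steps_per_day):
            new_inf = alpha * x1 * x2 / n * dt   # new infections this step
            removed = gamma * x2 * dt            # removals this step
            x1 -= new_inf
            x2 += new_inf - removed
        out.append((x1, x2))
    return out

traj = simulate_si(alpha=0.25, gamma=0.06, n=1e6, x2_0=10, days=200)
x2_series = [x2 for _, x2 in traj]
peak_day = x2_series.index(max(x2_series))
print(peak_day)  # day on which the infected curve peaks
```

with α > γ the infected curve grows roughly exponentially at first, peaks, and then decays, reproducing the qualitative shape discussed throughout the analysis.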
the primary equations used to graph the curves are the s and i equations of the sir system, which in this notation read

dx1/dt = -α x1 x2 / n,
dx2/dt = α x1 x2 / n - γ x2.

finding the residuals themselves is an easy enough equation: if r = the value of the residual, x = the observed number of cases or deaths at a point in time, and x2 = the predicted number of cases or deaths at that point in time using the main curve, then

r = x - x2.

the associated error bars for each residual are found by subtracting the observed number of cases or deaths from the predicted number of cases or deaths at that point in time using each value of the error-bar curve. to each point in time, i also associate a p value denoting the probability of finding the observed data result if the model were true. this is a matter of a simple hypothesis test. the standard deviation σ is found using its formula and the values x2u and x2l, representing the upper and lower bounds from the estimation model. using σ, we can then find the corresponding z score, z = (x - x2)/σ, and the area under the normal curve beyond it, which represents the p value. alternatively, using the pnorm function in r yields the same results. reasons for data analysis tools. in this exploration, both residual plots and p value plots (at the α = 0.05 level, corresponding to a 95 percent level of confidence) are utilized to find results. we can observe both the advantages and drawbacks of each tool, which creates a need for the other. with a residual plot, we can better analyze how far each data point falls from the predicted value on a linear scale. this clears up the confusion of distances on the log scale, as the model may show an overprediction of 1000 to be miniscule in comparison to an overprediction of 10 depending on the data point's location on the y axis. we also gain a better understanding of both how well the data fit the model and when the data start and stop fitting the model. the drawback, however, is a lack of knowledge of whether a certain residual is expected from the data, for the same reason
that there is a distinction between the aforementioned difference of 1000 and 10. the error bars expand as time goes on, encompassing a larger expanse of area but deceptively accounting for a less inclusive interval of error. it is useful to gather whether an observed data point's deviation from the model is expected or not. a statistical analysis of the probability of observing such an extreme or more extreme data point tells us that even a large difference of 1000 can be expected given a large standard deviation, as dictated by the error bars. a problem, however, arises when the r value derived from the model [4] increases. as r increases, the fit becomes tighter and tighter, decreasing the standard deviation, potentially making even a deviation of 10 seem probabilistically impossible. this is where a look at the residuals clears up the problems caused by small standard deviations. in other words, residuals are particularly useful when studying smaller values, while p values are more useful when studying larger values. a final useful note is that all labeled days represent days starting after december 31, 2019 (january 1, 2020 is day 1). (this preprint, which was not certified by peer review, is made available under a cc-by-nd 4.0 international license; the author/funder has granted medrxiv a license to display the preprint in perpetuity; this version was posted october 6, 2020; https://doi.org/10.1101/2020.10.03.20206359.) analysis of the case model. in general, the model matches the observed data points well until around day 150. we note an initial almost linear climb in log space, corresponding to exponential growth of the virus. we note a maximum point at a median of about 40 days after the first case detected in the country, followed by a more gradual drop in cases.
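the residual and z-test machinery described above can be sketched as follows. one caveat: the text does not spell out the exact standard-deviation formula, so the sketch assumes the upper and lower error-bar curves are 95% bounds (x2u - x2l ≈ 2·1.96·σ) and uses a two-sided tail; both are labeled assumptions.

```python
# Sketch of the residual / z-score / p-value analysis described above.
# ASSUMPTION: the error-bar curves x2u, x2l are treated as 95% bounds,
# so sigma = (x2u - x2l) / (2 * 1.96); the paper does not give the formula.
from statistics import NormalDist

def residual(x_obs, x2_pred):
    """r = observed minus predicted cases/deaths at a point in time."""
    return x_obs - x2_pred

def p_value(x_obs, x2_pred, x2u, x2l):
    """Two-sided probability of an observation at least this extreme,
    mirroring the pnorm-based computation mentioned in the text."""
    sigma = (x2u - x2l) / (2 * 1.96)          # assumed 95% bounds
    z = (x_obs - x2_pred) / sigma
    return 2 * NormalDist().cdf(-abs(z))

# e.g. observed 1200 cases vs. predicted 1000 with bounds [800, 1200]:
print(residual(1200, 1000))                         # 200
print(round(p_value(1200, 1000, 1200, 800), 3))     # ≈ 0.05
```

a point would then be flagged as departing from the model when its p value stays below the α = 0.05 threshold, as in the analysis above.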
instead of seeing a continuous decrease in cases, we observe a plateau, sometimes followed by a small drop in cases, and then a rise which on occasion exceeds the initial peak. this plateau starts at different locations, most after an initial drop in cases, but some at the peak itself. the peaks of each country vary, and we can see a clear relationship between a country's population and its number of cases. looking at the number of cases per million population, however, does not show any such relationship, suggesting that this factor depends more on federal action than on population and population density alone. we notice that both the uk and sweden plateau immediately after peaking, making the model mostly inaccurate for both countries. for this reason, we can't do a p value analysis for them. in a similar vein, the observed drop in cases for italy didn't match the model, as the drop was much more gradual than expected, and for that reason the model is mostly inaccurate for this country as well. for every other country, the data points began to deviate from the model at about day 142 ± 20 days. these plateaus lasted for about 42.75 ± 15 days, with the exception of norway, with an 83-day plateau. given the relatively small size of the outbreak in norway, it is expected that there would be a broad plateau in cases due to the smallness of the numbers involved. it can also be argued that france doesn't have a plateau at all, instead going from a decrease in cases to a sudden increase. another interpretation would be to observe a wide plateau akin to the shapes of norway and denmark, followed by a rise in cases. we notice an increase in cases in every country, with most increases being sharply exponential (with the exception of sweden).
we also observe a small decline in cases right before the aforementioned increase (with the exceptions of spain and germany), suggesting that many countries had the potential to drop their cases once again before whatever event caused them all to grow so suddenly. finally, we observe this rise in cases to occur in roughly the same timeframe: from day 190 to day 205. we can likely attribute this to a policy regarding reopening made either by the eu or by each government in close proximity to the others. we'll explore this in a later section. the residual plots were useful for attaching a rough start date to each plateau, as the plateau's onset roughly corresponds to a deviation from the covid model. a point whose error bars do not intersect 0, and whose subsequent points also do not intersect 0, is thus marked as a starting point used to locate these plateaus. as time went on, the model predicted the virus to be gone in most countries before day 180 [4] , the exception being the uk at around day 210. after around day 170, the model predicts small x2 values with small standard deviations, so observed residuals at this point mostly represent the growth of the virus itself during that time. we can clearly see an exponential increase both in the model and in the residual plots at nearly identical times for this reason. similarly, we mark the date at which the observed data points first become statistically improbable given the model curve, and continue to be (mostly) improbable, as the date at which the model becomes obsolete according to our p value plots. this also matches quite well with the start date of the plateau. analysis of the death model.
we can observe a similar shape for each death graph as for the corresponding cases graph, but with the values on the y axis scaled down by whatever fraction δ was found. we notice almost consistently that the model underestimates the total death toll after its peak, with the exception of the netherlands, which the model surprisingly overestimates. the model predicts deaths to reach 0 at around day 160 for each country, but instead we see that there are still deaths after the predicted end date, albeit very few. we can see this in our residual plots, in which the graphs stabilize around 0 after around day 160. this stabilization, however, is not an indicator of the accuracy of the model, which is proven by the p values dropping below the α = 0.05 level after roughly the same date, but occurs because the deaths have also fallen to fairly small numbers, usually around 1-10. the model fails around 15 days after cases plateau: we expect cases and deaths to decrease continually until they reach 0, but instead we get a stagnation of the virus. it wouldn't be apt in most cases to call this persistence of the virus a plateau; rather, it is the failure of the model to capture a much less steep falling curve. we notice that the uk and italy deviate from the expected curve almost immediately after reaching their peaks, and we can see a much more gradual decrease in both countries. in almost every residual plot, we see a shape resembling a damped harmonic oscillation as the residuals approach 0. as x approaches 200 and beyond, we can say that if the residuals don't go to and stay at 0, there's an anomaly. we observe this in the uk and in sweden. in the uk, we notice a fluctuation between 0 and 125 from day 163 onwards, indicating a stagnation in deaths. each stagnation occurs during a period of increased cases.
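the scale-by-δ-and-shift construction of the death fits described earlier can be sketched directly. the fatality fraction, shift, and case series below are illustrative placeholders, not the paper's fitted values.

```python
# Sketch of the death-curve construction: deaths(t) is the case curve scaled
# by the fatality fraction delta and shifted forward by `shift` days.
# delta, shift, and the case series are illustrative, not fitted values.

def death_curve(cases, delta, shift):
    """deaths[t] = delta * cases[t - shift], and zero before the shift."""
    return [delta * cases[t - shift] if t >= shift else 0.0
            for t in range(len(cases))]

cases = [0, 10, 50, 200, 400, 300, 150, 60, 20, 5]
deaths = death_curve(cases, delta=0.1, shift=3)
print(deaths)
```

this makes explicit why the death peak lags the case peak by exactly the shift, and why underestimating δ or the shift produces the systematic underestimation of the death toll noted above.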
while we may expect an increase in cases to signify an increase in deaths, it might be useful to consider the possibility that the lethality of the virus itself has decreased, or that hospitals have adapted and become better equipped to help patients survive the virus. we know that the virus takes a long time to mutate [9] , and that its mutations are mostly inconsequential, so we can focus our analysis on hospitals. during the peak of the virus, hospitals were mostly swabbing and testing severely symptomatic patients, arguably skewing the apparent lethality of the virus. considering that rises in cases could be associated with increased numbers of both moderately and severely symptomatic patients, instead of mostly severely symptomatic patients, we can infer that the number of deaths wouldn't have increased [3] . in addition to this, younger people are getting the virus, leading to a greater survival rate, and hospitals themselves have become better at preventing deaths, thanks to the efforts of medical researchers. all of this has continued to keep deaths down. in fact, since we don't see a surge in deaths to match the surge in cases, it could be predicted that deaths will increase after day 227 to compensate for the rise in cases, but it is also entirely probable that the persistence and "plateau" that we see prior to day 227 is due to the surge of cases in each country. it is also probable that infected patients now take longer to die than they did before. finally, an examination of the date of the first deaths in each country compared to the days until the model began to deviate from its expected behavior yields no relation, indicating that any failure of the observed data points to conform to the model is most likely due to the policy measures taken in each country. policy measures taken in each country. we have seen almost uniform behavior amongst all of the countries in the shape of their growth, but the days of each significant change in shape vary.
this begs the question: what changed? we know that it takes about 15.5 days for a person who gets infected to be recognized as infected and removed from the population, so we can infer that any action taken will have repercussions about 15 days later. however, seeing as this number was recorded when mostly severely affected people were being diagnosed, it can be assumed that diagnosis took less time in the months of july and august. about 183 days after december 31, 2019 (july 1, 2020), the eu began lifting restrictions on nonessential travel for some countries. this caused a direct increase in the number of cases for italy, germany, spain, norway, denmark, the netherlands, france, and, to a much lesser extent, sweden. we can see these effects in the rise in cases in each country from days 190 to 205. the uk announced around day 190 that restaurants, bars, hotels, hairdressers, cinemas and museums would reopen, causing a surge in cases almost immediately. sweden, which imposed a more voluntary lockdown, has experienced a more protracted outbreak than other countries, which we can see in the plateauing of its cases, with the exception of a sudden spike and drop in cases over the span of just a few days. in addition to this, every country on the list launched some phase of reopening in july and august, further compounding the issue caused by opening the borders [7] . this still begs the question: why did cases rise in this scenario but plateau in prior days?
this is likely because countries initially began lifting lockdowns internally during the day 140-day 160 period, launching their initial phases of reopening. these reopenings were largely internal, allowing for mobility inside the country. in germany, smaller shops began to reopen, and the travel ban was lifted for eu member states. italy ended travel restrictions; bars, restaurants and shops, as well as other public venues such as tourist locations, reopened. similar changes happened in essentially every country, with restrictions being lifted in mid may (day 130-day 140). a smaller scale reopening, coupled with falling cases, likely contributed to a smaller growth of the virus, causing a plateau, whereas larger scale reopenings in later months almost uniformly caused a significant rise in cases. spain in particular has been hit hard, with many cases in aragon [2] , the hardest-hit region, due to infections among itinerant seasonal fruit pickers. nonetheless, while cases continue to increase, only about 3 percent of cases need hospital treatment, and the mortality rate is as low as 0.3 percent in some areas. this provides a cause for hope, but also indicates an inaccuracy in the covid model, both in its measurement of the fraction of cases that result in deaths and in the time it takes for an infected person to die. a new model to chart the post-lockdown case growth. we then sought to chart the new growth of covid-19 using the same equations as before. using the times indicated by our analysis, we charted the new wave of cases. we found that the time before a person removes him/herself from the population (1/γ) increases dramatically, while the infectivity rate α generally decreases. however, we notice no upward trend in deaths as of yet, and any increase in deaths is slow and not pronounced, suggesting that while cases are indeed on the rise, the death rate has stabilized. 
this can be linked to younger people getting the virus, as older patients are more susceptible to it [1] . an appropriate residual graph analysis shows that the residuals for cases have mostly dropped, with some exceptions. we cannot perform a p-value analysis due to the tightness of the graphs, in addition to the flatter shape of the incline and the odd rises and falls caused by the peculiarities of each country's reopening format. we plot the best-case scenario for each of these countries, as we see a general easing of the growth and a parabolic curve in each plot. we were unable to do so for sweden because we did not observe a sharp rise in cases akin to the initial growth. the assumption thus is that the trend will continue downwards as predicted, which will likely require a continuation of the current reopening phase without the introduction of more radical changes that may shift the data. studying the parameters, we observe that not only has the value for gamma significantly decreased for all of the countries, the value for alpha has too. in the netherlands, the value for gamma drops from 0.05676 to 0.02647, in denmark from 0.06 to 0.028, in norway from 0.08 to 0.0355, in the uk from 0.0268 to 0.00716, in germany from 0.0676 to 0.023, in italy from 0.0556 to 0.0222, in canada from 0.0267 to 0.008, in france from 0.06657 to 0.02, in the usa from 0.1676 to 0.00972, and in spain from 0.689 to 0.0178. 
similarly, in the netherlands, the value for alpha drops from 9.07e-5 to 2.539e-5, in denmark from 0.000329 to 6.205e-5, in norway from 0.0005448 to 0.0003996, in the uk from 3.62e-5 to 8.528e-6, in germany from 2.308e-5 to 2.472e-5 (increasing slightly in this scenario), in italy from 2.588e-5 to 3.45e-5 (also increasing slightly here), in canada from 7.94e-5 to 2.858e-5, in france from 2.889e-5 to 3.327e-6, in the usa from 1.04e-5 to 2.518e-6, and in spain from 2.348e-5 to 4.141e-6. a decrease in gamma indicates that each infected person takes more time to remove himself/herself from the population, suggesting that the symptoms and harm are significantly lessened. a lower alpha value indicates a lower transmission rate, which aligns with the prediction that younger people are getting the virus. younger people have less of a history of preexisting conditions and are generally stronger and in better shape to fight the virus. if younger people are stronger in general and show fewer symptoms, it follows naturally that they would take longer to go to the hospital. taking x2 < 5 to be the day that the pandemic ends, we conclude that it will end in the netherlands on day 549 (july 2, 2021), in denmark on day 505 (may 19, 2021), in norway on day 383 (january 17, 2021), in the uk on day 804 (march 14, 2022, if we take 50 as our parameter), in germany on day 384 (january 18, 2021), in italy on day 580 (august 2, 2021), in canada on day 712 (december 12, 2021, if we take 50 as our parameter), in france on day 750 (january 19, 2022), in the usa on day 900 (june 18, 2022, if we take 100 as our parameter), and in spain on day 774 (february 12, 2022). while this is the expected fit, we hope for a vaccine to be found, or for the development of herd immunity, which may reduce the time for the virus to be removed from the population. 
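the pandemic-end criterion above (the day the fitted infected count x2 drops below 5) can be sketched numerically. the paper's fits were done in r; the following python sketch is a stand-in, and the sir right-hand side, initial conditions, and horizon here are illustrative assumptions, not the authors' actual code:

```python
import numpy as np
from scipy.integrate import solve_ivp

def sir_rhs(t, y, alpha, gamma):
    """minimal sir right-hand side in absolute counts."""
    s, i = y
    new_infections = alpha * s * i
    return [-new_infections, new_infections - gamma * i]

def pandemic_end_day(alpha, gamma, s0, i0, t0, threshold=5.0, horizon=2000):
    """first day after t0 on which the infected count falls below threshold."""
    sol = solve_ivp(sir_rhs, (t0, t0 + horizon), [s0, i0],
                    args=(alpha, gamma), dense_output=True, max_step=1.0)
    days = np.arange(t0 + 1, t0 + horizon)
    infected = sol.sol(days)[1]
    below = np.where(infected < threshold)[0]
    return int(days[below[0]]) if below.size else None
```

for example, calling `pandemic_end_day(2.539e-5, 0.02647, 2000.0, 100.0, 190)` with the netherlands' post-reopening α and γ from the text (and hypothetical initial conditions) yields an end day a few hundred days after reopening, qualitatively matching the day-549 estimate above.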
this paper thus aims to provide a general shape for the future of the covid-19 virus, and an examination of which policy measures resulted in a growth in cases. in almost every european country, we see the case model begin to fail at around day 150, at which point the cases stagnate, then increase starting at around day 190. this increase in cases is nowhere close to reaching the peaks of the initial outbreak, even though we see a sudden increase in the case count. this owes to the deceptive nature of space in the log scale, where a sudden increase from 400 to 900 appears more dramatic than an increase in the thousands. spain, however, is dealing with another major outbreak, with cases approaching those of its first peak. the model also largely underestimates deaths and the continued persistence of deaths after day 160. it likewise overestimates the lethality and underestimates the time to die in the later months, after around june. the deviation of the observed data points from the expectations set by the model can be understood to be due to its inability to account for reopening, since it assumes that each country would remain in a state of total lockdown until deaths go to 0. it also fails to account for the outbreak amongst younger people, and thus underestimates the survival rate. nonetheless, before the lifting of lockdown, the model fits the data quite well. we attempted to extrapolate the results of this model to asian countries, but many countries there are still fighting the virus and seeing a growth in cases. others observed a small peak around april, and are suddenly seeing a dramatic resurgence in august, akin to the peaks of the european countries in april. while the model could be adapted to the cases in august, an analysis of the data would not be appropriate until well after cases have subsided. 
nonetheless, this paper aims to change the notions of the general public as to the situation of the covid-19 virus: contrary to popular opinion, it is still at large, and affecting a younger crowd. it also aims to provide an answer as to which forms of reopening are still acceptable for reducing net cases, to which mostly internal reopenings are suggested. it finally looks to model the new growth of covid-19 in the 9 european countries, in addition to two new north american countries. disclosure statement. the author has no conflicts of interest to declare. declaration regarding data and software. the data used in this paper were all derived from public sources. links to these data are included in the paper. the main data source is taken from https://ourworldindata.org/coronavirus-source-data. the r code used to analyze the data, along with all data files, will be provided on request -email: jaypatwardhan3@gmail.com. observed data (blue dots) for the number of cases per day (x2(t)) and fits (solid lines) obtained by solving (3.1) and (3.2) using the ode solver ode in r. the mean values of the parameters obtained (inset) are from the solid black line and the error bars are from the two red lines. the next plot is the residual plot for the cases, found by subtracting the observed value from the fit value, with an error estimation taken from the values of the fits in the error bars. we can observe the error getting smaller as the expected data from the fits fall below the 100 case mark. the final plot is a measure of the p value for each observed data point, given a mean of the middle fit and a standard deviation calculated from the values of the error bars. the horizontal blue line indicates an α = 0.05 level of significance, leading us to reject any values below that line as expected. 
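the per-point p-value construction described in the caption can be sketched as follows. treating half the error-bar spread as one standard deviation is my assumption, and `pointwise_p_values` is a hypothetical helper, not the paper's actual r code:

```python
import numpy as np
from scipy.stats import norm

def pointwise_p_values(observed, fit_mid, fit_lo, fit_hi):
    """two-sided normal p-value per observation: mean from the middle
    fit, sd estimated from the spread between the error-bar fits."""
    sd = np.maximum((np.asarray(fit_hi) - np.asarray(fit_lo)) / 2.0, 1e-9)
    z = (np.asarray(observed) - np.asarray(fit_mid)) / sd
    return 2.0 * norm.sf(np.abs(z))
```

points whose p-value falls below the horizontal α = 0.05 line would then be flagged as deviating from the fit, mirroring the caption's rejection criterion.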
this is replicated for all 9 european countries. observed data (red dots) for the number of deaths per day (x4(t)) and fits (solid lines) obtained by solving (3.1) and (3.2) using the ode solver ode in r. the mean values of the parameters obtained (inset) are from the solid black line and the error bars are from the two red lines. the next plot is the residual plot for the deaths, found by subtracting the observed value from the fit value, with an error estimation taken from the values of the fits in the error bars. we can observe the error getting smaller as the expected data from the fits fall below the 100 case mark (unless the peak is really small, in which case the error gets smaller as the expected data falls below the 10 case mark). the final plot is a measure of the p value for each observed data point given a mean of the middle fit and a standard deviation calculated from the values of the error bars. the horizontal blue line indicates an α = 0.05 level of significance, leading us to reject any values below that line as expected. this is replicated for all 9 european countries. figure 3: observed data (blue dots) for the number of cases per day (x2(t)) and fits (solid lines) obtained by solving (3.1) and (3.2) using the ode solver ode in r. the mean values of the parameters obtained (inset) are from the solid black line and the error bars are from the two red lines. this graph contains the model fits for after reopening starts. data was obtained by ending the first curve at a certain date (found using the residual and p plot analyses) and starting the second curve around that date. 
figure 3. new plot of cases after the lockdown was lifted. 
references:
incidence, clinical characteristics and prognostic factor of patients with covid-19: a systematic review and meta-analysis
coronavirus: why spain is seeing second wave
coronavirus disease 2019 (covid-19) in the eu/eea and the uk - eleventh update: resurgence of cases
predictions for europe for the covid-19 pandemic from a sir model
update: severe acute respiratory syndrome - worldwide and united states
coronavirus: how lockdown is being lifted across europe
genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding
a contribution to the mathematical theory of epidemics
key: cord-307340-00m2g55u authors: gerasimov, a.; lebedev, g.; lebedev, m.; semenycheva, i. title: reaching collective immunity for covid-19: an estimate with a heterogeneous model based on the data for italy date: 2020-05-25 journal: nan doi: 10.1101/2020.05.24.20112045 sha: doc_id: 307340 cord_uid: 00m2g55u background. at the current stage of the covid-19 pandemic, forecasts become particularly important regarding the possibility that the total incidence could reach the level where the disease stops spreading because a considerable portion of the population has become immune and collective immunity has been reached. such forecasts are valuable because the currently undertaken restrictive measures prevent mass morbidity but do not result in the development of robust collective immunity. thus, in the absence of efficient vaccines and medical treatments, lifting restrictive measures carries the risk that a second wave of the epidemic could occur. methods. we developed a heterogeneous model of covid-19 dynamics. the model accounted for the differences in infection risk across subpopulations, particularly the age-dependent susceptibility to the disease. 
based on this model, an equation for the minimal number of infections was derived as a condition for the epidemic to start declining. a basic reproductive number of 2.5 was used for the disease spread without restrictions. the model was applied to covid-19 data from italy. findings. we found that the heterogeneous model of epidemic dynamics yielded a lower proportion, compared to a homogeneous model, for the minimal incidence needed for the epidemic to stop. when applied to the data for italy, the model yielded a more optimistic assessment of the minimum total incidence needed to reach collective immunity: 43% versus the 60% estimated with a homogeneous model. interpretation. because of the high heterogeneity of covid-19 infection risk across the different age groups, with a higher susceptibility for the elderly, homogeneous models overestimate the level of collective immunity needed for the disease to stop spreading. this inaccuracy can be corrected by the heterogeneous model introduced here. to improve the estimate even further, additional factors should be considered that contribute to heterogeneity, including social and professional activity, gender and individual resistance to the pathogen. 
most countries in the world are currently severely affected by the covid-19 epidemic 1 . the undertaken anti-epidemic restrictions have been effective, but there remains the risk of a second wave of the epidemic after the restrictions are lifted and people return to their normal way of life. indeed, potential vaccines are only being explored 2-4 , so mass vaccination of the population will not start any time soon 5 . in these conditions, epidemic dynamics depend on the basic reproductive rate (which can be lowered by anti-epidemic measures) and the numbers of susceptible, exposed, infectious, and recovered cases 6, 7 . as the number of infections increases (slowly, under restrictive measures), a population immunity state can be reached where repeated introductions of infection only lead to broken chains of successive infections, not mass infections [8] [9] [10] . it is important to be able to forecast such an event when deciding to lift the anti-epidemic restrictions 11, 12 . here we developed a mathematical model for assessing the minimum incidence of covid-19 needed to reach collective immunity, which would assure that the epidemic cannot restart after the cessation of quarantine measures. 
the key feature of our model is that it is heterogeneous, that is, it accounts for the differences in infection risk across affected subpopulations. this is important for covid-19 because the severity of this disease strongly depends on age 1, 13, 14 . as such, our model offers more precise estimates compared to the commonly used homogeneous models. we conducted a search of medrxiv, pubmed, and biorxiv for articles published in english from inception to may 9, 2020, with the keywords "covid-19", "sars-cov-2", "reproduction number", "r0", "age", "homogeneous model", and "heterogeneous model". while this search yielded several useful references regarding covid-19 modeling, the basic reproduction number of this disease, and age-related heterogeneity, we did not find an approach to modeling covid-19 dynamics and estimating the total incidence and population immunity similar to ours. therefore, our study provides a novel model that estimates these metrics accurately and with a minimal number of model parameters. our report presents the results obtained with a heterogeneous model, which is different from the commonly used homogeneous models of epidemic dynamics. this is especially important for covid-19, whose risks are strikingly age-dependent. with these modelling results, we contribute to the pressing issue of lifting anti-epidemic restrictions, a measure that could result in a second wave of the epidemic because of an insufficient level of collective immunity. in addition to assessing the minimum collective immunity level, our report provides a framework for incorporating additional sources of heterogeneity and for monitoring the epidemic dynamics for different subpopulations. as such, our model is an important tool for making crucial decisions that could eventually save human lives. 
(this preprint, which was not certified by peer review, is made available under a cc-by 4.0 international license; the author/funder has granted medrxiv a license to display the preprint in perpetuity; this version was posted may 25, 2020; https://doi.org/10.1101/2020.05.24.20112045) criteria for lifting restrictions (when and how) are critical for the overall success of anti-epidemic measures. our contribution to this issue is the assessment of the level of collective immunity that assures that the disease would not restart when the restrictions are lifted. the assessment was made using a heterogeneous model that accounted for the age-related differences in covid-19 infection risk. future research should consider additional sources of heterogeneity, such as heterogeneity associated with social and professional activity, gender and individual resistance to the pathogen. we reasoned about our model in the following way. the simplest kermack-mckendrick susceptible-infected-recovered (sir) model of an epidemic 15 assumes that each member of the population can be in one of three states: susceptible, infected, recovered (or immune), and the probability of transition between the states is the same for all members of the population. the epidemic dynamics are then described by the differential equations ds/dt = −r0 β s i, di/dt = r0 β s i − β i, (1) where s and i are the susceptible and infected subpopulations, respectively. the term r0 is the basic reproductive number, that is, the average number of people infected by one infected person in a population with 100% susceptibility. furthermore, β i is the recovery flow of infected people, and 1/β is the average duration of the disease. this representation can be further improved by the susceptible-exposed-infectious-recovered (seir) model, which accounts for the fact that there is a period after being infected during which a person does not infect others 6,7. 
additionally, not only the contagiousness but also the probability of recovery (or death) depends on the time elapsed since infection. to introduce these factors, let j(t) be the intensity of the emergence of new cases, j = r0 β s i; the system dynamics can then be rewritten in terms of j(t). anti-epidemic activities like social isolation can be factored in by considering r0 to be time-dependent rather than constant. then, morbidity decreases if the recovery flow is higher than the infection flow, that is, if r0 s < 1. taking r0 = 2.5 as an estimate for the basic reproductive number, the susceptible subpopulation, s, should be less than 40% for the epidemic to decline. in other words, 60% of the population should contract the disease. this assessment, however, overestimates the expected morbidity because it does not take into account the presence of low- and high-risk groups. let h be an individual risk of infection, normalized so that the average value of h for the population is 1. the condition for the epidemic decline is then the weighted analogue r0 ⟨h s(h)⟩ < 1, where ⟨·⟩ denotes the population average and s(h) is the susceptible fraction in the group with risk h. here we propose using this heterogeneous-model approach for the assessment of the covid-19 population immunity level. italy is taken as an example. since the risk of covid-19 is highly dependent on age 13,14 , we considered an epidemic model where the individual risk of infection was a function of age. at the initial stage of an epidemic the proportion of infected persons is low. therefore, individual risk, h, is proportional to the number of infections in each age group, and the distribution of h can be estimated from the distribution of morbidity by age. table 1 presents this assessment for the data from italy. we calculated h as proportional to morbidity; the average h for the entire population is 1. 
it can be seen from the last column of the table that people over 80 have a risk 2.5 times higher than the average, while the risk in children and adolescents is 15 times lower than the average. next, for the point where the epidemic starts to decline, we can assess the proportions of immune subpopulations by age, using the model assumptions above. these results are presented in table 2. notably, the distribution is highly nonuniform, and the average proportion of people with immunity is approximately 43% instead of the 60% estimated with the homogeneous model. here we looked into the issue of assessing the level of collective immunity [16] [17] [18] [19] that would be safe for lifting anti-epidemic restrictions. we assumed that the basic reproductive number would be approximately 2.5 without restrictions [20] [21] [22] . our assumptions were based on a heterogeneous model which, unlike the commonly used homogeneous models, accounted for the age-related differences in infection risk. we used the covid-19 data from italy to generate the estimates. the heterogeneous model yielded a minimum level of 43% for collective immunity to stop the epidemic, as opposed to the 60% level predicted by a homogeneous model. this difference is related to the highly uneven morbidity across the age groups, with the elderly being the most affected. the estimated 43% corresponds to all forms of the infectious process, including both symptomatic and asymptomatic cases. given the estimated proportion of manifested cases of 50% of the total, we estimate the minimum proportion of 
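the 43%-versus-60% comparison can be sketched numerically. the age-group fractions and relative risks below are illustrative placeholders (not the paper's table 1 values), and the exponential-depletion closure s(h) = exp(−hλ) is my assumption about how group-level immunity accrues with risk:

```python
import numpy as np

R0 = 2.5

# homogeneous threshold: epidemic declines once s < 1/R0,
# so the immune share must reach 1 - 1/R0 = 60%
homog_immune = 1.0 - 1.0 / R0

# hypothetical age groups: population fractions f and relative risks h
f = np.array([0.2, 0.5, 0.3])
h = np.array([0.2, 0.9, 1.7])
h = h / np.sum(f * h)  # normalize so the population-average risk is 1

def r_eff(lam):
    """effective reproduction number once cumulative force of infection
    lam has depleted each group's susceptibles as exp(-h * lam)."""
    s = np.exp(-h * lam)
    return R0 * np.sum(f * h * s)

# bisect for the lam where r_eff crosses 1 (the epidemic turning point)
lo, hi = 0.0, 20.0
for _ in range(80):
    mid = 0.5 * (lo + hi)
    if r_eff(mid) > 1.0:
        lo = mid
    else:
        hi = mid
lam_star = 0.5 * (lo + hi)

# total immune share at the turning point
hetero_immune = float(np.sum(f * (1.0 - np.exp(-h * lam_star))))
```

because the high-risk groups acquire immunity first, any nondegenerate spread in h pushes the heterogeneous threshold below the homogeneous 60%, which is the mechanism behind the paper's 43% estimate for italy's age profile.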
italians who have to suffer coronavirus infection before the nation can return to the normal lifestyle as 20%. for a better assessment of the epidemic dynamics, additional sources of heterogeneity should be considered 23,24 , including heterogeneity by social and professional activity, heterogeneity associated with sex, and heterogeneity associated with individual resistance to the pathogen. these sources of heterogeneity could lower the minimum collective immunity even further; for example, for italy this number could be lower than 20%.
references:
world health organization. coronavirus disease 2019 (covid-19): situation report
immune responses in covid-19 and potential vaccines: lessons learned from sars and mers epidemic
preliminary identification of potential vaccine targets for the covid-19 coronavirus (sars-cov-2) based on sars-cov immunological studies
the covid-19 vaccine development landscape
developing covid-19 vaccines at pandemic speed
global dynamics of an seir epidemic model with vertical transmission
global dynamics of an seir epidemic model with saturating contact rate
herd immunity-estimating the level required to halt the covid-19 epidemics in affected countries
"herd immunity": a rough guide
herd immunity and herd effect: new insights and definitions
predicting the ultimate outcome of the covid-19 outbreak in italy
a phased lift of control: a practical strategy to achieve herd immunity against covid-19 at the country level
clinical features of covid-19 in elderly patients: a comparison with young and middle-aged patients
are children less susceptible to covid-19
a contribution to the mathematical theory of epidemics
sequential lifting of covid-19 interventions with population heterogeneity
restarting the economy while saving lives under covid-19
comparison of different exit scenarios from the lock-down for covid-19 epidemic in the uk and assessing uncertainty of the predictions
heterogeneous social interactions and the covid-19 lockdown outcome in a multi-group seir model
early dynamics of transmission and control of covid-19: a mathematical modelling study. the lancet infectious diseases
covid-19 r0: magic number or conundrum?
key: cord-241596-vh90s8vi authors: libotte, gustavo barbosa; lobato, fran sérgio; platt, gustavo mendes; neto, antonio josé da silva title: determination of an optimal control strategy for vaccine administration in covid-19 pandemic treatment date: 2020-04-15 journal: nan doi: nan sha: doc_id: 241596 cord_uid: vh90s8vi for decades, mathematical models have been used to predict the behavior of physical and biologic systems, and to define strategies aiming at the minimization of the effects of different types of diseases. nowadays, the development of mathematical models to simulate the dynamic behavior of the novel coronavirus disease (covid-19) is considered an important theme due to the quantity of infected people worldwide. in this work, the aim is to determine an optimal control strategy for vaccine administration in covid-19 pandemic treatment considering real data from china. for this purpose, an inverse problem is formulated and solved in order to determine the parameters of the compartmental sir (susceptible-infectious-recovered) model. to solve this inverse problem, the differential evolution (de) algorithm is employed. after this step, two optimal control problems (mono- and multi-objective) are proposed to determine the optimal strategy for vaccine administration in covid-19 pandemic treatment. the first consists of minimizing the quantity of infected individuals during the treatment. the second considers minimizing, together, the quantity of infected individuals and the prescribed vaccine concentration during the treatment, i.e., a multi-objective optimal control problem. the solutions of these optimal control problems are obtained using the de and multi-objective differential evolution (mode) algorithms, respectively. 
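the inverse-problem step (fitting sir parameters to case data with differential evolution) can be sketched as follows. the synthetic "observed" data, the parameter bounds, and the least-squares misfit are illustrative assumptions, not the authors' actual setup or data from china:

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import differential_evolution

def sir_rhs(t, y, beta, gamma):
    """normalized sir model: s + i + r = 1."""
    s, i, r = y
    return [-beta * s * i, beta * s * i - gamma * i, gamma * i]

# synthetic observations generated from known parameters plus 2% noise,
# standing in for the real epidemic data
rng = np.random.default_rng(0)
true_beta, true_gamma = 0.4, 0.1
t_obs = np.linspace(0.0, 60.0, 31)
y0 = [0.99, 0.01, 0.0]
truth = solve_ivp(sir_rhs, (0.0, 60.0), y0, t_eval=t_obs,
                  args=(true_beta, true_gamma)).y[1]
data = truth * (1.0 + 0.02 * rng.standard_normal(truth.size))

def misfit(p):
    """sum-of-squares mismatch between simulated and observed infected."""
    beta, gamma = p
    sim = solve_ivp(sir_rhs, (0.0, 60.0), y0, t_eval=t_obs,
                    args=(beta, gamma)).y[1]
    return float(np.sum((sim - data) ** 2))

res = differential_evolution(misfit, bounds=[(0.05, 1.0), (0.01, 0.5)],
                             seed=1, maxiter=200)
beta_hat, gamma_hat = res.x
```

with clean, low-noise data as here, de recovers the generating parameters closely; on real case counts the misfit landscape is rougher, which is one motivation for a global, population-based optimizer over gradient methods.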
the results regarding the proposed multi-objective optimal control problem provide a set of evidence from which an optimal strategy for vaccine administration can be chosen, according to a given criterion. in the last decades, countless mathematical models used to evaluate the spread and control of infectious diseases have been proposed. these models are very important in different fields, such as policy making, emergency planning and risk assessment, and the definition of control programs, promoting the improvement of various health-economic aspects (al-sheikh, 2013). in general, such models aim to describe a state of infection (susceptible and infected) and a process of infection (the transition between these states) by using compartmental relations, i.e., the population is divided into compartments under assumptions about the nature and time rate of transfer from one compartment to another (trawicki, 2017; blackwood and childs, 2018). one can cite several studies using models for measles vaccination (bauch et al., 2009; widyaningsih et al., 2018), hiv/aids (mukandavire et al., 2009), tuberculosis (bowong and kurths, 2010), dengue (weiss, 2013), pertussis epidemiology (pesco et al., 2014), alzheimer's disease (ebrahimighahnavieh et al., 2020), among others. recently, the world has been experiencing the dissemination of a new coronavirus, which causes the disease referred to as covid-19 (coronavirus disease 2019). covid-19 is an infectious disease that emerged in china in december 2019 and has rapidly spread to many other countries worldwide (gorbalenya et al., 2020; world health organization, 2020, accessed april 8, 2020). the common symptoms are severe respiratory illness, fever, cough, and myalgia or fatigue, especially at the onset of illness (huang et al., 2020). transmission may happen person-to-person, through direct contact or droplets (chan et al., 2020; li et al., 2020; riou and althaus, 2020). 
since the covid-19 outbreak in wuhan city in december of 2019, various computational model-based predictions have been proposed and studied. lin et al. (2020) proposed a susceptible-exposed-infectious-removed (seir) model for the covid-19 outbreak in wuhan. these authors considered some essential elements including individual behavioral response, governmental actions, zoonotic transmission and emigration of a large proportion of the population in a short time period. benvenuto et al. (2020) proposed the auto regressive integrated moving average (arima) model to predict the spread, prevalence and incidence of covid-2019. roda et al. (2020) used a susceptible-infectious-removed (sir) model to predict the covid-19 epidemic in wuhan after the lockdown and quarantine. in this study, these authors demonstrate that non-identifiability in model calibrations using the confirmed-case data is the main reason for such wide variations in predictions. prem et al. (2020) proposed a seir model to simulate the spread of covid-19 in wuhan city. in this model, all demographic changes in the population (births, deaths and ageing) were ignored. the simulations showed that control measures aimed at reducing social mixing in the population can be effective in reducing the magnitude and delaying the peak of the covid-19 outbreak. in order to evaluate the global stability and equilibrium points of these models, li and muldowney (1995) studied a seir model with nonlinear incidence rates in epidemiology, in terms of global stability of the endemic equilibrium. al-sheikh (2013) evaluated a seir epidemic model with a limited resource for treating infected people. for this purpose, the existence and stability of the disease-free and endemic equilibria were investigated. li and cui (2013) studied a seir model with a vaccination strategy that incorporates distinct incidence rates for the exposed and infected populations. these authors proved global asymptotic stability results for the disease-free equilibrium. singh et al.
(2017) developed a simple and effective mathematical model for the transmission of infectious diseases by taking into consideration human immunity. this model was evaluated in terms of local stability of both the disease-free equilibrium and the disease endemic equilibrium. widyaningsih et al. (2018) proposed a seir model with immigration and determined the system equilibrium conditions. kim et al. (2019) developed a coxian-distributed seir model considering an empirical incubation period, and a stability analysis was also performed. in order to reduce the spread of covid-19 worldwide, various procedures have been adopted. as mentioned by zhai et al. (2020) and wei et al. (2020), quarantine and isolation (social distancing) can effectively reduce the spread of covid-19. in addition, wearing masks, washing hands and disinfecting surfaces contribute to reducing the risk of infection. according to the u.s. food and drug administration, there are no specific therapies for covid-19 treatment. however, treatments including antiviral agents, chloroquine and hydroxychloroquine, corticosteroids, antibodies, convalescent plasma transfusion and radiotherapy are being studied. as an alternative to these treatments, the use of drug administration (vaccine) arises as an interesting option to face this pandemic. it must be emphasized that there is currently no vaccine for covid-19, but there is a huge effort to develop a vaccine in record time, which justifies the present study (lurie et al., 2020). mathematically, the determination of an optimal protocol for vaccine administration characterizes an optimal control problem (ocp). this particular optimization problem consists of determining control variable profiles that minimize (or maximize) a given performance index (bryson and ho, 1975; biegler et al., 2002).
in order to solve this problem, several numerical methods have been proposed (bryson and ho, 1975; feehery and barton, 1996; lobato, 2004; lobato et al., 2016). these methods are classified according to three broad categories: direct optimization methods, pontryagin's maximum principle (pmp) based methods and hamilton-jacobi-bellman (hjb) based methods. the direct approach is the most traditional strategy considered to solve an ocp, due to its simplicity. in this approach, the original problem is transformed into a finite dimensional optimization problem through the parametrization of the control variables, or of both the control and state variables (feehery and barton, 1996). from an epidemiological point of view, neilan and lenhart (2010) proposed an optimal control problem to determine a vaccination strategy over a specific period of time so as to minimize a cost function. in this work, the propagation of a disease is controlled by a limited number of vaccines, while minimizing a percentage of the overall number of people dead by infection, and a cost associated with vaccination. biswas et al. (2014) studied different mathematical formulations for an optimal control problem considering a susceptible-exposed-infectious-recovered model. for this purpose, these authors evaluated the solution of such problems when mixed state-control constraints are used to impose upper bounds on the available vaccines at each instant of time. in addition, the possibility of imposing upper bounds on the number of susceptible individuals, with and without limitations on the number of vaccines available, was analyzed. the optimal control theory was applied to obtain optimal vaccination schedules and control strategies for the epidemic model of human infectious diseases. in the present work, the objective is to determine an optimal control strategy for vaccine administration in covid-19 pandemic treatment considering real data from china.
in order to determine the parameters that characterize the proposed mathematical model (based on the compartmental sir model), an inverse problem is formulated and solved considering the differential evolution (de) algorithm (storn and price, 1997; price et al., 2005). after this step, two optimal control problems (mono- and multi-objective) used to determine the optimal strategy for vaccine administration in covid-19 pandemic treatment are proposed. the mono-objective optimal control problem considers minimizing the quantity of infected individuals during the treatment. on the other hand, the multi-objective optimal control problem considers minimizing together the quantity of infected individuals and the prescribed vaccine concentration during the treatment. to solve each problem, the de and multi-objective differential evolution (mode) algorithms (lobato and steffen, 2011) are employed, respectively. this work is organized as follows. section 2 presents the description of the mathematical model considered to represent the evolution of the covid-19 pandemic. in section 3, the general aspects regarding the formulation and solution of an ocp are presented. a brief review on de and its extension to deal with multi-criteria optimization is presented in section 4. in section 5, the proposed methodology is presented and discussed. the results obtained using such methodology are presented in section 6. finally, the conclusions are outlined in section 7. in the specialized literature, various compartmental models used to represent the evolution of an epidemic can be found (forgoston and schwartz, 2013; pesco et al., 2014; shaman et al., 2014; cooper et al., 2016; azam et al., 2020). the study of these models is very important to understand an epidemic spreading mechanism and, consequently, to investigate the transmission dynamics in a population (forgoston and schwartz, 2013).
as mentioned by keeling and rohani (2007), these compartmental models can be divided into two groups: i) population-based models and ii) agent-based or individual-based models. in turn, the first group can be subdivided into deterministic or stochastic models, formulated either in continuous time (ordinary differential equations, partial differential equations, delay differential equations or integro-differential equations) or in discrete time (represented by difference equations). the second class can be subdivided into usually stochastic and usually discrete-time models. in the context of population-based models, deterministic modeling can be represented, in general, by the interaction among susceptible (denoted by s: an individual who is not infected by the disease pathogen), exposed (denoted by e: an individual in the incubation period after being infected by the disease pathogen, with no visible clinical signs), infected/infectious (denoted by i: an individual who can infect others) and recovered individuals (denoted by r: an individual who survived after being infected but is no longer infectious and has developed a natural immunity to the disease pathogen). considering a population of size n, and based on the disease nature and on the spreading pattern, the compartmental models can be represented as (keeling and rohani, 2007; hethcote, 2000):
• susceptible-infected (si): population described by groups of susceptible and infected;
• susceptible-infected-recovered (sir): population described by groups of susceptible, infected and recovered;
• susceptible-infectious-susceptible (sis): population also described by groups of susceptible and infected. in this particular case, recovering from some pathologies does not guarantee lasting immunity; thus, individuals may become susceptible again;
• susceptible-exposed-infectious-recovered (seir): population described by groups of susceptible, exposed, infected and recovered.
it is important to mention that, in all these models, terms associated with birth, mortality and vaccination rates can be added. in addition, according to keeling and rohani (2007) and hethcote (2000), these models can include: i) time-dependent parameters to represent the effects of seasonality; ii) additional compartments to model vaccinated and asymptomatic individuals, and different stages of disease progression; iii) multiple groups to model heterogeneity, age, spatial structure or host species; iv) human demographic parameters, for diseases where the time frame of the disease dynamics is comparable to that of human demographics. human demographics can be modeled by adopting a constant immigration rate, constant per capita birth and death rates, a density-dependent death rate or a disease-induced death rate. thus, the final model depends on the assumptions taken during the formulation of the problem. in this work, the sir model is adopted in order to describe the dynamic behavior of the covid-19 epidemic in china. the choice of this model is due to the study conducted by roda et al. (2020). these authors demonstrated that the sir model performs more adequately than the seir model in representing the information related to confirmed-case data. for this reason, the sir model is adopted here. the schematic representation of this model is presented in fig. 1. mathematically, this model has the following characteristics:
• an individual is susceptible to an infection and the disease can be transmitted from any infected individual to any susceptible individual. the evolution of the susceptible population is given by the following relation: ds/dt = −β s i − µ s, s(0) = s 0 , where t is the time, and β and µ represent the probability of transmission by contact and the per capita removal rate, respectively. in turn, s 0 is the initial condition for the susceptible population.
• any infected individual may transmit the disease to a susceptible one according to the following relation: di/dt = β s i − γ i − µ i, i(0) = i 0 , where γ denotes the per capita recovery rate and i 0 is the initial condition for the infected population.
• once an individual has been moved from infected to recovered, it is assumed that it is not possible to be infected again. this condition is described by: dr/dt = γ i − µ r, r(0) = r 0 , where r 0 is the initial condition for the recovered population.
it is important to emphasize that the population size (n) along time t is defined as n(t) = s(t) + i(t) + r(t). in practice, the model parameters must be determined to represent a particular epidemic. for this purpose, it is necessary to formulate and solve an inverse problem. in the section that describes the methodologies adopted in this work, more details on the formulation and solution of this problem are presented. mathematically, an ocp can be formulated as follows (bryson and ho, 1975; feehery and barton, 1996; lobato, 2004). initially, let the performance index be defined as j = ψ(z(t f )) + ∫ l(z, u, t) dt, evaluated from t 0 to t f , where z is the vector of state variables and u is the vector of control variables. ψ and l are the first and second terms of the performance index, respectively. the minimization problem is given by arg min u j, subject to the state equations dz/dt = g(z, u, t), with consistent initial conditions given by z(t 0 ) = z 0 . according to optimal control theory (bryson and ho, 1975; feehery and barton, 1996), the solution of the ocp defined by eqs. (5) and (6) satisfies the co-state equations and the stationary condition given, respectively, by dλ/dt = −∂h/∂z and ∂h/∂u = 0, where h is the hamiltonian function defined by h = l + λ t g. this system of equations is known as the euler-lagrange equations (optimality conditions), which are characterized as boundary value problems (bvps). thus, to solve this model, an appropriate methodology must be used, as for example the shooting method or the collocation method (bryson and ho, 1975).
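as an illustration, the sir dynamics described above can be integrated numerically with scipy. this is a minimal sketch, not the authors' code: it assumes µ = 0 and a population normalized to 1, uses the fitted values reported later in the text (β = 0.3566, γ = 0.0858, i 0 = 0.0038), and the 200-day horizon is an arbitrary choice for display.

```python
import numpy as np
from scipy.integrate import solve_ivp

# normalized sir right-hand side with mu = 0 (births/deaths neglected)
def sir_rhs(t, y, beta, gamma):
    s, i, r = y
    return [-beta * s * i, beta * s * i - gamma * i, gamma * i]

beta, gamma, i0 = 0.3566, 0.0858, 0.0038  # fitted values reported in table 1
sol = solve_ivp(sir_rhs, (0.0, 200.0), [1 - i0, i0, 0.0],
                args=(beta, gamma), dense_output=True)
s, i, r = sol.y
peak_infected = i.max()  # height of the epidemic peak (fraction of n)
```

since the right-hand sides sum to zero, s + i + r is conserved along the trajectory, which gives a quick consistency check on the integration.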
as mentioned by bryson and ho (1975) and feehery and barton (1996), the main difficulties associated with ocps are the following: the existence of end-point conditions (or region constraints) implies multipliers and associated complementarity conditions that significantly increase the complexity of solving the bvp using an indirect method; the existence of constraints involving the state variables and the application of the slack-variables method may introduce differential-algebraic equations of higher index; and the lagrange multipliers may be very sensitive to the initial conditions. differential evolution, proposed by storn and price (1997), is a powerful optimization technique to solve mono-objective optimization problems. this evolutionary strategy differs from other population-based algorithms in the schemes considered to generate a new candidate solution of the optimization problem (storn and price, 1997; price et al., 2005; lobato and steffen, 2011). the population evolution proposed by de follows three fundamental steps: mutation, crossover and selection. the optimization process starts by creating a vector containing np individuals, called the initial population, which are randomly distributed over the entire search space. during g max generations, each of the individuals that constitute the current population is subject to the procedures performed by the genetic operators of the algorithm. in the first step, the mutation operator creates a trial vector by adding the balanced difference between two individuals to a third member of the population: v^(g+1) = x_r1^(g) + f (x_r2^(g) − x_r3^(g)). the parameter f represents the amplification factor, which controls the contribution added by the vector difference, such that f ∈ [0, 2]. in turn, storn and price (1997) proposed various mutation schemes for the generation of trial vectors (candidate solutions) by combining vectors randomly chosen from the current population, such as the rand/1 scheme adopted in this work. the second step of the algorithm is the crossover procedure.
this genetic operator creates new candidates by combining the attributes of the individuals of the original population with those resulting from the mutation step, producing the vector u^(g+1) with components u_jk^(g+1) = v_jk^(g+1) if randb(k) ≤ cr or k = rnbr(j), and u_jk^(g+1) = x_jk^(g) otherwise, for k = 1, . . . , d, where d denotes the dimension of the problem and randb(k) ∈ [0, 1] is a random real number with uniform distribution. the choice of the attributes of a given individual is defined by the crossover coefficient, represented by cr, such that cr ∈ [0, 1] is a constant parameter defined by the user. in turn, rnbr(j) ∈ [1, d] is a randomly chosen index. after the generation of the trial vector by the steps of mutation and crossover, the evolution of the best individuals is defined according to a greedy strategy, during the selection step. price et al. (2005) have defined some simple rules for choosing the key parameters of de for general applications. typically, one might choose np in the range from 5 to 10 times the dimension (d) of the problem. in the case of f, it is suggested to take a value between 0.4 and 1.0. initially, f = 0.5 may be a good choice. in the case of premature convergence, f and np might be increased. the multi-objective optimization problem (mop) is an extension of the mono-objective optimization problem. due to the conflict between the objectives, there is no single point capable of optimizing all functions simultaneously. instead, the best solutions that can be obtained are called optimal pareto solutions, which form the pareto curve (deb, 2001). the notion of optimality in a mop is different from the one regarding optimization problems with a single objective. the most common idea about multi-objective optimization found in the literature was originally proposed by edgeworth (1881) and further generalized by pareto (1896). one solution is said to dominate another if it is not worse in any of the objectives and is strictly better in at least one of them.
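the mutation, crossover and selection steps just described can be sketched as a generic rand/1/bin de loop on a toy quadratic. this is not the authors' implementation; the names f_amp (the amplification factor f), cr and np_pop (the population size np) are illustrative, and for simplicity the random indices are not forced to differ from the target index j.

```python
import numpy as np

rng = np.random.default_rng(0)

def de_minimize(f, bounds, np_pop=25, f_amp=0.5, cr=0.8, gmax=100):
    d = len(bounds)
    lo, hi = np.array(bounds, dtype=float).T
    pop = lo + rng.random((np_pop, d)) * (hi - lo)  # initial population
    cost = np.array([f(x) for x in pop])
    for _ in range(gmax):
        for j in range(np_pop):
            r1, r2, r3 = rng.choice(np_pop, 3, replace=False)
            # mutation (rand/1): v = x_r1 + F * (x_r2 - x_r3)
            v = np.clip(pop[r1] + f_amp * (pop[r2] - pop[r3]), lo, hi)
            # binomial crossover with one guaranteed mutant component (rnbr)
            mask = rng.random(d) <= cr
            mask[rng.integers(d)] = True
            u = np.where(mask, v, pop[j])
            # greedy selection: keep the trial vector only if it improves
            cu = f(u)
            if cu <= cost[j]:
                pop[j], cost[j] = u, cu
    best = cost.argmin()
    return pop[best], cost[best]

x, fx = de_minimize(lambda x: np.sum((x - 0.3) ** 2), [(-1, 1)] * 2)
```

with the rule-of-thumb settings from the text (f = 0.5, cr = 0.8), the loop converges quickly on this smooth two-dimensional test function.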
since no feasible point in the search space dominates an optimal pareto solution, all of these solutions are considered better than any dominated point. therefore, multi-objective optimization consists of finding a set of points that represents the best balance in relation to minimizing all objectives simultaneously, that is, a collection of solutions that relates the objectives, which are in conflict with each other in most cases. let f(x) = (f 1 (x), . . . , f m (x))^t be the objective vector such that f k : p → r, for k = 1, . . . , m, where x ∈ p is called the decision vector and its entries are called decision variables. mathematically, a mop is defined as arg min x∈p f(x) (deb, 2001; lobato, 2008). due to the favorable outcome of de in solving mono-objective optimization problems in different fields of science and engineering, lobato and steffen (2011) proposed the multi-objective differential evolution (mode) algorithm to solve multi-objective optimization problems. basically, this evolutionary strategy differs from other algorithms by the incorporation of two operators into the original de algorithm: the mechanisms of rank ordering (deb, 2001; zitzler and thiele, 1999), and exploration of the neighborhood of potential solution candidates (hu et al., 2005). a brief description of the algorithm is presented next. at first, an initial population of size np is randomly generated, and all objectives are evaluated. all dominated solutions are removed from the population by using the fast non-dominated sorting operator (deb, 2001). this procedure is repeated until each candidate vector becomes a member of a front. three parents are selected at random in the population, as in the de algorithm. then, an offspring is generated from these parents (this process continues until np children are generated). starting from the population p 1 of size 2np, neighbors are generated for each one of the individuals of the population.
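the dominance relation just defined, and the extraction of a non-dominated set, can be written compactly. this is a minimal sketch assuming minimization of all objectives; it uses a brute-force pairwise comparison rather than deb's fast non-dominated sorting used in mode.

```python
import numpy as np

def dominates(fa, fb):
    # fa dominates fb: no worse in every objective, strictly better in one
    fa, fb = np.asarray(fa), np.asarray(fb)
    return bool(np.all(fa <= fb) and np.any(fa < fb))

def non_dominated(points):
    # keep every point not dominated by any other point (O(n^2) check)
    pts = np.asarray(points, dtype=float)
    keep = [i for i, p in enumerate(pts)
            if not any(dominates(q, p) for j, q in enumerate(pts) if j != i)]
    return pts[keep]

# toy bi-objective example: (3, 3) is dominated by (2, 2)
front = non_dominated([(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (5.0, 1.0)])
```

the surviving points form the discrete pareto front of the toy example: no remaining point is better than another in both objectives at once.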
these neighbors are classified according to the dominance criterion, and only the non-dominated neighbors (p 2 ) are put together with p 1 in order to form p 3 . the population p 3 is then classified according to the dominance criterion. if the number of individuals of the population p 3 is larger than a predefined number, the population is truncated according to the crowding distance criterion (deb, 2001). this metric describes the density of candidate solutions surrounding an arbitrary vector. a complete description of mode is presented by lobato and steffen (2011). as mentioned earlier, the first objective of this work is to determine the parameters of the sir model adopted to predict the evolution of the covid-19 epidemic considering experimental data from china. in this case, it is necessary to formulate and solve an inverse problem. it arises from the requirement of determining the parameters of theoretical models in such a way that they can be employed to simulate the behavior of the system for different operating conditions. basically, the estimation procedure consists of obtaining the model parameters by the minimization of the difference between calculated and experimental values. in this work, it is assumed that, since the outbreak persists for a relatively short period of time, the rate of births and deaths of the population is insignificant. thus, we take µ = 0, since there are probably few births/deaths in the corresponding period. we are interested in the determination of the following parameters of the sir model: β, γ and i 0 . let f = ∑_{i=1}^{m} (i_i^exp − i_i^sim)^2, where i_i^exp and i_i^sim are the experimental and simulated infected populations, respectively, and m represents the total number of experimental data available. in this case, the sir model must be simulated considering the parameters calculated by de, in order to obtain the number of infected people estimated by the model and, consequently, the value of the objective function (f).
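the least-squares functional f can be sketched as below. since the china time series itself is not reproduced in this text, the "experimental" curve is synthetic, generated from the fitted parameters reported in table 1, purely for illustration; the names objective, t_data and i_exp are ours, not the authors'.

```python
import numpy as np
from scipy.integrate import solve_ivp

def sir_rhs(t, y, beta, gamma):
    s, i, r = y
    return [-beta * s * i, beta * s * i - gamma * i, gamma * i]

# f(theta) = sum over the m data points of (i_exp - i_sim)^2
def objective(theta, t_data, i_exp):
    beta, gamma, i0 = theta
    sol = solve_ivp(sir_rhs, (t_data[0], t_data[-1]), [1 - i0, i0, 0.0],
                    args=(beta, gamma), t_eval=t_data)
    return float(np.sum((i_exp - sol.y[1]) ** 2))

# synthetic "experimental" series generated from the fitted parameters
t_data = np.linspace(0.0, 70.0, 71)
true = (0.3566, 0.0858, 0.0038)
i_exp = solve_ivp(sir_rhs, (0.0, 70.0), [1 - true[2], true[2], 0.0],
                  args=true[:2], t_eval=t_data).y[1]
```

evaluating the functional at the generating parameters returns (numerically) zero, and any perturbed parameter vector gives a strictly larger residual, which is exactly the property the de search exploits.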
as the number of measured data, m, is usually much larger than the number of parameters to be estimated, the inverse problem is formulated as a finite dimensional optimization problem in which we aim at minimizing f. in order to formulate both ocps, the parameters estimated by solving the proposed inverse problem are used. as proposed by neilan and lenhart (2010) and biswas et al. (2014), a new variable w, which denotes the number of vaccines used, is introduced in order to determine the optimal control strategy for vaccine administration. for this purpose, the rate at which vaccines are used during the whole period of time is proportional to the product u s. physically, u represents the portion of susceptible individuals being vaccinated per unit of time (biswas et al., 2014). it is important to mention that u acts as the control variable of such system. if u is equal to zero there is no vaccination, and u equal to one indicates that the whole susceptible population is vaccinated. a schematic diagram of the disease transmission among the individuals for the sir model with vaccination is shown in fig. 2. mathematically, the sir model considering the presence of control is written as: ds/dt = −β s i − u s, di/dt = β s i − γ i, dr/dt = γ i, dw/dt = u s, with w(0) = w 0 , where w 0 is the initial condition for the total amount of vaccines. it is important to emphasize that the population size (n) after the inclusion of this new variable w along the time t is defined as n(t) = s(t) + i(t) + r(t) + w(t). the first formulation aims to determine the optimal vaccine administration (u) that minimizes the infected population, represented by ω 1 (15). the ocp is then defined as arg min u ω 1 (16), subject to eqs. (11)-(14) and u min ≤ u ≤ u max , where t 0 and t f represent the initial and final time, respectively, and u min and u max are the lower and upper bounds for the control variable, respectively.
the second formulation considers two objectives, i.e., the determination of the optimal vaccine administration in order to minimize the number of infected individuals and, at the same time, to minimize the number of vaccines needed. the total number of vaccines can be determined by ω 2 = ∫ u(t) s(t) dt, evaluated from t 0 to t f (17), whereas the number of infected people is given by eq. (15). thus, the multi-objective optimization problem is formulated as arg min u (ω 1 , ω 2 ) (18), subject to eqs. (11)-(14) and u min ≤ u ≤ u max . in both problems, the control variable u must be discretized. in this context, the proposed approach consists of transforming the original ocp into a nonlinear optimization problem. for this purpose, let the time interval [0, t f ] be discretized using n elem time nodes, with each node denoted by t i , where i = 0, . . . , n elem − 1, such that t 0 ≤ t i ≤ t f . for each of the n elem − 1 subintervals of time, given by [t i , t i+1 ], the control variable is considered piecewise constant, that is, u(t) = u i for t ∈ [t i , t i+1 ]. in order to obtain an optimal control strategy for vaccine administration that can be used in medical practice, we consider bang-bang control, which consists of a binary feedback control that turns either "on" (in our case, when u = u max = 1) or "off" (when u = u min = 0) at different time points, determined by the system feedback. in this case, as the control strategy u is piecewise constant, the proposed optimal control problem has n elem − 2 unknown parameters, since the control variable at the start and end times is known. the resulting nonlinear optimization problems are solved by using de, in the case of the mono-objective problem, given by eq. (16), and mode, for the multi-objective problem defined by eq. (18). in order to apply the proposed methodology to solve the inverse problem described previously, the following steps are established:
• objective function: minimize the functional f, given by eq.
(10);
• design space: 0.1 ≤ β ≤ 0.6, 0.04 ≤ γ ≤ 0.6 and 10^−8 ≤ i 0 ≤ 0.5 (all defined after preliminary executions);
• de parameters: population size (25), number of generations (100), perturbation rate (0.8), crossover rate (0.8) and strategy rand/1 (as presented in section 4.1). the evolutionary process is halted when a prescribed number of generations is reached (in this case, 100). 20 independent runs of the algorithm were made, with different seeds for the generation of the initial population;
• to evaluate the sir model during the optimization process, the runge-kutta-fehlberg method was used;
• initial conditions: s(0) = 1 − i 0 , i(0) = i 0 , and r(0) = 0. in this case, i 0 is chosen as the first reported data point for the number of infected individuals in the time series;
• the data used in the formulation of the inverse problem refer to the population of china, from january 22 to april 2, 2020, taken from the johns hopkins resource center (2020, accessed april 03, 2020).
table 1 presents the results (best and standard deviation) obtained using de. it is possible to observe that de was able to obtain good estimates for the unknown parameters and, consequently, for the objective function, as can be verified by visual inspection of fig. 3. these results were obtained, as mentioned earlier, from 20 runs. thus, the values of the standard deviation demonstrate that the algorithm converges, practically, to the same optimum in all executions (best). physically, the probability of transmission by contact in the chinese population is greater than 35% (β equal to 0.3566). in addition, γ equal to 0.0858 implies a moderate per capita recovery rate. one must consider that, since many cases may not be reported, for different reasons, as for example an asymptomatic infected person, the value of i 0 may vary, as well as the behavior of the model over time.
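the inverse problem can be reproduced in spirit with scipy's built-in differential_evolution, which implements the same rand/1/bin strategy (here selected explicitly) plus a final local polishing step. the bounds follow the design space listed above, but the data are synthetic, generated from the reported best parameters, since the original time series is not included in this text.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import differential_evolution

def sir_rhs(t, y, beta, gamma):
    s, i, r = y
    return [-beta * s * i, beta * s * i - gamma * i, gamma * i]

def objective(theta, t_data, i_exp):
    # sum of squared residuals between data and simulated infected curve
    beta, gamma, i0 = theta
    sol = solve_ivp(sir_rhs, (t_data[0], t_data[-1]), [1 - i0, i0, 0.0],
                    args=(beta, gamma), t_eval=t_data)
    return float(np.sum((i_exp - sol.y[1]) ** 2))

# synthetic series standing in for the china data (illustration only)
t_data = np.linspace(0.0, 70.0, 36)
i_exp = solve_ivp(sir_rhs, (0.0, 70.0), [1 - 0.0038, 0.0038, 0.0],
                  args=(0.3566, 0.0858), t_eval=t_data).y[1]

bounds = [(0.1, 0.6), (0.04, 0.6), (1e-8, 0.5)]  # design space from the text
res = differential_evolution(objective, bounds, args=(t_data, i_exp),
                             strategy="rand1bin", popsize=15, maxiter=60,
                             seed=1)
```

res.x then holds the recovered (β, γ, i 0 ) and res.fun the residual; on this noise-free synthetic data the fit is essentially exact, whereas on real case counts the residual reflects reporting noise.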
it is important to emphasize that when choosing i 0 as a design variable, the initial condition for the susceptible population (s 0 ) is automatically defined, that is, s 0 = 1 − i 0 , since there is not, at the beginning of an epidemic, a considerable number of recovered individuals and, thus, r 0 = 0 is a reasonable choice. in this case, the available data refer to the number of infected individuals and these represent only the portion of individuals in the population that have actually been diagnosed. this is due, among other facts, to the lack of tests to diagnose the disease of all individuals who present symptoms. thus, as the number of susceptible individuals at the beginning of the epidemic is dependent on the value of i 0 , in this work it is considered that the total size of the population, typically defined as n = s + i + r, is actually a portion of the total population, since the number of infected individuals available is also a fraction of those who have actually been diagnosed. in this case, the results presented below represent only the fraction of the infected population that was diagnosed and, consequently, the fraction of individuals susceptible to contracting the disease. qualitatively, the results presented are proportional to the number of individuals in the population who were diagnosed with the disease. in order to evaluate the sensitivity of the solutions obtained, in terms of the objective function, the best solution (β = 0.3566, γ = 0.0858, and i 0 = 0.0038) was analyzed considering a perturbation rate given by δ. for this purpose, the range [(1 − δ) θ k , (1 + δ) θ k ] was adopted, for k ⊂ {1, 2, 3}, where θ = (β, γ, i 0 ). thus, in each analysis, one design variable is perturbed and the value of f in relation to this noise is computed. figure 4 presents the sensitivity analysis for each estimated parameter, in terms of the objective function, considering δ equal to 0.25 and 100 equally spaced points in the interval of interest. 
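the perturbation study just described can be sketched as follows, again with a synthetic data series standing in for the real one (an assumption of this sketch); the quantity spread measures how much f varies over each perturbed range, mirroring the one-at-a-time sweep with δ = 0.25 and 100 points.

```python
import numpy as np
from scipy.integrate import solve_ivp

def sir_rhs(t, y, beta, gamma):
    s, i, r = y
    return [-beta * s * i, beta * s * i - gamma * i, gamma * i]

def objective(theta, t_data, i_exp):
    beta, gamma, i0 = theta
    sol = solve_ivp(sir_rhs, (t_data[0], t_data[-1]), [1 - i0, i0, 0.0],
                    args=(beta, gamma), t_eval=t_data)
    return float(np.sum((i_exp - sol.y[1]) ** 2))

best = np.array([0.3566, 0.0858, 0.0038])  # best solution from table 1
t_data = np.linspace(0.0, 70.0, 36)
i_exp = solve_ivp(sir_rhs, (0.0, 70.0), [1 - best[2], best[2], 0.0],
                  args=tuple(best[:2]), t_eval=t_data).y[1]

delta = 0.25
spread = {}
for k, name in enumerate(["beta", "gamma", "i0"]):
    # perturb one parameter at a time over [(1 - delta), (1 + delta)] * best
    grid = np.linspace((1 - delta) * best[k], (1 + delta) * best[k], 100)
    vals = []
    for v in grid:
        theta = best.copy()
        theta[k] = v
        vals.append(objective(theta, t_data, i_exp))
    spread[name] = max(vals) - min(vals)
```

on this synthetic setup the spread of f is largest for β, consistent with the sensitivity ranking reported for figure 4.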
in these figures, it is possible to observe that perturbing each parameter results, as expected, in a worse value of f. in addition, the design variable most sensitive to the perturbation δ is the β parameter, since a wide range of values of f was obtained. we consider two distinct analyses in this section, in order to evaluate the proposed methodology considered to solve the mono-objective optimization problem: i) solution of the proposed mono-objective optimal control problem and ii) evaluation of the influence of the maximum amount of vaccine, by defining an inequality constraint. for this purpose, the following steps are established:
• objective function: minimize the functional ω 1 , given by eq. (16);
• the previously calculated parameters (β, γ and i 0 ) are employed in the simulation of the sir model;
• design space: 0 ≤ t i ≤ t f , for i = 1, . . . , n elem − 1, and n elem = 10. it is important to mention that this value was chosen after preliminary runs, i.e., increasing this value does not produce better results in terms of the objective function;
• de parameters: population size (25), number of generations (100), perturbation rate (0.8), crossover rate (0.8) and strategy rand/1 (as presented in section 4.1). the evolutionary process is halted when a prescribed number of generations is reached (in this case, 100). 20 independent runs of the algorithm were made, with different seeds for the generation of the initial population;
• to evaluate the sir model during the optimization process, the runge-kutta-fehlberg method was used;
• initial conditions: s(0) = 1 − i 0 , i(0) = i 0 , and r(0) = 0. as in the previous case, i 0 is chosen as the first reported data point for the number of infected individuals in the time series.
table 2 presents the best solution obtained by using de and considering ten control elements, in terms of the number of individuals.
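a minimal sketch of the piecewise-constant (bang-bang) control parametrization used in the mono-objective problem is given below. the controlled equations are written as ds/dt = −βsi − us, di/dt = βsi − γi, dr/dt = γi, dw/dt = us with µ = 0, which is an assumption based on the description in the text and on biswas et al. (2014); the 70-day horizon and the helper names (controlled_rhs, infected_burden) are ours.

```python
import numpy as np
from scipy.integrate import solve_ivp

beta, gamma, i0 = 0.3566, 0.0858, 0.0038   # fitted values reported in table 1
t_f, n_elem = 70.0, 10
edges = np.linspace(0.0, t_f, n_elem + 1)  # n_elem equal control subintervals

def controlled_rhs(t, y, u_levels):
    s, i, r, w = y
    # locate the active subinterval and read its constant control level
    k = min(int(np.searchsorted(edges, t, side="right")) - 1, n_elem - 1)
    u = u_levels[k]
    return [-beta * s * i - u * s,   # susceptible: infection + vaccination
            beta * s * i - gamma * i,
            gamma * i,
            u * s]                   # w accumulates vaccinated individuals

def infected_burden(u_levels):
    sol = solve_ivp(controlled_rhs, (0.0, t_f), [1 - i0, i0, 0.0, 0.0],
                    args=(np.asarray(u_levels, float),), dense_output=True,
                    max_step=1.0)    # small steps so switches are resolved
    t = np.linspace(0.0, t_f, 500)
    i_vals = sol.sol(t)[1]
    # omega_1 approximated as the trapezoidal integral of i(t)
    return float(np.sum(0.5 * (i_vals[1:] + i_vals[:-1]) * np.diff(t)))

no_vacc = infected_burden([0.0] * n_elem)
full_vacc = infected_burden([1.0] * n_elem)  # control "on" in every interval
```

a de or mode run would then search over the binary vector u_levels; already the two extreme strategies show the effect the text reports, with the fully vaccinated schedule yielding a far smaller infected burden than the uncontrolled case.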
the objective function obtained (about 8945.4278 individuals) is smaller than in the case in which no control is considered (about 1594607.2234 individuals), i.e., the number of infected individuals is lower when a control strategy is considered (see figs. 5(a) and 5(c)). if the number of infected individuals is reduced, due to the control action, the number of susceptible individuals rapidly decreases until its minimum value (1.4382 × 10^−3 individuals) and, consequently, the number of recovered individuals rapidly increases until its maximum value (767.5187 individuals), as observed in figs. 5(b) and 5(d), respectively. in terms of the action of the control variable, the effectiveness is readily verified at the beginning of the vaccine administration. afterwards, the administration is conducted in specific intervals of time, which preserves the health of the population, as observed in fig. 5(e). the evolution of the number of vaccinated individuals is presented in fig. 5(f). in this case, due to the control action, the vaccinated population increases rapidly until the value saturates (141835.1405 individuals). in summary, all obtained profiles are coherent from the physical point of view. finally, it is important to mention that the standard deviation for each result is approximately equal to 10^−3, which demonstrates the robustness of de in solving the proposed mono-objective optimal control problem. in this model, the evaluation of the number of vaccinated individuals is associated with an inequality constraint. this relation bounds the quantity of individuals that can be vaccinated, due to the limitation related to the production of vaccines. for this purpose, two control elements are incorporated into the model: if w(t) ≤ w lim , then u = 1; otherwise, u = 0 (t 1 is the instant of time at which w(t 1 ) = w lim , and w lim is the upper bound for the number of vaccinated individuals).
table 3 presents the results obtained considering different values of the parameter wlim. as expected, the insertion of this constraint limits the maximum number of vaccinated individuals and, consequently, fewer individuals are vaccinated. increasing the parameter wlim reduces the objective function value and the numbers of infected and recovered individuals and, consequently, increases the number of susceptible individuals. these trends can be observed in fig. 6. as presented previously, a multi-objective optimal control problem was proposed in order to minimize the number of infected individuals (ω1) and the quantity of vaccine administered (ω2). to evaluate the proposed methodology for solving this multi-objective optimization problem, the following steps are established:
• objective functions: minimize ω1 and ω2 together, defined by eqs. (15) and (17), respectively;
• the previously calculated parameters (β, γ and i0) are employed in the simulation of the sir model;
• design space: 0 ≤ ti ≤ tf, for i = 1, . . ., nelem − 1, with nelem = 10;
• mode parameters: population size (50), number of generations (100), perturbation rate (0.8), crossover rate (0.8), number of pseudo-curves (10), reduction rate (0.9), and strategy rand/1 (as presented in section 4.1). the stopping criterion adopted is the same as in the previous cases;
• to evaluate the sir model during the optimization process, the runge-kutta-fehlberg method was used;
• initial conditions: s(0) = 1 − i0, i(0) = i0, r(0) = 0, and w(0) = 0.
the obtained solutions are presented in table 4. it must be stressed that the pareto curve presents the non-dominated solutions, as described in section 4.2.
point a represents the best solution in terms of minimizing the number of infected individuals, with ω1 = 8963.7775; that is, the number of infected individuals at tf assumes its lowest value, i(tf) = 769.0921, but at the cost of a larger amount of vaccine administered (ω2 = 6.9358). on the other hand, point b represents the best solution in terms of the quantity of vaccine administered, with ω2 = 1.2940, i.e., the minimization of this value at t = tf. however, for this point, the number of infected individuals is high (ω1 = 56644.0350). point c is a compromise solution, good in terms of both objectives simultaneously, with intermediate values for both: ω1 = 13298.2440 and ω2 = 2.3034. these points are described in table 4. in figure 7(e) it is possible to observe the activation of the control variable when vaccine is introduced. moreover, in both results obtained, the action of the treatment is readily verified in the population over a longer interval of time at the beginning of the vaccine administration. in figures 7(b), 7(c), 7(d) and 7(f) the susceptible, infectious, recovered and vaccinated profiles are presented, respectively, for each point described in table 4. in these figures we can visualize the importance of the control strategy used. for example, points a and c are good choices in terms of minimizing the number of infected individuals, although point a has a higher value of the objective ω2. on the other hand, point b is satisfactory in terms of minimizing the amount of vaccine administered but, from a clinical point of view, it is not a good choice, as the number of infected individuals is not minimized. in this contribution, an inverse problem was proposed and solved to simulate the dynamic behavior of the novel coronavirus disease (covid-19) considering real data from china.
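the non-dominance idea behind points a, b and c can be made concrete with a small pareto filter over (ω1, ω2) pairs. the first three candidates below use the values quoted in the text; the fourth is a made-up dominated point added purely for illustration.

```python
def dominates(p, q):
    # p dominates q if p is no worse in every objective
    # and strictly better in at least one (minimization)
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def pareto_front(points):
    # keep only points that no other point dominates
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

# (omega_1, omega_2) pairs: points a, b, c from the text plus a dominated one
candidates = [(8963.7775, 6.9358),    # point a: fewest infected, most vaccine
              (56644.0350, 1.2940),   # point b: least vaccine, most infected
              (13298.2440, 2.3034),   # point c: compromise solution
              (60000.0, 7.0)]         # hypothetical dominated candidate
front = pareto_front(candidates)
```

the filter drops only the fabricated fourth point: a, b and c are mutually non-dominated, which is exactly why all three appear on the pareto curve.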
the parameters of the compartmental sir (susceptible, infectious and recovered) model were determined by using differential evolution (de). considering the parameters obtained from the solution of the proposed inverse problem, two optimal control problems were formulated. the first consists of minimizing the number of infected individuals; in this case, an inequality constraint representing the quantity of vaccine available was analyzed. the second optimal control problem considers minimizing together the number of infected individuals and the quantity of vaccine prescribed during the treatment. this problem was solved using multi-objective differential evolution (mode). in general, the solution of the proposed multi-objective optimal control problem provides information from which an optimal strategy for vaccine administration can be defined. the use of mathematical models associated with optimization tools may contribute to decision making in situations of this type. it is important to emphasize that the quality of the results depends on the experimental data considered. in this context, one may cite the following limitations regarding the sir model: i) the poor quality of the reported official data and ii) the simplifications of the model, which omits, for example, terms such as birth rate, differential vaccination rate, and weather changes and their effect on the epidemiology. finally, it is worth mentioning that the problem formulated in this work is not normally considered in the specialized literature (usually only the minimization of the number of infected individuals is proposed). in this context, the formulation of the multi-objective optimization problem and its solution by using mode represent the main contributions of this work.
this study was financed in part by the coordenação de aperfeiçoamento de pessoal de nível superior-brasil (capes)-finance code 001, fundação carlos chagas filho de amparo à pesquisa do estado do rio de janeiro (faperj), and conselho nacional de desenvolvimento científico e tecnológico (cnpq). key: cord-263571-6i64lee0 authors: sarkar, rohan; bhowmik, arpan; kundu, aditi; dutta, anirban; nain, lata; chawla, gautam; saha, supradip title: inulin from pachyrhizus erosus root and its production intensification using evolutionary algorithm approach and response surface methodology date: 2020-09-09 journal: carbohydr polym doi: 10.1016/j.carbpol.2020.117042 sha: doc_id: 263571 cord_uid: 6i64lee0 production of inulin from yam bean tubers by ultrasound-assisted extraction (uae) was optimized by using response surface methodology (rsm) and genetic algorithms (ga).
the yield of inulin obtained was between 11.97%-12.15% for uae and 11.21%-11.38% for microwave-assisted extraction (mae) using both optimization methodologies, significantly higher than the conventional method (9.9%) under optimized conditions. under such optimized conditions, sem images of root tissues before and after extraction showed disruption and microfractures over the surface. uae provided a shade better purity of extracted inulin than the other two techniques. the degree of polymerization of inulin was also recorded to be better, possibly due to lesser degradation during extraction. significant prebiotic activity was recorded on evaluation using lactobacillus fermentum, 36% more than with the glucose treatment. the energy density of uae was several-fold lower than that of mae, and carbon emission was far lower in both these methods than in the conventional one. with the sharp increase in health-related problems, especially with the advent of covid-19, enhancing immunity is one of the prescribed methods to stay safe and healthy. boosting of immunity is linked with the structure and function of the microbiome. the health of gut bacteria can be enhanced by using probiotics directly, or by using prebiotics in order to obtain the beneficial effect indirectly. apart from this, there is an increasing trend of gastrointestinal problems. in this regard, india has become one of the leading countries in south-east asia, as around 10% of its population suffers from severe "functional gastrointestinal diseases" according to a 2017 ncbi report. there is a plethora of synthetic drugs accessible on the market, but many side effects are associated with these pharmaceuticals. so nowadays scientific communities are expressing interest in enriching the population of gut-friendly microbes that are already present in the human system. prebiotic compounds play a pivotal role in increasing the population of microbes in the human gut.
these are basically non-digestible oligosaccharides (short-chain dietary carbohydrates) that show selective metabolism within the system. oligosaccharides, resistant to gastric acidity, are fermented and utilized by the gut microbiota; they stimulate the growth and/or activity of gut bacteria (olano-martin et al., 2001). mexican yam bean or jicama (pachyrhizus erosus l.), a member of the fabaceae family, is an important crop in terms of its economic significance in mexico and various south-east asian countries. different polysaccharides are present in the fruit, consisting of cellulose, pectic polysaccharides, xyloglucans and heteromannans, along with inulin (ramos-de-la-pena et al., 2013). the crop is still under-utilized although it has huge commercial potential. inulin being a potent prebiotic substance, its extraction is economically important from a nutraceutical as well as a functional-food perspective. so, to explore the possibility of utilizing this under-utilized crop for the purpose of valorisation, yam bean tuber flesh was selected for the extraction of inulin. generally, oligosaccharides are extracted with hot water, apart from newer techniques like ultrasound-assisted extraction (uae) and microwave-assisted extraction (mae). the conventional extraction process uses higher temperatures for extended periods of time but results in lower yield (liu et al., 2014). uae and mae provide immense advantages in terms of shorter extraction time, lower energy requirement and higher efficiency. using the ultrasound-mediated extraction method, inulin was earlier extracted from burdock roots (arctium lappa) (milani et al., 2011), jerusalem artichoke tubers (li et al., 2018), roots of elecampane (inula helenium l.) (petkova et al., 2015) and tubers of iranian artichoke (abbasi and farzanmehr, 2009). similarly, different fructo-oligosaccharides have been extracted from various matrices using microwave heating.
using this technique, inulin has been isolated from tubers of helianthus tuberosus l. (temkov et al., 2015), chicory roots (shweta et al., 2015), tubers of cynara scolymus l. (ruiz-aceituno et al., 2016), burdock roots (li et al., 2014), etc. however, no study has been documented to date regarding the extraction of inulin from yam bean tubers by ultrasound or microwave techniques. from an industrial standpoint, the optimisation of extraction conditions is essential in order to obtain the maximum yield of desirable bioactives for any extraction method. using response surface methodology (rsm) for optimisation is not a new approach, but limited studies have been conducted on optimised extraction protocols for inulin from different crops. keeping this in mind, the present research work was formulated with the aim of determining the role of extraction techniques in the purity of inulin extracted by uae and mae. the prebiotic activity of the extracted inulin was also evaluated for its utilization in food. process optimisation was done by two methods, viz. the box-behnken design (response surface methodology) (varghese et al., 2017) and a genetic algorithm approach based on the concept of natural selection and genetics, using a non-linear second order response surface model (garcia et al., 2018). this piece of work will be helpful for industrial application by identifying appropriate conditions for obtaining the maximum amount of prebiotic compound. yam bean tubers were collected from a local market of kolkata (mechhua fruit market, n 22.57°; e 88.36°). the tubers were cut into pieces of convenient size (1-2 cm) and blanched in boiling water (2 l per 1 kg tubers; 100 °c) for 5 minutes to deactivate enzymatic activity. the blanched tuber pieces were kept overnight in an oven at 50 °c for complete drying. the dried samples were then ground using a mixer grinder into fine powder (<1 mm size).
this powdered material was utilized for the extraction of inulin. deionized water, obtained from a millipore purifier system with 18.2 mΩ cm resistivity, was used for extraction. calcium hydroxide, phosphoric acid, 3,5-dinitrosalicylic acid, phenol and sulphuric acid were of analytical grade (merck, india). the inulin standard was procured from sigma-aldrich. an ultrasonicator (vcx-750, sonics and materials inc., newtown, usa) with 20 khz frequency was used for uae; it comprises an ultrasonic processor with a titanium probe of 13 mm diameter and an amplitude (100%) of 114 µm. a domestic microwave (ce2933, samsung), working at power levels between 300-900 w and a frequency of 2450 mhz, was used for mae. chromatographic analysis was done by hplc (waters alliance 2695 separation module) equipped with an amino column (waters, 250 × 4.6 mm, 5 µm) and an elsd (model 2424). a ph meter and a uv-vis spectrophotometer (analytik jena ag, germany) were used for ph and spectrophotometric analyses, respectively. a centrifuge (z326k, hermle ag, germany) was used for the separation of precipitate. powdered material was subjected to probe ultrasound (vcx-750, sonics and materials inc., newtown, usa) of 20 khz frequency at 60, 80 and 100% amplitude, using three different solvent (water)-to-solute ratios (3.5:1, 4.5:1, 5.5:1 v/w) and extraction times of 120, 150 and 180 s. a total of 15 combinations was reported as output from the rsm analysis based on these extraction parameters, and the yield of inulin was obtained for each combination. further, inulin was obtained from the water extract of tuber powder according to li's method [14]. firstly, ca(oh)2 was added to the extract up to ph 11 to precipitate the protein portion. after completion of this step, h3po4 was added to lower the ph to 8. the whole content was centrifuged at 10,000 rpm for 5 min to obtain the supernatant.
finally, inulin powder was obtained by precipitation in excess ethanol followed by freeze drying (labconco, usa) at −80 °c. total sugar content and reducing sugar content were measured by the phenol-sulfuric acid method (dubois et al., 1956) and the 3,5-dinitrosalicylic acid (dns) method (miller, 1959), respectively. the difference between these two values is the inulin present in the extract, presented as a percentage of the dry weight of the tuber sample. the design of the microwave-assisted extraction experiments is similar to that of the ultrasound-assisted extraction, with microwave powers of 300, 600 and 900 w, solvent-to-solute ratios of 3.5:1, 4.5:1 and 5.5:1, and extraction times of 120, 150 and 180 s. fifteen different combinations were obtained from the rsm analysis and used for the extraction of inulin; the yield of inulin was obtained for each combination following the process described earlier in section 2.3.1. inulin was also extracted conventionally by heating the tuber tissues (0.5 kg) with water (1.5 l) at 90 °c for 30 minutes along with homogenization (5000 rpm); the extraction was repeated twice in order to complete it. all the extracts were combined, filtered and processed in a way similar to the other methods (discussed in section 2.3.1) to obtain the yield of inulin. a box-behnken design (bbd) was used for the optimization of inulin production from the tuberous root of yam bean. analysis based on a bbd generally considers a second order response surface model (quadratic polynomial). since the model lack-of-fit was significant after fitting the second order response surface model, whereas it should ideally remain non-significant, a partial third order (cubic) polynomial model can be used for fitting the bbd data, as long as degrees of freedom remain for estimating the experimental error. the independent variables used in the present study were coded as depicted in table 1, and the layout of the required experiments followed table s1.
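the 15-run layout referred to above can be generated in coded units as a quick sketch: a box-behnken design for three factors takes the 2-level factorial on each pair of factors with the remaining factor at its centre, plus replicated centre points (here 3, matching the 15 combinations in the text).

```python
from itertools import combinations

def box_behnken(k, center_runs=3):
    # box-behnken design in coded units (-1, 0, +1):
    # full 2-level factorial on each pair of factors with the remaining
    # factors held at 0, plus replicated centre points
    runs = []
    for i, j in combinations(range(k), 2):
        for a in (-1, 1):
            for b in (-1, 1):
                row = [0] * k
                row[i], row[j] = a, b
                runs.append(row)
    runs += [[0] * k for _ in range(center_runs)]
    return runs

design = box_behnken(3)   # 3 factors -> 12 edge runs + 3 centre runs = 15
```

each non-centre run sits on an edge midpoint of the coded cube (exactly two factors at ±1), which is what keeps the design away from the extreme corner combinations.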
design expert software (version 9.0.6.2) was used for the analysis of the whole experiment. the optimised data generated by rsm were validated against real-time experimental data. genetic algorithms are another approach to optimizing the experimental conditions. they are based on natural evolution and basically imitate darwin's principle of "survival of the fittest". complex optimization problems can be solved using genetic algorithms, and a number of recent studies have used this technique to optimize experimental parameters (hatami et al., 2010; muthusamy et al., 2019; sodeifian et al., 2016). the genetic algorithm, pioneered by holland, is mainly used in optimization for its accuracy. unlike some other optimization techniques, it does not require initial values for the experimentation. here, optimization was done by using an exponential second order response surface polynomial with the ga approach. optimization using a non-linear model requires initial parameter values; for the present investigation, the initial parameter values for the exponential second order polynomial were obtained by fitting that model to the data from the box-behnken experiments used in the present study. in order to assess the effectiveness of the extraction procedure and the disruption it causes, the surface structure of pachyrhizus erosus root powder was observed under sem (carl zeiss evo-ma-10, operating at 10.0 kv/eht) before and after each individual experiment. three samples (uae, mae and conventional method) along with the initial material were used for recording sem images. each sample was prepared for analysis by mounting approximately 0.5 mg of the powdered material on an aluminium stub sputter-coated with a palladium layer.
for the purity estimation of inulin, free fructose present in the extracted inulin was estimated spectrophotometrically using the method described by saengkanuk et al. (2011), as well as by hplc. for the estimation of total fructose and glucose, the inulin extract was hydrolysed with 0.2 mol l-1 hcl at 100 °c for 45 minutes, and the hydrolysate was analysed for fructose and glucose. chromatographic separation of hydrolysed inulin was performed by isocratic elution with a 90% a/10% b solvent system, where a and b were 80/20 acetonitrile/water with 0.2% triethylamine and 30/70 acetonitrile/water with 0.2% triethylamine, respectively, at a flow rate of 1.0 ml min-1. the gain of the els detector was set at 100, with a nitrogen gas pressure of 35 psi. inulin was hydrolysed and the fructose content was measured by hplc. the effect of temperature and the influence of ultrasound/microwave on the degree of polymerization of inulin were also evaluated at the two extreme conditions of uae and mae used in the rsm experiment. for the evaluation of the prebiotic effect of inulin, a pure culture of lactobacillus fermentum, maintained by the division of microbiology, icar-iari, new delhi, was used. the microbial strain was grown at 35 °c under anaerobic conditions using de man, rogosa and sharpe (mrs) broth. growth of the microbe was monitored at regular time intervals by measuring the optical density (od) of the medium at 622 nm. the response to inulin as a prebiotic was assessed by using the same mrs broth composition but with the sugar replaced by inulin. growth of the culture was also assessed in mrs broth with no carbohydrate source, which was considered as the control. the carbohydrate concentration was maintained at the 2% level in all cases. the activated inoculum (1% v/v) was incubated at 35 °c. growth of the bacteria was observed at 12-hour intervals up to 72 hours, when the growth of the microbe was in the stationary phase. od values were measured as an indication of the growth of the bacterial culture.
for further confirmation, bacterial counts were also done on samples from each respective culture medium by the serial dilution method using 0.9% nacl solution. the energy density (ev, j ml-1) was calculated and compared between the uae and mae methods. it is defined as the amount of energy dissipated per unit volume of extraction solvent (chen et al., 2017) and is given by ev = pv × t, with pv = (m × cp × (dT/dt)) / v, where pv is the power density (w ml-1); t is the extraction time (s); m is the mass (g) of the sample; cp is the specific heat of water (4.186 j g-1 °c-1); dT/dt is the heating rate (°c s-1) during the execution of the experiment; and v is the total volume (ml) of the sample. analysis of energy consumption is a prerequisite for any technology that has the potential to be scaled up to industrial level. total energy consumption was calculated based on the electricity consumed by each experiment. carbon emission was calculated by considering the fact that 1 kwh produces 0.8 kg of co2. the selection of pachyrhizus erosus tubers for the extraction of inulin was made with the purpose of valorization of the crop, which is under-utilised although grown in different parts of the globe. the extraction yield of inulin from pachyrhizus erosus tuberous root by conventional hot extraction, mae and uae varied across 9.9%, 10.2-11.2% and 10.3-11.9%, respectively. conventional extraction was done by hot-water refluxing for 30 minutes; extraction efficiency did not improve upon increasing the duration. further, initial soaking for a fixed time before extraction, or homogenization prior to extraction, did not enhance extraction efficiency significantly; better extraction of phenolic components from tagetes erecta by such methods has also been reported. uae was done by varying amplitudes, times and solute-to-solvent ratios, keeping the temperature constant at 40 °c. variations in inulin yield were recorded across all the variables (fig. s1).
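the energy-density and carbon-emission bookkeeping described above can be sketched as follows; the numeric inputs in the example call are illustrative, not measurements from the study.

```python
# sketch of the ev = pv * t and pv = m * cp * (dT/dt) / v relations above;
# example inputs are illustrative, not the paper's data
CP_WATER = 4.186           # specific heat of water, j g^-1 degc^-1
CO2_PER_KWH = 0.8          # kg co2 per kwh, the factor used in the text

def power_density(mass_g, heating_rate_c_per_s, volume_ml):
    # pv = m * cp * (dT/dt) / v   [w ml^-1]
    return mass_g * CP_WATER * heating_rate_c_per_s / volume_ml

def energy_density(pv_w_per_ml, t_sec):
    # ev = pv * t   [j ml^-1]
    return pv_w_per_ml * t_sec

def carbon_emission_kg(energy_j):
    # convert joules to kwh (3.6e6 j per kwh), then apply the emission factor
    return energy_j / 3.6e6 * CO2_PER_KWH

pv = power_density(mass_g=100.0, heating_rate_c_per_s=0.05, volume_ml=450.0)
ev = energy_density(pv, t_sec=180.0)
```

the same two helpers reproduce the paper's comparison logic: a lower heating rate or shorter time gives a lower energy density, and the joule total maps linearly onto kg of co2.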
maximum extraction efficiency (12.2%) was observed in the experiment where 100% amplitude was used for three minutes with a solute-to-solvent ratio of 1:4.5; this was 19.2% more than conventional extraction. better extraction in uae was provided by the energy delivered by the ultrasonic waves, which helped the solvent penetrate the matrix, whereas temperature governed the extraction in the case of mae. extraction efficiency was maximum (11.2%) in the experiment where a microwave power of 900 w was applied for 150 seconds with a solvent-to-solute ratio of 5.5; this was 13.5% more than the conventional method. the highest pectin yield from grapefruit was recorded with mae (27.8%) as compared to uae in an ultrasonic bath (17.9%); in mae, 900 w for a 6 min interval was used for extraction, whereas the ultrasonic bath used 25 min of sonication at 70 °c (bagherian et al., 2011). better rupture of the cells followed by better penetration of the solvent inside the matrix is the reason for better extraction, as confirmed by the sem data presented in a later subsection. simultaneous ultrasonic-microwave assisted extraction of inulin required a much shorter time than the conventional method when applied to burdock root (lou et al., 2009). the extraction times for the conventional and ultrasonic-microwave assisted extraction methods were 60 and 300 seconds, respectively, but the yield was a shade better in the conventional method (99.8 mg g-1) than in the other method (99.0 mg g-1); upon increase in extraction time, the latter method showed degradation of inulin. uae provided a better yield from roots of globe artichoke (castellino et al., 2020); that study reported in general a 33% increase in extraction yield by uae over conventional hot water extraction. here, the response was y = inulin content (%).
based on the second order model fitting, it has been observed that, for the present experimental dataset, the overall model and the factors a and b are highly significant at the 1% level of significance. the quadratic effect of a, i.e. a2, also remains significant at the 1% level. all other effects remain non-significant at the 5% level of significance. the model r2 = 0.9945 indicates that the model explains 99.45% of the variability, which is quite good. the adjusted r2 = 0.9845, also quite good, indicates that the significant portion of variation explained by the model is 98.45%. it should be noted that the adjusted r2 will increase if only significant variables are included in the model. however, for the above model, the overall lack-of-fit also remains highly significant (p-value: 0.0041) at the 1% level, which is not desirable: from a statistical point of view, the lack-of-fit test assesses the goodness of fit of the model and should remain non-significant for the model to be fitted well. the significance of the lack-of-fit may be due to the fact that the second order model is not exactly capturing all the variation in the data; if so, there is still scope for model improvement. the optimum point is fixed at 100% amplitude, 180 s and a solvent-to-solute ratio of 4.5. it can be seen that the optimum point maximizes the inulin content (%); the predicted maximum value is 12.23% with the maximum desirability (fig. s2). the optimum value may lie between 12.22 and 12.25%. the predicted values of the experiment were validated in the laboratory and are presented in table 3 (comparison between optimum conditions predicted by the bbd and ga models for uae and mae; values are the average of three analyses). of the three variables, the interactions between pairs of factors are presented in the form of 3d response surface curves and their contour plots (figs. 1a-1c). the 3d plots depict the interaction between two variables keeping the third factor constant.
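as a minimal sketch of the r2 / adjusted-r2 bookkeeping used to judge the fitted surface, the code below fits a single-factor quadratic by ordinary least squares on made-up data; the real analysis, done in design expert, fits a multi-factor model, so both the data and the one-factor design matrix here are illustrative assumptions.

```python
def fit_least_squares(X, y):
    # solve the normal equations X^T X b = X^T y by gaussian elimination
    n, p = len(X), len(X[0])
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(p)] for i in range(p)]
    b = [sum(X[r][i] * y[r] for r in range(n)) for i in range(p)]
    for col in range(p):                          # forward elimination, partial pivoting
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coef = [0.0] * p
    for r in reversed(range(p)):                  # back substitution
        coef[r] = (b[r] - sum(A[r][c] * coef[c] for c in range(r + 1, p))) / A[r][r]
    return coef

def r_squared(X, y, coef):
    # r2 = 1 - ss_res / ss_tot; adjusted r2 penalizes the parameter count p
    n, p = len(X), len(X[0])
    yhat = [sum(c * v for c, v in zip(coef, row)) for row in X]
    ybar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, yhat))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p)
    return r2, adj

xs = [-1, -0.5, 0, 0.5, 1, -1, 1]                 # coded factor levels (made up)
ys = [10.1, 11.4, 12.0, 11.5, 10.2, 10.0, 10.3]   # illustrative responses
X = [[1.0, x, x * x] for x in xs]                  # intercept, linear, quadratic
coef = fit_least_squares(X, ys)
r2, adj_r2 = r_squared(X, ys, coef)
```

as the text notes, the adjusted r2 is always at or below r2 and only rises when added terms earn their degrees of freedom; the negative quadratic coefficient here is what produces an interior maximum on the fitted curve.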
only graphs with second order interaction effects are plotted. figures 1a and 1d indicate that, keeping the solvent-to-solute ratio at 4.5, the maximum with the highest desirability lies towards the higher percentages of ultrasonication amplitude and time (s). a similar experiment was conducted for mae of inulin from the same matrix. the three factors for this bbd response surface experiment are microwave power (a1), time (b1) and solvent-to-solute ratio (c1). the analysis revealed that, for the given dataset, the overall model, a1 and b1 are significant at the 1% level of significance, with p-values of 0.004, <0.0001 and 0.003, respectively. all other effects remain non-significant at the 5% level except the interaction a1b1 (p-value: 0.0310). the r2 = 0.9862 indicates that the model explains 98.62% of the variability, which is quite good, and the adjusted r2 = 0.9614. it should again be noted that the adjusted r2 will increase if only significant variables are included in the model. however, the overall lack-of-fit remains significant (p-value: 0.0160) at the 5% level. the significance of the lack-of-fit might again be due to the second order model not exactly capturing all the variation in the data; thus the model was further analysed to make the lack of fit non-significant. as in the analysis of the uae experiment, different parametric combinations were considered by trial and error, and finally the model was fitted as the above quadratic model with the additional parameters a1c1² and b1c1². so, the non-hierarchical cubic model with a1c1² and b1c1² was fitted again and the results summarized; based on these, the optimum point is fixed at 899.9 w of microwave power, 179.9 s and a solvent-to-solute ratio of 4.55. it can be seen that the optimum point maximizes the inulin % and the predicted maximum value is 11.57% with the maximum desirability (fig. s3).
the optimum value may lie between 11.54 and 11.60%. the optimum point was validated in the laboratory and is presented in table 3. the two-factor interaction-wise contour plots and 3d plots follow (only graphs with second order interaction effects are plotted). for the second order response surface model fitted to the data, the lack of fit remains significant at the 5% level of significance; as a result, a non-hierarchical third order polynomial was fitted. alternatively, an exponential form of the second order response surface model (the exponential of a second order polynomial in the factors) was also fitted to the data, which leads to non-linear model fitting. the non-linear model fitting was done through an iterative procedure using the gauss-newton method of non-linear least squares; the convergence criterion was satisfied after 5 iterations. the model remains highly significant at the 1% level of significance. the estimated parameters are presented in table 5 and, since the model is highly significant, the fitted model was used for genetic algorithm optimization to find the optimal solution. genetic algorithms, mathematical models inspired by charles darwin's idea of natural selection, were used for the optimization. the principle of natural selection describes the preservation of only the fittest individuals over generations; a genetic algorithm is an evolutionary algorithm which improves the selection over time. the basic concept of a ga is to combine different solutions, generation after generation, to extract the best information from each one. the advantage of this approach over other optimization methods is that it allows the best solution to emerge from the best of prior solutions, thereby creating new and fitter individuals. the ga approach has been used effectively in optimization problems. mutation: a parent solution is picked and altered to produce a new solution. the fitted non-linear second order response surface model given above was considered as the objective function.
after 1000 iterations, the optimization results are summarized as follows: the fitness function value is 12.24945. the optimum solution was obtained with a crossover probability of 0.8, which is quite high; the mutation probability is 0.1, which is on the lower side, as desirable. the final fitness value for the optimum solution is 12.25. the optimum combination comprised an ultrasonic amplitude of 100%, a time of 180 sec and a solvent-to-solute ratio of 5.59 for an inulin yield of 12.79%. it is to be noted that the fitness function value itself is the optimal value at the optimal solution point obtained through the genetic algorithm approach. iteration results are presented in fig 3a. the predicted value was validated in the laboratory and the result is presented in table 3. for the optimization of mae a similar experiment was planned and the data were analysed using genetic algorithms. the estimated parameters are presented in table 5 and table s3. the fitted equation in this case is a second-order polynomial in a (power), b (time) and c (solvent-to-solute ratio), with the estimated coefficients given in table 5 and table s3. since the model is highly significant, the fitted model was used in a genetic algorithm optimization to find the optimal solution. the fitness function value is 11.53069. the optimum solution was obtained with a crossover probability of 0.8, which is quite high; the mutation probability is 0.1, which is on the lower side, as desirable. the final fitness value for the optimum solution is 11.53069. the optimum combination comprised a microwave power of 900 w, a time of 180 sec and a solvent-to-solute ratio of 5.22 for an inulin yield of 11.53%. iteration results are presented in fig 3b. the data coming out of the analysis required validation; this was done in the laboratory and the result is presented in table 3.
ultrasonic sound waves seem to rupture the matrix more intensely than all other extraction protocols. acoustic cavitation led to severe damage of the outermost cells and helped in the formation of bigger cracks at the surface, which enhanced the maximum release of inulin from the matrix to the bulk solvent. better disintegration facilitated better penetration of solvent inside the matrix, and the subsequent acoustic cavitation yielded better extraction efficiency (xia et al., 2011). the pattern was different in the mae residual material, where more disintegration at the surface level was observed and the texture was crumbled in a significant manner (dahmoune et al., 2015). microwave irradiation, along with the rise in temperature, helped the disintegration, and thus the release of inulin into the adjacent solvent was facilitated. scanning electron micrographs of p. radiata bark upon soxhlet, uae and mae extraction of phenolics showed significant cell destruction (aspe and fernandez, 2011). the better yield of the uae and mae methods than the conventional one was attributed to disruption of the cellular structure followed by better penetration of solvent inside the matrix with acoustic cavitation/enhanced temperature. on the contrary, diffusion of solvent inside the matrix following fick's law of diffusion, which leads to solubilization of inulin and mass transfer to the bulk solution, is the major mechanism in the conventional extraction method. the purity per cent of inulin was measured for uae- and mae-extracted inulin. in both methods, minimum and maximum exposure conditions were selected for the comparison study: for uae, 60% amplitude for 120 sec and 100% amplitude for 180 sec; for mae, 300 w for 120 sec and 900 w for 180 sec. data of the experiment are presented in table 6. in general, there is no difference between the maximum and minimum exposure conditions of uae and mae, whereas a shade of difference was observed between uae and mae in terms of purity of inulin.
purity was least (56.7%) in the conventionally extracted inulin, which might be attributed to greater extraction of unwanted materials due to the long exposure time at the boiling condition of water. there are a few reports on the effect of processing, specifically heating, on the degradation of inulin. the hydrolysis kinetics of fructo-oligosaccharides have been studied across a ph range and a temperature range of 80-120 °c (l'homme et al., 2003). at 90-100 °c, complete degradation of fructo-oligosaccharide oligomers was reported within 1-1.5 h (matusek et al., 2009). in the present study, the degree of polymerization was highest in uae-extracted inulin and was superior to that of the inulin extracted by the mae and conventional methods. this might be attributed to greater heating in the mae and conventional methods. in this study, the growth of the microbe lactobacillus fermentum was first observed to decide the incubation period for the prebiotic activity assay. it was seen that the lag phase lasted for 18 h. the strain then grew exponentially, which suggested it was in the logarithmic phase. after that, the stationary phase began at 60 h, and the population started to decline after 72 h. that is why prebiotic activity was studied up to 72 h after culture fermentation, to understand the effect of inulin on the microbial population. a related study evaluated inulin-rich carbohydrates using lactobacillus paracasei and found significant prebiotic activity. as these fructo-oligosaccharide substances get fermented, different organic acids are formed that help to increase the microbiome, making these substances an effective prebiotic ingredient. similar results were reported by caleffi et al., 2015, where pfaffia inulin was found to be highly active as a prebiotic when evaluated using bifidobacterial and lactobacillary populations. the present result was further confirmed by counting bacterial colonies from each respective culture medium having glucose or inulin as the carbon source, along with a control, which showed a similar kind of trend as before (fig. 5b).
a higher bacterial population was observed in the case of inulin than the others up to 72 h, showing a nearly 36.4% increase compared to glucose as the carbon source. energy density was measured for the optimal conditions of the uae and mae experiments in order to compare the two extraction principles. to compare the efficacy of these two extraction techniques, the energy density ev was evaluated; ev is the energy dissipated to the system by the extracting equipment. from the energy density value, the more efficient extraction system can be selected. the ev delivered by the optimized condition of mae is far greater than by the uae conditions; thus, uae proved to be energy efficient. the present result is in agreement with the results reported by chen et al. analysis of energy consumption is a prerequisite for any technology with the potential to be scaled up to the industrial level. the energy consumption pattern in each experiment is presented in table 7. total energy consumption was highest in the conventional method (180 kj), and the uae experiments proved to be energy efficient as compared to the mae experiments. the same trend was reflected in the carbon footprint pattern. the ultrasonication process generates ultrasound, mechanical acoustic waves, and produces acoustic cavitation during extraction; a significant amount of energy is required to generate it, and after cell disruption the energy is converted into thermal energy, whereas for microwave radiation more energy is required to produce it. process intensification for the production of inulin was successfully optimized by the uae and mae techniques by optimising all the variables. the amplitude of the ultrasonicator/power of the microwave oven, time and solute-to-solvent ratio were optimized by ga and rsm techniques. better extraction was achieved by uae as compared to mae, and both of these techniques were better than conventional hot extraction. both techniques provided comparable inulin yields.
the genetic algorithm approach corroborated the optimized data produced by rsm; thus both, or either one, can be used for the optimization of inulin extraction from these matrices. the reason behind the better extraction by uae was confirmed by sem analysis of the matrices. sem pictures of the matrices after extraction revealed a clear picture of the style of disruption of the cellular structure. microfractures were observed in root tissues extracted by uae, whereas surface modification was observed in the mae materials. when the extracted materials were compared with the initial root tissues, the difference in their rupture patterns was clearly observed. interestingly, uae provided a shade better purity of extracted inulin than the other two techniques. the degree of polymerization of the inulin was also recorded to be better. the higher temperature in the mae and conventional methods might account for the slight degradation of inulin. significant prebiotic activity was recorded in the evaluation using lactobacillus fermentum, and it was 36% more than the glucose treatment. the enhancement in microbial count significantly confirmed the activity. when the two technologies were compared for energy efficiency, uae required a far lower energy density than mae. carbon emission was also comparatively a shade lower in the uae experiments than the other techniques. thus uae can be considered more feasible for mass production at the industrial level, and it should be sustainable in the long run.
rohan sarkar: investigation, writing-original draft; arpan bhowmik: conceptualization, statistical and computational analysis; aditi kundu: writing-editing, methodology; anirban dutta: methodology, validation; lata nain: experimentation related to prebiotic activity; goutam chawla: microscopic analysis, manuscript editing; supradip saha: conceptualization, investigation, writing-editing.

all the authors declare that there are no known competing financial interests or personal relationships that could have influenced the investigation reported in this paper.

authors are thankful to the head, division of agricultural chemicals, for providing all the required facilities for the execution of the experiment. authors are also grateful to icar for providing a fellowship to the first author.

references:
- ultrasonically extracted β-d-glucan from artificially cultivated mushroom, characteristic properties and antioxidant activity
- the effect of different extraction techniques on extraction yield, total phenolic, and anti-radical capacity of extracts from pinus radiata bark
- comparisons between conventional, microwave- and ultrasound-assisted methods for extraction of pectin from grapefruit
- inulin - a versatile polysaccharide with multiple pharmaceutical and food chemical uses
- isolation and prebiotic activity of inulin-type fructan extracted from pfaffia glomerata (spreng) pedersen roots
- conventional and unconventional recovery of inulin rich extracts for food use from the roots of globe artichoke
- simultaneous optimization of the ultrasound-assisted extraction for phenolic compounds content and antioxidant activity of lycium ruthenicum murr. fruit using response surface methodology
- optimization of microwave-assisted extraction of polyphenols from myrtus communis l. leaves
- effect of extraction methods on the properties and antioxidant activities of chuanminshen violaceum polysaccharides
- phenol sulphuric acid method for total carbohydrate
- mathematical modeling and genetic algorithm optimization of clove oil extraction with supercritical carbon dioxide
- ultrasound versus microwave as green processes for extraction of rosmarinic, carnosic and ursolic acids from rosemary
- ultrasonication assisted ultrafast extraction of tagetes erecta in water: cannonading antimicrobial, antioxidant components
- determination of inulin in milk using high-performance liquid chromatography with evaporative light scattering detection
- microwave-assisted extraction of inulin from chicory roots using response surface methodology
- kinetics of hydrolysis of fructooligosaccharides in mineral-buffered aqueous solutions: influence of ph and temperature
- determination of inulin-type fructooligosaccharides in edible plants by high-performance liquid chromatography with charged aerosol detector
- development of a combined trifluoroacetic acid hydrolysis and hplc-elsd method to identify and quantify inulin recovered from jerusalem artichoke assisted by ultrasound extraction
- adsorption of uranium by amidoximated chitosan-grafted polyacrylonitrile, using response surface methodology. carbohydrate polymers
- enhanced expression of the codon-optimized exo-inulinase gene from the yeast meyerozyma guilliermondii in saccharomyces sp. w0 and bioethanol production from inulin
- preparation of inulin and phenols-rich dietary fibre powder from burdock root
- effect of temperature and ph on the degradation of fructo-oligosaccharides
- extraction of inulin from burdock root (arctium lappa) using high intensity ultrasound
- use of dinitrosalicylic acid reagent for determination of reducing sugar
- pectin extraction from helianthus annuus (sunflower) heads using rsm and ann modelling by a genetic algorithm approach
- continuous production of pectic oligosaccharides in an enzyme membrane reactor
- antioxidant activity and fructan content in root extracts from elecampane (inula helenium l.)
- flower heads of onopordum tauricum willd. and carduus acanthoides l. - source of prebiotics and antioxidants
- optimizing the antioxidant biocompound recovery from peach waste extraction assisted by ultrasounds or microwaves
- in vitro prebiotic activity of inulin-rich carbohydrates extracted from jerusalem artichoke (helianthus tuberosus l.) tubers at different storage times by lactobacillus paracasei
- extraction of bioactive carbohydrates from artichoke (cynara scolymus l.) external bracts using microwave assisted extraction and pressurized liquid extraction
- a comparative study on the effect of conventional thermal pasteurisation, microwave and ultrasound treatments on the antioxidant activity of five fruit juices
- evaluation of the response surface and hybrid artificial neural network-genetic algorithm methodologies to determine extraction yield of ferulago angulata through supercritical fluid
- characterization of inulin from helianthus tuberosus l. obtained by different extraction methods - comparative study
- on the generation of cost effective response surface designs
- comparisons between conventional, ultrasound-assisted and microwave-assisted methods for extraction of anthraquinones from heterophyllaea pustulata hook f. (rubiaceae)
- ultrasonically assisted extraction (uae) and microwave assisted extraction (mae) of functional compounds from plant materials
- ultrasound-assisted extraction of phillyrin from forsythia suspensa
- microwave-assisted extraction of lactones from ligusticum chuanxiong hort. using protic ionic liquids

key: cord-274209-n0aast22
authors: yaro, david; apeanti, wilson osafo; akuamoah, saviour worlanyo; lu, dianchen
title: analysis and optimal control of fractional-order transmission of a respiratory epidemic model
date: 2019-07-15
journal: int j appl comput math
doi: 10.1007/s40819-019-0699-7
sha: doc_id: 274209 cord_uid: n0aast22

the world health organization is yet to realise the global aim of achieving a future free of, and eliminating the transmission of, respiratory diseases such as h1n1, sars and ebola, given the recent reemergence of ebola in the democratic republic of congo. in this paper, a caputo fractional-order derivative is applied to a system of non-integer-order differential equations to model the transmission dynamics of respiratory diseases. the non-negative solutions of the system are obtained by using the generalized mean value theorem. the next generation matrix approach is used to obtain the basic reproduction number [formula: see text]. we discuss the stability of the disease-free equilibrium when [formula: see text], and the necessary conditions for the stability of the endemic equilibrium when [formula: see text].
a sensitivity analysis shows that [formula: see text] is most sensitive to the probability of the disease transmission rate. the results from the numerical simulations of optimal control strategies disclose that the utmost way of controlling, or probably eradicating, the transmission of respiratory diseases should be quarantining the exposed individuals and monitoring and treating infected people for a substantial period. the respiratory syncytial virus, influenza virus and parainfluenza virus are some viruses that cause respiratory diseases such as influenza [1]. influenza viruses are grouped into type a and type b. the viruses are often transmitted between people and cause seasonal influenza epidemics each year. the prevalence of infectious diseases that occur through the respiratory tract is increasing every year, and their spread is rapid and widespread. since 1980, such diseases have appeared repeatedly, for example bird flu in asia. the outbreak of the 2014 ebola virus in west africa, severe acute respiratory syndrome (sars) and the middle east respiratory syndrome (mers) caused severe impacts on health systems. most respiratory diseases, such as sars and ebola, have no vaccines; these diseases spread quickly and may cause re-infection. within a flu season, several different types of influenza (types a and b) and subtypes (of type a) circulate and cause disease. for example, the bird flu virus has several subtypes, such as h5n1, h7n3, h7n7, h7n9, and h9n2, which can infect humans [1]. epidemiological mathematical models have proven to be valuable tools for understanding and analyzing influenza virus infection dynamics and for recommending control strategies. although a great deal of work has been done on dynamic modeling of influenza [2-6], it is limited to ordinary differential equations.
however, it has recently been found that the use of fractional differential equations to model phenomena in many different fields has been very successful [7-18]. for instance, in mathematical epidemiology, the ebola virus epidemic has been modeled with fractional-order differential equations by [19]. they used a fractional-order seir model to analyze data published by the who to provide projections of outbreaks in three countries in west africa; the model fits the real data precisely. their findings revealed that the outbreak would last about two years, with an estimated 9 million infected people. although individuals' heredity plays an important role in the mortality rate of the disease, the data analysis shows that the predicted death toll is very high. goufo et al. [20] carried out a stability analysis of a non-linear ebola hemorrhagic fever epidemic spread model. they used conventional time derivatives to express models that contain new parameters which happen to be fractional. they proved that the model is well defined and possesses non-negative solutions, and they also established conditions for boundedness. the routh-hurwitz criterion was used to show the existence and stability of the ebola virus model equilibrium states, which were shown to depend strongly on the non-linear propagation. they also provided the conditions for the persistence of the ebola virus in the system. in addition, numerical simulations of the non-linear propagation were provided, and the results obtained are significant for combating and preventing ebola hemorrhagic fever, which has so far caused the deaths of hundreds of families and continues to infect many people in west africa. fractional diffusion mimics the human mobility network when simulating disease outbreaks [21]. human mobility networks can smooth the spread of infectious diseases, from the sidewalk to the flight route.
efforts at control and elimination depend on describing these networks in terms of individual connections and flux rates between contact nodes. in some cases, transportation can be parametrized by a gravity-type model or approximated by diffusive random walks; alternatively, [21] treated domestic commercial air traffic as a case study of the utility of non-diffusive, heavy-tailed transport models. a new stochastic simulation of typical influenza-like infections was adopted, targeting the dense, highly connected american air travel network. they showed that mobility on the network can be described mainly by a power law, consistent with previous studies. they observed that the global evolution of an outbreak on the network is precisely reproduced by a two-parameter space-fractional diffusion equation, which is determined by the air travel network. the dynamic changes of cytotoxic t-lymphocyte (ctl) responses in ebola virus infection in vivo are described by a fractional-order model with time delay in [22]. they introduced a time delay during the ctl response period to represent the time required to stimulate the immune system. stability and hopf bifurcation results were obtained for the model through the fractional laplace transform. moreover, the stability conditions show that the dynamics of the model can be improved by the fractional order and the delay. in [23] a non-integer-order sir epidemic model under conditions of external noise is studied. the behavior of the system changes with the introduction of seasonality and a noise force, and different non-integer orders and parameters are shown to improve the dynamical behavior of the system. multi-scale fuzzy entropy is used to study the complexity of the stochastic model. they designed a hard-limiter control system, and simulation results showed that with effective medical and health measures the proportion of infected people can be kept to significantly small numbers.
compared with integer-order models, non-integer-order models are more effective in modeling biological systems which have long-range and temporal memories [24]. in [25], gonzález-parra et al. proposed a nonlinear fractional-order system to discuss outbreaks of h1n1 influenza. a fractional model does not rely only on the present state but also on all of its history, which makes fractional models more general than integer-order models. they also determined that the nonlinear fractional epidemiological model can be matched well to provide numerical results that are in good agreement with the actual data for h1n1 influenza. the model gives valuable facts for understanding, predicting and controlling the spread of different epidemics across the globe. for these reasons, the fact that fractional (non-integer) order models of many phenomena have recently been shown to be more realistic than integer-order ones, together with the long-range memory that fractional orders carry, motivated this study. in addition, modeling with fractional-order derivatives can capture a rich variety of dynamics observed in natural systems. many past models focused only on the analysis of influenza outbreaks in human populations, but a model of integer order describing the transmission of a respiratory epidemic has recently been proposed in [26]. modeling the transmission of a respiratory epidemic by a fractional-order model might provide a feasible alternative for controlling or probably eradicating the disease. the present study introduces a caputo fractional order into the seir-type epidemic model proposed by [26]. the proposed model is analyzed without control measures to investigate its stability conditions. the sensitivity of the basic reproduction number r 0 is analyzed to determine the most sensitive parameters. the interpretation of the sensitivity leads us to two control measures.
the optimal control theory is then used to investigate the efficiency of incorporating the control measures, namely, quarantine of the exposed population and monitoring and treating of the infected population. the remaining part of the paper is structured as follows: we provide some basic and necessary definitions from fractional calculus in section "fractional calculus". the description of the model is discussed in section "the model". the non-negative solutions and the stability analysis of the fractional-order differential equations are discussed in sections "non-negative solutions" and "model equilibrium states and stability" respectively. the optimal control of the epidemic is discussed in section "optimal control". a numerical solution of the fractional model using the atanackovic and stankovic method is discussed in section "numerical method". we support our theoretical analysis with numerical simulations in section "numerical simulation and discussion". finally, the paper concludes in section "conclusion". in this section, we define the fractional integral and the fractional derivatives of riemann-liouville and caputo type, respectively, which are applied in this work.

definition 1 [27]. the fractional integral operator of order σ > 0 of a function h : r+ → r, defined by

i^σ h(t) = (1/Γ(σ)) ∫_0^t (t − s)^(σ−1) h(s) ds,

is called the riemann-liouville fractional integral. here and elsewhere, Γ denotes the euler gamma function, defined as

Γ(z) = ∫_0^∞ t^(z−1) e^(−t) dt.

definition 2 [12]. the derivative of fractional order σ, where σ lies in the half-open interval (0, 1], defined by

d^σ_{*a} h(t) = (1/Γ(1 − σ)) ∫_a^t (t − s)^(−σ) h′(s) ds,

is called the caputo fractional derivative, where a is the starting point. we use the caputo definition in this work. a core advantage is that the initial conditions for fractional differential equations with the caputo derivative take the same form as for ordinary differential equations. the model we studied in this work was proposed by [26].
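One common way to evaluate a Caputo derivative of order 0 < σ < 1 numerically is the L1 finite-difference scheme. This is not the method used in the paper; it is a standard discretization consistent with definition 2, shown here as a sketch. The scheme is exact for linear functions, which gives a clean check.

```python
import math

def caputo_l1(f_vals, dt, sigma):
    """L1 finite-difference approximation of the caputo derivative of order
    0 < sigma < 1 at the final grid point t_n = n*dt.
    f_vals: samples f(0), f(dt), ..., f(n*dt)."""
    n = len(f_vals) - 1
    c = dt**(-sigma) / math.gamma(2 - sigma)
    s = 0.0
    for k in range(n):
        # L1 weights b_k = (k+1)^(1-sigma) - k^(1-sigma)
        b = (k + 1)**(1 - sigma) - k**(1 - sigma)
        s += b * (f_vals[n - k] - f_vals[n - k - 1])
    return c * s

# check against the exact caputo derivative of f(t) = t:
# D^sigma t = t^(1-sigma) / Gamma(2-sigma)
sigma, dt, n = 0.7, 0.01, 100
t = n * dt
approx = caputo_l1([k * dt for k in range(n + 1)], dt, sigma)
exact = t**(1 - sigma) / math.gamma(2 - sigma)
print(approx, exact)
```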
the model considers four variables, namely, the susceptible population s(ξ), the exposed population e(ξ), the infected population i(ξ), and the recovered population r(ξ). according to [26], b(ξ) represents the rate of new susceptible people entering the population at time ξ, β represents the probability of disease transmission, ν represents the seroconversion rate, α represents the recovery rate, μn represents the natural mortality rate, μd represents the disease-induced death rate, and κ represents the rate at which the recovered return to the susceptible population (due to the loss of immunity). these assumptions lead to the integer-order differential system (4) presented by [26]; the fractional-order version of system (4) is system (5), with initial conditions (6), where d^σ_{*a} = d^σ/dξ^σ is the caputo fractional derivative of order σ. it is important to notice that when the fractional order σ → 1, system (5) becomes the integer-order system (4). we need the following lemma from [28] to prove the non-negativity of solutions of system (5).

lemma 4.1 (generalized mean value theorem). suppose h(m) ∈ c[d, e] and d^σ_{*a} h(m) ∈ c(d, e] for 0 < σ ≤ 1 and d, e ∈ r. then h(m) = h(d) + (1/Γ(σ)) d^σ_{*a} h(ℓ) (m − d)^σ, with d ≤ ℓ ≤ m, for all m ∈ (d, e].

remark 4.1. assume h(m) ∈ c[0, e] and d^σ_{*a} h(m) ∈ c(0, e] for 0 < σ ≤ 1. it follows from lemma 4.1 that if d^σ_{*a} h(m) ≥ 0 for all m ∈ (0, e), then h(m) is non-decreasing on [0, e], and if d^σ_{*a} h(m) ≤ 0 for all m ∈ (0, e), then h(m) is non-increasing on [0, e].

proof. using theorem 3.1 together with remark 3.2 in [29], we obtain the existence and uniqueness of the solution of (5)-(6) in (0, ∞). we prove that the domain r4+ of the model is positively invariant, since on each hyperplane bounding the non-negative orthant the vector field points into r4+.
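As an illustration of an SEIR-type system with the parameter roles described above: the sketch below assumes mass-action transmission βSI and an R → S immunity-loss flow κR. The paper's exact equations are not reproduced here, and the parameter values are illustrative, not taken from the paper.

```python
import numpy as np

# illustrative parameter values (not the paper's table 1)
b, beta, nu, alpha = 2.0, 0.0005, 0.2, 0.1
mu_n, mu_d, kappa = 0.001, 0.005, 0.01

def rhs(y):
    S, E, I, R = y
    # hedged SEIRS structure consistent with the parameter roles described:
    # mass-action transmission beta*S*I; the paper's exact form may differ
    dS = b - beta*S*I - mu_n*S + kappa*R
    dE = beta*S*I - nu*E - mu_n*E
    dI = nu*E - alpha*I - mu_n*I - mu_d*I
    dR = alpha*I - kappa*R - mu_n*R
    return np.array([dS, dE, dI, dR])

def rk4(y0, h, steps):
    """classical fourth-order runge-kutta integration of the autonomous system."""
    y = np.array(y0, dtype=float)
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs(y + h/2 * k1)
        k3 = rhs(y + h/2 * k2)
        k4 = rhs(y + h * k3)
        y = y + h/6 * (k1 + 2*k2 + 2*k3 + k4)
    return y

final = rk4([1000, 10, 5, 0], 0.05, 4000)   # integrate to t = 200
print(final)
```

Non-negativity of the trajectory is the numerical counterpart of the positive-invariance argument in the proof above.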
for the equilibrium states of the fractional-order model (5), we set the right-hand sides to zero. the equilibrium states are the disease-free equilibrium state k0, where the number of infectives equals zero (i = 0), and the endemic equilibrium state k1, where the number of infectives is non-zero (i ≠ 0). the basic reproduction number of system (5) is obtained using the next generation matrix approach as r 0 = ρ(f v^{-1}), the spectral radius of f v^{-1}, where f is non-negative and v is a non-singular m-matrix. applying this method to system (5), evaluating f and v at the disease-free equilibrium k0 and using k = f v^{-1}, we obtain r 0 as in (8). the jacobian matrix j(k0) of system (5) evaluated at k0 determines local stability.

theorem. the disease-free equilibrium state k0 of system (5) is locally asymptotically stable if r 0 < 1.

proof. the disease-free equilibrium state k0 is locally asymptotically stable provided that the eigenvalues λl, l = 1, 2, 3, 4, of j(k0) satisfy the condition [30, 31]

|arg(λl)| > σπ/2. (9)

these eigenvalues can be evaluated by solving the characteristic equation det(j(k0) − λi) = 0. obviously p + q > 0, and if p q > w, then all the eigenvalues λl, l = 1, 2, 3, 4, satisfy the condition given by (9). the basic reproduction number r 0 is defined as the number of cases occurring in a completely susceptible population. biologically, if r 0 is less than one, then the infection will disappear, but if it is more than one, the infection persists. for the discussion of the asymptotic stability of the persistence of the disease in system (5), we need the following definition and lemma 5.1.

definition 3 [32]. the discriminant d(h) of a polynomial h(y) = y^m + c1 y^(m−1) + c2 y^(m−2) + · · · + cm (10) is the determinant of the corresponding (2m − 1) × (2m − 1) sylvester matrix of h(y) and h′(y). the sylvester matrix is formed by filling the matrix, beginning with the upper-left corner, with the coefficients of h(y), then shifting down one row and one column to the right.
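The next generation matrix computation r 0 = ρ(FV⁻¹) can be sketched numerically. The F/V splitting below is the standard one for an SEIR-type model with exposed and infected compartments; it is stated as an assumption rather than as the paper's exact matrices, and the parameter values are illustrative.

```python
import numpy as np

# illustrative parameter values (not the paper's table 1 values)
beta, S0, nu, alpha = 0.0005, 1000.0, 0.2, 0.1
mu_n, mu_d = 0.001, 0.005

# hedged SEIR-type splitting for the infected compartments (E, I):
# F holds new-infection terms, V holds transition terms
F = np.array([[0.0, beta * S0],
              [0.0, 0.0]])
V = np.array([[nu + mu_n, 0.0],
              [-nu, alpha + mu_n + mu_d]])

K = F @ np.linalg.inv(V)              # next generation matrix K = F V^-1
R0 = max(abs(np.linalg.eigvals(K)))   # spectral radius rho(K)

# under this splitting the closed form is beta*S0*nu / ((nu+mu_n)(alpha+mu_n+mu_d))
closed_form = beta * S0 * nu / ((nu + mu_n) * (alpha + mu_n + mu_d))
print(R0, closed_form)
```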
the process is then repeated for the coefficients of h′(y). lemma 5.1 [32]. for the polynomial equation (11), the conditions displayed below make all the roots of (11) satisfy (9): 1. for m = 1, the condition is b1 > 0. 2. for m = 2, the conditions are either the routh-hurwitz conditions, or b1 < 0 together with conditions under which (11) is satisfied for all σ ∈ [0, 1). 6. for general m, bm > 0 is a necessary condition for (11) to be satisfied. the jacobian matrix j(k1), calculated at the disease-persistence equilibrium, yields through the characteristic equation det(j(k1) − λi) = 0 a linearized characteristic polynomial of the form (12), with coefficients d1, d2, d3 and d4. lemma 5.2. by condition 6 of lemma 5.1, the positive equilibrium point k1 is locally asymptotically stable, since the polynomial h(λ) given in (12) has positive coefficients d1, d2, d3, and d4. here, we investigate the response of r 0 to parameter changes and determine the effect of each parameter on r 0 and the potential for effective control and elimination of the disease. it is straightforward to calculate the partial derivatives of r 0, given by eq. (8), with respect to the parameters β, ν^σ, b^σ, μ^σ_n, μ^σ_d and the recovery rate α^σ. with all other parameters held constant, the elasticity e_x (the variable's normalized forward sensitivity index) approximates the fractional change in r 0 that results from a unit fractional change in parameter x, defined as

e_x = (x / r 0) · (∂r 0 / ∂x).

this index shows how sensitive r 0 is to changes in parameter x. specifically, a positive (negative) index shows that an increase in the parameter value results in an increase (decrease) of r 0 [33].
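The fractional-order stability condition |arg(λ)| > σπ/2 on the eigenvalues of a Jacobian (Matignon's criterion, condition (9) above) can be checked directly. The matrices below are toy examples, not the paper's Jacobians.

```python
import numpy as np

def matignon_stable(J, sigma):
    """check the fractional-order stability condition |arg(lam)| > sigma*pi/2
    for every eigenvalue lam of the jacobian J (matignon's criterion)."""
    lams = np.linalg.eigvals(J)
    return all(abs(np.angle(l)) > sigma * np.pi / 2 for l in lams)

# toy jacobian with eigenvalues -1 and -2 (negative real axis, |arg| = pi),
# so the condition holds for every sigma in (0, 1]
J = np.array([[-1.0, 0.0],
              [0.0, -2.0]])
print(matignon_stable(J, 0.9))           # True

# eigenvalues with a small positive real part can still satisfy the condition
# for small enough sigma: lam = 0.1 +/- 1j gives |arg| ~ 1.471 rad, so the
# system is "fractionally stable" at sigma = 0.9 but not at sigma = 0.99
J2 = np.array([[0.1, -1.0],
               [1.0, 0.1]])
print(matignon_stable(J2, 0.9), matignon_stable(J2, 0.99))   # True False
```

This widened stability sector (compared with the integer-order left half-plane) is exactly why smaller fractional orders σ can stabilize dynamics that the σ = 1 system cannot.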
the elasticities for the quantities of interest follow from (8). figure 1 indicates that r 0 is most sensitive to β, the disease transmission rate, followed by α^σ, the recovery rate. the seroconversion rate ν^σ and the natural death rate μ^σ_n have sensitivity indices of the same magnitude. it can also be observed that r 0 is least sensitive to μ^σ_d, the disease-induced death rate. in detail, the sensitivity indices for α^σ, β, ν^σ, μ^σ_d, and μ^σ_n are found to be −0.9999988, 1, 0.0058, −0.00000122, and −0.0058 respectively, once all parameters are fixed at their baseline values (fig. 1, the sensitivity analysis of the basic reproduction number). thus, for instance, if the rate of recovery were to increase (or decrease) by 10%, then the value of r 0 would decrease (or increase) by 9.999988%. likewise, a 10% increase (or decrease) of the disease transmission rate would correspond to a 10% increase (or decrease) of r 0; a 10% increase (or decrease) of the seroconversion rate would increase (or decrease) r 0 by 0.058%; a 10% increase (or decrease) of the disease-induced death rate would decrease (or increase) the value of r 0 by 0.0000122%; and a 10% increase (or decrease) of the natural death rate would correspond to a 0.058% decrease (or increase) in the value of r 0. therefore, these interpretations recommend control strategies that efficiently decrease the probability of disease transmission β; additionally, increasing the rate of recovery α^σ will lead to a decrease in r 0, so all control strategies that can effectively help reduce the transmission of respiratory diseases should be applied. the mathematical perspective on these strategies is detailed in the next section. in this section, we extend the model in (6) by introducing two time-dependent control measures, namely u1(ξ) (quarantine of the exposed population) and u2(ξ) (monitoring and treatment of the infected population). it is assumed that the exposed population is reduced by the factor (1 − u1(ξ)) as they are quarantined.
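The normalized forward sensitivity index e_x = (x/r0)(∂r0/∂x) can be approximated by finite differences. The closed form for r0 used below is a hedged SEIR-type expression chosen to be consistent with the signs reported above (e_β = 1, e_α < 0, e_ν > 0); the parameter values are illustrative, not those of table 1.

```python
def R0(beta, nu, alpha, mu_n, mu_d, S=1.0):
    # hedged SEIR-type closed form, an assumption consistent with the
    # signs of the indices reported in the text
    return beta * S * nu / ((nu + mu_n) * (alpha + mu_n + mu_d))

def elasticity(param, base):
    """normalized forward sensitivity index e_x = (x/R0) * dR0/dx,
    approximated by a central finite difference."""
    x = base[param]
    h = 1e-6 * x
    up, dn = dict(base), dict(base)
    up[param] = x + h
    dn[param] = x - h
    d = (R0(**up) - R0(**dn)) / (2 * h)
    return x * d / R0(**base)

base = dict(beta=0.5, nu=0.2, alpha=0.3, mu_n=0.00117, mu_d=3e-7)
for p in base:
    print(p, round(elasticity(p, base), 6))
```

Because r0 is linear in β, its elasticity is exactly 1, matching the text's observation that r0 responds one-for-one to changes in the transmission probability.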
furthermore, the infected population is reduced by a factor of (1 − u2(ξ)) as they are monitored and treated by health professionals. the model system (6) becomes where e is the exposed population and i is the infected population. t is the final time and the coefficients c1, c2, c3, c4 are positive weights. our aim is to minimize the exposed and infected populations while minimizing the cost of the controls u1, u2. thus, we search for an optimal control pair u*1, u*2 such that where the control set is the terms c1e and c2i represent the cost of reducing the exposed and infected populations respectively, while c3u1² is the cost of quarantine and c4u2² is the cost of monitoring and treatment. the necessary conditions that an optimal control must satisfy come from pontryagin's minimum principle [34-37]. this principle converts eqs. (6) and (18) into a problem of point-wise minimization of a hamiltonian m with respect to (u1, u2), stated as follows: where λs, λe, λi and λr are the adjoint (co-state) variables [34-37]. the transversality conditions are λs(t) = λe(t) = λi(t) = λr(t) = 0. on the interior of the control set, where 0 < ui < 1 for i = 1, 2, we obtain the control pair (u*1, u*2) that minimizes j(u1, u2) over u, given by where λs, λe, λi and λr are the adjoint variables satisfying (6) and the following transversality conditions: the atanackovic and stankovic [38] numerical method for fractional differential equations (fdes) is discussed in this section. this method indicates that the fractional derivative of a function f(ξ) with order σ may be stated as where with the following properties: by using k terms of the sums appearing in (20), we can approximate dσg(ξ). the above equation can be written as dσ*a g(ξ) ≈ ω(σ, ξ, k)g(1)(ξ), where we set, for k = 2, 3, . . .
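in characterizations of this type, pontryagin's principle typically yields the interior minimizer of the hamiltonian, which is then projected onto the admissible set [0, 1], i.e. u* = min(1, max(0, u)). a minimal sketch; the interior expression in the comment is a hypothetical form, not the paper's actual formula:

```python
def optimal_control(unconstrained_u):
    """Project the unconstrained minimizer of the Hamiltonian onto the
    admissible control set [0, 1]: u* = min(1, max(0, u)).  The interior
    minimizer itself is model-specific and is passed in directly."""
    return min(1.0, max(0.0, unconstrained_u))

# e.g. solving the interior stationarity condition dM/du1 = 0 for u1
# might give an expression of the (hypothetical) form
#   u1 = (lam_e - lam_s) * beta * s * i / (2 * c3)
# which is then passed through optimal_control().
```

the quadratic cost terms c3u1² and c4u2² are what make this interior minimizer well defined (the hamiltonian is strictly convex in each control).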
we can rewrite system (7) as where now we can rewrite (23) and (25) as follows: the system (27) with (28) can be solved numerically by using the fourth-order runge-kutta method. fig. 4: the simulations displaying the effect of u1 (quarantine of the exposed population groups) and u2 (monitoring and treatment of the infected people) on (a) the exposed population and (b) the infected population. in this section, we present numerical simulations to confirm the theoretical results obtained in the preceding section. by using the well-known generalized euler method (gem) [39] with the values in table 1, we simulate system (5). for the parameter values in table 1 and by calculation, we obtained r0 = 1.5143 and the endemic equilibrium k1 = (1679.128, 858.647, 4.883, 3.712). we obtained r0 = 1.5143, d(p) = 2.389 > 0, and the simulations in fig. 2 show that the endemic equilibrium k1 is positive and locally asymptotically stable for σ = 0.70, σ = 0.80, σ = 0.90 and σ = 1.0, satisfying condition 6 of lemma 5.1. it can clearly be seen in fig. 2 that, compared with the situations of order σ = 0.70 and σ = 0.80, the trajectory of the system with order σ = 0.90 is nearer to the trajectory of the system with order σ = 1.0. thus, the bigger the trajectory difference, the more distant σ is from 1.0. it can be seen from fig. 3 that when σ = 1, system (5) reduces to the classical integer-order system (4). the display of the trajectories indicates the behavior of the approximate solutions of system (5) obtained for σ = 1. it can be seen from fig. 4 that the optimal controls u1 and u2 have a significant effect on the exposed and the infected populations respectively. the infection level is reduced rapidly but not eliminated. this suggests that monitoring and treatment strategies that allow the immune response to rebuild should also be well thought out (figs. 5, 6).
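the classical fourth-order runge-kutta step used to integrate system (27) with (28) can be sketched generically; the right-hand side and step size below are illustrative (a scalar test problem, not the paper's system):

```python
def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(t, y),
    where y is a list of state variables."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

# sanity check on y' = y, y(0) = 1: y(1) should approximate e
y, t, h = [1.0], 0.0, 0.01
for _ in range(100):
    y = rk4_step(lambda t, y: [y[0]], t, y, h)
    t += h
```

for the actual model, y would hold the four compartments (s, e, i, r) and f would evaluate the right-hand side of (27).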
we introduced a caputo fractional order into the classical integer-order model proposed by [26] to model the transmission dynamics of respiratory disease. the non-negativity of the solution of the model is established by using the generalized mean value theorem. we obtained the basic reproductive number r0, which acts as a threshold parameter in disease control. we established and investigated the stability of the fractional-order model with respect to the values of r0. the disease-free equilibrium is locally asymptotically stable if r0 < 1. for r0 > 1, using lemma 5.1 and condition 6, we investigated the local stability of the positive endemic equilibrium state. sensitivity analysis shows that r0 is most sensitive to the disease transmission rate β. this suggests that periodic monitoring by medical professionals and researchers should be done to control the transmission of the disease. additionally, we investigated the optimal control problem by application of optimal control theory. we used pontryagin's minimum principle to provide the necessary conditions for the existence of the optimal solution to the optimal control problem. we also applied the atanackovic and stankovic method to provide a numerical solution to system (5). lastly, the theoretical results were verified by numerical simulations that measure the efficacy and impact of control on the transmission of respiratory diseases. from the numerical simulation, the size of the infected population is significantly reduced under the controlled conditions. this suggests that if both control measures u1 (quarantine of exposed population groups) and u2 (monitoring and treatment of infected populations) are employed for the same period of time and continued for a considerable period, a future free from transmission of respiratory disease could be achieved. in this manner, the fractional-order optimal control method can improve the value of the treatment (fig.
7). the major advantages of our proposed fractional-order model, which cannot be exhibited by a classical integer-order model, are: -it is highly effective and efficient, which helps us to obtain better results. -it is easy to implement. -it provides improved precision of the process model by offering more flexibility in model identification. -by modeling a system with fractional order, we can model a higher-order system with a low-order model. -it has the effect of memory, which is an essential factor in many biological processes.
simple model for respiratory diseases
modeling the initial transmission dynamics of influenza a h1n1 in guangdong province
a dynamic compartmental model for the middle east respiratory syndrome outbreak in the republic of korea: a retrospective analysis on control interventions and superspreading events
application of the cdc ebolaresponse modeling tool to disease predictions
parameter estimation of influenza epidemic model
global dynamics of avian influenza epidemic models with psychological effect
a new fractional analysis on the interaction of hiv with cd4+ t-cells
new aspects of the poor nutrition in the life cycle within the fractional calculus
new aspects of the motion of a particle in a circular cavity
suboptimal control of fractional-order dynamic systems with delay argument
on the nonlinear dynamical systems within the generalized fractional derivatives with mittag-leffler kernel
a fractional calculus based model for the simulation of an outbreak of dengue fever
two-strain epidemic model involving fractional derivative with mittag-leffler kernel
functionality of circuit via modern fractional differentiations
a fractional model of vertical transmission and cure of vector-borne diseases pertaining to the atangana-baleanu fractional derivatives
the effect of anti-viral drug treatment of human immunodeficiency virus type 1 (hiv-1) described by a fractional order model
a fractional order seir model with vertical transmission
some properties of the kermack-mckendrick
epidemic model with fractional derivative and nonlinear incidence
on a fractional-order ebola epidemic model
stability analysis of epidemic models of ebola hemorrhagic fever with non-linear transmission
fractional diffusion emulates a human mobility network during a simulated disease outbreak. institute for disease modeling, intellectual ventures
a fractional-order delay differential model for ebola infection and cd8+ t-cells response: stability analysis and hopf bifurcation
epidemic outbreaks and its control using a fractional order model with seasonality and stochastic infection
a fractional-order model for ebola virus infection with delayed immune response on heterogeneous complex networks
a fractional-order epidemic model for the simulation of outbreaks of influenza a(h1n1)
applied mathematics for the analysis of biomedical data: models, methods, and matlab
fractional-order nonlinear systems: modeling, analysis and simulation
generalized taylor's formula
global existence theory and chaos control of fractional differential equations
equilibrium points, stability and numerical solutions of fractional-order predator-prey and rabies models
stability results for fractional differential equations with applications to control processing
on fractional-order differential equations model for nonlocal epidemics
determining important parameters in the spread of malaria through the sensitivity analysis of a mathematical model
an off-line nmpc strategy for continuous-time nonlinear systems using an extended modal series method
a novel feedforward-feedback suboptimal control of linear time-delay systems
an efficient finite difference method for the time-delay optimal control problems with time-varying delay
the mathematical theory of optimal processes
stability analysis for a fractional hiv infection model with nonlinear incidence
an algorithm for the numerical solution of differential equations of fractional-order
fitting dynamic models to epidemic outbreaks with quantified
uncertainty: a primer for parameter uncertainty, identifiability, and forecasts
key: cord-285897-ahysay2l authors: wu, guangyao; yang, pei; xie, yuanliang; woodruff, henry c.; rao, xiangang; guiot, julien; frix, anne-noelle; louis, renaud; moutschen, michel; li, jiawei; li, jing; yan, chenggong; du, dan; zhao, shengchao; ding, yi; liu, bin; sun, wenwu; albarello, fabrizio; d'abramo, alessandra; schininà, vincenzo; nicastri, emanuele; occhipinti, mariaelena; barisione, giovanni; barisione, emanuela; halilaj, iva; lovinfosse, pierre; wang, xiang; wu, jianlin; lambin, philippe title: development of a clinical decision support system for severity risk prediction and triage of covid-19 patients at hospital admission: an international multicenter study date: 2020-07-02 journal: eur respir j doi: 10.1183/13993003.01104-2020 sha: doc_id: 285897 cord_uid: ahysay2l background: the outbreak of the coronavirus disease 2019 (covid-19) has globally strained medical resources and caused significant mortality. objective: to develop and validate a machine-learning model based on clinical features for severity risk assessment and triage of covid-19 patients at hospital admission. method: 725 patients were used to train and validate the model, including a retrospective cohort of 299 hospitalised covid-19 patients at wuhan, china, from december 23, 2019, to february 13, 2020, and five cohorts with 426 patients from eight centers in china, italy, and belgium, from february 20, 2020, to march 21, 2020. the main outcome was the onset of severe or critical illness during hospitalisation. model performances were quantified using the area under the receiver operating characteristic curve (auc) and metrics derived from the confusion matrix. results: the median age was 50.0 years and 137 (45.8%) were men in the retrospective cohort.
the median age was 62.0 years and 236 (55.4%) were men in the five cohorts. the model was prospectively validated on five cohorts yielding aucs ranging from 0.84 to 0.89, with accuracies ranging from 74.4% to 87.5%, sensitivities ranging from 75.0% to 96.9%, and specificities ranging from 57.5% to 88.0%, all of which performed better than the pneumonia severity index. the cut-off values of the low, medium, and high-risk probabilities were 0.21 and 0.80. the online calculators can be found at www.covid19risk.ai. conclusion: the machine-learning model, nomogram, and online calculator might be useful to assess the onset of severe and critical illness among covid-19 patients and triage at hospital admission. in december 2019, a novel coronavirus, severe acute respiratory syndrome coronavirus 2 (sars-cov-2; earlier named 2019-ncov), emerged in wuhan, china [1]. the disease caused by sars-cov-2 was named coronavirus disease 2019 (covid-19). as of may 15, 2020, more than 4 490 000 covid-19 patients have been reported globally, and over 300 000 patients have died [2]. the outbreak of covid-19 has developed into a pandemic [3]. among covid-19 patients, around 80% present with mild illness whose symptoms usually disappear within two weeks [4]. however, around 20% of the patients may progress and necessitate hospitalization and increased medical support. the mortality rate for the severe patients is around 13.4% [4]. therefore, risk assessment of patients, preferably in a quantitative, non-subjective way, is extremely important for patient management and medical resource allocation. general quarantine and symptomatic treatment at home or in a mobile hospital can be used for most non-severe patients, while a higher level of care and a fast track to the intensive care unit (icu) is needed for severe patients.
previous studies have summarized the clinical and radiological characteristics of severe covid-19 patients, while the prognostic value of different variables is still unclear [5, 6] . several scoring systems that are in common clinical use (e.g. sequential organ failure assessment score, confusion-urea-respiratory rate-blood pressure-age 65, acute physiology and chronic health evaluation, etc.) could be applied to the triage problem, albeit each with their own problems and limitations, such as the need for laboratory variables that are hard to obtain at hospital admission [7] . the pneumonia severity index (psi) stands out as it is used to assess the probability of severity and mortality among adult patients with community-acquired pneumonia and to help hospitalization management [8] . a better solution could possibly be found using machine-learning, a branch of artificial intelligence that learns from past data in order to build a prognostic model [9] . in recent years, machine learning has been developed as a useful tool to analyze large amounts of data from medical records or images [10] . previous modeling studies focused on forecasting the potential international spread of covid-19 [11] . therefore, our objective is to develop and validate a prognostic machine-learning model based on clinical, laboratory, and radiological variables of covid-19 patients at hospital admission for severity risk assessment during hospitalization, and compare the performance with that of psi as a representative clinical assessment method. our ambition is to develop a multifactorial decision support system with different datasets to facilitate risk prediction and triage (home or mobile hospital quarantine, hospitalization, or icu) of the patient at hospital admission. the institutional review board approved this study (2020-71), which followed the standards for reporting of diagnostic accuracy studies statement [12] , and the requirement for written informed consent was waived. 
299 adult confirmed covid-19 patients from the central hospital of wuhan were included consecutively and retrospectively between december 23, 2019 and february 13, 2020. the inclusion criteria were: (1) patients with a confirmed covid-19 disease, (2) patients presenting at hospital for admission. the exclusion criteria were: (1) patients already with a severe illness at hospital admission; (2) time interval > 2 days between admission and examinations; and (3) no data available or delayed results as described below. the patients included from this center were divided into two datasets according to the entrance time of hospitalization, 80% for training (239 patients from december 23, 2019, to january 28, 2020) and 20% for internal validation (60 patients from january 29 to february 13, 2020). the five test datasets were collected between february 20, 2020 and march 31, 2020 from eight other centers (supplementary) in china, italy, and belgium under the same inclusion and exclusion criteria (figure 1). patients were labelled as having a "severe disease" if at least one of the following criteria were met during hospitalization [6, 13]: (a) respiratory failure requiring mechanical ventilation; (b) shock; (c) icu admission; (d) organ failure; or (e) death. patients were labelled as having a "non-severe disease" if none of the abovementioned criteria were met during the whole hospitalization process until deemed recovered and discharged from the hospital. clinical, laboratory, radiological characteristics and outcome data were obtained in the case record form shared by the international severe acute respiratory and emerging infection consortium from the electronic medical records [14]. a confirmed case with covid-19 was defined as a positive result of high-throughput sequencing or real-time reverse-transcriptase polymerase-chain-reaction assay for nasal and pharyngeal swab specimens.
after consultation with respiratory specialists and review of the recent covid-19 literature, a set of clinical, laboratory, and radiological characteristics was identified and the data collected from the electronic medical system. the clinical characteristics included basic information (5 variables), comorbidities (11 variables), and symptoms (13 variables). all clinical characteristics were obtained when the patients were admitted to hospital for the first time. 42 laboratory results were recorded, including complete blood count, white blood cell differential count, d-dimer, c-reactive protein (crp), cardiac enzymes, procalcitonin, liver function test, kidney function test, b-type natriuretic peptide and electrolyte test. the arterial blood gas was not taken into account due to missing data for most early-stage patients. the metric conversion of laboratory results was performed using an online conversion table [15]. a detailed list of variables can be found in tables 1 and 2. the semantic ct characteristics (including ground-glass opacity, consolidation, vascular enlargement, air bronchogram, and lesion range score) were independently evaluated on all datasets by two radiologists (py [a radiologist with 5 years' experience in chest ct images] and yx [a radiologist with 20 years' experience in chest ct images]), who were blinded to clinical and laboratory results. any disagreement was resolved by a consensus read. lesion range was identified as areas of ground-glass opacity or consolidation and was graded. all feature selection and model training were performed in the training dataset alone to prevent information leakage. an overview of the functions used is given in supplementary table s1. in order to reduce feature dimensionality, for each pair of features showing high spearman correlation (r > 0.8), the feature with the highest mean correlation with all remaining features was removed, followed by application of the boruta algorithm to select important features [16].
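the pairwise correlation filter can be sketched in plain python. this is a simplified stand-in for the paper's rule (which removes, from each correlated pair, the feature with the highest mean correlation): here one feature of every pair with |rho| > 0.8 is greedily dropped. the toy feature values are illustrative:

```python
def rank(xs):
    """Average ranks (ties shared) for Spearman correlation."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a) ** 0.5
    vb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (va * vb)

def spearman(a, b):
    """Spearman rho = Pearson correlation of the ranks."""
    return pearson(rank(a), rank(b))

def drop_correlated(features, threshold=0.8):
    """Greedily keep a feature only if it is not highly correlated
    with any feature already kept (simplified version of the rule)."""
    kept = []
    for name in features:
        if all(abs(spearman(features[name], features[k])) <= threshold
               for k in kept):
            kept.append(name)
    return kept

# toy data: "crp_x2" is perfectly correlated with "crp" and is dropped
features = {"crp": [1, 2, 3, 4, 5],
            "crp_x2": [2, 4, 6, 8, 10],
            "age": [50, 30, 70, 20, 60]}
kept = drop_correlated(features)
```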
the boruta algorithm combines feature rank based on the random forest classification algorithm and selection frequency based on multiple iterations of the feature selection procedure. recursive feature elimination based on bagged tree models with a cross-validation technique (10 folds, 10 times) was performed to select the best performing combination of features. in order to balance the positive and negative sample size, an adaptive synthetic sampling approach for imbalanced learning (adasyn) was used during feature selection and modeling [17] . the feature selection process was used for clinical, laboratory, and ct semantic models alone, and in combination. the prognostic performances of the best model were compared with other models on the training dataset, due to a bigger sample size. the performance of the best model and psi scoring were gauged on the datasets via the receiver operator characteristic (roc) and confusion matrix. in order to gauge the level of overfitting, the outcomes were randomized on the best model and the entire process repeated, from feature selection to model building and evaluation. the patients from the training datasets were divided into low, medium and high risk according to the first quartile (25th percentile) and the third quartile (75th percentile) of probabilities from the best performing model. nomograms and on-line calculators were used to provide the interpretability of the best trained models. the test datasets were used to gauge the prognostic performance and the validity for the best model. baseline data were summarized as median, and categorical variables as frequency (%). differences between the severe group and the non-severe group were tested using the mann-whitney test for continuous data and fisher's exact test for categorical data. feature correlations were measured using the spearman correlation coefficient. 
we determined the area under the roc curve (auc) with its 95% confidence interval (ci) and tested the auc difference between models 1-3 and model 4 by the delong method [18]. measures of prognostic performance included the auc and metrics derived from the confusion matrix: accuracy, sensitivity, specificity, positive predictive value (ppv), and negative predictive value (npv). a calibration plot based on the hosmer-lemeshow test was used to estimate the goodness-of-fit and consistency of the model on the test datasets. all p values were two-sided, and p < 0.05 was regarded as significant. all statistical analyses, modeling, and plotting were performed in r (version 3.5.3), and the detailed package characteristics are listed in supplementary table s1. of the 299 hospitalized covid-19 patients in the retrospective cohort, the median age was 50.0 years (interquartile range, 35.5-63.0; range, 20-94 years) and 137 (45.8%) were men. all the clinical characteristics and ct findings are summarized in table 1, and more details of laboratory findings can be seen in table 2. of the 426 hospitalized covid-19 patients in the 5 cohorts used as test datasets, the median age was 62.0 years (interquartile range, 50.0-72.0; range, 19-94 years) and 236 (55.4%) were men. among the clinical features, age, hospital employment, body temperature, and the time from onset to admission were selected. lymphocyte (proportion), neutrophil (proportion), crp, lactate dehydrogenase (ldh), creatine kinase (ck), urea, and calcium were selected from the laboratory feature set. only the lesion range score was selected from the ct semantic features. when pooling these three feature categories together to select features, age, lymphocyte (proportion), crp, ldh, ck, urea and calcium were finally included in the combination model. model performance was as follows.
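the confusion-matrix metrics reported in table 3 all follow from the four cell counts. a minimal sketch; the counts below are hypothetical and do not correspond to any actual cohort in the paper:

```python
def confusion_metrics(tp, fp, fn, tn):
    """Accuracy, sensitivity, specificity, PPV and NPV from a 2x2
    confusion matrix (tp/fp/fn/tn = true/false positives/negatives)."""
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }

# hypothetical counts for one external test cohort
m = confusion_metrics(tp=31, fp=8, fn=1, tn=40)
```

with these hypothetical counts, sensitivity is 31/32 ≈ 96.9% and specificity 40/48 ≈ 83.3%, i.e. within the ranges the paper reports.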
model 1, based on age and hospital employment, showed an auc of model 4 was validated on the five test datasets, which showed aucs ranging from 0.84 to 0.93 with accuracies ranging from 74.4% to 87.5%, sensitivities ranging from 75.0% to 96.9%, specificities ranging from 57.5% to 88.0%, ppvs ranging from 71.4% to 84.1%, and npvs ranging from 73.9% to 93.9% (table 3). the roc, confusion-matrix, and calibration plots are shown in figure 3. the results of randomizing the outcomes and rerunning the analysis yielded an auc of 0.50 (95% ci, 0.44-0.55) for model 4. based on the selected features from the best models, a nomogram was established to quantitatively assess the severity risk of illness (figure 4). the developed online calculators can be found at www.covid19risk.ai. compared to psi scoring, model 4 showed higher aucs, accuracies, sensitivities, and npvs on the five test datasets (table 3). there were significant differences in the proportion of severe patients among the low, medium, and high-risk groups in the five test datasets (figure 5). this international multicenter study analyzed, individually and in combination, clinical, laboratory and radiological characteristics of covid-19 patients at hospital admission, to retrospectively develop and prospectively validate a prognostic model and tool to assess the severity of the illness and its progression, and to compare these with psi scoring. we found that covid-19 patients who developed a severe illness were often of an advanced age, accompanied by multiple comorbidities, presenting with chest tightness, and had abnormal laboratory results and a broader lesion range on lung ct on admission. using simpler linear regression models yielded better prognostic performance than psi scoring in the test datasets. we believe these models could be useful for risk assessment and triage.
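the quartile cut-offs of 0.21 and 0.80 reported in the abstract map a patient's predicted severity probability to a triage group. a minimal sketch; the handling of the exact boundary values is an assumption, as the paper does not specify it:

```python
LOW_CUT, HIGH_CUT = 0.21, 0.80   # 25th and 75th percentile of training probabilities

def triage(probability):
    """Map the model's severity probability to a triage group using
    the quartile cut-offs reported in the abstract.  Boundary
    inclusiveness is an assumption."""
    if probability < LOW_CUT:
        return "low risk"        # e.g. home or mobile-hospital quarantine
    if probability <= HIGH_CUT:
        return "medium risk"     # hospitalisation
    return "high risk"           # fast track to the icu

groups = [triage(p) for p in (0.05, 0.45, 0.92)]
```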
previous studies have reported that age and underlying comorbidities (such as hypertension, diabetes, and cardiovascular diseases) may be risk factors for covid-19 patients requiring the intensive care unit (icu) [19, 20]. in this study, we found that elderly covid-19 patients who were male, non-hospital staff, suffering from hypertension, diabetes, cardiopathy, chronic obstructive pulmonary disease, cerebrovascular disease, renal disease, or hepatitis b virus infection, and presenting with lower body temperature and chest tightness were more vulnerable to developing a severe illness in the early stages of the disease. among these features, age, hospital staff, body temperature, and the time from onset to admission had certain prognostic abilities. age was the most important feature and may interact with other features, which is why only age was selected into our combination model (model 4) from this group. zhou and colleagues have confirmed that sars-cov-2 uses the same cell entry receptor (angiotensin-converting enzyme ii [ace2]) as sars-cov [21]. however, whether covid-19 patients with hypertension and diabetes who are treated with ace2-increasing drugs have a higher risk of severe illness is still unknown [22]. hospital staff had a lower risk of progression, possibly due to lower age, higher levels of education, and more medical knowledge once infected, although the unbalanced nature of this type of data has to be taken into account. furthermore, early studies have shown that covid-19 patients with severe illness had more laboratory abnormalities, such as crp, d-dimer, lymphocyte, neutrophil, and ldh, than those patients with non-severe illness, and these were associated with the prognosis [19, 20, 23]. in our study, we also found that the severe group had numerous laboratory abnormalities in complete blood cell count, white cell differential count, d-dimer, crp, liver function, renal function, procalcitonin, b-type natriuretic peptides, and electrolytes.
among these abnormalities, lymphocyte proportion, neutrophil proportion, crp, ldh, ck, urea, and calcium were significant prognostic factors, which suggests that covid-19 may cause damage to multiple organ systems when developing into a severe illness. however, current pathological findings of covid-19 suggest that there is no evidence that sars-cov-2 can directly impair other organs such as the liver, kidney and heart [24]. current reports have shown that thin-slice chest ct is a powerful tool in clinical diagnosis due to its high sensitivity and the ability to monitor the development of the disease [25, 26]. in addition, a previous study reported that ground-glass opacity and consolidation were the most common ct findings for covid-19 patients with pneumonia, while being nonspecific [27]. clinical observations showed that there were significantly more consolidation lesions in icu patients on admission, while more ground-glass opacity lesions were observed in non-icu patients [28]. in our study, we found that vascular enlargement, air bronchogram, and lesion range score differ significantly between non-severe and severe groups. among these features, only the lesion range score had prognostic power, but not enough to be selected for the combination model. this indicates that while these early-stage ct semantic features could have diagnostic value, they have limited ability to predict the onset of severe illness in covid-19 patients. the chinese national health committee added some warning indicators for severe or critical cases in the updated diagnosis and treatment plan for covid-19 patients (version 7) [29], which includes a progressive reduction of peripheral blood lymphocytes, a progressive increase of il-6, crp and lactate, and rapid progression of lung ct findings in a short period.
in this study, we used age, lymphocyte fraction, crp, ldh, ck, urea, and calcium scores from the clinical, laboratory, and radiological exams recorded at hospital admission to train a model for the prediction of the onset of severe illness. our model combining these features from multiple sources showed a favorable performance when validated on the five external datasets from china, italy, and belgium. in addition, the model is able to stratify covid-19 patients into low, medium, and high-risk groups for developing severe illness. we propose that this model, with its higher prediction performance and greater simplicity than the psi score, could be used as a preliminary screening and triage tool at hospital admission for the potential to develop severe illness. furthermore, the model could be used for the selection and/or stratification of patients in clinical trials in order to homogenize the patient population. follow-up laboratory tests are needed to assess the severity risk with higher accuracy. as one of the coronavirus family infecting humans, sars-cov-2 has similar etiologic, clinical, radiological and pathological features to those of severe acute respiratory syndrome coronavirus and middle east respiratory syndrome coronavirus [23, 30, 31]. therefore, we believe that developing a reliable early warning model based on present clinical, radiological, and pathological data is necessary for current outbreaks and possible future outbreaks of coronaviruses. our study has several limitations. first, selection bias is unavoidable, and the sample size is limited and unbalanced. second, patients from different races and ethnicities may have diverse clinical and laboratory results, and the self-medication of patients before admission may affect the clinical and laboratory results.
third, the threshold to go to the hospital and hospitalization management can vary from country to country. we are also aware that rna viruses can mutate rapidly and that this could have an impact on the performance of the models. we therefore propose that these models should be continuously updated to achieve a better performance, for example using privacy-preserving distributed learning approaches [32, 33]. fourth, the ct features used for this study are semantic features from the first ct scan; radiomics or deep learning approaches may improve the prognostic performance, and a follow-up ct scan may yield more information. fifth, due to the large number of predictors included in the analysis, and the complexity of feature selection and modelling, overfitting is always possible. we have mitigated this with the use of external validation cohorts, and by rerunning the analysis on randomized outcomes to arrive at a "chance" (auc = 0.5) result. elderly covid-19 patients and non-hospital staff seem more vulnerable to developing a severe illness after hospitalization as per the defining criteria, which can cause a wide range of laboratory and ct anomalies. furthermore, our model based on lactate dehydrogenase, c-reactive protein, calcium, age, lymphocyte proportion, urea, and creatine kinase might be a more useful preliminary screening and triage tool than the pneumonia severity index for risk assessment of covid-19 patients at hospital admission. role of the funder/sponsor: the funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication. chest ct scans were performed using one of the ct scanners (uct 780, united imaging, china, and brilliance ict 128, philips medical systems, the netherlands) with patients in the supine position.
the scanning range was from the level of the upper thoracic inlet to the inferior level of the costophrenic angle. for ct acquisition, the tube voltage was 120 kvp with automatic tube current modulation, a field of view (fov) of 350 × 350 mm, and a matrix size of 512 × 512. all images were reconstructed into a slice thickness of 1 mm and an interval of 1 mm.
references:
- coronavirus disease (covid-19) outbreak
- who. coronavirus disease 2019 (covid-19) situation report - 116
- who. report of the who-china joint mission on coronavirus disease
- epidemiologic features and clinical course of patients infected with sars-cov-2 in singapore
- clinical characteristics of coronavirus disease 2019 in china
- acute physiology and chronic health evaluation ii score as a predictor of hospital mortality in patients of coronavirus disease
- a prediction rule to identify low-risk patients with community-acquired pneumonia
- machine learning in medicine
- radiomics: the bridge between medical imaging and personalized medicine
- nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study
- towards complete and accurate reporting of studies of diagnostic accuracy: the stard initiative. standards for reporting of diagnostic accuracy
- diagnosis and treatment of adults with community-acquired pneumonia. an official clinical practice guideline of the american thoracic society and infectious diseases society of america
- labcorp
- feature selection with the boruta package
- adaptive synthetic sampling approach for imbalanced learning
- comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach
- clinical features of patients infected with 2019 novel coronavirus in wuhan
- clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in wuhan, china
- a pneumonia outbreak associated with a new coronavirus of probable bat origin
- are patients with hypertension and diabetes mellitus at increased risk for covid-19 infection?
- a trial of lopinavir-ritonavir in adults hospitalized with severe covid-19
- pathological findings of covid-19 associated with acute respiratory distress syndrome
- novel coronavirus (2019-ncov) pneumonia
- temporal changes of ct findings in 90 patients with covid-19 pneumonia: a longitudinal study
- ct imaging features of 2019 novel coronavirus (2019-ncov)
- radiological findings from 81 patients with covid-19 pneumonia in wuhan, china: a descriptive study
- national health commission & state administration of traditional chinese medicine
- short term outcome and risk factors for adverse clinical outcomes in adults with severe acute respiratory syndrome (sars)
- middle east respiratory syndrome
- systematic review of privacy-preserving distributed machine learning from federated databases in health care
- distributed learning on 20 000+ lung cancer patients - the personal health train
key: cord-133273-kvyzuayp authors: christ, andreas; quint, franz title: artificial intelligence: research impact on key industries; the upper-rhine artificial intelligence symposium (ur-ai 2020) date: 2020-10-05 journal: nan doi: nan sha: doc_id: 133273 cord_uid: kvyzuayp the trirhenatech alliance presents a collection of accepted papers of the cancelled tri-national 'upper-rhine artificial intelligence
symposium' planned for 13th may 2020 in karlsruhe. the trirhenatech alliance is a network of universities in the upper-rhine trinational metropolitan region comprising the german universities of applied sciences in furtwangen, kaiserslautern, karlsruhe, and offenburg, the baden-wuerttemberg cooperative state university loerrach, the french university network alsace tech (comprised of 14 'grandes écoles' in the fields of engineering, architecture and management) and the university of applied sciences and arts northwestern switzerland. the alliance's common goal is to reinforce the transfer of knowledge, research, and technology, as well as the cross-border mobility of students. in the area of privacy-preserving machine learning, many organisations could potentially benefit from sharing data with other, similar organisations to train good models. health insurers could, for instance, work together on solving the automated processing of unstructured paperwork such as insurers' claim receipts. the issue here is that organisations cannot share their data with each other for confidentiality and privacy reasons, which is why secure collaborative machine learning, in which a common model is trained on distributed data while preventing information from the participants from being reconstructed, is gaining traction. this shows that the biggest problem in the area of privacy-preserving machine learning is not technical implementation, but how much the entities involved (decision makers, legal departments, etc.) trust the technologies. as a result, the degree to which ai can be explained, and the amount of trust people have in it, will be an issue requiring attention in the years to come. the representation of language has undergone enormous development of late: new models and variants, which can be used for a range of natural language processing (nlp) tasks, seem to pop up almost monthly.
such tasks include machine translation, extracting information from documents, text summarisation and generation, document classification, bots, and so forth. the new generation of language models, for instance, is advanced enough to be used to generate completely realistic texts. these examples reveal the rapid development currently taking place in the ai landscape, so much so that the coming year may well witness major advances or even a breakthrough in the following areas:
• healthcare sector (reinforced by the covid-19 pandemic): ai facilitates the analysis of huge amounts of personal information, diagnoses, treatments and medical data, as well as the identification of patterns and the early identification and/or cure of disorders.
• privacy concerns: how civil society should respond to the fast increasing use of ai remains a major challenge in terms of safeguarding privacy. the sector will need to explain ai to civil society in ways that can be understood, so that people can have confidence in these technologies.
• ai in retail: increasing reliance on online shopping (especially in the current situation) will change the way traditional (food) shops function. we are already seeing signs of new approaches with self-scanning checkouts, but this is only the beginning. going forward, food retailers will (have to) increasingly rely on a combination of staff and automated technologies to ensure cost-effective, frictionless shopping.
• process automation: an ever greater proportion of production is being automated or performed by robotic methods.
• bots: progress in the field of language (especially in natural language processing, outlined above) is expected to lead to major advances in the take-up of bots, such as in customer service, marketing, help desk services, healthcare/diagnosis, consultancy and many other areas.
the rapid pace of development means it is almost impossible to predict either the challenges we will face in the future or the solutions destined to simplify our lives. one thing we can say is that there is enormous potential here. the universities in the trirhenatech alliance are actively contributing interdisciplinary solutions to the development of ai and its associated technical, societal and psychological research questions. utilizing toes of a humanoid robot is difficult for various reasons, one of which is that inverse kinematics becomes overdetermined with the introduction of toe joints. nevertheless, a number of robots with either passive toe joints like the monroe or hrp-2 robots [1, 2] or active toe joints like lola, the toyota robot or toni [3, 4, 5] have been developed. recent work shows considerable progress on learning model-free behaviors, using genetic learning [6] for kicking with toes and deep reinforcement learning [7, 8, 9] for walking without toe joints. in this work, we show that toe joints can significantly improve the walking behavior of a simulated nao robot and that this can be learned model-free. the remainder of this paper is organized as follows: section 2 gives an overview of the domain in which learning took place. section 3 explains the approach for model-free learning with toes. section 4 contains empirical results for various behaviors trained before we conclude in section 5. the robots used in this work are robots of the robocup 3d soccer simulation, which is based on simspark and was initiated by [10]. it uses the ode physics engine and runs at an update speed of 50 hz. the simulator provides variations of aldebaran nao robots with 22 dof for the robot types without toes and 24 dof for the type with toes, naotoe henceforth. more specifically, the robot has 6 (7) dof in each leg, 4 in each arm and 2 in its neck.
there are several simplifications in the simulation compared to the real nao:
- all motors of the simulated nao are of equal strength, whereas the real nao has weaker motors in the arms and different gears in the leg pitch motors
- joints do not experience extensive backlash
- the rotation axes of the hip yaw part of the hip are identical in both robots, but the simulated robot can move hip yaw for each leg independently, whereas for the real nao, left and right hip yaw are coupled
- the simulated naos do not have hands
- the touch model of the ground is softer and therefore more forgiving to stronger ground touches in the simulation
- energy consumption and heat are not simulated
- masses are assumed to be point masses in the center of each body part
the feet of naotoe are modeled as rectangular body parts of size 8cm x 12cm x 2cm for the foot and 8cm x 4cm x 1cm for the toes (see figure 1). the two body parts are connected with a hinge joint that can move from -1 degrees (downward) to 70 degrees. all joints can move at an angular speed of at most 7.02 degrees per 20 ms. the simulation server expects to get the desired speed at 50 hz for each joint. if no speeds are sent to the server it will continue movement of the joint with the last speed received. joint angles are noiselessly perceived at 50 hz, but with a delay of 40 ms compared to sent actions. so only after two cycles does the robot know the result of a triggered action. a controller provided for each joint inside the server tries to achieve the requested speed, but is subject to maximum torque, maximum angular speed and maximum joint angles. the simulator is able to run 22 simulated naos in real-time on reasonable cpus. it is used as the competition platform for the robocup 3d soccer simulation league. in this context, only a single agent was running in the simulator. the following subsections describe how we approached the learning problem.
this includes a description of the design of the behavior parameters used, what the fitness functions for the genetic algorithm look like, which hyperparameters were used and how the fitness calculation in the simspark simulation environment works exactly. the guiding goal behind our approach is to learn a model-free walk behavior. by model-free we mean an approach that does not make any assumptions about a robot's architecture nor the task to be performed. thus, from the viewpoint of learning, our model consists of a set of flat parameters. these parameters are later grounded inside the domain. the server requires 50 values per second for each joint. to reduce the search space, we make use of the fact that output values of a joint over time are not independent. therefore, we learn keyframes, i.e. all joint angles for discrete phases of movement together with the duration of the phase from keyframe to keyframe. the experiments described in this paper used four to eight such phases. the number of phases is variable between learning runs, but not subject to learning for now, except for skipping phases by learning a zero duration for them. the robocup server requires robots to send the actual angular speed of each joint as a command. when only leg joints are included, this would require learning 15 parameters per phase (14 joints + 1 for the duration of the phase), resulting in 60, 90 and 120 parameters for the 4, 6 and 8 phases worked with. the disadvantage of this approach is that the speed during a particular phase is constant, thus making it unable to adapt to discrepancies between the desired and the actual motor movement. therefore, a combination of angular value and the maximum amount of angular speed each joint should have is used. the direction and final value of movement is entirely encoded in the angular values, but the speed can be controlled separately.
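the flat genome described above can be decoded into keyframes as follows. this is a minimal sketch of the parameterization (the function and field names are ours, not from the paper): per phase, one duration followed by an (angle, maximum speed) pair for each of the 14 leg joints.

```python
import numpy as np

N_JOINTS = 14  # leg joints subject to learning, as in the paper

def decode_keyframes(params, n_phases):
    """Split a flat genome into keyframes.

    Per phase: one duration, then an angle and a maximum absolute
    speed per joint, i.e. 1 + 2 * N_JOINTS values per phase.
    """
    per_phase = 1 + 2 * N_JOINTS
    assert len(params) == n_phases * per_phase
    phases = []
    for p in range(n_phases):
        chunk = params[p * per_phase:(p + 1) * per_phase]
        phases.append({
            "duration": chunk[0],
            "angles": np.asarray(chunk[1:1 + N_JOINTS]),
            "max_speeds": np.asarray(chunk[1 + N_JOINTS:]),
        })
    return phases

# a random genome for 4 phases, as used in the smallest experiments
genome = np.random.uniform(-1.0, 1.0, size=4 * (1 + 2 * N_JOINTS))
keyframes = decode_keyframes(genome, n_phases=4)
```

with 4, 6 and 8 phases this yields genomes of 116, 174 and 232 values, roughly double the 60/90/120 of the speed-only variant, matching the "almost doubles" remark in the text.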
it follows that:
- if the amount of angular speed does not allow reaching the angular value, the joint behaves as in the first, speed-only version.
- if the amount of angular speed is bigger, the joint stops moving even if the phase is not over.
this almost doubles the amount of parameters to learn, but the co-domain of values for the speed values is half the size, since here we only require an absolute amount of angular speed. with these parameters, the robot learns a single step and mirrors the movement to get a double step. feedback from the domain is provided by a fitness function that defines the utility of a robot. the fitness function subtracts a penalty for falling from the walked distance in x-direction in meters. there is also a penalty for the maximum deviation in y-direction reached during an episode, weighted by a constant factor:

fitness_walk = distance_x - fallenPenalty - f * maxY (1)

in practice, the values chosen for fallenPenalty and the factor f were usually 3 and 2 respectively. this same fitness function can be used without modification for forward, backward and sideward walk learning, simply by adjusting the initial orientation of the agent. the also trained turn behavior requires a different fitness function:

fitness_turn = (g * totalTurn) - distance (2)

where totalTurn refers to the cumulative rotation performed in degrees, weighted by a constant factor g (typically 1/100). we penalize any deviation from the initial starting x/y position (distance) as an incentive to turn in place. it is noteworthy that, other than swapping out the fitness function and a few more minor adjustments mentioned in 3.3, everything else about the learning setup remained the same thanks to the model-free approach. naturally, the fitness calculation for an individual requires connecting an agent to the simspark simulation server and having it execute the behavior defined by the learned parameters. in detail, this works as follows: at the start of each "episode", the agent starts walking with the old model-based walk engine at full speed.
once 80 simulation cycles (roughly 1.5 seconds) have elapsed, the robot starts checking the foot force sensors. as soon as the left foot touches the ground, it switches to the learned behavior. this ensures that the learned walk has comparable starting conditions each time. if this does not occur within 70 cycles (which sometimes happens due to non-determinism in the domain and noise in the foot force perception), the robot switches anyway. from that point on, the robot keeps performing the learned behavior that represents a single step, alternating between the original learned parameters and a mirrored version (right step and left step). an episode ends once the agents has fallen or 8 seconds have elapsed. to train different walk directions (forward, backward, sideward), the initial orientation of the player is simply changed accordingly. in addition, the robot uses a different walk direction of the model-based walk engine for the initial steps that are not subject to learning. in case of training a morphing behavior (see 4.5) , the episode duration is extended to 12 seconds. when a morphing behavior should be trained, the step behavior from another learning run is used. this also means that a morphing behavior is always trained for a specific set of walk parameters. after 6 seconds, the morphing behavior is triggered once the foot force sensors detect that the left foot has just touched the ground. unlike the step / walk behavior, this behavior is just executed once and not mirrored or repeated. then the robot switches back to walking at full speed with the model-based walk engine. to maximize the reward, the agent has to learn a morphing behavior that enables the transition between learned model-free and old model-based walk to work as reliably as possible. finally, for the turn behavior, the robot keeps repeating the learned behavior without alternating with a mirrored version. in any case, if the robot falls, a training run is over. 
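the fitness functions used to score each episode can be written directly as plain python functions. this is a sketch using the constants given in the text (function and argument names are ours):

```python
def fitness_walk(distance_x, max_dev_y, fallen,
                 fallen_penalty=3.0, f=2.0):
    """Walked x-distance in meters, minus a penalty for falling and a
    weighted penalty for the maximum lateral (y) deviation."""
    return distance_x - (fallen_penalty if fallen else 0.0) - f * max_dev_y

def fitness_turn(total_turn_deg, distance_from_start, g=0.01):
    """Cumulative rotation in degrees, weighted by g, minus any drift
    away from the starting x/y position (incentive to turn in place)."""
    return g * total_turn_deg - distance_from_start
```

for example, a 5 m walk with 0.5 m lateral deviation scores 4.0 when the robot stays upright and 1.0 when it falls.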
the overall runtime of each such learning run is 2.5 days on our hardware. learning is done using plain genetic algorithms; more details on the approach can be found in [11]. this section presents the results for each kind of behavior trained. this includes three different walk directions, a turn behavior and a behavior for morphing. the main focus of this work has been on training a forward walk movement. figure 2 shows a sequence of images for a learned step. the best result reaches a speed of 1.3 m/s compared to the 1.0 m/s of our model-based walk and 0.96 m/s for a walk behavior learned on the nao robot without toes. the learned walk with toes is less stable, however, and shows a fall rate of 30% compared to 2% of the model-based walk. regarding the characteristics of this walk, it utilizes remarkably long steps. table 1 shows an in-depth comparison of various properties, including step duration, length and height, which are all considerably bigger compared to our previous model-based walk. the forward leaning of the agent has increased by 80.4%, while 28.1% more time is spent with both legs off the ground. however, the maximum deviation from the intended path (maxY) has also increased by 137.8%. (table 1: comparison of the previously fastest and the fastest learned forward walk.) once a working forward walk was achieved, it was natural to try to train a backward walk behavior as well, since this only requires a minor modification in the learning environment (changing the initial rotation of the agent and the model-based walk direction to start with). the best backward walk learned reaches a speed of 1.03 m/s, which is significantly faster than the 0.67 m/s of its model-based counterpart. unfortunately, the agent also falls 15% more frequently. it is interesting just how backward-leaning the agent is during this walk behavior. it could almost be described as "controlled falling" (see figure 3).
sideward walk learning was the least successful out of the three walk directions. like with all directions, the agent starts out using the old walk engine and then switches to the learned behavior after a short time. in this case however, instead of continuing to walk sideward, the agent has learned to turn around and walk forward instead, see figure 4. the resulting forward walk is not very fast and usually causes the agent to fall within a few meters, but it is still remarkable that the learned behavior manages to both turn the agent around and make it walk forward with the same repeating step movement. it is also remarkable that the robot learned that, with the given legs, it is quicker, at least over long distances, to turn and run forward than to keep making sidesteps. with the alternate fitness function presented in section 3, the agent managed to learn a turn behavior that is comparable in speed to that of the existing walk engine. despite this, the approach is actually different: while the old walk engine uses small, angled steps, the learned behavior uses the left leg as a "pivot", creating angular momentum with the right leg. figure 5 shows the movement sequence in detail. unfortunately, despite the comparable speed, the learned turn behavior suffers from much worse stability. with the old turn behavior, the agent only falls in roughly 3% of cases; with the learned behavior it falls in roughly 55% of the attempts. one of the major hurdles for using the learned walk behaviors in a robocup competition is the smooth transition between them and other existing behaviors such as kicks. the initial transition to the learned walk is already built into the learning setup described in section 3 by switching mid-walk, so it does not have to be given special consideration. more problematic is switching to another behavior afterwards without falling. to handle this, the robot simply attempted to train a "morphing" behavior using the same model-free learning setup.
the result is something that could be described as a "lunge" (see figure 6) that reduces the forward momentum sufficiently to allow a transition to the slower model-based walk when successful. however, the morphing is not successful in about 50% of cases, resulting in a fall. we were able to successfully train forward and backward walk behaviors, as well as a morphing and turn behavior, using plain genetic algorithms and a very flexible model-free approach. the usage of the toe joint in particular makes the walks look more natural and human-like than those of the model-based walk engine. however, while the learned behaviors outperform or at least match our old model-based walk engine in terms of speed, they are not stable enough to be used during actual robocup 3d simulation league competitions. we think this is an inherent limitation of the approach: we train a static behavior that is unable to adapt to changing circumstances in the environment, which is common in simspark's non-deterministic simulation with perception noise. deep reinforcement learning seems more promising in this regard, as the neural network can dynamically react to the environment since sensor data serves as input. it is also arguably even less restrictive than the keyframe-based behavior parameterization we presented in this paper, as a neural network can output raw joint actions each simulation cycle. at least two other robocup 3d simulation league teams, fc portugal [8] and itandroids [9], have had great success with this approach. everything points towards this becoming the state-of-the-art approach in robocup 3d soccer simulation in the near future, so we want to concentrate our future efforts here as well. retail companies dealing in alcoholic beverages are faced with a constant flux of products. apart from general product changes like modified bottle designs and sizes or new packaging units, two factors are responsible for this development.
the first is the natural wine cycle, with new vintages arriving at the market and old ones cycling out each year. the second is the impact of the rapidly growing craft beer trend, which has also motivated established breweries to add to their range. the management of the corresponding product data is a challenge for most retail companies. the reason lies in the large amount of data and its complexity. data entry and maintenance processes are linked with considerable manual effort, resulting in high data management costs. product data attributes like dimensions, weights and supplier information are often entered manually into the database and are often afflicted with errors. another widely used source of product data is the import from commercial data pools. a means of checking the data thus acquired for plausibility is necessary. sometimes product data is incomplete for various reasons, and a method to fill in the missing values is required. all these possible product data errors lead to complications in the downstream automated purchase and logistics processes. we propose a machine learning model which involves domain-specific knowledge and compare it to a heuristic approach by applying both to real-world data of a retail company. in this paper we address the problem of predicting the gross weight of product items in the merchandise category alcoholic beverages. to this end we introduce two levels of additional features. the first level consists of engineered features which can be determined from the basic features alone or by domain-specific expert knowledge, like which type of bottle is usually used for which grape variety. in the next step an advanced second-level feature is computed from these first-level features. adding these two levels of engineered features increases the prediction quality of the suggestion values we are looking for. the results emphasize the importance of careful feature engineering using expert knowledge about the data domain.
feature engineering is the process of extracting features from the data in order to train a prediction model. it is a crucial step in the machine learning pipeline, because the quality of the prediction depends on the choice of features used for training. the majority of time and effort in building a machine learning pipeline is spent on data cleaning and feature engineering [domingos 2012]. a first overview of basic feature engineering principles can be found in [zheng 2018]. the main problem is the dependency of the feature choice on the data set and the prediction algorithm. what works best for one combination does not necessarily work for another. a systematic approach to feature engineering without expert knowledge about the data is given in [heaton 2016]. the authors present a study of whether different machine learning algorithms are able to synthesize engineered features on their own. as engineered features, logarithms, ratios, powers and other simple mathematical functions of the original features are used. in [anderson 2017] a framework for automated feature engineering is described. the data set is provided by a major german retail company and consists of 3659 beers and 10212 wines. each product is characterized by the seven features shown in table 1. the product name obeys only a generalized format. depending on the user generating the product entry in the company database, abbreviating style and other editing may vary. the product group is a company-specific number which encodes the product category - dairy products, vegetables or soft drinks, for example. in our case it allows a differentiation of the products into beer and wine. additionally, wines are grouped by country of origin and, for germany, also into wine-growing regions. note that the product group is no inherent feature like length, width, height and volume, but depends on the product classification system a company uses.
the dimensions length, width and height and the volume derived by multiplying them are given as float values. the feature (gross) weight, also given as a float value, is what we want to predict. as is often the case with real-world data, a pre-processing step has to be performed prior to the actual machine learning in order to reduce data errors and inconsistencies. for our data we first removed all articles missing one or more of the required attributes of table 1. then all articles with dummy values were identified and discarded. dummy values are often introduced due to internal process requirements but do not add any relevant information to the data. if, for example, the attribute weight has to be filled for an article during article generation in order to proceed to the next step but the actual value is not known, often a dummy value of 1 or 999 is entered. these values distort the prediction model when used as training data in the machine learning step. the product name is subjected to lower casing and substitution of special german characters like umlauts. special symbolic characters like #, ! or separators are also deleted. with this preprocessing done, the data is ready to be used for feature engineering. following this formal data cleaning we perform an additional content-focused pre-processing. the feature weight is discretized by binning it with bin width 10 g. volume is likewise treated with bin size 10 ml. this simplifies the value distribution without rendering it too coarse. all articles where length is not equal to width are removed, because these are not single items but packages of items. often the data at hand is not sufficient to train a meaningful prediction model. in these cases feature engineering is a promising option. identifying and engineering new features depends heavily on expert knowledge of the application domain. the first level consists of engineered features which can be determined from the original features alone.
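before turning to the engineered features, the cleaning steps described above can be sketched in pandas. the column names and the exact dummy values are assumptions for illustration, not taken from the paper's data set:

```python
import pandas as pd

def preprocess(df):
    """Cleaning sketch: drop incomplete rows and dummy weights,
    normalize product names, bin weight/volume, drop multi-item packages."""
    required = ["name", "group", "length", "width", "height",
                "volume", "weight"]
    df = df.dropna(subset=required)
    # dummy values such as 1 or 999 would distort the training data
    df = df[~df["weight"].isin([1.0, 999.0])].copy()
    # lower-case names, substitute german umlauts, strip special symbols
    df["name"] = (df["name"].str.lower()
                  .str.replace("ä", "ae").str.replace("ö", "oe")
                  .str.replace("ü", "ue").str.replace("ß", "ss")
                  .str.replace(r"[#!;/]", " ", regex=True))
    # discretize weight into 10 g bins and volume into 10 ml bins
    df["weight"] = (df["weight"] // 10) * 10
    df["volume"] = (df["volume"] // 10) * 10
    # articles with length != width are packages, not single items
    return df[df["length"] == df["width"]]
```

running this on a small frame with one valid bottle, one dummy-weight row and one package leaves only the valid bottle, with its weight snapped to the nearest lower 10 g bin.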
in the next step, advanced second-level features are computed from these first-level and the original features. for our data set the original features are product name and group as well as the dimensions length, width, height and volume. we see that the volume is computed in the most general way by multiplication of the dimensions. geometrically this corresponds to all products being modelled as cuboids. since angular beer or wine bottles are very much the exception in the real world, a sensible new feature would be a more appropriate modelling of the bottle shape. since weight is closely correlated to volume, the better the volume estimate, the better the weight estimate. to this end we propose four first-level engineered features: capacity, wine bottle type, beer packaging type and beer bottle type, which are in turn used to compute a second-level engineered feature, namely the packaging-specific volume. figure 1 shows all discussed features and their interdependencies. let us have a closer look at the first-level engineered features. the capacity of a beverage states the amount of liquid contained and is usually limited to a few discrete values. 0.33l and 0.5l are typical values for beer cans and bottles, while wines are almost exclusively sold in 0.75l bottles and sometimes in 0.375l bottles. the capacity can be estimated from the given volume with sufficient certainty using appropriate threshold values. outliers were removed from the data set. there are three main beer packaging types in retail: cans, bottles and kegs. while kegs are mainly of interest to pubs and restaurants and are not considered in this paper, cans and bottles target the typical supermarket shopper and come in a greater variety. in our data set, the product name in case of beers is preceded by a prefix denoting whether the product is packaged in a can or a bottle. extracting the relevant information is done using regular expressions.
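a minimal sketch of such a prefix extraction follows. the concrete prefix tokens ("dose"/"ds." for cans, "flasche"/"fl." for bottles) are hypothetical examples, since the paper does not list the actual prefixes used:

```python
import re

# hypothetical german prefix tokens marking the packaging type
CAN_PATTERN = re.compile(r"^\s*(dose|ds\.?)\b", re.IGNORECASE)
BOTTLE_PATTERN = re.compile(r"^\s*(flasche|fl\.?)\b", re.IGNORECASE)

def beer_packaging_type(product_name):
    """Classify a beer as 'can' or 'bottle' from its name prefix;
    return None if no known prefix is found."""
    if CAN_PATTERN.match(product_name):
        return "can"
    if BOTTLE_PATTERN.match(product_name):
        return "bottle"
    return None
```

as the text notes next, the prefix is not always correct, so in practice the result still has to be cross-checked against the article dimensions.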
note, though, that the prefix is not always correct and needs to be checked against the dimensions. the shapes of cans are the same for all practical purposes, no matter the capacity. the only difference is in their wall thickness, which depends on the material, aluminium and tin foil being the two common ones. the difference in weight is small and the actual material used is impossible to extract from the data. a further distinction of cans into different types, e.g. for beer and wine, is therefore unnecessary. regarding the german beer market, five main bottle types are in use, shown in figure 2. the engineered feature beer packaging type assigns each article identified as beer by its product group to one of the classes bottle or can. the feature beer bottle type contains the most probable member of the five main beer bottle types. packages containing more than one bottle or can, like crates or six-packs, are not considered in this paper and were removed from the data set. compared to beer, the variety of commercially sold wine packagings is limited to bottles only. a corresponding packaging type attribute to distinguish between cans and bottles is not necessary. again there are a few bottle types which are used for the majority of wines, namely schlegel, bordeaux and burgunder (figure 3). deciding which product is filled into which bottle type is a question of domain knowledge. the original data set does not contain a corresponding feature. from the product group the country of origin and, in the case of german wines, the region can be determined via a mapping table. this depends on the type of product classification system the respective company uses and does not have to be valid for all companies. our data set uses a customer-specific classification with a focus on germany. a more general one would be the global product classification (gpc) standard, for example.
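such a mapping table and the expert rules built on top of it can be sketched as follows. all group numbers, origins and assignment rules here are invented for illustration; the real table and rules come from the company's classification system and domain experts:

```python
# hypothetical excerpt of the mapping table: product group -> origin
GROUP_TO_ORIGIN = {
    4711: ("germany", "mosel"),
    4712: ("germany", "rheinhessen"),
    4720: ("france", "alsace"),
    4730: ("chile", None),
}

def wine_bottle_type(origin, grape=None, color=None):
    """Assign 'schlegel', 'bordeaux' or 'burgunder' from illustrative
    domain rules: schlegel for german/alsatian whites, burgunder for
    typical burgundy grapes, bordeaux as the default."""
    country, region = origin
    if color == "white" and (country == "germany" or region == "alsace"):
        return "schlegel"
    if grape in ("pinot noir", "chardonnay", "spaetburgunder"):
        return "burgunder"
    return "bordeaux"

# usage: resolve the origin via the table, then apply the rules
bottle = wine_bottle_type(GROUP_TO_ORIGIN[4711],
                          grape="riesling", color="white")
```

as the text stresses, such a categorization is neither comprehensive nor free of exceptions; it only serves as a first step.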
to determine wine growing regions in non-german countries like france, the product name has to be analyzed using regular expressions. the type of grape is likewise to be deduced from the product name, if possible. using the country and specifically the region of origin and the type of grape of the wine in question is the only way to assign a bottle type with acceptable certainty. there are countries and regions in which a certain bottle type is used predominantly, sometimes also depending on the color of the wine. the schlegel bottle, for example, is almost exclusively used for german and alsatian white wines and almost nowhere else. bordeaux and burgunder bottles, on the other hand, are used throughout the world. some regions like california or countries like chile use a mix of bottle types for their wines, which poses an additional challenge. with expert knowledge one can assign regions and grape types to the different bottle types. as with beer bottles, this categorization is by no means comprehensive or free of exceptions, but it serves as a first step. the standard volume computation by multiplying the product dimensions length, width and height is a rather coarse cuboid approximation to the real shape of alcoholic beverage packagings. since the volume is intrinsically linked to the weight which we want to predict, a packaging type specific volume computation is required for cans and especially bottles. the modelling of a can is straightforward, using a cylinder with the given height h and a diameter d given by the (equal) width and length. thus the packaging type specific volume is V_can = π (d/2)² h. a bottle, on the other hand, needs to be modelled piecewise. its height can be divided into three parts: base, shoulders and neck, as shown in figure 4. base and neck can be modelled by a cylinder; the shoulders are approximated by a truncated cone. with the help of the corresponding partial heights h_base, h_shoulder and h_neck we can compute coefficients c_base, c_shoulder and c_neck as fractions of the overall height h of the bottle.
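the can and bottle models described above can be sketched directly; the coefficient names (c_base, c_shoulder, c_neck, r_neck) and the sample dimensions are illustrative assumptions, since the paper's table 3 with the per-bottle-type values is not reproduced here.

```python
import math

def can_volume(h, width, length):
    """Cylinder model for cans: the diameter is the (equal) width/length."""
    d = (width + length) / 2.0   # should coincide; averaging guards rounding
    return math.pi * (d / 2.0) ** 2 * h

def bottle_volume(h, width, c_base, c_shoulder, c_neck, r_neck):
    """Piecewise bottle model: cylinder (base) + truncated cone
    (shoulders) + cylinder (neck). c_* are the partial heights as
    fractions of the overall height h; r_neck is the ratio of neck
    diameter to base diameter."""
    R = width / 2.0   # base radius (width == length for circular bases)
    r = R * r_neck    # neck radius
    v_base = math.pi * R ** 2 * (c_base * h)
    v_shoulder = math.pi * (c_shoulder * h) / 3.0 * (R ** 2 + R * r + r ** 2)
    v_neck = math.pi * r ** 2 * (c_neck * h)
    return v_base + v_shoulder + v_neck

# a 0.33 l beer can: 11.5 cm tall, 6.6 cm diameter -> ~393 cm^3
print(round(can_volume(11.5, 6.6, 6.6), 1))
```

in the degenerate case r_neck = 1 with c_base + c_shoulder + c_neck = 1 the bottle model reduces to a plain cylinder, which is a useful sanity check.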
the diameters of the bottle base and the neck opening are given by d_base and d_neck and are likewise used to compute the ratio r_neck = d_neck / d_base. since bottles have circular bases, the values for width and length in the original data have to be the same, and either one may be used for d_base. these four coefficients are characteristic for each bottle type, be it beer or wine (table 3). with their help, a bottle type specific volume can be computed from the original data length, width and height, which is a much better approximation to the true volume than the former cuboid model. the bottle base can be modelled as a cylinder, V_base = π (d_base/2)² c_base h. the bottle shoulders have the form of a truncated cone and are described by formula 3: V_shoulder = (π c_shoulder h / 3) ((d_base/2)² + (d_base/2)(d_neck/2) + (d_neck/2)²). the bottle neck again is a simple cylinder, V_neck = π (d_neck/2)² c_neck h. summing up all three sections yields the packaging type specific volume for bottles: V_bottle = V_base + V_shoulder + V_neck. (ur-ai 2020 // 18) the experiments follow the multi-level feature engineering scheme shown in figure 1. first, we use only the original features product group and dimensions. then we add the first-level engineered features capacity and bottle type to the basic features. next, the second-level engineered feature packaging type specific volume is used along with the basic features. finally, all features from every level are used for the prediction. after pre-processing and feature engineering, the data set size is reduced from 3659 to 3380 beers and from 10212 to 8946 wines. for the prediction of the continuous valued attribute gross weight, we use and compare several regression algorithms. both the decision-tree based random forests algorithm (breiman, 2001) and support vector machines (svm) (cortes, 1995) are available in regression mode (smola, 1997). linear regression (lai, 1979) and stochastic gradient descent (sgd) (taddy, 2019) are also employed as examples of more traditional statistics-based methods.
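a comparison of the four regressor families named above can be sketched with scikit-learn on synthetic stand-in data (the real master data is not public); all feature and target values below are invented for illustration, and both features and target are log-transformed as the paper later describes.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(5.0, 35.0, size=(300, 3))                  # length, width, height
w = 0.4 * X.prod(axis=1) * np.exp(rng.normal(0, 0.05, 300))  # weight ~ volume
Xl, yl = np.log(X), np.log(w)                              # logarithmized features

X_tr, X_te, y_tr, y_te = train_test_split(Xl, yl, test_size=0.2, random_state=0)

models = {
    "random forest": RandomForestRegressor(random_state=0),
    "svm": SVR(),
    "linear regression": LinearRegression(),
    "sgd": SGDRegressor(random_state=0, max_iter=2000),
}
rmse = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    err = model.predict(X_te) - y_te
    rmse[name] = float(np.sqrt((err ** 2).mean()))
print(rmse)
```

the relative ranking on such toy data need not match the paper's tables 4 and 5; the sketch only shows the mechanics of the comparison.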
our baseline is a heuristic approach taking the median of the attribute gross weight for each product group and using this value as the prediction for all products of the same product group. practical experience has shown this to be a surprisingly good strategy. the implementation was done in python 3.6 using the standard libraries scikit-learn and pandas. all numeric features were logarithmized prior to training the models; the non-numeric feature bottle type was converted to numbers. the final results were obtained using tenfold cross validation (kohavi, 1995). for model training 80% of the data was used, while the remaining 20% constituted the test data. we used the root mean square error (rmse) (6) as well as the mean and variance of the absolute percentage error (7) as metrics for the evaluation of the performance of the algorithms. all machine learning algorithms deliver significant improvements regarding the observed metrics compared to the heuristic median approach. the best results for each feature combination are highlighted in bold script. the results for the beer data set in table 4 show that, compared to the baseline approach, the rmse can be more than halved, the mean error almost reduced to a third and its variance quartered. the random forest regressor achieves the best results in terms of rmse for almost all feature combinations, except basic features and basic features combined with the packaging type specific volume, in which cases support vector machines prove superior. linear regression and sgd are still better than the baseline approach but not on par with the other algorithms. linear regression shows a tendency towards improved results when successively adding features. sgd on the other hand exhibits no clear relation between the number and level of features and the corresponding prediction quality; a possible cause could be the choice of hyperparameters.
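the median-per-group baseline and the two reported metrics can be sketched on toy data as follows; all weights below are invented for illustration, not taken from the paper's data set.

```python
import numpy as np
import pandas as pd

# toy data standing in for the article master data (illustrative values)
df = pd.DataFrame({
    "product_group": ["beer", "beer", "wine", "wine", "wine"],
    "gross_weight":  [560.0, 580.0, 1250.0, 1300.0, 1275.0],
})

# baseline heuristic: predict the per-group median gross weight
pred = df.groupby("product_group")["gross_weight"].transform("median")

# rmse and mean/variance of the absolute percentage error
err = df["gross_weight"] - pred
rmse = float(np.sqrt((err ** 2).mean()))
ape = (err.abs() / df["gross_weight"]) * 100.0
print(rmse, float(ape.mean()), float(ape.var()))
```

the same metrics are then computed for each learned model, so baseline and learners are compared on identical terms.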
sgd is very sensitive in this regard and depends more heavily on a higher number of correctly adjusted hyperparameters than the other algorithms we used. random forests is a method which is very well suited to problems where there is no easily discernible relation between the features. it is prone to overfitting, though, which we tried to avoid by using 20% of all data as test data. adding more engineered features leads to increasingly better results using random forests, with an outlier for the packaging type specific volume feature. svms are not affected by the first-level engineered features alone but profit from using the bottle type specific volume. regarding the wine data set, the results depicted in table 5 are not as good as for the beer data set, though still much better than the baseline approach. a reduction of the rmse by over 29% and of the mean error by almost 50% compared to the baseline were achieved; the error variance could even be limited to under 10% of the baseline value. again random forests is the algorithm with the best metrics. linear regression and svm are comparable in terms of the mean error, while sgd is worse but shows good rmse values. in conclusion, the general results of the wine data set show little improvement when applying additional engineered features.

6 discussion and conclusion

the experiments show a much better prediction quality for beer than for wine. a possible cause could be the higher weight variance of wine bottle types compared to beer bottles and cans. it is also more difficult to correctly determine the bottle type for wine, since the higher overlap in dimensions does not allow computing the bottle type with the help of idealized bottle dimensions. using expert knowledge to assign the bottle type by region and grape variety seems not to be as reliable, though. especially with regard to the lack of a predominant bottle type in the region with the most bottles (red wine from baden, for example), this approach should be improved.
especially bordeaux bottles often sport an indentation in the bottom, called a 'culot de bouteille'. the size and thickness of this indentation cannot be inferred from the bottle's dimensions. this means that the relation between bottle volume and weight is skewed compared to other bottles without these indentations, which in turn decreases prediction quality. predicting gross weights with machine learning and domain-specifically engineered features leads to smaller discrepancies than using simple heuristic approaches. this is important for retail companies, since big deviations are much worse for logistical reasons than small ones, which may well be within natural production tolerances for bottle weights. our method allows checking manually generated as well as data pool imported product data for implausible gross weight entries, and it proposes suggestion values in case of missing entries. the method we presented can easily be adapted to non-alcoholic beverages using the same engineered features. in this segment, plastic bottles are much more common than glass ones, and hence the impact of the bottle weight compared to the liquid weight is significantly smaller. we assume that this will cause a smaller importance of the bottle type feature in the prediction. a more problematic kind of beverage is liquor: although there are only a few different standard capacities, the bottle types vary so greatly that identifying a common type is almost impossible. one of the main challenges of our approach is determining the correct bottle types. using expert knowledge is a solid approach but cannot capture all exceptions, especially if a wine growing region has no predominant bottle type and is using mixed bottle types instead. additionally, many wine growers use bottle types which have not been typical for their wine types, because they want to differ from other suppliers in order to get the customer's attention.
assuming that all rieslings are sold in schlegel bottles, for example, is therefore not exactly true. one option could be to model hybrid bottles using a weighted average of the coefficients for each bottle type in use. if a region uses both burgunder and bordeaux bottles with about equal frequency, all products from this region could be assigned a hybrid bottle with coefficients computed as the mean value of each coefficient. if an initially bottle-type-labeled data set is available, preliminary simulations have shown that most bottle types can be predicted robustly using classification algorithms. the most promising strategy, in our opinion, is to learn the bottle types directly from product images, using deep neural nets for example. with regard to the ever increasing online retail sector, web stores need to have pictures of their products on display, so the data is there to be used.

quality assurance is one of the key issues for modern production technologies. especially new production methods like additive manufacturing and composite materials require high-resolution 3d quality assurance methods. computed tomography (ct) is one of the most promising technologies to acquire material and geometry data non-destructively at the same time. with ct it is possible to digitalize objects in 3d, also allowing to visualize their inner structure. a 3d-ct scanner produces voxel data, comprising volumetric pixels whose values correlate with material properties: the voxel value (grey value) is approximately proportional to the material density. nowadays it is still common to analyse the data by manually inspecting the voxel data set, searching for and manually annotating defects. the drawback is that for high-resolution ct data this process is very time consuming and the result is operator-dependent. therefore, there is a high motivation to establish automatic defect detection methods. there are established methods for automatic defect detection using algorithmic approaches.
however, these methods show a low reliability in several practical applications. at this point artificial neural networks come into play, which have already been implemented successfully in medical applications [1]. the most common networks developed for medical data segmentation are the u-net by ronneberger et al. [2] and the v-net by milletari et al. [3], together with their derivatives. these networks are widely used for segmentation tasks. fuchs et al. describe three different ways of analysing industrial ct data [4]. one of these contains a 3d-cnn; this cnn is based on the u-net architecture and is shown in their previous paper [5]. the authors enhance and combine the u-net and v-net architectures to build a new network for the examination of 3d volumes. in contrast, we investigate in our work how the networks introduced by ronneberger et al. and milletari et al. perform in industrial environments. furthermore, we investigate whether derivatives of these architectures are able to identify small features in industrial ct data. industrial ct systems differ from medical ct systems not only in the hardware design but also in the resulting 3d imaging data. voxel data from industrial parts differ from medical data in the contrast level and the resolution: state-of-the-art industrial ct scanners produce one to two orders of magnitude larger data sets compared to medical ct systems, and the corresponding resolution is necessary to resolve small defects. medical ct scanners are optimised for a low x-ray dose for the patient, with x-ray photon energies typically up to 150 kev, while industrial scanners typically use energies up to 450 kev. in combination with the difference of the scan "object", the datasets differ significantly in size and image content. to store volume data there are many different file formats. some of them are mainly used in medical applications, like dicom [6], nifti or raw; in industrial applications vgl, raw and tiff are commonly used.
also, depending on the format, it is possible to store the data slice-wise or as a complete volume stack. industrial ct data, as mentioned in the previous section, has some differences to medical ct data. one aspect is the size of the features to be detected or learned by the neural network. our target is to find defects in industrial parts; as an example, we analyse pores in casting parts. these features may be very small, down to 1 to 7 voxels in each dimension. compared to the size of the complete data volume (typically larger than 512 x 512 x 512 voxels), the feature size is very small. the density difference between material and pores may be as low as 2% of the maximum grey value. thus, it is difficult to annotate the data even for human experts. the availability of real industrial data of good quality, annotated by experts, is very low; most companies don't reveal their quality analysis data. training a neural network with a small quantity of data is not possible. for medical applications, especially ai applications, there are several public datasets available. yet these datasets are not always sufficient, and researchers are creating synthetic medical data [7]. therefore, we decided to create synthetic industrial ct data. another important reason for synthetic data is the quality of annotations done by human experts: consistency of results is not given across different experts. fuchs et al. have shown that training on synthetic data and predicting on real data leads to good results [4]. however, synthetic data may not reflect all properties of real data, and some of these properties are not obvious, which may lead to ignoring some varieties in the data. in order to achieve high generalizability, we use a large number of synthetic data mixed with a small number of real data. to achieve this, we developed an algorithm which generates large amounts of data containing a large variation of the aspects needed to generalize a neural network.
the variation includes material density, pore density, pore size, pore amount, pore shape and size of the part. some samples can be learned easily, because the pores are clearly visible inside the material; other samples are more difficult to learn, because the pores are nearly invisible. this allows us to generate data with a wide variety, and hence the network can predict on different data. to train the neural networks, we can mix the real and synthetic data or use them separately. the real data was annotated manually by two operators. to create a dataset from this volume we sliced it into 64x64x64 blocks. only the blocks with a mean density greater than 50% of the grayscale range are used, to avoid too many empty volumes in the training data. another advantage of synthetic data is the class balance. we have two classes, where 0 corresponds to material and surrounding air and 1 to the defects. because of the size of the defects there is a high imbalance between the classes. by generating data with more features than in the real data, we could reduce the imbalance. reducing the size of the volume to 64x64x64 also leads to a better balance between the size of defects and the full volume. in table 1, details of our dataset for training, evaluation and testing are shown. the synthetic data will not be recombined into a larger volume, as the blocks represent separate small components or full material units. the following two slices of real data (figure 1) and synthetic data (figure 2) with annotated defects show the conformity between the data.

hardware and software setup. deep learning (dl) consists of two phases: the training and its application. while dl models can be executed very fast, the training of the neural network can be very time-consuming, depending on several factors. one major factor is the hardware: the time consumed can be reduced by a factor of around ten when graphics cards (gpus) are used.
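a toy generator in the spirit of the synthetic data described above can be sketched with numpy; the parameter values (material grey value, 2% pore contrast, pore count and radii) and the function name are illustrative assumptions, not the paper's actual generator.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_synthetic_volume(size=64, n_pores=5, material=0.8, pore_drop=0.02):
    """Homogeneous material cube with a few low-contrast spherical
    pores (here only ~2% below the material grey value), plus the
    matching binary defect mask (class 1 = defect)."""
    vol = np.full((size, size, size), material, dtype=np.float32)
    mask = np.zeros_like(vol, dtype=np.uint8)
    zz, yy, xx = np.mgrid[:size, :size, :size]
    for _ in range(n_pores):
        c = rng.integers(8, size - 8, 3)   # pore centre, away from the border
        r = rng.integers(1, 4)             # pore radius in voxels (1..3)
        sphere = ((zz - c[0])**2 + (yy - c[1])**2 + (xx - c[2])**2) <= r**2
        vol[sphere] = material - pore_drop
        mask[sphere] = 1
    return vol, mask

vol, mask = make_synthetic_volume()
# keep only blocks whose mean density exceeds 50% of the grey range
keep = vol.mean() > 0.5
print(vol.shape, int(mask.sum()), keep)
```

the same mean-density filter is what the text applies to the 64x64x64 blocks sliced from the real volume.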
[8] to cache the training data before it is given into the model and calculated on the gpu, a lot of random-access memory (ram) is used [9][10][11]. our system is built on dual-cpu hardware with 10 cores each running at 2.1 ghz, an nvidia titan rtx gpu with 24gb of vram, and 64gb of regular ram. all measurements in this work concerning training and execution time are related to this hardware setup. the operating system is ubuntu 18.04 lts. anaconda is used for python package management and deployment. the dl framework is tensorflow 2.1 with keras as a submodule in python. based on the 3d u-net [12] and 3d v-net [3] architectures compared by paichao et al. [13], we created modified versions which differ in the number of layers and their hyperparameters. due to the small size of our data, no patch division is necessary; instead, the training is performed on the full volumes. we do not use the z-net enhancement proposed in their paper. the input size, depending on our data, is defined as 64x64x64x1, with 1 dimension for the channel. the incoming data is normalized. as we have a binary segmentation task, our output activation is the sigmoid function [14]. based on paichao et al. [13], the convolutional layers of our 3d u-nets have a kernel size of (3, 3, 3) and the 3d v-nets have a kernel size of (5, 5, 5). as convolution activation function we use elu [14][15], and he_normal [16] as kernel initialization [17]. the adam optimisation method [18][19] is used with a starting learning rate of 0.0001 and a decay factor of 0.1, and the loss function is binary cross-entropy [20]. figure 3 shows a sample 3d u-net architecture where max pooling is used on the way down and transposed convolution on the way up. in comparison, figure 4 shows the 3d v-net, a fully convolutional neural network, where the descent is done with a (2, 2, 2) convolution with a stride of 2 and the ascent with transposed convolution.
it also has a layer-level addition, where the input of a level is added to the last convolution output of the same level, as marked by the blue arrows. to adapt the shapes of the tensors for adding them, the down-convolution and the last convolution of the same level have to have the same number of kernel filters. our modified neural networks differ in the levels of de-/ascending, the convolution filter kernel size and their hyperparameters, as shown in table 2. the convolutions on one level have the same number of filter kernels; after every down-convolution the number of filters is multiplied by 2, and on the way up divided by 2.

training and evaluation of the neural networks. careful selection of the training conditions and parameters is important. in table 3 the training conditions fitted to our system and networks are shown. we also take into account that different network architectures and numbers of layers perform better with different learning rates, batch sizes, etc. to evaluate our trained models, we mainly focus on the iou metric, also called the jaccard index, which is the intersection over union. this metric is widely used for segmentation tasks and compares the intersection over union between the prediction and the ground truth for each voxel. the value of iou ranges between 0 and 1, whereas loss values range between 0 and infinity; therefore, the iou is a much clearer indicator. an iou close to 1 indicates a high intersection precision between the prediction and the ground truth. our networks were trained between 30 and 90 epochs, until no more improvement could be achieved. both datasets consist of a similar number of samples, which means the epoch time is equivalent; one epoch took around 4 minutes. figure 5 shows the loss determined based on the evaluation data. as described previously, all models are trained on and evaluated against the synthetic dataset gdata and the mixed dataset mdata.
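the voxel-wise iou (jaccard index) used for model selection can be sketched in a few lines of numpy; the convention of returning 1.0 for two empty masks is an assumption, as the paper does not state how that edge case is handled.

```python
import numpy as np

def iou(pred, truth):
    """Intersection over union (Jaccard index) for binary voxel masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0  # both empty: perfect agreement by convention (assumed)
    return np.logical_and(pred, truth).sum() / union

a = np.zeros((4, 4, 4), dtype=np.uint8); a[:2] = 1
b = np.zeros((4, 4, 4), dtype=np.uint8); b[1:3] = 1
print(iou(a, b))  # overlap 16 voxels, union 48 -> ~0.333
```

unlike the unbounded loss, this score is directly comparable across models, which is why the text prefers it for ranking.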
in general, the loss achieved by all models is higher on mdata, because the real data is harder to learn. a direct comparison between the models is only possible between models with the same architecture. the iou metric is shown in figure 6; here the evaluation is sorted based on the iou metric. if we compare the loss of unet-mdata with unet-gdata, which are nearly the same for mdata, with their corresponding iou (unet-mdata ~0.8 and unet-gdata ~0.93), we can see that a lower loss does not necessarily lead to a higher iou score. if only the loss and iou are considered, the u-nets tend to be better than the v-nets. as a conclusion, considering the iou metric for model selection, unet-gdata is the best performing model and vnet-gdata the least performing. figures 5 and 6 show the evaluation loss and the evaluation iou, respectively, determined based on the evaluation data and sorted from lowest to highest. after comparing the automatic evaluation, we show prediction samples of different models on real and synthetic data (table 4). rows 1 and 2 show the comparison between unet-gdata and vnet-gdata predicting on a synthetic test sample. the result of unet-gdata exactly hits the ground truth, whereas the vnet-gdata prediction has a 100% overlap with the ground truth but with surrounding false positive segmentations. in rows 3 and 4 both models predict the ground truth plus some false positive segmentations in the close neighbourhood. in rows 5 and 6 the prediction results of the same two models on real data are shown, taking into account that both models were not trained on real data. unet-gdata delivers a good precision with some false positive segmentations in the ground truth area and one additional segmented defect; this shows that the model was able to find a defect which was missed by the expert. vnet-gdata shows a very high number of false positive segmentations.
in this paper, we have proposed a neural network to find defects in real and synthetic industrial ct volumes. we have shown that neural networks developed for medical applications can be adapted to industrial applications. to achieve high accuracy, we used a large variety of features in our data. based on the evaluation and on manually reviewing random samples, we have chosen the u-net architecture for further research. this model achieved great performance on our real and synthetic datasets. in summary, this paper shows that artificial intelligence and neural networks can provide an important enrichment for industrial applications.

stress can affect all aspects of our lives, including our emotions, behaviors, thinking ability, and physical health, making our society sick, both mentally and physically. among the effects that stress and anxiety can cause are heart diseases, such as coronary heart disease and heart failure [5]. given this, this research presents a proposal to help people handle stress, using the benefits of technology development, and to identify patterns of stress status as a way to propose interventions, since the first step to controlling stress is to know its symptoms. the symptoms of stress are very broad and can be confused with those of other diseases; according to the american institute of stress [15], they include, for example, frequent headache, irritability, insomnia, nightmares, disturbing dreams, dry mouth, problems swallowing, and increased or decreased appetite, and stress may even cause other diseases such as frequent colds and infections. in view of the wide variety of symptoms caused by stress, this research intends to define, through physiological signals, the patterns generated by the body and obtained by wearable sensors, and to develop a standardized database to apply machine learning.
on the other hand, advances in sensor technology, wearable devices and mobile growth can help with online stress identification based on physiological signals and with the delivery of psychological interventions. currently, the advancement of technology and improvements in the wearable sensors area have made it possible to use these devices as a source of data to monitor the user's physiological state. the majority of wearable devices consist of low-cost boards that can be used for the acquisition of physiological signals [1, 10]. after the data are obtained, it is necessary to apply filters to clean the signal of noise and distortions, aiming to use machine learning approaches to model and predict stress states [2, 11]. the widespread use of mobile devices and microcomputers, such as the raspberry pi, and their capabilities present a great possibility to collect and process those signals with an elaborated application. these devices can collect the physiological signals and detect specific stress states to generate interventions following a predetermined diagnosis based on the standards already evaluated in the system [9, 6]. during the literature review, it became evident that few works are dedicated to comprehensively evaluating the complete cycle of biofeedback, which comprises using wearable devices, applying machine learning pattern detection algorithms, generating the psychological intervention, monitoring its effects, and recording the history of events [9, 3]. stress is identified by professionals using human physiology, so wearable sensors could help with data acquisition and processing, through machine learning algorithms on biosignal data, suggesting psychological interventions. some works [6, 14] are dedicated to defining patterns via experiments for data acquisition simulating real situations. jebelli, khalili and lee [6] showed a deep learning approach that was compared with a baseline feedforward artificial neural network. schmidt et al.
[12] describe wearable stress and affect detection (wesad), a public dataset used to build classifiers and identify stress patterns, integrating several sensor signals with the emotion aspect, with a precision of 93% in the experiments. the work of gaggioli et al. [4] describes the main features and a preliminary evaluation of a free mobile platform for the self-management of psychological stress. in terms of wearables, some studies [13, 14] evaluate the usability of devices to monitor the signals and the patient's well-being. pavic et al. [13] presented research performed to monitor cancer patients remotely, as the majority of the patients have many symptoms but cannot stay at the hospital during the whole treatment. the authors emphasize that good results were obtained and that this system is viable, as long as the patient is not a critical case, as it does not replace medical equipment or the emergency care present in the hospital. the focus of henriques et al. [5] was to evaluate the effects of biofeedback in a group of students to reduce anxiety; in that paper the heart rate variability was monitored in two experiments with a duration of four weeks each. the work of wijman [8] describes the use of emg signals to identify stress; the experiment was conducted with 22 participants, evaluating both the wearable signals and questionnaires. in this section, the uniqueness of this research and the devices that were used will be described. this solution is proposed after several literature studies about stress patterns and physiological aspects which reported few results; for this reason, our project addresses topics like an experimental study protocol for signal acquisition from patients/participants with wearables for data acquisition and processing, after which machine learning modeling and prediction are applied to biosignal data regarding stress (fig. 1).
the protocol followed for the acquisition of signals during the different states is the trier social stress test (tsst) [7], recognized as the gold standard protocol for stress experiments. the estimated total protocol time, involving pre-tests and post-tests, is 116 minutes with a total of thirteen steps, but the applied experiment was adapted and established with ten stages. initial evaluation: the participant arrives at the scheduled time and answers the questionnaires; habituation: a rest time of twenty minutes is taken before the pre-test to avoid the influence of events and to establish a safe baseline of the organism; pre-test: the sensors are allocated (fig. 2), a saliva sample is collected, and the psychological instruments are applied. the next step is the explanation of the procedure and preparation: the participant reads the instructions and the researcher ensures that he understands the job specifications; in sequence, he is sent to the room with the jurors (fig. 3), composed of two collaborators of the research who were trained to remain neutral during the experiment, not giving positive verbal or non-verbal feedback; free speech: after three minutes of preparation, the participant is requested to start his speech, being informed that he cannot use the notes.
this is followed by the arithmetic task: the jurors request an arithmetic task in which the participant must subtract mentally; sometimes the jurors interrupt and warn that the participant has made a mistake; post-test evaluation: the experimenter receives the subject outside the room for the post-test evaluations; feedback and clarification: the investigator and jurors talk to the subject and clarify what the task was about; relaxation technique: a recording is used with guidelines on how to perform a relaxation technique using only the breathing; final post-test: some of the psychological instruments are reapplied, saliva samples are collected, and the sensors continue picking up the physiological signals. based on the literature [14] and the wearable devices available, the signals selected for analysis in an initial experiment are ecg, eda and emg. this experimental study protocol on data acquisition started with 71 participants, where the data annotation of each step was done manually, from the protocol experiment to preprocessing the data based on feature selection. in the machine learning step, the metrics of different algorithms such as decision tree, random forest, adaboost, knn, k-means and svm will be evaluated. the experiment was made using the bitalino kit from plux wireless biosignals s.a. (fig. 4), composed of an ecg sensor, which provides data on heart rate and heart rate variability; an eda sensor, which allows measuring the electrodermal activity of the sweat glands; and an emg sensor, which allows collecting the activity of the muscle signals. this section describes the results of the pre-processing step and how it was made, listing all parts regarding categorizing and filtering the data, evaluating the signal to check its plausibility, and creating a standardized database.
The developed code is written in Python because of the wide variety of libraries available; this step uses NumPy and pandas, both for data manipulation and analysis. First, the files with the raw data and the timestamps are read. During this process the used channels are renamed to the names of the signals, because BITalino stores the data with the channel number as the name of each signal. Next, the timestamps are converted to a usable format so they can be compared with the annotations; after the time is in the right format, all unused channels are discarded to avoid unnecessary processing. The annotations taken manually during the experiment are then read, as mentioned before, to compare the times and label each part of the experiment with its respective signal. Once all signals are matched to their TSST stage, the parts of the experiment are grouped into six categories, which are analyzed later. The first category, "baseline", covers just two parts of the experiment and represents its beginning, when the participants had just arrived. The second, "tsst", comprises the period in which the participant spoke; the third, "arithmetic", contains the data acquired during the arithmetic test. The two other relevant categories are "post_test_sensors_1" and "post_test_sensors_2", holding the signals of the stages of the same names. Every other part of the experiment is labeled "no_category" and subsequently discarded, since it is not needed in the machine learning stage. Once the dataframe holds all signals properly classified, the columns with the participant number and the timestamp are removed. The next step is to evaluate the signals, to verify whether they are actually useful for machine learning.
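The renaming and labeling steps described above can be sketched with pandas. The channel-to-signal mapping and the column names here are assumptions for illustration, not the study's actual identifiers.

```python
import pandas as pd

# Hypothetical mapping; BITalino stores channel numbers as column names.
CHANNEL_MAP = {"A1": "ECG", "A2": "EDA", "A3": "EMG"}


def label_signals(raw: pd.DataFrame, annotations: pd.DataFrame) -> pd.DataFrame:
    """Rename channels, convert timestamps, and tag each sample with its
    TSST stage; samples outside any annotated stage are discarded."""
    df = raw.rename(columns=CHANNEL_MAP)
    df["timestamp"] = pd.to_datetime(df["timestamp"])
    df["category"] = "no_category"
    for _, row in annotations.iterrows():
        mask = (df["timestamp"] >= row["start"]) & (df["timestamp"] < row["end"])
        df.loc[mask, "category"] = row["stage"]
    return df[df["category"] != "no_category"].reset_index(drop=True)


# Tiny usage example with two annotated samples and one unannotated sample.
raw = pd.DataFrame({
    "timestamp": ["2020-01-01 10:00:00", "2020-01-01 10:00:30",
                  "2020-01-01 10:05:00"],
    "A1": [510, 512, 498],
})
ann = pd.DataFrame({
    "start": pd.to_datetime(["2020-01-01 10:00:00"]),
    "end": pd.to_datetime(["2020-01-01 10:01:00"]),
    "stage": ["baseline"],
})
labeled = label_signals(raw, ann)
```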
For this, the signals are analyzed with the BioSPPy library, which performs the data filtering and makes it possible to inspect the data visually. Finally, the script checks the volume of data in each category and returns the size of the smallest one. This is done because the categories were found to have different volumes of data, which would become a problem in the machine learning stage by offering more data from one category than from the others. The code therefore reduces the other categories until all contain the same number of rows; the dataframe is then exported as a CSV file to be read in the machine learning stage. The purpose of this article is to describe some stages of the development of a system for the acquisition and analysis of physiological signals, in order to find patterns in these signals that detect stress states. During the development of the project it was found that, for some participants, there are data gaps in the dataframe in the middle of the experiment; one hypothesis is that the BITalino acquisition has communication issues at some specific sampling rates. The results obtained when reducing the acquisition rate will be evaluated; however, it is necessary to assess carefully how far a reduced sampling rate interferes with the results. When evaluating the plausibility of the signals, evident differences were found between the signal patterns in the different stages of the process, validating the acquisition protocol. The next step in this project is to implement the machine learning stage, applying algorithms such as SVM, decision tree, random forest, AdaBoost, kNN and k-means, and evaluating the results using metrics such as accuracy, precision, recall and F1.
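The category balancing and CSV export described above can be sketched with pandas. This is a minimal illustration; the real script may pick rows differently (here the first rows of each category are kept, deterministically).

```python
import pandas as pd


def balance_categories(df: pd.DataFrame, label_col: str = "category") -> pd.DataFrame:
    """Downsample every category to the size of the smallest one, so that no
    class dominates the machine learning stage."""
    n_min = df[label_col].value_counts().min()
    # groupby(...).head(n) keeps the first n rows of each group.
    return df.groupby(label_col).head(n_min).reset_index(drop=True)


# Usage: balance an imbalanced toy frame, then export it for the ML stage.
frame = pd.DataFrame({"category": ["baseline"] * 5 + ["tsst"] * 3,
                      "x": range(8)})
balanced = balance_categories(frame)
# balanced.to_csv("standardized_dataset.csv", index=False)
```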
The next steps of this research will support the confirmation of the hypothesis that patterns of physiological signals can be defined to detect stress states. From the definition of those patterns, a system can be deployed that acquires the signals and, in real time, analyzes them based on the machine learning results. We can then detect the person's state, so that the psychologist can propose an intervention and monitor whether stress is decreasing.

Technological developments have been influencing all kinds of disciplines by transferring more competences from human beings to technical devices. The steps include [1]: 1. tools: transfer of mechanics (material) from the human being to the device; 2. machines: transfer of energy from the human being to the device; 3. automatic machines: transfer of information from the human being to the device; 4. assistants: transfer of decisions from the human being to the device. With the introduction of artificial intelligence (AI), in particular its latest developments in deep learning, we let the system (in step 4) take over our decisions and creation processes. Thus, tasks and disciplines that were exclusively reserved for humans in the past can now be shared with machines or even take the human out of the loop. It is no wonder that this transformation does not stop at disciplines such as engineering, business and agriculture, but also affects the humanities, art and design. Each new technology has been adopted for artistic expression: just see the many wonderful examples in media art. Therefore, it is not surprising that AI is being established as a novel tool to produce creative content of any form.
However, in contrast to other disruptive technologies, AI seems particularly challenging to accept in the area of art, because it offers capabilities we once thought only humans could perform: the art is no longer done by artists using new technology, but by the machine itself, without the need for a human to intervene. The question "what is art?" has always been an emotionally debated topic, in which everyone has a slightly different definition depending on his or her own experiences, knowledge and personal aesthetics. However, there seems to be a broad consensus that art requires human creativity and imagination, as stated for instance by the Oxford Dictionary: "the expression or application of human creative skill and imagination, typically in a visual form such as painting or sculpture, producing works to be appreciated primarily for their beauty or emotional power." Every art movement challenges old ways and uses artistic creative abilities to spark new ideas and styles. Each movement brought diverse intentions and reasons for creating artwork, along with critics who did not want to accept the new style as an art form. With the introduction of AI into the creation process, another art movement is being established, one that fundamentally changes the way we see art. For the first time, AI has the potential to take the artist out of the loop, leaving humans only in the positions of curators, observers and judges who decide whether the artwork is beautiful and emotionally powerful. While there is a strong debate in the arts about whether creativity is profoundly human, we investigate how AI can foster inspiration and creativity and produce unexpected results. Many publications have shown that AI can generate images, music and the like which resemble different styles and constitute artistic content. For instance, Elgammal et al.
[2] used generative adversarial networks (GANs) to generate images by learning about styles and deviating from style norms. The promise of AI-assisted creation is "a world where creativity is highly accessible, through systems that empower us to create from new perspectives and raise the collective human potential", as Roelof Pieters and Samim Winiger pointed out [3]. To better understand how AI is capable of proposing images, music, etc., we have to open the black box and investigate where and how the magic happens. Random variations in the image space (sometimes also referred to as pixel space) usually do not lead to any interesting result, because no semantic knowledge can be applied. Therefore, methods are needed which constrain the possible variations of the given dataset in a meaningful way. This can be realized by generative design or procedural generation, applied to generate geometric patterns, textures, shapes, meshes, terrain or plants. The generation processes may include, but are not limited to, self-organization, swarm systems, ant colonies, evolutionary systems, fractal geometry and generative grammars. McCormack et al. [4] review some generative design approaches and discuss how art and design can benefit from such applications. These generative algorithms, usually realized by writing program code, are very limited; AI can turn this process into a data-driven procedure. AI, or more specifically artificial neural networks, can learn patterns from (labeled) examples or by reinforcement. Before an artificial neural network can be applied to a task (classification, regression, image reconstruction), its general architecture extracts features through many hidden layers, which represent different levels of abstraction.
Data with a similar structure or meaning should be represented by data points that are close together, while divergent structures or meanings should lie further apart. To convert the image back (with some conversion/compression loss) from the low-dimensional vector produced by the first component to the original input, an additional component is needed. Together they form the autoencoder, which consists of the encoder and the decoder. The encoder compresses the data from a high-dimensional input space to a low-dimensional space, often called the bottleneck layer; the decoder then takes this encoded input and converts it back to the original input as closely as possible. The latent space is the space in which the data lies in the bottleneck layer. Looking at Figure 1 you might wonder why a model is needed that converts the input data into an output that is merely "as close as possible" to the input: it seems rather useless if all it outputs is itself. As discussed, however, the latent space contains a highly compressed representation of the input data, which is the only information the decoder can use to reconstruct the input as faithfully as possible. The magic happens when interpolating between points and performing vector arithmetic between points in latent space: these transformations have meaningful effects on the generated images. As the dimensionality is reduced, information that is distinct to each image is discarded from the latent representation, since only the most important information of each image can be stored in this low-dimensional space. The latent space thus captures the structure of the data and usually offers a semantically meaningful interpretation. This semantic meaning is, however, not given a priori but has to be discovered. As already discussed, autoencoders, after learning a particular non-linear mapping, are capable of producing photo-realistic images from randomly sampled points in the latent space.
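The encoder/decoder structure described above can be illustrated with a deliberately tiny linear autoencoder in NumPy. This is a sketch, not the architecture of any model discussed here: toy 8-D data lying on a 2-D subspace, a 2-D bottleneck, and plain gradient descent on the reconstruction error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 points in 8-D that actually lie on a 2-D subspace,
# so a 2-D bottleneck can reconstruct them well.
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 8))

d_in, d_latent = 8, 2
W_enc = rng.normal(scale=0.1, size=(d_in, d_latent))  # encoder weights
W_dec = rng.normal(scale=0.1, size=(d_latent, d_in))  # decoder weights


def encode(x):
    return x @ W_enc  # compress into the bottleneck (the latent space)


def decode(z):
    return z @ W_dec  # map latent codes back to the input space


lr, n = 0.01, len(X)
initial_loss = np.mean((decode(encode(X)) - X) ** 2)
for _ in range(2000):
    Z = encode(X)
    err = decode(Z) - X                        # reconstruction error
    W_dec -= lr * Z.T @ err / n                # MSE gradient w.r.t. the decoder
    W_enc -= lr * X.T @ (err @ W_dec.T) / n    # MSE gradient w.r.t. the encoder
final_loss = np.mean((decode(encode(X)) - X) ** 2)
```

Real autoencoders stack many non-linear layers, but the division of labor is the same: `encode` discards detail, `decode` reconstructs from what survives in the bottleneck.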
The latent-space concept is definitely intriguing, but at the same time non-trivial to comprehend. Although "latent" means hidden, understanding what happens in latent space is not only helpful but necessary for various applications. Exploring the structure of the latent space is both interesting for the problem domain and helps to develop an intuition for what has been learned and can be regenerated. The latent space obviously has to contain some structure that can be queried and navigated. It is non-obvious, however, how semantics are represented within this space and how different semantic attributes are entangled with each other. To investigate the latent space, one should favor a dataset with a limited and distinctive feature set. Faces are a good example in this regard, because they share features common to most faces yet offer enough variance; if aligned correctly, other meaningful representations of faces are possible as well, see for instance the widely used eigenfaces approach [5] for describing the specific characteristics of faces in a low-dimensional space. In the latent space we can do vector arithmetic, and this can correspond to particular features. For example, the vector representing the face of a smiling woman, minus the vector representing a neutral-looking woman, plus the vector representing a neutral-looking man, results in a vector representing a smiling man. This can be done with all kinds of images; see e.g. the publication by Radford et al. [6], who first observed the vector-arithmetic property in latent space. A visual example is given in Figure 2. Please note that all images shown in this publication were produced using BigGAN [7]; the photo of the author on which most of the variations are based was taken by Tobias Schwerdt. In latent space, then, vector algebra can be carried out.
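The smiling-woman example above reduces to a few lines of vector algebra. The latent codes below are random stand-ins; in a trained model they would be the codes corresponding to the respective face images, and a decoder would synthesize the edited face from the new code.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 512  # illustrative latent dimensionality; a common choice in GANs

# Stand-in latent codes for the three faces from the example.
z_smiling_woman = rng.normal(size=d)
z_neutral_woman = rng.normal(size=d)
z_neutral_man = rng.normal(size=d)

# Isolate the "smile" direction by subtraction, then apply it to another face.
smile_direction = z_smiling_woman - z_neutral_woman


def edit(z, direction, strength=1.0):
    """Move a latent code along a semantic direction; a decoder would then
    synthesize the edited image from the resulting code."""
    return z + strength * direction


z_smiling_man = edit(z_neutral_man, smile_direction)
```

The `strength` parameter is a small extension for illustration: scaling the direction lets one dial an attribute up or down rather than apply it all at once.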
Semantic editing requires moving within the latent space along a certain 'direction'. Identifying the 'direction' of only one particular characteristic is non-trivial, since editing one attribute may affect others that are correlated with it. This correlation can be attributed partly to pre-existing correlations in the real world (e.g. old persons are more likely to wear eyeglasses) and partly to bias in the training dataset (e.g. more women than men smiling in photos). To identify the semantics encoded in the latent space, Shen et al. proposed a framework for interpreting faces in latent space [8]. Beyond the vector-arithmetic property, their framework can decouple some entangled attributes (recall the aforementioned correlation between old people and eyeglasses) through linear subspace projection. Shen et al. found that in their dataset pose and smile are almost orthogonal to other attributes, while gender, age and eyeglasses are highly correlated with each other. Disentangled semantics enable precise control of facial attributes without retraining the model. In our examples, in Figures 3 and 4, faces are varied according to gender or age. It has been widely observed that when linearly interpolating between two points in latent space, the appearance of the corresponding synthesized images 'morphs' continuously from one face to another; see Figure 5. This implies that the semantic meaning contained in the two images also changes gradually, in stark contrast to a simple fade between two images in image space. The shape and style slowly transform from one image into the other, which demonstrates how well the latent space captures the structure and semantics of the images. Other examples are given in Section 3. Even though our analysis has focused on face editing, for the reasons discussed earlier, the observations hold for other domains as well. For instance, Bau et al.
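The interpolation underlying such morph sequences is easily sketched. Linear interpolation is the straight path between two codes; spherical interpolation (slerp) is a common refinement for Gaussian latent spaces, keeping intermediate codes at a typical norm. The codes here are random stand-ins; a decoder would turn each frame into an image.

```python
import numpy as np


def lerp(z0, z1, t):
    """Linear interpolation between two latent codes, t in [0, 1]."""
    return (1.0 - t) * z0 + t * z1


def slerp(z0, z1, t):
    """Spherical interpolation; intermediate points keep a typical vector norm."""
    u0, u1 = z0 / np.linalg.norm(z0), z1 / np.linalg.norm(z1)
    omega = np.arccos(np.clip(np.dot(u0, u1), -1.0, 1.0))
    if np.isclose(omega, 0.0):  # codes are (nearly) parallel
        return lerp(z0, z1, t)
    return (np.sin((1 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)


# A morph sequence: decoding each frame yields the gradual face-to-face transition.
rng = np.random.default_rng(3)
z_a, z_b = rng.normal(size=128), rng.normal(size=128)
frames = [slerp(z_a, z_b, t) for t in np.linspace(0.0, 1.0, 8)]
```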
[9] generated living rooms using similar approaches, and showed that some units in intermediate layers of the generator specialize in synthesizing certain visual concepts such as sofas or TVs. So far we have discussed how autoencoders connect the latent space and the image semantic space, and how the latent code can be used for image editing without influencing the image style. Next, we want to discuss how this can be used for artistic expression. While the former section showed how manipulation in the latent space yields mathematically sound operations, not much artistic content was generated: just variations of photograph-like faces. Imprecision in AI systems can lead to unacceptable errors and even deadly decisions, e.g. in autonomous driving or cancer treatment. In artistic applications, by contrast, errors or glitches might lead to interesting, unintended artifacts; whether they are treated as a bug or a feature lies in the eye of the artist. To create higher variation in the generated output, some artists randomly introduce glitches into the autoencoder. Due to the complex structure of the autoencoder, these glitches (assuming they are introduced at an early layer of the network) occur on a semantic level, as already discussed, and might cause the model to misinterpret the input data in interesting ways. Some results could even be interpreted as glimpses of autonomous creativity; see for instance the artistic work 'Mistaken Identity' by Mario Klingemann [10]. So far the latent space is explored by humans, either by random walk or by intuitively steering in a particular direction, and it is up to human decisions whether the synthesized image at a particular location in latent space is visually appealing or otherwise interesting. The question arises where to find those places, and whether they can be spotted by an automated process.
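One simple automated approach is to cluster a collection of latent codes and treat the cluster centroids as candidate sweet spots. The sketch below uses synthetic stand-in codes forming two artificial clusters; in practice the codes could come from images a curator rated as interesting.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in latent codes: two artificial clusters around different "sweet spots".
codes = np.vstack([rng.normal(loc=-2.0, size=(100, 8)),
                   rng.normal(loc=2.0, size=(100, 8))])


def kmeans(X, k, iters=20, seed=0):
    """Plain k-means: the returned centroids are candidate sweet spots."""
    r = np.random.default_rng(seed)
    centers = X[r.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each code to its nearest centroid.
        labels = np.argmin(((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1),
                           axis=1)
        # Move each centroid to the mean of its members (keep it if empty).
        centers = np.array([X[labels == j].mean(0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return centers, labels


sweet_spots, labels = kmeans(codes, k=2)
```

Decoding the centroids (or codes sampled near them) would show what each "sweet spot" looks like as an image.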
The latent space is usually defined as a d-dimensional space in which the data are assumed to follow a multivariate Gaussian distribution N(0, I_d) [11]; therefore, the mean representation of all images lies at the center of the latent space. But what does that mean for the generated results? It is said that "beauty lies in the eyes of the beholder", yet research shows that there is a common understanding of beauty: averaged faces, for instance, are perceived as more beautiful [12]. Applying these findings to the latent space, let us assume that the most beautiful images (in our case faces) are found at the center of the space, while particular deviations from the center stand for local sweet spots (e.g. female and male, ethnic groups). These sweet spots can be found by common means of data analysis (e.g. clustering). But where are the interesting local sweet spots when it comes to artistic expression? Figure 6 demonstrates some variation in style within the latent space. Of course, one can search for locations in the latent space where particular artworks of a given artist or particular art styles are located; see e.g. Figure 7, where the styles of different artists, as well as white noise, have been used for adaptation. But does lingering around these sweet spots not simply produce "more of the same"? How can we find the local sweet spots that define a new art style and can be deemed truly creative? Or do such discoveries lie outside of the latent space, because the latent space is trained on a particular set of defined art styles and can therefore produce only interpolations of those styles, but nothing conceptually new? So far we have discussed how AI can help to generate different variations of faces and where to find visually interesting sweet spots. In this section, we show how AI supports the creation process by applying the discussed techniques to other areas of image and object processing.
Probably the most popular approaches, at least in the mass media, are the different variations of image-to-image translation. The most prominent example is style transfer: the capability to transfer the style of one image while drawing the content of another (examples are shown in Figure 7). Mapping an input image to an output image is also possible for a variety of other applications, such as object transfiguration (e.g. horse-to-zebra, apple-to-orange), season transfer (e.g. summer-to-winter) or photo enhancement [13]. While some of these systems are not yet in a state to be widely applicable, AI tools are taking over and gradually automating design processes which used to be time-consuming manual work. Indeed, the greatest potential for AI in art and design is seen in its application to tedious, uncreative tasks such as colorizing black-and-white images [14]. Marco Kempf and Simon Zimmerman used AI in their work 'Deepworld' to generate a compilation of 'artificial countries', using data from all existing countries (around 195) to generate new anthems, flags and other descriptors [15]. Roman Lipski uses an AI muse (developed by Florian Dohmann et al.) to foster his inspiration [16]; because the muse is trained only on the artist's previous drawings and fed with the current work in progress, it suggests image variations in line with Roman's taste. Cluzel et al. proposed an interactive genetic algorithm to progressively sketch the desired side view of a car profile [17]; here the user takes on the role of a fitness function through interaction with the system. The Chair Project [18] is a series of four chairs co-designed by AI and human designers, exploring a collaborative creative process between humans and computers: a GAN proposes new chairs, which are then 'interpreted' by trained designers to resemble a chair. Deep-Wear [19] is a method using deep convolutional GANs for clothes design.
The GAN is trained on features of brand clothes and can generate images that resemble actual clothes; a human interprets the generated images and manually draws the corresponding pattern needed to make the finished product. Li et al. [20] introduced an artificial neural network for encoding and synthesizing the structure of 3D shapes, which, according to their findings, are effectively characterized by their hierarchical organization. German et al. [21] applied different AI techniques, trained on a small sample set of bottle shapes, to propose novel bottle-like shapes. The evaluation of their methods revealed that they can be used by trained designers as well as non-designers to support different phases of the design process, and that they can lead to novel designs not intended or foreseen by the designers. For decades, AI has fostered (often false) future visions, ranging from transhumanist utopia to a "world run by machines" dystopia. Artists and designers explore solutions concerning the semiotic, aesthetic and dynamic realms, as well as confronting corporate, industrial, cultural and political aspects. The relationship between the artist and the artwork is directly connected through their intentions, although currently mediated by third parties and media tools. Understanding the ethical and social implications of AI-assisted creation is becoming a pressing need. These implications, each of which has to be investigated in more detail in the future, include:
- Bias: AI systems are sensitive to bias; as a consequence, the AI is not a neutral tool but has pre-coded preferences. Biases relevant to creative AI systems are:
• Algorithmic bias occurs when a computer system reflects the implicit values of the humans who created it; e.g.
the system is optimized on dataset A and later retrained on dataset B without reconfiguring the neural network (this is not uncommon, as many people do not fully understand what is going on inside the network but are able to use the given code to run training on other data).
• Data bias occurs when the samples are not representative of the population of interest.
• Prejudice bias results from cultural influences or stereotypes which are reflected in the data.
- Art crisis: until 200 years ago, painting served as the primary method of visual communication and was a widely and highly respected art form. With the invention of photography, painting began to suffer an identity crisis, because painting in its form at the time could not reproduce the world as accurately and with as little effort as photography. As a consequence, visual artists had to turn to forms of representation not possible with photography, inventing art styles such as impressionism, expressionism, cubism, pointillism, constructivism, surrealism, up to abstract expressionism. Now that AI can perfectly simulate those styles, what will happen to the artists? Will artists still be needed, be replaced by AI, or have to turn to other artistic work which cannot yet be simulated by AI?
- Inflation: similar to the flood of images that has already reached us, the same can happen with AI art; because of the glut, nobody values or looks at the images anymore.
- Wrong expectations: only aesthetically appealing or otherwise interesting or surprising results are published, an effect similar to the well-known publication bias [22] in other areas. Eventually this leads to wrong expectations of what is already possible with AI. In addition, this misunderstanding is fueled by content claimed to be created by AI which has in fact been produced, or at least reworked, either by human labor or by methods not involving AI.
- Unequal judgment: even though the emotions raised by viewing artworks emerge from the works' underlying structure, people also include the creation process in their judgment (in the cases where they know about it). Frequently, on learning that a computer or an AI created an artwork, people find it boring, without guts, emotion or soul, whereas before it was inspiring, creative and beautiful.
- Authorship: the authorship of AI-generated content has not been clarified. For instance, does a novel song composed by an AI trained exclusively on songs by Johann Sebastian Bach belong to the AI, the developer/artist, or Bach? See e.g. [23] for a more detailed discussion.
- Trustworthiness: new AI-driven tools make it easy for non-experts to manipulate audio and/or visual media. Thus, image, audio and video evidence is no longer trustworthy. Manipulated images, audio and video lead to fake information, truth skepticism, and claims that real audio/video footage is fake (known as the liar's dividend) [24].
The potential of AI in creativity has only just started to be explored. We have investigated the creative power of AI, which is represented, though not exclusively, in the semantically meaningful representation of data in a dimensionally reduced space, dubbed the latent space, from which images, but also audio, video and 3D models, can be synthesized. AI is able to imagine visualizations that lie between and beyond everything it has learned from us, and might even develop its own art styles (see e.g. Deep Dream [25]). However, AI still lacks intention and is just processing data. These novel AI tools are shifting the creative process from crafting to generating and selecting, a process which cannot yet be transferred to machine judgment alone. However, AI can already be employed to find possible sweet spots or make suggestions based on the learned taste of the artist [21].
AI is without any doubt changing the way we experience art and the way we make art. Making art is shifting from handcrafting to exploring and discovering, which leaves humans more in the role of a curator than of an artist; but it can also foster creativity (as discussed before in the case of Roman Lipski) and reduce the time between intention and realization. Like many other technical developments, it has the potential to democratize creativity, because handcrafting skills are no longer as necessary to express one's own ideas. Widespread misuse (e.g. image manipulation to produce fake pornography) can limit its social acceptance and requires AI literacy. As human beings, we have to ask ourselves whether feelings are wrong just because the AI never felt, during its creation process, as we do. Or should we not worry too much and simply enjoy the new artworks, no matter whether they are created by humans, by AI, or as a co-creation between the two?

This project [1] aims to design and implement a machine learning system for generating prediction models for quality checks and reducing faulty products in manufacturing processes. It is based on an industrial case study in cooperation with SICK AG. We present first results of the project concerning a new process model for cooperating data scientists and quality engineers, a product testing model as a knowledge base for machine learning, and visual support that helps quality engineers explain prediction results. A typical production line consists of various test stations that conduct several measurements. Those measurements are processed by the system on the fly, to point out problematic products. Among the many challenges, one focus of the project is support for quality engineers. The preparation of prediction models is usually done by data scientists.
But the demand for data scientists grows too fast when a large number of products, production lines and changing circumstances have to be considered. Hence, software is needed which quality engineers can operate directly to leverage the results of prediction models. Based on quality management and data science standard processes [2, 3], we created a reference process model for production error detection and correction which includes the needed actors and their associated tasks. With the ML system and data scientist assistance, we support the quality engineer in his work. To support the ML system, we developed a product testing model which includes crucial information about a specific product: its relation to product-specific features, test systems, production line sequences, etc. The idea is to provide metadata which the ML system can use instead of individual script solutions for each product. An ML model with good predictions often lacks information about its internal decisions; it is therefore beneficial to support the quality engineer with useful feature visualizations. By default, we provide 2D and 3D feature plots and histograms in which the error distribution is visualized. On top of this, we developed further feature importance measures based on SHAP values [4], which can be used to gain deeper insight into particular ML decisions involving significant features that are ranked lower by standard feature importance measures.

Medicine is a highly empirical discipline, where important aspects have to be demonstrated using adequate data and sound evaluations. This is one of the core requirements emphasized during the development of the Medical Device Regulation (MDR) of the European Union (EU) [1]. It applies to all medical devices, including mechanical and electrical devices as well as software systems.
The US Food & Drug Administration (FDA) has also recently focused the discussion on using data to demonstrate the safety and efficacy of medical devices [2]. Beyond pure approval steps, it fosters the use of data for optimizing products, as data can nowadays be acquired more and more easily using modern IT technology. In particular, the FDA pursues the use of real-world evidence, i.e. data collected throughout the lifetime of a device, for demonstrating improved outcomes [2]. Such approaches require sophisticated data analysis techniques. Besides classical statistics, artificial intelligence (AI) and machine learning (ML) are considered powerful techniques for this purpose, and they currently gain more and more attention. These techniques can detect dependencies in complex situations where the inputs and/or outputs of a problem have high-dimensional parameter spaces. This can be the case, for example, when extensive data are collected from diverse clinical studies or from treatment protocols at local sites. Furthermore, AI/ML-based techniques may be used in the devices themselves: devices may be developed to improve complex diagnostic tasks or to find individualized treatment options for specific medical conditions (see e.g. [3, 4] for an overview). For some applications, it has already been demonstrated that ML algorithms can outperform human experts with respect to specific success rates (e.g. [5, 6]). In this paper, it is discussed how ML-based techniques can be brought onto the market, including an analysis of the appropriate regulatory requirements. The main focus lies on ML-based devices applied in the intensive care unit (ICU), as proposed e.g. in [7, 8]. The need for specific regulatory requirements comes from the observation that AI/ML-based techniques pose specific risks which need to be considered and handled appropriately.
for example, ai/ml based methods are more challenging w.r.t. bias effects, reduced transparency, vulnerability to cybersecurity attacks, or general ethical issues (see e.g. [9, 10]). in particular cases, ml based techniques may lead to noticeably critical results, as has been shown for the ibm watson for oncology device. in [11], it was reported that the direct use of the system in particular clinical environments resulted in critical treatment suggestions. the characteristics of ml based systems led to various discussions about their reliability in the clinical context, and appropriate ways have to be found to guarantee their safety and performance (cf. [12]). this applies to the field of medicine / medical devices as well as to ai/ml based techniques in general. the latter was e.g. approached by the eu in their ethics guidelines for trustworthy ai [9]. driven by this overall development, the fda started a discussion regarding an extended use of ml algorithms in samd (software as a medical device) with a focus on quicker release cycles. in [13], it pursued the development of a specific process which makes it easier to bring ml based devices onto the market and also to update them during their lifecycle. current regulations for medical devices, e.g. in the us or eu, do not provide specific guidelines for ml based devices. in particular, this applies to systems which continuously collect data in order to improve the performance of the device. current regulations focus on a fixed status of the device, which may only be adapted to a minor extent after the release. usually, a new release or clearance by the authority is required when the clinical performance of a device is modified. but continuously learning systems are intended to perform exactly such improvement steps using additional real-world data from daily applications without extra approvals (see fig. 1). fig. 1: basic approaches for ai/ml based medical devices.
left side: classical approach, where the status of the software has to be fixed after the release / approval stage. right side: continuously learning system, where data is collected during the lifetime of the device without a separate release / approval step. in this case, an automatic validation step has to guarantee proper safety and efficacy. in [13], the fda made suggestions how this could be addressed. it proposed the definition of so-called samd pre-specifications (sps) and an algorithm change protocol (acp), which are considered to represent major tools for dealing with modifications of the ml based system during its lifetime. within the sps, the manufacturer has to define the anticipated changes which are considered to be allowed during the automatic update process. in addition, the acp defines the particular steps which have to be implemented to realize the sps specifications. see [13] for more information about sps and acp. the details, however, are not yet well elaborated by the fda, which has requested suggestions in this respect. in particular, these tools serve as a basis for performing an automated validation of the updates. the applicability of this approach depends on the risk of the samd. in [13], the fda uses the risk categories from the international medical device regulators forum (imdrf) [14]. this includes the categories state of healthcare situation or condition (critical vs. serious vs. non-critical) and significance of information provided by the samd to the healthcare decision (treat or diagnose vs. drive clinical management vs. inform clinical management) as the basic attributes. according to [13], the regulatory requirements for the management of ml based systems are considered to depend on this classification as well as on the particular changes which may take place during the lifetime of the device. the fda categorizes them as changes in performance, inputs, and intended use.
such anticipated changes have to be defined in the sps in advance. the main purpose of the present paper is to discuss the validity of the described fda approach for enabling continuously learning systems. it therefore uses a scenario based technique to analyze whether validation in terms of sps and acp can be considered adequate. the scenarios represent applications of ml based devices in the icu. the paper checks the approach's consistency with other important regulatory requirements and analyzes pitfalls which may jeopardize the safety of the devices. additionally, it discusses whether more general requirements can be sufficiently addressed in the scenarios, as e.g. proposed in ethical guidelines for ai based systems like [9, 10]. this is not considered a comprehensive analysis of the topics, but an addition to current discussions about risks and ethical issues, as they are e.g. presented in [10, 12]. finally, the paper proposes its own suggestions to address the regulation of continuously learning ml based systems. again, this is not considered to be a full regulatory strategy, but a proposal of particular requirements which may overcome some of the current limitations of the approach discussed in [13]. the overall aim of this paper is to contribute to a better understanding of the options and challenges of ai/ml based devices and to enable the development of best practices and appropriate regulatory strategies in the future. within this paper, the analysis of the fda approach proposed in [13] is performed using specific reference scenarios from icu applications, which are taken from [13] itself. the focus lies on ml based devices which allow continuous updates of the model according to data collected during the lifetime of the device. in this context, sps and acp are considered as crucial steps which allow an automated validation of the device based on specified measures.
in particular, the requirements and limitations of such an automated validation are analyzed and discussed, including the following topics / questions:
- is automated validation reasonable for these cases? what are limitations / potential pitfalls of such an approach when applied in the particular clinical context?
- which additional risks could apply to ai/ml based samd in general, going beyond the existing discussions in the literature as e.g. presented in [9, 10, 12]?
- how should such issues be taken into account in the future? what could be appropriate measures / best practices to achieve reliability?
ur-ai 2020 // 56
the following exemplary scenarios are used for this purpose:
- base scenario icu: ml based intensive care unit (icu) monitoring system where the detection of critical situations (e.g. regarding physiological instability, potential myocardial infarcts or sepsis) is addressed by using ml. using auditory alarms, the icu staff is informed to initiate appropriate measures to treat the patients in these situations. this scenario addresses a 'critical healthcare situation or condition' and is considered to 'drive clinical management' (according to the risk classification used in [13]).
- modification "locked": the icu scenario as presented above, where the release of the monitoring system is done according to a locked state of the algorithm.
- modification "cont-learn": the icu scenario as presented above, where the detection of alarm situations is continuously improved according to data acquired during daily routine, including adaptation of performance to sub-populations and/or characteristics of the local environment. in this case, sps and acp have to define standard measures like success rates of alarms/detection and requirements for the management of data, update of the algorithm, and labeling. more details of such requirements are discussed later. this scenario was presented as scenario 1a in [13] with minor modifications.
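to make the "cont-learn" modification concrete, the kind of automated validation gate that an sps/acp might encode can be sketched in a few lines. the following python fragment is an illustrative assumption only: the metric names, thresholds, and function names are hypothetical and are not taken from [13].

```python
# illustrative sketch of an sps/acp-style automated validation gate for the
# "cont-learn" icu scenario. all names and thresholds are hypothetical.

def alarm_metrics(labels, alarms):
    """compute sensitivity and specificity from logged alarm decisions.
    labels/alarms are parallel lists of booleans (True = critical situation /
    alarm raised)."""
    tp = sum(1 for y, a in zip(labels, alarms) if y and a)
    fn = sum(1 for y, a in zip(labels, alarms) if y and not a)
    tn = sum(1 for y, a in zip(labels, alarms) if not y and not a)
    fp = sum(1 for y, a in zip(labels, alarms) if not y and a)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity

# hypothetical pre-specified acceptance thresholds (the "sps" part)
SPS = {"min_sensitivity": 0.95, "min_specificity": 0.80}

def accept_update(labels, alarms, sps=SPS):
    """the "acp" part: accept an updated model only if it meets every
    pre-specified threshold on held-out validation data."""
    sens, spec = alarm_metrics(labels, alarms)
    return sens >= sps["min_sensitivity"] and spec >= sps["min_specificity"]
```

as the paper argues below, such a gate only checks technical parameters; whether passing it also guarantees a valid clinical association is exactly the open question.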
this section provides the basic analysis of the scenarios according to the particular aspects addressed in this paper. it covers the topics of automated validation, man-machine interaction, explainability, bias effects and confounding, fairness and non-discrimination, as well as corrective actions for systematic deficiencies. according to standard regulatory requirements [1, 15, 16], validation is a core step in the development and for the release of medical devices. according to [17], a change in the performance of a device (including an algorithm in a samd) as well as a change in particular risks (e.g. new risks, but also a new risk assessment or new measures) usually triggers a new premarket notification (510(k)) for most of the devices which get onto the market in the us. thus, such situations require an fda review for clearance of the device. for samd, this requires an analytical evaluation, i.e. correct processing of input data to generate accurate, reliable, and precise output data. additionally, a clinical validation as well as the demonstration of a valid clinical association need to be provided. [18] this is intended to show that the outputs of the device appropriately work in the clinical environment, i.e. have a valid association regarding the targeted clinical condition and achieve the intended purpose in the context of clinical care. [18] thus, based on the current standards, a device with continuously changing performance usually requires a thorough analysis regarding its validity. this is one of the main points where [13] proposes to establish a new approach for the "cont-learn" cases. as already mentioned, sps and acp basically have to be considered as tools for automated validation in this context. within this new approach, the manual validation step is replaced by an automated process with only reduced or even no additional control by a human observer. thus, it may work as an automated or even fully automatic, closed-loop validation approach.
the question is whether this change can be considered an appropriate alternative. in the following, this question is addressed using the icu scenario with a main focus on the "cont-learn" case. some of the aspects also apply to the "locked" case, but the impact is considered to be higher in the "cont-learn" situation, since the validation step has to be performed in an automated fashion. human oversight, which is usually considered important, is not included here during the particular updates. within the icu scenario, the validation step has to ensure that the alarm rates stay on a sufficiently high level, regarding standard factors like specificity, sensitivity, area under the curve (auc), etc. basically, these are technical parameters which can be analyzed in an analytical evaluation as discussed above (see also [18]). this could also be applied to situations where continuous updates are made during the lifecycle of the device, i.e. in the "cont-learn" case. however, there are some limitations of the approach. on the one hand, it has to be ensured that this analysis is sound and reliable, i.e. that it is not compromised by statistical effects like bias or other deficiencies in the data. on the other hand, it has to be ensured that the success rates really have a valid clinical association and can be used as a sole criterion for measuring the clinical impact. thus, the relationship between pure success rates and clinical effects has to be evaluated thoroughly, and there may be some major limitations. one major question in the icu scenario is whether better success rates really guarantee a higher or at least sufficient level of clinical benefit. this is not innately given. for example, a higher success rate of the alarms may still have a negative effect when the icu staff relies more and more on the alarms and subsequently reduces attention.
thus, it may be the case that the initiation of appropriate treatment steps is compromised even though the actually occurring alarms seem to be more reliable. in particular, this may apply in situations where the algorithms are adapted to local settings, like in the "cont-learn" scenario. here, the ml based system is intended to be optimized to sub-populations in the local environment or to specific treatment preferences at the local site. according to habituation effects, the staff's expectations get aligned to the algorithm's behavior to a certain degree after a period of time. but when the algorithm changes, or an employee from another hospital or department takes over duties in the local unit, the reliability of the alarms may be affected. in these cases, it is not clear whether the expectations are well aligned with the current status of the algorithm, either in the positive or the negative direction. since the data updates of the device are intended to improve its performance w.r.t. detection rates, it is clear that significant effects on user interaction may happen. under some circumstances, the overall outcome in terms of the clinical effect may be impaired. such risks have to be evaluated during validation. it is questionable whether this can be performed by an automatic validation approach which focuses on alarm rates but does not include an assessment of the associated risks. at least, a clear relationship between these two aspects has to be demonstrated in advance. it is also unclear whether this could be achieved by the assessment of purely technical parameters which are defined in advance as required by the sps and acp. usually, ml based systems are trained for a specific scenario. they provide a specific solution for this particular problem. but they do not have a more general intelligence and cannot reason about potential risks which were not under consideration at that point of time.
such a more general intelligence can only be provided by human oversight. in general, it is not clear whether technical aspects like alarms lead to valid reactions by the users. in technical terms, alarm rates are basically related to the probability of occurrence of specific hazardous situations, but they do not address a full assessment of the occurrence of harm. however, this is pivotal for risk assessment in medical devices, in particular for risks related to potential use errors. this is considered to be one of the main reasons why a change in risk parameters triggers a new premarket approval in the us according to [17]. also, the mdr [1] sets high requirements to address the final clinical impact and not only technical parameters. basically, the example emphasizes the importance of considering the interaction between man and machine, or in this case, the algorithm and its clinical environment. this is addressed in the usability standards for medical devices, e.g. iec 62366 [19]. for this reason, iec 62366 requires that the final (summative) usability evaluation is performed using the final version of the device (in this case, the algorithm) or an equivalent version. this is in conflict with the fda proposal, which allows to perform this assessment based on previous versions. at most, a predetermined relationship between technical parameters (alarm rates) and clinical effects (in particular, use related risks) can be obtained. for the usage of ml based devices, it remains crucial to consider the interaction between the device and the clinical environment, as there usually are important interrelationships. the outcome of an ml based algorithm always depends on the data it is provided with. whenever a clinically relevant input parameter is omitted, the resulting outcome of the ml based system is limited. in the presented scenarios, the pure alarm rates may not be the only clinically relevant outcomes.
even so, such parameters are usually the main focus regarding the quality of algorithms, e.g. in publications about ml based techniques. this is due to the fact that such quality measures are commonly considered the best available objective parameters, which allow a comparison of different techniques. this applies even more to other ml based techniques which are also very popular in the scientific community, like segmentation tasks in medical image analysis. here the standard quality measures are general distance metrics, i.e. differences between segmented areas. [20] they usually do not include specific clinical aspects like the accuracy in specific risk areas, e.g. near important blood vessels or nerves. but such aspects are key factors for ensuring the safety of a clinical procedure in many applications. again, typically only technical parameters are in focus; the association to the clinical effects is not assessed accordingly. this situation is depicted in fig. 2 for the icu as well as the image segmentation cases. additionally, the validity of an outcome in medical treatments depends on many factors. regarding input data, multiple parameters from a patient's individual history may be important for deciding about a particular diagnosis or treatment. a surgeon usually has access to a multitude of data and also side conditions (like socio-economic aspects) which should be included in an individual diagnosis or treatment decision. his general intelligence and background knowledge allow him to include a variety of individual aspects which have to be considered for a specific case-based decision. in contrast, ml based algorithms rely on a more standardized structure of input data and are only trained for a specific purpose. they lack a more general intelligence which would allow them to react in very specific situations. even more, ml based algorithms need to generalize and thus to mask out very specific conditions, which could be fatal in some cases.
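the gap noted above between generic segmentation metrics and clinical relevance can be made concrete with a toy sketch: two segmentations can share the same generic overlap score while differing sharply once errors inside a designated risk region are weighted. the weighting scheme below is a hypothetical illustration, not a standard metric from the literature.

```python
# illustrative sketch: a generic overlap metric (dice) versus a hypothetical
# clinically weighted error that penalizes mistakes inside a designated risk
# region (e.g. voxels near an important vessel).

def dice(pred, truth):
    """dice overlap of two binary masks given as sets of voxel indices."""
    if not pred and not truth:
        return 1.0
    return 2 * len(pred & truth) / (len(pred) + len(truth))

def risk_weighted_error(pred, truth, risk_region, weight=10.0):
    """count disagreeing voxels, weighting disagreements inside the risk
    region 'weight' times as heavily as those outside (hypothetical scheme)."""
    disagree = pred ^ truth  # symmetric difference = all mislabeled voxels
    return sum(weight if v in disagree & risk_region else 1.0 for v in disagree)

truth = set(range(10))
pred_a = (truth - {0}) | {10}   # errs at voxel 10, inside the risk region
pred_b = (truth - {5}) | {11}   # errs only outside the risk region
```

here `dice(pred_a, truth)` and `dice(pred_b, truth)` are identical, yet `risk_weighted_error` with `risk_region={10}` rates `pred_a` far worse, which is the kind of clinical distinction a purely technical metric misses.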
in [13], the fda presents some examples where changes of the inputs of an ml based samd are included. it is surprising that the fda considers some of them as candidates for a continuous learning system which does not need an additional review when a tailored sps/acp is available. such discrepancies between technical outcomes and clinical effects also apply to situations like the icu scenario, which only informs or drives clinical management. often users rely on automatically provided decisions, even when they are informed that this is only a proposal. again, this is a matter of man-machine interaction. this gets even worse due to the lack of explainability which ml based algorithms typically have. [9, 21] when surgeons or, more generally, users (e.g. icu staff) detect situations which require a diverging treatment because of very specific individual conditions, they should overrule the algorithm. but users will often be confused by the outcome of the algorithm and do not have a clear idea how they should treat conflicting results between the algorithm's suggestions and their own belief. as long as the ml based decision is not transparent to the user, they will not be able to merge these two directions. the ibm watson example referenced in the introduction shows that this actually is an issue [11]. this may be even more serious when the users (i.e. healthcare professionals) fear litigation because they did not trust the algorithm. in a situation where the algorithm's outcome finally turns out to be true, they may be sued because of this documented deviation. because of such issues, the eu general data protection regulation (gdpr) [22] requires that the users get autonomy regarding their decisions and transparency about the mechanisms underlying the algorithm's outcome. [23] this may be less relevant for the patients, who usually have only limited medical knowledge. they will probably also not understand the medical decisions in conventional cases.
but it is highly relevant for responsible healthcare professionals. they require basic insights into how the decision emerged, as they finally are in charge of the treatment. this demonstrates that methods regarding the explainability of ml based techniques are important. fortunately, this is currently a very active field of research. [21, 24] the need for explainability applies to locked algorithms as well as to situations where continuous learning is applied. due to their data-driven nature, ml based techniques highly depend on a very high quality of the data provided for learning and validation. in particular, this is important for the analytical evaluation of the ml algorithms. one of the major aspects are bias effects due to unbalanced input data. for example, in [25] substantially different detection rates between white people and people of color were recognized due to unbalanced data. beside ethical considerations, this demonstrates dependencies of the outcome quality on sub-populations, which may be critical in some cases. however, the fda proposal [13] currently does not consistently include specific requirements for assessing bias factors or imbalance of data, even though high quality requirements for data management are crucial for ml based devices. in particular, this applies to the icu "cont-learn" case. there have to be very specific protocols that guarantee that new data and updates of the algorithms are highly reliable w.r.t. bias effects. most of the currently used ml based algorithms fall under the category of supervised learning. thus, they require accurate and clinically sound labeling of the data. during the data collection, it has to be ensured how this labeling is performed and how the data can be fed back into the system in a "cont-learn" scenario. additionally, the data needs to stay balanced, whatever this means in a situation where adaptations to sub-populations and/or local environments are intended for optimization.
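a small part of the "data needs to stay balanced" requirement could at least be automated with simple distribution checks on each batch of newly collected records before it is fed back into a "cont-learn" update. the sketch below is a hypothetical illustration of such a check; the subgroup key, reference distribution, and tolerance are assumptions.

```python
# illustrative sketch: an acp-style balance check that rejects a batch of new
# training records whose subgroup distribution drifts too far from a reference
# distribution (e.g. the local patient population). names are hypothetical.
from collections import Counter

def subgroup_shares(records, key):
    """relative frequency of each subgroup value among the records."""
    counts = Counter(r[key] for r in records)
    total = sum(counts.values())
    return {group: c / total for group, c in counts.items()}

def batch_is_balanced(records, key, reference_shares, tolerance=0.10):
    """accept the batch only if every subgroup share stays within an absolute
    tolerance of the reference distribution."""
    shares = subgroup_shares(records, key)
    groups = set(reference_shares) | set(shares)
    return all(abs(shares.get(g, 0.0) - reference_shares.get(g, 0.0)) <= tolerance
               for g in groups)
```

such a check only catches gross imbalance in one declared attribute; as discussed next, it is questionable whether checks of this kind can replace review by staff who understand the algorithmic pitfalls.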
it is unclear whether and how this could be achieved by staff who only operate the system but possibly do not know potential algorithmic pitfalls. in the icu scenario, many data points probably need to be recorded by the system itself. thus, a precise and reliable recording scheme has to be established which automatically avoids imbalance of the data on the one hand and addresses the fusion with manual labelings on the other hand. basically, the sps and acp (proposed in [13]) are tools to achieve this. the question is whether this is possible in a reliable fashion using automated processes. a complete closed-loop validation approach seems questionable, especially when the assessment of the clinical impact has to be included. thus, the integration of humans, including adequate healthcare professionals as well as ml/ai experts with sufficient statistical knowledge, seems reasonable. at least, bias assessment steps should be included. as already mentioned, this is not addressed in [13] in a dedicated way. further on, the outcomes may be compromised by side effects in the data. it may be the case that the main reason for a specific outcome of the algorithm is not a relevant clinical parameter but a specific data artifact, i.e. some confounding factor. in the icu case, it could happen that the icu staff reacts early to a potentially critical situation and e.g. gives specific medication in advance to prevent upcoming problems. the physiological reaction of the patient can then be visible in the data as some kind of artifact. during its learning phase, the algorithm may learn to recognize the critical situation not based on a deeper clinical reason, but by detecting the physiological reaction pattern. this may cause serious problems, as shown subsequently.
in the presented scenario, the definition of the clinical situation and the pattern can be deeply coupled by design, since the labeling of the data by the icu staff and the administration of the medication will probably be done in combination at the particular site. this may increase the probability of such effects. usually, confounding factors are hard to determine. even when they can be detected, they are hard to communicate and manage in an appropriate way. how should healthcare professionals react when they get such potentially misleading information (see the discussion about liability)? this further limits the explanatory power of ml based systems. when confounders are not detected, they may have unpredictable outcomes w.r.t. the clinical effects. for example, consider the following case. in the icu scenario, an ml based algorithm gets trained in a way that it basically detects the medication artifact described above during the learning phase. in the next step, this algorithm is used in clinical practice and the icu staff relies on the outcome of the algorithm. then, on the one hand, the medication artifact is not visible unless the icu staff administers the medication. on the other hand, the algorithm does not recognize the pattern and thus does not provide an alarm. subsequently, the icu staff does not act appropriately to manage the critical situation. in particular, such confounders may be more likely in situations where a strong dependence between the outcome of the algorithm and the clinical treatment exists. further examples of such effects were discussed in [7] for icu scenarios. the occurrence of confounders may be a bit less probable in pure diagnostic cases without influence of the diagnostic task on the generation of the data. but even here, such confounding factors may occur. the discussion in [10] provides examples where confounders may occur in diagnostic cases, e.g. because of rulers placed for measurements on radiographs.
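the medication-artifact confounder described above can be reproduced with a toy experiment on synthetic data: when a spurious feature tracks the label perfectly during training, even a trivial learner prefers it over the genuine (noisy) physiological signal. everything below is synthetic and the "learner" is deliberately minimal; it is a sketch of the mechanism, not of any real icu model.

```python
# illustrative toy demonstration of the confounding effect: during training,
# staff medicate early, so a "medication artifact" feature is perfectly
# correlated with the critical label. a naive learner that selects the single
# most predictive binary feature will latch onto the artifact. at deployment,
# where staff wait for the alarm before medicating, the artifact is absent and
# the detector fails. all data and feature names are synthetic assumptions.
import random

def best_single_feature(records, features, label="critical"):
    """pick the binary feature that best predicts the label (by accuracy)."""
    def accuracy(f):
        return sum(r[f] == r[label] for r in records) / len(records)
    return max(features, key=accuracy)

random.seed(0)
train = []
for _ in range(200):
    critical = random.random() < 0.3
    train.append({
        "critical": critical,
        # genuine but noisy physiological signal (agrees with label ~70%)
        "heart_rate_high": critical if random.random() < 0.7 else not critical,
        # confounder: staff pre-medicate exactly in critical situations, so the
        # artifact matches the label perfectly in the training data
        "medication_artifact": critical,
    })

chosen = best_single_feature(train, ["heart_rate_high", "medication_artifact"])
# the learner picks "medication_artifact", the feature that disappears once
# the staff starts waiting for the algorithm's alarm before medicating
```

the point of the sketch is that nothing in a purely accuracy-driven, automated validation would flag this choice: on the training and validation data the confounded model looks strictly better.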
in most of the publications about ml based techniques, such side effects are not discussed (or only in a limited fashion). in many papers, the main focus is the technical evaluation and not the clinical environment or the interrelation between technical parameters and clinical effects. additional important aspects which are amply discussed in the context of ai/ml based systems are discrimination and fairness (see e.g. [10]). in particular, the eu places a high priority on fairness requirements in its future ai/ml strategy [9]. fairness is often closely related to bias effects, but it extends to more general ethical questions, e.g. regarding the natural tendency of ml based systems to favor specific subgroups. for example, the icu scenario "cont-learn" is intended to optimize w.r.t. specifics of sub-populations and local characteristics, i.e. it tries to make the outcome better for specific groups. based on such optimization, other groups (e.g. minorities, underrepresented groups) which are not well represented may be discriminated against in some sense. this is not a statistical but a systematic effect. superiority of a medical device for a specific subgroup (e.g. defined by gender or social environment) is not uncommon. for example, some diagnosis steps, implants, or treatments achieve deviating success rates when applied to women in comparison to men. this also applies to differences between adults and children. when assessing bias in the clinical outcome of ml based devices, it will probably often be unclear whether this is due to imbalance of data or a true clinical difference between the groups. does an ml based algorithm have to adjust the treatment of a subgroup to a higher level, e.g. a better medication, to achieve comparable results when the analysis recognized worse results for this subgroup? another example could be a situation where the particular group does not have the financial capabilities to afford the high-level treatment. this could e.g.
be the case in a developing country or in subgroups with a lower insurance level. in these cases, the inclusion of socio-economic parameters into the analysis seems to be unavoidable. subsequently, this compromises the notion of fairness as a basic principle in some way. this is nothing genuine to ml based devices. but in the case of ml based systems with a high degree of automation, the responsibility for the individual treatment decision shifts more and more from the healthcare professional to the device. it is implicitly defined in the ml algorithm. in comparison to human reasoning, which allows some weaknesses in terms of individual adjustments of general rules, ml based algorithms are rather deterministic / unique in their outcome. for a fixed input, they have one dedicated outcome (when we neglect statistical algorithms which may allow minor deviations). differences of opinion and room for individual decisions are main aspects of ethics. thus, it remains unclear how fairness can be defined and implemented at all when considering ml based systems. this is even more challenging as socio-economic aspects (even more than clinical aspects) are usually not included in the data and analysis of ml based techniques in medicine. additionally, they are hard to assess and implement in a fair way, especially when using automated validation processes. another disadvantage of ml based devices is the limited opportunity to fix systematic deficiencies in the outcome of the algorithm. let us assume that during the lifetime of the icu monitoring system a systematic deviation from the intended outcome is detected, e.g. in the context of post-market surveillance or due to an increased number of serious adverse events. according to standard rules, a proper preventive or corrective action has to be taken by the manufacturer. in conventional software devices, the error should simply be eliminated, i.e. some sort of bug fixing has to be performed.
for ml based devices it is less clear how bug fixing should work, especially when the systematic deficiency is deeply hidden in the data and/or the ml model. in these cases, there usually is no clear reason for the deficiency. subsequently, the deficiency cannot be resolved in a straightforward way using standard bug fixing. there is no dedicated route to find the deeper reasons and to perform changes which could cure the deficiency, e.g. by providing additional data or changing the ml model. even more, other side effects may easily occur when data and model are changed manually with the intent to fix the issue.

4 discussion and outlook

in summary, there are many open questions which are not yet clarified. there still is little experience of how ml based systems work in clinical practice and which concrete risks may occur. thus, the fda's commitment to foster the discussion about ml based samd is necessary and appreciated by many stakeholders, as the feedback docket [26] for [13] shows. however, it is a bit surprising that the fda proposes to substantially reduce its very high standards in [13] at this point of time. in particular, it is questionable whether an adequate validation can be achieved by using a fully automatic approach as proposed in [13]. ml based devices are usually optimized according to very specific goals. they can only account for the specific conditions that are reflected in the data and the used optimization / quality criteria. they do not include side conditions and a more general reasoning about potential risks in a complex environment. but this is important for medical devices. for this reason, a more deliberate path would be suitable, from the author's perspective. in a first step, more experience should be gained w.r.t. the use of ml based devices in clinical practice. thus, continuous learning should not be a first-hand option.
first, it should be demonstrated that a device works in clinical practice before a continuous learning approach becomes possible. this could also be justified from a regulatory point of view. the automated validation process itself should be considered a feature of the device and part of the design transfer which enables safe use of the device during its lifecycle. as part of the design transfer, it should be validated itself. thus, it has to be demonstrated that this automated validation process, e.g. in terms of the sps and acp, works in a real clinical environment. ideally, this would be demonstrated during the application of the device in clinical practice. thus, one reasonable approach for a regulatory strategy could be to reduce or prohibit the options for enabling automatic validation in a first release / clearance of the device. during the lifetime, direct clinical data could be acquired to provide a better insight into the reliability and limitations of the automatic validation / continuous learning approach. in particular, the relation between technical parameters and clinical effects could be assessed on a broader and more stable basis. based on this evidence in real clinical environments, the automated validation feature could then be cleared in a second round. otherwise, the validity of the automated validation approach would have to be demonstrated in a comprehensive setting during the development phase. in principle, this is possible when enough data is available which truly reflects a comprehensive set of situations. as discussed in this paper, there are many aspects which render this approach not impossible, but very challenging. in particular, this applies to the clinical effects and the interdependency between the users and the clinical environment on the one hand and the device, including the ml algorithm, data management, etc., on the other hand.
this also includes not only variation in the status and needs of the individual patient but also the local clinical environment and potentially the socioeconomic setting. following a consequent process validation approach, it would have to be demonstrated that the algorithm reacts in a valid and predictable way no matter which training data have been provided, which environments have to be addressed, and which local adjustments have been applied. this also needs to include deficient data and inputs in some way. in [20] it has been shown that the variation of outcomes can be substantial, even with respect to rather simple technical parameters. in [20] this was analyzed for scientific contests ("challenges") in which renowned scientific groups supervised the quality of the submitted ml algorithms. this demonstrates the challenges that validation steps for ml based systems still involve, even at the level of technical evaluation. for these reasons, it seems adequate to pursue the regulatory strategy in a more deliberate way. this includes the restriction of the "cont-learn" cases as proposed. it also includes a better classification scheme defining where automated or fully automatic validation is possible. currently, the proposal in [13] does not provide clear rules on when continuous learning is allowed. it does not really address a dedicated risk-based approach that defines which options and limitations are applicable. for some options, like the change of the inputs, it should be reviewed whether automatic validation is a natural option. additionally, the dependency between technical parameters and clinical effects as well as risks should get more attention. in particular, the grade of interrelationship between the clinical actions and the learning task should be considered. in general, the discussions about ml based medical devices are very important.
these techniques provide valuable opportunities for improvements in fields like medical technology, where evidence based on high-quality data is crucial. this applies to the overall development of medicine as well as to the development of sophisticated ml based medical devices. it also includes the assessment of treatment options and of the success of particular devices during their lifetime. data-driven strategies will be important for ensuring high-level standards in the future. they may also strengthen regulatory oversight in the long term by amplifying the necessity of post-market activities. this seems to be one of the promises the fda envisions according to its concepts of "total product lifecycle quality (tplc)" and "organizational excellence" [13]. the mdr also strengthens the requirements for data-driven strategies in the pre- as well as the post-market phase. but it should not shift the priorities from a basically proven-quality-in-advance (ex-ante) to a primarily ex-post regulation, which in the extreme boils down to a trial-and-error oriented approach. thus, we should aim at a good compromise between pushing these valuable and innovative options on the one hand and their potential challenges and deficiencies on the other hand.

computer-assisted technologies in medical interventions are intended to support the surgeon during treatment and to improve the outcome for the patient. one possibility is to augment reality with additional information that would otherwise not be perceptible to the surgeon. in medical applications it is particularly important that demanding spatial and temporal conditions are adhered to. challenges in augmenting the operating room are the correct placement of holograms in the real world, and thus the precise registration of multiple coordinate frames to each other, the exact scaling of holograms, and the performance capacity of processing and rendering systems. in general, two different scenarios can be distinguished.
first, there are applications in which placing holograms with an accuracy of 1 cm or above is sufficient. these are mainly applications where a person needs a three-dimensional view of data. an example in the medical field is the visualization of patient data, e.g. to understand and analyse the anatomy of a patient for diagnosis or surgical planning. the correct visualization of these data can be of great benefit to the surgeon. often only 2d patient data is available, such as ct or mri scans. the availability of 3d representations depends strongly on the field of application. in neurosurgery 3d views are available but often not extensively utilized due to their limited informative value. additionally, computer monitors are a big limitation because the data cannot be visualized at real-world scale. a further application area is the translation of known user interfaces into augmented reality (ar) space. the benefit here is that the surgeon refrains from touching anything and can instead interact with the interface in space using hand or voice gestures. applications visualizing patient data, such as ct scans, only require a rough positioning of the image or holograms in the operating room (or). thus, the surgeon can conveniently place the application freely in space. the main requirement is then to keep the holograms in a constant position; therefore the internal tracking of the ar device is sufficient to hold the holograms at a fixed position in space. the second scenario covers all applications in which an exact registration of holograms to the real world is required, in particular with a precision below 1 cm. these scenarios are more demanding, especially when holograms must be placed precisely over real patient anatomy. to achieve this, patient tracking is essential to determine the position and to follow patient movements. the system therefore needs to track the patient and adjust the visualization to the current situation.
furthermore, it is necessary to track and augment surgical instruments and other objects in the operating room. the augmentation needs to be visualized at the correct spatial position, and time constraints need to be fulfilled. therefore, the ar system needs to be embedded into the surgical workflow and react to it. to achieve these goals, modern state-of-the-art machine learning algorithms are required. however, the computing power of available ar devices is often not yet sufficient for sophisticated machine learning algorithms. one way to overcome this shortcoming is the integration of the ar system into a distributed system with higher capabilities, such as the digital operating theatre op:sense (see fig. 2). in this work the augmented reality system holomed [4] (see fig. 1) is integrated into the surgical research platform for robot-assisted surgery op:sense [5]. the objective is to enable high-quality and patient-safe neurosurgical procedures and to increase the surgical outcome by providing surgeons with an assistance system that supports them in cognitively demanding operations. the physician's perception limits are extended by the ar system, which is based on supporting intelligent machine learning algorithms. ar glasses allow the neurosurgeon to perceive the internal structures of the patient's brain. the complete system is demonstrated by applying this methodology to the ventricular puncture of the human brain, one of the most frequently performed procedures in neurosurgery. the ventricle system has an elongated shape with a width of 1-2 cm and is located at a depth of 4 cm inside the human head. patient models are generated quickly (< 2 s) from ct data [3]; they are superimposed over the patient during the operation and serve as a navigation aid for the surgeon.
in this work the expanded system architecture is presented to overcome some limitations of the original system, in which all information was processed on the microsoft hololens, which led to performance deficits. to overcome these shortcomings, the holomed project was integrated into op:sense for additional sensing and computing power. to integrate ar into the operating room and the surgical workflows, the patient, the instruments and the medical staff need to be tracked. to track the patient, a marker system is fixated on the patient's head and the registration from the marker system to the patient is determined. a two-stage process was implemented for this purpose. first, the rough position of the patient's head on the or table is determined by applying a yolo v3 net to reduce the search space. then a robot with a mounted rgb-d sensor is used to scan the acquired area and build a point cloud of it. to determine the position of the patient's head in space as precisely as possible, a two-step surface matching approach is utilized. during recording, the markers are also tracked. with the position of the patient and the markers known, the registration matrix can be calculated. for the ventricular puncture, a solution is proposed to track the puncture catheter in order to determine the depth of insertion into the human brain. by tracking the medical staff, the system is able to react to the current situation, e.g. when an instrument is passed. in the following, these solutions are described in detail. to detect the patient's head in our digital operating room op:sense (illustrated in fig. 2a), the coarse position is first determined with the yolo v3 cnn [6], performed on the kinect rgb image streams. the position in 3d is determined through the depth streams of the sensors. the or table and the robots are tracked with retroreflective markers by the arttrack system. this step reduces the spatial search area for the fine adjustment.
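the coarse-localization step above, a 2d detector box lifted into a 3d search region via the aligned depth stream, can be sketched as follows. the pinhole intrinsics, the box values and the 0.1 m depth gate are illustrative assumptions, not values from the paper:

```python
# Sketch: lift a 2D detector bounding box into a 3D search region using the
# aligned depth image and pinhole intrinsics. All numbers are illustrative.
from statistics import median

def bbox_to_3d(depth, bbox, fx, fy, cx, cy):
    """Return the 3D centroid (metres) of the foreground depth pixels in bbox."""
    x0, y0, x1, y1 = bbox
    pts = []
    for v in range(y0, y1):
        for u in range(x0, x1):
            z = depth[v][u]
            if z > 0:                       # 0 encodes "no measurement"
                pts.append((u, v, z))
    z_med = median(p[2] for p in pts)        # robust against outliers
    # keep only pixels near the median depth (reject background hits)
    pts = [p for p in pts if abs(p[2] - z_med) < 0.1]
    xs = [(u - cx) * z / fx for u, v, z in pts]
    ys = [(v - cy) * z / fy for u, v, z in pts]
    zs = [z for _, _, z in pts]
    n = len(pts)
    return (sum(xs) / n, sum(ys) / n, sum(zs) / n)

# toy 8x8 depth image: "head" at ~0.9 m inside the box, background at 2.0 m
depth = [[2.0] * 8 for _ in range(8)]
for v in range(2, 6):
    for u in range(2, 6):
        depth[v][u] = 0.9
center = bbox_to_3d(depth, (2, 2, 6, 6), fx=500.0, fy=500.0, cx=4.0, cy=4.0)
```

the returned centroid then defines the region the robot-mounted rgb-d sensor scans for the fine surface-matching stage.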
the franka panda has an attached intel realsense rgb-d camera, as depicted in fig. 3. the precise determination of the position is performed on the depth data with surface matching. the robot scans the area of the coarsely determined position of the patient's head. a combined surface matching approach with feature-based and icp matching was implemented. the process to perform the surface matching is depicted in fig. 4. in clinical reality, a ct scan of the patient's head is always performed prior to a ventricular puncture for diagnosis, such that we can safely assume the availability of ct data. a process to segment the patient models from ct data was proposed by kunz et al. in [3]. the algorithm processes the ct data extremely fast, in under two seconds. the data format is '.nrrd', a volume model format which can easily be converted into surface models or point clouds. the point cloud of the ct scan of the patient's head is the reference model that needs to be found in or space. the second point cloud is recorded from the realsense depth stream mounted on the panda robot by scanning the previously determined rough position of the patient's head. all points are recorded in world coordinate space. the search space is further restricted with a segmentation step by filtering out points that are located on the or table. additionally, manual changes can be made by the surgeon. as a performance optimization, the resolution of the point clouds is reduced to decrease processing time without losing too much accuracy. the normals of both point clouds, generated from the ct data and from the recorded realsense depth stream, are subsequently calculated and harmonised. during this step the harmonisation is especially important, as the normals are often misaligned. this misalignment occurs because the ct data is a combination of several individual scans.
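one simple way to harmonise such normals is to orient every normal toward an interior reference point and then flip all of them outward; a pure-python sketch, assuming the reference point is supplied manually (the function name and the toy geometry are ours, not the paper's):

```python
# Sketch of normal harmonisation: orient each normal toward an interior
# reference point, then invert all normals so they point outward.
def harmonise_normals(points, normals, ref):
    out = []
    for p, n in zip(points, normals):
        to_ref = tuple(r - c for r, c in zip(ref, p))
        dot = sum(a * b for a, b in zip(n, to_ref))
        if dot < 0:                       # normal points away from the reference
            n = tuple(-c for c in n)
        # second pass: invert so every normal faces outward, away from ref
        out.append(tuple(-c for c in n))
    return out

# two surface points of a "head" centred at the origin; the second normal
# is misaligned (points inward) and gets repaired
points  = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
normals = [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
fixed = harmonise_normals(points, normals, ref=(0.0, 0.0, 0.0))
```

after this pass all normals consistently face outward, which is what the feature-based alignment in the next step relies on.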
to align all normals, a point inside the patient's head is chosen manually as a reference point; all normals are oriented in the direction of this point and subsequently inverted to point to the outside of the head (see fig. 5). after the preprocessing steps, the first surface fitting step is executed. it is based on the initial alignment algorithm proposed by rusu et al. [8]; an implementation within the point cloud library (pcl) is used. fast point feature histograms therefore need to be calculated as a preprocessing step. in the last step an iterative closest point (icp) algorithm is used to refine the surface matching result. after the two point clouds have been aligned to each other, the inverse transformation matrix can be calculated to obtain the correct transformation from the marker system to the patient model coordinate space. as outlined in fig. 6, catheter tracking was implemented based on semantic segmentation using a full-resolution residual network (frrn) [7]. after the semantic segmentation of the rgb stream of the kinect cameras, the image is fused with the depth stream to determine the voxels in the point cloud belonging to the catheter. as a further step, a density-based clustering approach [2] is performed on the chosen voxels. this is due to noise, especially on the edges of the instrument voxels in the point cloud. based on the found clusters, an estimation of the three-dimensional structure of the catheter is performed. for this purpose a narrow cylinder with variable length is constructed. the length is changed according to the semantic segmentation and the clustered voxels of the point cloud. the approach is applicable to identify a variety of instruments. the openpose [1] library is used to track key points on the bodies of the medical staff. available ros nodes have been modified to integrate openpose into the op:sense ros environment. the architecture is outlined in fig. 7.
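the density-based clustering step on the candidate catheter voxels can be sketched as a dbscan-style pass that keeps dense groups and discards sparse edge noise; the eps and min_pts thresholds and the toy voxel data are illustrative assumptions:

```python
# Sketch of the density-based noise rejection on candidate catheter voxels:
# greedily grow clusters of points closer than eps, then discard clusters
# with fewer than min_pts members as noise.
def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cluster(points, eps=1.5, min_pts=3):
    labels = [-1] * len(points)          # -1 = unvisited, None = noise
    cid = 0
    for i in range(len(points)):
        if labels[i] != -1:
            continue
        stack, members = [i], []
        labels[i] = cid
        while stack:                     # grow the cluster from point i
            j = stack.pop()
            members.append(j)
            for k in range(len(points)):
                if labels[k] == -1 and dist2(points[j], points[k]) <= eps * eps:
                    labels[k] = cid
                    stack.append(k)
        if len(members) < min_pts:       # too sparse: mark members as noise
            for j in members:
                labels[j] = None
        else:
            cid += 1
    return labels

# a dense line of voxels (the catheter) plus two isolated noise voxels
voxels = [(0.0, 0.0, float(i)) for i in range(10)] + [(5.0, 5.0, 0.0), (-5.0, 4.0, 2.0)]
labels = cluster(voxels)
```

the surviving cluster is what the variable-length cylinder is subsequently fitted to.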
in this chapter the results of the patient, catheter and medical staff tracking are described. the approach to find the coarse position of a patient's head was evaluated on a phantom head placed on the or table within op:sense. multiple scenarios with changing illumination and occlusion conditions were recorded. the results are depicted in fig. 8 and the evaluation results in table 1. precise detection of the patient was performed with the two-stage surface matching approach. different point cloud resolutions were tested with regard to runtime behaviour. voxel grid edge sizes of 6, 4 and 3 mm were tested, with a larger edge size corresponding to a smaller point cloud. the matching results of the two point clouds were analyzed manually. an average accuracy of 4.7 mm was found, with an accuracy range between 3.0 and 7.0 mm. in the first stage of the surface matching, the two point clouds are coarsely aligned, as depicted in fig. 9. in the second step icp is used for the fine adjustment. a two-stage process was implemented because icp requires a good initial alignment of the two point clouds. for catheter tracking a precision of the semantic segmentation between 47% and 84% is reached (see table 3). tracking of instruments, especially neurosurgical catheters, is challenging due to their thin structure and non-rigid shape. detailed results on catheter tracking have been presented in [7]. the 3d estimation of the catheter is shown in fig. 10. the catheter was moved in front of the camera and the 3d reconstruction was recorded simultaneously. over a long period of the recording, over 90% of the catheter is tracked correctly; in some situations this drops to under 50% or lower. the tracking of medical personnel is shown in fig. 11. the different body parts and joint positions are determined, e.g. the head, eyes, shoulders, elbows, etc. the library yielded very good results, as described in [1].
we reached a performance of 21 frames per second on a workstation (intel i7-9700k, geforce 1080 ti) processing one stream.

fig. 11. results of the medical staff tracking.

4 discussion

as shown in the evaluation, our approach succeeds in detecting the patient in an automated two-stage process with an accuracy between 3 and 7 mm. the coarse position is determined by using a yolo v3 net. the results under normal or conditions are very satisfying. the performance of the solution drops strongly under bright illumination conditions. this is due to large flares that occur on the phantom, as it is made of plastic or silicone; these effects do not occur on human skin. the advantage of our system is that the detection is performed on all four kinect rgb streams, enabling different views of the operation area. unfavourable illumination conditions normally do not occur on all of these streams, so a robust detection is still possible. in the future the datasets will be expanded with samples recorded under strong illumination conditions. the subsequent surface matching of the head yields good results and a robust and precise detection of the patient. most important is a good preprocessing of the ct data and of the recorded point cloud of the search area, as described in the methods. the algorithm does not manage to find a result if there are large holes in the point clouds or if the normals are not calculated correctly. additional challenges that have to be considered include skin deformities and noisy ct data. the silicone skin is not fixed to the skull (as human skin is), which leads to changes in position, some of which are greater than 1 cm. also, the processing time of 7 minutes is quite long and must be optimized in the future. the processing time may be shortened by reducing the size of the point clouds; however, in this case the matching results may also become worse.
catheter tracking [7] yielded good results despite the challenging task of segmenting a very thin (2.5 mm) and deformable object. additionally, a 3d estimation of the catheter was implemented. the results showed that in many cases over 90% of the catheter can be estimated correctly. however, these results strongly depend on the orientation and the quality of the depth stream; using higher-quality sensors could improve the detection results. for tracking of the medical staff, openpose was used as a ready-to-use people detection algorithm and integrated into ros. the library produces very good results despite the medical staff wearing surgical clothing. in this work the integration of augmented reality into the digital operating room op:sense is demonstrated. this makes it possible to expand the capabilities of current ar glasses. the system can determine the patient's position precisely by implementing a two-stage process. first a yolo v3 net is used to coarsely detect the patient to reduce the search area. in a second, subsequent step a two-stage surface matching process is implemented for refined detection. this approach allows for precise location of the patient's head for later tracking. further, an frrn-based solution to track surgical instruments in the or was implemented and demonstrated on a thin neurosurgical catheter for ventricular punctures. additionally, openpose was integrated into the digital or to track the surgical personnel. the presented solution will enable the system to react to the current situation in the operating room and is the basis for an integration into the surgical workflow.

due to the emergence of commodity depth sensors, many classical computer vision tasks are now employed on networks of multiple depth sensors, e.g. people detection [1] or full-body motion tracking [2].
existing methods approach these applications using a sequential processing pipeline in which depth estimation and inference are performed on each sensor separately and the information is fused in a post-processing step. in previous work [3] we introduced a scene-adaptive optimization scheme which aims to leverage the accumulated scene context to improve perception as well as post-processing vision algorithms (see fig. 1). in this work we present a proof-of-concept implementation of the scene-adaptive optimization methods proposed in [3] for the specific task of stereo matching in a depth sensor network. we propose to improve the 3d data acquisition step with the help of an articulated shape model which is fitted to the acquired depth data. in particular, we use the known camera calibration and the estimated 3d shape model to resolve disparity ambiguities that arise from repeating patterns in a stereo image pair. the applicability of our approach is shown by preliminary qualitative results. in previous work [3] we introduced a general framework for scene-adaptive optimization of depth sensor networks, suggesting that scene context inferred by the sensor network be exploited to improve the perception and post-processing algorithms themselves. in this work we apply the ideas proposed in [3] to the process of stereo disparity estimation, also referred to as stereo matching. while stereo matching has been studied for decades in the computer vision literature [4, 5], it is still a challenging problem and an active area of research. stereo matching approaches fall into two main categories, local and global methods. local methods, such as block matching [6], obtain a disparity estimate by finding the best matching point on the corresponding scan line through comparison of local image regions, whereas global methods formulate disparity estimation as a global energy minimization problem [7].
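the local block-matching baseline mentioned above can be condensed into a few lines: for each pixel, slide a window along the corresponding scan line of the second image and keep the disparity with the smallest sum of absolute differences (sad). the window size and the toy scan lines below are illustrative:

```python
# Minimal 1D block matching with a SAD cost, the simplest local method.
def sad(left, right, x, d, w):
    """Sum of absolute differences between windows at left[x] and right[x-d]."""
    return sum(abs(left[x + i] - right[x - d + i]) for i in range(-w, w + 1))

def block_match(left, right, max_disp, w=1):
    disp = []
    for x in range(w + max_disp, len(left) - w):
        costs = [sad(left, right, x, d, w) for d in range(max_disp + 1)]
        disp.append(costs.index(min(costs)))   # winner-take-all
    return disp

# the right scan line is the left one shifted by 2 pixels, so the true
# disparity is 2 everywhere
left  = [0, 0, 9, 5, 7, 1, 0, 0, 3, 8, 2, 0]
right = left[2:] + [0, 0]
result = block_match(left, right, max_disp=3)
```

on textured regions this recovers the shift exactly; on repeating or textureless regions the cost curve has several minima, which is precisely the local ambiguity the shape-model guidance later resolves.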
local methods lead to highly efficient, real-time capable algorithms; however, they suffer from local disparity ambiguities. in contrast, global approaches are able to resolve local ambiguities and therefore provide high-quality disparity estimates, but they are in general very time consuming and, without further simplifications, not suitable for real-time applications. the semi-global matching (sgm) introduced by hirschmüller [8] aggregates many feasible local 1d smoothness constraints to approximate global disparity smoothness regularization. sgm and its modifications still offer a remarkable trade-off between the quality of the disparity estimation and the run-time performance. more recent work from poggi et al. [9] focuses on improving stereo matching by taking additional high-quality sources (e.g. lidar) into account. they propose to leverage sparse reliable depth measurements to improve dense stereo matching; the sparse measurements act as a prior for the dense disparity estimation. the proposed approach can be used to improve recent end-to-end deep learning architectures [10, 11] as well as classical stereo approaches like sgm. this work is inspired by [9]; however, our approach does not rely on an additional lidar sensor but instead leverages a priori scene knowledge in terms of an articulated shape model to improve the stereo matching process. we set up four stereo depth sensors with overlapping fields of view. the sensors are extrinsically calibrated in advance, so their pose with respect to a world coordinate system is known. the stereo sensors are pointed at a mannequin and capture eight greyscale images (one image pair for each stereo sensor; the left image of each pair is depicted in fig. 3a). for our experiments we use a high-quality laser scan of the mannequin as ground truth.
we assume that the proposed algorithm has access to an existing shape model that can express the observed geometry of the scene in some capacity. in our experimental setup, we assume a shape model of a mannequin with two articulated shoulders and a slightly different shape in the belly area of the mannequin (see fig. 2). in the remainder of this section we use the provided shape model to improve the depth data generation of the sensor network. first, we estimate the disparity values of each of the four stereo sensors with sgm, without using the human shape model. let p denote a pixel and q an adjacent pixel, let D denote a disparity map, and let d_p, d_q denote the disparities at pixel locations p and q. let P denote the set of all pixels and N the set of all pairs of adjacent pixels. the sgm cost function can then be defined as

E(D) = \sum_{p \in P} D(p, d_p) + \sum_{(p,q) \in N} R(p, d_p, q, d_q),    (1)

where D(p, d_p) denotes the matching term (here the sum of absolute differences in a 7 × 7 neighborhood), which assigns a matching cost to the assignment of disparity d_p to pixel p, and R(p, d_p, q, d_q) penalizes disparity discontinuities between adjacent pixels p and q. in sgm the objective given in (1) is minimized with dynamic programming, leading to the resulting disparity map \hat{D} = \arg\min_D E(D). as input for the shape model fitting we apply sgm to all four stereo pairs, leading to four disparity maps as depicted in fig. 4a. to be able to exploit the articulated shape model for stereo matching we initially need to fit the model to the 3d data obtained by classical sgm as described in 3.2. to be more robust to outliers, we only use disparity values from pixels with high contrast and transform them into 3d point clouds. since we assume that the relative camera poses are known, it is straightforward to merge the resulting point clouds in one world coordinate system. finally, the shape model is fitted to the merged point cloud by optimizing over the shape model parameters, namely the pose of the model and the rotation of the shoulder joints.
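the sgm objective described above (a per-pixel matching term plus a pairwise discontinuity penalty) can be evaluated directly on a toy cost volume. the p1/p2 penalties below are the usual sgm-style charges for jumps of one versus larger jumps; the numbers are illustrative, not the paper's:

```python
# Direct evaluation of the SGM-style objective: data term from a precomputed
# cost volume, smoothness term charging P1 for disparity jumps of 1 and P2
# for larger jumps. Illustrative values only.
P1, P2 = 1.0, 4.0

def smoothness(dp, dq):
    jump = abs(dp - dq)
    return 0.0 if jump == 0 else (P1 if jump == 1 else P2)

def energy(cost_volume, disp):
    data = sum(cost_volume[p][disp[p]] for p in range(len(disp)))
    reg = sum(smoothness(disp[p], disp[p + 1]) for p in range(len(disp) - 1))
    return data + reg

# 4 pixels, disparities 0..2: pixel 1 locally prefers d=2, but the
# smoothness term makes the flat labelling cheaper overall
cost_volume = [
    [0.0, 5.0, 5.0],
    [2.0, 5.0, 0.0],
    [0.0, 5.0, 5.0],
    [0.0, 5.0, 5.0],
]
flat = energy(cost_volume, [0, 0, 0, 0])
spiky = energy(cost_volume, [0, 2, 0, 0])
```

this is exactly the trade-off the dynamic-programming minimization exploits: paying a small data cost to avoid expensive discontinuities.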
we use an articulated mannequin shape model in this work as a proxy for an articulated human shape model (e.g. [2]) as a proof of concept and plan to transfer the proposed approach to real humans in future work. once the parameters of the shape model are obtained, we can reproject the model fit to each sensor view by making use of the known projection matrices. fig. 3b shows the rendered wireframe mesh of the fitted model as an overlay on the camera images. for our guided stereo matching approach we then need the synthetic disparity map, which can be computed from the synthetic depth maps (a byproduct of 3d rendering). we denote the synthetic disparity image by d_synth; one synthetic disparity image is created for each stereo sensor, see fig. 4b. in the final step we exploit the existing shape model fit, in particular the synthetic disparity image d_synth of each stereo sensor, and combine it with sgm, inspired by guided stereo matching [9]: following [9], the matching term is modulated with a gaussian weight centred at the synthetic disparity, yielding the augmented objective

E_{synth}(D) = \sum_{p \in P} G(p, d_p)\, D(p, d_p) + \sum_{(p,q) \in N} R(p, d_p, q, d_q), \qquad G(p, d) = 1 + k \left( 1 - e^{-(d - d_p^{synth})^2 / (2 c^2)} \right),

which leaves the matching cost unchanged at the synthetic disparity and inflates it for hypotheses far away from the model prediction. the introduced objective is very similar to sgm and can be minimized in a similar fashion, leading to the final disparity estimate in our scene-adaptive depth sensor network. to summarize our approach, we exploit an articulated shape model fit to enhance sgm with minor adjustments. to show the applicability of our approach we present preliminary qualitative results, depicted in fig. 4. using sgm without exploiting the provided articulated shape model leads to reasonable results, but the disparity map is very noisy and no clean silhouette of the mannequin is extracted (see fig. 4a). fitting our articulated shape model to the data leads to clean synthetic disparity maps, as shown in fig. 4c, with a clean silhouette. in the belly area the synthetic model disparity map (fig. 4b) does not agree with the ground truth (fig. 4d): the articulated shape model is not general enough to explain the recorded scene faithfully.
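the guidance step, making disparity hypotheses far from the model-rendered disparity more expensive, can be sketched as a per-pixel modulation of the matching costs. the gaussian form follows the spirit of guided stereo matching [9]; the hyperparameters k and c are assumptions, not values from the paper:

```python
# Sketch of the guidance step: modulate per-pixel matching costs with a
# Gaussian centred on the synthetic disparity rendered from the shape-model
# fit. Costs at the predicted disparity stay unchanged; costs far away are
# inflated by up to a factor of (1 + k). k and c are illustrative.
import math

def modulate(cost_volume, d_synth, k=10.0, c=1.0):
    out = []
    for p, costs in enumerate(cost_volume):
        row = []
        for d, cost in enumerate(costs):
            g = 1.0 + k * (1.0 - math.exp(-(d - d_synth[p]) ** 2 / (2 * c * c)))
            row.append(cost * g)
        out.append(row)
    return out

# an ambiguous pixel with two equal cost minima (d=1 and d=4, e.g. from a
# repeating pattern); the synthetic disparity 1 disambiguates it
cost_volume = [[3.0, 1.0, 3.0, 3.0, 1.0]]
guided = modulate(cost_volume, d_synth=[1])
best = guided[0].index(min(guided[0]))
```

the modulated costs then feed into the same sgm minimization as before, so the guidance integrates with minor adjustments.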
using the guided stereo matching approach, we construct a much cleaner disparity map than sgm, while the approach still takes the current sensor data into account and exploits an existing articulated shape model. in this work we have proposed a method for scene-adaptive disparity estimation in depth sensor networks. our main contribution is the exploitation of a fitted human shape model to make the estimation of disparities more robust to local ambiguities. our early results indicate that our method can lead to more robust and accurate results compared to classical sgm. future work will focus on a quantitative evaluation as well as on incorporating sophisticated statistical human shape models into our approach.

inverse process-structure-property mapping

abstract. workpieces for dedicated purposes must be composed of materials which have certain properties. the latter are determined by the compositional structure of the material. in this paper we present the scientific approach of our current dfg-funded project "tailored material properties through microstructural optimization: machine learning methods for the modeling and inversion of structure-property relationships and their application to sheet metals". the project proposes a methodology to automatically find an optimal sequence of processing steps which produce a material structure that bears the desired properties. the overall task is split into two steps: first, find a mapping which delivers a set of structures with given properties, and second, find an optimal process path to reach one of these structures with the least effort. the first step is achieved by machine learning the generalized mapping of structures to properties in a supervised fashion and then inverting this relation with methods delivering a set of goal structure solutions. the second step is performed via reinforcement learning of optimal paths, by finding the processing sequence which leads to the best reachable goal structure.
the paper considers steel processing as an example, where the microstructure is represented by orientation density functions and elastic and plastic material target properties are considered. the paper shows the inversion of the learned structure-property mapping by means of genetic algorithms. the search for structures is thereby regularized by a loss term representing the deviation from process-feasible structures. it is shown how reinforcement learning is used to find deformation action sequences in order to reach the given goal structures, which finally lead to the required properties.

keywords: computational materials science, property-structure mapping, texture evolution optimization, machine learning, reinforcement learning

the derivation of processing control actions to produce materials with certain desired properties is the "inverse problem" of the causal chain "process control" - "microstructure instantiation" - "material properties". the main goal of our current project is the creation of a new basis for the solution of this problem by using modern approaches from machine learning and optimization. the inversion is composed of two explicitly separated parts: the "inverse structure-property-mapping" (spm) and "microstructure evolution optimization". the focus of the project lies on the investigation and development of methods which allow an inversion of the structure-property relations of materials relevant to industry. this inversion is the basis for the design of microstructures and for the optimal control of the related production processes. another goal is the development of optimal control methods yielding exactly those structures which have the desired properties. the developed methods will be applied to sheet metals within the frame of the project as a proof of concept.
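the path-search idea can be illustrated, in spirit only, with tabular q-learning on a toy deterministic "process graph": states stand in for microstructures, actions for elementary deformation steps, and the reward encodes reaching the goal structure with the least effort. every number here is invented for illustration:

```python
# Toy tabular Q-learning on a tiny deterministic process graph. States are
# abstract "microstructures" 0..5; action 0 is a small deformation step
# (advance by 1), action 1 a large one (advance by 2). Each step costs -1,
# reaching the goal structure pays +10, so the optimal policy reaches the
# goal in as few steps as possible.
import random

N_STATES, GOAL = 6, 5

def step(s, a):
    nxt = min(s + (1 if a == 0 else 2), N_STATES - 1)
    reward = 10.0 if nxt == GOAL else -1.0
    return nxt, reward, nxt == GOAL

random.seed(0)
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, eps = 0.5, 0.9, 0.2
for _ in range(2000):                      # epsilon-greedy training episodes
    s, done = 0, False
    while not done:
        a = random.randrange(2) if random.random() < eps else Q[s].index(max(Q[s]))
        nxt, r, done = step(s, a)
        Q[s][a] += alpha * (r + gamma * max(Q[nxt]) * (not done) - Q[s][a])
        s = nxt

# greedy rollout of the learned policy: the shortest path needs 3 steps
s, path = 0, []
while s != GOAL:
    a = Q[s].index(max(Q[s]))
    path.append(a)
    s, _, _ = step(s, a)
```

the real project replaces this toy graph with simulated texture evolution and the reward with property-driven objectives, but the learning loop has the same shape.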
the goals include the development of methods for inverting technologically relevant structure-property mappings and methods for efficient microstructure representation by supervised and unsupervised machine learning. processing path optimization methods based on reinforcement learning will be developed for the adaptive optimal control of manufacturing processes. we expect that the results of the project will lead to increasing insight into technologically relevant process-structure-property relationships of materials. the instruments resulting from the project will also promote the economically efficient development of new materials and process controls. in general, approaches to microstructure design make high demands on the mathematical description of microstructures, on the selection and presentation of suitable features, and on the determination of structure-property relationships. for example, the increasingly advanced methods in these areas enable microstructure sensitive design (msd), which is introduced in [1] and [2] and described in detail in [3]. the relationship between structure and property descriptors can be abstracted from the concrete data by regression, in the form of a structure-property mapping. the idea of modeling a structure-property mapping by means of regression, and in particular using artificial neural networks, was intensively pursued in the 1990s [4] and is still used today. the approach and related methods presented in [5] always consist of a structure-property mapping and an optimizer (in [5], genetic algorithms) whose objective function represents the desired properties. the inversion of the spm can alternatively be reached via generative models. in contrast to discriminative models (e.g. the spm), which are used to map conditional dependencies between data (e.g.
classification or regression), generative models map the joint probabilities of the variables and can thus be used to generate new data from the assumed population. established generative methods are, for example, mixture models [6], hidden markov models [7] and, in the field of artificial neural networks, restricted boltzmann machines [8]. in the field of deep learning, generative models, in particular generative adversarial networks [9], are currently being researched and successfully applied in the context of image processing. conditional generative models can generalize the probability of occurrence of structural features under given material properties. in this way, if desired, any number of microstructures could be generated. based on the work on the spm, the process path optimization in the context of the msd is treated depending on the material properties. for this purpose, the process is regarded as a sequence of structure-changing process operations which correspond to elementary processing steps. shaffer et al. [10] construct a so-called texture evolution network based on process simulation samples to represent the process. the texture evolution network can be considered as a graph with structures as vertices, connected by elementary processing steps as edges. the structure vertices are points in the structure-space and are mapped to the property-space by using the spm for property-driven process path optimization. in [11] one-step deformation processes are optimized to reach the most reachable element of a texture-set from the inverse spm. processes are represented by so-called process planes, principal component analysis (pca) projections of microstructures reachable by the process. the optimization then is conducted by searching for the process plane which best represents one of the texture-set elements. in [12], a generic ontology-based semantic system for processing path hypothesis generation (matcalo) is proposed and showcased. 
the required mapping of the structures to the properties is modeled based on data from simulations. the simulations are based on taylor models. the structures are represented using textures in the form of orientation density functions (odf), from which the properties are calculated. in the investigations, elastic and plastic properties are considered in particular. structural features are extracted from the odf for a more compact description. the project uses spectral methods such as generalized spherical harmonics (gsh) to approximate the odf. as an alternative representation we investigate the discretization in the orientation-space, where the orientation density is represented by a histogram. the solution of the inverse problem consists of a structure-property-mapping and an optimizer: as described in [4], the spm is modeled by regression using artificial neural networks. in this investigation, we use a multilayer perceptron. differential evolution (de) is used for the optimization problem. de is an evolutionary algorithm developed by rainer storn and kenneth price [13]. it is an optimization method which repeatedly improves a candidate solution set under consideration of a given quality measure over a continuous domain. the de algorithm optimizes a problem by taking a population of candidate solutions and generating new candidate solutions (structures) by mutation and recombination of existing ones. the candidate solution with the best fitness is considered for further processing. the reached properties of the generated structures are then determined using the spm. 
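as an illustration, a minimal de loop of the kind described above might look as follows. the spm stub, the feasibility penalty, the population size and the mutation factor f and crossover rate cr are purely illustrative assumptions, not the settings or models used in the project:

```python
import numpy as np

rng = np.random.default_rng(0)

def spm(structure):
    # stand-in for the learned structure-property-mapping
    # (assumption: a fixed nonlinear map; the project uses a
    # multilayer perceptron trained on simulation data)
    w = np.linspace(0.5, 1.5, structure.shape[-1])
    return np.array([structure @ w, (structure ** 2) @ w])

def structure_loss(structure):
    # stand-in for the autoencoder-based feasibility term l_s
    # (assumption: simply penalizes leaving the unit box)
    return np.sum(np.clip(np.abs(structure) - 1.0, 0.0, None))

def fitness(structure, p_desired):
    l_p = np.mean((spm(structure) - p_desired) ** 2)  # property loss (mse)
    l_s = structure_loss(structure)                   # feasibility regularizer
    return l_p + l_s

def differential_evolution(p_desired, dim=8, pop_size=20, f=0.8, cr=0.9, steps=200):
    pop = rng.uniform(-1.0, 1.0, (pop_size, dim))
    fit = np.array([fitness(x, p_desired) for x in pop])
    for _ in range(steps):
        for i in range(pop_size):
            # pick three distinct partners for mutation
            idx = rng.choice([j for j in range(pop_size) if j != i], 3, replace=False)
            a, b, c = pop[idx]
            mutant = a + f * (b - c)                  # mutation
            cross = rng.random(dim) < cr              # recombination mask
            trial = np.where(cross, mutant, pop[i])
            f_trial = fitness(trial, p_desired)
            if f_trial < fit[i]:                      # greedy selection
                pop[i], fit[i] = trial, f_trial
    best = int(np.argmin(fit))
    return pop[best], fit[best]

best_structure, best_fitness = differential_evolution(np.array([2.0, 1.5]))
```

the greedy selection step guarantees that the population's best fitness never worsens, which is the property that makes de usable with a black-box quality measure such as a learned spm.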
the fitness f is composed of two terms: the property loss l_p, which expresses how close the property of a candidate is to the target property, and the structure loss l_s, which represents the degree of feasibility of the candidate structure in the process. the property loss is the mean squared error (mse) between the reached properties p_r ∈ P_R and the desired properties p_d ∈ P_D. considering the goal that the genetic algorithm generates reachable structures, a neural network is formed which functions as an anomaly detector. the data basis of this neural network are structures that can be reached by a process. the goal of anomaly detection is to exclude unreachable structures. the anomaly detection is implemented using an autoencoder [14]. this is a neural network (see fig. 1) which consists of the following two parts: the encoder and the decoder. the encoder converts the input data to an embedding space. the decoder reconstructs the original data from the embedding space as closely as possible. due to the reduction to an embedding space, the autoencoder performs data compression and extracts relevant features. the cost function for the structures is a distance function in the odf-space, which penalizes the network if it produces outputs that differ from the input. this cost function is also known as the reconstruction loss; it is computed with s_i ∈ S as the original structures, ŝ_i ∈ Ŝ as the reconstructed structures and λ = 0.001 to avoid division by zero. when using the anomaly detection, the autoencoder determines a high reconstruction loss if the input data are structures that are very different from the reachable structures. the overall approach is shown in fig. 2 and consists of the following steps:
1. the genetic algorithm generates structures.
2. the spm determines the reached properties of the generated structures.
3. the structure loss l_s is determined by the reconstruction loss of the anomaly detector for the generated structures with respect to the reachable structures. 
4. the property loss l_p is determined by the mse of the reached properties and the desired properties.
5. the fitness is calculated as the sum of the structure loss l_s and the property loss l_p.
the structures resulting from the described approach form the basis for optimal process control. due to the forward mapping, the process evolution optimization based on texture evolution networks ([10]) is restricted to a-priori sampled process paths. [11] relies on linearization assumptions and is applicable to short process sequences only. [12] relies on a-priori learned process models in the form of regression trees and is also applicable to relatively short process sequences only.
ur-ai 2020 // 88
as an adaptive alternative for texture evolution optimization, which can be trained to find process paths of arbitrary length, we propose methods from reinforcement learning. for desired material properties p_d, the inverted spm determines a set of goal microstructures s_d ∈ G, which are very likely reachable by the considered deformation process. the texture evolution optimization objective is then to find the shortest process path P* starting from a given structure s_0 and leading close to one of the structures from G, where P = (a_k), k = 0, ..., K with K ≤ T, is a path of process actions a and T is the maximum allowed process length. the mapping E(s, P) = s_K delivers the resulting structure when applying P to the structure s. here, for the sake of simplicity, we assume the process to be deterministic, although the reinforcement learning methods we use are not restricted to deterministic processes. G_τ is a neighbourhood of G, the union of all open balls with radius τ and center points from G. to solve the optimization problem by reinforcement learning approaches, it must be reformulated as a markov decision process (mdp), which is defined by the tuple (S, A, P, R). 
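the anomaly-detection idea behind the structure loss l_s (steps 1-5 above) can be sketched with a linear autoencoder trained by plain gradient descent. the synthetic "reachable" structures, the embedding size and the learning rate are all illustrative assumptions; the project trains a nonlinear autoencoder on odf representations:

```python
import numpy as np

rng = np.random.default_rng(1)

# toy "reachable" structures: points on a 2-d subspace of an 8-d
# structure space (assumption; stands in for process-feasible structures)
basis = rng.normal(size=(2, 8))
reachable = rng.normal(size=(500, 2)) @ basis

# linear autoencoder: encoder w_e (8 -> 2), decoder w_d (2 -> 8),
# trained by plain gradient descent on the reconstruction error
w_e = rng.normal(scale=0.1, size=(8, 2))
w_d = rng.normal(scale=0.1, size=(2, 8))

def batch_loss():
    recon = reachable @ w_e @ w_d
    return np.mean((recon - reachable) ** 2)

init_loss = batch_loss()
lr = 0.01
for _ in range(500):
    z = reachable @ w_e                  # encode
    err = z @ w_d - reachable            # reconstruction error
    grad_wd = z.T @ err / len(reachable)
    grad_we = reachable.T @ (err @ w_d.T) / len(reachable)
    w_d -= lr * grad_wd
    w_e -= lr * grad_we
final_loss = batch_loss()

def reconstruction_loss(s):
    # high values flag structures far from the reachable set (anomalies)
    s_hat = (s @ w_e) @ w_d
    return np.mean((s - s_hat) ** 2)

on_manifold = rng.normal(size=2) @ basis    # a reachable structure
off_manifold = rng.normal(size=8) * 5.0     # an unreachable structure
```

structures close to the training distribution reconstruct well, while off-manifold candidates produce a large reconstruction loss, which is exactly the signal used to regularize the genetic search.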
in our case S is the space of structures s, A is the parameter-space of the deformation process, containing process actions a, and P : S × A → S is the transition function of the deformation process, which we assume to be deterministic. r_g : S × A → ℝ is a goal-specific reward function. the objective of the reinforcement learning agent is then to find the optimal goal-specific policy π*_g(s_t) = a_t that maximizes the discounted future goal-specific reward, where γ ∈ [0, 1] discounts rewards that are attained late, the policy π_g(s_k) determines a_k and the transition function P(s_k, a_k) determines s_{k+1}. for a distance function d in the structure space, the binary reward function
r_g(s, a) = 1 if d(P(s, a), g) < τ, and 0 otherwise (6)
if maximized, leads to an optimal policy π*_g that yields the shortest path to g from every s for γ < 1. moreover, if v_g is given for every microstructure g from G, P from eq. 4 is identical with the application of the policy π*_ζ, where ζ = arg max_g [v_g]. π*_g can be approached by methods from reinforcement learning. value-based reinforcement learning does so by learning expected discounted future reward functions [15]. one of these functions is the so-called value-function V. in the case of a deterministic mdp and for a given g, this expectation value function reduces to v_g from eq. 4, and ζ can be extracted if V is learned for every g from G. for doing so, a generalized form of expectation value functions can be learned, as is done e.g. in [16]. this exemplary mdp formulation shows how reinforcement learning can be used for texture evolution optimization tasks. the optimization thereby operates in the space of microstructures and does not rely on a-priori microstructure samples. when using off-policy reinforcement learning algorithms, and due to the generalization over goal-microstructures, the functions learned while solving a specific optimization task can be easily transferred to new optimization tasks (i.e. 
different desired properties or even a different property space). industrial robots are mainly deployed in large-scale production, especially in the automotive industry. today, there are already 26.1 industrial robots deployed per 1,000 employees on average in these industry branches. in contrast, small and medium-sized enterprises (smes) only use 0.6 robots per 1,000 employees [1]. reasons for this low usage of industrial robots in smes include the lack of flexibility with great variance of products and the high investment expenses due to additional peripherals required, such as gripping or sensor technology. the robot as an incomplete machine accounts for a fourth of the total investment costs [2]. due to the constantly growing demand for individualized products, robot systems have to be adapted to new production processes and flows [3]. this development requires the flexibilization of robot systems and the associated frequent programming of new processes and applications as well as the adaption of existing ones. robot programming usually requires specialists who can adapt flexibly to different types of programming for the most diverse robots and can follow the latest innovations. in contrast to many large companies, smes often have no in-house expertise and a lack of prior knowledge with regard to robotics. this often has to be obtained externally via system integrators, which, due to high costs, is one of the reasons for the inhibited use of robot systems. during the initial generation or extensive adaption of process flows with industrial robots, there is a constant risk of injuring persons and damaging the expensive hardware components. therefore, the programs have to be tested under strict safety precautions and usually in a very slow test mode. this makes the programming of new processes very complex and therefore time- and cost-intensive. the concept presented in this paper combines intuitive, gesture-based programming with simulation of robot movements. 
using a mixed reality solution, it is possible to create a simulation-based visualization of the robot and to project, program and test it in the working environment without disturbing the workflow. a virtual control panel enables the user to adjust, save and generate a sequence of specific robot poses and gripper actions and to simulate the developed program. an interface to transfer the developed program to the robot controller and execute it by the real robot is provided. the paper is structured as follows. first, a research on related work is conducted in section 2, followed by a description of the system of the gesture-based control concept in section 3. the function of robot positioning and program creation is described in section 4. finally, the evaluation is presented in section 5 and the conclusion in section 6. various interfaces exist to program robots, such as lead-through, offline or walk-through programming, programming by demonstration, vision-based programming or vocal commanding. in the survey of villani et al. [4] a clear overview on existing interfaces for robot programming and current research is provided. besides the named interfaces, the programming of robots using a virtual or mixed reality solution aims to provide intuitiveness, simplicity and accessibility of robot programming for non-experts. designed for this purpose, guhl et al. [5] developed a generic architecture for human-robot interaction based on virtual and mixed reality. in the marker-tracking based approach presented by [6] and [7], the user defines a collision-free volume and generates and selects control points while the system creates and visualizes a path through the defined points. others [8], [9], [10] and [11] use handheld devices in combination with gesture control and motion tracking. herein, the robot can be controlled through gestures, pointing or via the device, while the path, workpieces or the robot itself are visualized on several displays. 
other gesture and virtual or mixed reality based concepts are developed by cousins et al. [12] or tran et al. [13]. here, the robot's perspective or the robot in the working environment is presented to the user on a display (head-mounted or stationary) and the user controls the robot via gestures. further concepts using a mixed reality method enable an image of the workpiece to be imported into cad, whereupon the system automatically generates a path for robot movements [14], or visualize the intended motion of the robot on the microsoft hololens, so that the user knows where the robot will move to next [15]. other methods combine pointing at objects on a screen with speech instructions to control the robot [16]. sha et al. [17] also use a virtual control panel in their programming method, but for adjusting parameters and not for controlling robots. another approach pursues programming based on cognition, spatial augmented reality and multimodal input and output [18], where the user interacts with a touchable table. krupke et al. [19] developed a concept in which humans can control the robot by head orientation or by pointing, both combined with speech. the user is equipped with a head-mounted display presenting a virtual robot superimposed over the real robot. the user can determine pick and place positions by specifying objects to be picked by head orientation or by pointing. the virtual robot then executes the potential pick movement and, after the user confirms by voice command, the real robot performs the same movement. a similar concept based on gesture and speech is pursued by quintero et al. [20], whose method offers two different types of programming. on the one hand, the user can determine a pick and place position by head orientation and speech commands. the system automatically generates a path which is displayed to the user, can be manipulated by the user and is simulated by a virtual robot. 
on the other hand, it is possible to create a path on a surface by the user generating waypoints. ostanin and klimchik [21] introduced a concept to generate collision-free paths. the user is provided with virtual goal points that can be placed in the mixed reality environment and between which a path is automatically generated. by means of a virtual menu, the user can set process parameters such as speed, velocity etc. additionally, it is possible to draw paths with a virtual device, and the movement along the path is simulated by a virtual robot. in contrast to the concept described in this paper, only a pick and place task can be realized with the concepts of [19] and [20]. a differentiation between movements to positions and gripper commands as well as the movement to several positions in succession and the generation of a program structure are not supported by these concepts. another distinction is that the user only has the possibility to show certain objects to the robot, but not to move the robot to specific positions. in [19] a preview of the movement to be executed is provided, but the entire program (pick and place movements) is not simulated. in contrast to [21], with the concept presented in this paper it is possible to integrate certain gripper commands into the program. with the programming method of [21], the user can determine positions, but exact axis angles or robot poses cannot be set. overall, the approach presented in this paper offers an intuitive, virtual user interface without the use of handheld devices (cf. [6], [7], [8], [9], [10] and [11]) which allows the exact positions of the robot to be specified. compared to other methods, such as [12], [13], [14], [15] or [16], it is possible to create more complex program structures, which include the specification of robot poses and gripper positions, and to simulate the program in a mixed reality environment with a virtual robot. 
in this section the components of the mixed reality robot programming system are introduced and described. the system consists of multiple real and virtual interactive elements, whereby the virtual components are projected directly into the field of view using a mixed reality (mr) approach. in contrast to the real environment, which consists entirely of real objects, and virtual reality (vr), which consists entirely of virtual objects overlaying reality, in mr the real scene is preserved and only supplemented by the virtual representations [22]. in order to interact in the different realities, head-mounted devices similar to glasses, screens or mobile devices are often used. figure 1 provides an overview of the system's components and their interaction. the system presented in this paper includes kuka's collaborative lightweight robot lbr iiwa 14 r820 combined with an equally collaborative gripper from zimmer as real components, and a virtual robot model and a user interface as virtual components. the virtual components are presented on the microsoft hololens. for calculating and rendering the robot model and visualizing the user interface, the 3d- and physics-engine of the unity3d development framework is used. furthermore, for supplementary functions and components and for building additional mr interactable elements, the microsoft mixed reality toolkit (mrtk) is utilized. for spatial positioning of the virtual robot, marker tracking is used, a technique supported by the vuforia framework. in this use case, the image target is attached to the real robot's base, such that in mr the virtual robot superimposes the real robot. the program code is written in c . the robot is controlled and programmed via an intuitive and virtual user interface that can be manipulated using the so-called airtap gesture, a gesture provided by microsoft hololens. 
to ensure that the virtual robot mirrors the motion sequences and poses of the real robot, the most exact representation of the real robot is employed. the virtual robot consists of a total of eight links, matching the base and the seven joints of the iiwa 14 r820: the base frame, five joint modules, the central hand and the media flange. the eight links are connected together as a kinematic chain. the model is provided as open source files from [23] and [24] and is integrated into the unity3d project. the individual links are created as gameobjects in a hierarchy, with the base frame defining the top level, and their ranges are limited similarly to those of the real robot. the cad data of the deployed gripping system is also imported into unity3d and linked to the robot model. the canvas of the head-up display of the microsoft hololens is divided into two parts and rendered at a fixed distance in front of the user and on top of the scene. at the top left side of the screen the current joint angles (a1 to a7) are displayed and on the left side the current program is shown. this setting simplifies the interaction with the robot, as the information does not behave like other objects in the mr scene but is attached to the head-up display (hud) and moves with the user's field of view. the user interface, which consists of multiple interactable components, is placed into the scene and is shown at the right side of the head-up display. at the beginning of the application the user interface is in "clear screen" mode, i.e. only the buttons "drag", "cartesian", "joints", "play" and "clear screen" and the joint angles at the top left of the screen are visible. for interaction with the robot, the user has to switch into a particular control mode by tapping the corresponding button. 
the user interface provides three different control modes for positioning the virtual robot:
- drag mode, for rough positioning,
- cartesian mode, for cartesian positioning and
- joint mode, for the exact adjustment of each joint angle.
figure 2 shows the interactable components that are visible and therefore controllable in the respective control modes. depending on the selected mode, different interactable components become visible in the user interface, with which the virtual robot can be controlled. in addition to the control modes, the user interface offers further groups of interactable elements:
- motion buttons, with which e.g. the speed of the robot movement can be adjusted or the robot movement can be started or stopped,
- application buttons, to save or delete specific robot poses, for example,
- gripper buttons, to adjust the gripper and
- interface buttons, that enable communication with the real robot.
this section focuses on the description of the usage of the presented approach. in addition to the description of the individual control modes, the procedure for creating a program is also described. as outlined in section 3.2, the user interface consists of three different control modes and four groups of further interactable components. through this concept, the virtual robot can be moved efficiently to certain positions with different movement modes, the gripper can be adjusted, the motion can be controlled and a sequence of positions can be chained. drag: by gripping the tool of the virtual robot with the airtap gesture, the user can "drag" the robot to the required position. additionally, it is possible to rotate the position of the robot using both hands. this mode is particularly suitable for moving the robot very quickly to a certain position. cartesian: this mode is used for the subsequent positioning of the robot tool with millimeter precision. 
the tool can be translated to the required positions using the cartesian coordinates x, y, z and the euler angles a, b, c. the user interface provides a separate slider for each of the six translation options. the tool of the robot moves analogously to the respective slider button, which the user can set to the required value. joints: this mode is an alternative to the cartesian method for exact positioning. the joints of the virtual robot can be adjusted precisely to the required angle, which is particularly suitable for e.g. bypassing an obstacle. there is a separate slider for each joint of the virtual robot. in order to set the individual joint angles, the respective slider button is dragged to the required value, which is also displayed above the slider button for better orientation. to program the robot, the user interface provides various application buttons, such as saving and removing robot poses from the chain and a display of the poses in the chain. the user directs the virtual robot to the desired position and confirms using the corresponding button. the pose of the robot is then saved as joint angles from a1 to a7 and one gripper position in a list and is displayed on the left side of the screen. when running the programmed application, the robot moves to the saved robot poses and gripper positions according to the defined sequence. for better orientation, the robot's current target position changes its color from white to red. after testing the application, the list of robot poses can be sent to the controller of the real robot via a webservice. the real robot then moves analogously to the virtual robot to the corresponding robot poses and gripper positions. the purpose of the evaluation is to determine how the gesture-based control concept compares to other concepts regarding intuitiveness, comfort and complexity. 
for the evaluation, a study was conducted with seven test persons, who had to solve a pick and place task with five different operating concepts and subsequently evaluate them. the developed concept based on gestures and mr was evaluated against a lead-through procedure, programming with java, programming with a simplified programming concept and approaching and saving points with the kuka smartpad. the test persons had no experience with microsoft hololens and mr, no to moderate experience with robots and no to moderate programming skills. the questionnaire for the evaluation of physical assistive devices (quead) developed by schmidtler et al. [25] was used to evaluate and compare the five control concepts. the questionnaire is classified into five categories (perceived usefulness, perceived ease of use, emotions, attitude and comfort) and contains a total of 26 questions, rated on an ordinal scale from 1 (entirely disagree) to 7 (entirely agree). first, each test person received a short introduction to the respective control concept, conducted the pick and place task and immediately afterwards evaluated the respective control concept using quead. all test persons agreed that they would reuse the concept in future tasks (3 mostly agree, 4 entirely agree). in addition, the test persons considered the gesture-based concept to be intuitive (1 mostly agree, 4 entirely agree), easy to use (5 mostly agree, 2 entirely agree) and easy to learn (1 mostly agree, 6 entirely agree). two test persons mostly agree and four entirely agree that the gesture-based concept enabled them to solve the task efficiently, and four test persons mostly agree and two entirely agree that the concept enhances their work performance. all seven subjects were comfortable using the gesture-based concept (4 mostly agree, 2 entirely agree). overall, the concept presented in this paper was evaluated as more comfortable, more intuitive and easier to learn than the other control concepts evaluated. 
in comparison to them, the new operating concept was perceived as the most useful and easiest to use. the test persons felt physically and psychologically most comfortable when using the concept and were most positive in total. in this paper, a new concept for programming robots based on gestures and mr and for simulating the created applications was presented. this concept forms the basis for a new, gesture-based programming method, with which it is possible to project a virtual robot model of the real robot into the real working environment by means of a mr solution, to program it and to simulate the workflow. using an intuitive virtual user interface, the robot can be controlled by three control modes and further groups of interactable elements, and via certain functions several robot positions can be chained as a program. by using this concept, test and simulation times can be reduced, since on the one hand the program can be tested directly in the mr environment without disturbing the workflow. on the other hand, the robot model is rendered into the real working environment via the mr approach, thus eliminating the need for time-consuming and costly modeling of the environment. the results of the user study indicate that the control concept is easy to learn, intuitive and easy to use. this facilitates the introduction of robots, especially in smes, since no expert knowledge is required for programming, programs can be created rapidly and intuitively and processes can be adapted flexibly. in addition, the user study showed that tasks can be solved efficiently and the concept is perceived as performance-enhancing. potential directions of improvement are: implementing various movement types, such as point-to-point, linear and circular movements, in the concept. this makes the robot motion more flexible and efficient, since positions can be approached in different ways depending on the situation. 
another improvement is to extend the concept with collaborative functions of the robot, such as force sensitivity or the ability to conduct search movements. in this way, the functions that make collaborative robots special can be integrated into the program structure. a further approach for improvement is to engage in a larger-scale study. in 2019 the world's commercial fleet consisted of 95,402 ships with a total capacity of 1,976,491 thousand dwt (an increase of 2.6 % in carrying capacity compared to the previous year) [1]. according to the international chamber of shipping, the shipping industry is responsible for about 90 % of all trade [2]. in order to ensure the safe voyage of all participants in the international travel at sea, the need for monitoring is steadily increasing. while more and more data regarding the sea traffic is collected by using cheaper and more powerful sensors, the data still needs to be processed and understood by human operators. in order to support the operators, reliable anomaly detection and situation recognition systems are needed. one cornerstone for this development is a reliable automatic classification of vessels at sea. for example, by classifying the behaviour of non-cooperative vessels in ecologically protected areas, the identification of illegal, unreported and unregulated (iuu) fishing activities is possible. iuu fishing is in some areas of the world a major problem, e. g., »in the wider-caribbean, western central atlantic region, iuu fishing compares to 20-30 percent of the legitimate landings of fish« [3], resulting in an estimated value between usd 700 and 930 million per year. one approach for gathering information on the sea traffic is based on the automatic identification system (ais). it was introduced as a collision avoidance system. as each vessel is broadcasting its information on an open channel, the data is often used for other purposes, like the training and validation of machine learning models. 
ais provides dynamic data like position, speed and course over ground, static data like mmsi, shiptype and length, and voyage-related data like draught, type of cargo, and destination of a vessel. the system is self-reporting, it has no strong verification of transmission, and many of the fields in each message are set by hand. therefore, the data cannot be fully trusted. as harati-mokhtari et al. [4] stated, half of all ais messages contain some erroneous data. since the dataset for this work is collected using the ais stream provided by aishub, it is likely to contain some amount of false data. while most of the errors will have no further consequences (minor coordinate inaccuracies or wrong vessel dimensions are irrelevant), some false vessel information can have an impact on the model performance. classification of maritime trajectories and the detection of anomalies is a challenging problem, e.g., since classifications should be based on short observation periods, only limited information is available for vessel identification. riveiro et al. [5] give a survey on anomaly detection at sea, of which shiptype classification is a subtype. jiang et al. [6] present a novel trajectorynet capable of point-based classification. their approach is based on embedding gps coordinates into a new feature space. the classification itself is accomplished using a long short-term memory (lstm) network. further, jiang et al. [7] propose a partition-wise lstm (plstm) for point-based binary classification of ais trajectories into fishing or non-fishing activity. they evaluated their model against other recurrent neural networks and achieve a significantly better result than common recurrent network architectures based on lstm or gated recurrent units. a recurrent neural network is used by nguyen et al. in [8] to reconstruct incomplete trajectories, detect anomalies in the traffic data and identify the real type of a vessel. 
they embed the position data to generate a new representation as input for the neural network. besides these neural-network-based approaches, other methods are also used for situation recognition tasks in the maritime domain. expert-knowledge-based systems in particular are used frequently, as illegal or at least suspicious behaviour is not recorded as often as would be desirable for deep learning approaches. conditional random fields are used by hu et al. [9] for the identification of fishing activities from ais data. the data has been labelled by an expert and contains only longliner fishing boats. saini et al. [10] propose a hidden markov model (hmm) based approach to the classification of trajectories. they combine a global hmm and a segmental hmm using a genetic algorithm. in addition, they tested the robustness of the framework by adding gaussian noise. in [11] fischer et al. introduce a holistic approach for situation analysis based on situation-specific dynamic bayesian networks (ssdbn). this includes the modelling of the ssdbn as well as the presentation to end-users. for a bayesian network, the parametrisation of the conditional probability tables is crucial. fischer introduces an algorithm for choosing these parameters in a more transparent way. important for the functionality is the ability of the network to model the domain knowledge and to handle noisy input data. for the evaluation, simulated and real data is used to assess the detection quality of the ssdbn. based on dbns, anneken et al. [12] implemented an algorithm for detecting illegal diving activities in the north sea. as explained by de rosa et al. [13], an additional layer for modelling the reliability of different sensor sources is added to the dbn. in order to use the ais data, preprocessing is necessary. this includes cleaning wrong data, filtering data, segmentation, and the calculation of additional features. the whole workflow is depicted in figure 1.
the input in the form of ais data and different maps is shown as blue boxes. all relevant mmsis are extracted from the ais data. for each mmsi, the position data is used for further processing. segmentation into separate trajectories is the next step (yellow). the resulting trajectories are filtered (orange). based on the remaining trajectories, geographic (green) and trajectory-based (purple) features are derived. for each of the resulting sequences, the data is normalized (red), which results in the final dataset. only the 6 major shiptypes in the dataset are used for the evaluation. these are "cargo", "tanker", "fishing", "passenger", "pleasure craft" and "tug". due to their similar behaviour, "cargo" and "tanker" are combined into a single class "cargo-tanker". figure 1: visualization of all preprocessing steps. input in blue, segmentation in yellow, filtering in orange, geographic features in green, trajectory features in purple and normalization in red. four different trajectory features are used: time difference, speed over ground, course over ground, and trajectory transformation. as the incoming data from ais is not necessarily uniformly distributed in time, there is a need to create a feature representing the time dimension. therefore, the time difference between two samples is introduced. as the speed and course over ground are directly accessible through the ais data, the network is directly fed with these features. the vessel's speed is a numeric value in 0.1-knot resolution in the interval [0; 1022] and the course is the negative angle in degrees relative to true north and therefore in the interval [0; 359]. the position is transformed in two ways. the first transformation, further called "relative-to-first", shifts the trajectory to start at the origin. the second transformation, henceforth called "rotate-to-zero", additionally rotates the trajectory in such a way that the end point lies on the x-axis.
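the two positional transformations described above can be sketched with numpy; this is an illustrative reimplementation operating on planar (x, y) coordinates, not the authors' code:

```python
import numpy as np

def relative_to_first(traj):
    """Shift a trajectory (N x 2 array of x, y points) so it starts at the origin."""
    return traj - traj[0]

def rotate_to_zero(traj):
    """Rotate a (shifted) trajectory so its end point lies on the positive x-axis."""
    end = traj[-1]
    theta = np.arctan2(end[1], end[0])        # angle of the end point
    c, s = np.cos(-theta), np.sin(-theta)     # rotate by -theta
    rotation = np.array([[c, -s], [s, c]])
    return traj @ rotation.T                  # apply rotation to each row vector
```

as noted later in the discussion, anchoring the rotation on the end point adds a certain randomness for round-trip sequences, since the end point of such a trajectory is close to its start.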
in addition to the trajectory-based features, two geographic features are derived by using coastline maps 6 and a map of large harbours. the coastline map consists of a list of line strips. in order to reduce complexity, the edge points are used to calculate the "distance-to-coast". further, only a lower resolution of the shapefile itself is used. in figure 2, the resolutions "high" and "low" for some fjords in norway are shown. with the geoindex' cell size set to 40 km, a radius of 20 km can be queried. the world's 140 major harbours based on the world port index 7 are used to calculate the "distance-to-closest-harbor". as fishing vessels are expected to stay near a certain harbour, this feature should support the network in identifying some shiptypes. for this feature the geoindex' cell size is set to 5,000 km, resulting in a maximum radius of 2,500 km. the data is split into separate trajectories by using gaps in either time or space, or the sequence length. as real ais data is used, packet loss during transmission is common. this problem is tackled by splitting the data if the time between two successive samples is larger than 2 hours, or if the distance between two successive samples is large. regarding the distance, the euclidean distance is used even though the great circle distance would be more accurate. for simplification the distance value is squared, and 10 −4 is used as a threshold. depending on latitude this corresponds to a value of about 1 km at the equator and only about 600 m at 60° n. since the calculation includes approximations, a relatively high threshold is chosen. as the neural network depends on a fixed input size, the data is split into fitting chunks by cutting and padding with these rules: longer sequences are split into chunks according to the desired sequence length; any leftover sequence shorter than 80 % of the desired length is discarded; the others are padded with zeroes.
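the temporal and spatial splitting rule can be sketched as follows (pure python, thresholds as given in the text; the subsequent chunking into fixed-length sequences is omitted for brevity):

```python
def segment(points, max_dt=2 * 3600, max_dist_sq=1e-4):
    """Split a time-ordered stream of (t, lat, lon) samples into separate
    trajectories whenever the temporal gap exceeds 2 h or the squared
    euclidean distance (in degrees) exceeds the 1e-4 threshold."""
    segments, current = [], []
    for t, lat, lon in points:
        if current:
            t0, lat0, lon0 = current[-1]
            if (t - t0 > max_dt) or ((lat - lat0) ** 2 + (lon - lon0) ** 2 > max_dist_sq):
                segments.append(current)   # close the current trajectory
                current = []
        current.append((t, lat, lon))
    if current:
        segments.append(current)
    return segments
```

the squared-degree threshold avoids a square root per sample pair; as the text notes, its metric meaning varies with latitude, which is acceptable here because the threshold is deliberately generous.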
this results in segmented trajectories of similar but not necessarily the same duration. as this work is about vessel behaviour at sea, stationary vessels (anchored and moored vessels) and vessels traversing rivers are removed from the segmented trajectories. the stationary vessels are identified by using a measure of movement α stationary computed over a trajectory with sequence length n and data points p i . a trajectory is removed if α stationary is below a certain threshold. a shapefile 8 containing the major and most minor rivers is used in order to remove the vessels not on the high seas. a sequence with more than 50 % of its points on a river is removed from the dataset. in order to speed up the training process, the data is normalized to the interval [0; 1] by applying min-max scaling, x' = (x − x min )/(x max − x min ). here, for the positional features a differentiation between "global normalization" and "local normalization" is taken into account. the "global normalization" scales the input data with the maximum x max and minimum x min calculated over the entire data set, while the "local normalization" estimates the maximum x max and minimum x min only over the trajectory itself. as the data is processed in parallel, the parameters for the "global normalization" are calculated only for each chunk of data. this results in slight deviations in the minimum and maximum, but for large batches this should be negligible. all other additional features are normalized as well. for the geographic features "distance-to-coast" and "distance-to-closest-harbor" the maximum distance that can be queried depending on grid size is used as x max and 0 is used as the lower bound x min . the time difference feature is scaled using a minimum x min of 0 and the threshold for the temporal gap, since this is the maximum value for this feature. speed and course are normalized using 0 and their respective maximum values. for the dataset, the period between 2018-07-24 and 2018-11-15 is used.
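the two normalization schemes can be sketched with numpy; this is an illustrative reimplementation, with "global" extrema computed over the given batch of trajectories (standing in for one processed chunk, as described above):

```python
import numpy as np

def min_max_normalize(trajectories, mode="local"):
    """Scale each feature column into [0; 1] via (x - x_min) / (x_max - x_min).
    'global': extrema over the whole batch (one processed chunk);
    'local':  extrema over each trajectory individually.
    Assumes every column actually varies (x_max > x_min)."""
    if mode == "global":
        stacked = np.concatenate(trajectories, axis=0)
        lo, hi = stacked.min(axis=0), stacked.max(axis=0)
    normalized = []
    for traj in trajectories:
        if mode == "local":
            lo, hi = traj.min(axis=0), traj.max(axis=0)
        normalized.append((traj - lo) / (hi - lo))
    return normalized
```

for features with known bounds (speed, course, time difference, geographic distances), the text instead fixes x min and x max to those bounds rather than estimating them from the data.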
altogether 209,536 unique vessels with 2,144,317,101 raw data points are included. using this foundation and the previously described methods, six datasets are derived. all datasets use the same spatial and temporal thresholds. in addition, the filter thresholds are identical as well. the datasets differ in their sequence length and in applying either only the "relative-to-first" transformation or additionally the "rotate-to-zero" transformation. either 360, 1,080, or 1,800 points per sequence are used, resulting in approximately 1 h, 3 h, or 5 h long sequences. in figure 3, the distribution of shiptypes in the datasets after applying the different filters is shown. for the shiptype classification, neural networks are chosen. the different networks are implemented using keras [14] with tensorflow as backend [15]. fawaz et al. [16] have shown that, despite its initial design for image data, a residual neural network (resnet) can perform quite well on time-series classification. thus, the resnet is used as the foundation for the evaluated architectures. the main difference to other neural network architectures is the inclusion of "skip connections". this allows for deeper networks by circumventing the vanishing gradient problem during the training phase. based on the main idea of a resnet, several architectures are designed and evaluated for this work. some information regarding their structure is given in table 1. further, the single architectures are depicted in figures 4a to 4f. the main idea behind these architectures is to analyse the impact of the depth of the networks. furthermore, as the features themselves are not necessarily logically linked with each other, the hope is to capture the behaviour better by splitting up the network path for each feature.
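the residual ("skip") connection shared by the evaluated architectures can be sketched as a 1-d keras block; the layer counts, kernel and filter sizes below are assumptions for illustration, not the exact configurations from table 1:

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters, kernel_size=3):
    """A minimal 1-D residual block for time-series input: two conv layers
    plus a skip connection that lets gradients bypass the block."""
    shortcut = x
    y = layers.Conv1D(filters, kernel_size, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv1D(filters, kernel_size, padding="same")(y)
    y = layers.BatchNormalization()(y)
    if shortcut.shape[-1] != filters:
        # 1x1 convolution to match channel counts on the skip path
        shortcut = layers.Conv1D(filters, 1, padding="same")(shortcut)
    y = layers.Add()([y, shortcut])
    return layers.ReLU()(y)
```

stacking such blocks to different depths corresponds to the shallow/deep variants, while the split variants would apply separate stacks per feature group before merging.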
to verify the necessity of cnns, two multilayer perceptron (mlp) based networks are tested: one with two hidden layers and one with four hidden layers, all with 64 neurons and fully connected with their adjacent layers. the majority of the parameters of the two networks is bound in the first layer. these parameters are necessary to map the large number of input neurons, e. g., 360 * 9 = 3,240 input neurons for the 360 samples dataset, to the first hidden layer. each of the datasets is split into three parts: 64 % for the training set, 16 % for the validation set, and 20 % for the test set. for solving, or at least mitigating, the problem of overfitting, regularization techniques (input noise, batch normalization, and early stopping) are used. small noise on the input in the training phase is used to support the generalization of the network. for each feature a normal distribution with a standard deviation of 0.01 and a mean of 0 is used as noise. furthermore, batch normalization is implemented. this means that before each relu layer a batch normalization layer is added, allowing higher learning rates. therefore, the initial learning rate is doubled. additionally, the learning rate is halved if the validation error does not improve over ten training epochs, improving the training behaviour during oscillation on a plateau. in order to prevent overfitting, an early stopping criterion is introduced. the training is interrupted if the validation error has not decreased after 15 training epochs. to counter the dataset imbalance, class weights were considered but ultimately did not lead to better classification results and were discarded. the different neural network architectures are evaluated on an amd ryzen threadripper. each combination of batch normalization and input noise is tested. the initial learning rate is set to 0.001 without batch normalization and 0.002 with batch normalization activated. the maximum number of epochs is set to 600.
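the learning-rate schedule and early stopping described above map directly onto standard keras callbacks; a sketch under the stated patience values (the monitored metric name and restore_best_weights are assumptions):

```python
from tensorflow.keras import callbacks

# halve the learning rate after 10 epochs without validation improvement,
# stop training entirely after 15 epochs without improvement
training_callbacks = [
    callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=10),
    callbacks.EarlyStopping(monitor="val_loss", patience=15,
                            restore_best_weights=True),
]
# usage (sketch):
# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=600, callbacks=training_callbacks)
```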
the batch sizes are set to 64, 128, and 256 for 360, 1,080, and 1,800 samples per sequence respectively. in total 144 different setups are evaluated. furthermore, 4 additional networks are trained on the 360 samples dataset with the "relative-to-first" transformation: two mlps to verify the need for deep neural networks, and the shallow and deep resnet trained without geographic features to measure the impact of these features. the results for the six different architectures are depicted in figure 5; the first row shows the results for the "relative-to-first" (rtf) transformation, the second for the "rotate-to-zero" (rtz) transformation. for 360 samples the shallow resnet and the deep resnet outperformed the other networks. in the case of the "relative-to-first" transformation (see figure 5a), the shallow resnet achieved an f 1 -score of 0.920, while the deep resnet achieved 0.919. for the "rotate-to-zero" transformation (see figure 5d), the deep resnet achieved 0.918 and the shallow resnet 0.913. in all these cases the regularization methods led to no improvements. the "relative-to-first" transformation performs slightly better overall. for the datasets with 360 samples per sequence, the standard resnet variants achieve higher f 1 -scores than the split resnet versions, but this difference is relatively small. as expected, the tiny resnet is not large and deep enough to classify the data on a similar level. for the "relative-to-first" transformation and trajectories based on 1,080 samples (see figure 5b), the split resnet and the total split resnet achieve the best results. the first performed well with an f 1 -score of 0.913, while the latter is slightly worse with 0.912. in both cases again the regularization did not improve the result. for the "rotate-to-zero" transformation (see figure 5e), the shallow resnet achieved an f 1 -score of 0.907 without any regularization and 0.905 with only the noise added to the input.
for the largest sequence length of 1,800 samples, the split-based networks slightly outperform the standard resnets. for the "relative-to-first" transformation (see figure 5c), the split resnet achieved an f 1 -score of 0.911, while for the "rotate-to-zero" transformation (see figure 5f) the total split resnet reached an f 1 -score of 0.898, again without noise and batch normalization. to verify that the implementation of cnns is actually necessary, additional tests with mlps were carried out. two different mlps were trained on the 360 samples dataset with the "relative-to-first" transformation, since this dataset leads to the best results for the resnet architectures. both networks led to no usable results, as their output is always the "cargo-tanker" class regardless of the actual input. the only thing the models are able to learn is that the "cargo-tanker" class is the most probable class, based on the uneven distribution of classes. an mlp is not the right model for this kind of data and performs badly. the large dimensionality of even the small sequence length makes the use of fully connected networks impracticable. probably, further hand-crafted feature extraction would be needed to achieve better results. to measure the impact the features "distance-to-coast" and "distance-to-closest-harbor" have on the overall performance, a shallow resnet and a deep resnet are trained on the 360 sample length dataset with the "relative-to-first" transformation excluding these features. the trained networks have f 1 -scores of 0.888 and 0.871 respectively. this means that by including these features, we are able to increase the performance by 3.5 %. the "relative-to-first" transformation yields better results than the "rotate-to-zero" transformation. this is especially visible for the longest sequence length. a possible explanation can be seen in the "stationary" filter.
this filter removes more trajectories for the "relative-to-first" transformation than for the additional "rotate-to-zero" transformation. a problem might be that the end point is used for rotating the trajectory. this adds a certain randomness to the data, especially for round-trip sequences. in some cases, the stretched deep resnet is not able to learn the classes. it is possible that there is a problem with the structure of the network or the large number of parameters. further, there seems to be a problem with the batch normalization, as seen in figures 5c and 5e. the overall worse performance of the "rotate-to-zero" transformation could be caused by the difference in the "stationary" filter. in the "rotate-to-zero" dataset, fewer sequences are filtered out. the filter leads to relatively more "fishing" and "pleasure craft" sequences, as described in section 3.6. this could also explain the difference in the class prediction distribution, since the network is punished more for mistakes in these classes because more sequences overall are of these types. for the evaluation, the expectation based on previous work by other authors was that the shorter sequence lengths should perform worse compared to the longer ones. instead the shorter sequences outperform the longer ones. the main advantage of the shorter sequences is essentially the larger number of sequences in the dataset. for example, the 360 samples dataset with the "relative-to-first" transformation contains about 2.2 million sequences, while the corresponding 1,800 samples dataset contains only approximately 250,000 sequences. in addition, the more frequent segmentation can yield more easily classifiable sequences: the behaviour of a fishing vessel in general contains different characteristics, like travelling from the harbour to the fishing ground, the fishing itself, and the way back. the travelling parts are similar to other vessels and only the fishing part is unique.
a more aggressive segmentation will yield more fishing sequences that will be easier to classify regardless of observation length. the shallow resnet has the overall best results, using the 360 samples dataset and the "relative-to-first" transformation. the results for this setup are shown in the confusion matrix in figure 6. as expected, the tiny resnet is not able to compete with the others. the other standard resnet architectures performed well, especially on shorter sequences. the split architectures are able to perform better on datasets with longer sequences, with the shallow resnet achieving similar performance. comparing the number of parameters, all three architectures have about 400,000: the shallow resnet about 50,000 more, the total split resnet about 40,000 less. only on the datasets with more sequences does the deep resnet perform well. this correlates with the need for more information due to the larger parameter count. due to the reduced flexibility, the split architecture can be interpreted as a "head start". this means that the network already has information regarding the structure of the data, which in turn does not need to be extracted from the data. this can result in a better performance for smaller datasets. all in all, the best results are always achieved by omitting the suggested regularization methods. nevertheless, the batch normalization had an effect on the learning rate and the needed training epochs: the learning rate is higher and fewer epochs are needed before convergence. based on the resnet, several architectures are evaluated for the task of shiptype classification. from the initial dataset based on ais data with over 2.2 billion datapoints, six datasets with different trajectory lengths and preprocessing steps are derived. in addition to the kinematic information included in the dataset, geographical features are generated.
each network architecture is evaluated with each of the datasets, with and without batch normalization and input noise. overall the best result is an f 1 -score of 0.920 with the shallow resnet on the 360 samples per sequence dataset and a shift of the trajectories to the origin. additionally, we are able to show that the inclusion of geographic features yields an improvement in classification quality. the achieved results are quite promising, but there is still some room for improvement. first of all, the sequence length used for this work might still be too long for real-world use cases. therefore, shorter sequences should be tried. additionally, interpolation for creating data with the same time delta between two samples, or some kind of embedding or alignment layer, might yield better results. as there are many sources of additional domain-related information, further research into the integration of these sources is necessary. comparison of cnns for the detection of small objects based on the example of components on an assembly. many tasks which only a few years ago had to be performed by humans can now be performed by robots or will be performed by robots in the near future. nevertheless, there are some tasks in assembly processes which cannot be automated in the next few years. this applies especially to workpieces that are only produced in very small series or tasks that require a lot of tact and sensitivity, such as inserting small screws into a thread or assembling small components. in conversations with companies we have found out that a big problem for the workers is learning new production processes. this is currently done with instructions and by supervisors. but this requires a lot of time. this effort can be significantly reduced by modern systems which accompany workers in the learning process.
such intelligent systems require not only instructions that describe the target status and the individual work steps that lead to it, but also information on the current status at the assembly workstation. one way to obtain this information is to install cameras above the assembly workstation and use image recognition to calculate where an object is located at any given moment. the individual parts, often very small compared to the work surface, must be reliably detected. we have trained and tested several deep neural networks for this purpose. we have developed an assembly workstation where work instructions can be projected directly onto the work surface using a projector. at a distance, 21 containers for components are arranged in three rows, slightly offset to the rear, one above the other. these containers can also be illuminated by the projector. thus a very flexible pick-by-light system can be implemented. in order for the underlying system to automatically switch to the next work step and, in the event of errors, to point them out and provide support in correcting them, it is helpful to be able to identify the individual components on the work surface. we use a realsense depth camera for this purpose, from which, however, we currently only use the colour image. the camera is mounted in a central position at a height of about two meters above the work surface. thus the camera image includes the complete working surface as well as the 21 containers and a small area next to the working surface. the objects to be detected are components of a kit for the construction of various toy cars. the kit contains 25 components in total. some of the components differ considerably from each other, but others are very similar to each other. since the same is true of real production components, the choice of the kit seemed appropriate for the purposes of this project.
object detection, one of the most fundamental and challenging problems in computer vision, seeks to localize object instances from a large number of predefined categories in natural images. until the beginning of the 2000s, a similar approach was mostly used in object detection. keypoints in one or more images of a category were searched for automatically. at these points a feature vector was generated. during the recognition process, keypoints in the image were again searched, the corresponding feature vectors were generated and compared with the stored feature vectors. above a certain threshold an object was assigned to the category. one of the first approaches based on machine learning was published by viola and jones in 2001 [1]. they still selected features by hand, in their case using a haar basis function [2], and then applied a variant of adaboost [3]. starting in 2012 with the publication of alexnet by krizhevsky et al. [4], deep neural networks became more and more the standard in object detection tasks. they used a convolutional neural network which has 60 million parameters in five convolutional layers, some of them followed by max-pooling layers, three fully connected layers and a final softmax layer. they won the imagenet lsvrc-2012 competition with an error rate almost half as high as the second best. inception-v2 is mostly identical to inception-v3 by szegedy et al. [5]. it is based on inception-v1 [6]. all inception architectures are composed of dense modules. instead of stacking convolutional layers, they stack modules or blocks, within which are convolutional layers. for inception-v2 they redesigned the architecture of inception-v1 to avoid representational bottlenecks and achieve more efficient computations by using factorisation methods. they were the first to use batch normalisation in object detection tasks. in previous architectures the most significant difference had been the increasing number of layers.
but with the network depth increasing, accuracy gets saturated and then degrades rapidly. kaiming he et al. [7] addressed this problem with resnet, using skip connections while building deeper models. in 2017 howard et al. presented the mobilenet architecture [8]. mobilenet was developed for efficient work on mobile devices with less computational power and is very fast. they used depthwise convolutional layers for an extremely efficient network architecture. one year later sandler et al. [9] published a second version of mobilenet. besides some minor adjustments, a bottleneck was added in the convolutional layers, which further reduced the dimensions of the convolutional layers. thus a further increase in speed could be achieved. in addition to the neural network architectures presented so far, there are also different methods to detect in which area of the image the object is located. the two most frequently used are described briefly below. to bypass the problem of selecting a huge number of regions, girshick et al. [10] proposed a method where they use selective search on the features of the base cnn to extract just 2000 region proposals from the image. liu et al. [11] introduced the single shot multibox detector (ssd). they added some extra feature layers behind the base model for the detection of default boxes in different scales and aspect ratios. at prediction time, the network generates scores for the presence of each object in each default box. then it produces adjustments to the box to better match the object shape. there is just one publication over the past few years which gives a survey of generic object detection methods: liu et al. [12] compared 18 common object detection architectures for generic object detection. there are many other comparisons for specific object detection tasks, for example pedestrian detection [13], face detection [14] and text detection [15]. the project is based on the methodology of supervised learning.
thereby the models are trained using a training dataset consisting of many samples. each sample within the training dataset is tagged with a so-called label (also called annotation). the label provides the model with information about the desired output for this sample. during training, the output generated by the model is then compared to the desired output (the labels) and the error is determined. this error on the one hand gives information about the current performance of the model and on the other hand is used for further mathematical computations to adjust the model's parameters, so that the model's performance improves. for the training of neural networks in the field of computer vision the following rule of thumb applies: the larger and more diverse the training dataset, the higher the accuracy that can be achieved by the trained model. if you have too little data and/or run it through the model too often, this can lead to so-called overfitting. overfitting means that instead of learning an abstract concept that can be applied to a variety of data, the model basically memorizes the individual samples [16, 17]. if you train neural networks for the purpose of this project from scratch, it is quite possible that you will need more than 100,000 different images, depending on the accuracy that the model should finally be able to achieve. however, the methodology of so-called transfer learning offers the possibility to transfer results of neural networks which have already been trained for a specific task, completely or partially, to a new task and thus to save time and resources [18]. for this reason, we also applied transfer learning methods within the project. the training dataset was created manually: a tripod, a mobile phone camera (10 megapixel, format 3104 x 3104) and an apeman action cam (20 megapixel, format 5120 x 3840) were used to take 97 images for each of the 25 classes.
this corresponds to 2,425 images in total (actually 100 images were taken per class, but only 97 were suitable for use as training data). all images were documented and sorted into close-ups (distance between camera and object less than or equal to 30 cm) and standards (distance between camera and object more than 30 cm). this procedure should ensure the traceability and controllability of the dataset. in total, the training dataset contains approx. 25 % close-ups and approx. 75 % standards, each taken on different backgrounds and under different lighting conditions (see fig. 2). the labelimg tool was used for labelling the data. with the help of this tool, bounding boxes, whose coordinates are stored in either yolo or pascal voc format, can be marked in the images [19]. for the training of the neural networks the created dataset was finally divided into training data (90 % of all labelled images), i.e. images that are used for the training of the models and that pass through the models multiple times during the training, and test data (10 % of all labelled images), i.e. images that are used for later testing or validation of the training results. in contrast to the images used as training data, the model is presented these images for the first time after training. the goal of this approach, which is common in deep learning, is to see how well the neural network, after training, recognizes objects in images that it has never seen before. thus it is possible to make a statement about the accuracy and to identify any further training needs that may arise. the training of deep neural networks is very demanding on resources due to the large number of computations. therefore, it is essential to use hardware with adequate performance. since the computations that run for each node in the graph can be highly parallelized, the use of a powerful graphical processing unit (gpu) is particularly suitable.
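the 90/10 split described above can be sketched as a reproducibly shuffled partition (the helper name and fixed seed are illustrative, not from the paper):

```python
import random

def split_dataset(samples, train_frac=0.9, seed=42):
    """Shuffle labelled samples reproducibly and split them into
    training and test partitions (90 % / 10 % as described above)."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]
```

shuffling before the cut matters here: the images were taken class by class, so an unshuffled split would leave some classes entirely outside the training partition.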
a gpu with its several hundred computing cores has a clear advantage over a current cpu with four to eight cores when processing parallel computing tasks [20]. these are the outline parameters of the project computer in use: operating system (os): ubuntu 18.04.2 lts; gpu: geforce gtx 1080 ti (11 gb gddr5x memory, data transfer speed 11 gbit/s). for the intended comparison the tensorflow object detection api was used. the tensorflow object detection api is an open-source framework based on tensorflow, which among other things provides implementations of pre-trained object detection models for transfer learning [21, 22]. the api was chosen because of its good and easy to understand documentation and its variety of pre-trained object detection models. for the comparison the following models were selected: ssd mobilenet v1 coco [11, 23, 24], ssd mobilenet v2 coco [11, 25, 26], faster rcnn inception v2 coco [27] [28] [29], and rfcn resnet101 coco [30] [31] [32]. to ensure comparability of the networks, all of the selected pre-trained models were trained on the coco dataset [33]. fundamentally, the algorithms based on cnn models can be grouped into two main categories: region-based algorithms and one-stage algorithms [34]. while both ssd models can be categorized as one-stage algorithms, faster r-cnn and r-fcn fall into the category of region-based algorithms. one-stage algorithms predict both the fields (or bounding boxes) and the class of the contained objects simultaneously. they are generally considered extremely fast, but are known for their trade-off between accuracy and real-time processing speed. region-based algorithms consist of two parts: a special region proposal method and a classifier.
Instead of splitting the image into many small areas and then working with a large number of areas, as a conventional CNN would proceed, a region-based algorithm first proposes a set of regions of interest (RoI) in the image and checks whether one of these fields contains an object. If an object is contained, the classifier classifies it [34]. Region-based algorithms are generally considered accurate, but also slow. Since, according to our requirements, both accuracy and speed are important, it seemed reasonable to compare models of both categories.

Besides the collection of pre-trained models for object detection, the TensorFlow Object Detection API also offers corresponding configuration files for the training of each model. Since these configurations have already proven successful, they were used as a basis for our own configurations. The configuration files contain information about the training parameters, such as the number of steps to be performed during training, the image resizer to be used, the number of samples processed as a batch before the model parameters are updated (batch size), and the number of classes that can be detected. To make the study of the different networks as comparable as possible, the training of all networks was configured so that the batch size was kept as small as possible. Since the configurations of some models did not allow batch sizes larger than one, while other models did not allow batch sizes smaller than two, no common value could be defined for this parameter. During training, each of the training images should pass through the net 200 times (corresponding to 200 epochs). The number of steps was therefore adjusted accordingly, depending on the batch size.
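The relationship between epochs, batch size and step count described above can be made explicit; the training-image count used here is illustrative:

```python
def training_steps(num_images, epochs, batch_size):
    """One step processes one batch; one epoch is one full pass over all images."""
    steps_per_epoch = -(-num_images // batch_size)  # ceiling division
    return steps_per_epoch * epochs

# e.g. 2183 training images, 200 epochs, for the two batch sizes used
print(training_steps(2183, 200, 1))  # 436600
print(training_steps(2183, 200, 2))  # 218400
```

This is why models restricted to batch size 1 need twice as many configured steps as models restricted to batch size 2 to see the data the same number of times.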
If a fixed-shape resizer was used in the base configuration, two different resizing dimensions (default: 300x300 pixels, custom: 512x512 pixels) were selected for the training. Table 1 gives an overview of the training configurations used for the different models.

In this section we first look at the training, before focusing on the quality of the results and the speed of the selected convolutional neural networks. When evaluating the training results, we first considered the duration the neural networks require for 200 epochs (see Fig. 3). It becomes clear that the two region-based object detectors (Faster R-CNN Inception V2 and R-FCN ResNet101) took significantly longer than the single-shot object detectors (SSD MobileNet V1 and SSD MobileNet V2). The single-shot detectors also clearly show that the size of the input data has a decisive effect on the training duration: while SSD MobileNet V2 with an input size of 300x300 pixels took the shortest time for the training, 9 hours 41 minutes and 47 seconds, the same network with an input size of 512x512 pixels took almost three hours more, yet still remained far below the time R-FCN ResNet101 required for 200 epochs of training.

The next point of comparison was accuracy (see Fig. 4). We examined how often each net's detections were correct (absolute values), and also what proportion of the total detections were correct (relative values). The latter seemed sensible especially because some of the nets produced more than three detections for a single object. With multiple detections per object, the probability that the correct classification is among them is of course higher than if only one detection per object is made.
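The distinction between absolute and relative correctness can be sketched as follows; the detection counts are hypothetical, not measured values from the study:

```python
def detection_stats(correct, total):
    """Absolute number of correct detections and their share of all detections (%)."""
    share = correct / total if total else 0.0
    return correct, round(100 * share, 1)

# two hypothetical detectors with the same number of correct hits
print(detection_stats(180, 200))  # (180, 90.0)  few spurious detections
print(detection_stats(180, 360))  # (180, 50.0)  same hits, many spurious detections
```

The same absolute score can therefore hide very different detector behavior, which is why both values were reported.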
With regard to the later use at the assembly table, however, it does not help us if the neural net provides several possible interpretations for the classification of a component. Figure 4 shows that, in this comparison, the two region-based object detectors generally perform significantly better than the single-shot object detectors, both in terms of the correct detections and their share of the total detections. It is also noticeable that for the single-shot detectors the size of the input data again appears to have an effect on the result. However, there is a clear difference to the previous comparison of training durations: while the training duration increased uniformly with increasing image size for the single-shot detectors, no such uniform relation to the input size can be observed for the accuracy. While SSD MobileNet V2 achieves good results with an input size of 512x512 pixels, SSD MobileNet V1 delivers the worst result of this comparison for the same input size (regarding both the number of correct detections and their share of the total detections). With an input size of 300x300 pixels, the result improves for SSD MobileNet V1, whereas the change to the smaller input size worsens the result for SSD MobileNet V2. The best result of this comparison, judging by the absolute values, was achieved by Faster R-CNN Inception V2. In terms of the proportion of correct detections among the total detections, however, this region-based detector is two percentage points behind R-FCN ResNet101, also a region-based detector.

We were particularly interested in how the neural networks would react to particularly similar, small objects. We therefore decided to investigate their behavior using three very similar objects as an example.
Figure 5 shows the components selected for the experiment. For each of these three components we examined how often it was correctly detected and classified by the compared neural networks, and how often the network confused it with one of the similar components. The first and the second component were detected in nearly all cases by both region-based approaches. The classification by Inception V2 and ResNet101 failed in about one third of the images. The SSD networks detected the object in just one of twenty cases, but MobileNet classified this one correctly. Surprisingly, the results for the third component look very different (see Fig. 6). SSD MobileNet V1 correctly identified the component in seven of ten images and did not produce any detections that could be interpreted as misclassifications with one of the similar components. SSD MobileNet V2 did not detect any of the three components, as in the two previous investigations. The results of the two region-based object detectors are rather moderate: Faster R-CNN Inception V2 detected the correct component in four of ten images, but produced five misclassifications with the other two components. R-FCN ResNet101 caused many misclassifications with the other two components; only two of ten images were correctly detected, with six misclassifications.

Another important aspect of the study is the speed at which the neural networks can detect objects, especially with regard to later use at the assembly table. For the comparison of the speeds, the figures given in the GitHub repository of the TensorFlow Object Detection API for the individual neural nets were used on the one hand; on the other hand, the actual speeds of the neural nets within this project were measured.
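A per-image speed measurement of the kind used in the project could look like this minimal sketch; the lambda merely stands in for a real detector, and the warm-up count is an assumption:

```python
import time

def measure_inference_ms(detect_fn, images, warmup=3):
    """Average wall-clock inference time per image in milliseconds."""
    for img in images[:warmup]:      # warm-up runs (caching, lazy init) are excluded
        detect_fn(img)
    start = time.perf_counter()
    for img in images:
        detect_fn(img)
    return 1000 * (time.perf_counter() - start) / len(images)

# dummy 'detector' standing in for a real model
avg_ms = measure_inference_ms(lambda img: sum(img), [[1, 2, 3]] * 50)
```

Averaging over many images and discarding warm-up runs is what makes such measurements comparable to the repository's published numbers.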
It becomes clear that the speeds measured in the project are clearly below the achievable speeds mentioned in the GitHub repository of the TensorFlow Object Detection API. On the other hand, the differences between the speeds of the region-based and the single-shot object detectors in the project are far less drastic than expected.

We have created a training dataset with small, partly very similar components, and trained four common deep neural networks with it. In addition to the training times, we examined the accuracy and the recognition time with general evaluation data, as well as the results for ten images each of three very similar, small components. None of the networks we trained produced suitable results for our scenario. Nevertheless, we were able to gain some important insights from the results. At the moment the runtime is not yet suitable for our scenario, but it is also not far from the minimum requirements, so these can likely be met with minor optimizations and better hardware. It was also important to realize that there are no serious runtime differences between the different network architectures. The two region-based approaches delivered significantly better results than the SSD approaches. However, the results of the detection of the third small component suggest that MobileNet in combination with a Faster R-CNN could possibly deliver even better results. Longer training, and training data better adapted to the intended use, could also significantly improve the results of the object detectors.

Team Schluckspecht from Offenburg University of Applied Sciences is a very successful participant of the Shell Eco-marathon [1]. In this contest, student groups design and build their own vehicles with the aim of low energy consumption. Since 2018 the event features an additional autonomous driving contest.
In this area, the vehicle has to fulfill several tasks autonomously, such as driving a course, stopping within a defined parking space, or circumventing obstacles. For the upcoming season, the Schluckspecht V car of the so-called urban concept class has to be augmented with the necessary hardware and software to reliably recognize (i.e. detect and classify) possible obstacles and incorporate them into the software framework for further planning. In this contribution we describe the additional hardware and software components that are necessary to allow an optical 3D object detection. The main criteria are accuracy, cost effectiveness, computational complexity allowing near real-time performance, and ease of use with regard to incorporation into the existing software framework and possible extensibility.

This paper consists of the following sections. First, the Schluckspecht V system is described in terms of the hardware and software components for autonomous driving and the additional parts for visual object recognition. The second part scrutinizes the object recognition pipeline: the software frameworks, the neural network architecture, and the final data fusion in a global map are depicted in detail. The contribution closes with an evaluation of the object recognition results and conclusions.

The Schluckspecht V is a self-designed and self-built vehicle according to the requirements of the Eco-marathon rules. The vehicle is depicted in Figure 1. Its main features are its relatively large size, including driver cabin, motor area and a large trunk, a fully equipped lighting system, and two doors that can be opened separately. For the autonomous driving challenges, the vehicle is additionally equipped with several essential parts, divided into hardware, consisting of actuators, sensors, computational hardware and communication controllers.
The software is based on a middleware, CAN-open communication layers, and localization, mapping and path planning algorithms that are embedded into a high-level state machine.

Actuators: The car is equipped with two actuators, one for steering and one for braking. Each actuator is paired with sensors for measuring the steering angle and the braking pressure.

Environmental sensors: Several sensors are needed for localization and mapping. The backbone is a multilayer 3D laser scanning system (LiDAR), which is combined with an inertial navigation system consisting of accelerometers, gyroscopes and magnetic field sensors, all realized as triads. Odometry information is provided by a global navigation satellite system (GNSS) and two wheel encoders.

The communication is based on two separate CAN bus systems, one for basic operations and an additional one for the autonomous functions. The hardware CAN nodes are designed and built by the team, coupling USB, I2C, SPI and CAN-open interfaces. Messages are sent from the central processing unit or the driver, depending on the drive mode. The trunk of the car is equipped with an industrial-grade high-performance CPU and an additional graphics processing unit (GPU). CAN communication is ensured with an internal card; remote access is possible via generic wireless components.

Software structure: The Schluckspecht uses a modular software system consisting of several basic modules that are activated and combined within a high-level state machine as needed. An overview of the main modules and possible sensors and actuators is depicted in Figure 2.

Localization and mapping: The Schluckspecht V runs a simultaneous localization and mapping (SLAM) framework for navigation, mission planning and environment representation. In its current version we use a graph-based SLAM approach built upon the Cartographer framework developed by Google [2]. We calculate a dynamic occupancy grid map that can be used for further planning.
Sensor data is provided by the LiDAR, inertial navigation and odometry systems. An example of a drivable map is shown in Figure 3. This kind of map is also used as the base for the localization and placement of the later detected obstacles. The maps are accurate to roughly 20 centimeters, providing relative localization towards obstacles or homing regions.

Path planning: To make use of the SLAM-created maps, an additional module calculates the motion commands from the start pose to the target pose of the car. The Schluckspecht is a classical car-like mobile system, which means that the path planning must take into account the non-holonomic kind of permitted movement. Parking maneuvers, driving close to obstacles, or planning a trajectory between given points is realized as a combination of local control commands based upon modeled vehicle dynamics (the so-called local planner) and optimization algorithms that find the globally most cost-efficient path given a cost function (the so-called global planner). We employ a kinodynamic strategy, the elastic band method presented in [3], for the local planning. Global planning is realized with a variant of the A* algorithm as described in [4].

Middleware and communication: All submodules, namely localization, mapping, path planning and the high-level state machines for each competition, are implemented within the Robot Operating System (ROS) middleware [5]. ROS provides a messaging system based upon the subscriber/publisher principle. Each module is encapsulated in a process, called a node, capable of asynchronously exchanging messages as needed. Due to its open-source character and an abundance of drivers and helper functions, ROS provides additional features like hardware abstraction, device drivers, visualization and data storage. Data structures for mobile robotic systems, e.g. static and dynamic maps or velocity control messages, allow for rapid development.
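The project uses the A* variant described in [4]; as an illustration of the principle only, a minimal textbook A* search on a 4-connected occupancy grid (0 = free, 1 = occupied) can be sketched as:

```python
import heapq

def astar(grid, start, goal):
    """Shortest 4-connected path on an occupancy grid, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan heuristic
    open_set = [(h(start), 0, start, [start])]               # (f, g, cell, path)
    seen = set()
    while open_set:
        _, cost, pos, path = heapq.heappop(open_set)
        if pos == goal:
            return path
        if pos in seen:
            continue
        seen.add(pos)
        r, c = pos
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                heapq.heappush(open_set,
                               (cost + 1 + h((nr, nc)), cost + 1, (nr, nc), path + [(nr, nc)]))
    return None

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
path = astar(grid, (0, 0), (2, 0))
print(path)  # [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0)]
```

The real global planner works on the dynamic occupancy grid from SLAM and uses a vehicle-specific cost function rather than unit step costs.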
The LiDAR sensor system has four rays, enabling only the incorporation of walls and track delimiters within a map. Therefore, a stereo camera system is additionally implemented to allow for object detection of persons, other cars, traffic signs or visual parking space delimiters, and to simultaneously measure the distance of any environmental objects.

Camera hardware: A ZED stereo camera system is installed on the car and incorporated into the ROS framework. The system provides a color image stream for each camera and a depth map from stereo vision. The camera images are calibrated to each other and towards the depth information. The algorithms for disparity estimation run at around 50 frames per second, making use of the provided GPU.

The object recognition relies on deep neural networks. To seamlessly work with the other software parts and for easy integration, the networks are evaluated with the TensorFlow [6] and PyTorch [7] frameworks. Both are connected to ROS via the OpenCV image formats, providing ROS nodes and topics for visualization and further processing. The object recognition pipeline relies on a combination of mono camera images and calibrated depth information to determine object and position. The core algorithm is a deep learning approach with convolutional neural networks.

The main contribution of this paper is the incorporation of a deep neural network object detector into our framework. Object detection with deep neural networks can be subdivided into two approaches: the first is a two-step approach, where regions of interest are identified in a first step and classified in a second one; the second are so-called single-shot detectors (like [8]) that extract and classify the objects in one network run. Therefore, two network architectures are evaluated, namely YOLOv3 [9] as a single-shot approach and Faster R-CNN [10] as a two-step model.
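The depth map delivered by the stereo system rests on the pinhole stereo model, depth = focal length x baseline / disparity; the focal length and baseline below are illustrative values, not the camera's actual calibration:

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Pinhole stereo model: depth (m) = focal_length (px) * baseline (m) / disparity (px)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# illustrative calibration: 700 px focal length, 12 cm baseline
print(depth_from_disparity(disparity_px=14.0, focal_px=700.0, baseline_m=0.12))  # 6.0
```

The inverse relationship explains why depth accuracy degrades for distant objects: a one-pixel disparity error shifts far depths much more than near ones.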
Both are trained on public data sets and fine-tuned to our setting by incorporating training images from the Schluckspecht V in the ZED image format. The models were pre-selected for their real-time capability in combination with the expected classification performance. This excludes the currently best instance segmentation network, Mask R-CNN [11], due to its computational burden, as well as fast but inaccurate networks based on the MobileNet backbone [12]. The class count is adapted for the contest, in the given case eight classes, including the relevant pedestrian, car, van, tram and cyclist. For this paper, the two chosen network architectures were trained in their respective frameworks, i.e. Darknet for the YOLOv3 detector and TensorFlow for the Faster R-CNN detector. YOLOv3 is used in its standard form with the Darknet-53 backbone; Faster R-CNN is designed with the ResNet101 [13] backbone. The models were trained on local hardware with the KITTI [14] data set. Alternatively, an open-source data set from the teaching company Udacity, with only three classes (truck, car, pedestrian), was tested. To deal with the problem of domain adaptation, the training images for YOLOv3 were pre-processed to fit the aspect ratio of the ZED camera. The Faster R-CNN net can cope with ratio variations as it uses a two-stage approach for detection based on region-of-interest pooling. Both networks were trained and stored. Afterwards, they are incorporated into the system via a ROS node making use of standard Python libraries.

The detector output is represented by several labeled bounding boxes within the 2D image. Three-dimensional information is extracted from the associated depth map by calculating the center of gravity of each box to get an x and y coordinate within the image. Interpolating the depth map pixels accordingly, one gets the distance coordinate z from the depth map to determine the object position p(x, y, z) in the stereo camera coordinate system.
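The described lookup of a box center in the depth map can be sketched as follows, assuming known pinhole intrinsics (fx, fy, cx, cy, which are not given in the paper); all numbers are toy values:

```python
def box_to_point(box, depth_map, fx, fy, cx, cy):
    """Back-project the center of a bounding box to a 3D point in the camera frame.

    box: (x_min, y_min, x_max, y_max) in pixels; depth_map[v][u] holds z in meters.
    """
    u = (box[0] + box[2]) / 2.0              # box center, pixel coordinates
    v = (box[1] + box[3]) / 2.0
    z = depth_map[int(v)][int(u)]            # depth at the box center
    x = (u - cx) * z / fx                    # pinhole back-projection
    y = (v - cy) * z / fy
    return (x, y, z)

# toy 4x4 depth map, every pixel 2 m away; principal point at the image center
depth = [[2.0] * 4 for _ in range(4)]
print(box_to_point((1, 1, 3, 3), depth, fx=100.0, fy=100.0, cx=2.0, cy=2.0))  # (0.0, 0.0, 2.0)
```

In practice the depth value would be interpolated or averaged over several pixels around the center, as a single pixel may be invalid or noisy.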
The ease of projection between different coordinate systems is one reason to use the ROS middleware. The complete vehicle is modeled in a so-called transform tree (tf tree), which allows the direct transformation between different coordinate systems in all six spatial degrees of freedom. The dynamic map, created in the SLAM subsystem, is now augmented with the current obstacles in the car coordinate system. The local path planner can take these into account and plan a trajectory including kinodynamic constraints to prevent a collision or to initiate a braking maneuver.

Both newly trained networks were first evaluated on the training data. Exemplary results for the KITTI data set are shown in Figure 4. The results clearly indicate an advantage for the YOLOv3 system, both in speed and accuracy. The figure depicts good results for occlusions (e.g. the car on the upper right) or high object counts (see the black car on the lower left as an example). The evaluation on a desktop system showed 50 fps for YOLOv3 and approximately 10 fps for Faster R-CNN. After validating the performance on the training data, both networks were started as ROS nodes and tested on real data from the Schluckspecht vehicle. As the training data differs from the ZED camera images in format and resolution, several adaptions were necessary for the YOLOv3 detector: the images are cropped in real time before being presented to the neural net to emulate the format of the training images. The R-CNN-like two-stage networks are directly connected to the ZED node. The test data is not labeled with ground truth; it is therefore not possible to give quantitative results for the recognition task. Table 1 gives a qualitative overview of the object detection and classification, and the subsequent figures give some impression of exemplary results. The evaluation on the Schluckspecht videos showed an advantage for the YOLOv3 network.
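What the tf tree performs automatically reduces, for a single frame pair, to a rigid transform p' = Rp + t; the mounting offsets below are invented for illustration and are not the Schluckspecht's calibration:

```python
import numpy as np

def transform_point(point_cam, R, t):
    """Rigid transform of a 3D point from the camera frame to the car frame: p' = R p + t."""
    return R @ np.asarray(point_cam) + t

# hypothetical mounting: camera 1.5 m ahead of the car origin, 1.0 m up, axes aligned
R = np.eye(3)                      # no rotation in this toy example
t = np.array([1.5, 0.0, 1.0])      # camera position expressed in the car frame
p_car = transform_point([0.0, 0.0, 2.0], R, t)
print(p_car.tolist())  # [1.5, 0.0, 3.0]
```

The tf tree chains such transforms (camera to car, car to map), so an obstacle detected in camera coordinates can be placed directly into the SLAM map.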
The main reason is the faster computation, which results in a frame rate nearly twice as high compared to the two-stage detector. In addition, the recognition of distant, i.e. smaller, objects is a strong point of YOLO. The closer the camera gets, the more the balance shifts towards Faster R-CNN, which outperforms YOLO in all categories for larger objects. What becomes apparent is a maximum detection distance of approximately 30 meters, beyond which cars become too small in size. Figure 6 shows an additional result demonstrating the detection power for partially obstructed objects. Another interesting finding was the capability of the networks to generalize: Faster R-CNN copes much better with new object instances than YOLOv3. Persons with so far unknown clothing colors or darker areas with vehicles remain a problem for YOLO, but commonly not for the R-CNN. The domain transfer from training data (Berkeley and KITTI) to real ZED vehicle images proved problematic.

This contribution describes an optical object recognition system in hardware and software for the application in autonomous driving under restricted conditions, within the Shell Eco-marathon competition. An overall overview of the system and the incorporation of the detector within the framework is given. The main focus was the evaluation and implementation of several neural network detectors, namely YOLOv3 as a single-shot detector and Faster R-CNN as a two-step detector, and their combination with distance information to gain three-dimensional information for detected objects. For the given application, the advantage clearly lies with YOLOv3. Especially the achievable frame rate of at least 10 Hz allows a seamless integration into the localization and mapping framework. Given the velocities and the map update rate, the object recognition and integration via sensor fusion for path planning and navigation works in quasi real-time.
For future applications we plan to further increase the detection quality by incorporating new classes and modern object detector frameworks like M2Det [15]. This will additionally increase the frame rate and bounding box quality. For more complex tasks, the data of the 3D LiDAR system shall be directly incorporated into the fusion framework to enhance the perception of object boundaries and object velocities.

References:
- A few useful things to know about machine learning
- Feature engineering for machine learning
- An empirical analysis of feature engineering for predictive modeling
- Input selection for fast feature engineering
- Random forests
- Support vector regression machines
- Strong consistency of least squares estimates in multiple regression II
- Business data science: Combining machine learning and economics to optimize, automate, and accelerate business decisions
- Global Product Classification (GPC)
- A study of cross-validation and bootstrap for accuracy estimation and model selection
- Automatic liver and tumor segmentation of CT and MRI volumes using cascaded fully convolutional neural networks
- Convolutional networks for biomedical image segmentation
- V-Net: Fully convolutional neural networks for volumetric medical image segmentation
- Self-supervised learning for pore detection in CT-scans of cast aluminum parts
- Generating meaningful synthetic ground truth for pore detection in cast aluminum parts
- NEMA PS3 / ISO 12052, Digital Imaging and Communications in Medicine (DICOM) standard, National Electrical Manufacturers Association
- CT-realistic lung nodule simulation from 3D conditional generative adversarial networks for robust lung segmentation
- Deep learning hardware: Past, present, and future
- A survey on specialised hardware for machine learning
- A survey on distributed machine learning
- Hardware for machine learning: Challenges and opportunities
- 3D U-Net: Learning dense volumetric segmentation from sparse annotation
- Z-Net: An anisotropic 3D DCNN for medical CT volume segmentation
- Activation functions: Comparison of trends in practice and research for deep learning
- Fast and accurate deep network learning by exponential linear units (ELUs)
- Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification
- Toward deeper understanding of neural networks: The power of initialization and a dual view on expressivity
- Adam: A method for stochastic optimization
- diffGrad: An optimization method for convolutional neural networks
- Tversky loss function for image segmentation using 3D fully convolutional deep networks
- A low-power multi-physiological monitoring processor for stress detection, IEEE Sensors
- Using heart rate monitors to detect mental stress
- Positive technology: A free mobile platform for the self-management of psychological stress
- Exploring the effectiveness of a computer-based heart rate variability biofeedback program in reducing anxiety in college students
- Psychological stress and incidence of atrial fibrillation
- Continuously updated, computationally efficient stress recognition framework using electroencephalogram (EEG) by applying online multitask learning algorithms (OMTL)
- Ten years of research with the Trier Social Stress Test
- Trapezius muscle EMG as predictor of mental stress
- PopTherapy: Coping with stress through pop-culture
- DU-MD: An open-source human action dataset for ubiquitous wearable sensors
- Stress recognition using wearable sensors and mobile phones
- Introducing WESAD, a multimodal dataset for wearable stress and affect detection
- Feasibility and usability aspects of continuous remote monitoring of health status in palliative cancer patients using wearables
- Detection of diseases based on electrocardiography and electroencephalography signals embedded in different devices: An exploratory study
- Stress effects, The American Institute of Stress
- Der smarte Assistent
- CAN: Creative adversarial networks, generating "art" by learning about styles and deviating from style norms
- Creative AI: On the democratisation & escalation of creativity
- Generative design: A paradigm for design research
- Eigenfaces for recognition
- Unsupervised representation learning with deep convolutional generative adversarial networks
- Large scale GAN training for high fidelity natural image synthesis
- Interpreting the latent space of GANs for semantic face editing
- Visualizing and understanding generative adversarial networks
- Mistaken identity
- Spectral normalization for generative adversarial networks
- Beauty is in the ease of the beholding: A neurophysiological test of the averageness theory of facial attractiveness
- Unpaired image-to-image translation using cycle-consistent adversarial networks
- Colorization for anime sketches with cycle-consistent adversarial network
- Artificial muse
- Using evolutionary design to interactively sketch car silhouettes and stimulate designer's creativity
- The Chair Project: Four classics
- DeepWear: A case study of collaborative design between human and artificial intelligence
- GRASS: Generative recursive autoencoders for shape structures
- Co-designing object shapes with artificial intelligence
- Systematic review of the empirical evidence of study publication bias and outcome reporting bias
- KI-Kunst und Urheberrecht: Die Maschine als Schöpferin? Public Law Research Paper No. 692; U of Maryland Legal Studies Research Paper
- Inceptionism: Going deeper into neural networks
- Proactive error prevention in manufacturing based on an adaptable machine learning environment, Artificial Intelligence: From Research to Application, The Upper-Rhine Artificial Intelligence Symposium (UR-AI)
- The benefits of PDCA
- CRISP-DM 1.0: Step-by-step data mining guide
- Interpretable machine learning for quality engineering in manufacturing: Importance measures that reveal insights on errors
- Regulation (EU) 2017/745 of the European Parliament and of the Council of 5 April 2017 on medical devices (Medical Device Regulation, MDR)
- Use of real-world evidence to support regulatory decision-making for medical devices: Guidance for industry and Food and Drug Administration staff
- High-performance medicine: The convergence of human and artificial intelligence
- Artificial intelligence powers digital medicine
- Dermatologist-level classification of skin cancer with deep neural networks
- ChestX-ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases
- An attention based deep learning model of clinical events in the intensive care unit
- The artificial intelligence clinician learns optimal treatment strategies for sepsis in intensive care
- The European Commission's High-Level Expert Group on Artificial Intelligence: Ethics guidelines for trustworthy AI
- Key challenges for delivering clinical impact with artificial intelligence
- IBM's Watson supercomputer recommended 'unsafe and incorrect' cancer treatments, internal documents show
- Towards international standards for the evaluation of artificial intelligence for health
- Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD), artificial-intelligence-and-machine-learning-discussion-paper.pdf
- International Medical Device Regulators Forum (IMDRF), SaMD Working Group
- Medical device software: Software life-cycle processes
- General principles of software validation: Final guidance for industry and FDA staff
- Deciding when to submit a 510(k) for a change to an existing device: Guidance for industry and Food and Drug Administration staff
- Software as a Medical Device (SaMD): Clinical evaluation, guidance for industry and Food and Drug Administration staff
- International Electrotechnical Commission, IEC 62366-1:2015, Part 1: Application of usability engineering to medical devices
- Why rankings of biomedical image analysis competitions should be interpreted with care
- What do we need to build explainable AI systems for the medical domain
- Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data (General Data Protection Regulation, GDPR)
- Artificial intelligence in healthcare: A critical analysis of the legal and ethical implications
- Explainable artificial intelligence: Understanding, visualizing and interpreting deep learning models
- Association between race/ethnicity and survival of melanoma patients in the United States over 3 decades
- Docket for feedback: Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD)
- OpenPose: Realtime multi-person 2D pose estimation using part affinity fields
- A density-based algorithm for discovering clusters in large spatial databases with noise
- Fast volumetric auto-segmentation of head CT images in emergency situations for ventricular punctures
- A system for augmented reality guided ventricular puncture using a HoloLens: Design, implementation and initial evaluation
- OP:Sense, a robotic research platform for telemanipulated and automatic computer assisted surgery
- YOLOv3: An incremental improvement, arXiv
- Deep learning based 3D pose estimation of surgical tools using a RGB-D camera at the example of a catheter for ventricular puncture
- Fast point feature histograms (FPFH) for 3D registration
- Joint probabilistic people detection in overlapping depth images
- Towards end-to-end 3D human avatar shape reconstruction from 4D data
- Scene-adaptive optimization scheme for depth sensor networks
- A taxonomy and evaluation of dense two-frame stereo correspondence algorithms
- Advances in computational stereo
- A comparative analysis of cross-correlation matching algorithms using a pyramidal resolution approach
- Fast approximate energy minimization via graph cuts
- Stereo processing by semiglobal matching and mutual information
- Guided stereo matching
- A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation
- Pyramid stereo matching network
- Microstructure-sensitive design of a compliant beam
- Microstructure sensitive design of an orthotropic plate subjected to tensile load
- Microstructure sensitive design for performance optimization
- On the design, analysis, and characterization of materials using computational neural networks
- Texture optimization of rolled aluminum alloy sheets using a genetic algorithm
- Finite mixture models
- A tutorial on hidden Markov models and selected applications in speech recognition
- Information processing in dynamical systems: Foundations of harmony theory
- Generative adversarial nets
- Building texture evolution networks for deformation processing of polycrystalline FCC metals using spectral approaches: Applications to process design for targeted performance
- Linear solution scheme for microstructure design with process constraints
- MatCALO: Knowledge-enabled machine learning in materials science
- Differential evolution: A simple and efficient adaptive scheme for global optimization over continuous spaces
- Reinforcement learning: An introduction
- Hindsight experience replay
- Industrieroboter für KMU: Flexible und intuitive Prozessbeschreibung
- Toward efficient robot teach-in and semantic process descriptions for small lot sizes
- Survey on human-robot collaboration in industrial settings: Safety, intuitive interfaces and applications
- Concept and architecture for programming industrial robots using augmented reality with mobile devices like Microsoft HoloLens
- Robot programming using augmented reality
- Robot path and end-effector orientation planning using augmented reality
- Spatial programming for industrial robots based on gestures and augmented reality
- Spatial programming for industrial robots through task demonstration
- Augmented reality based teaching pendant for industrial robot
- Intuitive robot tasks with augmented reality and virtual obstacles
- Development of a mixed reality based interface for human-robot interaction
- A hands-free virtual-reality teleoperation interface for wizard-of-oz control
- Mixed reality as a tool supporting programming of the robot
- Communicating robot arm motion intent through mixed reality head-mounted displays
- Intuitive industrial robot programming through incremental multimodal language and augmented reality
- Development of mixed reality robot control system based on HoloLens
- Interactive spatial augmented reality in collaborative robot programming: User experience evaluation
- Comparison of multimodal heading and pointing gestures for co-located mixed reality human-robot interaction
- Robot programming through augmented trajectories in augmented reality
- Interactive robot programming using mixed reality
- A taxonomy of mixed reality visual displays
- Experimental packages for KUKA manipulators within ROS-Industrial
- Siemens: ROS#
- A questionnaire for the evaluation of physical assistive devices (QUEAD)
- UNCTAD: Review of maritime transport 2019 (2019), last accessed 2019-11-19
international chamber of shipping report of the second meeting of the regional working group on illegal, unreported and unregulated (iuu) fishing automatic identification system (ais): data reliability and human error implications maritime anomaly detection: a review trajectorynet: an embedded gps trajectory representation for point-based classification using recurrent neural networks partition-wise recurrent neural networks for point-based ais trajectory classification a multi-task deep learning architecture for maritime surveillance using ais data streams identifying fishing activities from ais data with conditional random fields a segmental hmm based trajectory classification using genetic algorithm wissensbasierte probabilistische modellierung für die situationsanalyse am beispiel der maritimen überwachung detecting illegal diving and other suspicious activities in the north sea: tale of a successful trial source quality handling in fusion systems: a bayesian perspective tensorflow: large-scale machine learning on heterogeneous systems deep learning for time series classification: a review rapid object detection using a boosted cascade of simple features general framework for object detection a decision-theoretic generalization of on-line learning and an application to boosting imagenet classification with deep convolutional neural networks rethinking the inception architecture for computer vision going deeper with convolutions deep residual learning for image recognition mobilenets: efficient convolutional neural networks for mobile vision applications mobilenetv2: inverted residuals and linear bottlenecks rich feature hierarchies for accurate object detection and semantic segmentation ssd: single shot multibox detector deep learning for generic object detection: a survey pedestrian detection: an evaluation of the state of the art a survey on face detection in the wild: past, present and future text detection and recognition in imagery: a survey information 
visualizations used to avoid the problem of overfitting in supervised machine learning data science for business: what you need to know about data mining and data-analytic thinking automatic object detection from digital images by deep learning with transfer learning gpu asynchronous stochastic gradient descent to speed up neural network training tensorflow: tensorflow object detection api: ssd mobilenet v2 coco faster r-cnn: towards real-time object detection with region proposal networks tensorflow object detection api: faster rcnn inception v2 coco. online 29. tensorflow: tensorflow object detection api: faster rcnn inception v2 coco r-fcn: object detection via region-based fully convolutional networks 31. tensorflow: tensorflow object detection api: rfcn resnet101 coco multi-scale feature fusion single shot object detector based on densenet references 1. shell: the shell eco marathon real-time loop closure in 2d lidar slam kinodynamic trajectory optimization and control for car-like robots experiments with the graph traverser program robot operating system automatic differentiation in pytorch ssd: single shot multibox detector yolov3: an incremental improvement rich feature hierarchies for accurate object detection and semantic segmentation mobilenets: efficient convolutional neural networks for mobile vision applications deep residual learning for image recognition are we ready for autonomous driving? the kitti vision benchmark suite m2det: a single-shot object detector based on multi-level feature pyramid network the upper-rhine artificial intelligence symposium ur-ai 2020we thank our sponsor! main sponsor: esentri ag, ettlingen this research and development project is funded by the german federal ministry of education and research (bmbf) and the european social fund (esf) within the program "future of work" (02l17c550) and implemented by the project management agency karlsruhe (ptka). the author is responsible for the content of this publication. 
underlying projects to this article are funded by the wtd 81 of the german federal ministry of defense. the authors are responsible for the content of this article. this work was developed in the fraunhofer cluster of excellence "cognitive internet technologies". key: cord-308652-i6q23olv authors: cobos-sanchiz, david; del-pino-espejo, maría-josé; sánchez-tovar, ligia; matud, m. pilar title: the importance of work-related events and changes in psychological distress and life satisfaction amongst young workers in spain: a gender analysis date: 2020-06-30 journal: int j environ res public health doi: 10.3390/ijerph17134697 sha: doc_id: 308652 cord_uid: i6q23olv a relentless stream of social, technological, and economic changes has impacted the workplace, affecting young people in particular. such changes can be a major source of stress and can pose a threat to health and well-being. the aim of this paper is to understand the importance of work-related events and changes in the psychological distress and life satisfaction of young workers in spain. a transversal study was carried out on a sample comprising 509 men and 396 women aged between 26 and 35 years old. the results showed that there were no differences between the men and women in the number of work-related events and changes experienced in the last 12 months, nor in terms of job satisfaction. the results from the multiple regression analysis showed that a greater number of work-related events and changes experienced during the last 12 months was associated with increased psychological distress and reduced life satisfaction amongst men, but this was not the case for women. although job satisfaction was independent from the men and women's psychological distress when self-esteem and social support were included in the regression equation, greater job satisfaction was associated with greater life satisfaction for both men and women.
it concludes that work-related events and job satisfaction are important for the health and well-being of young people, even though a larger number of work-related events and changes is associated with increased psychological distress and reduced life satisfaction for men only. profound social change has been taking place in recent decades in technological and economic terms, having an impact on work and workers. these changes are taking place at ever faster speeds and affect most countries across the world. from the late 1980s and, in particular, in the 1990s, researchers supporting different theoretical perspectives have recounted the consequences of the changes in manufacturing systems: the move from an industrial to a post-industrial society. these changes go beyond the methods of organizing production, having a clear impact on working conditions and employment opportunities [1]. there is discussion around an environment characterized by volatility, uncertainty, complexity, and ambiguity (vuca), where companies, in response to almost unpredictable social expectations, undergo constant and rapid change [2]. among the effects of globalization, it is worth highlighting that economic, social, and health crises are no longer confined to just one country, but rather they spread to other countries, sometimes more quickly than others. within this context, working life has experienced significant changes, caused by an ever more global and flexible system [3]; these changes are contributing to a significant loss of work, with an increase in unemployment, underemployment, and precarious employment throughout the world [4]. according to the international labor organization (ilo), the main problem in the world's labor markets is poor-quality employment. millions of people are compelled to accept poor working conditions.
recent data show that, in 2018, the majority of the total global working population of 3.3 billion did not have an adequate level of economic security, material well-being, and equal opportunities [5]. this tends to affect young people most of all, as individuals that are extremely vulnerable to an ever-changing environment, where situations of abnormal employment, temporary employment, part-time work, outsourcing, individual contracts, and self-employment prevail. the changes in the world of work that young people are facing, whether due to the introduction of new technologies or the type of employment in itself, take place in such a way that young people often do not have the chance to adapt to the position [6]. for millions of young people in the european union, finding a job is extremely difficult. in some southern european countries, more than half of all young adults are unemployed, a situation that was made even worse by the last financial crisis. this entails problems of a psychosocial nature, but can also have devastating consequences for the countries concerned, and for the european union itself [7], in a post-brexit context, due to the social tensions that can arise from unemployment. in spain, in particular, youth unemployment rates double those of the adult population and affect more than half of the population [8]. the most relevant aspect has been the change in the industrial trend of male employment: stable jobs for life have made way for constant change, with periods of unemployment, instability, and precariousness throughout one's working life. in addition to the difficulty of entering the labor market for the first time, replicated across europe, the problems arising from the spanish economy being based on unstable productive industries must also be considered [1].
it could be assumed that this dynamic reflects the stamina of young people, but in reality it shows an overwhelming situation that hinders them in their daily routine, characterized by an uncontrolled flow of new demands that constantly force them to search for new jobs, which consistently fail to meet their expectations due to their temporary nature. according to vendramin [9], switching jobs is part of a "normalized" journey of employment instability for young people. these changes are involuntary for 33.7% of women and 22% of men. the reality of young workers has been put in the spotlight by different visions. according to data from the ilo [5], women are more likely to take casual work, and just like young people, they are prone to establishing weak ties with the employment market. it is undeniable that working is fundamental for the economic and psychological well-being of an individual, and of society in general [4]. in current day society, a person's job is one of their most important sources of identity and it plays a vital economic and social role for the majority of people [10]. however, employment and working conditions can involve factors that are not people-oriented, and this affects their well-being. the links between work and health have become a central issue in organizational literature, and workers are becoming more and more aware of the importance of health in their work-life balance [11, 12]. in terms of young workers, it is particularly important to understand that their initiation into the world of work is taking place under adverse circumstances: a lack of jobs currently on offer on the market, and also a lack of appreciation for their personal conditions due to inexperience [6]. although work can be a source of stress, it is a fundamental aspect of life, as working-age adults spend the majority of their waking hours at work [13].
work is particularly important for a young person's personal development, and due to the restricted possibilities of entering the labor market, young people can end up accepting precarious employment with unfavorable conditions that put their physical and mental integrity at risk. it is worth noting that, despite the employment crisis and the aforementioned changes in production systems, the vision and expectations of young spanish people with regard to work and professional life continue to be largely similar to those of previous generations. for them, work is a fundamental part of their lives, whether it is more instrumental and utilitarian (young people from an industrial background or from peripheral rural and semi-rural areas), or more self-fulfilling (young urbanites) [14]. in this respect, although young spanish people realistically expect that gaining a foothold in the labor market will be difficult, this does not mitigate the major psychological and social impact, which can trigger severe and significant consequences for individuals that endure over time [15]. such circumstances may be, in fact, an important source of stress that can alter the physiology and mental well-being of individuals [16]. work-related events and changes thus pose a threat to the health and well-being of women and men alike. in recent decades ample research has been carried out on the psychosocial risks of work, such as stress in the workplace, and a link was found between psychosocial risks and their consequences on physical, mental, and social health [17]. in accordance with data from the quebec national institute for public health [18], several scientific studies have shown that the presence of one or more psychosocial risks in the workplace can impinge upon workers' mental and physical health, increasing the risk of accidents.
it was also stressed that workers with a low level of education working in precarious employment have to face increased psychosocial risks at work, which affects their health and hinders the possibility of improved life conditions. likewise, it was highlighted that men and women are not exposed to the same psychosocial risks in the workplace, and that women tend to be more exposed. when assessing work-related psychosocial factors, it is also important to analyze job satisfaction. although there is no sole definition, it is believed that job satisfaction reflects a pleasant emotional state where people value their work or work experience positively [19]. job satisfaction is a global concept that covers several aspects and is determined by specific work elements, such as the workplace or the job's characteristics, by specific personal factors such as skill or psychological status, and by non-specific factors such as demographic, cultural, and community aspects [19, 20]. there is evidence that job satisfaction has an impact on individual performance and company results, as well as affecting the health and quality of life of the worker. for example, it has been found that job satisfaction is associated with trust in the company [21], work performance [22], and self-efficacy [23], whilst workers that are dissatisfied can see their mental and physical health suffer due to mood changes or psychosomatic complaints, reduced efficiency, more time off, and more requests for a change in role [24]. symptoms of depression and anxiety have been named collectively as psychological distress [25], although other symptoms such as somatic issues or insomnia are also included within psychological distress [26-28]. both clinically and in research, psychological distress is a commonly used indicator for mental health and psychopathologies [27], as well as being associated with physical health.
there is evidence showing that psychological distress is associated with higher mortality rates for various reasons [29], and with several inflammatory markers [30]. it has also been found to increase the risk of diseases such as arthritis, cardiovascular disease, and chronic obstructive pulmonary disorder [25]. many of these effects have been studied less in young people than they have been in adults, perhaps due to morbidity and mortality rates being comparatively lower in the former group. it should be noted, however, that various health issues have their onset at a young age, and these may affect the individual throughout the rest of their life [31]. self-esteem and social support stand out amongst the psychosocial factors related to health and well-being [32-35]. self-concept refers to describing and evaluating oneself, including one's psychological and physical characteristics, qualities, skills, and roles. self-esteem is the degree to which such qualities and characteristics are perceived as being positive [36]. there are many factors that determine a person's self-esteem, including individual values, attitudes, wishes, family issues, and social factors related to work and the type of work [37]. there is evidence that self-esteem has consequences on central aspects of life, and that high self-esteem leads to good mental and physical health, satisfaction with close relationships, and social support [38, 39], as well as being associated with job performance [37] and professional prestige and income [10]. across all societies, gender is fundamental in organizing work, and work is fundamental for socially constructing gender. although there is empirical evidence that men and women are similar in the majority of their psychological features [40, 41], most societies believe that differences remain and that men and women should take on different roles; and people are treated differently depending on the gender assigned to them at birth.
gender is a social construct [42] that restricts people and gives them different roles and positions. traditional gender roles consider women as carers and men as the backbone of the family [43], assuming that working is vital for a man's mental health but somewhat secondary for women, whilst the opposite assumption occurs within family roles [44]. these assumptions are not supported by the empirical evidence, which shows that the quality of job positions is associated with less psychological distress amongst men and women [44, 45], and that the similarities between men and women are clearer than the differences across a series of factors that are important for the family-work association [46]. despite this, it is still believed that the commitment of a woman to work is lower than that of a man, and the classification of gender is an important deciding factor when it comes to professional interests [47]. although there has been a trend towards more equal gender roles in recent decades [48], gender stereotypes that present significant differences between men and women in terms of their features, occupations, and behavior still exist [49, 50]. in spite of a woman's role in the workplace having become more widespread in recent years in many countries [51], total working equality has not yet been achieved, with salary gaps and job segregation remaining [51-53]. an example of this is that men still dominate the more prestigious and creative roles, as well as the technical positions [53]. although the importance of research in the psychological aspects of work has received more recognition in recent decades, and research has been done on the psychosocial risks inherent to the workplace and the working environment [17], the impact on health of stress at work, understood as work-related events and changes, has been studied less.
furthermore, the positive aspects of work, such as job satisfaction, or the presence of personal resources such as self-esteem and social resources such as social support, are rarely considered in these studies. these are all variables that can differ between men and women and can be important in determining health and well-being. the aim of this paper is therefore to understand the importance of work-related events and changes experienced in the last year in the psychological distress and life satisfaction of young people in spain, including satisfaction with the job role, self-esteem, and emotional and instrumental social support in the prediction model, all of which will be assessed by analyzing men and women separately. the hypotheses are: (1) men and women who have experienced a greater number of work-related events and changes, and who report low job satisfaction, low self-esteem, and low social support, will also report greater psychological distress; (2) men and women who have experienced a lesser number of work-related events and changes, and who report high job satisfaction, high self-esteem, and high social support, will also report greater life satisfaction. the sample consisted of 509 men and 396 women aged between 26 and 35 years old. the average age of the men was 30.13 (sd = 2.69) and of the women was 30.08 (sd = 2.81); the difference was not statistically significant, t(903) = 0.28, p = 0.78. their professions were varied: 37% were in non-manual labor, 34.9% in manual labor, and 28.1% in professions that required university studies. there were also differences in their level of education: university studies were most common (41.8%), whilst 33.7% had secondary school education and 24.5% had only basic education. more than half of the sample (59.3%) was single, 39% was married or in a civil partnership, and 1.7% was separated or divorced.
psychological distress was assessed using the somatic symptoms, anxiety and insomnia, and severe depression scales of the ghq-28 [54], each of which includes 7 items that gather information on general health over recent weeks. example items are "been feeling nervous and strung-up all the time?", "felt constantly under strain?", "felt that life isn't worth living?", and "felt that life is entirely hopeless?". a likert scale was used, allocating weightings from 0 (no symptoms) to 3 (greater discomfort). the internal consistency in the sample group of this paper for the 21 items was 0.92. life satisfaction was assessed with the satisfaction with life scale (swls) [55]. it is made up of 5 items with a likert-style 7-point response scale ranging from 1 (completely disagree) to 7 (completely agree). it is a tool that has been used in many countries, spain included, and has shown suitable psychometric properties for men and women [35, 56]. the internal consistency in the sample group of this paper was 0.84. the work-related events and changes were assessed using four items where participants were asked whether in the last 12 months they had experienced the following: (1) change of employment, (2) loss of employment, (3) starting new employment, and (4) change in employment conditions. each item was scored with a 0 if the person had not experienced it in the previous 12 months and with a 1 if they had. the total score for work-related events and changes was obtained by adding together the responses from the 4 items, so the score ranges between 0 (for the complete absence of work-related events and changes) and 4, which is the maximum score. job satisfaction was assessed using the job satisfaction questionnaire [57]. it is an open response test with 5 questions about whether the person is satisfied in their job, whether it is the job they chose, whether they want a change, and to what extent they feel fulfilled.
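the 0-4 work-related events composite described above (four binary items summed) can be sketched in a few lines; the function and parameter names below are illustrative, not taken from the paper.

```python
# Hedged sketch of the 0-4 work-related events composite described above.
# Item and function names are illustrative, not from the paper.

def events_score(changed_job, lost_job, started_job, changed_conditions):
    """Each item is 1 if experienced in the last 12 months, else 0;
    the composite is their sum, from 0 (no events) to 4 (all four)."""
    items = (changed_job, lost_job, started_job, changed_conditions)
    if not all(v in (0, 1) for v in items):
        raise ValueError("items must be coded 0 or 1")
    return sum(items)

# a participant who lost a job and then started a new one scores 2
print(events_score(0, 1, 1, 0))  # prints 2
```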
the responses to each of the questions were scored quantitatively by applying a code created and approved by matud [57]. the internal consistency for the sample group in this paper was 0.76. self-esteem was assessed using the spanish version of the york self-esteem inventory [58], a questionnaire made up of 51 items that takes an overall measurement of self-esteem, reflecting the assessment of several areas including personal, interpersonal, family, achievement, and physical attractiveness, as well as the degree of uncertainty about oneself. the answer format is a 4-point scale that ranges from "never" (scored with a 0) to "always" (scored with a 3). with the sample group in this paper the internal consistency was 0.94. social support was assessed using the social support scale [59]. it is made up of 12 items, answered on a 4-point likert scale that ranges from 0 (never) to 3 (always), which assess the social support perceived emotionally (7 items) and instrumentally (5 items). the internal consistency of the sample in this paper was 0.84 for emotional social support and 0.80 for instrumental social support. furthermore, each participant was given a sociodemographic and employment data collection sheet. participants were volunteers and were not paid for their participation in this study. the sample was recruited through various work centers of spanish companies all over spain, from all production sectors. to collect data, the social networks of psychology and sociology students at 7 spanish universities were used; these students were trained for the testing step and received course credits for this task.
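the internal consistency coefficients reported for these scales are cronbach's alpha values. as a rough illustration (not the paper's actual computation, which was done in spss), alpha can be computed from an item-score matrix as follows:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, k_items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # per-item sample variances
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# four perfectly consistent respondents on a 4-item, 0-3 likert scale
data = [[0, 0, 0, 0],
        [1, 1, 1, 1],
        [2, 2, 2, 2],
        [3, 3, 3, 3]]
print(round(cronbach_alpha(data), 3))  # prints 1.0
```

with real, noisy responses alpha falls below 1; values around 0.8-0.9, as reported above, indicate good internal consistency.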
after verbal informed consent was received, the questionnaires were completed individually on paper by people that met the following criteria: (1) aged between 16 and 35 years old; (2) with work experience, whether currently working or not (work experience means having, or having had, a formal employment contract); and (3) able to understand and speak spanish. this study is part of broader research on the importance of personal and social factors in men and women's well-being, and it was assessed positively by the animal research and well-being ethics committee at the university of la laguna (study approval no. 2012-0040). descriptive analyses were carried out to understand the socio-demographic characteristics of the participants. the internal consistency reliability of the study factors was calculated using cronbach's alpha coefficient. the comparisons between men and women were calculated using student's t-test. the bivariate associations between variables were calculated using pearson's r correlation coefficient, except for the educational level, where spearman's rho was used as it is an ordinal variable with 7 levels, from 0 (for no studies) to 6 (for university studies spanning 6 years). to test the hypotheses and determine the importance of the number of work-related events and changes, job satisfaction, self-esteem, and social support in psychological distress and life satisfaction amongst men and women, hierarchical multiple regression analyses were made. age and level of studies were incorporated at the first step (model 1) to control for their effect. at step 2 (model 2), the number of work-related events and changes and job satisfaction were incorporated. finally, at step 3 (model 3), self-esteem and emotional and instrumental social support were incorporated. the correlations and the multiple regression analyses were made independently for the sample of women and the sample of men.
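the three-step hierarchical scheme just described (controls, then work variables, then personal and social resources) can be sketched with synthetic data; this is a numpy-only illustration with made-up effect sizes and variable names, not a reproduction of the paper's spss analysis.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an OLS fit with intercept, via numpy least squares."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()

rng = np.random.default_rng(42)
n = 300
age = rng.normal(30, 3, n)
studies = rng.integers(0, 7, n).astype(float)  # 0-6 education level
events = rng.integers(0, 5, n).astype(float)   # 0-4 events composite
job_sat = rng.normal(0, 1, n)
self_esteem = rng.normal(0, 1, n)
sup_emo, sup_ins = rng.normal(0, 1, n), rng.normal(0, 1, n)
# toy outcome loosely mimicking the reported pattern (effects are invented)
distress = 0.3 * events - 0.2 * job_sat - 0.5 * self_esteem + rng.normal(0, 1, n)

m1 = np.column_stack([age, studies])                       # step 1: controls
m2 = np.column_stack([m1, events, job_sat])                # step 2: work variables
m3 = np.column_stack([m2, self_esteem, sup_emo, sup_ins])  # step 3: resources
r1, r2, r3 = (r_squared(m, distress) for m in (m1, m2, m3))
print(f"R2 model1={r1:.3f}  dR2 step2={r2 - r1:.3f}  dR2 step3={r3 - r2:.3f}")
```

because the three models are nested, R² never decreases from one step to the next; the increase at each step (ΔR²) is what the paper tests for statistical significance.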
the statistical analyses were performed using the ibm spss statistics for windows software, version 22.0 (ibm corp., armonk, n.y., usa). sectors. to collect data, the social networks of psychology and sociology students at 7 spanish universities were analyzed, who were trained for the testing step and received course credits for this task. after verbal informed consent was received, the questionnaires were completed individually on paper by people that met the following criteria: (1) aged between 16 and 35 years old; (2) with work experience and either working (work experience means having, or having had, a formal employment contract) or not; and (3) able to understand and speak spanish. this study is part of broader research on the importance of personal and social factors in men and women's well-being, and it was assessed positively by the animal research and well-being ethics committee at the university of la laguna (study approval no. 2012-0040). descriptive analyses were carried out to understand the socio-demographic characteristics of the participants. the reliability of the internal consistency of the study factors was calculated using cronbach's alpha coefficient. the comparisons between men and women were calculated using the student's t-test. the bivariate associations between variables were calculated using pearson's r correlation coefficient except for the educational level where spearman's rho was used as it is an ordinal variable with 7 levels, from 0 (for no studies) to 6 (for university studies spanning 6 years). to test the hypotheses and determine the importance of the number of work-related events and changes, job satisfaction, self-esteem and social support in psychological distress, and life satisfaction amongst men and women, hierarchical multiple regression analyses were made. the age and level of studies were incorporated at the first step (model 1) to control their effect. 
table 1 shows the correlation coefficients of age, level of studies, number of work-related events and changes, job satisfaction, self-esteem and social support with psychological distress and life satisfaction amongst men and women. as can be observed, age is independent of psychological distress and life satisfaction for both men and women; for men these outcomes are also independent of the level of studies, whereas women with a higher level of studies report less psychological distress and greater life satisfaction. for both men and women, a higher number of work-related events and changes is associated with increased psychological distress and less life satisfaction, whilst greater job satisfaction, self-esteem, and social support are associated with more life satisfaction and less psychological distress. table 1. correlations between the dependent and independent variables amongst men and women. table 2 shows the main results of the hierarchical multiple regression analysis in which the dependent variable was psychological distress for the male sample, with table 3 showing the female sample. as can be observed, model 1 was only statistically significant in the female sample, with the only statistically significant predictor being the level of studies (β = −0.17, p < 0.01).
including the number of work-related events and changes and job satisfaction in model 2 produced a statistically significant increase in r², with the beta weights being statistically significant for both variables amongst men but only for job satisfaction amongst women. including self-esteem and emotional and instrumental social support in model 3 also produced a statistically significant increase in r², although in the female sample only self-esteem was statistically significant (β = −0.50, p < 0.001), whilst in the male sample both self-esteem (β = −0.49, p < 0.001) and instrumental social support (β = −0.12, p < 0.05) were. model 3, with all the independent variables in the equation, predicted 28% of the variability in psychological distress amongst men and 31% amongst women. for the male sample, psychological distress was associated with lower self-esteem, a higher number of work-related events and changes in the past year, and less instrumental social support, whilst for the females it was only associated with lower self-esteem. [table 2/3 residue: model f-statistics f(2, 393) = 6.32 **, f(4, 391) = 7.01 ***, f(7, 388) = 26.72 ***; note: β = standardized regression coefficient; * p < 0.05; ** p < 0.01; *** p < 0.001.] table 4 shows the main results of the hierarchical multiple regression analysis in which the dependent variable was life satisfaction for the male sample, with table 5 showing the female sample. as can be observed, model 1 was only statistically significant in the female sample, where a higher level of studies was associated with greater life satisfaction.
including the number of work-related events and changes that had taken place in the last year and job satisfaction in model 2 produced a statistically significant increase in r² for men and women, with the beta weights being statistically significant for both variables, showing that greater life satisfaction was associated with greater job satisfaction and a lower number of work-related events and changes in the previous year. including self-esteem and emotional and instrumental social support in model 3 produced a statistically significant increase in r² for men and women, with the beta weights for self-esteem and emotional social support in the male sample, and for self-esteem and instrumental social support in the female sample, being statistically significant. model 3, with all the regression variables, predicted 27% of the variance in life satisfaction for men and 31% for women. for men, increased life satisfaction was associated with greater job satisfaction, greater emotional social support, higher self-esteem, and fewer work-related events and changes during the last year. for women, increased life satisfaction was associated with higher self-esteem, more job satisfaction, and greater instrumental social support. the aim of this paper was to understand the importance of work-related events and changes experienced in the last year in predicting psychological distress and life satisfaction for male and female young workers in spain, including in the prediction model the number of work-related events and changes and job satisfaction. self-esteem and emotional and instrumental social support were also included in the regression equation, with the aim of understanding the relative weight that social, personal, and work factors have on psychological distress and life satisfaction.
a hierarchical regression model was used, and men and women were analyzed separately, given the evidence that gender is an important distinction in the workplace. the analysis highlights that work-related events and changes experienced in the previous year and job satisfaction were statistically significant for men, but for women only job satisfaction was statistically significant. this coincides with what has been reported in the literature [44, 45]: for both men and women, the quality of job positions is associated with less psychological distress and better health. the fact that only job satisfaction was statistically significant for women could be a reflection of women facing situations of inequality, segregation, imbalances, and gender stereotypes in the labor market [49, 50], which still happen regardless of the academic level reached by this group in recent years. it is well known that gender places young men and women unequally in both education and the labor market [60]. there is extensive literature on the gender gap, in general and in the spanish labor market in particular, which looks at employment discrimination, its evolution throughout the life cycle and, specifically, pay discrimination [61-64]. employers still hold stereotypes about women's productivity and, in general, tend to regard women as being less committed to paid work than men [65, 66]. this reality is reported in studies that reveal the distribution of roles in accordance with gender in workplaces [51-53], aspects which are often highlighted in female working environments. with regard to the predictors of psychological distress and life satisfaction, there were some significant differences between men and women. in both groups, age was independent of psychological distress and life satisfaction, as was the level of studies for men.
for women, a higher level of studies was associated with less psychological distress and greater life satisfaction, despite the small size of the effect and its greatly reduced statistical significance when self-esteem and social support were included in the regression equation. the first hypothesis, proposing that men and women who have experienced a greater number of work-related events or changes, and who report low job satisfaction, low self-esteem, and low social support would also report greater psychological distress, was only partially supported. although in the male sample a larger number of work-related events and changes taking place in the last year and less job satisfaction predicted psychological distress (model 2), when self-esteem and social support (model 3) were included in the regression equation, job satisfaction ceased to be statistically significant in predicting psychological distress. for women, although in model 2 less job satisfaction predicted increased psychological distress, job satisfaction ceased to be statistically significant in predicting psychological distress when self-esteem and social support (model 3) were included in the regression equation. self-esteem ended up being an important predictor of psychological distress for the male sample and the only predictor for the females. these results coincide with those of other studies [33, 38] and confirm the importance of self-esteem on psychological well-being for both men and women. these results force us to consider the value of self-esteem and psychological well-being as health contributors, as highlighted by some authors in studies on psychological distress and the workplace [32] [33] [34] [35] . the results highlight the lack of importance of social support in predicting psychological distress, as it was only statistically significant in the male group, despite literature reporting social support as a protective factor of psychological distress. 
the second hypothesis, which proposed that men and women who have experienced a smaller number of work-related events and changes, and who report high job satisfaction, high self-esteem, and high social support, would also report greater life satisfaction, was also only partially supported. in fact, in the male group, greater life satisfaction was associated with greater job satisfaction, increased emotional social support, higher self-esteem, and fewer work-related events and changes. however, for women, in the final model, when self-esteem and social support were incorporated, the number of work-related events and changes ceased to be statistically significant, and the social support that was associated in a statistically significant way with greater life satisfaction was instrumental and not emotional. this suggests that self-esteem and social support are valuable factors when dealing with situations that disrupt life satisfaction in both groups. it should be noted that perceived social support in particular has been considered by several authors as an element that facilitates protection against situations that create psychological distress. the results highlight that there are differences between men and women in the predictive value of work-related events and changes for psychological distress, with job events and changes being much more strongly associated with psychological distress in young men than in women. this could perhaps be a consequence of traditional social practices and gender stereotypes that underscore working roles amongst men more than amongst women [23, 47, 49, 50], and therefore work-related events and changes could represent a bigger threat to men's mental health than to women's. in any case, it is also notable that there were no differences between men and women in terms of their job satisfaction, and this was important for predicting life satisfaction for both sexes.
the results allow us to broaden our knowledge about the relevance of work-related events and changes to the health and well-being of women and men. in this respect, the findings of our research serve as a basis for further studies aimed at in-depth research into distress in young workers, including looking into factors that implicate the work environment as a potential trigger of psychological distress. in particular, the difference between men and women in the predictive importance that work-related events and changes have for psychological distress and life satisfaction (a construct that refers to the feeling of well-being with oneself and one's surroundings) has been highlighted, with this being much greater amongst young men than young women. in conclusion, work-related events and changes and job satisfaction are important for the health and well-being of young workers, even though only amongst men is a larger number of work-related events and changes associated with greater psychological distress and reduced life satisfaction. it is important to highlight that, for young workers, life satisfaction, social support, and self-esteem were shown to be important factors to consider in research on the psychological distress created by adverse circumstances in the working environment. the study has some limitations: a convenience sample was used, so there can be no guarantee that it is representative of young spanish people. moreover, the study is cross-sectional, so we cannot speak of cause-effect relationships. in addition, the percentage of variance explained in psychological distress and in life satisfaction does not exceed 31%. certain aspects could have been studied in greater depth and remain open to subsequent, more detailed study.
in particular, it would be interesting to use the holmes-rahe life stress scale, a psychological scale used to measure susceptibility to stress-induced health problems, as well as to introduce the locus of control as a study variable. finally, some mention should be made of the recent covid-19 pandemic. this study was carried out using data from young spanish people collected in the last year. obviously, conditions have changed since the data were collected, in a dynamic and changing context. we understand that the main results remain valid in this context; however, it is reasonable to consider it highly likely that the socio-economic situation will be aggravated by the current situation, which may have an impact on the psychosocial occupational risks to which young people are exposed.
references
- empleabilidad de l@s jóvenes: formación, género y territorio (eject): informe final de proyectos de i+d+i; cso2014-59753-p
- the strategic position of human resource management for creating sustainable competitive advantage in the vuca world
- predisposition to change is linked to job satisfaction: assessing the mediation roles of workplace relation civility and insight
- expanding the impact of the psychology of working: engaging psychology in the struggle for decent work and human rights
- organisation internationale du travail. emploie et questions sociales dans le monde
- globalización y condiciones de trabajo de los jóvenes trabajadores: el caso de las franquicias de comida rápida
- youth unemployment in europe. appraisal and policy options
- enquête auprès des jeunes salariés en belgique francophone. fondation travail-université asbl
- self-esteem and extrinsic career success: test of a dynamic model
- work-life balance: weighing the importance of work-family and work-health balance
- retirement practices in different countries
- job satisfaction: subjective well-being at work
- jóvenes en perpetuo tránsito hacia ninguna parte
- desempleo juvenil en españa: situación, consecuencias e impacto sobre la vida laboral de los adultos
- stress, coping, and immunologic relevance: an empirical literature review
- world health organization. health impact of psychosocial hazards at work: an overview
- risques psychosociaux du travail: des risques à la santé mesurables et modifiables
- job satisfaction among immigrant workers: a review of determinants
- the nature and causes of job satisfaction
- transformational leadership, job satisfaction, and team performance: a multilevel mediation model of trust
- the impact of caring climate, job satisfaction, and organizational commitment on job performance of employees in a china's insurance company
- second career teachers: job satisfaction, job stress, and the role of self-efficacy
- satisfacción laboral y apoyo social en trabajadores de un hospital de tercer nivel
- the effects of psychological distress and its interaction with socioeconomic position on risk of developing four chronic diseases
- short screening scales to monitor population prevalence and trends in non-specific psychological distress
- epidemiology of psychological distress. mental illnesses - understanding, prediction and control
- gender differences in psychological distress in spain
- association between psychological distress and mortality: individual participant pooled analysis of 10 prospective cohort studies
- the association between inflammatory markers and general psychological distress symptoms
- social capital and self-rated health: a cross-sectional study of the general social survey data comparing rural and urban adults in ontario
- masculine/instrumental and feminine/expressive traits and health, well-being, and psychological distress in spanish men
- psychological distress and social functioning in elderly spanish people: a gender analysis
- relevance of gender and social support in self-rated health and life satisfaction in elderly spanish people
- apa dictionary of psychology
- the role of job performance on career success and self-esteem of staff
- the lifespan development of self-esteem
- self-esteem across the second half of life: the role of socioeconomic status, physical health, social relationships, and personality factors
- gender similarities and differences
- gender similarities
- from sex roles to gender structure
- breadwinner bonus and caregiver penalty in workplace rewards for men and women
- gender and the relationship between job experiences and psychological distress: a study of dual-earner couples
- a life-span perspective on women's careers, health, and well-being
- the work-family interface
- sex-typed personality traits and gender identity as predictors of young adults' career interest
- attitudes toward women's work and family roles in the united states
- stereotyping: processes and content
- the times they are a-changing . . . or are they not? a comparison of gender stereotypes
- introduction: examining intersections of gender and work
- the persistence of workplace gender segregation in the us
- sex, gender and work segregation in the cultural industries
- a scaled version of the general health questionnaire
- the satisfaction with life scale
- satisfaction with life scale: analysis of factorial invariance across sexes
- evaluación de la satisfacción con el rol laboral en mujeres y hombres [work role satisfaction evaluation in women and men]
- diferencias en autoestima en función del género. análisis y modificación de conducta
- social support scale [database record]
- el lastre de las desigualdades de género en la educación y el trabajo: jóvenes castellano-manchegas atrapadas en la precariedad
- la segregación ocupacional de género y las diferencias en las remuneraciones de los asalariados privados
- diferencias salariales por género y su vinculación con la segregación ocupacional y los desajustes por calificación. dt 20/12
- las brechas de género en el mercado laboral español y su evolución a lo largo del ciclo de vida. revista de ciencias y humanidades de la fundación ramón areces
- las desigualdades de género en el mercado de trabajo: entre la continuidad y la transformación
- men and women at work: sex segregation and statistical discrimination
- when professionals become mothers, warmth doesn't cut the ice
the authors declare no conflict of interest. the funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.
key: cord-261599-ddgoxape authors: nabi, khondoker nazmoon; abboubakar, hamadjam; kumar, pushpendra title: forecasting of covid-19 pandemic: from integer derivatives to fractional derivatives date: 2020-09-21 journal: chaos solitons fractals doi: 10.1016/j.chaos.2020.110283 sha: doc_id: 261599 cord_uid: ddgoxape
in this work, a new compartmental mathematical model of the covid-19 pandemic has been proposed, incorporating imperfect quarantine and disrespectful behavior of citizens towards lockdown policies, which are evident in most developing countries. an integer derivative model has been proposed initially, and then the formula for calculating the basic reproductive number [formula: see text] of the model has been presented. cameroon has been considered as a representative of the developing countries, and the epidemic threshold [formula: see text] has been estimated to be ∼ 3.41 [formula: see text] as of july 9, 2020. using real data compiled by the cameroonian government, model calibration has been performed through an optimization procedure based on the renowned trust-region-reflective (trr) algorithm.
based on our projection results, the probable peak date is estimated to be august 1, 2020, with approximately 1073 [formula: see text] daily confirmed cases. the tally of cumulative infected cases could reach ∼ 20,100 [formula: see text] cases by the end of august 2020. later, global sensitivity analysis has been applied to quantify the most dominant model mechanisms that significantly affect the progression dynamics of covid-19. importantly, the caputo derivative concept has been used to formulate a fractional model to gain a deeper insight into the probable peak dates and sizes in cameroon. after showing the existence and uniqueness of solutions, a numerical scheme has been constructed using the adams-bashforth-moulton method. numerical simulations highlighted the fact that if the fractional order α is close to unity, the solutions converge to the integer model solutions, and that decreasing the fractional-order parameter (0 < α < 1) delays the epidemic peaks. in late december 2019, the world health organization (who) was notified of sudden new cases of an unusual pneumonia, whose specific cause was an enigma for the health authorities of the city of wuhan in china. a few days later, it was established that this pneumonia is caused by a virulent virus, officially named the novel coronavirus (2019-ncov). the modes of transmission are similar to those of the viruses responsible for the previous epidemics of sars and mers [31]. however, this virus is more contagious and has caused a large number of deaths worldwide. to date, there is no clinically established treatment or specific vaccine for covid-19. nevertheless, following the suggestions of several infectious disease specialists, some countries have started using chloroquine combined with azithromycin as alternative drugs [15].
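the trust-region-reflective (trr) least-squares calibration mentioned in the abstract can be illustrated as follows; this is a hedged sketch using scipy's 'trf' solver on synthetic cumulative-case data with an assumed logistic growth curve, not the authors' code or the cameroonian dataset.

```python
# Fit a logistic curve to noisy synthetic cumulative-case counts with the
# trust-region-reflective least-squares solver (scipy's method="trf").
import numpy as np
from scipy.optimize import least_squares

t = np.arange(0, 60, dtype=float)                    # days since outbreak start
true_K, true_r, true_t0 = 20000.0, 0.15, 30.0        # assumed "true" parameters

def logistic(params, t):
    K, r, t0 = params                                # carrying capacity, growth rate, peak day
    return K / (1.0 + np.exp(-r * (t - t0)))

rng = np.random.default_rng(1)
data = logistic((true_K, true_r, true_t0), t) * (1 + 0.05 * rng.standard_normal(t.size))

fit = least_squares(lambda p: logistic(p, t) - data,
                    x0=[1e4, 0.1, 20.0],
                    bounds=([1e2, 1e-3, 0.0], [1e6, 1.0, 60.0]),
                    method="trf")                    # trust-region-reflective
K, r, t0 = fit.x
print(f"estimated K={K:.0f}, r={r:.3f}, peak day t0={t0:.1f}")
```

the same machinery applies to the paper's setting by replacing the logistic curve with the model's simulated cumulative cases and the synthetic data with the reported counts.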
several non-pharmaceutical interventions, such as physical distancing, wearing face masks in public places, home quarantine, isolation and countrywide lockdown policies, have been promoted to curb the spread of the virus [27]. cameroon, a country in sub-saharan africa, had its first confirmed case on march 05, 2020: a 58-year-old french citizen who had arrived at yaoundé airport on february 24, 2020 [22]. with the aim of quelling the spread of the virus, the government undertook several policies such as closing the borders, confinement measures, prohibiting large gatherings of people, closures of all kinds of educational institutions (from kindergarten to university), wearing face masks in public places and so on. in addition, the cameroonian government implemented a contact-tracing strategy together with mass-media sanitisation campaigns. contact tracing consists of searching for and identifying all individuals who have had contact with a confirmed case, and requiring them to be quarantined if they test covid-19 positive [24]. the scarcity of medical resources in most sub-saharan african countries compelled authorities to request asymptomatic cases of covid-19 to self-quarantine at home while taking medication (conventional or not) and following proper health guidelines [16]. the main problem encountered by health workers is the stigmatization of covid-19 patients. indeed, as with the cases of hiv/aids in africa, stigma pushes confirmed cases, or those who have had direct contact with a confirmed case of covid-19, to run away from hospitals, and in this way they contribute to the rapid spread of the virus in the community [23, 37]. different types of mathematical models have played a notable role in predicting the transmission dynamics of infectious diseases, and effective control measures can be designed on their basis to limit community transmission.
several models have already been proposed to predict the evolution of covid-19 and to study the impact of control measures imposed by different governments [1, 26]. although these models have similarities, each of them has specificities related to the evolution of covid-19 and the various control measures deployed by the respective governments. the model considered in [1] describes the interactions among bats and unknown hosts, the human population, and the population of the virus in a reservoir (seafood market); there, the atangana-baleanu derivative concept has been implemented successfully. however, a confined or quarantined class and an isolated class have not been considered in that study. in a recent study, nabi [26] proposed a new susceptible-exposed-symptomatic infectious-asymptomatic infectious-quarantined-hospitalized-recovered-dead (sei_d i_u qhrd) compartmental mathematical model and calibrated the model parameters to project the future dynamics of covid-19 for various covid-19 hotspots. although he considered a quarantine compartment, he did not consider the possibility of quarantined people contracting the virus. although this assumption fits certain developed countries (france, italy, russia and spain), certain realities [27] of the covid-19 pandemic in several developing countries and most african countries have not been taken into account in any model. in the case of cameroon, crucial factors such as violation of containment measures and negligence towards confinement measures must be taken into account to understand the transmission dynamics of covid-19. importantly, numerous daily wage-earners are compelled to go out to fetch food for their families, contravening all confinement orders. in this way, they get infected and spread the virus to their family members as well as their neighbors [27]. it is therefore prudent to envisage the future outbreak dynamics of covid-19 in developing countries with imperfect confinement.
moreover, some people flee quarantine in hospitals to return to their families due to the stigmatization of covid-19 patients. henceforth, considering perfect quarantine would be highly debatable if we want to model and forecast the transmission dynamics of covid-19 in developing countries. fractional calculus or non-integer order calculus, a branch of mathematical analysis, contains the theory of fractional-order derivatives operators. the history of non-integer order calculus is more than 300 years old and modern fractional calculus is a rapidly evolving field both in theory, analysis and applications with a view to handling complicated real-world problems. a plethora of concepts regarding fractional derivatives have already been introduced and applied by researchers in various branches of science and engineering [13, 17, 32] . atangana-baleanu (ab), caputo-fabrizio (cf), and caputo derivatives are the most commonly used derivatives in solving real-world problems. grunwald-letnikov fractional operators have been applied comprehensively in the field of image processing. ab, cf, and caputo fractional derivatives have different kernel properties. caputo derivative is defined with power-law type kernel (non-local but singular), caputo-fabrizio with exponentially decaying type kernel (non-singular) and ab derivative in the caputo sense has mittag-leffler type kernel [36] . a new generalised caputo type fractional derivative has been introduced in [30] . asymptotic stability of generalised caputo fdes has been introduced by baleanu in [5] . integer-order derivative operators have lucid geometric and physical interpretations, which clearly simplify their applications in tackling real-world challenges. recently, various techniques have been proposed by researchers to describe the solution of non-linear fractional differential equations. 
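for reference, the caputo derivative with the power-law kernel mentioned above can be written, for the case 0 < α < 1 used later in the paper, as

```latex
{}^{C}\!D_{t}^{\alpha} f(t) \;=\; \frac{1}{\Gamma(1-\alpha)} \int_{0}^{t} (t-\tau)^{-\alpha}\, f'(\tau)\, \mathrm{d}\tau , \qquad 0 < \alpha < 1 .
```

as α → 1 this operator reduces to the classical first derivative f'(t), which is consistent with the later observation that the fractional solutions converge to the integer-order solutions when α is close to unity.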
a new technique for solving non-linear volterra integro-differential equations in the atangana-baleanu derivative sense has been introduced in [12]. another technique, using genocchi polynomials, has been proposed in [35]. a new technique to solve multi-variable order differential equations in the ab sense has been introduced by ganji et al. [11]. in this paper, a new compartmental covid-19 model is studied rigorously with the help of caputo fractional derivatives. the advantage of applying caputo fractional derivatives to the proposed covid-19 model is that the dynamics of the model can be observed more deeply using the real-time cameroon data; in this framework, real-time data can be compared with the model outputs in a precise manner. fractional derivatives can be a suitable alternative to integer derivatives in studying complex dynamics, since integer derivatives have limitations, as described in [2]. podlubny gave the geometric and physical significance of the riemann-liouville fractional integral in [33]. the caputo fractional derivative has a physical interpretation identical to that of the riemann-liouville fractional derivative [33]. in a physical model, fractional derivatives handle systems with memory, in which the evolving system state depends also on its past states. the aim of this work is to forecast the probable time and size of the epidemic peaks of the novel coronavirus outbreak in cameroon by studying a realistic compartmental model using the robust concept of the caputo fractional derivative. a compartmental mathematical model of covid-19 progression dynamics has been proposed incorporating the efficacy of confinement measures and imperfect quarantine. estimation of parameters has been performed using real-time data, followed by a projection of the evolution of the disease.
global sensitivity analysis is applied to determine the influential mechanisms in the model that drive the transmission dynamics of the disease. afterwards, the integer model is transformed into a fractional model in the caputo sense, and the existence and uniqueness of solutions are presented. using the adams-bashforth-moulton scheme, a numerical scheme for the fractional model is constructed. several numerical simulations are performed to compare the results obtained with the integer-order model against the fractional model outputs. the entire paper is organized as follows. model formulation and basic properties are presented in section 2. section 3 is devoted to model calibration using real data of reported cases of covid-19 in cameroon, global sensitivity analysis of the proposed model, and forecasting of the disease's future dynamics. in section 4, the compartmental model is transformed into a fractional model in the caputo sense, followed by the existence and uniqueness of solutions, the construction of a numerical scheme, and numerical simulations in which the coefficient of fractional order varies. the paper ends with some insightful observations and fundamental findings. a compartmental differential-equation model has been proposed to describe the transmission dynamics of covid-19. the spread starts with the introduction of at least one infected human into a susceptible population. the human population has been categorized into seven compartments according to epidemiological status, i.e. {s(t), c(t), e(t), a(t), q(t), h(t), r(t)}, which represent the numbers of susceptible individuals, confined individuals, infected individuals in the incubation period, asymptomatic infectious individuals (undetected but infectious, including those who fled quarantine), quarantined or confirmed infected individuals, hospitalised cases, and recovered cases.
so, the total human population, denoted by n, at any time can be represented by n(t) = s(t) + c(t) + e(t) + a(t) + q(t) + h(t) + r(t). the formulation of the model is subject to the assumptions listed below.
• migration is not considered. the initial time of our model is taken when the disease is already inside the region or country, and international migration is prohibited [26] .
• vital dynamics (births and natural deaths) are ignored. the idea is to observe what may happen in the short term: a birth takes around 9 months on average, the natural death rate is the inverse of the average life expectancy (i.e. about 1/55 per year in a developing country like cameroon), and no one would like the disease to stay that long.
• confinement is not perfect. in some developing countries, because of insufficient household income, people do not respect the confinement measures imposed by the government and are forced to go out to work to provide for their families.
• immunity is perfect. we assume, without confirmed proof of loss of immunity, that recovered individuals develop a certain immunity against the disease [9, 29] .
• hospitalised patients (on respiratory support) cannot spread the disease, as they remain under close supervision.
the flow diagram of the proposed model is depicted in figure 1 , where susceptible individuals decrease either via confinement at a rate c, or via infection by direct contact with an infectious individual (a or q). the transmission rate from unquarantined asymptomatic carriers (a) (respectively, quarantined symptomatic carriers (q)) is β (respectively, ηβ). the dimensionless modification parameter 0 ≤ η < 1 accounts for the assumed reduction in transmissibility of quarantined symptomatic carriers relative to unquarantined asymptomatic carriers.
since the confinement is not perfect, confined individuals move to the latent class at a rate (1 − ε)λ, where 0 ≤ ε ≤ 1 is the efficacy of confinement. this is in contrast with a variety of seir-style models recently employed in, e.g., [9, 29] , where the authors consider that confined individuals are not exposed to infection. indeed, in a developing country such as cameroon, since the confinement is not perfect, it is plausible that an infected individual can infect a confined person by returning home. this phenomenon was also observed at the beginning of the epidemic in a developed country such as italy, where elderly confined individuals were infected by their grandchildren, who were asymptomatic carriers [16] . after the incubation period of 1/γ, a proportion q of infected individuals move to the asymptomatic class a and the remaining (1 − q) to the quarantine class. it is important to note that the asymptomatic class a includes all infectious individuals who are not in quarantine and those who have fled quarantine. the rates of transition from unquarantined asymptomatic infectious to the quarantined class, to the hospitalised class, and to the recovered class, and from hospitalised to recovered, are r1σ1, r2σ1, (1 − r1 − r2)σ1 and σ3, respectively.
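the compartmental flows described here, together with the quarantined-class rates given next and the parameters of table 1, can be collected into a right-hand-side function. the sketch below is a plausible reconstruction in python: the display equations of system (2) are not reproduced in this text, so the flows are assembled from the verbal description, and all parameter names are the table 1 symbols spelled out.

```python
def covid_rhs(t, x, p):
    """Plausible reconstruction of the right-hand side of system (2).

    x = (S, C, E, A, Q, H, R); p is a dict of the Table 1 parameters
    (beta, eta, c, p, eps, gamma, q, r1..r4, sig1..sig3, dA, dQ, dH).
    The exact equations are an assumption inferred from the text."""
    S, C, E, A, Q, H, R = x
    N = S + C + E + A + Q + H + R                 # living population
    lam = p['beta'] * (A + p['eta'] * Q) / N      # force of infection
    dS = -lam * S - p['c'] * S + p['p'] * C
    dC = p['c'] * S - p['p'] * C - (1 - p['eps']) * lam * C
    dE = lam * S + (1 - p['eps']) * lam * C - p['gamma'] * E
    dA = (p['q'] * p['gamma'] * E + p['r3'] * p['sig2'] * Q
          - (p['sig1'] + p['dA']) * A)
    dQ = ((1 - p['q']) * p['gamma'] * E + p['r1'] * p['sig1'] * A
          - (p['sig2'] + p['dQ']) * Q)
    dH = (p['r2'] * p['sig1'] * A + p['r4'] * p['sig2'] * Q
          - (p['sig3'] + p['dH']) * H)
    dR = ((1 - p['r1'] - p['r2']) * p['sig1'] * A
          + (1 - p['r3'] - p['r4']) * p['sig2'] * Q + p['sig3'] * H)
    return [dS, dC, dE, dA, dQ, dH, dR]
```

a quick sanity check on such a reconstruction is conservation: with the disease-induced death rates set to zero, every flow is internal, so the derivatives must sum to zero.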
r3σ2, r4σ2 and (1 − r3 − r4)σ2 are the transition rates from quarantined infectious to unquarantined asymptomatic infectious, to the hospitalised class, and to the recovered class, respectively. δa, δq and δh are the disease-induced death rates of unquarantined asymptomatic infectious, quarantined infectious and hospitalised patients, respectively. the parameters are described in table 1, reproduced below:
table 1: description of the model parameters.
β : infectious contact rate
η : infectiousness factor for quarantined infected carriers
p : transition rate from confined class to unconfined class
c : transition rate from unconfined class to confined class
ε : confinement efficacy
γ : transition rate from exposed class to infectious class
q : fraction of exposed that become quarantined carriers
σ1 : transition rate from unquarantined class to quarantined and recovered classes
σ2 : transition rate from quarantined class to unquarantined, hospitalised and recovered classes
σ3 : transition rate from hospitalised class to recovered class
r1 : fraction of unquarantined infectious that become quarantined infectious
r2 : fraction of unquarantined infectious that become hospitalised
r3 : fraction of quarantined infectious that become unquarantined infectious
r4 : fraction of quarantined infectious that become hospitalised
δa : disease-induced death rate, unquarantined infectious
δq : disease-induced death rate, quarantined infectious
δh : disease-induced death rate, hospitalised infectious
the above assumptions lead to the nonlinear system of ordinary differential equations (2). we set x = (s, c, e, a, q, h, r) as the vector of state variables and let f : r^7 → r^7 be the right-hand side of system (2), which is a continuously differentiable function on r^7. according to [39, theorem iii.10.vi], for any initial condition in ω, a unique solution of (2) exists, at least locally, and remains in ω for its maximal interval of existence [39, theorem iii.10.xvi]. hence model (2) is biologically well-defined. here, we prove that all state variables of model (2) are non-negative for all time, i.e. solutions of model (2) with positive initial data remain positive for all t > 0. the following result can be obtained. suppose (s, c, e, a, q, h, r) is a solution of (2) with positive initial conditions. let us consider e(t) for t ≥ 0. it follows from the third equation of system (2) that, since e(0) ≥ 0, we have e(t) ≥ 0 for t ≥ 0. we proceed in the same way for a(t), q(t), h(t) and r(t). it remains to prove that s(t) and c(t) are also positive. assume the contrary, and let t̄ be such that s(t̄) = 0 and c(t̄) ≥ 0. from the first equation of (2), it follows that ds(t)/dt |_{t=t̄} = p c(t̄), which means that s(t) < 0 for t ∈ (t̄ − ζ, t̄), with ζ a small positive constant. this leads to a contradiction. thus s(t) ≥ 0 for t ≥ 0. we proceed in the same way to prove that c(t) ≥ 0. solutions of (2) with positive initial conditions are bounded by the total population n0. it follows that for all t ≥ 0 we have s(t) ≤ n0. in what follows, we study model (2) in a set which is positively invariant and attracting for model (2). now, we define the manifold in which any point is a disease-free equilibrium (dfe) of model (2); note that w ⊂ d. in the following, we work with the largest disease-free equilibrium point, denoted by x0 (see [14] ). assuming s + c = n0, and using the notation of [38] , the matrices f and v for the new-infection terms and the remaining transfer terms are obtained; the control reproduction ratio is then defined, following [38] , as the spectral radius of the next-generation matrix f v^{−1}: r_c = ρ(f v^{−1}), where ρ(·) denotes the spectral radius operator. the formula for the control reproduction number has thus been formulated.
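the spectral radius construction ρ(f v^{−1}) is easy to reproduce numerically. the sketch below uses a toy seir-type pair of matrices for illustration (an assumption for the example; these are not the f and v of this model, which are not reproduced in the text):

```python
import numpy as np

def control_reproduction_number(F, V):
    """Spectral radius of the next-generation matrix F V^{-1}
    (van den Driessche-Watmough construction)."""
    K = F @ np.linalg.inv(V)              # next-generation matrix
    return float(max(abs(np.linalg.eigvals(K))))

# toy SEIR example: new infections beta*I enter E; E progresses to I
# at rate gamma; I is removed at rate mu. Then R0 = beta / mu.
F = np.array([[0.0, 0.3],
              [0.0, 0.0]])
V = np.array([[0.2, 0.0],
              [-0.2, 0.1]])
```

for these toy matrices the routine returns beta/mu = 0.3/0.1 = 3.0, matching the closed-form value.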
indeed, the insightful epidemic threshold r_c gives the average number of new secondary covid-19 cases generated by a covid-19 positive individual in a population where a certain fraction of susceptible people are confined. the control of the covid-19 pandemic is governed by the application of control measures which contribute to bringing r_c below unity [38] . hence, we claim the following result, which is a direct consequence of the next-generation operator method [38, theorem 2]. lemma 3. if r_c < 1, the disease-free equilibrium x0 is locally asymptotically stable, and unstable if r_c > 1. proof. note that the last two equations of system (2) are decoupled from the remaining equations. since the total population n0 is constant, we have s + c = n0 − (e + a + q + h + r). so, the local stability of the covid-19 model (2) can be studied through the remaining system of state variables (e, a, q). the jacobian matrix j associated with these variables has eigenvalues given by the roots of its characteristic polynomial; its leading coefficient a2 is always positive, and a0 and a1 are positive provided that r_c < 1. thus, j has all its eigenvalues with negative real parts; hence the disease-free equilibrium x0 is locally asymptotically stable whenever r_c < 1. remark 1. lemma 3 implies that if r_c < 1, then a sufficiently small flow of infected individuals will not generate an outbreak of covid-19, whereas for r_c > 1 the epidemic curve reaches a peak by growing exponentially and then decreases to zero as t → ∞. better control of covid-19 can be established by showing that the dfe x0 is globally asymptotically stable (gas). in this context, we claim the following result. theorem 1. if r_c < 1, then the manifold w of disease-free equilibrium points of model (2) is gas in d. proof. assume that r_c < 1. let x = (s, c, r) and y = (e, a, q).
from (5) and [38, theorem 2], we deduce that s(f − v) < 0 if and only if r_c < 1, where s(m) denotes the stability modulus of the matrix m. therefore, the trajectories of the auxiliary system with right-hand side (6) converge to zero whenever r_c < 1. since all the state variables are non-negative and f − v is a metzler matrix, by the comparison theorem [18] it follows that the infected compartments tend to zero, which shows that w is an attractive manifold. moreover, w is locally asymptotically stable when r_c < 1. we conclude that, in system (2), the manifold w is globally asymptotically stable when r_c < 1. in the absence of confinement measures, i.e. ε = 0, r_c reduces to the basic reproduction number r0, obtained using (5). in the next section, numerical simulations are presented. the parameters considered in our simulations are either estimated from real data or related to the covid-19 outbreak data in cameroon. as in many countries in sub-saharan africa, the actual situation of covid-19 in cameroon and its economic impact are mixed. indeed, between the appearance of the first confirmed case on march 07, 2020 and the introduction of control measures by the government of cameroon (closure of public places, in particular schools, universities, drinking places and other places of entertainment from 6 p.m., and so on), the statistics advanced by the cameroonian health authorities did not reflect, according to certain non-governmental organizations, the actual situation of the covid-19 epidemic in cameroon. since no non-governmental organization is empowered to carry out tests, the only statistics we use here to calibrate our model are those communicated by the cameroonian ministry of public health during its daily press briefings [25] . moreover, the control measures decreed by the government have really impacted certain sectors of economic activity.
indeed, with the closure of drinking places and other places of entertainment, and the prohibition of gatherings of more than 50 persons, many people found themselves technically unemployed and were forced to turn to other activities that allow them to support themselves and their families. the number of covid-19 positive cases (npc) per day, the number of recovered cases (nrc) per day and the number of death cases (ndc) per day in cameroon have been taken from the summary table available online on the covid-19 tracking site [34] , from early march to early july 2020. time = 0 corresponds to march 7, 2020 and time = 124 corresponds to july 08, 2020. as of july 10, 2020, the total death toll was 359, the total number of recovered cases 11,525 and the total number of positive cases 14,916 in cameroon [34] . the calibration of model (2) has been performed using a newly developed optimization procedure based on the trust-region-reflective (trr) algorithm, which can be regarded as an evolution of the levenberg-marquardt algorithm [26] . this robust optimization procedure can be used effectively for solving nonlinear least-squares problems. the algorithm has been implemented using the lsqcurvefit function, available in the optimization toolbox of matlab. the necessary model parameters have been estimated using this optimization technique. daily infected-case data have been collected from a trusted data repository available online [34] . a 7-day moving average of the daily reported cases has been used for the model calibration, due to the moderately volatile nature of the real data (fig. 2) . it has been observed that the number of daily tests in cameroon has been quite inconsistent; with the aim of capturing the real outbreak scenario, the 7-day moving average has been used in this regard. according to recent statistics, the total population of cameroon is 26,548,238 [34] .
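the 7-day smoothing step is straightforward to reproduce. a minimal sketch follows; it uses a trailing window, which is one common convention (the paper does not state whether a trailing or centred window was used, so this is an assumption), and the first entries average only the data available so far:

```python
def moving_average(series, window=7):
    """Trailing moving average of a daily case series.

    For index i, averages series[max(0, i-window+1) : i+1], so the
    first window-1 entries use a shorter (warm-up) window."""
    out = []
    for i in range(len(series)):
        lo = max(0, i - window + 1)
        chunk = series[lo:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out
```

on a constant series the output is unchanged, and once the window is full each value is the exact mean of the last 7 observations.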
for initial conditions, we take s(0) = 26533750, c(0) = e(0) = a(0) = h(0) = r(0) = 0 and q(0) = 1. the solutions have been obtained by numerically solving a nonlinear least-squares problem over the parameter set ψ = {β, σ1, σ2, σ3, r1, r2, r3, r4, δa, δq, δh, η, p, c, ε, γ, q}; the fitted values are displayed in table 2 (table 2: calibration of the model parameters using the trust-region-reflective algorithm, listing for each parameter its probable range, base value, trr output and references). the partial rank correlation coefficient (prcc) method has been carried out to quantify the most dominant mechanisms, which are significantly responsible for the transmission dynamics of the covid-19 disease. prcc is a global sensitivity analysis method used to quantify the relationship between the model response function (outputs) and the model parameters (sampled by the latin hypercube sampling method) in an outbreak setting [28] . prcc values range between −1 and 1. a negative prcc value indicates a negative correlation between the model output and the respective input parameter, whereas a positive prcc value depicts a positive correlation between the response function and the corresponding model parameter. again, a similar result has been found when the quarantined infectious class (q) has been used as the model response function (fig. 7) ; in this case, the prcc indices are estimated to be 0.885 and −0.69, respectively. importantly, the public health implication of this scenario is that the transmission dynamics of covid-19 can be controlled effectively in the community by curtailing the direct infectious contact rate and strengthening the confinement measures. the direct transmission rate can be limited by promoting several non-pharmaceutical measures such as carefully maintaining physical distance, wearing face masks in public places and following health guidelines properly.
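the prcc computation itself is compact: rank-transform the sampled parameters and the response, then correlate the residuals of each ranked parameter and the ranked response after regressing out the other parameters. the sketch below is a minimal generic implementation (not the authors' code; ties in the ranks are ignored for simplicity):

```python
import numpy as np

def rank_transform(v):
    """Replace values by their ranks 1..n (ties ignored)."""
    r = np.empty(len(v), dtype=float)
    r[np.argsort(v)] = np.arange(1, len(v) + 1)
    return r

def prcc(X, y):
    """Partial rank correlation of each column of X with response y.

    X: (n_samples, n_params) LHS sample matrix; y: (n_samples,)."""
    n, k = X.shape
    Xr = np.column_stack([rank_transform(X[:, j]) for j in range(k)])
    yr = rank_transform(np.asarray(y, dtype=float))
    out = []
    for j in range(k):
        Z = np.column_stack([np.ones(n), np.delete(Xr, j, axis=1)])
        # residuals after removing the influence of the other parameters
        rx = Xr[:, j] - Z @ np.linalg.lstsq(Z, Xr[:, j], rcond=None)[0]
        ry = yr - Z @ np.linalg.lstsq(Z, yr, rcond=None)[0]
        out.append(float(np.corrcoef(rx, ry)[0, 1]))
    return out
```

on a synthetic response with one strongly positive and one strongly negative driver, the indices land near +1 and −1 respectively, while an inert parameter stays near 0, mirroring the sign convention described above.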
however, it is often really challenging for the government of a developing country to implement strict confinement policies [27] . thanks to the memory effect, which represents an advantage of the fractional derivative over the ordinary derivative, the theory and application of fractional calculus have been widely used to model dynamic processes in science, engineering and many other fields [3, 4] . before presenting the fractional model with the caputo derivative, we recall the definition of the fractional derivative in the caputo (c) sense and some useful results (see [20] ). definition 1. the caputo derivative of non-integer order of g ∈ c^k is defined as in [20] . lemma 4. if 0 < β < 1 and m is a non-negative integer, then there exist positive constants c_{β,1} and c_{β,2}, dependent only on β, such that (m + 1)^β − m^β ≤ c_{β,1} m^{β−1} and (m + 2)^{β+1} − 2(m + 1)^{β+1} + m^{β+1} ≤ c_{β,2} (m + 1)^{β−1}. throughout, c denotes a positive constant independent of m and h. by replacing the integer-order derivative in the ode covid-19 model (2) with the caputo fractional derivative, we obtain the fractional model (11). in this portion of the study, we provide the existence of a unique solution for the proposed fractional-order model under the caputo fractional operator by applying fixed-point results. in this setting, b = e(i) is the banach space of real-valued continuous functions defined on an interval i with the associated sup norm. for convenience, the proposed system (11) can be rewritten in the equivalent form (12). by applying the caputo fractional integral operator, system (12) reduces to an integral equation of volterra type with the caputo fractional integral of order 0 < α < 1. we now prove that the kernels g1, g2, g3, g4, g5, g6 and g7 fulfil the lipschitz condition and contraction under some assumptions.
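the display equation of definition 1 does not survive in this text. for completeness, the standard caputo definition from the literature (a reconstruction, not copied from this paper) reads:

```latex
{}^{C}\!D^{\alpha}_{t}\, g(t)
  = \frac{1}{\Gamma(k-\alpha)}
    \int_{0}^{t} (t-\tau)^{\,k-\alpha-1}\, g^{(k)}(\tau)\, d\tau ,
\qquad k-1 < \alpha < k,\quad k \in \mathbb{N}.
```

for the case used in model (11), 0 < α < 1, one has k = 1, so the kernel is (t − τ)^{−α} g'(τ) / Γ(1 − α); applying the corresponding caputo fractional integral of order α on both sides is what yields the volterra-type integral equation mentioned above.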
in the following theorem, we prove this for g1; one can proceed for the rest in a similar pattern. theorem 2. assume that the inequality 0 ≤ β(k1 + ηk2) + c < 1 holds. then the kernel g1 satisfies the lipschitz condition as well as a contraction. proof. for s and s1 we proceed as follows. since a(t) and q(t) are bounded functions, i.e. a(t) ≤ k1 and q(t) ≤ k2, by the properties of the norm the inequality can be written with v1 = β(k1 + ηk2) + c, which shows that the lipschitz condition is obtained for g1; if, additionally, 0 ≤ β(k1 + ηk2) + c < 1, we obtain a contraction. the lipschitz condition can be easily verified for the rest of the kernels in the same way. recursively, the expressions in (13) can be written in terms of the differences between successive terms of system (12), with initial conditions s0(t) = s(0), c0(t) = c(0), e0(t) = e(0), a0(t) = a(0), q0(t) = q(0), h0(t) = h(0) and r0(t) = r(0). taking the norm of the first equation of (17) and using the lipschitz condition (14), we obtain a bound; similarly, for the rest of the equations in system (12) we obtain analogous bounds. now, we claim the following result, which guarantees the uniqueness of the solution of model (12). theorem 3. the proposed fractional epidemic model (12) has a unique solution for t ∈ [0, t] if the inequality (1/γ(α)) b^α v_i < 1 holds. proof. as we have shown, the kernel conditions given in (14) hold. so, by considering eqs. (20) and (29) and applying the recursive technique, we obtain the succeeding results: the above-mentioned sequences exist and satisfy the stated bounds, and employing the triangle inequality for any k we obtain the desired estimate, where (1/γ(α)) b^α v_i < 1 by assumption and t_i = (1/γ(α)) v_i b^α n, i = 1, 2, ..., 7. therefore, s_n, c_n, e_n, a_n, q_n, h_n and r_n are cauchy sequences in g(j).
hence, they are uniformly convergent, as described in [20] . applying limit theory to eq. (16) as n → ∞ shows that the limit of these sequences is the unique solution of model (12). ultimately, the existence of a unique solution for (12) has been achieved. since most fractional differential equations (fdes) do not have exact analytic solutions, approximation and numerical techniques must be employed. numerous analytical and numerical methods have been put forward to solve fdes. for numerical solutions of system (11), one can apply the generalized adams-bashforth-moulton method. to describe the algorithm, we consider the nonlinear fractional differential equation d^α y(t) = f(t, y(t)) [19] , which is equivalent to a volterra integral equation. diethelm et al. [7] employed the predictor-corrector scheme based on the adams-bashforth-moulton algorithm [6] , and we use the same technique to find the solution of the projected model (11). for α ∈ [0, 1], 0 ≤ t ≤ t, setting h = t/n and t_n = nh for n = 0, 1, 2, ..., n ∈ z+, the solution of the projected model is obtained from the scheme (28)-(29). theorem 4. the numerical method (28)-(29) is conditionally stable. proof. let s̃_0, s̃_j (j = 0, ..., n + 1) and s̃^p_{n+1} (n = 0, ..., n − 1) be perturbations of s_0, s_j and s^p_{n+1}, respectively. then the corresponding perturbation equations are obtained by using eqs. (20) and (29). using the lipschitz condition, we obtain a bound with ζ_0 = max_{0≤n≤n} {|s̃_0| + (h^α m a_{n,0} / γ(α + 2)) |s̃_0|}. also, from eq. (3.18) in [20] we derive an estimate with γ_0 = max{ζ_0 + (h^α m a_{n+1,n+1} / γ(α + 2)) η_0}, where c_{α,2} is a positive constant depending only on α (lemma 4) and h is assumed to be small enough. applying lemma 2 concludes |s̃_{n+1}| ≤ c γ_0, which completes the proof. we replace the integer-order model with the fractional-order model to study the true cameroonian data.
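the predictor-corrector scheme is easy to state for a scalar caputo fde; the model applies it componentwise to (11). the sketch below implements the standard fractional adams-bashforth-moulton weights (a generic implementation of the published scheme, not the authors' code): the predictor is a fractional rectangle rule and the corrector a fractional trapezoidal rule.

```python
import math

def abm_fractional(f, y0, alpha, h, n_steps):
    """Predictor-corrector (Adams-Bashforth-Moulton) solver for the
    scalar Caputo FDE  D^alpha y(t) = f(t, y(t)),  y(0) = y0,
    with 0 < alpha <= 1, step size h.  Returns (t_grid, y_values)."""
    g1 = math.gamma(alpha + 1)
    g2 = math.gamma(alpha + 2)
    t = [j * h for j in range(n_steps + 1)]
    y = [y0]
    fk = [f(t[0], y0)]                    # history of f(t_j, y_j)
    for n in range(n_steps):
        # predictor weights: fractional rectangle rule
        b = [(n + 1 - j) ** alpha - (n - j) ** alpha for j in range(n + 1)]
        yp = y0 + h ** alpha / g1 * sum(bj * fj for bj, fj in zip(b, fk))
        # corrector weights: fractional trapezoidal rule
        a = [n ** (alpha + 1) - (n - alpha) * (n + 1) ** alpha]
        a += [(n - j + 2) ** (alpha + 1) + (n - j) ** (alpha + 1)
              - 2 * (n - j + 1) ** (alpha + 1) for j in range(1, n + 1)]
        yc = y0 + h ** alpha / g2 * (f(t[n + 1], yp)
                                     + sum(aj * fj for aj, fj in zip(a, fk)))
        y.append(yc)
        fk.append(f(t[n + 1], yc))
    return t, y
```

for alpha = 1 the weights collapse to the classical composite trapezoidal rule with an euler-type predictor, so on y' = -y, y(0) = 1 the scheme reproduces exp(-t) to within O(h^2), which is a convenient consistency check with the observation that the fractional solutions converge to the integer-order ones as alpha approaches 1.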
the advantage of using the caputo fractional derivative to study the current true data is that the epidemic peaks can be predicted more clearly at different fractional-order values α. the parameter values used here are given in table 2 , for which the basic reproduction number r0 equals 3.41. for initial conditions, we take s(0) = 26548238, c(0) = e(0) = a(0) = h(0) = r(0) = 0, q(0) = 1, and a fixed time step size of h = 2^{-8}. the values of the fractional order are α = 1, 0.9, 0.8, 0.7, 0.6, 0.5. first, the figures have been analysed at integer order (denoted by the red line) for the given model, and then the behaviour of the compartmental model at different fractional values has been analysed to give further peak predictions for the real-time data. it has been observed that the onset of the epidemic peak is delayed as the order α decreases. in figures 8 and 9 , one can see that as α → 1 the solutions of our fractional model (11) converge to the solutions of the integer model (2). from the above simulations, it can be conveyed that there is considerable uncertainty about the epidemic peaks for the given real-time data range. the caputo derivative works well to study these biological phenomena. robust forecasting results for the covid-19 outbreak can be achieved by applying our model structure, since imperfect quarantine and imperfect confinement, which are harsh realities in developing countries, have been taken into account meticulously. a compartmental mathematical model has been formulated and studied to predict the evolution of the novel coronavirus disease in cameroon. after presenting the model with integer derivatives, the basic reproduction number r0 of the model has been computed, which measures the number of secondary covid-19 positive cases caused by a single infectious person in a completely susceptible cameroonian population. the dynamics of the disease are impacted by the value of this epidemic threshold.
indeed, if r0 < 1, then a sufficiently small influx of infected individuals will not generate an outbreak of covid-19, and if r0 > 1, then even a sufficiently small influx of infected individuals will generate an outbreak of covid-19. using real data communicated by the cameroonian government, our model has been calibrated using a newly developed optimization procedure based on the well-known trust-region-reflective (trr) algorithm. with the result of this model calibration, the probable peak date is estimated to be august 01, 2020, with approximately 1073 (95% ci: 714-1654) daily new confirmed cases, which corresponds to approximately 20,100 (95% ci: 17,343-24,584) cumulative infected cases. as of july 11, according to the recent statistics communicated by the health authorities of cameroon, nearly 14,916 confirmed cases had already been identified. this difference between the "real data" communicated by the government and the predictions of our model can be explained by the fact that the fear this disease creates within populations pushes wealthy individuals towards private hospitals, which may both give false test results and overcharge for the treatment of this disease [8] . global sensitivity analysis has been performed implementing partial rank correlation coefficient (prcc) analysis to quantify the most crucial parameters, which significantly impact the progression dynamics of covid-19. it has been unearthed in our analysis that the direct infectious contact rate (β) and the efficacy of confinement measures (ε) are the most influential parameters in controlling the outbreak of covid-19. this highlights the public health implication that by reducing physical contact between people through social distancing, wearing face masks and confinement measures, the disease outbreak can be controlled successfully.
eventually, the concept of caputo derivatives has been deployed to formulate the fractional model, and the existence and uniqueness of the solutions have also been presented. using the generalized adams-bashforth-moulton method, a numerical scheme has been constructed for the fractional model. numerical simulations show that for α = 1 the solutions of our fractional model converge to the solutions of the integer model, and that decreasing the fractional-order parameter (0 < α < 1) delays the onset of the epidemic peaks.
references
• on a comprehensive model of the novel coronavirus (covid-19) under mittag-leffler derivative
• fractal-fractional differentiation and integration: connecting fractal calculus and fractional calculus to predict complex system
• fractional discretization: the african's tortoise walk
• approximate solution of tuberculosis disease population dynamics model
• chaos analysis and asymptotic stability of generalized caputo fractional differential equations
• an algorithm for the numerical solution of differential equations of fractional order
• analysis of fractional differential equations
• covid-19: la clinique marie o et le scandale hors de prix
• to mask or not to mask: modeling the potential for face mask use by the general public to curtail the covid-19 pandemic
• impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
• a new approach for solving multi-variable order differential equations with mittag-leffler kernel
• a new approach for solving nonlinear volterra integro-differential equations with mittag-leffler kernel
• a new study of unreported cases of 2019-ncov epidemic outbreaks
• modeling the transmission dynamics of the covid-19 pandemic in south africa
• hydroxychloroquine and azithromycin as a treatment of covid-19: results of an open-label non-randomized clinical trial
• mgr kleda reçoit de nombreux soutiens pour ses recherches contre le coronavirus
• theory and applications of fractional differential equations
• on comparison systems for ordinary differential equations
• on the fractional adams method
• the finite difference methods for fractional ordinary differential equations. numerical functional analysis and optimization
• substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov-2)
• coronavirus: un premier cas confirmé au cameroun
• stigma of people with hiv/aids in sub-saharan africa: a literature review
• covid-19 in cameroon: a crucial equation to resolve. the lancet infectious diseases
• forecasting covid-19 pandemic: a data-driven analysis
• has countrywide lockdown worked as a feasible measure in bending the covid-19 curve in developing countries? medrxiv
• sensitivity analysis of chronic hepatitis c virus infection with immune response and cell proliferation
• mathematical assessment of the impact of non-pharmaceutical interventions on curtailing the 2019 novel coronavirus
• numerical simulation of initial value problems with generalized caputo-type fractional derivatives
• the sars, mers and novel coronavirus (covid-19) epidemics, the newest and biggest global health threats: what lessons have we learned?
• fractional differential equations: an introduction to fractional derivatives, fractional differential equations, to methods of their solution and some of their applications
• geometric and physical interpretation of fractional integration and fractional differentiation
• operational matrix for atangana-baleanu derivative based on genocchi polynomials for solving fdes
• sir epidemic model with mittag-leffler fractional derivative
• the health stigma and discrimination framework: a global, crosscutting framework to inform research, intervention development, and policy on health-related stigmas
• reproduction numbers and subthreshold endemic equilibria for compartmental models of disease transmission
• ordinary differential equations
the authors thank the editor and the anonymous reviewer for their comments, which permitted us to improve the manuscript.
this work does not have any conflict of interest. no funding was received for this study.
key: cord-285774-hvuzxlna authors: danion, j.; donatini, g.; breque, c.; oriot, d.; richer, j. p.; faure, j. p. title: bariatric surgical simulation: evaluation in a pilot study of simlife, a new dynamic simulated body model date: 2020-07-03 journal: obes surg doi: 10.1007/s11695-020-04829-1 sha: doc_id: 285774 cord_uid: hvuzxlna background: the demand for bariatric surgery is high, and so is the need to train future bariatric surgeons. bariatric surgery, as a technically demanding surgery, imposes a learning curve that may initially induce higher morbidity. in order to limit the clinical impact of this learning curve, preclinical simulation training can be offered. the aim of the work was to assess the realism of a new cadaveric model for simulated bariatric surgery (sleeve gastrectomy and roux-en-y gastric bypass). aim: a face-validation study of simlife, a new dynamic cadaveric simulated-body model for acquiring operative skills by simulation. the objectives of this study are, first, to measure the realism of this model, then the satisfaction of learners, and finally the ability of this model to facilitate a learning process. methods: simlife technology is based on a fresh (frozen/thawed) body donated to science, associated with a patented technical module which provides pulsatile vascularization with simulated blood heated to 37 °c, together with ventilation. results: twenty-four residents and chief residents from 3 french university digestive surgery departments were enrolled in this study. based on their evaluation, overall satisfaction with the cadaveric model was rated 8.52, realism 8.91, anatomic correspondence 8.64, and the model's ability to serve as a learning tool 8.78. conclusion: the simlife model offers a very realistic surgical simulation platform with which to train and objectively evaluate the performance of young surgeons.
as obesity has become a worldwide public health concern, bariatric surgery has been also recognized as an appropriate and effective method to treat obesity and its related diseases [1] [2] [3] [4] [5] . the training needs for bariatric surgeons are therefore increasing in order to maintain a high quality of care for obese patients. as reported in the literature [5] , 3 major factors influence bariatric surgery care: hospital infrastructure and volume, surgical team volume, and surgical skills. while it may be difficult to change the first 2 factors that are not dependent on the surgeon, the third can be improved. surgical simulation provides the opportunity for supervised directed learning of trainees, allowing full mastering of technical skill and increasing performances before actual practice on patients [6] [7] [8] [9] . for this purpose, we developed the simlife model, based on fresh human body given to science, dynamized by pulsatile vascularization with simulated blood, warmed to 37°c and ventilation [10, 11] . the objectives of this study were to assess the realism of this model, the satisfaction of learners, and finally the ability of this model to facilitate the learning process. the simlife model consists of a donated human body, which is retrieved by the body donation center of our university, prepared for surgical simulation [10] . bodies arrived within 24 h after death, and a traceability number (anonymity) is established [10] [11] [12] . exclusion criteria included all possible contaminations such as hiv, hbv, hcv, creutzfeldt-jacob, and tuberculosis, through analysis of a blood sample to perform serological tests; at the time of those simulations (2019) we were unaware of the risk of coronavirus infection, but now we systematicaly tested all cadavers about the covid status at their arrival at the body donation center. each body was then prepared for surgical simulation ( fig. 
1): cannulas were placed in both femoral arteries and the left common carotid artery (input), and in both femoral veins and the left internal jugular vein (output). the vascular axes of the superior and inferior limbs may be excluded to target the trunk's vascularization [10] [11] [12] . a tracheotomy or orotracheal tube provided ventilation, and stomach emptying was obtained via a nasogastric tube. the body's arterial tree was washed with water at low pressure (0.8 bar) and at a maximum temperature of 30°c to eliminate whole blood and clots. subsequent body cleaning and disinfection were performed, and the body was frozen at −22°c in a negative-pressure cold room [7, 8] . when a simlife simulation session was scheduled, before use and according to the body's bmi, progressive body defrosting (at 16°c) over several days (3 days minimum) was achieved. finally, a testing procedure was performed before starting on the simlife model, to check its physiological behavior. the specific technical module p4p (pulse for practice, patent number 1000318748 with international extension pct/ep2016/075819 published on 2017/05/11, wo 2017/076717 a1) animated the body, which was perfused by a blood-mimicking fluid (patent l18217) circulating in the arterial system in a pulsating manner, recoloring and warming internal organs to 37°c, and restoring venous turgor. outflow was guaranteed by the venous cannulas. physiological hemodynamic data were computer-monitored continuously and adapted as needed, with heart rate, blood pressure, and respiratory rate, which could be increased or decreased to mimic a hemorrhagic shock, for example. simlife inner organs were re-vascularized, re-colored, and warmed by the specific blood-mimicking liquid. hemodynamic conditions were maintained and could be continuously modified by a computer-controlled device, ensuring physiological conditions identical to those of a real patient.
for example, the pulsatile pump controlled by the computer automatically adjusted blood pressure according to possible iatrogenic accidents causing bleeding. thus, a moderate bleeding induced an increase in flow up to a threshold where hemodynamic instability resulted in a complete loss of blood pressure and systemic circulation interruption [10] [11] [12] . the learning platform on the cadaveric model was covered by previous approval of the french ministry of health ethics committee (protocol number dc-2019-3704). a total of 24 residents and chief residents (table 1) consented to this study, on a total of 4 occasions. the training days were hosted at the medical school. before performing each procedure, all participants were given a theoretical approach, which included lectures, videos, a description of the technique, and an overview of the reperfused cadaver model. this was followed by hands-on training on simlife models. we associated 2 trainees per station, with at least 1 supervising expert. the theme of the first 2 sessions was the sleeve gastrectomy, and of the 2 following sessions the roux-en-y gastric bypass; this sequence allowed trainees to familiarize themselves with the simlife model on a relatively simple procedure and then to move to the more technically demanding gastric bypass. at the end of each practical session, all surgical trainees completed an anonymous evaluation survey indicating their degree of satisfaction (feedback) on a likert scale from 0 to 10 (0 = not at all to 10 = perfectly) on 4 items: 1. ease of learning a specific surgical procedure using the simlife model, 2. accuracy of anatomic landmarks of the simlife model compared with clinical reality, 3. degree of realism of the simlife model, 4. overall satisfaction with the training model used. statistical analysis was performed by means of sas 9.3 software. values are reported as means and standard deviations (sd). results are summarized in table 1 .
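the per-item means and standard deviations that this survey reports can be reproduced with elementary statistics; in the sketch below the individual ratings are invented (only the aggregated scores are published), so the output is purely illustrative.

```python
import statistics

def summarize_item(scores):
    """mean and sample standard deviation for one likert item (0-10 scale)."""
    return (round(statistics.mean(scores), 2),
            round(statistics.stdev(scores), 2))  # sample sd, n - 1 denominator

# invented ratings for one item from 24 hypothetical trainees
overall_satisfaction = [9, 8, 9, 8, 7, 9, 10, 8, 9, 8, 9, 8,
                        9, 7, 9, 8, 10, 8, 9, 8, 9, 8, 9, 8]
mean, sd = summarize_item(overall_satisfaction)
```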
all participants completed and returned the evaluation survey, corresponding to a response rate of 100% from the trainees. participants included 20 residents and 4 chief residents from the french nouvelle aquitaine area, including three university hospitals: bordeaux, limoges, and poitiers. their status and experience in bariatric surgery are summarized in table 2 . the evaluation survey was carried out at the end of each session. data were collected from the 4 training sessions. the 24 participants answered the four survey questions. based on these evaluations, the overall satisfaction of the cadaveric model had a mean score of 8.52 with an sd of 0.83, realism had a mean score of 8.91 with an sd of 0.94, anatomic correspondence had a mean score of 8.64 with an sd of 0.96, and the model's ability to be a learning tool had a mean score of 8.78 with an sd of 0.85 (table 2) . on the evaluation form given to each trainee, the final question was as follows: would you advise a colleague to bariatric surgery requires, as well as other surgical subspecialties, the acquisition of specific skills, which may be learnt through consistent practice. corresponding to the halstedian model of apprenticeship, "learning on the job" creates the notion of a learning curve. the relationship between hospital volume and outcomes is well recognized; at least 100 cases annually per hospital are recommended as the minimal requirement to achieve a low risk for serious complications [13] . moreover, a total experience of 500 cases was deemed necessary to diminish the risk for adverse outcomes and meet safety standards [13] . but an individual annual caseload of 100 cases is not always feasible; and if we focus on revisional bariatric surgery, as cited by bonrath, in germany an individual case volume of 300 procedures is referenced as a quality criterion [5] .
regarding the paradigm shift of training in surgery: in experiential learning, kolb showed that the initial strategy used in the learning process influences adequate skill acquisition [14] . concerning bariatric surgery, the value of the classical surgical cursus, residency and fellowship training, is well documented [5, 9, 15-18]. but the availability of a fellowship in a high-volume department of bariatric surgery is not the rule for all young surgeons. in germany, as reported by bonrath, over 80% of surgeons had no or little exposure to fellowship training [5] , while in north america being "fellowship trained" is the rule to independently perform bariatric surgery. so designing fellowship training induced debate within the bariatric surgery societies without finding a worldwide agreement, because the means available and the modalities of evaluation vary greatly from one country to another and sometimes from one university to another [8, 19-21]. other solutions have been proposed, for example, the sages telementoring, which allows surgeons to reach the plateau of maximum performance more quickly by "correcting" intraoperative gestures, thanks to experts who can follow the procedure remotely. an evaluation is proposed via this device; unfortunately, it is only subjective, since it is left to the expert's free appreciation [22] , and always on a patient. so in the last two decades, the surgical community stated that mentorship should not be the method of instruction that best prepares trainees to enter the modern world of surgery [6, 8, 17-21]. the milestone of the "new concept of training" should consist in exposing apprentices to features of real-life situations, without risks for living patients. surgical trainees may also benefit from activities performed far from operating theaters, such as surgical simulation [23] [24] [25] , coaching [26, 27] , structured training programs [28] , and many others [13] .
in fact, the learning curve must shift from the operating theater to a "preclinical" model in simulation. this "natural" evolution of training also follows the incredible technological progress of surgery, where the practitioner must master not only his surgical technique but also the tools he uses. which model for surgical simulation and evaluation? donald kirkpatrick [29] , in the late 1950s, defined a training evaluation model based on four levels of evaluation. each level is built from the information of the previous levels; in other words, a higher level is a finer and more rigorous assessment of the previous level: level 1: assessment of reactions, level 2: learning assessment, level 3: evaluation of transfer, and level 4: outcome evaluation. level 1, with the assessment of learners' reactions in front of the simulation model, is fundamental. if we compare the simulation training of pilots and surgeons, a crucial element emerges: while computer models can perfectly simulate a long-distance flight with all possible anomalies, the same cannot be said for computerized surgical simulation. the root of surgical simulation should be the realism of the model, to offer the most immersive environment to the learners [30, 31] . a wide number of surgical simulators are available for the benefit of trainees [6, 7, 9, 10, 30, 32-41]. they can be divided into synthetic and organic simulators [7, 9] . within the first group we have plastic-, rubber-, or latex-based simulators as well as virtual reality (vr) and computer-based simulation. those simulators have the advantage of allowing repetition of practice without any risk (no living being is used), but these tools may sometimes present a lack of realism compared with human patients [7] . it is necessary to adapt simulation models to anatomical and/or physiological variations that cannot be perfectly programmed in a computerized scenario [42] [43] [44] .
organic simulators provide a high-fidelity environment and may be divided into animal-based and human-based. the first type is mainly represented by canine, baboon, or porcine models [7] . nevertheless, some ethical restrictions apply, as living animal models are forbidden in the uk and open discussions exist in some other european countries [7, 44] . the second organic model is represented by the human cadaver, the historical model for practical training in surgery or interventional medicine [45, 46] . indeed, fresh or embalmed human cadavers have been used for centuries as a learning tool in clinical anatomy [33, 34] . the major pitfall of the human corpse is that it is a static model, which cannot simulate actual conditions of surgery like bleeding and hemodynamic instability, one of the most critical conditions that a surgeon may face, especially during laparoscopy [35] [36] [37] [38] . to overcome this problem, a few teams introduced models of perfused cadaveric material, mainly in neurosurgery, reporting higher satisfaction of trainees and increased fidelity, similar to a living patient [6, 40-42]. these later reports particularly highlight the increased degree of realism of a perfused cadaveric model, which allows training in a hyper-realistic environment [39] [40] [41] . furthermore, the use of cadavers is also a source of ethical reflection and of emotional and psychological analysis for learners in their surgical behavioral training [47, 48] . training on a cadaveric model (figs. 2 and 3) seems to be the best compromise between learning in the operating room, the animal model, and/or virtual simulators [35] . surgical apprenticeship on simlife is performed safely and achieved a high satisfaction score among trainees, as shown previously.
this last point is truly important, as apprentice appreciation of simulators is the key to successful training: it allows gaining confidence, increasing experience, and mastering surgical techniques, which may later be translated into proficient medical practice [29, 41] . some limitations apply. first, the simlife model is revascularized by a blood-mimicking fluid: coagulation, platelet activation, and thrombin-derived products could not be reproduced as in a real patient, so the environment is closer to an extracorporeal circulation model. second, body availability and, moreover, the overall mean cost per procedure limit the access to this model. this simulation device cannot be reserved for the initial training of junior residents; it has to be implemented at the end of basic skills learning, which may be achieved on simpler models. thus, simlife should ideally be used for training in the last period of residency or during a fellowship program, to ensure skills mastering just before practicing in the clinical theater. to also limit the cost, it is possible to set up simlife training sessions with several specialties: on day 1, orthopedic surgery; on day 2, bariatric and/or endocrine surgery (thyroidectomy with lymph node dissection, for example; in this case it is necessary to adapt the body preparation without neck dissection: cannula placement can be modified as required); and on day 3, cardiac surgery (heart valve surgery). to look further, this model can be implemented in other universities and countries. simlife introduced a realistic bariatric surgery simulation model. it represents a relevant tool that can have a positive impact on the acquisition and mastery of advanced technical skills for young surgeons. the next step in this work will be the evaluation of performance acquisition over several sessions using specific evaluation scales.
conflict of interest: c breque, d oriot, jp richer, and jp faure are patent co-owners of the p4p device permitting revascularization and reventilation; all other authors declare that they have no conflict of interest. the learning platform on the cadaveric model is covered by previous approval of the french ministry of health ethics committee (protocol number dc-2019-3704). informed assent and consent: informed consent was obtained from all individual participants included in the study.
references:
prevalence of obesity among adults and youth: united states
clinical indications, utilization, and funding of bariatric surgery in europe
estimates of bariatric surgery numbers
training in bariatric surgery: a national survey of german bariatric surgeons
simulation in surgery: a review
systematic review of the current status of cadaveric simulation for surgical training
the changing face of surgical education: simulation as the new paradigm
patient safety and simulation-based medical education
simlife a new model of simulation using a pulsated revascularized and reventilated cadaver for surgical education
life: a new surgical simulation device using a human perfused cadaver
simlife: face validation of a new dynamic simulated body model for surgical simulation
the learning curve of one anastomosis gastric bypass and its impact as a preceding procedure to roux-en y gastric bypass: initial experience of one hundred and five consecutive cases
experiential learning: experience as the source of learning and development
high case volumes and surgical fellowships are associated with improved outcomes for bariatric surgery patients: a justification of current credentialing initiatives for practice and training
presence of a fellowship improves perioperative outcomes following hepatopancreatobiliary procedures
bariatric outcomes are significantly improved in hospitals with fellowship council-accredited bariatric fellowships
systematic review with meta-analysis of the impact of surgical fellowship training on patient outcomes
"see one, do one, teach one": inadequacies of current methods to train surgeons in hernia repair
"see one, do one, teach one": education and training in surgery and the correlation between surgical exposures with patients outcomes
michigan bariatric surgery collaborative. effects of resident involvement on complication rates after laparoscopic gastric bypass
sleeve gastrectomy telementoring: a sages multi-institutional quality improvement initiative
randomized clinical trial of virtual reality simulation for laparoscopic skills training
psychomotor performance measured in a virtual environment correlates with technical skills in the operating room
comprehensive surgical coaching enhances surgical skill in the operating room: a randomized controlled trial
a randomized controlled study to evaluate the role of video-based coaching in training laparoscopic skills
complementing operating room teaching with video-based coaching
comprehensive simulation-enhanced training curriculum for an advanced minimally invasive procedure: a randomized controlled trial
evaluating training programs: the four levels, third edition. berrett-koehler publishers
effectiveness of cadaveric simulation in neurosurgical training: a review of the literature
testing of a complete training model for chest tube insertion in traumatic pneumothorax
cadaver-based simulation increases resident confidence, initial exposure to fundamental techniques, and may augment operative autonomy
the role of human cadaveric procedural simulation in urology training
preoperative surgical rehearsal using cadaveric fresh tissue surgical simulation increases resident operative confidence
back to basics: use of fresh cadavers in vascular surgery training
an enhanced fresh cadaveric model for reconstructive microsurgery training
basic laparoscopic skills training using fresh frozen cadaver: a randomized controlled trial
a perfusion-based human cadaveric model for management of carotid artery injury during endoscopic endonasal skull base surgery
"live cadaver" model for internal carotid artery injury simulation in endoscopic endonasal skull base surgery
endoscopic management of cavernous carotid surgical complications: evaluation of a simulated perfusion model
the use of a novel perfusion based human cadaveric model for simulation of dural venous sinus injury and repair
surgical skills training and simulation
the minimal relationship between simulation fidelity and transfer of learning
simulation and surgical training
cadaveric surgery: a novel approach to teaching clinical anatomy
a fresh cadaver laboratory to conceptualize troublesome anatomic relationships in vascular surgery
"detached concern" of medical students in a cadaver dissection course: a phenomenological study
human dissection: an approach to interweaving the traditional and humanistic goals of medical education
key: cord-281122-dtgmn9e0 authors: ribeiro, matheus henrique dal molin; mariani, viviana cocco; coelho, leandro dos
santos title: multi-step ahead meningitis case forecasting based on decomposition and multi-objective optimization methods date: 2020-09-22 journal: j biomed inform doi: 10.1016/j.jbi.2020.103575 sha: doc_id: 281122 cord_uid: dtgmn9e0 epidemiological time series forecasting plays an important role in health public systems, due to its ability to allow managers to develop strategic planning to avoid possible epidemics. in this paper, a hybrid learning framework is developed to forecast multi-step-ahead (one, two and three-month-ahead) meningitis cases in four states of brazil. first, the proposed approach applies an ensemble empirical mode decomposition (eemd) to decompose the data into intrinsic mode functions and residual components. then, each component is used as the input of five different forecasting models, and, from there, forecasted results are obtained. finally, all combinations of models and components are developed, and for each case, the forecasted results are weighted integrated (wi) to formulate a heterogeneous ensemble forecaster for the monthly meningitis cases. in the final stage, a multi-objective optimization (moo) using the non-dominated sorting genetic algorithm – version ii is employed to find a set of candidates’ weights, and then the technique for order of preference by similarity to ideal solution (topsis) is applied to choose the adequate set of weights. next, the most adequate model is the one with the best generalization capacity out-of-sample in terms of performance criteria including mean absolute error (mae), relative root mean squared error (rrmse) and symmetric mean absolute percentage error (smape). by using moo, the intention is to enhance the performance of the forecasting models by improving simultaneously their accuracy and stability measures. 
to assess the model's performance, comparisons based on metrics are conducted with: (i) eemd, heterogeneous ensemble integrated by direct strategy, or simple sum; (ii) eemd, homogeneous ensemble of components wi; (iii) models without signal decomposition. at this stage, the mae, rrmse, smape criteria and the diebold–mariano statistical test are adopted. in all twelve scenarios, the proposed framework was able to perform more accurate and stable forecasts, which showed, in 89.17% of the cases, that the errors of the proposed approach are statistically lower than those of the other approaches. these results showed that combining eemd, a heterogeneous ensemble and wi with weights obtained by optimization can develop precise and stable forecasts. the modelling developed in this paper is promising and can be used by managers to support decision making. these epidemic approaches are adopted for different purposes, such as comparing, implementing, and evaluating prevention and therapy strategies, and the development of public policies for specific diseases [4] . usually, the epidemic models are based on parameters related to susceptible (s), infected (i), and removed (r) individuals; exposed (e) individuals can also be considered, which leads to the sir or seir models. each variation of these models has its particularities, and different factors can be considered in these approaches to provide knowledge about the disease spread. nowadays, these approaches have been proposed to understand the spread of the new coronavirus [5, 6] . also, different mathematical approaches are proposed to mitigate the effects of several diseases such as ebola [7] , influenza [8] , and malaria [9] . in the last years, a computer science field called artificial intelligence (ai), able to recognize patterns in historical data and support decision making, has received attention for solving problems from commerce [10] and industry [11] .
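the seir compartmental structure mentioned above can be written down in a few lines; below is a minimal sketch, assuming a normalized population and purely illustrative parameter values (beta, sigma, gamma are not taken from any paper in this corpus), integrated with a simple forward-euler scheme.

```python
import numpy as np

def seir(beta, sigma, gamma, s0, e0, i0, r0, days, dt=0.1):
    """forward-euler integration of a normalized seir model.
    beta: transmission rate, sigma: 1/incubation period, gamma: removal rate."""
    s, e, i, r = s0, e0, i0, r0
    traj = [(s, e, i, r)]
    for _ in range(int(days / dt)):
        new_exposed = beta * s * i          # susceptible -> exposed
        ds = -new_exposed
        de = new_exposed - sigma * e        # exposed -> infected after incubation
        di = sigma * e - gamma * i          # infected -> removed
        dr = gamma * i
        s, e, i, r = s + ds * dt, e + de * dt, i + di * dt, r + dr * dt
        traj.append((s, e, i, r))
    return np.array(traj)

# illustrative run: 120 days, 0.1% of the population initially exposed
traj = seir(beta=0.9, sigma=1 / 5.2, gamma=1 / 10,
            s0=0.999, e0=0.001, i0=0.0, r0=0.0, days=120)
```

a time-dependent transmission rate, as imposed by lockdown measures, would correspond to replacing the constant beta with a function beta(t) evaluated inside the loop.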
machine learning models, an ai sub-field, have become the kernel of data analysis, dealing with classification [12] , data clustering [13] , and regression tasks [14] . nonetheless, when it comes to diseases that plague the brazilian public health system, such as dengue, malaria, and others, there is limited discussion regarding the effectiveness of machine learning models for developing predictive models. some of these studies aimed to define the incidence of diseases such as ventriculitis and meningitis [15] . when the eemd is applied, the original signal is split into five components (four imfs and one residue). classical statistical models considered in the literature include the autoregressive integrated moving average (arima) and the seasonal autoregressive integrated moving average (sarima). figure 1 associates the diseases and the adopted modeling. adjacent to the above, as well as to what is presented in appendix a, some gaps in relation to the developed approaches can be found and are stated as follows:
• considering the disease type, around 93.75% of the papers focused on malaria, dengue or influenza; hence, there is a lack of discussion concerning the predictive capacity of machine learning-based approaches for diseases such as measles, meningitis, and chikungunya in the forecasting task;
• in the modeling aspect, only four papers focused on ensemble approaches, such as bagging and boosting, or on models combined by average.
one paper used an optimization approach from the swarm intelligence field, called the firefly algorithm (ffa), for hyperparameter tuning, and no paper adopted signal decomposition or moo for the purpose of building ensembles. it is well known that the combination of these strategies can help improve the model's accuracy and, therefore, its out-of-sample generalization. each model is evaluated (test set, out-of-sample forecasting) in the same brazilian state, but with different splitting setups of the datasets; in this way, we tried to accommodate the data variability for each state. figure 2 shows the study areas and the behavior of the number of notified cases by state. when the emd is used, the main drawback is named "mode mixing", that is, oscillations of very different scales can appear in a single imf, or oscillations of a similar scale can appear in different imfs. in the neural network formulation, y_i is the i-th output value (i = 1, ..., n), s is the number of neurons, x_ij is an input value, and θ = [w_1, ..., w_k, b_1, ..., b_k, β_1^1, ..., β_p^1, ..., β_1^s, ..., β_p^s] is the vector of weights and biases. the training seeks: 1. minimization of f(θ), in which f(.) is a function of θ, and y_i and ŷ_i are the observed and estimated values, respectively.
also, σ_e^2 and σ_θ^2 are the variances of the errors and of the weights and biases, respectively. considering that qrf uses quantiles in the prediction process, the α-quantile of the cdf is stated as the probability that the number of notifications is lower than q_α, given that p_t is equal to α. the pls regression approach is a technique to analyze multivariate data, in which the aim is to relate one or two output variables (y) with several inputs (x). for this purpose, given a linear model, the problem that often arises is the matrix of inputs being singular; to deal with this problem, pls decomposes x into orthogonal scores t. the multi-objective problem is subject to inequality constraints, in which θ = [θ_1, ..., θ_n] is the vector of decision variables, l and u are the lower and upper boundaries for each decision variable, and j_k(θ) is the k-th objective to be minimized or maximized. in this respect, during the moo step, an optimization algorithm is applied to find a set of candidate solutions. the nsga-ii parameters adopted in this paper are exposed in section 5, item 4b. lastly, in the mcdm step, it is possible to find a preferable set of decision variables (the weights, in this paper) that allows dealing with the trade-off between the objectives; the corresponding settings are exposed in section 5, item 4c.
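after nsga-ii returns a set of candidate weight vectors, topsis picks one by closeness to an ideal point. the sketch below is a generic textbook topsis, not the authors' implementation; the decision matrix (candidates scored on bias and variance, both to be minimized) and the criterion weights are invented for illustration.

```python
import numpy as np

def topsis(matrix, weights, benefit):
    """rank alternatives (rows) over criteria (columns).
    benefit[j] is True when criterion j should be maximized."""
    m = np.asarray(matrix, dtype=float)
    # vector-normalize each column, then apply the criterion weights
    v = m / np.linalg.norm(m, axis=0) * np.asarray(weights, dtype=float)
    ideal = np.where(benefit, v.max(axis=0), v.min(axis=0))   # best point
    anti = np.where(benefit, v.min(axis=0), v.max(axis=0))    # worst point
    d_best = np.linalg.norm(v - ideal, axis=1)
    d_worst = np.linalg.norm(v - anti, axis=1)
    return d_worst / (d_best + d_worst)  # closeness: higher is better

# three candidate weight sets scored on (bias, variance), both cost criteria
scores = topsis([[0.2, 0.9],
                 [0.5, 0.5],
                 [0.9, 0.2]],
                weights=[0.5, 0.5],
                benefit=np.array([False, False]))
best = int(np.argmax(scores))  # the balanced candidate wins here
```

the two extreme candidates trade one objective entirely for the other, so topsis prefers the balanced one; this is exactly the kind of trade-off handling the mcdm step is used for.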
and forecast the meningitis cases according to the recursive method, as given by eq. (8), in which f is a function related to the model adopted in the training process and ŷ_(t+h,k) is the forecast at horizon h for component k. as stated in [65], the recursive strategy uses forecast values as the model's input to forecast the next predicted values. its main disadvantage is to accumulate the previous forecasting errors in the recursive process. however, the advantage of the recursive method lies in the use of one model for the whole process, i.e., one model is trained to forecast a one-step-ahead horizon and is then used for the multi-step-ahead forecasting task. on the other hand, the direct method uses only past values to predict the future, which is its advantage, as it does not accumulate prediction errors; however, its disadvantage lies in the necessity to train one model per forecasting horizon. machine learning models sometimes have high training times, either due to the use of different training strategies, such as cross-validation (k-fold or loocv-ts), or due to the number of parameters to be tuned. in this context, because several models are evaluated in this paper, in order to find an efficient ensemble learning forecasting model to study meningitis cases, the recursive forecasting strategy is adopted. moreover, the forecasting horizons are defined as one, two, and three months ahead, which are considered short-term. therefore, even though the recursive method could lead to high forecasting errors, it is used in this paper due to its lower computational cost; the direct method, however, could also be considered for this study. in this paper, the control hyperparameters of the models adopted in step 2 are obtained for each of the eemd components; the methodology used to select the models' order is grid-search.
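the recursive strategy above can be sketched with any one-step model; below, a least-squares ar(1) fit stands in for the paper's five learners (an assumption for illustration, not one of the actual models), and each prediction is fed back as the next lagged input.

```python
import numpy as np

def fit_ar1(series):
    """least-squares fit of y_t ~ a * y_{t-1} + b."""
    a, b = np.polyfit(series[:-1], series[1:], 1)
    return a, b

def recursive_forecast(series, horizon, model):
    """recursive multi-step forecasting: one model, predictions fed back."""
    a, b = model
    history = list(series)
    preds = []
    for _ in range(horizon):
        preds.append(a * history[-1] + b)
        history.append(preds[-1])  # the forecast becomes the next input
    return preds

# toy monthly case counts with an upward trend
series = np.array([10., 12., 13., 15., 16., 18., 19., 21.])
preds = recursive_forecast(series, horizon=3, model=fit_ar1(series))
```

the direct strategy would instead fit one model per horizon h, each mapping past values straight to y_{t+h}, avoiding error accumulation at the cost of training one model per horizon.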
table 1 shows a sample of 3 out of the 3125 ensemble learning models, randomly selected, where the order of the models for each component is detailed. (a) in the mop, the cost function, for each combination of models and components, is stated as follows:
ŷ_(t+h) = Σ_(j=1..4) θ_j imf_j(model_j) + θ_5 residue(model_5), (9)
considering the bias-variance framework [57], the objectives are defined from this cost function, where θ̂_j is the estimated weight. 6. computing the performance measures mae, rrmse and smape. for the statistical comparison, the diebold–mariano (dm) test [71] is applied; the lower-tail a priori hypothesis h is given by eq. (16), and the statistic of the dm test is given by eq. (17). in addition, figure 3 presents the modeling process. the forecasts of the imfs and of the residue are integrated to forecast the number of meningitis cases one, two and three months ahead of time. according to zhang et al. [77] , this effectiveness is associated with the diversity used by the heterogeneous ensemble approach, which is an efficient and simple way to improve forecasting accuracy and stability (a lower standard deviation of the errors); this makes the predictive model more robust. concerning the one-month-ahead forecast, for all states, the proposed framework achieves better accuracy than the compared models.
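a minimal numpy sketch of the eq. (9)-style weighted integration and of the three accuracy criteria; note that rrmse definitions vary across papers (here the rmse is normalized by the mean of the observations), and the component forecasts below are invented toy numbers.

```python
import numpy as np

def weighted_integration(components, theta):
    """combine component forecasts: sum_j theta_j * component_j (imfs + residue)."""
    return np.tensordot(theta, components, axes=1)

def mae(y, yhat):
    return np.mean(np.abs(y - yhat))

def rrmse(y, yhat):
    return np.sqrt(np.mean((y - yhat) ** 2)) / np.mean(y)

def smape(y, yhat):
    return 100.0 * np.mean(2.0 * np.abs(y - yhat) / (np.abs(y) + np.abs(yhat)))

# 5 component forecasts (4 imfs + residue) over a 3-month horizon
components = np.array([[1., 2., 3.],
                       [0., 1., 0.],
                       [2., 2., 2.],
                       [1., 0., 1.],
                       [4., 4., 4.]])
theta = np.array([1., 1., 1., 1., 1.])  # unit weights reduce wi to the direct sum
yhat = weighted_integration(components, theta)
y = np.array([8., 9., 10.])             # observed cases (toy values)
```

with unit weights the weighted integration coincides with the direct-sum strategy used as a baseline (comparison i); the moo step searches for non-unit weights that lower both the error (bias) and its variance.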
regarding the mae criterion, the compared models increase the forecasting errors relative to the proposed methodology, with increases ranging between 22.73% and 36.36%, 36.87% and 138.22%, 0.72% and 54.35%, as well as 41.51% and 66.04% for the four states. comparison ii is designed to verify the forecasting performance of the proposed hybrid framework by comparing it with five models which do not consider eemd for signal decomposition, namely brnn, cubist, gbm, qrf, and pls. the comparisons are shown in table 3 and additional discussion is presented. regarding these results, it is observed that the use of signal decomposition, specifically in the case of the eemd approach, can enhance the model's performance; this shows that the eemd approach is suitable for decomposing the original series. the results of the one-month-ahead up to three-month-ahead forecasts for meningitis cases were obtained, together with the comparisons between the proposed approach and the models without decomposition. the eemd-hte-moo presents a better generalization capacity than the eemd-hte concerning the use of the di strategy for the task of grouping the eemd components. therefore, regarding the results presented in subsections 6.1, 6.2, 6.3 and 6.4, the proposed approach showed better accuracy than the compared models.
In parallel, the proposed framework achieves excellent performance in 83.33% and 25% of the cases, and good results in the remaining ones. Considering what is shown by Figures 4a, 5a, 6a and 7a, evidence of a trade-off among the objectives adopted in the MOO can be seen; in other words, depending on the weights used to create the ensemble obtained from the EEMD, the bias increases while the variance decreases. In this respect, the use of MOO is adequate, because it allows obtaining an efficient model that is able to reach small forecast errors and a lower standard deviation of the errors. The same behavior is replicated for the other two forecast horizons. In this round, Figures 4b, 5b, 6b and 7b show that the data behavior is learned by the models in most of the cases, which allows predictions compatible with the observed values; that is, the forecasted meningitis cases are close to the observed values. The good accuracy obtained in the training phase persists in the test stage, indicating that the hybrid framework is robust in delivering the developed predictions. The overfitting phenomenon occurs when the model generalizes well on the training set but not on the test set or in out-of-sample forecasting. To avoid it, two approaches were considered. First, each adopted model was trained using a cross-validation procedure, as described in the methodology section, to prevent overfitting. Second, when bias and variance are adopted as objectives in the multi-objective optimization, the trade-off between these measures is taken into account, which mitigates overfitting.
Also, as illustrated through the predicted and observed values (Figures 4b, 5b, 6b and 7b), since a similar performance is observed in the training and test sets, there is no evidence of overfitting. Future work could consider the assembling of MCDM techniques. Also, it is desirable to compare the recursive and direct methods of performing multi-step-ahead forecasting for the proposed task.
Appendix A. Summary of related works

Compared with the LASSO and LSTM approaches, the proposed framework reaches a better improvement than those approaches for the adopted task.

References:
Viral (aseptic) meningitis: a review
Current meningitis outbreak in Ghana: historical perspectives and the importance of diagnostics
Departamento de Informática do Sistema Único de Saúde (DATASUS)
The mathematics of infectious diseases
COVID-ABS: an agent-based model of COVID-19 epidemic to simulate health and economic effects of social distancing interventions
Modeling and forecasting the COVID-19 pandemic in India
An Ebola model with hyper-susceptibility
A Bayesian system to detect and characterize overlapping outbreaks
Modeling pyrethroids repellency and its role on the bifurcation analysis for a bed net malaria model
An artificial intelligence system for predicting customer default in e-commerce
Opportunities and challenges of artificial intelligence for green manufacturing in the process industry
A novel bagging C4.5 algorithm based on wrapper feature selection for supporting wise clinical decision making
Predicting temporal propagation of seasonal influenza using improved Gaussian process model
Healthcare-associated ventriculitis and meningitis in a neuro-ICU: incidence and risk factors selected by machine learning approach
Mapping the transmission risk of Zika virus using machine learning models
Complementing the power of deep learning with statistical model fusion: probabilistic forecasting of influenza in Dallas County
Multi-objective ensemble model for short-term price forecasting in corn price time series (International Joint Conference on Neural Networks, IJCNN)
Ensemble learning by means of a multi-objective optimization design approach for dealing with imbalanced data sets
Ensemble methods in machine learning
Ensemble approach based on bagging, boosting and stacking for short-term prediction in agribusiness time series
Bayesian interpolation
Combining instance-based and model-based learning (ICML'93)
Instance-based learning algorithms
Greedy function approximation: a gradient boosting machine
Quantile regression forests
Random forests
Using quantile regression forest to estimate uncertainty of digital soil mapping products
The multivariate calibration problem in chemistry solved by the PLS method
Partial least-squares regression: a tutorial
Automatic clustering-based identification of autoregressive fuzzy inference models for time series
Identification of lags in nonlinear autoregressive time series using a flexible fuzzy model
Forecasting third-party mobile payments with implications for customer flow prediction
Multiple steps ahead solar photovoltaic power forecasting based on univariate machine learning models and data re-sampling
Multi-step ahead forecasting of heat load in district heating systems using machine learning algorithms
Forecasting: principles and practice
A novel combined model based on advanced optimization algorithm for short-term wind speed forecasting
Comparing predictive accuracy
R: a language and environment for statistical computing
Coyote optimization algorithm: a new metaheuristic for global optimization problems
Cultural coyote optimization algorithm applied to a heavy duty gas turbine operation (Energy Conversion and Management)
Metaheuristic inspired on owls behavior applied to heat exchangers design
Design of heat exchangers using Falcon optimization algorithm
A support vector machine-firefly algorithm based forecasting model to determine malaria transmission
Developing a dengue forecast model using machine learning: a case study in China
Modeling dengue vector population using remotely sensed data and machine learning
The utility of LASSO-based models for real time forecasts of endemic infectious diseases: a cross country comparison
Employing machine learning techniques for the malaria epidemic prediction in Ethiopia
Real time influenza monitoring using hospital big data in combination with machine learning methods: comparison study
Forecasting influenza epidemics by integrating internet search queries and traditional surveillance data with the support vector machine regression model in Liaoning
A GIS-based artificial neural network model for spatial distribution of tuberculosis across the continental United States
A comparison of three data mining time series models in prediction of monthly brucellosis surveillance data
A novel data-driven model for real-time influenza forecasting
Artificial neural network based prediction of malaria abundances using big data: a knowledge capturing approach
Forecasting dengue epidemics using a hybrid methodology
Forecasting influenza activity using self-adaptive AI model and multi-source data in Chongqing

Declaration of interests: (x) The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. ( ) The authors declare the following financial interests/personal relationships which may be considered as potential competing interests:

key: cord-310863-jxbw8wl2 authors: prasad, j.
title: a data first approach to modelling covid-19 date: 2020-05-26 journal: nan doi: 10.1101/2020.05.22.20110171 sha: doc_id: 310863 cord_uid: jxbw8wl2

The primary data for the COVID-19 pandemic is in the form of time series for the number of confirmed, recovered and dead cases. This data is updated every day and is available for most countries from multiple sources. In this work we present a two-step procedure for fitting models to COVID-19 data. In the first step, time-dependent transmission coefficients are constructed directly from the data and, in the second step, summary measures of those (minimum, maximum, mean, median, etc.) are used to set priors for fitting models to the data. We call this approach a "data driven approach" or "data first approach". This scheme is complementary to the Bayesian approach and can be used with or without it for parameter estimation. We use the procedure to fit a set of SIR and SIRD models, with time-dependent contact rate, to COVID-19 data for a set of the 45 most affected countries. We find that SIR and SIRD models with constant transmission coefficients cannot fit COVID-19 data for most countries (mainly because social distancing, lockdown, etc. make those coefficients time dependent). We find that any time-dependent contact rate which falls gradually with time can help to fit SIR and SIRD models for most of the countries. We also present constraints on the transmission coefficients, the basic reproduction number R0 and the effective reproduction number R(t). The main contributions of our work are as follows: (1) presenting a two-step procedure for fitting models to COVID-19 data; (2) constraining the transmission coefficients as well as R0 and R(t) for a set of the most affected countries; and (3) releasing a Python package, PyCoV19, that can be used to fit a set of compartmental models with time-varying coefficients to COVID-19 data.
At present the world is going through an unprecedented crisis: the pandemic COVID-19, caused by a novel form of coronavirus named SARS-CoV-2, which was passed to humans from bats in the Wuhan city of China some time in December 2019 [org20a, org20b, ea20h, ea20t, ea20v, ea20d, ea20r, ea20l]. By the middle of May 2020 the virus had reached almost all parts of the world, resulting in more than four million people infected and more than a quarter million deaths [wor20]. Efforts to contain the virus medically, by developing a vaccine, are going on a war footing; however, success is still expected to be a few years away [ea20f]. Until a fraction of the population develops (herd) immunity or a vaccine is ready, the only means of containing the pandemic are social measures (social distancing, contact tracing, etc.) and enhanced hygiene practices [ea06, ea20s, ea20p]. Some of the most important problems related to COVID-19 research are: (1) estimating the controlling parameters of the pandemic; (2) making short-term predictions using mathematical-statistical modeling, which can help in mitigation policies; (3) simulating the growth of the epidemic by taking into account as many contributing effects as possible; and (4) quantifying the impact of mitigation measures such as lockdown [ea20j]. Modeling the COVID-19 pandemic with the compartmental models of Kermack and McKendrick (for an introduction see [kr08, li18, bc18]) has been one of the most active problems in recent times [ea20p, ea20a, ea20c, ea20e, fp20, ea20m, oli20]. There have also been alternative approaches, such as [rmi20], where statistical considerations are taken into account for predictions. In one of the studies [fp20] it is argued that the data for the confirmed, the recovered and the dead can all three easily fit a power-law model with similar coefficients.
The main attractive feature of these data-driven approaches is that the complexity of the model being considered is determined by the data and not by theoretical expectations. In the present work we follow a middle approach and fit two compartmental models, named SIR and SIRD, with some modifications, to the COVID-19 data. One of the main reasons for considering these models is that COVID-19 data is available only for the susceptible, infected, recovered and dead compartments (for the notation used here and elsewhere in the present work, see Table 1). It may be true that the large fraction of the population which may be exposed (defined later) plays an important role in the dynamics of the pandemic; however, it is hard to get reliable numbers for it. Apart from that, a large number of undocumented cases [ea20l] may have a significant influence on the spread of the pandemic. A brief summary of the work presented here is as follows. In §2 we give a brief introduction to compartmental models and introduce the notation and variables used in the work. In particular, we discuss the SIR model in §2.1 and the SEIR and SIRD models in §2.2 and §2.3 respectively. One of the major parts of the work presented here is the study of the time dependence of the contact rate β; we introduce a set of parametric models of β(t) in §2.4. We discuss the time series data used in the study in §3, giving the example of Italy, one of the most affected countries. The main results of our work are given in §4 and §5. In §4 we discuss the reconstruction (regression) procedure for the set of transmission coefficients as well as for the effective reproduction number (defined later) R(t). Parameter estimation is discussed in §5. The main conclusions of our work, with a summary and some important points, are discussed in §6. The compartmental model framework of Kermack and McKendrick (see [nc08, kr08, li18] for an introduction) is still the framework most commonly used.
The main idea of Kermack and McKendrick's compartmental models is that every individual in a society belongs to one of m compartments, and the total number of individuals belonging to the different compartments keeps changing with time. The minimum value m can take is two, for the susceptible-infected-susceptible (SIS) model, in which recovery does not guarantee that one will not get the infection again [kr08]. During an epidemic an individual can go through many stages, from being perfectly healthy to recovered after an infection, with or without immunity (short or long term), or may die. If we represent every stage with a compartment and keep track of the number of individuals in each compartment, then we can easily model the dynamics of the epidemic. This approach is very similar to the approach taken in astronomy, where we count the number of stars in different stages of their life to understand stellar evolution. In principle we can have any number of logical compartments, but in practice we should consider only those compartments for which we have count data, in particular for model fitting. Taking into account the fact that we have data only for the number of confirmed, recovered and dead individuals, the only compartmental model that meets the requirement is the SIRD model. If we consider the recovered and the dead together we get the SIR model, as discussed in the next section. One of the important compartments that is also commonly considered is the 'exposed' one, containing individuals who have received the infection but cannot pass it to others before a certain period called the incubation period. If we consider the exposed population as well, we get the SEIR model, which is also discussed below. The three compartmental models SIR, SIRD and SEIR are shown in panels (a), (b) and (c) of Figure 1 respectively (for more detail one can refer to [kr08, het00, ea97, oli20]).
If we identify the compartments with the nodes of a graph, then the transmission between different compartments, represented by a set of coefficients, can be considered the edges of the graph. Some of the nodes may have multiple edges, and some of the edges could be bi-directional. The main challenge of modeling a pandemic like COVID-19 is not the scarcity of mathematical models but that of reliable data for the compartments being considered. If we consider these compartments as nodes of a graph then there are transmission coefficients for every connecting edge that determine how effective that edge is in changing the population of the connected compartments. In (a) and (c), representing the SIR and SEIR models, the compartments are connected in a linear way; however, in case (b), representing the SIRD model, there is also a branching. Since the total population must remain constant, the rates of change along all the connecting edges must add to zero. For the SIR model the evolution equations are:

dS/dt = −β S I / N,
dI/dt = β S I / N − γ I,   (1)
dR/dt = γ I.

Here β and γ are the transmission coefficients, also called the contact rate and the recovery removal rate respectively, and 1/β and 1/γ represent the mean duration of infectiousness and the average period of infectivity (see [het00, ea03a]), respectively. In general there is some time lag between acquiring an infection and becoming infectious; however, in the SIR model this is ignored and an assumption is made that individuals become infectious immediately upon getting the infection. This is a very strong assumption.

All rights reserved. No reuse allowed without permission. (This preprint was not certified by peer review.) The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. This version was posted May 26, 2020 (https://doi.org/10.1101/2020.05.22.20110171).
The main reason for making this assumption is that we do not know reliably how many people are actually 'exposed', i.e., have the virus but are still not infectious (cannot pass it to others). One way to address this problem could be contact tracing, assuming that anyone who has come into contact with an infected person is exposed; however, this assumption is as strong as the one made in the SIR model. If we do not consider births, deaths and the movement of people, then the following condition must be satisfied:

S(t) + I(t) + R(t) = N.

Here S, I and R are the populations of the S, I and R compartments respectively. In equation (1) the transmission coefficient β is one of the most important parameters of the epidemic dynamics, and it can be written as the product of the contact rate (the average number of contacts per person per unit time) and the transmission probability (the probability of disease transmission on contact between a susceptible and an infectious person). As has been mentioned, the transmission coefficient γ can be identified with the recovery rate, which is nothing but the inverse of the infectious period (during which an infected person can pass the virus to other healthy people). In general, equations (1) are solved with the following initial conditions:

S(0) = N − I_0, I(0) = I_0, R(0) = 0.

The second equation of (1) can be written as:

dI/dt = γ I ( (β/γ) S/N − 1 ),

and for S/N > γ/β we get a positive infection rate. Here we define one of the most important parameters of an epidemic in terms of the ratio β/γ: it is called the basic reproduction number R0 when considered a constant, and the effective reproduction number R(t) when considered a function of time. The most common definition [kr08] is that it is the average number of secondary cases arising from an average primary case in an entirely susceptible population. Note that in the text we may also use just "reproduction number", whose meaning will depend on the context.
Some studies, such as [ea20a], call both R0 and R(t) the basic reproduction number; however, we follow the convention used in [ea03b, ea03a, cob20]. The basic reproduction number R0 is the main measure quantifying the transmissibility of the virus, and R0 > 1 sets off a chain of transmissions leading to exponential growth of the pandemic. We can keep R0 < 1 by minimizing the contact rate (social distancing, etc.), lowering the infectiousness of infected people (by treating them, putting them in quarantine, etc.) and reducing the susceptibility of healthy people by vaccination, etc. (for detail see [ea05]). The SIR model is one of the most basic models and can easily be generalized in one or more of the following ways: 1. Adding more compartments: depending on the type of pandemic and the other details we are interested in, we can add more compartments to the SIR model. These compartments can fit between the existing ones (for example, as shown in Figure 1(c) for the SEIR case) or can branch out from the existing ones (as shown in Figure 1(b) for the SIRD case).
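As a concrete illustration of the basic SIR dynamics of equation (1), the system can be integrated numerically. This is a sketch, not the paper's PyCoV19 implementation, and the parameter values below are arbitrary illustrative choices:

```python
import numpy as np
from scipy.integrate import odeint

def sir_rhs(y, t, beta, gamma, N):
    """Right-hand side of the SIR equations (1)."""
    S, I, R = y
    dS = -beta * S * I / N
    dI = beta * S * I / N - gamma * I
    dR = gamma * I
    return [dS, dI, dR]

N = 1e6                      # total population (arbitrary)
beta, gamma = 0.25, 0.1      # contact and recovery rates (arbitrary)
y0 = [N - 1.0, 1.0, 0.0]     # one initial infected person
t = np.linspace(0, 300, 301)
S, I, R = odeint(sir_rhs, y0, t, args=(beta, gamma, N)).T
# Here R0 = beta/gamma = 2.5 > 1, so I(t) first rises, peaks, and then
# decays, while S + I + R stays equal to N throughout the integration.
```

Setting gamma above beta (so that R0 < 1) makes the infected curve decay monotonically, which is the behavior the mitigation measures described in the text aim for.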
If we relax the assumption that people who get the infection become infectious instantly, and consider a latent period before the onset of infectiousness, there is a fraction of the population (a compartment) which has been exposed to the virus but will become infectious only after some latent period 1/σ; the model is then called the susceptible-exposed-infected-recovered (SEIR) model, represented by the following set of equations [kr08]:

dS/dt = −β S I / N,
dE/dt = β S I / N − σ E,   (5)
dI/dt = σ E − γ I,
dR/dt = γ I.

Note that if we combine the second and third equations above, we get:

d(E + I)/dt = β S I / N − γ I.

From this equation we can see that the population in the E and I compartments together can grow with time only when the fraction of the susceptible population is greater than the inverse of the reproduction number: S/N > γ/β = 1/R0. There are many forms of the SEIR equations in common use (see [bc18, ea20p, ea20u, ea20b, p.20, ea20n]); however, equation (5) is the simplest one and does not include natural deaths. One common practice with the SEIR model has been to consider the incubation period 1/σ a constant and to estimate it from other observations. The SEIR model is quite complex compared to the SIR model, and we cannot find the number of exposed people exactly at time t = 0 for evolving the equations, so the approach used to define R no longer works. Thanks to the next-generation matrix methods [bc18], it is still possible to write R in closed form for this case also. One serious drawback of the SIR model is that people who recover and people who die are treated in the same way: there are no separate compartments for the dead and the recovered. This drawback can be addressed by separating the compartments for the dead and recovered populations, as is done in the SIRD model, described by the following set of equations (for a detailed discussion see [ea20a, vil20]):

dS/dt = −β S I / N,
dI/dt = β S I / N − γ I − δ I,   (7)
dR/dt = γ I,
dD/dt = δ I.
Here a new transmission coefficient δ has been introduced, which we can identify with the death rate. One of the advantages of the SIRD model is that it has three transmission coefficients β, γ and δ, and we have data for the three time series I(t), R(t) and D(t), so it is possible to compute the time dependence of all three coefficients as well as the reproduction number R. The aim of any mitigation measure may be one or more of the following: 1. lower the contact or infection rate β; 2. lower the mortality rate δ; 3. increase the recovery rate γ. The SIRD model provides us a framework to estimate or fit all these parameters. In one of the coming sections we will discuss how we can reconstruct the transmission coefficients β, δ and γ as well as R from the data by a direct reconstruction approach. The basic reproduction number for the SIRD model can be written in the following way [ea20a]:

R0 = β / (γ + δ),

or,

1/R0 = 1/R_γ + 1/R_δ,

where R_γ = β/γ and R_δ = β/δ. If, apart from death and recovery, there is some other channel that can lower the population in the I compartment — for example, if infected people move out of the region with transmission coefficient η — then we can write:

1/R0 = 1/R_γ + 1/R_δ + 1/R_η,

with R_η = β/η. A more realistic model will have multiple compartments (nodes), either connected in series or branching out from others, with data to constrain the transmission coefficients (edges). Apart from this, realistic models may also require considering different transmission coefficients for different subgroups (based on age, etc.). Incorporating all these considerations will lead to very complex models having very little connection with the actual data we have.
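The harmonic composition of the SIRD reproduction number above can be sketched in a few lines; the rate values here are arbitrary illustrative numbers, not fitted ones:

```python
def sird_r0(beta, gamma, delta):
    """Basic reproduction number of the SIRD model: R0 = beta / (gamma + delta)."""
    return beta / (gamma + delta)

beta, gamma, delta = 0.3, 0.1, 0.02   # arbitrary illustrative rates
r_gamma = beta / gamma                # recovery-channel reproduction number
r_delta = beta / delta                # death-channel reproduction number
r0 = sird_r0(beta, gamma, delta)
# harmonic composition: 1/R0 = 1/R_gamma + 1/R_delta
assert abs(1.0 / r0 - (1.0 / r_gamma + 1.0 / r_delta)) < 1e-12
```

The same pattern extends to any extra removal channel, such as the η (migration) channel mentioned above: each channel adds one more reciprocal term to 1/R0.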
2.4 Time-dependent β models

As a pandemic triggers various containment measures [ea20s, dg20, ea20k, ts20, rr20, ea20c] such as lockdown, social distancing and improved hygiene practices, the transmission coefficients, such as β, become time dependent [gg20, ea20m, fp20, ea20e, ea20i]. Apart from this, the drop in the susceptible population also decreases β (see [ea03a, cob20]). Lockdown has been one of the most common mitigation measures followed all over the world, and in its extreme form we can assume that once it starts, the contact rate between susceptible and infectious people drops to zero. In general, the lockdown starts on a fixed day t_l and has a duration (time scale) we call τ (we will use both τ and the corresponding decay rate μ = 1/τ in the discussion). We can incorporate these two parameters into the modeling of β(t) in many different ways; a set of three common choices is given below.

1. Linear suppression: this model is discussed here just as an example, and we do not expect the variation of β(t) to be as slow as linear. The expression shows that β(t) starts with an initial value β_0, and after time t_l it starts decreasing linearly at a constant rate μ = 1/τ, finally becoming β_0(1 − μ) at t = ∞.

2. Tanh suppression: this form of suppression of β(t) starts with a constant value β_0 at some t = t_l, keeps decaying for a period represented by τ, and finally settles to the final value β_0(1 − α), as shown in Figure 3.
This can also be written in the following way (equation (14)), and from equation (13) we can also write equation (15). Equations (14) and (15) are important for finding the priors for α and μ once we know the priors for β; this will be discussed again in §4 and will be used in parameter estimation in §5.

3. Exponential suppression [fp20, ea20o, ea20n]: this model is similar to the tanh model; in this case also β(t) starts from some initial value β_0, decreases for a period, and finally approaches a constant value β_0 α at t = ∞, as shown in Figure 4. Note that the transmission coefficient β may decay with time even without any intervention, as is discussed in [ea97] for plants. In this case too we can write the corresponding relations, and these will also be used to find the priors for parameter estimation. In one of the studies [ea20o] it has been argued that even the time of recovery 1/γ may vary with time, due to improvements in the medical understanding of the epidemic and in facilities, and that it too can be modeled as an exponential function. There have been other physically well-motivated exponentially decaying forms as well, such as the one given in [vil20], in which β starts from a value β_0 and decays at rate 1/τ, finally becoming β_1. The author argues that β_1 depends on the policy decisions leading to behavioral changes. This model differs from the model we are considering only in that it considers the "lockdown" from the beginning, i.e., t = 0.
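The three suppression shapes described above can be sketched as follows. Since the paper's exact parametrizations (its equations (11)-(13)) did not survive extraction, the formulas below are plausible assumed forms chosen only to reproduce the stated limiting behavior: β_0 before t_l, and limits β_0(1 − μ), β_0(1 − α) and β_0·α respectively.

```python
import numpy as np

def beta_linear(t, beta0, t_l, mu):
    # assumed form: linear drop after t_l at rate mu, saturating at beta0*(1 - mu)
    frac = np.clip((t - t_l) * mu, 0.0, 1.0)
    return beta0 * (1.0 - mu * frac)

def beta_tanh(t, beta0, t_l, tau, alpha):
    # assumed form: smooth tanh step from beta0 down to beta0*(1 - alpha)
    return beta0 * (1.0 - 0.5 * alpha * (1.0 + np.tanh((t - t_l) / tau)))

def beta_exp(t, beta0, t_l, tau, alpha):
    # assumed form: exponential decay after t_l towards beta0*alpha
    decayed = alpha + (1.0 - alpha) * np.exp(-np.maximum(t - t_l, 0.0) / tau)
    return beta0 * np.where(t < t_l, 1.0, decayed)
```

Note that μ = 0 (or α = 0) recovers the constant-β model in every case, matching the remark in the text that the suppression parameter alone controls the "flattening".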
The time-dependent β models discussed above and shown in Figure 5 share a common property: before a certain time t_l, which we can identify with the day on which the lockdown starts, β has a constant value β_0, and after that it starts decreasing at a rate that depends on the parameter μ = 1/τ. The effect of the suppression in β is controlled by the parameter μ, and for μ = 0 all the models become constant-β models. From Figure 5 we can conclude that different models can lead to the same amount of "flattening" of the curve with different choices of parameters, so there is no preferred model for the suppression. The SIR model with constant transmission coefficients is applicable only in the situation when the pandemic is left to grow without any intervention. In the real world, once a pandemic starts, interventions of different kinds (social, medical, etc.) are undertaken to reduce the rate at which the epidemic spreads. These interventions can easily be taken into account by considering a time-dependent (decaying) growth rate β; as we can see from Figure 5, an exponentially decaying β helps to contain the disease by lowering the height of the peak.

The primary data for COVID-19 is in terms of three time series for the counts of confirmed C(t), recovered R(t) and dead D(t) persons for every country. By definition, all three time series are non-decreasing. There are many factors, known and unknown, which determine the behavior of these time series.
the time series i(t) for the population in compartment i can be obtained by subtracting r and d from c. for a set of 45 countries the time series of i(t) are shown in figure ( ). i(t) depends on β, γ, and δ; therefore it is a good measure which we can fit to a compartmental model, such as sir or sird, to obtain constraints. in this and the next section we present the main results of the study, in the form of a reconstruction procedure for the time-dependent transmission coefficients β(t), γ(t), δ(t) and the effective reproduction number r(t). we consider italy and india as examples for this procedure. note that this approach is common and can be used to understand the variation of the transmission coefficients with time as a result of interventions. the main advantage of this approach is that there are no parameters to adjust, so the results are easy to reproduce. the approach we use here is similar to that used in [ea20e, gg20]. in this approach the evolution equations are written in a discretized form, as shown in equation (21).
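the subtraction described above can be sketched as a small helper; the consistency check for negative counts is our addition, not part of the paper:

```python
import numpy as np

def active_infections(confirmed, recovered, dead):
    """i(t) = c(t) - r(t) - d(t): the population currently in compartment i,
    from the three primary cumulative time series."""
    c, r, d = (np.asarray(x, dtype=float) for x in (confirmed, recovered, dead))
    i = c - r - d
    if np.any(i < 0):
        # cumulative series should satisfy c >= r + d at all times;
        # a negative value signals a reporting inconsistency
        raise ValueError("negative active count: inconsistent input series")
    return i
```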
from the third equation we can write: and using this and the second equation from (21) we get: note that by definition r t+1 ≥ r t, so γ(t) ≥ 0; however, we may have i t+1 ≤ i t, and β(t) may become negative once the population in compartment i starts decreasing. an important assumption being made here is that the fraction of susceptible population s/n is close to unity, which may be true at the beginning of the epidemic. once we have expressions for the time-dependent β and γ we can also write an expression for the time-dependent reproduction number in the following way: following a similar procedure we can write the sird equations in the following discretized form: from these equations we can write: and the expression for the reproduction number: where ∆x t = x t+1 − x t with x = i, r and d. this equation is identical to equation (24) if we do not count the dead and the recovered separately, i.e., if we replace ∆r t + ∆d t with ∆r t. one of the interpretations of r is that it is a ratio of two rates, so in case we are interested in finding two separate measures for γ and δ, we can also write: and so: the procedure discussed above can be used to follow the variation of the transmission coefficients β, γ, δ and the effective reproduction number r(t) with time. in order to follow this procedure we need to discard the first few data points, which have very high noise. as explained above, occasionally we may also get negative values of r(t). in figure (9) we show the reconstruction of β(t), γ(t) and r(t) for italy with the sir model, and in figure (10) the same is shown for β(t), γ(t) and δ(t) in the case of the sird model. we show r(t) as well as r γ (t) and r δ (t) for italy in the case of the sird model in figure (11).
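the discretized reconstruction described above can be sketched as follows; this is a standard discretization consistent with the description (γ_t = ∆r_t/i_t, δ_t = ∆d_t/i_t, β_t = (∆i_t + ∆r_t + ∆d_t)/i_t under s/n ≈ 1, r = β/(γ + δ)), and the function name is ours:

```python
import numpy as np

def reconstruct_sird(i, r, d):
    """reconstruct time-dependent sird coefficients from the i, r, d series,
    assuming s/n ~ 1 (valid early in the epidemic), as in the text."""
    i, r, d = (np.asarray(x, dtype=float) for x in (i, r, d))
    di, dr, dd = np.diff(i), np.diff(r), np.diff(d)
    it = i[:-1]
    gamma = dr / it              # recovery rate, >= 0 since r is non-decreasing
    delta = dd / it              # death rate, >= 0 since d is non-decreasing
    beta = (di + dr + dd) / it   # transmission rate; can go negative once i declines
    return beta, gamma, delta, beta / (gamma + delta)
```

on synthetic data generated with the same discretization the input rates are recovered exactly.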
similar figures are shown for india in figures (12), (13) and (14). in all the figures the vertical red dashed lines mark the date of lockdown. one of the main attractive features of splitting r into contributions from δ and γ is that the result is sensitive to the individual values of γ and δ, and not just to their sum γ + δ; for the case γ = δ we recover the usual reproduction number. one of the important uses of the reconstruction procedure we have discussed here is to find the priors (minimum, maximum and best-fit values) for the parameters to be fitted. once we have the estimates for β(t), γ(t) and δ(t) from the above procedure we can easily find the x min , x max , x 0 values (with x = β, γ, δ). here, x 0 is the approximate starting point for the parameter that is needed in the many optimization procedures which find the solution iteratively. since in the present work we use a parametric form of β(t), we need priors for the parameters of β(t), i.e., β 1 , α, µ and τ, which can be found from the reconstructed β(t) (see §2.4 for detail). we consider a set of six compartmental models, three belonging to the sir class and three to the sird class. the models differ from each other in the choice of the epidemiological class (sir or sird) or the model for the contact rate β(t) (see §2.4 for detail). a summary of the models is given in table (3).
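extracting (x_min, x_max, x_0) priors from a reconstructed coefficient series, as described above, can be sketched as follows; the number of discarded initial high-noise points and the use of the median for x_0 are assumed choices, not the paper's:

```python
import numpy as np

def priors_from_series(x, skip=5):
    """return (x_min, x_max, x_0) for a reconstructed coefficient series,
    discarding the first `skip` noisy points; x_0, the starting point for
    an iterative optimizer, is taken here as the median."""
    x = np.asarray(x, dtype=float)[skip:]
    x = x[np.isfinite(x)]
    return float(x.min()), float(x.max()), float(np.median(x))
```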
note that in models (5) and (6), β(t) starts decaying from the very beginning (in place of starting from a particular day representing the date of the lockdown) with a constant rate µ. in any fitting procedure the choice of the loss function depends on what we wish to fit. in common least-squares fitting we use the sum of the squares of the offsets as the loss function. however, with the data we have there is a problem with that choice: the time series we wish to fit have small values at the beginning and very large values at the later stage, so the fitting is biased towards the points which have large values. one solution could be to fit the log of the time series, but then the fitting becomes biased towards the small values at the beginning (or at a later stage, when the values become small again). we decided to use the loss function of ordinary least squares, which fits the data points close to the peak (having higher values) more accurately than the other data points. we found this useful for the following two reasons: 1. the peak in the time series is an important feature, in particular its location and height; therefore a loss function biased towards it is justified. 2. for short-term predictions only the data points close to the dates of prediction are important, so using a loss function that fits the later points (having higher values) more accurately than the noisy data points at the beginning is favorable. the loss function which we used for fitting the data to the sir and sird models is given below.
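the trade-off described above can be made concrete with the two candidate losses; both are sketches with our own names, and the log variant's regularizer eps is an assumption:

```python
import numpy as np

def rmsd_loss(model, data):
    """plain least-squares (rmsd): dominated by the large values near the peak."""
    m, d = np.asarray(model, dtype=float), np.asarray(data, dtype=float)
    return float(np.sqrt(np.mean((m - d) ** 2)))

def log_rmsd_loss(model, data, eps=1.0):
    """least squares on log-counts: relatively more weight on the small
    early values (eps avoids log(0))."""
    m, d = np.asarray(model, dtype=float), np.asarray(data, dtype=float)
    return float(np.sqrt(np.mean((np.log(m + eps) - np.log(d + eps)) ** 2)))
```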
the variables used in the above equations are defined in table (4). for the sir and sird models we fit multiple time series together, so we must weight the sums of the squares of the offsets for the different time series, since they have very different values: the value of i(t) is generally a few orders of magnitude higher than r(t) and d(t). we use the following weights for this purpose: where x̄ l is the average of the time series x(t). we use the solve_ivp and minimize modules from scipy [sci20] for integrating the differential equations and minimizing the cost function, respectively. the loss function given by equation (30) represents the root mean square deviation (rmsd), and we use its final value as a measure of the goodness of fit; it is shown by the points of different colors (for the different models) in figure (15). from this figure we can see that it is hard to conclude which of the models fits the data best (always has the lowest rmsd). we can also notice from the figure that model (6) is the least sensitive to the choice of country (it has the smallest fluctuations). a list of the fitting parameters for the different models is given in table (3). for the sir class of models, models (1) and (2), we have five fitting parameters, named γ, β 0 , α, µ, t l , and for the sird models, models (3) and (4), we have six fitting parameters, named γ, β 0 , α, µ, t l and δ. notice that for models (1) to (4) four of the parameters are associated with β(t), while for models (5) and (6) the variation of β(t) is controlled by just two parameters: β 0 , the initial value of β, and its decay rate µ = 1/τ.
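a minimal sketch of the weighted fit described above, using scipy's solve_ivp and minimize as the text does; the simplified sird right-hand side assumes s/n ≈ 1, the 1/mean weighting follows the description, and all names and the optimizer choice are ours:

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

def weighted_rmsd(params, t, i_obs, r_obs, d_obs):
    """rmsd over the i, r, d series, each weighted by 1/mean so that the
    much larger i(t) does not dominate the fit."""
    beta, gamma, delta = params

    def rhs(_t, y):
        i, r, d = y
        return [(beta - gamma - delta) * i, gamma * i, delta * i]

    sol = solve_ivp(rhs, (t[0], t[-1]),
                    [i_obs[0], r_obs[0], d_obs[0]], t_eval=t)
    loss = 0.0
    for obs, mod in zip((i_obs, r_obs, d_obs), sol.y):
        w = 1.0 / max(np.mean(obs), 1e-9)
        loss += np.mean((w * (mod - obs)) ** 2)
    return float(np.sqrt(loss))

# a fit would then look like (priors/bounds omitted):
# res = minimize(weighted_rmsd, x0=[0.3, 0.05, 0.01],
#                args=(t, i_obs, r_obs, d_obs), method="Nelder-Mead")
```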
the best-fit values of the fitting parameters, with their 90% ci (standard deviation) as well as the median values, are given in table (5). the tables also give the estimate for the effective reproduction number r(t), which is a derived quantity here. note that for computing r 0 we have extrapolated the value of r(t) to the last date for which the data are used here. a histogram of the effective reproduction number for the different models being considered is shown in figure (16), and detailed values of it for the different countries, which include the average values as well as the 90% ci (stddev), are given in table (6).
covid-19 is a global crisis, and understanding its impact on the different systems of modern human life (medical, social, economical, etc.) and the responses presented is an important exercise to carry out. we understand that, despite being a global phenomenon, the impact of covid-19 in terms of the loss of life and the resources being exhausted depends on local conditions as well as on the mitigation measures taken locally. however, we believe that the global picture of the crisis does help to plan and take policy decisions at the local scale also. a full understanding of any pandemic, in particular one like covid-19 which has no other example in history (in terms of scale and impact), may become available only when it is over, and the facts and figures presented here may have a very short life. however, we still believe that any quick, timely insight may help a lot in terms of planning for the worst. knowing very well that all mathematical models are wrong but some are useful, we believe that the mathematical models presented in this work may help to develop some insight about the crisis. a brief summary of the work presented here is as follows. in §1 we have given a very brief introduction to the problem being addressed and reviewed some of the key works about covid-19 which motivated the present work. §2 gives a brief introduction to the mathematical framework used in the work: in particular we have reviewed a set of compartmental models (sir, seir and sird), and in §2.4 we have discussed a set of parametric models for one of the transmission coefficients, β(t). we have discussed the data being used in the work in §3. the main results of the present work are discussed in §4 and §5. in §4 we have reviewed a reconstruction procedure for the transmission coefficients and the effective reproduction number r(t).
this procedure does not depend on the choice of any parameter and can be easily generalized to other similar models also. we have presented the best-fit values of the parameters with their 90% ci in §5, in the form of a set of tables. we have presented the values of the parameters in the following two forms: 1. model based, 2. country based. all the fitting parameters for the models being considered are summarized in table (??). the country-based parameters are given in terms of a set of two tables (6), where every row represents a country. from the country-based table we can see that the estimates are quite consistent: they do not change much from one model to another. in place of presenting all the parameters for the countries we present only the basic reproduction number corresponding to the different models. table (2) in the appendix gives some basic information about the countries being considered for the modeling. in order to see how the different models fit the data we have given the plots for all six models for a set of countries in the figures from (??). the work we presented here assumes that the spreading of a pandemic like covid-19 happens homogeneously in space and time; however, we know that this is far from true. as the experience [ea03b] shows, "super-spread" events (sses), rare events where one particular infectious person interacts with a very large number of susceptible people over a short period of time, have the maximum impact. in these situations average measures like r 0 are not very informative.
in the present work we used data for a set of countries to constrain the parameters of the sir and sird models; a similar exercise with the sird model for india is done in [ea20q].

references:
- an introduction to compartmental modeling for the budding infectious disease modeler
- modeling infectious disease dynamics
- early forecasts of the evolution of the covid-19 outbreaks and quantitative assessment of the effectiveness of countering measures
- analysis and fitting of an sir model with host response to infection load for a plant disease
- transmission dynamics and control of severe acute respiratory syndrome
- transmission dynamics of the etiological agent of sars in hong kong: impact of public health interventions
- strategies for containing an emerging influenza pandemic in southeast asia
- strategies for mitigating an influenza pandemic
- data-based analysis, modelling and forecasting of the covid-19 outbreak
- accounting for symptomatic and asymptomatic in a seir-type model of covid-19
- assessing the efficiency of different control strategies for the coronavirus (covid-19) epidemic. arxiv e-prints
- epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in wuhan, china: a descriptive study
- a time-dependent sir model for covid-19 with undetectable infected persons
- a strategic approach to covid-19 vaccine r&d
- an interactive web-based dashboard to track covid-19 in real time
- the species severe acute respiratory syndrome-related coronavirus: classifying 2019-ncov and naming it sars-cov-2
- monitoring the spread of covid-19 by estimating reproduction numbers over time
- inferring change points in the spread of covid-19 reveals the effectiveness of interventions
- early dynamics of transmission and control of covid-19: a mathematical modelling study
- substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov-2)
- preliminary analysis of covid-19 spread in italy with an adaptive seird model
- a modified seir model to predict the covid-19 outbreak in spain and italy: simulating control scenarios and multi-scale epidemics
- a data driven analysis and forecast of an seiard epidemic model for covid-19 in mexico
- the effect of control strategies to reduce social mixing on outcomes of the covid-19 epidemic in wuhan, china: a modelling study
- studying the progress of covid-19 outbreak in india using sird model. medrxiv
- covid-19: epidemiology, evolution, and cross-disciplinary perspectives
- the global impact of covid-19 and strategies for mitigation and suppression. imperial college covid-19 response team
- a new coronavirus associated with human respiratory disease in china
- modified seir and ai prediction of the epidemics trend of covid-19 in china under public health interventions
- a pneumonia outbreak associated with a new coronavirus of probable bat origin
- analysis and forecast of covid-19 spreading in china, italy and france
- mapping 2019-ncov novel coronavirus (covid-19) cases
- extracting the effective contact rate of covid-19 pandemic
- the mathematics of infectious diseases
- data on covid-19 (coronavirus) confirmed cases, deaths, and tests
- space-time dependence of corona virus (covid-19) outbreak. arxiv e-prints
- modeling infectious diseases in humans and animals
- an introduction to mathematical modeling of infectious diseases
- mathematical models of infectious disease transmission
- refined compartmental models, asymptomatic carriers and covid-19
- novel coronavirus (2019-ncov) situation report - 1
- report of the who-china joint mission on coronavirus disease
- a time-dependent seir model to analyse the evolution of the sars-covid-2 epidemic outbreak in portugal
- a python package for fitting covid-19 data
- data-driven modeling reveals a universal dynamic underlying the covid-19 pandemic under social distancing
- age-structured impact of social distancing on the covid-19 epidemic in india
- scientific computing tools for python
- sk shahid nadim. assessment of 21 days lockdown effect in some states and overall india: a predictive mathematical study on covid-19 outbreak
- estimating and simulating a sird model of covid-19 for many countries, states, and cities
- covid-19 coronavirus pandemic

the author would like to thank dr. gaurav goswami for comments and feedback. at present the author works as an independent researcher and data scientist, and the work presented here is not supported by any public or private agency.
the author will be thankful to any agency, individual or individuals who come forward to sponsor/support this and other similar works on covid-19. key: cord-301505-np4nr7gg title: two types of transmembrane homomeric interactions in the integrin receptor family are evolutionarily conserved date: 2006-01-27 journal: proteins doi: 10.1002/prot.20882 sha: doc_id: 301505 cord_uid: np4nr7gg integrins are heterodimers, but recent in vitro and in vivo experiments suggest that they are also able to associate through their transmembrane domains to form homomeric interactions. two fundamental questions are the biological relevance of these aggregates and their form of interaction in the membrane domain. although in vitro experiments have shown the involvement of a gxxxg-like motif, several crosslinking in vivo data are consistent with an almost opposite form of interaction between the transmembrane α-helices. in the present work, we have explored these two questions using molecular dynamics simulations for all available integrin types. we have tested the hypothesis that homomeric interactions are evolutionarily conserved, and essential for the cell, using conservative substitutions to filter out nonnative interactions. our results show that two models, one involving a gxxxg-like motif (model i) and one with an almost opposite form of interaction (model ii), are conserved across all α and β integrin types, both in homodimers and homotrimers, with different specificities. no conserved interaction was found for homotetramers. our results are completely independent of experimental data, both during the molecular dynamics simulations and in the selection of the correct models. we rationalize previous, seemingly conflicting findings regarding the nature of integrin interhelical homomeric interactions. proteins 2006. © 2006 wiley-liss, inc.
integrins are heterodimeric type i transmembrane proteins formed by noncovalent association of an α and a β subunit. each subunit contains a large extracellular domain, a single transmembrane (tm) spanning α-helix and a short cytoplasmic tail. 1 different types of α integrins can combine with different β counterparts, forming a variety of heterodimers. in humans, 18 α-chains can interact with eight different β-chains to form 24 different α/β heterodimers with varied functions. 2 by spanning the membrane, the integrins serve as a dynamic linkage between the cytoplasm and the extracellular space, transducing signals across the membrane to mediate cell growth, differentiation, gene expression, motility, and apoptosis. 3 inside-out signal transduction involves separation of the integrin cytoplasmic tails and subsequent ectodomain conformational changes, which alter the affinity of integrins for extracellular ligands. 4-6 a number of experimental results suggest the existence of transmembrane α/β interactions, [7] [8] [9] for example, electron cryomicroscopy and single particle analysis, 10 cysteine scanning mutagenesis in the transmembrane domain, 11 and activation by disruption of transmembrane interactions of integrin αiibβ3. 12 overall, for the α/β interaction, there seems to be a general consensus on the type of interaction present in the inactive, low-affinity form. 11,13,14 but in addition to growing evidence indicating that the α and β domains interact, there is also a strong tendency in vitro for α and β tm chains to form homooligomers, both in zwitterionic and acidic micelles 15 and in biological membranes. 16,17 the latter authors examined the interaction of the tm domains of the α2, αiib, α4, β1, β3, and β7 integrins when expressed as chimeric proteins, showing that most tm domains homooligomerize to some extent. also, a study of both the cytoplasmic and transmembrane parts of the αiib/β3 integrin demonstrated only homooligomerization, but not formation of heterooligomers.
18 the way in which tm integrin homomeric interactions take place may be similar to that of glycophorin a (gpa), an interaction that involves a gxxxg-like motif, which was observed by arkin and brunger 19 to be prevalent in tm sequences. indeed, this motif can be found in multiple sequence alignments of predicted tm spanning regions of integrins (see fig. 1, residues highlighted in gray). the importance of this and other related motifs in α-helical tm domains has since been demonstrated in exhaustive statistical analyses. 20 further, selection from a random library of tm sequences for homodimerization clearly showed that the gxxxg motif is sufficient for strong helix-helix interactions. 21 using the toxcat assay, 22 a test that measures the oligomerization of a chimeric protein containing a tm helix in the escherichia coli inner membrane via transcriptional activation of the gene for chloramphenicol acetyltransferase, a sequence critical for integrin αiib-tm homodimerization that involves the gxxxg motif was suggested by li et al. 17 intriguingly, however, homomeric interactions that are not consistent with the involvement of a gxxxg-like motif have been observed between α chains by crosslinking of the inactive αiib/β3 dimer, 11 and also between β chains, 23 induced by a g708n mutation in the β3 tm. the functional relevance of these interactions has been discussed by these authors, and a role for integrin transmembrane homomeric interactions in integrin clustering when binding to multimeric ligands is possible. the weakness of integrin tm interactions, however, makes the effects of in vitro mutagenesis difficult to observe. also, some forms may be transient or not abundant, and they may be difficult to detect experimentally.
one of the ways, although arguably indirect, to inquire into the existence and biological relevance of a given protein-protein interaction is to test its stability using evolutionary conservation data, using the idea that none of the mutations that appeared during evolution, and that are present in homologous sequences, disrupts a native interaction. the general idea of this strategy is illustrated in figure 2(a), which shows an example of the results obtained in the present work for some of the α integrin sequences. using this method, we previously obtained 24 the correct transmembrane homodimeric structure of glycophorin a at less than 1 å rmsd from the structure determined by nmr. 25 we later extended this method to the prediction of the structures of various transmembrane α-helical bundles, for example, phospholamban or influenza a m2. 26 crucially, a picture emerged from that work that for some of these oligomers, for example, tetrameric m2, the correct structure was not found unless the helix tilt was either restrained to the experimentally obtained value or sampled at small intervals, so that all conformational space was fully explored. indeed, only under these conditions could the correct structure of m2 be found. 26 in the absence of an experimentally determined helix tilt, the only alternative is to sample the helix tilt at small intervals, and we have recently followed this helix-tilt-sampling approach to find evolutionarily conserved models of the transmembrane homooligomers of the coronavirus envelope protein e. 27 in the present work we have studied, without the help of any experimental restraint, the transmembrane homooligomeric interactions of integrins using an exhaustive global search, 28 sampling the helix tilt at 5° intervals and using the integrin transmembrane sequences of 27 α subtypes and 14 β subtypes. we have explored the plausibility of dimeric, trimeric, and tetrameric homooligomers using a very stringent clustering protocol.
the two black columns indicate the position of the small-residue-xxx-small-residue motif, or gxxxg-like motif, for each sequence. the search for α (or β) homooligomers was started with a small subgroup of sequences indicated with an asterisk (*) (see materials and methods), referred to for convenience as "αiib-like" or "β3-like" in the text. for each sequence, similar low-energy structures (typically, backbone rmsd lower than 1 å) are grouped in clusters and averaged. these averaged low-energy structures are indicated with symbols here, and are plotted on a plane described by the helix tilt and the rotational orientation of an arbitrarily chosen residue (here residue 33). after considering the results from all sequences tested, a "complete set" is a group of these structures that contains representatives from all sequences. hence, the backbone structure obtained by averaging converging structures for different homologs is not destabilized by conservative mutations. in this figure, the locations of two "complete sets," corresponding to α homodimeric models i and ii (see results), are indicated by gray squares. we stress that this plot is only a visual guide, and although the models (symbols) look close in the rotational-orientation/helix-tilt plane, the ultimate test of similarity is rmsd (see materials and methods). (b) schematic representation of the rotational orientation and the helix tilt, β. the results obtained here are self-consistent and their interpretation is unambiguous, that is, experimental data are not necessary to select the correct models. we have found many models of interaction for homodimeric and homotrimeric oligomers that have been conserved through evolution and which can be used as a reference for future mutagenesis studies. the simulations were performed using a compaq alpha cluster sc45, which contains 44 nodes. all calculations were carried out using the parallel version of the crystallography and nmr system (cns version 0.3), the parallel crystallography and nmr system (pcns).
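the "complete set" test described in the caption, that every homologous sequence contributes a nearby low-energy model, can be sketched as a pre-filter in the (tilt, rotation) plane; the tolerances below are assumed values, and real candidate sets would still need to pass the rmsd < ~1 å check:

```python
def find_complete_sets(models_by_seq, tol_tilt=5.0, tol_rot=15.0):
    """models_by_seq maps each sequence name to a list of (tilt, rotation)
    pairs, one per averaged low-energy cluster. a complete set is one model
    per sequence, all close to a seed model of the first sequence."""
    seqs = list(models_by_seq)
    complete = []
    for seed in models_by_seq[seqs[0]]:
        members = [seed]
        for seq in seqs[1:]:
            close = [m for m in models_by_seq[seq]
                     if abs(m[0] - seed[0]) <= tol_tilt
                     and abs((m[1] - seed[1] + 180.0) % 360.0 - 180.0) <= tol_rot]
            if not close:
                members = None
                break
            # keep the model of this sequence closest in tilt to the seed
            members.append(min(close, key=lambda m: abs(m[0] - seed[0])))
        if members is not None:
            complete.append(members)
    return complete
```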
29 the global search was carried out in vacuo with united atoms, explicitly describing only polar and aromatic hydrogen atoms, as described elsewhere, 28 using chi 1.1 (cns helical interactions). as the models tested are homooligomers, the interaction between the helices was assumed to be symmetrical. trials were carried out starting from either left or right crossing-angle configurations. the initial helix tilt, β, was restrained to 0° and the helices were rotated about their long helical axes in 10° increments until the rotation angle reached 350°. henceforth, the simulation was repeated, increasing the helix tilt in discrete steps of 5°, up to 45°. we must note that the restraint on the helix tilt is not completely strict, that is, at the end of the simulation a drift of up to ±5° from the initial restrained value could be observed in some cases. three trials were carried out for each starting configuration using different initial random velocities. clusters were identified with a minimum number of eight similar structures: any structure belonging to a certain cluster was within 1.5 å rmsd (root mean square deviation) of any other structure within the same cluster. finally, the structures belonging to each cluster were averaged and subjected to energy minimization. these final averaged structures, described by a certain tilt and rotational orientation at a specified arbitrary residue, were taken as the representatives of the respective clusters [symbols in fig. 2]. the tilt angle of the models, β, was taken as the average of the angles between each helix axis in the bundle and the bundle axis. the bundle axis, coincident with the normal to the bilayer, was calculated by chi. the helix axis was calculated as a vector with starting and end points above and below a defined residue, where the points correspond to the geometric mean of the coordinates of the five α-carbons n-terminal and the five α-carbons c-terminal to the defined residue.
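the clustering rule stated above (every member within 1.5 å rmsd of every other member, minimum eight structures) can be sketched as a greedy pass over a precomputed pairwise rmsd matrix; the greedy seeding order is our simplification of the protocol:

```python
def cluster_structures(rmsd, cutoff=1.5, min_size=8):
    """rmsd is an n x n symmetric matrix of pairwise backbone rmsds.
    returns clusters (lists of indices) in which every member is within
    `cutoff` of every other member; clusters smaller than `min_size`
    are discarded, as in the protocol above."""
    n = len(rmsd)
    unassigned = set(range(n))
    clusters = []
    while unassigned:
        seed = min(unassigned)
        members = [seed]
        for j in sorted(unassigned - {seed}):
            if all(rmsd[j][m] <= cutoff for m in members):
                members.append(j)
        unassigned -= set(members)
        if len(members) >= min_size:
            clusters.append(members)
    return clusters
```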
the rotational orientation angle of a residue is defined by the angle between a vector perpendicular to the helix axis, oriented towards the middle of the peptidic c=o bond of the residue, and a plane that contains both the helical axis and the normal to the bilayer. in this work, to compare the models, a residue was chosen arbitrarily, and the angle is always given for residue 33 (see the common numbering in fig. 1, lower row), both for the α and β sequences. intersequence comparisons between low-energy clusters were performed by calculating the rmsd between their α-carbon backbones. fitting was performed using the program profit (http://www.bioinf.org.uk/software/profit). the energies calculated correspond to the total energy of the system, including both bonded (bond, angle, dihedral, and improper) and nonbonded (van der waals and electrostatic) terms. 28 the interaction energy for the residues was calculated with the function chi_interaction implemented in chi. homologous sequences were obtained using the ncbi homologene search (http://www.ncbi.nlm.nih.gov/). the definitions of the 27 α sequences and the abbreviations (inside parentheses) used for them are given in figure 1. the assignment of the transmembrane domain for these sequences was based on hydrophilicity/surface probability plots and the transmembrane predictions from the tmhmm server. 30 according to these predictors, the transmembrane region of these sequences spans 24 residues for the α chain and 23 for the β chain. the alignment of these sequences in the tm domain is shown in figure 1. because of the tremendous computational work needed, to optimize the search we first limited it to a certain subgroup of integrin types, indicated in figure 1 by a star. if one or more conserved models were found for this subgroup, the other sequences were then tested for the existence of these models.
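the geometric definitions above (a helix axis built from the centroids of the five flanking cα atoms on each side of a chosen residue, and a tilt β measured against the membrane normal) can be sketched as follows; function names are ours:

```python
import numpy as np

def helix_axis(ca, idx):
    """helix axis at residue idx: vector from the centroid of the five
    c-alpha atoms n-terminal to idx to the centroid of the five
    c-terminal ones, following the definition above."""
    ca = np.asarray(ca, dtype=float)
    below = ca[idx - 5:idx].mean(axis=0)
    above = ca[idx + 1:idx + 6].mean(axis=0)
    return above - below

def tilt_angle(ca, idx, bundle_axis=(0.0, 0.0, 1.0)):
    """tilt beta in degrees: angle between the helix axis and the bundle
    axis, taken here as the membrane normal z."""
    v = helix_axis(ca, idx)
    n = np.asarray(bundle_axis, dtype=float)
    cosb = np.dot(v, n) / (np.linalg.norm(v) * np.linalg.norm(n))
    return float(np.degrees(np.arccos(np.clip(cosb, -1.0, 1.0))))
```

for an ideal straight helix laid along z the tilt is 0°; moving the helix away from the normal increases β accordingly.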
the rationale for starting with these sequences is the abundance of experimental studies performed on the transmembrane domain of the major platelet integrin αiib/β3. the initial selection of α sequences therefore included αiib, the natural partner of β3; αv, which also associates with β3 (fig. 3); and α6, which has the highest sequence similarity to αiib when using blast (http://www.ncbi.nlm.nih.gov/blast/blast.cgi). the initial selection of β sequences included β3 and other β sequences able to bind αv, 31 that is, β5, β6, and β8 (fig. 3). when a homodimeric model was assumed for the "αiib-like" subgroup (see legend in fig. 1), two "complete sets," or conserved models, were found when restraining the helix tilt to 15°, and only when the configuration was right handed. in one of these models (model i), with helix tilt β = 19° and rotational orientation at residue 33 of 28°, the residues located at the "g" positions in the gxxxg-like motif are involved in the interaction [fig. 4(a), residues 972 and 976]. the other conserved model, with almost opposite orientation (model ii), is shown in figure 4(b), with β = 6° and rotational orientation at residue 33 of 16°. the opposite orientation of models i [fig. 4(a)] and ii [fig. 4(b)] is shown clearly when comparing the orientation of w967 in these two models. when other helix tilts or left-handed configurations were tested, no other complete sets were found across this subgroup of sequences. the rmsd between any pair of structures belonging to these two complete sets, either model i or ii, was never higher than 0.8 å. when our search was extended to the remaining α sequences, model i was conserved in all α sequences tested. model ii was also conserved for the remaining sequences, but there were complex overlaps between different subtypes and, in contrast to model i, a common structure with rmsd less than 1 å for all integrin types could not be found. indeed, a virtually identical (rmsd less than 0.8 å) model ii was shared by all integrins, except for αm, α3, α2, and αx.
we point out that this does not mean that a model similar to model ii presented here is not present in the aforementioned sequences. for example, a group formed by αm and αx shared a "complete set" in which the helices were rotated approximately 50° (not shown) from the model ii above. related to this, we note that in previous reports we have used homologous sequences, found in different species, of a given protein. here, in contrast, we have used different integrin types, which perform very different functions, in the same species (humans). this is clearly a more stringent condition for finding conserved interactions, as the variability in the sequences is potentially greater. the lack of a sufficient number of suitable homologous sequences of the same integrin type for different species precluded the determination of a structure similar to model ii in sequences α3 and α2, although it is possible that a similar interaction, that is, one that does not involve the gxxxg motif, is also present there. when a homotrimeric model was assumed for the "αiib-like" subgroup, a complete set was found for a right-handed configuration (β = 19° and rotational orientation = 6°), and only when the helix tilt was restrained to 15°. the structure representing this model (a model i type, i.e., one in which the gxxxg-like motif is involved in the interaction) is shown in figure 4(c). no other complete sets were found for other tilts or left-handed configurations. the rmsd between any pair of structures belonging to this complete set was never higher than 1 å. when the search was extended to other sequences, this model was still conserved. model ii of interaction was not found to be conserved in α homotrimers. simulations were performed initially for the "β3-like" subgroup formed by the star-labeled sequences (fig. 1, lower panel). only two models or "complete sets" were found to be conserved, both right handed. a model equivalent to model i [fig.
4(d), see residues 699 and 703, corresponding to the sxxxa motif participating in the interaction] was found when the helix tilt was restrained to 5° in a right-handed configuration, with β = 5° and rotational orientation = 27°. another model of almost opposite orientation (equivalent to model ii) was found at β = 18° and rotational orientation = −21° [fig. 4(e)]. the opposite orientation of models i [fig. 4(d)] and ii [fig. 4(e)] is shown clearly when comparing the orientation of m701 and g708 in these two models. the rmsd between any pair of structures belonging to these complete sets was never higher than 1 å. no complete sets were found for other tilts or left-handed conformations. when the search was extended to the rest of the sequences, model i was conserved for all β sequences. a more complex overlap was found around model ii, but with rmsd less than 1.5 å a common structure was conserved for all sequences, except for β4 and β7. when a homotrimeric model was assumed for the "β3-like" subgroup, two conserved models were found, both right handed. a model equivalent to model i was found when the helix tilt was restrained to 15° in a right-handed configuration (β = 15° and rotational orientation = 97°) [see fig. 4(f)]. another model, of opposite orientation, equivalent to model ii, was found at β = 16° and rotational orientation = −43° [fig. 4(g)]. no complete sets were found for other tilts or left-handed configurations. the rmsd between any pair of structures belonging to this complete set was never higher than 1 å. as for the homodimers, when the search was extended to other β sequences, the first model (model i) was conserved in all instances. model ii also showed the same behavior as for the homodimer, and a common model was found for all sequences (if rmsd < 1.5 å), except for β4 and β7. in general, therefore, in contrast to the very stable backbone of model i, homomeric model ii seems to have drifted slightly during evolution for both α and β chain homomeric interactions.
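the "complete set" criterion used throughout these comparisons, in which every pair of member structures must lie within the stated rmsd cutoff, can be sketched as follows (illustrative python with a minimal helper on pre-superposed coordinates; the actual analysis used profit-fitted α-carbon rmsd, and the function names are hypothetical):

```python
import math

def _pairwise_rmsd(a, b):
    # minimal rmsd on equal-length, already-superposed coordinate sets
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))

def is_complete_set(structures, rmsd_fn=_pairwise_rmsd, cutoff=1.0):
    # a conserved model ("complete set") requires every pair of member
    # structures to lie within the rmsd cutoff (0.8-1.5 angstrom in the text)
    n = len(structures)
    return all(rmsd_fn(structures[i], structures[j]) <= cutoff
               for i in range(n) for j in range(i + 1, n))
```

the check is deliberately pairwise rather than centroid-based, matching the statement that the rmsd between *any* pair of structures in a set never exceeded the cutoff.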
incidentally, we have also observed that these results also hold when using other sequences of β integrins, from coral and sponges, 32 that are evolutionarily distant from the ones presented here. despite the lack of close homology, these sequences still present a gxxxg-like motif, and models equivalent to models i and ii for β integrins were also found (not shown). as the sequences diverge, however, the structure representing a particular mode of interaction starts to diverge from a tight complete set, and partial overlaps can be found. these results are difficult to interpret in terms of sequence similarity or function. as mentioned before, a sharper, although more complex, picture would have emerged if we had been able to use different homologs of a single integrin subtype, which is equivalent to testing different homologous sequences, from different species, of the same protein. however, the fact that we have been able to obtain these two models of interaction using integrins that perform totally different functions confirms the robustness of our findings and suggests that these two models are general forms of interaction across the whole integrin family. the different interactions observed in our computational work are summarized schematically in figure 5. our computational results have been obtained independently of any previous experimental data, and clearly show that two right-handed types of homomeric interaction in the transmembrane domains of α and β integrins (models i and ii) are evolutionarily conserved. we also predict that these models are present in homodimers as well as in homotrimers, with the exception of α homotrimers, where model ii is not conserved. our results for the α homodimer are in contrast with a recent study 33 of α homodimeric interactions using a computational method similar to the one used here, in which two models, with left- and right-handed configurations, were proposed.
in the aforementioned study, however, only 10 integrin subtypes were used. in addition, the helix tilt was not restrained, and hence the conformational space searched was not complete, leading to ambiguous results, 33 the interpretation of which ultimately required the consideration of previous experimental data. in contrast, in the present work we have used 27 α sequences, and even under our stringent rmsd and clustering parameters (cf. previous report 33 ), we are able to detect two conserved models of interaction for all these sequences without the need to take into account any experimental data. in addition, we also show that an evolutionarily conserved mode of interaction (model i) also exists for α homotrimers, which suggests that the α homotrimers observed in vitro for constructs involving both the tm and cytoplasmic tail of αiib integrins (tm-cyto) in dodecylphosphocholine (dpc) micelles 15 are probably not artifacts. the fact that these homotrimers were only observed at high peptide concentration suggests that their stability is lower than that of α homodimers. however, results derived from calculation of energy (see material and methods) and packing efficiency 34 (http://www.molmovdb.org/cgi-bin/voronoi.cgi) (unpublished results) of these α homotrimers relative to the model i or ii α homodimers do not explain this hypothetical lower stability for the homotrimer. our independent predictions are nevertheless consistent with previous findings. for example, the helix tilt for our α homodimeric model i [see fig. 4(a)] is 19°, which corresponds to a crossing angle of 38° for a symmetric dimer. this is remarkably consistent with a "model i-like" αiib homodimer proposed by w.f. degrado and coworkers based on an exhaustive search of rigid-helix interactions and mutagenesis data, 17 where a modified motif, vgxxgg instead of gvxxg for gpa, was proposed. in the model i we report, residues g972 and g976, which pertain to the gxxxg motif (see residues 28 and 32 in fig.
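the geometric relation used above, that a symmetric dimer with a 19° helix tilt has a 38° crossing angle, can be stated as a trivial sketch (illustrative only; the function name is hypothetical):

```python
def crossing_angle(helix_tilt_deg):
    # in a symmetric homodimer both helices are tilted by the same angle
    # from the bundle axis, so the inter-helix crossing angle is twice
    # the tilt of either helix
    return 2.0 * helix_tilt_deg
```

with the model i tilt of 19°, this gives the 38° crossing angle quoted in the comparison with the degrado model.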
1, top panel), interact with v969 and v973 of the other helix (numbers 25 and 29 in fig. 1, top panel). also, mutations at l980 (l980a and l980v), which have been reported to greatly increase homodimerization, 17 are involved in the interaction. in contrast to this latter report, 17 however, calculations of the interaction energy per residue (fig. 6) in the αiib dimer model i do not show that the residues preceding g in the gxxxg motif (i.e., residues v971 and g975) are important for the interaction. our results for model i are therefore more consistent with a typical gpa-type mode of interaction in all integrins. the discrepancy between these results may be due to the different strategy used and the ambiguity in mutagenesis studies; for example, g975l or g975v was found to impair dimerization, but g975a was as dimerizing as the native residue. 17 small details aside, because we find that this homodimeric model i has been conserved through all integrin types, the model described for αiib 17 is just a particular instance of a more general form of interaction that includes all α representatives, as has also been suggested by experiments involving other integrin tm domains. 16

fig. 5. summary of the transmembrane integrin homomeric interactions found in our computational work. all homooligomers are right handed. the scheme represents the models of interaction found for homodimers and homotrimers in αiib (a) and β3 (b). each helix is represented by two halves. for αiib, the two gly residues in the gxxxg-like motif, g972 and g976, are located in one of the halves. residue w967 is located at an opposite location, in the other half (see text). for β3, the two small residues in the gxxxg-like motif, residues s and a, are located in the same half, whereas g708 (and m701) is at an opposite orientation, in the other half.
the mode of interaction of αiib in the α/β heterodimer (αiib/β3) has been described for the inactive state, 11 with the residues at the g positions in the gxxxg-like motif participating in helix-helix contacts; in contrast, residue w967 points away from the α/β interface. interestingly, mutation w967c resulted in the formation of an (αiib/β3)2 species, a dimer of dimers, through formation of a disulfide bond. 11 this suggests that in this particular case the interaction between αiib chains is independent of the gxxxg-like motif [similar to our model ii of interaction, fig. 4(b)]. our computational results show that this form of interaction is neither accidental nor specific to αiib, because we have found it to be evolutionarily conserved across all integrins. coexistence of the heterodimer and the α homodimer is therefore possible taking model ii into account, as has also been suggested earlier. 33 two conformations for the β3 homotrimer, of opposite handedness, have been proposed previously 33 based on restraints from mutagenesis data. in contrast, our results are independent of experimental data, and using 14 β sequences we show that two models are evolutionarily conserved, but both are right handed. in addition, we predict that these models are present not only in homotrimers, as previous experimental data suggest, 23 but also in homodimers. transmembrane β homodimers have been observed in β1, β3, and β7, 16, 35 and the importance of the gxxxg motif (an interaction equivalent to our model i) has been confirmed experimentally by mutagenesis using gallex, a two-hybrid system that follows heterodimerization of membrane proteins in the e. coli inner membrane. this seems to suggest that model i of interaction is more stable than model ii for β homodimers or homotrimers. this would also suggest that polypeptides encompassing the transmembrane domain and cytoplasmic tail (tm-cyto) of β3 that have been found to form homotrimers in dpc 15 probably correspond to model i.
in contrast, only when the model ii form of interaction is stabilized, for example by the mutant g708n in β3, which promoted homotrimerization, 23 would a model ii form of interaction [fig. 4(g)] be detected (see the position of g708 in β3). as for the α homooligomers, however, calculations of energy and packing efficiency for β2 and β3 (not shown) consistently show model ii of interaction to be more stable and better packed than model i. also, among the homotrimers, the β homotrimer model ii seems to be the most stable and well-packed homooligomeric form. more detailed analyses are needed to explain these discrepancies. but do these homomeric interactions have any functional relevance? the putative coexistence of integrin hetero- and homo-oligomers has been rationalized in a context where heteromeric interactions would stabilize the transmembrane region in a low-affinity and/or intermediate-affinity state, 11 whereas homooligomers would be present in the active state, crosslinking individual molecules and stabilizing focal adhesions. 33 consistent with this hypothesis, β3 tm homotrimerization induced constitutive activation and integrin clustering, suggesting a push-pull mechanism, 14 although other studies failed to detect homomeric interactions after αiibβ3 integrin activation. 11 nevertheless, a role for integrin transmembrane homomeric interactions in integrin clustering when binding to multimeric ligands 36 is possible. the fact that several homomeric interactions are evolutionarily conserved strongly supports this possibility. our results provide an explanation for seemingly conflicting reports on in vivo and in vitro transmembrane homomeric interactions of integrins. we have found that two modes of interaction are evolutionarily conserved. one of these interactions (model i) involves the gxxxg-like motif, which has been proposed previously on the basis of mutagenesis data in the transmembrane domains.
the other model (model ii) involves the opposite face of the helix, which is consistent with previous experimental data. because our models have been obtained independently of any previous experimental restraint, using only evolutionary conservation data as a filtering parameter, we suggest that these interactions are present in vivo. the present studies provide fertile ground for experimentation. we are presently studying the in vivo effects of these potentially disruptive mutations in the integrin transmembrane domain.

references:
- integrin avidity regulation: are changes in affinity and conformation underemphasized?
- extracellular matrix, anchor, and adhesion proteins
- role of integrins in regulating epidermal adhesion, growth and differentiation
- bidirectional transmembrane signaling by cytoplasmic domain separation in integrins
- effects of ligand-mimetic peptides arg-gly-asp-x (x = phe, trp, ser) on alpha iib beta 3 integrin conformation and oligomerization
- an unraveling tale of how integrins are activated from within
- crystal structure of the extracellular segment of integrin alpha v beta 3
- three-dimensional em structure of the ectodomain of integrin alpha v beta 3 in a complex with fibronectin
- breaking the integrin hinge: a defined structural constraint regulates integrin signaling
- three-dimensional model of the human platelet integrin alpha(iib)beta(3) based on electron cryomicroscopy and x-ray crystallography
- a specific interface between integrin transmembrane helices and affinity for ligand
- disrupting integrin transmembrane domain heterodimerization increases ligand binding affinity, not valency or clustering
- transmembrane signal transduction of the alpha(iib)beta(3) integrin
- a push-pull mechanism for regulating integrin function
- oligomerization of the integrin alphaiibbeta3: roles of the transmembrane and cytoplasmic domains
- involvement of transmembrane domain interactions in signal transduction by alpha/beta integrins
- dimerization of the transmembrane domain of integrin alpha(iib) subunit in cell membranes
- association of the membrane proximal regions of the alpha and beta subunit cytoplasmic domains constrains an integrin in the inactive state
- statistical analysis of predicted transmembrane alpha-helices
- statistical analysis of amino acid patterns in transmembrane helices: the gxxxg motif occurs frequently and in association with beta-branched residues at neighboring positions
- the gxxxg motif: a framework for transmembrane helix-helix association
- toxcat: a measure of transmembrane helix association in a biological membrane
- activation of integrin alpha iib beta 3 by modulation of transmembrane helix associations
- a new method to model membrane protein structure based on silent amino acid substitutions
- a transmembrane helix dimer: structure and implications
- contribution of energy values to the analysis of global searching molecular dynamics simulations of transmembrane helical bundles
- the transmembrane oligomers of coronavirus protein e
- computational searching and mutagenesis suggest a structure for the pentameric transmembrane domain of phospholamban
- crystallography & nmr system: a new software suite for macromolecular structure determination
- predicting transmembrane protein topology with a hidden markov model: application to complete genomes
- integrins: bidirectional, allosteric signaling machines
- molecular evolution of integrins: genes encoding integrin beta subunits from a coral and a sponge
- a computational model of transmembrane integrin clustering
- the packing density in proteins: standard radii and volumes
- gallex, a measurement of heterologous association of transmembrane helices in a biological membrane
- detection of integrin alpha iibbeta 3 clustering in living cells

j.t. thanks the financial support of the biomedical research council (bmrc) of singapore and the facilities at the bioinformatics research center (birc) of nanyang technological university. we are also grateful to paul d. adams for kindly providing chi.

key: cord-274732-mh0xixzh authors: faizal, w.m.; ghazali, n.n.n; khor, c.y.; badruddin, irfan anjum; zainon, m.z.; yazid, aznijar ahmad; ibrahim, norliza binti; razi, roziana mohd title: computational fluid dynamics modelling of human upper airway: a review date: 2020-06-26 journal: comput methods programs biomed doi: 10.1016/j.cmpb.2020.105627 sha: doc_id: 274732 cord_uid: mh0xixzh

background and objective: human upper airway (hua) has been widely investigated by many researchers, covering various aspects such as the effects of geometrical parameters on the pressure, velocity, and airflow characteristics. clinically significant obstruction can develop anywhere throughout the upper airway, leading to asphyxia and death; this is where recognition and treatment are essential and lifesaving. the availability of advanced computing hardware and software, together with rapid developments in numerical methods, has encouraged researchers to simulate the airflow characteristics and properties of the hua under various patient conditions across different ranges of geometry and operating conditions.
computational fluid dynamics (cfd) has emerged as an efficient alternative tool to understand the airflow of the hua and to prepare patients for surgery. the main objective of this article is to review the literature that deals with the cfd approach and modeling in analyzing the hua. methods: this review article discusses the experimental and computational methods used in the study of the hua. the discussion includes the computational fluid dynamics approach and the steps involved in the modeling used to investigate the flow characteristics of the hua. from inception to may 2020, the databases of pubmed, embase, scopus, the cochrane library, biomed central, and web of science were utilized to conduct a thorough investigation of the literature. no restrictions on publication language or study design were applied in the database searches. a total of 117 articles relevant to the topic under investigation were thoroughly and critically reviewed to give clear information about the subject. the article summarizes the review in terms of the methods of studying the hua, the cfd approach to the hua, and the application of cfd for predicting hua obstruction, including the types of commercial cfd software used in this research area. results: this review found that the human upper airway has been well studied through the application of computational fluid dynamics, which has considerably enhanced the understanding of flow in the hua. in addition, it has assisted in making strategic and reasonable decisions regarding the adoption of treatment methods in clinical settings. the literature suggests that most studies related to hua simulation have focused considerably on the aspects of fluid dynamics. however, there is a literature gap in obtaining information on the effects of fluid-structure interaction (fsi). the application of fsi to the hua is still limited in the literature; as such, this could be a potential area for future researchers.
furthermore, the majority of researchers present the findings of their work through the mechanisms of airflow, such as velocity, pressure, and shear stress. this includes the use of the navier–stokes equations via cfd to help visualize the actual mechanism of the airflow. the above-mentioned technique expresses the turbulent kinetic energy (tke) in its results to demonstrate the real mechanism of the airflow. apart from that, key results such as wall shear stress (wss), turbulent kinetic energy (tke), and turbulent energy dissipation (ted) can be revealed, which may be suggestive of wall injury and tissue collapsibility in the hua. breathing, also known as ventilation, is the process of taking air into and expelling it from the lungs, thereby taking in oxygen and discharging carbon dioxide. the respiratory system comprises the nose and mouth and continues through the respiratory tract: the airways and lungs. the lungs are the main organs of the respiratory system, where oxygen and carbon dioxide are exchanged during breathing. the nose and mouth are used for breathing; air enters the respiratory system through them, passes along the throat (pharynx), and across the voice box (larynx). food and drink are blocked from entering the airways during swallowing because the passage to the larynx is lined with a small flap of tissue called the epiglottis, which spontaneously closes upon swallowing. the upper airway refers to the airway segment between the nose or mouth and the main carina, at the lower end of the trachea. the segment between the trachea and the mainstem bronchi constitutes the central airways. the lower conducting airways, such as the main, lobar, and segmental bronchi, differ from the upper airway in that the latter has no collateral ventilation.
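the wall quantities mentioned above can be illustrated with their textbook definitions (a python sketch of the standard newtonian wss and tke formulas, not an extract from any of the reviewed solvers; function names and the viscosity value for air, about 1.8e-5 pa·s, are assumptions stated here for illustration):

```python
def wall_shear_stress(mu, du_dy):
    # newtonian wall shear stress: tau_w = mu * (du/dy) evaluated at the
    # wall, where mu is the dynamic viscosity of the fluid
    return mu * du_dy

def turbulent_kinetic_energy(u_fluc, v_fluc, w_fluc):
    # tke = 0.5 * (mean(u'^2) + mean(v'^2) + mean(w'^2)), computed from
    # samples of the velocity fluctuations about the mean flow
    mean_sq = lambda xs: sum(x * x for x in xs) / len(xs)
    return 0.5 * (mean_sq(u_fluc) + mean_sq(v_fluc) + mean_sq(w_fluc))
```

in a cfd solver these quantities are evaluated from the resolved (or modeled) velocity field at the airway wall; here they are shown only to make the definitions concrete.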
as such, any obstruction of the upper airway or central airways can be fatal, be it an acute obstruction (occurring within minutes) or a chronic one (developing over weeks or months). asphyxia and death may result from any clinically significant obstruction at any site along the upper airway; this is where recognition and treatment can be lifesaving. the upper airways are segregated into four sections: the nose (functional during nasopharyngeal breathing) and the mouth (functional during oropharyngeal breathing), the pharynx, the larynx, and the trachea. due to the parallel anatomic arrangement of the nose and the mouth, they seldom become the site of an upper airway obstruction, except in the event of massive facial trauma. central airway obstruction is a subset of upper airway obstruction, which includes the trachea and mainstem bronchi. in recent years, the study of the human upper airway (hua) has become one of the interesting research areas. some studies have focused on the flow characteristics, computer modeling, and fluid-structure interaction between the airflow and the soft tissue of the upper airway. the human upper airway plays a crucial role in delivering the inhaled air from the nasal passages to the lungs, as part of the body's breathing mechanism. the unique anatomical structure and functional properties of the soft tissues of the upper airways (e.g., mucosa, cartilages, and neural and lymphatic tissues) significantly influence the airflow characteristics and have a crucial effect on the conduction of air to the lower airways. figure 1 shows the anatomy of the human upper airway, which begins with the nasal cavity and continues over the nasopharynx and oropharynx to the larynx [1]. the function of the upper airway is not limited to delivering air to the lungs; it also ensures normal phonation, digestion, humidification, olfaction, and warming of inspired air [2].
therefore, a better understanding of the human upper airway may enhance the clinical application of anatomical structure and improve the physiological knowledge of the respiratory system for medical practitioners. abnormal breathing during sleep can be classified into various forms and conditions, such as sleep apnea, sleep-disordered breathing (sdb), sleep apnea-hypopnea syndrome (sahs), and sleep-related breathing disorder (srbd), all of which arise from different etiologies [3]. many researchers are concerned about the importance of abnormal breathing during sleep and have proposed different efficient methods to detect or monitor the breathing performance of the patient [4]. on top of that, scholars have proposed other efficient ways to treat patients with abnormal sleep breathing via the skeletal surgical approach [5], volumetric tongue reduction [6], and the use of oral appliances to improve their sleep breathing experience [7], [8].

figure 1: the human upper airway is described as the area of airway between the nose (nasal cavity) and the mouth (oral cavity), and the main carina at the lower end of the trachea [5].

maintaining the patency of the upper airway is a primary physiological challenge during sleep. the relaxation of the throat muscles narrows the airway and causes the failure of upper airway patency, which leads to obstructive sleep apnea (osa) and its sequelae [9]. moreover, the protective reflexes of the upper airway are weak during sleep, and the relaxation of the soft tissues can easily lead to upper airway collapse [10]. osa can be classified as a multifactorial disease, which involves a complex interplay of the upper airway anatomy, alone and/or in combination with neuromuscular control mechanisms and other pathophysiologic factors (e.g., respiratory arousal threshold and loop gain) [11]. dynamic tongue movement and tongue thickness may impair the respiratory control of an osa patient. the factors that contribute to osa vary between individuals.
thus, an understanding of the osa mechanism is important to achieve the goal of individualized and targeted therapy for patients [12]. however, restrictions such as the lack of information on the flow behavior and upper airway collapse can result in a surgical success rate of only 50% [13]. therefore, the knowledge and understanding of the flow properties of the upper airway are practically important for medical practitioners and surgeons to accurately locate the obstruction in osa patients [14], [15]. surgical correction of osa syndrome is one of the alternative methods to tackle the obstruction of the upper airway [16]. the understanding of the three-dimensional (3d) airway anatomy is crucial for the surgical correction of osa because it involves a number of parameters, such as internal airflow velocity, wall shear stress, and pressure drop [17]. advanced cone-beam computed tomography (cbct) scans and automated computer analysis have been used to facilitate the visualization of the 3d upper airway in identifying abnormal airway conditions and their response to surgery. besides that, preoperative studies provide knowledge and information on the specific airway obstruction, which can enable precise surgical treatment for osa patients by focusing on the specific region [18]. for example, a 75% to 100% surgical success rate for maxillomandibular advancement (mma) was reported in the surgical correction of osa syndrome [19]. in recent years, the computational fluid dynamics (cfd) method has been widely employed to analyze the airflow in both healthy and diseased human conducting airways [20]. most studies have focused on pollutant transport and drug delivery in respiratory systems, whereas others have focused on sleep-disordered breathing [21]. during breathing, the airflow characteristics through the human respiratory tract are very complex. the airflow can be in the laminar, transitional, or turbulent condition.
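as a rough illustration of how the pressure drop mentioned above relates to airway geometry, the laminar hagen–poiseuille relation for a straight circular tube can be sketched (a crude, hypothetical estimate only; real upper airways are non-cylindrical and the flow is often transitional or turbulent, which is precisely why full cfd is needed):

```python
import math

def poiseuille_pressure_drop(mu, length, radius, flow_rate):
    # hagen-poiseuille law for fully developed laminar flow in a straight
    # circular tube: delta_p = 8 * mu * l * q / (pi * r**4)
    return 8.0 * mu * length * flow_rate / (math.pi * radius ** 4)
```

the fourth-power dependence on radius means that halving the radius of a narrowed segment raises the estimated pressure drop sixteen-fold, which illustrates why small geometric changes in an obstructed airway have such a large effect.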
the geometry and boundary conditions are the main factors that greatly affect the airflow in human respiratory tracts [22]. therefore, these factors must be considered in any cfd study of respiratory flow. sometimes, simplified airway geometries are considered in the cfd analysis due to the limitations of computing memory and time. with the advancements in medical imaging techniques, unique and realistic geometries of the respiratory tract can be reconstructed from the scanned images and converted into a cfd model; this applies to both extra- and intra-thoracic airways [23]. the cfd analysis is able to provide clear visualization, and its results can compensate for the lack of information from experiments, such as the flow properties and flow patterns generated inside the human upper airway [24]. moreover, the rapid development of new computational and numerical methods running on powerful hardware can now shorten the computation time, allowing researchers to study the critical airflow of the upper airway prior to surgery [25]. this makes cfd a vital and useful tool for predicting the various flow properties in the human upper airway by computationally solving the flow equations [26]. thus, this review article focuses on the challenges that arise in analyzing the airway mechanisms via the cfd approach, and on the critical processes, including the modeling and its mathematical background, related to the study of the human upper airway (hua). in addition, the various steps involved (i.e., pre-processing and post-processing) in solving the cfd analysis are investigated. lastly, the application of the cfd method in the analysis of the human upper airway, as carried out by previous scholars, is also discussed in this article. as mentioned beforehand, the upper and central airways are parts of the human breathing airway.
these airways can be afflicted by respiratory diseases, which can distress or impair organs and structures that are related to breathing, such as the nasal cavities, the pharynx (also known as throat), the larynx, the trachea (or windpipe), the bronchi and bronchioles, the tissues of the lungs, and the respiratory muscles of the chest cage. there are many causes of respiratory illnesses and diseases; some of them are through infection, smoking of tobacco or breathing in secondhand tobacco smoke, asbestos, radon, as well as various types of air pollution. asthma, chronic obstructive pulmonary disease (copd), pneumonia, pulmonary fibrosis, lung cancer [27], and the newest pandemic-causing respiratory disease are among them. representative cfd studies of such diseases include the following:
 respiratory disease (blocked lung airways): the aim of this work was to evaluate the effect of the inlet velocity profile on the flow features in blocked airways along the branches of a human lung. evaluation of bifurcation flow in a human lung was deemed vital to give a better prediction of the particle deposition in drug therapy and inhalation toxicology. this was done through the use of the fully 3d incompressible laminar navier-stokes and continuity equations, while the cfd solver was used to solve the unstructured tetrahedral meshes.
 ali et al. [31], respiratory disease (lung airways): a cfd and particle dynamics framework on a patient-specific lung model was described under unsteady flow conditions, which can enhance the knowledge of particle transport and deposition in the airways. essentially, this can be a useful guide for future targeted drug delivery studies.
 jinxiang et al. [32], respiratory obstructive diseases (bronchial tumor): potential sites of disease were investigated in this work, where disease severity was determined to help formulate a targeted drug delivery plan for the treatment of the disease. cfd was employed to provide visualization of the unique lung structure in assessing the exhaled aerosol distribution. cfd is also sensitive to the varying airway structures.
 qingtao et al. [33], lung cancer: the structural and functional changes of the tracheobronchial tree post-lobectomy were assessed via cfd to depict the airflow characteristics of the wall pressure, airflow velocity, and lobar flow rate.
 long et al. [34], pulmonary fibrosis (lung disease): the respiratory system was assessed through the use of the cfd method and compared with the pulmonary acinus mechanics and functions in healthy subjects.
the airflow in the human upper airway is complex and time dependent. the airflow undergoes transition from laminar to turbulent, and vice versa, within a second [35]. the complex geometry of the human upper airway results in curved streamlines, recirculation or vortex regions, secondary flow, and jet flow. in order to study the human upper airway, the laminar-turbulent transition flow with complex geometry can be analyzed via experimental and computational approaches. both methods require a similar first step, which is the preparation of the geometry model of a human upper airway [36]. the geometry model can be generated from the images captured or scanned during the medical imaging process, such as ct scan and mri, among others [37]. in the experimental method, a prototype of the human upper airway is fabricated based on the unique geometry of the upper airway of a patient, and the experiments are performed on this physical prototype [38]. the prototype of the upper airway is typically fabricated according to the actual dimensions or as a scale model, such as a reduced- or enlarged-scale model. the operating parameters are considered in the experiments to evaluate the velocity, pressure, and flow profile of an upper airway. several factors, such as the time and cost required for the study, as well as the availability of facilities and measurement devices, need to be taken into consideration in the experimental works [39]. 
furthermore, other errors, such as measurement and human errors, may influence the accuracy of data collected from the upper airway experimental setup. many scholars have performed experimental analyses on various upper airways from different patients to identify the accurate causes, and to propose the response for surgery and operation [40]. pirnar et al. [41] provided a clear physical interpretation by developing a simplified experimental system. the computational fluid dynamics method is an alternative way of analyzing and solving a complicated fluid flow problem. although cfd is widely applied in the engineering field, it can also be an extremely powerful tool in the biomedical research field. cfd solves the governing equations of fluid flow while advancing the solution in space and time [42]. fernández-parra et al. applied cfd in this context; in addition, koullapis et al. [46] studied the efficiency of the computational fluid-particle dynamics approach for the prediction of deposition in a simplified approximation of the deep lung. moreover, the availability of computational resources for solving numerical algorithms also makes it possible to solve the governing equations using a specific numerical method [47]. the combination of a conventional coupled cfd-dem model is also able to solve fluid-structure interaction (fsi) problems. the integration of a dynamic meshing approach allows the fsi simulation to resolve the flow structure surrounding a large free-moving object. commercial software platforms, such as ansys fluent, with user-defined functions (udfs), are used to simulate the analysis due to their capability in handling complex geometries of the model and solving the dynamic meshing [48], [49]. recently, fsi simulation analysis of the upper airway was adopted to explain the mechanism of pharyngeal collapse and snoring [50]. 
various cfd commercial software tools use different discretization methods (e.g., finite difference, finite element, and finite volume methods) for solving the governing equations. ansys fluent is a popular cfd code, which has been widely used in the simulation and research of the human upper airway. in the cfd analysis, ansys fluent separates the analysis into three main stages, which are: (i) preprocessing, (ii) solver, and (iii) post-processing. however, to predict the flow accurately in the study of the upper airway, the selected numerical method must have the capability to simulate the low-reynolds-number turbulence model in a complex geometry [51]. the cfd model requires validation to ensure the reliability of the predicted results as a reference. model validation is a very important step in any simulation analysis to ensure the models are performing as expected or mimic the real condition. similarly, model validation in biomechanical modeling of the human upper airway with clinical data gains the confidence of users in utilizing the cfd computations. the rapid growth of interest in biomechanical modeling of the human upper airway since the 1990s has provided a better understanding and explanation of its physiology and pathophysiology. in cfd model validation, experimental data or clinical data benchmarks are usually used to validate the simulation results [52], [53]. zhao and lieber [55] performed airflow measurements in a bifurcation model, including velocity profiles in the x-z plane. this bifurcation model was then further validated by mihai et al. [56] in their cfd pharyngeal airflow study. various turbulence models (e.g., les and rans) were employed to simulate the turbulent airflow behavior in the bifurcation model. the simulation results were compared with zhao and lieber's experiment [55], as shown in figure 5. comparing cfd results with available experimental data is a popular technique to validate the cfd model or analysis. 
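one simple way to quantify the agreement between cfd and experimental velocity profiles is a relative L2 error over matched sampling points. this is a hedged sketch: the cited validation studies do not necessarily use this exact metric, and the profiles below are hypothetical.

```python
import math

def relative_l2_error(simulated, measured):
    """||sim - exp||_2 / ||exp||_2 over matched sampling points; a small
    value indicates the cfd profile closely follows the experiment."""
    if len(simulated) != len(measured):
        raise ValueError("profiles must be sampled at the same points")
    num = math.sqrt(sum((s - m) ** 2 for s, m in zip(simulated, measured)))
    den = math.sqrt(sum(m ** 2 for m in measured))
    return num / den

# hypothetical axial-velocity profiles (m/s) at matched measurement stations
cfd_profile = [0.0, 1.1, 2.0, 2.6, 2.0, 1.1, 0.0]
exp_profile = [0.0, 1.0, 2.1, 2.5, 2.1, 1.0, 0.0]
err = relative_l2_error(cfd_profile, exp_profile)   # a few percent here
```

a single scalar like this makes "good agreement" concrete and comparable across turbulence models, which is how the les and rans results above could be ranked against the same experiment.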
a good agreement between the experimental and simulation results indicates the strong capability of the cfd model to reflect the real condition. cfd validation is not only limited to the results; it is also applicable to the methodology. the cfd methodology is compared with the experimental flow data to ensure the airway has similar conditions before it can be extended for a parametric study using the same model [57], [58]. table 2 summarizes the differences between experimental and computational approaches for various applications.
experimental approach:
 limited range of problems and constraints on operating conditions.
 difficult to modify the experimental setup once manufactured or fabricated.
 requires a large number of measuring instruments for data collection.
 experimental works are not constrained by the complexity of the problem.
 slow, sequential, and single purpose.
computational approach:
 simplification is done on the physical system.
 simulation data have no limit in terms of space and time.
 cost only for the initial investment in commercial software.
 applicable for virtually any range of problems and for realistic operating conditions.
 the computational model or domain can be modified easily.
 various models and tools are available for calculation and data collection.
 constrained by the built-in mathematical models and requires user-defined functions, especially for complex systems.
 fast computing, parallel, and multi-purpose.
the mechanical properties of the upper airway are important to define the obstructive events in detail, such as the relaxation of the soft tissue during sleep [61]. a better understanding of the mechanisms involved in the obstructive upper airway can be achieved via experimental and numerical studies [62]. 
a particular numerical simulation method, cfd, is the best approach to describe the airflow in the upper airway. preprocessing is the initial step of the cfd simulation analysis, comprising the identification of the domain of interest, creation of a domain for the upper airway, and mesh generation on the domain. the preprocessing step is not limited to the study of the upper airway; rather, it is also widely applied in the cfd analysis of various engineering problems. in this step, modeling goals, such as the initial assumptions on the airflow (e.g., steady or unsteady flow) and the model (e.g., laminar or turbulent) that are used to describe the airflow, must be clear for the given upper airway problem. moreover, the degree of accuracy, computation time, and simplification also need to be considered in the initial step of the cfd simulation process. essentially, the preprocessing step of the cfd analysis includes the following: (i) identifying the domain of interest: figure 6 shows the schematic diagram of the human upper airway in the anatomy of the nasal airway. the regions of the nasal airway were divided into nine different cross sections, and defined based on the nomenclature and magnitude of the airflow velocity [63]. the selection of a specific section was considered according to the interest of the study. (ii) mesh generation: fine mesh elements were generated close to the wall region for better numerical stability and accuracy. figure 8 shows the boundary layer meshing in the study of the upper airway using a 2d nonuniform mesh [63]. the boundary layer meshing was expected to provide better stability during the simulation. moreover, a grid independence test was required to determine the best number of cells or elements of a domain. the optimal number of elements reduces the computational resources, such as computation time, while providing accurate predictions of the airflow through the simulation. in the grid independence test, different numbers of elements were considered for the domain, as shown in figure 9 [65]. 
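the grid independence test can be sketched as a simple convergence check: refine the mesh until a monitored quantity changes by less than a chosen tolerance between successive refinements (a 1% threshold is commonly used). the monitored pressure-drop values below are hypothetical.

```python
def grid_independent_index(values, tol=0.01):
    """return the index of the first mesh whose monitored result differs
    from the next refinement by less than `tol` (relative change).
    `values` are results (e.g., pressure drop in Pa) ordered from
    coarsest to finest mesh. returns None if convergence is not reached."""
    for i in range(len(values) - 1):
        rel_change = abs(values[i + 1] - values[i]) / abs(values[i])
        if rel_change < tol:
            return i
    return None

# hypothetical pressure-drop results (Pa) for increasingly fine meshes
pressure_drops = [41.0, 44.8, 45.9, 46.0, 46.02]
idx = grid_independent_index(pressure_drops)   # third mesh already converged
```

picking the coarsest converged mesh is exactly the trade-off described above: accurate predictions at the lowest computational cost.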
once the elements had achieved the optimal number, a further increase in the elements produced less than 1% variation in the result. thus, the step of determining the optimal number of elements for a domain is necessary to obtain reliable and accurate cfd results. on the other hand, the upper airways were simplified into a 2d model to simulate the fluid-structure interaction (fsi) phenomenon of the soft palate in the pharynx using more than one turbulence model as a solver [70]. the multi-block structured grid was applied to the 2d model. alternatively, a patient-specific model can be reconstructed, keeping intact the unique features of the patient [72]. the medical images captured from ct, high-resolution computed tomography (hrct), and mri are imported and processed by image processing software (e.g., 3d-doctor, mimics, and 3d slicer) before the computational domain is created [73]. figure 12 depicts the typical steps to construct an accurate model of the upper airway, as well as that of other models (e.g., bone and tissue) [74]. the ct scan data were chosen for the image processing because they have a good contrast between bones and soft tissues compared with the mri data. the volumetric data were constructed in the image processing step, while gaussian filters and edge detection were applied to remove the image noise. the specific region of bone is identified in this step, while the segmentation step separates the details of the bony structures. for example, the ligaments, tissues, cartilages, and bony structures, such as the tibia and femur, are identified and defined in the image via a region-growing algorithm that is typically built into the commercial software. the defined regions were then used to form a profile series. then, the detailed (i.e., accurate) model was generated and meshed based on the profile series. the basic element used in generating an accurate model is the voxel. a voxel represents a value on a regular grid in three-dimensional space. 
voxels of the tissue with fixed dimensions provide the coordinates and some attributes that characterize the position of the tissue. from the voxels, the surface of the model is then generated using a graphic rendering technique. the marching cubes algorithm is applied to extract the polygons from the volumetric data. the boundary voxels and surface triangulation are generated to represent the 3d object [75]. finally, the reconstruction/meshing step is carried out to discretize the complex model into finite volumes or finite elements. smaller volumes or elements are generated to resolve the complex flow in the accurate model. apart from that, a specific region of the elements is used to define the boundary condition to compute the inlet velocity. the optimal number of meshing elements can be determined from the grid sensitivity study to ensure the accuracy of the model. for turbulence modeling, a two-equation rans model, the standard k-ε (ske) model, is often preferred in simulation studies of the human upper airway [77]. three turbulence models were compared in previous studies: the realizable k-ε (rke) model [78], the k-ω turbulence model [79], and its shear stress transport (sst) variant, k-ω sst [80]. menter et al. [81] and langtry et al. [82] developed the four-equation transition sst model, which is more demanding compared with the two-equation models. different models have their own wall mesh treatment and mesh resolution, as shown in figure 9 [83]. wall functions are typically employed in cases where flow separation is not expected. the distance from the wall to the adjacent layer (y_p) is different for the k-ε and k-ω models. therefore, the dimensionless wall distance (y+) is applied to assess the mesh resolution near the wall, as defined in equation 1:

y+ = u_τ y_p / ν (1)

where u_τ is the friction velocity based on the wall shear stress and air density, and ν is the kinematic viscosity. 
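equation 1, y+ = u_τ·y_p/ν with u_τ = sqrt(τ_w/ρ), can be turned into a small helper for mesh planning: given the wall shear stress and fluid properties, estimate the wall-adjacent cell height needed for a target y+. the numbers below are illustrative air properties, not values from the reviewed studies.

```python
import math

def friction_velocity(tau_w, rho):
    """u_tau = sqrt(tau_w / rho): friction velocity from wall shear
    stress (Pa) and density (kg/m^3)."""
    return math.sqrt(tau_w / rho)

def y_plus(u_tau, y_p, nu):
    """dimensionless wall distance, equation 1: y+ = u_tau * y_p / nu."""
    return u_tau * y_p / nu

def first_cell_height(tau_w, rho, nu, target_yplus=1.0):
    """invert equation 1 to estimate the wall-adjacent cell height y_p
    needed to reach a target y+ (e.g., y+ ~ 1 for wall-resolved models)."""
    return target_yplus * nu / friction_velocity(tau_w, rho)

# illustrative air values: tau_w = 0.05 Pa, rho = 1.2 kg/m^3, nu = 1.5e-5 m^2/s
u_tau = friction_velocity(0.05, 1.2)
y_p = first_cell_height(0.05, 1.2, 1.5e-5)   # cell height for y+ ~ 1
```

a wall-resolved model such as k-ω sst is usually run at y+ near 1, whereas wall functions tolerate much larger values, which is why the mesh resolution near the wall differs between the models discussed above.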
figure 14 shows the possible combinations of computational models and techniques in the modeling of a turbulent flow [92]. the specific flow modeling depends on its turbulence length scale and the specific computational techniques used to tackle the physics of turbulent flow. any interface tracking method (itm) can be straightforwardly combined with the les model due to its very small resolvable length scale. the combined methods are able to facilitate the interface exchange terms while delivering the time-dependent interfacial kinematics of the model. moreover, the les model has become a rather popular and extremely powerful tool in turbulence modeling [93]. in addition, les is not only limited to turbulent flow analysis; it is also applicable to various analyses, such as aeroacoustics, combustion, gas turbines, and many other engineering areas. one such cfd model was used to evaluate nasal resistance and to explain the regional dynamic airway narrowing phenomenon during expiration in the upper airway of a child with osa (figure 15). figure 15 shows the inspiration pressure, velocity magnitude, and turbulence kinetic energy, and the expiration pressure in caudal and lateral views, and in the midline sagittal plane. the cfd results demonstrated that the low and negative pressure zones were attributed to the area of restriction, which is related to the cause of sleep apnea. jeong et al. [63] carried out a numerical investigation on the aerodynamic force in the hua of patients with osa using cfd analysis. their simulation results revealed that the area of restriction in the velopharynx region had induced a turbulent jet flow in the pharyngeal airway. this situation caused higher shear stress and pressure forces surrounding the velopharynx (figure 17). in addition, an accurate approach was proposed by mihai et al. [56] to tackle the airflow situations that occur in the human airway. steady rans and les approaches were applied to the flow modeling. 
both modeling approaches yielded different airflow characteristics and static pressure distributions on the human airway walls. the maximum narrowing region of the retropalatal pharynx caused velocity changes and pressure variations in both modeling approaches, as clearly shown in figure 17. the cfd approach can provide valuable preliminary information to a surgeon on the geometrical reconstruction of the human upper airway prior to performing the surgery. sittitavornwong et al. [97] analyzed the anatomical airway changes to predict the outcome of maxillomandibular advancement (mma) surgery. they considered a 3d geometrical reconstruction of the airway and investigated the fluid dynamics before and after the surgery using the cfd approach. after the mma surgery, the dimension of the airway increases, thus reducing the airway resistance and improving the pressure effort of osa patients. the mma surgery is expected to overcome osa problems for the patients. therefore, the study of the human airway before and after mma surgery has become a focus in this research area. moreover, sittitavornwong et al. [98] extended their research to the effect of changes in the soft tissues post-mma surgery. the dimension of the human airway for an osa patient increases after the mma surgery (figure 18). pre- and post-surgery results (i.e., pressure distribution and shear stress) were compared in figure 19. the comparison study showed a significant improvement in the pressure distribution in the airway. apart from that, an investigation of the mandibular advancement splint (mas) was reported by zhao et al. [99] using the fluid-structure interaction method. chang et al. [100] also used fsi to evaluate the effect before and after the maxillomandibular advancement surgery. in the pre- and post-treatment of osa, cfd is a popular approach, in which it can reveal the airflow characteristics and flow variables after the treatment via unsteady flow analysis. 
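as a hedged illustration of why an enlarged airway lowers resistance, airway resistance can be summarized as R = Δp/Q (pressure drop over volumetric flow rate). the pre- and post-surgery numbers below are hypothetical, not values taken from the cited studies.

```python
def airway_resistance(delta_p, flow_rate):
    """R = delta_p / Q: pressure drop (Pa) over volumetric flow rate (m^3/s)."""
    return delta_p / flow_rate

# hypothetical pre/post-surgery pressure drops at the same inspiratory flow
q = 250e-6                             # 250 mL/s expressed in m^3/s
r_pre = airway_resistance(180.0, q)    # higher pressure drop pre-surgery
r_post = airway_resistance(60.0, q)    # lower pressure drop post-surgery
reduction = 1 - r_post / r_pre         # fractional reduction in resistance
```

because the flow rate is held fixed, the resistance reduction tracks the pressure-drop reduction directly, mirroring the reported post-mma decreases in pressure drop and airway resistance.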
the fsi method is considered when displacement and vibration of soft tissue or a membrane, such as the trachea, are involved. the function of the trachea is to regulate the pressure during breathing, coughing, or sneezing. the deformations and stresses of the trachea induced by these ventilation conditions can be computed via the fsi method. severe muscular membrane deformation leads to a critical issue for the physiological function of the trachea [101]. based on the literature review, the capability of the fsi method is validated and recognized in various studies. fsi may demonstrate realistic and accurate results on the flow patterns in the study of the hua [106], [107]. it is also applicable for the analysis of global and local flow features of the nasal cavity [23]. (figure 19: comparison between preoperative and postoperative pressure distributions along the upper airways [98].) recently, the cfd approach has emerged as a virtual surgical concept and reference for medical practitioners and surgeons by providing the simulation analysis. the cfd result is used for surgical planning and as a reference for decision-making procedures in diseased airways (table 3). therefore, the most effective tailored treatment plan can be made by medical personnel to prevent the recurrence of osa disease. this procedure allows surgeons to modify the airway model, including the geometry reconstruction and boundaries (figure 20), based on a certain approach or a series of surgical procedures [105]. in addition, ultra-low-dose computed tomography (ct) scan data were used to predict the treatment outcome in children with sleep-disordered breathing using the cfd approach [106]. excessive internal pressure forces may lead to the collapse of the upper airway [107], which can be observed in the cfd analysis. the prediction of cfd is consistent with the clinical parameters and data. thus, it can be a useful tool to evaluate the surgical effects that are associated with osa problems [108]. 
table 3: four types of virtual surgery analyses using functional imaging and cfd [105]. modifications to the baseline (b) airway boundaries on axial scans (baseline and virtual surgery s1-s4 boundaries are shown in figure 19):
 s1: baseline airway boundaries changed on cross-sectional planes a, b, c, and d.
 s2: baseline airway boundaries changed on cross-sectional planes d, e, f, and g.
 s3 (= s1 + s2): baseline airway boundaries changed on cross-sectional planes a-g. this surgery removes both constrictions in the airway.
 s4: baseline airway boundaries changed on cross-sectional planes d-g. similar to s2; the airway lumen is enlarged compared with the baseline, but not as much as in s2.
additionally, cfd is used to test and prove the hypothesis model for the treatment of osa. the cfd model is used to correlate with the treatment response after adenotonsillectomy surgery (at) by measuring the apnea-hypopnea index (ahi). a decrease in ahi indicates the success of the hypothesis model in the proposed virtual surgery. cfd, mri, and physiological data of ten obese children were used to calculate the air pressure and velocity before and after the at surgery [109]. the correlation between the at treatment response and other parameters (i.e., pressure-flow ratio, local air pressure drops, geometrical changes, and minimum surface pressure) was considered in the analysis. moreover, an in vitro experiment was used to investigate the airflow distribution in the human airway [110]. the numerical and in vitro experimental models of the airway were reproduced from the ct data, and the experimental model was created by a 3d printer. the particle image velocimetry (piv) technique was used to measure the velocity, and the in vitro data were useful for the validation of the numerical simulation results. the dns-lbm model was adopted to assess the flow properties in the upper airway, including consideration of the nasal cavity, pharynx, larynx, and trachea [89]. 
the cfd approach has been used to quantify the glottis motion, cyclic flow, and its effect on the respiratory dynamics. liu et al. [111] conducted a study of airflow dynamics in an obstructed realistic hua. the focus was put on continuous inspiration and expiration with varied respiratory intensities. the authors proposed theoretical guidance for the treatment of respiratory diseases. a 3d airway model was developed with a time-varying glottal aperture. several aspects, such as vortex topologies, shear stress, flow resistance, and breathing conditions, were considered in their studies. with the aid of cfd, new clinical information can be incorporated into the cfd model to aid in the treatment of tracheomalacia [26]. the passive or active relationship between aerodynamic forces and the airway motion was categorized: a passive relationship indicates that the direction of the pressure force and the airway motion are concurrent, while opposite directions of the pressure force and the airway motion are categorized as an active relationship. in heavy breathing and several pathological conditions, the consideration of airway movement is crucial in the simulation analysis; fsi models can tackle this airway problem. recently, cfd analyses of bifurcation junctions of the respiratory tract have been considering the key parameters of the wall turbulence effects, such as the local reynolds number (y+), wall shear stress (wss), turbulent energy dissipation (ted), and turbulent kinetic energy (tke) [112]. the local reynolds number (y+) plays a significant role in calculating the wall turbulence and other related parameters. for example, the wall shear stress is used as an indicator to determine the wall injury level in the respiratory tract. the upper and lower connecting airways were considered as part of the respiratory tract without being omitted, extending from the nasal cavity to the terminal bronchioles. 
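as a sketch of one of these wall-turbulence quantities, the turbulent kinetic energy per unit mass can be computed from velocity fluctuations as k = ½(mean(u'²) + mean(v'²) + mean(w'²)). the probe-point samples below are illustrative, not data from the cited studies.

```python
def turbulent_kinetic_energy(u, v, w):
    """k = 0.5 * (mean(u'^2) + mean(v'^2) + mean(w'^2)), where primes are
    fluctuations about the mean of each velocity component (m/s)."""
    def mean(xs):
        return sum(xs) / len(xs)

    def var(xs):  # mean squared fluctuation about the component mean
        m = mean(xs)
        return mean([(x - m) ** 2 for x in xs])

    return 0.5 * (var(u) + var(v) + var(w))

# illustrative velocity samples at one probe point (m/s)
u = [2.0, 2.2, 1.8, 2.1, 1.9]
v = [0.1, -0.1, 0.0, 0.2, -0.2]
w = [0.0, 0.05, -0.05, 0.0, 0.0]
k = turbulent_kinetic_energy(u, v, w)   # J/kg, i.e. m^2/s^2
```

in a cfd post-processing workflow, elevated k near a constriction (such as the velopharynx jet discussed above) flags regions of strong turbulence, complementing wall shear stress as an indicator of possible wall injury.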
valuable information obtained from the cfd results, such as the airflow pattern, can contribute to the baseline data to provide a complete understanding of respiratory physiology [23] and airway collapsibility [113]. lastly, cfd applications in simulating the airflow of the upper airway and respiratory tract are helpful and can be used as a guideline for proposing a surgical approach for medical treatment to increase the cure rate. it is also beneficial in attaining a better understanding of the mechanism, treatment, and prevention of osa. table 4 summarizes the cfd studies in the relevant literature and compares the software and solvers used. wang et al. [114] discussed the relationship between obstructive sleep apnea and adenotonsillar hypertrophy in children with dissimilar weight status. on the other hand, vinha et al. [115] explored how the dimensions of the palate and pharynx pre- and post-surgery can affect patients with obstructive sleep apnea.
1. software/solver: the sc/tetra solver (version 12; software cradle) was adopted to simulate the airflow; fsi simulations were run using the sc/tetra abaqus module (version 12.0; software cradle) to assess the flexible region of the airway. focus: the correlation of the cross-sectional area after maxillomandibular advancement surgery with pressure, velocity, airway volume, and resistance. findings: post-mma surgery, there was an increase in airway volume, but a decrease in pressure drop, maximum airflow velocity, and airway resistance for both patients, as given by the cfd and fsi simulation results. the fsi simulation portrayed a section of marked airway deformation in both patients prior to the surgery; this deformation was considered inconsequential after both patients had undergone the surgery.
2. liu et al. [102]. software/solver: fluent 17.2 (ansys) was utilized for the computation of the airway flow equations; the ansys mechanical simulation software solved the transient structural equations in the evaluation of the soft tissue deformation and loading. focus: the variation of geometry along the airway with pressure and velocity in the determination of the location of obstruction in the upper airway. findings: identification of areas prone to collapse and to precipitating an apneic episode was achieved from observation of the tips of the soft palate and the tongue; consequently, the result was able to rationalize the mechanism of velocity and pressure distribution in the upper airway.
3. jinxiang et al. [116]. software/solver: ansys fluent (ansys, inc.) was used to solve the mass and momentum conservation equations, taking into account the airway complexity, while the computational meshes were created via ansys icem cfd (ansys, inc.). focus: the relationship and significance of tidal breathing and glottis motion with regard to the airflow features and energy expenditure in an image-based human upper airway model. findings: glottis motion altered both the laryngeal jet instability and vortex generation through main flow speed variations in the aperture and cyclic flow.
4. alister et al. [26]. software/solver: star-ccm+ 11.04.012 was employed in the calculation of the navier-stokes equations to assess the air pressure and velocity along the domain, which was bounded by moving walls. focus: the correlation between wall geometry from mri and airflow (inhale and exhale) in the upper airway. findings: outcomes of the cfd simulations were enhanced by the rapid breathing maneuver and by integrating the airway movement; there was a 19.8% increase in peak resistance earlier in the breath, a 19.2% decrease in overall pressure loss, and the proportion of flow in the mouth was elevated by 13.0%.
5. vivek et al. [112]. software/solver: a finite volume method (fvm) based cfd solver (ansys fluent 15) was utilized to computationally simulate aerosol deposition in the image-based respiratory tract model. focus: the correlation between the turbulence model and the near-wall function. findings: the results will assist researchers in selecting the optimal range of the local reynolds number (y+) and the turbulence model for simulating the right flow features in a human respiratory system.
6. yidan et al. [23]. software/solver: ansys fluent meshing (ansys inc., lebanon, nh) was used to mesh the airway model with polyhedral elements; the airflow was assumed to be incompressible (i.e., having constant air density), and a no-slip boundary condition was established at the walls; all simulations were numerically performed via ansys fluent v17.0 (ansys inc., lebanon, nh). focus: determination of the correlation between large-to-small conducting airways and the inhaled mass flow rate (inspiration breathing). findings: secondary flow currents were observed in the larynx-trachea segment and left main bronchus, whereas the airflow was much smoother, with no secondary flow currents, in the terminal conducting airway in the right lower lobe.
8. omid et al. [117]. software/solver: the solid and fluid domains employed an unstructured mesh produced via the ansa software by beta cae systems usa inc.; the alya high-performance computational mechanics code, a multiphysics code, was used to solve the fsi problems. focus: the relationship between deformation and collapse of the upper airway during breathing. findings: the sleeping position, gravity, and stiffness of the soft tissues (utilized in this work as a proxy for neuromuscular effects) were the key aspects of an upper airway collapse. 
this article presents a review of experimental and numerical methods, such as the computational fluid dynamics approach, and their application in the analysis of the human upper airway (hua), including fluid-structure interaction. the experimental studies of the hua have utilized prototype, scale, and in vitro models to measure the velocity, pressure, and flow profiles. nevertheless, limited facilities and measurement devices are some of the constraints of the experimental studies. a 3d printing technique was adopted to create the prototype based on the medical images (e.g., mri and ct scan data), whereby the model was able to retain the actual dimensions, geometry, and unique features of the upper airway of each patient. the computational approach is an alternative method used in hua research, as it is able to resolve the complex phenomena in the hua. various turbulence models (e.g., rans, k-ε, k-ω, les, and dns) are available in commercial software to simulate the airflow characteristics and properties. most studies used commercial software to carry out the cfd analysis of the hua. this is due to the capability of cfd to be integrated with structural analysis software to perform a fluid-structure interaction simulation in tackling hua problems that involve movement, motion, and deformation of the soft tissue or airway wall. this work discusses the approaches and modeling conditions (e.g., heavy or light breathing, and inhale or exhale), and the regions of interest (e.g., geometry parameters covering the nasal cavity, pharynx, larynx, and trachea). it is reported that the increase in dynamic pressure leads to the development of secondary flow. the larynx-trachea segment and left main bronchus are found to be occupied by secondary flow currents. furthermore, cfd is used to study the airflow characteristics before and after adenotonsillectomy and maxillomandibular advancement surgeries. 
Numerous researchers have analyzed the airflow mechanism by presenting velocity, pressure, and shear stress fields to support their findings. The Navier-Stokes equations, for example, are used to provide a visualization of the actual airflow mechanism: the technique yields meaningful results by describing the real fluid flow and by expressing turbulent kinetic energy (TKE) in the outcomes. TKE and turbulent energy dissipation (TED) can supply important quantities such as wall shear stress (WSS), which can be suggestive of wall injury as well as of tissue collapsibility in the human upper airway. Additionally, this approach can detect airway collapse during breathing. Airway collapse is caused by the posterior wall of the trachea bulging forward, or by the tongue collapsing during sleep, fully or partly blocking the airway and causing obstruction during exhalation and inspiration. The virtual surgery concept helps surgeons and medical practitioners plan and make treatment decisions for diseased airways. In addition, the literature demonstrates that researchers have been able to validate computational models against clinical and in vitro experimental data.
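The two quantities this passage highlights, TKE and WSS, are simple to compute once velocity data are available. A minimal sketch follows; the sample velocities, viscosity, and wall distance are made-up demonstration values, and the variance-based TKE estimate assumes the samples represent fluctuations about a statistically steady mean.

```python
import statistics

# Turbulent kinetic energy per unit mass: k = 0.5 * (u'^2 + v'^2 + w'^2),
# estimated from the population variances of velocity samples.
def tke(u, v, w):
    return 0.5 * (statistics.pvariance(u)
                  + statistics.pvariance(v)
                  + statistics.pvariance(w))

# Newtonian wall shear stress estimate: tau_w = mu * (du/dy) at the wall.
def wall_shear_stress(mu, du, dy):
    return mu * du / dy

# Hypothetical near-wall velocity samples (m/s) in three directions.
u = [3.0, 3.2, 2.9, 3.1]
v = [0.1, -0.1, 0.05, -0.05]
w = [0.0, 0.02, -0.02, 0.0]

k = tke(u, v, w)                    # m^2/s^2
tau = wall_shear_stress(mu=1.8e-5,  # dynamic viscosity of air (Pa*s)
                        du=2.5,     # near-wall velocity difference (m/s)
                        dy=1e-4)    # wall distance (m)
```

In a real post-processing pipeline these samples would come from the solver's exported near-wall fields rather than hand-typed lists.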
References:
- Laminar and turbulent flow calculations through a model human upper airway using unstructured meshes
- Functional anatomy and physiology of airway
- Techniques to assist selection of appropriate therapy for patients with obstructive sleep apnea
- New approaches to diagnosing sleep-disordered breathing
- Skeletal surgery for obstructive sleep apnea
- Volumetric tongue reduction for obstructive sleep apnea
- The impact of continuous positive airway pressure on heart rate variability in obstructive sleep apnea patients during sleep: a meta-analysis
- Respiratory effort during sleep apneas after interruption of long-term CPAP treatment in patients with obstructive sleep apnea
- Mechanical properties of the upper airway
- The nose, upper airway, and obstructive sleep apnea
- Submental ultrasound measurement of dynamic tongue base thickness in patients with obstructive sleep apnea
- Classification techniques on computerized systems to predict and/or to detect apnea: a systematic review
- Nasal obstruction considerations in sleep apnea
- Obstructive sleep apnea: the role of gender in prevalence, symptoms, and treatment success
- Obstructive sleep apnea: a standard of care that works
- Sleep, snoring, and surgery: OSA screening matters
- The effects of noncontinuous positive airway pressure therapies on the aerodynamic characteristics of the upper airway of obstructive sleep apnea patients: a systematic review
- Maxillary, mandibular, and chin advancement: treatment planning based on airway anatomy in obstructive sleep apnea
- Maxillomandibular advancement for treatment of obstructive sleep apnea syndrome: a systematic review
- A review of the implications of computational fluid dynamic studies on nasal airflow and physiology
- Prevalence of obstructive sleep apnea in the general population: a systematic review
- Development of human airways model for CFD analysis
- Detailed computational analysis of flow dynamics in an extended respiratory airway model
- The challenges of precision medicine in obstructive sleep apnea
- Obstructive sleep apnea: emphasis on discharge education after surgery
- Assessing the relationship between movement and airflow in the upper airway using computational fluid dynamics with motion determined from magnetic resonance imaging
- Strategies to increase physical activity in chronic respiratory diseases
- Computational fluid dynamics can detect changes in airway resistance in asthmatics after acute bronchodilation
- Numerical modeling of particle deposition in the conducting airways of asthmatic children
- The effect of inlet velocity profile on the bifurcation COPD airway flow
- Topological analysis of particle transport in lung airways: predicting particle source and destination
- CFD modeling and image analysis of exhaled aerosols due to a growing bronchial tumor: towards non-invasive diagnosis and treatment of respiratory obstructive diseases
- Structural and functional alterations of the tracheobronchial tree after left upper pulmonary lobectomy for lung cancer
- Characterization of air flow and lung function in the pulmonary acinus by fluid-structure interaction in idiopathic interstitial pneumonias
- Airway evaluation in obstructive sleep apnea
- Upper airway imaging in pediatric obstructive sleep apnea syndrome
- Correlation between severity of sleep apnea and upper airway morphology based on advanced anatomical and functional imaging
- Experimental methods for flow and aerosol measurements in human airways and their replicas
- Computed tomography characterization and comparison with polysomnography for obstructive sleep apnea evaluation
- Sleep-related breathing disorders and quality of life
- Effect of airway surface liquid on the forces on the pharyngeal wall: experimental fluid-structure interaction study
- Numerical investigation of airflow, heat transfer and particle deposition for oral breathing in a realistic human upper airway model
- Use of computational fluid dynamics to compare upper airway pressures and airflow resistance in brachycephalic, mesocephalic, and dolichocephalic dogs
- Computational analysis of airflow dynamics for predicting collapsible sites in the upper airways: machine learning approach
- Computational fluid dynamics simulation of full breathing cycle for aerosol deposition in trachea: effect of breathing frequency
- An efficient computational fluid-particle dynamics method to predict deposition in a simplified approximation of the deep lung
- Modeling congenital nasal pyriform aperture stenosis using computational fluid dynamics
- Coupling CFD-DEM with dynamic meshing: a new approach for fluid-structure interaction in particle-fluid flows
- Modelling nasal high flow therapy effects on upper airway resistance and resistive work of breathing
- Computational fluid-structure interaction simulation of airflow in the human upper airway
- Simulation of pharyngeal airway interaction with air flow using low-Re turbulence model
- Numerical simulation of stent deployment within patient-specific artery and its validation against clinical data
- Clinical data are essential to validate lung ultrasound
- Modelling the human pharyngeal airway: validation of numerical simulations using in vitro experiments
- Steady inspiratory flow in a model symmetric bifurcation
- Large eddy simulation and Reynolds-averaged Navier-Stokes modeling of flow in a realistic pharyngeal airway model: an investigation of obstructive sleep apnea
- Validation of computational fluid dynamics methodology used for human upper airway flow simulations
- Validation of airway resistance models for predicting pressure loss through anatomically realistic conducting airway replicas of adults and children
- Large eddy simulation of the pharyngeal airflow associated with obstructive sleep apnea syndrome at pre and postsurgical treatment
- A review of CFD methodology used in literature for predicting thermo-hydraulic performance of a roughened solar air heater
- Assessment of upper airway mechanics during sleep
- Experimental and numerical investigation on inspiration and expiration flows in a three-generation human lung airway model at two flow rates
- Numerical investigation on the flow characteristics and aerodynamic force of the upper airway of patient with obstructive sleep apnea using computational fluid dynamics
- Interaction between a simplified soft palate and compressible viscous flow
- A CFD (computational fluid dynamics) based heat transfer and fluid flow analysis of a solar air heater provided with circular transverse wire rib roughness on the absorber plate
- Development of a realistic human airway model
- Novel imaging techniques using computer methods for the evaluation of the upper airway in patients with sleep-disordered breathing: a comprehensive review
- Flow-induced oscillation of collapsed tubes and airway structures
- A computational study of the respiratory airflow characteristics in normal and obstructed human airways
- Recommendations for simulating microparticle deposition at conditions similar to the upper airways with two-equation turbulence models
- In silico investigation of sneezing in a full real human upper airway using computational fluid dynamics method
- Magnetic resonance sleep studies in the evaluation of children with obstructive sleep apnea
- Imaging the upper airway in patients with sleep disordered breathing
- Image-based computational fluid dynamics modeling in realistic arterial geometries
- Image processing, geometric modeling and data management for development of a virtual bone surgery system
- Computational modeling and validation of human nasal airflow under various breathing conditions
- The numerical computation of turbulent flows
- A new k-ε eddy viscosity model for high Reynolds number turbulent flows
- Turbulence modeling for CFD
- Two-equation eddy-viscosity turbulence models for engineering applications
- A correlation-based transition model using local variables. Part I: model formulation
- A correlation-based transition model using local variables. Part II: test cases and industrial applications
- Comparison of Reynolds-averaged Navier-Stokes (RANS) turbulent models in predicting wind pressure on tall buildings
- Direct numerical simulation of developed compressible flow in square ducts
- Direct numerical simulations of droplet condensation
- Evaluation of pressure oscillations by a laboratory motor
- Direct numerical simulation of flow over a slotted cylinder at low Reynolds number
- A fast algorithm for direct numerical simulation of turbulent convection with immersed boundaries
- On locating the obstruction in the upper airway via numerical simulation
- CFD analysis of drift eliminators using RANS and LES turbulent models
- Advances and challenges of applied large-eddy simulation
- Status and future developments of large-eddy simulation of turbulent multifluid flows (LEIS and LESS)
- Large-eddy simulation: past, present and the future
- Computational fluid dynamics modeling of the upper airway of children with obstructive sleep apnea syndrome in steady flow
- Fluid-structure analysis of microparticle transport in deformable pulmonary alveoli
- Numerical simulation of soft palate movement and airflow in human upper airway by fluid-structure interaction method
- Evaluation of obstructive sleep apnea syndrome by computational fluid dynamics
- Computational fluid dynamic analysis of the posterior airway space after maxillomandibular advancement for obstructive sleep apnea syndrome
- Simulation of upper airway occlusion without and with mandibular advancement in obstructive sleep apnea using fluid-structure interaction
- Fluid structure interaction simulations of the upper airway in obstructive sleep apnea patients before and after maxillomandibular advancement surgery
- Modeling of the fluid structure interaction of a human trachea under different ventilation conditions
- Study of the upper airway of obstructive sleep apnea patient using fluid structure interaction
- Computational fluid dynamics simulation of the upper airway response to large incisor retraction in adult class I bimaxillary protrusion patients
- A review of fluid-structure interaction simulation for patients with sleep related breathing disorders with obstructive sleep
- Planning human upper airway surgery using computational fluid dynamics
- Functional respiratory imaging as a tool to assess upper airway patency in children with obstructive sleep apnea
- Computational fluid dynamics for the assessment of upper airway response to oral appliance treatment in obstructive sleep apnea
- Large eddy simulation of flow in realistic human upper airways with obstructive sleep
- Computational fluid dynamics endpoints for assessment of adenotonsillectomy outcome in obese children with obstructive sleep apnea syndrome
- Investigation of flow pattern in upper human airway including oral and nasal inhalation by PIV and CFD
- Numerical investigation of flow characteristics in the obstructed realistic human upper airway
- Capturing the wall turbulence in CFD simulation of human respiratory tract
- Computational fluid dynamics analysis of uvulopalatopharyngoplasty in obstructive sleep apnea syndrome
- Correlations between obstructive sleep apnea and adenotonsillar hypertrophy in children of different weight status
- Effects of surgically assisted rapid maxillary expansion on the modification of the pharynx and hard palate and on obstructive sleep apnea, and their correlations
- Effects of glottis motion on airflow and energy expenditure in a human upper airway model
- Impact of sleeping position, gravitational force & effective tissue stiffness on obstructive sleep apnoea

key: cord-268142-lmkfxme5
authors: Schafrum Macedo, Aline; Cezaretti Feitosa, Caroline; Yoiti Kitamura Kawamoto, Fernando; Vinicius Tertuliano Marinho, Paulo; dos Santos Dal-Bó, Ísis; Fiuza Monteiro, Bianca; Prado, Leonardo; Bregadioli, Thales; Antonio Covino Diamante, Gabriel; Ricardo Auada Ferrigno, Cassio
title: Animal modeling in bone research—should we follow the white rabbit?
date: 2019-09-26
journal: Animal Model Exp Med
doi: 10.1002/ame2.12083
sha:
doc_id: 268142
cord_uid: lmkfxme5

Animal models are live subjects used in translational research. They provide insights into human diseases and enhance biomedical knowledge. Livestock production has favored the pace of human social development over millennia. Today's society is more aware of animal welfare than past generations: the general public has marked objections to animal research, and many species are falling into disuse. The search for an ideal methodology to replace animal use is on, but animal modeling still holds great importance for human health. Bone research, in particular, has unmet requirements that in vitro technologies cannot yet fully address. In that sense, standardizing novel models remains necessary, and rabbits are gaining popularity as potential bone models. Our aim here is to provide a broad overview of animal modeling and its ethical implications, followed by a narrower focus on bone research and the role rabbits are playing in the current scenario.

Rabbits have been used for decades by researchers in diverse scientific fields. However, only recently have they been targeted as potential bone models, [8] with great importance in age-related bone loss research. [9, 10] Here, we first present a broad historical review and some key ethical points in animal modeling. We then take a closer look at bone research and the role rabbits play in this field.

Animal domestication was a significant turning point for mankind. Human society developed into what it is today thanks to livestock production, [2] and animals still provide us with food, clothing, transportation, protection, and companionship. [2, 11] Nowadays they contribute to human well-being in additional ways: by helping people with visual impairment or diabetes, by taking part in police enforcement, or even by entertaining people in animal shows, zoos, and social media.
Animals have also been pivotal to our medical knowledge and health status since ancient Greece. [3, 9] The first animal studies provided an understanding of biological pathways and disease mechanisms, and animal dissection proved to be a valuable substitute for human dissection, an illegal practice in ancient times. [12] Several philosophers and physicians, from Aristotle to Diocles and Erasistratus, experimented on animals. Alcmaeon of Croton (305-240 BC) was the first physician to document and publish anatomical observations of canine dissections; [11, 13] he established that the brain controls intelligence and sensory perception. [13] Centuries later, Aelius Galenus (also known as Galen of Pergamon, 129-216 AD) made pivotal discoveries based on animal experimentation. [4] Galen served as a doctor to several Roman emperors, and his public demonstrations of cutting the laryngeal nerves of squealing pigs made him famous. He also made important anatomical observations on the cranial and spinal nerves. [14] He included findings from more than 80 animal species. [2, 16] In the late nineteenth century, Claude Bernard laid the foundations of experimental medicine by developing rigorous guidelines for controlled studies. [2, 4, 11] Animal-based research has been the cornerstone of the health sciences ever since; it accounts for more than 80% (180/216) of all Nobel laureates' studies in physiology or medicine. [17] Research on the diphtheria vaccine, developed in guinea pigs (Cavia porcellus), received the very first prize in 1901. Other fundamental discoveries, such as the insulin mechanism and Pasteur's and Koch's studies, are also credited to animal research. [2, 12, 17] Animal welfare has not always been a concern: proper acknowledgment of animals' moral status as sentient beings is a recent development. [2, 3] For most of history, animals were considered insensible to pain and were treated with little or no respect in research, teaching, and demonstrations.
For centuries, animals were perceived mainly as useful tools. [2, 16] Most Greek philosophers excluded animals from moral judgment, especially those following Stoic and Epicurean beliefs, [6] although other philosophical strands, such as Cynicism, were more empathetic to the well-being of animals. Nevertheless, the assumptions that animals are entitled to ethical consideration and can indeed perceive pain and negative feelings only emerged during the Renaissance. [2, 11, 16] The French philosopher René Descartes (1596-1650) acknowledged that animals could perceive sensations, but in a purely mechanical way. Based on this Cartesian perspective, scientists justified the use of animals without concern for their feelings for centuries afterwards. [2, 3, 6] When William Harvey demonstrated blood circulation on conscious dogs, the attending public believed the painful screams were part of a "beast machinery," like an automatic sound. [18, 19] Only in the second half of the nineteenth century, in Victorian Europe, were animal rights debated among mainstream philosophers. [2, 18] Jeremy Bentham's Introduction to the Principles of Morals and Legislation (1789) was a turning point, [20] and the empathetic attitudes of influential thinkers like Rousseau and Schopenhauer helped shape a new approach towards animal welfare. [3, 6] Darwin's evolutionary insights (published in 1859) emphasized our moral duty towards animals. [1-3, 14] The Cruelty to Animals Act, passed in 1876, was the first official legal document to set boundaries on animal experimentation. [21] However, the dominant approach to animal research remained utilitarian. [2, 16] In the late 1950s, Russell and Burch developed the "three Rs" concept to rationalize animal use by replacing, reducing, and refining resources. [12] These guidelines aim to minimize animal distress and emphasize our duty to search for alternative technologies. Bioethical principles are now mandatory for any animal experimentation.
Today, the internet reflects public opinion on animal welfare, and the attitude of young people towards animals is much more empathetic than in previous generations. [2] Consequently, bioresearch elicits heated debates, and some groups with radical views advocate banning animal research altogether. Nevertheless, the unlimited potential and importance of animal-based discoveries cannot be denied. [12] Five key bioethical points are considered when assessing the moral status of animal subjects in research: the presence of life, the ability to feel and perceive stimuli, the level of cognitive behavior, the degree of sociability, and the ability to proliferate. [16] Scientific proof of animal consciousness and sentience is a recent achievement. [18] However, there is no global consensus on the value people attach to particular animals: in some cultures, the Western household dog is no more than a food source. The same is true for research; using animals like monkeys, dogs, or cats as models is likely to evoke adverse reactions nowadays. This social perception of an animal's "worthiness" is called "speciesism." [22] At this point in time, animal research cannot be entirely replaced by in vitro testing, so developing alternative methods is essential. Scientists can now create and cultivate microfluidic organ-on-a-chip models, but these new technologies are still under development; hopefully, future studies will provide the means to replace animal experiments. [23] Until then, ethical treatment and rational use of all living forms are still necessary. [4, 22] In that context, characterizing alternative models remains a goal. Rabbits, for instance, may be potentially useful bone models: they are already used as laboratory subjects in several medical fields, and even though they are also prized as household pets, particularly in Europe, their use in the laboratory is well accepted. [2, 24] Many species can be suitable models for different diseases.
The research question will dictate what type of model should be considered. Undoubtedly, rodents are the most popular laboratory subjects worldwide. Rats (Rattus norvegicus) have been part of medical studies since the nineteenth century (1828), [25] reaching peak importance with the development of the Wistar strain in 1909. [2] Although Mendel started studying the laws of inheritance in mice (Mus musculus), he shifted his methods to peas after facing religious restrictions on his animal model. [5, 6] Rodents became the standard choice for genetic experimentation after Watson and Crick published their DNA study. [2] During the 1980s, the first "gene knockout" mouse was developed, a study that won a Nobel Prize. [2, 17] Using models is very attractive because one can easily ensure homogeneity between subjects, unachievable otherwise, so that future studies can reproduce similar conditions. [11, 25] For obvious reasons, the greater the model's similarity to humans, the greater the moral implications. [6] The planning phase is the moment to define the best model to answer the research question, avoiding unnecessary enrollments. [16] The "ideal model" does not exist: no single animal, aside from humans, can perfectly exhibit human responses. [26] Researchers must choose the most suitable option, considering the objectives of the study, [27] and careful planning is mandatory. It should be kept in mind that sometimes more than one type of model may be necessary to answer the research question. [19] Multi-level assessment is required to identify the possible advantages and challenges of any given model, and Table 1 provides a template guide. Animal models have taught us much about bone disorders and have been central to developing many treatments throughout history. Their contribution remains paramount for assessing bone physiology and immunology, since in vitro alternatives cannot fully reproduce whole-organism physiological behavior. They remain beneficial to the whole orthopedic field.
Whether by mimicking diseases in arthrology and oncology studies or by allowing surgical training, animals are still essential to medicine. [28] Nonhuman primates are our best biological representation, [29] and for that reason their use for scientific purposes nowadays elicits public debate. Aside from the moral implications, their size and the difficulty of handling them in experiments are obstacles, as are their financial demands. Working with primates also requires very well-trained staff (owing to their unpredictable aggressive behavior and zoonotic potential), which limits their research potential. [2]

[Table 1. Schematic compilation of traits and possible challenges (purpose/approach) to consider when planning to use an animal model. [2, 6, 11]]

Our second closest model in terms of bone structure is the dog. [29] Despite individual variations in macrostructure, canine bone remodeling is somewhat similar to ours, and dogs exhibit a similar Haversian structure. Dogs used to be popular research subjects due to their medium size, ease of handling, and docile behavior, [6, 30] but today these classical models are no longer feasible. [2, 30] Over recent decades, a paradigm shift regarding animal use in research has occurred: the fields of laboratory animal science, animal welfare, and alternative methods for replacing animal use have expanded considerably to overcome the lack of public acceptance of the classical models. One of the most studied, and most prevalent, disorders nowadays is osteoporosis. [31] Age-related osteopenia is a public health concern of growing importance; demographic aging and the urban lifestyle of Western societies have led to this modern disease. The World Health Organization considers osteoporosis a significant age-related disease and has developed global strategies for its prevention, management, and surveillance. [32] Osteoporosis unbalances bone formation and resorption and decreases bone mass. The weakened bones are more prone to fracture, even with low-impact injuries.
Pathological fractures occur mainly at the hip joint and vertebrae, and they may even go unnoticed in elderly patients. [33] The domestic rabbit (Oryctolagus cuniculus) is a small digging lagomorph of the family Leporidae. In the modern age there are only two living families, Leporidae (rabbits and hares) and Ochotonidae (pikas), with 13 genera currently recognized, [40] and more than 60 rabbit breeds exist worldwide. Rabbits exhibit desirable traits for bone research: these calm and easily handled creatures have a short lifespan and breed readily in captivity. [10] The New Zealand White rabbit is the most popular research breed. Furthermore, rabbits are phylogenetically closer to primates than rodents are. They reach skeletal maturity between 20 and 30 weeks of age (females earlier). [28] Adults display some Haversian remodeling, and their bone metabolism is somewhat similar to that of humans. However, surgical castration alone does not produce satisfactory bone loss, and other techniques must be combined with it. [6, 38] Rabbits display less cancellous bone than humans [38, 41] and have more fragile cortices. [29, 42] Cortical thickness and the diameter of drilled holes contribute to the high complication rate of fracture repair in this species. [43] Their functional anatomy allows their peculiar high-speed hopping to evade predators. Cage confinement and exercise restriction might be harmful to their bone development, [44] and researchers should consider alternative housing systems rather than small-cage confinement. [45] In their natural habitat rabbits are a prey species, which explains their curious but easily scared behavior and some anatomical features that enable them to escape at high speed when in danger. Their peculiar appendicular skeleton (Figure 1) must be lightweight yet resistant, to allow their burrowing and food-seeking behaviors. [46] Their hindlimbs have high-power hip extensor muscles concentrated at the proximal part.
Muscle mass in the front limbs is distributed more distally and accounts for approximately 35% of the total body mass. [47] The fibula fuses to the middle of the tibial shaft, and the four long webbed toes on each hindlimb allow accelerated digitigrade hopping. Their small clavicles resemble those of domestic cats and make them more agile. [48, 49] A survey of the terms "rabbit" and "experimental model" in PubMed returned 33,344 articles in indexed journals published between 1951 and 2019, almost 10,000 of them from the past decade. Rabbits were pivotal to the discovery of the atropine esterase enzyme in the nineteenth century, [50] and they have since been used in several studies by Nobel laureates, helping to characterize the mechanisms involved in insulin production and diabetes. [8, 17, 51] Rabbits are appealing models for bone research. Studies involving rabbits are now commonplace in orthopedics, and multi-species assessments of model suitability have rated rabbits as potential bone models after primates and dogs. [52] Biomechanical forces act during stance and walking in any living animal, and measuring these forces is important for determining a model's bone strength; [53] however, biomechanical data on rabbit bones are still scarce. A 2012 study reported the effects of in vivo loading of rabbit tibiae, providing biomechanical data on axial compression and bending moments in the rabbit tibia; the authors concluded that the rabbit tibia can endure higher strain levels than the goat tibia, and that rabbits were therefore the better model. [54] In another study, describing the qualitative differences between mice, rats, dogs, nonhuman primates, and rabbits, the authors concluded that the skeletal characteristics of rabbits were the least suitable for extrapolating to humans, but highlighted the lack of biomechanical data. [52] Rabbits are a standard model in periodontal research.
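To make the "axial forces and bending moments" terminology above concrete, here is a back-of-the-envelope beam calculation of the kind such loading studies report. The bone is idealized as a hollow circular cylinder; all dimensions and loads are hypothetical values chosen only to illustrate the mechanics, not numbers from the cited rabbit-tibia studies.

```python
import math

# Peak normal stress at the mid-shaft of a long bone idealized as a hollow
# circular beam under combined axial force and bending:
#   sigma = F/A + M*c/I
def bending_plus_axial_stress(F, M, d_outer, d_inner):
    """Return peak normal stress (Pa).

    F        axial force (N)
    M        bending moment (N*m)
    d_outer  outer diameter (m)
    d_inner  inner (medullary) diameter (m)
    """
    r_o, r_i = d_outer / 2, d_inner / 2
    area = math.pi * (r_o ** 2 - r_i ** 2)            # cross-sectional area
    inertia = math.pi / 4 * (r_o ** 4 - r_i ** 4)     # second moment of area
    return F / area + M * r_o / inertia

# Hypothetical mid-shaft loading: 50 N axial force, 0.5 N*m bending moment,
# 6 mm outer and 4 mm inner diameter.
sigma = bending_plus_axial_stress(F=50.0, M=0.5, d_outer=6e-3, d_inner=4e-3)
```

Even in this toy example the bending term dominates the axial term by roughly an order of magnitude, which is why bending moments, rather than pure compression, usually drive peak cortical stresses in slender long bones.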
They are part of diverse studies, such as measurements of parathyroid hormone effects on osseointegration in osteoporosis. [55] Some recent studies have explored the potential of rabbits as models for cartilage [60] and meniscal tear [61] repair. They have also been used in other studies on arthrology and tendon healing: one study focused on intra-articular injections of chondroitin sulfate carried by hydrogel, [62] while others assessed tendon healing by reproducing biceps tenosynovitis, [63] anterior cruciate ligament tears, [64] and rotator cuff tears. [65] Rabbits have also grown in importance as pets. They are the third most popular companion animal in the UK, after dogs and cats, with more than two million pet rabbits estimated over the past decade, [66] and they are the most popular exotic animal in US private veterinary practice. [24] In view of these trends, the demand for higher standards of rabbit medicine is increasing, and with it the need to enhance veterinary knowledge. [24] More recent studies focus on clinical and surgical aspects of the pet rabbit. [24, 43, 67-72] In a recent paper, the authors evaluated the effect of three different screw-hole diameters on the torsional properties of rabbit femora; [43] however, more in-depth biomechanical studies are lacking. Data on torsional properties are scarce, [73-75] and the main focus of those studies was bone healing [74] and bone grafting. [75] Fracture repair in the pet rabbit remains a major challenge: [68] rabbit bones are very thin and brittle, an important complicating factor that results in frequent implant failure. [43, 76] Another study defined safe vertebral corridors for implant insertion using computed tomography, [77] but rabbit research still has unexplored gaps to be addressed. The human-animal bond has sculpted medical knowledge, and animal models play a significant role in enhancing our understanding of emerging pathologies.
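A rough illustration of why a drilled screw hole weakens a thin-walled bone in torsion, complementing the screw-hole study mentioned above: nominal shear stress follows T·r/J, and the hole adds a local stress concentration. The geometry, torque, and the assumed concentration factor (Kt = 4, the textbook value for shear around a small circular hole) are illustrative assumptions, not measurements from the cited rabbit femur work.

```python
import math

# Peak shear stress in a hollow circular shaft under torsion, with an
# optional multiplicative stress-concentration factor Kt for a small
# transverse hole: tau = Kt * T * r / J.
def torsional_shear(T, d_outer, d_inner, kt=1.0):
    """Return peak shear stress (Pa) for torque T (N*m)."""
    r_o, r_i = d_outer / 2, d_inner / 2
    J = math.pi / 2 * (r_o ** 4 - r_i ** 4)   # polar second moment of area
    return kt * T * r_o / J

T = 0.3                                              # applied torque (N*m)
tau_intact = torsional_shear(T, 6e-3, 4e-3)          # no hole
tau_drilled = torsional_shear(T, 6e-3, 4e-3, kt=4.0) # near the hole edge
```

Under these assumptions, the local shear stress at the hole edge is several times the nominal value, consistent with the observation that drilled holes raise the complication rate of fracture repair in thin rabbit cortices.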
current in vitro technologies are very promising but still have some way to go before fully replicating whole-animal responses. rabbits have potential as bone models but conclusive studies are still lacking. however, the growing popularity of rabbits as pets may ultimately decrease their eligibility as laboratory models. the need for alternative methods to replace animals in research remains paramount. none. introduction to "working across species animal experiments in biomedical research: a historical perspective a brief history of animal modeling the development and application of laboratory animal science in china evaluation of bone regeneration using the rat critical size calvarial defect animal models for implant biomaterial research in bone: a review clinician's guide to prevention and treatment of osteoporosis the laboratory rabbit: an animal model of atherosclerosis research characterization of a rabbit osteoporosis model induced by ovariectomy and glucocorticoid an ovariectomy-induced rabbit osteoporotic model: a new perspective the rational use of animal models in the evaluation of novel bone regenerative therapies moral status as a matter of degree alkmaion's discovery that brain creates mind: a revolution in human knowledge comparable to that of copernicus and of darwin galen and the squealing pig historical perspective -andreas vesalius (1514-1564) research ethics in animal models foundation n. nobel prizes. 
animal consciousness, cognition and welfare
animal models are essential to biological research: issues and perspectives
the collected works of jeremy bentham: an introduction to the principles of morals and legislation
common morality, coherence, and the principles of biomedical ethics
the moral status of animals and their use in research: a philosophical review
microfluidic organ-on-a-chip models of human intestine
standards of care in the 21st century: the rabbit
searching for animal models and potential target species for emerging pathogens: experience gained from middle east respiratory syndrome (mers) coronavirus. one health
preclinical models for orthopedic research and bone tissue engineering
osteoporosis - bone remodeling and animal models
the domestic rabbit, oryctolagus cuniculus: origins and history
an interspecies comparison of bone fracture properties
interspecies differences in bone composition, density, and quality: potential implications for in vivo bone research
the recent prevalence of osteoporosis and low bone mass in the united states based on bone mineral density at the femoral neck or lumbar spine
management of osteoporosis
who. prevention and management of osteoporosis: report of a who scientific group. geneva: world health organization
cortical bone porosity: what is it, why is it important, and how can we detect it?
inter-trabecular angle: a parameter of trabecular bone architecture in the human proximal femur that reveals underlying topological motifs
odanacatib, effects of 16-month treatment and discontinuation of therapy on bone mass, turnover and strength in the ovariectomized rabbit model of osteopenia
animal models for fracture treatment in osteoporosis
delay in estrogen commencement is associated with lower bone mineral density in turner syndrome
bone mineral measurements of subchondral and trabecular bone in healthy and osteoporotic rabbits
the mouse ascending: perspectives for human-disease models
the domestic rabbit, oryctolagus cuniculus: origins and history
animal models for glucocorticoid-induced postmenopausal osteoporosis: an updated review
sex-related variations in bone microstructure of rabbits intramuscularly exposed to patulin
effects of hole diameter on torsional mechanical properties of the rabbit femur
comparative anatomical and biochemical studies on the main bones of the limbs in rabbit and cat as medicolegal parameters
a meta-analysis on the effects of the housing environment on the behaviour, mortality, and performance of growing rabbits
bsava manual of rabbit medicine and surgery
functional specialisation of the thoracic limb of the hare (lepus europeus)
the biology of the laboratory rabbit
anatomy, physiology, and behavior
seasonal and sexual influences on rabbit atropinesterase
the new zealand white rabbit as a model for preclinical studies addressing tissue repair at the level of the abdominal wall
comparative bone anatomy of commonly used laboratory animals: implications for drug discovery
biomechanics of bone: determinants of skeletal fragility and bone quality
axial forces and bending moments in the loaded rabbit tibia in vivo
effects of continual intermittent administration of parathyroid hormone on implant stability in the presence of osteoporosis: an in vivo study using resonance frequency analysis in a rabbit model
physico-chemical and histomorphometric evaluation of zinc-containing hydroxyapatite in rabbits calvaria
bone healing improvements using hyaluronic acid and hydroxyapatite/beta-tricalcium phosphate in combination: an animal study
hemiarthroplasty of the shoulder joint using a custom-designed high-density nano-hydroxyapatite/polyamide prosthesis with a polyvinyl alcohol hydrogel humeral head surface in rabbits
bioactive scaffolds for osteochondral regeneration
repair of osteochondral defects with biodegradable hydrogel composites encapsulating marrow mesenchymal stem cells in a rabbit model
early in situ changes in chondrocyte biomechanical responses due to a partial meniscectomy in the lateral compartment of the mature rabbit knee joint
intra-articular delivery of chondroitin sulfate for the treatment of joint defects in rabbit model
comparison of bone tunnel and cortical surface tendon-to-bone healing in a rabbit model of biceps tenodesis
a versatile protocol for studying anterior cruciate ligament reconstruction in a rabbit model
into-tunnel repair versus onto-surface repair for rotator cuff tears in a rabbit model
approach to preventive health care and welfare in rabbits
pet rabbits
comparison of two methods of long bone fracture repair in rabbits
survey of the husbandry, health and welfare of 102 pet rabbits
welfare assessment in pet rabbits
anesthesia and analgesia in rabbits and rodents
case report: surgical correction of patellar luxation in a rabbit
analysis of mechanical symmetry in rabbit long bones
comparison of cyclic loading versus constant compression in the treatment of long-bone fractures in rabbits
physical properties of autoclaved bone: torsion test of rabbit diaphyseal bone
bsava manual of rabbit surgery, dentistry and imaging
computed tomographic study of safe implantation corridors in rabbit lumbar vertebrae
key: cord-299852-t0mqe7yy authors: janssen, loes h. c.; kullberg, marie-louise j.; verkuil, bart; van zwieten, noa; wever, mirjam c. m.; van houtum, lisanne a.
e. m.; wentholt, wilma g. m.; elzinga, bernet m. title: does the covid-19 pandemic impact parents’ and adolescents’ well-being? an ema-study on daily affect and parenting date: 2020-10-16 journal: plos one doi: 10.1371/journal.pone.0240962 sha: doc_id: 299852 cord_uid: t0mqe7yy due to the covid-19 outbreak in the netherlands (march 2020) and the associated social distancing measures, families were forced to stay at home as much as possible. adolescents and their families may be particularly affected by this enforced proximity, as adolescents strive to become more independent. yet, whether these measures impact emotional well-being in families with adolescents has not been examined. in this ecological momentary assessment study, we investigated whether the covid-19 pandemic affected positive and negative affect of parents and adolescents and parenting behaviors (warmth and criticism). additionally, we examined possible explanations for the hypothesized changes in affect and parenting. to do so, we compared daily reports on affect and parenting that were gathered during two periods of 14 consecutive days, once before the covid-19 pandemic (2018–2019) and once during the covid-19 pandemic. multilevel analyses showed that only parents’ negative affect increased as compared to the period before the pandemic, whereas this was not the case for adolescents’ negative affect, positive affect or parenting behaviors (from both the adolescent and parent perspective). in general, intolerance of uncertainty was linked to adolescents’ and parents’ negative affect and adolescents’ positive affect. however, neither intolerance of uncertainty nor any pandemic-related characteristics (i.e. living surface, income, relatives with covid-19, hours of working at home, helping children with school and contact with covid-19 patients at work) were linked to the increase of parents’ negative affect during covid-19.
it can be concluded that on average, our sample (consisting of relatively healthy parents and adolescents) seems to deal fairly well with the circumstances. the substantial heterogeneity in the data, however, also suggests that whether or not parents and adolescents experience (emotional) problems can vary from household to household. implications for researchers, mental health care professionals and policy makers are discussed. since march 2020, the coronavirus disease 2019 has been referred to as a pandemic by the world health organization [1] . to slow the spread of covid-19, national governments have taken radical measures to minimize social interactions by closing public places, demanding people to keep physical distance and stay at home and, in some countries, by enforcing 'full lockdown'. in the netherlands, on march 15th 2020, measures of social distancing forced all dutch citizens to stay home and work remotely as much as possible; public spaces (e.g. schools, offices, parts of public transport, theatres) were closed and public gatherings were prohibited (see fig 1 for a timeline). these measures of social distancing (a so-called 'lockdown') created drastic changes in daily social life; distinct domains such as family life, school, and work suddenly coincided and families faced an unforeseen increase in hours spent together under the same roof. adolescents and their families may be particularly affected by this enforced proximity, as adolescents strive to become independent and focus more on socializing and spending time with friends rather than with their families [2, 3] . to that end, this study aimed to investigate the well-being of adolescents and their parents and parenting behaviors during the covid-19 pandemic, and explored daily difficulties and helpful activities during the pandemic linked to their well-being.
for some families, spending more time together during a lockdown may bring family members closer towards each other and foster a sense of well-being. however, several factors that are emblematic for the covid-19 crisis, such as financial insecurity, concerns about own and others' health, uncertainty about quarantine duration, lack of social and physical activities, and boredom have all frequently been shown to negatively affect a person's mood and mental well-being [4] [5] [6] [7] [8] . moreover, parents and adolescents may also experience stress because they are faced with more daily hassles (e.g. a suboptimal work or school environment) and additional tasks (e.g. parents homeschooling their children or caring for significant others). previous studies have shown that the impact of these quarantine-related factors on mental health outcomes (e.g. depressive symptoms, anxiety, and ptsd) can be wide-ranging, substantial and long-lasting (see the review of brooks et al. [9] ). as a consequence, these confinements may also lead to more tension, irritability, family conflicts, and at worst, domestic violence or child abuse [10] . one of the key questions raised by governmental agencies and health care workers is to what extent the covid-19 pandemic and the associated distancing measures affect families' well-being and parenting behaviors. in this study, dutch adolescents and their parents filled in 14 days of ecological momentary assessments (ema; [11] ) twice, before the covid-19 outbreak (2018-2019) and during the covid-19 pandemic (14-28 april 2020). in addition, we asked parents and adolescents about daily difficulties and helpful activities during the covid-19 pandemic that possibly influenced their affect in positive and negative ways. this enabled us to investigate how and to what extent well-being and parenting behaviors in daily life were impacted by the covid-19 pandemic and the related social distancing measures.
by providing more insight into these processes, our findings can contribute to formulating recommendations for policy makers and mental health professionals. individuals' affect states are not one-dimensional and static in nature, but can fluctuate from moment to moment in response to other individuals and external circumstances (e.g., [12] ). positive and negative affect reflect a person's momentary mood state. both positive and negative affect have implications for health and well-being over time for adults and adolescents [13] [14] [15] [16] [17] [18] . positive affect predominantly generates action, motivation, social connectedness and cognitive flexibility, whereas negative affect might result in actions such as avoidance, attack, or expulsion [19, 20] . using momentary assessments enabled us to identify the potential impact of the pandemic on parents' and adolescents' positive and negative affect in daily life without the potential bias of retrospective recall. the covid-19 pandemic and the related social measures might also impact parenting behaviors, such as the amount of expressed warmth and criticism. parental warmth is typically considered one of the primary dimensions of sensitive parenting behavior and can include acceptance, support, and positive involvement towards the child [21] . parental criticism can be defined as expressing negativity, disapproval, or dissatisfaction to a child [22] . psychological distress related to the covid-19 pandemic may influence parenting behaviors, with parents being more emotionally withdrawn or critical and irritated, instead of being supportive, sensitive and encouraging to the child [23] . previous studies have shown that especially positive mood of family members is closely related to warm family interactions, whereas negative mood is related to withdrawal from interactions [19, [24] [25] [26] . however, no prior studies have examined the effects of a situation comparable to the current covid-19 pandemic on parenting.
therefore, in addition to its impact on affect, we also aimed to investigate the impact of the covid-19 pandemic and its consequences on parental warmth and criticism in daily life. since parenting is a dynamic process [16] , we examined day-to-day parental warmth and criticism. furthermore, as perspectives from parents and adolescents on parenting might differ (e.g., [27] ), we examined both the parent and adolescent perspective on parental warmth and criticism. a crucial aspect of unforeseen stressful situations, such as the covid-19 pandemic, is uncertainty. uncertainty is one of the key determinants of experienced levels of stress [28] [29] [30] . moreover, the ability to deal with uncertainty varies widely. while some people can tolerate uncertainty very well, others have difficulties tolerating uncertainty and try to avoid it as much as possible [31] [32] [33] . intolerance of uncertainty (iu) is described as a predisposition to negatively perceive and respond to uncertain information and situations, irrespective of their probability and outcomes [34, 35] . as the worldwide covid-19 pandemic influenced daily life for all people, escaping from the accompanying uncertainty is deemed impossible. consequently, parents and adolescents with higher levels of iu might experience greater distress under the current circumstances, which might in turn also impact their affect and parenting behaviors. no prior studies have investigated the relation between iu and daily affect and parenting behavior within the family context. this was pursued in the present study. in light of the pandemic, we also examined to what extent iu is related to a change in affect and parenting behaviors. in the present study, we examined the impact of the covid-19 pandemic on daily affect and parenting of both dutch parents and adolescents.
the aims were: (1) to explore parents' and adolescents' daily difficulties and helpful activities during the covid-19 pandemic, (2) to examine and compare positive and negative affect of both parents and adolescents during 2 weeks of the covid-19 pandemic and a similar 2-week period pre-pandemic (from now on referred to as baseline), (3) to examine and compare (perceived) parenting behaviors in terms of parental warmth and criticism towards the adolescent (as assessed by both the adolescent and the parent) during 2 weeks of the covid-19 pandemic and a similar 2-week period pre-pandemic, (4) to examine whether parents' and adolescents' levels of iu at baseline are associated with affect and parenting behaviors in general, and (5) whether they are associated with the hypothesized changes in affect and (perceived) parental warmth and criticism. we expect an increase in negative affect and a decrease in positive affect for both parents and adolescents during the covid-19 pandemic as compared to baseline. regarding parenting behaviors, we expect lower levels of parental warmth and higher levels of parental criticism during the covid-19 pandemic as compared to baseline, both from the perspective of parents and adolescents. with respect to iu, we expect that higher levels of iu predict higher levels of negative affect and lower levels of positive affect in parents and adolescents at both time points, as well as a greater increase in negative affect and decrease in positive affect during the covid-19 pandemic compared to baseline. the current study was based on baseline data of the ongoing dutch multi-method two-generation re-pair study: 'relations and emotions in parent-adolescent interaction research' and on the follow-up assessment 're-pair during the covid-19 pandemic'. in re-pair, we examine the relation between parent-child interactions and adolescent mental well-being. the study design and in- and exclusion criteria of the baseline assessment can be found in s1 text.
the current study included data from adolescents without psychopathology and their parents (i.e., healthy control families). inclusion criteria for the adolescents to participate in the current study at baseline were: being aged between 11 and 17 years, living at home with at least one primary caregiver, going to high school or higher education, and a good command of the dutch language. adolescents were excluded if they had a current mental disorder, a life-time history of major depressive disorder or dysthymia, or a history of psychopathology in the past two years. adolescent psychopathology was assessed at baseline during a face-to-face interview using the structured interview of the kiddie-schedule for affective disorders and schizophrenia-present and lifetime version (k-sads-pl [36] ). for parents, no in- or exclusion criteria were specified, except for a good command of the dutch language. to participate in the follow-up during the covid-19 pandemic, the adolescent had to still live at home with at least one caregiver. adolescents and parents were allowed to sign up individually. of the 80 adolescents and 151 parents who were contacted for the follow-up assessment during the covid-19 pandemic, 51 individuals (14 adolescents and 37 parents) did not respond to any of the researchers' attempts at contact. of the individuals who did respond, 76 (31 adolescents and 45 parents) were not willing to participate. reasons were: being busy and having other priorities (i.e., work, school, taking care of children or parents). the remaining 104 participants gave consent to participate. two participants did not start the ema and one participant did not complete the measures; hence, the final sample of the current study included 101 participants, consisting of 34 adolescents and 67 parents. descriptive statistics of the current sample are described in the result section and in table 1 .
recruitment of the participants was done via social media, advertisements, and flyers, with a specific focus on the inclusion of both parents (i.e., mothers and fathers). the focus was on primary caregivers, so not only biological parents could participate, but also stepparents and guardians, as long as they played an important role in the upbringing of the adolescent. interested families could sign up for the study via the website or mail and received information letters. approximately two weeks later, families were contacted by phone by one of the researchers to provide them with more information and check the inclusion criteria. if all criteria were met, families could participate in the study. all participants signed informed consent (including consent to contact them to request to participate in follow-up research). in addition, for adolescents younger than 16 years of age, both parents with legal custody signed informed consent. the families completed the ema in the period between september 2018 and november 2019, with ema not taking place during holidays and exam weeks of the adolescent. instructions on the ema were given face-to-face prior to the baseline assessment and researchers assisted with installing the ethica app [37] on the smartphone of the adolescent and both parents. each family member also received written instructions and their individual account information. for participation in the ema, parents received €20,- and adolescents €10,-. in addition, four gift vouchers of €75,- were raffled based on compliance. all families who participated at baseline were invited for the follow-up in april 2020. the follow-up assessment was announced in a newsletter followed by a personal e-mail, and reminders were sent to parents and adolescents who had not responded yet.
parents and adolescents who agreed to participate were sent an online questionnaire on demographic characteristics and general mental well-being. thereafter, participants received written instructions on how to download and reinstall the ethica app. ema data collection took place one month into the lockdown, from april 14th to april 28th. for participation in the follow-up assessment, parents received €20,- and adolescents €10,- in gift vouchers. the current study focuses on the ema data of the baseline assessment (2018-2019) and the follow-up assessment (2020). the re-pair study was approved by the medical ethics committee of leiden university medical center (lumc) in leiden, the netherlands (nl62502.058.17) and the follow-up assessment 're-pair during the covid-19 pandemic' was approved by the psychology research ethics committee of leiden university in leiden, the netherlands (2020-03-30-b.m. elzinga-v2-2334). ema. the ema procedures and set-ups were almost entirely similar at baseline and during the covid-19 pandemic and consisted of filling out questionnaires at four timepoints per day, for 14 consecutive days, on parents' and adolescents' own smartphones using the mobile app ethica (ethica data, 2019). at all timepoints participants completed questions about their affect and how they experienced contact with the last person they interacted with. detailed information on the concepts in the questionnaires, triggering schedules, differences in set-up, number of items and completing time, and monitoring process can be found in s2 text. compliance. the overall response rate at baseline was 81.0%. adolescents affect. momentary affect states of parents and adolescents were assessed four times per day with a slightly adapted and shortened four-item version of the positive and negative affect schedule for children (panas-c; [38, 39] ). at each timepoint participants were asked "how do you feel at the moment?"
followed by two positive affect states, "happy" and "relaxed", and two negative affect states, "sad" and "irritated". each affect state was rated on a 7-point likert scale, ranging from 1 (not at all) to 7 (very). a mean score of the positive affect states was calculated per moment to create a momentary pa scale and a mean score of the negative affect states was calculated per moment to create a momentary na scale. a higher score represented higher levels of pa or na. daily parenting. in the last questionnaire of each day, adolescents were asked to indicate with whom they spoke during that day (i.e., mother, father, stepmother, stepfather), and if so, to rate each parent's warmth and criticism by answering the questions "throughout the day, how warm/loving was your parent towards you?" and "throughout the day, how critical was your parent towards you?" on a 7-point likert scale ranging from 1 (not at all) to 7 (very). if adolescents only reported on, for instance, mother and stepfather throughout the ema, scores about stepfathers were recoded as father. this was the case for two adolescents during the baseline and three adolescents during the covid-19 pandemic. one adolescent reported on four caregivers (i.e. biological parents and stepparents) during both periods and we included scores about the biological parents because these were rated most often. in the questionnaire at the end of each day, parents also had to indicate whether they spoke to their child (i.e., the participating adolescent) and if so, to rate their own behavior towards their child by answering the questions "how warm/loving were you towards your child?" and "how critical were you towards your child?" on a 7-point likert scale ranging from 1 (not at all) to 7 (very). both for adolescent and parent report, a higher score represented more warmth and more criticism. daily difficulties and helpful activities.
to assess the difficulties and helpful activities during the covid-19 pandemic, at the end of each day, participants were asked to choose items from a list of potential activities. parents and adolescents could select almost similar activities and it was possible to give multiple answers. the list of potential daily difficulties consisted of: boredom, fights/conflicts, work (for parents)/homework (for adolescents), irritations with family members, noise disturbance, loneliness, missing social contact with friends, worries about own health, worries about health of others, concerns about the coronavirus in general, coronavirus-related news items or 'anything else, namely ...'. the list of potential helpful activities consisted of: work (for parents)/homework (for adolescents), watching series/television, listening to music, gaming, social media, reading a book, sports, chilling, online contact with relatives or friends, being together with the family, card or board games, diy or crafts, cooking/dining, or 'anything else, namely ...'. based on the total number of observed responses, a top 5 of daily difficulties and helpful activities was composed. percentages were calculated by dividing the number of observed responses for one activity by the total number of given answers. intolerance of uncertainty. the 12-item version of the intolerance of uncertainty scale (ius; [40] ) was used to assess iu of parents and adolescents. participants completed this questionnaire online prior to baseline. the 12 items of the ius (e.g., "uncertainty makes me uneasy, anxious, or stressed." or "i should be able to organize everything in advance.") were answered on a 5-point likert scale ranging from 1 (strongly disagree) to 5 (strongly agree). a higher sum score represents higher levels of intolerance of uncertainty. both the original and the 12-item version of the ius appear to have satisfactory concurrent, discriminant, and predictive validity [41] .
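the ranking described above (dividing the number of observed responses for one activity by the total number of given answers) can be sketched as follows; the pooled response list below is hypothetical, not the study data:

```python
from collections import Counter

# hypothetical pooled end-of-day selections; each participant could pick
# several difficulties per day, so all given answers are pooled together
responses = [
    "missing social contact", "boredom", "missing social contact",
    "irritations with family members", "concerns about the coronavirus",
    "missing social contact", "boredom", "coronavirus-related news items",
]

counts = Counter(responses)
total = sum(counts.values())

# percentage = responses for one activity / total of given answers
top5 = [(item, round(100 * n / total, 1)) for item, n in counts.most_common(5)]
```

with real ema data, the responses would be pooled across all participants and all 14 end-of-day questionnaires before counting.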
internal consistency of the scale was good with a cronbach's alpha of .81 for adolescents and .83 for parents. depressive symptoms. the patient health questionnaire (phq-9; [42] ) was used to screen for the presence of depressive symptoms during the past two weeks. depressive symptoms were assessed at both timepoints. the items are based on nine dsm-iv criteria for depression and are scored as 0 (not at all) to 3 (nearly every day). the phq-9 has been validated for use in primary care. sum scores range from 0 to 27 and a score above 10 is suggestive of the presence of depression [43] . for parents, the cronbach's alpha at baseline was .79 and during the covid-19 pandemic .73. for adolescents, cronbach's alpha at baseline was .53 and during the covid-19 pandemic .76. parents and adolescents reported repeatedly on positive affect, negative affect, parental warmth, and parental criticism at baseline and during the covid-19 pandemic. these repeated measures (level 1) were nested within individuals (level 2). given this nested structure of the data, multilevel modelling [44] was used for the main analyses. models were specified in r version 3.6.1 [45] , using the multilevel version 2.6 [46] package to test our hypotheses with maximum likelihood (ml) estimation. level 2 predictors were grand-mean centered, following guidelines proposed by hoffman [47] and bolger and laurenceau [48] . to evaluate within-person change in positive affect, negative affect, parental warmth, and parental criticism from baseline to the covid-19 pandemic, a series of models were tested. separate models were tested per outcome and per informant (adolescents and parents), resulting in a total of 8 models. per model, several similar steps were taken. first, we specified an unconditional random intercept model with covariance structure (model 1). for more information on the selection of covariance structure and results see s3 text. 
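the internal-consistency coefficients reported here (cronbach's alpha for the ius and phq-9) follow the standard formula based on item and total-score variances; a minimal sketch, in which the function name and toy score matrices are illustrative assumptions:

```python
import numpy as np

def cronbach_alpha(scores) -> float:
    """cronbach's alpha for an (n_respondents, n_items) score matrix."""
    x = np.asarray(scores, dtype=float)
    k = x.shape[1]
    # sum of the individual item variances (sample variances, ddof=1)
    item_variances = x.var(axis=0, ddof=1).sum()
    # variance of the respondents' total (sum) scores
    total_variance = x.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)
```

perfectly correlated items yield an alpha of 1.0, while weakly related items pull the coefficient down toward 0.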
second, we added period as a predictor (model 2), which was scored 0 (baseline) and 1 (during the covid-19 pandemic) to model change. for example, to model change in positive affect, we specified period as the predictor and positive affect as the outcome. the intercept of the model is the estimated positive affect score at baseline and the slope of the model is the estimated change from baseline to during the covid-19 pandemic. third, we added a random effect (model 3), indicating that the change from baseline to during the covid-19 pandemic could vary between persons. significant changes in model fit were tested with likelihood ratio tests (following guidelines of hox [44] ). fourth, we examined whether the changes were predicted by iu by adding a main effect of iu (model 4). in the models on parental warmth and parental criticism, gender of parents was also added to the model as a main effect to test for possible gender differences. in the final model (model 5), we also added an interaction term of iu with period to test the possible moderating role of iu. since two parents of the same family could participate in the study, a third level (family) was specified in all models including parents (model 1b). to avoid overcomplicating our models, we tested whether adding the family level (level 3) to model 1 for parents improved the model fit based on likelihood ratio tests. only if these tests were significant did the third level remain in the model. since adolescents could report on parenting of fathers and mothers, family was specified as an extra level in the models concerning parental warmth and parental criticism reported by adolescents (model 1b). for adolescents, answers on father and mother (level 2) are nested within adolescents (level 3). we tested whether adding the parent level (level 2) to model 1 for adolescents improved the model fit based on likelihood ratio tests. if these tests were significant, the second level remained in the model.
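the model-building steps above (a random-intercept model per person, a fixed effect of period, then a person-varying random slope, all fitted with maximum likelihood and compared with likelihood ratio tests) can be sketched with statsmodels' mixed models; the simulated dataset and variable names below are assumptions for illustration, not the re-pair data or the authors' r code:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# simulate 30 parents with 20 momentary reports per period
# (period: 0 = baseline, 1 = during the pandemic); the true average
# increase in negative affect is 0.5 and varies between persons
rows = []
for person in range(30):
    intercept = 2.0 + rng.normal(0, 0.3)
    slope = 0.5 + rng.normal(0, 0.1)
    for period in (0, 1):
        for _ in range(20):
            rows.append({"person": person, "period": period,
                         "na": intercept + slope * period + rng.normal(0, 0.3)})
df = pd.DataFrame(rows)

# cf. model 2: random intercept per person plus a fixed effect of period,
# fitted with maximum likelihood (reml=False)
m2 = smf.mixedlm("na ~ period", df, groups=df["person"]).fit(reml=False)

# cf. model 3: additionally let the change vary between persons
# via a random slope for period
m3 = smf.mixedlm("na ~ period", df, groups=df["person"],
                 re_formula="~period").fit(reml=False)

# likelihood ratio statistic for the improvement in model fit
lr_stat = 2 * (m3.llf - m2.llf)
```

the fixed-effect estimate `m3.params["period"]` plays the role of the reported change from baseline, and `lr_stat` would be compared against a chi-square distribution, as in the paper's model-fit tests.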
we used two-tailed tests with α = 0.05. the analytic plan for this study was uploaded to the open science framework prior to the analyses (preregistered at april 27th, osf.io/34ycu). in the current study, 67 dutch parents (age range during the covid-19 pandemic: 36.25-71.04 years) and 34 adolescents (age range during the covid-19 pandemic: 14.66-19.01 years) participated. participant characteristics can be found in table 1 . the sample reported little to no depressive symptoms as measured with the phq-9. phq-9 scores of adolescents ranged between 0-9 at baseline and between 0-16 during the covid-19 pandemic. phq-9 scores of parents ranged between 0-16 at baseline and between 0-16 during the covid-19 pandemic. levels of depressive symptoms did not differ between the two periods for adolescents (t = 1.11, df = 33, p = .275) and parents (t = 1.24, df = 67, p = .221). information on household composition of participating families can be found in s3 text. correlations between study variables (gender, age, affect, parenting behavior, and iu) can be found in s1. parents. of all parents, 91% (n = 61) were currently employed, 6% (n = 4) were unemployed and 3% (n = 2) were unable to work or had lost their job due to the covid-19 pandemic. during the 14 days of ema, 53.7% of the parents who were employed worked more from home, 7.5% worked less from home and 38.8% worked just as much from home as compared to the period before the covid-19 pandemic. all parents indicated owning a house with a garden and having a living surface >100 m2. of our sample, 17.9% (n = 12) of the parents reported having covid-19 related symptoms during the 14 days of ema.
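the within-person comparisons of phq-9 sum scores between the two periods reported above are paired t-tests; a sketch with made-up scores for ten respondents (not the study data):

```python
from scipy import stats

# hypothetical phq-9 sum scores for the same ten respondents at baseline
# and during the pandemic (paired observations, one pair per respondent)
baseline = [2, 4, 1, 6, 3, 5, 2, 0, 7, 4]
pandemic = [3, 4, 2, 5, 3, 6, 2, 1, 7, 5]

# paired (dependent-samples) t-test on the within-person differences
t_stat, p_value = stats.ttest_rel(pandemic, baseline)
```

with n pairs, the test has n - 1 degrees of freedom, matching the df = 33 reported for the 34 adolescents.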
during the covid-19 pandemic, the most reported daily difficulties across the 14 days of ema for parents were (1) missing social contact with friends (14.6%), (2) concerns about the coronavirus in general (13.5%), (3) irritations with family members (12.8%), (4) worrying about the health of others (8.3%), and (5) coronavirus-related news items (8.0%). participants were also asked daily which activities had been helpful during the day. the top 5 helpful activities reported by parents were (1) being together with family (20.0%), (2) cooking/dining (14.4%), (3) watching television/series (9.9%), (4) work (7.4%), and (5) online contact with relatives or friends (6.2%). adolescents. because all national final school exams were canceled due to the covid-19 pandemic and some high schoolers graduated (or not) based on their prior school exams, 5 (21.7%) adolescents graduated promptly in march 2020, prior to the 14 days of ema. of our adolescent sample, one person reported having covid-19 related symptoms during the 14 days of ema. for adolescents (n = 34), the top 5 daily difficulties were (1) boredom (22.9%), (2) missing social contact with friends (17.7%), (3) irritations with family members (13.1%), (4) homework (12.3%), and (5) worry about the health of others (6.4%). the top 5 helpful activities for adolescents were (1) chilling (12.9%), (2) watching television/series (11.4%), (3) online contact with relatives or friends (11.0%), (4) listening to music (10.8%), and (5) being together with the family (9.6%). affect: parent reports. first, an unconditional means model of negative affect with the intercept only was built (referred to as 'model 1'; complete model results for parents can be found in s3 table, model fit statistics for parents in s4 table). the intraclass correlation coefficient (icc) was .31 on the person level, indicating moderate concordance of negative affect across time points within persons.
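the icc reported above is simply the share of total variance attributable to the grouping level (person or family) in the unconditional means model. a minimal sketch from variance components; the numbers are illustrative, not the study's estimates.

```python
def icc(var_between, var_within):
    """Intraclass correlation: between-group variance over total variance."""
    return var_between / (var_between + var_within)

# illustrative variance components from an intercept-only multilevel model
print(round(icc(0.31, 0.69), 2))
```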
next, family was added as a level to the unconditional means model (model 1b). the icc of the family level was .11, which indicates that some concordance of negative affect existed within families. however, the model fit did not improve significantly (χ2(1) = 1.581, p = .209) and the family level was therefore removed from the model. next, in model 2, we tested the change in negative affect from baseline to during the covid-19 pandemic by adding period to the model. parents reported more negative affect during the covid-19 pandemic as compared to baseline (b = 0.096, se = .025, df = 5982, t = 3.900, p < .001). adding individual variance in model 3 improved the model fit significantly (χ2(2) = 56.613, p < .001). in model 4, we added iu, which was significantly associated with negative affect (b = 0.022, se = .010, df = 62, t = 2.075, p = .042), indicating that more iu was related to more negative affect (main effect). lastly, we added iu as moderator in model 5; results of this final model are presented in table 2. no moderating effect of iu was found (b = 0.002, se = .007, df = 5752, t = 0.225, p = .822) and iu was no longer significantly associated with negative affect (b = 0.021, se = .011, df = 62, t = 1.960, p = .054), but period remained significantly associated with negative affect. results are shown in fig 2. for positive affect, the same steps were followed. model 1 showed an icc of .32, and adding the family level (model 1b) did not significantly improve the model fit (χ2(1) = 0.738, p = .390). results of model 2 showed that parents' positive affect did not differ across the two periods (b = 0.012, se = .028, df = 5986, t = 0.404, p = .686). adding individual variance in model 3 improved the model fit significantly (χ2(2) = 122.186, p < .001). in model 4, iu was added as a main effect, but no significant association with positive affect was found.
lastly, iu was added as moderator in model 5, but no moderating effect of iu was found (b = -0.008, se = .009, df = 5756, t = -0.823, p = .411). results of this final model are presented in table 2. affect: adolescent reports. in model 1, the icc of negative affect on the person level was .32 (complete model results for adolescents can be found in s5 table, model fit statistics for adolescents in s6 table). results of model 2 showed that there was no significant change in adolescent negative affect (b = 0.016, se = .027, df = 2618, t = 0.595, p = .552). adding individual variance in model 3 improved the model fit significantly (χ2(2) = 39.759, p < .001). in model 4, we added iu as a main effect, which was significantly associated with negative affect (b = 0.030, se = .011, df = 30, t = 2.737, p = .010), indicating that more iu was related to more negative affect. iu was added as moderator in model 5; iu remained significantly associated with negative affect, but no moderating effect of iu was found (b = -0.006, se = …). results of this final model are presented in table 3.
[table 2. results of final model 5 on the relation between period and affect and the moderating role of intolerance of uncertainty in parents.]
parenting: parent reports. in model 1, the icc of parental criticism on the person level was .39 (complete model results for parents can be found in s3 table, model fit statistics for parents in s4 table). adding the family level (model 1b) did significantly improve the model fit (χ2(1) = 5.430, p = .020) with an icc of .20 at the family level, so 'family' remained in the model. results of model 2 showed no difference in parental criticism between baseline and during the covid-19 pandemic (b = 0.126, se = .064, df = 1530, t = 1.963, p = .050). adding individual variance in model 3 improved the model fit significantly (χ2(4) = 39.527, p < .001). in model 4, we added iu and gender of the parent as main effects.
both were not significantly associated with parental criticism. iu was added as moderator in model 5; results of this final model are presented in table 4.
[table 3. results of final model 5 on the relation between period and affect and the moderating role of intolerance of uncertainty in adolescents.]
for parental warmth, in model 1 the icc on the person level was .46, and adding the family level (model 1b) did not significantly improve the model fit (χ2(1) = 0.761, p = .383). no significant change in parental warmth (b = 0.010, se = .038, df = 1530, t = 0.255, p = .799) was found in model 2. adding individual variance in model 3 improved the model fit significantly (χ2(2) = 22.499, p < .001). in model 4, we added iu and gender of parent; both were not significantly associated with parental warmth. iu was added as moderator in model 5, but no moderating effect of iu was found (b = 0.004, se = .008, df = 1466, t = .489, p = .625). results of this final model are presented in table 4. parenting: adolescent reports. in model 1, the icc of parental criticism on the person level was .45 (complete model results for adolescents can be found in s5 table, model fit statistics for adolescents in s6 table). adding the family level (model 1b) did not significantly improve the model fit (χ2(1) = 2.925, p = .087). results of model 2 showed that the change in reports on parental criticism between baseline and during the covid-19 pandemic was not significant (b = 0.036, se = .062, df = 1350, t = 0.576, p = .565). adding individual variance in model 3 improved the model fit significantly (χ2(2) = 53.931, p < .001). in model 4, we added iu and gender of parent as main effects. gender of parent was significantly associated with reports on parental criticism (b = -0.121, se = .058, df = 1268, t = -2.099, p = .036), indicating that adolescents reported more parental criticism from mothers than from fathers. iu was not significantly associated with parental criticism.
iu was added as moderator in model 5, but no moderating effect of iu was found (b = 0.028, se = .021, df = 1267, t = 0.083, p = .934). results of this final model are presented in table 5. gender of parents remained significantly associated with parental criticism.
[table 4. results of final model 5 on the relation between period and daily parenting behavior and the moderating role of intolerance of uncertainty in parents.]
[running head: does the covid-19 pandemic impact parents' and adolescents' well-being]
in model 4, we added iu and gender of parent; both were not significantly associated with parental warmth. iu was added as moderator in model 5, but no moderating effect of iu was found (b = 0.002, se = .021, df = 1267, t = 0.083, p = .934). results of this final model are presented in table 5. post hoc analyses on the increase in parents' negative affect during the covid-19 pandemic. as iu did not explain why parents reported more negative affect during the covid-19 pandemic as compared to baseline, we conducted post hoc analyses to examine whether characteristics related to the lockdown and the covid-19 pandemic were associated with the increase in parents' negative affect. living surface, income, having suffered from covid-19 symptoms, helping children with school at home, working from home, going to work, daily difficulties during the past two weeks of covid-19, and working with covid-19 patients were examined (see s7 and s8 tables for a description of the ema items). none of these characteristics were related to the increase in parents' negative affect during the covid-19 pandemic as compared to baseline (all p-values > .05).
in this study we (1) explored parents' and adolescents' daily difficulties and helpful activities during the covid-19 pandemic, (2) examined positive and negative affect of both parents and adolescents during 2 weeks of the covid-19 pandemic and compared them to a 2-week baseline period pre-pandemic, (3) examined parenting behaviors (assessed by both the adolescent and the parent) and compared parental warmth and criticism towards the adolescent during 2 weeks of the covid-19 pandemic and a 2-week baseline period, (4) examined whether parents' and adolescents' levels of iu at baseline were associated with affect and parenting in general, and (5) examined whether iu was associated with the hypothesized changes in affect and (perceived) parental warmth and criticism. most importantly, both parents and adolescents were bothered by a lack of social contact with friends and by irritations with family members, and worried about the health of others. this might be a logical consequence of the lockdown and social distancing.
[table 5. results of final model 5 on the relation between period and daily parenting behavior and the moderating role of intolerance of uncertainty in adolescents.]
remarkably, adolescents struggled with boredom, whereas this was not the case for parents. parents worried about the coronavirus in general, while this did not bother adolescents that much. in response to social distancing, online contact with relatives or friends helped both parents and adolescents cope with the situation. in addition, watching tv-shows was also mentioned as a helpful activity by parents and adolescents. other activities that helped to cope with the situation varied between parents and adolescents: while parents reported benefiting from being together with family and from cooking and dining, adolescents reported chilling and listening to music.
previous studies have shown that quarantine and quarantine-related issues (i.e., financial insecurity, fear of infection, uncertainty about duration) generally have a negative influence on adult mood and mental well-being [9]. therefore, it was expected that the covid-19 pandemic and lockdown would increase negative affect and decrease positive affect as compared with the period before the lockdown. our results show that, indeed, parents' negative affect increased as compared to the period before the lockdown. it is important to note that we collected data during the 5th and 6th week of the lockdown in the netherlands, with only minor prospects of easing regulations. we also explored whether other pandemic-related characteristics (i.e., living surface, income, relatives with covid-19, hours of working at home, helping children with school, and contact with covid-19 patients at work) were linked to the increase of negative affect in parents; this was not the case. our findings did, however, suggest heterogeneity among individuals. all our models improved significantly when allowing the associations between period (2 weeks of the covid-19 pandemic versus a similar 2-week baseline period) and affect and parenting behavior to vary across individuals, which is in line with the theoretical notion of differential susceptibility (e.g., [49]). whether or not parents and adolescents experience (emotional) problems during lockdown can clearly vary from household to household, suggesting that in general families seem able to adapt to the circumstances, but that some families struggle. this is important to keep in mind for potential future social distancing measures. it was expected that the forced social distance during the covid-19 pandemic, and particularly the physical distance from friends and peers and the school closure, would result in an increase of negative affect and a decrease of positive affect in adolescents (see also loades et al. [50]).
yet, in our study, no differences in adolescent reports of negative affect were found during the covid-19 pandemic as compared to the baseline period. as for adults, the opportunities for adolescents for online social interaction might have buffered feelings of isolation or loneliness and bolstered mental well-being during the covid-19 pandemic [51]. moreover, it should be noted that our sample is considered healthy on average, based on the phq-9 scores, and lived in relatively favorable circumstances (e.g., high socioeconomic status). affect of adolescents with (subclinical) mental health issues (e.g., depressive or anxiety symptoms) or living under less fortunate circumstances might be more influenced during the covid-19 pandemic. therefore, it is important to examine the effect of the covid-19 pandemic in clinical samples to elucidate its effect on psychopathology. moreover, it should be noted that our assessments took place in a rather poignant phase of the lockdown, when school closings may also have yielded relief for some adolescents. even though individuals strive to become independent during adolescence and start to explore the environment outside the family household [2, 3], this period of enforced proximity did not seem to affect adolescents in the short term. potentially, the endurance of the lockdown may have more detrimental effects on adolescent well-being. neither for parents nor for adolescents was a change in positive affect found. despite the increase of stress and uncertainty around the covid-19 pandemic, disasters such as a pandemic may also increase the sense of social connectedness and morality [10]. this sense of shared social identity and the feeling of 'we are all in this together' can be related to positive affect [20], which could explain why positive affect did not decrease in the present study. in families, as in our sample, no one was home alone, and one could still have online social interactions with others outside the household.
to that end, 'physical distancing' might be a better term for the imposed social isolation or social distance, as was previously suggested in the literature [10]. as mentioned before, the covid-19 pandemic and the related lockdown may lead to more tension, irritability, and family conflicts or worse [10]. notably, parents' affect and parenting behavior are interrelated and are both involved in giving comfort, expressing approval, or expressing criticism [52, 53]. for instance, parents who worry more express more criticism towards their adolescents, indicating that negative affect promotes insensitive and, in more extreme cases, abusive parenting behavior, whereas positive affect strongly relates to supportive parenting [52, 53]. regarding parenting behaviors, we therefore expected higher levels of parental criticism and lower levels of parental warmth during the covid-19 pandemic as compared to baseline. we found, however, that parental warmth and criticism, from both the parent and the adolescent perspective, did not differ between before and during the covid-19 pandemic. interestingly, even though parents' negative affect increased compared to the period before the lockdown, this did not seem to affect parenting behavior (self-reported or as perceived by the adolescent). it should be noted that, in general, adolescents perceived their mothers as more critical than their fathers, unrelated to measurement period. this might be due to the unique roles of mothers and fathers in caregiving and in setting rules and boundaries [54, 55]. results showed that iu was related to more negative affect in both parents and adolescents, independent of the period of assessment. furthermore, in adolescents, iu was also linked to a decrease in positive affect, while for parents no link between iu and positive affect was found.
it was expected that people with elevated iu levels might experience even greater distress under the covid-19 circumstances as compared to baseline; however, our results do not support this. iu is often described as a predisposition to negatively perceive and respond to uncertain information and situations, irrespective of their probability and outcomes [34, 35]. apparently, it is negatively associated with affect in daily life, regardless of whether there are major threats and uncertainties or merely daily hassles. future research could elucidate why iu may particularly dampen positive affect in adolescents and not in adults. even though iu seems to relate to the affect of parents and adolescents, it did not seem to spill over into parenting behaviors. these results give a first indication that iu also relates to micro-processes in daily life, for both adolescents and parents. this study has several strengths. firstly, the intensive longitudinal study design with multiple assessments per day enabled us to gain more fine-grained insights into affect and parenting behaviors in daily life and to consider individual differences. secondly, assessment during two periods, before and during the covid-19 pandemic, allowed us to detect changes due to the covid-19 pandemic. next to these strengths, it should be acknowledged that the sample (67 parents and 34 adolescents) was relatively small. second, it should be noted that the study sample consisted of overall healthy, well-functioning parents and adolescents. that is, adolescents were screened at baseline and were excluded if they had a current mental disorder, a history of psychopathology in the past two years, or a lifetime history of major depressive disorder or dysthymia. moreover, the phq-9 scores of adolescents and parents indicated few depressive symptoms. therefore, findings might not be applicable to adolescents and parents with (sub)clinical mental health problems or to at-risk populations (e.g.
refugees, low socioeconomic status), since these groups might be at increased risk of problems such as loneliness, negative affect, or negative parenting practices during the covid-19 pandemic. lastly, it should be noted that information on the long-term consequences of lockdown during the covid-19 pandemic is lacking. prior research has suggested that the impact of stress can be altered by mindsets and appraisals of stressful events [10, 56, 57]. these factors could possibly explain the individual variations we found. for instance, people with low expectations of the course of events might adapt relatively well to new situations and, therefore, experience few emotional problems. moreover, adaptive mindsets about stressful events might increase positive emotions and reduce negative health symptoms [58]. considering these factors in future studies might be useful to elucidate individual differences in risk and resilience. in our study parents, but not adolescents, showed an increase of negative affect in a two-week period (14-28 april 2020) during the covid-19 pandemic compared with a similar two-week baseline period pre-pandemic. positive affect and the parenting behaviors 'warmth' and 'criticism' did not change. it can be concluded that, on average, parents and adolescents in our sample seem to deal fairly well with the circumstances. individuals and families differed, however, in the extent to which the covid-19 pandemic influenced their affect and (perception of) parenting behavior. living surface, income, having suffered from covid-19 symptoms, helping children with school at home, working from home, going to work, difficulties during covid-19, and working with covid-19 patients did not explain the increase of parental negative affect. policy makers and mental health professionals working to prepare for potential disease outbreaks should be aware that the experience of being quarantined might affect individuals differently.
each parent and adolescent could therefore benefit from a different coping strategy, as 'one size does not fit all'. providing easily accessible and safe ways to increase online contact for all ages and layers of society, recommending distraction such as listening to music or watching television, and helping to accept the uncertain situation are, for instance, potential coping strategies. in this way, individuals can find ways that suit their own personal needs in order to benefit their well-being in times of a lockdown and social distancing measures.
[supporting tables]
table a: model results on the relation between period and negative affect, and the moderating role of intolerance of uncertainty in parents.
table b: model results on the relation between period and positive affect, and the moderating role of intolerance of uncertainty in parents.
table c: model results on the relation between period and parental criticism, and the moderating role of intolerance of uncertainty in parents.
table a: model results on the relation between period and negative affect, and the moderating role of intolerance of uncertainty in adolescents.
table b: model results on the relation between period and positive affect, and the moderating role of intolerance of uncertainty in adolescents.
table c: model results on the relation between period and parental criticism, and the moderating role of intolerance of uncertainty in adolescents.
[references]
who announces covid-19 outbreak a pandemic
handbook of parenting: children and parenting
cognitive and affective development in adolescence
the experience of quarantine for individuals affected by sars in toronto
sars control and psychological effects of quarantine
depression after exposure to stressful events: lessons learned from the severe acute respiratory syndrome epidemic
posttraumatic stress disorder in parents and youth after health-related disasters.
disaster medicine and public health preparedness
mental health status of people isolated due to middle east respiratory syndrome
the psychological impact of quarantine and how to reduce it: rapid review of the evidence. the lancet
using social and behavioural science to support covid-19 pandemic response. nature human behaviour
ecological momentary assessment (ema) in behavioral medicine
feelings change: accounting for individual differences in the temporal dynamics of affect
emotional experience improves with age: evidence based on over 10 years of experience sampling
intraindividual variability in affect: reliability, validity, and personality correlates
the relation between short-term emotion dynamics and psychological well-being: a meta-analysis
the family ecology of adolescence: a dynamic systems perspective on normative development
the role of meta-cognition and parenting in adolescent worry
the development of adolescent generalized anxiety and depressive symptoms in the context of adolescent mood variability and parent-adolescent negative interactions
daily links between school problems and youth perceptions of interactions with parents: a diary study of school-to-home spillover
the role of positive emotions in positive psychology: the broaden-and-build theory of positive emotions
mothers' and fathers' parental warmth, hostility/rejection/neglect, and behavioral control: specific and unique relations with parents' depression versus anxiety symptoms
psychological well-being and parent-child relationship quality in relation to child autism: an actor-partner modeling approach
daily stress, coping, and well-being in parents of children with autism: a multilevel modeling approach
gender differences in adolescents' daily interpersonal events and well-being.
child development
an upward spiral: bidirectional associations between positive affect and positive aspects of close relationships across the life span
bringing it all back home: how outside stressors shape families' everyday lives
congruence of parents' and children's perceptions of parenting: a meta-analysis
investigating the construct validity of intolerance of uncertainty and its unique relationship with worry
a retrospective examination of the role of parental anxious rearing behaviors in contributing to intolerance of uncertainty
investigating the effect of intolerance of uncertainty on catastrophic worrying and mood
generalized anxiety disorder: a preliminary test of a conceptual model
why do people worry? personality and individual differences
a little uncertainty goes a long way: state and trait differences in uncertainty interact to increase information seeking but also increase worry
problem solving and problem orientation in generalized anxiety disorder
experimental manipulation of intolerance of uncertainty: a study of a theoretical model of worry
the 10-item positive and negative affect schedule for children, child and parent shortened versions: application of item response theory for more efficient assessment
development and validation of brief measures of positive and negative affect: the panas scales
fearing the unknown: a short version of the intolerance of uncertainty scale
a comparison of the 27-item and 12-item intolerance of uncertainty scales
the phq-9: validity of a brief depression severity measure
optimal cut-off score for diagnosing depression with the patient health questionnaire (phq-9): a meta-analysis
multilevel analysis: techniques and applications
r: a language and environment for statistical computing. austria: r foundation for statistical computing
package 'multilevel'
longitudinal analysis: modeling within-person fluctuation and change.
routledge
intensive longitudinal methods: an introduction to diary and experience sampling research
differential susceptibility to parenting and quality child care
rapid systematic review: the impact of social isolation and loneliness on the mental health of children and adolescents in the context of covid-19
helping others regulate emotion predicts increased regulation of one's own emotions and decreased symptoms of depression
the affective organization of parenting: adaptive and maladaptive processes
relations between parental affect and parenting behaviors: a meta-analytic review. parenting
father-child relationships. handbook of father involvement: multidisciplinary perspectives
the role of fathers' versus mothers' parenting in emotion-regulation development from mid-late adolescence: disentangling between-family differences from within-family effects
optimizing stress responses with reappraisal and mindset interventions: an integrated model
arousal and physiological toughness: implications for mental and physical health
the role of stress mindset in shaping cognitive, emotional, and physiological responses to challenging and threatening stress
we are grateful to all the families that have invested their time by participating in this study. we furthermore thank ethica data for offering their services (i.e., use of the ethica app and platform) free of charge during the covid-19 pandemic.
key: cord-308115-bjyr6ehq title: fractional order model for the role of mild cases in the transmission of covid-19 date: 2020-10-20 journal: chaos solitons fractals doi: 10.1016/j.chaos.2020.110374 sha: doc_id: 308115 cord_uid: bjyr6ehq most of the nations with deplorable health conditions lack rapid covid-19 diagnostic tests due to limited testing kits and laboratories.
the un-diagnosed mild cases (who show no critical signs and symptoms) act as a route that unknowingly spreads the infection to healthy individuals. in this paper, we present a fractional order sir model incorporating individuals with mild cases as a compartment, to become the smir model. the existence of the solutions of the model is investigated by solving the fractional gronwall's inequality using the laplace transform approach. the equilibrium solutions (dfe & endemic) are found to be locally asymptotically stable, and subsequently the basic reproduction number is obtained. also, the global stability analysis is carried out by constructing a lyapunov function. lastly, numerical simulations that support the analytic solutions follow. it is also shown that when the rate of infection of the mild cases increases, there is an equivalent increase in the overall population of infected individuals. hence, to curtail the spread of the disease there is a need to take care of the mild cases as well. the outbreak of the novel strain of coronavirus (covid-19) started in late december in the wuhan province in china [1]. it became a global pandemic causing a devastating impact in terms of morbidity, infections and fatality, in addition to socio-economic disaster. the virus source, which is yet to be identified, is said to have genetic linkages with severe acute respiratory syndrome (sars-cov) but is less severe than middle east respiratory syndrome (mers-cov) [2]. the virus is transmitted to healthy persons via the eyes, mouth and nose when an infected person produces respiratory droplets of cough and sneeze, or as a result of contact with contaminated surfaces. the average incubation period from catching the virus to the time of onset of major symptoms (like fever, cough and sneeze) is between 2 and 14 days [3].
as a vaccine has not yet been found, control measures such as social distancing, quarantine of suspected cases, use of personal protective equipment (like face masks, hand gloves and gowns), regular hand sanitation using antibacterial agents (like soaps and sanitizer), and imposing lockdown curfews (when necessary) are the effective interventions that mitigate the transmission of the infection. to execute these measures effectively, there is a need for an in-depth study of the number of persons that each infected individual can infect; hence a mathematical model describing the transmission dynamics of the disease should be established. in this regard, zhao and chen [4] developed a susceptible, un-quarantined infected, quarantined infected, confirmed infected (suqc) model to characterize the dynamics of covid-19 and explicitly parameterized the intervention effects of control measures. similarly, song et al. [5] developed a mathematical model based on the epidemiology of covid-19, incorporating the isolation of healthy people, confirmed cases and close contacts. tahir et al. [6] developed a mathematical model (for mers) in the form of a nonlinear system of differential equations, in which they considered a camel to be the source of infection that spreads the virus to the infective human population, then human-to-human transmission, then to the clinic center, then to the care center. they constructed a lyapunov candidate function to investigate the local and global stability of the equilibrium solutions and subsequently obtained the basic reproduction number, a key parameter describing transmission of the infection. also, chen et al. [7] developed a bats-hosts-reservoir-people (bhrp) transmission network model for the potential transmission from the infection source (probably bats) to the human infection, which focuses on calculating r0.
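several of the models cited above derive the basic reproduction number via the next-generation matrix: r0 is the spectral radius of f·v⁻¹, where f collects new-infection terms and v collects transition terms for the infected compartments. the following is a minimal sketch for a demography-free seir model; the parameter values are illustrative assumptions, not taken from any of the cited papers.

```python
def spectral_radius_2x2(m):
    """Largest absolute eigenvalue of a 2x2 matrix via the quadratic formula."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    disc = (tr * tr - 4.0 * det) ** 0.5  # real for the matrices used here
    return max(abs((tr + disc) / 2.0), abs((tr - disc) / 2.0))

def seir_r0(beta, sigma, gamma):
    """R0 = spectral radius of F V^{-1} for a demography-free SEIR model.

    Infected compartments (E, I):
      new infections  F = [[0, beta], [0, 0]]
      transitions     V = [[sigma, 0], [-sigma, gamma]]
    """
    f = [[0.0, beta], [0.0, 0.0]]
    v_inv = [[1.0 / sigma, 0.0], [1.0 / gamma, 1.0 / gamma]]
    fv = [[sum(f[i][k] * v_inv[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    return spectral_radius_2x2(fv)

# illustrative parameters: transmission 0.3, incubation rate 0.2, recovery 0.1
print(seir_r0(beta=0.3, sigma=0.2, gamma=0.1))
```

for this simple structure the spectral radius collapses to beta/gamma, which is a useful hand check on the matrix algebra.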
to suit the korean outbreak, sunhwa and moran [8] established a deterministic mathematical model (in the form of an seihr model), in which they estimated the reproduction number and assessed the effect of preventive measures. similarly, lin et al. [9] modeled (based on seir) the outbreak in wuhan with individual reaction and governmental action (holiday extension, city lockdown, hospitalization and quarantine), in which they estimated the preliminary magnitude of the different effects of individual reaction and governmental action. also, yang and wang [10] proposed a mathematical model to investigate the current outbreak of the coronavirus disease (covid-19) in wuhan, china. the model described the multiple transmission pathways in the infection dynamics, and emphasized the role of the environmental reservoir in the transmission and spread of the disease. the model employed non-constant transmission rates which change with the epidemiological status and environmental conditions and which reflect the impact of the ongoing disease control measures. nonlocality is one of the main drivers of interest in fractional calculus applications. there are interesting phenomena that have what are called memory effects, meaning their state does not depend solely on time and position but also on previous states. such systems can be very difficult to model and analyze with classical differential equations, but nonlocality gives fractional derivatives a built-in ability to incorporate memory effects [11]. fractional differential equations appear naturally in numerous fields of study including physics, polymer rheology, regular variation in thermodynamics, biophysics, blood flow phenomena, aerodynamics, electrodynamics of complex media, viscoelasticity, capacitor theory, electrical circuits, electro-analytical chemistry, biology, control theory, and the fitting of experimental data [12-14, 23-26].
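the memory effect mentioned above can be made concrete numerically: discrete approximations of the caputo derivative weight the entire history of the function, not just its most recent values. the following sketch is our own illustration, not code from any of the cited works; the function name and the choice of the standard l1 discretisation are assumptions. it approximates a caputo derivative of order alpha in (0, 1) on a uniform time grid:

```python
import math

def caputo_l1(f_vals, dt, alpha):
    """l1 approximation of the caputo derivative of order alpha in (0, 1)
    at the last grid point. note the loop over the *entire* history
    f_vals[0..n]: this nonlocality is the 'memory effect' in action."""
    n = len(f_vals) - 1
    coeff = dt ** (-alpha) / math.gamma(2.0 - alpha)
    total = 0.0
    for k in range(n):
        # quadrature weights b_k = (k+1)^(1-alpha) - k^(1-alpha)
        b_k = (k + 1) ** (1.0 - alpha) - k ** (1.0 - alpha)
        total += b_k * (f_vals[n - k] - f_vals[n - k - 1])
    return coeff * total
```

as a sanity check, for f(t) = t the weighted differences telescope and the l1 scheme reproduces the known closed form t^(1-alpha)/gamma(2-alpha) exactly; perturbing a single early history value changes the result at the final time, which a local (integer-order) derivative would not do.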
recently there have been many studies on epidemiological disease modeling using fractional order differential equations [27-33]. the riemann-liouville fractional derivative is mostly used by mathematicians, but this approach is not suitable for real-world physical problems since it requires the definition of fractional order initial conditions, which as yet have no physically meaningful interpretation. caputo introduced an alternative definition, which has the advantage of admitting integer order initial conditions for fractional differential equations [15]. in mostly poor and underdeveloped territories, where there is no capacity for rapid diagnostic covid-19 testing due to a shortage of testing kits, the mild cases (who usually show no symptoms of the infection, due to their strong and active immune systems, up to their recovery) play a major role as a route that spreads the disease to healthy individuals. here we build our model by incorporating the population of mild individuals into the compartmental sir model, yielding the smir model in the form of a system of fractional order differential equations (fode) in the caputo sense. it should be noted that, to our knowledge, no model in the literature has considered the contribution of the mild cases of covid-19 to the proliferation of the pandemic. the paper is organized as follows: section 1 is the introduction, section 2 gives the preliminary definitions and theorems, the model formulation follows in section 3, section 4 is the stability analysis, section 5 gives numerical simulations and discussions and section 6 gives the conclusion. the notion of convergence of the mittag-leffler function is fully discussed in [15].
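for reference, the two standard definitions invoked in this passage, stated here for order 0 < α < 1 from textbook sources rather than reproduced from the paper's own (lost) equations, are the caputo derivative and the one- and two-parameter mittag-leffler functions:

```latex
% caputo fractional derivative of order 0 < \alpha < 1
{}^{C}\!D_{t}^{\alpha} f(t) \;=\; \frac{1}{\Gamma(1-\alpha)} \int_{0}^{t} \frac{f'(\tau)}{(t-\tau)^{\alpha}} \, d\tau ,
\qquad
% mittag-leffler functions (fractional analogue of the exponential)
E_{\alpha}(z) \;=\; \sum_{k=0}^{\infty} \frac{z^{k}}{\Gamma(\alpha k + 1)} ,
\qquad
E_{\alpha,\beta}(z) \;=\; \sum_{k=0}^{\infty} \frac{z^{k}}{\Gamma(\alpha k + \beta)} .
```

because the caputo operator differentiates f before applying the weakly singular kernel, initial conditions enter as ordinary values f(0), which is the "integer order initial condition" advantage noted above; and since E_1(z) = e^z, the mittag-leffler series that appear in the bounded-solution argument generalise the exponential solutions of integer-order linear odes.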
theorem 1 [19]: the equilibrium solutions of the system are locally asymptotically stable if all the eigenvalues of the jacobian matrix evaluated at the equilibrium points satisfy the stated eigenvalue condition. despite the fact that almost 80% of covid-19 cases are mild and recover naturally (due to a strong and active immune system that fights the virus), they still play a role as a route of transmission of the infection [2]. the model was formulated on the assumption that new born humans are recruited into the susceptible class (s) at a constant rate. a susceptible individual who has had contact with an infected one can develop mild symptoms and move, at the contact rate, to the mild class (m). based on [2], the mild patients, with their own infectivity rate, act as a route that spreads the infection; those with strong immunity recover naturally, while some with critical illness become infected and move to the infectious class (i) after the incubation period. an infectious individual may then recover (r) or die, at the respective rates. figure 1 gives the schematic diagram describing the transmission dynamics of the disease, with the total population being the sum of the compartments. using the linearity of the caputo operator, we apply the laplace transform method to solve the gronwall inequality (6) with the given initial condition. the linearity property of the laplace transform gives (7); decomposing (7) into partial fractions and taking the inverse laplace transform gives (8), in which the terms are series of the mittag-leffler function (as in definition 4), which converges for any argument; hence we say that the solution to the model is bounded. thus, consider the system (1) through (4) written in compact form. with reference to the picard-lindelöf theorem [20], we establish the following theorem. since the domain is a closed set, it is a complete metric space. the continuous system (10) can be transformed into an equivalent integral equation; (12) is equivalent to a volterra integral equation that solves (10).
we define an operator on this space and verify that (13) satisfies the hypotheses of the contraction mapping principle. first, we show that the operator maps the set onto itself. secondly, we show that it is a contraction: since the stated hypothesis holds, the operator is a contraction and has a unique fixed point. thus, system (10) has a unique solution. to obtain the equilibrium solutions, we set the system to zero and solve simultaneously; considering the disease-free case in (11)-(14), and then the general case, we find the endemic equilibrium. consider system (1); then we obtain the jacobian matrix, and clearly all the eigenvalues have zero imaginary part, hence, by theorem 1 above, the disease-free equilibrium is locally asymptotically stable. to obtain the basic reproduction number, a key parameter describing the number of secondary infections generated by a single infectious individual, we consider the eigenvalues above. the threshold quantity for which the disease-free equilibrium is stable when it is less than one and unstable when it is greater than one is what we term the basic reproduction ratio r0. the endemic equilibrium points can be rewritten in a convenient form. to derive the lyapunov candidate function for the fractional order case as in [24], we consider the family of quadratic lyapunov functions and define the lyapunov candidate function; applying lemma 2 of [23], by theorem 3 above the disease-free equilibrium is globally asymptotically stable. case 2: at the endemic (positive) equilibrium, (28) takes a new form; we had earlier established that the positive equilibrium is stable when the threshold is exceeded. back-substituting the relevant relations into (30) then yields the required expression, and hence, by theorem 3 above, the endemic equilibrium is globally asymptotically stable. in this section we carry out numerical examples to support the analytic results, using the parameter values in table 1.
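the threshold behaviour and the simulations described here can be illustrated with a short sketch. for simplicity it uses the integer-order (alpha = 1) special case of the compartment flow, forward euler time stepping, and a closed-form r0 obtained via the standard next-generation-matrix argument for the two infected classes (m, i); every function name, parameter name and parameter value is an illustrative assumption, not a value from the paper's table 1.

```python
# integer-order (alpha = 1) forward-euler sketch of an s-m-i-r flow:
# susceptibles are infected by mild (beta) and confirmed (eta) cases;
# mild cases recover (gamma_m) or progress (sigma); infectious cases
# recover (gamma_i) or die (delta). all values are illustrative.

def r0(beta, eta, gamma_m, sigma, gamma_i, delta, s0=1.0):
    """basic reproduction ratio from the next-generation matrix,
    linearised at the disease-free state."""
    a = gamma_m + sigma          # total outflow rate from the mild class
    b = gamma_i + delta          # total outflow rate from the infectious class
    return s0 * (beta / a + eta * sigma / (a * b))

def smir_step(s, m, i, r, dt, lam=0.0, beta=0.3, eta=0.1,
              gamma_m=0.05, sigma=0.1, gamma_i=0.05, delta=0.01):
    """one euler step of the four compartments (population fractions)."""
    n = s + m + i + r
    force = (beta * m + eta * i) * s / n   # force of infection from m and i
    s2 = s + dt * (lam - force)
    m2 = m + dt * (force - (gamma_m + sigma) * m)
    i2 = i + dt * (sigma * m - (gamma_i + delta) * i)
    r2 = r + dt * (gamma_m * m + gamma_i * i)
    return s2, m2, i2, r2

def simulate(days=160, dt=0.1):
    s, m, i, r = 0.99, 0.01, 0.0, 0.0   # small seed of mild cases
    for _ in range(int(days / dt)):
        s, m, i, r = smir_step(s, m, i, r, dt)
    return s, m, i, r
```

with these illustrative rates r0 exceeds one, so the sketch produces an outbreak that depletes the susceptible pool, consistent with the threshold statement: the disease-free equilibrium loses stability exactly when r0 crosses one. increasing the mild-class infectivity beta raises r0 and the infected totals, mirroring the paper's qualitative conclusion about mild cases.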
for the variables we use the stated initial values. when the rate of infection of the mild cases increases, there is a corresponding increase in the overall population of infected individuals; hence, to curtail the spread of the disease there is a need to take care of the mild cases as well. in mostly poor and underdeveloped territories, where there is no capacity for rapid diagnostic covid-19 testing due to a shortage of testing kits, the mild cases (who usually show no symptoms of the infection, due to their strong and active immune systems, up to their recovery) play a major role as a route that spreads the disease to healthy individuals. here we built our model by incorporating the population of mild individuals into the compartmental sir model, yielding the smir model in the form of a system of fractional order differential equations (fode) in the caputo sense. the existence of solutions of the model was shown by solving the fractional gronwall inequality using the laplace transform approach. two equilibrium solutions, disease-free and endemic, were obtained. both local and global stability of the equilibria were shown to depend on the magnitude of the basic reproduction ratio. numerical simulations were carried out, and the dynamics of the populations were shown to vary with the parameter values. it was also shown that when the rate of infection of the mild cases increases, there is a corresponding increase in the overall population of infected individuals. hence, to curtail the spread of the disease there is a need to take care of the mild cases as well.
- world health organization (who). novel coronavirus-china
- modeling the epidemic dynamics and control of covid-19 in china
- the impact of isolation on the transmission of covid-19 and estimation of potential second epidemic in china.
preprints (www.preprints.org)
- stability behavior of mathematical model of mers corona virus spread in population
- a mathematical model for simulating the phased-based transmissibility of a novel coronavirus
- estimating the reproductive number and the outbreak size of novel coronavirus (covid-19) using mathematical model, republic of korea
- a conceptual model for the coronavirus disease 2019 (covid-19) outbreak in wuhan, china with individual reaction and governmental action
- a mathematical model for the novel coronavirus epidemic in wuhan
- what is fractional calculus. cantor's paradise
- a survey on existence result for boundary value problem of nonlinear fractional differential equations
- the analysis of fractional differential equations: an application-oriented exposition using differential operators of caputo type
- stability and dynamics of a fractional order leslie-gower model
- fractional differential equations
- stability analysis of fractional differential system with riemann-liouville derivative
- differential equations: an introduction to differential equations, creative commons, 94105
- differential equation of fractional order: methods, results and problems
- stability result for fractional differential equations with application to control processing
- stability analysis of caputo fractional-order nonlinear systems revisited
- lyapunov function for fractional order systems
- volterra type lyapunov function for fractional order epidemic model
- analysis of caputo fractional-order model for covid-19 with lockdown
- analysis of meningitis model: a case study of northern nigeria
- mathematical modeling for adsorption process of dye removal nonlinear equation using power law and exponentially decaying kernels
- modeling chickenpox disease with fractional derivatives: from caputo to atangana-baleanu
- a new study on the mathematical modelling of human liver with caputo-fabrizio fractional derivative
- a mathematical model for covid-19 transmission by using the caputo fractional derivative
- epidemiological and clinical characteristics of the covid-19 epidemic in brazil
- a fractional differential equation model for the covid-19 transmission by using the caputo-fabrizio derivative
- on the mathematical model of rabies by using the fractional caputo-fabrizio derivative
- a mathematical theoretical study of a particular system of caputo-fabrizio fractional differential equations for the rubella disease model
- analysis of the model of hiv-1 infection of cd4^{+} t-cell with a new approach of fractional derivative
we write to declare our interest in publishing our work titled "fractional order model for the role of mild cases in the transmission of covid-19" with the journal "chaos, solitons and fractals".
key: cord-285435-fu90vb2z authors: björklund, tua a.; mikkonen, maria; mattila, pauliina; van der marel, floris title: expanding entrepreneurial solution spaces in times of crisis: business model experimentation amongst packaged food and beverage ventures date: 2020-11-30 journal: journal of business venturing insights doi: 10.1016/j.jbvi.2020.e00197 sha: doc_id: 285435 cord_uid: fu90vb2z
research summary: times of crisis require entrepreneurial responses to mitigate adverse effects and address new opportunities. this study focuses on how packaged food and drink entrepreneurs in finland took action to create and capture new value during the covid-19 crisis. examining 844 social media posts of 66 ventures between march and may 2020 and interviewing 17 of these ventures, we found the ventures to experiment with new business model variations, which not only expanded their set of solutions directly, but resulted in action-based learning leading to longer-term changes and increased capabilities for subsequent value creation. furthermore, collaborative experiments and prosocial support increased the solution space through developing the capabilities of the ecosystem.
managerial summary: the global lockdown measures in response to the coronavirus pandemic have disrupted supply, production, sales and consumption. facing these constraints, entrepreneurs can respond quickly and experiment to create new liquidity and opportunities. our analysis of packaged food and beverage entrepreneurs in finland during the crisis shows how entrepreneurs leverage existing resources and acquire new ones to create new offerings, operations and partnerships. these initial actions serve as experiments to learn from in creating and revising business models, promoting a virtuous cycle of further action and expanding the potential future solutions accessible to entrepreneurs. importantly, the opportunities available to a venture expand through both venture-specific learning and through supporting other actors in the ecosystem. the covid-19 pandemic and related restrictions are creating a global crisis, shaping the economic landscape and challenging entrepreneurs (kuckertz et al., 2020; brown and rocha, 2020). while organizational resilience is crucial to keeping businesses afloat in these times, crisis-based entrepreneurship research that might inform resiliency cultivation has remained scarce (doern et al., 2019; linnenluecke, 2017; monllor and murphy, 2017; bullough et al., 2014). with the covid-19 crisis, early research results have suggested diminished venture capital availability, particularly for early-stage ventures (brown and rocha, 2020). entrepreneurs have also been quick to take action to cut costs and seek support (giones et al., 2020; kuckertz et al., 2020; thorgren and williams, 2020), although entrepreneurial needs and policy responses have not always matched (kuckertz et al., 2020), with, for example, entrepreneurs reluctant to take on additional loans (thorgren and williams, 2020).
however, these studies have provided limited insight into whether and how entrepreneurs might move from survival to thriving, calling for further research on creating new opportunities through innovation and revenue-generating actions during the crisis (kuckertz et al., 2020; thorgren and williams, 2020). indeed, taking action to create and pursue opportunities is at the heart of entrepreneurship. as certo et al. (2009:319) state, "doing is a key theme running throughout the academic literature in entrepreneurship" (italics original). entrepreneurial behavior transcends present limitations and creatively repurposes resources (becherer and maurer, 1999; baker and nelson, 2005), with a growing literature on effectuation emphasizing how entrepreneurs engage in an active, flexible process focusing on experimentation and affordable losses to transform rather than adapt to existing environments (chandler et al., 2009; dew et al., 2008; read et al., 2009). similarly, recent research highlights the role of experimentation in devising novel, successful business models (bocken and snihur, 2020; mcdonald and eisenhardt, 2020). business models - defined as how organizations create, deliver and capture value (teece, 2010; zott et al., 2011; sjödin et al., 2020; schneider, 2019; mcdonald and eisenhardt, 2020) - reflect the realized strategy of organizations (spieth et al., 2020; casadesus-masanell and ricart, 2010) and the choices and actions entrepreneurs make in their offering, organization, networks and revenue models (mcgrath, 2010; spieth and schneider, 2016; schneckenberg et al., 2017; bocken and snihur, 2020). the set of possible business models the venture would be able to experiment with using its current resources, capabilities and networks can be captured by its solution space, i.e.
the range of potential solutions available to a venture, of which only a portion is actualized at any given moment (macpherson et al., 2016; similar to the concept of problem space in cognitive psychology and design research, simon, 1973; goel and pirolli, 1992; björklund, 2013). solution space expansion, in turn, represents increased options and potential for business model innovation and can thus offer an additional pathway to organizational resilience. in line with macpherson et al. (2015), we view crises as critical episodes for ventures, representing periods of concentrated activity and learning (cope, 2005; schneider, 2019). crises can intensify experimentation, with fisher et al. (2020:5) suggesting that the covid-19 pandemic can "give the permission to hustle", freeing "entrepreneurial actors from the perceived constraints or limitations that they typically face in their day-to-day work environments". macpherson, herbane and jones (2015) also found entrepreneurs to expand the solution space of their ventures through resource accretion prompted by crises. however, these studies focused on organization-specific or market-specific crises where the entrepreneurs were able to leverage skills and resources from stakeholders largely unaffected by the change or crisis. in contrast, novel value creation or capture has been rare in extant results on entrepreneurial action in the face of the covid-19 pandemic (kuckertz et al., 2020; thorgren and williams, 2020; giones et al., 2020). for example, while the majority of 456 surveyed swedish smes promptly took cost-cutting actions, fewer than 7% reported engaging in revenue-generating or innovation activities (thorgren and williams, 2020). as the availability and creation of community and collaboration can be crucial for venture resiliency (muñoz et al., 2020; giones et al., 2020), the global pandemic influencing society at large creates a challenging, yet important arena for value creation.
the current study sheds light on how entrepreneurs can experiment with new opportunities and business models to expand entrepreneurial solution spaces in such times of wide-spread collective crisis, examining the activities of packaged food and beverage ventures during the covid-19 pandemic in finland. the covid-19 pandemic and emergency measures have had a global impact, including on the packaged food and beverage industry in finland, as production, sales, distribution channels and consumption patterns have been disrupted (appendix 1). examining the consumer-facing action responses of 66 packaged food and beverage ventures portrayed in their social media activity during the crisis (appendix 2), we found that most entrepreneurs did indeed expand their solution range during the crisis through experimenting with novel products (68), services (30), sales channels (32) or prosocial actions (17) (appendix 5). furthermore, ventures began to exhibit such actions immediately when the lockdown was announced and continued throughout the 2.5-month tracking period (fig. 1). noticing the prevalence of engaging in novel action already amongst consumer-facing communications in social media, we then interviewed a subset of 17 ventures to gain more in-depth visibility into a wider range of experimentation in value offering, value creation architecture and revenue models (appendices 3 and 4). we found that most cases had experimented with multiple areas of their business model (appendix 6). many entrepreneurs described the crisis lowering the threshold for experimentation through creating a sense of urgency, forcing new solutions to compensate for lost revenue, giving a perceived mandate for experimentation or simply providing more time as travel and events were cancelled. as david remarked, "you just have to keep moving, do things, be present and show that you're alive and everything is happening, that might provide some bigger things over time.
[…] the corona situation has made it very concrete that we have to come up with things, and try things out."
2.1. areas of experimentation during the crisis
novel value offering was the most common form of consumer-facing experimentation, most often taking the form of new products and services (appendix 5 and 6). many ventures entered completely new product categories, in addition to making more incremental changes such as bundling products together or rebranding (appendix 5). collaborative experiments in new products and services tended to be incremental, joining forces for product bundles or producing collaborative variations of previously tested within-venture offerings. in addition to physical food and beverage products, many entrepreneurs tried out new online services, such as virtual tasting sessions, as supplements to their offering. particularly in the very first creative responses to the crisis, entrepreneurs described fast-tracking the development of ideas they had already been playing around with pre-crisis, while others found the new circumstances to fit ideas that had been discarded in pre-crisis conditions as, for example, too risky. ten out of 17 ventures also appealed to new target customer segments (appendix 6), most often switching from business customers to direct consumer offerings. seeking change in the competitive positioning, however, was rare, with only 5 out of 17 ventures entering new competitive landscapes through diversifying their offering portfolio to new markets or through successful experimentation in other business model areas. although visible to consumers mostly in the area of sales and delivery processes (appendix 5), all 17 interviewed ventures had experimented with their value creation architecture (appendix 6). the most common experiments were in the realm of developing core competencies and resources, in which all case ventures reported taking some kind of action (e.g.
acquiring the required skills to break into a novel product category or learning to use a new social media tool). experiments in novel partnerships and distribution processes tended to stay closer to the existing business model, whereas experiments in internal value creation processes often entailed larger departures from pre-crisis operations (appendix 6). experiments in revenue models, in turn, were rarer than in value offering or value creation architecture. these were often conducted jointly with experiments in other business model areas, either intentionally aiming to increase revenue or reduce costs, or needing to change the revenue model as a result of alterations in other areas. for example, nina's health snacks intentionally sought to reduce the fixed costs of their restaurant space by renting it out to remotely working teams, whereas in the case of leo's snacks, the radical change to their target customer segment and altered user preferences forced changes to their cost structure through streamlining their production. finally, the entrepreneurs engaged in a number of experiments that were not directly tied to their business models. these prosocial activities ranged from helping critical and at-risk groups, such as donating products to health care professionals and making deliveries to the elderly, to supporting fellow entrepreneurs. prosocial experiments could also be collective, with ventures joining forces to help both themselves and their communities or industries. for example, ben shared that "the same week as the lockdown was issued, at least all the companies [in the craft brewing industry] within our network started to build online stores. everyone was exchanging messages in a whatsapp group and comparing which is the best platform and so on."
analyzing the action responses of the entrepreneurs, we noted that rather than conducting isolated experiments, ventures typically bundled several elements together, and initial responses often led to further action, sparking repetition, iteration and scaling of efforts that built on what entrepreneurs learned through engaging in the initial experiments. for example, a distillery turning to producing hand sanitizer as a novel product category (value offering) also required experimenting in the value architecture with novel supply chains and production, was coupled with experimenting with a new revenue and cost model, and led to subsequent value architecture experiments with new delivery models and partnerships that would not have been possible with their original products. unsuccessful tests, in turn, were discontinued, as in the case of ben's roastery piloting an employee coffee subscription model for their business-to-business clients, due to low demand. we noted that a radical experiment in one of the areas of business innovation would lead to changes in one or more additional areas. for example, in the case of nina's health snacks (fig. 2), the pandemic halted the supply of a main ingredient of her core offering, causing her to seek alternative raw ingredients from finnish suppliers. this, in turn, led her to develop a novel product category, requiring the acquisition of new skills and competences, changing internal value creation and changing the overall revenue structure of the venture (all radical changes for the venture). radical experiments, when successful, often led to longer-term shifts in strategy or to adding complementary extensions to existing business models through entering new markets. for example, nina's experiments sparked creating a whole new brand for a new 'ready-to-eat' product portfolio as a result of her learnings from the positive market response to her initial actions.
similarly, kevin's distillery found their hand sanitizer gaining so much traction that, quantity-wise, it surpassed their beverage production, leading them to create new distribution solutions and plan further products in the category, no longer viewing it as a short-term crisis response but as a strategic diversifying act. however, longer-term effects on the business models could also result from more incremental experimentation. in anna's brewery (fig. 3), bundling their existing products to increase their sales in a more commercial format than they had been used to was surprisingly successful: "we are completely blown away with the result. it is just those six existing beers and our existing tote bags, but people … when you make it easy for them, they get excited." encouraged by the positive market response, they proceeded with a series of experiments, including more radical shifts from their original operations. in a more incremental effect, henri remarked that the positive experience they had with a seasonal product bundle made it easy to repeat it, and the type of bundle ended up as a permanent part of their dessert offering. on the other hand, several ventures shifted their focus from collapsed business-to-business sales to consumer sales, often seeing these initially as temporary measures to maintain cash flow rather than as a permanent change. for example, ben developed a whole new system for customers to pre-order beer online and then pick it up at the brewery as restaurant sales halted (fig. 4). after initial positive experiences, however, many intended to keep some form of the generated business-to-customer value offering and value creation architecture. this resulted in the ventures diversifying their offerings and operations in the longer term. however, these were considered relatively minor changes in service of preserving the pre-existing business model at large, rather than business model innovation.
in other cases, the experiments themselves were discontinued, but the capabilities the entrepreneurs had learned from them were used to execute existing business models more effectively. finally, examining the experimentation pathways, we noticed that collaborative business model experiments and experimenting with prosocial help enhanced networked or ecosystem capabilities for further experiments in novel value creation and capture. for example, david's roastery initiated a crisis-branded coffee, using portions of the profits to purchase gift cards from local small cafés and restaurants, which were then randomly included in customer webstore orders. through this response, the roastery added value for private customers and built its business-to-business clientele: "if [the cafés and restaurants are] not our existing clients, it should be pretty easy to get back to them again after things normalize a little, and say: 'hey, we were in touch a few months ago […]' and ask about how things are with their coffee, and there's potential clientele considering the future. we're leaving a positive memory". similarly, leo's snacks gave product giveaways to restaurants (fig. 5). rather than representing lasting changes to the value offerings, value creation architecture or revenue models of the ventures as such, these experiments led to changes in the resources and capabilities that could be leveraged for subsequent business model innovation, even when the experiments tended to be incremental. interestingly, such effects could be seen even in cases where the original experiment was not aimed at the venture's business model at all. figs. 2-5 show the experimentation pathways and their longer-term effects in four example ventures responding to the covid-19 crisis. for each venture, first response actions are in white boxes, follow-ups in light grey boxes and longer-term implications in dark grey boxes.
actions and effects are connected by arrows and lines where they were inspired by or resulted from the former actions. dotted borders around actions indicate collaborative efforts.
3. four pathways for expanding entrepreneurial solution spaces during the pandemic
taken together, we could see four types of pathways to expanding the solution space of business models accessible to ventures through the experimentation activities of the packaged food and beverage entrepreneurs (depicted in fig. 6). first, business model experimentation - abundant in both the social media analysis and the interviews - extended the set of solutions actualized by the ventures, through the new products, services, production and sales solutions, and so forth, created by the entrepreneurs during the crisis. these responses expanded the entrepreneurs' solution portfolio either by increasing the portion of the pre-crisis solution space they targeted (leveraging existing resources, capabilities and relationships) or by going beyond the pre-crisis solution space with novel solutions, enabled through proactively acquiring new resources, capabilities and relationships to create new offerings. engaging in and learning from these experiments also expanded the solution space of possible business models the ventures might pursue in the future through building new value creation capabilities. the second pathway, internal capability-based expansion of potential solutions, occurred when entrepreneurs developed new capabilities within their venture while engaging in business model experimentation. these new capabilities could result from learning new domain-specific skills and knowledge, gaining new insights on the direction and processes of the venture, or increasing entrepreneurial efficacy through initial successes. sometimes these capabilities were deliberately sought, while others were accrued as a by-product of gaining feedback through the experiments.
the third pathway also expanded the solution space through action-based learning from experimentation, but through its effects on relational and ecosystem capabilities for value creation rather than internal capabilities. again, these could be both intentionally sought and organically developed as a by-product of experimentation. by growing networks with new contacts and diversifying and deepening interactions with existing ones, the entrepreneurs learned more about their networks, finding complementary skills and creating new potential opportunities for collaboration - and thus again extending the solution space available to the ventures. in addition to increasing the capabilities of the ventures, instances of joint action also contributed towards enhancing the capabilities of the collaborators, expanding the solution spaces of both parties, and could even strengthen broader groups and communities of actors. thus, while many collaborative experiments were incremental in themselves, they increased their impact through indirect effects on both the entrepreneurs and their networks. finally, the fourth pathway for solution space expansion represented building relational and ecosystem value creation capabilities through experiments in prosocial help rather than direct business model experiments. while these experiments were rarer, they expanded the solution space of the support-giving entrepreneurs through building relational and ecosystem capabilities, even in instances that directly benefited solely the support receiver. in addition to specific contacts, these actions could serve to strengthen the clusters the ventures belonged to, such as the industry niche (e.g. small breweries producing craft beers) or regional communities (e.g. ventures or actors in a specific town). maintaining and growing the capabilities of stakeholders, in turn, opened up further value creation and capture possibilities for the entrepreneurs in the future through potential collaboration.
thus, while helping out others might not directly expand the solution repertoire or business model of a venture, strengthening collective capabilities could still enhance the venture's opportunities going forward.

the current study highlights the prevalence of, and opportunities for, value creation during global crises. both the social media analysis and founder interviews showcased an abundance of new business model experiments in response to the covid-19 pandemic amongst finnish packaged food and beverage ventures. these experiments not only supported crisis management by maintaining operational viability through the search for alternative revenue streams, but also further enhanced the resilience of the ventures by expanding the solution space (williams et al., 2017; herbane, 2019; duchek, 2020). as the value and originality of new ideas are difficult to predict before-the-fact (berg, 2016; fuchs et al., 2019), taking action creates new knowledge, skills and resources that can then be leveraged to discontinue, modify or scale subsequent efforts to pursue opportunities. this suggests that, when in doubt, entrepreneurs may be well served by action rather than inaction. indeed, the knowledge-creating effects of experimentation can help to speed up decision making under uncertainty by avoiding prolonged debate based on assumptions only (mcdonald and eisenhardt, 2020). as such, experimentation can be a key activity in practising the effectual logic of aiming to control rather than predict uncertain events (sarasvathy, 2001; read et al., 2009, 2016). in contrast to macpherson et al. (2015) and kuckertz et al. (2020), who examined entrepreneurial responses to crises, our study found that entrepreneurs rarely relied on one-directional support from their networks or additional funding from the government, investors or non-profit organizations. rather, entrepreneurs utilized their networks for collaboration and collective action.
the ventures launched new collaborative products, services and sales channels, leveraged and built their networks, and helped others in their local communities. although these actions were often initially incremental, even incremental experimentation can serendipitously lead to or gradually give rise to radical innovation (bocken and snihur, 2020). we found that entrepreneurs were able to find such opportunities for collaboration despite their stakeholders being adversely affected by the crisis. similar to post-disaster entrepreneurs (grube and storr, 2018), having covid-19 as a 'common' crisis for all initiated the sharing of vital information, as well as discussion of best practices and current experiences amongst ecosystem actors. crises can spark a mutual need for collective identity (giones et al., 2020), and social cohesion should be both supported and leveraged in crisis recovery (muñoz et al., 2020). collaborative experiments not only benefit the development of internal capabilities, but also enhance relational and ecosystem capabilities, further expanding the potential for future joint value creation and ecosystem resilience. as such, they initiate virtuous spirals of co-development of the entrepreneurs and their ecosystems (feld, 2013; björklund and krueger, 2016), with shared learning connected to more innovative enterprises (zhang et al., 2006). many of the entrepreneurs also reported being supported by their customers through 'supporting local' and purchasing from 'craft businesses', highlighting a can-do spirit of the wider ecosystem during this crisis. however, more research is still needed on the shared experience dynamic between entrepreneurs and their ecosystems and on how actions balancing profitability and goodwill are perceived by different stakeholders.
finally, many entrepreneurs found that the crisis further lowered their threshold for business model experimentation, fast-tracking dormant ideas and removing some perceived constraints, similar to institutional actors finding "permission to hustle" during the pandemic (fisher et al., 2020). though both the current study and that of thorgren and williams (2020) illustrated widespread entrepreneurial action in response to the covid-19 pandemic, they portrayed opposite tendencies in the frequency of value creation experimentation relative to cost-cutting actions - despite both data sets representing entrepreneurs in nordic countries with highly developed social systems mitigating economic, social and health concerns (wilkinson and pickett, 2009). as the current study suggests that early experiments - even those seemingly superfluous or trivial - can lead to a lasting competitive edge through action-based learning and capability building, further comparative research should explore environmental and entrepreneurial variables that promote and inhibit value creation experimentation during crises. while based on a small number of ventures in a single sector and country, the current study contributes towards unpacking the interconnections between entrepreneurial resources, resilience and value creation through business model experimentation in times of crisis. through experimentation, ventures not only mitigate the short-term adverse effects of crises but expand their overall solution space and increase the capabilities they can draw from. based on the insights of our study, we provide a checklist for entrepreneurs (table 1) to identify opportunities for experimentation and capability cultivation to expand their solution space during crises such as the covid-19 pandemic.
although further research is still needed into the post-crisis effects of such solution space expansions, as well as if, when and how new capabilities are subsequently put to use for business model innovation, at its best entrepreneurial experimentation can create new value, capabilities and lasting resilience for both ventures and those in their ecosystem.

none.

we appreciate the input from our interview partners who supported this research under very difficult conditions. we would also like to thank business finland (grant 211822) and the jenny and antti wihuri foundation for partial funding of the research, as well as anna kuukka for helping to examine and visualize our social media data.

table 1. checklist for entrepreneurs

value offering:
• novel bundling, rebranding, streamlining or adding physical or online components to your offering
• expanding from business-to-business to business-to-customer, and vice versa
• revisiting dormant or discarded ideas to assess their appropriateness for the changed circumstances
• working with other ventures to create joint offerings to reach new customers

value creation architecture:
• leveraging staff skills and networks to create new solutions and upskill
• repurposing resources such as production and facilities
• reaching out to new partners in supply, sales and delivery chains
• pooling resources and deepening collaboration with existing networks
• establishing new physical, online or joint points of sale

internal capabilities:
• novel domain-specific knowledge and skills
• validating or disproving assumptions on viable strategy, operations and positioning
• increased efficacy for staff through early successes

relational and ecosystem capabilities:
• a wider set of partners
• higher-quality relationships with increased awareness of synergies
• enhanced network coverage

venture-specific action:
• donating offerings, expertise or a portion of sales to those affected by or fighting the crisis
• providing visibility, platforms or campaigning for relevant causes

collective action:
• lobbying for changes needed to support your region or industry
• creating forums and avenues for learning from one another

relational and ecosystem capabilities:
• better positioned and more diverse partners to collaborate with
• collective culture for support through enhanced cohesion
• improved regional or cluster competitiveness to attract resources

although the impact of the covid-19 pandemic has been global, the severity and associated experience, including imposed restrictions, are contextual, as is also the case for entrepreneurs in finland. after the first infected domestic citizen on february 26, 2020, the finnish government declared a state of emergency on march 16, 2020 (yle, 2020; helsingin sanomat, 2020), restricting gatherings of over ten people, closing borders, and enforcing social distancing and hygiene measures. restricting gatherings resulted in the cancellation of events and festivals, as well as the closure of restaurants, bars, hotels and specialty stores, meaning excess raw ingredients were at hand due to decreased demand and sales. the closure of schools and transition to remote studies on march 18, 2020 meant parents had to organize homeschooling for the family - entrepreneurs included. restrictions on travel led to reduced availability of raw ingredients and packaging materials, changes in supply chains, and the halting of international sales efforts.
to comply with the social distancing restrictions and hygiene measures, situations involving human contact were minimized, leading to cancellations of tasting events in supermarkets and causing complications in production and sales facilities to ensure a safe working environment. restaurants struggled with unclear demands and a lack of support, many closing their doors and the remainder transitioning to take-away service only - making a significant dent in the food and beverage cluster. as a support measure, the government announced, simultaneously with the emergency declaration, a five billion euro support package in the form of grants and loans for small companies towards research and planning of new business operations (up to 10k euro) and execution of these development actions (up to 100k euro) (kpmg, 2020), which was increased to 15 billion euro four days later. on april 2nd, the government announced that entrepreneurs, including sole traders and freelancers, were eligible for unemployment benefits (kpmg, 2020; ministry of economic affairs and employment, 2020). the support system in finland for entrepreneurship, in general, is well established with clear policies, regulation and infrastructure (gem, 2018), and comparing finland with other eu members, opportunity perception levels are higher while fear of failure is lower on average (gem, 2018). in may 2020, some national restrictions were eased, and it was announced that as of the 1st of june, groups of 50 were allowed to meet, restaurants could reopen "with special arrangements" (including opening only half of the seats inside venues, restricted alcohol service and no self-service offering), and public places were opened gradually (finnish government, 2020). the state of emergency ended on june 16, 2020, although the pandemic continues as of writing.
our study pursued a two-fold design (appendices 2 and 3) to explore how entrepreneurs responded to the crisis, focusing on the responses taken while government-imposed restrictions were in place; data collection therefore spanned from the introduction of the restrictions to their partial easing (mid-march to the end of may). similar to macpherson et al. (2015), we view the crisis period as a critical episode of concentrated activity and learning (cope, 2005) and build on a critical incident approach (flanagan, 1954; chell, 2004). as conducting research with human subjects during crises poses unique practical and ethical challenges (mezinska et al., 2016), we coupled a non-obtrusive longitudinal design (examining social media activity for 66 ventures, appendix 2) with a more in-depth cross-sectional data set (interviews with 17 select cases, appendix 3) a month or two into the crisis. the first study tracked packaged food and beverage ventures along the nature and stages of the crisis (doern et al., 2019) through social media activity, enabling a novel longitudinal analysis of during-crisis responses in real time (buchanan and denyer, 2013). identifying instagram as the social media platform where the ventures were most active, we tracked public activity on the platform, including both posts that remain visible on the account grid (permanent collections of images and their captions on the account feed) and temporary stories (visible to the followers of the user's account for 24 h). this provided unobtrusive yet effective access to empirical data on consumer-facing value creation and capture action responses, avoiding the pitfalls related to recall such as false recollection, a posteriori rationalization, missing archives, etc. (roux-dufort, 2016:25).
based on targeted searches and scanning of media coverage, we identified 66 active finnish small-to-medium packaged food and beverage ventures' instagram accounts to follow. data was captured and saved daily from march 16, 2020 (declaration of the state of emergency) to may 31, 2020 (the last day before restaurants could reopen with restrictions), resulting in a total of 844 posts related to the covid-19 situation. when an image or post caption mentioned a covid-19 crisis-related activity (e.g. social distancing, maintaining hygiene, staying home, conducting remote work, spending time outdoors), it was considered as one data point and copied through screenshots to a daily data archive (see anonymized examples in figure a1). if several posts or stories were published on the same day, each of them was considered a separate data point. not all 66 companies posted covid-19 related content; therefore the collected response data set represented activities from 51 companies (including 11 ventures displaying only marketing or communications responses rather than taking action to create new value). to examine the responses, the data set of 844 covid-19 related instagram posts and stories was scanned for value creation and capture action responses (experimenting with novel value offerings or other business model elements visible to customers) linked to the crisis, excluding descriptions of individual or company circumstances (such as personally working remotely for the week or having trouble procuring ingredients) and marketing messages without novel solutions (such as promotion of previous products through existing channels). the action response posts and stories were then coded and categorized based on a semantic-level thematic analysis (braun and clarke, 2006) of the type of response portrayed in the post, using constant comparative analysis (schneider and wagemann, 2012).
the images and their accompanying text were analyzed as a pair, as removing either text or image would have resulted in the loss of context and meaning (laestadius, 2016). in unclear instances, researchers separately categorized posts and, in case of disagreement, the post was discussed until agreement was reached and categories were further defined as a result. this iterative process resulted in 13 categories of actions evidencing novel solutions (see appendix 5), where the self-descriptive categories themselves are key results characterizing the types of novel value offerings that the entrepreneurs experimented with (butterfield, 2005). finally, the timing of these categories of social media posts was plotted on a week-by-week basis for the 11.5 weeks of data collection from the start of the state of emergency to partial re-opening, allowing us to examine the timing of responses. the second, interview-based, study aimed at providing a more in-depth understanding of the underlying mechanisms of business model experimentation and solution space expansion (e.g. runyan, 2006; doern, 2016; williams and vorley, 2015), revealing changes in value creation architecture and revenue models beyond what was visible to the general public through social media. we approached 22 of the 51 active cases from the social media dataset, targeting a rich variety of responses. seventeen entrepreneurs accepted the invitation, all being either founders or holding director-level positions in a small company producing packaged goods (see appendix 4). it is noteworthy that our sampling is biased towards active cases, representing relatively high-performing entrepreneurs who were willing to share their crisis response. negative reactions or a lack of reaction to the crisis may thus be underrepresented in the interview data set. altogether 17 interviews were conducted between april and early june 2020, either in person or over distance via phone or teleconferencing platforms.
we note that the timing of the interviews during the crisis may affect the results, with 1.5 months between the first and last interview (giving later interviewees more time to potentially recover and create new solutions, but also more time to experience additional challenges and fatigue). the interviews lasted from 18 to 31 min and were audio-recorded and transcribed verbatim for analysis. for presentation purposes, excerpts have been translated from finnish to english by authors fluent in both languages. the interviews were semi-structured around the general themes of coping with the situation, unpacking the situation after the declaration of the state of emergency, the types of action taken by the venture during the crisis and how these had been received by customers or stakeholders, and their future outlook and plans. coding of the interview data was conducted using axial coding, going from first-order to second-order to aggregate dimensions (gioia et al., 2013). first, we went through the transcripts to identify responses related to the impact and perception of the crisis, the ventures' responses (actions, and their corresponding reasons, enablers and consequences), and the entrepreneurs' plans. first responses were coded using the response coding scheme developed for the social media responses (see appendix 5). we then reviewed transcripts against the nine elements of business model innovation identified by spieth and schneider (2016), coding for changes in target customers, products/services and competitive positioning for value offering; core competences/resources, internal value creation activities, role/involvement of partners and distribution for value creation architecture; and revenue and cost mechanisms for revenue models.
similar to andries and debackere (2013), we also categorized the degree of novelty of the effects in terms of the degree of change, or the distance between the original and revised business model, examining novelty for the venture rather than the industry. these layers of coding focused on identifying where novel actions had been taken, and are represented in appendix 5. the second layer of analysis focused on the connections between actions and the consequences of experimenting with new business model elements. pathways across actions were charted for each case (see example figs. 2-5), as well as a characterization of the overall effects on the venture's business model (appendix 5). based on the identified experimentation pathways and effects, we identified longer-term effects in terms of their effect on the business models of the ventures and patterns in the spread and sequence of experimentation activities. these findings were then finally connected to the aggregate dimensions of expanding solution spaces, noted in fig. 6.

appendix 5: response categories (excerpt)
• new online stores: establishing online stores to sell products, either delivered directly to customers or pre-ordered online and picked up by the customers from venture premises
• new physical sales points: setting up new physical sales or product pick-up points that comply with covid-19 directions (counts: 11 (10) 1)
• joint sales channels: creating joint sales channels (e.g. drive-through pop-up with another company) or utilizing each other's sales points (e.g. a section of the factory store dedicated to other companies' products)
• other experiments, prosocial actions: support actions for victims of the covid-19 crisis (e.g. at-risk groups), the ones working in the 'frontlines' (e.g. giving away food products to the nurses working in a hospital as a 'thank you' for their work), other ventures or local communities (counts: 17 (12) 8)

references
• business model innovation: propositions on the appropriateness of different learning approaches
• creating something from nothing: resource construction through entrepreneurial bricolage
• the proactive personality disposition and entrepreneurial behavior among small company presidents
• balancing on the creative highwire: forecasting the success of novel ideas in organizations
• initial mental representations of design problems: differences between experts and novices
• generating resources through co-evolution of entrepreneurs and ecosystems
• lean startup and the business model: experimenting for novelty and impact. long range planning
• entrepreneurial uncertainty during the covid-19 crisis: mapping the temporal dynamics of entrepreneurial finance
• researching tomorrow's crisis: methodological innovations and wider implications
• danger zone entrepreneurs: the importance of resilience and self-efficacy for entrepreneurial intentions
• fifty years of the critical incident technique: 1954-2004 and beyond
• entrepreneurial orientation: an applied perspective
• causation and effectuation processes: a validation study
• essential guide to qualitative methods in organizational research
• toward a dynamic learning perspective of entrepreneurship
• outlines of a behavioral theory of the entrepreneurial firm
• entrepreneurship and crisis management: the experiences of small businesses during the london
• special issue on entrepreneurship and crises: business as usual? an introduction and review of the literature
• organizational resilience: a capability-based conceptualization
• journal of business venturing insights 14, e00173.
• finnish government, 2020 [ministry notice for changing restrictions for restaurants in finland]
• the critical incident technique
• the ideator's bias: how identity-induced self-efficacy drives overestimation in employee-driven process innovation
• seeking qualitative rigor in inductive research: notes on the gioia methodology
• revising entrepreneurial action in response to exogenous shocks: considering the covid-19 pandemic
• embedded entrepreneurs and post-disaster community recovery
• outo virus kiinasta mullisti elämän kolmessa kuukaudessa – aikajana näyttää, miten epidemia eteni suomessa [a strange virus from china transformed life in three months – a timeline shows how the epidemic progressed in finland]. national news summary of the epidemic in finland
• rethinking organizational resilience and strategic renewal in smes
• startups in times of crisis - a rapid response to the covid-19 pandemic
• the sage handbook of social media research methods
• resilience in business and management research: a review of influential publications and a research agenda
• developing dynamic capabilities through resource accretion: expanding the entrepreneurial solution space
• parallel play: startups, nascent markets, and effective business-model design
• business models: a discovery driven approach. long range planning
• research in disaster settings: a systematic qualitative review of ethical guidelines
• ministry support notice for medium enterprises in finland
• natural disasters, entrepreneurship, and creation after destruction: a conceptual approach
• a meta-analytic review of effectuation and venture performance
• response to arend, sarooghi, and burkemper (2015): cocreating effectual entrepreneurship research. acad
• delving into the roots of crises: the genealogy of surprise. the handbook of international crisis communication research, 43
• business model innovation and decision making: uncovering mechanisms for coping with uncertainty
• how to approach business model innovation: the role of opportunities in times of (no) exogenous change. r&d manag
• set-theoretic methods for the social sciences: a guide to qualitative comparative analysis
• value creation and value capture alignment in business model innovation: a process view on outcome-based business models
• business model innovation in strategic alliances: a multi-layer perspective
• business model innovativeness: designing a formative measure for business model innovation
• business models, business strategy and innovation. long range planning
• staying alive during an unfolding crisis: how smes ward off impending disaster
• income inequality and social dysfunction
• organizational response to adversity: fusing crisis management and resilience research streams
• the impact of institutional change on entrepreneurship in a crisis-hit economy: the case of greece
• conceptualizing the learning process in smes: improving innovation through external orientation
• the business model: recent developments and future research

appendix 6. experimentation and its longer-term effects on ventures' business models

key: cord-266593-hmx2wy1p
authors: cope, robert c.; ross, joshua v.
title: identification of the relative timing of infectiousness and symptom onset for outbreak control
date: 2020-02-07
journal: journal of theoretical biology
doi: 10.1016/j.jtbi.2019.110079
sha:
doc_id: 266593
cord_uid: hmx2wy1p

abstract: in an outbreak of an emerging disease, the epidemiological characteristics of the pathogen may be largely unknown. a key determinant of the ability to control the outbreak is the relative timing of infectiousness and symptom onset. we provide a method for identifying this relationship with high accuracy based on simulated household-stratified symptom-onset data. further, this can be achieved with observations taken on only a few specific days, chosen optimally, within each household. the information provided by this method may inform decision-making processes for outbreak response.
an accurate and computationally-efficient heuristic for determining the optimal surveillance scheme is introduced. this heuristic provides a novel approach to optimal design for bayesian model discrimination.

the timing of infectiousness relative to symptom onset has been identified as a key factor in the ability to control an outbreak. the explanation is intuitive: if symptoms appear before infectiousness, then contact tracing and isolation strategies will be effective, whereas for post-infectiousness symptom presentation, broader, non-symptom-based strategies must be adopted. consequently, identifying the relative timing as early as possible in an outbreak is imperative to assessing the potential for control and selecting a measured response. severe acute respiratory syndrome (sars) is a prime example of a disease in which symptoms foreshadow significant levels of infectiousness. this played a critical role in limiting mortality and morbidity in outbreaks during 2003, via simple public health measures such as isolation and quarantining (ksiazek et al., 2003; lee et al., 2003; fraser et al., 2004; anderson et al., 2004; hsieh et al., 2005; day et al., 2006). smallpox is most similar to sars in this respect, but must be contrasted with hiv, where a large proportion of secondary infections occur before symptoms. for influenza, the relationship is less clear, with symptoms and infectiousness likely coinciding closely, and some transmission possible before symptom onset (patrozou and mermel, 2009; lau et al., 2010). for established diseases, experimental evidence (charleston et al., 2011) or large-scale detailed case information (international ebola response team et al., 2016) can provide insight into the relative timing of symptom onset and infectiousness; however, this relationship will not be known in an outbreak of an emerging pathogen.

* corresponding author. e-mail address: joshua.ross@adelaide.edu.au (j.v. ross).
therefore, one must turn to early outbreak surveillance data for insights. many jurisdictions organize their emerging disease monitoring policies around households. as an example, first few hundred studies are proposed as a first response surveillance scheme following the identification of a novel disease and/or strain as part of national pandemic plans ( mclean et al., 2010; van gageldonk-lafeber et al., 2012; ahmppi, 2014 ) . following the observation of a first symptomatic individual, their household is enrolled in an intensive surveillance program, so that day of symptom onset for subsequent cases within that household are recorded. studies of this form were developed for pandemic influenza in 2009 in both the united kingdom ( mclean et al., 2010 ) and the netherlands ( van gageldonk-lafeber et al., 2012 ) ; and have been instituted in response to a lack of methods for determining disease epidemiology as required for determining a proportionate response to novel outbreaks. in the australian health management plan for pandemic influenza (ahmppi), first few hundred studies are proposed to be implemented following the first case of a novel influenza strain, with households being tracked nationally (but managed at the state/territory level) ( ahmppi, 2014 ). methods have recently been developed to characterise transmissibility and severity of a novel pathogen -other factors influencing ability to control an outbreak -based on such data walker et al., 2017 ) . currently lacking is a method for accuhttps://doi.org/10.1016/j.jtbi.2019.110079 0022-5193/© 2019 elsevier ltd. all rights reserved. between states within each household continuous-time markov chain; the five observation models being discriminated between; and, the way that these household-level data are observed. the data observed in each model are the number of observations of the relevant transition each day, within each household: data from four illustrative sample households are shown here. 
(b) random forest feature importance for the full 14day design, used to construct the heuristic for smaller designs. each bar represents a feature, so within each day there are (in this case, for households of size 5) 6 features, corresponding to the proportion of households with each incidence count, each day. (c) resulting random forest accuracy (proportion of test simulations assigned to the correct model) as design size increases, for the true optimal design (solid lines) and heuristic solution (crosses with dashed line). in this case, random assignment would produce accuracy 0.2. (d) two-class accuracy of random forest model discrimination: this measures the accuracy of discrimination between models with symptoms before or coincident with infectiousness, versus models with symptoms beginning after infectiousness. in this case, random assignment would produce accuracy 0.52 (red dashed line). these results correspond to households of size 5, with 10,0 0 0 training samples from each model, each with parameters drawn from the distributions displayed in supplemental figure s1 . (for interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) rate determination of relative timing of infectiousness and symptom onset using this data. here we introduce, and demonstrate through a simulation study, a method for identifying with high accuracy the timing of infectiousness relative to symptom onset from household-stratified symptom surveillance data (generated via simulation). remarkably, we show this is achievable with observations taken on only a few specific days, chosen optimally, within each household. our approach to determining the optimal surveillance scheme is based on an efficient heuristic. this heuristic provides a general, computationally-efficient approach to optimal design for bayesian model discrimination. 
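the day-selection heuristic described in panel (b) of fig. 1 can be sketched as follows: train a random forest on summaries from the full multi-day design, sum the feature importances belonging to each day, and keep the days with the largest totals. this is a minimal illustration using scikit-learn, under the assumption that the summary-statistic columns are grouped by day; the synthetic signal in the usage example is an invented stand-in, not the paper's simulated data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def heuristic_design(X_full, y, n_days, features_per_day, k):
    """Select k observation days from the full design.

    Fits a random forest to the full-design summaries, sums feature
    importances within each day (columns assumed grouped day-by-day),
    and returns the k days with the largest totals, sorted.
    """
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X_full, y)
    # importances has length n_days * features_per_day; group per day
    per_day = clf.feature_importances_.reshape(n_days, features_per_day).sum(axis=1)
    return sorted(np.argsort(per_day)[-k:].tolist())
```

usage: with synthetic two-class data in which only one day's columns carry any signal, the heuristic should retain that day among its top-k choices.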
we model disease dynamics within each household as a continuous-time markov chain (keeling and ross, 2008) that counts the number of household members that are susceptible (s), exposed (e1 and e2), infectious (i1 and i2), or recovered (and immune; r). using two compartments each for the exposed and infectious classes allows for a broad range of outbreak observation dynamics, and gives erlang-distributed exposed and infectious periods. under this model, the timing of symptom onset relative to infectiousness is mapped to which transition is observed: symptoms appear either upon infection, between infection and infectiousness, coincident with infectiousness, between infectiousness and recovery, or upon recovery. the challenge is to determine which of these five (observation) models best describes the household-stratified symptom-onset data (fig. 1a). there is a relatively rich literature on bayesian model discrimination (chopin et al., 2013; drovandi and cutchan, 2016; alzahrani et al., 2018; touloupou et al., 2018), and on optimal design for such discrimination (chaloner and verdinelli, 1995; ryan et al., 2015), which provide the most appropriate tools and framework to address this question. a general difficulty with this theory is that practical implementation is at best difficult, and often infeasible. this has led to methods based on approximate bayesian computation (abc), which requires only simulation of realisations from each model, and is computationally feasible for a wide range of models. unfortunately, there exists 'a fundamental difficulty' in establishing robust methods based upon summary statistics (robert et al., 2011; robert, 2016); however, see the recent work of dehideniya et al. (dehideniya et al., 2018). another approach to model discrimination in an abc framework has been proposed by pudlo et al. (pudlo et al., 2015).
they treat model discrimination as a classification problem, for which machine learning methods are ideal, and in particular propose the use of random forests to perform this task. this approach provides a highly efficient and, importantly, robust method for model discrimination. hainy et al. (hainy et al., 2018) expand on this approach as specifically applied to optimal design for model discrimination. we apply random forest-based bayesian model discrimination, first for accurate, robust characterisation of the relative timing of symptoms and infectiousness, and second, for optimal design of early outbreak surveillance for accurate model discrimination. specifically, the aim of the latter is to select an optimal surveillance scheme, consisting of a fixed number of observations, in order to discriminate five different timings of symptom onset relative to infectiousness, within a household-stratified epidemic model. we evaluate, using simulated data, the impact of assumptions and summary statistics. additionally, we propose a new, computationally-efficient and highly-accurate heuristic for optimal design choice, which in this application determines the optimal days upon which to perform surveillance in households. we demonstrate using an example system of a novel infectious disease, spreading in a population structured into households. we assume that the population is large and that mixing between households is random, such that after a household is initially infected, the remaining transmission within the household is independent of transmission outside the household (ross et al., 2010; black et al., 2013).

table 1. events, transitions and rates within a household. n is the (fixed) household size; β, σ and γ are the rates of infection, gaining infectiousness and recovery, respectively.
therefore, transmission dynamics within households can be modelled independently, i.e., with infection only occurring between individuals within a household, rather than between households. note that this is an assumption which simplifies the simulation process, but it could be modified if necessary. given this novel etiological agent, we wish to determine whether symptom onset occurs at the time of infection, between infection and infectiousness, coincident with infectiousness, after infectiousness, or coincident with recovery (i.e., these are the five candidate models we wish to discriminate). the model behaviours are otherwise assumed identical. to be emphatic, the underlying disease dynamics are identical in all five models, each differing only in when observations are made, corresponding to different timings of symptom onset (fig. 1a). we focus on selecting between these observation models because the relative timing of infectiousness and symptom onset is critical to effective outbreak management: quarantine can be applied effectively if symptoms occur before (or possibly coincident with) infectiousness. we model the epidemic dynamics in households as a continuous-time markov chain (figure 1a) (keeling and ross, 2008). individuals transition from susceptible (s) to exposed (e1, and subsequently e2), then to infectious (i1, and subsequently i2), and finally to recovered (r), with rates as described in table 1. the model dynamics are general, but explicitly resemble the dynamics of a respiratory virus such as influenza, as potential future pandemic influenza is of substantial concern globally. the collection of first few hundred data is included in the australian health management plan for pandemic influenza (ahmppi, 2014), along with similar pandemic preparedness plans in other jurisdictions, so demonstrating the ability to discriminate models using these data for diseases resembling influenza is highly relevant.
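as a concrete illustration, the within-household chain described above can be simulated with a standard doob-gillespie algorithm (the simulation method the paper itself reports using). this is a hedged sketch rather than the authors' code: the function name `simulate_household` is ours, the erlang-2 stages are implemented with per-stage rates 2σ and 2γ, and we adopt the convention, consistent with the stated priors, that σ governs exit from the exposed stages and γ exit from the infectious stages.

```python
import numpy as np

def simulate_household(n, beta, sigma, gamma, t_max=50.0, rng=None):
    """doob-gillespie simulation of the within-household s-e1-e2-i1-i2-r
    continuous-time markov chain. each of the two exposed stages is left
    at rate 2*sigma and each infectious stage at rate 2*gamma, so the
    total exposed and infectious periods are erlang-2 with means 1/sigma
    and 1/gamma. returns a chronological list of (time, event) tuples,
    recording only the three transitions to which symptom onset can be
    attached: infection, gaining infectiousness, and recovery."""
    rng = np.random.default_rng(rng)
    s, e1, e2, i1, i2 = n - 1, 1, 0, 0, 0  # one newly infected index case
    t, events = 0.0, [(0.0, "infection")]
    while True:
        rates = np.array([
            beta * s * (i1 + i2) / n,  # s  -> e1 (new infection)
            2.0 * sigma * e1,          # e1 -> e2
            2.0 * sigma * e2,          # e2 -> i1 (gains infectiousness)
            2.0 * gamma * i1,          # i1 -> i2
            2.0 * gamma * i2,          # i2 -> r  (recovery)
        ])
        total = rates.sum()
        if total == 0.0 or t > t_max:
            return events
        t += rng.exponential(1.0 / total)
        k = rng.choice(5, p=rates / total)
        if k == 0:
            s, e1 = s - 1, e1 + 1
            events.append((t, "infection"))
        elif k == 1:
            e1, e2 = e1 - 1, e2 + 1
        elif k == 2:
            e2, i1 = e2 - 1, i1 + 1
            events.append((t, "infectiousness"))
        elif k == 3:
            i1, i2 = i1 - 1, i2 + 1
        else:
            i2 -= 1
            events.append((t, "recovery"))
```

the returned event log contains exactly the transitions to which the five observation models attach symptom onset, so a simulated data set for any of the five models is obtained by reading symptom times off the corresponding event type.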
as such, prior parameter choices and the overall duration of the observation process (i.e., 14 days) also reflect influenza dynamics. we assign a prior distribution to each parameter (supplemental figure s1), based on physical quantities, to reflect the assumed prior knowledge of the etiological agent:

• 1/σ ~ gamma(6, 1/2), representing a mean exposed duration of 3 days (mode at approximately 2.5 days);
• 1/γ ~ gamma(6, 1/2), representing a mean infectious duration of 3 days (mode at approximately 2.5 days); and,
• a prior representing a mean r0 (the expected number of secondary cases caused by an infectious individual in a fully susceptible population) of 2 (mode at approximately 1.5).

these distributions are sampled per simulation, i.e., sampled parameters are kept constant across all households within a given epidemic. we note that these priors are relatively broad, reflecting uncertainty around disease transmission dynamics, but within a range resembling the dynamics of a respiratory virus such as influenza. prior distributions should be chosen to reflect what is known about the disease of interest. following the first symptomatic case in a household, the number of symptomatic cases within the household is observed daily (i.e., the unit of time considered is one day). the instant that the first individual in a household shows symptoms is time zero. then, the number of cases seen before time 1 constitutes the first observation, the number between times 1 and 2 the next observation, and so on. this proceeds for 14 days, with any symptoms occurring after time 14 not observed. the 14-day duration allows time for the index case and subsequent infections to progress through the stages of infection, given the transmission model and parameters chosen, resulting in most household transmission being observed within this time.
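a minimal numpy sketch of this prior sampling follows. `sample_parameters` is our name; because the exact form of the r0 prior is not recoverable from the text, gamma(4, 1/2) is assumed here purely because it reproduces the stated mean of 2 and mode of 1.5, and the relation beta = r0 * gamma is likewise our illustrative choice for converting r0 into a transmission rate.

```python
import numpy as np

def sample_parameters(rng=None):
    """draw one (r0, sigma, gamma, beta) tuple from priors shaped like
    supplemental figure s1: gamma(6, 1/2) on the exposed and infectious
    durations (mean 3 days each). the r0 prior's exact form is not
    given in the text; gamma(4, 1/2) is an assumption matching the
    stated mean (2) and mode (1.5)."""
    rng = np.random.default_rng(rng)
    exposed_duration = rng.gamma(shape=6, scale=0.5)     # mean 3 days
    infectious_duration = rng.gamma(shape=6, scale=0.5)  # mean 3 days
    r0 = rng.gamma(shape=4, scale=0.5)                   # mean 2, mode 1.5
    sigma = 1.0 / exposed_duration
    gamma = 1.0 / infectious_duration
    beta = r0 * gamma  # illustrative transmission rate implied by r0
    return r0, sigma, gamma, beta
```

sampling one tuple per simulated epidemic, and holding it fixed across all households in that epidemic, matches the per-simulation sampling described above.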
if the disease progressed on a different timescale, the duration and frequency of observation should be varied appropriately; e.g., an infection with slower outbreak dynamics might be observed weekly rather than daily. when testing the effect of asymptomatic infections on model discrimination, we sample an additional parameter, p_obs, the probability that an individual shows symptoms (implemented as an independent bernoulli trial for each individual at the time of symptom onset). note that p_obs is held constant within each simulation, i.e., it varies by outbreak, but not by infected individual. we explored two scenarios: (1) p_obs ~ beta(5, 5) (i.e., a mean p_obs of 0.5), and (2) p_obs ~ beta(7.5, 2.5) (i.e., a mean p_obs of 0.75). figure s1 includes a visualisation of these distributions. we emphasise that in the asymptomatic infection scenario, data collection from a household begins with the first observed symptomatic case in that household; the index case may be asymptomatic or symptomatic. in preliminary studies (not reported) we estimated the accuracy of model discrimination with fixed, known parameters; the resulting accuracy was higher than with parameters sampled from the prior distribution. however, we report only the results with parameters sampled from a prior distribution here, as in an ongoing outbreak exact parameter values are likely to be unknown. to discriminate models, we use the approximate bayesian random forest approach of pudlo et al. (pudlo et al., 2015). a random forest is a popular machine learning classifier that operates by aggregating many classification trees, each constructed on a random subset of predictors and a bootstrap sample of the training data (hastie et al., 2009). when making a prediction, the classification from each tree is determined, with the label predicted by the highest number of trees being the prediction of the random forest.
implementations of the random forest algorithm are available in most commonly-used software packages. the process of bayesian model discrimination using random forests proceeds as follows:

• select a number of simulations, n_s, and a number of households, n_h.
• for each model: sample a set of parameters θ = (r0, σ, γ) from the (prior) distributions; simulate n_h households given these parameters; and repeat this process n_s times.
• given the n_s simulations from each model, extract the data corresponding to the considered design.
• construct a random forest that predicts the model label, given the simulations.
• assess the accuracy of the process on a left-out test set.

infections within each household are simulated using a standard doob-gillespie algorithm for simulating continuous-time markov chain dynamics. random forests were constructed using the python scikit-learn RandomForestClassifier algorithm (pedregosa et al., 2011), with 200 trees. note that we use a completely separate set of test simulations to determine accuracy (of the same size as the training data), rather than out-of-bag error. out-of-bag error is an error metric commonly used with random forests that is calculated using the training data, rather than a separate test set. it relies on the structure of the random forest: only a (randomly sampled) proportion of the training data (i.e., simulations) is used to construct each tree, so the remaining training data may be used to test the accuracy of that tree. aggregating the result across all trees gives the out-of-bag error. in some cases out-of-bag error is prone to bias, so to ensure we are correctly assessing accuracy we instead use a left-out test set. we report accuracy as the proportion of all test samples that are correctly assigned to their generating model, i.e., we test 10,000 left-out test simulations from each model, and count those assigned the correct label.
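the training-and-testing loop above can be sketched with scikit-learn. the helper names and the synthetic feature shapes in the usage test are ours, but the classifier settings (200 trees, accuracy measured on a completely separate left-out test set rather than out-of-bag error) follow the text; the mapping of labels 0-2 to the pre-infectiousness/coincident models is an illustrative assumption.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def discriminate(train_X, train_y, test_X, test_y, n_trees=200, seed=0):
    """fit the 200-tree random forest used in the paper and report
    accuracy on a completely separate left-out test set."""
    rf = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    rf.fit(train_X, train_y)
    return rf, rf.score(test_X, test_y)

def two_class_accuracy(rf, test_X, test_y, pre_labels=(0, 1, 2)):
    """collapse the five observation-model labels into symptoms
    before/coincident-with infectiousness (assumed here to be labels
    0-2) versus after infectiousness (labels 3-4), and score that
    coarser discrimination."""
    coarse = lambda y: np.isin(y, pre_labels)
    return float(np.mean(coarse(rf.predict(test_X)) == coarse(np.asarray(test_y))))
```

note the invariant that the two-class accuracy can never fall below the five-class accuracy, since any correct five-class prediction is also correct after collapsing; this is why the paper's two-class figures are consistently the higher of the two.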
we also count the number that were assigned correctly to pre-infectiousness or coincident-with-infectiousness symptoms, versus symptom onset after infectiousness has begun: we call this the two-class accuracy. this was tested as it is the most relevant set of models to discriminate for determining the effectiveness of quarantine for disease control. all code necessary to produce the simulated data and perform model discrimination will be made available publicly (upon publication). to operationalise this process during an outbreak, the observed household data (on the days corresponding to the chosen design) would be input into a pre-trained random forest model, which would return a predicted model label. that prediction indicates which observation model the outbreak most closely resembles. conducting a first few hundred-style study can be extremely labour intensive. consequently, we wish to assess the potential for model discrimination when sampling is performed on only a subset of days, rather than every day. if we choose to sample on only d < 14 days, within the first 14 days following the first symptomatic case in each household, we must necessarily also choose the optimal days on which to sample. we call the number of days d being sampled the design size. we choose those days that produce the highest classification accuracy on a left-out test set. this design problem is small, with only (14 choose d) designs of size d (or 2^14 = 16,384 total designs) to evaluate, so we apply exhaustive search in this case; however, a combinatorial optimisation algorithm could be applied, and would likely be necessary to search for the optimal design in a more complex design problem.
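a hedged sketch of the exhaustive search follows; `optimal_design` and the `accuracy_of` callback are our names, with the expensive step (training and scoring a random forest restricted to a candidate design's features) abstracted behind the callback.

```python
from itertools import combinations

def optimal_design(d, accuracy_of, days=tuple(range(1, 15))):
    """exhaustive search over all (14 choose d) surveillance designs of
    size d. `accuracy_of` maps a tuple of sampling days to the left-out
    test accuracy of a classifier trained only on those days' features.
    with 14 candidate days there are at most 2**14 = 16,384 designs in
    total across all sizes, so brute force is feasible here."""
    return max(combinations(days, d), key=accuracy_of)
```

for instance, with a toy surrogate score that favours early days, `score = lambda design: sum(1.0 / day for day in design)`, the search returns the earliest d days.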
potentially, symptom onset data could be made complete for this style of study by, for example, asking each household on which day all individuals with symptoms first presented with them (rather than just the individuals who presented symptoms on the sampling days), although some loss in data quality might be expected. in other cases it might be necessary to perform a test (e.g., virological testing) as part of the sampling program, in which case choosing optimal designs that are as small as possible can save substantial resources. our study provides an example of the model discrimination and optimal sampling design process that could be generalised to reflect the appropriate sampling scheme where necessary. to use the household data more effectively in training the random forest, we summarise the raw household data to produce daily distributions of counts. that is, we compute the proportion of households that, on day d, observed an incidence of i, and then use the resultant (design size) × (household size + 1) data vector as the new random forest predictors. for example, with designs of size 5, households of size 5, and 200 households, the raw data would consist of 5 × 200 = 1000 predictors, whereas the daily summaries would consist of 5 × 6 = 30 predictors. rather than evaluating the full set of possible designs, or applying an optimisation algorithm, we propose a heuristic for efficiently finding high-quality designs of a given size. this heuristic is to perform random forest model selection on the largest possible design, extract the random forest feature importance (fig. 1b), and use this feature importance to rank design points. specifically, days are ranked on their maximum feature importance (i.e., decrease in gini impurity, see below); the sum of the importance of features from a day was also tested, but had inferior performance. a design of size d then uses the highest-ranked d design points.
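the summary-statistic construction and the day-ranking heuristic can be sketched as follows (the function names are ours; the heuristic ranks days by the maximum feature importance among that day's features, as described above):

```python
import numpy as np

def daily_summaries(raw_counts, household_size):
    """convert raw per-household daily symptom counts, an array of shape
    (n_households, n_days), into the summary predictors described above:
    for each day, the proportion of households observing each possible
    incidence 0..household_size. returns a flat vector of length
    n_days * (household_size + 1)."""
    n_days = raw_counts.shape[1]
    out = np.array([[np.mean(raw_counts[:, day] == i)
                     for i in range(household_size + 1)]
                    for day in range(n_days)])
    return out.ravel()

def rank_days(feature_importances, n_days=14, feats_per_day=6):
    """heuristic day ranking: score each day by the maximum feature
    importance among that day's features and sort descending. a design
    of size d takes the first d days of the returned (1-indexed) list."""
    imp = np.asarray(feature_importances, dtype=float).reshape(n_days, feats_per_day)
    return [int(day) + 1 for day in np.argsort(-imp.max(axis=1))]
```

in practice the importances would come from a forest fitted to the full 14-day design (e.g., scikit-learn's `feature_importances_` attribute); the reshape assumes the predictors are ordered day by day.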
the random forest feature importance metric we use is the mean decrease in gini impurity (raileanu and stoffel, 2004) of a feature across the trees in the random forest. the gini impurity at a node is the probability that a new element at that node would be assigned an incorrect label, if it was assigned a random label from the distribution of training labels at that node. this metric is calculated using the python scikit-learn random forest algorithm (pedregosa et al., 2011). random forest-based bayesian model discrimination was able to discriminate the relative timing of symptoms and infectiousness for simulated household-stratified symptom-onset data. with 200 households of size 5, accuracy was 0.6974 for discriminating the five observation models (with random parameters, and 10,000 training simulations per model). when selecting solely between pre-infectiousness and coincident symptoms versus post-infectiousness symptoms, accuracy was 0.9796 (we call this the two-class accuracy), suggesting that most model discrimination error was between similar models to which the same management decisions might be applied. accuracy was reduced with fewer households: to 0.608 with 100 households, and 0.518 with only 50 households (fig. 1d); these had two-class accuracies of 0.95 and 0.894, respectively. these results were robust with respect to variation in household size (figure s2), with accuracy ranging from 0.648 with 200 households of size 3 to 0.703 with 200 households of size 7. we report results for households of size 5 for the remainder of this section. remarkably, model discrimination remained accurate when only a small subset of daily household data was observed, provided the observations were from an optimal design: a design of size 5 with 200 households was sufficient to produce a classification accuracy of 0.662 and a two-class accuracy of 0.975 (figs. 1d and 2a), only marginally below the accuracy of the full design (figure s3).
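for reference, the gini impurity on which the feature-importance metric above is based reduces to a one-line computation. this minimal sketch (`gini_impurity` is our name) recomputes the node-level quantity from raw labels, whereas scikit-learn reports the forest-averaged mean decrease.

```python
import numpy as np

def gini_impurity(labels):
    """probability that an element at a node is mislabelled if it is
    assigned a random label drawn from the node's training-label
    distribution: g = 1 - sum_k p_k**2."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(1.0 - np.sum(p ** 2))
```

a pure node gets impurity 0, a balanced two-class node 0.5, and a node with five equally frequent labels 0.8, which is the chance-level five-model setting considered here.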
accuracy increased as the design size (i.e., number of days of surveillance) and the number of households increased. the heuristic produced an effectively indistinguishable level of accuracy compared to the optimal across design sizes, both for overall accuracy (fig. 1c) and two-class accuracy (fig. 1d). the heuristic also ensured a substantial reduction in computation time: to produce fig. 1c, 39 random forests were required when using the heuristic, compared to 49,107 random forests to produce the optimal results.

fig. 2. (a) accuracy of model discrimination in designs of size 5, as the number of households increases, and under partial observation. note that p_obs is not a fixed parameter but is sampled from a distribution: the beta(5, 5) distribution has mean 0.5, and the beta(7.5, 2.5) distribution has mean 0.75. figure s3 shows the equivalent result with a design of size 14. (b) difference between heuristic designs (coloured points) and optimal designs (black boxes) as the design size increases. note that we do not evaluate optimal designs of size 1 or 2, and so there are no optimal designs in these columns. (c) distribution of training sample observations (under each model and number of households) for the most important feature under the heuristic: the proportion of households with 1 case observed on day 2. each coloured point represents an observation in the training sample. these results correspond to households of size 5, with 10,000 training samples from each model, each with parameters drawn from the distributions that appear in supplemental figure s1.

the key design points (i.e., sampling days) for optimal designs were consistently the second day (fig. 2b), followed by other days early in the outbreak (i.e., days 3-6, and day 1). days 7-14 typically had little impact on model discrimination accuracy (i.e., optimal accuracy and two-class accuracy consistently levelled off as design size increased beyond 5; fig.
1c/d), and the optimal combination of these days varied due to stochasticity in both the training and test data. this is consistent with the feature importance used to develop the heuristic (fig. 1b), i.e., those days that were consistently optimal were those with the highest feature importance. when the most important design point is visualised (fig. 2c), it shows a subtle but clear difference between the distributions of observations from the different models; this provides intuition as to how decision trees constructed from many predictors of this form can accurately discriminate models. to assess the impact of asymptomatic infections on model discrimination, we repeated the analysis, except with each individual only being symptomatic (at the point symptoms would otherwise appear) with probability p_obs (again, sampled from a prior distribution). this partial observation made model discrimination substantially more challenging: with designs of size 5 and 200 households (fig. 2a), accuracy was 0.522 and two-class accuracy was 0.863 when p_obs ~ beta(7.5, 2.5) (i.e., a mean of 0.75), and accuracy was 0.400 and two-class accuracy was 0.736 when p_obs ~ beta(5, 5) (i.e., a mean of 0.5) (compared to 0.662 and 0.975 with complete observation). identifying the relative timing of symptom onset and infectiousness in an emerging epidemic is critical to outbreak control. we have demonstrated, on simulated data, a method for identifying that relative timing based upon household-stratified data available early in an outbreak. this method produces reasonable accuracy for discriminating between five observation models, and very high accuracy for determining pre- or coincident-with-infectiousness symptom onset versus post-infectiousness symptom onset (i.e., two-class accuracy). this can be done without observing each household every day.
moreover, we can use random forest feature importance to inform a heuristic that vastly reduces the computation necessary to choose high-accuracy designs. it is remarkable that it is possible to discriminate the models so accurately, given that they share identical epidemic dynamics and differ only in observation. the non-parametric nature of the random forest allows it to use small but clear differences between models (e.g., fig. 2c) to extract sufficient information to discriminate them. combining the raw household data to form summary statistics is critical to this: if the raw household data are used rather than the summary statistics, accuracy is substantially lower. while it can be difficult to interpret the classifications made by a random forest classifier, interrogating key individual predictors (as in fig. 2c) provides clarity, and elucidates why feature importance provides a useful heuristic for choosing optimal designs (molnar, 2019). the accuracy of model discrimination decreases substantially as the proportion of cases that are asymptomatic increases. this can be compensated for by increasing the number of households (fig. 2a). the outbreaks in which early control is most critical are likely to be those in which most individuals are symptomatic, as symptoms are strongly correlated with severity, for example hospitalisations and deaths. however, there also exist diseases for which outbreak control is critical even when the proportion of symptomatic individuals is very low (e.g., poliovirus). in some situations it may be necessary to consider more complicated surveillance schemes, in which case it may not be possible to evaluate the exact optimal design by exhaustive search. however, the heuristic proposed here should remain effective in more complicated design spaces, provided they have a similar form, i.e., designs of a given size are a subset of the designs of larger sizes upon which the random forest can be trained to extract feature importance.
assumptions impact any model-based study. most critically, this model discrimination process assumes that the dynamics of the simulated epidemic model and observation models reflect the actual disease and observation dynamics. it is possible to use this method to select between models that differ in dynamics in addition to the observation process; however, any increase in the number of models to classify will likely result in increased computation and potentially decreased accuracy. we have chosen to focus on the timing of symptom onset and infectiousness in one general disease process as an example, resembling influenza in both transmission dynamics and prior distributions on parameters. useful future work would be to perform similar experiments on diverse disease processes with different life histories. it would also be valuable to assess the robustness of the method for discriminating the timing of symptom onset versus infectiousness when the underlying disease transmission model is misspecified. we note that selection between different transmission models (rather than observation models) for disease outbreaks has been considered in other studies, for example, assessing models of transmissibility over time for norovirus from household data (zelner et al., 2013). in addition, the simulation study we present is a simplification of realistic disease dynamics. the model assumes homogeneous within-household mixing, erlang-distributed latent and infectious durations, and constant transmission rates over the infectious period. independence between households is also assumed (i.e., that once a household is infected, all subsequent infection events are due to transmission within that household); this is approximately valid only for a large population of households and the early stages of an outbreak.
household size is uniform across households within the simulation; if household size were allowed to vary, data from each household size would need to be evaluated separately, and more households might need to be observed to obtain suitable accuracy. assessing optimal design for model discrimination given a range of household sizes would be a valuable direction for future work. finally, we treat the interaction between symptom onset and infectiousness as a discrete process (i.e., symptom onset coincides exactly with transitions between states), whereas this process may be more general in practice. this paper demonstrates an example of the process of bayesian model discrimination for outbreak control, and could be adapted to more complex disease models as desired. in the future, the aim is to combine bayesian model discrimination and parameter estimation in an online manner. improving estimates of parameters improves the ability to discriminate models, and, in turn, more certainty regarding the model likely reduces variance in parameter estimates. this would allow for unified characterisation of all factors influencing the ability to control an outbreak.

references:
model selection for time series of count data
epidemiology, transmission dynamics and control of sars: the 2002-2003 epidemic
epidemiological consequences of household-based antiviral prophylaxis for pandemic influenza
characterising pandemic severity and transmissibility from data collected during first few hundred studies
bayesian experimental design: a review
relationship between clinical signs and transmission of an infectious disease and the implications for control
smc2: an efficient algorithm for sequential analysis of state space models
when is quarantine a useful control strategy for emerging infectious diseases?
optimal bayesian design for discriminating between models with intractable likelihoods in epidemiology. comput
alive smc2: bayesian model selection for low-count time series models with intractable likelihoods
factors that make an infectious disease outbreak controllable
optimal bayesian design for model discrimination via classification
the elements of statistical learning
quarantine for sars, taiwan
exposure patterns driving ebola transmission in west africa: a retrospective observational study
on methods for studying stochastic disease dynamics
a novel coronavirus associated with severe acute respiratory syndrome
viral shedding and clinical illness in naturally acquired influenza virus infections
a major outbreak of severe acute respiratory syndrome in hong kong
pandemic (h1n1) 2009 influenza in the uk: clinical and epidemiological findings from the first few hundred (ff100) cases
interpretable machine learning: a guide for making black box models explainable
does influenza transmission occur from asymptomatic infection or prior to symptom onset? public health rep
scikit-learn: machine learning in python
reliable abc model choice via random forests
theoretical comparison between the gini index and information gain criteria
lack of confidence in approximate bayesian computation model choice
approximate bayesian computation: a survey on recent results
calculation of disease dynamics in a population of households
a review of modern computational algorithms for bayesian optimal design
efficient model comparison techniques for models requiring large scale data augmentation
utility of the first few100 approach during the 2009 influenza a(h1n1) pandemic in the netherlands
inference of epidemiological parameters from household stratified data
linking time-varying symptomatology and intensity of infectiousness to patterns of norovirus transmission

supplementary material associated with this article can be found, in the online version, at doi: 10.1016/j.jtbi.2019.110079.
key: cord-283678-xdma6vyo authors: séférian, roland; berthet, sarah; yool, andrew; palmiéri, julien; bopp, laurent; tagliabue, alessandro; kwiatkowski, lester; aumont, olivier; christian, james; dunne, john; gehlen, marion; ilyina, tatiana; john, jasmin g.; li, hongmei; long, matthew c.; luo, jessica y.; nakano, hideyuki; romanou, anastasia; schwinger, jörg; stock, charles; santana-falcón, yeray; takano, yohei; tjiputra, jerry; tsujino, hiroyuki; watanabe, michio; wu, tongwen; wu, fanghua; yamamoto, akitomo title: tracking improvement in simulated marine biogeochemistry between cmip5 and cmip6 date: 2020-08-18 journal: curr clim change rep doi: 10.1007/s40641-020-00160-0 sha: doc_id: 283678 cord_uid: xdma6vyo purpose of review: the changes or updates in the ocean biogeochemistry component have been mapped between cmip5 and cmip6 model versions, and an assessment made of how far these have led to improvements in the simulated mean state of marine biogeochemical models within the current generation of earth system models (esms). recent findings: the representation of marine biogeochemistry has progressed within the current generation of earth system models. however, it remains difficult to identify which model updates are responsible for a given improvement. in addition, the full potential of marine biogeochemistry in terms of earth system interactions and climate feedback remains poorly examined in the current generation of earth system models. summary: the increasing availability of ocean biogeochemical data, as well as an improved understanding of the underlying processes, allows advances in the marine biogeochemical components of the current generation of esms. the present study scrutinizes the extent to which marine biogeochemistry components of esms have progressed between the 5th and the 6th phases of the coupled model intercomparison project (cmip). 
electronic supplementary material: the online version of this article (10.1007/s40641-020-00160-0) contains supplementary material, which is available to authorized users. marine biogeochemistry plays a key role in the earth system. by regulating the exchange of co2 and other climatically active gases with the atmosphere [1], it is involved in a large range of climate feedbacks [2]. as a result, changes in ocean biogeochemistry can have important consequences for climate [3] [4] [5]. marine biogeochemistry is also deeply interwoven with the functioning of marine ecosystems and, ultimately, food webs [6] [7] [8]. marine ecosystems are affected by anthropogenic environmental change [9] [10] [11], particularly through climate-induced changes in physical properties and co2-induced ocean acidification [12] [13] [14] [15] [16]. understanding and quantifying the response of ocean biogeochemistry to global changes, as well as its role in earth system feedbacks [12, 17], is essential to improve our capacity to project ecosystem services and climate change in this century and beyond. in this context, ocean biogeochemical models are acknowledged as powerful tools to study the ocean carbon cycle and its response to past and future climate and chemical changes [2]. since the pioneering assessments of anthropogenic carbon uptake by the ocean by maier-reimer and hasselmann [18] and sarmiento et al. [19], and the ocean carbon model intercomparison project (ocmip) of orr et al. [20], ocean biogeochemical models have been successfully integrated into many earth system models (e.g. [21] [22] [23] [24] [25] [26] [27] [28] [29] [30] [31]). over the last few decades, the results from ocean biogeochemical models running within esms have increasingly been used to drive research on the carbon cycle.
their results have supported the assessment of carbon cycle feedbacks [32] [33] [34] [35] and have improved the understanding of mechanisms behind the near-linear transient climate response to cumulative co 2 emissions [36]. consequently, they have helped determine the carbon budgets compatible with a given level of warming since pre-industrial times. ocean biogeochemical models have also been used to investigate potential geoengineering solutions to climate change such as solar radiation management [37] [38] [39], ocean fertilization [40] [41] [42] [43] [44] [45] [46] [47], alkalinity addition [48] [49] [50] [51] [52] and reversibility experiments (e.g. [53, 54]). recent advances in marine ecosystem modelling have also led to diversification in the use of ocean biogeochemistry models within esms to study a wide range of potential impacts [55] [56] [57] [58]. these research activities are now grouped under the umbrella of the inter-sectoral impact model intercomparison project (isimip), with the fishmip initiative being a specific example for fisheries impacts [59, 60]. over recent years, models are increasingly being used in a semi-operational mode to aid with investigations of the predictability of key policy-relevant ocean biogeochemistry fields (e.g. net primary productivity, ocean acidity, ocean carbon uptake) [61] [62] [63] [64] [65] [66] [67]. because of their close relationship with important living marine resources, skillful predictions of these properties have led to ocean biogeochemistry models being recognized as valuable tools when developing environmental policies (e.g. [68]) or designing fisheries management [64, 65, 69]. because this large array of applications goes well beyond the conventional scientific investigation of the ocean carbon cycle, marine biogeochemical models have been developed in a number of directions over recent years.
these developments are generally supported by progress in process understanding, which in turn is driven by an increasing number of observational databases [70] [71] [72]. however, from one generation to another, the development of marine biogeochemical models is driven not only by common scientific considerations but also by the internal priorities of individual modelling groups. as a consequence, it is difficult to anticipate how far the representation of marine biogeochemistry within the current generation of earth system models differs from, and has improved upon, the previous one. the present study maps the changes or updates in ocean biogeochemistry components that have arisen between cmip5 and cmip6 and assesses how far these have led to actual improvements in model skill against present-day observations. overall, our assessment demonstrates that the simulated mean state of ocean biogeochemistry models in cmip6 is more realistic than that produced by their cmip5 analogues in many aspects, but that it remains difficult to clearly identify which changes in a given ocean biogeochemistry model are responsible for these improvements. in this section, we review the changes or updates implemented by participating modelling groups. the following method was employed to collect relevant model details as shown in table 1. first, all of the modelling groups contributing both to cmip5 and cmip6 were approached. next, a questionnaire in the form of a spreadsheet was proposed and developed. this sought details around (1) model resolution, (2) complexity in marine biology, (3) the representation of bacteria, (4) internal physiology, (5) organic matter cycling, (6) sediments, (7) nutrients and elemental cycling, (8) the level of interactions with the other components of the earth system and (9) modelling approaches including spin-up protocols and tuning/calibration. the latter includes external inputs/outputs and biophysical interactions.
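as a rough illustration of how such questionnaire responses can be organized and compared across model generations, the sketch below uses plain python dictionaries keyed by the nine topics listed above; the model entries are hypothetical placeholders, not actual survey answers from any of the participating groups.

```python
# Minimal sketch of a cross-generation questionnaire comparison.
# Category names echo the nine topics of the questionnaire described in the
# text; the example entries below are invented for illustration only.

CATEGORIES = [
    "resolution", "biology_complexity", "bacteria", "internal_physiology",
    "organic_matter", "sediments", "nutrient_cycling",
    "earth_system_interactions", "spinup_and_tuning",
]

def changed_categories(cmip5_entry, cmip6_entry):
    """List the questionnaire categories that differ between generations."""
    return [c for c in CATEGORIES
            if cmip5_entry.get(c) != cmip6_entry.get(c)]

# hypothetical answers for one model family across the two generations
cmip5 = {"resolution": "2 deg", "bacteria": "implicit"}
cmip6 = {"resolution": "1 deg", "bacteria": "implicit"}
print(changed_categories(cmip5, cmip6))   # -> ['resolution']
```

a structure like this makes it mechanical to produce the kind of "what changed between generations" mapping summarized in tables 1-3.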
the resulting master table of model properties is provided in supplementary materials (table s1). tables 1, 2 and 3 map the key updates made between cmip5 and cmip6 (full details are available in table s1). table 1 suggests that most of the changes have tried to address at least one missing process of major importance for marine biogeochemistry, as highlighted in ipcc ar5 ([2], page 499), that is, representation of the lower trophic level including bacteria, organic matter cycling including sinking particles, or variation in stoichiometric ratios. table 1 includes a brief overview of the key updates in ocean physics between cmip5 and cmip6 because marine biogeochemistry is primarily driven by ocean circulation (large-scale circulation and mesoscale eddies) and vertical mixing. table 1 tracks not only updates in the horizontal and vertical resolution of physical ocean models but also changes in related ocean physical parameterizations. as suggested by griffies et al. [103], an increase in horizontal or vertical resolution enables the representation of finer-scale ocean physical processes (e.g. mesoscale eddies) in conjunction with the activation of more realistic ocean physical parameterizations (such as vertical mixing, diurnal cycle or coupling with the atmosphere). the first common difference between cmip5 and cmip6 esms comes from the ocean-sea ice components. indeed, it is interesting to note that 8 esm groups out of 12 use an upgraded version of their ocean model or employ a new ocean model (table 1). these changes imply substantial updates or revisions in ocean physical parameterizations that may have an impact on large-scale circulation and vertical mixing. in addition, another common difference between ocean models used in cmip5 and cmip6 is the grid resolution. it is interesting to note that all of the ocean models, with the exception of mpi-esm1-2-lr, now resolve ocean dynamics at a minimum horizontal nominal resolution of 100 km.
the highest horizontal nominal resolution in the available multimodel ensemble is 50 km (gfdl-esm4). despite this general increase in horizontal resolution, only gfdl-cm4 uses an eddy-permitting ocean model (~25 km). in addition, the current generation of ocean models also better represents vertical physical processes, with typically finer vertical resolution. another common difference between the two generations of models is the complexity of the marine ecosystem description and related parameterizations. here, the complexity encompasses the diversity of the model trophic web (i.e. the number of specific model phytoplankton and zooplankton types), the representation of bacteria, ecosystem functioning including macro- and micro-nutrient limitation (e.g. iron), and the variation in modelled stoichiometric ratios of carbon, nitrogen and other elements (e.g. photosynthetic pigment). greater complexity does not necessarily imply a better representation of cycles and processes associated with each biogeochemical species, as it may introduce new degrees of freedom and/or non-linear (or at least not well controlled) interactions between parameterizations. table 1 shows that ocean biogeochemistry models span a wide range of complexity levels. the simplest models use ocean carbon cycle models based on the ocmip protocol [20] that do not include marine biota or nutrients. meanwhile, the most complex models include a broad trophic structure that groups marine organisms into plankton functional types based on their biogeochemical role, with mechanistic representations of nutrient limitation and variable stoichiometric ratios. table 1 also highlights noticeable changes in biogeochemical parameterizations between cmip5 and cmip6. these concern 10 of the 12 biogeochemical models reviewed in this study. these changes may be related to the change in model complexity or to a revised set of parameterizations (e.g. nitrogen fixation, remineralization, grazing, flux feeding; see table s1).
we map updates and changes in ocean biogeochemical models along three major axes: axis 1, the trophic food web, plankton internal physiology (e.g. variable stoichiometry, chlorophyll pigment) and nutrient cycling (iron cycle, nutrient cycles), which tracks updates in biogeochemical dynamics and ecosystem functioning; axis 2, the external sources of nutrients; and axis 3, the interactions of marine biogeochemistry with climate or ocean physics. the latter two axes track the level of integration of the marine biogeochemical model in the modelled earth system. it is important to stress that an increase or a decrease along one of those three axes does not necessarily imply an improvement in model performance or skill. in most cases, it reflects progress in process understanding (physical, biogeochemical or both), or the inclusion of new earth system interactions or climate feedbacks required to investigate future scenarios. table 1 shows that the current generation of cmip6 displays a greater diversity of marine biogeochemical models than cmip5.

table 1 overview of the ocean and marine biogeochemical components of earth system models as used in cmip5 and cmip6. the names of the esm are given in the first line of the table, where the cmip6 esms are given in red cells and the cmip5 predecessors are given in pink cells.
the complexity of the marine biogeochemical models is described using (i) the trophic web, the number of living species or phytoplankton functional types; (ii) the internal physiology, the stoichiometry and the representation of internal photosynthetic pigment; (iii) the organic matter cycling, the number of organic carbon pools and their representation; (iv) the representation of marine sediments and (v) the nutrient cycling: the number of nutrients and the representation of oxygen and iron cycling.

cobaltv2 (in gfdl-esm 4), for instance, displays the highest trophic complexity level with 3 explicit phytoplankton classes, 1 implicit phytoplankton class, 3 explicit zooplankton classes and 1 explicit heterotrophic bacteria class; however, this model still employs a relatively simple parameterization of iron cycling. in comparison, piscesv2-gas (in cnrm-esm2-1) or piscesv2 (in ipsl-cm6a-lr) includes 4 explicit plankton types (2 phytoplankton and 2 zooplankton), but two iron ligands and 5 iron forms [104]. marbl-bec (in cesm2) also includes an iron ligand and has opted for increasing ecosystem complexity by introducing variable c:p stoichiometry, based on po 4 concentrations [105], while maintaining 4 plankton types. it is interesting to note that, while limiting the number of nutrients, canesm5-canoe has evolved toward a more comprehensive treatment of marine biogeochemistry with 4 explicit plankton types and variable stoichiometry [89]. in contrast with a general increase in complexity, noaa-gfdl has started to use a reduced-complexity marine biogeochemical model embedded in the high-resolution ocean model of gfdl-cm4. this approach implies a trade-off between computational cost and the biogeochemical processes essential to represent the ocean carbon cycle, as explained in galbraith et al. [105].
such diversity tends to mirror progress in the understanding of the impact of variable stoichiometric ratios on ecosystem dynamics and carbon assimilation by phytoplankton cells [106] [107] [108] [109] [110]. table 1 shows that all cmip6 models except gfdl-cm4 have evolved toward a more comprehensive treatment of elemental cycling including nitrogen, oxygen and iron cycling. this moderate increase in model complexity is supported by recent observations of phytoplankton functioning, nutrient limitation and plankton physiology [111] [112] [113] [114] [115] [116] and the availability of a larger array of observational data (bio-argo and geotraces) supporting model evaluation and development (e.g. tagliabue et al. [117]). at the same time, this increase in complexity is also encouraged by the growing range of applications to which esms are being dedicated (e.g. marine resource applications as investigated in lotze et al. [59] or park et al. [64]). finally, table 1 shows that all cmip6 models have progressed toward a better representation of marine organic carbon cycling, sinking particles and marine sediments. in most cases, this component of marine biogeochemistry is parameterized using either a sediment box module or a metamodel based on downward fluxes of organic matter. indeed, for several cmip6 marine biogeochemical models, a more complex representation of sinking particles and organic matter pools (refractory classes or flux attenuation parameterization) replaces the generalized pools of organic matter used in the cmip5 predecessors. table 1 also sheds light on noticeable changes in the representation of sediment interactions. most of the reviewed cmip6 esms now simulate this compartment with a biogeochemical parameterization (e.g. balance, meta-model, sediment box) or with a comprehensive sediment module (a 12-layer sediment module). table 2 also shows that the representation of the external sources of nutrients (i.e.
the third axis of our model complexity breakdown) has grown in complexity between cmip5 and cmip6. it mirrors a more comprehensive treatment of boundary conditions between esm components (atmosphere, rivers, glaciers, etc.). most of the current generation of ocean biogeochemical models now consider inputs of biogeochemical elements via atmospheric deposition or from rivers. iron delivery from sediment mobilization, hydrothermal sources or ice melting is additionally considered by a small set of models. this reflects recent advances in understanding the global iron cycle [111] [112] [113] [114] [115] [116]. in contrast, despite a better understanding of the role of submarine groundwater discharge in ocean nutrient supply [118] [119] [120] [121], this particular boundary condition is not considered in the current generation of ocean biogeochemical models. in addition, it is interesting to note that a couple of cmip6 esms now include a more comprehensive treatment of interactions between marine biogeochemistry and the other earth system components. for instance, gfdl-esm 4 simulates interactively most of the primary sources of iron for marine biogeochemistry (atmospheric dust deposition, iceberg melting and river supply), enabling the representation of biogeochemical couplings observed in the real world (e.g. [122]). table 2 highlights that the current generation of esms displays a wider range of earth system feedbacks or interactions. in our review, we have decomposed earth system interactions involving marine biogeochemistry along two axes: (1) the air-sea exchange of greenhouse gases or reactive chemical compounds interacting with earth's radiative budget (and hence climate); (2) the represented earth system interactions involving marine biogeochemistry (including the air-sea exchange of greenhouse gases or reactive chemical compounds and biophysical interactions); that is, what actually contributes to the earth system model climate.
the latter has been mapped into 4 feedbacks: climate-carbon cycle feedbacks (f1), biogenic aerosol-cloud feedbacks (f2), non-co 2 biogeochemical cycle feedbacks (f3) and phytoplankton-light feedbacks (f4). the influence of ocean dimethylsulfide (dms) emissions on cloud albedo is an example of the biogenic aerosol-cloud feedback (f2). dms is a breakdown product of dimethylsulfoniopropionate (dmsp), a metabolite in many phytoplankton with a role as a cellular osmolyte/antioxidant [123, 124]. it is exchanged with the atmosphere and is involved in the formation of sulfur aerosols once it is oxidized there. like other sulfate aerosols, dms may be involved in the formation of cloud condensation nuclei (ccn). the potential importance of ocean dms emissions for the climate system is still largely debated [125] because modern observations do not support its prominent role in the formation of ccn [126] [127] [128]. however, long-term measurements [129] and mesocosm experiments (e.g. [17]) suggest that global changes may impact the rate of ocean dms emissions. recent modelling studies argue for a potential role of ocean dms in future climate change (e.g. [130, 131]). ocean nh x emissions are also involved in biogenic aerosol-cloud feedbacks (f2). kirkby et al. [132] suggest that nh x can also play an important role in the formation of secondary nitrate aerosols in the atmosphere. similarly to dms, these aerosols can serve as ccn and contribute to changes in cloud albedo. non-co 2 biogeochemical cycle feedbacks (f3) involve ocean emissions of non-co 2 greenhouse gases (e.g. n 2 o or methane) or any chemical compounds contributing to the generation of greenhouse gases (e.g. methane, carbon monoxide). the phytoplankton-light feedbacks (f4) represent the suite of biophysical mechanisms that involve the influence of the marine biota on the upper ocean physics through the vertical redistribution of heat. table 2 confirms that all ocean biogeochemical models account for the climate-carbon cycle feedback since cmip5 (earth system feedback f1 in fig. 1). in addition, table 1 shows that the current generation of ocean biogeochemical models includes air-sea gas exchange for a larger number of radiatively active biogeochemical compounds such as dms, nitrous oxide (n 2 o) and ammonia (nh x ).

table 2 overview of the ocean and marine biogeochemical components of earth system models as used in cmip5 and cmip6. the names of the esm are given in the first line of the table, where the cmip6 esms are given in red cells and the cmip5 predecessors are given in pink cells. the earth system interactions or couplings represented within the esms involving the marine biogeochemical models are described using three characteristics: the external inputs of nutrients or carbon-related fields conveyed by external boundary conditions, given with the chemical acronyms (c, p, n, si, fe); the representation of the gas exchange of greenhouse gases or reactive chemical species; and the representation of earth system feedbacks. in the last rows, 'fx' indicates that the climate feedback or earth system interaction 'x' as depicted in fig. 1 is represented in a given earth system model.

table 3 overview of the ocean and marine biogeochemical components of earth system models as used in cmip5 and cmip6. the names of the esm are given in the first column of the table, where the cmip6 esms are given in red cells and the cmip5 predecessors are given in pink cells. the modelling framework adopted by the various modelling groups for cmip5 and cmip6 is reviewed using two key characteristics: the duration of the spin-up simulation and the use of a calibration/tuning procedure (further details about model calibration/tuning are given in table s1).
the inclusion of climate-active gases or greenhouse gases other than co 2 in the current generation of ocean biogeochemical models is a result of the increased recognition of the importance of these compounds in earth system interactions with aerosols, atmospheric chemistry and, potentially, with clouds. in particular, the inclusion of ocean nh x or n 2 o emissions in ocean biogeochemical models has been driven by a better understanding of the global nitrogen cycle and its role in climate change. in particular, the development of databases such as memento (https://memento.geomar.de/) has enabled better validation and calibration of n 2 o modules in global ocean biogeochemical models [133] [134] [135] [136] [137] [138]. however, the inclusion of earth system feedbacks as illustrated in fig. 1 has not in all cases progressed between cmip5 and cmip6. for example, biophysical interactions with the ocean radiative transfer (f4 in fig. 1) are overlooked by more than half of the marine biogeochemical models examined, although this feedback is well documented and relatively well understood [139, 140]. our review of available esms suggests that the current generation of marine biogeochemical models has not evolved much toward comprehensive couplings between earth system components and ocean biogeochemistry, or toward improved treatment of biophysical and biogeochemical feedbacks, with respect to their predecessors (f1 and f4 in fig. 1). the full impact of ocean biogeochemistry on climate and its role in earth system feedbacks remains far from being entirely represented in the current generation of earth system models, as it involves spatial and temporal scales that models are not currently able to resolve, as well as processes that are still poorly understood. finally, our review suggests that the modelling approaches have evolved between cmip5 and cmip6.
these have been monitored with two key indicators: (1) the length of the spin-up simulation and (2) the use of calibration/tuning for marine biogeochemical parameters. these two key indicators have been discussed in the published literature (e.g. séférian et al. [76] or hourdin et al. [141]), reflecting, in general, improved knowledge of model characteristics (strengths and deficiencies). table 3 and table s1 highlight that most of the modelling groups have expanded the duration of the spin-up for cmip6. this represents an important effort of the scientific community to converge toward recommended standards (e.g. [142]). only gfdl and ipsl have reduced the duration of their spin-up protocol for computing reasons: they manage to fulfil the cmip6 standard in a few hundred years. on the other hand, it is noticeable that several modelling groups have included a step of model calibration or tuning in cmip6. our review suggests that this step has been motivated by various reasons: bias reduction for key biogeochemical fields in cnrm, gfdl or noresm, or bias compensation to reduce the impact of known biases in simulated surface chlorophyll for ocean dms and organic aerosol emissions in ukesm. there is no consensus between modelling groups on how model calibration or tuning takes place in the model preparation. depending on the modelling group, the calibration or tuning is included either in the model development or during the spin-up procedure (table s1). for each field, the observation-based estimate is presented in the corresponding figure, followed by the biases found across the current and last generation of models. we note that, in several cases, observation-based estimates are derived from significant processing of sparse observations or from algorithms relating the quantity of interest to directly observed quantities (e.g. sea-to-air co 2 flux, satellite chlorophyll). as such, the observations themselves are also subject to uncertainty, which will be discussed in the context of each comparison. in fig.
2a, the sea-to-air flux of the critical greenhouse gas, co 2 , is shown, with a data product based on the mapping of observational pco 2 data drawn from the landschützer et al. [143] product (1995-2014). the key geographical features of this are strong outgassing (i.e. a net sea-to-air flux) in upwelling regions, most clearly in the tropics and along the equatorial region of the pacific ocean, and ingassing (i.e. a net air-to-sea flux) at temperate and subpolar latitudes. these features reflect processes that are governed by temperature, patterns of deep water formation, surface biological production and the thermohaline circulation. in general, both cmip5 and cmip6 generations of models show a mixture of positive and negative biases across the globe, with disagreement in the sign of the carbon fluxes over some regions. common patterns are slightly negative biases both in the equatorial pacific (i.e. weak outgassing) and in the north atlantic (i.e. excessive ingassing). both generations of models show a mix of relatively small positive and negative biases, except for the cmip5 canesm2, which shows the largest model-data error across the model ensemble. however, the comparison with observations has been substantially improved in canesm5. more generally, fig. 2a highlights that the improvement in simulated sea-to-air carbon flux is clearer when looking at the direction of the carbon flux. this improvement seems to be linked to an improved representation of ocean vertical mixing (see skill scores of the ocean mixed-layer depth below). indeed, all cmip6 models exhibit smaller domains where the direction of the sea-to-air carbon flux disagrees with observations, except for mpi-esm1-2-lr, which used the same ocean model and displays the same pattern of model-data disagreement for cmip5 and cmip6.
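the "direction of the carbon flux" comparison described above can be made concrete as an area-weighted fraction of grid cells where model and observation disagree in sign. the sketch below is a minimal illustration with invented flux values on a toy grid; it is not the actual fig. 2a computation, and the cosine-latitude weighting is a standard simplification for cell area on a regular grid.

```python
import math

# Area-weighted fraction of ocean cells where the simulated sea-to-air CO2
# flux disagrees in sign with an observation-based product.
# Inputs are flat lists of fluxes plus each cell's latitude; values are
# illustrative only, not taken from any model or data product.

def sign_disagreement_fraction(model_flux, obs_flux, latitudes):
    """Area-weighted fraction of cells with opposite flux direction."""
    weights = [math.cos(math.radians(lat)) for lat in latitudes]
    disagree = sum(w for m, o, w in zip(model_flux, obs_flux, weights)
                   if m * o < 0)  # opposite signs: outgassing vs ingassing
    return disagree / sum(weights)

obs   = [1.2, -0.4, -0.8, 0.3]   # + : outgassing, - : ingassing
model = [0.9,  0.1, -0.5, 0.2]   # disagrees only in the second cell
lats  = [0.0, 30.0, 60.0, -30.0]
print(round(sign_disagreement_fraction(model, obs, lats), 3))   # -> 0.268
```

a smaller value of this diagnostic corresponds to the "smaller domains where the direction of the sea-to-air carbon flux disagrees with observations" noted for the cmip6 models.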
figure 2 b shows surface chlorophyll, compared with satellite-based estimates derived from esa-cci-oc ocean colour data [144]. the key geographical features are relatively high concentrations in productive temperate, subpolar and upwelling regions, and extremely low concentrations in the unproductive subtropical gyres. the latter are dominated by perennially low-nutrient conditions, while the former experience frequent, or seasonal, introduction of nutrients by upwelling or deep mixing. while these general biome-scale patterns are robust across satellite algorithms, we note that estimates diverge in the southern ocean [146], where global satellite-based chlorophyll algorithms have been found to significantly underestimate observations [147]. several cmip6 models compare more favourably with observations than their cmip5 predecessors. all models displaying a pattern of generally negative bias in cmip5 now exhibit large areas of both small positive and small negative biases. models overestimating surface chlorophyll concentrations in cmip5 now display reduced biases (< 0.4 mg chl m −3 ). this improvement is small for mpi-esm1-2-lr, which still overestimates surface concentrations of chlorophyll. some cmip6 models, such as cesm2, giss-e2-1-g-cc and noresm2-lm, display, on the contrary, larger model-data errors than their predecessors. given the large diversity across the models, it is difficult to determine whether changes in physical ocean models or changes in ocean biogeochemical models are behind these changes. however, it is interesting to note that three cmip6 models (cnrm-esm 2-1, ipsl-cm6a-lr and ukesm1-0-ll), which share a common ocean physics model, overlap in their patterns of positive and negative biases in spite of differences in marine biogeochemistry submodels (spatial correlation of model-data errors r 2 ≈ 0.5). (figure caption fragment: the model-data fit, squared correlation r 2 , is given in parentheses for cmip5 and cmip6 models.)
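the spatial-correlation and model-data-error statistics invoked here, and again for the skill summaries of figs. 6 and 7, amount to a pattern (pearson) correlation and an rmse between gridded fields. a minimal sketch follows, operating on flat lists of grid values; the fields below are illustrative numbers, not model output.

```python
import math

# Pattern correlation and RMSE between a modelled and an observed field,
# both given as flat lists of grid-point values (illustrative values only).

def pattern_correlation(model, obs):
    """Pearson correlation between two fields (the r behind the r^2 skill)."""
    n = len(model)
    mm = sum(model) / n
    om = sum(obs) / n
    cov = sum((m - mm) * (o - om) for m, o in zip(model, obs))
    var_m = sum((m - mm) ** 2 for m in model)
    var_o = sum((o - om) ** 2 for o in obs)
    return cov / math.sqrt(var_m * var_o)

def rmse(model, obs):
    """Root-mean-square model-data error."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(model))

obs   = [0.1, 0.5, 1.2, 0.8]
cmip5 = [0.4, 0.9, 1.0, 0.5]   # hypothetical earlier-generation field
cmip6 = [0.2, 0.6, 1.1, 0.7]   # hypothetical newer-generation field
print(rmse(cmip5, obs) > rmse(cmip6, obs))   # True: cmip6 closer to obs
```

in this toy setup the newer field has both a higher pattern correlation and a lower rmse, which is exactly the kind of cross-generation comparison summarized later for figs. 6 and 7 (for proper gridded fields, the sums would be area-weighted).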
it is notable that most of the models reviewed here overestimate surface chlorophyll estimates in the southern ocean. this bias, however, is likely due in part to the underestimation of southern ocean chlorophyll by the global satellite chlorophyll algorithms [147]. the substantial positive southern ocean bias in gfdl-esm 4, for example, is significantly diminished when compared against johnson's southern ocean-specific satellite-based chlorophyll algorithms (e.g. [148]). figure 3 a and b show the distribution of surface nitrate (no 3 ) and silicic acid (h 4 sio 4 ), which are represented in both cmip5 and cmip6 models. figure 3 a shows that only the gfdl, ipsl and miroc models have consistently improved their mean states between cmip5 and cmip6 for nitrate concentrations. in some cases, model generations show the same spatial patterns of biases, while others, most noticeably ukesm1-0-ll (where an entirely new marine biogeochemistry model has been incorporated), show a large overestimation of surface nitrate concentration over the tropics. a comparison of simulated surface concentrations of silicic acid with modern observations shows that all models except the giss and cesm models have improved their representation of the surface distribution of silicic acid (fig. 3b). the most striking improvement is seen between hadgem2-es and ukesm1-0-ll. such an improvement is explained by the switch in the biogeochemical model component between cmip5 and cmip6, from diat-hadocc to medusa-2.0 (see [96] for further details). in general, cmip6 models improve upon their cmip5 predecessors in their representation of oxygen at 150 m (fig. 4b). model errors in the southern ocean have been reduced in cmip6 with respect to cmip5, highlighting a better representation of deep ocean ventilation in the southern ocean or more accurate biogeochemical characteristics of outcropping water masses.
model-data errors have also been reduced in cmip6 in large domains of the indian ocean where large omzs occur, although all models display a systematic overestimation of oxygen at 150 m in the arabian sea. the same feature is also observed in the tropical pacific, where the model-data error has been reduced in cmip6 with respect to cmip5. contrasting with the other ocean domains, models' performance has not improved in the atlantic ocean. for example, in the tropical atlantic, some models have shifted in the sign of their model-data errors: from a negative bias in cmip5 (stronger-than-observed omz) to a positive bias in cmip6 (weaker-than-observed omz), or the opposite. in both cases, the absolute magnitude of the model-data errors in this region remains similar between model generations. this implies a systematic bias in ocean biogeochemical models which seems independent of ocean resolution or the complexity of the marine biogeochemistry models. besides, our review of model performance highlights that open-ocean hypoxia remains poorly represented in ocean biogeochemical models; the cmip6 models, like their cmip5 predecessors, still tend to overestimate this marine biogeochemical feature. this is especially clear in the southern tropical pacific, where all models except cesm2 and gfdl-esm 4 overestimated the level of hypoxia of the omz (fig. 4). improvement in gfdl-esm 4 is explained by a suite of updates and changes in model physics (i.e. mixing and southern hemisphere climate) and biogeochemical parameterizations (i.e. the use of the revised remineralization scheme for organic matter, depending on oxygen and temperature, of laufkötter et al. [148]). in addition, cobaltv2 has lower net primary productivity than topazv2, which allows the high-nutrient low-chlorophyll region to spread further meridionally in the tropical pacific and reduce the eastern equatorial nutrient trapping and associated oxygen decline.
the surface distribution of dissolved iron is also an important feature of marine biogeochemistry. its availability controls marine biological production in several ocean regions [149]. as for oxygen, table 1 highlights that marine iron cycling is not represented in all biogeochemical models. nonetheless, the number of models representing it has increased in cmip6 (table 1). this reflects the current scientific consensus, which recognizes the need to resolve iron cycling in biogeochemical models in order to better simulate marine biogeochemical dynamics, e.g. for glacial-interglacial climate change [150] or for variability and response to climate change [151]. figure 5 illustrates, however, that the performance of the current generation of models with respect to iron does not improve much on that of the previous generation. indeed, the model-data fit estimated with squared correlation coefficients remains < 0.25. this fit has not progressed much from cmip5 to cmip6, except possibly for the ipsl and cnrm models, which both employed piscesv1 [40, 41] for cmip5 and piscesv2 [91] for cmip6. as highlighted in aumont et al. [91], piscesv2 includes a more detailed representation of the ocean iron cycle compared with piscesv1. the poor agreement between the observed and simulated distribution of dissolved iron relative to macronutrients (fig. 3) partly reflects differences in the nature of the datasets. the relatively large number of nitrate measurements globally, for example, has allowed construction of robust climatological patterns [145] that model climatologies can be compared against. the relative paucity of dissolved iron measurements, in contrast, requires a comparison of modelled climatologies against patchy individual measurements. despite this, fig. 5 shows that some cmip6 models better simulate the global average concentration of dissolved iron than their predecessors. this is particularly clear for ukesm1-0-ll, mpi-esm 1-2-lr and gfdl-esm 4.
It is interesting to compare the various modelling approaches used to represent marine iron cycling. UKESM1-0-LL and MIROC-ES2L, for instance, use the parameterizations of Dutkiewicz et al. [152] and Moore and Braucher [153], respectively, which remove dissolved iron concentrations above an ad hoc threshold. Other ocean biogeochemical models use mechanistic iron cycling schemes that avoid the need for ad hoc thresholds (e.g. PISCES-v2 and PISCES-v2-gas employ the formulation of Völker and Tagliabue [154], and TOPAZv2 applies an empirical relationship to dissolved organic carbon (DOC) to derive ligand concentrations). Table 4 provides a large-scale picture of the models' ability to simulate key downward biogeochemical fluxes involved in global carbon and nutrient cycling. Most of the CMIP6 marine biogeochemical models better simulate the magnitude of the surface and 100 m biogeochemical fluxes than their CMIP5 predecessors. Indeed, CESM2, CNRM-ESM2-1, GISS-E2-1-G-CC, IPSL-CM6A-LR, MPI-ESM1-2-LR and NorESM2-LR have improved the representation of at least one biogeochemical flux with respect to their CMIP5 predecessors; BCC-CSM2-MR, CanESM5, GFDL-ESM4 and MIROC-ES2L display comparable performance; only CanESM5-CanOE, MRI-ESM2-0 and NorESM2-LM have degraded the representation of, respectively, the vertically integrated net primary productivity or the carbon export at 100 m compared with their CMIP5 predecessors. Despite this general improvement, Table 4 highlights that several CMIP6 models fall outside the range of remote-sensing estimates of primary production [157, 158, 161]. This suggests that the current generation of marine biogeochemical models still has difficulty modelling the underlying processes involved in carbon fixation by phytoplankton (such as nutrient co-limitation, nitrogen fixation and remineralization) that are required to accurately simulate the magnitude of the vertically integrated net primary productivity.
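The Table 4 comparison described above can be sketched as a simple classification: each model's global flux estimate is expressed as a relative deviation from the observational best estimate and flagged if it falls within the observational range. The use of the range midpoint as the reference and all numbers except the 38.8-52.1 Pg C year−1 remote-sensing span are illustrative assumptions based on the caption's description.

```python
# Sketch of a Table-4-style classification: relative deviation of a model's
# global flux from an observational reference, plus a within-range flag
# (analogous to the hatched cells in the table). The midpoint-as-reference
# choice and the toy model value are assumptions for illustration.

def classify_flux(model_value, obs_low, obs_high):
    """Return (relative deviation from the range midpoint, within-range flag)."""
    obs_mid = 0.5 * (obs_low + obs_high)
    rel_dev = (model_value - obs_mid) / obs_mid
    within = obs_low <= model_value <= obs_high
    return rel_dev, within

# Toy global NPP estimate (Pg C yr^-1) against the maximal remote-sensing
# range 38.8-52.1 quoted in the text
rel, ok = classify_flux(45.0, 38.8, 52.1)
print(f"deviation {rel:+.1%}, within observational range: {ok}")
```

A model printing a small deviation with `within observational range: True` would correspond to a hatched cell in Table 4; a large deviation with `False` to a strongly coloured one.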
At the same time, it is important to acknowledge that there are still large uncertainties in remote-sensing-based estimates of primary production, e.g. 38.8-42.1 Pg C year−1 in the most recent estimates of Kulk et al. [158] and 47.5-52.1 Pg C year−1 according to Behrenfeld et al. [157]. Figures 6 and 7 track changes in performance between CMIP5 and CMIP6 marine biogeochemical models. Figure 6 highlights how far the CMIP6 models have improved their capability to simulate observed spatial patterns with respect to their CMIP5 predecessors; Fig. 7 summarizes the overall model performance, including information on the models' ability to reproduce the observed distribution (pattern and magnitude). Both figures show that the CMIP6 models have improved the representation of ocean physics (here, the ocean mixed-layer depth). The cross-generation picture of model performance for marine biogeochemistry is more contrasted. Globally, Figs. 6 and 7 show that most of the CMIP6 models outcompete their CMIP5 predecessors. However, this improvement remains modest. Except for some models displaying a noticeable improvement for one or two biogeochemical fields (surface nitrate for CESM2, surface chlorophyll for CNRM-ESM2-1, surface silicic acid for GFDL-ESM4), most of the CMIP6 models display a slight increase in model-data spatial correlation (up to +0.2, Fig. 6) or an overall reduction in model-data RMSE of about 20% (Fig. 7). Besides, this improvement does not concern all models: GISS-E2-1-G-CC, for instance, shows a noticeable degradation in performance for all of the biogeochemical fields analyzed here. Our review of available Earth system models highlights that the current generation of marine biogeochemical models used for CMIP6 displays a greater diversity than the previous generation used for CMIP5. Several marine biogeochemical models have evolved toward a more comprehensive representation of marine biogeochemistry (i.e.
CESM, CNRM, GFDL, IPSL, MIROC, UKESM), typically including an expanded array of biological taxa (e.g. diazotrophs) or elemental cycling (e.g. the oxygen and iron cycles), variable stoichiometry, sediments (e.g. a sediment box module) and the representation of non-CO2 trace gases relevant to atmospheric chemistry. In contrast, some groups have limited the increase in model complexity between CMIP5 and CMIP6 (i.e. BCC, GISS, MPI, MRI, NorESM). Finally, it is interesting to note that some groups have started to investigate the use of reduced-complexity marine biogeochemical models (i.e. GFDL) or to intercompare, in a traceable framework, the impact of rising complexity on the simulated marine biogeochemistry (CanESM). When assessed against observations, most of the CMIP6 models generally outperform their CMIP5 predecessors in many regions and for most of the marine biogeochemical fields reviewed here (Figs. 6 and 7 and Table 4). However, this model review has also highlighted several systematic model-data errors that persist even in CMIP6 models (e.g. oxygen concentrations at 150 m in the tropical Atlantic, nutrient trapping in the Southern Ocean). Our review also shows that the modelling approaches have evolved between CMIP5 and CMIP6. Indeed, most modelling groups have spun up their models over a longer period for CMIP6 with respect to CMIP5 in order to fulfil the drift criterion proposed by Jones et al. [142]. In contrast, the use of tuning and calibration for marine biogeochemical models remains a less common practice at the time of CMIP6. Finally, our review of model mean-state performance against model properties (resolution, complexity) suggests that neither increasing resolution nor increasing complexity automatically leads to model improvement. Instead, improvement is a mixture of improved ocean physical processes and a better representation of biogeochemical processes. In the context of improving confidence in future climate projections, it is important to stress that model mean-state performance is not the only means of understanding multi-model uncertainty; comparisons against seasonal to multi-annual variations in observed quantities may ultimately prove most critical to building confidence in future climate projections (e.g. [13, 163]). In this final section, we identify some directions in which marine biogeochemical models could continue to improve or progress.

Table 4 (caption): Comparison between observational and model estimates of biogeochemical fluxes over the modern period. For both CMIP5 and CMIP6 models, biogeochemical fluxes are calculated over the 1995-2014 period (see Methods in Supplementary Materials). Observational estimates are derived from the following databases: (a) the Landschützer et al. [143] product averaged over 1995-2014 and adjusted for the preindustrial ocean source of CO2 from river input to the ocean, consistently with the methodology employed in [155], which used a river flux adjustment of 0.78 Pg C year−1 [156]; (b) the maximal range of remote-sensing estimates from Behrenfeld et al. [157] and Kulk et al. [158]; (c) Dunne et al. [159]; and (d) Tréguer and De La Rocha [160]. When required, the modelled net ocean carbon uptake is corrected with the net riverine-induced outgassing diagnosed from the piControl simulation. Coloured cells indicate the relative deviation of model global estimates with respect to the observational median best estimates; hatched coloured cells indicate where model global estimates fall within the observational uncertainty range. Grey cells indicate missing or unrepresented biogeochemical fluxes.
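The riverine carbon adjustment mentioned in the Table 4 caption can be sketched in a few lines: the observation-based air-sea CO2 flux is adjusted by the preindustrial riverine outgassing of 0.78 Pg C year−1 before being compared with modelled uptake, and the modelled net uptake is corrected with the riverine-induced outgassing diagnosed from the model's piControl run. The sign convention (positive = ocean uptake) and all values other than 0.78 are illustrative assumptions.

```python
# Sketch of the riverine flux adjustment described in the Table 4 caption.
# Sign convention assumed: positive values denote ocean carbon uptake.
# All numbers except 0.78 Pg C yr^-1 [156] are illustrative placeholders.

RIVER_FLUX_ADJUSTMENT = 0.78  # Pg C yr^-1, preindustrial riverine outgassing

def adjusted_observed_sink(raw_air_sea_flux):
    """Add back the preindustrial riverine outgassing to the observed sink."""
    return raw_air_sea_flux + RIVER_FLUX_ADJUSTMENT

def adjusted_model_uptake(net_uptake, picontrol_riverine_outgassing):
    """Correct modelled net uptake with piControl riverine outgassing."""
    return net_uptake + picontrol_riverine_outgassing

obs = adjusted_observed_sink(1.60)       # toy observed air-sea flux
mod = adjusted_model_uptake(2.10, 0.20)  # toy model values
print(f"adjusted observation: {obs:.2f} Pg C yr^-1")
print(f"adjusted model:       {mod:.2f} Pg C yr^-1")
```

The point of the adjustment is that observed air-sea fluxes include a natural riverine-driven outgassing component, so both sides of the model-data comparison must be placed on the same footing before the Table 4 deviations are computed.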
Figure 6 (caption): The green (red) shading flags an improvement (degradation) of the model's ability to replicate the observed geographical structure for a given field. The ocean mixed-layer depth is computed similarly in all models, based on a density criterion of 0.03 kg m−3, and is evaluated against the observational dataset of de Boyer Montégut et al. [162].

The first step change to expect in the next generation of models is the emergence of high-resolution ocean biogeochemical models fit to investigate centennial-scale simulations. This step change may be supported in a number of ways: (1) the availability of greater computational resources; (2) the use of hybrid-resolution numerical schemes to decrease the cost of biogeochemical models (e.g. [164]); (3) an actual reduction in the complexity of marine biogeochemical models (e.g. miniBLING [105]); (4) the use of machine learning either to accelerate marine biogeochemical models or to reduce the numerical cost necessary to improve their performance (i.e. via tuning). These (and potentially other) step changes will help to understand the extent to which mesoscale or sub-mesoscale ocean physics might change the response of marine biogeochemistry to rising CO2 and climate change, a missing factor in such models already highlighted from CMIP5 and IPCC AR5 [2]. A second important step change is related to phytoplankton physiology and evolution. This change may have two benefits. First, several recent studies show that the inclusion of a more comprehensive treatment of plankton physiology may improve model performance, in particular for some systematic biases in the Southern Ocean (e.g. [108, 165]). Second, this improvement is arguably a first step toward the representation of adaptation and fitness in ocean biogeochemical models [166, 167]. This omission remains an important caveat for multi-stressor studies (e.g.
[9]) or time-of-emergence studies [168], as current models effectively assume no change in the underlying properties of modelled plankton.

Figure 7 (caption): Portrait diagram highlighting the performance of CMIP6 models (one representative per modelling group) with respect to their CMIP5 predecessors. The variables of interest are mixed-layer depth (oml), air-sea CO2 flux (fgco2), surface chlorophyll (chl), oxygen concentration at 150 m (o2) and surface concentrations of nitrate (no3) and silicic acid (si). The skill-score metric, the z-score, is computed for a given model m and a given field as z(m) = (RMSE_CMIP6(m) − RMSE_CMIP5(p)) / RMSE_CMIP5(p), where RMSE_CMIP6(m) is the global area-weighted average model-data root-mean-squared error (RMSE) of the model of the current generation contributing to CMIP6 and RMSE_CMIP5(p) is the RMSE of its predecessor that contributed to CMIP5. Greenish (reddish) colours and negative (positive) z-scores indicate improved (degraded) field representations in the CMIP6 model versions; darker colours indicate a greater change from CMIP5 to CMIP6. Grey indicates missing data for one or both generations of models. The air-sea CO2 flux (fgco2) was adjusted for riverine-induced outgassing as in Table 4. The ocean mixed-layer depth is computed similarly in all models, based on a density criterion of 0.03 kg m−3, and is evaluated against the observational dataset of de Boyer Montégut et al. [162].

Future developments should be pursued in the context of the internal cycling of micronutrients involved in phytoplankton physiology and metabolism, such as iron, zinc or copper. Our review confirms that the current generation of marine biogeochemical models still struggles to reproduce the major features of the oceanic iron distribution, although observations of dissolved iron in the ocean are growing rapidly [149] and are made widely available by GEOTRACES [169].
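The z-score skill metric described in the Fig. 7 caption reduces to a one-line calculation: the change in global area-weighted model-data RMSE from a CMIP5 model to its CMIP6 successor, relative to the CMIP5 RMSE. This sketch follows the caption's description; the RMSE values are invented for illustration.

```python
# Minimal sketch of the Fig. 7 z-score: relative RMSE change between a
# CMIP6 model and its CMIP5 predecessor. Negative values flag an improved
# field representation. RMSE inputs here are invented placeholders.

def z_score(rmse_cmip6, rmse_cmip5):
    """Relative RMSE change; negative means the CMIP6 version improved."""
    return (rmse_cmip6 - rmse_cmip5) / rmse_cmip5

print(f"{z_score(0.8, 1.0):+.2f}")  # a 20% RMSE reduction -> -0.20
print(f"{z_score(1.1, 1.0):+.2f}")  # a 10% RMSE increase  -> +0.10
```

The "overall reduction in model-data RMSE of about 20%" reported above for most CMIP6 models corresponds to z-scores around −0.2 in this formulation.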
A key challenge for iron is that the dissolved iron commonly measured appears to represent only a trace residual of the underlying fluxes [170], pointing to the need for more process studies and observations of fluxes. It is possible that iron isotopes may yield further insight into the role of external inputs and internal cycling in shaping iron distributions in both observations and models. Finally, the development of additional model components dealing with other trace metals, such as cobalt [171], zinc [172], manganese [173] and copper [174], may also prove beneficial, in particular for constraining the magnitude and dynamics of external inputs. An expanded array of biological taxa may also be expected in the next generation of ocean biogeochemical models. A potentially important change in the ocean-ecosystem modelling paradigm is the inclusion and integration of mixotrophs, which are important grazers of bacterioplankton and also feed on phytoplankton, microzooplankton and (sometimes) mesozooplankton. Mixotrophic bacterivory among the phytoplankton may be important for alleviating nutrient stress and may increase primary production in oligotrophic waters. Some modelling studies indicate that mixotrophy has a profound impact on marine planktonic ecosystems and may enhance primary production, biomass transfer to higher trophic levels and the functioning of the biological carbon pump [175]. This expanded array of biological taxa may take the concept of the marine biogeochemical model up to that of the marine ecosystem model, which will enable the representation of feedbacks of the marine trophic food web on marine biogeochemical cycles. The work of Lefort et al. [57] provides an example of this type of marine ecosystem model, realizing a comprehensive coupling between a marine biogeochemical model (PISCES) and a marine trophic food-web model (APECOSM).
A third important step change is related to the couplings between Earth system components and ocean biogeochemistry. Our review highlights that models have evolved toward a more comprehensive treatment of biological boundary conditions (e.g. atmospheric deposition, riverine inputs, sediments, ice sheets, geothermal sources) but that these are currently largely represented using climatological data rather than dynamic connections. Progress toward more complete couplings between Earth system components such as rivers, ice-sheet/iceberg calving and ice shelves, or atmospheric aerosols can help to better simulate the interactions between marine biogeochemistry, biogeochemical cycles and climate. In the same manner, a more comprehensive treatment of biophysical and biogeochemical feedbacks could be realized in the next generation of marine biogeochemical models. The latter involves, for instance, ocean emissions of greenhouse gases or biogenic volatile organic compounds (BVOCs), which are already simulated by a small number of models (see Table 5). However, our understanding of the global cycles of DMS, N2O and CH4 (including, specifically, the processes that produce them) is much less developed than for CO2. Therefore, a better treatment of biophysical and biogeochemical feedbacks requires a larger array of observational datasets in order to improve our understanding of the processes underlying these ocean emissions. From the perspective of tracking future model improvement, it is important to stress that our capacity to assess model performance resulting from any of the potential advances discussed above is contingent upon continued improvement in observational constraints. Existing constraints were adequate for detecting large skill differences between CMIP5 and CMIP6 models, but the overall improvement in models necessitates more precise comparisons to detect skill differences. Such comparisons are challenged by data sparsity and
uncertainties in the algorithms designed to derive global fields from sparse data or to infer properties of interest from remotely sensed variables. Continued improvement in the quality and quantity of data-based constraints is critical. That being said, our review of the available pairs of CMIP5-CMIP6 marine biogeochemical models strongly suggests that careful consideration is needed when selecting model complexity with regard to the fitness-for-purpose of models (i.e. carbon-cycle feedbacks, multiple Earth system feedbacks, multi-stressors, adaptation and biodiversity). Indeed, when confronting model complexity with model mean-state performance, our work suggests that complex models do not necessarily outperform simple models. This is consistent with the earlier study of Kwiatkowski et al. [179], which directly led to the choice of the marine biogeochemistry model in UKESM1-0-LL, where across many Earth-system-relevant metrics the simplest model performed best. In this sense, our review shows that simple models (e.g. OCMIP nutrient restoring or NPZD type) remain viable when investigating carbon-cycle feedbacks, although more complex models do still permit a better linkage with marine biodiversity or a broader array of feedbacks and potentially more realistic Earth system behaviour.

Acknowledgements R.S., on behalf of the author team, thanks the two anonymous referees for their useful comments that have improved the quality of this paper. R.S. thanks the author team for their contributions to this paper, which occurred during the coronavirus SARS-CoV-2 pandemic. R.S. and S.B. thank the support of the team in charge of the CNRM-CM climate model. Supercomputing time was provided by the Météo-France/DSI supercomputing centre.

Conflict of Interest On behalf of all authors, the corresponding author states that there is no conflict of interest.
Human and Animal Rights This article does not contain any studies with human or animal subjects performed by any of the authors.

Disclaimer This article reflects only the authors' view; the funding agencies as well as their executive agencies are not responsible for any use that may be made of the information that the article contains.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third-party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
clim chang 2013 -phys sci basis bio-physical feedbacks in the arctic ocean using an earth system model regional impacts of climate change and atmospheric co2 on future ocean carbon uptake: a multimodel linear feedback analysis nonlinearity of ocean carbon cycle feedbacks in cmip5 earth system models global marine primary production constrains fisheries catches photosynthesis and fish production in the sea reconciling fisheries catch and ocean productivity multiple stressors of ocean ecosystems in the 21st century: projections with cmip5 models oxygen and indicators of stress for marine life in multi-model global warming projections climate change impacts on marine ecosystems projected ph reductions by 2100 might put deep north atlantic biodiversity at risk diverging seasonal extremes for ocean acidification during the twenty-first century anthropogenic ocean acidification over the twentyfirst century and its impact on calcifying organisms the response of marine carbon and nutrient cycles to ocean acidification: large uncertainties related to phytoplankton physiological assumptions ocean acidification: emergence from pre-industrial conditions the impacts of ocean acidification on marine trace gases and the implications for atmospheric chemistry and climate transport and storage of c02 in the ocean -an inorganic ocean-circulation carbon cycle model a perturbation simulation of co2 uptake in an ocean general circulation model basic performance of a new earth system model of the meteorological research institute (mri-esm 1) carbon emission limits required to satisfy future representative concentration pathways of greenhouse gases the norwegian earth system model, noresm1-m -part 1: description and basic evaluation of the physical climate climate change projections using the ipsl-cm5 earth system model: from cmip3 to cmip5 gfdl's esm 2 global coupled climate-carbon earth system models. 
part ii: carbon system formulation and baseline simulation characteristics climate and carbon cycle changes from 1850 to 2100 in mpi-esm simulations for the coupled model intercomparison project phase 5 preindustrial-control and twentieth-century carbon cycle experiments with the earth system model cesm1(bgc) natural air-sea flux of co 2 in simulations of the nasa-giss climate model: sensitivity to the physical ocean model formulation development and evaluation of cnrm earth system model -cnrm-esm 1 2010: model description and basic results of cmip5-20c3m experiments global carbon budgets simulated by the beijing climate center climate system model for the last century carbon-concentration and carbon-climate feedbacks in cmip5 earth system models carbon-concentration and carbon-climate feedbacks in cmip6 models, and their comparison to cmip5 models climate-carbon cycle feedback analysis: results from the c 4 mip model intercomparison uncertainties in cmip5 climate projections due to carbon cycle feedbacks the oceanic origin of path-independent carbon budgets climate engineering and the ocean: effects on biogeochemistry and primary production impact of solar radiation modification on allowable co 2 emissions: what can we learn from multi-model simulations? 
earth's futur impact of idealized future stratospheric aerosol injection on the large-scale ocean and land carbon cycles globalizing results from ocean in situ iron fertilization studies globalizing results from ocean in situ iron fertilization studies ocean iron fertilization in the context of the kyoto protocol and the post-kyoto process implications of large-scale iron fertilization of the oceans efficiency of carbon removal per added iron in ocean iron fertilization global negative emissions capacity of ocean macronutrient fertilization potential climate engineering effectiveness and side effects during a high carbon dioxideemission scenario low efficiency of nutrient translocation for enhancing oceanic uptake of carbon dioxide enhanced rates of regional warming and ocean acidification after termination of large-scale ocean alkalinization ocean solutions to address climate change and its effects on marine ecosystems impacts of artificial ocean alkalinization on the carbon cycle and climate in earth system simulations assessing the potential of calcium-based artificial ocean alkalinization to mitigate rising atmospheric co2 and ocean acidification núñez-riboni i. 
global ocean biogeochemistry model hamocc: model architecture and performance as component of the mpi-earth system model in different cmip5 experimental realizations a more productive, but different, ocean after mitigation ocean carbon cycle feedbacks under negative emissions shrinking of fishes exacerbates impacts of global ocean changes on marine ecosystems natural variability of marine ecosystems inferred from a coupled climate to ecosystem simulation spatial and body-size dependent response of marine pelagic communities to projected global climate change on the use of ipcc-class models to assess the impact of climate on living marine resources global ensemble projections reveal trophic amplification of ocean biomass declines with climate change a protocol for the intercomparison of marine fishery and ecosystem models: fish-mip v1.0. geosci model dev decadal predictions of the north atlantic co2 uptake predicting the variable ocean carbon sink, eaav6471 predicting near-term variability in ocean carbon uptake seasonal to multiannual marine ecosystem prediction with a global earth system model multiyear predictability of tropical marine productivity assessing the decadal predictability of land and ocean carbon uptake predicting near-term changes in the earth system: a large ensemble of initialized decadal prediction simulations using the community earth system model towards real-time verification of co2 emissions managing living marine resources in a dynamic environment: the role of seasonal to decadal climate forecasts a multi-decade record of high-quality fco2 data in version 3 of the surface ocean co2 atlas (socat) maredat: towards a world atlas of marine ecosystem data global ocean data analysis project, version 2 (glodapv2) the beijing climate center climate system model (bcc-csm): main progress from cmip5 to cmip6. geosci model dev the canadian earth system model version 5 (canesm5.0.3). 
geosci model dev the community earth system model version 2 (cesm2) inconsistent strategies to spin up models in cmip5: implications for ocean biogeochemical model performance assessment evaluation of cnrm earth-system model, cnrm-esm2-1: role of earth system processes in present-day and future climate structure and performance of gfdl's cm4.0 climate model noaa-gfdl gfdl-esm4 model output prepared for cmip6 cmip. earth system grid federation global carbon cycle and climate feedbacks in the nasa giss modele2.1. submitted to journal of advances in modeling earth systems the hadgem2-es implementation of cmip5 centennial simulations ukesm1: description and evaluation of the uk earth system model presentation and evaluation of the ipsl-cm6a-lr climate model description of the miroc-es2l earth system model and evaluation of its climate-biogeochemical processes and feedback. geosci model dev discuss developments in the mpi-m earth system model version 1.2 (mpi-esm 1.2) and its response to increasing co2 a new global climate model of the meteorological research institute: mri-cgcm3 -model description and basic performance the norwegian earth system model, noresm2 -evaluation of thecmip6 deck and historical simulations. geosci model dev discuss preindustrial, historical, and fertilization simulations using a global ocean carbon model with new parameterizations of iron limitation, calcification, and n2 fixation csib v1 (canadian sea-ice biogeochemistry): a sea-ice biogeochemical model for the nemo community ocean modelling framework. 
geosci model dev upper ocean ecosystem dynamics and iron cycling in a global three-dimensional model pisces-v2: an ocean biogeochemical model for carbon and ecosystem studies ocean biogeochemistry in gfdl's earth system model 4.1 and its response to increasing atmospheric co2 simple global ocean biogeochemistry with light, iron, nutrients and gas version 2 (blingv2): model description and simulation characteristics in gfdl's cm4.0 the gfdl earth system model version 4.1 (gfdl-esm4.1): model description and simulation characteristics drivers of air-sea co2 flux seasonality and its long-term changes in the nasa-giss model cmip6 submission description and evaluation of the diat-hadocc model v1.0: the ocean biogeochemical component of hadgem2-es. geosci model dev medusa-2.0: an intermediate complexity biogeochemical model of the marine carbon cycle for climate change and ocean acidification studies modeling in earth system science up to and beyond ipcc ar5 incorporating a prognostic representation of marine nitrogen fixers into the global ocean biogeochemical model hamocc uptake mechanism of anthropogenic co2 in the kuroshio extension region in an ocean general circulation model ocean biogeochemistry in the norwegian earth system model version 2 (noresm2) problems and prospects in large-scale ocean circulation models towards accounting for dissolved iron speciation in global ocean models complex functionality with minimal computation: promise and pitfalls of reduced-tracer ocean biogeochemistry models ecological nitrogen-to-phosphorus stoichiometry at station aloha optimal nitrogen-to-phosphorus stoichiometry of phytoplankton the impact of variable phytoplankton stoichiometry on projections of primary production, food quality, and carbon uptake in the global ocean buffering of ocean export production by flexible elemental stoichiometry of particulate organic matter ocean nutrient ratios governed by plankton biogeography hydrothermal vents trigger massive phytoplankton 
key: cord-288342-i37v602u authors: wang, zhen; andrews, michael a.; wu, zhi-xi; wang, lin; bauch, chris t.
title: coupled disease–behavior dynamics on complex networks: a review date: 2015-07-08 journal: phys life rev doi: 10.1016/j.plrev.2015.07.006 sha: doc_id: 288342 cord_uid: i37v602u it is increasingly recognized that a key component of successful infection control efforts is understanding the complex, two-way interaction between disease dynamics and human behavioral and social dynamics. human behavior such as contact precautions and social distancing clearly influences disease prevalence, but disease prevalence can in turn alter human behavior, forming a coupled, nonlinear system. moreover, in many cases, the spatial structure of the population cannot be ignored, such that social and behavioral processes and/or transmission of infection must be represented with complex networks. research on coupled disease–behavior dynamics in complex networks in particular is growing rapidly, and frequently makes use of analysis methods and concepts from statistical physics. here, we review some of the growing literature in this area. we contrast network-based approaches to homogeneous-mixing approaches, point out how their predictions differ, describe the rich and often surprising behavior of disease–behavior dynamics on complex networks, and compare it to processes in statistical physics. we discuss how these models can capture the dynamics that characterize many real-world scenarios, thereby suggesting ways that policy makers can better design effective prevention strategies. we also describe the growing sources of digital data that are facilitating research in this area. finally, we point out pitfalls that might be faced by researchers in the field, and suggest several ways in which the field could move forward in the coming years. infectious diseases have long caused enormous morbidity and mortality in human populations. one of the most devastating examples is the black death, which killed 75 to 200 million people in the medieval period [1] .
currently, the rapid spread of infectious diseases still imposes a considerable burden [2] . to elucidate transmission processes of infectious diseases, mathematical modeling has become a fruitful framework [3] . in the classical modeling framework, a homogeneously mixed population can be classified into several compartments according to disease status. in particular, the most common compartments are those that contain susceptible individuals (s), infectious (or infected) individuals (i), and recovered (and immune) individuals (r). using these states, systems of ordinary differential equations (odes) can be created to capture the evolution of diseases with different natural histories. for example, a disease with no immunity, where susceptible individuals who become infected return to the susceptible class after recovering (sis natural history, see fig. 1), can be described by d[s]/dt = -β[s][i] + μ[i] and d[i]/dt = β[s][i] - μ[i], where [s] ([i]) represents the number of susceptible (infectious) individuals in the population, β is the transmission rate of the disease, and μ is the recovery rate of infected individuals. some diseases, however, may give immunity to individuals who have recovered from infection (sir natural history, see fig. 1); in this case the dynamics become d[s]/dt = -β[s][i], d[i]/dt = β[s][i] - μ[i], and d[r]/dt = μ[i], where [r] is the number of recovered (and immune) individuals. in these ode models, a general measure of disease severity is the basic reproductive number r0 = βn/μ, where n is the population size. in simple terms, r0 is the mean number of secondary infections caused by a single infectious individual, during its entire infectious period, in an otherwise susceptible population [4] . if r0 < 1, the disease will not survive in the population. however, if r0 > 1, the disease may be able to persist. typically, parameters like the transmission rate and recovery rate are treated as fixed. however, new approaches to modeling have been developed in the past few decades to address some of the limitations of the classic differential equation framework that stem from its simplifying assumptions.
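the threshold role of r0 can be made concrete with a short numerical sketch (not from the paper; the forward-euler scheme and all parameter values are illustrative choices):

```python
# Minimal sketch: forward-Euler integration of the classical SIR model,
# illustrating the threshold role of R0 = beta*N/mu. Parameters are
# illustrative, not taken from any cited model.

def simulate_sir(beta, mu, n, i0, dt=0.01, steps=200_000):
    """Integrate dS/dt = -beta*S*I, dI/dt = beta*S*I - mu*I, dR/dt = mu*I."""
    s, i, r = n - i0, i0, 0.0
    for _ in range(steps):
        new_inf = beta * s * i * dt   # new infections this step
        rec = mu * i * dt             # recoveries this step
        s -= new_inf
        i += new_inf - rec
        r += rec
    return s, i, r

n, mu = 1000.0, 0.1
# R0 = beta*N/mu = 2: a substantial outbreak infecting most of the population
_, _, r_big = simulate_sir(beta=2 * mu / n, mu=mu, n=n, i0=1.0)
# R0 = 0.5: the outbreak dies out after a handful of cases
_, _, r_small = simulate_sir(beta=0.5 * mu / n, mu=mu, n=n, i0=1.0)
print(round(r_big), round(r_small))
```

with r0 = 2 roughly 80% of the population is eventually infected, while with r0 = 0.5 only a few secondary cases occur before extinction, matching the threshold statement above.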
for instance, the impact of behavioral changes in response to an epidemic is usually ignored in these formulations (e.g., the transmission rate is fixed), but in reality, individuals usually change their behavior during an outbreak according to changes in perceived infection risk, and their behavioral decisions can in turn impact the transmission of infection. another limitation of the classical compartmental models is the assumption of well-mixed populations (namely, that individuals interact with all others at the same contact rate), which thus neglects the heterogeneous spatial contact patterns that can arise in realistic populations. in this review we will describe how models of the past few decades have begun to address these limitations of the classic framework. traditionally, infectious disease models have treated human behavior as a fixed phenomenon that does not respond to disease dynamics or any other natural dynamics. for many research questions, this is a useful and acceptable simplification. however, in other cases, human behavior responds to disease dynamics, and in turn disease dynamics responds to human behavior. for example, the initiation of an epidemic may cause a flood of awareness in the population such that protective measures are adopted. this, in turn, reduces the transmission of the disease. in such cases, it becomes possible to speak of a single, coupled "disease-behavior" system (fig. 2 caption: schematic illustration of disease-behavior interactions as a negative feedback loop. in this example, the loop from disease dynamics to behavioral dynamics is positive (+), since an increase in disease prevalence will cause an increase in perceived risk and thus an increase in protective behaviors. the loop from behavioral dynamics back to disease dynamics is negative (−), since an increase in protective behaviors such as contact precautions and social distancing will generally suppress disease prevalence.) where a human subsystem and a disease
transmission subsystem are coupled to one another (see fig. 2 ). moreover, because the human and natural subsystems are themselves typically nonlinear, the coupled system is also typically nonlinear. this means that phenomena can emerge that cannot be predicted by considering each subsystem in isolation. for example, protective behavior on the part of humans may ebb and flow according to disease incidence and according to a characteristic timescale (as opposed to being constant over time, as would occur in the uncoupled subsystems). to explore strategic interactions between individual behaviors, game theory has become a key tool across many disciplines. it provides a unified framework for decision-making, where the participating players in a conflict must make strategy choices that potentially affect the interests of other players. game theory and its corresponding equilibrium concepts, such as the nash equilibrium, emerged in seminal works from the 1940s and 1950s [5, 6] . a nash equilibrium is a set of strategies such that no player has an incentive to unilaterally deviate from the present strategy. that is, at the nash equilibrium strategies form best responses to one another, since every player is perfectly rational and consistently seeks to maximize his own benefit or utility. game theory has been applied to fields such as economics, biology, mathematics, public health, ecology, traffic engineering, and computer science [7] [8] [9] [10] [11] [12] . for example, in voluntary vaccination programs, the formal theory of games can be employed as a framework to analyze the vaccination equilibrium level in populations [9, 13, 14] . in the context of vaccination, the feedback between individual vaccination decisions (or other prevention behaviors) and disease spreading is captured; hence these systems exemplify coupled disease-behavior systems.
in spite of the great progress of game theory, the classical paradigm still shows its limitations in many scenarios. it thus becomes instructive to relax some key assumptions, such as by introducing bounded rationality. game theory has been extended into evolutionary biology, which has generated great insight into the evolution of strategies [15] [16] [17] [18] [19] under both biological and cultural evolution. for instance, the replicator equation, which consists of sets of differential equations describing how the strategies of a population evolve over time under selective pressures, has also been used to study learning in various scenarios [20] . beyond these temporal concepts, spatial interaction topology has also proved to be crucial in determining system equilibria (see refs. [16, 17] for a comprehensive overview). evolutionary game theory has been extensively applied to behavioral epidemiology, whose details will be surveyed in the following sections. several methods from statistical physics have become useful in the study of disease-behavior interactions on complex networks. most populations are spatially structured in the sense that individuals preferentially interact with those who share close geographic proximity. perhaps the simplest population structure is a regular lattice: all the agents are assigned specific locations on it, normally a two-dimensional square lattice, just like atoms in crystal lattice sites, and interact with only their nearest neighbors. in a regular lattice population, each individual meets the same people they interact with regularly, rather than being randomly reshuffled into a homogeneous mixture, as in well-mixed population models. in addition, another type of homogeneous network attracting great research interest is the erdős-rényi (er) graph [21] , a graph where nodes are linked up randomly, which is often used in the rigorous analysis of graphs and networks.
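the replicator equation mentioned above can be illustrated with a minimal sketch (the two-strategy payoff matrix and all numbers are my own illustrative choices, not from the review): for a game with payoff matrix [[a, b], [c, d]], the fraction x of players using strategy 1 evolves as dx/dt = x(1 − x)(f1(x) − f2(x)):

```python
# Sketch of replicator dynamics for a two-strategy matrix game
# (illustrative payoffs; not a model from the cited references).

def replicator(x, payoff, dt=0.01, steps=100_000):
    """Integrate dx/dt = x*(1-x)*(f1 - f2), where f1, f2 are the
    expected payoffs of strategies 1 and 2 against the current mix."""
    (a, b), (c, d) = payoff
    for _ in range(steps):
        f1 = a * x + b * (1 - x)   # expected payoff of strategy 1
        f2 = c * x + d * (1 - x)   # expected payoff of strategy 2
        x += x * (1 - x) * (f1 - f2) * dt
    return x

# hawk-dove-like payoffs: f1 - f2 = 1 - 2x, so the stable interior
# equilibrium (where neither strategy does better) is x* = 0.5
x_star = replicator(0.1, ((0.0, 3.0), (1.0, 2.0)))
print(round(x_star, 3))
```

starting from any interior frequency, the population converges to the mixed equilibrium x* = 0.5, the evolutionary analogue of the nash equilibrium for this payoff matrix.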
however, in reality, there is ubiquitous heterogeneity in the number of contacts per individual, and recent studies have shown that the distribution of contact numbers in some social networks is not homogeneous but appears to follow a power law [22] . moreover, social contact networks also display small-world properties (i.e., short average path length between any two individuals and a strong local clustering tendency), which cannot be well described by regular lattices or random graphs [23] . with both motivations, two significant milestones appeared in the late 1990s: the theoretical models of small-world (sw) networks and scale-free (sf) networks [24, 25] . subsequently, more properties of social networks have been extensively investigated, such as community structure (a kind of assortative structure where individuals are divided into groups such that the members within each group are mostly connected with each other) [26] , clusters [27] , and the recent proposals of multilayer as well as time-varying frameworks [28] [29] [30] [31] [32] . due to the broad applicability of complex networks, network models have been widely employed in epidemiology to study the spread of infectious diseases [27] . in networks, a vertex represents an individual and an edge between two vertices represents a contact over which disease transmission may occur. an epidemic spreads through the network from infected to susceptible vertices. with the advent of various network algorithms, it becomes instructive to incorporate disease dynamics into such infrastructures to explore the impact of spatial contact patterns [33] [34] [35] [36] [37] [38] . replacing the homogeneous mixing hypothesis, under which any individual can come into contact with any other agent, networked epidemic research assumes that each individual has a limited number of contacts, denoted by its degree k.
under this treatment, the most significant physics finding is that network topology directly determines the epidemic outbreak threshold and the associated phase transition. for example, in contrast with the finite epidemic threshold of random networks, pastor-satorras et al. found that a disease with sis dynamics and even a very small transmission rate can spread and persist in sf networks (i.e., there is no disease threshold) [39] . this helps to explain why it is extremely difficult to eradicate viruses on the internet and the world wide web, and why those viruses have unusually long lifetimes. but the absence of an epidemic threshold only holds for sf networks with a power-law degree distribution p (k) ∼ k −γ with γ ∈ (2, 3]. if γ is extended to the range (3, 4) , an anomalous critical behavior takes place [39, 40] . to state the condition for disease spread, it is convenient to define the relative spreading rate λ ≡ β/μ. the larger λ is, the more likely the disease is to spread. generally, for a network with arbitrary degree distribution, the epidemic threshold is λ_c = ⟨k⟩/⟨k²⟩, where ⟨k⟩ is the average degree and ⟨k²⟩ is the second moment of the degree distribution. in particular, for an sf network ⟨k²⟩ diverges in the n → ∞ limit, and so the epidemic threshold is expected to vanish. similarly, it is easy to derive the threshold of the sir model, which is likewise related to the average degree ⟨k⟩ and the second moment ⟨k²⟩ of the network. following these findings, more endeavors have been devoted to the epidemic threshold of spatial networks with various properties, such as degree correlation [41, 42] , sw topology [23] , community structure [43] , and k-core [44] . on the other hand, more analysis and prediction methods (such as the mean-field method and generating functions) have been proposed to explain disease transitions on realistic networks [27, 45] , and immunization strategies for spatial networks have been largely identified [46] . to illustrate the meaning of studying disease-behavior dynamics on complex networks, it is instructive to first describe a simple example of such a system.
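before turning to that example, the heterogeneous mean-field threshold λ_c = ⟨k⟩/⟨k²⟩ quoted above can be checked numerically for a truncated power-law degree distribution (a sketch with illustrative parameters; the discrete normalization is my own simple choice):

```python
# Sketch: heterogeneous mean-field epidemic threshold lambda_c = <k>/<k^2>
# for a truncated power-law degree distribution p(k) ~ k^(-gamma),
# k_min <= k <= k_max (illustrative parameters).

def hmf_threshold(gamma, k_min=2, k_max=100):
    """Return <k>/<k^2> for the discrete truncated power law."""
    ks = range(k_min, k_max + 1)
    norm = sum(k ** -gamma for k in ks)
    k_mean = sum(k ** (1 - gamma) for k in ks) / norm    # <k>
    k2_mean = sum(k ** (2 - gamma) for k in ks) / norm   # <k^2>
    return k_mean / k2_mean

# for gamma in (2, 3], <k^2> grows without bound as the cutoff k_max grows,
# so the threshold keeps shrinking -- mirroring the vanishing threshold of
# infinite scale-free networks
for k_max in (10**2, 10**3, 10**4):
    print(k_max, hmf_threshold(2.5, k_max=k_max))
```

increasing the cutoff by two orders of magnitude shrinks λ_c markedly, which is the finite-size shadow of the n → ∞ result discussed above.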
consider a population of individuals who are aware of a spreading epidemic. the information each individual receives regarding the disease status of others is derived from the underlying social network of the population. these networks have been shown to display heterogeneous contact patterns, where the node degree distribution often follows a power law [47, 48] . it is possible to use these complex network patterns to model a realistic population that exhibits adaptive self-protective behavior in the presence of a disease. a common way to incorporate this self-protective behavior is to allow individuals to lower their susceptibility according to the proportion of their contacts that are infectious, as demonstrated by bagnoli et al. [49] . in this model, the authors reduce the susceptibility of an individual to a disease which has a simple sis natural history by multiplying the transmission rate by a negative exponential function of the proportion of their neighbors who are infectious. specifically, the transmission rate is replaced by βi (ψ, k), where β is the per-contact transmission probability and i (ψ, k) is a decaying exponential that models the effect an individual's risk perception has on its susceptibility; j and τ are constants that govern the level of precaution individuals take, ψ is the number of infectious contacts an individual has, and k is the total number of contacts an individual has. the authors show that the introduction of adaptive behavior has the potential not only to reduce the probability of new infections occurring in highly disease-concentrated areas, but also to cause epidemics to go extinct. specifically, when τ = 1, there is a value of j for which an epidemic can be stopped in regular lattices and sw networks [25] . however, for certain sf networks, there is no value of j that is able to stop the disease from spreading.
in order to achieve disease extinction in these networks, hub nodes must adopt additional self-protective measures, which is accomplished by decreasing τ for these individuals. the conclusions derived from this model highlight the significant impact different types of complex networks can have on health outcomes in a population, and how behavioral changes can dictate the course of an epidemic. the remainder of this review is organized as follows. in section 2, we will focus on the disease-behavior dynamics of homogeneously mixed populations, and discuss when the homogeneous mixing approximation is or is not valid. this provides a comprehensive prologue to the overview of the coupled systems on networks in section 3. within the latter, we separately review dynamics in different types of networked populations, which are frequently viewed through the lens of physical phenomena (such as phase transitions and pattern formation) and analyzed with physics-based methods (such as monte carlo simulation and mean-field prediction). based on all these achievements, we can capture how coupled disease-behavior dynamics affect disease transmission and spatial contact patterns. section 4 will be devoted to empirical concerns, such as the types of data that can be used for these study systems, and how questionnaires and digital equipment can be used to collect data on relevant social and contact networks. in addition, it is meaningful to examine whether some social behaviors predicted by models really exist in vaccination experiments and surveys. finally, we will conclude with a summary and an outlook in section 5, describing the implications of the statistical physics of spatial disease-behavior dynamics and outlining viable directions for future research. throughout, we will generally focus on preventive measures other than vaccination (such as social distancing and hand washing), although we will also touch upon vaccination in a few places.
a large body of literature addresses disease-behavior dynamics in populations that are assumed to be mixing homogeneously, such that spatial structure can be neglected. incorporating adaptive behavior into a model of disease spread can provide important insight into population health outcomes, as the activation of social distancing and other nonpharmaceutical interventions (npis) has been observed to have the ability to alter the course of an epidemic [50] [51] [52] . (table 1 : disease-behavior models applied to well-mixed populations, classified by infection type and by whether they are economic-based or rule-based.) when making decisions regarding self-protection from an infection, individuals must gather information relevant to the disease status of others in the population. prophylactic behavior can be driven by disease prevalence, imitation of others around them, or personal beliefs about probable health outcomes. in this section, we will survey the features and results of mathematical models that incorporate prophylactic decision-making behavior in homogeneously mixed populations. the approaches we consider can be classified into two separate categories: economic-based and rule-based. economic-based models (such as game-theoretical models) assume individuals seek to maximize their social utility, whereas rule-based models prescribe prevalence-based rules (not explicitly based on utility) according to which individuals and populations behave. both of these methods can also be used to study the dynamics of similar diseases (see table 1 ), and are discussed in detail below. the discovery of human immunodeficiency virus (hiv)/acquired immune deficiency syndrome (aids) and its large economic impacts stimulated research into behaviorally based mathematical models of sexually transmitted diseases (stds). in disease-behavior models, a population often initiates a behavior change in response to an increasing prevalence of a disease.
in the context of stds, this change in behavior may include safer sex practices, or a reduction in the number of partnerships individuals seek out. following this prevalence-based decision making principle, researchers have used the concept of utility maximization to study the behavior dynamics of a population [53] [54] [55] [56] [57] . in these models, individuals seek to maximize their utility by solving dynamic optimization problems. utility is derived by members of the population when engaging in increased levels of social contact. however, this increased contact or partner change rate also increases the chance of becoming infected. one consequence of this dynamic is that higher levels of prevalence can result in increased prophylactic behavior, which in turn decreases the prevalence over time. as this occurs, self-protective measures used by the population will also fall, which may cause disease cycles [53, 56] . nonetheless, in the case of stds which share similar transmission pathways, a population protecting themselves from one disease by reducing contact rates can also indirectly protect themselves from another disease simultaneously [53] . in general, the lowering of contact rates in response to an epidemic can reduce its size, and also delay new infections [57] . however, this observed reduction of contact rates may not be uniform across the whole population. for example, an increase in prevalence may cause the activity rates of those with already low social interaction to fall even further, but this effect may not hold true for those with high activity rates [54] . in fact, the high-risk members of the population will gain a larger fraction of high-risk partners in this scenario, resulting from the low-risk members reducing their social interaction rates. this dynamic serves to increase the risk of infection of high activity individuals even further. 
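the prevalence-driven reduction of contact rates described in these models can be sketched with a toy sis system (a sketch, not any specific cited model; the response form β(i) = β0/(1 + α·i) and all parameter values are assumptions for illustration):

```python
# Toy SIS model with a prevalence-responsive transmission rate
# beta(i) = beta0 / (1 + alpha*i): higher prevalence -> more caution ->
# fewer effective contacts. All parameters are illustrative assumptions.

def endemic_prevalence(alpha, beta0=0.5, mu=0.2, dt=0.01, steps=200_000):
    """Integrate di/dt = beta(i)*i*(1-i) - mu*i and return the endemic level."""
    i = 0.01
    for _ in range(steps):
        beta = beta0 / (1 + alpha * i)   # behavioral response to prevalence
        i += (beta * i * (1 - i) - mu * i) * dt
    return i

# with no behavioral response the endemic level is 1 - mu/beta0 = 0.6;
# a responsive population (alpha = 10) settles at a much lower level
print(round(endemic_prevalence(0.0), 3), round(endemic_prevalence(10.0), 3))
```

solving β0(1 − i) = μ(1 + α·i) for α = 10 gives i* = 0.12, so in this toy setting the behavioral response cuts the endemic prevalence five-fold; adding a delay to the response is the standard way such models generate the disease cycles mentioned above.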
these utility-based economic models show us that when considering health outcomes, one must be acutely aware of the welfare costs associated with self-protective behavior or implementing disease mitigation policies [56] . a health policy, such as encouraging infectious individuals to self-quarantine, may actually cause a rise in disease prevalence due to susceptible individuals feeling less threatened by infection and subsequently abandoning their own self-protective behavior [56] . also, a population that is given a pessimistic outlook on an epidemic may in fact cause the disease to spread more rapidly [55] . recently, approaches using game theory have been applied to self-protective behavior and social distancing [58] [59] [60] . when an individual's risk of becoming infected only depends on their personal investment in social distancing, prophylactic behavior is not initiated until after an epidemic begins, and ceases before an epidemic ends. also, the basic reproductive number of a disease must exceed a certain threshold for individuals to feel self-protective behavior is worth the effort [58] . in scenarios where the contact rate of the population increases with the number of people out in public, a nash equilibrium exists, but the level of self-protective behavior in it is not socially optimal [59] . nonetheless, these models also show that the activation of social distancing can weaken an epidemic. some models of disease-behavior dynamics, rather than assuming humans are attempting to optimize a utility function, represent human behavior by specifying rules that humans follow under certain conditions. these could include phenomenological rules describing responses to changes in prevalence, or more complex psychological mechanisms. rule-based compartmental models using systems of differential equations have also been used to study heterogeneous behavior and the use of npis by a population during an epidemic.
a wide range of diseases are modeled using this approach, such as hiv [61] [62] [63] , severe acute respiratory syndrome (sars) [64, 65] , or influenza [66, 63] . these models often utilize additional compartments, which are populated according to specific rules. examples of such rules are to construct the compartments to hold a constant number of individuals associated with certain contact rates [61, 62, 67] , or to add and remove individuals at a constant rate [64, 65, 63, 68] , at a rate depending on prevalence [69] [70] [71] [72] [73] [74] , or according to a framework where behavior that is more successful is imitated by others [75, 66, 76] . extra compartments signify behavioral heterogeneities amongst members of a population, and the disease transmission rates associated with them also vary. reduction in transmission due to adaptive behavior is either modeled as a quarantine of cases [64, 65, 63] , or as prophylactic behavior of susceptible individuals due to increased awareness of the disease [75, 66, [69] [70] [71] [72] [73] [74] 77] . these models agree that early activation of isolation measures and self-protective behavior can weaken an epidemic. however, due to an early decrease in new infections, populations may see a subsequent decrease in npi use, causing multiple waves of infection [69, 75, 76, 71] . contrasting opinions on the impact behavioral changes have on the epidemic threshold also result from these models. for example, perra et al. [71] show that although infection size is reduced, prophylactic behavior does not alter the epidemic threshold. however, the models studied by poletti et al. [75] and sahneh et al. [70] show that the epidemic threshold can be altered by behavioral changes in a population. the classes of models presented in this section use homogeneous mixing patterns (i.e., well-mixed populations) to study the effects of adaptive behavior in response to epidemics and disease spread (see table 1 for a summary).
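a minimal rule-based compartmental sketch in the spirit of these models (my own illustration, not a model from the cited papers): susceptible individuals become "aware" at a rate proportional to prevalence, and aware individuals transmit at a reduced rate:

```python
# Toy SIR model with an extra aware-susceptible compartment sa:
# s -> sa at rate omega*s*i (prevalence-driven awareness), and aware
# individuals are infected at the reduced rate eps*beta. All parameters
# are illustrative assumptions.

def sir_awareness_final_size(omega, eps=0.3, beta=0.4, mu=0.2,
                             dt=0.01, steps=100_000):
    """Forward-Euler integration; returns the final epidemic size r."""
    s, sa, i, r = 0.999, 0.0, 0.001, 0.0
    for _ in range(steps):
        inf_s = beta * s * i          # infections of unaware susceptibles
        inf_sa = eps * beta * sa * i  # infections of aware susceptibles
        aware = omega * s * i         # awareness spreads with prevalence
        s += (-inf_s - aware) * dt
        sa += (aware - inf_sa) * dt
        i += (inf_s + inf_sa - mu * i) * dt
        r += mu * i * dt
    return r

# awareness shrinks the final epidemic size relative to fixed behavior
print(round(sir_awareness_final_size(0.0), 3),
      round(sir_awareness_final_size(20.0), 3))
```

with ω = 0 this reduces to the plain sir model (r0 = 2, final size ≈ 0.8), while a prevalence-responsive population ends the epidemic with a substantially smaller final size, the qualitative effect the rule-based models above agree on.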
often, populations will be modeled to alter their behavior based on reactions to changes in disease prevalence, or by optimizing their choices with respect to personal health outcomes. if possible, early activation of prophylactic behavior and npis by a population will be the most effective course of action to curb an epidemic. homogeneous mixing can be an appropriate approximation for the spread of an epidemic when the disease to be modeled is easily transmitted, such as measles and other infections that can be spread by fine aerosol particles that remain suspended for a long period. however, this mixing assumption does not always reflect real disease dynamics. for example, human sexual contact patterns are believed to be heterogeneous [48] and can be represented as networks (or graphs), while other infections, such as sars, can only be spread by large droplets, making the homogeneous mixing assumption less valid. the literature surrounding epidemic models that address this limitation by incorporating heterogeneous contact patterns through networks is very rich, and is discussed in the following section. in section 2, we reviewed disease-behavior dynamics in well-mixed populations. however, in real populations, various types of complex networks are ubiquitous and their dynamics have been well studied. the transmission of many infectious diseases requires direct or close contact between individuals, suggesting that complex networks play a vital role in the diffusion of disease. it thus becomes of particular significance to review the development of behavioral epidemiology in networked populations. many of the dynamics exhibited by such systems have direct analogues to processes in statistical physics, such as how disease or behavior percolates through the network, or how a population can undergo a phase transition from one social state to another.
perhaps the easiest way to begin studying disease-behavior dynamics in spatially distributed populations is by using lattices and static networks, which are relatively easy to analyze and which have attracted much attention in theoretical and empirical research. we organize research by several themes under which it has been conducted, such as the role of spreading awareness, social distancing as protection, and the role of imitation, although we emphasize that the distinctions are not always "hard and fast". the role of individual awareness. the awareness of disease outbreaks may stimulate humans to change their behavior, such as washing hands and wearing masks. such behavioral responses can reduce susceptibility to infection, which in turn can influence the epidemic course. in a seminal work, funk and coworkers [78] formulated and analyzed a mathematical model for the spread of awareness in well-mixed and spatially structured populations to understand how the awareness of disease and its propagation impact the spatial spread of a disease. in their model, both the disease and the information about the disease spread spontaneously by, respectively, contact and word of mouth in the population. the classical epidemiological sir model is used for epidemic spreading, and the information dynamics are governed by both information transmission and information fading. the immediate outcome of awareness of the disease information is a decrease in the probability of acquiring the infectious disease when a susceptible individual (who is aware of the epidemic) comes into contact with an infected one. in a well-mixed population, the authors found that the coupled spreading dynamics of the epidemic and the awareness of it can result in a smaller outbreak, yet do not affect the epidemic threshold.
however, in a population located on a triangular lattice, the behavioral response can completely stop a disease from spreading, provided the infection rate is below a threshold. specifically, the authors showed that the impact of locally spreading awareness is amplified if the social network of potential infection events and the communication network over which individuals communicate overlap, especially so if the networks have a high level of clustering. the finding that spatial structure can prevent an epidemic is echoed in an earlier model where the effects of awareness are limited to the immediate neighbors of infected nodes on a network [79] . in that model, individuals choose whether to accept ring vaccination depending on the perceived disease risk due to infected neighbors. by exploring a range of network structures from the limit of homogeneous mixing to the limit of a static, random network with small neighborhood size, the authors show that it is easier to eradicate infections in spatially structured populations than in homogeneously mixing populations [79] . hence, free-riding on vaccine-generated herd immunity may be less of a problem for infectious diseases spreading in spatially structured populations, as would more closely describe the situation for close-contact infections. along similar lines of research, wu et al. explored the impact of three forms of awareness on epidemic spreading in a finite sf networked population [80] : contact awareness that increases with individual contact number; local awareness that increases with the fraction of infected contacts; and global awareness that increases with the overall disease prevalence. they found that global awareness cannot decrease the likelihood of an epidemic outbreak, while both local awareness and contact awareness can. generally, individual awareness of an epidemic contributes toward the inhibition of its transmission.
the universality of such conclusions (i.e., that individual behavioral responses suppress epidemic spreading) is also supported by a recent model [81], in which the authors focused on an epidemic response model where individuals respond to the epidemic according to the number of infected neighbors in their local neighborhood, rather than the density of infected nodes. mathematically, the local behavioral response is cast into a reduction factor (1 − θ)^ψ in the contact rate of a susceptible node, where ψ is the number of infected neighbors and θ < 1 is a parameter characterizing the response strength of the individuals to the epidemic. by studying both sis and sir epidemiological models with this behavioral response rule in sf networks, they found that the individual behavioral response can in general suppress epidemic spreading, owing to the crucial role played by the hub nodes, which are more likely to adopt a protective response and thereby block the disease-spreading paths. in a somewhat different framework, how the diffusion of individuals' crisis awareness affects epidemic spreading is investigated in ref. [82]. in this work, the epidemiological sir model is linked with an information transmission process, whose diffusion dynamics is characterized by two parameters: the information creation rate ζ and the information sensitivity η. in particular, at each time step, ζn packets are generated and transferred in the network according to the shortest-path routing algorithm (n here denotes the size of the network). when a packet is routed by an infected individual, its state is marked as infected. each individual then decides whether or not to accept the vaccine based on how many infected packets are received from immediate neighbors, and on how sensitively the individual responds to this information, as weighed by the parameter η.
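the packet-routing mechanism just described can be sketched as follows; the choice of random source-destination pairs, and the simple threshold rule eta × (infected packets received) ≥ 1 for accepting the vaccine, are illustrative assumptions rather than the exact prescriptions of [82]:

```python
# illustrative sketch of information-packet diffusion: packets travel along
# shortest paths and are marked as infected when routed through an infected
# node. the routing between random pairs and the threshold vaccination rule
# are assumptions made for this demonstration, not the model of ref. [82].
import random
from collections import deque

def shortest_path(adj, src, dst):
    """bfs path from src to dst in an adjacency dict; none if unreachable."""
    prev, seen, q = {src: None}, {src}, deque([src])
    while q:
        u = q.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path[::-1]
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                prev[w] = u
                q.append(w)
    return None

def route_packets(adj, infected, n_packets, rng):
    """route packets between random node pairs; count infected deliveries."""
    received = {v: 0 for v in adj}
    for _ in range(n_packets):
        src, dst = rng.sample(sorted(adj), 2)
        path = shortest_path(adj, src, dst)
        if path and any(v in infected for v in path):
            received[dst] += 1      # the packet arrives marked as infected
    return received

def wants_vaccine(received_infected, eta, threshold=1.0):
    """assumed rule: vaccinate when the weighed signal crosses a threshold."""
    return eta * received_infected >= threshold

# tiny path graph 0-1-2-3 with node 1 infected: most routes pass through it
rng = random.Random(3)
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
counts = route_packets(adj, infected={1}, n_packets=50, rng=rng)
```

a more sensitive population (larger eta) thus crosses the vaccination threshold with fewer infected packets, which is the qualitative behavior the text attributes to η.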
the authors considered their "sir with information-driven vaccination" model on homogeneous er networks and heterogeneous sf networks, and found that epidemic spreading can be significantly suppressed in both the homogeneous and the heterogeneous networks, provided that both ζ and η are relatively large. social distancing as a protection mechanism. infectious disease outbreaks may trigger various behavioral responses of individuals taking preventive measures, one of which is social distancing. valdez and coworkers have investigated the efficiency of social distancing in altering the epidemic dynamics and affecting the disease transmission process on er networks, sf networks, as well as realistic social networks [83]. in their model, rather than the commonly used link-rewiring process, an intermittent social distancing strategy is adopted to disturb the epidemic spreading process. in particular, based on local information, a susceptible individual is allowed to interrupt the contact with an infected individual with a probability σ and restore it after a fixed time t_b, such that the underlying interaction network of the individuals remains unchanged. using the framework of percolation theory, the authors found that there exists a cutoff threshold σ_c, whose value depends on the network topology (i.e., on the extent of heterogeneity of the degree distribution), beyond which the epidemic phase disappears. the efficiency of the intermittent social distancing strategy in stopping the spread of diseases is owing to an emergent "susceptible herd behavior" in the population that protects a large fraction of susceptible individuals. impact of behavior imitation on vaccination coverage. vaccination is widely employed as an infection control measure.
to explore the role of individual imitation behavior and population structure in vaccination, recent seminal work integrated an epidemiological process into a simple agent-based model of adaptive learning, in which individuals use anecdotal evidence to estimate the costs and benefits of vaccination [85]. in this model, the disease-behavior dynamics is modeled as a two-stage process. the first stage is a public vaccination campaign, which occurs before any epidemic spreading. at this stage, each individual decides whether or not to vaccinate; taking the vaccine incurs a cost c_v to the vaccinated individuals. the vaccine is risk-free and offers perfect protection against infection. the second stage is the disease transmission process, for which the classic sir compartmental model is adopted. during the epidemic spreading process, susceptible individuals who catch the disease incur an infection cost c_i, which is usually assumed to be larger than the cost c_v of vaccination. unvaccinated individuals who remain healthy incur no cost at all: they free-ride on the vaccination efforts of others, being indirectly protected by herd immunity. for simplicity, the authors set c_i = 1 and rescale the costs by defining the relative cost of vaccination c = c_v/c_i (0 < c < 1). as such, after each epidemic season, all individuals receive payoffs (equal to the negative of the corresponding costs) depending on their vaccination strategies and on whether they were infected or not; they are then allowed to change or keep their old strategies for the next season, depending on their current payoffs. the rule of thumb is that the strategy of a role model with a higher payoff is more likely to be imitated.
by doing so, each individual i randomly chooses another individual j from its neighborhood as a role model, and imitates the behavior of j with the probability

w(s_i ← s_j) = 1 / (1 + exp[−β(p_j − p_i)]),

where p_i and p_j are, respectively, the payoffs of the two involved individuals, and β (0 < β < ∞) denotes the strength of selection. this imitation rule is also known as the fermi rule [16, 86] in physics. a finite value of β accounts for the fact that better-performing individuals are readily imitated, although it is not impossible to imitate an agent performing worse, for example due to imperfect information or errors in decision making. the authors studied their coupled "disease-behavior" model in well-mixed populations, in square-lattice populations, in random-network populations, and in sf-network populations, and found that population structure acts as a "double-edged sword" for public health: it can promote high levels of voluntary vaccination and herd immunity provided that the cost of vaccination is not too large, but small increases in the cost beyond a certain threshold cause vaccination to plummet, and infections to rise, more dramatically than in well-mixed populations. this research provides an example of how spatial structure does not always improve the chances of infection control in disease-behavior systems.

fig. 3. the symbols and lines correspond, respectively, to the simulation results and the mean-field predictions (whose analytical framework is shown in appendix a). the parameter α determines how seriously peer pressure is weighed in the individuals' decisions to take the vaccine. the figure is reproduced from [84].

in a similar vein, peer pressure in the population has been considered, to clarify its impact on the decision-making process of vaccination, and hence on the disease spreading [84]. in reality, whether or not to change behavior depends not only on the personal success of each individual, but also on the success and/or behavior of others.
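the seasonal payoffs and the imitation rule of eq. (6) translate directly into code; the snippet below only sketches these two ingredients of the model in [85] and omits the epidemic stage itself:

```python
# the two-stage vaccination game in miniature: seasonal payoffs are the
# negative incurred costs (-c for vaccinating, -1 for getting infected,
# 0 for a healthy free-rider), and strategies are imitated with the fermi
# probability. a sketch of the stated rules only, not the full model of [85].
import math

def season_payoff(vaccinated, infected, c):
    """payoff = minus the incurred cost, with c = c_v / c_i and c_i = 1."""
    if vaccinated:
        return -c          # vaccine is risk-free, so no infection cost
    return -1.0 if infected else 0.0

def imitation_probability(p_i, p_j, beta):
    """fermi rule: probability that i adopts the strategy of role model j."""
    return 1.0 / (1.0 + math.exp(-beta * (p_j - p_i)))
```

note how the rule of thumb quoted above emerges: for strong selection (large β) a role model with a higher payoff is imitated almost surely, while for β → 0 imitation degenerates to a coin flip.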
using this as motivation, the authors incorporated the impact of peer pressure into a susceptible-vaccinated-infected-recovered (svir) epidemiological model, where the propensity to adopt a particular vaccination strategy depends both on individual success and on the strategy configuration of the neighbors. to be specific, the behavior imitation probability of individual i towards its immediate neighbor j (namely, eq. (6)) is generalized by a peer-pressure term, where n_i is the number of neighbors that hold a vaccination strategy different from that of individual i, k_i is the degree of i, and the parameter α determines how seriously the peer pressure is taken into account. under such a scenario, fig. 3 displays how vaccination and infection vary as a function of the vaccine cost on an er random graph. it is clear that plugging in the peer pressure also works as a "double-edged sword": on the one hand, it strongly promotes vaccine uptake in the population when the cost is below a critical value, but, on the other hand, it may also strongly impede uptake if the critical value is exceeded. the reason is that the presence of peer pressure facilitates cluster formation among individuals, whose behaviors are inclined to conform to the majority of their neighbors, similar to an earlier report on cooperative behavior [88]. such behavioral conformity is found to expedite the spread of disease when the relative cost of vaccination is high enough, and to promote vaccine coverage in the opposite case. self-motivated strategies related to vaccination. generally, it is not so much the actual risk of being infected as the perceived risk of infection that prompts humans to change their vaccination behavior. previous game-theoretic studies of vaccination behavior have typically assumed that all individuals react to the disease incidence with the same responsive dynamics, i.e., the same formula for calculating the perceived probability of infection.
but that may not actually be the case. liu et al. proposed that a few individuals will be "committed" to vaccination, perhaps because they have a low threshold for feeling at risk (or strongly held convictions), and will want to be immunized as soon as they hear that someone is infected [87]. they studied how the presence of committed vaccinators, a small fraction of individuals who consistently hold the vaccinating strategy and are immune to influence, impacts the vaccination dynamics in well-mixed and spatially structured populations. the researchers showed that even a relatively small proportion of such agents (such as 5%) can significantly reduce the scale of an epidemic, as shown in fig. 4. the effect is much stronger when the individuals are uniformly distributed on a square lattice, as compared to the case of a well-mixed population. their results suggest that committed individuals can have a remarkable effect, acting as "steadfast role models" in the population who seed vaccine uptake in others while also disrupting the appearance of clusters of free-riders, which might otherwise seed the emergence of a global epidemic. one important message taken from ref. [87] is that we might never guess what would happen by looking at the decision-making rules alone, in particular when our choices influence, and are influenced by, the choices of other people. another good example can be found in a recent work [89], in which zhang et al. proposed an evolutionary epidemic game in which individuals can choose among the strategies of vaccination, self-protection and laissez faire towards an infectious disease, and adjust their strategies according to their neighbors' strategies and payoffs. the "disease-behavior" coupled dynamical process is similar to the one implemented in ref. [85], with the sir epidemic spreading process and the strategy-updating process proceeding alternately.
by both stochastic simulations and theoretical analysis, the authors found a counter-intuitive phenomenon: a better condition (i.e., a larger success rate of self-protection) may unfortunately result in a lower system payoff. the reason is that, when the success rate of self-protection increases, people become more speculative and less interested in vaccination. since a vaccinated individual brings an "externality" effect to the system (the individual's decision to vaccinate diminishes not only its own risk of infection, but also the risk for the people with whom the individual interacts), the reduction of vaccination can remarkably enhance the risk of infection. the observed counter-intuitive phenomenon is reminiscent of the well-known braess's paradox in traffic, where more roads may lead to more severe traffic congestion [90]. this work provides another interesting example analogous to braess's paradox, namely, that a higher success rate of self-protection may eventually enlarge the epidemic size and thus diminish positive health outcomes. it also raises a challenge to public health agencies regarding how to protect the population during an epidemic: the government should carefully consider how to distribute resources and money between messages supporting vaccination, hospitalization, self-protection, and so on, since the outcome of a policy largely depends on the complex interplay among the type of incentive, individual behavioral responses, and the intrinsic epidemic dynamics. in further work [91], the authors investigated the effects on epidemic control of two types of incentive strategies: a partial-subsidy policy, in which a certain fraction of the cost of vaccination is offset, and a free-subsidy policy, in which donees are randomly selected and vaccinated at no cost.
through mean-field analysis and computations, they found that, under the partial-subsidy policy, the vaccination coverage depends monotonically on the sensitivity of individuals to payoff differences, whereas the dependence is non-monotonic for the free-subsidy policy. because the donees act as role models for relatively irrational individuals, and keep their strategies unchanged in the case of rational individuals, the free-subsidy policy can in general lead to higher vaccination coverage. these findings substantiate, once again, that any disease-control policy should be exercised with extreme care: its success depends on the complex interplay among the intrinsic mathematical rules of epidemic spreading, governmental policies, and the behavioral responses of individuals. as the above subsection shows, research on disease-behavior dynamics on networks has become one of the most fruitful realms of statistical physics and non-linear science, shedding novel light on how to predict the impact of individual behavior on disease spread and prevention [92-94, 85, 95-99, 79]. however, in some scenarios, the simple hypothesis that individuals are connected to each other within a single infrastructure (namely, the single-layer networks of section 3.1) may lead to overestimation or underestimation of the diffusion and prevention of disease, since agents can simultaneously be elements of more than one network in most, yet not all, empirical systems [29, 28, 100]. in this sense, it seems constructive to go beyond traditional single-layer network theory and propose a new architecture, which can incorporate the multiple roles or connections of individuals into an integrated framework.
multilayer networks, defined as combinations of networks interrelated in a nontrivial way (usually by sharing nodes), have recently become a fundamental tool to quantitatively describe the interactions among network layers as well as among their constituents. an example of a multilayer network is visualized in fig. 5 [101]: a social network layer supports the social dynamics related to individual behavior and the main prevention strategies (like vaccination), while the biological layer provides a platform for the spreading of the biological disease. each individual is a node in both network layers. the coupled structure can generate more diverse outcomes than either isolated network, and can produce multiple (positive or negative) effects on the eradication of infection. because of the connections between layers, the dynamics of the control measures in turn affects the trajectory of the disease on the biological network, and vice versa. under such a framework, which is composed of at least two networks with different topologies, nodes not only exchange information with their counterparts in the other network(s) via inter-layer connections, but also diffuse infection to their neighbors through the intra-layer connections. subsequently, further theoretical frameworks and models, such as interdependent networks, multiplex networks and interconnected networks, have been proposed [102][103][104]. the broad applicability of multilayer networks and their success in providing insight into the structure and dynamics of realistic systems have thus generated considerable excitement [105][106][107][108][109]. the study of disease-behavior dynamics in this framework is, of course, a young and rapidly evolving research area, which is systematically surveyed in what follows. interplay between awareness and disease. as fig.
5 illustrates, different dynamical processes for the same set of nodes, with a different connection topology for each process, can be encapsulated in a multilayer structure (technically, these are referred to as multiplex networks [28, 29]). aiming to explore the interrelation between social awareness and disease spreading, granell et al. recently incorporated information awareness into a disease model embedded in a multiplex network [110], where a physical-contact layer supports the epidemic process and a virtual-contact layer supports the awareness diffusion. similarly to the sis model (where an s node can be infected with a transmission probability β, and an i node recovers with a certain rate μ), the awareness dynamics, composed of aware (a) and unaware (u) states, assumes that a node in state a may lose its awareness with probability δ, and re-obtain awareness with probability ν. both processes can then be coupled via the combinations of individual states: unaware-susceptible (us), unaware-infected (ui), aware-susceptible (as), and aware-infected (ai), which are also depicted in the transition probability trees of fig. 6.

fig. 6. transition probability trees of the combined states for the coupled awareness-disease dynamics at each time step in the multilayer network. an aware (a) node becomes unaware (u) with transition probability δ, and can re-obtain awareness with another probability; for the disease, μ represents the transition probability from infected (i) to susceptible (s). there are thus four state combinations, aware-infected (ai), aware-susceptible (as), unaware-infected (ui) and unaware-susceptible (us), and the transitions among these combinations are controlled by the probabilities r_i, q_i^a and q_i^u. they denote, respectively, the transition probability from unaware to aware given by the neighbors; the transition probability from susceptible to infected, if the node is aware, given by the neighbors; and the transition probability from susceptible to infected, if the node is unaware, given by the neighbors. we refer to [110], from where this figure has been adapted, for further details.

using monte carlo simulations, the authors showed that the coupled dynamical processes change the onset of the epidemics and allow them to further capture the evolution of the epidemic threshold (which depends on the structure and on the interrelation with the awareness process), results that are accurately validated by a markov-chain approximation approach. more interestingly, they unveiled that an increase in the awareness transmission rate can lower the long-term disease incidence while raising the outbreak threshold of the epidemic. in spite of this progress, the above findings rest on two hypotheses: infected nodes become immediately aware, and aware individuals are completely immune to the infection. to capture more realistic scenarios, the authors relaxed both assumptions and introduced mass media that disseminate information to the entire system [111]. they found that the vaccine coverage of aware individuals and the mass media affect the critical relation between the two competing processes. more importantly, the existence of mass media makes the metacritical point (where the critical onset of the epidemics starts) of ref. [110] disappear. furthermore, the social dynamics were extended to an awareness cascade model [112], in which agents exhibit herd-like behavior because they make decisions by referring to the actions of other individuals. interestingly, it is found that a local awareness ratio (of unaware individuals becoming aware) of approximately 0.5 has a two-stage effect on the epidemic threshold (i.e., an abrupt transition of the epidemic threshold) and can cause different epidemic sizes, irrespective of the network structure.
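a one-node update step consistent with the transition trees of fig. 6 can be sketched as follows; treating r, q_i^a and q_i^u as given numbers (rather than computing them from the two layers) and updating awareness before the disease state are simplifying assumptions, so this is not the exact markov-chain formulation of [110]:

```python
# sketch of the combined awareness-disease transition for a single node.
# the neighborhood-dependent probabilities r (u -> a), q_a and q_u
# (s -> i for aware / unaware nodes) are taken as given inputs here,
# which is a simplification of the multiplex formulation of [110].

def transition_probs(aware, infected, delta, mu, r, q_a, q_u):
    """return {(aware_next, infected_next): probability} for one time step."""
    # awareness layer: a -> u with probability delta, u -> a with probability r
    p_aware = (1.0 - delta) if aware else r
    out = {}
    for aware_next, p1 in ((True, p_aware), (False, 1.0 - p_aware)):
        # disease layer: infection probability depends on the updated awareness
        if infected:
            p_inf = 1.0 - mu          # stays infected unless it recovers
        else:
            p_inf = q_a if aware_next else q_u
        for infected_next, p2 in ((True, p_inf), (False, 1.0 - p_inf)):
            key = (aware_next, infected_next)
            out[key] = out.get(key, 0.0) + p1 * p2
    return out

# example: an unaware susceptible node with illustrative parameter values
probs_us = transition_probs(False, False,
                            delta=0.2, mu=0.4, r=0.5, q_a=0.1, q_u=0.6)
```

in this sketch, choosing q_a < q_u reproduces the basic premise of the model: conditional on becoming aware, a susceptible node is less likely to end the step infected.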
that is to say, when the local awareness ratio is in the range [0, 0.5), the epidemic threshold is fixed at a larger value; in the range [0.5, 1], it becomes fixed at a smaller value. as for the final epidemic size, its rate of increase in the interval [0, 0.5) is much slower than when the local awareness ratio lies in [0.5, 1]. these findings suggest a new way of understanding realistic contagions and their prevention. besides obtaining awareness from aware neighbors, self-awareness induced by infected neighbors is another scenario that currently attracts research attention [113]; there it is found that coupling such a dynamical process with disease spreading can lower the density of infection, but does not increase the epidemic threshold, regardless of the information source. coupling between disease and preventive behaviors. thus far, many studies have shown that considering the simultaneous diffusion of disease and prevention measures on the same single-layer network is an effective way to evaluate the incidence and onset of disease [94, 85, 95-98, 114, 79]. however, if both processes are coupled on a multilayer infrastructure, how does this affect the spreading and prevention of disease? inspired by this question, ref. [115] suggested a conceptual framework in which two fully or partially coupled networks are employed to transmit the disease (an infection layer) and to channel individual decisions about preventive behaviors (a communication layer). the protective strategies considered include wearing facemasks, washing hands frequently, taking pharmaceutical drugs, and avoiding contact with sick people, which are the only means of control in situations where vaccines are not yet available. it is found that the structure of the infection network, rather than that of the communication network, has a dramatic influence on the transmission of disease and the uptake of protective measures.
in particular, during an influenza epidemic, the coupled model can lead to lower infection rates, which indicates that single-layer models may overestimate disease transmission. in line with this finding, the author further extended the above setup into a triply coupled diffusion model (adding the information flow about the disease on a new layer) through metropolitan social networks [116]. during an epidemic, these three diffusion dynamics interact with each other and form negative and positive feedback loops. comparison with empirical data shows that the proposed model reasonably replicates the realistic trends of influenza spread and information propagation. the author pointed out that this model has the potential to develop into a virtual platform for health decision makers to test the efficiency of disease control measures in real populations. much previous work shows that behavior and spatial structure can suppress epidemic spreading. in contrast, other recent research, using a multiplex network consisting of a disease transmission (dt) network and an information propagation (ip) network through which vaccination strategies and individual health condition information can be communicated, finds that, compared with the case of a traditional single-layer network (namely, symmetric interaction), the multiplex architecture suppresses vaccination coverage and leads to more infection [117]. this phenomenon is caused by the sharp decline of small-degree vaccinated nodes, which are usually numerous in heterogeneous networks. similarly, wang et al. considered an asymmetric interplay between disease spreading and information diffusion in multilayer networks [118], assuming different disease dynamics on the communication layer and on the physical-contact layer, the latter being the only layer where vaccination takes place.
more specifically, the vaccination decision of a node in the contact network is not only related to the states of its intra-layer neighbors, but also depends on its counterpart node in the communication layer. by means of extensive simulations and mean-field analysis, they found that, for an uncorrelated coupling architecture, a disease outbreak in the contact layer induces an outbreak in the communication layer, and information diffusion can effectively raise the epidemic threshold. the consideration of inter-layer correlations, however, dramatically changes the onset of the disease, but not the information threshold. dynamic networks play an important role in the incidence and onset of epidemics as well. along this line of research, the most commonly used approach is adaptive networks [119][120][121][122], where nodes frequently adjust their connections according to the environment or to the states of neighboring nodes. time-varying networks (also named temporal networks) provide another framework for the activity-driven changing of connection topology [31, 123, 32]. here, we briefly review the progress of disease-behavior dynamics on adaptive and time-varying networks. contact switching as a potential protection strategy. in the adaptive viewpoint, the most straightforward way of avoiding contact with infective acquaintances amounts to breaking the links between susceptible and infective agents and constructing new connections. along such lines, gross et al. first proposed an adaptive scenario in which a susceptible node is able to prune a link to an infected neighbor and rewire to a healthy agent with a certain probability [124]. this switching probability can be regarded as a measure of the strength of the protection strategy, and it is shown that different values of it give rise to various degree mixing patterns and degree distributions.
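the rewiring rule just described can be sketched as follows; the graph representation and the restriction of rewiring targets to susceptible nodes follow the description above, while everything else (data structures, tie-breaking) is an illustrative choice:

```python
# sketch of adaptive link rewiring: with probability w, a susceptible node
# breaks its link to an infected neighbor and reconnects to a randomly
# chosen other susceptible node, so infected nodes lose contacts while
# susceptibles keep their degree. an illustration of the rule in [124],
# not a reproduction of that model's full dynamics.
import random

def rewire_step(edges, state, w, rng):
    """edges: set of frozensets {u, v}; state: node -> 'S'/'I'/'R'."""
    susceptible = [v for v, s in state.items() if s == "S"]
    for edge in list(edges):
        u, v = tuple(edge)
        if {state[u], state[v]} != {"S", "I"} or rng.random() >= w:
            continue
        s_node = u if state[u] == "S" else v
        # candidate targets: other susceptibles not already linked to s_node
        targets = [t for t in susceptible
                   if t != s_node and frozenset((s_node, t)) not in edges]
        if targets:
            edges.remove(edge)
            edges.add(frozenset((s_node, rng.choice(targets))))
    return edges

# tiny demonstration: node 0 is infected, all of its links get rewired away
rng = random.Random(0)
state = {0: "I", 1: "S", 2: "S", 3: "S", 4: "S"}
start = {frozenset(e) for e in [(0, 1), (0, 2), (1, 2), (3, 4)]}
rewired = rewire_step(set(start), state, w=1.0, rng=rng)
```

note that the total number of links is conserved, which is precisely why this kind of adaptation reshapes the degree mixing pattern rather than thinning the network.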
based on low-dimensional approximations, the authors also showed that their adaptive framework is able to predict novel dynamical features, such as bistability, hysteresis, and first-order transitions, which are sufficiently robust with respect to the disease dynamics [125, 126]. in spite of these advances, the existing analytical methods do not generally allow accurate predictions of the simultaneous time evolution of disease and network topology. to overcome this limitation, marceau et al. introduced an improved compartmental formalism, which shows that the initial conditions play a crucial role in disease spreading [127]. in the above examples, switching contacts as a strategy has proven its effectiveness in controlling epidemic outbreaks. however, in some realistic cases the population information may be asymmetric, especially during the process of rewiring links. to relax this constraint, a new adaptive scheme was recently suggested, in which an infected link can be pruned by either endpoint, who then reconnects to a randomly selected member of the population rather than to a known susceptible agent (that is, the individual has no prior information on the state of every other agent) [128, 129]. for example, ref. [129] showed that such reconnection behavior can completely suppress the spreading of disease via continuous and discontinuous transitions, and that this remains effective in more complex situations. besides the phenomena of oscillation and bistability, another dynamical feature, epidemic reemergence, has also attracted great interest in a recent study [122], where susceptible individuals adaptively break connections with infected neighbors while avoiding becoming isolated in a growing network. under such rules, the authors observed that the number of infected agents stays at a low level for a long time, and then suddenly erupts to a high level before declining to a low level again; this process repeats several times until the final eradication of the infection, as illustrated in fig. 7(a). the underlying mechanism is related to the invasion of infected individuals into giant components of susceptible nodes: the link-removal process suppresses disease spreading and makes susceptible (infected) agents form giant components (small yet non-isolated clusters), as shown in fig. 7(b), but the entrance of new nodes may bring new infection risk to these giant components, triggering the next outbreak of infection and crashing the network again (see fig. 7(c)). interestingly, this finding may help to explain the phenomenon of repeated epidemic explosions in real populations.

fig. 7. panel (a) shows the time course of the number of infected nodes when the network growth, the link-removal process, and isolation avoidance are simultaneously involved in the adaptive framework; the epidemic reemerges several times before dying out. panel (b) shows a snapshot of the network topology at the 5000th time step (before the next abrupt outbreak), when there is a giant component of susceptible nodes (yellow). the invasion of infected individuals (red) then makes the whole network split into many fragments, as shown by the snapshot of the 5400th time step (after the explosion) in panel (c). we refer to [122], from where this figure has been adapted, for further details.

now, if we look back at the studies cited above, we find a common feature: beyond the disease process itself, the adaptive adjustment of individual connections ultimately changes the degree distribution of the network.
an interesting question naturally poses itself: is there an adaptive scenario that preserves the degree distribution of the network, i.e., in which each individual keeps the total number of its neighbors constant? to fill this gap, the neighbor exchange model is a very useful tool [130]: each individual's number of current neighbors remains fixed, while the composition, or identity, of those contacts changes in time. similarly to the well-known watts-strogatz small-world algorithm [25], the model employs an exchange mechanism in which the destination nodes of two edges are swapped at a given rate. incorporating the diffusion of an epidemic, this model constructs a bridge between static network models and mass-action models. based on empirical data, the authors further showed that the model is very effective for forecasting and controlling sexually transmitted disease outbreaks. along these lines, the potential influence of other topological properties (such as growing networks [131] and rewiring sf networks [132]) has recently been identified within the adaptive viewpoint, and it dramatically changes the outbreak threshold of the disease. vaccination, immunization and quarantine as avoidance behaviors. as in static networks, vaccination can also be introduced into adaptive architectures, where connection adjustment is an individual response to the presence of infection risk in the neighborhood. motivated by realistic immunization situations, disease prevention has been implemented by administering poisson-distributed vaccination to susceptible individuals [133]. because of the interplay between network rewiring and vaccination, the authors showed that vaccination is far more effective in an adaptive network than in a static one, irrespective of the disease dynamics. similarly, other control measures have been encapsulated into adaptive community networks [134].
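the edge swap at the heart of the neighbor exchange model can be sketched as follows; rejecting swaps that would create self-loops or duplicate edges is an implementation choice assumed here for simplicity:

```python
# neighbor exchange mechanism in the spirit of [130]: two randomly chosen
# edges (a, b) and (c, d) swap destination nodes to become (a, d) and (c, b),
# so every node keeps its degree while the identity of its contacts drifts.
import random

def exchange_step(edges, rng):
    """attempt one swap on a list of undirected edges given as tuples."""
    (a, b), (c, d) = rng.sample(edges, 2)
    if len({a, b, c, d}) < 4:               # shared node: self-loop risk
        return edges
    new1, new2 = (a, d), (c, b)
    existing = {frozenset(e) for e in edges}
    if frozenset(new1) in existing or frozenset(new2) in existing:
        return edges                        # would create a duplicate edge
    edges = [e for e in edges if e not in ((a, b), (c, d))]
    return edges + [new1, new2]

def degree_sequence(edges, n):
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

# demonstration: many swaps shuffle contacts but never change any degree
rng = random.Random(42)
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
before = degree_sequence(edges, 5)
for _ in range(200):
    edges = exchange_step(edges, rng)
after = degree_sequence(edges, 5)
```

this degree-preserving property is exactly what distinguishes the neighbor exchange model from the rewiring schemes discussed earlier, which reshape the degree distribution.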
besides various transitions of the community structure, both immunization and quarantine strategies show the counter-intuitive result that it is not "the earlier, the better" for the prevention of disease. moreover, the prevention efficiency of the two measures differs greatly, and the optimal effect is obtained when a strong community structure exists. vaccination on time-varying networks. in contrast to the mutual feedback between dynamics and structure in adaptive frameworks, time-varying networks provide a novel angle for network research, in which network connections and dynamical processes evolve according to their respective rules [135][136][137].

fig. 8. vaccination coverage as a function of the relative cost of vaccination and the fraction of imitators in different networks. it is obvious that, for a small cost of vaccination, imitation behavior increases vaccination coverage, but it impedes vaccination at high cost, irrespective of the underlying interaction topology. the figure is reproduced from [93].

for example, summin et al. recently explored how to lower the number of vaccinated people needed to protect the whole system on time-varying networks [138]. based on past information, they could accurately administer vaccination and estimate future disease outbreaks, which proves that time-varying structure can make protection protocols more efficient. in [139], the authors showed once again that limited information on the contact patterns is sufficient to design efficient immunization strategies. in these two works, however, the vaccination strategy is somewhat independent of human behavior and the decision-making process, which leaves an open issue: if realistic disease-behavior dynamics is introduced into a time-varying topology (especially in combination with the diffusion of opinion clusters [140]), how does it affect the eradication of the disease? we continue to discuss some of these and similar issues in section 4 on empirically-derived networks.
Some research uses networks derived from empirical data to examine disease-behavior dynamics; we discuss such models in this subsection. Dynamics on different topologies. Heterogeneous contact topology is ubiquitous in reality. To test its potential impact on disease spreading, Martial et al. recently integrated a behavioral epidemiology model with a decision-making process on three archetypal realistic networks: a Poisson network, an urban network and a power-law network [93]. On these contact networks, an agent makes its decision either purely by payoff maximization or by imitating the vaccination behavior of a neighbor (as suggested by Eq. (6)), with the balance controlled by the fraction of imitators. By means of extensive simulations, they demonstrated the twofold effect of imitation: it enhances vaccination coverage when the vaccination cost is low, but impedes the vaccination campaign when the cost is relatively high, as depicted in Fig. 8. Surprisingly, despite high overall vaccination coverage, imitation can generate clusters of non-vaccinating, susceptible agents, which in turn accelerate large-scale outbreaks of infectious disease (that is, imitation behavior to some extent impedes the eradication of infectious diseases). This point helps to explain why outbreaks of measles have recently occurred in many countries with high overall vaccination coverage [140,143,144]. Using the same social networks, Ref. [141] explored the impact of heterogeneous contact patterns on disease outbreaks in a compartmental model of SARS. Interestingly, compared with the prediction for a well-mixed population, the same basic reproductive number may lead to completely different epidemiological outcomes in any two realizations, which sheds light on the heterogeneity of SARS outbreaks around the world. Impact of network mixing patterns. As Ref.
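Imitation rules such as Eq. (6) are commonly of Fermi type. A minimal sketch of a single decision step under that assumption follows; the parameter names, the noise constant K, and the mixing of imitators with payoff maximizers via epsilon are illustrative choices, not the exact setup of [93]:

```python
import math
import random

def fermi_imitation_prob(my_payoff, nb_payoff, K=0.1):
    """Fermi rule: the probability of adopting a neighbor's strategy
    grows smoothly with the payoff difference; K is the selection noise."""
    return 1.0 / (1.0 + math.exp(-(nb_payoff - my_payoff) / K))

def decide(my_strategy, my_payoff, nb_strategy, nb_payoff,
           best_response, epsilon, K=0.1, rng=random.Random(1)):
    """One vaccination-decision step: with probability epsilon the agent
    is an imitator (copies the neighbor with Fermi probability),
    otherwise it plays its payoff-maximizing strategy directly."""
    if rng.random() < epsilon:
        if rng.random() < fermi_imitation_prob(my_payoff, nb_payoff, K):
            return nb_strategy
        return my_strategy
    return best_response
```

With epsilon = 0 every agent is a pure maximizer; with epsilon = 1 the dynamics reduces to pure imitation, so sweeping epsilon reproduces the interpolation studied in [93].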
[93] discovered, high vaccination coverage can guarantee herd immunity, which, however, is dramatically weakened and even destroyed by clusters of unvaccinated individuals. To evaluate how much influence such clusters possess, a recent work explored the distribution of vaccinated agents during seasonal influenza vaccination on a United States high-school contact network [142]. The authors found that contact networks are positively assortative with respect to vaccination behavior: large-degree unvaccinated (vaccinated) agents are more likely to be in contact with other large-degree unvaccinated (vaccinated) agents, which certainly results in larger outbreaks than in non-assortative networks, since the positively assortative unvaccinated agents breed larger susceptible clusters. This finding once again highlights the importance of heterogeneity in vaccine uptake for the prevention of infectious disease. In fact, the currently growing availability of human-generated data and computing power has driven the rapid emergence of various social, technological and biological networks [145-148]. On these empirical networks, many disease-behavior models can be analyzed to assess the efficiency of existing or newly proposed prevention measures and to provide constructive viewpoints for public health policy makers [149-154].

Table 2. Classification of disease-behavior research outcomes according to dynamic characteristics in networked populations, as reviewed in Section 3. The same type of network is frequently applied to different problems.

Table 3. Observed physical phenomena and frequently used methods in the study of disease-behavior dynamics on networks.
Observed physical phenomena: epidemic threshold; phase transition; self-organization; pattern formation; vaccination/immunization threshold.
Frequently used methods: mean-field prediction; generation function; percolation theory; stochastic processes; bifurcation and stability analysis; Monte Carlo simulation; Markov-chain approximation.
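Assortativity with respect to vaccination status can be quantified, for instance, as the Pearson correlation of the endpoint statuses over all edges. This is a sketch of one common choice, not necessarily the exact measure used in [142]:

```python
def vaccination_assortativity(edges, vaccinated):
    """Pearson correlation of the vaccinated/unvaccinated status (1/0)
    across the two endpoints of every edge, with edges counted in both
    directions so the measure is symmetric. Positive values indicate
    like-with-like mixing."""
    xs, ys = [], []
    for u, v in edges:
        for a, b in ((u, v), (v, u)):
            xs.append(1.0 if a in vaccinated else 0.0)
            ys.append(1.0 if b in vaccinated else 0.0)
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / (vx * vy) ** 0.5
```

Values near +1 correspond to the situation reported in [142], where vaccinated contacts cluster with vaccinated contacts and unvaccinated with unvaccinated.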
Based on the above achievements, it is now clear that incorporating behavioral epidemiology into networked populations has opened a new window for the study of epidemic transmission and prevention. To capture the overall picture, Table 2 summarizes the reviewed characteristics of disease-behavior dynamics in networked populations. It is worth mentioning that some works (e.g., [93,117]) may appear in two categories because they simultaneously consider the influence of individual behavior and of special network structure.

Fig. 9. Age-specific contact matrices for each of eight examined European countries. High contact rates are shown in white, intermediate contact rates in green and low contact rates in blue. We refer to [155], from which this figure has been adapted, for further details. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Many of these achievements are closely related to physics phenomena (see Table 3), through which we can estimate the effect of the proposed strategies and measures. They are likewise inseparable from classical physics methods (Table 3); in particular, Monte Carlo simulation and mean-field prediction have attracted the greatest attention due to their simplicity and high efficiency. For a comprehensive understanding, a general example of mean-field theory applied to behavioral epidemiology is provided in Appendix A. The first mathematical models studied the adaptive dynamics of disease-behavior responses in a homogeneously mixed population, assuming that individuals interact with each other at the same contact rate, without restrictions on the choice of potential partners. Networked dynamics models shift the focus to the effects of interpersonal connectivity patterns, since everyone has their own set of contacts through which interpersonal transmission can occur.
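To give a flavor of the mean-field treatment elaborated in Appendix A, the following sketch integrates a homogeneous-mixing SIR model in which susceptibles vaccinate at a prevalence-dependent rate. The linear response phi0 * i is a hypothetical behavioral rule chosen for illustration, not a fitted model:

```python
def mean_field_sir_vacc(beta, gamma, phi0, days, dt=0.01, s0=0.99, i0=0.01):
    """Forward-Euler integration of mean-field SIR with a vaccinated
    class V. Susceptibles vaccinate at rate phi0 * i, a hypothetical
    risk-driven uptake that grows with prevalence i."""
    s, i, r, v = s0, i0, 0.0, 0.0
    for _ in range(int(days / dt)):
        phi = phi0 * i                      # behavioral response
        ds = -beta * s * i - phi * s        # infection + vaccination
        di = beta * s * i - gamma * i       # infection - recovery
        dr = gamma * i                      # recovery
        dv = phi * s                        # vaccination
        s += ds * dt
        i += di * dt
        r += dr * dt
        v += dv * dt
    return s, i, r, v
```

Because the four rates sum to zero, s + i + r + v is conserved, and switching the behavioral response on (phi0 > 0) reduces the final epidemic size r relative to the behavior-free baseline.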
The contacts between members of a population constitute a network, which is usually described by a well-known proxy model of synthetic networks, as shown in Section 3. This physics treatment, using evidence-based parsimonious models, is valuable for illustrating fascinating ideas and revealing unexpected phenomena. However, it is not always a magic bullet for understanding, explaining, or predicting realistic cases. In recent years, studies based on social experiments have become increasingly popular; they contribute new insights for parameterizing and designing more appropriate dynamical models. This section briefly introduces the progress in this field. A large-scale population-based survey of human contact patterns in eight European countries was conducted in Ref. [155], collecting empirical data on self-reported face-to-face conversations and skin-to-skin physical contacts. The data analysis splits the population into subgroups on the basis of properties such as age and location, and scrutinizes the contact rate between subgroups (see Fig. 9). It reveals that, across all these countries, people are more likely to contact others of their own age group. Combining self-reported contact data with serological testing data, recent case studies are able to reveal location- or age-specific patterns of exposure to pandemic pathogens [156,157], which provide rich information for complex network modeling. (Fig. 10 is adapted from [164], to which we refer for further details.) SF networks have been widely used to model the connectivity heterogeneity of human contact networks. In SF networks, each hub member can have numerous connections, including all its potential contacts relevant to transmission. However, this common assumption might not fully agree with human physiological limitations on the capacity to maintain a large number of interactions.
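A contact matrix of the kind shown in Fig. 9 typically enters epidemic models through the age-specific force of infection. A minimal sketch, assuming a uniform per-contact transmission rate beta (our own simplification):

```python
def force_of_infection(beta, contact_matrix, infected, population):
    """Age-specific force of infection:

        lambda_a = beta * sum_b C[a][b] * I_b / N_b,

    where C[a][b] is the mean number of contacts an individual of age
    group a has with members of age group b (as in survey-derived
    matrices such as those of [155]), I_b the number of infected and
    N_b the size of group b."""
    n = len(contact_matrix)
    return [beta * sum(contact_matrix[a][b] * infected[b] / population[b]
                       for b in range(n))
            for a in range(n)]
```

The age-assortative mixing reported in [155] corresponds to a diagonally dominant contact matrix, which concentrates the force of infection within the age group where infection is prevalent.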
Generally, the size of an individual's social network is restricted to around 150 people [158,159]. To better characterize the features of human contact behavior, social experiments probing active contacts in realistic circumstances are valuable. Thanks to the development of information and communication technologies, digital devices have become increasingly popular for collecting empirical data on human contacts in realistic social circumstances. A few brief examples are instructive. Refs. [160,161] used the Bluetooth technique embedded in mobile phones to collect proxy data on person-to-person interactions at the MIT Media Laboratory in the Reality Mining program; with the help of wireless sensors, a social experiment was conducted to trace close-proximity contacts among the members of an American high school [162]; Refs. [163-166] used active radio-frequency identification devices (RFID) to establish a flexible platform recording face-to-face proximity contacts among volunteers, deployed in various social contexts such as a conference, a museum, a hospital, and a primary school; and WiFi access data among all students and staff of a Chinese university were analyzed as indirect proxy records of their concurrent communications [167,168]. Compared with the questionnaire data mentioned above, the electronic data generated by digital experiments are more accurate and objective. Moreover, several new findings deserve mention. The data analysis reveals an unexpected feature: the distribution of the number of distinct persons each individual encounters every day has only a small squared coefficient of variation [162,164,166-169], irrespective of the specific social context (see Fig. 10).
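The squared coefficient of variation mentioned above is simply the variance of the daily degree divided by the squared mean; a small value signals a homogeneous, hub-free distribution:

```python
def squared_cv(values):
    """Squared coefficient of variation: variance / mean**2.
    Values much smaller than 1 indicate a homogeneous distribution
    without hubs; heavy-tailed samples give values of order 1 or more."""
    n = len(values)
    mean = sum(values) / n
    var = sum((x - mean) ** 2 for x in values) / n
    return var / mean ** 2
```

Applied to the daily-degree samples of [162,164,166-169], this statistic stays small across venues, which is the homogeneity discussed in the text.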
This homogeneity of the node-degree distribution indicates the absence of connectivity hubs, which is, to some extent, imposed by our physiological limitations. The dynamics of human interactions does not evolve at an equilibrium state but is highly fluctuating and time-varying in realistic situations. This can be characterized by measuring the statistical distributions of the duration per contact and of the time intervals between successive contacts [163]. As shown in Fig. 11, both statistics have broad distributions spanning several orders of magnitude: most contact durations and inter-contact intervals are very short, but long durations and intervals also occur, corresponding to a bursty process without characteristic time scales [170].

Fig. 11(a). Similar to Fig. 10, each symbol denotes one venue (SFHH: SFHH, Nice, FR; ESWC09 (ESWC10): ESWC 2009 (2010), Crete, GR; PS: primary school, Lyon, FR). We refer to [174], from which this figure has been adapted, for further details.

The coexistence of homogeneity in node degrees and heterogeneity in contact durations leads to unexpected phenomena. For example, low-degree nodes, which are insignificant in conventional network models, can act as hubs in time-varying networks [171]. The usage of electronic devices provides an easy and nonintrusive approach to contact tracing, which can help in understanding health-related behaviors in realistic settings. To measure close-proximity interactions between health-care workers (HCWs) and their hand hygiene adherence, Polgreen et al. performed experiments deploying wireless sensor networks in a medical intensive care unit of the University of Iowa hospital. They confirmed the effect of peer pressure on improving hand hygiene participation [172]: the proximity of other HCWs can promote hand hygiene adherence.
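Burstiness of such event sequences is often summarized by the Goh-Barabási parameter B = (sigma - mu)/(sigma + mu) of the inter-event intervals; a minimal implementation (the choice of this particular statistic is ours, for illustration):

```python
def burstiness(intervals):
    """Goh-Barabasi burstiness of a sequence of inter-event intervals:
    B -> 1 for highly bursty sequences, B = 0 for Poissonian ones,
    B = -1 for strictly periodic ones."""
    n = len(intervals)
    mu = sum(intervals) / n
    var = sum((x - mu) ** 2 for x in intervals) / n
    sigma = var ** 0.5
    return (sigma - mu) / (sigma + mu)
```

The broad inter-contact distributions of [163,170] correspond to clearly positive values of B, in contrast to the negative values produced by regular, clock-like activity.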
They also analyzed the role of "superspreaders", individuals who encounter others with high frequency [173]; disease severity increases with the hand hygiene noncompliance of such people. Beyond the empirical data on contact networks, social behavior experiments (or surveys) also play an important role in studying vaccination campaigns and disease spreading, especially in combination with the decision-making process. Here we review recent progress within this realm. Role of altruistic behavior. Game theory has been used extensively in the study of behavioral epidemiology, where individuals are usually assumed to decide whether to vaccinate according to the principle of maximizing self-interest [9,175]. In reality, however, do people consider only their own benefit when making vaccination decisions? To test this fundamental assumption, Ref. [176] recently conducted a survey on individual vaccination decisions during the influenza season. The questionnaires, from a direct campus survey and an internet-based survey, are mainly composed of two kinds of items: self-interest items (the concern about becoming infected) and altruistic items (the concern about infecting others), as schematically illustrated in Fig. 12. If agents are driven by self-interest, they attempt to minimize their cost associated with vaccination and infection, which gives rise to the selfish equilibrium (the so-called Nash equilibrium). By contrast, if individual decisions are guided by altruistic motivation, the vaccination probability reaches the community optimum (the so-called utilitarian equilibrium), at which the overall cost of the community is minimal. The authors revealed that altruism plays an important role in vaccination decisions and can be quantitatively measured by a "degree of altruism". To further evaluate its impact, they incorporated the empirical data and the altruistic motivation into an SVIR compartmental model.
Interestingly, they found that altruism can shift vaccination decisions from individual self-interest toward the community optimum, greatly enhancing total vaccination coverage and reducing the total cost, morbidity and mortality of the whole community, irrespective of the parameter setup. Along this line, the role of altruistic behavior in age-structured populations was further explored [177]. According to general experience, elderly people, who are the most vulnerable to infection in the case of influenza, should be protected by young vaccinators, who are responsible for most disease transmission. To examine under which conditions young agents vaccinate to better protect old ones, the authors organized a corresponding social behavior experiment: participants were randomly assigned to "young" and "elderly" roles (with young players contributing more to herd immunity yet elderly players facing higher costs of infection). When players were paid based on individual point totals, more elderly than young players got vaccinated, consistent with the theoretical prediction of self-interested behavior (the Nash equilibrium). On the contrary, players paid according to group point totals made decisions in a manner consistent with the utilitarian equilibrium, which predicts community-optimal behavior: more young than elderly players get vaccinated, at a lower overall cost.

Fig. 12. Schematic illustration of the questionnaire used in the voluntary vaccination survey. The survey items can be divided into self-interest items (i.e., outcomes-for-self) and altruism items (i.e., outcomes-for-others), each with corresponding scores. Based on both, it becomes possible to indirectly estimate the degree of altruism, which plays a significant role in vaccination uptake and epidemic elimination. We refer to [176], from which this figure has been adapted, for further details.
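The gap between the selfish (Nash) and utilitarian equilibria can be illustrated with a toy homogeneous-mixing vaccination game. This is our own construction, not the experimental design of [176,177]: given a basic reproductive number r0 and a vaccine cost c relative to the cost of infection, the SIR final-size relation gives the infection risk at coverage p, from which both equilibrium coverages follow.

```python
import math

def infection_risk(p, r0, iters=300):
    """Probability that an unvaccinated individual is eventually
    infected when a fraction p is vaccinated, from the final-size
    relation z = (1 - p) * (1 - exp(-r0 * z)), solved by fixed-point
    iteration under homogeneous mixing."""
    z = 1.0 - p
    for _ in range(iters):
        z = (1.0 - p) * (1.0 - math.exp(-r0 * z))
    return z / (1.0 - p) if p < 1.0 else 0.0

def nash_and_utilitarian(r0, c, grid=400):
    """Nash coverage: the largest p at which unvaccinated individuals
    still face a risk above c, so vaccinating remains individually
    rational. Utilitarian coverage: the p minimizing the community's
    total cost c * p + (1 - p) * risk(p)."""
    ps = [k / grid for k in range(grid)]
    nash = 0.0
    for p in ps:
        if infection_risk(p, r0) > c:
            nash = p

    def total_cost(p):
        return c * p + (1.0 - p) * infection_risk(p, r0)

    return nash, min(ps, key=total_cost)
```

In this toy model the utilitarian coverage exceeds the Nash coverage, mirroring the experimental finding that group-based payoffs push vaccination above the level reached by pure self-interest.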
In this sense, the payoff structure plays a vital role in the emergence of altruistic behavior, which in turn affects disease spreading. From both empirical studies we can observe that altruism significantly impacts vaccination coverage as well as the consequent disease burden: it can drive the system to the community optimum, where the smallest overall cost guarantees herd immunity. It is thus suggested that realistic policies should regard altruism as one potential lever for improving public health outcomes. Existence of free-riding behavior. Alongside altruistic behavior, another type of behavior addressed within decision-making frameworks is free-riding, whereby people benefit from the actions of others while avoiding any cost [79,178,179]. In a voluntary vaccination campaign, free riders are unvaccinated individuals who avoid infection thanks to herd immunity, as illustrated by the gray nodes in Fig. 4. To explore the impact of free-riding behavior, John et al. organized a questionnaire containing six different hypothetical scenarios twenty years ago [180]. In that survey, altruism and free-riding were considered simultaneously as potential decision motives for vaccination. They found that, for a vaccine conferring herd immunity, the free-riding frame makes respondents less inclined to vaccinate than the altruism frame does as coverage among others rises; that is, free-riding lowers the preference for vaccination as the proportion of others vaccinating increases. Going beyond homogeneous groups of individuals, Yoko et al. recently conducted a computerized influenza experiment in which groups of agents may face completely different conditions, such as infection risk, vaccine cost, severity of influenza and age structure [181].
They found that a high vaccination rate in previous rounds decreases the likelihood of individuals accepting vaccination in the following round, indicating the existence of free-riding behavior. Both empirical surveys thus showed that individual decision-making may be driven by the free-riding motive, which depresses vaccination coverage. Beyond the above examples, further factors, such as individual cognition [182] and confidence [183], affect vaccination decisions in reality. Where possible, these factors should be taken into consideration by public policy makers in order to reach the necessary level of vaccination coverage. The growth of online social networks such as Twitter in recent years provides a new opportunity to obtain data on health behaviors in near real-time. Using short text messages (tweets) collected from Twitter between August 2009 and January 2010, during which pandemic influenza A (H1N1) spread globally, Salathé et al. analyzed the spatiotemporal sentiments of individuals towards the novel influenza A (H1N1) vaccine [97]. They found that vaccination rates projected from the sentiments of Twitter users agree well with those estimated by the United States Centers for Disease Control and Prevention. They also revealed a critical problem: both negative and positive opinions can cluster into network communities. If this generates clusters of unvaccinated individuals, the risk of disease outbreaks will be greatly increased. We have reviewed some of the recent, rapidly expanding research literature concerning the nonlinear coupling between disease dynamics and human behavioral dynamics in spatially distributed settings, especially complex networks. Generally speaking, these models show that emergent self-protective behavior can dampen an epidemic, which is also what most mean-field models predict. However, in many cases, that is where the commonality in model predictions ends.
For populations distributed on a network, the structure of the disease contact network and/or the social influence network can fundamentally alter the outcomes, so that different models make very different predictions depending on the assumptions about the human population and the disease being studied, including the finding that disease-behavior interactions can actually worsen health outcomes by increasing long-term prevalence. Also, because network models are individual-based, they can represent processes that are difficult to capture with mean-field (homogeneous-mixing) models. For example, the concept of the neighbor of an individual has a natural meaning in a network model, but its meaning is less clear in mean-field models (or partial differential equation models), where populations are described in terms of densities at a point in space. We speculate that the surge of research interest in this area has been fuelled by a combination of (1) the individual-based description that characterizes network models, (2) the explosion of available individual-level data from digital sources, and (3) the realization, from recent experiences with phenomena such as vaccine scares and quarantine failures, that human behavior is becoming an increasingly important determinant of disease control efforts. We also discussed how many of the salient dynamics exhibited by disease-behavior systems are directly analogous to processes in statistical physics, such as phase transitions and self-organization. The growth in research has created both opportunities and pitfalls. A first potential pitfall is that coupled disease-behavior models are significantly more complicated than simple disease dynamic or behavior dynamic models on their own.
A coupled disease-behavior model requires not only sets of parameters describing the human behavioral dynamics and the disease dynamics separately, but also a set of parameters describing the impact of human behavior on disease dynamics, and another set describing the effect of disease dynamics on human behavior. Roughly speaking, these models therefore have four times as many parameters as a disease dynamic model or a human behavioral model on its own: they are subject to the "curse of dimensionality". A second pitfall is that relevant research from other fields may not be understood or incorporated in the best possible way. For example, the concept of 'social contagion' appears repeatedly in the literature on coupled disease-behavior models. This is a seductive concept, and it appears natural for discussing systems where a disease contagion is also present; however, the metaphor may be too facile. For example, how can the social contagion metaphor capture the subtle but important distinction between descriptive social norms (where individuals follow a morally neutral perception of what others are doing) and injunctive social norms (where individuals follow a morally laden perception of what others are doing) [184]? Social contagion may be a useful concept, but we should remember that it is ultimately only a metaphor. A third pitfall, common to all mathematical modeling exercises, is a lack of integration between theoretical models and empirical data. The second and third pitfalls are an unsurprising consequence of combining natural and human system dynamics in the same framework, and there are other potential pitfalls as well. These pitfalls also suggest ways in which the field can move forward. For example, the complexity of the models calls for new methods of analysis.
In some cases, methods of rigorous analysis (including physics-based methods such as percolation theory and pair approximations (Appendix B)), applied to systems simple enough to permit such analysis, may provide clearer and more reliable insights than the output of simulation models, which are often harder to fully understand. For systems too complicated for pen-and-paper methods, techniques for visualizing large and multidimensional datasets may prove useful. The second and third pitfalls, where physicists and other modelers, behavioral scientists and epidemiologists do not properly understand one another's fields, can be mitigated through more opportunities for interaction between the fields, such as workshops, seminars and colloquia. Interaction between scholars in these fields is often stymied by institutional barriers that enforce a 'silo' approach to academia; thus, a change in institutional modes of operation could be instrumental in improving collaborations between modelers, behavioral scientists and epidemiologists. Scientists have already shown that these pitfalls can be overcome, as evidenced by much of the research described in this review. The field of coupled disease-behavior modeling has all the elements to suggest that it will continue expanding for the foreseeable future: growing availability of the data needed to test empirical models, a rich set of potential dynamics that creates opportunities to apply various analysis methods from physics, and relevance to pressing problems facing humanity. Physicists can play an important role in developing this field thanks to their long experience in applying modeling methods to physical systems.
For an SIR infection transmitted at rate β per susceptible-infected contact, the pair approximation tracks the time evolution of pair counts; the equation for susceptible-infected pairs takes the form

d[SI]/dt = β q(I | SS) [SS] − β q(I | SI) [SI],

where [SS] is the number of susceptible-susceptible pairs in the population, q(I | SS) is the expected number of infected neighbors of a susceptible in a susceptible-susceptible pair, and similarly q(I | SI) is the expected number of infected neighbors of the susceptible person in a susceptible-infected pair. The first term corresponds to the creation of new SI pairs from SS pairs through infection, while the second term corresponds to the destruction of existing SI pairs through infection, thereby creating II pairs. An assumption must be made in order to close the equations at the pair level, avoiding the need to write down equations of motion for triples. For instance, on a random graph, the approximation

q(I | SS) ≈ q(I | SI) ≈ q(I | S) = [SI]/[S]

might be applied, where q(I | S) is the expected number of infected persons neighboring a susceptible person in the population. Equations and pair approximations for the other pair variables [SS] and [II] must also be written down, after which one has a closed set of differential equations that capture spatial effects implicitly by tracking the time evolution of pair quantities.

References

- The black death: a biological reappraisal
- World Health Organization. The world health report 1996: fighting disease; fostering development
- The mathematics of infectious diseases
- Perspectives on the basic reproductive ratio
- Theory of games and economic behaviour
- Equilibrium points in n-person games
- Game theory with applications to economics
- Game theory and evolutionary biology
- Vaccination and the theory of games
- Universal scaling for the dilemma strength in evolutionary games
- Dangerous drivers foster social dilemma structures hidden behind a traffic flow with lane changes
- Computer science and game theory
- Group interest versus self-interest in smallpox vaccination policy
- Evolutionary game theory
- Evolutionary games on graphs
- Coevolutionary games: a mini review
- Emergent hierarchical structures in multiadaptive games
- Evolutionary game theory: temporal and spatial effects beyond replicator dynamics
- Evolutionary stable strategies and game dynamics
- On the evolution of random graphs
- Statistical mechanics of complex networks
- Scaling and percolation in the small-world network model
- Emergence of scaling in random networks
- Collective dynamics of 'small-world' networks
- The structure of scientific collaboration networks
- Complex networks: structure and dynamics
- The structure and dynamics of multilayer networks
- Multilayer networks
- Evolutionary games on multilayer networks: a colloquium
- Temporal networks
- Activity driven modeling of time varying networks
- Spread of epidemic disease on networks
- Nonequilibrium phase transitions in lattice models
- Influence of infection rate and migration on extinction of disease in spatial epidemics
- The evolutionary vaccination dilemma in complex networks
- Effects of delayed recovery and nonuniform transmission on the spreading of diseases in complex networks
- Influence of time delay and nonlinear diffusion on herbivore outbreak
- Epidemic spreading in scale-free networks
- Percolation critical exponents in scale-free networks
- Absence of epidemic threshold in scale-free networks with degree correlations
- Epidemic incidence in correlated complex networks
- Epidemic spreading in community networks
- Competing activation mechanisms in epidemics on networks
- Epidemic thresholds in real networks
- Epidemics and immunization in scale-free networks
- The structure and function of complex networks
- The web of human sexual contacts
- Risk perception in epidemic modeling
- Nonpharmaceutical interventions implemented by US cities during the 1918-1919 influenza pandemic
- Public health interventions and epidemic intensity during the 1918 influenza pandemic
- Alcohol-based instant hand sanitizer use in military settings: a prospective cohort study of army basic trainees
- Rational epidemics and their public control
- Integrating behavioural choice into epidemiological models of the AIDS epidemic
- Choices, beliefs and infectious disease dynamics
- Public avoidance and epidemics: insights from an economic model
- Adaptive human behavior in epidemiological models
- Game theory of social distancing in response to an epidemic
- A mathematical analysis of public avoidance behavior during epidemics using game theory
- Equilibria of an epidemic game with piecewise linear social distancing cost
- Modeling and analyzing HIV transmission: the effect of contact patterns
- Structured mixing: heterogeneous mixing by the definition of activity groups
- Factors that make an infectious disease outbreak controllable
- Transmission dynamics and control of severe acute respiratory syndrome
- Curtailing transmission of severe acute respiratory syndrome within a community and its hospital
- The effect of risk perception on the 2009 H1N1 pandemic influenza dynamics
- Behavior changes in SIS STD models with selective mixing
- Infection-age structured epidemic models with behavior change or treatment
- Coupled contagion dynamics of fear and disease: mathematical and computational explorations
- On the existence of a threshold for preventive behavioral responses to suppress epidemic spreading
- Towards a characterization of behavior-disease models
- The impact of information transmission on epidemic outbreaks
- Modeling and analysis of effects of awareness programs by media on the spread of infectious diseases
- Coevolution of pathogens and cultural practices: a new look at behavioral heterogeneity in epidemics
- Spontaneous behavioural changes in response to epidemics
- Risk perception and effectiveness of uncoordinated behavioral responses in an emerging epidemic
- A generalization of the Kermack-McKendrick deterministic model
- The spread of awareness and its impact on epidemic outbreaks
- Social contact networks and disease eradicability under voluntary vaccination
- The impact of awareness on epidemic spreading in networks
- Suppression of epidemic spreading in complex networks by local information based behavioral responses
- Epidemic spreading with information-driven vaccination
- Intermittent social distancing strategy for epidemic control
- Peer pressure is a double-edged sword in vaccination dynamics
- Imitation dynamics of vaccination behaviour on social networks
- Insight into the so-called spatial reciprocity
- Impact of committed individuals on vaccination behavior
- Wisdom of groups promotes cooperation in evolutionary social dilemmas
- Braess's paradox in epidemic game: better condition results in less payoff
- Price of anarchy in transportation networks: efficiency and optimality control
- Effects of behavioral response and vaccination policy on epidemic spreading: an approach based on evolutionary-game dynamics
- Modeling the interplay between human behavior and the spread of infectious diseases
- The impact of imitation on vaccination behavior in social contact networks
- A computational approach to characterizing the impact of social influence on individuals' vaccination decision making
- Risk assessment for infectious disease and its impact on voluntary vaccination behavior in social networks
- Vaccination and public trust: a model for the dissemination of vaccination behavior with external intervention
- Assessing vaccination sentiments with online social media: implications for infectious disease dynamics and control
- Erratic flu vaccination emerges from short-sighted behavior in contact networks
- The dynamics of risk perceptions and precautionary behavior in response to 2009 (H1N1) pandemic influenza
- Optimal interdependence between networks for the evolution of cooperation
- Social factors in epidemiology
- Catastrophic cascade of failures in interdependent networks
- Globally networked risks and how to respond
- Eigenvector centrality of nodes in multiplex networks
- Synchronization of interconnected networks: the role of connector nodes
- Epidemic spreading on interconnected networks
- Effects of interconnections on epidemics in network of networks
- The robustness and restoration of a network of ecological networks
- Diffusion dynamics on multiplex networks
- Dynamical interplay between awareness and epidemic spreading in multiplex networks
- Competing spreading processes on multiplex networks: awareness and epidemics
- Two-stage effects of awareness cascade on epidemic spreading in multiplex networks
- Effects of awareness diffusion and self-initiated awareness behavior on epidemic spreading: an approach based on multiplex networks
- Spontaneous behavioural changes in response to epidemics
- Coupling infectious diseases, human preventive behavior, and networks: a conceptual framework for epidemic modeling
- Modeling triple-diffusions of infectious diseases, information, and preventive behaviors through a metropolitan social network: an agent-based simulation
- Influence of breaking the symmetry between disease transmission and information propagation networks on stepwise decisions concerning vaccination
- Asymmetrically interacting spreading dynamics on complex layered networks
- Modelling the influence of human behaviour on the spread of infectious diseases: a review
- Adaptive coevolutionary networks: a review
- Exact solution for the time evolution of network rewiring models
- Epidemic reemergence in adaptive complex networks
- Temporal networks: slowing down diffusion by long lasting interactions
- Epidemic dynamics on an adaptive network
- Robust oscillations in SIS epidemics on adaptive networks: coarse graining by automated moment closure
- Fluctuating epidemics on adaptive networks
- Adaptive networks: coevolution of disease and topology
- Infection spreading in a population with evolving contacts
- Contact switching as a control strategy for epidemic outbreaks
- Susceptible-infected-recovered epidemics in dynamic contact networks
- Absence of epidemic thresholds in a growing adaptive network
- Epidemic spreading in evolving networks
- Enhanced vaccine control of epidemics in adaptive networks
- Efficient community-based control strategies in adaptive networks
- Evolutionary dynamics of time-resolved social interactions
- Random walks on temporal networks
- Outcome inelasticity and outcome variability in behavior-incidence models: an example from an SIR infection on a dynamic network
- Exploiting temporal network structures of human interaction to effectively immunize populations
- Pastor-Satorras R. Immunization strategies for epidemic processes in time-varying contact networks
- The effect of opinion clustering on disease outbreaks
- Network theory and SARS: predicting outbreak diversity
- Positive network assortativity of influenza vaccination at a high school: implications for outbreak risk and herd immunity
- An ongoing multi-state outbreak of measles linked to non-immune anthroposophic communities in Austria
- Measles outbreak in Switzerland: an update relevant for the European football championship
- Dynamics and control of diseases in networks with community structure
- Spreading of sexually transmitted diseases in heterosexual populations
- Particle swarm optimization with scale-free interactions
- Modelling dynamical processes in complex socio-technical systems
- Modelling disease outbreaks in realistic urban social networks
- Complex social contagion makes networks more vulnerable to disease outbreaks
- Traffic-driven epidemic spreading in finite-size scale-free networks
- Impact of rotavirus vaccination on epidemiological dynamics in England and Wales
- Epidemiological effects of seasonal oscillations in birth rates
- Dynamic modeling of vaccinating behavior as a function of individual beliefs
- Social contacts and mixing patterns relevant to the spread of infectious diseases
- Location-specific patterns of exposure to recent pre-pandemic strains of influenza A in southern China
- Social contacts and the locations in which they occur as risk factors for influenza infection
- The social brain hypothesis
- Modeling users' activity on Twitter networks: validation of Dunbar's number
- Reality mining: sensing complex social systems
- Inferring friendship network structure by using mobile phone data
- A high-resolution human contact network for infectious disease transmission
- Dynamics of person-to-person interactions from distributed RFID sensor networks
- What's in a crowd? Analysis of face-to-face behavioral networks
- High-resolution measurements of face-to-face contact patterns in a primary school
- Predictability of conversation partners
- Towards a temporal network analysis of interactive WiFi users
- Characterizing large-scale population's indoor spatio-temporal interactive behaviors
- Spatial epidemiology of networked metapopulation: an overview
- Bursts: the hidden patterns behind everything we do, from your e-mail to bloody crusades. Penguin
- Temporal dynamics and impact of event interactions in cyber-social populations
- Do peer effects improve hand hygiene adherence among healthcare workers?
analyzing the impact of superspreading using hospital contact networks temporal networks of face-to-face human interactions long-standing influenza vaccination policy is in accord with individual self-interest but not with the utilitarian optimum the influence of altruism on influenza vaccination decisions using game theory to examine incentives in influenza vaccination behavior multiple effects of self-protection on the spreading of epidemics imperfect vaccine aggravates the long-standing dilemma of voluntary vaccination the roles of altruism, free riding, and bandwagoning in vaccination decisions free-riding behavior in vaccination decisions: an experimental study cognitive processes and the decisions of some parents to forego pertussis vaccination for their children improving public health emergency preparedness through enhanced decision-making environments: a simulation and survey based evaluation social influence: social norms, conformity and compliance correlation equations and pair approximations for spatial ecologies a moment closure model for sexually transmitted disease spread through a concurrent partnership network we would like to acknowledge gratefully yao yao, yang liu, ke-ke huang, yan zhang, eriko fukuda, dr. wen-bo du, dr. ming tang, dr. hai-feng zhang and prof. zhen jin for their constructive helps and discussions, also appreciate all the other friends with whom we maintained (and are currently maintaining) interactions and discussions on the topic covered in our report. this work was partly supported by the national natural science foundation of china (grant no. 61374169, 11135001, 11475074) and the natural sciences and engineering council of canada (nserc; ma and ctb). 
whenever the classical epidemic spreading processes (sis, sir) take place in homogeneous populations, e.g., when individuals are located on the vertices of a regular random graph, an er random graph, or a complete graph, the qualitative properties of the dynamics can be well captured by mean-field analysis, assuming unbiased random matching among the individuals. for simplicity, yet without loss of generality, we here derive the mean-field solution for the peer-pressure effect in vaccination dynamics on the er random graph as an example [84]. let x be the fraction of vaccinated individuals and w(x) be the probability that a susceptible individual finally gets infected in a population with vaccine coverage x. after each sir epidemic season, individuals get payoffs: vaccinated (i, p_i = -c), unvaccinated and healthy (j, p_j = 0), and unvaccinated and infected (ς, p_ς = -1). individuals are allowed to modify their vaccination strategies in terms of eq. (7). whenever an individual from compartment i goes to compartment j or ς, the variable x drops, which can be formulated as x_{-} = -x(1-x)\,[(1-w(x))\,P_{i\to j} + w(x)\,P_{i\to\varsigma}], where above we have approximated, in the spirit of mean-field treatment, the fraction of neighbours holding the opposite strategy of the vaccinated individuals as 1 - x. the quantity P_{i\to j} is the probability that individuals from the compartment i change to the compartment j, whose value is determined by eq. (7). accordingly, the gain of x can be written as x_{+} = x(1-x)\,[(1-w(x))\,P_{j\to i} + w(x)\,P_{\varsigma\to i}]. taking the two cases into consideration, the derivative of x with respect to time is written as dx/dt = x_{+} + x_{-}. solving dx/dt = 0 for x, we get the equilibrium vaccination level f_v. note that the equilibrium epidemic size is expected to be f_i = w(x), satisfying the self-consistent equation w(x) = (1-x)\,(1 - e^{-r_0 w(x)}), where r_0 is the basic reproductive number of the epidemic. pair approximation is a method by which space can be implicitly captured in an ordinary differential equation framework [185, 186].
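the mean-field equilibrium described above can be found numerically. the sketch below is a minimal, illustrative python re-implementation, not the authors' code: it assumes a fermi-type imitation rule standing in for eq. (7), the standard sir final-size relation for w(x), and illustrative parameter values (r0 = 2.5, vaccination cost c = 0.3, selection strength kappa = 0.1).

```python
import math

def epidemic_size(x, r0, tol=1e-10, max_iter=500):
    # fixed-point iteration for the sir final-size relation
    # w = (1 - x) * (1 - exp(-r0 * w))  (assumed standard form)
    w = 1.0 - x
    for _ in range(max_iter):
        w_new = (1.0 - x) * (1.0 - math.exp(-r0 * w))
        if abs(w_new - w) < tol:
            break
        w = w_new
    return w

def fermi(payoff_own, payoff_neighbour, kappa=0.1):
    # illustrative fermi imitation rule standing in for eq. (7):
    # probability of adopting the neighbour's strategy
    return 1.0 / (1.0 + math.exp(-(payoff_neighbour - payoff_own) / kappa))

def equilibrium_coverage(r0=2.5, cost=0.3, x0=0.5, dt=0.05, steps=5000):
    # euler integration of dx/dt = x_plus + x_minus toward equilibrium f_v
    x = x0
    for _ in range(steps):
        w = epidemic_size(x, r0)
        # payoffs: vaccinated -cost; unvaccinated 0 (healthy) or -1 (infected)
        x_plus = x * (1 - x) * ((1 - w) * fermi(0.0, -cost) + w * fermi(-1.0, -cost))
        x_minus = -x * (1 - x) * ((1 - w) * fermi(-cost, 0.0) + w * fermi(-cost, -1.0))
        dx = x_plus + x_minus
        x = min(1.0, max(0.0, x + dt * dx))
        if abs(dx) < 1e-9:
            break
    return x
```

with these illustrative parameters the dynamics settle at an interior coverage level, below the level that would eliminate the epidemic, reflecting the free-riding dilemma discussed in the surrounding text.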
to illustrate the method, consider the variable [s], defined as the number of susceptible individuals in a population distributed across a network or a lattice. for an sir natural history, the equation of motion for [s] is: d[s]/dt = -\tau [si], where \tau is the transmission rate and [si] is the number of susceptible-infected pairs.

key: cord-275258-azpg5yrh authors: mead, dylan j.t.; lunagomez, simón; gatherer, derek title: visualization of protein sequence space with force-directed graphs, and their application to the choice of target-template pairs for homology modelling date: 2019-07-26 journal: j mol graph model doi: 10.1016/j.jmgm.2019.07.014 sha: doc_id: 275258 cord_uid: azpg5yrh

the protein sequence-structure gap results from the contrast between rapid, low-cost deep sequencing and slow, expensive experimental structure determination techniques. comparative homology modelling may have the potential to close this gap by predicting protein structure in target sequences using existing experimentally solved structures as templates. this paper presents the first use of force-directed graphs for the visualization of sequence space in two dimensions, and applies them to the choice of suitable rna-dependent rna polymerase (rdrp) target-template pairs within human-infective rna virus genera. measures of centrality in protein sequence space for each genus were also derived and used to identify centroid nearest-neighbour sequences (cnns) potentially useful for the production of homology models most representative of their genera. homology modelling was then carried out for target-template pairs in different species, different genera and different families, and model quality assessed using several metrics. reconstructed ancestral rdrp sequences for individual genera were also used as templates for the production of ancestral rdrp homology models. high quality ancestral rdrp models were consistently produced, as were good quality models for target-template pairs in the same genus.
homology modelling between genera in the same family produced mixed results, and inter-family modelling was unreliable. we present a protocol for the production of optimal rdrp homology models for use in further experiments, e.g. docking to discover novel anti-viral compounds. since high-throughput sequencing technologies entered mainstream use towards the end of the first decade of the 21st century, there has been an explosion in available protein sequences. by contrast, there has been no corresponding high-throughput revolution in structural biology. obtaining solved structures of proteins at adequate resolution remains a painstaking task. x-ray crystallography is still the gold standard for structure determination more than 60 years after its first use in determining myoglobin structure [1]. the result of this discrepancy between the rate of protein sequence determination and the rate of protein structure determination is the protein sequence-structure gap [2]. homology modelling is a rapid computational technique for prediction of a protein's structure from (a) the protein's sequence, and (b) a solved structure of a related protein, referred to as the target and the template, respectively. since structural similarity often exists even where sequence similarity is low [2, 3], homology modelling has the potential to massively reduce the size of the protein sequence-structure gap, provided the models produced can be considered reliable enough for use in further research. the rna-dependent rna polymerase (rdrp) of rna viruses presents an opportunity to test and expand this approach. rdrps are the best conserved proteins throughout the rna viruses, being essential for their replication [4]. conservation is particularly high in structural regions that are involved in the replication process, for instance the indispensable rna-binding pocket [5]. rdrps are also of immense medical importance as the principal targets for antiviral drugs.
evolution of resistance against anti-viral drugs is a major concern for the future, and the design of novel anti-viral compounds is a highly active research area. solved structures of rdrps are of great assistance to these efforts, as they enable the use of docking protocols against large libraries of pharmaceutical candidate compounds [e.g. refs. [6, 7]]. although some human-infective rna viruses have solved rdrp structures, there are still large areas within the virus taxonomy that lack any. this paper will first identify where the protein sequence-structure gap is at its widest in rdrps. because of the sequence-structure gap, it is impossible in many genera to perform docking protocols against solved structures of rdrp for discovery of novel anti-viral compounds. under these circumstances, replacement of real solved structures with homology models for docking experiments requires that the homology models used should be both high quality and also optimally representative of their respective genera. our second task is to present several similarity metrics in sequence space that assist in the identification of the virus species having the rdrp sequence that is most representative of its genus as a whole. we then present the first use of force-directed graphs to produce an intuitive visualization of sequence space, and select target rdrps without solved structures for homology modelling. these are then used to perform homology modelling using template-target pairs within the same genus, between sister genera and between sister families, monitoring the quality of the models produced as the template becomes progressively more genetically distant from the target sequence being modelled. finally, we produce homology models for reconstructed common ancestral rdrp sequences.
in the light of our results, we comment on the strengths and weaknesses of homology modelling to reduce the size of the protein sequence-structure gap for rdrps, and produce a flowchart of recommendations for docking experiments on rdrp proteins lacking a solved structure. we chose rdrps from human-infective viruses based on the list provided by woolhouse & brierley [8]. given the global medical importance of aids, we also included lentivirus reverse transcriptases (rts) for analysis. solved structures for these proteins, where available, were downloaded from the rcsb protein data bank (pdb) [9]. table 1 presents our criteria for selecting suitable homology modelling candidates. rdrp and rt amino acid sequences for all virus species satisfying the criteria of table 1 were downloaded from genbank [10]. alignment of sequence sets for each genus was performed using mafft [11]. alignments were refined in mega [12] using muscle [13] where necessary, and the best substitution model determined. alignment of target sequences onto their solved structure templates for homology modelling was carried out using the molecular operating environment (moe v.2016.08, chemical computing group, montreal h3a 2r7, canada). we define sequence space as a theoretical multi-dimensional space within which protein sequences may be represented by points. for an alignment of n related proteins, the necessary dimensionality of this sequence space is n-1, with the hyperspatial co-ordinates in each dimension for any protein determined by its genetic distance to the n-1 other proteins. for n = 5, direct visualization of all dimensions of sequence space is impractical at best, since a 4-dimensional space must be simulated in three dimensions, and is effectively impossible for n ≥ 6. the following methods were used to reduce sequence space to two and three dimensions for ease of visualization.
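as an illustration of this kind of dimensionality reduction, a fruchterman-reingold-style force-directed layout driven by a similarity matrix can be sketched in pure numpy. this is a minimal re-implementation for illustration only, not the qgraph "spring" layout used in the paper; the distance-to-similarity transform shown is an assumption, since the paper's exact transform (equations (1) and (2)) is not reproduced in this excerpt.

```python
import numpy as np

def similarity_from_distance(m_d):
    # illustrative inversion of genetic distance into edge weights
    m = np.asarray(m_d, dtype=float)
    w = 1.0 / (1.0 + m)
    np.fill_diagonal(w, 0.0)  # no self-edges
    return w

def fruchterman_reingold(w, iterations=500, seed=0):
    # minimal 2-d force-directed layout: all-pairs repulsion plus
    # attraction along weighted edges, with a cooling "temperature"
    rng = np.random.default_rng(seed)
    n = w.shape[0]
    pos = rng.uniform(-1.0, 1.0, size=(n, 2))
    k = 1.0 / np.sqrt(n)              # ideal pairwise spacing
    t = 0.1                           # max displacement per iteration
    cooling = t / (iterations + 1)
    for _ in range(iterations):
        delta = pos[:, None, :] - pos[None, :, :]   # (n, n, 2)
        dist = np.linalg.norm(delta, axis=-1)
        np.fill_diagonal(dist, 1e-9)                # avoid divide-by-zero
        coeff = k * k / dist**2 - w * dist / k      # repulsion - attraction
        force = (coeff[:, :, None] * delta / dist[:, :, None]).sum(axis=1)
        step = np.maximum(np.linalg.norm(force, axis=1, keepdims=True), 1e-9)
        pos = pos + force / step * np.minimum(step, t)
        t -= cooling
    return pos
```

as in the paper's figures, closely related sequences end up as nearby nodes, while the edge weights carry the absolute similarity.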
to simplify calculations, we allow an extra dimension defined by the distance from each sequence to itself. the value of the co-ordinate in that dimension is always zero, and our sequence space has n dimensions rather than n-1. the pairwise distance matrix (m_d) for each genus, calculated from the sequence alignment in mega, consists of entries m_d(i,j) giving the genetic distance between each pair of sequences i and j, where {i, j} ∈ {1, 2, …, n} and i ≠ j, for a set of n sequences. in our data set n ranges (see supplementary table). the similarity matrix was then used as input for r package qgraph [14]. the "spring" layout option was chosen, which uses the fruchterman-reingold algorithm to produce a two-dimensional undirected graph in which edge thickness is proportional to absolute distance in n dimensions, and node proximity in two dimensions is optimized for ease of viewing while attempting to ensure that those nodes closely related in the n-dimensional input are also close in the two-dimensional output [15]. 500 iterations were performed, or until convergence was achieved. for each alignment, the pairwise distance matrix (m_d) was used as input for r package cmdscale, which uses multi-dimensional scaling to produce a three-dimensional graph from the n-dimensional input, with node proximity again reflecting relative similarity [16]. spotfire analyst (tibco spotfire analyst, v.7.12.0, 2018) was used to visualize the output of cmdscale. we define the centroid as a hypothetical protein sequence located at the centre point of the sequence space of an alignment. the real sequence closest to the hypothetical centroid is termed the centroid nearest neighbour (cnn). we calculate the position of the cnn in three ways.

table 1. list of criteria used to select rna-dependent rna polymerases (rdrps) for homology modelling.
- human-infective virus: importance to human health
- ncbi refseq annotated genome: easy retrieval of high quality rdrp sequence
- rdrp located at the 3' end of polyprotein or on its own segment: eliminates unconventional rdrps
- at least one solved rdrp at a range of different taxonomic levels, e.g. in same species, same genus, same family, same order: to be used as the templates in homology modelling at different levels of genetic distance

2.4.1. shortest-path centroid nearest neighbour. for a sequence i ∈ {1, 2, …, n} in an alignment of n sequences, its total path length d(i) to the other n-1 sequences may be calculated from the distance matrix m_d as follows: d(i) = \sum_{j=1}^{n} m_d(i,j), where the self-distance term m_d(i,i) is zero. this term may be omitted to enforce a strict n-1 dimensions for n input sequences, but we leave it in to simplify subsequent calculations. we define i* as the index that minimizes d(i). the shortest-path cnn is therefore sequence i*. for alignments where clusters of closely related sequences exist, giving many values of m_d(i,j) close to zero, this method will tend to place the cnn within a cluster. to overcome this problem, the arithmetic mean and median, respectively, were used to determine the mean cnn and the median cnn. the values of d (equation (2)) may be averaged to produce the mean total path distance \bar{d}: \bar{d} = (1/n)\sum_{i=1}^{n} d(i), where again n is the total number of sequences in the alignment. we now re-define i* as the index that minimizes |d(i) - \bar{d}| (equation (5)). in the event of equation (5) returning zero, the mean cnn and the true centroid are identical. as with all variables using means, the mean cnn is liable to skewing by outliers. we generate a vector d over i ∈ {1, 2, …, n}, in which each entry d(i) represents the total path length for sequence i (equation (2)). the values of vector d are then ranked in ascending order, d_s(1) to d_s(n), to produce the vector d_s.
the median cnn is the sequence with value d(i) situated in the middle of the ranked array d_s, at d(m), where d(m) is either d(m_odd) or d(m_even) for alignments with odd or even numbers of sequences respectively. we now re-define i* as the index that minimizes |d(i) - d(m)|. again, in the event of equation (9) returning zero, the median cnn and the true centroid are identical. as with all variables using medians, the median cnn is liable to skewing by the presence in the alignment of multiple sequences with the same value of d(i). the choice of solved structures as templates for homology modelling, and the choice of targets to be modelled, within each genus was governed by the following rules: (1) for each genus, the solved structure that covered the highest proportion of the rdrp or rt sequence was chosen as the template for that genus. (2) if more than one candidate template structure was found at this sequence length, the structure with the lowest resolution in angstroms was selected. see table 2 for the templates satisfying these two criteria. (3) within each genus, the sequence with the greatest genetic distance from the template was chosen as the target for homology modelling. see table 3 for the template-target pairs satisfying this criterion. (4) criterion 3 was applied to find template-target pairs in different genera (see table 4) and different families (see table 5), thus testing the limits of homology modelling at high genetic distances. homology modelling was carried out using the molecular operating environment (moe v.2016.08, chemical computing group, montreal h3a 2r7, canada). ten intermediate models were produced using the amber10:eht forcefield under medium refinement. the model that scored best under the generalised born/volume integral (gb/vi) was selected to undergo further energy minimisation using protonate3d, which predicts the location of hydrogen atoms using the model's 3d coordinates [17, 18].
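the three cnn definitions above reduce to simple operations on the row sums of the distance matrix. a minimal numpy illustration (not the authors' code):

```python
import numpy as np

def cnn_indices(m_d):
    # total path length d(i) = sum over j of m_d(i, j), with m_d(i, i) = 0
    m = np.asarray(m_d, dtype=float)
    d = m.sum(axis=1)
    shortest_path = int(np.argmin(d))                      # minimises d(i)
    mean_cnn = int(np.argmin(np.abs(d - d.mean())))        # closest to mean path length
    median_cnn = int(np.argmin(np.abs(d - np.median(d))))  # closest to median path length
    return shortest_path, mean_cnn, median_cnn
```

as noted in the text, the three definitions frequently disagree: a tightly clustered alignment pulls the shortest-path cnn into the cluster, while the mean and median cnns can land on quite different sequences.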
to assess the stereochemical quality of the homology models produced, ramachandran plots were derived in moe, and used to calculate the proportion of outlier φ-ψ angles in the model, after subtraction of the number of outlier φ-ψ angles in the template. generally, an outlier angle percentage below 0.05% indicates a very high quality model, and a percentage below 2% indicates a good quality model [19]. models were superposed with their templates in moe and the root-mean-square deviation (rmsd) value derived for the alpha carbons (cα) in the two structures. generally, an rmsd below 2 å indicates a good quality model [20]. qualitative model energy analysis (qmean) was used to analyse models using both statistical and predictive methods [21]. the qmean z-score is an overall measure of the quality of the model when compared to similar models from a pdb reference set of x-ray crystallography-solved structures. a z-score of 0 would indicate a model of the same quality as a similar high quality x-ray crystallographic structure, while a z-score below -4.00 indicates a low quality model [22]. maximum likelihood (ml) trees [23] were produced for each genus in mega. the ml tree and the corresponding multiple sequence alignment were input into the ancestral reconstruction server, fastml [24]. the reconstructed sequence for the root of the tree, i.e. the putative common ancestor rdrp or rt sequence for the genus, was used as the target for homology modelling in moe, using the template chosen according to the rules in section 2.5. the reconstructed ancestral sequence was added to the alignment and the force-directed graph re-drawn. fig. 1b, showing the target-template pairs for homology modelling, may be compared with fig. 1c, showing the ancestor-template pairs. our first observation is that there are still large areas of the viral taxonomy where no solved rdrp structures exist.
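the cα rmsd after optimal superposition, used throughout the quality assessment above, can be illustrated with the standard kabsch algorithm. the sketch below is a generic numpy implementation, not the moe superposition routine:

```python
import numpy as np

def kabsch_rmsd(p, q):
    # rmsd between two (n, 3) coordinate sets after optimal rigid-body
    # superposition (kabsch): centre both sets, find the rotation via svd
    p = np.asarray(p, dtype=float) - np.mean(p, axis=0)
    q = np.asarray(q, dtype=float) - np.mean(q, axis=0)
    h = p.T @ q                              # 3x3 covariance matrix
    u, s, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))   # guard against reflections
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T  # optimal rotation
    p_rot = p @ r.T
    return float(np.sqrt(np.mean(np.sum((p_rot - q) ** 2, axis=1))))
```

applied to matched cα coordinates of a model and its template, values below 2 å would fall in the paper's good-quality range.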
no suitable templates for homology modelling were found within the entire nidovirales order of rna viruses. this order contains several coronaviruses important to human health, including severe acute respiratory syndrome-related coronavirus (sars-cov) and middle east respiratory syndrome-related coronavirus (mers-cov) [25]. in the order mononegavirales, vesiculovirus was the only genus with a solved rdrp structure suitable for homology modelling. however, this order contains many medically important viruses such as zaire ebolavirus, hendra henipavirus, measles morbillivirus, and mumps rubulavirus [26]. in the order bunyavirales, phenuiviridae stands out as an important family lacking a solved rdrp, despite it containing various human-infective arboviruses such as rift valley fever phlebovirus and sandfly fever naples phlebovirus [27]. fig. 1 shows two-dimensional force-directed graphs of similarity for each genus with more than four rdrp reference sequences (or rt sequences in the case of lentivirus). in principle, it would be possible to draw force-directed graphs for entire families and even orders. however, the input to qgraph is the similarity matrix calculated from the distance matrix, and the distance matrix is calculated in mega from an alignment. once taxonomic distances begin to extend beyond genera, alignment becomes progressively less reliable, with all the downstream statistics tending to degrade as a consequence. we therefore confine our construction of force-directed graphs to intra-genus comparisons. it is evident from fig. 1 that sequences are not necessarily evenly distributed in sequence space. clustering is noticeable in the genus flavivirus, with two sub-groups and an outlier sequence evident. mammarenavirus also shows division into two sub-groups. by contrast, picobirnavirus has only five relatively equidistant reference sequences, thus producing a highly regular pentagram.
similarly, rotavirus has eight reference sequences, with four at each end of a fairly regular cuboid. fig. 1a also shows how the various methods (equations (2)-(9)) for determining the cnn of sequence space for each genus are in poor agreement. only in rotavirus and picobirnavirus are the mean and median cnns found in the same sequence.

table 2. solved structures of rdrps and reverse transcriptase (for hiv-1) selected as templates for homology modelling. all are derived by x-ray crystallography except 5a22, which is a cryo-electron microscopy structure. for protein coverage, indicates that the template covers more than 90% of the sequence, indicates less. for φ-ψ outliers and qmean z-score, indicates good-quality, indicates poor-quality, determined by the following thresholds: φ-ψ = 2%, qmean z-score = -4.00.

table 3. homology modelling at intra-genus, inter-species level. templates are as given in table 2. targets are the rdrp (or reverse transcriptase for lentivirus) sequences from the reference genome accession numbers given. rmsd: root mean square deviation in angstroms between template and model when superposed in moe. indicates good quality, indicates poor quality, determined by the following thresholds: φ-ψ < 2%; qmean z-score > -4.00; rmsd < 2 å. indicates good quality, but using a partial template (see table 1). *imjin thottimvirus was reclassified in 2018 by the international committee on taxonomy of viruses (ictv) in a new genus thottimvirus.

table 4. homology modelling at intra-family, inter-genus level. templates are as given in table 2. targets are the rdrp (or reverse transcriptase for spumavirus) sequences from the reference genome accession numbers given. rmsd: root mean square deviation in angstroms between template and model when superposed in moe. indicates good-quality, indicates poor-quality, determined by the following thresholds: φ-ψ < 2%; qmean z-score > -4.00; rmsd < 2 å.

fig.
1a also shows that the best solved structure for the purposes of template choice in homology modelling is rarely close to the centre of sequence space. only in lentivirus is the optimal template also the mean cnn, and only in vesiculovirus is the optimal template a shortest-path cnn. fig. 1b shows the relations of the template-target pairs in sequence space, illustrating how intra-genus homology modelling template-target selection attempts to traverse the largest genetic distance available within the genus. figs. 2 and 3 compare, for genera orthohantavirus and mammarenavirus respectively, the force-directed graphs of fig. 1 with the equivalent three-dimensional output of multidimensional scaling. fig. 2 shows a sequence clustering within orthohantavirus that is not readily apparent in the force-directed graph. the cnns are distributed among four clusters, as there is no sequence close to the geometrical centre of the three-dimensional space, where the notional centroid is located. the solved structure has 10 other sequences in its proximity in the three-dimensional space, roughly equivalent to the lower right quadrant of the two-dimensional force-directed graph. similarly, the shortest-path cnn and mean cnn are both located within another three-dimensional cluster, also containing 11 sequences, which is roughly equivalent to the upper right quadrant of the two-dimensional force-directed graph. fig. 3 presents a similar picture for mammarenavirus. the force-directed graph for mammarenavirus has more obvious clustering than that for orthohantavirus, showing a lower-left to top-right split. in the three-dimensional representation, these are equivalent, respectively, to the three clusters on the right and two clusters on the left. as with orthohantavirus, there is no cnn near the geometrical centre of the three-dimensional space, but the cnns are distributed around two clusters. three-dimensional representations of all the genera in fig. 1 are available from the link in the raw data section. homology modelling was carried out as follows: (1) intra-genus, inter-species (11 models, table 3); (2) intra-family, inter-genus (5 models, table 4); (3) intra-order, inter-family (7 models, table 5); (4) intra-genus, on the reconstructed common ancestor (12 models, table 6). table 3 shows that homology modelling with template and target within the same genus produced good quality models in most cases, as judged by percentage of φ-ψ outliers and rmsd within the high quality range.

table 5. homology modelling at intra-order, inter-family level. templates are as given in table 2. targets are the rdrp (or reverse transcriptase for lentivirus) sequences from the reference genome accession numbers given. rmsd: root mean square deviation in angstroms between template and model when superposed in moe. indicates good-quality, indicates poor-quality, determined by the following thresholds: φ-ψ < 2%; qmean z-score > -4.00; rmsd < 2 å.

fig. 1. force-directed graph visualisations of similarity of rdrps (or reverse transcriptase for lentivirus) within genera. the genetic distance matrix for each alignment was converted into a similarity matrix (equations (1) and (2)). the fruchterman-reingold algorithm (500 minimisation iterations) was implemented in r module qgraph to produce a force-directed graph. relative similarity is represented by node proximity, and absolute similarity is proportional to edge thickness. the solved structure and the three types of centroid nearest neighbour (cnn) sequences are highlighted. the species names corresponding to the numbered nodes are listed in the supplementary table. cardiovirus has less than four reference sequences and is omitted. a: location of solved structure and the three cnns in sequence space (equations (3)-(7)). some genera have two median cnns.
only the models for american bat vesiculovirus and tamana bat virus have percentages of φ-ψ outliers outside of the high quality range. qmean, however, is rather more critical of the output, with only the model for porcine picobirnavirus falling within the high quality range. the model for imjin thottimvirus scores eighth best on percentage of φ-ψ outliers and second best on rmsd, despite the re-classification (occurring after the completion of our experimental work) by the ictv of this virus, originally in genus orthohantavirus, into a new genus thottimvirus [28]. it should be noted that the models for imjin thottimvirus, burana orthonairovirus and brazilian mammarenavirus were based on very short template structures (see table 2). table 4 shows that homology modelling with template and target within the same family but different genera still produced good quality models in most cases, as judged by percentage of φ-ψ outliers and rmsd within the high quality range. only the models for lleida bat lyssavirus and macaque simian foamy virus have percentages of φ-ψ outliers outside of the high quality range. however, once again, qmean assesses all models as outside the high quality range. table 5 shows that homology modelling with template and target within the same order but in different families is a far more difficult proposition than at the lower taxonomic levels. the model for mammalian orthobornavirus 1 fails all three quality tests, and only the model for rift valley fever phlebovirus manages to pass two out of three. table 6 shows that modelling the structure of the reconstructed sequence of the common ancestor of each genus produces models of the same standard as intra-genus modelling (compare tables 3 and 6). by contrast with almost all the other models, the qmean scores are within the high quality range, with only two exceptions: the common ancestors of genera rotavirus and vesiculovirus. fig.
1c shows the force-directed graphs with the locations of the ancestral sequences added. table 7 summarises the results of tables 3-6 inclusive. as the taxonomic distance increases, production of high quality homology models becomes more difficult. however, modelling the reconstructed ancestral sequence of each genus is typically productive of a better scoring model even than the real sequence targets chosen for intra-genus modelling. fig. 4 shows representative examples of homology models of high and low quality, superimposed with their template solved structure, along with their corresponding ramachandran plots and qmean quality scores. all homology models in tables 3-6 are available from the link in the raw data section. the first objective of this study was to identify viral taxa which are comparatively lacking in solved structures for rna-dependent rna polymerase (rdrp). we observed that the entire order nidovirales, the families bornaviridae, filoviridae and paramyxoviridae within the order mononegavirales, and the family phenuiviridae within the order bunyavirales, fall into this category. additionally, within the genera orthohantavirus, orthonairovirus and mammarenavirus, all within the order bunyavirales, the solved structure available for rdrp covers less than 10% of the protein sequence. given the medical importance of many viruses within these taxa, and the number of anti-viral drugs that target rdrps, we suggest that they are prioritized for x-ray crystallography to close the "sequence-structure gap". our second objective was to assess how well homology modelling could provide models that might serve for computer-assisted drug discovery of novel anti-viral compounds. to assist in the visualization of sequence space, we produced the first application of force-directed graphs to protein sequences (fig. 1). we also applied multidimensional scaling for comparative purposes (figs. 2 and 3).
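classical multidimensional scaling of the kind performed by r's cmdscale can be sketched in a few lines of numpy: double-centre the squared distance matrix and embed each sequence using the top-k eigenvectors. this is an illustrative re-implementation, not the code used in the paper.

```python
import numpy as np

def classical_mds(m_d, k=3):
    # cmdscale-style classical mds: b = -0.5 * j * d^2 * j, then embed
    # via the k largest eigenvalue/eigenvector pairs of b
    d2 = np.asarray(m_d, dtype=float) ** 2
    n = d2.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n       # centring matrix
    b = -0.5 * j @ d2 @ j
    vals, vecs = np.linalg.eigh(b)
    order = np.argsort(vals)[::-1][:k]        # largest eigenvalues first
    vals_k = np.clip(vals[order], 0.0, None)  # drop tiny negative noise
    return vecs[:, order] * np.sqrt(vals_k)
```

for a genuinely euclidean input distance matrix, the pairwise distances among the embedded points reproduce the input exactly; for genetic distances the embedding is only an approximation, which is the usual caveat for the three-dimensional views compared with the force-directed graphs.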
force-directed graphs enable the visualization of complex data in two dimensions. the three-dimensional visualization produced from multidimensional scaling is visually richer, but this benefit can only be appreciated when a viewing application such as spotfire is available so that the three-dimensional image can be rotated. force-directed graphs convey much of the information in a single image which may be printed on a page or viewed on screen. this two-dimensional collapsing of sequence space also allows for easy simultaneous comparison of multiple datasets, in the present case multiple genera, which cannot readily be performed if separate three-dimensional viewers need to be open. the most common method of visualizing sequence space is the phylogenetic tree. for instance, starting from a distance matrix, agglomerative hierarchical clustering, such as the upgma method [29], can be performed to generate a tree. slightly more sophisticated methods, such as neighbour-joining [30], can generate trees where the branch lengths are proportional to genetic distance. force-directed graphs do not represent genetic distance as accurately as phylogenetic trees, since the distances between nodes, although optimized to reflect relatedness, are constrained by the fruchterman-reingold algorithm to the best representation in two dimensions. however, force-directed graphs again allow easier simultaneous comparison of several data sets than phylogenetic trees. fig. 1 would be impossible to create on a single page if trees were used instead of force-directed graphs. trees represent ancestral sequences as nodes on the tree, with only existing taxa as leaves. force-directed graphs, by contrast, allow ancestral sequences to be represented in the same way as existing ones. fig. 1c shows that ancestral sequences do not necessarily appear as outliers in force-directed graphs.
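the fruchterman-reingold placement described above can be sketched for a toy pairwise-distance matrix. this is a minimal numpy re-implementation of the force-directed idea (repulsion between all nodes, spring attraction toward target distances), not the software used in the paper; the function name and parameters are our own.

```python
import numpy as np

def fruchterman_reingold(dist, iters=200, seed=0):
    """Minimal Fruchterman-Reingold layout sketch: place n nodes in 2-d so
    that inter-node distances loosely track the pairwise matrix `dist`."""
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    pos = rng.standard_normal((n, 2))        # random initial coordinates
    k = float(dist.mean()) or 1.0            # characteristic spring length
    for step in range(iters):
        disp = np.zeros_like(pos)
        for i in range(n):
            delta = pos[i] - pos             # vectors from each node to node i
            d = np.linalg.norm(delta, axis=1)
            d[i] = 1.0                       # dummy self-distance, avoids /0
            d = np.maximum(d, 1e-3)
            unit = delta / d[:, None]
            # repulsion pushes i away; the spring term pulls toward dist[i][j]
            force = k ** 2 / d ** 2 - (d - dist[i]) / k
            force[i] = 0.0
            disp[i] = (force[:, None] * unit).sum(axis=0)
        # cap the displacement and cool it down over the iterations
        length = np.linalg.norm(disp, axis=1, keepdims=True)
        capped = disp / np.maximum(length, 1e-9) * np.minimum(length, k)
        pos += 0.1 * (1.0 - step / iters) * capped
    return pos
```

applied to a sequence-distance matrix, closely related sequences tend to cluster, which is the behaviour fig. 1 relies on.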
indeed, for genera flavivirus, hepacivirus, orthobunyavirus and orthohantavirus in particular, the insertion of the reconstructed ancestral sequence into the force-directed graph in fig. 1c does not overly distort its original shape in fig. 1a and b. the reason for this becomes apparent when one considers a phylogenetic tree represented in unrooted "star" format. the ancestral sequence is then at the centre of the star topology, and it can be seen that the genetic distance from the root to any particular leaf sequence may often be less than for many pairwise leaf sequence combinations. we did not perform calculation of centroid nearest neighbours (cnns) for alignments incorporating reconstructed ancestral sequences, but we are tempted to speculate that many of the ancestral sequences would have been cnns, had they been included. table 6: homology modelling the common ancestor for each genus. templates are as given in table 2. targets are the reconstructed ancestral rdrp (or reverse transcriptase for lentivirus) sequences. rmsd: root mean square deviation in angstroms between template and model when superposed in moe. good and poor quality are determined by the following thresholds: phi-psi outliers < 2%; qmean z-score > -4.00; rmsd < 2 å. table 7: mean model (or structure) quality. the top line shows the mean quality scores for the solved structures used. the other lines show the mean quality scores for the models produced at various levels of taxonomic distance between template and target. good and poor quality are determined by the following thresholds: phi-psi outliers < 2%; qmean z-score > -4.00; rmsd < 2 å. numbers in brackets indicate the revised scores if the model for imjin thottimvirus is moved out of the intra-genus category and into the intra-family category in the light of its subsequent transfer into the new genus thottimvirus. in the ramachandran plots of fig. 4, outliers are marked with a cross and labelled.
the z-score graphics show model quality on a sliding scale from low quality to high quality. qmean4 shows the overall z-score, "all atom" shows the average z-score for all of the atoms in the model, "cbeta" the z-score for all cβ carbons, "solvation" is a measure of how accessible the residues are to solvents, and "torsion" is a measure of the torsion angle for each residue compared to adjacent residues. it is important to remember that homology models are theoretical constructions, and caution must be exercised in treating them as input material for further experiments. among the various statistics for assessment of model quality, the phi-psi outlier percentage is a measure of the proportion of implausible dihedral angles in the model, and indicates where parts of the model backbone are likely to be incorrectly predicted. nevertheless, it is also important not to become too dependent on statistics such as the phi-psi outlier percentage, as "bad" angles do occasionally occur in solved structures. for instance, in the present study, the thresholds of <0.05% for a very high quality model and <2% for a good quality model given by lovell et al. [19] would suggest that six of the twelve template solved structures used here (table 2) would not have been assessed as "very high quality" had they been models rather than solved structures. indeed, the templates from indiana vesiculovirus and rotavirus a have more than 0.5% phi-psi outliers, and also have poor quality scores for qmean. these two structures also have the poorest resolution of any of our templates, at > 3 å. the poor quality scoring may therefore simply be a consequence of uncertainties in the positioning of atoms in these structures. one might reasonably posit that the use of template solved structures having such issues might influence the resulting models to contain the same outliers. however, the model for rotavirus i has a lower level of phi-psi outliers than its rotavirus a template (table 3).
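the three pass/fail thresholds applied throughout tables 3-7 (phi-psi outliers < 2%, qmean z-score > -4.00, rmsd < 2 å) can be expressed as a small helper. a minimal sketch; the function name and return format are our own, not the paper's.

```python
def assess_model(phi_psi_outlier_pct, qmean_z, rmsd):
    """Apply the three quality thresholds used in tables 3-7:
    phi-psi outliers < 2%, QMEAN z-score > -4.00, RMSD < 2 angstroms.
    Returns per-criterion verdicts plus an overall pass flag."""
    verdicts = {
        "phi_psi": phi_psi_outlier_pct < 2.0,
        "qmean":   qmean_z > -4.00,
        "rmsd":    rmsd < 2.0,
    }
    verdicts["overall"] = all(verdicts.values())
    return verdicts
```

for example, a model with 1.5% phi-psi outliers, a qmean z-score of -3.2 and an rmsd of 1.1 å passes all three criteria, while failing any single criterion fails the overall test.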
as might be expected, production of high quality models becomes more difficult as the genetic distance between target and template increases, as shown in tables 3-5. nevertheless, even at the level of template-target pairs in separate genera (table 4), the average performance is acceptable, as summarized in table 7. we therefore suggest that homology modelling may be used to produce rdrp models for research use even for genera where no solved structure exists, provided a template structure exists within the same family. here, we provide examples (table 4) of such successful inter-genus, intra-family models for genera coltivirus and parechovirus. our inter-genus models for lyssavirus and spumavirus are slightly less successful. moving to the next taxonomic level, models with template-target pairs in separate families (table 5) are generally less successful. one exception is our model for family phenuiviridae, which is better than some of the intra-family models. this is encouraging, since phenuiviridae is a family without any solved rdrp structure. homology models have been produced at much larger taxonomic distances than those dealt with here, for instance from bacteria to eukaryotes [31], so it should be stressed that we make no claim for the generality of our findings outside of the viral orders under consideration, or for proteins other than rdrp. multi-domain proteins, in particular, may produce higher quality models for some domains than others. one surprising result was the high quality of the models of reconstructed ancestral sequences (table 6, summarized in table 7). as previously discussed, this may be due to the fact that the ancestral sequence is, assuming a regular molecular clock, potentially equally related to all descendant members of its genus. in this paper, we calculated centroid nearest neighbours (cnns) as the central points in sequence space for each genus (fig. 1).
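the rmsd values reported in tables 3-6 are computed after optimal superposition of model onto template; the paper performed superpositions in moe, so the following kabsch-algorithm sketch in numpy only illustrates the underlying calculation.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (n, 3) coordinate sets after optimal superposition
    (Kabsch algorithm): centre both, rotate Q onto P, then compute RMSD."""
    P = P - P.mean(axis=0)                   # remove translation
    Q = Q - Q.mean(axis=0)
    U, _, Vt = np.linalg.svd(Q.T @ P)        # covariance SVD
    d = np.sign(np.linalg.det(U @ Vt))       # guard against improper rotation
    R = U @ np.diag([1.0, 1.0, d]) @ Vt      # optimal rotation matrix
    diff = Q @ R - P
    return float(np.sqrt((diff ** 2).sum() / len(P)))
```

a rotated and translated copy of a structure gives an rmsd of essentially zero, which is the sanity check this calculation must pass.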
a reconstructed ancestral sequence may also be considered as a candidate central point. the value of central points is that they may serve as targets that could be used to make models representative of their genus as a whole. for instance, the shortest-path, mean and median cnns of genus orthohantavirus are sequences 16, 22 and 7 (see the supplementary table for a list of sequences for each genus), representing sin nombre orthohantavirus, rockport orthohantavirus and cao bang orthohantavirus respectively. the partial solved structure used as the template for modelling in the genus orthohantavirus in the present paper is from hantaan orthohantavirus (5ize, see table 2), and the target used, imjin thottimvirus (sequence 27 in the orthohantavirus panel of fig. 1), is now classified as belonging to a new genus thottimvirus (table 3). the three cnns, sin nombre orthohantavirus, rockport orthohantavirus and cao bang orthohantavirus, are 71%, 64% and 75% identical to 5ize respectively, whereas imjin thottimvirus is only 58% identical. the latter was of course chosen to test the effectiveness of intra-genus homology modelling over as wide a genetic distance as possible (see section 2.5). for the performance of subsequent experimental procedures on orthohantavirus rdrps, for instance docking to discover novel anti-viral compounds, a homology model corresponding to one of the three cnns mentioned above or to the reconstructed ancestor (table 6) would be the preferred target, along with the existing solved structure. where a solved rdrp structure exists in a genus, it should be used. however, if that solved structure is not a cnn, a homology model of a cnn or ancestral sequence should be produced for comparative purposes. where no solved rdrp structure exists in a genus, a structure from another genus in the same family may be used.
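the mean and median cnn calculations referred to above can be sketched directly from a pairwise distance matrix. the function name is ours, and the shortest-path variant is omitted; this only illustrates the "central sequence" idea.

```python
import numpy as np

def centroid_nearest_neighbours(dist):
    """Identify central sequences from a pairwise distance matrix: the
    sequence minimising the mean distance to all others, and the one
    minimising the median distance (two of the CNN variants in the text)."""
    n = dist.shape[0]
    off = ~np.eye(n, dtype=bool)                 # ignore self-distances
    means = np.array([dist[i][off[i]].mean() for i in range(n)])
    medians = np.array([np.median(dist[i][off[i]]) for i in range(n)])
    return int(means.argmin()), int(medians.argmin())
```

run on a genus alignment's distance matrix, the returned indices pick out representative sequences of the kind used as preferred modelling targets above.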
on the basis of our investigations, we recommend a procedural flowchart for the selection of an rdrp structure for further study, for instance docking to discover novel anti-viral compounds, in any rna virus genus of interest (fig. 5). where a solved structure exists within a genus, it is the obvious choice for further experiments. however, where that solved structure is far from any of the cnn sequences of the genus, as judged by the force-directed graph, a cnn may also be homology modelled for comparative purposes, using the existing solved structure as a template. any differential performance of the solved structure and the homology model in, for instance, a docking experiment, may give clues as to the generality of conclusions derived from the solved structure alone. a reconstructed ancestral rdrp may also be used as an alternative to, or in addition to, a cnn. the limits of homology modelling would appear, on the basis of the results presented here, to be at the intra-family, inter-genus level. template-target pairs in different viral families are unlikely to be of practical use, as the predicted quality of the resulting models is low. our models were produced using moe, and we have not performed comparisons using other modelling tools, such as swiss-model [31] or modeller [32]. we feel that it is unlikely that significant differences in output would be produced, but when the object of the exercise is drug discovery, we recommend that the protocol in fig. 5 be implemented using several alternative modelling software packages. crystallographic structural genome projects are badly needed to close the sequence-structure gap. in the meantime, systematic attempts to fill the gaps via homology modelling may be useful. however, for many taxa, namely all of the order nidovirales and much of mononegavirales, the paucity of solved structures to act as templates remains a serious obstacle.
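our reading of the fig. 5 flowchart can be sketched as a decision function. this is a hypothetical simplification, with argument names and return strings of our own choosing, not the paper's exact protocol.

```python
def choose_rdrp_structure(genus_has_solved, solved_is_cnn, family_has_solved):
    """One reading of the fig. 5 flowchart: pick the experimental targets
    for an RdRp docking study in a genus of interest."""
    if genus_has_solved:
        targets = ["solved structure"]
        if not solved_is_cnn:
            # solved structure lies far from the genus centre: add a CNN or
            # ancestral-sequence homology model for comparative purposes
            targets.append("homology model of CNN or ancestral sequence")
        return targets
    if family_has_solved:
        # inter-genus, intra-family modelling was found acceptable (table 4)
        return ["homology model from intra-family template"]
    # inter-family templates gave low-quality models (table 5)
    return ["no reliable structure; await crystallography"]
```

the function mirrors the narrative above: a solved in-genus structure always wins, a comparative model is added only when that structure is not central, and inter-family modelling is ruled out.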
all code, inputs and outputs are available from: https://doi.org/10.17635/lancaster/researchdata/276.

references:
1. a three-dimensional model of the myoglobin molecule obtained by x-ray analysis
2. protein modeling: what happened to the "protein structure gap"?
3. the high throughput sequence annotation service (ht-sas) - the shortcut from sequence to true medline words
4. the evolution and emergence of rna viruses
5. crystal structure of the full-length japanese encephalitis virus ns5 reveals a conserved methyltransferase-polymerase interface
6. molecular docking revealed the binding of nucleotide/nucleoside inhibitors to zika viral polymerase solved structures
7. using bioinformatics tools for the discovery of dengue rna-dependent rna polymerase inhibitors
8. epidemiological characteristics of human-infective rna viruses
9. the rcsb protein data bank: integrative view of protein, gene and 3d structural information
10. reference sequence (refseq) database at ncbi: current status, taxonomic expansion, and functional annotation
11. mafft: iterative refinement and additional methods
12. mega7: molecular evolutionary genetics analysis version 7.0 for bigger datasets
13. muscle: multiple sequence alignment with high accuracy and high throughput
14. network visualizations of relationships in psychometric data
15. graph drawing by force-directed placement
16. some properties of classical multidimensional scaling
17. protonate 3d: assignment of ionization states and hydrogen coordinates to macromolecular structures
18. the generalized born/volume integral implicit solvent model: estimation of the free energy of hydration using london dispersion instead of atomic surface area
19. structure validation by calpha geometry: phi, psi and cbeta deviation
20. on the accuracy of homology modeling and sequence alignment methods applied to membrane proteins
21. qmean: a comprehensive scoring function for model quality assessment
22. toward the estimation of the absolute quality of individual protein structure models
23. evolutionary trees from dna sequences: a maximum likelihood approach
24. fastml: a web server for probabilistic reconstruction of ancestral sequences
25. sars and mers: recent insights into emerging coronaviruses
26. taxonomy of the order mononegavirales: second update
27. emerging phleboviruses
28. taxonomy of the order bunyavirales: second update
29. construction of phylogenetic trees for proteins and nucleic acids: empirical evaluation of alternative matrix methods
30. the neighbor-joining method: a new method for reconstructing phylogenetic trees
31. swiss-model and the swiss-pdbviewer: an environment for comparative protein modeling
32. modeller: generation and refinement of homology-based protein structure models

supplementary data to this article can be found online at https://doi.org/10.1016/j.jmgm.2019.07.014.

key: cord-297517-w8cvq0m5 authors: toğaçar, mesut; ergen, burhan; cömert, zafer title: covid-19 detection using deep learning models to exploit social mimic optimization and structured chest x-ray images using fuzzy color and stacking approaches date: 2020-05-06 journal: comput biol med doi: 10.1016/j.compbiomed.2020.103805 sha: doc_id: 297517 cord_uid: w8cvq0m5

coronavirus causes a wide variety of respiratory infections, and it is an rna-type virus that can infect both humans and animal species. it often causes pneumonia in humans. artificial intelligence models have been helpful for successful analyses in the biomedical field. in this study, coronavirus was detected using a deep learning model, which is a sub-branch of artificial intelligence. our dataset consists of three classes, namely: coronavirus, pneumonia, and normal x-ray imagery. in this study, the data classes were restructured using the fuzzy color technique as a preprocessing step, and the structured images were stacked with the original images. in the next step, the stacked dataset was trained with deep learning models (mobilenetv2, squeezenet), and the feature sets obtained by the models were processed using the social mimic optimization method.
thereafter, efficient features were combined and classified using support vector machines (svm). the overall classification rate obtained with the proposed approach was 99.27%. with the proposed approach in this study, it is evident that the model can efficiently contribute to the detection of covid-19 disease. the new coronavirus (covid-19) is an acute, deadly disease that originated in wuhan province, china, in december 2019 and spread globally. the covid-19 outbreak has been of great concern to the health community because no effective cure has yet been discovered [1]. the biological structure of covid-19 comprises a positive-sense single-stranded rna, and the disease is difficult to treat owing to the virus's mutating feature. medical professionals globally are conducting intensive research to develop an effective cure for the disease. presently, covid-19 is the primary cause of thousands of deaths globally, with major death tolls in the usa, spain, italy, china, the uk, iran, etc. many types of coronavirus exist, and these viruses are commonly seen in animals. covid-19 has been discovered in humans, bats, pigs, cats, dogs, rodents, and poultry. symptoms of covid-19 include sore throat, headache, fever, runny nose, and cough. the virus can provoke the death of people with weakened immune systems [2,3]. covid-19 is transmitted from person to person, mostly by physical contact. generally, healthy people can be infected through breath, hand, or mucous contact with people carrying covid-19 [4]. recently, artificial intelligence (ai) has been widely used to accelerate biomedical research. using deep learning approaches, ai has been applied in many tasks such as image detection, data classification, and image segmentation [5,6]. people infected by covid-19 may suffer from pneumonia because the virus spreads to the lungs. many deep learning studies have detected the disease using chest x-ray image data [7].
a previous study [8] classified pneumonia x-ray images using three different deep learning strategies: a fine-tuned model, a model without fine-tuning, and a model trained from scratch. using the resnet model, they classified the dataset across multiple labels such as age and gender. they also used a multi-layer perceptron (mlp) as the classification method and achieved an average accuracy of 82.2%. samir yadav et al. [9] performed classification on pneumonia data, using svm as the classification method and the inceptionv3 and vgg-16 models as the deep learning approach. in their study, the dataset was divided into three classes, normal, bacterial pneumonia, and viral pneumonia, and the contrast, brightness, and zoom settings of each image in the dataset were varied with the augmentation method. the best classification achievement was 96.6%. rahib abiyev et al. [10] used backpropagation neural network and competitive neural network models to classify pneumonia data. using pneumonia and normal chest x-ray images, they set 30% of the dataset aside as test data and compared the proposed approach with existing cnns. they achieved 89.57% classification success. okeke stephen et al. [11] proposed a deep learning model trained from scratch to classify pneumonia data. their proposal consists of convolution layers, dense blocks, and flatten layers. the input size of the model was 200 × 200 pixels, and the classification probabilities were determined using the sigmoid function. the success rate in detecting pneumonia from x-ray images was 93.73%. vikash chouhan et al. [12] detected images of pneumonia using deep learning models, with three dataset classes: normal, viral pneumonia, and bacterial pneumonia images. they first carried out a set of preprocessing procedures to remove noise from the images. they then applied the augmentation technique to each image and used transfer learning to train the models.
the overall classification accuracy was 96.39%. in this study, we used a covid-19 chest image dataset, pneumonia chest images, and normal chest images. we preprocessed each image before training with the deep learning models. in this preprocessing, the dataset was reconstructed using the fuzzy color technique and the stacking technique. then, we trained the three datasets using the mobilenetv2 and squeezenet deep learning models and classified the models' outputs with the svm method. the remainder of this study is structured as follows: in section 2, we discuss the structure of the dataset, the techniques, methods, deep learning models, and optimization algorithm. experimental analysis is presented in section 3. sections 4 and 5 present the discussion and conclusion, respectively. in the experimental analysis, we use three classes of datasets that are publicly accessible. these classes are normal, pneumonia, and covid-19 chest images. all datasets are x-ray images, and each image was converted to jpg format. since covid-19 is a new disease, the number of images related to this virus is limited. for this study, we combined two publicly accessible databases containing covid-19 images. the first covid-19 dataset was shared on the github website by a researcher named joseph paul cohen from the university of montreal. after experts checked the images, they were made available to the public. in the joseph paul cohen dataset, the image types include mers, sars, covid-19, etc.; 76 images labeled covid-19 were selected for this study [13]. the second covid-19 dataset consists of images created by a team of researchers from qatar university, medical doctors from bangladesh, and collaborators from pakistan and malaysia. the second covid-19 dataset is available on the kaggle website, and the current version has 219 x-ray images [14]. for this study, the two datasets containing covid-19 images were combined, and a new dataset consisting of 295 images was created.
the covid-19 virus causes pneumonia in affected individuals and can provoke death if the lungs are permanently damaged [15]. the second dataset is important in this study for comparison against the covid-19 chest images using deep learning models. the second dataset consists of normal chest images and pneumonia chest images. the pneumonia chest images include both viral and bacterial types, and these images were taken from 53 patients. the images were curated by experts and shared publicly [16]. the combined dataset consists of three classes. information about the classes of the dataset and the number of images per class is as follows: we collected a total of 295 images for the covid-19 class; the normal class contains 65 x-ray images, and the pneumonia class contains 98 x-ray images. the total number of images in the dataset is 458. in the experimental analysis, 70% of the dataset was used as training data, and 30% was used as test data. in the last step of the experiment, the k-fold cross-validation method was used for the stacked images. sample images of the dataset are shown in fig. 1. mobilenet is a deep learning model intended for devices with low hardware cost. the mobilenetv2 model uses the relu function between layers [18]. thus, it enables the nonlinear outputs from the previous layer to be linearized and transferred as input to the next layer. the model continues its training process until a suitable stopping step is reached. in this model, the convolutional layers pass filters over the input images and create activation maps. the activation maps contain the features extracted from the input images, and these features are transferred to the next layer. pooling layers are also used in the mobilenetv2 model; the matrices obtained through these layers are reduced to smaller dimensions [19,20]. the mobilenetv2 model was used pre-trained, and the svm method was used in the classification phase.
besides, other important parameters of the mobilenetv2 model are given in table 1 and table 2. all default parameter values were used for the mobilenetv2 model without any change. squeezenet, a deep learning model with an input size of 224 × 224 pixels, comprises convolutional layers, pooling layers, relu, and fire layers. squeezenet does not have fully connected or dense layers; instead, the fire layers perform the functions of these layers. the major benefit of this model is that it performs analyses successfully while reducing the number of parameters, thereby decreasing the model's size. the squeezenet model produced successful results with approximately 50 times fewer parameters than the alexnet model, thereby reducing the cost of the model [21]. fig. 3 presents the model design. in the expansion part, the depth is increased [21,22]. table 3 presents the layers and default parameter values of the model; these values are used without changes. other important parameters of the squeezenet model are given in table 4. the svm method separates the feature classes with a hyperplane, as expressed in eq. (2) and eq. (3): the hyperplane is defined by w · x + b = 0, and each sample must satisfy y_i (w · x_i + b) ≥ 1. here, x_i and y_i represent the coordinate points and labels of the features relative to the hyperplane, w determines the margin width, and b represents the bias value [23,24]. deep learning models use optimization methods, which facilitate the learning trends of the models. stochastic gradient descent (sgd) is an optimization method that updates the weight parameters in the model structure at every iteration, so the model trains better during each iteration. however, sgd does not use all the images input into the model while updating the parameters; it performs this operation using only randomly selected images. this lowers the cost of the model and offers a faster training process. eq. (4) shows the function that performs the weight parameter updates in sgd: w ← w − α ∇l(w; x_i, y_i), where w represents the weight parameter.
here, x_i and y_i represent the features extracted from the input images, and α represents the learning rate [25]. in this study, the svm method was preferred because: i. it has a strong potential to provide solutions to the data analysis problems encountered in daily life; ii. it is widely used for remote pattern recognition and classification problems, successfully executing multiclass classification processes [26,27]; and iii. it gives the best classification performance among the other machine learning methods (discriminant analysis, nearest neighbor, etc.). moreover, the linear svm was preferred owing to its best performance among the available kernels (cubic, linear, quadratic, etc.). the preferred parameter values in the linear svm method were as follows: the kernel scale parameter was selected automatically, a box constraint level value was chosen, and one-vs-one was selected as the multiclass method. the fuzzy concept accepts degrees of accuracy, with the next degree being uncertain. fuzzy color algorithms play an important role in image analysis, and the results obtained depend on the similarity/difference functions used for color separation. in the fuzzy color technique, each input image contains three input variables (red, green, and blue - rgb); as a result of this process, a single output variable is produced. the input and output values are determined according to the training data [28,29]. the logic behind the fuzzy color technique is to separate the input data into blurred windows. each pixel in the image has a membership degree for each window, and membership degrees are calculated based on the distance between the window and the pixel. the image variance is obtained from the membership degrees. the final step of the fuzzy color technique is to create the output image. in this step, the weighted images of each blurred window are summed, and the output image is created from their average.
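the svm margin constraint and the sgd weight update discussed above can be illustrated together with a toy linear svm trained by stochastic gradient descent on the hinge loss. this is a minimal numpy sketch with parameters of our own choosing, not the matlab classification-learner pipeline used in the paper.

```python
import numpy as np

def train_linear_svm(X, y, lr=0.05, lam=0.01, epochs=200, seed=0):
    """Toy linear SVM trained by SGD on the hinge loss. Each step updates
    the weights as w <- w - lr * gradient using a single randomly chosen
    sample, as in stochastic gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        for i in rng.permutation(n):
            margin = y[i] * (X[i] @ w + b)
            if margin < 1:                       # sample violates the margin
                w -= lr * (lam * w - y[i] * X[i])
                b += lr * y[i]
            else:                                # only regularisation acts
                w -= lr * lam * w
    return w, b

def predict(X, w, b):
    """Class labels from the sign of the decision function w.x + b."""
    return np.sign(X @ w + b)
```

on linearly separable data, the learned hyperplane w · x + b = 0 separates the two classes, which is the geometric picture behind eqs. (2)-(4).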
here, the weight value of each pixel is expressed as its degree of membership [28,30]. we recreated the original dataset using python code implementing the fuzzy color technique [31]. fig. 5 shows the structure of a resulting image. image stacking is a digital image processing technique that combines multiple images shot or reconstructed at different focal distances. it is used to improve the quality of the images in the dataset. this technique aims to eliminate noise from the original image by combining at least two images in a row, dividing the image into two parts: background and overlay. while the first image is processed in the background, the second is overlaid on the image placed in the background. here, parameters such as opacity, contrast, brightness, and the combining ratio of the two images are important: the more accurately these ratios are chosen, the more noise is reduced in the images and the higher the resulting quality [32]. in this study, python and the pillow library were used for the stacking technique [33]. here, the original dataset was stacked with the dataset reconstructed using the fuzzy technique; a successful result from the fuzzy technique therefore contributes to the success of the stacking technique. the parameter values preferred in the stacking technique were an opacity value of 0.6, a contrast value of 1.5, a brightness value of -80, and a combining ratio of 50%. these values can vary for other datasets; we evaluated various settings on the dataset images and determined that these values are the most efficient for this dataset, hence their use in the experimental analysis. moreover, the original dataset was placed in the background, and the structured dataset was placed in the overlay. a combined representation of the original dataset and the dataset structured with the fuzzy technique, merged using the stacking technique, is shown in fig. 6.
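the stacking arithmetic with the stated parameters (opacity 0.6, contrast 1.5, brightness -80) can be sketched in numpy. this restates only the blend mathematics under our own assumptions about how contrast and brightness are applied; the paper itself used the pillow library.

```python
import numpy as np

def stack_images(background, overlay, opacity=0.6, contrast=1.5, brightness=-80.0):
    """Approximate the stacking step: adjust the overlay's contrast and
    brightness, then alpha-blend it over the background at the given
    opacity. A numpy re-statement of the arithmetic, not the Pillow code."""
    bg = background.astype(float)
    ov = overlay.astype(float)
    # contrast: scale around mid-grey, then shift by the brightness offset
    ov = (ov - 128.0) * contrast + 128.0 + brightness
    out = (1.0 - opacity) * bg + opacity * ov    # alpha blending
    return np.clip(out, 0, 255).astype(np.uint8)
```

with the background set to the original x-ray and the overlay set to its fuzzy-color counterpart, this mirrors the background/overlay arrangement described above.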
emotions such as morale and happiness can be transferred from person to person. emotional imitation happens consciously or unconsciously; this behaviour is related to neurons in the human brain [34]. for this study, a population size of 20 was selected, and the maximum iteration parameter was set to 10. the best global value parameter was taken as 1000 at the start, and a percentage parameter value of 10 was chosen. the update rule of the smo algorithm moves each individual toward the globally best individual at every iteration: x_i(t+1) = x_i(t) + r · (x_best(t) − x_i(t)), i = 1, 2, …, n, where r is a random coefficient. the proposed approach is designed to perform the classification of chest images based on the dataset types; it separates chest images showing covid-19 infection from normal and pneumonia chest images. the original dataset is passed through the image preprocessing steps. in the first step, the original dataset was recreated with the fuzzy color technique, with the aim of removing the noise in the original images. in the second step, using the stacking technique, each fuzzy color image was combined with the corresponding original image, and a new dataset was created; the aim was to obtain higher-quality image data. two deep learning models were used, and the stacked dataset was trained with the mobilenetv2 and squeezenet deep learning models. using the smo algorithm, efficient features were selected from the 1000 features obtained by each model. the efficient features were combined, and the svm method, which produces successful results in multiclass classification, was used for the classification process. fig. 7 shows the overall design of the proposed approach. python 3.6 was used to structure the original dataset with the fuzzy technique and the stacking technique. besides, the smo algorithm was compiled in python, and detailed information about the source code and analyses used in this study is given at the web link specified in the open source code section. jupyter notebook was the interface program used to run the python code.
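a loose sketch of a social-mimic style feature selector is given below. it is only inspired by the description above, a population whose members imitate the behaviour of the best individual, and is NOT the published smo algorithm; the imitation scheme, function names, and parameters are our own assumptions.

```python
import random

def mimic_feature_selection(fitness, n_features, pop_size=20, iters=10,
                            max_features=800, seed=1):
    """Loose sketch of a social-mimic style selector: a population of binary
    column masks; each iteration, every individual copies a random 10% slice
    of the best individual's mask, 'imitating' its behaviour."""
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in range(n_features)] for _ in range(pop_size)]

    def clamp(mask):  # keep at most max_features selected columns
        idx = [i for i, m in enumerate(mask) if m]
        for i in idx[max_features:]:
            mask[i] = False
        return mask

    pop = [clamp(m) for m in pop]
    best = list(max(pop, key=fitness))
    chunk = max(1, n_features // 10)          # the 10% imitation portion
    for _ in range(iters):
        for mask in pop:
            start = rng.randrange(n_features)
            for i in range(start, min(start + chunk, n_features)):
                mask[i] = best[i]             # imitate the best individual
        pop = [clamp(m) for m in pop]
        cand = max(pop, key=fitness)
        if fitness(cand) > fitness(best):     # keep the best mask seen so far
            best = list(cand)
    return best
```

the fitness function would, in practice, score a mask by the accuracy an svm achieves on the selected feature columns; any callable scoring a mask works here.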
for the deep learning models, matlab (2019b) software was used for classification. the hardware used to compile the software was a windows 10 (64-bit) system with a 1 gb graphics card, 4 gb of memory, and an intel i5 2.5 ghz processor. the experiment consists of three steps. in the first step, 30% of the dataset was used as test data and the remaining 70% as training data; in the steps involving the stacked dataset, the k-fold cross-validation method was used. the svm method was used as the classifier in the last layers of the squeezenet and mobilenetv2 models. fig. 8 shows the training success graphs of the three stages performed with the squeezenet model, and fig. 9 shows the confusion matrices. the results of the experimental analysis are given in table 5. to confirm the validity of these analyses, we conducted a new analysis using another deep learning model, mobilenetv2. in the first stage of the mobilenetv2 experiments, we trained the original dataset, and the overall accuracy rate obtained with the svm method was 96.32%. in the second stage, the dataset structured with the fuzzy color technique was classified, and the overall accuracy rate obtained in this classification was 97.05%. in the third stage, the model was trained with the stacked dataset, and the overall accuracy rate of the classification was 97.06%. fig. 10 shows the training success graphs of the three stages performed with the mobilenetv2 model, and fig. 11 shows the confusion matrices. the results of the experimental analysis of this model are given in table 6. in the second step of the experiment, the k-fold cross-validation method was used for the stacked dataset, classified using the svm method. in the first step of the experiment, 30% of the dataset had been used as test data; to confirm the validity of the results of the first step, the dataset was partitioned using the k-fold cross-validation method. for both cnn models, the k-fold value was set to five.
in the second step, the overall accuracy rate achieved with the squeezenet model was 95.85%. in the first step of this model, the classification rate with the stacked dataset (with 30% test data) was 97.06%. in the analyses performed in the second step, the squeezenet model therefore produced stable results across both steps. in the second step, the overall accuracy rate obtained from the mobilenetv2 model trained with the stacked dataset was 96.28%. in the first step of the mobilenetv2 model, the classification rate achieved with the stacked dataset (with 30% test data) was 97.06%. fig. 12 shows the confusion matrices of the analysis performed in the second step, and table 7 gives the metric values obtained from the confusion matrices. as a result, all analyses performed in the second step produced stable results compared to the results obtained in the first step. the results obtained by the 5-fold cross-validation confirmed the reliability of the proposed approach compared to the results obtained from the previous step. in the third step, the smo algorithm was applied to the feature sets obtained from the stacked dataset trained by the cnn models. the dataset contains 1000 features, extracted by the two cnn models. the dataset was obtained from matlab as a file with a '*.mat' extension. these feature sets were obtained from the pool10 layer in the squeezenet model and the logits layer in the mobilenetv2 model. choosing efficient features with the smo algorithm yields a total of 800 column numbers, as we set the maximum feature selection to 800; some of these column numbers are repeated by the smo algorithm, so the number of distinct feature columns selected per cnn model is less than 800. by applying an efficient optimization method to as few features as possible, this step aimed to contribute to the classification success. the third step consists of four stages.
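the selection bookkeeping just described — an 800-column cap with repeated column numbers, leaving fewer distinct columns per model — can be sketched as follows (illustrative only, not the paper's implementation):

```python
def distinct_columns(proposed, cap=800):
    """Keep the first occurrence of each proposed column index, up to `cap`."""
    seen, kept = set(), []
    for col in proposed[:cap]:
        if col not in seen:
            seen.add(col)
            kept.append(col)
    return kept

proposed = [5, 12, 5, 7, 12, 3]        # repeated indices, as SMO may produce
print(distinct_columns(proposed))       # [5, 12, 7, 3]
print(len(distinct_columns(list(range(1000)))))  # 800: the cap binds
```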
in the first stage, the smo algorithm was applied to the 1000-feature set obtained by training the stacked dataset using the squeezenet images. fig. 13 shows the confusion matrices of the analysis performed in the third step of the experiment, and table 8 gives the values of the metric parameters. in the proposed approach, the contribution of the smo algorithm to the improvement of the classification performance was recorded. the codes and analysis results of the smo algorithm are available at the web address given in the open source code section; the feature sets obtained in the experiment and the related source codes are also available there. the number of confirmed covid-19 cases has exceeded millions worldwide, with thousands of confirmed deaths. the world health organization has declared that covid-19 is a global epidemic [37] . using the proposed approach, we performed the detection of covid-19 from the x-ray image data. we compared covid-19 chest data with pneumonia and normal chest data, since pneumonia is one of the symptoms of covid-19. the major challenge we encountered is that the publication of covid-19 images is still limited. moreover, previous studies on the detection of covid-19 using deep learning were non-existent; hence, we fill this gap in the literature. although a limited dataset is used, this study contributes to its classification by using image preprocessing to determine the data classes. other techniques can also be used instead of the fuzzy color technique. we paid attention to the similarity between the structured image and the original stack image. had we configured the images in a different format (such as a different resolution or color pixel structure), we would not have achieved the success reported in this study. the advantages of the proposed approach are as follows: • it provides a 100% success rate in detecting the disease by examining the x-ray images of covid-19 patients.
• the analysis can be carried out using ai, and the proposed approach can be integrated into portable smart devices (mobile phones, etc.). • the deep learning models (mobilenetv2 and squeezenet) used in the proposed approach have fewer parameters compared to other deep models, which helps to gain speed and time performance. besides, using the smo algorithm, the cnn models save time during the process. • it minimizes the interference in every image in the dataset and provides efficient features with the stacking technique. the disadvantages of the proposed approach are as follows: • if the sizes of the input images in the dataset are different, a complete success may not be achieved. irrespective of the resize parameter, it is still a challenge for the proposed approach to deal with very low-resolution images. • in the stacking technique, the resolution dimensions of the original images and the structured images must be the same. we ensured that the number of pneumonia and normal chest images is close to that of the covid-19 chest images, since we presume that image classes present in unbalanced numbers cannot contribute to the success of the model; with this setup we achieved an overall 99.27% accuracy in the classification process. in the proposed model, the end-to-end learning scheme has been exploited, which is one of the great advantages of cnn models. the pathologic patterns were detected and identified by using the activation maps that kept the discriminative features of the input data. in this manner, the tedious and labor-intensive feature extraction process was avoided, and a highly sensitive decision tool was ensured. people infected with covid-19 are likely to suffer permanent damage in the lungs, which can later provoke death. this study aimed to distinguish people with lungs damaged by covid-19 from normal individuals or from those with pneumonia (not infected by covid-19). the detection of covid-19 was carried out using deep learning models.
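overall accuracy figures such as the 99.27% quoted above are computed from confusion matrices; a minimal computation on an invented 3-class matrix (all counts hypothetical, not taken from the paper's tables):

```python
def overall_accuracy(confusion):
    """Fraction of correctly classified samples: trace over total count."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

# rows = true class, columns = predicted class
# classes: covid-19, normal, pneumonia (counts are made up)
cm = [
    [95, 3, 2],
    [1, 97, 2],
    [2, 1, 97],
]
print(round(overall_accuracy(cm), 4))  # 0.9633
```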
since it is important to detect covid-19, which spreads rapidly and globally, ai techniques are used to perform this task accurately and quickly. one of the novelty aspects of the proposed approach is to apply preprocessing steps to the images. when preprocessing steps are used, more efficient features are extracted from the image data. with the stacking technique, each pixel of equivalent images is superimposed, and the pixels with low efficiency are enhanced. with the proposed approach, efficient features were extracted using the smo algorithm; the model was thus aimed at producing faster and more accurate results. another innovative aspect is that the feature sets obtained with smo are combined to improve the classification performance. we also demonstrate the usability of our approach on smart mobile devices with the mobilenetv2 model, as the analysis can be run on mobile devices without using any hospital equipment. a 100% success rate was achieved in the classification of the covid-19 data, and a 99.27% success rate was achieved in the classification of the normal and pneumonia images. in future studies, deep learning-based analyses will be carried out using image data of other organs affected by the virus, in line with the views of covid-19 specialists. we plan to develop a future approach using different structuring techniques to enhance the datasets. once data related to the factors influencing the virus in human chemistry (e.g. blood group, rna sequence, age, gender, etc.) become available, we will produce a solution-oriented analysis using ai. information about the python and matlab software source codes, datasets, and related analysis results used in this study is given in this web link: https://github.com/mtogacar/covid_19 there is no funding source for this article. this article does not contain any data, or other information from studies or experimentation, with the involvement of human or animal subjects. the authors declare that there is no conflict of interest related to this paper.
transmission of 2019-ncov infection from an asymptomatic contact in germany editorial covid-19 : too little , too late ? coronavirus disease 2019 (covid-19): a guide for uk gps transmission routes of 2019-ncov and controls in dental practice application of breast cancer diagnosis based on a combination of convolutional neural networks, ridge regression and linear discriminant analysis using invasive breast cancer images processed with autoencoders recent progress in semantic image segmentation identifying pneumonia in chest x-rays: a deep learning approach comparison of deep learning approaches for multi-label chest x-ray classification deep convolutional neural network based medical image classification for disease diagnosis deep convolutional neural networks for chest diseases detection an efficient deep learning approach to pneumonia classification in healthcare a novel transfer learning based approach for pneumonia detection in chest x-ray images covid-19 chest x-ray dataset or ct dataset covid-19 radiography database radiology perspective of coronavirus disease 2019 (covid-19): lessons from severe acute respiratory syndrome and middle east respiratory syndrome pneumonia sample x-rays, github mobilenetv2: inverted residuals and linear bottlenecks mobilenetv2: the next generation of on-device computer vision networks brainmrnet: brain tumor detection using magnetic resonance images with a novel convolutional neural network model a dual-path and lightweight convolutional neural network for high-resolution aerial image segmentation a deep-learning-based approach for fast and robust steel surface defects classification real-time vehicle make and model recognition with the residual squeezenet architecture a multiseed-based svm classification technique for training sample reduction multiple kernel-based svm classification of hyperspectral images by combining spectral, spatial, and semantic information stochastic gradient descent and its variants in machine learning 
support vector machines for classification bt -efficient learning machines: theories, concepts, and applications for engineers and system designers a unified view on multi-class support vector classification prediction of wood density by using red-green-blue (rgb) color and fuzzy logic techniques hybrid filter based on fuzzy techniques for mixed noise reduction in color images color comparison in fuzzy color spaces, fuzzy sets syst fuzzy color image enhancement algorithm, github focus stacking technique in identification of forensically important chrysomya species (diptera: calliphoridae), egypt image stack: simple code to load and process image stacks social mimic optimization algorithm and engineering applications çınar, a new approach for image classification: convolutional neural network fusing fine-tuned deep features for recognizing different tympanic membranes coronavirus disease 2019 key: cord-277237-tjsw205c authors: hernandez vargas, esteban abelardo; velasco-hernandez, jorge x. title: in-host modelling of covid-19 kinetics in humans date: 2020-03-30 journal: nan doi: 10.1101/2020.03.26.20044487 sha: doc_id: 277237 cord_uid: tjsw205c covid-19 pandemic has underlined the impact of emergent pathogens as a major threat for human health. the development of quantitative approaches to advance comprehension of the current outbreak is urgently needed to tackle this severe disease. in this work, several mathematical models are proposed to represent covid-19 dynamics in infected patients. considering different starting times of infection, parameter sets that represent the infectivity of covid-19 are computed and compared with those of other viral infections that can also cause pandemics.
based on the target cell model, the covid-19 infecting time between susceptible cells (mean of 30 days approximately) is much slower than those reported for ebola (about 3 times slower) and influenza (60 times slower). the within-host reproductive number for covid-19 is consistent with the values of influenza infection (1.7-5.35). the best model to fit the data was the one including immune responses, which suggests a slow cell response peaking between 5 to 10 days post onset of symptoms. the model with eclipse phase, time in a latent phase before becoming productively infected cells, was not supported. interestingly, both the target cell model and the model with immune responses predict that the virus may replicate very slowly in the first days after infection, and it could be below detection levels during the first 4 days post infection. a quantitative comprehension of covid-19 dynamics and the estimation of standard parameters of viral infections is the key contribution of this pioneering work. there is, however, none so far at the within-host level to understand the covid-19 replication cycle ( fig.1 ) and its interactions with the immune system. among several approaches, the target cell model has served to represent several diseases such as hiv [7] [8] [9] [10] , hepatitis virus [11, 12] , ebola [13, 14] , influenza [15] [16] [17] [18] , among many others. a detailed reference for viral modelling can be found in [19] . very recent data from infected patients with covid-19 have shed light on the within-host viral dynamics. zou et al. [20] presented the viral load in nasal and throat swabs of 17 symptomatic patients. interestingly, covid-19 replication cycles may last longer than flu, about 10 days or more after the incubation period [4, 20] . here, we contribute to the mathematical study of covid-19 dynamics at the within-host level based on data presented by wolfel et al. [21] .
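the exponential growth/decay fits used on such viral-load data reduce, for noise-free values, to a log-linear regression; a minimal sketch on synthetic, made-up measurements (the authors instead minimize an rms cost with a differential evolution routine):

```python
import math

def log_linear_rate(times, loads):
    """Least-squares slope of ln(load) vs. time: the exponential rate."""
    n = len(times)
    mt = sum(times) / n
    my = sum(math.log(v) for v in loads) / n
    num = sum((t - mt) * (math.log(v) - my) for t, v in zip(times, loads))
    den = sum((t - mt) ** 2 for t in times)
    return num / den

days = [0, 1, 2, 3]
loads = [1e2 * math.exp(1.5 * t) for t in days]  # synthetic growth, rate 1.5/day
print(round(log_linear_rate(days, loads), 3))    # 1.5
```

the same helper estimates a decay rate when the loads shrink (the slope simply comes out negative).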
using ordinary differential equations (odes), different mathematical models are presented to adjust the viral kinetics reported by woelfel et al. [21] in infected patients with covid-19. the viral load data [21] were fitted by minimizing the cost function (14) with the differential evolution (de) algorithm [22] . (this preprint, which was not certified by peer review, is made available under a cc-by-nc-nd 4.0 international license; the copyright holder is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. this version was posted march 30, 2020. https://doi.org/10.1101/2020.03.26.20044487) the viral dynamics are divided into two parts, exponential growth (v_g) and decay (v_d), modelled by equations (1) and (2), respectively: v_g(t) = v_g(0) e^(ρt) (1) and v_d(t) = v_d(0) e^(−ηt) (2). viral growth is assumed to start at the onset of symptoms, with initial viral concentration v_g(0). the parameter ρ is the growth rate of the virus. the parameter η quantifies the decay rate of the virus, while v_d(0) is the initial value of the virus in the decay phase. note that the growth phase of the virus was measured only in two patients (a and b) [21] . fig. 2 caption: exponential growth and decay model for covid-19. continuous lines are simulations based on (1) for viral exponential growth (v_g) or on (2) for viral decay (v_d). blue circles represent the data from [21] . the viral growth rate (ρ) was computed only for patients a (till day 6) and b (till day 4), while the rest of the patients are missing these measurements. for all patients the viral decay rate η in (2) is computed. simulations are shown in fig.2 . table 1 .
estimations for the model (1)-(2) using experimental data from [21] . for the exponential growth phase there were measurements only for patients a and b; the rest of the patients had measurements mostly in the logarithmic decay phase. this is the reason why patients a and b are the only ones with reported estimations of viral growth. host cells can be in one of the following states: susceptible (u) and infected (i). viral particles (v) infect susceptible cells with a rate β ((copies/ml)^−1 day^−1). once cells are productively infected, they release virus at a rate p (copies/ml day^−1 cell^−1), and virus particles are cleared with rate c (day^−1). infected cells are cleared at rate δ (day^−1) as a consequence of cytopathic viral effects and immune responses. coronaviruses mainly infect differentiated respiratory epithelial cells [25] . a previous mathematical model for influenza [17] considered about 10^7 initial target cells (u(0)). initial values for infected cells (i(0)) are taken as zero. v(0) is determined from the estimations in table 1 . note that v(0) cannot be measured as it is below detectable levels (about 100 copies/ml) [21] . viral kinetics are measured after the onset of symptoms [21] ; however, it is unknown when the initial infection took place. patients infected with mers-cov in [26] showed that the virus peaked during the second week of illness, which indicated that the median incubation period was 7 days (range, 2 to 14) [26] . for parameter fitting purposes, we explore three different scenarios of the initial infection day (t_i), that is, -14, -7, -3 days before the onset of symptoms for patients a and b; see fig. 3 . fig. 3 caption: blue circles represent the data from [21] .
due to the most complete data sets in [21] being from patients a and b, these are the only ones presented in panels (a) and (b), respectively. infection time was assumed at -14, -7 and 0 days post symptom onset. infectivity can be defined as the ability of a pathogen to establish an infection [27] . to quantify infectivity, the within-host reproductive number (r_0) was computed. r_0 is defined as the expected number of secondary infections produced by an infected cell [28] . when r_0 < 1, one infected individual can infect less than one individual; thus, the infection would be cleared from the population. otherwise, if r_0 > 1, the pathogen is able to invade the target cell population. this epidemiological concept has been applied to the target cell model (3)-(5); previous studies [13, 29, 30] provided estimates of the infecting time (t_inf), which represents the time required for a single infectious cell to infect one more cell. viruses with a shorter infecting time have a higher infectivity [29, 30] . from equations (3)-(5), t_inf can be explicitly computed as t_inf = sqrt(2/(p β u(0))). assuming the day of infection at day 0 post symptom onset (pso) would result in very high reproductive numbers (r_0) and a high infection rate (β) for patients a and b, as presented in table 2 . alternatively, assuming the initial day of infection is either day -14 or -7 pso, the rate of infection of susceptible cells (β) would be slow but associated with a high replication rate (p). a further version of the model includes an eclipse phase (e), the time spent in a latent phase before becoming productively infected cells (i) [29, 31] . in this formulation, cells in the eclipse phase (e) become productively infected at rate k. holder et al.
[29] proposed this formulation for influenza; here, however, the eclipse phase did not improve the fitting with respect to the target cell model ( table 2 ), even when very long eclipse phase periods are assumed (e.g. 100 days), implying that this mechanism could be negligible in covid-19 infection. mathematical model with immune response. previous studies have acknowledged the relevance of the immune t-cell response to clear influenza [17, [32] [33] [34] [35] [36] . due to identifiability limitations for the estimation of the parameters of the target cell model using viral load data, a minimalistic model was derived in [37, 38] to represent the interaction between the viral and immune response dynamics. the model assumes that the virus (v) level induces the proliferation of t cells (t), as described by equations (12)-(13). this model gives a better fitting than the previous models (fig.2-4) . furthermore, aicc values for patients a and b highlight that t_i = −15 dpso gives the best fitting. for presentation purposes, only the numerical results for patients a and b are portrayed in fig.4 . the summary of the fitting procedures at t_i = −15 dpso is presented in table 3 . independently of the starting infection day, the immune response by t cells peaks between 5 to 10 dpso. interestingly, the longer the period between the infection time and the onset of symptoms, the higher the immune response. fig. 4 caption: simulations of model (12)-(13) . blue circles represent the data from [21] . due to the most complete data sets in [21] being from patients a and b, these are the only ones presented in panels (a) and (b), respectively.
infection time was assumed at -14, -7 and 0 days post symptom onset.
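a minimal forward simulation of the target-cell structure described in the text (du/dt = −βuv, di/dt = βuv − δi, dv/dt = pi − cv), using simple euler stepping; all parameter values here are hypothetical, chosen only so the qualitative behaviour — a viral peak followed by decay — is visible:

```python
def simulate_target_cell(u0=1e4, i0=0.0, v0=1.0,
                         beta=5e-4, delta=1.0, p=2.0, c=1.0,
                         dt=1e-3, days=30.0):
    """Euler integration of the target cell model; returns (t, u, v) samples."""
    u, i, v = u0, i0, v0
    trace = [(0.0, u, v)]
    for n in range(1, int(days / dt) + 1):
        du = -beta * u * v            # susceptible cells get infected
        di = beta * u * v - delta * i  # infected cells appear and die
        dv = p * i - c * v             # virions are produced and cleared
        u, i, v = u + dt * du, i + dt * di, v + dt * dv
        trace.append((n * dt, u, v))
    return trace

trace = simulate_target_cell()
v_peak = max(v for _, _, v in trace)
print(v_peak > trace[0][2])    # True: the virus grows to a peak...
print(trace[-1][2] < v_peak)   # True: ...and then decays
```

with these made-up values the within-host reproductive number p·β·u(0)/(c·δ) is 10, so the infection takes off before target-cell depletion drives the viral load back down.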
this could be an explanation of why infected patients with covid-19 would take 166 from 2-14 dpi to exhibit symptoms. the model with immune system (fig.4(b and d) ) highlights that the t cell response is slowly 168 mounted against covid-19 [4] . thus, the slow t cell response may promote a limit inflammation 169 levels [42] , which might be a reason to the observations during covid-19 pandemic of the detrimental 170 outcome on french patients that used non-steroidal anti-inflammatory drugs (nads) such as ibuprofen. 171 however, so far, there is not any conclusive clinical evidence on the adverse effects by nads on 172 covid-19 infected patients. the humoral response against covid-19 is urgently needed to evaluate the protection to . cc-by-nc-nd 4.0 international license it is made available under a is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. the copyright holder for this preprint this version posted march 30, 2020. . https://doi.org/10.1101/2020.03. 26.20044487 doi: medrxiv preprint infection in a non-human primate model [44] . furthermore, benefits has been reported for therapeutic 181 treatment if provided during 12 hours mers-cov infection [44] . our study here mainly addressed t cell 182 responses, therefore, future modelling attempts should be directed to establish a more detailed model of 183 antibody production and cross-reaction [45] as well as in silico testing of different antivirals [46] . 184 there are technical limitations in this study that need to be highlighted. the data for covid-19 185 kinetics in [21] is at the onset of symptoms. this is a key aspect that can render biased parameter 186 estimation as the target cell regularly is assumed to initiate at the day of the infection. in fact, we could 187 miss viral dynamics at the onset of symptoms. for example, from throat samples in rhesus macaques 188 infected with covid-19, two peaks were reported on most animals at 1 and 5 dpi [47] . 
in a more technical aspect, using only the viral load in the target cell model to estimate parameters may lead to identifiability problems [48] [49] [50] [51] . thus, our parameter values should be taken with caution when parameter quantifications are interpreted to address within-host mechanisms. similar caution applies to models coupling within-host and between-host scales [52] [53] [54] [55] [56] [57] . further insights into the immunology and pathogenesis of covid-19 will help to improve the outcome of this and future pandemics. mathematical models based on ordinary differential equations (odes) are solved using the matlab library ode45, which is intended for solving non-stiff differential equations [58] . the clinical data of 9 individuals are from [21] . due to close contact with index cases and an initial diagnostic test before admission, the patients were hospitalized in munich [21] . viral load kinetics were reported in copies/ml per whole swab for the 9 individual cases. all samples were taken about 2 to 4 days post symptoms. further details can be found in [21] . the fitting minimizes the root mean square (rms) cost (14), rms = sqrt((1/n) Σ_{i=1}^{n} (y_i − ŷ_i)²), where y_i are the measured viral loads, ŷ_i the corresponding model predictions, and n is the number of measurements. the minimization of the rms is performed using the differential evolution (de) algorithm [22] . note that several optimization solvers were considered, including both deterministic (the fmincon matlab routine) and stochastic (e.g. genetic and annealing algorithms) methods. simulation results revealed that the de global optimization algorithm is more robust to the initial guesses of the parameters than the other mentioned methods. model selection by aic. the akaike information criterion (aic) is used here to compare the goodness-of-fit for models that evaluate different hypotheses [59].
a lower aic value means that a given model describes the data better than other models with higher aic values. small differences in aic scores (e.g. <2) are not significant [59] . with a small number of data points, the corrected criterion (aicc) writes as follows: aicc = n ln(rss/n) + 2m + 2m(m+1)/(n − m − 1), where n is the number of data points, m is the number of unknown parameters, and rss is the residual sum of squares obtained from the fitting routine. the authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. coronavirus diseases (covid-2019) situation reports identification of diverse alphacoronaviruses and genomic characterization of a novel severe acute respiratory syndrome-like coronavirus from bats in china a sars-like cluster of circulating bat coronaviruses shows potential for human emergence how will country-based mitigation measures influence the course of the covid-19 epidemic? the lancet real estimates of mortality following covid-19 infection.
the lancet infectious diseases early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia modeling hiv persistence , the latent reservoir , and viral blips modeling the within-host dynamics of hiv infection modeling the three stages in hiv infection modeling of experimental data supports hiv reactivation from latency after treatment interruption on average once every 5-8 days analysis if hepatitis c virus infection models with hepatocyte homeostasis modeling viral spread ebola virus infection modeling and identifiability problems windows of opportunity for ebola virus infection treatment and vaccination. scientific reports kinetics of influenza a virus infection in humans neuraminidase inhibitor resistance in influenza: assessing the danger of its generation and spread effects of aging on influenza virus infection dynamics within-host models of high and low pathogenic influenza virus infections: the role of macrophages modeling and control of infectious diseases: with matlab and r sars-cov-2 viral load in upper respiratory specimens of infected patients clinical presentation and virological assessment of hospitalized cases of coronavirus disease 2019 in a travel-associated transmission cluster differential evolution-a simple and efficient heuristic for global optimization over continuous spaces modelling viral and immune system dynamics in-host modeling. 
infectious disease modelling university of texas medical branch at galveston viral load kinetics of mers coronavirus infection on the definition and the computation of the basic reproduction ratio r0 in models for infectious diseases in heterogeneous populations perspectives on the basic reproductive ratio assessing the in vitro fitness of an oseltamivir-resistant seasonal a/h1n1 influenza strain using a mathematical model the h275y neuraminidase mutation of the pandemic a/h1n1 influenza virus lengthens the eclipse phase and reduces viral output of infected cells, potentially compromising fitness in ferrets modeling amantadine treatment of influenza a virus in vitro a dynamical model of human immune response to influenza a virus infection simulation and prediction of the adaptive immune response to influenza a virus infection modeling within-host dynamics of influenza virus infection including immune responses dynamics of influenza virus infection and pathology quantifying the early immune response and adaptive immune response kinetics in mice infected with influenza a virus modeling influenza virus infection: a roadmap for influenza research multiscale model within-host and between-host for viral infectious diseases the survival and turnover of mature and immature cd8 t cells clinical progression and viral load in a community outbreak of coronavirus-associated sars pneumonia: a prospective study viral dynamics in mild and severe cases of covid-19. the lancet infectious diseases;0(0) switching strategies to mitigate hiv mutation reinfection could not occur in sars-cov-2 infected rhesus macaques prophylactic and therapeutic remdesivir (gs-5734) treatment in the rhesus macaque model of mers-cov infection uncovering antibody cross-reaction dynamics in influenza a infections passivity-based inverse optimal impulsive control for influenza treatment in the host infection with novel coronavirus ( sars-cov-2 ) causes pneumonia in the rhesus macaques sciences. 
key: cord-269212-oeu48ili authors: rodrigues, alírio e. title: chemical engineering and environmental challenges. cyclic adsorption/reaction technologies: materials and process together! date: 2020-04-07 journal: j environ chem eng doi: 10.1016/j.jece.2020.103926 sha: doc_id: 269212 cord_uid: oeu48ili i start with a brief survey of paradigms in chemical engineering to highlight that in the early 70s my thesis advisor p. le goff already mentioned the strong link of chemical processes with environment, energy and economy (market). then i move to my vision of che today, summarized in che = m(2)p(2)e (molecular, materials, process and product engineering). i describe how i built a research lab centered around cyclic adsorption/reaction processes, focusing on adsorption technologies to help solve environmental problems.
i stress the basic concepts of adsorption processes and the need to use proper diffusion models for intraparticle mass transfer instead of pseudo first order or second order kinetic models. i also consider that adsorbent metrics should be linked to the process where the material is used: materials and processes together! in the last section i review some challenging areas where adsorption technologies are useful: carbon capture and utilization, involving pressure swing adsorption to capture co(2) from flue gas in a pilot plant, 3d printed composite monoliths for electric swing adsorption, and utilization of co(2) to be transformed into methanol or synthetic natural gas (sng) (the power-to-gas concept). i also address the general topic "processing of diluted aqueous solutions" with special attention to the development of simulated moving bed coupled with expanded bed adsorption. finally, the integrated process to produce high added-value compounds (vanillin and syringaldehyde) from kraft lignin is shown as an example of lignin valorization in a pulp mill biorefinery. molecular engineering tools such as molecular simulations will help more and more in the screening and design of adsorbents (mofs, cofs, etc.) for target processes, and computational fluid dynamics (cfd) will help in process design. process modeling and simulation packages such as gproms are now used, replacing homemade simulators and the learning of numerical methods; this was predicted in the 90s by the late colin mcgreavy (u. leeds) when we were teaching a chemical reaction engineering course at ufsc in florianopolis. however, i had some k (knowledge) which helped me to start a research laboratory, along with some principles: i) if you don't wish something, you will never get it; ii) keep eyes open to other areas (cross-fertilization); iii) accountability (publish research results); iv) in research you can always do what you want; it can take longer because of lack of money, etc.
v) researchers should leave their fingerprint in the lab; vi) research can't be done with absent people. i started my lab with three phd students, teaching assistants at the department of che of u. porto, working on topics involving separation and reaction engineering related to environment and bioengineering: i) removal of phenol from wastewater using polymeric adsorbent resins [10]; ii) denitrification of water in fluidized bed biological reactors [11]; iii) removal of heavy metals with complexing resins and adsorption/reaction processes [12]. i remember reading a comment by p. v. danckwerts [13] on the use of chemical engineering principles. in his words: "i was reminded of this 15 or so years later when i sat in a committee concerned with sewage and water treatment. the suggestion that chemical engineering had some knowledge relevant to these processes, e.g., in the field of mass transfer, moved civil engineers on the committee, who had always regarded the field as their own, to apoplexy". i experienced the same reaction in a meeting in porto… in many of these projects adsorption is the technology under study. i remember again p. le goff in his lectures saying: "any che problem including adsorption can be modeled by writing: i) conservation equations (mass, energy, momentum balances), ii) equilibrium law at the interface, iii) kinetic laws of mass/heat transfer, iv) boundary and initial conditions, and v) optimization criterion". i also tell my students that factors governing adsorption processes can be divided into first order factors (adsorption equilibrium isotherms) and second order factors (all leading to dispersive effects: kinetics of mass transfer, axial dispersion, etc.). the message is: equilibrium first, and show a picture of i. langmuir! (fig. 4) [14]. i also say that to understand adsorption in fixed bed columns you must know the de vault equation, developed in 1943 [15] from a "simplissime" model based on equilibrium theory.
combining a mass balance in a bed volume element with the adsorption equilibrium isotherm q_i* = f(c_i), one can get de vault's equation for the propagation velocity of a concentration c_i, u_ci = u / (1 + ((1 - ε)/ε) dq_i*/dc_i) (u interstitial fluid velocity, ε bed porosity). de vault's equation shows that adsorption in fixed beds is a wave (concentration) propagation phenomenon; it also explains the effect of the nature of the adsorption equilibrium isotherm: if the isotherm is favorable the concentration front is compressive and will lead to a shock; if the isotherm is unfavorable it leads to a dispersive front (fig. 5). to me this is the most important result to understand fixed bed adsorption: the concepts of compressive and dispersive waves, as i learned from my thesis co-advisor daniel tondeur. so the first thing to do in adsorption process development is to measure adsorption equilibrium isotherms. for liquid/solid systems measurements are done by contacting different masses w of adsorbent with a known volume v of solution with initial concentration c_i0. the average adsorbed concentration at any time is simply <q_i> = (v/w)(c_i0 - c_i); this is the operating line with slope -v/w, which is simply the integrated mass balance and relates at any time the average adsorbed phase concentration and the fluid phase concentration. after sufficient time equilibrium is reached and a point (c_if, q_if) of the adsorption equilibrium isotherm is obtained. so with equilibrium and operating lines it is a simple exercise to understand the effect of initial concentration in the liquid phase and the effect of adsorbent loading! lots of experimental work and published papers could have been saved! also the regeneration process can be easily understood with simple graphical schemes (fig. 6). adsorption is a mass transfer operation between a fluid phase and a solid adsorbent phase. the driving force for the intraparticle mass transfer in the case of "homogeneous" adsorbents is (q_i* - q_i), where q_i* is the adsorbed concentration at the interface in equilibrium with the fluid concentration at the interface c_i.
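the equilibrium-line/operating-line construction described above lends itself to a few lines of code. a minimal python sketch, assuming a langmuir isotherm and purely illustrative parameter values (none of these numbers come from the text):

```python
def langmuir(c, q_max, b):
    """Langmuir adsorption equilibrium isotherm q*(c) = q_max*b*c / (1 + b*c)."""
    return q_max * b * c / (1.0 + b * c)

def batch_equilibrium(c0, V, W, q_max, b, tol=1e-10):
    """Final equilibrium point (c_f, q_f) of a batch contacting experiment.

    Intersects the operating line q = (V/W)*(c0 - c) (the integrated mass
    balance for volume V of solution at initial concentration c0 contacted
    with mass W of adsorbent) with the isotherm q = f(c), by bisection on
    g(c) = (V/W)*(c0 - c) - f(c), which decreases monotonically on [0, c0].
    """
    g = lambda c: (V / W) * (c0 - c) - langmuir(c, q_max, b)
    lo, hi = 0.0, c0  # g(0) >= 0 and g(c0) <= 0, so the root is bracketed
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    c_f = 0.5 * (lo + hi)
    return c_f, langmuir(c_f, q_max, b)

# effect of adsorbent loading: more adsorbent (larger W) pulls the final
# fluid concentration further down along the isotherm
c_low, q_low = batch_equilibrium(c0=1.0, V=1.0, W=1.0, q_max=2.0, b=1.0)
c_high, q_high = batch_equilibrium(c0=1.0, V=1.0, W=10.0, q_max=2.0, b=1.0)
```

with c0 = 1, v = 1 and w = 1 the intersection falls at c_f = √2 − 1 ≈ 0.414; increasing w lowers c_f, which is exactly the loading effect the author says can be read off the graphical scheme.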
the simplest kinetic law is the linear driving force (ldf) model of glueckauf [16], dq_i/dt = k_ldf (q_i* - q_i) with k_ldf = 15 d_h / r_p^2, where in a batch adsorption process q_i* is changing with time unless the adsorbent particle is in an infinite bath. for porous adsorbent structures one can easily relate the homogeneous diffusivity d_h with the pore diffusivity d_p, at least for linear systems; for bidisperse adsorbent structures involving macropore diffusion and micropore (crystal) diffusion d_c, adequate relations with d_h can be derived. other models are often used, called pseudo-order models of first order, second order… it is time to describe adsorption using diffusion inside particles: ldf, fick, stefan-maxwell! one such pseudo-order model is the lagergreen model, where the rate of adsorption is proportional to the distance to equilibrium (q_i* - q_i); i wrote a note on this, "what's wrong with lagergreen pseudo first order model for adsorption kinetics?" [17]. adsorption processes are multi-scale problems both in space and time. adsorbent metrics have been presented, but in my opinion the metrics only make sense for a defined process in which the material will be used [18-20]. a good summary of adsorbent metrics can be found in reference [19]. the selectivity requirement for an adsorbent for pressure swing adsorption (psa) is not the same as that needed for use with simulated moving bed (smb) technology [21]. as sircar said once, "each adsorbent must be 'married' to a process that maximizes the potential" [22]. a comment on models should be made, and as einstein said: "keep it simple but not simpler". i have seen big failures in predictions of employment rates during the last financial crisis of 2008. now we face tough times with covid-19, staying at home and following the numbers of infected people in various countries. my colleague manuel alves prepared a plot in semilog scale of n (number of infected people) versus time. the slope of these lines in all countries is around k = 0.287 day^-1. the virus doesn't discriminate countries!
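the ldf rate law can be integrated in a few lines. a sketch assuming glueckauf's lumped constant k_ldf = 15·d_h/r_p^2 and a particle in an infinite bath (constant q*); the diffusivity and particle radius values are illustrative assumptions, not data from the text:

```python
import math

def ldf_uptake(q_star, D_h, R_p, t, n_steps=10_000):
    """Explicit-Euler integration of the LDF model dq/dt = k_ldf*(q* - q).

    k_ldf lumps intraparticle diffusion (homogeneous diffusivity D_h,
    particle radius R_p) into a single first-order coefficient; for a
    constant q* the exact solution is q(t) = q*(1 - exp(-k_ldf*t)).
    """
    k_ldf = 15.0 * D_h / R_p ** 2
    dt = t / n_steps
    q = 0.0
    for _ in range(n_steps):
        q += dt * k_ldf * (q_star - q)  # rate proportional to distance from equilibrium
    return q

# illustrative numbers: D_h = 1e-10 m^2/s, R_p = 1 mm  ->  k_ldf = 1.5e-3 1/s
q_num = ldf_uptake(q_star=1.0, D_h=1e-10, R_p=1e-3, t=1000.0)
q_exact = 1.0 - math.exp(-1.5e-3 * 1000.0)
```

in a finite batch q* itself moves along the operating line as the fluid is depleted, which is why the infinite-bath assumption has to be stated; the same euler loop works with q* updated from the mass balance at each step.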
this is a simple model of exponential epidemics, n = n_0 e^(kt) (fig. 7). coming back to the modeling approach of adsorption processes i suggest: i) start with simple models; obtain from such models information which remains valid for more complex models; ii) the validity of a model is not just a result of a good fit; more important is the capability to predict the system behavior under conditions different from those used to get model parameters; iii) good results can only be obtained if the model well represents the reality; and iv) use models to obtain useful design parameters and their dependence on operating conditions; use independent experiments if possible to get model parameters. there are societal challenges related to the need for clean air, water and soils, and relevant topics such as the processing of diluted aqueous solutions and valorization of biomass; in all these areas at some point adsorption technologies will be part of the solution. interestingly, the first eu research project i got was in the environmental area, "purification of wastewaters by parametric pumping and ion exchange" [23]. parametric pumping is a cyclic adsorption process involving two steps in a cycle: one step at lower temperature (say 20°c) and the other at higher temperature (say 60°c), with flow reversal. so it is a temperature swing adsorption (tsa) with flow reversal. it can be useful in processing diluted aqueous solutions: we can recover a concentrated phenol solution in the top reservoir and purified water in the bottom reservoir. processing of diluted aqueous solutions is a topic relevant to industry and was addressed in the eu project prodias coordinated by basf [24]. one of the companies involved at that time, xendo (now xpure), was developing a technology combining smb and expanded bed adsorption (eba) [25].
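the exponential fable-model is simple enough to fit by ordinary least squares on the semilog plot. a python sketch (the data below are synthetic, generated from the quoted slope k ≈ 0.287 day^-1; they are not the actual country counts):

```python
import math

def fit_exponential_growth(days, cases):
    """Least-squares slope of ln(n) versus t for the model n(t) = n0*exp(k*t).

    On a semilog plot exponential growth is a straight line; its slope is
    the growth rate k, and ln(2)/k is the doubling time.
    """
    logs = [math.log(n) for n in cases]
    t_bar = sum(days) / len(days)
    y_bar = sum(logs) / len(logs)
    k = (sum((t - t_bar) * (y - y_bar) for t, y in zip(days, logs))
         / sum((t - t_bar) ** 2 for t in days))
    n0 = math.exp(y_bar - k * t_bar)
    return n0, k

# synthetic counts with the quoted slope
days = list(range(10))
cases = [100.0 * math.exp(0.287 * t) for t in days]
n0, k = fit_exponential_growth(days, cases)
doubling_time = math.log(2) / k  # about 2.4 days at k = 0.287 per day
```

a slope of 0.287 per day corresponds to cases doubling roughly every two and a half days, which is what made the early country curves look so alike on semilog axes.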
eba is an interesting idea where the core-shell adsorbent particles expand nicely in a bed without the chaotic movement of particles, thanks to properly designed particle size and density distributions, allowing cell debris to pass through the bed whilst the solutes (proteins) are retained by the adsorbent. in principle with eba one doesn't need a previous solid/liquid separation; however, some concerns remain with adsorbent capacity loss due to cell adhesion to the particle surface, and dead volumes at the top of eba columns are not good for smb chromatographic separation [26, 27]. the need for low-cost water treatment processes to remove fluoride, iron and arsenic is extremely important in some countries. in the framework of a project involving our lab, tu munich and three universities in india, we studied continuous electrocoagulation processes for fluoride removal [28, 29]. some of these societal challenges are enormous, such as carbon capture and utilisation (ccu) to tackle global warming from greenhouse gases. i started my involvement with co2 capture in connection with the development of sorption enhanced reaction processes [30] to shift the equilibrium towards hydrogen production by coupling methane steam reforming with co2 sorption on hydrotalcites at high temperature [31]. later i moved to capture of co2 from flue gases using various adsorbents such as 13x zeolites, carbon materials, and binderless zeolites in different shapes (monoliths, beads, extrudates), using cyclic adsorption technologies such as pressure swing adsorption (psa), vacuum swing adsorption (vsa), temperature swing adsorption (tsa) and electric swing adsorption (esa). all this work was developed in parallel with modeling and simulation with home-made packages for process simulation [32]. one of the phd students working in this area was zhen liu, who returned to ecust in shanghai, after a sandwich period at lsre, and built a pilot plant (fig.
8) to treat flue gas from a coal-fired power plant, involving a 2-bed psa and a 3-bed psa, under the guidance of prof yu jianguo and ping li [33, 34]. i was impressed with his achievement when i went to ecust for his phd defense. development of adsorption processes is highly connected with materials development. the problems in the synthesis of new adsorbents are the scale-up from gram scale to kg scale and increasing the productivity by using new reactors such as the netmix reactor [35]. nevertheless, at the end we get a powder material, and shaping is required to use the adsorbents in fixed bed columns. an example was the 3d printing of a composite monolith of 13x zeolite and activated carbon to be used in esa operation for co2 capture [36]. the cost of co2 capture is still high and one option is storage. more interesting is the utilization of the co2 captured. still, there is a mismatch between the amounts of co2 to be captured and the potential use as reactant in current industrial processes. it is important to mention the transformation of co2 from a geothermal power plant into methanol at carbon recycling international (iceland), following the methanol economy concept of nobel prize winner george olah [37]. another option is the use of an adsorption/reaction cyclic process to make synthetic natural gas (sng) from co2. this is the power-to-gas concept: in a first step co2 from flue gas or another source is adsorbed over hydrotalcites (and concentrated), and in a second step hydrogen from water electrolysis powered by renewable energy (wind) is fed and the methanation reaction occurs, producing sng [38, 39]. this second step is the reactive regeneration of the adsorbent (fig. 9). this idea can be applied in pulp mills, where the lime kiln is a source of co2, thus allowing its transformation into the sng which is needed in the plant. pulp mills are also nice examples of biorefineries.
in kraft processes lignin is removed from wood, and the black liquor, after recovery of chemicals, is burned in boilers; therefore pulp mills are net producers of electricity injected into the grid. typically, for a pulp mill processing 1 million ton/year of wood, 250,000 ton/year of lignin are obtained. one may need to increase the capacity of the plant and be limited by the boiler capacity; in such a case a fraction of the black liquor can be taken to produce chemicals such as vanillin or syringaldehyde, depending on the wood source. in our lab, we developed an integrated process, shown in fig. 10 [40], involving first lignin oxidation, followed by membrane separation of low molecular weight compounds from the degraded lignin, which can be sent to the boiler or used to make polyurethane foams. the permeate stream is sent to adsorption columns where a clear separation by families (acids, aldehydes, ketones) is achieved (fig. 11) [41]. proper elution allows enriched fractions of the compounds of interest, and finally extraction/crystallization processes lead to the final product (vanillin or syringaldehyde) [42]. there is room here to implement cyclic adsorption processes using some kind of multi-column technology. a final word: chemical engineers combine expertise in chemistry, physics, mathematics and, some, in biology with an engineering thinking, and are players in many frontier areas to develop sustainable processes/products and help solve environmental challenges. "shaking the present. shaping the future" is the motto of our lab. fig. 11 caption: breakthrough from the development of the adsorption process to separate the permeate fraction containing lignin-derived phenolic compounds, by families: acids (vanillic acid, va), aldehydes (vanillin, v, and p-hydroxybenzaldehyde, h) and acetovanillone (vo). reprinted from seppur 216, 92-101 (2019) with permission from elsevier.
no conflict of interests one hundred years of chemical engineering research needs and opportunities product design and development chemical product design design and development of biological, chemical, food and pharmaceutical products product-driven process engineering. the eternal triangle molecules, product, process, inaugural lecture tu eindhoven adsorption engineering: processes and materials together!, plenary lecture at eba 11 dynamics of cyclic separation processes: adsorption and parametric pumping biological denitrification in fluidized bed reactors adsorption and reaction in porous particles, application to the removal of heavy metals and chelating resins research and innovation for the 1990s. the chemical engineering challenges, the institution of chemical engineer the adsorption of gases on plane surfaces of glass, mica and platinum the theory of chromatography theory of chromatography. part 10-formulae for diffusion into spheres and their application to chromatography what's wrong with lagergreen pseudo first order model for adsorption kinetics simplistic approach for preliminary screening of potential carbon adsorbents for co2 separation from biogas do adsorbent screening metrics predict process performance? a process optimisation based study for post-combustion capture of co2 co2 capture by adsorption processes: from materials to process development to practical implementation simulated moving bed technology. 
principles, design and process applications pressure swing adsorption purification of wastewaters by parametric pumping and ion exchange diluted aqueous solutions, eu project funded under h2020-eu.2.1.5.3.grant agreement id: 637077 xpure-systems.com proteins separation and purification by expanded bed adsorption and simulated moving bed technology expanded bed adsorption for human serum albumin and immunoglobulin g onto a cation exchanger mixed mode adsorbent removal of fluoride from water by a continuous electrocoagulation process modeling the electrocoagulation process for the treatment of contaminated water sorption enhanced reaction processes adsorption of carbon dioxide at high temperature-a review a general package for the simulation of cyclic adsorption processes onsite co 2 capture from flue gas by an adsorption process in a coal-fired power plant co2 capture from flue gas in an existing coal-fired power plant by two successive pilot-scale vpsa units us patent 8434933 b2 electrical conductive 3d-printed monolith adsorbent for co2 capture sustainable fuels and chemicals by carbon recycling a sorptive reactor for co2 capture and conversion to renewable methane co2 methanation over hydrotalcite-derived nickel/ruthenium and supported ruthenium catalysts an integrated approach for added-value products from lignocellulosic biorefineries: vanillin, syringaldehyde, polyphenols and polyurethane lignin biorefinery: separation of vanillin, vanillic acid and acetovanillone by adsorption crystalization of vanillin lignin biorefinery: separation of vanillin, vanillic acid and acetovanillone by adsorption, accepted this work was financially supported by: base funding -uidb/ 50020/2020 of the associate laboratory lsre-lcm -funded by national funds through fct/mctes (piddac).the help of dr elson gomes in the preparation of figures is gratefully acknowledged. 
key: cord-281543-ivhr2no3 authors: richardson, eugene t title: pandemicity, covid-19 and the limits of public health 'science' date: 2020-04-17 journal: bmj glob health doi: 10.1136/bmjgh-2020-002571 sha: doc_id: 281543 cord_uid: ivhr2no3 ► mathematical models of infectious disease transmission are merely fables dressed in formal language (that therefore create the illusion of being scientific). ► for the most part, such models serve not as forecasts, but rather as a means for setting epistemic confines to the understanding of why some groups live sicker lives than others-confines that sustain predatory accumulation rather than challenge it. ► pandemicity-which we might conceive of as the linking of humanity through contagion-may bring about the dawning of a relational consciousness in the descendants of colonialists, especially in the global north. "no man is an island, entire of itself; each is a piece of the continent, a part of the main. if a clod be washed away by the sea, europe is the less, as well as if a promontory were, as well as if a manor of thy friend's or of thine own were. each man's death diminishes me, for i am involved in mankind. and therefore never send to know for whom the bell tolls; it tolls for thee." john donne wrote these lines in 1624 as part of a series of meditations conducted during a period of what we would now term social distancing, while he suffered from a relapsing febrile illness. whatever the pathogen, donne's musings on being part of a greater whole were not conceived during an epidemic or pandemic, since these words did not exist as nouns in the english language until 1674 and 1832, respectively. 1 in 2020, the quasi-inexorable spread of severe acute respiratory syndrome coronavirus 2 (sars-cov-2) has brought the interconnectedness of humankind back to the forefront of many a consciousness. yet it has not brought clarity to the blurred boundary between epidemics and pandemics.
this was made manifest by the who's hesitancy over employing the latter designation in march 2020. 2 and while 'expert' epidemiologists have been climbing over themselves to brandish their latest forecasts (a phenomenon i have described as #willtopunditry), it seems worth asking, are their ways of parsing health phenomena useful? moreover, if one accepts that the boundaries between disease outbreaks and their political economic determinants/sequelae are blurred, 3 the same question should also be asked of other 'expert' modelers, economists in particular. the modern epidemiologist is essentially an accountant (and this is a compliment). they tally up data, present graphs and tables, and make suggestions about investments (in intervention measures such as social distancing, for example). when it comes to forecasting epidemic trends, however, their contributions-from specious metrics 4 like the 2019 global health security index 5 to kaleidoscopic computational models of communicable disease transmission-have limited predictive power (as experience in global health has repeatedly shown). during the 2013-2016, ebola virus outbreak in west africa, modelers devised a dizzying array of forecasts, 6 ranging from the who's supposition early on that the outbreak would be contained at a few hundred cases to the us centers for disease control and prevention's estimate of up to 1.4 million cases by january 2015. 7 interestingly, this latter model was least consistent with the observed epidemic; at the same time, however, it was claimed to be the most useful (as an advocacy tool to muster a robust international response). 8 9 this is not quite what the statistician george e. p. box had in mind when he wrote his famous dictum, 'all models are wrong but some are useful.' 
10 more recently, suppositious models of the sars-cov-2 outbreak in the uk posited that half the country (some 34 million people) might already be infected (as of 19 march 2020) 11 and that the 'herd immunity' approach initially adopted by the uk government was defensible. 12 in the usa, health economists bendavid and bhattacharya upped the ante questioning whether universal quarantine measures were worth their costs to the economy. 13 the duo's neoliberal proclivities, 14 coupled with this current offering in the wall street journal, underscore the ideological presumptions intrinsic to any modeling exercise. as the israeli economist ariel rubinstein notes: (1) mathematical models are merely fables dressed in formal language (that therefore create the illusion of being scientific) and (2) economics is an academic discipline which tends towards conservatism and helps the privileged in society maintain their dominance. 15 the same can be said for epidemiology, where bourgeois empiricists 16 build fable-models whose assumptions are usually conjured from the standpoint of dominant interests. 17 in the case of ebola outbreak in west africa, epidemiologists attributed amplified transmission to local populations' beliefs in misinformation or their 'strange' funerary practices-in essence, diverting the public's gaze from legacies of the transatlantic slave trade (or maafa), 18 colonialism, 19 indirect rule, 20 structural adjustment 21 and extractive foreign companies as determinants. 14 22 23 these ways of parsing health phenomena are indeed useful for those in protected affluence, since epidemiologists filter out information vital for demonstrating the global north's complicity in producing planetary health inequities-weakening the disposition of social resistance to such inequity (and demands for reparations) as a result. 
for the most part, mathematical models of infectious disease transmission serve not as forecasts, 24 25 but rather as a means for setting epistemic confines to the understanding of why some groups live sicker lives than others-confines that sustain predatory accumulation rather than challenge it. 26 27 similar to the role philanthropy plays in occulting economic exploitation, 28 29 the modest improvements in well-being offered by the right hand of public health 'science' often disguise what global elites and their looting machines 30 have expropriated with the left. 31 that being the case, the field is in clear need of decolonising; however, it is producing some potentially useful, although structurally naïve, 32 work to support the containment of sars-cov-2 within countries. but epidemiology's abetting function as an ideological apparatus can manifest at any time. 33 in the wall street journal article mentioned above, bendavid and bhattacharya, both academics based at stanford university, may have, unwittingly, given the trump administration the stanford imprimatur to trade people's lives for profits. as such, does it make sense to speak of such fabulists-given that their models are fables-as experts? 34 the fable-model i would propose prioritizes people's lives and has radical wealth redistribution as its moral. such a model requires expertise in solidarity. the same solidarity that kwame nkrumah called for as an antidote to neocolonialism. 35 the same lack of solidarity that allows the descendants of colonialists-those whose power and privilege have often shielded them from pandemicity-to continue proffering conservative fables under a veil of scientism, which for the most part serve to conceal violently seized privilege, thus maintaining transnational relations of inequality. [36] [37] [38] [39] covid-19 has the potential to change this.
pandemicity-which we might conceive of as the linking of humanity through contagion-may bring about the dawning of a relational consciousness in the descendants of colonialists. as their bubbles of protected affluence are burst by sars-cov-2 and tnv (the next virus) and they gain insight into global human interconnectedness, they may also begin to see that the same disproportionate mortality they are seeing around them due to covid-19 is the quotidian experience of much of the global south, where nearly 10 000 children die daily from preventable causes. 40 as they start to sift back through the determinative web of human rights abuses-that is, the pathologies of power 41 -that set the stage for these health inequalities, they may begin to see that they contribute a great deal to the production and reproduction of structural injustice because of the social position they occupy and the violence that has been committed in their names. 42 and with this should come the realisation that every local outbreak is a pandemic, 43 since they are involved in (hu)mankind. or they will continue their retreat into militarisation, xenophobia, necropolitics and fascism, and the bell will be deafening. for as donne wrote, '…never send to know/for whom the bell tolls;/it tolls for thee.' contributors etr is the sole author of this work. funding the authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors. competing interests none declared. provenance and peer review not commissioned; internally peer reviewed.
open access this is an open access article distributed in accordance with the creative commons attribution non commercial (cc by-nc 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. see: http://creativecommons.org/licenses/by-nc/4.0/. eugene t richardson http://orcid.org/0000-0001-8437-0671 pandemic' vs did the hesitancy in declaring covid-19 a pandemic reflect a need to redefine the term? the symbolic violence of 'outbreak': a mixed methods, quasi-experimental impact evaluation of social protection on ebola survivor wellbeing metrics: what counts in global health covid-19 gives the lie to global health expertise mathematical modeling of the west africa ebola epidemic estimating the future number of cases in the ebola epidemic -liberia and sierra leone cdc's top modeler courts controversy with disease estimate ebola: a big data disaster robustness in the strategy of scientific model building covid-19: experts question analysis suggesting half uk population has been infected mathematics of life and death: how disease models shape national shutdowns and other pandemic policies. science magazine is the coronavirus as deadly as they say? on the coloniality of global public health economic fables. cambridge: open book the political ecology of disease in tanzania epidemic illusions let the circle be unbroken: the implications of african spirituality in the diaspora discourse on colonialism indirect rule redux: the political economy of diamond mining and its relation to the ebola outbreak in kono district dying for growth: global inequality and the health of the poor how europe underdeveloped africa.
london: bogle-l'ouverture understanding west africa's ebola epidemic: towards a political economy facts, power and global evidence: a new empire of truth essays on the sociology of knowledge postcolonial though and social theory silencing the past: power and the production of history violence: six sideways reflections winners take all: the elite charade of changing the world the looting machine: warlords, oligarchs, corporations, smugglers, and the theft of africa's wealth the divide: global inequality from conquest to free markets covid-19 and circuits of capital ideology and ideological state apparatuses the foreign gaze: authorship in academic global health neo-colonialism, the last stage of imperialism latin american critical ('social') epidemiology: new settings for an old dream decolonising the mind: the politics of language in african literature decolonizing methodologies: research and indigenous peoples tracking covid-19 responsibly children: reducing mortality pathologies of power: health, human rights, and the new war on the poor responsibility for justice biosocial approaches to the 2013-2016 ebola pandemic key: cord-315685-ute3dxwu authors: ehaideb, salleh n.; abdullah, mashan l.; abuyassin, bisher; bouchama, abderrezak title: evidence of a wide gap between covid-19 in humans and animal models: a systematic review date: 2020-10-06 journal: crit care doi: 10.1186/s13054-020-03304-8 sha: doc_id: 315685 cord_uid: ute3dxwu background: animal models of covid-19 have been rapidly reported after the start of the pandemic. we aimed to assess whether the newly created models reproduce the full spectrum of human covid-19. methods: we searched the medline, as well as biorxiv and medrxiv preprint servers for original research published in english from january 1 to may 20, 2020. we used the search terms (covid-19) or (sars-cov-2) and (animal models), (hamsters), (nonhuman primates), (macaques), (rodent), (mice), (rats), (ferrets), (rabbits), (cats), and (dogs). 
inclusion criteria were the establishment of animal models of covid-19 as an endpoint. other inclusion criteria were the assessment of prophylaxis, therapies, or vaccines using animal models of covid-19. results: thirteen peer-reviewed studies and 14 preprints met the inclusion criteria. the animals used were nonhuman primates (n = 13), mice (n = 7), ferrets (n = 4), hamsters (n = 4), and cats (n = 1). all animals supported high viral replication in the upper and lower respiratory tract associated with mild clinical manifestations, lung pathology, and full recovery. older animals displayed relatively more severe illness than the younger ones. no animal model developed hypoxemic respiratory failure or multiple organ dysfunction culminating in death. all species elicited a specific igg antibody response to the spike proteins, which was protective against a second exposure. transient systemic inflammation was observed occasionally in nonhuman primates, hamsters, and mice. notably, none of the animals displayed a cytokine storm or coagulopathy. conclusions: most of the animal models of covid-19 recapitulated the mild pattern of human covid-19 with a full recovery phenotype. no severe illness associated with mortality was observed, suggesting a wide gap between covid-19 in humans and animal models. the virus transmits easily from person to person and disseminates within the body in severe and fatal cases [11-18]. accordingly, sars-cov-2-induced covid-19 has led to a pandemic that overwhelmed the capacity of most national health systems, resulting in a global health crisis [19]. so far, an estimated 11.28 million persons in 188 countries have been infected, of whom 531,000 have died [20]. the clinical spectrum of covid-19 is complex and has been categorized as mild, severe, and critical, representing 81%, 14%, and 5% of cases, respectively [2, 3].
the mild pattern comprises patients with either no signs and symptoms, or fever and radiological evidence of pneumonia [3]. the severe pattern manifests as rapidly progressive hypoxemic pneumonia involving more than half of the lung, with a full recovery phenotype [2, 3]. the critical pattern consists of acute respiratory distress syndrome (ards) requiring respiratory assistance and multiple organ system dysfunction (mosd) that result in death in approximately half of the patients [2, 3, 7, 21]. mortality was associated with host factors such as old age, comorbidities, and immune response [4]. viral and immunopathological studies revealed distinct patterns between mild and severe or critical forms of covid-19 [4, 5, 9, 21-27]. both severe and critically ill patients displayed higher viral loads in the upper respiratory tract than mild cases, together with delayed clearance over time [21, 22]. likewise, they presented with lymphopenia due to a decrease in cd4+ and cd8+ t cells, as well as t cell exhaustion accompanied by a marked inflammatory response [5, 9, 24-27]. pro- and anti-inflammatory cytokine and chemokine concentrations were increased systemically and locally in the lung and correlated with severity [5, 9, 24]. in contrast, in the mild illness, the lymphocyte count was normal, with no or minimal inflammatory response [5, 23]. together, these findings suggest that the viral load and dynamics, together with the host inflammatory response, may play a pathogenic role. clinical and post-mortem studies of fatal cases of covid-19 demonstrated major alterations of coagulation and fibrinolysis [17, 18]. these were associated with widespread thrombosis of small and large vessels, particularly of the pulmonary circulation, contributing to death in a third of patients [8, 28-33]. these observations suggest that dysregulated coagulation may be an important mechanism of covid-19 morbidity and mortality [34].
in this context, animal models appear crucial to a better understanding of the complex biology of covid-19. animal models of sars-cov-2-induced covid-19 have been rapidly reported since the start of the pandemic [35]. however, whether they express the full phenotype of covid-19, particularly the severe and critical patterns associated with lethality, remains to be determined. in this systematic review, we examined whether the newly created animal models reproduce the phenotype of human covid-19. moreover, we examined the knowledge generated by these models of covid-19, including viral dynamics and transmission, pathogenesis, and the testing of therapies and vaccines. we conducted a systematic review according to the preferred reporting items for systematic reviews and meta-analyses (prisma) statement [36] to identify studies describing the creation of an animal model of covid-19 as an endpoint (table 1 and additional file 1). additional file 1 shows the data extraction and appraisal approach as well as the selected outcomes. the systematic search identified 101 studies and 326 preprints, of which 400 articles were excluded because they were reviews, non-original articles, unrelated to covid-19 infection, or used experimental animals that do not support sars-cov-2 replication, such as pigs, ducks, and chickens (fig. 1 and additional file 2). additional file 2 displays all the excluded studies and the rationale for their exclusion. thirteen peer-reviewed studies and 14 preprints were included in the analysis. the studies used nonhuman primates (n = 13) [37-49], mice (n = 7) [50-56], hamsters (n = 4) [56-59], ferrets (n = 4) [60-63], and cats and dogs (n = 1) [63] (tables 2, 3, 4, and 5). male and female, as well as young and old, animals were included, but none had associated comorbidities.
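the screening counts reported above can be checked with a short arithmetic sketch (the numbers are those stated in the text; the code itself is illustrative and not part of the original study):

```python
# Consistency check of the screening flow reported in this review:
# 101 peer-reviewed studies and 326 preprints were screened, 400 records
# were excluded, and 13 peer-reviewed studies plus 14 preprints remained.
screened = {"peer_reviewed": 101, "preprints": 326}
excluded = 400  # reviews, non-original articles, unrelated topics, non-permissive species
included = {"peer_reviewed": 13, "preprints": 14}

total_screened = sum(screened.values())
total_included = sum(included.values())

# Screened minus excluded should equal the number of included articles.
assert total_screened - excluded == total_included
print(total_screened, excluded, total_included)  # 427 400 27
```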
the aims were to investigate the pathogenesis of covid-19 (n = 15), the testing of drugs and vaccines (n = 14), the host immune response (n = 6), and the virus dynamics and transmission (n = 4) (tables 2, 3, 4, and 5).

table 1 search strategy and selection criteria. we searched the medline, as well as biorxiv and medrxiv preprint servers for original research describing or using an animal model of sars-cov-2-induced covid-19 published in english from january 1, 2020, to may 20, 2020. we used the search terms (covid-19) or (sars-cov-2) and (animal models), (hamsters), (nonhuman primates), (macaques), (rodent), (mice), (rats), (ferrets), (rabbits), (cats), and (dogs). the preprint servers were included in the search as the field of covid-19 is developing quickly. inclusion criteria were the establishment of animal models of covid-19 as an endpoint. other inclusion criteria were the assessment of prophylaxis, therapies, or vaccines using animal models of covid-19. exclusion criteria consisted of reviews, non-original articles, articles unrelated to covid-19 infection, or experimental animals that do not support sars-cov-2 replication. 101 studies and 326 preprints were screened, of which 13 peer-reviewed studies and 14 preprints were included in the final analysis (fig. 1). the variables extracted were the population type, study aim, the virus strain used, clinical response, pathology, viral replication, and host response, as well as the effects of prophylaxis, drugs, or vaccines. the outcomes were organized according to species and categorized into phenotype (signs or symptoms; histopathology; time-course of the illness and outcome), viral (titer in each tissue organ; detection methods; duration of positivity), host response (dynamics of seroconversion; inflammatory and hemostatic markers), therapy, and vaccine (efficacy and safety).
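the boolean structure of the search string in table 1 can be sketched as follows (a hypothetical illustration of how the listed terms combine; the actual medline and preprint-server query syntax differs):

```python
# Build a boolean query from the disease and animal-model terms listed in
# Table 1. Real MEDLINE/bioRxiv syntax differs; this only shows the logic.
disease_terms = ["COVID-19", "SARS-CoV-2"]
model_terms = ["animal models", "hamsters", "nonhuman primates", "macaques",
               "rodent", "mice", "rats", "ferrets", "rabbits", "cats", "dogs"]

query = "({}) AND ({})".format(
    " OR ".join(disease_terms),
    " OR ".join(model_terms),
)
print(query)
```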
all the experimental animals were inoculated with sars-cov-2, with strains, doses, and routes of administration that differed across studies (tables 2, 3, 4, and 5). likewise, the time-points for tissue collection and pathological assessment were variable. together, these precluded any comparison between the animal models, either intra-species or inter-species.

nonhuman primate models. viral model. rhesus macaques (n = 10) [37-46], cynomolgus macaques (n = 3) [46-48], african green monkeys (n = 1) [49], and common marmosets (n = 1) [46] were assessed as models for covid-19 (table 2). sars-cov-2 strains, doses, and routes of inoculation differed across studies. different doses of virus inoculum were compared in a single study, which showed that viral load in the upper and lower respiratory tract, fever, weight loss, respiratory distress, and mortality were comparable regardless of the dose, except for mild transient neutropenia and lymphopenia in the high-dose group [43]. in contrast, the route of administration resulted in different pathological responses: the intratracheal route elicited severe interstitial pneumonia, as compared with mild interstitial pneumonia and no pneumonia from the intraconjunctival and intragastric routes, respectively [45]. the animals were euthanized at different time-points post-inoculation, ranging from 3 to 33 days. the animals displayed variable clinical manifestations, from none to fever, altered respiratory patterns, and other general signs (table 2). (table 2 footnotes: § dpi, day post-inoculation; ¶ crp, c-reactive protein; || na, not available; ** vaccine encoding spike protein variants: full-length sars-cov-2 s protein; s.dct, deletion of the cytoplasmic tail of the sars-cov-2 s protein; s.dtm, deletion of the transmembrane domain and cytoplasmic tail, reflecting the soluble ectodomain; s1, s1 domain with a foldon trimerization tag; rbd, receptor-binding domain with a foldon trimerization tag; s.dtm.pp, a prefusion-stabilized soluble ectodomain with deletion of the furin cleavage site, two proline mutations, and a foldon trimerization tag; im, intramuscular.) the clinical manifestations were not different between old and young macaques [46-48]. structural and ultrastructural examination of the respiratory tract was also variable, including mild to moderate interstitial pneumonitis, edema, foci of diffuse alveolar damage with occasional hyaline membrane formation, and type ii pneumocyte hyperplasia (table 2). old rhesus macaques exhibited more diffuse and severe interstitial pneumonia than young ones [47]. extrapulmonary injury was investigated in five studies [40, 42, 43, 46, 49]. these revealed pathological changes in two studies [46, 49], including distention and flaccidity of the intestine, inflammatory cells infiltrating the jejunum and colon, steatosis of the liver, and alteration of myocardial fiber architecture with increased mitochondrial density [46, 49]. no mortality was observed in any of the nonhuman primate models. comparisons between species of nonhuman primates were not possible except in one study, which suggested that rhesus macaques were superior to cynomolgus macaques and common marmosets as models of human covid-19 [46]. other comparisons suggested that sars-cov elicited more severe lung pathology than sars-cov-2 and middle east respiratory syndrome coronavirus (mers-cov) [48] (table 2). the virus replicated rapidly and to high titers in the upper airway and lung in all four species [37-49].
the virus was detected in type i and ii pneumocytes and in ciliated epithelial cells of the nasal, bronchial, and bronchiolar mucosa [37-49]. this differs from mers-cov, where the virus was mainly present in type ii pneumocytes [46] (table 2). replication of the virus was also demonstrated in the jejunum, duodenum, colon, and rectum [37, 38, 40-49]. viral genome was detected in the blood of rhesus macaques, cynomolgus macaques, and marmosets in one study [46]. viral replication in nasopharyngeal and anal swabs, as well as in the lung, was higher in old macaques than in young ones [47, 48]. sars-cov-2 infection induced an igg antibody response against the sars-cov-2 spike in all species [37, 46, 48, 49] except the marmoset [46]. the antibodies were protective against a second exposure to the virus [43, 44]. there was no difference between males and females [37, 39-43, 46, 47]; however, young rhesus macaques had lower antibody titers than old macaques [46]. the innate immune response to sars-cov-2 infection was variable, with normal, high, or low leucocyte and lymphocyte counts [37, 46]. occasional reductions of cd4+ and cd8+ t cell concentrations were documented [37], as well as the transitory release of various cytokines and chemokines at different days post-inoculation [37, 46, 49]. dna and inactivated-virus-based vaccines were evaluated and showed protection in these nonhuman primates. however, the dna vaccine did not reduce the virus presence in the upper airway, while residual mild interstitial pneumonitis remained in the macaques that received the inactivated virus [40, 41]. this suggests that none of the vaccines tested so far displayed comprehensive protection against sars-cov-2 infection. several candidate dna vaccines based on various forms of the sars-cov-2 spike (s) protein were also tested in rhesus macaques [39].
the findings revealed that only the vaccine encoding the full-length spike (s) protein offered optimal protection against sars-cov-2 [64]. nonhuman primates also served for the evaluation of antiviral therapies and medical interventions such as ct and pet scanners [47]. wild-type mice (balb/c, c57bl/6), immunodeficient mice (scid), chimeric mice expressing human angiotensin-converting enzyme 2 (hace2), and mice infected with a chimeric virus carrying the sars-cov-2 rna-dependent rna polymerase (sars1/sars2-rdrp) were evaluated as models of covid-19 (table 3). moreover, knockout (ko) mice were generated to test specific immunological pathways or therapies, including ablation of type i (ifnar1−/−) and type iii interferon (ifn) receptors (il28r−/−), signal transducer and activator of transcription 2 (stat2−/−), and serum esterase (ces1c−/−). patient isolates of sars-cov-2 from different sources, with variable times of passaging on various cell cultures or in balb/c mice, were employed (table 3). mouse-adapted sars-cov-2 was developed using two methods. the first was serial passaging (up to six passages) through the lungs of balb/c mice until the virus spike receptor-binding domain (rbd) adapted to murine ace2 [54]. in the second, using genetic engineering, the sars-cov-2 rbd was remodeled to enhance its binding efficiency to murine ace2 [52]. the clinical signs and symptoms varied from none to mild weight loss, arched back, and slightly bristled fur. whole-body plethysmography was used to measure the respiratory function of the animals and showed a mild to moderate reduction, more marked in old than in young mice (table 3). likewise, the pathological changes varied according to the experimental model and included peribronchiolar inflammation, lung edema, moderate multifocal interstitial pneumonia, lymphocyte infiltration, and intra-alveolar hemorrhage.
survival of hace2 mice was decreased at 5 days post-inoculation and was attributed to high viral replication in the brain, whereas replication was weak and minimal in the lung, suggesting a pathogenic mechanism of death different from human covid-19 [52]. wild-type mice showed no pathology as compared to hace2 mice, indicating that mice lacking the human ace2 receptor cannot be infected with sars-cov-2, or only inefficiently [50, 56]. on the other hand, mouse-adapted sars-cov-2 produced more severe pathology, particularly in aged mice, than the hace2 transgenic mouse, suggesting that these models may be more relevant for the study of human covid-19 [52, 54]. however, whether the pathogenesis induced by mouse-adapted sars-cov-2 is translatable to humans warrants further studies [52, 54]. the virus replicated to high titers in the upper and lower respiratory tract in most of the genetically modified mouse models, but not in wild type. viral replication was detected outside the respiratory tract, in the intestine of hace2 mice [50] as well as in the liver and heart of mice infected with the remodeled sars-cov-2 rbd [52]. increased viral replication in ifnar1−/− ko mice suggested that interferon limits viral replication [56]. specific igg antibodies against sars-cov-2 were documented in two studies (table 3). the igg antibodies were found to cross-react in their binding to the spike protein of sars-cov, however with no cross-neutralization, suggesting conservation of the same spike protein epitopes among coronaviruses [53]. proinflammatory cytokines and chemokines were demonstrated in mouse-adapted sars-cov-2 and ko mouse models (table 3). the inflammatory response was significantly higher in old than in young mice. antiviral therapies, including remdesivir [55], ifn lambda [52], and a human monoclonal igg1 antibody against the rbd [50], were tested in these mouse models and produced a protective effect.
likewise, vaccines using viral particles expressing the sars-cov-2 s protein [52] or an rbd-based vaccine were tested and showed protection [55]. wild-type syrian hamsters and knockout hamsters for signal transducer and activator of transcription 2 (stat2−/−, lacking type i and iii interferon signaling) and the interleukin 28 receptor (il28r−/−, lacking type iii ifn signaling) were reported as models for covid-19. patient isolates of sars-cov-2 from different sources and different passages on various cell cultures were used (table 4). sars-cov-2 was administered intranasally at different titers to anesthetized hamsters. viral transmission between hamsters was demonstrated either through direct contact or indirectly via the airborne route. the clinical manifestations included weight loss, which was consistently observed. other clinical signs and symptoms, such as rapid breathing, lethargy, ruffled fur, and hunched back posture, were reported in one study [57]. the histopathological findings varied according to the experimental model and ranged from lung consolidation to multifocal necrotizing bronchiolitis, leukocyte infiltration, and edema. stat2−/− hamsters exhibited attenuated lung pathology as compared with il28r-a−/− hamsters [56]. the virus replicated to high titers in the upper and lower respiratory tract in most of the hamster models. viral replication was detected in the blood and kidney at low concentrations (table 4). stat2−/− hamsters had higher titers of infectious virus in the lung, viremia, and high levels of viral rna in the spleen, liver, and upper and lower gastrointestinal tract in comparison with wild-type and il28r-a−/− hamsters. specific igg antibodies against sars-cov-2 were documented in the sera of hamsters at different time-points from virus inoculation, ranging from 7 to 21 days.
increased expression of proinflammatory cytokine and chemokine genes was demonstrated in the lungs of the sars-cov-2-infected animals, however with no increase in circulating levels of proteins such as tnf, interferon-γ, and il-6. immunoprophylaxis with early convalescent serum achieved a significant decrease in viral lung load but not in lung pathology [57]. ferrets, cats, and dogs were inoculated intranasally or intratracheally with various doses and strains of sars-cov-2 (table 5). ferrets displayed elevated body temperature for several days, associated with signs that differed across studies. these included decreased activity and appetite, sporadic cough, and no body weight loss [60-63]. no clinical signs were reported in either cats or dogs. ferrets exhibited acute bronchiolitis [61, 63], with perivasculitis and vasculitis [63], but with no discernible pneumonia. cats disclosed lesions in the nasal, tracheal, and lung epithelial mucosa (table 5). virus replication and shedding were demonstrated in the upper airways and rectal swabs of ferrets and cats, but the extent of spread to other tissues varied in ferrets from none to multiple organs, including the lung, blood, and urine. no viral rna was detected in cats' lungs. dogs showed rna-positive rectal swabs but none in the upper or lower airways. viral transmission between ferrets and between cats was demonstrated either through direct contact [55] or indirectly via the airborne route [62]. ferrets, cats, and dogs exhibited a specific antibody response against sars-cov-2 [60, 62, 63]. a study of the ferret immune response to sars-cov-2 revealed a subdued type i and type iii interferon response that contrasts with increased chemokines and the proinflammatory cytokine il-6, which is reminiscent of the human response [61].
this systematic review of experimental animal models of sars-cov-2-induced covid-19 identified 13 peer-reviewed studies and 14 preprints that reported data on nonhuman primate [37-49], mouse [50-56], hamster [56-59], ferret [60-63], and cat and dog [63] models of covid-19. the main findings indicate that most of the animal models could mimic many features of mild human covid-19 with a full recovery phenotype [3]. they also revealed that older animals display relatively more severe illness than younger ones [38, 46, 48, 52, 54], which evokes human covid-19 [3, 6]. however, none of the animal models replicated the severe or critical patterns associated with mortality observed in humans with covid-19 [3]. the results of this systematic review are consistent with studies of animal models of sars-cov and mers-cov, which failed to replicate the full spectrum of human illness [65, 66]. nonetheless, several features of mild covid-19 in humans could be mirrored. high viral titers in the upper and lower respiratory tract and lung pathology were demonstrated in both large and small animal models. the pathology encompassed mild interstitial pneumonia, consolidation, and diffuse alveolar damage (albeit localized to a small lung area), edema, hyaline membrane formation, and inflammation. sars-cov-2 elicited a specific antibody response against various viral proteins in the sera of most of the animal models. this systematic review revealed that none of these newly established animal models replicated the common complications of human covid-19, such as ards and coagulopathy [6, 8, 28-33, 67, 68]. ards can be particularly severe and result in refractory hypoxemia requiring maximum respiratory supportive measures in the intensive care unit [6, 67, 68].
the coagulopathy can lead to severe complications such as massive pulmonary embolism, cerebrovascular stroke, and mesenteric infarction, including in younger people [8, 28, 32, 33]. the pathology underlying these two complications was recently revealed by post-mortem studies disclosing diffuse alveolar damage involving the whole lung, hyaline membrane formation, and infiltration with inflammatory cells, thus leaving no air space open for ventilation [17, 18, 64, 69, 70]. these studies also detected diffuse and widespread thrombosis in the micro- and macro-circulation, including the pulmonary circulation, compromising lung perfusion [17, 18]. this double hit, affecting ventilation and perfusion simultaneously, underlies the intractable hypoxemia that contributed to the high mortality. none of the animal models replicated this respiratory failure or these thromboembolic manifestations. the mechanisms of the lung injury and coagulopathy are not well understood, although several known pathways have been postulated, including a cytokine storm leading to upregulation of tissue factor [5, 9, 24], activation/injury of the endothelium infected by the virus [30, 67, 71], complement activation [72], alveolar hypoxia promoting thrombosis [73], and antiphospholipid autoantibodies and lupus anticoagulant [74, 75] directly modulating hemostasis and the coagulation cascade. hence, the development of animal models that replicate the dysregulation of inflammation and coagulation could be important, as these would allow deciphering of the intimate mechanisms at play. this, in turn, may aid in identifying therapeutic targets and in testing immunotherapy, anticoagulation, and thrombolytic interventions, and thereby may improve the outcome. both antiviral and vaccine therapies were tested in rhesus macaques and mice infected with sars-cov-2 [40-42]. the antiviral drug stopped viral replication and improved the pneumonitis [42, 55].
the vaccines induced an increase in titers of neutralizing antibodies in the sera that correlated with decreased viral replication and prevented the lung pathology [39-41]. these results represent substantial proof of concept of antiviral or vaccine efficacy against sars-cov-2 in animal models. however, because of the lack of overt clinical illness, the rapid clearance of the virus, and the spontaneous improvement of the pneumonitis without lethality, the models do not permit full assessment of the duration of vaccine protection or of the effect of antiviral therapy on survival. since the emergence of sars-cov infection in 2003 [76], followed by mers-cov in 2012 [77], and now with covid-19, researchers have not been able to develop a model of coronavirus infection that reproduces the severity and lethality seen in humans [65, 66]. one of the well-known reasons lies in differences in the ace2 receptor-binding domain structure across species [78]. humans and nonhuman primates have conserved a comparable structure that allows high-affinity binding to sars-cov-2 [78]. hamsters, ferrets, and cats maintain an intermediate affinity, while mice exhibit very low affinity [78]. the latter explains why the wild-type mouse does not support sars-cov-2 replication, and hence the necessity of creating a chimera that expresses human ace2 to enable the use of this species as a model of covid-19 [50]. more recently, a study applying single-cell rna sequencing to nonhuman primates uncovered another explanation that may underlie the difference between nonhuman primates and humans in expressing the complex phenotype of covid-19 [79]. the study reveals that the cellular expression and distribution of ace2 and tmprss2, which are essential for virus entry into cells and its spread inside the body, differ in the lung, liver, and kidney between the two species.
ace2 expression was lower in type ii pneumocytes and higher in ciliated cells in the nonhuman primate lung as compared to humans [40]. this is particularly significant, as type ii pneumocytes are critical targets of sars-cov-2 in humans and central to the pathogenesis of lung injury. finally, the innate immune response, including the defense system against viruses, diverged during evolution at both the transcriptional and cellular levels, which may also explain why sars-cov-2 hardly progresses outside the respiratory system in these animals [80]. taken together, these fundamental differences represent a real challenge to the successful development of an animal model that reproduces human covid-19. this systematic review has a few limitations. first, a high number of the included articles are preprints that have not been peer-reviewed. second, animal models from the same species were difficult to compare across studies, as they used different viral strains, inoculum sizes, routes of administration, and timings of tissue collection. this systematic review revealed that animal models of covid-19 mimic mild human covid-19, but not the severe form of covid-19 associated with mortality. it also disclosed the knowledge generated by these models of covid-19, including viral dynamics and transmission, pathogenesis, and the testing of therapies and vaccines. likewise, the study underlines the distinct advantages and limitations of each model, which should be considered when designing studies, interpreting pathogenic mechanisms, or extrapolating therapy or vaccine results to humans. finally, harmonization of animal research protocols to generate results that are consistent, reproducible, and comparable across studies is needed. supplementary information accompanies this paper at https://doi.org/10.1186/s13054-020-03304-8. additional file 1. data extraction, appraisal, and outcome.
a pneumonia outbreak associated with a new coronavirus of probable bat origin
a novel coronavirus from patients with pneumonia in china
characteristics of and important lessons from the coronavirus disease 2019 (covid-19) outbreak in china: summary of a report of 72 314 cases from the chinese center for disease control and prevention
viral and host factors related to the clinical outcome of covid-19
clinical and immunological features of severe and moderate coronavirus disease 2019
baseline characteristics and outcomes of 1591 patients infected with sars-cov-2 admitted to icus of the lombardy region
cardiovascular implications of fatal outcomes of patients with coronavirus disease 2019 (covid-19)
high risk of thrombosis in patients with severe sars-cov-2 infection: a multicenter prospective cohort study
clinical features of patients infected with 2019 novel coronavirus in wuhan
cryo-em structure of the 2019-ncov spike in the prefusion conformation
sars-cov-2 cell entry depends on ace2 and tmprss2 and is blocked by a clinically proven protease inhibitor
sars-cov-2 entry factors are highly expressed in nasal epithelial cells together with innate immune genes
integrated analyses of single-cell atlases reveal age, gender, and smoking status associations with cell type-specific expression of mediators of sars-cov-2 viral entry and highlights inflammatory programs in putative target cells
sars-cov-2 receptor ace2 is an interferon-stimulated gene in human airway epithelial cells and is detected in specific cell subsets across tissues
asymptomatic and human-to-human transmission of sars-cov-2 in a 2-family cluster
transmission of 2019-ncov infection from an asymptomatic contact in germany
histopathology and ultrastructural findings of fatal covid-19 infections
autopsy findings and venous thromboembolism in patients with covid-19
offline: a global health crisis? no, something far worse
global death from covid-19. covid-19 map - johns hopkins coronavirus resource center
clinical course and risk factors for mortality of adult inpatients with covid-19 in wuhan, china: a retrospective cohort study
viral dynamics in mild and severe cases of covid-19
breadth of concomitant immune responses prior to patient recovery: a case report of non-severe covid-19
heightened innate immune responses in the respiratory tract of covid-19 patients
persistent sars-cov-2 presence is companied with defects in adaptive immune system in non-severe covid-19 patients
functional exhaustion of antiviral lymphocytes in covid-19 patients
lymphopenia predicts disease severity of covid-19: a descriptive and predictive study
acute pulmonary embolism associated with covid-19 pneumonia detected with pulmonary ct angiography
incidence of thrombotic complications in critically ill icu patients with covid-19
covid-19 critical illness pathophysiology driven by diffuse pulmonary thrombi and pulmonary endothelial dysfunction responsive to thrombolysis. medrxiv
fibrinolytic abnormalities in acute respiratory distress syndrome (ards) and versatility of thrombolytic drugs to treat covid-19
abdominal imaging findings in covid-19: preliminary observations
large-vessel stroke as a presenting feature of covid-19 in the young
anticoagulant treatment is associated with decreased mortality in severe coronavirus disease 2019 patients with coagulopathy
the search for a covid-19 animal model
preferred reporting items for systematic reviews and meta-analyses: the prisma statement
respiratory disease in rhesus macaques inoculated with sars-cov-2
age-related rhesus macaque models of covid-19
dna vaccine protection against sars-cov-2 in rhesus macaques
chadox1 ncov-19 vaccination prevents sars-cov-2 pneumonia in rhesus macaques
development of an inactivated vaccine candidate for sars-cov-2
clinical benefit of remdesivir in rhesus macaques infected with sars-cov-2
sars-cov-2 infection protects against rechallenge in rhesus macaques
lack of reinfection in rhesus macaques infected with sars-cov-2
ocular conjunctival inoculation of sars-cov-2 can cause mild covid-19 in rhesus macaques
comparison of sars-cov-2 infections among 3 species of non-human primates
characteristic and quantifiable covid-19-like abnormalities in ct- and pet/ct-imaged lungs of sars-cov-2-infected crab-eating macaques
comparative pathogenesis of covid-19, mers, and sars in a nonhuman primate model
establishment of an african green monkey model for covid-19
the pathogenicity of sars-cov-2 in hace2 transgenic mice
rapid selection of a human monoclonal antibody that potently neutralizes sars-cov-2 in two animal models
a mouse-adapted sars-cov-2 model for the evaluation of covid-19 medical countermeasures
cross-reactive antibody response between sars-cov-2 and sars-cov infections
rapid adaptation of sars-cov-2 in balb/c mice: novel mouse model for vaccine efficacy
remdesivir potently inhibits sars-cov-2 in human lung cells and chimeric sars-cov expressing the sars-cov-2 rna polymerase in mice
stat2 signaling as double-edged sword restricting viral dissemination but driving severe pneumonia in sars-cov-2 infected hamsters
simulation of the clinical and pathological manifestations of coronavirus disease 2019 (covid-19) in golden syrian hamster model: implications for disease pathogenesis and transmissibility
rapid isolation of potent sars-cov-2 neutralizing antibodies and protection in a small animal model
pathogenesis and transmission of sars-cov-2 in golden hamsters
infection and rapid transmission of sars-cov-2 in ferrets
imbalanced host response to sars-cov-2 drives development of covid-19
sars-cov-2 is transmitted via contact and via the air between ferrets
susceptibility of ferrets, cats, dogs, and other domesticated animals to sars-coronavirus 2
autopsy in suspected covid-19 cases
is there an ideal animal model for sars?
development of animal models against emerging coronaviruses: from sars to mers coronavirus
pulmonary vascular endothelialitis, thrombosis, and angiogenesis in covid-19
management of covid-19 respiratory distress
pathological findings of covid-19 associated with acute respiratory distress syndrome
lung pathology of fatal severe acute respiratory syndrome
endothelial cell infection and endotheliitis in covid-19
analysis of complement deposition and viral rna in placentas of covid-19 patients
the stimulation of thrombosis by hypoxia
coagulopathy and antiphospholipid antibodies in patients with covid-19
lupus anticoagulant and abnormal coagulation tests in patients with covid-19
a major outbreak of severe acute respiratory syndrome in hong kong
isolation of a novel coronavirus from a man with pneumonia in saudi arabia
broad host range of sars-cov-2 predicted by comparative and structural analysis of ace2 in vertebrates
single-cell atlas of a non-human primate reveals new pathogenic mechanisms of covid-19
gene expression variability across cells and species shapes innate immunity
publisher's note
key: cord-311086-i4e0rdxp authors: adekola, hafeez aderinsayo; adekunle, ibrahim ayoade; egberongbe, haneefat olabimpe; onitilo, sefiu adekunle; abdullahi, idris nasir title: mathematical modeling for infectious viral disease: the covid‐19 perspective date: 2020-08-17 journal: j public aff doi: 10.1002/pa.2306 sha: doc_id: 311086 cord_uid: i4e0rdxp in this study, we examined various forms of mathematical models that are relevant for the containment, risk analysis, and features of covid‐19. greater emphasis was laid on the extension of the susceptible–infectious–recovered (sir) models for policy relevance in the time of covid‐19. these mathematical models play a significant role in the understanding of covid‐19 transmission mechanisms, structures, and features. considering that the disease has spread sporadically around the world, causing large-scale socioeconomic disruption unwitnessed in contemporary times since world war ii, researchers, stakeholders, governments, and society at large are actively engaged in finding ways to reduce the rate of infection until a cure or vaccination procedure is established.
we advanced arguments for the various forms of mathematical models of epidemics and highlighted their relevance in the containment of covid‐19 at the present time. mathematical models address the need for understanding the transmission dynamics and other significant factors of the disease that would aid policymakers in making accurate decisions and reducing the rate of transmission of the disease. severe acute respiratory syndrome coronavirus-2 (sars-cov-2), a novel β-coronavirus, is the pathogen responsible for coronavirus disease 2019 (covid-19; li, geng, peng, meng, & lu, 2020).
the viral genome sequence of sars-cov-2 suggests close relatedness to sars-like bat covs, but most genome-encoded proteins of sars-cov-2 are similar to those of the sars-covs, with differences in two of the nonstructural proteins (nsp2 and nsp3), the spike protein, and the receptor-binding domain (rbd). studies have shown that sars-cov-2 is capable of mutation, with two types majorly classified as the l-type and the s-type. the s-type has been reported to have evolved when jumping from animal to man, while the l-type evolved later. although both are currently involved in the pandemic, the l-type has been reported to be more prevalent than the s-type (guo et al., 2020). how mathematical models explain these chain reactions and transmission mechanisms forms the core of what follows. the severity, features, structures, risk analysis, and containment of the virus have been studied across various disciplines and dimensions (adekunle, onanuga, akinola & ogunbanjo, 2020). a notable consensus has been the adoption of social distancing and the practice of good hygiene as measures to deter virus proliferation and flatten the epidemic growth curve, such that the fast-rising number of covid-19-attributable deaths can be reduced (sameni, 2020). however, the relationship between the epidemic growth curve and its containment strategies remains grossly understudied in extant literature. the intricacies of this unobserved factor underpin this study. we complement available studies on the subject matter, extend the sir models, and rely on inferences drawn from available studies using extensions of the sir models. the application of these models consists of the use of mathematical tools and a specific language to explain and predict the behavior of infectious viral disease. these models could be deterministic, non-deterministic, or could contain branching processes that aid the prediction of infectious disease.
mathematical models help to make mental models quantitative; this involves writing down a set of equations that mimics reality, which is then solved for specific values of the parameters within the equations (panovska-griffiths, 2020; revathi & rangnathan, 2020). mathematical modeling simplifies reality and answers questions using subsets of data (panovska-griffiths, 2020). predictive mathematical models are essential for understanding the course of an epidemic. one of the most commonly used models is the susceptible-infectious-recovered (sir) model for human-to-human transmission (giordano et al., 2020). however, modelers need to acquire at least one dataset with relevant data points before developing or validating a model (nandal, 2020). predictive models for large countries could be problematic because they aggregate heterogeneous sub-epidemics (jewell, lewnard, & jewell, 2020). various factors, such as individual characteristics and population distribution, make significant contributions and thus affect model predictions. the underlying mathematical model, developed as far back as the 1920s, is still in use today; this basic model is referred to as the sir model (freiberger, 2014). the sir model divides the population into three groups, as in shil (2016): the susceptible (s), the infectious (i), and the recovered (r). it was developed by kermack and mckendrick to describe an influenza epidemic (bauer, 2017). it assumes the introduction of an infected individual into a population whose members have not been previously exposed to the pathogen; therefore, all are susceptible (s). each infected individual (i) transmits to susceptible members of the population with a mean transmission rate β. at the end of the infectious period, individuals who recover from the infection are referred to as recovered (r) members of the population. if the mean recovery rate is α, then the mean infectious period of any individual is given by 1/α.
the differential equations describing transmission in the basic sir model are given by

ds(t)/dt = -β s(t) i(t)   (1)
di(t)/dt = β s(t) i(t) - α i(t)   (2)
dr(t)/dt = α i(t)   (3)

where s(t) and i(t) represent the number of individuals in the susceptible and infectious states, respectively, at any time t, while the rates of change of s(t) and i(t) with time are represented by ds(t)/dt and di(t)/dt, respectively. if the population is considered constant, with no agent leaving or entering the system, then s(t) + i(t) + r(t) = n, so that ds(t)/dt + di(t)/dt + dr(t)/dt = 0. the number of susceptible individuals decreases as the number of incidences increases, and the epidemic declines as more individuals recover from the disease (shil, 2016). the basic reproduction number, denoted r 0, measures the average number of secondary infections generated by one infectious individual introduced into a fully susceptible population. the severity of an epidemic and its rate of progression depend on the value of the basic reproduction number: if r 0 is greater than 1, the epidemic will continue, but if it is less than 1, the epidemic will fade out (delamater, street, leslie, yang, & jacobsen, 2019). the basic reproduction number can be calculated from the growth rate (r) of the epidemic, obtained from the cumulative incidence data in the initial growth phase of the outbreak, as r 0 = 1 + r/α. numerical solutions of the ordinary differential equations can be obtained with an appropriate application using computer simulations, and this model has been used to explain the transmission and repeated outbreaks of measles in new york between 1930 and 1962. the sir model can be further modified to consider demographics and weather/seasonal variations. the modified sir model has been used to explain viral epidemics such as influenza, justifying its applicability to the covid-19 context. certain infectious diseases have an incubation period, or exposed state, in an individual following infection until symptoms are observed.
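the sir equations described above can be integrated numerically with a simple forward euler scheme. the sketch below is an illustration only (not code from any of the papers); the population is normalized to 1 and the parameter values are assumptions chosen so that β/α = 2.5:

```python
def simulate_sir(beta, alpha, s0, i0, r0_init, dt=0.1, steps=2000):
    """Euler integration of the basic SIR equations."""
    s, i, r = float(s0), float(i0), float(r0_init)
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i * dt   # beta * s(t) * i(t)
        recoveries = alpha * i * dt          # alpha * i(t)
        s -= new_infections
        i += new_infections - recoveries
        r += recoveries
        history.append((s, i, r))
    return history

# illustrative parameters: population normalized to 1, beta/alpha = 2.5 > 1,
# so an epidemic occurs and eventually burns out as s(t) is depleted
traj = simulate_sir(beta=0.5, alpha=0.2, s0=0.999, i0=0.001, r0_init=0.0)
```

note that the euler update conserves s + i + r exactly (up to floating-point rounding), mirroring the constant-population assumption in the text.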
in other words, the susceptible-exposed-infectious-recovered (seir) model accounts for the exposed or latent stage (shil, 2016). here each individual who receives the virus first exists in the exposed or latent state (e), during which the virus is incubated but the individual does not transmit the infection to anyone; with the onset of symptoms, the individual makes a transition to the infectious state. considering a constant population size, the set of differential equations becomes

ds(t)/dt = -β s(t) i(t)
de(t)/dt = β s(t) i(t) - k e(t)
di(t)/dt = k e(t) - α i(t)

while equation (3) remains as dr(t)/dt = α i(t). the basic reproduction number of the seir model can be determined from the growth rate r using the formula

r 0 = (1 + r/α)(1 + r/k)

where the mean infective period is 1/α and the mean incubation period is 1/k. the seir model with suitable adaptations has been widely applied to various disease epidemics such as chickenpox and sars, and its relevance has been advanced for the analysis of the dynamic transmission of covid-19 in this context. the next model is a simple model for viral epidemics involving asymptomatic individuals in the population, in a situation without any interventions. an individual testing positive in serological or blood tests for the disease without showing symptoms is referred to as asymptomatic and is denoted a in the susceptible-exposed-infectious-asymptomatic-recovered (seiar) model. considering a constant population, the total population is initially susceptible; there is no transmission from individuals in the latent state, and a fraction p of them proceeds to the infectious state, while the remaining fraction (1 - p) proceeds to the asymptomatic state at the same rate k, with asymptomatic individuals having a reduced ability to transmit the infection. if q is the factor that determines transmissibility in asymptomatic individuals, then 0 < q < 1. the ordinary differential equations of the transmission process can be described as follows:

ds(t)/dt = -β s(t) [i(t) + q a(t)]
de(t)/dt = β s(t) [i(t) + q a(t)] - k e(t)
di(t)/dt = p k e(t) - α i(t)
da(t)/dt = (1 - p) k e(t) - α a(t)
dr(t)/dt = α [i(t) + a(t)]
dc(t)/dt = p k e(t)

where c denotes the cumulative number of infectives.
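a minimal numerical sketch of the seiar dynamics described above (illustrative only; the values of β, α, k, p, and q below are assumptions, not parameters fitted to any outbreak):

```python
def simulate_seiar(beta, alpha, k, p, q, s0, e0, dt=0.05, steps=4000):
    """Euler integration of the SEIAR equations; c tracks the cumulative
    number of symptomatic infectives."""
    s, e, i, a, r, c = float(s0), float(e0), 0.0, 0.0, 0.0, 0.0
    for _ in range(steps):
        force = beta * s * (i + q * a)   # asymptomatic transmit at reduced rate q*beta
        leave_e = k * e                  # exits from the latent state
        ds = -force
        de = force - leave_e
        di = p * leave_e - alpha * i         # fraction p becomes symptomatic
        da = (1 - p) * leave_e - alpha * a   # fraction (1 - p) stays asymptomatic
        dr = alpha * (i + a)
        dc = p * leave_e
        s += ds * dt; e += de * dt; i += di * dt
        a += da * dt; r += dr * dt; c += dc * dt
    return s, e, i, a, r, c

# illustrative parameters (assumed, not fitted): 60% of cases symptomatic,
# asymptomatic cases half as transmissible
final = simulate_seiar(beta=0.6, alpha=0.2, k=0.3, p=0.6, q=0.5, s0=0.999, e0=0.001)
```

because only symptomatic transitions feed c, the cumulative count c ends up smaller than the total recovered fraction r, matching the role of c in the equations.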
this model was used to explain the transmission dynamics of the swine flu outbreak in 2009 at a residential school in maharashtra, india (shil, 2016). a further extension incorporates the hospitalization of a fraction of infectious individuals: here the population is classified as in seiar, with j(t) and d(t) denoting the hospitalized and the dead, respectively. considering the total population constant at any time, the ordinary differential equations of the transmission process take an analogous form, where μ represents the rate of birth and natural death, while the cumulative number of infections is represented by c(t). epidemic data from the spanish flu pandemic in geneva were analyzed using this extended seiar model, and all parameters of the model were determined. the seir and seiar models have been further extended by introducing various parameters that play crucial roles in public health interventions: quarantine, travel restrictions, vaccination, or dosage of antivirals (shil, 2016). globally, radical alterations with rapidly changing socioeconomic dynamics have been occurring due to the covid-19 pandemic. several countries have been on full or partial lockdown while adhering to social distancing measures as they wait for a specific treatment modality such as vaccines (sinha, 2020). public information such as incidence or prevalence of infection, morbidity, or mortality due to covid-19 can be used to solve mathematical models; solutions from these models are then recalibrated repeatedly until they are suitable for predicting the future behavior of sars-cov-2 (panovska-griffiths, 2020). the covid-19 pandemic has been modeled by various researchers with the aim of simulating the infections within the population (shaikh, shaikh, & nisar, 2020).
most models represent individuals transitioning between compartments in a given community; these compartments are based on each individual's infectious state and the related population sizes with respect to time (shaikh et al., 2020). lin et al. (2020) suggested a conceptual model for covid-19 that effectively captures the timeline of the disease epidemic, while chen et al. (2020) examined a model based on stage-based transmissibility of sars-cov-2 (lin et al., 2020). khan and atangana (2020) formulated a model of people versus covid-19, in which n represents the total population and is further divided into five subclasses: susceptible people s(t), exposed people e(t), infected people i(t), asymptomatic people a(t), and recovered people r(t); the reservoir population is denoted q(t). since most mathematical models utilize ordinary differential equations of integer order for understanding the dynamics of biological systems, every model depending on such classical derivatives has been found to have restrictions (shaikh et al., 2020). these restrictions can be overcome using fractional calculus, as recommended by caputo and fabrizio. researchers such as shaikh et al. (2020) applied the caputo-fabrizio fractional derivative operator to study the dynamics of covid-19 using the mathematical model suggested by khan and atangana (2020), in the form of a system of nonlinear differential equations involving the caputo-fabrizio operator with appropriate initial conditions (khan & atangana, 2020; shaikh et al., 2020). early dynamics of covid-19 transmission were studied by researchers such as kucharski et al.
(2020), where a combination of a stochastic transmission model with data on cases in wuhan and on international cases that originated from wuhan was used to estimate how transmission varied between january and february 2020; these estimates were then used to calculate the probability that newly introduced cases might generate outbreaks in new areas (kucharski et al., 2020). their findings estimated that the daily reproduction number (r t) in wuhan declined from 2.35 one week before travel restrictions were introduced to 1.05 one week after (kucharski et al., 2020). based on these estimates, locations with similar transmission potential to wuhan have at least a 50% chance of an outbreak for every four independently introduced cases (kucharski et al., 2020). several modeling studies have used the seir model to study the transmission dynamics of covid-19. wu, leung, and leung (2020) used the seir model to describe the transmission dynamics and forecast the spread of the disease using reported data from december 31, 2019, to january 28, 2020. the study also estimated the basic reproductive number to be 2.68 (wu, leung, & leung, 2020). another study, by read, bridgen, cummings, ho, and jewell (2020), also used public data; a related study concluded that covid-19 would remain endemic, demanding long-term prevention and intervention measures (yang & wang, 2020). a different mathematical modeling approach employed a susceptible-exposed-infectious-quarantined-diagnosed-recovered (seiqdr) based model, an expansion of the seir model. this six-chambered model was used to study the transmission mechanism of covid-19 and the implemented prevention and control measures; with the aid of time series and kinetic model analysis, a basic reproductive number value of 4.01 was obtained (li, geng, et al., 2020).
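the outbreak-probability statement above can be illustrated with a simple branching-process calculation: if each introduced case starts an independent transmission chain with poisson-distributed offspring of mean r0, the chain goes extinct with probability q solving q = exp(r0(q - 1)), and the chance that at least one of n introductions sparks a large outbreak is 1 - q^n. this is a sketch of the idea only; kucharski et al. used a more detailed stochastic model:

```python
import math

def extinction_prob(r0, iters=200):
    """Smallest fixed point of q = exp(r0 * (q - 1)), the extinction
    probability of a branching process with Poisson(r0) offspring."""
    q = 0.0
    for _ in range(iters):
        q = math.exp(r0 * (q - 1.0))
    return q

def outbreak_prob(r0, introductions):
    """Probability that at least one of n independent introductions
    escapes early extinction."""
    return 1.0 - extinction_prob(r0) ** introductions

# post-restriction r_t in wuhan was estimated near 1.05
p4 = outbreak_prob(1.05, 4)
```

for r0 below 1 the extinction probability is 1 (no sustained outbreak), while for the pre-restriction estimate of 2.35 even a handful of introductions makes an outbreak almost certain.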
the findings of the study suggested that while recovered individuals might not be re-infected, due to the presence of antibodies to covid-19, the bodies of deceased individuals should be handled carefully to prevent viral transmission. kim et al. (2020) also used a seiqr model that factored in behavioral changes to study the transmission of covid-19 in korea and to predict the likely size and end of the epidemic. the model predicted over 10,000 cases over time until june, so it was suggested that sustained long-term non-pharmaceutical interventions would significantly reduce transmission among the population (kim et al., 2020). although mathematical models for covid-19 have mainly forecast a few areas relating to pathogen spread, such as the basic reproductive number of sars-cov-2, population control measures, and the percentage of asymptomatic people, there is still a paucity of modeling studies focusing on predicting the magnitude of the global spread of the virus, the duration of the pandemic, and possible effective interventions (nandal, 2020). nevertheless, consumers of these models, such as the public, media, and politicians, need these predictions to plan for various interventions that would be reliable in combating the disease. models are very useful tools, particularly for short-term accurate predictions; they help policymakers to make decisions and allocate adequate resources toward disease control through predictions of disease spread and of the infected population (kucharski et al., 2020; revathi & rangnathan, 2020). mathematical models can be used to understand how and where the disease is most likely to spread, while avoiding many trial experiments or random guesses with the real population.
most mathematical models used during this epidemic are extensions of the seir model, a compartmental model based on the behavior of the population that enables simulation of how non-pharmaceutical prevention and intervention measures, such as lockdowns, social distancing, and self-isolation, can significantly affect the morbidity and mortality of the population over time (sameni, 2020). although mathematical models mimic reality using equations solved for specific values of user-set parameters, a mathematical model is only as good as the data it uses. nevertheless, mathematical models are potent tools for understanding the transmission dynamics of an infectious viral disease. the seir model seems the most reliable extension of the sir models during this pandemic, due to its plausibility in explaining heterogeneous changes in the features, structures, containment, and risk analysis of the virus transmission. since infectious diseases have an incubation period, or exposed state, in an individual following infection until symptoms are observed, the seir model accounts for the exposed or latent stage, which is concomitant with real-time observation.
references:
modelling spatial variations of coronavirus disease
mathematical epidemiology: past, present, and future
a mathematical model for simulating the phase-based transmissibility of a novel coronavirus
complexity of the basic reproduction number (r 0)
fighting epidemics with maths. plus.maths.org/content/fighting-epidemics-maths
modelling the covid-19 epidemic and implementation of population-wide interventions in italy
the origin, transmission and clinical therapies on coronavirus disease 2019 (covid-19) outbreak - an update on the status
mathematical modeling of the spread of the coronavirus disease 2019 (covid-19) taking into account the undetected infections. the case of china
predictive mathematical models of the covid-19 pandemic: underlying principles and value of projections
modeling the dynamics of novel coronavirus (2019-ncov) with fractional derivative
prediction of covid-19 transmission dynamics using a mathematical model considering behavior changes
early dynamics of transmission and control of covid-19: a mathematical modelling study
molecular immune pathogenesis and diagnosis of covid-19
mathematical modeling and epidemic prediction of covid-19 and its significance to epidemic prevention and
a conceptual model for the coronavirus disease 2019 (covid-19) outbreak in wuhan, china with individual reaction and governmental action
mathematical modeling the emergence and spread of new pathogens: insight for sars-cov and other similar viruses. pharma r&d today
mathematical modeling of covid-19 transmission dynamics with a case study of wuhan
can mathematical modelling solve the current covid-19 crisis?
novel coronavirus 2019-ncov: early estimation of epidemiological parameters and epidemic predictions
why do we need mathematical models for. web.evolbio.mpg.de
mathematical modeling of epidemic diseases
a mathematical model of covid-19 using fractional derivative: outbreak in india with dynamics of transmission and control
mathematical modeling of viral epidemics: a review
mathematical modeling to estimate the reproductive number and the outbreak size of covid-19: the case of india and the world
estimation of the transmission risk of the 2019-ncov and its implication for public health interventions
on the origin and continuing evolution of sars-cov-2
modeling and prediction of covid-19 in mexico applying mathematical and computational models
epidemics in african settings: a mathematical modelling study. london: centre for mathematical modelling of infectious diseases
who. (2020). coronavirus disease 2019 (covid-19) situation report-68
genome composition and divergence of the novel coronavirus (2019-ncov) originating in china
nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study
a mathematical model for the novel coronavirus epidemic in wuhan
key: cord-269873-4hxwo5kt authors: r., mohammadi; m., salehi; h., ghaffari; a. a, rohani; r., reiazi title: transfer learning-based automatic detection of coronavirus disease 2019 (covid-19) from chest x-ray images date: 2020-10-01 journal: j biomed phys eng doi: 10.31661/jbpe.v0i0.2008-1153 sha: doc_id: 269873 cord_uid: 4hxwo5kt background: coronavirus disease 2019 (covid-19) is an emerging infectious disease and global health crisis. although real-time reverse transcription polymerase chain reaction (rt-pcr) is the most widely used laboratory method to detect covid-19 from respiratory specimens, it suffers from several main drawbacks, such as being time-consuming, giving high false-negative rates, and having limited availability. therefore, automatic detection of covid-19 is required. objective: this study aimed to use automated deep convolutional neural network based pre-trained transfer models for the detection of covid-19 infection in chest x-rays. material and methods: in a retrospective study, we applied visual geometry group (vgg)-16, vgg-19, mobilenet, and inceptionresnetv2 pre-trained models for detecting covid-19 infection from 348 chest x-ray images. results: our proposed models were trained and tested on a dataset that was previously prepared. all the proposed models provide accuracy greater than 90.0%. the pre-trained mobilenet model provides the highest classification performance for automated covid-19 classification, with 99.1% accuracy, in comparison with the other three proposed models.
the areas under the receiver operating characteristic (roc) curves (auc) of the vgg16, vgg19, mobilenet, and inceptionresnetv2 models are 0.92, 0.91, 0.99, and 0.97, respectively. conclusion: all the proposed models were able to perform binary classification for covid-19 diagnosis with an accuracy of more than 90.0%. our data indicate that mobilenet can be considered a promising model to detect covid-19 cases. in the future, by adding more covid-19 chest x-ray samples to the training dataset, the accuracy and robustness of our proposed models will increase further. at the present time, coronavirus disease 2019 (covid-19) is an emerging infectious disease and global health crisis. this virus was originally identified in wuhan, china, in december 2019 [1]. to date (on 28 july 2020 at 10:46 gmt), 16,672,569 cases of covid-19 have been reported around the world, with 657,265 deaths and 10,263,092 recovered cases [2]. in more severe cases, covid-19 causes acute respiratory distress syndrome (ards), pneumonia, and respiratory failure. in fact, highly pathogenic covid-19 mainly infects the lower respiratory tract and activates dendritic and epithelial cells, thereby resulting in the expression of pro-inflammatory cytokines that cause pneumonia and ards, which can be fatal [3]. real-time reverse transcription polymerase chain reaction (rt-pcr) is the most widely used laboratory method to detect covid-19 from respiratory specimens, such as nasopharyngeal or oropharyngeal swabs [4]. rt-pcr is a sensitive method for diagnosing covid-19, but it suffers from several main drawbacks, including being time-consuming, giving high false-negative rates, and having limited availability [5] [6] [7]. to resolve these drawbacks, medical imaging techniques such as chest x-ray and chest computed tomography (ct) scans can be used as alternative tools to detect and diagnose covid-19 [8, 9].
radiologists consider chest x-ray images, rather than ct scans, as the primary radiographic examination to detect infection caused by covid-19 [10], due to the high availability of x-ray machines in most hospitals, their lower ionizing radiation, and their low cost compared to ct scanners. hence, in the present study, we preferred chest x-ray images over ct scans. chest x-ray images readily show the radiological signatures of covid-19 infection, but they must be analyzed and diagnosed by an expert radiologist; of note, this is time-consuming and susceptible to error [11]. therefore, automatic detection of covid-19 from chest x-ray images is required. to date, several studies have used deep learning based methods to automate the analysis of radiological images [12]. deep learning based methods have previously been utilized to diagnose tuberculosis from chest x-ray images [13]. it is possible to initialize and train the weights of networks on a large dataset using deep learning based methods and then fine-tune these weights of pre-trained networks on a small dataset [14]. owing to the limited available dataset related to covid-19, pre-trained neural networks can be utilized for the diagnosis of covid-19; however, such approaches applied to chest x-ray images have been very limited until now [15]. to this end, the present study aimed to use automated deep convolutional neural network based pre-trained transfer models for the detection and diagnosis of covid-19 infection in chest x-rays. this study was designed as a retrospective study. transfer learning is a machine learning technique that reuses a model pre-trained on one problem for a new, related problem [16]. in fact, transfer learning applies pre-trained models to new machine learning tasks. in the analysis of medical data, one of the major research challenges for health-care researchers can be attributed to the limited available datasets [7].
besides, deep learning models have several drawbacks, such as requiring a lot of data for training and data labeling that is costly and time-consuming [7]. using transfer learning allows training with fewer data. in addition, the computational cost of transfer learning models is lower. over the last decades, the use of deep learning algorithms and convolutional neural networks (cnns) has resulted in breakthroughs in many fields, such as industry, agriculture, and medical disease diagnostics [17] [18] [19]. the cnn architecture aims to mimic the human visual cortex system [11]. basically, there are three main layer types in a cnn: the convolution layer, the pooling layer, and the fully connected layer [20]. the learning of the model is performed by the convolution and pooling layers, whereas the role of the fully connected layer is classification [20]. herein, four well-known pre-trained cnn models were applied to detect infection in chest x-rays, classifying the images into two groups, normal or covid-19: 1) vgg16, 2) vgg-19, 3) mobilenet, and 4) inceptionresnetv2. the vgg architectures were designed by oxford university's visual geometry group [21]. vgg-16 consists of 13 convolutional layers and 3 fully connected layers, whereas vgg-19 is a combination of 16 convolutional layers and 3 fully connected layers [21]; therefore, vgg-19 is considered a deeper cnn architecture in comparison with vgg-16. the mobilenet architecture was proposed by howard et al. in 2017 [22]. the mobilenet model is built on a streamlined architecture that applies depthwise separable convolutions to build lightweight deep neural networks. depthwise separable convolutions consist of the following layers: 1) depthwise convolutions and 2) pointwise convolutions [22]. in 2016, szegedy et al. proposed inceptionresnetv2 as a combined architecture [23]. this model applies the ideas of inception blocks and residual layers together.
the use of residual connections prevents the degradation problem associated with deep networks and hence decreases the training time [23] . the inceptionresnetv2 architecture is 164 layers deep and can assist in classifying x-ray images as normal or covid-19. in the present study, an open-source dataset was used. covid-19 chest x-ray images are available at the github repository (https://github.com/ieee8023/covid-chestxray-dataset) prepared by cohen et al. [24] . the repository is an open dataset of covid-19 cases containing both x-ray images and ct scans, and new images are regularly added. in this study, we used the chest x-ray images to classify covid-19. at the time of preparing this study, the dataset consisted of about 181 covid-19 chest x-ray images. as displayed in table 1 , the training set comprised 348 images: 236 negative and 112 positive covid-19 chest x-ray images. also, 55 negative and 33 positive x-ray images were used to create the validation dataset, while 73 negative and 36 positive images were used for testing. figure 1 shows some examples of the chest x-ray images taken from the dataset. owing to the lack of uniformity in the dataset and the various sizes of the x-ray images, we rescaled all the chest x-ray images. of note, the samples in the dataset are limited; hence, data augmentation techniques were implemented to resolve this problem. in addition, image augmentation methods can improve classification model performance. in this study, data augmentation was performed with a rotation range of 20, a zoom range of 0.05, a width shift range of 0.1, a height shift range of 0.1, a shear range of 0.05, horizontal/vertical flipping, and the fill mode called "nearest". as stated earlier, the dataset containing covid-19 chest x-ray images used in our study is publicly available on github. 
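the dataset split reported above can be tallied directly; the augmentation settings are written here as a keras-style configuration dictionary, where the exact keyword names are an assumption based on common practice:

```python
# dataset split as reported in table 1 (image counts per class)
split = {
    "train": {"negative": 236, "positive": 112},
    "validation": {"negative": 55, "positive": 33},
    "test": {"negative": 73, "positive": 36},
}

totals = {name: sum(counts.values()) for name, counts in split.items()}

# augmentation settings from the text, expressed as a keras
# ImageDataGenerator-style dict (keyword names are assumed, not quoted)
augmentation = {
    "rotation_range": 20,
    "zoom_range": 0.05,
    "width_shift_range": 0.1,
    "height_shift_range": 0.1,
    "shear_range": 0.05,
    "horizontal_flip": True,
    "vertical_flip": True,
    "fill_mode": "nearest",
}
```

tallying the split confirms the 348/88/109 train/validation/test totals implied by the per-class counts.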
since the dataset was obtained from multiple hospitals, the image resolutions differ from each other; hence, we rescaled the images and normalized pixel values to a range between zero and one. because our study uses a cnn-based method, it is not adversely affected by the data compression applied here. in this study, a cnn-based model was used to detect covid-19 from the chest x-ray images. we used four pre-trained cnn models: vgg16, vgg19, mobilenet, and inceptionresnetv2. we do not explain these models in detail because their parameters have been described in a large number of previous studies. in brief, the architecture of these models consists of convolution, pooling, flattening, and fully connected layers. the aforementioned models (i.e., vgg16, vgg19, mobilenet, and inceptionresnetv2) were used for feature extraction. then, a transfer learning head consisting of five layers was trained and applied on the covid-19 dataset; these five layers constitute the main trainable part of the model. in other words, we built a new fully-connected layer head comprising the following layers: averagepooling2d, flatten, dense, dropout, and a final dense layer with a two-element softmax activation (equivalent to a sigmoid in the binary case) to predict the probability distribution over the classes. the averagepooling2d layer comes first and performs an average pooling operation with a pool size of (4, 4). then, a flatten layer is used to flatten the input: flatten layers change the shape of the data from a 2-dimensional (2d) matrix of features into a vector that can be fed into a fully connected neural network classifier. the aim of the dense layer is to transform the data: the vector flattened in the previous layer is input into a fully connected dense layer, which reduces the vector of height 512 to a vector of 64 elements. 
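a minimal forward pass through the described head can be sketched with numpy; the 4x4x512 input feature map and the relu activation are assumptions chosen so that the flattened vector has height 512, as in the text (dropout is skipped because it is active only during training):

```python
import numpy as np

rng = np.random.default_rng(0)

def average_pool_2d(x, pool=4):
    # x: (h, w, c) feature map; non-overlapping pool x pool averaging
    h, w, c = x.shape
    return x.reshape(h // pool, pool, w // pool, pool, c).mean(axis=(1, 3))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# illustrative feature map from the frozen backbone (shape is an assumption)
features = rng.standard_normal((4, 4, 512))

pooled = average_pool_2d(features, pool=4)     # averagepooling2d -> (1, 1, 512)
flat = pooled.reshape(-1)                      # flatten -> vector of height 512
w1 = rng.standard_normal((512, 64)) * 0.05     # dense: 512 -> 64
hidden = np.maximum(flat @ w1, 0.0)            # relu (assumed activation)
# dropout (rate 0.5) would sit here during training; skipped at inference
w2 = rng.standard_normal((64, 2)) * 0.05       # final dense: 64 -> 2
probs = softmax(hidden @ w2)                   # class probability distribution
```

the shapes trace exactly the 512 -> 64 -> 2 reduction described in the text, ending in a two-element probability vector.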
then, a dropout with a rate of 0.5 is applied to ignore 50% of the neurons; the purpose of this layer is to improve generalization. finally, the last dense layer reduces the vector of height 64 to a vector of 2 elements, so the output of the model is a two-class (binary) classification. in the present study, a transfer learning approach is adopted to assess and compare the performance of the cnn architectures described here. because radiologists must first distinguish covid-19 chest x-rays from normal images, we chose a cnn design that can distinguish covid-19 cases from healthy people. the networks were trained using the binary cross-entropy loss function and the adam optimizer with a learning rate of 0.0001, a batch size of 15, and 100 epochs. other parameters and functions used in the training phase are described in subsection 2.3 of the materials and methods. as mentioned, we implemented data augmentation techniques to enhance training efficiency and prevent the model from overfitting. in our study, the neural networks were implemented with python on an 8 gb nvidia geforce gtx gpu with 32 gb ram. we used the holdout method, the simplest type of cross validation, to assess the performance of our binary classification models. the training curves of accuracy and loss for each transfer learning model are shown in figures 2 and 3 , respectively. although the number of epochs was set to 100, all the models reached stability within 28 to 30 epochs because we used callbacks, a powerful tool for customizing the behavior of transfer learning models during training. we calculated the confusion matrix and the area under the curve (auc) of the receiver operating characteristic (roc) to evaluate the performance of each transfer learning model. 
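the callback-driven early stopping that halted training around 28-30 epochs can be sketched as follows; the synthetic validation-loss curve and the patience value are illustrative assumptions, not the paper's actual training logs:

```python
def train_with_early_stopping(loss_per_epoch, patience=3, max_epochs=100):
    """run up to max_epochs, but stop once the validation loss has not
    improved for `patience` consecutive epochs (an early-stopping callback)."""
    best, wait, ran = float("inf"), 0, 0
    for epoch in range(max_epochs):
        loss = loss_per_epoch(epoch)
        ran = epoch + 1
        if loss < best - 1e-9:   # improvement: reset the patience counter
            best, wait = loss, 0
        else:                    # no improvement
            wait += 1
            if wait >= patience:
                break
    return ran, best

# synthetic validation-loss curve that stops improving around epoch 28
curve = lambda e: max(0.1, 1.0 - 0.033 * e)
epochs_run, best_loss = train_with_early_stopping(curve, patience=3)
```

even with the epoch budget set to 100, the loop exits shortly after the loss plateaus, mirroring the behavior described above.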
a confusion matrix, or table of confusion, is a table with two rows and two columns reporting four primary counts: false positives (fp), false negatives (fn), true positives (tp), and true negatives (tn). figure 3 shows the performance of each transfer learning model for binary classification in the form of a confusion matrix, aimed at distinguishing covid-19 chest x-rays from healthy x-rays. as shown in figure 3 , the mobilenet model has the best classification performance. four performance metrics, namely accuracy, precision, recall, and f-measure (f1-score), are used to evaluate the classification accuracy of each transfer learning model; these are among the most common measurement metrics in machine learning. as shown in table 2 , all the proposed models provide accuracy greater than 90.0%, and the pre-trained mobilenet model provides the highest classification performance for automated covid-19 detection. the roc curve is a 2d graphical plot of the true positive rate (sensitivity) against the false positive rate (1 − specificity); in fact, the roc curve represents the trade-off between sensitivity and specificity [25] . herein, the roc curve of each transfer learning model, with the true positive rate on the y-axis and the false positive rate on the x-axis, is plotted in figure 4 . we also calculated the auc of the roc curve, an effective way of indicating the accuracy of the roc produced by each transfer learning model. the auc measures how well a model can discriminate between the covid-19 and healthy groups. as displayed in figure 4 , the aucs of the vgg-16, vgg-19, mobilenet, and inceptionresnetv2 models were 0.92, 0.91, 0.99, and 0.97, respectively. in the field of medical diagnosis, these values are considered "excellent". 
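the four metrics can be computed directly from the confusion-matrix counts; the counts below are hypothetical and do not reproduce the paper's reported matrices:

```python
def classification_metrics(tp, fp, fn, tn):
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)            # positive predictive value
    recall = tp / (tp + fn)               # sensitivity / true positive rate
    f1 = 2 * precision * recall / (precision + recall)
    specificity = tn / (tn + fp)          # true negative rate
    return accuracy, precision, recall, f1, specificity

# hypothetical counts for a 109-image test set (73 negative, 36 positive);
# illustrative only, not taken from the paper's figure 3
acc, prec, rec, f1, spec = classification_metrics(tp=35, fp=0, fn=1, tn=73)
```

note that a single missed positive (fn=1) lowers recall but leaves precision untouched, which is why the text emphasizes recall for screening.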
in this study, we proposed four pre-trained deep cnn models, vgg-16, vgg-19, mobilenet, and inceptionresnetv2, for discriminating covid-19 cases from chest x-ray images. from our data, vgg-16, vgg-19, mobilenet, and inceptionresnetv2 achieved overall accuracies of 93.6%, 90.8%, 99.1%, and 96.8% for binary classification, respectively. in addition, the precision (positive predictive value) and recall (sensitivity) for covid-19 cases are encouraging; a high recall value in particular represents few fn cases. this matters because the proposed models should miss as few covid-19 cases as possible, which is the most important purpose of the present study. our results show that vgg-16 and mobilenet achieve the best precision, of 97.0% and 100%, respectively. furthermore, the mobilenet and inceptionresnetv2 models provided the same recall of 98.0%, as shown in table 2 . table 3 summarizes the recent studies on the automated detection of covid-19 from chest x-ray and ct images; as observable there, the results achieved by our proposed models are similar or even superior to those of previous similar studies. several groups of researchers have attempted to develop an automated model to diagnose covid-19 accurately. hemdan et al. proposed the covidx-net model to detect covid-19 cases from chest x-ray images [26] ; their model achieved an accuracy of 90.0% using 25 covid-19 positive and 25 healthy chest x-rays. in another study, a residual deep architecture called covid-net was designed for covid-19 diagnosis; the results indicate that covid-net provides an accuracy of 92.4% using medical images obtained from various open access data [27] . apostolopoulos et al. utilized transfer learning with convolutional neural networks for automatic detection of covid-19 from x-ray images. the conventional laboratory testing for diagnosis of covid-19 infection has a very slow process. 
in contrast, rt-pcr is relatively fast and can diagnose covid-19 in around 4-6 hours. however, rt-pcr testing has several limitations, such as limited availability, high cost, and kit shortages, and the molecular assay remains time-consuming: when we consider the magnitude of the covid-19 pandemic throughout the world, rt-pcr is not very fast. these limitations can be mitigated by our proposed pre-trained deep cnn models, in particular mobilenet. the models proposed in the present study are able to detect a covid-19 positive case in less than 2 seconds. our proposed models achieved accuracies above 90% with the limited patient data we had; furthermore, the mobilenet and inceptionresnetv2 models provide a 98% true positive rate. from this discussion, it can be seen that our proposed models achieved promising and encouraging results in the detection of covid-19 from chest x-ray images compared with recent state-of-the-art methods. the data indicate that deep learning will play a great role in fighting the covid-19 pandemic in the near future. our models must be validated by adding more patient data to the training dataset. in this study, our proposed models based on chest x-ray images aimed to improve covid-19 detection; they can reduce clinician workload significantly. we presented four pre-trained deep cnn models, vgg16, vgg19, mobilenet, and inceptionresnetv2, used for transfer learning to detect and classify covid-19 from chest radiographs. all the proposed models were able to perform binary classification with an accuracy above 90.0% for covid-19 diagnosis. the mobilenet model achieved the highest classification performance for automated covid-19 detection, with 99.1% accuracy among the four proposed models. our data indicate that mobilenet can be considered a promising model to detect covid-19 cases. 
this model can be helpful for medical diagnosis in radiology departments. a limitation of our study is the insufficient number of covid-19 chest x-ray images. in the future, increasing the number of covid-19 chest x-ray samples in the training dataset will further increase the accuracy and robustness of our proposed models.

references:
lung infection quantification of covid-19 in ct images with deep learning
covid-19 pathophysiology: a review
detection of sars-cov-2 in different types of clinical specimens
correlation of chest ct and rt-pcr testing for coronavirus disease 2019 (covid-19) in china: a report of 1014 cases
detection of coronavirus disease (covid-19) based on deep features
automatic detection of coronavirus disease (covid-19) using x-ray images and deep convolutional neural networks
coronavirus disease 2019 (covid-19): a perspective from china
essentials for radiologists on covid-19: an update-radiology scientific expert panel
deep learning approach for microarray cancer data classification
problems of deploying cnn transfer learning to detect covid-19 from chest x-rays
fusion of medical images using deep belief networks
convolutional neural network based detection and judgement of environmental obstacle in vehicle operation
efficient prediction of drug-drug interaction using deep learning models
covid-densenet: a deep learning architecture to detect covid-19 from chest radiology images
a study on cnn transfer learning for image classification. springer international publishing: advances in computational intelligence systems
introduction of a new dataset and method for detecting and counting the pistachios based on deep learning
comprehensive electrocardiographic diagnosis based on deep learning
deep learning for big data applications in cad and plm - research review, opportunities and case study
simple convolutional neural network on image classification.
2nd international conference on big data analysis
very deep convolutional networks for large-scale image recognition
mobilenets: efficient convolutional neural networks for mobile vision applications
going deeper with convolutions
covid-19 image data collection: prospective predictions are the future
what's under the roc? an introduction to receiver operating characteristics curves
covidx-net: a framework of deep learning classifiers to diagnose covid-19 in x-ray images
covid-net: a tailored deep convolutional neural network design for detection of covid-19 cases from chest x-ray images
covid-19: automatic detection from x-ray images utilizing transfer learning with convolutional neural networks
a deep learning algorithm using ct images to screen for corona virus disease (covid-19)
deep learning-based detection for covid-19 from chest ct using weak label

key: cord-293893-ibca88xu authors: xie, tian; wei, yao-yao; chen, wei-fan; huang, hai-nan title: parallel evolution and response decision method for public sentiment based on system dynamics date: 2020-05-23 journal: eur j oper res doi: 10.1016/j.ejor.2020.05.025 sha: doc_id: 293893 cord_uid: ibca88xu abstract: governments face difficulties in policy making in many areas, such as health, food safety, and large-scale projects, where public perceptions can be misplaced. for example, the adoption of the mmr vaccine has been opposed due to publicity indicating an erroneous link between the vaccine and autism. this research proposes the "parallel evolution and response decision framework for public sentiments" as a real-time decision-making method to simulate and control public sentiment evolution mechanisms. this framework is based on the theories of parallel control and management (pcm) and system dynamics (sd) and includes four iterative steps, namely sd modelling, simulating, optimizing, and controlling. 
a concrete case of an anti-nuclear mass incident that sparked public sentiment in china is introduced as a study sample to test the effectiveness of the proposed method. in addition, the results indicate the effects of adjusting the key control variables of the response strategies: response time, response capacity, and the transparency of the government regarding public sentiment. furthermore, the advantages and disadvantages of the proposed method are analyzed to determine how it can be used by policy makers in predicting public opinion and offering effective response strategies. social networking services (sns) such as microblog and wechat have become the most prominent carriers and tools for disseminating public sentiment. public sentiment reflects the ideas, attitudes, opinions and emotions that people express on noticed issues. positive public sentiment is often a necessary prerequisite for political action; conversely, negative public sentiment may become a major obstacle to a correct decision (burstein, 2010) . in this situation, public sentiment plays an increasingly important role in policy making processes in many areas, such as public health, food safety, and large-scale projects, where public perceptions can be misplaced (jiang, lin, & qiang, 2016) . typical examples include incidents like "the adoption of the mmr vaccine has been opposed due to publicity indicating an erroneous link between the vaccine and autism" (godlee, 2011) , "china's milk powder industry seriously declined due to the public sentiment of the sanlu milk powder incident" (chen, 2009) , "5g mobile phone masts have been vandalised because of a false link with the coronavirus", and "a nuclear fuel processing project has been cancelled because of anti-nuclear public sentiment in guangdong province of china" (li) . 
lacking relevant popular-science knowledge, most people never say "yes" to any large-scale project or product involved in a public sentiment. in addition, as soon as such projects are planned or "fears" turn into "crises", negative public sentiment is sparked immediately (xiong, liu, & cheng, 2017; lee & chun, 2016) . worse yet, if public sentiment is not guided and controlled in time, its effects will spread from the virtual network to the physical world and cause further secondary disasters, such as mass incidents. these interdisciplinary online-offline effects, which make the situation more complex and uncertain, acutely threaten social stability and national economic development. there have been significant studies of public sentiment triggered by different incidents. in the public health area, several epidemiologic studies found that the mmr vaccine was not associated with an increased risk of autism, even among high-risk children whose older siblings had autism. despite strong evidence of its safety, some parents are still hesitant to accept mmr vaccination of their children because of fake scientific reports and news (destefano & shimabukuro, 2019) . to characterise and evaluate the dissemination of information to the community during a suspected tb outbreak, a model of effective information dissemination in a crisis was developed through a study of information dissemination during the incident (duggan, 2004) . in the food safety area, lyu identified and compared the crisis communication strategies that organizations used to respond to a congenetic melamine-tainted milk crisis and public opinion in two chinese societies (i.e., mainland china and taiwan) and found that sanlu (a mainland china-based organization) and kingcar (a taiwan-based organization) demonstrated inverse patterns of strategy adoption (lyu, 2012) . 
in the large-scale project area, jeong used situational crisis communication theory and attribution theory to explain the public's responses to a corporation that caused an oil spill accident (jeong, 2009) . sun investigated nuclear power in china from the perspectives of electricity preference and social perceptions and assessed the public willingness-to-pay (wtp) to prevent a local nuclear power plant; the results indicate that the chinese pay most attention to the development of nuclear energy when they are worried about nuclear accidents (sun, zhu, & meng, 2016) . following the anti-nuclear event in jiangmen in guangdong province, china, li analyzed the causes and characteristics of nuclear public sentiment in depth and presented some relevant measures for preventing negative public sentiment in the era of new media (li) . above all, most of these studies offer significant ideas for our purposes. however, the existing research methods are mainly qualitative rather than quantitative, and post-mortem analysis rather than real-time decision-making. two bottlenecks have caused this situation: a lack of prior knowledge and the dynamics of public sentiment evolution. on the one hand, the original events (large-scale projects, major incidents or scandals) which trigger the relevant public sentiments occur with low frequency, and the triggered public sentiments are diverse and non-duplicated. thus, the key parameters (such as the "posting rate of microblogs") of the prior knowledge (such as decision models and data) needed to describe or predict the scenarios of public sentiment systems are difficult to obtain in time. on the other hand, interdisciplinary and dynamic original events lead to feedback, nonlinearity and instability in the relevant public sentiment systems, which also means that the key parameters of the decision models change over time. 
therefore, it is difficult for policy makers to build a quantitative evolutionary model and make an effective response decision based only on previous decision methods. 1.2 insightful lessons from the perspectives of system dynamics (sd) and parallel control and management (pcm). 1.2.1 lesson from the perspective of sd modelling. the sd research method focuses on system structure and simultaneously takes into account selectivity (decision-making processes based on information), self-discipline (multi-feedback processes) and non-linearity (time delay) (torres, kunc, & o'brien, 2017; rashwan, abo-hamad, & arisha, 2015) . it is a perspective and a set of conceptual tools that help us learn about the structure and dynamics of complex systems. additionally, the rigorous modelling method of sd enables us to build formal computer simulations of complex systems and use them to design more effective policies and organizations (sterman, 2000) . several scholars have used sd models to study the diffusion mechanisms of different complex systems, from which we draw insightful lessons. the sd methodology has been used to develop increasingly complex models that investigate the process of innovation diffusion, enhance insight into the problem structure and increase understanding of the complexity and dynamics caused by the influencing elements (maier, 1998) . a diffusion model extending commonly used epidemic models was employed in a study analyzing the dynamics of intra-organizational innovation implementation processes; its sd modelling and simulation results provide structural explanations and offer insights into why, given various intra-organizational networks, management policies can produce different results regarding whether and how innovations are adopted in an organization (wunderlich, größler, zimmermann, & vennix, 2014) . 
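the epidemic-style diffusion models discussed above can be sketched as a minimal sir-like stock flow simulation integrated with euler's method; all parameter values below are illustrative assumptions, not calibrated to any real public sentiment event:

```python
def simulate_diffusion(n=10000.0, beta=3e-5, gamma=0.05, steps=400, i0=10.0):
    """euler integration of an sir-like sentiment-diffusion stock flow model:
    s = unaware netizens, i = actively posting netizens,
    r = netizens who have lost interest."""
    s, i, r = n - i0, i0, 0.0
    history = []
    for _ in range(steps):
        new_posters = beta * s * i   # contact-driven spreading flow
        new_inactive = gamma * i     # decay-of-interest flow
        s -= new_posters
        i += new_posters - new_inactive
        r += new_inactive
        history.append(i)
    return history

history = simulate_diffusion()
peak = max(history)
```

with these assumed parameters the simulated posting activity rises to a peak and then dies out, the qualitative pattern of sentiment diffusion the cited studies describe.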
furthermore, by applying system dynamics, yu proposed a simulation model to study the public sentiment diffusion mechanism in dangerous-chemical water pollution emergencies and to discover effective response strategies (yu et al., 2015) . this research shows that sd modelling methods are useful for describing, understanding and simulating complex public sentiment diffusion systems and for making relevant optimization policies. more importantly, compared with decision-making theories based on large amounts of historical data, sd has the advantage of being able to predict the future based only on system structure. therefore, in the absence of historical data, most complex social systems can be described and analysed using the sd method (morgan, howick, & belton, 2017) . the public sentiment system is no exception. to solve the first bottleneck (lack of prior knowledge), the structure of the public sentiment system can also be represented by multi-feedback relationships, and its system functions can be achieved and controlled by the sd method. moreover, many mature statistical technologies, such as regression analysis, are now available to help decision-makers obtain key parameters and build sd simulation models to respond to possible disasters and social crises. 1.2.2 lesson from the perspective of pcm theory. with the advent and development of cloud computing and the internet of things (iot), the capacities of perception, computing, storage, transmission and analysis have been greatly improved. in this era of new information and communication technology (ict), artificial society and pcm theory have been proposed by the chinese academy of sciences and the national university of defense technology to provide an available decision-making paradigm for responding to the dynamically evolving scenarios of interdisciplinary disasters and social crises. 
this theory uses multi-agent modelling and simulation technology to build an artificial social system equivalent to the real system. through simulation computing in the artificial system, decision-makers are able to recognize the evolutionary mechanisms of disaster scenarios. moreover, by comparatively analysing the artificial and real systems, the simulation model of the artificial system is modified in real time to parallel the actual situation. lastly, with the knowledge obtained from the simulation results, the evolutionary processes of disasters can be controlled and optimized by making the appropriate decisions in time (wang, 2010) . inspired by pcm theory, the public sentiment system, a type of crisis system, can be considered a real system carrying public attitudes and opinions. thus, the second bottleneck (the dynamics of public sentiment evolution) may be resolved based on both pcm theory and the sd method. after building and running an artificial sd simulation system parallel with the public sentiment system, evolutionary scenarios can be predicted and effective response policies can be provided by decision-makers. therefore, a novel decision-making method, the "parallel evolution and response decision framework for public sentiment (perdfps)", based on the theories of pcm and sd, is proposed as the specific contribution of this paper. this method is structure-dependent rather than data-dependent and can be implemented in real time, which makes it helpful for simulating, analyzing and guiding the evolution processes of dynamic public sentiment when historical knowledge on infrequently occurring original events is lacking. the paper is organized as follows. section 2 proposes the perdfps framework, and section 3 (methodology) abstracts and presents the perdfps methodology with a roadmap (table 1) that helps decision makers effectively use the novel method. 
furthermore, section 4 (empirical research) presents a complete case study that deduces the principle and verifies the effectiveness of the perdfps method using the anti-nuclear mass incident in jiangmen, china. lastly, theoretical discussions, limitations and future studies are presented in section 5. 2 parallel evolution and response decision framework for public sentiment based on system dynamics. pcm theory holds that social mass incidents are essentially unpredictable. however, one can now build artificial societies parallel to real-world societies and deduce their evolution in a computational experimental environment. in addition, with dynamic and real-time data, decision-makers are able to perceive coming disaster scenarios and response demands using artificial simulation models and to create effective and timely coping strategies, policies and response solutions to control the real situations (wang, 2010) . therefore, the "parallel evolution and response decision framework for public sentiment (perdfps)" is proposed as a guideline for achieving this function. within this framework (fig. 1) , according to the "scenario-response"-based emergency management paradigm (bañuls, turoff, & hiltz, 2013) , the public sentiment system (pss) can be divided into two subsystems: the real scenario system (rss) and the simulation decision-making system (sds). as the source of real-time information, real scenarios are the basis for decision-making as well as the targets of public sentiment control. the original events, sentiment disseminators and sentiment regulators are the essential elements of the real scenario system. the original events (such as public health emergencies, food safety scandals and large-scale projects) may easily trigger relevant public sentiment. public sentiment disseminators include the media, netizens, and others. the media triggers and influences the processes of public sentiment propagation through reporting and directing the news. 
in addition, netizens use social networks to express and exchange their own opinions, which results in the continued diffusion and evolution of public sentiment. because the collective behaviours of the netizens comprehensively reflect their attitudes towards the source events, their support or opposition is an essential factor for the government in making an efficient response decision. generally, the government response departments dealing with the emergencies or large-scale projects assume the greatest responsibility as public sentiment regulators. by taking measures such as holding lectures and seminars, giving press conferences on the events, and releasing positive news, they supervise, guide and even control the development of public sentiment. based on pcm theory, the simulation decision-making system is a closed-loop functional circuit consisting of four steps: sd modelling, simulating, decision-making and parallel controlling. in the first step, using real-time perception (which includes the processes of information collecting, analysing and abstracting), the characteristic elements of the public sentiment system, including source events, sentiment disseminators and sentiment regulators, are extracted from real scenarios. based on the sd method, a causal loop diagram model is suitable for describing the relationships between the elements of public sentiment systems. in addition, after setting the parameters and calculation rules, a stock flow diagram model, built on the qualitative causal loop diagram model, can describe the dynamic system structure and simulate its functional mechanism quantitatively. in the second step, simulation results of public sentiment diffusion and evolution at a future time can be obtained by importing real-time status data perceived from real scenarios into the stock flow diagram model and running it. 
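a minimal sketch of adding a sentiment regulator to such a stock flow model: after an assumed response time, government guidance raises the rate at which active posters lose interest, so an earlier response yields a lower sentiment peak. all parameter values and the linear guidance term are illustrative assumptions, not the paper's calibrated model:

```python
def simulate_with_response(response_time, guidance_strength=0.1,
                           n=10000.0, beta=3e-5, gamma=0.05, steps=400):
    """euler-integrated stock flow model with a regulator: from step
    `response_time` onward, released guiding information adds
    `guidance_strength` to the outflow rate of active posters."""
    s, i, r = n - 10.0, 10.0, 0.0
    peak = i
    for t in range(steps):
        out_rate = gamma + (guidance_strength if t >= response_time else 0.0)
        new_posters = beta * s * i   # spreading flow
        new_inactive = out_rate * i  # decay of interest, boosted by guidance
        s -= new_posters
        i += new_posters - new_inactive
        r += new_inactive
        peak = max(peak, i)
    return peak

early_peak = simulate_with_response(response_time=10)
late_peak = simulate_with_response(response_time=100)
```

comparing the two runs illustrates the response-time control variable: a late response lets the sentiment peak well above the level reached under an early response.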
next, to keep the simulation models consistently parallel with real scenarios, decision-makers should dynamically adjust the parameters and structures of the stock flow diagram models by observing and analysing the running results of the models and the real-time status of the scenarios. in this way, the validity and reliability of the simulation results of the sd stock flow diagram model can be gradually improved. in the last step, given a reliable sd model, different response strategies or policies can be tested, verified and optimized in the simulated environment. a roadmap is useful for decision makers to know when and how to use a modelling and simulation method to deal with practical problems (davis, eisenhardt, & bingham, 2007) . for implementing the perdfps method, we develop a roadmap that describes the principles and steps shown in table 1 (thompson, howick, & belton, 2016) . however the perdfps method is carried out, the steps do not occur in a neat sequence; specifically, the steps of sd modelling and parallel controlling constitute an iterative process for cultivating a rational decision model.

table 1. roadmap for implementing the perdfps method (step and description).
identify the decision problem:
 identify the type of the decision problem: is the sentiment triggered by a periodic heated topic with enough investigable historical data or by an unconventional heated topic?
 select the relevant decision method: data-dependent or structure-dependent?
propose hypotheses:
 reality hypotheses of the model: the simulation results of the model are consistent with the real scenarios.
 validity hypotheses of the response strategies: testable relationships between the response strategies and the controllable degree of public sentiments.
sd modelling:
 analysis of the real scenario system from the perspective of structure.
 causal loop diagram modelling: abstracting the elements and the causal relationships between them from the scenario system; defining the circular causality loops of the relevant modules of the system.
 stock flow diagram modelling: selecting levels, selecting rates and describing their determinants; selection of parameter values. simulating 3.4 parallel controlling  reality testing: contrasting the simulation results to verify the reality hypotheses of the model.  model improving: to achieve the rationality and consistency of the model.  response strategies setting.  response strategies testing: simulation and discussion of strategy effects for verifying the proposed validity hypotheses. 3.1 identify the decision problem. as a structure-dependent decision-making method, the perdfnps framework is applicable for response to the public sentiments without enough historical data. therefore, in the implementation stage, the decision makers must firstly determine what type of decision problem it is: is there enough historical data for building a model for this event? or is the perdfnps method suitable for the decision problem? if public sentiment is triggered by a periodic heated topic with enough investigable historical data, the decision-making model can be constructed based on the data-dependent statistical methods. for example, the training data used for cultivating the model of its public sentiment system for the presidential election every four years, can be obtained from the social network services including twitter, facebook (h. wang, can, kazemzadeh, bar, & narayanan, 2012) . conversely, if the public sentiment is triggered by an unconventional or non-periodic heated topic such as a nuclear-related event without sufficient investigable historical data, the sd simulation model can be cultivated to describe and solve these kinds of problems based on the structure-dependent perdfnps methods. two kinds of hypotheses are proposed for the perdfnps method according to the requirements of simulation reality and decision validity. 
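the core loop of the roadmap (cultivate a model until it matches the real scenario, then test strategies on it) can be sketched in code. this is only an illustrative skeleton under assumed names (`simulate`, `goodness_of_fit`, `cultivate`), not an implementation of the paper's actual sd model:

```python
# illustrative skeleton of the perdfnps closed loop (all names are assumptions,
# not from the paper): cultivate a model until its simulation matches the
# gradually revealed real scenario, then test response strategies on it.

def simulate(param, horizon):
    # toy stand-in for the sd stock flow model: one stock growing at rate `param`
    stock, series = 0.0, []
    for _ in range(horizon):
        stock += param          # level accumulates its rate
        series.append(stock)
    return series

def goodness_of_fit(real, simulated):
    mean = sum(real) / len(real)
    sse = sum((y - s) ** 2 for y, s in zip(real, simulated))
    sst = sum((y - mean) ** 2 for y in real)
    return 1 - sse / sst

def cultivate(real, candidates, horizon):
    # parallel controlling: keep the candidate parameter whose simulation
    # best matches the revealed real scenario
    return max(candidates, key=lambda p: goodness_of_fit(real, simulate(p, horizon)))

real = [2.0 * t for t in range(1, 9)]      # "revealed" scenario data
best = cultivate(real, [0.5, 1.0, 2.0, 3.0], 8)
print(best)  # → 2.0
```

in a real application the candidate set would be replaced by the model's equation coefficients and the toy `simulate` by the stock flow diagram run.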
(1) dynamic simulation reality hypotheses. hypothesis 1: the simulation results of the cultivated sd model are rational and consistent with the real scenarios. the rationality of the cultivated sd model, and the consistency between its simulation results and the real evolution trends of the public sentiment, are essential for achieving scenario rehearsal and response effectively in the decision-making processes (thompson et al., 2016). thus, we first state hypothesis 1, then cultivate our sd model and verify the hypothesis for a specific case by comparing the real scenarios with the simulation results (section 4.3). of course, a rational initial general sd model is very important because it determines how efficiently the reality of the descendant models can be improved. the best way to verify the dynamic reality of a simulation model is to test the consistency between its simulation results and the real evolution trends in a real-time case. however, this is difficult to achieve in our empirical research; therefore, a special hypothesis 2 is proposed for the empirical section. hypothesis 2: information about the past scenario of the public sentiment is unknown but can be gradually revealed through the evolutionary process. the goals of the empirical research are to verify whether a reliable simulation model that parallels the real public sentiment scenario can be developed, and whether the response strategies based on the validity hypotheses can be optimized. to compare the simulation results with the real-time scenarios, we assume in the experiment that the data of the scenario is unknown but will be gradually revealed through the evolutionary process. the public sentiment event is treated as a real-time scenario to verify the rationality and consistency of the parallel simulation models to be cultivated. hypotheses 1 and 2 are validated by reality testing (section 3.4).
(2) decision validity hypotheses. the public sentiment system is a complex social network system. the various decision response behaviours of the government drive the evolution and diffusion of public sentiments and ultimately decide the results of the original events by affecting people's attitudes. therefore, four testable decision validity hypotheses are proposed to explore effective response strategies for guiding the evolution of public sentiment with the perdfnps method. firstly, the government's response measures, including social stability risk assessment, answering questions from the public, giving lectures on knowledge related to the original event and so on, are significant for guiding the evolution processes of public sentiment events. a higher degree of government response results in a more controllable public attitude to the public sentiment event (zhang, qi, ma, & fang, 2010). the number of times the government responds to an event can reflect the degree of government response; the relevant hypothesis 3 is therefore proposed as follows. hypothesis 3: the more times the government responds, the more controllable the public attitude to the public sentiment event is. secondly, response time represents the time-delay in the government response processes and is another important factor controlling the evolution processes of public sentiment events. a timely response means a higher response speed, which leads to a lower heat rate and a higher controllable degree of public sentiment (yu le-an, li ling, wu jia-qian, 2015). similarly, hypothesis 4 is proposed as follows. hypothesis 4: a higher response speed (or shorter response time) of the government leads to a higher controllable degree of public sentiment. furthermore, the degree of information transparency and the degree of popularity of knowledge related to the original event also significantly influence the evolution trends of the public sentiment event.
the degree of information transparency represents the disclosure degree of the information related to the original event, and the degree of popularity of knowledge related to the original event reflects how well the knowledge related to safety and risk has been popularized locally. scholars have analyzed the response processes of public sentiment events and concluded that these two factors are positively related to the controllable degree of public sentiment related to large-scale projects (weiwei li; ). hypotheses 5 and 6 are proposed as follows. hypothesis 5: a higher degree of information transparency leads to a higher controllable degree of public sentiment. hypothesis 6: a higher degree of popularity of knowledge related to the original event leads to a higher controllable degree of public sentiment. (1) analysis of the real scenario system from the perspective of structure. the sd approach emphasizes how causal relationships among system structures influence the functions, behaviours and evolution processes of a system. analysis of the boundary and structure of a public sentiment system identifies its modules. in the government module, the government takes measures (such as issuing laws and regulations, and holding press conferences) to guide the development of original events and control the diffusion of the triggered public sentiments in a timely manner. furthermore, in the netizen module: original events may affect social and economic benefits, damage the ecological environment and even directly threaten life safety; therefore, they quickly attract people's attention and trigger various public opinions or sentiments. because different people gain or lose in different ways, netizens are divided into three groups: opponents, supporters and neutrals. these groups express their own opinions and sentiments online, and this kind of behaviour turns original events into controversial issues. in addition, the main attitudes of netizens influence and even decide the response behaviours of the relevant media and government departments, as they reflect public opinion.
(2) causal loop diagram modelling. the sd approach describes a system as a series of simple processes with positive and negative circular causality loops (forrester & senge, 1980; sterman, 2000). the feedback function of a positive loop is self-reinforcing and amplifying; the feedback function of a negative loop is dampening. public sentiment systems evolve continually under the influence of the interactions between these feedbacks. based on the structure and mechanism of the real scenario public sentiment system of a concrete case, the qualitative causal relationships and the circular causality loops of the relevant modules can be defined; concrete examples of this process are shown in section 4.2.1. (3) stock and flow diagram modelling. causal loop diagrams aid in visualizing a system's structure and behaviour and in analyzing the system qualitatively. for a more detailed quantitative analysis, a causal loop diagram is transformed into a stock and flow diagram. a stock and flow model supports studying and analyzing the system quantitatively; such models are usually built and simulated using computer software such as vensim (rahmandada & sterman, 2012; martinez-moyano, 2012). a stock is any entity that accumulates or depletes over time; a flow is the rate of change in a stock. the key step in building the stock and flow diagram is converting the system description into level and rate equations (torres et al., 2017). usually, the parameters and initial conditions of the equations can be estimated using statistical methods, expert opinion, market research data or other relevant sources of information (sterman, 2001).
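the conversion of a system description into level and rate equations can be illustrated with a minimal sketch (the variable names here are generic placeholders, not the paper's model): a level variable accumulates its rate over discrete time steps.

```python
# minimal stock-and-flow integration sketch (euler method): a level (stock)
# accumulates its rate (flow) over discrete time steps, the basic building
# block of any sd stock flow diagram. names are generic placeholders.

def run_stock_flow(initial_level, rate_fn, steps, dt=1.0):
    level, history = initial_level, [initial_level]
    for t in range(steps):
        inflow = rate_fn(t, level)   # rate equation: flow may depend on time and state
        level += inflow * dt         # level equation: the stock integrates the flow
        history.append(level)
    return history

# example: a "quantity of news" stock fed by a geometrically decaying release rate
history = run_stock_flow(0.0, lambda t, lvl: 10.0 * 0.5 ** t, steps=5)
print(round(history[-1], 3))  # → 19.375
```

software such as vensim performs exactly this kind of step-by-step integration of the level and rate equations, usually with smaller time steps.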
in a decision-making process for a non-duplicated public sentiment triggered by a major public health incident or a large-scale project, the decision makers lack prior data and knowledge, so the parameters of the initial equations of the 1-generation sd model can be taken from the developed models of historical cases that are similar to the current event in type, system structure and situation. by analyzing and refitting the real-time data gradually collected from the real scenarios, the parameters are modified and the equations are improved. to achieve the target of public sentiment simulation and control, the variables of the stock and flow model are classified into two categories: state variables and control variables. the latter is further divided into two subcategories, parallel cultivating control variables and decision-making control variables, and each kind of variable has its own function in the model. according to the literature (rahmandada & sterman, 2012), the variables of the constructed sd model should meet the minimum and preferred reporting requirements for research reproducibility, communication and transparency. appendix 1 shows an example of a constructed sd model with all variables and their equations defined. non-duplicated source events trigger diverse public sentiments and leave decision makers without the prior knowledge (the parameters to describe or calculate the key variables) needed to make an effective response decision. in addition, even if we obtain some experience from the public sentiment triggered by a similar original event, the spaces, locations and social environments vary so much that the evolution processes of the public sentiments are too different to build a simulation and decision model in one step using most traditional methods.
thus, in this paper, we use pcm theory to cultivate the sd simulation models dynamically and to keep the models parallel with real public sentiment systems, in order to obtain accurate simulation results and reliable response solutions (f. wang, 2010). the parallel controlling principles are shown in figure 2. (1) 1-generation sd model constructing. the parallel simulation decision-making system enters the emergency response state immediately after "scenario 0" of a new original event occurs. at this point, decision-makers should determine whether this event has ever occurred before, using semantic searching and matching in an existing model library (wu et al., 2017). if there is a public sentiment scenario very similar to the current scenario from the perspectives of event type and system structure, the simulation model of that scenario is chosen as the initial model, the "1-generation sd model." however, there are usually no closely similar scenarios for a non-duplicated public sentiment event. in that situation, the parameters of the initial equations of the 1-generation sd model can be taken from the developed models of historical cases that are similar to the current event in type, system structure and situation. it is very difficult to find the most similar historical case in an emergency situation, but we can still improve and cultivate our models iteratively by adjusting them in parallel with the real-time scenarios, even if the selected historical case is not identical to the current public sentiment event. of course, the more similar the selected historical case is, the fewer iterations there will be, and the improved decision-making model will be obtained more quickly. (2) reality testing. we validate hypotheses 1 and 2 by rationality and consistency testing; the rationality and consistency between the models and the real public sentiment systems are assessed following (wooldridge, 2006). (3) model improving.
by analyzing and refitting the real-time data gradually collected from the real scenarios, the parameters are modified and the equations are improved. throughout the parallel controlling process, the real-time information from the various stages of the real scenarios (from scenario 0 to scenario n) is collected, integrated and analyzed at a specific time-interval Δt. at the same time, modifying processes such as parameter refitting and data updating continually improve the simulation models from the "1-generation sd model" to the "n-generation sd model," ensuring that the simulation results remain consistent with the real scenario in the parallel evolution process. regarding parameter fitting in the model cultivating process, we constructed the 1-generation sd model and the subsequent n-generation sd models in different ways; take the case in the empirical part of this paper as an example. on the one hand, most of the equations of the 1-generation sd model, based on the literature (tian, 2011), (weiwei li; ), (zhang, 2012) and (yu le-an, li ling, wu jia-qian, 2015), are obtained by parameter fitting with the data of 36 similar cases, such as the "sanlu milk powder incident," from "people.com." on the other hand, for the n-generation models, we do not have to fit the parameters from scratch to improve the equations: only real-time data needs to be collected to draw the curves of the key state variables, such as "increment of posts" and "increment of news," during each time-interval (Δt = 24 hours). after contrasting the simulation results of the n-generation model with scenario n, the parameters of the equations are adjusted and the model is improved to approach the real scenario in an iterative and parallel way. when an n-generation model and its simulation results are consistent with the real scenarios, the model has passed reality testing and has become a cultivated model. an optimal strategy can then be obtained by adjusting the control variables.
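the generation-by-generation improvement described above can be sketched as a loop over Δt windows of revealed data; `refit` here is a simplified stand-in for the paper's coefficient traversal, and the data, candidate grid and linear model form are all invented for illustration.

```python
# sketch of the parallel controlling iteration: every Δt of newly revealed
# data, refit the model parameter by minimising the sum of squared errors
# over the data seen so far (a stand-in for the coefficient traversal).

def refit(observed, candidates):
    def sse(p):
        # assumed toy model: observation at hour t ≈ p * t
        return sum((y - p * (t + 1)) ** 2 for t, y in enumerate(observed))
    return min(candidates, key=sse)

real = [3.1 * (t + 1) for t in range(72)]   # full (initially unknown) scenario
candidates = [1.0, 2.0, 3.0, 3.1, 4.0]
generations = []
for n in range(1, 4):                        # Δt = 24 "hours" per generation
    revealed = real[: 24 * n]                # scenario n: data revealed so far
    generations.append(refit(revealed, candidates))
print(generations)  # → [3.1, 3.1, 3.1]
```

each pass uses only the data available up to nΔt, mirroring how the n-generation model is improved and then tested against the next window.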
as shown in the stock and flow diagram model (fig. 6), the interactions between these system variables drive the evolution and diffusion of public sentiments and decide the final results of the original events, as seen through people's attitudes. in the end, when the entire emergency response process ends, the latest parameters and structure of the simulation model are recorded and archived in the model library for the next similar public sentiment. some advanced sd software tools, such as vensim, stella and anylogic, are available to help decision-makers construct, run and analyse the sd simulation models of public sentiment systems and to create optimized response policies and solutions in a graphic and visual way (rahmandada & sterman, 2012). according to the evolutionary cycle and system boundary of the specific public sentiment system, and after setting the main calculation parameters such as simulation step size and run time, the simulation process can be implemented. to propose suitable response solutions, the decision analysis process should include two aspects. (1) response strategies setting. the control variables can be tentatively set by the decision makers to represent different response strategies; however, most of the variables, especially state variables, are uncontrollable from the perspective of the government. taking the public sentiment of the anti-nuclear mass incident as an example (section 4), and according to the "validity hypotheses of the response strategies," the strategies are set as shown in table 4. the r0 strategy is the baseline strategy extracted from this real case; the other response strategies, a1~a4, b1~b4, c1~c4, d1~d4 and e, are set with different specific values of the key control variables, respectively. (2) response strategies testing. the effects of the different strategies can then be tested by reviewing the simulation results with the cultivated n-generation sd model.
policy-makers would be able to control the system state variables and create an effective, optimized response strategy to guide public sentiments by adjusting or resetting the values of the four key controllable variables in the sd simulation models. some practical measures, such as enhancing the credibility of the government and the degree of information transparency, are available to improve the control performance over public sentiments. however, increasing these variable values always brings longer response times or higher costs, so balanced considerations should be included in the decision-making processes. 4.1 case analysis (including "identify the decision problem" and "research hypotheses"). in this section, the anti-nuclear mass incident sparked by a nuclear fuel processing base project near jiangmen city in guangdong province is taken as a case study to verify the feasibility and validity of the perdfnps method proposed in section 3 above. the announcement of the result of the social stability risk assessment of the nuclear project is considered the source event of the media module of this public sentiment system. in addition, by choosing and analyzing the increment data of baidu news and xinlang blog posts (fig. 3) during july 5-19, 2013, and the relevant government response measures (table 2), the key parameters of the initial stock flow diagram model and the initial data for simulating the evolution of the public sentiment can be extracted gradually. because the actual evolution cycle of the public sentiment lasted only 15 days, which is too short to prepare enough data and develop a reasonable simulated trend, the simulation model is set to run 12 hours per day; the simulation cycle is therefore translated and expanded from 15 days to 240 hours. the main data used in this paper is collected from the references and websites (weiwei li; cnsa, 2015).
moreover, some variables or parameters of the model should be adjusted according to the demands of the actual simulation experiment. for example, some unstructured behaviour data, such as government response behaviours, should be converted into the quantified and controllable variables quantity of government response or increment of government response. note that the data shown in figure 8 are considered unknown information at the beginning of the simulation experiment and are revealed gradually during the simulation processes. in fact, this primary data is used to verify the reality and validity of our parallel simulation model by describing the real scenarios and comparing them with the artificial ones. therefore, the decision problem of this case is structure-dependent and can be described and solved with the perdfnps method based on the six hypotheses proposed in section 3. the government response measures (table 2) were as follows:
- the jiangmen government released the report of the social stability risk assessment of the nuclear project, and the deputy general manager answered questions from the public.
- 8th: three lectures on nuclear knowledge were given by the jiangmen government.
- 9th: professor zhou participated in the propaganda lectures on nuclear knowledge, telling the public that the project is safe.
- 12th: the announcement of the determination of the nuclear project was delayed for ten days.
- the nuclear project was terminated by the government.
4.2 1-generation sd modelling. plans by the china national nuclear corporation (cnnc) to build a nuclear fuel-processing base near jiangmen city triggered public sentiment in that city. taking this anti-nuclear mass incident as a concrete case, we can analyze the boundary, structure and evolution mechanism of the public sentiment system of this event and build a relevant qualitative causal loop diagram model. this model is divided into three main modules: the media module, the netizen module and the government module. (1) the causal loop diagram of the media module is a positive feedback loop r1 (fig. 4).
once the nuclear event occurs, the relevant news is released on an ongoing basis. as the event evolves, the quantity of news and the participation degree of media both increase; at the same time, due to media influence, an increased participation degree of media always leads to a higher heat rate of public sentiment and results in larger news increments. the evolution processes of the nuclear public sentiment system are driven by a multi-feedback mechanism, which introduces delays in the propagation and response processes. thus, the nuclear public sentiment system should be described by a number of nonlinear, multi-feedback relationships, such as the exponential dependence between interacting system variables such as increment of government response, increment of posts and increment of news. taking the incident of "anti-nuclear public sentiment in jiangmen city, guangdong province" as an example, and according to the main causal loops in section 3.1, the sd stock flow diagram model of the nuclear public sentiment system is shown in figure 7. (1) stock flow diagram model of the media module. because the source event does not last long, the variable increase rate of news can be defined as a lookup or fitted function, described directly from the captured short-term history data. the variable quantity of news is influenced by the variable heat rate of public sentiment, which is constrained by the variables participation degree of government, participation degree of media and participation degree of netizens. referring to the evolution mechanism of public sentiment (yu le-an, li ling, wu jia-qian, 2015; zhang, 2012), the functional relationships of the variables are defined as formulas (1)-(4). in these formulas, we assume that there is no delay in the transmission processes of baidu news, and the variable type of source event is boolean (its value equals 1 if the source event is triggered).
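formulas (1)-(4) are not reproduced in this excerpt, so the following is only a hedged sketch of the described structure of the media module: a heat rate constrained by the three participation degrees and gated by the boolean "type of source event." the weighted-sum form and all coefficients are invented for illustration.

```python
# hedged sketch of the media module relationships. the paper's formulas (1-4)
# are not available here; this only illustrates the described dependency
# structure. the weighted-sum form and all weights are assumptions.

def heat_rate(source_event, gov_part, media_part, netizen_part,
              w=(0.2, 0.4, 0.4)):
    if not source_event:        # boolean type of source event: no event, no heat
        return 0.0
    # heat rate constrained by the three participation degrees
    return w[0] * gov_part + w[1] * media_part + w[2] * netizen_part

def news_increment(heat, influence=100.0):
    # quantity of news grows with the heat rate of public sentiment
    return influence * heat

h = heat_rate(True, 0.5, 0.8, 0.9)
print(round(h, 2), round(news_increment(h), 1))  # → 0.78 78.0
```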
(2) stock flow diagram model of the netizen module. the posting reaction of the netizens after receiving news information has a time-delay effect, and the variable increment of posts is also influenced by the variable heat rate of public sentiment. the variable increase rate of posts can be defined as a lookup or fitted function, described directly from the captured short-term history data, and the function of increment of posts is then constructed as formula (5), referring to the evolution mechanism of public sentiment (yu le-an, li ling, wu jia-qian, 2015; zhang, 2012). 1) 1-generation model constructing. at the beginning of the simulation, we lack sufficient data to fit certain parameters of the 1-generation model. the variables "increase rate of posts" and "increase rate of news" are therefore defined temporarily as formulas (22) and (29) (in the appendix), obtained by collecting and polynomial-fitting the 240 hours of data (from july 15th to august 3rd) of the similar public sentiment triggered by the changsheng vaccine event from ef.zhiweidata.com (figure 8-①), because the two events have a similar "happening, developing, declining, extinction" life cycle. 2) 1-generation model testing and improving. in the parallel iteration processes, the time-interval for improving the models is set as Δt = 24 hours. we constructed the improved n-generation formulas based on the "minimum sum of squared errors" algorithm with the real data during nΔt, and we tested the reality of the model based on a goodness-of-fit formula with the real data during (n+1)Δt. considering the scenario up to the 48th hour as scenario 1 (testing period of the 1-generation model t_testing-1 = 2Δt), assume that the real scenario evolves under the combined influences of the media, netizens and government. after importing the initial data of the first hour of this case and simulating it for 48 steps (hours), the evolutionary trends of the variables "increment of posts" and "increment of news" can be drawn as in figure 8-②.
the simulated time series (orange curves) are both obviously too steep to match the real situations (blue curves). the model reality is tested by calculating the goodness-of-fit, gof = 1 - Σ_t (y_t - y*_t)² / Σ_t (y_t - ȳ)², with gof ∈ (-∞, 1], between the real time series and the simulated time series of the key state variables; in this formula, y is the observed posts/news increment value, y* is the simulated value and ȳ is the mean of the observed values (wooldridge, 2006). we use gof_dp and gof_dn to denote the gof of the "increment of posts" and "increment of news" by the day, respectively. after programming and calculating, gof_dp and gof_dn are -1276.29 and -24726.47, respectively; the simulation results therefore show that the "1-generation model" is very different from scenario 1 during t_testing-1. to ensure that the simulation results are consistent with the real scenarios, we developed a "minimum sum of squared errors" algorithm to improve the models. with this algorithm, each polynomial coefficient of the (n-1)-generation "increase rate of posts" and "increase rate of news" formulas is traversed within a certain range, and the sum of squared errors between the observed real value series and the simulated value series is calculated for the different coefficient combinations through multiple iterations. the optimal coefficient combination, the one with the minimum sum of squared errors, is then selected as the polynomial coefficients of the "increase rate of posts" and "increase rate of news" formulas of the n-generation model. after inputting the 48 hours of scenario-1 data, the 2-generation variables "increase rate of posts" and "increase rate of news" are defined as formulas (23) and (30) in the appendix (also shown in figure 8-②). 3) 2-generation model testing and improving. considering the testing period of the 2-generation model as t_testing-2 = 72 h (3Δt), the real scenario continues to evolve from scenario 1 to scenario 2 for 24 h (Δt).
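the gof statistic above (an r²-style measure, 1 for a perfect fit and arbitrarily negative for a poor one) and the coefficient traversal can be sketched together; the grid search below is a simplified stand-in for the paper's "minimum sum of squared errors" algorithm, and the grids and data are illustrative only.

```python
# goodness-of-fit (r²-style, range (-inf, 1]) and a small grid search over
# polynomial coefficients, a simplified stand-in for the "minimum sum of
# squared errors" traversal. grids and data are illustrative assumptions.

from itertools import product

def gof(real, simulated):
    mean = sum(real) / len(real)
    sse = sum((y - s) ** 2 for y, s in zip(real, simulated))
    sst = sum((y - mean) ** 2 for y in real)
    return 1 - sse / sst            # 1 = perfect fit; very negative = poor fit

def poly(coeffs, t):
    return sum(c * t ** i for i, c in enumerate(coeffs))

def traverse(real, grids):
    # try every coefficient combination, keep the one minimising the sse
    best, best_sse = None, float("inf")
    for coeffs in product(*grids):
        sse = sum((y - poly(coeffs, t)) ** 2 for t, y in enumerate(real))
        if sse < best_sse:
            best, best_sse = coeffs, sse
    return best

real = [1 + 2 * t for t in range(10)]             # observed increments
best = traverse(real, [(0, 1, 2), (0, 1, 2, 3)])  # grids for a degree-1 polynomial
print(best, round(gof(real, [poly(best, t) for t in range(10)]), 3))  # → (1, 2) 1.0
```

in the paper's setting the grids would cover ranges around the (n-1)-generation coefficients, and the fitted rates would feed the stock flow model before the gof test on the next Δt window.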
after simulating for 72 h, the calculated gof values and the drawn evolutionary trend curves (figure 8-③) of the variables "increment of posts" and "increment of news" both show that the 2-generation model is much better than the 1-generation model. however, gof_dp and gof_dn are still as low as -1.13 and -219.34, respectively, which means the 2-generation model cannot simulate the real scenario 2 during t_testing-2 or help us predict the evolutionary trend of the public sentiment. therefore, after calculating the minimum sum of squared errors between the observed 72 hours of scenario data and the simulated value series, the optimal coefficient combination is selected as the polynomial coefficients of the "increase rate of posts" and "increase rate of news" formulas of the 3-generation model, shown in figure 8-③. similarly, the models are tested and improved from the 3-generation to the 7-generation over a set number of parallel iterations: the variables "increase rate of posts" are defined as formulas (24) to (28) and the variables "increase rate of news" as formulas (31) to (35) in the appendix. the simulated trend curves become more and more similar to the real curves, and the gof_dp and gof_dn of the models increase over the iteration processes (figure 8-④~⑧). finally, the cultivated 7-generation model is obtained with high gof values (gof_dp = 0.87 and gof_dn = 0.89); although the simulated curves are still smoother than the real ones, the overall public sentiment trend is consistent with the real scenarios. as shown in the stock flow diagram model (fig. 7), the interactions among these system variables drive the evolution and diffusion of public sentiments and ultimately decide the results of the nuclear project by affecting the attitudes of the public. however, most of the variables, especially state variables, are uncontrollable from the perspective of the government.
next, we chose the four controllable variables (times of government response, response time of government response, degree of information transparency and degree of popularity of nuclear knowledge) to explore effective response strategies for guiding the evolution of the public sentiment in this event. the series of different response strategies is shown in table 4. in this table, "times of government response" represents how many times the government responds to the public sentiment in an event; its initial value is 7 (times), because in this case the government took 7 measures to respond to the nuclear-related public sentiment event, and we tentatively adjusted its value from 1 to 70 to represent the different strategies a1-a4. "response time of government response" represents the time-delay in the government response processes; its initial value is set to 2 (hours), and we tentatively adjusted it from 0.5 to 6 to represent the different strategies b1-b4. "degree of information transparency" represents the disclosure degree of the nuclear project-related information; its initial value is set to 0.5, and we tentatively adjusted it from 0.1 to 0.9 to represent the different strategies c1-c4. "degree of popularity of nuclear knowledge" represents how well the knowledge related to nuclear safety has been popularized locally; its initial value is set at 50. the variables "degree of information transparency" and "degree of popularity of nuclear knowledge" are considered relative rather than absolute because they are hard to quantify exactly. furthermore, our research focuses on reality testing (how the evolutionary trends of the public sentiment are simulated by the parallel cultivated sd model) and response strategies testing (how the evolutionary trends of the public sentiment are influenced by different strategies), so this treatment is reasonable: we can simulate the public sentiment scenarios and test the effectiveness of the improved response strategies by increasing or reducing the variables' initial values.
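the strategy setting described above can be sketched as a sweep of control-variable settings over a toy outcome model; the strategy labels mirror the paper's r0/a4/d4 naming, but the response function and every number below are invented for illustration and are not the paper's results.

```python
# illustrative strategy sweep: run a toy "cultivated model" under different
# control-variable settings and compare a support-percentage outcome.
# the labels mirror the paper's naming; the outcome function and all numbers
# are invented assumptions, not the paper's simulation results.

BASELINE = {"times_of_response": 7, "response_time_h": 2,
            "transparency": 0.5, "knowledge_popularity": 50}

def support_percentage(s):
    # toy outcome: support grows with response strength and knowledge popularity
    return round(0.3 * s["times_of_response"] + 0.06 * s["knowledge_popularity"]
                 + 2.0 * s["transparency"] - 0.2 * s["response_time_h"], 2)

strategies = {"r0": BASELINE,                                   # baseline case
              "a4": {**BASELINE, "times_of_response": 70},      # stronger response
              "d4": {**BASELINE, "knowledge_popularity": 300}}  # more popularization
results = {name: support_percentage(s) for name, s in strategies.items()}
print(results)
```

a real sweep would replace `support_percentage` with a full run of the cultivated stock flow model under each control-variable setting.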
2) simulation and discussion of strategy effects (response strategies testing). after running the cultivated 7-generation model with the different values of the control variables for 300 hours, the results indicate the corresponding effects of the different strategies a1~d4 on public attitude to the nuclear project and identify the most appropriate response strategy for guiding the nuclear public sentiment. the effects of the four controllable variables on public attitudes are shown in the two right-most columns of table 3. (1) the separate effects of the four control variables on public attitudes. compared with the baseline scenario r0, the percentages of support posts under the control of strategies a3 and a4 rose to 7.37% and 21.36%, respectively. this means that the approval rating of the nuclear project rises only when the "times of government response" is greatly increased, which is reasonable because the government is one of the project sponsors: enhancing the measures of response and disposal by means of open information, popular science propaganda and so on helps increase the approval rating and guide the public sentiment in the right direction. this result also indicates that hypothesis 3 is tenable in this case. usually, the project approval rating increases as response speed or information transparency increases; however, the simulation results of the eight strategies from b1 to c4 show that adjustments to the variables "response time of government" and "degree of information transparency" cause no obvious change in public opinion. therefore, these two variables are not the most important, and hypotheses 4 and 5 are not entirely tenable in this case. the percentages of support posts under the control of strategies d1 and d2 are reduced to 2.63% and 3.73% from 5.56%, respectively; in addition, the percentages of support posts under the control of strategies d3 and d4 rise to 9.22% and 20.2%, respectively.
this simulated result clearly indicates that a lower "degree of popularity of nuclear knowledge" among the public leads to a lower approval rating of nuclear projects. as the project was supported by government-owned corporations, this result also indicates that a higher "degree of popularity of nuclear knowledge" leads to a higher controllable degree of nuclear project-related public sentiment. professor zhou also publicly declared that "the whole nuclear fuel processing in the jiangmen base project does not relate to nuclear fission reactions" (weiwei li). however, due to the high threshold of nuclear technologies, it is very difficult for most people to quickly and effectively understand the safety of nuclear energy and accept nuclear projects. blind worries about crises stemming from nuclear projects are one of the main reasons for triggering and spreading negative public sentiment, and ultimately for terminating potential projects. therefore, long-term and effective popular-science propaganda about nuclear safety is necessary and important for the government to accumulate and improve the local degree of popularity of nuclear knowledge and ensure project success. hypothesis 6 is therefore tenable. above all, adjusting the key control variables appropriately leads to completely different public attitudes and event results for the jiangmen anti-nuclear mass incident. in the response processes, a higher approval rating of the nuclear project can be achieved using response strategies with higher degrees of response strength, response speed, information transparency and popularity of nuclear knowledge. the variable "popularity of nuclear knowledge" has more remarkable effects on public attitudes than the other three variables. the approval rate exceeds the disapproval rate, leading to project establishment, only with an appropriate increase (to about six times the initial value) of the "degree of popularity of nuclear knowledge".
these conclusions also indicate that adjusting only the three other intermediate control variables, such as raising the "times of response" even to ten times the initial value, is still not enough to reduce people's unreasonable fears and change the event result. the control variable "degree of popularity of nuclear knowledge" plays a decisive role in the evolution process of the public sentiment. however, it is not easy to increase the degree of popularity of nuclear knowledge in a short period of time. as an advance control variable, this indicator must be improved by education and advocacy activities for months or even years before a nuclear-related public sentiment event actually occurs. (2) the effects of the flexible combination strategies on public attitudes. additionally, some flexible combination strategies adjust two or more control variable values at the same time, such as strategy e, which combines a higher "times of government response" with a higher "degree of popularity of nuclear knowledge". these strategies are better for guiding public sentiment than strategies that adjust only one control variable at a time. therefore, the evolution trend of the variable "quantity of support posts" under the combination strategy e is compared with the trends under the strategies r0, a4 and d4 in figure 10. the comparison shows that the quantity of support posts increases significantly with strategy a4 (many more times of government response) or strategy d4 (a much higher degree of popularity of nuclear knowledge). however, table 3 shows that, at the end of the simulation, the percentages of support posts with the two single strategies separately are still lower than with the combination strategy. indeed, the comparison also shows that the quantity of support posts increases much more significantly and rapidly under strategy e than under a4 or d4.
it can also be seen in figure 12 that strategy e leads to a higher percentage of support posts than of opposite posts, which presents a reversed public sentiment simulation result in which the approval rating is higher than the disapproval rating for this nuclear-related project. therefore, the government is able to guide and control the evolution and development of public sentiment more effectively in such a scenario by using the combination strategies. taking the anti-nuclear mass incident as a case study, the parallel simulation models of public sentiment regarding the incident are constructed and improved into the final realistic model gradually and iteratively, and the 18 different strategies for responding to real scenarios are simulated with the improved model to discover the effects of the control variables on public sentiment and provide relevant policy implications. the empirical research verifies the effectiveness of the perdfps framework and shows that the methods proposed in this paper can be very useful in providing decision support for guiding and controlling public sentiment during future nuclear energy developments. traditional sd and pcm are integrated into the perdfps framework in this paper to solve unconventional and unrepeatable public sentiment decision-making problems that lack sufficient historical data. the novel approach cultivates the models to simulate real scenarios with real-time data rather than historical data in a parallel way. compared with the data-dependent "predict-response" decision-making methods, the structure-dependent "scenario-response" perdfps framework has more advantages. first, the parallel evolution and response mechanisms of the public sentiment are both considered in the iterative model-improving processes to achieve consistency between the simulation results and the real scenarios.
second, in the case of a lack of historical data, the decision-making model can be constructed and cultivated gradually in the parallel-improving processes if the initial structure of the public sentiment system is describable. moreover, the effects of various response strategies can be simulated and optimized by adjusting the key control variables of the cultivated model. this would help the government enact better policies or decisions in future public sentiment events. the parallel interactions between the real scenarios and the simulation models will be truly achieved, and the decision effectiveness of the perdfps framework completely verified, in the practical evolution and response processes of public sentiments. however, due to limited conditions (most original events, such as nuclear-related large-scale projects, occur rarely), it was impossible to develop a simulation experiment based on a real-time scenario of nuclear public sentiment in this paper. this research aims to propose a basic parallel decision-making framework to help governments guide and control public sentiment. thus, the empirical research component of this paper uses the historic case of the jiangmen anti-nuclear mass incident to demonstrate and discuss the principle and effectiveness of the perdfps framework. meanwhile, in order to keep the empirical research concise, not all details are considered as key variables of the multi-generational parallel simulation models. therefore, in the next step, more elements, features and other details of public sentiment systems, such as opinion leaders and their impacts on public sentiment, should be designed and simulated to make the decision-making models more realistic and effective in practical response processes.
furthermore, due to the similarity of the system structures, the novel approach can be used for public sentiment events not only in the nuclear-related field but also in many other fields, such as food safety and healthcare. once events break out, the public sentiment systems can be constructed rapidly using little initial data and improved iteratively and dynamically with real-time data. using this framework, decision makers are able to describe, rehearse and make proper decisions to guide or control the evolution trends of public sentiment effectively. therefore, in future studies, the perdfps framework and the relevant sd models can be extended to other social heat incidents with similar public sentiment system structures, such as "vaccination scandals" and "food-safety scandals". the percentage of support posts drops to 6% and the percentage of opposite posts rises to 38%. the curves shown in figure 9 are consistent with the real nuclear project case, which was eventually cancelled. therefore, the cultivated 7-generation model can simulate and match the evolutionary real scenarios very well. meanwhile, the simulation results also indicate that the dynamic reality hypotheses proposed in section 3.2 are tenable. references:
• anton, collaborative scenario modeling in emergency management through cross-impact
• public opinion, public policy, and democracy
• handbook of politics: state and society in global perspective
• sham or shame: rethinking the china's milk powder scandal from a legal perspective
• emergency management typical case study report
• developing theory through simulation methods
• the mmr vaccine and autism
• constructing a model of effective information dissemination in a crisis
• tests for building confidence in system dynamics models
• wakefield's article linking mmr vaccine and autism was fraudulent
• public-opinion sentiment analysis for large hydro projects
• public's responses to an oil spill accident: a test of the attribution theory and situational crisis communication theory
• reading others' comments and public opinion poll results on social media: social judgment and spiral of empowerment
• a comparative study of crisis communication strategies between mainland china and taiwan: the melamine-tainted milk powder crisis in the chinese context
• new product diffusion models in innovation management - a system dynamics perspective
• notes and insights: documentation for model transparency
• a toolkit of designs for mixing discrete event simulation and system dynamics
• the state of nuclear power two years after fukushima - the asean perspective
• reporting guidelines for simulation-based research in social sciences
• a system dynamics view of the acute bed blockage problem in the irish healthcare system
• business dynamics: systems thinking and modeling for a complex world
• system dynamics modelling: tools for learning in a complex world
• post-fukushima public acceptance on resuming the nuclear power program in china
• critical learning incidents in system dynamics modelling engagements
• problem analysis for response to the network public opinion based on system dynamics
• supporting strategy using system dynamics
• parallel control and management for intelligent transportation systems: concepts, architectures, and applications
• a system for real-time twitter sentiment analysis of 2012 u.s. presidential election cycle
• the prevention and resolution of the nuclear public opinion from the anti-nuclear demonstrations of jiangmen
• introductory econometrics - a modern approach
• an efficient wikipedia semantic matching approach to text document classification
• managerial influence on the diffusion of innovations within intra-organizational networks
• modeling and predicting opinion formation with trust propagation in online social networks
• emergency policy exploration for network public opinion crisis in water pollution accident by hazardous chemical leakage based on system dynamics
• research on the mechanism of public opinion on internet for unexpected emergency
• research on the mechanism of public opinion on internet for abnormal emergency based on the system dynamics modeling
the paper is supported by national natural science foundation of china (no. 71974090); young
the following etable 1 is developed to clearly explain all the variables defined in our models. the variables of the sd model are classified into two categories: state and control variables. the latter is classified into two subcategories: parallel cultivating control variables and decision-making control variables. each kind of variable has relevant functions in the model. ② the parallel control variables, including "increase rate of posts" and "increase rate of news", are used to cultivate the n-generation sd model. the key state variable "increment of posts" is directly determined by the real-time state variables "heat rate of public sentiment" and "reaction time of netizens" and the parallel control variable "increase rate of posts". as shown in the rewritten section 4.3, the variable "increase rate of posts" of the initial 1-generation sd model (△t=60 hours) is defined temporarily as formula (22) by collecting and fitting 216 hours of data (from july 17 to august 3, 2018) on similar public sentiment triggered by a similar event, from ef.zhiweidata.com.
formula (22) is finally adjusted into formula (24) for the cultivated 3-generation model (△t=180 hours) after parallel iterative cultivating. ③ the decision-making control variables, including "times of government response", "response time of government", "degree of information transparency" and "degree of popularity of nuclear knowledge", are controllable from the perspective of the government. these variables can be tentatively set by the decision makers to represent different response strategies. the effects of the different strategies can then be tested by reviewing the simulation results of the cultivated n-generation sd model, and the optimal strategy can be obtained by adjusting the control variables. key: cord-297161-ziwfr9dv authors: sauter, t.; pires pacheco, m. title: testing informed sir based epidemiological model for covid-19 in luxembourg date: 2020-07-25 journal: nan doi: 10.1101/2020.07.21.20159046 sha: doc_id: 297161 cord_uid: ziwfr9dv the interpretation of the number of covid-19 cases and deaths in a country or region is strongly dependent on the number of performed tests. we developed a novel sir based epidemiological model (sivrt) which allows the country-specific integration of testing information and other available data. the model thereby enables a dynamic inspection of the pandemic and allows estimating key figures, like the number of overall detected and undetected covid-19 cases and the infection fatality rate. as proof of concept, the novel sivrt model was used to simulate the first phase of the pandemic in luxembourg. an overall number of infections of 13.000 and an infection fatality rate of 1,3% were estimated, which is in concordance with data from population-wide testing. furthermore, based on the data as of end of may 2020 and assuming a partial deconfinement, an increase of cases is predicted from mid-july 2020 onwards. this is consistent with the currently observed rise and shows the predictive potential of the novel sivrt model.
the pandemic disease covid-19, caused by the coronavirus sars-cov-2, became in a few months one of the leading causes of death worldwide, with now over 580.000 fatalities and 13 million reported cases (dong et al., 2020; johns hopkins dashboard july 16, 2020, n.d.). the total number of cases and recovered patients is unknown, as a fraction of the virus carriers only show mild or no symptoms and hence escape any diagnostics, or could not get tested, especially at the onset of the crisis, due to a lack of infrastructure and test material. to a lesser extent, the number of cases is likely to be underestimated in countries that did not count deaths outside care facilities, whereas other countries, like belgium, included every fatality that had tested positive for the virus regardless of the cause of death (https://www.politico.eu/article/why-is-belgiums-death-toll-so-high/). the lack of consistency among testing strategies and case counts prevents the reliable and comparable calculation of simple measures, such as the infection fatality rate (ifr) or the effective reproduction number rt_eff, which is required to better assess the virulence and spread of the disease. a unified large-scale testing strategy and a more rigorous integration of the testing information would enable more precise political decisions on measures, beyond following the all-or-nothing example of china, which imposed a lockdown on its population to avoid a breakdown of the healthcare system due to a saturation of icu beds by covid-19 patients. among the first countries affected by the virus, only those that had experienced the previous mers-cov outbreak, such as south korea and singapore, and hence had mitigation strategies in place, or that had established large test infrastructures, like iceland, could avoid strict containment strategies.
other countries, like sweden and england, attempted to find a balance between lockdown and uncontrolled spread, slowing the contagion to protect the healthcare system and the elderly without facing the economic harm caused by a full lockdown. in luxembourg, the first case and the first death were reported on the 29th of february and the 13th of march 2020, respectively. on the 16th of march, schools were closed and all non-crucial workers shifted to remote work or were furloughed. two days later, the state of emergency was declared, in-person gatherings were prohibited, and restaurants and bars were closed. over 70.000 workers were furloughed and 30.000 more took a leave for family reasons to homeschool their children. in parallel, within the con-vince study, serologic tests were performed to assess the presence of igg and iga in plasma, as well as nose and mouth swabs, on a random set of 1.800 inhabitants to assess the spread of the disease in the luxembourgish population. around 1,9% of the samples had antibodies and 5 people tested positive, indicating that luxembourg was far away from herd immunity (snoeck et al., 2020). epidemic models such as compartment models have proven to be a useful tool in other outbreaks to assess the efficiency of mitigation strategies and to plan the timing and strength of interventions. more specifically, sir (susceptible, infected and removed) and seir (susceptible, exposed, infected, and removed) models, formulated as ordinary differential equations (odes), allow determining when social distancing, hand washing, testing, and voluntary remote working should be sufficient to prevent an exponential growth of the cases, and when a significant portion of the population has to return into lockdown (song et al., 2020; tang et al., 2020; wangping et al., 2020; yang et al., 2020). adaptations and extensions of sir and seir models were published for covid-19 already (giordano et al., 2020; siwiak et al., 2020; tang et al., 2020).
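as a concrete illustration of the sir odes referenced here, the three equations can be integrated numerically with scipy. this is a generic sketch with illustrative parameter values, not the calibrated model of any of the cited papers; the population figure is an assumption roughly matching luxembourg:

```python
from scipy.integrate import solve_ivp

def sir(t, y, beta, gamma):
    """classic sir odes: ds/dt = -beta*s*i/n, di/dt = beta*s*i/n - gamma*i,
    dr/dt = gamma*i, with constant population n = s + i + r."""
    s, i, r = y
    n = s + i + r
    new_infections = beta * s * i / n
    recoveries = gamma * i
    return [-new_infections, new_infections - recoveries, recoveries]

# illustrative parameters only (not fitted values):
beta, gamma = 0.3, 0.1        # basic reproduction number r0 = beta/gamma = 3
n0 = 626_000                  # roughly luxembourg's population (assumption)
y0 = [n0 - 10, 10, 0]         # start with 10 infected individuals

sol = solve_ivp(sir, (0, 200), y0, args=(beta, gamma))
s_end, i_end, r_end = sol.y[:, -1]
print(f"fraction ever infected after 200 days: {r_end / n0:.1%}")
```

with r0 = 3 the epidemic burns through most of the population; lowering beta (e.g. through a lockdown event) is what the scenario simulations below vary.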
such models allow describing the dynamics of mutually exclusive states, such as susceptible (s), which for covid-19 is assumed to be the entire population of a country, region or city, the number of infected (i) and removed (r), which often combines deaths and recovered, as well as the number of exposed (e) for seir models. the variables i and r are often unknown, as the number of cases and announced recovered patients only accounts for a fraction of the real values, a fraction that is dependent on the testing performed within a country. therefore, the numbers of susceptible and exposed, which in seir models equal the total population minus infected and removed, are also undetermined. several studies extended the number of considered states in such models to further differentiate between detected and undetected cases (susceptible (s), infected (i), diagnosed (d), ailing (a), recognized (r), threatened (t), healed (h) and extinct (e)) (gaeta, 2020; giordano et al., 2020), or took the severity of the disease into account in relation to the age of the infected person (balabdaoui & mohr, 2020; wu et al., 2020). however, with an increase in the number of states and parameters describing the transitions between these states, more data is required to calibrate the model, i.e. to estimate the model parameters. roda et al. (roda et al., 2020) showed that an sir model seems to represent data obtained from case reports better than seir models. notably, sir models captured a link between the transmission rate β and the case-infection ratio that was missed by seir models. the underestimation of the infected, deaths and removed, due to not considering country-specific testing information, causes sir models to predict ifr and effective reproduction number (rt_eff) values that vary drastically across countries with different testing and might often be overestimated.
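the distortion introduced by country-specific testing can be made concrete with simple arithmetic: with a detection rate below one, a naive case fatality rate computed from confirmed cases overestimates the infection fatality rate by the inverse of the detection rate. the numbers below are purely illustrative:

```python
def naive_cfr(deaths, confirmed_cases):
    """case fatality rate computed from detected (confirmed) cases only."""
    return deaths / confirmed_cases

def ifr(deaths, total_infections):
    """infection fatality rate over all infections, detected or not."""
    return deaths / total_infections

# illustrative numbers: if only 30% of infections are detected, the naive
# cfr overestimates the ifr by the inverse of the detection rate (1/0.3).
total_infections = 10_000
detection_rate = 0.3
deaths = 130
confirmed = total_infections * detection_rate

print(f"naive cfr: {naive_cfr(deaths, confirmed):.2%}")   # 4.33%
print(f"ifr:       {ifr(deaths, total_infections):.2%}")  # 1.30%
```

this is exactly why two countries with the same true ifr but different testing intensities report very different apparent fatality rates.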
to overcome this issue, we propose an extended sir model (sivrt) which is informed by the number of performed tests and also takes the number of hospitalizations into account to parametrize the model. this allows for a better prediction of the evolution of the disease and the estimation of key pandemic parameters, as well as the analysis of different deconfinement and testing strategies. the novel sivrt model (figure 1) comprises a layer of undetected states and a layer of detected states, up to and including detected death cases (dd). all these transitions are modeled with first-order laws with rate constants kir, kiv, kvr, kvd, kidr, kidvd, kvdr and kvdd, respectively. the rate constants for detected and non-detected states are assumed to be equal. regarding the testing, it is presumed that (i) severe cases (v) are tested with high probability compared to asymptomatic cases, as it is more likely that the severe cases will be spotted within the population; (ii) susceptibles (s) and recovered people rcum (= r + rd) are tested with the same probability. testing of severe cases is modelled as a first-order term as well (kvvd), and the remaining performed tests are distributed among infected (i) and the sum of susceptible and recovered (s + rcum). the ratio between these two groups is adjusted with parameter ktivss, which is also subjected to optimization. for luxembourg, data on people tested positive, death cases, hospitalisations and performed tests were obtained from the website of the luxembourgish government (https://coronavirus.gouvernement.lu/en.html) and are summarized in appendix 2. the model was implemented in the iqm toolbox (sunnåker & schmidt, 2016). for the predicted no-lockdown scenario (figure 3), the lockdown event on day 17 was removed. for the predicted light-lockdown scenario (figure 4), the lockdown event on day 17 was kept, but the infection rate parameter (ksi) was increased by 10% of the difference between its value during the full lockdown and its value before the lockdown.
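the scenario definitions above amount to a time-dependent infection rate. a piecewise sketch is given below; the day numbers and lifting fractions follow the text, while the rate values themselves are illustrative assumptions (the fitted ksi values are not reported in this excerpt), and the function form is an assumption about how the lockdown events were encoded:

```python
def ksi_scenario(t, ksi_pre, ksi_lock, lockdown_day=17,
                 lift_day=85, lift_fraction=0.2):
    """piecewise-constant infection rate: pre-lockdown value until
    lockdown_day, full-lockdown value afterwards, then partially lifted
    on lift_day by a fraction of the (pre - lockdown) difference."""
    if t < lockdown_day:
        return ksi_pre
    if t < lift_day:
        return ksi_lock
    return ksi_lock + lift_fraction * (ksi_pre - ksi_lock)

# illustrative rate values only:
ksi_pre, ksi_lock = 0.30, 0.05

print(ksi_scenario(10, ksi_pre, ksi_lock))                 # pre-lockdown rate
print(ksi_scenario(50, ksi_pre, ksi_lock))                 # full-lockdown rate
print(round(ksi_scenario(100, ksi_pre, ksi_lock), 3))      # 20% partial lifting
```

setting `lift_fraction=0.1` reproduces the light-lockdown scenario, `1.0` the full lifting, and removing the lockdown branch the no-lockdown scenario.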
for the predicted partial lifting of the lockdown scenario as of end of may (figure 5), the infection rate parameter (ksi) was increased on day 85 of the simulation by 20% of the difference between its value during the full lockdown and its value before the lockdown. the testing rate was kept constant. for the predicted lifting of the lockdown scenario as of end of may with increased testing, approximately matching the luxembourgish strategy of testing (figure 6), on day 85 of the simulation the infection rate parameter (ksi) was set to its value before the lockdown and the testing rate was increased to 5.000 tests per day. as the number of performed tests strongly influences the dynamic analysis of the covid-19 pandemic in a country or region, we developed a novel sir based epidemiological model (sivrt, figure 1) which allows the integration of this key information. the model consists of two layers describing the undetected and detected cases, whereby the transition between these layers is realized by testing. the model distinguishes severe from non/less symptomatic cases. the probability for severe cases to get tested is assumed to be higher. the model consists of 9 states and has been implemented in the iqm toolbox within matlab (methods & appendix 1). importantly, it allows fitting to epidemiological data, among others to detected cases and deaths. the estimates are in line with data from south korea as of march 25, 2020, which had one of the largest numbers of cases and tests performed at the onset of the pandemic (kim et al., 2020), and, more importantly, with the estimated ifr, after adjusting for the delay from confirmation to death, obtained on the diamond princess cruise ship (russell et al., 2020). in the no-lockdown scenario, substantially more infections and deaths are predicted to have occurred in luxembourg, with only around 5.000 deaths being detected and assigned to the pandemic (figure 3). as the same number of tests as performed in reality was assumed in this simulation, a high number of deaths would not have been detected.
a lighter lockdown, in contrast, could already have led to a second infection wave, as shown in an example simulation (figure 4). thus the model, too, supports the strong necessity of the performed lockdown in comparison with alternative scenarios. (figure caption: example simulation showing a reduced risk of a second infection wave arising around mid-july (day 135), compared to lower testing as shown in figure 5. legend as in figure 2.) in summary, the novel testing-informed sivrt model structure allows describing and analyzing the covid-19 pandemic data of luxembourg in dependency of the number of performed tests. this enables the estimation of the overall and recovered cases, including detected and non-detected cases, and thereby the estimation of the infection fatality rate (ifr). it is furthermore possible to perform predictions on past and future scenarios combining lockdown lifting and testing. simulations of the novel sivrt model with parameters estimated from the data of the early pandemic in luxembourg give a full dynamic picture including detected and non-detected cases. in particular, the overall number of cases until end of may and the ifr are estimated at around 13.000, representing 2,1% of the population, and 1,3%, respectively. this is in line with the 1,9% of volunteers in the con-vince study that had igg antibodies against sars-cov-2 in their plasma, and with the estimated ifr on the diamond princess cruise ship of 1,3 (95% ci: 0.38-3.6) (russell et al., 2020). the sivrt model also allowed predicting the appearance of a second wave in a time frame of 50 days after a partial lifting of the lockdown. this is in concordance with the rise in cases seen in luxembourg as of mid-july.
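the headline figures reported here can be cross-checked with simple arithmetic; the population value used below (about 626,000 in 2020) is an assumption, not a number from the text:

```python
population = 626_000            # assumed ~2020 population of luxembourg
estimated_infections = 13_000   # model estimate, detected + undetected cases
estimated_ifr = 0.013           # estimated infection fatality rate (1,3%)

share_infected = estimated_infections / population
# implied total deaths (detected plus undetected) under the estimated ifr:
implied_total_deaths = estimated_infections * estimated_ifr

print(f"share of population ever infected: {share_infected:.1%}")  # 2.1%
print(f"implied total deaths under the ifr: {implied_total_deaths:.0f}")
```

the 2,1% share matches the reported figure and is close to the 1,9% seroprevalence from the con-vince study.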
references:
• age-stratified model of the covid-19 epidemic to analyze the impact of relaxing lockdown measures: nowcasting and forecasting for switzerland
• an interactive web-based dashboard to track covid-19 in real time
• a simple sir model with a large set of asymptomatic infectives
• modelling the covid-19 epidemic and implementation of population-wide interventions in italy
• understanding and interpretation of case fatality rate of coronavirus disease 2019
• why is it difficult to accurately predict the covid-19 epidemic?
• estimating the infection and case fatality ratio for coronavirus disease (covid-19) using age-adjusted data from the outbreak on the diamond princess cruise ship
• from a single host to global spread. the global mobility based modelling of the covid-19 pandemic implies higher infection and lower detection rates than current estimates
• prevalence of sars-cov-2 infection in the luxembourgish population: the con-vince study
• an epidemiological forecast model and software assessing interventions on covid-19 epidemic in china
• iqm tools - efficient state-of-the-art modeling across pharmacometrics and systems pharmacology
• estimation of the transmission risk of the 2019-ncov and its implication for public health interventions
• extended sir prediction of the epidemics trend of covid-19 in italy and compared with hunan
• estimating clinical severity of covid-19 from the transmission dynamics in wuhan, china
• short-term forecasts and long-term mitigation evaluations for the covid-19 epidemic in hubei province
key: cord-302277-c66xm2n4 authors: bakaletz, lauren o. title: developing animal models for polymicrobial diseases date: 2004 journal: nat rev microbiol doi: 10.1038/nrmicro928 sha: doc_id: 302277 cord_uid: c66xm2n4 polymicrobial diseases involve two or more microorganisms that act synergistically, or in succession, to mediate complex disease processes.
although polymicrobial diseases in animals and humans can be caused by similar organisms, these diseases are often also caused by organisms from different kingdoms, genera, species, strains, substrains and even by phenotypic variants of a single species. animal models are often required to understand the mechanisms of pathogenesis, and to develop therapies and prevention regimes. however, reproducing polymicrobial diseases of humans in animal hosts presents significant challenges. there is now compelling evidence that many infectious diseases of humans (fig. 1) and animals (table 1) are caused by more than one microorganism. the mixed microbial nature of these diseases has been recognized since the early 1920s but there has been renewed interest in this topic since the 1980s 1 , signalled by the publication of four important reviews from 1982 to the present date. polymicrobial diseases (see box 1 for nomenclature) can be caused by the synergistic or sequential action of infectious agents from either the same or different kingdoms, genera, species, strains or substrains, or by different phenotypic variants of a single species 6 . polymicrobial diseases share underlying mechanisms of pathogenesis, such as common predisposing factors (box 2) , but each disease has unique aspects. although the molecular mechanisms of some polymicrobial infections are known, other polymicrobial diseases are not well understood. owing to their complexity, the study of polymicrobial infections requires a multidisciplinary approach and specific in vitro methodologies and animal models. the development of assay systems and treatment and prevention regimes is needed. multiple diverse in vitro systems have been used to study polymicrobial diseases (box 3) . 
although in vitro methods are crucial for understanding polymicrobial diseases, rigorous, reproducible and relevant animal models of human diseases are essential for the prevention and treatment of these co-infections 7-10. all animal models of human diseases have inherent limitations, but they also have important advantages over in vitro methods, including the presence of organized organ systems, an intact immune system and, in inbred mice, specific genetic backgrounds, as well as the availability of many reagents for characterizing the immune response to sequential or co-infecting microorganisms. the availability of mice with specific genetic backgrounds can have a pivotal role in understanding the mechanisms of pathogenesis of polymicrobial diseases, as exemplified by studies on septic peritonitis 11-17, periodontal disease 18 and lyme arthritis 19,20. understanding the molecular mechanisms underlying polymicrobial diseases of veterinary importance has also been facilitated by the use of animal models. these veterinary systems are useful examples for researchers attempting to develop animal models of complex human diseases. so far, most animal models for human polymicrobial diseases are rodents, usually mice, but also rats, gerbils, cotton rats and chinchillas. other animal models include non-human primates, which are useful for modelling diseases that are caused by microorganisms with a restricted host range. for most human viral co-infections of clinical importance, good animal models and culture systems are lacking and are urgently required. this review provides an overview of the pathogenesis of selected polymicrobial diseases and the molecular basis for some of these co-infections, and describes animal models that have been developed to mimic these diseases. human co-infections with multiple hepatotropic viruses from the hepatitis virus group are well documented.
co-infection with multiple hepatitis viruses is possible owing to their similar routes of transmission and ability to chronically infect the host. hepatitis a virus (hav) co-infection of individuals that are chronically infected with hepatitis b virus (hbv) and/or hepatitis c virus (hcv) results in a disease of increased severity and risk of death. moreover, hbv-hcv co-infection occurs in 10-15% of hbv patients, and hepatitis g virus (hgv)-hcv co-infection occurs in 10-20% of individuals with chronic hcv infection; hbv-hepatitis d virus (hdv) co-infection, however, occurs only in the setting of co-infecting hbv. viral interference, in which replication of one virus is suppressed by another virus, is an intriguing aspect of triple hbv-hcv-hdv infection: hdv can suppress both hbv and hcv replication 29. in a retrospective study of patients with hepatitis virus co-infections, hdv was dominant by rt-pcr detection of hdv rna in triple co-infections, but in dual co-infections there were alternating dominant roles for either hbv or hcv. multiple hepatotropic viral infections are associated with reduced hcv replication but increased pathology. patients with dual or triple co-infections have more severe liver disease pathologies than singly infected patients. in cattle, infections with bovine viral diarrhoea virus (bvdv) can be clinically asymptomatic or can cause severe symptoms. the outcome depends on whether the primary infection occurred in utero or after birth 21, and whether the primary infection was with a cytopathic or non-cytopathic biotype of bvdv.
conversely, acute bovine diarrhoeal disease is induced by primary post-natal infection with either biotype of bvdv. in some instances, production of a virulence factor by one microorganism can increase the risk of infection or colonization by a second microorganism. this might include sharing virulence factors, such as adhesins; for example, h. influenzae shows enhanced adherence when pretreated with bordetella pertussis adhesins. infection with a microorganism that results in an impaired immune system predisposes the affected individual to infection with other microorganisms, or can allow infection of a niche that is usually protected in the body. the pathology of bovine respiratory disease complex (brdc) results from the effects of pathogen and host virulence factors [56] [57] [58] [59] [60] [61] . m. haemolytica produces multiple virulence factors, including a leukotoxin of the repeat in toxin (rtx) family that activates pmns, induces production of inflammatory cytokines, results in cytoskeletal changes and causes apoptosis. leukotoxin-activated pmns are crucial to pathogenesis, and inflammatory mediators released by neutrophils are thought to be essential because inflammation and most of the pathology in brdc is absent in neutrophil-depleted animals. for human viruses, in contrast with rodent hosts, the use of greater primates for modelling human viral disease is limited by differences in the clinical presentation of disease -some diseases are asymptomatic in primates -and the expense of using primate models in research 40 . given the difficulties of modelling diseases caused by individual viruses, it is not surprising that models of virus co-infections, such as hiv and hcv, have not been established. a variety of small animal and lower-order nonhuman primate model systems have been developed to model human viral co-infections. mice and ferrets have been used to study interference between influenza a virus (iav) strains, as well as interference between cold-adapted influenza a and b vaccine reassortants and wild-type viruses [41] [42] [43] .
murine hosts have been used to study how one retrovirus can block infection by a second retrovirus 44 , and to define the role of the tissue tropisms of helper viruses on the disease specificity of a co-infecting oncogene-containing retrovirus, such as the type of tumour that is induced 45 . balb/c and nih swiss mice have been used as models to analyse a putative pathogenic interaction between a murine leukaemia virus and a polyomavirus 46 . rabbits have been used to produce models of mixed htlv-1 and hiv-1 co-infection 47 and co-infection with htlv types i and ii 48 . rhesus and pig-tailed macaque monkeys have been used to model co-infection with simian immunodeficiency virus (siv) and simian acquired immunodeficiency syndrome retrovirus type 1 (srv-1) 49 . more recently, macaques have been used to define the susceptibility to co-infection with two human hiv-2 isolates 50 . in this model, co-infections were established in macaques that were simultaneously exposed to both viruses, whereas in macaques that were sequentially challenged, co-infections were only observed if challenge with the second hiv-2 isolate occurred early after challenge with the first hiv-2 isolate and before full seroconversion. chimpanzees have also been used to study hiv-1 subtype b strain co-infections 51 . there are several new in vitro methods, including:
• genomic sequencing of individual microorganisms and mixed microbial ecosystems and the use of meta-genomics to study the genomes of uncultured microbial communities.
• molecular phylogenetic studies, such as genotyping or 16s rrna analyses, to determine the genetic relatedness or diversity of microbial community members.
• genome-wide transcription profiling using microarrays to assess the rates of transcription during polymicrobial infection.
• fluorescence-based imaging and detection methods such as laser confocal microscopy using fluorescent probes, fluorescent in situ hybridization (fish) using species-specific 16s rrna-directed oligonucleotide probes and the use of transcription and translation reporter gene constructs.
• analyses of inter-genera bacterial signalling such as quorum sensing.
• use of biofilm chambers and continuous culture flow cell reactors to study polymicrobial diseases.
• co-infections of cell lines, tissue and organ cultures and extracted teeth.
• laser capture microdissection of colonized infected tissues.
the mechanisms of synergy between pathogens in om have been analysed using in vitro methods and animal models (reviewed in ref. 7). briefly, viral infection compromises the protective functions of the eustachian tube, alters respiratory-tract secretions, damages the mucosal epithelial lining, interferes with antibiotic efficacy, modulates the immune response and enhances bacterial adherence 77 and colonization 78 to predispose the host to bacterial om. influenza and parainfluenza viruses have neuraminidases that remove sialic acids from host-cell glycoproteins, which results in the exposure of receptors for pneumococci. the activity of neuraminidases allows the adherence of and/or colonization by s. pneumoniae 79 , which is one of the primary aetiological agents of acute om. although all upper respiratory tract viruses can disrupt the host respiratory tract defences, each virus has a specific pathology. not surprisingly, there are specific partnerships between viruses and bacteria in om. in the chinchilla model (fig. 2) , iav predisposes the host middle ear to s. pneumoniae-induced or pneumococcal om and adenovirus infection predisposes the host middle ear to nthi om. iav does not predispose the chinchilla host to nthi-induced om, nor does adenovirus predispose the host to either m.
catarrhalis-induced om 80 or to pneumococcal om 78, 81 . virus and bacteria synergy seems to be maintained in adults and children. the oropharynges of 15% of adults with experimental iav infection were heavily colonized with s. pneumoniae six days after viral challenge 82 , whereas isolation rates for other middle-ear pathogens were unaffected. in children, s. pneumoniae is cultured more often from middle-ear fluids that contain iav than from those that are culture-positive for either rsv or parainfluenza virus 83 . cystic fibrosis polymicrobial diseases. upper respiratory viruses predispose the host to bacterial invasion of the lower respiratory tract and are often detected in patients with copd 84 or cystic fibrosis (cf). in addition to bacterial factors, host determinants also have a role in co-infections of the cf lung. cf patients do not have a higher incidence of viral disease compared with non-cf individuals, but viral disease produces more significant pathology. it has been proposed that cf patients have impaired innate immunity, which allows increased virus replication and upregulated cytokine production. in turn, this results in increased bacterial colonization of the lung. zheng and co-workers 85 showed that increased virus replication in cf patients is due to the absence of the antiviral nitric oxide synthesis pathway. this was attributed to impaired activation of signal transducer and activator of transcription 1 (stat1), which is an important component of the antiviral defences of the host. compromising innate immunity provides a mechanism for the severity of viral disease in cf and the establishment of bacterial co-infections. expression of virulence determinants by pseudomonas aeruginosa, a pathogen of cf patients, can depend on signals produced by other bacteria. transcriptional profiling in vitro coupled with research in an animal model showed that the addition of exogenous signalling molecules can upregulate the expression of virulence genes.
in pigs, the similar porcine respiratory disease complex (prdc) is caused by co-infection with one of several porcine respiratory tract viruses and members of the pasteurellaceae family [62] [63] [64] [65] [66] . porcine reproductive and respiratory syndrome. prrs is caused by prrsv co-infection with multiple bacterial pathogens including streptococcus suis type ii 67 , bordetella bronchiseptica 68 , mycoplasma hyopneumoniae 69 and actinobacillus pleuropneumoniae 70 . in turkeys, poult enteritis mortality syndrome (pems) is caused by turkey coronavirus, avian pneumovirus or newcastle disease virus co-infection with enteropathogenic escherichia coli 71, 72 . despite the diverse spectrum of diseases and anatomical niches, there are common underlying mechanisms involved in these co-infections. often, viral disruption of host defences has a role in the development of bacterial co-infections. in otitis media, which is a middle ear infection, a synergistic interaction that results in disease owing to co-infection with an upper respiratory tract virus and three bacterial species -streptococcus pneumoniae, nontypeable haemophilus influenzae (nthi) and moraxella catarrhalis -is well documented. however, certain viruses such as respiratory syncytial virus (rsv) and rhinovirus seem to predispose affected individuals more often to bacterial om. the saying that children "get a cold and a week later develop om" is substantiated by epidemiological data that indicate a seasonal influence on the coincidence of 'colds' and om, as well as evidence for a peak incidence of virus isolation that is coincident with, or immediately preceding, peak incidence of om (fig. 2) . in the recent finnish om cohort study and finnish om vaccine trial, the relationship between viruses and om was supported by data that showed the presence of a virus in either nasopharyngeal aspirates or middle-ear fluid specimens in 54% or 67% of om cases in these studies, respectively 76 .
rhinovirus was the most commonly isolated virus, followed by enterovirus and rsv. a specific virus was detected in two-thirds of all cases of acute om in young children, but only those viruses that are tested for can be detected, so this figure is likely to underestimate the proportion of acute om events with viral co-infection. a sequential inoculation model has been developed in mice to probe the mechanisms of the interaction between s. pneumoniae and iav. mice infected simultaneously with s. pneumoniae and iav displayed gradual weight loss and increased mortality, commensurate with an additive effect. conversely, mice infected with s. pneumoniae seven days after iav infection uniformly died within 24 hours and had significant bacteraemia -lethality was due to overwhelming pneumococcal septicaemia 89 . this model is being used to define the molecular mechanisms of the lethal synergy of iav with s. pneumoniae 90 . the activity of viral neuraminidase was found to be crucial to this synergistic relationship 91 and, in common with om, has an important role in predisposing both the upper and lower respiratory tracts to invasion by s. pneumoniae. signalling molecules such as autoinducer-2, whether added exogenously or produced by the oropharyngeal bacterial flora, upregulated the expression of genes that encode virulence factors 87 . modulation of gene expression by interspecies communication between normal flora and pathogenic bacteria could therefore have a role in polymicrobial diseases. periodontitis. some herpesviruses, including human cytomegalovirus (hcmv), epstein-barr virus type 1 (ebv-1) and hsv, have been implicated in the pathogenesis of a severe and highly aggressive form of periodontitis through co-infection with porphyromonas gingivalis 87, 88 . hcmv and hsv were detected at significant levels using pcr in periodontal disease and were shown to be good predictors of the presence of p. gingivalis. ebv-1 was not linked to isolation of p.
gingivalis but was also predictive of active disease 87 . a common theme has emerged from these models that upper respiratory tract viruses of both animal and human hosts can predispose the respiratory tract to infection by pasteurellaceae in brdc and prdc in animals and periodontitis, sinusitis, copd and om in humans. a mouse model has been developed to evaluate the role of respiratory dendritic cells (rdcs) in viral-bacterial co-infections 105 . rdc migration from the lungs to the secondary lymph nodes after infection with pulmonary virus is monitored by the use of a fluorescent dye. after inoculation with influenza virus, the rate of rdc migration to the draining peribronchial lymph nodes increased, but this only occurred during the first 24 hours after virus infection. after 24 hours, rdcs did not migrate, despite virus replication and pulmonary inflammation. moreover, viral infection suppressed additional rdc migration in response to either a second pulmonary virus infection or administration of bacterial cpg dna. in addition to suppressed rdc migration, there was also suppression of an antiviral pulmonary cd8 + t-cell response. it seems likely that the transient suppression of rdc migration and the delayed development of an effective adaptive immune response to a second infection might be another mechanism by which influenza virus predisposes the host to bacterial co-infection. atrophic rhinitis. infection with more than one bacterial species is common in animals and man. in pigs, atrophic rhinitis (ar), which is characterized by severe atrophy of the nasal turbinates, is caused by co-infection with strains of b. bronchiseptica and heat-labile toxin-producing strains of p. multocida [106] [107] [108] . p. multocida can adhere to respiratory tissues, but co-infection with b. bronchiseptica allows more efficient colonization by p. multocida. p. multocida produces a dermonecrotic toxin called pmt (for p. 
multocida toxin), which interferes with normal bone modelling in both the nasal turbinates and long bones in swine, and is distinct from the b. bronchiseptica dermonecrotic toxin (dnt). in porcine models, pmt causes a more serious form of ar known as progressive ar, whereas b. bronchiseptica infection alone induces a milder, or non-progressive, form of the disease. bacterial co-infections of humans include orofacial infections 109 , adenotonsillitis 110 , persistent osteomyelitis 111 , peritonitis 112 , chronic sinusitis 113 , abscesses 114, 115 , necrotizing fasciitis and approximately one-third of urinary tract infections (utis) in the elderly 116 and in renal transplant patients 117 . two important bacterial co-infections are periodontitis and vaginosis. periodontitis. periodontal disease causes tooth loss and is associated with systemic vascular diseases such as atherosclerosis and carotid coronary stenotic artery disease 118 . periodontitis in an expectant mother can contribute to both low birth weight and pre-term labour 119 . periodontal disease is initiated by the formation of a bacterial biofilm on the tooth surface, followed by bacterial invasion of gingival tissues. a murine model has been developed to reproduce the pathogenesis of human meningococcaemia, which often results in serious symptoms or death 92 . in this model, adult balb/c mice are infected intranasally with a mouse-adapted iav and then, seven or ten days later, are co-infected with neisseria meningitidis. fatal meningococcal pneumonia and bacteraemia occurred in mice challenged at seven, but not ten, days after iav infection. meningococcal pneumonia and bacteraemia did not develop in mice that were not co-infected with iav. susceptibility to lethal infections correlated with peak interferon-γ production in the lungs and decreased iav load and production of il-10, which indicates that transient iav-induced modulation of host immunity has a role in susceptibility to n. meningitidis co-infection. the only viral-bacterial co-infection model for om is the chinchilla.
before 1980, most om studies in the chinchilla used inoculation of pathogens directly into the middle ear. although this induces disease in almost all of the animals inoculated and is therefore extremely useful for studies of therapeutics and surgical intervention strategies, it bypasses all of the early steps in the development of the pathogenesis of the disease, including colonization of the nasopharynx, ascension of the eustachian tube and initiation of infection in the middle ear. giebink and co-workers 93 developed a clinically relevant model in which chinchillas were challenged intranasally with both s. pneumoniae and iav. this study showed that 4% of the chinchillas that were infected with iav alone, and 21% of those inoculated with s. pneumoniae alone, developed om, but of the animals that were co-infected with both microorganisms, ~67% developed om (fig. 2) . this model has been useful in defining the molecular mechanisms of iav predisposition to pneumococcal om [94] [95] [96] and to investigate the role of pneumococcal virulence determinants in om 81 . to study the pathogenesis of om mediated by nthi, a chinchilla model that uses a co-challenge method was developed. in this model, adenovirus infection can predispose the chinchilla to nthi invasion of the middle ear 97 (fig. 2) ; however, iav infection has no effect. this adenovirus-nthi co-infection model has been used to study the mechanisms of adenovirus predisposition to nthi-induced om 98, 99 , to identify new nthi virulence determinants 100 and to assay the relative efficacies of different nthi-derived vaccine candidates for om [101] [102] [103] . a cotton rat model of rsv and nthi co-infection has also been developed to study co-infections of the respiratory tract 104 . in the cotton rat, colonization of the respiratory tract with nthi increased to a maximum level four days after infection with rsv and colonization was increased compared with rats that had not been infected with rsv.
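the co-challenge percentages reported for the giebink chinchilla model (om in 4% of animals given iav alone, 21% given s. pneumoniae alone, but ~67% when co-infected) can be contrasted with a simple independence baseline. the following is a minimal illustrative sketch, not a calculation from the original study; the independence model is our own assumption for comparison:

```python
# om rates from the giebink intranasal co-challenge study cited above
p_iav = 0.04        # om rate with iav alone
p_pneumo = 0.21     # om rate with s. pneumoniae alone
p_observed = 0.67   # om rate with co-infection (~67%)

# expected rate if the two agents acted independently:
# 1 - P(neither agent alone causes om)
p_independent = 1 - (1 - p_iav) * (1 - p_pneumo)

print(f"expected if independent: {p_independent:.2%}")  # prints 24.16%
print(f"observed in co-infection: {p_observed:.2%}")
print(f"ratio observed/expected: {p_observed / p_independent:.1f}x")
```

the observed 67% is roughly 2.8 times the ~24% expected under independence, which is what the text means by a synergistic, rather than additive, interaction between the virus and the bacterium.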
nthi colonization of the respiratory tract was increased by rsv co-infection and, although the mechanisms underlying this relationship are not understood, this model might be useful to determine the mechanisms of rsv predisposition to bacterial om. actinobacillus actinomycetemcomitans, p. gingivalis and bacteroides forsythus are the main periodontal pathogens. periodontal disease covers a range of clinical symptoms and there are multiple forms of periodontitis in children and adults. in individuals under 20 years of age, a. actinomycetemcomitans is the main bacterial pathogen, whereas in adults aged 35 years or older, periodontitis has been linked to p. gingivalis and b. forsythus. spirochaetes, especially treponema denticola, have been implicated in periodontitis despite the fact that most oral spirochaetes have not been successfully cultured. recent studies using molecular phylogenetic techniques have implicated new bacterial species and phylotypes in periodontitis. teeth have a non-shedding surface and are located in a warm, moist environment, so are a particularly suitable niche for biofilm formation by the oral microbial flora. in periodontitis, coaggregation -a process in which genetically distinct bacteria become interconnected by specific adhesins -is central to the formation of complex multispecies biofilms 120 (fig. 3) . in dental plaque, primary colonizers such as streptococcus gordonii, and other oral streptococci that express adhesins, provide a film on which other bacterial colonizers assemble the biofilm. it was thought that the abundance of plaque that formed was responsible for the induction of periodontitis but, at present, the favoured hypothesis is that the quality of the plaque formed, in terms of microbial constituents, is the main predictor for periodontal disease.
in most individuals, periodontal pathogens trigger an inflammatory response that effectively prevents microbial colonization and invasion of adjacent gingival tissues. however, individuals that have specific il-1 polymorphisms that result in increased levels of il-1 expression are predisposed to periodontal disease. using this criterion, a mouse model of polymicrobial-induced osteoclastogenesis, bacterial penetration, leukocyte recruitment and soft-tissue necrosis has been developed to clarify the role of cytokines in periodontal disease. in this model, the dental pulp of the first mandibular molars is exposed by surgically clipping the mesial cusps and then a mixture of putative oral pathogens is inoculated into the dental pulp (fig. 4a) . by monitoring the size of osseous lesions, tissue necrosis, osteoclastogenesis, osteoclastic activity, inflammatory cell recruitment and bacterial penetration into tissue, the pathogenic mechanisms of periodontal disease can be investigated 135 . il-1 or tnf receptor signalling does not seem to be required for bacteria-induced osteoclastogenesis and bone loss in this model, but does have a crucial role in protecting the host against anaerobic co-infections. a rat model of periodontitis was developed to test adherent (rough) and non-adherent (smooth) variants of a. actinomycetemcomitans for virulence, as well as to assess phenotypic reversion in vivo 136 . in this model, the normal flora of the oral cavity of sprague-dawley rats is reduced by antibiotic treatment, after which rats are inoculated with a. actinomycetemcomitans by either normal ingestion of food layered with bacterial cultures, oral swabbing or gastric lavage (fig. 4b) . when clinical isolates of a. actinomycetemcomitans were compared with laboratory-adapted variants, fine et al. 136 found that the clinical strains were more efficient at colonization and persisted longer in the rat oral cavity than laboratory strains.
rough variants were more efficient colonizers of the rat oral cavity than smooth variants, regardless of the method of inoculation, although feeding was the preferred method owing to the similarity with human disease. importantly, rats that were orally infected with a. actinomycetemcomitans by feeding developed immunoglobulin g (igg) antibodies to the bacteria and had bone loss that was typical of periodontitis. this model has not been used to study the process of bacterial co-infection in periodontitis, but has been used to identify a gene locus that is important in virulence and which mediates tight adherence by a. actinomycetemcomitans 137 . a primate model (macaca fascicularis) of periodontal disease uses silk ligatures tied around the posterior teeth to induce plaque accumulation and the initiation of periodontitis 138 . so far, this model has only been used for single pathogen studies, but is considered to be a relevant animal model of periodontal disease owing to the similarity of clinical and histological features with those of periodontal disease of humans, and because, in this model, periodontal destruction is clearly triggered by bacterial infection 139 . new bacterial species or phylotypes -including members of the uncultivated bacterial division tm7 -have been identified in periodontitis, dental caries and halitosis [121] [122] [123] [124] [125] [126] [127] . in many of these studies, not only were new species and phylotypes identified, but bacteria that are known to be oral pathogens were found to be numerically minor, which was expected because ~50% of oral flora have not been cultivated 123 . in localized juvenile periodontitis (ljp), the leukotoxin of a. actinomycetemcomitans, like that of m. haemolytica, is the best-studied virulence factor. this leukotoxin selectively kills pmns and macrophages in vitro, and pmns and macrophages are important components of the host defence in vivo. expression of this leukotoxin is variable among a.
actinomycetemcomitans isolates and the leukotoxin-expression phenotype correlates with differences in the promoter region of the leukotoxin gene operon 128 . a subset of leukotoxin-overproducing strains is more virulent and is associated with ljp in humans. neutrophil abnormalities seem to be an important predisposing condition for periodontal disease. in addition, loss of tooth attachment and bone resorption, which are important events in periodontal disease, occur together with increased il-1 and tumour-necrosis factor (tnf) activities. the production of il-1 and tnf (both of which are pro-inflammatory cytokines) has been correlated with the spread of inflammatory cells to connective tissues, the loss of connective tissue attachment, osteoclast formation and the loss of alveolar bone. an overzealous host response to periodontal pathogens, resulting in excessive production of il-1 and tnf, is hypothesized to be responsible for much of the damage that occurs in periodontal disease 129 . bacterial vaginosis. the mucosal environment of the vagina is influenced by developmental and hormonal changes 131 . the most common bacterial constituents of the vaginal microflora are lactobacilli, including lactobacillus crispatus and lactobacillus jensenii 131 . when these hydrogen peroxide (h2o2)-producing lactobacilli are outcompeted by anaerobic and facultatively anaerobic members of the vaginal flora, bacterial vaginosis (bv) develops with a concomitant rise in vaginal ph, which further suppresses the resident lactobacilli. bv is common, occurring in 5-51% of the global female population 130 , and the role of lactobacilli in the maintenance of vaginal homeostasis has been well studied. women with stable bacterial colonization have a reduced risk of developing bv 132 .
normal vaginal flora has a role in defence against the acquisition of other pathogenic microorganisms, including those that are responsible for sexually transmissible diseases (stds), and bv is a strong predictor of std acquisition 133 . compared with subjects with normal vaginal flora, subjects that have bv are more likely to test positive for neisseria gonorrhoeae and chlamydia trachomatis. recently, bv has also been found to be associated with an increased risk of hsv-2 infection 134 . candida species. co-infections with multiple candida strains and substrains are also found. regardless of the co-pathogens, mycotic co-infections of the oral and vaginal cavities, on indwelling prosthetic devices, or systemic infection of the blood can present significant therapeutic challenges. difficulty in treating some of these infections is partly attributed to the formation of biofilms by candida spp. 152 . biofilm formation on devices such as prosthetic heart valves and catheters has been studied in vitro 95 . when cultured on a variety of catheter materials, candida spp. form biofilms comprising a matrix of microcolonies of both the yeast and the filamentous hyphal forms. in studies of mixed microbial populations, candida spp. form biofilms with several bacterial species, including staphylococcus epidermidis and oral streptococcal species. the receptor for candida albicans co-aggregation with s. gordonii is a complex cell surface polysaccharide that is expressed on the surface of the bacterium. the interaction between yeast cells and oral streptococci or other bacteria has important implications for the mechanisms of yeast infections of the oral cavity, in addition to promoting biofilm formation on a variety of surfaces. in the oral cavity, candida-bacterial interactions are responsible for denture stomatitis, angular cheilitis and gingivitis, and also have a role in periodontitis 153 .
although the pathogenic role of enterococci in peritonitis is not understood, many putative virulence factors have been identified using animal models. available animal models include systemic infection in mice and compartmentalized infection in rats, and the bacterial virulence factors that have been identified using each model differ 140 . this indicates that both host and pathogen factors contribute to peritonitis and, perhaps, that the animal models are quite different. nevertheless, these models have identified a role for cytokines in septic shock, a protective role for il-10 against lethal shock 141 , a role for stat4 in the mortality seen in bacterial co-infection sepsis 142 and helped to define the role of the classical pathway of complement activation in defence against polymicrobial peritonitis 143 . animal models of bacterial co-infection peritonitis and/or sepsis can involve any of the following methods for induction of infection: peritoneal implantation of microbe-filled gelatin capsules 140, 141 ; intraperitoneal injection of faecal suspensions 17 or caecal ligation and puncture 112, [142] [143] [144] [145] [146] [147] [148] [149] [150] [151] .
fig. 4 legend. a | bacterial suspensions that are being tested for the ability to cause periodontal disease are injected into the dental pulp and the mouse is monitored for signs of periodontal disease by methods that include examination of osseous lesions, tissue necrosis, inflammatory cell recruitment, bacterial tissue penetration and osteoclastogenesis. b | in the rat model, bacteria are grown using standard laboratory procedures and washed 3 times with phosphate-buffered saline (pbs) supplemented with 3% sucrose. the rats are pretreated with antibiotics and the mouth is swabbed with chlorhexidine to deplete the oral flora. the bacterial suspension is mixed into the rat's food so that this animal model replicates, as far as possible, the natural route of infection for periodontal disease. after daily inoculation the rats are assessed for bacterial colonization and bone loss. using this model, different strains of bacteria and the contribution of different virulence loci can be tested. cfu, colony-forming units.
candida and mixed infections. as defined by soll and his colleagues 6 , infections with candida spp. can themselves be polymicrobial, involving multiple strains, substrains and switch phenotypes. a murine host has been used as a model of systemic candidiasis 160 and there is also a murine model for c. glabrata-induced vaginitis 161 . in the c. glabrata model, the increased susceptibility of non-obese diabetic mice to c. glabrata-induced vaginitis compared with their non-diabetic counterparts indicates a link between susceptibility to diabetes and infection with c. glabrata. in addition to studies of candida genetics and pathogenicity 162 , this model is useful for the evaluation of the relative efficacy of antimycotic agents and probiotics for the prevention of vaginitis. an animal model of haematogenously disseminated candidiasis has recently been developed 163 that can be used to investigate the role of phenotype switching in candidiasis. in this model (fig. 5) , mice were injected with engineered c. albicans strains in which the transition between yeast and filamentous forms is under the control of a doxycycline-regulated promoter. mice that were infected with strains that switched to the filamentous form died, whereas those infected with strains that could not switch from the yeast to the filamentous form survived, despite the fact that the fungal burdens in both groups were nearly identical. these data indicate that the filamentous form is important for mortality but that the yeast form of c. albicans is important for dissemination to deeper tissues. parasite-parasite co-infections. several human diseases have mixed parasitic aetiologies, including co-infections with plasmodium spp. and nematodes.
in some studies, co-infection with a helminth seemed to confer protection against severe complications of malaria 164 , but this is not always the case. when infected controls with a low helminth burden were compared with those with circulating plasmodium schizonts, co-infection with ascaris lumbricoides was found to be associated with protection from cerebral malaria. in addition, a later study showed a significant association between ascaris infection and the risk of co-infection with plasmodium falciparum and plasmodium vivax, indicating that pre-existing ascaris infection might increase host tolerance to coexisting plasmodium spp. 165 . subsequently, helminth-infected patients were found to be more likely to develop falciparum malaria compared with those that were not co-infected 166 . moreover, the risk of developing falciparum malaria increased with the number of co-infecting helminth species. collectively, these findings indicate that a helminth-mediated helper t cell 2 (th2) shift (an immune response that is biased towards that which is characteristic of a th2-mediated response) might have a complex impact on malaria co-infection -decreasing antisporozoite immunity but inducing a protective outcome against severe complications of malaria. although the underlying mechanism is less clear, mwatha et al. 169 showed that exposure of schistosoma mansoni-infected children to p. falciparum had a significant influence on the severity of hepatosplenomegaly (enlargement of the liver and spleen) that was observed in co-infected children. a new category of polymicrobial diseases has been proposed for candida spp. in which the infection is due to phenotypic heterogeneity 154 . in addition to the hypha-bud transition, c. albicans has a reversible, high-frequency phenotype switch that can be identified by differences in colony morphology. c. albicans cells of two phenotypic phases have different virulence characteristics.
the ability of this human pathogen to rapidly switch between phenotypes could be a higher-order pathogenic trait. support for this hypothesis comes from studies in which strains that cause deep tissue mycoses were shown to switch at higher frequencies than those that cause superficial infections. furthermore, pathogenic c. albicans strains that were isolated from the oral cavity switch at higher rates than commensal strains that were isolated from the same site. a clinically relevant example of the role of both phenotype and mating-type switching in disease was characterized by brockert et al. 155, who investigated oral cavity and vaginal isolates of c. glabrata in three patients with vaginitis. the results of this study showed that switching occurs at sites of infection, that different switch phenotypes of the same strain can dominate in different anatomical locations in the same host and that mating-type switching occurs in vivo. co-infections with pseudomonas aeruginosa and c. albicans can cause disease in the lower respiratory tract. in cf patients, clinical specimens that also harbour c. albicans contain nine times the amount of p. aeruginosa found in specimens from patients that do not harbour c. albicans 156. moreover, sputum samples of 6-70% of cf patients contain c. albicans in addition to p. aeruginosa. hogan and kolter 157 showed that p. aeruginosa forms a dense biofilm on c. albicans filaments in vitro and, in doing so, kills the fungus. p. aeruginosa fails to bind to, or kill, the yeast form of c. albicans. it is unclear whether a similar relationship between these two pathogens operates in vivo but, as several p. aeruginosa virulence factors that are important in human disease are also involved in killing the fungal filaments, this co-culture system could prove useful for the study of the pathogenesis of p. aeruginosa-induced disease. owing to the ability of candida spp.
to switch between bud and hypha (or hypha-like) forms as well as to switch phenotype, all animal models of candida infection are likely to represent one or another of the multiple polymicrobial states that have been proposed for this microorganism. a rat model of oral colonization has been used to compare the relative pathogenicity of different candida strains as well as to determine the effect of chemotherapeutic immunosuppression on the ability of candida spp. to switch from a commensal to an invasive phenotype 158. a rat model of oral candidiasis has also been developed and used to assay isogenic derivatives of a virulent c. albicans strain for the biological consequences of these genetic manipulations 159. co-infection with schistosoma species and plasmodium species has been modelled in field voles and mice since 1956 (ref. 172), with conflicting observations concerning the ability of one parasite to suppress the capacity of the other to infect the host 173,174. the results obtained seem to depend on the plasmodium species used as well as the immune status of the host. s. mansoni is, however, a potent inducer of a th2-dominant response, not only to itself but also to other bystander antigens that are present in a host, so it does have an influence on the clinical outcome in these co-infections 175,176. synergistic interactions between specific protozoans and helminths are often ascribed to the immunosuppression that is characteristic of protozoan infections 177 and that is observed in the mouse, which is the main model for these infections. a mouse model has been used to model the arthritis and carditis that can occur in co-infections with b. microti and b. burgdorferi 19. co-infection resulted in a significant increase in symptoms of arthritis. this increase was correlated with a reduction in concentrations of the cytokines il-10 and il-13. a mouse model for tick-borne lyme arthritis mediated by co-infection with b.
burgdorferi and a causative agent of human granulocytic ehrlichiosis (hge) has been developed 178,179 (fig. 6). co-infection results in increased titres of both pathogens and more severe arthritis than does infection with b. burgdorferi alone. co-infection resulted in reduced concentrations of il-12, ifn-γ and tnf-α and increased concentrations of il-6. ifn-γ expression in macrophages was suppressed, which might indicate a reduction in phagocytic activity in co-infection. these models will allow us to define the modulation of host immune responsiveness that occurs in those individuals that are simultaneously or sequentially infected with multiple tick-borne pathogens 179. co-infections can arise as a result of the virus-induced immunosuppression that is characteristic of a subset of human viral pathogens, the best characterized of which is hiv. schistosomiasis is a chronic helminth infection that is caused by s. mansoni. in hcv and s. mansoni co-infection, there is a higher incidence of viral persistence and accelerated damage to the liver than when the patient is infected with either infectious agent alone. in a recent study, stimulation of cd4+ t cells with hcv antigens produced a type 1 cytokine profile in patients infected with hcv, whereas in patients that were co-infected with hcv and s. mansoni, a type 2 cytokine predominance was evident despite the fact that t cells that were recovered from both patient populations responded in the same manner to stimulation with schistosomal antigens 168. the helminth-induced inability to generate an hcv-specific cd4+/th1 t-cell response has been shown to have a role in the persistence and severity of hcv infection, which indicates that the induction of a strong cellular immune response through new therapeutic approaches might limit subsequent liver damage in those individuals with chronic hcv infection 169. parasite-bacteria co-infections.
one example of co-infection with a parasite and a bacterium in humans is that of borrelia burgdorferi (the causative agent of lyme disease) and the intra-erythrocytic parasite babesia microti. both of these pathogens are transmitted by the tick vector ixodes scapularis. co-infection can occur by a bite from a single tick carrying multiple pathogens, or from multiple tick bites. the first cases of lyme borreliosis and babesiosis co-infection were reported in the mid-1980s, with parasite-bacteria co-infection rates of up to 33% among those with confirmed tick-borne infection in certain populations. although ticks can also harbour the human pathogen anaplasma phagocytophilum, lyme borreliosis and babesiosis co-infection accounts for ~80% of polymicrobial disease in the eastern united states. consistent with the theme for other co-infections involving a parasite, patients that harbour both of these pathogens had more severe and longer-lasting symptoms than those with lyme borreliosis alone 170. necrotizing ulcerative gingivitis (nug) is characterized by a surface biofilm of mixed microbial flora overlying a subsurface flora comprising dense aggregates of spirochaetes. in contrast to nug, high levels of yeast and herpes-like viruses were observed using transmission electron microscopy examination of tissues recovered from patients with nup. herpes-like particles were observed in 56.5% of biopsies obtained from hiv-infected patients with nup. these findings correlate well with those of contreras and co-workers 183,184, in which co-infection with herpesvirus was associated with high levels of periodontopathic bacteria. the role of viruses in the pathogenesis of nup or periodontitis is not known but, in addition to inducing immunosuppression, it has been suggested that viruses might promote the overgrowth of bacterial pathogens and/or induce the release of tissue-destroying cytokines by host cells 185.
in addition to systemic diseases, localized infections with candida spp., such as thrush in the oral cavity, are common co-infections in hiv-infected individuals 181. the commensal oral flora acquires an invasive phenotype in the hiv-infected host, and c. albicans is indicative of a defect in host t-cell immunity in hiv infection 182. oropharyngeal candidiasis develops in ~20-50% of hiv-infected patients and often precedes the development of a more invasive candida infection, oesophageal candidiasis. the progressive immunosuppression that is characteristic of hiv infection provides a mechanism for the development of oesophageal candidiasis, which is a reportable aids-defining opportunistic illness. another disease of the oral cavity in hiv-seropositive patients is necrotizing ulcerative periodontitis (nup), which is a disease that is characterized by ulcerated gingival papillae 122.
figure 6 | animal model for lyme disease and human granulocytic ehrlichiosis (hge) co-infection. these diseases share a tick vector, ixodes scapularis, and a mouse model has been developed to analyse whether co-infection with ehrlichia sp. and borrelia burgdorferi (the causative agents of hge and lyme disease, respectively) leads to increased severity of spirochaete-induced lyme arthritis. mice are infected intradermally with either spirochaetes (b. burgdorferi cultured in vitro) or hge (blood culture from a scid mouse, see inset panel) 178. arthritis and the presence of the two pathogens can then be determined through histopathology, pcr to detect bacterial dna and by assessing immune responses. ticks were allowed to feed on all groups of mice to assess transmission of the pathogens. after feeding, pcr (hge) and immunofluorescence (b. burgdorferi) were used for pathogen detection.
the ability of measles virus to suppress both innate and adaptive immune responses is thought to be responsible for the increased susceptibility to bacterial co-infection. the future of polymicrobial disease research. molecular methods are now being used together with conventional culture techniques to determine the identity of the full complement of microorganisms that are involved in co-infections and to determine the interactions between these microorganisms. as a result, additional diseases of polymicrobial origin will be identified. this will necessitate the development of new animal models and new in vitro methods for the study of polymicrobial diseases. uncovering the molecular mechanisms that are involved in the pathogenesis of complex diseases might show that changes in lifestyle, such as smoking cessation or dietary changes, could prevent co-infections. developing methods to disrupt biofilms is one target for researchers. new antimicrobials and vaccine candidates for both the predisposing and the co-infecting microorganisms will be sought. therapeutic approaches for polymicrobial diseases might include the use of probiotics for the treatment or prevention of vaginal infections, gastroenteritis, inflammatory bowel disease, utis and periodontitis. moreover, advances in nanotechnology and biomedical engineering will allow the development of new ways to deliver these therapeutic or preventative agents in a disease- or site-specific manner, such as the design and use of 'intelligent implants' 193. these indwelling devices might be embedded with sensors to detect the biofilm-forming microorganisms and signal the release of antimicrobial agents stored in an internal reservoir. as the organizers of the first satellite conference on diseases of mixed microbial aetiology (see the online links box) stated, polymicrobial diseases are 'a concept whose time has come' 1.
cells that are infected with measles virus cannot present antigen to t cells and have a diminished capacity to secrete ig or to proliferate. susceptibility to bacterial co-infection is likely owing to these underlying immune defects, which result in the hallmark of measles virus infection -inhibition of the proliferation of cd4+ and cd8+ t cells 186-189. a primate model of hiv-induced immunosuppression that develops cutaneous leishmaniasis has been established in rhesus macaques. in this model, macaques are chronically infected with siv and then co-infected with leishmania major metacyclic promastigotes by intradermal injection. lesion size, parasite load and siv viraemia are measured weekly. this model has been used to assay both the synergistic relationship between these two pathogens and the responsiveness to, and relative protective efficacy of, cpg oligodeoxynucleotides delivered to co- and mono-infected macaques 190. recently, a rhesus monkey model for siv predisposition to mycobacterium leprae co-infection has been developed, which showed that co-infection increases the susceptibility to leprosy regardless of the timing between the two infections 191. as mentioned earlier, measles virus-induced immunosuppression often leads to bacterial co-infection. to understand the mechanisms of co-infection, a murine model of combined measles virus and listeria monocytogenes infection was developed 192. in this model system, transgenic mice expressing the human measles virus receptor cd46 are co-infected with measles virus and l. monocytogenes, or are challenged with the bacterial pathogen alone. mice co-infected with measles virus were more susceptible to infection with l. monocytogenes and this susceptibility corresponded with a reduction in the macrophage and pmn populations in the spleen, as well as a reduction in ifn-γ production by cd4+ t cells.
a reduction in cd11b+ macrophages and ifn-γ-producing t cells was found to be due to reduced proliferative expansion and not due to either increased apoptosis or altered distribution of these cells between the spleen, blood or lymphatics. introduces and provides a concise overview of polymicrobial diseases the role of microbial interactions in infectious disease mechanisms of bacterial superinfections in viral pneumonias respiratory viral infection predisposing for bacterial disease: a concise review molecular pathogenesis of pneumococcal pneumonia polymicrobial diseases polymicrobial diseases regulation of gene expression by cell-to-cell communication: acylhomoserine lactone quorum sensing bacterial biofilms: an emerging link to disease pathogenesis endotoxemia and polymicrobial septic peritonitis resistance to acute septic peritonitis in poly(adp-ribose) polymerase-1-deficient mice interaction between the innate and adaptive immune systems is required to survive sepsis and control inflammation after injury cd40 contributes to lethality in acute sepsis: in vivo role for cd40 in innate immunity role of the classical pathway of complement activation in experimentally induced polymicrobial peritonitis mice lacking monocyte chemoattractant protein 1 have enhanced susceptibility to an interstitial polymicrobial infection due to impaired monocyte recruitment provided the background for the development of an animal model of parasitic co-infection, as well as demonstrating the potential importance of mouse strain -genetic background -in disease outcome coinfection with borrelia burgdorferi and the agent of human granulocytic ehrlichiosis suppresses il-2 and ifn-γ production and promotes an il-4 response in c3h/hej mice dual infections of feeder pigs with porcine reproductive and respiratory syndrome virus followed by porcine respiratory coronavirus or swine influenza virus: a clinical and virological study pathogenesis and clinical aspects of a respiratory porcine reproductive and respiratory syndrome virus infection dual infections of prrsv/influenza or prrsv/actinobacillus
pleuropneumoniae in the respiratory tract experimental dual infection of specific pathogen-free pigs with porcine reproductive and respiratory syndrome virus and pseudorabies virus induction of dual infections in newborn and three-week-old pigs by use of two plaque size variants of porcine reproductive and respiratory syndrome virus experimental reproduction of severe wasting disease by co-infection of pigs with porcine circovirus and porcine parvovirus pathogenesis of postweaning multisystemic wasting syndrome reproduced by co-infection with korean isolates of porcine circovirus 2 and porcine parvovirus replication status and histological features of patients with triple (b, c, d) and dual (b, c) hepatic infections polymicrobial diseases hepatitis c virus (hcv) and human immunodeficiency virus type 1 (hiv-1) infections in alcoholics hiv/hcv co-infection: putting the pieces of the puzzle together polymicrobial diseases kaposi's sarcoma-associated herpesvirus (kshv/hhv8): key aspects of epidemiology and pathogenesis kaposi's sarcoma-associated herpesvirus (kshv)/human herpesvirus 8 (hhv-8) as a tumour virus hiv and herpes co-infection, an unfortunate partnership epidemiology of herpes and hiv co-infection the interaction between herpes simplex virus and human immunodeficiency virus the role of sexually transmitted diseases in hiv transmission g protein-coupled receptors in hiv and siv entry: new perspectives on lentivirus-host interactions and on the utility of animal models interference between cold-adapted (ca) influenza a and b vaccine reassortants or between ca reassortants and wild-type strains in eggs and mice interference by a non-defective variant of influenza a virus is due to enhanced rna synthesis and assembly interference following dual inoculation with influenza a (h3n2) and (h1n1) viruses in ferrets and volunteers post-entry restriction of retroviral infections the helper virus envelope glycoprotein affects the disease specificity of a recombinant 
murine leukemia virus carrying a v-myc oncogene a model for mixed virus disease: coinfection with moloney murine leukemia virus potentiates runting induced by polyomavirus (a2 strain) in balb/c and nih swiss mice evidence for dual infection of rabbits with the human retroviruses htlv-i and hiv-1 dual infection of rabbits with human t cell lymphotropic virus types i and ii infection of macaque monkeys with simian immunodeficiency virus from african green monkeys: virulence and activation of latent infection identification of a window period for susceptibility to dual infection with two distinct human immunodeficiency virus type 2 isolates in a macaca nemestrina (pig-tailed macaque) model one of the few examples of an attempt to model a human viral co-infection in which viral host restriction presents significant challenges extensive diversification of human immunodeficiency virus type 1 subtype b strains during dual infection of a chimpanzee that progressed to aids interference between non-a, non-b and hepatitis b virus infection in chimpanzees hepatitis δ-virus cdna sequence from an acutely hbv-infected chimpanzee: sequence conservation in experimental animals molecular genetic analysis of virulence in mannheimia (pasteurella) haemolytica coinfection with bhv-1 modulates cell adhesion and invasion by p. 
multocida and mannheimia (pasteurella) haemolytica smoke and viral infection cause cilia loss detectable by bronchoalveolar lavage cytology and dynein elisa ultrastructural features of lesions in bronchiolar epithelium in induced respiratory syncytial virus pneumonia of calves experimental infection of lambs with bovine respiratory syncytial virus and pasteurella haemolytica: immunofluorescent and electron microscopic studies pathological study of experimentally induced bovine respiratory syncytial viral infection in lambs role of α/β interferons in the attenuation and immunogenicity of recombinant bovine respiratory syncytial viruses lacking ns proteins nonstructural proteins ns1 and ns2 of bovine respiratory syncytial virus block activation of interferon regulatory factor 3 bovine viral diarrhea virus isolated from fetal calf serum enhances pathogenicity of attenuated transmissible gastroenteritis virus in neonatal pigs effects of intranasal inoculation with bordetella bronchiseptica, porcine reproductive and respiratory syndrome virus, or a combination of both organisms on subsequent infection with pasteurella multocida in pigs experimental reproduction of severe disease in cd/cd pigs concurrently infected with type 2 porcine circovirus and porcine reproductive and respiratory syndrome virus association of porcine circovirus 2 with porcine respiratory disease complex in utero infection by porcine reproductive and respiratory syndrome virus is sufficient to increase susceptibility of piglets to challenge by streptococcus suis type ii effects of intranasal inoculation of porcine reproductive and respiratory syndrome virus, bordetella bronchiseptica, or a combination of both organisms in pigs differential production of proinflammatory cytokines: in vitro prrsv and mycoplasma hyopneumoniae co-infection model dual infections of prrsv/influenza or prrsv/actinobacillus pleuropneumoniae in the respiratory tract high mortality and growth depression experimentally 
produced in young turkeys by dual infection with enteropathogenic escherichia coli and turkey coronavirus experimental infection of turkeys with avian pneumovirus and either newcastle disease virus or escherichia coli invasive group a streptococcal disease in children and association with varicella-zoster virus infection. ontario group a streptococcal study group risk factors for invasive group a streptococcal infections in children with varicella: a casecontrol study invasive group a streptococcal infections in children with varicella in southern california presence of specific viruses in the middle ear fluids and respiratory secretions of young children with acute otitis media adenovirus infection enhances in vitro adherence of streptococcus pneumoniae effect of adenovirus type 1 and influenza a virus on streptococcus pneumoniae nasopharyngeal colonization and otitis media in the chinchilla reviews the evidence in support of the crucial role of a viral virulence factor in predisposing both the upper and lower respiratory tract to bacterial secondary infections adenovirus serotype 1 does not act synergistically with moraxella (branhamella) catarrhalis to induce otitis media in the chinchilla comparison of alteration of cell surface carbohydrates of the chinchilla tubotympanum and colonial opacity phenotype of streptococcus pneumoniae during experimental pneumococcal otitis media with or without an antecedent influenza a virus infection effect of experimental influenza a virus infection on isolation of streptococcus pneumoniae and other aerobic bacteria from the oropharynges of allergic and nonallergic adult subjects prevalence of various respiratory viruses in the middle ear during acute otitis media infectious exacerbations of chronic obstructive pulmonary disease associated with respiratory viruses and non-typeable haemophilus influenzae impaired innate host defense causes susceptibility to respiratory virus infections in cystic fibrosis modulation of 
pseudomonas aeruginosa gene expression by host microflora through interspecies communication the herpesvirus-porphyromonas gingivalis-periodontitis axis herpesviral-bacterial interactions in aggressive periodontitis a mouse model of dual infection with influenza virus and streptococcus pneumoniae lethal synergism between influenza virus and streptococcus pneumoniae: characterization of a mouse model and the role of plateletactivating factor receptor role of neuraminidase in lethal synergism between influenza virus and streptococcus pneumoniae a model of meningococcal bacteremia after respiratory superinfection in influenza a virus-infected mice the chinchilla superinfection model developed in this study was the first animal model to demonstrate conclusively the important role of the upper respiratory tract viruses eustachian tube histopathology during experimental influenza a virus infection in the chinchilla different virulence of influenza a virus strains and susceptibility to pneumococcal otitis media in chinchillas polymorphonuclear leukocyte dysfunction during influenza virus infection in chinchillas synergistic effect of adenovirus type 1 and nontypeable haemophilus influenzae in a chinchilla model of experimental otitis media evidence for transudation of specific antibody into the middle ears of parenterally immunized chinchillas after an upper respiratory tract infection with adenovirus kinetics of the ascension of nthi from the nasopharynx to the middle ear coincident with adenovirus-induced compromise in the chinchilla nontypeable haemophilus influenzae gene expression induced in vivo in a chinchilla model of otitis media protection against development of otitis media induced by nontypeable haemophilus influenzae by both active and passive immunization in a chinchilla model of virus-bacterium superinfection passive transfer of antiserum specific for immunogens derived from a nontypeable haemophilus influenzae adhesin and lipoprotein d prevents otitis 
media after heterologous challenge relative immunogenicity and efficacy of two synthetic chimeric peptides of fimbrin as vaccinogens against nasopharyngeal colonization by nontypeable haemophilus influenzae in the chinchilla effect of respiratory syncytial virus on adherence, colonization and immunity of non-typable haemophilus influenzae: implications for otitis media accelerated migration of respiratory dendritic cells to the regional lymph nodes is limited to the early phase of pulmonary infection evaluation of vaccines for atrophic rhinitis -a comparison of three challenge models the pathological effect of the bordetella dermonecrotic toxin in mice immunopathological changes in mice caused by bordetella bronchiseptica and pasteurella multocida the virulence of mixed infection with streptococcus constellatus and fusobacterium nucleatum in a murine orofacial infection model bacteriology of adenoids and tonsils in children with recurrent adenotonsillitis bacterial biofilms: a common cause of persistent infections an excellent review of the role of polymicrobial biofilms in diverse anatomical niches that are involved in persistent and chronic human diseases handbook of animal models of infection bacteriologic findings in patients with chronic sinusitis microbiology of polymicrobial abscesses and implications for therapy synergistic effect of bacteroides, clostridium, fusobacterium, anaerobic cocci, and aerobic bacteria on mortality and induction of subcutaneous abscesses in mice the etiology of urinary tract infection: traditional and emerging pathogens culture-independent identification of pathogenic bacteria and polymicrobial infections in the genitourinary tract of renal transplant recipients involvement of periodontopathic biofilm in vascular diseases periodontal disease: bacterial virulence factors, host response and impact on systemic health bacterial coaggregation: an integral process in the development of multi-species biofilms an overview of the molecular 
mechanisms by which bacteria interact with one another to construct complex multispecies biofilms phylogeny of porphyromonas gingivalis by ribosomal intergenic spacer region analysis association of bacteroides forsythus and a novel bacteroides phylotype with periodontitis diversity of bacterial populations on the tongue dorsa of patients with halitosis and healthy patients prevalence of bacteria of division tm7 in human subgingival plaque and their association with disease single-cell enumeration of an uncultivated tm7 subgroup in the human subgingival crevice genetic relatedness and phenotypic characteristics of treponema associated with human periodontal tissues and ruminant foot disease new bacterial species associated with chronic periodontitis beyond the specific plaque hypothesis: are highly leukotoxic strains of actinobacillus actinomycetemcomitans a paradigm for periodontal pathogenesis? the contribution of interleukin-1 and tumor necrosis factor to periodontal tissue destruction the identification of vaginal lactobacillus species and the demographic and microbiologic characteristics of women colonized by these species hydrogen peroxide-producing lactobacilli and acquisition of vaginal infections bacterial vaginosis is a strong predictor of neisseria gonorrhoeae and chlamydia trachomatis infection association between acquisition of herpes simplex virus type 2 in women and bacterial vaginosis interleukin-1 and tumor necrosis factor receptor signaling is not required for bacteria-induced osteoclastogenesis and bone loss but is essential for protecting the host from a mixed anaerobic infection provided the background for the development of an animal model of the relative bacterial colonization and persistence by single phenotypic variants of a bacterium tight-adherence genes of actinobacillus actinomycetemcomitans are required for virulence in a rat model inflammation and tissue loss caused by periodontal pathogens is reduced by interleukin-1 antagonists 
non-human primates used in studies of periodontal disease pathogenesis: a review of the literature exemplifies the potential for different animal models to provide disparate data and shows that there can be species and model dependency to our ability to define and characterize microbial virulence determinants in polymicrobial diseases origin microbiological and inflammatory effects of murine recombinant interleukin-10 in two models of polymicrobial peritonitis in rats stat4 is required for antibacterial defense but enhances mortality during polymicrobial sepsis comparison of the mortality and inflammatory response of two models of sepsis: lipopolysaccharide vs. cecal ligation and puncture early activation of pulmonary nuclear factor κb and nuclear factor interleukin-6 in polymicrobial sepsis granulocyte colony-stimulating factor and antibiotics in the prophylaxis of a murine model of polymicrobial peritonitis and sepsis antibiotics delay but do not prevent bacteremia and lung injury in murine sepsis mortality in murine peritonitis correlates with increased escherichia coli adherence to the intestinal mucosa polymicrobial sepsis induces organ changes due to granulocyte adhesion in a murine two hit model of trauma modulation of the phosphoinositide 3-kinase pathway alters innate resistance to polymicrobial sepsis the importance of systemic cytokines in the pathogenesis of polymicrobial sepsis and dehydroepiandrosterone treatment in a rodent model the activity of tissue factor pathway inhibitor in experimental models of superantigen-induced shock and polymicrobial intra-abdominal sepsis candida biofilms and their role in infection adherence of candida albicans to a cell surface polysaccharide receptor on streptococcus gordonii reviews the pathogenic potential and phenotypic variability of c. 
albicans and illustrates that the coinfecting microorganisms can be phenotypic variants of a single microbial strain phenotypic switching and mating type switching of candida glabrata at sites of colonization asm conference on polymicrobial diseases abstract 14 pseudomonas-candida interactions: an ecological role for virulence factors the relative pathogenicity of candida krusei and c. albicans in the rat oral mucosa avirulence of candida albicans auxotrophic mutants in a rat model of oropharyngeal candidiasis in vivo pathogenicity of eight medically relevant candida species in an animal model a murine model of candida glabrata vaginitis the protective immune response against vaginal candidiasis: lessons learned from clinical studies and animal models engineered control of cell morphology in vivo reveals distinct roles for yeast and filamentous forms of candida albicans during infection provided the background for the development of a murine model of fungal co-infection that allowed the demonstration of the importance of the ability of candida albicans to switch morphology from a yeast to a filamentous form in pathogenesis ascaris lumbricoides infection is associated with protection from cerebral malaria contemporaneous and successive mixed plasmodium falciparum and plasmodium vivax infections are associated with ascaris lumbricoides: an immunomodulating effect? 
intestinal helminth infections are associated with increased incidence of plasmodium falciparum malaria in thailand specific cellular immune response and cytokine patterns in patients coinfected with hepatitis c virus and schistosoma mansoni kinetics of intrahepatic hepatitis c virus (hcv)-specific cd4 + t cell responses in hcv and schistosoma mansoni coinfection: relation to progression of liver fibrosis associations between anti-schistosoma mansoni and anti-plasmodium falciparum antibody responses and hepatosplenomegaly, in kenyan schoolchildren epidemiology and impact of coinfections acquired from ixodes ticks. vector borne zoonotic dis coinfection in patients with lyme disease: how big a risk? some aspects of concomitant infections of plasmodia and schistosomes. i. the effect of schistosoma mansoni on the course of infection of plasmodium berghei in the field vole (microtus guentheri) infection of mice concurrently with schistosoma mansoni and rodent malarias: contrasting effects of patent s. mansoni infections on plasmodium chabaudi, p. yoelii and p. 
berghei suppression of schistosome granuloma formation by malaria in mice altered immune responses in mice with concomitant schistosoma mansoni and plasmodium chabaudi infections schistosoma mansoni infection cancels the susceptibility to plasmodium chabaudi through induction of type 1 immune responses in a/j mice heterologous synergistic interactions in concurrent experimental infection in the mouse with schistosoma mansoni, echinostoma revolutum, plasmodium yoelii, babesia microti, and trypanosoma brucei coinfection with borrelia burgdorferi and the agent of human granulocytic ehrlichiosis alters murine immune responses, pathogen burden, and severity of lyme arthritis coinfection with borrelia burgdorferi and the agent of human granulocytic ehrlichiosis suppresses il-2 and ifn-γ production and promotes an il-4 response in c3h/hej mice tuberculosis in hiv-infected patients: a comprehensive review a tem/sem study of the microbial plaque overlying the necrotic gingival papillae of hiv-seropositive, necrotizing ulcerative periodontitis molecular epidemiology of candida albicans strains isolated from the oropharynx of hiv-positive patients at successive clinic visits relationship between herpesviruses and adult periodontitis and periodontopathic bacteria herpesviruses in human periodontal disease herpesviruses: a unifying causative factor in periodontitis? modulation of immune system function by measles virus infection: role of soluble factor and direct infection modulation of immune system function by measles virus infection. ii. 
infection of b cells leads to the production of a soluble factor that arrests uninfected b cells in g 0 /g 1 manganese superoxide dismutase induction during measles virus infection suppression of antigen-specific t cell proliferation by measles virus infection: role of a soluble factor in suppression cpg oligodeoxynucleotides protect normal and siv-infected macaques from leishmania infection interactions between mycobacterium leprae and simian immunodeficiency virus (siv) in rhesus monkeys this paper reviews how virus-induced immunosuppression of both innate and adaptive immune responses can provide an underlying mechanism for bacterial co-infection this paper introduces the concept of one approach to treat or prevent polymicrobial diseases by using bioengineering and nanotechnology to specific anatomic sites and crucial time points in the disease course for intervention and/or prevention communication among oral bacteria. microbiol the author would like to thank j. neelans for help with manuscript preparation. discusses our increasing understanding of the link between biofilms and the pathogenesis of human disease, as well as identifying the need for relevant animal models with which to study these infectious states. 10. brogden key: cord-299932-c079r94n authors: he, x.; wang, s.; shi, s.; chu, x.; tang, j.; liu, x.; yan, c.; zhang, j.; ding, g. title: benchmarking deep learning models and automated model design for covid-19 detection with chest ct scans date: 2020-06-09 journal: nan doi: 10.1101/2020.06.08.20125963 sha: doc_id: 299932 cord_uid: c079r94n covid-19 pandemic has spread all over the world for months. as its transmissibility and high pathogenicity seriously threaten people's lives, the accurate and fast detection of the covid-19 infection is crucial. 
although many recent studies have shown that deep learning based solutions can help detect covid-19 from chest ct scans, there is no consistent and systematic comparison and evaluation of these techniques. in this paper, we first build a clean and segmented ct dataset called clean-cc-ccii by fixing the errors and removing noise in a large ct scan dataset, cc-ccii, with three classes: novel coronavirus pneumonia (ncp), common pneumonia (cp), and normal controls (normal). after cleaning, our dataset consists of a total of 340,190 slices of 3,993 scans from 2,698 patients. then we benchmark and compare the performance of a series of state-of-the-art (sota) 3d and 2d convolutional neural networks (cnns). the results show that 3d cnns outperform 2d cnns in general. with extensive hyperparameter tuning, we find that the 3d cnn model densenet3d121 achieves the highest accuracy of 88.63% (f1-score 88.14%, auc 0.940), and another 3d cnn model, resnet3d34, achieves the best auc of 0.959 (accuracy 87.83%, f1-score 86.04%). we further demonstrate that the mixup data augmentation technique can substantially improve the model performance. finally, we design an automated deep learning methodology to generate a lightweight deep learning model mnas3dnet41 that achieves an accuracy of 87.14%, f1-score of 87.25%, and auc of 0.957, which are on par with the best models made by ai experts. automated deep learning design is a promising methodology that can help health-care professionals develop effective deep learning models using their private data sets. our clean-cc-ccii dataset and source code are available at: https://github.com/arthursdays/hkbu_hpml_covid-19. the covid-19 (corona virus disease 2019) pandemic is an ongoing pandemic caused by severe acute respiratory syndrome coronavirus 2 (sars-cov-2) [1] . the sars-cov-2 virus can easily spread among people via small droplets produced by coughing, sneezing, and talking [2] .
even worse, sars-cov-2 can be highly stable in a favourable environment, adhering to different object surfaces for up to several days [3] , which raises the risk of getting infected by touching these contaminated surfaces and then touching one's face. (§ corresponding author at hong kong baptist university, tel.: +852-3411-5998; email: chxw@comp.hkbu.edu.hk. ¶ corresponding author at hangzhou dianzi university; email: jzhang@hdu.edu.cn.) covid-19 is not only easily contagious, but also a serious threat to human lives. covid-19 infected patients usually present with pneumonia-like symptoms (fever, dry cough, dyspnea, etc.) and gastrointestinal symptoms such as diarrhea, followed by a severe acute respiratory infection. in some cases, acute respiratory distress accompanied by severe respiratory complications may even lead to death. according to the covid-19 situation report [4] provided by the world health organization (who), as of the end of may, there were 5,934,936 covid-19 infections and 367,166 deaths globally. the usual incubation period of covid-19 ranges from one to 14 days. many covid-19 patients show no symptoms and do not even know that they have been infected, which can easily delay treatment and lead to a sudden exacerbation of the condition. therefore, a fast and accurate method of diagnosing covid-19 infection is crucial. currently, there are two commonly used methods for covid-19 diagnosis. one is viral testing, which uses real-time reverse transcription-polymerase chain reaction (rrt-pcr) to detect viral rna fragments. the other is making diagnoses based on characteristic imaging features on chest x-rays or computed tomography (ct) scan images. the authors of [5] compared the effectiveness of the two diagnostic methods and concluded that chest ct detects the change from initially negative to positive faster than rrt-pcr.
however, manually analyzing and diagnosing based on ct images relies heavily on professional knowledge, and analyzing the features on the ct images is time-consuming. therefore, many recent studies have tried to use deep learning (dl) methods to assist covid-19 diagnosis with chest x-rays or ct scan images. however, the reported accuracy of the existing dl-based covid-19 detection solutions spans a broad spectrum because they were evaluated on different datasets, making it difficult to achieve a fair comparison. in this paper, we aim to conduct a reproducible comparative study of dl methods for covid-19 detection using chest ct scans. to this end, we first build a clean and segmented ct scan dataset based on a large-scale open-source dataset from cc-ccii (china consortium of chest ct image investigation) [6] . our dataset, named clean-cc-ccii, consists of three classes: novel coronavirus pneumonia (ncp), common pneumonia (cp), and normal controls (normal). in total, there are 340,190 slices of 3,993 scans from 2,698 patients in our dataset, where the numbers of slices of ncp, cp, and normal are 131,517, 135,038, and 73,635, respectively. we split the dataset into the training and test sets according to the patient's id with a ratio of 4:1, the details of which are shown in table ii . notice that our test set size is the largest one (e.g., it is twice that of [6] ), making our evaluation results more conservative than existing ones. our benchmark dataset is made open to the public and can facilitate the fair comparison of new dl models for covid-19 detection. in this paper, we use our dataset to benchmark two types of state-of-the-art (sota) dl models: 1) 3d convolutional neural networks (cnns), including densenet3d121 [17] , r2plus1d [18] , mc3 18 [18] , resnext3d101 [17] , pre-act resnet [17] , and the resnet3d series [17] ; 2) 2d cnns, including densenet121 [19] , densenet201 [19] , resnet50 [20] , resnet101 [20] and resnext101 [21] .
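the patient-level 4:1 train/test split described above can be sketched as follows. this is a minimal pure-python illustration, not the authors' code: the function name `split_by_patient` and the `(patient_id, scan)` representation are assumptions.

```python
import random

def split_by_patient(scans, test_ratio=0.2, seed=0):
    """split (patient_id, scan) pairs so that every patient's scans land
    entirely in either the training set or the test set (4:1 by default)."""
    patients = sorted({pid for pid, _ in scans})
    rng = random.Random(seed)
    rng.shuffle(patients)
    n_test = max(1, round(len(patients) * test_ratio))
    test_ids = set(patients[:n_test])
    train = [(pid, s) for pid, s in scans if pid not in test_ids]
    test = [(pid, s) for pid, s in scans if pid in test_ids]
    return train, test
```

splitting by patient rather than by scan avoids leakage of a patient's anatomy between the training and test sets, which would inflate the measured accuracy.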
we explore three key factors that may affect the detection performance, including model depth, methods of reading slice images, and model architecture. first, regarding the model depth, we compare the performance of the resnet architecture [20] in 3d from 10 to 200 layers, i.e., resnet3d10, resnet3d18, resnet3d34, resnet3d50, resnet3d101, resnet3d152, and resnet3d200. second, in terms of how to read the slice images, we consider two popular approaches: one is to read a slice as an rgb image with three channels; the other is to convert the slice to a greyscale image with only one channel. therefore, the scan images used to train the model will be different because of the different ways of slice reading. third, we exploit multiple dnn architectures, including hand-crafted models and automatically generated models with automl techniques [22] , [23] . we use seven 3d models to analyze the effect of the two types of scan data. besides, we discuss the influence of the number of slices in a ct scan on the model performance. we also evaluate the effectiveness of the mixup data augmentation method by comparing model accuracy before and after applying the mixup method. our major contributions are summarized as follows: 1) we build a clean and segmented ct scan dataset for covid-19 detection using chest ct scans, and benchmark 9 different cnn architectures with more than 20 variants. 2) we find that both 3d and 2d cnns are promising solutions for detecting covid-19 infections. however, the overall performance of 3d cnns is better than that of 2d cnns. besides, the results of the resnet3d series show that the model performance does not scale very well with the model depth. 3) we find that the models can achieve higher auc when the slices are converted to greyscale images. 4) to the best of our knowledge, this is the first paper to explore the relationship between model performance and the number of slices in a ct scan. our result shows that there is no significant correlation between them.
in other words, increasing the number of slices does not necessarily improve the model performance. instead, a model trained on scan data with a small number of slices can also achieve comparable or even better results. 5) we demonstrate that the mixup data augmentation method [24] can effectively improve model accuracy in our study. 6) we develop an automated deep learning methodology to generate a lightweight deep learning model, mnas3dnet41. on our dataset, it achieves an accuracy of 87.14%, f1-score of 87.25%, and auc of 0.957, which are on par with the best results of the highly fine-tuned models made by ai experts. the rest of the paper is organized as follows. section ii describes the related work. in section iii, we describe the strategies used to build our dataset, the comparison study of sota cnn models, and the automated model design methodology. section iv presents and discusses the experimental results. we conclude the paper and introduce future research directions in section v. in recent years, dl techniques have proved to be effective in the diagnosis of diseases with x-ray and ct images [25] . to enable machine learning techniques to be applied to covid-19 detection, an increasing number of publicly available covid-19 datasets have been proposed in the past few months, as shown in table i . these datasets can be classified into two classes: x-ray and ct scan images. machine/deep learning techniques rely heavily on both the quality and quantity of the dataset. the ieee8023 covid-chestxray-dataset [26] is an open dataset of covid-19 cases with chest x-ray and ct images, which allows users to submit other covid-19 data to this dataset. however, this dataset mainly focuses on x-ray images with only a very small number of ct scans. based on this dataset, several dl based techniques have been proposed [7] - [9] to detect covid-19.
covid-ct-dataset [27] is a ct dataset of covid-19, mainly composed of ct images extracted from pdf files of covid-19 papers on medrxiv and biorxiv. thus, it has two main drawbacks. first, many ct images contain marks created by the ct machine or doctors, which may have a high impact on the dl techniques. second, each patient has only one to several ct images instead of a complete 3d scan volume, which makes it difficult for 3d cnns to exploit the depth information of the lung. cc-ccii is another publicly available ct volume dataset, proposed by [6] . it is currently one of the largest ct datasets for covid-19, containing 617,775 slices of ct images from 6,752 scans of 4,154 patients. it has 3 classes of novel coronavirus pneumonia (ncp), common pneumonia (cp), and normal controls (normal). cp includes bacterial pneumonia and viral pneumonia. however, this dataset (version 1.0 released on 23 april 2020) contains some errors (e.g., disorder of ct images in some scans, some scans include ct of the head but not the lung, etc.). covid-19-ct-seg-dataset [28] is a publicly available ct dataset of covid-19. it contains 20 well-labeled scans with annotations of the left lung, right lung and lesions. three experienced radiologists are involved in each annotation: two radiologists do the annotation and one does the verification. most research is conducted on ct images, but many of these studies do not exploit the 3d information of ct images, such as the work by [10] , [13] , [14] , which only proposes dl models with 2d cnns for covid-19 detection. the work most related to ours is [11] , but it only benchmarks ten 2d cnns and compares their performance in classifying 2d ct images on a private dataset with 102 testing images. on the other hand, studies utilizing 3d ct images are relatively rare, mainly due to the earlier lack of 3d ct scan datasets for covid-19.
however, some works have proposed 3d cnns with their private 3d ct datasets (e.g., [16] , [15] ). recently, [6] published a large-scale publicly available 3d ct dataset, based on which they propose 3d cnn methods to segment lesions and detect covid-19. however, in [6] , only two dl models are exploited to evaluate the model performance, using 10% of the dataset as the test set. it is of practical importance to evaluate which types of models are suitable for 3d ct images in detecting covid-19. there are also some other studies conducted on x-ray images. for example, [9] propose three 2d cnns for covid-19 detection. [7] introduce a deep anomaly detection model for fast and reliable screening. [8] investigate the estimation of uncertainty and interpretability by a dropweights-based bayesian cnn on the x-ray images. [12] use both x-ray images and ct images to do segmentation and detection. in recent years, automated machine learning (automl) has created many sota results by automatically searching model architectures and hyper-parameters for specific tasks [22] , [23] , [29] . for example, [30] introduce automl into the medical image processing task. they used five public datasets, messidor, oct images, ham 10000, paediatric images and cxr images, to train models with google cloud automl. their experimental results demonstrate that automl can generate competitive classifiers compared to manually designed dl models. a. dataset: [6] provide an open-source chest ct image dataset for covid-19 diagnosis, namely the china consortium of chest ct image investigation (cc-ccii), which contains a total of 617,775 ct slices of 6,752 ct scans from 4,154 patients. cc-ccii has three classes: novel coronavirus pneumonia (ncp), common pneumonia (cp), and normal controls (normal). cp includes bacterial pneumonia and viral pneumonia. to the best of our knowledge, cc-ccii is the largest covid-19 ct dataset which is publicly available currently.
it would be helpful for accelerating the research on machine learning based methods in covid-19 diagnosis. however, cc-ccii has five main issues (i.e., damaged data, non-unified data type, repeated and noisy slices, disordered slices, and non-segmented slices) that would have high negative impacts on the model performance. in this section, we first describe our methods to address the problems in cc-ccii to generate a better dataset for dl techniques. then we introduce the strategies for scan image construction. after addressing the above problems, we construct a clean cc-ccii dataset named clean-cc-ccii, which is more suitable for dl-based methods in covid-19 diagnosis. the statistics of our dataset are presented in table ii . finally, our clean-cc-ccii dataset consists of 340,190 slices of 3,993 scans from 2,698 patients. the dataset is divided into the training set and the test set according to patients (footnote 2: https://github.com/booz-allen-hamilton/dsb3tutorial), to make sure that the ct scan images from the same patient will appear only in either the training set or the test set. the ratio of the number of scans in the training set to the test set is 4:1. 2) scan image construction: after data pre-processing, we need to construct ct scan images as inputs of dl models for training. as shown in fig. 3 , there are two steps before feeding data into dl models: slice sampling and slice processing.
slice sampling: in our dataset, each ct scan contains a different number of slices, as shown in fig. 2 . the minimum and maximum numbers of slices are 9 and 457, respectively. however, dl models generally require inputs of the same dimensions. to keep the input dimensions the same, we propose two types of slice sampling strategies: random sampling and symmetrical sampling. specifically, the random sampling strategy is applied to the training set, which can be regarded as data augmentation, while the symmetrical sampling strategy is performed on the test set to avoid introducing randomness into the testing results. besides, because the number of slices can be manually set to different values, both sampling strategies support automatically selecting upsampling or downsampling based on the original and target numbers of slices. we will also study the performance impact of the number of slices in section iv-c. notably, the relative order between slices remains the same before and after sampling. the details of our sampling strategies are given in algorithms 1 and 2 of appendix a. slice processing: after slice sampling, each scan is composed of the same number of slices. we then resize all slices to 160×160 and centrally crop them to 128×128. in this way, the final input data sizes for the 3d and 2d models are c × d × 128 × 128 and d × 128 × 128, respectively, where c ∈ {1, 3} is the number of channels of the slice image, and d indicates the configured number of slices. for all scan data in the training set, we apply a 3d random horizontal flip transformation. the scan data in both the training and test sets are normalized by subtracting the mean and dividing by the variance. in this study, we aim to investigate the performance of different types of dl models on detecting covid-19 infection with chest ct scans. therefore, we implement various experiments to evaluate potentially effective methods for covid-19 diagnosis.
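the two sampling strategies above can be sketched as index selectors over the original slice order. this is one plausible reading of the paper's algorithms 1 and 2, not the authors' exact code: both functions preserve slice order and switch automatically between up- and downsampling.

```python
import random

def symmetrical_sample(n_slices, target):
    """deterministic, evenly spaced slice indices (used for the test set,
    so no randomness enters the evaluation)."""
    if target >= n_slices:
        # upsample: repeat indices evenly while preserving order
        return [min(n_slices - 1, i * n_slices // target) for i in range(target)]
    step = n_slices / target
    return [int(step * i + step / 2) for i in range(target)]

def random_sample(n_slices, target, rng=random):
    """sorted random indices (used for the training set, acting as a light
    augmentation); sampling with replacement when upsampling."""
    if target >= n_slices:
        idx = [rng.randrange(n_slices) for _ in range(target)]
    else:
        idx = rng.sample(range(n_slices), target)
    return sorted(idx)
```

sorting the randomly drawn indices keeps the relative slice order intact, matching the note above that order is preserved before and after sampling.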
specifically, we compare the performance among sota dl models, including 3d and 2d models, and explore the relationship between model performance and (a) model depth and (b) how the slice images are read. we also evaluate the effectiveness of the mixup data augmentation method in improving model classification accuracy. our pipeline of using dl models to classify cp, ncp, and normal ct scans is shown in fig. 3 . the first step is to construct ct scan images to feed into the dl models by slice sampling and processing. the sizes of all slices are fixed to 128×128 for the model inputs. the models are trained with the training set and evaluated on the test set. 1) exp 1: comparing different cnn models: in this study, we evaluate 17 cnn classification models shown in table iii , including 3d models and 2d models. for the 3d models, we use densenet3d121 [17] , r2plus1d [18] , mc3 18 [18] , resnext3d101 [17] , preact resnet3d [17] , and the resnet3d series [17] (resnet3d10, resnet3d18, resnet3d34, resnet3d50, resnet3d101, resnet3d152, and resnet3d200). for the 2d models, we use densenet121 [19] , densenet201 [19] , resnet50 [20] , resnet101 [20] and resnext101 [21] . for the 2d models, the input scan data is composed of greyscale slice images. for the 3d models, we evaluate two types of scan data: rgb slice images with three input channels and greyscale slice images. besides, for both 2d and 3d models, the size of slices is fixed to 128 × 128. therefore, the input sizes for 3d and 2d models are c × d × 128 × 128 and d × 128 × 128, respectively, where c is the number of channels of the slice image, which depends on how the slice images are read, and d is the number of slices in a scan image. c = 3 and c = 1 indicate that each slice is read as an rgb or a greyscale image, respectively. the number of input channels in the first convolutional layer of all models is modified accordingly to handle inputs of different sizes.
2) exp 2: comparing different numbers of slices: in [6] , the scan input is fixed to 64 slices. however, in our dataset, the number of slices contained in different ct scans ranges from 9 to 457, and the mean value is 85, as shown in fig. 2 . intuitively, the higher the number of slices, the more information can be extracted by the models, which could result in higher performance. we empirically study the performance impact of the number of slices by setting d to different values. we choose four representative 3d models (mc3 18, densenet3d121, resnet3d101, and resnext3d101) to evaluate the relationship between the model performance and the number of slices. for mc3 18, densenet3d121, and resnet3d101, we evaluate five types of scan images containing 16, 32, 64, 128, and 256 slices, respectively. for resnext3d101, it is too large to fit into the gpu memory when d > 64, so d is chosen from the values not exceeding 64. 3) exp 3: mixup data augmentation: mixup is a generic and straightforward data augmentation strategy, which has been proven to be effective in improving model performance on 2d image classification tasks. therefore, we explore the effectiveness of the mixup method in our 3d ct scan classification task. in essence, mixup trains a dl model on linear combinations of pairs of examples and their labels: x̃ = λx_i + (1 − λ)x_j and ỹ = λy_i + (1 − λ)y_j , where (x_i , y_i ) and (x_j , y_j ) are two feature-target vectors drawn at random from the training set, and the variable λ ∈ [0, 1] obeys a beta distribution, i.e., λ ∼ β(α, α) for α ∈ (0, ∞). by doing so, a new feature-target vector is generated by mixing up two feature-target vectors, which encourages the model to behave linearly in-between training examples.
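the mixup combination above can be written in a few lines. this is a hedged, pure-python sketch on flat feature and one-hot label lists (in practice it would operate on tensors), with λ drawn from beta(α, α) and α = 0.4 as in the experiments:

```python
import random

def mixup(x_i, y_i, x_j, y_j, alpha=0.4, rng=random):
    """mix two feature-target pairs: x = lam*x_i + (1-lam)*x_j, and the
    same convex combination for the labels, with lam ~ beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y, lam
```

because the same λ is applied to features and labels, the generated point lies on the line segment between the two originals, which is exactly the linear in-between behaviour the text describes.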
in our experiments, we also use four representative 3d models (mc3 18, r2plus1d, resnet3d101, densenet121) to evaluate the feasibility of the mixup strategy. we set α = 0.4, as recommended in [24] . the results of all baseline experiments (to be discussed in section iv) show that dl is a powerful tool to assist the detection of covid-19 infection based on ct images, where 3d models generally outperform 2d models. however, as shown in table iii , 3d models have a very large model size and are slow to train. based on the results of table iv and table v , we can see that a larger or a deeper model does not necessarily result in better performance. for example, resnext3d101 is the largest of our evaluated models, but its performance is not the best. therefore, in this section, we aim to design a lightweight 3d model, which is expected to achieve comparable or even better results than the baseline 3d models and is easier to deploy for faster detection. however, manually designing a deep neural network is a time-consuming process that relies heavily on experience and expertise. luckily, a recent technique, namely neural architecture search (nas), is a promising solution for us. nas can be seen as a sub-field of automl [22] , [23] , [29] , which draws much attention from academia and industry as it can design various neural networks automatically. in the following, we first introduce our search space and search strategy, and then describe the implementation details and experimental results. 1) search space: the first step of nas is to build the search space, which defines the design principles of neural architectures. mobilenet [31] and mobilenetv2 [32] are a class of efficient models manually designed for mobile and embedded devices for efficient inference. many nas studies [33] , [34] use the mobilenetv2 structure to design a factorized hierarchical search space, but they mainly focus on 2d image recognition tasks.
in this work, we also exploit mobilenetv2 as the backbone to design the 3d search space. an overview of the final model is shown in fig. 4 , which consists of n different cells. the number of blocks in a cell can be different, represented by [b_1 , ..., b_i , ..., b_n ]. the stride is set to 2 in the first block if the resolutions of the input and output are different, and the stride is 1 in all other blocks. the blocks within the same cell have the same number of input/output channels. besides, the structure of each block is selected from a series of 3d mobile inverted bottleneck convolution operations [32] , represented by k×k mbconv-e, where k is the filter kernel size and e is the expansion ratio of the linear layers. in our method, the search space consists of a series of such 3d mbconv operations. 2) search strategy: after building the search space, we can see that the key idea of the search task is to select the best submodel (in terms of validation accuracy) from the super-model. as summarized in [22] , [29] , there are various search strategies, such as reinforcement learning, evolutionary algorithms, gradient descent-based methods, and random search. in recent studies [35] - [37] , the authors demonstrate that random search is a more competitive method than many others. therefore, we also apply the random search strategy. 3) implementation details: the pipeline of our nas methodology is shown in fig. 5 , which contains two stages for searching 3d models on our clean-cc-ccii dataset: the search stage and the evaluation stage. search stage: in the search stage, we search for 100 epochs. each epoch consists of a number of steps. we sample a new neural architecture every five steps and make sure that every sampled architecture is trained.
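the random-search strategy over a mobilenetv2-style space can be sketched as a sample-and-keep-best loop. the per-cell choices below (block count, kernel size, expansion ratio) are illustrative stand-ins for the paper's actual search space, and `evaluate` would in practice train a sampled model briefly and return its validation accuracy:

```python
import random

def sample_architecture(rng, n_cells=5,
                        kernel_sizes=(3, 5, 7), expansions=(3, 6)):
    """randomly sample one candidate: per cell, a block count and an
    mbconv configuration (kernel size k, expansion ratio e)."""
    return [{"blocks": rng.randint(1, 4),
             "kernel": rng.choice(kernel_sizes),
             "expansion": rng.choice(expansions)}
            for _ in range(n_cells)]

def random_search(evaluate, n_trials=100, seed=0):
    """draw n_trials architectures and keep the one scoring highest
    under the given evaluation function."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(n_trials):
        arch = sample_architecture(rng)
        score = evaluate(arch)
        if score > best_score:
            best, best_score = arch, score
    return best, best_score
```

despite its simplicity, this loop is the competitive baseline the cited studies refer to; all search effort goes into evaluating candidates rather than into a learned controller.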
note that only the training set is used for training and evaluating the sampled models in the search stage. at the end of the search stage, there are 100 neural architectures and their corresponding training accuracies. evaluation stage: after the search stage, we need to select several top-ranked models (in terms of validation accuracy) for the next stage. specifically, according to the training records, we choose those models that achieve better validation accuracy than the previously sampled models. the selected models are first trained with the training set from scratch for 200 epochs, and then evaluated on the test set. implementation details: for both the search and evaluation stages, we use the adam optimizer [38] , with the numbers of output channels of the cells set to [24, 40, 80, 96, 192, 320] . each experiment is conducted on four nvidia tesla v100 gpus. furthermore, to improve the searching efficiency, we fix the height and width of the input scan to 60×60 during the search stage, and restore the size to 128×128 in the evaluation stage. our nas-related code is based on nni (https://github.com/microsoft/nni) and can be found at https://github.com/arthursdays/hkbu_hpml_covid-19. in this section, we present and analyze the results of the different experiments mentioned above. all models are trained using the adam [38] optimizer with an initial learning rate of 0.001. the cosine annealing scheduler [39] is applied to adjust the learning rate. to compare the performance of the cnn models, we use several commonly used evaluation metrics: precision = n_tp/(n_tp + n_fp), sensitivity = n_tp/(n_tp + n_fn), specificity = n_tn/(n_tn + n_fp), f1-score = 2 · precision · sensitivity/(precision + sensitivity), and accuracy = (1/n) Σ_{i=1}^{n} z(p_i , q_i ) over the n test scans. (a) the roc curves of 3d models that are trained with greyscale slices. (b) the roc curves of 3d models that are trained with rgb slices.
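the cosine annealing schedule applied on top of adam (initial learning rate 0.001) can be written out explicitly. this standalone sketch mirrors the usual closed form behind schedulers such as pytorch's CosineAnnealingLR; the function name and the assumption lr_min = 0 are ours:

```python
import math

def cosine_annealing_lr(step, total_steps, lr_max=1e-3, lr_min=0.0):
    """learning rate at `step`, decayed from lr_max to lr_min along a
    half cosine over `total_steps`."""
    return lr_min + 0.5 * (lr_max - lr_min) * (
        1 + math.cos(math.pi * step / total_steps))
```

the rate starts at 0.001, passes through half its initial value at the midpoint, and decays smoothly to zero at the end of training, avoiding the abrupt drops of step schedules.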
(c) the roc curves of 2d models that are trained with greyscale slices. fig. 6 . the roc curves of 3d and 2d models. the overall performance of 3d models is better than that of 2d models. besides, the variance between the performance of the models that are trained with greyscale slices is smaller. here z(p, q) = 1 if p = q, and z(p, q) = 0 otherwise. besides, the area under the receiver operating characteristic (roc) curve (auc) is also applied to evaluate the performance of covid-19 diagnosis. in this study, the positive and negative cases are assigned to ncp and non-ncp (i.e., cp and normal) scans, respectively. specifically, n_tp and n_tn indicate the numbers of correctly classified ncp and non-ncp scans, respectively. n_fp and n_fn indicate the numbers of scans wrongly classified as ncp and non-ncp, respectively. the accuracy is the micro-averaged value over all test data, which is used to evaluate the overall performance. the performance comparison between different cnn models, including 3d and 2d, is shown in table iv , in which the number of slices in the scan data is fixed to 64 and there are two types of inputs that differ in the way of reading slice images. the results in table iv show that both 2d and 3d models can achieve relatively good results on our clean-cc-ccii dataset, which indicates that computer-aided covid-19 diagnosis with state-of-the-art dl techniques would be a promising solution. densenet3d121 is one of the best models among all evaluated models, as it achieves the best accuracy, precision, sensitivity, specificity, and f1-score, and mc3 18 obtains the highest auc score when the slices are read as greyscale images.
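with ncp as the positive class and non-ncp as the negative class, the metrics follow directly from the four counts named in the text. a minimal sketch using the standard definitions (the helper name `binary_metrics` is ours):

```python
def binary_metrics(y_true, y_pred):
    """accuracy/precision/sensitivity/specificity/f1 from binary labels
    (1 = ncp, 0 = non-ncp), built on the counts n_tp, n_tn, n_fp, n_fn."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    acc = (tp + tn) / len(y_true)
    prec = tp / (tp + fp) if tp + fp else 0.0
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    f1 = 2 * prec * sens / (prec + sens) if prec + sens else 0.0
    return {"accuracy": acc, "precision": prec, "sensitivity": sens,
            "specificity": spec, "f1": f1}
```

for the binary case this count-based accuracy coincides with the micro-averaged indicator form used in the text.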
in terms of the accuracy of 3d models, the number of input channels has different impacts on different network architectures. however, regarding the auc metric, almost all 3d models perform better with greyscale slice images than with rgb images. one can see that the roc curves in fig. 6(a) are higher and more closely clustered than those in fig. 6(b), which indicates that the models trained with greyscale slices are more robust. the main reason is that the original ct slices are greyscale images, and duplicating the greyscale images into rgb images introduces much repetitive and redundant information, which instead increases the difficulty of model training. regarding the comparison of 2d and 3d models, we can see that the overall performance of the 3d models is better than that of the 2d models, which is as expected because the convolutional filters in 3d models can better extract the three-dimensional spatial relationships between the slices of the scan data. we also explore the impact of model depth on model performance, as shown in table v, from which one can see that no model has an absolute advantage on all metrics. although no significant correlation can be found between model performance and model depth, the results suggest that a smaller model can obtain similar or even better results than a larger one. fig. 7(a) plots the relationship between model accuracy and the number of slices. one can see that only the accuracy of resnet3d101 increases with the number of slices, while that of the other models does not. however, because the distribution of our dataset is imbalanced, higher accuracy does not necessarily mean better performance. as fig. 7(b) presents, when the number of slices is 64, the auc of resnet3d101 is smaller than in the other cases. besides, fig. 7(b) also shows that increasing the number of slices does not always improve the performance.
instead, the models trained on a smaller number of slices can also achieve comparable or even better results. a possible explanation for this result might be that the original training data can be regarded as a pile of scattered points distributed in a high-dimensional space, and a large number of new data points between the original data points are created by the mixup method. in this way, the original dataset is expanded to some extent, and the data distribution becomes smoother, which regularizes the model training and improves the model performance. we implement two types of nas experiments: one searches for 21-layer networks, taking 3.7 hours, while the other searches for 41-layer networks, taking 5 hours. table vii presents the performance comparison between the baseline 3d models and our searched 3d models, namely mnas3dnet. for a fair comparison, the input scans for all models are composed of 64 greyscale slice images. compared to the baseline 3d models, the sizes of our searched models are much smaller: mnas3dnet21 and mnas3dnet41 are 12.34 and 22.91 mb, respectively. at the same time, both models achieve sota performance. specifically, mnas3dnet41 achieves an accuracy of 87.14%, an f1-score of 87.25%, and an auc of 0.957, which are on par with the best models designed by ai experts. the strong empirical results prove the effectiveness of the random search strategy, and demonstrate that nas is a promising research direction for designing neural networks for detecting covid-19. in this paper, we aim to benchmark dl models and use automl techniques to design dl models for covid-19 detection using chest ct scans.
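the mixup mechanism described above, convex combinations of random pairs of inputs and their labels, can be sketched as follows; the function and toy data are our illustration, not the paper's training code:

```python
import numpy as np

def mixup_batch(x, y, alpha=0.4, rng=None):
    """Mix random pairs of examples: x_mix = lam*x_i + (1-lam)*x_j, same for labels."""
    rng = rng if rng is not None else np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)        # mixing coefficient drawn from Beta(alpha, alpha)
    perm = rng.permutation(len(x))      # random pairing of the batch with itself
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y + (1 - lam) * y[perm]
    return x_mix, y_mix

# two toy "scans" with one-hot labels over the 3 classes (ncp, cp, normal)
x = np.array([[0.0, 0.0], [1.0, 1.0]])
y = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
x_mix, y_mix = mixup_batch(x, y)
```

the mixed points lie between the original ones, which is why the data distribution becomes smoother, as argued in the text.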
our experimental results show that dl models are promising solutions, and that 3d models outperform 2d models. we find that model performance does not necessarily improve with increasing model depth or number of slices; in other words, a smaller model trained on fewer slices can also achieve comparable or even better results. besides, we demonstrate that mixup data augmentation can effectively improve model performance. last but not least, we design an automated deep learning methodology to generate a lightweight deep learning model, which achieves results comparable to models designed by ai experts. we have several directions for future work. first, most of the data in our dataset are from china, so we plan to collect more data from other countries to further improve the accuracy of covid-19 detection. second, we will try to apply semantic segmentation techniques to our dataset, so as to help doctors diagnose more effectively. last, we will try other sota nas methods to explore more types of deep learning models.
naming the coronavirus disease (covid-19) and the virus that causes it
q&a on coronaviruses (covid-19), world health organization
stability of sars-cov-2 in different environmental conditions
coronavirus disease (covid-2019) situation reports
correlation of chest ct and rt-pcr testing in coronavirus disease 2019 (covid-19) in china: a report of 1014 cases
clinically applicable ai system for accurate diagnosis, quantitative measurements, and prognosis of covid-19 pneumonia using computed tomography
covid-19 screening on chest x-ray images using deep learning based anomaly detection
estimating uncertainty and interpretability in deep learning for coronavirus (covid-19) detection
automatic detection of coronavirus disease (covid-19) using x-ray images and deep convolutional neural networks
classification of covid-19 patients from chest ct images using multi-objective differential evolution-based convolutional neural networks
application of deep learning technique to manage covid-19 in routine clinical practice using ct images: results of 10 convolutional neural networks
covid mtnet: covid-19 detection with multi-task deep learning approaches
sample-efficient deep learning for covid-19 diagnosis based on ct scans
radiologist-level covid-19 detection using ct scans with detail-oriented capsule networks
deep learning-based detection for covid-19 from chest ct using weak label
artificial intelligence distinguishes covid-19 from community acquired pneumonia on chest ct
can spatiotemporal 3d cnns retrace the history of 2d cnns and imagenet?
a closer look at spatiotemporal convolutions for action recognition
densely connected convolutional networks
deep residual learning for image recognition
aggregated residual transformations for deep neural networks
automl: a survey of the state-of-the-art
automated machine learning: methods, systems, challenges
mixup: beyond empirical risk minimization
a survey on deep learning in medical image analysis
covid-19 image data collection
covid-ct-dataset: a ct scan dataset about covid-19
covid-19 ct lung and infection segmentation dataset
neural architecture search: a survey
automated deep learning design for medical image classification by health-care professionals with no coding experience: a feasibility study
mobilenets: efficient convolutional neural networks for mobile vision applications
proceedings of the ieee conference on computer vision and pattern recognition
mnasnet: platform-aware neural architecture search for mobile
fbnet: hardware-aware efficient convnet design via differentiable neural architecture search
random search and reproducibility for neural architecture search
efficient neural architecture search via parameter sharing
evaluating the search phase of neural architecture search
adam: a method for stochastic optimization
sgdr: stochastic gradient descent with warm restarts
a. slice sampling strategies. 1) random sampling strategy: the random sampling strategy is applied to the training set. in this way, each scan will be composed of different slices, which can be regarded as data augmentation that improves model robustness and avoids overfitting.
algorithm 1: random slice sampling algorithm.
input: s: the ordered slice list; n: the target number of slices.
output: ŝ: the sampled slice list.
function main(s, n):
  if n == len(s) then
    re-order(ŝ)
  return ŝ
2) symmetrical sampling strategy: the symmetrical sampling strategy is applied to the test set. this avoids randomness in the test results, and also makes the performance comparison between different models fair.
key: cord-312366-8qg1fn8f authors: adiga, aniruddha; dubhashi, devdatt; lewis, bryan; marathe, madhav; venkatramanan, srinivasan; vullikanti, anil title: mathematical models for covid-19 pandemic: a comparative analysis date: 2020-10-30 journal: j indian inst sci doi: 10.1007/s41745-020-00200-6 sha: doc_id: 312366 cord_uid: 8qg1fn8f covid-19 pandemic represents an unprecedented global health crisis in the last 100 years.
its economic, social and health impact continues to grow and is likely to end up as one of the worst global disasters since the 1918 pandemic and the world wars. mathematical models have played an important role in the ongoing crisis; they have been used to inform public policies and have been instrumental in many of the social distancing measures that were instituted worldwide. in this article, we review some of the important mathematical models used to support the ongoing planning and response efforts. these models differ in their use, their mathematical form and their scope. models have been used by mathematical epidemiologists to support a broad range of policy questions, and their use during covid-19 has been widespread. in general, the type and form of models used in epidemiology depend on the phase of the epidemic. before an epidemic, models are used for planning, for identifying critical gaps and for preparing plans to detect and respond in the event of a pandemic. at the start of a pandemic, policy makers are interested in asking questions such as: (i) where and how did the pandemic start, (ii) risk of its spread in the region, (iii) risk of importation in other regions of the world, (iv) basic understanding of the pathogen and its epidemiological characteristics.
as the pandemic takes hold, researchers begin investigating: (i) various intervention and control strategies; usually pharmaceutical interventions do not work in the event of a pandemic and thus nonpharmaceutical interventions are most appropriate, (ii) forecasting the epidemic incidence rate, hospitalization rate and mortality rate, (iii) efficiently allocating scarce medical resources to treat the patients and (iv) understanding the change in individual and collective behavior and adherence to public policies. after the pandemic starts to slow down, modelers are interested in developing models related to recovery and long-term impacts caused by the pandemic. j. indian inst. sci. | vol xxx:x | xxx-xxx 2020 | journal.iisc.ernet.in as a result, comparing models needs to be done with care. when comparing models, one needs to specify: (a) the purpose of the model, (b) the end user to whom the model is targeted, (c) the spatial and temporal resolution of the model, and (d) the underlying assumptions and limitations. we illustrate these issues by summarizing a few key methods for projection and forecasting of disease outcomes in the us and sweden. organization. the paper is organized as follows. in sect. 2 we give preliminary definitions. section 3 discusses us- and uk-centric models developed by researchers at imperial college. section 4 discusses metapopulation models focused on the us that were developed by our group at uva and by researchers at northeastern university. section 5 describes models developed by swedish researchers for studying the outbreak in sweden. in sect. 6 we discuss methods developed for forecasting. section 8 contains discussion, model limitations and concluding remarks. in a companion paper that appears in this special issue, we address certain complementary issues related to pandemic planning and response, including the role of data and analytics. important note.
the primary purpose of the paper is to highlight some of the salient computational models that are currently being used to support covid-19 pandemic response. these models, like all models, have their strengths and weaknesses: they have all faced challenges arising from the lack of timely data. our goal is not to pick winners and losers among these models; each model has been used by policy makers and continues to be used to advise various agencies. rather, our goal is to introduce to the reader a range of models that can be used in such situations. a simple model is no better or worse than a complicated model. the suitability of a specific model for a given question needs to be evaluated by the decision maker and the modeler. models for epidemiology. epidemiological models fall into two broad classes: statistical models that are largely data driven, and mechanistic models that are based on underlying theoretical principles developed by scientists on how the disease spreads. data-driven models use statistical and machine learning methods to forecast outcomes, such as case counts, mortality and hospital demands. this is a very active area of research, and a broad class of techniques has been developed, including auto-regressive time series methods, bayesian techniques and deep learning 1, 2, 3, 4, 5, 6 . mechanistic models of disease spread within a population 7, 8, 9, 10 use mechanistic (also referred to as procedural or algorithmic) methods to describe the evolution of an epidemic through a population. the most common of these are sir-type models, which partition a population of n agents into three sets, each corresponding to a disease state: susceptible (s), infective (i) and removed or recovered (r).
the specific model then specifies how susceptible individuals become infectious, and then recover. in its simplest form (referred to as the basic compartmental model) 7, 9, 10 , the population is assumed to be completely mixed. let S(t), I(t) and R(t) denote the numbers of people in the susceptible, infected and recovered states at time t, respectively, and let s(t) = S(t)/N. then the sir model can be described by the following system of ordinary differential equations:
dS(t)/dt = -β s(t) I(t),
dI(t)/dt = β s(t) I(t) - γ I(t),
dR(t)/dt = γ I(t),
where β is referred to as the transmission rate, and γ is the recovery rate. a key parameter in such a model is the "reproductive number", denoted by r_0 = β/γ. at the start of an epidemic, much of the public health effort is focused on estimating r_0 from observed infections 12 . mass action compartmental models have been the workhorse of epidemiologists and have been widely used for over 100 years. their strength comes from their simplicity, both analytically and from the standpoint of understanding the outcomes. software systems have been developed to solve such models, and a number of associated tools have been built to support analysis using such models. although simple and powerful, mass action compartmental models do not capture the inherent heterogeneity of the underlying populations. a significant amount of research has been conducted to extend the model, usually in two broad ways. the first involves structured metapopulation models: these construct an abstraction of the mixing patterns in the population into m different subpopulations, e.g., age groups and small geographical regions, and attempt to capture the heterogeneity in mixing patterns across subpopulations. in other words, the model has states s_j(t), i_j(t), r_j(t) for each subpopulation j. the evolution of a compartment x_j(t) is determined by mixing within and across compartments.
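before turning to structured models, the basic sir system above can be integrated numerically; the following forward-euler sketch is our illustration, with arbitrary parameter values giving r_0 = β/γ = 3:

```python
def simulate_sir(n=1_000_000, i0=10, beta=0.3, gamma=0.1, days=300, dt=0.1):
    """Forward-Euler integration of the basic SIR equations (R0 = beta/gamma)."""
    s_count, i_count, r_count = float(n - i0), float(i0), 0.0
    peak_infected = i_count
    for _ in range(int(days / dt)):
        s = s_count / n                       # normalized susceptible fraction s(t)
        new_inf = beta * s * i_count * dt     # S -> I transitions in this step
        new_rec = gamma * i_count * dt        # I -> R transitions in this step
        s_count -= new_inf
        i_count += new_inf - new_rec
        r_count += new_rec
        peak_infected = max(peak_infected, i_count)
    return s_count, i_count, r_count, peak_infected

s_end, i_end, r_end, peak = simulate_sir()
```

for r_0 = 3, the epidemic infects roughly 94% of the population before dying out, with a peak prevalence near 30%.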
for instance, survey data on mixing across age groups 13 have been used to construct age-structured metapopulation models 14 . more relevant for our paper are spatial metapopulation models, in which the subpopulations are connected through airline and commuter flow networks 15, 16, 17, 18, 19 . main steps in constructing structured metapopulation models. this depends on the disease, the population and the type of question being studied. the key steps in the development of such models for the spread of diseases over large populations include:
• constructing subpopulations and compartments: the entire population v is partitioned into subpopulations v_j, within which the mixing is assumed to be complete. depending on the disease model, there are s_j, e_j, i_j, r_j compartments corresponding to the subpopulation v_j (and more, depending on the disease); these represent the numbers of individuals in v_j in the corresponding state.
• mixing patterns among compartments: state transitions between compartments might depend on the states of individuals within the subpopulations associated with those compartments, as well as those whom they come in contact with. for instance, the s_j → e_j transition rate might depend on i_k for all the subpopulations that come in contact with individuals in v_j. mobility and behavioral datasets are needed to model such interactions.
such models are very useful in the early days of the outbreak, when the disease dynamics are driven to a large extent by mobility, which can be captured more easily within such models, and there is significant uncertainty in the disease model parameters. they can also model coarser interventions such as reduced mobility between spatial units and reduced mixing rates. however, these models become less useful for modeling the effect of detailed interventions (e.g., voluntary home isolation, school closures) on disease spread in and across communities.
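the construction steps above can be condensed into a small two-patch seir sketch; this is our illustration, with an arbitrary contact matrix and parameter values, not any of the cited systems:

```python
import numpy as np

def step_metapop_seir(state, contact, beta=0.3, sigma=0.2, gamma=0.1, dt=0.1):
    """One time step of a patch SEIR model.
    state: (4, m) array of S, E, I, R counts for m subpopulations.
    contact[j, k]: rate at which residents of patch j mix with patch k."""
    s, e, i, r = state
    n = state.sum(axis=0)
    force = beta * contact @ (i / n)          # force of infection per patch
    new_e = s * force * dt                    # S_j -> E_j
    new_i = sigma * e * dt                    # E_j -> I_j
    new_r = gamma * i * dt                    # I_j -> R_j
    return np.array([s - new_e, e + new_e - new_i, i + new_i - new_r, r + new_r])

# two patches of 100,000 people; infection seeded only in patch 0, weak coupling
state = np.array([[99990.0, 100000.0], [0.0, 0.0], [10.0, 0.0], [0.0, 0.0]])
contact = np.array([[0.9, 0.1], [0.1, 0.9]])
for _ in range(3000):                         # 300 simulated days
    state = step_metapop_seir(state, contact)
```

even the weak off-diagonal coupling is enough to carry the outbreak from the seeded patch into the other one, which is the qualitative behavior such models are used to study.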
agent-based networked models (sometimes just called agent-based models) extend metapopulation models further by explicitly capturing the interaction structure of the underlying populations. often such models are also resolved at the level of single individual entities (animals, humans, etc.). in this class of models, the epidemic dynamics can be modeled as a diffusion process on a specific undirected contact network g(v, e) on a population v; each edge e = (u, v) ∈ e implies that individuals (also referred to as nodes) u, v ∈ v come into contact. main steps in setting up an agent-based model. while the specific steps depend on the disease, the population, and the type of question being studied, the general process involves the following steps:
• construct a network representation g: the set v is the population in a region, and is available from different sources, such as census and landscan. however, the contact patterns are more difficult to model, as no real data are available on contacts between people at a large scale. instead, researchers have tried to model activities and mobility, from which contacts can be inferred based on co-location. multiple approaches have been developed for this, including random mobility based on statistical models, and very detailed models based on activities in urban regions, which have been estimated through surveys, transportation data, and other sources, e.g., 20, 21, 8, 22, 23 .
• develop models of within-host disease progression: such models can be represented as finite-state probabilistic timed transition models, which are designed in close coordination with biologists and epidemiologists, and parameterized using detailed incidence data (see 9 for discussion and additional pointers).
• develop high-performance computing (hpc) simulations to study epidemic dynamics in such models, e.g., 24, 25, 26, 27 .
typical public health analyses involve large experimental designs, and the models are stochastic; this necessitates the use of such hpc simulations on large computing clusters.
• incorporate interventions and behavioral changes: interventions include closure of schools and workplaces 22, 28 and vaccinations 21 ; whereas behavioral changes include individual-level social distancing, changes in mobility, and use of protective measures.
such a network model captures the interplay between the three components of computational epidemiology: (i) individual behaviors of agents, (ii) unstructured, heterogeneous multi-scale networks, and (iii) the dynamical processes on these networks. it is based on the hypothesis that a better understanding of the characteristics of the underlying network and individual behavioral adaptation can give better insights into contagion dynamics and response strategies. although computationally expensive and data intensive, network-based epidemiology alters the types of questions that can be posed, providing qualitatively different insights into disease dynamics and public health policies. it also allows policy makers to formulate and investigate potentially novel and context-specific interventions. like projection approaches, models for epidemic forecasting can be broadly classified into two groups: (i) statistical and machine learning-based data-driven models, and (ii) causal or mechanistic models; see 29, 30, 2, 31, 32, 6, 33 and the references therein for the current state of the art in this rapidly evolving field. statistical methods employ statistical and time series-based methodologies to learn patterns in historical epidemic data and leverage those patterns for forecasting. the simplest yet useful class is the method of analogs: one simply compares the current epidemic with one of the earlier outbreaks and then uses the best match to forecast the current epidemic.
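as a toy version of the diffusion process on a contact network g(v, e) described above (our sketch, nothing like the cited hpc systems), consider a discrete-time stochastic sir simulation:

```python
import random

def network_sir(edges, n_nodes, seed_node=0, p_transmit=0.5, t_recover=3, rng_seed=1):
    """Discrete-time stochastic SIR on an undirected contact network G(V, E).
    Each day, every infectious node infects each susceptible neighbor with
    probability p_transmit, and recovers after t_recover days."""
    rng = random.Random(rng_seed)
    neighbors = {v: set() for v in range(n_nodes)}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    state = {v: "S" for v in range(n_nodes)}
    state[seed_node] = "I"
    days_infected = {seed_node: 0}
    while any(s == "I" for s in state.values()):
        newly_infected = []
        for v, s in state.items():
            if s != "I":
                continue
            for w in neighbors[v]:
                if state[w] == "S" and rng.random() < p_transmit:
                    newly_infected.append(w)
        for v in list(days_infected):         # age the infections, then recover
            days_infected[v] += 1
            if days_infected[v] >= t_recover:
                state[v] = "R"
                del days_infected[v]
        for w in newly_infected:
            if state[w] == "S":
                state[w] = "I"
                days_infected[w] = 0
    return sum(1 for s in state.values() if s == "R")  # final epidemic size

# a ring contact network of 20 people, seeded at one node
ring = [(v, (v + 1) % 20) for v in range(20)]
final_size = network_sir(ring, n_nodes=20)
```

changing the edge set (e.g., to a clustered or scale-free network) changes the outbreak size and speed, which is exactly the structural sensitivity that network-based epidemiology studies.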
popular statistical methods for forecasting influenza-like illnesses (which include covid-19) include, e.g., generalized linear models (glm), autoregressive integrated moving average (arima), and generalized autoregressive moving average (garma) 34, 31, 35 . statistical methods are fast, but they crucially depend on the availability of training data. furthermore, since they are purely data driven, they do not capture the underlying causal mechanisms. as a result, epidemic dynamics affected by behavioral adaptations are usually hard to capture. artificial neural networks (ann) have gained increased prominence in epidemic forecasting due to their self-learning ability without prior knowledge (see 1, 11, 36 and the references therein). such models have used a wide variety of data as surrogates for producing forecasts, including: (i) social media data, (ii) weather data, (iii) incidence curves and (iv) demographic data. causal models can be used for epidemic forecasting in a natural manner 30, 3, 37, 32, 38, 39 . these models calibrate the internal model parameters using the disease incidence data seen until a given day and then execute the model forward in time to produce the future time series. compartmental as well as agent-based models can be used to produce such forecasts. the choice of model depends on the specific question at hand and on the computational and data resource constraints. one of the key ideas in forecasting is to develop ensemble models, i.e., models that combine forecasts from multiple models 40, 6, 38, 39 . the idea, which originated in the domain of weather forecasting, has found methodological advances in the machine learning literature. ensemble models typically show better performance than the individual models. imperial college modeling group (uk model). background. the modeling group led by neil ferguson developed, to our knowledge, the first model to study the impact of covid-19 across two large countries, the us and the uk; see 22 .
the basic model was first developed in 2005: it was used to inform policy pertaining to the h5n1 pandemic, was one of the three models used to inform the federal pandemic influenza plan, and led to the now well-accepted targeted layered containment (tlc) strategy. it was adapted to covid-19 as discussed below. the model was widely discussed and covered in the scientific as well as popular press 41 . we will refer to this as the ic model. model structure. the basic model structure consists of developing a set of households based on census information for a given country. the structure of the model is largely borrowed from their earlier work, see 42, 28 . landscan data were used to spatially distribute the population. individual members of a household interact with the other members of the household. the data to produce these households are obtained using census information for these countries; census data are used to assign ages and household sizes. details on the resolution of census data and the dates were not clear. schools, workplaces and random meeting points are then added. the school data for the us were obtained from the national center for education statistics, while for the uk schools were assigned randomly based on population density. data on average class sizes and staff-student ratios were used to generate a synthetic population of schools distributed proportionally to local population density. data on the distribution of workplace sizes were used to generate workplaces, with commuting distance data used to locate workplaces appropriately across the population. individuals are assigned to each of these locations at the start of the simulation. a gravity-style kernel is used to decide how far a person can go in terms of attending work, school or community interaction places. the number of contacts between individuals at school, work and community meeting points is calibrated to produce a given attack rate. each individual has an associated disease transmission model.
the disease transmission model parameters are based on the data collected when the pandemic was evolving in wuhan; see page 4 of 22 . finally, the model also has a rich set of interventions. these include: (i) case isolation, (ii) voluntary home quarantine, (iii) social distancing of those over 70 years, (iv) social distancing of the entire population, and (v) closure of schools and universities; see page 6 of 22 . the code was recently released and is being analyzed. this is important, as the interpretation of these interventions can have a substantial impact on the outcome. model predictions. the imperial college (ic) model was one of the first models to evaluate the covid-19 pandemic using a detailed agent-based model. the predictions made by the model were quite dire. the results show that, to be able to reduce r to close to 1 or below, a combination of case isolation, social distancing of the entire population and either household quarantine or school and university closure is required. the model had tremendous impact: the uk and us both decided to start considering complete lockdowns, a policy that was practically impossible to even talk about earlier in the western world. the paper came out around the same time that the wuhan epidemic was raging and the epidemic in italy had taken a turn for the worse. this made the model results even more critical. strengths and limitations. the ic model was one of the first models by a reputed group to report the potential impact of covid-19 with and without interventions. the model was far more detailed than other models that had been published until then. the authors also took great care parameterizing the model with the best disease transmission data available at the time. the model also considered a very rich set of interventions and was one of the first to analyze pulsing interventions. on the flip side, the representation of the underlying social contact network was relatively simple.
second, often the details of how interventions were represented were not clear. since the publication of their article, the modelers have made their code open, and the research community has witnessed an intense debate on the pros and cons of various modeling assumptions and the resulting software system, see 43 . we believe that, despite certain valid criticisms, the results represented a significant advance in terms of when they were put out and the level of detail incorporated in the models. northeastern and uva models (us models). background. this approach is an alternative to detailed agent-based models, and has been used in modeling the spread of multiple diseases, including influenza 15, 18 , ebola 17 and zika 19 . it has been adapted for studying the importation risk of covid-19 across the world 16 . structured metapopulation models construct a simple abstraction of the mixing patterns in the population, in which the entire region under study is decomposed into fully connected geographical regions, representing subpopulations, which are connected through airline and commuter flow networks. thus, they lack the rich detail of agent-based models, but have fewer parameters, and are therefore easy to set up and scale to large regions. model structure. here, we summarize gleam 15 (the northeastern model) and patchsim 18 (the uva model). gleam uses two classes of datasets: population estimates and mobility. population data are taken from the "gridded population of the world" 44 , which gives an estimated population value at a resolution of 15 × 15 minutes of arc (referred to as a "cell") over the entire planet. two different kinds of mobility processes are considered: airline travel and commuter flow. the former captures long-distance travel, whereas the latter captures localized mobility. airline data are obtained from the international air transport association (iata) 45 and the official airline guide (oag) 46 .
there are about 3300 airports worldwide; these are aggregated at the level of urban regions served by multiple airports (e.g., as in london). a voronoi tessellation is constructed with the resulting airport locations as centers, and the population cells are assigned to these cells, with a 200-mile cutoff from the center. the commuter flows connect cells at a much smaller spatial scale. we represent this mobility pattern as a directed graph on the cells, and refer to it as the mobility network. in the basic seir model, the subpopulation in each cell j is partitioned into compartments s_j, e_j, i_j and r_j, corresponding to the disease states. for each cell j, we define the force of infection λ_j as the rate at which a susceptible individual in the subpopulation in cell j becomes infected; this is determined by the interactions the person has with infectious individuals in cell j or any cell j′ connected in the mobility network. an individual in the susceptible compartment s_j becomes infected with probability λ_j Δt in a time interval Δt, and enters the compartment e_j. from this compartment, the individual moves to the i_j and then the r_j compartments, with appropriate probabilities corresponding to the disease model parameters. the patchsim 18 model has a similar structure, except that it uses administrative boundaries (e.g., counties) instead of a voronoi tessellation; these are connected using a mobility network. the mobility network is derived by combining commuter and airline networks, to model the time spent per day by individuals of region (patch) i in region (patch) j. since it explicitly captures the level of connectivity through commuter-like mixing, it is capable of incorporating week-to-week and month-to-month variations in mobility and connectivity. in addition to its capability to run in deterministic or stochastic mode, the open-source implementation 47 allows fine-grained control of disease parameters across space and time.
although patchsim has a more generic force-of-infection mode of operation (where patches can be more general than spatial regions), we will mainly summarize the results from the mobility model, which was used for the covid-19 response. what did the models suggest? the gleam model is being used in a number of covid-19-related studies and analyses. in 48 , the northeastern university team used the model to understand the spread of covid-19 within china and the relative risk of importation of the disease internationally. their analysis suggested that the spread of covid-19 out of wuhan into other parts of mainland china was not contained well, due to the delays induced by detection and official reporting. it is hard to interpret these results. the paper suggested that international importation could be contained substantially by a strong travel ban; while the ban might have delayed the onset of cases, the subsequent spread across the world suggests that we were not able to arrest the spread effectively. the model is also used to provide weekly projections (see https://covid19.gleamproject.org/); this site does not appear to be maintained for the most current forecasts (likely because the team is participating in the cdc forecasting group). the patchsim model is being used to support federal agencies as well as the state of virginia. due to our past experience, we have refrained from providing longer term forecasts, focusing instead on short-term projections. the model is used within a forecasting via projection selection approach, where a set of counterfactual scenarios is generated based on on-the-ground response efforts and surveillance data, and the best fits are selected based on historical performance. while allowing future scenarios to be described, these projections also help to provide a reasonable narrative of past trajectories, and retrospective comparisons are used for metrics such as 'cases averted by doing x'.
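the "forecasting via projection selection" step can be sketched as follows; the scenario names and counts are hypothetical, and mean absolute error stands in for whatever historical-performance criterion is actually used:

```python
def select_projection(scenarios, observed):
    """Pick the counterfactual scenario that best fits past surveillance.

    scenarios : dict mapping scenario name -> list of daily counts
                (candidate model trajectories)
    observed  : list of observed daily counts, aligned to the same start date
    Returns the name of the scenario with the lowest mean absolute error
    over the overlap window (a deliberately simple selection criterion).
    """
    n = len(observed)

    def mae(traj):
        return sum(abs(t - o) for t, o in zip(traj[:n], observed)) / n

    return min(scenarios, key=lambda name: mae(scenarios[name]))
```

once the best-fitting scenario is selected, its remaining trajectory serves as the short-term projection, and it is re-selected as new surveillance data arrive.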
these projections are revised weekly based on stakeholder feedback and surveillance updates. further discussion of how the model is used by the virginia department of health each week can be found at https://www.vdh.virginia.gov/coronavirus/covid-19-data-insights/#model . strengths and limitations. structured metapopulation models provide a good tradeoff between the realism (and computational cost) of detailed agent-based models and the simplicity (and speed) of mass-action compartmental models; they need far fewer inputs and scale well to large regions. this is especially true in the early days of an outbreak, when the disease dynamics are driven to a large extent by mobility, which can be captured more easily within such models, and when there is significant uncertainty in the disease model parameters. however, once the outbreak has spread, it is harder to model detailed interventions (e.g., social distancing), which are much more localized; further, these are hard to model using a single parameter. both the gleam and patchsim models also faced their share of challenges in projecting case counts, due to the rapidly evolving pandemic, inadequate testing, a lack of understanding of the number of asymptomatic cases, and the difficulty of assessing the compliance levels of the population at large. researchers (swedish models) sweden was an outlier amongst countries in that it decided to implement public health interventions without a lockdown. schools and universities were not closed, and restaurants and bars remained open. swedish citizens implemented "work from home" policies where possible. moderate social distancing, based on individual responsibility and without police enforcement, was employed, with an attempt to place emphasis on shielding the 65+ age group. background. statistician tom britton developed a very simple model with a focus on predicting the number of infected over time in stockholm. model structure. britton 49 used a very simple sir general epidemic model.
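britton's coarse-grained approach starts from the basic reproduction number r_0 and the initial doubling time d. the standard relations involved can be sketched as follows; the numerical values in the assertions are purely illustrative, not britton's estimates:

```python
import math

def growth_rate_from_doubling(d):
    """Exponential growth rate r such that cases double every d days."""
    return math.log(2) / d

def herd_immunity_threshold(r0):
    """Fraction that must be immune for the epidemic to stop growing: 1 - 1/R0."""
    return 1.0 - 1.0 / r0

def project_cases(i0, d, days):
    """Project case counts forward assuming unchanged exponential growth."""
    r = growth_rate_from_doubling(d)
    return i0 * math.exp(r * days)
```

with r_0 = 2.5, for example, growth stops once 60% of the population is immune, which is the mechanism behind the herd-immunity prediction discussed below.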
the model is used to make a coarse-grained prediction of the behavior of the outbreak based on knowing the basic reproduction number r_0 and the doubling time d in the initial phase of the epidemic. calibration to calendar time was done using the observed number of case fatalities, together with estimates of the time from infection to death and the infection fatality risk. predictions were made assuming no change of behavior, as well as for the situation where preventive measures are put in place at one specific time-point. model predictions. one of the controversial predictions from this model was that the number of infections in the stockholm area would quickly rise towards attaining herd immunity within a short period. however, mass testing carried out in stockholm during june indicated a far smaller percentage of infections. strengths and limitations. britton's model was intended as a quick and simple method to estimate and predict an on-going epidemic outbreak both with and without preventive measures put in place. it was intended as a complement to more realistic and detailed modeling. the estimation-prediction methodology is much simpler and more straightforward to implement for this simple model. it is more transparent to see how the few model assumptions affect the results, and it is easy to vary the few parameters to see their effect on predictions, so that one can see which parameter uncertainties have the biggest impact on predictions and which parameter uncertainties are less influential. background. the public health authority (fhm) of sweden produced a model to study the spread of covid-19 in four regions in sweden: dalarna, skåne, stockholm, and västra götaland 50 . model structure. it is a standard compartmentalized seir model; within each compartment the population is homogeneous, so individuals are assumed to have the same characteristics and to act in the same way.
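a minimal forward-euler sketch of such a standard seir system; the function name and all parameter values below are illustrative assumptions, not the fhm calibration. it also reports the day with the largest number of infectious individuals, one of the quantities the fhm model estimated:

```python
def run_seir(n, i0, beta, sigma, gamma, days, dt=0.1):
    """Deterministic SEIR integrated with forward-Euler steps.

    n     : total population, i0 : initially infectious
    beta  : transmission rate, sigma : 1/incubation period,
    gamma : 1/infectious period (all values here are illustrative)
    Returns the daily infectious counts and the index of the peak day.
    """
    s, e, i, r = n - i0, 0.0, float(i0), 0.0
    daily = []
    steps_per_day = int(round(1.0 / dt))
    for _ in range(days):
        for _ in range(steps_per_day):
            new_e = beta * s * i / n * dt   # S -> E
            new_i = sigma * e * dt          # E -> I
            new_r = gamma * i * dt          # I -> R
            s, e, i, r = s - new_e, e + new_e - new_i, i + new_i - new_r, r + new_r
        daily.append(i)
    peak_day = max(range(days), key=lambda d: daily[d])
    return daily, peak_day
```

because every compartment is homogeneous, a single run is fully determined by four scalars, which is exactly why such models are easy to calibrate and also why they miss contact heterogeneity.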
data used in fitting the fhm model include point prevalences found by pcr testing in stockholm at two different time points. model predictions. the model estimated the number of infected individuals at different time points and the date with the largest number of infectious individuals. it predicted that by july 1, 8.5% (5.9-12.9%) of the population in dalarna would have been infected, 4% (2.4-9.9%) of the population in skåne, 19% (17.7-20.2%) of the population in stockholm, and 9% (6.3-12.2%) of the population in västra götaland. it was hard to test these predictions because of the great uncertainty in the immune response to sars-cov-2: the prevalence of antibodies was surprisingly low, but recent studies show that mild cases seem never to develop antibodies against sars-cov-2, developing only t-cell-mediated immunity 51 . the model also investigated the effect of increased contacts during the summer that stabilize in autumn. it found that if the contacts in stockholm and dalarna increase by less than 60% in comparison to the contact rate at the beginning of june, the second wave will not exceed the observed first wave. strengths and limitations. the simplicity of the model is a strength for ease of calibration and understanding, but it is also a major limitation in view of the well-known characteristics of covid-19: since the disease is primarily transmitted through droplet infection, the social contact structure in the population is of primary importance for the dynamics of infection. the compartmental model used in this analysis does not account for variation in contacts, where few individuals may have many contacts while the majority have fewer. the model is also not age stratified, but covid-19 strikingly affects different age groups differently; e.g., young people seem to get milder infections.
in this model, each infected individual has the same infectivity and the same risk of becoming a reported case, regardless of age. different age groups normally have varied degrees of contacts and have changed their behavior differently during the covid-19 pandemic. this is not captured in the model. background. rocklöv developed a model to estimate the impact of covid-19 on the swedish population at the municipality level, considering demography and human mobility under various scenarios of mitigation and suppression. they attempted to estimate the time course of infections, health care needs, and mortality in relation to the swedish icu capacity, as well as the costs of care, and compared alternative policies and counterfactual scenarios. model structure. 52 used a seir compartmentalized model with age-structured compartments (0-59, 60-79, 80+) for the susceptible, infected, in-patient care, icu, and recovered populations, based on swedish population data at the municipal level. it also incorporated inter-municipality travel using a radiation model. parameters were calibrated based on a combination of values available from the international literature and fitting to available outbreak data. the effect of a number of different intervention strategies was considered, ranging from no intervention to modest social distancing and finally to imposed isolation of various groups. model predictions. the model predicted an estimated death toll of around 40,000 for the strategies based only on social distancing and between 5000 and 8000 for policies imposing stricter isolation. it predicted icu cases of up to 10,000 without much intervention and up to 6000 with modest social distancing, way above the available capacity of about 500 icu beds. strengths and limitations. the model showed a good fit against the reported covid-19-related deaths in sweden up to the 20th of april, 2020; however, the predictions of the total deaths and icu demand turned out to be way off the mark.
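the radiation model used above for inter-municipality travel can be sketched with the standard flux formula; the populations, distances and outflows below are synthetic, and the function is our own simplified reading of the radiation model, not the published calibration:

```python
import numpy as np

def radiation_flux(pop, dist, outflow):
    """Commuting flux between locations under the radiation model.

    T[i, j] = outflow[i] * m_i * n_j / ((m_i + s_ij) * (m_i + n_j + s_ij)),
    where s_ij is the total population strictly within distance dist[i, j]
    of location i, excluding locations i and j themselves.
    pop, outflow : 1-d arrays; dist : symmetric distance matrix.
    """
    k = len(pop)
    T = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            if i == j:
                continue
            # locations inside the circle of radius dist[i, j] around i;
            # i itself is always inside, j never is (strict inequality)
            inside = dist[i] < dist[i, j]
            s = pop[inside].sum() - pop[i]
            m, n = pop[i], pop[j]
            T[i, j] = outflow[i] * m * n / ((m + s) * (m + n + s))
    return T
```

unlike a gravity model, the radiation model needs no fitted distance-decay exponent: intervening population "shields" far-away destinations, so nearer, less-shielded municipalities receive more flux.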
background. finally, 53, 54 used an individual-based model parameterized on swedish demographics to assess the anticipated spread of covid-19. model structure. 53 employed an individual agent-based model based on work by ferguson et al. 22 . individuals are randomly assigned an age based on swedish demographic data, and they are also assigned a household. household size is normally distributed around the average household size in sweden in 2018, 2.2 people per household. households were placed on a lattice using high-resolution population data from landscan and census data from statistics sweden; each household is additionally allocated to a city, based on the closest city center by distance, and to a county, based on city designation. each individual is placed in a school or workplace at a rate similar to the current participation in sweden. transmission between individuals occurs through contact at each individual's workplace or school, within their household, and in their communities. infectiousness is, thus, a property dependent on contacts from household members, school/workplace members and community members, with a probability based on household distances. transmissibility was calibrated against data for the period 21 march-6 april to reproduce either the doubling time reported using pan-european data or the growth in reported swedish deaths for that period. various types of interventions were studied, including the policy implemented in sweden by the public health authorities as well as more aggressive interventions approaching full lockdown. model predictions. their prediction was that "under conservative epidemiological parameter estimates, the current swedish public-health strategy will result in a peak intensive-care load in may that exceeds pre-pandemic capacity by over 40-fold, with a median mortality of 96,000 (95% ci 52,000 to 183,000)". strengths and limitations. this model was based on adapting the well-known imperial model discussed in sect.
3 to sweden, and considered a wide range of intervention strategies. unfortunately, the predictions of the model were woefully off the mark on both counts: the deaths by june 18 were under 5000, and at the peak the icu infrastructure had at least 20% unutilized capacity. forecasting is of particular interest to policy makers, as they ask for actual counts. since the surveillance systems have relatively stabilized in recent weeks, the development of forecasting models has gained traction and several models are available in the literature. in the us, the centers for disease control and prevention (cdc) has provided a platform for modelers to share their forecasts, which are analyzed and combined in a suitable manner to produce ensemble multi-week forecasts for cumulative/incident deaths, hospitalizations and, more recently, cases at the national, state, and county level. probabilistic forecasts are provided by 36 teams as of july 28, 2020 (there were 21 models as of june 24, 2020), and the cdc, with the help of 55 , has developed a uniform ensemble model for multi-step forecasts 56 . it has been observed previously for other infectious diseases that an ensemble of forecasts from multiple models performs better than any individual contributing model 39 . in the context of covid-19 case count modeling and forecasting, a multitude of models has been developed based on different assumptions that capture specific aspects of the disease dynamics (reproduction number evolution, contact network construction, etc.). the models employed in the cdc forecast hub can be broadly classified into three categories: data-driven, hybrid, and mechanistic models, with some of the models being open source. data-driven models. they do not model the disease dynamics but attempt to find patterns in the available data and combine them appropriately to make short-term forecasts.
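the simplest version of such an ensemble combination, an equal-weight per-horizon median over the member point forecasts, can be sketched as follows (the cdc ensemble combines full probabilistic forecasts more carefully; the team forecasts here are hypothetical):

```python
def ensemble_forecast(forecasts):
    """Equal-weight ensemble: per-horizon median of the member forecasts.

    forecasts : list of lists, one multi-week point forecast per team,
                all aligned to the same horizons.
    The median is robust to a single team's outlier projection, one of the
    reasons multi-model ensembles tend to beat individual members.
    """
    def median(xs):
        xs = sorted(xs)
        k = len(xs)
        mid = k // 2
        return xs[mid] if k % 2 else 0.5 * (xs[mid - 1] + xs[mid])

    return [median(week) for week in zip(*forecasts)]
```

with three teams where one projects an extreme trajectory, the ensemble tracks the two agreeing members rather than the outlier.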
in such data-driven models, it is hard to incorporate interventions directly; hence, the machine is presented with a variety of exogenous data sources, such as mobility data, hospital records, etc., with the hope that their effects are captured implicitly. early iterations of the institute for health metrics and evaluation (ihme) model 34 for death forecasting at the state level employed a statistical model that fits a time-varying gaussian error function to the cumulative death counts and is parameterized to control for the maximum death rate, the maximum death rate epoch, and a growth parameter (with many parameters learnt using data from the outbreak in china). the ihme models are undergoing revisions (moving towards hybrid models) and updated implementable versions are available at 57 . the university of texas at austin covid-19 modeling consortium model 58 uses a very similar statistical model to 34 but employs real-time mobility data as additional predictors and also differs in the fitting process. the carnegie mellon delphi group employs the well-known auto-regressive (ar) model, which uses lagged versions of the case counts and deaths as predictors and determines a sparse set that best describes the observations by using lasso regression 59 . 60 is a deep learning model which has been developed along the lines of 1 and attempts to learn the dependence between the death rate and other available syndromic, demographic, mobility and clinical data. hybrid models. these methods typically employ statistical techniques to model disease parameters, which are then used in epidemiological models to forecast cases. most statistical models 34, 58 are evolving to become hybrid models.
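the lagged-predictor construction behind such an ar model can be sketched as follows; ordinary least squares stands in here for the lasso fit (which would add an l1 penalty to select a sparse subset of lags), and the series is synthetic:

```python
import numpy as np

def fit_ar(series, p):
    """Fit an AR(p) model y_t = c + a_1*y_{t-1} + ... + a_p*y_{t-p}.

    Uses ordinary least squares; a Delphi-style model would instead add
    an L1 (lasso) penalty so only a sparse set of lags gets nonzero weight.
    Returns [c, a_1, ..., a_p].
    """
    y = np.asarray(series, float)
    n = len(y)
    # column k holds lag k+1: y_{t-(k+1)} for targets t = p, ..., n-1
    X = np.column_stack([y[p - 1 - k:n - 1 - k] for k in range(p)])
    X = np.column_stack([np.ones(n - p), X])     # intercept column
    coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return coef

def forecast_ar(series, coef, steps):
    """Iterate the fitted AR model forward, feeding predictions back in."""
    hist = list(series)
    p = len(coef) - 1
    for _ in range(steps):
        nxt = coef[0] + sum(coef[k + 1] * hist[-1 - k] for k in range(p))
        hist.append(nxt)
    return hist[len(series):]
```

on an exactly linear series the fitted AR(2) model extrapolates the trend, which illustrates why such models do well over short horizons and poorly once the dynamics change.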
a model that gained significant interest is the youyang gu (yyg) model, which uses a machine learning layer over an seir model to learn the set of parameters (mortality rate, initial r_0, post-lockdown r) specific to a region that best fits the region's observed data. the authors share the optimal parameters, the seir model and the evaluation scripts with the general public for experimentation 61 . the los alamos national lab (lanl) model 35 uses a statistical model to determine how the number of covid-19 infections changes over time; a second process maps the number of infections to the reported data. the number of deaths is computed as a fraction of the number of new cases obtained, using the observed mortality data. mechanistic models. the gleam and jhu models use county-level stochastic seir dynamics. the jhu model incorporates the effectiveness of state-wide intervention policies on social distancing through the r_0 parameter. more recently, model outputs from uva's patchsim model were included as part of a multi-model ensemble (including autoregressive and lstm components) to forecast weekly confirmed cases. comparing model types. we end the discussion of the models above by qualitatively comparing model types. as discussed in the preliminaries, at one end of the spectrum are models that are largely data driven: these range from simple statistical models (various forms of regression models) to more complicated deep learning models. the differences in such models lie in the amount of training data needed, the computational resources needed, and how complicated a mathematical function one is trying to fit to the observed data. these models are strictly data driven and, hence, unable to capture the constant behavioral adaptation at an individual and collective level.
on the other end of the spectrum, seir, meta-population and agent-based network models are based on an underlying procedural representation of the dynamics; in theory, they are able to represent behavioral adaptation endogenously. but both classes of models face immense challenges due to the availability of data, as discussed below. (1) agent-based and seir models were used in all three countries in the early part of the outbreak and continue to be used for counter-factual analysis. the primary reason is the lack of surveillance and disease-specific data; hence, purely data-driven models were not easy to use. seir models lacked heterogeneity but were simple to program and analyze. agent-based models were more computationally intensive and required a fair bit of data to instantiate, but captured the heterogeneity of the underlying countries. by now it has become clear that the use of such models for long-term forecasting is challenging and likely to lead to misleading results. the fundamental reason is adaptive human behavior and the lack of data about it. (2) forecasting, on the other hand, has seen the use of data-driven methods as well as causal methods. short-term forecasts have been generally reasonable. given the intense interest in the pandemic, a lot of data are also becoming available for researchers to use. this helps in validating some of the models further. even so, real-time data on behavioral adaptation and compliance remain very hard to get, and this is one of the central modeling challenges. were some of the models wrong? in a recent opinion piece 4 , professor vikram patel of the harvard school of public health makes a stinging criticism of modeling: crowning these scientific disciplines is the field of modeling, for it was its estimates of mountains of dead bodies which fuelled the panic and led to the unprecedented restrictions on public life around the world.
none of these early models, however, explicitly acknowledged the huge assumptions that were made. a similar article in the ny times recounted the mistakes in the covid-19 response in europe 5 ; also see 62 . our point of view. it is indeed important to ensure that the assumptions underlying mathematical models be made transparent and explicit. but we respectfully disagree with professor patel's statement: most of the good models tried to be very explicit about their assumptions. the mountains of deaths being referred to are explicitly calculated for the case when no interventions are put in place, and are often used as a worst case scenario. now, one might argue that the authors should be explicit and state that this worst case scenario will never occur in practice. forecasting dynamics in social systems is inherently challenging: individual behavior, predictions and epidemic dynamics co-evolve; this co-evolution immediately implies that a dire prediction can lead to extreme change in individual and collective behavior, leading to a reduction in the incidence numbers. would one say the forecasts were wrong in such a case, or that they were influential in ensuring the worst case never happened? none of this implies that one should not explicitly state the assumptions underlying one's model. of course, our experience is that policy makers, news reporters and the common public are looking exactly for such a forecast; we have been constantly asked "when will the peak occur" or "how many people are likely to die". a few possible ways to overcome this tension between the insatiable appetite for forecasts and the inherent challenges in producing them accurately include: • we believe that, in general, it might not be prudent to provide long-term forecasts for such systems. • state the assumptions underlying the models as clearly as possible. modelers need to be much more disciplined about this. they also need to ensure that the models are transparent and can be reviewed broadly (and expeditiously).
• accept that the forecasts are provisional and that they will be revised as new data come in, society adapts, the virus adapts, and we understand the biological impact of the pandemic. • improve surveillance systems so that they produce data the models can use more effectively. even with data, it is very hard to estimate the prevalence of covid-19 in society. communicating scientific findings and risks is an important topical area in this context, see 41, 63, 64, 65 . use of models for evidence-based policy making. in a new book, radical uncertainty 66 , economists john kay and mervyn king (formerly governor of the bank of england) urge caution when using complex models. they argue that models should be valued for the insights they provide, but not relied upon to provide accurate forecasts. the so-called "evidence-based policy" comes in for criticism where it relies on models but also supplies a false sense of certainty where none exists, or seeks out the evidence that is desired ex ante (or "cover") to justify a policy decision: "evidence-based policy has become policy-based evidence". our point of view. the authors make a good point here. but again, everyone, from the public to policy makers and reporters, clamors for a forecast. we argue that this can be addressed in two ways: (i) viewing the problem through the lens of control theory, so that we forecast only to control the deviation from the path we want to follow, and (ii) not insisting on exact numbers but on general trends. as kay and king opine, the value of models, especially in the face of radical uncertainty, lies more in exploring alternative scenarios resulting from different policies: a model is useful only if the person using it understands that it does not represent "the world as it really is", but is a tool for exploring ways in which a decision might or might not go wrong. in his new book, the rules of contagion, adam kucharski 67 draws on lessons from the past.
in 2015 and 2016, during the zika outbreak, researchers planned large-scale clinical studies and vaccine trials. but these were discontinued as soon as the infection ebbed. this is a common frustration in outbreak research: by the time the infections end, fundamental questions about the contagion can remain unanswered. that is why building long-term research capacity is essential. our point of view. the author makes an important point. we hope that today, after witnessing the devastating impacts of the pandemic on the economy and society, the correct lessons will be learnt: sustained investments need to be made in the field to be ready for the impact of the next pandemic. the paper discusses a few important computational models developed by researchers in the us, uk and sweden for covid-19 pandemic planning and response. the models have been used by policy makers and public health officials in their respective countries to assess the evolution of the pandemic, design and analyze control measures, and study various what-if scenarios. as noted, all models faced challenges due to the availability of data, the rapidly evolving pandemic, and the unprecedented control measures put in place. despite these challenges, we believe that mathematical models can provide useful and timely information to policy makers. on the one hand, modelers need to be transparent in the description of their models, clearly state the limitations, and carry out detailed sensitivity and uncertainty quantification; having these models reviewed independently is certainly very helpful. on the other hand, policy makers should be aware of the fact that using mathematical models for pandemic planning, forecasting and response relies on a number of assumptions, and that the data needed to validate these assumptions are often lacking. springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
references.
epideep: exploiting embeddings for epidemic forecasting
real-time epidemic forecasting: challenges and opportunities
real-time forecasting of infectious disease dynamics with a stochastic semi-mechanistic model
forecasting the impact of the first wave of the covid-19 pandemic on hospital demand and deaths for the usa and european economic area countries
an arima model to forecast the spread and the final size of covid-2019 epidemic in italy (first version on ssrn 31 march)
accuracy of real-time multi-model ensemble forecasts for seasonal influenza in the us
structure of social contact networks and their impact on epidemics
computational epidemiology
the structure and function of complex networks
deep learning based epidemic forecasting with synthetic information
control of severe acute respiratory syndrome
social contacts and mixing patterns relevant to the spread of infectious diseases
optimizing influenza vaccine distribution
multiscale mobility networks and the spatial spreading of infectious diseases
the effect of travel restrictions on the spread of the 2019 novel coronavirus (covid-19) outbreak
assessing the international spreading risk associated with the 2014 west african ebola outbreak
optimizing spatial allocation of seasonal influenza vaccine under temporal constraints
spread of zika virus in the americas
generation and analysis of large synthetic social contact networks
modelling disease outbreaks in realistic urban social networks
containing pandemic influenza at the source
episimdemics: an efficient algorithm for simulating the spread of infectious disease over large realistic social networks
enhancing user-productivity and capability through integration of distinct software in epidemiological systems
fred (a framework for reconstructing epidemic dynamics): an open-source software system for modeling infectious diseases and control strategies using census-based populations
modeling targeted layered containment of an influenza pandemic in the united states
pancasting: forecasting epidemics from provisional data
influenza forecasting in human populations: a scoping review
near-term forecasts of influenza-like illness: an evaluation of autoregressive time series approaches
a systematic review of studies on forecasting the dynamics of influenza outbreaks
a framework for evaluating epidemic forecasts
forecasting covid-19 impact on hospital bed-days, icu-days, ventilator-days and deaths by us state in the next 4 months
covid-19 cases and deaths forecasts
tdefsi: theory-guided deep learning-based epidemic forecasting with synthetic information
calibrating a stochastic, agent-based model using quantile-based emulation
epidemic forecasting framework combining agent-based models and smart beam particle filtering
individual versus superensemble forecasts of seasonal influenza outbreaks in the united states
forecasting a moving target: ensemble models for ili case count predictions
modelling the pandemic: the simulations driving the world's response to covid-19
strategies for mitigating an influenza pandemic
critiqued coronavirus simulation gets thumbs up from code-checking efforts
high resolution global gridded data for use in population studies
oag official airline guide
code for simulating the metapopulation seir model
the effect of human mobility and control measures on the covid-19 epidemic in china
basic prediction methodology for covid-19: estimation and sensitivity considerations
estimates of the number of infected individuals during the covid-19 outbreak in the dalarna region, skåne region, stockholm region, and västra götaland region
robust t cell immunity in convalescent individuals with asymptomatic or mild covid-19
covid-19 healthcare demand and mortality in sweden in response to non-pharmaceutical (npis) mitigation and suppression scenarios
arbo-prevent: climate change, human mobility and emerging arboviral outbreaks: new models for risk characterization, resilience and prevention
intervention strategies against covid-19 and their estimated impact on swedish healthcare capacity
managing covid-19 spread with voluntary public-health measures: sweden as a case study for pandemic control
cdc covid-19 forecast hub
projections for first-wave covid-19 deaths across the us using social-distancing measures derived from mobile phones
policy implications of models of the spread of coronavirus: perspectives and opportunities for economists
evaluating science communication
mathematical models to guide pandemic response
infodemic and risk communication in the era of covid-19
radical uncertainty: decision-making beyond the numbers
the rules of contagion: why things spread and why they stop. basic books

acknowledgments. the authors would like to thank members of the biocomplexity covid-19 response team and the network systems science and advanced computing (nssac) division for their thoughtful comments and suggestions related to epidemic modeling and response support. we thank members of the biocomplexity institute and initiative, university of virginia, for useful discussion and suggestions. this work was partially supported.

research associate at the nssac division of the biocomplexity institute and initiative. he completed his phd at the department of electrical engineering, indian institute of science (iisc), bangalore, india, and has held postdoctoral positions at iisc and north carolina state university, raleigh, usa. his research areas include signal processing, machine learning, data mining, forecasting and big data analysis. at nssac, his primary focus has been the analysis and development of forecasting systems for epidemiological signals such as influenza-like illness and covid-19 using auxiliary data sources. bryan lewis is a research associate professor in the network systems science and advanced computing division.
his research has focused on understanding the transmission dynamics of infectious diseases within specific populations through both analysis and simulation. lewis is a computational epidemiologist with more than 15 years of experience in crafting, analyzing, and interpreting the results of models in the context of real public health problems. as a computational epidemiologist, for more than a decade, lewis has been heavily involved in a series of projects forecasting the spread of infectious disease as well as evaluating the response to them in support of the federal government. these projects have tackled diseases from ebola to pandemic influenza and melioidosis to cholera. professor in biocomplexity, the division director of the networks, simulation science and advanced computing (nssac) division at the biocomplexity institute and initiative, and a professor in the department of computer science at the university of virginia (uva). his research interests are in network science, computational epidemiology, ai, foundations of computing, socially coupled system science and high-performance computing. before joining uva, he held positions at virginia tech and the los alamos national laboratory. he is a fellow of the ieee, acm, siam and aaas. scientist at the biocomplexity institute & initiative, university of virginia and his research focuses on developing, analyzing and optimizing computational models in the field of network epidemiology. he received his phd from the department of electrical and communication engineering, indian institute of science (iisc), and did his postdoctoral research at virginia tech. his areas of interest include network science, stochastic modeling and big data analytics. he has used in-silico models of society to study the spread of infectious diseases and invasive species. 
recent research includes modeling and forecasting emerging infectious disease outbreaks (e.g., ebola, covid-19), impact of human mobility on disease spread and resource allocation problems in the context of key: cord-321735-c40m2o5l authors: manca, davide; caldiroli, dario; storti, enrico title: a simplified math approach to predict icu beds and mortality rate for hospital emergency planning under covid-19 pandemic date: 2020-06-04 journal: comput chem eng doi: 10.1016/j.compchemeng.2020.106945 sha: doc_id: 321735 cord_uid: c40m2o5l the different stages of covid-19 pandemic can be described by two key-variables: icu patients and deaths in hospitals. we propose simple models that can be used by medical doctors and decision makers to predict the trends on both short-term and long-term horizons. daily updates of the models with real data allow forecasting some key indicators for decision-making (an excel file in the supplemental material allows computing them). these are beds allocation, residence time, doubling time, rate of renewal, maximum daily rate of change (positive/negative), halfway points, maximum plateaus, asymptotic conditions, and dates and time intervals when some key thresholds are overtaken. doubling time of icu beds for covid-19 emergency can be as low as 2-3 days at the outbreak of the pandemic. the models allow identifying the possible departure of the phenomenon from the predicted trend and thus can play the role of early warning systems and describe further outbreaks. covid-19 is the most exacting pandemic since the spanish flu of more than a century ago. the fast outbreak of covid-19 and the wide spread all over the world transformed a local disease, initially located in china, into a global problem; thus the name: pandemic (fauci et al., 2020) . 
italy (60.5 million inhabitants) is one of the nations most plagued by this pandemic, with almost 28000 official deaths in hospitals (as of 30-apr-2020) and over 4000 icu beds used to treat patients at the peak of the covid-19 emergency (livingston & bucher, 2020; remuzzi & remuzzi, 2020). likewise, lombardy (10 million inhabitants) is the most crowded region of italy and the hardest-hit region of europe, with over 13800 official deaths in hospitals and almost 1400 icu beds (grasselli et al., 2020a; grasselli et al., 2020b). before the pandemic, the number of icu beds for any treatment in lombardy was about 700 (grasselli et al., 2020a; manca, 2020a) and in winter months those beds are usually 85-90% occupied (grasselli et al., 2020a). the pandemic called for doubling that number at an incredible pace to cope with the repeated tsunami waves of very complicated patients affected by dip (i.e. diffuse interstitial pneumonia). those patients required ever-increasing treatments ranging from oxygen masks, to helmet c-pap, to niv, and eventually to tracheal intubation (desai & aronoff, 2020). in some hospitals, such as the lodi hospital in lombardy (the one where the first covid-19 italian patient was diagnosed and sheltered in icu: a 38-year-old sporty male with no comorbidities who remained in the icu ward for 18 days before moving to pip, i.e. post-intensive care, for another 14 days), there were 6 icu beds before the pandemic (cutuli, 2020). in a matter of a few weeks, the beds became 18 and eventually 27 at the peak of the emergency (i.e. 4.5 times more). this called for huge efforts in terms of burden on medical doctors, nurses, and the managing team, who literally transformed the original wards, operating rooms, and recovery rooms to set up and operate that high number of new icu beds.
the mathematical models describing the phenomenon dynamics, which are reported in this paper, allowed understanding the fast evolution and preparing daily for the continuously increasing burden. besides the predicted numbers, those models also allowed forecasting the different phases of the pandemic and quantifying some basic indicators about the daily variations, the key times, the key figures, the expected decrease, and the progressive approach to a maximum plateau before the decrease of icu beds for covid-19 that we are measuring right now. together with the icu beds, which are reported by the italian civil protection department every evening at 6 pm cet (dipartimento della protezione civile, 2020), we also monitored the dynamics of official deaths. official deaths are those that occur to patients sheltered in hospitals after they test positive to a nasal swab and possibly a ct scan. official deaths are important as they measure one of the two possible outcomes of hospital treatment: success or failure. physicians have to know and somehow forecast not only the daily number of deaths but also the asymptotic value predicted at the end of the pandemic. such a number is a heavy psychological burden for those who struggle for patients' lives (e.g., medical doctors, nurses), but it can prepare them to deal with that fatal outcome and somehow provides an upper bound of the phenomenon, as shown in the following. as detailed in section 2, the mathematical models can be incomplete in the sense that other models might describe the dynamics of real data with the same or even higher efficacy. nonetheless, we will show the reliability, robustness, and quality of the suggested models, which can describe the phenomenon evolution with a high degree of precision. the aim of this paper is to make the reader aware of the main features of a pandemic in terms of its dynamic evolution based on key points, key intervals, key dates, and key numbers.
the mathematical models can be used to get a background on past events, understand the current situation, and forecast the pandemic evolution over either short-term or long-term horizons. the qualitative and quantitative assessment of the pandemic dynamics allows medical decision-makers to plan for the emergency, prepare for severe times, and finally relax the safety measures and revert to standard elective medicine when the pandemic deflates. the models can outline possible deviations of the pandemic from the expected evolution and therefore play the role of early warning systems in case of new outbreaks. mathematical modeling can be of real help in describing, understanding, and eventually forecasting how a virus diffuses within a domain (e.g., population, province, region, country, continent). mathematical models feature a set of mathematical equations that include a number of adaptive parameters that can be determined numerically based on available real data (panovska-griffiths, 2020). once these models are refined and the adaptive parameters computed, one can use them to understand what happened in the past (e.g., lessons learnt) and, even more important, to forecast what might happen in the future (e.g., emergency planning, resource allocation) (he et al., 2020; poston et al., 2020; steinberg et al., 2020). this section focuses on describing the time evolution (i.e. dynamics) of the covid-19 pandemic, centering on the two most reliable and therefore robust variables, namely the number of icu patients (aka icu beds) and official deaths in hospitals. these data are reported daily by the italian civil protection department (dipartimento della protezione civile, 2020). day 1 is when the first patient was sheltered in an intensive care unit (22-feb-2020 in italy). we chose to describe and follow, but also to predict, the dynamics of those real values with the simplest mathematical models available.
such models were carefully chosen following a keep-it-simple approach. the models are regression-based, which means that they minimize the distance between model predictions and real data. we selected the models that best describe the dynamics of the phenomenon and feature the lowest number of adaptive parameters. indeed, we only used either two or three adaptive parameters to keep the approach simple, avoid overparameterization, and preserve numerical robustness and stability (manca, 2020a, 2020b). this approach allows implementing those models on any computer without the necessity of relying on dedicated programming tools. intentionally, we implemented a set of math models that can run on excel, as it is widely available on all major operating systems (e.g., windows, mac, linux) and works on several platforms (e.g., computers, tablets, mobile phones). in addition, excel (even though it is not a real programming environment) has a gentler learning curve than programming tools such as matlab or mathematica or programming languages such as fortran, c/c++, python (just to cite a few). the supplemental material to this manuscript includes an excel file with the proposed models that can be used to extract important information about past, present, and future situations, and to draw the diagrams that describe the phenomenon dynamics and allow understanding its qualitative and quantitative evolution. once the models are identified (i.e. regressed with respect to real data), they provide values of the adaptive parameters which play a role in quantifying some important indicators about the pandemic. we intentionally did not use any epidemiologic models such as the sir, sird, seir, seird models, which are based on a set of few differential equations with initial conditions and a number of adaptive parameters as well as strong assumptions and simplifications (c.h. li et al., 2014; g.h. li & zhang, 2017; magal et al., 2016; xia et al., 2009).
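the regression-based identification described above can be illustrated with a short python sketch (the editor's, not the authors' excel workbook): synthetic exponential data are generated and the two adaptive parameters are recovered by minimizing the sum of squared distances between model and data with a coarse grid search.

```python
import numpy as np

def fit_exponential(days, counts):
    """identify y = a * 10**(b*t) by minimizing the sum of squared
    residuals (an equation-(6)-style objective) over a coarse grid."""
    best_sse, best_a, best_b = np.inf, None, None
    for a in np.linspace(1.0, 50.0, 200):
        for b in np.linspace(0.01, 0.3, 200):
            sse = np.sum((counts - a * 10.0 ** (b * days)) ** 2)
            if sse < best_sse:
                best_sse, best_a, best_b = sse, a, b
    return best_a, best_b

# synthetic data from a known exponential: a = 12 patients, b = 0.12
days = np.arange(1, 13)
counts = 12.0 * 10.0 ** (0.12 * days)
a_hat, b_hat = fit_exponential(days, counts)
```

with real icu counts, this same objective is what the excel solver (or any unconstrained minimizer) would optimize; the grid search is used only to keep the sketch dependency-free.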
the family of sir models describes the phenomenon in greater detail. however, in our opinion, their use is best suited to running parametric predictive scenarios based on a number of assumptions and hypotheses that are quite sensitive to the selection of proper values of the adaptive parameters and functional description. reliable values of those parameters will be available only at the end of the pandemic and depend significantly on the political decisions endorsed at different times and intensities by each country (or even each region locally) on social-distancing measures and, more generally, on nonpharmaceutical interventions (bayham & fenichel, 2020; cowling et al., 2020). conversely, the math models proposed in this paper can be used daily (by updating the icu and deaths data respectively) and automatically adapt to the evolution of those values. the phenomena behind icu beds and deaths follow two different evolutionary curves in case of a pandemic. icu beds start from zero at the very beginning of the pandemic and keep increasing as the pandemic deploys its intensity, up to a maximum value. after the plateau, the icu beds start decreasing and reach a final null value when the pandemic expires. conversely, the deaths number keeps increasing monotonically and reaches a final value, which is the fatalities toll the country/region has to pay to the pandemic plague. actually, the deaths number is a cumulative value that increases with the daily fatalities. as far as big regions and countries are concerned, the number of icu beds increases monotonically up to the maximum plateau. this is what both lombardy and italy (onder et al., 2020) dealt with. conversely, small regions in terms of population and/or positive cases may experience non-monotonically increasing trends.
for instance, this is the case of umbria, a central region of italy with 882,000 inhabitants living in small municipalities distributed over its territory, which experienced a trend of icu beds quite different from the most involved italian regions, the northern ones. indeed, umbria often saw the number of icu beds remain constant for a few days. now and then, that number decreased and then increased again with rather small fluctuations (i.e. a few units per day) due to the tiny number of infected people. for this kind of regions/countries, the models and trends discussed in this paper are not recommended and should be avoided. back to lombardy: after the fast explosion of icu patients following an exponential growth (manca, 2020a; remuzzi & remuzzi, 2020), it experienced a saturation condition protracted over several days (manca, 2020b) before reaching the maximum plateau and turning onto the descent trajectory. that saturation condition was not constant, and the healthcare system was flexible enough to increase the number of icu beds, albeit to a lower extent than the necessary one. the "saturation" term means that the capacity of daily increasing the number of beds, to cope with the tsunami wave of new patients requiring an icu treatment, was lower than the compulsory number of new icu beds. it is worth observing that the continuous creation of new icu beds led to a subtle drift in terms of treatment quality, as new beds were created under huge pressure, beyond the nominal capacity of already existing wards and of intensive/step-down areas reconfigured in a matter of a few days. coupling the saturation condition with the quality of new icu beds had an impact on the treatment quality of icu patients and consequently on the fatalities toll.
indeed, there was also a limiting factor played by the number of physicians and nurses subject to over-intensive activities, with shifts of 15-16 hours per day, day after day for more than one month, sometimes being themselves hit by covid-19 (more than 150 medical doctors lost their lives due to covid-19) (adnkronos, 2020). for the sake of simplicity, let us focus first on the qualitative trend of icu patients. the first days of the pandemic see an exponential increase (remuzzi & remuzzi, 2020). an exponential curve is characterized by a doubling period that remains constant if no external measures or disturbances occur. it is rather easy to show that a phenomenon is exponential when it fits a straight line in a semilogarithmic diagram, i.e. a diagram where the x-axis is linear (time, number of days) and the y-axis reports the logarithm of the real values (icu beds). it is possible to measure the quality of the fitting curve (i.e. straight line) with respect to real data by means of the determination coefficient (r^2). this is a nondimensional coefficient ranging from 0 to 1 (hahs-vaughn, 2016). the higher the value, the better the consistency of the model with real data. in the case of both italy and lombardy, the first epidemic days saw r^2 values with either two or three nines as decimals, almost equal to 1, i.e. r^2 = 0.99, ..., 0.999. the exponential growth remains stable for about 15-16 days. for the sake of correctness, every day the slope of the straight line decreases slightly and the intercept (i.e. the point where the line cuts the y-axis) increases slightly. this is intrinsic to the epidemic dynamics and shows that the phenomenon starts reducing its momentum, although it remains exponential (see also figure 1). there is a time, days 16-22, when new values of icu beds start departing from the straight line and begin to follow a parabola on the semilog plane. this is the case of the so-called exponentially modified gaussian (emg) trend (golubev, 2010).
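the semilog linearity test can be sketched in python (a hypothetical illustration on synthetic data, not the authors' spreadsheet): fit a straight line to log10 of the counts, inspect the determination coefficient, and read the doubling time off the slope.

```python
import numpy as np

def semilog_fit(days, counts):
    """fit log10(counts) = log10(a) + b*days; return (a, b, r_squared)."""
    y = np.log10(counts)
    b, intercept = np.polyfit(days, y, 1)      # slope first, then intercept
    y_hat = intercept + b * days
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 10.0 ** intercept, b, 1.0 - ss_res / ss_tot

def doubling_time(b):
    """days for a 10**(b*t) growth to double: log10(2) / b."""
    return np.log10(2.0) / b

# purely exponential synthetic series: a = 10, b = 0.1 (doubling ~3 days)
days = np.arange(1, 16)
counts = 10.0 * 10.0 ** (0.1 * days)
a, b, r2 = semilog_fit(days, counts)
```

an r^2 close to 1 on the log-transformed series is exactly the criterion the text uses to declare the phase still exponential.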
on a standard cartesian plane, the phenomenon is well described by two distinct regions: the first one with upward concavity and the second one with downward concavity. the upward concavity reflects the exponential trend, which becomes progressively linear up to the inflection point, where the concavity changes from upward to downward. at the inflection point, the maximum positive velocity of change occurs. in math terms, we say that at the inflection point the first derivative (i.e. the daily change of the observed variable, icu beds) reaches its highest positive value (i.e. maximum increment). these are the toughest days to stand the tsunami wave, as day after day the increment of icu beds is the highest one. in addition, the bigger the region/country with its number of infected patients, the higher its inertia. after the inflection point, the phenomenon continues to increase but at a slower pace. in this phase, the increase is monotonic, which means that the number of icu beds continues to climb, but the daily growth reduces progressively up to a maximum plateau (around days 39-43). again, the inertia of the system is usually high and, instead of having a train at the top of a rollercoaster that quickly starts falling, the plateau remains stable for a few days. that maximum plateau is critical for the whole system as the icu stay is rather long (murthy et al., 2020). usually, patients remain in icu wards at least fifteen days (with a twenty-day stay as the standard value) (cutuli, 2020) and, with respect to the covid-19 emergency, this rather long time allows describing the whole icu-bed inflation period with curves such as the logistic (hosmer et al., 2013) or the gompertz (panik, 2014) ones. at the maximum plateau, small oscillations may occur but finally the system starts decreasing (days 42-45). the reverse trend of the first ascending period occurs in this second descending phase, down to a final null value when the pandemic is out.
when the descent starts, the concavity is downward and becomes upward after a new inflection point (with negative slope, i.e. downhill). before the inflection point (days 60-70), the phenomenon increases its velocity of reduction of icu beds. after the inflection, the reduction pace slows down and progressively approaches the final null plateau (days 105-120). for the sake of simplicity, the ascending and descending tracts of the overall icu-bed phenomenon can be described with two separate segments of suitable models such as (i) the exponentially modified gaussian (emg), (ii) the logistic, and (iii) the gompertz curves. all these models are grounded in either physical or biological foundations. the logistic model (hosmer et al., 2013) was originally proposed in 1838 by pierre françois verhulst to describe the growth of populations where the rate of reproduction depends on both the existing population and the amount of available resources. the gompertz model (panik, 2014) is similar to the logistic one and was designed by benjamin gompertz in 1825 to describe the law of human mortality. the emg model (golubev, 2010) fits well processes involving normally-distributed inputs and exponentially-distributed outputs. emg characterizes the transition probability of cellular cycles and embraces both the deterministic and probabilistic contributions. the same qualitative discussion made to model the inflation period of icu beds may be adapted to model the regional/national fatalities. contrary to icu beds, the deaths curve is only monotonically increasing and finally approaches the maximum plateau once the pandemic expires. the deployment of that curve is longer and shows a lag of about 12-14 days with respect to the icu curve in proximity of the first ascending inflection point. it is worth observing that the fatalities curve is a pure cumulative curve (i.e. integral curve) that sums up the single death tolls experienced daily.
quantitatively, the inflection point, when the maximum daily fatalities occur, arrives at days 34-38 and it takes a total of 100-120 days to reach 98% of the final maximum plateau. mathematically, the logistic and gompertz models predict the practical extinction of deaths after days 150-200. the hiatus between the logistic and gompertz models increases as the pandemic deploys its dynamics. the predicted numbers should be taken with a grain of salt as (i) these are just forecasts made when this manuscript was written; (ii) the fatalities phenomenon depends heavily on the social-distancing measures enacted by regions and countries, together with the very uncertain outcomes that the so-called phase 2 will produce when people progressively reduce social distancing and start again to live and work as before the pandemic, although with a higher awareness of safety and health-related risks. only time will tell if these forecasts approach the real phenomenon or further modeling issues should be accounted for. the mathematical description of the exponential model is:

y(t) = a * 10^(b*t)    (1)

the mathematical description of the logistic model is:

y(t) = a / (1 + c * e^(-b*t))    (2)

the mathematical description of the gompertz model is:

y(t) = a * e^(-c * e^(-b*t))    (3)

where t is time, a, b, c are adaptive parameters, and e = 2.718281828459045235... is euler's number. y is the dependent variable predicted by the model (i.e. icu patients or deaths). the exponential model features two parameters, whilst the logistic and gompertz ones feature three parameters. as discussed in section 2.2, the approach to model parameterization is the minimal one and in line with the keep-it-simple philosophy. as far as the independent variable is concerned, t can be considered either continuous or, more correctly, discrete as it corresponds to the number of days since the start of the pandemic in a specific region/country.
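the three growth models can be transcribed directly into code; the snippet below is an editor's sketch, assuming the parameterizations y = a*10^(b*t), y = a/(1 + c*e^(-b*t)), and y = a*e^(-c*e^(-b*t)) for the exponential, logistic, and gompertz models respectively, which makes the shared plateau a easy to verify numerically.

```python
import numpy as np

def exponential(t, a, b):
    """exponential model, two adaptive parameters: y = a * 10**(b*t)."""
    return a * 10.0 ** (b * t)

def logistic(t, a, b, c):
    """logistic model, three adaptive parameters; plateau a as t grows."""
    return a / (1.0 + c * np.exp(-b * t))

def gompertz(t, a, b, c):
    """gompertz model, three adaptive parameters; asymmetric sigmoid, plateau a."""
    return a * np.exp(-c * np.exp(-b * t))
```

both sigmoids tend to the same plateau a for large t, while the exponential model has no plateau at all, which is why it only fits the first days of the outbreak.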
equation (1) in semilogarithmic coordinates assumes the form:

log10(y) = log10(a) + b*t    (4)

which is the equation of a line whose slope is b and whose intercept is log10(a). it is straightforward to determine the doubling time of the exponential growth from equation (4):

t_d = log10(2) / b    (5)

the higher the slope of the line (equation 4), the shorter the doubling time of the exponential phenomenon. the logistic curve (equation 2) is symmetric with respect to the inflection point and the sigmoid function is a special case of the logistic function (hosmer et al., 2013). the gompertz curve is qualitatively similar to the logistic function but it is not symmetric and takes a longer time to reach the final asymptotic plateau. in addition, the gompertz plateau is higher than the logistic one (once the parameters of both curves are identified to minimize the distance from the same set of real data). table 1 shows the analytic formulae that allow computing the inflection and halfway points. the inflection point identifies the time when the rate of change is highest, whilst the halfway point determines the time when the phenomenon reaches half of the maximum plateau. it is worth observing that the time when the halfway condition occurs does not mean that after an equivalent amount of time the whole phenomenon completes. indeed, the halfway condition refers to the dependent variable (y) and not to the independent one (t). both the logistic and the gompertz models have the same maximum plateau: when t -> infinity, y -> a. for the sake of clarity, each curve has its own a value, computed by minimizing the sum of the squared distances between the real data (whose cardinality is np) and the model predictions through the following nonlinear regression procedure (hosmer et al., 2011):

min over a,b,c of: sum for i = 1..np of (y_i^real - y_i^model)^2    (6)

the minimization of equation (6) is multidimensional and unconstrained, even though the optimizing procedure may be eased by specifying that the degrees of freedom (i.e. the adaptive parameters) are positive.
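the inflection and halfway points of table 1 are not reproduced in this text; under the logistic and gompertz parameterizations y = a/(1 + c*e^(-b*t)) and y = a*e^(-c*e^(-b*t)), they can be derived in closed form and checked numerically, as in this python sketch (the formulas below are the editor's derivation, not copied from the paper's table 1):

```python
import numpy as np

def logistic_inflection(b, c):
    """logistic y = a/(1 + c*e**(-b*t)): inflection = halfway point (y = a/2)."""
    return np.log(c) / b

def gompertz_inflection(b, c):
    """gompertz y = a*e**(-c*e**(-b*t)): inflection where y = a/e."""
    return np.log(c) / b

def gompertz_halfway(b, c):
    """time when the gompertz curve reaches a/2: solve c*e**(-b*t) = ln 2."""
    return np.log(c / np.log(2.0)) / b

# numerical check against the model definitions
a, b, c = 1000.0, 0.15, 30.0
t_log = logistic_inflection(b, c)
y_log = a / (1.0 + c * np.exp(-b * t_log))
t_gom = gompertz_halfway(b, c)
y_gom = a * np.exp(-c * np.exp(-b * t_gom))
```

since a/e < a/2, the gompertz halfway point falls after its inflection point, consistent with the slower approach to the plateau described in the text.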
once the model is identified, it is possible to answer the following question: how much time does the phenomenon take to reach a given percentage of the maximum plateau? for instance, at what time will the death phenomenon reach 98% or 99% of the asymptotic condition (i.e. the maximum final plateau when the pandemic expires)? mathematically, one has to solve the following nonlinear algebraic equation:

y(t) = p * a    (7)

where p is the desired fraction (e.g., p = 0.98 if the desired percentage is 98%). an easier approach to solve equation (7) consists in tabulating the model predictions as a function of time (e.g., in an excel spreadsheet) and searching for the first time when the phenomenon overtakes that percentage (as the asymptotic value a is known). equations (1-3) are good at describing the monotonically increasing phase of icu patients and the whole fatalities phenomenon. once the icu phenomenon touches its maximum value it starts decreasing, as discussed in section 2.2. the increase-plateau-decrease region can be described by an exponentially modified gaussian (emg) model whose mathematical description is:

y(t) = 10^c * 10^(b*t) * 10^(a*t^2)    (8)

where 10^c is the multiplying factor, 10^(b*t) is the exponential term, and 10^(a*t^2) is the gaussian term. equation (8) is even more flexible as it can be applied to the uphill part of the icu phenomenon (besides the downhill one). for both the uphill and downhill parts, the emg model always showed positive values for b, c and negative values for a, which is a mandatory condition for the gaussian contribution. the weak point of emg is that it is rather precise in predicting values over short-time intervals, whilst it is less reliable over longer periods when applied to the uphill portion of the phenomenon. however, across the maximum plateau and the descending section, the emg predictive performance is rather good.
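the tabulation approach suggested for equation (7) is easy to automate; the python sketch below scans daily gompertz predictions for the first day at which a fraction p of the plateau a is exceeded (the parameter values are hypothetical illustrations, not the authors' fitted values).

```python
import numpy as np

def gompertz(t, a, b, c):
    return a * np.exp(-c * np.exp(-b * t))

def first_day_above(p, a, b, c, horizon=365):
    """scan daily model predictions and return the first day with
    y(t) >= p * a, i.e. a tabular solution of equation (7)."""
    for t in range(1, horizon + 1):
        if gompertz(t, a, b, c) >= p * a:
            return t
    return None

# hypothetical parameters for a cumulative death curve
day_98 = first_day_above(0.98, a=34000.0, b=0.05, c=20.0)
```

this mirrors the spreadsheet procedure described in the text: no root-finding routine is needed because the day grid is discrete anyway.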
the evaluation of the daily rate of change of the phenomenon can be carried out either in a discrete way, by computing the difference between two consecutive model predictions (for instance in the excel spreadsheet), or by analytic differentiation (y' = dy/dt). the first derivative of the exponential function is:

y' = a * b * ln(10) * 10^(b*t)    (10)

the first derivative of the logistic function is:

y' = a * b * c * e^(-b*t) / (1 + c * e^(-b*t))^2    (11)

the first derivative of the gompertz function is:

y' = a * b * c * e^(-b*t) * e^(-c * e^(-b*t))    (12)

the first derivative of the emg function is:

y' = ln(10) * (b + 2*a*t) * 10^(c + b*t + a*t^2)    (13)

once the logistic and gompertz curves reach the maximum plateau, they conclude their scope as far as icu beds are concerned. however, their intrinsic monotonic nature can be suitably exploited to describe the descent towards the end of the pandemic, when icu beds are null. both the logistic and gompertz models can be rewritten according to a reverse formulation with respect to the original one. it is sufficient to translate them, change their sign, and move the initial condition to the maximum plateau, as reported in the following. the mathematical description of the reverse logistic model is:

y(t) = a / (1 + c * e^(b*(t - t0)))    (14)

the mathematical description of the reverse gompertz model is:

y(t) = a * e^(-c * e^(b*(t - t0)))    (15)

for the sake of clarity, parameters a and t0 are known as they are the plateau value and the corresponding time when the maximum of icu beds is reached, respectively. consequently, equations (14-15) reduce to two adaptive parameters, b, c. both equations (14-15) exhibit the same final null plateau when the pandemic expires. finally, the first derivatives of the reverse logistic and gompertz models are respectively:

y' = -a * b * c * e^(b*(t - t0)) / (1 + c * e^(b*(t - t0)))^2    (16)

y' = -a * b * c * e^(b*(t - t0)) * e^(-c * e^(b*(t - t0)))    (17)

the models reported in section 2.3 can be selected and used critically according to their performance and consistency with real data published daily in most countries. it is necessary to determine the values of the adaptive parameters by means of an identification procedure based on either a linear or a nonlinear regression of real data, depending on the mathematical nature of the model (h.h. zhang et al., 2011).
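one plausible reading of the reverse formulation (translate by t0, flip the sign of the exponent, start from the known plateau a) can be sketched as follows; the exact functional forms and the parameter values here are the editor's assumptions, used only to verify the qualitative properties stated in the text, namely a monotone descent from the plateau towards a null asymptote.

```python
import numpy as np

def reverse_logistic(t, b, c, a, t0):
    """descending logistic: close to the plateau a at t0, decays to zero."""
    return a / (1.0 + c * np.exp(b * (t - t0)))

def reverse_gompertz(t, b, c, a, t0):
    """descending gompertz: mirror of the growth curve, null final plateau."""
    return a * np.exp(-c * np.exp(b * (t - t0)))

# hypothetical parameters: plateau of 1400 icu beds reached at day 41
a, t0 = 1400.0, 41.0
t = np.arange(41, 160)
rl = reverse_logistic(t, 0.08, 0.02, a, t0)
rg = reverse_gompertz(t, 0.08, 0.02, a, t0)
```

with a and t0 fixed from the observed peak, only b and c remain to be regressed, matching the two-parameter count mentioned in the text.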
the good news is that excel implements a multidimensional optimizer, which is capable of solving the minimization problem of equation (6). any other programming environment can solve that same problem as well, provided a multidimensional unconstrained optimization routine is available. the interested reader can refer to the supplemental material to identify the models and their parameters. one of the performance indicators used to discriminate among the models and find the most representative one is rmse, i.e. the root mean square error:

rmse = sqrt( (1/np) * sum for i = 1..np of (y_i^real - y_i^model)^2 )    (18)

equally, one can evaluate either the mean (mae) or the median (medae) absolute error:

mae = (1/np) * sum for i = 1..np of |y_i^real - y_i^model|    (19)

medae = median over i = 1..np of |y_i^real - y_i^model|    (20)

this section shows how to choose, use, and extract important information from the proposed models. the case study is applied to both italy and lombardy in terms of icu patients and deaths. in the very first days of the pandemic, the phenomenon is purely exponential (equation 1). this assumption proved true after observing the linear trend of the dependent variables (i.e. icus and deaths) in semilogarithmic coordinates (equation 4) and evaluating the determination coefficient. figure 2 shows how, in the first days of the pandemic, the icu beds grow exponentially, as the real data lie almost perfectly on a straight line. the linear trend is even stronger after day 10. it is worth remarking that the y-axis is logarithmic, which means that the integer values are powers of ten in linear coordinates. for the sake of clarity, 1 means 10 icu patients, 2 means 100 icus, and 3 means 1000 icus. the linear trend is confirmed by the determination coefficient r^2 that approximates 1 (i.e. r^2 = 0.9917229). as time progresses, the phenomenon progressively leaves the purely linear trend in semilog coordinates and moves towards a quadratic behaviour (in those same semilog coordinates) that is embodied by an emg model (see equations 8-9).
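the three error indicators are standard; a minimal python implementation (an editor's sketch using the usual textbook definitions of rmse, mean absolute error, and median absolute error) is:

```python
import numpy as np

def rmse(y_real, y_model):
    """root mean square error (equation 18)."""
    r = np.asarray(y_real, float) - np.asarray(y_model, float)
    return float(np.sqrt(np.mean(r ** 2)))

def mae(y_real, y_model):
    """mean absolute error (equation 19)."""
    r = np.asarray(y_real, float) - np.asarray(y_model, float)
    return float(np.mean(np.abs(r)))

def medae(y_real, y_model):
    """median absolute error (equation 20): robust to single anomalous days."""
    r = np.asarray(y_real, float) - np.asarray(y_model, float)
    return float(np.median(np.abs(r)))
```

on a series with one anomalous day, medae stays near zero while rmse inflates, which is why the median variant helps when daily reporting is irregular.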
figure 3 shows the high affinity of the real data with the emg model, which is also confirmed by the very high value of the determination coefficient (i.e. r^2 = 0.9991630). it is not common to have real data (i.e. experimental data) that are so consistent with a model. somehow and at first sight, the data appear tamed or, even worse, manipulated to make them smoother, although this is absolutely not true. actually, this smoothing effect is the result of large numbers and cumulated curves. in the following, we will illustrate some diagrams. one of the fitted curves practically coincides with the logistic predictions and, as a result, is almost invisible in the diagram. the trend of real data in lombardy fluctuates a bit more than the corresponding italian trend due to the extreme pressure and saturation condition exerted by the covid-19 emergency on that region. when lombardy exceeded the threshold of 900-1000 patients in intensive care units, it was a daily struggle against the covid-19 pandemic to create further intensive care beds to shelter the continuously increasing wave of patients requiring icu treatment and tracheal intubation. that was the challenge that absorbed the most from the medical doctors and nurses of intensive care wards (cutuli, 2020). consequently, when the first derivative is equal to zero (as for the emg model in figure 5), the maximum plateau of the corresponding model occurs. in lombardy, the emg model of figure 5 identifies the maximum plateau of the icu patients' trend at day 41 (2-apr) and the same happens for italy. equally, the minimum values of the first derivatives for the three models of figure 5 occur for lombardy around days 58-61 (19-22 april) and for italy around days 56-57 (17-18 april). these are the days when the highest daily decreases of icu patients are expected. afterward, the daily decrease continues but its intensity lowers and finally becomes negligible in proximity of the null plateau (see also figure 6). figure 4:
the red horizontal bottom line shows the 10% threshold with respect to the maximum measured value and identifies the times when most of the icu wards should be empty. last available real data on 30-apr. equally, the reverse logistic and reverse gompertz models predict that for italy the descent below the 10% threshold of icu patients will occur on 25-may and 28-may respectively (i.e. 52 and 55 days after the maximum plateau, which is also at days 94 and 97 respectively). these values are in line with the predictions for lombardy. equally, the last remaining 10 icu patients would turn out on 18-jul and 31-jul for the reverse logistic and reverse gompertz models respectively (i.e. 106 and 119 days after the maximum plateau, which is also at days 148 and 161 respectively, based on 30-apr real data availability). the description of death dynamics can ground on the same modeling approach based on the logistic, gompertz, and emg curves (equations 2, 3, and 8-9 respectively). for the sake of brevity, the initial exponential growth (equation 1) will not be reported, also because the aforementioned models feature in their formulation an initial part where the curve follows the exponential trend. the gompertz model also allows predicting (as of the latest data available on 30 april) that the total expected death toll will be 34037 and that 98% of that value will be reached on day 111 (11 june). it is worth observing that the gompertz curve is not symmetric and that after the inflection point the change of concavity makes it approach the final maximum plateau more slowly than other curves such as the logistic and emg ones. at the end of june (day 130), the gompertz model predicts that 99.17% of the whole phenomenon has manifested and that 282 fatalities remain before the pandemic is out. it is straightforward to remark that these are predictions based on a mathematical model that is subject to the availability of reliable data. the more the real data, the better.
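the plateau-detection criterion used for the emg model (maximum where the first derivative vanishes) has a closed form: for y = 10^(c + b*t + a*t^2) with a < 0, the derivative is zero at t = -b/(2a). a short python check follows, with hypothetical coefficients chosen by the editor so that the peak lands at day 41, the value reported for lombardy.

```python
def emg_plateau_day(a, b):
    """peak of y = 10**(c + b*t + a*t**2): dy/dt = 0 at t = -b/(2*a), a < 0."""
    if a >= 0:
        raise ValueError("the gaussian coefficient a must be negative")
    return -b / (2.0 * a)

# hypothetical coefficients placing the peak at day 41
t_peak = emg_plateau_day(a=-0.001, b=0.082)
```

note that the multiplying factor 10^c does not affect the peak location, only its height, so only a and b enter the formula.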
indeed, every day the national and regional reports release new data, and one can carry out forecasts that are more and more reliable. in addition, the model forecasts rest on the assumption that future values will be registered and made available under the same boundary conditions in terms of social distancing measures and data collection conditions. it is worth observing that the behaviour of italy in terms of pandemic dynamics is not uniform all over its territory. it would be a huge mistake to assume a uniform distribution of icu patients and fatalities across its regions, as lombardy alone accounts for more than 49% of the death toll (as of day 69). the models of section 2.3 applied to the case study of lombardy and italy proved their efficiency in reproducing real data and were used to forecast the evolution of key parameters such as the number of icu patients and deaths on both short and long time horizons. the same models can be applied to different countries and regions (hopman et al., 2020) if reliable and timely data are collected and shared by the public bodies of civil protection and health. the same models can also be applied to describe the dynamics of other variables, provided they are reliable and collected according to the same standards and methodologies. for instance, at least in italy, the swabs done to test possibly infected people followed different approaches in different regions and are not representative of the expected number of infected people (grasselli et al., 2020b; molinari et al., 2020). indeed, a patient initially positive to a swab has to receive two negative results in a row before being declared no longer contagious (e.g., a medical doctor of a lombardy hospital was tested with 9 consecutive swabs before being declared negative; he had to wait 46 days from the first positive swab). therefore, the total number of swabs is not representative of the number of people tested.
in addition, a large number of probably infected people have not been tested with a swab and nonetheless have to undergo a quarantine of at least 14 days. this happens to a large number of people living together at home with patients just released from hospitals, and to asymptomatic or paucisymptomatic individuals (kimball et al., 2020; nicastri et al., 2020). the three main models proposed in this paper (i.e. emg, logistic, and gompertz), either direct or reverse, can be used for qualitative and quantitative purposes in the covid-19 emergency. they play two separate roles. first, they track real data: they allow discriminating among models to find the most reliable one(s), and they reveal whether any unexpected trends start occurring. being analytic and fully developed in terms of functional dependency, they are continuous and feature analytic derivatives of any order. this means that not only the cumulated values but also the daily variations can be observed and monitored to understand if any drifts of the phenomenon are occurring. second, these models can be used to predict the evolution of the phenomenon over either short or long-term horizons. during the very first tough days, the most important forecasts are required over short-time intervals to understand the doubling times of icu beds and allocate/prepare suitable resources to cope with the repeated tsunami waves of patients with acute respiratory distress syndrome. when the pressure is at least partially released, the models can be used to predict when some safer thresholds will be reached. in those cases, it will be possible to decrease the number of covid-19 beds, to close dedicated wards, to reallocate human resources, and to restart elective hospital activities that were intentionally shut down or reduced to the minimum to focus on the demanding emergency.
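the "tracking" role described above is, operationally, a least-squares fit of an analytic curve to the cumulated series followed by a check of the determination coefficient. a minimal sketch with synthetic data (real icu series are not reproduced here; scipy is assumed available):

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, K, r, t0):
    """Cumulative logistic curve K / (1 + exp(-r*(t - t0)))."""
    return K / (1.0 + np.exp(-r * (t - t0)))

# Synthetic "daily report" series: a logistic trend plus observation noise
rng = np.random.default_rng(0)
t = np.arange(70)
data = logistic(t, 4000, 0.15, 30) + rng.normal(0, 30, t.size)

popt, _ = curve_fit(logistic, t, data, p0=[data.max(), 0.1, t.mean()])
resid = data - logistic(t, *popt)
r2 = 1 - resid.var() / data.var()   # determination coefficient
print(popt.round(2), round(r2, 5))
```

refitting as each day's report arrives updates the parameters and exposes drifts: a sudden drop in the determination coefficient flags an unexpected trend.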
also, the reported models have been exploited for decision making (see figure 9) to understand whether some extreme decisions in terms of resource allocation would timely meet the dynamics of the phenomenon (e.g., reallocation of wards, building new hospitals, moving icu patients to other hospitals and regions/countries). models can be compared for their consistency and precision in describing the phenomenon by evaluating suitable key performance indicators (kpis), such as the ones reported in equations (18-20) (k. zhang et al., 2015). as table 2 and table 3 show, there is not a clear best performer among the models used to describe the dynamics of icu patients. however, the gompertz model behaves on average better than the logistic one both uphill and downhill. the emg model can predict the transit across the maximum plateau before leaving the stage to the reverse logistic and gompertz models. the performance of the emg model is better in lombardy than in italy, and it is comparable with the uphill and downhill trends of the direct and reverse logistic and gompertz models. it is worth remarking that each kpi is characterized by a specific functional dependency, and therefore the values of a kpi should be used only to compare different models belonging to the same region/country (k. zhang et al., 2015). for instance, it is reasonable that the kpis of italy are higher than those of lombardy, as the involved numbers are higher for the whole nation than for that single region. indeed, the maximum plateau counted 4068 patients in italy and 1381 patients in lombardy. since the unit of measure of all the kpis reported in table 2 and table 3 is the number of icu patients, it is possible to observe that, proportionally, the same models were a bit less accurate in the predictions for lombardy than in those for italy.
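model comparison of this kind can be sketched in a few lines of code. the exact kpis of equations (18-20) are not reproduced in this excerpt, so the example below uses three standard stand-ins (mean, root-mean-square, and maximum absolute error), all in the unit of the data; the icu counts are invented for illustration.

```python
import numpy as np

def kpis(y_true, y_pred):
    """Error KPIs in the unit of the data (here: ICU patients).
    Stand-ins for the paper's eqs. (18-20), which are not shown in this excerpt."""
    e = np.asarray(y_true, float) - np.asarray(y_pred, float)
    return {"MAE": float(np.mean(np.abs(e))),
            "RMSE": float(np.sqrt(np.mean(e ** 2))),
            "MaxAE": float(np.max(np.abs(e)))}

obs = np.array([900, 1100, 1250, 1340, 1381, 1360])        # illustrative counts
gompertz_fit = np.array([880, 1120, 1240, 1330, 1375, 1365])
logistic_fit = np.array([850, 1150, 1290, 1310, 1420, 1400])
for name, pred in (("gompertz", gompertz_fit), ("logistic", logistic_fit)):
    print(name, kpis(obs, pred))
```

as the text notes, kpi values from different regions should not be compared directly, since their magnitude scales with the size of the phenomenon.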
however, the prediction capability of all the proposed models is rather accurate, as they describe the allocation of thousands of icu beds and can cope with the intrinsic oscillations of massive biological organisms (e.g., regions and nations) when they are hit by a pandemic that exerts paramount pressure and burden on their vital resources (livingston & bucher, 2020). the comparison of models for the prediction of fatalities is easier, as the phenomenon is somehow simpler: it is monotonically increasing, and once it passes the inflection point one has to wait only for the final plateau. in the case of fatalities, the gompertz model is the clear winner in terms of precision and reliability over the whole horizon of available real data (from the very beginning of the pandemic until day 69, 30 april 2020, for italy and lombardy). conversely, the logistic model had to be discarded after day 55, as its predictions were too optimistic: it calculated that the final plateau would occur earlier and with too low a fatality toll. since the numbers of deaths in italy and lombardy were rather high at day 69 (27967 and 13772 respectively), the kpi values reported in table 4 are pretty low with respect to the dimension of the phenomenon (where the expected final number of fatalities was 34037 for italy and 15519 for lombardy according to the projections of day 69, 30 april). the paper presented and discussed a few regression models to predict two of the most important variables in a pandemic from the point of view of decision-making and emergency planning: the number of icu patients who must be sheltered in dedicated covid-19 wards and the number of fatalities. these models can be applied to different regions and countries, as the pandemic phenomenon has the same qualitative features. the two (maximum three) adaptive parameters allow describing quantitatively the dynamics of those different regions and countries.
indeed, every region and every country is characterized by different features bound to the nature of its territory, population distribution (in terms of age, density, life-styles, human interactions, family habits), and political decisions (in terms of progressive/immediate, relaxed/inflexible lockdowns, social distancing, and other non-pharmaceutical interventions). the mathematics behind the proposed models is rather simple and can be implemented in an excel file that can be used by most decision makers and medical doctors. based on real data that most countries/regions produce daily, these regression models can be identified in terms of their adaptive parameters and used to forecast the trends of icu patients and deaths on either short or long-term horizons. these same models can also be adapted to track other variables, as far as the reliability of those variables is good enough to preserve their consistency. once identified, the models can determine (i) the doubling time of the phenomenon, (ii) the inflection point, where the daily increment of the phenomenon is either maximum positive (uphill) or maximum negative (downhill), and (iii) the maximum plateau. in addition, these models can evaluate the time when some conditions occur, such as the achievement of a fraction of the maximum plateau. by monitoring some suitable kpis, the user can assess the performance of the models and understand when they should be abandoned, or whether the reverse version of the model (in the case of the logistic and gompertz models) would better fit the new real data. the periodic update of these models, and the important details that one can extract by observing the diagrams that compare the prediction capabilities with respect to real data, allow detecting possible drifts of the phenomenon and can play the role of early-warning systems if the pandemic derails from the expected evolution.
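once a logistic curve n(t) = k/(1 + exp(-r(t - t0))) has been fitted, the three quantities (i)-(iii) above have closed forms: the early-phase doubling time is ln(2)/r, the inflection occurs at t0 with maximum daily increment r·k/4, and the plateau is k. a sketch with illustrative parameters (k is set near the italian icu plateau quoted earlier; r and t0 are hypothetical):

```python
import math

def logistic_summary(K, r, t0):
    """Closed-form epidemic descriptors of a fitted logistic curve."""
    return {
        "doubling_time": math.log(2) / r,   # early exponential phase
        "inflection_day": t0,               # day of maximum daily increment
        "max_daily_increment": r * K / 4,   # derivative value at t0
        "plateau": K,
    }

print(logistic_summary(K=4068, r=0.16, t0=25))
```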
coronavirus, 150 medici morti in italia
impact of school closures for covid-19 on the us health-care workforce and net mortality: a modelling study. the lancet
impact assessment of non-pharmaceutical interventions against coronavirus disease 2019 and influenza in hong kong: an observational study
covid-19: the lodi model (in italian)
masks and coronavirus disease 2019 (covid-19)
covid-19 italia - monitoraggio situazione
covid-19 - navigating the uncharted
exponentially modified gaussian (emg) relevance to distributions related to cell proliferation and differentiation
critical care utilization for the covid-19 outbreak in lombardy, italy: early experience and forecast during an emergency response
baseline characteristics and outcomes of 1591 patients infected with sars-cov-2 admitted to icus of the lombardy region
applied multivariate statistical concepts
how to transform a general hospital into an "infectious disease hospital" during the epidemic of covid-19
managing covid-19 in low- and middle-income countries
applied survival analysis: regression modeling of time to event data
applied logistic regression: third edition
asymptomatic and presymptomatic sars-cov-2 infections in residents of a long-term care skilled nursing facility - king county
analysis of epidemic spreading of an sirs model in complex heterogeneous networks
dynamic behaviors of a modified sir model in epidemic diseases using nonlinear incidence and recovery rates
coronavirus disease 2019 (covid-19) in italy
final size of an epidemic for a two-group sir model
analysis of the number growth of icu patients with covid-19 in italy and lombardy
dynamics of icu patients and deaths in italy and lombardy due to covid-19
sars-cov-2: the lombardy scenario in numbers
care for critically ill patients with covid-19
coronavirus disease (covid-19) in a paucisymptomatic patient: epidemiological and clinical challenge in settings
case-fatality rate and characteristics of patients dying in relation to covid-19 in italy
growth curve modeling: theory and applications
can mathematical modelling solve the current covid-19 crisis?
management of critically ill adults with covid-19
covid-19 and italy: what next? the lancet
calculated decisions: covid-19 calculators during extreme resource-limited situations
epidemics of sirs model with nonuniform transmission on scale-free networks
linear or nonlinear? automatic structure discovery for partially linear models
a comparison and evaluation of key performance indicator-based multivariate statistics process monitoring approaches
the authors acknowledge the valuable discussions with md piergiorgio villani (lodi hospital), md giovanni mistraletti (san paolo di milano hospital), and md francesco trotta (lodi hospital).
key: cord-175366-jomeywqr authors: massonis, gemma; banga, julio r.; villaverde, alejandro f. title: structural identifiability and observability of compartmental models of the covid-19 pandemic date: 2020-06-25 journal: nan doi: nan sha: doc_id: 175366 cord_uid: jomeywqr
the recent coronavirus disease (covid-19) outbreak has dramatically increased the public awareness and appreciation of the utility of dynamic models. at the same time, the dissemination of contradictory model predictions has highlighted their limitations. if some parameters and/or state variables of a model cannot be determined from output measurements, its ability to yield correct insights - as well as the possibility of controlling the system - may be compromised. epidemic dynamics are commonly analysed using compartmental models, and many variations of such models have been used for analysing and predicting the evolution of the covid-19 pandemic. in this paper we survey the different models proposed in the literature, assembling a list of 36 model structures and assessing their ability to provide reliable information. we address the problem using the control theoretic concepts of structural identifiability and observability.
since some parameters can vary during the course of an epidemic, we consider both the constant and time-varying parameter assumptions. we analyse the structural identifiability and observability of all of the models, considering all plausible choices of outputs and time-varying parameters, which leads us to analyse 255 different model versions. we classify the models according to their structural identifiability and observability under the different assumptions and discuss the implications of the results. we also illustrate with an example several alternative ways of remedying the lack of observability of a model. our analyses provide guidelines for choosing the most informative model for each purpose, taking into account the available knowledge and measurements. the current coronavirus disease pandemic, caused by the sars-cov-2 virus, continues to wreak unparalleled havoc across the world. public health authorities can use mathematical models to answer critical questions related to the dynamics of an epidemic (severity and time course of infected people), its impact on the healthcare system, and the design and effectiveness of different interventions [1-4]. mathematical modeling of infectious diseases has a long history [5, 6]. modeling efforts are particularly important in the context of covid-19 because its dynamics can be particularly complex and counter-intuitive due to the uncertainty in the transmission mechanisms, possible seasonal variation in both susceptibility and transmission, and their variation within subpopulations [7]. the media has given extensive coverage to analyses and forecasts using covid-19 models, with increased attention to cases of conflicting conclusions, giving the impression that epidemiological models are unreliable or flawed.
however, a closer look reveals that these modeling studies were following different approaches, handling uncertainty differently, and ultimately addressing different questions on different time-scales [8]. broadly speaking, data-driven models (using statistical regression or machine learning) can be used for short-term forecasts (one or a few weeks). mechanistic models based on assumptions about transmission and immunity try to mimic how the virus spreads, and can be used to formalize current knowledge and explore long-term outcomes of the pandemic and the effectiveness of different interventions. however, the accuracy of mechanistic models is constrained by the uncertainties in our knowledge, which create uncertainties in model parameters and even in the model structure [8]. further, the uncertainty in the covid-19 data and the exponential spread of the virus amplify the uncertainty in the predictions. predictability studies [9] seek the characterization of the fundamental limits to outbreak prediction and their impact on decision-making. despite the vast literature on mathematical epidemiology in general, and modeling of covid-19 in particular, comparatively few authors have considered the predictability of infectious disease outbreaks [9, 10]. uncertainty quantification [11] is an interconnected concept that is also key for the reliability of a model, and it has received similarly scant attention [12, 13]. in addition to predictability and uncertainty quantification approaches, identifiability is a related property whose absence can severely limit the usefulness of a mechanistic model [14]. a model is identifiable if we can determine the values of its parameters from knowledge of its inputs and outputs. likewise, the related control-theoretic property of observability describes whether we can infer the model states from knowledge of its inputs and outputs.
if a model is non-identifiable (or non-observable), different sets of parameters (or states) can produce the same predictions or fit to data. the implications can be enormous: in the context of the covid-19 outbreak in wuhan, non-identifiability in model calibrations was identified as the main reason for wide variations in model predictions [15]. reliable models can be used in combination with optimization and optimal control methods to find the best intervention strategies, such as lock-downs with minimum economic impact [16, 17]. further, they can be used to explore the feasibility of model-based real-time control of the pandemic [18, 19]. however, using calibrated models with non-identifiability or non-observability issues can result in bad or even dangerous intervention and control strategies. it is common to distinguish between structural and practical identifiability. structural non-identifiability may be due to the model and measurement (input-output) structure. practical non-identifiability is due to lack of information in the considered data-sets. non-identifiability results in incorrect parameter estimates and bad uncertainty quantification [14, 20], i.e. a misleading calibrated model which should not be used to analyze epidemiological data, test hypotheses, or design interventions. the structural identifiability of several epidemic mechanistic models has been studied e.g. in [21-26]. other recent studies have mostly focused on practical identifiability, such as [14, 20, 27-29]. in this paper we assess the structural identifiability and observability of a large set of covid-19 mechanistic models described by deterministic ordinary differential equations, derived by different authors using the compartmental modeling framework [30]. compartmental models are widely used in epidemiology because they are tractable and powerful despite their simplicity.
we collect 36 different compartmental models, of which we consider several variations, making up a total of 255 different model versions. our aim is to characterize their ability to provide insights about their unknown parameters, i.e. their structural identifiability, and unmeasured states, i.e. their observability. to this end we adopt a differential geometry approach that considers structural identifiability as a particular case of nonlinear observability, which allows us to analyse both properties jointly. we define the relevant concepts and describe the methods used in section 2. then we provide an overview of the different types of compartmental models found in the literature in section 3. we analyse their structural identifiability and observability and discuss the results in section 4, where we also show different ways of remedying lack of observability using an illustrative model. finally, we conclude our study with some key remarks in section 5. we consider models defined by systems of ordinary differential equations with the following notation: ẋ(t) = f(x(t), u(t), θ, w(t)), (1) y(t) = h(x(t), u(t), θ, w(t)), (2) where f and h are analytical (generally nonlinear) functions of the states x(t) ∈ r^(n_x), known inputs u(t) ∈ r^(n_u), unknown constant parameters θ ∈ r^(n_θ), and unknown inputs or time-varying parameters w(t) ∈ r^(n_w). the output y(t) ∈ r^(n_y) represents the measurable functions of model variables. the expressions (1-2) are sufficiently general to represent a wide range of model structures, of which compartmental models are a particular case. definition 1 (structurally locally identifiable [31]). a parameter θ_i of model m is structurally locally identifiable (s.l.i.) if for almost any parameter vector θ* ∈ r^(n_θ) there is a neighbourhood n(θ*) in which the following relationship holds: θ̂ ∈ n(θ*) and y(t, θ̂) = y(t, θ*) ⇒ θ̂_i = θ*_i. otherwise, θ_i is structurally unidentifiable (s.u.). if all model parameters are s.l.i., the model is s.l.i.; if there is at least one s.u. parameter, the model is s.u.
likewise, a state x_i(τ) is observable if it can be distinguished from any other states in a neighbourhood from observations of the model output y(t) and input u(t) in the interval t_0 ≤ τ ≤ t ≤ t_f, for a finite t_f. otherwise, x_i(τ) is unobservable. a model is called observable if all its states are observable. we also say that m is invertible if it is possible to infer its unknown inputs w(t), and we say that w(t) is reconstructible in this case. structural identifiability can be seen as a particular case of observability [32-34], by augmenting the state vector with the unknown parameters θ, which are now considered as state variables with zero dynamics, x̃ = (x^t, θ^t)^t. the reconstructibility of unknown inputs w(t), which is also known as input observability, can be cast in a similar way, although in this case their derivatives may be nonzero. to this end, let us augment the state vector further with w as additional states, as well as with their derivatives up to some non-negative integer l: x̃ = (x^t, θ^t, w^t, ẇ^t, ..., (w^(l))^t)^t. (4) the l-augmented dynamics is f̃(x̃, u) = (f^t, 0^t, ẇ^t, ..., (w^(l+1))^t)^t, leading to the l-augmented system ẋ̃(t) = f̃(x̃(t), u(t)), y(t) = h(x̃(t), u(t)). (5) remark 1 (unknown inputs, disturbances, or time-varying parameters). in section 4, when reporting the results of the structural identifiability and observability analyses, we will explicitly consider some parameters as time-varying. in the model structure defined in equations (1-2) the unknown parameter vector θ is assumed to be constant. to consider an unknown parameter as time-varying we include it in the "unknown input" vector w(t). thus, changing the consideration of a parameter from constant to time-varying entails removing it from θ and including it in w(t). the elements of w(t) can be interpreted as unmeasured disturbances or inputs of unknown magnitude or, equivalently, as time-varying parameters. regardless of the interpretation, they are assumed to change smoothly, i.e. they are infinitely differentiable functions of time.
for the analysis of some models it is necessary, or at least convenient, to introduce the mild assumption that the derivatives of w(t) vanish above a certain non-negative integer s (possibly s = +∞), i.e. w^(s)(t) ≠ 0 and w^(i)(t) = 0 for all i > s. this assumption is equivalent to assuming that the disturbances are polynomial functions of time, with maximum degree equal to s [35]. definition 2 (full input-state-parameter observability, fispo [35]). let us consider a model m given by (1-2). we augment its state vector as z(t) = (x(t)^t, θ^t, w(t)^t)^t (4), which leads to its augmented form (5). we say that m has the fispo property if, for every t_0 ∈ i, every model unknown z_i(t_0) can be inferred from y(t) and u(t) in a finite time interval [t_0, t_f] ⊂ i. thus, m is fispo if, for every z(t_0) and for almost any vector z*(t_0), there is a neighbourhood n(z*(t_0)) such that, for all ẑ(t_0) ∈ n(z*(t_0)), the following property is fulfilled: y(t, ẑ(t_0)) = y(t, z*(t_0)) ⇒ ẑ(t_0) = z*(t_0). in this paper we analyse input, state, and parameter observability, that is, the fispo property defined above, using a differential geometry framework. such analyses are structural and local. by structural we refer to properties that are entirely determined by the model equations; thus we do not consider possible deficiencies due to insufficient or noise-corrupted data. by local we refer to the ability to distinguish between neighbouring states (similarly, parameters or unmeasured inputs), even though they may not be distinguishable from other distant states. this is usually sufficient, since in most (although not all, see e.g. [36]) applications local observability entails global observability. this specific type of observability has sometimes been called local weak observability [37]. this approach assesses structural identifiability and observability by calculating the rank of a matrix that is constructed with lie derivatives.
the corresponding definitions are as follows (in the remainder of this section we omit the dependency on time to simplify the notation): definition 3 (extended lie derivative [38]). consider the system m (1-2) with augmented state vector (4) and augmented dynamics (5). assuming that the inputs u are analytical functions, the extended lie derivative of the output along f̃ = f̃(·, u) is l_f̃ h(x̃) = (∂h/∂x̃) f̃(x̃, u) + Σ_(j≥0) (∂h/∂u^(j)) u^(j+1). the zero-order derivative is l^0_f̃ h = h, and the i-order extended lie derivatives can be recursively calculated as l^i_f̃ h(x̃) = l_f̃ (l^(i-1)_f̃ h(x̃)), i ≥ 1. definition 4 (observability-identifiability matrix [35]). the observability-identifiability matrix of the system m (1-2) with augmented state vector (4), augmented dynamics (5), and analytical inputs u is the following (n_y · ñ_x) × ñ_x matrix: o_i(x̃, u) = ∂/∂x̃ (h^t, (l_f̃ h)^t, (l^2_f̃ h)^t, ..., (l^(ñ_x - 1)_f̃ h)^t)^t. the fispo property of m can be analysed by calculating the rank of the observability-identifiability matrix: theorem 1 (observability-identifiability condition, oic [38]). if the observability-identifiability matrix of a model m satisfies rank(o_i(x̃_0, u)) = ñ_x = n_x + n_θ + n_w, with x̃_0 being a (possibly generic) point in the augmented state space, then the system is structurally locally observable and structurally locally identifiable. in this paper we generally check the oic of theorem 1 using strike-goldd, an open source matlab toolbox [39]. alternatively, for some models we use the maple code observabilitytest, which implements a procedure that avoids the symbolic calculation of the lie derivatives and is hence computationally efficient [33]. a number of other software tools are available, including genssi2 [40] in matlab, identifiabilityanalysis in mathematica [38], daisy in reduce [41], sian in maple [42], and the web app combos [43]. it should be taken into account that in the present work we are interested in assessing structural identifiability and observability both with constant and with continuous time-varying model parameters (or equivalently, with unknown inputs), as explained in remark 1.
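the oic of theorem 1 can be checked directly with a computer algebra system. the sketch below applies it to a basic sir model with output y = i and unknown constant β and γ, appended to the state vector with zero dynamics as described above. this is an illustrative sympy reimplementation, not the strike-goldd code used in the paper.

```python
import sympy as sp

S, I, R, beta, gamma = sp.symbols('S I R beta gamma')
x = sp.Matrix([S, I, R, beta, gamma])     # state augmented with the parameters
f = sp.Matrix([-beta * S * I,             # dS/dt
               beta * S * I - gamma * I,  # dI/dt
               gamma * I,                 # dR/dt
               0, 0])                     # parameters: zero dynamics
h = sp.Matrix([I])                        # measured output y = I

# Stack the Jacobians of h, Lf h, ..., Lf^{n-1} h to build the OI matrix
lie, rows = h, []
for _ in range(len(x)):
    J = lie.jacobian(x)
    rows.append(J)
    lie = J * f                           # next Lie derivative along f
OI = sp.Matrix.vstack(*rows)

print(OI.shape, OI.rank())                # rank 4 < 5: the model is not FISPO
print(OI[:, 2].T)                         # the zero column corresponds to R
```

the rank deficit is exactly the column of r: no lie derivative of y = i depends on r, so r is unobservable unless measured directly, while s, β and γ are recoverable. this matches the pattern reported for the basic sir models in section 4.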
ideally, the method of choice should provide a convenient way of analysing models with this type of parameters (inputs). it is always possible to perform this type of analysis by assuming that the time dependency of the parameters is of a particular form, e.g. a polynomial function of a certain maximum degree. in this article we review compartmental models, which are one of the most widely used families of models in epidemiology. they divide the population into homogeneous compartments, each of which corresponds to a state variable that quantifies the number of individuals that are at a certain disease stage. the dynamics of these compartments are governed by ordinary differential equations, usually with unknown parameters that describe the rates at which individuals move among different stages of disease. the basic compartmental model used for describing a transmission disease is the sir model, in which the population is divided into three classes:
• susceptible: individuals who have no immunity and may become infected if exposed.
• infected and infectious: an exposed individual becomes infected after contracting the disease. since an infected individual has the ability to transmit the disease, he/she is also infectious.
• recovered: individuals who are immune to the disease and do not affect its transmission.
another class of models, called seir, includes an additional compartment to account for the existence of a latent period after the transmission:
• exposed: individuals who have contracted the disease but are still in the incubation period and not yet infectious.
these idealized models differ from reality. contact tracing, screening, or changes in habits are some differences that are not considered in basic sir or seir models, but are important for evaluating the effects of an intervention. furthermore, it is not only important to enrich the information about the behaviour of the population; the characteristics of the disease must also be taken into account.
these additional details can be incorporated into the model as new parameters, functions, or extra compartments. compartments such as asymptomatic, quarantined, isolated, and hospitalized have been widely used in covid-19 models. from 29 articles, most of which are very recent [10, 15, ...], we have collected 36 models. depending on whether they have an exposed compartment or not, they can be broadly classified as belonging to the sir or seir families. however, most of these models include additional compartments. susceptible individuals become infected with an incidence of βs(t)i(t), where β = pc is the transmission rate, c is the contact rate and p the probability that a contact with a susceptible individual results in a transmission [6]. individuals who recover leave the infectious class at rate γ, where 1/γ is the average infectious period. the set of differential equations describing the basic sir model is given by: ds/dt = -βs(t)i(t), di/dt = βs(t)i(t) - γi(t), dr/dt = γi(t). as mentioned above, compartmental models can be extended to consider further details. we have found models that incorporate the following features: asymptomatic individuals, births and deaths, delay-time, lock-down, quarantine, isolation, social distancing, and screening. figure 1 shows a classification of the sir models reviewed in this article, and table 1 lists them along with their equations. multiple output choices have been considered in the study of the structural identifiability and observability of some models. in such cases the observations are listed in the output column. figure 1: classification of sir models. each block represents a model structure. the basic, three-compartment sir model structure is on top of the tree. every additional block is labeled with the additional feature that it contains with respect to its parent block. the darkness of the shade indicates the number of additional features with respect to the basic sir model.
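the basic sir model just described can be integrated numerically as a sanity check. the rates below (β = 0.3, γ = 0.1, i.e. r0 = 3 on a normalized population) are illustrative assumptions, not values taken from any of the surveyed models.

```python
import numpy as np
from scipy.integrate import solve_ivp

def sir_rhs(t, y, beta, gamma):
    """Basic SIR right-hand side with a normalized population (S+I+R = 1)."""
    S, I, R = y
    return [-beta * S * I, beta * S * I - gamma * I, gamma * I]

beta, gamma = 0.3, 0.1   # illustrative rates, R0 = beta/gamma = 3
sol = solve_ivp(sir_rhs, (0, 200), [0.999, 0.001, 0.0],
                args=(beta, gamma), rtol=1e-8, atol=1e-10)
S, I, R = sol.y
print(round(I.max(), 3), round(R[-1], 3))  # peak prevalence, final epidemic size
```

with r0 = 3 the peak prevalence is close to 1 - (1 + ln r0)/r0 ≈ 0.30 and the final size solves z = 1 - exp(-r0·z) ≈ 0.94, which the integration reproduces.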
individuals in the seir model are divided into four compartments: susceptible (s), exposed (e), infected (i) and recovered (r). compared to sir models, the additional compartment e allows for a more accurate description of diseases in which the incubation period and the latent period (i.e. the period after which an infected individual becomes infectious) do not coincide. this is why seir models are in principle best suited to epidemics with a long incubation period such as covid-19 [50]. susceptible individuals move to the exposed class at a rate βi(t), where β is the transmission rate parameter. exposed individuals become infected at rate κ, where 1/κ is the average latent period. infected individuals recover at rate γ, where 1/γ is the average infectious period. thus, the set of differential equations describing the basic seir model is: ds/dt = -βs(t)i(t), de/dt = βs(t)i(t) - κe(t), di/dt = κe(t) - γi(t), dr/dt = γi(t). existing extensions of seir models may incorporate some of the following features: asymptomatic individuals, births and deaths, hospitalization, quarantine, isolation, social distancing, screening and lock-down. figure 2 shows a classification of the models found in the literature; table 2 lists them along with their equations. we analysed the structural identifiability and observability of the 17 sir model structures (a total of 98 model versions considering the different output configurations and time-varying parameter assumptions) and 19 seir models (with a total of 157 model versions) listed in tables 1 and 2. the detailed results for each model are given in appendix a, which reports the structural identifiability of each parameter and the observability of each state, for every model version. in the remainder of this section we provide an overview of the main results. the general patterns regarding state observability are as follows. the recovered state (r) is almost never observable unless it is directly measured (d.m.)
as output; the only exceptions are two seir models, 31 and 38, for which r is observable under the assumption of time-varying parameters. the susceptible state (s), in contrast, is observable in roughly two thirds of the models (sir: 65/98, seir: 103/157); this is also true for the exposed state (e) in the seir models. the infected state (i) is included among the outputs in most studies, either directly (d.m.) or indirectly measured (as part of a parameterized measurement function). when it is not considered in this way, its observability is generally similar to that of s (in 18/157 model versions i is not an output, and it is observable in 13/18 of them). the transmission and recovery rates (β, γ) are the two parameters common to all sir models. the transmission rate β is identifiable in 59/98 model versions; γ is identifiable in 51/98, and its derivatives are observable in a further 12/98. seir models have a third parameter in common, the latent period (κ). it is identifiable in most of the models (145/157), as is the recovery rate (111/157). the transmission rate is identifiable in 101/157 model versions, but it is not identifiable in any seir model version that accounts for social distancing (numbers 34 and 61); we found no clear pattern in the other models. the transmission rate β, the recovery rate γ, and, in seir models, the latent period κ can vary during an epidemic as a result of changes in the population's behaviour [57, 70], the introduction of new drugs or new medical equipment [57], or the shortening of the latent period as a result of high temperatures [71]. to account for such variations, the present study has considered both the constant and the time-varying cases, by including the corresponding variables either in the constant parameter vector θ or in the unknown input vector w(t), respectively, as described in remark 1. changing a parameter from constant to time-varying naturally influences structural identifiability and observability.
this effect is graphically summarized in figures 3-7, which represent classes of models in tree form and classify them according to their observability. each model is shaded with a color according to the observability of the parameter under study. some models include different rates for different population groups: for example, they may consider two different transmission rates for symptomatic and asymptomatic individuals. for those models, each rate may have different observability properties when considered as a time-varying parameter; in such cases the model is depicted between two color blocks (see for example the sir 20 model in figure 3). changing β from a constant to a time-varying parameter (or, equivalently, an unknown input) does not change its observability or that of the other variables in sir models. in contrast, this is not the case with the recovery rate γ, for which a somewhat counter-intuitive result may be obtained: by changing γ from a constant to a continuous function of time with at least one non-zero derivative, the model can become more observable and identifiable, despite the fact that γ is then an unknown function. an example of this is the sir model 15: if γ is constant, the model has only one identifiable parameter, τ, and no observable states; if γ is time-varying with at least one non-zero derivative, two parameters become identifiable (β, µ), two states become observable (i, s), and γ itself is observable. in the other models, when γ is not identifiable as a constant nor observable as an unknown input, its successive derivatives are observable. for the seir models, considering the β parameter as an unknown input function follows a similar trend to that of the sir models, with the exception of model 38, which gains both observability and identifiability and becomes fispo (fully input-state-parameter observable). considering the recovery rate γ (fig. 7) or the latent period κ (fig.
6) individually as time-varying parameters generally leads to greater observability, except for model 31(1). as an example, in model 39(2) one of the unknown inputs becomes observable, three states become observable (s, e, i), and three parameters become identifiable (γ, µi, β); similarly, in model 16(2) both its unknown input and three states (s, e, i) become observable, and two parameters (µ, β) become identifiable. besides the transmission rate, latent period, and recovery rate, other rates (screening, disease-related deaths, and isolation) have also been considered as time-varying parameters in some studies. the observability of most models is not modified if these parameters are allowed to change in time, the exception being 8 models that gain observability. an example is the seir model 41(1), which has seven parameters, seven states, and one output. assuming constant parameters, five of them are structurally identifiable (κ, α, β, γ1, γ2) and two are unidentifiable (q, ρ), while there are three observable states (i, j, c) and four unobservable states (s, e, a, r) [28]. however, when the parameter ρ (which describes the proportion of exposed/latent individuals who become clinically infectious) is considered time-varying, all parameters become identifiable (including ρ) and six states become observable (all except r, which, as already mentioned, is never observable unless it can be directly measured). the fact that allowing an unknown quantity to change in time can improve its observability, and also the observability of other variables in a model, may seem paradoxical. an intuitive explanation can be obtained from the study of symmetries in the model structure. the existence of lie symmetries amounts to the possibility of transforming parameters and state variables while leaving the output unchanged; that is, it amounts to a lack of structural identifiability and/or observability [72].
the strike-goldd toolbox used in this paper includes procedures for finding lie symmetries [73]. let us use the sir 15 model as an example. this model has five parameters (τ, β, ρ, µ, d), of which only τ is identifiable if all are assumed constant. the model contains a symmetry between ρ and µ, parameterized by ε, the parameter of the lie group of transformations; this symmetry makes the two parameters unidentifiable: changes in one parameter can be compensated by changes in the other one. however, if ρ is time-varying and µ is constant, the latter cannot compensate the changes of the former, and the symmetry is broken. indeed, if ρ is considered time-varying the model gains identifiability (not only µ, but also τ and β become identifiable) and observability (s, i, and ρ become observable). let us now illustrate how the results of this study may be applied in a realistic scenario. we use as an example the model sir 26, which has 6 states (s, i, r, a, q, j) and 16 parameters (d1, d2, d3, d4, d5, d6, k1, k2, λ, γ1, γ2, a, q, j, µ1, µ2); its equations are shown in table 1. this model includes the following additional features with respect to the basic sir model: birth/death, asymptomatic individuals (a), quarantine (q), and isolation (j). in its original publication two states were measured (q, j). with these two states as outputs the model has five identifiable parameters (d1, d5, q, k2, µ1) and two observable states (a, i); thus, there are two unobservable states (s, r) and ten unidentifiable parameters. if we are interested in estimating, e.g., the number of susceptible individuals (s), this model would not be appropriate. how should we proceed in that scenario? one way of improving observability could be to include more outputs (option 1).
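the kind of analysis used above can be reproduced with a generic observability-identifiability rank test computed from lie derivatives of the output. the sketch below is a minimal python/sympy implementation applied to the basic sir model, not the strike-goldd toolbox itself and not the sir 15 or sir 26 models (whose full equations are not reproduced here); the scaled output y = k·i, with k an unknown reporting parameter, is an illustrative assumption that introduces a scaling symmetry analogous to the symmetry between ρ and µ discussed above:

```python
import sympy as sp

def obs_ident_rank(states, params, f, h):
    """observability-identifiability rank test: parameters are appended to
    the state vector with zero dynamics, the jacobian of successive lie
    derivatives of the output h is built, and full rank means the model is
    structurally observable and identifiable (generically)."""
    x = list(states) + list(params)
    fx = list(f) + [sp.Integer(0)] * len(params)
    rows, lie = [], h
    for _ in range(len(x)):
        rows.append([sp.diff(lie, v) for v in x])
        # next lie derivative: grad(lie) . f
        lie = sum(sp.diff(lie, v) * fv for v, fv in zip(x, fx))
    return sp.Matrix(rows).rank()

S, I, beta, gamma, k = sp.symbols('S I beta gamma k', positive=True)
f = (-beta * S * I, beta * S * I - gamma * I)  # s' and i' of the basic sir model

# case 1: the infected compartment is measured directly (y = i)
r1 = obs_ident_rank((S, I), (beta, gamma), f, I)

# case 2: only a scaled prevalence y = k*i is measured, with k unknown.
# the scaling S -> lam*S, I -> lam*I, beta -> beta/lam, k -> k/lam leaves
# the output unchanged, so the rank must be deficient
r2 = obs_ident_rank((S, I), (beta, gamma, k), f, k * I)
```

a full rank in the first case corresponds to a fispo model, while the rank deficiency in the second reflects the scaling symmetry noted in the comments; breaking such a symmetry, for instance by letting one of the involved quantities vary in time, is exactly the mechanism discussed above.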
for example, since there is a separate class for asymptomatic individuals (a), the infected compartment (i) considers only individuals with symptoms, and we could assume that they can be detected. by including i in the output set, the structural identifiability and observability of the model improve: six more parameters become identifiable (λ, a, j, d4, k1, µ2) and the state in which we are interested (s) becomes observable. however, including more outputs is not always realistic. another possibility would then be to reduce the complexity of the model by decreasing the number of additional features (option 2). for example, leaving out the asymptomatic compartment leads to a reduced model whose output is the same, (q, j). in this case, the model has eight identifiable parameters (λ, q, j, d1, d5, µ1, µ2, k2) and two observable states (s, i). a third possibility is to simplify the parametrization of the model (option 3). the original model considers a different death rate for every compartment (di, i = 1, ..., 6). with some loss of generality, we could consider a specific death rate for infected individuals, di = d2, and a general death rate d for all non-infected and asymptomatic individuals. this reduction of the number of parameters leads to better observability of the model: the only unidentifiable parameters are d2, γ1, and k1, and the only non-observable state is r. thus, this option also makes it possible to estimate s. our analyses have shown that a fraction of the models found in the literature have unidentifiable parameters. key parameters such as the transmission rate (β), the recovery rate (γ), and the latent period (κ) are structurally identifiable in most, but not all, models. the transmission and recovery rates are identifiable in roughly two thirds of the models, and the latent period in almost all (> 90%) of them.
likewise, the states corresponding to the number of susceptible (s) and exposed (e) individuals are non-observable in roughly one third of the model versions analysed in this paper. the number of infected individuals (i) can usually be directly measured, but it is non-observable in one third of the model versions in which it is not measured. the situation is worse for the number of recovered individuals (r), which is almost never observable unless it is directly measured. many models include other states in addition to s, e, i, and r, which are not always observable either. the transmission rate and other parameters may vary during the course of an epidemic, as a result of a number of factors such as changes in public policy, population behaviour, or environmental conditions. to account for these variations, in the present study we have considered both the constant and the time-varying parameter case. somewhat unexpectedly, we found that allowing for variability in an unknown parameter often improves the observability and/or identifiability of the model. this phenomenon might be explained by the contribution of this variability to the removal of symmetries in the model structure. structural identifiability and observability depend on which states or functions are measured. the lack of these properties may in principle be surmounted by choosing the right set of outputs [74] , but the required measurements are not always possible to perform in practice. epidemiological models are a clear example of this; limitations such as lack of testing or the existence of asymptomatic individuals usually make it impossible to have measurements of all states. an alternative to measuring more states is to use a model with fewer compartments and/or a simpler parameterization, thus decreasing the number of states and/or parameters. reducing the model dimension in this way may achieve observability and identifiability. 
even when it is not possible (or practical) to avoid non-observability or non-identifiability by any means, the model may still be useful, as long as it is only used to infer its observable states or identifiable parameters. for example, we may be interested in determining the transmission rate β but not the number of recovered individuals r; in that case it is acceptable to use a model in which β is identifiable even if r is not observable. of course, this means that, to ensure that a model is properly used, its identifiability and observability must be characterized in detail, so that it is known whether the quantity of interest is observable/identifiable. the contribution of this work has been to provide such a detailed analysis of the structural identifiability and observability of a large set of compartmental models of covid-19 presented in the recent literature. the results of our analyses can be used to avoid the pitfalls caused by non-identifiability and non-observability. by classifying the existing models according to these properties, and arranging them in a structured way as a function of the compartments that they include, our study has answered the following question: given the sets of existing models and available measurements, which model is appropriate for inferring the value of particular parameters and/or for predicting the time course of the states of interest? the tables included in the following pages report the results of the observability and structural identifiability analyses of all the model variants considered in this paper. each block of rows represents one of the following assumptions:
• all parameters considered constant (i.e. as is usually the case in the original publications).
• transmission rate β considered time-varying.
• latent period κ considered time-varying (only in seir models; sir models do not have this parameter).
• recovery rate γ considered time-varying.
• all parameters considered time-varying.
within each block, each row provides detailed information about identifiable and non-identifiable parameters, observable and non-observable states, directly measured (d.m.) states, observable and unobservable unknown inputs (and time-varying parameters), known inputs, and the number of derivatives of the unknown inputs (and time-varying parameters) assumed to be non-zero (nnderw). the suffix dn represents the nth derivative of an unknown function (e.g. βd1 is the first derivative of the time-varying parameter β). the blank blocks in the tables of the seir models numbers 38 and 8 indicate that the corresponding time-varying case is already considered in the original formulation of the model. the sir models 29 and 30 have only been studied in their original form, i.e. without considering time-varying parameters, because these models do not contain the common parameters of the sir models; instead they use the r0 constant.

references:
• opinion: mathematical models: a key tool for outbreak response
• an introduction to mathematical modeling of infectious diseases
• how simulation modelling can help reduce the impact of covid-19
• special report: the simulations driving the world's response to covid-19
• mathematical epidemiology
• an introduction to mathematical epidemiology
• modeling infectious disease dynamics
• wrong but useful: what covid-19 epidemiologic models can and cannot tell us
• on the predictability of infectious disease outbreaks
• predictability: can the turning point and end of an expanding epidemic be precisely forecast?
• sensitivity analysis for uncertainty quantification in mathematical models
• asymptotic estimates of sars-cov-2 infection counts and their sensitivity to stochastic perturbation
• covid-19 outbreak in wuhan demonstrates the limitations of publicly available case numbers for epidemiological modelling
• fitting dynamic models to epidemic outbreaks with quantified uncertainty: a primer for parameter uncertainty, identifiability, and forecasts
• why is it difficult to accurately predict the covid-19 epidemic?
• a simple planning problem for covid-19 lockdown
• a multi-risk sir model with optimally targeted lockdown
• can the covid-19 epidemic be controlled on the basis of daily test reports?
• practical unidentifiability of a simple vector-borne disease model: implications for parameter estimation and intervention assessment
• the structural identifiability of a general epidemic (sir) model with seasonal forcing
• the structural identifiability of the susceptible infected recovered model with seasonal forcing
• the structural identifiability of susceptible-infective-recovered type epidemic models with incomplete immunity and birth targeted vaccination
• identifiability and estimation of multiple transmission pathways in cholera and waterborne disease
• integrating measures of viral prevalence and seroprevalence: a mechanistic modelling approach to explaining cohort patterns of human papillomavirus in women in the usa
• population modeling of early covid-19 epidemic dynamics in french regions and estimation of the lockdown impact on infection rate
• structural and practical identifiability analysis of outbreak models
• assessing parameter identifiability in compartmental dynamic models using a computational approach: application to infectious disease transmission models
• influencing public health policy with data-informed mathematical models of infectious diseases: recent developments and new challenges
• compartmental models in epidemiology
• dynamic systems biology modeling and simulation
• new results for identifiability of nonlinear systems
• a probabilistic algorithm to test local algebraic observability in polynomial time
• observability and structural identifiability of nonlinear biological systems
• full observability and estimation of unknown inputs, states, and parameters of nonlinear biological models
• local identifiability analysis of nonlinear ode models: how to determine all candidate solutions
• nonlinear controllability and observability
• an efficient method for structural identifiability analysis of large dynamic systems
• structural identifiability of dynamic systems biology models
• genssi 2.0: multi-experiment structural identifiability analysis of sbml models
• a new version of daisy to test structural identifiability of biological models
• sian: software for structural identifiability analysis of ode models
• on finding and using identifiable parameter combinations in nonlinear dynamic systems biology models and combos: a novel web implementation
• total variation regularization for compartmental epidemic models with time-varying dynamics
• effective containment explains subexponential growth in recent confirmed covid-19 cases in china
• modelling the covid-19 epidemic and implementation of population-wide interventions in italy
• a simple sir model with a large set of asymptomatic infectives
• fundamental principles of epidemic spread highlight the immediate need for large-scale serological surveys to assess the stage of the sars-cov-2 epidemic
• a feedback sir (fsir) model highlights advantages and limitations of infection-based social distancing
• construction of compartmental models for covid-19 with quarantine, lockdown and vaccine interventions
• models of seirs epidemic dynamics with extensions, including network-structured populations, testing, contact tracing, and social distancing
• a modified seir model to predict the covid-19 outbreak in spain and italy: simulating control scenarios and multi-scale epidemics
• social distancing to slow the coronavirus
• seiar model with asymptomatic cohort and consequences to efficiency of quarantine government measures in covid-19 epidemic
• research about the optimal strategies for prevention and control of varicella outbreak in a school in a central city of china: based on an seir dynamic model
• epidemic analysis of covid-19 in china by dynamical modeling
• mathematical modeling of epidemic diseases
• to mask or not to mask: modeling the potential for face mask use by the general public to curtail the covid-19 pandemic
• seir transmission dynamics model of 2019 ncov coronavirus with considering the weak infectious ability and changes in latency duration
• healthcare impact of covid-19 epidemic in india: a stochastic mathematical model
• modeling the control of covid-19: impact of policy interventions and meteorological factors
• modelling the transmission dynamics of covid-19 in six high burden countries
• mathematical model of transmission dynamics with mitigation and health measures for sars-cov-2 infection in european countries
• a novel covid-19 epidemiological model with explicit susceptible and asymptomatic isolation compartments reveals unexpected consequences of timing social distancing
• a mathematical model of epidemics with screening and variable infectivity
• dynamic models for the analysis of epidemic spreads
• effects of quarantine in six endemic models for infectious diseases
• introduction to seir models
• a time-dependent sir model for covid-19 with undetectable infected persons
• a periodic seirs epidemic model with a time-dependent latent period
• structural identifiability analysis via symmetries of differential equations
• finding and breaking lie symmetries: implications for structural identifiability and observability in biological modelling
• minimal output sets for identifiability

key: cord-301117-egd1gxby authors: barh, debmalya; chaitankar, vijender; yiannakopoulou, eugenia ch; salawu, emmanuel o.; chowbina, sudhir; ghosh, preetam; azevedo, vasco title: in silico models: from simple networks to complex diseases date: 2013-11-15 journal: animal biotechnology doi: 10.1016/b978-0-12-416002-6.00021-3 sha: doc_id: 301117 cord_uid: egd1gxby

in this chapter, we consider in silico modeling of diseases, starting from simple concepts and moving to more complex (and mathematical) ones. examples and applications of in silico modeling for some important categories of diseases (such as cancers, infectious diseases, and neuronal diseases) are also given. mathematical and computational models are established; these in silico models encode and test hypotheses about mechanisms underlying the function of cells, the pathogenesis and pathophysiology of disease, and contribute to the identification of new drug targets and to drug design. the development of in silico models is facilitated by rapidly advancing experimental and analytical tools that generate information-rich, high-throughput biological data. bioinformatics provides tools for pattern recognition, machine learning, statistical modeling, and data extraction from databases that contribute to in silico modeling. dynamical systems theory is the natural language for investigating complex biological systems that demonstrate nonlinear spatio-temporal behavior. most in silico models aim to complement (and not replace) experimental research. experimental data are needed for parameterization, calibration, and validation of in silico models. typical examples in biology are models for molecular networks, where the behavior of cells is expressed in terms of quantitative changes in the levels of transcripts and gene products, as well as models of the cell cycle. in medicine, in silico models of cancer, immunological disease, lung disease, and infectious diseases complement conventional research with in vitro models, animal models, and clinical trials.
this chapter presents basic concepts of bioinformatics, systems biology, their applications in in silico modeling, and also reviews applications in biology and disease. biotechnology will be the most promising life science frontier for the next decade. together with informatics, biotechnology is leading revolutionary changes in our society and economy. this genomic revolution is global, and is creating new prospects in all biological sciences, including medicine, human health, disease, and nutrition, agronomy, and animal biotechnology. animal biotechnology is a source of innovation in production and processing, profoundly impacting the animal husbandry sector, which seeks to improve animal product quality, health, and well-being. biotechnological research products, such as vaccines, diagnostics, in vitro fertilization, transgenic animals, stem cells, and a number of other therapeutic recombinant products, are now commercially available. in view of the immense potential of biotechnology in the livestock and poultry sectors, interest in animal biotechnology has increased over the years. the fundamental requirement for modern biotechnology projects is the ability to gather, store, classify, analyze, and distribute biological information derived from genomics projects. bioinformatics deals with methods for storing, retrieving, and analyzing biological data and protein sequences, structures, functions, pathways, and networks, and recently, in silico disease modeling and simulation using systems biology. bioinformatics encompasses both conceptual and practical tools for the propagation, generation, processing, and understanding of scientific ideas and biological information. genomics is the scientific study of structure, function, and interrelationships of both individual genes and the genome. lately, genomics research has played an important role in uncovering the building blocks of biology and complete genome mapping of various living organisms. 
this has enabled researchers to decipher fundamental cellular functions at the dna level, such as gene regulation or protein-protein interactions, and thus to discover molecular signatures (clusters of genes, proteins, metabolites, etc.) that are characteristic of a biological process or of a specific phenotype. bioinformatics methods and databases can be developed to provide solutions to the challenges of handling massive amounts of data. the aim of combining animal biotechnology with bioinformatics is to build a strong research community that will develop the resources and support veterinary and agricultural research. some of the technologies in use date back to 5,000 b.c., and many of these techniques are still being used today. for example, hybridizing animals by crossing specific strains to create greater genetic variety is still in practice. the offspring of some of these crosses are selectively bred afterward to produce the most desirable traits in those specific animals. there has been significant interest in the complete analysis of the genome sequences of farm animals such as chickens, pigs, cattle, sheep, fish, and rabbits. the genomes of farm animals have been altered in the search for preferred phenotypic traits, and better-quality animals then selected to continue into the next generation. access to these sequences has given rise to genome array chips and a number of web-based mapping tools and bioinformatics tools required to make sense of the data. in addition, the organization of gigabytes of sequence data requires efficient bioinformatics databases. fadiel et al. (2005) provide a nice overview of resources related to farm animal bioinformatics and genome projects. with farm animals consuming large amounts of genetically modified crops, such as modified corn and soybean, it is reasonable to question the effect this will have on their meat.
one benefit of this technology is that what once took many years of trial and error can now be completed in just months. the meat being produced comes from animals that are better nourished thanks to biotechnology. biotechnology and conventional approaches are benefiting both poultry and livestock producers, giving a more wholesome, affordable product that will meet growing population demands. moreover, bioinformatics methods devoted to investigating the genomes of farm animals can bring eventual economic benefits, such as ensuring food safety and better food quality in the case of beef. recent advances in high-throughput dna sequencing techniques, microarray technology, and proteomics have led to effective research in bovine muscle physiology to improve beef quality, either by breeding or by rearing factors. bioinformatics is a key tool for analyzing the huge datasets obtained from these techniques. the computational analysis of global gene expression profiling at the mrna or protein level has shown that previously unsuspected genes may be associated with muscle development or growth, and may lead to the development of new molecular indicators of tenderness. gene expression profiling has been used to document changes in gene expression, for example, following infection by pathological organisms, during the metabolic changes imposed by lactation in dairy cows, in cloned bovine embryos, and in various other models. bioinformatics enrichment tools are playing an important role in facilitating the functional analysis of large gene lists from various high-throughput biological studies. huang et al. discuss 68 bioinformatics enrichment tools, helping us understand their algorithms and the details of each particular tool. however, in biology genes do not act independently, but in a highly coordinated and interdependent manner.
in order to understand the biological meaning, one needs to map these genes onto gene-ontology (go) categories or metabolic and regulatory pathways. different bioinformatics approaches and tools are employed for this task, starting from go-ranking methods, pathway mappings, and biological network analysis (werner, 2008). awareness of these resources and methods is essential for making the best choices for particular research interests. knowledge of bioinformatics tools will facilitate their wide application in the field of animal biotechnology. bioinformatics is the computational data management discipline that helps us gather, analyze, and represent this information in order to educate ourselves, understand biological processes in healthy and diseased states, and facilitate the discovery of better animal products. continued efforts are required to develop cost-effective and efficient computational platforms that can retrieve, integrate, and interpret the knowledge behind the genome sequences. the application of bioinformatics tools to biotechnology research will have significant implications for the life sciences and the betterment of human lives. bioinformatics is being adopted worldwide by academic groups, companies, and national and international research groups, and it should be thought of as an important pillar of current and future biotechnology, without which rapid progress in the field would not be possible. systems approaches, in combination with genomics, proteomics, metabolomics, and kinomics data, have tremendous potential for providing insights into various biological mechanisms, including the most important human diseases. we are witnessing the birth of a new era in biology. the ability to uncover the genetic code of living organisms has dramatically changed the approach of the biological and biomedical sciences towards research. these new approaches have also brought newer challenges.
one such challenge is that recent and novel technologies produce biological datasets of ever-increasing size, including genomic sequences, rna and protein abundances, their interactions with each other, and the identity and abundance of other biological molecules. the storage and compilation of such quantities of biological data is a challenge: the human genome, for example, contains 3 billion chemical units of dna, whereas a protozoan genome has 670 billion units of dna. data management and interpretation require the development of new, sophisticated computational methods based on research in biology, medicine, pharmacology, and agricultural studies, using methods from computer science and mathematics: in other words, the multi-disciplinary subject of bioinformatics. bioinformatics enables researchers to store large datasets in a standard computer database format and provides the tools and algorithms scientists use to extract integrated information from the databases and use it to create hypotheses and models. bioinformatics is a growth area because almost every experiment now involves multiple sources of data, requiring the ability to handle those data and to draw out inferences and knowledge. after 15 years of rapid evolution, the subject is now quite ubiquitous. another challenge lies in deciphering the complex interactions in biological systems, known as systems biology. systems biology can be described as a biology-based interdisciplinary field of study that focuses on complex interactions within biological systems. those in the field claim that it represents a shift in perspective towards holism instead of reductionism. systems biology has great potential to facilitate the development of drugs to treat specific diseases. the drugs currently on the market can target only those proteins that are known to cause disease. however, with the human genome now completely mapped, we can target the interaction of genes and proteins at a systems-biology level. this will enable the pharmaceutical industry to design drugs that target only disease-associated genes, improving healthcare in the united states.
this will enable the pharmaceutical industry to design drugs that target only those genes that are diseased, improving healthcare in the united states. like two organs in one body, systems analysis and bioinformatics are separate but interdependent. computational methods take an interdisciplinary approach, involving mathematicians, chemists, biologists, biochemists, and biomedical engineers. the robustness of datasets related to gene interaction and co-operation at a systems level requires multifaceted approaches to create a hypothesis that can be tested. two kinds of approaches are used to understand the network interactions in systems biology, namely experimental methods and theoretical and modeling techniques (choi, 2007). the following sections give a detailed overview of the different computational or bioinformatics methods in modern systems biology. experimental methods utilize real situations to test the hypothesis of mined data sets. as such, living organisms are used whereby various aspects of genome-wide measurements and interactions are monitored. a specific example is protein-protein interaction (ppi) prediction: methods used to predict the outcome of pairs or groups of protein interactions. these predictions are tested in vivo, and various methods can be used to carry out the predictions. interaction prediction is important as it helps researchers make inferences about the outcomes of ppi. ppi can be studied by phylogenetic profiling, identifying structural patterns and homologous pairs, intracellular localization, and post-translational modifications, among others. a survey of available tools and web servers for analysis of protein-protein interactions is provided by tuncbag et al. (2009). within biological systems, several activities involving the basic units of a gene take place.
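phylogenetic profiling, mentioned above, scores a candidate interaction by how consistently two proteins co-occur across a set of genomes: functionally linked proteins tend to be jointly present or jointly absent. a minimal sketch, with profile data and threshold invented purely for illustration:

```python
def jaccard(profile_a, profile_b):
    """Jaccard similarity between two presence/absence profiles
    (sequences of 0/1, one entry per reference genome)."""
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 0.0

def predict_interactions(profiles, threshold=0.8):
    """Pair up proteins whose phylogenetic profiles co-occur above a
    threshold; profiles is a dict mapping protein name -> 0/1 tuple."""
    names = sorted(profiles)
    pairs = []
    for i, p in enumerate(names):
        for q in names[i + 1:]:
            if jaccard(profiles[p], profiles[q]) >= threshold:
                pairs.append((p, q))
    return pairs
```

real pipelines use many genomes and probabilistic co-occurrence scores, but the shape of the computation is the same.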
such processes as dna replication, rna transcription, and translation into proteins must be controlled; otherwise, the systems could yield numerous destructive or useless gene products. transcriptional control networks, also called gene regulatory networks, are segments within the dna that govern the rate and product of each gene. bioinformaticians have devised methods to look for destroyed, dormant, or unresponsive control networks. the discovery of such networks helps in corrective therapy, hence the ability to control some diseases resulting from such control network breakdowns. there has also been rapid progress in the development of computational methods for the genome-wide "reverse engineering" of such networks. aracne is an algorithm that identifies direct transcriptional interactions in mammalian cellular networks, and it promises to enhance the ability to use microarray data to elucidate cellular processes and to identify molecular targets of pharmacological drugs in mammalian cellular networks. in addition to methods like aracne, systems biology approaches are needed that incorporate heterogeneous data sources, such as genome sequence and protein-dna interaction data. the development of such computational modeling techniques to include diverse types of molecular biological information clearly supports the gene regulatory network inference process and enables the modeling of the dynamics of gene regulatory systems. one such technique is the template-based method to construct networks. an overview of the method is shown in flow chart 21.1. the template-based transcriptional control network reconstruction method exploits the principle that orthologous proteins regulate orthologous target genes. given a genome of interest (goi), the first step is to select the template genome (tg) and the known regulatory interactions (i.e. the template network, tn) in this genome.
in step 2, for every protein (p) in tn, a blast search is performed against the goi to obtain the best hit sequence (px). in step 3, each px is then used as a query to perform a blast search against tg. if the best hit using px as a query happens to be p, then both p and px are selected as orthologous proteins in step 4. if orthologs were detected for both an interacting p and its target gene, then the interaction is transferred to the goi in the final step. note that this automated way of detecting orthologs can infer false positives. signal transduction is how cells communicate with each other. signal transduction pathways involve interactions between proteins, micro- and macro-molecules, and dna. a breakdown in signal transduction pathways could lead to detrimental consequences within the system due to lack of integrated communication. correction of broken signal transduction pathways is a therapeutic approach researched for use in many areas of medicine. high-throughput and multiplex techniques for quantifying signaling and cellular responses are becoming increasingly available and affordable. high-throughput quantitative multiplex kinase assays, mass spectrometry-based proteomics, and single-cell proteomics are a few of the experimental methods used to elucidate the signal transduction mechanisms of cells. these large-scale experiments are generating large data sets on protein abundance and signaling activity. data-driven modeling approaches such as clustering, principal components analysis, and partial least squares need to be developed to derive biological hypotheses from them. the potential of data-driven models to study large-scale data sets quantitatively and comprehensively will ensure that these methods emerge as standard tools for understanding signal-transduction networks. the systems biology and mathematical biology fields focus on modeling biological systems.
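the steps of the template-based method reduce to a reciprocal-best-hit search followed by an edge transfer. the sketch below stands in for blast with a caller-supplied `score` function (a hypothetical similarity measure), so it is a structural sketch of the procedure rather than a working blast pipeline:

```python
def best_hit(query, target_genome, score):
    """Return the highest-scoring sequence in target_genome for query.
    `score` stands in for a BLAST bit score (assumption for this sketch)."""
    return max(target_genome, key=lambda s: score(query, s))

def reciprocal_best_hits(template_genome, genome_of_interest, score):
    """Steps 2-4: P -> best hit Px in the GOI, then Px -> best hit back in
    the TG; keep (P, Px) only when the reverse search returns to P."""
    orthologs = {}
    for p in template_genome:
        px = best_hit(p, genome_of_interest, score)
        if best_hit(px, template_genome, score) == p:
            orthologs[p] = px
    return orthologs

def transfer_interactions(template_network, orthologs):
    """Step 5: copy a regulatory edge (P, target) into the GOI when both
    endpoints have detected orthologs."""
    return [(orthologs[p], orthologs[t])
            for p, t in template_network
            if p in orthologs and t in orthologs]
```

as the text notes, this reciprocal-best-hit criterion can still admit false positives, e.g. when the true ortholog is missing from the genome of interest.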
computational systems biology aims to develop computational models of biological systems. specifically, it focuses on developing and using efficient algorithms, data structures, visualization tools, and communication tools. a mathematical model can provide new insights into a biological system of interest and help in generating testable predictions. modeling or simulation can be viewed as a way of creating an artificial biological system in silico whose properties can be changed or made dynamic. by externally controlling the model, new datasets can be created and implemented at a systems level to create novel insights into treating gene-related problems. in modeling and simulation, sets of differential equations and logic clauses are used to create a dynamic systems environment that can be tested. mathematical models of biochemical networks (signal transduction cascades, metabolic pathways, gene regulatory networks) are a central component of modern systems biology. the development of formal methods adopted from theoretical computing science is essential for the modeling and simulation of these complex networks. the computational methods employed in mathematical biology and bioinformatics are the following: (a) directed graphs, (b) bayesian networks, (c) boolean networks and their generalizations, (d) ordinary and partial differential equations, (e) qualitative differential equations, (f) stochastic equations, and (g) rule-based formalisms. below are a few specific examples of the applications of these methods. mathematical models can be used to investigate the effects of drugs under a given set of perturbations based on specific tumor properties. this integration can help in the development of tools that aid in diagnosis and prognosis, and thus improve treatment outcomes in patients with cancer. for example, breast cancer, having been well studied over the last decade, serves as a model disease.
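as a concrete instance of method (d), ordinary differential equations, a two-gene mutual-repression circuit (a genetic toggle switch) can be integrated with forward euler in a few lines. the equations are the standard textbook form; the parameter values and initial conditions below are illustrative only, not fitted to any system:

```python
def simulate_toggle(steps=20000, dt=0.01, alpha=10.0, n=2.0):
    """Forward-Euler integration of a two-gene mutual-repression model:
        dx/dt = alpha / (1 + y**n) - x
        dy/dt = alpha / (1 + x**n) - y
    where x and y are the two protein concentrations, alpha the maximal
    synthesis rate, and n the Hill coefficient of repression."""
    x, y = 5.0, 0.1  # asymmetric start pushes the switch towards one state
    for _ in range(steps):
        dx = alpha / (1.0 + y ** n) - x
        dy = alpha / (1.0 + x ** n) - y
        x, y = x + dt * dx, y + dt * dy
    return x, y
```

for these parameters the circuit is bistable: whichever gene starts ahead represses the other, and the system settles into a high/low expression state, which is the qualitative behavior such ode models are used to demonstrate.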
one can thus apply the principles of molecular biology and pathology in designing new predictive mathematical frameworks that can unravel the dynamic nature of the disease. genetic mutations of brca1, brca2, tp53, and pten significantly affect disease prognosis and increase the likelihood of adverse reactions to certain therapies. these mutations enable normal cells to become self-sufficient in survival in a stepwise process. enderling et al. (2006) modeled this mutation and expansion process by assuming that mutations in two tumor-suppressor genes are sufficient to give rise to a cancer. they modified enderling's earlier model, which was based on an established partial differential equation model of solid tumor growth and invasion. the stepwise mutations from a normal breast stem cell to a tumor cell have been described using a model consisting of four differential equations. woolf et al. (2005) applied a novel graphical modeling methodology known as bayesian network analysis to model discovery and model selection for the protein signaling networks that direct mouse embryonic stem cells, an important preliminary step in hypothesis testing. the model predicts bidirectional dependence between the two molecules erk and fak. it is interesting to appreciate that the apparent complexity of these dynamic erk-fak interactions is quite likely responsible for the difficulty in determining clear "upstream" versus "downstream" influence relationships by means of standard molecular cell biology methods. bayesian networks determine the relative probability of statistical dependence models of arbitrary complexity for a given set of data. this method offers further clues for applying bayesian approaches to problems in cancer biology. the cell cycle is a process in which cells proliferate while collectively performing a series of coordinated actions. cell-cycle models also have an impact on drug discovery. chassagnole et al.
(2006) used a mathematical model to simulate and unravel the effect of multi-target kinase inhibitors of cyclin-dependent kinases (cdks). they quantitatively predicted the cytotoxicity of a set of kinase inhibitors based on in vitro ic50 measurement values. finally, they assessed the pharmaceutical value of these inhibitors as anticancer therapeutics. in cancer, avascular tumor growth is characterized by localized, benign tumor growth in which the nearby tissues consume most of the nutrients. mathematical modeling of avascular tumor growth is important to understanding the advanced stages of cancer. kiran et al. (2009) have developed a spatio-temporal mathematical model, classified as a different zone model (dzm), for avascular tumor growth based on the diffusion of nutrients and their consumption, and it includes key mechanisms in the tumor. the diffusion and nutrient consumption are represented using partial differential equations. this model predicts that the onset of necrosis occurs when the concentrations of vital nutrients fall below critical values, and also predicts the overall tumor growth based on the size effects of the proliferation zone, quiescent zone, and necrotic zone. the mathematical approaches towards modeling the three natural scales of interest (subcellular, cellular, and tissue) are discussed above. developing models that can predict effects across biological scales is a challenge. the long-term goal is to build a "virtual human made up of mathematical models with connections at the different biological scales (from genes to tissue to organ)." a model is an optimal mix of hypotheses, evidence, and abstraction to explain a phenomenon. a hypothesis is a tentative explanation for an observation, phenomenon, or scientific problem that can be tested by further investigation. evidence describes information (i.e. experimental data) that helps in forming a conclusion or judgment.
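the kiran et al. model itself is not reproduced here, but its core ingredient, nutrient diffusion with consumption, can be sketched in one dimension with an explicit finite-difference scheme; the necrotic zone is then wherever the nutrient concentration falls below a critical value. all parameter values below are illustrative, not taken from the paper:

```python
def nutrient_profile(nx=51, steps=5000, dt=0.01, dx=1.0, D=1.0, k=0.1):
    """Explicit finite-difference solution of dc/dt = D*d2c/dx2 - k*c on a
    1-D tumor cross-section, with nutrient held at 1.0 on both boundaries
    (the vascularized rim). Stability requires dt*D/dx**2 <= 0.5."""
    c = [1.0] + [0.0] * (nx - 2) + [1.0]
    for _ in range(steps):
        new = c[:]
        for i in range(1, nx - 1):
            lap = (c[i - 1] - 2 * c[i] + c[i + 1]) / dx ** 2
            new[i] = c[i] + dt * (D * lap - k * c[i])  # diffusion - consumption
        c = new
    return c

def necrotic_zone(c, critical=0.2):
    """Grid indices where nutrient falls below the critical value
    (onset of necrosis in the model's terms)."""
    return [i for i, v in enumerate(c) if v < critical]
```

the profile decays from the rim towards the center, so the predicted necrotic core sits in the middle of the cross-section, which is the qualitative prediction described in the text.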
abstraction is the act of filtering information to focus on a specific property only. for example, archiving books based on the year of publication, irrespective of the author name, would be an example of abstraction. in this process, some detail is lost and some gained. predictions are made through modeling that can be tested by experiment. a model may be simple (e.g. the logistic equation describing how a population of bacteria grows) or complicated. models may be mathematical or statistical. mathematical models make predictions, whereas statistical models enable us to draw statistical inferences about the probable properties of a system. in other words, models can be deductive or inductive. if the prediction is necessarily true given that the model is also true, then the model is a deductive model. on the other hand, if the prediction is statistically inferred from observations, then the model is inductive. deductive models contain a mathematical description, for example the reaction-diffusion equation, that makes predictions about reality. if these predictions do not agree with experiment, then the validity of the entire model may be questioned. mathematical models are commonly applied in the physical sciences. on the other hand, inductive models are mostly applied in the biological sciences. in biology, models are used to describe, simulate, analyze, and predict the behavior of biological systems. modeling in biology provides a framework that enables description and understanding of biological systems through building equations that express biological knowledge. modeling enables the simulation of the behavior of a biological system by performing in silico experiments (i.e. numerically solving the equations or rules that describe the model).
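the logistic equation mentioned above is the textbook deductive model: dN/dt = r*N*(1 - N/K) has a closed-form solution whose prediction can be checked directly against a measured growth curve. the parameter values below (inoculum, carrying capacity, growth rate) are illustrative:

```python
from math import exp

def logistic(t, n0=10.0, K=1e6, r=0.5):
    """Closed-form solution of the logistic equation dN/dt = r*N*(1 - N/K):
        N(t) = K / (1 + ((K - n0)/n0) * exp(-r*t))
    with initial population n0, carrying capacity K, and growth rate r."""
    return K / (1.0 + ((K - n0) / n0) * exp(-r * t))
```

if measured bacterial counts deviate systematically from this curve, the deductive model itself (not just its parameters) is called into question, which is exactly the falsifiability property the text describes.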
the results of these in silico experiments become the input for further analysis; for example, identification of key parameters or mechanisms, interpretation of data, or comparison of the ability of different mechanisms to generate observed data. in particular, systems biology employs an integrative approach to characterizing biological systems, in which interactions among all components in a system are described mathematically to establish a computable model. these in silico models complement traditional in vivo animal models and can be applied to quantitatively study the behavior of a system of interacting components. the term "in silico" is poorly defined, with several researchers claiming a role in its origination (ekins et al., 2007). sieburg (1990) and danchin et al. (1991) were two of the earliest published works that used the term. in silico models gained much early interest through various imaging studies (chakroborty et al., 2003). as an example, microarray analysis, which enabled measurement of genome-scale expression levels of genes, provided a method to investigate regulatory networks. years of regulatory network studies (including microarray-based investigations) led to the development of some well-characterized regulatory networks, such as the e. coli and yeast regulatory networks. these networks are available in the genenetweaver (gnw) tool. gnw is an open-source tool for in silico benchmark generation and performance profiling of network inference methods. thus, the advent of high-throughput experimental tools has allowed for the simultaneous measurement of thousands of biomolecules, opening the way for in silico model construction of increasingly large and diverse biological systems. integrating heterogeneous dynamic data into quantitative predictive models holds great promise for significantly increasing our ability to understand and rationally intervene in disease-perturbed biological systems.
this promise, particularly with regard to personalized medicine and medical intervention, has motivated the development of new methods for systems analysis of human biology and disease. such approaches offer the possibility of gaining new insights into the behavior of biological systems, of providing new frameworks for organizing and storing data and performing statistical analyses, of suggesting new hypotheses and new experiments, and even of offering a "virtual laboratory" to supplement in vivo and in vitro work. however, in silico modeling in the life sciences is far from straightforward, and it suffers from a number of potential pitfalls. mathematically sophisticated but biologically useless models often arise because of a lack of biological input, leading to models that are biologically unrealistic or that address a question of little biological importance. on the other hand, models may be biologically realistic but mathematically intractable. this problem usually arises because biologists unfamiliar with the limitations of mathematical analysis want to include every known biological effect in the model. even if it were possible to produce such models, they would be of little use, since their behavior would be as complex to investigate as the experimental situation. these problems can be avoided by formulating clear, explicit biological goals before attempting to construct a model. this will ensure that the resulting model is biologically sound, can be experimentally verified, and will generate biological insight or new biological hypotheses. the aim of a model should not simply be to reproduce biological data. indeed, often the most useful models are those that exhibit discrepancies from experiment. such deviations will typically stimulate new experiments or hypotheses. an iterative approach has been proposed, starting with a biological problem, developing a mathematical model, and then feeding back into the biology.
once established, this collaborative loop can be traversed many times, leading to ever-increasing understanding. the ultimate goal of in silico modeling in biology is the detailed understanding of the function of molecular networks as they appear in metabolism, gene regulation, or signal transduction. this is achieved by using a level of mathematical abstraction that needs a minimum of biological information to capture all physiologically relevant features of a cellular network. ideally, in silico modeling of a molecular network would require knowledge of the network structure, all reaction rates, and the concentrations and spatial distributions of molecules at any time point. unfortunately, such information is unavailable even for the best-studied systems. in silico simulations thus always have to use a level of mathematical abstraction, which is dictated by the extent of our biological knowledge, by the molecular details of the network, and by the specific questions that are addressed. understanding the complexity of a disease and its biological significance in health can be achieved by integrating data from the different functional genomics experiments with medical, physiological, and environmental factor information, and analyzing them mathematically. the advantage of mathematical modeling of disease lies in the fact that such models not only shed light on how a complex process works, which could be very difficult to infer from an understanding of each component of the process, but also predict what may follow as time evolves or as the characteristics of particular system components are modified. mathematical models have generally been utilized in association with an increased understanding of what models can offer in terms of prediction and insight. models play two distinct roles, prediction and understanding, and their usefulness depends on the accuracy, transparency, and flexibility of their properties.
prediction by models should be accurate, including all the complexities and population-level heterogeneity, which gives them an additional use as a statistical tool. prediction also provides understanding of how a disease spreads in the real world and how complexity affects the dynamics. model understanding aids in developing sophisticated predictive models, along with gathering more relevant epidemiological data. a model should be as simple as possible and should balance accuracy, transparency, and flexibility; in other words, a model should be well suited for its purpose. the model should be helpful in understanding the behavior of the disease and simple enough to carry over to other disease conditions. several projects are proceeding along these lines, such as e-cell (tomita, 2001) and simulations of biochemical pathways. whole cell modeling integrates information from metabolic pathways, gene regulation, and gene expression. three elements are needed for constructing a good cell model: precise knowledge of the phenomenon, an accurate mathematical representation, and a good simulation tool. a cell represents a dynamic environment of interaction among nucleic acids, proteins, carbohydrates, ions, ph, temperature, pressure, and electrical signals. many cells with similar functionality form a tissue. in addition, each type of tissue uses a subset of this cellular inventory to accomplish a particular function. for example, in neurons, electro-chemical phenomena take precedence over cell division, whereas cell division is a fundamental function of skin, lymphocytes, and bone marrow cells. thus, an ideal virtual cell not only represents all the information, but also exhibits the potential to differentiate into neuronal or epithelial cells. the first step in creating a whole cell model is to divide the entire network into pathways, and pathways into individual reactions. any two reactions belong to a pathway if they share a common intermediate.
in silico modeling consists not only of decomposing events into manageable units, but also of assembling these units into a unified framework. in other words, mathematical modeling is the art of converting biology into numbers. for whole cell modeling, a checklist of biological phenomena that call for mathematical representation is needed. the biological phenomena taken into account for in silico modeling of whole cells are the following:
1. dna replication and repair
2. translation
3. transcription and regulation of transcription
4. energy metabolism
5. cell division
6. chromatin modeling
7. signaling pathways
8. membrane transport (ion channels, pumps, nutrients)
9. intracellular molecular trafficking
10. cell membrane dynamics
11. metabolic pathways
whole cell metabolism includes enzymatic and non-enzymatic processes. enzymatic processes cover most of the metabolic events, while non-enzymatic processes include gene expression and regulation, signal transduction, and diffusion. in silico modeling of whole cells requires not only precise qualitative and quantitative data, but also an appropriate mathematical representation of each event. for metabolic modeling, the data input consists of the kinetics of individual reactions and also the effects of cofactors, ph, and ions on the model. the key step in modeling is to choose an appropriate assumption. for example, a metabolic pathway may be a mix of forward and reverse reactions. furthermore, inhibitors that are part of the pathway may influence some reactions. at every step, enzymatic equations are needed that best describe the process. in silico models are built because they are easy to understand, controllable, and can store and analyze large amounts of information. a well-built model has diagnostic and predictive abilities. a cell by itself is a complete biochemical reactor that contains all the information one needs to understand life.
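the enzymatic equations referred to above are typically michaelis-menten rate laws, optionally extended with inhibition terms for the pathway inhibitors just mentioned. a minimal sketch (the numeric parameter values used in testing are arbitrary):

```python
def michaelis_menten(s, vmax, km):
    """Michaelis-Menten rate law v = Vmax*[S] / (Km + [S]), the standard
    enzymatic equation for an individual reaction in a metabolic model."""
    return vmax * s / (km + s)

def competitive_inhibition(s, i, vmax, km, ki):
    """Rate in the presence of a competitive inhibitor at concentration [I]:
    the apparent Km is scaled by (1 + [I]/Ki), leaving Vmax unchanged."""
    return vmax * s / (km * (1.0 + i / ki) + s)
```

assembling one such rate law per reaction, coupled through shared metabolites, is exactly the "pathways into individual reactions" decomposition described above.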
whole cell modeling enables investigation of the cell cycle, physiology, spatial organization, and cell-cell communication. the sequential actions in whole cell modeling are the following:
1. catalog all the substances that make up a cell. substances (for qualitative modeling).
5. add rate constants, concentrations of substances, and strengths of inhibition.
6. assume appropriate mathematical representations for individual reactions.
7. simulate reactions with suitable simulation software.
8. diagnose the system with system analysis software.
9. perturb the system and correlate its behavior to an underlying genetic and/or biochemical factor.
10. predict phenomena using a hypothesis generator.
in silico modeling of disease combines the advantages of both in vivo and in vitro experimentation. unlike in vitro experiments, which exist in isolation, in silico models provide the ability to include a virtually unlimited array of parameters, which renders the results more applicable to the organism as a whole. in silico modeling allows us to examine the workings of biological processes such as homeostasis, reproduction, and evolution. for example, one can explore the processes of darwinian evolution through in silico modeling, which are not practical to study in real time. in silico modeling of disease is quite challenging. attempting to incorporate every single known interaction rapidly leads to an unmanageable model. furthermore, parameter determination in such models can be a daunting exercise. estimates come from diverse experiments, which may be elegantly designed and well executed but can still give rise to widely differing values for parameters. data can come from both in vivo and in vitro experiments, and results that hold in one medium may not always hold in the other.
furthermore, despite the many similarities between mammalian systems, significant differences do exist, and so results obtained from experiments using animal and human tissue may not always be consistent. there are also manipulations that cannot be carried out experimentally: for example, one cannot investigate the role of stochastic fluctuations by removing them from the system, and one cannot directly explore the process that gave rise to current organisms. in silico modeling has been applied in cancer, systemic inflammatory response syndrome, immune diseases, neuronal diseases, and infectious diseases, among others. in silico models of disease can contribute to a better understanding of the pathophysiology of the disease, suggest new treatment strategies, and provide insight into the design of experimental and clinical trials for the investigation of new treatment modalities. in silico modeling of cancer has become an interesting alternative approach to traditional cancer research. in silico models of cancer are expected to predict the complexity of cancer at multiple temporal and spatial resolutions, with the aim of supplementing diagnosis and treatment by helping plan more focused and effective therapy via surgical resection, standard and targeted chemotherapy, and novel treatments. in silico models of cancer include: (a) statistical models of cancer, such as molecular signatures of perturbed genes and molecular pathways, and statistically inferred reaction networks; (b) models that represent biochemical, metabolic, and signaling reaction networks important in oncogenesis, including constraint-based and dynamic approaches for the reconstruction of such networks; and (c) models of the tumor microenvironment and tissue-level interactions (edelman et al., 2010). statistical models of cancer can be broadly divided into those that employ unbiased statistical inference, and those that also incorporate a priori constraints of specific biological interactions from data.
statistical models of cancer biology at the genetic, chromosomal, transcriptomic, and pathway levels provide insight about the molecular etiology and consequences of malignant transformation despite incomplete knowledge of the underlying biological interactions. these models are able to identify molecular signatures that can inform diagnosis and treatment selection, for example with molecular targeted therapies such as imatinib (gleevec) (edelman et al., 2010). however, in order to characterize the specific biomolecular mechanisms that drive oncogenesis, genetic and transcriptional activity must be considered in the context of the cellular networks that ultimately drive cellular behavior. in microbial cells, network inference tools have been developed and applied to the modeling of diverse biochemical, signaling, and gene expression networks. however, due to the much larger size of the human genome compared to microbial genomes, and the substantially increased complexity of eukaryotic genetic regulation, inference of transcriptional regulatory networks in cancer presents increased practical and theoretical challenges. biochemical reaction networks are constructed to represent explicitly the mechanistic relationships between genes, proteins, and the chemical inter-conversion of metabolites within a biological system. in these models, network links are based on pre-established biomolecular interactions rather than statistical associations; significant experimental characterization is thus needed to reconstruct biochemical reaction networks in human cells. these biochemical reaction networks require, at a minimum, knowledge of the stoichiometry of the participating reactions. additional information such as thermodynamics, enzyme capacity constraints, time-series concentration profiles, and kinetic rate constants can be incorporated to compose more detailed dynamic models (edelman et al., 2010).
microenvironment-tissue level models of cancer apply an "engineering" approach that views tumor lesions as complex micro-structured materials, where three-dimensional tissue architecture ("morphology") and dynamics are coupled in complex ways to cell phenotype, which in turn is influenced by factors in the microenvironment. computational approaches in in silico cancer research include continuum models, discrete models, and hybrid models. in continuum models, extracellular parameters can be represented as continuously distributed variables to mathematically model cell-cell or cell-environment interactions in the context of cancers and the tumor microenvironment. systems of partial differential equations have been used to simulate the magnitude of interaction between these factors. continuum models are suitable for describing individual cell migration, changes in cancer cell density, diffusion of chemo-attractants, heat transfer in hyperthermia treatment for skin cancer, cell adhesion, and the molecular network of a cancer cell as an entire entity. however, these types of in silico models have limited ability for investigating single-cell behavior and cell-cell interaction. on the other hand, "discrete" models (i.e. cellular automata models) represent cancer cells as discrete entities of defined location and scale, interacting with one another and with external factors in discrete time intervals according to predefined rules. agent-based models expand the cellular automata paradigm to include entities of divergent functionalities interacting together in a single spatial representation, including different cell types, genetic elements, and environmental factors. agent-based models have been used for modeling three-dimensional tumor cell patterning, immune system surveillance, angiogenesis, and the kinetics of cell motility. hybrid models have been created which incorporate both continuum and agent-based variables in a modular approach.
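a minimal cellular-automaton sketch of such a discrete model: cells occupy lattice sites and divide into empty neighbours according to a predefined rule, in discrete time steps. the rule set here is invented for illustration and is far simpler than published tumor automata, which add quiescence, necrosis, and microenvironmental coupling:

```python
import random

def grow_tumor(size=21, steps=15, seed=0):
    """Cellular-automaton sketch of discrete tumor growth: each occupied
    lattice site attempts to place one daughter cell into a randomly
    chosen empty von Neumann neighbour per time step."""
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]
    grid[size // 2][size // 2] = 1  # seed a single tumor cell in the center
    for _ in range(steps):
        occupied = [(r, c) for r in range(size) for c in range(size) if grid[r][c]]
        for r, c in occupied:
            nbrs = [(r + dr, c + dc)
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= r + dr < size and 0 <= c + dc < size
                    and not grid[r + dr][c + dc]]
            if nbrs:  # division only into free space (contact inhibition)
                nr, nc = rng.choice(nbrs)
                grid[nr][nc] = 1
    return grid
```

even this toy rule reproduces the qualitative feature discrete models are valued for: growth happens at the rim, where free space exists, while the interior saturates.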
hybrid models are ideal for examining direct interactions between individual cells and between the cells and their microenvironment, but they also allow us to analyze the emergent properties of complex multi-cellular systems (such as cancer). hybrid models are often multi-scale by definition, integrating processes on different temporal and spatial scales, such as gene expression, intracellular pathways, intercellular signaling, and cell growth or migration. there are two general classes of hybrid models, those that are defined upon a lattice and those that are off-lattice. the classification of hybrid models into these two classes depends on the number of cells the models can handle and the included details of each individual cell structure, i.e. models dealing with large cell populations but with simplified cell geometry, and those that model small colonies of fully deformable cells. for example, a hybrid model investigated the invasion of healthy tissue by a solid tumor. the model focused on four key variables implicated in the invasion process: tumor cells, host tissue (extracellular matrix), matrix-degradative enzymes, and oxygen. the model is hybrid in that the tumor cells were considered as discrete individuals, while the remaining variables were treated in the continuous domain in terms of concentrations. this hybrid model can make predictions on the effects of individual-based cell interactions (both between individuals and with the matrix) on tumor shape. the model of zhang et al. (2007) incorporated a continuous model of a receptor signaling pathway, an intracellular transcriptional regulatory network, cell-cycle kinetics, and three-dimensional cell migration in an integrated, agent-based simulation of solid brain tumor development.
the interactions between cellular and microenvironment states have also been considered in a multi-scale model that predicts tumor morphology and phenotypic evolution in response to such extracellular pressures. the biological context in which cancers develop is taken into consideration in in silico models of the tumor microenvironment. such complex tumor microenvironments may integrate multiple factors including extracellular biomolecules, vasculature, and the immune system. however, rarely have these methods been integrated with a large cell-cell communication network in a complex tumor microenvironment. recently, an interesting effort of in silico modeling was described in which the investigators integrated all the intercellular signaling pathways known to date for human glioma and generated a dynamic cell-cell communication network associated with the glioma microenvironment. then they applied evolutionary population dynamics and the hill functions to interrogate this intercellular signaling network and execute an in silico tumor microenvironment development. the observed results revealed a profound influence of the micro-environmental factors on tumor initiation and growth, and suggested new options for glioma treatment by targeting cells or soluble mediators in the tumor microenvironment (wu et al., 2012) . trauma and infection can cause acute inflammatory responses, the degree of which may have several pathological manifestations like systemic inflammatory response syndrome (sirs), sepsis, and multiple organ failure (mof). however, an appropriate management of these states requires further investigation. translating the results of basic science research to effective therapeutic regimes has been a longstanding issue due in part to the failure to account for the complex nonlinear nature of the inflammatory process wherein sirs/mof represent a disordered state. hence, the in silico modeling approach can be a promising research direction in this area. 
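the use of hill functions to couple soluble microenvironmental mediators to tumor growth, in the spirit of the glioma network study cited above (wu et al., 2012), can be illustrated with a deliberately tiny two-variable sketch. the equations, parameter values, and initial conditions are illustrative assumptions, not the published model.

```python
def hill(x, k, n):
    """hill activation function: fraction of maximal response at mediator level x."""
    return x**n / (k**n + x**n)

def simulate(m0=0.0, days=50, dt=0.1,
             growth=0.3, death=0.1, k=1.0, n=2,
             secretion=0.05, decay=0.02):
    """toy dynamics: the mediator boosts tumor growth via hill kinetics,
    and the tumor in turn secretes the mediator (forward euler integration)."""
    tumor, mediator = 0.01, m0
    for _ in range(int(days / dt)):
        dT = (growth * hill(mediator, k, n) - death) * tumor
        dM = secretion * tumor - decay * mediator
        tumor += dT * dt
        mediator += dM * dt
    return tumor, mediator

low_t, _ = simulate(m0=0.0)   # mediator-poor microenvironment
high_t, _ = simulate(m0=2.0)  # mediator-rich microenvironment
```

even this caricature reproduces the qualitative finding quoted above: the same initial lesion regresses in a mediator-poor microenvironment and expands in a mediator-rich one, which is why targeting soluble mediators is a plausible therapeutic lever.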
indeed, in silico modeling of inflammation has been applied in an effort to bridge the gap between basic science and clinical trials. specifically, both agent-based modeling and equation-based modeling have been utilized. equation-based modeling encompasses primarily ordinary differential equations (ode) and partial differential equations (pde). initial modeling studies were focused on the pathophysiology of the acute inflammatory response to stress, and these studies suggested common underlying processes generated in response to infection, injury, and shock. later, mathematical models included the recovery phase of injury and gave insight into the link between the initial inflammatory response and the subsequent healing process. the first mathematical models of wound healing date back to the 1980s and early 1990s. these models and others developed in the 1990s investigated epidermal healing, repair of the dermal extracellular matrix, wound contraction, and wound angiogenesis. most of these models were deterministic and formulated using differential equations. in addition, recent models have been formulated using differential equations to analyze different strategies for improved healing, including wound vacs, commercially engineered skin substitutes, and hyperbaric oxygen. in addition, agent-based models have been used in wound healing research. for example, mi et al. (2007) developed an agent-based model to analyze different treatment strategies with wound debridement and topical administration of growth factors. their model produced the expected results of healing when analyzing different treatment strategies including debridement, release of pdgf, reduction in tumor necrosis factor-α, and increase of tgf-β1. the investigators suggested that a drug company should use a mathematical model to test a new drug before going through the expensive process of basic science testing, toxicology, and clinical trials.
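a minimal equation-based (ode) sketch in the spirit of the acute-inflammation models described above: a pathogen/damage variable grows logistically and is cleared by an inflammatory mediator that it itself activates. the functional form and all rate constants are illustrative assumptions.

```python
def inflammation(p0=0.1, days=30, dt=0.01,
                 kp=0.8, km=1.5, kpm=10.0, decay=0.5):
    """toy two-variable ode model, forward euler:
       P (pathogen/damage): logistic growth minus mediator-dependent clearance;
       M (inflammatory mediator): activated by P, decays at a fixed rate."""
    P, M = p0, 0.0
    for _ in range(int(days / dt)):
        dP = kp * P * (1 - P) - kpm * P * M
        dM = km * P - decay * M
        P = max(0.0, P + dP * dt)
        M = max(0.0, M + dM * dt)
    return P, M

P_final, M_final = inflammation()
```

the characteristic behavior (an early pathogen surge, a delayed mediator response, then resolution toward a low-pathogen state) is the kind of transient these ode models were built to study; an agent-based version would replace P and M with individual cells and diffusing cytokines.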
indeed, clinical trial design can be improved by prior in silico modeling. for example, in silico modeling has led to the knowledge that patients who suffered from the immune-suppressed phenotype of late-stage multiple organ failure, and were susceptible to usually trivial nosocomial infections, demonstrated sustained elevated markers of tissue damage and inflammation through two weeks of simulated time. however, anti-cytokine drug trials with treatment protocols of only one dose or one day had not incorporated this knowledge into their design, with subsequent failure of candidate treatments. by now the reader is expected to be familiar with the meaning and the basics of in silico modeling. in this section we discuss the application of in silico modeling to the understanding of infectious diseases and to the proposition/development of better treatments for them. in fact, the applications of in silico modeling reach far beyond understanding the dynamics (and sometimes the statistics) of infectious diseases and proposing or developing better treatments; modeling can also help improve the prevention of infectious diseases. the level of pathogen within the host defines the process of infection; such pathogen levels are determined by the growth rate of the pathogen and its interaction with the host's immune response system. initially, no pathogen is present, just a low-level, nonspecific immunity within the host. on infection, the pathogen grows abundantly over time, with the potential to transmit the infection to other susceptible individuals. to comprehensively understand in silico modeling in the domain of infectious diseases, one should first understand the "triad of infectious diseases," and the characteristics of "infectious agent," "host," and "environment" on which the models are always based.
in fact, modeling of infectious diseases is impossible without this triad; after all, any model is built on parameters (also called variables in more general language), and those parameters always have their origin in the so-called "triad of infectious diseases." at this point, a good question would be: what is the "triad of infectious diseases"? it refers to the interactions between (1) the agent, which is the disease-causing organism (the pathogen); (2) the host, which is the infected organism (or, in the case of pre-infection, the organism to be infected, i.e. the animal the agent infects); and (3) the environment, which is a kind of link between the agent and the host, and is essentially an umbrella word for the entirety of the possible media through which the agent reaches the host. now that we have an idea of what in silico models of infectious diseases are generally based on, we will outline a better understanding of the parameters that are considered in most in silico disease models. to discuss the parameters in an orderly manner, we categorize them under each of the three components of the "triad of infectious diseases," and summarize them in the next sub-section. it must be emphasized at this point that, even though all the possible parameters for in silico modeling of infectious diseases can be successfully categorized under the characteristics of one of the three components of the "triad of infectious diseases" (agent, host, and environment), the parameters discussed in the next sub-section are by no means the entirety of all the possible parameters that can be included in in silico modeling of infectious diseases. in fact, many more parameters exist, and this section cannot possibly enumerate them all; that is why we have discussed the parameters using a categorical approach.
some of the parameters for in silico modeling of infectious diseases are essentially measures of the infectivity (ability to enter the host), pathogenicity (ability to cause divergence from homeostasis/disease), virulence (degree of divergence from homeostasis caused/ability to cause death), antigenicity (ability to bind to mediators of the host's adaptive immune system), and immunogenicity (ability to trigger an adaptive immune response) of the concerned infectious agent. the exact measure (and thus the units) used can vary markedly depending on the intentions for which the in silico infectious disease model is built, as well as the assumptions on which it is based. note that, unlike parameters related to the other characteristics of the agent, the parameters related to infectivity find their most important use in the modeling of the pre-infection stage of infectious disease. finally, some of the agent-related parameters of great importance in in silico modeling of infectious diseases are the concentration of the agent's antigen-host antibody complex, case fatality rate, strain of the agent, other genetic information of the agent, etc. the parameters originating from characteristics of the host can likewise vary with the intentions for which the in silico infectious disease model is built and the assumptions on which it is based; however, these parameters can be grouped and explained under the host's genotype (the allele at the host's specified genetic locus), immunity/health status (biological defenses to avoid infection), nutritional status (feeding habits/food intake characteristics), gender (often categorized as male or female), age, and behavior (the host's behaviors that affect its resistance to homeostasis disruptors).
typical examples of host-related parameters are the alleles at some specifically targeted genetic loci; the total white blood cell count; differential white blood cell counts, and/or much more sophisticated counts of specific blood cell types; blood levels of some specific cytokines, hormones, and/or neurotransmitters; daily calorie, protein, and/or fat intake; daily amount of energy expended and/or duration of exercise; etc. at first, parameters originating from the environment might seem irrelevant to the in silico modeling of infectious diseases, but they are relevant. even after the pre-infection stage, the environment still modulates the host-agent interactions. for example, the ability (and thus the related parameters) of the agent to multiply and/or harm the host is continually influenced by the host's environmental conditions, and in a similar way the host's defenses against the adverse effects of the agent are modulated by the host's environmental conditions. however, relatively few of these parameters have been included in in silico infectious disease models in the recent past. a few examples of these parameters are the host's ambient temperature, the host's ambient atmospheric humidity, altitude, the host's light-dark cycle, etc. now that we know the parameters for in silico infectious disease modeling, the next reasonable question would be: "what form does a typical in silico infectious disease model take?" this sub-section attempts to answer this very important question. let us view the in silico model as a system of well-integrated functional equations or formulae. such well-integrated functional equations can be viewed or approximated as a single, albeit more complex, functional equation/formula. it is hence possible to vary any (or a combination) of the variables contained in this equation by running numerical simulations on a computer, depending on the kind of prediction one wants to make.
such in silico models can hence investigate many (maybe close to infinite) possible data points within reasonable limits that one sets depending on the nature of the variables considered. so the equations behind a typical infectious disease in silico model could take the form (equation 21.1): y = β + f(h) + g(a) + g′(e) + ε, where y is the model output; h is the output from a smaller equation that is based on host parameters; β is a constant; f, g, and g′ are link functions which may be the same as or different from each other and from other link functions in this system of equations; a is the output from a smaller equation that is based on agent parameters; e is the output from a smaller equation that is based on environment parameters; and ε is a random error parameter. readers should know that we use the term "link function" to refer to any of the various possible forms of mathematical operations or functions. this means that, based on the complexity of the model, a particular "link function" might be as simple as a mere addition or as complex as several combinations of operators with high-degree polynomials. the agent sub-equation takes the form a = β_a + f_a1(a_1) + f_a2(a_2) + … + f_ax(a_x) + ε, where β_a is a constant; f_a1, f_a2, … f_ax are link functions that may be the same as or different (individually) from (every) other link function in this system of equations; a_1, a_2, … a_x are a set of the agent's parameters (e.g. case fatality rate, agent's genotype, etc.); and ε is a random error parameter. similarly, the environment sub-equation takes the form e = β_e + f_e1(e_1) + f_e2(e_2) + … + f_ex(e_x) + ε, where β_e is a constant; f_e1, f_e2, … f_ex are link functions which may be the same as or different (individually) from (every) other link function in this system of equations; e_1, e_2, … e_x are a set of environmental parameters (e.g. host's ambient temperature, host's ambient atmospheric humidity, etc.); and ε is a random error parameter. muñoz-elías et al.
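the "system of link functions" form described above can be made concrete with a small sketch. every link function, weight, and parameter value below is an illustrative assumption (the text deliberately leaves them open); the point is only the structure: sub-equations for host, agent, and environment feed a top-level combining equation.

```python
import math

def agent_score(params, beta=0.1):
    """agent sub-equation: constant plus a link function of each agent parameter.
    log1p is used here as an arbitrary example of a link function."""
    return beta + sum(math.log1p(v) for v in params.values())

def host_score(params, beta=0.2):
    """host sub-equation, with sqrt as another example link function."""
    return beta + sum(math.sqrt(v) for v in params.values())

def env_score(params, beta=0.05):
    """environment sub-equation, with a simple linear link."""
    return beta + sum(0.01 * v for v in params.values())

def disease_output(host, agent, env, beta=1.0):
    """top-level combining equation: output = beta + f(h) + g(a) + g'(e)
    (the random error term is omitted in this deterministic sketch)."""
    h, a, e = host_score(host), agent_score(agent), env_score(env)
    return beta + 0.5 * h + 0.8 * a + 0.2 * e

y = disease_output(
    host={"age": 64, "wbc_count": 7.5},          # hypothetical host parameters
    agent={"case_fatality_rate": 0.02, "growth_rate": 1.4},  # hypothetical agent parameters
    env={"ambient_temp_c": 25, "humidity_pct": 60},          # hypothetical environment parameters
)
```

in a real application the link functions and weights would be fitted to data rather than chosen by hand, but the compositional shape (parameters → sub-equation outputs → combined output) is the same.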
(2005) documented (through their paper "replication dynamics of mycobacterium tuberculosis in chronically infected mice") a successful in silico modeling of an infectious disease (specifically, tuberculosis). in their in silico modeling of tuberculosis in mice, the researchers investigated both static and dynamic host-pathogen equilibria (i.e. mice-mycobacterium tuberculosis static and dynamic equilibria). the rationale behind their study was that a better understanding of host-pathogen interactions would make possible the development of better anti-microbial drugs for the treatment of tuberculosis (as well as provide similar understanding for other chronic infectious diseases). they modeled different types of host-pathogen equilibria (ranging from completely static equilibrium, through semi-dynamic, to completely dynamic scenarios) by varying the rate of multiplication/growth and the rate of death of the pathogen (mycobacterium tuberculosis) during the infection's chronic phase. through their in silico study (which was also verified experimentally), they documented a number of remarkable findings. for example, they established that "viable bacterial counts and total bacterial counts in the lungs of chronically infected mice do not diverge over time," and they explained that "rapid degradation of dead bacteria is unlikely to account for the stability of total counts in the lungs over time because treatment of mice with isoniazid for 8 weeks led to a marked reduction in viable counts without reducing the total count." readers who are interested in further details on the generation of this in silico model of the dynamics of mycobacterium tuberculosis infection, as well as the complete details of the parameters/variables considered and the comprehensive findings of the study, should refer to the article of muñoz-elías et al. published in infection and immunity.
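the logic of the static-versus-dynamic equilibrium comparison described above can be sketched with a toy bookkeeping model: viable bacteria replicate at rate g and die at rate d, dead bacteria degrade at rate r, and the "total count" is viable plus dead. the rates below are assumed values for illustration, not the published estimates.

```python
def chronic_phase(g, d, r, v0=1e6, days=120, dt=0.1):
    """forward-euler bookkeeping of viable and dead bacterial counts
    during a chronic phase; returns (viable count, total count)."""
    viable, dead = v0, 0.0
    for _ in range(int(days / dt)):
        dv = (g - d) * viable          # net replication of viable bacteria
        dd = d * viable - r * dead     # dead bacteria accumulate, then degrade
        viable += dv * dt
        dead += dd * dt
    return viable, viable + dead

# completely static equilibrium: no replication, no death
v_static, t_static = chronic_phase(g=0.0, d=0.0, r=0.0)

# dynamic equilibrium: replication balanced by death, no degradation of dead cells
v_dyn, t_dyn = chronic_phase(g=0.01, d=0.01, r=0.0)
```

the contrast mirrors the quoted finding: in a dynamic equilibrium without rapid degradation, total counts grow away from viable counts, so the experimental observation that the two do not diverge constrains which equilibrium scenario is plausible.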
another one of the many notable works in the domain of infectious disease in silico modeling is the study by navratil et al. (2011). using protein-protein interaction data that the authors obtained from available literature and public databases, they (after first curating and validating the data) computationally (in silico) re-examined the virus-human protein interactome. interestingly, the authors were able to show that the onset and pathogenesis of some disease conditions (especially chronic disease conditions) often believed to be of genetic, lifestyle, or environmental origin are, in fact, modulated by infectious agents. models have been constructed to simulate bacterial dynamics, such as growth under various nutritional and chemical conditions, chemotactic response, and interaction with host immunity. clinically important models of bacterial dynamics relating to peritoneal dialysis, pulmonary infections, and particularly antibiotic treatment and bacterial resistance, have also been developed. baccam et al. (2006) utilized a series of mathematical models of increasing complexity that incorporated target cell limitation and the innate interferon response. the models were applied to examine influenza a virus kinetics in the upper respiratory tracts of experimentally infected adults. they showed the models to be applicable for improving the understanding of influenza a virus infection, and estimated that during an upper respiratory tract infection, the influenza virus initially spreads rapidly, with one infected cell producing infection (on average) in about 20 others (daun and clermont, 2007). model parameters and the spread of disease: model parameters are one of the main challenges in mathematical modeling, since not all model parameters have a physiological meaning.
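the target-cell-limited class of models used in the influenza work described above has a standard minimal form: target cells T are infected at rate proportional to virus V, infected cells I die and produce virus, and virus is cleared. the parameter values below are assumed, order-of-magnitude choices for illustration, not the published estimates.

```python
def tiv(beta=2.7e-5, delta=4.0, p=1.2e-2, c=3.0,
        T0=4e8, V0=0.75, days=8, dt=0.001):
    """target-cell-limited (T, I, V) model, forward euler:
       dT/dt = -beta*T*V
       dI/dt =  beta*T*V - delta*I
       dV/dt =  p*I - c*V
    returns (final target cells, final virus, peak virus)."""
    T, I, V = T0, 0.0, V0
    peak = V
    for _ in range(int(days / dt)):
        dT = -beta * T * V
        dI = beta * T * V - delta * I
        dV = p * I - c * V
        T += dT * dt
        I += dI * dt
        V += dV * dt
        peak = max(peak, V)
    return T, V, peak

T_final, V_final, V_peak = tiv()
```

with these assumed parameters the basic reproductive number beta*p*T0/(delta*c) is well above 1, so the simulation shows the characteristic course: rapid early viral growth, a sharp peak, then decline as the pool of uninfected target cells is exhausted.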
sensitivity analysis and bifurcation analysis give us the opportunity to understand how model outcome and model parameters are correlated, how sensitive the system is with respect to certain parameters, and the uncertainty in the model outcome yielded by the uncertainties in the parameter values. uncertainty and sensitivity analysis has been used to evaluate the role input parameters play in the basic reproductive rate (r0) of severe acute respiratory syndrome (sars) and tuberculosis. control of an outbreak depends on identifying the disease parameters that are likely to lead to a reduction in r0. the difficulty of finding the most appropriate set of parameters for in silico modeling of infectious diseases is often a challenge. it is hoped this challenge will subside with advancements in infectonomics and high-throughput technology. however, another important challenge lies in understanding (and providing reasonable interpretations for) the results of all the complex interactions of the parameters considered. in this sub-section we focus on the application of in silico modeling to improve knowledge of neuronal diseases, and thus improve the applications of neurological knowledge for solving neuronal health problems. it is not an overstatement to say that one of the many aspects of the life sciences where in silico disease modeling would have the biggest applications is in the better understanding of the pathophysiology of nervous system (neuronal) diseases. this is basically because of the inherently delicate nature of the nervous system and the usual extra need to be sure of how to proceed prior to attempting to treat neuronal disease conditions. by this we mean that the need to first model neuronal disease conditions in silico prior to deciding on or suggesting (for example) a treatment plan is, in fact, rising. this is not unexpected; after all, it is better to be sure of what would work (say, through in silico modeling) than to try what would not work.
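the parameter-sensitivity idea mentioned above can be demonstrated on the simplest possible case: for an sir-type model with transmission rate beta and recovery rate gamma, r0 = beta/gamma, and a one-at-a-time normalized sensitivity index tells us how strongly each parameter drives r0. the numerical values of beta and gamma below are arbitrary illustrative choices.

```python
def r0(beta, gamma):
    """basic reproductive rate of a simple sir-type model."""
    return beta / gamma

def normalized_sensitivity(f, params, name, h=1e-6):
    """one-at-a-time normalized (elasticity) sensitivity index:
    (dF/dp) * (p / F), estimated by a forward finite difference."""
    base = f(**params)
    bumped = dict(params, **{name: params[name] * (1 + h)})
    deriv = (f(**bumped) - base) / (params[name] * h)
    return deriv * params[name] / base

params = {"beta": 0.4, "gamma": 0.2}      # assumed illustrative rates
s_beta = normalized_sensitivity(r0, params, "beta")     # ~ +1
s_gamma = normalized_sensitivity(r0, params, "gamma")   # ~ -1
```

the indices come out near +1 and -1: a 1% increase in transmission raises r0 by about 1%, while a 1% faster recovery lowers it by about 1%, which is exactly the kind of ranking used to decide which disease parameters are the best control targets.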
obtaining appropriate parameters for the in silico modeling of a nervous system (neuronal) disease is rooted in a good understanding of the pathophysiology of that disease. since comprehensive details of the pathophysiology of neuronal diseases are beyond the scope of this book, we only present the basic ideas that allow the reader to understand how in silico modeling of a nervous system (neuronal) disease can be done. to give a generalized explanation and still concisely present the basic ideas underlying the pathophysiology of neuronal diseases, we proceed by systematically categorizing the mediators of neuronal disease pathophysiology into: (1) nerve cell characteristics, (2) signaling chemicals and body electrolytes, (3) host/organism factors, and (4) environmental factors. readers should see all these categories as being highly integrated pathophysiologically rather than as separate entities; we have only grouped them this way to simplify the explanation of how the parameters for in silico modeling of neuronal diseases are generated. when something goes wrong with (or there is a marked deviation from equilibrium in) a component of any of the four categories above, the other components (within and/or outside the same category) try hard to make adjustments so as to annul/compensate for the undesired change. for example, if the secretion of a chemical signal suddenly becomes abnormally low, the target cells for the chemical signal may develop mechanisms to use the signaling chemical more efficiently, and the degradation rate of the signaling chemical may be reduced considerably. through these adjustments, the potentially detrimental effects of reduced secretion of the chemical signal are annulled via compensation from the other components. this is just a simple example; much more complex regulatory and homeostatic mechanisms exist in the neuronal system.
despite the robustness of those mechanisms, things still get out of hand sometimes, and disease conditions result. the exploration of what happens in (and to) each and all of the components of this giant system during disease conditions is called the pathophysiology of neuronal disease, and it is this pathophysiology that provides parameters for the in silico modeling of neuronal diseases. some of the important parameters (of nerve cell origin) for a typical in silico model of a neuronal disease (say, alzheimer's disease) are the population (or relative population) of specific neuronal cells (such as glial cells: microglia, astrocytes, etc.), motion of specific neuronal cells (e.g. microglia), amyloid production, aggregation and removal of amyloid, morphology of specific neuronal cells, status of neuronal cell receptors, generation/regeneration/degeneration rates of neuronal cells, status of neuronal ion channels, etc. based on their relevance to the pathophysiology of the neuronal disease being studied, many of these parameters are often considered in the in silico modeling of the neuronal disease. more importantly, their spatiotemporal dynamics are often seriously considered. the importance of signaling chemicals and electrolytes in the nervous system makes parameters related to them very important. the secretion, uptake, degradation, and diffusion rates of various neurotransmitters and cytokines are often important parameters in the in silico modeling of neuro-diseases. other important parameters are the concentration gradients of the various neurotransmitters and cytokines, the availability and concentration of second messengers, and the electrolyte status/balance of the cells/systems. the spatiotemporal dynamics of all of these are also often seriously considered. the parameters under host/organism factors can be highly varied depending on the intentions and the assumptions governing the in silico disease modeling.
nonetheless, one could basically group and list the parameters collectively under genotype (based on the allele at a specified genetic locus), nutritional status (feeding habits/food intake characteristics; e.g. daily calories, protein intake, etc.), gender (male or female), age, and behavior (the host's behaviors/lifestyle that influence homeostasis and/or responses to stimuli). a few examples of environment-related parameters are ambient temperature, altitude, light-dark cycle, social network, type of influences from people in the network, etc. just like other in silico models, a neuronal disease in silico model is also based on what could be viewed as a single giant functional equation, which is composed of highly integrated simpler functional equations. so the equations behind a typical neuronal disease in silico model could take the form (equation 21.5): n = β + f(c) + g(s) + j(h) + k(e) + ε, where n could be a parameter that is a direct measure of the disease manifestation; β is a constant; f, g, j, and k are link functions which may be the same as or different from other link functions in this system of equations; c, s, h, and e are the outputs from smaller equations that are based on parameters from neuronal cell characteristics, signaling molecule and electrolyte parameters, host parameters, and environment parameters, respectively; and ε is a random error parameter. the reader should know that each of n, c, s, h, and e could have resulted from smaller equations that take forms similar to those (equations 21.2 to 21.4) described under in silico modeling of infectious diseases (previous sub-section). edelstein-keshet and spiros (2002) used in silico modeling to study the mechanism and formation of alzheimer's disease. the target of their in silico modeling was to explore and demystify how the various parts implicated in the etiology and pathophysiology of alzheimer's disease work together as a whole.
employing the strength of in silico modeling, the researchers were able to transcend the difficulty of identifying detailed disease progression scenarios, and they were able to test a wide variety of hypothetical mechanisms at various levels of detail. readers interested in the complete details of the assumptions that govern in silico modeling of alzheimer's disease, the various other aspects of the model, and more detailed accounts of the findings should consult the article by edelstein-keshet and spiros. several other interesting studies have applied in silico modeling techniques to investigate various neuronal diseases. a few examples include the work of altmann and boyton (2004), who investigated multiple sclerosis (a very common disease resulting from demyelination in the central nervous system) using in silico modeling techniques; lewis et al. (2010), who used in silico modeling to study the metabolic interactions between multiple cell types in alzheimer's disease; and raichura et al. (2006), who applied in silico modeling techniques to dynamically model alpha-synuclein processing in normal and parkinson's disease states. a more specific example of a molecular-level in silico alzheimer's disease model can be found in ghosh et al. (2010). among the amyloid proteins, amyloid-β (aβ) peptides (aβ42 and aβ40) are known to form aggregates that deposit as senile plaques in the brains of alzheimer's disease patients. the process of aβ aggregation is strongly nucleation-dependent, as indicated by the occurrence of a "lag phase" prior to fibril growth, which gives the growth curve a sigmoidal pattern. ghosh et al. (2010) dissected the growth curve into three biophysically distinct sections to simplify modeling and to allow the data to be experimentally verifiable. stage i is where the pre-nucleation events occur, whose mechanism is largely unknown.
the pre-nucleation stage is extremely important in dictating the overall aggregation process, where critical events such as conformation change and concomitant aggregation take place, and it is also the most experimentally challenging to decipher. in addition to mechanistic reasons, this stage is also physiologically important, as low-molecular-weight (lmw) species are implicated in ad pathology. the rate-limiting step of nucleation is followed by growth. the overall growth kinetics and the structure and shape of the fibrils are mainly determined by the structure of the nucleating species. important intermediates along the aggregation pathway, called "protofibrils," have been isolated and characterized that have propensities both to elongate (by monomer addition) and to laterally associate (protofibril-protofibril association) to grow into mature fibrils (stage iii in the growth curve). ghosh et al. (2010) generated an ode-based molecular simulation (using mass-kinetics methodology) of this fibril growth process to estimate the rate constants involved in the entire pathway. the dynamics involved in the protofibril elongation stage of the aggregation (stage iii of the process) were estimated and validated by in vitro biophysical analysis. ghosh et al. (2010) next used the rate constants identified from stage iii to create a complete aggregation pathway simulation (combining stages i, ii, and iii) to approximately identify the nucleation mass involved in aβ aggregation. in order to model the aβ system, one needs to estimate the rate constants involved in the complete pathway and the nucleation mass itself. it is difficult to iterate through different values for each of these variables to get close to the experimental plots (fibril growth curves measured via fluorescence measurements over time) due to the large solution space; moreover, finding the nucleation mass cannot be done independently without estimating the rate constants alongside.
however, having separately estimated the post-nucleation stage rate constants (as mentioned above) reduces the overall parameter estimation complexity. the complete pathway simulation was used to study the lag times associated with the aggregation pathway, and hence predict possible estimates of the nucleation mass. the following strategy was used: estimate the pre-nucleation rate constants that give the maximum lag times for each possible estimate of the nucleation mass. this led to four distinctly different regimes of possible nucleation masses corresponding to four different pairs of rate constants for the pre-nucleation phase (regime 1, where n = 7, 8, 9, 10, 11; regime 2, where n = 12, 13, 14; regime 3, where n = 15, 16, 17; and regime 4, where n = 18, 19, 20, 21). however, it was experimentally observed that the semi-log plot of the lag times against the initial concentration of aβ is linear, and this characteristic was used to figure out which values of the nucleation mass are most feasible for the aβ42 aggregation pathway. the simulated plots show a more stable relationship between the lag times and the initial concentrations, and the best predictions for the nucleation mass were reported to be in the range 10-16. such molecular pathway level studies are extremely useful in understanding the pathogenesis of ad in general, and can motivate drug development exercises in the future. for example, characterization of the nucleation mass is important, as it has been observed that various fatty acid interfaces can arrest the fibril growth process (by stopping the reactions beyond the pre-nucleation stage). such in-depth modeling of the aggregation pathway can suggest what concentrations of fatty acid interfaces should be used (under a given aβ concentration in the brain) to arrest the fibril formation process, leading to direct drug dosage and interval prediction for ad patients.
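the nucleation-dependent, sigmoidal kinetics discussed above can be caricatured with a two-species ode sketch: monomer forms nuclei of size n slowly (nucleation) and adds to existing fibril mass quickly (elongation). the rate constants and nucleus size below are assumed toy values, not ghosh et al.'s estimates; the sketch only reproduces the qualitative lag-time behavior.

```python
def fibril_growth(m0, n=4, kn=1e-4, ke=0.5, t_end=200.0, dt=0.01):
    """toy nucleation-elongation model, forward euler:
    monomer M is consumed by slow nucleation (kn * M**n, creating nuclei of n
    monomers) and fast elongation (ke * M * F); F is total fibril mass.
    returns (final fibril mass, lag time to reach 10% of initial monomer)."""
    M, F = m0, 0.0
    t, lag = 0.0, None
    while t < t_end:
        nucleation = kn * M**n
        elongation = ke * M * F
        consumed = (n * nucleation + elongation) * dt
        M = max(0.0, M - consumed)
        F += consumed
        t += dt
        if lag is None and F >= 0.1 * m0:
            lag = t            # crude lag-time estimate: 10% conversion
    return F, lag

f1, lag1 = fibril_growth(m0=1.0)   # lower initial concentration
f2, lag2 = fibril_growth(m0=2.0)   # higher initial concentration
```

even this toy version shows the concentration dependence the text describes: a higher initial monomer concentration shortens the lag phase, which is the relationship (lag time versus initial concentration) used above to pin down feasible nucleation masses.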
despite the fact that we have mentioned several possible parameters for in silico modeling of neuro-diseases, it is noteworthy that finding a set of the most reasonable parameters for the modeling is in fact a big challenge. on the other hand, understanding (and thus finding reasonable biological interpretations for) the results from the complex interaction of all the parameters considered is also a big challenge. in addition, a number of the assumptions that models are sometimes based on remain controversial. accurately modeling the spatio-temporal dynamics of neurons and neurotransmitters (and other chemicals/ligands) also constitutes a huge challenge. understanding the complex systems involved in a disease will make it possible to develop smarter therapeutic strategies. treatments for existing tumors will use multiple drugs to target the pathways or perturbed networks that show an altered state of activity. in addition, models can effectively form the basis for translational research and personalized medicine. biological function arises as the result of processes interacting across a range of spatiotemporal scales. the ultimate goal of the applications of bioinformatics in systems biology is to aid in the development of individualized therapy protocols to minimize patient suffering while maximizing treatment effectiveness. it is now being increasingly recognized that multi-scale mathematical and computational tools are necessary if we are to fully understand these complex interactions (e.g. in cancer and heart diseases). with the bioinformatics tools, computational theories, and mathematical models introduced in this article, readers should be able to dive into the exhilarating area of formal computational systems biology.
investigating these models and confirming their findings by experimental and clinical observations is a way to bring together molecular reductionism with quantitative holistic approaches, creating an integrated mathematical view of disease progression. we hope to have shown that there are many interesting challenges yet to be solved, and that a structured and principled approach is essential for tackling them. systems biology is an emerging field that aims to understand biological systems at the systems level with a high degree of mathematical and statistical modeling. in silico modeling of infectious diseases is a rich and growing field focused on modeling the spread and containment of infections, with model designs flexible enough to adapt to new data types. avoiding animal testing is often cited as a key advantage of in silico modeling: there are no ethical issues in performing in silico experiments, as they require no animals or live cells. furthermore, as the entire modeling and analysis are computational, results can be obtained within an hour, saving huge amounts of time and reducing cost, two major factors associated with in vitro studies. however, a key issue that needs to be considered is whether in silico testing will ever be as accurate as in vitro or in vivo testing, or whether in silico results will always require non-simulated experimental confirmation. tracqui et al. (1995) successfully developed a glioma model to show how chemo-resistant tumor sub-populations cause treatment failure. similarly, a computational model of tumor invasion by frieboes et al. (2006) demonstrates that the growth of a tumor depends on the microenvironmental nutrient status, the pressure of the tissue, and the applied chemotherapeutic drugs. the 3d spatio-temporal simulation model of a tumor by dionysiou et al.
(2004) could simulate tumor repopulation, expansion, and shrinkage, thus providing a computational approach for the assessment of radiotherapy outcomes. the glioblastoma model of kirby et al. (2007) is able to predict survival outcome post-radiotherapy. wu et al. (2012) also developed an in silico glioma microenvironment demonstrating that targeting microenvironmental components could be a potential anti-tumor therapeutic approach. an in silico model-based systems biology approach to skin sensitization (tnf-alpha production in the epidermis) and skin allergy risk assessment has been successfully carried out by maxwell and mackay (2008) at the unilever safety and environmental assurance centre; it can replace well-known in vitro assays used for the same purpose, such as the mouse local lymph node assay (llna). similarly, davies et al. (2011) effectively demonstrated an in silico skin permeation assay based on time course data for application in skin sensitization risk assessment. kovatchev et al. (2012) showed how an in silico model of alcohol dependence can provide virtual clues for classifying the physiology and behavior of patients so that personalized therapy can be developed. pharmacokinetics and pharmacodynamics are used to study the absorption, distribution, metabolism, and excretion (adme) of administered drugs. in silico models have tremendous efficacy in the early estimation of various adme properties. quantitative structure-activity relationship (qsar) and quantitative structure-property relationship (qspr) models have been commonly used for several decades to predict adme properties of a drug at early phases of development. there are several in silico models applied in adme analysis, and readers are encouraged to read the review by van de waterbeemd and gifford (2003).
gastroplus™, developed at simulations plus (www.simulations-plus.com), is a highly advanced, physiologically based pharmacokinetic (pbpk) simulation software package that can generate results within 5 seconds, saving huge amounts of time and cost in clinical studies. the software is an essential tool for formulation scientists in in vitro dose disintegration and dissolution studies. towards next-generation treatment of spinal cord injuries, novartis (www.novartis.com) is working to model the human spinal cord and its surrounding tissues in silico, to check the feasibility of monoclonal antibody-based drug administration and to study their pharmacokinetics and pharmacodynamics. the in silico "drug re-purposing" approach of bisson et al. (2007) demonstrated how phenothiazine-derivative antipsychotic drugs such as acetophenazine can cause endocrine side effects. recently, aguda et al. (2011) reported a computational model of sarcoidosis dynamics that is useful in pre-clinical therapeutic studies for assessing dose optimization of the targeted drugs used to treat sarcoidosis. towards designing personalized therapy for larynx injury leading to acute vocal fold damage, li et al. (2008) developed agent-based computational models. in a further advancement, entelos® (www.entelos.com) has developed "virtual patients," in silico mechanistic models of type-2 diabetes, rheumatoid arthritis, hypertension, and atherosclerosis, for identification of biomarkers and drug targets, development of therapeutics, clinical trial design, and patient stratification. entelos' virtual nod mouse can replace live non-obese diabetic (nod) mice in various type-1 diabetes in vivo experiments. apart from diseases, systems-level modeling of basic biological phenomena and their applications in disease has also been reported. an in silico model that mimics the in vitro rolling, activation, and adhesion of individual leukocytes has been developed by tang et al. (2007).
cree et al. (2008) developed virtual mitochondria to study mitochondrial dna segregation during embryogenesis. vipr (http://www.viprbrc.org/brc/home.do?decorator=vipr) is one of the five bioinformatics resource centers (brcs) funded by the national institute of allergy and infectious diseases (niaid). this website provides a publicly available database and a number of computational analysis tools to search and analyze data for virus pathogens. some of the tools available at vipr are the following: 1. gatu (genome annotation transfer utility), a tool to transfer annotations from a previously annotated reference to a new, closely related target genome. 2. a pcr primer design tool. 3. a sequence format conversion tool. 4. a tool to identify short peptides in proteins. 5. a metadata-driven comparative analysis tool. as there are many different kinds of tools available, the tools on the website are organized by virus family. the rat genome database (rgd) (http://rgd.mcw.edu/wg/home) is funded by the national heart, lung, and blood institute (nhlbi) of the national institutes of health (nih). the goal of this project is to consolidate research work from various institutes to generate and maintain a rat genomic database (and make it available to the scientific community). the website provides a variety of tools to analyze data. the influenza research database is another of the bioinformatics resource centers (brcs) funded by the niaid. this website provides a publicly available database and a number of computational analysis tools to search and analyze data for influenza virus, and offers many of the same tools that are provided at vipr. there are numerous other tools, such as the models of infectious disease agent study (midas), an in silico modeling effort for assessing infectious disease dynamics. midas assists in preparing for, detecting, and responding to infectious disease threats.
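to give a flavor of the simpler utilities in such toolkits, the sketch below implements a minimal fasta-to-tab-delimited converter in the spirit of the sequence format conversion tool listed above. this is an illustrative python sketch of ours, not vipr's actual tool, and the record names are made up.

```python
def parse_fasta(text):
    """parse fasta-formatted text into a {record id: sequence} dict."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()[0]  # keep the id, drop the description
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}

def fasta_to_tab(text):
    """convert fasta text to one 'id<TAB>sequence' line per record."""
    return "\n".join(f"{h}\t{s}" for h, s in parse_fasta(text).items())

fasta = ">seq1 example record\nACGT\nTT\n>seq2\nGGG\n"
```

a real converter would additionally handle formats such as genbank or embl, but the parse/re-emit structure is the same.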
the wellcome trust sanger institute (http://www.sanger.ac.uk/) investigates genomes in the study of diseases that have an impact on global health. the sanger institute has made a significant contribution to genomic research and to developing a new understanding of genomes and their role in biology. the website provides sequenced genomes for various bacterial and viral organisms and for model organisms such as zebrafish, mouse, and gorilla. a number of open source software tools for visualizing and analyzing data sets are available at the sanger institute website.

references
an in silico modeling approach to understanding the dynamics of sarcoidosis
models of multiple sclerosis (autoimmune diseases)
kinetics of influenza a virus infection in humans
discovery of antiandrogen activity of nonsteroidal scaffolds of marketed drugs
using a mammalian cell cycle simulation to interpret differential kinase inhibition in antitumour pharmaceutical development
in silico models for cellular and molecular immunology: successes, promises and challenges
introduction to systems biology
a reduction of mitochondrial dna molecules during embryogenesis explains the rapid segregation of genotypes
from data banks to data bases
in silico modeling in infectious disease
determining epidermal disposition kinetics for use in an integrated nonanimal approach to skin sensitization risk assessment
a four-dimensional simulation model of tumour response to radiotherapy in vivo: parametric validation considering radiosensitivity, genetic profile and fractionation
in silico models of cancer
mathematical modeling of radiotherapy strategies for early breast cancer
exploring the formation of alzheimer's disease senile plaques in silico
in silico pharmacology for drug discovery: methods for virtual ligand screening and profiling
farm animal genomics and informatics: an update
an integrated computational/experimental model of tumor invasion
dynamics of protofibril elongation and association involved in abeta42 peptide aggregation in alzheimer's disease
bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists
mathematical modeling of avascular tumour growth based on diffusion of nutrients and its validation
in silico models of alcohol dependence and treatment
large-scale in silico modeling of metabolic interactions between cell types in the human brain
a patient-specific in silico model of inflammation and healing tested in acute vocal fold injury
application of a systems biology approach to skin allergy risk assessment
agent-based model of inflammation and wound healing: insights into diabetic foot ulcer pathology and the role of transforming growth factor-β1 (wound repair and regeneration)
replication dynamics of mycobacterium tuberculosis in chronically infected mice
when the human viral infectome and diseasome networks collide: towards a systems biology platform for the aetiology of human diseases
dynamic modeling of alpha-synuclein aggregation for the sporadic and genetic forms of parkinson's disease
physiological studies in silico
computational and experimental models of ca2+-dependent arrhythmias
dynamics of in silico leukocyte rolling, activation, and adhesion
whole-cell simulation: a grand challenge of the 21st century
a mathematical model of glioma growth: the effect of chemotherapy on spatio-temporal growth
computational cardiology: the heart of the matter
a survey of available tools and web servers for analysis of protein-protein interactions and interfaces
admet in silico modelling: towards prediction paradise?
translational systems biology of inflammation
bioinformatics applications for pathway analysis of microarray data
bayesian analysis of signaling networks governing embryonic stem cell fate decisions
in silico experimentation of glioma microenvironment development and anti-tumor therapy
development of a three-dimensional multiscale agent-based tumor model: simulating gene-protein interaction profiles, cell phenotypes and multicellular patterns in brain cancer

further reading
'in silico' simulation of biological processes
in silico toxicology: principles and applications
multiscale cancer modeling
in silico immunology
in silico: 3d animation and simulation of cell biology with maya and mel

glossary
algorithm: any well-defined computational procedure that takes some value, or set of values, as input, and produces some value, or set of values, as output.
bioinformatics: the application of statistics and computer science to the field of molecular biology.
biotechnology: the exploitation of biological processes for industrial and other purposes.
data structures: ways to store and organize data on a computer in order to facilitate access and modification.
genome: the complete set of genetic material of an organism.
genomics: the branch of molecular biology concerned with the structure, function, evolution, and mapping of genomes.
gene ontology: a major bioinformatics initiative to unify the representation of gene and gene product attributes across all species.
informatics: the science of processing data for storage and retrieval; information science.
in silico: an expression used to mean "performed on a computer or via computer simulation."
in vivo: in microbiology, often used to refer to experimentation done in live isolated cells rather than in a whole organism.
in vitro: studies in experimental biology conducted using components of an organism that have been isolated from their usual biological surroundings, permitting a more detailed or more convenient analysis than is possible with whole organisms.
kinomics: the study of kinase signaling within cellular or tissue lysates.
oncogenesis: the progression of cytological, genetic, and cellular changes that culminate in a malignant tumor.
pathophysiology: the disordered physiological processes associated with disease or injury.
proteomics: the branch of genetics that studies the full set of proteins encoded by a genome.
sequencing: the process of determining the precise order of nucleotides within a dna molecule.
systems biology: an inter-disciplinary field of study that focuses on complex interactions within biological systems by using a holistic perspective.

key points
1. the template-based transcriptional control network reconstruction method exploits the principle that orthologous proteins regulate orthologous target genes. in this approach, regulatory interactions are transferred from a genome (such as that of a model or otherwise well studied organism) to the new genome.
2. the ultimate goal of in silico modeling in biology is the detailed understanding of the function of molecular networks as they appear in metabolism, gene regulation, or signal transduction.
3. there are two major challenges in modeling infectious diseases: a. finding the most appropriate set of parameters for the in silico model is often difficult. b. understanding the results arising from the complex interactions of all the parameters considered.
4. there are three types of cancer models. continuum models: extracellular parameters are represented as continuously distributed variables to mathematically model cell-cell or cell-environment interactions in the context of cancers and the tumor microenvironment. discrete models: cancer cells are represented as discrete entities of defined location and scale, interacting with one another and with external factors in discrete time intervals according to predefined rules. hybrid models: these incorporate both continuum and discrete variables in a modular approach.
5. there are three types of parameters considered for in silico modeling of infectious diseases: a. parameters derived from characteristics of the agent: examples: concentration of the agent's antigen-host antibody complex; case fatality rate; strain of the agent; other genetic information of the agent; etc. b. parameters derived from characteristics of the host: examples: total white blood cell count; differential white blood cell counts, and/or more sophisticated counts of specific blood cell types; blood levels of specific cytokines, hormones, and/or neurotransmitters; daily calorie, protein, and/or fat intake; daily amount of energy expended and/or duration of exercise; etc. c. parameters derived from characteristics of the environment: examples: host's ambient temperature; host's ambient atmospheric humidity; altitude; host's light-dark cycle; etc.

key: cord-308302-5yns1hg9
authors: wu, gang; zhou, shuchang; wang, yujin; lv, wenzhi; wang, shili; wang, ting; li, xiaoming
title: a prediction model of outcome of sars-cov-2 pneumonia based on laboratory findings
date: 2020-08-20
journal: sci rep
doi: 10.1038/s41598-020-71114-7
sha:
doc_id: 308302
cord_uid: 5yns1hg9

the severe acute respiratory syndrome coronavirus 2 (sars-cov-2) has resulted in thousands of deaths worldwide. information about prediction models for the prognosis of sars-cov-2 infection is scarce. we used machine learning to process the laboratory findings of 110 patients with sars-cov-2 pneumonia (including 51 non-survivors and 59 discharged patients). the maximum relevance minimum redundancy (mrmr) algorithm and the least absolute shrinkage and selection operator (lasso) logistic regression model were used for the selection of laboratory features.
seven laboratory features were selected in the model: prothrombin activity, urea, white blood cell, interleukin-2 receptor, indirect bilirubin, myoglobin, and fibrinogen degradation products. the signature constructed using the seven features had 98% [93%, 100%] sensitivity and 91% [84%, 99%] specificity in predicting the outcome of sars-cov-2 pneumonia. thus it is feasible to establish an accurate prediction model of the outcome of sars-cov-2 pneumonia based on laboratory findings.
www.nature.com/scientificreports/
all methods were carried out in accordance with relevant guidelines and regulations. study design and participants. this study was approved by the ethics commission of the hospital (tj-2020-075). written informed consent was waived by the ethics commission of the hospital. the authors' center was the designated hospital for severe and critical sars-cov-2 pneumonia. patients underwent repeated rt-pcr tests to confirm sars-cov-2. laboratory tests for sars-cov-2 pneumonia included: blood routine test, serum biochemistry (including glucose, renal and liver function, creatine kinase, lactate dehydrogenase, and electrolytes), coagulation profile, cytokine test, markers of myocardial injury, infection-related markers, and other enzymes. repeated tests were done every 3-6 days to monitor the patient's condition. oxygen support (from nasal cannula to invasive mechanical ventilation) was administered to patients according to the severity of hypoxaemia. all patients were administered empirical antibiotic treatment and received antiviral therapy. most patients improved after treatment. however, a few critical patients continued to deteriorate and eventually died. data collection. 58 fatal cases of sars-cov-2 pneumonia (39 male, median age 66 years) were collected through the electronic medical record system. 68 discharged patients with sars-cov-2 pneumonia whose age and gender matched the non-survivors were selected (46 male, median age 66 years).
the admission dates of these patients ranged from feb 16, 2020 to mar 20, 2020. we reviewed all laboratory findings for each patient. results of repeated tests were carefully compared to find the greatest deviation from the normal value. in general, the greatest number in a series of values was recorded. however, for platelets, red blood cells, lymphocytes, hemoglobin, calcium, total protein, albumin, estimated glomerular filtration rate (egfr), and prothrombin activity (pta), the minimum was recorded. laboratory findings on the day of death were not used. these recorded laboratory findings were considered the lab features of a patient. an initial data set of 126 patients (58 non-survivors, 68 discharged) was thus built. 16 patients did not have the entire group of laboratory features, so their data were deleted from the dataset. the remaining data of 110 patients (51 non-survivors, 59 discharged) were analyzed by machine learning. statistical analysis and modeling. first, all variables were compared between non-survivors and discharged patients using the mann-whitney u test for non-normally distributed features or the independent t test for normally distributed features 16,17. features with p < 0.05 were considered significant variables and selected 16,17. second, spearman's correlation coefficient was used to compute correlations among the features 16,17. third, we applied the maximum relevance minimum redundancy (mrmr) algorithm to assess the relevance and redundancy of the features 16,17. the features were ranked according to their mrmr scores 16,17. fourth, the top 15 features with high relevance and low redundancy were selected for the least absolute shrinkage and selection operator (lasso) logistic regression model, which was adopted for further feature selection 16,17.
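the recording rule and the first screening step can be sketched as follows. this is an illustrative python sketch (the paper's analyses were done in r): the feature names and toy values are ours, and the mann-whitney p-value uses a plain normal approximation without tie correction, unlike production implementations.

```python
import math

# features whose minimum (rather than maximum) over repeated tests is recorded
MIN_FEATURES = {"platelets", "red blood cell", "lymphocytes", "hemoglobin",
                "calcium", "total protein", "albumin", "egfr", "pta"}

def record_feature(name, repeated_values):
    """greatest deviation from normal: min for the features above, else max."""
    return min(repeated_values) if name in MIN_FEATURES else max(repeated_values)

def mann_whitney_u(x, y):
    """u statistic and two-sided p-value (normal approximation, no tie
    correction) for comparing a feature between two groups."""
    n1, n2 = len(x), len(y)
    # u counts pairwise wins of x over y; ties count one half
    u = sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0
            for xi in x for yj in y)
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return u, p

# toy comparison: a feature clearly elevated in one group
u, p = mann_whitney_u([10, 11, 12, 13, 14], [1, 2, 3, 4, 5])
```

features with p < 0.05 in such a screen would pass to the mrmr ranking step described next.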
some candidate feature coefficients were shrunk to zero and the remaining variables with non-zero coefficients were finally selected 16,17. the model was used to calculate a signature for each patient. the mann-whitney u test was used to compare the signature between the two groups 16,17. receiver operating characteristic (roc) analysis, precision-recall curve (prc) analysis, and the hosmer-lemeshow test were used for further evaluation of the model. the statistical analyses were performed using r software (version 3.3.4; https://www.r-project.org) 16,17. the following r packages were used: the "corrplot" package to calculate spearman's correlation coefficient; the "mrmre" package to implement the mrmr algorithm; the "glmnet" package to perform the lasso logistic regression; and the "proc" package to construct the roc curve 16,17. nine laboratory features were eliminated in the first step of feature selection because of non-significance. the remaining thirty-eight lab features were significantly different between the two groups (p < 0.05), and mrmr scores were then obtained for them. seven features had non-zero coefficients after the lasso algorithm and were selected for the model. table 1 shows the fifteen features with the highest mrmr scores. figure 1 shows the correlation matrix heatmap of the thirty-eight significant features. figure 2 shows the feature selection process with the lasso algorithm. figure 3 shows the contribution of the seven features to the model. figure 4 shows the signatures of all patients, as well as the roc. figure 5 shows the prc for the model. non-survivors and discharged patients differed significantly in the signature derived from the model (p < 0.0001). the auc was 0.997 [95% ci 0.99, 1.00]. the sensitivity and specificity in predicting the outcome of sars-cov-2 pneumonia were 98% [93%, 100%] and 91% [84%, 99%] respectively. the area under the precision-recall curve (auprc) was 0.996.
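the paper fits the lasso model with r's glmnet. as a language-agnostic illustration of how the l1 penalty shrinks some coefficients exactly to zero, here is a minimal proximal-gradient (ista) sketch in python on toy data; the feature values, penalty strength, and step size are all illustrative, not those of the study.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_threshold(v, t):
    """proximal operator of the l1 penalty: shrinks v toward zero by t."""
    return v - t if v > t else v + t if v < -t else 0.0

def lasso_logistic(X, y, lam=0.05, lr=0.1, iters=5000):
    """l1-penalized logistic regression via proximal gradient (ista);
    the intercept is left unpenalized, as glmnet does."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        w = [soft_threshold(wj - lr * gj, lr * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b

# toy data: feature 0 separates the classes, feature 1 is near-noise
X = [[1.0, 0.3], [1.2, -0.2], [0.8, 0.1], [1.1, -0.4],
     [-1.0, 0.2], [-0.9, -0.3], [-1.2, 0.4], [-0.7, -0.1]]
y = [1, 1, 1, 1, 0, 0, 0, 0]
w, b = lasso_logistic(X, y)
```

after fitting, the informative feature keeps a sizable coefficient while the near-noise one is shrunk toward (often exactly to) zero, which is the behavior the feature-selection step exploits.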
the hosmer-lemeshow test showed good calibration (p = 0.95) for the model. the seven features included in the prediction model were: pta, urea, white blood cell (wbc), interleukin-2 receptor (il-2r), indirect bilirubin (ib), myoglobin, and fibrinogen degradation products (fgdp). all features had positive coefficients except pta. pta and fgdp are from the coagulation profile. urea and ib are from renal and liver function tests respectively. wbc is from the blood routine test. myoglobin is a marker of myocardial injury. il-2r is related to immune response. the signatures derived from the model could be positive or negative numbers. non-survivors and discharged patients did not differ in age or gender (median age 67 vs. 66, p = 0.75; percentage of males, 66% vs. 64%, p = 0.66). the comparisons of laboratory findings between non-survivors and discharged patients are shown in table 2. blood routine test. wbc and neutrophils were significantly higher in the non-survivor group versus the discharge group. lymphocytes, platelets and red blood cells were significantly lower in non-survivors. aucs for them were 0.646-0.910. table 1. the fifteen features with higher mrmr scores were selected for the lasso logistic regression step. some candidate feature coefficients were shrunk to zero and the remaining variables with non-zero coefficients were selected. mrmr maximum relevance minimum redundancy, lasso least absolute shrinkage and selection operator, pta prothrombin activity, wbc white blood cell, il-2r interleukin-2 receptor, ib indirect bilirubin, tb total bilirubin, fgdp fibrinogen degradation products, hs-crp hypersensitive c-reactive protein, ldh lactate dehydrogenase, egfr estimated glomerular filtration rate. cytokine. il-2r and il-6 were significantly higher in the non-survivor group versus the discharge group. aucs for them were 0.689-0.909.
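the per-patient signature and the auc values quoted above can be computed directly from fitted coefficients and scores. the sketch below is an illustrative python fragment; the coefficients and patient values are made up for the example and are not those of the published model.

```python
def signature(features, coefficients, intercept=0.0):
    """linear signature: a positive value suggests poor prognosis,
    a negative value good prognosis."""
    return intercept + sum(c * f for c, f in zip(coefficients, features))

def auc(scores_pos, scores_neg):
    """rank-based auc: the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative one (ties count one half).
    this is the statistic underlying the roc analysis."""
    wins = sum(1.0 if sp > sn else 0.5 if sp == sn else 0.0
               for sp in scores_pos for sn in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))

# made-up two-feature example mirroring the sign pattern in the text:
# pta gets a negative coefficient, urea a positive one
coef = [-0.05, 0.2]
non_survivor = signature([60.0, 20.0], coef)  # low pta, high urea
survivor = signature([95.0, 5.0], coef)       # normal pta, normal urea
```

with these toy numbers the non-survivor's signature comes out positive and the survivor's negative, matching the sign convention described for the model.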
procalcitonin, hypersensitive c-reactive protein, ferritin and n-terminal pro-brain natriuretic peptide (nt-probnp) were significantly higher in non-survivors. non-survivors and discharged patients with sars-cov-2 pneumonia differed significantly in thirty-eight laboratory findings. by using machine learning methods, we established a prediction model involving seven laboratory features. the model was found to be highly accurate in distinguishing non-survivors from discharged patients. the seven features selected by artificial intelligence also indicate that dysfunction of multiple organs or systems correlates with the prognosis of sars-cov-2 pneumonia. sars-cov-2 triggers a series of immune responses and induces a cytokine storm, resulting in changes in immune components 5,18. when the immune response is dysregulated, it results in excessive inflammation and can even cause death 7,19. excessive neutrophils may contribute to acute lung damage and are associated with fatality 20. a higher serum level of il-2r was found in non-survivors, indicating excessive immune response. in addition, high leukocyte count in sars-cov-2 patients may also be due to secondary bacterial infection 21,22. liver injury has been reported to occur during the course of the disease 23,24, and is associated with the severity of disease. increased serum bilirubin levels were observed in fatal cases. acute kidney injury could be related to direct effects of the virus, hypoxia, or shock 25,26. blood urea levels continued to increase in some cases, and non-survivors had higher blood urea compared to survivors. myocardial injury was seen in non-survivors, as suggested by elevated levels of myoglobin. multiple organ dysfunction or failure may be associated with the death of patients with sars-cov-2 pneumonia.
some patients with sars-cov-2 infection progressed rapidly with septic shock, which is well established as one of the most common causes of disseminated intravascular coagulation (dic) 27. the non-survivors in our cohort revealed significantly lower pta compared to survivors. at the late stages of sars-cov-2 infection, levels of fibrin-related markers (fgdp) were markedly elevated in most cases, suggesting a secondary hyperfibrinolysis condition. a number of laboratory features were compared between non-survivors and discharged patients with sars-cov-2 pneumonia; the two groups differed significantly in as many as thirty-eight lab features. however, none of the features provided adequate accuracy in predicting the outcome of sars-cov-2 pneumonia on its own. thus, a novel prediction model involving multiple features was established in this study. with machine learning methods previously used in radiomics, a prediction model combining seven of the thirty-eight laboratory features was built for predicting the outcome of sars-cov-2 pneumonia. the mrmr algorithm was used to assess significant features while avoiding redundancy between features. the mrmr score of a feature is defined as the mutual information between the status of the patients and this feature, minus the average mutual information between previously selected features and this feature 17,28,29. the top fifteen features with high mrmr scores were selected for the next step of modeling. the least absolute shrinkage and selection operator (lasso) logistic regression model was used to process the features selected by the mrmr algorithm. lasso is a regression analysis method that improves model prediction accuracy and interpretability 30. the signature calculated with the model can be a positive or negative number, corresponding to poor and good prognosis respectively. our results showed that the auc of the signature was 10-40% higher than that of any single feature.
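the mrmr score defined above — mutual information with the outcome minus the average mutual information with the already-selected features — lends itself to a greedy implementation. the study used the r "mrmre" package; the sketch below is an illustrative python version on tiny discrete toy data, showing a duplicated feature being penalized for redundancy.

```python
import math
from collections import Counter

def mutual_info(a, b):
    """mutual information (in nats) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log((c / n) / ((pa[x] / n) * (pb[y] / n)))
               for (x, y), c in pab.items())

def mrmr_select(features, outcome, k):
    """greedy mrmr: score = mi(f, outcome) - mean mi(f, already selected)."""
    selected = []
    remaining = dict(features)
    while len(selected) < k and remaining:
        def score(name):
            relevance = mutual_info(remaining[name], outcome)
            if not selected:
                return relevance
            redundancy = sum(mutual_info(remaining[name], features[s])
                             for s in selected) / len(selected)
            return relevance - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected

# toy data: f2 duplicates f1 exactly; f3 is weaker but non-redundant
outcome = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "f1": [0, 0, 0, 1, 1, 1, 1, 1],
    "f2": [0, 0, 0, 1, 1, 1, 1, 1],
    "f3": [0, 0, 1, 0, 1, 1, 0, 1],
}
chosen = mrmr_select(features, outcome, k=2)
```

f1 is picked first for its relevance; f2, though equally relevant, is heavily penalized for its redundancy with f1, so the weaker but independent f3 is picked second — exactly the high-relevance, low-redundancy behavior the text describes.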
the modeling process is a black box; however, the choice of variables seems reasonable. pta can more accurately reflect coagulation function than prothrombin time, and can also reflect the degree of liver injury. urea is a good index of the degree of renal function damage. wbc can reflect not only the immune response but also secondary bacterial infection. it is suitable to start using this model after three repeated laboratory tests (about 2 weeks after admission), because doctors may have enough data at that time. many laboratory findings are generated during hospitalization; which are most important for predicting outcome? our study at least addresses this problem: seven laboratory features can be used to construct a new signature with the model, and the new signature seems more useful than any single feature. we encourage the wide use of such a simple-to-use model in clinical practice. most clinical factors are not continuous variables (such as underlying disease). we used a machine learning method similar to radiomics, which mainly deals with continuous features; our study therefore focused on continuous laboratory variables, and we had to exclude non-continuous clinical factors with the current machine learning method. by using other methods, a model that involves both continuous and categorical variables could be established; clinical factors raised as significant predictive factors (such as respiratory status or radiological features) could then be included. however, there are more than forty laboratory findings in our study, making model establishment difficult, and we felt it necessary to simplify the laboratory features. thus we established a sub-model based on lab findings alone; the new lab signature thus created proved highly valuable. in future studies, the signature may be combined with clinical factors to establish a more complex model. our study has some limitations. first, this is a single-center retrospective study.
multi-center large-sample studies are required to validate our prediction model. second, our model may not be directly used in other centers. however, they could easily establish a prediction model using their own data with machine learning method. third, some patients who did not have all the lab findings were excluded. selection bias must be present due to patients exclusion. other studies with more strict design were thus required to reveal the bias. fourth, statistical approach conducted in this study is not perfect. as lasso was used for 15 variables, 150 or more patients were needed. more patients should be collected in future study. in conclusion, it is feasible to establish a accurate prediction model of outcome of sars-cov-2 pneumonia based on laboratory findings. injury of liver, kidney and myocardium, coagulation disorder and excess immune response all correlate with the outcome of sars-cov-2 pneumonia. after publication, the data will be made available to others on reasonable requests to the corresponding author. received: 26 march 2020; accepted: 10 august 2020 identification of a novel coronavirus in patients with severe acuterespiratory syndrome isolation of a novel coronavirus from a man with pneumonia in saudi arabia the novel coronavirus originating in wuhan, china: challenges for global health governance early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia clinical features of patients infected with 2019 novel coronavirus in wuhan a novel coronavirus from patients with pneumonia in china clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in wuhan, china chest ct findings in coronavirus disease-19 (covid-19): relationship to duration of infection sensitivity of chest ct for covid-19: comparison to rt-pcr general office of the national health commission of china. 
we thank all patients and their families involved in the study. g.w., s.z. and y.w. collected the epidemiological and clinical data. t.w., w.l. and s.w. summarized all data. g.w., x.l. drafted the manuscript. t.w. and x.l. revised the final manuscript. the authors declare no competing interests. correspondence and requests for materials should be addressed to t.w. or x.l.
key: cord-318562-jif88gof authors: jiménez-liso, maria rut; lópez-banet, luisa; dillon, justin title: changing how we teach acid-base chemistry: a proposal grounded in studies of the history and nature of science education date: 2020-08-15 journal: sci educ (dordr) doi: 10.1007/s11191-020-00142-6 sha: doc_id: 318562 cord_uid: jif88gof we propose explicit and implicit approaches for the teaching of acid-base chemistry based on research into the history and nature of science (nos). to support these instructional proposals, we identify four rationales for students to understand acid-base processes: daily life, socio-scientific, curriculum, and history of science. the extensive bibliography on misconceptions at all educational levels justifies the need for a change from the usual pedagogical approaches to teaching the acid-base domain (traditionally involving conceptual-focused teaching) to a deeper and more meaningful approach that provides (implicitly or explicitly) a chance to reflect on how scientific knowledge is constructed. controversial moments in science from 1923, when three researchers (bronsted, lowry, and lewis) independently enunciated two theories from two different paradigms (dissociation and valence electron), underpin our first sequence with an explicit nos approach for both lower secondary school and upper secondary or university levels. our inquiry teaching cycle promotes the transformation of a hands-on activity (using cabbage as an indicator) into an inquiry, and subsequently, we use an historical model to propose a sequence of activities based on the modeling cycle of couso and garrido-espeja for lower secondary school. finally, we identify some implications for a model-focused teaching approach for upper secondary and university levels using more sophisticated models. researchers in the area of the nature of science (nos) often provide recommendations for teachers. 
it is usually suggested, directly or indirectly, that teachers should improve their knowledge about what science is and how it is constructed, so that they can transfer this knowledge to the classroom, transforming it into sequences of activities for their students. for example, nouri et al. (2019) recommend well-designed history of science (hos) interventions to convey essential lessons about the nos consensus described by mccomas (2006), such as that science depends on empirical evidence; that cultural, political, and social factors influence science; or that science has a tentative or fallible nature. many authors identify multiple potential benefits of learning nos through such approaches: teaching scientific methods, challenging myths related to how science works, and differentiating between idealized scientific laws and observations (niaz 2009). however, they also highlight that research involving rationales and strategies for teaching hos is scarce, and nouri et al. (2019) recommend expanding science teacher educators' rationales for teaching hos to inspire a broader array of orientations and teaching strategies. they also suggest paying special attention to instructors' orientation towards teaching hos, which may have an impact on their effectiveness. such recommendations usually arise from studies focusing on the benefits of nos for students and from research on what teachers think, their beliefs about nos, or, in a less declarative way and closer to their educational reality, the connection (or otherwise) between this knowledge and what teachers really do in their classes (leden et al. 2015). one of those recommendations involves the design of nos classroom activities, explicit and implicit (duschl and grandy 2013), using reflective approaches to nos teaching and learning. these approaches open up the range of possibilities for teachers.
for example, one can propose explicit general activities, linked (or not) to a specific issue of science content, to develop students' understanding of an aspect of the nos consensus view (lederman 2007; mccomas 2006). such an activity might involve the use of a mystery box to help students learn about observation, interpretation, and argumentation (cavallo 2007; rau 2009). another example involves scientific practices, such as the national research council's (2000) inquiry into the problem of the tsunami on the us west coast, or "mrs graham's" class, which tackled the problem of leafless trees with explicit reflection, such as a metamodeling learning progression (schwarz et al. 2009), on how science is built. in this paper, we use couso and garrido-espeja's definition of a model as a "small number of big or core ideas (harlen 2010) that have the potential to explain a lot of different phenomena (izquierdo-aymerich and aduriz-bravo 2003), such as the particle model of matter" or, indeed, the chemical change model. implicit and explicit nos teaching approaches (duschl and grandy 2013) have a place in the high school science curriculum (12-18 years old) because an initial study of what science is is often included. at the same time, in chemistry courses, some topics such as atomic structure, the periodic table, or acids and bases are often introduced through their historical developments. the presence of these history-of-chemistry topics in curricula allows the design of authentic scientific practices (an implicit approach; burgin and sadler 2016). the content overload in the spanish science curriculum forces some teachers to dispense with the initial lesson about what science is and how it is built, or with spending more time deepening these nos aspects when working on the historical developments present in the curriculum, i.e. atomic structure, the periodic table, etc.
thus, before deciding on an explicit or implicit teaching approach, the teacher's first decision is whether to develop or omit this initial nos lesson, and the second is whether or not to deepen the historical developments present in the curricula. we now turn to the issue faced by teachers: how to translate these nos teaching approaches into sequences of activities on a specific topic? in this theoretical article examining teaching practice, we focus on the historical development of acid-base theories (arrhenius, bronsted-lowry, and lewis) to analyse the steps to follow in designing sequences of activities for different nos approaches. the main objective of this paper is to translate the explicit and implicit nos approaches, using the historical development of the acid-base domain, into activity sequences that teachers can use as a reference. in the next section we outline the importance of teaching the topic of acids and bases, because we understand that the first decision for a teacher is whether to spend time on the historical development of the acid-base domain present in chemistry curricula at secondary, high school, and university levels (in analytical and inorganic chemistry subjects). finally, we discuss how to design sequences of activities. we examine conventional teaching approaches to the topic and their consequences in terms of students' alternative conceptions and their difficulties in transferring and applying knowledge and in recognizing the acid-base models' limits of applicability. in this section, we use research results (our own and others') from assessments of high school students, university students (both undergraduates and postgraduates), and pre-service teachers to show the common acid-base teaching approach (concept-focused teaching) and its consequences.
these discussions and the acid-base historical development (timeline in section 4) will help us analyse the current situation in order to scaffold the design of sequences of activities using different nos approaches:

1. we will propose nos sequences with an explicit approach through controversial moments of the acid-base historical development, for both lower secondary school and upper secondary or university levels;
2. we will transform a hands-on activity (using cabbage as an indicator) into an inquiry for lower secondary level (section 7.1);
3. from it, we will propose a modeling sequence based on the modeling cycle of couso (2020) and couso and garrido-espeja (2017), using an historical model (erduran 2001), for the same level (lower secondary school);
4. we will identify some implications of a model-focused teaching approach for upper secondary and university levels using more sophisticated models.

finally, we will discuss the change in teachers' awareness of a model-focused teaching approach that extends and gives meaning to the usual concept-focused teaching. in short, in this paper, we construct a science teaching learning progression in a theoretical manner (schneider and plasman 2011) to build models, using the history of acids and bases as a theme, which teachers could use as a reference in their professional practice. acid-base processes also appear in other subjects, such as ionic equilibria and chemistry lab work. in these subjects, they are often referred to as "acid-base reactions", "acid-base titrations", "ionic solutions", and "acid-base theories". we have identified four categories of rationales why students need to understand acid-base processes: daily life; socio-scientific; curriculum; and the hos argument. we now discuss each in turn, briefly. acids and bases are commonly recognized by students and the general public. people know about acidic sweets, stomach acidity, antacids, etc. (cros et al. 1986, 1988).
words such as "acid" and "neutral" are used in some tv advertisements ("fairy is neutral and protects your hands"; "johnson's ph 5.5 has natural ph"). nevertheless, understanding of acid-base concepts is still limited. furthermore, the use of scientific concepts is increasing, and they are highlighted in advertisements to present products as beneficial or trustworthy despite widespread misunderstanding of those concepts. thus, some acid-base content is necessary to understand the phenomena encountered in daily life. pseudoscientists take advantage of the population's lack of awareness of scientific expressions, using it to lend "scientific credibility" to their unfounded proposals of health and home remedies. poor science is common in advertisements for cosmetics and cleaning "with ph" products, all sorts of diets, and foods that supposedly reduce acidity in your body to prevent or treat cancer, etc. these adverts can serve as a context for raising socio-scientific controversies (evagorou and osborne 2013; sadler and zeidler 2009) about medicalization (domènech calvet et al. 2015) or alternative treatments (uskola ibarluzea 2016). although acids and bases are encountered in students' daily lives (and on social networks), they are rarely taught well at primary school level. for example, in the spanish primary school curriculum, chemical reactions are only exemplified through oxidations, combustions, and fermentations, with no mention of acid-base reactions. however, when combustion and oxidation are the most used examples of chemical change, they lead to the establishment of alternative conceptions such as "all chemical changes are irreversible" (stavridou and solomonidou 1998) or "mass is not conserved in chemical reactions" (stavy 1990). not surprisingly, some alternative ideas held by students closely match ideas held by people studying science many centuries ago (wandersee 1986).
the curriculum rationale could be reinforced by the history of science rationale: knowledge of historical models (justi and gilbert 1999a) and of the context in which they were formulated could improve understanding of acid-base models and their limitations and, as a consequence, of the conditions required to select each model (erduran 2001). taken together, these arguments justify why teachers need to develop acid-base content in their chemistry curriculum. in the next section, we justify why current acid-base teaching might be changed in order to improve understanding of scientific content. historical aspects of the acid-base domain could constitute an educational resource of great relevance to prevent students from seeing science as a finished product and to help them appreciate how some theories and explanations are provisional. for instance, the acid-base historical development would allow teachers to discuss with their students the limitations of each of the acid-base theories and why they were used in the past or still are used (alvarado et al. 2015). in this paper, we use the three famous acid-base theories (arrhenius, bronsted-lowry, and lewis), although the historical development of the acid-base domain is as long as the history of chemistry itself. figure 1 represents a timeline of the acid-base domain showing links with the chemical change models presented by justi and gilbert (1999b) and also with some acid-base model reviews (de vos and pilot 2001) and history of science books (taton 1957, 1959; taton and goupil 1961). the historical development of a domain is usually introduced to students in a very condensed manner in order to focus on the last, the most useful, or the longest-surviving acid-base theories. in this first part of the paper, we only use the term acid-base "theories", as they are commonly called, but from section 6 onward we will speak of acid-base "models" to focus on their explanatory and predictive power.
this timeline could be a good illustration of the historical development of ideas, which is broader than that usually presented to students. on the left side of the timeline, we use the term pre-model (justi and gilbert 1999a) to indicate that an acid-base classification does not have to be explanatory. in order to understand scientific models, we need to appreciate that they have been constructed to explain and predict phenomena, so the models are more than a descriptive account of the material world. in this sense, acid-base historical models are a good opportunity to understand how change has taken place in scientific models over time. many of the situations where people encounter science involve the use of scientific knowledge, alongside other forms of knowledge, to reach decisions about action. this is often the case for lay people, who typically find science through media portrayals of socio-scientific issues, or through consultations with experts such as medical practitioners. lay views of science tend to portray such issues as being easily resolved through simple empirical processes (e.g. driver et al. 1996). this position, however, is often not sustainable, as illustrated by the following examples. an example of science in the media is the case of enhanced global warming as a result of increased levels of carbon dioxide in the atmosphere due to the combustion of fossil fuels. it is not difficult to find widely different predictions in the media about the likely environmental impact of the burning of fossil fuels. these differences in predictions are based upon the application of models of the atmosphere. resolving those differences involves a complex interplay between models, empirical evidence, and methodological expertise. understanding how such differences arise, and why they cannot be resolved quickly and simply, involves understanding something about the use of models.
it is not only lay people who encounter science in situations characterized by uncertainty. many experts will be faced with situations where scientific knowledge has to be drawn upon, alongside other considerations, to inform decisions about action in novel situations. a sad and recent example is the current scientific/political/cultural environment around covid-19. the academic literature now includes several accounts of how experts have to create new knowledge in order to answer questions that emerge in specific situations. another, much older, example is brian wynne's account of how experts had to develop new knowledge about the impact of pollution following the chernobyl accident on the milk produced by sheep grazing on the cumbrian fells (wynne 1989). in order to appreciate why the available scientific knowledge may be inadequate to inform action in specific, local conditions, such experts need to understand something about the nature of models in science. so, to summarize, it is necessary to teach models at university level for the following reasons:

- in order to have a sophisticated understanding of the conceptual content presented to them in chemistry degree programmes, students need to have some understanding of how the models are built.
- in addition, if students develop this understanding of the nature of models, it may enable them to better understand situations involving uncertainty, whether as educated citizens or, if they go on to become professional scientists, in their professional practice.

these general arguments about teaching models can be exemplified in the case of acid-base models. conventional chemistry teaching might begin with questions such as "what is an acid?" or "what is a base?"; "what happens when an acid is added to a base, and vice versa?"; "what does ph mean?"; "is it always possible to reach ph = 7 when an acid is added to a base?"
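these introductory questions have simple quantitative counterparts. a minimal sketch follows, assuming ideal dilute aqueous solutions, Kw = 1.0e-14 (its value at 25 degrees celsius), and an acetic-acid-like Ka and concentration chosen purely for illustration; it shows, among other things, why the answer to the last question is "no":

```python
import math

KW = 1.0e-14  # ionic product of water at 25 deg C (temperature-dependent)

def pH(h3o):
    """pH from the hydronium concentration in mol/L."""
    return -math.log10(h3o)

# pure water: [H3O+] = [OH-] = sqrt(Kw), giving the familiar neutral pH of 7
neutral_pH = pH(math.sqrt(KW))

# pH fixes both ion concentrations at once: [OH-] = Kw / [H3O+]
oh_in_acid = KW / 1.0e-3  # a 0.001 M strong monoprotic acid leaves [OH-] = 1e-11 M

# the equivalence point of a weak acid titrated with a strong base is NOT pH 7:
# the conjugate base hydrolyses. Ka = 1.8e-5 (acetic-acid-like) and a 0.05 M
# conjugate-base concentration at equivalence are assumed for illustration.
Ka, C = 1.8e-5, 0.05
Kb = KW / Ka
oh = math.sqrt(Kb * C)           # approximation valid while [OH-] << C
equivalence_pH = 14.0 - (-math.log10(oh))
```

the last result comes out slightly basic, which is exactly the kind of phenomenon the conventional question list gestures at but rarely computes.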
conventional teaching presents some concepts from arrhenius' and bronsted-lowry's theories, which we summarize in fig. 2. it is possible to recognize some differences between the two theories, such as considering acids and bases as substances (arrhenius) or as conjugate acid-base pairs (bronsted-lowry). as is usually the case, arrhenius', bronsted-lowry's, and lewis' theories are presented together (tarhan and acar sesen 2012). acid-base theories are introduced in the style of a short story, without any connection to the phenomena they aim to explain or to the historical problems that inspired them. by teaching these acid-base theories together, we present a combination of the acid-base concepts (acid, base, neutralization, etc.) of the three theories without conveying any significant advance between them, which for many students may amount to a merely terminological change. teachers (and textbooks) usually say that "bronsted-lowry extend the definition of acids and bases" (nyachwaya 2016, p. 510) given by arrhenius. in this sense, arrhenius', bronsted-lowry's, and lewis' models are often presented several times in chemistry programmes, introducing inconsistencies in their presentation that lead to "hybrid models" (justi and gilbert 1999a), and some concepts and their definitions are mixed across two or more models (gericke and hagberg 2010). the science education literature is replete with examples of the consequences for students' learning of this typical way of teaching acid-base content, focused on the definition of its concepts and with two or three theories introduced simultaneously. in the next sections we will use a review of research results (our own and others') on the understandings of high school students, university students (both undergraduates and postgraduates), and pre-service teachers in order to design proposals focusing on two approaches, one nos-explicit and the other nos-implicit (sections 6 and 7 respectively).
there have been a number of studies of students' misunderstandings of acid-base phenomena (for example mcclary and bretz 2012; nyachwaya 2016). many students have difficulties learning acid-base concepts, and alternative conceptions (hoe and subramaniam 2016) can interfere with their understanding. for instance, students think that acids alone are corrosive (demircioğlu et al. 2005; hoe and subramaniam 2016; özmen et al. 2009) and that acids are more dangerous and reactive than bases (hoe and subramaniam 2016; nakhleh and krajcik 1994; sheppard 2006). they also think that rain water in an unpolluted area is neutral, or that the solution formed after adding an acid to a base is always neutral (banerjee 1991; hoe and subramaniam 2016; scerri 2019; schmidt 1991). as quílez (2019) points out, many of these misunderstandings come from students' terminological difficulties. consequently, students do not understand why the degree of acidity or basicity of two acidic or basic solutions differs even though the solutions have the same concentration (alvarado et al. 2015). moreover, a superficial correlation of chemical structure with acidity or basicity may explain why students believe that compounds containing H will produce H+; in addition, the belief that a stronger acid is the one that produces a higher hydrogen ion concentration, has more H in its formula, or has a higher initial concentration (demircioğlu et al. 2005; hoe and subramaniam 2016; özmen et al. 2009; ross and munby 1991) reveals that students do not correctly apply the definitions of strong and weak to acids and bases (garnett et al. 1995; mcclary and bretz 2012) and consider neither equilibrium nor the incomplete dissociation of acids and bases. similar problems are found with students' understanding of bases (hoe and subramaniam 2016). another issue is that students do not fully differentiate between the terms acidity and ph (alvarado et al.
2015); they do not consider that ph provides information about both the H+ and OH- concentrations (garnett et al. 1995), and they show a lack of consideration of the influence of variables such as temperature or the solvent, use strength and concentration as if they were synonymous, or have problems differentiating between an acid-base reaction and a neutralization reaction (alvarado et al. 2015). many of those alternative conceptions are consistent with students using arrhenius' theory in contexts other than those for which it was proposed. in the next subsection, we look deeper into the difficulties linked to transferring knowledge to new situations or recognizing the limits of each of the acid-base theories. based on our own results in the spanish context (author 2000), the consequences of acid-base conceptual teaching for both undergraduate and postgraduate chemistry degree students become evident from (a) students' difficulties in transferring knowledge and (b) problems recognizing the limits of applicability of acid-base theories. (a) transference of knowledge to new situations. despite having been taught acid-base concepts many times, undergraduate university students (n = 450) from three spanish universities showed weaknesses in recognizing an acid-base process, and the proportion giving the correct answer decreased as the complexity of the applied theory increased:

- most students (78%) recognized a proton transfer process as bronsted's model.
- 26% recognized the autoprotolysis of solvents (SO2) as an acid-base process.
- only 12% considered electron transfer as in lewis' process.
- less than 2% of the students applied all three theories.
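the strength-versus-concentration confusion discussed above can be made concrete with a small numerical sketch. the Ka value is an assumed, acetic-acid-like one, and both acids are given the same 0.10 M analytical concentration for illustration:

```python
import math

C = 0.10       # same analytical concentration for both acids (mol/L)
Ka = 1.8e-5    # assumed weak-acid dissociation constant (acetic-acid-like)

# a strong acid dissociates completely: [H3O+] = C
pH_strong = -math.log10(C)

# a weak acid reaches equilibrium instead: x**2 / (C - x) = Ka,
# solved exactly for x = [H3O+] via the quadratic formula
x = (-Ka + math.sqrt(Ka * Ka + 4.0 * Ka * C)) / 2.0
pH_weak = -math.log10(x)

# degree of dissociation: only a small fraction of the weak acid ionizes
alpha = x / C
```

at equal concentration the two solutions differ by well over a ph unit, which is precisely the distinction (strength is about equilibrium, not about how much acid was dissolved) that the misconceptions literature reports students missing.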
some spanish university students explained that an electron transfer process (lewis' model) or the autoprotolysis of a solvent (SO2) is not an acid-base process, saying, for example, "it is not an acid-base process", "it is a redox process", or "it is not an acid-base process because there isn't H+ or OH- or H3O+". thus, as mentioned on many occasions by other authors (drechsler 2007; drechsler and van driel 2009; zoller 1990), the bronsted-lowry acid-base process is more readily recognized by students than other acid-base models. for us, this result is evidence that the university students did not transfer their acquired acid-base knowledge to new situations, for example, lewis' acid-base electron transfers. (b) applicability of acid-base models. given the previous result, it was expected that most of the graduates sitting their secondary education teacher civil service examination would cite arrhenius', bronsted-lowry's, and lewis' models (data from research on 50 exams). nevertheless, in approximately 15% of the cases the description was wrong. only 52% of the candidate teachers identified the boundaries of arrhenius' model, three-quarters made no comment on the limitations of bronsted-lowry's theory, and none recognized that there might be limitations in lewis' theory, which they explicitly considered to be the currently accepted one (jiménez-liso 2000), similar to the results found by yalcin (2011) with turkish candidate teachers. these omissions of the limits of applicability of the different acid-base theories are worrying given that, if they passed this exam, they would be qualified as secondary education teachers of physics and chemistry. they do not usually follow any continuous professional training as teachers, a fact that also occurs in other countries such as england, france, finland, and cyprus, and that could affect the quality of teaching and the improvement of the education system (evagorou et al. 2015).
teachers must know the scope and limitations of the different acid-base models; nevertheless, most of them have not developed teaching strategies for this issue, and only a few teachers said that they usually discussed the use of models of acids and bases in their teaching (drechsler and van driel 2008). although they recognized some student difficulties, such as confusion between models, only a few emphasized the different models of acids and bases (alvarado et al. 2015; drechsler and van driel 2008). moreover, although some teachers believe that most students do not understand the use of models, they try to teach them anyway in order to help the best students in their learning, hoping the other students understand that simple models are not the whole truth (drechsler and van driel 2008). when two or three theories are presented together, the lewis and bronsted-lowry definitions are just that, definitions, and the validity of one does not automatically negate the other (although it may expand the set of substances classed as acids). the concept-focused teaching mentioned above is insufficient because it comes from a purely theoretical perspective without any kind of application, reduced to definitions instead of containing a clear explanation of their development (cid manzano and dasilva alonso 2012) and of the problems or phenomena that gave rise to the new ideas. the three favourite acid-base theories are presented as a collection of "agreed upon facts", so students memorize them without questioning their relationship with other scientific knowledge (justi and gilbert 1999a), focusing on the products rather than the processes of science. there is clear evidence that many of the problems learners have arise from confusion between acid-base theories.
when several acid-base theories are presented together, the scope for confusion expands, above all if most learners have a very limited notion of the role of models in science (driver et al. 1996; grosslight et al. 1991; taber 2001). the role of a model in science is to develop a scientific understanding of some phenomena, explaining them and predicting other related phenomena, and then to apply the new knowledge to novel situations or contexts (izquierdo-aymerich and aduriz-bravo 2003; oh and oh 2011). so, the reason for presenting three (or more) acid-base theories together is not related to scientific understanding of the phenomenon. the reasons for introducing the three most used acid-base models together appear to be twofold: firstly, conceptual survival (a concept from past chemistry curricula that is retained in modern ones) and, secondly, to show the history of a chemistry concept in a narrative manner. thus, the emphasis is placed on the differences between the concepts, which does not promote a proper understanding of science, instead of on comprehending the conditions in which the models were built and, consequently, their limitations. no advantage is taken of the opportunities to get students to reflect on the nature of science through the history of acid-base theories; on the contrary, they usually develop a distorted image of science itself and of how it is carried out. before we discuss teaching acid-base models in the chemistry curriculum, it is necessary to clarify terminology. acid-base concepts, definitions, theories, and models are often used as synonyms. students' mistakes often arise from this ambiguous use of terminology (jiménez-liso and de manuel torres 2002).
to avoid this difficulty, we have adopted acid-base models as the correct terminology to refer to the models that explain and predict phenomena proposed by arrhenius, bronsted, and lewis, because we understand that the theories in which they are included are the ionic dissociation theory, the solvents theory, and the valence electron theory, respectively. considering that curriculum materials shape teachers' practice and characteristics (such as their knowledge or beliefs) and students' opportunities to learn in science (davis et al. 2016; pareja roblin et al. 2018), in the next section we try to identify activity sequences, firstly with an explicit nos approach using the timeline of the historical development of acid-base models. as burgin and sadler (2016) mentioned, the prevalent model for teaching nos in school has been referred to as the explicit/reflective approach (lederman 2007). in this approach, the priority object of study is to teach the great consensus about the nature of science (tentativeness, creativity, ...) to avoid the main distorted views of science. typical activities using this approach are discussions about a "paper towel investigation", about the "card exchange" (cobern and loving 2002), or about historical cases (readings or movies) (aduriz-bravo and izquierdo-aymerich 2009; moreno et al. 2018) and scientific errors (kipnis 2011) as a particular historical case, or some historical controversies (niaz 2009). all of these teaching strategies are linked to some rationale, to educational purposes (nouri et al. 2019). the acid-base timeline (fig. 1) could link with the curriculum purposes for specific educational levels shown in table 1. more interesting than improving acid-base content understanding are the opportunities to advance the understanding of nature of science content using certain moments of the acid-base timeline, for example, the story of what happened with acid-base models in 1923.
in 1923, bronsted and lowry proposed their explanations of acid and base behaviour. both knew arrhenius's model (1903) and a less famous one today: the solvent-solute model proposed by franklin (1905). about two decades later, two researchers independently proposed a particular case of the solvent model (the proton model) where water acts as solvent and its autoprotolysis provides the definition of acids (proton donors) and bases (proton acceptors) (taton 1964). this can be linked with the chemical change models expressed by students and with school science models such as the "parts model" described by acher et al. (2007) and their pivotal ideas, like transformation and conservation of "parts", to distinguish dilution (colour fading) from neutralization (erduran 2007). as table 1 indicates for the university level, the main purpose of the proton model (based on water autoprotolysis) given by bronsted (1923) and lowry (1923) is not to identify acid-base processes with aqueous processes. we can use the original papers from bronsted, lowry, and lewis to help our high school students (or university students) to answer the following questions: -why does a more limited model (the proton model of bronsted-lowry) emerge after a broader model (the franklin model)? we want to scaffold "the epistemic value of simplicity, referred to as ockham's razor, meaning that the simplest applicable model is the most elegant and the best" (rollnick 2019, p. xiv). the solvent model proposed by franklin (1905) was known by bronsted and lowry, but they only used water as solvent, so for solving their problems they did not need a broader model, and they specialized it into a simpler but more useful one. in fact, it is the most widely used and known today because most acid-base reactions are aqueous.
-in 1923, a danish researcher from varde (bronsted 1923) and, in the same year, a researcher from bradford, uk (lowry 1923) proposed the same proton model. how do you think it was possible for two researchers in different countries, without knowing each other, to propose an identical model? perhaps in our digital era this scenario is unthinkable: two researchers producing identical research without any previous contact; but in 1923 they heard of each other only when they read the papers already published in two different journals. the scientific community recognized the merit of both of them and, thereafter, their model was named bronsted-lowry. what circumstances led both to propose the same theory? bronsted (1923) started from the electrolytic dissociation theory of arrhenius, which initially does not call into question his idea of an acid (A → B + H+) and for which he tries to find a better definition of a base: "it is the purpose of the present small contribution to show the advantages that come from a modified definition of a base" (bronsted 1923, p. 718), specifically the difficulties in explaining the basicity of ammonia: if we accept the scheme NH4+ + OH- ⇌ NH4OH as a suitable expression for characterizing bases, we will be forced to give a special definition of a base for each special solvent. however, in principle, acid and basic properties are independent of the nature of the solvent, and the concepts of acids and bases are in fact of such a general character that we must consider it a necessary requirement of these concepts in general to formulate a pattern independent of the nature of an arbitrary solvent (bronsted 1923, p. 719), and he ends by concluding: the equilibrium formulated in scheme (1) between the hydrogen ion and the corresponding acid and base can be called a simple acid-base equilibrium.
by mixing two simple systems, a double acid-base system results, an acid-base equilibrium that can always be formulated as a general scheme. this equilibrium includes a number of important reactions such as neutralization, hydrolysis, indicator reactions, etc. (bronsted 1923, p. 728). on a different path, lowry (1923) knew the electron valence theory of lewis (1923) and relied on it to distinguish two types of chemical affinity (polar and non-polar) and their links in organic and inorganic substances, which led to the need of proposing H3O+ as what is exchanged in acid-base reactions, overcoming arrhenius's H+ proposal, the difficulty of the basic character of NH3, and the relative character of strong or weak acids depending on which substance they react with (fig. 3). -with the previous knowledge, students are ready to answer one last question: how do you think two models emerged in 1923 from three different people working independently and in different paradigms (the proton and electron paradigms)? 1923 was a good year for the historical development of acid-base models. lewis (1923) also raised his electron model (fig. 4) under a totally different paradigm (based on his electron valence theory) and to solve a problem not contemplated by his contemporaries: acid-base behaviour in reactions without solvent, for example, in gas reactions. discussing these acid-base moments of a broad timeline with upper secondary students (or university students), we could challenge the accumulative-linear and erroneous image of science. for university chemistry or geology degree students, similar questions could be posed using the 1939 and lux-flood models for reactions at high pressure, such as in geological processes (without any dissolution).
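bronsted's simple and double equilibria, whose displayed schemes did not survive extraction, take the standard textbook form (a sketch in modern notation, not a verbatim reproduction of the 1923 paper):

```latex
% simple acid-base equilibrium (bronsted's scheme (1)):
A \;\rightleftharpoons\; B + \mathrm{H}^{+}
% double acid-base system, obtained by mixing two simple systems:
A_{1} + B_{2} \;\rightleftharpoons\; B_{1} + A_{2}
% e.g. the basicity of ammonia in water:
\mathrm{NH_{4}^{+}} + \mathrm{H_{2}O} \;\rightleftharpoons\; \mathrm{NH_{3}} + \mathrm{H_{3}O^{+}}
```

the double scheme is the one said to include neutralization, hydrolysis, and indicator reactions as particular cases.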
and also for university chemistry students (pre- and post-graduate), another interesting controversy could be the qualitative and quantitative chemical approaches (chamizo 2018) between pearson (1963), who proposed his hard and soft acid-base model empirically, and drago (1973).

7 implicit approach to nos teaching

7.1 inquiry-based teaching proposal

barrow (2006) described inquiry firstly as an epistemic practice (kelly 2008), secondly as scientific skills that students should develop, and finally as a teaching approach. we adopt this last meaning of inquiry to propose an instructional sequence of activities. as there is a multitude of research proposals (pedaste et al. 2015), we have specified our teaching approach in a cycle (fig. 5, in orange) to connect it with the modeling cycle (fig. 5, in green) proposed by couso (2020) and couso and garrido-espeja (2017). in the acid-base domain, reactions can be followed with indicators from daily life such as red wine. when we use red cabbage as an acid-base indicator, we generally emulate boyle's descriptive pre-model in order to recognize the acid-base nature of some daily life products. in this way, we create (as boyle did) a classification of acid, neutral, or basic substances. we transform these hands-on activities about acid-base classification into inquiry-based teaching where the steps will be easily recognizable by our students, so that they become aware of how they have learned (self-regulation of learning and emotions) and, therefore, can hold an explicit debate about the phases of the inquiry, how they help to learn, and what emotions they felt during this sequence (step 7 in the orange cycle, fig. 5). to do this, we begin with a familiar problem: a chewing gum tv advertisement claims that the gum stops the acid attack, strengthens the tooth enamel and helps to keep your teeth strong and healthy, while the image shows that it raises the ph of the mouth to prevent the formation of cavities (jiménez-liso et al.
2018) that engages students to explain their personal ideas: chewing gum is the opposite of the acids generated by food in the mouth; the tv ad does not tell the truth and the gum does nothing; it warms and destroys the acids; more saliva is generated; or the gum traps the remains of food. the key moment in this sequence is the students' proposals of experimental designs that allow them to find evidence to confirm or reject their hypotheses. the experimental designs raised by our students facilitate the discussion about the usefulness of the designs (what did they measure? with what did they check?). for instance, some students proposed to put some food in a glass with water (to simulate the mouth) along with the gum and measure with ph paper, to which another group responded that they did not check "before" and "after" adding the chewing gum, that is, the effect of the chewing gum. others suggested sucking ph paper after eating and again after chewing gum (lópez-banet et al. 2021). taking measurements with a ph meter can create a conflict between the students' expectations and the results, both because the chewing gum does not raise the ph of the acid dissolution (mouth simulation) and neither does adding water (diluting). this opens the option of deepening into the mathematical conflict between a linear scale (ph values of 1-14) and a logarithmic scale (what ph means), asking how much water would be necessary to raise the value by one point (lópez-banet et al. 2021). however, as osborne (2014) mentioned, hands-on activities, such as the acid-base classification using red cabbage indicator, are not normally accompanied by an interpretation or explanation of the phenomena. in our inquiry-based sequence, students built essential descriptive knowledge (acid-base reactions vs dilution with water or saliva) so that they now recognize the need to seek an explanatory model, the perfect moment to start the modeling cycle (fig. 5).
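the logarithmic-scale conflict can be made concrete with a short numerical check (an illustrative sketch; the function names are ours, and water autoprotolysis is neglected, which is reasonable in the strongly acidic range discussed):

```python
import math

def ph(h_conc):
    """ph from the hydrogen-ion concentration in mol/L."""
    return -math.log10(h_conc)

def dilution_factor(delta_ph):
    """total volume factor needed to raise the ph of a strong-acid
    solution by delta_ph units (autoprotolysis neglected)."""
    return 10 ** delta_ph

# a mouth simulation at ph 2 holds 0.01 mol/L of H+; raising the ph
# by one point means a tenfold dilution, i.e. adding 9 volumes of
# water for every volume of solution, far from a linear intuition.
factor = dilution_factor(1)
```

raising the ph by two points already requires a hundredfold dilution, which is exactly the surprise the sequence aims to provoke.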
the use of red cabbage as an acid-base indicator, often carried out by students aged up to 16 years old, is not accompanied by a possible explanation using models, keeping the explanation for higher levels (16-18 years old and university level). in this sense, the first introduction of an explanatory model is presented to 16-18-year-old students and, generally, arrhenius' or bronsted-lowry's model is used to present acid-base processes disconnected from the activities carried out (or not) during previous years (jiménez-liso et al. 2010). therefore, if we focus the contents exclusively on the phenomenon and the identification of substances, we are only increasing the students' experiential field, but not their ability to explain the phenomena they observe or to foresee what is going to happen in new situations. some hands-on activities about the properties of acids and bases emphasize the teaching and learning of chemical knowledge through models and modeling, by the formulation, evaluation, and revision of chemical models. when we ask students to express what they think happens "inside" when adding a base to an acid and observing changes in the colour of the indicator or in ph values, their initial models are unsatisfactory for explanatory purposes (steps 1 and 2 in the modeling cycle, fig. 5 in green; couso 2020; couso and garrido-espeja 2017). students may have difficulties when it comes to expressing these initial models through drawings; most often they represent non-explanatory circles, and only some of them depict acids and bases differently, indicating that a solution is acidic when acids "predominate" over bases and vice versa (lópez-banet et al. 2021).
despite these difficulties of the students in explaining "what happens inside", we cannot consider these initial models as the students' alternative conceptions described in section 5.1, for two reasons: firstly, students' alternative conceptions were the product of punctual and "academicist" knowledge and, secondly, the failure of the initial models to be explanatory is the initial step towards becoming aware of the need to build a model, that is, an idea that helps explain a phenomenon (the change of colour of acid-base reactions with an indicator) and predict new ones (for example, the bubbles when we add bicarbonate to vinegar). students are expected to relate the properties of a substance to its shapes, in a similar way to nicolas lemery's model (erduran 2007; erduran and kaya 2019). this seventeenth-century scientist explained that acids consist of keen particles that prick the tongue when they are tasted, differing both in length and in mass from one another. on the other hand, alkalis have pores where the acid points entering into do strike and divide them when they oppose the motion of acids, so the difference of the points in acid substances is the cause why some acids can penetrate and dissolve well certain sorts of mixts (lémery 1697). our version of this model is a pacman model (jiménez-liso et al. 2018), sometimes suggested by some of our students. they are able to reason as ancient scientists used to and to build their own explanations in a similar way about what happens at the microscopic level, by means of descriptions of reality (macroscopic level) and their intuitive thoughts. this anthropomorphic model is already useful for students because it explains acid-base phenomena, but it needs to be refined (steps 3 and 4 of the modeling cycle) because it does not serve to explain a well-known experiment: why a balloon is inflated when adding bicarbonate to vinegar.
when students must construct a model to explain this precise piece of reality (what happens with the balloon), they introduce partial modifications to their useful models (the lemery or pacman model with triangles as acids), such as "bow ties" that fly away when the pacman eats the triangles, and they argue about its validity (or not) according to the descriptive knowledge they already have. this process leads them to identify the insufficiency of their initial models, the usefulness of the pacman model and its limits, and the need for refinement to explain the production of gases in acid-base processes. figure 6 shows another alternative model, produced by other students, based on the idea of fighting to form a joint structure, which explains the formation of gases in an acid-base reaction. as it is necessary to help students comprehend the nature of models, a possible strategy could be to introduce an explanation similar to those mentioned above, as past scientists did. the activities previously mentioned encourage pupils to express their own ideas, giving opportunities to evaluate and restructure them, in order to pass from their initial conceptual schemes to more scientifically valid ones. for instance, pupils draw representations trying to explain the way they perceive some common substances and describe their models in class to share their ideas, such as drawing "bubbles" in acids and fewer or no bubbles in the basic substances (erduran 2003). the lemery, pacman, or fighting models are very anthropomorphic. however, these models allow a quick connection with chemical formulation and the arrhenius model (fig. 7, from jiménez-liso et al. 2018). as couso (2020) mentioned, model-focused teaching puts students in the situation of building "adequate enough" explanations themselves, in other words, of constructing school-based scientific models to describe the behaviour of the world and to comprehend how it works (aduriz-bravo and izquierdo-aymerich 2009; izquierdo-aymerich 2000).
instead of learning the models as the result of scientific activity, it would be enough to focus on some specific big ideas (harlen 2010) or key ideas (national research council 2012) that have the potential to explain a lot of different phenomena (izquierdo-aymerich and aduriz-bravo 2003). thus, a model-based teaching approach offers instructional strategies for improving conceptual learning in science education (shen and confrey 2007) and permits students to go beyond the idea of models as reproductions, allowing them to reach the vision that the relationship between model, experiment, and reality is dynamic and evolutionary (tasquier et al. 2016). in order to build a school science using more sophisticated models for upper secondary or university levels, as in our case, we talk about the model associated with phenomena using the concrete term "key connected aspects", which should emphasize: -the purposes of each model: for example, arrhenius' model is an explanation based on the classification of substances into acids or bases and their reactions, bronsted-lowry's model is based on equilibrium, and lewis' model focuses on a different paradigm, the electron theory. we want to emphasize this idea because it changes the acid-base view from the conceptual-focused teaching (fig. 1), where we defined an acid as that which contains h+ (arrhenius) or that which donates h3o+ (bronsted-lowry), to a new view centred on the explanatory power of both models: arrhenius' model explains reactions between substances, and bronsted-lowry's explains equilibria and, therefore, their reversibility (fig. 8), as we will see below. -the acid-base characteristics of each scientific model: our perceptions of the acid-base definitions given in fig. 1 change from acids and bases as substances, to the absolute acid and base properties based on their chemical composition in arrhenius' model (fig.
8, left), from the conjugated acid-base pairs to the relative properties of substances in bronsted-lowry's model (fig. 8, right), or from acid-base as an acceptor-donor pair of electrons (the electron paradigm) with its possibility of explaining the absence of solvent in lewis' model. these decisive acid-base characteristics are connected to the models' educational purposes (fig. 8 shows the connected key aspects of arrhenius' and bronsted-lowry's models) through a simplification of the historical scientific consensus models, and this explains why some historical models can still be used to explain some phenomena (table 1). the comparison and contrast of these key features, the nature, and the purposes of models can be addressed in teaching. -scope, boundaries, and explanatory power: for example, bronsted-lowry's model can explain not only the reason why a reaction between a strong acid and a weak base produces a ph < 7 solution without using the hydrolysis concept, but also that the reaction between a base and water is possible and that two acids (one stronger than the other) can react (where the weak acid, according to arrhenius' classification, reacts like a base). these explanations are not possible using arrhenius' model. nevertheless, it would not be prudent to discard a model that is easy to understand and is well applicable in many cases (ockham's razor; rollnick 2019). for instance, many chemical reactions occur in aqueous solutions because many compounds have hydrogen or hydroxide ions. thus, teaching the arrhenius concept is important for the purpose of promoting the recognition of the meaning of the acid-base characteristics of this scientific model in science learning. however, for this purpose, the introduction of new concepts needs to follow in order to overcome the limitations of the arrhenius concept (paik 2015).
on the other hand, the key ideas of bronsted-lowry's model emphasize equilibrium, reversibility, simultaneity, and the relative strength of acids and bases, both in aqueous and non-aqueous solvents (fig. 8). when acid-base reactions occur without solvent, for example gas reactions, neither arrhenius' model nor bronsted-lowry's serves to explain them, so we need other models such as lewis' electron valence model, or the lux-flood model for geological high-pressure acid-base reactions. the arguments put forward in this paper might convince teachers to deepen their teaching of acid-base processes at all possible educational levels, by taking advantage of the presence of historical development in upper secondary and by covering the need to advance it to primary or lower secondary levels through arguments involving the presence of acids and bases in our daily lives and in solving socio-scientific issues about health or the environment. the extensive bibliography on alternative conceptions at all educational levels (including teachers and teacher candidates) justifies the need for a change in the usual way of presenting the topic, which focuses on the presentation of definitions of acid, base, their reactions, ph, etc., in two or three "theories" presented together. this fragment of the history of science that survives in the current curriculum (in upper secondary and in the chemistry degree at university level) offers a very good opportunity for nos teaching without overloading the already extensive and concentrated chemistry curriculum. thus, the main aim of traditional acid-base teaching is to learn the main concepts by means of conceptual-focused teaching, and it is very far from making sense to the students, because it makes them look at the bricks and not at their usefulness as part of a larger, more beautiful, and meaningful castle.
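the "scope and boundaries" idea, that each model has an application domain outside of which it stops explaining, can be sketched as a toy lookup (the names and the coarse three-way domain split are our own illustration, not the paper's):

```python
# each historical model mapped to the reaction contexts it can explain,
# following the scope discussion above (a deliberate simplification).
MODEL_SCOPE = {
    "arrhenius": {"aqueous"},
    "bronsted-lowry": {"aqueous", "non-aqueous solvent"},
    "lewis": {"aqueous", "non-aqueous solvent", "solvent-free"},
}

def models_that_explain(context):
    """return the models whose application domain covers the given context."""
    return [name for name, scope in MODEL_SCOPE.items() if context in scope]

# a gas-phase (solvent-free) acid-base reaction falls outside both
# arrhenius' and bronsted-lowry's domains, leaving only lewis' model.
```

the pedagogical point is the nesting: every context explained by arrhenius is also covered by bronsted-lowry and lewis, which is why discarding the simpler models is unnecessary within their own domains.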
inquiry-based teaching (section 7.1) and the model-focused teaching exemplified in this paper (sections 7.2 and 7.3) implicitly provide a chance to reflect on how science is constructed. whereas conceptual-focused teaching only explains definitions and emphasizes descriptions of behaviours (not always coordinated), model-focused teaching emphasizes explanations, interpretations, and predictions (stefani and tsaparlis 2009). in this way, students should learn to use each model within its application domain to address different phenomena. the applicability of acid-base models should be better understood, and knowledge of acid-base models would be transferred to new situations, for example, to recognize a new process as an acid-base reaction. in this paper (section 7.3), we proposed that considering models as key ideas connected in a particular way provides coherence to concepts. both approaches are in conflict with each other, so this dual treatment is discussed: teaching isolated ideas in conceptual-focused teaching versus teaching the relationships between connected key aspects through model-focused teaching. we have attempted to show the differences between nos teaching approaches through several sequences of activities: first, one sequence with an explicit nos approach, as lederman (2007) points to as desirable, and then three implicit approaches, similar to duschl and grandy's (2013) recommendations. these four sequences can help teachers to perceive the potential results of choosing one of these treatments in acid-base lessons, according to their own teaching goals. also, this concretion into sequences of activities, which are the fundamental tools teachers use to teach, can encourage them to teach acid-base models in a way closer to the recommendations of nos researchers.
as we pointed out in the introduction, by specifying the implicit-explicit debate in several sequences of activities, we are also offering, for science teacher training, a theoretical learning progression. pre-service or in-service teachers in training could live inquiry and modeling sequences since lemery to the more sophisticated models and it allows to place the implicit sequence one after this lineal progression to make explicit the awareness of how the science is built. finally, as an agenda for future work, we could follow the steps outlined in this paper in order to develop an evaluation study about the efficiency of consensus nos understandings of each implementation of our implicit, explicit-ibse, explicit-modeling sequences, using frameworks such as burgin and sadler (2016) . modeling as a teaching learning process for understanding materials: a case study in primary education a research-informed instructional unit to teach the nature of science to pre-service science teachers canonical pedagogical content knowledge by cores for teaching acid-base chemistry at high school development of the theory of electrolytic dissociation misconceptions of students and teachers in chemical equilibrium a brief history of inquiry: from dewey to standards some remarks on the concept of acids and bases learning nature of science concepts through a research apprenticeship program: a comparative study of three approaches draw-a-scientist / mystery box química general. 
una aproximación histórica estudiando cómo los modelos atómicos son introducidos en los libros de texto de secundaria the card exchange: introducing the philosophy of science aprender ciencia escolar implica construir modelos cada vez más sofisticados de los fenómenos del mundo [learning school science involves building increasingly sophisticated models of world phenomena models and modelling in pre-service teacher education: why we need both conceptions of first-year university students of the constituents of matter and the notions of acids and bases conceptions of second year university students of some fundamental notions in chemistry teachers and science curriculum materials: where we are and where we need to go joseph priestley across theology, education, and chemistry: an interdisciplinary case study in epistemology with a focus on the science education context international handbook of research in history, philosophy and science teaching acids and bases in layers: the stratal structure of an ancient topic conceptual change achieved through a new teaching program on acids and bases la medicalización de la sociedad, un contexto para promover el desarrollo y uso de conocimientos científicos sobre el cuerpo humano [the medicalization of society as a context for promoting the development and use of scientific knowledge revista de investigación y experiencias didácticas changing how we teach acid-base chemistry learning from the history and philosophy of science: deficiencies in teaching the macroscopic concepts of substance and chemical change pearson's quantitative statement of hsab models in chemistry education. 
a study of teaching and learning acids and bases in swedish upper secondary schools experienced teachers' pedagogical content knowledge of teaching acidbase chemistry teachers' perceptions of the teaching of acids and bases in swedish upper secondary schools young people's images of science two views about explicitly teaching nature of science philosophy of chemistry: an emerging field with implications for chemistry education examining the mismatch between pupil and teacher knowledge in acid-base chemistry bonding epistemological aspects of models with curriculum design in acid-base chemistry transforming teacher education through the epistemic core of chemistry exploring young students' collaborative argumentation within a socioscientific issue pre-service science teacher preparation in europe: comparing pre-service teacher preparation programs reactions in liquid ammonia surveying students' conceptual and procedural knowledge of acid-base behavior of substances students' alternative conceptions in chemistry: a review of research and implications for teaching and learning models and modelling as a training context: what are pre-service teachers' perceptions? 
acknowledgements this work would not have been possible without the inspired discussions with esteban de manuel (rut's phd supervisor) and john t. leach, and the participation of ies murgi students and their teachers lucía, isabel and carmen. funding information this work has been partially financed by the projects edu2017-82197-p and pgc2018-097988-a-i00 of the ministry of science and innovation (mci) of spain, the state research agency (aei) and the european regional development fund (feder), and by a visiting scholar grant at the university of exeter (prx19/00364) of the ministry of education of the government of spain. conflict of interest no potential conflict of interest was reported by the authors. key: cord-318187-c59c9vi3 authors: basu, saikat; holbrook, landon t.; kudlaty, kathryn; fasanmade, olulade; wu, jihong; burke, alyssa; langworthy, benjamin w.; farzal, zainab; mamdani, mohammed; bennett, william d.; fine, jason p.; senior, brent a.; zanation, adam m.; ebert, charles s.; kimple, adam j.; thorp, brian d.; frank-ito, dennis o.; garcia, guilherme j. m.; kimbell, julia s. title: numerical evaluation of spray position for improved nasal drug delivery date: 2020-06-29 journal: sci rep doi: 10.1038/s41598-020-66716-0 sha: doc_id: 318187 cord_uid: c59c9vi3 topical intra-nasal sprays are amongst the most commonly prescribed therapeutic options for sinonasal diseases in humans. however, inconsistency and ambiguity in instructions show a lack of definitive knowledge on best spray use techniques.
in this study, we have identified a new usage strategy for over-the-counter nasal sprays that registers an average 8-fold improvement in topical delivery of drugs at diseased sites, when compared to prevalent spray techniques. the protocol involves re-orienting the spray axis to harness the inertial motion of particulates and has been developed using computational fluid dynamics simulations of respiratory airflow and droplet transport in medical imaging-based digital models. simulated dose in representative models is validated through in vitro spray measurements in 3d-printed anatomic replicas using the gamma scintigraphy technique. this work breaks new ground in proposing an alternative user-friendly strategy that can significantly enhance topical delivery inside the human nose. while these findings can eventually translate into personalized spray usage instructions and hence merit a change in the nasal standard-of-care, this study also demonstrates how relatively simple engineering analysis tools can revolutionize everyday healthcare. finally, with the respiratory mucosa as the initial coronavirus infection site, our findings are relevant to intra-nasal vaccines that are in development, to mitigate the covid-19 pandemic. www.nature.com/scientificreports prior research includes simulated predictions of respiratory flow physics and particulate transport therein; see e.g. [6-8]. of interest are nasal spray simulation studies on in silico models, re-constructed from medical imaging, to measure drug delivery along the nasal passages 9 , in the sinuses 10,11 , and on the effects of surgical alterations of the anatomy on nasal airflow [12-15] as well as on topical transport of drugs [16-19]. the latter addresses the role of the airway channel's shape in the context of airflow-droplet interactions. notably, while using medical devices like sprayers, which are inserted at the nostril, the anterior airway geometry gets altered.
even so, computational results 10 suggest that such initial perturbations do not greatly change or adversely affect the eventual drug deposits at the diseased sites. despite the abundance of computational research on nasal drug delivery, there is a distinct lack of articulate instructions for guidance on what could be the "best" way to use the commercially available sprayers. first, numerical studies often do not use a realistic distribution of droplet sizes while simulating topical sprays. focusing on specific droplet diameters is resourceful while studying the detailed nuances of transport characteristics in that size range; however, this somewhat limits the applicability of the subsequent findings while predicting the performance of real sprays, which have a wide variability of droplet sizes in each spray shot. second, the inter-subject anatomic variations also render it difficult to identify a generic spray orientation that can work for all and ensure maximal delivery of drugs at the diseased locations inside the nose. in this study, we have numerically tracked the transport of therapeutic particulates from over-the-counter nasal sprays via inhaled airflow. the computational fluid dynamics (cfd) models of droplet transport and the in silico prediction of their deposition sites along the nasal airway walls have been compared with in vitro spray experiments in 3d-printed solid replicas of the same anatomic reconstructions. we have proposed a new strategy of nasal spray usage, and the recommendation is supported by a significant improvement in target site particulate deposition (tspd), when compared to the prevalent spray use techniques. the study also expounds [20-22] on the potential of cfd as a tool in nasal ailment treatment and subject-specific prognosis, and can contribute to the emergence of non-invasive personalized therapeutics and treatment strategies.
preliminary results pertaining to this work have featured at the american physical society (aps) division of fluid dynamics annual meetings 23,24 and at the international society for aerosols in medicine (isam) congress [25-27]. anatomic reconstructions. all methods were performed in accordance with the relevant guidelines and regulations, including use of de-identified computed tomography (ct) data from three pre-surgery chronic rhinosinusitis (crs) patients, collected under approval from the institutional review board (irb) at the university of north carolina at chapel hill. we also obtained informed consent for participation in this study (which includes obtaining and use of ct data) from the test subjects. medical-grade ct scans of the subjects' nasal airways were used to re-construct digital models through thresholding of the image radiodensity, with a delineation range of −1024 to −300 hounsfield units for airspace 10,28 , complemented by careful manual editing of the selected pixels for anatomic accuracy. as part of that process, the scanned dicom (digital imaging and communications in medicine) files for each subject were imported into the image processing software mimics v18.0 (materialise, plymouth, michigan). for this study, we subsequently considered each side of the nose in the in silico models as a distinct nasal passage model, while studying the droplet transport properties when the spray nozzle was placed on that side: (a) subject 1's right side constituted nasal passage model 1 (npm1) and his left side was nasal passage model 2 (npm2); (b) subject 2's left side was nasal passage model 3 (npm3); and (c) subject 3's right side was nasal passage model 4 (npm4) and her left side was nasal passage model 5 (npm5).
note that subject 2's right-side anatomy did not exhibit direct access to the diseased intra-nasal targets from outside of the nostril and was not selected for this study. this had to do with the scope of our study design; for details, see the section on target site identification. also refer to the discussion section for follow-up comments. to prepare the in silico anatomic models for numerical simulation of the inhaled airflow and the sprayed droplet transport therein, the airway domain was meshed and spatially segregated into minute volume elements. the meshing was implemented by importing the mimics output in stereolithography (stl) file format into icem-cfd v18 (ansys, inc., canonsburg, pennsylvania). following established protocol 10,29 , each computational grid comprised approximately 4 million unstructured, graded tetrahedral elements, along with three prism layers of approximately 0.1-mm thickness extruded at the airway-tissue boundary with a height ratio of 1. inspiratory airflow and sprayed droplet transport simulations. laminar steady-state models work as a reasonable approximation while modeling comfortable resting to moderate breathing 8,30-32 . furthermore, with our simulations focusing on a single cycle of inspiration, steady-state flow conditions were adopted as a feasible estimate. based on the principle of mass conservation (continuity), and assuming that the airflow density stays invariant (incompressibility), we have

∇ · u = 0, (1)

with u representing the velocity field for the inspired air. conservation of momentum under steady-state flow conditions leads to the modified navier-stokes equations:

ρ (u · ∇) u = −∇p + μ ∇²u + ρ b, (2)

here ρ = 1.204 kg/m³ represents the density of air, μ = 1.825 × 10⁻⁵ kg/(m·s) is air's dynamic viscosity, p is the pressure in the airway, and b stands for accelerations induced by different body forces. to simulate the airflow, eqs. (1) and (2) were numerically solved through a finite volume approach, in the inspiratory direction.
the computational scheme on ansys fluent v14.5 employed a segregated solver, with simplec pressure-velocity coupling and second-order upwind spatial discretization. solution convergence was obtained by minimizing the flow residuals (viz. mass continuity ~o(10⁻²), velocity components ~o(10⁻⁴)), and through stabilizing the mass flow rate and the static outlet pressure at the nasopharynx of the digital models. a typical simulation convergence run-time with 5000 iterations clocked approximately 10 hours, for 4-processor-based parallel computations executed at 4.0 ghz speed. the numerical solutions implemented the following set of boundary conditions: (1) zero velocity at the airway-tissue interface, i.e. the tissue surface lining the sinonasal airspace (commonly called no slip at the walls), along with "trap" boundary conditions for droplets whereby a droplet comes to rest after depositing on the wall; (2) zero pressure at the nostril planes, which were the pressure-inlet zones in the simulations, with an "escape" boundary condition for droplets that allowed outgoing trajectories to leave the airspace through the nostril openings; and (3) a negative pressure at the nasopharyngeal outlet plane, which was a pressure-outlet zone, also with an "escape" boundary condition for droplets. the negative nasopharyngeal pressure was adjusted to generate inhalation airflow rates with less than 1% variation from subject-specific measurements of resting breathing. the physical recordings were collected with lifeshirt vests 33 that tracked chest compression/expansion during breathing, and accordingly quantified the inhalation rates (for additional details, see table 1).
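the outlet-pressure adjustment described here (tuning the nasopharyngeal suction until the simulated inhalation rate matches the measured one to within 1%) is, in effect, a one-dimensional root-find. a minimal sketch, with a made-up linear pressure-to-flow response standing in for the actual cfd solve (the −1.2 slope, the units, and the pressure bracket are all arbitrary assumptions):

```python
def tune_outlet_pressure(flow_at, target_q, p_lo=-50.0, p_hi=0.0,
                         tol=0.01, max_iter=60):
    """Bisection on a monotone nasopharyngeal-pressure -> inhalation-rate
    response, until the rate is within tol (1%) of the target.
    flow_at(p) stands in for one steady-state CFD solve at outlet pressure p."""
    for _ in range(max_iter):
        p = 0.5 * (p_lo + p_hi)
        q = flow_at(p)
        if abs(q - target_q) <= tol * target_q:
            return p, q
        if q < target_q:
            p_hi = p      # not enough suction: make the pressure more negative
        else:
            p_lo = p
    raise RuntimeError("target flow rate not bracketed by [p_lo, p_hi]")

# made-up linear response: q (L/min) = -1.2 * p (Pa); root must lie in the bracket
p_out, q_out = tune_outlet_pressure(lambda p: -1.2 * p, 20.0)
```

the real response curve comes from repeated cfd solves, so in practice each `flow_at` call is expensive and a bracketing method with few iterations is the sensible choice.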
after simulating the airflow, sprayed droplet dynamics were tracked through discrete phase particle transport simulations in the ambient airflow, and the corresponding lagrangian tracking estimated the localized deposition along the airway walls through numerical integration of the following transport equation 34 :

du_d/dt = (18μ / (ρ_d d²)) (c_d re / 24) (u − u_d) + g (ρ_d − ρ)/ρ_d + f_b ,

the parameters here are u_d, representing the droplet velocity; along with u as the airflow field velocity, ρ and ρ_d respectively as the air and droplet densities, g as the gravitational acceleration, and f_b as any other additional body forces per unit droplet mass (as, for example, the saffman lift force that is exerted by a typical flow-shear field on small particulates transverse to the airflow direction). the first term on the right-hand side, (18μ / (ρ_d d²)) (c_d re / 24) (u − u_d), quantifies the drag force contribution per unit droplet mass; here, c_d is the drag coefficient, d is the droplet diameter, and re represents the relative reynolds number. the mean time step for droplet tracking was on the order of 10⁻⁵ s, with the minimum and maximum limits for the adaptive step-size being o(10⁻¹⁰) s and o(10⁻³) s, respectively. also note that the solution scheme posits the particulate droplets to be large enough to ignore brownian motion effects on their dynamics. post-processing of the simulated data laid out the spatial deposition trends, which were then tallied against in vitro observations.
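a minimal 2-d sketch of integrating this force balance, assuming pure stokes drag (c_d re/24 ≈ 1, valid at small relative reynolds number), a water-like droplet density of 1000 kg/m³ (our assumption for the aqueous spray), and a made-up uniform stream in place of the cfd airflow field; the explicit stepping and the 10⁻⁵ s step mirror the tracking description:

```python
# fluid properties from the text; droplet density is an assumption (~water)
RHO_AIR = 1.204        # kg/m^3
MU_AIR = 1.825e-5      # kg/(m s)
RHO_DROP = 1000.0      # kg/m^3 (assumed)
G = 9.81               # m/s^2

def track_droplet(d, u_air, v0, dt=1e-5, t_end=1e-3):
    """Explicit-Euler integration of du_d/dt = (u - u_d)/tau + g*(rho_d - rho)/rho_d
    in 2-D (x horizontal, y vertical, gravity along -y), i.e. the transport
    equation above with the Stokes simplification C_D Re/24 ~ 1."""
    tau = RHO_DROP * d * d / (18.0 * MU_AIR)   # momentum relaxation time
    g_eff = G * (RHO_DROP - RHO_AIR) / RHO_DROP
    vx, vy = v0
    for _ in range(int(round(t_end / dt))):
        vx += dt * (u_air[0] - vx) / tau
        vy += dt * ((u_air[1] - vy) / tau - g_eff)
    return vx, vy, tau

# a 5-micron droplet released at spray-like speed into a 2 m/s stream:
# within a few relaxation times it simply rides the airflow
vx, vy, tau = track_droplet(5e-6, u_air=(2.0, 0.0), v0=(18.5, 0.0))
```

after 1 ms (about 13 relaxation times for this droplet), the horizontal velocity has collapsed onto the 2 m/s stream and the vertical velocity has settled to the tiny stokes terminal value, which is the behaviour the red-circle discussion of fig. 4 describes.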
printing of the anterior soft plastic part on a connex3 3d printer was done by ola harrysson's group at north carolina state university (at the edward p fitts department of industrial and systems engineering), using polymer inkjetting process on tangogray flx950 material. see fig. 2 (a-c) for representative pictures of a digitized model and the corresponding 3d replica. recording deposits through gamma scintigraphy. intra-nasal topical delivery was tracked through in vitro examination of mildly radioactive spray deposits in the 3d-printed anatomic replicas. to ensure that the spray axis orientation and nozzle location aligned with the corresponding simulated spray parameters, we used specially designed nozzle positioning devices (npd) inserted at the nostril. the spray bottle was fitted into the npd, while administering the spray via hand-actuation. for each sample test, a bottle of commercial nasal spray nasacort was labeled with a small amount of radioactive technetium (tc99m) in saline. at the time of dispensing the spray shots, a vacuum line controlled by a flow-valve was used to set up inhalation airflow through the model, and the flow rate was commensurate with the subject-specific breathing data (table 1 ). corresponding setup is in fig. 2(d,e) . four independent replicate runs of each spray experiment were conducted, followed by compilation of the means and standard deviations of the drug deposits along the inner walls of the solid models. the topical deposition was proportional to the radioactive signals emitted from the spray solution traces that deposited inside a solid model and was quantifiable through image-processing of the scintigraphy visuals, collected using a bodyscan (mieamerica, forest hills, il) 400-mm width by 610-mm height 2d gamma camera. the pixel domain was 256 × 256, with an image acquisition time of 3 minutes; and one pixel equated to a cartesian distance of 2.38 mm in the digital and 3d models. table 1 . 
this table incorporates the parameters for measured and simulated inhalation airflow in the study subjects. symbols: σ = standard deviation, μ = mean; * ⇒ the inhalation rate is considered to be twice the minute ventilation; ** ⇒ the target simulated airflow is 94.24% of the measured rate 62 , to account for the influence of the subjects' awareness of the recording of the breathing process. note that the tidal volume is a measure of the lung volume representing the volume of air displaced between normal inhalation and exhalation, without application of any extra effort. the minute ventilation (air inhaled per minute) is computed from the inspiratory phase of a breath 33,63 . model segmentation for comparison with numerical data. to facilitate the comparison between the numerical predictions on droplet deposition and the physical observation of gamma scintigraphy signals in the corresponding solid replica, we segregated npm1 and npm2 into virtual segments oriented along three different directions. figure 3 lays out the cartesian coordinate directions for the 3d space. x was perpendicular to the sagittal plane, traversing from the left to the right sides of the nasal models (with the model head facing forward); y was perpendicular to the axial plane, traversing from the inferior to the superior aspects of the models; and z was perpendicular to the coronal plane, traversing from the anterior to the posterior aspects of the models. the virtual segments were oriented along the xy (coronal), yz (sagittal), and zx (axial) planes. parallel to the xy coronal plane, the models contained 12 segments (named c12-c1 ⇒ sagittal columns); there were 9 compartments (c1-c9 ⇒ frontal columns) parallel to the yz sagittal plane, and there were 12 compartments (r1-r12 ⇒ sagittal rows) parallel to the zx axial plane (see fig. 3).
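the binning of a simulated deposition coordinate into such compartments can be sketched as below; only one of the three directions (the z direction, with its 12 compartments) is shown, the plane coordinates are hypothetical placeholders for the ones measured from the reference markers, and for simplicity the labels here run c1..c12 front to back (the paper's naming runs c12-c1):

```python
from bisect import bisect
from collections import Counter

# hypothetical z-coordinates (mm) of the 11 interior coronal planes that
# split the model into 12 compartments, anterior to posterior
z_planes = [6.0, 12.0, 18.0, 24.0, 30.0, 36.0, 42.0, 48.0, 54.0, 60.0, 66.0]

def coronal_compartment(z_mm):
    """Label of the compartment containing a deposition site at coordinate z."""
    return "c%d" % (bisect(z_planes, z_mm) + 1)

def deposition_fractions(sites):
    """Per-compartment deposition fraction from (z_mm, deposited_mass) pairs."""
    mass = Counter()
    for z, m in sites:
        mass[coronal_compartment(z)] += m
    total = sum(mass.values())
    return {c: m / total for c, m in mass.items()}

# toy deposition record: two droplets anteriorly, one mid, one posterior mass
fracs = deposition_fractions([(3.0, 2.0), (7.5, 1.0), (7.9, 1.0), (50.0, 4.0)])
```

the same bookkeeping, repeated for the x and y plane sets, yields the per-compartment fractions that are later compared against the scintigraphy signals.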
for each compartment, the particulate deposition fraction predicted from the simulation was compared with the deposition fraction measured based on gamma signals of the deposited particulates in the corresponding compartment of the 3d-printed model. to achieve this, signals emitted from the solution traces, that settled along the airway walls, were subjected to image processing analysis. therein, by superimposing the compartmental grid on the radio-images, the signals were extracted from each compartment. in order to align the grid on the image in a manner consistent with the virtual model, three inset discs were designed as reference points on the outer surface of the virtual and 3d-printed models. americium sources from commercial in-home smoke detectors were inserted into the insets as reference points on the 3d-model and a radio-image was recorded. for the analysis, the scintigraphy images were processed using imagej 35 by constructing a region of interest (roi) referenced to the fixed americium sources. care was taken to align the emitted visual signals with similar reference regions within the superimposed grid. this was done via manual visualization to achieve a best fit of signal intensity within reference regions. the grid compartment planes positioned using this visual best-fit technique were designated as "reference planes". given the nature of the radioactive signals and the resolution of the radio-image, some signal intensity resided outside of reference regions even while using best-fit practices. a reasonable fit could be obtained by shifting the image by one pixel in either direction (positive shift/negative shift). in order to account for this variation, alternative plane positions (see fig. 3 (d)) were created by shifting the reference planes one pixel along the positive and negative axes for each set of cartesian planes.
these three sets of compartment planes were positioned in the in silico modeling software using the measured distances from the reference regions. the corresponding cartesian coordinates of these planes were used to assign droplet deposition locations from the computational simulations to grid compartments, for comparison with the in vitro model. in these comparisons, we left out the deposits in the anterior nose (from the cfd data as well as the physical recordings) in order to negate the bright radiation signal coming from that zone in the experimental deposits, and focused only on measurements from the posterior parts of the respective models. note that the anterior nose in an in silico model is in fact the removable soft pliable anterior part in the corresponding 3d print (e.g. see fig. 2). figure 3 (caption): panels (a)-(c) depict the gridline schematic on npm1 and npm2 that is used to extract the deposition fractions from the gamma scintigraphy-based quantification of the sprayed deposits in the solid replicas. the models are respectively segregated into 3 sets of compartments: sagittal columns, frontal columns, and sagittal rows. panel (d) shows the perturbation of the base gridline by 1 pixel. representative technetium signals are in panel (e). note: in regard to the axis system, the circle with a solid dot implies the out-of-plane direction from this page, while the circle with a cross signifies the into-the-plane direction. identification of target site and spray parameters. effect of airflow on droplet trajectories. the inertial motion of a droplet is proportional to its mass, and hence scales with the cube of the droplet diameter. consequently, for bigger droplets, the inertial motion persists longer before being taken over by the ambient airflow. figure 4(a) tracks the trajectory of a representative 5 μm droplet.
in there, the tiny red circle marks the location where the inertial motion of the droplet got overwhelmed by the ambient flow, beyond which the droplet trajectory was the same as the airflow streamline on which it was embedded at the red circle's location. note the contrasting 25 μm droplet trajectory in fig. 4(b), where the inertial motion persisted longer. the phenomenon has a significant impact on drug deposition trends. the bigger droplets (≥100 μm) show a greater propensity to hit the anterior walls directly owing to their high initial momentum, while smaller droplet sizes penetrate further into the airspace; see e.g. figure 4(c,d). figure 4 (caption): in panel (a), the smaller droplet has weaker inertial momentum and the ambient airflow streamline takes over its motion much earlier than in the case of a heavier droplet like the one in panel (b), where the inertial momentum of the 25 μm droplet persists longer. the small red circle in (a) depicts the point where the inertial momentum gets overwhelmed by the fluid streamline. evidently, owing to their smaller inertia, the droplets with smaller diameters get predominated by the airflow streamlines earlier than the bigger droplets. this results in a better penetration and spread of sprayed droplets in the nasal airspace, as shown in panel (c), for a different nasal model. on the contrary, spray shots with an exclusive share of bigger droplets (e.g. ≥100 μm here) tend to follow their initial inertial trajectories, without much effect of the airflow streamlines on their paths, and deposit along the anterior walls of the nasal airspace, as depicted in panel (d). the red boundaries in panels (c) and (d) highlight the difference in particulate penetration into the model in the two cases. note: these images were created using fieldview, as provided by intelligent light through its university partners program. to ensure that the bigger droplets also reach the target sites, we argue that it would be conducive to harness their inertial motion and direct those droplets actively toward the target when they exit the spray nozzle. this can be feasibly achieved by orienting the spray axis to pass directly through an intended anatomic target zone. existing usage instructions 36,37 indicate a lack of definitive knowledge on the best ways to use a nasal spray device. different commercial sprayers often offer somewhat contrasting recommendations. however, there is a common agreement (see fig. 5(a)) that the patient should incline her/his head slightly forward, while keeping the spray bottle upright 36,38 . furthermore, there is a clinical recommendation to avoid pointing the spray directly at the septum (the separating cartilaginous wall between the two sides of the nose). these suggestions were adopted in our standardization 16 of the "current use" (cu) protocol for topical sprays. the digital models were inclined forward by an angle of 22.5°, and the vertically upright 36 spray axis was closer to the lateral nasal wall, at one-third of the distance between the lateral side and the septal wall. also, the spray bottle was so placed that it penetrated into the airspace by a distance of 5 mm, inspired by the package recommendations of commercial sprayers 38 for a "shallow" insertion into the nose. refer to fig. 5(b,c) for the schematics of the cu protocol used in this study. target site identification and proposing an alternate spray use criterion. all sinuses, except the sphenoid, drain into the ostiomeatal complex (omc), which is the main mucociliary drainage pathway and airflow exchange corridor between the nasal airway and the adjoining sinus cavities. to ensure that as many drug particulates as possible reach the sinus chambers and their vicinity, we hypothesize that the spray axis should be directed straight toward the omc 39 .
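the inertia reasoning behind this hypothesis can be put in rough numbers via the stokes momentum relaxation time τ = ρ_d d²/(18μ) and the associated stopping distance u₀·τ. a back-of-the-envelope sketch, assuming a water-like droplet density of 1000 kg/m³ (our assumption) and the ~18.5 m/s nozzle exit speed quoted later in the spray characterization; pure stokes drag is assumed, so the large-droplet numbers are only indicative:

```python
MU_AIR = 1.825e-5    # kg/(m s), from the text
RHO_DROP = 1000.0    # kg/m^3, assumed (aqueous spray)
U_EXIT = 18.5        # m/s, mean spray exit speed quoted in the text

def relaxation_time(d):
    # Stokes momentum relaxation time: tau ~ d^2, so a 5x wider droplet
    # carries its initial momentum 25x longer
    return RHO_DROP * d * d / (18.0 * MU_AIR)

def stopping_distance(d):
    # distance over which the exit momentum decays (Stokes estimate)
    return U_EXIT * relaxation_time(d)

tau5, tau25 = relaxation_time(5e-6), relaxation_time(25e-6)
s5, s25 = stopping_distance(5e-6), stopping_distance(25e-6)
```

the estimates come out to roughly 1.4 mm of inertial travel for a 5 μm droplet versus roughly 35 mm for a 25 μm droplet, which is why aiming the spray axis through the target matters most for the large-droplet mass fraction.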
this is supported by our observation of the effect of airflow physics on droplet trajectories. if the spray axis hits the omc directly, the likelihood that the larger droplets will deposit there is higher. we refer to this usage protocol as "line of sight" (los). like the cu protocol, the los protocol also had the sprayer inserted at a depth of 5 mm into the nasal airspace. a representative los orientation is shown in fig. 6. the tspd percentage at the omc and the sinuses was evaluated as tspd = 100 × (m_target / m_spray); with m_target being the spray mass of the particulate droplets deposited at the omc and inside the sinus cavities, and m_spray being the mass of one spray shot. to establish the robustness of the tspd predictions for the cu and los protocols, we also tracked droplet transport and deposition when the spray directions were slightly perturbed. such perturbed peripheral directions for cu initiated 1 mm away on the nostril plane and were parallel to the cu's vertically upright true direction. for los, the perturbed peripheral directions were obtained by connecting the base of the true los direction on the nostril plane with points that radially lie 1 mm away from a point on the los; this specific point being 10 mm away along the los from the base of the los direction on the nostril plane (e.g. see the bottom panel of fig. 7 for an illustrative example). parameters for the simulated spray shot. over-the-counter nasacort (triamcinolone acetonide), a commonly prescribed and commercially available nasal spray, was selected for this study. four units of nasacort were tested at next breath, llc (baltimore, md, usa) to characterize the in vitro spray performance. the corresponding plume geometry was analysed through a sprayview nosp, a non-impaction laser sheet-based instrument. the averaged spray half-cone angle was estimated at 27.93°, and the droplet sizes in a spray shot followed a log-normal distribution.
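this size spectrum can be sanity-checked by sampling; a sketch using python's stdlib log-normal sampler, with the mass median diameter x50 = 43.81 μm and geometric standard deviation σg = 1.994 from the next breath characterization (variable names are ours):

```python
import math
import random
import statistics

X50 = 43.81       # mass median diameter, microns (from the text)
SIGMA_G = 1.994   # geometric standard deviation (from the text)

rng = random.Random(2020)
# lognormvariate takes the mean and sigma of ln(x): ln(x50) and ln(sigma_g)
diam = [rng.lognormvariate(math.log(X50), math.log(SIGMA_G))
        for _ in range(200_000)]

median_d = statistics.median(diam)                    # should sit near x50
frac_small = sum(d < X50 for d in diam) / len(diam)   # ~0.5, by definition
```

the sample median lands on x50 and half the droplets fall below it, which is exactly the property a mass median diameter is defined by; the same sampler can seed the solid-cone injections used in the droplet-tracking runs.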
with the droplet diameter as x, the droplet size distribution can be framed as a probability density function of the form 40 :

p(x) = (1 / (x ln(σ_g) √(2π))) exp( −(ln x − ln x_50)² / (2 (ln σ_g)²) ),

here, x_50 = 43.81 μm is the mass median diameter (alternatively, the geometric mean diameter 41 ) and σ_g = 1.994 is the geometric standard deviation. the latter quantifies the span of the droplet size data. measurements were also made with and without the saline additive in the sprayer, and the tests returned similar droplet size distributions. note that a saline additive was used during the physical recording of the sprayed deposits. also, as per earlier findings in the literature 42 , the mean spray exit velocity from the nozzle is approximately 18.5 m/s, based on phase doppler anemometry-based measurements. for the test spray units (at next breath), the actuation forces were found to range between 8.5 and 11.0 kg-force. considering an actuation area of approximately 4 cm², the force measurements agree well with earlier values in the literature [43-45], and hence the resultant pressure exerted on the droplets in our physical experiments was assumed to maintain a similar droplet size distribution, as was determined in the test cases by next breath. the droplets contained in one spray shot in the numerical simulations followed the same size distribution. while simulating the droplet trajectories, we assumed typical solid-cone injections and tracked the transport for a 1-mg spray shot while comparing the tspd trends from the cfd predictions with the corresponding experimental drug delivery patterns. on the other hand, 95.0306 mg (which is one shot of nasacort, as quantified by next breath, llc) of spray mass transport was simulated while comparing the cfd-based tspd numbers for the los and cu protocols in each model. comparison between cu and los spray usage protocols.
los was found to be consistently superior in comparison to the cu spray placement protocol, while targeting the omc and the sinus cavities for drug delivery. table 2 lists the deposition fraction percentages for each spray release condition in the five airway models (npm1-npm5). for a graphical interpretation, we have plotted the same information in fig. 7. figure 7 (caption excerpt): panel (f) compares the tspd for peripheral directions under a 0.5-mm perturbation (on the left) with respect to a 1-mm perturbation (on the right) from the true los orientation, both in npm1; as expected from the overall findings, the tspd increased for the perturbed spray directions that were closer to the true los. panel (g) depicts the spatial perturbation parameters for the los spray axis orientation in npm1. overall, the deposition fraction for the los was on average 8.0-fold higher than the cu deposition fraction, with the corresponding subject-specific improvement range being 1.8-15.8-fold for the five test models. the improvement does decay when the perturbed peripheral spray directions are compared, to assess the robustness of the los protocol's advantage over cu. considering the varying peripheral directions around the true los and cu, the los set registered an average 3.0-fold increase in tspd, with the corresponding subject-specific improvement range being 1.6-4.3-fold. statistical tests on improvements achieved by the revised spray use strategy. los was compared to cu through a paired study design on the data from the five test models. table 3 lays out the computed numbers. for each model, the outcome comprised the percentage of deposition in the omc and the sinuses for both cu and los spray usage. the null hypothesis considered for this statistical test assumed that the tspd would be the same for cu and los in an airway model.
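the wilcoxon part of this paired testing has an exact small-sample form, and the n = 5 p-value floor can be reproduced by enumerating all 2⁵ sign assignments; the difference values below are hypothetical stand-ins, not the table 3 data:

```python
from itertools import product

def wilcoxon_exact_two_sided(diffs):
    """Exact two-sided paired Wilcoxon signed-rank p-value, enumerating all
    2^n sign assignments (assumes no ties and no zero differences)."""
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0] * n
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    # null distribution of W+: each rank counts as + or - with probability 1/2
    dist = [sum(signs) for signs in product(*[(0, r) for r in range(1, n + 1)])]
    ge = sum(w >= w_plus for w in dist) / len(dist)
    le = sum(w <= w_plus for w in dist) / len(dist)
    return min(1.0, 2.0 * min(ge, le))

# if LOS beats CU in every one of the five models, W+ is maximal and the
# exact p-value hits its n = 5 floor of 2/32
p = wilcoxon_exact_two_sided([5.4, 1.6, 1.1, 2.0, 0.9])   # hypothetical gaps
```

the enumeration gives p = 2/2⁵ = 0.0625, i.e. the "lowest possible p-value ... given only five pairs of data" (reported to two decimals as 0.06).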
the deposition percentages corresponding to the cu and los protocols in the same nostril were treated as paired observations for a paired t-test to check the null hypothesis. owing to the relatively small study cohort, the paired wilcoxon signed-rank test was also used as a robustness check. in order to study how spatial variation might affect the difference between cu and los, three different ways of calculating the percentage of deposition were implemented. the first strategy considered the average deposition from the true los and cu directions. the second strategy compared the tspd averaged from the true cu and los directions, along with the deposition data for spray release parameters obtained by perturbing the respective true directions. the third strategy used tspd averaged exclusively from the deposition data corresponding to the perturbed spray release parameters. this allowed us to assess the robustness of any probable improvement from using los, while still accounting for slight spatial variations of the spray direction. the first comparison method demonstrates an average deposition increase of 5.4 percentage points for los (6.39% for los vis-à-vis 0.98% for cu). this difference is significant at the 0.05 level, with a p-value from the paired t-test of 0.03. the paired wilcoxon signed-rank test has a p-value of 0.06, which is the lowest possible p-value for the wilcoxon signed-rank test given only five pairs of data. in the second comparison scheme, los has an increased deposition of 1.62 percentage points relative to cu (2.49% vis-à-vis 0.87%). the p-value for this difference is 0.02 using the paired t-test and 0.06 using the wilcoxon signed-rank test. finally, for the third comparison method, los registered an increased deposition of 1.05 percentage points relative to cu (1.90% vis-à-vis 0.86%).
the p-value for this difference is 0.02 using the paired t-test and 0.06 using the wilcoxon signed-rank test. this provides strong evidence that los leads to a higher percentage of deposition in the omc and sinuses. the estimated difference is largest when using just the true directions, but it remains statistically significant (by the paired t-test) even when using the spray release points obtained by perturbing the true directions. the p-value from the paired t-test is actually lower when the tspd from just the perturbed points is considered, owing to the reduced variance of the estimated difference. for all three ways of estimating the percentage of deposition, the paired wilcoxon signed-rank test returns a p-value of 0.06, the smallest value attainable with only five pairs of data; this supports the conclusion that los results in higher deposition across all five nostril models.
comparison of the simulated tspd predictions with physical experiments.
figure 8 compares the numerical tspd predictions with corresponding gamma scintigraphy-based experimental recordings in npm1 and npm2. while the compartmental deposits visibly presented a congruous trend in the sagittal columns, sagittal rows, and frontal columns, we conducted additional statistical tests to verify the homogeneity between the two sets of data so as to establish the reliability of the computational findings. table 4 gives the pearson and kendall correlations between the numerical and experimental models for the average deposition fractions in npm1 and npm2 for the los protocol. the confidence intervals are based on 1000 bootstrap samples, instead of asymptotic approximations, because of the relatively small sample size. based on the output, we can see that the pearson correlation is consistently very high, while the kendall correlation is somewhat lower.
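the paired-test setup above can be reproduced with standard scipy calls. the deposition values below are illustrative placeholders (the paper's per-model numbers are in its table 3, not reproduced here); the point of the sketch is the test mechanics, including why 0.06 is the floor for the wilcoxon p-value with five pairs.

```python
import numpy as np
from scipy import stats

# illustrative LOS vs. CU deposition percentages for five paired airway
# models (made-up values, not the paper's table 3):
los = np.array([6.1, 2.3, 9.5, 4.0, 10.0])
cu  = np.array([1.0, 0.6, 1.5, 0.9, 0.9])

t_res = stats.ttest_rel(los, cu)     # paired t-test on the five pairs
w_res = stats.wilcoxon(los - cu)     # paired wilcoxon signed-rank test

print(f"paired t-test p = {t_res.pvalue:.3f}")
print(f"wilcoxon      p = {w_res.pvalue:.4f}")
# with five pairs and all differences of one sign, the exact two-sided
# wilcoxon p-value is 2 * (1/2)**5 = 0.0625, which rounds to the 0.06
# reported in the text: it is the smallest value the test can produce here.
```

the wilcoxon floor explains why the text treats p = 0.06 as supportive rather than contradictory evidence at n = 5.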
however, while the kendall correlation is often considered more robust to outliers, particularly for small sample sizes like this data-set, in this particular instance the pearson correlation is likely more illustrative. this is because the pearson correlation shows that, for the most part, the magnitudes of the estimates are similar and comparable between the numerical and experimental models. in general, there is a strong linear relationship between the percentage-of-deposition predictions from the numerical model and the corresponding physical measurements in the experimental model. the lower kendall correlation (overall mean measure 0.78) is largely due to regions where both the numerical and experimental models had very low average deposition but the exact rank of these regions changed considerably between the two data-sets. note that this does not necessarily indicate a poorly performing numerical model. however, the relatively high pearson correlation (overall mean measure 0.91) does indicate that the numerical models perform well in predicting the sprayed droplet transport. cfd-guided nasal spray usage defined by the los protocol was found to significantly enhance topical drug delivery at targeted sinonasal sites, when compared to currently used spray administration techniques. with an increased sample size, this work can be the catalyst toward personalized instructions and specifications for improved use of topical sprays. the findings thus have the potential to substantially upgrade the treatment paradigm for sinonasal ailments, through the ability to ascertain los in individual subjects via endoscopic examinations conducted in the clinic, and to help guide treatment decision-making and patient instructions for spray usage.
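the bootstrap confidence intervals mentioned above can be sketched as follows, on made-up compartmental deposition fractions (the actual values live in the paper's table 4): percentile intervals from 1000 resamples of the pairs, rather than asymptotic approximations, because the sample is small.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# illustrative per-compartment deposition fractions (percent), numerical vs.
# experimental -- placeholders, not the paper's data:
numerical    = np.array([0.5, 1.2, 3.8, 7.5, 12.0, 20.1, 0.9, 2.2])
experimental = np.array([0.7, 1.0, 4.2, 6.9, 13.5, 18.8, 1.4, 1.9])

def bootstrap_ci(x, y, corr, n_boot=1000, alpha=0.05):
    """percentile bootstrap ci for a correlation statistic corr(x, y)."""
    n = len(x)
    vals = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample pairs with replacement
        vals.append(corr(x[idx], y[idx]))
    vals = np.asarray(vals)
    vals = vals[~np.isnan(vals)]           # guard against degenerate resamples
    return np.quantile(vals, [alpha / 2, 1 - alpha / 2])

pearson = lambda a, b: stats.pearsonr(a, b)[0]
kendall = lambda a, b: stats.kendalltau(a, b)[0]

print("pearson 95% ci:", bootstrap_ci(numerical, experimental, pearson))
print("kendall 95% ci:", bootstrap_ci(numerical, experimental, kendall))
```

the same harness accepts any statistic, which is why pearson and kendall intervals come from one resampling routine.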
to quantify the suitability of a person's airway for the los spray protocol, we exploratorily propose a scoring system based on how much of the targeted drug delivery sites (omc, sinuses) is visible when inspected clinically from outside the nostril. the scoring system will also serve to quantify nasal anatomic variability among individuals. accordingly, as part of the current study, the los scores (see table 5) were first determined observationally, based on the external visibility of the omc site in the in silico sinonasal reconstructions. we fixed a range of scores ∈ [1, 4], with 4 being used when the los direction was easiest to ascertain. subjective as that scoring procedure may be, it is similar to what attending physicians will gauge during a clinic visit to determine if a particular patient has a "line of sight" in her/his nasal anatomy. so, to establish the relevance of the findings from this manuscript toward revisions of the therapeutic protocol for sinonasal care, it is important to assess the comparability of the observational los scores with more objective score-determination techniques. this was achieved by calculating the surface area of the nostril plane and the projected area of the omc on the plane of the nostril. we computed the ratio of the projected area to the nostril area, as a percentage. scores of 4 were assigned if the ratio exceeded 6%, 3 if the ratio exceeded 4%, 2 if the ratio was more than 1.5%, and 1 if the ratio was greater than 0%. the two scoring techniques yielded very similar results (as in table 5), with the highest and lowest scores respectively going to the same anatomic models. the pearson correlation for the two sets of scores was 0.85.
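the threshold-based objective scoring above maps directly to a small function; this is a sketch of those thresholds, with a 0 return (no externally visible omc, hence no los) added here as an assumption for completeness since the text's scale runs only from 1 to 4.

```python
def los_score(projected_area_ratio_pct):
    """objective LOS score from the ratio (%) of the OMC's projected area on
    the nostril plane to the nostril area, using the thresholds in the text."""
    if projected_area_ratio_pct > 6.0:
        return 4
    if projected_area_ratio_pct > 4.0:
        return 3
    if projected_area_ratio_pct > 1.5:
        return 2
    if projected_area_ratio_pct > 0.0:
        return 1
    return 0  # assumption: OMC not externally visible, i.e. no line of sight

print([los_score(r) for r in (7.2, 5.0, 2.0, 0.4, 0.0)])  # → [4, 3, 2, 1, 0]
```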
while a broader study, involving clinical trials, will be necessary to revise the therapeutic protocol for nasal drug delivery, the present results illustrate the easy adaptability of our findings into clinical practice settings.
on the comparability of the experimental data with the numerical findings.
the computational simulations assumed a laminar framework to mimic steady breathing. however, one may argue that even at resting breathing rates, the airflow often contains transitional features like vortices, emerging from the roll-up of shearing fluid layers during flow-structure interactions 46-51 at the anatomic bends. some of these nuances are, in fact, difficult to model without proper turbulence simulations 52,53. however, true as that may be, the effect of these flow artifacts on eventual drug delivery in the sinuses has been found to be somewhat nominal when comparing laminar and turbulence simulation results 10. on the other hand, the in vitro techniques also often pose challenges. for instance, there can be post-deposition run-off as the deposited solution traces undergo translocation along the inner walls of the solid replica. such drip-off dynamics can lead to a flawed estimate of regional deposition. the effect of post-deposition dripping can be conjectured to be most prominent for the signals extracted from the sagittal rows, as the deposited droplets move downward along the internal solid walls of the 3d-printed models owing to gravitational effects. this is confirmed by the physical and numerically-predicted signals from the sagittal rows demonstrating relatively lower correlation coefficients (when contrasted with the correlations for the signals from the sagittal and frontal columns) in the two experimental comparisons (e.g., see table 4).
in the gamma scintigraphy-based method of recording deposits, the radiation signal undergoes some level of scattering, and hence in the process of signal extraction from each of the compartments there is the possibility that signals from one compartment contaminate the signals at neighboring compartments. to minimize this effect while carrying out the comparisons, the nose (the soft plastic anterior part in the 3d-printed models), which had a bright radiation signal owing to the relatively large amount of anterior deposits, was excluded from both the experimental and numerical data. finally, while the inhalation airflow rates were the same in vitro and in silico, the airflow partitioning on the two sides of the nasal airways was likely affected by the placement of the npd while administering the spray through hand-actuation.
caveats and future implications.
readers should note that this was a computational study with validation from spray transport observations in inanimate solid replicas. also, not every patient will have clear access to the omc, and hence some may be without an los. for instance, in the current study, of the six airway sides in the three study subjects, subject 2's right-side airway did not exhibit an los. bulk rheology of the spray also affects the droplet size distribution. the spray property measurement tests having been performed on real over-the-counter sprays, we did not separately examine the effect of different droplet viscosities on the spray deposition trends. a different viscosity of the nasal spray can indeed alter the drug deposits, as observed in multiple studies 37,54,55. it should, however, be pointed out that the spray positioning strategies proposed in this study can be conjectured to be generic and should maximize drug delivery to the omc and the sinuses for other sprays as well.
it is also critical to note that the flow simulations for evaluating the spray usage strategies were not multiphase, implying that the sprayed droplets were not affected by constituents such as inhaled air moisture and the mucous lining, nor was there any consideration of droplet evaporation. there are, however, earlier findings in the literature that have looked at some of these nuances, e.g., on the interaction of deposited particulates with mucus 56 and on the phase change of inhaled droplets during their passage through the respiratory tract 57. the current study simply tracked the motion of inert droplets against the ambient inspiratory airflow and recorded their regional deposition. based on the surfactants in the spray solution, the droplets might also be rendered hydrophilic; such effects are beyond the scope of this project and of the numerical schemes that have been implemented. it is, however, worth noting that such hydrophilicity may at times lead to agglomeration of droplet molecules, which can impact the topical drug deposition estimates. this study, its restricted sample size and limitations notwithstanding, is still, to the best of our knowledge, the first of its kind to propose an alternative easy-to-implement strategy that can significantly improve the intra-nasal delivery of topical drugs at the diseased sites. the recommendation for using the "line of sight" is user-friendly, personalized (the physician can instruct the patient on the spray usage technique based on a fast los check in the clinic), and has the potential to be smoothly incorporated into the nasal standard-of-care. for probable revisions to the clinical regimen, we will need a broader study with more subjects, along with a component for clinical trials to track patient response.
comparison of the numerical data with in vivo spray performance will also eliminate errors that contaminate the in vitro tspd numbers (e.g., from drip-off of the deposited solution along the inner wall contours of the 3d-printed models). nevertheless, from a broader perspective, the current study demonstrates how relatively simple engineering analysis and mechanistic tools can usher in transformative changes in the prognosis and treatment protocol for ailments such as nasal congestion and respiratory infection.
special comments on the significance of the findings in view of the 2019-2020 coronavirus pandemic.
with the rapid spread of the novel coronavirus disease 2019 (covid-19) worldwide, it is essential that a vaccine or a curative is developed as soon as possible. with the respiratory mucosa as the initial site of coronavirus infection and transmission, mucosal immunization through a targeted intra-nasal vaccine promises to be an effective strategy for prophylaxis, by inducing mucosal and systemic immune responses. as of may 2020, several research groups are working on the possibility of designing intra-nasal vaccines for covid-19 58-60, with supporting data from work carried out on earlier strains of coronavirus 61. in this context, the intra-nasal anatomic targeting strategies (e.g., see fig. 9) discussed in the current study can be of significant help in increasing topical delivery. this project has generated both simulated and experimental, quantitative, de-identified data on the regional deposition of aerosolized nasal medication in the form of nasal spray droplets in the sinonasal passages. for readers' convenience, table 2 details the drug delivery numbers processed from all the numerical runs, and the narrative included under methods elucidates the computational software settings for the airflow and droplet transport simulations.
the datasets generated and/or analysed during the current study are also available from the corresponding author on reasonable request.
references (titles only):
- nasal architecture: form and flow
- comparative anatomy and physiology of the nasal cavity
- adult chronic rhinosinusitis: definitions, diagnosis, epidemiology, and pathophysiology
- topical corticosteroids in chronic rhinosinusitis: a randomized, double-blind, placebo-controlled trial using fluticasone propionate aqueous nasal spray
- clinical practice guideline: adult sinusitis
- perceiving nasal patency through mucosal cooling rather than air temperature or nasal resistance
- from ct scans to cfd modelling - fluid and heat transfer in a realistic human nasal cavity
- upper airway reconstruction using long-range optical coherence tomography: effects of airway curvature on airflow resistance
- simulation of sprayed particle deposition in a human nasal cavity including a nasal spray device
- on computational fluid dynamics models for sinonasal drug transport: relevance of nozzle subtraction and nasal vestibular dilation
- clinical questions and the role cfd can play
- quantification of airflow into the maxillary sinuses before and after functional endoscopic sinus surgery
- comparison of airflow between spreader grafts and butterfly grafts using computational fluid dynamics in a cadaveric model
- impact of endoscopic craniofacial resection on simulated nasal airflow and heat transport
- nasal airflow changes with bioabsorbable implant, butterfly and spreader grafts. the laryngoscope
- characterizing nasal delivery in 3d models before and after sinus surgery
- comparative study of simulated nebulized and spray particle deposition in chronic rhinosinusitis patients
- ideal particle sizes for inhaled steroids targeting vocal granulomas: preliminary study using computational fluid dynamics
- can we use cfd to improve targeted drug delivery in throat? bulletin of the ...
- a review of the implications of computational fluid dynamic studies on nasal airflow and physiology
- a critical overview of limitations of cfd modeling in nasal airflow
- image-based computational fluid dynamics in the lung: virtual reality or new clinical practice
- topical drug delivery: how cfd can revolutionize the usage protocol for nasal sprays
- "magical" fluid pathways: inspired airflow corridors for optimal drug delivery to human sinuses
- numerical and experimental investigations on nasal spray usage strategies in chronic rhinosinusitis
- enhanced deposition of nasal sprays using a patient-specific positioning tool
- comparative analysis of nebulizer and "line of sight" spray drug delivery to chronic rhinosinusitis target sites
- creation of an idealized nasopharynx geometry for accurate computational fluid dynamics simulations of nasal airflow in patient-specific models lacking the nasopharynx anatomy
- influence of localized mesh refinement on numerical simulations of post-surgical sinonasal airflow
- detailed flow patterns in the nasal cavity
- numerical predictions of submicrometer aerosol deposition in the nasal cavity using a novel drift flux approach
- numerical simulations investigating the regional and overall deposition efficiency of the human nasal cavity
- the lifeshirt: an advanced system for ambulatory measurement of respiratory and cardiac function
- ansys fluent theory guide version 14
- nih image to imagej: 25 years of image analysis
- techniques of intranasal steroid use
- effect of formulation- and administration-related variables on deposition pattern of nasal spray pumps evaluated using a nasal cast
- fluticasone propionate nasal spray instructions
- comparative analysis of the main nasal cavity and the paranasal sinuses in chronic rhinosinusitis: an anatomic study of maximal medical therapy
- characterization of nasal spray pumps and deposition pattern in a replica of the human nasal airway
- the mechanics of inhaled pharmaceutical aerosols: an introduction
- assessment of the influence factors on nasal spray droplet velocity using phase-doppler anemometry
- external characteristics of unsteady spray atomization from a nasal spray device
- measurements of droplet size distribution and analysis of nasal spray atomization from different actuation pressure
- automated actuation of nasal spray products: determination and comparison of adult and pediatric settings
- on point vortex models of exotic bluff body wakes
- exploring the dynamics of '2p' wakes with reflective symmetry using point vortices
- on the motion of two point vortex pairs with glide-reflective symmetry in a periodic strip
- dynamics of vortices in complex wakes: modeling, analysis, and experiments
- a mathematical model of 2p and 2c vortex wakes
- on angled bounce-off impact of a drop impinging on a flowing soap film
- what is normal nasal airflow? a computational study of 22 healthy adults
- nasal sprayed particle deposition in a human nasal cavity under different inhalation conditions
- the effect of formulation variables and breathing patterns on the site of nasal deposition in an anatomically correct model
- evaluation of different parameters that affect droplet-size distribution from nasal sprays using the malvern spraytec
- absorption and clearance of pharmaceutical aerosols in the human nose: development of a cfd model
- simulation of the phase change and deposition of inhaled semi-volatile liquid droplets in the nasal passages of rats and humans
- intranasal vaccine for covid-19 under development: bharat biotech
- superior immune responses induced by intranasal immunization with recombinant adenovirus-based vaccine expressing full-length spike protein of middle east respiratory syndrome coronavirus
- rethinking the traditional vaccine delivery in response to coronaviruses
- mucosal immunization with surface-displayed severe acute respiratory syndrome coronavirus spike protein on lactobacillus casei induces neutralizing antibodies in mice
- influence of awareness of the recording of breathing on respiratory pattern in healthy humans
- effect of obesity on ozone-induced changes in airway function, inflammation, and reactivity in adult females
the authors declare no competing interests. correspondence and requests for materials should be addressed to s.b. publisher's note: springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. this article is licensed under a creative commons license, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the creative commons license, and indicate if changes were made. the images or other third party material in this article are included in the article's creative commons license, unless indicated otherwise in a credit line to the material.
if material is not included in the article's creative commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. to view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

key: cord-307133-bm9z8gss authors: kong, lingcai; wang, jinfeng; han, weiguo; cao, zhidong title: modeling heterogeneity in direct infectious disease transmission in a compartmental model date: 2016-02-24 journal: int j environ res public health doi: 10.3390/ijerph13030253 sha: doc_id: 307133 cord_uid: bm9z8gss
abstract: mathematical models have been used to understand the transmission dynamics of infectious diseases and to assess the impact of intervention strategies. traditional mathematical models usually assume homogeneous mixing in the population, which is rarely the case in reality. here, we construct a new transmission function by using a negative binomial distribution as the probability density function, and we develop a compartmental model using it to model the heterogeneity of contact rates in the population. we explore the transmission dynamics of the developed model using numerical simulations with different parameter settings, which characterize different levels of heterogeneity. the results show that when the reproductive number, r0, is larger than one, a low level of heterogeneity results in dynamics similar to those predicted by the homogeneous-mixing model. as the level of heterogeneity increases, the dynamics become increasingly different. as a test case, we calibrated the model with the case incidence data for severe acute respiratory syndrome (sars) in beijing in 2003, and the estimated parameters demonstrated the effectiveness of the control measures taken during that period.

mathematical models play an important role in understanding epidemic spread patterns and designing public health intervention measures [1-4].
the traditional deterministic compartmental models usually assume homogeneous mixing, which means that each individual has the same probability of contact with all of the others in the population [4]. however, there is a growing awareness that this assumption rarely holds in reality, because heterogeneity can arise from many sources [5], including age, sex, susceptibility to disease, position in space, and the activities and behaviors of individuals, among others [6]. here, we focus on the heterogeneity in host contact rates at the population level. in recent years, scientists have developed different approaches to model heterogeneity in host contact rates. first, traditional compartmental models were extended: the infection term of the homogeneous-mixing compartmental models was modified [7-9], and the compartments were further divided into multiple subgroups with similar behavioral characteristics (e.g., risk [10]) or demography (e.g., age [11,12]). second, along with the rapid development of research on complex networks, a large body of literature has examined the effects of heterogeneous contact structure on disease spread in networks [13,14]. the third type of modeling approach considering heterogeneity is agent-based modeling [15-17], which characterizes the heterogeneity in individual attributes and behaviors. additionally, several researchers have attempted to bridge the gap between traditional compartmental models and individual-based models [18-20]. in this paper, we develop a new compartmental model to incorporate heterogeneous contact rates in disease transmission. first, by combining a poisson distribution and a gamma distribution, we derive a negative binomial distribution (nbd) transmission function, with which we develop a compartmental model. then, we explore the influence of different levels of heterogeneity on the transmission dynamics of infectious diseases using numerical simulations.
finally, we calibrated the model with the number of daily cases of severe acute respiratory syndrome (sars) in beijing in 2003, and the estimated parameters show that the control measures taken at that time were effective. the heterogeneity in transmission can be modeled by assuming that the number of contacts among individuals varies from person to person. let x_i represent the number of effective contacts (the number of contacts that would be sufficient for transmitting the disease successfully, were it to occur between a susceptible individual and an infectious individual [21,22]) with infectious individuals of the i-th susceptible person per unit time. assume that x_i has a poisson distribution π(θ_i), where θ_i is the mean number of effective contacts that the i-th susceptible individual makes with infectious individuals per unit time. if the θ_i are identical, each individual has an equal chance of effective contact with infectious individuals and an equal chance of being infected, resulting in a traditional homogeneous-mixing model. in reality, however, individuals typically come into contact with only a small, clustered subpopulation [20]. therefore, it is reasonable to assume that different individuals have different average effective numbers of contacts in a certain period of time; that is, θ_i is itself a random variable. the gamma distribution is a good choice for describing θ_i for a variety of reasons: it is bounded on the left at zero (the numbers of contacts must be non-negative), is positively skewed (it has non-zero probability of an extremely high number of contacts) and can represent a variety of distribution shapes [23]. it has been used to describe the expected number of secondary cases caused by a particular infected individual [24].
therefore, we assume a gamma distribution for θ_i, with shape parameter k, rate parameter m (or scale parameter 1/m) and probability density function f(θ) = m^k θ^(k−1) e^(−mθ) / Γ(k). the conditional distribution of x_i given θ_i = θ is the poisson law p(x_i = x | θ) = θ^x e^(−θ) / x!. integrating over θ, we obtain the marginal distribution of x_i: p(x_i = x) = [Γ(k + x) / (Γ(k) x!)] (m/(1+m))^k (1/(1+m))^x. this is the probability density function of an nbd with mean k/m and variance k(1+m)/m². then, the probability of a susceptible individual escaping infection can be represented by the zero term of the nbd, p(x_i = 0) = (1 + 1/m)^(−k). let the mean of the nbd be equal to the mean number of effective contacts of all susceptible individuals with infectious individuals, that is, k/m = βi/n, where β denotes the transmission rate, defined as the per capita rate at which two specific individuals come into effective contact per unit time [22]; i denotes the number of infectious individuals; and n denotes the size of the total population. it follows that 1/m = βi/(kn), and p(x_i = 0) = (1 + βi/(kn))^(−k). consider a closed population (without births, deaths and migration into or out of the population). let s_t and i_t denote the numbers of susceptible and infectious individuals at time t, respectively. then the difference equation relating s_t at successive time steps t and t+1 is s_{t+1} = (1 − λ_t) s_t, where λ_t = 1 − (1 + βi_t/(kn))^(−k) is the risk of a susceptible individual becoming infected between time t and t+1. using the relationship between risk and rate derived in [22], risk = 1 − e^(−rate), we obtain the rate at which susceptible individuals become infected at time t: rate = k ln(1 + βi_t/(kn)). therefore, the rate of change in the number of susceptible individuals can be represented by the differential equation ds/dt = −k s ln(1 + βi/(kn)). we call k ln(1 + βi/(kn)) on the right side of this equation the nbd transmission function. a similar function, k ln(1 + a p_t/k), and its discrete form, (1 + a p_t/k)^(−k), were first used in host-parasitoid models, where a denotes the per capita searching efficiency of the parasitoid and p_t denotes the number of parasitoids [25,26].
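the poisson-gamma mixture at the heart of this derivation can be checked numerically: sampling θ from a gamma law and then x from poisson(θ) should reproduce the negative binomial mean, variance, and zero term stated above. the parameter values here are arbitrary, chosen only for the demonstration.

```python
import numpy as np

# if theta_i ~ gamma(shape=k, rate=m) and x_i | theta_i ~ poisson(theta_i),
# then x_i is negative binomial with mean k/m, variance k(1+m)/m^2, and
# escape (zero) probability (1 + 1/m)^(-k).
rng = np.random.default_rng(0)
k, m = 2.0, 0.5
theta = rng.gamma(shape=k, scale=1.0 / m, size=500_000)  # numpy's scale = 1/rate
x = rng.poisson(theta)

print(f"mean:   empirical {x.mean():.3f} vs analytic {k / m:.3f}")
print(f"var:    empirical {x.var():.3f} vs analytic {k * (1 + m) / m**2:.3f}")
print(f"p(x=0): empirical {(x == 0).mean():.4f} vs analytic {(1 + 1/m)**(-k):.4f}")
```

setting k/m = βi/n turns the zero term into the per-step escape probability used in the difference equation above.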
then, they were used in insect-pathogen models [27]. in [28], the author used the transmission function k ln(1 + βi/k) to model a possum-tuberculosis (tb) system. the influence of different transmission functions on a simulated pathogen spread was studied in [29]. because lim_{k→∞} k ln(1 + βi/(kn)) = βi/n, the nbd transmission function we derived here approximates the frequency-dependent transmission function of the homogeneous-mixing model when k → ∞. therefore, it can be regarded as a generalized frequency-dependent transmission function [1,4]. similarly, the nbd transmission function used in [28] can be regarded as a generalized density-dependent transmission function [1,4]. comparing the nbd transmission function with the density-dependent transmission function, βsi, and the frequency-dependent transmission function, βsi/n, of the homogeneous-mixing model [4,22], we obtain one more parameter, k, which is the shape parameter of the gamma distribution (equation (1)). denote the mean of the gamma distribution as µ_θ; then, the variance is µ_θ²/k. setting the mean to be a constant and letting k → ∞, the variance goes to zero, resulting in homogeneous mixing, just as shown in equation (2). in contrast, the variance increases as the value of k decreases, which indicates greater heterogeneity of the contact rates between the susceptible and infectious populations. therefore, the parameter k characterizes the level of heterogeneity. the standard susceptible-exposed-infectious-recovered (seir) model divides the total population into four compartments: susceptible (s, previously unexposed to the pathogen), exposed (e, infected but not yet infectious), infected (i, infected and infectious) and recovered (r, recovered from infection and acquired lifelong immunity) [1,4,22]. the infection process is represented in figure 1. children are born susceptible to the disease and enter compartment s.
a susceptible individual in compartment s is infected after effective contact with an infectious individual in compartment i and then enters the exposed compartment e. after the latent period ends, the individual enters compartment i and becomes capable of transmitting the infection. when the infectious period ends, the individual enters the recovered class r and will never be infected again [4,22]. in each compartment, individual death occurs at a constant rate, µ, which is equal to the birth rate. death induced by the disease is not considered here. therefore, the total population size in the model, n, remains unchanged. the seir model and its extensions have been used to model many infectious diseases, for example, measles [30-32], rubella [33,34], influenza [35,36] and sars [37,38], among others. using the nbd transmission function, we set up a new seir model in a closed population, represented by the following set of ordinary differential equations:
ds/dt = µn − k s ln(1 + βi/(kn)) − µs,
de/dt = k s ln(1 + βi/(kn)) − αe − µe,
di/dt = αe − γi − µi,
dr/dt = γi − µr,
where the parameter α is the rate at which individuals in the exposed category become infectious per unit time, and its reciprocal is the average latent period [4,22]; the parameter γ is the rate at which infectious individuals recover (become immune) per unit time, and its reciprocal is the average infectious period [4,22]; and the parameter µ refers to the birth and death rates. based on the next-generation matrix approach [39], we derive the basic reproductive number (see appendix a for further details), r0 = αβ/((α + µ)(γ + µ)), which is identical to that of the homogeneous-mixing model with a frequency-dependent transmission function [4]. it is worth noting that r0 does not involve k, which means that it does not depend on the level of heterogeneity. this can be explained by r0 being an average quantity, which means that it does not consider the individual variance in infectiousness [24].
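the seir system with the nbd transmission function integrates with any standard ode solver. the sketch below uses illustrative parameter values (not the paper's sars calibration) and, for simplicity, sets µ = 0, i.e., no births or deaths.

```python
import numpy as np
from scipy.integrate import solve_ivp

# illustrative parameters (assumptions, not calibrated values): 5-day latent
# period, 7-day infectious period, no demography, moderate heterogeneity.
beta, alpha, gamma, mu, k = 0.6, 1 / 5, 1 / 7, 0.0, 0.1
N = 1_000_000

def seir_nbd(t, y):
    S, E, I, R = y
    foi = k * np.log(1.0 + beta * I / (k * N))   # NBD force of infection
    return [mu * N - foi * S - mu * S,
            foi * S - (alpha + mu) * E,
            alpha * E - (gamma + mu) * I,
            gamma * I - mu * R]

y0 = [N - 10, 0, 10, 0]                          # seed with 10 infectious
sol = solve_ivp(seir_nbd, (0, 500), y0, max_step=1.0, rtol=1e-8, atol=1e-6)
S, E, I, R = sol.y
print(f"peak infectious: {I.max():,.0f} at day {sol.t[I.argmax()]:.0f}")
print(f"final recovered fraction: {R[-1] / N:.3f}")
```

with µ = 0 the right-hand sides sum to zero, so s + e + i + r stays at n throughout the integration, which is a convenient sanity check on the solver output.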
this result is in agreement with the conclusion drawn from a metapopulation version of the standard stochastic sir model incorporating spatial heterogeneity [40] . we now determine the equilibrium states. without much work, we can obtain the disease-free equilibrium (N, 0, 0, 0). we also derive the approximate size of the infectious compartment at the endemic equilibrium, I* ≈ µN(R0 − 1)/β. this is identical to that of the homogeneous-mixing model with a frequency-dependent transmission function [4] . similar to R0, it does not depend on k. in other words, the contact heterogeneity does not influence the endemic equilibrium, although it does change the dynamics, which we demonstrate using numerical simulations in the next section. using numerical simulations, we explore the influence of the heterogeneity level, characterized by the parameter k, on the transmission dynamics. the results show that the infectious curves with fixed β but different values of k achieve a peak after a period that is almost the same in duration (figure 2a) . however, the transmission speed and, therefore, the peak size, as well as the dynamics after the peak, are very different. a low level of heterogeneity results in dynamics similar to those predicted by the homogeneous-mixing model with a frequency-dependent transmission term, βSI/N. this is consistent with the conclusion inferred in equation (2). as the value of k decreases, that is, as the level of heterogeneity increases, the dynamics differ increasingly from those predicted by the homogeneous-mixing model. the greatest difference is that, at the overall level, the heterogeneity slows the transmission speed and decreases the peak sizes, which means milder disease outbreaks: in a scenario with a high level of heterogeneity, only a small proportion of susceptible individuals have chances of coming into contact with infectious individuals and becoming infected, which results in a slower increase of the infected population.
second, after the peak is attained, the infectious curves do not decline as rapidly as those predicted by the homogeneous-mixing model and by the nbd models (equation (4)) with larger values of k (figure 2a) , and the disease persists over the long term in the population (figure 2b). compared to the homogeneous-mixing model or the nbd models with larger values of k, up to the (almost identical) peak time there are many more individuals who are still susceptible to the disease. a proportion of them come into contact with infectious individuals and become infected, and this process persists for a long period of time. moreover, figure 2b shows that the endemic sizes of the two scenarios are approximately equal, just as noted in the previous section. in addition, when k drops to a very small value, there will be no disease outbreak, because almost none of the susceptible individuals have any chance of coming into contact with infectious individuals and becoming infected. it has been shown that real contact patterns exhibit more heterogeneity than that assumed by homogeneous-mixing models, but they do not appear extremely heterogeneous [6] . we also simulate the dynamics with a fixed value of k and different values of β. because the dynamics obtained with a large value of k are similar to those of the homogeneous-mixing model with a frequency-dependent transmission term, we only show the results for a relatively small value, k = 10^−4 (figure 3). for larger values of β, the infectious curves reach their peaks earlier, and the peaks are higher than those obtained for smaller values of β. after the peak of the disease outbreak is reached, the infectious curves decrease slowly and gradually reach the endemic equilibrium (figure 3b). additionally, for much smaller values of β, such that R0 < 1, there will be no disease outbreak (here, for example, β = 0.1).
the sars disease broke out at the beginning of march 2003 in beijing, spread rapidly over the next six weeks and peaked during the third and fourth weeks of april [41] . in total, 2048 confirmed cases were reported during the entire outbreak period (the circle markers shown in figure 4 ; the data were provided by the chinese center for disease control and prevention). prompted by the rapid expansion of the epidemic, on 17 april, the beijing municipal government established a joint sars leading group and deployed 10 task forces to oversee crisis management [41, 42] . on 20 april, a much larger number of cases was reported, and the chinese government canceled the may day holiday in an effort to reduce the mass movement of people [43] . multiple measures were taken to control the spread of the disease, including the provision of personal protective equipment and training for healthcare workers [41] ; the introduction of community-based prevention and control through case detection, isolation, quarantine and community mobilization [41] ; the closure of sites of public entertainment and schools [42] ; and stopping the entry of all visitors, or screening them for fever upon entry, at universities and other places [42] . additionally, a general increase in sars awareness played an important role in controlling the outbreak [42] . the multiple measures implemented in beijing likely led to the rapid resolution of the sars outbreak [42] . to evaluate the effectiveness of the control measures taken in beijing at that time, we calibrated the nbd model to the data of the sars daily cases using the globalsearch algorithm in the matlab global optimization toolbox [44, 45] and estimated the parameters. we used two different values, k1 and k2, to characterize the different levels of heterogeneity in contact in the population before and after 20 april [38] .
we assumed a fixed value of β for simplicity (in reality, the value of β decreased along with the control strategies [38] ; here we mainly discuss the influence of the other parameter, k). we chose the normalized root mean square error (nrmse) [46] as the measure of goodness of fit between the model output and the daily case data, as well as the objective function of the calibration procedure. in order to compute the nrmse, we solved the set of differential equations (equation (4)) with unknown parameters α, β, γ and k = k1 from 7 march to 20 april. the initial conditions were set as follows: S(0) = 1.4564 × 10^7, which was the size of the permanent population in beijing in 2003 [47] ; t = 0 corresponds to 7 march 2003; E(0) = 0; I(0) = 2, which was the number of daily cases on 7 march 2003; and R(0) = 0. then, the output of the model on 20 april was taken as the initial value to solve equation (4) with parameters α, β, γ and k = k2 from 20 april to 4 june. finally, the two outputs were combined and used to calculate the goodness of fit to the sars daily case data. the birth and death rate, µ, was assumed to be 1/70 year^−1. in total, there were five unknown parameters to be estimated: k1, k2, α, β and γ. the starting points of the parameters for the calibration procedure were selected randomly between the bounds of the parameters shown in table 1 . because of the stochasticity of the globalsearch algorithm [44, 45] , the results varied slightly every time. we ran the procedure 100 times. table 2 presents the minimum, maximum, mean and standard deviation of the results. the average latent and infectious periods are 1/α = 6.8661 days and 1/γ = 4.8439 days, respectively. the much smaller k2 value indicates that the control measures were extremely effective in controlling sars transmission in beijing in 2003. this is in agreement with the result in [38] . figure 4 shows the 100 fitted infectious curves and the daily cases.
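a minimal sketch of the goodness-of-fit measure. the exact normalization used in [46] is not reproduced in this extraction, so dividing the rmse by the observed range is an assumed convention, and the case counts below are hypothetical:

```python
import math

def nrmse(observed, predicted):
    """rmse normalized by the range of the observations; one common
    convention, assumed here (the exact form of [46] is not shown)."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return rmse / (max(observed) - min(observed))

# hypothetical daily case counts vs. model output
obs = [2, 10, 50, 120, 80, 30, 5]
pred = [3, 12, 45, 110, 85, 28, 6]
print(round(nrmse(obs, pred), 4))
```

in a two-phase calibration like the one above, the nrmse would be computed over the combined 7 march-4 june output against the full daily case series.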
in this paper, we aimed to study the influence of heterogeneity in contact rates on disease transmission at the population level. the developed nbd model can be regarded as a generalized homogeneous-mixing model with a frequency-dependent transmission function. our results show that, keeping other conditions identical, the higher the level of heterogeneity in contact rates, the greater the difference between the observed disease dynamics and those predicted by homogeneous-mixing models. it is worthwhile to compare our approach and results with previous ones. to address heterogeneous mixing within populations, earlier work divided populations into multiple subgroups [10] [11] [12] and used the waifw matrix ("who acquires infection from whom" [1] ), in which any individual is more likely to come into contact with individuals from within the same subgroup than with those outside. however, in this framework, contact rates within the subgroups are still homogeneous. a different class of approaches for extending traditional compartmental models to incorporate heterogeneity involves modifying the transmission term; our approach belongs to this class. the work in [7, 8, 19] replaced the bilinear transmission term (SI) in the homogeneous compartmental model with a nonlinear term kS^p I^q, where k, p, q are the "heterogeneity parameters". their results showed that the modified model was capable of predicting the disease transmission patterns in a clustered network [19] . stroud et al. used a power-law scaling of the new infection rate, I(S/N)^v, with scaling power v greater than one, to relax the homogeneous-mixing assumption [9] , and demonstrated that this power-law formulation leads to significantly lower predictions of the final epidemic size than the traditional linear formulation.
compared to these empirical or semi-empirical modifications [7] [8] [9] [19] , the nbd transmission function seems to agree more closely with the real transmission mechanism, in that it assumes that the mean number of effective contacts of susceptible individuals with infectious individuals per unit time differs from individual to individual, and the choice of the gamma distribution offers multiple advantages (see section 2.1). in recent years, several network-based models have been developed to study the influence of contact heterogeneity on disease transmission. keeling et al. reviewed multiple types of networks and the statistical and analytical approaches for the spread of infectious diseases [13, 14] . in particular, bansal et al. demonstrated that highly heterogeneous degree distributions generate an almost immediate expansion phase compared to homogeneous degree distributions, such as the poisson distribution [6, 49, 50] . the nbd-seir model does not exhibit this feature. we suspect that this is because our approach belongs to the mean-field class of approaches and considers a large population at the overall level. in addition, it is possible to approximate the main features of disease spread in networks with compartmental models using an appropriate construction. the work in [20] used R0 as a fundamental parameter to formulate a mean-field type model, which can implicitly capture some important effects of heterogeneous mixing in contact networks. the work in [51, 52] applied "edge-based compartmental modeling" (ebcm), which focuses on the status of a random partner rather than a random individual, to capture heterogeneous contact rates in disease transmission. although it incorporates heterogeneous contact rates in a tractable manner, the nbd model has some weaknesses.
first, the parameter k, which characterizes the level of heterogeneity, is difficult to measure directly; this limitation can be overcome by using contact tracing data. second, some features of network models cannot be recovered by the nbd model. in future research, it will be interesting to incorporate other factors that influence transmission dynamics, such as the migration of populations, seasonality and vaccinations, among others. using the probability density function of the negative binomial distribution, we constructed an nbd transmission function and further developed a compartmental model for directly transmitted infectious diseases. the developed model considers the heterogeneity of contact rates in the population. the simulation results show that, at the population level, the dynamics vary widely according to the level of heterogeneity in contact rates. once R0 > 1, a low level of heterogeneity results in dynamics similar to those predicted by homogeneous-mixing models. keeping other conditions identical, as the level of heterogeneity increases, the transmission speed becomes slower and slower, and the peak size becomes smaller and smaller. these results have implications for developing interventions, such as isolation and targeted vaccination, among others. using the next-generation operator approach [39] , we compute the basic reproductive number R0. first, we sort the compartments so that the first m compartments correspond to infected individuals: x = (E, I, S, R). here, the infected compartments are E and I, yielding m = 2. then, we decompose the components of the differential equations into F, in which F_i is the rate of appearance of new infections in compartment i, and V, in which V_i is the rate of transfer of individuals into and out of compartment i by all other means. the disease-free equilibrium (dfe) for this model is x0 = (0, 0, N, 0). linearizing F and V at the dfe then gives FV^−1, which is called the next-generation matrix for the model [39] . finally, the basic reproductive number, R0, is calculated as the spectral radius of FV^−1. because the total population size N is constant and R = N − S − E − I, the last equation in equation (4) is redundant.
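written out explicitly, the standard next-generation computation for this model is as follows (a reconstruction consistent with the appendix: at the dfe, S = N, so the new-infection term linearizes as k ln(1 + βI/(kN))S ≈ βI):

```latex
F = \begin{pmatrix} 0 & \beta \\ 0 & 0 \end{pmatrix}, \qquad
V = \begin{pmatrix} \alpha + \mu & 0 \\ -\alpha & \gamma + \mu \end{pmatrix},
\qquad
FV^{-1} = \frac{1}{(\alpha+\mu)(\gamma+\mu)}
\begin{pmatrix} \alpha\beta & \beta(\alpha+\mu) \\ 0 & 0 \end{pmatrix},
```

```latex
R_0 = \rho\!\left(FV^{-1}\right) = \frac{\alpha\beta}{(\alpha+\mu)(\gamma+\mu)},
```

which, as stated in the text, contains no k and therefore coincides with the frequency-dependent homogeneous-mixing result.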
to find the endemic equilibrium, we set the right-hand sides of the other three equations to zero. then, S and E can be represented in terms of I:

S = µN / (µ + k ln(1 + βI/(kN))), E = ((γ + µ)/α) I.

substituting these into k ln(1 + βI/(kN)) S − (α + µ)E = 0 and performing some algebraic manipulation, we obtain an implicit equation in I; obviously, it is difficult, and perhaps impossible, to find an explicit solution. we find an approximate solution using the first-degree taylor polynomial of ln(1 + x) near x = 0, that is, ln(1 + x) ≈ x. it follows that we obtain the approximate solution for I:

I* ≈ µN(R0 − 1)/β,

where R0 is given in equation (5).

references

infectious diseases of humans: dynamics and control
the mathematics of infectious diseases
mathematical models of infectious disease transmission
modeling infectious diseases in humans and animals
models of infectious diseases in spatially heterogeneous environments
when individual behaviour matters: homogeneous and network models in epidemiology
dynamical behavior of epidemiological models with nonlinear incidence rates
nonlinear transmission rates and the dynamics of infectious disease
semi-empirical power-law scaling of new infection rate to model epidemic dynamics with inhomogeneous mixing
the transmission dynamics of human immunodeficiency virus (hiv)
predicting the impact of measles vaccination in england and wales: model validation and analysis of policy options
an age-structured model of pre- and post-vaccination measles transmission
networks and epidemic models
networks and the epidemiology of infectious disease
an agent-based model to study the epidemiological and evolutionary dynamics of influenza viruses
modeling and simulation for the spread of h1n1 influenza in school using artificial societies
an agent-based spatially explicit epidemiological model in mason
the implications of network structure for epidemic dynamics
on representing network heterogeneities in the incidence rate of simple epidemic models
building epidemiological models from r-0: an implicit treatment of transmission in networks
an examination of the reed-frost theory of epidemics
an introduction to infectious disease modelling
statistical distributions in engineering
superspreading and the effect of individual variation on disease emergence
host-parasitoid systems in patchy environments: a phenomenological model
discrete and continuous insect populations in tropical environments
the dynamics of insect-pathogen interactions in stage-structured populations
non-linear transmission and simple models for bovine tuberculosis
analysis of a measles epidemic
one-dimensional measles dynamics
modelling vaccination programmes against measles in taiwan
simulations of rubella vaccination strategies in china
a simple analysis of vaccination strategies for rubella
assessing the impact of airline travel on the geographic spread of pandemic influenza
modelling control measures to reduce the impact of pandemic influenza among schoolchildren
transmission dynamics and control of severe acute respiratory syndrome
spatial dynamics of an epidemic of severe acute respiratory syndrome in an urban area
reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission
spatial heterogeneity and the persistence of infectious diseases
evaluation of control measures implemented in the severe acute respiratory syndrome outbreak in beijing
chinese national security: decisionmaking under stress; strategic studies institute of the us army war college (ssi)
global optimization toolbox
scatter search and local nlp solvers: a multistart framework for global optimization
goodness of fit between test and reference data
frequently asked questions about sars
dynamical patterns of epidemic outbreaks in complex heterogeneous networks
sir dynamics in random networks with heterogeneous connectivity
edge-based compartmental modelling for infectious disease spread
incorporating disease and population structure into models of sir disease in contact networks

the author for correspondence,
jinfeng wang, designed the whole study, and lingcai kong implemented the method and drafted the manuscript. weiguo han and zhidong cao revised the manuscript critically and made constructive suggestions for the interpretation of the results. there was no conflict of interest regarding the submission of this manuscript, and it was approved by all authors for publication.

key: cord-268298-25brblfq authors: mao, liang title: modeling triple-diffusions of infectious diseases, information, and preventive behaviors through a metropolitan social network—an agent-based simulation date: 2014-03-04 journal: appl geogr doi: 10.1016/j.apgeog.2014.02.005 sha: doc_id: 268298 cord_uid: 25brblfq

a typical epidemic often involves the transmission of a disease, the flow of information regarding the disease, and the spread of human preventive behaviors against the disease. these three processes diffuse simultaneously through human social networks, and interact with one another, forming negative and positive feedback loops in the complex human-disease system. few studies, however, have been devoted to coupling all three diffusions together and representing their interactions. to fill this knowledge gap, this article proposes a spatially explicit agent-based model to simulate a triple-diffusion process in a metropolitan area of 1 million people. an individual-based approach, a network model, behavioral theories, and stochastic processes are used to formulate the three diffusions and integrate them together. compared to the observed facts, the model results reasonably replicate the trends of influenza spread and information propagation. the model thus could be a valid and effective tool to evaluate information/behavior-based intervention strategies.
besides its implications for public health, the research findings also contribute to network modeling, systems science, and medical geography. recent outbreaks of infectious diseases, such as the h1n1 flu, bird flu, and severe acute respiratory syndrome (sars), have brought images of empty streets and people wearing face masks to television screens and web pages, as fear of unknown diseases swept around the globe (funk, salathé, & jansen, 2010) . these images depict three basic components of epidemics, namely infectious diseases, information about diseases, and human preventive behaviors against diseases. from the perspective of diffusion theory, each of the three components can be viewed as a process spreading throughout a population. the disease is transmitted through person-to-person contact, the information is circulated by communication channels, and the preventive behavior spreads via 'social contagion' processes, such as observational learning. the interactions among these three diffusion processes shape the scale and dynamics of epidemics (funk & jansen, 2013; lau et al., 2005; mao & yang, 2011) . mathematical and computational models have been extensively used by health policy makers to predict and control disease epidemics. the majority of existing models have focused on the diffusion of diseases alone, assuming a 'passive' population that would not respond to the disease (bian et al., 2012; eubank et al., 2004; longini, halloran, nizam, & yang, 2004) . this is rarely the case, because it is natural for people to protect themselves once they realize disease risks (eames, tilston, brooks-pollock, & edmunds, 2012; ferguson, 2007) .
to improve on this, there has been much recent interest in modeling two diffusion processes in an epidemic, either a behavior-disease diffusion (house, 2011; mao & bian, 2011; vardavas, breban, & blower, 2007) , or an information (awareness)-disease diffusion (funk, gilad, watkins, & jansen, 2009; kiss, cassell, recker, & simon, 2010) . these 'dual-diffusion' models have made remarkable progress toward reality, but none of them consider all three diffusion processes together; the third diffusion process has often been neglected or simplified. in the current literature, few modeling efforts have been devoted to explicitly representing all three components, their spreading processes, and their interactions. the lack of such models prevents researchers from unveiling the full picture of an epidemic, and inevitably introduces biases into the deep understanding of human-disease systems. for epidemiologists, it is difficult to explore how one diffusion process influences the other two, and what key factors govern the three diffusion processes. without a complete model, health policy makers would not be able to systematically evaluate social-network interventions for disease control, such as mass-media campaigns and behavior promotion strategies. in the age of information, the fusion of disease-behavior-information in epidemic modeling becomes a pressing task in public health. to fill this knowledge deficit, this research proposes a conceptual framework to integrate the three diffusion processes, and develops a triple-diffusion model in a realistic urban area. the following sections discuss the conceptualization, formulation, and implementation of the model, and evaluate the simulation results. the proposed model conceptualizes a typical epidemic as one network structure, three parallel diffusion processes, and three external factors (fig. 1) . first, the contacts among individuals form a network structure as a basis for diffusion and interaction.
second, infectious diseases are transmitted through direct contacts among individuals (the middle layer). disease control strategies, such as vaccination programs, case treatment and isolation, pose external effects on the disease diffusion. third, the diffusion of diseases prompts "word-of-mouth" discussion among individuals, which disseminates information concerning the disease and its prevention (the upper layer). the outbreak of diseases may also stimulate various mass media, such as tv, newspapers, and radio, to propagate relevant information, thus accelerating the diffusion of information. fourth, people who have been informed start to consider and make a decision toward the adoption of preventive behaviors. the adoptive behavior of individuals also influences their network neighbors to adopt, widely known as the "social contagion" effect (the lower layer). the diffusion of preventive behaviors, in turn, limits the dispersion of diseases and speeds the diffusion of information. behavioral interventions, as an external factor, can be implemented by health agencies to promote preventive behaviors, such as educational, incentive and role-model strategies. during an epidemic, these three diffusion processes interact with one another and form negative/positive feedback loops in the human-disease system, shown as arrows between layers in fig. 1 . manipulated by the three external factors, these three diffusion processes, hereinafter named the triple-diffusion process, determine the spatial and temporal dynamics of an epidemic. the conceptual model is formulated by an agent-based approach, which has gained momentum in epidemic modeling during the last decade (huang, sun, hsieh, & lin, 2004) . different from classic population-based models, the agent-based approach treats each individual in a population as a basic modeling unit, associated with a number of attributes and events that change those attributes.
to represent the contact network, individuals are modeled as nodes and are linked to one another through their daily contacts (as network ties). the individualized contacts are assumed to take place during three time periods of a day at four types of locations (mao & bian, 2010) , namely the daytime at workplaces, the nighttime at homes, and the pastime at service places or neighbor households (fig. 2) . individuals travel between the three time periods and the four types of locations to carry out their daily activities, thus having contact with different groups of individuals and exposing themselves to disease infection. these contacts link all individuals into a population-wide network. two types of individual contacts are modeled in terms of contact duration and closeness. one type is the close contact (solid-line ovals in fig. 2 ) that happens at homes (with family members), workplaces (with co-workers), and neighbor households (with friends). this type of contact lasts for sufficient time to enable disease transmission. the other type is the occasional contact (dash-line ovals in fig. 2 ) that only happens at service places (with clerks and other consumers). in this case, an individual encounters only a limited number of individuals for a short time period, and thus the contact is less effective for infection. the diffusion of infectious diseases is formulated following the concept of the classic susceptible-infectious-recovered (sir) model (anderson & may, 1992) . each individual possesses a series of infection states and events, as shown in fig. 3 (the red dash-line box, in the web version). the progress of an infectious disease starts with a "susceptible" individual, who may receive infectious agents when having contact with an infectious neighbor in the network. the receipt event triggers a "latent" state, during which the disease agents develop internally in the body and are not emitted.
the end of the latent period initiates an "infectious" state, in which this individual is able to infect other susceptible neighbors and sustain the cascade of infection in the network. during the infectious period, the individual may manifest disease symptoms ("symptomatic") or not ("asymptomatic"). in either state, this individual remains infectious, but would be unaware of the infection if asymptomatic. after the infectious period, this individual recovers and is assumed to be immune to infection for the rest of this epidemic. two disease events connect the disease diffusion with the other diffusions. first, the event of symptom manifestation will motivate individuals to discuss disease information, and prompt their social contacts to adopt preventive behavior by posing infection risks. the second event is the receipt of disease agents, which is affected by the diffusion of preventive behavior. specifically, the adoption of preventive behavior reduces the probability of disease transmission p, as specified in equation (1):

p = e_contact × i_age × (1 − e_prevention),

where e_contact, i_age and e_prevention are three model parameters varying in [0, 1]. e_contact indicates the effectiveness of a contact to transmit disease, dependent on the physical closeness of the contact; its value can be calibrated based on the observed characteristics of a disease, such as the basic reproductive number r0. i_age is an age-specific infection rate, specifying the likelihood of receiving disease agents by age group, such as children, adults and seniors. e_prevention indicates the efficacy of a preventive behavior in reducing infection. the parameterization is discussed later in the model implementation (the 'simulating the diffusion of disease' and 'simulating the diffusion of preventive behavior' sections), when a specific disease and a specific preventive behavior are selected. regarding the diffusion of information (blue dash-line box in fig.
3, in the web version), individuals are initially "unaware" of the disease, but can be "informed" through two channels: word-of-mouth discussion and the mass media. the former circulates the information locally through the contact network, while the latter disseminates the information globally in the population; both are modeled as probabilistic events. first, an informed or symptomatic individual will discuss the disease with each network neighbor at a rate g_discussion, as formulated in equation (2), in which t is the current time step and t0 is the starting time of being informed or manifesting symptoms. the discussion rate g_discussion decays nonlinearly as time proceeds, i.e., an individual is more likely to talk about the disease within a few days after being informed or feeling sick. turning to the mass media, the probability of an individual being informed, g_mass, is formulated as a function of the total symptomatic case number n_s(t) at time step t (equation (3)): the more individuals get sick by time t, the higher the intensity of mass-media propagation, and thus the greater the chance of an individual being informed. the constant b is a scaling parameter that controls the intensity of mass-media propagation, and a small b results in a large g_mass. a mass-media campaign can then be modeled by varying b, the timing of the campaign (when to start), and the frequency of the campaign (the time interval between two broadcastings). once informed by either the discussion or the mass media, individuals become decision makers toward the adoption of preventive behavior. in such a manner, the diffusion of information is coupled with the diffusion of disease and that of preventive behavior. individuals being informed start to evaluate and make a decision toward the adoption of preventive behavior (green dash-line box in fig. 3, in the web version). the decision depends on individuals' own characteristics and inter-personal influence from their social networks.
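the two information channels can be sketched with assumed functional forms. since equations (2) and (3) are not reproduced in this extraction, both the exponential decay and the saturating form below are assumptions that match only the qualitative description (the discussion rate decays nonlinearly after t0; more symptomatic cases and a smaller b give a larger g_mass, bounded in [0, 1]):

```python
import math

def g_discussion(t, t0, g0=0.001, decay=0.5):
    """assumed exponential-decay form for equation (2); the text only states
    that the discussion rate decays nonlinearly after t0."""
    return g0 * math.exp(-decay * (t - t0))

def g_mass(n_sympt, b=5000.0):
    """assumed saturating form for equation (3): increases with the symptomatic
    count n_s(t), is larger for smaller b, and stays within [0, 1]."""
    return n_sympt / (n_sympt + b)

print(g_discussion(0, 0), g_discussion(10, 0))          # decays with time
print(g_mass(0), g_mass(985), g_mass(985, b=1000.0))    # grows with cases, shrinks with b
```

a monte-carlo draw against these two probabilities at each time step would then decide whether an unaware individual becomes informed.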
this research uses a threshold behavioral model (granovetter & soong, 1983) to formulate the decision process. specifically, each individual has two adoption states (non-adopter and adopter), and the change of state is calculated based on equation (4). for a given time step t, individual i evaluates the proportion of adopters in i's personal network, taken as the peer pressure of adoption a_i(t). once the peer pressure reaches a threshold t_p,i (called the threshold of adoption pressure), the individual decides to adopt. meanwhile, individual i also evaluates the proportion of symptomatic individuals in the personal network, taken as the perceived risk of infection m_i(t). if the perceived risk exceeds another threshold t_r,i (termed the threshold of infection risk), the individual will also adopt. the individualized thresholds (t_p,i and t_r,i) reflect personal characteristics of individuals, while the events of evaluation represent the inter-personal influence between individuals. in such a way, the diffusion of disease elevates the perceived risks of individuals, and stimulates them to adopt preventive behavior. in turn, the adoption of preventive behavior impedes the diffusion of disease, forming a negative feedback loop in the human-disease system. the proposed triple-diffusion model is implemented in the greater buffalo metropolitan area, ny, usa, with a population of 985,001 (according to census 2000). each individual is programmed as a software agent with attributes and events (table 1) . besides a unique identifier, each individual has 6 groups of attributes, including the network, demographic, spatiotemporal, infection, adoption, and information attributes. the events change the values of the corresponding attributes. the social network is realized by a previously developed algorithm that assigns values to the demographic and network attributes of individuals (mao & bian, 2010) .
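the threshold rule can be sketched directly from the prose description of equation (4); the function name and the toy personal network are illustrative:

```python
def will_adopt(adopter_frac, symptomatic_frac, t_pressure, t_risk):
    """threshold rule described for equation (4): individual i adopts when the
    peer pressure a_i(t) reaches t_p,i OR the perceived risk m_i(t) reaches t_r,i."""
    return adopter_frac >= t_pressure or symptomatic_frac >= t_risk

# toy personal network of 10 neighbors: 3 adopters, 1 symptomatic (hypothetical)
a_i, m_i = 3 / 10, 1 / 10
print(will_adopt(a_i, m_i, t_pressure=0.25, t_risk=0.2))  # peer pressure suffices
print(will_adopt(a_i, m_i, t_pressure=0.5, t_risk=0.2))   # neither threshold met
```

the "or" between the two conditions is what lets either social pressure or fear of infection alone trigger adoption, which in turn produces the negative feedback loop described above.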
the value assignment involves a large amount of geo-referenced data, including census data, business location data, land parcel data, the transportation network, and the results of a household travel survey. statistical distributions derived from these datasets, such as distributions of family size, workplace size, and household daily trips, are used to ensure the validity of the value assignments. to differentiate weekdays and weekends, individuals are not assigned to work (or school) at weekends, except those who work in service-oriented businesses (such as restaurants and grocery stores). those who do not work during the weekends have additional trips to service-oriented businesses. the completed assignments form three linked populations: a nighttime population at homes, a daytime population at workplaces, and a pastime population at service places or neighbor households. the three populations represent the same set of individuals, but at different locations and time periods of a day. individuals have contact with a number of other individuals at the same time period and location, forming a spatio-temporally varying network. the simulated network has an average of 16.9 daily contacts per person, consistent with the observed number (16.8) from empirical studies (beutels, shkedy, aerts, & van damme, 2006; edmunds, 1997; fu, 2005). seasonal influenza is selected as an example because its natural history is well understood. a number of influenza parameters are either adopted or calibrated from the existing literature, as shown in table 2. the product of i_age and e_contact determines the transmission probability through one contact (equation (1)), which is used to simulate individuals' transition from the susceptible to the latent state as a stochastic branching process. the latent, incubation, and infectious periods control the sequential transitions from the latent to the infectious, symptomatic, and recovered states.
two groups of parameters are set to simulate the word-of-mouth discussion and the mass-media effects, respectively. the parameter values are calibrated in the model evaluation later but are reported here. for the word-of-mouth discussion, the initial discussion rate g_discussion(0) in equation (2) is set to 0.001, and g_discussion(t) is then updated as time goes by. for the mass media, the scaling parameter b in equation (3) is specified as 5000, based on which the probability of being informed by the mass media, g_mass, can be computed at every time step. the mass-media campaign is assumed to be triggered when the total number of symptomatic individuals exceeds 1‰ of the total population, and broadcasting follows a weekly frequency. with these two probabilities, a monte-carlo simulation is used to determine whether an unaware individual will be informed at each time step. the use of flu prophylaxis (e.g., oseltamivir) is taken as a typical example of preventive behavior, because its efficacy is more conclusive than that of other preventive behaviors, such as hand washing and facemask wearing. three parameters are specified to simulate the behavioral diffusion and couple it with the diffusion of influenza. first, the model assumes that symptomatic individuals have a 75% likelihood of adopting flu prophylaxis to mitigate their symptoms and reduce their infectivity (mcisaac, levine, & goel, 1998). second, the preventive efficacy of flu prophylaxis (e_prevention in equation (1)) is set to 70% and 40% for susceptible and infectious individuals, respectively, indicating that their likelihood of being infected or infecting others is reduced by that amount (hayden, 2001; longini et al., 2004). third, the two adoptive thresholds of individuals, t_p,i and t_r,i (in equation (4)), are generated from their statistical distributions using a monte-carlo method.
those statistical distributions were derived from a health behavioral survey, whose details are provided in the supplementary document. each individual is assigned random numbers from those statistical distributions as their adoptive thresholds. at each time step, the model computes the peer pressure and perceived risk for every informed individual, and updates his/her adoption state using the threshold model. the triple-diffusion model is simulated over 150 days, covering a general flu season (december to may). at the beginning of the simulation, all individuals are assigned susceptible and unaware states. to account for background immunity before the epidemic, the model randomly selects 62.7% of seniors, 15.6% of adults, and 17.9% of children, according to national immunization coverage (euler et al., 2005; molinari et al., 2007), and directly moves them to the adopted and recovered states. all unselected individuals are set as non-adopters. to initialize the disease diffusion, five infectious individuals are randomly introduced into the study area on the first day. the simulation takes a tri-daily time step and runs the three diffusion processes concurrently in each time step. to stabilize the final outcomes, the model is run for 50 realizations. in each realization, the background immunity, the first five infectious individuals, their contacts, and the infection, awareness, and adoption of these contacts are randomized. the final outcomes are three diffusion curves, namely the epidemic curve (the weekly number of new cases), the adoption curve (the weekly number of new adopters), and the awareness curve (the weekly number of newly informed), all averaged over the 50 model realizations. two independent data sources are used to evaluate the model results and calibrate model parameters. one is the weekly reports of laboratory-confirmed specimens in 2004–2005 in buffalo, ny, issued by the new york state department of health (nysdoh, 2005).
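the per-realization initialization described above (background immunity by age group, then five random infectious seeds) can be sketched as below; the dict-based agent schema and the rng plumbing are assumptions, while the coverage fractions come from the text.

```python
import random

# national immunization coverage fractions quoted in the text
AGE_IMMUNITY = {"senior": 0.627, "adult": 0.156, "child": 0.179}

def initialize(population, n_seeds=5, rng=None):
    """One realization's initial conditions: everyone starts susceptible
    and unaware; a coverage-dependent share per age group is moved directly
    to the adopted and recovered states; finally a handful of infectious
    seeds are drawn at random from the remaining susceptibles."""
    rng = rng or random.Random()
    for person in population:
        person["state"] = "susceptible"
        person["aware"] = False
        person["adopted"] = False
        if rng.random() < AGE_IMMUNITY[person["age_group"]]:
            person["state"] = "recovered"
            person["adopted"] = True
    susceptibles = [p for p in population if p["state"] == "susceptible"]
    for person in rng.sample(susceptibles, min(n_seeds, len(susceptibles))):
        person["state"] = "infectious"
    return population
```

repeating this initialization with fresh random draws and averaging the resulting weekly curves corresponds to the 50-realization averaging described in the text.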
the simulated epidemic curve is compared to the weekly reported data to show the validity of modeling disease diffusion. the other data source is the weekly statistics from google flu trends for the study area (google, 2011), which summarizes the number of online flu-related inquiries as a tool for monitoring influenza outbreaks (ginsberg et al., 2008). relevant to this research, the weekly flu trend data could be a reliable representation of the real diffusion of influenza-related information (fenichel, kuminoff, & chowell, 2013), and is compared to the simulated awareness curve from the model. model result evaluation: fig. 4 displays the simulated weekly number of newly infected individuals, compared to the actual number of weekly lab-confirmed cases during the 2004–2005 influenza season. the shape and peak time of the predicted curve correspond well with those of the reported epidemic, although the magnitude of simulated cases is much larger than in the reported data. the first possible reason is that many sick people may choose self-care instead of seeing a doctor, and thus are not reported. second, for those who do seek healthcare, only a small portion of their specimens are submitted for laboratory testing. the number of influenza cases is therefore often highly under-reported, and complete data are rather difficult to collect. the laboratory data are, so far, the best available touchstone for model validation. in this sense, the model performs well in predicting the trend, and at least allows the estimation of a worst-case result. fig. 5 compares the simulated weekly number of newly informed people to the excess weekly google flu search statistics that indicate the amount of online searching behavior relevant to the influenza epidemic during the 2004–2005 flu season.
the excess weekly search statistic (the left y axis) is the difference between the observed statistic and its long-term average (1296), which removes influenza searches that occur on a normal day and are not caused by the epidemic. the temporal course of the information diffusion is well predicted. since the two measurements are not in the same unit (primary and secondary axes in fig. 5), their magnitudes are not comparable, but they are highly correlated. to my knowledge, both comparisons show a level of consistency that has rarely been achieved by other epidemic models (ferguson et al., 2006; funk et al., 2009; kiss et al., 2010; vardavas et al., 2007). a majority of previous models cannot validate themselves against observed facts, particularly for the information diffusion. the triple-diffusion model, thus, could provide a reliable foundation to devise much-needed control and intervention strategies for infectious diseases, such as behavior promotion strategies and mass-media campaigns. fig. 6 shows how the diffusion of influenza motivates people to adopt preventive behavior. the two diffusion processes take a similar shape, but there is a time lag of about one week between their peak times. as the number of influenza cases rises, individuals perceive increasing risks, which motivate them to adopt preventive behavior. their adoptive behavior further influences surrounding individuals to adopt. the time lag between the epidemic and adoption curves is possibly the time individuals need to be informed and take preventive actions. fig. 6 also suggests that monitoring real-time flu prophylaxis sales could detect the epidemic peak about one week ahead of traditional disease surveillance networks, such as the cdc sentinel network, which take up to two weeks to collect, process, and report disease cases registered at health centers.
the diffusions of influenza and its related information also take a similar bell shape, but the information peaked approximately one week earlier than the disease (fig. 7). a possible reason is the wide coverage of modern mass media over the population, enabling a faster spread of information than of the disease. at the beginning of the epidemic, only a few influenza cases had occurred and a vast majority of people were unaware of the disease. as the diffusion of influenza took off (january 2nd–9th), individuals started to notice the disease problem and discuss it with each other. when the epidemic became more sensational and drew attention from the mass media (january 10th–16th), the awareness curve climbed steeply and reached its peak in four weeks. fig. 7 implies that overseeing the diffusion of disease information, such as the google flu trend, could warn the public one to two weeks before the epidemic peak actually occurs. in addition to forecasting the timeline of the three diffusion processes, this spatially explicit model allows the prediction of geographic distributions over time. fig. 8 maps the spatial distributions of simulated infection (square), adoption (circle), and awareness (triangle). for clarity, only the downtown area is presented. the spatial distributions on days 50, 75, and 100 are displayed in order to present different stages of the epidemic: on day 50 the epidemic is on the rise, day 75 is around the peak time, and on day 100 the epidemic is in decline. still in its infancy, the triple-diffusion model has several limitations in its design and implementation. first, the contact network used in the model could be further refined into three different but partially overlapping networks, namely an infection network, an information network, and an influence network, each channeling a diffusion process.
admittedly, the model would be more realistic, but building these networks requires extensive social surveys to collect relevant data, which is often costly and time consuming. recent work on extracting social networks from facebook and twitter may be a promising method to address this issue (lewis, kaufman, gonzalez, wimmer, & christakis, 2008). second, the effects of mass media are formulated as a simple formula, but could be more complicated if various types of media and their corresponding coverage were further considered. a more sophisticated function, such as an exponential decay or a power-law decay function, might depict human communication better. lifestyle data of individuals may help identify their preferred media and delineate the media coverage. third, the discussion rate is assumed homogeneous over the study area. this rate may vary between age groups, occupations, and personalities. a health behavior survey may be needed to estimate discussion rates for different groups of individuals. fourth, there would be a certain amount of uncertainty if the model were used to predict future influenza outbreaks. if new census data and travel survey data were filled in, this model could reasonably predict the timeline and scale of a future outbreak of seasonal influenza. however, for a pandemic influenza, such as the new h1n1, most of the model parameters would have to be adjusted to account for the highly infectious virus, the faster circulation of information, and the possibly distinct responses of individuals toward preventive behaviors. all these limitations warrant future study. after all, the goal of modeling is not to predict exactly what happens during an epidemic, but rather to observe how the epidemic may proceed and to encourage appropriate questions. in this sense, the model results provide valuable knowledge regarding city-wide epidemics.
this article presents an original triple-diffusion model for epidemiology, and discusses its conceptual framework and design. the conceptual framework integrates three interactive processes upon a human social network: the diffusion of influenza, the diffusion of information, and that of preventive behavior. the agent-based approach, a network model, and theories from epidemiology, information science, and behavioral science are used to formulate the conceptual framework into a working model. for illustration purposes, the model is implemented in an urbanized area with a large population. compared to the reported data, the proposed model reasonably replicates the observed trends of influenza infection and online query frequency. the model, thus, could be a valid and effective tool for exploring various control policies. this research makes two key contributions to the body of literature on network diffusion theory, public health, and agent-based modeling. first, the proposed triple-diffusion framework is a significant advancement over previous disease-only models and dual-diffusion models, such as bian and mao's work. the fusion of disease, information, and human behavior allows a more comprehensive 3d cubic view of the human-disease system, which could only be studied from a 2d planar perspective before. the increase of one dimension exposes many more details of an epidemic and thus enables a deeper understanding of this complex system, for example, the interactive mechanisms among the three diffusion processes. the proposed modeling framework can flexibly accommodate mobile phone tracking data and the latest census data to improve the accuracy of the modeled daytime and nighttime populations. online social networking data (from facebook and twitter) can also be included to modify the modeled communication between individuals, as well as the personal influence between them.
second, this model can be further developed into a virtual platform for health decision makers to test disease control policies in many other metropolitan areas. particularly, since the model explicitly represents the diffusion of information and human preventive behavior, it permits a systematic evaluation of disease control policies that have not been well studied before, such as mass-media campaigns and behavioral incentive strategies. the evaluation results will enrich the family of disease control policies, and help public health overcome the socio-economic challenges posed by potential influenza outbreaks.

references:
infectious diseases of humans: dynamics and control
social mixing patterns for transmission models of close contact infections: exploring self-evaluation and diary-based data collection through a web-based interface
modeling individual vulnerability to communicable diseases: a framework and design
individual-based computational modeling of smallpox epidemic control strategies
key facts about seasonal influenza (flu)
measured dynamic social contact patterns explain the spread of h1n1v influenza
who mixes with whom? a method to determine the contact patterns of adults that may lead to the spread of airborne infections
modelling disease outbreaks in realistic urban social networks
estimated influenza vaccination coverage among adults and children – united states
skip the trip: air travelers' behavioral responses to pandemic influenza
capturing human behaviour
strategies for containing an emerging influenza pandemic in southeast asia
strategies for mitigating an influenza pandemic
measuring personal networks with daily contacts: a single-item survey question and the contact diary
the spread of awareness and its impact on epidemic outbreaks
the talk of the town: modelling the spread of information and changes in behaviour
modelling the influence of human behaviour on the spread of infectious diseases: a review
detecting influenza epidemics using search engine query data
google flu trends – united states
threshold models of diffusion and collective behavior
modeling targeted layered containment of an influenza pandemic in the united states
perspectives on antiviral use during pandemic influenza
control of communicable diseases manual
modelling behavioural contagion
simulating sars: small-world epidemiological modeling and public health policy assessments
the impact of information transmission on epidemic outbreaks
sars-related perceptions in hong kong
tastes, ties, and time: a new social network dataset using facebook.com
containing pandemic influenza with antiviral agents
spatial–temporal transmission of influenza and its health risks in an urbanized area
agent-based simulation for a dual-diffusion process of influenza and human preventive behavior
coupling infectious diseases, human preventive behavior, and networks – a conceptual framework for epidemic modeling
visits by adults to family physicians for the common cold
transmissibility of 1918 pandemic influenza
the annual impact of seasonal influenza in the us: measuring disease burden and costs
can influenza epidemics be prevented by voluntary vaccination

supplementary data related to this article can be found at http://dx.doi.org/10.1016/j.apgeog.2014.02.005.

key: cord-280683-5572l6bo authors: liu, laura; moon, hyungsik roger; schorfheide, frank title: panel forecasts of country-level covid-19 infections date: 2020-10-16 journal: j econom doi: 10.1016/j.jeconom.2020.08.010 sha: doc_id: 280683 cord_uid: 5572l6bo

we use a dynamic panel data model to generate density forecasts for daily active covid-19 infections for a panel of countries/regions. our specification assumes that the growth rate of active infections can be represented by autoregressive fluctuations around a downward-sloping deterministic trend function with a break. our fully bayesian approach allows us to flexibly estimate the cross-sectional distribution of slopes and then implicitly use this distribution as a prior to construct bayes forecasts for the individual time series. we find some evidence that information from locations with an early outbreak can sharpen forecast accuracy for late locations. there is generally a lot of uncertainty about the evolution of active infections, due to parameter and shock uncertainty, in particular before and around the peak of the infection path. over a one-week horizon, the empirical coverage frequency of our interval forecasts is close to the nominal credible level. weekly forecasts from our model are published at https://laurayuliu.com/covid19-panel-forecast/. this paper contributes to the rapidly growing literature on generating forecasts related to the current covid-19 pandemic. we are adapting forecasting techniques for panel data that we have recently developed for economic applications such as the prediction of bank profits, charge-off rates, and the growth (in terms of employment) of young firms; see liu (2020), liu, moon, and schorfheide (2020), and liu, moon, and schorfheide (2019).
we focus on the prediction of the smoothed daily number of active covid-19 infections for a cross-section of approximately one hundred countries/regions, henceforth locations. the data are obtained from the center for systems science and engineering (csse) at johns hopkins university. while we are currently focusing on country-level aggregates, our model could be easily modified to accommodate, say, state- or county-level data. in economics, researchers distinguish, broadly speaking, between reduced-form and structural models. a reduced-form model summarizes spatial and temporal correlation structures among economic variables and can be used for predictive purposes assuming that the behavior of economic agents and policy makers over the prediction period is similar to the behavior during the estimation period. a structural model, on the other hand, attempts to identify causal relationships or parameters that characterize policy-invariant preferences of economic agents and production technologies. structural economic models can be used to assess the effects of counterfactual policies during the estimation period or over the out-of-sample forecasting horizon. the panel data model developed in this paper to generate forecasts of covid-19 infections is a reduced-form model. it processes cross-sectional and time-series information about past infection levels and maps them into predictions of future infections. while the model specification is motivated by the time-path of infections generated by the workhorse compartmental model in the epidemiology literature, the so-called susceptible-infected-recovered (sir) model, it is not designed to answer quantitative policy questions, e.g., about the impact of social-distancing measures on the path of future infection rates. building on a long tradition of econometric modeling dating back to haavelmo (1944), our model is probabilistic.
the growth rates of the infections are decomposed into a deterministic and a stochastic component, and the resulting density forecasts reflect uncertainty about model parameters and uncertainty about future shocks. we model the growth rate of active infections as autoregressive fluctuations around a deterministic trend function that is piecewise linear. the coefficients of this deterministic trend function are allowed to be heterogeneous across locations. the goal is not curve fitting (our model is distinctly less flexible in-sample than some other models) but rather out-of-sample forecasting, which is why we prefer to project growth rates based on autoregressive fluctuations around a parsimonious linear time trend with a single break. a key feature of the covid-19 pandemic is that the outbreaks did not take place simultaneously in all locations. thus, we can potentially learn from the speed of the spread of the disease and subsequent containment in country a, to make forecasts of what is likely to happen in country b, while simultaneously allowing for some heterogeneity across locations. in a panel data setting, one captures cross-sectional heterogeneity in the data with unit-specific parameters. the more precisely these heterogeneous coefficients are estimated, the more accurate are the forecasts. a natural way of disciplining the model is to assume that the heterogeneous coefficients are "drawn" from a common probability distribution. if this distribution has a large variance, then there is a lot of country-level heterogeneity in the evolution of covid-19 infections. if, instead, the distribution has a small variance, then the path of infections will be very similar across locations, and we can learn a lot from, say, china, that is relevant for predicting the path of the disease in south korea or germany. formally, the cross-sectional distribution of coefficients can be used as a so-called a priori distribution (prior) when making inference about country-specific coefficients.
using bayesian inference, we combine the prior distribution with the unit-specific likelihood functions to compute a posteriori (posterior) distributions. this posterior distribution can then be used to generate density forecasts of future infections. unfortunately, the cross-sectional distribution of heterogeneous coefficients is unknown. the key insight in the literature on bayesian estimation of panel data models is that this distribution, which is called the random effects (re) distribution in the panel data model literature, can be extracted through simultaneous estimation from the cross-sectional dimension of the panel data set. there are several ways of implementing this basic idea. our density forecasts reflect parameter uncertainty as well as uncertainty about shocks that capture deviations from the deterministic component of our forecasting model. our empirical analysis makes the following contributions. first, we present estimates of the re distribution as well as the distribution of location-specific coefficient estimates. second, we document how density forecasts from our model have evolved over time, focusing on the forecasts for three countries in which the level of infections peaked at different points in time: south korea, germany, and the u.s. due to the exponential transformation from growth rates to levels, density forecasts can feature substantial tail risk by assigning nontrivial probability to very high infection levels, which materialized in the u.s. but not in germany and south korea. third, we evaluate one-week and four-week ahead density forecasts based on the continuous ranked probability score and interval forecasts based on cross-sectional coverage frequency and average length. in addition to forecasts from our panel data model, we also consider forecasts based on location-level time series estimates of our trend-break model and a simple sir model.
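the borrowing-of-strength logic can be illustrated with the textbook normal-normal case: treat an estimated re distribution N(mu_re, tau2) as the prior for a unit-specific mean and combine it with that unit's own data. this is only a stylized illustration of the mechanism, not the paper's actual hierarchical model or gibbs sampler.

```python
def shrinkage_mean(y_bar, n_obs, sigma2, mu_re, tau2):
    """Posterior mean of a unit-specific coefficient under a normal-normal
    model: prior N(mu_re, tau2) taken from the RE distribution, and a unit
    sample mean y_bar of n_obs observations with noise variance sigma2.
    A small tau2 (homogeneous units) pulls the estimate toward the common
    mean mu_re; a large tau2 leaves it near the unit's own estimate y_bar."""
    precision_prior = 1.0 / tau2
    precision_data = n_obs / sigma2
    weight = precision_data / (precision_data + precision_prior)
    return weight * y_bar + (1.0 - weight) * mu_re
```

the weight on the unit's own data is its data precision relative to total precision, which is exactly the sense in which a tightly estimated cross-sectional distribution disciplines noisy location-level estimates.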
once we decompose the set of locations into those that experienced the covid-19 outbreak early (prior to 2020-03-28) and those that experienced the outbreak later on, then we find some evidence that for the late group the panel density forecasts are more accurate than the time-series forecasts. however, because of the substantial heterogeneity in our panel and the poor data quality for some countries, the empirical evidence in favor of the panel approach is not as tidy as the simulation evidence provided in the monte carlo section of this paper. over time, in particular after the infection level has peaked and started to fall, forecast accuracy increases. the timing of the peak appears to be very difficult to forecast. prior to the middle of may the panel and time-series forecasts from our trend-break model are substantially more accurate than the forecasts from a simple time-varying coefficient sir model. for subsequent forecast origins, the accuracy across the three forecasting procedures becomes much more similar. weekly real-time forecasts are published on the companion website https://laurayuliu.com/covid19-panel-forecast/. in terms of interval forecasts we find that over a one-week horizon the empirical coverage frequency of the trend-break model forecasts is close to the nominal coverage level based on which the forecasts were constructed. moreover, in april and may, the average interval lengths of the panel model forecasts are slightly smaller than the time-series intervals. at the four-week horizon the coverage frequency is considerably smaller than the nominal level, and it deteriorates further for longer horizons. this paper is connected to several strands of the literature. the panel data forecasting approach is closely related to work by gu and koenker (2017a,b) and our own work in liu (2020), liu, moon, and schorfheide (2020), and liu, moon, and schorfheide (2019).
all five papers focus on the estimation of the heterogeneous coefficients in panel data models. the forecasting model for the covid-19 infections is based on the alternative parametric model considered in liu (2020) and tailored to the specifics of the covid-19 pandemic. the approach has several desirable theoretical properties. for instance, liu, moon, and schorfheide (2020), building on brown and greenshtein (2009), show that an empirical bayes implementation of the forecasting approach based on tweedie's formula can asymptotically (as the cross-sectional dimension tends to infinity) lead to forecasts that are as accurate as the so-called oracle forecasts. here the oracle forecast is an infeasible benchmark that assumes that the distribution of the heterogeneous coefficients is known to the forecaster. liu (2020) shows that the density forecast obtained from the full bayesian analysis converges strongly to the oracle's density forecast as the cross-section gets large. the piecewise-linear conditional mean function for the infection growth rate resembles a spline; see de boor (1990) for an introduction to spline approximation. unlike a typical spline approximation, in which the knot locations are free parameters and some continuity or smoothness restrictions are imposed, the knot placement in our setting is closely tied to the first component of the spline, and we do not impose continuity. our model could be generalized by adding additional knots in the deterministic trend component of infection growth rates, but this extension is not pursued in this paper. other authors have explored alternative forms of nonlinearity which are often tied to the object that is being modeled, e.g., active infections, cumulative infections, new infections, deaths. for instance, li and linton (2020) model the logarithm of country-level new infections and new deaths via a quadratic trend, using rolling samples.
ho, lubik, and matthes (2020) model the cumulative number of infections using a very flexible nonlinear parametric function. an important aspect of our modeling framework is that the panel model is specified in event time, i.e., time since the level of infections in a particular location exceeds 100. the forecasts, however, are generated based on calendar time. this allows us to sharpen forecasts for countries/regions that experienced an outbreak at a late stage (in terms of calendar time), based on information from locations with an early outbreak. this idea is also utilized by larson and sinclair (2020), who use state-level panel data to nowcast unemployment insurance claims during covid-19. a growing number of researchers with backgrounds in epidemiology, biostatistics, machine learning, economics, and econometrics are engaged in forecasting aspects of the covid-19 pandemic. because this is a rapidly expanding and diverse field, we do not attempt to provide a meaningful survey at this moment. instead, we simply provide a few pointers. forecasts are reported in the abovementioned papers by li and linton (2020) and ho, lubik, and matthes (2020); further pointers can be found in the paper by avery, bossert, clark, ellison, and fisher ellison (2020). the remainder of this paper is organized as follows. section 2 provides a brief survey of epidemiological models with a particular emphasis on the sir model. the specification of our panel data model is presented in section 3. section 4 contains a small-scale monte carlo study, and the empirical analysis is conducted in section 5. finally, section 6 concludes. there is a long history of modeling epidemics. a recent survey of modeling approaches is provided by bertozzi, franco, mohler, short, and sledge (2020).
the authors distinguish three types of macroscopic models: (i) the exponential growth model; (ii) self-exciting point processes / branching processes; (iii) compartmental models, most notably the sir model, which divides a population into susceptible (s_t), infected (i_t), and resistant (r_t) individuals. our subsequent discussion will focus on the exponential growth model and the sir model. while epidemiological models are often specified in continuous time, we will consider a discrete-time specification in this paper because it is more convenient for econometric inference. the exponential model takes the form i_t = i_0 exp(γ_0 t). the number of infected individuals will grow exponentially at the constant rate γ_0. this is a reasonable assumption to describe the outbreak of a disease, but not the subsequent dynamics, because the growth rate will typically fall over time and eventually turn negative as more and more people become resistant to the disease. the sir model dates back to kermack and mckendrick (1927). in its most elementary version it can be written in discrete time as follows:

s_{t+1} - s_t = -β s_t i_t / n,
i_{t+1} - i_t = β s_t i_t / n - γ i_t,
r_{t+1} - r_t = γ i_t,

where n is the (fixed) size of the population, β is the average number of contacts per person per time, and γ is the rate of recovery or mortality. the model could be made stochastic by assuming that β and γ vary over time. in response to the recent covid-19 pandemic, several introductory treatments of sir models have been written for economists, e.g., avery, bossert, clark, ellison, and fisher ellison (2020) and stock (2020). moreover, there is a growing literature that combines compartmental models with economic components. in these models, economic agents account for the possibility of contracting a disease when making their decisions about market participation. this creates a link between infection rates and economic activity through the frequency of interactions.
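a minimal discrete-time sir iteration consistent with the definitions above (fixed population n, contact rate β, recovery rate γ) might look as follows; the parameter values in the test are illustrative, not the paper's.

```python
def simulate_sir(beta, gamma, n=1_000_000, i0=100, periods=200):
    """Discrete-time SIR iteration: susceptibles are infected at rate
    beta*s*i/n per period, and infected individuals recover (or die) at
    rate gamma. Returns the path of infected counts i_0, ..., i_periods."""
    s, i, r = float(n - i0), float(i0), 0.0
    path = [i]
    for _ in range(periods):
        new_infections = beta * s * i / n
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        path.append(i)
    return path
```

with beta > gamma the infected count rises, peaks when the susceptible share falls to gamma/beta, and then declines, which is the hump-shaped path the growth-rate discussion below builds on.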
examples of this work in macroeconomics include eichenbaum, rebelo, and trabandt (2020), glover, heathcote, krueger, and rios-rull (2020), and krueger, uhlig, and xie (2020). the advantage of models that link health status to economic activity is that they can be used to assess the economic impact of, say, social distancing measures. we now simulate the constant-coefficient sir model in (1). under the first parameterization, the implied growth rate of infections is a monotonically decreasing function of time that we approximate by fitting a piecewise-linear least-squares regression line with a break point at t * , which is the point in time when the infections peak and the growth rate transitions from being positive to being negative. under the second parameterization the transmission rate β = 0.06 is much lower and the recovery rate is slightly faster. this leads to an almost bell-curve-shaped path of infections. while the resulting growth rate of the infections is not exactly a linear function of time t, the break at t * is much less pronounced. while the piecewise-linear regression functions do not fit perfectly, they capture the general time-dependence of the growth-rate path implied by the sir model. in particular, they allow for a potentially much slower change in the growth rate of infections after the peak. we use these simulations as a motivation for the subsequent specification of our empirical model. this model assumes that the growth rate of infections is a decreasing piecewise-linear function of time with a break when the growth rates cross zero and the infections peak. this deterministic component is augmented by a stochastic component that follows a first-order autoregressive, ar(1), process. we refer to the model as the trend-break model. we will revisit a stochastic version of the sir model that comprises (1) and (2) in section 5.4, where we compare its forecasts to the proposed trend-break model. we now describe our empirical model in more detail.
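the piecewise-linear least-squares approximation with a break at t * can be sketched as follows; the function and coefficient names are ours, and the numerical values are made up for illustration.

```python
import numpy as np

def fit_trend_break(y, t_star):
    """least-squares fit of y_t = g0 + g1*t + (d0 + d1*(t - t_star)) * 1{t >= t_star}."""
    t = np.arange(len(y))
    post = (t >= t_star).astype(float)
    x = np.column_stack([np.ones(len(y)), t, post, post * (t - t_star)])
    coef, *_ = np.linalg.lstsq(x, y, rcond=None)
    return coef  # [g0, g1, d0, d1]

# generate a noise-free growth-rate path with a known break and recover it
t_star, t_len = 10, 40
t = np.arange(t_len)
post = (t >= t_star).astype(float)
true = np.array([0.2, -0.02, -0.05, 0.01])
y = true[0] + true[1] * t + (true[2] + true[3] * (t - t_star)) * post
coef = fit_trend_break(y, t_star)
```

on a noise-free path the regression recovers the known coefficients exactly, which is a quick sanity check of the break parameterization.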
we begin with the specification of a regression model for the growth rate of infections in section 3.1. our model features location-specific regression coefficients and heteroskedasticity. the prior distribution for the bayesian analysis is summarized in section 3.2. posterior inference is implemented through a gibbs sampler that is outlined in section 3.3. further computational details are provided through replication files on the companion webpage. the algorithm to obtain simulated infection paths from the posterior predictive distribution is outlined in section 3.4. we specify a panel data model for infection growth rates y it = δ ln i it , i = 1, . . . , n and t = 1, . . . , t . we assume that y it = γ 0i + γ 1i t + (δ 0i + δ 1i (t − t * i )) 1{t ≥ t * i } + u it , u it = ρ u it−1 + ε it , where δ i = [δ 0i , δ 1i ] captures the size of the break in the regression coefficients at t = t * i . the deterministic part of y it corresponds to the piecewise-linear regression functions fitted to the infection growth paths simulated from the sir in figure 1 . the serially-correlated process u it generates stochastic deviations from the deterministic path γ i x t of the infection growth rate. the u it shocks may capture time variation in the (β, γ) parameters of the sir model or, alternatively, model misspecification. in section 2 the break point t * i was given by the peak of the infection path. abstracting from a potential discontinuity at the kink, we define t * i as the point at which the pre-break deterministic trend crosses zero, which implies that e[y it |t = t * i ] = 0. because of the ar(1) process u it , t * i is not the peak of the observed sample path, nor is it an unbiased or consistent estimate of the period in which the infections peak. for δ i = 0, the model reduces to a linear deterministic trend with ar(1) errors. note that the break date t * i is identified in this model even if δ i = 0, because we assume the break occurs when the deterministic component of the growth rate falls below zero. to construct a likelihood function we define the quasi-difference operator δ ρ = 1 − ρl such that δ ρ u it = ε it .
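a hedged sketch of the trend-break growth-rate process with an ar(1) error term; all parameter values are ours for illustration (the shock standard deviation is set to zero so the deterministic kink is visible).

```python
import numpy as np

rng = np.random.default_rng(0)

# sketch of the trend-break growth-rate process with an ar(1) error; the
# parameter values are made up and do not come from the paper.
def simulate_growth_path(g0, g1, d0, d1, t_star, rho, sigma, periods, rng):
    y = np.empty(periods)
    u = 0.0
    for t in range(periods):
        post = t >= t_star
        det = g0 + g1 * t + (d0 + d1 * (t - t_star)) * post
        u = rho * u + sigma * rng.standard_normal()  # ar(1) deviation from trend
        y[t] = det + u
    return y

y = simulate_growth_path(0.2, -0.02, -0.05, 0.01,
                         t_star=10, rho=0.8, sigma=0.0, periods=40, rng=rng)
```

with sigma = 0 the path is purely deterministic: the growth rate starts at 0.2, declines by 0.02 per period, drops by the intercept shift at the break, and thereafter declines by only 0.01 per period.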
thus, we can rewrite (3) in quasi-differenced form, with the innovations ε it on the right-hand side. now let λ i = [γ i , δ i ] and n λ be the dimension of λ. the parameters of the panel data model are (ρ, λ 1:n , σ 2 1:n ). here, we use the notation z 1:l to denote the sequence z 1 , . . . , z l . using this notation, we denote the panel observations by y 1:n,1:t . we will subsequently condition on y 1:n,0 to initialize the conditional likelihood function. finally, from the growth rates y it we can easily recover the level of active infections as i it = i i0 exp(y i1 + · · · + y it ). to conduct bayesian inference, we need to specify a prior distribution for (ρ, λ 1:n , σ 2 1:n ). we do so conditional on a vector of hyperparameters ξ that do not enter the likelihood function. our prior distribution has the factorization p(ρ, λ 1:n , σ 2 1:n , ξ) ∝ p(ρ) [ i f (λ i ) p(λ i |ξ) p(σ 2 i |ξ) ] p(ξ), where ∝ denotes proportionality and f (•) is an indicator function that we will use to impose the following sign restrictions on the elements of λ i . the restriction γ 1i < 0 ensures that the growth rates are falling over time. after the break point the rate of decline decreases (δ 1i > 0), but stays negative (γ 1i + δ 1i < 0). in addition we assume that the decrease in the rate of decline is associated with a downward shift, i.e., δ 0i < 0, of the intercept, as shown in the sir simulation. because of the presence of the indicator function f (•) the right-hand side of (8) is not a properly normalized density. in view of the indicator function f (•) we define the random effects (re) distribution of λ i given ξ as π(λ i |ξ) = f (λ i ) p(λ i |ξ)/c(ξ), where c(ξ) is a normalization constant. in turn, the marginal prior distribution of the hyperparameters is given by p(ξ). building on liu (2020), we use standard parametric densities p(•) in (8) for ρ, λ i , and σ 2 i (the λ i density is normal and the σ 2 i density is inverse gamma). thus, the vector of hyperparameters is ξ = (μ, σ, a, b). we decompose p(ξ) = p(μ, σ)p(a, b).
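the quasi-difference operator δ ρ = 1 − ρl can be sketched as follows; applied to an ar(1) process it returns the underlying innovations ε it (the function name is ours).

```python
import numpy as np

# sketch of the quasi-difference operator (1 - rho L): applied to an ar(1)
# process it recovers the underlying innovations.
def quasi_difference(u, rho):
    u = np.asarray(u, dtype=float)
    return u[1:] - rho * u[:-1]

# build an ar(1) process with known innovations, then quasi-difference it
rng = np.random.default_rng(1)
rho = 0.8
eps = rng.standard_normal(500)
u = np.empty(500)
u[0] = eps[0]
for t in range(1, 500):
    u[t] = rho * u[t - 1] + eps[t]

recovered = quasi_difference(u, rho)
```

the recovered series matches the original innovations, which is exactly why quasi-differencing makes the likelihood tractable.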
the density p(μ, σ) is constructed as follows: the degrees of freedom for the inverse wishart distribution are fixed a priori, and the matrix w 0 is constructed to align the scale of the variance of μ i with the cross-sectional variance of the data, adjusting for the average magnitudes of the regressors that multiply the λ i elements. to obtain the density p(a, b), we follow llera and beckmann (2016). the parameters (α a , β a , γ a , α b , β b ) need to be chosen by the researcher. we use α a = 1, β a = γ a = α b = β b = 0.01, which specifies relatively uninformative priors for hyperparameters a and b. posterior inference is based on an application of bayes theorem. let p(y 1:n,1:t |λ 1:n , σ 2 1:n , ρ) denote the likelihood function (for notational convenience we dropped y 1:n,0 from the conditioning set). then the posterior density is proportional to p(ρ, λ 1:n , σ 2 1:n , ξ|y 1:n,0:t ) ∝ p(y 1:n,1:t |λ 1:n , σ 2 1:n , ρ) p(ρ) p(λ 1:n , σ 2 1:n , ξ), where the prior was given in (8). to generate draws from the posterior distribution we use a gibbs sampler that iterates over the conditional posterior distributions of ρ, λ 1:n , σ 2 1:n , and ξ. the gibbs sampler generates a sequence of draws ρ s , λ s 1:n , (σ 2 1:n ) s , ξ s , s = 1, . . . , n sim , from the posterior distribution. the implementation of the gibbs sampler closely follows liu (2020). for the gibbs sampler to be efficient, it is desirable to have a model specification in which it is possible to directly sample from the conditional posterior distributions in (15). unfortunately, the exact likelihood function leads to a non-standard conditional posterior distribution for λ 1:n |(y 1:n,0:t , ρ, σ 2 1:n , ξ) because γ i enters the indicator function in (3) through the definition of t * i . thus, rather than using the exact likelihood function, we will use a limited-information likelihood function that factors into densities p l (y i,1:t |λ i , σ 2 i ), which are constructed as follows.
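the blocked conditional-sampling structure of a gibbs sampler can be illustrated with a toy normal model with unknown mean and variance; this is a stylized stand-in written by us, not the paper's actual sampler, and all values are made up.

```python
import numpy as np

# toy gibbs sampler for a normal model with unknown mean mu and variance
# sigma2, illustrating iteration over conditional posteriors; this is a
# stand-in, not the paper's sampler.
rng = np.random.default_rng(2)
data = rng.normal(loc=1.5, scale=0.5, size=200)
n_obs, ybar = len(data), data.mean()

mu, sigma2 = 0.0, 1.0
draws_mu, draws_s2 = [], []
for s in range(2000):
    # draw mu | sigma2, data (flat prior on mu): normal around the sample mean
    mu = rng.normal(ybar, np.sqrt(sigma2 / n_obs))
    # draw sigma2 | mu, data under an improper 1/sigma2 prior: IG(n/2, b)
    a_post = n_obs / 2.0
    b_post = 0.5 * np.sum((data - mu) ** 2)
    sigma2 = b_post / rng.gamma(a_post)
    if s >= 500:  # discard burn-in draws
        draws_mu.append(mu)
        draws_s2.append(sigma2)

post_mean_mu = float(np.mean(draws_mu))
post_mean_s2 = float(np.mean(draws_s2))
```

each full pass draws one block conditional on the most recent value of the other, which is the same mechanism the paper uses across the blocks ρ, λ 1:n , σ 2 1:n , and ξ.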
let δ be some positive number, e.g., three or five time periods. given a sample (y i,1:t , ln i i,1:t ) we define t i,max as the period with the largest observed level of infections. if t i,max = t , the infections may not have peaked within the sample; on the other hand, if t i,max < t , then it is likely that t * i = t i,max . thus, we distinguish two cases: in the first, δ i does not enter the likelihood function, and its posterior is equal to its prior; in the second, δ i does enter the likelihood function and its prior gets updated in view of the data. bayesian forecasts reflect parameter and shock uncertainty. we simulate trajectories of infection growth rates from the posterior predictive distribution using algorithm 1. the simulated growth rates can be converted into simulated trajectories for active infections using (7). algorithm 1 (simulating from the posterior predictive distribution): 1. draw parameters from the posterior distribution and, conditional on each draw, simulate growth-rate paths and convert them into infection paths i s 1:n,t +1:t +h . 2. based on the simulated paths i s 1:n,t +1:t +h , s = 1, . . . , n sim , compute point, interval, and density forecasts for each period t = t + 1, . . . , t + h. we now conduct a small monte carlo experiment that compares the forecasts derived from the panel data model to time-series forecasts generated for each location separately. the experiment shows that in our environment forecasts for locations that experience an outbreak at a later point in time are more accurate than forecasts for locations that have an early outbreak, because the early outbreaks facilitate learning about the re distribution that benefits the forecasts for the remaining locations. the data generating process (dgp) is described in section 4.1 and the results are summarized in section 4.2. the dgp is given by the trend-break model (3) for the growth rates of infections, with heterogeneous coefficients and an initial infection level of i i0 = 101 for all i. for the simulation experiment we assume that the innovations ε it are homoskedastic, i.e., σ 2 i = σ 2 for all i. the dgp matches certain aspects of the empirical application in section 5, but it is more stylized in other dimensions.
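the conversion in (7) from simulated growth rates to levels of active infections, as used by algorithm 1, can be sketched as follows; the numbers are illustrative.

```python
import numpy as np

# sketch of the growth-rate-to-level conversion in (7): a simulated path of
# log growth rates is accumulated and exponentiated onto the last observed level.
def growth_to_levels(i_last, y_future):
    """i_{T+h} = i_T * exp(y_{T+1} + ... + y_{T+h}) for h = 1, ..., H."""
    return i_last * np.exp(np.cumsum(y_future))

i_last = 1000.0
y_future = np.array([0.1, 0.0, -0.1])   # a simulated future growth-rate path
levels = growth_to_levels(i_last, y_future)
```

a growth path that rises and then falls back by the same amount returns the level to its starting point, which is the nonlinearity behind the skewed level forecasts discussed later.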
the time period t is a day. the number of locations in our simulation is n = 150. we split the locations into two groups: n 1 = 75 locations experience an early outbreak, starting at t = 1, and n 2 = 75 locations experience a late outbreak, starting at t = 56. we refer to these groups as "early" and "late." for the early group, calendar time and event time are identical. for the late group, the event time is calendar time minus t δ = 56 (8 weeks). the parameters of the dgp are summarized in table 1 . the persistence of the growth rates is set to ρ = 0.8. the dispersion of the parameters λ i is controlled by a vector of means, λ, and a covariance matrix σ. both are calibrated to match some stylized facts about the cross-sectional distribution of the country-level data used in section 5. we then draw the λ i s independently from the n (λ, σ) distribution. the innovation variance σ 2 corresponds to a high-density value of the estimated density σ 2 i ∼ ig(a, b). we assume that the outbreak starts in each geographical location i with i i0 = 101. there is also heterogeneity in the timing of the peak, which is illustrated in figure 2 . the figure shows the percentage of locations that have peaked in or prior to period t. by construction, infections in the early-group locations tend to peak sooner than in the late-group locations. however, the peak dates in each group are quite dispersed: only 20% of early locations have peaked after 60 days. it takes more than 100 days for the remaining early locations to peak. forecasting models. we report results for two forecasting models: (i) the panel data model; and (ii) location-specific time-series models. forecast evaluation. because of the exponential transformation in (7) from growth rates to levels, there is a large degree of cross-sectional heterogeneity among the levels of infection. locations with larger numbers of infections tend to be associated with larger forecast errors.
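the monte carlo design above can be sketched as follows; the group sizes and the 56-day shift come from the text, but the mean vector and covariance matrix below are made-up values, not those in table 1.

```python
import numpy as np

# sketch of the monte carlo design: 150 locations, half "early" and half
# "late", the late outbreak lagging by 56 days (8 weeks); the lambda_i are
# drawn from a common normal random-effects distribution. the mean vector
# and covariance matrix are illustrative, not the calibrated table 1 values.
rng = np.random.default_rng(3)
n_early, n_late, shift = 75, 75, 56
lam_bar = np.array([0.2, -0.02, -0.05, 0.01])     # [g0, g1, d0, d1], made up
sigma_lam = np.diag([0.01, 1e-5, 1e-4, 1e-6])     # made-up dispersion
lam = rng.multivariate_normal(lam_bar, sigma_lam, size=n_early + n_late)
outbreak_start = np.array([1] * n_early + [1 + shift] * n_late)
```

drawing each location's coefficients from a common distribution is what later allows the early locations to inform forecasts for the late ones.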
if we simply average forecast errors or forecast interval lengths across locations, the results will be driven by a few locations with a high level of infections. therefore, we standardize all level-forecast evaluation statistics by the level of infections at the forecast origin, i it , i.e., we report results for the forecast of i it +h /i it . we will report measures of density and interval forecasting performance below. we do not consider point forecasts, because we strongly believe that, due to the highly uncertain path of infections during a pandemic, it is essential for forecasters to report forecasts that convey the degree of uncertainty in the predictive distribution. the density forecast performance is evaluated based on continuous ranked probability scores (crps). the crps measures the l 2 distance between the cumulative density function f it +h|t (x) associated with a predictive distribution for location i at forecast origin t and a perfect probability forecast that assigns probability one to the realized x it +h : crps i,t +h|t = ∫ ( f it +h|t (x) − 1{x it +h ≤ x}) 2 dx. the crps is a proper scoring rule, meaning that it is optimal for the forecaster to truthfully reveal her predictive density. here x it +h could either be a growth rate y it +h or a relative level i it +h /i it . for interval forecasts, we will report the cross-sectional coverage frequency and the average length separately, as discussed in more detail in askanazi, diebold, schorfheide, and shin. we begin with the top left panel of figure 3 . in most early-group locations, the infections tend to peak between the forecast origin t = 63 and the four-week-ahead forecast target t + h = 91. in the late-group locations, the peak occurs after the forecast target date. for t = 63, three important findings emerge. first, the panel forecasts clearly dominate the time-series forecasts. the discrepancy is particularly large for locations in the late group.
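the crps can be approximated directly from predictive draws with the standard sample-based formula E|X − x| − ½E|X − X′|; the helper below is ours and the inputs are illustrative.

```python
import numpy as np

# sample-based approximation of the crps from predictive draws; a point-mass
# forecast on the realization attains the best (zero) score.
def crps(draws, realization):
    draws = np.asarray(draws, dtype=float)
    term1 = np.mean(np.abs(draws - realization))
    term2 = 0.5 * np.mean(np.abs(draws[:, None] - draws[None, :]))
    return term1 - term2

rng = np.random.default_rng(4)
sharp = crps(np.full(200, 1.0), 1.0)        # degenerate forecast at the realization
diffuse = crps(rng.normal(1.0, 1.0, 200), 1.0)
```

the degenerate forecast scores zero while a diffuse forecast centered on the same realization scores strictly worse, consistent with the crps being a proper scoring rule that rewards both calibration and sharpness.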
second, while for the early group the crps based on the panel forecasts seems to be unrelated to the peak date, the accuracy of the time-series forecasts is substantially worse for early-group locations that peak between periods 63 and 91 than it is for locations that peak prior to period 63. third, the four-week-ahead panel forecasts for the late group are much more accurate than the panel forecasts for the early group. these findings can be explained as follows. first, in a panel setting, the experience of the early locations allows for relatively precise inference about the re distribution, which then sharpens the posterior inference for the late locations because the uncertainty about the prior distribution is reduced. note that the time-series dimension for the late group is only 7. second, due to the structural break in the growth rate at the peak infection level, it is very difficult to predict how quickly the infections will die out after they have peaked. this makes it easier to predict infections for the late group, which includes the locations that are still far away from the peak, than for the early group, in which infection levels are relatively close to the peak. the top right panel of figure 3 indicates that after 18 weeks (t = 126) the benefit of the panel approach is a lot smaller, both for the early group and the late group. because more time-series information is available to estimate the location-specific parameters, the benefit from using prior information is significantly diminished. the bottom panel of the figure shows crps for levels rather than growth rates. the key message remains the same: early on in the pandemic, the panel approach substantially improves forecasts for locations that experience a delayed outbreak, because there is some learning from the locations in which the outbreak occurred early on.
in figure 4 we plot the group-specific average crps as a function of the forecast origin. the messages are similar to those from figure 3 , but now the results span a broad range of forecast origins. first, the panel forecasts are (at least weakly) more accurate than the time-series forecasts. however, the accuracy differential vanishes as the time-series dimension of the estimation sample increases over time. second, the benefit from using a panel approach is more pronounced for the locations that experience a late outbreak than for those that experience an early outbreak. interval forecast accuracy. finally, we report results on the interval forecast performance for infection growth rates and levels in figure 5 . we apply the panel forecasting techniques to country/region-level data on active covid-19 infections. the data set used in the empirical analysis is described in section 5.1. we discuss the posterior estimates for the 2020-04-18 forecast origin in section 5.2. the data set is obtained from csse at johns hopkins university. we define the total number of active infections in location i and period t as the number of confirmed cases minus the number of recovered cases and deaths. throughout our study we use country-level aggregates. the time period t corresponds to a day, and we fit our model to one-sided three-day rolling averages to smooth out noise generated by the timing of the reporting. in a slight abuse of notation, the time subscript t in (3) is re-interpreted accordingly. before discussing the forecasts, we will examine the parameter estimates for one of the early samples, namely 2020-04-18. heterogeneous slope coefficients. our gibbs sampler generates draws from the joint posterior of (ρ, λ 1:n , σ 2 1:n , ξ)|y 1:n,0:t . we begin with a discussion of the estimates of γ 1i and δ 1i , which affect the speed at which the growth rates are expected to change on a daily basis. γ 1i measures the average daily decline in the growth rate of active infections.
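the data construction described above (active infections = confirmed − recovered − deaths, smoothed with a one-sided three-day rolling average) can be sketched as follows; the helper names and the numbers are ours.

```python
import numpy as np

# sketch of the data construction: active infections are confirmed minus
# recovered minus deaths, smoothed with a one-sided three-day rolling average.
def active_infections(confirmed, recovered, deaths):
    return np.asarray(confirmed) - np.asarray(recovered) - np.asarray(deaths)

def one_sided_rolling_mean(x, window=3):
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    for t in range(len(x)):
        out[t] = x[max(0, t - window + 1): t + 1].mean()  # only past values
    return out

active = active_infections([100, 150, 210], [10, 20, 40], [1, 2, 3])
smoothed = one_sided_rolling_mean(active)
```

the one-sided window uses only current and past observations, so the smoothing does not leak future reporting into the forecast origin.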
for instance, suppose that at the beginning of the outbreak, in event time t = 0, the growth rate is ln(i t /i t−1 ) = 0.2, i.e., approximately 20%. a value of γ 1i = −0.02 implies that, on average, the growth rate declines by 0.02 per day, meaning that after 10 days it is expected to reach zero and turn negative subsequently. a positive value of δ 1i = 0.01 implies that after the growth rate becomes negative, its decline is reduced (in absolute value) to γ 1i + δ 1i = −0.01; thus, the decline in infections after they have peaked will take considerably longer than the rise to the peak. random effects distribution. an important component of our model is the re distribution π(λ i |ξ) defined in (9). prior and posterior uncertainty with respect to the hyperparameters ξ generate uncertainty about the re distribution. in the remaining panels of figure 6 we plot draws from the posterior (center column) and prior (right column) distribution of the re density π(λ i |ξ). each draw is represented by a hairline. because the normalization constant c(ξ) of π(λ i |ξ) is difficult to compute due to the truncation of a joint normal distribution, we show kernel density estimates obtained from draws from π(λ i |ξ). we now turn to density forecasts generated from the estimated panel data model. for now, we will focus on the early stage of the pandemic. we use algorithm 1 to simulate trajectories of infection growth rates which, conditional on observations of the initial levels i it , we convert into stocks of active infections. for each forecast horizon h we use the values y s it +h and i s it +h , s = 1, . . . , n sim , to approximate the predictive density. strictly speaking, we are not reporting complete predictive densities. instead, we plot medians and construct equal-tail-probability bands that capture the range between the 20-80% and 10-90% quantiles. the wider the bands, the greater the uncertainty. the path of active infections broadly resembles the paths simulated with the sir model in section 2.
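the interpretation of γ 1i and δ 1i above can be checked with the numbers from the text:

```python
# numeric check of the example above (values from the text): an initial
# growth rate of 0.2 declining by 0.02 per day reaches zero after 10 days,
# and the post-break slope gamma_1i + delta_1i is -0.01.
g0, g1, d1 = 0.2, -0.02, 0.01
days_to_zero = g0 / -g1        # 10 days to reach a zero growth rate
post_break_slope = g1 + d1     # slower decline after the break
```

since the post-break slope is half the pre-break slope in absolute value, the return of the growth rate from its post-break level takes roughly twice as long per unit of decline, which is why the fall in infections is slower than the rise.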
the rise of infections during the outbreak tends to be faster than the subsequent decline, which is a feature that is captured by the break in the conditional mean function of our model for the infection growth rate y it in (3). the difference between the bands depicted in the second and third rows is that the former reflects parameter uncertainty only (we set future shocks equal to zero), whereas the latter reflects parameter and shock uncertainty. in the case of germany, shock uncertainty increases the width of the bands by approximately 30%. due to the exponential transformation that is used to recover the levels, the predictive densities are highly skewed and exhibit a large upside risk. this is particularly evident for the u.s.: the growth rate prediction in the first row indicates that there is an approximately 20% probability of a positive infection growth rate throughout april and at least a 10% probability until the middle of june. converted into levels, temporarily positive growth rates of infections can generate a rise of infections from less than one million in april to more than five million two months later. in the bottom row of figure 8 we plot the cumulative density function for the date of recovery, which we define as the first date when the infections fall below the initial level i i0 . the density function is calculated by examining each of the future trajectories i s it +h for h = 1, . . . , 60 generated by algorithm 1. for south korea the probability that the infection level will fall below i i0 over the two-month period is close to 80%, whereas for germany and the u.s. the probability is approximately 50% and 60%, respectively. in figure 9 we overlay eight weeks of actual infections onto the density forecasts. we now turn to a more systematic evaluation of the forecasts and will assess the accuracy of density and interval forecasts represented by the bands in figures 8 and 9 .
for reasons previously discussed in section 4.2, we standardize future infections i it +h by the level of infections i it at the forecast origin. a closer inspection of the forecasts for more than 100 countries/regions reveals that the long-run forecasting performance is not particularly good. this is not just a feature of our panel trend-break model, but also a feature of other epidemiological models such as the sir model, for which we will report results below. thus, in this section we will focus on one-week- and four-week-ahead forecasts and not report results for an eight-week horizon. alternative models. in addition to the panel model forecasts, we consider two alternative forecasts. first, as in section 4, we generate time-series forecasts based on the trend-break model (3) for each location. second, we estimate a version of the simple sir in (1) with time-varying parameters β t and γ t . notice that by rewriting (1) we can express β t and γ t directly as functions of the observables (here we are omitting i subscripts): β t = n (s t − s t+1 )/(s t i t ) and γ t = (r t+1 − r t )/i t . this allows us to estimate the ar(1) law of motion in (2) for each country using bayesian techniques. the ar(1) models are then used to simulate trajectories (β t +1:t +h , γ t +1:t +h ) from the posterior predictive distribution. for each parameter sequence, we iterate the sir model (1) forward to obtain a predictive distribution of the active infections. density forecast accuracy. figure 10 summarizes the one-week-ahead density forecasting performance for once-a-week forecast origins starting on 2020-04-18 and ending on 2020-07-04. for each location, we compute the probability score crps i,t +h|t . the top row shows the cross-sectional median as a function of the forecast origin, whereas the center and the bottom rows show the cross-sectional empirical distribution for two forecast origins: 2020-04-18 and 2020-06-06.
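expressing β t and γ t as functions of the observables, as described above, can be sketched by generating a path from a constant-coefficient sir and backing the rates out; the parameter values are ours for illustration.

```python
import numpy as np

# sketch of backing out beta_t and gamma_t from observed (s, i, r) paths by
# inverting the discrete-time sir recursions; with a constant-coefficient
# data-generating path the recovered rates should be constant.
def recover_rates(s, i, r, n):
    beta = n * (s[:-1] - s[1:]) / (s[:-1] * i[:-1])
    gamma = (r[1:] - r[:-1]) / i[:-1]
    return beta, gamma

n, beta0, gamma0, horizon = 1e6, 0.3, 0.1, 50   # illustrative values
s = np.empty(horizon); i = np.empty(horizon); r = np.empty(horizon)
s[0], i[0], r[0] = n - 100, 100.0, 0.0
for t in range(horizon - 1):
    new_inf = beta0 * s[t] * i[t] / n
    s[t + 1] = s[t] - new_inf
    i[t + 1] = i[t] + new_inf - gamma0 * i[t]
    r[t + 1] = r[t] + gamma0 * i[t]

beta_hat, gamma_hat = recover_rates(s, i, r, n)
```

on real data the recovered sequences β t and γ t vary over time, which is what the ar(1) laws of motion in (2) are then fitted to.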
the panels in the left column of figure 10 cover all locations, whereas the panels in the right column distinguish between early-group and late-group locations. the early group comprises locations that experienced more than 100 infections before 2020-03-28; the remaining locations form the late group. (the following additional variables are obtained from the jhu csse dataset: n is the total population of each country; s t is computed as n − i t − recovered cases − deaths. based on the specification of the sir model, we let β t , γ t > 0 and 0 ≤ s t , i t , r t ≤ n for all t.) the panels in the right column of figure 10 distinguish between locations that experienced the covid-19 outbreak at an early stage and locations that were hit by the pandemic at a later stage. the key result is that for forecast origins dated 2020-05-09 or earlier, the panel forecasts for the late group are more accurate than the time-series forecasts from the trend-break model. this result confirms the basic intuition that the panel approach can be advantageous during a slowly spreading pandemic because the experience of the early-group countries can sharpen inference on the re distribution for the later countries. unfortunately, because the time-series approach dominates the panel approach for the early countries, in the aggregate (left panels of figure 10 ) there is no clear advantage to the panel analysis in our data set. turning to interval forecasts, the panel data forecasts have a smaller average length than the individual-level forecasts for both groups and in the aggregate. thus, on balance, in terms of interval forecasting, the panel approach comes out slightly ahead. finally, the bottom right panel shows that the interval forecasts for the late group are generally wider than for the early group. the additional uncertainty is caused by the difficulty of predicting the change in infection growth rates around the peak. figure 13 displays results for a four-week horizon. over this longer horizon, the coverage frequency is generally poor.
as for the shorter horizon, the sir model interval forecasts are substantially worse in terms of coverage frequency and interval length than the panel and time-series forecasts from the trend-break model. (figure notes: the nominal coverage probability is 80%. left column panels: solid is panel, dashed is country-level, and dash-dotted is sir. right column panels: solid is panel, dashed is country-level. blue lines correspond to the early group and orange lines to the late group.) we adopted a panel forecasting model initially developed for applications in economics to forecast active covid-19 infections. a key feature of our model is that it exploits the experience of countries/regions in which the epidemic occurred early on to sharpen forecasts and parameter estimates for locations in which the outbreak took place later in time. at the core of our model is a specification that assumes that the growth rate of active infections follows a deterministic trend function with a break. our specification is inspired by infection dynamics generated from a simple sir model. according to our model, there is a lot of uncertainty about the evolution of infection rates, due to parameter uncertainty and the realization of future shocks. moreover, due to the inherent nonlinearities and exponential transformations, predictive densities for the level of infections are highly skewed and exhibit substantial upside risk. consequently, it is important to report density or interval forecasts, rather than point forecasts.
a natural extension of our model is to allow for additional, data-determined breaks in the deterministic trend function as the pandemic unfolds and countries/regions are adopting new policies. references (titles as extracted):
on the comparison of interval forecasts
policy implications of models of the spread of coronavirus: perspectives and opportunities for economists
the challenges of modeling and forecasting the spread of covid-19
nonparametric empirical bayes and compound decision approaches to estimation of a high-dimensional vector of normal means
splinefunktionen
the macroeconomics of epidemics
estimating and simulating a sird model of covid-19 for many countries, states, and cities
health versus wealth: on the distributional effects of controlling a pandemic
unobserved heterogeneity in income dynamics: an empirical bayes perspective
the probability approach in econometrics
going viral: forecasting the coronavirus pandemic
across the containing papers of a mathematical and physical character
macroeconomic dynamics and reallocation in an epidemic
nowcasting unemployment insurance claims in the time of covid-19
when will the covid-19 pandemic peak?
density forecasts in panel data models: a semiparametric bayesian perspective
forecasting with dynamic panel data models
estimating an inverse gamma distribution
forecasting the impact of the first wave of the covid-19 pandemic on hospital demand and deaths for the usa and european economic area countries
dealing with data gaps

key: cord-258762-vabyyx01 authors: garbey, marc; joerger, guillaume; furr, shannon title: a systems approach to assess transport and diffusion of hazardous airborne particles in a large surgical suite: potential impacts on viral airborne transmission date: 2020-07-27 journal: int j environ res public health doi: 10.3390/ijerph17155404 doc_id: 258762 cord_uid: vabyyx01 airborne transmission of viruses, such as the coronavirus 2 (sars-cov-2), in hospital systems is under debate: it has been shown that transmission of the sars-cov-2 virus goes beyond droplet dynamics, which is limited to 1 to 2 m, but it is unclear if the airborne viral load is significant enough to ensure transmission of the disease. surgical smoke can act as a carrier for tissue particles, viruses, and bacteria. to quantify airborne transmission from a physical point of view, we consider surgical smoke produced by thermal destruction of tissue during the use of electrosurgical instruments as a marker of airborne particle diffusion-transportation. surgical smoke plumes are also known to be dangerous for human health, especially to surgical staff who receive long-term exposure over the years. there are limited quantified metrics reported on the long-term effects of surgical smoke on staff's health. the purpose of this paper is to provide a mathematical framework and experimental protocol to assess the transport and diffusion of hazardous airborne particles in every large operating room suite. measurements from a network of air quality sensors gathered during a clinical study provide validation for the main part of the model.
overall, the model estimates staff exposure to airborne contamination from surgical smoke and biological material. to address the clinical implication over a long period of time, the systems approach is built upon previous work on multi-scale modeling of surgical flow in a large operating room suite and takes into account human behavior factors. there is a large debate on the possible airborne transmission of coronavirus 2 (sars-cov-2) in closed buildings [1, 2] . the spread of sars-cov-2 is still being understood, but scientists support the hypothesis of airborne diffusion of infected droplets from person to person at a distance that can be greater than two meters [3] . in the unfortunate case that an elective surgery is performed on an asymptomatic covid-19 patient who has not tested positive, one may ask if the virus can escape an operating room (or) kept under positive pressure and expose staff in peripheral areas to the disease. this question is particularly important to healthcare staff who spend multiple long shifts in a hospital system that manages covid-19 patients. to quantify airborne transmission from a physical point of view, we consider surgical smoke as a marker of airborne particle diffusion-transportation emitted from the surgical table area. surgical smoke is 95% water or steam and 5% particle material, and therefore surgical smoke can act as a carrier for tissue particles, viruses, and bacteria [4] . today, the risk of surgical smoke has clearly been established [5] [6] [7] [8] [9] [10] [11] . one of the main difficulties is that surgical smoke carries ultrafine particles (ufp) as small as 0.01 microns, which are able to bypass pulmonary filtration, as well as small particles up to several microns [10] . it was recently shown in a study that air quality, especially the concentration of fine particles, is associated with an increase in covid-19 mortality [12] .
respiratory protection devices are used to protect staff in healthcare facilities with various degrees of success [13, 14] . we propose to construct a rigorous multi-scale computational framework to address these questions and use measurements of diffusion-transportation of surgical smoke particles with off-the-shelf portable sensors to calibrate the model. this methodology addresses only the physical side of the problem and therefore does not answer the effectiveness of airborne particles in inducing covid-19. some of the difficulties encountered in such studies are that air sampling and infection may or may not be strongly correlated [15, 16] . however, it is an important step to quantify the level of exposure in order to estimate, in part, the corresponding viral load. transport and diffusion mechanisms are very effective for ufp to travel a long distance from the source in a short period of time. a 2020 report from china demonstrated that sars-cov-2 virus particles could be found in the ventilation systems in restaurants [17] and in hospital rooms of patients with covid-19, underlining how viable virus particles can travel long distances from patients [18] . clinical environments are too complex to model with the traditional modeling method of airflow and particle transportation because both the source intensity of surgical smoke [19] as well as the mechanism of propagation via door openings [20] are largely dominated by human factors. the geometric complexity of the infrastructure and of the heating, ventilation, and air conditioning (hvac) system limits the capability of computational fluid dynamics (cfd) [20] [21] [22] [23] [24] [25] [26] to predict indoor air quality and health [27] . last but not least, droplet behavior depends not only on droplet size, but also on the degree of turbulence and speed of the gas cloud, coupled with the properties of the ambient environment (temperature, humidity, and airflow) [2] .
we present in this paper a mathematical framework and experimental protocol to assess the transport and diffusion of hazardous airborne particles in a large or suite. human behavior factors are taken into account by using a systems and cyber-infrastructure approach [28-30] coupled to multi-scale modeling of surgical flow in a large or suite [31]. overall, the model estimates staff's exposure to airborne contamination, such as surgical smoke or biological hazards. validation is provided by a network of wireless air quality sensors placed at critical locations in an or suite during the initial phase of the surgical-suite-specific study. a step-by-step construction of the model, scaling up from the or scale to the surgical suite scale, will be presented; the model integrates the transport mechanism occurring at the minute scale with the surgical workflow efficiency simulation over a one-year period. to assess potential contamination from one or to another, the extent of the propagation of surgical smoke in the area adjacent to the or will be checked; this might be more significant than the level of concentration itself. to simulate the airflow and dispersion of surgical smoke, an or that is representative of the surgical suite shown in figure 1 was used. measurements for calibrations and verifications were conducted in a real or of this dimension when there were no surgeries taking place. figure 2 provides the schematic of all boundary conditions and geometric parameters. this cfd approach is used to justify and build a simplified large-scale model of the airflow in the surgical suite. a 3d cartesian coordinate system was used with length along the x-direction, width along the y-direction, and height along the z-direction. the or is 7.5 m long, 6 m wide, and has a height of 2.7 m (see figure 2). the model takes into account the architecture of the room, the operating table location, and the hvac system design in the or as well as in the hallway. 
the corridor was modeled as a rectangle 12 m long, 2.5 m wide, and 2.7 m high. the operating table is displayed as a rectangle in the middle of the or, and the anesthesia equipment is also simulated by a rectangle close to the table. the flow was computed using the ansys fluent pressure-based solver, first in steady-state mode and then in transient mode. the model's geometry was meshed using an unstructured tetrahedral grid with about 10^6 elements. the exact size of the mesh depends on the angle of the door with respect to its initial closed position, since the mesh gets refined at the interfaces. the airflow is assumed to be turbulent [25] and was modeled using the k-ε turbulence model, taking gravity into account through the boussinesq approximation in the navier-stokes equations. this is the most common model for indoor airflow simulation, in which the turbulent kinetic energy k and the turbulent dissipation rate ε are modeled. the temperature and pressure boundary conditions in the model were measured and are reported in the result section of this paper. typically, the or is kept cooler than the hallway, and the inlet vents inside the or blow air at a temperature as low as 13 °c. to match the ventilation infrastructure, the model has three different rows of inlet vents in the ceiling. the first row of 4 inlets is in the middle of the room; it blows a laminar flow directly onto the surgical table, to remove any contaminant close to the patient as fast as possible. in our model, one inlet vent is represented by 2 rectangles of 0.5 by 0.1 m each. then, there are two rows of three inlet vents on each side of the one on top of the or table: one on the left (1.7 m from the center of the or table) and one on the right (1.5 m from the middle of the or table, see figure 2). those are present to avoid any flow returning towards the operating table. 
the slightly different velocity flows for each inlet were implemented (see result section) in order to replicate the anemometry measurements (via peak meter ms6252b with an accuracy of ±2.0%) obtained near these inlets; it was noticed that the large surgical lights over the surgery field tend to obstruct part of the inlet flow. the or also contains two outlet vents, represented by two rectangles of 0.9 by 0.6 m placed on the wall at 0.1 m from the ground, with the coordinates of the middle of the bottom edge at (−3.35 m, 3.35 m, 0.1 m) and (3.35 m, −2.68 m, 0.1 m), taking the center of the or table at ground level as the reference (0, 0, 0). these outlets suck air out of the or at a rate such that they pump out less volume than the volume injected by the ceiling inlets, which creates a positive pressure of about 8 pa. pressurization is a key factor in controlling room airflow patterns in a healthcare facility. positive pressurization is used to maintain airflow from clean to less-clean spaces. the appropriate airflow offset to reach the desired pressure differential depends mostly on the quality of the construction of the room. it is difficult, if not impossible, to know what the room's leakage area is before finishing the construction and measuring the airflow. the facility management service (fms) of the hospital was able to supply the volume per unit time that the inlets blow in (1.15 m^3/s) and the volume going through the outlets (0.55 m^3/s). as the surface of the outlets is known, the velocities of the outlet vents were 0.5 m/s for the left one and 0.4 m/s for the right one, as reported in figure 2. the volume of extra air in the or is the difference: 1.15 − 0.55 = 0.6 m^3/s. due to the positive pressure of 8 pa present in the or, this additional volume leaks out of the or through either the door being left open or the narrow gap around the door when it is closed. 
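as a sanity check on the airflow numbers above, the volume balance can be reproduced in a few lines (a minimal sketch in python; all values are the ones quoted in the text):

```python
# Volume balance for the OR described in the text (FMS figures and
# measured outlet velocities).
inlet_flow = 1.15        # m^3/s blown in through the ceiling inlets
outlet_flow = 0.55       # m^3/s extracted through the two wall outlets

# Extra volume that must leak out through the door gap (or an open door),
# sustaining the ~8 Pa positive pressure.
leak_flow = inlet_flow - outlet_flow
print(round(leak_flow, 2))   # 0.6 m^3/s

# Cross-check: each outlet is a 0.9 m x 0.6 m rectangle, and the measured
# outlet velocities are 0.5 and 0.4 m/s, implying a total extraction of
# the same order as the 0.55 m^3/s supplied by the FMS.
outlet_area = 0.9 * 0.6      # m^2 per outlet
implied_outflow = outlet_area * (0.5 + 0.4)
print(round(implied_outflow, 3))
```

the implied extraction is close to, but slightly below, the fms figure, which is coherent with the stated difficulty of characterizing the room's leakage precisely.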
in that case, the free boundary surface was estimated to be 0.17 m^2. for the hallway, a uniform inflow boundary condition of 0.1 m/s was imposed in order to take into account the anemometry measurement mentioned above. this upstream boundary condition is completed by a free outlet boundary condition at the other end of the hallway. to be as realistic as possible, boundary conditions were also added for the two existing inlet vents, one located on the ceiling and the other close to the entry door of the next or, both with a velocity of 1. to validate the model, measurements of flow velocity and particle concentration were made at various locations close to specific regions of interest, the door-frame location in particular; these measurements are reported in the result section. it is unrealistic, in practice, to build a cfd model of the whole surgical suite and run this model for an extensive period of time. next, an upscaled model will be presented that uses the present cfd simulation to verify some of the key parameter values, especially those relating to transmission between ors and the hallway. as an example, consider a system of 10 identical ors aligned on one side of a hallway. each or has one door access to the hallway. this system is part of a standard or suite and represents one half of the facility in figure 1, which has an architectural design almost symmetric with respect to its two circulation side-halls. of particular interest is computing the concentration of a so-called "marker," which can be a specific gas or a set of airborne particles in the air of this or suite. this marker is generated from the location of the or's surgical table, where a surgeon is using an electrosurgical instrument that produces smoke from the thermal destruction of tissues. the marker can also be the particles resulting from the evaporation of any alcohol-based chemical used either to prep the patient or to clean the or. 
the model has two parts: first, a compartment-like model that monitors the indoor pollution [32]; second, a multi-scale agent-based model (abm) that simulates the surgical flow activity and its impact on the indoor air quality, either from the source of surgical smoke or from door openings affecting the dispersion of pollutants [31]. staff movement throughout the or suite via door openings and closings will manifestly be a key mechanism for the propagation of markers. the indoor air quality model is a linear set of differential equations that is slightly more complex than a standard compartment model, since the coefficients are stochastic, the source and output/leak terms have a built-in time delay, and the hallway requires a transport equation. the rationale for building this specific model will come out of the set of experiments described hereafter. next, the acquisition process used to identify the production of airborne contaminants will be described. for this experiment in a surgical training facility, electrosurgical energy was delivered on the surface of two pieces of pork meat, each 2 cm thick, placed on an or table. three types of energy delivery systems were compared: electrosurgery (conduction) via the covidien forcetriad monopolar device (medtronic, minneapolis, mn, usa), ultrasonic (mechanical) with the ethicon harmonic scalpel p06674 device (ethicon inc., somerville, nj, usa), and laser tissue ablation with the erbe apc (argon plasma coagulation) 2 device (erbe elektromedizin gmbh, tübingen, germany). to keep the tissue burn superficial, a pattern of parallel lines was followed with each device, always using unburned areas of the meat. the energy was delivered for a period of 30 s up to 60 s in order to produce a large quantity of smoke and thus particles. 
the measurement was done by several laser particle counters from dylos corp (riverside, ca, usa) placed at various distances from the source (http://www.dylosproducts.com/dc1700.html). they give an average particle count every minute in units u_d that correspond to particle counts per 0.01 cubic foot, i.e. per 0.00028 cubic meter (1 cubic foot = 0.0283168 cubic meter). a traditional problem with the validation of particle counts in laboratory conditions is that particles do not all have the same uniform size. according to smartair (http://smartairfilters.com/cn/en/), the dylos system output is highly correlated (r = 0.8) with a "ground truth" measurement provided by a high-end system such as the sibata ld 6s (sibata scientific technology ltd., tokyo, japan), which is claimed to be accurate within 10% in controlled laboratory conditions. according to smartair, the dylos system seems particularly accurate at the lower concentration end, which is of interest for this study's purpose. semple et al. [33] also compared the dylos system with a more expensive system, the sidepak am510 personal aerosol monitor (tsi incorporated, shoreview, mn, usa). they concluded that the dylos output agrees closely with that produced by the sidepak instrument, with a mean difference of 0.09 µg/m^3. the dylos sensors were set up to track particles of small size in the range from 0.5 to 2.5 microns, which are the sizes of biological material. the results were checked systematically by comparing the measures of several sensors at the same location to show consistency; it was also checked that the particle count drops back down to nearly zero in a clean-air room with ac equipped with high efficiency particulate air (hepa) filters. each experiment was started from an initial clean-air condition with a small particle count, fewer than 50 units, which is much less than the number of particles counted during energy delivery. 
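for readers who want to convert u_d readings into a volumetric concentration, a helper like the following can be used; the assumption that one u_d count corresponds to one particle per 0.01 cubic foot of sampled volume is our reading of the dc1700 convention, stated here explicitly:

```python
FT3_IN_M3 = 0.0283168  # 1 cubic foot in cubic meters (as in the text)

def dylos_to_per_m3(count_ud):
    """Convert a Dylos reading to particles per cubic meter.

    Assumes one u_d unit is one particle counted in a 0.01 cubic foot
    sample volume (our reading of the DC1700 convention).
    """
    sample_volume_m3 = 0.01 * FT3_IN_M3   # ~0.00028 m^3
    return count_ud / sample_volume_m3

# A reading of 50 u_d, the clean-air threshold used in the experiments:
print(round(dylos_to_per_m3(50.0)))
```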
it took about 6 min to return to the initial clean-air count after each experiment. for each experiment, the concentration increased to a maximum after a short time delay s from the time the energy was delivered; this delay depends on the distance to the source. the concentration then relaxes exponentially to zero in time. consequently, the model of source dispersion is an exponential function as follows: c(t) = 0 for t < s, and c(t) = a exp(−ρ (t − s)) for t ≥ s. the least squares fitting technique was used to interpolate the data with this function. the amplitude of the source a (see figure 3), the delay s on particle diffusion and transport to reach the sensor, and the rate of "diffusion decay" ρ > 0 were identified. the accuracy on s, which measures the time interval between the source production and the peak of the signal, cannot be better than one minute since the sensor only works at a one-minute timescale. a delay of s ≤ 1 was found to be a good approximation for all three energy devices. each experiment was repeated 4 to 5 times depending on the variability of the results. therefore, about 24 to 30 data points were available to identify the parameters a, s, and ρ for each energy device that was tested. now, the experimental protocol to assess the transport and diffusion of particles in different areas of a large or suite will be described. this set of experiments, as opposed to the previous one, was done in a large or suite late at night and on weekends, when the ors were empty and had clean air thanks to the high-efficiency hvac. a hairspray product (lamaur vitae, unscented) was used as the marker and sprayed for a duration of 1 to 2 s to track its small particles, while keeping the same positions of the dylos systems. the experiment first tested the propagation in a closed-door or with the source above the or surgical table. the spray nozzle was held facing the near-vertical direction, pointing to the ceiling. a distribution of sensors as displayed in figure 4 was used. 
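the least squares identification of a and ρ can be sketched as follows; with the delay fixed at s = 1 min, the exponential model is linear in log-space, so a plain first-degree polynomial fit suffices. the synthetic data and parameter values below are illustrative placeholders, not the measured counts:

```python
import numpy as np

S_DELAY = 1.0  # minutes; the text reports s <= 1 min for all three devices

# Synthetic one-minute readings standing in for the Dylos counts,
# generated from the delayed exponential a * exp(-rho * (t - s)) with
# a = 800 and rho = 0.7, plus small multiplicative noise.
t = np.arange(S_DELAY, 10.0)
rng = np.random.default_rng(0)
counts = 800.0 * np.exp(-0.7 * (t - S_DELAY)) * np.exp(rng.normal(0.0, 0.02, t.size))

# Least-squares identification: log c = log a - rho * (t - s) is linear
# in (t - s), so a 1st-degree polyfit recovers both parameters.
slope, intercept = np.polyfit(t - S_DELAY, np.log(counts), 1)
a_fit, rho_fit = np.exp(intercept), -slope
print(round(a_fit, 1), round(rho_fit, 3))
```

fitting in log-space weights the tail of the decay more heavily than a direct nonlinear fit would; for the noise levels of the sensors this difference is small.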
the initial observation was that all four sensors distributed along the central line of the whole or space were getting a particle count of the same order of magnitude after an average of 15 s. the mixing of particles was quite extensive within a minute by reason of the hvac input/output design in the or, and the concentrations on each sensor quickly relaxed to zero. this observation is also coherent with the results of the cfd model of the flow circulation described above. a method identical to the previous one was used to identify the key parameters a, s, and ρ characteristic of the dispersion of hairspray in the or. the model for or diffusion of particles is then dq/dt = −ρ_or q + s(t), where q denotes the global concentration of particles in each or, s denotes the source production, which is non-zero at time zero, and ρ_or denotes the diffusion decay inside the or. this simple ordinary differential equation (ode) model provides an average of particle concentration in the or at the minute timescale. a first-order implicit euler scheme with a time step dt of one minute is used: q^(n+1) = (q^n + dt s^(n+1)) / (1 + ρ_or dt). an entirely similar technique is used to describe the dynamics of particle diffusion and transport in the hallway, except that the hallway is discretized as a one-dimensional structure of consecutive hall blocks located at the same level as the or blocks. in this part of the experiment, the source is set in the hallway; see figure 1. as noticed earlier, there is a slow but significant airflow speed v_0 in the hall, pointing in the direction of the main entrance of the surgical suite, situated on the right of the map in figure 1. naturally, the high pressure of the or is designed to drive the airflow out, and the front corridor seems to be a significant outlet. at the opposite end of the hall, situated on the left of the map in figure 1, this velocity is close to zero. it is assumed that v_0(x) is an affine function, with a linear growth from 0 to 0.1 m/s at mid-hall, and a constant value beyond. 
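a minimal implementation of this implicit euler update might look like the following; the decay rate and source amplitude are illustrative values, not the fitted ones:

```python
def step_or_concentration(q, source, rho_or, dt=1.0):
    """One backward (implicit) Euler step of dq/dt = -rho_or * q + s.

    Solving q_next = q + dt * (-rho_or * q_next + source) for q_next
    gives the update below, which is unconditionally stable for any dt.
    """
    return (q + dt * source) / (1.0 + rho_or * dt)

# Example: a one-minute burst of smoke at t = 0, then pure decay.
q, trace = 0.0, []
for minute in range(10):
    source = 500.0 if minute == 0 else 0.0
    q = step_or_concentration(q, source, rho_or=0.5)
    trace.append(round(q, 1))
print(trace)
```

the unconditional stability of the implicit scheme matters here because the one-minute time step is coarse relative to the decay rates being modeled.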
the model of hallway diffusion of particles is then dp/dt = −ρ_hall p + s(t), where d/dt denotes the total derivative ∂/∂t − v_0 ∂_x, using the x coordinate system in the one-space-dimension hall model. to assess the transmission of particles from an or to the adjacent hallway with closed or doors, the same experiments were run with some of the sensors placed in the hallway, either facing the closed door or sitting at a location further down the hallway (see figure 4). as a matter of fact, the door of the or is not perfectly sealed: due to the difference between the pressure inside the or and the lower pressure in the hallway, a significant airflow, with a velocity on the order of 1 m/s, exists at the gap located between the door's edge and the door frame. a similar technique is used to represent the diffusion coefficient as well as the delay s, which is now interpreted as the time it takes for the particles to flow from the or to the hallway right outside the door. this transmission condition will be entered into the model to couple equations (2) and (3). finally, an entirely similar approach is used to get the transmission in the compartment model when the door of a specific or is wide open. in such cases, the gradient of pressure between the or and the hallway nearly vanishes. at the doorstep, we observe buoyancy-driven effects due to the difference in temperature between the or (cold air) and the hall (warm air). there is a convective flow exchange, with cold air at the bottom going out of the or and hot air at the top going into the or [34]. during our experiment with particle sensors, we were able to validate the propagation of aerosol traveling into the or from outside when the door is left open. with the cfd model, taking gravity into account, we simulated the contamination by adding a source of co2 from the inlet at the beginning of the hallway and keeping the door open. it took 18 s for the gas to reach the door and start contaminating the or. 
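the hallway equation can be sketched with an explicit upwind scheme; with the sign convention of the total derivative above, the marker is carried toward decreasing x. the block size, time step, and decay rate below are illustrative assumptions; only the affine v_0 profile reaching 0.1 m/s at mid-hall comes from the text:

```python
import numpy as np

# Explicit upwind discretization of dp/dt - v0(x) dp/dx = -rho_hall * p.
n_blocks = 10
dx = 7.5                 # m, one OR width per hall block (illustrative)
dt = 1.0                 # s, well within the CFL limit dx / max(v0)
rho_hall = 0.25 / 60.0   # 1/s, illustrative decay rate

x = (np.arange(n_blocks) + 0.5) * dx
mid_hall = 0.5 * n_blocks * dx
v0 = np.minimum(0.1, 0.1 * x / mid_hall)   # 0 at the closed end, 0.1 m/s beyond mid-hall

p = np.zeros(n_blocks)
p[5] = 100.0             # burst released in front of one OR door

for _ in range(600):     # ten minutes of transport and decay
    grad = np.empty_like(p)
    grad[:-1] = (p[1:] - p[:-1]) / dx   # upwind side: information comes from +x
    grad[-1] = (0.0 - p[-1]) / dx       # clean air enters at the far end
    p = p + dt * (v0 * grad - rho_hall * p)

print(np.round(p, 2))
```

the upwind difference is taken on the side the flow comes from, which keeps the scheme monotone (no spurious negative concentrations) as long as the cfl condition v0 dt/dx ≤ 1 holds.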
this proved the importance of keeping the door closed to maintain the positive pressure in order to control the contamination rate and nosocomial propagation in the or suite. now, the simple compartment-like model to monitor, in time and in space, the diffusion and transport of particles with intermittent source production in each or will be assembled. such a source of pollutants corresponds to either the use of some chemicals or the use of electrosurgical instruments during surgery. the goal is to get the average rate at which the staff working in the or suite is getting exposed to particle concentration emanating from surgical smoke throughout the day. potential propagation of particles that may carry biological material from one or to another is also of interest. as discussed earlier, the concentration is tracked in time and in space with a coarse time step of one minute. this time step scale is coherent with the measurement system used for particle counting. one minute is also roughly the time that the particles emitted from a point source next to the or table need to transport and diffuse throughout the or block once released. the compartment model computes the global concentration of the particles in each or as well as in each section of the hall adjacent to the or. these concentrations are denoted respectively q_j(t) for or number j at time t and p_j(t) for the corresponding section of the hall; see figure 1. the source of particles is denoted as s_j(t). in principle, s_j(t) should be non-zero for a limited period of time and follow a statistical model based on the different phases of the surgery and the knowledge of the electrosurgical instrument used during a surgical procedure [19]. the coefficients of decay are defined inside different parts of the model (ρ_or and ρ_hall), as well as the coefficients of transmission between these spaces (α_or from the or to the hall and γ_hall for the opposite direction). 
β_or represents the flow from the or to the hall when the door is open. the frequency of door openings follows a statistical model based on the current phase of the surgery; δ_door_j is a function of time that equals 1 if the door of or j is open and 0 otherwise. the simulation of the surgery schedule uses data from the smartor project [30], which will be detailed later on. only door openings on the order of a minute will be counted, and γ_hall = β_or will be assumed because the gradient of pressure between the or and hallway vanishes when the door is open. the system model of marker transport-diffusion in the or suite is: dq_j/dt = s_j(t) − ρ_or q_j − (α_or + β_or δ_door_j) q_j + γ_hall δ_door_j p_j (4), coupled with ∂p_j/∂t − v_0 ∂_x p_j = −ρ_hall p_j + (α_or + β_or δ_door_j) q_j − γ_hall δ_door_j p_j (5). an additional unknown r_j, tracking the back-flow of marker into the or coming from the hallway, can be introduced with dr_j/dt = γ_hall δ_door_j p_j − ρ_or r_j. using this equation, the number of particles going from one or to another can be counted separately. this number is expected to be very low; see the "results" section. the model (4) and (5) is not a standard box model. first, the source term has a built-in delay to simulate the transmission conditions observed. second, equation (5) is a pde, more precisely a linear transport equation. third, most of the coefficients are stochastic, especially those related to door openings and sources, which are linked to human behavior. because the system of equations is linear, the superposition principle has been implicitly used to retrieve each unknown coefficient from the experimental protocol. let us describe our surgical flow model more precisely in order to provide an accurate description of how we compute the source term s_j(t). each of the standard or stages of the surgery is attributed a state value; for example, anesthesia preparation is labeled as state = 1. the type of airborne marker expected to be released depends on those state values. for example, in state 0, the cleaning crew uses a lot of chemical products that quickly evaporate in the or. similarly, a different type of sterilization product is used to prep the patient in state 1. 
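a minimal simulation of the coupled compartment system might be sketched as follows; the coefficient values, the source schedule, and the door-opening probability are illustrative stand-ins for the fitted and observed quantities, and the hallway transport is reduced to a fraction of a hall block per minute:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative simulation: 10 ORs, 10 hall blocks, one-minute time step.
# Coefficient names follow the text; values are placeholders.
n = 10
dt = 1.0                         # minutes
rho_or, rho_hall = 0.5, 0.25     # decay rates (1/min)
alpha_or = 0.02                  # OR -> hall leak through the closed door
beta_or = 0.08                   # OR -> hall exchange when the door is open
gamma_hall = beta_or             # hall -> OR, as assumed in the text
v_hall = 0.5                     # hall blocks traveled per minute (illustrative)

q = np.zeros(n)                  # OR concentrations q_j
p = np.zeros(n)                  # hall-block concentrations p_j

for minute in range(120):
    s = np.zeros(n)
    if rng.random() < 0.2:       # OR 3 emits smoke 20% of the minutes
        s[3] = 300.0
    delta = (rng.random(n) < 0.05).astype(float)   # door-open indicators

    exchange = alpha_or + beta_or * delta          # OR -> hall rate
    dq = s - rho_or * q - exchange * q + gamma_hall * delta * p
    transport = v_hall * (np.roll(p, -1) - p)      # carried toward block 0
    transport[-1] = -v_hall * p[-1]                # clean air upstream
    dp = -rho_hall * p + exchange * q - gamma_hall * delta * p + transport
    q = q + dt * dq
    p = p + dt * dp

print(np.round(q, 1))
print(np.round(p, 1))
```

because all loss rates per minute stay below one, the explicit update keeps both concentration vectors non-negative; a stiffer parameter set would call for the implicit scheme used for the single-or model.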
in state 2, cauterization is often used for a short period of time. in state 3, various phases of the surgery require energy delivery instruments to cut tissue and access specific anatomy or tumors. a stochastic model of energy delivery is used that consists of delivering short time fractions of energy in several consecutive minutes. the parameters of that model are: the frequency of energy delivery, denoted f; the duration of the impulse, denoted ξ; and the number of repetitions, r. a uniform probabilistic distribution of events is used within these intervals of variation for each parameter. figure 5 provides a typical example of the number of door openings observed in the or at 15-min intervals. both the detection of door openings and of a patient bed coming in and out were provided by the sensors of the cyber-physical infrastructure [28]. a stochastic model of door openings will be used based on a uniform frequency of door opening during surgery, even though this distribution is non-uniform in practice and tends to concentrate at the beginning and the end of a case. in figure 5, the markers • and x on the horizontal axis correspond respectively to the entering and exiting times of the patient's bed; this example has two procedures. the model of air pollution in the surgical suite will first be tested with a simplified model of surgical flow as follows: to provide the timeline of events, the model assumes there are three surgical procedures in each or. the timeline of each surgery is such that: phase 1 and phase 5 last 12.5 min ± 5 min, phase 2 and phase 4 last 15 min ± 5 min, and phase 3, the surgery itself, lasts 65 min ± 25 min. phase 6 corresponds to a turnover time between surgeries that lasts 30 min ± 10 min. this simplified model of surgery scheduling has the correct order of time-length for each phase. its simplicity allows a sensitivity analysis to be run with respect to the key parameters of the indoor air quality model that can be easily interpreted. 
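the simplified scheduling model above can be sketched as a small generator; the textual labels attached to phases 2, 4, and 5 are our reading of the text, and the random seed is arbitrary:

```python
import random

random.seed(42)

# Three procedures per OR, each phase duration drawn uniformly in the
# stated range (mean +/- half-width, in minutes).
PHASES = [
    (1, 12.5, 5.0),   # anesthesia preparation
    (2, 15.0, 5.0),   # patient preparation (our label)
    (3, 65.0, 25.0),  # surgery itself, the smoke-generating phase
    (4, 15.0, 5.0),   # closing (our label)
    (5, 12.5, 5.0),   # emergence and patient exit (our label)
    (6, 30.0, 10.0),  # turnover between procedures
]

def one_day_schedule(n_procedures=3, start=7 * 60):
    """Return (phase, start_minute, end_minute) tuples for one OR-day,
    starting by default at 7 a.m. expressed in minutes."""
    t, timeline = float(start), []
    for _ in range(n_procedures):
        for phase, mean, half in PHASES:
            duration = random.uniform(mean - half, mean + half)
            timeline.append((phase, t, t + duration))
            t += duration
    return timeline

schedule = one_day_schedule()
day_length = schedule[-1][2] - schedule[0][1]
print(f"one OR-day covers {day_length / 60:.1f} hours")
```

with the mean durations above, an or-day averages 3 × 150 = 450 minutes of activity, which is the right order of magnitude for a three-procedure day.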
next, the model will be coupled to an abm of an existing large general-surgery suite in the hospital that has been calibrated by tracking about 1000 procedures over one year [31]. this model is complex and specific to a 20-or surgical suite of a 1700-bed hospital, which has been monitored for over two years. to assess the impact of human behavior on the transport and diffusion of surgical smoke in the surgical suite over a period of one year, a realistic abm of the surgical flow and of people's behavior is now used. a byproduct of this study is the assessment of the air quality and of the risk factors associated with surgical smoke, obtained by coupling the abm to the present model of transport and diffusion of airborne particles generated by surgical smoke. the method to construct this model is briefly explained in this paragraph. the exact description of the model goes beyond the scope of this paper's focus on air quality and has been detailed in garbey et al. [31]. the mathematical model of surgical flow is built upon observations and robust clinical data covering 1000 procedures with a noninvasive array of sensors that automatically monitor the surgical flow. to this end, several ors were equipped with sensors that capture timestamps [28, 29, 35] corresponding to the different states described in the previous section. overall, the model can simulate the or status of a large surgical suite during any clinical day and can be run over a long period of time. the model is able to reproduce the statistical distribution pattern, over a year, of performance indicators: turnover time, induction of anesthesia time, and the time between extubation and patient exit. the model characterizes the impact of human factors and of shared-resource limitations on flow efficiency. in the end, communication delays and sub-optimal or awareness in large surgical suites have significant impacts on performance and should be addressed. 
this paper concentrates on the duration of surgery state 3, which determines how long surgical smoke is generated, and on how behaviors that induce or door openings are responsible, in part, for the spread of surgical smoke and other agents. the output of the abm model of surgical flow coupled to the air quality model is the number of hours per year that staff get exposed to surgical smoke in the or and hallway. various scenarios have been run related to the rate of adoption of vacuum systems for surgical smoke and to or door openings, in order to discuss the influence of human behavior on those results. following are the results on the circulation of surgical smoke in a surgical suite, starting from a local source of emission in the or and ending with global dispersion in the suite. a detailed cfd model of the airflow in the or, along with its immediate adjacent structure, will be used to build an upper-scale, simplified model. a series of air quality measurements based on the density of particles at specific locations will be used for calibration of the model and for validation purposes. the measured rates of particles generated by various energy sources, such as monopolar cautery, argon plasma coagulation (apc), and harmonic sources, are found by testing them in an or space allocated to training, i.e. without patients. the unit used for the source of the emission is particle counts per 0.01 cubic foot; it gives the measurements an order of magnitude, from tens to thousands, by which they can be compared. small particles are found in the range of 0.5 to 2.5 microns, which are the sizes of biological-material particles like viruses. as opposed to the results reported in weld et al. [36], our off-the-shelf particle sensor does not give us access to the ufp count. a conservative estimate from the results of weld et al. [36] would be that the concentration of ufp is 2 to 3 orders of magnitude larger than what is measured for the small particles. 
in table 1, the mean, standard deviation, and diffusion coefficient of each source are reported; more precisely, α is the rate at which the pollutant concentration decreases, obtained by fitting a simple exponential decay model a exp(−αt) to the experimental data. there was no significant statistical difference between the rates of diffusion of the particles emitted by the monopolar versus the apc instruments. the coefficient of diffusion corresponding to the harmonic instrument is lower but shows strong variation. we interpret this result by the fact that the range of sizes of particles produced by the harmonic instrument is wider and the distribution of sizes is random. in some trials the particles emitted were too small, down to 0.06 microns [36], to be detected by our sensors, while in others only detectable particles were produced. as mentioned earlier, covering the sensor with a surgical facemask dropped the number of large particles to some extent, but there are always leaks on the sides. as noticed in the literature, standard surgical facemasks do not protect from ufp. a 3-dimensional (3d) cfd model is used to simulate the dispersion of a single source of pollutant in an or. the dimensions used in the model are the ones of the surgical suite where the clinical study and validations were made. every or is different, but the order of magnitude of the physical quantities is the same for each or in our clinical study. figure 2 provides the geometric details of the simulation setup, which takes into account the geometry of the room, the location of air conditioning ducts, the location of the doors, and the air leaks due to positive pressure despite closed doors. table 2 lists the boundary conditions on velocities and temperatures of the or and its adjacent hallway, obtained from measurements. the surgical smoke plume in the cfd model was simulated using an injection of co2 at the location of the or table for a duration of 10 s. 
the co2 phase was tracked in the multi-phase cfd simulation as a marker of pollution. the smaller the particle, the better its dispersion is described by a model based on gas transportation. figure 6 shows the dispersion of the plume inside and outside the or while the door is closed. dispersion into the hallway was due to the air leaks between the door of the or and its frame. verification of the simulation was obtained by refining the mesh and time step until numerical convergence was reached on the quantities of interest, in particular the density of co2 and the velocity of the flow at specific locations. table 3 provides a comparison between the different velocity values found by the model and by the direct measurements obtained at those locations. the time intervals were also computed between the emission of the pollutant and the time when an air sensor detected the pollutant inside the or, close to the door, and in the hall outside the door. the results of table 3 provide a first level of validation of the cfd simulation. in table 3, r_3^1 is the ratio of pollutant phase concentration between sensor locations 1 and 3 in the or in figure 4, and is interpreted as a small-particle density ratio as well. it is particularly interesting to notice that the flow at the door has a 3-dimensional component that is driven by the pressure gradient as well as by the temperature difference between the or and the hallway. while the or is kept under positive pressure, it loses this pressure as soon as the door is opened. because the temperature of the or is generally cooler than the temperature of the hall, we observe from the cfd that the buoyancy effect causes back-flow from the adjacent hallway when the door is opened. this might be part of the mechanism of contamination between ors. this result is consistent with air quality measurements done in controlled experimental conditions, presented hereafter. 
it was found that the mixing of contaminants from a burst source to the rest of the or is reached within a minute; it thus became apparent that a simplified compartment model with a time step of one minute could describe the or's contribution to pollutants. this upper-scale model, described in the methodology section, will be calibrated next. the identification of the model parameters from the experimental data-set corresponding to the setup in figure 4 is explained below. the experiment was designed, first, to assess the delay of pollutant transmission between the or and the hallway depending on whether the door was opened or closed (see figures 7 and 8), and second, to compute the rate at which the pollutant concentration decreases. an exponential decay was observed in the or, which is consistent with the fact that diffusion is the main mechanism, given the small velocities present inside the or. however, the hallway behaves more like a duct, with a combination of convection and diffusion running down the hallway. these measurements are consistent with the cfd simulation results shown previously. by fitting the simplified model to the controlled experiment with a single source of smoke, the coefficients of diffusion in the or and in the hallway can be retrieved, as well as the convection velocity in the hallway; see table 4. figure 7 shows a source in a closed-door or and its impact on the hallway air concentration: a 2-min delay in transmission from or to hallway and an exponential decay for each signal were observed. the diffusion coefficients in the or and the hallway depend on the hvac system, which is, by design, more effective in the or than in the hallway. therefore, the rate of decay in the or is twice as large as the rate of decay in the hallway. as reported before, the diffusion coefficient for the particle tracking setting is about the same for the spray source as it is for the monopolar or apc sources. 
The transmission with a closed OR door is not negligible: it is about 4 times smaller than with an open door. From Figure 9, the traveling-wave velocity can be reconstructed: the front travels about one OR width in a minute, i.e., v0 is about 0.1 m/s at the mid-hall location. This small velocity in the hallway could not be measured directly, but it is in agreement with the CFD simulation reported earlier. Table 4: parameters of the model obtained by fitting the outcome of single-source controlled experiments with several injection source locations; 9 measures were used to obtain this table. The coefficients of decay are ρ_or and ρ_hall, the coefficient of transmission from the OR to the hall is α_or, and β_or represents the flow from the OR to the hall when the door is open. Figure 9: effect of door opening and closing on propagation of the marker from one OR to the next OR down the hall; the curves show the sensor close to the source in the OR, 10 times the concentration down the hall, and 100 times the measured concentration in the next OR down the hall; for convenience, the exponential model fits of the experimental datasets in the main OR and in the hall are plotted as solid lines. The most important result is summarized in Figure 9: surgical smoke emitted in a single OR can reach the hallway within a minute when the OR door opens, and is diluted by a factor of roughly 10. In the unfortunate event that the door of the next OR is opened, some trace of the surgical smoke emitted by the OR upstream can flow inside the next OR down the hall; while the level of exposure to surgical smoke would be insignificant in this second OR, it is clear that the standard positive pressure established in these ORs cannot guarantee that airborne particles do not propagate from one OR to another. Over a period of several months, this rare event might be capable of propagating an airborne disease.
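The mid-hall velocity v0 ≈ 0.1 m/s quoted above comes from a transit delay between sensors. One illustrative way to estimate such a delay (not necessarily the authors' method) is the lag that maximizes the discrete cross-correlation of two traces; the 18 m sensor spacing below is hypothetical, chosen only to reproduce the order of magnitude:

```python
def transit_delay(upstream, downstream, dt):
    """Estimate the lag (in time units) that best aligns two sensor
    traces by maximizing the discrete cross-correlation."""
    n = len(upstream)
    best_lag, best_score = 0, float("-inf")
    for lag in range(n):
        score = sum(upstream[i] * downstream[i + lag]
                    for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag * dt

# hypothetical pulse seen 3 samples later at the downstream sensor
up = [0, 0, 1, 4, 1, 0, 0, 0, 0, 0]
down = [0, 0, 0, 0, 0, 1, 4, 1, 0, 0]
delay = transit_delay(up, down, dt=60.0)  # 60 s per sample
v0 = 18.0 / delay                          # hypothetical 18 m spacing
```

With these illustrative numbers the recovered velocity is 0.1 m/s, the same order of magnitude as the value reported in the text.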
In fact, the frequency of door openings of each OR can be so high, as shown in Figure 5, that the propagation of airborne diseases and the contamination of other ORs seem inevitable. Next, to systematically assess long-term exposure, we report the result obtained by coupling the air quality model with an agent-based model (ABM) of the surgical flow. Figure 10 shows a measurement done during a clinical study with 3 consecutive laparoscopic procedures during the day. The red curve accounts for the number of particles detected by the sensor inside the OR, while the blue curve provides the corresponding measurement from the hallway. Patients' registration starts at 7 a.m., before any surgery occurs, and the day lasts until all surgeries are complete. Large peaks of particle concentration were observed while the OR was being cleaned; these peaks, which correspond to the use of detergent, were removed from the OR acquisition curve. Similarly, during the preparation and closing of the patient, the sensor sometimes captured the use of chemicals when preparing the sterile field, or a leak of anesthetic gas. The red peak during the third procedure in Figure 10 most likely corresponds to an excess of surgical smoke. As expected, the concentration of particles in the hall is not strictly correlated with the emission of surgical smoke in the OR: the hallway collects pollutants from a number of ORs under positive pressure at the same time. Because of this, it is difficult to separate surgical smoke from other sources in the hallway measurement, such as chemicals used in the preparation of patients located in ORs upstream. The model is built to qualitatively reproduce the concentration of surgical smoke inside an OR and its adjacent section of the hall. In our simulation (see Figure 11), the emission of surgical smoke is restricted to the time the patient is intubated.
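The text does not specify how the cleaning-related peaks were removed from the OR acquisition curve; a simple rolling-median despiking filter is one assumed, illustrative way to strip such transients before analysis:

```python
def despike(signal, window=5, threshold=3.0):
    """Replace samples exceeding `threshold` times the local median with
    that median, a simple way to strip cleaning/chemical spikes from a
    particle-count trace before further analysis."""
    half = window // 2
    cleaned = list(signal)
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        local = sorted(signal[lo:hi])
        med = local[len(local) // 2]
        if med > 0 and signal[i] > threshold * med:
            cleaned[i] = med
    return cleaned

# illustrative trace with one detergent-like spike
trace = [10, 11, 9, 250, 10, 12, 11]
out = despike(trace)
```

The baseline samples pass through unchanged; only the isolated spike is clipped to the local median.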
This simulation was done on the whole surgical suite with a stochastic production of smoke in each OR, similar to the one reported in Meeusen et al. [19]. One observation is that the simulation reproduces the same pattern of pollutant concentration in the hallway as seen in the clinical dataset. In particular, there is no obvious correlation between the source of smoke in the OR of that simulation and the concentration in the adjacent section of the hallway. In fact, while exposure to surgical smoke in the OR is relatively intense over a short period of time and then vanishes, the pollutant stagnates in the hallway for a much longer period and therefore contributes to long-term exposure. Overall, the delay Δt2 in the transmission conditions in (4) and (5) has very little influence on the result and can be neglected. To expand the study, this simplified air quality model was then coupled to the ABM of surgical flow that reproduces the daily activity of a large surgical suite over a long period of time [31]. The model was calibrated using custom-made sensor systems placed at key locations of the surgical suite to capture the daily activity over a period of a year [29]. This OR suite, dedicated to general surgery, has about 20 ORs distributed in a layout as seen in Figure 1 and is rather typical of the activity in a large urban hospital. A simple stochastic model is assumed for the source of surgical smoke in each OR, similar to the previous one. The probability p_door ∈ (0, 1) of an OR door opening per minute is a parameter of the model. On average, one door opening every two minutes during a surgery is rather standard. This is mainly because staff may have to support logistics in various ORs at the same time, and because coordination of team activity is still done by direct conversation in the surgical workflow.
In fact, it is common knowledge that a door opening every 8 min on average would correspond to a very strict policy controlling traffic in the surgical suite, yet it would only reduce the exposure in the hallway by half. From the simulation, it is concluded that long-term exposure to surgical smoke in the hallway is of about the same order of magnitude as that in the OR. Figures 12 and 13 demonstrate the effect of the frequency of door openings on the average concentration of pollutants that a staff member is exposed to during the day. There was a noticeably low concentration at the upstream end of the halls, which was confirmed by direct measurement with the particle counter. Enforcing strict control on door openings may reduce the level of transport and diffusion of hazardous airborne particles in the hallway by half. The model can be run to test a fictitious situation, as in Figure 14, where every other OR follows an ideal practice and generates no surgical smoke at all. The usage of ideal exhaust ventilation devices during surgery in half of the ORs has a direct linear correlation with the rate of exposure in the hallway, and it seems to be the most efficient technique to reduce long-term exposure: it cuts down the staff's exposure to surgical smoke in the hallway by half. About half a million healthcare professionals are exposed daily to surgical smoke in their clinical activities. The transport and diffusion of hazardous airborne particles generated by surgical smoke, such as viruses, and their long-term effects on staff have not yet been studied systematically. The debate on the impact of surgical smoke on patients' and staff's health is reminiscent of the controversy over airborne hazards from anesthetic gas [23,38-41].
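The door-opening effect described above can be caricatured with a per-minute two-compartment balance. Everything below is an illustrative sketch, not the paper's model: the decay rates follow the reported 2:1 OR-to-hall ratio, the closed/open transfer fractions are set in a 1:4 ratio consistent with the closed-door transmission reported earlier, and the source strength is arbitrary.

```python
import random

def simulate_exposure(p_door, minutes=480, seed=1):
    """Per-minute two-compartment sketch: an OR source feeds a hallway
    compartment through a door whose opening is Bernoulli(p_door)."""
    random.seed(seed)
    rho_or, rho_hall = 0.20, 0.10   # assumed decay rates (OR twice hall)
    alpha, beta = 0.01, 0.04        # closed/open-door transfer fractions
    c_or = c_hall = 0.0
    hall_dose = 0.0
    for _ in range(minutes):
        c_or += 1.0                 # steady smoke source in the OR
        leak = beta if random.random() < p_door else alpha
        moved = leak * c_or
        c_or -= moved
        c_hall += moved
        c_or *= (1.0 - rho_or)      # HVAC clearance in the OR
        c_hall *= (1.0 - rho_hall)  # weaker clearance in the hall
        hall_dose += c_hall         # cumulative staff exposure proxy
    return hall_dose

busy = simulate_exposure(0.5)       # ~one opening every 2 minutes
strict = simulate_exposure(0.125)   # ~one opening every 8 minutes
```

With the fixed seed, the stricter door policy yields a markedly lower cumulative hallway dose, in line with the text's claim that traffic control reduces hallway exposure.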
The national study led by the American Society of Anesthesiologists established that "female members in the operating room-exposed group were subject to increased risks of spontaneous abortion, congenital anomalies in their children, cancer, and hepatic and renal disease." While the link with waste anesthetic gas (WAG) was not clearly established at that time, except in animal studies, the stream of work initiated in the 70's [23,41] eventually resulted in better management of WAG "by always using scavenging systems, by periodically testing anesthetic machines for gas leaks, and by not emptying or filling vaporizers" [42]. Unfortunately, the efficiency of surgical masks at preventing virus transmission or surgical smoke inhalation is usually tested with non-biological markers, while their use in hospitals is mostly against airborne biological particles. Standard surgical masks and filtration techniques are not effective on UFP, which include viruses and bacteria; for example, SARS-CoV-2 has a size between 0.06 and 0.14 microns [43]. Long et al. showed in a new meta-analysis that there was no significant difference in effectiveness between surgical masks and N95 masks against laboratory-confirmed respiratory viral infections [13], especially at higher inhalation flow rates [14]. Seongman et al. showed that, for viruses, the effectiveness of surgical masks (and of cotton-based masks in that paper) depends on virus concentration and inhalation flow rate, and also found a higher concentration of viral load on the outside of the mask than on the inside [44]. Yang et al. built a multi-criteria decision-making method based on the novel concept of the spherical normal fuzzy set to assist healthcare staff in deciding which mask to wear [45].
An Italian research team is underlining a possible correlation between the concentration of particulate matter (PM) and the propagation of the virus, for example in the north of Italy, where industrial pollution is high [46]. A high concentration of PM could be a vector of propagation and needs to be carefully monitored inside buildings, especially hospitals. Knowing where these particles are, and in what concentration, then seems the best protection and awareness for staff, who can avoid staying in contact with them for too long. The deposition rate of these particles is not addressed in this paper, and researchers are still debating the lifetime of the SARS-CoV-2 virus on surfaces [47]. A method to construct a surgical-suite-specific model of the transport-diffusion of airborne particles can be calibrated quickly with cost-effective wireless particle counters. Coupling this indoor air quality model to the previous multi-scale model of surgical flow [28-30] allows quantification of surgical smoke exposure across long periods of time and provides a rationale for recommendations. As a matter of fact, the ABM of surgical workflow gives insight into the human behavioral factor, which can be included in the analysis. This work may expose rare events, such as contamination from one OR to another, which, accumulated over months, become a tangible risk. This study has potential because it can run over long periods of time and can address the complexity of hundreds of staff's spatiotemporal behaviors in a large OR suite. The CFD model requires detailed geometric and boundary conditions to be reliable, and the k-ε turbulence model is an approximation that has its own limits as well. Running a CFD model is a tedious process, both in setting up the mesh and simulation parameters and in terms of the central processing unit (CPU) time required.
CFD was used here only to test the components of a hybrid stochastic compartment model that incorporates the mechanism of diffusion-transport of airborne particles at the surgical-suite scale over a one-year period. A coarse statistical model was used for the source of surgical smoke: the actual generation of surgical smoke depends on the surgery team, the type of procedure, and many more parameters. However, the capability to monitor such parameters non-invasively using appropriate sensors via the cyber-physical system is available. The hybrid partial differential equation (PDE) compartment model provides a first-order approximation of average exposure at the room scale. The delay in the transmission conditions between the OR and the hall in Equations (4) and (5) is not essential to reproduce the result on daily exposure to smoke; meanwhile, the uncertainty of the HVAC input/output introduces a much larger error. Furthermore, deposition of particles on the OR's surfaces was not taken into account; the deposition of UFP may be expected to be negligible [22]. A low-accuracy model that carries an error of the order of 20% can, however, be conclusive for this study. In conclusion, it is particularly important to recognize the impact of door design and human behavior when considering hazardous airborne particles spreading throughout a surgical suite. OR doors constantly leak air, depending on the difference in pressure with the outside hall, and they contribute to the transport of particles throughout the surgical suite. The door-opening effect depends on the motion of the door and also on the difference in temperature between the OR and the hallway. Some of these negative impacts can be controlled by a better design of the door and of the temperature control, in order to work with a more cost-effective HVAC design.
The benefit of positive pressure in the OR is still canceled by door openings, inducing possible back-flow and contamination from the hallway, especially when the door stays open for several minutes. Therefore, efficient movements by personnel may improve indoor air quality and should be quantified, and the architectural design of the OR suite should optimize the circulation of staff and patient movement activities. The next important step in the modeling, to address the complementary aspect of biological transmission versus physical transportation, is to correlate the database of staff's pulmonary events with the study's findings in order to recognize these rare events. We can then translate the quantitative model of surgical smoke transport into a risk assessment for staff's health. The model should be surgery-specific: an efficient cyber-physical infrastructure should non-invasively monitor energy usage and smoke presence to instantly deliver awareness of practices that can improve air quality. A factor that has been neglected is the deposition of surgical smoke in the common storage area (see Figure 1), to which all ORs have direct access via their back doors; this may offer a different mechanism of propagation of biological material. In the current study, quantification of the surgical smoke concentration in the hallway, of the duration of exposure along the year, and of the mechanism of propagation of hazardous airborne particles from one OR to another was feasible. On the practical side, an automatic sliding OR door seems to be a better solution than a traditional rotating door, which acts as a pump. The analysis can also be extended to address the problem of the optimum placement of UV lights in the hallway to improve air quality in an efficient and controlled way. Finally, the importance of AORN's guideline on the use of a vacuum system during surgery needs to be reinforced at a time when elective surgery may involve asymptomatic COVID-19 patients.
Author contributions: this project is highly interdisciplinary and required all co-authors' contributions to establish the concept and reach the goal of the paper. M.G. led the project and designed the overall framework, including the hybrid model, the agent-based clinical model, and the MATLAB code implementation. G.J. built the CFD model at the OR scale, participated in the design of the overall method, and ran the experiments with air quality sensors required for validation. S.F. participated in the experiments with air quality sensors and with OR door activity sensors. All authors contributed to the redaction of this publication. All authors have read and agreed to the published version of the manuscript.

Funding: this research received no external funding. The authors declare no conflict of interest.

References (titles as recovered from the extraction):
- Airborne transmission of SARS-CoV-2: the world should face the reality
- Turbulent gas clouds and respiratory pathogen emissions: potential implications for reducing transmission of COVID-19
- Airborne transmission route of COVID-19: why 2 meters/6 feet of inter-personal distance could not be enough
- Awareness of surgical smoke hazards and enhancement of surgical smoke prevention among the gynecologists
- Surgical smoke: a review of the literature. Is this just a lot of hot air?
- Surgical smoke: a health hazard in the operating theatre. A study to quantify exposure and a survey of the use of smoke extractor systems in UK plastic surgery units
- Secondhand smoke in the operating room? Precautionary practices lacking for surgical smoke
- Smoke from laser surgery: is there a health hazard?
- Papillomavirus in the vapor of carbon dioxide laser-treated verrucae
- Electrosurgical smoke: ultrafine particle measurements and work environment quality in different operating theatres
- In vitro toxicological evaluation of surgical smoke from human tissue
- Exposure to air pollution and COVID-19 mortality in the United States: a nationwide cross-sectional study
- Effectiveness of N95 respirators versus surgical masks against influenza: a systematic review and meta-analysis
- Do N95 respirators provide 95% protection level against airborne viruses, and how adequate are surgical masks?
- Predicting bacterial populations based on airborne particulates: a study performed in nonlaminar flow operating rooms during joint arthroplasty surgery
- Can particulate air sampling predict microbial load in operating theatres for arthroplasty?
- COVID-19 outbreak associated with air conditioning in restaurant
- Air, surface environmental, and personal protective equipment contamination by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) from a symptomatic patient
- Different types of door-opening motions as contributing factors to containment failures in hospital isolation rooms
- Movement of airborne contaminants in a hospital isolation room
- Airflow patterns due to door motion and pressurization in hospital isolation rooms
- Anesthesia, pregnancy, and miscarriage: a study of operating room nurses and anesthetists
- Multizone modeling of strategies to reduce the spread of airborne infectious agents in healthcare facilities
- Ventilation of buildings
- Numerical and experimental analysis of airborne particles control in an operating theater
- Editorial: indoor air quality and health
- A cyber-physical system to improve the management of a large suite of operating rooms
- The smartOR: a distributed sensor network to improve operating room efficiency
- A robust and non-obtrusive automatic event tracking system for operating room management to improve patient care
- Multiscale modeling of surgical flow in a large operating room suite: understanding the mechanism of accumulation of delays in clinical practice
- Using a new, low-cost air quality sensor to quantify second-hand smoke (SHS) levels in homes
- Door-opening motion can potentially lead to a transient breakdown in negative-pressure isolation conditions: the importance of vorticity and buoyancy airflows
- An intelligent hospital operating room to improve patient health care
- Analysis of surgical smoke produced by various energy-based instruments and effect on laparoscopic visibility
- Engineer's HVAC handbook
- American Society of Anesthesiologists Ad Hoc Committee: occupational disease among operating room personnel, national study
- Occupational hazards for operating room-based physicians: analysis of data from the United States and the United Kingdom
- Spontaneous abortions and malformations in the offspring of nurses exposed to anaesthetic gases, cytostatic drugs, and other potential hazards in hospitals, based on registered information of outcome
- Miscarriages among operating theatre staff
- Occupational exposure to inhaled anesthetic: is it a concern for pregnant women?
- A novel coronavirus from patients with pneumonia in China
- Effectiveness of surgical and cotton masks in blocking SARS-CoV-2: a controlled comparison in 4 patients
- Decision support algorithm for selecting an antivirus mask over COVID-19 pandemic under spherical normal fuzzy environment
- Searching for SARS-CoV-2 on particulate matter: a possible early indicator of epidemic recurrence
- COVID-19 surface persistence: a recent data summary and its importance for medical and dental settings

key: cord-294586-95iwcocn authors: Kwuimy, C. A. K.; Nazari, Foad; Jiao, Xun; Rohani, Pejman; Nataraj, C.
title: Nonlinear dynamic analysis of an epidemiological model for COVID-19 including public behavior and government action date: 2020-07-16 journal: Nonlinear Dyn doi: 10.1007/s11071-020-05815-z doc_id: 294586 cord_uid: 95iwcocn

This paper is concerned with nonlinear modeling and analysis of the COVID-19 pandemic currently ravaging the planet. There are two objectives: to arrive at an appropriate model that captures the collected data faithfully, and to use that as a basis to explore the nonlinear behavior. We use a nonlinear susceptible, exposed, infectious and removed (SEIR) transmission model with added behavioral and government-policy dynamics. We develop a genetic algorithm technique to identify key model parameters employing COVID-19 data from South Korea. Stability, bifurcations and dynamic behavior are analyzed. Parametric analysis reveals conditions for sustained epidemic equilibria to occur. This work points to the value of nonlinear dynamic analysis in pandemic modeling and demonstrates the dramatic influence of social and government behavior on disease dynamics.

Coronavirus disease 2019 (COVID-19) is an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) that was first identified in China in early December 2019. It has since become a global pandemic devastating the health, economy and lives of billions of people all over the world, and has brought into sharp focus the need for accurate modeling of infectious diseases. Government policies globally are in fact largely being driven by statistical analyses loosely based on the nonlinear mathematical models that underlie epidemiology. As we write this paper, there is also a rising controversy about the predictive power of these models. The crux of the matter is that there is a trade-off between economic disruptions and deaths.
If the model predictions are incorrect through overprediction, we may be creating mass unemployment and hurting billions of lives by causing economic deprivation. On the other hand, if the model predictions are wrong through underprediction, then too many unnecessary deaths would occur. This quandary that most political leaders find themselves in points to the need for high accuracy in the models. Mathematical modeling in epidemiology has a long history dating back to early models by Bernoulli in the eighteenth century [1,2], although most current research uses models built on those developed in the 1930s by Kermack and McKendrick [3-5]. These are called compartment models and constitute a set of nonlinear ordinary differential equations, where the state variables represent the population numbers in the various stages of infectious disease progression, described below [6].

- Susceptible individuals (S): there is no detectable level of pathogens, and the individual's immune system has not developed a specific response to the disease-causing pathogen.
- Exposed individuals (E): the individual has come into contact with an infected person and is infected, but exhibits no obvious symptoms and carries a level of the pathogen too low to sustain transmission to other hosts.
- Infected individuals (I): the number of pathogens has increased to the point that it is now possible to transmit to other susceptible individuals.
- Removed individuals (R): the individual's immune system has possibly won the battle and reduced the number of parasites significantly, so that he/she is no longer infectious; or the individual has been isolated from the population; or, alas, he/she has succumbed to the disease and died. In all of these cases, the individual is said to be removed.

Note that it is common practice to model the number of individuals in each of the above categories as fractions of the nominal population.
We should also observe that other potential variables could be included to account for quarantines, vaccination, etc. The key factors that govern the dynamics are the growth rate of the pathogen and the level of interaction between the pathogen and the host's immune response. As in all modeling, we have to make a compromise between predictive accuracy and complexity. In addition, since we are using real data, the task of estimating accurate parameters becomes intractable, if not impossible, with a very complex model. Considering all these factors, we adopt an SEIR model (describing susceptible, exposed, infected and removed individuals), as described further in the sequel. We modify the SEIR model with two important features: the effect of government action and that of public reaction. These two behavioral actions represent social dynamical variables and are especially relevant to the accuracy of predictions, as we will show. Of all the nonlinear phenomena we may expect to find, endemic equilibrium points are probably the most critical to identify; that is to say, we are interested in knowing under what conditions the disease will persist and not vanish. Particularly with growing interest in the impact of the COVID-19 virus, there has been an explosion of research papers on modeling and prediction, and it is hence not possible to refer to all, or even a large percentage, of them. What follows is hence a snapshot focusing on the subject of the current paper. As mentioned earlier, epidemiological models have a rich history after Kermack's original work. There are several excellent and modern textbooks [6-9] that describe the fundamental mathematics of epidemiology and discuss the relevance to real historical data of infectious diseases, and we refer the reader to them for a clearer understanding of the model assumptions, derivations and implications.
Hethcote's paper [10] is an especially instructive review, and [11] is another that skews toward policy decisions. In terms of nonlinear dynamics, early work by [12,13] analyzed the effect of seasonal fluctuations as well as contact-rate periodicity in what essentially becomes a forced-response problem resulting in harmonic and subharmonic resonances. Several authors have analyzed the occurrence of periodic solutions through Hopf bifurcations in an SEIR model due to the presence of time delays and nonlinear incidence rates [14-17]. Schwartz and Smith [18] discovered infinite subharmonic bifurcations in a similar seasonally forced model, while [19] analyzed bifurcations in the context of limited hospital resources. Buonomo et al. [20] is a contemporary review summarizing the literature on seasonal dynamics. Chaotic motion has also been documented in [21,22]. Finally, Martcheva [8] provides a clear exposition of nonlinear dynamic phenomena in her monograph. We should note that the key parameters that can be quite powerful in estimating, and controlling, the spread of epidemics are the so-called reproduction number (R0) and the incubation period. In particular, it can be shown even with the simplest models that the disease will persist if R0 > 1 and will die out if R0 < 1. Many of the control techniques that governments use are focused on achieving this goal by reducing the transmission rate, which eventually controls R0. From the point of view of mathematical analysis, this creates an interesting situation of a time-varying parameter that is usually discontinuous, as government policies are often implemented like step functions. It should be noted that the incubation period is characteristic of the virus and is less under our control; it has been estimated to be 6-7 days [23,24]. As mentioned earlier, COVID-19 has spawned a rich collection of publications, and we do not deem it necessary to document them here.
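The persistence threshold quoted above (R0 > 1) can be made concrete for the standard SEIR model, where the textbook expression for the basic reproduction number is the probability of surviving the latent period times the number of secondary cases produced while infectious. A small illustrative helper (symbols are the conventional beta, sigma, gamma, mu, not the notation used later in this paper; the numerical values are examples only):

```python
def seir_r0(beta, sigma, gamma, mu):
    """Basic reproduction number of the standard SEIR model:
    (sigma / (sigma + mu)) is the probability of surviving the latent
    period; (beta / (gamma + mu)) is the number of secondary infections
    produced while infectious."""
    return (sigma / (sigma + mu)) * (beta / (gamma + mu))

def epidemic_persists(beta, sigma, gamma, mu):
    """Threshold condition: the disease persists iff R0 > 1."""
    return seir_r0(beta, sigma, gamma, mu) > 1.0

# example: 6.5-day latent period, 14-day infectious period, no demography
r0 = seir_r0(beta=0.4, sigma=1 / 6.5, gamma=1 / 14, mu=0.0)
```

Reducing beta (the lever behind lockdowns and distancing) drives R0 below 1 and tips the system from persistence to extinction of the outbreak.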
Nevertheless, it is interesting to note the rapid revelations that have come out of these admittedly short-term studies, many of them focusing on data from Wuhan, China, where the virus apparently originated. WHO [25] reports that the earliest infections were identified there around the first of December and that the infections declined by the end of February with strong government action as well as public reaction. The crude fatality rate was estimated to be a shockingly high 3.8%, although the real number is in all likelihood much lower, since the number of infected individuals is heavily undercounted due to logistical limitations in testing, and given that a significant segment of the population is probably infected but asymptomatic. In quick studies, several researchers [26-28] have estimated the essential epidemiological parameters using early data from Wuhan, China; in particular, they found R0 to be in the range of 2-3. Liu et al. [29] estimated the reproduction number to be 2.7, which is larger than that of the earlier SARS epidemic and would make COVID-19 more dangerous than SARS. Kucharski et al. [30] estimated that the travel restrictions the Chinese government imposed brought R0 down from 2.35 to 1.05, effectively bringing the infections in Wuhan under control. Even more impressive was the effect of aggressive restrictions on the Diamond Princess cruise ship, which were estimated to reduce R0 from a devastating 14.8 to a more manageable 1.8 [31]. Several papers have been published attempting to estimate the growth in other areas of China and the world. Wu et al. [32] estimated R0 to be 2.7 and predicted similar transmission rates for other cities in China, and [33], published in mid-March assuming a reproduction number of 2.4, suggested mitigation strategies for various countries, principally the US and the UK.
This last report was quite influential and led these two governments to start implementing policies with the objective of "flattening the curve" of cumulative infections. The focus of our study is twofold.

- We select a dataset for COVID-19 that is reasonably complete and accurate, and develop a mathematical model that is best able to represent the data.
- Given the above fitted model as a starting point, we wish to explore the fundamental nonlinear dynamics of the system and perform a parametric analysis to explore the effect of social dynamics.

The reason we use actual data (in this case, South Korea's) is to keep us grounded in reality and to anchor our parametric studies around this particular situation. In addition, we expect that a parametric analysis will show the tremendous implication of various actions on the progression of the disease. In general, our analysis is intended to be relevant to the current situation. Given that, as of the writing of this paper, the COVID-19 situation is still evolving with considerable uncertainty about the future, we wish to use this paper to validate the importance of mathematical modeling in general, and nonlinear dynamic analysis in particular, to enhance our insights. Building on the above objectives, the rest of the paper is organized as follows. First, we describe the modified SEIR mathematical model we employ in this study. Next, we describe the data collection and its properties. Then, we describe the numerical algorithm we employed and coaxed to get the best parametric fits. The next section carries out the nonlinear dynamic analysis and describes the interesting results we have achieved. Finally, we discuss the implications of the model and the results, and end with a conclusion. We adopt the susceptible-exposed-infectious-removed (SEIR) framework with a total population size of N. In this model, S, E and I represent the susceptible, exposed and infectious populations, and R represents the removed population.
For completeness, it is best to start with a standard SEIR model, as illustrated in Fig. 1 [6,8]:

S' = lN - bSI/N - lS,
E' = bSI/N - (r + l)E,
I' = rE - (c + l)I,
R' = cI - lR,

where ' denotes the derivative with respect to time. In this model, b is the transmission rate, l is the death (and emigration) rate, r, the incubation rate, is the reciprocal of the latent period (assumed to be the same as the incubation period in this model), and c is the removal rate, hence the reciprocal of the recovery period (if removal is due to recovery). Note that E represents those who are exposed but not yet infectious. We make two modifications to the standard model, as described below. The first modification concerns the specific nature of COVID-19, namely the fact that infected people can be contagious during the incubation period, before they show symptoms. Hence, it is possible that susceptible individuals would have had contact with individuals in both the exposed and infected categories. Here, we model the two paths from S to E using two values of b, say b1 and b2. Emulating [34,35], we assume that b2 = b1/2, i.e., that the probability of contact with asymptomatic infected individuals is half the probability of contact with exposed individuals. The modified model now becomes:

S' = lN - (b1 E + b2 I)S/N - lS,
E' = (b1 E + b2 I)S/N - (r + l)E,
I' = rE - (c + l)I,
R' = cI - lR.

The second modification, illustrated in Fig. 2, concerns the influence of two important sociological (and, arguably, political) parameters: social behavior and government policy. Here, we consider the transmission rates to be variable and to change with these parameters [36,37]. The modified model is then obtained by replacing the constant rates b1 and b2 with b1 ζ and b2 ζ, where ζ is an infection function in which a represents the strength of the government action and j is the strength of the public response. Note that D is a new state variable representing social behavioral dynamics.
d represents the strength of public perception of risk, 1/k is the mean period of public response and the model reflects the fact that public reaction would increase when more people get infected and would naturally diminish over time.
3 parameter identification
we use the data from south korea as our dataset for model fitting for several reasons. compared to usa, where the testing kits are in significant shortage, and china, in particular wuhan, where the infected cases went up abruptly in a short period and hence massive testing might not have been available, the south korean government was prepared with appropriate emergency measures since january 20, when it changed its infectious disease alert (in the national crisis management system) category from level 1 (blue) to level 2 (yellow) [38] . such measures provided massive testing capability in south korea to enable one of the most accurate datasets available. we examined various databases of south korea and finally selected the covid-19 data repository by the center for systems science and engineering (csse) at johns hopkins university [39] due to its complete record and accessible interface. in particular, the database provides time-series data containing daily updates on the new infected cases, death cases and recovered cases, all in a comma-separated values (csv) file format that is ready to be read and manipulated using standard software tools such as matlab. inspired by charles darwin's theory of natural evolution, holland introduced and popularized general-purpose search algorithms that use principles of natural population genetics to evolve solutions to problems, called genetic algorithms (ga) [40] . the basic idea in genetic algorithms is that evolution will choose the fittest species over time.
through emulation of the natural evolution of biological organisms, ga produces a population of individuals (potential solutions in each iteration) to search the solution space of the problem and evolves them through generations to approach the optimal solution. in each generation, the fitness of individuals is evaluated using an objective function and the fittest ones have a higher probability to participate in the offspring production process of the next generation. three main types of operators are employed in ga to guide it toward a solution: -selection to choose between the solutions; -mutation to create and keep genetic diversity; and -crossover to combine the existing solutions into new ones. finally, when the stopping criterion is met, the best individual is presented by ga as the solution to the optimization problem. in this part of the study, the objective is to identify the parameters of the model in such a way that the simulated data match the real data as closely as possible and then use the tuned model to analyze and forecast the spread of covid-19 in the future. the simulated data are obtained by numerically solving the model in eq. (2) using an integration algorithm (we used a sixth-order runge-kutta algorithm). to accomplish the first part, namely parameter identification, we use ga to find the parameter values which minimize the cost function between the model prediction and the real data. we devise the cost function based on a weighted sum of the mean square error for both infected and removed data. furthermore, as the main purpose of the model is to predict the future, and as the error at the end of the training time span is reflected significantly in the future time evolution, a penalty factor was included in the cost function for the end points. the cost function f is hence defined as follows: where r, i, r, m, s and e stand for removed cases, infected cases, real data, model-predicted data, start date and end date, respectively.
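the displayed formula for f did not survive extraction; the sketch below is one plausible reading of the description (weighted mean-square error over the training span plus an end-point penalty a_p), not the paper's verbatim formula.

```python
def fitting_cost(i_model, i_real, r_model, r_real, w_i=0.5, a_p=10.0):
    """Weighted mean-square error of the infected (weight w_i) and
    removed (weight 1 - w_i) series, plus a penalty a_p on the squared
    errors at the last sample of the training span."""
    def mse(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    end_penalty = a_p * ((i_model[-1] - i_real[-1]) ** 2
                         + (r_model[-1] - r_real[-1]) ** 2)
    return (w_i * mse(i_model, i_real)
            + (1.0 - w_i) * mse(r_model, r_real)
            + end_penalty)
```

with w_i = 0.5 both series weigh equally, and a_p = 10 makes a mismatch on the last day cost much more than the same squared error averaged over the whole span, as the text explains.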
also, wi and 1 - wi are, respectively, the infected and removed patient weights, and ap is the penalty factor for the end points. for this part of the study, the model described in sect. 2 was used with the assumption that l and k are zero. the target parameters are b1, b2 and c, while r was assumed to be 0.14, equivalent to an incubation period of 7 days for covid-19 [41] . we will consider the population to be constant; in other words, we assume that n does not change. this means that the natural mortality (including emigration/immigration) rate (l) and the birth rate (k) are zero. this assumption is reasonable over the short time period of analysis. as explained earlier in this study, factors like government actions can significantly affect the trend and pattern of disease spread and, accordingly, the seir model parameters. so, in this study, we solved the parameter identification problem for two separate time spans, i.e., uncontrolled and controlled [42] . in total, 108 days of recorded data were employed in this study (jan 22, 2020 to may 8, 2020). the uncontrolled data were taken for the first 40 days (jan 22, 2020 to march 1, 2020) and the controlled data for the next 68 days (march 2, 2020 to may 8, 2020). out of the 68 days of the controlled time span, the first 40 days were used for model tuning and the next 28 days for evaluating the performance of the model in forecasting the unseen data. the total population of south korea was taken to be 51,269,185 from standard sources. the process of optimum selection of the optimization variables was accomplished with a population size of 200 with a crossover probability of 0.8 for 300 generations. for both the uncontrolled and controlled time spans, the values of ap and wi were 10 and 0.5, respectively. wi = 0.5 means that the model-predicted removed and infection rates have the same weights in the cost function and so the optimization algorithm tries to make both of them close to the real data, simultaneously and equally.
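the time-span bookkeeping above can be sketched directly; the index arithmetic is mine and simply mirrors the day counts quoted in the text.

```python
def split_series(series):
    """Split a 108-day daily series (jan 22 - may 8, 2020) into the
    uncontrolled span (first 40 days), the controlled tuning span
    (next 40 days) and the controlled forecast span (last 28 days)."""
    if len(series) != 108:
        raise ValueError("expected 108 daily samples")
    uncontrolled = series[:40]      # jan 22 - mar 1
    ctrl_tuning = series[40:80]     # mar 2 onward, used for fitting
    ctrl_forecast = series[80:]     # final 28 days of unseen data
    return uncontrolled, ctrl_tuning, ctrl_forecast
```

the fitted model is tuned on the first two segments and judged only on the third, which is how the forecast quality for days 81-108 is assessed in the text.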
also, ap = 10 means that the squared error of the model infection and removed rates at each end point has ten times more effect on the cost function than the mean square error over all 40 days, and this forces the model to be close to the real data at the end points. the initial numbers of infected and removed cases for the seir model in both periods were taken from the real data, i.e., i0 = 1 and r0 = 0 for the uncontrolled and i0 = 3736 and r0 = 47 for the controlled time span. also, due to the lack of e0 (initial number of exposed individuals) in the available dataset, it was assumed to be two times i0. the trend of optimal tuning of the model parameters for the controlled time span is shown in fig. 3 . the convergence of the best fitness value range, including best, mean and worst values, to an optimum condition over 300 generations is demonstrated in logarithmic form in this figure. the ga parameter identification results are presented in table 1 . as can be seen in this table, the values of the model parameters changed significantly with the transition from the uncontrolled to the controlled time span owing to the strong actions which were imposed to control the disease transmission in south korea. the effect of this change in the parameter values is clearly seen in the comparison between the trend of individual numbers in the uncontrolled (fig. 4) and controlled (fig. 5) time spans. the sharp drop in the number of active infected individuals and the reduction in the growing slope of cumulative removed cases show that the actions that have been taken in this country to control the covid-19 spread have been quite successful. the comparison of model and real data for the first 80 days indicates that parameter identification for both uncontrolled and controlled conditions has been performed appropriately, and there is good agreement between them. also, it is observed that the model was able to forecast the unseen data of days 81-108 quite well.
nevertheless, there is still a difference between the real and model-predicted data. some possible reasons include perhaps overly simplistic modeling of sociological behavior and government actions and inaccuracy in the assumed model parameters like r, e0, k and l. it should also be noted that the number of infected individuals is a measure of the amount of testing that was done, which has not been comprehensive, and hence the numbers can be inaccurate. for all these reasons, the fluctuations seen in the real data are not predicted precisely by the model, but it is clear that the general trends of variation in both infection and removed rates are quite similar. this section focuses on the disease extinction or persistence, which is determined by the stability of the disease-free equilibrium and the existence of an endemic equilibrium. prevention and control of covid-19 epidemics require a better understanding of its mode of dissemination as well as the impacts of control strategies. the analysis considers a naïve scenario where there is no governmental action, which is unlikely but will provide a baseline to appreciate the effects of the action. in the second and third scenarios, we consider the effects of individual reaction and of the governmental action. in this scenario, the infection function captures the possibilities of new infection by both infected and exposed individuals. the corresponding model is shown in eq. (2). proof: in order to compute the expression of the equilibrium points, we set the time derivatives to zero (steady state) and solve the corresponding algebraic equations. it is obvious that e0 = (k/l, 0, 0, 0) is a trivial solution of eq. (9) . e0 is called the disease-free equilibrium since it is obtained for i = e = 0 and the corresponding infection function ç is zero. for i ≠ 0, the model in eq. (2) has a nonzero solution e1 given by eq. (7).
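the disease-free equilibrium can be checked numerically on a reconstruction of the model. the right-hand side below is my guess at the structure behind eq. (2) (birth rate k, death rate l, incubation rate r_inc, removal rate c_rem, two transmission rates); it is chosen so that e0 = (k/l, 0, 0, 0) is indeed a steady state, as stated above, but it is not the paper's verbatim system.

```python
def rhs(state, k, l, b1, b2, r_inc, c_rem):
    """Right-hand side of a demographic seir with two transmission
    rates; states are (s, e, i, rm)."""
    s, e, i, rm = state
    infection = (b1 * e + b2 * i) * s
    ds = k - l * s - infection
    de = infection - (l + r_inc) * e
    di = r_inc * e - (l + c_rem) * i
    drm = c_rem * i - l * rm
    return (ds, de, di, drm)


# at the disease-free equilibrium e0 = (k/l, 0, 0, 0) every derivative
# vanishes: births exactly balance natural deaths and nobody is infected
k, l = 0.02, 0.01
dfe = (k / l, 0.0, 0.0, 0.0)
residual = rhs(dfe, k, l, b1=0.4, b2=0.2, r_inc=0.14, c_rem=0.05)
```

plugging any other parameter values into `rhs` at `dfe` gives the same zero residual, since the infection term vanishes whenever e = i = 0.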
the stability of the equilibrium points e0 and e1 is obtained from the routh-hurwitz criterion for stability, which states that the equilibrium state is stable if the roots of the characteristic polynomial all have negative real part. the jacobian matrix of the system is obtained as j = [-a11 - l, -a12, -a13, 0; a11, a12 - (l + r), a13, 0; ...]. the characteristic polynomial for the dfe follows, and the system is stable if the roots of the characteristic equation eq. (11) all have negative real part; this is satisfied if r0 < 1. for the endemic equilibrium, the steady-state system in eq. (9) can be solved to obtain eq. (7) . the coefficients of the characteristic polynomial are given as a2 = b2 s0 - l(r0 - 1) - (c + r + 3l), a1 = a33(a2 + a33) + r a13 + (a0 - a13 r l)/a33, a0 = l(c + l)(r + l)(r0 - 1). the system is stable if the roots of the characteristic polynomial all have negative real part, that is, if r0 > 1. figure 6a shows an illustration of a dfe situation where r0 = 0.7 and b2 = 0.0517, b1 = 0.0024, r = 0.14 and c = 0.0026. using the transmission rate coefficients obtained from sect. 3 (b2 = 0.0628, b1 = 0.407), we get the endemic equilibrium of fig. 6b . the effects of the transmission rates b1 and b2 are illustrated in fig. 7 . the figure considers the situation of fewer contacts with infected individuals (b1 < b2, with most/some infected individuals in quarantine and assuming the same probability of contamination once in close contact) and compares it to the situation where we have a higher probability of contamination with infected individuals (b1 > b2). beyond r0 = 1, the proportion of infected individuals naturally increases and is higher when b2 > b1. this can be interpreted to mean that exposed people will have a greater impact on the persistence of the disease. the results of fig.
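for a cubic characteristic polynomial, the routh-hurwitz conditions used above reduce to three sign tests; the helper below encodes the standard criterion (this is textbook material, not code from the paper).

```python
def routh_hurwitz_cubic(a2, a1, a0):
    """All roots of p(x) = x**3 + a2*x**2 + a1*x + a0 have negative
    real part iff a2 > 0, a0 > 0 and a2 * a1 > a0."""
    return a2 > 0 and a0 > 0 and a2 * a1 > a0
```

for example, (x + 1)**3 = x**3 + 3x**2 + 3x + 1 passes the test, while flipping the sign of the constant term fails it; the marginal case a2 * a1 = a0 (roots on the imaginary axis) is correctly rejected.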
7 confirm the observations of the number of newly confirmed cases due to close contact with exposed and infected individuals in wuhan, china [35] . in the context of covid-19, governmental actions are mainly focused on regulating social life to reduce the likelihood of contact between individuals. this naturally impacts the transmission rates. the effects of governmental actions are summarized in the infection function, which would need to be substituted into eq. (3); however, note that we do not consider the effect of public reaction here; hence, we drop d from the equation. proposition 2: in this case, an endemic equilibrium exists. proof: under the effect of governmental action, repeating the analysis in the previous paragraph will not change the dfe, the endemic equilibrium and the stability conditions if bi is replaced by (1 - a)bi, (i = 1, 2). however, the reproduction number is modified accordingly. the endemic equilibrium is stable if r0' > 1, which leads to the critical value ac of the governmental control. figure 8 gives two different views of how the government action could contribute to controlling the spread of the disease. as might be expected, stronger governmental action (higher values of a) has more impact on the disease (fig. 8a) . but, what is more interesting is that the results predict the existence of a threshold value ac, expressed as a function of the transmission rate, that would lead to complete control of the disease. this threshold value is higher for larger values of b2. in practice of course, there would be a natural limit to the governmental action. for this reason, additional controls would be needed. the literature that documents past infectious diseases similar to covid-19 has shown how an increase in the number of deaths and the severity of critical cases can be leveraged to impact the perception and seriousness of the population.
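if the reproduction number depends linearly on the transmission rates, replacing bi by (1 - a)bi gives r0' = (1 - a) r0, and the critical action threshold follows in closed form. this derivation (and the function name) is mine; the paper expresses ac as a function of the transmission rate, and the sketch below only captures the linear special case.

```python
def critical_action(r0):
    """Smallest government-action strength a in [0, 1) that makes the
    effective reproduction number r0' = (1 - a) * r0 drop below 1."""
    if r0 <= 1.0:
        return 0.0            # already subcritical: no action required
    return 1.0 - 1.0 / r0
```

for r0 = 2 the threshold is 0.5, and it grows toward 1 as r0 increases, matching the observation above that larger transmission rates demand a higher ac and that, past some point, measures beyond governmental action are needed.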
we now take into consideration the combined effects of the government action and the public perception of risk regarding the number of severe and critical cases. the variable d is added to the model to represent the public perception of risk. it increases when people die and decays naturally, meaning that the perception of risk diminishes over time in the absence of covid-19. the intensity of this perception is carried through the intensity of the population response j and the proportion of severe cases d. the infection function is modified accordingly and is substituted into the model given by eq. (3). the system in eq. (3) under control has a higher threshold for the onset of the endemic equilibrium. proposition 4: there is an endemic state for which the intensity of the public perception has no effect; that endemic state is defined in the proof below. proof: the steady-state conditions lead to a system of relations, where the subscript c stands for control. the resulting transcendental equation does not lead to an explicit expression of ic0. thus, guided by the literature, we limit the analysis to some specific cases, starting with j → ∞. for other values of j, it can be shown that a single i*c0 with 0 < i*c0 exists; this can be proven graphically as shown in the appendix. figure 9 shows how the intensity of the population response could impact the spread of the disease. in fact, under this control, the number of infections is considerably reduced as shown in the figures. in fig. 9a , there is a jump in the number of infected for small r0. this jump is significant for smaller values of j and is likely a manifestation of the nonlinearity in j in the expression of the infection function. recalling that the endemic equilibrium used here was obtained for j → ∞, the results of fig. 9a are only valid for larger values of j. this nonlinearity is not visible in the presence of a as shown in fig. 9b . figure 10 shows an illustration of the system response for a naïve scenario where there is no governmental action (fig.
10a), the effects of governmental action alone (fig. 10b) , individual reaction alone (fig. 10c) and combined action (fig. 10d) . simulation and analytical derivation show that carefully setting the parameters (in the specified range of values) could effectively stop the spread of the disease under combined actions. we summarize below the key findings of our analysis. -the reproduction number, traditionally computed for the seir model in terms of r, k, l and c, has been expanded to include b1, b2, and the social and policy parameters, a and j. this expanded definition embeds social dynamics neatly into the epidemiological model and significantly expands insight into their interactions. -the transmission rates, and hence the reproduction number (r0), went through a significant reduction with the south korean government response roughly 40 days after the first incidence. -exposed people (as opposed to infected individuals) have a greater impact on the persistence of the disease. -the stronger the government action, the more the impact on disease transmission. -there is a minimum threshold value for government action (ac) for complete control of the disease. our model predicts that numerous small, tentative steps would not be as effective as bolder and more significant steps. -the intensity of the public response (j) has a significant impact on the reduction in the number of infections. -the model predicts that for some values of the disease dynamics, the public perception j will have no effect. in this case, only the governmental action could stop the spread of the disease. -the analysis predicts that a suitable combination of government response (a) and public reaction (j) would effectively stop pandemics such as covid-19. in this paper, we adapted and developed an seir model for the covid-19 pandemic including different transmission rates for contacts with infected and exposed individuals and integrated parameters and variables to model government action and social reaction.
first, we used data from south korea to perform a parametric analysis using the genetic algorithm and achieved a very good fit. this provides sound validation for our model. the resulting numerical analysis shows that the south korean government action 40 days after the infection was first diagnosed had a significant influence on the spreading of the disease. next, we used more nuanced models for nonlinear dynamic analysis. equilibrium and stability analysis was performed, revealing several areas of the parameter space where a stable endemic equilibrium can exist, leading to persistent infections. we considered three situations: (a) without control, (b) with government action and (c) with the combined effect of government action and public reaction. the results show that it is possible to stop the spread of the disease (or to extinguish the endemic equilibrium) by a proper choice of the parameters that govern social and government behavior. in this paper, by seamlessly integrating two important sociological (and arguably, political) parameters, i.e., public perception and government policy, we are able to show that these factors can significantly affect the transmission rate and spread pattern of disease evolution. the conclusions would support an argument that stronger government actions and policies such as quarantine, wearing masks, social distancing and improving public perception might be essential in combating the covid-19 spread. indeed, this is demonstrated in south korea, which has arguably achieved tremendous success in combating covid-19 unlike many other countries. similar perspectives should be considered for further government policy regarding progressively reopening the economy and campuses. a potential future direction is to integrate more aspects including seasonal effects, which would likely lead to periodic responses. finally, as we write this paper, we note that the pandemic situation is still evolving with considerable uncertainty about the future.
we believe that this paper demonstrates the importance of nonlinear dynamic analysis to enhance our understanding of the natural world in which we humans live and has profound implications for the way we handle it in the future. figure 11 shows the plot of z(x) (black line) and y(x) for several values of j, from j = 0 (no perception) to realistic values of j [36] . in all cases, the intersection of z(x) and y(x) is a single point; thus, there exists a unique i*c0, solution of eq. (28).
references:
- essai d'une nouvelle analyse de la mortalite causee par la petite verole
- daniel bernoulli's epidemiological model revisited
- a contribution to the mathematical theory of epidemics
- contributions to the mathematical theory of epidemics-ii. the problem of endemicity
- contributions to the mathematical theory of epidemics-iii. further studies of the problem of endemicity
- modeling infectious diseases in humans and animals
- mathematical models in epidemiology
- an introduction to mathematical epidemiology
- an introduction to infectious disease modelling
- the mathematics of infectious diseases
- modeling infectious disease dynamics in the complex landscape of global health
- the incidence of infectious diseases under the influence of seasonal fluctuations-analytical approach
- oscillatory phenomena in a model of infectious diseases
- nonlinear oscillations in epidemic models
- some epidemiological models with nonlinear incidence
- transients and attractors in epidemics
- the hopf bifurcation analysis and optimal control of a delayed sir epidemic model
- infinite subharmonic bifurcation in an seir epidemic model
- hopf bifurcation of a delayed epidemic model with information variable and limited medical resources
- seasonality in epidemic models: a literature review
- oscillations and chaos in epidemics: a nonlinear dynamic study of six childhood diseases in copenhagen
- persistence, chaos and synchrony in ecology and epidemiology
- estimation of the transmission risk of the 2019-ncov and its implication for public health interventions
- incubation period of 2019 novel coronavirus (2019-ncov) infections among travellers from wuhan, china
- report of the who-china joint mission on coronavirus disease 2019 (covid-19). world health organization
- novel coronavirus 2019-ncov: early estimation of epidemiological parameters and epidemic predictions
- time-varying transmission dynamics of novel coronavirus pneumonia in china
- early transmission dynamics in wuhan
- the reproductive number of covid-19 is higher compared to sars coronavirus
- early dynamics of transmission and control of covid-19: a mathematical modelling study
- covid-19 outbreak on the diamond princess cruise ship: estimating the epidemic potential and effectiveness of public health countermeasures
- nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study
- impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
- simulation and analysis of control of severe acute respiratory syndrome
- the effectiveness of quarantine of wuhan city against the corona virus disease 2019 (covid-19): a well-mixed seir model analysis
- a conceptual model for the coronavirus disease 2019 (covid-19) outbreak in wuhan, china with individual reaction and
governmental action
- inferring the causes of the three waves of the 1918 influenza pandemic in england and wales
- a timeline of south korea's response to covid-19
- adaptation in natural and artificial systems
- prediction of new coronavirus infection based on a modified seir model. medrxiv
- sir model for covid-19 calibrated with existing data and projected for colombia
publisher's note: springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. acknowledgements: cn and fn gratefully acknowledge the financial support from the us office of naval research (grant no. n00014-19-1-2070) for basic research on adaptive modeling of nonlinear dynamic systems. in particular, we appreciate the continuous encouragement from capt. lynn petersen and are humbled by his recognition of the value of our research. conflict of interest: the authors declare that they have no conflict of interest. the existence of a unique endemic value of i*c0 can be shown graphically for all values of j by plotting the graphs z(x) and y(x), with x = i*c0 and m = dc; the intersection point of z(x) and y(x) in the interval [0, 1] will exist if
key: cord-293333-mqoml9o5 title: scharbarg, emeric; moog, claude h.; mauduit, nicolas; califano, claudia title: from the hospital scale to nationwide: observability and identification of models for the covid-19 epidemic waves date: 2020-10-03 journal: annu rev control doi: 10.1016/j.arcontrol.2020.09.007 sha: doc_id: 293333 cord_uid: mqoml9o5 two mathematical models of the covid-19 dynamics are considered, as the health system in a country consists of a network of regional hospital centers. the first macroscopic model, for the virus dynamics at the level of the general population of the country, is derived from a standard sir model. the second local model refers to a single node of the health system network, i.e.
it models the flows of patients at a finer granularity, at the level of a regional hospital care center for covid-19 infected patients. daily (low-cost) data are easily collected at this level, and are worked out for a fast evaluation of the local health status thanks to control systems methods. precisely, the identifiability of the parameters of the hospital model is proven and, thanks to the availability of clinical data, essential characteristics of the local health status are identified. those parameters are meaningful not only to alert on some increase of the infection, but also to assess the efficiency of the therapy and the health policy. covid-19 is an aerial virus which strikes humans through a respiratory infection, see zhu et al. [2020] . it belongs to the coronavirus family, which was discovered in the 60's and has already infected humans through sars and mers. sars is an atypical pneumonia which appeared for the first time in 2002 in china, as described in ksiazek et al. [2003] . mers, which appeared in 2012 in saudi arabia (see zaki et al. [2012] ), is very similar to sars but with a higher mortality. covid-19 was declared to the who (the world health organization) at the end of 2019 from cases in wuhan, china, see zhu et al. [2020] . on the 20th of february 2020 the declared positive cases were 76,000 with almost 2,500 deaths, mainly in china. at the same time, there were about 50 declared positive cases in northern italy, while in france there were less than 30 declared positive cases. the health status was far from being homogeneous in those or other countries. the who declared on march 11th, 2020, that covid-19 was a pandemic on the website who [2020] . on june 1st, 2020, the declared positive cases around the world were 6.15 million with 375,000 deaths, with almost 233,000 positive cases and 33,000 deaths in italy and 152,091 positive cases and 28,833 deaths in france.
on september 21st 2020, the confirmed cases globally amount to 30,909,405 with 958,754 deaths communicated to who by national authorities. the total number of cases in france on september 21st is 420,855 with 31,109 deaths, while in italy the total number of cases is 296,569 with 35,692 deaths (website who [2020b] ). mathematical models are fundamental to understand and to predict the mechanisms of the spread of an epidemic. the most popular and widely used is the sir model of kermack and mckendrick [1927] for human-to-human transmission. for the covid-19 pandemic, many models have been built to explore the epidemic at the scale of a country, like the sidarthe model, which is an upgrade of the sir model and is found in giordano et al. [2020] . recall that sidarthe stands for a susceptible, infected, diagnosed, ailing, recognized, threatened, healed and extinct model. the sidarthe model has highlighted that restrictive social-distancing measures need to be combined with widespread testing and contact tracing in giordano et al. [2020] . gevertz et al. [2020] incorporate explicit social distancing via separate compartments for susceptible and asymptomatic individuals in the sir model, while in di giamberardino et al. [2020] infected people are split into infected with low viral load, undiagnosed ones, diagnosed ones and quarantined ones. alternatively, the sars-cov-2 dynamics is also described at a within-host level in hernandez-vargas et al. [2020] . in this paper, a new continuous-time macroscopic model, valid at the scale of a country, is introduced and it is argued that suitable delays are mandatory to reflect the dynamics of the infection while maintaining a certain simplicity in the modelling. an observability analysis is processed for further insight into this model.
since the health status is far from being homogeneous over a full country and since the health system consists in a network of major regional hospitals, it is worthwhile to take advantage of the availability and agility of local hospital data rather than just merging them in some centralized information system. at this point, the mathematical modelling of the local population flows at the level of one single hospital appears to be definitely relevant. such models may be interconnected to include the transfer of covid-19 patients from one hospital to another to distribute the pressure at some peaks of the infection, as done in italy with transfers from the north to the south, or in france with transfers from the east to the west and even to germany. the identification of the parameters of the second local model is agile, fast and relevant not only to alert on some increase of the infection, but also to evaluate the flows of severely infected patients, and thus to assess the efficiency of the therapy and the local health policy. data from nantes university hospital were collected on a daily basis for about 6 months, from march 16th to september 17th 2020, and are used here to identify essential characteristics of the local health conditions. these data include the daily values of patients in conventional care, patients in intensive care, patients who died from covid-19 and patients who were discharged from the hospital because they were completely recovered, or partially recovered but in a condition to continue their treatment at home. the originality of the data from nantes university hospital lies in the fact that the lockdown started early, the emergency services were not saturated and they focus on a population of patients with mild or severe covid-19 cases. the macroscopic and local models are illustrated in figure 1 . control systems theoretic tools have shown their efficiency in giving new insight into various biomedical systems including, for instance, hiv infection in chang et al.
[2014] . it is argued that new solutions to cope with the covid-19 pandemic may also benefit from those engineering science tools. the outline of the paper is as follows. a new infection model involving delays is derived from the standard sir model in section 2. to fit the mainstream of the current literature, this model is described in continuous time. the observability analysis of this continuous time-delay model is processed in section 3. a subsystem, consisting in a node of the global health system network, is extracted and further detailed in section 4, as it models the dynamics inside a hospital center. since such a model is obtained from the daily data available from a university hospital, this second local model is directly designed in discrete time. in section 5 a discussion on the identification of the parameters characterizing the local model and their interpretation is carried out. conclusions are pointed out in section 6. the evolution of the illness is described through the following dynamical model, which represents a modification of the well-known sir model introduced in 1927 in kermack and mckendrick [1927] to describe the diffusion of an epidemic disease by considering the evolution of three classes of people (compartments): the susceptible individuals, those who can become infected; the infected, those who spread the disease around; and the recovered, those who recovered from the disease. with respect to the sir model, hereafter we split the infected people into two compartments due to the specificity of the disease, and we consider also the compartment of dying people.
in particular, apart from the high transmission rate, two other aspects were immediately pointed out by the physicians which strongly influenced the diffusion of the disease and the medical resources: first, it was estimated that a large time delay (10 to 14 days) is present between the moment in which a person becomes infected and can infect, and the instant in which symptoms become evident and the person is isolated and sent to quarantine. secondly, it took a long time for many patients to recover from the disease. some of them were in hospital and in intensive care for a long time (many for more than one month), thus keeping the resources unavailable to help other patients. to highlight these two aspects we then refer in the following to a time-delay system. as it will be shown in the next section, the delay introduced has an important role in the observability properties of the dynamics and thus cannot be neglected. such a dynamics can then be described by the following delay-differential equations, whose terms are defined as follows: • i_q are the infected patients aware of their disease and who are thus in quarantine. they include all hospitalized patients, but not only. • r is the sub-population which has recovered from the infection. • d_q denotes the cumulative number of patients who deceased from the infection and were already identified, so they are part of the i_q population. • s is the amount of naive individuals among a given population which are susceptible to become infected. the s population is infected by the i_a infected individuals which are not in quarantine. • i_a are asymptomatic infected people, which represent the main source of infection and who spread out the infection among the general population. • β s i_a is the amount of newly infected individuals per time unit. this term splits into two parts: a smaller part of very sensitive people which affects the dynamics of i_q, while the majority of newly infected individuals will increase the number i_a. • α denotes the death rate due to the infection and mainly affects the i_q population.
• α denotes the death rate due to the infection and mainly affects the i_q population.
• γ_q i_q(t − τ_q) is the amount of patients in quarantine who recover from the disease.
• γ_a i_a(t − τ_a) is the amount of other infected individuals among the general population who recover from the disease.
• η i_a(t − τ) is the amount of unaware infected individuals who become aware of their infection and go into quarantine. this phenomenon occurs with some delay τ which is typically evaluated from 2 to 3 weeks.
remark. as already highlighted, d_q represents the covid deaths from the infected-and-in-quarantine compartment. there may of course be unknown covid deaths, which can be described by an additional dynamics of the form ḋ_a = α_a i_a, so that the total number of deaths due to covid is actually given by d = d_a + d_q. d_a cannot be measured, and had an important role essentially at the beginning of the pandemic, when the number of detected covid cases was much lower than the real one. due to the high mortality caused by covid in that period, a rough estimation could be obtained by comparing the monthly death rate of a single region with the corresponding one in the current year.
remark. as already underlined, the dynamics (1) is affected by two different kinds of delays: the first one characterizes the amount of time that passes between the moment a person is infected and can infect, and the moment the person becomes aware of the illness and is put into quarantine. this delay is intrinsic to the disease and can only be marginally affected. the second delay which characterizes the dynamics (1) is instead linked to the time infected people need to recover from the disease. this delay has drastically changed since the start of the pandemic, mainly thanks to a better knowledge of the disease and of the therapies needed by the patients, which now allows a faster recovery. in figure 2 a representation of the propagation of the disease is given.
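the delayed flows listed above can be sketched in a short simulation. the right-hand sides below, the constant pre-history before t = 0, and all parameter values are illustrative assumptions reconstructed from the compartment descriptions, not the authors' fitted model (1):

```python
import numpy as np

def simulate_delay_model(beta=2.5e-7, eps=0.1, eta=0.07, gamma_a=0.05,
                         gamma_q=0.08, alpha=0.01, tau=14.0, tau_a=21.0,
                         tau_q=21.0, s0=1e6, ia0=10.0, t_end=45.0, dt=0.1):
    """euler integration of a delayed compartment model in the spirit of (1).

    the flows follow the textual description: beta*s*ia new infections
    split by eps between i_q and i_a; eta*ia(t - tau) quarantine entries;
    delayed recoveries gamma_a*ia(t - tau_a) and gamma_q*iq(t - tau_q);
    deaths alpha*iq. a constant pre-history is assumed before t = 0."""
    n = int(t_end / dt)
    s = np.zeros(n + 1); ia = np.zeros(n + 1); iq = np.zeros(n + 1)
    r = np.zeros(n + 1); dq = np.zeros(n + 1)
    s[0], ia[0] = s0, ia0

    def lag(x, k, delay):                      # delayed value with constant pre-history
        return x[max(k - int(round(delay / dt)), 0)]

    for k in range(n):
        new_inf = beta * s[k] * ia[k]          # beta * s * i_a
        to_q = eta * lag(ia, k, tau)           # eta * i_a(t - tau)
        rec_a = gamma_a * lag(ia, k, tau_a)    # gamma_a * i_a(t - tau_a)
        rec_q = gamma_q * lag(iq, k, tau_q)    # gamma_q * i_q(t - tau_q)
        s[k + 1] = s[k] - dt * new_inf
        ia[k + 1] = ia[k] + dt * ((1 - eps) * new_inf - to_q - rec_a)
        iq[k + 1] = iq[k] + dt * (eps * new_inf + to_q - rec_q - alpha * iq[k])
        r[k + 1] = r[k] + dt * (rec_a + rec_q)
        dq[k + 1] = dq[k] + dt * alpha * iq[k]
    return s, ia, iq, r, dq
```

note that the five compartments sum to a constant by construction, which gives a quick consistency check of any such reconstruction.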
susceptible people (s) remain susceptible in time or become infected. once infected, they move to the compartment of people in quarantine (i_q) or of infected asymptomatic people (i_a), depending on whether they have symptoms or not. from the compartment i_a, people either become symptomatic and move to i_q, or recover and move to r. those in quarantine will recover and move to r or die from the disease (d_q). finally, as has recently been put in evidence, people who recover may be infected again, so they should be counted again among the susceptible people. the model described by equation (1) has from this point of view a short-term validity. it is possible to take into account the possibility of reinfection of recovered people: in this case, the dynamics (1) would be modified by moving a flow νr of recovered individuals back into the susceptible compartment. the coefficient ν seems however to be very small at this stage, and is neglected in this study. in the next section, the dynamics (1) is further analyzed with respect to its observability properties, since this kind of study allows an estimation of the state variables. the subsystem (2), consisting of i_q, r and d_q, is then further discussed in section 4: a group of people who are aware of their infection define the flow of admissions in a local hospital and are split into two populations, the patients admitted in conventional hospitalization and the patients admitted in intensive care. the health status of a given population has to be assessed from its number i_q of attested covid-19 cases. it is important to estimate the real number of infected individuals, and this is the purpose of observability from the measurement of i_q. considering then y = i_q as output of the dynamics (1), we will see hereafter that successive time differentiations of the measurement y involve i_q, i_a and s. thus, the dynamics (1) with the only output y = i_q is not fully observable, and at most i_a and s can be estimated in addition to the measurement i_q.
on the other hand, the number of deaths due to covid-19 in the i_q population can also be measured, while the number of recovered people clearly cannot be estimated. in the present section we will thus investigate the observability properties of the given system starting from the measurement of y = i_q, and we will show that, given the fact that d_q is measured and r cannot be estimated, the problem reduces to the study of the observability properties of a subsystem of order 3. now, due to the presence of delays, which in this context are considered constant, we can take the differential representation of the dynamics in order to study its behaviour, using the approach introduced in califano et al. [2020]. to take into account the link between the delayed variables, the backward shift operator δ has to be considered, see xia et al. [2002]. let us denote by k the field of causal meromorphic functions of the state variables and their delayed values. then, thanks to the back-shift operator δ, dx(t − sτ) = δ^s dx; accordingly, for a function a(·) ∈ k, δ(a dx) = a(t − τ) dx(t − τ). finally, k(δ] is the (left) ring of non-commutative polynomials in δ with coefficients in k. a general module spanned by the differentials of functions in k is then defined over the ring k(δ], as in xia et al. [2002]. then, in this framework, setting (x_1, x_2, x_3, x_4, x_5) = (i_q, r, d_q, s, i_a), and assuming without loss of generality that the two delays are integers, our system is characterized by its differential representation, and clearly dy^(2+j), for any j ≥ 0, does not depend on dx_2 and dx_3, which proves that the whole system cannot be weakly, regularly or strongly observable. we may however be interested in studying the reduced system defined by the variables (x_1, x_4, x_5) = (i_q, s, i_a) and given by (2). using the previous computations, one gets the associated observability matrix ô(x, δ]. this subsystem will then be weakly observable if ô(x, δ] has full rank over k(δ]. it will be strongly observable if ô is also unimodular.
if it is weakly observable but not strongly, we will have to check whether it is regularly observable, see califano et al. [2020], that is, whether it is possible to reconstruct the state of the system by using also higher-order derivatives of the output function. we may essentially distinguish three cases based on the values of the parameters ε and η, which are discussed hereafter and are represented in table 1.
• first case ε = 0 and η ≠ 0. this case corresponds to the situation in which there is an important delay between the time people get infected and the time they become aware and get into quarantine. as a consequence, necessarily η ≠ 0, and the observability matrix associated to the third-order subsystem (2) is ô as computed above. the subsystem (2) will then be weakly observable for η ≠ 0, β ≠ 0, x_5(−1) ≠ 0. it is easily verified that it will be neither strongly nor regularly observable.
• second case ε ≠ 0 and η = 0. in this case there is no structural delay between the two classes i_q and i_a: people get infected and after a negligible time are moved to quarantine. the delay characterizes only the large amount of time that ill people need to recover. in this situation, no delay affects the observability matrix of the subsystem (2), and the subsystem (2) will be strongly observable if and only if the matrix has full rank.
the different notions of observability are peculiar to time-delay systems. strong observability can be tested by verifying the unimodularity of the observability matrix and allows one to express the state of the system at time t as a function of the input and output and their derivatives up to order n − 1, eventually delayed. weak and regular observability are much weaker notions and have different implications. to give a flavour of these implications, we present hereafter two examples to highlight the differences. regular observability: consider a dynamics with an output such that the observability matrix is o(x, δ] = 1 + δ, which has full rank over k(δ] but is not unimodular.
nevertheless, the state of the system at time t can still be written as a function of the input and output and their derivatives, but requires higher-order derivatives. in the case of the example, we get x(t) = y(t) − (ẏ(t) − y(t − τ))/(u(t) − u(t − τ)), which is valid whenever u(t) ≠ u(t − τ). weak observability: consider the dynamics ẋ(t) = u(t) with output y(t) = x(t) + x(t − τ). the observability matrix is still o(x, δ] = 1 + δ. in this case however the system is neither strongly nor regularly observable, but is said to be weakly observable: only an implicit relation can be written down, involving different delayed values of the state, of the derivatives of the input and of the output. table 1 summarizes which cases are weakly, regularly and strongly observable. the subsystem (2) is thus strongly observable whenever β ≠ 0, x_4 ≠ 0, x_5 ≠ 0; the last condition is in fact always satisfied since x_5 > 0 during an outbreak.
• third case ε ≠ 0 and η ≠ 0. to check it, let us use the smith decomposition. we find that the subsystem (2) is then weakly observable for β x_5 ≠ 0. it cannot be strongly observable, and after some tedious computations it appears that it is not regularly observable either.
the originality of this section is to consider a smaller granularity with a subsystem of the previous dynamics: a local hospital center in charge of covid-19 patients. its input is the number a(k) of admissions in the hospital on day k. it is then in principle less than or at most equal to i_q in model (1), and it is further split into the number i(k) of patients admitted in intensive care and the number c(k) of patients admitted in conventional hospitalization on day k. the model describes the evolution of the dynamics defined by the hospitalized patients i(k) and c(k), as well as the number of deaths d(k) in hospital and the number r(k) of patients who have recovered from the disease (completely or partially, and continue their treatment outside the hospital).
in figure 3 a representation of the management of the covid-19 patients in the hospital is highlighted. each day k, a(k) new patients arrive at the hospital admission centre. depending on their health status, they are forwarded to the conventional care section (c) or the intensive care one (i). patients move from intensive care to conventional care and are then released from the hospital (r) once their health allows. some of them instead do not survive (d) once their condition worsens. as already underlined, data from nantes university hospital were collected for about 6 months, from march 16th to september 17th, 2020. these data include the daily values of c(k), i(k), r(k) and d(k) valid at day k. due to the format of the daily data, a discrete-time model with four parameters is in order, as follows. raw data from the nantes university hospital covid-19 database [2020] include, on a daily basis, the number i(k) of patients in intensive care, the number c(k) of patients in conventional hospitalization, the cumulative number of deaths and the cumulative number of patients who have recovered since day 1. the daily covid-19 admissions are not directly measured but are computed from the change in the measured state variables over two consecutive days. thus, all those state variables are considered to be measured, and the input a(k) is computed from the measurements over two days. the new admissions a(k) are split into θ a(k), which increases the number of patients in conventional hospitalization, and (1 − θ)a(k), which directly enter intensive care. a value of θ close to 1 is representative of a high level of monitoring of the disease by the health care outside the hospital. when θ is closer to 0, a major flow of new admissions enters directly into intensive care. in the other direction, λ i(k) is the amount of patients leaving intensive care and entering conventional hospitalization.
the number of patients γ c(k) who recover from the disease is assumed to be proportional to the number of patients in conventional hospitalization. µ c(k) denotes the amount of patients who are moved from conventional hospitalization to intensive care. the number α i(k) of daily deaths is assumed to be proportional to the number i(k) of patients in intensive care. the daily number a(k) of new admissions highly depends on the social and medical environment, as well as on political lockdown regulations or meeting restrictions (with some delay). thus, a(k) may have a significant variability from one regional hospital center to another. it is proportional to the number of individuals who are susceptible to becoming infected, in some standard sir population model.
remark. note that the local-scale model (3) models the dynamics of hospitalized patients. taking the sum of c(k) + i(k) in (3) over all regional hospitals gives the number of hospitalized patients nationwide. the latter is a part of the population i_q in model (1), as not all detected patients are hospitalized. the sum of hospitalized patients and non-hospitalized infected people equals i_q + i_a in model (1). it will be argued next that some parameters that characterize a single hospital may vary with respect to time. these parameters may also differ considerably from one hospital/region to another, depending on several sociological conditions and local regulations.
remark. while in the model (1) delays were considered to characterize the evolution of the infection among the population, since they play a fundamental role in particular on the observability of the dynamics, in the local model (3), which refers to the hospitalization of covid-19 patients, delays could be avoided.
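the flows just described suggest the following one-day update. the exact arrangement of model (3) is an assumption reconstructed from the text (the c-update is consistent with the coefficient (1 − γ − µ) appearing in the identification of θ), and the parameter values are illustrative:

```python
def hospital_step(c, i, d, r, a, theta=0.8, lam=0.1, gamma=0.07,
                  mu=0.03, alpha=0.02):
    """one day of a discrete-time hospital model in the spirit of (3).

    c: conventional hospitalization, i: intensive care,
    d: cumulative deaths, r: cumulative recoveries, a: daily admissions.
    flows: theta*a and (1-theta)*a split the admissions, lam*i and mu*c
    are the transfers between the two wards, gamma*c the recoveries and
    alpha*i the deaths."""
    c_next = (1 - gamma - mu) * c + theta * a + lam * i
    i_next = (1 - lam - alpha) * i + (1 - theta) * a + mu * c
    d_next = d + alpha * i           # alpha*i(k) daily deaths (cumulative)
    r_next = r + gamma * c           # gamma*c(k) daily recoveries (cumulative)
    return c_next, i_next, d_next, r_next
```

by construction, c + i + d + r increases exactly by the admissions fed in, which is a convenient sanity check on any variant of the flow arrangement.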
on one hand, even if data are available daily, the discrete nature of the model allows one to work on a large time window (namely 7 days, as will be discussed in the next section for the identification of the parameters), which is comparable with the delays involved in (1). on the other hand, the possible presence of a large delay could be handled by splitting the patients in intensive care or in conventional hospitalization into more compartments, moving the patients from one compartment to the next as the disease evolves, thus again avoiding the use of delays. note that the infection dynamics described by a standard or modified sir model, such as the dynamics (1), are external to the hospital and feed the dynamics (3). thus, the input variable a(k) of (3) is a percentage of i_q. the parameters α, γ, θ, µ and λ of model (3) can be identified over a shorter or longer period, so that their evolution can be tracked. the argument in this section is that these parameters are
• easy to derive from standard data from the hospital;
• representative of the regional health status.
the death rate α and the recovery rate γ are easily obtained from the measured state variables. given the noise in the daily data from the nantes university hospital covid-19 database [2020], an identification on a daily basis is not significant. so, let us identify over a horizon of h days, which is done by rewriting the third and fourth equations of (3) for days k, k + 1, ..., k + h. standard computations lead to
α = (d(k + h) − d(k)) / Σ_{j=0}^{h−1} i(k + j)    (4)
and
γ = (r(k + h) − r(k)) / Σ_{j=0}^{h−1} c(k + j).    (5)
note that the parameters α and γ are argued to be time-varying, so that equations (4) and (5) rather compute an average value over h days. the parameters θ and µ are not simultaneously identifiable in this model for constant sequences of a(k) and c(k). nevertheless, for any input sequence such that a(k)c(k + 1) ≢ a(k + 1)c(k), all parameters are identifiable.
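the window-averaged rates can be computed directly from the cumulative counts. the function below follows the increment-over-sum form implied by the third and fourth equations of (3) (an assumed reconstruction); the synthetic data in the check are illustrative:

```python
def estimate_rates(i_series, c_series, d_cum, r_cum, k, h):
    """window-averaged death and recovery rates over h days:
    alpha_hat = (d(k+h) - d(k)) / sum_{j=0}^{h-1} i(k+j),
    gamma_hat = (r(k+h) - r(k)) / sum_{j=0}^{h-1} c(k+j),
    where d_cum and r_cum are the cumulative deaths and recoveries."""
    alpha_hat = (d_cum[k + h] - d_cum[k]) / sum(i_series[k:k + h])
    gamma_hat = (r_cum[k + h] - r_cum[k]) / sum(c_series[k:k + h])
    return alpha_hat, gamma_hat
```

on noiseless data generated with constant rates, the estimator returns those rates exactly; on real data it averages the time-varying rates over the window.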
in other words, the system is generically identifiable, i.e. identifiable for almost all sequences of admissions and almost all flows of patients. the identification of θ may be processed as follows. rewrite the first equation of (3) at three different time instants (6). from (6), one can easily eliminate the coefficients (1 − γ − µ) and λ, and after some elementary but tedious computations θ is obtained: renaming suitable quantities (7), θ is expressed as the ratio (8). obviously, there exist some singular input sequences a(k) which cancel out the denominator of the right-hand side of (8): for instance, when the sequences a(k) and i(k) are constant, formula (8) is inapplicable for the estimation of the parameter θ. following a similar elimination process, it is possible to eliminate the parameters λ and α among (6). thus µ is identifiable for almost all input sequences, though the explicit expression becomes a bit more involved. the generic identifiability of λ is proven following similar lines.
computation of the death rate α and the recovery rate γ from the hospital dataset. data from nantes university hospital were collected from march 16th to september 17th, 2020 and are available in the nantes university hospital covid-19 database [2020]. these data include the daily values of c(k), i(k), r(k) and d(k). in france the lockdown took place from march 17, 2020 to may 11, 2020. when some data were missing or inaccurate, the corresponding periods were skipped. since α is computed from the available data over 7 days, its evaluation displayed in figure 6 starts on the second week. the missing values from may 20th to 28th correspond to errors due to inaccurate data, such as the number of admissions, which were sometimes incorrectly estimated in the hospital data, thus not allowing a correct estimation of α. in figure 6 it is shown that the death rate increased during the early days after the lockdown.
two weeks later it was significantly decreasing, and then it stabilized for over two months. surprisingly, the death rate α seems to reach unprecedented levels weeks after the end of the lockdown: a peak of the death rate α is noticed in early june, about one month after the end of the lockdown. this paradoxical result is easily interpreted: the generally younger population left (intensive care) hospitalization and only elderly patients were remaining in intensive care, sometimes after several months of hospitalization with a weak probability of health improvement. from mid-june to mid-august, no patient was in intensive care, which yields a zero death rate. this is due to a very low number of new covid-19 patient admissions. since the end of july 2020, there are new conventional hospitalizations, followed by intensive care hospitalizations at the beginning of august 2020. this indicates an increase in the circulation of the virus and possibly the beginning of a second epidemic wave. the parameter α requires a period of 7 days to be estimated, so that the increase in mortality is confirmed in mid-august. in figure 7 a higher level of the recovery rate is noticed during the first month of the lockdown, i.e. before the peak of the infection was reached. once the spread of the infection was under control, the recovery rate somewhat stabilized. population levels were low in july 2020: no patients in intensive care, and a decrease in the number of patients in conventional hospitalization from 23 patients at the beginning of the month to 2 patients at the end of the month. therefore the recovery of one or two patients has a significant impact on the recovery rate estimate. similarly, in august 2020 there is an increase in the recovery rate due to a low headcount. these results point to the suspicion of a better patient management by the physicians, in connection with an earlier treatment and a massive screening.
summarizing, a higher recovery rate in august/september is due to a better knowledge of the disease by the physicians. the main message in this paper is certainly that fast, agile, decentralized and relevant actions can be taken at some decentralized local level, both to predict a new wave of the covid-19 pandemic and to assess the real-time efficiency of treatments or of the health policy. a local mathematical model of flows of patients will not stand in competition with a standard or upgraded centralized sir-type model, but it will give additional insights into the dynamics to help decisions at the local level of a regional hospital. the information provided by local data can be easily collected from any hospital. this information is rich enough to capture essential indicators about the local health status of the regional population. not only is this information easy to obtain without requiring the involvement of any national health agency, but it is also more precise, in the sense that it gives a smaller-granularity picture of the health situation. it is well-known that this situation can be dramatically different from one region to another; for instance, the bergamo region in italy had a very different status when compared with the south of italy, as shown in alicandro et al. [2020]. similarly, the mulhouse region in france was one of the major clusters, see kuteifan et al. [2020], with a much higher infection rate than western french regions along the atlantic coast. data were collected during and after the lockdown period from the medical information department of nantes university hospital center. such low-cost information is able to provide an early alert about a second wave of the infection. in this sense, it is an agile and precise tool able to complement official figures published by central health agencies.
we wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome. we confirm that the manuscript has been read and approved by all named authors and that there are no other persons who satisfied the criteria for authorship but are not listed. we further confirm that the order of authors listed in the manuscript has been approved by all of us. we confirm that we have given due consideration to the protection of intellectual property associated with this work and that there are no impediments to publication, including the timing of publication, with respect to intellectual property. in so doing we confirm that we have followed the regulations of our institutions concerning intellectual property. we understand that the corresponding author is the sole contact for the editorial process (including editorial manager and direct communications with the office). he/she is responsible for communicating with the other authors about progress, submissions of revisions and final approval of proofs. we confirm that we have provided a current, correct email address which is accessible by the corresponding author. 
signed by all authors.
references:
• italy's first wave of the covid-19 pandemic has ended: no excess mortality in may (alicandro et al. [2020]).
• observability of nonlinear time-delay systems and its application to their state realization (califano et al. [2020]).
• a control systems analysis of hiv prevention model using impulsive input.
• dynamical evolution of covid-19 in italy with an evaluation of the size of the asymptomatic infective population.
• a novel covid-19 epidemiological model with explicit susceptible and asymptomatic isolation compartments reveals unexpected consequences of timing social distancing, medrxiv.
• modelling the covid-19 epidemic and implementation of population-wide interventions in italy.
• in-host modelling of covid-19 kinetics in humans, medrxiv.
• a contribution to the mathematical theory of epidemics (kermack and mckendrick [1927]).
• a novel coronavirus associated with severe acute respiratory syndrome.
• the outbreak of covid-19 in mulhouse: hospital crisis management and deployment of a military hospital during the outbreak of covid-19 in mulhouse (kuteifan et al. [2020]).
• analysis of nonlinear time-delay systems using modules over non-commutative rings (xia et al. [2002]).
• isolation of a novel coronavirus from a man with pneumonia in saudi arabia.
• a novel coronavirus from patients with pneumonia in china.
• covid-19 epidemiological database from march 16th to september 17th, 2020 (nantes university hospital covid-19 database [2020]).
• coronavirus disease 2019 (covid-19): situation report 51.
the authors wish to thank the medical information department of nantes university hospital for sharing data. thanks to dr c. zhang for his careful reading of a preliminary version of this paper. thanks to z. bondaty for drawing figure 1. thank you to professor karim asehnoune for agreeing to share the hospital data of the nantes university hospital on covid-19. finally, the authors wish to thank the anonymous reviewers for their valuable comments.
key: cord-289325-jhokn5bu authors: lachiany, menachem; louzoun, yoram title: effects of distribution of infection rate on epidemic models date: 2016-08-11 journal: phys rev e doi: 10.1103/physreve.94.022409 sha: doc_id: 289325 cord_uid: jhokn5bu
a goal of many epidemic models is to compute the outcome of the epidemics from the observed infected early dynamics. however, often, the total number of infected individuals at the end of the epidemics is much lower than predicted from the early dynamics. this discrepancy is argued to result from human intervention or nonlinear dynamics not incorporated in standard models. we show that when variability in infection rates is included in standard susceptible-infected-susceptible (sis) and susceptible-infected-recovered (sir) models, the total number of infected individuals in the late dynamics can be orders of magnitude lower than predicted from the early dynamics. this discrepancy holds for sis and sir models where the assumption that all individuals have the same sensitivity is eliminated. in contrast with network models, fixed partnerships are not assumed. we derive a moment closure scheme capturing the distribution of sensitivities. we find that the shape of the sensitivity distribution does not affect r0 or the number of infected individuals in the early phases of the epidemics. however, a wide distribution of sensitivities reduces the total number of removed individuals in the sir model and the steady-state infected fraction in the sis model. the difference between the early and late dynamics implies that in order to extrapolate the expected effect of the epidemics from the initial phase of the epidemics, the rate of change in the average infectivity should be computed. these results are supported by a comparison of the theoretical model to the ebola epidemics and by numerical simulation. doi: 10.1103/physreve.94.022409
an important element in theoretical epidemiology is the epidemic threshold, which specifies the condition for an epidemic to grow.
in mean-field epidemiological models, the concept of the basic reproductive number, r0, has been systematically employed as a predictor for epidemic spread and as an analytical tool to study the threshold conditions [1] [2] [3] [4]. an advantage of r0 is that in many models it determines both the threshold for the emergence of an epidemic and the expected final outcome of an outbreak [5, 6]. it has thus been widely used to gauge the degree of threat that a specific infectious agent will pose as an outbreak progresses [7, 8]. however, over and over again, differences have been observed between the predicted and observed sizes of epidemics, where in most cases the observed epidemic is much smaller than predicted [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] [21]. indeed, recent studies raise doubt about the validity of forecasting the outcome of epidemics using the r0-based estimate [22, 23]. specifically, given an estimate of r0, one can project in the ordinary differential equation (ode)-based susceptible-infected-recovered (sir) and susceptible-infected-susceptible (sis) models the future course of the epidemic. however, recent results have shown that these estimates are much larger than the true extent of the disease outcome [24]. this overestimate of the late dynamics when using the early dynamics has been raised by many authors in general cases, as well as specifically for the ebola virus [25, 26]. we build upon these known observations to show that even in fully mixed models, the early dynamics cannot be used to estimate the outcome of the late dynamics, and we propose an alternative approach. this overestimate can be explained by a slow decrease in r0.
r0 can be reduced by human intervention, such as the removal of sick individuals from society or the removal of carriers of the disease [27, 28], or by a limitation of movement, either for the entire population or for people expressing clinical signs [29, 30], which reduces the effective reproductive number [31]. r0 can also be reduced by passive vaccination of the population [32] or aggressive vaccination of the population [33, 34]. all the models mentioned above deal with external elements reducing r0, and thus reducing the number of patients in steady state or the total number of removed individuals in sir models. we argue here that the measure of r0 may not be indicative of future forecasts of the number of patients, even if no external factors or vaccinations are involved. consequently, additional information is required to estimate the number of people that will be affected by the disease in steady state, or when the epidemic is over. we propose here that the estimate of the change in the relative infection rate, defined as i′/i, can be a good way to estimate the steady-state number of infected individuals in sis models, and the total number of removed individuals in sir models. we introduce an epidemic model in which each of the susceptible individuals has a different probability to get infected and the same recovery probability. this model differs from uniform models, but also from network models. in the standard sir and sis models, the probability of infection is constant for everyone [1] [2] [3] [4]. however, in reality, different people have different tendencies to be clinically sick and infectious (e.g., elderly people, children, or immune-deficient individuals [35] [36] [37] [38] [39] [40]). in network sir models, the infectivity of each node is a function of its degree (and thus it varies among nodes).
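the proposed indicator, the relative change in the infection rate i′/i, can be estimated from incidence data with a sliding log-linear fit. the window length and the use of a least-squares slope below are implementation choices, not the authors' prescription:

```python
import numpy as np

def relative_growth_rate(infected, window=7):
    """sliding-window estimate of the relative infection-rate change i'/i:
    the least-squares slope of log i(t) over `window` consecutive days.
    a steadily declining estimate during the early phase signals a final
    size smaller than the naive r0-based extrapolation."""
    log_i = np.log(np.asarray(infected, dtype=float))
    t = np.arange(window)
    return np.array([np.polyfit(t, log_i[k:k + window], 1)[0]
                     for k in range(len(log_i) - window + 1)])
```

on a purely exponential series the estimator returns the constant growth rate; on real outbreak data its downward trend is the quantity of interest.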
however, network models assume a constant interaction pattern, which may be realistic for sexually transmitted diseases, but is not realistic for most noncontact (e.g., airborne or vehicle-led) transmission. moreover, in undirected networks, the probabilities to infect and to become infected are symmetrical [2] [3] [4] [41] [42] [43], which is again mainly appropriate for sexually transmitted diseases, but not for airborne transmission. we present evidence here that in a wide class of models, the variability in the probability to get infected can break the link between the early phase of the epidemics and the predicted outcome. we investigate the dynamical processes driving this result, its validity, and its consequences. we then compare the conclusions of this model to observed ebola outbreaks. such results have been presented in network models; however, in such models the connectivity is fixed over time [44] [45] [46] [47]. in the current analysis, we show that inhomogeneity has a crucial effect even in fully mixed systems. the models used in this study are based on the sis and sir models. each susceptible individual has a different probability to get infected, but all infected individuals have the same probability to infect other susceptible individuals. specifically, individuals exist in two discrete states ("healthy" or "infected") in the sis model, and three discrete states ("healthy," "infected," or "recovered") in the sir model. at each time step, each susceptible (healthy) individual i is infected with rate β_i. at the same time, infected individuals are cured and become susceptible again with rate γ in the sis model, or recover in the sir model. the value of β_i is only a function of the person getting infected and not of the person infecting, and it does not vary over time for a given individual.
given a variable probability to get infected in the population, we define the number of susceptible individuals with a β_i value between β and β + dβ as s(β)dβ and the number of infected individuals as i(β)dβ. we further define n(β) = s(β) + i(β). note that n(β) does not change over time and is thus equal to its initial condition. we have studied multiple distributions for this initial condition, as shall be further explained. s = ∫s(β)dβ is the total number of susceptible individuals, i = ∫i(β)dβ is the total number of infected individuals, and n = ∫n(β)dβ is the total population size. formally, β is the disease transmission rate (i.e., the rate at which a given person gets infected), and γ is the disease recovery rate. note that here we use an integral approximation of the discrete sum. we will further show that this approximation is consistent with numerical simulations. the equations for the sis model used here are di(β)/dt = β s(β) i − γ i(β) (1) and ds(β)/dt = −β s(β) i + γ i(β) (2), with n = i + s. in the sir model, we added the recovered individuals class: r(β)dβ is the number of recovered individuals with a probability between β and β + dβ to get infected. formally, none of the recovered individuals can be infected again in the sir model. in this model, n(β) = s(β) + i(β) + r(β). the equations of the sir model are of the same form, but without reinfection: i is defined as in eq. (1), s obeys ds(β)/dt = −β s(β) i, and r obeys dr(β)/dt = γ i(β) (5), with n = i + s + r. in the different models studied here, we implicitly consider four different distributions for the probability of susceptible individuals to get infected: (i) constant infection rate for all susceptible individuals. this is equivalent to the mean-field sis or sir models. we used β = 2 × 10^−5, γ = 1, and a population size of n = 100 000. thus r0 = βn/γ = 2. all other models only differ in the distribution of β.
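the coupled equations above can be checked numerically by discretizing the β axis. a minimal sketch, assuming a simple forward-euler integrator and illustrative parameter choices (the function name and grid are ours):

```python
import numpy as np

def sis_heterogeneous(beta, n_beta, gamma=1.0, dt=0.01, t_max=20.0, i0_frac=1e-4):
    """Integrate the heterogeneous-susceptibility SIS model on a discretized
    beta grid:  di(b)/dt = b*s(b)*I - gamma*i(b),  with s(b) = n(b) - i(b)
    and I = sum_b i(b).  `beta` is the grid of infection rates and `n_beta`
    the number of individuals at each grid point."""
    n_b = np.asarray(n_beta, dtype=float)
    i_b = i0_frac * n_b                # seed infections proportionally to n(beta)
    history = []
    for _ in range(int(t_max / dt)):
        I = i_b.sum()
        s_b = n_b - i_b
        i_b = np.clip(i_b + (beta * s_b * I - gamma * i_b) * dt, 0.0, n_b)
        history.append(I)
    return np.array(history)

# constant-beta sanity check: the classic SIS equilibrium is I*/N = 1 - 1/R0
N, beta0, gamma = 100_000, 2e-5, 1.0   # R0 = beta0*N/gamma = 2
I_t = sis_heterogeneous(np.array([beta0]), np.array([float(N)]), gamma)
print(I_t[-1] / N)   # close to 1 - 1/R0 = 0.5
```

the same function accepts an arbitrary grid of β values, which is how the heterogeneous cases below can be explored.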
(ii) uniform probability distribution of infection rates within the range 2 × 10^−11 < β < 4 × 10^−6. (iii) normal probability distribution with mean μ and variance σ², i.e., the probability of having infection rate β is p(β) = (2πσ²)^(−1/2) exp[−(β − μ)²/(2σ²)]. we used μ = 2 × 10^−5 and σ = 2μ. (iv) scale-free distribution, with the probability of having infection rate β given by p(β) ∝ β^−α, where α = 1.5 and β is limited to the range (2 × 10^−8, 2 × 10^−4). monte carlo simulations of the systems studied have been performed with a population size of n = 100 000. we assign a different infection rate to each individual using the four distributions above. the population is initiated with a small number of infected individuals, i_0 = 10. these 10 initial infected individuals are random individuals (i.e., they have β values randomly chosen from the population). note that β is very low; however, β·n is high enough to allow the epidemic to spread, since n = 100 000. we assume full mixing and a mass action formalism, allowing a very small number of infected individuals to spread the pathogen. formally, we assign each susceptible individual at each iteration a probability β_i·Δt of being infected, per infected individual. similarly, each infected individual is assigned in each iteration a probability γ·Δt (= Δt, since γ = 1) of being removed and becoming susceptible again. the simulation updating is synchronous. the dynamics are simulated for different parameter values. the odes were solved numerically using the matlab fourth-order runge-kutta method [48], as applied in the matlab ode45 function, assuming nonstiff equations [49]. we study the behavior of an epidemic outbreak assuming that each susceptible individual has a different probability of getting infected. however, once infected, its contribution to the total force of infection is constant. in other words, p[s(β) + i → i(β) + i] = β. this represents, for example, people with different levels of susceptibility to a given disease.
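the sampling and synchronous-update scheme described above can be sketched as follows. the helper names and the use of numpy's generator are ours; the distribution parameters are taken from the text as printed, and the demo run uses the normal case:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

def sample_betas(kind):
    """Draw per-individual infection rates under the four distributions
    described in the text (parameter values as printed there)."""
    if kind == "constant":
        return np.full(N, 2e-5)
    if kind == "uniform":
        return rng.uniform(2e-11, 4e-6, N)
    if kind == "normal":                      # clipped at 0 so all rates are valid
        return np.clip(rng.normal(2e-5, 4e-5, N), 0.0, None)
    if kind == "scale-free":                  # p(b) ~ b^-1.5 on (2e-8, 2e-4), inverse CDF
        a, lo, hi = 1.5, 2e-8, 2e-4
        u = rng.uniform(size=N)
        return (lo**(1 - a) + u * (hi**(1 - a) - lo**(1 - a))) ** (1.0 / (1 - a))
    raise ValueError(kind)

def mc_step(infected, betas, gamma=1.0, dt=0.01):
    """One synchronous update: susceptible j is infected with probability
    beta_j * I * dt (mass action); infected recover with probability gamma * dt."""
    I = infected.sum()
    new_inf = (~infected) & (rng.random(N) < betas * I * dt)
    recovered = infected & (rng.random(N) < gamma * dt)
    return (infected | new_inf) & ~recovered

betas = sample_betas("normal")
state = np.zeros(N, dtype=bool)
state[rng.choice(N, 10, replace=False)] = True   # i0 = 10 random seed infections
for _ in range(1000):                            # simulate up to t = 10
    state = mc_step(state, betas)
print(state.sum())
```

the per-step infection probability β_j·I·Δt is the discrete-time counterpart of the mass-action force of infection used in the ode model.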
within this model, we study two possible models of recovery. either the recovered hosts are immunized, and an sir model is used, or the recovered hosts become susceptible again while their β value does not change over time, and an sis model is used. to study the effect of the distribution of β (the probability to be infected) on both initial and late dynamics, we assume one of four possible β value distributions: constant, uniform, gaussian, or scale-free. in all cases, we maintain the expected value of β equal among models. as will be further shown, the initial dynamics are only affected by the first moment of the distribution (the expected value of β), while the total number of infected individuals during the outbreak in the sir model or the steady-state infected fraction in the sis model can be strongly affected by the following moments. thus, in some distributions, it is impossible to predict the "outcome" of the epidemics from the observed initial dynamics and the resulting estimate of r0. to examine the behavior of the infected class as a function of time, we developed a moment closure scheme, using the moments i_n = ∫_0^∞ β^n i(β) dβ and s_n = ∫_0^∞ β^n s(β) dβ and, for the sir model, in addition, r_n = ∫_0^∞ β^n r(β) dβ, together with the population mean e_n(β) = (1/n) ∫_0^∞ β n(β) dβ and the compartment means e_i(β) = i_1/i and e_s(β) = s_1/s. for the sis model in eq. (4), the number of infected individuals can be estimated via di/dt = (e_n(β)n − γ)i − i·i_1 (15) (see appendix a), where i_1 is the first-order moment of the infected class and e_n(β) is the population mean of β. an iterative equation can be developed to estimate the higher orders, relating the time derivative of each i_n to i_{n+1}; this would obviously lead to an infinite number of coupled equations. however, the number of equations can be limited using a simple moment closure method: the highest order of the set of odes is set to be zero, and the order below it is set to be a constant.
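a minimal numerical sketch of such a closure (our own variant, which simply sets the first neglected moment to zero rather than freezing the last retained one; the grid, truncation order K, and parameter choices are illustrative). it reproduces the first-moment property of the early dynamics, i.e., an initial growth rate of e_n(β)n − γ regardless of the shape of the distribution:

```python
import numpy as np

def sis_closure(beta_grid, n_grid, K=12, gamma=1.0, dt=1e-3, t_max=2.0, i0=10.0):
    """Moment-closure integration of the heterogeneous SIS model.
    State m[k] = integral of b^k i(b) db for k = 0..K (m[0] is the total infected).
    Multiplying di(b)/dt = b s(b) I - gamma i(b) by b^k and integrating gives
        dm[k]/dt = m[0] * (n[k+1] - m[k+1]) - gamma * m[k],
    closed here by setting the first neglected moment m[K+1] to zero."""
    n_mom = np.array([(beta_grid**k * n_grid).sum() for k in range(K + 2)])
    m = (i0 / n_grid.sum()) * n_mom[:K + 1]     # seed infections proportional to n(b)
    traj = []
    for _ in range(int(t_max / dt)):
        m_up = np.append(m[1:], 0.0)            # m[k+1], with m[K+1] = 0
        m = m + dt * (m[0] * (n_mom[1:K + 2] - m_up) - gamma * m)
        traj.append(m[0])
    return np.array(traj)

N, gamma = 100_000, 1.0
grid = np.linspace(1e-7, 8e-5, 200)                  # beta grid for a gaussian-like n(b)
weights = np.exp(-0.5 * ((grid - 2e-5) / 1e-5) ** 2)
weights *= N / weights.sum()
for bg, ng in [(np.array([2e-5]), np.array([float(N)])), (grid, weights)]:
    bg = bg * (2e-5 / ((bg * ng).sum() / N))         # normalize e_n(beta) to 2e-5
    I = sis_closure(bg, ng)
    rate = np.log(I[-1] / I[len(I) // 2])            # log-slope over t in [1, 2]
    print(rate)   # ~ e_n(beta)*N - gamma = 1 for both distributions
```

while the epidemic is still small, the neglected term −i·i_1 is tiny, so both distributions show the same exponential growth, as the text argues.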
note that more advanced schemes could be used [50] [51] [52]. however, this simple scheme agrees well with simulation and is enough for the current analysis. for the sir model in eq. (6), the number of infected individuals can be estimated via an analogous equation, where i_1 is the first-order moment of the infected class, e_n(β) is the population mean of β, and r_1 is the first-order moment of the recovered class. we developed a similar scheme for the sir model, with eq. (6). the iterative equation obtained for the susceptible moments is ds_n/dt = −i·s_{n+1} (18), where s_n is the nth-order moment of the susceptible class. the closure scheme above is also applied here: for example, if the stopping order is s_5, we set s_5 = 0, and the previous order is set to a constant. the moment closure scheme is consistent with the dynamics of the infected population simulated using a monte carlo simulation with the appropriate distribution (figs. 1-3). increasing the number of moments for the solution of eqs. (16) and (18) reduces the difference between monte carlo simulations and the moment closure results (data not shown). for example, in the sis model, for the three cases of distribution of infection rates, an order of 20 coincides with the results obtained from the simulation. in the sir model, only an eighth order was needed to reach a very accurate prediction. [fig. 1 caption: eq. (19) for the sis model and eq. (20) for the sir model. the results of a monte carlo simulation, the full moment closure scheme, and the approximation of the initial dynamics were compared, and a good fit was obtained. for all four distributions studied, n = 100 000, γ = 1, and e_n(β) = 2 × 10^−5. the rising line is the initial-time approximation, and the two other, highly similar, lines are the full moment closure solution and the simulation results.] for the early dynamics of the sis model in eq. (15), we can approximate the dynamics by neglecting the term −i·i_1 compared with the other terms, since β ≪ 1, obtaining the infected individuals number as eq. (19), i(t) ≈ i_0 exp[(e_n(β)n − γ)t], and the recovered individuals number as eq. (20). equation (19) for the sis model and eq.
(20) for the sir model are consistent with the simulations (fig. 1). furthermore, as can be clearly seen, in both sis and sir, all distributions tested (uniform, gaussian, and scale-free) have the same dynamics as the constant infectivity. during the early dynamics, only the first moment of the sensitivity distribution affects the dynamics. thus, the higher moments of the distribution cannot be estimated from the observed value of r0. however, these moments may affect the dynamics later in the epidemic. we thus examined whether the term neglected in the previous subsection, (−i·i_1), has a differential effect as a function of the higher moments of the infection rate distribution. the general solution of the infected class for eq. (15) can be estimated (see appendix a) as eq. (22); the ratio r(t) between the solution of eq. (22) and the solution without the neglected term, eq. (19), satisfies r = 1 for t = 0, and for t ≈ 0 a first-order estimate can be obtained. the same analysis can be performed for the sir model. we examined again the effect of the neglected terms on the dynamics, with the general solution of the infected class in eq. (17) approximated by eq. (25). the ratio between the solution without the neglected terms [eq. (20)] and the solution of eq. (25), marked by r(t), admits a similar approximation. one can clearly see that the difference between the models is affected by the value of e_i(β) at time t. thus, if e_i(β)(t) differs among models, so would the resulting dynamics. as can be seen in fig. 2, for the gaussian and uniform distributions the values are close to those obtained with constant infectivity. however, in the sf distribution, r(t) deviates from the values obtained in the constant infectivity case. the source of the difference is the drastic change in e_i(β) in the sf model over time. this affects the denominator of eq. (24) for the sis and sir models. this difference results from the effect of rare events in the sf distribution.
while in the uniform and gaussian distributions the value of e_i(β) is close to e_s(β), this is not the case in the sf model, where early in the dynamics e_i(β) ≫ e_s(β). thus in eq. (24), r(t) is expected to decrease much faster than in the constant β scenario, as is indeed the case. the source of the difference between the sf and all other distributions is the presence of very high β values in the population (even if they are rare). people with such high values of β are almost automatically infected, and they increase e_i(β) sharply. note that they also slightly decrease e_s(β). in all other distributions, the variance of β is limited by definition by the requirement that all individuals have positive β values. thus, while the normal distribution cannot have a large variance, in the case of the sf distribution, β can in principle take values in (0,∞). this is not the case in reality, since the sample is finite; still, the upper bound is a few orders of magnitude above the average (fig. 2). the same happens for the sir model, with the important distinction that the individuals with high β values are rapidly removed from the population, leading to a decrease in e_i(β) following the initial sharp rise. thus, while initially all distributions show dynamics purely determined by r0, the dynamics evolve differently as a function of the distribution of β. one is thus led to ask whether the difference in the distribution can be estimated from the early dynamics. the conclusion from the results above is that in order to estimate the future dynamics, it is not enough to know r0; the change in e_i(β) should also be estimated. while this cannot be estimated directly, we can directly quantify its effect on s·e_s(β) through the disease dynamics. specifically, a decrease in e_s(β) is expected to lead to a parallel decrease in i'/i. in the bottom graph of fig.
3, we calculated the difference in the number of infected individuals in every time step, divided by the total number of infected individuals at the same time. the total decrease in the expected value of β in the s compartment and the increase in the expected value of β in the i compartment are clear in the sf distribution, while they are more limited in all other distributions. this can be clearly seen in the measures of e_s(β), of e_i(β), and of i'/i. the mechanism driving the difference is straightforward. in the normal or uniform distributions the variance in β is limited, thus even infecting the most susceptible individuals does not significantly affect the values of e_i(β) and e_s(β). in the sf model, the expected value of β is strongly affected by the tail of the distribution. we have shown in the previous subsections that the distribution of the infection rate has a drastic effect on the dynamics for sf distributions. for other distributions, the infected class dynamics are similar to the constant infectivity model, or slightly lower. at the steady state of the system, the difference is further enlarged. in the sis model [eq. (4)], we can compute the equilibrium frequency of infected individuals (see appendix b) as the solution of an implicit equation, eq. (30), whose coefficients c_n are the taylor series coefficients obtained by expanding the terms [β − e_n(β)]^n. as expected from the intermediate period, in the constant, uniform, and gaussian distributions, the same number of infected individuals is obtained in the sis model (fig. 4). however, in agreement with the intermediate period, the total number of infected individuals in equilibrium is much lower for the sf distribution. a similar analysis can be performed for the sir model with similar results (fig. 4). the results obtained in both models in simulations and in eq. (30) are equivalent (fig. 4).
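the equilibrium comparison lends itself to a direct numerical check: setting di(β)/dt = 0 in the discretized model gives a self-consistency condition for the equilibrium number of infected individuals, solvable by fixed-point iteration. a sketch with illustrative parameters, in which a heavy-tailed sample stands in for the sf case:

```python
import numpy as np

def sis_equilibrium(beta_grid, n_grid, gamma=1.0, iters=10_000):
    """Solve the self-consistency condition for the endemic SIS equilibrium.
    Setting di(b)/dt = 0 in  di(b)/dt = b s(b) I - gamma i(b)  gives
    i*(b) = n(b) * b I* / (b I* + gamma), so I* satisfies the implicit equation
    I* = sum_b n(b) * b I* / (b I* + gamma), solved here by fixed-point iteration."""
    I = n_grid.sum() * 0.5                 # any positive starting guess
    for _ in range(iters):
        I_new = (n_grid * beta_grid * I / (beta_grid * I + gamma)).sum()
        if abs(I_new - I) < 1e-9:
            break
        I = I_new
    return I

N, gamma = 100_000, 1.0
# constant beta: recovers the classic result I*/N = 1 - 1/R0
I_const = sis_equilibrium(np.array([2e-5]), np.array([float(N)]), gamma)
print(I_const / N)                         # 0.5 for R0 = 2

# a heavy-tailed beta sample with the same mean infects far fewer people
rng = np.random.default_rng(1)
u = rng.uniform(size=N)
b = (2e-8**-0.5 + u * (2e-4**-0.5 - 2e-8**-0.5)) ** -2.0   # p(b) ~ b^-1.5
b *= 2e-5 / b.mean()                       # rescale so the mean beta matches
I_sf = sis_equilibrium(b, np.ones(N), gamma)
print(I_sf / N)                            # noticeably below the constant-beta value
```

the iteration map is increasing and concave with slope r0 at the origin, so for r0 > 1 it converges monotonically to the positive equilibrium.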
when higher moments of β are important, the first moment approximation (which is only affected by the first moment of β) obviously fails to properly reproduce the number of infected individuals. the intuitive explanation for the effect of the distribution of β on the number of infected individuals in equilibrium, and the resulting reduction in the number of people affected by the epidemics compared with the case of constant β, is the removal of individuals with high infection probability from the susceptible pool. this effect is directly quantifiable through the higher moments of β in the population and the resulting change in the first moment of β in the susceptible and infected pools. only in distributions where the higher moments can be important will there be a difference between the models with constant infectivity and the models with larger variability. to confirm the effect of heterogeneity in β on the outcome of observed epidemics, and that the steady-state number of infected individuals is much smaller than expected from the constant transmission rate model, we compared observed epidemics with the analytical results described above. we studied the spread of the ebola virus in three african countries. the ebola virus is one of five viruses of the ebolavirus genus [53]. four of the five known ebola viruses, including ebov, cause a severe and often fatal hemorrhagic fever in humans and other mammals, known as ebola virus disease (evd). the ebola virus has caused the majority of human deaths from evd, and it is the cause of the 2013-2015 ebola virus epidemic in west africa, which resulted in at least 28 424 suspected cases and 11 311 confirmed deaths [54]. the natural reservoir of the ebola virus is believed to be bats, and it is primarily transmitted between humans and from animals to humans through bodily fluids [55].
we analyzed the ebola virus daily infection rates collected by health authorities in three african countries: guinea, liberia, and sierra leone. we computed the theoretical parameters providing the best fit to the epidemics in each of the three distributions described above, as well as the standard sir model. there are no known cases of reexposure to ebola; we thus fit the sir and not the sis model. the free parameters are as follows: for the sf distribution, the population size n, α, β_min, β_max, i_0, and γ; for the gaussian distribution, n, μ, σ, γ, and i_0; for the uniform distribution, n, β_min, β_max, γ, and i_0; and for the constant model, n, β, γ, and i_0. using the observed i_0 leads to suboptimal results for all models. this is probably because, in the early dynamics, stochastic fluctuations can affect the results. we thus estimate i_0 from the later dynamics, when enough cases are available. in fig. 5, the top graph represents the observed infected class in sierra leone and the bottom graph represents the observed infected class in guinea. we added the real times as labels. both graphs are fitted to the models with the distributions above. while in sierra leone there is practically no difference between the different models (as can be observed in the f scores in fig. 6), in the guinea case there is a large difference: the scale-free and normal distributions provide a better fit than the constant and uniform cases. [figure caption: this rapid decrease even in the early stage of the disease suggests that highly infectious individuals are rapidly removed from the population. in the bottom graph, an f test was calculated between the best fit in the constant case and all the other distributions to incorporate the different numbers of free parameters. three asterisks between the classic model and any of the distributions represent p < 0.001.]
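a minimal, self-contained sketch of this kind of fit, with a constant-β sir model, a sum-of-squared-errors objective, and a coarse grid search standing in for the optimizer actually used; all names and parameter values here are illustrative, and the "observed" data are synthetic:

```python
import numpy as np

def sir_curve(params, t_grid, N=100_000, dt=0.01):
    """Deterministic SIR infected-class trajectory (forward-Euler integration)."""
    beta, gamma, i0 = params
    S, I, t, out = N - i0, i0, 0.0, []
    for tt in t_grid:
        while t < tt:
            dS = -beta * S * I * dt
            dI = (beta * S * I - gamma * I) * dt
            S, I, t = S + dS, I + dI, t + dt
        out.append(I)
    return np.array(out)

def sse(params, t_grid, observed):
    """Sum of squared errors between the model curve and observed counts."""
    return ((sir_curve(params, t_grid) - observed) ** 2).sum()

# synthetic "observed" epidemic generated with known parameters, then refit
t_grid = np.linspace(0.5, 15.0, 30)
observed = sir_curve((2e-5, 1.0, 10.0), t_grid)
candidates = [(b, g, 10.0) for b in (1e-5, 2e-5, 4e-5) for g in (0.5, 1.0, 2.0)]
best = min(candidates, key=lambda p: sse(p, t_grid, observed))
print(best)   # (2e-05, 1.0, 10.0) -- the generating parameters, with SSE 0
```

a real fit would use a continuous optimizer over all free parameters (and, for the heterogeneous models, over the distribution parameters), but the objective is the same sse used in the text.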
in guinea, all three distributions are much better than the constant case, but there is no significant difference between the quality of the nonconstant distributions. in liberia, the sf and gaussian distributions have a significantly better fit than the uniform distribution, and the same result is obtained for sierra leone. the relative infection rate i'/i can be used to detect significant deviations from the straightforward dynamics expected from these models quite early in the dynamics. the top of fig. 6 represents the observed infected class in sierra leone in terms of the relative infection rate i'/i. we computed from the observed epidemics the number of new infection cases divided by the current number of infected individuals, with a moving window of 3 days. to determine the best-fitting model for multiple states in africa, we calculated the sum of squared errors (sse) of the optimal fit obtained for each distribution, where the error is the difference between the observed number of infected individuals and the solution of the theoretical model for the appropriate distribution. in the bottom graph, the sse is plotted for three countries: guinea, liberia, and sierra leone. for all countries, the sse of every distribution of the transmission coefficient is smaller than in the classic case. an f test [56] [57] [58] was conducted to determine whether the reduced sse is statistically significant. the f statistic is computed using one of two equations, depending on the number of parameters in the models. if both models have the same number of parameters, the formula for the f statistic is f = sse1/sse2, where sse1 is for the first model and sse2 is for the second model. the p value of the results is computed using w − v degrees of freedom, where w is the number of data points and v is the number of parameters being estimated (one degree of freedom is lost per parameter estimated). the resulting f statistic can then be compared to an f distribution.
if the models have different numbers of parameters, the formula becomes f = [(sse1 − sse2)/(df1 − df2)] / (sse2/df2), where df1 and df2 are the residual degrees of freedom of models 1 and 2, respectively. in the bottom of fig. 6, the f test was calculated between the constant case and all the other distributions. three asterisks between the classic model and any of the distributions represent a significantly better fit of that distribution than the classic case. in guinea, all three distributions are better than the constant case, but we cannot determine which of the three distributions is preferable. in liberia, the sf and gaussian distributions were shown to be preferable to the uniform distribution, and the same result was obtained for sierra leone. we also compared, by the f test, the results of the constant susceptible-exposed-infected-recovered (seir) model and the other distributions of the sir model; the residuals of the seir model are practically identical to those of the sir model, and in this case adding an extra parameter does not improve the fit to the real data. we investigated the sir model and the sis model for the three cases of distribution of the transmission coefficient in the population and the classic case with a constant transmission coefficient. we found that in realistic cases, nonuniform models provide a better description of the observed epidemics than the model with a constant transmission coefficient for the entire population. in the early dynamics, only the first moment of β determines the dynamics, and there is no difference between the models as long as they have the same transmission rate. later, the second moment of β starts affecting the dynamics for both the sir and sis models. during this period, the scale-free distribution behaves differently from other distributions and results in a lower number of infected individuals than expected.
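the two f-statistic formulas described above can be written down directly; a sketch using scipy for the f-distribution tail probability (the function names are ours):

```python
from scipy.stats import f as f_dist

def f_test_same_params(sse1, sse2, w, v):
    """Variance-ratio F test for two fits with the same number of parameters:
    F = SSE1/SSE2, with (w - v) degrees of freedom for each model, where w is
    the number of data points and v the number of estimated parameters."""
    F = sse1 / sse2
    return F, f_dist.sf(F, w - v, w - v)

def f_test_nested(sse1, sse2, df1, df2):
    """F test when model 2 has more parameters (so df2 < df1 residual d.o.f.):
    F = ((SSE1 - SSE2)/(df1 - df2)) / (SSE2/df2)."""
    F = ((sse1 - sse2) / (df1 - df2)) / (sse2 / df2)
    return F, f_dist.sf(F, df1 - df2, df2)

F, p = f_test_nested(10.0, 5.0, 20, 18)
print(F)   # 9.0
print(p < 0.05)
```

a small p value indicates that the extra parameters of the richer model buy a genuinely better fit, which is how the asterisks in fig. 6 are to be read.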
this reduction is the result of a difference between the first moment of β in the different compartments (s, i). there is a sharp increase in the average value of β in the infected compartment, accompanied by a smaller decrease in the susceptible compartment, compared with the initial value. such an effect can only be observed in distributions where the second moment of β in the total population is large enough. the difference is then further enlarged in steady state, where all distributions of the transmission coefficient in the population lead to a smaller number of infected individuals compared with the model with constant β, but the biggest difference is in the sf model, again explained by the high second moment of β in this distribution. we have then tested this conclusion by studying the spread of the ebola virus in multiple african countries. an f test between the different distributions shows that they all produce a better fit than the constant β model. we attempted to detect early in the infection whether the total number of people affected by the epidemics will deviate significantly from the results expected from the early dynamics. classical models predict a large number of infected individuals in most epidemics with r0 higher than 1. however, in reality, many epidemics end with a limited impact. the most classical example is perhaps the huge difference between the predictions and the observed amplitude of the cjd (creutzfeldt-jakob disease) outbreak [9] [10] [11] [12] [13] [14] [15] [16]. many models can explain this discrepancy, including, among others, nonlinear dynamics [59, 60], delays [61, 62], human intervention, passive vaccination [63, 64], and small effective population [65] [66] [67]. we show here that even in the most standard sir and sis models, the initial dynamics cannot determine the total number of people affected by the epidemics.
however, quite early in the dynamics, the relative infection rate i'/i can be used to detect significant deviations from the straightforward dynamics expected from these models. the large difference is only expected if the distribution of β values is broad. such a large difference can be the result of intrinsic differences, but also the result of environmental differences, partial mixing, or subpopulation structure. understanding this distribution in advance would improve our capacity to relate early and late dynamics. we now plan to develop methods to estimate this distribution from finer measures early in the disease dynamics. we thank miriam beller for the language editing. appendix a: we solve eq. (4) using iterative equations for the sis model. the equation for the infected compartment from eq. (4) can be integrated over all values of β. the term ∫_0^∞ β n(β) dβ is equal to e_n(β)n, and the term ∫_0^∞ β i(β) dβ can be defined as i_1. equation (a3) thus leads to di/dt = (e_n(β)n − γ)i − i·i_1 (a4), which has a form similar to eq. (15). we can differentiate higher orders of i with respect to time to obtain equations similar to eq. (a4); this can be written at the nth order of i (i_n) as a general iterative equation. for the early dynamics of the sis model, we dropped the term −i·i_1 from eq. (15), as explained in the main text, and obtained the exponential solution, eq. (a8). in the second order for initial time, the term −i·i_1 in eq. (a4) is not neglected. we can write this term as −i²e_i(β) using eq. (11). the new form of eq. (a4) is then di/dt = −c_1 i + c_2 i², where we denote −e_n(β)n + γ = c_1 and −e_i(β) = c_2. changing the constants to e_n(β)n − γ = ξ and e_i(β) = ψ gives the logistic equation di/dt = ξi − ψi²; with the boundary condition i(t = 0) = i_0, the solution for the infected class, eq. (a16), is i(t) = ξ i_0 e^(ξt) / [ξ + ψ i_0 (e^(ξt) − 1)]. the ratio between the solution without the neglected terms, eq. (a8), and the current solution of eq.
(a16), marked by r(t), can be expanded for t ≈ 0 to first order, recovering eq. (24). for the sir model, integrating the infected equation over β gives an equation of the same form as eq. (17), di/dt = (e_n(β)n − γ)i − i·i_1 − i·r_1. we drop the terms −i·i_1 and −i·r_1 for the early dynamics, and r obeys dr(β)/dt = γ i(β), which we integrate in turn to obtain the early-time solutions. in the second order for initial time, the terms −i·i_1 and −i·r_1 in eq. (c4) are not neglected; the development for the term −i·i_1 is identical to that in appendix a. we solve eq. (6) with the closure scheme by integrating over β: if we differentiate s_1, we get ds_1/dt = −i·s_2, where we define s_2 = ∫_0^∞ β² s(β) dβ. in the same way, we can define in general s_n = ∫_0^∞ β^n s(β) dβ, and the corresponding equation is ds_n/dt = −i·s_{n+1}. we use the moment closure method to derive the solution for these differential equations: the order where the hierarchy stops is set to zero, and the order before it is then constant. for example, if we want to stop at order s_5, this order will be s_5 = 0, and order s_4 = const, where const = s_4(t = 0) = ∫_0^∞ β⁴ s(β) dβ.

references (titles as extracted):
- infectious diseases in humans
- mathematical tools for understanding infectious disease dynamics
- mathematical epidemiology of infectious diseases: model building, analysis and interpretation
- modeling infectious diseases in humans and animals
- generality of the final size formula for an epidemic of a newly invading infectious disease
- age-of-infection and the final size relation
- the mathematics of infectious diseases
- pandemic potential of a strain of influenza a (h1n1): early findings
- representations of mad cow disease
- new phenotypes for new breeding goals in dairy cattle
- mad cow policy and management of grizzly bear incidents
- no brainer-the usda's regulatory response to the discovery of mad cow disease in the united states
- trust in food in the age of mad cow disease: a comparative study of consumers' evaluation of food safety in belgium
- risk perception of the "mad cow disease" in france: determinants and consequences
- villas-boas, consumer and market responses to mad cow disease
- tracking the human fallout from mad cow disease
- receptor recognition and cross-species infections of sars coronavirus
- cross-reactive antibodies in convalescent sars patients' sera against the emerging novel human coronavirus emc (2012) by both immunofluorescent and neutralizing antibody tests
- the genome sequence of the sars-associated coronavirus
- clinical progression and viral load in a community outbreak of coronavirus-associated sars pneumonia: a prospective study
- prion diseases: update on mad cow disease, variant creutzfeldt-jakob disease, and the transmissible spongiform encephalopathies
- early estimation of the reproduction number in the presence of imported cases: pandemic influenza h1n1-2009 in new zealand
- epidemic models with uncertainty in the reproduction number
- rapid drop in the reproduction number during the ebola outbreak in the democratic republic of congo
- predicting the extinction of ebola spreading in liberia due to mitigation strategies
- spatiotemporal spread of the 2014 outbreak of ebola virus disease in liberia and the effectiveness of non-pharmaceutical interventions: a computational modeling analysis
- identifying transmission cycles at the human-animal interface: the role of animal reservoirs in maintaining gambiense human african trypanosomiasis
- transmission and control of african horse sickness in the netherlands: a model analysis
- simulating school closure strategies to mitigate an influenza epidemic
- world health organization writing group, nonpharmaceutical interventions for pandemic influenza, national and community measures
- estimating the effective reproduction number for pandemic influenza from notification data made publicly available in real time: a multi-country analysis for influenza a/h1n1v
- vaccination and passive immunisation against staphylococcus aureus
- a vaccination model for a multi-city system
- modeling vaccination in a heterogeneous metapopulation system
- strategies for mitigating an influenza pandemic
- clinical recognition and diagnosis of clostridium difficile infection
- ageing and infection
- case definitions of clinical malaria under different transmission conditions in kilifi district
- the management of community-acquired pneumonia in infants and children older than 3 months of age: clinical practice guidelines by the pediatric infectious diseases society and the infectious diseases society of america
- clinical practice guidelines for clostridium difficile infection in adults: 2010 update by the society for healthcare epidemiology of america (shea) and the infectious diseases society of america (idsa)
- infection dynamics on scale-free networks
- automata network sir models for the spread of infectious diseases in populations of moving individuals
- epidemic outbreaks in complex heterogeneous networks
- slow epidemic extinction in populations with heterogeneous infection rates
- dynamics of person-to-person interactions from distributed rfid sensor networks
- griffiths phases on complex networks
- epidemic spreading in metapopulation networks with heterogeneous infection rates
- numerical solutions of the euler equations by finite volume methods using runge-kutta time-stepping schemes
- mastering matlab 5: a comprehensive tutorial and reference
- moment closure and the stochastic logistic model
- novel moment closure approximations in stochastic epidemics
- moment-closure approximations for mass-action models
- proposal for a revised taxonomy of the family filoviridae: classification, names of taxa and viruses, and virus abbreviations
- ebola situation report
- killers in a cell but on the loose-ebola and the vast viral universe
- the analysis of variance: fixed, random and mixed models
- mathematics for the clinical laboratory
- kenward-roger approximate f test for fixed effects in mixed linear models
- mathematical structures of epidemic systems
- a stabilizability problem for a reaction-diffusion system modeling a class of spatially structured epidemic systems
- production and germination of conidia of trichoderma stromaticum, a mycoparasite of crinipellis perniciosa on cacao
- differential equations and applications in ecology, epidemics, and population problems
- effect of vaccination in environmentally induced diseases
- listeriosis: a model for the fine balance between immunity and morbidity
- temporal genetic samples indicate small effective population size of the endangered yellow-eyed penguin
- small effective population size and genetic homogeneity in the val borbera isolate
- effective population size is positively correlated with levels of adaptive divergence among annual sunflowers

key: cord-263987-ff6kor0c
title: solving the master equation for indels
date: 2017-05-12
journal: bmc bioinformatics
doi: 10.1186/s12859-017-1665-1
sha:
doc_id: 263987
cord_uid: ff6kor0c

background: despite the long-anticipated possibility of putting sequence alignment on the same footing as statistical phylogenetics, theorists have struggled to develop time-dependent evolutionary models for indels that are as tractable as the analogous models for substitution events. main text: this paper discusses progress in the area of insertion-deletion models, in view of recent work by ezawa (bmc bioinformatics 17:304, 2016); (bmc bioinformatics 17:397, 2016); (bmc bioinformatics 17:457, 2016) on the calculation of time-dependent gap length distributions in pairwise alignments, and current approaches for extending such calculations from ancestor-descendant pairs to phylogenetic trees. conclusions: while approximations that use finite-state machines (pair hmms and transducers) currently represent the most practical approach to problems such as sequence alignment and phylogeny, more rigorous approaches that work directly with the matrix exponential of the underlying continuous-time markov chain also show promise, especially in view of recent advances. models of sequence evolution, formulated as continuous-time discrete-state markov chains, are central to statistical phylogenetics and bioinformatics.
As descriptions of the process of nucleotide or amino acid substitution, their earliest uses were to estimate evolutionary distances [1], parameterize sequence alignment algorithms [2], and construct phylogenetic trees [3]. Variations on these models, including extra latent variables, have been used to estimate spatial variation in evolutionary rates [4, 5]; these patterns of spatial variation have been used to predict exon structures of protein-coding genes [6, 7], foldback structure of non-coding RNA genes [8, 9], regulatory elements [10], ultra-conserved elements [11], protein secondary structures [12], and transmembrane structures [13]. They are widely used to reconstruct ancestral sequences [14-22], a method that is finding increasing application in synthetic biology [16, 20-22]. Trees built using substitution models are used to classify species [23], predict protein function [24], inform conservation efforts [25], or identify novel pathogens [26]. In the analysis of rapidly evolving pathogens, these methods are used to uncover population histories [27], analyze transmission dynamics [28], reconstruct key transmission events [29], and predict future evolutionary trends [30]. There are many other applications; the ones listed above were selected to give some idea of how influential these models have been. (Correspondence: ihh@berkeley.edu, Dept. of Bioengineering, University of California, Berkeley 94720, USA.) Continuous-time Markov chains describe evolution in a state space Ω, for example the set of nucleotides Ω = {A, C, G, T}. The stochastic process φ(t) at any given instant of time, t, takes one of the values in Ω. Let p(t) be a vector describing the marginal probability distribution of the process at a single point in time: p_i(t) = P(φ(t) = i).
The time-evolution of this vector is governed by a master equation

d/dt p_j(t) = Σ_{i ∈ Ω} p_i(t) r_{i,j}    (1)

where, for i, j ∈ Ω and i ≠ j, r_{i,j} is the instantaneous rate of mutation from state i to state j. For probabilistic normalization of eq. 1, it is then required that r_{i,i} = -Σ_{j ≠ i} r_{i,j}. The general solution to eq. 1 can be written p(t) = p(0)M(t), where M(t) is the matrix exponential

M(t) = exp(Rt)    (2)

Entry M_{i,j}(t) of this matrix is the probability P(φ(t) = j | φ(0) = i) that, conditional on starting in state i, the system will after time t be in state j. It follows, by definition, that this matrix satisfies the Chapman-Kolmogorov forward equation

M(t + u) = M(t) M(u)    (3)

That is, if M_{i,j}(t) is the probability that state i will, after a finite time interval t, have evolved into state j, and M_{j,k}(u) is the analogous probability that state j will after time u evolve into state k, then summing out j has the expected result:

M_{i,k}(t + u) = Σ_{j ∈ Ω} M_{i,j}(t) M_{j,k}(u)

This is one way of stating the defining property of a Markov chain: its lack of historical memory previous to its current state. Equation 1 is just an instantaneous version of this equation, and eq. 3 is the same equation in matrix form. The conditional likelihood for an ancestor-descendant pair can be converted into a phylogenetic likelihood for a set of extant taxon states S related by a tree T, as follows. (I assume for convenience that T is a binary tree, though relaxing this constraint is straightforward.) To compute the likelihood requires that one first computes, for every node n in the tree, the probability f_i(n) of all observed states at leaf nodes descended from node n, conditioned on node n being in state i. This is given by Felsenstein's pruning recursion:

f(n) = δ(s_n) if n is a leaf node in state s_n
f(n) = (M(t_{nl}) f(l)) • (M(t_{nr}) f(r)) if n is an internal node with children l and r    (4)

where t_{mn} denotes the length of the branch from tree node m to tree node n.
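The exact solution (eq. 2) and the Chapman-Kolmogorov property (eq. 3) are easy to verify numerically for a small state space. The following is a minimal sketch, assuming a Jukes-Cantor-style rate matrix over Ω = {A, C, G, T} (chosen purely for illustration) and computing exp(Rt) by spectral decomposition:

```python
import numpy as np

mu = 1.0
# Jukes-Cantor rate matrix: equal off-diagonal rates, diagonal chosen so
# each row sums to zero (the probabilistic normalization of eq. 1)
R = (mu / 3.0) * (np.ones((4, 4)) - 4.0 * np.eye(4))

# spectral decomposition: R is symmetric here, so eigh is stable
w, V = np.linalg.eigh(R)

def M(t):
    """M(t) = exp(Rt), eq. 2, computed from the eigensystem of R."""
    return V @ np.diag(np.exp(w * t)) @ V.T

# rows of M(t) are conditional distributions, so they sum to 1
assert np.allclose(M(0.5).sum(axis=1), 1.0)

# Chapman-Kolmogorov (eq. 3): M(t) M(u) = M(t + u)
assert np.allclose(M(0.3) @ M(0.7), M(1.0))

# a point mass on state A relaxes toward the uniform equilibrium
p0 = np.array([1.0, 0.0, 0.0, 0.0])
print(p0 @ M(10.0))
```

Any rate matrix with zero row sums would work in place of the Jukes-Cantor form; only the symmetric-eigensolver shortcut depends on this particular choice.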
I have used the notation δ(j) for the unit vector in dimension j, and the symbol • to denote the Hadamard product (also known as the pointwise product), defined such that for any two vectors u, v of the same size, (u • v)_i = u_i v_i. Supposing that node 1 is the root node of the tree, and that the distribution of states at this root node is given by ρ, the likelihood can be written as

P(S|T) = ρ · f(1)

where u · v denotes the scalar product of u and v. It is common to assume that the root node is at equilibrium, so that ρ = π. As mentioned above, this mathematical approach is fundamental to statistical phylogenetics and many applications in bioinformatics. For small state spaces Ω, such as (for example) the 20 amino acids or 61 sense codons, the matrix exponential M(t) in eq. 2 can be solved exactly and practically by the technique of spectral decomposition (i.e. finding eigenvalues and eigenvectors). Such an approach underlies the Dayhoff PAM matrix. It was also solved for certain specific parametric forms of the rate matrix R by Jukes and Cantor [1], Kimura [31], Felsenstein [3], and Hasegawa et al. [32], among others. This approach is used by all likelihood-based phylogenetics tools, such as RevBayes [33], BEAST [34], RAxML [35], HyPhy [36], PAML [37], PHYLIP [38], TREE-PUZZLE [39], and xrate [40]. Many more bioinformatics tools use the Dayhoff PAM matrix or another substitution matrix based on an underlying master equation of the form of eq. 1. There exists a deep literature on Markov chains, to which this brief survey cannot remotely do justice, but several concepts must be mentioned in order to survey progress in this area. A Markov chain is time-homogeneous if the elements of the rate matrix R in eq. 1 are themselves independent of time. If a Markov chain is time-homogeneous and is known to be in equilibrium at a given time, for example p(0) = π, then (absent any other constraints) it will be in equilibrium at all times; such a chain is referred to as being stationary.
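Felsenstein's recursion (eq. 4) is likewise compact in code. The sketch below assumes the Jukes-Cantor model, for which exp(Rt) has a closed form, and a three-leaf tree encoded as nested tuples; the encoding and helper names are illustrative choices, not from the paper:

```python
import numpy as np

STATES = "ACGT"

def jc_M(t, mu=1.0):
    # closed-form Jukes-Cantor transition matrix exp(Rt) (an assumed
    # example model; any matrix from eq. 2 would serve)
    e = np.exp(-4.0 * mu * t / 3.0)
    return np.full((4, 4), (1.0 - e) / 4.0) + np.eye(4) * e

# a tree as nested tuples: a leaf is an observed base, an internal node
# is (left_subtree, left_branch_length, right_subtree, right_branch_length)
def prune(node):
    """Felsenstein's pruning recursion (eq. 4): conditional likelihoods f(n)."""
    if isinstance(node, str):                  # leaf: unit vector δ(s_n)
        return np.eye(4)[STATES.index(node)]
    left, t_l, right, t_r = node
    # Hadamard product of the two branch-propagated child vectors
    return (jc_M(t_l) @ prune(left)) * (jc_M(t_r) @ prune(right))

tree = (("A", 0.1, "A", 0.2), 0.3, "C", 0.4)
rho = np.full(4, 0.25)                         # root at equilibrium, ρ = π
likelihood = rho @ prune(tree)                 # P(S|T) = ρ · f(root)
print(likelihood)
```

The recursion visits each node once, so the cost is linear in the number of nodes and quadratic in the number of states per node.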
The time-scaling of these models is somewhat arbitrary: if the time parameter t is replaced by a scaled version t/κ, while the rate matrix R is replaced by Rκ, then the likelihood in eq. 2 is unchanged. For some models, the rate is allowed to vary between sites [4, 5]. A Markov chain is reversible if it satisfies the instantaneous detailed balance condition π_i r_{i,j} = π_j r_{j,i}, or its finite-time equivalent π_i M_{i,j} = π_j M_{j,i}. This amounts to a symmetry constraint on the parameter space of the chain (specifically, the matrix S with elements S_{i,j} = sqrt(π_i/π_j) r_{i,j} is symmetric), which has several convenient advantages: it effectively halves the number of parameters that must be determined, it eases some of the matrix manipulations (symmetric matrices have real eigenvalues and the algorithms to find them are generally more stable), and it allows for some convenient manipulations, such as the so-called pulley principle allowing for arbitrary re-rooting of the tree [3]. From another angle, however, these supposed advantages may be viewed as drawbacks: reversibility is a simplification which ignores some irreversible aspects of real data, limits the expressiveness of the model, and makes the root node placement statistically unidentifiable. Stationarity has similar advantages and drawbacks. If one assumes the process was started at equilibrium, that is one less set of parameters to worry about (since the equilibrium distribution is implied by the process itself), but it also renders the model less expressive and makes some kinds of inference impossible. The early literature on substitution models involved generalizing from rate matrices R characterized only by a single rate parameter [1] to symmetry-breaking versions that allowed for different transition and transversion rates [31], non-uniform equilibrium distributions over nucleotides [3], and combinations of the above [32]. These models are all, however, reversible.
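Detailed balance and the symmetrizing transform can be checked directly in a few lines. This sketch assumes a GTR-style construction (symmetric exchangeabilities multiplied by equilibrium frequencies), which is one standard way to build a reversible rate matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# reversible rate matrix via R_ij = x_ij * pi_j with symmetric
# exchangeabilities x (illustrative random parameters)
pi = rng.dirichlet(np.ones(4))
X = rng.random((4, 4))
X = X + X.T                       # make exchangeabilities symmetric
np.fill_diagonal(X, 0.0)
R = X * pi                        # broadcasting: R[i, j] = X[i, j] * pi[j]
np.fill_diagonal(R, -R.sum(axis=1))   # rows sum to zero, as eq. 1 requires

# instantaneous detailed balance: pi_i R_ij = pi_j R_ji
flux = pi[:, None] * R
assert np.allclose(flux, flux.T)

# S_ij = sqrt(pi_i / pi_j) R_ij is symmetric, so it has real eigenvalues
# and the more stable symmetric eigensolvers apply
S = np.sqrt(pi[:, None] / pi[None, :]) * R
assert np.allclose(S, S.T)
print("detailed balance and symmetry of S verified")
```

Symmetry of S follows because S_ij = X_ij sqrt(pi_i pi_j) under this construction, which is manifestly symmetric in i and j.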
A good deal of subsequent research has gone into the problem, in various guises, of generalizing results obtained for reversible, homogeneous and/or stationary models to the analogous irreversible, nonhomogeneous and nonstationary models. For examples, see [30, 41-43]. The question naturally arises: how to extend the model to describe the evolution of an entire sequence, not just individual sites? In such cases, when one talks about M_{i,j}(t), "the likelihood of an ancestor-descendant pair (i, j)" (or, more precisely, the probability that, given the system starts in ancestral state i, it will after time t have evolved into descendant state j), one must bear in mind that the states i and j now represent not just individual residues, but entire sequences. As long as the allowed mutations are restricted to point substitutions and their mutation rates are independent of flanking context, the extension to whole sequences is trivially easy: one can simply multiply together the probabilities of independent sites. However, many kinds of mutation violate this assumption of site independence, most notably context-dependent substitutions and indels, where the rates depend on neighboring sites. For these mutations the natural approach is to extend the state space to be the set of all possible sequences over a given alphabet (for example, the set of all DNA or protein sequences). This state space is (countably) infinite; eqs. 1-4 can still be used on an infinite state space, but solution by brute-force enumeration of eigenvalues and eigenvectors is no longer feasible, except in special cases where there is explicit structure to the rate matrix that allows identification of the eigensystem by algebraic approaches [44, 45]. Whole-sequence evolutionary models have turned out to be quite challenging for theorists.
There is extensive evidence suggesting that indels, in particular, can be profoundly informative to phylogenetic studies, and to applications of phylogenetics in sequence analysis [46-51]. The field of efforts to unify alignment and phylogeny, and to build a theoretical framework for the evolutionary analysis of indels, has been dubbed statistical alignment by Hein, one of its pioneers [52]. Recent publications by Ezawa [53-55] and Rivas and Eddy [56] have highlighted this problem once again, directly leading to the present review. In this paper I focus only on "local" mutations: mostly indel events (which may include local duplications), but also context-dependent substitutions. This is not because "nonlocal" events (such as rearrangements) are unimportant, but rather that they tend to defy phylogenetic reconstruction, due to the rapid proliferation of possible histories after even a few such events [57]. The discussion here is separated into two parts. In the first part, I discuss the master equation (eq. 1) and exact solutions thereof (eq. 2), along with various approximations and their departure from the Chapman-Kolmogorov ideal (eq. 3). This is an area in which recent progress has been reported in this journal. In the second part, I review the extension from pairwise probability distributions to phylogenetic likelihoods of multiple sequences, using analogs of Felsenstein's pruning recursion (eq. 4). This section begins with various approaches to finding the time-dependent probability distribution of gap lengths in a pairwise alignment, under several evolutionary models. As an approach to models on strings of unbounded length, one can consider short motifs of k residues. These can still be treated as finite state-space models; for example, a k-nucleotide model has 4^k possible states.
Several such models have been analyzed, including models on codons where k = 3 [47, 58], dinucleotides involved in RNA basepairs where k = 2 [59-61], and models over sequences of arbitrary length k [44, 62]. Mostly, these models handle short sequences (motifs) and do not allow the sequence length to change over time (so they model only substitutions and not indels). Some of the later models do allow the sequence length to change via insertions or deletions [62], though these models have not yet been analyzed in a way that would allow the computation of alignment likelihoods for sequences of realistic lengths. It is a remarkable reflection on the extremely challenging nature of this problem that, to date, the only exactly solved indel model on strings is the TKF91 model, named after the authors' initials and the date of publication of this seminal paper [63]. While there has been progress in developing approximate models in the 25 years since the publication of this paper, and in extending it from pairwise to multiple sequence alignment, it remains the only model for which
1. the state space is the set of all sequences (strings) over a finite alphabet,
2. the state space is ergodically explored by substitutions and indels (so there is a valid alignment and evolutionary trajectory between any two sequences φ(0) and φ(t)), and
3. eq. 2 can be calculated exactly (specifically, as a sum over alignments, where the individual alignment likelihoods can be written in closed form).
The TKF91 model allows single-residue, context-independent events only. These include (i) single-residue substitutions, (ii) single-residue insertions (with the inserted residue drawn from the equilibrium distribution of the substitution process), and (iii) single-residue deletions (whose rates are independent of the residue being deleted). The rates of all these mutation events are independent of the flanking sequence.
This process is equivalent to a linear birth-death model with constant immigration [64]. Thorne et al. showed that an ancestral sequence can be split into independently evolving zones, one for each ancestral residue (or "links", as they call them). This leads to the very appealing result that the length distribution for observed gaps is geometric, which conveniently allows the joint probability P(φ(0), φ(t)) to be expressed as a paired-sequence hidden Markov model or "pair HMM" [65]. The conditional probability P(φ(t)|φ(0)) can similarly be expressed as a weighted finite-state transducer [66-68]. Some interesting discussion of why the TKF91 model should be solvable at all can be found in [69] and in [42]. There are several variations on the TKF91 model. The case where there are no indels at all, only substitutions, can be viewed as a special case of TKF91, and can of course be solved exactly, as is well known. Another variation on the TKF91 model constrains the total indel rate to be independent of sequence length [70]. In the following section I cover some variants that use different state spaces. It is difficult to extend TKF91 to more realistic models wherein indels (or substitutions) can affect multiple residues at once. In such models, the fate of adjacent residues is no longer independent, since a single event can span multiple sites. As a way around this difficulty, several researchers have developed evolutionary models where the state is not a pure DNA or protein sequence, but includes some extra "hidden" information, such as boundaries, markers or other latent structure. In some of these models the sequence of residues is replaced by a sequence of indivisible fragments, each of which can contain more than one residue [56, 69, 71]. These include the TKF92 model [71], which is, essentially, TKF91 with residues replaced by fragments (so the alphabet itself is the countably infinite set of all sequences over some other, finite alphabet).
Other models approximate indels as a kind of substitution that temporarily hides a residue, by augmenting the DNA or protein alphabet with an additional gap character [72-74]. These models can be used to calculate some form of likelihood for a pairwise alignment of two sequences, but since this likelihood is not derived from an underlying instantaneous model of indels, the equations do not, in general, satisfy the Chapman-Kolmogorov forward eq. 3. That is, the probability of evolving from i to k comes out differently depending on whether or not one conditions on an intermediate sequence j. Clearly, something about this "seems wrong": the failure to obey eq. 3 illustrates the ad hoc nature of these approaches. Ezawa [53] describes the Chapman-Kolmogorov property (eq. 3) as evolutionary consistency; it can also be regarded as being the defining property of any correct solution to a continuous-time Markov chain. The abovementioned approaches may be evolutionarily consistent if the state space is allowed to include the extra information that is introduced to make the model tractable, such as fragment boundaries. Lèbre and Michel have criticized other aspects of the Rivas-Eddy 2005 and 2008 models [73, 74], in particular the incomplete separation of the indel and substitution processes [42]. Models which allow for heterogeneity of indel and substitution rates along the sequence also fall into this category of latent-variable models. The usual way of allowing for such spatial variation in substitution models is to assume a latent rate-scaling parameter associated with each site [4, 5]. For indel models, this latent information must be extended to include hidden site boundaries [56]. Another variation on TKF91 is the TKF Structure Tree, which describes the evolutionary behavior of RNA structures with stem and loop regions that are subject to insertion and deletion [75].
Rather than describing the evolution of a sequence, this model essentially captures, as a grammar, the time-evolution of a tree-like graph whose individual edges are evolving according to the TKF91 model. Other evolutionary models have made use of graph grammars, for example to model pseudoknots [76] or context-dependent indels [77]. In tackling indel models where the indel events can insert or delete multiple residues at once, several authors have used the approximation that indels never overlap, so that any observed gap corresponds to a single indel event. This approximation, which is justified if one is considering evolutionary timespans t ≪ 1/(δℓ), where δ is the indel rate per site and ℓ is the gap length, considerably simplifies the task of calculating gap probabilities [67, 78-83]. At longer timescales it is necessary to consider multiple-event trajectories, but (as a simplifying approximation) one can still truncate the trajectory at a finite number of events. A problem with this approach is that many different trajectories will generally be consistent with an observed mutation. Summing over all such trajectories, to compute the probability of observing a particular configuration after finite time (e.g. the observed gap length distribution), is a nontrivial problem. In analyzing the long indel model, a generalization of TKF91 with arbitrary length distributions for instantaneous indel events, Miklós et al. [84] make the claim that the existence of a conserved residue implies the alignment probability is factorable at that point (since no indel has ever crossed the boundary). They use a numerical sum over indel trajectories to approximate the probability distribution of observed gap lengths. Although they used a reversible model, their approach generalizes readily to irreversible models. This work builds on an earlier model which allows long insertions, but only single-residue deletions [85].
Recent work by Ezawa has put this finite-event approximation on a more solid footing by developing a rigorous algebraic definition of equivalence classes for event trajectories [53-55]. Solutions obtained using finite-event approximations will not exactly satisfy eq. 3. There will be some error in the probability, and in general the error will be greater on longer branches, as the main assumption behind the approximation (that there are no overlapping indels in the time interval, or that there is a finite limit to the number of overlapping indels) starts to break down. However, since these are principled approximations, it should be possible to form some conclusions as to the severity of the error and its dependence on model parameters. Simulation studies have also been of some help in assessing the error of these approximations. For context-dependent substitution processes, such as models that include methylation-induced CpG deamination, a clever approach was developed in [44]. Rather than considering a finite-event trajectory, they develop an explicit Taylor series for the matrix exponential (eq. 2) and then truncate this Taylor series. Specifically, the rate matrix for a finite-length sequence is constructed as a sum of rate matrices operating locally on the sequence, using the Kronecker sum ⊕ and Kronecker product ⊗ to concatenate rate matrices. These operators may be understood as follows, for an alphabet of n symbols. Suppose that M_m is the set of all matrices indexed by m-mers, so that if A ∈ M_m, then A is an n^m × n^m matrix. Let i, j be m-mers, k, l be n-mers, and ik, jl the concatenated (m + n)-mers. If A ∈ M_m and B ∈ M_n, then A ⊕ B and A ⊗ B are both in M_{m+n} and are specified by

(A ⊗ B)_{ik,jl} = A_{i,j} B_{k,l}
(A ⊕ B)_{ik,jl} = A_{i,j} δ_{k,l} + δ_{i,j} B_{k,l}

where δ_{i,j} is the Kronecker delta. Commuting terms in the Taylor series for exp(Rt) can then be systematically rearranged into a quickly-converging dynamic programming recursion.
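The Kronecker operators, and the factorization they imply for independently evolving blocks, can be sketched numerically. The rate matrices below are random generators invented for the example; the identity being checked, exp((A ⊕ B)t) = exp(At) ⊗ exp(Bt), holds because A ⊗ I and I ⊗ B commute:

```python
import numpy as np

def kron_sum(A, B):
    """Kronecker sum A ⊕ B = A ⊗ I + I ⊗ B, acting on concatenated blocks."""
    return np.kron(A, np.eye(B.shape[0])) + np.kron(np.eye(A.shape[0]), B)

def expm(M, terms=40):
    # truncated Taylor series for exp(M); adequate for the small,
    # well-scaled matrices used in this sketch
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

rng = np.random.default_rng(1)

def random_rate(n):
    # random generator: nonnegative off-diagonals, rows summing to zero
    R = rng.random((n, n))
    np.fill_diagonal(R, 0.0)
    np.fill_diagonal(R, -R.sum(axis=1))
    return R

A, B = random_rate(2), random_rate(3)
t = 0.5
# blocks evolving independently: exp((A ⊕ B) t) = exp(At) ⊗ exp(Bt)
assert np.allclose(expm(kron_sum(A, B) * t),
                   np.kron(expm(A * t), expm(B * t)))
print("exp(A ⊕ B) = exp(A) ⊗ exp(B) verified")
```

For context-dependent rates the local matrices no longer commute, which is precisely why the cited approach must rearrange, rather than simply factorize, the Taylor series.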
This approach was first used by [44] and further developed to include model-fitting algorithms [86], application to phylogenetic trees [87], and discussion of the associated eigensystem [45, 62]. It remains to be seen to what extent such an approach offers a practical solution for general indel models, where the instantaneous transitions are between sequences of differing lengths. Such is the difficulty of solving long indel models that several authors have performed simulations to investigate the empirical gap length distributions that are observed after finite time intervals for various given instantaneous indel-rate models. These observed gaps can arise from multiple overlapping indel events, in ways that have so far defied straightforward algebraic characterization. In recent work, Rivas and Eddy [56] have shown that if an underlying model has a simple geometric length distribution over instantaneous indel events, the observed gap length distribution at finite times (accounting for the possibility of multiple, overlapping indels) cannot be geometric. Rivas and Eddy report simulation studies supporting this result, and go on to propose several models incorporating hidden information (such as fragment boundaries, à la TKF92) whose finite-time distributions have the advantage of being well fit by HMMs. It has long been known that the lengths of empirically observed indels are more accurately described by a power-law distribution than a geometric distribution [46, 47, 88-91], and that alignment algorithms may benefit from using such length distributions, which lead to generalized convex gap penalties, rather than the computationally more convenient geometric distribution, which leads to an affine or linear gap penalty [92, 93]. For molecular evolution purposes in particular, it is known that over-reliance on affine gap penalties leads to serious underestimates of the true lengths of natural indels [94].
For almost as long, it has been known that using a mixture of geometric distributions, or (considered in score space rather than probability space) a piecewise-linear gap penalty, mitigates some of these problems in sequence alignment [94-96]. Taken together, these results suggest that simple HMM-like models, which are most efficient at modeling geometric length distributions, may be fundamentally limited in their ability to fully describe indels; that adding more states (yielding length distributions that arise from sums of geometric random variates, such as negative binomial distributions, or mixtures of geometric distributions) can lead to an improvement; and that generalized HMMs, which can model arbitrary length distributions at the cost of some computational efficiency [97], may be most appropriate. For example, the abovementioned "long indel" model of Miklós et al. uses a generalized pair HMM [84], as does the HMM of [98]. It is even conceivable that some molecular evolution studies in the future will abandon HMMs altogether, although they remain very convenient for many applications. The recent work of Ezawa has some parallels with, but also differences from, the work of Rivas and Eddy [53-55]. Ezawa criticizes over-reliance on HMM-like models, and insists on systematic derivation from simple instantaneous models. He puts the intuition of Miklós et al. [84] on a more formal footing by introducing an explicit notation for indel trajectories and the concept of "local history set equivalence classes" for equivalent trajectories. Ezawa uses this concept to prove that alignment likelihoods for long-indel and related models are indeed factorable, and investigates, by numerical computation and analysis (with confirmation by simulation), the relative contribution of multiple-event trajectories to gap length distributions.
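The contrast between a single geometric gap distribution (an affine penalty in log space) and a mixture of geometrics (a convex penalty with a heavier tail) is easy to see numerically. The component weights and parameters below are arbitrary illustrations, not fitted values:

```python
import math

def geom(k, p):
    """P(length = k) for a geometric distribution on k >= 1."""
    return (1.0 - p) ** (k - 1) * p

def mixture(k, comps):
    """Mixture of geometrics; comps is a list of (weight, p) pairs."""
    return sum(w * geom(k, p) for w, p in comps)

# illustrative short-gap and long-gap components
comps = [(0.7, 0.5), (0.3, 0.02)]

def penalty(dist, k):
    """Gap penalty at length k, i.e. -log probability."""
    return -math.log(dist(k))

# single geometric: the per-residue penalty increment is constant
# (-log(1-p)), which is exactly an affine gap penalty
p = 0.3
assert abs((penalty(lambda k: geom(k, p), 2) - penalty(lambda k: geom(k, p), 1))
           - (penalty(lambda k: geom(k, p), 101) - penalty(lambda k: geom(k, p), 100))) < 1e-9

# mixture: the increment shrinks with k, giving a convex penalty
# whose slope shifts from the short-gap to the long-gap component
mix = lambda k: mixture(k, comps)
short_slope = penalty(mix, 2) - penalty(mix, 1)
long_slope = penalty(mix, 101) - penalty(mix, 100)
print(short_slope, long_slope)
assert short_slope > long_slope
```

The shrinking slope is what lets a mixture mimic, over a limited range, the power-law tails seen in empirical indel data.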
Ezawa's results also show that the effects on the observed indel lengths due to overlapping indels become more significant as the indels get larger, making the problem particularly acute for genomic alignments, where indels can be much larger than in proteins. A number of excellent, realistic sequence simulators are available, including DAWG [99], INDELible [100], and indel-Seq-Gen [101]. Consider now the extension of these results from pairwise alignments, such as TKF91 and the "long indel" model, to multiple alignments (with associated phylogenies). Some of the approaches to this problem use Markov chain Monte Carlo (MCMC); some use finite-state automata; and there is also some overlap between these categories (i.e. MCMC approaches that use automata). MCMC is the most principled approach to integrating phylogeny with multiple alignment. In principle, an MCMC algorithm for phylogenetic alignment can yield the posterior distribution of alignments, trees, and parameters for any model whose pairwise distribution can be computed. This includes long indel models and also, in principle, other effects such as context-dependent substitutions. Of the MCMC methods reported in the literature, some just focus on alignment and ancestral sequence reconstruction [65]; others on simultaneous alignment and phylogenetic reconstruction [79-81, 83, 102, 103]; some also include estimation of evolutionary parameters such as dN/dS [104]; and some (focused on RNA sequences) attempt prediction of secondary structure [105, 106]. In practice these mostly use HMMs, or dynamic programming of some form, in common with the methods of the following section. It is of course possible to use HMM-based or other MCMC approaches to propose candidate reconstructions, and then to accept or reject those proposals (in the manner of Metropolis-Hastings or importance sampling) using a more realistic formulation of the indel likelihood.
Ezawa's methods, and others that build on them or are related to them, may be useful in this context. For example, Ezawa's formulation was used to calculate the indel component of the probability of a fixed multiple sequence alignment (MSA) resulting from sequence evolution along a fixed tree [53]. He also developed an algorithm to approximately calculate the indel component of the MSA probability using all MSA-compatible parsimonious indel histories [54], and applied it to some analyses of simulated MSAs [107]. Using such realistic likelihood calculations as a post-processing "filter" for coarser, more rapid MCMC approaches that sample the space of possible reconstructions could be a promising approach. The dynamic programming recursion for pairwise alignment reported for the TKF91 model [63] can be exactly extended to alignment of multiple sequences given a tree [108, 109]. This works essentially because the TKF91 joint distribution over ancestor and descendant sequences can be represented as a pair HMM; the multiple-sequence version is a multi-sequence HMM [65]. This approach can be generalized using finite-state transducer theory. Transducers were originally developed as modular representations of sequence-transforming operations for use in speech recognition [110]. In bioinformatics, they offer (among other things) a systematic way of extending HMM-like pairwise alignment likelihoods to trees [67, 68, 111, 112]. Other applications of transducer models in bioinformatics have included copy number variation in tumors [113], protein family classification [114], DNA-protein alignment [115], and error-correcting codes for DNA storage [116]. A finite-state transducer is a state machine that reads an input tape and writes to an output tape [117]. A probabilistically weighted finite-state transducer is the same, but its behavior is stochastic [110].
For the purposes of bioinformatics sequence analysis, a transducer can be thought of as being just like a pair HMM, except that where a pair HMM's transition and emission probabilities are normalized so as to describe joint probabilities, a transducer's probabilities are normalized so as to describe conditional probabilities, like the entries of the matrix M(t) (eq. 2). More specifically, if i and j are sequences, then one can define the matrix entry A_{i,j} to be the Forward score for those two sequences using transducer A. Thus, the transducer is a compact encoding for a square matrix of countably infinite rank, indexed by sequence states (rather than nucleotide or amino acid states). The utility of transducers arises because for many purposes they can be manipulated analogously to matrices, while being more compact than the corresponding matrix (as noted above, matrices describing the evolution of arbitrary-length sequences are impractically, or even infinitely, large). If A and B are finite transducers encoding (potentially infinite) matrices A and B, then there is a well-defined operation called transducer composition yielding a finite transducer AB that represents the matrix product AB. There are other well-defined transducer operations corresponding to the various other linear algebra operations used in this paper: the Hadamard product (•) corresponds to transducer intersection, the Kronecker product (⊗) corresponds to transducer concatenation, and the scalar product (·) and the unit vector δ(j) can also readily be constructed using transducers. Consequently, eq. 4 can be interpreted directly in terms of transducers [67, 68, 82]. This has several benefits. One is theoretical unification: eq. 4, using the above linear-algebra interpretation of transducer manipulations, turns out to be very similar to Sankoff's algorithm for phylogenetic multiple alignment [118].
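The correspondence between transducer composition and matrix multiplication can be sketched with a toy weighted transducer. For simplicity the machines below are epsilon-free (every arc reads one input symbol and writes one output symbol), so they only relate equal-length sequences; the machines and weights are invented for the example:

```python
from collections import defaultdict
from itertools import product

# arcs[state] = list of (in_sym, out_sym, weight, next_state); finals is a set
def score(arcs, finals, state, x, z):
    """Total weight of paths reading input x and writing output z."""
    if not x:
        return 1.0 if state in finals and not z else 0.0
    if not z:
        return 0.0
    total = 0.0
    for (i, o, w, nxt) in arcs.get(state, []):
        if i == x[0] and o == z[0]:
            total += w * score(arcs, finals, nxt, x[1:], z[1:])
    return total

def compose(A, A_finals, B, B_finals):
    """Transducer composition AB: match A's output tape to B's input tape.
    This is the automaton analog of a matrix product."""
    arcs = defaultdict(list)
    for qa, arcs_a in A.items():
        for qb, arcs_b in B.items():
            for (i, y, wa, na) in arcs_a:
                for (y2, o, wb, nb) in arcs_b:
                    if y == y2:
                        arcs[(qa, qb)].append((i, o, wa * wb, (na, nb)))
    return dict(arcs), {(fa, fb) for fa in A_finals for fb in B_finals}

# toy single-state machines over the alphabet {a, b}
A = {0: [("a", "a", 0.9, 0), ("a", "b", 0.1, 0), ("b", "b", 1.0, 0)]}
B = {0: [("a", "a", 0.8, 0), ("a", "b", 0.2, 0), ("b", "b", 1.0, 0)]}
AB, AB_finals = compose(A, {0}, B, {0})

# (AB)_{x,z} equals the "matrix product" sum over intermediate sequences y
x, z = "ab", "bb"
direct = score(AB, AB_finals, (0, 0), x, z)
summed = sum(score(A, {0}, 0, x, y) * score(B, {0}, 0, y, z)
             for y in ("".join(p) for p in product("ab", repeat=len(x))))
print(direct, summed)
```

Handling epsilon (insert and delete) arcs, as real alignment transducers must, complicates composition considerably; the point here is only the matrix-product analogy.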
Thus a famous algorithm in bioinformatics is unified with a famous algorithm in likelihood phylogenetics by using a tool from computational linguistics. (This excludes the RNA structure-prediction component of Sankoff's algorithm; that can, however, be included by extending the transducer framework to pushdown automata [119].) Practically, the phylogenetic transducer can be used for alignment [79, 81], parameter estimation [104], and ancestral reconstruction [67], with promising results for improved accuracy in multiple sequence alignment [112]. More broadly, one can think of the transducer as belonging to a family of methods that combine phylogenetic trees (modeling the temporal structure of evolution) with automata theory, grammars, and dynamic programming on sequences (modeling the spatial structure of evolution). The TKF Structure Tree, mentioned above, is in this family too: it can be viewed as a context-free grammar, or as a transducer with a pushdown stack [75]. The HMM-like nature of TKF91, and the ubiquity of HMMs and dynamic programming in sequence analysis, has motivated numerous approaches to indel analysis based on pair HMMs [56, 71, 74, 78, 120], as well as many other applications of phylogenetic HMMs [6, 7, 12, 121, 122] and phylogenetic grammars [8, 10, 40, 60, 123, 124]. In most of these models, an alignment is assumed fixed and the HMM or grammar is used to partition it; however, in principle, one can combine the ability of HMMs/grammars to model indels (and thus impute alignments) with the ability to partition sequences into differently evolving regions. The promise of using continuous-time Markov chains to model indels has been partially realized by automata-theoretic approaches based on transducers and HMMs. Recent work by Rivas and Eddy [56] and by Ezawa [53-55] may be interpreted as both good and bad news for automata-theoretic approaches.
it appears that closed-form solutions for observed gap length distributions at finite times, and in particular the geometric distributions that simple automata are good at modeling, are still out of reach for realistic indel models, and indeed (for simple models) have been proven impossible [56]. further, simulation results have demonstrated that geometric distributions are not a good fit to the observed gap length distributions even when the underlying indel model has geometrically-distributed lengths for its instantaneous indel events [56]. if the lengths of the instantaneous indels follow biologically plausible power-law distributions, the evolutionary effects due to overlapping indels become larger as the gaps grow longer [54]. that is the bad news (at least for automata). the good news is that the simulation results also suggest that, for short branches and/or gaps (such that indels rarely overlap), the error may not be too bad to live with. approximate-fit approaches that are common in pair hmm modeling and pairwise sequence alignment, such as using a mixture of geometric distributions to approximate a gap length distribution (yielding a longer tail than can be modeled using a pure geometric distribution), may help bridge the accuracy gap [96]. given the power of automata-theoretic approaches, the best way forward (in the absence of a closed-form solution) may be to embrace such approximations and live with the ensuing error. interestingly, the authors of the two recent simulation studies that prompted this commentary come to different conclusions about the viability of automata-based dynamic programming approaches. ezawa [53, 54], arguing that realism is paramount, advocates deeper study of the gap length distributions obtained from simple instantaneous models, while acknowledging that such gap length distributions may be more difficult to use in practice than the simple geometric distributions offered by hmm-like models.
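the mixture-of-geometrics approximation mentioned above is easy to demonstrate numerically. the following sketch uses toy numbers of my own, not values from the cited simulation studies: it fits a single geometric and a two-component geometric mixture to a truncated power-law gap length distribution by coarse grid search, and confirms that the mixture's longer tail gives a better fit in total variation distance.

```python
import numpy as np

# target: a truncated power-law gap length distribution (exponent invented)
L = np.arange(1, 201)                # gap lengths 1..200
target = L ** -1.7
target /= target.sum()

def geometric(p):
    g = p * (1.0 - p) ** (L - 1)
    return g / g.sum()               # renormalize over the truncated support

def tv(q):                           # total variation distance to the target
    return 0.5 * np.abs(target - q).sum()

ps = np.linspace(0.01, 0.9, 45)
geoms = {p: geometric(p) for p in ps}

# (a) best single geometric vs (b) best two-component mixture, by grid search
best_single = min(tv(geoms[p]) for p in ps)
best_mix = min(
    tv(w * geoms[p1] + (1.0 - w) * geoms[p2])
    for p1 in ps for p2 in ps for w in np.linspace(0.1, 0.9, 9)
)

assert best_mix < best_single        # the mixture's longer tail fits better
```

since the mixture family contains every single geometric as a special case, it can only do better; the point of the exercise is how much better it does on a heavy-tailed target.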
rivas and eddy [56], clearly targeting applications (particularly those such as profile hmms), work backward from hmm-like models toward evolutionary models with embedded hidden information. these models may be somewhat mathematically contrived, but are easier to tailor so as to model effects such as position-specific conservation, thus trading (in a certain sense) purism for expressiveness. whichever approach is used, these results are unambiguously good news for the theoretical study of indel processes. the potential benefits of modeling alignment as an aspect of statistical phylogenetics are significant. one can reasonably hope that the advance of theoretical work in this area will continue to inform advances in both bioinformatics and statistical phylogenetics. after all, and in spite of the cambrian explosion in bioinformatics subdisciplines, sequence alignment and phylogeny truly are closely related aspects of mathematical biology.

references (titles only):
- evolution of protein molecules
- a model of evolutionary change in proteins. in: atlas of protein sequence and structure
- evolutionary trees from dna sequences: a maximum likelihood approach
- maximum-likelihood estimation of phylogeny from dna sequences when substitution rates differ over sites
- maximum likelihood phylogenetic estimation from dna sequences with variable rates over sites: approximate methods
- gene finding with a hidden markov model of genome structure and evolution
- combining phylogenetic and hidden markov models in biosequence analysis
- identification and classification of conserved rna secondary structures in the human genome
- an rna gene expressed during cortical development evolved rapidly in humans
- a comparative method for finding and folding rna secondary structures within protein-coding regions
- evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes
- using evolutionary trees in protein secondary structure prediction and other comparative sequence analyses
- using protein structural information in evolutionary inference: transmembrane proteins
- reconstructing large regions of an ancestral mammalian genome in silico
- evolution of coral pigments recreated
- ancestral sequence reconstruction
- crystal structure of an ancient protein: evolution by conformational epistasis
- palaeotemperature trend for precambrian life inferred from resurrected proteins
- fastml: a web server for probabilistic reconstruction of ancestral sequences
- directed evolution of sulfotransferases and paraoxonases by ancestral libraries
- aav ancestral reconstruction library enables selection of broadly infectious viral variants
- enhancing the pharmaceutical properties of protein drugs by ancestral sequence reconstruction
- synthesis of phylogeny and taxonomy into a comprehensive tree of life
- protein molecular function prediction by bayesian phylogenomics
- phylogenetic diversity meets conservation policy: small areas are key to preserving eucalypt lineages
- identification of a novel coronavirus in patients with severe acute respiratory syndrome
- bayesian coalescent inference of past population dynamics from molecular sequences
- unifying the spatial epidemiology and molecular evolution of emerging epidemics
- 'patient 0' hiv-1 genomes illuminate early hiv/aids history in north america
- identifying predictors of time-inhomogeneous viral evolutionary processes
- a simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences
- dating the human-ape splitting by a molecular clock of mitochondrial dna
- revbayes: bayesian phylogenetic inference using graphical models and an interactive model-specification language
- beast: bayesian evolutionary analysis by sampling trees
- raxml-vi-hpc: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models
- hyphy: hypothesis testing using phylogenies
- paml 4: phylogenetic analysis by maximum likelihood
- phylip - phylogeny inference package (version 3.2)
- tree-puzzle: maximum likelihood phylogenetic analysis using quartets and parallel computing
- developing and applying heterogeneous phylogenetic models with xrate
- estimation of evolutionary distances under stationary and nonstationary models of nucleotide substitution
- an evolution model for sequence length based on residue insertion-deletion independent of substitution: an application to the gc content in bacterial genomes
- a stochastic gene evolution model with time dependent mutations
- a nucleotide substitution model with nearest-neighbour interactions
- a generalization of substitution evolution models of nucleotides to genetic motifs
- empirical and structural models for insertions and deletions in the divergent evolution of proteins
- empirical analysis of protein insertions and deletions
- determining parameters for the correct placement of gaps in protein sequence alignments
- indel pdb: a database of structural insertions and deletions derived from sequence alignments of closely related proteins
- sequence context of indel mutations and their effect on protein evolution in a bacterial endosymbiont
- alignment of phylogenetically unambiguous indels in shewanella
- identification of transposable elements using multiple alignments of related genomes
- statistical alignment: computational properties, homology testing and goodness-of-fit
- general continuous-time markov model of sequence evolution via insertions/deletions: are alignment probabilities factorable?
- general continuous-time markov model of sequence evolution via insertions/deletions: local alignment probability computation
- erratum to: general continuous-time markov model of sequence evolution via insertions/deletions: are alignment probabilities factorable?
- parameterizing sequence alignment with an explicit evolutionary model
- multiple genome rearrangement and breakpoint phylogeny
- analytical expression of the purine/pyrimidine codon probability after and before random mutations
- analytical solutions of the dinucleotide probability after and before random mutations
- rna secondary structure prediction using stochastic context-free grammars and evolutionary history
- evolution probabilities and phylogenetic distance of dinucleotides
- genome evolution by transformation, expansion and contraction (getec)
- an evolutionary model for maximum likelihood alignment of dna sequences
- an introduction to probability theory and its applications
- evolutionary hmms: a bayesian approach to multiple alignment
- using guide trees to construct multiple-sequence evolutionary hmms
- accurate reconstruction of insertion-deletion histories by statistical phylogenetics
- a note on probabilistic models over strings: the linear algebra approach
- statistical alignment based on fragment insertion and deletion models
- evolutionary inference via the poisson indel process
- inching toward reality: an improved likelihood model of sequence evolution
- models of sequence evolution for dna sequences containing gaps
- evolutionary models for insertions and deletions in a probabilistic modeling framework
- probabilistic phylogenetic inference with insertions and deletions
- a probabilistic model for the evolution of rna structure
- pair stochastic tree adjoining grammars for aligning and predicting pseudoknot rna structures
- a probabilistic model for sequence alignment with context-sensitive indels
- sequence alignments and pair hidden markov models using evolutionary history
- joint bayesian estimation of alignment and phylogeny
- bali-phy: simultaneous bayesian inference of alignment and phylogeny
- incorporating indel information into phylogeny estimation for rapidly emerging pathogens
- phylogenetic automata, pruning, and multiple alignment
- handalign: bayesian multiple sequence alignment, phylogeny, and ancestral reconstruction
- a long indel model for evolutionary sequence alignment
- an improved model for statistical alignment
- chain monte carlo expectation maximization algorithm for statistical analysis of dna sequence evolution with neighbor-dependent substitution rates
- accurate estimation of substitution rates with neighbor-dependent models in a phylogenetic context
- patterns of insertion and deletion in mammalian genomes
- exhaustive matching of the entire protein sequence database
- pattern and rate of indel evolution inferred from whole chloroplast intergenic regions in sugarcane, maize and rice
- patterns of nucleotide substitution, insertion and deletion in the human genome inferred from pseudogenes
- the size distribution of insertions and deletions in human and rodent pseudogenes suggests the logarithmic gap penalty for sequence alignment
- problems and solutions for estimating indel rates and length distributions
- uncertainty in homology inferences: assessing and improving genomic sequence alignment
- sequence comparison with concave weighting functions
- probabilistic consistency-based multiple sequence alignment
- prediction of complete gene structures in human genomic dna
- indelign: a probabilistic framework for annotation of insertions and deletions in a multiple alignment
- dna assembly with gaps (dawg): simulating sequence evolution
- indelible: a flexible simulator of biological sequence evolution
- biological sequence simulation for testing complex evolutionary hypotheses: indel-seq-gen version 2.0
- statalign: an extendable software package for joint bayesian estimation of alignments and evolutionary trees
- advances in neural information processing systems
- erasing errors due to alignment ambiguity when estimating positive selection
- statalign 2.0: combining statistical alignment with rna secondary structure prediction
- simulfold: simultaneously inferring rna structures including pseudoknots, alignments, and trees using a bayesian mcmc framework
- characterization of multiple sequence alignment errors using complete-likelihood score and position-shift map
- pacific symposium on biocomputing
- an efficient algorithm for statistical multiple alignment on arbitrary phylogenetic trees
- weighted finite-state transducers in speech recognition
- automata-theoretic models of mutation and alignment
- historian: accurate reconstruction of ancestral sequences and evolutionary rates
- phylogenetic quantification of intra-tumour heterogeneity
- protein family classification using sparse markov transducers
- modular non-repeating codes for dna storage
- a method for synthesizing sequential circuits
- simultaneous solution of the rna folding, alignment, and protosequence problems
- evolutionary triplet models of structured rna
- mcalign2: faster, accurate global pairwise alignment of non-coding dna sequences based on explicit models of indel evolution
- a hidden markov model approach to variation among sites in rate of evolution
- phylogenetic estimation of context-dependent substitution rates by maximum likelihood
- rna secondary structure prediction using stochastic context-free grammars
- xrate: a fast prototyping, training and annotation tool for phylo-grammars

acknowledgements: the author thanks kiyoshi ezawa, elena rivas, sean eddy, jeff thorne, benjamin redelings, marc suchard, and one anonymous referee for productive conversations that have informed this review. this work was supported by nih/nhgri grant hg004483. authors' contributions: ih wrote the article. competing interests: the author declares that they have no competing interests.
springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. received: 9 january 2017; accepted: 30 april 2017.

key: cord-269559-gvvnvcfo authors: kergaßner, andreas; burkhardt, christian; lippold, dorothee; kergaßner, matthias; pflug, lukas; budday, dominik; steinmann, paul; budday, silvia title: memory-based meso-scale modeling of covid-19: county-resolved timelines in germany date: 2020-08-03 journal: comput mech doi: 10.1007/s00466-020-01883-5 sha: doc_id: 269559 cord_uid: gvvnvcfo

the covid-19 pandemic has led to an unprecedented world-wide effort to gather data, model, and understand the viral spread. entire societies and economies are desperate to recover and get back to normality. however, to this end accurate models are of essence that capture both the viral spread and the courses of disease in space and time at reasonable resolution. here, we combine a spatially resolved county-level infection model for germany with a memory-based integro-differential approach capable of directly including medical data on the course of disease, which is not possible when using traditional sir-type models. we calibrate our model with data on cumulative detected infections and deaths from the robert-koch institute and demonstrate how the model can be used to obtain county- or even city-level estimates on the number of new infections, hospitalization rates and demands on intensive care units. we believe that the present work may help guide decision makers to locally fine-tune their expedient response to potential new outbreaks in the near future. the covid-19 pandemic continues to hold our way of life on this planet in a tight grip. over the whole world, we have now reached more than 10 million infections [1].
while new infections still rise at an alarming pace in the united states, brazil, or india, most other asian and european countries that were hit much earlier by the pandemic seem to have succeeded in reducing the number of new daily cases. this success can largely be attributed to fast and locally tailored political measures that introduced severe travel restrictions [2-6] and curtailed public life. initially, the measures were met by a largely understanding general public. however, the partial necessity of police enforcement and increasing protest against contact restrictions, locally even encouraged by politicians [7], demonstrate rising anger, fear, or even mental health problems due to the current situation [8, 9]. thus, it is critical to carefully reopen the economy and reestablish public life, while avoiding a relapse and a potential collapse of the health-care system, which may entail much stricter measures again. to reach this goal, however, decisions must be made quickly and often locally at county level, based on reliable data and trustworthy predictions. clearly, accurate models are of essence to capture the disease dynamics at exactly this spatial meso-scale, to predict the number of new infections per day or the number of patients that may require intensive care. here, we focus on the situation in germany, where county-resolved daily and cumulative infection cases are reliably reported by the robert-koch institute (rki) [10]. we combine two previous modeling advances [11, 12] into a locally resolved, history-type model that captures the spatiotemporal evolution of the pandemic in germany. we use a generalization of typically known sir-type compartment models that allows for a much better representation of the courses of disease [12]. while becoming infected is well represented by a simple ordinary differential equation (ode), the remaining course of disease is captured rather restrictively by these ode-based, sir-type models [13].
based on the integro-differential model introduced by kermack and mckendrick already in 1927 [14] and recently reintroduced by [12, 13], we model the spatial spread of covid-19 in the following way:

ṡ(t, x) = −s(t, x) ∫_ω β(t, x, y) ∫_0^∞ γ_i(τ) [−ṡ(t − τ, y)] dτ dy,  s(s, x) = s_0(s, x),  (1)

for all s < 0, t > 0 and x ∈ ω, with ω ⊂ r^2 open denoting the considered spatial region. the normalized initial history datum is given by s_0, and s denotes the normalized number of susceptibles. the weight γ_i ∈ l^1((0, ∞); r_≥0) with ||γ_i||_{l^1((0,∞))} = 1 describes the evolution of infectiousness, where γ_i(τ) defines the infectiousness of an individual at τ days after the infection event. the interaction term β ∈ l^∞((0, ∞) × ω^2; r_≥0) denotes the interaction between the infectious and the susceptible population. the considered balance law is of nonlocal-history type. nonlocality as well as history in balance laws are receiving increasing attention to model real world phenomena. they provide a more detailed way to model evolution and can be seen as the mesoscopic link between purely macroscopic and fully microscopic models. in the considered application, the microscopic equivalent, agent models [15-17], can be interpreted as a measure-valued solution to the proposed model. the classically used compartment models (sir, seir) [18] have been widely used to model the viral spread. their recently revealed relationship to hamiltonian mechanics is quite insightful [19], demonstrating that they constitute a mere simplification of the here considered integro-differential equations. in terms of the spatial resolution (which can of course also be modeled in the compartment models [20, 21]), the classical sir model can be seen as the singular limit of the interaction term, i.e. β(·, x, y) → b(·) δ(x − y) for a given b ∈ l^∞((0, ∞); r_≥0). the models are generalized with respect to the evolution of infectiousness of infected individuals.
the considered model can represent, based on medical data, any course of infectiousness, in contrast to, e.g., an assumed exponential decay in the widely used sir model. as introduced in [11], we discretize our spatial domain ω, germany, at county level (or even city level), where current containment rules are steadily evaluated and adapted in case local infection numbers rise up again. our county-interaction network is adapted from the global epidemic and mobility (gleam) model [22, 23], focusing on mid- and short-range interactions motivated by the severely restricted air travel [24, 25]. taken together, this spatially resolved integro-differential model allows us to accurately analyze and predict disease dynamics at its various stages and the effect of local measures. to model the spread of the disease in a discretized spatial setting, we consider a finite partition of the domain ω, i.e. ω = ω_1 ∪ . . . ∪ ω_n, where n denotes the number of counties or cities in germany, depending on the spatial resolution. we obtain the following memory-type vector-valued initial value problem:

ṡ(t) = −ŝ(t) b(t) ∫_0^∞ γ_i(τ) [−ṡ(t − τ)] dτ,  s(s) = s_0(s) for all s < 0,

where s_0 is the vector-valued normalized initial history datum. we introduce •̂ to denote the transformation of a vector • into a quadratic diagonal matrix, where the entries along the diagonal equal those of the vector. the time-dependent, vector-valued function s ∈ w^{1,∞}((0, ∞); [0, 1]^n) denotes the normalized number of susceptibles. the matrix-valued function b ∈ l^∞((0, ∞); r^{n×n}_≥0) denotes the infection rates and interaction between the considered regions ω_k with k ∈ {1, . . . , n}. the existence and uniqueness of a solution of the proposed integro-differential equations (continuous as well as discretized in space) is proven e.g. in [12] for all γ_i for which there exists ε > 0 s.t. γ_i|_(0,ε) ≡ 0. this is a rather natural condition, since the incubation time (the period during which the infected are not yet infectious) is positive.
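stripped of the spatial coupling, the memory structure of the model can be sketched as a daily-step renewal equation for a single region. this is my own toy discretization with hypothetical parameter values, not the calibrated model: new infections on day t are proportional to the current susceptibles times a γ_i-weighted sum over past incidence, with γ_i vanishing during the incubation period as required above.

```python
import numpy as np

# toy single-region memory model: j[t] = beta * s[t-1] * sum_tau gamma_i[tau] * j[t-tau]
# gamma_i is zero during a 3-day incubation period (the well-posedness
# condition quoted above) and sums to 1, so beta plays the role of r0
# while s is close to 1. all numbers are invented for illustration.
T = 120
gamma_i = np.zeros(10)
gamma_i[3:] = 1.0            # infectious on days 3..9 after infection
gamma_i /= gamma_i.sum()

beta = 2.5                   # effective r0 at the start of the outbreak
j = np.zeros(T)              # daily new infections (fraction of population)
s = np.ones(T)               # normalized susceptibles
j[0] = 1e-4                  # seed infections on day 0
s[0] = 1.0 - j[0]

for t in range(1, T):
    memory = sum(gamma_i[tau] * j[t - tau]
                 for tau in range(1, len(gamma_i)) if t - tau >= 0)
    j[t] = beta * s[t - 1] * memory
    s[t] = s[t - 1] - j[t]

assert j.max() > j[0]                # the outbreak grows from the seed
assert np.all(np.diff(s) <= 0)       # susceptibles never increase
assert s[-1] < 1.0 / beta            # final size overshoots the 1/r0 threshold
```

replacing gamma_i with an exponentially decaying weight recovers the classical sir behavior mentioned in the text; an arbitrary measured infectiousness profile drops in without changing the loop.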
based on the history of s, other quantities and subgroups can be determined directly from s by including medical data on the various courses and infectiousness levels of the disease via corresponding integration weights: we distinguish between the states infectious γ_i, symptomatic γ_s, tested and quarantined γ_q, hospitalized γ_h, in intensive care γ_icu, recovered γ_r and deceased γ_d. following the contribution of [ ], we distinguish four courses of disease progression in our model: (a) light symptoms, recovering without hospitalization, 95% share; (b) hospitalization, recovering without intensive care, 4% share; (c) patients in intensive care and recovering, 0.4% share; (d) patients in intensive care that eventually die, 0.6% share. figure 1 depicts the four different courses of the disease as represented in our model, including their corresponding state transitions and infectiousness levels. note that only the weighted sum γ_i = ∑_{k=1}^{4} share_k · γ_{i,k} is necessary for the solution of eq. (1), but the individual contributions allow for detailed descriptions of disease progression from medical data and corresponding post-processing. we normalize the integral over γ_i such that it represents the probability of infection. we further assume that patients in courses (b) to (d) will be tested positive and are thereby considered in reported infection numbers. the ratio of total versus detected infections is defined as the dark ratio ω, thereby representing the factor of unknown cases. since the dark ratio is not necessarily constant in space, this is taken into account by locally scaling the function γ_q that represents the detected and quarantined state of course (a) to the appropriate value. since the dark ratio seems to closely correlate with testing capacities [11], we introduce federal-state-wise dark ratios ω_j, assembled in the vector ω, that vary only over states with j ∈ {1, . . . , 16}. due to locally differing behavior patterns and, in particular, political measures to reduce social contacts, the infection rates vary in space and time. based on our previous findings in [11], we introduce federal-state-wise initial infection rates β_j0 with j ∈ {1, . . . , 16}, and two reduction factors β_red1 and β_red2 representing the major restrictions of 1) cancelling large events and 2) contact restrictions. those model parameters are calibrated using data reported by the rki, as described in sect. 2.3. since the shut-down measures were introduced at slightly different times t_j1 and t_j2 in the different federal states of germany, we model the time-dependent reduction of infection rates via piece-wise constant functions β̃_j1(t) and β̃_j2(t) to obtain the overall infection rates in each state,

β_j(t) = β_j0 · β̃_j1(t) · β̃_j2(t).  (2)

to model cross-county interactions, we adapt the gleam short- and mid-range mobility network [22, 23] as introduced in [11] and capture cross-county infections via eq. (3), where β_k are time-dependent infection rates, c_k are the cross-county infection weights, n_k is the number of inhabitants in the largest city of county k, n_max = 3e6 corresponds to the number of inhabitants in germany's largest city berlin, and r_kl is the distance between counties k and l. importantly, β_k and c_k are identical for all counties within one federal state, such that eq. (3) introduces 16 additional model parameters c_j with j ∈ {1, . . . , 16} which need to be calibrated based on reported data in the literature (see sect. 2.3). the parameters for the gleam model are taken from [22] and given in table 1. figure 2b displays the mobility network across germany [11]. note that both the county-internal as well as the cross-county infections contribute to the basic reproduction number in our spatial model.
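the staged reduction of the infection rates can be sketched as a small helper. note two assumptions of mine: the multiplicative combination of the two reduction factors is one plausible reading of the construction, and all numbers below are placeholders rather than the calibrated values.

```python
# piecewise-constant infection rate beta_j(t): a state-wise base rate beta_j0,
# reduced once when large events are cancelled (t >= t_j1) and again when
# contact restrictions start (t >= t_j2). multiplicative combination and all
# numeric values are illustrative assumptions, not the paper's calibration.

def infection_rate(t, beta_j0, beta_red1, beta_red2, t_j1, t_j2):
    """Return beta_j(t) for day t in federal state j."""
    b = beta_j0
    if t >= t_j1:
        b *= beta_red1          # large events cancelled
    if t >= t_j2:
        b *= beta_red2          # contact restrictions in place
    return b

# example: base rate 0.4, 30% cut on day 14, further 50% cut on day 20
assert infection_rate(0,  0.4, 0.7, 0.5, 14, 20) == 0.4
assert infection_rate(15, 0.4, 0.7, 0.5, 14, 20) == 0.4 * 0.7
assert infection_rate(25, 0.4, 0.7, 0.5, 14, 20) == 0.4 * 0.7 * 0.5
```

keeping the reduction times t_j1, t_j2 as parameters mirrors the state-wise staggering of the shut-down measures described above.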
for the distinction of individual counties, we use the municipal directory (gemeindeverzeichnis) from the german federal statistical office (statistisches bundesamt) [27], which delivers area, shape and population data. we consider city-wise population data, which is accumulated over the entire county or corresponding spatial domain for the respective model. the center of population of each county serves as its spatial coordinates. the detailed description of the courses of disease allows for elaborate post-processing of the solution to evaluate any described quantity. generally, evaluation differs for cumulative and current quantities. the number of cumulative discovered infections q or the number of deceased d are evaluated by double integrals such as eq. (4); current quantities like the infectious i(t), those with symptoms, hospitalized or icu patients follow from expressions such as eq. (5). from data, the number of positively tested people on day zero is known, but the integro-differential equation model requires initial values for the infected at each spatial node as well as an initial history as a starting point for the integration. initially, we assume exponential growth in all counties described by the exponential ansatz of eq. (6) with ν = 0.345, obtained from fitting an exponential function to rki data on cumulative covid-19 cases in all of germany from march 2 to march 6. the initial estimated number of infected ε at time t = 0 in each county can be calculated by combining eq. (4), the number of initially reported cases q_0, eq. (6) and the time derivative ṡ. from the result and eq. (6), the initial history can be estimated. the high-resolution network model brings with it the challenge of spatially consistent initial conditions. however, most counties did not yet have any known cases on the very first day, limiting the possibility of simply scaling overall initial infections per county by the dark ratio.
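the exponential initial-history construction can be sketched as follows. the growth rate ν = 0.345 is the value quoted above and ω ≈ 6.5 the germany-wide average dark ratio reported below; the county numbers and the history length are invented for illustration.

```python
import numpy as np

# build an initial history of normalized susceptibles for one county by
# assuming exponential growth of cumulative cases, q(t) = q0 * exp(nu * t),
# scaled by the dark ratio omega to estimate total (detected + undetected)
# infections. toy county numbers; nu and omega as quoted in the text.
nu = 0.345                    # fitted exponential growth rate (per day)
omega = 6.5                   # total / detected infections (dark ratio)
q0 = 40.0                     # detected cumulative cases on day 0 (toy value)
population = 500_000.0        # county population (toy value)

days = np.arange(-14, 1)                       # history on t = -14 .. 0
cumulative_total = omega * q0 * np.exp(nu * days)
s_history = 1.0 - cumulative_total / population

assert np.all(np.diff(s_history) < 0)          # susceptibles strictly decrease
assert np.isclose(s_history[-1], 1.0 - omega * q0 / population)
```

this history then serves as the starting datum s_0 for the memory integral, one vector entry per county.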
thus, for the county-based model we selected the distribution of initial infections according to data for march 16 [10], linearly scaling down the overall number of infections to the number reported on our starting date march 2. to approach our spatially resolved county-model, we followed a cascade optimization strategy. data analysis and preliminary simulations had shown that we require federal-state-dependent dark ratios ω_j and infection rates β_j0, j ∈ {1, . . . , 16} [11]. figure 2a illustrates the estimates for the state-wise dark ratios ω_j, which we obtain by assuming a germany-wide identical mortality of μ = 6‰ [26] and fitting to the individually reported death tolls, with ω ≈ 6.5. using state-wise identified dark ratios ω_j, we first used a coupled system of 16 nodes connecting each federal state to obtain a germany-wide average β and reduction factors β_red1 and β_red2 by fitting the cumulative data for germany. the resulting residual r_1 is minimized using a particle swarm optimization (pso) algorithm described in detail in sect. 2.4, with weights w_i = 1/(t_end − t_start)/max(q_rki), w_d = 1/max(d_rki) and w_s = 0.1/max(q_rki). we then considered state-wise data to fit β_j, j ∈ {1, . . . , 16}, while keeping c = 1, leading to the distribution over germany depicted in fig. 2c. we fit the cumulative number of confirmed infections reported by the rki [10] for the time period from march 2 until april 25 with the cumulative number of detected infections q as defined in eq. (4), normalized by the maximum cumulative number of reported infections from the rki. this is the time period during which the various shutdown measures were in place without any noticeable relaxation. on top of that, we include the change-rate of infections on our last day, april 25, into the residual vector, with weights that are state-wise normalized to balance the contribution of heavily and less affected states.
the residual r_2 is again minimized using the pso algorithm. finally, we increased the resolution to full county level, amounting to a coupled system of 401 nodes. to re-balance the changed influence of the larger network, we iteratively fitted state-wise cross-county weights c_j for j ∈ {1, . . . , 16} to match the state-wise cumulative infections of the 16-node state-wise model (q_sw) with the accumulated numbers from the 401-node county-based model (q_cw) on the last day of the fit. we used a damped gradient-descent-like algorithm to update c_j at iteration i + 1. empirically, we obtained converged cross-county weights within 30 iterations with a limited step size δc_max = 0.25 and a damping exponent ζ = 1.5. the final state-wise distribution of optimized cross-county weights c_j is displayed in fig. 2d. particle swarms are distributed optimization schemes that treat each realization of the d optimization variables as particles with a position x_i and a velocity v_i in a d-dimensional bounded search space. particles are initialized with a uniformly random position within the boundaries of the search space and zero initial speed. for the iteration i > 0, the following set of equations describes the behaviour of any particle:

v_{i+1} = a · v_i + b_loc · r^i_loc (p^i_loc − x_i) + b_glob · r^i_glob (p^i_glob − x_i),  x_{i+1} = x_i + v_{i+1}.

the velocity v_{i+1} is a linear combination of three quantities. the previous velocity v_i, weighted with the constant factor a, results in an inert motion property. the term p^i_loc − x_i represents a force that pulls the particle towards its local attractor p^i_loc, which is the best position this specific particle has visited so far. multiplication with the constant weight b_loc controls the influence of this quantity. in addition to that, the randomized diagonal matrix r^i_loc with values between 0 and 1 enables optimization in varying directions. the global attractor p^i_glob represents the best position any particle has visited so far and works analogously to the local attractor with the factor b_glob.
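the particle update just described, together with a ring-topology neighborhood for the global attractor as introduced below, can be sketched as a minimal pso. the constants here are common textbook choices, not the values from the paper's table 2, and the boundary handling clips to the nearest feasible point.

```python
import numpy as np

# minimal particle swarm optimizer with a ring-topology neighborhood:
# each particle's "global" attractor is the best personal best among itself
# and its two ring neighbors. constants are illustrative, not from table 2.
def pso(f, lo, hi, n_particles=30, iters=200,
        a=0.72, b_loc=1.5, b_glob=1.5, seed=0):
    rng = np.random.default_rng(seed)
    d = len(lo)
    x = rng.uniform(lo, hi, size=(n_particles, d))  # random start positions
    v = np.zeros((n_particles, d))                  # zero initial speed
    p_loc = x.copy()                                # personal best positions
    f_loc = np.array([f(xi) for xi in x])
    for _ in range(iters):
        # ring topology: compare each particle with its left/right neighbor
        neigh = np.stack([np.roll(f_loc, 1), f_loc, np.roll(f_loc, -1)])
        choice = np.argmin(neigh, axis=0)           # 0: left, 1: self, 2: right
        idx = (np.arange(n_particles) + choice - 1) % n_particles
        p_glob = p_loc[idx]
        r1 = rng.random((n_particles, d))           # randomized weights
        r2 = rng.random((n_particles, d))
        v = a * v + b_loc * r1 * (p_loc - x) + b_glob * r2 * (p_glob - x)
        x = np.clip(x + v, lo, hi)                  # 'nearest' boundary rule
        f_new = np.array([f(xi) for xi in x])
        better = f_new < f_loc
        p_loc[better] = x[better]
        f_loc[better] = f_new[better]
    best = int(np.argmin(f_loc))
    return p_loc[best], f_loc[best]

# usage: minimize a shifted quadratic on [-5, 5]^2
x_best, f_best = pso(lambda z: ((z - 1.3) ** 2).sum(),
                     lo=np.array([-5.0, -5.0]), hi=np.array([5.0, 5.0]))
assert f_best < 1e-2
```

the ring neighborhood propagates good positions slowly through the swarm, which is exactly the exploration-versus-convergence trade-off motivated in the text.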
we chose established values for a, b_loc and b_glob as summarized in table 2, used a total of 300 particles and followed the 'nearest' strategy when particles cross boundaries of the search space during optimization [29]. to prevent overly fast convergence to a visited attractor without broad coverage of the search space, we employed a so-called ring topology neighborhood, such that the global attractor of a particle corresponds only to the best local attractor of its two neighbors below and above. this way, good positions are slowly propagated through the whole swarm, allowing for enhanced exploration of the search space, which well balances run-time efficiency and identification of the true global optimum. to validate the model, we evaluated the temporal correlation between model predictions and rki data by computing the pearson correlation coefficient r_p, the coefficient of determination r^2 = r_p^2 and the corresponding p-value to assess statistical significance. figure 3 shows how the optimized spatially resolved memory-based model with 401 network nodes representing each county of germany well reproduces the cumulative confirmed cases in each of its federal states from march 2 until april 25. for cumulative infection data reported by the rki [10], we find astonishing and statistically significant (all p < 1e−12) agreement on the temporal evolution. the only state with an r^2 < 0.98 is bremen, a city-state with overall very low infection numbers and a population of less than 700,000. here our quasi-continuum modeling approach and the underlying exponential growth seem to approach their validity limit, and stochastic effects start to prevail. although only the last data point of reported deaths was considered for parameter identification, the model captures the temporal evolution of covid-19 related deaths in each state of germany with remarkable accuracy (all r^2 > 0.91).
here, we observe the least agreement in the city-state hamburg. in general, the model better captures the evolution in higher-populated states, with overall more infections and higher death tolls. we note that our fitting procedure operates only on state-based information. to further validate our model, we compare county-wise cumulative infection numbers q as reported by the rki and by our model (fig. 4, left). figure 5 shows how the model informs on the temporal evolution of cumulative confirmed cases, with more detailed resolution of the subgroups of the symptomatic, the infectious, the hospitalized, the patients in the icu, as well as the dead. a first kink in the infectious group is clearly visible at the beginning of march, due to the cancellation of major events; the group then drops significantly when contact restrictions become effective shortly afterwards. figure 6 shows the model-predicted spatial distribution, at county resolution, of the infectious, symptomatic, hospitalized, and patients in intensive care, following from the individual disease courses in fig. 1. we consider a period from early march, when the exponential growth of the disease started in germany, until early june, under the assumption that the contact-reduction factors stay in place. in early march, most of the infected were at an early stage of the disease, i.e., most of them were infectious but did not yet have disease-specific symptoms (on average, the first symptoms appear on the fifth day after the infection event [26]). this explains the delay in symptomatic infections clearly visible in fig. 5. in our model, we assume that most of the symptomatic voluntarily quarantine themselves and no longer infect others, implying that infectiousness decreases when people move to the symptomatic group (cf. fig. 1). the infectious state ends at the latest when the symptomatic have been tested positive for the virus and are quarantined.
the symptomatic state of covid-19 lasts approximately nine days on average [26], explaining why the symptomatic group is about double the size of the infectious group in fig. 5. figure 6 also shows the delay in covid-19 cases that need inpatient treatment or even intensive care. as reported in [26], infected individuals are typically hospitalized nine days after the infection event, at a probability of 4.5%. as this is encoded in the courses of the disease, the snapshot on march 2 reports hardly any hospitalized patients. finally, we show how our model can be adapted to locally increase the resolution to individual city or community level. fig. 7: spatial distribution of the infectious (i) on april 2 at county level for all of germany (top) and with locally increased resolution to community level (bottom). the non-densified part of the domain is greyed out for the sake of better visual contrast. zoomed regions show county and community resolution for the counties erlangen, fürth, nürnberg and their rural surroundings. note that the proposed macroscopic model reaches its validity limit for very low daily new infections within one subregion. figure 7 shows the germany-wide county-level simulation (top), with a zoom into the metropolitan area of nürnberg, erlangen and fürth and its surrounding counties. increasing the resolution within this domain to community level (bottom) while maintaining county level for the rest of germany leads to a network of 464 nodes. the zoom-in clearly shows that county infections are dominated by their largest cities, following the three purple areas that represent nürnberg, fürth and erlangen from bottom to top, underpinning our formulation of the cross-county infection terms (cf. eq. (3)). surrounding communities suffer far fewer infections due to their much smaller populations. gray areas correspond to rural public space not assigned to a specific community [27].
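The hospitalization delay just described can be made concrete with a toy shift-and-scale computation; the infection series below is invented, and only the nine-day lag and the 4.5% hospitalization probability come from the figures attributed to [26] in the text.

```python
def expected_admissions(new_infections, lag_days=9, p_hosp=0.045):
    """Shift daily new infections by the hospitalization lag and scale by the
    hospitalization probability; returns expected daily hospital admissions."""
    out = [0.0] * (len(new_infections) + lag_days)
    for day, n in enumerate(new_infections):
        out[day + lag_days] += p_hosp * n
    return out

# toy series: 1000 new infections on day 0, nothing afterwards;
# all expected admissions (0.045 * 1000 = 45) then fall on day 9
adm = expected_admissions([1000] + [0] * 9)
```

This is why a snapshot taken early in the outbreak shows hardly any hospitalized patients even while infections are already growing.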
we have presented a memory-based network model to predict the spatio-temporal outbreak dynamics of the covid-19 pandemic in germany. the model considers the effects of political measures, the cancellation of major events and contact restrictions, and the different possible courses of the disease, which is not possible with traditional sir-type models. it well represents the evolution of confirmed cases and deaths reported by the rki from march 2 until april 25. we have then used the model to predict the further development until june and have provided estimates for the county-wise required capacity of the local health-care system, i.e. the number of patients that require hospitalization or even intensive care. finally, we have demonstrated that the model can be refined to predict the interaction and local outbreak dynamics at community level. by now, medical data on the observed disease progression at most stages of a covid-19 infection is abundantly available and continues to improve. our versatile integro-differential approach directly integrates these data into the model and can easily be extended, corroborating its superiority over standard sir-type models. in general, the model can thus handle an arbitrary number of courses of the disease. similarly, it may be expanded to consider region-dependent demographics or varying capacities and quality of treatment of the health-care system. while the model can serve as a valuable tool to assess the effects of new super-spreader events (which may occur at any time) on the distribution of cases in germany, it reaches its validity limit when the number of infections becomes small. to additionally capture this even smaller scale, a coupling to individual agent-based models [15] [16] [17] may be beneficial.
references:
covid-19 dashboard by the center for systems science and engineering.
last accessed 29
the effect of travel restrictions on the spread of the 2019 novel coronavirus (covid-19) outbreak
effective containment explains subexponential growth in recent confirmed covid-19 cases in china
transmission dynamics of the covid-19 outbreak and effectiveness of government interventions: a data-driven analysis
outbreak dynamics of covid-19 in europe and the effect of travel restrictions
the reproduction number of covid-19 and its correlation with public health interventions
covid-19: trump stokes protests against social distancing measures
using social and behavioural science to support covid-19 pandemic response
mental health and the covid-19 pandemic
covid-19-dashboard. last accessed 28
meso-scale modeling of covid-19 spatio-temporal outbreak dynamics in germany
modeling infectious diseases using integro-differential equations: optimal control strategies for policy decisions and applications in covid-19
why integral equations should be used instead of differential equations to describe the dynamics of epidemics
a contribution to the mathematical theory of epidemics
modelling disease outbreaks in realistic urban social networks
an agent-based computational framework for simulation of competing hostile planet-wide populations
modeling exit strategies from covid-19 lockdown with a focus on antibody tests
the mathematics of infectious diseases
analytical mechanics allows novel vistas on mathematical epidemic dynamics
modelling epidemic processes in complex networks
network-based prediction of the 2019-ncov epidemic outbreak in the chinese province hubei
multiscale mobility networks and the spatial spreading of infectious diseases
modeling the spatial spread of infectious diseases: the global epidemic and mobility computational model
global and local mobility as a barometer for covid-19 dynamics
the role of the airline transportation network in the prediction and predictability of global epidemics
modellierung von beispielszenarien der sars-cov-2-epidemie 2020 in deutschland (modelling of example scenarios of the sars-cov-2 epidemic 2020 in germany)
the particle swarm: explosion, stability, and convergence in a multidimensional complex space
particle swarm optimization in high-dimensional bounded search spaces
acknowledgements: open access funding provided by projekt deal. we cordially thank sarah nistler for tedious data collection and the entire covid-19 modeling group at fau for valuable discussions and feedback on this work.
key: cord-318900-dovu6kha authors: pitschel, t. title: sars-cov-2 proliferation: an analytical aggregate-level model date: 2020-08-22 journal: nan doi: 10.1101/2020.08.20.20178301 sha: doc_id: 318900 cord_uid: dovu6kha an intuitive mathematical model describing the virus proliferation is presented, and its parameters are estimated from time series of observed reported covid-19 cases in germany.
the model replicates the main essential characteristics of the proliferation in a stylized form, and thus can support systematic reasoning about interventional measures (or their lifting) that were discussed during the summer and are currently becoming relevant again in some countries. the model differs in form from elementary sir models, but is contained in the general kermack-mckendrick (1927) model. it is maintained that, compared to elementary sir models, the model represents real proliferation more faithfully at the instantaneous level, leading to an overall more plausible association of model parameters with physical transmission and recovery parameters. the main policy-oriented results are that (1) the mitigation measures imposed in march 2020 in germany were absolutely necessary to avoid health-care resource exhaustion, and (2) fast response is key to containment in case of renewed outbreaks. a model generalization aiming to better represent the true infectiousness profile is stated. the construction of the model was motivated in the course of an analysis of the intensive-care capacity expenditure to be expected from sector-specific lifting of restrictions. this former analysis used a budget-oriented argument to arrive at an indicative estimate of the resource expenditure, but did not analyze dynamics. (concretely, it assumed a constant rate of new infections.) a reasonable question to pose is: if a sector were allowed to reopen, what would the trajectory of infections actually look like, when surely it is not a linear increase? further, can parameters of the local transmission behaviour be derived from the aggregate observed numbers? in the present text, a model capturing the dynamics of the number of infections is developed towards answering these questions. it deliberately contains only a few parameters and is in fact not tailored to a specific stage of the virus proliferation.
though models for tracing the trajectory of infectious diseases exist, for example the intuitive sir model [ken56], which is formulated as a system of scalar differential equations, we believe that physically more realistic descriptions are possible which, moreover, lead to increased accuracy of the estimated parameters. we assume a homogeneous set of individuals which act as unwitting agents in the proliferation. we assume that infection spreads probabilistically from the infected (and still contagious) individuals to any other individual of the set, wherein we assume that each individual is connected randomly to others, but such that all individuals have approximately an equal number of neighbours (=: a1). (in graph-theoretic terminology, the graph of contacts between individuals is a random undirected graph in which each node has about the same edge degree d.) no other assumptions are imposed on the global topology of interconnections. individuals who were once infected cannot be infected again (=: a2). finally, it is assumed here that contagiousness lasts only for a duration t_c, i.e. an infected individual is contagious for the period [0, t_c] after its infection and not at all afterwards (a3). this simplified characteristic is motivated by results on infectiousness found in epidemiological and clinical investigations: in [hlwea20], infection incidence data of the wuhan area is examined and combined with clinical data to derive an infectiousness profile which has most of its weight located at about 7 consecutive days around the symptom onset.^1 [wcg + 20] recorded viral rna load data in sputum, throat swabs and stool and report, for nine patients, viral peak loads of 2.35 · 10^9 copies per ml sputum, declining rapidly from the first day of presentation in almost all patients and decreasing to 10^5 copies per ml within about 10 to 16 days after symptom onset.
(a level of below 10^5 copies per ml sputum, combined with no symptoms and past day 10, has been regarded as warranting discharge of the patient from clinical care with ensuing home isolation.) [ttl + 20] (fig 2) report viral load in posterior oropharyngeal saliva samples decreasing monotonically to below 10^4 copies per ml by day 21 after symptom onset for the majority of 20 non-intubated patients (out of n = 23). changes in population size due to non-disease effects will be ignored; instead n will be considered constant. similarly, changes in the proliferation characteristic due to disease-related reduction of the population will be deemed negligible. assumptions a1 to a3 will be "baseline" assumptions throughout the text, and substantial deviations from them will be discussed in the appendix only. we aim for a numerical formulation of the aggregate evolution in which the randomness is averaged out. for this, let n be the number of agents, and let at t = 0 the number of infected agents x(t) be given as x_0 < n. before t = 0, the number of infected agents shall be zero. to develop the model incrementally, let us momentarily assume that all infected agents remain contagious infinitely long. in a unit time interval, each infected agent is deemed to infect each of its d neighbours, stochastically independently, with probability p. the expected total number of virus receivers per unit time interval then is x(t) · d · p. but not all receivers become infected, because some are already infected. the share of non-infected receivers among all agents is (1 − x(t)/n); therefore the expected number of new infections in unit time is x(t) · d · p · (1 − x(t)/n).

footnote 1: caution in the usage of numbers from pure incidence analysis is required: as a consequence of the way the raw data is obtained in [hlwea20], only infectiousness around the moment of symptom onset is in fact fully observed. this is because earlier transmissions are usually not properly associated with the real primary case, since the primary case does not yet show symptoms; later transmissions are simply inhibited because the primary case is put into quarantine. an epidemiological analysis of incidence data alone is therefore necessarily insufficient to determine "pure" infectiousness. to emphasize the distinction between "pure"/"medical" infectiousness and infectiousness after taking into account the population's socio-characteristics (household structures, current mitigation policies), it is worthwhile to call the density of the latter an "infection incidence profile".

approximating the model evolution as a continuous process even at small time intervals (reasonable given the size of the numbers involved), one concludes, under the assumption of infinitely enduring contagiousness, that x(t) follows

ẋ(t) = d · p · x(t) · (1 − x(t)/n)

for t ≥ 0, with x((−∞, 0)) = 0 and x(0) = x_0. obviously the function x(t) is non-decreasing. to incorporate the finite duration of contagiousness, one determines the number of contagious individuals as the difference of the accumulated number of infected at time t minus the accumulated number of infected prevailing at the earlier time t − t_c, because that share of agents has already had the infection for at least the duration t_c and will have ceased to be infectious at t. consequently, the expected total number of virus receivers is refined towards (x(t) − x(t − t_c)) · d · p. the model with the finite duration of contagiousness thus reads, in expectation,

ẋ(t) = d · p · (x(t) − x(t − t_c)) · (1 − x(t)/n),

with initial conditions as before. both differential equations respectively have a unique solution.^2 the model presented here is not representable by the elementary sir models that involve only instantaneous evaluations of the state variables on the right-hand side of the differential equation, as e.g. equation (2) in [ken56] (see [zml + 20] for a current example of its usage). it is therefore also necessarily different, for example, from [mb20].
the reason for this is a restriction imposed by such formulations, namely that the individual's transition from infection to recovery is modelled using a rate of transition proportional to the number of infected individuals, which corresponds to stochastically occurring recovery and yields an exponential decay characteristic on average. it is known, however, that in reality sars-cov-2 shows a rather deterministic disease progression with regard to infectiousness in time, leading to an end of infectiousness about two to three weeks after the beginning of the infection, based on cell culture (see earlier citations). this clinically supported characteristic is properly represented in equation (2), but not in elementary sir models. on the other hand, the model presented here is conceptually contained in the original (i.e. general) compartmental model of kermack and mckendrick [km27] (which involves a formulation using integrals; see also the comments in [bra17]), for example by setting there ψ = 0.^3 (footnote 3: this holds also for the refinement given in section b.) the advantage of the formulation given here is that it allows a mathematically relatively simple description while still fully accommodating the infectiousness characteristic in generalized form. this simplicity leaves some room to incorporate other, hitherto unconsidered, effects into the model and still retain a model complexity which is amenable to simulation for parameter identification. (footnote 2: instead of considering only a temporally finite and uniform infectiousness, more detail can be incorporated into the differential equation using a convolution term, as shown in appendix b.)

3 analysis of the model and exploratory simulation

for later simulation, it is helpful to make use of the scale invariances inherent in the above differential equations.
if one denotes equation (2), parametrized with d · p, t_c, n and initial value x_0, as "ode(dp, t_c, n, x_0)", then we have the following fact: if t → x(t) is a solution of ode(dp, t_c, n, x_0), then t → x(at) is a solution of ode(a · dp, t_c/a, n, x_0) for any a > 0. this means we can restrict the analysis, for example, to t_c = 1 and vary only d · p and x_0. the other scale invariance is described by "x(·) solution of ode(dp, t_c, n, instead of proceeding further analytically, the evolution of the above equation was examined via computer simulation, for various parameter choices d · p and initial values. the purpose is first to explore the general (i.e. not matched to real data) behaviour of equation (2) (next subsection), and then to fit the parameters to observed real data (section 4). throughout, n = 1.0, t_c = 1 and a (forward-euler) discretization step size of 0.01 were used (corresponding to resolution=100 in the code). the following discusses general features of the model and its behaviour under parameter variations. this is for demonstration only, and arguments on the proliferation phenomena should be taken as schematic. (whether the phenomena occur in the real parametrization is discussed in section 4.) fig 1a shows the evolution behaviour for an arbitrary but temporally constant parameter set. the most striking feature of this graph is that the number of infections asymptotically does not reach the total number n of agents. rather, the limit is a value x(∞) < n which depends on d · p and the initial value. for comparison, the evolution of the number of infections as it would arise under eqn. (1) [with the same d · p parameter] is depicted as a grey dashed line; there, x(t) converges to n independently of the choice of d · p. (in subsequent text, this will be referred to as "bounded exponential growth".)
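A forward-Euler integration of equation (2) with the normalization n = 1.0, t_c = 1 and step size 0.01 stated above can be sketched as follows; the values d·p = 1.42 and x_0 = 10^-4 are merely illustrative here (1.42 reappears later as a fitted value).

```python
def simulate(dp, tc, n, x0, t_end, dt=0.01):
    """Forward-Euler integration of the delay ODE
    x'(t) = dp * (x(t) - x(t - tc)) * (1 - x(t) / n),
    with x(t) = 0 for t < 0 and x(0) = x0."""
    steps = int(round(t_end / dt))
    lag = int(round(tc / dt))   # delay expressed in grid steps
    xs = [x0]
    for k in range(steps):
        x_lag = xs[k - lag] if k >= lag else 0.0
        xs.append(xs[k] + dt * dp * (xs[k] - x_lag) * (1.0 - xs[k] / n))
    return xs

traj = simulate(dp=1.42, tc=1.0, n=1.0, x0=1e-4, t_end=30.0)
```

Consistent with fig 1a, the computed trajectory is non-decreasing and saturates at a limit x(∞) strictly below n.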
the reason for including this curve here and in the following graphs is that it can give a hint on the trajectories of future viruses that may have a much more extended infectiousness interval. in fact, this curve would result if infected individuals remained infectious infinitely long and were not quarantined. in simulations, the dependence of the limit x(∞) on d · p appeared to be generally over-proportional (see fig 1b). this is well-known behaviour also in instantaneous-state models. on the other hand, the dependence of the limit on x_0 was linear or sub-linear. in instantaneous-state models, the limit does not depend on the size of the initiating jump. noteworthy in both cases is that even though the same number of exogenously infected individuals was used as initially, the contagion effect is much smaller. the reason for this is that already about one fifth of the population had been infected (and thus was immune in this model). (preprint, not certified by peer review; the author/funder has granted medrxiv a license to display the preprint in perpetuity. all rights reserved, no reuse allowed without permission. this version posted august 22, 2020. https://doi.org/10.1101/2020.08.20.20178301)

4

we use here the number of reported covid-19 cases (as aggregated by the robert-koch-institut [1]) as a proxy for the number of infections in germany. we fit parameters for the interval until the beginning of may 2020, assuming that the evolution proceeded within two different parameter regimes: first a d · p corresponding to no restrictions, then a d · p corresponding to the restrictions posed by contact discouragement and store closures. (the observational interval used for parameter estimation covers only a few days of the period of obligatory indoor face-mask wearing.) we can derive parameters and, based on them, predict the trajectory of infections going forward.
because of the simplicity of the examined model, there is the risk of a high model error. therefore, at the present state of this text, such estimation can only serve to determine reasonable bounds on the parameters of the model, rather than to give a reliable forecast of the expected number of eventual infections. fitting of parameters is here conducted manually, focussing on moments in the time series that are indicative of parameter changes. at the beginning of april 2020, the number of weekly new covid-19 cases stood at about 40000 in germany. if we regard the modelling time unit to correspond to a real duration of 2 weeks (implying that each newly infected individual is non-contagious two weeks after infection and onwards), then we have a new-infections rate of 80000 individuals per such time unit, which corresponds to an increment of approximately ∆x = 0.001 per unit time after normalizing to n = 1.0. identifying the moment one week after the initial wider lock-down in germany (i.e. around 29th march) as moment t = 3 in the modelling, parameters consequently need to be fitted such that ẋ(3.0+) = 0.001 (green line). (the t = 3.0 also implies that the model assumes around 6 weeks of initial evolution under a low-restrictions scenario, which approximately matches the timeline of the outbreak in germany.) fig 3a shows the trajectory of the system evolution using initially d·p = 1.42 and switching to d·p = 0.7 afterwards. fig 3b shows the evolution if no parameter switch (i.e. no intervention) had happened at t = 3.0. note: the matching is overly simplified for the interval t ∈ [0, 3.0], leading to an overestimated x(t): for example x(3.0) = 0.0025, corresponding to 200000 individuals, while the actually reported number was around 52550. in reality, the d · p must have been larger than 1.42 at the beginning of the interval, but on the other hand closer to (but above) 1.0 in the second half of [0, 3.0].
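The two-regime fit described here is easy to reproduce schematically. The sketch below switches d·p from 1.42 to 0.7 at t = 3 as in the text; the initial value x_0 = 10^-4 is an illustrative choice rather than the fitted one, so the absolute levels differ from fig 3.

```python
def simulate_switch(dp_before, dp_after, t_switch, tc, n, x0, t_end, dt=0.01):
    """Forward Euler for x'(t) = dp(t) * (x(t) - x(t - tc)) * (1 - x(t) / n),
    where dp(t) jumps from dp_before to dp_after at t_switch."""
    steps = int(round(t_end / dt))
    lag = int(round(tc / dt))
    xs = [x0]
    for k in range(steps):
        dp = dp_before if k * dt < t_switch else dp_after
        x_lag = xs[k - lag] if k >= lag else 0.0
        xs.append(xs[k] + dt * dp * (xs[k] - x_lag) * (1.0 - xs[k] / n))
    return xs

with_lockdown = simulate_switch(1.42, 0.7, 3.0, 1.0, 1.0, 1e-4, 10.0)  # cf. fig 3a
no_lockdown = simulate_switch(1.42, 1.42, 3.0, 1.0, 1.0, 1e-4, 10.0)   # cf. fig 3b
```

The growth rate drops visibly right after the switch, and the counterfactual without intervention accumulates far more infections by t = 10.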
assuming an infected individual occupies an intensive-care bed with ventilator (icu) for one to two weeks, the icu capacity in germany currently amounts to about 12500 to 25000 icu cases per week. this allows for a maximum of 87500 to 175000 reported infections per week (assuming the share of cases needing intensive care is around 14.28%), i.e. 175000 to 350000 reported infections per two weeks. this in turn corresponds to a normalized increment of 0.0021875 to 0.0043750 per time unit (a horizontal line somewhere in the upper half of the graphs in fig 3). it is necessary to remark that the conclusion drawn in connection with fig 2e and 2f (that a second outbreak of similar magnitude as the initial one would not effect a substantial increase in the accumulated number of infected individuals) cannot be affirmed for the current scenario (in germany and elsewhere), since that number currently stands at about 0.25% to 0.5% of the total population, rather than at the 1/5 prevailing in the demo scenario of fig 2e and 2f at the onset of the second outbreak. the challenge with lockdown measures for the current corona virus is the following: when imposed, they will show effect only if the basic reproduction number is pushed sufficiently below 1. then, after the number of infected individuals has eventually dwindled, a lifting of the lockdown is tempting; however, even a slight increase of r_0 above one opens the way to renewed catastrophic growth of infections. one therefore has a binary evolution characteristic; controlling r_0 by policy such that a steady stream of just-manageable new infections is maintained is daunting, and likely impossible (in practice) if a policy requiring a constant set of restrictions is targeted. the natural answer, at least from a theoretical point of view, is to consider phases of lifted restrictions interleaved with repeated, adaptively switched phases of more stringent restrictions or more stringent enforcement of existing restrictions.
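The capacity arithmetic in this paragraph can be verified directly. Two assumptions left implicit in the text are made explicit below: the icu share of 14.28% is treated as exactly 1/7, and the normalization population is taken as n = 80 million (which is what makes 80000 infections per two weeks equal the increment 0.001 used earlier).

```python
# weekly icu throughput bounds stated in the text
icu_low, icu_high = 12_500, 25_000
icu_share = 1 / 7          # ~14.28% of reported cases need intensive care
n_pop = 80_000_000         # assumed population used for normalization

# maximum reported infections per week compatible with icu capacity
inf_week_low = icu_low / icu_share    # 87500
inf_week_high = icu_high / icu_share  # 175000

# per two weeks (= one model time unit), normalized to n = 1.0
inc_low = 2 * inf_week_low / n_pop    # 0.0021875
inc_high = 2 * inf_week_high / n_pop  # 0.0043750
```

These normalized increments are exactly the bounds quoted for the horizontal capacity line in fig 3.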
the need for such a strategy is not altered in principle by the local aspect of transmission, except that switched lockdowns only need to be local and thus do not affect the whole population. another point worth mentioning is that the graphs suggest that a future virus with an infectiousness lasting much longer than the roughly two to three weeks of sars-cov-2, while being equally highly infectious, would pose serious challenges for containment, because of resource exhaustion in the mid-stages of the pandemic. in this study a novel model for virus proliferation dynamics was developed, and with it the sars-cov-2 outbreak in germany was retraced on an aggregate level, using covid-19 case count data by the robert-koch-institut in berlin. elementary properties of the model were identified. predictions by the model for different levels of mitigation measures were hinted at or stated in approximate manner, and put into the context of available health-care resources in germany. future policy-oriented work would need to address a better understanding of fine-grained and adaptively activated mitigation measures, for which a spatial model should be favoured over purely aggregate models such as the present one. further, to improve parameter and state estimates, the issue of underreporting (i.e. #actual > #reported cases) must be taken into account appropriately. ideally, an estimate of the underreporting factor can be developed from more exact spatial analyses. on the mathematical side, a more rigorous formulation of the instantaneous proliferation dynamics is desirable, which would allow linking the parameters of the aggregate model to well-defined elementary parameters and would result in more systematic parameter estimation.
the ultimate goal is to be able to estimate more local structure from the observed time series.

a additional graphs

a.1 data series on daily newly reported covid-19 cases

figure 5: a "smoothed" derivative of the numbers of daily newly reported covid-19 cases in germany published by [rob]. the blue-squares and green-triangles series show (for comparison) the sum of daily new cases over a moving 7-day window. orange diamonds and triangles show daily new cases after scaling with a weekday-specific weight factor to remove the weekly pattern seen in the original data. the weight factors were estimated from the data corresponding to the squares and diamonds series, i.e. from the interval from 1st april until 6th may 2020. germany imposed face-mask wearing in stores starting from 27th april and allowed certain (moderate) shop reopenings starting from 4th may 2020. the "bend" at around 14th april is remarkable because no changes in measures were effected at that time or within the preceding week.

b refinement of the infectiousness mechanism, including a model generalization

so far, a crude specification of the infectiousness has been used, putting the focus on the main infectiousness interval of a few days. an additional aspect of virus transmission which should be accounted for in a refinement is transmission from longer-lived remnants of the virus in otherwise cured individuals. for this, we imagine that individuals infected at time t_0 remain contagious until t_0 + t_c2 with reduced probability, additionally to the previously used interval [0, t_c].
concretely, let p_2 be the probability that an individual which has been infected for a duration exceeding t_c, but not exceeding t_c2, will transmit the virus in a unit time step. with p̃_2 := p_2/p, the adjusted model equation then reads

ẋ(t) = d · p · [ (x(t) − x(t − t_c)) + p̃_2 · (x(t − t_c) − x(t − t_c2)) ] · (1 − x(t)/n),

since those individuals must be added to the instantaneous reservoir from which infections are generated.

references:
mathematical epidemiology: past, present, and future
temporal dynamics in viral shedding and transmissibility of covid-19
deterministic and stochastic epidemics in closed populations
a contribution to the mathematical theory of epidemics
effective containment explains subexponential growth in recent confirmed covid-19 cases in china
covid-19: fallzahlen in deutschland und weltweit (covid-19: case numbers in germany and worldwide)
temporal profiles of viral load in posterior oropharyngeal saliva samples and serum antibody responses during infection by sars-cov-2: an observational cohort study
katrin zwirglmaier, christian drosten, and clemens wendtner: clinical presentation and virological assessment of hospitalized cases of coronavirus disease 2019 in a travel-associated transmission cluster
early prediction of the 2019 novel coronavirus outbreak in the mainland china based on simple mathematical model

the equation is better written as

ẋ(t) = d · p · [ x(t) − (1 − p̃_2) · x(t − t_c) − p̃_2 · x(t − t_c2) ] · (1 − x(t)/n).

if we denote by i(t) the infectiousness profile, which shall describe the relative infectiousness of an infected individual^5 at time increment +t after the infection moment (relative to the infectiousness at t = 0), then the above used specification for sars-cov-2 is expressed as i(t) = 1 for t ∈ [0, t_c], i(t) = p̃_2 for t ∈ (t_c, t_c2], and i(t) = 0 otherwise. its derivative is (with dirac notation) i′ = δ_0 − (1 − p̃_2) · δ_{t_c} − p̃_2 · δ_{t_c2}.
we therefore find that the model equation (6) in fact is generally best written as

\dot{x}(t) = d \cdot p \cdot (i * \dot{x})(t) \cdot \left( 1 - x(t)/n \right),

where "*" denotes the function convolution. a dirac-notation-free representation derives as

\dot{x}(t) = d \cdot p \cdot \left( \int x(t - s) \, \mathrm{d}i(s) \right) \cdot \left( 1 - x(t)/n \right);

here the last integral signifies the well-known stieltjes integral. note: the infection from contaminated surfaces of objects can be represented in the same framework. this is because, initially and during the evolution of the spread, viruses are on surfaces mostly where infected individuals previously had been.

key: cord-313279-15wii9nn authors: trevijano-contador, nuria; zaragoza, oscar title: expanding the use of alternative models to investigate novel aspects of immunity to microbial pathogens date: 2014-05-15 journal: virulence doi: 10.4161/viru.28775 sha: doc_id: 313279 cord_uid: 15wii9nn

in the present issue of virulence, an article entitled "the maternal transfer of bacteria can mediate trans-generational immune priming in insects" 1 describes an elegant study that illustrates the use of the lepidopteran galleria mellonella to investigate a specific aspect of immunity to microbes. the authors show that exposure of mothers to bacteria results in enhanced immunity in the offspring. furthermore, they have demonstrated that bacteria ingested by female larvae are found in the eggs, suggesting that the enhanced immunity of the offspring is a consequence of direct exposure of the eggs to bacteria. this is a relevant and novel study for several reasons. the authors provide a mechanism for an important aspect of insect immunity, namely that the direct transfer of bacterial fragments from the mother to the eggs primes the immune response of the offspring. in addition, this study broadens the scope of the use of non-conventional models and illustrates how they can be used to investigate aspects of immunity against pathogenic microorganisms. classically, mammals (rodents such as mice and rats) have been used to study microbial pathogenesis and the immune response elicited by the host.
these models have been a useful tool for centuries, and the development of their genetic manipulation offers new alternatives for investigating the role of specific factors of the immune system in the defense against pathogens. however, animal experimentation is associated with important bioethical problems, mainly due to the pain and suffering inflicted on the animals. for this reason, animal experimentation is nowadays regulated by authorities and bioethical committees. furthermore, to reduce these bioethical problems, there is a strong trend to apply the "3 rs" rule in experiments that involve animal use: reduce the number of animals used in the laboratory; refine the protocols to increase animal comfort and reduce pain; and replace animals with other models that have no associated bioethical problems. in this context, there has been increasing interest in the scientific community in implementing other systems that could be used as an alternative to protected animals, with special emphasis on animals with poorly developed neural systems in which the feeling of pain is almost absent. for this reason, "non-conventional" hosts are being used to investigate microbial pathogenesis, including both invertebrates and vertebrates. these organisms have proven to be very useful for investigating specific virulence traits of the pathogen and their role in infection. although they are not closely related to higher vertebrates from an evolutionary point of view, they share important aspects of their response to microbes, in particular in their innate immunity, so these models can also provide information about the immune response elicited against microbial pathogens. among vertebrates, two different alternative models are being used as infection models: zebrafish embryos and embryonated chicken eggs.
in both cases, and to reduce the bioethical issues associated with the use of adult individuals, infections are performed at the embryonated stage of development. these models present the advantage that their immunity is closer to that of mammals than that of invertebrates. the zebrafish (danio rerio) is used as a model host during the first seven days after egg deposition, and infections can be performed by microinjection in different areas. 2 the zebrafish has both innate and acquired immunity, although the latter is not developed until day 30 of development, so the zebrafish embryo infection model is of particular interest for investigating the virulence of pathogens controlled mainly by innate immunity. one advantage of this model is that the anatomy of the embryos is easily visible under the microscope due to their transparency. embryonated chicken eggs also offer an alternative for investigating microbial pathogenesis, and, as with the zebrafish, the immunity of the eggs is similar to that of higher mammals. infections are performed by injection of the pathogen into the chorioallantoic membrane or directly into the embryos of the eggs. the use of zebrafish and embryonated chicken eggs is limited in many cases because they require specific facilities to host and maintain the animals, and also because of the expertise required to handle them. despite these limitations, these models have been used to investigate the virulence of fungal, bacterial, and viral pathogens. [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] invertebrate animals are also extensively used as models to study immunity and microbial virulence. there are three main alternative hosts that have been widely utilized: amoebas, insects, and nematodes. amoebas are environmental predators, and for this reason they are considered an optimal model to investigate phagocytic activity.
16 this is of particular interest for the study of facultative intracellular pathogens, since some of them can also survive inside amoebas, and the mechanisms that result in intracellular pathogenesis seem to be conserved from amoebas to mammalian phagocytic cells. in addition, the survival of microbial pathogens inside amoebas drives the selection of microbes resistant to killing, which has important implications for understanding the acquisition of virulence traits that are also used to cause disease in more complex organisms. 16 this survival is also relevant for the infection cycle of some bacteria, which are phagocytosed in the environment by amoebas and use them as a vehicle to infect humans. 17, 18 nematodes, in particular caenorhabditis elegans, can be used as model hosts for infections. 19 immunity of c. elegans is based on three major responses: avoidance behavior, which relies on chemosensory neurons that sense pathogens and induce escape; physical barriers (the cuticle and the pharyngeal grinder); and innate immunity. this last response depends on pattern recognition receptors (scavenger receptors, c-lectins, fshr, and tol1), which regulate different signaling pathways (mainly mapk, the unfolded protein response, daf, and tgf-β). as a consequence, antimicrobial responses (such as antimicrobial peptides, caenopores, lysozymes, and reactive oxygen species, ros) and autophagy are induced. remarkably, c. elegans does not have phagocytic cells. 20 the main advantage of this model is the availability of genetic tools. ko collections are available, which makes this worm suitable for investigating the role of specific elements of the host in the response to pathogens. moreover, due to their small size and the possibility of performing assays in microdilution plates, c. elegans offers an excellent model for performing large screenings of antimicrobial compounds. 21 however, this model also exhibits several limitations.
infection is performed by placing the worms on agar plates with a layer of the microorganism, so it is difficult to estimate the amount of inoculum used in each experiment. in addition, the worms do not tolerate high temperatures, so it is not an optimal model for analyzing host-pathogen interaction at 37 °c. among insects, there are two species largely used as model hosts to study microbial virulence, drosophila melanogaster and galleria mellonella. 22 drosophila melanogaster is a fly that has been used in research for decades. its immunity depends mainly on physical barriers and on both cellular (hemocytes) and humoral (toll and imd pathways) responses, which induce the production of antimicrobial peptides and ros. 23 investigation with drosophila melanogaster has elucidated some of the main elements of the immunity against pathogenic microorganisms, such as the toll receptors, which were identified through the increased susceptibility to aspergillus fumigatus of ko flies lacking this receptor, 24 a discovery that was awarded the nobel prize in physiology or medicine in 2011. pathogens can be introduced into the flies as aerosols, by microinjection, or administered in the food. the development of genetics and the possibility of obtaining knockout strains also make d. melanogaster a suitable model to investigate the role of host elements in the response against microbial pathogens. 25 the other insect that is currently widely used to investigate microbial virulence is the lepidopteran galleria mellonella. [26] [27] [28] the life cycle of this organism comprises a larval stage (size around 1-3 cm) that transforms into a pupa and finally into a moth. the size of the larvae makes their manipulation and injection easy. survival monitoring is also very convenient because when they die, the larvae become unresponsive to physical stimuli and acquire a dark color due to strong melanization.
in addition, it is possible to easily administer accurate doses of antimicrobial compounds to test toxicity and in vivo efficacy. the immune response of this insect is mainly based on the presence of hemocytes with phagocytic activity, on antimicrobial peptides, and on the induction of melanization. furthermore, different routes of infection can be used, such as direct injection into the hemocoele, or ingestion after placing the pathogen in the food. galleria mellonella is becoming a reference model for investigating microbial pathogenesis, such as the role of virulence factors in disease and the efficacy of antimicrobial compounds. but this model can also be used to investigate more complex aspects of immunity and virulence. dr vilcinskas' group has elegantly demonstrated that this lepidopteran can be used to investigate specific features of microbial disease, such as brain infection caused by listeria monocytogenes 29 (comment in ref. 2). the article by freitak on maternal transfer of immunity to the offspring 1 illustrates another example of the versatility of non-mammalian models for investigating relevant aspects of immunity, and applies to different fields, from entomology to immunity. furthermore, this work opens new perspectives and research lines (such as the investigation of the susceptibility to infection of worms derived from mothers exposed to pathogens), a matter that could be addressed using other models, such as c. elegans or d. melanogaster. moreover, this is an exciting article from an intellectual point of view, because it suggests a mechanism of natural selection of microbe-resistant worms through evolution not based on the acquisition of specific genes or mutations.
finally, we would like to stress that at the moment, despite the bioethical issues associated with animal experimentation, full replacement of classical models does not seem to be an option, since there is still a need to validate the use of non-conventional hosts to fully understand how much information obtained with them correlates with the results observed in more complex organisms. but the use of "non-conventional" models to investigate immunity to microbes is an emerging field, and the number of articles in which virulence and immunity are assessed in these models, rather than in "classical" animals such as rodents, is increasing. for this reason, we believe that this type of host should in the future be designated an "alternative" model, as opposed to the term "non-conventional". no potential conflicts of interest were disclosed.

references:
the maternal transfer of bacteria can mediate trans-generational immune priming in insects
developing the potential of using galleria mellonella larvae as models for studying brain infection by listeria monocytogenes
isolation and distribution of west nile virus in embryonated chicken eggs
isolation and propagation of coronaviruses in embryonated eggs
pathogenesis of candida albicans infections in the alternative chorio-allantoic membrane chicken embryo model resembles systemic murine infections
embryonated eggs as an alternative infection model to investigate aspergillus fumigatus virulence
chicken egg yolk antibodies as therapeutics in enteric infectious disease: a review
pleiotropic phenotypes of a yersinia enterocolitica flhd mutant include reduced lethality in a chicken embryo model
chicken embryo lethality assay for determining the virulence of avian escherichia coli isolates
the zebrafish guide to tuberculosis immunity and treatment
host-pathogen interactions made transparent with the zebrafish model
zebrafish as a model for infectious disease and immune function
a star with stripes: zebrafish as an infection model
nadph oxidase-driven phagocyte recruitment controls candida albicans filamentous growth and prevents mortality
the avian chorioallantoic membrane in ovo - a useful model for bacterial invasion assays
amoeba provide insight into the origin of virulence in pathogenic fungi
mycobacterium tuberculosis complex mycobacteria as amoeba-resistant organisms
microorganisms resistant to free-living amoebae
caenorhabditis elegans, a model organism for investigating immunity
evolution of host innate defence: insights from caenorhabditis elegans and primitive invertebrates
c. elegans: model host and tool for antimicrobial drug discovery
of model hosts and man: using caenorhabditis elegans, drosophila melanogaster and galleria mellonella as model hosts for infectious disease research
drosophila as a model system to unravel the layers of innate immunity to infection
the dorsoventral regulatory gene cassette spätzle/toll/cactus controls the potent antifungal response in drosophila adults
genetics of immune recognition and response in drosophila host defense
development of an insect model for the in vivo pathogenicity testing of yeasts
galleria mellonella as a model system for studying listeria pathogenesis
galleria mellonella larvae as an infection model for group a streptococcus
brain infection and activation of neuronal repair mechanisms by the human pathogen listeria monocytogenes in the lepidopteran model host galleria mellonella

key: cord-311432-js84ruve authors: hossein rashidi, t.; shahriari, s.; azad, a.; vafaee, f. title: real-time time-series modelling for prediction of covid-19 spread and intervention assessment date: 2020-04-29 journal: nan doi: 10.1101/2020.04.24.20078923 sha: doc_id: 311432 cord_uid: js84ruve

a substantial amount of data about the covid-19 pandemic is generated every day. yet, while this data streaming is extensively visualized, it is not accompanied by advanced modelling techniques that provide real-time insights.
this study introduces a unified platform which integrates visualization capabilities with advanced statistical methods for predicting the virus spread in the short run, using real-time data. the platform is backed by advanced time-series models that capture possible non-linearity in the data, and is enhanced by the capability of measuring the expected impact of preventive interventions such as social distancing and lockdowns. the platform enables lay users, as well as experts, to examine the data and develop several customized models under different restrictions, such as models developed for a specific time window of the data. our policy assessment of the case of australia shows that social distancing and travel ban restrictions significantly reduced the number of cases, and were thus effective policies. the outbreak of coronavirus disease 2019 (covid-19), caused by severe acute respiratory syndrome coronavirus 2 (sars-cov-2), has been recognized as a pandemic by the world health organization, representing the most serious public health threat of the last century [1]. the global impact of covid-19 has been profound. as of 12 april 2020, more than 1.87 million cases of covid-19 had been reported in over 200 countries, resulting in 118,851 deaths as reported by the european centre for disease prevention and control (ecdc) [2]. forecasting the imminent spread of covid-19 informs policymaking and enables an evidence-based allocation of medical resources, arrangement of production activities, and economic development [3]. therefore, it is urgent to establish efficient trend prediction models, built on the latest available data, to provide a point of reference for governments to formulate adaptive responses based on reliable predictions of the impending progress of the pandemic.
the classical susceptible-[exposed]-infected-recovered (seir/sir) epidemic models [4] have been widely developed to simulate the transmission dynamics of covid-19 [5, 6] and the impact of non-therapeutic interventions (e.g., travel and border restrictions [7, 8], quarantines and isolation [5, 9-11], or social distancing and closure of facilities) on the spread of the outbreak and, in some cases, on healthcare demand [5, 9, 11-13]. these studies have mostly focused on calibrating models for a specific country/region based on the data available at the time of model development, assuming a multitude of parameters initialized from prior knowledge, such as the social contact structure, the rate of compliance with the policy, and the incubation or infection period, among others. complementing seir mathematical models, and owing to the increased amount of data and consistency of reports, some recent efforts have focused on developing statistical [3, 14] or machine learning methods [15] to predict the near-future spread of covid-19 (in terms of the number of confirmed cases or deaths) based on historical data. while reliable predictions of the pandemic trend are essential for policymaking and resource allocation, there is a lack of an adaptive real-time modelling platform which evolves as new data arrives. in response to this urgent need, we present advanced time-series models for the progression of covid-19 using the autoregressive integrated moving average (arima) formulation [16], combined with several non-linear transformation approaches [17] and complemented with an interactive online dashboard which efficiently generates country-wise predictive models, in real time, based on the latest ecdc report of covid-19 cases worldwide.
the proposed modelling approach relies neither on strict modelling assumptions (e.g., linearity, stationarity, or the existence of an epidemic steady state) nor on initial parameters requiring a priori knowledge. it offers a transparent mathematical function to better understand the trend and to predict future points in the series. different types of transformation have been examined to capture the nonlinearity in the time-series data, followed by multiple differencing steps to eliminate non-stationarity. notably, we enhanced the time-series model to capture the effect of previous interventions using an exogenous variable, which can be used to predict the impact of future interventions. further, when no record of intervention is provided, the model can infer previous interventions from the data and incorporate their estimated impact into future predictions. the main objective of this study is to introduce an easy-to-use and readily available statistical tool for developing rigorous models of covid-19 time-series data as data becomes available in real time. in this article it is demonstrated that the proposed modelling tool estimates accurate model parameters reliably and is capable of being used for policy assessment. the tool is tailored for modelling covid-19 data, with an option for assessing the performance of interventions to control the spread, such as lockdowns, social distancing rules, and airport restrictions. multiple transformation operations are investigated to stabilise variance, coupled with recursive differencing until non-stationarity in the time-series data is eliminated, i.e., p-value < 0.05 based on the augmented dickey-fuller test [16]. upon each transformation, the best arima model is obtained for each country according to the akaike information criterion (aic) value, using maximum likelihood estimation.
the optimal model for each transformation is then recorded based on the overall model root mean square error (rmse) on the last 20% of observations, reported as a surrogate estimate of out-of-sample prediction performance. the predictive power of the best model per country is compared against estimations provided through 1) exponential growth in the number of cases, 2) a doubling time of two days, 3) a doubling time of three days, and 4) a doubling time of one week, as well as a conventional linear univariate regression on log-transformed data. table 1 shows the parameters of the optimal arima model per country and the corresponding rmse measures (of the last 20% of observations) compared with conventional trends (based on ecdc data on april 13, 2020). while the purpose of this study is not to develop the most accurate time-series predictive model, the statistics of table 1 clearly show that using a more sophisticated statistical model significantly improves the prediction accuracy of covid-19 spread in the near future (t-test p-value << 0.001 comparing residuals' distributions), which signifies the urgency of such studies for policy appraisal. in other words, having access to tools such as the one introduced in this study enables experts with limited knowledge about the details of statistical specifications to readily use such specifications to nowcast and forecast the effectiveness of policies they envision and propose for controlling the spread of covid-19, or similar outbreaks. different time-series transformation operations, namely power transformation, logarithmic transformation, and ratio transformation, have been applied to pre-process the data prior to the differencing step. we have observed that the type of transformation can significantly affect the performance of a model (in terms of the estimated out-of-sample rmse), and there is no a priori knowledge about the best-performing transformation (except that the power transformation always performs poorly).
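the selection loop described above can be sketched in a few lines of numpy. this is a simplified stand-in, not the authors' code: the paper uses the augmented dickey-fuller test and full arima maximum-likelihood fits, whereas here stationarity handling is reduced to a grid over the differencing order d and the moving-average terms are omitted, so only the ar order p and d are selected by aic:

```python
import numpy as np

def difference(y, d):
    # apply d rounds of first differencing
    for _ in range(d):
        y = np.diff(y)
    return y

def fit_ar(z, p):
    """least-squares ar(p) with intercept on a (differenced) series."""
    X = np.column_stack([np.ones(len(z) - p)] +
                        [z[p - i - 1: len(z) - i - 1] for i in range(p)])
    beta, *_ = np.linalg.lstsq(X, z[p:], rcond=None)
    return beta, z[p:] - X @ beta

def select_model(cases, max_d=2, max_p=5):
    """grid-search (p, d) on log-transformed cumulative cases by aic."""
    y = np.log1p(np.asarray(cases, dtype=float))   # logarithmic transformation
    best = None
    for d in range(max_d + 1):
        z = difference(y, d)
        for p in range(1, max_p + 1):
            if len(z) <= p + 2:
                continue
            _, resid = fit_ar(z, p)
            m = len(resid)
            aic = m * np.log(resid @ resid / m) + 2 * (p + 1)
            if best is None or aic < best[0]:
                best = (aic, p, d)
    return best  # (aic, ar order p, differencing order d)
```

the out-of-sample rmse on the last 20% of observations, as in the paper, would then be computed by refitting on the first 80% and forecasting the remainder.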
figure 1a shows some countries, as case studies, whose arima models (as of april 13, 2020) are significantly affected by the type of transformation. as figure 1a shows, some countries such as canada have significantly better performance (t-test p-value < 0.05) without any transformation. (all rights reserved. no reuse allowed without permission. this preprint, which was not certified by peer review, is held in copyright by the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. this version posted april 29, 2020. https://doi.org/10.1101/2020.04.24.20078923 doi: medrxiv preprint.) italy's model performs better (p-value < 0.06) when using ratio or logarithmic transformations, and for the usa, transformations do not significantly affect the performance (p-value > 0.6 for all comparisons). the case of greece, quite interestingly, shows that the ratio transformation stands above the other two, while there is no statistically significant evidence that the logarithmic transformation performs any better than the no-transformation case. overall, the results signify the value of a performance-driven transformation selection approach upon trying multiple operations, as implemented in this platform. the nonlinear dynamic system underlying covid-19 spread is producing a regularly disrupted pattern, making static predictions increasingly unreliable. accordingly, a powerful feature of the platform is dynamic model estimation; that is, all models are re-optimised temporally with the availability of new daily observations. the latest reports on covid-19 case numbers are thus reflected in model estimation, which accounts for the impact of new interventions, improving the reliability of future forecasts. as a case study, we have chosen to show the value of this feature on the prediction of future case numbers in iran. iran's trend shows significant fluctuations in the last 10 days (as of april 13, 2020), offering an interesting case study. we assumed that the model had access to data up to april 03, and then reported the next 10 days' predictions and the rmse of the predicted numbers on april 13th.
the online platform, as it is further discussed in the next 40 subsection, enables users to incorporate the effect of known interventions into model specifications, and to detect unknown potential interventions which can capture observed fluctuations in the disease spread time-series. the proposed intervention modelling approach is examined on the data provided for china as 45 reported dates of interventions are officially available. to examine the effectiveness of the all rights reserved. no reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. the copyright holder for this preprint this version posted april 29, 2020. . https://doi.org/10. 1101 proposed intervention detection procedure, three models are developed, on the number of cases reported from the end of december to the 20 th of february, with 1) no intervention, 2) reported interventions, and 3) inferred interventions. the model with no interventions returns the rmse of 6.52, the one with reported interventions returns the rmse of 6.53 and the one with the inferred interventions returns the rmse of 6.40. in the reported intervention model, three days 5 are obtained from the available information online about major interventions in china happening on 23th , 24th, and 27th of january 2020, while the inference approach identified the following intervention days as 20th ,24th ,26th ,29th ,30th of january 2020, which are generally lagged compared to the reported ones, possibly because the impact of these quarantine decisions are reflected in the system with a delay. the proposed intervention inference approach, that 10 automates identification of the interventions, finds a statistically significance coefficient for the average impact of the inferred interventions and can successfully improve the goodness-of-fit of the model. 
the estimated coefficient of -0.04 (which is statistically significant at the 80% confidence level, unlike the parameter of the model with reported interventions) in the inferred-intervention model has a major impact on the reduction of the number of cases (almost no increase) if used to predict the number of cases in the stable situation of china. therefore, we performed a sensitivity analysis of the impact of this variable on the prediction results, which is presented in figure 2. the left graph shows the impact of an intervention applied on different days within the next 10 days, where the intervention parameter is set to -0.005, and the right graph shows the situation where a coefficient of -0.01 is considered. the main takeaway message of the diagrams of figure 2 is that it is better to apply the interventions as early as possible, even if the impact of the intervention is as small as the one considered in the right graph. further, stronger interventions not only have a larger immediate impact, but also result in a more stable long-term impact, as shown in figure 2. to further assess the effectiveness of policies using the proposed platform, we have looked at the response of the australian government to covid-19, given the authors' access to detailed information on the timeline of interventions by the australian government. the first case in australia was confirmed on january 25th in a passenger travelling from wuhan, which was followed by a travel ban from mainland china on february 1st. other travel bans followed: blocking arrivals from iran on march 1st, south korea on march 5th, and italy on march 11th; self-isolation for overseas travellers and a block on cruise ships on the 15th of march; border closure on march 19th; a ban on travelling overseas on the 24th; and mandatory isolation in hotels for travellers on the 28th [18].
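the figure-2 takeaway (intervene early; stronger interventions are more effective) can be checked with a toy calculation in which a daily growth-rate reduction applies from a chosen start day onward. this is an illustrative simplification, since in the platform the coefficient enters through the arima intervention dummy rather than as a permanent shift:

```python
import numpy as np

def total_cases(days, base_growth, effect, start_day):
    """cumulative cases after `days` when a (negative) daily growth-rate
    change `effect` applies from `start_day` onward; toy parameters only."""
    g = np.full(days, base_growth)
    g[start_day:] += effect
    return float(np.expm1(np.sum(g)))

# an earlier intervention of the same strength yields fewer total cases,
# and a stronger intervention beats a weaker one started on the same day
early = total_cases(60, 0.10, -0.01, start_day=10)
late = total_cases(60, 0.10, -0.01, start_day=40)
strong = total_cases(60, 0.10, -0.02, start_day=40)
```

here `early < late` and `strong < late`, mirroring the qualitative message of figure 2 without reproducing its exact numbers.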
upon including the travel bans in the model (data from the 5th of march to the 20th of april), the estimated coefficient of the intervention variable is -53.89 (standard error, se = 26.94), implying that on average the travel bans could reduce the number of infected people by 53 cases. the model's predictive power is significantly improved in terms of goodness of fit (rmse = 8.88, significant compared to the 'no intervention' residuals). besides the coefficient of the travel-ban intervention, an arima (1, 1, 2) is estimated, reflecting the importance of incorporating the observations of three days before into the prediction; it also requires one differencing step to make the data stationary, reflecting the non-linearity of the time-series data. the first order of integration (i = 1 in the arima model) also implies that the slope of the number of daily cases is critical in the model, not just the accumulated total number of reported cases. another type of intervention practiced in australia has been restrictions on gatherings in public places. australia restricted outdoor gatherings to fewer than 500 people on the 16th of march and indoor gatherings to fewer than 100 individuals on the 18th of march; pubs and clubs were closed and restaurants limited to take-away only on the 23rd of march; and all gatherings were limited to 2 persons on the 30th of march [18]. by considering these interventions, except for the 23rd, which overlaps with a major spike in the data, the parameter of the intervention variable is estimated as -16.24 (se = 38.46), implying that the gathering bans in australia reduced the number of cases by about 16 cases per day, although the coefficient is not significant.
the corresponding arima model is (3,3,0), with rmse 10.31 (not significant compared to the 'no intervention' residuals). once all restrictions are included in a single model, with the aim of finding the best fit to the data (interventions on the 6th, 12th, 15th, 16th, 19th, 24th, and 28th of march and the 3rd of april), an arima(1,1,2) model is obtained with an rmse of 8.68, where the coefficient of the intervention variable is -45.4 (se = 21.86, p-value = 0.03); the model is only marginally improved over the first model. in other words, the intervention strategies of the government of australia could have resulted in a reduction of 45 cases per day during march and early-to-mid april 2020. as a result, by considering the first model and the last model, which was developed to find the best fit to the data, we can say that the preventive strategies helped reduce the number of infected cases in australia by 45-53 cases per day. we have developed an interactive online dashboard (https://unsw-dataanalytics.shinyapps.io/covid19_analytics) to facilitate real-time model development for lay users as well as data scientists. users can select the country of interest from the left panel and observe an interactive visualisation of cumulative counts of confirmed cases in the middle panel. upon pressing the 'predict' button, the platform provides users with optimal models fitted to the latest reports of covid-19 spread as provided by ecdc. for any country of interest, the interactive user interface enables users to re-estimate models by 1) adding observed interventions in previous days, 2) customising the range of days to be included in the model, and 3) incorporating the effect of future interventions in predictions. the right panel visualises the cumulative number of confirmed cases since the 1000th case for the top 10 countries in terms of total number of cases, plus predictions of growth trajectories in the next 10 days.
similarly, the middle bottom panel shows the world map color-coded with the predicted number of cases per 100k, together providing a global comparative view of the forthcoming covid-19 spread. real-time covid-19 data analytics have mainly focused on visualizing the spread [19], with limited effort in developing models to dynamically analyze the data. epidemiological models, i.e., sir/seir models, have a strong foundation in analyzing epidemic growth/decline, and have been substantially explored for modelling the speed of infectious disease progression. yet, such models are often offline/static, require assumptions for the parametric formulation of the model, and rely on a multitude of initial parameters. we developed a time-series based statistical model to dynamically predict the future trend of covid-19 spread. it is built upon the strength of sir/seir models in considering the speed of progression, coupled with the capacity of time-series models in 1) considering higher orders of derivatives of the number of cases in previous time intervals, 2) accounting for the impact of residuals of the previous time intervals, and 3) incorporating intervention effects as external variables. we presented an automated modelling platform that delves into multiple layers of information in the covid-19 time-series data to find the best fit, with the aim of providing robust forecasts. the platform, unlike numerous counterparts that are primarily limited to disseminating the existing patterns, measures the average impact of previous interventions on reducing the growth of infected cases.
this measure is then used to predict the future spread when similar interventions are considered to further control the outbreak. the presented platform was shown to be effective in estimating the trend of the outbreak for each country. we elaborated on the importance of data transformation as a preprocessing step and showed that no transformation operation consistently provides the best fit to the data. hence, exploring multiple options is recommended to stabilize variations prior to modelling using conventional econometric formulations. a unique aspect of the presented platform is that it facilitates real-time model development, incorporating the latest reported data into modelling. we have shown that such adaptive model estimation significantly improves the prediction power and, therefore, forecasting reliability. one major challenge ahead of governments and policymakers is the uncertainty around the effectiveness of containment policies in controlling the outbreak. yet, authorities can observe the effectiveness of what they have done in the past days, weeks, and months, and learn from the impact of their previous decisions to enhance their intelligence in proposing new mechanisms to control the outbreak. we have shown that examining the effect of controlling policies can inform policymakers on the number and types of controlling policies required to achieve their objective of reducing the number of infected cases. we have shown for australia, as a case study, that containment policies not only reduce the number of infected cases right after implementation, but also reduce the slope of progression. the latter is the more important factor, especially when such policies are coupled with economic policies, given the recession or depression that is expected during or after the covid-19 pandemic.
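the transformation comparison of figure 1 can be reproduced in miniature: fit a linear trend to the raw and to the log-transformed series on the first 80% of days, then compare rmse on the held-out last 20%. the data below are purely synthetic; the platform itself tries ratio and logarithmic transforms and keeps whichever minimises rmse:

```python
import math

def linear_fit(xs, ys):
    """ordinary least-squares line a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def rmse(pred, obs):
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

days = list(range(50))
cases = [10.0 * math.exp(0.1 * t) for t in days]   # exponential growth
train, test = days[:40], days[40:]

# raw series: a straight line badly underestimates out-of-sample growth
a, b = linear_fit(train, cases[:40])
rmse_raw = rmse([a + b * t for t in test], cases[40:])

# log transform: the trend is linear in log space, so the back-transformed
# forecast tracks the held-out 20% almost exactly
a, b = linear_fit(train, [math.log(c) for c in cases[:40]])
rmse_log = rmse([math.exp(a + b * t) for t in test], cases[40:])
```

on exponential data the log transform wins by orders of magnitude; on already-stabilised series the untransformed fit can win, which is why no single transformation is best for every country.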
overall, given the unknown nature of sars-cov-2 spread, we need to relax the boundaries of potential methods, beyond classical epidemic models, to further explore the behavior of the data and account for unknown aspects of the virus spread.
figure 1. effect of transformation on modelling performance. four countries were selected as case studies to demonstrate the effect of ratio and logarithmic transformations on the model performance, as measured by rmse on the last (most recent) 20% of time-series data. the solid line shows the observed trend and the dashed lines show model fitted values without transformation (red) and after ratio (green) or logarithmic (blue) transformation. the bar plot beside each trend graph shows the corresponding rmse estimations. '*' implies that the t-test p-value < 0.05, while '+' implies that the p-value < 0.06.
figure 2 panel titles: intervention parameter: -0.005; intervention parameter: -0.01.
references
pandemics and social capital: from the spanish flu of 1918-19 to covid-19. 2020.
(ecdc) european centre for disease prevention and control, data on geographic distribution of covid-19 cases worldwide.
a modified seir model to predict the covid-19 outbreak in spain: simulating control scenarios and multi-scale epidemics. medrxiv.
the effectiveness of quarantine and isolation determine the trend of the covid-19 epidemics in the final phase of the current outbreak in china.
projecting hospital utilization during the covid-19 outbreaks in the united states.
the effect of control strategies to reduce social mixing on outcomes of the covid-19 epidemic in wuhan, china: a modelling study.
the lancet infectious diseases.
data-based analysis, modelling and forecasting of the covid-19 outbreak.
artificial intelligence forecasting of covid-19 in china.
epidemiology and arima model of positive-rate of influenza viruses among children in wuhan, china: a nine-year retrospective study.
ensemble of arima: combining parametric and bootstrapping techniques for traffic flow prediction.
coronavirus numbers in australia: how many new cases are there?
covid-19 map, stats and graph.
tracking covid-19 responsibly. the lancet, 2020.
ar1: -0.393 (0.15)
acknowledgments: ss and thr acknowledge the support from the australian research council under the linkage scheme (lp160100450). thr acknowledges the support from the australian research council under the decra scheme (de170101346).
author contributions: thr and fv conceived and supervised the project. ss and thr developed the mathematical model. ss generated the results for the manuscript. aa and fv
competing interests: the authors declare no competing interests.
data and materials availability: all data are available in the main text or the supplementary materials.
figure: cumulative case counts, april 03 to april 11 (axis 0 to 5k).
key: cord-296826-870mxd1t authors: taghikhah, firouzeh; voinov, alexey; shukla, nagesh; filatova, tatiana; anufriev, mikhail title: integrated modeling of extended agro-food supply chains: a systems approach date: 2020-06-27 journal: eur j oper res doi: 10.1016/j.ejor.2020.06.036 sha: doc_id: 296826 cord_uid: 870mxd1t the current intense food production-consumption is one of the main sources of environmental pollution and contributes to anthropogenic greenhouse gas emissions. organic farming is a potential way to reduce environmental impacts by excluding synthetic pesticides and fertilizers from the process. despite its ecological benefits, conversion to organic is unlikely to be financially viable for farmers without additional support and incentives from consumers. this study models the interplay between consumer preferences and socio-environmental issues related to agriculture and food production. we operationalize the novel concept of the extended agro-food supply chain and simulate the adaptive behavior of farmers, food processors, retailers, and customers. not only operational factors (e.g., price, quantity, and lead time), but also behavioral factors (e.g., attitude, perceived control, social norms, habits, and personal goals) of the food suppliers and consumers are considered in order to foster organic farming. we propose an integrated approach combining agent-based, discrete-event, and system dynamics modeling for the case of a wine supply chain. findings demonstrate the feasibility and superiority of the proposed model over traditional sustainable supply chain models in incorporating the feedback between consumers and producers and in analyzing management scenarios that can urge farmers to expand organic agriculture.
results further indicate that demand-side participation in transition pathways towards sustainable agriculture can become a time-consuming effort if not accompanied by the middle actors between consumers and farmers. in practice, our proposed model may serve as a decision-support tool to guide evidence-based policymaking in the food and agriculture sector. environmental, and social metrics. our aim is to investigate the impact of shifts from conventional to organic food consumption on the underlying sc activities and behaviors. in our literature survey, on the one hand, we found a few examples of ssc studies paying attention to the preferences of consumers. for example, fan et al. (2019) discuss the influence of the altruistic behavior of retailers on the willingness of consumers to purchase low-carbon products. they further study the effect of retailers' behavior across the entire sc to find out the dynamics of the economic and environmental performance of manufacturers. tobé and pankaew (2010) empirically study the influence of green practices of the sc on pro-environmental behavior of consumers. they conclude that a quarter of the dutch population seems to be green consumers. nevertheless, when it comes to buying decisions, the degree of environmental friendliness of products is not a significant determinant for green food products. sazvar et al. (2018) investigate the effect of substituting conventional product demand with organic, assuming a percentage of consumers are willing to shift their preferences. similarly, rohmer et al. (2019) show the impact of a possible consumer shift from meat-based to plant-based diets on the underlying production system. on the other hand, there are studies from the economics and behavioral science disciplines that consider some aspects of scs. in the field of economics, for example, wen et al. (2020) and sabbaghi et al. (2016) discuss the impact of consumer participation on pricing and collection rate decisions in csc.
the study of safarzadeh and rasti-barzoki (2019) is another example of such analysis, which models the interactions between consumers, government, manufacturers, and energy suppliers for assessing residential energy-efficiency programs. regarding the behavioral studies, as a few examples, we point to the impact of consumer choices on the retailing sector (he et al., 2013; schenk et al., 2007), the energy market (xiong et al., 2020), the housing market (walzberg et al., 2019), and so on. while researchers have taken initial steps in highlighting the role of consumers in managing sc operations, they are far behind in analyzing the behavior of various consumers and the collective impacts of changing their preferences on enhancing sc sustainability. the main finding that can be drawn from the reviewed papers is that there is a lack of research that analytically considers the role of green consumer behavior in scm. moreover, as there is no experimental or analytical study on the application of the essc framework, it still requires further investigation (ferrari et al., 2019). according to taghikhah et al. (2019), the complexity of relationships and the uncertainties involved in the essc require a more comprehensive approach. in developing the proposed essc model considering the heterogeneity of consumers, we take an integrated modeling approach combining agent-based modeling (abm), discrete event simulation (des), and system dynamics (sd) to simulate both the production and consumption sides of the operation and the feedbacks between them. abm is a useful modeling approach for understanding the dynamics of complex adaptive systems with self-organizing properties (railsback & grimm, 2019). it allows us to study emergent behaviors that may arise from the cumulative actions and interactions of heterogeneous agents. in the proposed model, we make use of abm to define each supply chain echelon/actor as an agent with specific behavioral properties and scale.
the dynamics of consumer behavior and buying patterns are also modeled using individual households as agents who decide what they buy. des is used to define the behavior of farmer and processor agents (responsible for production and distribution) as a series of events occurring at given time intervals, accounting for resources, capacities, and interaction rules. sd is employed in examining the behavioral patterns and interactions between farmers and the market using aggregated variables. the decisions to be explored in the proposed model are related to land allocation, production planning, inventory control, pricing, and demand management under uncertainty. the model accounts for different temporal scales (from short-term to long-term decisions) and multiple objectives in supply chains. the applicability of the proposed model is illustrated in the particular case of the australian wine industry. the rest of the paper is organized as follows: section 2 presents a background on wine sc characteristics and the modeling techniques applied in designing agro-food scs. section 3 describes the model framework and method. section 4 explains the details of a case study. section 5 presents the calibration and validation results, the uncertainty analysis, and findings from the model. finally, section 6 derives conclusions and some practical and managerial perspectives. farming, processing, and distribution are the main functional areas of decision making in the agro-food sc. strategic and operational farming decisions concern the time of planting and harvesting crops, the land allocation to each crop type, and the resources and agrotechnologies to be used at the farm. processing decisions refer to the scheduling of production equipment and labor, selecting production-packaging technologies, and controlling the inventory along the supply chain.
the distribution-related decisions involve designing the logistics network, scheduling the product shipping, and selecting the transportation modes and routes. the studies by miranda-ackerman et al. (2017) and jonkman et al. (2019) are recent examples of models addressing a range of decisions from the farm level (e.g., organic versus conventional farming) to the production (e.g., technology selection) and distribution levels (e.g., transportation route). although studies addressing sc decisions simultaneously are still lacking, the literature trend is towards more integrative, holistic agro-food models. strategies aimed at reducing the environmental footprint of the agro-food sc mainly focus on the production side, designing low-carbon logistics networks, and improving the resiliency and reliability of food delivery (soysal et al., 2012). these improvements alone may not bring considerable emission savings to the agro-food sector. for example, in the case of meat production, which is responsible for approximately 14.5% of total global ghg emissions (e.g., mohammed and wang (2017)), even more than the transportation sector (gerber et al., 2013), introducing green logistics and optimizing energy consumption in the sc will hardly make a significant difference in its overall impact. regarding food miles and local sourcing, new studies show that imported food products do not necessarily have higher environmental impacts than local ones (nemecek et al., 2016). using eco-friendly processing technologies (aganovic et al., 2017) and utilizing novel packaging options (licciardello, 2017) are examples of efforts to reduce the environmental footprint of food processing. an insightful discussion on these strategies can be found in li et al. (2014).
among the strategies examined in the literature (beske et al., 2014), demand-side solutions, such as consumer preferences for sustainable food or vegetarian diets and their influence on the overall configuration and performance of the sc, have been largely ignored. for the production-side strategies, we focus on expanding organic food production systems. with regard to the environmental burdens of organic farming, scholars have arrived at contradictory recommendations. in the first set of studies, they have proposed the organic farming system as a promising environmental solution due to a significant reduction in agricultural inputs resulting from enhanced soil organic matter and thus soil fertility (markuszewska & kubacka, 2017). in another set of research, organic farming is assessed less positively, and the studies have questioned to what extent it can improve environmental performance, since more land is required to produce the same yields (tuomisto et al., 2012). the contradiction between the assessment results is due to the limitations of lca (van der werf et al., 2020). researchers advise that although there is no single best farming system, in many circumstances (depending on soil type, climate, altitude, and legislation), organic farming can be considered the optimal system, creating more resiliency in food systems. for a comprehensive discussion around the topic of organic versus conventional farming, we refer interested readers to risku-norja and mikkola (2009).
2.2 modeling methods in the agro-food supply chain
from a modeling perspective, mathematical optimization techniques (combined with life cycle assessment) are the dominant approach used for designing sscs for food products (zhu et al., 2018). some researchers take deterministic approaches such as linear programming, mixed integer programming, and goal programming (oglethorpe, 2010) to design and plan scs.
the uncertainty and dynamics in the parameters are addressed by approaches such as stochastic programming (costa et al., 2014), fuzzy programming, simulation modeling, and game theory. the choice of modeling technique depends on various factors such as problem scope, inherent complexity and uncertainty in the sc, modelers' skill, and data availability. although the increasing necessity of using system science methods, such as abm, sd, and network theory, for studying agro-food scs was emphasized a decade ago (higgins et al., 2010), not many applications can be found in practice. authors have applied abm in developing theories and policies to improve the performance of the agro-food industry (huber et al., 2018). theory-focused studies aim to explore the application of theories in understanding agents' decision-making processes (e.g., farmer, government, dealer, etc.) or develop new theories to explain the interactions among individual agents (e.g., malawska and topping (2018)). theories have already helped to describe the formation of cooperation networks, the restructuring of partnerships, and the rearrangement of market power (see utomo et al. (2018)). policy-focused abms study the impact of financial (e.g., incentives and subsidies, pricing, credit, and compensation schemes), innovative and technological (e.g., improved seed, tree crop innovations), or environmental (e.g., organic agriculture, organic fertilizers) policies on the performance of the food sc (albino et al., 2016). in a recent review on the application of abm in agriculture, utomo et al. (2018) emphasize that important actors of the industry, such as food processors, retailers, and consumers, are rarely modeled in the current abm literature, and call for further research in these areas. despite the growing interest in using optimization approaches, the application of simulation techniques in the ssc context is scarce. recently, wang and gunasekaran (2017), rebs et al.
(2018), and brailsford et al. (2019) have suggested leveraging the advantages of combined simulation modeling methods in assessing complex ssc problems. in response to this call, our study presents the development of an extended food sc model that incorporates the dynamics of farmers', processors', retailers', and consumers' behavior as well as sustainability aspects. for this, we used an integrated, or rather an integral (voinov & shugart, 2013), modeling approach to link production decisions to consumption choices in a holistic way. in recent years, the area of modeling behavioral aspects of decision-making has received the attention of researchers and practitioners. the behavioral modeling approach presents an alternative basis for decision making in supply chains, which are traditionally modeled largely with mathematical optimization models. in behavioral models, individual decisions are modeled following the notion of bounded rationality, where decisions are made with respect to the limited available information, individual preferences and biases, cognitive limits, and the time available to make decisions. for example, kunc (2016) provides a useful resource for understanding the use of system-dynamics-based simulations for behavioral modeling. these types of modeling approaches can provide new and emergent insights about operations and supply chain management. however, the use of behavioral modeling methods should be carefully designed and validated, as such approaches can also introduce undesired complexity, higher ambiguity in the modeling environment, and harder interpretation of results. for a comprehensive discussion on this topic, see kunc et al. (2016).
commonly used methods for quantitative analysis in supply chain management have largely relied on optimization approaches based on constrained linear and nonlinear optimization algorithms, as well as dynamic programming and discrete optimization (exact methods, heuristics, and metaheuristics) (barbosa-póvoa et al., 2018). while these approaches have generally performed well, they fall short in modeling behavioral aspects that are boundedly rational in nature. methods such as sd and abm are able to simulate the intangible aspects of scm effectively, including interactions among different sc stages, learning over time by the sc partners involved, and continuous feedback on key decisions in the presence of limited information. however, studies employing simulation modeling (e.g., abm, sd) in the area have been few and far between, as reported in the recent study by dharmapriya et al. (2019). in fact, there are even fewer studies on modeling consumer behavior in the sc using simulation modeling (taghikhah et al., 2019). hybrid simulation is an approach that integrates multiple simulation methods such as des, abm, and sd (a comprehensive taxonomy can be found in mustafee and powell (2018)). it has a strong practical appeal in dealing with the limitations of a single method in developing behavioral models (mustafee et al., 2017). this approach allows models with different levels of abstraction to interact with each other and increases the flexibility of end-users in using them for decision-making. the main challenges of hybrid simulation are the difficulty of verification and validation, large computational complexity (bardini et al., 2017), and low practical applicability for solving real-world cases. brailsford et al. (2019) found that among 139 published papers using hybrid simulation, combined sd-des is the most popular method. in contrast, a combination of des, sd, and abm is the least used, reported in only 14 papers.
in this paper, we compare the results of using both approaches and provide insight into their performance in a case study. for an in-depth analysis of hybrid modeling, see brailsford et al. (2019), eldabi et al. (2018), and mustafee et al. (2017). in this study, the sc is composed of four actors/echelons - farmer, winemaker, retailer, and consumer - collaborating to achieve their various goals (see figure 1). they may have different functions, complexity levels, temporal dimensions, and spatial scales. in the proposed essc model, abm is used together with des and sd to model the behavior of each actor. the model is programmed in anylogic 8.3 software and is openly available at (https://www.comses.net/codebase-release/eeb3cd12-91ac-4ba7-81f7-8c8bfe7bd804/). it is built in a gis computational environment, enabling users to adjust the resolution and scales during run time. having said that, winemakers still have a significantly stronger bargaining position compared to grape growers. in other words, farmers cannot merely pass higher grape prices and other costs along the supply chain to wineries. these considerations justify replacing the assumptions of hierarchical structures and central control with collaboration between actors to maximize profit. both historical and empirical data are used to parameterize, calibrate, and validate the model (for more details refer to sections 4 and 5 and appendices b, c, and d). the data on crop scheduling, vineyard costs, farming practices, grape types, and land yield describe the farmer agents. the winemaker agents use historical data on the numbers and capacities of machinery, production processes, time, costs, and grape requirements. the information collected from liquor retailers' annual reports and wine industry reports, including the prices, market structure, export and import, sales, and profit of retailing, addresses the data requirements of the retailer agents.
finally, consumer surveys about wine preferences provide data for the behavioral (e.g., beliefs, goals, experiences, and perceptions) and contextual factors (e.g., price, availability, accessibility) of the consumer agents. regarding the intermediate link, as shown in figure 1, the consumer preferences and demand for products (derived from the consumer abm) influence the retailers' selling prices and the availability of wine types (derived from the retailer abm). these price and availability dynamics, in conjunction with the volume of wine production (derived from the winemaker des-abm), affect the wine inventory levels, order sizes, and retailers' purchasing prices. these changes in the volume and price of wine are reflected in farming contracts and determine the volume of grape harvest (derived from the farmer des-sd). an integrated abm-des-sd method is employed for the essc model development. we use abm for simulating consumer behavior and retailer operation. it is a bottom-up method suitable for modeling complex social and behavioral dynamics, to study heterogeneity and the emergence of collective actions. facing the same situation, every consumer and retailer agent has a unique reasoning mechanism, and they act based on predefined decision rules. a combination of des and abm is employed for modeling the dynamics of wine production and distribution operations. des represents the (discrete) sequence of wine processing events in time. finally, a combined des and sd method simulates the annual growth cycle of grapevines and predicts farmers' expectations about the value of organic farming (figure 1). the model assumes that farmers have a fixed land area available to supply the grape requirements of wineries. farmers are contracted by winemakers to grow grapes under a capacity-guarantee contract-farming scheme. this contract determines the approximate volume and the type of grapes - organic and conventional - required for production.
in this study, organic farming refers to a method of crop production that relies on biological pest controls (e.g., cover crops) and organic fertilizers (e.g., manure). conventional farming, in contrast, uses synthetic fertilizers, fungicides, and pesticides to maximize the vineyard yield. the organic farming system is considered more sustainable since it can keep soil healthy and maintain the productivity of land. the simulation begins in springtime when the grapevines are in the bud break phase. in this phase, tiny buds start to swell and eventually shoots grow from the buds. approximately 40-80 days later, small flower clusters appear on the shoot, and the flowering phase starts. soon after, 30 days on average, the flowers are pollinated, and the berries start to develop. this crop phase determines the potential yield of the vineyard. in the next phase, veraison, the color of grape berries changes after 40-50 days, signaling the beginning of the ripening process. following veraison, within 30 days, farmers complete the harvest, remove grapes from the vine, and transport them to wineries for further processing. due to the variation in climate conditions over the years, we consider a stochastic crop growth process where the annual harvest of organic (o) and conventional (c) grapes is: h(f,t,k) = y(f,t,k) x a(f,t,k), k in {o, c}, (1) where y(f,t,k) are the grape yields and a(f,t,k) are the cultivated areas in year t for organic and conventional grapes at farm f. the annual production cost at farm f varies depending on the production costs of organic and conventional grapes. farmer agents make judgmental assessments of the value of organic and conventional farming systems. the hypothesis of adaptive expectations (nerlove, 1958) states that the expectation of the future value of the variable of interest depends on its past value and adjusts for the prediction error. thus, the calculation of progressive expectations, or the error-learning hypothesis, is derived from observing the difference between past and present market values.
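the stochastic harvest of equation (1) can be sketched as follows; the yield distributions and all numbers below are illustrative assumptions, not the calibrated values from the case study:

```python
import random

def annual_harvest(area_organic, area_conventional, year_seed=None):
    """equation (1) sketch: harvest = yield x cultivated area for each grape
    type, with a stochastic per-hectare yield standing in for year-to-year
    climate variation (the yield distributions are illustrative guesses)."""
    rng = random.Random(year_seed)
    yield_organic = max(rng.gauss(6.0, 1.0), 0.0)       # t/ha, assumed
    yield_conventional = max(rng.gauss(9.0, 1.0), 0.0)  # t/ha, assumed
    return area_organic * yield_organic, area_conventional * yield_conventional

# a farm with 20 ha organic and 80 ha conventional vineyard in one simulated year
h_org, h_conv = annual_harvest(20.0, 80.0, year_seed=2021)
```

re-seeding per simulated year reproduces the climate-driven variation in yields across the simulation horizon.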
the market and equilibrium prices of organic and conventional wine (discussed in section 3.4) guide farmers' expectations of adaptation to organic farming (shown in figure 3). the current expectation of the value of organic farming in the future is calculated as: E_y = E_{y-1} + lambda * (V_{y-1} - E_{y-1}), (2) where E_{y-1} is the past perceived value of organic wine and lambda is the partial adjustment, applied to the gap V_{y-1} - E_{y-1} between the reported value V_{y-1} and the perceived value of organic wine. a full description of the sub-models and their equations is available in appendix a.1.1. winemaker agents process grapes to produce two types of products, organic and conventional wines. they are responsible for storing and dispatching the final products to retailer agents. the total production capacity per agent is fixed, but periodically the capacity ratio for organic and conventional wine production can adapt to the size of retailer orders. figure 4 presents the operations in winemaker agents. due to perishability issues, winemakers try to process the grapes straight away after the harvest. the grapes get sorted, crushed and pressed, fermented, matured, and bottled as organic and conventional wines. assuming winery w purchases all of farmer f's yield, its annual production is: P_{w,y} = min(H^{organic}_{f,y} + H^{conventional}_{f,y}, C_w), (3) where H^i_{f,y} is the availability of raw materials from (1) and C_w is the capacity of the processing facilities. while the same type of machinery can be used for producing organic and conventional wines, the processes (e.g., excluding sulfate during fermentation and bottling for organic wine) and the associated costs might be slightly different. upon order arrival from retailers, the winemakers check the stock availability and follow a rule-based reasoning approach to best fulfill the orders, as described in appendix a.1.2. to prevent the issuance of new orders when no stock is left, winery w informs all the retailer agents that, due to the unavailability of stock (the inventory on hand is smaller than the order size), it will not accept further orders.
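the nerlove adaptive-expectations update in equation (2) can be sketched directly; `lam` is the partial-adjustment coefficient, and the numeric values below are illustrative.

```python
def update_expectation(perceived: float, reported: float, lam: float) -> float:
    """E_y = E_{y-1} + lam * (V_{y-1} - E_{y-1}): correct last year's perceived
    value of organic wine by a fraction of the prediction error."""
    return perceived + lam * (reported - perceived)

e = 10.0                          # initial perceived value (illustrative units)
for reported in (12.0, 12.0, 12.0):
    e = update_expectation(e, reported, lam=0.5)
# after three years the expectation has moved most of the way toward 12
```

with lam in (0, 1], repeated application converges geometrically to the reported value, which is why a persistent price signal eventually dominates the farmer's perception.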
this is done because wine production can take place only once a year, at the end of the harvest season. before this time, any new order will be placed in a queue and processed when the product becomes available. retailer agents have the responsibility of supplying products quickly and reliably, forecasting demand accurately, and controlling inventory levels continuously. they employ dynamic inventory control models to make a trade-off between sc costs and demand fulfilment. figure 5 summarises the operations in this agent type. the decisions on when to place an order and how many products to order from winemakers impact the inventory-related costs. a continuous-review inventory policy meets the requirements of retailers in dynamic demand situations (hollier et al., 1995). this policy allows them to review their inventory levels for both organic and conventional products on a daily basis at minimum cost. when the inventory drops to a predetermined level s (known as the reordering point), a lot of size Q is ordered. the reordering point s(r,t) makes sure that sufficient stock is available to meet the demand before the order arrives at retailer r to replenish the inventory. the order size for retailer r, Q(r,t), is a function of the economic order quantity EOQ(r) and the inventory on hand I(r,t). appendix a.1.3 presents the details of the inventory management system. consumer agents follow a certain decision-making process to make choices between organic and conventional wines. orvin, an abm developed by taghikhah et al. (2020), is integrated into our model to estimate the consumer preferences for wine. in exploring the cumulative market consequences of individual consumer choices, factors such as social influence, drinking habits, and behavioral dynamics come into play. figure 6 presents a summary of the functions used in this agent type. figure 6. schematic of functions in consumer agents.
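the continuous-review policy above can be sketched with the classic eoq formula and a daily review rule. all cost and demand figures are illustrative assumptions, not values from the paper.

```python
import math

def eoq(annual_demand: float, order_cost: float, holding_cost: float) -> float:
    """economic order quantity: sqrt(2 * D * K / h), the lot size that balances
    ordering cost against holding cost."""
    return math.sqrt(2.0 * annual_demand * order_cost / holding_cost)

def review(inventory: float, reorder_point: float, lot_size: float) -> float:
    """daily continuous review: place an order of lot_size when stock falls to
    the reorder point s, otherwise order nothing."""
    return lot_size if inventory <= reorder_point else 0.0

q = eoq(annual_demand=5200.0, order_cost=50.0, holding_cost=2.0)   # bottles/order
order = review(inventory=80.0, reorder_point=100.0, lot_size=q)    # triggers
no_order = review(inventory=300.0, reorder_point=100.0, lot_size=q)
```

in the full model the reorder point and order size are recomputed dynamically per retailer and product type; this sketch shows only the triggering logic.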
to understand the wine purchasing behavior, the theory of planned behavior (tpb) (ajzen, 1985) is considered along with alphabet theory (zepeda & deal, 2009) and goal-framing theory (lindenberg & steg, 2007). according to tpb, a particular behavioral choice is preceded by intention, which in turn is influenced by an individual's behavioral attitudes, normative beliefs (i.e., social influence, the perception of social pressures, and the belief that an important person or group of people will approve and support a particular behavior), and control beliefs (the belief in one's ability to influence one's own behavior and to control the behavioral changes resulting from a specific choice). alphabet theory, in addition, explains the influence of habits on the relationship between intentions and actual behavior (e.g., organic food purchase). besides habits, goal framing focuses on the impact of environmental and contextual conditions on personal goals (i.e., hedonic, gain, and normative goals) when making decisions. in this study, we combine all of these theories in an integrated framework; this combination provides a theoretical basis for exploring the behavioral and contextual factors, including intentions, habits, and personal goals, that may influence wine purchasing decisions. consumers have intentions to purchase either organic or conventional wine before shopping. when they arrive at the nearest retailer, they first check the availability and price of the wine types. if the price of wine is higher than the consumer's spending limit or if no wines are available in stock, they leave the shop without purchasing any wine. otherwise, they choose wines based on their intentions, habits, observations of what other shoppers buy, and the perceived value of the products.
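a hypothetical sketch of how a tpb-style intention score could be composed from attitude, social norms, and perceived behavioral control. the weight names (wa, ws, wb) follow the sensitivity analysis reported later, but the additive functional form, the weight values, and the threshold are our own assumptions, not the orvin specification.

```python
def intention(attitude: float, norms: float, pbc: float,
              wa: float = 0.5, ws: float = 0.12, wb: float = 0.38) -> float:
    """weighted sum of the three tpb antecedents, each scored in [0, 1]."""
    return wa * attitude + ws * norms + wb * pbc

def chooses_organic(score: float, threshold: float = 0.5) -> bool:
    """the agent intends to buy organic when the score clears a threshold."""
    return score >= threshold

s = intention(attitude=0.8, norms=0.6, pbc=0.7)
```

in the full model the score is further moderated by habits (alphabet theory) and goal frames before a purchase occurs; this sketch covers only the tpb core.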
during the simulation, the shopping experience, the information about organic wine, and the dynamics of the price and availability of wines affect the wine preferences of consumers. for a technical explanation of the model, please refer to appendix c: orvin model description in (taghikhah et al., 2020). when integrating orvin into the essc model, some restrictions of the model could be relaxed, as below.
• in orvin, all the retailers have equal stocks of wine. now retailers are different and, apart from price considerations, the product availability on the shelf can affect the perception of consumers about their choice control (i.e., perceived behavioral control (pbc)).
• in orvin, no product shortage is allowed, and the service level is 100%. now some acceptable level of product shortage can occur, and this is modeled as a service level.
retailers are also responsive to changes in the demand for products, to keep the profit margin of the sc stable. to maintain high service levels (i.e., acceptable stockout rates), they may adjust inventory policies and set new pricing strategies; they should keep the inventory stock-outs at an acceptable level to meet customer demand on time. the service level at week t is: SL_t = 1 - L_t / N, (4) where L_t is the average number of lost consumers and N denotes the total population of households. SL_t should not drop below the minimum acceptable level, assumed to be 95% (alpha = 0.95). when demand transitions from one product type to another, for instance from conventional to organic wine, the conventional wine stock level grows while, at the same time, the organic wine stock level declines in the sc. this supply-demand imbalance prompts retailer-winemaker interactions, in which they adopt different pricing strategies. retailer agents monitor the dynamics of the organic and conventional wine inventory stocks using statistical process control (spc) charts (oakland, 2007).
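the weekly service-level check in equation (4) is simple to sketch; the 95% floor follows the assumption stated above, while the example numbers are ours.

```python
def service_level(lost_consumers: float, households: float) -> float:
    """SL_t = 1 - L_t / N: fraction of household demand served in week t."""
    return 1.0 - lost_consumers / households

def needs_intervention(sl: float, alpha: float = 0.95) -> bool:
    """true when the retailer must adjust inventory policy or pricing."""
    return sl < alpha

sl = service_level(lost_consumers=30.0, households=1000.0)
```

with 30 lost consumers out of 1000 households the service level is 0.97, above the floor; at 60 lost consumers it would drop to 0.94 and trigger an intervention.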
upper and lower control limits for the wine inventory spc charts are determined yearly, following a set of production rules presented in appendix a.2.3. a nelson rule checks whether the process is in or out of control. according to nelson rule 8, if the inventory level is outside the defined upper and lower limits for at least nine consecutive time units, the process is out of control. for example, when there is a shortage of products due to changes in the market trend, prices rise to rebalance demand and supply. generally, oversupply leads to a drop in the market prices, while undersupply increases the market prices of organic and conventional wines by a predetermined rate. the market price of wines cannot drop below the minimum (p_min) or go beyond the maximum price (p_max). as the price of products changes only temporarily over a short period, it may not be effective in closing the market price gap when there is a significant supply and demand imbalance. price adjustment is a market mechanism that tunes the equilibrium prices to increase or decrease the sales of a product over longer periods. instead of a fixed-price option, the wine equilibrium prices are modified on a week-by-week basis at different rates, except during the land conversion period from conventional to organic. the sequence of decisions winemakers and retailers make about wine prices affects the production plans and supply agreements with farmers. when the profit from a certain wine type increases, its production becomes financially more attractive and viable to winemakers. in these situations, the winemakers send revised orders to farmers, requesting different quantities of each grape type and proposing a new price schedule for the yields.
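the spc trigger described above, as the paper reads the nelson rule, flags the inventory process as out of control when the level sits outside the control limits for at least nine consecutive time units. a minimal sketch:

```python
def out_of_control(levels, lower: float, upper: float, run: int = 9) -> bool:
    """return True if `run` consecutive observations fall outside [lower, upper]."""
    streak = 0
    for x in levels:
        streak = streak + 1 if (x < lower or x > upper) else 0
        if streak >= run:
            return True
    return False

# illustrative series: five weeks in control, then nine weeks above the upper limit
stock = [120.0] * 5 + [210.0] * 9
flag = out_of_control(stock, lower=80.0, upper=200.0)
```

once the flag fires, the model raises or lowers the market price by the predetermined rate, clamped to [p_min, p_max], to rebalance demand and supply.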
farmers respond to these requests by evaluating their capabilities, in terms of whether they can fulfill the order with the current vineyard configuration or need to convert a portion of their farmland to organic/conventional to meet the future demand from the winemakers. appendix a.2.3 provides a detailed explanation of the farmers' capacity and their decisions about fulfilling the winemakers' orders for grapes. thus, both parties decide on the volume and selling price of the yield in a renewed contract-farming agreement, as summarized below. convert from conventional to organic farming: no change in the production plan and vineyard configuration is expected unless the equilibrium price of organic wine increases before the planting season (delta p* > 0). the organic conversion scale (the amount of land to be converted in year y) is a function of the minimum conversion scale, the land required for conversion based on demand estimations, and the perceived failure risk of conversion. the transition from conventional to organic farming takes three years, and the yield from transitioning farms can only be sold as conventional product. this long lead time not only adds to the complications of balancing market demand but also biases farmers' judgments about the long-term costs and benefits of their organic vineyards, as discussed in section 3.3.1. revert from organic to conventional farming: the decisions on increasing the production volume of conventional wine and reverting from organic to conventional agriculture impose higher risks on the financial performance of the sc. in this model, the dynamics of the equilibrium prices of organic and conventional wine play the main role in provoking the reversion decision:
• if there is no positive change in the organic wine equilibrium price while the conventional equilibrium price is increasing and the sc service level is less than the minimum acceptable level, or
• if there is an oversupply of organic wine and its equilibrium price is at the minimum.
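the two reversion conditions in the bullets above translate directly into a rule-based check; the variable names are ours, and the example arguments are illustrative.

```python
def revert_to_conventional(d_organic_price: float, d_conv_price: float,
                           service_level: float, organic_oversupply: bool,
                           organic_at_min_price: bool,
                           alpha: float = 0.95) -> bool:
    """revert from organic to conventional farming when either triggering
    condition from the model holds."""
    # condition 1: stagnant organic price, rising conventional price,
    # and a service level below the minimum acceptable level
    cond1 = d_organic_price <= 0.0 and d_conv_price > 0.0 and service_level < alpha
    # condition 2: organic oversupply with the organic price at its minimum
    cond2 = organic_oversupply and organic_at_min_price
    return cond1 or cond2

# first condition fires: flat organic price, rising conventional price, sl = 0.92
decision = revert_to_conventional(0.0, 0.4, 0.92, False, False)
```

encoding the contract renewal as an explicit predicate like this keeps the farmer agent's behavior inspectable, which matters when tracing why land flips between farming systems in a long run.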
sustainability objectives, including social, environmental, and economic considerations, as well as behavioral considerations, guide the essc decisions. we address the social issues from the public health perspective, as a function of organic food consumption. organic diets expose consumers to fewer chemicals associated with human diseases such as cancer (chen et al., 2015), autism (kalkbrenner et al., 2014), and infertility. one cited study tracked participants after switching to an all-organic diet and found that the level of synthetic pesticides in all participants dropped, on average, by 60.5% after eating only organic food for just 6 days. a recent comprehensive discussion of the benefits of organic food for human health is also found in vigar et al. (2020). by increasing the consumption of organic food, people can improve their health and well-being. thus, (1) social performance accounts for organic product consumption and is defined as: SP_y = N^{org}_y, (6) where N^{org}_y is the number of organic consumers in year y. rohmer et al. (2019) and sazvar et al. (2018) used similar diet-related indicators, such as nutritional compliance (i.e., the amount of nutrient n consumed) and individual health-living environmental health (i.e., organic product consumption and production), to assess the performance of an ssc in terms of public health. with regard to environmental issues, this study focuses on the size of the land used for organic farming practices. the heavy use of pesticides and synthetic fertilizers in conventional farming is seen as a major cause of the more than 40% decline in the number of insects, and if this trend continues, there may be no insects left within the next 100 years (stepanian et al., 2020). the adoption of organic farming can help to protect soil quality, keep waterways clean, and preserve the landscape. organic farming can certainly reduce environmental impacts related to toxicity, and it could also help in biodiversity preservation.
(2) environmental performance measures the size of the land used for organic farming and is defined as: EP_y = A^{org}_y, (7) where A^{org}_y is the total land used for organic farming in year y. we consider the revenue obtained from the sale of organic food products as an indication of economic performance. while sc cost is the most commonly used indicator, this research focuses on green economic growth and fostering the income from green products. thus, (3) economic performance evaluates the organic income and is defined as: EC_y = S^{org}_y, (8) where S^{org}_y is the total organic food product sales in year y. given the difficulties associated with the quantification of behavior, farmers' goals and expectations about the adoption of organic farming can be used as a measure. according to bouttes et al. (2018), organic farmers' work enjoyment is determined by their expectations of organic farming conversions, "a satisfaction heightened by the positive feedback they already receive for their decision to convert." in transitioning to more ecological farming practices, market feedback (in terms of the price incentives offered by consumers) is essential to enable farmers to enhance their adaptive capacity, recover from current setbacks, and cope with future change. thus, (4) behavioral performance is defined as: BP_y = E_y, (9) where E_y is the value-based expectation of farmers about organic farming in year y, from (2). the general model described in section 3 is applied to a case study derived from the australian wine industry. currently, less than 0.5% of the grape production volume in the australian wine market belongs to organic wine. the focus of our study was on understanding the collective impact of individual behavior change on the performance of the supply chain. in doing so, we have modeled disaggregated demand using abm as the best option. des and sd enabled us to simulate the processes involved and a workable mental model for farmers at the aggregated level.
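the four yearly indicators can be collected in a small sketch. the identity mappings follow the definitions above; where a normalization was not recoverable from the text (the social indicator), we express it as a raw count, and all example numbers are illustrative.

```python
def social(organic_consumers: int) -> int:
    """(1) social performance: number of organic consumers in year y."""
    return organic_consumers

def environmental(organic_land: float) -> float:
    """(2) environmental performance: land under organic farming in year y (ha)."""
    return organic_land

def economic(organic_sales: float) -> float:
    """(3) economic performance: revenue from organic product sales in year y."""
    return organic_sales

def behavioral(expectation: float) -> float:
    """(4) behavioral performance: farmers' value-based expectation of organic
    farming in year y, taken from the adaptive-expectations equation (2)."""
    return expectation

kpis = (social(180), environmental(12.0), economic(250_000.0), behavioral(11.75))
```

tracking the four series side by side over a run is what makes the later scenario comparisons (e.g., +78% social, +122% environmental) possible.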
with regard to farmers and winemakers, we aimed at representing the usual operations and practices in the region. so, in the model, we use a representative farmer agent and a winemaker agent with the characteristics of the cool-climate grape growers in south australia and the typical processes of its commercial wineries, where we had collected empirical data. this region alone is responsible for more than half of the production of all australian wine. while we acknowledge that more than sixty different grapevine varieties exist in australian vineyards, for simplification we collect data on one popular type, cabernet sauvignon (the yield of organic/conventional land, resource requirements, and operational costs) (refer to appendix b.1). usually, wineries are established in the grape-producing zones to reduce transportation costs and preserve the quality of the crops. the winery warehouses, however, may be located far from the production sites and closer to the customer zones. we assume that the winery warehouse is located in the vicinity of the retailers. the empirical survey data (2013) also contained the number of consumers intending to purchase organic wine when the price of organic wine is set to au$12, au$13, and au$14. this data was not used for calibration purposes and was set aside to revalidate the model. a comparison between the estimated number of consumers intending to purchase organic wine and the empirical data from the literature is reported in table 1. the simulation model can estimate the number of organic wine consumers with high accuracy, translating to an error between 3% and 18%, depending on the willingness-to-pay settings. table 1. model validation results, comparing the number of consumers intending to purchase organic wine when its price is set to au$12 (20% more), au$13 (30% more), and au$14 (40% more).
for the des model of the vineyard processes and outputs, we consulted industry experts in the fields of organic food science and agriculture and made presentations at conferences and meetings. we also tested the performance of the model using extreme scenarios, for example, maximum and minimum prices for wine, maximum and minimum values for the yields of vineyards, and maximum and minimum values for the statistical process control limits. because of the overall model complexity, we used the one-factor-at-a-time (ofat) method to calculate the sensitivity of the model outputs to the input parameters. we analyze the model outputs by varying the model inputs by ±20% of their base-case values. figure 10. sensitivity analysis of the model estimations with respect to the input parameters (details are presented in appendix d, table d1). for example, figure 10 presents the sensitivity of the model results to variations in the weights of attitude (wa), pbc (wb), social norms (ws), hedonic goals (wh), gain goals (wg), and normative goals (wn). for a detailed discussion of these weights, we refer the reader to appendix c.3.3 in (taghikhah et al., 2020). variations of less than 5% are excluded from the charts. overall, social and economic performance have the lowest sensitivity to the inputs, while environmental and behavioral performance undergo significant variations. wa and ws account for the highest changes in social and economic performance, respectively (+22% ([18%, 24%] at the 95% confidence interval (ci)) and +23% ([20%, 27%] at the 95% ci) compared to the baseline). the value of environmental performance is equally sensitive to the wa, ws, and wn parameters (+40% ([39%, 41%] at the 95% ci) of the baseline estimation). the behavioral performance shows high sensitivity, nearly ±40%, to the dynamics of wn and wh. appendix d provides a detailed explanation of the modified parameters and their influence on the results.
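the ofat procedure itself is straightforward to sketch: vary each input by ±20% of its base value while holding the rest fixed, and record the relative change in the output. the toy model below is a stand-in for the full essc simulation, and its coefficients are ours.

```python
def toy_model(params: dict) -> float:
    """stand-in output; the real study runs the full essc simulation here."""
    return params["wa"] * 2.0 + params["ws"] * 5.0 + params["wn"] * 1.0

def ofat(model, base: dict, delta: float = 0.20) -> dict:
    """one-factor-at-a-time sensitivity: relative output change per ±delta
    perturbation of each parameter, all others held at their base values."""
    base_out = model(base)
    effects = {}
    for name in base:
        for sign in (+1, -1):
            p = dict(base)
            p[name] = base[name] * (1 + sign * delta)
            effects[f"{name}{'+' if sign > 0 else '-'}20%"] = \
                (model(p) - base_out) / base_out
    return effects

effects = ofat(toy_model, {"wa": 0.5, "ws": 0.12, "wn": 0.1})
```

ofat is cheap (2n runs for n parameters) but ignores interactions between inputs, which is exactly why the text flags a global sensitivity analysis as future work.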
from this uncertainty analysis, we can conclude that while the model is statistically sensitive to some parameters (e.g., wa, wh, and wn), overall the model outputs (the economic, social, environmental, and behavioral performance) are quite robust: they stay within the 95% ci limits, and the trajectories neither diverge to infinity nor collapse to zero. this also helps us to target particular types of parameters for future refinement in empirical studies. for example, given that the model outputs are especially sensitive to social norms, more effort could be spent on improving the empirical micro-foundations for this parameter. conducting a global sensitivity analysis on this computationally intensive model, to assess the variations in the outputs under combinations of changing input parameters, requires a high-performance computing cluster and remains a subject for future work. the model is programmed in the anylogic 8.3 simulation software with the help of agent-based, process-centric, and system dynamics modeling approaches. see section 3 for more details on accessing the files. when proposing the essc approach instead of the more traditional ssc analysis (taghikhah et al., 2019), we assumed that the introduction of consumer behavior and preferences can have an impact on the overall performance of the sc. here, with the model in place, we can actually see how such a structural change in the way the sc is defined impacts the main performance indicators. in the majority of the models proposed in the ssc literature, the demand for products is homogeneous. in contrast, the essc accounts for heterogeneous demand. to turn our essc model into a more conventional ssc one, we replace the heterogeneous, adaptive consumers with homogeneous, rational ones, using the average weekly demands for organic and conventional wines at each retailer. the ssc assumption is that the demands are constant in time, homogeneous, and independent of the supply levels and price of wines. figure 11.
a comparison between the proposed essc and the ssc (homogeneous demand). we scale the values of the ssc outputs to 100% and compare them with the baseline values of the essc, as presented in figure 11. behavioral performance is excluded from the analysis because the ssc does not account for farmers' expectations. it can be seen that there are significant differences between the outputs of the ssc and the essc in terms of environmental (+176% points) and economic performance (-26% points). in the case of the ssc, since the dynamics of wine prices do not affect the demand, the sales of organic wine would be higher than in the essc, even if the prices of the products (organic wine (-26% points) and conventional wine (-36% points)) are lower. this analysis shows that, in the absence of heterogeneous demand, farmers do not perceive the market value of organic products and may decide to revert to conventional farming, as reflected in the environmental performance. once the model is tested and displays reliable and meaningful performance, it can be used to explore the impact of various control factors on the overall dynamics of the system. this can help us to test how the system reacts to various combinations of input functions and parameters, which we call scenarios, and which describe management decisions and possible system modifications. there are many ways the system can be manipulated, and many policies and management interventions that can be explored. this is a subject for separate research; here, our purpose is only to demonstrate how the essc can be used in industry and policy design and to show its receptivity to market feedback. in this research, we consider the approach to scenario use proposed in kunc and o'brien (2017), who provided a practical framework for supporting the strategic performance of a firm by exploring the firm's resources and capabilities.
based on this approach, we have designed a set of scenarios considering the opportunities and threats to the sc in the external environment, in conjunction with the dynamics of its strengths and weaknesses. gu and kunc (2019) also developed a hybrid simulation model for a supermarket sc and adopted a similar approach in devising strategies. for the purpose of this study, we only discuss the demand-side scenarios, describing two possible changes in the demographics (economic status, such as income) and behavior (social networks, such as the neighborhood effect) of the consumers, and compare the results to the baseline model output presented above in section 5.2.1.
scenario 1: there is a 20% increase in the number of middle- and high-income consumers. in terms of model parameters, this means that the income of 14% of consumers earning up to au$100,000 per year (the middle-income group) is increased to au$150,000 per year (the high-income group). at the same time, the income of 6% of consumers earning up to au$50,000 per year (the low-income group) is increased to au$100,000 per year (the middle-income group). this is consistent with the growing trend in australia. currently, the production rate of organic wine is low and, on the contrary, the production rate of conventional wine is high. to comply with the possible growth in the consumption of organics in the near future, due to the increasing marginal utility of income, the sc cannot immediately respond to the demand and requires a three-year transition period from conventional to organic wine production. this scenario can be considered a weakness-opportunity strategy.
scenario 2: the effect of neighborhood-level characteristics on the wine preferences of consumers is restricted, because of the increasing trend of people living in apartments, who are therefore less likely to interact with each other on a regular basis. in fact, sydney's urban population has moved towards apartment living to meet the affordable housing needs of the growing population.
this change hinders social gatherings and neighbor interactions, so that the influence of social norms on wine preferences becomes minimal. in terms of model parameters, this means that the weight of social norms on intention is changed from 0.12 to 0.02. as the word-of-mouth effect is small, the sc can shift the norm from conventional to organic wine purchasing, from a vicious into a virtuous cycle. this shift can perhaps bring higher socio-economic benefits for the business. it can be considered a strength-threat strategy. appendix e provides a detailed explanation of the neighborhood effect and its sensitivity as defined in this model (please refer to figure 10). the results presented in figure 12 show that in scenario 2 all the indicators, except behavioral performance, perform better than in scenario 1. by reducing the influence of social interactions (among customers living in a neighborhood) on wine purchasing decisions, the social, environmental, and economic performance of the essc can be improved by 78%, 122%, and 76%, respectively. however, due to the market volatility caused by variations in the price of organic products and the correlated changes of demand and supply, farmers' expectations of the value of organic farming do not grow significantly (figure 13d). these dynamics are the result of conventional wine overstocking and organic wine understocking, caused mainly by the three-year conversion period. by contrast, in scenario 1, we observe a growth in the organic market size of 17% in year 14, which eventually leads to a gradual increase in the farmers' expectations of the value of organic agriculture of 25% by year 30. with regard to environmental and economic performance, there is 17% and 22% growth, respectively, in scenario 1. the market financial incentive in this case is not yet sufficient to meet the expectations of farmers regarding the value of organic farming, and hence government support is required.
from these production-consumption patterns, we may conclude:
• there is a negative impact of uncertain prices on farmers' expectations of organic adoption: the unpredictable and erratic organic prices add uncertainty to farmers' expectations about future returns. as it is unclear when organic wine prices will recover or stabilize, farmers start to prefer the conventional markets more. they could choose to enter the organic markets if the price of organic wine rose and remained relatively stable following the conversion from conventional farming. thus, in periods of high volatility of the organic wine price but stability of the conventional price, farmers tend to assign a higher value to waiting to convert, expecting the risks of organic farming to be lower in the future.
• the propagation of consumer organic preferences through the agro-food sc is slow: the adaptation of sc operations to dynamic market trends can be delayed. for example, in both simulated scenarios, the changes in the environmental and behavioral performance start 5 to 10 years after the start of the simulation. as there are two echelons between the consumers and the farmers, transmitting the feedback/market signals from the preferences of consumers to the land management decisions of farmers comes with a delay. taylor (2006) and naik and suresh (2018) emphasize that operational and structural factors, such as long lead times and the absence of long-term demand forecasts, account for this gap between agricultural production and consumer demand.
• social norms can trigger big shifts in consumer wine preferences: it is interesting to observe that minor changes in the consumption-side parameters can help to improve the socio-environmental performance of the agro-food sc. the social norms manipulation (reducing the neighborhood effect) promotes ecological behavior more significantly than economic factors (consumer income growth).
it is quite challenging to motivate consumers to spend more on organic products in the absence of supportive norms, even if their income level is higher. as social norms exert a strong effect on food consumption and production behavior, considering them in the management of an ssc can provide new insights. although the essc model is quite complex, we can still use it for purposes of optimization. in analyzing the possible optimized scenarios, we can find the optimal organic and conventional wine prices for maximizing social, environmental, economic, and behavioral performance separately. these experiments demonstrate the capability and flexibility of the proposed essc model. organic farming is a promising solution for moderating the impacts of agriculture on ecosystems and improving human health. despite the potential benefits that this method has for biodiversity and soil fertility, the global adoption rate of organic farming is still low. it has not become mainstream for two main reasons: (1) lower farm yields and higher production costs in comparison to intensive agriculture (uematsu & mishra, 2012) and, as a result, (2) reliance on a niche segment of consumers and a small market share compared to conventional food (o'mahony & lobo, 2017). a growing number of studies focus on improving the productivity of organic agriculture from sustainability perspectives; yet the relationships between the behavior of final consumers and the decisions of upstream supply chain actors, in this case farmers, have been poorly analyzed (naik & suresh, 2018; taghikhah et al., 2019). we address this void by extending the analysis of a traditional food sc to include the dynamics of consumer choices and preferences for organic versus conventional food, as recommended by the essc framework (taghikhah et al., 2019).
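the single-objective price optimization mentioned above can be sketched as a grid search over candidate organic wine prices, maximizing one indicator at a time. the revenue curve below is a toy stand-in for the full simulation, with coefficients of our own choosing.

```python
def economic_performance(price: float) -> float:
    """toy objective: revenue under a linear demand curve falling with price."""
    demand = max(0.0, 1000.0 - 40.0 * price)
    return price * demand

def best_price(objective, lo: float, hi: float, step: float = 0.5) -> float:
    """evaluate the objective on a price grid and return the maximizer."""
    price, best = lo, objective(lo)
    p = lo
    while p <= hi:
        if objective(p) > best:
            price, best = p, objective(p)
        p += step
    return price

p_star = best_price(economic_performance, lo=8.0, hi=20.0)
```

in the full study each of the four performance indicators would take the place of the toy objective, and each run of the objective is a complete simulation, which is why the grid is kept coarse.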
this study contributes to the existing literature in the following four ways:
• firstly, it links three very different areas that, to our knowledge, have not yet been synthesized in a modeling study: (i) supply chain design and production economy, (ii) sustainability considerations, and (iii) pro-environmental and pro-health behavior. the model is designed to operationalize the essc framework, in which the sc analysis is extended to explicitly consider the buying behavior of consumers. while there are a number of papers that empirically examine the influence of behavioral aspects of demand on a few elements of supply, we are not aware of any published study that analytically links the heterogeneity of consumers and their preferences to the entire supply chain operation.
• secondly, to the best of our knowledge, this is the first study that incorporates the preferences of consumers for organic food, as well as farmer decisions regarding the adoption of organic farming, into a model of an agro-food sc. organic supply chain modeling studies aimed at reducing environmental impacts have largely ignored important socio-ecological issues related to consumers. in this study, we include the dynamics of consumer behavior (due to changes in the social norms, willingness to pay more, demand substitutions, etc.) and farmers' expectations (due to changes in the price of products, organic versus conventional production, etc.).
• thirdly, it contributes to the methodological development of the ssc field by extending it with the essc paradigm and proposing the integration of sd-des-abm methods to improve decisions in view of the sustainable development goals. so far, systems thinking approaches are underrepresented in the context of ssc research (rebs et al., 2018), while the field can benefit from integrated modeling solutions that account for the interplay between sc and sustainability aspects.
in particular, the interactions between abm and sd provide an opportunity for considering the dynamics of social sustainability by developing a direct formulation of the population, in our case both consumers and farmers. according to brandenburg et al. (2014) and brandenburg and rebs (2015), social simulation is rarely adopted in ssc studies. • fourthly, the novelty of our model lies in capturing the simultaneous interactions between different sc actors (defined as adaptive systems) at different spatial and temporal scales, providing further insights into how integrated modeling can assist in strategic planning and in addressing real-world business challenges, as suggested by kunc (2019). in our case, the behavioral aspects as well as the operational characteristics of the sc are studied in the model. the analysis of the model occurs at different levels of detail: micro-processes for consumption and macro-processes for production. businesses and producers can use the model to understand consumers' preferences, estimate their future influence on operations, and develop long-term plans for land management and the adoption of technologies. the analysis can help them make their business models more resilient to market shocks and signals. essc requires further integration of consumer behavior models as sub-models with traditional ssc models. this integration not only reveals the unobserved heterogeneity of preferences in consumers but also discloses a two-way influence between consumption patterns and production-distribution decisions. we calibrate the proposed model and test the validity of the outputs with available empirical data. the validation process is not straightforward (bert et al., 2014) and can certainly be improved in the future, as more data becomes available and the model undergoes further testing.
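the micro-macro coupling described above can be illustrated with a toy hybrid sketch (this is not the calibrated essc implementation; every functional form and parameter below is an assumption chosen for illustration): heterogeneous consumer agents (abm-style micro level) choose organic wine depending on income, a price premium and the prevailing social norm, while sd-style stocks of organic production capacity and norm support adjust gradually to the aggregated choices:

```python
import random

random.seed(1)

class Consumer:
    """micro level (abm-style): heterogeneous income and norm sensitivity."""
    def __init__(self):
        self.income = random.lognormvariate(0.0, 0.4)  # relative income
        self.norm_weight = random.uniform(0.0, 1.0)    # responsiveness to norms

    def prefers_organic(self, premium, norm):
        # organic is chosen when norm support outweighs the income-scaled premium
        return self.norm_weight * norm > premium / self.income

def simulate(consumers, premium, steps=20):
    """macro level (sd-style): capacity and the social norm are stocks that
    adjust with delay to the aggregated micro-level choices."""
    norm, capacity, share = 0.5, 0.1, 0.0
    for _ in range(steps):
        wants = sum(c.prefers_organic(premium, norm) for c in consumers) / len(consumers)
        share = min(wants, capacity)                  # sales limited by supply
        capacity += 0.3 * (share - capacity) + 0.02   # farmers convert with delay
        norm += 0.5 * (wants - norm)                  # norms follow preferences
    return share

consumers = [Consumer() for _ in range(2000)]
share_low_premium = simulate(consumers, premium=0.2)
share_high_premium = simulate(consumers, premium=0.8)
```

even in this toy version, the two-way influence is visible: a high premium suppresses expressed preferences, which erodes the norm and keeps the organic share near zero, while a low premium lets the norm and the (delayed) supply capacity reinforce each other.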
the comparison between the results of the essc and ssc analyses indicates that the assumption of homogeneity in consumer preferences may need to be reconsidered and relaxed. the homogeneous demand assumption has the highest impact on environmental and economic performance. our modeling experiments demonstrate the adaptiveness of the essc model to market dynamics. the findings with respect to changes in the financial and behavioral status of consumers highlight that changing social norms has the greatest impact on improving the sustainability of the sc. as there are multiple actors between consumers and suppliers, farmers' perceptions and expectations of the value of organic-based agriculture may deviate notably from reality. moreover, the adaptation of producers to market trends takes much time due to delays in supply. the analysis of optimal scenarios produces solutions that can simultaneously improve economic, social, and environmental performance, but not behavioral performance. this means that, as organic farming expands in response to the growing demand of organic consumers, a significant reduction in organic wine prices will eventually occur, which may not be favorable for farmers. accounting for demand-side heterogeneity provides new insights into addressing sustainability issues in scs. the results imply that the design of organic food policies aiming at behavioral change should not be limited to financial incentives. in designing politically feasible policy options, paying attention to the social environment, public awareness, norm support cues, and cultural codes can reinforce the transition to organic agriculture. accompanying information and value-based policy instruments may not only lead to the diffusion of organic food consumption but also increase the number of organic farms.
having said that, due to the presence of certain constraints and barriers (for example, changing prices and availability), a quick transition in organic consumption-production cannot be expected. government price control schemes to control minimum or maximum prices, and trade control to balance exports and imports, can speed up the contribution from the demand side in reducing the environmental impacts of production. a future research direction for this study is to apply the model to investigating the implications of social change for organic food development. one can use the strategy development protocol (torres et al., 2017) to generate scenarios in consultation with managers. in particular, the influence of green taxation schemes, informational marketing campaigns, and organic food promotions and incentives on the adaptive behavior of farmers and consumers can be further examined and assessed. another example of such scenarios is to explore the impact of covid-19, as the reserve bank of australia forecasts that gdp will fall by 6% in 2020, with unemployment rising to a slightly larger 7%. at the same time, social norms probably play a less significant role in households' choices due to lockdown and social distancing. it is interesting and timely to quantitatively assess whether the negative effect of the pandemic on wealth can be overcome by reversing the norms. a potential extension of this model will include agroecological models of crop growth to forecast farm yield with regard to the adopted farming system (e.g., organic, biodynamic, conventional) under changing climatic factors (e.g., temperature, humidity, rainfall). the model was developed for the wine case study, yet it is generic enough to be used for studying a wide range of agro-food scs that have similar characteristics, such as tea and coffee scs.
another interesting area to explore is the heterogeneity of farmers regarding their expectations of organic farming adoption and their choice between different conversion strategies. with minor modifications, the model can be easily adapted to other agricultural products to explore ways of transitioning to organic farming. the analytical framework and suggested modeling approach can also be adopted by researchers to examine the adaptive behavior of the disaggregated, multi-scale tiers of the sc in other sectors. finally, the model can be used as a decision support tool to help practitioners design evidence-based policies for organic food.

references
• pilot scale thermal and alternative pasteurization of tomato and watermelon juice: an energy comparison and life cycle assessment
• from intentions to actions: a theory of planned behavior
• exploring the role of contracts to support the emergence of self-organized industrial symbiosis networks: an agent-based simulation study
• australian competition and consumer commission
• opportunities and challenges in sustainable supply chain: an operations research perspective
• multi-level and hybrid modelling approaches for systems biology
• lessons from a comprehensive validation of an agent-based model: the experience of the pampas model of argentinean agricultural systems
• sustainable supply chain management practices and dynamic capabilities in the food industry: a critical analysis of the literature
• vulnerability to climatic and economic variability is mainly driven by farmers' practices on french organic dairy farms
• hybrid simulation modelling in operational research: a state-of-the-art review
• quantitative models for sustainable supply chain management: developments and directions
• sustainable supply chain management: a modeling perspective
• the age dynamics of vineyards: past trends affecting the future
• residential exposure to pesticide during childhood and childhood cancers: a meta-analysis
• association between pesticide residue intake from consumption of fruits and vegetables and pregnancy outcomes among women undergoing infertility treatment with assisted reproductive technology
• a model proposal for green supply chain network design based on consumer segmentation
• sustainable vegetable crop supply problem with perishable stocks
• multiagent optimization approach to supply network configuration problems with varied product-market profiles
• hybrid simulation challenges and opportunities: a life-cycle approach
• study of game models and the complex dynamics of a low-carbon supply chain with an altruistic retailer under consumers' low-carbon preference
• the future of food and agriculture: trends and challenges. food and agriculture organisation, rome
• can nudging improve the environmental impact of food supply chain? a systematic review
• tackling climate change through livestock: a global assessment of emissions and mitigation opportunities
• using hybrid modelling to simulate and analyse strategies
• or forum: the evolution of closed-loop supply chain research
• competition and evolution in multi-product supply chains: an agent-based retailer model
• challenges of operations research practice in agricultural value chains
• continuous review (s, s) policies for inventory systems incorporating a cutoff transaction size
• representation of decision-making in european agricultural agent-based models
• organic diet intervention significantly reduces urinary pesticide levels in us children and adults
• integrating harvesting decisions in the design of agro-food supply chains
• environmental chemical exposures and autism spectrum disorders: a review of the epidemiological evidence. current problems in pediatric and adolescent health care
• prospective association between consumption frequency of organic food and body weight change, risk of overweight or obesity: results from the nutrinet-santé study
• system dynamics: a behavioral modeling method
• strategic planning: the role of hybrid modelling
• behavioral operational research: theory, methodology and practice
• exploring the development of a methodology for scenario use: combining scenario and resource mapping approaches
• sustainable food supply chain management
• packaging, blessing in disguise: review on its diverse contribution to food sustainability
• normative, gain and hedonic goal frames guiding environmental behavior
• applying a biocomplexity approach to modelling farmer decision-making and land use impacts on wildlife
• does organic farming (of) work in favour of protecting the natural environment? a case study from poland
• a green supply chain network design framework for the processed food industry: application to the orange juice agrofood cluster
• the fuzzy multi-objective distribution planner for a green meat supply chain
• purpose and benefits of hybrid simulation: contributing to the convergence of its definition
• from hybrid simulation to hybrid systems modelling
• challenges of creating sustainable agri-retail supply chains. iimb management review
• environmental impacts of food consumption and nutrition: where are we and what is next?
• the dynamics of supply
• environmental impacts of food consumption in europe
• the organic industry in australia: current and future trends
• statistical process control. routledge
• consumer willingness to pay premiums for the benefits of organic wine and the expert service of wine retailers
• optimising economic, environmental, and social objectives: a goal-programming approach in the food sector
• agent-based and individual-based modeling: a practical introduction
• system dynamics modeling for sustainable supply chain management: a literature review and systems thinking approach
• systemic sustainability characteristics of organic farming: a review
• sustainable supply chain design in the food system with dietary considerations: a multi-objective analysis
• managing consumer behavior toward on-time return of the waste electrical and electronic equipment: a game theoretic approach
• a game theoretic approach for assessing residential energy-efficiency program considering rebound, consumer behavior, and government policies
• a sustainable supply chain for organic, conventional agro-food products: the role of demand substitution, climate change and public health
• agent-based simulation of consumer behavior in grocery shopping on a regional level
• from a literature review to a conceptual framework for sustainable supply chain management
• a review on quantitative models for sustainable food logistics management
• declines in an abundant aquatic insect, the burrowing mayfly, across major north american waterways
• extending the supply chain to address sustainability
• exploring consumer behavior and policy options in organic food adoption: insights from australian wine sector
• demand management in agri-food supply chains: an analysis of the characteristics and problems and a framework for improvement. the international journal of logistics management
• the intergovernmental panel on climate change (ipcc)
• consumer buying behaviour in a green supply chain management context: a study in the dutch electronics industry
• supporting strategy using system dynamics
• environmental impacts of products: a detailed review of studies
• exploring a safe operating approach to weighting in life cycle impact assessment: a case study of organic, conventional and integrated farming systems
• organic farmers or conventional farmers: where's the money?
• applications of agent-based modelling and simulation in the agri-food supply chains
• towards better representation of organic agriculture in life cycle assessment
• a systematic review of organic versus conventional food consumption: is there a measurable benefit on human health?
• 'integronsters', integral and integrated modeling
• assessing behavioural change with agent-based life cycle assessment: application to smart homes
• modeling and analysis of sustainable supply chain dynamics
• pricing and collection rate decisions in a closed-loop supply chain considering consumers' environmental responsibility
• global wine sustainable, organic
• multi-agent based multi objective renewable energy management for diversified community power consumers
• organic and local food consumer behaviour: alphabet theory
• recent advances and opportunities in sustainable food supply chain: a model-oriented review

acknowledgements
we wish to thank the editor and three anonymous reviewers for their valuable comments on this manuscript. we would like to thank professor raimo p. hämäläinen for his constructive and detailed suggestions on this paper. we would also like to thank dr chris penfold from the university of adelaide, school of agriculture, food and wine, and ms. sandy hathaway, industry analyst from wine australia, for providing their expertise and assistance with vineyard and industry data.
key: cord-225429-pz9lsaw6 authors: rodrigues, helena sofia title: optimal control and numerical optimization applied to epidemiological models date: 2014-01-29 journal: nan doi: nan sha: doc_id: 225429 cord_uid: pz9lsaw6 the relationship between epidemiology, mathematical modeling and computational tools makes it possible to build and test theories on the development and battling of a disease. this phd thesis is motivated by the study of epidemiological models applied to infectious diseases from an optimal control perspective, giving particular relevance to dengue. dengue is a subtropical and tropical disease transmitted by mosquitoes that affects about 100 million people per year and is considered by the world health organization to be a major concern for public health. the mathematical models developed and tested in this work are based on ordinary differential equations that describe the dynamics underlying the disease, including the interaction between humans and mosquitoes. an analytical study is made of the equilibrium points, their stability and the basic reproduction number. the spread of dengue can be attenuated through measures to control the transmission vector, such as the use of specific insecticides and educational campaigns. since the development of a potential vaccine has been a recent global priority, models based on the simulation of a hypothetical vaccination process in a population are proposed. based on optimal control theory, we have analyzed the optimal strategies for using these controls, and their respective impact on the reduction/eradication of the disease during an outbreak in the population, considering a bioeconomic approach. the formulated problems are numerically solved using direct and indirect methods. the former discretize the problem, turning it into a nonlinear optimization problem; indirect methods use the pontryagin maximum principle as a necessary condition to find the optimal curve for the respective control.
in these two strategies several numerical software packages are used. the relationship between epidemiology, mathematical modeling and computational tools makes it possible to build and test theories on the development and battling of a disease. this thesis is motivated by the study of epidemiological models applied to infectious diseases from an optimal control perspective, giving particular relevance to dengue. a tropical and subtropical disease transmitted by mosquitoes, dengue affects about 100 million people per year and is considered by the world health organization a major public health concern. the mathematical models developed and tested in this work are based on ordinary differential equations that describe the dynamics underlying the disease, namely the interaction between humans and mosquitoes. an analytical study of these models is carried out with respect to their equilibrium points, stability and basic reproduction number. the spread of dengue can be attenuated through measures to control the transmission vector, such as the use of specific insecticides and educational campaigns. since the development of a potential vaccine has been a recent worldwide priority, models based on the simulation of a hypothetical vaccination process in a population are proposed. based on optimal control theory, the optimal strategies for the use of these controls, and their repercussions on the reduction/eradication of the disease during an outbreak in the population, are analyzed under a bioeconomic approach. the formulated problems are solved numerically using direct and indirect methods. the former discretize the problem, reformulating it as a nonlinear optimization problem. the indirect methods use the pontryagin maximum principle as a necessary condition to find the optimal curve for the respective control. in these two strategies several numerical software packages are used.
throughout this work, there was always a compromise between the realism of the epidemiological models and their tractability in mathematical terms. introduction. "mathematical biology is a fast-growing, well-recognized, albeit not clearly defined, subject and is, to my mind, the most exciting modern application of mathematics." -j. d. murray, mathematical biology, 2002. epidemiology has become an important issue for modern society, and the relationship between mathematics and epidemiology has been growing closer. for the mathematician, epidemiology provides new and exciting branches, while for the epidemiologist, mathematical modeling offers an important research tool for studying the evolution of diseases. in 1760, a smallpox model was proposed by daniel bernoulli; it is considered by many authors to be the first epidemiological mathematical model. theoretical papers by kermack and mckendrick, between 1927 and 1933, about infectious disease models have had a great influence on the development of mathematical epidemiology models [93]. most of the basic theory had been developed during that time, but theoretical progress has been steady since then [14]. mathematical models are being increasingly used to elucidate the transmission of several diseases. these models, usually compartmental, may be rather simple, but studying them is crucial for gaining important knowledge of how infectious diseases spread [63], and for evaluating the potential impact of control programs in reducing morbidity and mortality. after the second world war, public health strategy focused on the control and elimination of the organisms that cause diseases. the appearance of new antibiotics and vaccines brought a positive outlook on disease eradication.
however, factors such as resistance to medicines by microorganisms, demographic evolution, accelerated urbanization, increased travel and climate change have led to new diseases and the resurgence of old ones. in 1981, the human immunodeficiency virus (hiv) appeared and has since become an important sexually transmitted disease throughout the world [64]. furthermore, malaria, tuberculosis, dengue and yellow fever have re-emerged and, as a result of climate change, have been spreading into new regions [64]. recent years have seen an increasing trend in the representation of mathematical models in publications in the epidemiological literature, from specialist journals of medicine, biology and mathematics to the highest-impact generalist journals [50], showing the importance of interdisciplinarity. their role in comparing, planning, implementing and evaluating various control programs is of major importance for public health decision makers. this interest has been reinforced by the recent examples of the sars (severe acute respiratory syndrome) epidemic in 2003 and the influenza pandemic in 2009. although chronic diseases, such as cancer and heart disease, have been receiving more attention in developed countries, infectious diseases are still important and cause suffering and mortality in developing countries. they remain a serious medical burden all around the world, with an estimated 15 million deaths per year directly related to infectious diseases [71]. the successful containment of emerging diseases is linked not just to medical infrastructure but also to the capacity to recognize their transmission characteristics and apply optimal medical and logistic policies. public health authorities often ask questions such as [64]: how many people will be infected, how many will require hospitalization, what is the maximum number of people ill at a given time, and how long will the epidemic last? as a result, an ever-increasing capacity for rapid response is necessary.
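one of those questions, the maximum number of people ill at a given time, is already answered by the simplest compartmental model. the sketch below simulates an sir epidemic in python with scipy (beta and gamma are hypothetical values, not fitted to any dataset) and checks the simulated peak against the closed-form sir peak prevalence:

```python
import numpy as np
from scipy.integrate import solve_ivp

beta, gamma = 0.5, 0.1       # illustrative transmission and recovery rates (per day)
r0 = beta / gamma            # basic reproduction number

def sir(t, y):
    """classic sir dynamics for population fractions (s, i, r)."""
    s, i, _ = y
    return [-beta * s * i, beta * s * i - gamma * i, gamma * i]

sol = solve_ivp(sir, (0.0, 160.0), [0.999, 0.001, 0.0],
                dense_output=True, rtol=1e-8, atol=1e-10)
i_of_t = sol.sol(np.linspace(0.0, 160.0, 1601))[1]
peak_prevalence = i_of_t.max()

# for sir with s(0) ~ 1, the peak satisfies i_peak = 1 - (1 + ln r0) / r0
predicted_peak = 1.0 - (1.0 + np.log(r0)) / r0
```

with these illustrative rates the peak prevalence is close to 48% of the population, matching the analytic expression.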
education, vaccination campaigns, preventive drug administration and surveillance programs are all examples of prevention methods that authorities must consider for disease prevention. whenever the disease declares itself, emergency interventions such as disinfectants, insecticide application, mechanical controls and quarantine measures must be considered. intervention strategies can be modelled with the goal of understanding how they will influence the battle against the disease. as financial resources are limited, there is a pressing need to optimize investments for disease prevention and fighting. traditionally, the study of disease dynamics has focused on identifying the mechanisms responsible for epidemics, but has taken little account of economic constraints when analyzing control strategies. on the other hand, economic models have given insight into optimal control under constraints imposed by limited resources, but they frequently ignore the spatial and temporal dynamics of the disease. therefore, progress requires a combination of epidemiological and economic factors for modelling what has until now tended to remain separate. more recently, bioeconomic approaches to disease management have been advocated, since infectious diseases can be modelled bearing in mind that the limited resources involved require trade-offs. finding the optimal strategy depends on the balance of economic and epidemiological parameters that reflect the nature of the host-pathogen system and the efficiency of the control method. the main goal of this thesis is to formulate epidemiological models, giving special importance to dengue. moreover, it is our aim to frame the disease management question as an optimal control problem requiring the maximization/minimization of some objective function that depends on the infected individuals (biological issues) and control costs (economic issues), given some initial conditions.
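one way to make such a bioeconomic objective concrete is the direct approach used later in the thesis: discretize the control on a time grid, propagate the state, and solve the resulting nonlinear program with a generic optimizer. the sketch below is illustrative only; the model (reducing sir transmission at quadratic cost) and every parameter value are made up, not taken from the thesis:

```python
import numpy as np
from scipy.optimize import minimize

beta, gamma, w = 0.5, 0.1, 0.05   # transmission, recovery, control-cost weight
T, N = 60, 60                     # horizon (days) and number of control intervals
dt = T / N

def objective(u):
    """discretized cost j = sum (i + w u^2) dt along an explicit-euler path:
    infections (biological cost) plus quadratic control effort (economic cost)."""
    s, i, cost = 0.99, 0.01, 0.0
    for k in range(N):
        cost += (i + w * u[k] ** 2) * dt
        new_inf = beta * (1.0 - u[k]) * s * i * dt
        s, i = s - new_inf, i + new_inf - gamma * i * dt
    return cost

# the discretized optimal control problem is now a bound-constrained nlp
res = minimize(objective, x0=np.full(N, 0.8), method="L-BFGS-B",
               bounds=[(0.0, 1.0)] * N)
u_opt = res.x
```

the solver only sees a finite-dimensional bound-constrained problem, which is exactly what makes direct methods convenient: any nlp package can be plugged in.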
this will allow us to propose practical control measures to the authorities and to assess and forecast the disease burden, such as the attack rate, morbidity, hospitalization and mortality. the thesis is composed of two parts. the first part, comprising chapters 1 to 3, gives the mathematical background to support the original results presented in the second part, which is composed of chapters 4 to 7. in chapter 1, the definition of the optimal control problem, its possible versions and the adapted first-order necessary conditions based on the pontryagin maximum principle are introduced. simple examples are chosen to exemplify the mathematical concepts. as the number of variables and the complexity increase, optimal control problems can no longer be solved analytically and numerical methods are required. for this purpose, in chapter 2, direct and indirect methods are presented for their resolution. direct methods consist in the discretization of the optimal control problem, reducing it to a nonlinear constrained optimization problem. indirect methods are based on the pontryagin maximum principle, which in turn reduces the problem to a boundary value problem. for each approach, the software packages used in this thesis are described. in chapter 3, the basic building blocks of most epidemiological models are reviewed: the sir model (composed of susceptible-infected-recovered) and the sis model (susceptible-infected-susceptible). for these, it is possible to develop some analytical results which can be useful in the understanding of simple epidemics. taking these as the basis, we discuss the dynamics of other compartmental models capturing more complex realities, such as those with exposed or carrier classes. the second part of the thesis contains the original results and is focused on dengue fever. dengue is a vector-borne disease transmitted by a mosquito of the aedes family. it is mostly found in tropical and sub-tropical climates, mainly in urban areas.
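for a vector-borne disease like dengue, transmission intensity is often summarized by a host-vector basic reproduction number. the sketch below is a ross-macdonald-style illustration with made-up parameter values; in particular, how the three control measures enter the expression is a simplifying assumption for illustration, not the sir+asi expressions derived later in the thesis:

```python
import math

def r0(adulticide=0.0, larvicide=0.0, mech=0.0,
       b=0.8, beta_hv=0.375, beta_vh=0.375, gamma=1/3, mu=1/11, m=6.0):
    """hypothetical host-vector basic reproduction number.
    b: bites per mosquito per day; beta_*: transmission probabilities;
    gamma: human recovery rate; mu: adult vector mortality; m: vectors/human."""
    mu_eff = mu + adulticide                   # adulticide raises adult mortality
    m_eff = m * (1 - larvicide) * (1 - mech)   # fewer vectors per host
    return math.sqrt(m_eff * b ** 2 * beta_hv * beta_vh / (gamma * mu_eff))

baseline = r0()
combined = r0(adulticide=0.05, larvicide=0.3, mech=0.2)
```

even this toy expression reproduces the qualitative point made later for the controlled models: each control lowers r0, and the three controls together lower it more than any one alone.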
dengue can provoke a severe flu-like illness and, in severe cases, can be lethal. according to the world health organization, about 40% of the world's population is now at risk [141]. the main reasons for the choice of this particular disease are: • the importance of this disease around the world, as well as the challenges of its transmission features, prevention and control measures; • two portuguese-speaking countries (brazil and cape verde) have already had experience with dengue, and in the latter a first outbreak occurred during the development of this thesis, which allowed groundbreaking work to be carried out; • the mosquito aedes aegypti, the main vector that transmits dengue, is already in portugal, on madeira island [5], and, although not carrying the disease, is considered a potential threat to public health and has been followed by the health authorities. in chapter 4, information about the mosquito, disease symptoms, and measures to fight dengue is reported. an old problem related to dengue is revisited and solved by different approaches. finally, a numerical study is performed to compare different discretization schemes in order to analyze the best strategies for future implementations. in chapter 5, a seir+asei model is studied. the basic reproduction number and the equilibrium points are computed, as well as their relationship with the local stability of the disease-free equilibrium. this model implements a control measure: adulticide. a study is made to find the best available strategies for applying the insecticide. continuous and piecewise-constant strategies are used, involving either the system of ordinary differential equations alone or optimal control theory. chapter 6 is concerned with a sir+asi model that incorporates three controls: adulticide, larvicide and mechanical control. a detailed discussion on the effects of each control, individually or together, on the development of the disease is given.
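the basic reproduction number computations mentioned above follow the standard next-generation-matrix construction. the sketch below applies it to a plain seir model with hypothetical rates (the thesis carries out the analogous, larger computation for its host-vector seir+asei system):

```python
import numpy as np

beta, sigma, gamma = 0.6, 0.25, 0.2   # illustrative infection, incubation, recovery rates

# infected compartments ordered (e, i), linearized at the disease-free
# equilibrium with s = 1
F = np.array([[0.0, beta],            # new infections enter e at rate beta * i
              [0.0, 0.0]])
V = np.array([[sigma, 0.0],           # outflow from e by progression
              [-sigma, gamma]])       # inflow to i from e, outflow by recovery

next_gen = F @ np.linalg.inv(V)                 # next-generation matrix F V^{-1}
r0 = max(abs(np.linalg.eigvals(next_gen)))      # spectral radius
# for this model the spectral radius reduces to beta / gamma
```

the disease-free equilibrium is locally stable exactly when this spectral radius is below one, which is why the r0 computation and the stability analysis go hand in hand.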
an analysis of the importance of each control in decreasing the basic reproduction number is given. these results are strengthened when the optimal strategy for the model is calculated. bioeconomic approaches, using distinct weights for the respective control costs and for the treatment of infected individuals, are also provided. in chapter 7, simulations for a hypothetical vaccine for dengue are carried out. the features of the vaccine are unknown because the clinical trials are ongoing. using models with a new compartment for vaccinated individuals, perfect and imperfect vaccines are studied with the aim of analyzing the repercussions of the vaccination process on the morbidity and/or eradication of the disease. then, an optimal control approach is studied, considering vaccination not as a new compartment, but as a control measure in fighting the disease. finally, the main conclusions are reported and future directions of research are pointed out. part i: state of the art. chapter 1. the optimal control definition and its possible formulations are introduced, followed by some examples related to epidemiological models. the pontryagin maximum principle is presented with the aim of finding the best control policy. optimal control (oc) is the process of determining control and state trajectories for a dynamic system over a period of time in order to minimize a performance index [16]. historically, oc is an extension of the calculus of variations. the first formal results of the calculus of variations can be found in the seventeenth century. johann bernoulli challenged other famous contemporary mathematicians (such as newton, leibniz, jacob bernoulli, l'hôpital and von tschirnhaus) with the brachistochrone problem: "if a small object moves under the influence of gravity, which path between two fixed points enables it to make the trip in the shortest time?"
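for completeness, the brachistochrone functional and its classical solution can be stated (these are standard results, not taken from the thesis; y is measured downward and the object starts at rest at the origin):

```latex
% travel time along a curve y(x) from the origin to a lower point:
T[y] \;=\; \int_0^{x_1} \sqrt{\frac{1 + y'(x)^2}{2\,g\,y(x)}}\; dx .
% the euler--lagrange equation of this functional yields a cycloid,
x(\theta) = k(\theta - \sin\theta), \qquad y(\theta) = k(1 - \cos\theta),
% with the constant k chosen so that the curve passes through the endpoint (x_1, y_1).
```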
other specific problems were solved and a general mathematical theory was developed by euler and lagrange. the most fruitful applications of the calculus of variations have been to theoretical physics, particularly in connection with hamilton's principle, or the principle of least action. early applications to economics appeared in the late 1920s and early 1930s, by ross, evans, hotelling and ramsey, with further applications published occasionally thereafter [129]. the generalization of the calculus of variations to optimal control theory was strongly motivated by military applications and has developed rapidly since 1950. the decisive breakthrough was achieved by the russian mathematician lev s. pontryagin (1908-1988) and his co-workers (v. g. boltyanskii, r. v. gamkrelidze and e. f. mishchenko) with the formulation and demonstration of the pontryagin maximum principle [106]. this principle has provided research with suitable conditions for optimization problems with differential equations as constraints. the russian team generalized variational problems by separating control and state variables and admitting control constraints. in such problems, oc gives equivalent results, as one would have expected. however, the two approaches differ, and the oc approach sometimes affords insight into a problem that might be less readily apparent through the calculus of variations. oc is also applied to problems where the calculus of variations is not convenient, such as those involving constraints on the derivatives of functions [80]. the theory of oc brought new approaches to mathematics with dynamic programming. introduced by r. e. bellman, dynamic programming makes use of the principle of optimality and is suitable for solving discrete problems, allowing for a significant reduction in the computation of the optimal controls (see [78]).
it is also possible to obtain a continuous approach to the principle of optimality that leads to the solution of a partial differential equation called the hamilton-jacobi-bellman equation. this result brought new connections between the oc problem and lyapunov stability theory. before the arrival of the computer, only fairly simple oc problems could be solved. the arrival of the computer age enabled the application of oc theory and methods to many complex problems. selected examples are as follows: • physical systems, such as stable performance of motors and machinery, robotics, optimal guidance of rockets [59, 90]; • aerospace, including driven problems, orbit transfers, development of satellite launchers and recoverable problems of atmospheric reentry [12, 62]; • economics and management, such as optimal exploitation of natural resources, energy policies, optimal investment and production strategies [92, 127]; • biology and medicine, such as regulation of physiological functions, plant growth, infectious diseases, oncology and radiotherapy [72, 73, 81, 95]. today, oc theory is extensive, with several approaches. one can adjust controls in a system to achieve a goal, where the underlying system can include: ordinary differential equations, partial differential equations, discrete equations, stochastic differential equations, integro-difference equations, or a combination of discrete and continuous systems. in this work the focus is the oc theory of ordinary differential equations with fixed final time. a typical oc problem requires a performance index or cost functional j[x(t), u(t)], a set of state variables x(t) ∈ x and a set of control variables u(t) ∈ u, over a time interval t0 ≤ t ≤ tf. the main goal consists in finding a piecewise continuous control u(t) and the associated state variable x(t) that maximize the given objective functional, i.e., in lagrange form, (1.1) max_u j[x(t), u(t)] = ∫_{t0}^{tf} f(t, x(t), u(t)) dt subject to x'(t) = g(t, x(t), u(t)), x(t0) = x0. the development of this chapter closely follows the work of lenhart and workman [81].
x(tf) could be free, meaning the value of x(tf) is unrestricted, or could be fixed, i.e., x(tf) = xf. For our purposes, f and g will always be continuously differentiable functions in all three arguments. We assume that the controls are Lebesgue measurable functions; thus, as the control(s) will always be piecewise continuous, the associated states will always be piecewise differentiable. We have been focused on finding the maximum of a functional; we can switch back and forth between maximization and minimization by simply negating the cost functional. An OC problem can be presented in different, but equivalent, ways, depending on the purpose or on the software to be used. There are three well-known equivalent formulations of the OC problem: the Lagrange form (already presented in the previous section) and the Mayer and Bolza forms [25, 144].
Definition 2 (Bolza formulation). The Bolza formulation of the OC problem adds to the integral cost a terminal term: maximize φ(x(tf)) + ∫ f(t, x(t), u(t)) dt over t0 ≤ t ≤ tf, subject to the state equation and initial condition, where φ is a continuously differentiable function.
Definition 3 (Mayer formulation). The Mayer formulation of the OC problem keeps only the terminal term: maximize φ(x(tf)), subject to the same constraints.
Proof. (2) ⇒ (1): We formulate the Bolza problem as one of Lagrange, using an extended state vector. Let (x(·), u(·)) be an admissible pair for the problem (1.2) and let z(t) be the state vector extended with an auxiliary component xa(t). Then (z(·), u(·)) is an admissible pair for the corresponding Lagrange problem, and the values of the functionals in both formulations coincide. (1) ⇒ (2): Conversely, to each admissible pair (z(·), u(·)) for the problem (1.4) corresponds the pair (x(·), u(·)), where x(·) consists of the last components of z, admissible for the problem (1.2) and matching the respective values of the functionals. (2) ⇒ (3): For this statement we also need an extended state vector: the resulting (z(·), u(·)) is an admissible pair for the corresponding Mayer problem, and again the values of the functionals in both formulations are the same.
(3) ⇒ (2): Conversely, to each admissible pair (z(·), u(·)) for the Mayer problem (1.5) corresponds an admissible pair (x(·), u(·)) for the Bolza problem (1.2), where x(·) consists of the last components of z(·). For the proof of the previous theorem it was not necessary to show directly that the Lagrange problem is equivalent to the Mayer formulation. However, in the second part of the thesis, due to computational issues, some of the OC problems (usually presented in the Lagrange form) will be converted into the equivalent Mayer form. Hence, using a standard procedure, it is possible to rewrite the cost functional (cf. [82]) by augmenting the state vector with an extra component, so that the Lagrange formulation (1.1) can be rewritten in Mayer form. The necessary first-order conditions for finding the optimal control were developed by Pontryagin and his co-workers. This result is considered one of the most important results of mathematics in the 20th century. Pontryagin introduced the idea of adjoint functions to append the differential equation to the objective functional. Adjoint functions have a purpose similar to that of Lagrange multipliers in multivariate calculus, which append constraints to the function of several variables to be maximized or minimized.
Definition 4 (Hamiltonian). For the OC problem (1.1), the Hamiltonian is the function H(t, x(t), u(t), λ(t)) = f(t, x(t), u(t)) + λ(t) g(t, x(t), u(t)).
Theorem (Pontryagin's maximum principle). If u*(t) and x*(t) are optimal for problem (1.1), then there exists a piecewise differentiable adjoint variable λ(t) such that H(t, x*(t), u(t), λ(t)) ≤ H(t, x*(t), u*(t), λ(t)) for all controls u at each time t, where H is the Hamiltonian previously defined, and λ satisfies the adjoint equation λ'(t) = −∂H/∂x with λ(tf) = 0.
Proof. The proof of this theorem is quite technical and we opt to omit it. Pontryagin's original text [106] or Clarke's book [29] are good references for the proof.
Remark 1. The last condition, λ(tf) = 0, called the transversality condition, is only used when the OC problem does not have a terminal value for the state variable, i.e., when x(tf) is free.
This principle converts the problem of finding a control that maximizes the objective functional subject to the state ODE and initial condition into the problem of optimizing the Hamiltonian pointwise. As a consequence, with this adjoint equation and Hamiltonian we have ∂H/∂u = 0 at u* for each t; that is, the Hamiltonian has a critical point, a condition usually called the optimality condition. Thus, to find the necessary conditions we do not need to calculate the integral in the objective functional, but only to use the Hamiltonian. A simple example illustrates the principle.
Example 1 (from [97]). Consider the OC problem of [97]. Its solution can be obtained in steps.
Step 1 - Form the Hamiltonian for the problem.
Step 2 - Write the adjoint differential equation, the optimality condition and the transversality boundary condition (if necessary). Try to eliminate u* by using the optimality equation H_u = 0, i.e., solve for u* in terms of x* and λ. Using the Hamiltonian to find the differential equation for the adjoint, we obtain λ' = −1 − λ, and the optimality condition yields an expression for the OC in terms of λ. As the problem has only an initial condition for the state variable, the transversality condition applies: λ(2) = 0.
Step 3 - Solve the set of two differential equations for x* and λ with the boundary conditions, replacing u* in the differential equations by the expression for the optimal control from the previous step. From the adjoint equation λ' = −1 − λ and the transversality condition λ(2) = 0 we obtain λ(t) = e^{2−t} − 1. Hence the optimality condition leads to u* = −λ, i.e., u* = 1 − e^{2−t}, and the associated state follows from the state equation.
If the Hamiltonian is linear in the control variable u, it can be difficult to calculate u* from the optimality equation, since ∂H/∂u does not contain u. Specific ways of solving this kind of problem can be found in [81].
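The closed-form solution of Example 1 can be checked numerically. The Python sketch below (illustrative; the grid size n is an arbitrary choice) integrates the adjoint equation λ' = −1 − λ backward from λ(2) = 0 with an explicit Euler scheme and compares the result with the closed form λ(t) = e^{2−t} − 1, from which u*(t) = −λ(t) = 1 − e^{2−t}.

```python
import numpy as np

def adjoint_backward(n=20000, t0=0.0, tf=2.0):
    """Integrate lambda' = -1 - lambda backward from lambda(tf) = 0 (explicit Euler)."""
    h = (tf - t0) / n
    t = np.linspace(t0, tf, n + 1)
    lam = np.zeros(n + 1)
    for j in range(n, 0, -1):
        # stepping from t_j down to t_{j-1}: lam_{j-1} = lam_j - h * lam'(t_j)
        lam[j - 1] = lam[j] + h * (1.0 + lam[j])
    return t, lam

t, lam = adjoint_backward()
lam_exact = np.exp(2.0 - t) - 1.0   # closed form derived in the text
u_star = 1.0 - np.exp(2.0 - t)      # optimality condition: u* = -lambda
print(np.max(np.abs(lam - lam_exact)))
```

The maximum deviation shrinks linearly with the step size, as expected for a first-order scheme.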
Until here we have shown necessary conditions for solving basic optimal control problems. It is also important to study conditions that can guarantee the existence of a finite objective functional value at the optimal control and state variables, following [53, 75, 81, 86]. The following is an example of a sufficient-condition result.
Theorem 1.3.2. Suppose that f(t, x, u) and g(t, x, u) are both continuously differentiable functions in their three arguments and concave in x and u. Suppose u* is a control with associated state x*, and λ a piecewise differentiable function, such that u*, x* and λ together satisfy, on t0 ≤ t ≤ tf, the optimality, adjoint and transversality conditions. Then J(u*) ≥ J(u) for all controls u.
Proof. The proof of this theorem is available in [81].
This result is not strong enough to guarantee that J(u*) is finite; such results usually require some conditions on f and/or g. The next theorem is an example of an existence result from [53].
Theorem 1.3.3. Let the set of controls for problem (1.1) be Lebesgue integrable functions on t0 ≤ t ≤ tf in R. Suppose that f(t, x, u) is convex in u, and that there exist constants c1, c2, c3 > 0, c4 and β > 1 such that the required growth conditions hold. Then there exists an optimal control u* maximizing J(u), with J(u*) finite.
Proof. The proof of this theorem is available in [53].
For a minimization problem, g would have a concave property and the inequality on f would be reversed. Note that the necessary conditions developed to this point deal with piecewise continuous optimal controls, while this existence theorem guarantees an optimal control which is only Lebesgue integrable. This disconnection can be overcome by extending the necessary conditions to Lebesgue integrable functions [81, 86], but we do not develop this idea in the thesis. See [52] for further existence results in OC.
In some cases it is necessary not only to minimize (or maximize) terms over the entire time interval, but also to minimize (or maximize) a function value at one particular point in time, specifically the end of the time interval. There are situations where the objective function must take into account the value of the state at the terminal time, e.g., the number of infected individuals at the final time in an epidemic model [81].
Definition 5 (OC problem with payoff term). An OC problem with payoff term adds to the integral cost a term φ(x(tf)), a goal with respect to the final position or population level x(tf). The term φ(x(tf)) is called the payoff, or salvage, term.
Using the PMP, adapted necessary conditions can be derived for this problem.
Proposition 1 (necessary conditions). If u*(t) and x*(t) are optimal for problem (1.8), then there exists a piecewise differentiable adjoint variable λ(t) satisfying the maximality condition for all controls u at each time t, where H is the Hamiltonian previously defined, together with the adjoint equation and the transversality condition λ(tf) = φ'(x*(tf)).
Proof. The proof of this result can be found in [75].
A new example illustrates this proposition.
Example 2 (from [97]). Let x(t) represent the number of tumor cells at time t, with exponential growth factor α, and u(t) the drug concentration. The aim is to minimize the number of tumor cells at the end of the treatment period together with the accumulated harmful effects of the drug on the body. Consider the Hamiltonian of the problem; the optimality condition and the adjoint equation follow as before. Using the transversality condition λ(tf) = 1 (note that φ(s) = s, so φ'(s) = 1), we obtain the adjoint, and the optimal state trajectory then follows from ẋ = αx − u and x(0) = x0.
Many problems require bounds on the control to achieve a realistic solution. For example, the amount of drug in the organism must be non-negative, and it is necessary to impose a limit. For the last example, despite being simplistic, it makes more sense to constrain the control as 0 ≤ u ≤ 1.
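Example 2 can be explored numerically. The sketch below assumes the running cost is u(t)² (the text only speaks of "accumulated harmful effects", so this quadratic form, and all parameter values, are assumptions made for illustration); then H = u² + λ(αx − u), the optimality condition gives u* = λ/2, and the adjoint λ' = −αλ with λ(tf) = 1 yields λ(t) = e^{α(tf−t)}. The treated state is integrated with classical fourth-order Runge-Kutta and compared with the untreated growth x0·e^{αt}.

```python
import numpy as np

# Assumed illustrative parameters (not from the text).
alpha, x0, tf, n = 1.0, 1.0, 1.0, 2000
h = tf / n
t = np.linspace(0.0, tf, n + 1)

lam = np.exp(alpha * (tf - t))   # solves lambda' = -alpha*lambda, lambda(tf) = 1
u_star = lam / 2.0               # from H_u = 2u - lambda = 0 under the assumed cost

def f(ti, xi):
    # controlled dynamics: x' = alpha*x - u*(t)
    return alpha * xi - 0.5 * np.exp(alpha * (tf - ti))

x = np.empty(n + 1)
x[0] = x0
for i in range(n):               # classical RK4 on the controlled state
    k1 = f(t[i], x[i])
    k2 = f(t[i] + h / 2, x[i] + h / 2 * k1)
    k3 = f(t[i] + h / 2, x[i] + h / 2 * k2)
    k4 = f(t[i] + h, x[i] + h * k3)
    x[i + 1] = x[i] + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

print(x[-1], x0 * np.exp(alpha * tf))   # treated vs. untreated tumor size at tf
```

With these values the treated trajectory stays positive and ends well below the untreated exponential, as one would hope for the therapy.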
Definition 6 (OC with bounded control). An OC problem with bounded control is one whose control is constrained to a ≤ u(t) ≤ b, where a, b are fixed real constants and a < b. To solve problems with bounds on the control, it is necessary to develop alternative necessary conditions.
Proposition 2 (necessary conditions). If u*(t) and x*(t) are optimal for problem (1.9), then there exists a piecewise differentiable adjoint variable λ(t) such that H(t, x*(t), u(t), λ(t)) ≤ H(t, x*(t), u*(t), λ(t)) for all admissible controls u at each time t, where H is the Hamiltonian previously defined, and λ(tf) = 0 (transversality condition). By an adaptation of the PMP, the OC must satisfy the optimality condition: the maximization is over all admissible controls, and ũ is obtained from ∂H/∂u = 0. In particular, the optimal control u* maximizes H pointwise with respect to a ≤ u ≤ b.
Proof. The proof of this result can be found in [75].
Remark 2. For a minimization problem, u* is instead chosen to minimize H pointwise; this has the effect of reversing < and > in the first and third lines of the optimality condition.
Remark 3. Some software packages have no specific treatment of the bounds on the control. In those cases, and when the implementation allows, we can write compactly the optimal control in terms of the unconstrained ũ, truncated at the bounds a and b: u*(t) = min{b, max{a, ũ}}.
So far we have only examined problems with one control and one dependent state variable. Often it is necessary to consider more variables.
Definition 7 (OC with several variables and several controls). An OC problem with n state variables, m control variables and a payoff function φ maximizes over u1, ..., um subject to xi(t0) = xi0, i = 1, 2, ..., n (1.10), where the functions f, gi are continuously differentiable in all variables. From now on, to simplify the notation, let x(t) = [x1(t), ..., xn(t)], u(t) = [u1(t), ..., um(t)], x0 = [x10, ..., xn0], and g(t, x, u) = [g1(t, x, u), ..., gn(t, x, u)].
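In code, the truncation u*(t) = min{b, max{a, ũ}} of Remark 3 is just a pointwise projection of the unconstrained control onto [a, b]; a minimal sketch with illustrative values:

```python
import numpy as np

def project_control(u_tilde, a, b):
    """Pointwise truncation of the unconstrained control onto [a, b]."""
    return np.minimum(b, np.maximum(a, u_tilde))

u_tilde = np.array([-0.5, 0.3, 1.7])        # made-up unconstrained values
print(project_control(u_tilde, 0.0, 1.0))   # entries below 0 and above 1 are clipped
```

Note the order of the min and max: reversing them would pin every value to a constant.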
So, the previous problem can be rewritten in a compact way, with x(t0) = x0 (1.11). Using the same approach as in the previous subsections, it is possible to derive generalized necessary conditions.
Proposition 3 (necessary conditions). Let u* be a vector of optimal control functions and x* the vector of corresponding optimal state variables. With n states we need n adjoints, one for each state: there is a piecewise differentiable vector-valued function λ(t) = [λ1(t), ..., λn(t)], where each λi is the adjoint variable corresponding to xi, and the Hamiltonian is formed with the inner product of λ and g. The variables satisfy optimality, adjoint and transversality conditions identical in each vector component; namely, u* maximizes H(t, x*, u, λ) with respect to u at each t, and u*, x* and λ satisfy λj(tf) = φ_{xj}(x(tf)) for j = 1, ..., n (transversality conditions) and ∂H/∂uk = 0 at u*k for k = 1, ..., m (optimality conditions). By φ_{xj} is meant the partial derivative in the xj component. Note that if φ ≡ 0, then λj(tf) = 0 for all j, as usual. Similarly to the previous section, if bounds are placed on a control variable, ak ≤ uk ≤ bk (for k = 1, ..., m), then the optimality condition ∂H/∂uk = 0 is replaced by the corresponding truncated characterization.
Below, an optimal control problem related to rubella is presented. Rubella, commonly known as German measles, is an infection caused by the rubella virus that is most common in childhood. Children recover more quickly than adults, and the infection can be very serious in pregnancy. The virus is contracted through the respiratory tract and has an incubation period of 2 to 3 weeks. The primary symptom of rubella virus infection is the appearance of a rash on the face which spreads to the trunk and limbs and usually fades after three days. Other symptoms include low-grade fever, swollen glands, joint pains, headache and conjunctivitis.
We now present an optimal control problem studying the dynamics of rubella in China over three years, using a vaccination process u as the measure to control the disease (more details can be found in [17]). Let x1 represent the susceptible population, x2 the proportion of the population in the incubation period, x3 the proportion of the population infected with rubella, and x4 the component that keeps the total population constant. The optimal control problem can then be defined. It is very difficult to solve this problem analytically; for most epidemiologic problems it is necessary to employ numerical methods, some of which are described in the next chapter.
In this chapter, some numerical approaches to solve a system of ordinary differential equations, such as shooting methods and multistep methods, are introduced. Then two distinct philosophies to solve OC problems are presented: indirect methods, centered on the PMP, and direct methods, focused on a discretization of the problem solved by nonlinear optimization codes. A set of software packages used throughout the thesis is briefly described. In the last decades the computational world has developed in an amazing way, not only in hardware issues such as efficiency, memory capacity and speed, but also in terms of software robustness. Groundbreaking achievements in the field of numerical solution techniques for differential and integral equations have enabled the simulation of highly complex real-world scenarios. OC also benefited from these improvements, and numerical methods and algorithms have evolved significantly. The next section concerns the resolution of systems of differential equations. A dynamic system is mathematically characterized by a set of ordinary differential equations (ODEs). Specifically, the dynamics are described, for t0 ≤ t ≤ tf, by a system of n ODEs
ẏ_i(t) = f_i(y1(t), ..., yn(t), t), i = 1, ..., n.
Problems of solving an ODE are classified into initial value problems (IVPs) and boundary value problems (BVPs), depending on how the conditions at the endpoints of the domain are specified. All the conditions of an initial value problem are specified at the initial point; the problem becomes a boundary value problem if conditions are needed at both the initial and final points. There exist many numerical methods to solve initial value problems, such as Euler, Runge-Kutta or adaptive methods, and boundary value problems, such as shooting methods. One can visualize the shooting method as the simplest technique for solving a BVP. Suppose it is desired to determine the initial angle of a cannon so that, when a cannonball is fired, it strikes a desired target. An initial guess is made for the angle, and the cannon is fired; if the cannonball does not hit the target, the angle is adjusted based on the amount of the miss, and another shot is fired. The process is repeated until the target is hit [108]. Suppose we want to find y(t0) = y0 such that y(tf) = b. The shooting method can be summarized as follows [9]:
Step 1. Guess the initial conditions x = y(t0);
Step 2. Propagate the differential equations from t0 to tf, i.e., shoot;
Step 3. Evaluate the error in the boundary conditions, c(x) = y(tf) − b;
Step 4. Use a nonlinear program to adjust the variables x to satisfy the constraint c(x) = 0, i.e., repeat Steps 1-3.
Despite its simplicity, from a practical standpoint the shooting method is used only when the problem has a small number of variables. The method has a major disadvantage: a small change in the initial condition can produce a very large change in the final conditions. To overcome the numerical difficulties of the simple method, the multiple shooting method was introduced: the time interval [t0, tf] is divided into m − 1 subintervals.
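Steps 1-4 can be sketched on a toy boundary value problem (made up for illustration): find y(0) such that y' = y reaches y(1) = 2, whose analytic answer is y(0) = 2/e. A pure-Python sketch, with RK4 as the integrator and simple bisection playing the role of the nonlinear program:

```python
import math

def integrate(y0, n=1000):
    """RK4 for y' = y on [0, 1], returning y(1) from a guessed y(0)."""
    h, y = 1.0 / n, y0
    for _ in range(n):
        k1 = y
        k2 = y + h / 2 * k1
        k3 = y + h / 2 * k2
        k4 = y + h * k3
        y += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return y

def shoot(target=2.0, lo=0.0, hi=2.0, tol=1e-12):
    """Bisection on the boundary error c(y0) = y(1; y0) - target."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if integrate(mid) - target > 0:   # overshot the target: lower the guess
            hi = mid
        else:                              # undershot: raise the guess
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

y0_star = shoot()
print(y0_star, 2.0 / math.e)   # the analytic initial value is 2/e
```

Bisection works here because the boundary error is monotone in the guess; a general problem would use a proper nonlinear solver, as the text indicates.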
The method is then applied over each subinterval [ti, ti+1], with the initial values of the differential equations in the interior intervals being unknowns that need to be determined. In order to enforce continuity, matching conditions are imposed at the interface of each subinterval. A scheme of the multiple shooting method is shown in Figure 2. With the multiple shooting approach the problem size is increased: additional variables and constraints are introduced for each shooting segment. In particular, the number of nonlinear variables and constraints for a multiple shooting application is n = ny(m − 1), where ny is the number of dynamic variables y and m − 1 is the number of segments [9]. Both the shooting and multiple shooting methods require a good guess for the initial conditions, and the propagation of the shots for high-dimensional problems is not feasible. For this reason other methods can be implemented, based on initial value problems. The numerical solution of the IVP is fundamental to most optimal control methods. The problem can be stated as follows: compute the value of y(tf) satisfying (2.1) with the known initial value y(t0) = y0. Numerical methods for solving the ODE IVP are relatively mature in comparison with other fields in optimal control. Two classes of methods will be considered: single-step and multistep methods. In both, the solution of the differential system at each step tk is sequentially obtained using current and/or previous information about the solution, and it is assumed that the time t = nh moves ahead in uniform steps of length h [9, 51]. The most common single-step method is the Euler method. In this discretization scheme, if a differential equation is written as ẋ = f(x(t), t), it is possible to make the convenient approximation x_{n+1} = x_n + h f(x_n, t_n). This approximation x_{n+1} of x(t) at the point t_{n+1} has a local error of order h².
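The Euler update x_{n+1} = x_n + h f(x_n, t_n) and the accuracy/step-size trade-off can be observed on a problem with a known solution (ẋ = x, x(0) = 1, so x(1) = e; the test problem is chosen purely for illustration): halving h roughly halves the global error, consistent with a first-order method.

```python
import math

def euler(f, x0, t0, tf, n):
    """Explicit Euler: x_{n+1} = x_n + h * f(x_n, t_n)."""
    h, x, t = (tf - t0) / n, x0, t0
    for _ in range(n):
        x += h * f(x, t)
        t += h
    return x

f = lambda x, t: x
err_coarse = abs(euler(f, 1.0, 0.0, 1.0, 100) - math.e)
err_fine = abs(euler(f, 1.0, 0.0, 1.0, 200) - math.e)
print(err_coarse / err_fine)   # roughly 2: halving h halves the global error
```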
Clearly, there is a trade-off between accuracy and complexity of calculation, which depends heavily on the chosen value of h: in general, as h is decreased the calculation takes longer but is more accurate. For many higher-order systems it is very difficult to make the Euler approximation effective, and more accurate and elaborate techniques were developed for this reason. One of these is the Runge-Kutta family. A Runge-Kutta method is a single-step, multi-stage method: the solution at time t_{k+1} is obtained from the solution at t_k through a number of intermediate stage evaluations of f. If a differential equation is written as ẋ = f(x(t), t), the fourth-order Runge-Kutta method makes the approximation
x_{n+1} = x_n + (h/6)(k1 + 2k2 + 2k3 + k4),
where the stages k1, ..., k4 are evaluations of f at suitable intermediate points; a second-order Runge-Kutta method uses two stages instead. The approximation x_{n+1} of x(t) at the point t_{n+1} has a local error of order h³ and h⁵ for the Runge-Kutta methods of second and fourth order, respectively. Numerical methods for solving OC problems date back to the 1950s with Bellman's investigations. From that time to the present, the complexity of the methods and the corresponding complexity and variety of the applications have increased substantially [108]. There are two major classes of numerical methods for solving OC problems: indirect methods and direct methods. The former solve the problem indirectly, by converting the optimal control problem into a boundary value problem using the PMP. In a direct method, on the other hand, the optimal solution is found by transcribing an infinite-dimensional optimization problem into a finite-dimensional optimization problem. In an indirect method, the PMP is used to determine the first-order optimality conditions of the original OC problem. The indirect approach leads to a multiple-point boundary value problem that is solved to determine candidate optimal trajectories, called extremals.
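The classical fourth-order Runge-Kutta step can be coded directly; on the same kind of known-solution test (ẋ = x, chosen for illustration), halving h should reduce the global error by roughly 2⁴ = 16.

```python
import math

def rk4(f, x0, t0, tf, n):
    """Classical fourth-order Runge-Kutta."""
    h, x, t = (tf - t0) / n, x0, t0
    for _ in range(n):
        k1 = f(x, t)
        k2 = f(x + h / 2 * k1, t + h / 2)
        k3 = f(x + h / 2 * k2, t + h / 2)
        k4 = f(x + h * k3, t + h)
        x += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return x

f = lambda x, t: x
e1 = abs(rk4(f, 1.0, 0.0, 1.0, 50) - math.e)
e2 = abs(rk4(f, 1.0, 0.0, 1.0, 100) - math.e)
print(e1 / e2)   # close to 16, confirming fourth-order accuracy
```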
For an indirect method it is necessary to derive explicitly the adjoint equations, the control equations and all the transversality conditions, if they exist. Notice that there is no strict correlation between the method used to solve the problem and its formulation: one may, for example, apply a multiple shooting solution technique to either an indirect or a direct formulation. In the following subsection a numerical approach using the indirect method is presented. The method is described in a recent book by Lenhart and Workman [81] and is known as the forward-backward sweep method. The process begins with an initial guess for the control variable. Then the state equations are solved forward in time, and the adjoint equations are solved backward in time. The control is updated by inserting the new values of the states and adjoints into its characterization, and the process is repeated until convergence occurs. Considering x = (x1, ..., xN+1) and λ = (λ1, ..., λN+1) the vector approximations for the state and the adjoint over the time grid, the main idea of the algorithm is as follows:
Step 1. Make an initial guess for u over the interval (u ≡ 0 is almost always sufficient);
Step 2. Using the initial condition x1 = x(t0) = a and the values for u, solve for x forward in time according to its differential equation in the optimality system;
Step 3. Using the transversality condition λN+1 = λ(tf) = 0 and the values for u and x, solve for λ backward in time according to its differential equation in the optimality system;
Step 4. Update u by entering the new x and λ values into the characterization of the optimal control;
Step 5. Verify convergence: if the variables are sufficiently close to their values in the previous iteration, output the current values as solutions; otherwise return to Step 2.
For Steps 2 and 3, Lenhart and Workman used the fourth-order Runge-Kutta procedure to discretize the state and adjoint systems.
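Steps 1-5 can be illustrated on a small made-up problem (not the rubella model): minimize ∫₀¹ (x² + u²) dt subject to ẋ = x + u, x(0) = 1, for which the PMP gives the adjoint λ' = −2x − λ, λ(1) = 0 and the characterization u* = −λ/2. A Python sketch of the sweep, using forward/backward Euler instead of Runge-Kutta for brevity, with a damped control update for stability:

```python
import numpy as np

def fbsm(n=1000, t0=0.0, tf=1.0, x0=1.0, max_iter=200, tol=1e-8):
    h = (tf - t0) / n
    u = np.zeros(n + 1)                       # Step 1: initial guess u = 0
    x = np.empty(n + 1)
    lam = np.empty(n + 1)
    for _ in range(max_iter):
        x[0] = x0                             # Step 2: state forward in time
        for i in range(n):
            x[i + 1] = x[i] + h * (x[i] + u[i])
        lam[n] = 0.0                          # Step 3: adjoint backward in time
        for i in range(n, 0, -1):
            lam[i - 1] = lam[i] + h * (2.0 * x[i] + lam[i])
        u_new = -lam / 2.0                    # Step 4: control characterization
        if np.max(np.abs(u_new - u)) < tol:   # Step 5: convergence test
            return x, lam, u_new
        u = 0.5 * (u + u_new)                 # damped update
    return x, lam, u

x, lam, u = fbsm()
print(u[0], u[-1])
```

The computed control is negative (it counteracts the growth ẋ = x) and vanishes at the final time, as imposed by λ(1) = 0.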
On the other hand, Wang [136] applied the same philosophy but solved the differential equations with the MATLAB solver ode45. This solver is based on an explicit Runge-Kutta (4,5) formula, the Dormand-Prince pair; that is, ode45 combines fourth- and fifth-order methods, both similar to the classical fourth-order Runge-Kutta method discussed above, and varies the step size, choosing it at each step in an attempt to achieve the desired accuracy. The solver ode45 is therefore suitable for a wide variety of initial value problems in practical applications; in general, it is the best method to apply as a first attempt for most problems [70]. Let us consider the problem defined in Chapter 1 (Example 3) about the rubella disease. With x(t) = (x1(t), x2(t), x3(t), x4(t)) and λ(t) = (λ1(t), λ2(t), λ3(t), λ4(t)), the Hamiltonian of this problem can be written down, and using the PMP the optimal control problem can be studied through the state variables, with initial conditions x1(0) = 0.0555, x2(0) = 0.0003, x3(0) = 0.0004 and x4(0) = 1, the adjoint variables, and the characterization of the optimal control. Only the main part of the code, using the forward-backward sweep method with fourth-order Runge-Kutta, is presented here; the complete version can be found on the website [110].

% (preceding lines, including the forward loop header and the stage-one slopes m11..m14, are omitted)
m21 = b - b*(p*(x2(i)+h2*m12) + q*(x3(i)+h2*m13)) - b*(x1(i)+h2*m11) - ...
      beta*(x1(i)+h2*m11)*(x3(i)+h2*m13) - (0.5*(u(i)+u(i+1)))*(x1(i)+h2*m11);
m22 = b*p*(x2(i)+h2*m12) + beta*(x1(i)+h2*m11)*(x3(i)+h2*m13) - (e+b)*(x2(i)+h2*m12);
m23 = e*(x2(i)+h2*m12) - (g+b)*(x3(i)+h2*m13);
m24 = b - b*(x4(i)+h2*m14);
m31 = b - b*(p*(x2(i)+h2*m22) + q*(x3(i)+h2*m23)) - b*(x1(i)+h2*m21) - ...
      beta*(x1(i)+h2*m21)*(x3(i)+h2*m23) - (0.5*(u(i)+u(i+1)))*(x1(i)+h2*m21);
m32 = b*p*(x2(i)+h2*m22) + beta*(x1(i)+h2*m21)*(x3(i)+h2*m23) - (e+b)*(x2(i)+h2*m22);
m33 = e*(x2(i)+h2*m22) - (g+b)*(x3(i)+h2*m23);
m34 = b - b*(x4(i)+h2*m24);
m41 = b - b*(p*(x2(i)+h2*m32) + q*(x3(i)+h2*m33)) - b*(x1(i)+h2*m31) - ...
      beta*(x1(i)+h2*m31)*(x3(i)+h2*m33) - u(i+1)*(x1(i)+h2*m31);
m42 = b*p*(x2(i)+h2*m32) + beta*(x1(i)+h2*m31)*(x3(i)+h2*m33) - (e+b)*(x2(i)+h2*m32);
m43 = e*(x2(i)+h2*m32) - (g+b)*(x3(i)+h2*m33);
m44 = b - b*(x4(i)+h2*m34);
x1(i+1) = x1(i) + (h/6)*(m11 + 2*m21 + 2*m31 + m41);
x2(i+1) = x2(i) + (h/6)*(m12 + 2*m22 + 2*m32 + m42);
x3(i+1) = x3(i) + (h/6)*(m13 + 2*m23 + 2*m33 + m43);
x4(i+1) = x4(i) + (h/6)*(m14 + 2*m24 + 2*m34 + m44);
end
for i = 1:m
    j = m + 2 - i;
    n11 = lambda1(j)*(b+u(j)+beta*x3(j)) - lambda2(j)*beta*x3(j);
    n12 = lambda1(j)*b*p + lambda2(j)*(e+b-p*b) - lambda3(j)*e;
    n13 = -a + lambda1(j)*(b*q+beta*x1(j)) - lambda2(j)*beta*x1(j) + lambda3(j)*(g+b);
    n14 = b*lambda4(j);
    n21 = (lambda1(j)-h2*n11)*(b+u(j)+beta*(0.5*(x3(j)+x3(j-1)))) - ...
          (lambda2(j)-h2*n12)*beta*(0.5*(x3(j)+x3(j-1)));
    n22 = (lambda1(j)-h2*n11)*b*p + (lambda2(j)-h2*n12)*(e+b-p*b) - (lambda3(j)-h2*n13)*e;
    n23 = -a + (lambda1(j)-h2*n11)*(b*q+beta*(0.5*(x1(j)+x1(j-1)))) - ...
          (lambda2(j)-h2*n12)*beta*(0.5*(x1(j)+x1(j-1))) + (lambda3(j)-h2*n13)*(g+b);
    n24 = b*(lambda4(j)-h2*n14);
    n31 = (lambda1(j)-h2*n21)*(b+u(j)+beta*(0.5*(x3(j)+x3(j-1)))) - ...
          (lambda2(j)-h2*n22)*beta*(0.5*(x3(j)+x3(j-1)));
    n32 = (lambda1(j)-h2*n21)*b*p + (lambda2(j)-h2*n22)*(e+b-p*b) - (lambda3(j)-h2*n23)*e;
    n33 = -a + (lambda1(j)-h2*n21)*(b*q+beta*(0.5*(x1(j)+x1(j-1)))) - ...
          (lambda2(j)-h2*n22)*beta*(0.5*(x1(j)+x1(j-1))) + (lambda3(j)-h2*n23)*(g+b);
    n34 = b*(lambda4(j)-h2*n24);
    n41 = (lambda1(j)-h2*n31)*(b+u(j)+beta*x3(j-1)) - (lambda2(j)-h2*n32)*beta*x3(j-1);
    n42 = (lambda1(j)-h2*n31)*b*p + (lambda2(j)-h2*n32)*(e+b-p*b) - (lambda3(j)-h2*n33)*e;
    n43 = -a + (lambda1(j)-h2*n31)*(b*q+beta*x1(j-1)) - ...
          (lambda2(j)-h2*n32)*beta*x1(j-1) + (lambda3(j)-h2*n33)*(g+b);
    n44 = b*(lambda4(j)-h2*n34);
    lambda1(j-1) = lambda1(j) - h/6*(n11 + 2*n21 + 2*n31 + n41);
    lambda2(j-1) = lambda2(j) - h/6*(n12 + 2*n22 + 2*n32 + n42);
    lambda3(j-1) = lambda3(j) - h/6*(n13 + 2*n23 + 2*n33 + n43);
    lambda4(j-1) = lambda4(j) - h/6*(n14 + 2*n24 + 2*n34 + n44);
end
u1 = min(0.9, max(0, lambda1.*x1/2));

The optimal curves for the state variables and the optimal control are shown in Figure 2. There are several difficulties to overcome when an optimal control problem is solved by indirect methods. Firstly, it is necessary to calculate the Hamiltonian, the adjoint equations, the optimality condition and the transversality conditions; besides, the approach is not flexible, since each time a new problem is formulated a new derivation is required. In contrast, a direct method requires neither these explicit derivations nor the necessary conditions. Due to these practical difficulties with the indirect formulation, the main focus will be on direct methods, an approach that has been gaining popularity in numerical optimal control over the past three decades [9]. A new family of numerical methods for dynamic optimization has emerged, referred to as direct methods. This development has been driven by the industrial need to solve large-scale optimization problems, and it has also been supported by rapidly increasing computational power. A direct method constructs a sequence of points x1, x2, ..., x* such that the objective function is minimized, typically with f(x1) > f(x2) > ... > f(x*). The state and/or control are approximated using an appropriate function approximation (e.g., polynomial approximation or piecewise constant parametrization), and the cost functional is approximated as a cost function.
Then the coefficients of the function approximations are treated as optimization variables, and the problem is reformulated as a standard nonlinear optimization problem (NLP) of the form: minimize F(x) subject to ci(x), i ∈ E, and cj(x), j ∈ I, where E and I index the equality and inequality constraints, respectively. In fact, the NLP is easier to solve than the boundary value problem, mainly due to the sparsity of the NLP and the many well-known software programs that can exploit this feature. As a result, the range of problems that can be solved via direct methods is significantly larger than the range of problems that can be solved via indirect methods. Direct methods have become so popular that many sophisticated software programs employing them have been written. Here we present two types of codes/packages: specific solvers for OC problems, and standard NLP solvers used after a discretization process. OC-ODE [57], Optimal Control of Ordinary-Differential Equations, by Matthias Gerdts, is a collection of Fortran 77 routines for optimal control problems subject to ordinary differential equations. It uses an automatic direct discretization method for the transformation of the OC problem into a finite-dimensional NLP, and includes procedures for numerical adjoint estimation and sensitivity analysis. For the same problem (Example 3), the main part of the OC-ODE code can be found on the website [110]; the achieved solution is similar to that of the indirect approach, and therefore is not presented here. DOTcvp [67], Dynamic Optimization Toolbox with Control Vector Parametrization, is a dynamic optimization toolbox for MATLAB. The toolbox provides an environment where a Fortran compiler creates the '.dll' files of the ODE, Jacobian, and sensitivities; a Fortran compiler therefore has to be installed in the MATLAB environment.
The toolbox uses the control vector parametrization approach for the calculation of the optimal control profiles, giving a piecewise solution for the control. The OC problem has to be defined in Mayer form. For solving the NLP, the user can choose several deterministic solvers (IPOPT, FMINCON, FSQP) or stochastic solvers (DE, SRES). The modified SUNDIALS tool [66] is used for solving the IVP and for the automatic generation of gradients and Jacobians. Forward integration of the ODE system is ensured by CVODES, a part of SUNDIALS, which can also perform simultaneous or staggered sensitivity analysis. The IVP can be solved with the Newton or functional iteration module and with the Adams or BDF linear multistep method; the sensitivity equations are provided analytically, and an error-control strategy for the sensitivity variables can be enabled. DOTcvp has a user-friendly graphical user interface (GUI). For the same problem (Example 3), part of the DOTcvp code is used; the complete version can be found on the website [110]. The solution, despite being piecewise continuous, follows the curve obtained by the previous programs. Thanks to its GUI, this method was the preferred one to test, due to the simple way the code is implemented. The NEOS platform [98] offers a large set of software packages and is considered the state of the art in optimization. One recent solver is MUSCOD-II [79] (Multiple Shooting Code for Optimal Control), for the solution of mixed-integer nonlinear ODE- or DAE-constrained optimal control problems in an extended AMPL format. AMPL [55] is a modelling language for mathematical programming created by Fourer, Gay and Kernighan.
modelling languages organize and automate the tasks of modelling: they can handle a large volume of data, can be used across machines and solvers independently, and allow the user to concentrate on the model instead of on the methodology used to reach the solution. however, the ampl modelling language itself does not allow the formulation of differential equations. hence, the taco toolkit has been designed to implement a small set of extensions for easy and convenient modelling of optimal control problems in ampl, without the need for explicit encoding of discretization schemes. both the taco toolkit and the neos interface to muscod-ii are still under development; probably for this reason, example 3 could not be solved by this software, which crashed after some runs. nevertheless, we include the code for the same example, to show the differences in the modelling language used in each program.

include optimalcontrol.mod;
var t;
var x1 >= 0, <= 1;
var x2 >= 0, <= 1;
var x3 >= 0, <= 1;
var x4 >= 0, <= 1;
var u >= 0, <= 0.9 suffix type "u0";

the three nonlinear optimization software packages presented next were used through the neos platform with codes formulated in ampl. ipopt [135] , interior point optimizer, is a software package for large-scale nonlinear optimization, written in fortran and c. ipopt implements a primal-dual interior point method and uses a filter-based line search strategy. ipopt can be used from various modelling environments and is designed to exploit first and second derivative information if provided, usually via automatic differentiation routines in modelling environments such as ampl. if no hessians are provided, ipopt approximates them using a quasi-newton method, specifically a bfgs update. continuing with example 3, the ampl code is shown for ipopt; the euler discretization was selected.
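as an illustration of the direct approach just described, the following python sketch transcribes a toy problem via euler discretization and hands the resulting nlp to a standard solver. it is not example 3, whose data is not reproduced here: the problem, variable names and choice of the slsqp solver are all illustrative assumptions.

```python
# Minimal direct-transcription sketch (NOT example 3 from the text):
#   min ∫_0^1 u(t)^2 dt   s.t.  x' = u,  x(0) = 0,  x(1) = 1.
# The analytic optimum is u ≡ 1 with cost 1.
import numpy as np
from scipy.optimize import minimize

N = 50           # number of Euler intervals
h = 1.0 / N      # step size

def cost(z):
    # z packs the discretized control u_0 .. u_{N-1}
    return h * np.sum(z ** 2)

def defect(z):
    # forward-Euler integration of x' = u; enforce x(1) = 1
    x = 0.0
    for u in z:
        x = x + h * u
    return x - 1.0

res = minimize(cost, np.zeros(N), method="SLSQP",
               constraints=[{"type": "eq", "fun": defect}])
u_opt = res.x    # should be approximately constant and equal to 1
```

the discretized control and state become ordinary nlp variables, which is exactly the transformation that oc-ode and the ampl/taco codes above perform automatically.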
indeed, the ampl code just shown can also be implemented in the other nonlinear software packages available in the neos platform, which is why the code for the next two software packages is not shown; the full version can be found on the website [110] . knitro, short for "nonlinear interior point trust region optimization", was created primarily by richard waltz, jorge nocedal, todd plantenga and richard byrd. it was introduced in 2001 as a derivative of academic research at northwestern, and has undergone continual improvement since then. knitro is a software package for solving large-scale mathematical optimization problems, based mainly on two interior point (ip) methods and one active set algorithm. knitro is specialized for nonlinear optimization, but also solves linear programming problems, quadratic programming problems, and systems of nonlinear equations. the unknowns in these problems must be continuous variables in continuous functions; however, the functions can be convex or nonconvex. the code also provides a multistart option for promoting the computation of the global minimum. this software was tested through the neos platform. snopt [58] , by philip gill, walter murray and michael saunders, is a software package for solving large-scale optimization problems (linear and nonlinear programs). it is especially effective for nonlinear problems whose functions and gradients are expensive to evaluate. the functions should be smooth but need not be convex. snopt is implemented in fortran 77 and distributed as source code. it follows the sqp (sequential quadratic programming) philosophy, with an augmented lagrangian approach combined with a trust region strategy adapted to handle the bound constraints. snopt is also available in the neos platform. choosing a method for solving an oc problem depends largely on the type of problem to be solved and the amount of time that can be invested in coding.
an indirect shooting method has the advantage of being simple to understand and produces highly accurate solutions when it converges [108] . the accuracy and robustness of a direct method is highly dependent upon the particular method used. nevertheless, it is easier to formulate highly complex problems in a direct way, and standard nlp solvers can be used, as an extra advantage. this last feature has the benefit of converging from poor initial guesses and being extremely computationally efficient, since most of the solvers exploit the sparsity of the derivatives in the constraints and objective function. in the next chapter the basic concepts from epidemiology are provided, in order to formulate and implement more complex oc problems in the health area. in this chapter, the simplest epidemiologic models, composed of mutually exclusive compartments, are introduced. based on the sis (susceptible-infected-susceptible) and sir (susceptible-infected-recovered) models, other models are presented, introducing new issues related to maternal immunity or the latent period and fitting the features of distinct diseases. illustrative examples are presented, with diseases that can be described by each model. the basic reproduction number is calculated and presented as a threshold value for the eradication or the persistence of the disease in a population. in the 14th century occurred one of the most famous epidemic events, the black death, which killed approximately one third of the european population. from 1918-19, twenty to forty percent of the world's population suffered from the spanish flu, the most severe pandemic in history. in 1978, the united nations promoted an ambitious agreement between countries forecasting that by the year 2000 infectious diseases would be eradicated. this conjecture failed, mainly due to the assumption that microorganisms were biologically stationary, whereas in fact they mutate and become resistant to medicines.
besides, improvements in transportation, allowing faster movement of individuals, and population growth, especially in developing countries, led to the appearance of new diseases and the resurgence of old ones in distinct places. nowadays, aids is the most scrutinized: in 2007 there were an estimated 33.2 million sufferers worldwide and 2.1 million deaths, with over three quarters of these occurring in sub-saharan africa [93] . epidemiology, the study of the patterns of diseases and infections in populations, including those which are non-communicable, has become more relevant and indispensable in the development of new models and explanations for outbreaks, namely for their propagation and causes. in epidemiology, an infection is said to be endemic in a population when it is maintained in the population without the need for external inputs. an epidemic occurs when new cases of a certain disease appear in a given human population during a given period, and then essentially disappear. several types of diseases can be distinguished, depending on their transmission mechanism, among which stand out:
• bacteria, which do not confer immunity against reinfection and frequently produce toxins harmful to the host; in case of infection, antibiotics are usually effective (examples: tuberculosis, meningitis, gonorrhea, syphilis, tetanus);
• viral agents, which confer immunity against reinfection; here antibiotics have no effect, and it is usually hoped that the immune system of the host responds to the infection, or it will be necessary to take antiviral drugs that retard the multiplication of the virus (examples: influenza, chicken pox, measles, rubella, mumps, hiv/aids, smallpox);
• vectors, usually mosquitoes or ticks, which are infected by humans and then transmit the disease to other humans (examples: malaria, yellow fever, dengue, chikungunya).
the transmission can happen in a direct or indirect way.
the direct transmission of a disease can happen by physical proximity (such as sneezing, coughing, kissing, sexual contact) or even by a specific parasite that penetrates the host through ingestion or the skin. indirect transmission involves vectors that are intermediaries or carriers of the infection. in most cases, the direct and indirect transmission of the disease happens between members that coexist in the host population; this is called horizontal transmission. when direct transmission occurs from an ascendant to a descendant not yet born (egg or embryo), it is said that vertical transmission happens [76] . when formulating a model for a particular disease, we should make a trade-off between simple models, which omit several details and generally are used for specific situations over a short time, but have the disadvantage of possibly being naive and unrealistic, and more complex models, which are more detailed and realistic, but generally more difficult to solve, or which may contain parameters whose estimates cannot be obtained. choosing the most appropriate model depends on the precision or generality required, the available data, and the time frame in which the results are needed. by definition, all models are "wrong", in the sense that even the most complex will make some simplifying assumptions. it is, therefore, difficult to definitively say which model is right, though naturally we are interested in developing models that capture the essential features of a system. the art of epidemiological modelling is to make suitable choices in the model formulation, making it as simple as possible and yet suitable for the question being considered [65] . mathematical models are a simplified representation of how an infection spreads across a population over time, and generally come in two forms: stochastic and deterministic models. the former employ randomness, with variables described by probability distributions.
deterministic models split the population into subclasses, and an ode with respect to time is formulated for each. the state variables are determined using parameters and initial conditions. the main focus in this chapter will be on deterministic models, neglecting the others. most epidemic models are based on dividing the population into a small number of compartments, each containing individuals that are identical in terms of their status with respect to the disease in question. here are some of the main compartments that a model can contain:
• passively immune (m): composed of newborns that are temporarily passively immune due to antibodies transferred by their mothers;
• susceptible (s): the class of individuals who are susceptible to infection; this can include the passively immune once they lose their immunity or, more commonly, any newborn infant whose mother has never been infected and therefore has not passed on any immunity;
• exposed or latent (e): the compartment of individuals who, despite being infected, do not exhibit obvious signs of infection, and in whom the abundance of the pathogen may be too low to allow further transmission;
• infected (i): in this class, the level of parasite within the host is sufficiently large that there is potential for transmitting the infection to other susceptible individuals;
• recovered or resistant (r): includes all individuals who have been infected and have recovered.
the choice of which compartments to include in a model depends on the characteristics of the particular disease being modelled and the purpose of the model. the exposed compartment is sometimes neglected when the latent period is considered very short. besides, the compartment of recovered individuals cannot always be considered, since there are diseases where the host never becomes resistant.
acronyms for epidemiology models are often based on the flow patterns between the compartments, such as mseir, mseirs, seir, seirs, sir, sirs, sei, seis, si, sis. there are three commonly used threshold values in epidemiology: r 0 , σ and r. the most common, and probably the most important, is the basic reproduction number [61, 64, 65] . definition 8 (basic reproduction number). the basic reproduction number, denoted by r 0 , is defined as the average number of secondary infections that occur when one infective is introduced into a completely susceptible population. this threshold, r 0 , is a famous result due to kermack and mckendrick [77] and is referred to as the "threshold phenomenon", giving a borderline between the persistence and the extinction of a disease. r 0 is also called the basic reproduction ratio or basic reproductive rate. definition 9 (contact number). the contact number, σ, is the average number of adequate contacts of a typical infective during the infectious period. an adequate contact is one that is sufficient for transmission, if the individual contacted by the susceptible is an infective. it is implicitly assumed that the infected outsider is in the host population for the entire infectious period and mixes with the host population in exactly the same way that a population native would mix. definition 10 (replacement number). the replacement number, r, is the average number of secondary infections produced by a typical infective during the entire period of infectiousness. note that the replacement number r changes as a function of time t as the disease evolves after the initial invasion. these three quantities r 0 , σ and r are all equal at the beginning of the spread of an infectious disease, when the entire population (except the infective invader) is susceptible. r 0 is only defined at the time of invasion, whereas σ and r are defined at all times.
the replacement number r is the actual number of secondary cases from a typical infective, so that after the infection has invaded a population and everyone is no longer susceptible, r is always less than the basic reproduction number r 0 . also, after the invasion the susceptible fraction is less than one, and as such not all adequate contacts result in a new case; thus the replacement number r is always less than the contact number σ after the invasion [64] . combining these results leads to r 0 ≥ σ ≥ r, with equality at the time of invasion. note that r 0 = σ for most models, and σ > r after the invasion for all models. for the models throughout this study the basic reproduction number, r 0 , will be applied. when r 0 < 1 the disease cannot invade the population and the infection will die out over a period of time; the amount of time this takes generally depends on how small r 0 is. when r 0 > 1, invasion is possible and the infection can spread through the population. generally, the larger the value of r 0 , the more severe, and possibly widespread, the epidemic will be [42] . table 3.1 shows some example diseases with their estimated basic reproduction numbers. due to differences in demographic rates, rural-urban gradients, and contact structure, different human populations may be associated with different values of r 0 for the same disease [7] . in the next section some of the epidemiologic models are presented. numerous infectious diseases confer no long-lasting immunity. the sis models are suitable for some bacterial agent diseases like meningitis and sexually transmitted diseases such as gonorrhea, and for protozoan agent diseases, of which malaria and sleeping sickness are good examples. for these diseases, an individual can be infected multiple times throughout their life, with no apparent immunity. here, recovery from infection is followed by an instant return to the susceptible compartment. throughout this chapter we will consider the population constant, neglecting tourism and immigration factors.
also, it is considered that the population is homogeneously mixed, which means that every individual interacts with every other at the same level, and therefore all individuals have the same risk of contracting the disease. the number of individuals in each compartment must be an integer, but if the population size n is sufficiently large, it is possible to treat s and i as continuous variables. working with the proportions of these compartments, varying from 0 to 1, and considering the total population constant over time, we have 1 = s + i. the compartment changes are expressed by a system of differential equations. the sis model can be mathematically represented as follows. definition 11 (sis model). the sis model can be formulated as

ds/dt = −βsi + γi,
di/dt = βsi − γi,

subject to initial conditions s(0) > 0 and i(0) ≥ 0. here β is the transmission rate (per capita) and γ the recovery rate, so the mean infectious period is 1/γ. the vital dynamics (births and deaths) were not considered, but a similar model can be constructed with these effects [65] . despite this lack of susceptible births, the disease can still persist, because the recovery of infected individuals replenishes the susceptible class and guarantees the long-term persistence of the disease. remark 5. the si model is a particular case of the sis model in which the recovery rate (γ) is null. a newly infected individual is expected to infect others at a transmission rate β during the infectious period 1/γ, so the expected basic reproduction number is r 0 = β/γ. trachoma is an infectious disease causing a characteristic roughening of the inner surface of the eyelids. also called granular conjunctivitis or egyptian ophthalmia, it is the leading cause of infectious blindness in the world. adapting a model from [109] , and using the transmission rate β = 0.047 and the recovery rate γ = 0.017, we have a basic reproduction number r 0 of approximately 2.76 and the representation of the state variables shown in figure 3.2.
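the sis dynamics above can be simulated numerically. the following python sketch uses the trachoma rates quoted in the text and checks convergence to the endemic equilibrium i* = 1 − γ/β; the solver choice and time horizon are illustrative assumptions.

```python
# SIS model in proportions (s + i = 1), trachoma rates from the text:
# beta = 0.047, gamma = 0.017, so R0 = beta/gamma ≈ 2.76.
import numpy as np
from scipy.integrate import solve_ivp

beta, gamma = 0.047, 0.017
R0 = beta / gamma

def sis(t, y):
    i = y[0]
    s = 1.0 - i                     # constant population
    return [beta * s * i - gamma * i]

# start from a tiny seed of infection and integrate long enough
sol = solve_ivp(sis, (0.0, 2000.0), [1e-4], rtol=1e-8)
i_end = sol.y[0, -1]
i_star = 1.0 - gamma / beta         # endemic equilibrium i* = 1 - 1/R0
```

since r 0 > 1, the infected fraction settles at the endemic level i* ≈ 0.64 instead of dying out.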
the sir model was initially studied in depth by kermack and mckendrick and categorizes hosts within a population as susceptible, infected and recovered [77] . it captures the dynamics of acute infections that confer lifelong immunity once recovered. diseases where individuals acquire permanent immunity, and for which this model may be applied, include measles, smallpox, chickenpox, mumps, typhoid fever and diphtheria. once again we consider the total population size constant, i.e., 1 = s + i + r. two cases will be studied, distinguished by the inclusion or exclusion of demographic factors. having compartmentalized the population, we now need a set of equations that specify how the sizes of the compartments change over time. definition 12 (sir model without demography). the sir model, excluding births and deaths, can be defined as

ds/dt = −βsi,
di/dt = βsi − γi,
dr/dt = γi.

as before, the per capita transmission rate is β and the recovery rate is γ. since the population is constant and r does not appear in the first two differential equations, the last equation is most often omitted; indeed, r(t) = 1 − s(t) − i(t). from equations (3.2) it is possible to notice that ds/dt ≤ 0, so the susceptible fraction never increases. a newly introduced infected individual can be expected to infect other people at the rate β during the expected infectious period 1/γ; thus, this first infective individual can be expected to infect r 0 = β/γ others. consider an epidemic of influenza in a british boarding school in early 1978 [76] . three boys were reported to the school infirmary with the typical symptoms of influenza. over the next few days, a very large fraction of the 763 boys in the school had contact with the infection. within two weeks, the infection had become extinguished. the best-fit parameters yield an estimated infectious period of 1/γ = 2.2 days and a mean transmission rate β = 1.66 per day; therefore, the estimated r 0 is 3.652. figure 3.4 represents the dynamics of the three state variables.
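the boarding-school example can be reproduced numerically. the following python sketch integrates system (3.2) with the quoted best-fit parameters; the solver and time grid are illustrative choices.

```python
# SIR model without demography, 1978 boarding-school influenza data
# quoted in the text: beta = 1.66 per day, 1/gamma = 2.2 days,
# N = 763 boys, 3 initially infected, so R0 = beta/gamma ≈ 3.652.
import numpy as np
from scipy.integrate import solve_ivp

beta, gamma, N = 1.66, 1.0 / 2.2, 763
R0 = beta / gamma

def sir(t, y):
    s, i = y                        # r = 1 - s - i is implicit
    return [-beta * s * i, beta * s * i - gamma * i]

y0 = [760 / N, 3 / N]               # proportions at day 0
sol = solve_ivp(sir, (0, 30), y0, max_step=0.1)
i_peak = sol.y[1].max()             # peak prevalence
i_final = sol.y[1, -1]              # prevalence at day 30
```

the epidemic peaks within about a week at roughly a third of the school and is essentially extinguished well before day 30, matching the account in the text.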
we presented the sir model under the assumption that the time scale of disease spread is sufficiently fast that births and deaths can be neglected. in the next subsection we explore the long-term persistence and endemic dynamics of an infectious disease. the simplest and most common way of introducing demography into the sir model is to assume that there is a natural host lifespan of 1/µ years. then the rate at which individuals, in any epidemiological compartment, suffer natural mortality is given by µ. it is important to emphasize that this factor is independent of the disease and is not intended to reflect the pathogenicity of the infectious agent. historically, it has been assumed that µ also represents the population's crude birth rate, thus ensuring that the total population size does not change through time, or in other words, ds/dt + di/dt + dr/dt = 0. putting all these assumptions together, we have a new definition. definition 13 (sir model with demography). the sir model, including births and deaths, can be defined as

ds/dt = µ − βsi − µs,
di/dt = βsi − (γ + µ)i,
dr/dt = γi − µr.

the epidemiological scheme is in figure 3.5. it is important to introduce the r 0 expression for this model. the parameter β represents the transmission rate per infective, and the negative terms in the equation tell us that each individual spends an average of 1/(γ + µ) time units in the infectious class. therefore, if we assume the entire population is susceptible, the average number of new infections per infectious individual is r 0 = β/(γ + µ). the inclusion of demographic dynamics may allow a disease to die out or persist in a population in the long term. for this reason it is important to explore what happens when the system is at equilibrium. definition 14 (equilibrium points).
a sir model has an equilibrium point if a triple e* = (s*, i*, r*) satisfies the system

ds/dt = 0, di/dt = 0, dr/dt = 0.

if the equilibrium point has the infectious component equal to zero (i* = 0), the pathogen has suffered extinction and e* is called a disease free equilibrium (dfe). if i* > 0 the disease persists in the population and e* is called an endemic equilibrium (ee). with some calculations and algebraic manipulations, it is possible to obtain two equilibria for the system (3.3):

dfe: (s*, i*, r*) = (1, 0, 0);
ee: (s*, i*, r*) = (1/r 0 , µ(r 0 − 1)/β, 1 − 1/r 0 − µ(r 0 − 1)/β).

when r 0 < 1, each infected individual produces, on average, less than one new infected individual, and therefore it is predictable that the infection will be cleared from the population. if r 0 > 1, the pathogen is able to invade the susceptible population [61, 64] . it is possible to prove that for the endemic equilibrium to be stable, r 0 must be greater than one; otherwise the disease free equilibrium is stable. more detailed information about local and global stability of the equilibrium points can be found in [22, 74, 83, 91] . this threshold behaviour is very useful, since we can determine which control measures, and at what magnitude, would be most effective in reducing r 0 below one, providing important guidance for public health initiatives. the sir model below, figure 3.6, is plotted using similar parameters and initial conditions, except for the transmission rate β (adapted from [76] ). it is shown that in case (a), using β = 520, the basic reproduction number is greater than one; with demographic effects, one can have damped oscillations with decreasing amplitude towards the ee. in case (b), with β = 10, we obtain r 0 < 1 and the system tends to a dfe. the next three sections present a brief description of other possible refinements of the basic models sis and sir. in the seir case, the transmission process starts with an initial inoculation with a very small number of pathogens.
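the two equilibria of system (3.3) given above can be checked numerically. in the sketch below, β is taken from case (a) of the text, while γ and µ are illustrative values not given in the text (an infectious period of one week and a lifespan of 70 years, with all rates per year).

```python
# SIR with demography: verify that the analytic DFE and EE are fixed
# points of the vector field. gamma and mu are assumed values.
import numpy as np

beta = 520.0                # case (a) in the text, per year
gamma = 365.0 / 7.0         # recovery rate, per year (assumed)
mu = 1.0 / 70.0             # birth/death rate, per year (assumed)
R0 = beta / (gamma + mu)

def field(s, i, r):
    ds = mu - beta * s * i - mu * s
    di = beta * s * i - (gamma + mu) * i
    dr = gamma * i - mu * r
    return np.array([ds, di, dr])

# disease-free equilibrium (1, 0, 0)
dfe_residual = field(1.0, 0.0, 0.0)

# endemic equilibrium: s* = 1/R0, i* = mu (R0 - 1) / beta
s_star = 1.0 / R0
i_star = mu * (R0 - 1.0) / beta
r_star = 1.0 - s_star - i_star
ee_residual = field(s_star, i_star, r_star)
```

both residuals vanish, confirming the algebraic expressions; with these rates r 0 ≈ 10, so the endemic equilibrium is the stable one.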
then, during a period of time, the pathogen reproduces rapidly within the host, relatively unchallenged by the immune system. during this stage, pathogen abundance is too low for active transmission to other susceptible hosts, and yet the pathogen is present. the time spent in this stage is very difficult to quantify, since there are no symptomatic features of the disease; it is called the latent or exposed period. assuming the average duration of the latent period is 1/ν, the seir model can be described as follows. definition 15 (seir model). the seir model is formulated as

ds/dt = µ − βsi − µs,
de/dt = βsi − (ν + µ)e,
di/dt = νe − (γ + µ)i,
dr/dt = γi − µr.

the parameters β and γ were defined in the previous section. the epidemiological scheme for the seir model is presented in figure 3. the threshold is now r 0 = β/(γ + µ) · ν/(ν + µ): the product of the contact rate β per unit time, the average infectious period adjusted for population growth, 1/(γ + µ), and the fraction ν/(ν + µ) of exposed people surviving the latent class e. finding the steady states of the system, we obtain a disease free equilibrium and an endemic equilibrium, in analogy with the sir case. although the sir and seir models behave similarly at equilibrium, when the parameters are suitably adapted, the seir model has a slower growth rate after pathogen invasion. this is due to the fact that individuals need to stay some time in the exposed class before contributing to the disease transmission chain [76] . an infected or vaccinated mother transfers some antibodies across the placenta to her fetus, so that the newborn infant has temporary passive immunity to an infection. since the infant cannot produce new antibodies, when these passive antibodies are gone, at a rate δ, the baby passes from the immune state m to the susceptible state s. some childhood diseases have this feature. the birth rate µs into the susceptible class of size s corresponds to newborns whose mothers are susceptible, and the other newborns, µ(1 − s), enter the passively immune class of size m, since their mothers were infected or had some type of immunity [65] .
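the slower post-invasion growth of the seir model noted above can be seen by integrating it side by side with a sir model sharing the same β and γ (with µ = 0 for simplicity); the latent rate ν below is an illustrative assumption.

```python
# SEIR vs SIR with the same beta and gamma: the exposed class delays
# the epidemic, so early prevalence grows more slowly in SEIR.
# Latent period 1/nu = 2 days is an illustrative choice.
import numpy as np
from scipy.integrate import solve_ivp

beta, gamma, nu = 1.66, 1.0 / 2.2, 0.5

def sir(t, y):
    s, i = y
    return [-beta * s * i, beta * s * i - gamma * i]

def seir(t, y):
    s, e, i = y
    return [-beta * s * i,
            beta * s * i - nu * e,
            nu * e - gamma * i]

i0 = 1e-3
sol_sir = solve_ivp(sir, (0, 20), [1 - i0, i0], dense_output=True)
sol_seir = solve_ivp(seir, (0, 20), [1 - i0, 0.0, i0], dense_output=True)
i_sir_5 = sol_sir.sol(5.0)[1]       # prevalence at day 5, SIR
i_seir_5 = sol_seir.sol(5.0)[2]     # prevalence at day 5, SEIR
```

by day 5 the sir epidemic is near its peak, while the seir prevalence is still more than an order of magnitude lower; the seir outbreak nevertheless takes off later in the window.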
the transfer diagram for the mseir model is shown in the corresponding figure. the mseir model is also composed of a system of differential equations. definition 16 (mseir model). the mseir model can be described as

dm/dt = µ(1 − s) − (δ + µ)m,
ds/dt = µs + δm − βsi − µs,
de/dt = βsi − (ν + µ)e,
di/dt = νe − (γ + µ)i,
dr/dt = γi − µr.

thus, the basic reproduction number is equal to that of the previous seir model, since the m compartment does not affect the transmission chain of the disease: r 0 = β/(γ + µ) · ν/(ν + µ). the equations (3.5) always have a dfe given by (m*, s*, e*, i*, r*) = (0, 1, 0, 0, 0). there are several other models with different epidemiological states, such as mseirs, seirs, sei, si, sirs, depending on the specific features and the level of detail one wishes to introduce in the model. these models are similar to the previous ones presented. other epidemiological models can be studied using more compartments, such as quarantine-isolation (q), treatment (t), carrier (c) or vaccination (v). besides, most populations can be subdivided into different groups (sex, age, health weaknesses, ...), depending upon characteristics that may influence the risk of catching and/or transmitting an infection. more details and examples can be found in [4, 14, 65, 134] . other diseases can be caught and transmitted by numerous hosts, or may even require a second, different population to complete the transmission cycle, as with vector-borne diseases, using coupled models; this last case will be explored in the second part of the thesis. in cases that include multiple compartments of infected individuals, in which vital and epidemiological parameters depend on factors such as stage of the disease, spatial position, age, behaviour or group structure, the next generation method is the most general approach to calculate r 0 . the definition of r 0 has more than one possible interpretation, depending on the field (ecology, demography or epidemiology), and there exist distinct methods and estimations to calculate this threshold. the next generation method, introduced by diekmann et al. [39] , defines r 0 as the spectral radius of the next generation operator.
the formation of the operator involves distinguishing two types of compartments in the model, infected and non-infected. recent examples of this method are given in [38, 42, 61, 142] . let us assume that there are n compartments, of which m are infected. we define the vector x = (x 1 , . . . , x n ) t , where x i ≥ 0 denotes the number or proportion of individuals in the ith compartment. for clarity we sort the compartments so that the first m compartments correspond to infected individuals. the distinction between infected and uninfected compartments must be determined from the epidemiological interpretation, and not from the mathematical expressions. it is necessary to define the set of disease-free states, x s = {x ≥ 0 : x i = 0, i = 1, . . . , m}. the disease transmission model consists of nonnegative initial conditions together with the following system of equations:

ẋ i = f i (x) − v i (x), i = 1, . . . , n,

where f i (x) is the rate of appearance of new infections in compartment i and v i (x) collects the remaining transfer terms. note that f i should include only infections that are newly arising, and not terms describing the transfer of infectious individuals from one infected compartment to another. the functions f i and v i are assumed to satisfy conditions (a1)-(a4) of [42] , which express the nonnegativity of the model and the invariance of the disease-free set, together with: (a5) if f(x) is set to zero, then all eigenvalues of df (x 0 ) have negative real parts, where df (x 0 ) is the derivative ∂f i /∂x j evaluated at the dfe x 0 . assuming that f i and v i meet the assumptions above, we can form the next generation matrix f v −1 from the matrices of partial derivatives of f i and v i ; specifically,

f = [∂f i /∂x j (x 0 )] and v = [∂v i /∂x j (x 0 )],

where i, j = 1, . . . , m and x 0 is the dfe. the entries of f v −1 give the rate at which infected individuals in x j produce new infections in x i , times the average length of time an individual spends in a single visit to compartment j. definition 17 (basic reproduction number using the next generation operator). the basic reproduction number is given by r 0 = ρ(f v −1 ), where ρ denotes the spectral radius (dominant eigenvalue) of the matrix f v −1 . the following theorem states that r 0 is a threshold parameter for the stability of the dfe: if x 0 is a dfe of the model, then x 0 is locally asymptotically stable if r 0 < 1, and unstable if r 0 > 1. proof. the proof of this theorem can be found in [42] .
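as a concrete check of this procedure, the sketch below applies the next generation method to the seir model of definition 15, whose r 0 is known in closed form; the parameter values are illustrative.

```python
# Next-generation computation of R0 for the SEIR model: infected
# compartments are E and I (m = 2). F holds new infections, V the
# remaining transfers, both linearized at the DFE (s = 1).
import numpy as np

beta, gamma, nu, mu = 0.8, 0.2, 0.3, 0.01   # illustrative rates

F = np.array([[0.0, beta],
              [0.0, 0.0]])
V = np.array([[nu + mu, 0.0],
              [-nu,     gamma + mu]])

K = F @ np.linalg.inv(V)                    # next generation matrix
R0 = max(abs(np.linalg.eigvals(K)))         # spectral radius

# closed form quoted in the text for the SEIR model
R0_closed = beta / (gamma + mu) * nu / (nu + mu)
```

the spectral radius of f v −1 reproduces β/(γ + µ) · ν/(ν + µ) exactly, as the theory predicts.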
consider a simple seit model for tuberculosis with a treated compartment (adapted from [11] ). tuberculosis is a common, and in many cases lethal, infectious disease caused by various strains of mycobacteria. it typically attacks the lungs, but can also affect other parts of the body. most infections are asymptomatic and latent, but about one in ten latent infections eventually progresses to active disease which, if left untreated, kills more than 50% of those infected. in this model a constant population is considered, with n = s + e + i + t. exposed individuals progress to the infectious compartment at a rate ν. the treatment rates are r 1 for exposed individuals and r 2 for infectious individuals; however, only a fraction q of the treatments of infectious individuals is successful, and the unsuccessfully treated fraction (1 − q) re-enters the exposed compartment. the dynamics are illustrated in figure 3.9 and, in proportions, the differential equations are the following:

ds/dt = µ − βsi − µs,
de/dt = βsi + (1 − q)r 2 i − (µ + ν + r 1 )e,
di/dt = νe − (µ + r 2 )i,
dt/dt = r 1 e + qr 2 i − µt.

the infected compartments are e and i, which gives m = 2, and we construct the matrices f and v only for these compartments. at the dfe (s = 1, e = i = t = 0),

f = [0 β; 0 0],   v = [µ + ν + r 1   −(1 − q)r 2 ; −ν   µ + r 2 ],

so that r 0 = ρ(f v −1 ) = βν / ((µ + ν + r 1 )(µ + r 2 ) − ν(1 − q)r 2 ). in this way, epidemiological models can be understood as a framework to explain the mechanisms of disease progression and to test ideas for implementing control measures. r 0 is a key concept, used as a threshold parameter to predict whether an infection will spread and how effective these controls can be. in the second part of the thesis, epidemiological models will be used to study dengue disease; applying oc theory, the repercussions of several controls on the development of the disease will be analysed. during the last decades, the global prevalence of dengue progressed dramatically. in this chapter, dengue details are given, such as disease symptoms, transmission and epidemiological trends. an old oc model for dengue is revisited, as a first approach to the disease.
due to improvements in software robustness and higher computational capacity, a better solution for this problem is proposed. in order to study different discretization schemes for an oc problem, taking into account time performance and resources used, some numerical simulations are made using euler and runge-kutta methods. the origins of the word dengue are not clear. some researchers think that it is derived from the swahili phrase "ka-dinga pepo", meaning "cramp-like seizure caused by an evil spirit". the first recognized dengue epidemics occurred almost simultaneously in asia, africa, and north america in the 1780s, shortly after the identification and naming of the disease in 1779 [33] . dengue transcends international borders and can be found in tropical and subtropical regions around the world, predominantly in urban and semi-urban areas. dengue is now endemic in more than one hundred countries of africa, america, asia and the western pacific. in figure 4.1 it is possible to see the areas that, in 2008, were under greater surveillance. nevertheless, some studies have indicated that countries with a mild climate, such as those in the mediterranean, are at risk due to future climate conditions that may be favourable to this kind of disease [69] . in europe there are no registered cases, but the main vector of the disease has already reached the old continent and has been monitored on madeira island [5, 124] . this risk may be further aggravated by climate change and globalization, as a consequence of the huge volume of international tourism and trade [122] . travellers play an essential role in the global epidemiology: they act as viremic travellers, carrying the disease into areas where mosquitoes can transmit the infection. dengue is a vector-borne disease transmitted from an infected human to a female aedes mosquito by a bite.
then the mosquito, which needs regular blood meals to nourish its eggs, bites a potentially healthy human and transmits the disease, closing the cycle. there are two forms of dengue: dengue fever (df) and dengue hemorrhagic fever (dhf). the first one is characterized by a sudden high fever without respiratory symptoms, accompanied by intense headaches and painful joints and muscles, and lasts from three to seven days. humans may only transmit the virus during the febrile stage [33] . dhf initially exhibits a similar, if more severe, pathology to df, but deviates from the classic pattern at the end of the febrile stage [54] . the hemorrhagic form additionally involves bleeding from the nose, mouth and gums or skin bruising, nausea, vomiting and fainting due to low blood pressure caused by fluid leakage. it usually lasts from two to three days and can lead to death [36] . nowadays, dengue has become the mosquito-borne infection of major international public health concern. according to the world health organization (who), 50 to 100 million dengue fever infections occur yearly, including 500000 dengue hemorrhagic fever cases and 22000 deaths, mostly among children [141] . there are four distinct, but closely related, viruses that cause dengue. the four serotypes, named den-1 to den-4, belong to the flavivirus family, but they are antigenically distinct. recovery from infection by one virus provides lifelong immunity against that virus but confers only partial and transient protection against subsequent infection by the other three viruses. there is good evidence that a sequential infection increases the risk of developing dhf [137] . unfortunately, there is no specific effective treatment for dengue. activities such as triage and management are critical in determining the clinical outcome of dengue. a rapid and efficient front-line response not only reduces the number of unnecessary hospital admissions but also saves lives.
although there is as yet no effective and safe vaccine for dengue, a number of candidates are undergoing various phases of clinical trials [140] . with four closely related viruses that can cause the disease, there is a need for a vaccine that would immunize against all four types to be effective. the main difficulty in vaccine production is that there is a limited understanding of how the disease typically behaves and how the virus interacts with the immune system. another challenge is that some studies show that a secondary dengue infection can lead to dhf, and theoretically a vaccine could be a potential cause of severe disease if solid immunity is not established against all four serotypes. research to develop a vaccine is ongoing and the incentives to study the mechanism of protective immunity are gaining more support, now that the number of outbreaks around the world is increasing [33] . the spread of dengue is attributed to the geographic expansion of the mosquitoes responsible for the disease: aedes aegypti and aedes albopictus [23] . due to its higher interaction with humans and its urban behavior, the first mosquito is considered the main one responsible for dengue transmission, and our attention will be focused on it. aedes aegypti, shown in figure 4.2, is an insect species closely associated with humans and their dwellings, thriving in crowded cities and biting primarily during the day. humans not only provide blood meals for mosquitoes, but also, through water-holding containers in and around their homes, the breeding sites the mosquitoes need to reproduce. in urban areas, aedes mosquitoes breed in water collected in artificial containers such as cans, plastic cups, used tires, broken bottles and flower pots.
with increasing urbanization, crowded cities, poor sanitation and lack of hygiene, environmental conditions foster the spread of the disease, which, even in the absence of fatal forms, generates significant economic and social costs (absenteeism, immobilization, debilitation and medication) [35] . dengue is spread only by adult females, which require a blood meal for the development of their eggs, whereas male mosquitoes feed on fruit nectar and other sources of sugar. in this process the female acquires the virus while feeding on the blood of an infected person. after a virus incubation of eight to twelve days (the extrinsic period), an infected mosquito is capable, during probing and blood feeding, of transmitting the virus for the rest of its life to susceptible humans; the intrinsic incubation period in humans varies from 3 to 15 days. the life cycle of a mosquito has four distinct stages: egg, larva, pupa and adult, as can be seen in figure 4.3. in the case of aedes aegypti, the first three stages take place in or near water whilst air is the medium for the adult stage [103] . female mosquitoes usually do not lay their eggs all at once: they release them in different places, increasing the probability of new births [139] . the eggs of aedes aegypti can resist droughts and low temperatures for up to one year. although the hatching of mature eggs may occur spontaneously at any time, it is greatly stimulated by flooding. larvae hatch when water inundates the eggs as a result of rains or an addition of water by people. in the following days, the larvae feed on microorganisms and particulate organic matter. when the larva has acquired enough energy and size, it undergoes metamorphosis, changing into a pupa. pupae do not feed: they just change in form until the adult body is formed. the newly formed adult emerges from the water after breaking the pupal skin.
the entire life cycle, from the aquatic phase (eggs, larvae, pupae) to the adult phase, lasts from 8 to 10 days at room temperature, depending on the level of feeding [27] . the adult stage of the mosquito is considered to last an average of eleven days in an urban environment, reaching up to 30 days in a laboratory environment. studies suggest that most female mosquitoes may spend their lifetime in or around the houses where they emerge as adults. this means that people, rather than mosquitoes, rapidly move the virus within and between communities. aedes aegypti is one of the most efficient vectors for arboviruses because it is highly anthropophilic and frequently bites several times before completing oogenesis [141] . the extent of dengue transmission is determined by a wide variety of factors: the level of herd immunity in the human population to circulating virus serotype(s); virulence characteristics of the viral strain; survival, feeding behavior, and abundance of aedes aegypti; climate; and human density, distribution, and movement [121] . it is very difficult to control or eliminate aedes aegypti mosquitoes because they are highly resilient, quickly adapting to changes in the environment, and they have the ability to rapidly bounce back to initial numbers after disturbances resulting from natural phenomena (e.g., droughts) or human interventions (e.g., control measures). we can safely expect that transmission thresholds will vary depending on a range of factors. primary prevention of dengue resides mainly in mosquito control. there are two primary methods: larval control and adult mosquito control, depending on the intended target [96] . larvicide treatment is done with long-lasting chemicals that kill larvae and should preferably have who clearance for use in drinking water [36] . the application of adulticides can have a powerful impact on the abundance of the adult mosquito vector.
however, the efficacy is often constrained by the difficulty in achieving sufficiently high coverage of resting surfaces [37] . this is the most common measure. however, the long term use of adulticide has several risks: the mosquito developing resistance to the product, which reduces its efficacy; the killing of other species that live in the same habitat; and links to numerous adverse health effects, including the worsening of asthma and respiratory problems. larvicide treatment is an effective way to control the vector larvae, together with mechanical control, which is related to educational campaigns. mechanical control must be done by both public health officials and residents in affected areas. the participation of the entire population is essential in removing still water from domestic recipients and eliminating possible breeding sites [140] . the most recent approach to fighting the disease is biological control. it is a natural process of population regulation through natural enemies. there are techniques that use parasites to kill part of the larval population; however, there is some operational resistance, because there is a lack of expertise in producing these types of parasites and there are some cultural objections to introducing something into water intended for human consumption [23] . another way of insect control is to change the reproduction process by releasing sterile insects. this technique, named the sterile insect technique, consists in releasing sterile insects into the natural environment, so that mating produces non-viable eggs, which can lead to a drastic reduction of the species. this means of control has two drawbacks: it is expensive to produce and release the insects, and it can face social objections, because an uninformed population may not understand the addition of insects as a good solution [8, 47, 131] .
mathematical modeling has become an interesting tool for understanding epidemiological diseases and for proposing effective strategies to fight them. a set of mathematical models has been developed in the literature to gain insights into the transmission dynamics of dengue in a community. while feng and velasco-hernández [49] investigate the competitive exclusion principle in a two-strain dengue model, chowell et al. [26] estimate the basic reproduction number for dengue using spatial epidemic data. in [100] the author studies the spread of dengue through statistical analysis, while in tewa et al. [130] global asymptotic stability of the equilibrium of a single-strain dengue model is established. the control of the mosquito by the sterile insect technique is analyzed in thomé et al. [132] . more recently, a study on disease persistence in brazil was carried out [88] and otero et al. [102] studied dengue outbreaks. all these studies were made with the aim of providing a better understanding of the nature and dynamics of dengue infection transmission. in the next section a temporal mathematical model that explores the dynamics between hosts (humans) and vectors (mosquitoes) is analyzed. the aim of this section is to present a mathematical model to study the dynamics of dengue epidemics, in order to minimize the investments in disease control, since financial resources are always scarce. quantitative methods are applied to the optimization of investments in the control of the epidemiologic disease, in order to obtain the maximum benefit from a fixed amount of financial resources. the model used depends on the dynamics of mosquito growth, but also on the efforts of public management to motivate the population to break the reproduction cycle of the mosquitoes by avoiding the accumulation of still water in open-air containers and by spraying potential breeding zones.
the dengue epidemic model described in this chapter is based on the one proposed in [20] . it has four state variables and two control variables, as follows. state variables: x_1(t), the density of mosquitoes; x_2(t), the density of mosquitoes carrying the virus; x_3(t), the number of infected individuals; x_4(t), the level of popular motivation (goodwill) to combat the mosquitoes. control variables: u_1(t), the investment in insecticide spraying operations; u_2(t), the investment in educational campaigns. to describe the model it is also necessary to introduce some parameters: α_r, the average reproduction rate of mosquitoes; α_m, the mortality rate of mosquitoes; β, the probability of contact between non-carrier mosquitoes and infected individuals; η, the rate of treatment of infected individuals; µ, the amplitude of seasonal oscillation in the reproduction rate of mosquitoes; ρ, the probability of individuals becoming infected; θ, the fear factor, reflecting the increase in the population's willingness to take actions to combat the mosquitoes as a consequence of the high prevalence of the disease in the specific social environment; τ, the forgetting rate for the goodwill of the target population; ϕ, the phase angle to adjust the peak season for mosquitoes; ω, the angular frequency of the mosquito proliferation cycle, corresponding to a 52 weeks period; p, the population in the risk area (usually normalized to yield p = 1); γ_d, the instantaneous costs due to the existence of infected individuals; γ_s, the costs of each operation of spraying insecticides; γ_e, the cost associated to the instructive campaigns. the model consists in minimizing the cost functional (4.1), min ∫_0^{t_f} [γ_d x_3^2(t) + γ_s u_1^2(t) + γ_e u_2^2(t)] dt, subject to the four nonlinear time-varying state equations (4.2a)-(4.2d) [20] . equation (4.2a) represents the variation of the mosquito density per unit time, due to the natural cycle of reproduction and mortality (α_r and α_m), to the seasonal effect µ sin(ωt + ϕ) and to the human interference terms −x_4(t) and u_1(t). equation (4.2b) expresses the variation of the density of mosquitoes carrying the virus, x_2(t); the term β [x_1(t) − x_2(t)] x_3(t) represents the rate of increase of infected mosquitoes due to possible contact between the uninfected mosquitoes x_1(t) − x_2(t) and the infected individuals x_3(t).
the dynamics of the infectious transmission is presented in equation (4.2c). the term −ηx_3(t) is related to the rate of cure and ρx_2(t) [p − x_3(t)] describes the rate at which new cases spring up. the factor [p − x_3(t)] is the number of individuals in the area that are not infected. equation (4.2d) is a model for the level of popular motivation (or goodwill) to combat the reproductive cycle of mosquitoes. the level of popular motivation changes over time; as a consequence, it is necessary to invest in educational campaigns designed to increase the consciousness of the population at risk. the expression −τ x_4(t) represents the decay of the people's motivation over time, due to forgetfulness. the term θx_3(t) describes the natural sensibility of the public to the increase in the prevalence of the disease. the goal is to minimize the cost functional (4.1). this functional includes the social costs related to the existence of ill individuals (such as absenteeism, hospital admissions and treatments), γ_d x_3^2(t), the resources needed for the insecticide spraying operations, γ_s u_1^2(t), and those for the educational campaigns, γ_e u_2^2(t). the model for the social cost is based on the concept of goodwill explored by nerlove and arrow [99] . due to computational issues, the optimal control problem (4.1)-(4.2d), which was written in the lagrange form, was converted into an equivalent mayer problem. hence, using a standard procedure (cf. section 1.2) to rewrite the cost functional, the state vector was augmented by an extra component x_5, satisfying ẋ_5(t) = γ_d x_3^2(t) + γ_s u_1^2(t) + γ_e u_2^2(t) with x_5(0) = 0, leading to the equivalent terminal cost problem of minimizing x_5(t_f), with given t_f, subject to the control system (4.2a)-(4.2d) and (4.3). two different implementations were considered. in a first approach, the oc problem is solved by a specific optimal control package, oc-ode, already described in section 2.3. the oc problem considers (4.2a)-(4.4) and the code is available in [110] .
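the verbal description of equations (4.2a)-(4.2d) above can be turned into a small simulation sketch. the right-hand sides below are one plausible reading of that description (the exact functional forms in [20] may differ), and all parameter values and the constant controls u_1, u_2 are assumed for illustration only:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Plausible reconstruction of (4.2a)-(4.2d) from the verbal description;
# all values below are assumed, not taken from [20].
alpha_r, alpha_m = 0.30, 0.25                 # reproduction / mortality
mu, omega, phi = 0.10, 2 * np.pi / 52, 0.0    # seasonal oscillation (52-week cycle)
beta, eta, rho = 0.30, 0.30, 0.30
tau, theta, p = 0.05, 0.10, 1.0
u1, u2 = 0.10, 0.05                           # constant control efforts (assumed)

def rhs(t, x):
    x1, x2, x3, x4 = x
    # (4.2a): reproduction/mortality, seasonality, human interference
    dx1 = (alpha_r * (1 + mu * np.sin(omega * t + phi)) - alpha_m) * x1 \
          - (x4 + u1) * x1
    # (4.2b): infected mosquitoes, with mortality/interference assumed
    dx2 = beta * (x1 - x2) * x3 - (alpha_m + x4 + u1) * x2
    # (4.2c): new human cases minus cures
    dx3 = rho * x2 * (p - x3) - eta * x3
    # (4.2d): goodwill decays (forgetting) and grows with fear + campaigns
    dx4 = -tau * x4 + theta * x3 + u2
    return [dx1, dx2, dx3, dx4]

x0 = [1.0, 0.1, 0.05, 0.0]                    # assumed initial state
sol = solve_ivp(rhs, (0.0, 52.0), x0, rtol=1e-8, atol=1e-10)
print("final state:", np.round(sol.y[:, -1], 4))
```

note how the structure mirrors the text: x_3 is confined to [0, p] by the factor (p − x_3), and the goodwill x_4 feeds back negatively on the mosquito density x_1.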
a second approach uses the nonlinear solver ipopt [135] , also described in section 2.3. in order to use this software, it was necessary to discretize the problem. the euler discretization scheme was chosen (see section 2.1 for more details). the discretization step length was h = 1/4, because it is a good compromise between precision and efficiency. thus, the optimal control problem was discretized into a nonlinear programming problem. the error tolerance value for the ipopt solver was 10^−8. the discretized problem, after a presolve done by the software, has 1455 variables, 1243 of which are nonlinear, and 1039 constraints, 828 of which are nonlinear (see the ampl code for this problem in [110] ). the results were compared with the approach [104, 133] used by the authors of the paper [20] . it is important to point out that, at the time of the initial paper [20] , the authors did not have the computational resources that exist nowadays. the results with oc-ode and ipopt are better, since both the cost of fighting the dengue disease and the number of infected individuals are smaller than in [20] . other packages were also tested and could not reach a solution: some crashed midway, or bad scaling issues were observed. until some years ago, due to computational limitations, most of the models were run using codes written by the authors themselves, as in [20] . nowadays, one can choose between several proper software packages "out of the box" that already take into account specific features such as stiff problems, scaling problems, etc. with this work it is possible to realize that "old" problems can be taken up again and better analyzed with new technology and approaches, with the goal of finding global optimal solutions instead of local ones. for this purpose, and at an initial research stage, it is important to understand whether different kinds of discretization of an oc problem influence the problem resolution.
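the euler direct transcription described above can be sketched on a toy problem: a scalar optimal control problem is discretized with forward euler (the same h = 1/4 step as in the text), and the resulting nlp, with the states and controls as decision variables and the dynamics as equality constraints, is handed to a general nonlinear solver. the dynamics, cost and horizon below are illustrative, not the dengue model:

```python
import numpy as np
from scipy.optimize import minimize

# Toy direct transcription: min sum h*(x_k^2 + u_k^2) subject to the
# forward-Euler dynamics x_{k+1} = x_k + h*(-x_k + u_k) and x_0 = 1.
# Problem, dynamics and horizon are illustrative assumptions.
h, N = 0.25, 20

def unpack(z):
    return z[:N + 1], z[N + 1:]   # states x_0..x_N, controls u_0..u_{N-1}

def cost(z):
    x, u = unpack(z)
    return h * (np.sum(x[:-1] ** 2) + np.sum(u ** 2))

def defects(z):
    x, u = unpack(z)
    # Euler defect constraints: zero exactly when the discretized ODE holds.
    return x[1:] - x[:-1] - h * (-x[:-1] + u)

cons = [{"type": "eq", "fun": defects},
        {"type": "eq", "fun": lambda z: z[0] - 1.0}]   # initial condition
z0 = np.zeros(2 * N + 1)
res = minimize(cost, z0, constraints=cons, method="SLSQP")
x_opt, u_opt = unpack(res.x)
print(res.success, round(res.fun, 5))
```

the same pattern scales to the five augmented dengue states: each state equation contributes one block of defect constraints per time step, which is why the variable and constraint counts reported above grow with the horizon and shrink with h.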
this section aims to study the costs of different discretization processes, in terms of time performance, number of variables and number of iterations. for the purpose of this analysis two discretization schemes, euler's and the second-order runge-kutta scheme [9] (cf. section 2.1), are considered to solve the problem described in the previous section. this discretization process transforms the dengue epidemics problem into a standard nonlinear optimization problem, with an objective function and a set of nonlinear constraints. this nlp problem was coded, for both discretization schemes, in the ampl modeling language [55] and can be checked in [110] . two nonlinear solvers with distinct features were selected to solve the nlp problem: knitro (an ip method) and snopt (an sqp method). the neos server [98] platform was used as the interface with both solvers. ipopt (an ip method), used in the previous section, was the first choice for our research. however, at the time of this investigation, the neos platform was moved to another research center and some software packages were unavailable for long periods of time, so we had to choose another robust interior point solver. table 4.1 reports the results for both solvers, for each discretization method, using three different discretization steps (h = 0.5, 0.25, 0.125), giving rise to twelve numerical experiments. the columns # var. and # const. give the number of variables and constraints, respectively. the next columns refer to the performance measures: the number of iterations and the total cpu time in seconds (the time for solving the problem, for evaluating the objective and the constraint functions, and for input/output). as the computational experiments were made on the neos server platform, the machine selected to run the program remains unknown, as well as its technical specifications. the optimal cost reached was ≈ 3 × 10^−3 for all tests.
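the difference between the two schemes compared above can be seen on a simple test ode (the ode and step lengths below are illustrative; the rk2 variant shown is heun's method, one common choice): euler's error halves when h is halved, while the second-order runge-kutta error drops by roughly a factor of four.

```python
import numpy as np

# Compare the two discretization schemes on x' = -x, x(0) = 1 (illustrative).
def euler(f, x0, h, n):
    x = x0
    for _ in range(n):
        x = x + h * f(x)
    return x

def rk2(f, x0, h, n):
    # Heun's second-order Runge-Kutta scheme (one common RK2 variant).
    x = x0
    for _ in range(n):
        k1 = f(x)
        k2 = f(x + h * k1)
        x = x + 0.5 * h * (k1 + k2)
    return x

f = lambda x: -x
T, exact = 1.0, np.exp(-1.0)
for h in (0.5, 0.25, 0.125):          # the step lengths used in table 4.1
    n = int(round(T / h))
    e_eu = abs(euler(f, 1.0, h, n) - exact)
    e_rk = abs(rk2(f, 1.0, h, n) - exact)
    print(f"h={h}: euler error {e_eu:.2e}, rk2 error {e_rk:.2e}")
```

this is the accuracy side of the trade-off discussed in the text: rk2 buys accuracy per step, but in the transcribed nlp each rk2 step also produces more work per constraint, which is why the best observed combination need not use the smallest h.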
comparing the general behavior of the solvers, one can conclude that the ip based method (knitro) presents much better performance than the sqp method (snopt) in terms of the measures used. regarding the knitro results, one realizes that the euler discretization scheme has better times for h = 0.25 and h = 0.125 and a similar time for h = 0.5, when compared to the runge-kutta method. another obvious finding, for both solvers, is that the cpu time increases as the problem dimension (number of variables and constraints) increases. with respect to the number of iterations, snopt presents more iterations as the problem dimension increases. this conclusion cannot, however, be drawn for knitro: in fact, no clear relation exists between the problem dimension and the number of iterations. the best version tested was knitro using runge-kutta with h = 0.5 (best cpu time and fewest iterations), and the second best was knitro with euler's method using h = 0.25. an important finding of these numerical experiments is that reducing the discretization step size is not worthwhile, because no significant advantages are obtained. at this moment, as a result of major demographic changes, rapid urbanization on a massive scale, global travel and environmental change, the world faces enormous future challenges from emerging infectious diseases, and dengue illustrates these challenges well [141] . in this work we investigated an optimal control model for dengue epidemics proposed in [20] , which includes the mosquito dynamics and the effect of educational campaigns. the cost functional reflects a compromise between financial spending on insecticides and educational campaigns and the population's health. for comparison purposes, the same choice of data/parameters as in [20] was considered. the results obtained with oc-ode and ipopt are similar, improving the ones previously reported in [20] (cf. section 4.2).
indeed, the control policy obtained in this work presents an important improvement with respect to the previous best policy: the percentage of infected mosquitoes vanishes after just four weeks, while mosquitoes are completely eradicated after 30 weeks (figures 4.4 and 4.5); the number of infected individuals begins to decrease after four weeks, while with the previous policy this only happened after 23 weeks (figure 4.6). moreover, these better results are accomplished with a much smaller cost in insecticides and educational campaigns (figure 4.8). the general improvement, which explains why the results are so successful, relies on an effective control policy for insecticides. the proposed strategy for insecticide application seems to explain the discrepancies between the results obtained here and the best policy of [20] . our results show that applying insecticides in the first four weeks yields a substantial reduction in the cost of fighting dengue, in terms of the functional proposed in [20] . the main conclusion is that health authorities should pay attention to the epidemic from the very beginning: effective control decisions in the first four weeks play a decisive role in the dengue battle, and both the population and governments will profit from them. we successfully solved an oc problem by direct methods, using nonlinear optimization software based on ip and sqp approaches. the problem was discretized through euler and runge-kutta schemes. the implementation effort of higher order discretization methods brings no advantage. the reduction of the discretization step, and consequently the increase in the number of variables and constraints, does not improve the performance with respect to cpu time or the number of iterations. we can point out the robustness of both solvers in spite of the increase in problem dimension. the conclusions drawn in section 4.3 were helpful for the discretization decisions made throughout the rest of this work.
as future work we intend to analyze how different parameters/weights associated with the variables in the objective function can influence the spread of the disease. this chapter was based on work published in the peer reviewed journal [112] and the peer reviewed conference proceedings [111] . a model for dengue disease transmission is now presented. it consists of eight mutually exclusive compartments representing the human and vector dynamics. it also includes a control parameter, adulticide spraying, as a measure to fight the disease. the model presents three possible equilibria: two disease free equilibria (dfe) and an endemic equilibrium (ee). it has been proved that a dfe is locally asymptotically stable whenever a certain epidemiological threshold, known as the basic reproduction number, is less than one. in this work we try to understand the best way to apply the control in order to effectively reduce the number of infected humans and mosquitoes. a case study, using data from the 2009 outbreak in cape verde, is reported. in chapter 4 the dengue epidemic was studied mostly centered on people, especially on the goodwill of the individuals and the spraying campaigns. however, the virus transmission scheme was overlooked: only two compartments for people and two compartments for adult mosquitoes were considered. here, the aim is to deepen the relationship between humans and mosquitoes, creating a better framework to explain the development and transmission of the disease. the mathematical model is based on [43, 44] , which describe the chikungunya disease transmitted by aedes albopictus. the notation used in our mathematical model includes four epidemiological states for humans: susceptible (s_h), exposed (e_h), infected (i_h) and resistant (r_h). it is assumed that the total human population (n_h) is constant, so n_h = s_h + e_h + i_h + r_h.
there are also four other state variables related to the female mosquitoes (male mosquitoes are not considered in this study because they do not bite humans and consequently do not influence the dynamics of the disease): the aquatic phase a_m and the adult phases s_m (susceptible), e_m (exposed) and i_m (infected). similarly, it is assumed that the total adult mosquito population is constant, which means n_m = s_m + e_m + i_m. in this way, the model becomes more complex and closer to the reality of the dengue epidemic. for this study we introduced a control variable: the proportion of adulticide applied, c(t), which varies from 0 to 1. however, the model does not fit reality completely. epidemiologists and policy makers need to be aware of both the strengths and weaknesses of the epidemiological modeling approach. an epidemiological model is always a simplification of reality, so some assumptions were made to build this model: • the total human population (n_h) is constant; • there is no immigration of infected individuals into the human population; • the population is homogeneous, which means that every individual of a compartment is homogeneously mixed with the other individuals; • the coefficient of transmission of the disease is fixed and does not vary seasonally; • both humans and mosquitoes are assumed to be born susceptible, i.e., there is no natural protection; • for the mosquito there is no resistant phase, due to its short lifetime.
to completely describe the model it is necessary to use the following parameters: n_h, the total population; b, the average daily biting rate (per day); β_mh, the transmission probability from i_m (per bite); β_hm, the transmission probability from i_h (per bite); 1/µ_h, the average lifespan of humans (in days); 1/η_h, the mean viremic period (in days); 1/µ_m, the average lifespan of adult mosquitoes (in days); ϕ, the number of eggs at each deposit per capita (per day); µ_a, the natural mortality of larvae (per day); η_a, the maturation rate from larvae to adult (per day); 1/η_m, the extrinsic incubation period (in days); 1/ν_h, the intrinsic incubation period (in days); m, the number of female mosquitoes per human; k, the number of larvae per human; and K, the maximal capacity of larvae. for notational simplicity, the independent variable t will be omitted when writing the dependent variables; for example, s_h will be written instead of s_h(t). the dengue epidemic can be modelled by nonlinear time-varying state equations for the human population (5.1) and for the vector population (5.2), together with initial conditions for this set of differential equations. the equilibrium points of the system are now analyzed and the threshold phenomena determined. let ω be the biologically feasible region of the system; the following result shows that it is positively invariant. proof. system (5.1)-(5.2) can be rewritten in the form (5.4). as m(x) has all off-diagonal entries nonnegative, m(x) is a metzler matrix. using the fact that f ≥ 0, the system (5.4) is positively invariant in r^7_+ [1] , which means that any trajectory of the system starting from an initial state in the positive orthant r^7_+ remains forever in r^7_+. theorem 5.1.1. let ω be defined as above. the system (5.1)-(5.2) admits at most two disease free equilibrium points: • if m ≤ 0, there is a disease free equilibrium (dfe), called the trivial equilibrium; • if m > 0, there is a biologically realistic disease free equilibrium (brdfe). proof. the equilibrium points are reached when the equations (5.5), obtained by setting the right-hand sides of (5.1)-(5.2) to zero, hold. using the mathematica software to solve the system (5.5), we obtained four solutions.
the first one is known as the trivial equilibrium, since the mosquitoes do not exist and so there is no disease. in the second one, mosquitoes and humans interact, but there is only one outbreak of the disease, i.e., over time the disease disappears without killing all the mosquitoes. we have called this equilibrium point a biologically realistic disease free equilibrium (brdfe), since it is a more reasonable situation to find in nature than the previous one; it is equivalent to e*_2 = (n_h, 0, 0, kn_hm η_a µ_b, kn_hm µ_b µ_m, 0, 0). this is biologically interesting only if m is greater than 0. the third solution corresponds to a situation where humans and mosquitoes live together but the disease persists in both populations, which means that it is not a dfe. this equilibrium will be discussed later (see theorem 5.1.3). in this case the disease is no longer an epidemic episode, but becomes endemic. with some algebraic manipulations we obtained the corresponding equilibrium point. with the mathematica software we also obtained a fourth solution, but some of its components are negative, which means that it does not belong to the set ω. remark 6. the condition m > 0 is equivalent, by algebraic manipulation, to the condition ϕ η_a / ((η_a + µ_a) µ_m) > 1, whose left-hand side corresponds to the basic offspring number for mosquitoes. thus, if m < 0, then the mosquito population will collapse and the only equilibrium for the whole system is the trivial equilibrium. if m ≥ 0, then the mosquito population is sustainable. the amount of mosquitoes is also related to an epidemic threshold: the basic reproduction number of the disease, r_0. following [42] , we prove: theorem 5.1.2. the equilibrium point brdfe is locally asymptotically stable if r_0 < 1 and unstable if r_0 > 1. proof. to derive the basic reproduction number, we use the next-generation approach. the basic reproduction number is calculated at a disease free equilibrium; in this case we consider the most realistic one, the brdfe.
following [3, 42] , consider the vector x^t = (e_h, i_h, e_m, i_m), which corresponds to the components related to the progression of the disease. the subsystem used is (5.6), the restriction of (5.1)-(5.2) to these components. this subsystem can be partitioned as f(x) − v(x), where f(x) represents the components related to new cases of the disease (here, in the exposed compartments) and v(x) represents the other components; thus the subsystem (5.6) can be rewritten in this form. let us consider the jacobian matrices j_f and j_v associated with f and v. according to [42] , the basic reproduction number is r_0 = ρ(j_f(x_0) j_v^{-1}(x_0)), where x_0 is a disease free equilibrium (the brdfe) and ρ(a) denotes the spectral radius of a matrix a. using mathematica, we obtain the value of the threshold parameter, valid for m > 0. in the model we have two different populations (humans and vectors), so the basic reproduction number reflects both the human-to-vector and the vector-to-human infection, that is, r_0^2 = r_hm × r_mh. the term b β_hm s_m0/n_h represents the product of the transmission probability of the disease from humans to vectors and the number of susceptible mosquitoes per human; 1/(η_h + µ_h) is related to the human viremic period; and η_m/(c + η_m + µ_m) represents the proportion of mosquitoes that survive the incubation period. analogously, the term b β_mh s_h0/n_h is related to the transmission probability of the disease from mosquitoes to humans in a susceptible population. when r_0 < 1, each infected individual produces, on average, less than one new infected individual, and therefore it is predictable that the infection will be cleared from the population. if r_0 > 1, the disease is able to invade the susceptible population [42, 83] . theorem 5.1.3. if m > 0 and r_0 > 1, then the system (5.1)-(5.2) also admits an endemic equilibrium (ee). proof. see the proof of theorem 5.1.1. from a biological point of view, it is desirable that humans and mosquitoes coexist without the disease reaching a level of endemicity.
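the next-generation computation for this subsystem can be checked numerically. the jacobians below are a plausible reconstruction from the description above (new infections enter e_h and e_m; v carries incubation, recovery, mortality and the control c), and the parameter values and the equilibrium s_m0 are illustrative assumptions, so the resulting number need not match the value computed in the text:

```python
import numpy as np

# Next-generation sketch for x = (e_h, i_h, e_m, i_m); jacobians and
# parameter values are plausible assumptions, not the text's exact ones.
b, beta_mh, beta_hm = 1.0, 0.375, 0.375
mu_h, eta_h, nu_h = 1.0 / (71 * 365), 1.0 / 3, 1.0 / 4
mu_m, eta_m, c = 1.0 / 11, 1.0 / 11, 0.0
n_h = 480000.0
s_h0, s_m0 = n_h, 6.0 * n_h          # brdfe values (s_m0 assumed as m*n_h)

# F: new infections enter the exposed compartments e_h and e_m.
F = np.array([[0, 0, 0, b * beta_mh * s_h0 / n_h],
              [0, 0, 0, 0],
              [0, b * beta_hm * s_m0 / n_h, 0, 0],
              [0, 0, 0, 0]])
# V: remaining transitions (incubation, recovery, mortality, control c).
V = np.array([[nu_h + mu_h, 0, 0, 0],
              [-nu_h, eta_h + mu_h, 0, 0],
              [0, 0, c + eta_m + mu_m, 0],
              [0, 0, -eta_m, c + mu_m]])

r0 = max(abs(np.linalg.eigvals(F @ np.linalg.inv(V))))
# closed-form factors, so that r0**2 = r_hm * r_mh as stated in the text
r_hm = b * beta_hm * (s_m0 / n_h) * eta_m / ((c + eta_m + mu_m) * (c + mu_m))
r_mh = b * beta_mh * (s_h0 / n_h) * nu_h / ((nu_h + mu_h) * (eta_h + mu_h))
print(f"r0 = {r0:.3f}")
```

the spectral radius comes out as the geometric mean of the two one-way factors, which is exactly the r_0^2 = r_hm × r_mh decomposition described above; increasing c enlarges the denominators in the mosquito block of v and so lowers r_0.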
good estimates of dengue transmission intensity are therefore necessary to compare and interpret dengue interventions conducted in different places and times and to evaluate options for dengue control. the basic reproduction number has played a central role in epidemiological theory for dengue and other infectious diseases because it provides an index of transmission intensity and establishes a threshold criterion. we claim that proper use of the control c can keep the basic reproduction number below unity and, therefore, make the brdfe stable. in order to make effective use of achievable insecticide control, and simultaneously to explain its effectiveness more easily to the competent authorities, we assume that c is constant. the goal is to find c such that r_0^2 < 1. for this purpose we have studied the reality of cape verde. an unprecedented outbreak was detected in the cape verde archipelago in september 2009, the first report of dengue virus activity in that country. as the population had never had contact with the virus, herd immunity was very low. dengue type 3 spread throughout the archipelago, reaching four of the nine islands. the worst outbreak occurred on santiago island, where most people live. the number of cases increased sharply from the beginning of november, reaching 1000 cases per day. the cape verde ministry of health reported more than 20000 cases of dengue fever within the archipelago between october and december 2009, about 5% of the total population of the country. of the 173 reported cases of dengue hemorrhagic fever, six people died [31, 40]. the outbreak represented a challenge to the performance of the national health care system. government officials launched a plan to eradicate the mosquito, including a national holiday during which citizens were asked to clear out standing water and other potential breeding areas used by the mosquitoes.
the intense and spontaneous movement of solidarity from civil society was another noteworthy dimension: not only cleaning, but also voluntarily donating blood to strengthen the stock of the central hospital dr. agostinho neto, in praia. we used the data for the human population related to cape verde [54]. due to low surveillance and the fact that this was the first ever dengue outbreak in the country, it was not possible to collect detailed data about the mosquito. however, the authorities speak explicitly about mosquitoes coming from brazil [2]. the information from the ministry of health in praia, the capital of cape verde, also confirms that the insects responsible for dengue most probably came from brazil, transported by the air connections that frequently link cape verde and brazil, as reported by the radio of cape verde. with respect to aedes aegypti, we have thus considered data from brazil [132, 143]. the simulations were carried out using the following values: n_h = 480000, b = 1, β_mh = 0.375, β_hm = 0.375, µ_h = 1/(71 × 365), η_h = 1/3, µ_m = 1/11, ϕ = 6, µ_a = 1/4, η_a = 0.08, η_m = 1/11, ν_h = 1/4, m = 6, k = 3. the initial conditions for the problem were fixed accordingly. considering nonexistence of control, i.e. c = 0, the basic reproduction number for this outbreak in cape verde is approximately r_0 = 2.396, which is in agreement with other studies of dengue in other countries [100]. the control c affects the basic reproduction number, and our aim is to find a control that brings r_0 below one. this value was obtained with mathematica, solving the corresponding inequality with the parameter values above. the computational investigations were carried out using c = 0.157, which means that the insecticide is applied continuously during a twelve-week period. the software used was scilab [21]. it is an open source, cross-platform numerical computation package and a high-level, numerically oriented programming language.
for our problem we used the routine ode to solve the set of differential equations. by default, ode uses the lsoda solver from the odepack package, which automatically selects between the nonstiff predictor-corrector adams method and the stiff backward differentiation formula (bdf) method: it starts with the nonstiff method and dynamically monitors data in order to decide which method to use. the graphics were also obtained with this software, using the command plot (see code in [110]). figures 5.2 and 5.3 show the curves related to the human population, with and without control, respectively. the number of infected people, even with a small control, is much lower than without any insecticide campaign. figures 5.4 and 5.5 show the difference in the mosquito population with and without control. when the control is applied, the number of infected mosquitoes is close to zero. note that the intention is not to completely eradicate the mosquitoes, but rather the number of infected mosquitoes. it has been algebraically proved that if a constant minimum level of control is applied (c = 0.157), it is possible to maintain the basic reproduction number below unity, guaranteeing the brdfe. this value is corroborated by another numerical study [113]. the numbers of infected humans obtained by the model are higher than what really happened in cape verde: although the measures taken by the government and health authorities were not accounted for in the model, they had a considerable impact on the progress of the disease. up to here, a constant control was considered. using a theoretical approach [112], we intend to find the best function c(t), using oc theory. instead of finding a constant control, it will be possible to study other types of control, such as piecewise constant or even continuous but non-constant functions.
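as a rough sketch of the simulation step, the fragment below integrates a reduced sir-host / si-vector stand-in for system (5.1)-(5.2) with an lsoda solver, mirroring the lsoda default of scilab's ode routine; the compartments, parameter values and the helper peak_infected are illustrative assumptions, not the thesis code.

```python
import numpy as np
from scipy.integrate import solve_ivp

# illustrative parameters (human side loosely follows the text's values;
# the reduced model drops the exposed compartments and the aquatic phase)
NH = 480000
B, BETA_MH, BETA_HM = 1.0, 0.375, 0.375        # biting rate, transmissions
MU_H, ETA_H = 1.0 / (71 * 365), 1.0 / 3.0      # human mortality, recovery
MU_M, M = 1.0 / 11.0, 6                        # mosquito mortality, ratio
NM = M * NH

def rhs(t, y, c):
    """right-hand side of the reduced host-vector system; c is the
    constant insecticide control acting on adult mosquitoes."""
    sh, ih, rh, sm, im = y
    lam_h = B * BETA_MH * im / NH      # force of infection on humans
    lam_m = B * BETA_HM * ih / NH      # force of infection on mosquitoes
    return [MU_H * NH - (lam_h + MU_H) * sh,
            lam_h * sh - (ETA_H + MU_H) * ih,
            ETA_H * ih - MU_H * rh,
            MU_M * NM - (lam_m + MU_M + c) * sm,
            lam_m * sm - (MU_M + c) * im]

def peak_infected(c, t_end=84.0):
    """maximum number of infected humans over the 84-day campaign."""
    y0 = [NH - 10, 10, 0, NM, 0]
    sol = solve_ivp(rhs, (0.0, t_end), y0, args=(c,), method='LSODA',
                    rtol=1e-8, atol=1e-6)
    return sol.y[1].max()
```

with these assumed parameters, the peak of infected humans under c = 0.157 comes out below the uncontrolled peak, in line with the qualitative message of figures 5.2-5.5.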
additionally, we could consider another, more practical strategy: due to logistics and health reasons, it may be more convenient to apply insecticide periodically and at specific hours at night. in this section we investigate the best way to apply the control in order to effectively reduce the number of infected humans and mosquitoes, using pulse control. in the literature it has been proven that a dfe is locally asymptotically stable whenever a certain epidemiological threshold, known as the basic reproduction number, is less than one. in the previous section, it was proven that if a constant minimum level of insecticide is applied (c = 0.157), it is possible to maintain the basic reproduction number below unity, guaranteeing the dfe. in this section, other kinds of piecewise controls that maintain the basic reproduction number below one, and that could be easier for health authorities to implement, are investigated. to solve the system (5.1)-(5.2), in a first step, several control application strategies were used: three different frequencies (weekly, bi-weekly and monthly), a constant control (c = 0.157) and no control (c = 0). the three frequencies mean that during one day (per week, bi-week or month), the whole (100%) capacity of insecticide (c = 1) is used all day. in addition, a constant control strategy (c = 0.157) was used, consisting of the application of 15.7% of the insecticide capacity 24 hours per day during the whole period (84 days). in this work, the amount of insecticide is a dimensionless value and must be considered in relative terms. the numerical tests were carried out using scilab [120], with the same ode system and parameters of the previous section (code available in [110]). figures 5.6 and 5.7 show the results of these strategies regarding infected mosquitoes and individuals. without control, the number of infected mosquitoes and individuals increases markedly.
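the pulse schedules compared above are easy to encode as a time-dependent control; the helpers below (pulse_control and total_insecticide are hypothetical names) reproduce the accounting of the text: one full-capacity day (c = 1) per period versus the constant c = 0.157 applied around the clock for 84 days.

```python
import numpy as np

def pulse_control(t, period, pulse_days=1.0, strength=1.0):
    """insecticide level at time t (days): full strength during the first
    `pulse_days` of each `period`, zero otherwise (one-day pulse per
    week, bi-week or month in the text's comparison)."""
    return np.where(np.asarray(t) % period < pulse_days, strength, 0.0)

def total_insecticide(period, horizon=84.0, step=0.01):
    """time-integrated amount of insecticide over the 84-day horizon,
    in the same dimensionless units as the control c."""
    ts = np.arange(0.0, horizon, step)
    return float(pulse_control(ts, period).sum() * step)
```

for the weekly scheme this gives 12 insecticide-days against 84 × 0.157 ≈ 13.2 for the constant strategy, consistent with the later observation that the amounts spent are similar while the pulse scheme is easier to implement.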
it is also possible to see that the weekly pulse control gave the closest results to the continuous control. therefore, realizing the influence of the insecticide control, further tests were carried out to find the optimal periodicity of administration which, from the gathered results, must lie between six and seven days. the second phase of numerical tests, figures 5.8 and 5.9, considered four situations: 6 days, 7 days, 10 days and continuous application with c = 0.157. to guarantee the dfe, the curves must remain below the one corresponding to c = 0.157. the amount of insecticide, and when to apply it, are important factors for outbreak control. table 5.1 reports the total amount of insecticide used in each variant during the 84 days. the numerical tests indicate that the best strategy for reducing the number of infected is an application every 6/7 days: the amount spent on insecticide is similar to the continuous strategy, but it is much easier to implement. in this work, several piecewise strategies were studied to find the best way of applying insecticide, always having in mind a periodic application of the product. but what if the best strategy is not a periodic one? in the next section the optimal control solution for this problem will be studied. the aim of this section is the study of optimal strategies for applying insecticide, taking into account different perspectives: thinking only of the insecticide cost, focusing on the cost of infected humans, or combining both perspectives. to take this approach, and after several numerical experiments, we considered it better to normalize the ode system (5.1)-(5.2). the reason for this transformation is the poor scaling of the variables: some vary from 0 to 480000, and others from 0 to 1, which can affect the performance of the software.
consider the following transformations, under which the ode system (5.1)-(5.2) is rewritten in normalized form with the corresponding initial conditions. let us consider the objective functional j accounting for the costs of infected humans and the costs of insecticide, where γ_d and γ_s are positive constants representing the cost weights of infected individuals and spraying campaigns, respectively. using oc theory, let λ_i(t), with i = 1, ..., 8, be the co-state variables. the hamiltonian for the present oc problem is given by (5.10). by the pontryagin maximum principle [106], the optimal control c* should be the one that minimizes, at each instant t, the hamiltonian (5.10), that is, h(x*(t), λ*(t), c*(t)) = min_{c∈[0,1]} h(x*(t), λ*(t), c). in this way, the optimal control is obtained in closed form. it is also necessary to consider the adjoint system λ'_i(t) = −∂h/∂x_i. as the oc problem only has initial conditions, it is necessary to find the transversality conditions, which correspond to a terminal condition on the co-state equations. replacing the optimal control c* in the state system (5.7) and in the adjoint system (5.11), it is possible to solve the differential system taking into account the initial and transversality conditions. in order to solve this oc problem, three approaches were tested. the first one is a direct method, the dotcvp toolbox [67] for matlab. it uses the differential system (5.7) and the initial conditions (5.8), and it is necessary to transform the problem into mayer form (see section 2.3). to solve the discretized oc problem, the ipopt software is chosen as an option inside dotcvp. the functional is divided into time intervals, in this case ten, with an initial value problem tolerance of 10^-7 and an nlp tolerance of 10^-5. the optimal control function given by this toolbox is piecewise constant. another direct method is oc-ode [57].
it uses the differential system (5.7) and the initial conditions (5.8), it includes procedures for numerical adjoint estimation and sensitivity analysis, and the feasibility tolerance considered was 10^-10. the last method used, an indirect one, is coded in the matlab environment. it involves the backward-forward method (see section 2.2), and the ode systems are solved by the ode45 routine: the state differential system (5.7) is solved forward with the initial conditions (5.8), while the adjoint system (5.11) is solved backwards using the terminal conditions (5.12). the absolute and relative tolerances were fixed at 10^-4. the three codes are available at [110]. all the parameters are assumed equal to those of the previous section. for this first simulation, the values for the cost weights are γ_d = 0.5 and γ_s = 0.5. figure 5.10 shows that, despite having distinct resolution philosophies, the curves obtained by the three solvers are similar, which reinforces the confidence in the result. the optimal functional is 0.0470, 0.0498 and 0.0445 for dotcvp, oc-ode and backward-forward, respectively. the last solver achieved the best value for the functional. this is expected, since its formulation contains more information about the oc problem, because the adjoint system is supplied. the study considers three situations: a, b and c. situation a, which was previously presented, regards both perspectives in the functional (infected humans and insecticide application). situation b concerns only infected humans, whereas case c only considers insecticide campaigns. table 5.2 summarizes the three situations. since the three solvers presented similar solutions, only one of them, dotcvp, was chosen to solve these cases. figures 5.12, 5.13 and 5.14 show the results for the optimal control c, infected humans and total costs, respectively.
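the backward-forward iteration can be sketched on a toy scalar problem (one state instead of eight; the dynamics, cost and control bounds below are illustrative assumptions, not the thesis' system): solve the state forward, solve the adjoint backward from the transversality condition, minimize the hamiltonian pointwise, and relax the control update until it stabilizes.

```python
import numpy as np
from scipy.integrate import solve_ivp

# toy problem: minimize J = int_0^1 (x^2 + c^2)/2 dt subject to
# x' = x + c, x(0) = 1, with c(t) in [-2, 0]
T, N = 1.0, 200
ts = np.linspace(0.0, T, N + 1)
c = np.zeros(N + 1)                        # initial guess for the control

for sweep in range(100):
    # 1) state forward with the current control
    xs = solve_ivp(lambda t, x: x + np.interp(t, ts, c),
                   (0.0, T), [1.0], t_eval=ts, rtol=1e-8).y[0]
    # 2) adjoint backward from the transversality condition lambda(T) = 0,
    #    with lambda' = -dH/dx = -x - lambda
    lam = solve_ivp(lambda t, l: -np.interp(t, ts, xs) - l,
                    (T, 0.0), [0.0], t_eval=ts[::-1], rtol=1e-8).y[0][::-1]
    # 3) pointwise minimization of H over c in [-2, 0]: c* = clip(-lambda)
    c_new = np.clip(-lam, -2.0, 0.0)
    if np.max(np.abs(c_new - c)) < 1e-8:
        break
    c = 0.5 * (c + c_new)                  # relaxed update for stability

# trapezoidal value of the cost, compared with the uncontrolled c = 0 case
f = (xs**2 + c**2) / 2.0
J_opt = float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(ts)))
J_uncontrolled = (np.e**2 - 1.0) / 4.0     # closed form for c = 0
```

the same three-step sweep carries over to the eight-state problem, with the clip interval replaced by [0, 1] and the adjoint system (5.11) in step 2.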
from the medical perspective (case b), when only the costs related to ill people (absenteeism, drugs, ...) are considered, the number of infected is the lowest; however, a huge quantity of insecticide is used, because it is considered cheaper. on the other hand, when only the economic perspective is considered (case c), the treatment of people is neglected: the optimal control is low, but the number of infected humans is high. the total cost is higher when both perspectives are considered. epidemiological modelling has largely focused on identifying the mechanisms responsible for epidemics, but has taken little account of economic constraints when analyzing control strategies. economic models have given insight into optimal control under the constraints imposed by limited resources, but they frequently ignore the spatial and temporal dynamics of the disease. nowadays the combination of epidemiological and economic factors is essential. the bioeconomic approach requires an equilibrium between economic and epidemiological parameters in order to provide an efficient disease control reflecting the nature of the epidemic. for this, the study goes on implementing both perspectives, but taking into account distinct weights for the parameters associated with the variables i_h and c. table 5.3 summarizes these approaches, listing the different weights for the functional. case d studies a situation where a lack of insecticide in a country could be a reality and, as a consequence, its market value is high. this could happen due to an unprecedented outbreak for which the authorities were not prepared, or even for financial reasons, if the government does not have the financial viability for this kind of measure. in case e, once again the human perspective gains strength: as human life and quality of life are expensive goods, it was considered more expensive to treat humans than to apply insecticide.
the analysis of figures 5.15 to 5.17 is consistent with what we expect in reality. in case d, as insecticide is expensive, the optimal control function is lower than in the other perspective; as a consequence, the number of infected people is higher. in case e, where the human factor is preponderant, the number of infected humans is low, but the expenses with insecticide are higher. curiously, the total costs in cases d and e are of the same order of magnitude, with a slightly higher cost for case e. the total cost is reported in table 5.4. when both perspectives are considered, the total cost is higher than with a single perspective. for a last analysis, a mathematical perspective was carried out: what values should γ_d and γ_s have in order to minimize the functional? let us call this perspective case f. here, we want to minimize not only the control c, but also the parameters γ_d and γ_s, enforcing the equality constraint γ_d + γ_s = 1. the corresponding figures show the comparison of this case with the first one. as expected, the total cost and the number of infected humans are the lowest ones. giving freedom to the parameters, it is possible to see that the optimal control function is not periodic (as studied in the previous section), but it still gives a practical solution for applying insecticide. in this chapter a model based on two populations, humans and mosquitoes, with insecticide control was presented. it has been shown that, as time goes by and depending on several parameters, the outbreak can disappear (leading to a dfe) or the disease can become an endemic one (leading to an ee). assuming that the parameters are fixed, the only variable that can influence this threshold is the control variable c. it has been shown that with a steady insecticide campaign it is possible to reduce the number of infected humans and mosquitoes, and to prevent an outbreak that could transform an epidemiological episode into an endemic disease.
for a steady campaign, it has been proven that c = 0.157 is enough to maintain r_0 below unity. however, this type of control is difficult to implement. a pulse insecticide campaign was studied to circumvent this difficulty. it has been shown that applying insecticide every 6/7 days is a better strategy for health authorities to implement, with the same efficacy level and financial costs. finally, oc theory was used to find the best control function for the insecticide. the optimal function varies, giving a different answer depending on the main goal to be reached, from an economic or a human-centered perspective. as future work it is important to study different kinds of controls. the accelerated increase in mosquito resistance to several chemical insecticides, and the damage these cause to the environment, have resulted in the search for new control alternatives. among the available alternatives, the use of bacillus thuringiensis israelensis (bti) has been adopted by several countries [89]. laboratory testing shows that bti has a high larvicidal activity, and its mechanism of action is based on the production of an endotoxin protein that, when ingested by the larvae, causes death. to minimize outbreaks, educational programmes that are customized for different levels of health care and that reflect local capacity should be supported and implemented widely. people should be instructed to minimize the number of potential breeding places for the mosquito. educational campaigns can be included as an extra control parameter in the model. this chapter was based on work available in the peer-reviewed journal [118] and the peer-reviewed conference proceedings [113, 117]. a new model with six mutually-exclusive compartments related to dengue disease is now presented. in this model there are three vector control tools: insecticides (larvicide and adulticide) and mechanical control. the human data for the model is again related to cape verde.
due to the rapid development of the outbreak on the islands, only a few control measures were taken, and they were not quantified. in this chapter, some of these measures are simulated and their consequences analyzed. in chapter 5, a model with eight compartments and a single control was analyzed. however, after discussion with some researchers in this area, many of them suggested removing the exposed compartment, for three main reasons: first, it is difficult to collect data for this compartment, since the disease at this stage does not show symptoms; second, its curve is similar to that of the infected compartment with only a shift in time, bringing no novelty to the model but possible difficulties to the numerical resolution; and finally, as the main goal is to study the effects of several controls centered on infected humans, this compartment plays a secondary role. thus, it was decided to remove the exposed compartments in humans and mosquitoes, adjusting the other parameters to this new model and including three controls. taking into account the model presented in [43, 44] and the considerations of [111, 113], a new model, more adapted to the dengue reality, is proposed. the notation used in the mathematical model includes three epidemiological states for humans: susceptible (s_h), infected (i_h) and resistant (r_h). it is assumed that the total human population (n_h) is constant, with n_h = s_h(t) + i_h(t) + r_h(t) at any time t. the population is homogeneous, which means that every individual of a compartment is homogeneously mixed with the other individuals. immigration and emigration are not considered. there are three other state variables, related to the female mosquitoes: the aquatic phase (a_m), susceptible mosquitoes (s_m) and infected mosquitoes (i_m). due to the short lifespan of mosquitoes, there is no resistant phase. homogeneity between host and vector populations is assumed, which means that each vector has an equal probability to bite any host. humans and mosquitoes are assumed to be born susceptible.
to analyze the effect of campaigns in the fight against the disease, three controls are considered: the proportion of adulticide, the proportion of larvicide, and the proportion of mechanical control, 0 < α ≤ 1. larval control targets the immature mosquitoes living in water before they become biting adults. a soil bacterium, bacillus thuringiensis israelensis (bti), is applied from the ground or by air to larval habitats. this bacterium is used because, when properly applied, it has virtually no effect on non-target organisms. the control of adult mosquitoes is necessary when mosquito populations cannot be treated in their larval stage. it is the most effective way to eliminate adult female mosquitoes that are infected with human pathogens. depending on the size of the area to be treated, either trucks for ground adulticide treatments or aircraft for aerial adulticide treatments can be used. the purpose of mechanical control is to reduce the number of larval habitats available to mosquitoes. the mosquitoes are most easily controlled by treating, cleaning and/or emptying containers that hold water, since the eggs of the species are laid in water-holding containers. the aim is to simulate different realities in order to find the best policy to decrease the number of infected humans. a temporal mathematical model is introduced, with mutually-exclusive compartments, to study the outbreak that occurred on the cape verde islands in 2009, improving the model described in [111].
the model uses the following parameters:

b : average daily biting (per day)
β_mh : transmission probability from i_m (per bite)
β_hm : transmission probability from i_h (per bite)
1/µ_h : average lifespan of humans (in days)
1/η_h : mean viremic period (in days)
1/µ_m : average lifespan of adult mosquitoes (in days)
ϕ : number of eggs at each deposit per capita (per day)
1/µ_a : natural mortality of larvae (per day)
η_a : maturation rate from larvae to adult (per day)
m : female mosquitoes per human
k : number of larvae per human

the dengue epidemic is modelled by the nonlinear time-varying state equations (6.1) with the initial conditions (6.2). due to biological reasons, only nonnegative solutions of the differential system are acceptable. more precisely, it is necessary to study the solution properties of the system (6.1)-(6.2) in a closed set ω. it can be verified that ω is a positively invariant set with respect to (6.1)-(6.2); the proof of this statement is similar to the one in [118]. the system (6.1)-(6.2) has at most three biologically meaningful equilibrium points (cf. theorem 6.1.1). a point is said to be an equilibrium point for system (6.1)-(6.2) if it makes the right-hand side of the system vanish. an equilibrium point e is biologically meaningful if and only if e ∈ ω. the biologically meaningful equilibrium points are said to be disease free or endemic depending on i_h and i_m: if there is no disease in either population, humans or mosquitoes (i_h = i_m = 0), then the equilibrium point is a disease free equilibrium (dfe); otherwise, if i_h > 0 or i_m > 0, the equilibrium point is called endemic. theorem 6.1.1. system (6.1)-(6.2) admits at most three biologically meaningful equilibrium points: at most two dfe points and at most one endemic equilibrium point. more precisely, let m, ξ and χ denote the quantities defined in the proof. if m ≤ 0, then there is only one biologically meaningful equilibrium point, e*_1, which is a dfe point.
if m > 0 with ξ ≥ χ, then there are two biologically meaningful equilibrium points, e*_1 and e*_2, both dfe points. if m > 0 with ξ < χ, then there are three biologically meaningful equilibrium points, e*_1, e*_2 and e*_3, where e*_1 and e*_2 are dfes and e*_3 is endemic. proof. system (6.3) has four solutions, easily obtained with a computer algebra system like maple: e*_1, e*_2, e*_3 and e*_4. the equilibrium point e*_1 is always a dfe because it always belongs to ω with i_h = i_m = 0. in contrast, e*_4 is never biologically realistic because it always has some negative coordinates. the other two equilibrium points, e*_2 and e*_3, are biologically realistic only for certain values of the parameters. the equilibrium e*_2 is biologically realistic if and only if m ≥ 0, in which case it is a dfe. for m ≤ 0, the third equilibrium e*_3 is not biologically realistic. if m > 0, then three situations can occur with respect to e*_3: if ξ = χ, then e*_3 degenerates into e*_2, which means that e*_3 is the dfe e*_2; if ξ > χ, then e*_3 is not biologically realistic; otherwise, one has e*_3 ∈ ω with i_h ≠ 0 and i_m ≠ 0, which means that e*_3 is an endemic equilibrium point. by algebraic manipulation, m > 0 is equivalent to a condition related to the basic offspring number for mosquitoes. thus, if m ≤ 0, then the mosquito population will collapse and the only equilibrium for the whole system is the trivial dfe e*_1. if m > 0, then the mosquito population is sustainable. from a biological standpoint, the equilibrium e*_2 is more plausible, because the mosquito is in its habitat, but without the disease. an important measure of the transmissibility of the disease is now introduced: the basic reproduction number. it provides an invasion criterion for the initial spread of the virus in a susceptible population. for this case the following result holds. theorem 6.1.2.
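the equilibrium computation delegated to maple can be illustrated with a computer algebra sketch on a reduced subsystem: the disease-free mosquito dynamics (aquatic phase and susceptible adults), assumed below in a logistic-recruitment form, stands in for solving the full system (6.3). the trivial and nontrivial equilibria fall out directly, and the nontrivial one is positive exactly when a basic offspring number exceeds one, mirroring the role of the condition m > 0.

```python
import sympy as sp

# disease-free mosquito subsystem (illustrative stand-in for (6.3)):
#   A'  = phi * (1 - A/(k*N_h)) * S_m - (eta_a + mu_a) * A   (aquatic phase)
#   Sm' = eta_a * A - mu_m * S_m                             (adult emergence)
A, Sm = sp.symbols('A S_m')
phi, eta_a, mu_a, mu_m, k, Nh = sp.symbols('phi eta_a mu_a mu_m k N_h',
                                           positive=True)

equilibria = sp.solve(
    [phi * (1 - A / (k * Nh)) * Sm - (eta_a + mu_a) * A,
     eta_a * A - mu_m * Sm],
    [A, Sm], dict=True)

# basic offspring number: the nontrivial equilibrium is positive
# exactly when Q > 1
Q = phi * eta_a / ((eta_a + mu_a) * mu_m)
A_nontrivial = k * Nh * (1 - 1 / Q)
```

the two solutions returned are the trivial equilibrium (no mosquitoes) and the mosquito-persistence equilibrium, the reduced analogues of e*_1 and e*_2.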
the basic reproduction number r_0 associated to the differential system (6.1)-(6.2) is given by (6.5). proof. in agreement with [42], only the epidemiological compartments that have new infections, i_h and i_m, are considered. the two differential equations related to these two compartments can be rewritten in terms of f, the rate of production of new infections, and v, the transition rates between states. the quantity j_f(x) j_v^{-1}(x), built from the corresponding jacobian matrices, gives the total production of new infections over the course of an infection. its largest eigenvalue gives the fastest growth of the infected population, which means that r_0 is the spectral radius of the matrix j_f(x) j_v^{-1}(x) at a dfe point. maple was used to obtain the basic reproduction number: r_0 in (6.5) is obtained by replacing the susceptible values in (6.6) by those of the dfe e*_2. the model has two different populations (host and vector), and the basic reproduction number reflects the infection transmitted from host to vector and vice-versa; accordingly, r_0 can be seen as r_0 = (r_hm × r_mh)^{1/2}. if r_0 < 1, then, on average, an infected individual produces less than one new infected individual over the course of its infectious period, and the disease cannot grow. conversely, if r_0 > 1, then each infected individual produces, on average, more than one new infection, and the disease can invade the population. the endemic case is covered by the following result. proof. the only solution of (6.3) with i_h > 0 or i_m > 0, i.e. the only endemic equilibrium, is e*_3. this occurs, in agreement with theorem 6.1.1, in the case m > 0 and χ > ξ. the condition χ > ξ is equivalent, by theorem 6.1.2, to r_0 > 1. using the methods in [42, 83], it is possible to prove that if r_0 ≤ 1, then the dfe is globally asymptotically stable in ω, and thus the vector-borne disease always dies out; if r_0 > 1, then the unique endemic equilibrium is globally asymptotically stable in ω, so that the disease, if initially present, will persist at the unique endemic equilibrium level.
assuming that the parameters are fixed, the threshold r_0 is influenced by the control values. figure 6.2 gives this relationship. it is possible to see that the control c_m is the one that most influences keeping the basic reproduction number below unity. moreover, the control in the aquatic phase alone is not enough to maintain r_0 below unity: an application close to 100% would be required. the simulations were carried out using the following numerical values: n_h = 480000, b = 0.8, β_mh = 0.375, β_hm = 0.375, µ_h = 1/(71 × 365), η_h = 1/3, µ_m = 1/10, ϕ = 6, µ_a = 1/4, η_a = 0.08, m = 3, k = 3. the initial conditions for the problem were: s_h0 = n_h − 10, i_h0 = 10, r_h0 = 0, a_m0 = k n_h, s_m0 = m n_h, i_m0 = 0. with these values, one has m > 0. as in the previous chapter, the values related to humans describe the reality of cape verde [32], and the information about mosquitoes is based on brazil [30, 47]. all computations consider a time interval of one year. although the final time was t_f = 365 days, the figures show the graphics in suitable windows, in order to provide a better analysis. all the simulations and graphics were done in matlab. to solve the differential equation system, the ode45 routine was used. this function implements a runge-kutta method with a variable time step for efficient computation (see [110] for more details about the code). the number of infected humans given by the model (6.1)-(6.2) is higher than what really happened in cape verde. as far as it was possible to investigate in the local news, the government of cape verde did its best to banish the mosquito, with media campaigns appealing to people to remove or cover all containers that could serve as breeding sites, and to use insecticide in critical areas. however, it was not possible to quantify those efforts in precise terms. next follows a set of simulations using different controls.
in each figure, only one control is used, continuously, which means that the others are not applied. the aim of this simulation is to see the importance of each control and what repercussions it has on the model. figures 6.5 and 6.6 concern the adulticide control, figures 6.7 and 6.8 the larvicide control, and figures 6.9 and 6.10 the mechanical control. using a small quantity of each control, the number of infected people falls dramatically. although all graphs display five simulations, in some cases the curves are so close to zero that it is difficult to distinguish them. figures 6.5 and 6.6 show that excellent results for the human population are obtained by covering only 25% of the country with insecticide for adult mosquitoes. the simulations were done considering that aedes aegypti does not become resistant to the insecticide and that it is financially possible to apply insecticide the whole time. figures 6.7-6.10 are related to the controls applied to the aquatic phase of the mosquito. in these graphics the controls were studied separately, but one is closely related to the other. the application of these controls is not sufficient to decrease the number of infected humans to zero. in the next section, using an oc strategy, we will find the best solution for the controls. epidemiological models may give some basic guidelines for public health practitioners, comparing the effectiveness of different potential management strategies. in reality, a range of constraints and trade-offs may substantially influence the choice of practical strategy, and therefore their inclusion in any modelling analysis may be important. frequently, epidemiological models need to be coupled with economic considerations, such that control strategies can be judged through a holistic cost-benefit analysis.
control of livestock disease is a scenario where cost-benefit analysis can play a vital role in choosing between cheap, weak controls that lead to a prolonged epidemic, or expensive but more effective controls that lead to a shorter outbreak. normalizing the previous ode system (6.1)-(6.2), we obtain a dimensionless system with the corresponding initial conditions. the cost functional considered penalizes the disease and the three controls, where γ d , γ s , γ l and γ e are weights related to the costs of the disease, adulticide, larvicide and mechanical control, respectively. in a first approach to this problem, it is assumed that all weights are the same, which means γ d = γ s = γ l = γ e = 0.25 (case a). the oc problem was solved using two different packages: dotcvp [67] and muscod-ii [79]. the mathematical formulation of the sir+asi problem, for both packages, is available in [110]. the simulation behavior is similar for both, and we decided to show only the dotcvp results. the optimal functions for the controls are given in figure 6.13. the adulticide was the control that most influenced the decrease of the number of infected people and mosquitoes, matching the results obtained for the basic reproduction number in section 6.1. therefore, the adulticide was the most used. we believe that the other controls do not assume an important role in the epidemic episode, because all the events happened over a short period of time, which means that adulticide has more impact. however, the mosquito control in the aquatic phase cannot be neglected. in situations of longer epidemic episodes, or even in an endemic situation, the larval control represents an important tool. figure 6.14 presents the number of infected humans. comparing the optimal control case with a situation with no control, the number of infected people decreased considerably.
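the explicit expression of the cost functional is omitted in this text; a common quadratic choice consistent with the description (a disease term plus the three control terms, weighted by γ d , γ s , γ l , γ e ) is assumed in the sketch below, which evaluates such a functional numerically with the trapezoidal rule. the trajectory and control values are purely illustrative.

```python
# numeric evaluation of a cost functional of the ASSUMED quadratic form
#   J = integral of [ gD*ih^2 + gS*cm^2 + gL*cA^2 + gE*(1-alpha)^2 ] dt
# (the exact expression is omitted in the text; this shape is a common
# choice for oc problems and is used here only for illustration).
import numpy as np

def cost(t, ih, cm, cA, alpha, gD=0.25, gS=0.25, gL=0.25, gE=0.25):
    """composite trapezoidal rule applied to the assumed integrand."""
    integrand = gD * ih**2 + gS * cm**2 + gL * cA**2 + gE * (1 - alpha)**2
    return np.sum((integrand[1:] + integrand[:-1]) / 2 * np.diff(t))

# toy trajectory: an outbreak-shaped infected fraction and constant controls
t = np.linspace(0, 365, 366)
ih = 0.05 * np.exp(-((t - 60) / 30) ** 2)  # normalized infected fraction
cm = np.full_like(t, 0.25)                 # 25% adulticide coverage
cA = np.zeros_like(t)                      # no larvicide
alpha = np.ones_like(t)                    # no mechanical control

J = cost(t, ih, cm, cA, alpha)             # case a: equal weights 0.25
```

changing the weights gD, gS, gL, gE reproduces the different bioeconomic perspectives (cases a, b and c) discussed below.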
besides, in the situation where oc is used, the peak of infected people is lower, which facilitates the work in health centers, because they can provide better medical monitoring. a second analysis was made, taking into account different weights in the functional. table 6.1 summarizes the weights chosen for each perspective: not only economic issues (cost of insecticides and educational campaigns), but also human issues are considered. in case a, all costs were equal. in case b, more weight is given to the infected people, considering that the treatment and the absenteeism from work are very prejudicial to the country, when compared with the cost of insecticides and educational campaigns. in case c, the costs of killing mosquitoes and of educational campaigns have more impact on the economy. higher total costs were obtained when human life had more weight than control measures, as can be checked in table 6.1. figure 6.15 shows the number of infected humans in each bioeconomic perspective. we can see that case a and case c are similar. this can be explained by the low weight given to the cost of treatment (cases a and c) when compared with the heavy weight given in case b. figure 6.16 presents the behavior of the controls for cases a, b and c. again, as adulticide is the control that has more influence on the model, this is the control that varies most when the weights are changed. a third analysis was made: the functional was changed in order to study the effects of each control when considered separately. therefore, the new functional also considers bioeconomic perspectives, but only includes two variables: the costs with infected humans (with γ d = 0.5) and the costs with only one control (with γ i = 0.5, i ∈ {s, l, e}).
thus, figure 6.17 presents the proportion of adulticide (a) and infected humans (b), when the functional considers only that control. dengue disease breeds, even in the absence of fatal forms, significant economic and social costs: absenteeism, debilitation and medication. to observe and to act at the onset of an epidemic could save lives and resources to governments. moreover, the under-reporting of dengue cases is probably the most important barrier to obtaining an accurate assessment. a compartmental epidemiological model for dengue disease, composed of a set of differential equations, was presented. simulations based on clean-up campaigns to remove the vector breeding sites, and also on the application of insecticides (larvicide and adulticide), were made. it was shown that even with a low, although continuous, index of control over time, the results are surprisingly positive. the adulticide was the most effective control, since with a low percentage of insecticide the basic reproduction number is kept below unity and the number of infected humans is smaller. however, relying only on adulticide is a risky decision. in some countries, such as mexico and brazil, the prolonged use of adulticides has been increasing the mosquitoes' tolerance to the product, or they even become completely resistant. in countries where dengue is a permanent threat, governments must act with differentiated tools. it will be interesting to analyze these controls in an endemic region and with several outbreaks. we believe that the results will be quite different. aedes aegypti eradication is not considered to be feasible and, from the environmental point of view, is not desirable. the aim is to reduce the mosquito density and, simultaneously, to raise the level of immunity of the human population. the increase of population herd immunity can be reached in two ways: increasing the number of people resistant to the disease, which implies an increase of infected individuals, or with a vaccination campaign.
no clinical cure is available for dengue, and no vaccine is commercially available, but efforts are underway to develop one [10, 68]. the prevention of dengue by immunization seems to be technically feasible, and progress is being made in the development of vaccines that may protect against all four dengue viruses. in the next chapter, a model of this disease with a vaccine simulation as a new strategy to fight the disease will be analyzed. this chapter was based on work accepted in the peer-reviewed journal [115] and the peer-reviewed conference proceedings [116]. as the development of a dengue vaccine is ongoing, a hypothetical vaccine is simulated as an extra protection to the population. in a first phase, the vaccination process is studied as a new compartment in the model, and some different types of vaccines are simulated: pediatric and random mass vaccines, with distinct levels of efficacy and durability. in a second step, the vaccination is seen as a control variable in the epidemiological process. in both cases, epidemic and endemic scenarios are included in order to analyze distinct outbreak realities. in 1760, the swiss mathematician daniel bernoulli published a study on the impact of immunization with cowpox upon the expectation of life of the immunized population [119]. the process of protecting individuals from infection by immunization has become routine, with historical success in reducing both mortality and morbidity. the impact of vaccination may be regarded not only as an individual protective measure, but also as a collective one. while direct individual protection is the major focus of a mass vaccination program, the effects on the population also contribute indirectly to individual protection through herd immunity, providing protection for unprotected individuals [48] (see the scheme in figure 7.1).
this means that, in a large neighborhood of vaccinated people, a susceptible individual has a lower probability of coming into contact with the infection, making it more difficult for the disease to spread, which relieves health facilities and can break the chain of infection. vector control remains the only available strategy against dengue. despite integrated vector control with community participation, along with active disease surveillance and insecticides, there are only a few examples of successful dengue control [23]. besides, the level of resistance of aedes aegypti to insecticides has increased, which implies shorter intervals between treatments, and only a few insecticide products are available in the market, due to the high costs of development and registration and the low returns [76]. dengue vaccines have been under development since the 1940s, but due to the limited appreciation of the global disease burden and of the potential markets for dengue vaccines, industry interest languished throughout the 20th century. however, in recent years, the development of dengue vaccines has accelerated dramatically with the increase in dengue infections, as well as the prevalence of all four circulating serotypes. faster development of a vaccine became a serious concern [94]. economic analyses are now conducted periodically to guide public support for vaccine development in both industrialized and developing countries, including a previous cost-effectiveness study of dengue [28, 123, 126]. the authors compared the cost of the disease burden with the possibility of making a vaccination campaign; they suggest that there is a potential economic benefit associated with promising dengue interventions, such as dengue vaccines and vector control innovations, when compared to the cost associated with the disease treatments.
constructing a successful vaccine for dengue has been challenging: the knowledge of disease pathogenesis is insufficient and, in addition, the vaccine must protect simultaneously against all serotypes in order not to increase the level of dhf [128]. nevertheless, several promising approaches are being investigated in both academic and industrial laboratories. vaccine candidates include live attenuated vaccines obtained via cell passages or by recombinant dna technology (such as those being developed by the us national institutes of allergy and infectious diseases, inviragen, walter reed army institute of research/glaxosmithkline, and sanofi pasteur) and subunit vaccines (such as those developed by merck/hawaii biotech) [60, 138]. recent studies indicate that, given the progress in the clinical development of sanofi pasteur's live attenuated tetravalent chimeric vaccine, a vaccine could be licensed as early as 2014 [87]. the team is carrying out an efficacy study of a vaccine covering four serotypes on 4000 children aged four to eleven years old in muang district, thailand. at this time, the features of a dengue vaccine are mostly unknown. so, in this chapter we opt to present a set of simulations with different types of vaccines, and we have also explored the vaccination process under two different perspectives. the first one uses a new compartment in the model, and several kinds of vaccination are considered. a second perspective is studied using the vaccination process as a disease control in the mathematical formulation. in this case, the theory of oc is applied. both methods assume a continuous vaccination strategy. in this section, a new compartment v is added to the previous sir model related to the human population. this new compartment represents the group of the human population that is vaccinated, in order to distinguish the resistance obtained through vaccination from the one achieved by disease recovery.
two forms of random vaccination are possible: the most common for human diseases is pediatric vaccination, to reduce the prevalence of an endemic disease; the alternative is random vaccination of the entire population in an outbreak situation. in both types, the vaccination can be considered perfect, conferring 100% protection for life, or else imperfect. this last case can be due to the difficulty of producing an effective vaccine, the heterogeneity of the population, or even the limited life span of the vaccine. for many human infections, such as measles, mumps, rubella, whooping cough and polio, there has been much focus on vaccinating newborns or very young infants. dengue can be a serious candidate for this type of vaccination. in the sv ir model, a continuous vaccination strategy is considered, where a proportion p (with 0 ≤ p ≤ 1) of the newborns is vaccinated. this model also assumes that the permanent immunity acquired through vaccination is the same as the natural immunity obtained by infected individuals eliminating the disease naturally. the population remains constant, i.e., n h = s h + v h + i h + r h . the new model for the human population is represented in figure 7.2. we are assuming that it is a perfect vaccine, which means that it confers life-long protection. as a first step, it is necessary to determine the basic reproduction number without vaccination (p = 0). theorem 7.2.1. the basic reproduction number, r 0 , associated to the differential system (7.1) without vaccination is given by (7.2). proof. the proof of this theorem is similar to the one in the previous chapter (see the proof of theorem 6.1.2), with the appropriate substitutions. in this chapter, we make all the simulations in two scenarios: an epidemic and an endemic situation (programming codes available in [110]). for these, the following parameter values of the differential system and initial conditions were used (tables 7.1 and 7.2).
there were two main differences between the epidemic episode and the endemic situation. firstly, in the endemic situation there was a slight decrease in the average daily biting rate b and in the transmission probabilities β mh and β hm , which could be explained by the fact that the mosquito may have more difficulty finding a naive individual. the second difference concerns the strong increase of the initial human population that is resistant to the disease. this may be explained by the fact that the disease, in an endemic situation, has already created an immune resistance to the infection, i.e., the population already has herd immunity. with these values we obtain approximately r 0 = 2.46 and r 0 = 1.29 for the epidemic and endemic scenarios, respectively. during an outbreak, the disease transmission assumes different behaviors, according to the distinct scenarios, as can be seen in figure 7.3. in one year, the peak in the epidemic situation could reach more than 80000 cases. in contrast, in the endemic situation the curve of infected individuals has a smoother behavior and reaches a peak of fewer than 3000 cases. figure 7.4 relates to the mosquito population. in the endemic scenario, because a substantial part of the human population is resistant to the disease, the infected mosquitoes bite a considerable percentage of resistant hosts and, as a consequence, the disease is not transmitted. suppose that at time t = 0 a proportion p of newborns is vaccinated with a perfect vaccine that causes no side effects. since this proportion p is now immune, r 0 is reduced, creating a new basic reproduction number. definition 19 (basic reproduction number with pediatric vaccination). the basic reproduction number with pediatric vaccination, r p 0 , associated to the differential system (7.1), is given in terms of r 0 as defined in (7.2). observe that r p 0 ≤ r 0 . equality is only achieved when p = 0, i.e., when there is no vaccination.
the constraint r p 0 < 1 implicitly defines a critical vaccination portion p > p c that must be achieved for eradication. since vaccination entails costs, choosing the smallest coverage that achieves eradication is the best option. this way, the entire population does not need to be vaccinated in order to eradicate the disease. this phenomenon is called herd immunity. vaccinating at the critical level, p c , does not instantly lead to disease eradication. the immunity level within the population requires time to build up, and at the critical level it may take a few generations before the required herd immunity is achieved. thus, from a public health perspective, p c acts as a lower bound on what should be achieved, with higher levels of vaccination leading to a more rapid elimination of the disease. figure 7.5 shows the simulations related to the proportion of newborns vaccinated (p = 0, 0.25, 0.50, 0.75, 1) in both scenarios. notice that at time t = 0 no person was vaccinated. in the epidemic situation, as the outbreak reaches its peak at the beginning of the year, the proportion of newborns vaccinated by that time is minimal and cannot influence the curve of infected individuals, giving the optical illusion of a single curve. on the other hand, in the endemic case, as the outbreak occurs later, the vaccination campaign starts to produce effects, decreasing the total number of sick humans. this last graphic illustrates that a vaccination campaign centered on newborns is a bet on the future of a country, but does not produce instant results to fight the disease. to produce immediate results, it is necessary to use random mass vaccination, which means that it is necessary to vaccinate a significant part of the population. a mass vaccination program may be initiated whenever there is an increase of the risk of an epidemic.
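the formula for p c is omitted in this text; a small sketch of the threshold computation, under the common assumption that pediatric coverage scales the reproduction number as r p 0 = (1 − p) r 0 (for vector-borne models the relation may instead involve r 0 squared, depending on how r 0 is defined), gives the critical coverages for the two scenarios reported above.

```python
# critical pediatric vaccination coverage, ASSUMING the common relation
# R_p = (1 - p) * R0; the exact formula is omitted in the text, and for
# vector-borne models the relation may instead involve R0**2, depending
# on how R0 is defined.

def critical_coverage(R0):
    """smallest newborn fraction p such that (1 - p) * R0 < 1."""
    if R0 <= 1:
        return 0.0  # the disease dies out even without vaccination
    return 1 - 1 / R0

# the two scenarios reported in the text
p_epidemic = critical_coverage(2.46)  # roughly 0.59
p_endemic = critical_coverage(1.29)   # roughly 0.22
```

under this assumption, roughly 59% of newborns would need vaccination in the epidemic scenario but only about 22% in the endemic one, consistent with the qualitative discussion above.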
in such situations, there is a competition between the exponential increase of the epidemic and the logistical constraints upon mass vaccination. for most human diseases it is possible, and more efficient, not to vaccinate those individuals who have recovered from the disease, because they are already protected. another situation could be the introduction of a new vaccine in a population that lives in an endemic situation. let us consider the control technique of constant vaccination of susceptibles. in this scheme, a fraction 0 ≤ ψ ≤ 1 of the entire susceptible population, not just newborns, is being continuously vaccinated. it is assumed that the permanent immunity acquired through vaccination is the same as the natural immunity obtained by infected individuals in recovery. the epidemiological scheme is presented in figure 7.6. the mathematical formulation for the human population (the differential equations related to the mosquito remain equal to the previous subsection) is given by system (7.3). for this model, we define a new basic reproduction number. definition 20 (basic reproduction number with random mass vaccination [145]). the basic reproduction number with random mass vaccination, r ψ 0 , associated to the differential system (7.3), is r ψ 0 = r 0 µ h /(µ h + ψ), where r 0 is defined in (7.2). comparing this model with the constant vaccination of newborns model, it is apparent that, instead of constantly vaccinating a portion of newborns, a part of the entire susceptible population is now being continuously vaccinated. since the natural birth rate µ h is usually small, the fraction pµ h of newborns being continuously vaccinated will be small, whereas in this model a larger group of susceptibles, ψs h , can be continuously vaccinated. due to this, we expect that this model should require a smaller proportion ψ to achieve eradication. notice that r ψ 0 ≤ r 0 . equality is only achieved in the limit ψ = 0, that is, when there is no vaccination.
the constraint r ψ 0 < 1 implicitly defines a critical vaccination portion ψ > ψ c that must be achieved for eradication, where ψ c = (r 0 − 1) µ h . observe that, although the calculations were done for a period of 365 days, the figures only show suitable windows, in order to provide a better analysis. in both scenarios, even with a small coverage of the population, vaccination dramatically decreases the number of infected. the epidemic scenario changed from about 80000 cases (with no vaccination, figure 7.3) to fewer than 1200 cases when vaccinating only 5% of the population. in the endemic scenario, the decrease is even more accentuated. until here, we have considered a perfect vaccine, which means that every vaccinated individual remains resistant to the disease. however, most vaccines available for the human population do not produce 100% success in the battle against disease. usually, the vaccines are imperfect, which means that a small percentage of vaccinated individuals are nevertheless infected. most of the theory on disease evolution is based on the assumption that the host population is homogeneous. individual hosts, however, may differ, and they may constitute very different habitats. in particular, some habitats may provide more resources or be more vulnerable to virus exploitation [56]. the use of models with imperfect vaccines can better describe this type of human heterogeneity. another motivation for imperfect vaccines is that, until now, we had considered models assuming that, as soon as individuals begin the vaccination process, they become immediately immune to the disease. however, the time it takes for individuals to obtain immunity by completing a vaccination process cannot be ignored, because in the meantime an individual can be infected. in this section, a continuous vaccination strategy is considered, where a fraction ψ of the susceptible class is vaccinated.
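the threshold ψ c = (r 0 − 1) µ h stated above can be evaluated directly with the mortality rate µ h = 1/(71 × 365) per day used in the simulations:

```python
# critical rate for random mass vaccination of susceptibles, using the
# threshold psi_c = (R0 - 1) * mu_h stated in the text, with the human
# mortality rate mu_h = 1/(71*365) per day from the simulations.

mu_h = 1 / (71 * 365)

def critical_psi(R0, mu=mu_h):
    """per-day vaccination rate needed to push R below unity."""
    return max(0.0, (R0 - 1) * mu)

psi_epidemic = critical_psi(2.46)
psi_endemic = critical_psi(1.29)
```

at r 0 = 2.46 the critical rate is only about 5.6 × 10⁻⁵ per day: because the whole susceptible pool is treated continuously, a tiny daily rate suffices, which illustrates why this scheme requires a far smaller ψ than newborn-only vaccination.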
the vaccination may reduce but not completely eliminate susceptibility to infection. for this reason, we consider a factor σ as the infection rate of vaccinated members. when σ = 0 the vaccine is perfectly effective, and when σ = 1 the vaccine has no effect at all. the value 1 − σ can be understood as the efficacy level of the vaccine. the new model for the human population is represented in figure 7.8, and the corresponding differential system is (7.5). for this system of differential equations, we have a new basic reproduction number. definition 21 (basic reproduction number with an imperfect vaccine [85]). the basic reproduction number with an imperfect vaccine, r σ 0 , associated to the differential system (7.5), is r σ 0 = r 0 (µ h + σψ)/(µ h + ψ), where r 0 is defined in (7.2). notice that r ψ 0 ≤ r σ 0 and that, when the vaccine is perfect (σ = 0), r σ 0 degenerates into r ψ 0 . in other words, a higher-efficacy vaccine requires a lower vaccination coverage to eradicate the disease. however, it is noted in [85] that it is much more difficult to increase the efficacy level of the vaccine than to control the vaccination rate ψ. in figure 7.7a, in the epidemic scenario with a perfect vaccine, the number of infected humans reached a maximum peak of 1200 cases per day in the worst scenario (ψ = 0.05). using an imperfect vaccine with an efficacy level of 80% (figure 7.9a), with the same values of ψ, the maximum peak increases to 9000 cases. we conclude that the production of a vaccine with a high level of efficacy has a preponderant role in the reduction of the disease spread. figures 7.9c and 7.9d reinforce this conclusion. assuming that 85% of the population is vaccinated, the number of infected cases decreases sharply as the effectiveness level of the vaccine increases. according to [34], an acceptable level of efficacy is at least 80% against all four serotypes, and 3 to 5 years for the length of protection.
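the explicit expression for r σ 0 is garbled in this text; the form used in the sketch below is an assumption chosen to match the stated limits (σ = 0 recovers r ψ 0 , σ = 1 recovers r 0 ), and the vaccination rate ψ is a hypothetical value picked for illustration.

```python
# reproduction number with an imperfect vaccine. the exact expression is
# garbled in the extracted text; the form below is an ASSUMPTION chosen to
# match the stated limits: sigma = 0 recovers R_psi, sigma = 1 recovers R0.

mu_h = 1 / (71 * 365)

def R_sigma(R0, psi, sigma, mu=mu_h):
    """assumed imperfect-vaccine reproduction number."""
    return R0 * (mu + sigma * psi) / (mu + psi)

R0 = 2.46     # epidemic scenario
psi = 1e-4    # hypothetical daily vaccination rate, above psi_c
perfect = R_sigma(R0, psi, sigma=0.0)  # perfect vaccine
leaky = R_sigma(R0, psi, sigma=0.2)    # 80% efficacy
```

under these assumptions the perfect vaccine pushes the reproduction number below unity, while the same vaccination rate with 80% efficacy leaves it just above unity, echoing the text's conclusion that vaccine efficacy plays a preponderant role.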
these are commonly considered, across countries, as the minimum acceptable levels. in the next subsection we study another type of imperfect vaccine: a vaccine that confers a limited length of protection. until the 1990s, it was a universal assumption of mathematical models of vaccination that there is no waning of vaccine-induced immunity. this assumption was routinely made because, for most of the major vaccines against childhood infectious diseases, it is approximately correct [119]. suppose that the immunity obtained through the vaccination process is temporary, and assume that immunity wanes at a rate θ. then the model for humans is given by system (7.6). definition 22 (basic reproduction number with waning immunity). the basic reproduction number with a vaccine with waning immunity, r θ 0 , associated to the differential system (7.6), is r θ 0 = r ψ 0 , where r ψ 0 is defined in (7.4). according to [125], the basic reproduction numbers r θ 0 and r ψ 0 are the same because the disease will still spread at the same rate with or without temporary immunity. however, we should expect the convergence rate to differ between random mass vaccination with and without waning immunity, since the disease will be eradicated faster in the model without waning immunity. depending on the vaccine that will be available on the market, it will be possible to choose or even combine features. in the next section we will define the vaccination process as a control system. in this section we consider a sir model for humans and an asi model for mosquitoes. the parameters remain the same as in the previous chapter. the vaccination is seen as a control variable to reduce or even eradicate the disease. let u be the control variable related to the proportion of susceptible humans that are vaccinated. a random mass vaccination with waning immunity is selected.
in this way, a parameter θ associated to the control u represents the waning immunity process. figure 7.12 shows the epidemiological scheme for the human population, using the vaccine as a control. the model is described by an initial value problem with a system of six differential equations, (7.7). the main aim is to study the optimal vaccination strategy, considering both the costs of treatment of infected individuals and the costs of vaccination. the objective is to minimize the cost functional (7.8), where γ d and γ v are positive constants representing the weights of the costs of treatment of infected individuals and of vaccination, respectively. using oc theory it is possible to solve the problem. let us consider the following set of admissible control functions: ∆ = {u(·) ∈ (l ∞ (0, t f ))|0 ≤ u(t) ≤ 1, ∀t ∈ [0, t f ]}. the problem (7.7)-(7.8), with the initial conditions given in table 7.2, admits a unique optimal solution (s * h (·), i * h (·), r * h (·), a * m (·), s * m (·), i * m (·)) associated with an optimal control u * (·) on [0, t f ], with a fixed final time t f . moreover, there exist adjoint functions λ * i (·), i = 1, . . . , 6, satisfying the adjoint system (7.9), with the transversality conditions λ i (t f ) = 0, i = 1, . . . , 6. furthermore, the optimal control has the form u * = min{1, max{0, ·}}, with the explicit expression given in (7.10). proof. the existence of optimal solutions (s * h (·), i * h (·), r * h (·), a * m (·), s * m (·), i * m (·)) associated to the optimal control u * (·) comes from the convexity of the integrand of the cost functional (7.8) with respect to the control u and the lipschitz property of the state system with respect to the state variables (s h , i h , r h , a m , s m , i m ) (for more details see [24]). according to the pontryagin maximum principle [106], if u * (·) ∈ ∆ is optimal for the problem considered, then there exists a nontrivial absolutely continuous mapping λ : [0, t f ] → r 6 , λ(t) = (λ 1 (t), λ 2 (t), λ 3 (t), λ 4 (t), λ 5 (t), λ 6 (t)), called the adjoint vector, such that the adjoint system (7.11) and the minimality condition (7.13) hold almost everywhere on [0, t f ].
moreover, the transversality conditions λ i (t f ) = 0, i = 1, . . . , 6, hold. the system (7.9) is derived from (7.12), and the optimal control (7.10) comes from the minimality condition (7.13). the simulations were carried out using the values of the previous section. the system was normalized, using the same strategy as in chapter 5. it was considered that the immunity wanes at a rate θ = 0.05. the oc problem was solved using two methods: direct [9, 133] and indirect [81]. the direct method uses the cost functional (7.8) and the state system (7.7), and was solved by dotcvp [67]. the indirect method used is an iterative method with a runge-kutta scheme, solved through ode45 of matlab. figure 7.13 shows the optimal control obtained by both methods. notice that dotcvp only gives the optimal control as a piecewise constant function.

table 7.3: optimal values of the cost functional (7.8)

method                      | epidemic scenario | endemic scenario
direct (dotcvp)             | 0.07505791        | 0.00189056
indirect (backward-forward) | 0.06070556        | 0.00080618

table 7.3 shows the costs obtained by the two methods in both scenarios. the indirect method gives a lower cost. this method uses more mathematical theory about the problem, such as the adjoint system (7.11) and the optimal control expression (7.10); therefore, it makes sense that the indirect method produces a better solution. using the optimal solution as a reference, some tests were performed, regarding infected individuals and costs, when no control (u ≡ 0) or the upper control (u ≡ 1) is applied. table 7.4 shows the results for dotcvp in the three situations. in both scenarios, using the optimal vaccination strategy produces lower costs associated with the disease, when compared to doing nothing. when there is no control, the number of infected humans is higher and produces a more expensive cost functional. figure 7.14 shows the number of infected humans when different controls are considered.
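the indirect (backward-forward) iteration can be sketched on a much simpler problem than the six-equation system above: a toy sir model with vaccination control u, minimizing the integral of I + γ v u², with all parameters hypothetical. the sweep alternates a forward pass of the state, a backward pass of the adjoints from the transversality condition λ(t f ) = 0, and a relaxed update of u from the stationarity of the hamiltonian, clipped to [0, 1] as in the min/max expression of the optimal control.

```python
# sketch of the indirect (forward-backward sweep) method on a toy sir
# model with vaccination control u, minimizing
#   J(u) = integral of [ I(t) + gV * u(t)^2 ] dt,  with 0 <= u <= 1.
# this is NOT the six-equation system of the thesis; all parameters here
# are hypothetical and chosen only to illustrate the iteration.
import numpy as np

beta, eta, gV = 0.5, 0.2, 0.5
T, N = 50.0, 1000
dt = T / N
S0, I0 = 0.99, 0.01

def forward(u):
    """integrate the state system forward (explicit euler)."""
    S = np.empty(N + 1); I = np.empty(N + 1)
    S[0], I[0] = S0, I0
    for k in range(N):
        S[k + 1] = S[k] + dt * (-beta * S[k] * I[k] - u[k] * S[k])
        I[k + 1] = I[k] + dt * (beta * S[k] * I[k] - eta * I[k])
    return S, I

def backward(S, I, u):
    """integrate the adjoint system backwards; transversality lam(T) = 0."""
    lS = np.zeros(N + 1); lI = np.zeros(N + 1)
    for k in range(N, 0, -1):
        dlS = lS[k] * (beta * I[k] + u[k]) - lI[k] * beta * I[k]
        dlI = -1.0 + lS[k] * beta * S[k] - lI[k] * (beta * S[k] - eta)
        lS[k - 1] = lS[k] - dt * dlS
        lI[k - 1] = lI[k] - dt * dlI
    return lS, lI

def cost(u):
    S, I = forward(u)
    integrand = I + gV * u**2
    return np.sum((integrand[1:] + integrand[:-1]) / 2) * dt

# forward-backward sweep with relaxation on the control update
u = np.zeros(N + 1)
for _ in range(50):
    S, I = forward(u)
    lS, lI = backward(S, I, u)
    u_new = np.clip(S * lS / (2 * gV), 0.0, 1.0)  # stationarity of H in u
    u = 0.5 * u + 0.5 * u_new                     # relaxation step

J_opt, J_zero = cost(u), cost(np.zeros(N + 1))
```

the relaxation step is what makes the sweep stable in practice; replacing the euler steps with a runge-kutta scheme, as the thesis does via ode45, improves accuracy without changing the structure of the iteration.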
it is possible to see that using the upper control, which means that everyone is vaccinated, implies that just a few individuals are infected, allowing eradication of the disease. although the optimal control, in the sense of the objective (7.8), allows the occurrence of an outbreak, the number of infected individuals is much lower when compared with a situation where no one is vaccinated. we conclude that a vaccination campaign in the susceptible population, assuming a considerable efficacy level of the vaccine, can quickly decrease the number of infected people. the worldwide expansion of dengue fever is a growing health problem. a dengue vaccine is an urgent challenge that needs to be overcome, and it may become commercially available within a few years, when researchers find a formula that protects against all four dengue viruses. a vaccination program is seen as an important measure in infectious disease control and in immunization and eradication programs. in the first part of the chapter, different types of vaccine, as well as their features and some coverage thresholds, were introduced. the main idea was to study several types of vaccines in order to cover most possible features of a future vaccine. the main goal of a vaccination program is to reduce the prevalence of an infectious disease and, ultimately, to eradicate it. it was shown that the success of eradication depends on the type of vaccine as well as on the vaccination coverage. imperfect vaccines may not completely prevent infection, but they reduce the probability of being infected, thereby reducing the disease burden. in this study, all the simulations were done using epidemic and endemic scenarios, to illustrate distinct realities. a second analysis was made using an oc approach. the vaccine behaves as a new disease control variable and, when available, can be a promising strategy to fight the disease. dengue is an infectious tropical disease that is difficult to prevent and manage.
researchers agree that the development of a vaccine for dengue is a question of high priority. in the present study we have shown how a vaccine results in saving lives and, at the same time, in a reduction of the budget related to the disease. as future work, we intend to study the interaction of a dengue vaccine with other kinds of control already investigated in the literature, such as insecticide and educational campaigns [113, 118]. this chapter was based on work accepted in the peer-reviewed proceedings [114]. mathematical models can be a powerful tool to understand epidemiological phenomena. these models can be used to compare, plan, implement and evaluate several programs related to the detection, prevention and control of infectious diseases. indeed, one of the most important issues in epidemiology is to improve control strategies with the final goal of reducing or even eradicating the disease. in this thesis, some models for the spreading of diseases, particularly dengue fever, were constructed and analyzed. the links between health and sustainable development are illustrated by this disease. any attempt at predicting and preventing the disasters caused by the disease will imply a global strategy that takes into account environmental conditions, levels of poverty and illiteracy and, eventually, the degree of coverage by vaccination programs [35]. although vector control strategies were already available before the second world war, the dengue pandemic was underestimated. it became a global public health problem in the past 60 years and a major concern for the who. the main contributions of this thesis can be classified into three main categories, namely, model formulation, mathematical and computational analysis, and contributions to public health and intervention design. deterministic models for assessing the combined impact of several control measures on dengue disease were considered. in chapter 4, an old oc problem for dengue was revisited.
it was shown that new and robust tools bring refreshing solutions to the problem. some analyses of discretization schemes were carried out in order to understand the best way to implement the direct methods in future approaches. for this problem, it was better to use robust solvers than to implement higher-order discretization methods, due to the increase in the problem's dimension. in chapter 5, a seir+asei model was studied. a threshold criterion was established ensuring disease eradication and hence convergence to the so-called disease-free solution. using real data from cape verde, the application of insecticide in the country during the outbreak was simulated and its repercussions were analyzed. the study of this outbreak was performed in several phases: firstly, using a previously calculated constant control for the whole period, with the aim of keeping the basic reproduction number below unity; then, several periodic strategies were studied in order to find the best logistic approaches to implement that keep r 0 less than one; finally, an oc approach was used to compare with the previous suboptimal approaches. for this final phase, several perspectives of the problem were analyzed, including bioeconomic, medical and economic approaches. depending on the main target to achieve, the results for the control and infected individuals vary. in chapter 6, a sir+asi model for dengue was presented. the loss of two differential equations was compensated by the introduction of two more controls: in addition to adulticide, larvicide and mechanical control were introduced. similarly to chapter 5, a threshold criterion for the eradication of the disease was established. the influence of the controls on this threshold was analyzed. then, varying the controls, separately and simultaneously, an analysis of the importance/consequence of each control in the development of the disease was made.
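the eradication thresholds mentioned above rest on the basic reproduction number r 0, usually obtained as the spectral radius of the next-generation matrix f v^{-1}. as a hedged illustration (a plain seir model, not the thesis's seir+asei or sir+asi systems, with purely illustrative parameter values), the computation reduces to a closed-form expression:

```python
def seir_r0(beta, sigma, gamma, mu):
    # next-generation matrix for a standard seir model:
    #   F = [[0, beta], [0, 0]]                      (new infections enter e)
    #   V = [[sigma + mu, 0], [-sigma, gamma + mu]]  (transfers out of e, i)
    # K = F V^{-1} has a single nonzero eigenvalue, which is r0.
    det_v = (sigma + mu) * (gamma + mu)
    return beta * sigma / det_v   # spectral radius of K

# illustrative rates: 5-day latency, 7-day infectious period, 70-year lifespan
r0 = seir_r0(beta=0.5, sigma=1.0 / 5, gamma=1.0 / 7, mu=1.0 / (70 * 365))
print(r0 > 1.0)  # above threshold: the disease can invade
```

with mu = 0 the expression collapses to beta/gamma, the familiar sir value; the same recipe extends to vector-host models by enlarging f and v with the mosquito compartments.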
finally, an oc approach using different weights for the variables in the functional was studied, in order to establish the best optimal curve for each control and the respective effects on the development/eradication of the disease. in chapter 7, some simulations with different types of vaccines were made. as a dengue vaccine is not yet available, distinct hypothetical models to introduce the vaccine were studied. in section 7.2, a new compartment v h was considered, producing a new svir+asi model. this research comprised svir models with perfect vaccines (constant vaccination of newborns and constant vaccination of susceptibles) and imperfect vaccines (using a level of efficacy below 100% and also with waning immunity). in section 7.3, the vaccination process was studied as a control of the epidemic model. for this, as in the previous chapter, an oc approach was presented. all simulations were done in epidemic and endemic scenarios, in order to understand what type of repercussions each kind of vaccine could bring. in this work, care was taken to produce a mathematical analysis for all new models presented. epidemiological concepts, such as the basic reproduction number and equilibrium points, were calculated. the equilibrium points were classified and their local stability analyzed. oc theory was used in order to provide the best strategies for each model, involving direct and indirect approaches. throughout the thesis, a set of software packages was used, showing the importance of these tools in the development of some mathematical fields. to solve ode systems, codes in matlab and scilab were implemented. to calculate the equilibrium points and the basic reproduction number, mathematica and maple were used.
for the oc approach, several packages were also selected: from codes programmed in ampl and run on the neos server (ipopt, snopt, knitro, muscod-ii) to fortran codes in the linux environment (oc-ode) or even programs coded in matlab (dotcvp and indirect methods). the software choice varied during the research process, due to availability, robustness and solving speed. the main codes developed in this thesis are available at: https://sites.google.com/site/hsofiarodrigues/home/phd-codes-1 [110]. using direct and indirect methods, the solutions obtained were similar, reinforcing confidence in the results. the study provides some important epidemiological insights into the impact of vector control measures. dengue burden decreases with increasing vector control measures (adulticide, larvicide and mechanical control). furthermore, the adulticide should be the first/main measure to apply when an outbreak occurs, whereas the other two measures should be considered as long-term prevention. the last control measure to be studied was vaccination. it was shown that the vaccine, when available, could bring advantages not only in the reduction of infected individuals, but also in decreasing the disease costs. a phd thesis is always an unfinished process, but some day it is necessary to stop. therefore, some topics were not explored and can be understood as future directions of the work. a first suggestion is to use heterogeneity for both populations, dividing each one into more compartments [84]. the human population is not immunologically homogeneous, presenting groups with distinct levels of risk, related to age, sex or the presence of a disease or of immunosuppressive drugs, as would be the case for transplant patients or cancer sufferers. for example, the transmission probability in young children is higher, generally with more severe symptoms, when compared to adult transmission.
moreover, the mosquito population does not have the same behavior during the whole year, depending especially on weather conditions. temperature and humidity are key variables in vector population dynamics. it will be interesting to add some seasonality factors into the last models [45, 101]. another open question is the introduction of immigration and tourism issues. throughout this work a constant population was considered, but the addition of new individuals could induce new outbreaks. another aspect is related to the development of the disease in the presence of several serotypes. while in cape verde only one serotype was found, the interaction of several serotypes in asia is already a reality. this will induce changes in the model, not only increasing the number of variables but also producing a more significant number of dhf cases [105]. currently, portuguese researchers are developing a new mosquito repellent that could be considered as a new control for the disease. this product does not kill the mosquito, but prevents the bite by deviating the mosquito from its target, consequently breaking the chain of disease transmission. with the availability of the vaccine, it will be possible to fit a better model according to the vaccine used. one of the possibilities is using pulse vaccination, where children in certain age cohorts are periodically immunized [41]. the theoretical challenge of pulse vaccination is the a priori determination of the pulse interval for specific values of r 0 , the proportion vaccinated p and the population birth rate µ. most of the analysis and models presented in this thesis can be adapted to other vector-borne diseases (such as malaria, yellow fever, west nile virus, chikungunya, japanese encephalitis) by just fitting some variables/parameters and some initial assumptions intrinsic to the disease [18, 46, 72].
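the pulse-interval question can be explored numerically with the classical heuristic that an epidemic cannot take off while the time-averaged susceptible fraction stays below 1/r 0. the sketch below is an assumption-laden toy, not the thesis's models: per-year rates, a fraction p of susceptibles immunized at every pulse, and infection-free dynamics for the susceptible class.

```python
def mean_susceptible(pulse_interval, p, mu, years=20.0, dt=0.001):
    """time-averaged susceptible fraction under periodic pulse vaccination,
    following the infection-free dynamics ds/dt = mu * (1 - s)."""
    s = 1.0
    t, next_pulse = 0.0, pulse_interval
    total, steps = 0.0, 0
    while t < years:
        s += mu * (1.0 - s) * dt        # births replenish susceptibles
        t += dt
        if t >= next_pulse:
            s *= 1.0 - p                # pulse immunizes a fraction p of s
            next_pulse += pulse_interval
        total += s
        steps += 1
    return total / steps

r0, mu, p = 8.0, 0.02, 0.9              # illustrative values, per-year rates
s_bar_yearly = mean_susceptible(pulse_interval=1.0, p=p, mu=mu)
s_bar_rare = mean_susceptible(pulse_interval=10.0, p=p, mu=mu)
# yearly pulses satisfy the eradication heuristic; rare pulses do not
print(s_bar_yearly < 1.0 / r0, s_bar_rare < 1.0 / r0)
```

the largest interval still satisfying the condition can then be bracketed by bisection, giving a numerical stand-in for the a priori formula sought in the pulse-vaccination literature [41].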
when attempting to model epidemics and control for public health applications, there is a compelling urge to make models as sophisticated as possible, including many details about the host and the vector. although this strategy may be useful when such details are known or suitable data exist, it may lead to a false sense of accuracy when reliable information is not available. another approach is to keep the model simple and, instead of using conventional differential calculus, to apply fractional calculus to better fit the disease reality [107] or to use the general theory of time scales [15]. "if people do not believe that mathematics is simple, it is only because they do not realize how complicated life is." -john louis von neumann, 1947 box invariance in biologically-inspired dynamical systems mosquito transmissor da dengue chega a cabo verde the basic reproduction number in some discrete-time epidemic models mathematical models in biology: an introduction relatório revive 2011 -culicídeos infectious diseases of humans directly transmitted infectious diseases: control by vaccination radcliffe's ipm world textbook, chapter the sterile release method and other genetic control strategies practical methods for optimal control using nonlinear programming.
siam: advances in design and control vaccine candidates for dengue virus type 1 (den1) generated by replacement of the strutural genes of rden4 and rden4δ30 with those of den1 control strategies for tuberculosis epidemics: new models for old problems geometric orbital transfer using averaging techniques aedes aegypti: insecticidas, mecanismos de ação e resistência compartmental models in epidemiology escalas temporais e mathematica optimal control 1950 to 1985 on the optimal vaccination strategies for horizontally and vertically transmitted infectious diseases advances in mathematical population dynamics -molecules, cells and man large-scale nonlinear optimization, chapter knitro: an integrated package for nonlinear optimization optimal and sub-optimal control in dengue epidemics modeling and simulation in scilab/scicos mathematical approaches for emerging and reemerging infectious diseases: an introduction disease control priorities in developing countries optimization-theory and applications nonlinear and dynamic optimization: from theory to practice estimation of the reproduction number of dengue fever from spatial epidemic data aedes aegypti, the yellow fever mosquito: its life history, bionomics and structure economic impact of dengue fever / dengue hemorrhagic fever in thailand at the family and population levels optimization and nonsmooth analysis dynamics of the 2006/2007 dengue outbreak in brazil dengue virus net policymakers' views on dengue fever/dengue haemorrhagic fever and the need for dengue vaccines in four southeast asian countries dengue fever: mathematical modelling and computer simulation a model of dengue fever using adult mosquitoes to transfer insecticides to aedes aegypti larval habitats mathematical epidemiology of infectious diseases: model building, analysis and interpretation on the definition and the computation of the basic reproduction ratio r 0 in models for infectious diseases núcleo operacional da sociedade de informação stability 
properties of pulse vaccination strategy in seir epidemic model reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission vector control for the chikungunya disease on a temporal model for the chikungunya disease: modeling, theory and numerics dynamics of the control of aedes (stegomyia) aegypti linnaeus (diptera, culicidae) by bacillus thuringiensis var israelensis, related with temperature, density and concentration of insecticide the chaos and optimal control of cancer model with complete unknown parameters mathematical model to assess the control of aedes aegypti mosquitoes by the sterile insect technique on vaccine efficacy and reproduction numbers competitie exclusion in a vector-host model for the dengue fever strategies for mitigating an influenza pandemic on certain questions in the theory of optimal control deterministic and stochastic optimal control prevention (cdc), and i. diseases. prevention: how to reduce your risk of dengue infection ampl: a modeling language for mathematical programming imperfect vaccination: some epidemiological and evolutionary consequences user's guide oc-ode (version 1.4) snopt: an sqp algorithm for large-scale constrained optimization optimal singular rocket and aircraft trajectories dengue vaccine prospects: a step forward perspective on the basic reproductive ratio optimal control of the atmospheric reentry of a space shuttle by an homotopy method a thousand and one epidemic models the mathematics of infectious diseases mathematical understanding of infectous disease dynamics, chapter the basic epidemiology models: models, expressions for r 0 , parameter estimation, and applications sundials: suite of nonlinear and differential/algebraic equation solvers dotcvpsb, a software toolbox for dynamic optimization in systems biology scientific consultation on immunological correlates of protection induced by dengue vaccines: report from a meeting held at the world health 
organization 17-18 global-scale relationships between climate and the dengue fever vector applications of matlab: ordinary differential equations (ode) global trends in emerging infectious diseases optimal control of an hiv immunology model optimal control methods applied to disease models computation of threshold conditions for epidemiological models and global stability of the disease-free equilibrium (dfe) dynamic optimization: the calulus of variations and optimal control in economics and management modeling infectious diseases in humans and animals a contribution to the mathematical theory of epidemics optimal control theory: an introduction muscod-ii users manual the calculus of variations and optimal control. an introduction. mathematical concepts and methods in science and engineering optimal control applied to biological models a geometric approach to global-stability problems transmission dynamics and spatial spread of vector borne diseases: modelling, prediction and control svir epidemic models with vaccination strategies introduction to optimal control theory dengue vaccines regulatory pathways: a report on two meetings with regulators of developing countries modeling the dynamic transmission of dengue fever: investigating disease persistence diretrizes nacionais para prevenção e controle de epidemias de dengue optimal control strategies for speed control of permanent-magnet synchronous motor drives global stability of seir models in epidemiology optimal control of wind energy systems: towards a global approach, volume xxii of advances in industrial control mathematical biology review of dengue virus and the development of a vaccine optimal control of treatment in a mathematical model of chronic myelogenous leukemia bioecologia do aedes aegypti an introduction to optimal control with an application in disease modeling. 
in modeling paradigms and analysis of disease transmission models neos optimal advertising policy under dynamic conditions mathematical and statistical analyses of the spread of dengue optimal timing of insecticide fogging to minimize dengue cases: modeling dengue transmission among various seasonalities and transmission intensities modeling dengue outbreaks a stochastic spatial dynamical model for aedes aegypti real time computation of feedback controls for constrained optimal control problems. part 2: a correction method based on multiple shooting cocirculation of two dengue virus serotypes in individual and pooled samples of aedes aegypti and aedes albopictus larvae the mathematical theory of optimal processes fractional derivatives in dengue epidemics a survey of numerical method for optimal control a rationale for continuing mass antibiotic distributions for trachoma phd software codes optimization of dengue epidemics: a test case with different discretization schemes dynamics of dengue epidemics when using optimal control insecticide control in a dengue epidemics model optimal control of a dengue epidemic model with vaccination dengue in cape verde: vector control and vaccination modeling and optimal control applied to a vector borne disease control of dengue disease: a case study in cape verde dengue disease, basic reproduction number and control mathematical models of vaccination ecological aspects for application of genetically modified mosquitoes, chapter aedes aegypti density and the risk of dengue-virus transmission climate change and infectious diseases in europe cost-effectiveness of a pediatric dengue vaccine mosquito ameaça populações a study of infectious disease models with switching cost of dengue cases in eight countries in the americas and asia: a prospective study optimal control of dynamic investment on inventory with stochastic demand a two-age-classes dengue transmission model 300 years of optimal control: from the brachystochrone to the 
maximum principle lyapunov functions for a dengue disease transmission model controleótimo aplicado na estratégia de combate ao aedes aegypti utilizando insecticida e mosquitos estéreis optimal control of aedes aegypti mosquitoes by the sterile technique and insecticide contrôle optimal: théorie & applications. vuibert, collection mathématiques concrètes an introduction to infectious disease modelling on the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming solving optimal control problems with matlab: indirect methods ecological and immunological determinants of dengue epidemics prospects for a dengue virus vaccine dengue haemorrhagic fever: diagnosis, treatment, prevention and control. world health organization dengue: guidelines for diagnosis, treatment, prevention and control. world health organization an epidemiological model for west nile virus: invasion analysis and control applications assessing the effects of temperature on dengue transmission mathematical control theory. modern birkhäuser classics stability of periodic solutions for an sis model with pulse vaccination

key: cord-317993-012hx4kc authors: movia, dania; prina-mello, adriele title: preclinical development of orally inhaled drugs (oids)—are animal models predictive or shall we move towards in vitro non-animal models? date: 2020-07-24 journal: animals (basel) doi: 10.3390/ani10081259 sha: doc_id: 317993 cord_uid: 012hx4kc simple summary: this commentary focuses on the methods currently available to test the efficacy and safety of new orally inhaled drugs for the treatment of incurable respiratory diseases, such as chronic obstructive pulmonary disease (copd), cystic fibrosis or lung cancer, prior to entering human experimentation.
the key question that the authors try to address in this manuscript is whether there is value in using and refining current animal models for this pre-clinical testing, or whether these should be relinquished in favor of new, more human-relevant non-animal methods. abstract: respiratory diseases constitute a huge burden on our society, and the global respiratory drug market currently grows at an annual rate between 4% and 6%. inhalation is the preferred administration method for treating respiratory diseases, as it: (i) delivers the drug directly at the site of action, resulting in a rapid onset; (ii) is painless, thus improving patients' compliance; and (iii) avoids first-pass metabolism, reducing systemic side effects. inhalation occurs through the mouth, with the drug generally exerting its therapeutic action in the lungs. in recent years, orally inhaled drugs (oids) have also found application in the treatment of systemic diseases. oid development, however, currently suffers from an overall attrition rate of around 70%, meaning that seven out of 10 new drug candidates fail to reach the clinic. our commentary focuses on the reasons behind the poor translation of oids into clinical products for the treatment of respiratory and systemic diseases, with particular emphasis on the parameters affecting the predictive value of animal preclinical tests. we then review the current advances in overcoming the limitations of animal-based studies through the development and adoption of in vitro, cell-based new approach methodologies (nams). respiratory diseases constitute a huge burden in our society. it has been calculated that, worldwide, around 235 million people are living with asthma [1], 251 million with chronic obstructive pulmonary disease (copd) [2], and more than 70,000 people with cystic fibrosis [3]. furthermore, 3 million people are affected by idiopathic pulmonary fibrosis (ipf) [4], and 10 million people contract tuberculosis (tb) annually [5].
in addition to this, lung cancer continues to be the leading cause of cancer death worldwide, accounting for 1.8 million deaths in 2018 [6]; whereas pneumonia still constitutes a major cause of death worldwide. inhalation is the preferred administration method for treating respiratory diseases [13], as: (i) it delivers the drug directly at the site of action, resulting in a rapid therapeutic onset with considerably lower drug doses; (ii) it is painless and minimally invasive, thus improving patients' compliance; and (iii) it avoids first-pass metabolism, providing optimal pharmacokinetic conditions for drug absorption and reducing systemic side effects [14] [15] [16]. it should be noted here that inhalation differs from intranasal administration in the drug portal-of-entry (poe) and targeted site of action. intranasal drugs are sprayed into the nostrils, producing a local effect in the nasal mucosa; whereas inhalation occurs through the mouth, with the oids, also referred to as orally inhaled drug products (oips), exerting their efficacy in the lungs. notably, attempts have been made to develop oids that exert their therapeutic action outside the lung, for the treatment of systemic diseases [17]. the latter include, for example, migraine headaches, treated with aerosols of ergotamine or dihydroergotamine, and type 1/type 2 diabetes, for which inhaled insulin products have been developed (e.g., exubera, withdrawn in 2008 due to poor revenue, and afrezza, the uptake of which has also been impacted by socio-economic issues). oid therapeutic categories currently approved for the clinical treatment of respiratory diseases include drugs for the treatment of asthma and copd, such as β2 adrenergic agonists (e.g., albuterol, formoterol) and muscarinic antagonists (e.g., ipratropium, tiotropium) inducing bronchodilation, or glucocorticosteroids (e.g., fluticasone and budesonide) reducing inflammation.
oids for the treatment of cystic fibrosis are also available for clinical use, with most of them falling into the therapeutic category of mucolytics (e.g., saline and acetylcysteine), which aim at thinning the mucus to facilitate its clearance from the patient's lungs. alternatively, leukocyte dnase, reducing inflammation, and antimicrobial agents (e.g., tobramycin), treating the bacterial infection characteristic of this disease, are also administered as oids. various devices can be used to administer oids to patients, including dry-powder inhalers (dpis), pressurized metered-dose inhalers (pmdis) and nebulizers. these devices have been extensively discussed in several recent works [18] [19] [20] [21] [22] [23] [24] [25]. briefly, dpis deliver powder particles carrying the drug; pmdis and nebulizers generate liquid droplets containing the drug. to be effective, an inhalation device must be easy to use and forgiving of poor patient compliance, while providing reproducible effective dosing. thus, a thorough characterization of the performance of the inhalation device is required at regulatory level when developing an oid. such characterization is based on in vitro, ex vivo and in vivo (on human volunteers) tests, as extensively described in the scientific literature [26] [27] [28] [29] [30] [31] [32] [33]. animal models are not used in the characterization of the efficiency and reproducibility of inhalation delivery devices. this is due to the fact that dpis and pmdis are breath-actuated and therefore not compatible with animal exposure; whereas, for nebulizers, modifications are needed in line with the animal model adopted. thus, our manuscript, which focuses on the potential reduction and replacement of animal studies in oid development, does not discuss the impact of inhalers' performance on the effectiveness of inhalation therapies [34], a current challenge discussed in detail elsewhere [35] [36] [37] [38] [39] [40] [41] [42] [43] [44].
despite the major advantages over i.v. administration of drugs, inhalation therapy encounters several obstacles in achieving an effective therapeutic dose for the successful treatment of respiratory and/or systemic diseases. below, we describe the journey of an oid once administered and the human-specific features that, in the authors' opinion, strongly impact on the current low translation rate of oids, as these are poorly replicated in the current preclinical models. when an oid is administered to a patient, its liquid or powder aerosol enters the human respiratory system via the oropharynx. oid deposition in the oropharynx is invariably wasteful, reducing the oid dose reaching the lungs. this indeed constitutes the first feature to take into account for developing an effective inhalation therapy [45]. rodent models cannot reproduce this feature, as rodents are obligate nose-breathers. however, other animal models (e.g., dogs) can be used to overcome the limitations posed by rodents. also, oid deposition in the oropharynx must be minimized in clinics to avoid severe side-effects in the patients. side-effects can be due to both local and systemic toxicity, as oids accumulating in the mouth and throat enter the body through swallowing. achieving an optimal oid deposition pattern in the patient's lung is the second feature to take into account for an effective inhalation therapy [46]. to reach its site of action and/or absorption, the oid needs to pass through the so-called extrathoracic (or et) region of the larynx, enter the tracheobronchial region and reach the small and/or peripheral (alveolar) airways. drug absorption and translocation into the blood flow can in fact occur from all parts of the lung, but it occurs more readily in the alveoli [47], where there is a large surface area and a relatively thin layer of epithelial and endothelial cells separating the inhaled drug from the blood flow.
the oid journey within the complex, branched structure of the human lung is influenced by two main parameters of the particles/droplets carrying the drug [48]: (i) velocity [49]; and (ii) aerodynamic particle size distribution (the so-called apsd) [13]. both parameters strongly impact on the drug deposition pattern and, subsequently, on the effectiveness of the inhalation therapy. velocity is defined by the delivery system employed in the oid administration. generally, high velocity results in increased deposition in the oropharynx and tracheobronchial regions; whereas, low velocity generates a peripheral deposition pattern [13]. it goes without saying that oids cannot reach those parts of the respiratory tract where velocity is null, i.e., those parts of the lung that are not ventilated. this is particularly relevant to consider when developing oids against respiratory diseases [50], which are characterized by the partial or full obstruction of the respiratory tract (e.g., asthma, copd, cystic fibrosis and lung cancer). combinations of drugs, where bronchodilators or mucolytics are used in a synergistic manner with other drug therapies, can be used to modulate oid velocity and increase the efficacy of the inhalation therapy. in parallel, the deposition mechanism of the aerosol particles/droplets in the bronchial tree changes depending on their apsd [13]. droplets/particles with a large aerodynamic size deposit by impaction or interception mechanisms in the oropharynx or just beyond the trachea bifurcation. smaller droplets/particles deposit in the smaller airways by sedimentation, subject to gravity. among those, droplets/particles with an aerodynamic size below 3 µm further move to the alveoli by diffusion or brownian motion. it should be noted here that droplet/particle deposition follows stokes' law [45]. the consequence is that, since most of the droplets/particles are near spherical, their aerodynamic size can be small despite their being geometrically large.
this happens when particles/droplets have low density, which is determined by the composition of the oid formulation. the oid deposition pattern is currently evaluated in in vitro, cell-free experiments, achieving good predictive value [51]. once the oid deposits on the airways, removal mechanisms, such as mucociliary clearance in the conducting airways and macrophage clearance in the alveolar space, can be responsible for the drug elimination and/or degradation [52], thus hindering the local efficacy and/or the systemic absorption of the oid. mucociliary clearance is the upward movement of mucus driven by beating cilia towards the pharynx, where mucus is subsequently swallowed and passes into the gastrointestinal tract [53]. in macrophage clearance, the oid is phagocytosed by alveolar macrophages and cleared by transport to the lung-draining lymph nodes [54, 55]. compared with mucociliary clearance, macrophage clearance is far slower [56] and, therefore, its action is typically assumed to be negligible for oids, unless the drug is known to be degraded by alveolar macrophages [57]. absorptive drug clearance is yet another clearance mechanism, by which an oid is cleared from the lung through the blood circulation; this mechanism is heavily dependent on perfusion. perfusion levels, however, vary between the different lung regions. in the alveoli, perfusion levels are the highest and drugs have a very short half-life; by contrast, in the tracheobronchial region, the perfusion rate is lower, thus offering a longer drug bioavailability [58]. removal mechanisms constitute the third feature to take into account for developing an effective inhalation therapy. as described in detail in section 2.1.1, this feature is species-specific [59] and, therefore, human-specific removal mechanisms are not replicated by animal models. notably, human-specific removal mechanisms can be reproduced by in vitro, cell-based nams [60] [61] [62] [63], as discussed in detail in section 2.2.
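returning to the deposition physics above: the density effect behind stokes' law can be made concrete with the standard relation for near-spherical particles, in which the aerodynamic diameter equals the geometric diameter scaled by the square root of the particle density relative to unit density (shape and slip corrections neglected). the numbers below are purely illustrative, for a hypothetical large porous particle.

```python
import math

def aerodynamic_diameter_um(geometric_diameter_um, density_g_cm3,
                            reference_density_g_cm3=1.0):
    # d_ae = d_geo * sqrt(rho_particle / rho_reference), valid for
    # near-spherical particles, neglecting shape factor and slip correction
    return geometric_diameter_um * math.sqrt(
        density_g_cm3 / reference_density_g_cm3)

# hypothetical porous particle: 10 um geometric size, density 0.06 g/cm^3
d_ae = aerodynamic_diameter_um(10.0, 0.06)
print(round(d_ae, 2))  # → 2.45, below the 3 um alveolar-delivery cutoff
```

this is why low-density, geometrically large particles can still reach the alveoli, a property exploited by large-porous-particle formulations.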
to exert local or systemic efficacy, oid dissolution and absorption are indeed necessary [64]. the thickness and constitution of the pulmonary lining fluid, which can be modified by diseased lung states [65], influence oid dissolution and, subsequently, absorption [66], constituting the fourth feature to take into account for developing an effective inhalation therapy. while the mucus layer (produced by goblet cells in the bronchial region) acts as a physical barrier, surfactants produced by alveolar cells in the peripheral airways reduce surface tension and facilitate drug dissolution [13]. notably, oid dissolution rates strongly depend on disease-specific airway characteristics (e.g., copd is characterized by a thick mucus, hindering oid efficacy), which are not replicated by conventional preclinical models. in vitro, cell-based nams, by contrast, have the potential to reproduce the disease-specific composition of the pulmonary lining fluid [67]. finally, the multicellular composition of the lung is the fifth feature to take into account for developing an effective inhalation therapy, as it plays an important role in defining oid delivery efficiency. for example, mast cells have protective functions against inhaled drugs; dendritic cells, together with macrophages, are the first line of defense of the lung immune system, constantly sampling for and removing any exogenous material such as drugs. clara cells are involved in oid metabolism. interestingly, the human lung has relatively low metabolic activity compared to the gastro-intestinal tract or the liver [68, 69]. this constitutes a distinct advantage for inhalation therapy over oral drug administration. however, enzymatic activity is generally increased in lung diseases as a result of chronic inflammation (e.g., enhanced activity of cytochrome p450 in patients affected by lung cancer [70, 71] or copd [72]); this can indeed reduce the biopersistence and bioavailability of some oids (e.g., insulin [73]).
protection against metabolic activity has been achieved in inhalation therapy by drug encapsulation into carriers (e.g., liposomes [74] [75] [76] [77] [78] ). animal models and humans differ in the metabolism and distribution/types of cell populations lining the airways. for example, it has been shown that the average number of cells per alveolus for rats versus humans is: 21 vs. 1,481 for endothelial cells, 13 vs. 106 for interstitial cells, 6 vs. 67 for epithelial type ii cells, 4 vs. 40 for epithelial type i cells, and 1.4 vs. 12 for alveolar macrophages [79]. this has important clinical implications during oid development. notably, the human-specific composition and metabolism of the lung can be replicated more closely by adopting in vitro, cell-based nams, as described in the following sections. based on the multiple mechanisms and processes described above, it is evident that oid development is not an easy task. overall, a sound understanding of the features involved in the oid journey is necessary to select the most predictive preclinical models and overcome the complex, intrinsic challenges associated with inhalation therapy. interestingly, such challenges have certainly not hindered the interest of the pharmaceutical industry in inhalation therapy. based on a search carried out by the authors in july 2020, 2542 inhalation clinical trials for new, combination, and existing products, encompassing 666 drug interventions, 1111 different conditions and 115 rare diseases, have been logged on clinicaltrials.gov in the last four years (search terms: interventional studies; inhalation; start date from 01/01/2016 to 31/12/2020). to put this into context, a total of 97,744 interventional studies, comprising 2867 drug interventions, have been registered on clinicaltrials.gov in the same time period.
consequently, inhalation clinical trials account for 2.6% of the total number of interventional studies registered in the time period under consideration (2016-2020), and 23.2% of the total drug interventions examined. it is important to observe that more than half of these inhalation studies are for systemic conditions, demonstrating an interest that extends beyond the domain of respiratory diseases. preclinical studies of new oid candidates generally start from compound profiling in high-throughput in vitro studies [80]. compounds with promising efficacy results progress to in vivo studies. three preclinical animal-based studies are currently required by regulatory authorities before approving the request for a clinical study of a novel oid. these are: (i) the range finding study, (ii) the repeat dose study, and (iii) the carcinogenicity study. other specialized studies can be necessary, such as safety pharmacology studies, reproductive studies, and neonatal/juvenile studies for pediatric oids. animal-based inhalation studies are carried out mainly in rats, mice or rabbits by exposure in restraint tubes [81]. dogs and primates can also be used for testing oids in more realistic settings, via facemasks or helmets [82]. although high-throughput cell-based assays can provide insightful information at the early stages of preclinical development, the cell models used fall short in recapitulating the complex interactions between different cell types and tissues/organs occurring in humans. conventional in vitro models are in fact formed by one cell type grown as a flat, two-dimensional culture; thus, they are a simplistic representation of the human lung tissue [83]. furthermore, many in vitro assays use transformed cell lines that exhibit gene and protein expression strongly differing from their primary counterparts [83]. on the other hand, various uncertainties characterize the animal-based preclinical studies currently required for regulatory purposes.
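the two proportions quoted above follow directly from the clinicaltrials.gov counts reported in the text; a minimal arithmetic check:

```python
# Counts from the authors' July 2020 clinicaltrials.gov search (2016-2020).
inhalation_trials = 2542
total_interventional = 97_744
inhalation_drug_interventions = 666
total_drug_interventions = 2867

share_of_trials = 100 * inhalation_trials / total_interventional
share_of_drugs = 100 * inhalation_drug_interventions / total_drug_interventions

print(f"{share_of_trials:.1f}% of interventional studies")  # 2.6%
print(f"{share_of_drugs:.1f}% of drug interventions")       # 23.2%
```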
the first level of uncertainty is associated with the type of device used to administer the oid to the animal. while clinical nebulizers can be used in the preclinical environment (upon small modifications), dpis and pmdis cannot be employed to expose animal models at the preclinical screening level, as these devices are breath actuated. to overcome this issue, specialized equipment is used to expose the animal to an aerosol in a restrained environment. aerosolization of powders is achieved via, for example, rotating brush generators or the wright dust feed. an algorithm-based extrapolation [84] is then applied to define the dose ranges to be used in clinical trials. the delivered dose is calculated as the amount of oid per unit of body weight that is presented to the animal. due to the two parameters affecting oid deposition patterns in the lungs (velocity and aerodynamic size distribution), as discussed in the section above, and to the species of the animal model used, the deposited dose is only a fraction of the delivered dose. the fda assumes 100% deposition in humans, 10% in rats and 25% in dogs or non-human primates, irrespective of any information produced by the submitting company [85]. this indeed generates uncertainties when calculating clinical overages. the second level of uncertainty in in vivo studies is posed by the animal model itself [86]. for example, rodents are obligate nose breathers; this strongly influences how inhaled compounds deposit in the respiratory tract. this and other interspecies differences have been extensively discussed by the authors in a recent perspective [59]. preclinical studies during oid development require a clear understanding of such interspecies differences and their impact on the screening outcomes in terms of oid efficacy, toxicity and recovery from adverse effects.
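the delivered-versus-deposited dose relationship described above can be sketched as follows. the deposition fractions are the regulatory assumptions quoted in the text (100% humans, 10% rats, 25% dogs or non-human primates); the function name, dose and body weights are hypothetical illustration values, not part of any guideline.

```python
# Deposition fractions quoted in the text for regulatory dose calculations.
DEPOSITION_FRACTION = {"human": 1.00, "rat": 0.10, "dog": 0.25}

def deposited_dose_ug(delivered_ug_per_kg: float, body_weight_kg: float,
                      species: str) -> float:
    """Deposited dose = delivered dose (expressed per kg of body weight)
    scaled by the species-specific deposition fraction."""
    return delivered_ug_per_kg * body_weight_kg * DEPOSITION_FRACTION[species]

# A hypothetical delivered dose of 100 ug/kg:
rat = deposited_dose_ug(100.0, 0.25, "rat")      # 2.5 ug deposited
human = deposited_dose_ug(100.0, 70.0, "human")  # 7000 ug deposited
```

the large gap between the two numbers illustrates why the deposited dose, rather than the delivered dose, is the relevant quantity when extrapolating animal exposures to clinical dose ranges.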
although not required at the regulatory level, disease animal models are also used in preclinical research, particularly in the oncological field, as proof of concept for demonstrating oid efficacy. the authors have performed a literature search on pubmed using the search terms "(inhaled drug) and (in vivo) and (efficacy)". the search results showed that, in the last five years, 116 articles used disease animal models to test the efficacy of oids. however, the use of animals as disease models needs to be viewed cautiously. in animal models, disease features are reproduced by applying exogenous stimuli (e.g., allergens, irritant gas exposures, cigarette smoke, etc.) [87]. this modelling process is however incomplete, as the use of single stimuli does not mimic the disease etiology and chronicity observed in patients. the next section of this commentary focuses on this specific aspect, complementing the authors' previous publication [59] and further discussing whether and how new approach methodologies (nams) could become useful in the attempt to overcome the limitations of current animal models and increase the oid translation rate. for completeness, it should be mentioned here that the abbreviation "nams" is often used in toxicology to refer broadly to any non-animal technology, methodology, approach, or combination thereof that can be used to provide information on chemical hazard and risk assessment. examples of nams include non-mammalian model systems (e.g., caenorhabditis elegans [88] [89] [90] , drosophila melanogaster [91] [92] [93] , zebrafish [94] [95] [96] and dictyostelium [97]) and computational (in silico) approaches [98], which indeed offer opportunities for mimicking human respiratory diseases in a predictive manner. however, the scope of the nams considered in our commentary includes only in vitro, non-animal cell models for the testing of oids.
based on the most recent advances in tissue-engineering technologies, in vitro cell-based nams for screening the efficacy of oids can be classified into three main categories [99]: (i) tissue-mimetic lung cultures grown at the air-liquid interface (ali); (ii) lung organoids; and (iii) lung-on-chip. ali cultures mimic one of the main properties of the lung epithelium, i.e., the direct contact with the gas phase (air). this provides a tissue-mimetic environment that makes it possible for airway epithelial cells to proliferate and differentiate in vitro into a pseudostratified, ciliated epithelium that produces mucus. thus, ali cultures provide an excellent method for testing oid dissolution and absorption, while enabling testing of the drug in its aerosol form. whitcutt et al. were among the first research groups to report mucociliary differentiation in ali cultures [100]. today, ali cultures are known to be particularly useful in understanding the mechanisms of respiratory diseases, including the cell-cell and cell-extracellular matrix interactions during airways remodeling [101] [102] [103] . they can also replicate some of the key features that need to be taken into account when developing an inhalation therapy, namely (i) the constitution and thickness of the pulmonary lining fluid [67] and (ii) mucociliary clearance [60] [61] [62] . for example, ali cultures have been used to model the effects of smoke exposure on epithelial cells [104], and the authors have created a complex, diseased ali culture model capable of reproducing the chemoresistance mechanisms observed in patients affected by non-small-cell lung cancer [105, 106]. in addition, culturing human airway epithelial cells isolated from patients makes it possible to conduct patient-specific research and drug screening, for example in cystic fibrosis, asthma and copd [107] [108] [109] [110] . with the aim of further increasing the predictive value of this in vitro nam, ali co-cultures have also been developed.
in ali co-cultures, the lung cell populations are mixed or partially separated, depending on the experimental set-up. in general, the immune cells are cultured in direct contact with the epithelial cells, whereas fibroblasts and endothelial cells are separated from the epithelial cells by the transwell permeable membrane. this separation is necessary because the various cell types require different culturing conditions. it also constitutes one of the main limitations of ali models, as separated cells cannot establish physical (cell-to-cell) interactions as per in vivo conditions, which affects the responses detected during oid preclinical testing. the second type of in vitro, cell-based nams currently available for oid testing is lung organoids. these are grown from human induced pluripotent stem cells (ipscs) cultured within a natural or synthetic extracellular matrix to form three-dimensional (3d), hollow cell spheroids of basal, ciliated and secretory cells [111]. through differentiation and self-organization of the ipscs, an in vitro culture with lung tissue-specific morphogenetic and histological properties is formed [112]. to date, several organoids representative of the various human lung regions [39] and modelling a variety of pulmonary diseases [39, 113, 114] have been developed. in the context of oid preclinical testing, lung organoids can be used for modeling respiratory diseases and, therefore, as a platform for screening the efficacy of inhalation therapies [115, 116]. nevertheless, technical limitations are inherent to the use of lung organoids. lungs are in fact subjected to mechanical deformation during breathing cycles, a deformation that is currently hard to model in organoids. furthermore, there is still a lack of established in vitro lung organoids with a functional representation of the vasculature network.
most importantly, lung organoids lack a feature essential for oid testing, i.e., the direct contact of epithelial cells with the air. as mentioned above, lung organoids are spherical cultures. they present an interiorized lumen, with epithelial cells facing inwards rather than outwards; this makes drug administration extremely difficult and limits the application of organoids in the screening of oid absorption. microfluidic technologies make it possible to add further complexity and functionality to the in vitro ali models described above. the so-called "lung-on-chip" is a microfluidic-based in vitro system in which lung epithelial cells are grown on one side of a membrane, and stromal cells on the other. liquid and air are circulated through the system to mimic blood and air flow in the lung. the applications of lung-on-chip range from basic research to drug discovery [117], as the oid can be introduced into the air flow as per in vivo conditions. probably the most famous example of this in vitro, cell-based nam is the breathing lung-on-chip developed by huh and co-workers at the wyss institute of harvard university (usa), capable of reproducing the physiological and pathological responses of the human lung, a rudimentary circulatory system and the mechanical stress associated with breathing [118] [119] [120] . the immediate application of lung-on-chip has been for toxicity testing [121, 122]; more recently, this model has been exploited to improve understanding of complex lung disease processes and their responses to therapeutics [123] [124] [125] , with applications extending even to the most recent need for fast drug discovery for covid-19 treatment [126].
lung-on-chip systems allow, in fact, the in vitro creation of highly tissue-mimetic lung disease models [127, 128], making it possible, for example, to model the human response and the effects of existing and novel therapeutics when the lung is infected by the influenza virus or by viral pseudoparticles expressing the spike protein of sars-cov-2, the virus responsible for covid-19 [126]. the clear advantage of lung-on-chip systems over ali cultures or lung organoids is the possibility of mimicking the pulmonary mechanical stretch during in- and exhalation, while replicating the air-blood barrier for studying oid absorption. furthermore, lung-on-chip models allow the impact of the mucociliary clearance mechanism to be evaluated, overcoming the lack of directionality in cilia beating function characteristic of fully-differentiated in vitro ali models [63]. nevertheless, lung-on-chip models share some of the limitations of ali cultures, i.e., the impairment of physical crosstalk among different cell types. in fact, even in the most recent and advanced developments in "tumor-on-a-chip" cell culture technology, successfully used to create in vitro human orthotopic models of non-small-cell lung cancer [129], the lung cancer cells (cultured under ali conditions) are physically separated from the lung endothelial cells by a porous, permeable membrane [130]. it is worth mentioning that, in the respiratory disease field, two additional categories of in vitro, cell-based nams exist, although these have not been used for oid testing to date. the first category is constituted by explant or ex vivo cultures, namely isolated perfused lungs and precision-cut lung slices. these are better representations of the in vivo situation than any of the three nam types mentioned above.
the use of ex vivo cultures in oid testing is however hindered by the hurdles associated with their manipulation, and by donor-specific differences that make the oid screening outcomes often not significant or difficult to interpret [131]. the second category includes the engineered, reconstructed lung organs [132]. these are formed from several cell types co-cultured within scaffolds that aim at replicating the composition and architecture of the human lung acellular stroma [133]. mechanical or biochemical stimuli can be added to tailor the properties of the scaffold and increase the similarity to the lung stroma in vivo. the first engineered lung organ was built from a decellularized lung matrix used as a scaffold [134]. more recently, 3d bioprinting techniques have been used to produce lung organs in vitro. for 3d bioprinting, cells are combined with bioactive hydrogels composed of synthetic (e.g., polyethylene glycol, pluronic) or natural (collagen, chitosan, fibrin, gelatin, matrigel, alginate) polymers [135]. the use of reconstructed lung organs in oid preclinical screening is currently hampered by the low throughput of these methods. to summarize, in this commentary we have presented an overview of the in vitro, cell-based nam systems that, to date, have been successfully employed to fill the technological gap that is believed to hinder the effective oid translation from the lab bench to the clinic. in the past, oid failure at the clinical trial stage was mainly due to poor pharmacokinetics and bioavailability. today, these are rarely a cause of failure, as the pharmaceutical industry has invested greatly in the development and application of much more accurate prediction and modelling approaches. lack of efficacy is now the most common cause of oid attrition [11]; this appears to be associated with the fact that preclinical animal models are poorly representative of human respiratory diseases [136].
improved in vitro non-animal methods could provide more human-relevant predictive value, so that compounds would fail earlier in their course of development [137]. furthermore, we have provided a brief overview of those in vitro, cell-based nams that, we believe, could be adapted towards oid testing in the future. although in vitro, cell-based nams still have limitations, the advantages associated with their use are evident, and future efforts should aim at validating these systems for regulatory acceptance [59]. in the development of oids, we should therefore invest in moving away from animal studies. in the last decades, significant funding and precious time have been spent on developing animal models, despite the known species differences that make the results obtained from such models often unreliable when translated to humans. as dr. francois busquet and colleagues from the center for alternatives to animal testing-europe state in the context of covid-19, human-relevant approaches offer crucial advantages of speed and "much more robust and exacting data than any animal experiment could deliver" [138]. in this instance, we believe it is important to highlight that directive 2010/63/eu on the protection of animals used for scientific purposes aims not only at reducing animal use but at the "full replacement of procedures on live animals for scientific and educational purposes, as soon as it is scientifically possible to do so" [139]. consistently with this aim, in 2016 the netherlands was the first eu member state to present a roadmap for phasing out animal testing in safety research on chemical substances, food ingredients, pesticides and medicines (including veterinary medicines) [140]. the recent advances in tissue engineering, microfluidic and organ-on-chip technologies are providing researchers with tools for the development of human-relevant, in vitro nams.
thus, it is essential now that the respiratory disease research community embraces these tools, bringing them forward towards regulatory validation.

references (titles as recovered from the source):
- chronic obstructive pulmonary disease (copd)
- global incidence and mortality of idiopathic pulmonary fibrosis: a systematic review
- world cancer report: cancer research for cancer prevention
- who. coronavirus disease (covid-2019) situation reports. available online
- global respiratory drugs market to 2023: a changing therapeutic landscape as key patents expire and biologics, targeted therapies and cftr modulators for asthma and cystic fibrosis treatment emerge as market growth drivers
- global respiratory drugs market
- barriers to new drug development in respiratory disease
- the r&d cost of a new medicine
- inhaled therapy in respiratory disease: the complex interplay of pulmonary kinetic processes
- heuze-vourc'h, n. in a murine model of acute lung infection, airway administration of a therapeutic antibody confers greater protection than parenteral administration
- heuze-vourc'h, n. inhalation of immuno-therapeutics/-prophylactics to fight respiratory tract infections: an appropriate drug at the right place! front.
- pulmonary drug delivery. part i: physiological factors affecting therapeutic effectiveness of aerosolized medications
- will pulmonary drug delivery for systemic application ever fulfill its rich promise?
- recent advances in aerosolised drug delivery
- 100 years of drug delivery to the lungs
- understanding dry powder inhalers: key technical and patient preference attributes
- delivery technologies for orally inhaled products: an update
- inhalation devices, delivery systems, and patient technique
- inhalation devices: from basic science to practical use, innovative vs generic products
- optimizing drug delivery in copd: the role of inhaler devices
- inhalation therapy devices for the treatment of obstructive lung diseases: the history of inhalers
- towards the ideal inhaler
- drug delivery devices: issues in drug development
- developing ways to evaluate in the laboratory how inhalation devices will be used by patients and care-givers: the need for clinically appropriate testing
- metered dose inhaler (mdi) and dry powder inhaler (dpi) products: quality considerations. guidance for industry
- inhaled formulation and device selection: bridging the gap between preclinical species and first-in-human studies
- in vitro testing for orally inhaled products: developments in science-based regulatory approaches
- pulmonary drug delivery. part ii: the role of inhalant delivery devices and drug formulations in therapeutic effectiveness of aerosolized medications
- in vitro, in vivo and ex vivo models for studying particle deposition and drug absorption of inhaled pharmaceuticals
- validation of a general in vitro approach for prediction of total lung deposition in healthy adults for pharmaceutical inhalation products
- biological obstacles for identifying in vitro-in vivo correlations of orally inhaled formulations
- patient education and adherence to aerosol therapy
- "trying, but failing": the role of inhaler technique and mode of delivery in respiratory medication adherence
- the role of inhalation delivery devices in copd: perspectives of patients and health care providers
- matching inhaler devices with patients: the role of the primary care physician
- organoids as a model system for studying human lung development and disease
- device use errors with soft mist inhalers: a global systematic literature review and meta-analysis
- irregular and ineffective: a quantitative observational study of the time and technique of inhaler use
- problems with inhaler use: a call for improved clinician and patient education
- inhalers: to switch or not to switch? that is the question
- inhaler competence in asthma: common errors, barriers to use and recommended solutions
- particle transport and deposition: basic physics of particle kinetics
- pulmonary drug delivery: from generating aerosols to overcoming biological barriers. therapeutic possibilities and technological challenges
- inhaling medicines: delivering drugs to the body through the lungs
- guideline on the pharmaceutical quality of inhalation and nasal products
- drug delivery to the small airways
- the impact of pulmonary diseases on the fate of inhaled medicines: a review
- models of deposition, pharmacokinetics, and intersubject variability in respiratory drug delivery
- overcoming lung clearance mechanisms for controlled release drug delivery
- mucociliary clearance in the airways
- challenges for inhaled drug discovery and development: induced alveolar macrophage responses
- alveolar macrophages transport pathogens to lung draining lymph nodes
- report no 125: deposition, retention and dosimetry of inhaled radioactive substances
- pharmacometric models for characterizing the pharmacokinetics of orally inhaled drugs
- systems pharmacology approach for prediction of pulmonary and systemic pharmacokinetics and receptor occupancy of inhaled drugs
- in vitro alternatives to acute inhalation toxicity studies in animal models: a perspective
- air liquid interface culture can alter ciliary beat pattern in epithelium from primary ciliary dyskinesia patients
- characterization of pediatric cystic fibrosis airway epithelial cell cultures at the air-liquid interface obtained by non-invasive nasal cytology brush sampling
- responses of well-differentiated airway epithelial cell cultures from healthy donors and patients with cystic fibrosis to burkholderia cenocepacia infection
- mucociliary defense: emerging cellular, molecular, and animal models
- pulmonary drug metabolism, clearance, and absorption
- pulmonary surfactants and their role in pathophysiology of lung disorders
- measurements of deposition, lung surface area and lung fluid for simulation of inhaled compounds
- human cellular models for the investigation of lung inflammation and mucus production in cystic fibrosis
- expression and localization of cyp3a4 and cyp3a5 in human lung
- expression and regulation of xenobiotic-metabolizing cytochrome p450 (cyp) enzymes in human lung
- smoking and peripheral type of cancer are related to high levels of pulmonary cytochrome p450ia in lung cancer patients
- cytochrome p450-mediated pulmonary metabolism of carcinogens: regulation and cross-talk in lung carcinogenesis
- expression of cytochrome p450 mrnas in type ii alveolar cells from subjects with chronic obstructive pulmonary disease
- proteolytic enzymes as a limitation for pulmonary absorption of insulin: in vitro and in vivo investigations
- development of liposomal ciprofloxacin to treat lung infections
- liposomal formulations for inhalation
- the rationale and evidence for use of inhaled antibiotics to control pseudomonas aeruginosa infection in non-cystic fibrosis bronchiectasis
- liposomes for pulmonary drug delivery: the role of formulation and inhalation device design
- amikacin liposome inhalation suspension for treatment-refractory lung disease caused by mycobacterium avium complex (convert): a prospective, open-label, randomized study
- lower respiratory-tract structure of laboratory animals and humans: dosimetry implications
- in vitro cell culture models for evaluating controlled release pulmonary drug delivery
- in vivo animal models for controlled-release pulmonary drug delivery
- preclinical models for pulmonary drug delivery
- reconstituted 2d cell and tissue models
- association of inhalation toxicologists (ait) working party recommendation for standard delivered dose calculation and expression in non-clinical aerosol inhalation toxicology studies with pharmaceuticals
- toxicologic testing of inhaled pharmaceutical aerosols
- species comparison of drug absorption from the lung after aerosol inhalation or intratracheal injection
- translational models of lung disease
- modeling molecular and cellular aspects of human disease using the nematode caenorhabditis elegans
- modeling human diseases in caenorhabditis elegans
- c. elegans as a model organism for in vivo screening in cancer: effects of human c-met in lung cancer affect c. elegans vulva phenotypes
- a drosophila model of cigarette smoke induced copd identifies nrf2 signaling as an expedient target for intervention
- drosophila in asthma research
- a drosophila asthma model: what the fly tells us about inflammatory diseases of the lung
- zebrafish: model for the study of inflammation and the innate immune response to infectious diseases
- using in vivo zebrafish models to understand the biochemical basis of neutrophilic respiratory disease
- modeling inflammation in the zebrafish: how a fish can help us understand lung disease
- what can dictyostelium bring to the study of pseudomonas infections? semin.
- heuze-vourc'h, n. innovative preclinical models for pulmonary drug delivery research
- options for modeling the respiratory system: inserts, scaffolds and microfluidic chips
- a biphasic chamber system for maintaining polarity of differentiation of cultured respiratory tract epithelial cells
- adapting the electrospinning process to provide three unique environments for a tri-layered in vitro model of the airway wall
- use of porous membranes in tissue barrier and co-culture models
- a novel electrospun biphasic scaffold provides optimal three-dimensional topography for in vitro co-culture of airway epithelial and fibroblast cells
- intermittent exposure to whole cigarette smoke alters the differentiation of primary small airway epithelial cells in the air-liquid interface culture
- ali multilayered co-cultures mimic biochemical mechanisms of the cancer cell-fibroblast cross-talk involved in nsclc multidrug resistance
- multilayered cultures of nsclc cells grown at the air-liquid interface allow the efficacy testing of inhaled anti-cancer drugs
- antibacterial defense of human airway epithelial cells from chronic obstructive pulmonary disease patients induced by acute exposure to nontypeable haemophilus influenzae: modulation by cigarette smoke
- altered generation of ciliated cells in chronic obstructive pulmonary disease
- primary epithelial cell models for cystic fibrosis research
- asthmatic bronchial epithelial cells have a deficient innate immune response to infection with rhinovirus
- regeneration of the lung: lung stem cells and the development of lung mimicking devices
- in vitro generation of human pluripotent stem cell derived lung organoids
- a three-dimensional model of human lung development and disease from pluripotent stem cells
- modelling cryptosporidium infection in human small intestinal and lung organoids
- use of three-dimensional organoids and lung-on-a-chip methods to study lung development, regeneration and disease
- organoids as a powerful model for respiratory diseases
- lung-on-a-chip technologies for disease modeling and drug development
- reconstituting organ-level lung functions on a chip
- a human breathing lung-on-a-chip
- a human disease model of drug toxicity-induced pulmonary edema in a lung-on-a-chip microdevice
- a lung/liver-on-a-chip platform for acute and chronic toxicity studies
- a 3d human lung-on-a-chip model for nanotoxicity testing
- multiorgan microfluidic platform with breathable lung chamber for inhalation or intravenous drug screening and development
- microphysiological lung models to evaluate the safety of new pharmaceutical modalities: a biopharmaceutical perspective
- impaired wound healing of alveolar lung epithelial cells in a breathing lung-on-a-chip
- human organs-on-chips as tools for repurposing approved drugs as potential influenza and covid19 therapeutics in viral pandemics
- biomimetic human lung-on-a-chip for modeling disease investigation
- small airway-on-a-chip enables analysis of human lung inflammation and drug responses in vitro
- human organ chip models recapitulate orthotopic lung cancer growth, therapeutic responses, and tumor dormancy in vitro
- microengineered cancer-on-a-chip platforms to study the metastatic microenvironment
- bridging the gap between science and clinical efficacy: physiology, imaging, and modeling of aerosols in the lung
- modeling the lung: design and development of tissue engineered macro- and micro-physiologic lung models for research use
- tissue-informed engineering strategies for modeling human pulmonary diseases
- three-dimensional scaffolds of acellular human and porcine lungs for high throughput studies of lung disease and regeneration
- 3d in vitro/ex vivo systems
- animal models of asthma: value, limitations and opportunities for alternative approaches
- human tissue models for a human disease: what are the barriers? thorax
- harnessing the power of novel animal-free test methods for the development of covid-19 drugs and vaccines
- directive 2010/63/eu on the protection of animals used for scientific purposes

this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license. the authors thank moreno carrer for the technical assistance. the authors declare no conflict of interest. the funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

key: cord-303187-ny4qr2a2
authors: belo, vinícius silva; struchiner, claudio josé; werneck, guilherme loureiro; teixeira neto, rafael gonçalves; tonelli, gabriel barbosa; de carvalho júnior, clóvis gomes; ribeiro, renata aparecida nascimento; da silva, eduardo sérgio
title: abundance, survival, recruitment and effectiveness of sterilization of free-roaming dogs: a capture and recapture study in brazil
date: 2017-11-01
journal: plos one
doi: 10.1371/journal.pone.0187233
sha:
doc_id: 303187
cord_uid: ny4qr2a2

the existence of free-roaming dogs raises important issues in animal welfare and in public health. a proper understanding of these animals' ecology provides a necessary input for planning strategies to control their populations. the present study addresses the population dynamics and the effectiveness of sterilization of unrestricted dogs using capture and recapture procedures suitable for open animal populations. every two months, over a period of 14 months, we captured, tagged, released and recaptured dogs in two regions of a city in the southeast of brazil. in one of these regions the animals were also sterilized. both regions had similar social, environmental and demographic features. we estimated the presence of 148 females and 227 males during the period of study.
the average dog:man ratio was 1 dog for every 42 and 51 human beings in the areas without and with sterilization, respectively. the animal population size increased in both regions, due mainly to the abandonment of domestic dogs. the mortality rate decreased throughout the study period. survival probabilities did not differ between genders, but males entered the population in higher numbers. there were no differences in abundance, survival and recruitment between the regions, indicating that sterilization did not affect the population dynamics. our findings indicate that the observed animal dynamics were influenced by density-independent factors, and that sterilization might not be a viable and effective strategy in regions where availability of resources is low and animal abandonment rates are high. furthermore, the high demographic turnover rates observed render the free-roaming canine population younger, and thus more susceptible to diseases, especially rabies and leishmaniasis. we conclude by stressing the importance of implementing educational programs to promote responsible animal ownership and effective strategies against abandonment practices. the relationship between dogs (canis familiaris) and men goes back to the beginning of civilization, about 13,000 years ago [1, 2]. it is generally accepted that dogs were domesticated from the wolf (canis lupus pallipes or c. lupus variabilis) in a process of symbiosis that evolved through selective breeding [3]. indeed, dogs are termed 'domestic' or 'domesticated' animals due to their association with humans and to the role that humans played in the emergence of this lineage [4]. since domestication, this relationship became even more intense and dogs are ubiquitous in the cultural context of every society, constituting the most abundant carnivore on the planet [5].
dogs have been associated with their owners’ welfare and well-being [6, 7] and have come to perform different functions [2] due to their malleable personalities, docile behavior and utility as guardians and hunters [8]. dogs that are not under immediate human supervision and have unrestricted access to public property are named "free-roaming" or "free-ranging" [1]. these terms encompass both owned dogs (family and some neighborhood dogs) and ownerless dogs (stray or feral) [3]. the existence of these dogs that can circulate freely in the streets can be harmful both to the animals and to human beings [1]. the abandonment and breeding of dogs in unrestricted environments have been attributed to behavioral, religious, cultural, ecological and socioeconomic factors, constituting important issues in public health and animal welfare [9, 10]. unrestricted dogs, in general, have their psychological and physical health compromised, are more likely to acquire infectious diseases and have a lower life expectancy compared to pet dogs [11-13]. their presence can be detrimental to humans, since they are associated with the occurrence of biting incidents, transmission of diseases, damage to wild animal populations, accidents and pollution [14-19]. different strategies are used to control the population of unrestricted dogs [20]. elimination by killing is not considered effective, since the number of removed animals is compensated for by the increased entry and survival of the remaining ones. in addition, this method is the subject of much criticism based on ethical issues [20-22]. as a result, actions towards promoting responsible animal ownership, the strengthening of legislation against abandonment, and surgical control have been established in different countries [20, 23]. annually, thousands of unrestricted dogs are sterilized in veterinary clinics and in campaigns run by governments and non-governmental organizations.
nevertheless, the effectiveness of this measure in the long term has been poorly evaluated [24, 25]. a proper evaluation of actions that aim at controlling the free-roaming canine population requires non-biased estimates of the parameters driving the dynamics of the target population [26, 27]. even though several studies have yielded estimates of the population size of free-ranging dogs, most of them used inadequate analytical methods and were susceptible to biases, which casts doubt on the validity of these estimates, as evidenced in a recent systematic review [28]. additionally, there are no published data from capture and recapture procedures that consider canine populations as open, that is, subject to deaths, births and migrations [29]. despite the perceived need and usefulness of such parameter estimates, and despite recommendations for the most appropriate approaches applicable under such study designs [30], survival and recruitment estimates of free-ranging dogs had not been obtained using capture and recapture methods. in this study, we present estimates of abundance, survival and recruitment rates, and the probabilities of capture, of two free-roaming dog populations by means of analytical models for open populations, so far unexplored in previous studies. these dogs were followed for 14 months in a city located in the southeast region of brazil. we report temporal variations of the estimates during the study period regarding gender and the effectiveness of surgical sterilization. the study protocol was approved by cepea (commission of ethics in research involving animals) of the federal university of são joão del rei (protocol no. 24/2010). prior to the implementation of the activities described in the next section, we performed a pilot study to define the areas of study and to identify potential problems requiring further attention. the pilot study lasted for four days, each day spent in a different neighborhood of the city.
each of the four neighborhoods belonged to one of four eligible areas defined with the municipal health authorities based on previous information on the occurrence of free-roaming dogs and on the feasibility of carrying out the research. those with similar features and with the highest raw number of captured and released animals were selected. data acquired in this stage were not used in the analyses. we conducted seven capture and recapture procedures in a period of one year and two months, one every two months. dogs found wandering in the streets during the capture period were included in the study, provided an owner with a dog leash did not accompany them. adapted vehicles drove around all the streets of the study areas screening for free-roaming animals. the work team consisted of one driver, two municipal health agents, one veterinarian and two individuals responsible for collecting and recording data. in area a, activities took place in the first week of the collection months, while in area b, they took place in the second week of the same month. screenings always followed the same route, so that they covered all the streets of each region at least once. the same team collected the data in both areas. captured dogs of areas a and b were taken to the health surveillance reference center of the city (crevisa) in adapted vehicles of the public health service. in crevisa they underwent clinical tests conducted by veterinarians and were screened for canine leishmaniasis (canl). the diagnosis of canl was made in the parasitology laboratory of universidade federal de são joão del rei through the techniques of enzyme-linked immunosorbent assay and indirect fluorescent antibody test. seropositive dogs were euthanized according to the recommendations of the brazilian ministry of health [31]. dogs that tested negative were microchipped for identification if recaptured.
these animals were also de-wormed and received a vaccine against rabies and another against distemper, leptospirosis, hepatitis, parainfluenza, parvovirus, coronavirus and adenovirus. dogs captured in area b (intervention) were sterilized (s1 appendix). healthy animals were returned to the same place where they had been captured, after screening for canl and, for area b dogs, full recovery from the surgical procedure (s1 appendix). recaptured animals were re-examined. animals screened and found negative for canl were released again. animals that tested positive were euthanized. all dogs, even those not captured, were photographed for later identification in the same sample interval and in recaptures. for identification, distinctive characteristics of the dogs were sought in their craniolateral and/or dorso-caudal portions. for analytical purposes, dogs not physically captured but photographed were considered captured. this information was added to the database serving as input for the estimation of the population dynamics parameters. we distributed informative materials to the local population with the purpose of increasing their awareness regarding responsible animal ownership and visceral leishmaniasis (s2 appendix). we entered the individual history of capture and recapture of each animal into a microsoft excel (2013) database formatted as "encounter history" for captured animals tagged alive [30]. euthanized animals carried a negative sign, indicating the occurrence of death during the capture procedure. there were no deaths attributed to other factors. general procedure. the jolly-seber model with popan parametrization served as the starting structure for model fitting [34]. we estimated the following three parameters using this approach: φ i (survival): the probability of a marked or unmarked animal surviving (and not emigrating) between captures i and i+1.
p i (capture probability): the probability of finding or seeing a marked or unmarked animal in a given capture i, given that the animal is alive and in the capture area. b i (probability of entrance): considering the existence of a "super-population", comprising all animals that would ever be born into the population, this parameter constitutes the probability of an animal of this hypothetical "super-population" entering the population between occasions i and i+1. the parameters above allow for the estimation of recruitment (b: the number of animals that enter the population between two capture procedures) and population size (n). we used mark, version 6.2, for fitting the statistical models. goodness of fit of highly parameterized models. we evaluated the goodness of fit (gof) of the model with the largest number of parameters prior to fitting more parsimonious models [35]. this step was necessary to check the assumptions of the jolly-seber approach. we checked the model's gof based on tests 2 and 3 of the release suite of the mark software, on gof statistics obtained via bootstrap, and on the "median c-hat" statistic. among these procedures, we adopted the one that indicated the highest variance inflation factor (c-hat). we first considered the model with the variables "sex" (male or female), "area" (a or b), "time" (sampling period), and their respective interaction terms. a c-hat value of 2.52 suggested data sparsity in different capture periods. we changed our model search strategy accordingly and partitioned the previous model into two models: one with "sex", "time" and interactions, and a second with "area", "time" and interactions. the c-hat values estimated in these cases were 1.17 and 1.25, respectively, with few indications of sparse data. modeling procedures.
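as a toy illustration of how the three popan parameters (φ, p and b) generate the "encounter history" records described earlier, the sketch below simulates 0/1 capture histories for a hypothetical super-population over seven occasions; every numeric value here is invented for illustration and is not fitted to the study data:

```python
import random

random.seed(42)

# illustrative (not fitted) values for a 7-occasion study
PHI = 0.85     # survival probability between occasions (phi_i, held constant)
P = 0.5        # capture probability at each occasion (p_i, held constant)
N_SUPER = 400  # hypothetical "super-population" size
B = [0.3, 0.1, 0.1, 0.15, 0.15, 0.1, 0.1]  # entry probabilities b_i (sum to 1)

def simulate_encounter_histories(n_occasions=7):
    """Simulate 0/1 encounter histories under a Jolly-Seber/POPAN-style model."""
    histories = []
    for _ in range(N_SUPER):
        # occasion at which this animal enters the population
        entry = random.choices(range(n_occasions), weights=B)[0]
        alive = True
        hist = []
        for occ in range(n_occasions):
            if occ < entry or not alive:
                hist.append("0")      # not yet entered, or dead/emigrated
            else:
                hist.append("1" if random.random() < P else "0")
                alive = random.random() < PHI  # survive to the next occasion
        histories.append("".join(hist))
    return histories

histories = simulate_encounter_histories()
# a field study only ever records animals observed at least once
observed = [h for h in histories if "1" in h]
print(len(observed), observed[:3])
```

fitting the model (in mark) amounts to recovering φ, p and b from such histories; animals never captured, like the all-zero rows above, are exactly what makes abundance estimation non-trivial.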
the gof analysis reported above prompted us to investigate factors associated with survival estimates, probability of capture and probability of entry separately for the variables "sex" and "area". in both cases, we built models considering time-dependent or time-independent parameters and the presence of interactions between the variables "sex" and "time" or "area" and "time". we also fitted additive models containing parameters expressed as a function of two or more factors, in this case area and time or sex and time, without interactions. in total, we fitted 50 models in each group (s3 appendix). all models allowed temporal variation in the "probability of entry" (b i ). model selection followed the usual approach of searching for the most parsimonious structure that retained the best balance between explained variability and precision of estimates. we ranked all models based on akaike's information criterion corrected for finite sample sizes (aicc). this statistic provides a summary balance between each model's goodness of fit to the data and the number of necessary parameters. "data cloning" was used to identify the correct number of estimated parameters [36]. the presence of overdispersion in the data indicated the need to further correct the aicc statistics by the c-hat values to obtain the quasi-aicc statistics (qaicc). lower values of this statistic indicate more parsimonious models [37]. after ranking all models based on qaicc, we evaluated the strength of evidence in favor of each model (akaike weight, "w"). this statistic can be interpreted as the conditional probability of a given model being the best among the set analyzed. thus, higher values of "w" indicate stronger evidence in favor of the model. models with values of "w" lower than 0.01 were disregarded.
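the qaicc ranking and akaike weights described above reduce to a few lines of arithmetic; in this sketch the deviances, parameter counts, sample size and c-hat are hypothetical, chosen only to show the mechanics:

```python
import math

def qaicc(deviance, k, n, c_hat=1.0):
    """Quasi-AICc: the deviance (-2 log-likelihood) is scaled by the
    overdispersion factor c-hat before the usual small-sample penalty."""
    return deviance / c_hat + 2 * k + (2 * k * (k + 1)) / (n - k - 1)

def akaike_weights(scores):
    """w_i = exp(-delta_i / 2) / sum_j exp(-delta_j / 2), deltas taken
    relative to the best (lowest) score in the candidate set."""
    best = min(scores)
    rel = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(rel)
    return [r / total for r in rel]

# three hypothetical candidate models: (deviance, number of parameters)
models = [(900.0, 10), (905.0, 12), (930.0, 20)]
scores = [qaicc(d, k, n=328, c_hat=1.2) for d, k in models]
w = akaike_weights(scores)
print([round(x, 3) for x in w])  # → [0.986, 0.014, 0.0]
```

the evidence ratio between two models is simply the ratio of their weights, which is how fold-support figures such as those reported in the results are computed.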
we further evaluated the importance of each variable in the context of the set of models by adding up the weights (w) of all models containing a given variable [37]. we repeated this procedure for all predictors considered. variables with higher weights are considered more important than those with lower weights in explaining the variance observed in the data. parameter estimation. estimates for the parameters survival probability, probability of capture, probability of entry into the population, abundance and recruitment rate relied on the technique known as "model averaging" [38]. under this approach, we calculated the weighted average of parameter estimates from all models fitted to the data, using as weights the relative support (w) of the respective model. this technique therefore accounts for both sources of variance: the conditional variation present within each model and the unconditional variation arising from the model selection process. in this way, parameter estimates express more faithfully the sources of uncertainty associated with the estimation process. in time-dependent models under popan, not all parameters are identifiable [30]. this is the case for the probability of capture in the first and last captures (p 1 and p k ), the probability of entry between the first and second captures (b 1 ) and between the penultimate and last captures (b k-1 ), and the survival probability between the penultimate and last captures (φ k-1 ). thus, only the remaining parameters, whose estimation was possible, are described here. the effectiveness of sterilization was analyzed by comparing the evolution of abundance and of the other estimated parameters in areas a (control) and b (intervention). we estimated the dog:human ratio by dividing the human population size by the mean dog abundance in each of the areas. during the study period, 171 dogs were identified individually in region a (control) and 157 in region b (intervention).
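the model-averaging step described in the methods can be sketched as follows, using the standard weighted-average point estimate together with an unconditional standard error that adds the between-model spread to each model's own variance (all numbers below are hypothetical):

```python
import math

def model_average(estimates, ses, weights):
    """Model-averaged estimate plus an unconditional SE that combines
    within-model variance with between-model disagreement."""
    theta_bar = sum(w * t for w, t in zip(weights, estimates))
    se_uncond = sum(w * math.sqrt(se ** 2 + (t - theta_bar) ** 2)
                    for w, t, se in zip(weights, estimates, ses))
    return theta_bar, se_uncond

# hypothetical survival estimates from three competing models
est, uncond_se = model_average(estimates=[0.82, 0.78, 0.85],
                               ses=[0.03, 0.04, 0.05],
                               weights=[0.70, 0.20, 0.10])
print(round(est, 3), round(uncond_se, 3))  # → 0.815 0.038
```

note that the unconditional se (0.038) exceeds the best model's own se (0.03): disagreement among models is treated as an extra source of uncertainty, which is exactly the point of the technique.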
the proportion of males in areas a and b was 56% (96 dogs) and 62% (98 dogs), respectively. one hundred and thirty-three animals (77 males and 56 females) were captured in more than one capture effort. one hundred and thirty-eight dogs (88%) were sterilized. twenty-four were euthanized for testing positive for canl. sixty-six different individual capture histories were registered, and 38 of them included animals not captured in the first effort. all recaptures and visualizations took place in the same area where the dogs were initially detected. most free-roaming dogs were neighborhood dogs, i.e. dogs to which several human residents in the area provided the needed resources [3]. models including the variable "gender". we fitted 50 models containing the variable gender to the data (s3 appendix). five had w-statistics greater than 1% and are shown in table 1, along with the statistics qaicc and δqaicc (the absolute difference between the qaicc value of the best model and that of the analyzed model). the relative support for each model is also expressed as the ratio of its w-statistic to the largest value of this statistic among the five models considered. the model in which survival, probability of capture and probability of entry varied with time, but not between male and female dogs, was considered the most parsimonious (w = 73.24%). the weight of this model was 6.48 times higher than that of the model in which survival varied additively with gender, and 8.37 times higher than that of the model in which the probability of capture varied between genders. the other models had weights lower than 5% and low support when compared to the most parsimonious model. the sums of each variable's weights (w) considering all models (table 1) are presented in s5 appendix. time-dependent parameters displayed higher weights. survival and capture probabilities varied between genders in some models, but these variables’ "w" conferred only weak support to this variation.
for entrance probability, there was no evidence of variation between groups. models with the variable "area". models containing the variable "area" behaved similarly to the previous set containing the variable "gender". table 2 presents the six models in this group with w-statistics greater than 1%. results for the remaining models are described in s3 appendix. the model containing time-dependent parameters that were constant between the control (a) and intervention (b) areas was the most parsimonious in this group. its weight, however, was lower when compared to that of the corresponding model containing the variable "gender" (table 1). it had 3.11 times more support from the data than the model in which survival varied between areas, and 3.99 times more support than the model in which the probability of capture also varied between areas. differences were even larger when comparing the most parsimonious model to the remaining models, since the latter received even lower support from the data. we observed stronger weights associated with time-dependent variables (s5 appendix). however, the observed weights were lower than those associated with the variables in the set of models containing the variable "gender". we also observed stronger weights for the variables describing differences in survival and in probabilities of capture between areas. the remaining variables were associated with lower weights. in particular, the probabilities of entry did not vary between the control and intervention areas. models with the variable "gender". we estimated a population abundance of 148 females and 227 males in the target population over the entire study period. table 3 depicts gender-specific parameter estimates and their respective confidence intervals (ci). they result from model-specific estimates weighted by the relative support (w) of the respective model. taken together, these results show that gender-specific differences regarding the estimated parameters were not relevant.
in contrast, time-dependent differences were significant. survival probabilities increased steadily, going from 0.75 in the interval between the first and second captures to 0.99 between the fifth and sixth. on the other hand, the probability of entry into the population was close to zero between the fifth and the sixth captures, and varied between 0.12 and 0.15 in the other intervals. the probability of capture reached its highest value in the second capture (0.68) and decreased subsequently until the fifth capture, when it reached its lowest value (0.39). estimates of abundance highlight the predominance of males in the population. additionally, there was a higher entry of male dogs in all intervals in which this number could be estimated. the population increased in size during the study. we estimated the presence of approximately 59 females and 92 males in the second capture, and 71 and 69 females and 104 and 105 males, respectively, in the fifth and sixth captures. models with the variable "area". we estimated the presence of 199 dogs in area a (control) and 177 in area b (intervention) throughout the study period. estimates of additional parameters stratified by control and intervention areas are presented in table 4. they reflect the weighting mechanism by the relative support of all models analyzed, as explained in the methods section. analogously to the models containing gender, differences between the estimates of each area were small, even though they were slightly higher than those seen between genders. owing to the fact that stronger weights were attributed to models that did not show a difference between strata in either set of models, estimates of survival, capture probabilities and entry probabilities for the areas were similar to those described for gender. on the other hand, recruitment was similar in both areas, contrasting with our findings comparing males and females.
population size increased in both areas. abundances seen in the second capture were smaller, 82 animals in area a and 70 in area b, contrasting with the abundances observed in the fifth capture, 96 dogs in area a and 83 in area b. the dog:human ratio in area a was one dog to 42 human beings. in area b, this ratio was one dog to 51 humans. we estimated critical parameters (survival, recruitment and abundance) that describe the population dynamics of free-roaming dogs based on a capture and recapture study design and on models suitable for open populations. our study demonstrated the increase in population size in both areas, the predominance and greater recruitment of males, the temporal variability in recruitment and in survival probabilities, the lack of effect of sterilization on population dynamics, the influence of abandonment and of density-independent factors, and a high demographic turnover. such information on the dynamics of free-ranging dogs is useful for informing control interventions of unrestricted dog populations and against canine visceral leishmaniasis and rabies, both neglected tropical diseases endemic to various countries. the dog:man ratio observed in our study was smaller than that observed in counts performed in urban regions of nigeria (1 dog to 25 men) [39] and that among rural dog populations in india (1 dog to 35 men) analyzed by means of beck's method [40]. it was larger, however, than that obtained by hossain et al. [41] in a rural area of bangladesh. demographic, socioeconomic, environmental and cultural factors able to explain differences in abundances between and within regions have been underexplored in the literature [28]. the abundance of free-roaming dogs is, in general, lower in rural than in urban areas [42, 43]. regions under poorer socioeconomic conditions and with higher population densities tend to have a larger concentration of dogs [44].
in the present study, abundance possibly reflects the intermediate socioeconomic conditions, the urban environment and the low population density of the study areas, as well as the different methodology applied. for most animal species, survival is the demographic parameter with the highest impact on population size [45]. few studies, however, have aimed at estimating the survival of free-ranging dogs in urban environments. reece et al. [46] used data from a sterilization program to estimate the survival of castrated females in jaipur, india. annual survival of females aged over one year was 0.70, and of females in their first year of life, 0.25. the assumptions leading to these estimates were implausible and might have biased the results. pal [47] conducted four annual capture efforts in bengal, india, and estimated canine mortality from the number of dogs observed in the captures after the first one. annual survival for adult dogs was 0.91, and for dogs in their first year of life, 0.18. this study did not report capture probabilities and included in the estimation only dogs found dead. this approach possibly contributed to an overestimation of the survival probability. the survival probability reported by beck [1], in a study conducted in baltimore, united states, with dogs of all age groups, was 0.70. this author relied only on existing information regarding the number of dead dogs, also possibly leading to an underestimation of mortality and consequently to an overestimation of survival probability. although limited, the estimates obtained in the literature suggest that survival is lower in young free-roaming dogs [46, 47], a pattern already seen in different animal species [48, 49]. since the proper identification of the dogs’ ages was outside the scope of our project, the estimates in the present study refer to the general survival probability of the population and not to age-specific probabilities.
annual survival in our study was higher than that estimated for dogs aged less than one year [46, 47], and lower than the survival probability estimated for adult dogs [46, 47] and for beck's study population [1]. the low survival probability identified in the population results from the different sources of mortality experienced by free-roaming dogs in the study setting. residents often reported roadkill and poisoning episodes during the study period. the high prevalence of canl-seropositive dogs, especially in the first months of the study, is another relevant factor, leading to the removal of many animals by euthanasia. additionally, government actions towards street dogs were restricted to rabies vaccination. the lack of additional prophylactic measures or treatment may have contributed to the increased susceptibility of dogs to infections and other conditions. females have lower survival rates in a large number of animal species [50, 51], due primarily to the effects of reproduction. given the predominance of males in different studies, it is hypothesized that this pattern also occurs in canine populations [52]. we observed no difference in survival probabilities between genders, although a higher abundance and recruitment of males occurred. most pet owners prefer male dogs, since they do not get pregnant and are better guard dogs [28, 41]. therefore, the higher survival of male puppies of owned but free-ranging dogs, or of pet dogs subsequently abandoned by their owners, may explain the predominance of males in the free-roaming dog population. to our knowledge, we report for the first time the temporal evolution of the survival probability of free-roaming dogs. annual point estimates of survival probability found in the literature do not bear a longitudinal structure. our results show that survival of unrestricted dogs displays variations, even on short temporal scales.
among the models fitted to the data, those in which survival did not vary with time had significantly lower weights, indicating that a constant value is not appropriate for representing the entire period. estimates of survival probabilities in other mammal species also show a temporal dependence, especially in young individuals [53-55]. long-term studies are required to uncover the intrinsic and extrinsic determinants driving these temporal dependencies. this would be useful for understanding the population dynamics of free-ranging dogs and for improving the validity and precision of predictive modelling procedures. such studies are difficult to perform, and thus are rare in the literature [56]. despite the short duration of our study, survival, recruitment and population size displayed an increasing tendency. this pattern suggests that density-independent factors could be responsible for driving the variations observed in the survival probabilities of dogs in both areas. density-dependent mechanisms are the subject of several studies focusing on different animal species [57-60]. in epidemiological and ecological modeling, one assumes that survival and recruitment rates in free-ranging dogs are driven by the availability of resources in the environment, a density-dependent mechanism [61]. however, as pointed out by de little et al. [56], extrinsic factors not regulated by density may determine fluctuations in population size when those populations have not yet reached their carrying capacity or when environmental conditions are favorable. according to morters et al. [61], human beings are the major agents responsible for providing care and adequate food for dogs. as a result, human-related factors, such as living together with free-ranging dogs, the low dog:human ratio and the availability of residents’ resources to maintain these animals, may explain why the increase in density had no influence upon mortality and recruitment.
reducing the availability of shelters and food is an ethically questionable measure for the population control of free-roaming dogs. however, this alternative has been presented in a recent study [62]. it is not possible to affirm whether population growth, attributed to the large number of animals entering the population, would have maintained the reported increasing trend if the study had lasted longer. the maximum survival and the lack of recruitment between the fifth and the sixth captures suggest potential instabilities. in the presence of increasing abundance, density-dependent factors could start to play a stronger role in regulating the population [63, 64] and in the behavior of residents regarding their support for dogs. there is considerable uncertainty in assessing the role played by vital rates and by intrinsic and extrinsic factors in driving the population size of free-ranging dogs and other mammals [54, 61, 65, 66]. estimates of recruitment obtained from capture and recapture models do not allow us to disentangle the sources of entry attributable to births and immigration. we observed no females with litters during the study period. we might infer that breeding females were located in less visible areas or were put up for adoption by the city public service and returned to the streets after the lactation period, even though such records were rare. in the study of morters et al. [61], as well as in the present study, recruitment was driven predominantly by the arrival of adult animals. the recruitment contingent may comprise dogs born in the region and not identified as puppies, dogs from other regions that migrated to the study region or were relocated by residents who raised them unrestrictedly, previously restricted dogs that changed status to being freely raised, or dogs abandoned nearby that later joined the population.
the study areas are geographically isolated from their neighboring regions and are located next to a highway where dogs were frequently abandoned. therefore, the latter mechanism seems the more plausible account of the increase in population size, rather than the spontaneous immigration of dogs. although there are heterogeneities [67], free-ranging dogs are territorial animals that, in general, do not move across long distances unless forced by unfavorable environmental conditions [1]. the low mobility of dogs in a favorable environment is supported by our data, since there were no animal movements between areas a and b. the replacement of a great number of dogs that died or emigrated by dogs that are born or immigrate, as observed in our study populations, shapes the population structure and gives rise to the health problems that result from it. a population with a high turnover may be more susceptible to diseases [24]. a high population turnover is the major obstacle to the success of control strategies against rabies in developing countries [68]. vaccination strategies under such population dynamics must occur at short intervals and achieve high coverage in order to maintain proper levels of immunization. on the other hand, the replacement of euthanized dogs by susceptible animals and the entry of new individuals into the reservoir compartment are the main causes of the low effectiveness of the euthanasia of seropositive dogs, a control strategy adopted in brazil against leishmaniasis [69]. in addition, the population also becomes younger and more likely to acquire other infections under the high-turnover regime [70]. the field of mammal ecology identifies two main reproductive strategies driving population size, each focusing on specific stages of the life cycle. the so-called "slow breeding" animals experience late maturation, and their reproductive strategy depends on the survival of juveniles and young adults.
on the other hand, "fast breeding" mammals complete their reproductive cycle within their first year of life and place emphasis on fertility as their survival strategy as a species [71, 72]. control of the population size of "fast breeding" animals, such as dogs [73], is more effective when relying on measures that restrict the entry of new individuals into the population, as opposed to subjecting animals to euthanasia, a practice that reduces adult survival. the fast versus slow breeding rationale, the sensitive ethical issues, and the low effectiveness of euthanizing animals observed in regions where this practice has been applied [20, 61, 74] prompted us to consider only sterilization, and not culling, as an alternative control strategy in our study. the use of culling as a population control measure against hydatid disease in developing countries, however, has been recommended [19]. it is worth noting that in the present study sterilization did not affect the canine population dynamics. after one year and two months, we observed no difference in survival, entrance or recruitment probability between the control region and the intervention area where 88% of the dogs were sterilized. the impact of sterilization emerges slowly, as suggested by modeling exercises. it might take up to five years for the first impact of sterilization to become apparent and up to 30 years of uninterrupted effort to reach its maximum impact [75]. reece and chawla [24] evaluated a program that surgically sterilized 19,129 neighborhood dogs in jaipur, india, over eight years and showed that the population declined by only 28 per cent. on the other hand, frank and carlisle-frank [76] observed only a small impact of a sterilization program on the number of dogs joining a shelter in the united states. amaku et al.
[77], based on results from a mathematical model developed specifically for stray dogs, concluded that sterilization becomes inefficient in the presence of high abandonment rates, even after prolonged periods of use. natoli et al. [78] reached the same conclusion after studying, over 10 years, the impact of a castration-and-return program on an unrestricted cat population. continuous negligent practices of animal ownership, including abandonment, had a negative impact on the sterilization strategy, rendering it ineffective and countering the effect of the 8,000 surgical interventions undertaken in that study. finally, in a study conducted in brazil, dias et al. [79] concluded that it is counter-productive to invest in sporadic sterilization campaigns for owned dogs, the strategy currently adopted in most brazilian municipalities. the small impact on population size (especially in areas with high abandonment rates, as in our case), the need to reach high coverage rates without interruption, the absence of behavioural benefits for castrated dogs [80], the high costs, and the null impact from a short-term perspective all minimize the relevance of sterilizing free-ranging dogs for managing the population and controlling diseases. in this context, it becomes apparent that public health services and non-governmental organizations must develop and prioritize more effective strategies against abandonment practices. in countries where free-ranging dogs are considered a humanitarian or a public health issue, the implementation of educational programs addressing responsible animal ownership at different levels, the registration of dogs and their owners, and the improvement of legislation aimed at those who wish to have a pet become imperative [81]. probability of capture is a useful parameter in the identification of essential population features [82]. it is known to vary in space, in time and among individuals [83].
although we observed no significant differences in this parameter between genders and areas, it varied over time even in the presence of standardized procedures. such fluctuations may be attributable to social organization features not yet investigated in the population, or to environmental and climatic factors. dias et al. [84] showed that weather exerts an influence on dogs' activity and consequently influences the probability of finding a dog in a given capture effort. in our study, even with the vehicles driving along all the streets of the target regions, a large number of animals were present but not sighted in every capture. our observations indicate that individual counts based on a census do not adequately estimate the abundance of unrestricted dogs and that the majority of the estimates available in the literature carry important biases. different studies aimed at estimating the abundance of free-roaming dogs did not model, or even consider, the existence of differences in the probabilities of dog detection [28]. as pointed out before [28], counting techniques should be carried out only over short periods of time and only when no other alternative is available given the logistics, geography and culture of the study region. the values of capture probabilities obtained in the present study are similar to those estimated by kalati in a population of urban free-ranging dogs in kathmandu [85] and may be used as correction factors for previously published estimates of abundance. in addition to the limitations already mentioned regarding the observation and sampling techniques applied in capture and recapture studies, issues related to the choice of an appropriate analytical methodology deserve mention. environmental and individual variables relevant to understanding the population dynamics [30] were not included in our models.
the logistics of fieldwork turned out to be complex and demanding, requiring the participation of at least six people in each capture effort and long fieldwork journeys. direct contact with animals was unavoidable, since assessing the effectiveness of sterilization was one of the study objectives. studies assessing only the population dynamics of dogs, however, could rely solely on photographic methods, which are less complex and less onerous [86, 87]. our choice of modeling procedures for open populations allowed the estimation of survival and recruitment probabilities of unrestricted dogs. in addition, we tested the data for the statistical assumptions required by each model. model selection followed the aic technique, which compares favorably with classical statistical hypothesis testing [37, 88-90]. lastly, our parameter estimates and confidence intervals express more faithfully the sources of uncertainty present in the whole estimation process, owing to the use of the "model averaging" technique. the analytical procedures adopted here address methodological limitations of previous publications and propose a new starting point for future studies. in our view, longer periods of observation, larger sample sizes, and the choice of more varied study settings, including different social, cultural and geographic characteristics, are important topics that need the attention of researchers in the field of unrestricted dogs' ecology. the agenda for investigating factors that influence canine population dynamics must consider the variables addressed in the present study, and further consider the stratification of these population parameters by age group, as well as by intrinsic animal features and environmental conditions not yet investigated. our estimates of population size in the studied regions were, in general, small compared to previous estimates in the literature.
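the aic-based model selection and "model averaging" workflow described above can be sketched numerically. the snippet below is a generic illustration, not code from the study: the aic scores are hypothetical, and `akaike_weights` implements the standard akaike-weight construction used for model averaging.

```python
import math

def akaike_weights(aics):
    """Turn AIC scores into Akaike weights.

    delta_i = AIC_i - min(AIC); w_i = exp(-delta_i/2) / sum_j exp(-delta_j/2).
    The weights can then be used to average parameter estimates across
    candidate models, as in the "model averaging" technique.
    """
    best = min(aics)
    rel = [math.exp(-(a - best) / 2.0) for a in aics]
    total = sum(rel)
    return [r / total for r in rel]

# Hypothetical AIC scores for three candidate capture-recapture models.
weights = akaike_weights([210.4, 212.1, 218.9])
```

here the first model carries most of the weight, but the second is not negligible — exactly the situation in which model-averaged estimates and confidence intervals are preferable to picking a single best model.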
survival probability was small and the probability of animal entry into the population was high during the 14-month follow-up period. high turnover, attributed mostly to the abandonment of pet dogs, has important implications for the population composition and the control of zoonoses. estimates of survival, recruitment and capture probabilities varied over time. survival and recruitment showed an increasing tendency. mortality patterns did not differ between genders. the probability of entry into the population was higher among males. the observed population dynamics seem to be driven by density-independent factors. sterilization, in turn, had no influence upon the parameters analyzed. our observations are useful for a better understanding of the population dynamics of free-roaming dogs and may aid in the planning, design and evaluation of population control actions. in this context, it becomes imperative that public health services and nongovernmental organizations develop educational training programs addressing responsible animal ownership and better strategies against abandonment practices. parameter estimates may also be used as input to new predictive mathematical models. even though our study generated important answers and new hypotheses, the scarcity of existing knowledge and the misuse of proper methodology leave numerous relevant questions about the population dynamics of free-roaming dogs yet to be elucidated.

references (titles as extracted):
- the ecology of stray dogs: a study of free-ranging urban animals
- a systematic review and meta-analysis of the proportion of dogs surrendered for dog-related and owner-related reasons
- behavioural changes in sheltered dogs. annali della facoltà di medicina veterinaria di pisa
- demographics and economic burden of un-owned cats and dogs in the uk: results of a 2010 census
- a systematic review and meta-analysis of the factors associated with leishmania infantum infection in dogs in brazil
- parasites of importance for human health in nigerian dogs: high prevalence and limited knowledge of pet owners
- dog bites in humans and estimating human rabies mortality in rabies endemic areas of bhutan
- assessing human-dog conflicts in todos santos, guatemala: bite incidences and public perception
- free-roaming dog populations: a cost-benefit model for different management options
- dogs, cats, parasites, and humans in brazil: opening the black box
- dog population management for the control of human echinococcosis
- the epidemiology of free-roaming dog and cat populations in the wellington region of new zealand
- urban rabies
- transmission dynamics and economics of rabies control in dogs and humans in an african city
- estratégias adicionais no controle populacional de cães de rua [additional strategies for the population control of street dogs]
- control of rabies in jaipur, india, by the sterilisation and vaccination of neighbourhood dogs
- impact of publicly sponsored neutering programs on animal population dynamics at animal shelters: the new hampshire and austin experiences
- free-roaming dog control among oie member countries
- approaches to canine health surveillance
- population estimation methods for free-ranging dogs: a systematic review
- métodos para estimativas de parâmetros populacionais por captura, marcação e recaptura [methods for estimating population parameters by capture, marking and recapture]
- analysis and management of animal populations
- secretaria de vigilância em saúde. departamento de vigilância epidemiológica
- infecção por leishmania em uma população de cães: uma investigação epidemiológica relacionada ao controle da leishmaniose visceral [leishmania infection in a dog population: an epidemiological investigation related to visceral leishmaniasis control]
- secretaria de vigilância em saúde. departamento de vigilância epidemiológica. normas técnicas de profilaxia da raiva humana. 1st ed. brasília: editora ministério da saúde
- a general methodology for the analysis of capture-recapture experiments in open populations
- u-care: utilities for performing goodness of fit tests and manipulating capture-recapture data
- estimability and likelihood inference for generalized linear mixed models using data cloning
- model selection and multi-model inference: a practical information-theoretic approach
- information-theoretic model selection and model averaging for closed-population capture-recapture studies
- studies on dog population and its implication for rabies control
- assessing demographic and epidemiologic parameters of rural dog populations in india during mass vaccination campaigns
- a survey of the dog population in rural bangladesh
- free-roaming dog population estimation and status of the dog population management and rabies control program in dhaka city
- demography of domestic dogs in rural and urban areas of the coquimbo region of chile and implications for disease transmission
- spacing and social organization: urban stray dogs revisited
- is survivorship a better fitness surrogate than fecundity?
- fecundity and longevity of roaming dogs in jaipur
- population ecology of free-ranging urban dogs in west bengal
- age, sex, density, winter weather, and population crashes in soay sheep
- sex- and age-dependent effects of population density on life history traits of red deer cervus elaphus in a temperate forest
- the evolution of parental care
- effects of maternal care on the lifetime reproductive success of females in a neotropical harvestman
- a seroepidemiologic survey of canine visceral leishmaniosis among apparently healthy dogs in croatia
- persistent instability and population regulation in soay sheep
- temporal changes in key factors and key age groups influencing the population dynamics of female red deer
- assessing the impact of climate variation on survival in vertebrate populations
- complex interplay between intrinsic and extrinsic drivers of long-term survival trends in southern elephant seals
- loss of density-dependence and incomplete control by dominant breeders in a territorial species with density outbreaks
- stochasticity and determinism: how density-independent and density-dependent processes affect population variability
- bayesian inference on the effect of density dependence and weather on a guanaco population from chile
- density-dependent spacing behaviour and activity budget in pregnant, domestic goats (capra hircus)
- the demography of free-roaming dog populations and applications to disease and population control
- defining priorities for dog population management through mathematical modeling
- population regulation in mammals: an evolutionary perspective
- population growth rate and its determinants: an overview
- population dynamics of large herbivores: variable recruitment with constant adult survival
- population dynamics of owned, free-roaming dogs: implications for rabies control
- domestic dog roaming patterns in remote northern australian indigenous communities and implications for disease modelling
- transmission dynamics and prospects for the elimination of canine rabies
- control of visceral leishmaniasis in latin america: a systematic review
- dog culling and replacement in an area endemic for visceral leishmaniasis in brazil
- life histories and elasticity patterns: perturbation analysis for species with minimal demographic data
- carnivora population dynamics are as slow and as fast as those of other mammals: implications for their conservation
- estudo do programa de esterilização das populações canina e felina no município de são paulo [study of the sterilization program for the dog and cat populations in the municipality of são paulo]
- efecto del sacrificio de perros vagabundos en el control de la rabia canina [effect of culling stray dogs on canine rabies control]
- an interactive model of human and companion animal dynamics: the ecology and economics of dog overpopulation and the human costs of addressing the problem
- analysis of programs to reduce overpopulation of companion animals: do adoption and low-cost spay/neuter programs merely cause substitution of sources?
- dynamics and control of stray dog populations
- management of feral domestic cats in the urban environment of rome (italy)
- dog and cat management through sterilization: implications for population dynamics and veterinary public policies
- effects of surgical and chemical sterilization on the behavior of free-roaming male dogs in
- stray dog and cat laws and enforcement in czech republic and in italy
- is heterogeneity of catchability in capture-recapture studies a mere sampling artifact or a biologically relevant feature of the population?
- revisiting the effect of capture heterogeneity on survival estimates in capture-mark-recapture studies: does it matter?
- size and spatial distribution of stray dog population in the university of são paulo campus, brazil
- street dog population survey, kathmandu: final report to wsp
- spot the match: wildlife photo-identification using information theory
- mark-recapture and mark-resight methods for estimating abundance with remote cameras: a carnivore case study
- choosing among generalized linear models applied to medical data
- null hypothesis testing: problems, prevalence, and an alternative
- model selection in ecology and evolution

the authors are grateful for the invaluable support provided by members of staff of the municipal health service and by veterinarians who participated in this study.

key: cord-278693-r55g26qw
title: new global dynamical results and application of several sveis epidemic models with temporary immunity
authors: wang, lianwen; liu, zhijun; guo, caihong; li, yong; zhang, xinan
date: 2021-02-01
journal: appl math comput
doi: 10.1016/j.amc.2020.125648
sha:
doc_id: 278693
cord_uid: r55g26qw

this work applies a novel geometric criterion for global stability of nonlinear autonomous differential equations, generalized by lu and lu (2017), to establish global threshold dynamics for several sveis epidemic models with temporary immunity, incorporating saturated incidence and nonmonotone incidence with psychological effect, and for an sveis model with saturated incidence and partial temporary immunity. incidentally, global stability for the sveis models with saturated incidence in cai and li (2009) and sahu and dhar (2012) is completely solved. furthermore, employing the dediscover simulation tool, the parameters in sahu and dhar's model are estimated with the 2009-2010 pandemic h1n1 case data of hong kong, china, and it is validated that the vaccination programme indeed avoided subsequent potential outbreak waves of the pandemic.
finally, global sensitivity analysis reveals that multiple control measures should be utilized jointly to cut down the peak of the waves dramatically and to delay the arrival of the second wave, among which timely vaccination is particularly effective. immunization is believed to be one of the most successful and cost-effective public health interventions [1], for instance in the worldwide eradication of smallpox and the sharp reduction in the annual morbidity of most other vaccine-preventable diseases, such as polio, measles, hepatitis b, yellow fever [2], cholera [3], mumps [4] and influenza [5-8]. currently, immunization saves 2-3 million lives yearly and prevents debilitating illness, disability and death from these diseases. however, it is estimated that 19.4 million infants failed to be reached by routine immunization services in 2018 [1]. owing to a low vaccination rate, the 2017-2018 seasonal influenza caused an estimated 45 million illnesses, 21 million medical visits, 810,000 hospitalizations and 61,000 deaths in the united states [9], and the current burden remains far from optimistic. fortunately, timely vaccination programmes played a core part in mitigating the 2009 pandemic (h1n1) [8] (ph1n1). take hong kong, china, for instance: the subsequent potential waves of the pandemic [10] were effectively mitigated by the launch of the ph1n1 vaccination programme for several priority groups [11], although the first wave could not be contained in time owing to the unavailability of a vaccine against the novel influenza strain [12] (see fig. 1). admittedly, immunization may not be once and for all, because vaccine-induced immunity is generally temporary, and so are disease-acquired and natural immunity; this is one of the major obstacles to eliminating such infectious diseases. vaccines rarely provide recipients with nearly life-long immunity against re-infection.
after being infected, susceptible individuals first become exposed but not infectious, and then become infectious. the successfully recovered individuals acquire disease-induced immunity. additionally, by virtue of natural immunity [13-15], a part of the exposed individuals fail to develop disease but acquire temporary immunity. for example, efficient innate immunity protects more than 90% of individuals infected with mycobacterium tuberculosis [14]. a recent study [15] has shown that, similar to seasonal influenza, most infection (up to 75%) with the pandemic h1n1 strain was asymptomatic and gave the infected individuals temporary immunity. nonlinear epidemic dynamical models incorporating both temporary immunity and latency, such as the seirs and sveis models in [16-19], have been developed to better understand the transmission dynamics of infectious diseases qualitatively and quantitatively. establishing their global asymptotic stability has been of great interest and remains challenging for researchers in infectious disease modelling aiming to identify effective control interventions; see [16-19]. while lyapunov function methods may become unsuitable for proving their global stability, the classical geometric approach for nonlinear autonomous differential equations based on additive compound matrix theory, developed by li and muldowney [20-22], has been successfully applied to these epidemic models [16, 17, 20-22]. for example, cai and li [16] proposed the following nonlinear seiv epidemic model with temporary immunity: where the total population $N$ consists of susceptible ($S$), latent ($E$), infectious ($I$) and vaccinated-recovered ($V$) classes. the nonlinear incidence $\beta S I/\phi(I)$, with $\phi(0)=1$ and $\phi'(I)\ge 0$, generalizes the saturated incidence $\beta S I/(1+\kappa I)$ and the nonmonotone incidence capturing the psychological effect $\beta S I/(1+\kappa I^2)$ [23, 24].
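since the display equations of models (1.1)-(1.3) are not reproduced in this excerpt, the sketch below is only a hypothetical sveis-type system assembled from the compartments and parameters named in the text ($S, V, E, I$; $\Lambda, \mu, \alpha, \omega, \sigma, \gamma, \xi, \beta, \kappa$): saturated incidence, waning immunity from $V$ back to $S$, and recovery of both exposed (natural immunity, $\xi$) and infectious ($\gamma$) individuals into $V$. all parameter values are invented for illustration; the flows are chosen so that the total population obeys $dN/dt = \Lambda - \mu N$.

```python
def sveis_rhs(y, p):
    """Right-hand side of a hypothetical SVEIS-type model with saturated incidence."""
    S, V, E, I = y
    Lam, mu, alpha, omega, sigma, gamma, xi, beta, kappa = p
    inc = beta * S * I / (1.0 + kappa * I)           # saturated incidence g(I)*S
    dS = Lam - inc - (mu + alpha) * S + omega * V    # recruitment, vaccination, waning
    dV = alpha * S + gamma * I + xi * E - (mu + omega) * V
    dE = inc - (mu + sigma + xi) * E
    dI = sigma * E - (mu + gamma) * I
    return (dS, dV, dE, dI)

def rk4(f, y0, p, h, steps):
    """Classic fourth-order Runge-Kutta integrator."""
    y = list(y0)
    for _ in range(steps):
        k1 = f(y, p)
        k2 = f([y[i] + 0.5 * h * k1[i] for i in range(4)], p)
        k3 = f([y[i] + 0.5 * h * k2[i] for i in range(4)], p)
        k4 = f([y[i] + h * k3[i] for i in range(4)], p)
        y = [y[i] + h / 6.0 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
             for i in range(4)]
    return y

# Hypothetical parameters: (Lam, mu, alpha, omega, sigma, gamma, xi, beta, kappa).
params = (100.0, 0.01, 0.05, 0.08, 0.5, 4.3, 0.3, 1e-4, 0.01)
state = rk4(sveis_rhs, (9000.0, 500.0, 100.0, 50.0), params, 0.01, 2000)
```

because all recruitment and removal terms cancel pairwise, summing the four equations gives $dN/dt = \Lambda - \mu N$, so the total population drifts monotonically toward $\Lambda/\mu$ — a quick conservation check on any implementation of this model class.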
following the work of [16], sahu and dhar [17] further developed model (1.2) (two rows of table 1 spilled into the text here: $\alpha$, vaccination rate, unit m$^{-1}$, range [0,1]; $\xi$, recovery rate of the exposed class due to natural immunity, unit m$^{-1}$, range [3,30]), where the susceptible class is vaccinated with a certain vaccine at constant rate $\alpha$, different from model (1.1), in which a fraction of newborns (denoted by $p$) is vaccinated. we always assume that the same parameter carries the identical biological meaning throughout this paper, and the detailed biological descriptions of the parameters of model (1.2) are given in table 1. note that [16, 17] applied the geometric approach based on the second additive compound matrix theory of [20] to the corresponding limiting systems and achieved global stability of the unique endemic equilibrium (ee) under the vaccination reproduction number $R_v > 1$ and some additional restrictions. more recently, lu and lu [18, 19] improved the classical geometric approach of [20-22] and generalized the geometric criterion for the global-stability problem, applying it to several nonlinear seirs models and successfully removing some restrictive conditions on the global stability of their ee. borrowing the ideas of [16, 17, 23, 24], we establish the following sveis epidemic model with general nonlinear incidence: (1.3) in which it is assumed that vaccine-induced, disease-acquired and natural immunity may last nearly the same time for some diseases like influenza, and the differentiable infection force function $g$ possesses the following properties reflecting biological significance: (p1) $g \in C^1: \mathbb{R}_+ \to \mathbb{R}_+$, $g(0) = 0$, $g(I) > 0$ for $I > 0$. (p2) $g(I)/I$ is monotonically nonincreasing for $I > 0$, and $\lim_{I \to 0^+} g(I)/I := \beta < +\infty$. (p3) $I\,|g'(I)| \le g(I)$ for $I > 0$. it is worth highlighting that the saturated and nonmonotone incidences in [23, 24], $\beta S \ln(1+\kappa I)$ [29] and $\beta S I/(1+\kappa I+\sqrt{1+2\kappa I})$ [30], but not only these, fulfill (p1)-(p3); thus we lift the restrictions on the monotonicity of $g(I)$ in spite of the introduction of (p3).
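properties (p1)-(p2) can be checked numerically for the two incidence-force shapes named above. the functions and the values $\beta = 0.5$, $\kappa = 0.1$ below are illustrative choices, not parameters from the paper.

```python
def saturated(I, beta=0.5, kappa=0.1):
    """Saturated incidence force: g(I) = beta*I / (1 + kappa*I)."""
    return beta * I / (1.0 + kappa * I)

def nonmonotone(I, beta=0.5, kappa=0.1):
    """Nonmonotone force with psychological effect: g(I) = beta*I / (1 + kappa*I^2)."""
    return beta * I / (1.0 + kappa * I * I)

# (p2) requires g(I)/I to be nonincreasing with limit beta as I -> 0+.
grid = [10.0 ** (-k) for k in range(6, 0, -1)] + [1.0, 5.0, 20.0]
ratios_sat = [saturated(I) / I for I in grid]
ratios_non = [nonmonotone(I) / I for I in grid]
```

note that the nonmonotone force itself rises and then falls (its peak is at $I = 1/\sqrt{\kappa}$), which is what lets the psychological effect damp transmission at high prevalence even though $g(I)/I$ still satisfies (p2).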
with this geometric criterion of [18], we shall thoroughly address the global threshold dynamics of models (1.3) and (1.2), characterized by their vaccination reproduction numbers. incidentally, the unnecessary restrictions in both theorem 4 of [16] and theorem 5.5 of [17] are completely removed, since model (1.3) reduces to model (1.1) if $g(I) = \beta I/\phi(I)$ and $\xi = 0$. of particular note is that we achieve global asymptotic stability for model (1.1) of [16] with the nonmonotone incidence reflecting the psychological effect, which also preserves threshold dynamics. furthermore, as an application of model (1.2), the reported ph1n1 case data of hong kong, china [12] are utilized to estimate its parameters, with the aim of accounting for the avoidance of the subsequent potential waves of the pandemic in 2010 (as predicted by who [10]) through the ph1n1 vaccination programme. meanwhile, several disease-control measures are evaluated in terms of a global sensitivity analysis of the vaccination reproduction number. in particular, this study arrives at the conclusion that the joint use of multiple control measures, such as isolation, vaccination and treatment, can more effectively cut down the peak of the waves and at the same time dramatically delay the arrival of the second wave. the outline of this paper is summarized as follows. in section 2, we offer insight into the global threshold dynamics of model (1.3), including the existence, local and global asymptotic stability of its equilibria. section 3 completely addresses the global dynamics of model (1.2). section 4 performs parameter estimation and global sensitivity analysis for the vaccination reproduction number of model (1.2) with the purpose of seeking effective control measures. finally, we close the paper with a conclusion and discussion section. the feasible region is a positively invariant set, by arguments similar to those in [16].
apparently, the disease-free equilibrium (dfe) follows; thus, by application of the next-generation matrix approach of [31], the vaccination reproduction number (see, e.g., [32, 33]) is calculated, clearly remaining the same as for the model in [16] when $\xi = 0$. by some direct but tedious algebraic operations, it can be deduced that the $I^*$ component of the ee $P^* = (S^*, V^*, E^*, I^*)$ is determined by eq. (2.2). in what follows, we focus mainly on analyzing the positive real solutions of eq. (2.2). in the case $R_v > 1$, together with $\mathcal{G}'(0) > 0$, $\mathcal{G}(0) = 0$ and $\mathcal{G}(S^0/a) = -bS^0/a < 0$, it can be revealed that $\mathcal{G}(I) > 0$ when $I$ is sufficiently small, guaranteeing the existence of a positive real root of eq. (2.2), as seen from fig. 2, denoted by $I^*$. its uniqueness is verified by contradiction as follows. another positive solution $I_*$ of (2.2) nearest to $I^*$, if it exists, must satisfy $\mathcal{G}'(I_*) \ge 0$ owing to the continuity of $\mathcal{G}(I)$. actually, together with $g'(I_*) \le g(I_*)/I_*$ deduced from (p3), we arrive at a contradiction, where one utilizes the equality $b = S^* g(I^*)/I^*$ derived from the equations that the ee satisfies; the contradiction is evident in fig. 2. thus the positive solution $I^*$ is unique, which leads to the uniqueness of $S^*, V^*, E^*$ from the analysis above. in the case $R_v \le 1$, eq. (2.2) admits no positive real root. proof. the jacobian matrix of model (1.3) takes a form all of whose eigenvalues possess negative real parts, where $a_1 := \mu+\gamma$, $a_2 := \mu+\sigma+\xi$, $a_3 := \mu+\omega+g(I^*)$, $a_4 := \mu+\gamma+\sigma$, $a_5 := \mu+\omega$. clearly, $\lambda_1 = -\mu < 0$. case i. let $g'(I^*) > 0$. one asserts that all eigenvalues of the corresponding characteristic equation obey $\mathrm{re}\,\lambda < 0$. combining cases i and ii leads to the local stability of $P^*$ for $R_v > 1$. proof.
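the existence argument above reduces to a sign change of a scalar function: $\mathcal{G}$ is positive near $0$ and negative at the right endpoint, so a root $I^*$ exists and, by the monotonicity supplied by (p3), is unique. the bisection sketch below illustrates this on a stand-in $G$ with the same qualitative shape; it is not the actual $\mathcal{G}$ of eq. (2.2), whose coefficients are not reproduced in this excerpt.

```python
def bisect(G, lo, hi, tol=1e-10, itmax=200):
    """Locate a root of G in [lo, hi]; requires G(lo) and G(hi) of opposite sign."""
    glo, ghi = G(lo), G(hi)
    assert glo * ghi < 0.0, "root must be bracketed"
    for _ in range(itmax):
        mid = 0.5 * (lo + hi)
        gm = G(mid)
        if gm == 0.0 or (hi - lo) < tol:
            break
        if glo * gm < 0.0:
            hi, ghi = mid, gm
        else:
            lo, glo = mid, gm
    return 0.5 * (lo + hi)

# Stand-in G(I): per-capita saturated force minus a constant removal pressure.
# Positive near 0, negative for large I, with a single root at I* = 15.
G = lambda I: 0.5 / (1.0 + 0.1 * I) - 0.2
I_star = bisect(G, 1e-9, 100.0)
```

because the per-capita force is nonincreasing (property (p2)), the stand-in $G$ crosses zero exactly once, mirroring the uniqueness argument in the text.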
by the first equation of (1.3) and $S+V+E+I \le \Lambda/\mu$, it is easy to bound $dS/dt$, which asserts that $S \le S^0$ (similar to [4]). otherwise, suppose $S > S^0$; then $dS/dt < 0$. it follows that $S \le S^0$ whenever $S(0) \le S^0$, which contradicts our assumption. hence, our claim $S \le S^0$ is valid. observe that $g(I)/I \le \beta$ for $I > 0$ can be ensured by (p3) (see, e.g., [34]). construct the lyapunov function $W(t) = E + (\mu+\sigma+\xi)I/\sigma$; the time derivative of $W(t)$ along the solutions of model (1.3) can then be estimated, and from lasalle's invariance principle [35] and the local stability of $P^0$ in theorem 2.2, we derive its global asymptotic stability for $R_v \le 1$. in the sequel, we shall employ the general criterion for the global stability of autonomous differential equations developed in [18] to establish global stability of the ee $P^*$ of model (1.3). a brief outline of this geometric approach [18, 20-22] is presented as follows. let us consider the nonlinear autonomous dynamical system (2.9), where the function $f(x) \in C^1: Q \to \mathbb{R}^n$ and $Q$ is an open set. for (2.9), denote the solution with initial value $x_0$ by $x(t, x_0)$ and its equilibrium by $x^*$. moreover, let us assume that the following three hypotheses are satisfied: (h1) $Q$ is simply connected. (h2) there is a compact absorbing set $D \subset Q \subset \mathbb{R}^n$. (h3) system (2.9) admits a unique equilibrium $x^*$ in $Q$. the general geometric criterion of lu and lu is recapped as follows ([18]): the unique equilibrium $x^*$ of (2.9) is globally asymptotically stable (gas) in $Q$ provided that (h1)-(h3) and the following condition (c) hold. (c) for the coefficient matrix $B(x(t, x_0))$ of system (2.9), there are a matrix $C(t)$, a sufficiently large $\tau_1 > 0$ and constants $\rho_1, \rho_2, \ldots, \rho_n > 0$ such that $$b_{ii}(t) + \sum_{j \neq i} \frac{\rho_j}{\rho_i}\,|b_{ij}(t)| \le c_{ii}(t) + \sum_{j \neq i} \frac{\rho_j}{\rho_i}\,|c_{ij}(t)|, \quad \forall\, t \ge \tau_1,\ \forall\, x_0 \in D, \qquad (2.10)$$ and $$\lim_{t \to \infty} \frac{1}{t} \int_0^t \Big[ c_{ii}(s) + \sum_{j \neq i} \frac{\rho_j}{\rho_i}\,|c_{ij}(s)| \Big]\, ds = c_i < 0, \qquad (2.11)$$
where $b_{ij}(t)$ and $c_{ij}(t)$ stand for the entries of the matrices $B(x(t, x_0))$ and $C(t)$, respectively. denote the interior and the boundary of $\Omega$ by $\mathring{\Omega}$ and $\partial\Omega$, respectively. the uniform persistence of model (1.3) in $\mathring{\Omega}$ for $R_v > 1$ can be deduced from the instability of $P^0$ and $P^0 \in \partial\Omega$. proof. the third additive compound matrix $J^{[3]}$ [22] for model (1.3) can be written down; following [22], it turns out that $n(x) = \nu(x) = -\mu$ and $m = \dim(\partial M/\partial x) = 1$. in the sequel, let $P(x) = \mathrm{diag}\{I, E, V, S\}$ and let $I_{4\times 4}$ be the $4 \times 4$ identity matrix. then the coefficient matrix is $B(t) = P_f P^{-1} + P J^{[3]} P^{-1}$ (2.12). note that theorem 2.4 implies that there is a constant $\pi_0 > 0$ such that $\pi_0 \le S, V, E, I \le \Lambda/\mu$. it follows from (p1) that there are constants $l, L > 0$ such that $l \le g'(I) \le L$. assign $\pi := \mu\pi_0/\Lambda$. by $I\,|g'(I)| \le g(I)$ in (p3) and (2.12), the terms $c_i(t)$ are estimated accordingly, and the matrix $C(t)$ in lemma 2.1 is chosen correspondingly. remark 2.1. let $\xi = 0$ and $g(I) = \beta I/(1+\kappa I)$; then model (1.3) reduces to the model with saturated incidence of [16], which retains global threshold dynamics from theorem 2.5, improving theorem 4 of [16]. more importantly, the sharp threshold dynamics result is extended to the model of [16] with the nonmonotone incidence capturing the psychological effect. in this section, for simplicity, we take $g(I) := \beta I/(1+\kappa I)$, satisfying (p1)-(p3). in what follows, we make a thorough inquiry into the global stability of model (1.2). arguments similar to the analysis of theorems 2.3-2.4 in subsection 2.3 lead to the global stability of the dfe and the persistence of model (1.2), as follows. in order to achieve global stability of the ee, we focus mainly on the significant differences and skip the parts repeated from the proof of theorem 2.3 in subsection 2.3. the coefficient matrix $B(t)$ for model (1.2) is calculated analogously; applying lemma 2.1, the result is concisely stated as theorem 3.5.
a model with other incidences, e.g., $\beta S \ln(1+\kappa I)$ [29] or $\beta S I/(1+\kappa I+\sqrt{1+2\kappa I})$ [30], also preserves global threshold stability by the same proof. from the analysis in sections 2 and 3, it can be similarly verified that the following sveis model with temporary immunity and nonlinear incidence satisfying (p1)-(p3) is a sharp threshold system characterized by its vaccination reproduction number. vaccination was the most cost-effective intervention for mitigating the 2009-2010 influenza a(h1n1) pandemic. on 28 august 2009, who advised that countries in the northern hemisphere should prepare for a second wave of pandemic spread [10]. fortunately, the ph1n1 vaccination programme for five priority groups was launched, covering medical workers, pregnant women, people over 65 or with chronic illness, and children aged between 6 months and 6 years [11]. because the susceptible individuals aged over 6 months, rather than newborns, were vaccinated with the ph1n1 vaccine, and because up to 75% of h1n1 infections were asymptomatic owing to natural immunity [15], model (1.2) is applied in this section to illustrate that vaccination effectively contained the subsequent potential waves of the pandemic (h1n1) 2009 in hong kong, china. at the end of every month from may 2009 to october 2010, the ph1n1 case data of hong kong were released on the official website of the center for health protection, hong kong, china (available at https://www.chp.gov.hk/sc/statistics/data/10/26/43/416.html [12]), and the data from may 2009 to june 2010 are chosen to fit the parameter values of model (1.2) owing to their higher degree of smoothness (see fig. 1). indeed, the prevalence level from july to october 2010 showed only small fluctuations and remained low (see also [8]). the first wave of the pandemic could not be avoided (see fig. 1), since no vaccine against the novel influenza strain was available before 21 december 2009.
It was on that day that the pH1N1 vaccination programme for the five priority groups was launched [11] to minimize any potential second wave, and 4182 doses of pH1N1 vaccine were administered [36]. Notice that vaccine recipients develop immunity in about 15 days [7] (delayed vaccination, e.g., [2]), so the start time of generating vaccine-induced immunity can be approximated as 1 January 2010, as shown in Fig. 3(a). The intervals or values of the parameters and the initial condition of model (1.2) are estimated (as shown in Table 1) and explained as follows. (a) According to Subsection 4.1, we set the vaccination rate α = 0 during the 2009 pandemic, but α ∈ (0, 1] during the 2010 pandemic, from [5]. The vaccine effectiveness is up to 99% [37]; thus the vaccine is considered to be perfect. Let us take the infectious duration and the immunity period as 7 days [27, 28] and 1 year [6], respectively; then 1/γ = 0.2333 months and 1/ω = 12.1655 months. (d) The latent period (1/σ) ranges from 1 day to 5 days according to Refs. [5, 26-28]; then 1/σ ∈ [0.0333, 0.1667]. From [5, 26, 28], it may be realistic for influenza A(H1N1) to consider that exposed individuals recover after 1-10 days due to natural immunity, namely ξ ∈ [3, 30]. It is not hard to obtain that the values of the parameters q, β and κ belong to [0, 1] based on some existing works (e.g., [17, 23]). Above all, the values of the remaining parameters β, ξ, σ, q, κ and the initial values S(0), V(0) are estimated (see Table 1) with the 8 case data points from May to December 2009 by the DEDiscover simulation tool [39], where we choose the hybrid DESQP optimization algorithm, combining global differential evolution with local sequential quadratic programming. From the parameter estimation results above, the values κ = 1.3458 × 10⁻¹³ and q = 0.9287 tend to 0 and 1, respectively.
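The qualitative effect of the parameters just estimated can be illustrated with a minimal SVEIS-type simulation. The compartment flows and parameter values below are a plausible textbook arrangement chosen for illustration only; the paper's exact system (1.2), including the partial-immunity parameter q, is not reproduced in this excerpt. Time is in months (so a 7-day infectious period gives 1/γ = 7/30 ≈ 0.2333, i.e., γ ≈ 4.29), and the population is normalized to 1.

```python
# Hypothetical SVEIS sketch with saturated incidence g(I) = beta*I/(1 + kappa*I).
# Forward-Euler integration; birth rate Lambda = mu keeps the population at 1.
def simulate(alpha, beta=10.0, kappa=0.5, mu=0.01, omega=0.0822,
             sigma=6.0, gamma=4.29, xi=3.0, dt=0.01, t_end=50.0):
    S, V, E, I = 0.99, 0.0, 0.0, 0.01
    for _ in range(int(t_end / dt)):
        g = beta * I / (1 + kappa * I)
        dS = mu - g * S - (mu + alpha) * S + omega * V + gamma * I + xi * E
        dV = alpha * S - (mu + omega) * V          # vaccination and waning
        dE = g * S - (mu + sigma + xi) * E         # latency and natural recovery
        dI = sigma * E - (mu + gamma) * I
        S, V, E, I = S + dt * dS, V + dt * dV, E + dt * dE, I + dt * dI
    return S, V, E, I

I_no_vacc = simulate(alpha=0.0)[3]   # endemic level without vaccination
I_vacc = simulate(alpha=0.5)[3]      # vaccination pushes R_v below 1
print(I_vacc < I_no_vacc)  # True
```

With these (assumed) values, vaccination of susceptibles drives the infection toward extinction, mirroring the threshold behaviour established analytically.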
Several standard model selection criteria are employed to evaluate how well competing models fit the data [40], including the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), as well as their variants such as AICc, with smaller values corresponding to a better model. It can be observed from Table 2 that model (1.2) with κ = 0 and q = 0.9287 is selected as the best model by the criteria above, and its simulation results are presented in Fig. 3(a). This suggests that the simple mass-action incidence βSI may appropriately reflect the short-term transmission process of the emerging influenza A(H1N1) virus, and that partial temporary immunity should be incorporated into influenza models. Furthermore, we analyse the fitting error to evaluate the performance and reliability of the model with the parameter values of Table 1. The parameter estimation results above yield a vaccination reproduction number for 2009 of R̃_v = 1.4675 > 1, which is consistent with the conclusion in [28, 43] (ranging from 1.2 to 2.3). By Theorems 3.4 and 3.5, the disease may persist and become endemic. Without vaccination, as forecasted by WHO [10], the second wave is indeed observed in simulations using the estimated parameter values (see Fig. 3), whereas it could be averted if the vaccine is available. Furthermore, a vaccination rate α = 0.3527 is estimated with the case data from January to June 2010 (the other parameter values remain the same as in Table 1, and the initial condition (93492, 312020, 627, 1287) is the simulation result for December 2009), corresponding to R̃_v = 0.2801 < 1, so that the pandemic was contained quickly, as proved in Theorem 3.3 and shown in Fig. 3(a).
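The selection criteria mentioned above have simple closed forms under least-squares fitting with Gaussian errors. The sketch below shows the standard formulas (n data points, k estimated parameters, residual sum of squares rss); the specific numbers are illustrative, not the paper's Table 2 values.

```python
import math

# Information criteria for least-squares model selection; smaller is better.
def aic(rss: float, n: int, k: int) -> float:
    return n * math.log(rss / n) + 2 * k

def aicc(rss: float, n: int, k: int) -> float:
    # Small-sample correction; relevant here, since only 8 data points were fitted.
    return aic(rss, n, k) + 2 * k * (k + 1) / (n - k - 1)

def bic(rss: float, n: int, k: int) -> float:
    return n * math.log(rss / n) + k * math.log(n)

# A model whose extra parameter does not reduce RSS is penalized:
full = aic(rss=10.0, n=8, k=5)
reduced = aic(rss=10.0, n=8, k=4)   # e.g., fixing kappa = 0
print(reduced < full)  # True
```

This is why fixing κ = 0 (mass-action incidence) wins whenever the extra saturation parameter barely improves the fit.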
The vaccination reproduction number R̃_v of model (1.2), measuring the average number of secondary cases caused when one index case is introduced into a disease-free population [32, 33] in which a vaccination programme is carried out, may determine the transmissibility, severity and outcome of the pandemic. In order to seek effective disease-control measures, we are therefore concerned with the effects of the input parameters ω, β, α, γ, ξ on R̃_v. Based on Latin hypercube sampling (LHS) and partial rank correlation coefficients (PRCCs) [44], a global uncertainty and sensitivity analysis for R̃_v is conducted to reveal the degree of influence on model outcomes. The parameters of interest are assumed to obey normal distributions with means given by the baseline values in Table 1, and their PRCC values are computed through 5000 simulations per run and shown in Fig. 4(a) and Table 3. Finally, numerical simulations are carried out to evaluate the effectiveness of disease-control measures. In Table 3, the input parameters β, α, ω, γ, ξ are ranked in descending order according to their influence on new infections. In fact, it seems difficult to prolong the immunity duration related to the parameter ω; for this reason, we only consider the impacts of the parameters β, α and γ. In detail, β has a positive impact on R̃_v, while α and γ have negative impacts on it. Thus, we decrease the value of β by 10% and increase the value of γ by 10%, respectively. As discussed above, vaccination was such an effective health intervention that the H1N1 pandemic was successfully curbed in 2010. In view of the frequent outbreaks of current influenza A(H1N1), B and C epidemics in many countries with low vaccination rates, such as the United States [9] and China, it may be interesting and significant to assume that the vaccine is available and vaccination is carried out at the beginning of the pandemic. Vaccination rates of 10% and 20% of α = 0.3527 are used to study the effect of vaccination on the pandemic, while the other parameter values and initial values of Table 1 are fixed. Simulation results are presented in Fig. 4(b)-(d). Undoubtedly, reducing the disease transmission coefficient β (e.g., through epidemic propaganda, isolation, sterilization and mask wearing) cuts down the peak of the first wave and delays the arrival of the second wave, but the two peak values do not decrease markedly even though β is the most sensitive parameter (see Fig. 4(b)). On the other hand, increasing the vaccination rate α and shortening the disease course 1/γ (e.g., through antiviral therapy) lower the peak values of the first and second waves far more dramatically than reducing β, but the peak of the second wave then arrives much earlier (as shown in Fig. 4(c) and (d)). Therefore, it is possible for policymakers to use multiple control measures jointly during an influenza pandemic. It is also acknowledged that timely vaccination is particularly effective at reducing the outbreak peaks compared with the other two measures. Immunization has brought mankind great success in preventing disease transmission every year [1-8], and a long latent period of an infectious disease may generate dramatically different model predictions and thus cannot be neglected [26]. What is more, nonlinear incidence can reproduce the inhibition effect arising from behavioural changes of individuals and the impact of other factors such as the severity and stage of the infection [16, 17, 45]. The current work formulates an SVEIS model with vaccination, latency, nonlinear incidence and temporary immunity and establishes its global threshold stability by a novel geometric criterion from [18]. Most pointedly, the open questions on the global threshold stability of the EE for two nonlinear SVEIS models with saturated incidence in [16, 17] are also well addressed.
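The LHS-plus-PRCC workflow described above can be sketched compactly. The output function here is a toy reproduction-number-like quantity (increasing in β, decreasing in γ); the paper's actual R̃_v expression is not reproduced in this excerpt, so the bounds and function are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def lhs(n, bounds):
    """Latin hypercube sample: one point per equal-probability stratum per dimension."""
    d = len(bounds)
    u = np.empty((n, d))
    for j in range(d):
        u[:, j] = (rng.permutation(n) + rng.random(n)) / n
    lo, hi = np.array(bounds).T
    return lo + u * (hi - lo)

def prcc(X, y):
    """Partial rank correlation of each column of X with output y."""
    R = np.argsort(np.argsort(np.column_stack([X, y]), axis=0), axis=0).astype(float)
    out = []
    for j in range(X.shape[1]):
        Z = np.delete(R[:, :-1], j, axis=1)          # ranks of the other parameters
        Z = np.column_stack([np.ones(len(R)), Z])    # add intercept
        rx = R[:, j] - Z @ np.linalg.lstsq(Z, R[:, j], rcond=None)[0]
        ry = R[:, -1] - Z @ np.linalg.lstsq(Z, R[:, -1], rcond=None)[0]
        out.append(float(np.corrcoef(rx, ry)[0, 1]))
    return out

samples = lhs(500, bounds=[(0.1, 1.0), (0.1, 1.0)])  # columns: beta, gamma
y = samples[:, 0] / samples[:, 1]                    # toy stand-in for R_v
p_beta, p_gamma = prcc(samples, y)
print(p_beta > 0.5, p_gamma < -0.5)  # True True
```

The signs recover the paper's qualitative finding: β influences R̃_v positively and γ negatively.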
Inspired by [18], the introduction of property (P3) on the infectious force function g(I) leads us to successfully establish global threshold dynamics for SVEIS models with nonmonotone incidence reflecting the psychological effect. Furthermore, let g(I) = βI/ϕ(I); then an application of Theorem 2.5 yields that model (1.1) is a sharp threshold system provided that ϕ(I) satisfies ϕ(0) = 1 and 0 ≤ Iϕ′(I) ≤ 2ϕ(I), such as ϕ(I) = 1 + κI^r for 0 < r ≤ 2. In 2009, the novel influenza A(H1N1) virus caused the first pandemic of the 21st century. We apply model (1.2) to illuminate the avoidance of the potential second wave of the pandemic (H1N1) 2009 in Hong Kong, China (as predicted by [10]) under the pH1N1 vaccination programme, and it is revealed that timely vaccination is more effective at lowering the outbreak peaks than the other measures. This offers solid support for the implementation of immunization strategies to cope with the current global seasonal influenza burden, the surge in measles cases and the COVID-19 pandemic, if vaccines are available. This research is also subject to several limitations. In detail, observe that the HBV vaccine is administered to both newborns and susceptible individuals, so both vaccination routes can be incorporated into these SVEIS models, which, together with [4], we conjecture can still preserve the threshold dynamics: the insights provided by the several SVEIS models studied above inform us that vaccination of either newborns or susceptible individuals, as well as temporary immunity, fails to change their threshold stability (see Theorems 2.5 and 3.5 and Remark 3.2). Additionally, we only consider the nonlinearity of the incidence rate in I, perfect vaccines and a constant total population, and we postulate that vaccine-induced and disease-acquired immunity last the same time.
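The condition on ϕ stated above can be checked numerically. Since ϕ′(I) = κrI^(r-1) for ϕ(I) = 1 + κI^r, the requirement 0 ≤ Iϕ′(I) ≤ 2ϕ(I) becomes κrI^r ≤ 2(1 + κI^r), which holds for all I ≥ 0 exactly when r ≤ 2; the helper below is an illustrative grid check, not part of the paper.

```python
# Check phi(0) = 1 and 0 <= I * phi'(I) <= 2 * phi(I) on a grid,
# where I * phi'(I) = kappa * r * I**r for phi(I) = 1 + kappa * I**r.
def condition_holds(kappa: float, r: float, grid) -> bool:
    if 1 + kappa * 0**r != 1:       # phi(0) must equal 1 (assumes r > 0)
        return False
    return all(0 <= kappa * r * I**r <= 2 * (1 + kappa * I**r) for I in grid)

grid = [i / 10 for i in range(1, 2001)]               # I in (0, 200]
print(condition_holds(kappa=0.5, r=2.0, grid=grid))   # True
print(condition_holds(kappa=0.5, r=3.0, grid=grid))   # False (fails for large I)
```

For r > 2 the term κrI^r eventually outgrows 2(1 + κI^r), so the sharp-threshold hypothesis is violated, consistent with the stated bound 0 < r ≤ 2.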
It would be interesting to introduce a more general incidence S^ϱ f(I) (ϱ > 0), distinct vaccinated (V) and recovered (R) classes, incomplete vaccination and a varying total population size (e.g., [4, 18, 19, 21, 45]) to improve the accuracy of model prediction. Certainly, more analytical techniques are needed, and these issues are left as future investigations.
- Modelling the large-scale yellow fever outbreak in Luanda, Angola, and the impact of vaccination
- Transmission dynamics of cholera: mathematical modeling and control strategies
- Global dynamics of an SVEIR epidemic model with distributed delay and nonlinear incidence
- Mathematical model of transmission dynamics and optimal control strategies for 2009 A/H1N1 influenza in the Republic of Korea
- Prevention and control of seasonal influenza with vaccines: recommendations of the Advisory Committee on Immunization Practices - United States
- Antibody dynamics of 2009 influenza A (H1N1) virus in infected patients and vaccinated people in China
- Factors affecting intention to receive and self-reported receipt of 2009 pandemic (H1N1) vaccine in Hong Kong: a longitudinal study
- Estimated influenza illnesses, medical visits, hospitalizations, and deaths in the United States
- WHO, Preparing for the second wave: lessons from current outbreaks
- Department of Health Hong Kong, Human swine influenza vaccination programme launched
- Number of notifiable infectious diseases by month
- Immunity in infective diseases, Binnie, F.G. (transl.)
- Modern infectious disease epidemiology
- Comparative community burden and severity of seasonal and pandemic influenza: results of the Flu Watch cohort study
- Analysis of a SEIV epidemic model with a nonlinear incidence rate
- Analysis of an SVEIS epidemic model with partial temporary immunity and saturation incidence rate
- Geometric approach to global asymptotic stability for the SEIRS models in epidemiology
- Global asymptotic stability for the SEIRS models with varying total population size
- A geometric approach to the global-stability problems
- Global dynamics of a SEIR model with varying total population size
- Dynamics of differential equations on invariant manifolds
- A generalization of the Kermack-McKendrick deterministic epidemic model
- Global analysis of an epidemic model with a nonlinear incidence rate
- Department of Economic and Social Affairs, Population Division. World population prospects: the 2015 revision, key findings and advance tables
- Estimated epidemiologic parameters and morbidity associated with pandemic H1N1 influenza
- Initial human transmission dynamics of the pandemic (H1N1) 2009 virus in North America
- The dynamics of insect-pathogen interactions in stage-structured populations
- The saturating contact rate in marriage and epidemic models
- Reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission
- The mathematics of infectious diseases
- Threshold dynamics of an SIRS model with nonlinear incidence rate and transfer from infectious to susceptible
- The stability of dynamical systems
- Statistics on human swine influenza vaccinations
- Effectiveness of H1N1 vaccine against reported influenza A(H1N1)
- Press release: year-end population for
- DEDiscover: a computation and simulation tool for HIV viral fitness research
- Differential equation modeling of HIV viral fitness experiments: model identification, model selection, and multimodel inference
- Forecasting principles and applications
- Industrial and business forecasting methods: a practical guide to exponential smoothing and curve fitting
- Transmission parameters of the A/H1N1 (2009) influenza virus pandemic: a review
- A methodology for performing global uncertainty and sensitivity analysis in systems biology
- Bifurcation analysis of an SIRS epidemic model with generalized incidence

The authors would like to express their gratitude to Ronghua Tan for her kind suggestions. The work was supported by the National Natural Science Foundation of China (Nos. 11871201, 11871238, 11901059) and the Natural Science Foundation of Hubei Province, China (Nos. 2019CFB241, 2019CFB353).

key: cord-296388-ayfdsn07
authors: Maziarz, Mariusz; Zach, Martin
title: Agent-based modelling for SARS-CoV-2 epidemic prediction and intervention assessment: a methodological appraisal
date: 2020-08-21
journal: J Eval Clin Pract
doi: 10.1111/jep.13459
sha:
doc_id: 296388
cord_uid: ayfdsn07

Abstract. Background: Our purpose is to methodologically assess epidemiological agent-based models (ABMs) of the SARS-CoV-2 pandemic. The rapid spread of the outbreak requires fast-paced decision-making regarding mitigation measures. However, the evidence for the efficacy of non-pharmaceutical interventions such as imposed social distancing and school or workplace closures is scarce: few observational studies use quasi-experimental research designs, and conducting randomized controlled trials seems infeasible. Additionally, evidence from the previous coronavirus outbreaks of SARS and MERS lacks external validity, given the significant differences in contagiousness of those pathogens relative to SARS-CoV-2. To address the pressing policy questions that have emerged as a result of COVID-19, epidemiologists have produced numerous models that range from simple compartmental models to highly advanced agent-based models. These models have been criticized for involving simplifications and lacking empirical support for their assumptions.
Methods: To address these criticisms and methodologically appraise epidemiological ABMs, we consider AceMod (the model of the COVID-19 epidemic in Australia) as a case study of the modelling practice. Results: Our example shows that, although epidemiological ABMs involve simplifications of various sorts, the key characteristics of social interactions and the spread of SARS-CoV-2 are represented sufficiently accurately. This is the case because the modellers treat empirical results as inputs for constructing modelling assumptions and the rules that the agents follow, and they use calibration to assert adequacy with respect to benchmark variables. Conclusions: Given this, we claim that the best epidemiological ABMs are models of actual mechanisms and deliver both mechanistic and difference-making evidence. Consequently, they may also adequately describe the effects of possible interventions. Finally, we discuss the limitations of ABMs and put forward policy recommendations.

In the aftermath of the outbreak of the novel coronavirus, governments around the globe have introduced non-pharmaceutical public health interventions aimed at slowing down the spread of the resultant pandemic. These measures range from relatively mild requirements such as wearing face masks, washing hands or avoiding close contacts to school closures and imposed isolation, which are likely to have a detrimental and unpredictable influence on social and economic life. 1 Despite their significant impact, the introduction of many of these measures was not supported by high-quality evidence. First, conducting RCTs would not be feasible, owing to both ethical and practical constraints. Second, significant differences between the coronaviruses that caused the SARS and MERS outbreaks and SARS-CoV-2 (such as the likely airborne transmission 2 and asymptomatic infectiousness 3,4 of the latter) undermine extrapolation from the data gathered during those previous epidemics.
Finally, the current pandemic has not lasted long enough to gather observational data of the amount and quality sufficient for assessing the efficacy of alternative public health interventions, since the first reports were published just weeks after the first measures were introduced. 5 One of the many ways to address the impracticality of conducting RCTs and observational studies in the context of an ongoing pandemic is scientific modelling, in particular epidemiological modelling. Here, we focus on the so-called agent-based modelling (ABM) approach, which differs from more traditional epidemiological modelling in several ways. ABMs are a form of computational modelling in which agents are treated as entities interacting with each other and their environment in a locally defined fashion described by a set of rules. The overall dynamics of the system are then computed, allowing for the simulation of complex patterns and an understanding of how these patterns arise. 6,7 ABMs are used in many scientific contexts, including modelling the spread of infectious diseases, and have proven successful in informing policy decisions before. For instance, Eisinger and Thulke 8 modified and then applied a previously developed ABM of the spread of rabies, generating a rule-based model that represented specific spatial and behavioural characteristics of the fox population (e.g., with fox families represented as moving within home ranges and young foxes engaging in long-distance migratory behaviour). 6 Whereas the classical differential-equation models predicted that vaccinating at least 70% of the fox population would be needed to eliminate rabies, the ABM indicated that a successful vaccination strategy could do with much less than 70% of the population being immunized once the spatial arrangement of fox hosts was explicitly considered, saving millions of euros as a result.
Moreover, the ABM also suggested that the classical strategy would fail more often than not, and it was successfully applied to deal with the rabies problem. However, despite the promising record of using ABMs in effective epidemiological interventions, their use in informing proposed measures against the novel coronavirus epidemic has raised criticism. 9-11 Unfortunately for the assessment of healthcare interventions based on this type of epidemiological model, standard evidence hierarchies exclude agent-based models altogether and include theoretical or mechanistic inferences at the lowest level of the hierarchy. For example, the Oxford Centre for Evidence-Based Medicine 12 and the National Institute for Health and Care Excellence (NICE guidelines) 13 include theoretical and mechanistic reasoning, but agent-based models fall beyond their scope. This can be explained by the novelty of agent-based modelling and the limited trust of EBM proponents in theoretical and, to some extent, mechanistic reasoning, which, despite being used implicitly to assess the possibility of confounding and the quality of results, 14 is downgraded or rejected as either subjective or fallacious. 15 However, such a view has been challenged by a group of philosophers advocating for improving the practices of evidence assessment in medicine by putting more weight on mechanistic reasoning in causal inference. 16-18 The position of the EBM+ programme 16-18 is encapsulated by the normative reading of the Russo-Williamson thesis, 19 which states that causal claims should be based on both difference-making and mechanistic evidence. The causal claims supported by agent-based models have been interpreted in apparently inconsistent ways: as being in line with the potential outcome approach (POA), 20 as delivering theory-driven understanding, 21 or as providing mechanistic evidence.
22 Below, we show that all of these apparently inconsistent interpretations are correct, because the best contemporary ABMs bear a resemblance to the actual mechanisms and therefore allow for the counterfactual assessment of intervention efficacy in the target, while also delivering an understanding of the phenomena of interest. Apart from the compartmental SIR (susceptible, infectious, recovered) framework and its derivatives 23-28 or regression analysis, 29,30 the most advanced models of the spread of the novel coronavirus are transformed versions of agent-based influenza pandemic models. 11,31 Such models have been used as evidence for introducing (sometimes severe) public health measures, 32 with the recent change in British policy being the prime example. In this section, we illustrate this approach to modelling the SARS-CoV-2 pandemic with an agent-based model of the epidemic in Australia 31 based on AceMod, developed as a "framework for studying influenza pandemics in Australia" 33 (p. 412). AceMod is an influenza-spread model that addresses the need for simulating interventions responding to outbreaks of future respiratory diseases. While the 2009 swine flu pandemic was the motivation for constructing AceMod, the model was not intended to accurately represent the outbreak of the H1N1 strain, but rather to serve as a generalized framework for studying how an infectious disease spreads through the social interactions of Australians. AceMod utilizes census data to ascribe realistic spatial and social characteristics to almost 20 million agents inhabiting the model world. These agents are divided into different social groups with varying characteristics, with households differentiated proportionally according to statistical data on the prevalence of different types of families (singles, single parents, and couples with or without children).
These features are ascribed to agents stochastically in a way that replicates the aggregate structure of the statistical data. During the daytime, children and students meet in classrooms and at schools, adults go to work, and pensioners stay at home. During the nighttime, the agents encounter contacts in households and in their neighbourhoods (e.g., at supermarkets or theatres). The disease can be contracted by an agent upon meeting an infected individual in one of these settings. The probability that an agent i contracts the disease in a given step t depends on the number of sick individuals met in that step and on the contagiousness of the disease, scaled by a parameter k. The modellers assume that the infectivity of the disease decreases linearly over time. Asymptomatic cases are assumed to be 50% less infectious than symptomatic ones, and the flu lasts 5 days within the model. After this period, recovered agents cannot infect others. Additionally, those who experience symptoms do so after an incubation period lasting approximately 3 days. The influenza epidemic is started by agents arriving in Australia via international airports and is seeded into communities living near the airports at random. To represent an epidemic of a particular strain of influenza with AceMod, the model requires calibration. Modellers can proceed with this step in two ways, depending on the accessibility of data. In the case of well-studied influenza strains, their infectivity and the ratios of transmission in different contexts are well recognized, and parameter values can be chosen on the basis of empirical studies. However, if these data are missing, then parameter values have to be calibrated using statistical procedures such as simplex or genetic algorithms to maximize the fit of the model to a benchmark.
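The per-step infection rule described above can be sketched as follows. The functional form, the constant K and the independent-exposure assumption are hypothetical illustrations of the narrative; AceMod's exact equations are not reproduced in the paper's text.

```python
# Hypothetical per-step infection probability for one agent, combining the
# ingredients described in the text: a contagiousness scaling factor k,
# linearly declining infectivity, and halved infectivity for asymptomatic cases.
K = 0.02                   # assumed global contagiousness scaling factor
ASYMPTOMATIC_FACTOR = 0.5  # asymptomatic cases are 50% less infectious
DURATION = 5               # days of infectiousness in the model

def infectivity(days_infected: int) -> float:
    """Infectivity declining linearly to zero over the disease course."""
    return max(0.0, 1.0 - days_infected / DURATION)

def infection_probability(contacts) -> float:
    """contacts: list of (days_infected, symptomatic) tuples for the sick
    individuals met in this step; exposures are treated as independent."""
    p_escape = 1.0
    for days, symptomatic in contacts:
        load = K * infectivity(days) * (1.0 if symptomatic else ASYMPTOMATIC_FACTOR)
        p_escape *= 1.0 - load
    return 1.0 - p_escape

contacts = [(0, True), (2, False)]   # one fresh symptomatic, one day-2 asymptomatic
p = infection_probability(contacts)
print(round(p, 4))
```

Meeting more infected contacts in a step raises the probability multiplicatively through the escape term, which is the usual way agent-based models aggregate simultaneous exposures.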
After constructing and calibrating AceMod, the modellers run simulations to obtain estimates of prevalence, incidence and attack rates, and choose the most common outcome (owing to stochasticity, different runs of the model may lead to slightly different results). Chang et al 31 adapted this framework to the SARS-CoV-2 epidemic. ABMs such as AceMod can be seen as consisting of two parts: the rules specifying the behaviour of agents and the creation of the model society, and the assumptions characterizing the infectivity of the pathogen causing the epidemic. Given that AceMod is based on 2016 census data and a major change in social behaviours is unlikely to have occurred since then, the model accurately represents the social interactions of present-day Australians. Hence, the former part of the model has been left mostly unchanged, beyond increasing the number of agents to over 24 million to adjust for the growing population. In addition to introducing a social structure sufficiently resembling the contact network of the present population, obtaining accurate predictions of epidemic development and policy assessments requires inputting data on transmission likelihoods that are true of the pathogen causing the modelled epidemic. 34 Most changes in the model therefore concern the assumptions specifying the infectivity of the disease. Even though several features of influenza epidemics are similar to the epidemic caused by the novel coronavirus, the two differ with respect to infectivity and attack rates, mortality rates, the average duration of the disease, the reproductive number R0 and the distribution of asymptomatic cases. Therefore, these parameters in the model required recalibration. The transmission probabilities remained mainly as specified in the original influenza framework; the length of the generation period, however, was calibrated to 6.4 days to reflect this difference in the model.
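The benchmark-fitting step can be illustrated with a toy one-parameter calibration. The simulator below is a deliberately simple logistic-growth stand-in (not AceMod), and the iterative grid refinement stands in for the simplex or genetic-algorithm optimizers mentioned above; everything here is an assumed illustration of the workflow, not the paper's procedure.

```python
# Calibrate a contagiousness parameter k so that a toy simulator reproduces
# a benchmark incidence curve (here generated from the simulator itself).
def simulate_incidence(k, steps=30, n=1e6):
    infected = 10.0
    out = []
    for _ in range(steps):
        new = k * infected * (1 - infected / n)   # discrete logistic growth
        infected += new
        out.append(new)
    return out

benchmark = simulate_incidence(0.25)   # pretend these are the observed data

def loss(k):
    return sum((s - b) ** 2 for s, b in zip(simulate_incidence(k), benchmark))

def calibrate(lo=0.0, hi=1.0, iters=12):
    """Shrink the search interval around the best grid point each round."""
    best = lo
    for _ in range(iters):
        ks = [lo + (hi - lo) * i / 10 for i in range(11)]
        best = min(ks, key=loss)
        span = (hi - lo) / 10
        lo, hi = max(0.0, best - span), best + span
    return best

k_hat = calibrate()
print(round(k_hat, 3))  # recovers the generating value 0.25
```

In the real workflow the benchmark would be an observed epidemic curve, and the simulator a full agent-based run, which is why stochastic-friendly optimizers such as the simplex method are preferred.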
Additionally, the likelihood of contracting SARS-CoV-2 while remaining asymptomatic was set to be age-dependent and equalled 1/3 for adults, while minors were set to be five times less likely than adults to suffer from symptoms. While this assumption is in agreement with the empirical finding that children represent a minor fraction of symptomatic cases, the calibration aimed at reproducing aggregate epidemic curves and may diverge from the actual chances of developing symptoms. Within the AceMod framework, the reproductive number R0 is not one of the assumptions inputted into the model. Rather, its estimate results from a simulation of the scenario described by the rules and assumptions, some of which are stochastic. The assumptions considered, and particularly the parameter denoting the contagiousness of the disease (k), have been calibrated such that R0 stays within the interval (2.0-2.5), in agreement with empirical estimates of the reproductive number at the beginning of the SARS-CoV-2 outbreak. 35,36 The set of parameter values consistent with this estimate of R0 was then used in the simulations. Before proceeding to our argument, let us first make several general remarks about modelling. These remarks should prove essential in clarifying the main issues that are often raised with regard to using simplified models, particularly in the context of policy decision-making. First of all, ABMs are instances of mechanistic models, for they clearly fit the general, also called minimal, characterization of what a mechanism is: a set of entities whose activities and interactions are organized such that they are responsible for the phenomenon. 37-39 This definition is broad enough to conceptually unify the debates on biological and social mechanisms under a single notion of a mechanism. Furthermore, such a definition leaves open the possibility of integrating biological and social aspects into a mixed-mechanism model.
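The point that R0 is an output rather than an input can be sketched in miniature: seed index cases into a fully susceptible population, count their secondary infections under a per-contact transmission probability scaled by k, and check whether the resulting estimate lands in the empirical range. The contact numbers and k value below are assumptions chosen for illustration, not AceMod's.

```python
import random

random.seed(1)
CONTACTS_PER_DAY = 13   # assumed average daily contacts per agent
INFECTIOUS_DAYS = 8     # assumed infectious period

def estimate_r0(k: float, index_cases: int = 2000) -> float:
    """Average number of secondary infections produced by an index case."""
    secondary = 0
    for _ in range(index_cases):
        for _day in range(INFECTIOUS_DAYS):
            for _contact in range(CONTACTS_PER_DAY):
                if random.random() < k:   # per-contact transmission, scaled by k
                    secondary += 1
    return secondary / index_cases

r0 = estimate_r0(k=0.021)
print(2.0 <= r0 <= 2.5)  # True under the assumed contact structure
```

Calibration then amounts to adjusting k until the simulated R0 falls inside the empirically estimated interval, exactly the logic described for the full model.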
40 It should also be noted that, much like any other kind of model, ABMs serve as simplified representations of their target phenomena. As the AceMod case clearly shows, modellers introduce various simplifications by which they purport to adequately capture the core dynamics of the modelled phenomenon. In this process, they first abstract away from the complexities of the real system by "extracting" certain features that they believe to be of crucial importance and that will then be the focus of modelling, whereas other features that may or may not have a causal influence are disregarded in these early stages. Modelling is an iterative process during which the merits of the model's assumptions are continuously evaluated and, if required, the assumptions are refined and additional assumptions added. More importantly, some of the extracted features are distorted to the extent that, if taken literally, they would misrepresent the actual state of things. However, such distortions are often introduced in full awareness, with the ultimate goal of finding out whether the consequences they have for the behaviour of the system make a difference, and to what degree. Philosophers often refer to the former, that is, the set of properties retained in a model, as abstraction, while the latter, that is, the distortion of the system's features, is called idealization. 41 However, abstractions and idealizations do not exhaust the conceptual toolbox available to modellers. A popular way of attempting to model a given system realistically is to introduce various approximations. Although there are noteworthy differences between approximations and idealizations, we cannot afford to go into detail here. In summary, models often effectively disregard, distort and otherwise simplify possibly important details. In light of this, many wonder whether we can gain insight into the modelled phenomenon at all, and if so, how.
Although the SARS-CoV-2 ABM is fairly detailed and precise, it cannot do without some of the simplifications discussed above, as several of the assumptions introduced in the model illustrate. Consequently, we concur with Andersen's claim that "no mechanism model can include all the actual, much less the potential, causal relationships in which such a mechanism may engage in a system" 51 (p. 995). This pessimistic view of simplified models has inspired the method known as exploratory modelling. 52 In cases where the values of the parameters and assumptions inputted into the model cannot be established with certainty, researchers can simulate multiple possible worlds to discover the dependencies that are stable across the set of different models. In cases where only a fraction of the assumptions are uncertain, researchers conduct sensitivity analyses to check whether changes in the values of the parameters lead to changes in their conclusions. 53 The results that remain unchanged despite minor adjustments to the assumptions are considered robust. 54 This, in turn, leads to choosing those interventions that are most effective across different sets of parameter values, an approach known as robust decision-making. 52 Others prefer to think in terms of the distinction between how-actually and how-possibly modelling, referring to models that describe an actual mechanism or a possible mechanism, respectively. 55 There are two general ways to unpack the concept of a how-possibly model. Here we argue that, notwithstanding the simplifications introduced in the discussed influenza and SARS-CoV-2 ABMs, the epidemiologists are, in fact, providing representations of the actual mechanisms of the spread of the viruses. This can be supported by exploiting the relevant similarities 56,57 between the SARS-CoV-2 ABM and the actual outbreak.
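The robust decision-making idea described above can be sketched as follows: evaluate each candidate intervention across many plausible parameter sets and prefer the one whose worst-case outcome is best. The outcome function is a toy discrete SIR-like recursion, and the intervention names, parameter ranges and contact-reduction values are all hypothetical illustrations, not the paper's.

```python
import random

random.seed(0)

def final_attack_rate(beta: float, gamma: float, contact_cut: float) -> float:
    """Fraction ever infected in a discrete SIR-like recursion; contact_cut
    is the proportional reduction in transmission achieved by an intervention."""
    s, i = 0.999, 0.001
    for _ in range(500):
        new = beta * (1 - contact_cut) * s * i
        s -= new
        i += new - gamma * i
    return 1 - s

interventions = {"none": 0.0, "mild_distancing": 0.3, "lockdown": 0.6}
# 200 plausible (beta, gamma) worlds, reflecting parameter uncertainty.
param_sets = [(random.uniform(0.4, 0.6), random.uniform(0.15, 0.25))
              for _ in range(200)]

worst_case = {name: max(final_attack_rate(b, g, cut) for b, g in param_sets)
              for name, cut in interventions.items()}
robust_choice = min(worst_case, key=worst_case.get)
print(robust_choice)  # lockdown
```

A conclusion that survives across all sampled worlds, such as the ranking of interventions here, is exactly the kind of result the exploratory-modelling literature calls robust.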
The respects in which an ABM can be judged similar to its target concern the features retained in the model, while the degree(s) of similarity concern the extent to which the model's features resemble those of the target. Two remarks are in order here. First, one may oppose the claim that what is being represented is the actual mechanism by arguing that the mechanisms underlying the beginning of the outbreak and the fully-fledged epidemic are distinct. Changes in social behaviour or genetic mutations could undermine the behavioural adequacy of the model. Second, it is possible (at least in principle) that the model represents a false mechanism but is calibrated to the relevant benchmark such that it reproduces it. For example, there are no data confirming (or disproving) the assumption that children are asymptomatic five times more often than adults. As the modellers admit, this assumption was made not only to account for the lower attack rate among minors, but also to make the model adequate to aggregate-level data. This approach to calibration resembles the estimation of statistical parameters (a.k.a. curve fitting) and is considered dubious. The main line of criticism highlights that it is in principle possible to construct a model that represents a merely possible mechanism and, using calibration, adjust its parameter values so that it reproduces the represented phenomenon, that is, obtains behavioural adequacy despite being false. However, while this criticism is indeed justified with regard to models of mechanisms that are epistemically inaccessible in other ways (such as mechanisms in the social sciences 59), it is not so in the case of epidemiological mechanisms, whose transmission mechanism can be studied empirically and compared to the mechanism represented by the model.
this can establish that the mechanism represented by the model is similar (in relevant aspects and to relevant degrees) to the mechanism that generates the outbreak, that is, achieves mechanical adequacy. given that acemod fulfils glennan's criteria for behavioural and mechanical adequacy, considering our current understanding of the novel coronavirus, we can conclude that chang et al's 31 model represents the actual mechanism of the spread of the disease in australia. given this, the claims assessing the efficacy of the mitigation measures under consideration are likely to be accurate not only within the model but also about its target. we claim this with several caveats in mind, to be discussed in the next section. it is also important to note that the abm integrates the biological aspects, expressed by the parameter of infectivity, and the social aspects such as daily interaction regimes. as a result, the abm should be construed as an instance of a model of a mixed mechanism, a concept elaborated by kelly et al. 40 due to exposure patterns, population-level phenomena such as infectious disease epidemics are crucially dependent on human behaviour and social practices. in cases like the current pandemic, effective interventions may best be aimed at the societal level, and therefore mechanistic models that integrate social factors, human behaviour and biological aspects (something that the abm discussed here attempts to do) are arguably best suited for providing understanding and suggesting policy decisions. our study defends using abms for informing decisions regarding mitigation measures; in the australian case, for instance, school closures had limited influence on the severity of the epidemic, considering that just one cluster was located at a school. 68 we believe that, considering the diversity in the number and patterns of social interactions across countries, the quality of evidence from abms should be assessed on a case-by-case basis. to do so, one can employ the approach of parkkinen et al 17 (p.
79) developed initially to evaluate the quality of evidence for biological mechanisms. in that case, one should consider (a) the quality of the method. additionally, abms, much like the compartmental models, are dependent on the assumptions of the modellers. 10 our claim that acemod calibrated for sars-cov-2 bears similarity to the actual mechanism of the epidemic depends on the accuracy of the empirical results used as an input for this model. we need to repeatedly acknowledge the provisional nature of these empirical results, given the novelty of the pathogen. if the parameter values in acemod were miscalibrated, then the assessments of intervention efficacy could be wrong. this also presupposes that the virus does not mutate and that people do not significantly and unpredictably change their behaviour, since "the efficacy of implementation depends on people's reactions, [the stability of] pre-existing social norms and structural societal constraints." 9 furthermore, the effects of epidemiological agent-based modelling are highly dependent on social structure and carefully calibrated to social and economic characteristics. therefore, epidemiological abms are geographically localized and their conclusions should not be extrapolated beyond their target systems, 71 unless the models and their predictions are calibrated to particular settings. finally, while acemod is well-documented in the two publications discussed throughout our paper, neither its code nor detailed documentation regarding its use is published (this unfortunately also applies to some other abms of the sars-cov-2 epidemic). given these limitations, the models should be carefully checked for coding errors and other possible flaws before applying their implications in the policy context.
in summary, we have argued that, despite the criticism raised against models being the appropriate vehicle for informing policies, the sars-cov-2 abm is suitable for this purpose because the mechanism described by the model sufficiently resembles the mechanism at work in the real world. thus, our best contemporary epidemiological abms are representations of the actual mechanism of the spread of the virus. unfortunately, such models have been left out of methodological discussions and are not explicitly listed by evidence hierarchies. while the need for appraising mechanistic reasoning in medicine is also voiced by ebmers, 72 there is no broadly accepted view on how to amalgamate evidence of different types. further research is needed to assess the risk of bias in the epidemiological models that deliver both difference-making and mechanistic evidence. however, considering the current situation and the pressing need for rapid and accurate decisions regarding mitigation measures, policymakers should take to heart the advice that "if no randomized trial has been carried out [...], we must follow the trail to the next best external evidence and work from there" 73 (p. 74). in the current situation, accurately calibrated epidemiological abms are the best existing evidence.

references:
- isolation, quarantine, social distancing and community containment: pivotal role for old-style public health measures in the novel coronavirus (2019-ncov) outbreak
- is the coronavirus airborne? experts can't agree
- asymptomatic and human-to-human transmission of sars-cov-2 in a 2-family cluster
- presumed asymptomatic carrier transmission of covid-19
- association of public health interventions with the epidemiology of the covid-19 outbreak in wuhan, china
- agent-based and individual-based modeling: a practical introduction
- an introduction to agent-based modeling: modeling natural, social, and engineered complex systems with netlogo
- spatial pattern formation facilitates eradication of infectious diseases
- computational models that matter during a global pandemic outbreak: a call to action
- modelling the pandemic
- report 9: impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
- developing nice guidelines: the manual. london: national institute for health and care excellence
- the judgements that evidence-based medicine adopts
- evidence: philosophy of science meets medicine
- ebm+: increasing the systematic use of mechanistic evidence
- evaluating evidence of mechanisms in medicine: principles and procedures
- the evidence that evidence-based medicine omits
- interpreting causality in the health sciences
- formalizing the role of agent-based modeling in causal inference and epidemiology
- invited commentary: agent-based models for causal inference-reweighting data and theory in epidemiology
- mechanisms and the evidence hierarchy
- modified seir and ai prediction of the epidemics trend of covid-19 in china under public health interventions
- potential impact of seasonal forcing on a sars-cov-2 pandemic
- a sidarthe model of covid-19 epidemic in italy
- the sars-cov-2 epidemic outbreak: a review of plausible scenarios of containment and mitigation for mexico. medrxiv
- social distancing strategies for curbing the covid-19 epidemic
- epidemic analysis of covid-19 in china by dynamical modeling
- simulating and forecasting the cumulative confirmed cases of sars-cov-2 in china by boltzmann function-based regression analyses
- evaluation of the lockdowns for the sars-cov-2 epidemic in italy and spain after one month follow up
- modelling transmission and control of the covid-19 pandemic in australia
- special report: the simulations driving the world's response to covid-19
- investigating spatiotemporal dynamics and synchrony of influenza epidemics in australia: an agent-based modelling approach
- role of social networks in shaping disease transmission during a community outbreak of 2009 h1n1 pandemic influenza
- early phylogenetic estimate of the effective reproduction number of sars-cov-2
- secondary attack rate and superspreading events for sars-cov-2
- what is a mechanism? thinking about mechanisms across the sciences
- the new mechanical philosophy
- the routledge handbook of mechanisms and mechanical philosophy
- the integration of social, behavioral, and biological mechanisms in models of pathogenesis
- idealization and abstraction in scientific modeling
- identifying and interrupting superspreading events-implications for control of severe acute respiratory syndrome coronavirus 2
- superspreading and the effect of individual variation on disease emergence
- the role of super-spreaders in infectious disease
- seroprevalence and ethnic differences in helicobacter pylori infection among adults in the united states
- single nucleotide polymorphisms in innate immunity genes: abundant variation and potential role in complex human disease
- ethnic-specific genetic associations with pulmonary tuberculosis
- increase in clostridium difficile-related mortality rates, united states
- fcγriib in autoimmunity and infection: evolutionary and therapeutic implications
- sars-cov-2 viral load in upper respiratory specimens of infected patients
- mechanisms: what are they evidence for in evidence-based medicine?
- exploratory modeling for policy analysis
- sensitivity analysis of infectious disease models: methods, advances and their application
- robustness analysis
- thinking about mechanisms
- how models are used to represent reality
- an agent-based conception of models and scientific representation
- modeling mechanisms
- the philosophy of causality in economics: causal inferences and policy proposals
- impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
- australia restricts travelers from mainland china as virus impact spreads
- australia blocks arrival of all non-citizens, non-residents in expanded coronavirus travel ban
- update on coronavirus measures
- pm announces pubs, clubs and cinemas to close, schools stay open in stage one measures-as it happened. the guardian
- why australia is not shutting schools to help control the spread of coronavirus. the guardian
- victoria state government. coronavirus update for victoria
- analysing the combined health, social and economic impacts of the coronavirus pandemic using agent-based social simulation
- social distancing and supply disruptions in a pandemic
- why a one-size-fits-all approach to covid-19 could have lethal consequences
- medical scientists and philosophers worldwide appeal to ebm to expand the notion of 'evidence'
- agent-based modelling for sars-cov-2 epidemic prediction and intervention assessment: a methodological appraisal

key: cord-311868-40bri19f
authors: fattahi, a.; sijm, j.; faaij, a.
title: a systemic approach to analyze integrated energy system modeling tools: a review of national models
date: 2020-11-30
journal: renewable and sustainable energy reviews
doi: 10.1016/j.rser.2020.110195
sha:
doc_id: 311868
cord_uid: 40bri19f

we reviewed the literature focusing on nineteen integrated energy system models (esms) to: (i) identify the capabilities and shortcomings of current esms to analyze adequately the transition towards a low-carbon energy system; (ii) assess the performance of the selected models by means of the derived criteria; and (iii) discuss some potential solutions to address the esm gaps. this paper delivers three main outcomes. first, we identify key criteria for analyzing current esms and we describe seven current and future low-carbon energy system modeling challenges: the increasing need for flexibility, further electrification, the emergence of new technologies, technological learning and efficiency improvements, decentralization, macroeconomic interactions, and the role of social behavior in the energy system transition. these criteria are then translated into required modeling capabilities such as the need for hourly temporal resolution, sectoral coupling technologies (e.g., p2x), technological learning, flexibility technologies, stakeholder behavior, cross-border trade, and linking with macroeconomic models. second, a multi-criteria analysis (mca) is used as a framework to identify modeling gaps while clarifying the high modeling capabilities of markal, times, remix, primes, and metis. third, to bridge major energy modeling gaps, two conceptual modeling suites are suggested, based on both optimization and simulation methodologies, in which the integrated esm is hard-linked with a regional model and an energy market model and soft-linked with a macroeconomic model. the long-term energy strategy of the eu is aimed at an 80-95% reduction of greenhouse gas (ghg) emissions by 2050, relative to 1990.
reaching this goal requires a number of key actions to make a transition from a conventional energy system to a low-carbon energy system [1]. as a result, low-carbon energy system models (esms) have been developed to guide decision-makers on taking long-term robust policy decisions towards the energy system transition. however, every esm has been developed to answer specific policy questions, due to the complexity of the energy system and limited computational power. as a result, each model comes with specific capabilities and shortcomings. a large and growing body of literature has listed and classified esms with different aims and scopes. connolly et al. provided a comprehensive overview of suitable esms addressing issues related to renewable energy integration [2]. similarly, bhattacharyya et al. compared energy models to identify their suitability for developing countries [3]. aiming to find the prevalent modeling approaches for the u.k., hall et al. classified and compared esms based on their structure, technological detail, and mathematical approach [4]. to find trends in energy system modeling, lopion et al. reviewed esms chronologically [5]. some reviews have emphasized the role of policy strategies and the corresponding modeling challenges. by grouping energy models into four categories, pfenninger et al. examined the policy challenges they face in each group [6]. horschig et al. reviewed esms to provide a framework for identifying a suitable methodology for the evaluation of renewable energy policies [7]. while savvidis et al. identified the gaps between low-carbon energy policy challenges and modeling capabilities with a focus on electricity market models [8], ringkjøb et al. classified esms with a focus on the electricity sector [9]. lastly, li et al. reviewed socio-technical models emphasizing societal dynamics [10].
the increasing share of variable renewable energy sources (vres) has caused the low-carbon energy system transition to face several major challenges, such as the increasing need for flexibility, further electrification, the emergence of new technologies, technological learning, efficiency improvements, decentralization, macroeconomic interactions, and the deeper involvement of stakeholders in the energy system transition. additionally, some policy questions at the macro level, such as the impact of the energy transition on macroeconomic indicators (e.g., economic growth and employment), require more in-depth integrated analysis, i.e., analyzing the whole energy system consisting of technical, microeconomic, and macroeconomic aspects. however, current esms lack specific capabilities for adequately addressing low-carbon energy system changes, which can lead to conflicting conclusions. for instance, one study found no feasible way to achieve a 100% renewable power system by 2050 [11], while another study claims that a 100% renewable eu power system scenario would require 30% higher annual costs [12]. separate analyses suggested that a 100% renewable eu energy system can be achieved by 2050 with only 12% higher annual energy system costs [13], while neglecting significant parameters such as electricity grid costs, the location of renewables, key technological detail, and flexible electricity demand. brouwer et al. provided a detailed analysis of the west european power sector with high shares of renewables, while neglecting the heat and transport sectors [14]. brown et al. analyzed the cross-sectoral and cross-border integration of renewables in europe, while assuming no national transmission costs, limited efficiency measures, and limited technology options [15]. social aspects of the energy system transition are usually neglected in esms.
to this end, some studies included actors' behavior in the energy system from the demand perspective, for example, the thermal demand transition [16] or the efficiency of adaptation measures in households [17]. analyzing each of the major changes in the energy system can be challenging for conventional esms as they need further capabilities such as fine technological detail, high temporal and spatial resolutions, and the presence of stakeholders' behavior. this study concentrates on the energy modeling challenges which result from the increasing share of vres, complexity, and system integration. the transition towards a decarbonized energy system also involves other policies, such as higher energy efficiency and changes in energy demand, the use of nuclear power, and the use of carbon capture, utilization and storage (ccus) technologies. due to the diversity of esms, two major limitations were imposed in this review. first, we focused on energy models at the national level. therefore, the reviewed models were designed for national analysis (or they can be used for national assessments, e.g., primes). second, the models cover all energy system sectors (i.e., residential, services, agriculture, transport, and industrial sectors). the overarching research question of this study is "what are the potential solutions to address the shortcomings of current esms considering current and future low-carbon energy system challenges?". to answer this question, we first describe the current and future low-carbon energy system modeling challenges. based on these modeling challenges, we identify the required modeling capabilities, such as the need for hourly temporal resolution, sectoral coupling technologies (i.e., p2x), technological learning, flexibility and storage technologies, human behavior, cross-border trade, and linking with market and macroeconomic models.
the required capabilities were then translated into assessment criteria to be used in the multi-criteria analysis (mca). finally, potential model development solutions are discussed and a modeling suite is proposed as a model-linking solution to address the energy modeling challenges (fig. 1). seven major low-carbon energy system modeling challenges were identified. the challenges were translated into a number of required energy system modeling capabilities and criteria. nineteen models were selected from other reviews ([2,4]). the primary inclusion criteria for the selected models (see table 1) were: (1) being used at the national level, and (2) covering the whole energy system (i.e., integrated energy system models). all the information on the selected models was gathered from officially published documents that may be incomplete or outdated (notably when this paper is published), as models are continuously developed. for each model, a brief description is provided in the appendix. multi-criteria analysis based on the analytic hierarchy process (ahp) was used as a transparent framework to analyze diverse esms from different perspectives. mca is a methodology that evaluates complex choices (i.e., various criteria, objectives, and indicators), which has been used extensively to analyze energy transition policies [40]. the major advantage of mca is that it provides a rational structure of complex alternatives that presents substantial elements for identifying the desired choice [41]. although mca may have different purposes, we were particularly interested in: first, breaking down complicated energy models into key criteria; and second, identifying the importance or relative weight of each criterion for each alternative. models were ranked based on known criteria, but this does not mean one model is superior to others. therefore, the intention was not to compare models but to identify modeling capabilities and gaps when structuring a low-carbon energy system modeling framework.
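the weighted-sum scoring at the heart of such an mca can be sketched as follows; the model names, criteria weights, and scores below are purely illustrative placeholders, not the paper's actual ratings:

```python
# minimal weighted-sum MCA sketch: score alternatives against weighted criteria.
criteria_weights = {"temporal_resolution": 0.3, "sector_coupling": 0.3,
                    "tech_learning": 0.2, "macro_link": 0.2}

scores = {  # hypothetical 0..5 ratings per criterion
    "model_a": {"temporal_resolution": 5, "sector_coupling": 4,
                "tech_learning": 2, "macro_link": 1},
    "model_b": {"temporal_resolution": 2, "sector_coupling": 3,
                "tech_learning": 5, "macro_link": 3},
}

# weighted total per alternative
totals = {m: sum(criteria_weights[c] * s[c] for c in criteria_weights)
          for m, s in scores.items()}
best = max(totals, key=totals.get)
print(totals, best)
```

as the text notes, the ranking identifies capability gaps per criterion rather than declaring one model superior overall; changing the weights (which an ahp pairwise comparison would supply) can change which alternative comes out on top.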
based on the identified modeling gaps, a conceptual modeling suite was proposed to address future low-carbon energy system modeling challenges. the proposed suite includes a core integrated energy system model that is hard-linked with a regional model and soft-linked with both an energy market model and a macroeconomic model. energy policies are designed to meet three key objectives: providing energy reliability (i.e., supply security), affordability (i.e., economics and job creation), and sustainability (i.e., environment and climate) [42]. with the aim of reviewing electricity market models, savvidis et al. [8] clustered twelve energy policy questions as a basis to quantify the gap between models and policy questions. based on the literature and experts' opinions, we divided energy modeling related policy questions into four categories as follows:
1. technical questions, such as a lack of insight into higher shares of intermittent renewables, the role of new technologies, and further electrification of the energy system.
2. microeconomic questions, such as a lack of insight into decentralization, human behavior, and liberalized energy markets.
3. macroeconomic questions, such as a lack of insight into economic growth and jobs due to the energy transition.
4. a mix of the above questions, such as a lack of insight into the effect of further electrification on energy markets.
providing a solution for each policy inquiry can be a challenge for energy system modeling. these challenges can alter the choice of modeling methodology and parameters. in this section, energy modeling challenges and the corresponding modeling parameters are described. some sources of renewable energy, such as wind and solar, are intermittent, i.e., (highly) variable and less predictable [43]. the power generation from intermittent renewables is directly dependent on weather conditions [44].
as wind and solar power generation technologies are becoming more competitive [45], it is expected that wind and solar power generation will cover up to 30% and 20% of the eu's electricity demand by 2030, respectively ([46,47]). hence, a high share of intermittent renewables in the electricity generation sector is imminent. technically, the power system needs to be in balance at all temporal instances and geographical locations. therefore, the electricity sector should be structured in a way that ensures the balancing of demand and supply. a higher share of intermittent renewables entails variability on the power system balance [48]. solutions to deal with power balance variabilities are called flexibility options (fos), as they provide flexibility to the power system against variable and uncertain residual load profiles. traditionally, conventional power supplies and grid ancillary services were the primary sources of flexibility. however, the power system needs further fos as the share of intermittent renewables in power generation increases while the share of conventional power supplies -i.e., notably dispatchable gas-fired power plants -decreases. several review papers can be used as a starting point for a review of fos ([49,50]). lund et al. [51] listed different fos: demand side management (dsm), storage, power to x, electricity market designs, conventional supply, grid ancillary services, and infrastructure (e.g., smart grids and microgrids). further, sijm et al. [24] investigated fos by distinguishing three causes of the demand for flexibility: the variability of the residual load, the uncertainty of the residual load, and the congestion of the grid. [fig. 1: proposed modeling suite approach. table 1: the reviewed models and their corresponding developers.] michaelis et al. [52] divided fos based on the residual load into three groups: downward, upward, and shifting flexibility.
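the residual load that drives the demand for flexibility is simply demand minus variable renewable generation, evaluated at every hour. the synthetic one-day profiles below are invented for illustration; real studies use measured demand and weather data:

```python
import math

# hypothetical one-day hourly profiles (MW)
demand = [30 + 10 * math.sin(math.pi * (h - 6) / 12) for h in range(24)]
solar  = [max(0.0, 15 * math.sin(math.pi * (h - 6) / 12)) for h in range(24)]
wind   = [8.0] * 24  # flat wind output for simplicity

# residual load = demand not covered by variable renewables
residual = [d - s - w for d, s, w in zip(demand, solar, wind)]

# hour-to-hour ramps: what dispatchable supply, storage or DR must follow
ramps = [residual[h + 1] - residual[h] for h in range(23)]
print(round(max(residual), 1), round(min(residual), 1),
      round(max(abs(r) for r in ramps), 2))
```

the spread between the maximum and minimum residual load, and the steepest ramp, are the two quantities the flexibility options discussed here (downward, upward, and shifting flexibility) must cover.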
due to the high detail and complications regarding each fo, some studies focused mainly on one or a few technologies. to name a few examples: blanco et al. investigated the cost-optimal share of power to methane in the eu energy transition [53]. the potential of power to heat and power to ammonia in the dutch energy system was investigated by hers et al. [54] and ispt [55], respectively. some other studies followed an integrated approach that included several fos in different sectors; however, they made several assumptions as the computational capacity was limited (e.g., see ref. [15]). flexibility options can be divided into five main groups, i.e., storage, demand response (dr), vre curtailment, conventional generation, and cross-border trade. instead of analyzing the pros and cons of each option, we were interested in identifying the key energy modeling issues regarding each flexibility option (fig. 2). from a temporal perspective, storage fos can be divided into daily and seasonal storage options. on the one hand, solid-state and flow batteries, such as li-ion, ni-cd, nas, icb, vrb, and znbr batteries, provide a high ramp rate with limited capacity, which is suitable for diurnal power storage. modeling these batteries at a diurnal temporal resolution can yield different results than at an hourly temporal resolution (htr) or with hourly time-slices (i.e., grouping hours featuring similar characteristics [56]). improvements in temporal resolution can have a significant impact on modeling results considering the high share of intermittent renewables (e.g., see ref. [43,57,58]). on the supply side, the uncertainty regarding weather forecasts needs to be implemented in the model, as weather conditions have a significant impact on intermittent renewables' generation (e.g., see ref. [59][60][61]).
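why (at least) hourly resolution matters can be shown with a deliberately simple profile: a day that is perfectly balanced on average still forces a storage or dr scheme to shift a large amount of energy hour by hour. the numbers are illustrative:

```python
# net load: positive = deficit, negative = surplus (MW), one synthetic day
hourly_net = [5.0 if 8 <= h < 20 else -5.0 for h in range(24)]

# a daily-average model collapses the profile to a single number
daily_avg = sum(hourly_net) / 24.0            # -> 0.0: "no imbalance at all"

# an hourly model sees the energy a battery/DR scheme must actually shift
shifted_energy = sum(x for x in hourly_net if x > 0)   # MWh of deficit hours

print(daily_avg, shifted_energy)               # -> 0.0 60.0
```

the coarse model reports zero balancing need, while the hourly model reveals that 60 MWh must be moved within the day, which is precisely the distortion that motivates hourly temporal resolution (or carefully built time-slices) in esms.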
on the other hand, technology options, such as pumped-hydro energy storage (phes), thermal energy storage (tes), large-scale hydrogen storage (lshs), and compressed air energy storage (caes), provide huge capacities that makes them suitable for seasonal energy storage. modeling seasonal storage options requires the inclusion of chronological order (cho) of the temporal parameter together with a fine temporal resolution, as the chronological order of time from summer to winter (and vice versa) determines the charge/discharge of seasonal storage options. dr refers to a set of schemes to shift the demand in a certain time period (e.g., an hour) to another time period of the day, week, or month, either forward or backward [24] . currently, electricity comprises around 22% of eu final energy consumption. power to x (p2x) technology options can provide further dr potentials by linking energy sectors and energy carriers together through converting electricity to other forms of energy, services, or products. in its latest report, the world energy council suggests that p2x will be a key element for the transition to a low-carbon energy system [62] . due to high detail and complications regarding each technology option, several studies focus mainly on one or a few options. at eu level, blanco et al. investigated the cost-optimal share of p2g in the eu energy transition [53] . at the national level, the potential of p2heat [54] and p2ammonia [55] in the dutch energy system was examined. there is a huge potential for demand response in the built environment sector as it is responsible for 40% of energy consumption and 36% of co2 emissions in the eu. while individuals can passively participate in either price-based 1 or incentive-based 2 demand response schemes [63] , proactive participation of consumers can increase market efficiency and reduce price volatility [64] . 
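the need for chronological order (cho) when modeling seasonal storage can be sketched as follows: the storage capacity implied by a net-surplus series changes once chronology is discarded, as happens when periods with similar characteristics are grouped into time-slices. the monthly figures are invented for illustration:

```python
# monthly net surplus fed to a seasonal store (GWh); sums to zero over the year
monthly_surplus = [1, -1, 2, -2, 3, -3, 3, -3, 2, -2, 1, -1]

def required_capacity(series, start_soc=0):
    """Capacity = spread of the state-of-charge trajectory over the series."""
    soc, trace = start_soc, [start_soc]
    for x in series:
        soc += x
        trace.append(soc)
    return max(trace) - min(trace)

chrono = required_capacity(monthly_surplus)           # order preserved
scrambled = required_capacity(sorted(monthly_surplus))  # chronology lost
print(chrono, scrambled)
```

with chronology preserved the store only ever needs 3 GWh, but the reordered series implies 12 GWh: the state of charge is a path-dependent quantity, so time-slice models that drop the order of periods misestimate seasonal storage requirements.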
as heating demand averages 80% of eu household energy consumption, the dr potential can be realized by coupling electricity and heat demands. dr in the built environment can consist of three main components including p2heat technologies (e.g., heat pumps and electric boilers), storage (e.g., thermal tank storage and thermally activate building), and smart controllers (that consider market participation, consumer behavior, and weather forecast) [65] . as p2x technology options bridge two different energy sectors or carriers, analysis of these options requires multi-sectoral modeling, and preferably, integrated energy system modeling. moreover, the hourly temporal resolution of the power sector should be maintained. table 2 summarizes key modeling capabilities and concerning energy sectors and carriers for each p2x technology option. vre curtailment and conventional generation options have been used as fos in the power sector. modeling these options is relatively straightforward, as they do not involve other sectors or energy carriers. still, the hourly temporal resolution remains the key modeling capability for these options. from the energy security perspective, modeling conventional generation may require modeling capacity mechanisms, 3 preferably in combination with cross border power trade [66] . the eu is promoting an internal single electricity market by removing obstacles and trade barriers (see e.g., com/2016/0864 final -2016/0380). "the objective is to ensure a functioning market with fair market access and a high level of consumer protection, as well as adequate levels of interconnection and generation capacity" [67] . one of the products of an internal eu electricity market is the potential to offer flexibility in the power system, as the load can be distributed among a larger group of producers and consumers. sijm et al. identified the cross-border power trade as the largest flexibility potential for the netherlands [68] . 
similar to other flexibility options, one of the key modeling capabilities here is the hourly temporal resolution. table 2 summarizes the required key modeling capabilities for representing and analyzing flexibility options in esms. the main requirement is the inclusion of (at least) an hourly temporal resolution. models' capabilities can improve substantially by adding seasonal storage options, which require the inclusion of chronological order and different energy carriers. moreover, the inclusion of cross-border trade can play an important role in the optimal portfolio of flexibility options, especially in eu countries. higher shares of intermittent renewables affect the reliability of power generation and distribution as residual loads become less predictable. for instance, the prediction accuracy of a single wind turbine's generation worsens from a 5-7% mean absolute error for hour-ahead forecasts to around 20% for day-ahead forecasts [69]. the increased uncertainty of power generation due to higher shares of vre sources requires models to include short-term weather forecasts and balancing mechanisms in their calculations. uncertainty analyses gain more importance for long-term esms, as they model the energy system for several decades in an uncertain future that can be affected by parameters outside the energy system boundaries. energy system optimization models use four main uncertainty analysis methods: monte carlo simulation (mcs), stochastic programming (sp), robust optimization (ro), and modeling to generate alternatives (mga) [70]. in 2017, almost 22% of the eu final energy demand was satisfied by electricity, while heat consumption and transport accounted for the rest. current heating and cooling production in the eu comes largely from fossil fuel sources, as renewable energy sources have a 19.5% share of gross heating and cooling consumption.
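of the four uncertainty methods listed, monte carlo simulation is the simplest to sketch: sample the uncertain inputs, propagate them through the model, and inspect the spread of the output. all distributions and cost figures below are hypothetical stand-ins for real model inputs:

```python
import random
import statistics

random.seed(1)

# monte carlo sketch of long-term uncertainty propagation
costs = []
for _ in range(10_000):
    demand_growth = random.gauss(0.015, 0.005)  # uncertain yearly demand growth
    capex_decline = random.gauss(0.03, 0.01)    # uncertain yearly vre cost decline
    demand_2050 = 100 * (1 + demand_growth) ** 30   # TWh after 30 years
    unit_cost = 50 * (1 - capex_decline) ** 30      # eur/MWh after 30 years
    costs.append(demand_2050 * unit_cost / 1000)    # bn eur, rough system cost

mean, sd = statistics.mean(costs), statistics.stdev(costs)
print(round(mean, 1), round(sd, 1))
```

the resulting standard deviation is the kind of output-spread information that single deterministic runs of an esm cannot provide, and it is what sp, ro, and mga address through other means.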
the transport sector is highly dependent on fossil fuels, with only 7.5% of final energy consumption from renewables. therefore, decarbonization of the heat and transport sectors is getting more attention as it has a higher ghg emissions reduction potential. the eu commission suggests electricity as an alternative fuel for urban and suburban driving in its report entitled clean power for transport [71]. further electrification of the heating, cooling, and transport sectors may contribute to ghg reduction, assuming the electricity is generated from renewables rather than fossil fuels [72]. due to the high seasonal variation of heating and cooling demand profiles (mainly in the built environment), further electrification of this sector requires huge seasonal storage capacities or other flexible supply options. currently, there are four main high-capacity seasonal storage options: pumped hydro energy storage (phes), compressed air energy storage (caes), thermal energy storage (tes), and hydrogen energy storage (hes). by using tes technologies, hourly heat and power demand profiles can be decoupled, resulting in a higher potential for the dr flexibility option [73]. tes technologies can be divided into three main groups based on their thermodynamic method of storing heat energy: sensible, latent, and chemical heat [74]. sensible heat storage (shs) technologies stock the heat via a difference in the material's temperature, for example by warming up a water or molten-salt tank. latent heat storage (lhs) technologies make use of phase-change materials (pcm) in a constant-temperature process to absorb or release thermal energy. chemical heat storage (chs) technologies make use of thermo-chemical materials (tcm) in a reversible endothermic or exothermic process (i.e., a chemical reaction in which heat is absorbed or released, respectively), for example the reversible ammonia dissociation 2NH3 = N2 + 3H2. xu et al.
[75] provided an extensive review of current seasonal thermal energy storage technology options. further electrification of the energy system, which is expected to account for 36-39% of final eu energy consumption by 2050 [76], generates higher interdependencies between energy sectors. single-sector models, which are not able to capture sector coupling effects, may provide misleading conclusions by neglecting these interdependencies. as more sources of intermittent renewables are deployed in the energy system, further electrification implies further volatility of the energy system, which highlights the higher demand for flexibility options. moreover, analyzing sector coupling technologies such as evs (p2mobility), heat pumps and electric boilers (p2heat), and electrolyzers (p2gas) becomes more important. inclusion of sector coupling options in the esm requires modeling of the electricity, transport, and heat sectors simultaneously. due to high variations in the electricity supply, a fine temporal resolution should also be employed in the transport and heat sectors in order to adequately address the flexibility issues of sector coupling. development of new technologies and technological change are key drivers of the transition to a low-carbon energy system and are at the core of most energy-climate policies worldwide [77]. for instance, the price decline of pv cells from 3.37 usd/w to 0.48 usd/w in the last 10 years [78] has made solar energy an economic option independent of subsidies. the development of new technologies has made additional renewable energy supply sources available, such as advanced biofuels, blue hydrogen, deep geothermal, wave, and seaweed. it also provides innovative opportunities for the further integration of the energy system by implementing p2x technologies, which mainly consist of p2heat, p2g, p2h2, p2l, and p2mobility technology options. 
the seasonal variation of wind and solar increases the need for seasonal storage options such as thermal energy storage and caes. ccs and ccu technologies can be considered as alternative solutions for conventional ghg emission emitters. deep decarbonization of the industrial sector can be achieved by the development of new industrial processes while considering the whole value chain [79]. the development of zero-energy buildings [80] and the formation of energy-neutral neighborhoods [81] can contribute to substantial energy savings in the built environment. esms currently represent technological learning either exogenously or endogenously [82]. technological change is most commonly expressed as a log-linear equation relating technology cost to its cumulative production units. this one-factor equation provides the learning rate, which is the cost reduction that results from doubling the cumulative produced units of the concerned technology [83]. the prominent alternative is the two-factor equation that incorporates both cumulative produced units and r&d investments [84]. endogenous technological learning (etl) is widely used in long-term esm analysis (e.g., see refs. [85] [86] [87]). etl can be further elaborated as multi-cluster learning (mcl) and multi-regional learning (mrl) [88]. mcl (or so-called compound learning) describes a cluster of technologies which share the same component and learn together (e.g., see ref. [89]). mrl differentiates between region-specific technological learning and global technological learning. the consideration of new technologies and technological learning can greatly affect energy system modeling results, particularly in long-term models. for instance, heuberger et al. [90] concluded that the presence of global etl will result in 50% more economically optimal offshore wind capacity by 2050. 
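the one-factor log-linear learning curve described above can be written as cost(x) = c0·(x/x0)^(-b), where the learning rate lr = 1 - 2^(-b) is the fractional cost drop per doubling of cumulative units. the sketch below starts from the 3.37 usd/w pv price cited in the text, but the 20% learning rate and the number of doublings are assumptions for illustration, not values from the cited studies.

```python
import math

# one-factor learning curve: cost falls by a fixed learning rate with every
# doubling of cumulative produced units (lr and doublings are assumptions).
def learning_exponent(learning_rate: float) -> float:
    """exponent b such that cost(x) = c0 * (x / x0) ** -b."""
    return -math.log2(1.0 - learning_rate)

def cost_after_doublings(c0: float, doublings: float, learning_rate: float) -> float:
    return c0 * (1.0 - learning_rate) ** doublings

print(round(learning_exponent(0.20), 3))            # b for a 20% learning rate
print(round(cost_after_doublings(3.37, 3, 0.20), 3))  # usd/w after 3 doublings
```

the two-factor variant mentioned in the text would add an r&d-stock term with its own elasticity; the one-factor form above is the log-linear relation most esms implement.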
as part of the clean energy for all europeans package, the eu sets binding targets of at least 32.5% energy efficiency improvement by 2030, relative to the business-as-usual scenario [91]. this policy particularly emphasizes the built environment as the largest energy consumer in europe. although energy-efficient technologies reduce financial and environmental costs, they are not widely adopted by energy consumers. this "energy efficiency gap" can be a consequence of market failures, consumer behavior, and modeling and measurement errors [92]. energy efficiency policies may induce the rebound effect (or backfire), in which energy efficiency improvements lead to an increase in energy use. the rebound effect may have a direct decreasing impact on energy consumption (e.g., a decrease in residential energy consumption), while having an indirect increasing impact (e.g., an increase in energy use through the expansion of energy-intensive industries) [93]. providing an accurate estimate of the magnitude of the rebound effect can be challenging [94], while the very existence of this effect is a matter of discussion in the literature [95]. although energy-efficient technologies can play an effective role in the energy system transition, modeling and analyzing their direct and indirect effects is challenging. energy infrastructure has a key role in the low-carbon energy system transition by facilitating sectoral coupling, integrating renewable energies, improving efficiency, and enabling demand-side management. however, analyzing energy infrastructure poses some challenges, such as the complexities of distributing the costs and benefits of investments and allocating risk between investors and consumers [96]. conventional energy infrastructure facilities are usually managed by a monopoly, as public goods are not traded in a market. 
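the rebound magnitude is commonly quantified as the share of the engineering-estimated savings that fails to materialize; a minimal sketch with invented numbers:

```python
# rebound effect sketch: the fraction of expected energy savings eroded by
# increased use after an efficiency improvement (numbers are illustrative).
def rebound_effect(expected_savings: float, actual_savings: float) -> float:
    return 1.0 - actual_savings / expected_savings

# an efficiency measure expected to save 100 units delivers only 80:
print(f"{rebound_effect(100.0, 80.0):.0%} rebound")  # -> 20% rebound
```

a value above 100% would correspond to the "backfire" case mentioned in the text, where total consumption rises despite the efficiency gain; in practice the difficulty lies in estimating the counterfactual expected savings, not in this arithmetic.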
therefore, the clear disaggregation of the costs and benefits of infrastructure changes due to the energy transition needs further evaluation [97]. long-term infrastructure investment and the risk profiles of investors and consumers can be highly diverse, as energy infrastructures can undergo drastic changes. moreover, social acceptance of energy infrastructure plays a key role in the energy transition, particularly for decentralized infrastructures such as ccus networks, transmission lines, district heating, and local energy storage. modeling the social acceptance of energy infrastructure requires a combination of qualitative and quantitative datasets, which can be highly locally dependent [98]. assuming the above-mentioned datasets are available, esms would require specific capabilities to analyze energy infrastructure. the esm should be geographically resolved, as energy infrastructure can have both local and national scales. moreover, there is a need for gis-based geographical resolution of esms, as the costs and benefits of energy infrastructures can change drastically with their geographical location. over the past decades, energy used to be supplied by large power plants and then transmitted to consumers. with the emergence of renewable energy supplies, a new alternative concept of the energy system has been developed. the decentralized energy system, as the name suggests, is comprised of a large number of small-scale energy suppliers and consumers. a transition from a centralized fossil-fuel and nuclear-based energy system to a decentralized energy system based on intermittent renewable energy sources can be a cost-effective solution for europe [99]. local energy supply reduces transmission costs, transmission risks, and environmental emissions, and to some extent promotes energy security for a sustainable society with a competitive energy market. on the other hand, it can increase generation costs and capacity investment, and complicate distribution and energy reliability. 
therefore, there is a need to determine the optimal role of energy system decentralization by carefully analyzing costs and opportunities. conventional energy modeling tools were based on the centralized energy system, and they have faced difficulties fulfilling the demands of the decentralized energy system. in conventional energy models, the location of the power plants does not play an important role, while spatial detail may be critically important for renewables. for instance, economic potentials, solar potentials, generation costs, environmental and social impacts, network constraints, and energy storage potentials are some location-dependent factors that can vary greatly across different regions. some other factors, such as wind potential and infrastructural costs, can vary greatly even with little dislocation. therefore, a fine spatial resolution is required in order to assess the role of location-dependent parameters in the energy system. esms can use national, regional, or gis-based (geographical information system) spatial resolution. using a fine spatial resolution can be limited by the available computational power and spatial data; therefore, the choice of spatial resolution is a compromise between these two factors. due to the huge computational load of gis-based esms, they are usually applied at the urban level rather than the national level. gis-based models can be used in a preprocessing phase in order to provide spatially resolved datasets for national esms. for instance, the global onshore wind energy potential dataset is produced at approximately 1 km resolution [100]. assuming the availability of spatial data, the computational limitation can be addressed by linking a coarse-resolution energy model with a spatial modeling tool such as arcgis (e.g., see ref. [101]). conventional energy models neglected social stakeholders, as the energy system was managed and controlled by central decision-makers. 
in order to reach a sustainable low-carbon energy system, technical and social insights should be integrated into these models [102, 103]. according to the technology review of the u.s. department of energy, the balance of energy supply and demand is affected as much by individual choices, preferences, and behavior as by technical performance [104]. the reliability of energy models is often low because they are excessively sensitive to cost analysis while ignoring major energy policy drivers such as social equity, politics, and human behavior [105]. several studies have recently indicated the role of social sciences in energy research [106, 107]. social parameters are usually difficult to quantify and, consequently, are usually neglected in quantitative energy models. however, there are practical methods of integrating human aspects into technical energy models, such as the inclusion of prosumers and agent-based modeling. originally coined by alvin toffler in his 1980 book the third wave [108], the term prosumer is a blend of the words producer and consumer, describing the active role of energy consumers in the production process. the conventional energy grid depended on the interaction between supplier and distributor, while consumers play an active role in the decentralized energy system. an important element of this new system is the role of prosumers, i.e., consumers who also produce energy and share the surplus renewable energy they generate with the grid and/or other consumers in the community [109]. with emerging renewable energies at the microscale, prosumers are not only an important stakeholder of the future smart grids but may also play a vital role in peak demand management [110]. however, social acceptance of the decentralized energy system faces several drivers and barriers [111] that need quantification in order to be imported into energy models. 
the emergence of prosumers has increased the diffusion of social sciences in energy system modeling (e.g., see refs. [112] [113] [114] [115]). in order to grasp an adequate knowledge of the decentralized energy system, the behavior of prosumers on the energy grid should thus be considered alongside the techno-economic characteristics (e.g., see refs. [116] [117] [118]). based on the position of the decision-maker, esms can be divided into two main categories. the common approach is the assumption of a system planner who optimizes a single objective function (e.g., system cost minimization). in contrast, agent-based models practice decentralized decision-making by assuming autonomous agents who decide based on their own objectives. agent-based modeling has been proposed as a suitable modeling approach for complex socio-technical problems [119], and it is used in modeling the wholesale electricity market considering human complexities [120]. ringler et al. reviewed agent-based models considering demand response, distributed generation, and other smart grid paradigms [121]. the term "agent" can be used to describe different types of players in the energy system, such as prosumers, power generators, storage operators, or policy makers. agents can optimize their own objective function, which can be based on economic (e.g., capital, npv, and tariffs), technical (e.g., efficiency, emissions, and maximum capacity), and social (e.g., bounded rationality, neighborhood effect, and heterogeneity) factors. including techno-economic factors in the objective function is relatively easy due to the quantitative nature of these parameters, while integrating qualitative social parameters remains a complicated task. qualitative parameters such as the perceived weight of environmental costs and impacts, expected utilities, social networks, and communication can be estimated by socio-demographic factors and behavior curves (e.g., see refs. [122] [123] [124]). 
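the agent logic described above can be caricatured in a few lines: each prosumer agent weighs an economic score against a neighborhood-effect term and adopts a technology once its heterogeneous threshold is crossed. all weights, thresholds, and trajectories below are invented; real agent-based esms calibrate such parameters from socio-demographic data and behavior curves.

```python
import random

# minimal agent-based adoption sketch (hypothetical parameters): each agent
# adopts pv when a payoff combining an economic term and a neighborhood-effect
# term crosses its agent-specific threshold (the heterogeneity factor).
random.seed(42)

class ProsumerAgent:
    def __init__(self):
        self.adopted = False
        self.threshold = random.uniform(0.4, 0.9)  # heterogeneous preferences

    def step(self, economic_score: float, neighbor_share: float):
        payoff = 0.7 * economic_score + 0.3 * neighbor_share
        if payoff >= self.threshold:
            self.adopted = True

agents = [ProsumerAgent() for _ in range(1000)]
for year in range(10):
    share = sum(a.adopted for a in agents) / len(agents)
    economic_score = 0.4 + 0.03 * year  # pv gets cheaper each year (assumed)
    for a in agents:
        if not a.adopted:
            a.step(economic_score, share)

final_share = sum(a.adopted for a in agents) / len(agents)
print(f"adoption after 10 years: {final_share:.0%}")
```

note how the neighbor_share feedback produces the s-shaped diffusion behavior that a single central-planner objective function cannot reproduce; this is precisely the kind of emergent dynamic the agent-based approach is meant to capture.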
macroeconomic models follow a top-down analytical approach, whereas techno-economic esms use a bottom-up approach. the analytical approach breaks a system into elementary elements in order to understand the interactions that exist between them, and this system reduction can be done in different ways. based on the reduction approach, esms are usually differentiated into three main groups: top-down, bottom-up, and hybrid models. top-down (td) models describe the energy-economy system as a whole to assess energy and/or climate change policies in monetary units [125]. these models mainly describe the relations between the energy system and variations in macroeconomic and environmental factors such as economic growth, demographics, employment rate, global warming, and ghg emissions. consequently, top-down models lack detail on current and future technological options, which may be relevant for an appropriate assessment of energy policy proposals [126]. macroeconomic equilibrium models are an example of top-down models. bottom-up (bu) models provide a higher degree of technological detail (in comparison to top-down models). characterized by a rich description of current and prospective energy supply and end-use technologies, bottom-up models describe energy system evolutions as resulting from a myriad of decisions on technology adoption [127]. they can compute the least-cost solution of meeting energy balances subject to various system constraints, such as exogenous emission reduction targets [128]. hybrid models (i.e., linking td and bu models) can be a solution to integrate top-down consistency while maintaining bottom-up detail. the major advantage of top-down models is their consistency with welfare, market, economic growth, and other macroeconomic indicators, which leads to a comprehensive understanding of energy policy impacts on the economy of a nation or a region. 
on the other hand, they lack an appropriate indication of technological progress, energy efficiency developments, non-economic boundaries of the system, and other technical details. instead, bottom-up models describe the energy system with detailed technological properties. however, bottom-up models lack feedback from the macro-effects of technical changes on the overall economy. therefore, closing the gap between top-down and bottom-up energy models results in more consistent modeling outcomes (e.g., see refs. [129] [130] [131] [132]). model linking is not an exclusive solution for td and bu models. hourcade et al. [133] argued that the three main dimensions of an energy-economy system are: technological explicitness, microeconomic realism, and macroeconomic completeness. the main advantage of model linking (i.e., a modeling suite) is the ability to provide consistent results while considering two or all three dimensions. each of these dimensions can be modeled with a number of different models depending on the complexity of the problem. the model linking approach can be classified into three subcategories, based on the level of linking [135]. first, individual stand-alone models are linked together manually, meaning that the processing and transferring of information between models is controlled by the user, preferably in an iterative manner (i.e., soft-linking). second, a reduced version of one model exchanges data with the master model while both run at the same time (i.e., hard-linking). third, a combined methodology features an integrated model through a unified mathematical approach (e.g., mixed complementarity problems [136]) (fig. 3). helgesen [134] used another classification based on the linking type of models and the terminology proposed by wene (i.e., soft-linking and hard-linking) [137]. 
the advantages of soft-linking can be summarized as practicality, transparency, and learning, while the advantages of hard-linking are efficiency, scalability, and control [138]. the above discussion of the main challenges of present and future esms identified several required modeling capabilities, which are summarized in table 3. in order to review models based on the mentioned challenges, the required capabilities are grouped into several model assessment criteria in table 4. it should be noted that the integrated energy system analysis capability is not mentioned further, as all reviewed models were integrated. moreover, the capability of linking esms with td models is discussed further in section 5. apart from the criteria that result from the emergent challenges of future esms, three additional criteria are considered in table 4, namely: (i) the underlying methodology of the model, to separate calculator models from non-calculator ones; (ii) the source of the model's datasets, to measure input-data quality; and (iii) the accessibility and the number of the model's applications, to determine the model's use and acceptance. considering the criteria regarding future low-carbon energy systems and the available models, it can be concluded that no perfect model exists. models can be assessed based on a list of criteria such as temporal resolution, spatial resolution, the social aspect, data source quality, accessibility, and application (table 4). the capability of the model in each criterion is given a score from five (highest) to one (lowest), as presented in table 5. the results are highly dependent on the scores and weights, which are both, to some extent, subjective. readers can alter the results by incorporating new criteria or changing the perspective weights. in the following sections, these modeling capabilities and the corresponding scores are explained. 
there are two parameters that differ across integrated esms, which are the inclusion of fos and the inclusion of technological learning. therefore, models can be grouped into three groups: (i) no flexibility option and no technological learning that would score one; (ii) the inclusion of either flexibility options or technological learning that would score three; and (iii) the inclusion of flexibility options and technological learning that would score five. esms usually balance the supply and demand on a yearly basis or a limited amount of (hourly) time-slices per year. nevertheless, some models have a higher temporal resolution and balance the system on an hourly basis. reviewed models can be categorized in three groups: (i) temporal resolution on yearly basis that would score one; (ii) time-slice approach that would score three; and (iii) hourly temporal resolution that would score five. some models have the capability to model the regions inside a country. this ability can provide regional insights on energy system policies and vice versa. although the limited computational capacity and the lack of data make it difficult to perform a detailed regional analysis, some models balance the system in different regions inside the country based on different capacities and properties of the regions (e.g., esme in the uk). reviewed models are divided in three groups: (i) models without regional depth that would score one; (ii) models which consider regions that would score four, since it is a considerable improvement; and (iii) models which consider gis data that would score five. the role of social analysis in techno-economic models is usually negligible. however, some modeling tools practice multi-agent programming in order to model qualitative aspects of stakeholders' decision-making practice on energy systems. 
models are categorized into two groups: (i) models capturing socio-economic parameters only through demand curves, which would score one, and (ii) agent-based models considering a set of decision-making rules for different stakeholders in the energy system, which would score five. the reviewed models employ different methodologies. in this review, the main distinction between methodologies is made between calculator and non-calculator methodologies. therefore, models can be divided into two groups: (i) calculator models that would score one, and (ii) non-calculator models that would score five. the depth of technical detail and the quality of the data play a crucial role in providing accurate insights into the energy system with regard to new technologies and sectoral coupling. moreover, data access is the first limitation of energy system research, as databases are rather private. models can then be divided into five groups: (i) models not indicating a data source that would score one; (ii) models using generalized open-source data that would score two; (iii) models using limited country-specific data that would score three; (iv) models using detailed open-source data that would score four; and (v) models using detailed country-specific datasets, possibly in combination with global datasets, that would score five. open-access models provide an opportunity to test the model and provide feedback. these models are divided into five groups: (i) models which provide no access that would score one; (ii) models which provide limited access that would score two; (iii) models which are commercial that would score three; (iv) models which are open-source but need permission that would score four; and (v) models which are completely open-source and accessible on the internet that would score five. a model with more applications and users makes it easier to disseminate and discuss results. 
models are grouped in five sets: (i) models with no publication yet that would score one; (ii) models applied in one country that would score two; (iii) models applied in two countries that would score three; (iv) models applied across eu countries that would score four; and (v) models which have been applied in many countries and are well-known that would score five. table 6 demonstrates the mca analysis with equal weights for all criteria. to calculate the score of each model for each criterion, the weighted percentage of that criterion in the model's total score is demonstrated. this percentage is calculated endogenously, as explained by equation (1): it indicates the share of the model's score in each criterion out of the model's total score. primes would score high mainly due to the inclusion of social parameters, while the high score of remix is due to its high spatial resolution. these models merely demonstrate improved capabilities compared to others; therefore, it does not mean that these models are the "best" models. moreover, some features of the models are not reflected in this table. for instance, metis works complementary to long-term esms as it only simulates one specific year. besides, the mca results can change considerably by assigning slightly different scores to various criteria, as total scores are relatively close. models such as the markal family and metis demonstrate high scores mainly due to their high granularity; however, they lack the inclusion of social parameters. ensysi includes social parameters while lacking spatial resolution and application. addressing all the policy-induced challenges of the energy system requires a comprehensive esm that is not yet available. therefore, a compromise should be made based on the challenges that the model is designed to address. consequently, a weighted decision matrix can be formed by using the ahp method [139]. in this method, the criteria are rated against each other with respect to the challenges. 
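equation (1) is not reproduced in this excerpt; as described, it takes a model's weighted score in one criterion as a share of its weighted total over all criteria. a sketch of that computation (the scores and weights are invented, not taken from table 6):

```python
# equation (1) as described in the text (a sketch): the share of a model's
# weighted score in one criterion out of its weighted total over all criteria.
def criterion_shares(scores, weights):
    """scores: 1-5 score per criterion; weights: matching criterion weights."""
    weighted = [s * w for s, w in zip(scores, weights)]
    total = sum(weighted)
    return [round(100 * v / total, 1) for v in weighted]  # percentages

# illustrative model scored on four equally weighted criteria:
print(criterion_shares([5, 3, 1, 4], [1, 1, 1, 1]))  # -> [38.5, 23.1, 7.7, 30.8]
```

note that the rounded shares here sum to 100.1, which is exactly the rounding artifact the table notes in the paper warn about.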
a consistency ratio (cr) is calculated to indicate the reliability of the weight table. saaty suggests that crs of more than 0.1 are too inconsistent to be considered reliable [139]. here the challenges are divided into two main groups, first: intermittency, flexibility, and further electrification; and second: human behavior and decentralization. the first group of challenges puts emphasis on technological detail and high temporal and spatial resolution, while the second group emphasizes the inclusion of social parameters and high spatial resolution. the pair-wise weighted decision matrix for the first group of challenges is formed in table 7. in each row, the importance of a criterion compared to the other criteria is given a number on this scale: 1 for equal importance, 3 for fairly more important, and 5 for extremely more important. it should be noted that these numbers are entirely subjective; thus, users can make their own decision matrix. the table is then normalized, and the sum of each row is reported as the weight of the criterion. to calculate the cr, each column of the pair-wise table is multiplied by the corresponding weight. then, the sum of each row is divided by the corresponding weight to get the λ values. the cr is then calculated by equation (2), in which n is the number of criteria. repeating the same procedure for the second group of challenges leads to table 8. in both cases, the low cr indicates low levels of inconsistency in the assigned weights. using the weights in table 8 to update the mca leads to a slightly different result, presented in table 9. for the first group, it is expected that models with high scores in technological detail, temporal resolution, and spatial resolution will get higher total scores. 
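the ahp procedure described above (column-normalize the pair-wise table, derive weights, compute the λ values, then the consistency ratio) can be sketched as follows. equation (2) is not shown in this excerpt, so the standard saaty formulation cr = ((λmax - n)/(n - 1))/ri is assumed, and the 3 × 3 pair-wise matrix is invented rather than taken from table 7.

```python
# ahp weight derivation and consistency ratio, following the procedure the
# text describes (the standard saaty formulation is assumed for equation (2)).
def ahp(pairwise):
    n = len(pairwise)
    col_sums = [sum(row[j] for row in pairwise) for j in range(n)]
    # normalize each column, then average each row to get the weights
    weights = [sum(pairwise[i][j] / col_sums[j] for j in range(n)) / n
               for i in range(n)]
    # lambda_i = (a @ w)_i / w_i; lambda_max is their mean
    lam = [sum(pairwise[i][j] * weights[j] for j in range(n)) / weights[i]
           for i in range(n)]
    lambda_max = sum(lam) / n
    ci = (lambda_max - n) / (n - 1)
    ri = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32}[n]  # saaty random index
    return weights, ci / ri

# illustrative matrix: criterion a fairly more important than b (3),
# extremely more important than c (5), using the text's 1/3/5 scale.
a = [[1, 3, 5],
     [1/3, 1, 3],
     [1/5, 1/3, 1]]
weights, cr = ahp(a)
print([round(w, 3) for w in weights], round(cr, 3))
```

for this matrix the cr comes out around 0.03, comfortably below saaty's 0.1 threshold, so the implied weights would count as consistent in the paper's sense.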
the remix model gets a high total score mainly due to the inclusion of high spatial resolution with the use of gis data and the inclusion of key flexibility and storage technologies with exogenous technological learning. the metis model provides lower technological detail by neglecting technological learning, while incorporating hourly temporal resolution and gis-based spatial resolution. for the second group, the inclusion of social parameters and fine spatial resolution gains importance. models with the inclusion of social parameters, such as primes and ensysi, get higher scores. although the metis model does not include social parameters, it keeps a high score due to its fine spatial resolution. irrespective of the assigned weights, we find four models at the top of the mca table, which are remix, primes, metis, and the markal family models. these models had high scores in nearly all criteria, while a low score in one criterion (for instance, the lack of social parameters for remix) is compensated by a high score in another criterion (high temporal and spatial resolution). these four models were either developed recently (e.g., remix and metis) or are under constant development (e.g., the markal family and primes). it shows how integrated energy system modeling points towards models with improved capabilities in all the criteria. other models keep the same ranking position except for iwes, ensysi, and energyplan, which changed their position considerably (i.e., by more than two positions). this position change can be explained by the asymmetry of these models' scores in the mca table. for instance, the iwes model gets a high score in the first four criteria while getting a low score in the last four criteria. the mca represents an overview of the current state of esms with regard to low-carbon energy system modeling challenges. however, there is a need for adding new capabilities to current esms in order to address future modeling challenges. 
in the next section, we discuss two potential modeling solutions based on our observations of the current state. the overall solution is to expand single models and/or to link different models. it is not practical to decide on the best model that addresses challenges regarding low-carbon energy systems, as each model has specific pros and cons. from a techno-economic point of view, the mca indicates that for modeling the low-carbon energy system, current models require specific capabilities such as hourly temporal resolution, regional spatial resolution, inclusion of sectoral coupling technologies, technological learning, and inclusion of social parameters. there are major gaps between policy questions and modeling capabilities in the criteria which were used to assess the models' performance. however, these criteria mainly focus on technical policy questions rather than the entire range of technical, microeconomic, and macroeconomic aspects. although techno-economic models are rich in detail, they lack the capability to answer microeconomic and macroeconomic policy questions. therefore, specific models, such as energy market models and general equilibrium models, have been developed. due to the strong interconnection between energy and economy, mixed policy questions arise that require analyzing the technical, microeconomic, and macroeconomic aspects of the energy-economy system. such analysis can be conducted either by developing single models or by combining different models (i.e., soft-linking, hard-linking, or integrating). current single models can be developed and/or extended by incorporating further capabilities up to acceptable computational limits. considering the limitations, the modeler makes choices and/or trade-offs on extensions to the model. 
developing a single model that can cover all the mentioned gaps could face limitations, such as a complicated mathematical methodology and limited computational capacities (apart from generic limitations such as high data needs and lack of transparency). some common energy system modeling methodologies are optimization, simulation, accounting, multi-agent, and equilibrium. each mathematical methodology can be developed to answer specific energy modeling questions. integrating two different methodologies can be mathematically very complicated (e.g., mixed complementarity problems in which the optimization and equilibrium formulations are mixed) or not feasible (e.g., mixing optimization and simulation formulations). therefore, single esms are naturally limited by their underlying methodology. one of the main limitations for improving the temporal and geographical resolution of esms is the computational capacity. the computational limitation can be addressed either by hardware or software development (fig. 4). hardware development follows an exponential growth and relates to improvements in the number of transistors, clock frequency, and power consumption of processors. on the other hand, software methods can be divided into solver-related and model-specific methods to improve the computing times of linear optimization esms [140]. solver-related methods focus on improving the solving methodology by using different off-the-shelf solvers, such as lindo, cplex, gurobi, and mosek, or by practicing customized algorithms, such as benders decomposition and parallelization. model-specific methods relate to heuristic methods, such as clustering, model reduction, decomposition, and parallelization. hardware-related developments proceed at a specific pace that is usually not affected by energy system modelers as users. 
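a minimal example of the clustering/model-reduction idea mentioned above: collapse a synthetic 8760-hour demand profile into eight season × day/night time-slices, trading temporal detail for tractability. the sinusoidal profile and the season boundaries are assumptions for illustration only; real models cluster measured profiles.

```python
import math

# model-reduction sketch: aggregate an hourly (8760-step) load profile into
# seasonal day/night time-slices (synthetic sine-shaped demand is assumed).
hourly = [100.0 + 30.0 * math.sin(2 * math.pi * h / 24)     # daily cycle
          + 20.0 * math.sin(2 * math.pi * h / 8760)          # seasonal cycle
          for h in range(8760)]

slices = {}
for h, load in enumerate(hourly):
    season = ["winter", "spring", "summer", "autumn"][h // 2190]
    period = "day" if 8 <= h % 24 < 20 else "night"
    slices.setdefault((season, period), []).append(load)

for key, vals in sorted(slices.items()):
    print(key, round(sum(vals) / len(vals), 1))  # mean load per time-slice
```

this reduces 8760 decision points to 8 representative ones; the cost is that chronology within a slice is lost, which is exactly why the text notes that storage and flexibility analyses prefer full hourly resolution.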
solver-related developments are followed by a few energy system modelers (e.g., see the beam-me project [184]), while the rest of the energy system modeling community follows model reduction and clustering methods that can be applied to temporal resolution, spatial resolution, and technological detail (e.g., see [185]). depending on the research questions, energy system modelers reduce or coarsen the resolution of the model in order to provide an answer within an adequate timeframe; a trade-off must therefore be made between different modeling capabilities through smart assumptions. an alternative approach to overcoming the limitations of single-model development is to form a modeling suite. model linking can be done between any set of desired models in order to enhance modeling capabilities; however, two types of energy model linking are more frequent in the literature: (1) linking bu and td models, such as optimization energy system models (oesms) linked with cge models, and (2) linking two bu models, such as oesms combined with energy market models (i.e., unit commitment or dispatch models). although linking models provides further modeling capabilities, it comes with certain challenges [141], such as the identification of connection points in soft-linking (e.g., see ref. [142]), reaching a convergent solution in soft-linking (e.g., see ref. [143]), and the mathematical formulation for integrated linking (e.g., see refs. [144, 145]). in summary, linking models can be resource-intensive, as it requires knowledge of different modeling frameworks. each model has its own set of assumptions and methodologies, which makes it complicated to maintain the harmonization of modeling assumptions in all steps of linking; a lack of harmonization may result in inconsistent results from the linked models. although this process seems straightforward, in practice it is an intricate procedure, as esms are themselves complex. 
therefore, having an overview of different energy models and their capabilities is essential to assemble the desired modeling suite. a linking approach is proposed for addressing the current energy system modeling gaps. table 10 provides an overview of the identified energy modeling gaps and the corresponding linking suggestions. these suggestions can form a modeling suite that involves four different models, namely the energy system model (esm), the energy market model (emm), the macroeconomic model (mem), and the socio-spatial model (ssm). the suggestions in table 10 can be framed in two separate modeling suites based on the methodology of the core esm (fig. 6). the first modeling suite can be formed around an optimization esm (oesm), which provides the cost-optimal state of the energy system assuming a fully rational central social welfare planner. the second modeling suite uses a socio-technical esm (stesm), which demonstrates a more realistic state of the energy system by assuming profit-maximizing agents that incorporate social parameters, such as behavioral economics, bounded rationality, the neighborhood effect, and the technology diffusion curve, into their decision-making. the core component of the suggested modeling suite is a central techno-economic esm acting as an information-processing hub that exchanges outputs with the other models. based on the current state of the energy system and future scenarios, the esm can determine the technology and energy mix, commodity and energy prices, the amount and price of emissions, and the total energy system cost. however, this standalone analysis is based on specific scenario assumptions about demand profiles, energy import and export profiles, decentralized energy supply prospects, and macroeconomic expectations. it is suggested to use linear relations (i.e., a linear optimization methodology) to keep the computational load manageable. 
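to make the point about linear optimization concrete, here is a minimal, hypothetical dispatch sketch: with a single demand-balance constraint and per-plant capacity bounds, the cost-minimal lp solution reduces to filling plants in merit order. all plant names, capacities, and costs below are invented for illustration and do not come from any reviewed model.

```python
# minimal cost-minimising dispatch (hypothetical plants): with one demand
# constraint and per-plant capacity bounds, the lp optimum is the merit order.

def merit_order_dispatch(plants, demand):
    """plants: list of (name, capacity_mw, marginal_cost_per_mwh).
    returns (dispatch, total_cost), filling the cheapest plants first."""
    dispatch, total_cost, remaining = {}, 0.0, demand
    for name, cap, mc in sorted(plants, key=lambda p: p[2]):
        q = min(cap, remaining)     # dispatch up to capacity or residual demand
        dispatch[name] = q
        total_cost += q * mc
        remaining -= q
    if remaining > 1e-9:
        raise ValueError("demand exceeds total capacity")
    return dispatch, total_cost
```

a real esm repeats such balances over thousands of hours, regions, and sectors, which is why temporal and spatial resolution drive the computational load discussed earlier.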
while the optimization framework determines the theoretically optimal state of the energy system, the simulation methodology can demonstrate feasible pathways to reach that state. therefore, by comparing the results of the optimization and simulation frameworks, the gap between the optimal solution and the feasible solution (symbolically demonstrated in fig. 5) can be identified. several policy parameters can affect the width of this gap by bringing the feasible solution closer to the optimal one. the combined analysis of the simulation and optimization methodologies can thus provide a better understanding of the role of each parameter in reaching the policy targets and the optimal state of the energy system. based on the review and the mca, several optimization esms, such as times and remix, can be used as the core esm of the modeling suite due to their fine temporal resolution and ample technological detail. agent-based simulation esms are not as common as optimization esms; therefore, from the reviewed models, only ensysi and primes can be selected as simulation core esms. current esms lack the capability to model the regional implications of the energy system, such as decentralized supply and demand, infrastructure costs and benefits, land use, and resource allocation. although some local energy system models, such as energis [146] and gisa sol [147], provide geographically resolved energy system analysis, they lack the interaction with other regions of the country. as regional variations of the energy system can have drastic effects on the system itself, it is suggested to hard-link the regional model into the core esm. the geographical resolution of esms can be improved depending on the research questions and available resources. for instance, after identifying spatially sensitive parameters of the energy system, such as heat supply location, renewable power production, transmission capacity expansion, and storage infrastructure, sahoo et al. 
provided a framework to integrate them into an esm (i.e., the opera model) [148]. focusing on infrastructure, van den broek et al. clustered the co2 source regions using the arcgis software and then incorporated the spatially resolved data into markal-nl-uu as the optimization-based esm [149]. for well-connected countries, it is suggested to hard-link an emm with the core esm to capture the flexibility potential of cross-border energy trade, although some studies use a soft-linking approach (e.g., see refs. [150, 151]). in particular, for eu countries this hard-linking is necessary, as the interconnection flexibility option (fo) can be in direct competition with domestic fos such as demand response or storage. emms usually use a milp methodology in order to model unit commitment; therefore, the inclusion of an emm inside an esm can be computationally intensive. it is suggested to use a linear optimization methodology consistent with the core esm to reduce the computational load while still reaching a fair estimate of energy import and export flows, particularly for electricity. assuming the regional and interconnection capabilities are integrated into the core esm, one soft-linking loop is suggested to obtain a consistent economic analysis. this loop incorporates a macroeconomic model, which keeps the demand and supply of commodities in equilibrium based on statistical economic data such as capital stocks and investments, demographics, the labor market, and trade and taxes. the esm outputs, such as energy prices, the energy mix, and emissions, are fed into the mem to update the supply, demand, and price data of energy and fuel commodities. the mem in turn provides the equilibrium demographics, gdp and income, monetary flows between economic sectors, trade, and the employment rate. 
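the esm-mem exchange just described can be sketched as a fixed-point iteration. the two stub models below (a price that rises with demand, a demand that falls with price) and all coefficients are invented purely to illustrate the loop and its user-defined convergence tolerance; they stand in for real esm and mem runs.

```python
# hypothetical soft-linking loop: a stub esm maps demand to an energy price,
# a stub mem maps price back to equilibrium demand; iterate until the change
# in demand falls below a user-defined tolerance.

def esm_stub(demand):
    return 20.0 + 0.1 * demand        # price rises with demand (assumed)

def mem_stub(price):
    return 400.0 - 2.0 * price        # demand falls with price (assumed)

def soft_link(demand0, tol=1e-6, max_iters=100):
    demand = demand0
    for n in range(1, max_iters + 1):
        price = esm_stub(demand)          # esm run: demand in, price out
        new_demand = mem_stub(price)      # mem run: price in, demand out
        if abs(new_demand - demand) < tol:
            return new_demand, price, n   # converged within tolerance
        demand = new_demand
    raise RuntimeError("soft-linking loop did not converge")
```

for these stubs the loop contracts to the joint equilibrium (demand 300, price 50); with real models, convergence is not guaranteed, which is exactly the soft-linking challenge cited above [143].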
this loop can be performed once, or it can continue until the results reach a convergence criterion, i.e., a user-defined maximum gap between the results of the two models. moreover, mem outputs can feed into an abm simulation esm in which consumer demand profiles are generated based on demographics, income, and employment (e.g., see ref. [152]).

table 10. model development and model linking suggestions based on the identified energy modeling gaps.

gaps: lack of sectoral coupling technologies between electricity, heat, and transport sectors; lack of new seasonal storage technology options such as tes and hes; lack of endogenous technological learning rates; lack of hourly temporal resolution for capturing intermittent renewables and corresponding potentials.
suggestion: developing a long-term planning optimization energy system model (esm) that involves all energy sectors, hourly temporal resolution, regional spatial resolution, seasonal storage options, and technological learning.

gaps: lack of regional spatial resolution for analyzing energy flows between regions across a country; lack of fine geographical resolution options such as gis, fine mesh, and clustering for analyzing decentralized intermittent supply and infrastructure costs and benefits; lack of spatially resolved datasets such as infrastructure and local storage.
suggestion: hard-linking the esm with a regional energy system model (resm) that involves resolved spatial resolution, land use analysis, and infrastructure analysis.

gaps: simplistic modeling of human behavior in current abms; the focus of current datasets on technological detail rather than stakeholders' behavior; high dependence of esms on consumer load profiles.
suggestion: developing an abm simulation socio-technical energy model (stesm) that involves stakeholders' behavior, local and neighborhood effects, bounded rationality, and perceived environmental values.

gap: lack of national energy modeling consistency with a european (or an international) energy market.
suggestion: hard-linking the esm with an international (or european) energy market model (emm) that involves an optimal dispatch electricity market, the gas and oil market, hourly temporal resolution, regional spatial resolution, and a detailed generation database.

gap: lack of energy modeling consistency with macroeconomic indicators.
suggestion: soft-linking the esm with a macroeconomic model (mem).

the stesm analyzes the social aspects of the energy system such as stakeholders' behavior, bounded rationality, imperfect communication, and perceived environmental value. the choice of models, connection points, and scenarios depends on the aims of the energy system modeling, the available expertise and resources, and access to models and datasets. in summary, we identified the capabilities and shortcomings of current esms for adequately analyzing the transition towards a low-carbon energy system. in this regard, we described seven current and future low-carbon energy system modeling challenges. finally, to bridge major energy modeling gaps, two conceptual modeling suites are suggested, based on optimization and simulation methodologies respectively, in which the integrated esm is hard-linked with a regional model and an energy market model and soft-linked with a macroeconomic model. model development and linking choices can be affected by major changes in the energy system outlook; for instance, the current covid-19 situation can have major impacts on economic activities, underlining the importance of soft-linking with macroeconomic models. a limitation of this study is that all information about the models has been gathered from published documents, which may be outdated as models are constantly under development; this review therefore provides a rather static view of esms. only a limited number of energy system models was presented in this review, mainly due to limited time, resources, and access to modeling databases. 
other challenges of low-carbon energy systems modeling not included here are the need for energy policy harmonization, energy market design, business models of new technologies, legislation and the legal aspects of the energy transition, and the social acceptance implications of the energy transition. another limitation of this study is the subjective assignment of scores in the mca; however, a clear explanation for the assignment of scores was provided. furthermore, the mca only considered single esms, while in practice a combination of models can be analyzed; a more comprehensive mca would consider the capabilities and limitations of suites of models.

declaration of competing interest: the authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

references:
- the revised renewable energy directive: clean energy for all europeans
- a review of computer tools for analysing the integration of renewable energy into various energy systems
- a review of energy system models
- a review of energy systems models in the uk: prevalent usage and categorisation
- a review of current challenges and trends in energy systems modeling
- energy systems modeling for twenty-first century energy challenges
- are decisions well supported for the energy transition? a review on modeling approaches for renewable energy policy evaluation
- the gap between energy policy challenges and model capabilities
- a review of modelling tools for energy and electricity systems with large shares of variable renewables
- a review of socio-technical energy transition (stet) models
- burden of proof: a comprehensive review of the feasibility of 100% renewable-electricity systems
- broek mv d. is a 100% renewable european power system feasible by 2050?
- smart energy europe: the technical and economic impact of one potential 100% renewable energy scenario for the european union
- least-cost options for integrating intermittent renewables in low-carbon power systems
- synergies of sector coupling and transmission reinforcement in a cost-optimised, highly renewable european energy system
- agent-based modeling of a thermal energy transition in the built environment
- adoption of energy efficient technologies by households - barriers, policies and agent-based modelling studies
- dynemo: model documentation. london: ucl energy institute
- dynemo: a dynamic energy model for the exploration of energy, society, and environment
- metis technical note t5: metis software introduction and architecture. european commission, brussels: directorate-general for energy
- australian energy projections to 2049-50
- the national energy modeling system: an overview. energy information administration
- the supply of flexibility for the power system in the netherlands
- a simulation model for the dutch energy system. the hague: pbl netherlands environmental assessment agency
- osemosys: the open source energy modeling system: an introduction to its ethos, structure and development
- modelling low-carbon energy system designs with the eti esme model. eti
- poles-jrc model documentation. luxembourg: publications office of the european union
- primes model, version 6. national technical university of athens
- a time step energy process model for germany - model structure and results
- scenario analysis: electrification of commercial urban road transportation and impacts on the energy system
- future security of power supply in germany - the role of stochastic power plant outages and intermittent generation
- analysis of alternative uk heat decarbonisation pathways. imperial college london for the committee on climate change
- institute for sustainable solutions and innovations (isusi)
- stockholm environment institute
- system analysis: dtu, department of management engineering
- documentation for the markal family of models
- documentation for the times model, part i. international energy agency
- a review of multi criteria decision making (mcdm) towards sustainable renewable energy development
- sensitivity analysis of multicriteria decision making methodology developed for selection of typologies of earth-retaining walls in an urban highway
- the goals of energy policy: professional perspectives on energy security, economics, and the environment
- impact of the level of temporal and operational detail in energy-system planning models
- large-scale variability of weather dependent renewable energy sources. management of weather and climate risk in the energy industry
- 10 trends reshaping climate and energy. the european political strategy centre
- wind energy in europe: scenarios for 2030
- an industrial policy for solar in europe. solarpower europe
- the potential of intermittent renewables to meet electric power demand: current methods and emerging analytical techniques
- flexibility requirements of renewable energy based electricity systems - a review of research results and methodologies
- a comprehensive survey of flexibility options for supporting the low-carbon energy future
- review of energy system flexibility measures to enable high levels of variable renewable electricity
- comparison of the techno-economic characteristics of different flexibility options in the european energy system
- potential of power-to-methane in the eu energy transition to a low carbon system using cost optimization
- potential for power-to-heat in the netherlands
- power to ammonia: feasibility study for the value chains and business cases to produce co2-free ammonia suitable for various market applications. institute for sustainable process technology
- the importance of high temporal resolution in modeling renewable energy penetration scenarios. 9th conference on applied infrastructure research
- implications of temporal resolution for modeling renewables-based power systems
- high-resolution modeling framework for planning electricity systems with high penetration of renewables
- the effect of weather uncertainty on the financial risk of green electricity producers under various renewable policies
- weather forecasting error in solar energy forecasting
- impact of wind power uncertainty forecasting on the market integration of wind energy in spain
- international aspects of a power-to-x roadmap
- benefits of demand response in electricity markets and recommendations for achieving them
- from passive demand response to proactive demand participation
- decentralised storage and demand response: impact on renewable share in grids and buildings. world sustainable energy days (wsed)
- capacity mechanisms for electricity. eprs, european parliamentary research service
- internal energy market. fact sheets on the european union
- the demand for flexibility of the power system in the netherlands
- wind power myths debunked: common questions and misconceptions
- a review of approaches to uncertainty assessment in energy system optimization models
- clean transport - support to the member states for the implementation of the directive on the deployment of alternative fuels infrastructure: good practice examples
- gas and the electrification of heating & transport: scenarios for 2050
- heat electrification: the latest research in europe
- a comprehensive review of thermal energy storage
- a review of available technologies for seasonal thermal energy storage. luxembourg: publications office of the european union
- lessons on technological learning for policy makers and industry. technological learning in the energy sector, lessons for policy, industry and science
- the transition of energy intensive processing industries towards deep decarbonization: characteristics and implications for future research
- the role of nearly-zero energy buildings in the transition towards post-carbon cities
- an assessment methodology of sustainable energy transition scenarios for realizing energy neutral neighborhoods
- technological learning in energy-environment-economy modelling: a survey
- a review of learning rates for electricity supply technologies
- a review of uncertainties in technology experience curves
- long-term payoffs of near-term low-carbon deployment policies
- decarbonising road transport with hydrogen and electricity: long term global technology learning scenarios
- modelling the evolutionary paths of multiple carbon-free energy technologies with policy incentives
- the role of technology diffusion in a decarbonizing world to limit global warming to well below 2 °c: an assessment with application of global times model
- multi-cluster technology learning in times: a transport sector case study with tiam-ucl. informing energy and climate policies using energy systems models
- power capacity expansion planning considering endogenous technology cost learning
- clean energy for all europeans. luxembourg: publications office of the european union
- assessing the energy-efficiency gap
- residential energy efficiency policies: costs, emissions and rebound effects
- the rebound effect and energy efficiency policy
- energy efficiency and consumption - the rebound effect - a survey
- infrastructure, investment and the low carbon transition. new challenges in energy security
- infrastructure transformation as a socio-technical process - implications for the governance of energy distribution networks in the uk
- a conceptual framework for understanding the social acceptance of energy infrastructure: insights from energy storage
- decentralized energy systems. european parliament's committee on industry
- temporally-explicit and spatially-resolved global onshore wind energy potentials
- designing a cost-effective co2 storage infrastructure using a gis based linear optimization energy model
- energy research and the contributions of the social sciences: a retrospective examination
- energy research and the contributions of the social sciences: a contemporary examination
- integrating social science in energy research
- what kind of socio-technical research for what sort of influence on energy policy?
- the reality of cross-disciplinary energy research in the united kingdom: a social science perspective
- the third wave. united states: bantam books
- a novel prosumer-based energy sharing and management (pesm) approach for cooperative demand side management (dsm) in smart grid
- prosumer based energy management and sharing in smart grid
- distributed energy systems on a neighborhood scale: reviewing drivers of and barriers to social acceptance
- the flexible prosumer: measuring the willingness to co-create distributed flexibility
- self-sustainable community of electricity prosumers in the emerging distribution system
- an optimized energy system planning and operation on distribution grid level - the decentralized market agent as a novel approach
- incentivizing prosumer coalitions with energy management using cooperative game theory
- socio-technical transitions and policy change - advocacy coalitions in swiss energy policy
- transformative policy mixes in socio-technical scenarios: the case of the low-carbon transition of the german electricity system
- socio-technical evolution of decentralized energy systems: a critical review and implications for urban planning and policy
- agent-based modeling: methods and techniques for simulating human systems. national academy of sciences
- agent-based modeling and simulation of competitive wholesale electricity markets. handbook of power systems ii
- agent-based modelling and simulation of smart electricity grids and markets - a literature review
- towards circular economy implementation: an agent-based simulation approach for business model changes
- agent-based modeling of energy technology adoption: empirical integration of social, behavioral, economic, and environmental factors. environmental modelling & software
- multi-agent based distributed control architecture for microgrid energy management and optimization
- introduction to energy systems modelling
- combining bottom-up and top-down
- reconciling top-down and bottom-up energy/economy models: a case of tiam-fr and imaclim-r. chaire modélisation prospective au service du développement
- technological change in economic models of environmental policy: a survey
- advances in techno-economic energy modeling: costs, dynamics and hybrid aspects
- useful models for simulating policies to induce technological change
- comparison of top-down and bottom-up estimates of sectoral and regional greenhouse gas emission reduction potentials
- closing the gap? top-down versus bottom-up projections of china's regional energy use and co2 emissions
- hybrid modeling: new answers to old challenges
- introduction to the special issue of top-down and bottom-up: combining energy system models and macroeconomic general equilibrium models
- integrated assessment of energy policies: decomposing top-down and bottom-up
- extension of gams for complementarity problems arising in applied economic analysis
- energy-economy analysis: linking the macroeconomic and systems engineering approaches
- hybrid modelling: linking and integrating top-down and bottom-up models. the h2020 project set-nav
- mathematical models for decision support
- methods to improve computing times in linear energy system optimization models
- integrating short term variations of the power system into integrated energy system models: a methodological review
- challenges in top-down and bottom-up soft-linking: lessons from linking a swedish energy system model with a cge model
- energy-economy analysis: linking the macroeconomic and systems engineering approaches
- bridging the gap using energy services: demonstrating a novel framework for soft linking top-down and bottom-up models
- from linking to integration of energy system models and computational general equilibrium models - effects on equilibria and convergence
- energis: a geographical information based system for the evaluation of integrated energy conversion systems in urban areas
- a gis-based decision support tool for renewable energy management and planning in semi-arid rural environments of northeast of brazil
- electricity sector flexibility analysis: the case of the north of the netherlands
- feasibility of storing co2 in the utsira formation as part of a long term dutch ccs strategy: an evaluation based on a gis/markal toolbox
- modelling flexible power demand and supply in the eu power system: soft-linking between jrc-eu-times and the open-source dispa-set model. institute of thermal technology
- soft-linking exercises between times, power system models and housing stock models
- linking agent-based energy market with computable general equilibrium model: an integrated approach to climate-economy-energy system

acknowledgments: the authors would like to thank paul koutstaal, pascal wissink, joost van stralen, somadutta sahoo, özge özdemir, bert daniels, german morales-españa, and other team members of the estrac project, as well as the anonymous reviewers and the editor, for their constructive comments, conversations, and feedback. 
the authors wish to acknowledge the support given by the estrac integrated energy system analysis project financed by the new energy coalition (finance code: 656039). the views expressed here are those of the authors alone and do not necessarily reflect the views of the project partners or the policies of the funding partners. supplementary data to this article can be found online at https://doi.org/10.1016/j.rser.2020.110195.

fig. 6. optimization-based or simulation-based conceptual model linking framework for the low-carbon energy system modeling suite.

key: cord-293562-69nnyq8p
authors: imran, mudassar; usman, muhammad; malik, tufail; ansari, ali r.
title: mathematical analysis of the role of hospitalization/isolation in controlling the spread of zika fever
date: 2018-08-15
journal: virus res
doi: 10.1016/j.virusres.2018.07.002
sha: doc_id: 293562 cord_uid: 69nnyq8p
the zika virus is transmitted to humans primarily through aedes mosquitoes and through sexual contact. it is documented that the virus can be transmitted to newborn babies from their mothers. we consider a deterministic model for the transmission dynamics of the zika virus infectious disease that spreads, in both humans and vectors, through horizontal and vertical transmission. the total populations of both humans and mosquitoes are assumed to be constant. our models consist of a system of eight differential equations describing the human and vector populations during the different stages of the disease. we have included the hospitalization/isolation class in our model to see the effect of this controlling strategy. we determine the expression for the basic reproductive number r0 in terms of the horizontal as well as vertical disease transmission rates. an in-depth stability analysis of the model is performed, and it is consequently shown that the model has a globally asymptotically stable disease-free equilibrium when the basic reproduction number r0 < 1. 
it is also shown that when r0 > 1, there exists a unique endemic equilibrium. we showed that the endemic equilibrium point is globally asymptotically stable when it exists; we were able to prove this result in a reduced model. furthermore, we conducted an uncertainty and sensitivity analysis to recognize the impact of crucial model parameters on r0. the uncertainty analysis yields an estimated value of the basic reproductive number r0 = 1.54. assuming infection prevalence in the population under constant control, optimal control theory is used to devise an optimal hospitalization/isolation control strategy for the model. the impact of isolation on the number of infected individuals and the accumulated cost is assessed and compared with the constant control case.

the zika virus spreads among humans primarily through an infected mosquito bite, and its incidence has been increasing at an alarming rate worldwide over the past few years (dick et al., 1952). it belongs to the family of flaviviruses, which includes more than fifty viruses, such as dengue, yellow fever, and the west nile virus. this virus was first identified in the zika forest of uganda, east africa, during investigations on the ecology of yellow fever (anderson et al., 2016). the first isolation was made in april of 1947 from the serum of a caged, pyrexial rhesus monkey; the second isolation was made later in 1947 in the same forest (dick et al., 1952). just a year later, in 1948, the virus was recovered from the mosquito aedes africanus in the zika forest. the first human case of zika fever was reported in uganda in 1952. the first outbreak of zika fever was reported in 2007 on the pacific island of yap; this outbreak caused 108 symptomatic cases. another epidemic outbreak occurred in french polynesia between 2012 and 2014, during which an estimated 28,000 people were reported to have zika-like symptoms (anderson et al., 2016). brazil subsequently became the center of the epidemic, reporting the most cases of people infected with the zika virus worldwide. 
in 2016, the state of rio de janeiro alone reported over 71,000 probable zika virus infections; however, this number dropped to only 2210 cases in 2017. sexual transmission of the zika virus has also been reported, from both males and females to their partners (cdc, 2018; ecdc, 2018; hastings and fikrig, 2003; summers and acosta, 2015). the zika virus can be sexually transmitted by a person who is infected with the virus even while they are not symptomatic. furthermore, it has been suggested that the zika virus can be transmitted from a pregnant woman to her fetus during pregnancy; the zika virus can thus be transferred horizontally as well as vertically. during the zika epidemic, brazilian health officials reported, in zika-affected areas, an increase in the number of cases of microcephaly, a condition in which a baby's head is smaller than normal. the main symptoms of zika fever include fever, a maculopapular rash often spreading from the face to the body, joint and muscle pain, vomiting, and bilateral non-purulent conjunctivitis (ecdc). the first well-documented case of zika fever was reported in 1964 and started with a mild headache, with later development of a maculopapular rash, fever, and back pain (hayes, 2009). the symptoms of zika fever are thus quite similar to those of dengue fever, and there is a strong possibility of misdiagnosis in regions where the dengue virus is common. the incubation period for the zika virus is between 3 and 14 days (krow-lucal et al., 2017). disease-related symptoms develop within one week of infection for 50% of infected individuals and within 2 weeks for 99% of infected individuals (krow-lucal et al., 2017). the vast majority of infections are not contagious from person to person; however, the virus may be passed from person to person during sex. the virus infection is usually diagnosed by a blood test.
the disease symptoms are usually mild and short lasting (17 days), and infection may go unrecognized or be misdiagnosed as dengue fever (ecdc). unfortunately, there is no vaccine, antiviral drug, or other modality available to prevent or treat zika virus infection. zika fever is a preventable but not a curable disease. thus, the only means of controlling the zika virus are to control the mosquitoes that spread the disease and to use protection during sex. in the past several years, a number of deterministic models for the transmission dynamics of the dengue virus have been studied and analyzed (esteva and vargas, 1998; ferguson et al., 1999; garba et al., 2008, 2010; kautner et al., 1997; wearing and rohani, 2006). after the zika outbreak, models for zika transmission that include the effect of sexual transmission of the disease have been developed and analyzed (agustoa et al., 2017; maxiana et al., 2017; wiratsudakul et al., 2018). in this work, we formulate and study a deterministic model for zika virus transmission including vertical and horizontal transmission of the disease. although esteva and vargas (2000) discussed vertical disease transmission, it was among vectors in a dengue transmission model. our deterministic model for zika virus transmission includes horizontal and vertical transmission in both humans and vectors. as stated previously, it has been suggested that the zika virus can spread to newborns from their mothers, and we therefore feel that an accurate model must include vertical transmission. our work is an extension of our previous model (imran et al., 2017) by including a population group that is subject to controlling measures. in the previous work, we considered death due to the infection, and the total human and vector populations were functions of time. the previous model did not possess global stability for either the disease-free or the endemic equilibrium, and we showed a backward bifurcation phenomenon.
the current model has a constant population size. since the death cases reported from zika fever were negligible, we take disease-induced mortality to be zero. there is no backward bifurcation, and the steady-state results are global. since the only way to control the disease is to isolate patients who have been infected with the zika virus, we included a new population compartment consisting of hospitalized individuals. we have calculated the basic reproductive number associated with our model, whose reduction below unity guarantees the elimination of the disease. finally, using optimal control techniques, we also propose and analyze control strategies for decreasing the number of infected individuals while simultaneously minimizing the costs and resources. the rest of this paper is organized as follows. the proposed model is presented in section 2. basic properties and a detailed steady-state analysis of the model are presented in section 3. in section 4, we perform a sensitivity and uncertainty analysis of the model parameters and the reproductive number associated with our model. in section 5, ideas from optimal control theory are used to propose various controlling strategies to overcome zika. finally, section 6 presents our conclusions and contains a brief discussion of our results. we consider two types of populations in this model, one for the humans and one for the mosquitoes. the total human and mosquito populations at time t, denoted by n_h and n_v, are constant. the human population is divided into five mutually exclusive groups: susceptible humans s_h(t), exposed humans e_h(t), infected humans i_h(t), isolated or hospitalized individuals h_h(t) and recovered humans r_h(t). the total vector population is divided into three mutually exclusive classes: susceptible vectors s_v(t), exposed vectors e_v(t) and infected vectors i_v(t). the model assumes that the susceptible human population s_h(t) has a recruitment rate μ_h n_h, where n_h is the total human population and μ_h is the natural birth rate of humans.
we assume that the birth rate of the human population is the same as the natural death rate. susceptible individuals get infected with the zika fever virus (due to contact with infected vectors) at a rate λ_h and thus enter the exposed class e_h. in order to consider vertical transmission in our model, we make the assumption (see, e.g., li et al., 2001) that a fraction of newborn individuals from parents in the e_h(t) and i_h(t) classes will be infected, and thus remain in the e_h class before becoming infectious. we have assumed that the hospitalized individuals do not contribute to vertical transmission. the population in each class is removed at the natural death rate μ_h. we assumed lifelong immunity for humans who have recovered from the zika virus. the exposed individuals who got an infection move to the infectious class at a rate ξ. the infected population recovers from zika fever at a rate θ_i, and some infected individuals are transferred to the hospitalized class at a rate τ. the hospitalized population recovers at a rate θ_h. the susceptible vector population s_v(t) has a recruitment rate μ_v n_v, where μ_v is the natural death rate of the vector population. a fraction of offspring in the e_v(t) and i_v(t) classes will be infected, and thus remain in the e_v class before becoming infectious; because of this vertical transmission, a fraction of susceptible vectors will enter the exposed class. susceptible vectors are infected with the zika virus (due to effective contact with infected humans) at a rate λ_v and thus move to the exposed vector class e_v. the susceptible, exposed and infected vectors have natural death rate μ_v. in addition, exposed vectors develop symptoms and move to the infected vector class i_v at a rate σ_v. it is assumed that infected vectors do not recover and die at the natural death rate μ_v. as mentioned earlier, there is no vaccine available for zika fever.
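the compartmental description above can be made concrete with a minimal simulation sketch. everything below is illustrative only: the model is written in population fractions (so n_h = n_v = 1 and total populations stay constant, as assumed in the text), the forces of infection use a simple frequency-dependent form, and every parameter value is a hypothetical placeholder rather than a fitted value from the paper.

```python
# minimal sketch of the host-vector seir(h) model with vertical transmission.
# compartments are population *fractions*, so n_h = n_v = 1 stays constant.
# all parameter values are hypothetical placeholders for illustration only.

def deriv(y, par):
    s_h, e_h, i_h, h_h, r_h, s_v, e_v, i_v = y
    lam_h = par["c_hv"] * i_v                       # force of infection on humans
    lam_v = par["c_hv"] * (i_h + par["eta"] * h_h)  # force of infection on vectors
    mu_h, mu_v = par["mu_h"], par["mu_v"]
    ds_h = mu_h * (1 - par["p"] * e_h - par["q"] * i_h) - lam_h * s_h - mu_h * s_h
    de_h = lam_h * s_h + mu_h * (par["p"] * e_h + par["q"] * i_h) - (par["xi"] + mu_h) * e_h
    di_h = par["xi"] * e_h - (par["theta_i"] + par["tau"] + mu_h) * i_h
    dh_h = par["tau"] * i_h - (par["theta_h"] + mu_h) * h_h
    dr_h = par["theta_i"] * i_h + par["theta_h"] * h_h - mu_h * r_h
    ds_v = mu_v * (1 - par["r"] * e_v - par["s"] * i_v) - lam_v * s_v - mu_v * s_v
    de_v = lam_v * s_v + mu_v * (par["r"] * e_v + par["s"] * i_v) - (par["sigma_v"] + mu_v) * e_v
    di_v = par["sigma_v"] * e_v - mu_v * i_v
    return [ds_h, de_h, di_h, dh_h, dr_h, ds_v, de_v, di_v]

def rk4(y, par, dt, steps):
    # classical fourth-order runge-kutta integration
    for _ in range(steps):
        k1 = deriv(y, par)
        k2 = deriv([y[i] + 0.5 * dt * k1[i] for i in range(8)], par)
        k3 = deriv([y[i] + 0.5 * dt * k2[i] for i in range(8)], par)
        k4 = deriv([y[i] + dt * k3[i] for i in range(8)], par)
        y = [y[i] + dt / 6.0 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(8)]
    return y

par = dict(c_hv=0.4, eta=0.1, mu_h=0.01, mu_v=0.1, xi=0.2,
           theta_i=0.1, theta_h=0.15, tau=0.1, sigma_v=0.15,
           p=0.05, q=0.05, r=0.05, s=0.05)  # hypothetical values
y0 = [0.99, 0.0, 0.01, 0.0, 0.0, 0.99, 0.0, 0.01]
y = rk4(y0, par, dt=0.1, steps=2000)
```

a useful sanity check of the constant-population assumption is that the human and vector fractions each sum to one along the whole trajectory.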
the only way to control this disease is to reduce the contact rate, either by killing the mosquitoes or by using protective measures like mosquito repellents, nets, etc. the effective contacts will be further reduced by isolating infected humans. isolation of individuals with disease symptoms constitutes what is probably the first infection control measure since the beginning of recorded human history (hethcote, 2000). over the decades, these control measures have been applied, with varying degrees of success, to combat the spread of some emerging and re-emerging diseases such as leprosy, plague, cholera, typhus, yellow fever, smallpox, diphtheria, tuberculosis, measles, ebola, pandemic influenza and, more recently, severe acute respiratory syndrome (gumel et al., 2003; imran et al., 2013; lipsitch et al., 2003; lloyd-smith et al., 2003). chavez et al. analyzed an saiqr model in detail to investigate the effect of isolation on influenza (vivas-barber et al., 2015); they used an isolation (quarantine) model in which the infected population is isolated. we have included epidemiological factors like permanent or partial immunity after recovery, as well as intervention control measures, through the inclusion of a hospitalized (or isolated) class, h_h. in this case both the total host population and the vector population are constant. the forces of infection λ_h and λ_v are given following (garba et al., 2008); here we assume that an individual in the h_h class can still transmit the disease, but at a lower rate, through the modification parameter 0 ≤ η < 1. fig. 1 presents a schematic diagram of the model (1). a description of the variables and parameters of the model (1) is given in tables 1 and 2, respectively. the model (1) will be studied on a closed set which is positively invariant and attracting with respect to the model (1). it can be seen that the solutions are always positive.
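the explicit expressions for the forces of infection were lost in extraction. under the standard frequency-dependent vector-host formulation of garba et al. (2008), and with the hospitalized modification parameter η described above, they plausibly take a form like the following; this is a sketch of the assumed structure, not necessarily the authors' exact expressions:

```latex
\lambda_h = \frac{c_{hv}\, i_v}{n_h}, \qquad
\lambda_v = \frac{c_{hv}\,\bigl(i_h + \eta\, h_h\bigr)}{n_h},
```

where c_hv is the effective contact rate and 0 ≤ η < 1 discounts the relative infectiousness of hospitalized individuals.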
the right-hand sides of (1) are smooth, so that initial value problems have unique solutions that exist on maximal intervals. since paths cannot leave the feasible region, solutions exist for all positive time. thus the model is mathematically and epidemiologically well posed. in this section, we will perform a detailed steady-state and stability analysis of the zika fever model presented in section 2. the model (1) has a disease-free equilibrium (dfe), denoted e_0. in order to investigate the local stability of the dfe e_0, the next generation operator method (van den driessche and watmough, 2002) will be used. following the notation of van den driessche and watmough (2002), the matrix f (for the new infection terms) and the matrix v (of the transition terms) are constructed from the infected compartments of (1). (a fragment of table 2 appears here: ξ, progression rate of humans from exposed to infected class; τ, hospitalization rate of infected individuals; σ_v, progression rate of vectors from exposed to infected class; c_hv, effective contact rate; η, modification parameter for relative infectiousness of hospitalized humans.) the basic reproduction number r_0 for our model is given by (2). when p = q = 0, vertical transmission is not present in the model, and the above r_0 reduces to the basic reproduction number of a seir model for a vector disease (garba et al., 2008). to get a better understanding of the basic reproduction number r_0 in (2), we rewrite it using a taylor expansion about p and q. note that r_c is the basic reproductive number for horizontal transmission (khan et al., 2014). the square root reflects the two generations required for an infected vector or host to reproduce itself (van den driessche and watmough, 2002). r_p + r_q is the sum of the number of infected individuals during the mean latent period and the number of infected individuals during the mean infectious period.
similarly, r_r + r_s is the sum of the number of infected vector offspring during the mean latent period and the number of infected vector offspring during the mean infectious period. the expression r = r_c r_p + r_c r_q + r_c r_r + r_c r_s represents the total contribution to the infective class made by the exposed and infective individuals of the first generation (li et al., 2001). the local stability of the disease-free equilibrium follows directly from van den driessche and watmough (2002). we have the following results about local and global stability of the disease-free state. lemma 3.1. the dfe e_0 of the model (1) is locally asymptotically stable if r_0 < 1, and unstable if r_0 > 1. theorem 3.2. the dfe e_0 of the model (1) is globally asymptotically stable in the feasible region whenever r_0 < 1. proof. let x(t) be a solution of (1) with x_0 = x(0). a comparison theorem will be used for the proof. the equations for the infected components of (1) can be written in terms of the matrices f and v (where the prime denotes the derivative with respect to time). lemma 3.1 established the local asymptotic stability of the dfe when r_0 < 1, or equivalently ρ(fv^-1) < 1, which is equivalent to all eigenvalues of f − v having negative real parts when r_0 < 1 (van den driessche and watmough, 2002). also, f − v has all off-diagonal entries non-negative. this means that the omega limit set of x_0, ω(x_0), is contained in the disease-free space. on the other hand, it is straightforward to check that every solution with initial condition in the disease-free space converges to e_0. the epidemiological implication of the above result is that the disease can be eliminated from the population if the basic reproduction number r_0 can be brought down to a value less than unity (that is, the condition r_0 < 1 is sufficient and necessary for disease elimination), irrespective of the size of the initial populations in each class. the stability of the dfe is demonstrated in fig. 2a.
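the next-generation recipe invoked above can be illustrated numerically. since the full eight-compartment f and v matrices were elided in extraction, the sketch below uses a deliberately simpler single-host seir model, for which r_0 = ρ(f v^{-1}) has the closed form βσ/((σ+μ)(γ+μ)); all rate values are hypothetical, and none of this is the paper's actual zika computation.

```python
# next-generation matrix illustration on a simple seir model.
# infected compartments are (e, i); f holds new-infection terms, v transition terms.
# r_0 is the spectral radius of f v^{-1}.

def matmul2(a, b):
    return [[a[0][0]*b[0][0] + a[0][1]*b[1][0], a[0][0]*b[0][1] + a[0][1]*b[1][1]],
            [a[1][0]*b[0][0] + a[1][1]*b[1][0], a[1][0]*b[0][1] + a[1][1]*b[1][1]]]

def inv2(m):
    det = m[0][0]*m[1][1] - m[0][1]*m[1][0]
    return [[ m[1][1]/det, -m[0][1]/det],
            [-m[1][0]/det,  m[0][0]/det]]

def spectral_radius2(m):
    # eigenvalues of a 2x2 matrix via the characteristic quadratic;
    # the discriminant is real for this nonnegative next-generation matrix.
    tr = m[0][0] + m[1][1]
    det = m[0][0]*m[1][1] - m[0][1]*m[1][0]
    disc = (tr*tr - 4*det) ** 0.5
    return max(abs((tr + disc)/2), abs((tr - disc)/2))

beta, sigma, gamma, mu = 0.5, 0.2, 0.1, 0.02   # hypothetical rates
f = [[0.0, beta], [0.0, 0.0]]                  # new infections enter e from contacts with i
v = [[sigma + mu, 0.0], [-sigma, gamma + mu]]  # transitions out of e and i
r0 = spectral_radius2(matmul2(f, inv2(v)))
```

the computed spectral radius agrees with the seir closed form, which is the same consistency that the paper's (2) provides for the full model.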
if r_0 > 1 the dfe is unstable, and the solutions are attracted to an (apparently unique and stable) endemic equilibrium, as depicted in fig. 2b. in this section, the existence and stability of the endemic equilibrium (ee) of the model (1) will be discussed. we define an endemic equilibrium to be a fixed point of the system (1) in which at least one of the infected compartments of the model is non-zero. theorem 3.3. the endemic state e_1 of the model (1) exists whenever r_0 > 1. proof. let e_1 denote an arbitrary endemic equilibrium of the model (1). solving the equations of the model (1) for the steady state, by setting the right-hand sides of the model (1) to zero, yields expressions in which all the k's are defined above. substituting (4) into (3) and simplifying, and then substituting (6) into (5), we obtain an equation in λ_h. clearly the model (1) has no endemic state if r_0 < 1 and one unique endemic state when r_0 > 1. □ fig. 3 shows the variation in r_0 with the relative fraction of newborn exposed individuals (p) and newborn infected individuals (q); it is a contour plot of r_0 as a function of p and q. we notice that the fraction of newborn exposed individuals (p) should be no more than about 5%, and the fraction of newborn infected individuals (q) no more than about 25%, to bring r_0 below 1. fig. 4 shows the effect of the hospitalization rate τ of infected individuals on the basic reproductive number r_0. from this figure, we can see that effective isolation will help to reduce the basic reproductive number: about 25% of infected individuals should be effectively isolated to bring the basic reproductive number to less than 1. this plot shows that effective isolation is helpful in controlling the zika virus epidemic. theorem 3.4. if r_0 > 1 then the disease is strongly uniformly ρ-persistent, and in this case there exists an endemic steady state. proof. let x(t) be a solution of model (1), and let x denote the disease-free subspace.
note that x, as well as its complement in the non-negative orthant, is positively invariant. also, all solutions originating in x converge to n_0 as t → ∞, and n_0 is asymptotically stable in x; hence n_0 is isolated in x. corollary 4.7 in salceanu (2011), together with proposition 4.1 and lemma 3.1 in salceanu (2011), imply that {n_0} is also uniformly weakly repelling. then, from theorem 8.17 in smith and thieme (2011), the semiflow generated by (1) is uniformly weakly ρ-persistent. from the positive invariance of the feasible region, we have that (1) is point dissipative. then, according to theorem 2.28 in smith and thieme (2011), there exists a compact attractor of points for (1). this, together with uniform weak ρ-persistence, implies (10) (see smith and thieme, 2011, theorem 5.2). in this case there exists an endemic steady state (smith and thieme, 2011). □ the local stability of the endemic steady state e_1 of the model (1) is given in the theorem below. theorem 3.5. if r_0 > 1, then the endemic state of the model (1) is locally asymptotically stable in the feasible region of model (1). for the proof of the above local stability theorem, see imran et al. (2017). for the global stability of the endemic steady state e_1, we consider model (1) with no hospitalization and a small incubation period, so that we can assume that susceptible individuals move to the infected class after infection. in this case, it is easily seen that both for the host population and for the vector population the corresponding total population sizes are asymptotically constant; we assumed that in our model the total population is constant. previous results (thieme, 1992) imply that the dynamics of system (1) is qualitatively equivalent to the dynamics of the reduced system (11). theorem 3.6. if r_0 > 1, then the endemic state of (11) is globally asymptotically stable in the interior of the feasible region. proof. we will use the geometric approach to global stability given in li et al. (2001). let x = (s, i_h, i_v) and let f(x) denote the vector field of (11).
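the second additive compound matrix used in this geometric approach was elided in extraction. for a general 3x3 jacobian j = (j_kl), it has the standard form (li et al., 2001); only this general template is reproduced here, not the model-specific entries:

```latex
J^{[2]} =
\begin{pmatrix}
 j_{11}+j_{22} & j_{23} & -j_{13} \\
 j_{32} & j_{11}+j_{33} & j_{12} \\
 -j_{31} & j_{21} & j_{22}+j_{33}
\end{pmatrix}.
```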
the jacobian matrix of (11) and its second additive compound matrix j^[2] (see li et al., 2001) are formed in the usual way; from the system (11), we can then rewrite the second and third equations. the variation in the values of the parameters of our model (1) is a source of uncertainty and sensitivity. in this section, we carry out parameter-based global uncertainty and sensitivity analyses on r_0. there are many reasons for parameter uncertainty, for example inadequate data and lack of information about vertical transmission. we use a latin hypercube sampling based method to quantify the uncertainty and the sensitivity of r_0 as a function of 13 model parameters, namely μ_h, μ_v, θ_i, θ_h, ξ, σ_v, p, q, r, s, c_hv, τ and η. the partial rank correlation coefficient (prcc) measures the impact of the parameters on the output variable, using the rank transformation of the data to reduce the effects of nonlinearity. the uncertainty analysis (figs. 5 and 6) yields an estimated value of r_0 = 1.54 with 95% ci (1.3491, 1.7669) for zika fever. the sensitivity analysis suggests that r_0 is highly sensitive to the parameters c_hv, θ_i, θ_h, μ_v and τ. accuracy and precision in the values of these parameters are vital for accurate predictions of the model. the estimated parameters are presented in table 3. one of the goals of this study is to come up with a time-dependent hospitalization/isolation strategy that would minimize the infected population while at the same time keeping the costs to a minimum (m. imran et al., virus research 255 (2018) 95-104). optimal control is a very useful mathematical technique that can help us to address these questions.
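the sampling-plus-rank-correlation workflow just described can be sketched as follows. for brevity this uses a toy closed form for r_0 over four parameters and plain spearman rank correlation as a simpler stand-in for the prcc (which additionally partials out the influence of the other parameters); the parameter ranges and the formula are illustrative assumptions, not the authors' values.

```python
import random

# latin hypercube sample: one stratified, independently shuffled column per parameter.
def lhs(ranges, n, rng):
    cols = {}
    for name, (lo, hi) in ranges.items():
        pts = [lo + (hi - lo) * (j + rng.random()) / n for j in range(n)]
        rng.shuffle(pts)
        cols[name] = pts
    return cols

def ranks(xs):
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    for rank, i in enumerate(order):
        r[i] = float(rank)
    return r

def spearman(xs, ys):
    # pearson correlation of the rank-transformed data (no ties expected here)
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

rng = random.Random(1)
ranges = {"c": (0.2, 0.6), "xi": (0.1, 0.3), "theta": (0.05, 0.2), "mu": (0.01, 0.03)}
sample = lhs(ranges, 500, rng)
# toy reproductive-number surrogate: monotone increasing in c, decreasing in theta
r0s = [c * xi / ((xi + mu) * (th + mu))
       for c, xi, th, mu in zip(sample["c"], sample["xi"], sample["theta"], sample["mu"])]
rho_c = spearman(sample["c"], r0s)          # contact rate: strong positive rank correlation
rho_theta = spearman(sample["theta"], r0s)  # recovery rate: strong negative rank correlation
```

the signs of the rank correlations reproduce the qualitative conclusion above: the contact rate pushes r_0 up, the recovery rate pushes it down.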
here our goal is to drive down infection in the population by increasing the recovered class, and to minimize the resources required to control the zika fever infection using isolation or hospitalization. the optimal control algorithm we use is based on pontryagin's maximum principle, which appends to the original model an adjoint system of differential equations with terminal conditions. the optimality system, which characterizes the optimal controls, consists of the differential equations of the original model (the state system) along with the adjoint differential equations (the adjoint system). the number of equations in the adjoint system is the same as in the state system. the detailed mechanism for forming the necessary conditions for the adjoints and optimal controls is discussed in fleming and rishel (1975) and pontryagin and boltyanskii (1980). an important decision when formulating an optimal control problem is deciding how and where to introduce the control into the system of differential equations. the form of the optimal control primarily depends on the system being analyzed and the objective function to be optimized. in this paper, we will propose various strategies to eradicate zika fever using optimal control techniques. the first step is to find an optimal hospitalization schedule that minimizes the number of infectious individuals and the overall cost of hospitalization during a fixed time. we define the control set as u = {τ(t) : 0 ≤ τ(t) ≤ ζ, 0 ≤ t ≤ t, 0 < ζ ≤ 1, τ(t) lebesgue measurable}. here ζ is a positive number, defined as the maximum value attained during the optimal control procedure. our aim is to minimize the associated cost function, in which a_i and a_n are positive constants used to balance the sizes of i_h(t) and n_v(t).
further, we used a nonlinear cost function in order to accommodate the impact of the variety of factors associated with hospitalization, documented widely in the literature; see for instance kirschner et al. (1997). w is the weight associated with the quadratic cost due to hospitalization. moreover, a linear function has been chosen for the cost incurred by infected individuals and the mosquito population. our objective is to find an optimal hospitalization rate control τ*(t) that minimizes this cost. the lagrangian of the optimization problem and the associated hamiltonian are formed in the usual way, with the hamiltonian adding to the integrand the terms ϕ_i k_i, where k_i represents the right-hand side of the ith equation in our original model. w depends on the relative importance of the control measure in mitigating the spread of the disease, as well as on the cost incurred (such as material resources and human effort) during the implementation of the control measure per unit time. pontryagin's maximum principle converts the model (1) and the objective function (16) into the problem of minimizing the hamiltonian (17) with respect to τ. we now prove the following theorem to elicit the effect of optimal control of hospitalization. theorem 5.1. there exists a unique optimal control τ*(t) which minimizes j over u, and there exists an adjoint system of ϕ_i's such that the optimal treatment control is characterized in terms of the states and adjoints, for some positive number ζ; the adjoint system also satisfies the transversality condition at the final time. proof. we can easily verify that the integrand of j is convex with respect to τ(t). also, the solutions of our model are bounded above. in addition, it is verifiable that the model has the lipschitz property with respect to the state variables. using the properties mentioned above along with corollary 4.1 of fleming and rishel (1975), the existence of an optimal control is established. now, using pontryagin's maximum principle, we obtain the characterization of the optimal control.
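the elided characterization can be sketched in its standard form. assuming a running cost of the shape ∫ (a_i i_h + a_n n_v + (w/2) τ²) dt, which is our guess at the structure described above rather than the authors' exact functional, the stationarity condition ∂h/∂τ = 0 together with the bounds 0 ≤ τ ≤ ζ gives a characterization like:

```latex
\frac{\partial H}{\partial \tau}
  = w\,\tau + (\phi_{h} - \phi_{i})\, i_h = 0
\quad\Longrightarrow\quad
\tau^*(t) = \min\!\Bigl\{\zeta,\ \max\Bigl\{0,\ \frac{(\phi_{i} - \phi_{h})\, i_h}{w}\Bigr\}\Bigr\},
```

where ϕ_i and ϕ_h denote the adjoint variables attached to the i_h and h_h equations, and the transversality condition sets all adjoints to zero at the final time.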
therefore, on the set {t : 0 < τ*(t) < ζ}, the interior characterization of τ* follows. we now discuss the numerical solution of the optimality system and the corresponding optimal control obtained using ζ = 0.5. the optimal strategy is obtained by solving the optimality system, consisting of both the state system and the adjoint system. since initial conditions are present for the state equations, we start by solving them with a guess for τ using the fourth-order forward runge-kutta method. the adjoint equations are then solved using the fourth-order backward runge-kutta method, because of the presence of final conditions. then the controls are updated by using a convex combination of the previous control and the value from the characterization given above. this process is repeated until we obtain the desired accuracy of convergence (fig. 7). fig. 8 represents the optimal isolation (hospitalization) strategy to be employed to minimize the cost and the infected population. considering practical constraints, an upper bound of 0.5 was chosen for the optimal hospitalization control. the figure shows that initially the optimal level remains at the upper bound of 0.5, after which it declines steadily to 0. this implies that in the early phase of the endemic outbreak, keeping the control at the upper bound helps to decrease the number of infected individuals. fig. 9 captures the dynamics of the infected population (i_h) by comparing the infected host population under optimal control and under constant control. it can be seen that the decrease in the number of infected individuals is greater with optimal control than with constant control. furthermore, in contrast with a constant control, the infected population remains lower when an optimal control is applied. fig. 10 shows a comparison between the costs associated with the optimal and various constant control strategies.
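the forward-backward sweep just described can be sketched on a deliberately tiny stand-in problem: minimize ∫₀¹ (x² + u²)/2 dt subject to x' = -x + u, x(0) = 1, whose adjoint is λ' = -x + λ with λ(1) = 0 and whose interior characterization is u* = -λ. the scheme itself (forward rk4 for the state, backward rk4 for the adjoint, convex-combination control update) mirrors the description above, but the dynamics and cost are illustrative, not the zika optimality system.

```python
# forward-backward sweep for: min ∫ (x^2 + u^2)/2 dt,  x' = -x + u, x(0) = 1,
# adjoint lam' = -x + lam, lam(T) = 0, interior characterization u* = -lam.
n, T = 200, 1.0
dt = T / n
u = [0.0] * (n + 1)                      # initial control guess

def rk4_forward(u):
    x = [1.0] * (n + 1)
    f = lambda x_, u_: -x_ + u_
    for i in range(n):
        um = 0.5 * (u[i] + u[i + 1])     # control at the half step
        k1 = f(x[i], u[i])
        k2 = f(x[i] + 0.5 * dt * k1, um)
        k3 = f(x[i] + 0.5 * dt * k2, um)
        k4 = f(x[i] + dt * k3, u[i + 1])
        x[i + 1] = x[i] + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

def rk4_backward(x):
    lam = [0.0] * (n + 1)                # transversality: lam(T) = 0
    g = lambda l_, x_: -x_ + l_
    for i in range(n, 0, -1):
        xm = 0.5 * (x[i] + x[i - 1])
        k1 = g(lam[i], x[i])
        k2 = g(lam[i] - 0.5 * dt * k1, xm)
        k3 = g(lam[i] - 0.5 * dt * k2, xm)
        k4 = g(lam[i] - dt * k3, x[i - 1])
        lam[i - 1] = lam[i] - dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return lam

converged = False
for _ in range(100):
    x = rk4_forward(u)
    lam = rk4_backward(x)
    u_new = [0.5 * u[i] + 0.5 * (-lam[i]) for i in range(n + 1)]  # convex update
    delta = max(abs(a - b) for a, b in zip(u, u_new))
    u = u_new
    if delta < 1e-8:
        converged = True
        break

def cost(u):
    x = rk4_forward(u)
    return sum(0.5 * (x[i] ** 2 + u[i] ** 2) * dt for i in range(n))

cost_opt, cost_zero = cost(u), cost([0.0] * (n + 1))
```

the converged control should beat the do-nothing control on total cost, which is the same qualitative comparison the paper draws between optimal and constant strategies.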
it is clear that the cost associated with the different constant control strategies is higher than that of the optimal control. it is important to note that a high constant isolation rate (τ = 0.4) incurs almost the same cost as the optimal control; however, practically it is highly unlikely that such high constant controls could be implemented, primarily due to the lack of required resources and facilities. fig. 11 captures the effect of a change in the effective contact rate on the optimal control strategy; the simulation indicates that an increase in the contact rate does not necessarily lead to higher rates of hospitalization. in this paper, we have presented a zika fever epidemic model comprising eight compartments of vector and human populations, and the dynamics of this zika fever epidemic model have been analyzed. furthermore, using optimal control theory, we proposed control strategies to eliminate the infection from the population. a vertical and horizontal transmission model for zika fever is constructed in the form of a system of ordinary differential equations; this model features the study of zika fever by considering vertical transmission in both humans and vectors. the basic reproductive number r_0 is formulated by using a next-generation matrix, and this reproductive number is simplified in order to better understand the effect of the vertical transmission parameters. it is shown that the disease-free steady state is globally asymptotically stable when the basic reproductive number r_0 is less than 1. the model has a unique endemic equilibrium when the reproduction number r_0 exceeds unity. this equilibrium is shown to be globally asymptotically stable, when the reproduction number exceeds unity, for the reduced model; it is locally asymptotically stable when we consider the full model. we performed a parameter-based global uncertainty and sensitivity analysis on r_0.
the uncertainty analysis yields an estimated value of the basic reproductive number r_0 = 1.54 with 95% confidence interval (1.3491, 1.7669). this estimated value of r_0 is close to the value of the basic reproductive number calculated using real data (villela et al., 2017). our previous model had an estimated basic reproductive number r_0 = 1.31 with 95% confidence interval (1.23, 1.39), given in imran et al. (2017). our sensitivity analysis of the zika model parameters showed that the most influential parameters are the effective contact rates, the recovery rate of the infected individuals and the birth rate of mosquitoes. we proposed an optimal controlling strategy to eliminate zika fever from the population. we observed that the optimal control strategy is most effective in terms of eliminating infection, as it minimizes our cost and resources at the same time. moreover, the control measures themselves may take time to implement once the outbreak has been realized. despite these points, our analysis can help public health authorities to determine quasi-optimal strategies they might want to adopt, especially as our work highlights the relative effectiveness of different control strategies.
references (titles only):
mathematical model of zika virus with vertical transmission
zika virus background
zika virus (i). isolations and serological specificity
estimated incubation period for zika virus disease
analysis of a dengue disease transmission model
a model for dengue disease with variable human population
influence of vertical and mechanical transmission on the dynamics of dengue disease
coexistence of different serotypes of dengue virus
microcephaly in brazil potentially linked to the zika virus epidemic
the effect of antibody-dependent enhancement on the transmission dynamics and persistence of multiple-strain pathogens
deterministic and stochastic optimal control
backward bifurcations in dengue transmission dynamics
effect of cross-immunity on the transmission dynamics of two strains of dengue
modelling strategies for controlling sars outbreaks
zika virus and sexual transmission: a new route of transmission for mosquito-borne flaviviruses
zika virus outside africa
the mathematics of infectious diseases
a comparison of a deterministic and stochastic model for hepatitis c with an isolation stage
transmission dynamics of zika fever: a seir based model
dengue virus infection: epidemiology, pathogenesis, clinical presentation, diagnosis, and prevention
estimating the basic reproduction number for single-strain dengue fever epidemics
optimal control of the chemotherapy of hiv
global dynamics of an seir epidemic model with vertical transmission
transmission dynamics and control of severe acute respiratory syndrome
curtailing transmission of severe acute respiratory syndrome within a community and its hospital
zika virus dynamics.
when does sexual transmission matter
the mathematical theory of optimal processes
pan american health organization, zika cumulative cases
robust uniform persistence in discrete and continuous dynamical systems using lyapunov exponents
dynamical systems and population persistence
zika virus in an american recreational traveler
convergence results and a poincaré-bendixson trichotomy for asymptotically autonomous differential equations
zika in rio de janeiro: assessment of basic reproduction number and comparison with dengue outbreaks
dynamics of an saiqr influenza model
ecological and immunological determinants of dengue epidemics
dynamics of zika virus outbreaks: an overview of mathematical modeling approaches
the authors declare that there is no conflict of interests regarding the publication of this article.
key: cord-326409-m3rgspxc authors: lai, alvin c.k.; chen, f.z. title: comparison of a new eulerian model with a modified lagrangian approach for particle distribution and deposition indoors date: 2007-03-24 journal: atmos environ (1994) doi: 10.1016/j.atmosenv.2006.05.088 sha: doc_id: 326409 cord_uid: m3rgspxc
prediction by the eulerian model agrees well with the modified lagrangian model. gaining understanding of particle transport and deposition indoors has numerous engineering applications. there has been considerable interest over the past decade in exposure assessment of fine particles inhalation and its influence on public health. the anthrax mailing accidents following the terrorist attacks of 11 september 2001 have generated enormous concern in design and application of ventilation strategy for protecting indoor environment against the intentional release of biological agents. phrases like aerobiological engineering or immune-building technology are coined recently to reflex the importance of studying aerosol behaviors indoors. better understandings of the aerosol dynamics in terms of mixing time and dispersion rate are vital to decide the positioning of air toxic sensors which have became an important element for monitoring buildings (gadgil et al., 2003) . improper placement of the sensors can impair the ability of the sensors to provide the first response decision. in addition, understanding aerosol dispersion and transport is very essential in the prevention of nosocomial transmission of airborne pathogens (cole and cook, 1998; li et al., 2005) . www.elsevier.com/locate/atmosenv 1352-2310/$ -see front matter r 2007 published by elsevier ltd. doi:10.1016 ltd. doi:10. /j.atmosenv.2006 after the outbreak of severe acute respiratory syndrome (sars) in south east asia 2003, there is increasing research interest in studying transport and control of airborne bacteria or viruses in indoor environments (beggs et al., 2005; nicas et al., 2005) and in confined environment like flight cabinet (mangili and gendreau, 2005) . while conducting in situ experiment can provide detailed information on microorganisms transport and survival in enclosed environments, potential danger and cost involved should be carefully considered. 
computational fluid dynamics (cfd) provides a very cost-effective way to perform parametric studies investigating microorganism dynamic behavior prior to full-scale experiments. there are two modeling approaches for two-phase flow problems, namely the eulerian-eulerian model (hereafter referred to as the eulerian model) and the eulerian-lagrangian model (hereafter referred to as the lagrangian model). the eulerian method considers the particle phase as another continuum. governing equations derived from the mass (species) conservation condition are solved to give details of the particle concentration field. however, some studies treated the particle phase as a scalar species (no inertia) and their results should be interpreted cautiously (lu et al., 2005; noakes et al., 2004). gravitational settling was considered in some eulerian cfd models (murakami et al., 1992; holmberg and li, 1998; zhao et al., 2004); nevertheless, the deposition rate in those studies was only estimated empirically or simply ignored. it has been shown that, in certain circumstances, ignoring the deposition flux may result in numerical instability problems. a theoretical model needs to be developed to evaluate the deposition rate according to the local turbulent flow condition. the second approach is the lagrangian method, which treats the dynamics of a single particle by the trajectory method. under this framework, generally speaking, the flow field is obtained by applying reynolds-averaged navier-stokes (rans) turbulence models. the flow and other quantities obtained are ensemble-averaged components. near-wall fluctuating velocities are highly anisotropic, with the component normal to the wall substantially smaller than those in the other two directions. proper treatment of the fluctuation components is critical to particle deposition modeling. however, many previous rans approaches for indoor environments ignored this effect, leading to incorrect deposition rates.
the equation of motion resulting from the various forces exerted on an individual particle is solved to acquire the single-particle trajectory. a large number of sample particles must be analyzed before statistical conclusions can be drawn. a number of lagrangian simulations have been carried out for particle transport and deposition in a ventilated single-zone room (zhao et al., 2004), a two-zone chamber (lu et al., 1996) and a multi-zone chamber (chung, 1999). unfortunately, the effect of turbulence on the particle phase was overlooked in most of those models, even though the airflow fields were simulated with turbulence models. recently, large eddy simulation (les) has been applied to simulate particle transport and deposition for simple single-zone geometries (bouilly et al., 2005). the present authors have recently proposed a new eulerian model which takes external drift forces into account (chen et al., 2006). gravitational settling has been incorporated as an external drift velocity and the results agree well with experimental measurements. lately, a new lagrangian model has been developed to improve the treatment of turbulent intensity in the vicinity of the wall (lai and chen, 2007). in the present work, we compare particle distribution and deposition rates for a small model chamber by the two approaches. the geometry of the single-zone model room is shown in fig. 1. the room dimensions are length (x) × width (y) × height (z) = 0.8 m × 0.4 m × 0.4 m. its inlet and outlet are of the same size, 0.04 m × 0.04 m. their centers are located at x = 0, y = 0.2 m, z = 0.36 m and x = 0.8 m, y = 0.2 m, z = 0.04 m, respectively. the symmetrical plane at y = 0.2 m is referred to as the center plane in the following discussion. two inlet velocities, 0.225 and 0.45 m s⁻¹ (corresponding to air exchange rates of 10 and 20 h⁻¹, respectively), are tested. the room air temperature is set at 27 °c.
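as a quick sanity check (our own, not from the paper), the quoted air exchange rates follow directly from the inlet area and room volume:

```python
# check that the stated inlet velocities give the stated air exchange rates
ROOM_VOLUME = 0.8 * 0.4 * 0.4      # m^3 (0.128 m^3)
INLET_AREA = 0.04 * 0.04           # m^2

def air_exchange_rate(inlet_velocity_ms: float) -> float:
    """air changes per hour (ACH) for a given inlet velocity in m/s."""
    flow_m3_per_hour = inlet_velocity_ms * INLET_AREA * 3600.0
    return flow_m3_per_hour / ROOM_VOLUME

# 0.225 m/s -> ~10 h^-1 and 0.45 m/s -> ~20 h^-1, as stated in the text
```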
due to the high computational cost of continuous particle tracking in lagrangian models, particles were injected only once, instead of the continuous injection adopted in the eulerian approach. this is a very common practice for almost all lagrangian simulations; nevertheless, due to the different nature of particle injection, it imposes a key constraint when comparing to the eulerian models. the main objective of the present work is to highlight and compare the two approaches on the prediction of particle phase dispersion in a chamber; thus the air phase model is not described here and the approach can be found elsewhere (chen et al., 2006; lai and chen, 2007). in brief, the airflow field was resolved by an rng k-ε turbulence model for both the eulerian and lagrangian approaches. the simulations were performed with the aid of the commercial cfd code fluent (2005). a simplified eulerian drift-flux model has been developed to take full advantage of the extremely low volume fraction of indoor particles. the term "drift-flux" (or drift velocity) stands for particle flux (or velocity) caused by effects other than convection, i.e. gravitational settling and diffusion for the current work. as the convective velocity of the particle phase is the same as that of the air phase, the complexity of the two-phase flow system is greatly reduced. the governing equation for particle transport in a turbulent flow field is given as eq. (1), where u is the air phase velocity vector, c_i is the particle mass concentration, kg m⁻³ (or number concentration, m⁻³), of particle size group i (hereafter in this paper, the subscript i denotes particle size group), v_s,i is the particle settling velocity, ε_p is the particle eddy diffusivity, d_i is the brownian diffusion coefficient and s_c,i is the mass concentration source term. for coarse particles, the loss by deposition (i.e. sedimentation) must be properly treated.
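the governing equation itself was lost in extraction; a plausible reconstruction of the drift-flux transport equation (eq. (1)), consistent with the symbols defined above and with standard drift-flux formulations, is:

```latex
\frac{\partial C_i}{\partial t}
  + \nabla \cdot \left[ \left( \mathbf{u} + \mathbf{v}_{s,i} \right) C_i \right]
  = \nabla \cdot \left[ \left( \varepsilon_p + D_i \right) \nabla C_i \right]
  + S_{c_i}
\qquad (1)
```

convection by the air velocity and settling drift on the left, turbulent plus brownian diffusion and the source term on the right.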
in the current approach, the concentration field is divided into two regions: the core region and the concentration boundary layer. the methodology adopted here is to obtain the distribution of particles in the core region with the three-dimensional mass conservation eq. (1), while within the concentration boundary layer the particle wall flux is determined with a one-dimensional semi-empirical particle deposition model (lai and nazaroff, 2000) and the result is substituted into eq. (1) as the boundary condition. the equation of motion of a small aerosol particle can be written as eq. (2), where u_p,i is the velocity of the particle, τ is the particle relaxation time, n_i(t) is the brownian force per unit mass, ρ_p and ρ are the particle and air density, respectively, and g_i is the gravitational acceleration. in a stochastically modeled turbulent flow, the instantaneous fluid velocity is expressed as the sum of the mean velocity component, u_i, from the rng k-ε model and the fluctuating velocity component, u'_i, which will be described in section 2.3. [fig. 1: schematics of the model room.] the brownian force per unit mass is important for submicron particles. the brownian force is modeled as a gaussian white noise random process as described by li and ahmadi (1992). the procedure for simulating the brownian force is to generate a white noise process with a prescribed noise intensity, governed by the spectral intensity s_0; here g_i is a zero-mean, unit-variance independent gaussian random number and k_b = 1.38 × 10⁻²³ j k⁻¹ is the boltzmann constant. the third term on the rhs of eq. (2) represents the gravity force exerted on the particle. it has been reported by the present authors that the requirement for a careful grid independence test is more stringent for a lagrangian simulation (lai and chen, 2007).
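eq. (2) and the spectral intensity expression are also missing from the extracted text; a plausible reconstruction, following the li and ahmadi (1992) formulation cited above (the exact constants are an assumption), is:

```latex
\frac{du_{p,i}}{dt} = \frac{u_i - u_{p,i}}{\tau} + n_i(t)
  + g_i \left( 1 - \frac{\rho}{\rho_p} \right)
\qquad (2)
\\[6pt]
n_i(t) = G_i \sqrt{\frac{\pi S_0}{\Delta t}}, \qquad
S_0 = \frac{216\, \nu\, k_B T}
           {\pi^2 \rho\, d_p^5 \left( \rho_p / \rho \right)^2 C_c}
```

here ν is the air kinematic viscosity, t the air temperature, d_p the particle diameter, c_c the cunningham slip correction and Δt the integration time step.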
to model particle deposition accurately, the grid should resolve not only the turbulence field but also the flow field properties along particle paths. if the near-wall boundary layer is not properly resolved, the predicted deposition rate may depart severely from the actual solution. in the present model, the near-wall turbulence is deliberately damped. if the near-wall cell center lies outside the viscous sublayer, the normal velocity at that point may not be negligible, and consequently the interpolated normal velocity at the particle position may be large enough to drive the particle to impact onto the wall directly. hence, the near-wall grid should be fine enough to resolve the important deposition boundary layer. three different grid systems are tested and their details are listed in table 1. taking into account both computational resource requirements and result accuracy, grid system 2 with 181,976 hexahedral cells is chosen for the present simulation (lai and chen, 2007). to take anisotropic turbulence into account, a new correction scheme, which is essentially a hybrid combination of the methods of he and ahmadi (1998) and matida et al. (2004), is proposed. the quadric relation used by he and ahmadi is adopted, written as v'_rms/u_τ = a(y⁺)², where a = 0.008 (bernard and wallace, 2002) is obtained by fitting the dns results of kim et al. (1987). in order to implement this correction in the isotropic k-ε turbulence models, the method of matida et al. is used to simplify the system by forcing the streamwise and spanwise normal reynolds stress components equal to the wall-normal component, i.e. u'_rms = w'_rms = v'_rms. a new turbulent kinetic energy for particle tracking calculations can then be defined as k_new = (3/2)(v'_rms)². in this method, only the turbulent kinetic energy in the near-wall area needs to be corrected, and the resultant turbulent time scale is updated accordingly. it should be emphasized that the optimized turbulent kinetic energy in eq.
(8) still remains isotropic. there are several salient features of this hybrid scheme. the method can be conveniently applied to various conditions, as the modification to the model is minimal. the prediction of the wall-normal reynolds stress component is remarkably improved, and thus it can yield a more reasonable particle deposition rate. on the other hand, the turbulent intensities in the other two directions are undesirably underestimated. this under-estimation is unlikely to incur significant error when the particle deposition rate is the parameter of concern, as particle deposition is not directly affected by the other two components and the mean flow velocities in those two directions overwhelm their fluctuation counterparts. (all particle sizes refer to aerodynamic particle diameters.) a noticeable observation is that the submicron particle concentration becomes fairly uniform in just 300 s, and within 30 min the particle concentration is virtually homogeneous over the entire zone. the ultrafine particles exhibit dispersion characteristics similar to those observed for 0.3 µm particles (results not shown). the dispersion rate decreases with particle size, as a concentration gradient is distinctly observed for coarse particles. for 7 µm particles, concentration homogeneity cannot be achieved (even beyond 1800 s). to represent the non-uniformity of a concentration field, the coefficient of variation of concentration is defined as cv_i(t) = (1/C_i(t)) [Σ (c_i(t) − C_i(t))² / (n − 1)]^(1/2) (mage and ott, 1996), where C_i(t) is the volume-averaged concentration, c_i(t) is the concentration at elapsed time t at each sample point and n is the number of sample points. in the case of non-uniform mixing, a useful parameter, the mixing time, is used to quantify the time needed to reach a well-mixed state. it is defined as the time at which the coefficient of variation falls below 10% permanently after the release of the pollutant. fig. 3 presents the coefficients of variation for the three particle sizes.
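a minimal sketch of the coefficient-of-variation and mixing-time computations described above (the function names and the handling of the 10% threshold are our own):

```python
def coefficient_of_variation(concentrations):
    """CV = sample standard deviation / mean over the monitoring points."""
    n = len(concentrations)
    mean = sum(concentrations) / n
    variance = sum((c - mean) ** 2 for c in concentrations) / (n - 1)
    return variance ** 0.5 / mean

def mixing_time(times, cv_series, threshold=0.10):
    """earliest time after which CV stays below the threshold permanently."""
    for k, t in enumerate(times):
        if all(cv < threshold for cv in cv_series[k:]):
            return t
    return None  # well-mixed state never reached (e.g. the 7 um particles)
```

the `all(...)` over the remaining series enforces the "permanently" clause: a CV that dips below 10% and later rebounds does not count as mixed.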
inferring from the results, it can be seen that cv strongly depends on particle size and airflow; small size and high airflow favor mixing. one thing to note is that the mixing characteristics of 0.3 and 1 µm particles are very close, whereas a very distinct difference can be observed between 1 and 7 µm particles. as the gravitational settling magnitude scales with particle size with an exponent of 2, sedimentation influences the mixing and dispersion rates (fig. 2) in a non-linear way; in the results presented, the distinct behavior of 7 µm particles is observed. for the lagrangian model, the results presented are very different. since particles are treated as a discrete rather than a continuum phase, a concentration contour does not exist. another factor that further complicates the issue is the computer resources required for the lagrangian model. due to the current limitation of computational resources, almost all lagrangian models inject particles momentarily and track individual particles based on the forces exerted. because of the transient nature and the validity of the well-mixed assumption (discussed above), these pose limitations for presenting the results obtained from the lagrangian approach. in the literature, the decay rate (loss) coefficient is often used to characterize the particle removal rate in an enclosure. the decay rate coefficient is derived from the mass balance principle, where a well-mixed condition is assumed. this has a significant implication for the application of the particle decay rate coefficient in quantifying the fate of particles in non-well-mixed enclosures, and it must be used cautiously (bouilly et al., 2005). hence, for the lagrangian framework, another common approach, which counts the number of particles remaining in the domain versus time, is adopted. the particle deposition fraction, ζ, is defined as ζ = (n_in − n_out)/n_in, where n_in and n_out are the number of particles released at the inlet and the number of particles that exited from the outlet, respectively.
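the counting approach above reduces to a one-line computation (a sketch; the names and the completion assumption are our own):

```python
def deposition_fraction(n_released, n_exited):
    """fraction of injected particles that deposited rather than exited.

    assumes tracking continues until every particle has either deposited
    on a surface or left through the outlet, so released - exited counts
    the deposited particles.
    """
    return (n_released - n_exited) / n_released
```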
as observed in fig. 4, the original eim without correction noticeably over-predicts the particle deposition rate, particularly for 0.3 and 1 µm particles at the high ventilation rate. the turbulent intensity is stronger for the high-velocity case. if the correction scheme is not applied, the original eim tends to give a more severe over-prediction for the high-velocity case. generally, the deposition fractions predicted with the correction are closer to the semi-empirical equation (lai and nazaroff, 2000). thus far we have highlighted the main features of the two models developed recently and presented the results separately. it is very valuable to quantify the simulation results by the same parameter. deposition fractions for the two inlet velocities are shown. overall, the results modeled by the two approaches agree well with each other; as the particle size increases, the deposition fraction increases. for submicron particles, the deposition fraction predicted by the lagrangian model without near-wall turbulent correction is higher than that predicted with correction, with the eulerian drift-flux prediction following. the deposition fraction for 0.45 m s⁻¹ is also higher than for the 0.225 m s⁻¹ case. this is because, for submicron particles, deposition is significantly affected by turbulent diffusion, which is stronger in the high-velocity case. for the supermicron particle scenario, the discrepancy between the eulerian and lagrangian models increases with particle size but is not significant. one thing to note is that as the particle size increases, the influence of the turbulent correction diminishes. these observations can be explained by the dominant fate mechanism for this size range of particles. as particle size increases, deposition by gravitational settling increases significantly and the other effect, turbulent diffusion, becomes less important. care must be taken when a direct comparison is attempted.
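the near-wall turbulent kinetic energy correction compared above can be sketched as follows (a simplified illustration under our own assumptions: the y⁺ cutoff value and the isotropic rebuild of k follow the hybrid he-ahmadi/matida scheme described earlier, but the exact switching criterion used in the paper is not stated):

```python
def corrected_tke(y_plus, u_tau, k_bulk, a=0.008, y_plus_limit=30.0):
    """near-wall correction of turbulent kinetic energy for particle tracking.

    inside the near-wall region (y+ < y_plus_limit), the wall-normal rms
    fluctuation is damped with the quadric law v'_rms = a * (y+)^2 * u_tau,
    and the streamwise and spanwise components are forced equal to it,
    giving k = (3/2) * v'_rms^2. outside that region, the unmodified k
    from the RANS model is used. the cutoff value is an assumption.
    """
    if y_plus >= y_plus_limit:
        return k_bulk
    v_rms = a * y_plus ** 2 * u_tau
    return 1.5 * v_rms ** 2
```

because the damped k shrinks quadratically toward the wall (quartically in energy), interpolated wall-normal fluctuations no longer fling particles onto the wall, which is why the corrected scheme predicts lower submicron deposition.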
as discussed above, the injection type for the eulerian model is continuous while for the lagrangian model it is momentary. for a steady-state condition such as the eulerian model, the parameter deposition velocity can be used to quantify the deposition rate in a more rigorous way, while for the lagrangian framework, using deposition velocity as a parameter is not an obvious means. on the other hand, although the deposition fraction cannot be used to quantify deposition onto variously oriented surfaces, it can be used to characterize both models. the current study focuses on an empty chamber, so some comments on practical applications should be addressed. the authors have published an article regarding the effects of room furnishing on particle deposition rates indoors (thatcher et al., 2002). the experimental results showed that increasing the surface area from bare to fully furnished increased the deposition loss rate, with the largest increase seen for 0.5 µm particles. this can be attributed to the additional surfaces available for diffusion loss. the current models can be applied to both empty and furnished rooms with proper boundary conditions. knowledge of particle dispersion and deposition indoors improves human exposure assessment and provides insights for better pollutant control measures. cfd has become a virtual tool for studying particle dynamics. we compared a new drift-flux eulerian model and a modified lagrangian model for a single-zone chamber geometry. one key feature of the drift-flux model is that it encapsulates external forces into the formulation and presents the results in a continuous domain. near-wall anisotropic turbulence correction is properly applied in the lagrangian model. inferring from the results, it is shown that the non-corrected turbulent scheme significantly over-predicts particle deposition for submicron particles, while supermicron particles are not sensitive to the correction scheme.
the paper also highlighted the results obtained by the eulerian and lagrangian models. the two models agree very well for submicron particles. as the size increases, the discrepancy increases moderately. it is not straightforward to conclude which model is better than the other, as each model has its own fundamental assumptions. the results shown here reveal that the two present models are comparable.
references (titles as extracted):
turbulent flow: analysis, measurement and prediction
methodology for determining the susceptibility of airborne microorganisms to irradiation by an upper-room uvgi system
using large eddy simulation to study particle motions in a room
effect of ventilation strategies on particle decay rates indoors: an experimental and modelling study
modeling particle distribution and deposition in indoor environments with a new drift-flux model
three-dimensional analysis of airflow and contaminant particle transport in a partitioned enclosure
characterization of infectious aerosols in health care facilities: an aid to effective engineering controls and preventive strategies
indoor pollutant mixing time in an isothermal closed room: an investigation using cfd
particle deposition with thermophoresis in laminar and turbulent duct flows
modelling of indoor environment-particle dispersion and deposition
modeling particle deposition and distribution in a chamber with a two-equation reynolds-averaged navier-stokes model
modeling indoor particle deposition from turbulent flow onto smooth surfaces
dispersion and deposition of spherical particles from point sources in a turbulent channel flow
role of air distribution in sars transmission during the largest nosocomial outbreak in hong kong
modelling and measurement of airflow and aerosol particle distribution in a ventilated two-zone chamber
a preliminary parametric study on performance of sars virus cleaner using cfd simulation
accounting for nonuniform mixing and human exposure in indoor environments
transmission of infectious diseases during commercial air travel
improved numerical simulation of aerosol deposition in an idealized mouth-throat
diffusion characteristics of airborne particles with gravitational settling in a convection-dominant indoor flow field
toward understanding the risk of secondary airborne infection: emission of respirable pathogens
development of a numerical model to simulate the biological inactivation of airborne microorganisms in the presence of ultraviolet light
effects of room furnishings and air speed on particle deposition rates indoors
comparison of indoor aerosol particle concentration and deposition in different ventilated rooms by numerical method
key: cord-320914-zf54jfol authors: parrish, rebecca; colbourn, tim; lauriola, paolo; leonardi, giovanni; hajat, shakoor; zeka, ariana title: a critical analysis of the drivers of human migration patterns in the presence of climate change: a new conceptual model date: 2020-08-19 journal: int j environ res public health doi: 10.3390/ijerph17176036 sha: doc_id: 320914 cord_uid: zf54jfol both climate change and migration present key concerns for global health progress. despite this, a transparent method for identifying and understanding the relationship between climate change, migration and other contextual factors remains a knowledge gap. existing conceptual models are useful in understanding the complexities of climate migration, but provide varying degrees of applicability to quantitative studies, resulting in non-homogeneous transferability of knowledge in this important area. this paper attempts to provide a critical review of the climate migration literature, as well as presenting a new conceptual model for identifying the drivers of migration in the context of climate change. it focuses on the interactions and the dynamics of drivers over time, space and society.
through systematic, pan-disciplinary and homogeneous application of theory to different geographical contexts, we aim to improve understanding of the impacts of climate change on migration. a brief case study of malawi is provided to demonstrate how this global conceptual model can be applied to local contextual scenarios. in doing so, we hope to provide insights that help in the more homogeneous application of conceptual frameworks for this area and more generally. the climate change-migration nexus has been the subject of research debate for decades. indeed, climate change has been implicated in human migration since early humans first moved out of africa, and migration has long been an adaptive strategy to climate shocks, long-term changes or cyclic climate conditions [1]. the field of climate migration has been gaining global scientific and popular attention since roughly the 1970s [2], and very much so in recent years since the emergence of the concept of 'environmental refugees' [3]. since the 2015 european migration 'crisis', the topic has received increasing controversy and non-evidence-based rhetoric in the media. as climate change continues throughout the 21st century, it will likely serve as a threat magnifier of other migration drivers [4]. whilst terms such as 'climate refugee' are not recognised legally, migration and conflict are considered key mechanisms by which climate change has become a priority global health concern [5] [6] [7]. indirect health impacts of climate change, such as those mediated via migration and displacement, are often under-recognised and under-researched.
the ongoing covid-19 pandemic is a poignant example of how the special circumstances of migrant communities create unique and extreme health vulnerabilities: whilst some migrants living in displacement camps are unable to practise good hygiene and social distancing, other migrants are finding themselves denied their right to asylum, neglected or turned away at borders due to travel restrictions and fear of new waves of infection. the lancet countdown on climate change and health created a 'climate migration' indicator [4], whilst the newly launched lancet migration collaboration aims to explore and provide evidence for policy on the impacts of climate change on migrant health [8]. despite these advances, conceptual frameworks for robustly exploring and understanding the impacts of climate change upon migration are lacking in some key areas, and the body of empirical studies remains thin. these gaps undermine the ability of policy makers to design effective evidence-based policy, public health interventions, and strategies to support safe and positive migration experiences. the aims of this paper are to provide a critical review of existing climate migration literature; from this, we also suggest modifications to existing conceptualisations of climate migration by providing a new conceptual model of the system of migration determinants. the model is designed to be pan-disciplinary and transferable to any geographic or social context. we advocate the systematic application of theory in climate migration studies, which may help achieve better geographic representation [9] and improve our understanding of how climate change is impacting migration, with contextual relevance to policy makers and public health interventions. finally, we apply this model to a case study of malawi to demonstrate how doing so can improve understanding of the local context and result in well-grounded and policy-relevant insights into the true impacts of climate change on migration.
in order to improve conceptual modelling, several critical issues related to climate change and migration have been identified and discussed here. a key characteristic of migration is the multicausal nature of its drivers. climate change may act as a direct driver of displacement but in many cases is also inextricably linked to other, dynamic and interacting social, political, demographic and economic drivers [10] [11] [12]. a popular constructed narrative is that climate change acts as a threat magnifier of existing migration drivers. this can result in many empirical studies identifying that economic factors rather than climate factors dominate the decision to migrate [13] [14] [15]. however, it is possible that such studies overlook the mediated effect of climate change through other factors such as agriculture [16]. it is now largely acknowledged that the relationship between climate change and migration is complex, dynamic and non-directional. another critical issue relates to the multifaceted nature of climate change itself. scholars typically outline several classes of climate change: a change in climate variability; changes in frequency and magnitude of fast-onset climatic events (including extreme weather events, droughts, floods, and heatwaves); and slow-onset climate change, including long-term changes in average temperature, rainfall and chronic drought or flooding [17]. this wide range of temporality, as well as severity, of climate change must be accounted for when discussing the implications of climate change for future population movements. migration itself may occur over a range of spatial scales, from movements between rural and urban areas to international migration, as well as a range of temporal scales, from short-term and circular migration to permanent moves. the decision of each individual to migrate may also carry different levels of human agency.
the decision to migrate is due to an aggregation of micro-level (typically household or individual) and macro-level (societal) drivers. as such, each potential migrant has their own unique profile of factors and drivers. such individualistic situations are often described in terms of the individual's or community's vulnerability [18] [19] [20]. this presents a key challenge for many existing studies, which struggle to reconcile drivers at the macro- and micro-demographic scales. a key challenge remains the paucity and poor compatibility of datasets regarding both migration and its potential drivers, and the scarcity of such data at appropriate spatial and temporal levels, particularly in low-income and indigenous communities. the necessity for localised quantitative studies can result in fragmented analyses of specific timescales, geographies, types of migration and drivers thereof. furthermore, this can make it difficult to summarise and build a global narrative of the risks of climate change to human security [21]. the resultant synthesis is that the impacts of climate change on migration are complex, multifaceted and dynamic. as such, increased attention on the upstream drivers of migration is called for. some authors argue that theoretical development has also been limited, and so in recent years there has been a push to promote a more sophisticated theoretical understanding of how climate change may interact with other drivers of migration [11, 16, 22]. this has allowed the narrative to evolve through time: the conventional narrative suggests a more simplistic view of climate change as a blanket push driver resulting in large-scale waves of migration [18]. however, newer frameworks appreciate the multi-driver nature of migration as well as the resilience and adaptive strategies of affected individuals and communities.
nevertheless, such frameworks still struggle to capture the dynamic nature of such drivers, including feedback and lag times, as well as the interactions between the drivers themselves through time. the rising concept of ecological public health goes some way towards addressing this, yet many policy-facing groups remain slow on the uptake [23]. understanding of climate-induced migration is further skewed by the discipline of the researchers: each scientific discipline (be it epidemiology, economics, political science or anthropology) carries with it its own intrinsic assumptions and methodologies [22, 24, 25], which can perpetuate the fractured nature of the literature. therefore, advancing the understanding of the impacts of climate change on migration requires a truly interdisciplinary response. the collective result of these challenges is that current understanding of climate-induced migration is geographically uneven and studies are often non-transferable to other settings, which presents an obstacle for policy makers and health intervention design. to aid the design of interventions, scientists and decision makers (such as governments and humanitarian agencies) should engage with each other at all points of intervention design and implementation. this can help ensure that interventions are contextually relevant and evidence based and that their impacts can be measured and evaluated. from the challenges identified in the above critical analysis, a new conceptual framework of climate migration is provided. migration, as a subjective concept, cannot carry one single definition and is highly contextualised. however, this is often neglected within most quantitative studies. in particular, the terms 'environmental migration' and 'climate migration' lack unanimous definitions across academic, ngo and political actors.
migration exists as a normative behaviour in most communities globally but may also manifest as forced displacement and other involuntary or voluntary movements. furthermore, often overlooked in the climate migration literature is the possible inhibiting effect of climate change on migration, resulting in reduced mobility rather than driving migration events [26, 27]. many scholars advocate the need for appropriate migration typologies and several have been presented. for instance, stojanov et al. [2] argue the need for contextualisation of climate drivers on a community in order to appropriately discern climate-driven migration from normative or otherwise induced movement. renaud et al. [28] also comment on the difficulty of identifying the environmental signal within migration drivers and present a decision-making framework and accompanying typology of environmentally induced migration. carling [29] created the first iteration of the aspiration-capability framework, which describes voluntary or involuntary mobility or immobility, based on both the desire and the ability to migrate. based on a review of a large variety of both qualitative and quantitative literature, we identify four dimensions which quantify migration: societal, temporal, spatial, and agency levels. "societal level" refers to the level of society affected, from micro scale (individual and household level) to macro scale (community, regional or population level). "temporal level" refers to the time duration of the migration: the short term may consist of a matter of months; the long term is typically considered to be a year or more, though there is much range within empirical studies; and permanent migration represents the longest form of migration. "spatial level" refers to the physical distance covered by the migration. short distance may cover anywhere from intra-community and intra-regional movements to movements within the country, and includes movements between rural and urban hubs.
long distance constitutes international movements across large geographical areas. whilst some cross-border movements may only require a few miles of travel and as such may be considered short distance, a large proportion of international movements cover multiple countries and sometimes continents. such movements are of international political interest, though they do not represent a large quantity of migrants or types of mobility [30, 31] . the spatial scale, like the societal scale, may also be summarised in terms of macro (generally medium or large distance) and micro (small, community-level distances) and may align to climate and economic macro- or micro-level determinants. "agency level" refers to the level of choice afforded to each migrant, existing as a continuous scale between the extremes of totally involuntary (in other words, forced) and totally voluntary. it should be noted that all four dimensions are continuous variables and hence the demarcations used should be contextually modulated. by applying generalised demarcations, however, we classify five key categories of environmentally induced migration. the first category is forced displacement, also referred to as distress migrants [20] or temporary displaced migrants [2] . the second category is adaptive migration at the decision of the migrant(s) [11] . whilst this is a voluntary movement, a crucial caveat is applied here to note that such migration may not be truly voluntary; whilst many scholars and decision makers consider it as such, there is an emergent narrative arguing that migration due to longer-term environmental or economic degradation, or erosion of human security, constitutes a type of forced migration rather than a voluntary or adaptive movement [32] . the third category is proactive migration at the decision of a wider authority such as local or national government, referred to as 'planned resettlement' [5] . 
the fourth category is for trapped populations, which refers to a lack of mobility due to at-risk populations becoming trapped by environmental and socioeconomic barriers such as poverty [32] . the final category is immobility, which represents a lack of mobility at the decision of the person(s) at environmental risk [32, 33] . table 1 summarises these classifications according to the four dimensions outlined. throughout the remainder of this paper, we shall use the term 'climate migration' for simplicity. a newly proposed conceptual framework for identifying the determinants of any given migration is presented in figure 1 below. this framework is an updated iteration of prior models, which, over time, have converged to agree that migration is generally the result of a combination of upstream drivers, split into five categories: social, economic, political, demographic and environmental. however, consideration of interactions and evolution of these drivers through time, societal and spatial scales remains low. climate change is presented as an external driver which is expected in many contexts to act as a threat magnifier by exaggerating negative, push factors for vulnerable populations [34] [35] [36] . attributes of climate change are split into three categories: physical, biological/ecological, and anthropogenic impacts [37, 38] . the physical effects of climate change may be fast onset or slow onset. fast onset includes sudden events such as extreme weather or disaster events. slow onset consists of more gradual changes of mean values such as annual rainfall, rainfall variability and chronic droughting and flooding. secondary, or ecological climate aspects may include changes in land cover, flora and fauna habitats, including disease vectors and pollinators. tertiary or anthropogenic aspects include subsequent changes to anthropogenic systems such as crop yield and fish or game catch. 
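before turning to the framework, the four-dimension typology summarised in table 1 can be sketched as a small data structure and rule set. this is purely illustrative: the paper stresses that all four dimensions are continuous and contextually modulated, so the numeric demarcations and field names below are our own assumptions, not part of the proposed typology.

```python
from dataclasses import dataclass

@dataclass
class MigrationEvent:
    """One (non-)migration event scored on the four dimensions of table 1."""
    societal: str    # "micro" (individual/household) or "macro" (community+)
    temporal: str    # "short" (months), "long" (a year or more), "permanent"
    spatial: str     # "short" (intra-regional) or "long" (international)
    agency: float    # 0.0 = totally involuntary (forced) .. 1.0 = totally voluntary
    mobile: bool     # False if no movement actually occurs
    decided_by: str  # "migrant" or "authority"

def classify(e: MigrationEvent) -> str:
    """Map an event to one of the five categories (demarcations are illustrative)."""
    if not e.mobile:
        # no movement: chosen immobility vs. being trapped by barriers
        return "immobility" if e.agency >= 0.5 else "trapped population"
    if e.decided_by == "authority":
        return "planned resettlement"
    return "forced displacement" if e.agency < 0.2 else "adaptive migration"

# hypothetical example: a household fleeing a flood with essentially no choice
flood_flight = MigrationEvent("micro", "short", "short",
                              agency=0.1, mobile=True, decided_by="migrant")
print(classify(flood_flight))  # forced displacement
```

the point of the sketch is that the five categories fall out of the interaction of agency, mobility and decision level, rather than from any single dimension.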
the model aims to build upon this a priori understanding by providing deeper discussion of the complexity of driver interactions, driver dynamicity and the evolution of both drivers, and their linkages over time and spatial scales. importantly, the range of possible migration outcomes receives greater attention in this conceptual model, with recognition that different combinations of causal and contextual determinants may result in different migratory responses, including differences in the level of agency of a migrant. 
the model is designed with the purpose of being pan-disciplinary and, as such, relevant in any academic or non-academic context. the model presents not only a theoretical exercise, but a frame of thinking to support quantitative studies, with a view to informing future research, data collection or intervention design. drivers within each of the five classes may act as push agents-encouraging movement away from the origin, or pull agents-attracting movement to a host area. bowles et al. [38] also identify glue and fend factors. 
glue factors act to cement a potential migrant in his/her home location, such as cultural and family ties, whilst fend factors deter migration into an area, such as hostile immigration policies. recent studies highlight that climate change may act as a glue factor in many situations, rather than as a push factor as is often popularised [26] . climate change is a sub-category of environmental drivers which may be further categorised into three classes [10, 37, 39] . these are perhaps best described as primary physical effects, secondary biological and ecological effects, and tertiary anthropogenic effects [37, 38] . climate change is segregated here from other environmental factors and framed as an externality to the determinant system. this facilitates investigation of the impacts of climate change as an upstream pressure on all five classes of drivers. each of the five categories is described in further detail below. to demonstrate the temporal nature of the system, the model is presented on a set of axes with time on the horizontal dimension, with arbitrary timepoints t 0 and t 1 . this encourages consideration of the dynamicity of all determinants, as well as the changing nature of their interactions through time. as such, the feedback implications of a migration decision on both host and source environments and communities may be decoded. externalities, such as future climate shocks, or political interventions, such as climate mitigation, which may alter the system and the resultant migration, can also be presented and their impacts conjectured. the y axis, depicting scale of impact, refers simultaneously to the societal and spatial level of impact, thereby encouraging the disparate nature of drivers on these scales to be considered. micro refers to small-scale, individual- or household-level factors whilst macro may be factors affecting large distances and large numbers of people. 
within the next section, we take a more granular view of each of the key families of drivers and consider how each may directly or indirectly impact migration. this analysis is not exhaustive but attempts to provide a detailed summary of drivers over time and space, thereby encouraging a more nuanced and detailed exploration of the complexity of climate change. all climate factors are considered to occur at the macro spatial scale (figure 1 ), which correlates to the macro societal impact level as described in table 1 above. the temporality of climate factors varies and depends on the climate determinant (as shown in table 2 ). the true speed of climate change varies geographically, and so there can be no definitive definition for fast or slow onset. furthermore, some aspects may manifest across multiple timeframes. it is well established that climate change is increasing the frequency and magnitude of extreme weather events and climate shocks, which can be a direct cause of forced displacement. nevertheless, in such natural hazard events, socially constructed vulnerabilities often govern the extent and type of migration responses which occur. myers et al. [40] , in a study of displacement due to hurricane katrina in 2005, identified a range of social vulnerability factors which had a significant impact upon outmigration from affected places. similarly, gray and bilsborrow [14] identified, within an ecuadorian household migration survey, that household vulnerability factors such as home ownership, connectedness of the household (to roads and schools), and poverty level all confounded the environmental signal in the causes of observed migrations. less clear is the extent to which long-term or chronic climate change affects migration. chronic changes may include changes in average temperature, average rainfall, rainfall variability or the extent of periodic drought and flooding. 
such changes often impact migration via mediating biological and anthropogenic factors such as impeded agricultural outputs [13, 41] , adverse health outcomes [42, 43] , or labour productivity [44] . in such examples, the extent of the climate factor as a driver of migration compared to other sociodemographic and economic factors is seen to vary greatly across studies. to better understand such relationships, we classify these indirect impacts as biological or anthropogenic (secondary or tertiary). biological or secondary impacts result from physical climate change, which may lead to changes in regional geochemistry, and flora and fauna. such biological changes may alter the vulnerability of human populations. for instance, climate change may drive changes in the distribution of disease vectors [37, 45] . anthropogenic or tertiary aspects of climate change comprise the resultant alterations to human systems. examples may include changes in anthropogenic land use and land availability due to sea level rise. alterations, for example, in crop yield and fish catch may have direct implications for socioeconomic factors, for example due to reduced agricultural output [41] and food security [1] , and therefore upon urbanisation rates due to rural to urban migration [38] . such anthropogenic pathways generally act over a longer temporal scale and can lead to the climate signal being masked by more proximal factors. as such, understanding of their impact on migration remains inconclusive [46] and less studied than direct physical impacts [39] . furthermore, additional consideration is needed to assist the recognition of dynamic interactions between the physical, ecological and anthropogenic aspects of climate change. there are of course a range of other drivers of migration which are important to understand, as well as how they may be affected by climate change. 
it is usually a combination of drivers that culminates in an individual's decision to migrate and in what manner. within the context of climate change, we refer to this aggregation of drivers (climatic and other) as the 'vulnerability profile', which will be unique to each individual. we now present a more detailed view of some key drivers within each of the five main classes identified. these are outlined in table 3 below. as well as existing as intermediary drivers, each of these drivers may have direct impacts on migration decisions. for example, henry et al. [13] , using regression modelling, and gray and bilsborrow [14] , using discrete-time event history modelling, both found that high literacy rates and economic status can act as significant push factors for migration. ezra and kiros [15] also found marital status and poverty level acted as push factors. warner et al. [1] clearly identified the role of government relocation policy in driving planned resettlement of communities away from flood plains in mozambique. such epidemiological methods are well utilised for analysing such direct causes. however, each analysis is limited to a specific type of migration and a set of pre-assumed key drivers. it is not possible within this paper to examine in depth the nature and relationships of each driver; rather, the authors focus on presenting a broad overview, elucidating the multilevel and multitemporal nature of migration drivers, as well as the dynamicity of the drivers and their linkages. some key examples are used to demonstrate such complexities. table 3 . non-climatic drivers of migration. drivers are split into five classes: social, economic, political, demographic and environmental. societal level refers to the societal scale at which drivers typically impact. some drivers may exist both as micro and macro factors. temporal scale refers to the typical timescale of change in each driver. 
whilst there are no set demarcations, slow change refers to change typically over years or decades, and fast change refers to changes which may occur immediately or over a short timeframe of months. static implies that the factor is not usually time varying. many studies examine the importance of social factors such as migration networks [47] [48] [49] [50] . education and literacy rates have also commonly been identified as determinants of vulnerability [35, [51] [52] [53] [54] . social drivers such as education and poverty may also alter other drivers. for example, other studies have identified that in poor areas of malawi where rain-fed agricultural practices reigned, the predominant climate change adaptation approach was not seasonal migration but the introduction of irrigation techniques to increase crop yields [54, 55] . however, joshua et al. [55] concluded that increased irrigation triggered increased water insecurity and hence water conflict. this interaction between poverty and adaptation approach has significant implications for future vulnerability levels and for future social and political factors. of course, such impacts are not isolated to impoverished communities alone. developed countries with lower poverty levels can also suffer compound impacts of climate change on other social determinants [56] . however, developed countries generally have a higher capacity to mitigate or adapt to such changes, resulting in different outcomes (migration and other), with different distributions across communities [57] . closely linked to social factors are economic considerations such as employment opportunity and household wages [58] at the micro societal level. poverty is also a key determinant of an individual's vulnerability and hence ability to migrate [1, 14, 15] . 
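the epidemiological analyses cited above, which test drivers such as literacy and poverty as push factors, ultimately reduce to estimating associations of the following kind. the sketch below computes an odds ratio with a woolf-type confidence interval; the survey counts are entirely invented for illustration and do not come from any of the cited studies.

```python
import math

def odds_ratio(table):
    """Odds ratio from a 2x2 table [[a, b], [c, d]]:
    rows = exposed / unexposed (e.g. literate vs. not),
    cols = outcome yes / no (migrated vs. stayed)."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)

# entirely invented survey counts, for illustration only:
# 40 of 100 literate households report a migrant member; 20 of 100 others do.
literacy = [[40, 60], [20, 80]]
or_lit = odds_ratio(literacy)

# 95% confidence interval on the log-odds scale (Woolf's method)
(a, b), (c, d) = literacy
se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
lo, hi = (math.exp(math.log(or_lit) + s * 1.96 * se) for s in (-1, 1))
print(f"OR = {or_lit:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")  # OR = 2.67, 95% CI (1.42, 5.02)
```

the limitation noted in the text shows up directly here: each such estimate fixes one outcome and one pre-assumed driver, so it cannot by itself capture the interacting, time-varying driver system the conceptual model describes.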
macro-level factors such as average employment rates and the average income of a community may also act as push or pull factors, which have often been identified as the dominant drivers of migration [11, 41] . there also exists a debate on the role of failed politicised economic models such as 'trickle down' and 'rent seeking' as being largely responsible for the increase in wealth gaps and rising relative poverty [22] . these economic models may contribute to future migration behaviour due to the relations between poverty and mobility [1] and the effect of inequality gradients acting as sinks for migration [59] . for instance, in the malawian example given above, findlay [54] also comments on the additional causes of food insecurity beyond water scarcity, including soil erosion, socioeconomic factors including vulnerability to poverty, the ability to financially withstand crop failures, low food utilisation and infrastructural factors such as high transport costs. political drivers are largely absent from quantitative studies of environmental migration and yet present a significant category of migration drivers. possibly the most influential and most studied of this category is the role of political insecurity in migration. whilst political insecurity and conflict is a well-acknowledged driver of migration, the role of climate change in driving political instability remains contested [20] . burrows and kinney [6] present an overview of multiple pathways through which climate change may lead to or exacerbate conflict, such as through increasing rural to urban migration, resource competition or dispute between migrant and host communities. sokolowski et al. [60, 61] also discuss the role of political interventions such as efforts for conflict resolution, international relief, and immigration policy such as the closing of borders, and their impact upon migration outcomes. 
though largely overlooked in the general climate migration literature, some models do focus on political drivers of migration with relatively accurate predictions [59, 62] . sokolowski and banks [60] modelled population displacements that occurred in syria in 2013 using unhcr guidelines for factors prompting departure. indeed, the syrian conflict can be argued to contain both political and climate determinants in the mass displacement that has resulted [63] . other political drivers include the level of governance and trust in government and the level of institutionalisation and infrastructure within a community. infrastructure and governmental and non-governmental organisations are critical intervention nodes, and as such their connection to environmental migration forms an important area of potential study. other policies such as water, food and agricultural policy also co-interact and may result in a range of normative and adaptive migration approaches. for example, loevinsohn [64] studied the 2002 malawian food crisis and identified the primary causal factors to be both environmental drought and underinvestment by the national government in agricultural stock. loevinsohn further identified that 39% of households interviewed during 2002 had migrant family members seeking alternative income [64] . crackdowns on immigration policy in western countries such as britain and the usa, and across the eu, will also have a significant impact upon future migration trends. with climate change expected to impact the numbers of both internal and international migrants in the future, existing dichotomies between the evidence on migration drivers and the political response to it will undoubtedly renew pressure on migration issues [65] . demographic factors at the micro level (such as age, gender, ethnicity) as well as at the macro level (such as average living conditions, affluency, diaspora presence) can act as push or pull factors as well as interact with other factors. 
the combined effect of climatic and demographic drivers has resulted in many developing countries being among the most vulnerable nations to climate change, and has helped to drive research and narratives around climate justice [66] and climate refugees [67] . rapid urbanisation is often a trend in such locations, leading to slum development, poor infrastructure and high vulnerability to future climate change, not to mention other shocks such as the covid-19 pandemic. in developed countries, different demographic challenges such as population ageing may also impact upon population mobility and health. for example, an older population may result in a reduced willingness to move and an increased mental health burden of doing so [68] . conversely, countries with ageing populations can benefit from the 'healthy-migrant' effect [69] . as such, appreciating demographic factors, their dynamics and interactions is essential to understanding climate risk for future sustainable development and population changes. when modelling future environmental migration, it is therefore essential to take into account the demographic situation of the study area. climate change is a key driver of environmental change. environmental degradation, such as desertification, permafrost melt and coastal erosion, undermines livelihoods and therefore acts as a push driver for migration away from these regions. in the short term, there may be positive environmental changes, such as increased precipitation and improved agricultural production in many parts of the globe, which may act as a migration pull factor [58] . environmental determinants such as rainfall and vegetation cover are commonly used in quantitative studies of climate change, though other ecosystem attributes and ecosystem degradation appear somewhat overlooked in migration studies, such as food availability from natural sources and pollution of water. 
many environmental factors occur independently of climate change and may be influenced by other socio-political factors, often overlooked in environmental migration studies. for example, changes in land use, urbanisation, overexploitation of natural resources, environmental pollution and geophysical natural hazards may each be key determinants of migration. such environmental changes often have strong feedback loops-for example, rural to urban migration has significant repercussions on environmental degradation, air and water pollution, energy consumption and greenhouse gas emissions [70] . environmental drivers have been found to be critical in many development studies. the environmental kuznets curve ("ekc") hypothesis purports that the early stages of economic development are coupled to environmental degradation and has been found to be true in many contexts [71] . in the context of urbanisation led by adaptive migration, this hypothesis suggests that urbanisation will result in further environmental degradation, with significant implications for future development [33, 72] , health [73] , political security [74] , and internal migration [75] . we now provide a brief example of applying the conceptual model to a case study. we select rural malawi as a pertinent example of a climate-vulnerable society. malawi is a land-locked country in southern africa whose main economy is small-scale, rain-fed agriculture, employing approximately 85% of malawians [76] . as such, many people's livelihoods, as well as a key source of food, are highly climate sensitive. malawi has already witnessed an annual mean temperature increase of 0.9 °c since the 1960s, and whilst local rainfall patterns are difficult to accurately model, there has been an observed increase in the frequency and magnitude of drought and flood events [77] . 
by conducting an in-depth literature review of malawi's political, demographic, environmental, social and economic makeup, and then applying the conceptual approach described above by considering the impacts of climate change (primary, secondary and tertiary) on each key factor, we arrive at the case-specific model shown in figure 2 below. a key advancement of this malawi-specific model is that each variable is quantifiable using observational datasets. as such, it demonstrates how the application of the generalised conceptual model in figure 1 to a local context allows the creation of an astute, practical and measurable model, from which well-grounded, policy-relevant research questions may be formulated and tested. by applying this methodological process, the malawi-specific model that is generated is based on well-grounded assumptions and it holistically captures key variables that may be of relevance for future testing. 
additional information about each variable can be found in supplementary information table s1 . based on this conceptual model, the next step in the method would be to identify appropriate study and modelling techniques, such as epidemiologic, mathematical, or integrated models, to quantify the extent of each relationship depicted by the arrows in figure 2 . the insights from such models may therefore make possible evidence provision which can be particularly relevant for national adaptation, economic development and public health plans. complexities naturally arise when taking an upstream, systems-thinking approach to migration determinants. there are two key complexities identified. firstly, the acknowledgement of multilevel interactions and feedbacks between drivers. 
secondly, the dynamicity of drivers and their connections over time and space. despite these complexities, models must be transparent and provide results from which simplicity may be derived in order to be useful for decision makers and intervention planning. to aid reflection upon such interactions, figure 3 depicts a simple representation of the interactions between individual and classes of drivers. each class of driver is represented by a funnel, from which a combination of both macro and micro drivers is filtered from an interconnected reservoir where drivers from different classes interact on a range of temporal, spatial and social scales. the combination of drivers at the individual migrant level results in a unique vulnerability profile and context which determines the migration decision made by each potential migrant. climate change is again presented as an externality, cross-cutting all other driver classes and acting across the temporal and societal levels. as in figure 1 , each driver may vary over both time and spatial dimensions. however, modelling such dynamicity requires simultaneous understanding of drivers, their interactions, and their evolution through time and space. the insight that such dynamic modelling would allow may enable the effective identification of suitable intervention nodes for public health, land use and immigration policy, to name but a few. 
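the funnel idea — macro and micro drivers from the five classes combining into a per-individual vulnerability profile, with climate change acting as a cross-cutting externality — can be sketched numerically. this is a minimal toy, not the paper's model: the driver scores, equal weights and the 1.3 "threat magnifier" below are all invented assumptions chosen only to make the mechanics concrete.

```python
def vulnerability_profile(drivers, weights):
    """Aggregate per-class driver scores (0..1, higher = stronger push)
    into a single vulnerability score; purely illustrative."""
    return sum(weights[k] * v for k, v in drivers.items()) / sum(weights.values())

# one hypothetical rural household at time t0 (all values invented)
drivers = {
    "social": 0.3,         # e.g. weak migration network
    "economic": 0.8,       # e.g. crop failure, low income
    "political": 0.2,
    "demographic": 0.4,
    "environmental": 0.9,  # e.g. chronic drought exposure
}
weights = {k: 1.0 for k in drivers}  # equal weighting as a naive baseline
print(round(vulnerability_profile(drivers, weights), 2))  # 0.52

# climate change as an externality: a hypothetical "threat magnifier"
# scaling the climate-sensitive classes between t0 and t1
magnified = {k: min(1.0, v * (1.3 if k in ("environmental", "economic") else 1.0))
             for k, v in drivers.items()}
print(round(vulnerability_profile(magnified, weights), 2))  # 0.58
```

even this toy makes the dynamicity point: the same household's profile shifts between t0 and t1 not because any new driver appears, but because the externality re-weights existing ones.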
the concept of a vulnerability profile allows for the acknowledgement that each migrant has a unique set of drivers due to the multilevel and multitemporal combination of factors he or she is subjected to. in this way vulnerability may be conceived as a meta-driver of migration. the concept of vulnerability describes the ability of an individual or community to withstand and recover from a risk such as a disaster event [20] . other meta-drivers include resilience and adaptive capacity [78] [79] [80] . whilst vulnerability is a commonly used meta-driver in much climate migration literature, resilience is often the currency of choice in the fields of disaster management and climate change adaptation [81] . however, these terms are broad and often overlap and are even used interchangeably, rendering their distinction and usefulness within scientific analysis questionable. 
despite this, such meta-drivers are the dialogue of choice for policy makers and must be utilised for research to have political relevance. however, care should be taken when referring to such meta-drivers, and the contributing drivers explored above must be contextually relevant and carefully selected. previous conceptual models explore the linkages between climate change and migration with different assumptions and perspectives. the 2011 foresight report identifies five key families of drivers and concludes that migration may be an adaptive strategy in the face of climate change; it represents possibly the best globally accepted conceptual model for climate migration to date [10]. the report disputes the long-held argument that migration represents a failure to adapt in situ.
this conclusion, however, fails to consider several key aspects of migration: firstly, the agency and social well-being of migrants involved at each stage of the migration process (prior to movement, in transit, and at the host destination); secondly, the level of agency afforded to would-be migrants during the migration decision, even as a supposedly proactive adaptation measure; finally, the delicate line between forced and voluntary movement, based upon a composition of drivers and the bias of the person(s) awarding the classification. the ongoing lancet commission on climate change and health also presents an interesting framework where migration as a result of climate change is appropriately framed as a health challenge and a public health opportunity [7]. this framework, however, gives little consideration to the intermediate drivers and the various pathways by which climate change may drive migration or produce trapped communities. helping to close this gap, and drawing on a range of political, economic and health literature, the model presented by sellers, ebi and hess considers a puzzle of immediate and longer-term drivers of social instability, with both climate shocks and migration as contributing factors and possible outcomes [82]. mcmichael et al. [5] also present a foundational model whereby the basic links between climate change and migration are presented, though driver interactions and dynamics are not discussed in depth. whilst this and other conceptual models encourage an upstream approach to environmentally induced migration, putting such thinking into practise presents further challenges. the paucity of empirical studies limits our understanding of how global climate change may threaten development and public health, particularly regarding the indirect impacts of climate change. the lack of suitable data and of the quantitative metrics needed to conduct such studies remains a perennial challenge.
it is essential that these challenges be overcome through future data collection and empirical modelling. migration datasets are largely based on cross-sectional survey and census data, whilst information about health and well-being, disaggregated by migration status, is largely lacking. furthermore, collecting and disseminating such data present significant ethical and privacy concerns. for many drivers, proxies may be used. for instance, the normalised difference vegetation index (ndvi) may act as a proxy for natural resource availability [67]. henderson et al. use a simple count of manufacturing industries as a proxy for urban industrial capacity when analysing the relationship between climate change and urbanisation in an african context [83]. lu et al. [84] suggest the possibility of using out-migration rates as a proxy for changes in habitability. neumann and hilderink [17] present a range of possible earth observation land degradation datasets, such as glasod for soil degradation and lada for biomass production. however, each of these datasets has its own challenges concerning spatial and temporal resolution, uncertainty and effectiveness as a proxy. furthermore, misalignment of datasets at the spatial, temporal and social levels creates further challenges in appropriately modelling migration determinants. other, less tangible drivers such as perceived political stability and social networks remain elusive to measurement and under-represented in quantitative studies. the availability and quality of data in turn create methodological challenges for empiricists. some studies utilise a range of statistical and epidemiological methods. however, traditional epidemiological methods each have their shortcomings. cross-sectional analyses do not allow for the temporal nature of drivers.
time-series analyses are often impeded by a lack of sufficient data and by the difficulty of controlling for interactions between drivers across a range of temporal and spatial resolutions. gravity models can capture linear push and pull factors at the macro level, though they may struggle with ecological fallacy and with modelling the more nuanced relationship between driver and migration outcome. recent developments in mathematical models offer useful insight. such models include improved agent-based models (abms) and multi-agent systems approaches [21, 52, 85, 86]. study approaches must be chosen appropriately based on the assumed relevant determinants and their interactions, as the choice of methods may have a significant impact on the study results. the advantage of such systems approaches is that driver dynamics and interactions may be built in and allowed to change between time steps. the individual nature of human decisions may also be captured through abms. however, abms require high-resolution data and are generally only applicable to small geographical scales. economic approaches such as economic bargaining theory can also be used to explain some micro-level migration decisions such as the 'healthy-migrant' effect, whereby young and fit-for-work individuals may be more likely to move in search of work and remittance opportunities [87]. however, since climate change enters only as a macro factor, such micro-level considerations are often lacking within current models of climate migration. other approaches have been proposed to deal with complexity and dynamicity. barbieri et al. [88] use a combined economic-demographic-climate model to understand the interactions between different classes of drivers over time (using appropriate proxies) in the northeast region of brazil. another emerging method is the use of shared socioeconomic pathways (ssps) to provide a combined set of scenarios for future population, urbanisation and wealth factors [89].
the ssps are designed to be used in conjunction with the climate change representative concentration pathways (rcps) for future radiative forcing emission scenarios. an application of this approach can be seen in the 2018 groundswell report, which combines the rcps and ssps into three scenarios and uses gravity modelling to provide a view of internal migration for three global regions [75]. finally, we make a crucial note regarding the overall approach by scientists towards climate migration. care must be taken when navigating the literature's various typologies and terminologies, which are necessarily subjective and vary by author and by discipline. furthermore, climate migration may be studied through a variety of academic lenses. as such, the impact of different epistemologies on conclusions is complex and often overlooked [24]. politically impactful research should attempt to transcend traditional research boundaries and avoid tribalism in science [90, 91]. indeed, in the pursuit of improved global health, research on climate migration should be contextually relevant, politically pertinent and timely. one way to help achieve this is to adopt a pan-disciplinary approach such as the one demonstrated within this paper. table 4 elucidates this point by demonstrating a selection of fields which contribute to the study of climate migration as an aspect of global health.
table 4. an overview of the range of scientific disciplines which contribute to the study of climate migration, its drivers and impacts:
human geography: offers a range of frameworks and tools for studying human migration and its drivers.
anthropology: through the study of human behaviour, anthropological methods offer a deeper insight into the decision-making process behind migration, as well as the impacts of migration upon individual and societal well-being. ethnography offers a unique and rich insight into people's opinions and decision making.
political sciences: may be used to explore the effects of policy on immigration, as well as the geographic, economic and social drivers of migration policy and sentiment.
economics: both macro- and microeconomics can be used to quantify migration as well as to study the economic drivers and impacts of migration. for example, in the case study of malawi, econometric modelling could be applied in the study of the impact of failed crops on household wealth and thus on migration.
mathematics: a range of mathematical models are used in the study of migration, such as system dynamic models, agent-based models, gravity models and diffusion models.
environmental epidemiology: can be used in the study of migration and its drivers. for example, the field of ecological public health supports the exploration of the relationships between the biological and material realms [92].
disaster risk reduction sciences: disaster risk reduction relates mainly to sudden-onset events and short-term, forced displacement and as such provides cross-over to the field of migration science.
computer sciences: computer science is used in migration studies to model and simulate migration and its quantifiable drivers and impacts.
sociology: can be used to study migration and its impacts at the societal level, with special interest in demographic makeup and the social structure of migrant (and non-migrant) communities.
demography: the study of population dynamics and structure places migration as a core component.
ultimately, climate change may have critical impacts upon future migration across the globe and has significant implications for public health, human security and sustainable development. climate change already contributes, and will continue over the coming decades to contribute, to large numbers of displaced persons [93], refugees [94], internal migrants [75], international migrants [26], and immobile and trapped persons [27].
as such, a better understanding of the relationship between climate change and migration is essential for effective future policy planning in all sectors. this can be achieved through the systematic and homogeneous application of robust conceptual frameworks to local contexts. a lack of data, particularly for low-income and indigenous settings, is a key setback which obstructs furthering our understanding. it also hinders the ongoing desire of academia and of national and international policy makers to identify the climate migrants of today and of the future and to count how many there are. this paper has attempted to demonstrate the need for a flexible and pan-disciplinary approach to environmentally induced migration. research which cross-cuts traditional discipline boundaries, in accordance with the planetary health viewpoint, is encouraged when using such a conceptual framework in the study of climate migration [95]. in this way, traditional pitfalls may be avoided, better reconciliation of macro and micro determinants may be achieved, and greater visibility of the dynamics of drivers, and hence a more accurate understanding of their role in driving migration, may be attained. however, the review within this paper is non-exhaustive and draws lightly on a wide range of literature and academic standpoints. as such, it is designed to demonstrate a nuanced approach to climate migration theory and application, rather than to present a comprehensive "how-to" guide. finally, it was beyond the scope of this paper to fully apply our conceptual model to mathematical and epidemiological quantitative models, providing instead a simple overview, though this is the natural progression of the research. table s1: a descriptive table providing an overview of each variable included within the malawi-specific model. these variables were identified through an iterative process based on the malawi context as well as data availability.
due to a lack of data on many variables some were omitted but may be considered in future work. author contributions: conceived and designed the research: a.z., g.l., p.l., r.p., s.h. and t.c.; research carried out by r.p. and a.z.; wrote the paper: r.p., a.z., t.c., g.l., p.l. and s.h. all authors have read and agreed to the published version of the manuscript. funding: r.p. is funded by nerc as part of the london nerc doctoral training partnership (dtp) under grant code ne/l002485/1. this research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. we declare no competing interest.
climate change, environmental degradation and migration
contextualising typologies of environmentally induced population movement
environmental refugees: a growing phenomenon of the 21st century
the lancet countdown on health and climate change: from 25 years of inaction to a global transformation for public health
an ill wind? climate change, migration, and health. environ. health perspect.
exploring the climate change, migration and conflict nexus
of the lancet countdown on health and climate change: shaping the health of nations for centuries to come
lancet migration: global collaboration to advance migration health
the uneven geography of research on "environmental migration"
migration and global environmental change: future challenges and opportunities
the effect of environmental change on human migration
not only climate change: mobility, vulnerability and socio-economic transformations in environmentally fragile areas of bolivia, senegal and tanzania. human settlements working paper
modelling inter-provincial migration in burkina faso, west africa: the role of socio-demographic and environmental factors
environmental influences on human migration in rural ecuador
rural out-migration in the drought prone areas of ethiopia: a multilevel analysis
climate change and migration: is agriculture the main channel?
opportunities and challenges for investigating the environment-migration nexus
natural disasters and population mobility in bangladesh
environmental drivers of human migration in dry lands: a spatial picture
assessing the impact of climate change on migration and conflict. exploring the social dimensions of climate change
developments in modelling of climate change-related migration
environmental dimensions of migration
ecological public health: the 21st century's big idea? an essay by tim lang and geof rayner
data and methods in the environment-migration nexus: a scale perspective
scoping the proximal and distal dimensions of climate change on health and wellbeing
international climate migration: evidence for the climate inhibitor mechanism and the agricultural pathway
understanding immobility: moving beyond the mobility bias in migration studies
a decision framework for environmentally induced migration
migration in the age of involuntary immobility: theoretical reflections and cape verdean experiences
migration in a turbulent time
no matter of choice: displacement in a changing climate
migration, immobility and displacement outcomes following extreme events
climate change and human health: present and future risks
focus on environmental risks and migration: causes and consequences
what drives human migration in sahelian countries? a meta-analysis
climate change, food systems and population health risks in their eco-social context
climate change and health in earth's future. earth's future
the determinants of vulnerability and adaptive capacity at the national level and the implications for adaptation
social vulnerability and migration in the wake of disaster: the case of hurricanes katrina and rita
mortality risk attributable to high and low ambient temperature: a multicountry observational study
projections of temperature-related excess mortality under climate change scenarios
workplace heat stress, health and productivity: an increasing challenge for low and middle-income countries during climate change
primary, secondary and tertiary effects of eco-climatic change: the medical response
health and climate change: policy responses to protect public health
climate, environmental and socio-economic change: weighing up the balance in vector-borne disease transmission
seasonal variation of child undernutrition in malawi: is seasonal food availability an important factor? findings from a national level cross-sectional study
using a migration systems approach to understand the link between climate change and urbanisation in malawi
climate and mobility in the west african sahel: conceptualising the local dimensions of the environment and migration nexus
exploring the causes of forced migration: a pooled time-series analysis
the use of survey data to study migration-environment relationships in developing countries: alternative approaches to data collection
climate shocks and migration: an agent-based modeling approach
50-year trends in us socioeconomic inequalities in health: us-born black and white americans
migrant destinations in an era of environmental change
climate change in semi-arid malawi: perceptions, adaptation strategies and water governance review
environmental health indicators of climate change for the united states: findings from the state environmental health indicator collaborative
contribution of working groups i, ii and iii to the fifth assessment report of the intergovernmental panel on climate change; core writing team
adaptive strategies to climate change in southern malawi
system task team on the post-2015 un development agenda: migration and human mobility; thematic think piece
modeling population displacement in the syrian city of aleppo
methodology for environment and agent development to model population displacement
contingency planning guidelines. in a practical guide for field staff; division of operations support
climate change in the fertile crescent and implications of the recent syrian drought
the 2001-2003 famine and the dynamics of hiv in malawi: a natural experiment
the politics of evidence-based policy in europe's "migration crisis"
framework for adaptation policy
migration and environment in ghana: a cross-district analysis of human mobility and vegetation dynamics
english national study of flooding and health study group. effect of evacuation and displacement on the association between flooding and mental health outcomes: a cross-sectional analysis of uk survey data
population aging, migration and productivity in europe
effect of internal migration on the environment in china
environmental kuznets curve hypothesis: a survey
social and environmental risk factors in the emergence of infectious diseases
local disease-ecosystem-livelihood dynamics: reflections from comparative case studies in africa
conflicts over water use in malawi: a socio-economic study of water resources management along the likangala river in zomba district
preparing for internal climate migration
future climate impacts on maize farming and food security in malawi
climate risk assessment and agricultural value chain prioritisation for malawi and zambia. working paper no. 228.
wageningen, the netherlands
linkages between vulnerability, resilience, and adaptive capacity
migration as an adaptation to climate change
geographies of resilience: challenges and opportunities of a descriptive concept
climate change, human health, and social stability: addressing interlinkages
has climate change driven urbanization in africa?
engø-monsen, k.; et al. unveiling hidden migration and mobility patterns in climate stressed regions: a longitudinal study of six million anonymous mobile phone users in bangladesh
the influence of climate variability on internal migration flows in south africa
agent-based model simulations of future changes in migration flows for burkina faso
might climate change the "healthy migrant" effect? glob. environ
climate change and population migration in brazil's northeast: scenarios for 2025-2050
the shared socioeconomic pathways and their energy, land use, and greenhouse gas emissions implications: an overview
places, people and perpetuity: community capacities in ecologies of catastrophe
climate change's role in disaster risk reduction's future: beyond vulnerability and resilience
internal displacement monitoring centre
unhcr. the environment and climate change
of the rockefeller foundation-lancet commission on planetary health
this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license
key: cord-324254-qikr9ryf authors: lyócsa, štefan; píhal, tomáš; výrost, tomáš title: fx market volatility modelling: can we use low-frequency data? date: 2020-09-30 journal: financ res lett doi: 10.1016/j.frl.2020.101776 sha: doc_id: 324254 cord_uid: lyócsa
high-frequency data tend to be costly, subject to microstructure noise, difficult to manage, and lead to high computational costs. is it always worth the extra effort? we compare the forecasting accuracy of low- and high-frequency volatility models on the market of six major foreign exchange market (fx) pairs.
our results indicate that for short forecast horizons, high-frequency models dominate their low-frequency counterparts, particularly in periods of increased volatility. with an increased forecast horizon, low-frequency volatility models become competitive, suggesting that if high-frequency data are not available, low-frequency data can be used to estimate and predict long-term volatility in fx markets.
• we study 1- to 66-day-ahead volatility forecasts of six major fx pairs.
• for short forecast horizons, high-frequency models dominate low-frequency models.
• high-frequency models are more accurate during market distress.
• for longer forecast horizons, low-frequency volatility models become competitive.
• low-frequency data can be used to accurately predict long-term volatility.
the turbulence in global financial markets in 2008, the european debt crisis, (geo)political uncertainties, oil-price wars in 2019 and 2020, and the outbreak of covid-19 in 2020 have resulted in a surge in volatility in financial markets worldwide. volatility estimates matter to many market participants: for example, investors use them for pricing financial derivatives. fund managers might set specific risk levels that are, in turn, influenced by the predicted level of volatility. risk levels are also targeted by banks to fulfill specific basel criteria. volatility might even be traded, using options or artificial indices linked to market volatility (poon and granger, 2003). times of extreme volatility also create pressure to rebalance portfolios, and the likelihood of contagion between markets also increases (kodres and pritsker, 2002). market participants are thus interested in measuring, managing, and forecasting market volatility to determine the value of their investments and to prepare and communicate their planned market decisions. the literature on volatility forecasting is rich and unfolds around available volatility estimators. initially, volatility was calculated from low-frequency, daily data.
the first generation of generalized autoregressive conditional heteroscedasticity (garch) models (bollerslev, 1986) emerged in the 1990s and early 2000s and is represented by numerous variations using low-frequency data, e.g., egarch, gjr-garch, ap-arch, n-garch, na-garch, i-garch, and figarch (for an earlier review, see poon and granger, 2003). the garch class of models offers competitive forecasts and can capture many stylized facts about volatility, particularly the volatility clustering effect. with the greater availability of high-frequency data in the late 2000s, the research shifted toward high-frequency (intraday) volatility estimators and models. the heterogeneous autoregressive (har) models of corsi (2009) utilized high-frequency data and the realized volatility estimator of andersen and bollerslev (1998) and andersen et al. (2001). the empirical evidence suggests that models of volatility based on high-frequency estimators provide forecasts superior to those of models based on low-frequency data (e.g., andersen et al., 2007, koopman et al., 2005, corsi et al., 2010, busch et al., 2011, horpestad et al., 2019). although the basic har model of corsi (2009) is appealingly simple and appears to capture the short- and long-term dependency of the volatility process adequately (e.g., andersen et al., 2007, vortelinos, 2017), the literature has raised several issues related to the effect of microstructure noise (see footnote 1). previously, andersen et al. (2001) acknowledged that for realized volatility (a high-frequency estimator of daily volatility) to be more efficient and unbiased, one needs high-quality data from actively traded assets. as a response, alternative estimators have emerged (e.g., ait-sahalia et al., 2005, bandi and russell, 2008, barndorff-nielsen et al., 2008, andersen et al., 2011, liu et al., 2015a).
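to make the garch class discussed above concrete, the garch(1,1) conditional-variance recursion can be sketched as follows; the parameter values and the return series are illustrative assumptions, not estimates from the paper (in practice the parameters are obtained by maximum likelihood):

```python
def garch11_variance(returns, omega=1e-6, alpha=0.08, beta=0.90):
    """Conditional variance path of a GARCH(1,1) model:
    sigma2_t = omega + alpha * r_{t-1}^2 + beta * sigma2_{t-1}.
    Parameter values are illustrative; in practice they are estimated by MLE.
    """
    # initialise with the unconditional variance omega / (1 - alpha - beta)
    sigma2 = [omega / (1.0 - alpha - beta)]
    for r in returns[:-1]:
        sigma2.append(omega + alpha * r ** 2 + beta * sigma2[-1])
    return sigma2

path = garch11_variance([0.01, -0.02, 0.015, 0.0, 0.005])
print(path[0])  # unconditional variance, approximately 5e-05
```

the recursion makes the volatility clustering effect mentioned above explicit: a large squared return yesterday mechanically raises today's conditional variance.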
the second generation of garch models bridges these two strands of the literature by relying on the latent volatility model (the garch concepts) while also using high-frequency data. the key ideas of the realized-garch model were presented by hansen et al. (2012), and several alternative models emerged thereafter (e.g., wu and xie, 2019, xie and yu, 2019). despite the wide interest of academia, the existing literature provides evidence only that i) volatility estimators based on high-frequency data are theoretically preferred (andersen et al., 2001) and ii) in the day-ahead predictive setting, models using high-frequency data provide superior performance (e.g., andersen et al., 2007, koopman et al., 2005, corsi et al., 2010, busch et al., 2011, horpestad et al., 2019). over longer horizons, averaging daily low-frequency volatility estimators across multiple days should reduce the effect of noise. intuitively, intraday price fluctuations should not greatly contribute to month-ahead volatility forecasts. therefore, with increasing forecast horizon, the difference between using high- or low-frequency volatility estimators should decrease, at which point low-frequency volatility models should tend to provide similarly accurate forecasts to high-frequency volatility models.
footnote 1: the basic specification of the har model has also been enhanced, e.g., by the inclusion of semivariances (patton and sheppard, 2015), the disentanglement of the realized volatility into continuous and jump components (e.g., andersen et al., 2012), the introduction of the measurement error of the realized volatility into the har model as in bollerslev et al. (2016), the inclusion of nontrading volatility components (lyócsa and molnár, 2017, lyócsa and todorova, 2020), and the use of hidden markov chains (luo et al., 2019).
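the har idea of mixing volatility components averaged over daily, weekly and monthly horizons can be sketched as follows; the realized-variance series below is illustrative, and the 5-day/22-day window lengths follow the standard corsi (2009) convention rather than any specification taken from this paper:

```python
def har_regressors(rv):
    """Build HAR daily/weekly/monthly regressors from a realized-variance series.

    Row t targets rv[t] and uses rv[t-1] (daily component), the mean of the
    last 5 observations (weekly), and the mean of the last 22 (monthly),
    following the Corsi (2009) lag structure.
    """
    rows = []
    for t in range(22, len(rv)):
        daily = rv[t - 1]
        weekly = sum(rv[t - 5:t]) / 5
        monthly = sum(rv[t - 22:t]) / 22
        rows.append((daily, weekly, monthly, rv[t]))
    return rows

rv_series = [float(i) for i in range(1, 31)]  # illustrative data only
rows = har_regressors(rv_series)
# each row: (daily, weekly, monthly, next-day target)
```

the resulting rows can then be fed into any least-squares routine (e.g., numpy.linalg.lstsq) to estimate the har coefficients; averaging the target over h days instead of using rv[t] yields the multiple-day-ahead variant discussed above.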
evidence on the relative (un)importance of low-frequency volatility models for multiple-day-ahead forecasts is lacking, which is intriguing, given that the heterogeneity of market participants has increased (with different needs and investment horizons, wooldridge, 2019) and in many real-world scenarios, market participants are more interested in long-term forecasts, e.g., derivative traders. we fill this gap in the literature. in a recent study, ma et al. (2018) showed that when low-and high-frequency volatility forecasts are combined appropriately, the accuracy increases for the shanghai stock exchange composite index and s&p 500 index. therefore, low-frequency data could provide additional information complementary to the available high-frequency data. nevertheless, the study of ma et al. (2018) is centered around day-ahead forecasts, where high-frequency volatility models should have the edge. in this study, we present the results from a volatility forecasting modeling framework that compares the forecasting accuracy of several low-and high-frequency volatility models as a function of the forecast horizon. our market of interest is represented by six major currency pairs 2 . for some, the implications of our research could be substantial. if low-frequency volatility models provide competitive performance, one could argue that high-frequency data are not always worth the much higher costs. daily foreign exchange data are freely available from various sources 3 , but availability of high-frequency foreign exchange data depends on the policy of the given broker or bank, and data are not always free 4 . even if data are available 5 for free, they are subject to various constraints, e.g., have limited licensing (e.g., can be used only for academic purposes) or are available only for short time periods or for a specific time 2 equities and commodities are addressed in a separate study and show qualitatively similar results. 3 e.g., finance.yahoo.com, investing.com. 
4 for example, the well-known provider of high-frequency data, tick data (www.tickdata.com), provides tick-by-tick quote data (bid and ask prices) that are already cleaned and processed. moreover, these data are from more contributors (banks and other market participants). the dataset that we used in our paper would cost approximately 8 100 usd after all discounts (july 2020). 5 e.g., oanda, dukascopy frequency. moreover, the use of high-frequency data raises other issues, most notably, working with high-frequency data requires appropriate cleaning and processing of the data. for example, the approximate sizes of the daily eur/usd data from 2005 to 2019 is 120 kb, 5-second data is 350 mb, and tick-by-tick data is 15 gb. processing daily data and estimating the models is overall much faster than processing and estimating models that use high-frequency data, where one needs to clean and prepare each line of the 15 gb of data. 6 therefore, the processing, data management, and computational intensity demands are much higher for highfrequency data and might not be worth the greater effort. our results illustrate the dominance of high-frequency estimators for forecasting one-day-ahead volatility. models that utilize highfrequency data or their combinations provide superior results. however, for longer forecast horizons, the combination of low-frequency volatility models provides forecasts statistically comparable to those of high-frequency volatility models and their combinations. our results suggest that for most foreign exchange market (fx) pairs, low-frequency data represent a sufficient replacement for high-frequency data for forecast horizons of 5 or more days. our study might therefore provide practitioners and policymakers with evidence supporting the use of high-or low-frequency volatility models in a particular setting. 2.1. volatility estimators 2.1.1. 
high-frequency estimator

given 5-minute intraday continuous returns r_{t,j} for day t = 1, 2, ..., T and intraday period j = 1, 2, ..., N, the usual realized variance estimator [7] (e.g., bollerslev, 1998, andersen et al., 2001) is defined as:

rv_t = \sum_{j=1}^{N} r_{t,j}^2 .

many alternative estimators of quadratic variation exist to address the inherent microstructure noise (e.g., zhang et al., 2006, jacod et al., 2009, andersen et al., 2012). our choice of the 5-minute realized variance estimator is motivated by liu et al. (2015b), who compared the empirical accuracy of several estimators across many assets [8] and found that consistently outperforming the simple 5-minute realized variance is difficult.

[6] one needs to do this only once, but we want to stress that different types of skills and experience are also required to work with high-frequency data. [7] in the following text, we use the terms variance and volatility interchangeably. [8] their comparison also included foreign exchange market futures.

as an alternative, low-frequency estimator, we use range-based estimators, which are more efficient than the usual daily squared return (e.g., molnár, 2012). motivated by patton and sheppard (2009), we increase the efficiency of the estimation process by combining three range-based estimators via a simple average. specifically, given the natural logarithms of the opening (o_t), high (h_t), low (l_t), and closing (c_t) prices on day t, the parkinson (1980) estimator is:

\sigma^2_{P,t} = \frac{(h_t - l_t)^2}{4 \ln 2} ,

and the garman and klass (1980) estimator is:

\sigma^2_{GK,t} = \frac{1}{2}(h_t - l_t)^2 - (2 \ln 2 - 1)(c_t - o_t)^2 .

both estimators assume that the price follows a driftless geometric brownian motion.
allowing for arbitrary drift, rogers and satchell (1991) derived the following estimator:

\sigma^2_{RS,t} = (h_t - c_t)(h_t - o_t) + (l_t - c_t)(l_t - o_t) .

the range-based estimator used in our empirical setting is the average (following patton and sheppard, 2009) of the above three estimators:

rb_t = \frac{1}{3}\left(\sigma^2_{P,t} + \sigma^2_{GK,t} + \sigma^2_{RS,t}\right) .

the motivation behind using the (naive) equally weighted average is the assumption that we have no prior information on which estimator might be more accurate for a given trading day [9]. should this simplified approach lead to competitive multiple-day-ahead volatility forecasts, it follows that a more sophisticated combination of low-frequency estimators might make the results even stronger.

in this section, we describe what we refer to as high- and low-frequency volatility models. as the name suggests, high-frequency volatility models utilize the realized variance as the estimator of volatility, whereas low-frequency models use the range-based estimator. we use three classes of models: the heterogeneous autoregressive model (har) of corsi (2009), the autoregressive fractionally integrated model (arfima), and the realized generalized autoregressive conditional heteroscedasticity (realized-garch) model of hansen et al. (2012). these models were selected because they can use either high- or low-frequency volatility estimators in a straightforward manner. moreover, all of them have been proven capable of replicating long-memory and volatility-clustering effects.

[9] the development and statistical verification of a method that continuously updates the weights is left for further research. however, motivated by reviewer insights, we run our analysis and compare the results with low-frequency volatility models that use each of the three range-based estimators separately. a short discussion is presented in section '4.3. individual range-based low-frequency volatility forecasts'.
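the volatility estimators above can be sketched in a few lines of python. this is an illustrative implementation of the standard formulas only; the function names are ours, not from the paper, and the inputs are assumed to be natural-log prices (o, h, l, c) and raw intraday returns:

```python
import math

def realized_variance(intraday_returns):
    # sum of squared intraday (e.g., 5-minute) returns for one day
    return sum(r * r for r in intraday_returns)

def parkinson(o, h, l, c):
    # parkinson (1980); h and l are natural logs of the daily high and low
    return (h - l) ** 2 / (4.0 * math.log(2.0))

def garman_klass(o, h, l, c):
    # garman and klass (1980); assumes driftless geometric brownian motion
    return 0.5 * (h - l) ** 2 - (2.0 * math.log(2.0) - 1.0) * (c - o) ** 2

def rogers_satchell(o, h, l, c):
    # rogers and satchell (1991); allows for arbitrary drift
    return (h - c) * (h - o) + (l - c) * (l - o)

def range_based(o, h, l, c):
    # equally weighted average of the three range-based estimators (rb_t)
    return (parkinson(o, h, l, c)
            + garman_klass(o, h, l, c)
            + rogers_satchell(o, h, l, c)) / 3.0
```

the equal weighting mirrors the paper's choice of having no prior on which estimator is more accurate on a given day.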
in the past decade, the simple har model proposed by corsi (2009) has gained popularity, since it is easy to estimate and tends to perform better than competing first-generation garch models (horpestad et al., 2019). let rv_{t+1,t+h} be the daily average realized variance calculated over the next h days. in this study, we are especially interested in the role of low-frequency estimators for multiple-day-ahead volatility forecasts, and we employ 1- to 66-trading-day-ahead forecasts. according to the recent bank for international settlements (bis) survey, in 2019, 78% of the over-the-counter (otc) foreign exchange derivatives had a maturity of less than one year [10]. for the low-frequency volatility models to be useful for a wide array of participants, they should produce competitive forecasts up to a forecast horizon of one year or less. as our analysis shows that after a few weeks the low-frequency volatility models tend to provide competitive forecasts across all fx pairs, we have used 66 trading days (three months) as a compromise between a few weeks and one year. our baseline har model is therefore specified as:

rv_{t+1,t+h} = \beta_0 + \beta_1 rv_t + \beta_2 rv_{t,t-4} + \beta_3 rv_{t,t-21} + \beta_4 rv_{t,t-65} + \epsilon_{t+1,t+h} ,    (6)

where rv_t is the realized variance and rv_{t,t-4}, rv_{t,t-21}, and rv_{t,t-65} are average realized variances calculated over the past 5, 22, and 66 days, respectively. the multiple-component volatility structure in (6) follows the literature (e.g., vortelinos, 2017); our specification differs only in that, in addition to the one-month component, we also incorporate a three-month volatility component, motivated by the fact that we are also predicting three-month (66-day-ahead) volatility. the model is denoted rv-har, and the corresponding low-frequency, range-based version is denoted rb-har. we consider two other popular versions of the har model that aim to capture the asymmetric volatility observed in financial markets.
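the baseline har regression can be sketched as plain ordinary least squares on the daily, weekly, monthly, and quarterly volatility averages. this is a minimal illustration, not the authors' estimation code; the helper names and the gaussian-elimination ols solver are ours:

```python
def ols(X, y):
    """Ordinary least squares via the normal equations and Gaussian elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))  # partial pivoting
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, k):
            factor = A[r][col] / A[col][col]
            for c in range(col, k + 1):
                A[r][c] -= factor * A[col][c]
    beta = [0.0] * k
    for r in range(k - 1, -1, -1):
        beta[r] = (A[r][k] - sum(A[r][c] * beta[c] for c in range(r + 1, k))) / A[r][r]
    return beta

def har_features(rv, t):
    """Regressors for day t: [1, rv_t, 5-, 22-, and 66-day averages of rv]."""
    return [1.0, rv[t],
            sum(rv[t - 4:t + 1]) / 5.0,
            sum(rv[t - 21:t + 1]) / 22.0,
            sum(rv[t - 65:t + 1]) / 66.0]

def fit_har(rv, horizon=1):
    """Direct h-step HAR: regress the average rv over the next `horizon` days."""
    X, y = [], []
    for t in range(65, len(rv) - horizon):
        X.append(har_features(rv, t))
        y.append(sum(rv[t + 1:t + 1 + horizon]) / horizon)
    return ols(X, y)

def har_forecast(rv, beta):
    x = har_features(rv, len(rv) - 1)
    return sum(b * xi for b, xi in zip(beta, x))
```

the same code serves rv-har or rb-har depending on whether `rv` holds realized variances or range-based estimates.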
let nsv_t and psv_t, respectively, denote the negative and positive realized semivariances (e.g., barndorff-nielsen et al., 2010, patton and sheppard, 2015):

nsv_t = \sum_{j=1}^{N} r_{t,j}^2 \, I[r_{t,j} < 0], \qquad psv_t = \sum_{j=1}^{N} r_{t,j}^2 \, I[r_{t,j} \geq 0],

where I[\cdot] represents an indicator function that returns one if the condition in square brackets holds and zero otherwise. the har model is then defined as:

rv_{t+1,t+h} = \beta_0 + \beta_1 nsv_t + \beta_2 psv_t + \beta_3 rv_{t,t-4} + \beta_4 rv_{t,t-21} + \beta_5 rv_{t,t-65} + \epsilon_{t+1,t+h} .

we use only one-day lags of nsv_t and psv_t to limit the number of estimated parameters, which might otherwise deteriorate the forecasting performance in an out-of-sample context. such simplified models were also considered by patton and sheppard (2015) and bollerslev et al. (2016). this model is denoted sv-rv-har. as a low-frequency, range-based counterpart, we use a har specification augmented with the term \beta_3 rb_t \times I[r_t < 0], where r_t is the daily return; this term captures the asymmetric volatility response. the model is denoted arb-rb-har. the final two specifications are also motivated by the asymmetric volatility literature, namely, corsi and renò (2009) and horpestad et al. (2019); in these models, the coefficient \beta_4 captures the asymmetric effect, and \beta_3 controls for the size effect (see preve (2019) for a discussion of estimating har models). we next use an arfima-garch model, in which the mean equation models the variance:

(1 - L)^d (rv_t - \mu) = \epsilon_t, \qquad \epsilon_t = v_t \eta_t ,

where d is the differencing parameter (e.g., granger and joyeux, 1980), v_t is the time-varying volatility [11], and \eta_t is an iid variable following a flexible distribution (johnson, 1949a,b). the variance equation is the exponential garch model of nelson (1991):

\ln v_t^2 = \omega + \beta \ln v_{t-1}^2 + \alpha z_{t-1} + \gamma \left(|z_{t-1}| - E|z_{t-1}|\right) ,

where the sign and the size effects are captured by \alpha and \gamma, and z_t is the standardized innovation. the high-frequency volatility model employs the realized variance and is denoted rv-arfima-garch, and the range-based version is denoted rb-arfima-garch. finally, due to their popularity and the development of more sophisticated second-generation garch models, we use the realized-garch model of hansen et al. (2012), which can be adjusted to work with high- or low-frequency volatility estimators.
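the realized semivariances split the realized variance by the sign of each intraday return. a minimal sketch follows; the function name is ours, and assigning zero returns to the positive side is our convention (the source does not specify the boundary case):

```python
def realized_semivariances(intraday_returns):
    """Negative and positive realized semivariances (nsv_t, psv_t) for one day.

    Squared intraday returns are accumulated according to the indicator
    I[r < 0] (negative side) or I[r >= 0] (positive side), so that
    nsv + psv equals the realized variance.
    """
    nsv = sum(r * r for r in intraday_returns if r < 0)
    psv = sum(r * r for r in intraday_returns if r >= 0)
    return nsv, psv
```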
the mean equation models daily returns:

r_t = \mu + \sqrt{h_t}\, z_t ,

and the variance and the measurement equations are:

\ln h_t = \omega + \beta \ln h_{t-1} + \gamma \ln x_{t-1}, \qquad \ln x_t = \xi + \varphi \ln h_t + \tau(z_t) + u_t ,

where x_t is the volatility estimator.

[11] in this case, it is the time-varying volatility of the variance.

originally, hansen et al. (2012) used the realized variance, in which case we denote the model realized-garch. if the range-based estimator is used instead, the model is called range-garch.

the forecasting procedure uses a rolling-window framework. the algorithm is as follows:

1. select observations t = 1, 2, ..., t_e.
2. estimate the volatility models.
3. using the estimated parameters and observations, predict volatility at t_e + 1. for har models, multiple-day-ahead forecasts are predicted directly, while for arfima-garch and realized-garch models, multiple-day-ahead forecasts are calculated recursively.
4. shift the estimation window by using observations t = 2, 3, ..., t_e + 1 and repeat steps 2 to 4 until the end of the sample.

the estimation window size is set to t_e = 1000. we draw on the ideas of bates and granger (1969) and use simple combination techniques to mitigate model uncertainty (timmermann, 2006). forecasts are combined across all high-frequency volatility models, across all low-frequency volatility models, and across all ten high- and low-frequency volatility models. to combine forecasts, we use weighted averages, where the weights are given by the discounted forecast error. let f^m_t and f_t denote the forecast from model m and the corresponding proxy, the realized variance rv_t. our first combination is a simple average across all forecasts:

c^{ave}_{h,t} = \frac{1}{M} \sum_{m=1}^{M} f^m_t .

here, the subscript h means that we averaged across high-frequency models; for low-frequency models, we use the subscript l, and for a combination across both classes of forecasts, we use hl. the loss (to be defined in the next section) is l_t(f^m_t, f_t) and for simplicity is denoted l_t.
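the rolling-window procedure in steps 1 to 4 can be sketched generically. this is an illustrative skeleton only; the `fit`/`predict` callables stand in for any of the paper's models, and the naive random-walk model used below is our placeholder, not one of the paper's specifications:

```python
def rolling_forecasts(series, estimation_window=1000, fit=None, predict=None):
    """Re-estimate on each window [i, i + estimation_window) and forecast
    the next observation, shifting the window by one day each time."""
    forecasts = []
    for i in range(len(series) - estimation_window):
        window = series[i:i + estimation_window]
        params = fit(window)                 # step 2: estimate the model
        forecasts.append(predict(window, params))  # step 3: predict t_e + 1
    return forecasts                         # step 4: window shifting is the loop

def naive_fit(window):
    # the random-walk placeholder has no parameters to estimate
    return None

def naive_predict(window, params):
    # forecast tomorrow's volatility with today's value
    return window[-1]
```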
we use the discounted forecast error to weight each loss value such that recent losses receive higher weight than losses further in the past, and we calculate the weighted average loss over a period of t (out-of-sample) observations:

\bar{l}^m_t = \frac{\sum_{i=0}^{t-1} \delta^i \, l_{t-i}}{\sum_{i=0}^{t-1} \delta^i} ,

where \delta is the weighting parameter. with \delta = 1, all losses have equal weights; the lower \delta is, the higher the relative weight of the most recent losses. we choose \delta = 0.975 and observe almost no qualitative change in the results for \delta = 0.950 or \delta = 0.900. the weighted losses are calculated from the most recent 200 predictions, which we refer to as the size of the calibration sample. thus, the first combination forecast is available for the 1201st observation of the initial sample (estimation window + calibration sample + 1). our second combination is formed as a weighted trimmed mean:

c^{trim}_t = \sum_{m=2}^{M-1} l^{(m),*}_t f^{(m)}_t ,

where f^{(m)}_t represents the ordered forecasts, i.e., the lowest and the highest are excluded, and l^{(m),*}_t are the weights derived from the corresponding losses, rescaled to sum to one. the final combination is a weighted average across the three best-performing models and is denoted c^{top}_h. as noted in the previous section, our proxy is the realized variance rv_t, which in the subsequent equations is denoted f_t. this approach clearly places the low-frequency models at a disadvantage, but we argue that it is the only meaningful way to test whether low-frequency models can achieve performance comparable to that of high-frequency models. we evaluate the forecasts of our model specifications using two statistical loss functions and the model confidence set (mcs). according to patton (2011), the mean square error (mse) and quasi-likelihood (qlike) loss functions provide a consistent ranking of forecasts even if the proxy of the underlying latent volatility is measured with noise. as the qlike loss function is less sensitive to extreme values and penalizes underestimation of volatility more strongly, we use it to present our key results.
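the qlike loss and the discounted-error weighting can be sketched as follows. this is an illustration under our own assumptions: the qlike form is the common patton (2011)-style expression, and weighting each model in inverse proportion to its discounted average loss is our reading of the combination scheme, which the text does not spell out in full:

```python
import math

def qlike(forecast, proxy):
    # QLIKE loss; penalizes under-prediction of volatility more heavily than MSE
    ratio = proxy / forecast
    return ratio - math.log(ratio) - 1.0

def discounted_weights(loss_histories, delta=0.975):
    """One weight per model, inversely proportional to its discounted average loss.

    `loss_histories` is a list (one entry per model) of loss sequences with the
    most recent loss last; delta < 1 up-weights recent losses.
    """
    avg = []
    for losses in loss_histories:
        num = den = 0.0
        for age, loss in enumerate(reversed(losses)):
            num += (delta ** age) * loss
            den += delta ** age
        avg.append(num / den)
    inv = [1.0 / a for a in avg]
    total = sum(inv)
    return [w / total for w in inv]

def combine(forecasts, weights):
    # weighted-average combination forecast
    return sum(f * w for f, w in zip(forecasts, weights))
```

with the paper's calibration sample, each `loss_histories` entry would hold the most recent 200 losses of one model.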
statistical evaluation is conducted using the mcs proposed by hansen et al. (2011). this algorithm is suitable when models are nested, when a benchmark model is not specified, and when multiple models are evaluated, i.e., it controls for data-snooping bias. the mcs algorithm finds the 'superior set of models', which comprises the models with the same predictive ability at the selected confidence level.

we study the foreign exchange market, the market with the largest turnover in the world. we collect data from oanda using a 5-minute calendar sampling scheme over a 24-hour trading window that starts at 22:00 utc (the end of the new york session). due to low liquidity, weekends are removed from the analysis to avoid estimation bias, as is standard in the literature (e.g., dacorogna et al., 2001, andersen et al., 2007, aloud et al., 2013, gau and wu, 2017).

the descriptive statistics for our daily volatility measures and returns are presented in table 1. we note several interesting differences between the high- and low-frequency variance estimates. first, the distribution of the low-frequency variance estimates shows a higher spread of values, which we would expect from a noisier estimate; specifically, the low-frequency variance estimate has an approximately 30% larger standard deviation, with more skew and higher kurtosis. second, on average, the low-frequency estimator is slightly smaller than its high-frequency counterpart [14]. third, the persistence of the high-frequency estimator is higher and shows longer memory; this characteristic might prove useful in har models, which specifically exploit this persistence. fourth, the correlation between the daily high- and low-frequency variance estimates is high (see the note to table 1). for illustration purposes, figures 1 and 2 plot the daily realized variance for the six fx pairs and the corresponding day-ahead forecasts from the arfima-garch models, which tend to produce the most accurate day-ahead forecasts for both high- and low-frequency volatility models.
the forecasts tend to follow the realized variances but are unable to replicate sudden spikes in volatility, a phenomenon also visible in other forecasting studies.

our key results are visualized in figure 3. values in bold and with the dagger symbol represent the models that belong to the mcs, i.e., the predictive abilities of the models in bold are considered to be equally good. for example, the best models for forecasting one-day volatility for eur/usd (second column in table 2) are c^{ave}_h and c^{trim}_h, which combine the results from the high-frequency models (panel c).

the estimated coefficients reported in table 6 show that, over time, high-frequency models tend to produce more precise forecasts for the aud/usd, eur/usd, gbp/usd, and usd/cad fx pairs, and that their accuracy also increases during more volatile periods. the opposite is true for usd/jpy, and the results are nonsignificant for usd/chf. these results suggest that more accurate forecasting models could be designed with a conditional combination that exploits the level of market volatility.

up to now, for our low-frequency volatility models, we have assumed that we do not have any ex ante information about which of the range-based estimators leads to more accurate volatility forecasts. here, we discuss the results from low-frequency volatility models estimated separately with the garman and klass (1980), parkinson (1980), and rogers and satchell (1991) estimators. detailed tabulated results are available upon request. our general observation does not change: increasing the forecast horizon leads to more competitive forecasts from the low-frequency volatility models regardless of the range-based estimator employed. among the individual range-based estimators, the garman and klass (1980) estimator leads to lower forecast errors than those generated from volatility models based on the equally weighted average of the range-based estimators.
however, this does not mean that one should blindly prefer the garman and klass (1980) estimator, as there are two caveats. first, using only one range-based estimator has occasionally led to very inaccurate forecasts, which could successfully be avoided by using the average of the three range-based estimators; for example, in a day-ahead setting for the gbp/usd and usd/cad pairs, the forecast errors from the rb-har models with the garman and klass (1980) estimator were markedly larger. these examples suggest that in many practical scenarios, using the average across estimators should be preferred to using individual estimators.

as many participants interact with the fx market, predicting the market's uncertainty is crucial for improved risk management. while high-frequency data lead to superior volatility estimates, the acquisition, data management, and computational costs associated with such data cannot be covered by all market participants. moreover, low-frequency data are publicly available and much easier to work with. this leads to the question from our title: 'can we use low-frequency data?'. in this paper, we compare the forecasting performance of several volatility models that use low- or high-frequency volatility estimates or both. on the basis of a sample of six major currency pairs, our results suggest that for short forecast horizons (from 1 to 5 days), high-frequency models dominate their low-frequency counterparts. as the forecast horizon increases, the advantage of the high-frequency models disappears, and low- and high-frequency forecasts become statistically comparable. the answer to the question posed in the title is therefore: 'if high-frequency data are not available, then low-frequency data can be used to estimate and predict long-term market volatility'. moreover, regardless of whether one relies on high- or low-frequency volatility models, one should utilize combination forecasts.
the mincer and zarnowitz (1969) tests further suggest that at least part of the inaccuracy of the low-frequency volatility forecasts is due to bias. finally, we find that high-frequency models tend to be superior during periods of increased volatility. these results have implications for researchers and investors alike, as they demonstrate that low-frequency volatility models can, under some circumstances, provide performance competitive with that of high-frequency models. our study notes that high-frequency data might not always be worth the much higher acquisition, data management, and processing costs, especially if the forecast horizon of interest is sufficiently long.

note: ρ(.) is the value of the auto-correlation coefficient at the given lag. sd is the standard deviation. the correlation between the high- and low-frequency variance estimators is 0.90, 0.86, 0.96, 0.83, 0.88, and 0.89 for aud/usd, eur/usd, gbp/usd, usd/cad, usd/chf, and usd/jpy, respectively.

notes: the values in bold and with the † symbol denote the model confidence set for the given currency pair; in other words, we cannot reject the hypothesis that these models have the same predictive performance at the level of α = 0.15. all models and forecast combinations are described in section 2.
note: the results correspond to modelling the loss differential between the c^{trim}_h and c^{trim}_l forecasting models by means of lagged realized variance and a trend variable. all coefficients are multiplied by 10^4. significance is based on the variance-covariance matrix estimated using a quadratic spectral weighting scheme and newey-west automatic bandwidth selection. */**/*** correspond to the 10%, 5%, and 1% significance levels.

references:
- how often to sample a continuous-time process in the presence of market microstructure noise. the review of financial studies.
- stylized facts of trading activity in the high frequency fx market: an empirical study.
- the distribution of realized stock return volatility.
- answering the skeptics: yes, standard volatility models do provide accurate forecasts.
- roughing it up: including jump components in the measurement, modeling, and forecasting of return volatility. the review of economics and statistics.
- realized volatility forecasting and market microstructure noise.
- jump-robust volatility estimation using nearest neighbor truncation.
- microstructure noise, realized variance, and optimal sampling.
- measuring downside risk: realised semivariance.
- designing realized kernels to measure the ex post variation of equity prices in the presence of noise.
- the combination of forecasts.
- generalized autoregressive conditional heteroskedasticity.
- are low-frequency data really uninformative?

author contributions: štefan lyócsa: conceptualization, methodology, software, data curation, formal analysis, writing (original draft preparation), writing (review and editing), visualization, funding acquisition, project administration. tomáš plíhal: conceptualization, methodology, data curation, investigation, writing (original draft preparation).

funding: financial support from the ...(to be added later)...
is acknowledged gratefully.

key: cord-329256-7njgmdd1
title: modeling the variations in pediatric respiratory syncytial virus seasonal epidemics
authors: leecaster, molly; gesteland, per; greene, tom; walton, nephi; gundlapalli, adi; rolfs, robert; byington, carrie; samore, matthew
date: 2011-04-21
journal: bmc infect dis
doi: 10.1186/1471-2334-11-105
doc_id: 329256
cord_uid: 7njgmdd1

background: seasonal respiratory syncytial virus (rsv) epidemics occur annually in temperate climates and result in significant pediatric morbidity and increased health care costs. although rsv epidemics generally occur between october and april, their size and timing vary across epidemic seasons and are difficult to predict accurately. prediction of epidemic characteristics would support the management of resources and treatment. methods: the goals of this research were to examine the empirical relationships among the early exponential growth rate, total epidemic size, and timing, and the utility of specific parameters in compartmental models of transmission in accounting for variation among seasonal rsv epidemic curves. rsv testing data from primary children's medical center were collected on children under two years of age (july 2001-june 2008). simple linear regression was used to explore the relationship between three epidemic characteristics (final epidemic size, days to peak, and epidemic length) and the exponential growth rate calculated from four weeks of daily case data. a compartmental model of transmission was fit to the data, and the parameter estimates were used to help describe the variation among seasonal rsv epidemic curves. results: the regression results indicated that exponential growth was correlated with the epidemic characteristics. the transmission modeling results indicated that the start time of the epidemic and the transmission parameter co-varied with the epidemic season.
conclusions: the conclusions were that exponential growth was somewhat empirically related to the seasonal epidemic characteristics and that variation in the epidemic start date, as well as in the transmission parameter, over epidemic years could explain the variation in seasonal epidemic size. these relationships are useful for public health, health care providers, and infectious disease researchers.

respiratory syncytial virus (rsv) has long been recognized as a substantial public health threat [1], with annual epidemics exacting an enormous toll on vulnerable populations and health care delivery systems. rsv is associated with substantial morbidity in children in both the hospitalized and outpatient settings [2-5]. in addition to the toll on the health of the population, this disease imposes a large burden on the health care system in terms of human and material resources. although no rsv vaccine exists, infants and children with risk factors for severe rsv infection (e.g., lung disease or prematurity) can receive monthly doses of palivizumab, a humanized murine anti-rsv monoclonal antibody, during the rsv season. palivizumab treatment is extremely costly; the cost-effectiveness of this therapy could be improved if treatment were given only during times of high rsv activity. treatment of vulnerable individuals also improves overall health in the population. prediction of seasonal epidemic characteristics, including times of high activity and total size, would support efficient management of resources and delivery of palivizumab. health care facilities could forecast requirements for beds, staffing, testing, treatment, and other resources needed to care for sick children. for greatest effectiveness, these predictions should be made early in the rsv season; the authors, including public health practitioners and physicians, hold the expert opinion that these predictions would be useful within the first month of the observed start of the rsv seasonal epidemic.
in some regions, total epidemic size generally follows a biennial cycle, with smaller epidemic seasons followed by larger epidemic seasons [6]. this cycle is currently used to gauge upcoming rsv seasonal epidemic size based on the total size of the previous epidemic season. researchers at the centers for disease control and prevention (cdc), using the national respiratory and enteric virus surveillance system, found that the prior epidemic season's data were a relatively imprecise predictor of the epidemic season onset in a given community and that the timing of the rsv epidemic season may vary substantially in the same year among communities in close proximity [7]. one goal of this research was to explore year-to-year variation in epidemic seasons using local data. the biennial variation in our seasonal epidemic data was seen in the early exponential growth rates (the slope of the cumulative case curves; figure 1) as well as in total epidemic size. we explored the relationship between the exponential growth of rsv epidemics and the seasonal epidemic characteristics of total epidemic size, days to peak, and epidemic length to assess predictions made early in the epidemic season. knowledge about viral transmission characteristics and the data derived from surveillance systems can be used to inform novel approaches for estimating characteristics of rsv epidemics through the application of methods rooted in epidemiological models of infectious disease transmission [8,9]. these methods are being increasingly applied to emerging threats like sars [10-12] and pandemic influenza, but their application to routine epidemics of common respiratory viruses like seasonal influenza and rsv has only begun to be explored. weber et al. [8] model rsv transmission to examine how climate and social factors influence transmission in a population.
they consider compartmental models using susceptible-infected-recovered-susceptible (sirs) dynamics with additions to include latency and stages of susceptibility. they find no single best model for rsv epidemics; many "competing" models fit the observed data well. we further explored the variation in seasonal epidemics using compartmental models. the variation in exponential growth could potentially be related to variation in transmission rates, epidemic start dates, or proportions susceptible, as well as a host of other factors. the second goal of this research was to evaluate the ability of a compartmental model based on epidemiologic principles to fit observed data from a series of epidemics and to examine the extent to which seasonal variations in epidemics can be accounted for by variation in specific model parameters. for these analyses, we used daily laboratory data from the major pediatric health care facility in utah, where routine viral testing is a fixture of standard clinical care for children presenting to regional emergency departments. the utility of the data from these surveillance systems for relating final epidemic size and modeling the epidemic curve has not been fully evaluated. we investigated the estimation of seasonal epidemic characteristics using regression of exponential growth across seven epidemic seasons. we also modified the model of weber et al. to explore the model fits and estimates of epidemic size using variation of parameters within a susceptible-exposed-infected-infected/detected-recovered (seidr) model. primary children's medical center (pcmc) is a 250-bed children's hospital that serves both as a community pediatric hospital for salt lake county, utah (2008 population 1 million [13]), and as a tertiary referral center for five states in the intermountain west (utah, idaho, wyoming, nevada, and montana; total 2008 population 8.36 million [14]).
eighty percent of pediatric hospital admissions occurring in salt lake county and 73% of those occurring in the state of utah are at pcmc. during the study period, july 2001 through june 2008, direct respiratory sampling (mainly saline-assisted nasopharyngeal aspiration) for respiratory viral testing was performed for about 70% of children evaluated in the pcmc emergency department for respiratory complaints (unpublished data) and was required for all hospitalized children with respiratory symptoms (e.g., upper or lower respiratory tract infection, bronchiolitis, asthma, or bacterial or viral pneumonia). in addition, respiratory viral testing was recommended for all febrile infants one to 90 days of age. test results were used to inform patient cohorting and isolation procedures and to assist with medical management. all samples were initially tested by direct fluorescent antibody staining (dfa). dfa testing was performed three to five times daily depending on the season, with a mean turnaround time of four hours. for all dfa-negative specimens, multiplex polymerase chain reaction (pcr) or viral culture was performed. the data included in our analyses were all positive test results from the above sampling protocols, from any of the testing methods, during the study period. the practice of testing and the test methods did not change appreciably during the study period (unpublished data on the percentage of children tested and the methods used). the data were used as daily counts by age group, under two and over two years old. the rsv epidemic year was defined to be from july 1 of one year through june 30 of the following year. this time period was chosen to place the beginning date close to the middle of the inter-epidemic period, approximately six months from the average historical peak of the seasonal epidemic. this study was reviewed by the institutional review boards of intermountain healthcare and the university of utah and determined by both organizations to be exempt.
regression analysis was used to explore the relationship between the initial exponential growth rate and the epidemic season characteristics of size, days to peak, and length using the seven epidemic seasons of rsv data from pcmc. the exponential growth rate, \lambda_{t_0,t_1}, for the time interval t_0 to t_1 was calculated as

\lambda_{t_0,t_1} = \frac{\ln(x_{t_1}) - \ln(x_{t_0})}{t_1 - t_0} ,

where x_{t_i} denotes the cumulative number of cases at time t_i, i = 0, 1. the exponential growth rate was calculated at four weeks to assess regression predictions made early in the season; for comparison, it was also calculated at weeks one through six. the total epidemic size was the sum of cases over the epidemic year, including sporadic inter-epidemic cases. an observable seasonal epidemic start date of t_0 was defined as the start of the first week of the epidemic year with at least five confirmed rsv cases. this was the definition used by the hospital epidemiologists at pcmc to declare the start of rsv outbreaks during the study period. the term seasonal epidemic refers to the period from the epidemic start date until the epidemic end date, defined as the end of the last week of the epidemic year with at least five confirmed rsv cases. the number of days until the peak of an epidemic season was calculated as the midpoint day of the largest seven-day moving-average window minus the epidemic season start day. the length of the epidemic season was calculated as the epidemic season end day minus the epidemic season start day. relationships between the initial exponential growth rate and the seasonal epidemic characteristics were described using the pearson correlation coefficient and assessed using standard regression statistics. the fits of the regression models were assessed using the percent error of the model fits from the observed values. to combine across seasons, the absolute values of the percent errors were averaged, providing the mean absolute percent error for the model.
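the growth-rate calculation and the simple linear regression relating it to an epidemic characteristic can be sketched in a few lines. this is an illustrative reconstruction from the definitions above; the function names are ours, and the regression helper is generic ordinary least squares for one predictor:

```python
import math

def exponential_growth_rate(x0, x1, t0, t1):
    """lambda over [t0, t1] from cumulative case counts x0 and x1."""
    return (math.log(x1) - math.log(x0)) / (t1 - t0)

def fit_line(xs, ys):
    """Simple OLS for y = a + b*x, e.g., epidemic size on early growth rate."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b  # intercept, slope
```

in the paper's setting, `xs` would hold the four-week growth rates for the seven seasons and `ys` one of the characteristics (final size, days to peak, or length).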
we modeled the observed rsv cases using an extension of the sir model that included individuals (c for children and a for adults) that were susceptible (s_c and s_a), exposed (e_c and e_a), infectious (i_c and i_a), infectious and subsequently detected children (d), and recovered combined across children and adults (r). this seidr model was applied to a series of seven epidemic years. the population was split into children less than two years old (children) and those older than two (adults). it has been shown that the initial rsv infection is the most severe and occurs in almost every child in their first two years of life. transmission is modeled as a function of time using a cosine function to mirror the cyclic nature of epidemics [8] . there is an offset to this cycle (α), which we estimate along with the transmission parameter (β). births and deaths (μ) are accounted for in the susceptible class only. achievement of age two is accounted for in all age-separated classes (η). the assumptions of simple compartmental models that we made were as presented in koopman [15] . our seidr transmission model (figure 2) was defined using a system of non-linear differential equations, in which β was the transmission parameter, l the latency period, f the under-two detection fraction, and γ the recovery parameter. all parameters are presented in the next subsection with descriptions, ranges, and reference values from the literature. the solution of the set of differential equations is addressed below. to fit the seidr model to the empiric epidemic data, three parameters (latency period, birth and death rate, and recovery period) were specified based on the literature. three parameters associated with variation across epidemic years were estimated: 1) the temporal offset of the epidemic cycle (α), 2) the detection fraction (f), and 3) the transmission parameter (β). different models were specified to explore the effect of these three parameters.
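a minimal single-age sketch of such a system is given below; it omits the two age classes and the aging rate η, and its exact functional forms and parameter values are assumptions for illustration, not the paper's actual equations:

```python
import math

def seidr_rhs(t, y, beta, alpha, latency, gamma, f, mu=0.0):
    """Right-hand side of a simplified single-age SEIDR system.

    Cosine-forced transmission mirrors the annual epidemic cycle with
    offset alpha; a fraction f of newly infectious children enters the
    detected class D. Parameter names follow the text; the equations
    themselves are an illustrative sketch."""
    s, e, i, d, r = y
    n = s + e + i + d + r
    # seasonally forced transmission rate (illustrative functional form)
    beta_t = beta * (1.0 + math.cos(2.0 * math.pi * (t - alpha) / 365.0))
    infection = beta_t * s * (i + d) / n
    ds = mu * n - infection - mu * s      # births/deaths in S only
    de = infection - e / latency
    di = (1.0 - f) * e / latency - gamma * i
    dd = f * e / latency - gamma * d      # detected cases, fitted to data
    dr = gamma * (i + d)
    return [ds, de, di, dd, dr]
```

integrating this right-hand side with any ode solver (the paper used lsoda in r) yields the epidemic curves; the f·e/l inflow into d is the quantity compared against the observed daily case counts.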
all combinations of these were considered: models with one parameter allowed to vary across seasons, models with two parameters allowed to vary across seasons, and a model with all parameters allowed to vary across seasons. each parameter is described below. birth and death rate (μ): the number of daily births and deaths was entered in the model based on census data for salt lake county. it was assumed that 1/365th of the children in each age-separated compartment reached the age of two each day. detection fraction (f): the detection fraction parameter reflected the fraction of the rsv epidemic in children under two years old that was captured in our data set. it was estimated as a constant parameter across years and also allowed to vary by epidemic year. latency period (l): the latency period is the time between exposure resulting in transmission and the time of infectiousness. it was specified using the median value from crowcroft [16] , five days. transmission parameter (β): the transmission parameter determined the rate of transmission from contacts between infectious and susceptible individuals. we assumed a homogeneous, uniformly mixing population. it was estimated as a constant parameter across years and also allowed to vary by epidemic year. recovery parameter (γ): the recovery parameter specifies the time from infectiousness to recovery. it was specified as 0.1, which translates to a ten-day recovery period, following the work by weber [8] and in the range of one to 21 days reported by hall [17] . offset (α): the final model parameter was the offset of the annual epidemic cycle. a regular annual cycle is thought to vary due to weather and climate conditions. the seidr model captures the entire epidemic, detected and not detected. prior to observing rsv cases, the epidemic cycle started within the undetected population. this offset parameter was estimated as a constant parameter across years and also allowed to vary by epidemic year.
the nonlinear equations were solved using the lsoda function from the odesolve library [18] in r statistical software [19] . the parameters were estimated using a grid search. two fitting statistics were used: the estimates were the values that minimized the square root of the sum of standardized squared errors (rse) and/or the square root of the mean of the squared standardized errors (rmse). the denominator in these measures adjusted for the magnitude of the epidemic curve to avoid fitting the model mainly to the peak, where differences could over-inflate the fitting statistic and under-value differences during the early and late stages of the epidemic. the rmse reduces the effect of fit to the peak more than does the rse. a grid search was used, starting with an initial wide range of values for f, β, and α. the search grid was repeated with successively narrowing ranges to minimize the rse. the grid started with the range of reasonable values: 0 to 1 for β and f, and one to 200 days for α. the range was reduced and the resolution increased iteratively around the minimal rse and rmse values. the minimum grid resolution was 0.0001 for β, 0.01 for f, and one day for α. the rses and rmses from the grid search results were used to select the best parameter estimates within each model type (eg, one model type had only transmission rates that varied by epidemic year). the model with all three parameters allowed to vary by epidemic year was fit as a saturated model to provide a benchmark for rse and rmse, along with the schwarz criteria described below, and percent error in estimating epidemic size, when evaluating more parsimonious models in which only one of the three parameters was allowed to vary by epidemic year. multiple measures were used to compare the models, in part because the schwarz criteria assumed the residuals were independent and identically distributed, which was not the case; they are, in fact, autocorrelated.
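the successively narrowing grid search can be sketched as follows; the loss function, number of grid points, rounds, and shrink factor here are illustrative assumptions, not the study's actual settings:

```python
import itertools

def grid_search(loss, ranges, steps=11, rounds=3, shrink=0.25):
    """Iteratively narrowing grid search (a sketch of the fitting idea).

    `loss` maps a parameter tuple to a fit statistic (e.g. RSE); each
    round the grid is re-centred on the best point found so far and the
    half-width of every range shrinks to `shrink` of the old width."""
    best = None
    for _ in range(rounds):
        axes = [[lo + k * (hi - lo) / (steps - 1) for k in range(steps)]
                for lo, hi in ranges]
        for point in itertools.product(*axes):
            score = loss(point)
            if best is None or score < best[0]:
                best = (score, point)
        ranges = [(p - shrink * (hi - lo), p + shrink * (hi - lo))
                  for p, (lo, hi) in zip(best[1], ranges)]
    return best

# hypothetical quadratic loss with its minimum at (0.3, 0.7)
score, params = grid_search(lambda p: (p[0] - 0.3) ** 2 + (p[1] - 0.7) ** 2,
                            [(0.0, 1.0), (0.0, 1.0)])
```

in the study the same idea was run down to resolutions of 0.0001 for β, 0.01 for f, and one day for α.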
the schwarz information criterion [20] was calculated based on the weighted least squares method used for parameter estimation. there were n = 2555 data points, 365 days of case data for each of seven years, and k, the number of parameters estimated, was 28 in the full model (four parameters for seven years) and 16 in each other model (two parameters for seven years and two parameters overall). the schwarz criteria were calculated as bic = 2555 × ln(Σ_{j=1..7} m_j²) + 2k × ln(2555), where m_j represents either the rse or rmse fit statistic for epidemic year j [21] . the absolute values of the percent error in estimating total epidemic size were summed across seasons for comparison of models. the number of children with test-positive rsv infection ranged from 682 cases in 2004-5 to 1704 cases in 2007-8 (table 1) . the median size of the annual epidemic was 1113 cases. overall, 98% of cases were detected between the months of october and april. larger epidemics alternated with smaller epidemics. the amplitude of this biennial cycle was approximately 600 cases. the total number of children (under 18 years of age) tested per epidemic year ranged from approximately 3000 to 7000, with the number of tests increasing over time. overall, 21% of these were positive for rsv, varying according to the biennial cycle. of children tested, 81% were less than three years old and 95% were less than 11 years old. of children with positive tests, 92% were less than three years old and 99% were less than 11 years old. of the children tested, 70% were from salt lake county, and 77% of children with positive tests were from salt lake county. exponential growth rates calculated from cases accumulated for four weeks from the observed epidemic season start ranged from 0.034 to 0.081 (table 1) across the epidemic seasons. the effective reproductive numbers ranged from 1.27 to 1.49 using a serial interval of seven days [16] .
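the criterion as written above can be sketched directly (assuming the per-year fit statistics m_j are supplied as a list):

```python
import math

def schwarz_bic(fit_stats, k, n=2555):
    """Schwarz criterion from the per-year fit statistics m_j
    (RSE or RMSE), with k estimated parameters and n data points."""
    return n * math.log(sum(m * m for m in fit_stats)) + 2 * k * math.log(n)
```

for the study's models, k would be 28 for the saturated model and 16 otherwise.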
in regression analyses (table 2) , the four-week exponential growth rate exhibited a substantial positive correlation with epidemic size (r = 0.69, p = 0.08), and was negatively correlated with start day (r = -0.43, p-value = 0.33), days to peak (r = -0.44, p-value = 0.32), and length of the epidemic (r = -0.58, p-value = 0.17). the regression models provided estimates of epidemic season characteristics that were on average within 16% of observed epidemic season size, 11% of observed days to peak, and 8% of observed epidemic length. using exponential growth rates calculated from weeks one through six provided, in general, increasing correlation (table 3) . (table 1: observed rsv epidemic size, start date, days to peak, duration, and 4-week exponential growth.) the saturated seidr model was fit to seven epidemic years of observed rsv data, with epidemic-year-specific rse values that ranged from 13 to 21, rmse values that ranged from 0.40 to 0.77, and percent error of total cases that ranged from 1% to 16%. the fit statistics for the models with either the transmission parameter or the detection fraction estimated as a constant across epidemic years did not differ substantially from those from the saturated model (table 4) . the minimum-rse model with detection fraction held constant across epidemic years had the smallest percent error, the smallest schwarz rse criterion, and other fit statistics nearly equal to the saturated model. the minimum-rmse models were, in general, fitting to the tails of the epidemic and resulted in large errors in estimating epidemic size. the pattern of variation in estimates of the offset from all models matched the biennial-cycle variation in total epidemic size across epidemic years (figure 3) . the variation in estimates of the transmission parameter and detection fraction did not necessarily match this cycle for all epidemic years. the parameter estimates for the transmission parameter were negatively correlated with total epidemic size.
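the "on average within 16%" style summaries come from averaging the absolute percent errors across the seven seasons; a minimal sketch, with hypothetical observed and predicted values:

```python
def mean_absolute_percent_error(observed, predicted):
    """Average of |percent error| across seasons, as used to summarize
    how closely regression estimates matched observed characteristics."""
    errors = [abs(p - o) / o * 100.0 for o, p in zip(observed, predicted)]
    return sum(errors) / len(errors)

# hypothetical observed vs regression-predicted epidemic sizes
mape = mean_absolute_percent_error([1000, 1500], [900, 1650])
```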
the seidr model we presented made assumptions that simplified the reality of rsv transmission. we have identified three limitations to the seidr modeling effort. first, the population age separation does not take full advantage of differences in interaction among a non-homogeneous population. second, and related to this, the parameter values were not allowed to vary within the population. transmission, for instance, could be age-dependent (due, eg, to hand-washing habits). third, the grid search method of parameter estimation did not provide estimated standard errors for the parameter estimates, which limited the ability to compare models and seasons. despite these limitations, this seidr model was useful; it modeled the observed rsv cases from pcmc as part of larger unobserved epidemic seasons and provided a framework for investigating the model parameters. the parameters offset and transmission may not be completely identifiable within this framework, but more likely represent a combination of other forces not measured here. our future work includes addressing these limitations and expanding the complexity of the models. rsv is carried by all age groups but is, in general, only a concern for infants. thus, an age-stratified model, possibly with different mixing mechanisms, would more closely resemble the true transmission. the biennial cycle of large, early, and short seasonal epidemics followed by smaller, later, and longer seasonal epidemics the next year observed in utah is similar to other published studies of seasonal rsv epidemics in temperate climates. the theories for this phenomenon include the existence and switching of two rsv disease strains, climate patterns, and waning immunity after infection [6, 8, 9, 22-24] . these and other theories could be investigated in more complex models. it is understood that immunity after infection with rsv is partial, at best.
this incomplete immunity and the severity of re-infections could be incorporated into more complex models [8, 25] . finally, future modeling efforts will involve approaches that include measures of uncertainty in parameter estimates, including bayesian methods [26, 27] and likelihood and other methods [28, 29] . the first main conclusion of this work was that exponential growth was somewhat empirically related to seasonal epidemic characteristics. the variations in epidemic seasons from data collected at pcmc during the seven years of the study can be partially explained by the variation in exponential growth, especially for the characteristics of epidemic size, peak day, and length of the epidemic. the seven years of data were not sufficient to make conclusive statements on the nature of the relationships. these early findings, based on just seven data points, can be built upon to explore early prediction of the upcoming rsv epidemic season. (table 2: results of regression analysis using exponential growth to predict epidemic size, days to peak, and length.) these early predictions could be used by hospitals to budget and allocate resources and to coordinate the timing of palivizumab treatment. they can be used by public health to advise clinicians and the public and also to help identify nonstandard epidemics earlier in the season. for example, health departments might take specific actions if the number of observed cases during the season greatly exceeds early predictions. the second main conclusion of this work was that variation of the transmission parameter and the start of the epidemic (offset) over epidemic years could explain the variation in seasonal epidemic size. the three model parameters allowed to vary by epidemic year (detection fraction, transmission parameter, and offset) provided a possible rationale for the variation in seasonal epidemic size.
the model with detection fraction held constant across epidemic years fit the observed data well with the fewest parameters. the parameter estimates from this model also match the expected biennial pattern of the epidemic years. of the models considered in this study, this one performs best overall (figure 4 ). cb helped conduct the literature review and write the introduction and discussion sections of the text. ms conceived the study and directed its implementation, including contributions to all sections of the text. all authors read and approved the final manuscript.
references:
respiratory syncytial virus epidemics: the ups and downs of a seasonal virus
prospective population-based study of viral lower respiratory tract infections in children under 3 years of age (the pride study)
recent trends in severe respiratory syncytial virus (rsv) among us infants
economic impact of respiratory syncytial virus-related illness in the us: an analysis of national databases
bronchiolitis-associated hospitalizations among us children
defining the timing of respiratory syncytial virus (rsv) outbreaks: an epidemiological study
variation in timing of respiratory syncytial virus outbreaks: lessons from national surveillance
modeling epidemics caused by respiratory syncytial virus (rsv)
understanding the transmission dynamics of respiratory syncytial virus using multiple time series and nested models
transmission dynamics and control of severe acute respiratory syndrome
invited commentary: real-time tracking of control measures for emerging infections
different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures
census bureau population division - counties
census bureau population division - states
modeling infection transmission
respiratory syncytial virus infection in infants admitted to paediatric intensive care units in london, and in their families
respiratory syncytial virus infections in infants: quantitation and duration of shedding
solvers for ordinary differential equations. r package version 0.5-18
r development core team: r: a language and environment for statistical computing. vienna, austria: r foundation for statistical computing
generalizing the derivation of the schwarz information criterion
multiexponential, multicompartmental and noncompartmental modeling, ii: data analysis and statistical considerations
pattern of respiratory syncytial virus epidemics in finland: two-year cycles with alternating prevalence of groups a and b
occurrence of groups a and b of respiratory syncytial virus over 15 years: associated epidemiologic and clinical characteristics in hospitalized and ambulatory children
the incidence of infectious diseases under the influence of seasonal fluctuation
a stochastic method for solving inverse problems in epidemic modeling
bayesian inference for partially observed stochastic epidemics
predicting case numbers during infectious disease outbreaks when some cases are undiagnosed
inference for nonlinear dynamical systems
statistical challenges of epidemic data
partial support for this work was provided by the public health services research grant ul1-rr025764 from the national center for research resources, nih/niaid 1 u01 ai074419 and u01-a1061611, us cdc #1 po1 cd000284, and the nih/eunice kennedy shriver nichd k24-hd047249.
author details: 1 division of epidemiology, university of utah school of medicine, salt lake city, usa. 2 department of pediatrics, university of utah school of medicine, salt lake city, usa. 3 division of disease control and prevention, utah department of health, salt lake city, usa.
authors' contributions: ml performed the analysis and wrote the bulk of the manuscript. pg helped to conceive the study and prepare the data and also wrote a large part of the introduction, methods, and discussion sections of the text.
tg advised on the design of the study's analysis and helped prepare the methods and results sections of the text. nw acquired and managed the data. ag provided clinical insight and helped conduct the literature review. rr helped write the introduction and discussion sections of the text, providing a public health perspective to the study. the authors declare that they have no competing interests. key: cord-321852-e7369brf authors: wang, bo; jin, shuo; yan, qingsen; xu, haibo; luo, chuan; wei, lai; zhao, wei; hou, xuexue; ma, wenshuo; xu, zhengqing; zheng, zhuozhao; sun, wenbo; lan, lan; zhang, wei; mu, xiangdong; shi, chenxi; wang, zhongxiao; lee, jihae; jin, zijian; lin, minggui; jin, hongbo; zhang, liang; guo, jun; zhao, benqi; ren, zhizhong; wang, shuhao; xu, wei; wang, xinghuan; wang, jianming; you, zheng; dong, jiahong title: ai-assisted ct imaging analysis for covid-19 screening: building and deploying a medical ai system date: 2020-11-10 journal: appl soft comput doi: 10.1016/j.asoc.2020.106897 sha: doc_id: 321852 cord_uid: e7369brf the sudden outbreak of novel coronavirus 2019 (covid-19) increased the diagnostic burden of radiologists. in the time of an epidemic crisis, we hope artificial intelligence (ai) can reduce physician workload in regions with the outbreak, and improve the diagnosis accuracy for physicians before they could acquire enough experience with the new disease. in this paper, we present our experience in building and deploying an ai system that automatically analyzes ct images and provides the probability of infection to rapidly detect covid-19 pneumonia. the proposed system, which consists of classification and segmentation components, will save about 30%–40% of the detection time for physicians and promote the performance of covid-19 detection. specifically, working in an interdisciplinary team of over 30 people with medical and/or ai background, geographically distributed in beijing and wuhan, we are able to overcome a series of challenges (e.g. data discrepancy, testing time-effectiveness of model, data security, etc.)
in this particular situation and deploy the system in four weeks. in addition, since the proposed ai system provides the priority of each ct image with its probability of infection, the physicians can confirm and segregate the infected patients in time. using 1,136 training cases (723 positives for covid-19) from five hospitals, we are able to achieve a sensitivity of 0.974 and specificity of 0.922 on the test dataset, which included a variety of pulmonary diseases. covid-19 started to spread in january 2020. by early march 2020, it had infected over 100,000 people worldwide [1] . the virus most commonly causes little or no symptoms, but can also lead to a rapidly progressive and often fatal pneumonia in 2-8% of those infected. covid-19 causes acute respiratory distress syndrome in patients [2, 3] . laboratory confirmation of sars-cov-2 is performed with a virus-specific rt-pcr, but this test has several challenges, including high false-negative rates, delays in processing, variability in test techniques, and sensitivity sometimes reported as low as 60-70%. ct images can show the characteristics of each stage of disease onset and evolution. although rapid diagnosis of covid-19 still presents many challenges, chest ct shows some typical imaging features. the preliminary prospective analysis by huang et al. [2] showed that all 41 patients in the study had abnormal chest ct, with bilateral ground-glass lung opacities in subpleural areas of the lungs. many recent studies [4, 5, 6, 7, 8] also viewed chest ct as a low-cost, accurate and efficient method for novel coronavirus pneumonia diagnosis. the official guidelines for covid-19 diagnosis and treatment (7th edition) by china's national health commission [9] also listed the chest ct result as one of the main clinical features. ct evaluation has been an important approach to evaluate patients with suspected or confirmed covid-19 in multiple centers in wuhan, china, and northern italy.
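the sensitivity and specificity figures quoted above are the usual confusion-matrix ratios; a minimal sketch (the counts in the usage example are hypothetical, not the paper's test-set numbers):

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# hypothetical test-set counts for illustration
sens, spec = sensitivity_specificity(tp=97, fn=3, tn=92, fp=8)
```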
the sudden outbreak of covid-19 overwhelmed health care facilities in the wuhan area. hospitals in wuhan had to invest significant resources to screen suspected patients, further increasing the burden of radiologists. as ji et al. [10] pointed out, there was a significant positive correlation between covid-19 mortality and health-care burden. it was essential to reduce the workload of clinicians and radiologists and enable patients to get early diagnoses and timely treatments. in a large country like china, it is nearly impossible to train such a large number of experienced physicians in time to screen this novel disease, especially in regions without an outbreak yet. to handle this dilemma, in this research, we present our experience in developing and deploying an artificial intelligence (ai) based method to assist novel coronavirus pneumonia screening using ct imaging. at present, physicians obtain an id from the hospital information system (his), then assess the corresponding ct images from the picture archiving and communication systems (pacs), and return a conclusion on the ct images to the his. due to the rapid increase in the number of new and suspected covid-19 cases, we want to read the ct images in order of importance (i.e., high-risk patients first). however, since the ids from the his are assigned by capturing time, an ai system that simply follows these ids on the pacs processes patients in arrival order. therefore, it still takes a large amount of time to detect covid-19 patients, which delays the treatment of severely infected covid-19 patients. in this paper, we introduce an ai system that automatically provides the probability of infection and the ranked ids. specifically, the proposed system, which consists of classification and segmentation components, will save about 30-40% of the detection time for physicians and promote the performance of covid-19 detection.
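the re-ranking of reading order by predicted risk can be sketched as follows; the study ids and probabilities are hypothetical, and this is only an illustration of the idea, not the paper's implementation:

```python
from dataclasses import dataclass, field
import heapq

@dataclass(order=True)
class Study:
    priority: float
    study_id: str = field(compare=False)

def rank_by_risk(predictions):
    """Order CT studies so the highest predicted infection probability
    is read first, instead of the HIS capture-time order.

    `predictions` maps a study id to the model's probability of COVID-19;
    negated probabilities turn Python's min-heap into a max-priority queue."""
    heap = [Study(-p, sid) for sid, p in predictions.items()]
    heapq.heapify(heap)
    ordered = []
    while heap:
        ordered.append(heapq.heappop(heap).study_id)
    return ordered

# hypothetical worklist: three studies with model-assigned probabilities
reading_order = rank_by_risk({"p01": 0.12, "p02": 0.94, "p03": 0.55})
```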
the classification subsystem tries to give the probability of having covid-19 for each sample, while the segmentation subsystem highlights the position of the suspected area. in addition, training ai models requires a lot of samples. however, at the beginning of a new epidemic, there were not many positive cases confirmed by nucleic acid test (nat). to build a dataset for detecting covid-19, we collect 877 samples from 5 hospitals. all imaging data come from covid-19 patients confirmed by nat who underwent lung ct scans. this requirement also ensured that the image data had the diagnostic characteristics. based on these samples, we employ experienced annotators to annotate all the samples. while it is easy to distinguish pneumonia from healthy cases, it is nontrivial for the model to distinguish covid-19 from other pulmonary diseases, which is the top clinical requirement. thus, we add other pulmonary diseases to the proposed dataset. using the dataset, we train and evaluate several deep learning based models to detect and segment the covid-19 regions. finally, the construction of the ai model included four stages: 1) data collection; 2) data annotation; 3) model training and evaluation; and 4) model deployment. the contributions of this paper can be summarised as follows:
• in this paper, we present our experience in building and deploying an ai system that automatically analyzes ct images to rapidly detect covid-19 pneumonia.
• we build a new dataset on top of real images, with labelled lung contours and infection regions, to promote the development of covid-19 detection.
• the proposed ai system can reduce physician workload in regions with the outbreak by prioritizing cases by disease severity, and improve the diagnosis accuracy for physicians.
• the proposed ai system has been deployed in 16 hospitals and can provide professional deployment service on-premise at the hospitals.
starting from the introduction in section 1, the paper is organized as follows: section 2 gives related work on data acquisition, lung segmentation and ai-assisted diagnosis; section 3 describes the details of the proposed method; the experimental results and ablation studies are presented in section 4; section 5 discusses the advantages and disadvantages of the method; and the paper is concluded in section 6. in this section we introduce some related studies on ai-assisted diagnosis techniques for covid-19, including data collection, medical image segmentation, and diagnosis. the very first step of building an ai-assisted diagnosis system for covid-19 is image acquisition, in which chest x-ray and ct images are most widely used. there are more applications using ct images for covid-19 diagnosis [11, 12, 13, 14] , since the analysis and segmentation of ct images are usually more precise and efficient than for x-ray images. recently, there has been some progress on covid-19 dataset construction. zhao et al. [15] build a covid-ct dataset which includes 288 ct slices of confirmed covid-19 patients, collected from about 700 covid-19 related publications on medrxiv and biorxiv. the coronacases initiative releases ct images of 10 confirmed covid-19 patients on its website [16] . the covid-19 ct segmentation dataset [17] is also a publicly available dataset. it contains 100 axial ct slices from 60 confirmed covid-19 patients, and all the ct slices are manually annotated with segmentation labels. besides, cohen et al. [18] collect 123 frontal-view x-rays from publications and websites and build the covid-19 image data collection. some efforts have been made on contactless data acquisition to reduce the risk of infection during the covid-19 pandemic [19, 20, 21] . for example, an automated scanning workflow equipped with a mobile ct platform is built [19] , in which the mobile ct platform has more flexible access to patients.
during ct data acquisition, the positioning and scanning of patients are operated remotely by a technician. medical image segmentation with deep neural networks [22, 23, 24, 25, 26, 27] plays an important role in ai-assisted covid-19 analysis in many works. it highlights the regions of interest (rois) in ct or x-ray images for further examination. the segmentation tasks in covid-19 applications can be divided into two groups: lung region segmentation and lung lesion segmentation. in lung region segmentation, the whole lung region is separated from the background, while in lung lesion segmentation tasks the lesion areas are distinguished from other lung areas. lung region segmentation is often executed as a preprocessing step in ct segmentation tasks [28, 29, 30] , in order to decrease the difficulty of lesion segmentation. several widely used segmentation models are applied in covid-19 diagnosis systems, such as u-net [31] , v-net [14] and u-net++ [32] . among them, u-net is a fully convolutional network in which skip connections are employed to fuse information from multi-resolution layers. v-net adopts a volumetric, fully convolutional neural network and achieves 3d image segmentation. vb-net [33] replaces the conventional convolutional layers inside the down block and up block with the bottleneck, to achieve promising and efficient segmentation results. u-net++ is composed of deeply-supervised encoder and decoder sub-networks. nested skip connections are equipped to connect the two sub-networks, which could increase the segmentation performance. in ai-assisted covid-19 analysis applications, li et al. [12] develop a u-net based segmentation system to distinguish covid-19 from community-acquired pneumonia on ct images. qi et al. [34] also build a u-net based segmentation model to separate lung lesions and extract the radiologic characteristics in order to predict the hospital stay of a patient. shan et al.
[35] propose a vb-net based segmentation system to segment lung, lung lobes and lung lesion regions. the segmentation results can also provide accurate quantification data for further study of covid-19. chen et al. [36] train a u-net++ based segmentation model to segment covid-19 related lesions. medical imaging ai systems such as disease classification and segmentation are increasingly inspired and transformed from computer vision based ai systems. morteza et al. [37] propose a data-driven model that recommends the necessary set of diagnostic procedures based on the patients' most recent clinical record extracted from the electronic health record (ehr). this has the potential to enable health systems to expand timely access to initial medical specialty diagnostic workups for patients. gu et al. [38] propose a series of collaborative techniques to engage human pathologists with ai given ai's capabilities and limitations, based on which they prototype impetus, a tool where an ai takes various degrees of initiative to provide various forms of assistance to a pathologist in detecting tumors from histological slides. samaniego et al. [39] propose a blockchain-based solution to enable distributed data access management in computer-aided diagnosis (cad) systems. this solution has been developed as a distributed application (dapp) using ethereum in a consortium network. li et al. [40] develop a visual analytics system that compares multiple models' prediction criteria and evaluates their consistency. with this system, users can generate knowledge on different models' inner criteria and how confidently we can rely on each model's prediction for a certain patient. ai-assisted covid-19 diagnosis based on ct and x-ray images could accelerate the diagnosis and decrease the burden of radiologists, and is thus highly desired in the covid-19 pandemic. a series of models which can distinguish covid-19 from other pneumonia and diseases have been widely explored.
the ai-assisted diagnosis systems can be grouped into two categories, i.e., x-ray based and ct based covid-19 screening systems. among x-ray based ai-assisted systems, ghoshal et al. [41] develop a bayesian convolutional neural network to measure the diagnosis uncertainty of covid-19 prediction. narin et al. [42] develop three widely used models, i.e., resnet-50 [43] , inception-v3 [44] , and inception-resnet-v2 [45] , to detect covid-19 lesions in x-ray images, and among them resnet-50 achieves the best classification performance. zhang et al. [46] present a resnet based model to detect covid-19 lesions from x-ray images. this model can provide an anomaly score to help optimize the classification between covid-19 and non-covid-19. although x-ray is the typical imaging modality in pulmonary disease diagnosis, x-ray images are usually not as sensitive as 3d ct images. besides, the positive covid-19 x-ray data in these studies are mainly from one online dataset [18] , which contains only a limited number of x-ray images from confirmed covid-19 patients. this lack of data could affect the generalization of the diagnosis systems. as for ct based ai-assisted diagnosis, a series of approaches with different frameworks have been proposed. some approaches employ a single model to determine the presence of covid-19 or certain other diseases in ct images. ying et al. [11] propose deeppneumonia, a resnet-50 based ct diagnosis system, to distinguish covid-19 patients from bacterial pneumonia patients and healthy people. jin et al. [47] build a 2d cnn based model to segment the lung and then identify slices of covid-19 cases. li et al. [12] propose covnet, a resnet-50 based model applied to 2d slices with shared weights, to discriminate covid-19 from community-acquired pneumonia and non-pneumonia. shi et al.
[13] apply VB-Net [33] to segment CT images into the left and right lungs, 5 lung lobes, and 18 pulmonary segments, then select hand-crafted features to train a random-forest model for diagnosis. Other works follow a segmentation-then-classification mechanism. For instance, Xu et al. propose a model to distinguish COVID-19 patients, influenza-A patients, and healthy people: the lung lesion region in a CT image is first extracted using V-Net, and the type of each lesion region is then determined with ResNet-18. Zheng et al. [49] propose DeCoVNet, a combination of a U-Net [31] model and a 3D CNN model: the U-Net performs lung segmentation, and its results are fed into the 3D CNN to predict the probability that COVID-19 is present. As shown in Figure 1, the construction of our AI model comprised four stages: 1) data collection; 2) data annotation; 3) model training and evaluation; and 4) model deployment. As we accumulated data, we iterated through these stages to continuously improve model performance. Our dataset was obtained from 5 hospitals (see Table 1). Most of the 877 positive cases were from hospitals in Wuhan, while half of the 541 negative cases were from hospitals in Beijing. All positive samples were collected from confirmed patients following China's national diagnostic and treatment guidelines at the time of diagnosis, which required a positive NAT result. The positive cases offered a good sample of confirmed cases in Wuhan, covering different age and gender groups (see Figure 4). We also collected many CT images with other conditions, e.g., common pneumonia, viral pneumonia, fungal pneumonia, tumors, emphysema, and other lung lesions. To choose reasonable negative cases, we asked several senior physicians to manually confirm each case; based on their experience, they selected negative cases with characteristics similar to COVID-19.
Finally, we also had 450 cases of other known lung diseases whose CT imaging features resemble COVID-19 to some extent (see Figure 5). The hospitals used different models of CT equipment from different manufacturers (see Table 2). Due to the shortage of CT scanners in Wuhan hospitals, slice thicknesses varied from 0.625 mm to 10 mm, with the majority (81%) under 2 mm; we believe this variety helped improve the generalizability of our model in real deployment. In addition, we removed personally identifiable information (PII) from all CT scans to protect patients' privacy. We randomly divided the whole dataset into a training set and a test set for each model (see Table 4). To train the models, a team of six data annotators annotated the lesion regions (if any), lung boundaries, and lung lobes on the transverse-section slices of all CT samples. Saving radiologists' time was essential during the epidemic outbreak, so our data annotators performed most of the annotation work, and we relied on a three-step quality-inspection process to achieve reasonable annotation accuracy. All annotators had a radiology background, and before annotating they completed a four-day hands-on training course led by a senior radiologist with clinical experience of COVID-19. This three-step quality-inspection process was the key to obtaining high-quality annotations. We divided the six-annotator team into a group of four (group A) and a group of two (group B). Step 1: group A made all the initial annotations, and group B performed a back-to-back quality check, i.e., each of the two members of group B checked all the annotations independently and then compared results. The pass rate for this initial inspection was 80%; the failing cases mainly had minor errors such as missed small lesion regions or inexact boundary shapes. Step 2: group A revised the annotations, and group B rechecked them.
This process continued until all annotations passed the back-to-back quality test within the two-person group. Step 3: once a batch of data was annotated and had passed the first two steps, senior radiologists randomly checked 30% of the revised annotations in each batch. We observed a pass rate of 100% in this step, indicating reasonable annotation quality. Of course, some errors may remain, and we relied on the model-training process to tolerate these random errors. We performed the following preprocessing steps before using the data for training and testing. 1) Since different samples had different resolutions and slice thicknesses, we first normalized them to (1, 1, 2.5) mm spacing using standard interpolation algorithms (e.g., nearest-neighbour, bilinear, and cubic interpolation [50, 51]); we used cubic interpolation for better image quality. 2) We adjusted the window width (WW) and window level (WL) for each model, generating three image sets, each with a specific window setting. For brevity we used the [min, max] interval format in our code: [-150, 350] for the lung-region segmentation model, and [-1024, 350] for both the lesion-segmentation and classification models. 3) We ran the lung-segmentation model to extract the lung areas from each image and used only these extracted regions in subsequent steps. 4) We normalized all values to the range [0, 1]. 5) We applied typical data-augmentation techniques [52, 53] to increase the diversity of the data; for example, we randomly flipped, panned, and zoomed images, which has been shown to improve the generalization of trained models. Our model was a combination of a segmentation model and a classification model.
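Preprocessing steps 2) and 4) above amount to clipping each voxel's Hounsfield-unit value to a window and rescaling it linearly. A minimal sketch of that transform, assuming the function name and the example voxel values are illustrative rather than taken from the paper's code:

```python
def window_normalize(hu_values, wl_min, wl_max):
    """Clip Hounsfield-unit values to the [wl_min, wl_max] window and rescale
    the result linearly to [0, 1], as in preprocessing steps 2) and 4)."""
    span = float(wl_max - wl_min)
    out = []
    for v in hu_values:
        v = min(max(v, wl_min), wl_max)  # clip to the window
        out.append((v - wl_min) / span)  # rescale to [0, 1]
    return out

# The paper's window for the lesion-segmentation and classification models:
normalized = window_normalize([-2000, -1024, -500, 0, 350, 1200], -1024, 350)
```

The lung-region segmentation model would use the narrower [-150, 350] window in exactly the same way.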
Specifically, we used the segmentation model to obtain lung lesion regions, and then the classification model to determine whether each lesion region was COVID-19-like. We selected both models empirically by training and testing all models in our previously developed model library. For the segmentation task, we considered several widely used segmentation models: fully convolutional networks (FCN-8s) [54], U-Net [31], V-Net [14], and 3D U-Net++ [32]. FCN-8s [54] is a "fully convolutional" network in which all fully connected layers are replaced by convolution layers, so its input can have arbitrary size. FCN-8s introduced a novel skip architecture to fuse information from multi-resolution layers: upsampled feature maps from higher layers are combined with feature maps skipped from the encoder, improving the spatial precision of the segmentation details. Similar to FCN-8s, U-Net [31] is a variant of the encoder-decoder architecture that also employs skip connections. The U-Net encoder uses multi-stage convolutions to capture context features, and the decoder uses multi-stage convolutions to fuse them; a skip connection at every decoder stage helps recover the full spatial resolution of the network output, making U-Net more precise and thus well suited to biomedical image segmentation. V-Net [14] is a 3D segmentation approach in which volumetric convolutions are applied instead of processing the input volume slice-wise. V-Net adopts a volumetric, fully convolutional neural network and can be trained end-to-end; a novel objective function based on the Dice coefficient between the predicted segmentation and the ground-truth annotation copes with the imbalance between the numbers of foreground and background voxels.
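V-Net's Dice-based objective can be sketched in a few lines. This is a minimal soft-Dice loss over flattened voxel lists, not the paper's implementation; the smoothing term `eps` is an implementation detail I am assuming, since the text does not specify one:

```python
def soft_dice_loss(pred, target, eps=1e-6):
    """1 - 2|P.T| / (|P|^2 + |T|^2): low when prediction and ground truth
    overlap well, regardless of how small the foreground class is."""
    inter = sum(p * t for p, t in zip(pred, target))
    denom = sum(p * p for p in pred) + sum(t * t for t in target)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)
```

Because both numerator and denominator scale with the foreground size, the loss does not collapse when background voxels vastly outnumber lesion voxels, which is the imbalance the V-Net authors targeted.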
3D U-Net++ [32] is an effective segmentation architecture composed of deeply supervised encoder and decoder sub-networks. Concretely, a series of nested, dense, re-designed skip pathways connects the two sub-networks, reducing the semantic gap between the feature maps of the encoder and the decoder. By integrating multi-scale information, the 3D U-Net++ model can use semantic and texture information simultaneously to make correct predictions. In addition, deep supervision enables more accurate segmentation, particularly of lesion regions. The re-designed skip pathways and deep supervision distinguish U-Net++ from U-Net and help it recover the fine details of target objects in biomedical images; allowing 3D inputs also captures inter-slice features and produces dense volumetric segmentations. For all the segmentation models, we used a patch size (i.e., the input image size to the model) of (256, 256, 128). The positive data for the segmentation models were images with any lung lesion regions, regardless of whether the lesions were COVID-19; the model then made per-pixel predictions of whether each pixel lay within a lung lesion region. For the classification task, we evaluated several state-of-the-art classification models: ResNet-50 [43], Inception networks [55, 44, 45], DPN-92 [56], and Attention ResNet-50 [57]. Residual networks (ResNet) [43] are widely used deep learning models built on a deep residual learning framework. A ResNet is composed of residual blocks whose shortcut connections element-wise combine the block's input features with its output. These connections help higher layers access information from distant bottom layers and effectively alleviate the vanishing-gradient problem, since they backpropagate gradients to the bottom layers without diminishing their magnitude.
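The shortcut connection just described can be sketched framework-free; in this sketch `transform` stands in for the block's convolutional layers, an abstraction of mine rather than anything specified in the text:

```python
def residual_block(x, transform):
    """Compute y = F(x) + x: the block's transform output is element-wise
    added to its input, so gradients can flow straight through the shortcut."""
    fx = transform(x)
    return [a + b for a, b in zip(fx, x)]

# With a transform that outputs all zeros, the block reduces to the identity,
# which is what makes very deep residual stacks easy to optimize.
identity_out = residual_block([1.0, 2.0, 3.0], lambda v: [0.0] * len(v))
```

The additive shortcut is the reason the gradient reaches early layers undiminished: the derivative of `y` with respect to `x` always contains an identity term.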
For this reason, ResNet can be made deeper and more accurate; here we used the 50-layer variant, ResNet-50. The Inception family [55, 44, 45] has evolved considerably over time, but its members share an inherent split-transform-merge strategy: the input of an Inception module is split into a few lower-dimensional embeddings, transformed by a set of specialized filters, and merged by concatenation. This split-transform-merge behaviour is expected to approach the representational power of large, dense layers at considerably lower computational complexity. Dual Path Network (DPN-92) [56] is a modularized classification network with a new internal connection topology: it shares common features while maintaining the flexibility to explore new features via dual-path architectures, realizing effective feature reuse and exploration. Compared with other advanced classification models such as ResNet-50, DPN-92 has higher parameter efficiency and is easier to optimize. The Residual Attention Network (Attention ResNet) [57] is a classification model that adopts an attention mechanism, generating adaptive attention-aware features by stacking attention modules. To extract valuable features, the attention-aware features from different attention modules change adaptively as the layers go deeper; in this way, meaningful areas of the images are enhanced while invalid information is suppressed. We used the 50-layer variant, Attention ResNet-50. All the classification models took dual-channel input: each lesion region and its corresponding segmentation mask (obtained from the preceding segmentation model) were sent into the classification model together, which then produced the classification result (positive or negative).
For neural-network training, we trained all models from scratch with randomly initialized parameters. Table 4 describes the training and test data distribution for both the segmentation and classification tasks. We trained the models on a server with eight NVIDIA Titan RTX GPUs using the PyTorch [58] framework, with the Adam optimizer, an initial learning rate of 1e-4, and a learning-rate decay of 5e-4. We deployed the trained models on workstations installed on premise at the hospitals; a typical workstation contained an Intel Xeon E5-2680 CPU, an Intel I210 NIC, two Titan X GPUs, and 64 GB of RAM (see Figure 6). The server imported images from the hospital's picture archiving and communication system (PACS) and displayed the results iteratively; it also automatically checked for model/software updates and installed them, so we could update the models remotely. We used the Dice coefficient to evaluate the segmentation tasks and the area under the curve (AUC) to evaluate the classification tasks, and we additionally analyzed the selected best classification model in terms of sensitivity and specificity. Concretely, the Dice coefficient is twice the area of overlap divided by the total number of pixels in both images, and is widely used to measure segmentation quality in medical imaging. AUC denotes the "area under the ROC curve", where ROC stands for "receiver operating characteristic". The ROC curve is drawn by plotting the true-positive rate against the false-positive rate at different classification thresholds; the AUC is then the two-dimensional area under the entire ROC curve from (0, 0) to (1, 1), which provides an aggregate measure of classifier performance across discrimination thresholds.
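The evaluation metrics named above all have short closed forms. A plain-Python sketch (the AUC shown here uses the standard rank-sum / Mann-Whitney formulation, which is mathematically equivalent to the area under the ROC curve; it is my restatement, not the paper's code):

```python
def dice_coefficient(a, b):
    """2*|A intersect B| / (|A| + |B|) for binary masks given as 0/1 lists."""
    inter = sum(x * y for x, y in zip(a, b))
    return 2.0 * inter / (sum(a) + sum(b))

def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random
    negative; tied scores count as half a win."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 * (p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(pred, labels):
    """True-positive rate and true-negative rate of binary predictions."""
    tp = sum(1 for p, l in zip(pred, labels) if p == 1 and l == 1)
    tn = sum(1 for p, l in zip(pred, labels) if p == 0 and l == 0)
    return tp / labels.count(1), tn / labels.count(0)
```

A perfectly ranked classifier scores AUC 1.0, a random one about 0.5, which is why AUC summarizes performance across all thresholds at once.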
Sensitivity / specificity, also known as the true-positive / true-negative rate, measures the fraction of positives / negatives correctly identified as positive / negative. Five qualified physicians (three from hospitals in Wuhan, two from hospitals in Beijing) participated in the reader study. Four were attending physicians with an average of five working years, and the fifth was an associate chief physician with eighteen working years. For this reader study, we generated a new dataset; both the physicians and the AI system performed the diagnosis purely from CT images. We proposed a combined "segmentation - classification" model pipeline, which highlights the lesion regions in addition to producing the screening result. The pipeline was divided into two stages, 3D segmentation and classification, and leveraged the model library we had previously developed. This library contains state-of-the-art segmentation models such as the fully convolutional network (FCN-8s) [54], U-Net [31], V-Net [14], and 3D U-Net++ [32], as well as classification models such as the Dual Path Network (DPN-92) [56], Inception-v3 [44], Residual Network (ResNet-50) [59], and Attention ResNet-50 [57]. We selected the best diagnosis model by empirically training and evaluating the models in the library. The latest segmentation model was trained on 732 cases (704 containing inflammation or tumors); the 3D U-Net++ model obtained the highest Dice coefficient of 0.754, and Table 3 shows the detailed segmentation performance. Fixing the segmentation model as 3D U-Net++, we used 1,136 cases (723 positive) / 282 cases (154 positive) to train / test the classification and combined models; the detailed data distribution is given in Tables 4, 5 and 6. Figure 2(a) shows the receiver operating characteristic (ROC) curves of the four combined models. The "3D U-Net++ - ResNet-50" combination achieved the best area under the curve (AUC) of 0.991.
Figure 2(a) marks the best model with a star; it achieved a sensitivity of 0.974 and a specificity of 0.922. The performance of the model improved steadily as training data accumulated. In practice, the model was continually retrained in multiple stages (on average about three days between stages). Table 7 shows the training dataset used at each stage, and Figure 2(b) shows the improvement of the ROC curves from stage to stage. At the first stage, the AUC reached 0.931 using 226 training cases; by the last stage, the AUC reached 0.991 with 1,136 training cases, which is sufficient for clinical applications. With the model's predictions, physicians could acquire insightful information from the highlighted lesion regions in the user interface; Figure 2(c) shows some examples. The model identified typical lesion characteristics of COVID-19 pneumonia, including ground-glass opacity, intralobular septal thickening, air bronchogram sign, vessel thickening, crazy-paving pattern, fibre stripes, and honeycomb-lung syndrome. The model also picked out abnormal regions in cases with negative classifications, such as lobular pneumonia and neoplastic lesions. Given these highlighted results, it was necessary to study the false-positive and false-negative predictions, shown in Figure 2(c). Most notably, the model sometimes missed positive cases with patchy ground-glass opacities less than 1 cm in diameter. The model could also produce false positives on other types of viral pneumonia, for instance lobar pneumonia, which has similar CT features. It also did not perform well when multiple types of lesions were present, or when there were significant metal or motion artifacts; we plan to obtain more cases with these features for training as a next step. Since the AI can locate lesions in seconds, it can greatly reduce the workload of physicians, who otherwise must carefully search for lesions through hundreds of CT images one by one.
Thanks to this system, physicians only need to examine the results estimated by the AI. To verify the efficiency of the proposed system, we employed 5 senior physicians to detect COVID-19 infection regions, as shown in Figure 3, and found the system effective in reducing the rate of missed diagnoses. Using only the CT scans of 170 cases (89 positive) randomly selected from the test set, the five radiologists achieved an average sensitivity of 0.764 and specificity of 0.788, while the deep learning model obtained a sensitivity of 0.989 and a specificity of 0.741. On 100 cases misclassified by at least one radiologist, the model's sensitivity and specificity were 0.981 and 0.646, respectively. The radiologists showed a very low average sensitivity of 0.2 on the cases misclassified by the model, and 82.8% (18/22) of those cases were also misdiagnosed by at least one radiologist. At the time of writing, we had deployed the system in 16 hospitals, including Zhongnan Hospital of Wuhan University, Wuhan's Leishenshan Hospital, Beijing Tsinghua Changgung Hospital, and Xi'an Gaoxin Hospital. Physicians first ran the system automatically, which took 0.8 seconds on average, and then checked the model's prediction. Regardless of whether the classification was positive or negative, the physicians checked the segmentation results to quickly locate suspected lesions and examine whether any were missed; finally, they confirmed the screening result. In this section, we discuss the benefits and drawbacks of the proposed system. As mentioned above, the deployed system is effective at reducing the rate of missed diagnoses and can distinguish COVID-19 pneumonia from common pneumonia. In addition, it produces classification and segmentation results simultaneously, a combination that helps doctors reach a definite diagnosis.
Furthermore, our system has been deployed in 16 hospitals, takes 0.8 seconds per scan on average, and has made crucial contributions to coping with COVID-19 in practice. Although the proposed system has achieved significant results, it still has failure cases. First, it does not perform well when multiple types of lesions are present, or when there are significant metal or motion artifacts; enhancing the generalization ability of the system is future work. Second, training the networks in the proposed system requires a large set of CT images fully annotated with lung contours, lesions, and classifications, so another limitation of our system is its dependence on fully annotated CT images. The system can help heavily affected areas, where there are not enough radiologists, by producing preliminary CT results that speed up the filtering of suspected COVID-19 patients. In less affected areas, it can help less experienced radiologists, who find it challenging to distinguish COVID-19 from common pneumonia, to better detect features highly indicative of COVID-19. While it is not currently possible to build a general AI that can automatically diagnose every new disease, we can have a generally applicable methodology that allows us to quickly construct a model targeting a specific disease, such as COVID-19. The methodology includes not only a library of models and training tools, but also the processes for data collection, annotation, testing, user-interaction design, and clinical deployment. Based on this methodology, we produced the first usable model 7 days after receiving the first batch of data, and conducted four further iterations of the model over the next 13 days while deploying it in 16 hospitals. The model was performing more than 1,300 screenings per day at the time of writing. Being able to take in more data continuously is an essential feature for epidemic response.
The performance could be quickly improved by updating the model with the continuously collected data. To further improve detection accuracy, we need to focus on adding training samples for complicated cases, such as cases with multiple lesion types. Besides, CT is only one factor in the diagnosis; we are building a multi-modal model that accepts other clinical inputs, such as patient profiles, symptoms, and lab-test results, to produce a better screening result.

References:
Clinical features of patients infected with 2019 novel coronavirus in
A decade after SARS: strategies for controlling emerging coronaviruses
Clinical characteristics and intrauterine vertical transmission potential of COVID-19 infection in nine pregnant women: a retrospective review of medical records
CT imaging features of 2019 novel coronavirus (2019-nCoV)
Time course of lung changes on chest CT during recovery from 2019 novel coronavirus (COVID-19) pneumonia
Correlation of chest CT and RT-PCR testing in coronavirus disease 2019 (COVID-19) in China: a report of 1014 cases
COVID-19 pneumonia: what has CT taught us?
National Health Commission of the People's Republic of China, the notice of launching guideline on diagnosis and treatment of the novel coronavirus pneumonia
Potential association between COVID-19 mortality and health-care resource availability
Deep learning enables accurate diagnosis of novel coronavirus (COVID-19) with CT images, medRxiv
Large-scale screening of COVID-19 from community acquired pneumonia using infection size-aware classification
V-Net: fully convolutional neural networks for volumetric medical image segmentation
COVID-CT-Dataset: a CT scan dataset about COVID-19
Helping radiologists to help people in more than 100 countries
COVID-19 CT segmentation dataset
COVID-19 image data collection
United Imaging's emergency radiology departments support mobile cabin hospitals, facilitate 5G remote diagnosis
Towards robust RGB-D human mesh recovery
Precise pulmonary scanning and reducing medical radiation exposure by developing a clinically applicable intelligent CT system: toward improving patient care
Two-stream convolutional networks for blind image quality assessment
Deep HDR imaging via a non-local network
Ghost removal via channel attention in exposure fusion
COVID-19 chest CT image segmentation: a deep convolutional neural network solution
Multi-scale dense networks for deep high dynamic range imaging
Attention-guided network for ghost-free high dynamic range imaging
Longitudinal assessment of COVID-19 using a deep learning-based quantitative CT pipeline: illustration of two cases
Rapid AI development cycle for the coronavirus (COVID-19) pandemic: initial results for automated detection & patient monitoring using deep learning CT image analysis
Serial quantitative chest CT assessment of COVID-19: deep-learning approach
U-Net: convolutional networks for biomedical image segmentation
UNet++: a nested U-Net architecture for medical image segmentation
Segmentation of kidney tumor by multi-resolution VB-nets
Machine learning-based CT radiomics model for predicting hospital stay in patients with pneumonia associated with SARS-CoV-2 infection: a multicenter study
Lung infection quantification of COVID-19 in CT images with deep learning
Deep learning-based model for detecting 2019 novel coronavirus pneumonia on high-resolution computed tomography: a prospective study, medRxiv
Clinical recommender system: predicting medical specialty diagnostic choices with neural network ensembles
Lessons learned from designing an AI-enabled diagnosis tool for pathologists
Access control management for computer-aided diagnosis systems using blockchain
A visual analytics system for multi-model comparison on clinical data predictions
Estimating uncertainty and interpretability in deep learning for coronavirus (COVID-19) detection
Automatic detection of coronavirus disease (COVID-19) using X-ray images and deep convolutional neural networks
Deep residual learning for image recognition
Rethinking the Inception architecture for computer vision
Inception-v4, Inception-ResNet and the impact of residual connections on learning
COVID-19 screening on chest X-ray images using deep learning based anomaly detection
Development and evaluation of an AI system for COVID-19 diagnosis
Deep learning-based detection for COVID-19 from chest CT using weak label, medRxiv
Survey: interpolation methods in medical image processing
nnU-Net: breaking the spell on successful medical image segmentation
Improving data augmentation for medical image segmentation
Differential data augmentation techniques for medical imaging classification tasks
Fully convolutional networks for semantic segmentation, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Dual path networks
Residual attention network for image classification
Automatic differentiation in PyTorch, in: NIPS Workshop
Deep residual learning for image recognition

key: cord-310406-5pvln91x authors: asbury, thomas m; mitman, matt; tang, jijun; zheng, w jim title:
Genome3D: a viewer-model framework for integrating and visualizing multi-scale epigenomic information within a three-dimensional genome date: 2010-09-02 journal: BMC Bioinformatics doi: 10.1186/1471-2105-11-444 sha: doc_id: 310406 cord_uid: 5pvln91x

Background: New technologies are enabling the measurement of many types of genomic and epigenomic information at scales ranging from the atomic to the nuclear. Much of this new data is increasingly structural in nature and is often difficult to coordinate with other data sets. There is a legitimate need to integrate and visualize these disparate data sets in order to reveal structural relationships that are not apparent when the data are examined in isolation. Results: We have applied object-oriented technology to develop a downloadable visualization tool, Genome3D, for integrating and displaying epigenomic data within a prescribed three-dimensional physical model of the human genome. To integrate and visualize large volumes of data, novel statistical and mathematical approaches have been developed to reduce the size of the data. To our knowledge, this is the first such tool that can visualize the human genome in three dimensions. We describe the major features of Genome3D and discuss our multi-scale data framework using a representative basic physical model; we then demonstrate many of the issues and benefits of multi-resolution data integration. Conclusions: Genome3D is a software visualization tool for exploring a wide range of structural genomic and epigenetic data. Data from various sources at differing scales can be integrated within a hierarchical framework that is easily adapted to new developments concerning the structure of the physical genome. In addition, the tool provides a simple annotation mechanism for incorporating non-structural information.
Genome3D is unique in its ability to manipulate large amounts of multi-resolution data from diverse sources to uncover complex new structural relationships within the genome. Background: a significant portion of the genomic data currently being generated extends beyond traditional primary-sequence information. Genome-wide epigenetic characteristics such as DNA and histone modifications and nucleosome distributions, along with structural insights into transcription and replication centers, are rapidly changing the way the genome is understood. Indeed, these new data from high-throughput sources often demonstrate that much of the genome's functional landscape resides in extra-sequential properties. With this influx of new detail about the higher-level structure and dynamics of the genome, new techniques will be required to visualize and model the full extent of genomic interactions and function. Genome browsers, such as the UCSC Genome Browser [1], are aimed specifically at viewing primary-sequence information. Although supplemental information can easily be annotated via new tracks, representing structural hierarchies and interactions is quite difficult, particularly across non-contiguous genomic segments [2]. In addition, despite many recent efforts to measure and model genome structure at various resolutions and levels of detail [3-10], little work has focused on combining these models into a plausible aggregate, or has taken advantage of the large amount of genomic and epigenomic data available from new high-throughput approaches. To address these issues, we have created an interactive 3D viewer, Genome3D, to enable the integration and visualization of genomic and epigenomic data. The viewer is designed to display data at multiple scales and uses a hierarchical model of the relative positions of all nucleotide atoms in the cell nucleus, i.e., the complete physical genome.
Our model framework is flexible and adaptable, able to handle new, more precise structural information as details of the genome's physical arrangement emerge. The large amounts of data generated by high-throughput or whole-genome experiments raise issues of scale, storage, interactivity, and abstraction, and novel methods will be required to extract useful knowledge; Genome3D is an early step toward such approaches. Genome3D is a GUI-based C++ program that runs on Windows (XP or later). Its software architecture is based on the model-view-controller pattern [11]: Genome3D is a viewer application that explores an underlying physical model, displaying selections and annotations according to its current user settings. To support multiple resolutions and maintain a high level of interactivity, the model is designed with an object-oriented, hierarchical data architecture [12], and Genome3D loads the model incrementally as needed to satisfy user requests. Once a model is loaded, Genome3D supports UCSC Genome Browser track annotations in the BED and WIG formats [1]. At the highest level of detail, a model of the physical genome requires a 3D position (x, y, z) for every atom of the genome. The large amount of such data (3 × 10^9 bp × 20 atoms/bp × 3 coordinates × 4 bytes ≈ 600 gigabytes for humans) is reduced by exploiting the data's hierarchical organization: we store three scales of data for each chromosome in compressed XML format, and atomic positions are computed on demand rather than saved. This technique reduces the storage size for a human genome to about 1.5 gigabytes, a more than 400× saving. Several sample models are available for download from the Genome3D project homepage; more information on our representative model and its data format can be found in Additional file 1. The range of scales and spatial organizations of DNA within the human cell presents many visualization challenges.
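The storage estimate above can be checked with simple arithmetic. Note the exact product comes out somewhat above the text's rough ~600 GB figure, but the quoted "more than 400×" reduction still holds:

```python
# Back-of-the-envelope check of the storage figures quoted in the text
# (all inputs are the approximations used there, not exact counts).
bp = 3e9             # base pairs in the human genome
atoms_per_bp = 20    # approximate atoms per base pair
coords = 3           # x, y, z
bytes_per_value = 4  # single-precision float

full_gb = bp * atoms_per_bp * coords * bytes_per_value / 1e9
compressed_gb = 1.5  # size after hierarchical storage + on-demand computation
reduction = full_gb / compressed_gb

print(full_gb)    # on the order of the ~600 GB quoted in the text
print(reduction)  # comfortably exceeds the quoted 400x saving
```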
to meet these challenges, genome3d manipulates and displays genomic data at multiple resolutions. figure 1 shows several screen captures of the genome3d application at various levels of detail. genome3d allows the user to specify the degree of detail to view, and the corresponding data is loaded dynamically. because of the large amount of data and the limited memory that is available, only portions of the data can typically be viewed at high resolution. the interactivity of genome3d facilitates exploring the model to find areas of interest. additionally, the user can configure various display parameters (such as color and shape) to highlight significant structural relationships. genome3d features include:
• display of genomic data from nuclear to atomic scale. genome3d has multiple windows to visualize the physical genome model from different viewpoints and scales simultaneously. the model resolution of the current viewing window is set by the user, and its viewing camera is controlled by the mouse. resolutions and viewpoints depend on the type of data that is being visualized.
• a fully interactive point-and-select 3d environment. the user can navigate to an arbitrary region of interest by selecting a low-resolution region and then loading corresponding higher-resolution data, which appears in another viewing window.
• loading of multiple-resolution user-created models with an open xml format. the genome3d application adheres to the model-view-controller software design pattern [13]. the viewing software is completely separated from the multiscale model that is being viewed. we have chosen a simple open format for each resolution of the model, and users can easily add their own models.
• image capture and povray/pdb model export support. genome3d supports screen capture of the current display image to a jpg format. for high-quality renders, it can export the current model and view in a povray model [14] format for off-line print-quality rendering. in addition, atomic positions of selected dna can be saved to a pdb format file for downstream analysis.
• incorporation and user-defined visualization of ucsc annotation tracks onto the physical model. the ucsc genome database browser has a variety of epigenetic information that can be exported directly from its web-site [1]. this data can be loaded into genome3d and displayed on the currently loaded genome model.
we now give a few examples of applying biological information to a model and suggest possible methods of inferring unique structural relationships at various resolutions. one of the advantages of a multi-scale model is the ability to integrate data from various sources, and perhaps gain insight into higher-level relationships or organizations. we choose to concentrate on high-throughput data sets that are becoming commonplace in current research: genome-wide nucleosome positions, snps, histone methylations and gene expression profiles. the sample images, which can be visualized in genome3d, were exported and rendered in povray [14]. the impact of nucleosome position on gene regulation is well-known [15]. in addition to nucleosome restructuring/modification [16], the rotation and phasing information of dna sequence may also play a significant role in gene regulation [17], particularly within non-coding regions. figures 2a, b show a non-coding nucleosome with multiple snps using genome-wide histone positioning data [18] combined with a snp dataset [19]. it highlights one of the advantages of three-dimensional genomic data by clearly showing the phasing of the snps relative to the histone. observations of this type and of more complicated structural relationships may provide insights for further analysis, and such hidden three-dimensional structure is perhaps best explored with the human eye using a physical model. figure 2: two examples of nucleosome epigenomic variation.
(a) top view of 4 snp variants rs6055249, rs7508868, rs6140378, and rs2064267 (numbered 1-4 respectively) within a non-coding histone of chromosome 20:7602872-7603018. the histone position was obtained from [18]; the snps were taken from a recent study examining variants associated with hdl cholesterol [19]. such images may reveal structural relationships between non-coding region snps and histone phasing. (b) side view of (a). (c) a series of histone trimethylations within encode region enr111 on chromosome 13:29668500-29671000 [27]. the histone bp positions are from [18]. each histone protein is shown as an approximate cylinder wedge: h2a (yellow), h2b (red), h3 (blue), h4 (green). the cα backbones of the h3 and h4 n-terminal tails are modeled using the crystal structure of the ncp (pdb 1aoi) [28]. the bright yellow spheres indicate h3k4me3 and h3k9me3, and the orange spheres are h3k27me3, h3k36me3 and h3k79me3. another important source of epigenomic information is histone modification. genome-wide histone modifications are being studied through a combination of dna microarray and chromatin immunoprecipitation (chip-chip assays) [20]. histone methylations have important gene regulation implications, and methylations have been shown to serve as binding platforms for transcription machinery. the encode initiative [21] is creating high-resolution epigenetic information for ~1% of the human genome. despite the fact that such modification occurs in histone proteins, current approaches to map and visualize such information are limited to sequence coordinates in the genome. our physical genome model visualizes methylation of histone proteins at atomic detail as determined by crystal structure. figure 2c shows histone methylations for several histones within an encode region.
an integrated physical genome model can show the interplay between histone modifications and other genomic data, such as snps, dna methylation, the structure of genes, promoters and transcription machinery, etc. in addition to epigenomic data, the physical genome model also provides a platform to visualize high-throughput gene expression data and its interplay with global binding information of transcription factors. we consider a sample analysis of transcription factor p53. genome-wide binding sites of p53 proteins [22] can be combined with the gene expression results from a study investigating the dosing effect of p53 [23]. this may identify genes that have p53 binding sites in their promoter regions and are responsive to the dosing effect of p53 protein. such large-scale microarray expression data is often displayed in a two-dimensional array format, emphasizing shared expression between genes, while p53 binding data are stored in tabular form. with a physical model, expression levels of genes in response to p53 level can be mapped to genome positions together with global p53 binding information, revealing any structural bias of the expression. figure 3 shows this type of physical genome annotation. drawing inferences from coupling averaged or "snap-shot" expression data with the dynamic architecture of the genome may be helpful in determining structural dependencies in expression patterns. to illustrate the capability of genome3d to integrate and examine data of appropriate scales, we constructed an elementary model of the physical genome (see additional file 1 for details). this basic model is approximate, since precise knowledge of the physical genome is largely lacking at present. however, the model's inaccuracies are secondary to its multi-scale approach, which provides a framework to improve and refine the model. current technologies are making significant progress toward capturing chromosome conformation within the nucleus at various scales [24, 25].
because our multi-scale model is purely descriptive beyond the ncp scale, it can easily incorporate more accurate structural folding information, such as the 'fractal globule' behaviour [26]. the genome3d viewer, decoupled from the genome model, can be used to view any model that uses our model framework. building a 3d model of a complete physical genome is a non-trivial task. the structure and organization at a physical level is dynamic and heavily influenced by local and global constraints. a typical experiment may provide new data at a specific resolution or portion of the genome, and the integration of these data with other information to flesh out a multi-resolution model is challenging. for example, an experiment may measure local chromatin structure around a transcription site. this structure can be expressed as a collection of dna strands, ncps, and perhaps lower-resolution 30 nm chromatin fibers. our data formats are flexible enough to allow partial integration of this information when the larger global structure is undetermined, or inferred by more global stochastic measurements from other experiments. combining such data across resolutions is often difficult, but establishing data formats and visualization tools provides a framework that may simplify the integration process. recent advances in determining chromosome folding principles [24] highlight the need for new visualization methods. more detailed three-dimensional genomic models will help in discovering and characterizing epigenetic processes. we have created a multi-scale genomic viewer, genome3d, to display and investigate genomic and epigenomic information in a three-dimensional representation of the physical genome. the viewer software and its underlying data architecture are designed to handle the visualization and integration issues that are present when dealing with large amounts of data at multiple resolutions.
our data structures can easily accommodate new advances in chromosome folding and organization. a common framework of established scales and formats could vastly improve multi-scale data integration and the ability to infer previously unknown relationships within the composite data. our model architecture defines clear demarcations between four scales (nuclear, fiber, nucleosome and dna), which facilitates data integration in a consistent and well-behaved manner. as more data become available, the ability to model, characterize, visualize, and perhaps most crucially, integrate information at many scales is necessary to achieve a fuller understanding of the human genome. software development, and wjz oversaw the whole project. all authors read and approved the final manuscript.

references
the ucsc genome browser database: 2008 update
gene regulation in the third dimension
polymer models for interphase chromosomes
a random-walk/giant-loop model for interphase chromosomes
a polymer, random walk model for the size-distribution of large dna fragments after high linear energy transfer radiation
a chromatin folding model that incorporates linker variability generates fibers resembling the native structures
capturing chromosome conformation
modeling dna loops using the theory of elasticity
computational modeling predicts the structure and dynamics of chromatin fiber
multiscale modeling of nucleosome dynamics
applications programming in smalltalk-80: how to use model-view-controller (mvc)
object-oriented biological system integration: a sars coronavirus example
computer graphics: principles and practice
persistence of vision pty. ltd., persistence of vision raytracer (version 3.6)
cooperation between complexes that regulate chromatin structure and transcription
the language of covalent histone modifications
binding of nf1 to the mmtv promoter in nucleosomes: influence of rotational phasing, translational positioning and histone h1
dynamic regulation of nucleosome positioning in the human genome
newly identified loci that influence lipid concentrations and risk of coronary artery disease
genome-wide approaches to studying chromatin modifications
a global map of p53 transcription-factor binding sites in the human genome
gene expression profiling of isogenic cells with different tp53 gene dosage reveals numerous genes that are affected by tp53 dosage and identifies cspg2 as a direct target of p53
comprehensive mapping of long-range interactions reveals folding principles of the human genome
organization of interphase chromatin
the role of topological constraints in the kinetics of collapse of macromolecules
the landscape of histone modifications across 1% of the human genome in five human cell lines
crystal structure of the nucleosome core particle at 2.8 å resolution

this work is partly supported by grants irg 97-219-08 from the american cancer society, computational biology core of 1 ul1 rr029882-01, 3 r01 gm063265-09s1, a pilot project and statistical core of grant 5 p20 rr017696-05, phrma foundation research starter grant, a pilot project from 5p20rr017677 to w.j.z, and nsf 0904179 and 3 r01 gm078991-03s1 to jt. t.m.a. is supported by nlm training grant 5-t15-lm007438-02. the authors thank y. ruan for valuable discussion about the project, k. zhao and d.e.
schones for providing nucleosome positioning data, m. boehnke for critical reading of the manuscript, and t. qin, lc tsoi, and k. sims for software testing. the high performance computing facility utilized in this project is supported by nih grants: 1r01lm009153, p20rr017696, 1t32gm074934 and 1t15 lm07438.

project name: genome3d
project homepage: http://genomebioinfo.musc.edu/genome3d/index.html
operating system: windows-based operating systems (xp or later)
programming language: c++ and python
other requirements: opengl v2.0 and glsl v2.0 (may not be present on some older graphics adapters - see additional file 2)
any restrictions to use by non-academics: none

additional file 1: supplemental information. additional details about human physical genome model construction and the genome3d software.
additional file 2: genome3d v1.0 readme. the readme file for the genome3d software.

authors' contributions: wjz conceived the initial concept of the project and developed the project with tma. tma developed the 3d genomic model and worked with mm to develop the genome3d software. jt and wjz advised tma and mm on the

key: cord-299439-xvfab24g
authors: fokas, a. s.; dikaios, n.; kastis, g. a.
title: covid-19: predictive mathematical models for the number of deaths in south korea, italy, spain, france, uk, germany, and usa
date: 2020-05-12
journal: nan
doi: 10.1101/2020.05.08.20095489
sha:
doc_id: 299439
cord_uid: xvfab24g

we have recently introduced two novel mathematical models for characterizing the dynamics of the cumulative number of individuals in a given country reported to be infected with covid-19. here we show that these models can also be used for determining the time-evolution of the associated number of deaths.
in particular, using data up to around the time that the rate of deaths reaches a maximum, these models provide estimates for the time that a plateau will be reached, signifying that the epidemic is approaching its end, as well as for the cumulative number of deaths at that time. the plateau is defined to occur when the rate of deaths is 5% of the maximum rate. results are presented for south korea, italy, spain, france, uk, germany, and usa. the number of covid-19 deaths in other countries can be analyzed similarly. unprecedented mobilization of the scientific community has already led to remarkable progress towards combating this threat, such as understanding significant features of the virus at the molecular level, see for example (1) and (2). in addition, international efforts have intensified towards the development of specific pharmacological interventions; they include clinical trials using old or relatively new medications and the employment of specific monoclonal antibodies, as to viruses, and that complement deposits are abundant in the lung biopsies from sars-cov-2 patients, indicating that this system is presumably overacting (6), it has been suggested that anti-complement therapies may be beneficial to sars-cov-2 patients. further support for this suggestion is provided by earlier studies showing that the activation of various components of the complement system exacerbates the acute respiratory distress syndrome disease associated with sars-cov (6). the food and drug administration of the usa has granted a conditional approval to the anti-viral medication remdesivir (7). unfortunately, the combination of the anti-viral medications lopinavir and ritonavir that are effective against the human immunodeficiency virus has not shown any benefits (4). similarly, the combination of the anti-malarial medication hydroxychloroquine and the antibiotic azithromycin is not only ineffective but can be harmful (5).
the scientific community is also playing an important role in advising policy makers of possible non-pharmacological approaches to limit the catastrophic impact of the pandemic. for example, following the analysis in (8) of two possible strategies, called mitigation and suppression, for combating the epidemic, uk switched from mitigation to suppression. within this context, in order to design a long-term strategy, it is necessary to be able to predict important features of the covid-19 epidemic, such as the final accumulative number of deaths. clearly, this requires the development of predictive mathematical models. in a recent paper (9) we presented a model for the dynamics of the accumulative number of individuals in a given country that are reported at time t to be infected by covid-19. this model is based on a particular ordinary differential equation of the riccati type, which is specified by a in the particular case that the associated time-dependent function is a constant, the explicit solution of the above riccati equation becomes the classical logistic formula. it was shown in (9) that although this formula provides an adequate fit of the given data, it is not sufficiently accurate for predictive purposes. in order to provide more accurate predictions, we introduced two novel models, called rational and birational. here we will show that the riccati equation introduced in (9) can also be used for determining the time evolution of the number, n(t), of deaths in a given country caused by the covid-19 epidemic. (which was not certified by peer review) is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. it turns out that the birational formula, in general, yields better predictions than the rational, which in turn provides better predictions than the logistic.
also, in general, the birational curve is above the curve obtained from the data, whereas the rational curve is below. thus, the birational and for the epidemic of the uk (fig. 3d) for the epidemic of germany (fig. 3e), the logistic model predicts a plateau on may 17, 2020 (59 days after the day that 25 deaths were reported) with 6,830 deaths; the rational model predicts a plateau on june 16, 2020 (day 89) with 8,702 deaths. for the epidemic of the usa (fig. 3f), the it is evident from the above that whereas each of the three equations (3) the fact that a is now a function of t has important implications. in particular, it made it possible to construct the rational and birational models. both models, as well as the logistic model, provide good fits for the available data. however, as discussed in detail above, the rational and birational models provide more accurate predictions. furthermore, the birational model may provide an upper bound of the actual n(t), whereas the rational model yields a better lower bound than the logistic model. it is noted that our approach has the capacity for increasing continuously the accuracy of the predictions: as soon as the epidemic in a given country passes the time t, the rational model can be used; furthermore, when the sigmoidal part of the curve is approached, the rational model can be supplemented with the birational model. also, as more data become available, the parameters of the rational and of the birational models can be re-evaluated; this will yield better predictions. (21). in any case, whatever strategy is followed, it is natural to expect that the number of reported infected individuals as well as the number of deaths will begin to grow. at this stage the predictions of the models proposed in (9) and here will not be valid. however, these works will still be valuable: they can be used to compute the additional number of reported infected individuals and deaths caused by easing the lockdown measures. let us hope that a prudent exit strategy is adopted so that these numbers will not be staggering. the stability of the fitting procedure was established by using the following simple criterion: different fitting attempts, based on the use of a fixed number of data points, must yield curves which have the same form beyond the above fixed points. in this way, it was established that the rational formula could be employed provided that data were available until around the time t, whereas the birational formula could be used only for data available well beyond t. the fitting accuracy of each model was evaluated by fitting the associated formula on all the available data in a specified set. the relevant parameters specifying the logistic, rational, and birational models are given in table 1. we assume that the function n(t) satisfies the ordinary differential equation
dn/dt = α(t)n − βn². (2)
the general solution of this equation is given by (9) if b, c, d, k are close to b1, c1, d1, k1, then nf is close to c1, and hence the value of α(t) after t = x is close to the value of α(t) before t = x. the constant t can be computed by solving the equation obtained by equating to zero the second derivative of n; for the logistic model this yields t = ln(b)/α, and for the rational model t satisfies the corresponding implicit equation.

references
a pneumonia outbreak associated with a new coronavirus of probable bat origin
structural basis of receptor recognition by sars-cov-2
infectious diseases society of america guidelines on the treatment and management of patients with covid-19 infection
a trial of lopinavir-ritonavir in adults hospitalized with severe covid-19
outcomes of hydroxychloroquine usage in united states veterans hospitalized with covid-19
complement as a target in covid-19?
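since the constant-rate special case of the riccati equation is the classical logistic curve, the relation between its parameters and the inflection time t can be checked numerically. the sketch below assumes the common parameterization n(t) = c / (1 + b·exp(−a·t)) for that special case (the paper's exact parameterization is not recoverable from the text, so this is an illustrative assumption, as are the parameter values), and verifies that the second derivative vanishes at t = ln(b)/a, where the cumulative curve reaches half of its final plateau c.

```python
import math

# classical logistic solution of dN/dt = a*N - (a/c)*N**2, written here as
# N(t) = c / (1 + b*exp(-a*t)); this parameterization is an assumption for
# illustration, since the paper does not spell out its exact form.
def logistic(t, a, b, c):
    return c / (1.0 + b * math.exp(-a * t))

def inflection_time(a, b):
    # setting d2N/dt2 = 0 for this logistic gives t = ln(b)/a
    return math.log(b) / a

a, b, c = 0.1, 50.0, 10_000.0   # illustrative values, not fitted to any data
t_star = inflection_time(a, b)

# at the inflection point the cumulative curve reaches half its plateau c
half_plateau = logistic(t_star, a, b, c)

# central-difference check that the second derivative vanishes at t_star
h = 1e-3
second_deriv = (logistic(t_star + h, a, b, c)
                - 2.0 * logistic(t_star, a, b, c)
                + logistic(t_star - h, a, b, c)) / h**2
```

at t = t_star the rate of deaths dn/dt is maximal, which is why data up to around that time suffice to estimate the plateau.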
impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand (response team)
predictive mathematical models for the number of individuals infected with covid-19. medrxiv
modeling and forecasting the early evolution of the covid-19 pandemic in brazil
modeling covid-19 dynamics for real-time estimates and projections: an application to albanian data
the first 100 days: modeling the evolution of the covid-19 pandemic
a mathematical model for the novel coronavirus epidemic in wuhan
projected development of covid-19 in louisiana
a mathematical model for the spatiotemporal epidemic spreading of
a spatial model of covid-19 transmission in england and wales: early spread and peak timing
tracing day-zero and forecasting the fade out of the covid-19 outbreak in italy: a compartmental modelling and numerical optimization approach
data-based analysis, modelling and forecasting of the covid-19 outbreak
the psychological impact of quarantine and how to reduce it: rapid review of the evidence
two alternative scenarios for easing covid-19 lockdown measures: one reasonable and one catastrophic (preprint)
bound constrained optimization
convergence properties of the simplex method in low dimensions
the levenberg-marquardt algorithm: implementation and theory
an interior, trust region approach for nonlinear minimization subject to bounds

competing interests: the authors declare no competing interests.
data and materials availability: all data needed to evaluate the conclusions in the paper are present in the paper and/or the supplementary materials.

key: cord-290421-9v841ose
authors: weston, dale; ip, athena; amlôt, richard
title: examining the application of behaviour change theories in the context of infectious disease outbreaks and emergency response: a review of reviews
date: 2020-10-01
journal: bmc public health
doi: 10.1186/s12889-020-09519-2
sha:
doc_id: 290421
cord_uid: 9v841ose

background: behavioural science can play a critical role in combatting the effects of an infectious disease outbreak or public health emergency, such as the covid-19 pandemic. the current paper presents a synthesis of review literature discussing the application of behaviour change theories within an infectious disease and emergency response context, with a view to informing infectious disease modelling, research and public health practice. methods: a scoping review procedure was adopted for the searches. searches were run on pubmed, psychinfo and medline with search terms covering four major categories: behaviour, emergency response (e.g., infectious disease, preparedness, mass emergency), theoretical models, and reviews. three further top-up reviews were also conducted using google scholar.
papers were included if they presented a review of theoretical models as applied to understanding preventative health behaviours in the context of emergency preparedness and response, and/or infectious disease outbreaks. results: thirteen papers were included in the final synthesis. across the reviews, several theories of behaviour change were identified as more commonly cited within this context, specifically, health belief model, theory of planned behaviour, and protection motivation theory, with support (although not universal) for their effectiveness in this context. furthermore, the application of these theories in previous primary research within this context was found to be patchy, and so further work is required to systematically incorporate and test behaviour change models within public health emergency research and interventions. conclusion: overall, this review identifies a range of more commonly applied theories with broad support for their use within an infectious disease and emergency response context. the discussion section details several key recommendations to help researchers, practitioners, and infectious disease modellers to incorporate these theories into their work. specifically, researchers and practitioners should base future research and practice on a systematic application of theories, beginning with those reported herein. furthermore, infectious disease modellers should consult the theories reported herein to ensure that the full range of relevant constructs (cognitive, emotional and social) are incorporated into their models. in all cases, consultation with behavioural scientists throughout these processes is strongly recommended to ensure the appropriate application of theory. 
the united kingdom (uk) national risk register details a broad range of threats to the public health and security of the uk, incorporating infectious disease outbreaks (e.g., pandemics and emerging diseases), malicious attacks (e.g., terrorist incidents), and natural phenomena (e.g., extreme weather, earthquakes) [1]. the risk of infectious disease outbreaks is so substantial that the uk national risk register ranks a pandemic outbreak as the number one high-consequence civil emergency facing the uk (based on likelihood and probable impact) [1]. the coronavirus disease pandemic, which at the time of writing has led to 19,440,423 confirmed cases and 722,706 deaths, presents a stark reminder of this public health threat [2]. outbreaks of infectious disease, particularly those for which little or no pre-existing immunity exists (such as the covid-19 pandemic), represent a significant risk to public health. for example: the 2013-2016 ebola outbreak in west africa led to over 28,600 cases with 11,325 deaths [3], while the ongoing outbreak in the democratic republic of congo has led to over 2200 deaths thus far [4]; since the identification of middle east respiratory syndrome coronavirus (mers-cov) in 2012, 851 associated deaths have been reported with cases across 27 countries [5]; and, although less severe than expected [6], the h1n1 pandemic was estimated as responsible for between 151,700 and 575,400 deaths worldwide during the first 12 months [7, 8]. even when controlling for confounding factors (e.g., improvements in surveillance, communication infrastructure, etc.), the number of infectious disease outbreaks substantially increased from 1980 to 2013 [9].
similarly, deaths from terrorism have substantially increased from less than 200 in 1970 to over 26,000 in 2017 (peaking at over 44,000 in 2014 [10]), and despite a decline in the number of individuals affected by natural disasters between 1994 and 2014, the average death rate has increased over the same time period [11]. given these trends, it is therefore critical to ensure that emergency preparedness, response and resilience are optimised to mitigate the occurrence and/or impact of these events. behavioural science represents one such broad method of mitigation. the importance of encouraging adaptive and protective behaviour change in response to public health emergencies is emphasised by the world health organisation (who), which provides risk communications guidelines designed to encourage individuals, families, and communities to act to protect themselves [12]. this is echoed in the context of covid-19, with michie and colleagues stating that: "human behaviour will determine how quickly covid-19 spreads and the mortality. therefore behavioural science must be at the heart of the public health response" [13]. research in the behavioural sciences has focused on identifying barriers and facilitators to maximising public compliance with recommended emergency response and infection prevention behaviours. for example, decontamination behaviour (e.g., [14]), medication adherence (e.g., [15]), hand washing (e.g., [16]), social distancing/avoidance behaviour (e.g., [16, 17]), and vaccination (e.g., [18, 19]), to name but a few. furthermore, in the context of infectious disease emergencies, mathematical models are used both: a) to understand and map out the spread and control of disease (incorporating human-to-human transmission) and, b) to calculate the potential effectiveness of interventions (including behavioural interventions) to reduce the spread of the disease [20].
Considered together, the importance of human behaviour for emergency response, both in terms of developing interventions and its relevance for modelling the potential efficacy of those interventions, is clear. However, there is still work to be done to optimise the incorporation of behavioural constructs in public health research, intervention design, and modelling. For example, despite Medical Research Council guidelines recommending that interventions be based on appropriate behaviour change theory [21] (see also [22]), reference to theory is often absent in such interventions [22]. Indeed, Michie and colleagues, pioneers in the field of identifying and integrating behaviour change theory and techniques in the context of health promotion, note that much intervention design is based on the principle of "it seemed like a good idea at the time", rather than a systematic consideration and assessment of the most appropriate routes to behaviour change ([23], p. 14). Similarly, a limitation of traditional mathematical models is that they often do not allow for heterogeneous behavioural responses within a population [24]. This assumption that human behaviour is homogeneous can undermine the validity of these models. For instance, including a modest degree of fear-related flight behaviour (i.e., 10% of individuals in a model respond to fear of infection with flight) into a model in which fear of infection otherwise leads to hiding caused projected disease incidence to rise to ~65%, up from ~30% in a model in which fear of infection led all individuals to hide [25]. Although some recent infectious disease models do incorporate social and cognitive predictors of the kinds of self-protective health behaviours that are associated with infectious disease control and emergency response (e.g., vaccination uptake, social distancing, etc.), they are more commonly informed by literature from behavioural economics than psychology [26].
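The heterogeneity issue can be illustrated with a small extension of a standard SIR model: split the population into behavioural classes that respond differently and couple them through a shared force of infection. This is only a sketch of the general technique, not the fear/flight model of [25]; the two classes, the contact-reduction factor, and all parameter values are illustrative assumptions of our own.

```python
def group_attack_rates(beta, gamma, frac_resp, contact_cut=0.9,
                       i0=0.001, days=400, dt=0.1):
    """SIR with two behavioural classes under a shared force of infection.
    Group 0 responds to the outbreak by cutting contacts; group 1 does not.
    Returns the per-group attack rates (fraction of each group ever infected)."""
    n = [frac_resp, 1.0 - frac_resp]        # group sizes (fractions of population)
    s = [n[0] * (1 - i0), n[1] * (1 - i0)]  # susceptibles per group
    i = [n[0] * i0, n[1] * i0]              # infectious per group
    r = [0.0, 0.0]                          # removed per group
    c = [1.0 - contact_cut, 1.0]            # relative contact rates per group
    for _ in range(int(days / dt)):
        foi = beta * (c[0] * i[0] + c[1] * i[1])  # shared force of infection
        for g in (0, 1):
            new_inf = c[g] * foi * s[g] * dt
            new_rec = gamma * i[g] * dt
            s[g] -= new_inf
            i[g] += new_inf - new_rec
            r[g] += new_rec
    return r[0] / n[0], r[1] / n[1]
```

Even this two-class toy model shows why homogeneity assumptions matter: the non-responding class sustains transmission and experiences a far larger attack rate than the responding class, so a single population-average response can misstate both.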
That is not to say that the integration of theory is a silver bullet for the success of mitigation strategies and modelling. For example, there is mixed evidence concerning the efficacy of theory-based interventions (see [22], p. 21 for a summary), an inconsistency that may depend on the relevance of the chosen theory for the behaviour in question [22]. To illustrate this point, according to Michie and colleagues there are a total of 83 behaviour change theories across the behavioural and social sciences [22]. Over the past three decades, multiple review papers and books have attempted to identify trends in theory use, including those most frequently applied (e.g., [22, 27-30]). Despite some commonalities in underlying psychological processes (Michie and colleagues concede that many of the 1659 constructs identified within their book were different labels for overlapping constructs [22]), this proliferation of competing theories and recommendations could indeed make it difficult for researchers, intervention designers, and modellers to decide which theories to use and in what context. This unfortunately leads to a catch-22 situation: we wish to encourage non-specialists to use appropriate psychological theories and approaches within their own disciplines, yet we fail to recognise the complex and confusing landscape of psychological theory. Michie and colleagues have made great strides to simplify the process by which psychological theories are used to inform behavioural interventions [23].
However, there are still a large number of behaviour change theories that were designed with specific applications in mind: for example, the behavioural-ecological model of adolescent AIDS prevention [31], the integrated theory of drinking behaviour [32], or the social ecological model of walking [33]. Public health researchers, infectious disease modellers, and practitioners may therefore be understandably perplexed as to how best to model, examine, or influence behaviour in the specific context of infectious disease outbreaks or emergency response. This paper therefore seeks to present a synthesis of the behaviour change theories that are most commonly applied within an emergency response or infectious disease outbreak context; that is, it focuses specifically on using behaviour change theories to understand and influence individuals' engagement with the protective health behaviours that are recommended during infectious disease outbreaks and public health emergencies. To identify these commonly applied theories, we conducted a scoping review of the existing literature, with a particular focus on identifying reviews using behaviour change theory in an infectious disease or emergency response context. This approach is recognised as a method of distilling a substantial literature into a manageable summary of evidence for decision makers ([34]; see also [35]). Although this 'review of reviews' approach focuses on secondary sources, which may have led to some relevant information being missed, it enabled us to reduce the quantity of papers identified in a large and highly diverse literature to a manageable level while still achieving a broad overview of the state of the art within the field.
By dovetailing with Weston and colleagues' recent review of the application of human behaviour within infectious disease models [26], the outcomes from this review will enable us to make useful recommendations as to how psychological constructs, theory, and research can be used by public health practitioners, researchers, and modellers to improve our understanding of human behaviour within the contexts of infectious disease outbreaks and emergency response. A scoping approach was adopted for our search. Scoping reviews are recommended as a mechanism by which a given literature might be summarised for policy makers or practitioners [36]. As the aim of this review was to summarise and synthesise the psychological literature on behaviour change to inform recommendations for public health researchers and modellers, the adoption of a scoping review framework was a logical and appropriate choice. The literature search was conducted using the PubMed, PsycINFO, and Medline databases on 6th January 2016. The databases were selected based on their coverage of discipline- and context-specific literature. Each database was searched individually to ensure that all Medical Subject Heading (MeSH) terms were used effectively. The search terms covered four major categories: behaviour, emergency response (e.g., infectious disease, preparedness, mass emergency), theoretical models, and reviews. Supplementary Information 1 provides the full list of search terms used for each database. Within the theoretical model category, we a priori selected several existing behaviour change models that were either: a) frequently cited within the literature, or b) adjudged to be of particular relevance within the context of infectious disease and emergency preparedness, based on the authors' combined expertise in these areas.
In addition, generic phrases and subject headings for theoretical modelling were included within each search strategy to ensure that papers that do not cite the most common behaviour change models would still be captured within our search. Lastly, papers identified through other, non-systematic methods (e.g., clearly relevant citations in papers, keyword Google Scholar searches) were also included, to try to identify articles that were not indexed within these databases. As the initial search was run in 2016, a condensed follow-up search strategy was devised to identify seminal works in the field published since that date. Given time and resource constraints in conducting this search, the strategy was designed to identify literature that closely corresponded to the output from the original selection process. For example, the strategy was simplified based on the broad search categories used in the original search, and as literature concerning human immunodeficiency virus (HIV)/sexually transmitted infections (STIs) was excluded from the original data extraction (see the inclusion/exclusion criteria section), the decision was taken to exclude these papers at the search strategy stage here. On 24/10/19 the following search was conducted on Google Scholar, sorted by relevance, with a custom date range of 2016-2019: "review* and behavio* and theor*, or models and infectious disease*, or emergenc* -hiv, -std, -sti". Given time and resource constraints, only the first 20 pages of Google Scholar results were screened, first for title, then for abstract, and finally full-text screening was conducted on any remaining papers. Due to limitations concerning the use of wildcard operators (*) on Google Scholar in the initial top-up search, and the potential for COVID-19-related review papers to have been published in the intervening period, a further optimised top-up search strategy was developed.
This strategy consisted of the following two searches (specified for emergency response and infectious diseases respectively), conducted on Google Scholar on 16th-17th May 2020: "review emergency theory behavior or behaviour" and "review disease theory behavior or behaviour". As for the previous top-up search, the first 20 pages of Google Scholar results were screened for each search (for a total of 400 results). For the original search, duplicates were removed electronically, and all remaining papers were subjected to title and abstract screening by one author (AI) using the inclusion/exclusion criteria. All papers retained for full-text assessment were screened independently by two researchers (AI and DW) to increase the reliability of the selection process. Any inconsistencies between the researchers were resolved through joint discussion. For the top-up search, individual title, abstract, and full-text screening stages were conducted by the first author (DW). As for the original search, all papers retained for full-text assessment were screened independently by two researchers (AI and DW), with any inconsistencies between the researchers resolved through joint discussion. The inclusion/exclusion criteria employed in the top-up search were the same as those used for the original search. For the optimised top-up search, individual title, abstract, and full-text screening stages were conducted by the first author (DW); due to time constraints imposed by the COVID-19 pandemic, full-text screening for this search was conducted by the first author alone. As for the initial top-up search, the same inclusion/exclusion criteria were employed as in the original review. The following inclusion/exclusion criteria were used: (i) type of article: reviews (systematic, scoping, and narrative) and meta-analyses. (ii) Theoretical model/theory: the papers needed to present or apply a model or theory of behaviour change.
Leniency in this criterion was initially applied, in so far as papers which clearly applied constructs that were adapted from theories/models were also retained, but this was subsequently restricted to focus specifically on the presentation of behaviour change models/theories (see the original study selection section below). (iii) Context: the papers needed to present or apply the theory/model to explain human behaviour in the context of emergencies or infectious disease outbreaks. As per [26], any reviews focusing on diseases that are not transmitted from human to human (e.g., vector-borne) were excluded. (iv) Target behaviour: preventive health behaviours during an emergency or outbreak (e.g., social distancing, vaccination, and reducing social ties) were included in the review. (v) Other: there were no restrictions on the date the reviews were published or the population in question. Reviews were included if they were written in the English language and involved human behaviour. For simplicity, papers exploring STI-related health behaviours (total: 53) were pragmatically excluded wholesale following the initial screening of the original search, as the majority were deemed either: irrelevant according to the above criteria (30), of unclear relevance to the researcher (AI) (6), or inaccessible to the researcher (AI) (15). For consistency, papers exploring STI-related health behaviours were also excluded during subsequent screening of the top-up searches.
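For illustration, the inclusion/exclusion criteria above can be encoded as a simple screening predicate over candidate records. The field names and category labels below are hypothetical labels of our own, not the authors' actual data schema, and a real screening workflow would still require human judgement at each stage.

```python
def passes_screening(record):
    """Apply the review's inclusion/exclusion criteria to one record (a dict).
    All field names here are illustrative assumptions, not the authors' schema."""
    is_review = record.get("article_type") in {
        "systematic review", "scoping review", "narrative review", "meta-analysis"}
    has_theory = record.get("presents_behaviour_change_theory", False)
    in_context = record.get("context") in {
        "infectious disease outbreak", "emergency"}
    human_to_human = record.get("human_to_human_transmission", True)
    target_ok = record.get("target_behaviour") == "preventive"
    in_english = record.get("language") == "english"
    return all([is_review, has_theory, in_context,
                human_to_human, target_ok, in_english])
```

A record failing any one criterion (for example, a review of a vector-borne disease, where `human_to_human_transmission` is false) would be excluded, mirroring criterion (iii) above.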
Based on data extracted as part of previous review work in this area (e.g., [27, 28]), the following information was extracted from the included papers in both the original and top-up stages: (i) title; (ii) author; (iii) number of studies included in the review; (iv) target behaviour(s); (v) theories employed; (vi) key outcomes/conclusions regarding the utility of behaviour change models. Data concerning the theories employed were identified within the included papers using the original reviews' own definitions or conceptualisations of theory; that is, if an included review referenced a particular theory or model, it was subsequently included in our synthesis. Reference to theory was found either in specific citations of theories used by individual papers incorporated within the review (commonly included in summary tables within the included papers), or as a broader framework used by included reviews to collate and synthesise the identified literature. Key outcomes and conclusions regarding the utility of behaviour change models were identified similarly, using review authors' references to the theories they cite within their results and discussion sections. Data from the included studies were synthesised to identify: a) the behaviour change theories most commonly employed to understand and influence protective health behaviours during public health emergencies and infectious disease outbreaks and, b) any (in)consistency in the reported utility of different behaviour change theories. A total of 464 records were identified through database searching, with an additional eight articles identified through other sources. Following the removal of duplicate citations, 368 papers were subjected to title/abstract screening. Thirteen papers were retained for full-text eligibility assessment by the first and second authors. Following this assessment, one paper was excluded, leaving 12 remaining papers [37-48] (see Fig. 1).
During the conduct of this review, the focus evolved to be explicitly concerned only with the application of theories, rather than a broader focus on theoretically related constructs. Subsequent reconsideration by the first author therefore led to three of these papers being excluded [42, 43, 47]: although they all presented constructs that are represented within behaviour change theories, none explicitly referenced theory. These papers are explicitly referenced here to signpost the interested reader to their existence. Information concerning: (a) the article characteristics, (b) the application/use of psychological theory within these reviews, (c) the total number of unique articles employing each behaviour change theory and, (d) a summary of the key conclusions regarding the utility of theory within each review, is collated and summarised in this results section. A total of approximately 17,200 papers were identified using the Google Scholar search. Of these, the first 20 pages (200 hits) were subjected to title screening. Following this stage, 16 hits were retained for abstract screening, which yielded five papers for full-text review. Following full-text review by the first and second authors, two papers were retained for inclusion in this review [50, 51] (see Fig. 2). These two papers were subsequently incorporated into a revision of the initial synthesis and analysis and are presented in Table 1 and the Supplementary Information alongside literature identified through the original screening process. The first 20 pages of each optimised top-up search were subjected to title screening (400 hits in total). Following this stage, 47 hits were retained for abstract screening, which yielded 22 papers for full-text review. Following full-text review by the first author, two papers were retained for inclusion in this review [52, 53] (see Fig. 3).
As part of the full-text review process, the decision was taken to exclude two reviews [54, 55] which did examine the application of theory in a similar context to that of the two included reviews [52, 53]. This decision was taken because the focus of these reviews was more on understanding behaviour during an emergency, rather than the primarily protective or preventative focus of this review. These citations are presented here in order to signpost them to interested readers. As for the initial top-up process, the two included papers were incorporated into a revision of the synthesis and analysis presented within this manuscript and the accompanying Supplementary Information. The synthesis including both the original and all top-up studies is presented together in the following sections. In addition to peer-reviewed academic publications, the sample included reports published by the Department of Health, UK [39, 40] and the European Centre for Disease Prevention and Control [37], and one review from within an unpublished doctoral thesis [44]. (This review [44] forms part of a broader PhD thesis, in which it is presented as a chapter. Given the focus of the chapter on conducting a systematic review of H1N1 perceptions and responses, and the consistency in the theories identified within the review chapter and the thesis' introductory chapter, we elected to include only the specific review chapter in this synthesis.) Six of the reviews cited in this synthesis had at least one author in common with another review cited herein [38-41, 48, 50], with one [39] explicitly cited as an update of another [38]. To examine the extent to which the papers included in our review were sampling the same citations, we looked across all papers to see how many citations were fully independent (that is, not cited in any other review included in this manuscript).
To do this we either: a) examined the list of included studies provided by the authors of each systematic review, or b) where such a list was not provided, examined the full reference list for the manuscript. Although the percentage of unique papers varied substantially from review to review, each review contained an average of 66.9% unique papers (see Table 2 for the full breakdown). Although heavily focused on H1N1 pandemic influenza and vaccination behaviour, these papers covered a wide range of health-related behaviours (e.g., hand hygiene, face mask wearing) across various infectious disease and public health emergency contexts (e.g., natural disasters, terrorism). Specifically, 10 papers [37-41, 44, 45, 48, 50, 51] looked at uptake of vaccination against influenza (primarily pandemic, but also including seasonal). One paper [44] considered the relationship between risk perception and preventive behaviour related to SARS and avian influenza (e.g., hand washing, diet, exercise, wearing face masks). Four papers [37, 44-46] also considered other outbreak preparedness behaviours in addition to vaccination (e.g., hand hygiene, non-pharmaceutical measures against influenza, etc.). Three papers [45, 52, 53] investigated the application of theories/models in non-infectious-disease emergencies and disasters (e.g., flood disaster preparedness, earthquake preparedness, climate change, fire preparedness, bushfire emergencies, tornado preparedness, and terrorism preparedness). All reviews were published between 2009 [46] and 2017 [51-53], and all except two [52, 53] employed a systematic approach to data collection. In the first instance, we looked across the 13 review papers to see which theories were cited by the highest number of reviews (regardless of the number of cited papers using each theory within each review, and incorporating mentions of particular theories as frameworks for synthesis, as in [48, 51-53]).
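The per-review overlap statistic described above (an average of 66.9% unique papers) reduces to a few lines of set arithmetic. The sketch below shows that calculation on made-up data; the review names and citation keys are placeholders, not the actual inputs behind Table 2.

```python
from collections import Counter

def unique_citation_share(review_citations):
    """For each review, the percentage of its cited papers that appear
    in no other review. Input: {review_name: [citation_keys]}."""
    # Count, for every cited paper, how many reviews cite it.
    counts = Counter(c for refs in review_citations.values() for c in set(refs))
    return {
        review: 100.0 * sum(1 for c in set(refs) if counts[c] == 1) / len(set(refs))
        for review, refs in review_citations.items()
    }

# Placeholder data: two hypothetical reviews sharing one citation ("p3").
shares = unique_citation_share({"review_A": ["p1", "p2", "p3"],
                                "review_B": ["p3", "p4"]})
```

The overall average reported in the text would then be `sum(shares.values()) / len(shares)` over all included reviews.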
This initial examination revealed that the health belief model [56] was explicitly represented in the most review papers (nine: [37, 38, 40, 41, 44-46, 48, 50]), followed by the theory of planned behaviour [57] (eight: [37, 38, 41, 44-46, 50, 51]), protection motivation theory [58] (seven: [39-41, 44-46, 53]), the precaution adoption process model [59] (four: [37, 44-46]), and the common sense model of self-regulation [60] (four: [37, 40, 41, 44]). A further two models were each included in three review papers: the extended parallel process model [61] (three: [37, 44, 45]) and the theory of reasoned action [62] (three: [37, 44, 46]). All other models were cited two times or fewer (see Table 1). Notes to Table 1: self-efficacy theory is listed as 'other' within one paper, but we have incorporated it within our review alongside the additional, explicit self-efficacy theory citation from another included review paper; although one citation is presented as the common sense model, distinct from the other self-regulation model citation within Bults and colleagues' review, further examination of the original papers reveals that they are based on the same underlying model, and so they are integrated in our synthesis; examination of the reviews citing social cognitive theory [37, 38] and the social cognitive model [45] has revealed that these are distinct theories, and they are therefore included in our synthesis as such; and the authors of two reviews briefly cite examples of additional theories before settling on protection motivation theory and the protective action decision model respectively, so only these two theories are included in Table 1 and in our synthesis. Where there was close overlap between model/theory names, we conducted a Google search to determine whether the titles likely reflect the same, or different, models. Consequently, social ecological theory (see [37, 44]) and the social ecological resilience model [45] have been included as separate models in this synthesis. On closer inspection, subjective expected utility theory [38] and the state-dependent expected utility framework [46] refer to the same paper, and so are collapsed together herein. Precaution adoption theory [45] and the precaution adoption process model [37, 44, 46] are interpreted as referring to the same theory and so are included together. Next, we collated and examined the papers cited across all reviews to identify the most frequently cited theories overall. Articles that the reviews specifically cited as including behavioural theories were collated from the nine reviews that extracted such data [37-41, 44-46, 50]. Thus, papers were not included either: a) if the reviews did not indicate that such papers included behavioural theories, or b) if they came from reviews that did not provide detail on the theories used by their cited papers (specifically, [48, 51-53]). Papers that were listed as including behavioural theory and were cited by multiple reviews (n = 9) were only included once, leaving a total of 137 unique papers which were listed by the various review authors as incorporating one or more behaviour change theories. See Supplementary Information 3 for an overview of which theories were cited in multiple reviews, and for the overall number of times each theory was cited in a unique paper across all reviews. Across the cited literature, four behaviour change theories were applied more than 10 times. When considering the application of behaviour change theories for research, the included reviews report mixed success: while all of the papers cited by [37] were informed by theory, several other authors reported that relatively few papers included within their reviews explicitly use behaviour change theories [40, 41, 44, 46, 50].
For example, one review found that only around one third of cited papers explicitly refer to a theoretical model [46], and another failed to find any interventions that used behavioural frameworks in their development [50]. Nevertheless, where theory was cited, there was support for the utility of the most cited theories, both across reviews and within individual papers. Particular support was provided in the key outcomes and conclusions across reviews for: the health belief model [37-40, 45, 46, 48, 50], the theory of planned behaviour [37-40, 45, 50, 51], protection motivation theory [38-41, 44, 53], and the common-sense model [38-40] (see Supplementary Information 2 for a summary of theory-related outcomes for each review cited herein). Although this is based on key outcomes/conclusions and is not an exhaustive list of all successful theories reported within or across reviews, the commonly applied behaviour change theories do seem to be identified as relevant for understanding and explaining human behaviour within an infectious disease and emergency response context. However, the use of these theories was not universally lauded: for example, one review argues that the health belief model, protection motivation theory, and the theory of planned behaviour do not adequately allow for emotional factors in behavioural decision making [38] (of the three, protection motivation theory does incorporate fear; however, the impact of emotion is on threat appraisals rather than a direct effect of emotion on action [22]). Indeed, one review exploring the application of protection motivation theory for animal owners and emergency responders in the context of bushfire emergencies suggests that emotional attachment (to animals) could override adaptive responding [53].
Similarly, one review strongly supports the relevance of the extended parallel process model for disaster and emergency preparedness, but draws on a study examining the mediating role of fear in the threat-preparedness relationship to argue for further work applying the extended parallel process model beyond just threat and efficacy [45]. Furthermore, although several papers do support the relevance of the theory of planned behaviour within this context (e.g., [37, 38, 51]), one review identifies inconsistent findings regarding the role of the theory of planned behaviour for predicting behaviour within its cited papers (but still advocates the relevance of this theory for vaccination behaviour) [50]. In terms of intervention development, one review concludes that although there is clear evidence for the success of theory-based interventions for communicable disease control and prevention, there is no substantive difference between the theories informing effective and ineffective interventions [37]. Rather than specific theories being of critical importance for intervention development, it is instead the role of theory that is important: positive effects were reported where theories were used to design and develop interventions, but more mixed effects were reported when theories were only used to evaluate interventions [37]. This echoes points made by Leppin and Aro in their review of risk perceptions in relation to severe acute respiratory syndrome (SARS) and avian influenza. Specifically: 1) few studies explicitly or theoretically define risk perceptions; 2) there is a disproportionate focus on risk probability over risk severity; and 3) there is a need for further work to empirically examine the role of risk perceptions as represented within multifactor models rather than just through bivariate relationships [46]. Overall, this literature synthesis yields two key conclusions.
Firstly, behaviour change theories are of clear relevance for understanding behaviour in the context of infectious diseases and emergency response. Secondly, and related to the first conclusion, there is a definite requirement for further work to systematically examine, incorporate, and test full behaviour change models within research and interventions in the context of infectious disease and emergency response. Papers incorporating a total of 44 different theories (in various combinations) were collated across these reviews. (For one review, although the authors indicate that 470 papers were included, no list of these papers was provided; given this, the full reference list of 508 citations was searched.) The health belief model, the theory of planned behaviour, and protection motivation theory were the most cited theories, both across reviews and by individual papers included within reviews. Other theories that were commonly cited include: the precaution adoption process model, the extended parallel process model, the theory of reasoned action, and social cognitive theory. Following a synthesis of the key theory-related outcomes and conclusions, there was broad support for the applicability of the most commonly cited theories (listed above) within this context. However, despite this broad support for the applicability of these theories, several reviews reported low levels of explicit use of behaviour change theories in the research they cited. Taken together, these results suggest that the most commonly cited theories reported herein represent an excellent starting point for practitioners and public health professionals looking to model and enact behaviour change in the context of infectious disease and emergency response. This point is explored in more detail in the recommendations subsection of the discussion.
To the best of the authors' knowledge, this scoping review represents the first attempt to systematically collate review data concerning the application of behaviour change theories within a broad infectious disease and emergency response context. In this review we have synthesised the health behaviours, theories, and applications presented across 13 review papers, drawn from both peer-reviewed journals and grey literature, in the context of infectious disease and emergency response. This synthesis enables us to provide some key recommendations and suggestions for infectious disease modellers, researchers, and public health professionals looking to apply behaviour change theory to understand and influence behaviour in this context. Looking across the 13 reviews included in our synthesis provides a clear picture of the typical use of behaviour change theories in the current context. Firstly, many papers included within these reviews do not seem to be explicitly based on a specific theory of behaviour change. Secondly, whether we take a high-level approach (i.e., the number of theories cited by multiple reviews) or a more granular approach (i.e., the number of citations per theory across all reviews) to the synthesis, the conclusions are broadly the same. As per Michie and colleagues, only a small number of theories were most commonly cited despite the broad number of theories available (83 theories detailed within [22], and 44 distinct theories cited at least once within our included review papers, although these may not all be present in Michie and colleagues' work [22]). Specifically, three theories stand out as the most commonly applied: the health belief model, the theory of planned behaviour, and protection motivation theory. Another four theories are also repeatedly, but less frequently, cited: the precaution adoption process model, the extended parallel process model, the theory of reasoned action, and social cognitive theory.
Of these seven theories, four are consistent with the most frequently used theories as identified by Michie and colleagues [22] (the theory of planned behaviour, the health belief model, the precaution adoption process model, and social cognitive theory), and with some of the theories cited as frequently used across other health behaviour contexts (e.g., [27, 28]: the health belief model, the theory of planned behaviour, and social cognitive theory). Of the remaining theories, one (the theory of reasoned action) is closely linked to the theory of planned behaviour (the latter having developed from the former [57]). The final two most cited theories, protection motivation theory and the extended parallel process model, are closely related (the latter builds on the former [61]), and both are particularly concerned with the processes underlying fear appeals and threat messaging, a focus with clear relevance in the context of this review. Having identified the commonly applied theories, we subsequently conducted a rapid synthesis of the included reviews' key outcomes with regard to the utility of behaviour change theory. On the basis of this synthesis, we draw two key conclusions. Firstly, the most frequently cited theories do find broad (though not universal) support within the cited literature. Secondly, despite this broad applicability, several reviews cited herein highlight the relative absence of behaviour change theory within research conducted in this context [40, 41, 44, 46, 50]. Indeed, several of the reviews included herein explicitly advocate further work to study the thorough application of both individual theories and a range of multivariable theories, considering the interrelationships between model factors/components (e.g., [37, 38, 40, 41, 44-46, 50]).
overall, a clear take-home message from the current review is that there are a range of commonly applied behaviour change theories with broad support for their use within an infectious disease and emergency response context. in the following section, these findings are used to form the basis for recommendations concerning the use of behaviour change theory by researchers, practitioners, and modellers in both research and practice. based on the results of our synthesis, there are two broad categories of recommendations for future work applying and incorporating behaviour change theories into both: 1) research and practice (i.e., understanding behaviour, and developing & deploying effective interventions), and; 2) infectious disease modelling. furthermore, although most of the literature screening and synthesis for this review was conducted prior to the covid-19 pandemic, we are aware of some recent and relevant work detailing recommendations and guidance for the use of behavioural science to tackle covid-19 in practice. some of this work has also been incorporated into this section to signpost interested readers to additional material of relevance. behavioural science can play a critical role in combatting the effects of a global pandemic, such as covid-19, both through informing our understanding of public perceptions of the virus, and through developing interventions to reduce barriers and facilitate uptake of recommended behaviours [13] . indeed, in the context of covid-19, west and colleagues indicate that in the absence of robust intervention data, behaviour change theories and constructs should be used to inform the development of policy and practice for increasing uptake of self-protective behaviours [67] . 
unfortunately, however, multiple reviews cited herein lament the absence of behaviour change theory in work conducted to date [40, 41, 44, 46, 50], and advocate for the more in-depth study of various behaviour change theories within this context [37, 38, 40, 41, 44-46, 50]. our primary recommendation therefore reinforces that advocated in the literature cited herein. specifically, we recommend that researchers and practitioners working in the context of infectious disease and emergency response should: a) draw on the available theoretical literature, and; b) work with experts in behavioural science to inform both empirical work to understand behaviour and the design and implementation of interventions to affect behaviour. this primary recommendation is supported by several additional recommendations, which are unpacked in the subsequent paragraphs. in describing the way forward for behaviour change theorising, michie and colleagues [22] note that the most popular behaviour change theories are relatively context agnostic, and that there may be important insights to be taken from models that are more context dependent. in order to enhance the translation of research into public health practice, we therefore recommend that future research and intervention development should consider both general theories and the theories that most closely fit the context of study. the theories identified within this review represent a mix of context-agnostic theories (e.g., the health belief model, the theory of planned behaviour) and some more context-specific theories (e.g., protection motivation theory and the extended parallel process model), and would therefore seem to be a good starting point for researchers and practitioners working in this area.
furthermore, although not a theory identified within the current review (see the limitations section), the com-b (capability, opportunity, motivation and behaviour) model [68] has been advocated as a key starting point for interventions to reduce the transmission of severe acute respiratory syndrome coronavirus 2 (sars-cov-2) during the covid-19 pandemic (e.g., [67]). we would therefore further recommend considering the application of this model when either conducting research or translating research into practice. while these recommendations are consistent with the medical research council guidelines [21], and echo recent work by michie and colleagues (e.g., [22, 23]), it is critically important to approach the incorporation of behaviour change theory, particularly within intervention design, in a systematic fashion [23]. indeed, one review included within our synthesis found that theory had a positive impact on the success of interventions when used at the design and development stage, relative to theory used only at the evaluation stage [37]. to facilitate the systematic and appropriate use of behaviour change theory, we therefore strongly recommend that researchers and practitioners involve expert behavioural scientists in the design and implementation process, and also make use of available guidance within the behaviour change literature. for example, we are aware of a thorough guide to designing interventions using the behaviour change wheel [23] that may be of use to practitioners. we also echo michie and prestwich's recommendation for researchers to use their theory coding scheme (a list of 19 items for coding the use of theory within intervention design) to both: a) allow researchers and practitioners to systematically assess the incorporation of theory within existing interventions, and; b) facilitate transparent reporting of the incorporation of theory within novel interventions [69].
using these resources will enable researchers and practitioners to overcome the limited use of theory acknowledged by the papers included in this review [40, 41, 44, 46, 50] while still avoiding "it seemed like a good idea at the time" ( [23] , p14) interventions. lastly, we include a note on the role of behavioural science in combatting the covid-19 pandemic. helpfully, a range of prominent behavioural scientists have developed guidelines and recommendations for the application of behavioural science within the context of covid-19. for example, michie and colleagues advocate four principles to help reduce transmission by effecting behaviour change [70] ; the british psychological society have compiled a list of nine recommendations to optimise the effectiveness of changing policy and communication/ guidance [71] , and; west et al. draw upon com-b to provide an account of components that need to be addressed in order to increase uptake of specific covid-19 transmission-reduction behaviours [67] . while this is by no means an exhaustive list, it does reinforce the importance of explicitly and systematically incorporating behavioural science into research and practice within the context of infectious disease and emergency response. alongside our recommendations detailed above, we therefore further advocate that interested readers explore these principles and guidelines in more detail. in our sister review [26] we find that although the 'gold standard' for incorporating protective behaviour into infectious disease modelling involves the incorporation of a range of cognitive and social constructs, there is very little explicit reference to well-recognised theories of behaviour change (indeed, only five of 42 included papers make any such reference [26] ). 
acknowledging the necessary trade-off between accurately modelling human behaviour and the computational demands of such modelling [72], weston and colleagues echo previous recommendations for modellers to familiarise themselves with the relevant behaviour change literature in order to improve their awareness of the main factors underlying human behaviour [24, 73]. as a result, the recommendation was made for infectious disease modellers to closely consult the psychological literature concerning the predictors of health behaviour when developing their models [26]. based on the outcomes of the current review, we can provide further guidance for infectious disease modellers on both where to begin with this familiarisation and how/where to involve behavioural scientists in the process. firstly, in both the current review and weston and colleagues' work, the health belief model is the most commonly cited behaviour change theory. we therefore agree that the health belief model represents an appropriate base on which to build infectious disease models incorporating human behaviour [26]. nevertheless, as noted in the modelling review, there is a broad range of additional factors, including emotional and social constructs, that should be more fully considered when representing infection prevention behaviour [26]. given the emphasis on protection motivation theory and the extended parallel process model within the literature reported in the current review, we first suggest that infectious disease modellers should consider these models alongside the health belief model to help improve the modelling of emotional responding and defensive avoidance behaviour (but see also bish & michie for further recommendations concerning the use of parallel processing models to incorporate cognitive and emotional constructs [38]).
similarly, we would also recommend infectious disease modellers use social cognitive theory, identified as a prominent public health behaviour change theory in the current review, as another starting point for behavioural model formulation. by more fully considering these theories and associated literature, infectious disease modellers will be well prepared to accurately and precisely model a range of relevant social, cognitive, and emotional constructs that may be associated with behavioural responses to a public health emergency. although this familiarisation exercise will be invaluable in helping modellers to develop a deeper understanding of the factors underlying behaviour change and is consistent with recommendations from previous literature as outlined above, it is pertinent to echo a recommendation made by michie and colleagues in the context of intervention design. that is, it is important to ensure that the theory selected is appropriate for the type of behaviour in question. for example, if a behaviour is more likely to be influenced by habitual factors then models concerning deliberative and reflective processing are less likely to be relevant [22] . as this current review has focused on identifying the broad state of the art for incorporating behaviour change theory within infectious disease and emergency contexts, a full consideration of the appropriate theories for each individual behaviour in each specific disease or emergency context is unfortunately outside scope. however, we recommend that infectious disease modellers work closely with behavioural scientists in the design and development of their models to ensure that the most appropriate theories are being consulted and incorporated for a given target behaviour or context. 
through this greater collaboration between modellers and behavioural scientists, the discipline will be able to develop a more in-depth understanding of the requirements for behavioural theory and the computational limitations to their incorporation within infectious disease models. although the review reported herein represents an impressive effort at addressing a herculean task (namely, the collation of psychological behaviour change theories applied across infectious disease and emergency response contexts), as with any large-scale review there are inevitably trade-offs and potential limitations that should be considered when interpreting the outcomes and recommendations of this review. firstly, although other emergency contexts are represented within the current review, we acknowledge that the papers included within this review are predominantly focused on infectious diseases. although the search strategy did include terms relating to other specific civil emergencies (e.g., terrorism), and emergencies generally (e.g., emergency response, emergency resilience), there were more search terms relating to infectious disease outbreaks/pandemics. we therefore recommend that future reviews in this area should utilise search strategies optimised more clearly to reflect the full breadth of public health emergency contexts. secondly, and similarly to the first limitation, we are aware of some prominent behaviour change theories (e.g., com-b) that are not represented in the reviews cited herein. although this may represent a limitation of our search strategy, generic phrases and subject headings relating to theoretical models were included to ensure a breadth of focus.
furthermore, as our focus was on identifying review articles that have themselves collated primary research involving behaviour change theories in the context of infectious disease and emergency response, it follows that any prominently applied theories should also have been represented within our sample regardless of the specific search terms we used. we are therefore confident that the theories identified herein represent an accurate overview of the most commonly cited theories within this specific context over the period in question (i.e., pre covid-19). thirdly, we acknowledge that several of the reviews included herein were authored by some of the same individuals [38-41, 48, 50], with at least one [39] explicitly cited as an update of another [38]. although this may influence the macro-representation of theories across reviews (i.e., at review level), our decision to also examine the frequency of theory use at a micro-level (i.e., at the level of individual cited studies, excluding repeat citations across reviews), with similar results, mitigates the likely impact of this. nevertheless, we acknowledge that frequency of theory use, a key outcome within the current review, is not the same thing as the contextual relevance of theory within these contexts. however, the purpose of this review was to identify the most commonly employed theories of behaviour change within the infectious disease and emergency response contexts. given this focus, we believe that the emphasis on theory frequency within the current review is well founded. nevertheless, we do also provide a synthesis and summary of the key outcomes and conclusions in order to further guide researchers and mathematical modellers to the points of commonality and divergence within the extant review literature. we hope that this review will therefore serve as a jumping-off point for further research and modelling work building on our outcomes as detailed in the preceding recommendations section.
finally, given the breadth of available primary literature, the proliferation of available reviews of behaviour change theories, and the specificity of our research question (i.e., to identify commonly applied theories of behaviour change within a public health emergency context), we elected to conduct a review of reviews rather than a systematic review of all literature. similarly, although driven by pragmatic concerns, our decision to conduct our top-up searches using the first 20 pages of google scholar results may have limited the number of potentially relevant manuscripts for screening. furthermore, given both the unclear relevance of, and lack of access to, a number of papers within the current review, we elected to exclude wholesale reviews concerning sexually transmitted infection. although these decisions and outcomes may have narrowed our focus, we argue that the close parallels between the theories identified herein and those remarked as commonplace within previous review work (e.g., [22]) render this concern ill-founded. indeed, when combined, our top-up searches (which were sorted by relevance) allowed us to search through 600 citations published since 2016. the number of unique citations across papers included in our review (see table 2) further suggests that we have succeeded in drawing together a broader range of literature than the independent systematic reviews themselves managed. nevertheless, we consider the current work to be an initial attempt at identifying and integrating the literature applying behaviour change theories in the context of infectious disease and emergency response. we therefore invite and encourage the conduct of a full and systematic review of all primary literature concerning the application of behaviour change theories across public health emergency contexts, using our search strategy and extraction terms as a guide.
behavioural science can play a critical role in combatting the effects of an infectious disease outbreak or public health emergency, such as the covid-19 pandemic. however, the proliferation of available theories, with either general or specific application, can make the landscape confusing for researchers, infectious disease modellers, and public health practitioners alike. in an effort to simplify the considerable behaviour change literature for ease of use by public health emergency researchers, we conducted a systematised scoping 'review of the reviews' concerning the application of behaviour change theories in infectious disease and emergency response contexts. our search strategies revealed 13 relevant review papers from which we were able to identify and collate the seven most commonly cited and applied behaviour change theories in this context: the health belief model, the theory of planned behaviour, and protection motivation theory as the most commonly applied, followed by the precaution adoption process model, the extended parallel process model, the theory of reasoned action, and social cognitive theory. following a synthesis of the key theory-related outcomes and conclusions, we conclude that while there is broad support for the use of the most commonly cited theories within this context, the previous application of these theories within the literature is patchy. that is, much research in this context has not drawn on relevant theories of behaviour change. based on these identified theories and our synthesis of review outcomes, and in conjunction with a recent review by weston and colleagues [26], we make recommendations to assist researchers, intervention designers, and mathematical modellers to incorporate psychological behaviour change theories within infectious disease and emergency response contexts.
first, we echo previous recommendations that future research and intervention design within this context should be based explicitly and systematically on relevant behaviour change theories, and developed in close consultation with experts in behavioural science. the theories identified herein represent an excellent starting point for this work, and we further signpost the reader to both general materials to aid in intervention design and guiding principles for practitioners and researchers working on covid-19. second, we recommend that mathematical modellers should consult the theories identified herein, and work closely with behavioural scientists to familiarise themselves with the key factors underlying behaviour change within an infectious disease and emergency response context. considered together, the results and recommendations reported herein therefore represent an important resource to enable researchers, modellers, and practitioners working in the context of infectious disease and emergency response to better incorporate a systematic and evidence-based consideration of human behaviour into their work.

references:
- national risk register of civil emergencies, 2017 edition. london: cabinet office
- ebola in the democratic republic of the congo: health emergency update. world health organisation
- middle east respiratory syndrome coronavirus (mers-cov)
- the 2009 influenza pandemic: an independent review of the uk response to the 2009 influenza pandemic. london: cabinet office
- estimated global mortality associated with the first 12 months of 2009 pandemic influenza a h1n1 virus circulation: a modelling study
- influenza (flu)
- global rise in human infectious disease outbreaks
- the human cost of natural disasters: a global perspective. the united nations office for disaster risk reduction, centre for research on the epidemiology of disasters
- communicating risk in public health emergencies: a who guideline for emergency risk communication (erc) policy and practice. geneva: world health organisation
- behavioural science must be at the heart of the public health response to covid-19
- the effect of communication during mass decontamination
- adherence to antimicrobial inhalational anthrax prophylaxis among postal workers
- public perceptions, anxiety, and behaviour change in relation to the swine flu outbreak: cross sectional telephone survey
- protection motivation theory and social distancing behaviour in response to a simulated infectious disease epidemic
- likely uptake of swine and seasonal flu vaccines among healthcare workers: a cross-sectional analysis of uk telephone survey data
- predictors of self and parental vaccination decisions in england during the 2009 h1n1 pandemic: analysis of the flu watch pandemic cohort data
- mathematical modelling of infectious diseases
- developing and evaluating complex interventions: the new medical research council guidance
- abc of behaviour change theories
- the behaviour change wheel
- agent-based modelling of epidemic spreading using social networks and human mobility patterns
- coupled contagion dynamics of fear and disease: mathematical and computational explorations
- infection prevention behaviour and infectious disease modelling: a review of the literature and recommendations for the future
- theories of behaviour and behaviour change across the social and behavioural sciences: a scoping review
- the role of behavioral science theory in development and implementation of public health interventions
- health behavior and health education: theory, research, and practice
- the use of theory in health behavior research from 2000 to 2005: a systematic review
- a behavioral-ecological model of adolescent sexual development: a template for aids prevention
- community strategies for the reduction of youth drinking: theory and application
- to walk or not to walk? the hierarchy of walking needs
- methodology in conducting a systematic review of systematic reviews of healthcare interventions
- cochrane handbook for systematic reviews of interventions
- scoping studies: towards a methodological framework
- systematic literature review to examine the evidence for the effectiveness of interventions that use theories and models of behaviour change: towards the prevention and control of communicable diseases
- demographic and attitudinal determinants of protective behaviours during a pandemic: a review
- demographic and attitudinal determinants of protective behaviours during a pandemic. london: department of health
- factors associated with uptake of vaccination against pandemic influenza: scientific evidence base review. london: department of health
- factors associated with uptake of vaccination against pandemic influenza: a systematic review
- compliance with anti-h1n1 vaccine among healthcare workers and general population
- the determinants of 2009 pandemic a/h1n1 influenza vaccination: a systematic review
- perceptions and behavioural responses of the general public during the 2009 influenza a (h1n1) pandemic: a systematic review
- application of behavioral theories to disaster and emergency health preparedness: a systematic review
- risk perceptions related to sars and avian influenza: theoretical foundations of current empirical research
- acceptance of a pandemic influenza vaccine: a systematic review of surveys of the general public
- factors influencing pandemic influenza vaccination of healthcare workers: a systematic review
- preferred reporting items for systematic reviews and meta-analyses: the prisma statement
- using behavior change frameworks to improve healthcare worker influenza vaccination rates: a systematic review
- barriers of influenza vaccination intention and behavior: a systematic review of influenza vaccine hesitancy
- human response to emergency communication: a review of guidance on alerts and warning messages for emergencies in buildings
- expanding protection motivation theory: investigating an application to animal owners and emergency responders in bushfire emergencies. advances in human factors in simulation and modeling, ahfe 2018, advances in intelligent systems and computing
- the role of social identity processes in mass emergency behaviour: an integrative review
- historical origins of the health belief model
- the theory of planned behavior
- a protection motivation theory of fear appeals and attitude change
- a model of the precaution adoption process: evidence from home radon testing
- the common-sense model of regulation of health and illness
- putting the fear back into fear appeals: the extended parallel process model
- belief, attitude, intention, and behaviour: an introduction to theory and research
- social foundations of thought and action: a social cognitive theory
- transtheoretical therapy: toward a more integrative model of change
- diffusion of innovations
- health program planning: an educational and ecological approach
- applying principles of behaviour change to reduce sars-cov-2 transmission
- the behaviour change wheel: a new method for characterising and designing behaviour change interventions
- are interventions theory-based? development of a theory coding scheme
- slowing down the covid-19 outbreak: changing behaviour by understanding it
- behavioural science and disease prevention: psychological guidance
- incorporating individual health-protective decisions into disease transmission models: a mathematical framework
- nine challenges in incorporating the dynamics of behaviour in infectious diseases models

acknowledgements: the authors would like to acknowledge charlotte hall for providing additional support during the redrafting of this manuscript.

supplementary information: supplementary information accompanies this paper at https://doi.org/10.1186/s12889-020-09519-2. additional file 1: supplementary information 1, search terms for the original selection process. additional file 2: supplementary information 2, summary of key theory-related conclusions. additional file 3: supplementary information 3, breakdown of: a) how many/which theories are presented across multiple reviews, and; b) how many times each theory is cited by a unique article across all reviews.

authors' contributions: dw conceived of the project, secured funding, contributed to the design, contributed to developing the search strategy and exclusion & inclusion criteria, provided full-text review of potential papers, read all included papers, re-analysed and re-extracted data following the initial extraction and analysis, and substantially revised the initial draft of the manuscript. dw also led on manuscript revisions, including the rapid synthesis of outcomes/conclusions, the screening of 200 citations for the top-up search, the screening of 400 citations for the optimised top-up search, and the extraction and analysis of all subsequent data. ai contributed to developing the search strategy and exclusion & inclusion criteria, conducted the review (i.e., ran the searches and conducted title/abstract and full-text review), conducted initial data extraction and analysis, prepared an initial draft of the manuscript, and developed the strategy for the top-up search. ra contributed to the conception, design, and securing of funding for the project and commented on drafts of the paper. all authors approved the authorship order, and all authors read and approved the final manuscript.

availability of data and materials: all data generated or analysed during this study are included in this published article and its supplementary information files.

ethics approval and consent to participate: not applicable. consent for publication: not applicable. competing interests: the authors declare that they have no competing interests.
key: cord-312911-nqq87d0m authors: zou, d.; wang, l.; xu, p.; chen, j.; zhang, w.; gu, q. title: epidemic model guided machine learning for covid-19 forecasts in the united states date: 2020-05-25 journal: nan doi: 10.1101/2020.05.24.20111989 sha: doc_id: 312911 cord_uid: nqq87d0m we propose a new epidemic model (sueir) for forecasting the spread of covid-19, including numbers of confirmed and fatality cases at national and state levels in the united states. specifically, the sueir model is a variant of the seir model by taking into account the untested/unreported cases of covid-19, and trained by machine learning algorithms based on the reported historical data. besides providing basic projections for confirmed and fatality cases, the proposed sueir model is also able to predict the peak date of active cases, and estimate the basic reproduction number (r0). in particular, the forecasts based on our model suggest that the peak date of the us, new york state, and california state are 06/01/2020, 05/10/2020, and 07/01/2020 respectively. in addition, the estimated r0 of the us, new york state, and california state are 2.5, 3.6 and 2.2 respectively. the prediction results for all states in the us can be found on our project website: https://covid19.uclaml.org, which are updated on a weekly basis, and have been adopted by the centers for disease control and prevention (cdc) for covid-19 death forecasts (https://www.cdc.gov/coronavirus/2019-ncov/covid-data/forecasting-us.html). one of the best ways to prevent the spread of covid-19 in the short term is to follow the mitigation strategies such as social distancing, quarantine, and isolation. for example, the state of california in the us has issued mandatory stay-at-home order on march 19, shutting down all non-essential businesses. 
only essential services, such as grocery stores, pharmacies, and delivery restaurants, have remained open, and residents who need to leave home to take part in essential activities are advised to practice social distancing. with the increasing availability of public data on covid-19, more and more studies (flaxman et al., 2020; bendavid et al., 2020; sutton et al., 2020; altieri et al., 2020; bertozzi et al., 2020; murray et al., 2020) have been carried out to understand and prevent the spread of covid-19 from different aspects. among them, one important research direction is to model and forecast the spread of covid-19, such as predicting the peak of active cases and the size of the coronavirus outbreak. such results can help government agencies better understand the overall impact of the disease and also assist policy makers with pandemic preparedness and response, such as allocating medical resources. one widely used method for modeling the spread of infectious disease is to use epidemic models such as the susceptible-infected-removed (sir) model (kermack and mckendrick, 1927) and the susceptible-exposed-infected-removed (seir) model (hethcote, 2000). such epidemic models are quite useful in describing the dynamics of transmission and are well suited for predicting the peak of active cases. from the decision-making perspective, the peak prediction is able to forewarn the health system when to expect a surge in cases. furthermore, the reproduction number (fraser et al., 2009) estimated by the epidemic model can be directly used to measure the effectiveness of intervention strategies such as social distancing and quarantine.
several recent works used epidemic models such as the sir and seir models (li et al., 2020a; wu et al., 2020; kucharski et al., 2020; read et al., 2020; tang et al., 2020; ferguson et al., 2020) to simulate the spread of covid-19 in different regions and were able to forecast the size and severity of such epidemic outbreaks. some other works (chinazzi et al., 2020; kraemer et al., 2020; dandekar and barbastathis, 2020) also applied these epidemic models to study the role of quarantine controls, such as travel restrictions, in the spread of covid-19. most of the aforementioned studies consider classical epidemic models, e.g., the sir and seir models, and base their analyses on the publicly reported data. however, it is often the case that the number of publicly reported cases (including confirmed cases and recovered cases) is much smaller than the real number, as many infectious cases have not been tested due to limited testing capacity and asymptomatic patients, or are possibly under-reported (li et al., 2020b). as a result, classical epidemic models such as sir and seir cannot accurately characterize the epidemic evolution of covid-19 without taking such unreported cases into consideration. in addition, most existing work is focused on nation-wide prediction. nevertheless, it is also very important and beneficial to provide state- and county-level forecasts to assist local public health departments and governments in preventing the spread of covid-19. the goal of this paper is to make good use of the current public data on covid-19 to better understand the spread of the coronavirus and to facilitate informed decisions by policy makers. in order to achieve this goal, we develop a new epidemic model, called the sueir model, to forecast the active cases and deaths of covid-19 by taking the untested/unreported cases into consideration. in addition, we use machine learning based methods to train our model, which enables us to train the model efficiently.
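the general idea of training an epidemic model on reported data can be illustrated with a toy calibration exercise. the sketch below is not the paper's sueir training procedure; it simply fits the contact rate and removal rate of a basic sir model (the classical model introduced in the next subsection) to a synthetic daily case curve by grid search over the squared prediction error. all parameter values, grid ranges, and function names are illustrative assumptions.

```python
def sir_infected_curve(n, beta, gamma, i0, days):
    """Daily forward-Euler SIR; returns the infectious count per day."""
    s, i = float(n - i0), float(i0)
    curve = [i]
    for _ in range(days):
        ds = -beta * s * i / n      # susceptibles infected this day
        di = -ds - gamma * i        # net change in the infectious group
        s, i = s + ds, i + di
        curve.append(i)
    return curve

def fit_sir(observed, n, i0):
    """Grid search over (beta, gamma) minimizing squared prediction error."""
    best, best_err = None, float("inf")
    for b100 in range(10, 100, 2):      # beta in 0.10 .. 0.98
        for g100 in range(5, 50, 2):    # gamma in 0.05 .. 0.49
            b, g = b100 / 100, g100 / 100
            pred = sir_infected_curve(n, b, g, i0, len(observed) - 1)
            err = sum((p - o) ** 2 for p, o in zip(pred, observed))
            if err < best_err:
                best, best_err = (b, g), err
    return best

# synthetic "reported" curve generated from known parameters, then recovered
truth = sir_infected_curve(1_000_000, 0.40, 0.15, 50, 60)
beta_hat, gamma_hat = fit_sir(truth, 1_000_000, 50)
```

in practice one would replace the exhaustive grid with a gradient-based or quasi-newton optimizer and fit against real confirmed-case counts, but the objective (squared error between model trajectory and reported data) is the same.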
based on the proposed model, we are able to make accurate predictions of the numbers of confirmed and fatality cases at the national, state, and county levels. moreover, our model can also predict the peak dates of active cases and estimate the basic reproduction number (r_0) of different states in the us. in this subsection, we briefly introduce two classical epidemic models, i.e., the sir (kermack and mckendrick, 1927) and seir (hethcote, 2000) models, which have been adopted in many previous works to study epidemic outbreaks such as sars (fang et al., 2006; saito et al., 2013; smirnova et al., 2019) and the ongoing covid-19 (read et al., 2020; tang et al., 2020; wu et al., 2020). the sir model. the sir model is an epidemic model that describes the change of infection rate over time. more specifically, it characterizes the dynamic interplay among the susceptible individuals (s), infectious individuals (i) and removed individuals (r) (including recovered and deceased) in a certain place. in the sir model, susceptible individuals may become infectious over time, which depends on the spread rate of the virus, often called the contact rate. recovered individuals are assumed to be immune to the virus and thus cannot become susceptible again. to characterize these dynamics, we use s_t, i_t, r_t to represent the numbers of susceptible, infectious, and removed individuals at time t, respectively. suppose that the total population in a certain area is fixed at n; then the evolution of the above variables over time is defined as follows:

ds_t/dt = -β i_t s_t / n,
di_t/dt = β i_t s_t / n - γ i_t,
dr_t/dt = γ i_t,

where β is the contact rate between the susceptible and infectious groups, and γ is the transition rate between the infectious and removed groups. the above ordinary differential equations indicate that at every time unit the total number of susceptible individuals decreases by a quantity β i_t s_t / n, which transits into the infectious group.
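the sir equations above can be checked with a minimal forward-euler simulation; this is an illustrative sketch (parameter values are made up, not taken from the paper), and a standard ode solver would serve equally well:

```python
# forward-euler discretization of the sir dynamics:
#   dS/dt = -beta*I*S/N,  dI/dt = beta*I*S/N - gamma*I,  dR/dt = gamma*I
def simulate_sir(n, i0, beta, gamma, days, dt=1.0):
    s, i, r = n - i0, float(i0), 0.0
    history = [(s, i, r)]
    for _ in range(int(days / dt)):
        new_inf = beta * i * s / n * dt   # susceptible -> infectious
        new_rem = gamma * i * dt          # infectious -> removed
        s -= new_inf
        i += new_inf - new_rem
        r += new_rem
        history.append((s, i, r))
    return history

# illustrative run: population of one million, 100 seed infections
hist = simulate_sir(n=1_000_000, i0=100, beta=0.3, gamma=0.1, days=160)
```

the euler step with daily increments keeps the sketch dependency-free; the three compartments always sum back to n, since every outflow from one group is an inflow to another.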
apart from the increase due to the transition of susceptible individuals, the size of the infectious group also decreases at rate γ (the term γ i_t). the seir model. for many diseases, there is often an incubation period during which individuals who have been exposed to the virus may not be as contagious as the infectious individuals. therefore, it is necessary to model these cases separately as the "exposed" group, and this gives rise to the seir model. the seir model introduces a new compartment e_t, which models the number of individuals that have been exposed to the disease but have not developed obvious symptoms. among all the exposed cases, only a σ fraction of people develop observable symptoms in a time unit. therefore, the dynamics of this model can be defined by the following ordinary differential equations:

ds_t/dt = -β i_t s_t / n,
de_t/dt = β i_t s_t / n - σ e_t,
di_t/dt = σ e_t - γ i_t,
dr_t/dt = γ i_t.

compared with the sir model, the seir model has more elaborate model parameters. the parameters σ, β and γ can be learned from the reported data. reproduction number. an important quantity characterizing the dynamics of a pandemic is the basic reproduction number r_0, which is the expected number of cases directly generated by one case in a population where all individuals are susceptible to infection (fraser et al., 2009). (cc-by-nc-nd 4.0 international license: this preprint is made available under this license by the author/funder, who has granted medrxiv a license to display the preprint in perpetuity; this version posted may 25, 2020; https://doi.org/10.1101) the basic reproduction number in the sir and seir models can be computed as r_0 = β/γ. in this section, we propose a new epidemic model and a machine learning method to train it. it has been observed that covid-19 has an incubation period ranging from 2 to 14 days (lauer et al., 2020), and that individuals who have been exposed to the coronavirus can also infect the susceptible group during this period.
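the seir dynamics described above extend the sir sketch with the exposed compartment; again a toy forward-euler run with illustrative parameters (not values from the paper):

```python
# forward-euler discretization of the seir dynamics: exposed individuals
# develop symptoms at rate sigma before becoming infectious.
def simulate_seir(n, e0, i0, beta, sigma, gamma, days, dt=1.0):
    s, e, i, r = n - e0 - i0, float(e0), float(i0), 0.0
    history = [(s, e, i, r)]
    for _ in range(int(days / dt)):
        new_exp = beta * i * s / n * dt   # susceptible -> exposed
        new_inf = sigma * e * dt          # exposed -> infectious
        new_rem = gamma * i * dt          # infectious -> removed
        s -= new_exp
        e += new_exp - new_inf
        i += new_inf - new_rem
        r += new_rem
        history.append((s, e, i, r))
    return history

r0 = 0.3 / 0.1   # basic reproduction number beta / gamma for sir and seir
hist_seir = simulate_seir(n=1_000_000, e0=200, i0=100,
                          beta=0.3, sigma=0.2, gamma=0.1, days=200)
```

as in sir, the four compartments conserve the total population; the exposed group merely delays entry into the infectious group.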
in addition, it is often the case that the number of reported cases (including confirmed and recovered cases) is smaller than the real number, as many exposed cases have never been tested and thus never pass into the next compartment. however, such important factors cannot be characterized by classical epidemic models such as the sir and seir models. we also observe that directly applying the sir or seir model to fit the reported data leads to unreasonable predictions. therefore, we propose a new epidemic model that takes the untested/unreported cases as well as the "silent spreaders" into consideration. we call our model the sueir model, and it is illustrated in figure 1. in particular, the compartment exposed in our model is considered as the individuals that have already been infected but have not been tested. therefore, they also have the capability to infect susceptible individuals. moreover, some of these individuals can receive a test and be further passed to the infectious compartment (as well as reported to the public), while the others will recover without appearing in the publicly reported cases. therefore, we introduce a new parameter 0 < µ < 1 in the evolution dynamics of i_t to characterize the ratio of exposed cases that are confirmed and reported to the public, which we call the discovery rate. this discovery rate reflects the unreported/undiscovered cases, an important latent factor in the dynamics of the epidemic model. as a result, we propose to use the following ordinary differential equations to describe our
proposed sueir model:

ds_t/dt = -β s_t (e_t + i_t) / n,
de_t/dt = β s_t (e_t + i_t) / n - σ e_t,
di_t/dt = µ σ e_t - γ i_t,
dr_t/dt = γ i_t,    (1)

where β denotes the contact rate between the susceptible and "infected" groups (including both the exposed and infectious compartments in figure 1), σ denotes the ratio of cases in the exposed compartment that are either confirmed as infectious or dead/recovered without confirmation, µ is the discovery rate of the infected cases, and γ denotes the transition rate between compartments i and r. in this subsection, we introduce our proposed machine learning method for training the sueir model, along with the detailed configurations used in our experiments. model training. as mentioned above, our model is described by the ode (1), which is determined by the parameters θ = (β, σ, γ, µ). in particular, given the model parameters θ and initial quantities s_0, e_0, i_0, and r_0, we can compute the number of individuals in each group (i.e., s, e, i, and r) at time t, denoted by s_t, e_t, i_t and r_t, by applying standard numerical ode solvers to the ode (1). then we propose to learn the model parameter θ̂ = (β̂, σ̂, γ̂, µ̂) by minimizing the following logarithmic-type mean square error (mse):

l(θ; i, r) = (1/t) Σ_t [ (log(î_t + p) - log(i_t + p))² + (log(r̂_t + p) - log(r_t + p))² ],    (2)

where i_t and r_t denote the reported numbers of infected and removed cases (including both recovered and fatality cases) at time t (i.e., date), î_t and r̂_t denote the corresponding model outputs, and p is a smoothing parameter used to ensure numerical stability. note that given s_0, e_0, i_0 and r_0, î_t and r̂_t can be described as differentiable functions of the parameter θ. then the model parameter θ̂ = argmin_θ l(θ; i, r) can be learned by applying a standard gradient-based optimizer (e.g., bfgs) to the loss function (2) under the constraint that β, σ, γ, µ ∈ [0, 1]. estimation of the number of removed cases r_t.
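the sueir dynamics and the log-scale loss can be checked end-to-end with a toy fit; the sketch below discretizes the equations with daily steps and searches over β alone by random sampling, standing in for the paper's bfgs over all four parameters (all numbers illustrative, synthetic data generated from known parameters):

```python
import math, random

# logarithmic-type mse in the spirit of eq. (2): compares the reported and
# simulated infectious/removed series on the log scale; p smooths small counts
def log_mse(pred_i, pred_r, obs_i, obs_r, p=1.0):
    t = len(obs_i)
    err = 0.0
    for k in range(t):
        err += (math.log(pred_i[k] + p) - math.log(obs_i[k] + p)) ** 2
        err += (math.log(pred_r[k] + p) - math.log(obs_r[k] + p)) ** 2
    return err / t

# crude daily discretization of the sueir dynamics (a stand-in for the
# paper's numerical ode solver); returns the i and r trajectories
def run_model(beta, sigma, mu, gamma, n0, e0, i0, days):
    s, e, i, r = n0 - e0 - i0, float(e0), float(i0), 0.0
    series_i, series_r = [], []
    for _ in range(days):
        force = beta * s * (e + i) / n0       # both e and i infect s
        s, e, i, r = (s - force,
                      e + force - sigma * e,
                      i + mu * sigma * e - gamma * i,   # only mu*sigma*e is discovered
                      r + gamma * i)
        series_i.append(i)
        series_r.append(r)
    return series_i, series_r

# synthetic "reported" data from known parameters, then a random search
# over beta only (the paper fits beta, sigma, gamma and mu jointly)
true_i, true_r = run_model(0.30, 0.2, 0.5, 0.1, 1e6, 200, 100, 60)
random.seed(0)
best_beta, best_loss = None, float("inf")
for _ in range(200):
    b = random.uniform(0.0, 1.0)
    cand_i, cand_r = run_model(b, 0.2, 0.5, 0.1, 1e6, 200, 100, 60)
    loss = log_mse(cand_i, cand_r, true_i, true_r)
    if loss < best_loss:
        best_beta, best_loss = b, loss
```

note the leak in the e compartment: a (1 − µ) fraction of the σe outflow never enters i or r, which is exactly the unreported mass the model is designed to capture.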
note that i_t and r_t in our model determine the number of "current" infectious cases (a.k.a. active cases) and the removed cases, i.e., the sum of recovered and fatality cases, respectively. however, most of the reported data only include the number of confirmed cases, i.e., the sum of infected and removed cases i_t + r_t. in order to train the model, we need i_t and r_t separately. in addition, the sueir model can only predict the number of removed cases, while in many cases people are more interested in the number of fatality cases. therefore, in order to enable the training of the sueir model, as well as provide predictions for the number of fatality cases, we have to: (1) estimate the number of removed cases; (2) determine the number of active cases in the reported data by subtracting the estimated number of removed cases. to do so, we propose to use an exponential function to model the ratio between the daily increase in fatality cases and that in removed cases, where a, b > 0 are parameters controlling the shape of the exponential function and t denotes the number of days since the starting date. to demonstrate its effectiveness, we evaluate the approximation error on the reported data of four countries: the us, china, france, and italy, which separately report fatality and recovered cases. more specifically, given parameters a, b and the number of fatality cases, we are able to estimate the corresponding number of removed cases. the optimal parameters a and b are then obtained by minimizing the mse between the reported number of removed cases and the estimated one on different dates.
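the excerpt does not show the exact expression of this exponential function; the sketch below assumes the form ratio(t) = a · exp(-b · t), which matches the described shape (a, b > 0), and recovers (a, b) by grid search against a synthetic removed-case series:

```python
import math

# assumed functional form (the exact expression is omitted in this excerpt):
# ratio(t) = a * exp(-b * t), the fraction of the daily increase in removed
# cases that are deaths, with a, b > 0
def ratio(t, a, b):
    return a * math.exp(-b * t)

# synthetic daily new deaths generated from known (a, b) and a removed series
true_a, true_b = 0.4, 0.05
new_removed = [100.0 + 10.0 * t for t in range(30)]
new_deaths = [ratio(t, true_a, true_b) * r for t, r in enumerate(new_removed)]

# estimate removed = deaths / ratio and minimize the mse against the reported
# removed series, mirroring the fitting procedure described above
def removed_mse(a, b):
    return sum((new_deaths[t] / ratio(t, a, b) - new_removed[t]) ** 2
               for t in range(30)) / 30

best_mse, best_a, best_b = min(
    (removed_mse(a / 100, b / 1000), a / 100, b / 1000)
    for a in range(10, 80) for b in range(1, 120)
)
```

with the true parameters inside the grid, the search recovers them essentially exactly; on real data the per-state values around the national optimum would be compared by validation error, as the text describes.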
the results are displayed in figure 2, which clearly shows that the exponential function can describe well the ratio between the daily increased numbers of fatality and removed cases. for each state and county in the us, we try several different choices of a and b around the optimal ones obtained for the us, and pick the one with the smallest validation error. initialization. in terms of initialization, we directly set i_0 = i_0 and r_0 = r_0 (footnote 1: here we omit the numbers of removed and recovered cases at initialization by setting i_0 and r_0 to be the reported numbers of confirmed cases and fatality cases). additionally, one can typically set s_0 + e_0 + i_0 + r_0 = n, where n is the total population of the region (which can be either a country or a state/county). however, since most of the states/counties in the us have already issued the stay-at-home order, the actual total number of cases in the sueir model will be strictly less than n. thus we set s_0 + e_0 + i_0 + r_0 = n_0 for some n_0 < n. moreover, it is worth noting that the initialization of e, i.e., e_0, is tricky, since we do not know the number of infected cases before they are tested. it is not reasonable to set e_0 = 0, since generally a large number of infected cases already exist when the local governments begin testing. therefore, we propose to use a validation set to choose the optimal initial estimates n_0 and e_0 when training our model. to determine the initial values of n_0 and e_0, we first divide our data into a training set and a validation set. in detail, we choose the data in the most recent 7 days as the validation set, while treating the remaining data as the training set.
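the hold-out protocol just described can be sketched as below; the inner "model" here is a made-up placeholder (a trend line whose slope is tied to e_0 / n_0), standing in for a full sueir fit on the training window, so only the split-and-grid-search machinery is real:

```python
# validation protocol sketch: hold out the most recent 7 days, grid-search the
# initial quantities (n0, e0), and keep the pair with the smallest validation loss.
def validation_loss(series, n0, e0):
    train, valid = series[:-7], series[-7:]
    level = sum(train) / len(train)
    slope = e0 / n0                  # hypothetical stand-in for fitted dynamics
    preds = [level + slope * (len(train) + k) for k in range(7)]
    return sum((p - v) ** 2 for p, v in zip(preds, valid)) / 7

series = [10.0 + 3.0 * t for t in range(30)]   # toy reported counts
grid_n0 = [1e4, 1e5, 1e6]
grid_e0 = [1e2, 1e3, 1e4]
best_loss, best_n0, best_e0 = min(
    (validation_loss(series, n, e), n, e) for n in grid_n0 for e in grid_e0
)
```

for each (n_0, e_0) pair a model is trained on the training window and scored on the held-out week; the winning pair, together with its fitted parameters, is then used for prediction.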
for example, suppose we have the data up to may 10, 2020; then the data after may 3, 2020 is used as the validation set, and the data up to may 3, 2020 as the training set. we then do a grid search over different combinations of n_0 and e_0 and train a model on the training set for each. finally, we choose the combination of n_0 and e_0 with the smallest validation loss (evaluated using the loss function (2)), along with the best model parameters (i.e., β, γ, σ, µ), to build the sueir model for prediction. given the initial quantities s_0, e_0, i_0, r_0, we can solve the optimization problem in (2) to obtain the model parameter θ̂ = (β̂, σ̂, γ̂, µ̂). to assess the confidence of our estimator, we construct the confidence interval of θ̂ following previous work (ma, 2020). more specifically, for a valid model parameter θ, we can compute the loss l(θ) in (2) and construct the test statistic t_θ = 2t (l(θ) - l(θ̂)), where t is the number of observations, which represents the log-likelihood ratio between the point estimator θ̂ and θ. note that θ contains four free parameters (i.e., β, σ, γ and µ) while θ̂ is fixed. by wilks's theorem (wilks, 1938), we know that t_θ asymptotically follows a χ²_4 distribution. as a result, we can compare t_θ with the (1 − α) quantile of the χ²_4 distribution and determine whether θ is in the confidence interval or not. in our experiment, we apply grid search on both sides of the point estimator θ̂ to find the boundary of the confidence interval. we can also compute the basic reproduction number based on our proposed sueir model. note that our model has different dynamics from those of the sir and seir models; thus we cannot directly apply the standard computation of r_0 for the sir or seir model. instead, we use the method proposed in heffernan et al. (2005) to calculate r_0 based on the next-generation matrix. specifically, let x = (x_1, . . . , x_4) with x_i being the number of individuals in compartment i.
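the likelihood-ratio acceptance rule behind the confidence interval can be sketched with a toy one-dimensional loss; the quadratic loss and its scale below are illustrative, while the χ²_4 quantile is the tabulated constant:

```python
# keep a candidate theta if 2 * t * (l(theta) - l(theta_hat)) stays below
# the (1 - alpha) quantile of the chi-squared distribution with 4 dof
CHI2_4_Q95 = 9.4877  # 95% quantile of chi^2 with 4 degrees of freedom

def in_confidence_region(loss_theta, loss_hat, t_steps, threshold=CHI2_4_Q95):
    stat = 2.0 * t_steps * (loss_theta - loss_hat)   # wilks test statistic
    return stat <= threshold

# toy illustration: a quadratic loss whose minimum sits at beta = 0.3,
# scanned along one parameter axis as in the boundary grid search
def toy_loss(b):
    return 100.0 * (b - 0.3) ** 2

accepted = [b / 100 for b in range(0, 101)
            if in_confidence_region(toy_loss(b / 100), toy_loss(0.3), t_steps=50)]
```

the accepted interval widens as the loss surface flattens or the number of observed days t shrinks, which is the expected behavior of a wilks-type region.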
then we denote by f_i(x) the rate of new infections in compartment i, and by v⁻_i(x) and v⁺_i(x) the rates at which individuals are transferred out of and into compartment i by all other means, respectively. let f(x) = (f_1(x), . . . , f_4(x)) and v(x) = (v_1(x), . . . , v_4(x)), with v_i(x) = v⁻_i(x) − v⁺_i(x); the ode (1) can then be rewritten as dx/dt = f(x) − v(x). note that the disease-free equilibrium of our model is x* = (n, 0, 0, 0). let f and v be the partial jacobian matrices of the functions f(x) and v(x) with respect to the numbers of individuals in the "infective" compartments (both the e and i compartments in the sueir model), i.e., x_2 and x_3:

f = [ β β ; 0 0 ],    v = [ σ 0 ; −µσ γ ].

then the next-generation matrix g = f v⁻¹ can be computed as

g = [ β/σ + βµ/γ, β/γ ; 0, 0 ].

note that r_0 is given by the largest eigenvalue of the next-generation matrix g (heffernan et al., 2005). therefore, the basic reproduction number of our proposed sueir model is r_0 = β/σ + βµ/γ. in contrast, the basic reproduction number for sir and seir is r_0 = β/γ. in this section, we present the forecast results of our method, including confirmed cases and deaths, peak dates and reproduction numbers. data collection. we use the data from the johns hopkins university center for systems science and engineering² (dong et al., 2020) to train our model for national-level forecasts. to train the state-level models, we use the data from the new york times³. in addition, we use the data from 03/22/2020 (most states had already issued the stay-at-home order by this date) to 05/10/2020 to train our models.
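the next-generation construction can be checked numerically; the f and v blocks below follow the description above (jacobians at the disease-free equilibrium s = n, restricted to the e and i compartments), with r_0 taken as the largest eigenvalue of g = f v⁻¹:

```python
import math

# next-generation matrix for the sueir model at the disease-free equilibrium:
#   F = [[beta, beta], [0, 0]],  V = [[sigma, 0], [-mu*sigma, gamma]]
def sueir_r0(beta, sigma, gamma, mu):
    F = [[beta, beta], [0.0, 0.0]]
    V = [[sigma, 0.0], [-mu * sigma, gamma]]
    det = V[0][0] * V[1][1] - V[0][1] * V[1][0]
    Vinv = [[V[1][1] / det, -V[0][1] / det],
            [-V[1][0] / det, V[0][0] / det]]
    G = [[sum(F[r][k] * Vinv[k][c] for k in range(2)) for c in range(2)]
         for r in range(2)]
    # eigenvalues of the 2x2 matrix G via trace/determinant; r0 is the larger
    tr = G[0][0] + G[1][1]
    dg = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    disc = math.sqrt(max(tr * tr - 4 * dg, 0.0))
    return max((tr + disc) / 2, (tr - disc) / 2)
```

since the second row of g is zero, its eigenvalues are 0 and the top-left entry, so the routine reduces to β/σ + βµ/γ; setting µ = 1 and letting σ grow recovers the sir-like β/γ behavior.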
more specifically, we use the reported data from 03/22/2020 to 05/03/2020 to train the sueir model, and the data from 05/04/2020 to 05/10/2020 for validation. prediction results. in the interest of space, we present the forecast results of our models for the us and for states with more than 40,000 total cases, including new york, new jersey, illinois, massachusetts, california, pennsylvania, michigan, florida and maryland. for more forecast results, please refer to our forecast website https://covid19.uclaml.org. table 1 summarizes the projected deaths and the corresponding 95% confidence intervals in the aforementioned regions from 05/12/2020 to 05/18/2020. the results show that our predictions are very close to the reported data, which suggests that our method performs very well in terms of death forecasts. we also show the long-term death forecasts by our method in table 2. our results suggest that by june 30, the projected death toll for the us is 123.4x10^3 (95% ci 109.7x10^3 - 140.4x10^3). we present the projected numbers of confirmed cases by our approach, along with their 95% confidence intervals, in table 3. (footnote 2: https://github.com/cssegisanddata/covid-19; footnote 3: https://github.com/nytimes/covid-19-data) (table 1: short-term (daily ahead) prediction (x10^3) of total deaths in the us and states with more than 40,000 total cases. for each region, we present the predicted cumulative fatality cases with a 95% confidence interval. the reported number of deaths (ground truth) from the jhu csse (for the us) and the new york times (for different states) is presented right below the predictions.) the results suggest that our predictions of the number of confirmed cases are also reasonably accurate.
for example, the reported number of confirmed cases in the us on 05/18/2020 is 1508x10^3, and our projected number is 1496x10^3 (95% ci 1443x10^3 - 1572x10^3), an underestimate of 12x10^3 cases. in addition, the projected long-term numbers of confirmed cases are presented in table 4. they show that by 06/30/2020, the projected number of confirmed cases for the us is 1900x10^3 (95% ci 1638x10^3 - 2362x10^3). our method can also forecast the peak date in different regions, i.e., the date with the largest number of active cases, as shown in table 5. the projected peak date for the us is 06/01/2020, for new york state 05/10/2020, for new jersey 05/19/2020, for illinois 06/07/2020, [rows of the long-term death-prediction table for ny, nj, il and ma, with 95% confidence intervals, spilled into the text here and are omitted] for massachusetts 05/23/2020, for california 07/01/2020, for pennsylvania 05/20/2020, for michigan 05/11/2020, for florida 06/14/2020, and for maryland 05/27/2020. table 6 summarizes the basic reproduction number r_0 estimated by (4) in different regions, which characterizes the spread of the virus at the beginning of the epidemic. the results vary across states and are consistent with the severity of the coronavirus outbreak in these regions since mid-march. for example, the r_0 values of the states in the northeastern us (e.g., ny: 3.6, nj: 4.5, ma: 4.2) are significantly higher than those of other states (e.g., ca: 2.2, mi: 2.1, fl: 2.4).
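the peak-date projection reduces to an argmax over the simulated active-case series; a toy sir trajectory is enough to show the mechanics (parameters illustrative; the start date 03/22/2020 matches the training window mentioned above):

```python
from datetime import date, timedelta

# the projected peak is the date with the largest number of active cases
def active_peak_date(start, active_series):
    peak_idx = max(range(len(active_series)), key=active_series.__getitem__)
    return start + timedelta(days=peak_idx)

# toy sir trajectory of active cases (illustrative parameters only)
n, s, i, r = 1e6, 1e6 - 100, 100.0, 0.0
beta, gamma = 0.25, 0.1
series = []
for _ in range(300):
    new_inf = beta * i * s / n
    new_rem = gamma * i
    s, i, r = s - new_inf, i + new_inf - new_rem, r + new_rem
    series.append(i)

peak = active_peak_date(date(2020, 3, 22), series)
```

in the paper's setting the trajectory would come from the fitted sueir model rather than this toy sir run, but the peak extraction is the same argmax.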
we developed a novel epidemic model called sueir to infer the unreported cases of individuals contracting covid-19. based on this new model, we further developed a machine learning approach to forecast the numbers of confirmed and fatality cases in the us. our model provides accurate short-term (daily ahead) projections for both confirmed and fatality cases at the national and state levels, which demonstrates its effectiveness. in the long term, the prediction results of our model suggest that the numbers of confirmed cases and deaths will keep increasing rapidly within one month. in particular, our model forewarns that at the end of june there will be approximately 2 million confirmed infectious cases and 120k reported deaths in the united states. (table 3: short-term (daily ahead) prediction (x10^3) of total confirmed cases in the us and states with more than 40,000 total cases. for each region, we present the predicted cumulative confirmed cases with a 95% confidence interval. the ground-truth number from the jhu csse (for the us) and the new york times (for different states) is presented under the row of predictions.) our model uses training data since 03/22/2020, by which time most states had already issued stay-at-home orders, and assumes that the contact rate stays at the same level during the training and prediction periods. however, starting in may, many states have already lifted restrictions on businesses and public spaces and considered reopening, allowing people to go back to restaurants, offices and places of worship. it remains unclear how these reopening orders affect the contact rate and the spread of the virus, and therefore our current model does not take this into consideration. (table 4: long-term (weekly ahead) prediction (x10^3) of total confirmed cases in the us and states with more than 40,000 total cases. for each region, we present the predicted cumulative confirmed cases with a 95% confidence interval.)
moreover, we found that for most states the learned discovery rate (i.e., µ) is less than 0.1, which implies that a large fraction of "exposed" individuals finally recover or die without being tested and reported. this further suggests that the actual number of infected cases in the us may be more than 10 million, most of them uncounted. this result is consistent with recent findings by researchers from the university of southern california, which show that 4.65% (ci: [2.8%, 5.6%]) of los angeles residents have already contracted the covid-19 virus, approximately 23 times more than the officially reported numbers.

references:
- curating a covid-19 data repository and forecasting county-level
- covid-19 antibody seroprevalence in santa clara county
- the challenges of modeling and forecasting the spread of covid-19
- a familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster
- the effect of travel restrictions on the spread of the 2019 novel coronavirus
- quantifying the effect of quarantine control in covid-19 infectious spread using machine learning. medrxiv
- an interactive web-based dashboard to track covid-19 in real time. the lancet infectious diseases
- modelling the sars epidemic by a lattice-based monte-carlo simulation
- estimating the number of infections and the impact of non-pharmaceutical interventions on covid-19 in european countries: technical description update
- pandemic potential of a strain of influenza a (h1n1): early findings. science
- perspectives on the basic reproductive ratio
- the mathematics of infectious diseases
- report 3: transmissibility of 2019-ncov
- a contribution to the mathematical theory of epidemics
- the effect of human mobility and control measures on the covid-19 epidemic in china
- early dynamics of transmission and control of covid-19: a mathematical modelling study. the lancet infectious diseases
- the incubation period of coronavirus disease 2019 (covid-19) from publicly reported confirmed cases: estimation and application
- early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia
- substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov2)
- estimating epidemic exponential growth rate and basic reproduction number
- forecasting covid-19 impact on hospital bed-days, icu-days, ventilator-days and deaths by us state in the next 4 months
- novel coronavirus 2019-ncov: early estimation of epidemiological parameters and epidemic predictions
- extension and verification of the seir model on the 2009 influenza a (h1n1) pandemic in japan
- forecasting epidemics through nonparametric estimation of time-dependent transmission rates using the seir model
- seroprevalence of sars-cov-2-specific antibodies among adults
- universal screening for sars-cov-2 in women admitted for delivery
- estimation of the transmission risk of the 2019-ncov and its implication for public health interventions
- coronavirus disease 2019 (covid-19) situation report
- naming the coronavirus disease (covid-19) and the virus that causes it
- the large-sample distribution of the likelihood ratio for testing composite hypotheses. the annals of mathematical statistics
- nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study

key: cord-156676-wes5my9e
authors: masud, sarah; dutta, subhabrata; makkar, sakshi; jain, chhavi; goyal, vikram; das, amitava; chakraborty, tanmoy
title: hate is the new infodemic: a topic-aware modeling of hate speech diffusion on twitter
date: 2020-10-09
journal: nan
doi: nan
sha:
doc_id: 156676
cord_uid: wes5my9e

online hate speech, particularly over microblogging platforms like twitter, has emerged as arguably the most severe issue of the past decade.
several countries have reported a steep rise in hate crimes fueled by malicious hate campaigns. while the detection of hate speech is one of the emerging research areas, the generation and spread of topic-dependent hate in the information network remain under-explored. in this work, we focus on exploring the user behaviour that triggers the genesis of hate speech on twitter and how it diffuses via retweets. we crawl a large-scale dataset of tweets, retweets, user activity history, and follower networks, comprising over 161 million tweets from more than 41 million unique users. we also collect over 600k contemporary news articles published online. we characterize different signals of information that govern these dynamics. our analyses differentiate the diffusion dynamics in the presence of hate from usual information diffusion. this motivates us to formulate the modelling problem in a topic-aware setting with real-world knowledge. for predicting the initiation of hate speech for any given hashtag, we propose multiple feature-rich models, with the best performing one achieving a macro f1-score of 0.65. meanwhile, to predict the retweet dynamics on twitter, we propose retina, a novel neural architecture that incorporates exogenous influence using scaled dot-product attention. retina achieves a macro f1-score of 0.85, outperforming multiple state-of-the-art models. our analysis reveals the superior power of retina in predicting the retweet dynamics of hateful content compared to existing diffusion models. for the past half-a-decade, in synergy with the socio-political and cultural ruptures worldwide, online hate speech has manifested as one of the most challenging issues of this century, transcending beyond cyberspace. many hate crimes against minority and backward communities have been directly linked with hateful campaigns circulated over facebook, twitter, gab, and many other online platforms [1], [2].
online social media has provided an unforeseen speed of information spread, aided by the fact that the power of content generation is handed to every user of these platforms. extremists have exploited this phenomenon to disseminate hate campaigns to a degree where manual monitoring is too costly, if not impossible. thankfully, the research community has been observing a spike in works related to online hate speech, with a vast majority of them focusing on the problem of automatic detection of hate in online text [3]. however, as ross et al. [4] pointed out, even manual identification of hate speech comes with ambiguity due to differences in the definition of hate. also, an important signal of hate speech is the presence of specific words/phrases, which vary significantly across topics/domains. tracking such a diverse socio-linguistic phenomenon in real time is impossible for automated, large-scale platforms. an alternative approach can be to track potential groups of users who have a history of spreading hate. as matthew et al. [5] suggested, such users are often a very small fraction of the total users but generate a sizeable portion of the content. moreover, the severity of hate speech lies in the degree of its spread, and an early prediction of the diffusion dynamics may help combat online hate speech to a new extent altogether. however, only a tiny fraction of the existing literature seeks to explore the problem quantitatively. matthew et al. [5] laid an insightful foundation for this problem by analyzing the dynamics of hate diffusion on gab¹. however, they do not tackle the problem of modeling the diffusion and restrict themselves to identifying different characteristics of hate speech on gab. hate speech on twitter: twitter, as one of the largest micro-blogging platforms with a worldwide user base, has a long history of accommodating hate speech, cyberbullying, and toxic behavior.
recently, it has cracked down hard on such content multiple times², and a certain fraction of hateful tweets are often removed upon identification. however, a large majority of such tweets still circumvent twitter's filtering. in this work, we choose to focus on the dynamics of hate speech on twitter mainly for two reasons: (i) the widespread usage of twitter compared to other platforms provides scope to grasp the hate diffusion dynamics in a more realistic manifestation, and (ii) it lets us understand how hate speech emerges and spreads even in the presence of some top-down checking measures, in contrast to unmoderated platforms like gab. diffusion patterns of hate vs. non-hate on twitter: hate speech is often characterized by the formation of echo-chambers, i.e., only a small group of people engaging with such content repeatedly. in figure 1, we compare the temporal diffusion dynamics of hateful vs. non-hate tweets (see sections vi-a and vi-b for the details of our dataset and hate detection methods, respectively). following the standard information diffusion terminology, the set of susceptible nodes at any time instance of the spread is defined as all nodes which have been exposed to the information (followers of those who have posted/retweeted the tweet) up to that instant but did not participate in spreading it (did not retweet/like/comment). while hateful tweets are retweeted in significantly higher magnitude compared to non-hateful ones (figure 1(a)), they tend to create fewer susceptible users over time (figure 1(b)). this is directly linked to two major phenomena. primarily, one can relate this to the formation of hate echo-chambers: hateful content is distributed among a well-connected set of users. secondarily, as we define susceptibility in terms of follower relations, hateful content might also be diffusing through connections beyond the follow network, e.g., through paid promotion.
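the susceptibility bookkeeping defined above (exposed followers minus spreaders) can be sketched on a toy cascade; the follower map and retweet events below are entirely made up for illustration:

```python
# follower adjacency: user -> set of followers (hypothetical toy network)
followers = {
    "a": {"b", "c", "d"},
    "b": {"c", "e"},
    "c": {"f"},
}
# (user, timestamp) retweet cascade, original poster first (toy data)
events = [("a", 0), ("b", 1), ("c", 3)]

def susceptible_at(t):
    """susceptible nodes at time t: followers of everyone who has (re)tweeted
    by t, minus the spreaders themselves."""
    spreaders = {u for u, ts in events if ts <= t}
    exposed = set().union(*(followers.get(u, set()) for u in spreaders))
    return exposed - spreaders

s1 = susceptible_at(1)   # after a and b have spread
```

growing the spreader set shifts users out of the susceptible set, which is exactly why echo-chamber cascades (many retweets among already-exposed users) add few new susceptible nodes over time.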
one can also observe differences in the early growth of the two types of information: while hateful tweets acquire most of their retweets and susceptible nodes in a very short time and stall later on, non-hateful ones tend to maintain the spread, though at a lower rate, for a longer time. this characteristic can again be linked to organized spreaders of hate, who tend to disseminate hate as early as possible. topic-dependence of twitter hate: hateful content shows strong topic-affinity: topics related to politics and social issues, for example, incur much more hateful content than sports or science. hashtags on twitter provide an overall mapping of tweets to topics of discussion. as shown in figure 2, the degree of hateful content varies significantly across hashtags. even when different hashtags share a common theme (such as #jamiaunderattack, #jamiaviolence and #jamiacctv), they may still incur different degrees of hate. previous studies [5] tend to label users as hate-preachers irrespective of the topic of discussion. however, as evident in figure 3, the degree of hatefulness expressed by a user depends on the topic as well. for example, while some users resort to hate speech concerning covid-19 and china, others focus on topics around the protests against the citizenship amendment act in india. (figure 3 caption: the color of a cell, corresponding to a user and a hashtag, signifies the ratio of hateful to non-hate tweets posted by that user using that specific hashtag.) exogenous driving forces: with the increasing entanglement of virtual and real social processes, it is only natural that events happening outside social media platforms tend to shape the platforms' discourse. though a small number of existing studies attempt to inquire into such inter-dependencies [6], [7], the findings are substantially motivating for problems related to modeling information diffusion and user engagement on twitter and other platforms.
In the case of hate speech, exogenous signals offer an even more crucial attribute to look into: global context. For both detecting and predicting the spread of hate speech over short tweets, knowledge of context is likely to play a decisive role. Present work: Based on the findings of the existing literature and the analysis presented above, we attempt to model the dynamics of hate speech spread on Twitter. We separate the process of spread into hate generation (asking who will start a hate campaign) and retweet diffusion of hate (who will spread an already started hate campaign via retweeting). To the best of our knowledge, this is the very first attempt at predictive modeling of online hate speech. Our contributions can be summarized as follows: 1) We formalize the dynamics of hate generation and retweet spread on Twitter, subsuming the activity history of each user, signals propagated by the localized structural properties of the information network of Twitter induced by follower connections, as well as global endogenous and exogenous signals (events happening inside and outside of Twitter) (see Section III). 2) We present a large dataset of tweets, retweets, user activity histories, and the information network of Twitter covering versatile hashtags which trended very recently. We manually annotate a significant subset of the data for hate speech. We also provide a corpus of contemporary news articles published online (see Section VI-A for more details). 3) We engineer a rich set of features manifesting the signals mentioned above to design multiple prediction frameworks which forecast, given a user and a contemporary hashtag, whether the user will write a hateful post or not (Section IV). We provide an in-depth feature ablation and ensemble methods to analyze our proposed models' predictive capability, with the best performing one achieving a macro F1-score of 0.65.
4) We propose RETINA (Retweeter Identifier Network with Exogenous Attention), a neural architecture to predict potential retweeters given a tweet (Section V-B). RETINA encompasses an attention mechanism which conditions the prediction of retweeters on a stream of contemporary news articles published online. Features representing the hateful behavior encoded within the given tweet, as well as the activity history of the users, further help RETINA achieve a macro F1-score of 0.85, significantly outperforming several state-of-the-art retweet prediction models. We have made our datasets and code public, along with the necessary instructions and parameters, at https://github.com/lcs2-iiitd/retina. Hate speech detection. In recent years, the research community has been keenly interested in better understanding, detecting, and combating hate speech on online media. Starting with basic feature-engineered logistic regression models [8], [9] up to the latest ones employing neural architectures [10], a variety of automatic online hate speech detection models have been proposed across languages [11]. To determine hateful text, most of these models utilize a static lexicon-based approach and consider each post/comment in isolation. Lacking context (both the individual's prior indulgence in offense and the current world view), models trained on previous trends perform poorly on new datasets. While linguistic and contextual features are essential factors of a hateful message, the destructive power of hate speech lies in its ability to spread across the network. However, only recently have researchers started using network-level information for hate speech detection [12], [13]. Rathpise and Adji [14] proposed methods to handle class imbalance in hate speech classification. A recent work showed how anti-social behavior on social media during COVID-19 led to the spread of hate speech. Awal et al.
[15] coined the term 'disability hate speech' and showed its social, cultural, and political contexts. Ziems et al. [16] explained how COVID-19 tweets increased racism, hate, and xenophobia on social media. While our work does not involve building a new hate speech detection model, hate detection underpins any work on hate diffusion in the first place. Inspired by existing research, we also incorporate hate lexicons as a feature for the diffusion model. The lexicon is curated from multiple sources and manually pruned to suit the Indian context [17]. Meanwhile, to overcome the problem of context, we utilize the timeline of a user to determine her propensity towards hate speech. Information diffusion and microscopic prediction. Predicting the spread of information on online platforms is crucial for understanding network dynamics, with applications in marketing campaigns, rumor spreading/stalling, route optimization, etc. The latest in the family of diffusion models is CHASSIS [18]. On the other end of the spectrum, the SIR model [19] effectively captures the presence of R (recovered) nodes in the system, which are no longer active due to information fatigue³. Even though limited in scope, the SIR model serves as an essential baseline for all diffusion models. Among other techniques, a host of studies employ social media data for both macroscopic (size and popularity) and microscopic (next user(s) in the information cascade) prediction. While highly popular, both DeepCas [20] and DeepHawkes [21] focus only on the size of the overall cascade. Similarly, Khosla et al. [22] utilized social cues to determine the popularity of an image on Flickr. While independent cascade (IC)-based embedding models [23], [24] led the initial work in supervised-learning-based microscopic cascade prediction, they failed to capture the cascade's temporal history (either directly or indirectly). Meanwhile, Yang et al.
[25] presented a neural diffusion model for microscopic prediction, which employs a recurrent neural architecture to capture the history of the cascade. These models focus on predicting the next user in the cascade from a host of potential candidates. In this regard, TopoLSTM [26] considers only the previously seen nodes in any cascade as next-user candidates, without using timestamps as a feature. This approximation works well under limited availability of network information and in the absence of cascade metadata. Meanwhile, FOREST [27] considers all users in the global graph (irrespective of one-hop connectivity) as potential users, employing a time-window based approach. The work by Wang et al. [28] lies midway between TopoLSTM and FOREST, in that it does not consider any external global graph as input but employs a temporal, two-level attention mechanism to predict the next node in the cascade. Zhou et al. [29] compiled a detailed outline of recent advances in cascade prediction. Compared to the models discussed above for microscopic cascade prediction, which aim to answer who will be the next participant in the cascade, our work aims to determine whether a follower of a user will retweet (participate in the cascade) or not. This converts our use case into a binary classification problem and adds negative sampling (in the form of inactive nodes), taking the proposed model closer to a real-world scenario consisting of active and passive social media users. The spread of hate. An exploratory analysis by Mathew et al. [5] revealed exciting characteristics of the breadth and depth of hate vs. non-hate diffusion. However, their methodology separates non-haters from haters and studies the diffusion of the two cascades independently.
Real-world interactions are more convoluted, with the same communication thread containing hateful, counter-hateful, and non-hateful comments. Thus, independent diffusion studies, while adequate for exploratory analysis of hate, cannot be directly extrapolated for predictive analysis of hate diffusion. What is needed is a model that captures the hate signals at the user and/or group level. By taking into account a user's timeline and his/her network traits, we aim to capture more holistic hate markers. Exogenous influence. As early as 2012, Myers et al. [7] revealed that external stimuli drive one-third of the information diffusion on Twitter. Later, Hu et al. [30] proposed a model for predicting user engagement on Twitter that is factored by user engagement in 600 real-world events. From employing world news data to enhance language models [31] to boosting the impact of online advertisement campaigns [32], exogenous influence has been successfully applied in a wide variety of tasks. Concerning social media discourse, both De et al. [33] in opinion mining and Dutta et al. [6] in chatter prediction corroborated the superiority of models that consider exogenous signals. Since our Twitter data was collected based on trending Indian hashtags, it becomes crucial to model exogenous signals, some of which may have triggered a trend in the first place. While a one-to-one mapping of news keywords to trending keywords is challenging to obtain, we collate the most recent (within a time window) news w.r.t. a source tweet as our ground truth. To our knowledge, this is the first retweet prediction model to consider external influence. An information network of Twitter can be defined as a directed graph G = {U, E}, where every user corresponds to a unique node u_i ∈ U, and there exists an ordered pair (u_i, u_j) ∈ E if and only if the user corresponding to u_j follows user u_i. (Table I summarizes important notations.
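The graph definition above can be sketched as a plain adjacency structure. Since an edge (u_i, u_j) exists iff u_j follows u_i, content posted by u_i flows exactly to its out-neighbors. The users below are illustrative stand-ins:

```python
# Sketch of the information network G = {U, E} as an adjacency map.
# out_edges[u_i] holds every u_j with (u_i, u_j) in E, i.e. u_j follows
# u_i, so a tweet by u_i is directly visible to out_edges[u_i]. Every
# edge carries unit weight, so a set suffices. Users are hypothetical.

out_edges = {
    "u1": {"u2", "u3"},   # u2 and u3 follow u1
    "u2": {"u3"},
    "u3": set(),
}

def audience(user):
    """Users to whom a tweet by `user` is directly visible (peer signal)."""
    return out_edges.get(user, set())

print(audience("u1"))  # {'u2', 'u3'} (set order may vary)
```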
) Typically, the visible information network of Twitter does not associate the follow relation with any further attributes; therefore, any two edges in E are indistinguishable from each other. We associate unit weight with every e ∈ E. Every user in the network acts as an agent of content generation (tweeting) and diffusion (retweeting). For every user u_i at time t_0, we associate an activity history H_{i,t}. The information received by user u_i has three different sources: (a) peer signals (S^p_i): the information network G governs the flow of information from node to node such that any tweet posted by u_i is visible to every user u_j if (u_i, u_j) ∈ E; (b) non-peer endogenous signals (S^en): trending hashtags, promoted content, etc. that show up on a user's feed even in the absence of a peer connection; (c) exogenous signals (S^ex): apart from the Twitter feed, every user interacts with external world events directly (as a participant) or indirectly (via news, blogs, etc.). Hate generation. The problem of modeling hate generation can be formulated as assigning to each user a probability that signifies their likelihood of posting a hateful tweet. With our hypothesis of hateful behavior being a topic-dependent phenomenon, we formalize the modeling problem as learning the parametric function P(u_i | T) = F_1(H_{i,t}, S^en, S^ex; θ_1), (1) where T is a given topic, t is the instant up to which we obtain the observable history of u_i, d is the dimensionality of the input feature space, and θ_1 is the set of learnable parameters. Though ideally P(u_i | T) should depend on S^p_i as well, the complete follower network of Twitter remains mostly unavailable due to account settings, privacy constraints, inefficient crawling, etc. Hate diffusion. As already stated, we characterize diffusion as the dynamic process of retweeting in our context. Given a tweet τ(t_0) posted by some user u_i, we formulate the problem as predicting the potential retweeters within the interval [t_0, t_0 + Δt].
Assuming the probability density of a user u_j retweeting τ at time t to be p(t), the retweet prediction problem translates to learning a parametric function of the same general form; Eq. 2 is the general parametric equation describing retweet prediction. In our setting, the signal components S^p_j and H_{j,t}, and the features representing the tweet τ, incorporate the knowledge of hatefulness. Henceforth, we call τ the root tweet and u_i the root user. It should be noted that the features representing the peer, non-peer endogenous, and exogenous signals in Eqs. 1 and 2 may differ due to the difference in problem setting. Beyond organic diffusion. The task of identifying potential retweeters of a post on Twitter is not straightforward. In retrospect, the event of a user retweeting a tweet implies that the user must have been an audience of the tweet at some point in time (similar to the 'susceptible' nodes of contagion spread in the SIR/SIS models [19], [34]). For any user, if at least one of his/her followees engages with the retweet cascade, then that user becomes susceptible. That is, in an organic diffusion, between any two users u_i and u_j there exists a finite path u_i, u_{i+1}, ..., u_j in G such that each user (except u_i) on this path is a retweeter of the tweet by u_i. However, due to account privacy etc., one or more nodes within this path may not be visible. Moreover, content promoted by Twitter, trending topics, and content searched for by users independently may diffuse alongside the organic diffusion path. Searching for such retweeters is impossible without explicit knowledge of these phenomena. Hence, we primarily restrict our retweet prediction to organic diffusion, though we experiment with retweeters not in the visibly organic diffusion cascade to see how our models handle such cases. To realize Eq. 1, we signify topics as individual hashtags.
We rely purely on manually engineered features for this task, so that rigorous ablation study and analysis produce explainable knowledge about this novel problem. The extracted features instantiate the different input components of F_1 in Eq. 1. We formulate the task in a static manner, i.e., assuming that we are predicting at an instant t_0, we want to predict the probability of the user posting a hateful tweet within [t_0, ∞]. While training and evaluating, we set t_0 to be right before the actual tweeting time of the user. The activity history of user u_i, signified by H_{i,t}, is substantiated by the following features:
• We use unigram and bigram features weighted by tf-idf values from the 30 most recent tweets posted by u_i to capture their recent topical interest. To reduce the dimensionality of the feature space, we keep the top 300 features sorted by their idf values.
• To capture the history of hate generation by u_i, we compute two different features over their most recent 30 tweets: (i) the ratio of hateful to non-hateful tweets, and (ii) a hate lexicon vector HL = (h_1, ..., h_{|H|}), where H is a dictionary of hate words and h_i is the frequency of the i-th lexicon entry of H across the tweet history.
• Users who receive more attention from fellow users for hate propagation are more likely to generate hate. Therefore, we take the ratio of retweets of previous hateful tweets to non-hateful ones by u_i. We also take the ratio of the total number of retweets on hateful and non-hateful tweets of u_i.
• Follower count and date of account creation of u_i.
• Number of topics (hashtags) u_i has tweeted on up to t.
We compute doc2vec [35] representations of the tweets, with the hashtags present in them treated as individual tokens. We then compute the average cosine similarity between the user's recent tweets and the word vector representation of the hashtag; this serves as the topical relatedness of the user towards the given hashtag.
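The hate lexicon vector described above can be sketched as a simple frequency count. The toy lexicon and tweets below are placeholders; the actual 209-entry lexicon comes from [17]:

```python
# Sketch: building the hate lexicon vector HL for a user's recent tweet
# history, where h_i is the frequency of the i-th lexicon entry across
# the history. Lexicon entries and tweets are illustrative placeholders.

lexicon = ["slur1", "slur2", "slur3"]   # stands in for the real lexicon

def hate_lexicon_vector(tweets, lexicon):
    tokens = [tok for t in tweets for tok in t.lower().split()]
    return [tokens.count(term) for term in lexicon]

history = ["some slur1 text", "more text slur1 slur3"]
print(hate_lexicon_vector(history, lexicon))   # [2, 0, 1]
```

A real implementation would also need to match multi-word lexicon phrases, which a plain token count misses.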
To incorporate information on trending topics on Twitter, we supply the model with a binary vector representing the top 50 trending hashtags on the day the tweet is posted. We compute the average tf-idf vector over the 60 most recent news headlines from our corpus posted before the time of the tweet; again we select the top 300 features. Using the above features, we implement six different classification models (and their variants). Details of the models are provided in Section VI-C. V. Retweet prediction. While realizing Eq. 2 for retweeter prediction, we formulate the task in two different settings: the static retweeter prediction task, where t_0 is fixed and Δt is ∞ (i.e., all retweeters irrespective of their retweet time), and the dynamic retweeter prediction task, where we predict on successive time intervals. For these tasks, we rely on features both designed manually and extracted in an unsupervised/self-supervised manner. For the task of retweet prediction, we extract features representing the root tweet itself, as well as the signals of Eq. 2 corresponding to each user u_i (for which we predict the possibility of retweeting). Henceforth, we indicate the root user by u_0. Here, we incorporate S^p_i using two different features: the shortest path length from u_0 to u_i in G, and the number of times u_i has retweeted tweets by u_0. All the features representing H_{i,t} and S^en remain the same as described in Section IV. We incorporate two sets of features representing the root tweet τ: the hate lexicon vector, similar to Section IV-A, and the top 300 tf-idf features. We varied the number of features from 100 to 1000, and the best setting was found to be 300. For the retweet prediction task, we incorporate the exogenous signal in two different ways. To implement the attention mechanism of RETINA, we use doc2vec representations of the news articles as well as the root tweet. For the rest of the models, we use the same feature set as Section IV-D. Guided by Eq.
2, RETINA exploits the features described in Section V-A for both static and dynamic prediction of retweeters. Exogenous attention. To incorporate external information as an assisting signal for modeling diffusion, we use a variation of scaled dot-product attention [36] in RETINA (see Figure 4; panel (b), static prediction of retweeters: to predict whether u_j will retweet, the input feature x_{u_j} is normalized and passed through a feed-forward layer, concatenated with x_{t,n}, and another feed-forward layer is applied to predict the retweeting probability p_{u_j}; panel (c), dynamic retweet prediction: RETINA predicts the user retweet probability for consecutive time intervals, and instead of the last feed-forward layer used in the static prediction, we use a GRU layer). Given the feature representation of the tweet x_t and the news feature sequence x_n = {x_{n_1}, x_{n_2}, ..., x_{n_k}}, we compute three tensors Q_t, K_n, and V_n, respectively, as follows: Q_t = x_t ·|(-1,0) W_q, K_n = x_n ·|(-1,0) W_k, V_n = x_n ·|(-1,0) W_v, (3) where W_q, W_k, and W_v are learnable parameter kernels (belonging to the query, key, and value dense layers, respectively, in Figure 4). The operation (·)|(-1,0)(·) signifies tensor contraction according to the Einstein summation convention along the specified axes; in Eq. 3, (-1, 0) signifies the last and first axes of the first and second tensor, respectively. Each of W_q, W_k, and W_v is a two-dimensional tensor with hdim columns (last axis). Next, we compute the attention weight tensor A between the tweet and the news sequence as A = softmax(Q_t ·|(-1,0) K_n^T), (4) where softmax(x[..., i, j]) = e^{x[..., i, j]} / Σ_j e^{x[..., i, j]}. Further, to avoid saturation of the softmax activation, we scale each element of A by hdim^{-0.5} [36].
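The attention computation can be sketched in NumPy. The dimensions follow the paper's settings (doc2vec sizes 50 and 500, hdim = 64), but the kernels are random stand-ins for learned parameters, and the scaling is applied to the scores before the softmax, as in standard scaled dot-product attention:

```python
import numpy as np

# Sketch of RETINA's exogenous attention: scaled dot-product attention
# between one tweet representation x_t and k news representations x_n.
# W_q, W_k, W_v are random stand-ins for the learnable kernels.

rng = np.random.default_rng(0)
d_t, d_n, hdim, k = 50, 500, 64, 8
x_t = rng.normal(size=(d_t,))       # tweet features
x_n = rng.normal(size=(k, d_n))     # k news headline features
W_q = rng.normal(size=(d_t, hdim))
W_k = rng.normal(size=(d_n, hdim))
W_v = rng.normal(size=(d_n, hdim))

Q = x_t @ W_q                       # (hdim,)   -- contractions of Eq. 3
K = x_n @ W_k                       # (k, hdim)
V = x_n @ W_v                       # (k, hdim)

scores = (Q @ K.T) / np.sqrt(hdim)  # scale by hdim^-0.5 against saturation
A = np.exp(scores - scores.max())
A /= A.sum()                        # softmax over the k news items -- Eq. 4
x_tn = A @ V                        # attention-weighted average of V

print(x_tn.shape)  # (64,)
```

Here x_tn is the weighted average of V_n that feeds the prediction layers, as described next.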
The attention weight tensor is then used to produce the final encoded feature representation x_{t,n} by computing the weighted average of V_n as follows: x_{t,n} = A ·|(-1,0) V_n. (5) RETINA is expected to aggregate the exogenous signal exposed by the sequence of news inputs, according to the feature representation of the tweet, into x_{t,n} via the operations in Eqs. 3-5, by tuning the parameter kernels. Final prediction. With S^ex represented by the output of the attention framework, we incorporate the features discussed in Section V-A into RETINA to subsume the rest of the signals (see Eq. 2). For the two separate modes of retweeter prediction (i.e., static and dynamic), we implement two different variations of RETINA. For the static prediction of retweeters, RETINA predicts the probability of each of the users u_1, u_2, ..., u_N retweeting the given tweet, with no temporal ordering (see Figure 4(b)). The feature vector x_{u_i} corresponding to user u_i is first normalized and mapped to an intermediate representation using a feed-forward layer. It is then concatenated with the output of the exogenous attention component, x_{t,n}, and finally another feed-forward layer with sigmoid nonlinearity is applied to compute the probability p_{u_i}. As opposed to the static case, in the dynamic setting RETINA predicts the probability of every user u_i retweeting within a time interval [t_0 + Δt_j, t_0 + Δt_{j+1}], with t_0 being the time at which the tweet was published and Δt_0 = 0. To capture the temporal dependency between predictions in successive intervals, we replace the last feed-forward layer with a gated recurrent unit (GRU), as shown in Figure 4(c). We experimented with other recurrent architectures as well; performance degraded with a simple RNN, and there was no gain with an LSTM. Cost/loss function. In both settings, the task translates to a binary classification problem: deciding whether a given user will retweet or not.
Therefore, we use a standard weighted binary cross-entropy loss L to train RETINA: L = -(1/N) Σ_i [w · t_i log p_i + (1 - t_i) log(1 - p_i)], (6) where t is the ground truth, p is the predicted probability (p_{u_i} in the static and p_{u_i,j} in the dynamic setting), and w is the weight given to the positive samples to deal with class imbalance. We initially started collecting data based on topics, which led to a tweet corpus spanning multiple years. To narrow down our time frame and ease the mapping of tweets to news, we restricted our time span to 2020-02-03 through 2020-04-14 and made use of trending hashtags. Using Twitter's official API⁴, we tracked and crawled trending hashtags each day within this duration. Overall, we obtained 31,133 tweets from 13,965 users. We also crawled the retweeters of each tweet along with the timestamps. Table II gives detailed hashtag-wise statistics of the data. To build the information network, we collected the followers of each user up to a depth of 3, resulting in a total of 41,151,251 unique users in our dataset. We also collected the activity history of the users, resulting in a total of 163,042,612 tweets in our dataset. One should note that the lack of a wholesome dataset (containing textual, temporal, and network signals all in one) is the primary reason why we decided to collect our own dataset in the first place. We also crawled the online news articles published within this span using the news-please crawler [37]. We collected a total of 683,419 news articles for this period; after filtering for language, title, and date, we were left with 319,179 processed items. Their headlines were used as the source of the exogenous signal. We employed three professional annotators with experience in analyzing online hate speech to annotate the tweets manually. All of the annotators belong to the age group 22-27 years and are active on Twitter.
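The weighted loss of Eq. 6 can be sketched in a few lines of pure Python. The positive-class weight here follows the w = λ(log C - log C⁺) recipe described later in the hyperparameter tuning, with toy counts chosen purely for illustration:

```python
import math

# Sketch of the class-weighted binary cross-entropy of Eq. 6. The weight
# w boosts the loss on positive (retweet) samples to counter class
# imbalance. Sample counts and lambda below are illustrative.

def weighted_bce(truth, prob, w):
    total = 0.0
    for t, p in zip(truth, prob):
        total += -(w * t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(truth)

C, C_pos, lam = 3057, 300, 2.0              # toy sample counts
w = lam * (math.log(C) - math.log(C_pos))   # heavier penalty on missed positives

loss = weighted_bce([1, 0, 0, 1], [0.9, 0.2, 0.1, 0.6], w)
print(loss > 0.0)  # True
```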
As contextual knowledge of real-world events plays a crucial role in identifying hate speech, we ensured that the annotators were well aware of the events related to the hashtags and topics. Annotators were asked to follow Twitter's policy as the guideline for identifying hateful behavior⁵. We annotated a total of 17,877 tweets with an inter-annotator agreement of 0.58 (Krippendorff's α). This low inter-annotator agreement is on par with most hate speech annotation efforts to date, pointing to the hardness of the task even for human subjects. This further strengthens the need for contextual knowledge as well as for exploiting beyond-the-text dynamics. We selected the final tags based on majority voting. Based on this gold-standard annotated data, we trained three different hate speech classifiers based on the designs given by Davidson et al. [9] (dubbed the Davidson model), Waseem and Hovy [8], and Pinkesh et al. [10]. With an AUC score of 0.85 and macro-F1 of 0.59, the Davidson model emerged as the best performing one. When the existing pre-trained Davidson model was tested on our annotated dataset, it achieved 0.79 AUC and 0.48 macro-F1. This highlights both the limitations of existing hate detection models in capturing newer context, and the importance of manual annotation and fine-tuning. We used the fine-tuned model to annotate the rest of the tweets in our dataset (the percentage of hateful tweets for each hashtag is reported in Table II). We use the machine-annotated tags for the features and training labels in our proposed models only, while the hate generation models are tested solely on gold-standard data. Along with the manual annotation and the trained hate detection model, we use the dictionary of hate lexicons proposed in [17]. It contains a total of 209 words/phrases signaling the possible existence of hatefulness in a tweet. Examples of slur terms in the lexicon include words such as harami (bastard), jhalla (faggot), and haathi (elephant/fat).
Using such terms is derogatory and a direct offense. In addition, the lexicon has some colloquial terms, such as mulla (Muslim), bakar (gossip), aktakvadi (terrorist), and jamai (son-in-law), which may carry a hateful sentiment depending on the context in which they are used. For our hate generation prediction task, we use a total of 19,032 tweets (each having at least 60 news items mapped to it from the time of its posting) coming from 12,492 users to construct the ground truth. With an 80:20 train-test split, there are 611 hateful tweets among 15,225 in the training data, and 129 out of 3,807 in the test data. To deal with the severe class imbalance of the dataset, we use both upsampling of positive samples and downsampling of negative samples. With all the features discussed in Section IV, the full size of the feature vector is 3,645. We experimented with all our proposed models on this full set of features and on dimensionality-reduced versions of it. We use principal component analysis (PCA) with the number of components set to 50; we also conduct experiments selecting the k best features (k = 50) by mutual information. We implement a total of six different classifiers using support vector machines (with linear and RBF kernels), logistic regression, decision trees, AdaBoost, and XGBoost [38]. Parameter settings for each of these are reported in Table III. All of the models, PCA, and feature selection are implemented using scikit-learn⁶. The activity of retweeting, too, shows a skewed pattern similar to hate speech generation. While the maximum number of retweets for a single tweet is 196 in our dataset, the average remains 13.10. We use only those tweets which have more than one retweet and at least 60 news items mapped to them from the time of posting. With an 80:20 train-test split, this results in a total of 3,057 and 765 samples for training and testing, respectively.
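The two resampling strategies mentioned above can be sketched in pure Python (toy labels, not the paper's pipeline):

```python
import random

# Sketch of the two strategies used against class imbalance: downsampling
# the dominant (non-hate) class to the minority size, and upsampling the
# minority (hate) class with replacement. The data below is illustrative.

random.seed(0)
samples = [("x%d" % i, 1) for i in range(5)] + \
          [("x%d" % i, 0) for i in range(5, 100)]   # 5 positive, 95 negative

pos = [s for s in samples if s[1] == 1]
neg = [s for s in samples if s[1] == 0]

downsampled = pos + random.sample(neg, len(pos))    # shrink the majority
upsampled = neg + random.choices(pos, k=len(neg))   # grow the minority

print(len(downsampled), len(upsampled))  # 10 190
```

As the results in Section VII show, the two strategies can affect macro-F1 and AUC quite differently.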
For all the doc2vec-generated feature vectors related to tweets and news headlines, we set the dimensionality to 50 and 500, respectively. For RETINA, we set the parameter hdim and all the intermediate hidden sizes of the remaining feed-forward (except the last one generating logits) and recurrent layers to 64 (see Section V-B). Hyperparameter tuning of RETINA. For both settings (i.e., static and dynamic prediction of retweeters), we used mini-batch training of RETINA with both the Adam and SGD optimizers. We varied the batch size among 16, 32, and 64, with the best results for a batch size of 16 in the static mode and 32 in the dynamic mode. We also varied the learning rate within the range 10⁻⁴ to 10⁻¹; the best configuration for the dynamic model used a learning rate of 10⁻² with the SGD optimizer⁷. The static counterpart produced the best results with the Adam optimizer⁸ [39] using default parameters. To deal with the class imbalance, we set the parameter w in Eq. 6 as w = λ(log C - log C⁺), where C and C⁺ are the counts of total and positive samples, respectively, in the training dataset, and λ is a balancing constant which we vary from 1 to 2.5 in steps of 0.5. We found the best configurations to be λ = 2.0 and λ = 2.5 for the static and dynamic modes, respectively. ⁷https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/SGD ⁸https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Adam In the absence of external baselines for predicting hate generation probability, owing to the problem's novelty, we rely on ablation analyses of the models proposed for this task. For retweet dynamics prediction, we implement five external baselines and two ablation variants of RETINA. Since information diffusion is a vast subject, we approach it from two perspectives: one is the set of rudimentary baselines (SIR, general threshold), and the other is the set of recently proposed neural models.
SIR [19]: The Susceptible-Infectious-Recovered (Removed) model is one of the earliest predictive models for contagion spread. Two parameters govern the model, the transmission rate and the recovery rate, which dictate the spread of the contagion (retweeting, in our case) along a social/information network. Threshold model [40]: This model assumes that each node has a threshold inertia chosen uniformly at random from the interval [0, 1]. A node becomes active if the weighted sum of its active neighbors exceeds this threshold. Using the same feature set as described in Section V-A, we employ four classifiers: logistic regression, decision tree, linear SVC, and random forest (with 50 estimators). All of these models are used for the static mode of retweet prediction only. Features representing exogenous signals are engineered in the same way as described in Section IV-D. To overcome the feature engineering step involving combinations of topical, contextual, network, and user-level features, neural methods for information diffusion have gained popularity. While these methods all focus on determining only the next set of users, they are still important for measuring the diffusion performance of RETINA. TopoLSTM [26]: This is one of the initial works to consider recurrent models in generating next-user prediction probabilities. The model converts the cascades into dynamic DAGs (capturing the temporal signals via node ordering). The sender-receiver based RNN model captures a combination of an active node's static score (based on the history of the cascade) and a dynamic score (capturing future propagation tendencies). FOREST [27]: It aims to be a unified model performing microscopic and macroscopic cascade prediction, combining reinforcement learning (for the macroscopic part) with a recurrent model (for the microscopic part).
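The threshold baseline can be sketched as a standard linear threshold cascade. For determinism the graph, edge structure and thresholds below are fixed illustrative values rather than drawn uniformly at random:

```python
# Sketch of the threshold model baseline: each node holds a threshold in
# [0, 1] and activates once the (uniformly weighted) fraction of its
# active in-neighbors reaches it. Graph and thresholds are illustrative.

in_neighbors = {"a": [], "b": ["a"], "c": ["a", "b"], "d": ["b", "c"]}
threshold = {"a": 0.9, "b": 0.3, "c": 0.5, "d": 0.6}

def run_cascade(seeds):
    active = set(seeds)
    changed = True
    while changed:                      # iterate to a fixed point
        changed = False
        for node, nbrs in in_neighbors.items():
            if node in active or not nbrs:
                continue
            frac = sum(1 for n in nbrs if n in active) / len(nbrs)
            if frac >= threshold[node]:
                active.add(node)
                changed = True
    return active

print(sorted(run_cascade({"a"})))  # ['a', 'b', 'c', 'd']
```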
By considering the complete global graph, it performs graph sampling to obtain the structural context of a node as an aggregate of the structural contexts of its one- or two-hop neighbors. In addition, it factors in temporal information via the last m seen nodes in the cascade. HIDAN [28]: It does not explicitly consider a global graph as input. Any information loss due to the absence of a global graph is substituted by temporal information utilized in the form of ordered time differences of node infection. Since HIDAN does not employ a global graph, like TopoLSTM it uses the set of all seen nodes in the cascade as candidate nodes for prediction. We exercise extensive feature ablation to examine the relative importance of the different feature sets. Among the six different algorithms we implement for this task, along with different sampling and feature reduction methods, we choose the best performing model for this ablation study. Following Eq. 1, we remove the feature sets representing H_{i,t}, S^ex, S^en, and T (see Section IV for the corresponding features) in each trial and evaluate the performance. To investigate the effectiveness of the exogenous attention mechanism for predicting potential retweeters, we remove this component and experiment in both the static and the dynamic setting of RETINA. Evaluation of classification models on highly imbalanced data needs careful precautions to avoid classification bias. We use multiple evaluation metrics for both tasks: macro-averaged F1 score (macro-F1), area under the receiver operating characteristic curve (AUC), and binary accuracy (Acc). As the neural baselines tackle the problem of retweet prediction as a ranking task, we adapt the evaluation of RETINA to make it comparable with these baselines: we rank the predicted probability scores (p_{u_i} and p_{u_i,j} in the static and dynamic settings, respectively) and compute mean average precision at the top-k positions (MAP@k) and binary hits at the top-k positions (Hits@k).
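The ranking metrics can be sketched for a single tweet as follows. This is one common variant of AP@k with binary relevance; MAP@k then averages AP@k over all test tweets:

```python
# Sketch of Hits@k and average precision at top-k (AP@k) for one ranked
# list: `ranked` is sorted by predicted retweet probability, `actual` is
# the set of true retweeters. Users and scores are illustrative.

def hits_at_k(ranked, actual, k):
    return int(any(u in actual for u in ranked[:k]))

def ap_at_k(ranked, actual, k):
    hits, score = 0, 0.0
    for i, u in enumerate(ranked[:k], start=1):
        if u in actual:
            hits += 1
            score += hits / i          # precision at each hit position
    return score / min(len(actual), k) if actual else 0.0

ranked = ["u3", "u1", "u7", "u2", "u9"]
actual = {"u1", "u2"}
print(hits_at_k(ranked, actual, 3))          # 1
print(round(ap_at_k(ranked, actual, 5), 3))  # 0.5
```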
table iv presents the performances of all the models in predicting the probability of a given user posting a hateful tweet using a given hashtag. it is evident from the results that all six models suffer from the sharp bias in the data; without any class-specific sampling, they lean towards the dominant class (non-hate in this case), resulting in low macro-f1 and auc compared to very high binary accuracy. svm with rbf kernel outperforms the rest when no upsampling or downsampling is done, with a macro-f1 of 0.55 (auc 0.61). effects of sampling. downsampling the dominant class results in a substantial leap in the performance of all the models. the effect is almost uniform over all the classifiers except xgboost. in terms of macro-f1, decision tree sets the best overall performance for this task at 0.65, while the rest of the models lie in a very close range of 0.62-0.64. while the downsampling performance gains are explicitly evident, the effects of upsampling the dominated class are less intuitive. for all the models, upsampling deteriorates macro-f1 by a large extent, with values in the range 0.44-0.47. however, the auc scores improve by a significant margin for all the models with upsampling except decision tree. adaboost achieves the highest auc of 0.68 with upsampling. dimensionality reduction of feature space. our experiments with pca and k-best feature selection by mutual information show a heterogeneous effect on different models. while only the svm with linear kernel shows some improvement with pca over the original feature set, the rest of the models suffer considerable degradation of macro-f1. however, svm with rbf kernel achieves the best auc of 0.68 with pca. with top-k best features, the overall gain in performance is not significant except for decision tree. we also experiment with combinations of different sampling and feature reduction methods, but none of them achieves a significant gain in performance. ablation analysis.
we choose decision tree with downsampling of the dominant class as our best performing model (in terms of macro-f1 score) and perform ablation analysis. table v presents the performance of the model with each feature group removed in isolation, along with the full model. evidently, for predicting hate generation, the features representing exogenous signals and user activity history are the most important. removal of the feature vector signifying trending hashtags, which represents the endogenous signal in our case, also worsens the performance to a significant degree. table vi summarizes the performances of the competing models for the retweet prediction task. here again, binary accuracy presents a very skewed picture of the performance due to class imbalance. while retina in the dynamic setting outperforms the rest of the models by a significant margin on all the evaluation metrics, topolstm emerges as the best baseline in terms of both map@20 and hits@20. in figure 5 , we compare retina in the static and dynamic settings with topolstm in terms of hits@k for different values of k. for smaller values of k, retina largely outperforms topolstm in both the dynamic and static settings. however, with increasing k, the three models converge to very similar performances. figure 6 provides an important insight regarding the retweet diffusion modeling power of our proposed framework. our best performing baseline, topolstm, largely fails to capture the different diffusion dynamics of hate speech in contrast to non-hate (map@20 of 0.59 for non-hate vs. 0.43 for hate). on the other hand, retina achieves map@20 scores of 0.80 and 0.74 in the dynamic setting (0.54 and 0.56 in the static setting) when predicting the retweet dynamics of hate and non-hate contents, respectively. one can readily infer that our well-curated feature design, incorporating hate signals along with the endogenous, exogenous, and topic-oriented influences, empowers retina with this superior expressive power.
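the class-balancing step that produced the largest gains above (downsampling the dominant class) can be sketched generically as follows; a numpy version under our own assumptions about the data layout, not the paper's exact sampling procedure:

```python
import numpy as np

def downsample_majority(X, y, seed=0):
    """Randomly drop majority-class rows so all classes are equally sized.

    Keeps every minority-class row; classes larger than the smallest one
    are subsampled without replacement to the minority count.
    """
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    keep = []
    for c in classes:
        idx = np.flatnonzero(y == c)
        if len(idx) > n_min:
            idx = rng.choice(idx, size=n_min, replace=False)
        keep.append(idx)
    keep = np.concatenate(keep)
    return X[keep], y[keep]
```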
among the traditional baselines, logistic regression gives a macro-f1 score comparable to the best static model; however, owing to memory limitations it could not be trained on news sets larger than 15 items per tweet. similarly, the svm based models could not incorporate even 15 news items per tweet due to the same memory limitation. meanwhile, an ablation on news size gave the best results at 60 items for both the static and dynamic models. we find that the contribution of the exogenous signal (i.e., the news items) plays a vital role in retweet prediction, much like our findings in table v for predicting hate generation. with the exogenous attention component removed in the static as well as dynamic settings (the ablated variants of retina-s and retina-d, respectively, in table vi), performance drops by a significant margin. the performance drop is more pronounced in the ablated retina-d when ranking users according to retweet probability (map@k and hits@k). the impact of exogenous signals on macro-f1 is more visible in the traditional models. to observe the performance of retina more closely in the dynamic setting, we analyse its performance over successive prediction intervals. figure 8 shows the ratio between the predicted and the actual number of retweets arriving in different intervals. as is clearly evident, the model becomes nearly perfect at predicting new growth with increasing time. the high error rate at the initial stage is possibly due to the fact that the retweet dynamics remain uncertain at first and become more predictable as an increasing number of people participate over time. a similar trend is observed when we compare the performance of retina in the static setting with varying sizes of actual retweet cascades. figure 9 shows that retina-s performs better with increasing size of the cascade. in addition, we also vary the number of tweets posted by a user. figure 7 shows that the performance of retina in both static and dynamic settings increases as the history size varies from 10 to 30 tweets.
afterward, it either drops or remains the same. our attempt to model the genesis and propagation of hate on twitter brings forth various limitations posed by the problem itself as well as by our modeling approaches. we explicitly cover such areas to lay the grounds for future developments. we have considered the propagation of hateful behavior via retweet cascades only. in practice, multiple other forms of diffusion exist, and retweets constitute only a subset of the full spectrum. users susceptible to hateful information often propagate it via new tweets. hateful tweets are often counteracted with hate speech via reply cascades. even if not retweeted, replied to, or immediately influencing the generation of newer tweets, a specific hateful tweet can readily set the audience into a hateful state, which may later develop repercussions. identification of such influences would need intricate natural language processing techniques, adaptable to the noisy nature of twitter data. as already discussed, online hate speech is vastly dynamic in nature, making it difficult to identify. depending on the topic, time, cultural demography, target group, etc., the signals of hate speech change. thus models like retina, which explicitly use hate-based features to predict popularity, need an updated signaling strategy. however, this drawback is only evident if one intends to perceive such endeavors as a simple task of retweet prediction. we, on the other hand, focus on the retweet dynamics of hateful vs. non-hateful contents, which presumes the signals of hateful behavior to be well-defined. the majority of the existing studies on online hate speech have focused on hate speech detection, with very few seeking to analyze the diffusion dynamics of hate on large-scale information networks. we present the very first attempt to predict the initiation and spread of hate speech on twitter.
analyzing a large twitter dataset that we crawled and manually annotated for hate speech, we identified multiple key factors (exogenous information, topic-affinity of the user, etc.) that govern the dissemination of hate. based on the empirical observations, we developed multiple supervised models powered by rich feature representations to predict the probability of any given user tweeting something hateful. we proposed retina, a neural framework exploiting extra-twitter information (in terms of news) with an attention mechanism for predicting potential retweeters of any given tweet. comparison with multiple state-of-the-art models for retweeter prediction revealed the superiority of retina in general as well as for predicting the spread of hateful content in particular. with the specific focus of our work being the generation and diffusion of hateful content, our proposed models rely on some general textual/network-based features as well as features signaling hate speech. a possible future work is to replace hate speech with any other targeted phenomenon like fraudulent or abusive behavior, or specific categories of hate speech. however, these hate signals require manual intervention when updating the lexicons or adding topical hate tweets to retrain the hate detection model. while the features of the end-to-end model appear to be highly engineered, individual modules take care of their respective preprocessing. in this study, the mode of hate speech spread we primarily focused on is retweeting, and therefore we restrict ourselves to textual hate. however, spreading hateful content packaged as an image, a meme, or some invented slang is the new normal of this age and leaves space for future studies.
report of the independent international factfinding mission on myanmar
fanning the flames of hate: social media and hate crime
a survey on automatic detection of hate speech in text
measuring the reliability of hate speech annotations: the case of the european refugee crisis
spread of hate speech in online social media
deep exogenous and endogenous influence combination for social chatter intensity prediction
information diffusion and external influence in networks
hateful symbols or hateful people? predictive features for hate speech detection on twitter
automated hate speech detection and the problem of offensive language
deep learning for hate speech detection in tweets
a hierarchically-labeled portuguese hate speech dataset
arhnet - leveraging community interaction for detection of religious hate speech in arabic
the effects of user features on twitter hate speech detection
handling imbalance issue in hate speech classification using sampling-based methods
on analyzing antisocial behaviors amid covid-19 pandemic
racism is a virus: anti-asian hate and counterhate in social media during the covid-19 crisis
mind your language: abuse and offense detection for code-switched languages
chassis: conformity meets online information diffusion
containing papers of a mathematical and physical character
deepcas: an end-to-end predictor of information cascades
deephawkes: bridging the gap between prediction and understanding of information cascades
what makes an image popular?
representation learning for information diffusion through social networks: an embedded cascade model
a novel embedding method for information diffusion prediction in social network big data
neural diffusion model for microscopic cascade prediction
topological recurrent neural network for diffusion prediction
multi-scale information diffusion prediction with reinforced recurrent networks
hierarchical diffusion attention network
a survey of information cascade analysis: models, predictions and recent advances
predicting user engagement on twitter with real-world events
ccnet: extracting high quality monolingual datasets from web crawl data
event triggered social media chatter: a new modeling framework
demarcating endogenous and exogenous opinion diffusion process on social networks
a deterministic model for gonorrhea in a nonhomogeneous population
distributed representations of sentences and documents
attention is all you need
news-please: a generic news crawler and extractor
xgboost: a scalable tree boosting system
adam: a method for stochastic optimization
maximizing the spread of influence through a social network

key: cord-288303-88c6qsek
title: on nonlinear incidence rate of covid-19
authors: paul, s. k.; jana, s.; bhaumik, p.
date: 2020-10-21
journal: nan
doi: 10.1101/2020.10.19.20215665
sha: doc_id: 288303 cord_uid: 88c6qsek

the classical susceptible-infected-removed model with constant transmission rate and removal rate may not capture the real-world dynamics of an epidemic due to the complex influence of multiple external factors on the spread. on top of that, the transmission rate may vary widely in a large region due to non-stationarity of spatial features, which poses difficulty in creating a global model. we modified the discrete global susceptible-infected-removed model by using a time-varying transmission rate, a time-varying recovery rate, and multiple spatially local models. no specific functional form of the transmission rate has been assumed.
we have derived the criteria for disease-free equilibrium within a specific time period. a single convolutional lstm model is created and trained to map multiple spatiotemporal features to the transmission rate. the model achieved 8.39% mean absolute percent error in terms of cumulative infection cases in each locality over a 10-day prediction period. local interpretation of the model using a perturbation method reveals the local influence of different features on the transmission rate, which in turn is used to generate a set of generalized global interpretations. a what-if scenario with a modified recovery rate illustrates rapid dampening of the spread when forecasted with the trained model. a comparative study with the current normal scenario reveals the key steps necessary to reach the baseline. dynamical systems equations based on compartmental modelling of epidemiology have been widely used to predict the spread of an epidemic. the susceptible-infected-removed or sir model is one such simplified set of differential equations to model the spread. however, accurately determining parameter values like the transmission rate for a specific disease is a challenge. the dynamics of a disease may vary across space and time, and many external factors may influence the transmission rate. considering the transmission rate constant for a disease grossly oversimplifies the model, thus compromising accuracy. secondly, knowing the factors influencing the transmission rate and the dynamics of that influence can provide a vivid understanding of the disease progression. several different types of nonlinear incidence rate have been suggested in the literature [3, 4, 5, 6, 7, 8, 9] . however, most of them adopt some type of simple predefined function with few parameters to model the incidence rate. simple functions have low representational capability; thus, they may not capture the detailed dynamical variations of the incidence rate caused by multiple factors.
we propose a convolutional lstm based spatiotemporal model to map the transmission rate of covid-19 with respect to multiple input features, and thereby derive the incidence rate from the transmission rate. the model can forecast the incidence rate at high spatiotemporal resolution provided clean historical data are available at that resolution. exploratory analysis reveals the probable influence of external features on the transmission rate and eventually helped in feature selection. a spatiotemporal local interpretation method for a black box model is proposed, which in turn is used to explain the trained model. the explanations reveal the local influence of different external features on the transmission rate. a generalized global explanation is also generated to find the common influence of factors across multiple locations and over a period. we experimented with available covid-19 data across multiple regions of the usa, and the model achieved 7.95% and 0.19% mean absolute percent error in terms of new infection cases in each locality and cumulative total infection cases across the country, respectively, in a 10-day prediction period. the generated explanations revealed a high influence of population density and a medium influence of gender ratio and median population age on the transmission rate, globally. there are minor influences of temperature and temperature deviation but barely any observable influence of humidity. however, local influences of features vary widely across small regions. a criterion for disease-free equilibrium within a specific time period has been derived for the discrete sir model with variable transmission and recovery rates. a long-term forecast using the trained model and a recovery rate modified to satisfy the disease-free equilibrium criterion reveals rapid damping of active infection cases towards the baseline, although frequent spikes due to resurgence are seen in this scenario.
a comparative study is made against the forecasted dynamics under the current normal recovery rate to reveal the actions necessary for rapid containment of the disease. the paper is organized as follows. we conduct a brief literature survey in section 2. section 3 briefly explains the discrete sir model with variable transmission rate. section 4 discusses spatiotemporal modelling of the transmission rate. section 5 discusses the spatiotemporal influence of external features on the transmission rate. we conduct long term forecasting of disease progression under a current normal scenario and a "what-if" scenario in section 6. section 7 concludes the paper. kermack and mckendrick [1] modelled communicable diseases using differential equations. hethcote introduced the sir model [2] , where the population is compartmentalized into susceptible, infected and removed groups and a set of differential equations models the dynamics of the population in the different compartments. in the traditional sir model, the incidence rate, or the number of new infections per unit time, varies bilinearly with the number of infected and the number of susceptible individuals in a population, considering the transmission rate constant. however, assumptions like homogeneous mixing, non-dependence on external factors, no psychological effects on the population, etc. may not be realistic in many cases. thus, several authors [3, 4, 5, 6, 7, 8, 9] introduced different types of non-linear incidence rates, mostly addressing the saturation and psychological effects. the saturation effect states that the incidence rate might slow down and saturate as the number of infected individuals increases due to low availability of susceptible individuals. the psychological effect on the population results in increased cautiousness among susceptible individuals as the epidemic spreads, thus slowing down the transmission rate.
most of the incidence rates stated above satisfy a weakly non-linear property and are too simple to capture arbitrary effects of the environment. sir models with time varying transmission and recovery rates have been studied in [11] , where threshold theorems are derived. liu et al. [10] introduced a time varying switched transmission rate to model nonlinear incidence. hu et al. developed a modified stacked autoencoder model of the epidemic spread in china and claimed to achieve a high level of forecasting accuracy [26] . observing a universality in the epidemic spread in each country, fanelli and piazza [27] applied mean-field kinetics of the susceptible-infected-recovered/dead epidemic model to forecast the spread and provided an estimation of peak infections in italy. zhan et al. [14] integrated intercity migration data in china with a susceptible-exposed-infected-removed model to forecast an estimation of the epidemic spread in china. hong et al. [12] considered a variable transmission rate of covid-19 and derived a variable r0 factor for covid-19. xi et al. [28] used deep residual networks to model the spatiotemporal characteristics of the spread of influenza and experimented with a real dataset of shenzhen city in china. paul et al. [35] used an ensemble of convlstm networks to forecast covid-19 total infection cases. in the sir model the total population in a region is compartmentalized into 3 classes, namely susceptible (s), infected (i) and removed (r). initially the whole population is in the susceptible class. an individual moves from the susceptible to the infected class on contracting the disease. an infected individual moves to the removed class by either recovering (and becoming immune to the disease) or dying. the dynamics of the disease spread can be modelled by the following set of differential equations:

ds/dt = -β(t) s(t) i(t)    (1)
di/dt = β(t) s(t) i(t) - γ(t) i(t)    (2)
dr/dt = γ(t) i(t)    (3)

where β(t) is the disease transmission rate or contact rate and γ(t) is the removal rate, which is the sum of the recovery rate and the mortality rate.
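replacing the differentials with daily difference equations, the sir system described above can be simulated directly; a minimal sketch with time-varying rates, where s, i and r are population fractions:

```python
def sir_step(s, i, r, beta_t, gamma_t):
    """One discrete-time SIR update with time-varying rates.

    beta_t is the transmission rate and gamma_t the removal rate at this
    time step; new infections follow beta*S*I and removals gamma*I.
    """
    new_infections = beta_t * s * i
    new_removals = gamma_t * i
    return s - new_infections, i + new_infections - new_removals, r + new_removals

def simulate(s0, i0, betas, gammas):
    """Roll the SIR update forward over per-day rate sequences."""
    s, i, r = s0, i0, 1.0 - s0 - i0
    trajectory = [(s, i, r)]
    for b, g in zip(betas, gammas):
        s, i, r = sir_step(s, i, r, b, g)
        trajectory.append((s, i, r))
    return trajectory
```

the conservation s + i + r = 1 holds at every step by construction, which is a useful sanity check on any discretization.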
it is assumed the population size n remains constant during the course of the epidemic. s(t), i(t) and r(t) are scaled as fractions of the total population, so the following equation holds true:

s(t) + i(t) + r(t) = 1

from [11] we get, for all t > 0,

i(t) = i0 exp( ∫ (β(τ)s(τ) - γ(τ)) dτ )

where i0 = i(t0), β(t) ≥ 0 and γ(t) ≥ 0 for all t > 0. we consider discrete time steps in our modelling, and measurements are taken on a daily basis; thus we replace the differentials with difference equations. solving for i(t), expanding the log as a taylor series and taking only the first term, and considering a constant average difference δ = γ - β between the removal rate and the transmission rate within the period, the infected fraction decays as i(t) ≈ i(0) e^(-δt). considering n·i(t) < 1 (fewer than one infected individual) as the disease-free equilibrium state, an upper bound on the time t at which the epidemic reaches the baseline can be derived:

t ≤ ln(n·i(0)) / δ    (10)

maintaining δ > 0 asymptotically converges the total infection count to 0 at an exponential rate, thus making the disease-free equilibrium stable. assuming a constant mortality rate, from (10) it can be deduced that increasing the recovery rate will directly reduce the time span of the disease outbreak. however, there is a hard limit on the removal rate, γ(t) ≤ 1, whereas β(t) can be greater than 1, especially during the initial outbreak when the total infection count is low. (this preprint, which was not certified by peer review, is made available under a cc-by 4.0 international license; this version was posted october 21, 2020. https://doi.org/10.1101/2020.10.19.20215665) in such a situation, dampening the spread of infection will not be possible with treatment facilities alone. immediate restriction of mobility in the area of outbreak and rapid isolation of infected individuals can reduce the transmission rate. once it comes down below 1, enhanced treatment facilities can increase the recovery rate, thus reducing the span of the disease outbreak.
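one reading of the derivation above is that, with a constant gap δ between removal and transmission rates, the infected fraction decays like i(0)e^(-δt); under that assumption the time for the scaled infection count to fall below one individual can be checked numerically (the population and rate values below are illustrative, not from the paper):

```python
import math

def time_to_baseline(n_population, i0_fraction, delta):
    """Days until N*I(t) < 1 under I(t) = I(0) * exp(-delta * t).

    delta is the (assumed constant) excess of removal rate over
    transmission rate; the formula solves N*I(0)*exp(-delta*T) = 1.
    """
    return math.log(n_population * i0_fraction) / delta

# doubling the excess removal rate halves the time to reach baseline
t1 = time_to_baseline(1e6, 1e-3, 0.05)
t2 = time_to_baseline(1e6, 1e-3, 0.10)
```

this makes the qualitative claim in the text concrete: the time span scales as 1/δ, so raising the recovery rate (and hence δ) directly shortens the outbreak.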
the transmission rate can vary spatially as well as temporally based on multiple variables. geographical location, weather conditions [13] , human mobility [14] and population statistics might be some of the impacting factors changing the dynamics of the spread. an exploratory analysis reveals probable dependency of the transmission rate on multiple spatial and temporal features. spatially co-located regions might have similar dynamics of the spread, with high autocorrelation of transmission rate in a localized region, whereas distant regions may have dissimilar transmission dynamics with low correlation. thus, a large geographic area has been divided into small regions called grids. each grid has been divided even further into smaller regions called pixels. the population within a pixel is assumed to be constant, and transmission dynamics are modeled by a separate sir model for each pixel. each grid consists of co-located regions which might be impacting each other's transmission rate. a feature is constructed for each grid as a multichannel temporal sequence of matrices, which in turn is used for training a convlstm [15] network to model the transmission rate. data have been obtained for a region in the united states from multiple sources [16, 29, 30, 31, 32, 33, 34] . the time span of the data is from 2020-03-21 to 2020-05-11. covid-19 daily data at usa county level are filtered by a spatial region of the usa as shown in fig. 1 . the region is geospatially divided into m x n grids of equal size bounded by calculated latitudes and longitudes. fig. 2a illustrates a grid bounded by latitudes and longitudes. the dotted line box is called a frame. the overlapping areas in all 4 directions in a frame allow the flow of spatial influence from neighboring grids. a frame is in turn divided into l x l pixels. each pixel represents a bounded area in the geospatial region and contains a value mapped to a certain feature in that bounded region.
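mapping point records into an l x l pixel matrix for one frame might look as follows; a sketch assuming equal-sized pixels over the frame's bounding box and aggregation by sum (the paper's exact binning is not specified here):

```python
import numpy as np

def rasterize(records, lat_min, lat_max, lon_min, lon_max, l):
    """Aggregate (lat, lon, value) records into an l x l pixel matrix.

    Each pixel covers an equal share of the frame's bounding box; values
    falling in the same pixel are summed, and records outside the box
    are ignored.
    """
    frame = np.zeros((l, l))
    for lat, lon, value in records:
        if not (lat_min <= lat < lat_max and lon_min <= lon < lon_max):
            continue
        row = int((lat - lat_min) / (lat_max - lat_min) * l)
        col = int((lon - lon_min) / (lon_max - lon_min) * l)
        frame[row, col] += value
    return frame
```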
frame matrices are constructed for each feature and concatenated through a third axis called channels. for example, transmission rate and population density are two features, and they represent two separate l x l matrices in a frame concatenated across a third axis. some features like transmission rate, active infection fraction, weather etc. are distributed spatiotemporally, whereas other features like population density, female fraction and median age are assumed time invariant and have no temporal component. thus, they are only distributed spatially and copied along the temporal axis. population density has been log transformed to reduce skewness and normalized. other features are only normalized to a 0-1 scale. daily transmission and removal rates at pixel level have been calculated as follows:

β_j(t) = Δi_j⁺(t) / (s_j(t) i_j(t)),   γ_j(t) = Δr_j(t) / i_j(t)    (11)

where j ∈ {1 .. l x l} denotes each pixel, and Δi_j⁺(t) and Δr_j(t) are the fractions of new cases entering the infected class and of new individuals entering the removed class, respectively, at time t in pixel j. each training sample of a frame is represented by a tensor of dimension t x l x l x c, where t is the total time span and c is the number of channels or features. as shown in fig. , the forecasting problem is framed as a supervised learning problem. given a sequence of observed multichannel frames of spatial data as matrices x_1, x_2, … x_t, the objective of the model is to predict the next single channel frame x_(t+1). the training samples are divided into input sequences of length w and output frames. the model forecasts the transmission rate in each pixel of a frame for each timestep; thus, the output frame consists of only 1 channel. the input training dataset (x train) can be represented as a tensor of size s train x w x l x l x c and the output dataset (y train) as s train x w x l x l x 1.
for training, the input sequences are selected from all frames having a non-zero total infection count. fig. 1b illustrates the sequence of a frame. the frames t-7 to t-3 represent an input training sequence (x train) of length w. the output frame (y train) for this training sample is t-2. other training samples are generated by sliding the window w+1 backwards in time by 1. the most recent images t-0 and t-1 represent the test output images (y test), and the immediately preceding sequence of images t-6 to t-2 is the test input sample (x test). the test set x test is represented by a tensor of size (m * n) x w x l x l x c and y test by (m * n) x w′ x l x l x 1. the primary purpose of the exploratory analysis is to understand the distribution of the transmission rate and identify the probable influence of different features on it. eight external features are analyzed against the transmission rate, of which four are spatial features having no temporal component, namely population density, housing density, female fraction and median age. fig. 3 illustrates scatter charts between the average transmission rate and the four spatial features for multiple pixels. the color gradient represents the log transformed cumulative number of infection cases in each pixel. only those pixels are considered which experienced at least 30 days of running infection cases and had at least 10 cumulative infection cases at the beginning of the observation period. figs. 3a and 3b display scatter charts and regression lines of the average transmission rate with respect to population density and housing density in each pixel, respectively. the two external features are log transformed and scaled to be upper bounded by 1. log transformation reduces skewness and the influence of outliers in the data. as observed in the charts, the transmission rate is positively correlated with both features, which is quite intuitive.
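the sliding-window construction of training samples described above can be sketched as follows, assuming the frames of one grid are stacked into a (t, l, l, c) array and the single-channel target is the first channel of the following frame:

```python
import numpy as np

def sliding_samples(frames, w):
    """Split a (T, L, L, C) frame sequence into (window, next frame) pairs.

    Each sample uses w consecutive frames as input and the first channel
    of the frame that follows the window as the prediction target.
    """
    xs, ys = [], []
    for start in range(frames.shape[0] - w):
        xs.append(frames[start:start + w])
        ys.append(frames[start + w, :, :, :1])
    return np.stack(xs), np.stack(ys)
```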
places with high population density are expected to experience rapid spread of the disease; locations with high population density also experienced the highest number of cumulative cases. figs. 3c and 3d display scatter charts and regression lines of the transmission rate with respect to the female fraction and the median age of the population, respectively. in fig. 3c, 16 pixels having a female fraction less than 0.45 have been filtered out to remove the skewness in the data. there is a slight positive correlation between female fraction and transmission rate. however, this might not invoke a suggestive idea about the dependency of this external feature on the transmission rate, as the majority of the pixels reside in the range of 0.50-0.53 female fraction with barely any trend in that range. there is also an indirect correlation, as in general pixels with a high female fraction have high population density. median age has a negative correlation with the transmission rate. there is an indirect correlation in this case as well, since in general pixels with high median age have low population density. another intuitive assumption can be that populations with high median age are less mobile, thus restricting the spread of the disease. apart from the four spatial features, four other external spatiotemporal features are analyzed to observe any influence on the transmission rate. fig. 4 illustrates the time lagged cross correlation between the transmission rate and the other spatiotemporal features at pixel level. the external features are time lagged from 0 to 15 time steps and cross correlated with the transmission rate for each lag. in the plot, pixels are arranged in increasing order of total infection cases. fig.
4a and 4b show the cross correlation of the transmission rate with respect to average daily temperature and the 3-day running window temperature standard deviation, respectively. average temperature is slightly positively correlated in the time lag range of 0-5. in the plot, offset 15 denotes time lag 0 and offset 0 denotes time lag 15. the correlation with temperature variation varies widely across pixels; however, on average there is a minor positive correlation in the time lag range of 5-15. for both features, pixels having high total infections show negative correlation with the transmission rate in the time lag range of 0-10. figs. 4c and 4d show the cross correlation of the transmission rate with respect to average daily relative humidity and the daily removal rate, respectively. there is an overall positive correlation with respect to relative humidity, especially in pixels with the highest infection cases. the removal rate is mostly negatively correlated with the transmission rate except in a few pixels having the highest infection cases. correlation might not represent causality; thus, we performed the granger causality test [17] of the transmission rate with respect to the different features. granger causality is a statistical hypothesis test for finding whether one time series can help improve the forecasting accuracy of another time series. it might not measure true causality; rather, it measures predictive causality. the chi-square test is chosen as the hypothesis testing method, and minimum p-values for each pixel are calculated. the augmented dickey-fuller test [18] is performed to test the stationarity of all the time series. table 1 displays the results of the granger causality and dickey-fuller tests. the column '% of pvalue < 0.05' represents the percentage of pixels for which the granger causality test gave a p-value less than 0.05 for each feature.
the column '% of adf<10%' represents the percentage of pixels for which the dickey-fuller test gave a test statistic less than the 10% critical value and a p-value less than 0.1 for each feature. from the observed results it seems that for the majority of the pixels the weather features and removal rate have a predictive causal relation with transmission rate. also, for the majority of the pixels the feature time series are stationary or weakly stationary. recurrent neural networks (rnn) are a class of artificial neural networks with nodes having feedback connections, thereby allowing them to learn patterns in variable-length temporal sequences. however, it becomes difficult for traditional rnns to learn long-term dependencies due to the vanishing gradient problem [19] . lstms [20] solve the problem of learning long-term dependencies by introducing a specialized memory cell as the recurrent unit. the cells can selectively remember and forget long-term information in their cell state through control gates. in convolutional lstm [15] a convolution operator is added in the state-to-state and input-to-state transitions. all inputs, outputs and hidden states are represented by 3d tensors having 2 spatial dimensions and 1 temporal dimension. this allows the model to capture spatial correlation along with the temporal one. in our model we configured multichannel input such that distinct features can be passed through different channels. multiple convolutional lstm layers are stacked sequentially to form a network with high nonlinear representation. the final layer is a 2d convolutional layer having one filter which constructs a single-channel output image as the next-frame prediction. we assume the transmission rate saturates as the number of infection cases increases. thus, the modified transmission rate is calculated as β′_g(t) = β_g(t) · (τ + i_g(t)), which serves as the response variable for the model, where τ = 1/N_g, N_g is the total population in pixel g, and i_g(t) is the active infection fraction.
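the granger-causality screening described above can be sketched as a nested-regression f-test; a minimal pure-numpy version is shown below (the paper uses a chi-square variant via a statistics package, so the function name `granger_fstat`, the series lengths and the lag choice here are illustrative assumptions):

```python
import numpy as np

def granger_fstat(y, x, maxlag):
    """F-statistic for 'x granger-causes y' with `maxlag` lags: compare a
    restricted autoregression of y against one augmented with lags of x."""
    n = len(y)
    rows = n - maxlag
    target = y[maxlag:]
    y_lags = np.column_stack([y[maxlag - k:n - k] for k in range(1, maxlag + 1)])
    x_lags = np.column_stack([x[maxlag - k:n - k] for k in range(1, maxlag + 1)])
    X_r = np.column_stack([np.ones(rows), y_lags])          # restricted model
    X_u = np.column_stack([np.ones(rows), y_lags, x_lags])  # unrestricted model

    def rss(X):
        beta, *_ = np.linalg.lstsq(X, target, rcond=None)
        resid = target - X @ beta
        return resid @ resid

    rss_r, rss_u = rss(X_r), rss(X_u)
    df_den = rows - X_u.shape[1]
    return ((rss_r - rss_u) / maxlag) / (rss_u / df_den)

# toy check: y is driven by x at lag 3, so x should 'cause' y, not vice versa
rng = np.random.default_rng(0)
x = rng.standard_normal(400)
y = np.zeros(400)
y[3:] = 0.8 * x[:-3] + 0.1 * rng.standard_normal(397)
f_xy = granger_fstat(y, x, maxlag=3)
f_yx = granger_fstat(x, y, maxlag=3)
```

a large f-statistic in one direction but not the other mirrors the per-pixel screening reported in table 1.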
the model is tested by feeding in an input sequence of frames, and the next output frame is predicted, which in turn is combined with the other features along the channel dimension and appended to the input sequence. the new input sequence is fed to the model again to get the next predicted frame. this continues until forecasting completes for the desired time period. mean absolute percent error (mape) and kullback-leibler (kl) divergence [17] are used to measure the accuracy of the model. the model predicts the transmission rate for a future time period for each pixel, which in turn is used to calculate daily new infection cases ∆i+_g(t) using equation 11. the removal rate is estimated as a running average of the previous 3 days and daily removed cases are calculated using equation 11. the active infection cases (i_g(t)) and susceptibles (s_g(t)) are calculated using equations 1 and 2. cumulative infection cases (∑ ∆i+_g(t)) are calculated by summing up all new infection cases up to a certain day. mape of the modified transmission rate is calculated at pixel level for the prediction period and averaged. the pixels with 0 susceptible population count are filtered out while calculating mape and kl divergence. pixel mape is calculated as per equation 12, where g is the set of all grids and g′ the set of all pixels such that the frame for each corresponding grid has a nonzero cumulative infection count, t′ is the prediction time period, t′′ = t − t′ is the total time period in the training set, and β̂′_g(t) and β′_g(t) are the predicted and actual modified transmission rates for the g-th pixel at time t respectively. kl divergence at pixel level is calculated for the modified transmission rate in the prediction period to measure the dissimilarity of the distribution of predicted transmission rate with respect to the actual one.
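the two error measures above can be sketched in numpy as follows; the evaluation mask, the softmax normalization across pixels and the (T, H, W) frame shapes are assumptions based on the description, not the paper's code:

```python
import numpy as np

def pixel_mape(actual, pred, mask):
    """equation-12-style pixel MAPE over frames of shape (T, H, W),
    restricted to pixels where `mask` is True."""
    err = np.abs(pred - actual) / np.abs(actual)
    return 100.0 * err[:, mask].mean()

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def pixel_kl(actual, pred):
    """per-frame KL divergence after turning each frame into a probability
    distribution across pixels via softmax, averaged over time."""
    kls = []
    for a, p in zip(actual, pred):
        pa, pp = softmax(a.ravel()), softmax(p.ravel())
        kls.append(float(np.sum(pa * np.log(pa / pp))))
    return float(np.mean(kls))

actual = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4) + 1.0
mask = np.ones((4, 4), dtype=bool)
perfect = pixel_mape(actual, actual, mask)     # exact prediction -> 0% error
kl_same = pixel_kl(actual, actual)             # identical distributions -> ~0
kl_off = pixel_kl(actual, actual * 0.5 + 3.0)  # distorted frames -> positive
```

a lower kl value indicates the predicted spatial distribution of cases more closely matches the actual one, which is how the metric is used in table 2.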
softmax, applied after scaling each series to the 0-1 range, converts total infection cases into a probability distribution across pixels; p(x) denotes the probability distribution of x. since kl divergence measures the dissimilarity between two distributions, a lower value indicates better performance of the model. mape is also calculated at grid and country level with respect to cumulative predicted infection cases across the region during the prediction period. the model is constructed by stacking 4 convolutional lstm layers sequentially and terminating the network with a convolutional 2d layer. the final layer is followed by an exponential linear unit as activation. the input and other hidden convolutional lstm layers are followed by sigmoid activation. each convolutional lstm layer has 32 filters and kernel size 3x3. the input layer is configured to take tensors of size 16x16x8. eight input features are constructed and fed into the model as separate channels, namely transmission rate, population density, female fraction, median age, active infection fraction, average temperature, temperature standard deviation and average relative humidity. the model is trained for 20 epochs with a batch size of 50 and mean squared error as the loss function. out of 11378 samples, 10809 are used for training the model and 569 for validation. the model is trained and tested twice: once with all eight features and once with only five, leaving out the weather features. the dataset has a time span of 51 days, of which data from the 42nd to the 51st day is used for testing the model and the rest for training and validation. table 2 displays the training, validation and test results of the model.
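as a sanity check on the size of this network, the per-layer parameter count can be computed by hand; the formulas below assume keras-style convlstm gates (four gates, each with an input kernel, a recurrent kernel and a bias) and are a back-of-envelope sketch, not taken from the paper:

```python
def convlstm2d_params(filters, kernel, in_ch):
    # four gates, each with an input kernel (kh*kw*in_ch), a recurrent
    # kernel (kh*kw*filters) and one bias per filter
    kh, kw = kernel
    return 4 * filters * (kh * kw * (in_ch + filters) + 1)

def conv2d_params(filters, kernel, in_ch):
    kh, kw = kernel
    return filters * (kh * kw * in_ch + 1)

# 4 stacked convlstm layers (32 filters, 3x3 kernels) on 8 input channels,
# terminated by a single-filter 3x3 conv2d head, as described above
layers = [convlstm2d_params(32, (3, 3), 8)]
layers += [convlstm2d_params(32, (3, 3), 32) for _ in range(3)]
layers += [conv2d_params(1, (3, 3), 32)]
total = sum(layers)
```

with 'same' padding the 16x16 spatial size is preserved through every layer, so the output frame aligns pixel-for-pixel with the input grid.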
the statistics suggest there is a slight improvement in overall accuracy when weather features are included while training the model. pixel mape and grid mape are below 10% in both cases and country mape is below 1%. predicted total infection cases at the end of the prediction period are slightly overestimated relative to the actual value (1330525) when weather features are included in modelling, and overestimated when weather features are not included. all future references to the trained model assume the model has been trained with all eight features unless otherwise mentioned. fig. 5 illustrates different plots of predicted vs actual infection cases in the 10-day prediction period. fig. 5a and 5b show the plots of predicted vs actual new infection cases and cumulative infection cases per day in the 10-day period. fig. 5c and 5d show the plots of predicted vs actual log-transformed total new infection cases and cumulative infection cases per grid in the 10-day prediction period. all the predicted curves closely approximate the actual values. one of the goals of this study is to understand how different external features influence the transmission rate. we expect to find simple interpretable predictive causal relations between transmission rate and different features. one of the ways to find such relations is building an accurate predictive model followed by explaining the predictions in terms of input features. as described in previous sections, deep neural networks can model the dynamics of an epidemic quite accurately due to their high nonlinear representation. however, high accuracy is traded off against model interpretability. given the complexity of the convolutional lstm network used to model the transmission rate, it is nearly impossible to find how each feature influences the transmission rate just by studying the weight matrices.
using a high-bias predictive model like linear regression or a shallow decision tree not only reduces the accuracy but also drops interpretability [21] . simple models can serve as interpretable models but may fail to capture true relations among features globally. this problem can be solved by building simple local models and drawing local explanations of feature relations. however, there may not be enough data points available, or the data distribution may be highly skewed in a local region, to confidently build a predictive model and draw interpretations from it. thus, we use the trained convolutional lstm model as the global model and draw spatio-temporal local interpretations of it using locally perturbed synthetic data, satisfying a criterion called local fidelity [22] . local fidelity suggests the explanations should be locally faithful to the model behavior. local fidelity does not imply global fidelity; however, global fidelity implies local fidelity. to increase interpretability, simple surrogate models can be trained with local data, as it is expected that the response variable varies almost linearly with the features in a local region. in fact, there is a tradeoff between local fidelity and interpretability that needs to be made. model-agnostic methods perturb the input features in a local region around a single datapoint or a group of datapoints and feed them to the model to obtain the predicted response variable. this synthetic data is in turn used to train simple surrogate models to obtain local interpretations of the global model. there are several existing methods available in the literature to derive local interpretations of a model [22, 23, 25] .
a few works have also proposed methods to derive global explanations from local interpretations of any black-box model [21, 24] . similar to [22] , deriving explanations requires optimization of the following function: ξ(x) = argmin over g ∈ G of ℒ(f, g, π_x) + Ω(g), where G is the set of interpretable surrogate models in a locality, f is the global model to be explained, π_x is the distribution function defining the locality of x, ℒ is the loss function and Ω(g) is the complexity of the model g. it is desirable to minimize both Ω and ℒ; however, in general they are inversely proportional when the spread of π_x is large. a very small spread of π_x is also not desirable, as it will oversimplify g too much to draw any meaningful explanations in the locality. thus, the choice of π_x is important for deriving meaningful interpretations. the locality of x is defined by a threshold distance in all directions from x, both spatially and temporally, and it is given by the tuple (x_s, x_t, d_s, d_t), where x_s and x_t are the spatial and temporal components of observation x, and d_s and d_t are the spatial and temporal threshold distances from x to the boundary of the locality. fig. 6 illustrates the spatio-temporal locality of observation x. spatial locality is bounded by pixels up to d_s in all directions from x, such that the locality of x is bounded by a square box of pixels of size (2·d_s + 1) x (2·d_s + 1). no paddings are applied at the edges; thus, the perimeter defining the locality of pixels at the edges of a frame is trimmed. as illustrated in fig. 6 , temporal locality is defined similarly. combining spatial and temporal locality, the local region of observation x is defined by a sequence of groups of pixels with equal time lead and lag from x, unless x resides on the temporal edge of an input tensor, in which case the temporal locality is trimmed in the direction of the edge. perturbed data points are generated by randomly perturbing the pixel values of x following a uniform distribution. the perturbed distribution is calculated separately for each feature.
the perturbed sample distribution is calculated as x̃ ~ choose(U(x − σ_x, x), U(x, x + σ_x)), where U(a, b) is the uniform distribution with lower and upper bounds a and b, σ_x is the standard deviation of all observations in the locality of x, and choose randomly selects one sample from the two. the spatial features are only perturbed spatially, and the same values are copied temporally along the corresponding channel. the channels having a temporal component are perturbed for different time slices within an input tensor. each perturbed pixel in a time slice represents a separate feature. input tensors are constructed using the perturbed values and passed through the black-box model to generate a predicted output value. the set of all perturbed input data points of x and the corresponding predicted output values serves as the training dataset for the surrogate model g. each input channel and the predicted values are normalized to 0 mean and 1 standard deviation prior to training the surrogate model. normalization is done to convert the features to the same scale, so that the coefficients of a linear regression surrogate model give the relative influence of the features on the response variable. the loss function is thus defined in terms of a function that reconstructs the input tensor in the original representation from the perturbed samples. though the perturbed set can be created by perturbing all features of a pixel in each channel within an input tensor, in our analysis only a subset of all features is perturbed, in order to find the effect of those features on transmission rate. the other features are kept constant as per the original observation.
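the sampling rule for perturbed points described above is partly garbled in extraction; one plausible reading, sketched here, draws each perturbed value uniformly from one of two one-sided intervals around the observed value, with the interval width set by the locality's standard deviation (the two-interval choice and the width are assumptions on my part):

```python
import numpy as np

def perturb_feature(x, locality_values, n_samples, rng):
    """draw perturbed samples around pixel value x: pick, at random, either
    U(x - sigma, x) or U(x, x + sigma), where sigma is the standard deviation
    of the feature over the spatio-temporal locality of x."""
    sigma = float(np.std(locality_values))
    lower = rng.uniform(x - sigma, x, size=n_samples)
    upper = rng.uniform(x, x + sigma, size=n_samples)
    pick_lower = rng.integers(0, 2, size=n_samples).astype(bool)
    return np.where(pick_lower, lower, upper)

rng = np.random.default_rng(1)
locality = np.array([0.2, 0.4, 0.6, 0.8, 1.0])   # toy locality observations
samples = perturb_feature(0.6, locality, 250, rng)
```

every sample stays within one locality standard deviation of the observed value, which keeps the synthetic points faithful to the local region as required by local fidelity.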
intuitively this will explain the effect of the chosen features on the transmission rate in a single-pixel area, given that all other parameters remain constant, including the feature values of spatio-temporally neighboring pixels. in our analysis the perturbed set is created by perturbing the following features only: population density, female fraction, median age, and weather at the 4th, 5th and 6th time lags. weather includes average daily temperature, 3-day temperature standard deviation and average daily relative humidity. apart from the weather features, the other three features have no temporal component, so for them the perturbed values are copied temporally in the input tensor during reconstruction. the weather features from the 4th to 6th time lag are chosen by assuming the average incubation period of sars-cov-2 is between 4 and 6 days. the spatial (d_s) and temporal (d_t) distances defining the locality are taken as 1. the perturbed samples for each feature are generated by equation 18. local interpretations are carried out for each pixel which experienced at least 10 cumulative infection cases by 21st march 2020. the objective is to deduce the influence of the aforementioned features on the transmission rate in each pixel, given all other parameters remain constant. 250 perturbed input samples are generated for each pixel. the samples are reconstructed in tensor format and fed to the model to obtain the predicted transmission rate, and together they form the input-output samples. for each pixel a linear regression surrogate model is trained with the training samples. the coefficients of each feature denote the influence on the transmission rate. fig. 7 illustrates the feature influence chart for different pixels in grid 387. we choose grid 387 as it experienced the highest number of cumulative infection cases, with nearly 10% of total infection cases in the usa as of 1st may 2020. only those coefficients are plotted which have p-value < 0.05.
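the per-pixel surrogate fit can be sketched as standardized least squares; in the snippet below the synthetic response stands in for the convolutional lstm's predictions, and the three features and their coefficients are illustrative, not the paper's:

```python
import numpy as np

def local_influence(X, y):
    """fit a linear-regression surrogate on standardized perturbed samples;
    the returned coefficients act as relative feature-influence scores."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    A = np.column_stack([np.ones(len(ys)), Xs])
    coef, *_ = np.linalg.lstsq(A, ys, rcond=None)
    return coef[1:]   # drop the intercept

# 250 perturbed samples of three features (think: population density,
# female fraction, median age); a stand-in black-box response rewards the
# first feature, penalizes the second, and ignores the third
rng = np.random.default_rng(2)
X = rng.standard_normal((250, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.01 * rng.standard_normal(250)
influence = local_influence(X, y)
```

because features and response are normalized to the same scale, the fitted coefficients are directly comparable, which is exactly why the paper normalizes before training the surrogate.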
the features whose absolute values of median and standard deviation across all days are less than 0.05 are considered unimportant and filtered out of the plot. the counties covered by each pixel in grid 387 which have nonzero population are listed in table 3 . the influence values are smoothed using a 3rd-degree polynomial. new york & bronx show a somewhat positive influence of population density (pop den) and female fraction (f perc) on transmission rate. median age (med age) has a positive effect in the mid period and a negative one in the early and later days. temperature at the 6th time lag (t6 temp) has a slight negative effect in later days. on average, putnam also shows a positive influence of population density, median age and female fraction; however, population density and female fraction show negative influence in later days. relative humidity at the 4th and 5th time lags (t4 rh & t5 rh) has a slight negative impact on average. at grid level, population density and female fraction positively impact transmission rate on a daily basis. median age has a minor positive impact in earlier days and a negative impact in later days. fig. 7d shows the median of influence across all days for different pixels in grid 387. population density and female fraction have positive impact across all pixels. median age closely resembles a sinusoidal curve, which implies that its influence varies widely across pixels. fig. 8 illustrates the global effects of the features on transmission rate. to generate global interpretations, local surrogate models are built for each pixel with 100 perturbed samples. for each feature, the distribution of influence values for all pixels with nonzero population is plotted against time. considering the median of the distribution, population density and female fraction have positive impact across all days whereas median age has negative impact.
temperature has a minor positive impact, temperature standard deviation has a minor negative impact and relative humidity barely has any noticeable impact on transmission rate. from this study it is clear that the local influence of features at pixel and grid level may deviate widely from the global average. this is important, as the spread of infection is highly skewed regionally, such that a few hotspots contribute the majority of the infection cases. thus, studying the local influence of features can shed light on the local dynamics of the spread, while the global influence charts provide a general idea of the influence on the spread. the classical sir model assumes a constant transmission rate, and it typically predicts a smooth bell curve of active infection cases with respect to time, with a single peak. however, transmission rate may vary with respect to multiple external factors, including intervention methods like lockdown. a variable transmission rate may result in periodic subsidence and resurgence of the spread of infection, in turn producing multiple peaks of active infection cases along time. along with this, the recovery rate may also change due to multiple intervention methods like enhancing hospital facilities, improving treatment procedures etc. as shown in equation 10, the recovery rate is very important in achieving a disease-free stable equilibrium state. in general, the average removal rate (recovery rate + death rate) over a period should exceed the average transmission rate in order to reach the disease-free equilibrium.
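the threshold condition above can be illustrated with a toy discrete-time sir iteration; the rates and population normalization below are illustrative, and this is not the paper's equation 10:

```python
def simulate_sir(beta, gamma, i0, days):
    """discrete-time sir on a normalized population: when the removal rate
    gamma exceeds the transmission rate beta, prevalence decays toward the
    disease-free equilibrium; otherwise an outbreak takes hold."""
    s, i = 1.0 - i0, i0
    for _ in range(days):
        new_inf = beta * s * i   # new infections this step
        removed = gamma * i      # recoveries + deaths this step
        s, i = s - new_inf, i + new_inf - removed
    return i

dies_out = simulate_sir(beta=0.1, gamma=0.3, i0=0.01, days=200)  # gamma > beta
persists = simulate_sir(beta=0.4, gamma=0.1, i0=0.01, days=30)   # beta > gamma
```

when removal outpaces transmission, prevalence shrinks by a constant factor each step, so the time to reach near-zero prevalence grows as the gap between the two rates narrows, in line with the discussion that follows.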
considering the death rate to be constant and quite small compared to the recovery rate of covid-19, the time required to reach the equilibrium state is inversely proportional to the difference between recovery rate and transmission rate. in our experiments we used the trained model to do long-term forecasting of the epidemic with current-normal parameters and compared it with a "what if" analysis obtained by modifying the removal rate. a 300-day forecast is carried out for grid 387. since weather features barely impact transmission rate in grid 387, the model trained without weather features is used for forecasting. the "what if" analysis is done by setting a high removal rate to expedite disease-free equilibrium, and is compared with the current-normal forecast, which sets the removal rate as the running average of the past 5 days. in the "what if" analysis the removal rate is set as per equation 10 by setting t = 200 with an upper hard limit of 0.2. as the removal rate changes, daily active infection cases change, which in turn impacts future transmission rate; due to the upper hard limit on the removal rate, its value in some pixels is less than the upper bound calculated by equation 10. from fig. 9a and 9b it is evident that the number of active infection cases reduces much faster in the "what if" analysis and most of the pixels hit a near-baseline state at least once within the 200-day period. however, rapid periodic resurgence of the disease is seen in this case. as the recovery rate has an upper hard limit, in some cases resurgence with a high transmission rate resulted in destabilizing the disease-free equilibrium. the growth is again quickly dampened due to the high recovery rate in future periodic resurgences.
this can be empirically explained by the fact that the population gets cautious and maintains social distancing with low intermixing when infection cases are high, and vice versa. fig. 9c and 9d suggest there is rapid periodic resurgence of new infection cases in the "what if" analysis compared to current normal, and multiple short low-new-infection periods are seen. the resurgences in some cases (pixels 85, 101) are stronger compared to current normal. thus, it is evident that by only increasing the recovery rate abruptly, infection spread may not be controlled fully unless other intervention methods are adopted to prevent a spike of transmission rate during resurgence periods. fig. 10a and 10b show the plots of daily active infections when only pixels 101 and 102, respectively, are subjected to the modified recovery rate and the other pixels are set to the current-normal recovery rate. in both cases there is a quick dampening of active cases in pixels 101 and 102, and the resurgence spike is shorter and weaker compared to fig. 9b . it is evident there is spatial influence of neighboring active cases and transmission rate. one explanation can be that isolated intervention measures to dampen the spread do not break the cautiousness and preventive measures among the population. this makes determining an ideal recovery rate for a region a complex optimization problem. fig. 11 shows that active infection cases at grid level quickly reach the baseline in the "what if" scenario compared to current normal, but the disease is not eradicated fully. there are also small periodic spikes in the future. the current-normal scenario suggests that unless strict intervention actions are taken to reduce transmission rate or raise recovery rate, it is going to take a long time to reach the baseline. the trace of new infection cases suggests the trend is quite similar in both scenarios, with more frequent and stronger spikes in the "what if" scenario.
in the current-normal scenario the model estimates 487254 new infection cases and 711040 removed cases in the 300-day period. in the "what if" scenario it estimates 549158 new infection cases and 909437 removed cases. however, fig. 11c suggests most of the removal happens in the initial 50 days of the forecast period due to the abrupt increase of the removal rate. in the real world such an abrupt increase of the removal rate may not be possible. however, if on average the difference between removal rate and transmission rate can be maintained as per equation 10, it is possible to dampen the spread of infection within a desired time period. though in our analysis we interpreted removal in the strict sense, it need not refer to complete recovery. identification and complete isolation of a patient, such that there is negligible chance of further spread of the infection from the patient, may also be regarded as removal. thus, maintaining a high recovery rate, rapid and strict isolation of infected patients, and intervention methods to reduce transmission rate are the keys to rapid convergence to the disease-free equilibrium. a thorough study of the transmission rate of covid-19 in the usa revealed several insights, and key influencers were identified. however, there might be other influencers like human mobility, demographics, government interventions etc.; on availability of those feature data, the proposed methods may be applied to find their influences. these methods can also be applied to other countries. though a threshold condition is derived for disease-free equilibrium, it is not straightforward to determine the ideal recovery rate to rapidly dampen the infection spread, due to the complex dependency of the transmission rate. a general solution method may be investigated to solve this optimization problem and come up with ideal regional recovery rates.
references (titles as extracted; numbering inferred from the inline citations):
[1] containing papers of a mathematical and physical character
[2] qualitative analyses of communicable disease models
[3] influence of nonlinear incidence rates upon the behavior of sirs epidemiological models
[4] nonlinear biological dynamics system
[5] a delayed epidemic model with pulse vaccination. discrete dynamics in nature and society
[6] regulation and stability of host-parasite population interactions: i. regulatory processes
[7] analysis of a delayed sir model with nonlinear incidence rate. discrete dynamics in nature and society
[8] dynamic analysis of an sir epidemic model with nonlinear incidence rate and double delays
[9] a simple sis epidemic model with a backward bifurcation
[10] infectious disease models with time-varying parameters and general nonlinear incidence rate
[11] on the final size of epidemics with seasonality
[12] estimation of time-varying transmission and removal rates underlying epidemiological processes: a new statistical tool for the covid-19 pandemic
[13] temperature and latitude analysis to predict potential spread and seasonality for covid-19
[14] modelling and prediction of the 2019 coronavirus disease spreading in china incorporating human migration data
[15] convolutional lstm network: a machine learning approach for precipitation nowcasting
[16] us covid-19 daily cases with basemap
[17] investigating causal relations by econometric models and cross-spectral methods. econometrica
[18] distribution of the estimators for autoregressive time series with a unit root
[19] the vanishing gradient problem during learning recurrent neural nets and problem solutions
[20] lstm can solve hard long time lag problems
[21] from local explanations to global understanding with explainable ai for trees.
nature machine intelligence
[22] explaining the predictions of any classifier
[23] how to explain individual classification decisions
[24] anchors: high-precision model-agnostic explanations
[25] model agnostic supervised local explanations
[26] artificial intelligence forecasting of covid-19 in china
[27] analysis and forecast of covid-19 spreading in china, italy and france
[28] a deep residual network integrating spatial-temporal properties to predict influenza trends at an intra-urban scale
[29] us covid-19 daily cases with basemap
[30] an overview of the global historical climatology network-daily database (2012)
[31] climate reference network after one decade of operations: status and assessment
[32] a multivariate spatiotemporal spread model of covid-19 using ensemble of convlstm networks
key: cord-292699-855am0mv authors: engbert, ralf; rabe, maximilian m.; kliegl, reinhold; reich, sebastian title: sequential data assimilation of the stochastic seir epidemic model for regional covid-19 dynamics date: 2020-04-17 journal: nan doi: 10.1101/2020.04.13.20063768 sha: doc_id: 292699 cord_uid: 855am0mv newly emerging pandemics like covid-19 call for better predictive models to implement early and precisely tuned responses to their deep impact on society. standard epidemic models provide a theoretically well-founded description of the dynamics of disease incidence. for covid-19, with infectiousness peaking before and at symptom onset, the seir model explains the hidden build-up of exposed individuals, which challenges containment strategies, in particular due to delayed epidemic responses to non-pharmaceutical interventions. however, spatial heterogeneity questions the adequacy of modeling epidemic outbreaks on the level of a whole country. here we show that sequential data assimilation of a stochastic version of the standard seir epidemic model captures the dynamical behavior of outbreaks on the regional level.
such regional modeling of epidemics with relatively low numbers of infected and realistic demographic noise accounts for both spatial heterogeneity and stochasticity. based on adapted regional models, population-level short-term predictions can be achieved. more realistic epidemic models that include spatial heterogeneity are within reach via sequential data assimilation methods. the evolving spread of the novel coronavirus in germany [1] resulted in containment measures based on reduced traveling and social distancing [3] . in epidemic standard models [2, 14] , which provide a dynamical description of epidemic outbreaks [7, 20] , containment measures aim at a reduction of the contact parameter. since the contact parameter is one of the critical parameters that determine the speed of increase of the number of infectious individuals, estimation of the contact parameter is a key basis of epidemic modeling [17] . the current situation of covid-19 is characterized by extreme spatial heterogeneity [1] . in the initial phase of the outbreak, this heterogeneity is caused by random travel-based imports of infectious cases and enhanced by local events with increased contacts. as a consequence, the assumption of homogeneous mixing must be relaxed [12] and coupled dynamics of regional models seem to be a more adequate description [15] . however, when modeling a relatively small region with population size n = 10^5, compared to the country level with n = 10^7 to 10^9, demographic stochasticity [9, 12] must be addressed (see the stochastic seir model). the combination of dynamical modeling with substantial fluctuations calls for sequential data assimilation methods for parameter inference [5, 19] . we investigate the stochastic seir epidemic model [2] for application to regional data of covid-19 incidence. the model assumes s, e, i, and r compartments representing susceptible, exposed, infectious, and recovered individuals (fig. 1) .
this model is particularly important for the description of the spread of covid-19, since infectiousness seems to peak on or before symptom onset [13] , such that models without the exposed compartment cannot adequately address the time delay between the build-up of exposed and infectious individuals. figure 1: the seir model. the population is composed of four compartments that represent susceptible, exposed, infectious, and recovered individuals. the contact parameter β is critical for disease transmission; 1/a and 1/g are the average durations of the exposed and infectious periods, respectively. different from the standard model, birth and death processes are neglected for the short-term simulations discussed throughout the current study. since we are interested in short-term modeling (weeks to months), we neglect birth and death processes as a first-order approximation for the dynamics of the model. disease-related model parameters are the rate parameters a = 1/z (with average latency period z) and g = 1/d (with mean infectious period d), which can be estimated independently from analysis of infected cases [13, 15] . therefore, the time-dependent contact parameter β is the most critical parameter that needs to be determined via data assimilation [19] . the contact parameter β is directly related to the basic reproductive rate r0 in a seir-type model (see seir model and basic reproductive rate). therefore, non-pharmaceutical interventions that aim at r0 < 1 translate into the relation β < g in the model. in the following, we will use a combination of sequential data assimilation and stochastic modeling on the regional level to estimate spatial heterogeneity in epidemic spread and show how to use such a combined approach for epidemic prediction and uncertainty quantification. the key motivation of the current study was to apply sequential data assimilation of the stochastic seir model to estimate the contact parameter.
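a stochastic seir step with demographic noise can be sketched with binomial transition counts (a tau-leaping discretization; the paper's exact noise model may differ, and the rates below are illustrative):

```python
import numpy as np

def seir_step(state, beta, a, g, dt, rng):
    """one tau-leaping step of a stochastic SEIR model: each transition count
    is binomial, which produces demographic noise at small population sizes."""
    s, e, i, r = state
    n = s + e + i + r
    new_e = rng.binomial(s, 1.0 - np.exp(-beta * i / n * dt))  # S -> E
    new_i = rng.binomial(e, 1.0 - np.exp(-a * dt))             # E -> I
    new_r = rng.binomial(i, 1.0 - np.exp(-g * dt))             # I -> R
    return (s - new_e, e + new_e - new_i, i + new_i - new_r, r + new_r)

# a region of n = 10^5 with seed infections; beta < g gives r0 < 1,
# so the outbreak should die out over the simulated period
rng = np.random.default_rng(3)
state = (99000, 0, 1000, 0)
for _ in range(100):
    state = seir_step(state, beta=0.15, a=1 / 5.5, g=1 / 3.0, dt=1.0, rng=rng)
```

because the transitions are integer-valued draws, repeated runs of the same parameters produce different trajectories, which is exactly the demographic stochasticity the regional models need to capture.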
using simulated data, we successfully applied an ensemble kalman filter [10, 19] for recovery of the contact parameter from data (see parameter recovery from simulated data). when applied to empirical data on the level of a region, the estimation of the contact parameter produces a comparable evidence profile (see application to empirical data). in the early phase of the outbreak of covid-19 in germany, the reported cumulative numbers of cases strongly increase (fig. 2a,b); however, epidemic dynamics vary on the regional level. such spatial heterogeneity is due to different onset times of the disease in different regions, but is also enhanced by variations in the local contact parameters β. in response to containment measures, we expect β to change over time. estimation of the time-dependence of the contact parameter is done via the model's best fit. an approximate instantaneous negative log-likelihood l(t_k, β) of the contact parameter β at observation time t_k is obtained from the ensemble kalman filter (see model inference based on sequential data assimilation). thus, by determining the minimum of l(t_k, β) with respect to β at time t_k we estimate the time-dependence of the best fit β*(t_k) (fig. 2c) . the black line reports the average time dependence for all 320 regions included in the analysis; standard deviations are indicated by the grey area. results for the two example regions are given by their corresponding colors. the non-pharmaceutical interventions against the spread of covid-19 were implemented at slightly varying points in time across germany. in the majority of regions, closures of schools and other educational institutions started on march 16th, while a large-scale contact ban was implemented on march 22nd. since these social distancing measures will have an impact on the contact parameter, we expected to observe a related drop in the contact parameter over time.
before we present a corresponding analysis, it should be made clear that none of these measures can produce an immediate effect on the observed cases of infected individuals because of the latency period. to obtain a reliable estimate of the contact parameter, the related interval should be as long as possible, since sequential data assimilation needs several data points to adapt the model to the data. therefore, we selected the average value of β*(t_k) over the three days from march 17th to march 19th as a pre-intervention value. the average over march 31st to april 2nd is taken as an estimate of the post-intervention value. to analyze the effect across regions, we computed average values β_pre (march 17-19) and β_post (march 31-april 2) of the relevant β*(t_k) for all regions. a scatter plot indicates a clear reduction of the numerical value of the contact parameter from β_pre to β_post (fig. 2d) . the reduction is statistically significant (wilcoxon test, p < 0.01). finally, the overall time-dependence β*(t_k) shows a decreasing trend; however, figure 2c suggests that there is an additional weekly cycle with local minima at weekends (march 22 and 29). both of the example regions show this effect as well. for the rki data, we do not expect that seemingly reduced contact parameters are a simple consequence of increased reporting delay over the weekend, since this database is continuously updating the reported cases back in time (see rki data on covid-19 in germany). the contact parameter β is the most critical parameter determining the dynamics of the stochastic seir model. after time-resolved estimation of the best fit β*(t_k), we are able to generate simulations from an initial state to predict the future trajectory (fig. 3). simulations i are started from the first epidemic day on which the corresponding region reported 30 or more cases.
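the pre/post comparison described above reduces, per region, to averaging the daily best-fit values β*(t_k) over two three-day windows; a minimal sketch (the per-day values below are purely illustrative, not taken from the paper's data):

```python
from statistics import mean

def window_mean(beta_by_day, days):
    """average the daily best-fit contact parameter over the given days."""
    return mean(beta_by_day[d] for d in days)

# hypothetical per-region time series of daily best-fit beta*(t_k)
region_beta = {
    "2020-03-17": 0.51, "2020-03-18": 0.48, "2020-03-19": 0.45,
    "2020-03-31": 0.22, "2020-04-01": 0.20, "2020-04-02": 0.24,
}
beta_pre = window_mean(region_beta, ("2020-03-17", "2020-03-18", "2020-03-19"))
beta_post = window_mean(region_beta, ("2020-03-31", "2020-04-01", "2020-04-02"))
reduced = beta_post < beta_pre  # one paired observation entering the wilcoxon test
```

one such (β_pre, β_post) pair per region yields the 320 paired observations that feed the wilcoxon signed-rank test reported above.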
the initial numbers of infected i_0 were set to the observed number of cases y_obs(t_0), while the initial numbers of exposed were set to e_0 = g/a · i_0, which would hold at epidemic equilibrium. the initial number of infected people was perturbed by noise representing uncertainties in the initial model states. forward iteration with the estimated time-varying contact parameter shows that the slope of the epidemic curve is approximately reproduced by the model (fig. 3a,c; grey lines indicate the ensemble of simulated trajectories; blue points are observed data). on march 26th, we started simulations ii, which exploit the full potential of sequential data assimilation. the sequential data assimilation approach via the ensemble kalman filter (see ensemble kalman filter) is based on the forward modeling of an ensemble of trajectories. the forward simulations discussed in the previous section demonstrated the predictive power of the seir model after sequential data assimilation. in the next step, we generated simulations under two different scenarios. in scenario i, we started with the adapted ensemble of internal model states after data assimilation (april 4th) and iterated the model forward with the mean contact parameter estimated in the week march 29th to april 4th, after implementation of interventions (fig. 4 , green area). the simulations smoothly continue the time-course of infected cases for both example regions (fig. 4a,b) . daily reported case numbers show a decline for both regions (fig. 4c,d) . in scenario ii, we assumed that all governmental intervention measures were terminated. therefore, we used the contact parameter estimated in the week march 15th to march 21st. again we started simulations with the adapted ensemble of internal states after sequential data assimilation (fig. 4 , red area). for both example regions, we observe a strong increase in infected cases under scenario ii (fig. 4c,d).
the dramatic increase can be seen most clearly in the plot of daily numbers of new cases (for more examples see the supplementary information appendix). the ongoing worldwide spread of the new coronavirus exerts enormous pressure on health systems, societies and governments. therefore, predicting the epidemic dynamics under the influence of non-pharmaceutical interventions (npi) is an important problem from a data science and mathematical modeling perspective [18] . the motivation of the current work was to explore the potential of sequential data assimilation [19] for a regional epidemic model as a forecasting tool. standard epidemic seir-type models implement a compartmental description under the assumption of homogeneous mixing of individuals [2] . more realistic modeling approaches require spatial heterogeneity due to regionally varying disease onset times, regionally different contact rates, and the time-dependence of the contact rates due to implementation of containment strategies. however, regional descriptions require models that include effects of demographic stochasticity due to limited population sizes and case numbers in the region considered [6] . effects of such statistical fluctuations are inherently reproduced via stochastic versions of the standard epidemic models [9, 12] . we demonstrated the potential of sequential data assimilation for covid-19 dynamics at the level of a regional, stochastic model. based on the ensemble kalman filter [10] , we successfully recovered the contact parameter from simulated data and obtained reliable estimates from empirical data. the contact parameter is the most critical free parameter in the stochastic seir model, since other parameters (mean exposed and infectious duration) can be estimated independently from observed time series [13, 15] . moreover, the contact parameter of the seir model is directly related to the basic reproductive rate r_0 [16] .
therefore, our approach could also be framed as a model-based method for statistical inference of the basic reproductive rate.

(the copyright holder for this preprint is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. it is made available under a cc-by 4.0 international license. https://doi.org/10.1101/2020.04.13.20063768; medrxiv preprint.)

next, we ran time-resolved analyses that generated estimates of the time-dependence of the contact parameter. the drop in mean contact rates from an early (β_pre, march 17-19) to a later period (β_post, march 31-april 2) indicated the effect of non-pharmaceutical interventions. we also generated model predictions under two different scenarios. in scenario i, we started simulations from april 4th, sampling from the assimilated ensemble as initial conditions, with the contact parameter estimated for the post-intervention period (march 29-april 4). in scenario ii, we replaced the post-intervention contact parameter by its pre-intervention value (march 15-21). as a result, the two scenarios predict rather different temporal developments (decline of daily new cases for scenario i, and strong increase for scenario ii). therefore, our model predictions suggest that lifting the current interventions would clearly switch the epidemic dynamics back to the exponential increase observed before the implementation of non-pharmaceutical interventions. such predictions can easily be scaled up to the federal state level (bundesländer) or to the country level; a corresponding predictive model will potentially be quite robust because of explicit modeling of spatial and temporal heterogeneities, captured by a separate time-course of the contact parameter for each region. the recent simulation study by li et al. [15] used a similar approach of sequential data assimilation for dynamic epidemic models. however, there the deterministic seir model was implemented and extended by additional noise assumptions. we proposed the usage of the stochastic seir model in the formulation of a master equation [9] , which can be simulated exactly and numerically efficiently using gillespie's algorithm [11] . a more complex spatiotemporal stochastic model has been considered in [4] . furthermore, the state-parameter estimation in [15] utilises the ensemble kalman filter directly on an augmented state space [19] . contrary to that study, we found a direct application of the ensemble kalman filter to the augmented state space (x, β) not suitable because of the strongly nonlinear interaction between the model states x and the contact parameter β. this led us to the two-stage approach presented in this study, combining the ensemble kalman filter for state estimation with a likelihood-based inference of the contact parameter β. our current study was mainly motivated by the methodological problem of a possible contribution of data assimilation to epidemics modeling based on a stochastic seir model. there are obvious limitations in our current modeling framework, which we did not address because of the methodological focus. longer-term predictions (∼ months) are important, but depend critically on the estimation of undocumented infections (see li et al. [15] ). such hidden infections create, after recovery, an unknown reduction in the number of susceptibles that slows down epidemic dynamics; such an effect is currently not included in our current model.
however, it seems compatible with our framework to extend the seir model by an additional class of undocumented infected individuals [15] . another important limitation of these results comes from the simplifying assumption that there is no coupling to neighboring regions. as a consequence, regional differences in the contact parameter might in fact be due to differences in contacts between regions. introducing couplings between regions [15] could also be integrated in our modeling framework. however, the no-coupling approximation might be realistic in the current situation of social distancing and contact bans. linking back to the potential of npi to control covid-19 until vaccination or medication is available, we close with an observation relating to a recent study [8] . our analyses covered the same time span as dehning et al. [8] and we also picked up a systematic pattern of contact reduction which we interpret as a weekly cycle (see statistical modeling of trends and weekly oscillations). indeed, the dates of the interventions analyzed by dehning et al. are confounded with the day of the week; the three dates refer to three sundays in march. it is reasonable to assume that in germany there is much less contact with persons outside the family context on weekends than during working days. of course, there are also other reasons why certified reports of covid-19 infections are less frequent for weekends. for example, people are more likely to go to their gp on monday than to the emergency room on sunday, especially as long as the symptoms are mild. in our opinion, a lack of recording at the rki is not among these reasons; the cases reported for the weekend are added on monday and tuesday. one advantage of the method proposed here is that we recover effects with comparatively little data at the level of regions, not an entire country.
in the absence of a vaccine or of medication, having such an epidemic forecasting tool seems almost like a necessary precondition for selective and optimal timing of tightening and loosening of npi-based containment measures. the robert koch institute (rki), the central scientific institution in the field of biomedicine within the portfolio of the federal ministry of health, provides daily access to the number of confirmed cases, deaths, and recovered patients, broken down by 412 counties, six age groups, and sex. as they are official records, only cases certified by doctors or labs according to a strict medical protocol in accordance with the infection protection act are entered into the database. the exact time of an infection is usually not known. the associated time stamp refers to the date on which the local health authority became aware of the case and recorded it electronically. as records are passed from the physician or lab via local and state health authorities to the rki, there is a delay of several days before cases are available on the website. statistics relating to the most recent three or four days are incomplete and cannot be interpreted; retrospective updates and corrections are possible for all days of the pandemic spell as they become available. we use data up to and including april 4, as reported on april 8, 2020; they are included as part of the supplement. the seir epidemic model is a four-compartment model with susceptible individuals, who are able to contract the disease; exposed individuals, who are infected but not yet infectious; infectious individuals; and recovered individuals, who are immune.
the model is typically formulated as a system of ordinary differential equations (ode), i.e.,

ṡ = m·n − β·s·i/n − m·s,
ė = β·s·i/n − a·e − m·e,
i̇ = a·e − g·i − m·i,
ṙ = g·i − m·r,   (1-4)

with birth and death rate m, where the total number of individuals n = s + e + i + r is constant under temporal evolution due to ṅ = 0. the ode system, eq. (1-4), has a non-trivial equilibrium point, denoted as epidemic equilibrium (s_0, e_0, i_0, r_0), where the number of susceptibles s_0 at equilibrium is related to the basic reproductive rate. since we aim at a short-term description of the system, we neglect birth and death processes here; in the limit m → 0, we obtain

ṡ = −β·s·i/n,  ė = β·s·i/n − a·e,  i̇ = a·e − g·i,  ṙ = g·i.

we use numerical values of g = 1/3, equivalent to an average infectious period of d = 3 days, and a = 0.192, or an average latency period of z = 5.2 days [13] . as a consequence, the critical condition for disease containment r_0 < 1 is obtained for β < β_crit = 1/3 in our model. while the classical model is formulated as a system of ordinary differential equations, we are exploring the application to relatively low numbers of cases in the early phase of the current epidemics on the regional level with population sizes from 10^5 to 10^6. therefore, we use the stochastic seir model in the form of a master equation [9] , which is particularly useful for modeling small numbers of infected individuals occurring in smaller regions or in the beginning of epidemics. the demographic seir model contains four variables denoted by x = (s, e, i, r)^t ∈ ℕ^4 representing the number of individuals in each of the four classes with constant population size n = s + e + i + r. the transition rates of the ode compartmental model translate into transition probabilities in the master equation formulation for the evolution of the model's conditional probability density, that is,

∂p(x, t | x_0)/∂t = Σ_{x'} [ w_{x'→x} p(x', t | x_0) − w_{x→x'} p(x, t | x_0) ],

with transition probabilities given in table 1 and initial condition x_0. single trajectories of the model's temporal evolution can be simulated exactly and numerically efficiently [9] using gillespie's algorithm [11] .
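a gillespie simulation of this stochastic seir model can be sketched in a few lines (our own minimal implementation, not the authors' osf code; the parameter values g = 1/3 and a = 0.192 are taken from the text, and the initial condition follows e_0 = g/a · i_0):

```python
import random

def gillespie_seir(s, e, i, r, beta, a=0.192, g=1/3, t_max=20.0, seed=1):
    """exact stochastic simulation of the demographic seir model.

    transitions (cf. table 1): s -> e at rate beta*s*i/n (infection),
    e -> i at rate a*e (end of latency), i -> r at rate g*i (recovery).
    returns the list of (time, (s, e, i, r)) event pairs.
    """
    rng = random.Random(seed)
    n = s + e + i + r
    t, path = 0.0, [(0.0, (s, e, i, r))]
    while t < t_max:
        rates = (beta * s * i / n, a * e, g * i)
        total = sum(rates)
        if total == 0.0:                 # absorbing state: epidemic is over
            break
        t += rng.expovariate(total)      # exponentially distributed waiting time
        u = rng.uniform(0.0, total)      # choose event proportionally to its rate
        if u < rates[0]:
            s, e = s - 1, e + 1
        elif u < rates[0] + rates[1]:
            e, i = e - 1, i + 1
        else:
            i, r = i - 1, r + 1
        path.append((t, (s, e, i, r)))
    return path

# initial state from an observed case count y_obs(t_0) = 30, with e_0 = g/a * i_0
i0 = 30
e0 = round((1 / 3) / 0.192 * i0)         # exposed at epidemic equilibrium
path = gillespie_seir(s=100_000 - e0 - i0, e=e0, i=i0, r=0, beta=0.6)
```

the population size is conserved along every trajectory, and repeated runs with different seeds produce the demographic fluctuations discussed above.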
publicly available data on the cumulative number of infected individuals is used to infer the model states x = (s, e, i, r)^t and the contact parameter β of the stochastic seir model. note that the cumulative number of infected individuals corresponds to y = i + r in the seir model.

table 1: transitions and transition probabilities in the stochastic seir model. transitions are from state x = (s, e, i, r)^t to x' with probability w_{x→x'}:

(s, e, i, r) → (s−1, e+1, i, r),  w = β·s·i/n  (infection)
(s, e, i, r) → (s, e−1, i+1, r),  w = a·e  (end of latency)
(s, e, i, r) → (s, e, i−1, r+1),  w = g·i  (recovery)

in the present study we combine sequential data assimilation for the model states with an approximate log-likelihood function for the contact parameter [19] . the basic algorithmic idea is to propagate an ensemble of m model forecasts using gillespie's algorithm up to the next available observation point t_k; the forecast ensemble, denoted by x_f(t_k), is then adjusted to the incoming data point. this adjustment step is implemented via the ensemble kalman filter [19] . while the forecast ensemble is used to compute the temporary negative log-likelihood l(t_k, β) of the model's contact parameter β at time t_k, the adjusted model states serve as starting values for the next gillespie prediction cycle. the above algorithm is run over a fixed range of contact parameters β ∈ [β_min, β_max] and over a fixed time window [t_initial, t_final] of available data points y_obs(t_k). the best-fit contact parameter β*(t_k) at any time t_k is found as the one that minimises the temporary negative log-likelihood function, that is, β*(t_k) = arg min_β l(t_k, β), with l(t_k, β) defined by (13) below. the observed cumulative number of infected individuals y_obs(t_k) is linked to the model states x = (s, e, i, r)^t via y(t_k) = h·x(t_k) = i(t_k) + r(t_k), i.e., h = (0, 0, 1, 1).
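the observation operator h and the ensemble statistics it feeds into can be sketched as follows (a minimal illustration with a hypothetical three-member ensemble; the real filter uses m members propagated by gillespie's algorithm):

```python
from statistics import mean, pvariance

def observe(state):
    """observation operator h = (0, 0, 1, 1): cumulative infected y = i + r."""
    s, e, i, r = state
    return i + r

def forecast_statistics(ensemble):
    """empirical mean and variance of the observed forecast ensemble,
    quantifying the forecast uncertainty used in the analysis step."""
    ys = [observe(x) for x in ensemble]
    return mean(ys), pvariance(ys)

# hypothetical forecast ensemble of three (s, e, i, r) states
ensemble = [(9970, 12, 10, 8), (9965, 14, 12, 9), (9972, 11, 9, 8)]
y_mean, y_var = forecast_statistics(ensemble)
```

the two returned quantities are exactly the forecast-ensemble mean and variance that enter the kalman gain and the likelihood in the analysis step.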
as initial condition, we set i_1 to the number of infected cases and r_1 = 0, so that y_obs(t_0) = i_1 + r_1, and e_1 = g/a · i_1 with additive noise. we assume that the errors in the observed y_obs(t_k) are additive gaussian with mean zero and variance ρ. we set ρ = 10 in our experiments. the analysis step of the ensemble kalman filter is now based on the empirical mean and the empirical covariance matrix of the forecast ensemble. these two quantities are used to quantify the forecast uncertainty. combining the forecast uncertainty with the assumed data error model leads to the linear regression formula

x_a^(j) = x_f^(j) − k (h·x_f^(j) − y_obs(t_k)),  j = 1, ..., m,

with the kalman gain defined by

k = p_f h^t (h p_f h^t + ρ)^(−1),   (12)

where p_f denotes the empirical covariance matrix of the forecast ensemble. the resulting analysis ensemble x_a(t_k) serves as the set of starting values for the next prediction cycle.

model evidence. the model's negative log-likelihood at an observation time t_k is approximated by

l(t_k, β) = (y_obs(t_k) − h·x̄_f(t_k))² / (2 (h p_f h^t + ρ)) + (1/2) log(h p_f h^t + ρ),   (13)

where x̄_f denotes the forecast ensemble mean. note that the first contribution penalises the data misfit while the second penalises model uncertainty. the smaller the negative log-likelihood, the better the chosen model parameter β fits the data y_k at time t_k. the best parameter fit over a time window [t_min, t_max] is defined as the value of β which minimises the cumulative negative log-likelihood

l_cum(β) = Σ_{t_min ≤ t_k ≤ t_max} l(t_k, β),   (14)

that is, β* := arg min_β l_cum(β).   (15)

parameter recovery from simulated data. to test the inference scheme, we simulated data for 20 days. in figure 6, the black line indicates the evolution of the seir model's predicted cumulative numbers of infected individuals, y(t_k) = h·x(t_k) = i(t_k) + r(t_k). red dots represent the daily number of reported cases as in real data. in the simulation, the contact rate was chosen as β_true = 0.6. in the following, we analyzed whether this true value could be recovered using the inference procedures described above. we varied the contact rate β and determined the cumulative negative log-likelihood values l_cum(β), eq. (14).
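for the scalar observation y = i + r, the analysis step, the instantaneous negative log-likelihood, and the grid minimisation can be sketched as follows (a simplified illustration in the observed variable only; the full filter also updates s, e, i, r via the cross-covariance with y):

```python
import math

def enkf_analysis(y_forecast, y_obs, rho=10.0):
    """scalar ensemble kalman analysis: nudge each observed forecast
    toward the data with gain k = p / (p + rho)."""
    m = sum(y_forecast) / len(y_forecast)
    p = sum((y - m) ** 2 for y in y_forecast) / len(y_forecast)
    k = p / (p + rho)
    return [y - k * (y - y_obs) for y in y_forecast]

def neg_log_likelihood(y_forecast, y_obs, rho=10.0):
    """approximate instantaneous negative log-likelihood l(t_k, beta):
    a data-misfit term plus a model-uncertainty penalty."""
    m = sum(y_forecast) / len(y_forecast)
    p = sum((y - m) ** 2 for y in y_forecast) / len(y_forecast)
    return (y_obs - m) ** 2 / (2 * (p + rho)) + 0.5 * math.log(p + rho)

def best_beta(betas, cumulative_nll):
    """beta* = argmin over the beta grid of the cumulative likelihood l_cum."""
    return min(zip(cumulative_nll, betas))[1]

analysis = enkf_analysis([10.0, 20.0, 30.0], y_obs=25.0)
beta_star = best_beta([0.4, 0.5, 0.6, 0.7], [12.1, 9.4, 8.2, 8.9])
```

in the two-stage scheme described above, l(t_k, β) is accumulated over the grid of candidate β values while the analysis ensemble restarts each gillespie prediction cycle.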
the position of the minimum of l_cum(β) indicates the best estimate for the numerical value of the underlying contact rate β*, eq. (15). the position of the minimum turns out to be close to the true value, β* ≈ β_true = 0.6 (fig. 6b) . thus, parameter recovery can be demonstrated for a relatively short time series of 10 observations, which represents a typical data-set in the early phase of newly emerging epidemics. next, we apply our inference scheme to real data. application to empirical data. since parameter inference was successful for simulated data, the next step was an application to empirical observations. we applied the inference framework to two regional data sets from the rki database. as an example, we selected the covid-19 time series for köln (rki data, population size n = 1,085,664), which includes 27 days of observations with more than 30 cases and is plotted in figure 7 . parameter estimation yields an estimate for the contact rate of β* ≈ 0.7 (fig. 7b) . thus, analysis of the negative log-likelihood function produced qualitatively similar results for simulated seir time series and empirical data for a representative region. in the main text, we carry out an estimation of the time-resolved instantaneous optimal parameter values β*(t_k) using the instantaneous negative log-likelihood function l(t_k, β), eq. (13). we found that our results were relatively insensitive to the choice of the measurement error variance ρ appearing in (12) and (13) . at the same time, we emphasise that the errors in the reported cumulative numbers of infected individuals are complex, may vary over time, and will certainly impact the inferred parameters. the same applies to the unknown initial model states x(t_0) = (s(t_0), e(t_0), i(t_0), r(t_0))^t and their uncertainties.
statistical modeling of trends and weekly oscillations. mean certified cases, computed across regions per day, reveal a strong weekly profile with local minima always falling on sundays. the corresponding mean of log contact parameters reveals the same oscillation. the negative slopes are compatible with the expectation that a decrease in contact rates causes a decrease in daily cases. these illustrative analyses imply two considerations. if evaluated against statistics of daily cases, containment measures must obviously take the weekly cycle into account and avoid confounds in the design. there are many potential sources for this cycle, some of them possibly quite trivial (e.g., the number of tests carried out). however, if experimental research and statistical modeling can establish that part of these fluctuations is indeed due to reduced contact rates beyond the family context on weekends, then dynamical models may help with the prediction of the timing of tightening and loosening decisions in local contexts. for example, one may consider moving to a three- or four-workday week for some time, in well-defined contexts, and in targeted regions to facilitate this dynamic. this work was supported by a grant from deutsche forschungsgemeinschaft, germany (sfb 1294, project no. 318763901). data and source code for simulations, analyses, and figures are available via the open science framework (osf) at https://osf.io/7dshm/ . the supplementary pdf file includes additional examples for regional modeling.
references

[1] covid-19 data for germany
[2] infectious diseases of humans: dynamics and control
[3] how will country-based mitigation measures influence the course of the covid-19 epidemic? the lancet
[4] a mathematical model for the spatiotemporal epidemic spreading of covid19
[5] the quiet revolution of numerical weather prediction
[6] containment strategy for an epidemic based on fluctuations in the sir model
[7] space, persistence and dynamics of measles epidemics
[8] inferring covid-19 spreading rates and potential change points for case number forecasts
[9] chance and chaos in population biology: models of recurrent epidemics and food chain dynamics
[10] data assimilation: the ensemble kalman filter
[11] a general method for numerically simulating the stochastic time evolution of coupled chemical reactions
[12] spatial heterogeneity, nonlinear dynamics and chaos in infectious diseases
[13] temporal dynamics in viral shedding and transmissibility of covid-19
[14] early dynamics of transmission and control of covid-19: a mathematical modelling study. the lancet infectious diseases
[15] substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov2)
[16] the reproductive number of covid-19 is higher compared to sars coronavirus
[17] fundamental principles of epidemic spread highlight the immediate need for large-scale serological surveys to assess the stage of the sars-cov-2 epidemic
[18] effective containment explains sub-exponential growth in confirmed cases of recent covid-19 outbreak in mainland china
[19] probabilistic forecasting and bayesian data assimilation
[20] infinite subharmonic bifurcation in an seir epidemic model

key: cord-290952-tbsccwgx
authors: ullah, saif; khan, muhammad altaf
title: modeling the impact of non-pharmaceutical interventions on the dynamics of novel coronavirus with optimal control analysis with a case study
date: 2020-07-03
journal: chaos solitons fractals
doi: 10.1016/j.chaos.2020.110075
sha:
doc_id: 290952
cord_uid: tbsccwgx

coronavirus disease (covid-19) is the biggest public health challenge the world is facing in recent days.
since there is no effective vaccine or treatment for this virus, the only way to mitigate this infection is the implementation of non-pharmaceutical interventions such as social-distancing, community lockdown, quarantine, hospitalization or self-isolation, and contact-tracing. in this paper, we develop a mathematical model to explore the transmission dynamics and possible control of the covid-19 pandemic in pakistan, one of the asian countries with a high burden of disease, with more than 100,000 confirmed infected cases so far. initially, a mathematical model without optimal control is formulated and some of the necessary basic analysis of the model, including stability results for the disease-free equilibrium, is presented. it is found that the model is stable around the disease-free equilibrium, both locally and globally, when the basic reproduction number is less than unity. beyond the basic analysis of the model, we further consider the confirmed infected covid-19 cases documented in pakistan from march 1 till may 28, 2020 and estimate the model parameters using least-squares fitting tools from statistics and probability theory. the results show that the model output is in good agreement with the reported covid-19 infected cases. the approximate value of the basic reproductive number based on the estimated parameters is r_0 ≈ 1.87. the effects of low (or mild), moderate, and comparatively strict control interventions like social-distancing, quarantine (or contact-tracing of suspected people), and hospitalization (or self-isolation) of covid-19 cases testing positive are shown graphically. it is observed that the most effective strategy to minimize the disease burden is to maintain strict social-distancing and contact-tracing to quarantine the exposed people.
furthermore, we carried out the global sensitivity analysis of the most crucial parameter known as the basic reproduction number using the latin hypercube sampling (lhs) and the partial rank correlation coefficient (prcc) techniques. the proposed model is then reformulated by adding the time-dependent control variables u(1)(t) for quarantine and u(2)(t) for the hospitalization interventions and present the necessary optimality conditions using the optimal control theory and pontryagin’s maximum principle. finally, the impact of constant and optimal control interventions on infected individuals is compared graphically. modeling the impact of non-pharmaceutical interventions on the dynamics of novel coronavirus with optimal control analysis with a case study coronavirus disease is the biggest public health challenge the world is facing in recent days. since there is no effective vaccine and treatment for this virus, therefore, the only way to mitigate this infection is the implementation of non-pharmaceutical interventions such as social-distancing, community lockdown, quarantine, hospitalization or self-isolation and contact-tracing. in this paper, we develop a mathematical model to explore the transmission dynamics and possible control of the covid-19 pandemic in pakistan, one of the asian countries with a high burden of disease with more than 100,000 confirmed infected cases so far. initially, a mathematical model without optimal control is formulated and some of the basic necessary analysis of the model, including stability results of the disease-free equilibrium is presented. it is found that the model is stable around the disease-free equilibrium both locally and globally when the basic reproduction number is less than unity. 
the novel coronavirus infectious disease, caused by the coronavirus commonly known as covid-19, has become one of the greatest challenges history has ever seen. starting from the wuhan city of china earlier this year, it spread to the rest of the world within a few months and was declared a pandemic by the who. it has paralyzed life across the globe. the main cause of the virus is yet to be discovered, but it is presumed that it emerged in one of the biggest animal markets in the chinese city of wuhan [1, 2, 3]. so far, it has engulfed more than 210 countries of the world and, according to who statistics, the virus has affected around 6 million people across the world and more than 300 thousand people have died so far [1, 2, 3]. the recovery rate is higher than the mortality rate.
however, the ratio varies from country to country and region to region. the usa, the current epicenter of the virus, is the most affected country, followed by europe [2]. scientists are struggling to discover or invent a vaccine for treatment, but it is yet to be found. the question of how long this will last is on everyone's lips. research is still at a very early stage, and things will unfold with the passage of time. scientists are trying to dig out the main symptoms and causes of transmission. according to the information available, the main symptoms are high fever, severe chest pain, continuous dry coughing, body aches, headache, and difficulty in the respiratory system. the virus spreads, according to available information, through droplets produced by an infected person during coughing and sneezing, and through physical contact [4]. the covid-19 pandemic has caused great damage not only to human lives and health; its effects are multidimensional. it has not only exposed the weak health infrastructure even in the most advanced countries of the world, but also badly affected the world economy. almost the entire world is on lockdown and all economic and business activities are halted. it has shocked the largest economies of the world, i.e. china and the usa, where an economic slowdown has been observed since the outbreak of the coronavirus. third-world countries, particularly pakistan, are prime targets of the economic devastation. millions of people have lost their jobs in the past few months. poor countries are unable to repay their debts. they are even unable to support poor citizens who cannot earn their livelihood. the social and political lives of nations are also badly affected. people across the world have severed social relations; they avoid meeting each other even in the gravest times. governments of all countries have diverted their attention and resources to cope with the challenge of this mysterious disease, and thus no political activity is visible.
like other countries in the world, the covid-19 pandemic poses a huge threat to both human health and the economy in pakistan. the infection is even more devastating there because the implementation of non-pharmaceutical interventions, i.e., social-distancing and community lockdowns, is certainly very tough for a society like pakistan's. the government is unable to afford a strict nationwide lockdown. the first covid-19 case was reported on 26 february 2020 in karachi, in a student who had returned from iran. within only three weeks, the infection spread to all four provinces, gilgit-baltistan, azad jammu and kashmir, and the federal territory of islamabad. the total number of confirmed infected cases rose to 2291 across the nation, and 252 new cases were reported on 1 april. due to the rapid growth of infected people across the country, the government of pakistan decided to put the whole nation under strict lockdown, which was later extended twice, until 9 may, due to the worsening situation. currently, the situation in pakistan is worsening, and the number of confirmed cases has crossed that reported in mainland china. pakistan is placed 17th in the list of countries, territories, or areas with highly reported infected cases and deaths [5, 6], with a total of more than 100,000 confirmed covid-19 cases. about 31,000 patients have fully recovered and 1900 people have lost their lives to this deadly infection [7, 8]. mathematical models are very useful in helping us to understand the transmission dynamics and control of emerging and re-emerging communicable diseases. one of the main challenges that mankind is facing nowadays is predicting the severity of the pandemic and suggesting suitable public health intervention strategies to curtail it. recently, a number of mathematical models have been proposed to explore the transmission patterns of the covid-19 pandemic.
in [9], the authors formulated a deterministic model to explore the impact of various public interventions on the dynamics and mitigation of covid-19 in ontario, canada. in [10], a mathematical model based on nonlinear differential equations is presented to study the dynamics of covid-19 infection in highly affected countries, namely china, italy, and france. a fractional-order covid-19 model with the atangana-baleanu-caputo operator is proposed by khan and atangana [11] and applied to analyze the infection in wuhan. the role of lockdown, in the absence of effective vaccines and treatment, in mitigating the covid-19 pandemic is analyzed in [12]; the author used novel fractal-fractional operators to formulate the proposed mathematical model. in [13], a transmission model is formulated to predict the cumulative covid-19 cases in italy, the uk, and the usa. the transmission dynamics of covid-19 in mexico are studied using mathematical and computational models in [14]. the influence of non-pharmaceutical controls, including quarantine, hospitalization or self-isolation, contact-tracing, and the use of face masks, on the dynamics of the covid-19 pandemic in the population of new york state and the entire usa is studied in detail in [15]. the present study proposes a new transmission model to analyze the dynamics and the impact of non-pharmaceutical interventions on covid-19 in pakistan. initially, we develop the model without optimal control variables, provide a good fit to the reported cases, and estimate the model parameters using a nonlinear least-squares curve fitting approach. further, the model is reformulated by adding two time-dependent control variables. the rest of the manuscript is organized as follows: the mathematical formulation of the covid-19 model is presented in section 2. basic mathematical analysis, including stability results for the disease-free equilibrium, is explored in section 3.
the model fitting to reported cases and the estimation of parameters are done in section 4. the simulation results of the model without optimal control are shown in section 5. the global sensitivity analysis and its graphical interpretation are depicted in section 6. the optimal control problem for the covid-19 infection and its analysis are presented in section 7. finally, brief concluding remarks are given in section 8. this section presents a brief description of the proposed model to study the dynamics and possible control of the covid-19 pandemic. the model is developed by dividing the whole human population at any time t, denoted by n(t), into eight mutually exclusive sub-groups depending on the disease status. these sub-groups are susceptible s(t), exposed or latent e(t), infected with disease symptoms (or symptomatically infectious) i(t), asymptomatically infectious with no clinical symptoms i a (t), quarantined q(t), hospitalized i h (t), critically infected (or intensive care) patients i c (t), and recovered/removed individuals r(t); infected individuals showing only mild symptoms of the disease are also placed in the epidemiological class i a (t). quarantine and isolation take place either at home or at the specific centers or hospitals designated by the government. further, the group i h stands for the patients admitted to hospital and also contains those with clinical symptoms of the disease who are self-isolating at home. we further assume that hospitalized people may also transmit the infection after interacting with susceptible people. the transmission dynamics of the covid-19 disease are expressed through the system of nonlinear differential equations (1), with the corresponding non-negative initial conditions. the birth rate is denoted by λ and the natural mortality rate in all groups is denoted by µ. susceptible people acquire covid-19 infection when they interact with the infected people in the i, i a and i h compartments.
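for concreteness, the compartmental flows of this section can be sketched as an ode right-hand side, using the rate symbols defined in the following paragraphs. this is a hedged sketch rather than the paper's exact system (1): the incidence is assumed frequency-dependent, the parameter values below are placeholders, and the quarantine-to-hospital rate, whose symbol did not survive extraction, is given the hypothetical name sigma.

```python
from scipy.integrate import solve_ivp

def covid_rhs(t, y, p):
    """sketch of the 8-compartment model; flows follow the verbal
    description in the text, 'sigma' (q -> i_h) is a hypothetical name."""
    S, E, I, Ia, Q, Ih, Ic, R = y
    N = S + E + I + Ia + Q + Ih + Ic + R
    lam = p['beta'] * (I + p['psi'] * Ia + p['nu'] * Ih) / N  # force of infection
    dS  = p['Lambda'] - lam * S - p['mu'] * S
    dE  = lam * S - (p['omega'] + p['kappa'] + p['mu']) * E
    dI  = p['rho'] * p['omega'] * E - (p['eta'] + p['zeta1'] + p['xi1'] + p['mu']) * I
    dIa = (1 - p['rho']) * p['omega'] * E - (p['zeta2'] + p['mu']) * Ia
    dQ  = p['kappa'] * E - (p['sigma'] + p['phi1'] + p['mu']) * Q
    dIh = p['eta'] * I + p['sigma'] * Q - (p['phi'] + p['phi2'] + p['xi2'] + p['mu']) * Ih
    dIc = p['phi'] * Ih - (p['phi3'] + p['xi3'] + p['mu']) * Ic
    dR  = (p['zeta1'] * I + p['zeta2'] * Ia + p['phi1'] * Q
           + p['phi2'] * Ih + p['phi3'] * Ic - p['mu'] * R)
    return [dS, dE, dI, dIa, dQ, dIh, dIc, dR]

# placeholder parameter values, chosen only so that the sketch runs
p = dict(Lambda=8939, mu=8939 / 220309834, beta=0.5, psi=0.5, nu=0.1,
         omega=0.2, rho=0.6, kappa=0.1, eta=0.1, phi=0.05,
         zeta1=0.1, zeta2=0.13, phi1=0.07, phi2=0.07, phi3=0.05,
         xi1=0.022, xi2=0.02, xi3=0.05, sigma=0.2)
y0 = [2.2e8, 100.0, 50.0, 50.0, 10.0, 5.0, 1.0, 0.0]
sol = solve_ivp(covid_rhs, (0, 150), y0, args=(p,), rtol=1e-8)
```

a quick consistency check on any such sketch is that the eight derivatives sum to λ − µn − ξ 1 i − ξ 2 i h − ξ 3 i c , since all internal transfers between compartments cancel.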
the force of infection is λ f = β(i + ψ i a + ν i h )/n, where the parameter β denotes the effective contact rate, i.e., contacts capable of leading to infection transmission, and the parameter 0 ≤ ψ ≤ 1 accounts for the assumed reduction in disease transmissibility of asymptomatic infected individuals in comparison to symptomatic ones. similarly, ν is the infectiousness rate due to hospitalized covid-19 patients. it is noticed from the transmission pattern of covid-19 that asymptomatic individuals are comparatively more dangerous than individuals in the i h class, because they are not aware of the infection yet are capable of transmitting it. latent individuals develop the infection after completion of the incubation period and become infected at the rate ω; a fraction, denoted by ρ, enters the symptomatic class after showing disease symptoms, and the remaining fraction, with no (or mild) symptoms, joins the asymptomatic compartment i a (t). exposed individuals who have had interaction with covid-19 infected patients are detected (via contact-tracing) and placed in quarantine at the rate κ; they move further to the hospitalized class if they test positive for covid-19 infection. symptomatically-infectious people are hospitalized at the rate η and move further to the critically-infected class i c at the rate φ if they are serious and need critical care. the parameters ζ 1 and ζ 2 represent the recovery rates in the i and i a groups respectively. further, the recovery rates of the quarantined, hospitalized, and critically infected classes are denoted by φ 1 , φ 2 and φ 3 respectively. finally, the covid-19 induced mortality rates for individuals in the i, i h and i c classes are denoted by ξ 1 , ξ 2 and ξ 3 respectively. for simplicity, the total exit rates of the e, i, i a , q, i h and i c compartments are denoted by k 1 , . . . , k 6 , so that the above model (1) can be written in the compact form (3). we now present some basic and necessary analytical results of the covid-19 model (3). lemma 1.
let p(0) ≥ 0 denote the initial data and p(t) = (s, e, i, i a , q, i h , i c , r) the model variables; then all solutions of the model (3) are non-negative for all t > 0. proof. consider the first equation of (3); it can be rewritten so that its solution is found as in (5). following a similar procedure, it can be shown that p(t) > 0 for all t > 0. in order to prove the second part, note that 0 < p(0) ≤ n(t); then, adding all equations of the covid-19 model (3), we obtain (6). the dynamics of the covid-19 mathematical model (3) will be studied in the following closed and biologically feasible region. lemma 2. the region defined by the closed set ∆ ⊂ r 8 + is positively invariant for the model (3) with non-negative initial conditions in r 8 + . proof. as in lemma 1, it follows from the summation of all equations of the covid-19 model (3) that the solution of (6) satisfies n(t) ≤ λ/µ, in particular whenever n(0) ≤ λ/µ. therefore, the region ∆ is positively invariant and attracts all possible solution trajectories in r 8 + . the proposed model (3) has a disease-free equilibrium (dfe), denoted by w 0 . next, we investigate the most important and crucial threshold quantity, known as the basic reproduction number and generally denoted by r 0 . this parameter measures the average number of new covid-19 infected cases generated by a typical infected individual when introduced into a completely susceptible population. the most common approach used to obtain r 0 is the next-generation method presented in [16]. the jacobian matrices f and v of the new-infection and transition terms of the model (3), evaluated at the dfe, yield r 0 via the definition r 0 = ρ(f v −1 ) (where ρ(·)
represents the spectral radius), from which we derived the expression (7) for the basic reproduction number. interpretation of r 0 : in order to interpret the basic reproductive quantity, we split the expression for r 0 into the sum of three constituent parts, r 0i , r 0a and r 0h . the first term r 0i in (7) shows the average number of new covid-19 infections generated by symptomatically-infectious individuals in the i class. this term is the product of the infection rate in the i class (the disease transmission rate) β, the fraction of exposed people that complete the incubation period and move to the symptomatic stage (ρω/k 1 ), and the average period spent in the i compartment (1/k 2 ). the second constituent reproduction number r 0a represents the number of new covid-19 infection cases generated by asymptomatically-infectious individuals in the class i a . it is the product of the infectiousness rate due to asymptomatic covid-19 individuals (βψ), the fraction of latent people that complete the incubation period and move to the asymptomatic stage ((1 − ρ)ω/k 1 ), and the average period spent in the asymptomatic class (1/k 3 ). similarly, the third constituent reproduction number r 0h expresses the new covid-19 infection cases generated by hospitalized/isolated individuals. in particular, the first term in r 0h represents the contribution to the hospitalized class by symptomatic infectious individuals (in the class i). it is the product of the infectiousness rate due to hospitalized individuals (βν), the fraction of latent individuals that complete the incubation period and move to the symptomatic stage (ρω/k 1 ), the portion of individuals that leave the symptomatic class i for the hospitalized class i h (η/k 2 ), and the average duration in the hospitalized class (1/k 5 ). finally, the second term in r 0h expresses the contribution of quarantined individuals to the hospitalized class. in this part, we prove the local and global stability of the model around the dfe.
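the next-generation computation just described can be checked numerically: build the new-infection matrix f and the transition matrix v on the infected compartments (e, i, i a , q, i h ) and take the spectral radius of f v −1 . the parameter values below are hypothetical placeholders, the quarantine-to-hospital rate is given the hypothetical name sigma, and the asymptomatic fraction is taken as (1 − ρ)ω, consistent with the compartment description.

```python
import numpy as np

# hypothetical parameter values, for illustration only
beta, psi, nu = 0.5, 0.5, 0.1
omega, rho, kappa, mu = 0.2, 0.6, 0.1, 4.06e-5
eta, zeta1, xi1 = 0.1, 0.1, 0.022
zeta2, sigma, phi1 = 0.13, 0.2, 0.07   # sigma: q -> i_h rate (hypothetical symbol)
phi, phi2, xi2 = 0.05, 0.07, 0.02

k1 = omega + kappa + mu       # total exit rate from e
k2 = eta + zeta1 + xi1 + mu   # total exit rate from i
k3 = zeta2 + mu               # total exit rate from i_a
k4 = sigma + phi1 + mu        # total exit rate from q
k5 = phi + phi2 + xi2 + mu    # total exit rate from i_h

# f: new infections (all enter e); v: transitions between infected classes
F = np.zeros((5, 5))
F[0, 1], F[0, 2], F[0, 4] = beta, beta * psi, beta * nu
V = np.array([[k1,                  0.0, 0.0,  0.0,    0.0],
              [-rho * omega,        k2,  0.0,  0.0,    0.0],
              [-(1 - rho) * omega,  0.0, k3,   0.0,    0.0],
              [-kappa,              0.0, 0.0,  k4,     0.0],
              [0.0,                -eta, 0.0, -sigma,  k5]])

R0_numeric = max(abs(np.linalg.eigvals(F @ np.linalg.inv(V))))

# closed form matching the three-part interpretation r0 = r0i + r0a + r0h
R0i = beta * rho * omega / (k1 * k2)
R0a = beta * psi * (1 - rho) * omega / (k1 * k3)
R0h = beta * nu * (eta * rho * omega / (k1 * k2) + sigma * kappa / (k1 * k4)) / k5
```

with these placeholder values the numeric spectral radius and the closed form agree to machine precision, which is a useful sanity check on the signs and positions of the matrix entries.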
the epidemiological implication of the stability of the dfe is that a small influx of covid-19 infection cases will not generate a covid-19 outbreak if r 0 < 1. theorem 1. the dfe of the model (3) is locally asymptotically stable if r 0 < 1 and unstable otherwise. proof. from the jacobian matrix j w 0 evaluated at the dfe w 0 , the eigenvalues −µ, −µ and −k 6 clearly have negative real parts. the remaining eigenvalues are obtained from the fifth-order characteristic polynomial (8), whose coefficients c j , for j = 1, . . . , 5, are all positive if r 0 < 1. further, it is easy to verify the remaining routh-hurwitz conditions for the polynomial (8). thus, the dfe is locally asymptotically stable if r 0 < 1. the global dynamics of the dfe w 0 of the covid-19 transmission model are studied in the following result. theorem 2. the system (3) at w 0 is globally asymptotically stable if r 0 < 1, and unstable for r 0 > 1. proof. in order to prove the required result, we consider a suitable lyapunov function z(t), where b i , for i = 1, 2, · · · , 5, are positive constants. differentiating z(t) with respect to t and using the solutions of system (3), we obtain after some simplification that dz(t)/dt < 0 whenever r 0 < 1. therefore, the largest compact invariant set in ∆ is the singleton set {w 0 }, and by lasalle's invariance principle [17], w 0 is globally asymptotically stable in ∆. the present section investigates the fitting of model (3) to the confirmed reported covid-19 infected cases in pakistan. the disease situation in pakistan is becoming worse day by day, and currently the cumulative reported cases are higher than in china. in this study we consider the covid-19 confirmed cases reported in pakistan from march 01, 2020 till may 28, 2020. the data are obtained from [7, 8].
in order to parameterize the model, we utilized two approaches: some demographic parameters are estimated from the literature, while the remaining parameters are fitted to the data. we assume the time unit is days, and the estimation procedure for the parameters is as follows: • the birth rate λ: the total population of pakistan estimated by the un for the year 2020 is about n(0) = 220,309,834 [18]; the parameter λ is therefore obtained from λ/µ = n(0), assuming this is the limiting population in the absence of disease, so that λ = 8939 per day. • the mortality rate due to coronavirus ξ 1 : the death rate due to this novel infection in pakistan is 2.2% so far [8]; therefore the covid-19 induced mortality rate is estimated as ξ 1 = 0.022. the remaining biological parameters are fitted to the reported infected cases plotted in figure 1. to do this we used the non-linear least-squares curve fitting technique followed in [19, 20]. we briefly present the main steps of this statistical technique. firstly, the model (3) can be comprehensively expressed in the form (10), where the function f depends on the time t, the vector of dependent or state variables y, and the vector of unknown parameters θ to be estimated. the purpose of the least-squares technique is to estimate the best values of the model parameters, obtained by minimizing the error between the reported data points ỹ t l and the model solution y t l associated with the parameters θ. the objective function used in the minimization procedure is the sum of squared residuals, θ̂ = arg min θ Σ l (y t l − ỹ t l ) 2 , subject to eq. (10), where n denotes the number of available data points. for more detail about this technique, please see [21, 19] and the references therein. we investigate the proposed model fit to the reported covid-19 infected cases in pakistan using the above approach. the reported cases are shown in figure 1.
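the fitting step can be sketched with scipy's least-squares routine. to keep the example short and self-contained, a one-equation logistic curve stands in for the full model solution y t l (θ), and the "reported cases" are synthetic, so the whole setup is illustrative rather than the paper's actual fit.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

# synthetic "reported cumulative cases": a logistic curve plus 2% noise
t_data = np.arange(60.0)
rng = np.random.default_rng(0)
truth = 1e5 / (1 + 99 * np.exp(-0.15 * t_data))
data = truth * (1 + 0.02 * rng.standard_normal(t_data.size))

def model_cases(theta, t):
    """solve the stand-in ode y' = r y (1 - y/K) and return y at the data times"""
    r, K = theta
    sol = solve_ivp(lambda t_, y: r * y * (1 - y / K), (t[0], t[-1]),
                    [1e3], t_eval=t, rtol=1e-8)
    return sol.y[0]

def residuals(theta):
    # the quantity whose sum of squares is minimized: y(t_l; theta) - data_l
    return model_cases(theta, t_data) - data

fit = least_squares(residuals, x0=[0.1, 5e4], bounds=([1e-3, 1e3], [1.0, 1e7]))
r_hat, K_hat = fit.x
```

matlab's lsqcurvefit plays the same role as least_squares here; for the real model the residual would be built from the solver output of system (3) evaluated at the reporting dates.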
the model is solved using the ode45 solver for initial value problems in matlab, which is based on an explicit runge-kutta (4,5) formula. then, we implemented the lsqcurvefit package to fit the model to the real data and to estimate the parameters. the best fit of our model to the reported data is depicted in figure 2; it can be seen that the model simulation is in good agreement with the real data. the estimated and fitted parameters are given in table 1. this section is devoted to the simulation results of the covid-19 transmission model (3). the model is solved numerically with the ode45 solver in matlab, and the estimated parameter values given in table 1 are utilized in the simulation process in order to study the impact of various possible non-pharmaceutical interventions against the spread of covid-19 in pakistan. in the graphical results, the various non-pharmaceutical control parameters are taken at their baseline values given in table 1 (unless otherwise stated in the captions). the effect of the parameter β, representing the impact of effective contacts, is shown in figure 3. we have analyzed the impact of baseline social-distancing, mild social-distancing (10% reduction in β), moderate social-distancing (30% reduction in β), and comparatively strict social-distancing (35% reduction in β) on the disease transmission. it is observed from figure 3 that the enhancement of social-distancing significantly reduces the burden of cumulative new infected cases. it is further observed that the implementation of a highly-effective social-distancing strategy (i.e., at least a 35% reduction in β) dramatically reduces the cumulative new infected cases. thus, this graphical interpretation suggests that strict social-distancing measures that reduce contacts between people, including staying 2 meters apart or, more preferably, staying at home, should be implemented by the government.
the impact of the parameter ψ, representing the infectiousness rate due to asymptomatically-infected individuals, is depicted in figure 4. it can be seen that a reduction in ψ also significantly reduces the cumulative number of newly confirmed covid-19 infected cases. this interpretation shows that people who do not even know that they are infected (i.e., those with mild or no symptoms) contribute significantly to the disease burden. further, the influence of the parameter ν, the infectiousness rate due to covid-19 patients admitted to hospital, is depicted in figure 5. it is observed that a reduction in this parameter has no appreciable impact on disease transmission. obviously, hospitalized covid-19 patients are isolated and no one is allowed to meet them; health care facilities are also supposed to follow strict standard operating procedures (sops) during the treatment and care of hospitalized patients. therefore, these infected individuals do not contribute greatly to the disease burden. we further simulate the covid-19 model (3) using the baseline parameters tabulated in table 1 and various values of κ, increased to different levels, i.e., mild, moderate, and strict rates. the resulting behavior is depicted in figure 6, showing that a strict quarantine or contact-tracing policy (up to a 70% enhancement of its baseline) is needed to reduce the disease burden in pakistan. finally, the impact of hospitalization or self-isolation of tested positive cases (η) is plotted in figure 7. it is observed that this strategy is comparatively less effective than the social-distancing and quarantine interventions. these graphical interpretations emphasize that once a covid-19 infected case is diagnosed via testing, that case must be rapidly isolated and his/her contacts quickly traced (via effective contact-tracing) and placed in quarantine.
global sensitivity analysis is one of the important aspects of mathematical modeling, not only for epidemic models but in all sciences. the global sensitivity analysis of the threshold quantity r 0 is used to measure the effect of changes in the dominant factors of the model and to point out the most influential parameters, those that most strongly influence the prevalence of infection. furthermore, the sensitivity results provide a pathway to set effective and suitable control strategies to curtail the disease in a community. more specifically, this analysis is helpful to explore how the initial inputs to the model contribute to the system outputs. a latin hypercube sampling (lhs) approach coupled with the partial rank correlation coefficient (prcc) is commonly used for this purpose [22]. this technique provides the prcc and the corresponding p-value for each parameter, which can help estimate the level of uncertainty in an epidemic model. a higher prcc and a smaller p-value indicate that a parameter has a substantial effect on the simulation behavior. the graphical prcc results for the covid-19 model parameters considered in this analysis are shown in figure 8, while the numerical values of the prcc and the corresponding p-values are placed in table 2. it is observed from table 2 and figure 8 that β is the most sensitive parameter, with a high prcc value of positive sign, followed by ψ, ν, and ω. moreover, µ, κ, ξ 1 and η have comparatively high prcc values with negative sign and zero p-values. previously, we analyzed the impact of non-pharmaceutical interventions with constant rates. in this section we formulate an optimal control problem for covid-19 with the inclusion of two time-dependent controls in the model (3); the resulting control problem is presented in (13). these controls are chosen on the basis of the global sensitivity results.
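the lhs/prcc machinery just described can be sketched as follows: draw a latin hypercube sample over the parameter ranges, evaluate an output (here a simplified closed-form r 0 restricted to its first two constituents), rank-transform, regress each ranked input out of the others, and correlate the residuals. the ranges and fixed rates below are illustrative placeholders, not the paper's table.

```python
import numpy as np
from scipy.stats import qmc, rankdata

# latin hypercube sample over hypothetical ranges for five inputs
names = ['beta', 'psi', 'omega', 'kappa', 'eta']
lo = np.array([0.2, 0.1, 0.1, 0.01, 0.05])
hi = np.array([0.8, 0.9, 0.5, 0.50, 0.50])
X = qmc.scale(qmc.LatinHypercube(d=5, seed=1).random(500), lo, hi)

def r0_output(row):
    """simplified r0 = r0i + r0a with the remaining rates held fixed"""
    beta, psi, omega, kappa, eta = row
    mu, zeta1, xi1, zeta2, rho = 4.06e-5, 0.1, 0.022, 0.13, 0.6
    k1, k2, k3 = omega + kappa + mu, eta + zeta1 + xi1 + mu, zeta2 + mu
    return beta * (rho * omega / (k1 * k2) + psi * (1 - rho) * omega / (k1 * k3))

y = np.apply_along_axis(r0_output, 1, X)

def prcc(X, y):
    """partial rank correlation: correlate the residuals of each ranked
    input and the ranked output after regressing out the other inputs"""
    Xr = np.apply_along_axis(rankdata, 0, X)
    yr = rankdata(y)
    out = []
    for j in range(X.shape[1]):
        Z = np.column_stack([np.ones(len(yr)), np.delete(Xr, j, axis=1)])
        rx = Xr[:, j] - Z @ np.linalg.lstsq(Z, Xr[:, j], rcond=None)[0]
        ry = yr - Z @ np.linalg.lstsq(Z, yr, rcond=None)[0]
        out.append(float(np.corrcoef(rx, ry)[0, 1]))
    return dict(zip(names, out))

coeffs = prcc(X, y)
```

in this toy setup β and ψ come out with positive prcc values while κ and η come out negative, mirroring the sign pattern reported for these parameters in table 2.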
the control variable u 1 (t) is used for the enhancement of the effective contact-tracing policy to quarantine exposed individuals, which was previously modeled by a constant parameter. the time-dependent control variable u 2 (t) is used to enhance the hospitalization or self-isolation of diagnosed covid-19 infected cases (following testing). the resulting control model, after incorporating the aforementioned control variables, is formulated via the system (13), subject to non-negative initial conditions. in order to minimize the covid-19 infection, we aim to minimize a cost functional j(u 1 , u 2 ), where the constants a i , for i = 1, 2, · · · , 5, represent the balancing cost factors and t f represents the final time. we consider a quadratic objective functional because the intervention is nonlinear; for more details see the works and references therein [23, 24, 20, 25]. our main objective is to find optimal controls u * 1 , u * 2 for quarantine and hospitalization respectively, within the associated control set ω. the lagrangian and hamiltonian for the above optimal control system are defined accordingly, with the hamiltonian given in (16), where λ j , for j = 1, 2, · · · , 8, are the adjoint variables. we use pontryagin's maximum principle [26] in order to solve the covid-19 optimal control problem (13). to do this, let u * 1 , u * 2 be the desired optimal solution; then the corresponding conditions of pontryagin's maximum principle used in the solution process are as in (17). utilizing these conditions, we present the solution of the optimality system in the following theorem. theorem 3. consider the optimal controls u * 1 , u * 2 and the solutions s * , e * , i * , i * a , q * , i * h , i * c and r * of the corresponding control system (13) that minimize the objective functional j(u 1 , u 2 ) over ω.
there exist adjoint variables λ i , i = 1, 2, · · · , 8, together with the transversality conditions λ i (t f ) = 0, such that the adjoint system (18) holds; furthermore, the associated optimal controls u * 1 and u * 2 are given by (19). proof. the desired results (18) and the transversality conditions are obtained by utilizing the conditions specified in (17) for the hamiltonian function given in (16), with the settings s = s * , e = e * , i = i * , i a = i * a , q = q * , i h = i * h , i c = i * c and r = r * . further, the control characterization (19) follows from the condition ∂h/∂u j = 0 given in (17) for j = 1, 2. we now present and discuss the graphical results of the covid-19 model (3) with constant quarantine and hospitalization/isolation control measures and of the model (13) with time-dependent control interventions, and compare the two. to perform the simulations, both models are solved numerically using the rk4 technique. the estimated and fitted parameters given in table 1 are used in the simulation results. the time horizon is taken as 150 units (days). the weight and balancing constants are chosen as a 1 = 0.001, a 2 = 0.01, a 3 = 0.001, a 4 = 300, and a 5 = 200. it should be noted that the weight constant values taken in the simulations are theoretical, chosen only to carry out the control strategies developed in this study. in figure 9 (a-d), we depict the impact of the hospitalization control only, keeping the quarantine or case-tracing control inactive (i.e., u 1 = 0 and u 2 ≠ 0). the control profile for this strategy is shown in figure 9 (e). it is observed that although the control u 2 is kept at 100% for the first 70 days, it still has no significant impact on the different infected populations, as seen in figure 9 (a-d). thus, the hospitalization or self-isolation intervention alone is not enough to control the covid-19 pandemic in pakistan.
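the forward-backward sweep that typically implements pontryagin's conditions in simulations like the ones above can be illustrated on a one-state toy problem standing in for the full system (13): minimize ∫ (x + a u²) dt subject to x' = (b − u)x with 0 ≤ u ≤ 1. the dynamics, weights, and horizon below are placeholders, but the loop structure (forward state solve, backward adjoint solve with λ(t f ) = 0, control update from ∂h/∂u = 0, relaxation) is the standard scheme.

```python
import numpy as np

# toy optimal-control problem standing in for (13):
# minimize  j = ∫ (x + A u^2) dt   s.t.  x' = (b - u) x,  0 <= u <= 1
b, A, T, n = 0.3, 2.0, 20.0, 4001
t = np.linspace(0.0, T, n)
h = t[1] - t[0]
x0 = 1.0

def forward(u):
    # forward euler for the state equation (the text uses rk4; euler keeps the sketch short)
    x = np.empty(n)
    x[0] = x0
    for i in range(n - 1):
        x[i + 1] = x[i] + h * (b - u[i]) * x[i]
    return x

def cost(u):
    integrand = forward(u) + A * u**2
    return h * (integrand[0] / 2 + integrand[1:-1].sum() + integrand[-1] / 2)

u = np.zeros(n)
for _ in range(60):                       # forward-backward sweep
    x = forward(u)
    lam = np.empty(n)
    lam[-1] = 0.0                         # transversality condition lambda(T) = 0
    for i in range(n - 1, 0, -1):         # backward step for lambda' = -1 - lambda (b - u)
        lam[i - 1] = lam[i] + h * (1.0 + lam[i] * (b - u[i]))
    u_new = np.clip(lam * x / (2 * A), 0.0, 1.0)  # from dH/du = 2 A u - lambda x = 0
    u = 0.5 * u + 0.5 * u_new             # relaxed update for stability
```

for the full eight-state model the same loop runs with the vector state, the eight adjoint equations of theorem 3, and the two bounded-control projections in (19).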
the impact of the quarantine optimal control only, keeping the hospitalization control at zero (i.e., u 1 ≠ 0 and u 2 = 0), is shown in figure 10 (a-d); the corresponding control profile is depicted in figure 10 (e). it can be seen that maintaining a strict quarantine control is very effective in minimizing the spread of the covid-19 infection. finally, the impact of both optimal controls on the dynamics of the covid-19 burden is analyzed. the graphical results are depicted in figure 10 (a-d), while the control profile is shown in figure 10 (e). it can be observed from figure 10 (a-d) that the numbers of exposed, symptomatically-infected, asymptomatically-infected, and critically-infected individuals decrease very significantly when the optimal quarantine and hospitalization controls are applied, compared with the constant-control case. the effectiveness can be seen from the difference between the peaks of the two graphs. from the control profile depicted in figure 10 (e), it can be seen that the control u 1 is kept initially at 100% for 140 days and gradually reduced towards the end of the intervention, while the control u 2 is set to 55%, immediately increases to 90% in the initial days, and is then gradually reduced during the rest of the intervention. the covid-19 pandemic has rapidly spread to most regions of the world and has imposed a severe public health and socio-economic burden on developed and developing countries, including pakistan. the number of reported cases in pakistan is increasing, and more than 100,000 confirmed cases had been reported by 05 june 2020. in the absence of a safe and effective vaccine or antiviral, the whole human community is focused on the use of non-pharmaceutical interventions against the covid-19 pandemic. in this study, we formulated a mathematical model in order to study the dynamics of the covid-19 pandemic in pakistan, and used it to assess the community-wide impact of various control and mitigation strategies.
initially, we developed the model and presented some mathematical analysis, including positivity and stability results for the disease-free equilibrium. it is proven that the disease-free equilibrium is stable both locally and globally when r 0 < 1. the model is parameterized from the covid-19 confirmed cases reported in pakistan till may 28, 2020, while some parameters are estimated from the literature. the findings show that the model-predicted infected curve is in good agreement with the real infected cases. the estimated numerical value of the basic reproduction number is r 0 ≈ 1.87, showing the alarming situation of the pandemic in pakistan. control and mitigation strategies should be implemented to bring the threshold quantity r 0 to a value less than unity. after the estimation of the model parameters, we simulated the model to explore the effectiveness of various control strategies implemented in pakistan. firstly, we presented the impact of three effectiveness levels (i.e., low or mild, moderate, and strict) of social-distancing in curtailing the burden of covid-19. the simulation results revealed that although the implementation of mild social-distancing decreased the covid-19 burden significantly (as measured in terms of shifting and lowering the peak of daily infected cases), strict social-distancing measures should still be implemented and maintained for an extended period of time to avoid a significant outbreak in pakistan. further, we simulated the model to assess the effect of various levels of quarantine and hospitalization or self-isolation interventions. with a highly-effective quarantine intervention (enhanced by 70% over its baseline value), a dramatic reduction in the pandemic peak was observed. on the other hand, increasing the hospitalization intervention for confirmed cases had no significant influence on the pandemic burden.
finally, the proposed covid-19 model is reformulated by the inclusion of two time-dependent control variables in order to assess the impact of optimal control measures on the disease dynamics. we simulated the control model and compared the effect of constant and optimal time-dependent control measures on the disease burden. the simulation results of the control model show that the number of infected individuals decreases significantly when both time-dependent control measures are implemented. although it is important that pakistan ramp up daily diagnostic covid-19 testing and contact tracing, in order to obtain a realistic measure of the nationwide pandemic burden, and emphasize personal hygiene, hand washing, physical-distancing and the wearing of face masks in public, this study suggests that the basic non-pharmaceutical interventions, particularly social-distancing and quarantine (or self-isolation, i.e., staying at home), should be strictly observed in the future to avoid a worst-case scenario in pakistan. it is also believed that the present study will be beneficial to decision-making in combating the disease. in the near future, we will extend the present model by introducing fractional operators with local and nonlocal kernels to gain more insight into the dynamics of the covid-19 pandemic.
no conflict of interest exists regarding the publication of this paper.
key: cord-296560-ehrww6uu authors: bender, andreas; jenkins, jeremy l.; li, qingliang; adams, sam e.; cannon, edward o.; glen, robert c. title: chapter 9 molecular similarity: advances in methods, applications and validations in virtual screening and qsar date: 2006-11-07 journal: annu rep comput chem doi: 10.1016/s1574-1400(06)02009-3 sha: doc_id: 296560 cord_uid: ehrww6uu

this chapter discusses recent developments in some of the areas that exploit the molecular similarity principle: novel descriptors that capture molecular properties, a crucial aspect of computational models (their validity), and additional ways to examine available data, such as those from high-throughput screening (hts) campaigns, in order to gain more knowledge from them. the chapter also presents some recent applications of the methods discussed, focusing on the successes of virtual screening applications, database clustering and comparisons (such as drug- and in-house-likeness), and the recent large-scale validations of docking and scoring programs. while a great number of descriptors and modeling methods have been proposed to date, the recent trend toward proper model validation is very much appreciated. although some of their limitations surely stem from underlying principles and the limitations of fundamental concepts, others will certainly be eliminated in the future.

molecular similarity [1] [2] [3] [4] follows, in principle, a simple idea: molecules which are similar to each other exhibit similar properties more often than dissimilar pairs of molecules do. this is often written as the relationship property = f(structure), which leaves open two major questions: 1. how should molecular structure be represented (the connectivity table or the coordinates of atoms are not per se suitable choices)? 2. what is the functional form between structure (or rather the structural representation) and the property under consideration, so that we can derive an empirical measure of similarity?
in order to explicitly include both of the challenges mentioned, one can reformulate this to give m(property) = f(g(structure)), where m is the measurement outcome of a molecular property concept (such as log p as a surrogate measure of 'lipophilicity'), g represents the transformation of a molecular structure into a 'descriptor' which is amenable to statistical analysis or machine-learning treatment, and f connects the experimental measurement and the structural representation. both steps are generally independent of each other, although some combinations of molecular representation and model generation technique are more sensible than others. the problem in establishing a suitable function g, which translates a molecular structure into a descriptor representation, is that it is usually not known a priori which molecular features contribute to a certain property. for example, some functional groups in ligand-receptor binding will establish ligand-receptor interactions, while others simply point into the bulk solvent. often a large number of descriptors need to be calculated in order to (hopefully) capture the relevant factors for a certain molecular property, since often no direct experimental observation is available. the problem in establishing a function f, which correlates the descriptor representation and the property, is that its functional form is also usually not known. again, no underlying theory exists, and its character can vary between two extremes. linear regression, for example, represents a simple functional form between input and output variables with the advantage of a very small number of free parameters; following occam's razor, it should be applied in cases where there is a sound physical reason to believe in an underlying linear relationship between input and output variables. at the other extreme, neural networks are able to model any (including nonlinear) relationships between input and output variables.
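as a toy instance of m(property) = f(g(structure)), the following sketch uses a deliberately crude g (element counts from a smiles string) and a made-up linear f; both are illustrative assumptions, not descriptors or models from the literature:

```python
# Toy decomposition M(property) = f(g(structure)): g maps a structure
# to a descriptor vector, f maps that vector to a property estimate.
from collections import Counter

def g(smiles):
    """Crude descriptor: counts of the symbols C, N, O in a SMILES string."""
    counts = Counter(ch for ch in smiles.upper() if ch in "CNO")
    return [counts["C"], counts["N"], counts["O"]]

def f(descriptor, weights=(0.5, -0.9, -1.2), bias=0.2):
    """Toy linear model on the descriptor (weights are made up)."""
    return bias + sum(w * x for w, x in zip(weights, descriptor))

# ethanol-like string "CCO": 2 carbons, 1 oxygen
print(g("CCO"))
print(f(g("CCO")))
```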
however, they depend on a large number of variables, which may lead to spurious correlations. often the choice of a functional form, in the absence of physical laws, is governed simply by trial and error. the problems in establishing the optimal choice of f and g are increased by the fact that the relationship between structure and measured property (the only relationship available from experimental data!) is rarely given over a large region of chemical space. data are sparse: estimates of the size of the chemical space for typical drug molecules [5] (up to 30 heavy atoms) are in the region of 10^60 compounds, while experimental datasets on a property of interest are rarely available for more than 10^6 compounds and are often considerably smaller. a solution to the problem of identifying the 'best' molecular descriptor will never be fully established, for both practical reasons (the limited size of datasets) and theoretical reasons. a wide variety of different features are important for each property, and the functional forms between descriptor representation and property can usually not be established from physical laws (and thus cannot be optimized analytically). still, we can establish empirical measures of molecular similarity that predict some particular properties better than others, tested on some of the more or less restricted datasets available. this review deals with both novel molecular representations, the function g from above, as well as novel model generation and machine-learning methods, the function f from above. as soon as a relationship between a molecular representation and a particular property's values is established, a crucial question arises: how good are predictions for novel molecules? ideally, all of chemical space would be covered with zero error. limits in descriptor generation as well as in the experimentally available data clearly prevent us from reaching this goal.
still, in order to establish confidence in models in practical settings, this requirement can be replaced by the question: which area of chemical space is covered with acceptable error? different methods (best known among them approaches such as cross-validation) attempt to provide empirical answers to this question. intuitively one might guess that, in determining which region is covered by a given model, the distance of the compounds in the training set to the novel compounds whose properties are to be predicted is relevant. this is indeed the case, as has been established in recent articles (see section 4). the question of how good predictions for novel compounds are is often addressed by cross-validation, where portions of the available datasets are, in turn, taken as an external test set, while the remainder of the dataset is used for training purposes. the test set thus attempts to simulate a novel set of molecules, unknown during the training phase of the model, and root-mean-square errors (rmse) or cross-validated correlation coefficients (q^2) on the test set are often reported as a measure of the generalizability of models. recently, it has emerged that cross-validation actually shows merely that a model is internally consistent, but not necessarily that it is predictive for new compounds. the question of how the reliability of models can be assured is also discussed in section 4, and indeed several recent publications propose approaches to determine the 'domain' of models (the area in which they are applicable; see section 4 for details). conventionally, enrichment over random selection is often cited, giving an estimate of how many more active compounds are retrieved from a database than by pure chance. while this measure is correct in the way it is calculated, more recently the performance of 'sophisticated' fingerprints has been compared to trivial features, namely counts of atoms by element, without any structural information [6].
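the enrichment over random selection mentioned above is commonly computed as the hit rate in a top-ranked fraction divided by the overall (random) hit rate; a minimal sketch on synthetic data:

```python
# Enrichment factor for a retrospective virtual screening run:
# hit rate among the top-ranked fraction / hit rate of random selection.
def enrichment_factor(ranked_labels, fraction=0.1):
    """ranked_labels: 1 = active, 0 = inactive, best-scored first."""
    n = len(ranked_labels)
    n_top = max(1, int(n * fraction))
    hits_top = sum(ranked_labels[:n_top])
    hits_all = sum(ranked_labels)
    return (hits_top / n_top) / (hits_all / n)

# 100 compounds, 10 actives; this ranking puts 5 actives in the top 10
labels = [1] * 5 + [0] * 5 + [1] * 5 + [0] * 85
print(enrichment_factor(labels, 0.1))  # → 5.0 (five-fold over random)
```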
the performance ratio of 'state-of-the-art' methods (i.e., circular fingerprints and unity fingerprints) to those 'dumb' descriptors can then be interpreted as the 'added value' of the more sophisticated methods. soberingly, on many datasets of actives 'real' fingerprints do not perform significantly better than atom counts (see fig. 1). this also relates to the suitability of the current databases employed for retrospective virtual screening runs, which are often derived from the mddr [7, 8]. while, on the one hand, multiple activity classes are present, those datasets still possess two major disadvantages. first, no information about the definite inactivity of compounds is contained in the database; thus, if experimental data for retrieved hits were subsequently obtained, many of the 'false-positive' predictions might well turn out to be active. second, given bioisosteric considerations in combination with 'fast follower' approaches to synthesis, it should be noted that this database contains a large number of close analogues. the hit rates obtained on this dataset may thus be overly optimistic compared to the real-world libraries employed for virtual screening. still, the two databases referenced above, which are both subsets of the mddr, were very important as they enabled the comparison of similarity searching approaches on multiple, identical datasets. we would also like to emphasize that more suitable datasets are too often, unfortunately, unavailable from the pharmaceutical and biotechnology companies. in the following sections, we will also cover other recent developments in some of the areas which exploit the 'molecular similarity principle'. section 3 will present novel approaches to capture molecular properties by the use of novel 'descriptors'. since molecular descriptors and the methods used to analyze the data they represent cannot be separated easily, the second part of this section also covers novel data analysis methods.
section 4 focuses on a crucial aspect of computational models: their validity. in the previous few years, about two dozen publications focused on 'model validation' have appeared, an area which shall be summarized in this review. finally, sections 5 and 6 turn to the application of the methods described earlier. in section 5, we discuss additional ways to examine available data, such as those from high-throughput screening (hts) campaigns, and to gain more knowledge from this data. section 6 describes some of the recent applications of the methods described in the preceding sections, focusing on successes of virtual screening applications, database clustering and comparisons (such as drug- and in-house-likeness) and recent large-scale validations of docking and scoring programs. we will now describe some of the recent developments in the calculation of molecular descriptors. pot-dmc [9] (short for potency-scaled dynamic mapping of consensus positions) takes into consideration not only the (binary) activity of a compound for virtual screening applications, but also the quantitative activity of a structure. accordingly, each bit of the descriptor vector (which consists of a combination of one-, two- and three-dimensional (1d, 2d and 3d) features) is weighted according to the ic50 value of the compound. the scaled bits are summed and normalized at each position. afterward, the descriptor can be used for virtual screening. when applied to a database of ccr5 chemokine receptor antagonists, serotonin receptor agonists and gonadotropin-releasing hormone agonists, the method overall did not retrieve a larger number of structures, but those which were retrieved were, as intended, of higher activity than in cases where no scaling according to activity was applied. the fepops [10] (feature points of pharmacophores) descriptor aims to exploit a (relative) advantage of 3d descriptors, the ability to discover novel scaffolds against a given target, based on active sample structures.
after the generation of tautomers and conformers, k-means clustering of the atomic coordinates is performed; thus, no knowledge about the active conformation of a structure is necessary. interaction types are assigned to characteristic 'feature points' in a subsequent step, and these are again subject to k-medoids clustering to reduce redundant conformer coverage. the cluster representatives can then be used for similarity searching. validations are presented using both mddr (cox-2, hiv-rt and 5ht3a inhibitors and ligands, respectively) and in-house datasets. in addition, it was shown that inhibitors can be identified from a database based simply on endogenous ligands (for dopamine and retinoic acid). a completely different path is followed by the lingo [11] approach, which is based on a textual representation of molecules. starting from the smiles string of a structure, and without time-consuming conversion and descriptor generation, a molecule is represented by a set of overlapping 'lingos', each of which represents a substring of the complete smiles string. while being a straightforward concept (in the best possible sense), favorable performance is presented on log p and solubility datasets, where cross-validated rms errors are 0.61 and 0.89 log units, respectively. the descriptor also shows applicability to bioactivity, where significant discrimination between bioisosteres and random functional groups can be observed. reduced graph descriptors have been the subject of interest for a considerable time, and further successful work has recently been performed in this area. earlier comparison algorithms for reduced graphs represent the graph as a binary fingerprint, sometimes leading to molecules that the algorithm perceives as similar but which are not similar to the eyes of most chemists.
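the lingo representation described above can be sketched in a few lines: overlapping q-character substrings of the smiles string, compared here with a tanimoto-like coefficient on substring counts (the substring length of 4 and this particular similarity form are assumptions for illustration, not necessarily the published formulation):

```python
# LINGO-style descriptor: the multiset of overlapping q-character
# substrings of a SMILES string, compared with a count-based
# Tanimoto-like similarity.
from collections import Counter

def lingos(smiles, q=4):
    return Counter(smiles[i:i + q] for i in range(len(smiles) - q + 1))

def lingo_similarity(a, b, q=4):
    la, lb = lingos(a, q), lingos(b, q)
    keys = set(la) | set(lb)
    num = sum(min(la[k], lb[k]) for k in keys)
    den = sum(max(la[k], lb[k]) for k in keys)
    return num / den if den else 0.0

print(lingo_similarity("CCOC(=O)C", "CCOC(=O)CC"))  # close analogues
```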
this problem was recently addressed [12] by applying 'edit distance' measures to the similarity of compounds: the number of operations needed to transform one reduced graph structure of a molecule into another. by emphasizing not only the fragments present in the reduced graphs, but also the way in which they are connected, better agreement with the human perception of 'molecular similarity' could be achieved. molecular binding can be thought of as being mediated by complementary shapes and matching properties, where, due to solvation and other effects, 'matching' does not only mean complementarity. accordingly, a 'shape fingerprint' method has recently been presented [13] which implements shape similarity measures akin to volume overlap methods but which, due to the employment of database-derived reference shapes, is several orders of magnitude faster. (note of course that shape also plays an important role in other areas of science [14].) employing gaussian descriptions of molecular shape, about 500 shape comparisons can be performed per second, and the resulting shape similarity was shown to be useful in virtual screening applications. only some parts of a ligand bound to its target will actually interact with the target; other parts will just be pointing into the bulk solvent. by analyzing the variability of ligands' regions, the features which correspond to each of the regions can be inferred: molecular features which are involved in ligand-target interactions will be more highly conserved than those which point into the solvent, due to the stricter requirements imposed on them. the 'weighted probe interaction energy' (wep) method [15] exploits exactly this principle, and can be used to derive ligand-based receptor models. it was applied to the steroid dataset (which is well known from comfa studies), a set of dihydrofolate reductase (dhfr) inhibitors, as well as hydrophobic chlorinated dibenzofurans.
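the reduced-graph edit distance discussed at the start of this passage can be sketched on linear sequences of node labels; this is a simplification, since real reduced graphs are graphs rather than strings:

```python
# Classic Levenshtein edit distance: the minimum number of insertions,
# deletions and substitutions needed to turn one label sequence into
# another, computed with a rolling dynamic-programming row.
def edit_distance(a, b):
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]

# hypothetical node labels: R = ring, L = linker, D = donor, A = acceptor
print(edit_distance(["R", "L", "D"], ["R", "L", "A"]))  # → 1
```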
in particular, the dhfr model was able to elucidate interactions relevant to binding which very closely resemble those of the target-derived model complex. previously applied to the calculation of inter-substituent similarities, which might be exploited for the identification of bioisosteric groups [16], the r-group descriptor (rgd) was more recently also the subject of qsar investigations [17]. the rgd describes the distribution of atomic properties at a distance of n bonds (n = 1, 2, 3, ...) away from a core that is common to a series of compounds. in combination with partial least squares, the descriptor was applied to several datasets for qsar studies, comprising benzodiazepin-2-ones active at gaba a , triazines exhibiting anticoccidal activity and a set of tropanes active at the serotonin, dopamine and norepinephrine transporters. rgds in combination with pls showed overall comparable performance to hqsar and eva models in a cross-validation study, in some cases outperforming the other qsar approaches. another alignment-free method for the time-efficient generation of qsar models is fingal [18] (a short and straightforward acronym for 'fingerprint algorithm'). unlike rgds, a hashed fingerprint is generated which encodes structural features of the molecule, where distances may be measured either topologically or by employing spatial information between atoms. applied to d2 ligands, the 2d version of fingal, in particular, was able to outperform comfa- and comsia-based approaches. for estrogen ligands, performance was highly dependent on the structural class of compounds, not only for fingal but also for models based on comfa, hqsar, fred/skeys (fast random elimination of descriptors/substructure keys) and dragon descriptors. in some subsets, such as a pesticide subset, no model was obtained via comfa (correlation coefficient of zero), whereas fingal gave correlation coefficients as high as 0.85 in a cross-validation study.
the grid force field [19] has been the basis of a number of descriptors developed recently, among the best known being the grind descriptor [20]. some extensions of the descriptor have been presented recently, including the incorporation of shape [21]. it was recognized that molecular shape is a major factor determining ligand-receptor binding, a property that was previously not emphasized enough by the original grind descriptors. this was due to the fact that only maximum products of interactions are incorporated into the descriptor, omitting large lipophilic features which do not contribute significantly to the calculated interaction energies with probes, but might still have a profound influence on binding through steric effects. introducing the new 'tip' probe (which is not a probe in the traditional sense but a measure of the curvature of the molecular surface) led to significant improvements being observed in qsar studies of adenosine receptor antagonists (of the xanthine structural class) and plasmodium falciparum plasmepsin inhibitors. interestingly, tip-tip correlations were also found to be the most significant descriptors in the case of a 1 antagonists, showing the importance of the shape descriptor for this class. the second development was the 'anchor-grind' approach [22], which focuses on user-defined features and calculates a distribution of interaction points relative to them, thereby incorporating pre-existing biological knowledge about a target. the resulting models are found to be both of better quality and easier to interpret on congeneric series of hepatitis c virus ns3 protease and acetylcholinesterase inhibitors, as well as more discriminatory between factor xa inhibitors of high and low affinity. a virtual screening methodology also based on the grid force field was developed recently [23].
this method was validated on a large dataset containing thrombin inhibitors and also showed the potential to select suitable replacements for scaffolds typically encountered in the lead optimization stage. a molecular 'descriptor' which actually does not employ an explicit transformation of the molecular structure into descriptor space was recently presented [24]. it employs a graph kernel description of the structure in combination with support vector machines (svms) for regression analysis. the computational burden is alleviated by employing a morgan index process as well as the definition of a second-order markov model for random walks on 2d structures. the method was then validated on two mutagenicity datasets. while the method, in its current form, already exhibits the ability to capture molecular features responsible for bioactivity (here mutagenicity), future developments might include more abstract representations of the molecular scaffold, such as some form of reduced graph representation. while the bioinformatics area has a multitude of methods which can be applied to the analysis of 1d representations of protein sequences and dna, due to branching and cyclization the case is far more difficult for small molecules. one of the few 1d representations of molecules [25], based on multidimensional scaling of the structure from 3d into 1d space, has more recently been extended to allow for the alignment of multiple structures [26]. applied to skc kinase ligands as well as herg channel blockers, a significant improvement in retrieval rates could be observed in a retrospective study if multiple (in this case 10) ligands were used for screening.
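the morgan index process mentioned above iteratively refines per-atom values from neighbour sums, so that each value comes to encode a progressively larger atom environment; a sketch on a toy adjacency list (not a real molecule parser):

```python
# Morgan (extended-connectivity) iteration: start from atom degrees and
# repeatedly replace each atom's value by the sum of its neighbours'
# values, refining atom environments at each step.
def morgan_indices(adjacency, iterations=3):
    values = {atom: len(nbrs) for atom, nbrs in adjacency.items()}
    for _ in range(iterations):
        # the comprehension reads the previous iteration's values
        values = {atom: sum(values[n] for n in nbrs)
                  for atom, nbrs in adjacency.items()}
    return values

# butane-like chain 0-1-2-3: terminal atoms stay distinguishable
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(morgan_indices(chain, 1))  # → {0: 2, 1: 3, 2: 3, 3: 2}
```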
the concept of feature trees was also recently extended to allow for the incorporation of knowledge derived from multiple ligands into a single query [27], and retrospective screening results on ace inhibitors as well as adrenergic a 1a receptor ligands showed considerable improvements over searches using single queries, both in terms of enrichment and in the diversity of the structures identified. when structures are encoded in a discrete fashion, 'binning' is often employed in order to convert real-valued distance ranges into binary presence/absence features. this approach is followed in, for example, the cats autocorrelation descriptor in its 3d version (cats3d) [28]. however, bin borders may introduce artifacts: feature distances close to each other but on opposite sides of a bin border are perceived to be as different from each other (simply because the features do not match) as much more distant features. accordingly, a related descriptor termed 'squid' was recently introduced which incorporates a variable degree of fuzziness [29]. applied to cox-2 ligands, considerable retrieval improvement was observed, with the best performance at intermediate degrees of fuzziness. using cox-2 ligands as well as thrombin inhibitors in combination with graph-based potential pharmacophore point triangles, typed according to interaction types, features responsible for ligand-target binding could be identified [30]. in addition, prospective screening was performed, and a benzimidazole identified as a potent cox-2 inhibitor was experimentally found to be active in a cellular assay with high affinity (ic50 = 200 nm). the ultimate descriptor, in the realm of virtual screening, is the response of the biological system. while structure-derived descriptors are quick and (usually) easy to calculate, they are not the final goal; what matters is the effect that the compound has in a 'real-world' setting.
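the bin-border artifact and the fuzzy alternative described above can be illustrated as follows; the triangular membership function is an illustrative choice, not squid's exact formulation:

```python
# Hard vs. fuzzy binning of a distance value: 3.99 and 4.01 fall into
# different hard bins, but receive nearly identical fuzzy memberships.
def hard_bin(d, width=2.0, n_bins=5):
    vec = [0.0] * n_bins
    vec[min(int(d / width), n_bins - 1)] = 1.0
    return vec

def fuzzy_bin(d, width=2.0, n_bins=5):
    centres = [(i + 0.5) * width for i in range(n_bins)]
    # triangular membership, overlapping adjacent bins
    vec = [max(0.0, 1.0 - abs(d - c) / width) for c in centres]
    s = sum(vec)
    return [v / s for v in vec]

print(hard_bin(3.99), hard_bin(4.01))    # different bins
print(fuzzy_bin(3.99), fuzzy_bin(4.01))  # nearly identical vectors
```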
using those biological effects as descriptors, namely percent inhibition values across a range of 92 targets for 1567 molecules, the 'biospectra similarity' (the similarity of effects on the respective targets) was established via hierarchical clustering [31]. it was found that biospectra similarity provides a solid descriptor for forecasting the activities of novel compounds; this was validated by the removal of some important target classes, after which the clustering of the compounds was overall still very stable. while the response of single targets is already a step toward biology, protein readouts of cell cultures [32] also incorporate cell signaling networks, thus stepping even closer to whole-organism systems (of course at the price of the increased complexity and cost involved). also based on biological response data (phenotypic screening), a 'class scoring' technique was recently developed [33], which does not assign binary (hit/non-hit) activities to individual compounds but to classes of compounds instead. this way, more robust assignments are achieved, as well as a lower number of false-positive predictions. svms have previously been used for distinguishing, for example, between drug- and non-drug-like structures [34] and have recently been applied in virtual screening [35, 36]. using dragon descriptors and a modification of the traditional svm to rank molecules (instead of just classifying them), performance was in this study [35] validated on inhibitors (or ligands) of cyclin-dependent kinase 2, cyclooxygenase 2, factor xa, phosphodiesterase-5 and the a 1a adrenoceptor. compared to methods such as binary kernel discrimination in combination with jchem fingerprints, the new approach was found to be superior. the ability to perform lead hopping was also demonstrated recently through the combination of svms with 3d pharmacophore fingerprints (defined as smarts queries) [36].
there is a trend in the recent cheminformatics literature toward ensemble methods, i.e., methods where multiple models (instead of a single model) are generated and used together (as an ensemble) to make either qualitative or quantitative predictions about new instances. random forests [37] are an ensemble of unpruned classification or regression trees created by bootstrapping of the training data and random feature selection during tree induction. a prediction is then made by majority vote or by averaging the predictions of the ensemble. on a set of diverse datasets (blood-brain-barrier penetration, estrogen binding, p-glycoprotein activity, multidrug-resistance reversal activity, and activity against cox-2 and dopamine receptors), superior results to methods such as decision trees and pls were reported. more recently, 'boosting' was applied to the same (and additional) datasets [38], and as a general rule this newer method seems to be slightly superior in large regression tasks, whereas random forests are claimed to excel in classification problems. additionally, employing k-nearest neighbor classifiers, svms and ridge regression in an ensemble approach [39] gave a significant improvement over single classifiers on a 'frequent hitter' dataset. most models derived in qsar studies, for example ordinary and partial least-squares regression or principal components regression, employ a linear parametric part and a random error part, the latter of which is assumed to show independent random distributions for each descriptor. however, since molecular descriptors never capture 'complete' information about a molecule, this independence assumption is often not valid. kriging [40] replaces the independent errors by, for example, gaussian processes. applied to a boiling-point dataset and compared to other regression methods (ordinary and partial least-squares and principal component regression), improved performance could be observed.
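the bootstrap-and-vote scheme underlying random forests can be sketched with a deliberately weak base learner; the 1-nearest-neighbour classifier below is chosen only for brevity, and the data are synthetic:

```python
# Minimal bagging sketch: bootstrap the training data, fit one weak
# learner per bootstrap sample, and predict by majority vote.
import random

def one_nn(train, x):
    """1-nearest-neighbour prediction on (feature, label) pairs."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def bagged_predict(data, x, n_models=25, seed=0):
    rng = random.Random(seed)  # fixed seed for reproducibility
    votes = []
    for _ in range(n_models):
        boot = [rng.choice(data) for _ in data]  # bootstrap sample
        votes.append(one_nn(boot, x))
    return max(set(votes), key=votes.count)      # majority vote

data = [(0.1, "inactive"), (0.2, "inactive"), (0.9, "active"), (1.0, "active")]
print(bagged_predict(data, 0.95))  # → active
```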
alongside model generation, feature selection is also an important step in many studies. since no perfect descriptors of the molecular system are known, often a multitude of descriptors (frequently several thousand) are calculated, in the hope that they capture the information relevant to the respective classification or regression task. a comparative study of feature selection methods in drug design appeared recently [41], which compares information gain, mutual information, the χ²-test, the odds ratio and the gss coefficient (named after the authors galavotti, sebastiani and simi; a simplified version of the χ²-test) in combination with the naïve bayes classifier as well as svms. while svms were found overall to perform favorably in higher-dimensional feature spaces (and do not benefit much from feature selection), feature selection is found to be a crucial step for the bayes classifier. (note that this has at the same time been shown empirically in virtual screening experiments [42, 43].) some of the methods, namely mutual information and genetic programming, have also been evaluated separately for their use in qsar studies [44] with respect to a dataset which showed some (typical) problems present in the area, such as very different sizes of the 'active' and 'inactive' data subsets. the problem that structure-activity relationships are rarely linear has been addressed previously through the application of nonlinear methods [45, 46] such as k-nearest neighbor approaches [47, 48]. more recently, k-nn has also been combined with a comfa-like approach, termed k-nn mfa, to predict the bioactivity of a compound based on its k nearest neighbors in 'field space' [49]. as discussed by the authors, some of the disadvantages of comfa, such as alignment problems, are retained; nonetheless, multiple models are produced in each run, giving more room for appropriate model selection.
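mutual information, one of the feature selection criteria compared above, can be computed for a binary descriptor bit and a binary activity label as follows (data are synthetic):

```python
# Mutual information I(F;Y) between a binary descriptor bit F and a
# binary activity label Y, usable to rank features for selection.
from math import log2

def mutual_information(feature, label):
    n = len(feature)
    mi = 0.0
    for f in (0, 1):
        for y in (0, 1):
            p_fy = sum(1 for a, b in zip(feature, label)
                       if a == f and b == y) / n
            p_f = feature.count(f) / n
            p_y = label.count(y) / n
            if p_fy > 0:
                mi += p_fy * log2(p_fy / (p_f * p_y))
    return mi

labels      = [1, 1, 1, 1, 0, 0, 0, 0]
informative = [1, 1, 1, 1, 0, 0, 0, 0]  # perfectly predictive bit
noisy       = [1, 0, 1, 0, 1, 0, 1, 0]  # independent of the label
print(mutual_information(informative, labels))  # → 1.0 bit
print(mutual_information(noisy, labels))        # → 0.0 bits
```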
removing limitations of the statistical model is possible using non-parametric models, which have recently been used in qsar studies [50] and were shown to improve results over more conventional regression-type models. bayesian regularized networks have also been found to be of interest in recent qsar studies [51] [52] [53] . these networks possess inherent advantages, including a lower risk of being overtrained than non-bayesian networks (since more complex models are penalized by default). the effects of binary representations of fingerprints have been known for some time, such as combinatorial preferences [54] and size effects [55] (depending on the similarity coefficient used). more recently, another aspect of the binary representation of features in a fingerprint has been analyzed [56] . integer- or real-valued feature vectors were calculated for 12 activity classes, employing cats2d and cats3d autocorrelation descriptors as well as ghose and crippen fragment descriptors. afterward, retrospective virtual screening calculations were performed for both the original (quantitative) representations and the binary (presence/absence) fingerprints. surprisingly, in only 2 out of the 12 cases were significantly different numbers of actives retrieved (defined as more than 20% difference). in addition, the retrieved actives showed, depending on the activity class, very different overlap, between 0% and 90%, indicating some orthogonality of the same descriptor depending on its representation (integer/real-valued vs. binary format). exploiting the 'molecular similarity principle' by not only looking for neighbors of an active compound and assuming they are active (as is usually done in virtual screening) but also using this knowledge further to improve the model has recently been implemented in a method called 'turbo similarity searching' [57] .
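the effect described above - the same descriptor behaving differently in binary vs. quantitative form - is easy to reproduce on toy data. the sketch below compares the standard tanimoto coefficient on binarised fingerprints with a min/max (jaccard-style) generalisation often used for count vectors; the fingerprints are made up for illustration.

```python
def tanimoto_binary(a, b):
    # tanimoto on presence/absence fingerprints stored as sets of "on" bits
    return len(a & b) / len(a | b)

def tanimoto_counts(a, b):
    # min/max (jaccard-style) generalisation for integer count vectors;
    # it reduces to the binary form when all counts are 0 or 1
    return (sum(min(x, y) for x, y in zip(a, b))
            / sum(max(x, y) for x, y in zip(a, b)))

# made-up count fingerprints over five features
fp1 = [3, 0, 1, 2, 0]
fp2 = [1, 0, 0, 2, 1]
bits1 = {i for i, c in enumerate(fp1) if c}   # binarised presence/absence
bits2 = {i for i, c in enumerate(fp2) if c}

sim_binary = tanimoto_binary(bits1, bits2)    # 2 shared / 4 total = 0.5
sim_counts = tanimoto_counts(fp1, fp2)        # 3 / 7
```

the two values differ even though they are computed from the same underlying descriptor, which is exactly the representation effect studied in [56].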
by feeding back information about the nearest neighbors of an active compound into the model generation step, an increased number of active compounds can be retrieved in a subsequent step. this is analogous to the re-use of hot air in turbo chargers in cars, where the output (hot gas, nearest neighbor in this case) is fed back into the loop to improve performance. a number of publications have appeared recently focusing on the validation of qsar models. a wealth of parameters exists here, such as training/test/validation set splits, the dimensionality of descriptors used in relation to the number of degrees of freedom of a model, or the way selection of features is performed. while it has been recognized for some time that a larger number of descriptors increases the likelihood of chance correlations [58] , more recently a discussion of the validity of statistical significance tests, such as the f test, has appeared [59] which puts the number of features considered into relation to the significance of a model. this study cautions, in agreement with earlier work, that one needs to be very careful when judging the statistical significance of correlation models if feature selection is applied - and that statistically 'significant' models can hardly be 'avoided' if too large a variable pool is chosen to select features in the first place. since datasets are generally limited in size, a suitable split into training and test set(s) is crucial in order to achieve sufficient training examples on the one hand, and as high as possible a predictivity of the model on the other. often, leave-one-out cross-validation has been used to judge model performance - where the compound 'left out' was supposed to be a novel compound for which property predictions had to be made. unfortunately this is, according to recent studies, not a suitable validation method [60, 61] .
in the case of leave-one-out cross-validation, where features are selected from a wider range, the tendency exists in every case to select those features which perform best on a particular compound - thus decreasing generalizability of the model. results were summarized in a simple statement: 'beware of q²!', alluding specifically to the cross-validated correlation coefficient of a leave-one-out cross-validation. in addition, general guidelines for developing robust qsar models were developed, namely a high cross-validated correlation coefficient and a regression with a slope close to 1 and no significant bias. using theoretical considerations as well as empirical evaluations, the question of leave-one-out vs. separate test sets was recently considered in detail [62] . performing repeated cross-validations of both types on a large qsar dataset, the conclusion was drawn that in the case of smaller datasets, separate test sets are wasteful, but in the case of larger datasets (at least large three-digit numbers of data points) they are recommended. this partly contradicts the above conclusion, that separate test sets should always be used. the discrepancy was explained by the fact that in the earlier work only small separate test sets were used (containing 10 compounds), which were not able to provide a sufficiently reliable performance measure. the finding that cross-validation often overestimates model performance was corroborated in a recent related study [63] , in particular, in cases where strong model selection such as variable selection is applied. the main influence on quality overestimation was found to be a (small) dataset size; other factors are the size of the variable pool considered, the object-to-variable ratio, the variable selection method, and the correlation structure of the underlying data matrix.
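for readers unfamiliar with q², the leave-one-out statistic is computed as below. this is a generic sketch (a 1-nearest-neighbour regressor on made-up, single-descriptor activity data), not one of the models used in the cited studies.

```python
def loo_q2(xs, ys, predict):
    # q2 = 1 - PRESS / total sum of squares, where every prediction is
    # made with the corresponding data point left out of the training set
    mean_y = sum(ys) / len(ys)
    press, ss_tot = 0.0, 0.0
    for i in range(len(xs)):
        train = [(x, y) for j, (x, y) in enumerate(zip(xs, ys)) if j != i]
        press += (ys[i] - predict(train, xs[i])) ** 2
        ss_tot += (ys[i] - mean_y) ** 2
    return 1 - press / ss_tot

def nn_regress(train, x):
    # 1-nearest-neighbour regression on a single descriptor
    return min(train, key=lambda t: abs(t[0] - x))[1]

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 1.1, 2.0, 2.9, 4.2]   # nearly linear made-up activity data
q2 = loo_q2(xs, ys, nn_regress)
```

note how q² is well below the fit quality one would report on the training data itself, which is the optimism the cited studies warn about when q² alone is used for model selection.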
while in the case of conventional stepwise variable selection overconfidence is commonly encountered, as a remedy lasso (least absolute shrinkage and selection operator) selection is proposed, as well as the utilization of ensemble averaging. both techniques give more reliable estimates of the quality of the developed model. given that the latter was shown to improve performance in many cases on its own, the generation of reliable performance measures is an additional advantage of ensemble techniques. overfitting is a problem which describes good model performance on a training set but much worse performance on subsequent data, and thus, mediocre generalizability of the model (the model is not robust). a recent discussion of this problem, with many accessible examples, gives similar guidelines to those above, such as that leave-one-out cross-validation is not sufficient [64] . it also emphasizes the recommendation of multiple training/test set splits even in the case of very large dataset sizes and of performing cross-validation across classes of compounds in the case of close analogues (instead of molecule-by-molecule splits). in order to have some measure of overfitting, the use of 'benchmark models' such as partial least squares is recommended (depending on the particular problem) in order to determine whether there might be simpler models appropriate to the task (indicating that the more complex model overfits the data). using a toxicity dataset of phenols against tetrahymena pyriformis [65] , the conclusion that q² is not a sufficient predictor for the applicability of a qsar model to unseen compounds is corroborated; the study suggests using the rms error of prediction (rmsep) instead.
this guideline is presented along with additional important points: that outliers should not necessarily be deleted since this step reduces the chemical space covered by the model, that the number of descriptors in a multivariate model needs to be chosen carefully and finally that an 'appropriate' number of dimensions is required for pls modeling. in addition, the influence of the number of variables on predictive performance for training and test sets is investigated. several recent publications have attempted to investigate what the actual scope of a qsar model is - and attempted to develop guidelines to assess the applicability of a model to a novel compound whose properties are to be predicted [66, 67] . two measures for applicability are proposed: the similarity of the novel molecule to the nearest molecule in the training set and the number of neighbors of the novel compound within the training set with a similarity greater than a certain cutoff. as expected, molecules with the highest similarity are best predicted, and this was found to be true across datasets as well as across methods. the applicability measures described above can also be used numerically to derive error bars estimating how likely the prediction of a specific model is to fall within a certain error threshold. the issue of model validity was also briefly reviewed from a regulatory viewpoint [68] . in a similar vein, a 'classification approach' has been presented for determining the validity of a qsar model for predicting properties of a novel compound [69] . focusing on linear models (though the underlying concept is more generally applicable), the predictions made for compounds within the initial training set are differentiated between 'good residuals' and 'bad residuals'.
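the two applicability-domain measures proposed above can be sketched directly; the bit-set fingerprints and the 0.7 cutoff below are made up for illustration.

```python
def tanimoto(a, b):
    # tanimoto coefficient on presence/absence fingerprints (sets of bits)
    return len(a & b) / len(a | b)

def applicability(query, training_set, cutoff=0.7):
    # measure 1: similarity to the nearest training-set molecule
    # measure 2: number of training-set neighbours at or above the cutoff
    sims = [tanimoto(query, t) for t in training_set]
    return max(sims), sum(s >= cutoff for s in sims)

# made-up bit-set fingerprints for a three-compound training set
training = [{1, 2, 3, 4}, {1, 2, 3, 5}, {6, 7, 8, 9}]
nearest_sim, n_neighbours = applicability({1, 2, 3, 4}, training)
```

a query far from all training fingerprints would return a low nearest-neighbour similarity and zero neighbours, flagging the prediction as outside the model's domain.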
using three different datasets (an artemisinin dataset as well as two boiling point datasets), machine-learning methods were employed to predict whether a novel compound belongs to the 'good' or 'bad' class of residuals, thereby making predictions as to whether its properties can be predicted - with a success rate of between 73% and 94%. a stepwise approach for determining model applicability [70] considers physicochemical properties, structural properties, a mechanistic understanding of the phenomenon and, if applicable, the reliability of simulated metabolism in a step-by-step manner. with several qsar datasets, it could be shown that for substances that are well covered by the training set, improved predictions can be made for novel compounds, in agreement with the conclusions stated above. the performance of similarity searching methods varies widely, comprising both target- and ligand-based approaches. while large enrichment factors (often in the hundreds) are reported, the question arises of how much 'added value' more sophisticated methods actually provide, compared to very simple approaches, and where the gain-to-cost ratio actually shows an optimum. a recent study illustrated that simple 'atom count descriptors' (which do not capture any structural knowledge but represent a molecule by a set of integers which represent the number of atoms of each element) achieve performance comparable to that of state-of-the-art fingerprints [6] . thus, when averaged over multiple target classes, the added value of virtual screening approaches is probably closer to two (compared to trivial descriptors) than in the region of often published double-digit numbers (compared to random selection).
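the 'atom count descriptor' mentioned above is about as simple as a descriptor can get: a molecular-formula parser is sufficient to compute it. the formula used below is aspirin's, chosen only for illustration.

```python
import re
from collections import Counter

def atom_counts(formula):
    # turn a molecular formula such as "C9H8O4" into per-element counts;
    # no structural information about the molecule is used at all
    counts = Counter()
    for element, digits in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(digits) if digits else 1
    return counts

aspirin = atom_counts("C9H8O4")
```

two molecules can then be compared simply by the distance between their count vectors, which is the 'dumb' baseline the cited study benchmarked against structural fingerprints.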
it should be added that the performance of 'dumb' and more sophisticated descriptors varied widely, from virtually no difference up to high single-digit performance improvements of state-of-the-art fingerprints (which are, with respect to retrieval rate and on a mddr-dataset, circular fingerprint descriptors). hts results are notorious for the amount of noise they contain, and methods such as multiple screening runs are routinely applied to alleviate the problem. still, additional experiments are required. an alternative method was recently presented [71] which, applying purely computational methods, is able to predict truly active compounds with improved reliability in screenings where multiple compounds are screened per well. using scitegic circular fingerprints [72] , similarities between molecules in wells containing compounds predicted as being active (which may be true positives or, often, just noise) are calculated. the compounds most similar to active compounds are more likely to be active themselves; by predicting (across wells) those compounds which are similar to each other and at the same time are located in wells showing activity, the active compounds out of the mixtures can be estimated. this way, between 29% and 41% of the active compounds could be retrieved in the top 10% of the sorted compounds. another approach which attempts to improve knowledge derived from hts campaigns was recently proposed [73] ; it replaces the conventional selection of a fixed number of primary-screen actives for secondary screening (the 'top x' approach). alternatively, methods based on partitioning are frequently employed. in the approach proposed in [73] , an ontology-based pattern identification method is employed, which originated from bioinformatics methods (the prediction of gene function based on microarray data).
taking scaffold diversity into account and also applying the 'molecular similarity principle', the overall probability of selecting active compounds from different clusters is maximized. based on earlier hts data, significant improvement of hit confirmation rates was demonstrated, compared to a conventional 'top x' approach. related work was recently also performed with a focus on scaffold clustering [74] . as discussed below, scoring functions are not yet able to predict binding affinities sufficiently well across the board of target proteins. still, the identification of active ligands was shown to be improved by a second data post-processing step. first, ligands are docked to the target. subsequently, predicted active and inactive compounds are subject to model generation via a naïve bayesian model [75] based on circular fingerprints. applied to protein kinase b and protein-tyrosine phosphatase 1b, significant performance improvements could be observed in combination with dock, flexx and glide scores on protein-tyrosine phosphatase 1b. on the other hand, results on protein kinase b were not improved, which was attributed to the fact that the predicted actives used to train the model were 100% false positives. understandably, performance cannot be improved if the initial enrichments are not able to identify true positive binders. more recently, a consensus scoring step was introduced between scoring and the selection of active and inactive compounds for training the bayes classifier [76] . since consensus scoring is often able to rescue docking results in cases where a specific scoring function fails, rank-by-median consensus scoring was shown to improve results for protein kinase b considerably. other consensus approaches (rank-by-mean and rank-by-vote) did not perform as well. this was attributed to their sensitivity to cases where one of the scoring functions performs badly.
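rank-by-median consensus scoring, and why it is robust when one scoring function fails, can be seen on made-up scores:

```python
from statistics import median

def consensus_ranks(score_lists):
    # score_lists: one {compound: score} dict per scoring function, with
    # higher score = better; convert each to ranks (1 = best), then
    # combine by the median rank, which tolerates one misbehaving function
    all_ranks = {}
    for scores in score_lists:
        ordered = sorted(scores, key=scores.get, reverse=True)
        for rank, cpd in enumerate(ordered, start=1):
            all_ranks.setdefault(cpd, []).append(rank)
    return {cpd: median(ranks) for cpd, ranks in all_ranks.items()}

# three hypothetical scoring functions; the third one misbehaves
fn1 = {"a": 9.1, "b": 7.2, "c": 3.3}
fn2 = {"a": 8.5, "b": 6.9, "c": 4.0}
fn3 = {"a": 1.0, "b": 2.0, "c": 9.0}   # a failing scoring function
cons = consensus_ranks([fn1, fn2, fn3])
best = min(cons, key=cons.get)
```

compound 'a' keeps median rank 1 despite being ranked last by the failing function; a mean-based combination would be dragged toward the outlier.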
(the median of a set of numbers is less sensitive to outliers than its mean.) an alternative method for post-processing docking scores is the post-dock approach, whose final goal is the elimination of false-positive predictions and their discrimination from artifacts [77] . based on a ligand-target database, descriptors (dock score, empirical scoring and buried solvent-accessible surface area) were derived and machine-learning models were built to identify false-positive predictions. validating the method on 44 structurally diverse targets (plus the same number of decoy complexes), 39 of the 44 binding complexes, and only 2 of the 44 decoys, were predicted to be true positives. compared to purely docking-based methods, dock and chemscore achieve enrichments on the order of five to seven, depending upon the database used, while the method presented here claims to obtain about 19-fold enrichment. consensus prediction of docking scores is often able to improve results over single functions, and multiple ways have been proposed to combine scores from different functions, such as rank-by-rank, rank-by-vote or rank-by-number [78] . performance improvement could not be observed in every case, and a theoretical study [79] set out to elucidate the way in which consensus scoring improves results, concluding that this was due to the simple reason that multiple samplings of a distribution are closer to its true mean than single samplings. assumptions made by the study, such as that the performance of each individual scoring function is comparable, later led to the work being criticized [80] , and it has been concluded that consensus scoring can improve results but does not do so in every case (as observed in practice). more recently, it was demonstrated [81] that two criteria are important if consensus scoring is to be successful: first, each individual scoring function has to be of high quality, and second, the scoring functions need to be distinctive.
even if no training data are available to judge those points, rank-vs.-score plots were proposed to gauge the success of target-based virtual screening against a particular target. while consensus predictions for ligand-based virtual screening have been known for some time, a more recent study extended the descriptors employed to include structural, 2d pharmacophore and property-based fingerprints as well as bcut descriptors and 3d pharmacophores [82] . logistic regression and rank-by-sum consensus approaches were found to be most advantageous due to repeated samplings, better clustering of actives (since multiple sampling will recover more actives than inactives) and agreement of methods to predict actives but less so inactives. in addition, more stable performance across a range of targets was observed. if multiple active compounds are known in a virtual screening setting, the question arises of how to combine the retrieved lists of individual compounds. applied to different activity classes from the mdl drug data report as well as the natural products database [83] , it was recently found that the rank-by-max method generally outperforms the rank-by-sum method; it was also concluded that the tanimoto coefficient is superior to 10 other similarity coefficients considered. as to the applicability of consensus approaches, it is found that more dissimilar activity classes profit more than more homogeneous classes, where best retrieval performance is already obtained using lower numbers of query structures (which are then already able to cover the 'activity island' inhabited by the particular class of compounds). while many applications of virtual screening tools have appeared in the literature, only some examples can be given here. a phosphodiesterase-4 (pde4) inhibitor has recently been optimized through the application of small combinatorial libraries [84] .
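the rank-by-max fusion rule for multiple query structures can be sketched as follows (tanimoto similarity on made-up bit-set fingerprints; a simplified illustration, not the exact protocol of [83]):

```python
def tanimoto(a, b):
    # tanimoto coefficient on presence/absence fingerprints (sets of bits)
    return len(a & b) / len(a | b)

def fuse_rank_by_max(queries, database):
    # each database compound keeps its best (lowest) rank over the
    # per-query similarity rankings; the fused list is sorted by that rank
    best_rank = {c: len(database) for c in database}
    for q in queries:
        ranking = sorted(database, key=lambda c: tanimoto(q, c), reverse=True)
        for rank, c in enumerate(ranking, start=1):
            best_rank[c] = min(best_rank[c], rank)
    return sorted(database, key=lambda c: best_rank[c])

# two known actives as queries, three database compounds (frozensets of bits)
queries = [frozenset({1, 2, 3}), frozenset({7, 8, 9})]
database = [frozenset({1, 2, 4}), frozenset({7, 8, 10}), frozenset({20, 21})]
fused = fuse_rank_by_max(queries, database)
```

because each compound only needs to resemble one of the queries, rank-by-max rewards dissimilar actives covering different parts of the 'activity island', which matches the observation above that heterogeneous activity classes profit most.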
affinity was increased by three orders of magnitude by screening only 320 compounds after prioritization by flexx docking. following the recent sars scare, a virtual screening procedure via docking (dock program) was able to find inhibitors of sars coronavirus 3c-like proteinase with binding affinities of k i = 61 μm out of 40 compounds tested [85] . virtual screening based on a homology model of the neurokinin-1 (nk1) receptor led to the discovery of submicromolar ligands [86] , while even nanomolar binding compounds against checkpoint kinase 1 (chk1) could be discovered [87] by applying successive filtering for physicochemical properties, pharmacophore filters and docking stages. ligand-based pharmacophore models generated by catalyst [88] were used to discover nanomolar ligands of erg2, emopamil-binding protein (ebp), and the sigma-1 receptor (σ1) [89] . out of 11 compounds tested, 3 exhibited affinities of less than 60 nm. high levels of biliary elimination of a cck2 antagonist led to the quest for novel compounds, which retained activity and selectivity while improving half-life. using field points derived from xed charges [90] , novel heterocycles were proposed [91] (switching from an indole to a pyrrole and imidazole series), which decreased molecular weight and polarity and achieved the desired scaffold hop. apart from this list of applications against particular targets, only two further applications shall be described here (since the field is simply too large to capture in its entirety). first, ligand- and target-based approaches were recently compared in their abilities to identify ligands for g-protein coupled receptors [92] . evaluating docking into homology models, ligand-based pharmacophore models and feature trees, 3d similarity searches as well as models built on 2d descriptors, all ligand-based techniques were shown to outperform the docking-based approaches. however, docking also provided significant enrichment.
second, the 'hts data mining and docking competition' presented its results recently [93] [94] [95] . duplicate residual activities of 50,000 compounds against escherichia coli dhfr in primary screening were released in late 2003 [96] , upon which 42 groups submitted activity predictions for a test set of the same size (but with unknown activity). approaches employed ranged from docking [97, 98] to purely ligand-based methods [99, 100] . overall, none of them was able to predict actives from the test set reliably. while this was partly due to differences in chemical composition of the training and test sets, an additional problem was posed by the test set, which did not contain real 'actives' (showing proper dose-response curves in secondary assays), thus making predictions difficult. several novel clustering algorithms have been presented recently, each of which extends previous approaches in its own way. a combination of fingerprint and maximum common substructure (mcs) descriptors [101] speeds up clustering (compared to purely mcs methods), enabling its application to large datasets, and the method was shown to be able to identify the most frequent scaffolds in databases, to select analogues of screening hits and to prioritize chemical vendor libraries. a modification of k-means clustering also showed a considerable speed increase to be possible when processing large libraries [102] , as demonstrated on a dataset containing about 60,000 compounds derived from the mddr. the desired speed-up was observed along with favorable enrichment of activity classes within the clusters. by introducing fuzziness into the clustering process [103] , superior results can be obtained compared to the original (non-fuzzy or 'crisp') approaches to k-means and ward clustering, depending on the particular dataset and the property one attempts to predict.
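the fuzzy assignments referred to here replace hard cluster labels with weights; below is a minimal sketch of the standard fuzzy-c-means membership formula, with made-up one-dimensional data and centres.

```python
def fuzzy_memberships(x, centres, m=2.0):
    # fuzzy-c-means-style partial memberships: instead of a hard ('crisp')
    # assignment, the point gets a weight in [0, 1] for every cluster
    # centre, with weights summing to 1; m > 1 controls the fuzziness
    dists = [abs(x - c) for c in centres]
    if 0.0 in dists:                 # the point sits exactly on a centre
        return [1.0 if d == 0.0 else 0.0 for d in dists]
    return [1.0 / sum((d_i / d_j) ** (2.0 / (m - 1.0)) for d_j in dists)
            for d_i in dists]

centres = [0.0, 10.0]
u = fuzzy_memberships(2.0, centres)   # much closer to the first centre
```

a crisp clustering would simply assign the point to cluster 0; the fuzzy weights additionally record how borderline the assignment is.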
fuzzy clustering assigns partial memberships to multiple classes (instead of binary values); with a log p dataset, the best fuzzy parameterization was shown to clearly outperform the best crisp clustering. in addition, partial class memberships were shown to capture the 'chemical character' of a compound more satisfactorily than conventional (crisp) class assignments. while the concept of 'drug-likeness' has to be applied with care (and one needs to be aware of its limitations), it has nonetheless received considerable attention in recent years, based on datasets derived from the available chemicals directory (acd) and the world drug index (wdi). first applications employed ghose/crippen descriptors in combination with neural networks for classification, and correct classification was achieved for 83% of the acd and 77% of the wdi [104] . later, the application of svms was not able to improve overall performance significantly, but the new method was able to correctly classify compounds that were misclassified by the ann-based technique [34] . very recently a further analysis of the drug/non-drug dataset appeared, which analyzed svm performance (as well as that of other machine-learning methods) in more detail [105] . it was found that, in spite of problems with the dataset (some descriptor representations of compounds were, for example, identical in the drug and non-drug dataset), performance could be improved considerably to about 7% misclassified compounds by optimizing the kernel dimensions employed. an application using 'human-understandable' descriptors of drug- vs. non-drug-like properties has also recently been presented [106] , and was able to distinguish between both datasets, with the most important descriptors being proper saturation level and the heteroatom-to-carbon ratio of the molecule.
the concept of database comparison is also more generally applicable, as was shown recently when the question of how 'in-house like' external databases are was addressed in order to help to decide whether they should be acquired or not [107] . a number of validations of docking programs have appeared recently, and it is interesting to observe that they grow in size in every respect - including the number of docking and scoring functions considered as well as the number and diversity of ligand-target complexes employed for their evaluation. using dock, gold and glide in order to evaluate the performance of docking programs in target-based virtual screening on five targets (hiv protease, protein tyrosine phosphatase 1b, thrombin, urokinase plasminogen activator and the human homologue of the mouse double minute 2 oncoprotein), it was concluded that performance is both target- and method-dependent [108] . performance varied widely, from near-perfect behavior (for example, gold in combination with protein tyrosine phosphatase 1b) to negative enrichment (for example, gold with hiv protease). employing fred, dock and surflex, and adapting the algorithm to the particular binding pocket, it was found that target-based virtual screening is successful in some cases [109] , with surflex probably performing the best overall. investigating phosphodiesterase 4b [110] and a set of 19 known inhibitors with 1980 decoys, the scoring functions pmf, jain, plp2, ligscore2 and dockscore were compared with respect to their ability to enrich known ligands. it was found that pmf and jain showed high enrichment factors (greater than four-fold) alone, while a rank-based consensus-scoring scheme employing pmf and jain in combination with either dockscore or plp2 showed more robust results. in what is probably one of the most extensive studies yet, 14 scoring functions in combination with 800 protein-ligand complexes from the pdbbind database have been compared for evaluation [111] .
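the enrichment factors quoted in these comparisons (four-fold, five- to seven-fold, and so on) follow from a simple definition: actives found in the top fraction of the ranked list, divided by the actives expected there under random selection. a sketch on a made-up 100-compound screen:

```python
def enrichment_factor(ranked, actives, fraction=0.1):
    # EF at a given fraction of the ranked database:
    # (actives found in the top x%) / (actives expected there at random)
    n_top = max(1, int(len(ranked) * fraction))
    found = sum(1 for cpd in ranked[:n_top] if cpd in actives)
    expected = len(actives) * fraction
    return found / expected

# hypothetical screen: 100 compounds, 5 actives, 4 of them ranked in the top 10
ranked = ["a1", "a2", "a3", "a4"] + [f"d{i}" for i in range(95)] + ["a5"]
actives = {"a1", "a2", "a3", "a4", "a5"}
ef10 = enrichment_factor(ranked, actives, 0.1)   # 4 found vs 0.5 expected
```

note that the maximum attainable value depends on the fraction and on the number of actives, which is one reason enrichment factors from different studies are hard to compare directly.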
the scoring functions compared were x-score and drugscore, five scoring functions implemented in sybyl (chemscore, d-, f- and g-score and pmf-score), four implemented in cerius2 (ligscore, ludi, plp and pmf), two scoring functions implemented in gold (goldscore and chemscore), and the hint function. performance was assessed by their ability to predict affinity (k i /k d values). overall, x-score, drugscore, sybyl with chemscore and cerius2 with plp performed better than the other combinations, giving standard deviations in the range of 1.8-2.0 log units. another very comprehensive evaluation [112] employed 10 docking programs in combination with 37 scoring functions against eight proteins of seven types. three criteria were used for assessment, namely the ability to predict binding modes, to predict ligands with high affinity and to correctly rank-order ligands by affinity. while nearly all programs were able to generate crystallographic ligand-target complexes, the identification of the correct structure by the scoring function was found to be considerably more error-prone. averaged over all targets, none of the programs was able to predict more than 35% of the ligands within an rmsd of 2 Å or less. while active compounds were correctly identified, activity prediction was more difficult - to the extent that 'for the eight proteins of seven evolutionarily diverse target types studied in this evaluation, no statistically significant relationship existed between docking scores and ligand affinity' [112] . similar results were obtained on five datasets (serine, aspartic and metalloproteinases, sugar-binding proteins and a 'miscellaneous' set) using the scoring functions bleep, pmf, gold and chemscore [113] , where across all complexes on average no function returned a better correlation than r² = 0.32. interestingly, another recent study drew quite different conclusions from similar observations [114] .
docking endogenous ligands into a panel of proteins, it was concluded that proteins are often very promiscuous and do not interact with only a single clearly defined small molecule. while this is surely possible, given the limitations of today's scoring functions it might well be the case that predictions are just not yet good enough. while a great number of descriptors and modeling methods have been proposed to date, the recent trend toward proper model validation is very much appreciated. applications of the 'molecular similarity principle' do not yet show the power one would like them to have - and although some of their limitations are surely due to underlying principles and limitations of fundamental concepts, others will certainly be eliminated in the future.
references
- concepts and applications of molecular similarity
- chemical similarity searching
- approaches to measure chemical similarity - a review
- molecular similarity: a key technique in molecular informatics
- the art and practice of structure-based drug design: a molecular modeling perspective
- a discussion of measures of enrichment in virtual screening: comparing the information content of descriptors with increasing levels of sophistication
- in vitro and in silico affinity fingerprints: finding similarities beyond structural classes
- comparison of fingerprint-based methods for virtual screening using multiple bioactive reference structures
- pot-dmc: a virtual screening method for the identification of potent hits
- a 3d similarity method for scaffold hopping from the known drugs or natural ligands to new chemotypes
- lingo, an efficient holographic text based method to calculate biophysical properties and intermolecular similarities
- the reduced graph descriptor in virtual screening and data-driven clustering of high-throughput screening data
- small molecule shape-fingerprints
- extracting the three-dimensional shape of live pigs using stereo photogrammetry
- novel receptor surface approach for 3d-qsar: the weighted probe interaction energy method
- calculation of intersubstituent similarity using r-group descriptors
- use of the r-group descriptor for alignment-free qsar
- fingal: a novel approach to geometric fingerprinting and a comparative study of its application to 3d-qsar modelling
- a computational procedure for determining energetically favorable binding sites on biologically important macromolecules
- grid-independent descriptors (grind): a novel class of alignment-independent three-dimensional molecular descriptors
- incorporating molecular shape into the alignment-free grid-independent descriptors
- anchor-grind: filling the gap between standard 3d qsar and the grid-independent descriptors
- virtual screening and scaffold hopping based on grid molecular interaction fields
- graph kernels for molecular structure-activity relationship analysis with support vector machines
- one-dimensional molecular representations and similarity calculations: methodology and validation
- fast small molecule similarity searching with multiple alignment profiles of molecules represented in one-dimension
- multiple-ligand-based virtual screening: methods and applications of the mtree approach
- comparison of correlation vector methods for ligand-based similarity searching
- fuzzy pharmacophore models from molecular alignments for correlation-vector-based virtual screening
- extraction and visualization of potential pharmacophore points using support vector machines: application to ligand-based virtual screening for cox-2 inhibitors
- biospectra analysis: model proteome characterizations for linking molecular structure and biological response
- can cell systems biology rescue drug discovery?
- identifying biologically active compound classes using phenotypic screening data and sampling statistics
- comparison of support vector machine and artificial neural network systems for drug/nondrug classification
- virtual screening of molecular databases using a support vector machine
- lead hopping using svm and 3d pharmacophore fingerprints
- random forest: a classification and regression tool for compound classification and qsar modeling
- boosting: an ensemble learning tool for compound classification and qsar modeling
- ensemble methods for classification in cheminformatics
- new approach by kriging models to problems in qsar
- a comparative study on feature selection methods for drug discovery
- molecular similarity searching using atom environments, information-based feature selection, and a naive bayesian classifier
- similarity searching of chemical databases using atom environment descriptors (molprint 2d): evaluation of performance
- evaluation of mutual information and genetic programming for feature selection in qsar
- nonlinear prediction of quantitative structure-activity relationships
- nonlinear quantitative structure-activity relationship for the inhibition of dihydrofolate reductase by pyrimidines
- novel variable selection quantitative structure-property relationship approach based on the k-nearest-neighbor principle
- development and validation of k-nearest-neighbor qspr models of metabolic stability of drug candidates
- three-dimensional qsar using the k-nearest neighbor method and its interpretation
- application of non-parametric regression to quantitative structure-activity relationships
- robust qsar models using bayesian regularized neural networks
- predictive bayesian neural network models of mhc class ii peptide binding
- an in silico approach for screening flavonoids as p-glycoprotein inhibitors based on a bayesian-regularized neural network
- combinatorial preferences affect molecular similarity/diversity calculations using binary fingerprints and tanimoto coefficients
- analysis and display of the size dependence of chemical similarity coefficients
- comparison of three holographic fingerprint descriptors and their binary counterparts
- enhancing the effectiveness of similarity-based virtual screening using nearest-neighbor information
- chance factors in studies of quantitative structure-activity relationships
- judging the significance of multiple linear regression models
- rational selection of training and test sets for the development of validated qsar models
- beware of q²!
- assessing model fit by cross-validation
- chance correlation in variable subset regression: influence of the objective function, the selection mechanism, and ensemble averaging
- the problem of overfitting
- the better predictive model: high q² for the training set or low root mean square error of prediction for the test set?
- assessing the reliability of a qsar model's predictions
- similarity to molecules in the training set is a good discriminator for prediction accuracy in qsar
- improving opportunities for regulatory acceptance of qsars: the importance of model domain, uncertainty, validity and predictability
- determining the validity of a qsar model - a classification approach
- a stepwise approach for defining the applicability domain of sar and qsar models
- prioritization of high throughput screening data of compound mixtures using molecular similarity
- novel statistical approach for primary high-throughput screening hit selection
- hiers: hierarchical scaffold clustering using topological chemical graphs
- finding more needles in the haystack: a simple and efficient method for improving high-throughput docking results
- combination of a naive bayes classifier with consensus scoring improves enrichment of high-throughput docking results
- postdock: a structural, empirical approach to scoring protein ligand complexes
- consensus scoring: a method for obtaining improved hit rates from docking databases of three-dimensional structures into proteins
- how does consensus scoring work for virtual library screening? an idealized computer experiment
- virtual screening using protein-ligand docking: avoiding artificial enrichment
- consensus scoring criteria for improving enrichment in virtual screening
- the use of consensus scoring in ligand-based virtual screening
- enhancing the effectiveness of virtual screening by fusing nearest neighbor lists: a comparison of similarity coefficients
- design of small-sized libraries by combinatorial assembly of linkers and functional groups to a given scaffold: application to the structure-based optimization of a phosphodiesterase 4 inhibitor
- virtual screening of novel noncovalent inhibitors for sars-cov 3c-like proteinase
- successful virtual screening for a submicromolar antagonist of the neurokinin-1 receptor based on a ligand-supported homology model
- identification of compounds with nanomolar binding affinity for checkpoint kinase-1 using knowledge-based virtual screening
- pharmacophore modeling and three-dimensional database searching for drug design using catalyst
- discovery of high-affinity ligands of sigma1 receptor, erg2, and emopamil binding protein by pharmacophore modeling and virtual screening
- scaffold hopping with molecular field points: identification of a cholecystokinin-2 (cck(2)) receptor pharmacophore and its use in the design of a prototypical series of pyrrole- and imidazole-based cck(2) antagonists
- virtual screening of biogenic amine-binding g-protein coupled receptors: comparative evaluation of protein- and ligand-based virtual screening protocols
- mcmaster university data-mining and docking competition - computational models on the catwalk
- experimental screening of dihydrofolate reductase yields a 'test set' of 50,000 small molecules for a computational data-mining and docking competition
- evaluating the high-throughput screening computations
- high throughput screening identifies novel inhibitors of e.
coli dihydrofolate reductase that are competitive with dihydrofolate here be dragons: docking and screening in an uncharted region of chemical space virtual ligand screening against e. coli dihydrofolate reductase: improving docking enrichment using physics-based methods using extended-connectivity fingerprints with laplacian-modified bayesian analysis in high-throughput screening follow-up screening for dihydrofolate reductase inhibitors using molprint2d, a fast fragment-based method employing the naive bayesian classifier: limitations of the descriptor and the importance of balanced chemistry in training and test sets database clustering with a combination of fingerprint and maximum common substructure methods a hierarchical clustering approach for large compound libraries clustering files of chemical structures using the fuzzy k-means clustering method a scoring scheme for discriminating between drugs and nondrugs classifying 'drug-likeness' with kernel-based learning methods a new rapid and effective chemistry space filter in recognizing a druglike database in-house likeness'': comparison of large compound collections using artificial neural networks comparison of automated docking programs as virtual screening tools fast structure-based virtual ligand screening combining fred, dock, and surflex retrospective docking study of pde4b ligands and an analysis of the behavior of selected scoring functions an extensive test of 14 scoring functions using the pdb bind refined set of 800 protein-ligand complexes a critical assessment of docking programs and scoring functions predicting protein-ligand binding affinities: a low scoring game? 
key: cord-321715-bkfkmtld authors: redelings, benjamin d; suchard, marc a title: incorporating indel information into phylogeny estimation for rapidly emerging pathogens date: 2007-03-14 journal: bmc evol biol doi: 10.1186/1471-2148-7-40 sha: doc_id: 321715 cord_uid: bkfkmtld background: phylogenies of rapidly evolving pathogens can be difficult to resolve because of the small number of substitutions that accumulate in the short times since divergence. to improve resolution of such phylogenies we propose using insertion and deletion (indel) information in addition to substitution information. we accomplish this through joint estimation of alignment and phylogeny in a bayesian framework, drawing inference using markov chain monte carlo. joint estimation of alignment and phylogeny sidesteps biases that stem from conditioning on a single alignment by taking into account the ensemble of near-optimal alignments. results: we introduce a novel markov chain transition kernel that improves computational efficiency by proposing non-local topology rearrangements and by block sampling alignment and topology parameters. in addition, we extend our previous indel model to increase biological realism by placing indels preferentially on longer branches. we demonstrate the ability of indel information to increase phylogenetic resolution in examples drawn from within-host viral sequence samples. we also demonstrate the importance of taking alignment uncertainty into account when using such information. finally, we show that codon-based substitution models can significantly affect alignment quality and phylogenetic inference by unrealistically forcing indels to begin and end between codons. conclusion: these results indicate that indel information can improve phylogenetic resolution of recently diverged pathogens and that alignment uncertainty should be considered in such analyses.
reconstructing viral phylogenies is important for determining the parent stock of newly emerging strains [1], as well as for understanding how viruses evolve over time, both within a single host and at the population level [2]. viral phylogenies are commonly inferred from aligned molecular sequence data, using the information available in substitutions shared by descent [3-6]. short time-scales dominate in the development of rapidly emerging disease strains, such that the number of observed substitutions between sequences can be too low to yield well-resolved phylogenies. thus, to increase phylogenetic resolution for such disease strains we seek to make use of a wider class of phylogenetic information. insertions and deletions (indels) are a promising category of molecular sequence information that is largely ignored in phylogenetic reconstruction. researchers commonly remove gaps from molecular sequence alignments by coding them as missing data or by throwing out columns that contain gaps [3-6]. indels may be useful to resolve deep branches in the tree of life that are difficult to resolve using information in shared substitutions [7, 8]. at the other extreme, on which we focus here, indels can help to resolve phylogenies in situations where the number of nucleotide substitutions is inadequate. for example, indels in non-coding chloroplast dna have been helpful in resolving the branching order of recent plant radiations [9, 10]. the rate of indel events in these regions approaches or surpasses the rate of substitution, making indels too important to ignore [10]. several species of viruses are also known to accumulate indels, sometimes at a high rate. cheynier et al. [11] note that indel rates are higher than substitution rates in hyper-variable regions of simian immunodeficiency virus (siv) and human immunodeficiency virus (hiv). other viruses also experience indels on short time-scales.
hepatitis b virus (hbv) accumulates deletions in the core/pre-core region during the course of infection [12], while equine infectious anemia virus accumulates insertions [13]. three deletion variants of severe acute respiratory syndrome (sars) appeared during the beginning of the sars outbreak in china [14]. influenza b viruses accumulate indels over several decades [15]. we note that these viruses are all rna viruses, with the exception of hbv. although hbv is a dna virus, it reverse transcribes its dna genome from an rna intermediate. redelings and suchard (2005) describe a statistical method of incorporating indel information into phylogeny estimation. this method uses a joint reconstruction framework that simultaneously infers the alignment, tree, and insertion/deletion rates. estimation proceeds through markov chain monte carlo (mcmc) within a bayesian framework and naturally accounts for uncertainty in alignments, phylogenies, and other parameters through posterior probabilities. unlike sensitivity analysis [16, 17], this approach takes into account uncertainty resulting from the myriad of near-optimal alignments. this approach involves averaging over unobserved quantities such as the alignment and internal node states, which can lead to improved estimates [18]. this is different from other approaches which iteratively optimize a heuristically chosen cost function until no improvement is seen [19, 20]. joint estimation of alignment and phylogeny sidesteps bias that results from conditioning on a single alignment estimate [21, 18], bias which may be exaggerated when indel information is inappropriately used. this method is based on a probabilistic model of sequence evolution that contains insertion and deletion events as well as substitution events.
heuristic "costs" for opening and extending gaps are replaced by the insertion/deletion rate and the mean indel length respectively, which are biologically interpretable parameters and can be estimated from the data without circularity [22, 23]. gaps are not treated as a fifth character state, since this overweights the evidence of shared indels by treating an indel of multiple residues as multiple shared indels [3]. instead, the indel process is separate and independent of the substitution process, and allows indels of several residues simultaneously. in addition, because alignments represent positional homology, the indel process does not allow a newly inserted character to be aligned to a previously deleted character. we introduce a new indel model to remedy a shortcoming of the redelings and suchard (rs05) model. unlike the tkf1 [22] and tkf2 [23] indel models that are not reversible on pairwise alignments, the reversible rs05 model does not make use of branch length information in the indel process and therefore does not place indels preferentially on longer branches. in order to increase biological realism, we describe an extended indel model that is able to incorporate branch length information. in doing so we overcome a substantial theoretical difficulty in using reversible indel models during phylogenetic reconstruction. we further enhance the estimation method of redelings and suchard [24] by introducing a novel mcmc transition kernel to improve mixing among topologies. this transition kernel is based on the subtree-prune-and-regraft (spr) operator but is modified to partially sample the alignment along with the tree. block sampling improves mixing efficiency because topologies and alignments are highly inter-correlated. we introduce codon models [25] into joint estimation.
codon models are often used in both bayesian and likelihood-based phylogeny estimation because they naturally allow different rates at the third codon position, but we are not aware of any work using codon models in joint estimation. we note that codon models implicitly alter the indel process as well as the substitution process by forcing indels to begin and end between codons. this constraint may not be biologically realistic and would result in misaligned nucleotides when indels are not in phase with the reading frame. such misalignment can artificially inflate the number of inferred substitutions. when the total number of substitutions is small, this may significantly alter the model fit or introduce bias. we compare nucleotide and codon indel models to see if these effects are significant. we analyze data sets from siv and hiv. the siv data set consists of a short section of the envelope (env) gene from 9 within-host strains. to see if indel information improves phylogenetic resolution we compare the number of bi-partitions that are supported under the joint model and the traditional sequential approach, in which topology reconstruction assumes a previously determined alignment. we also assess the importance of alignment ambiguity by assessing the sensitivity of phylogeny estimation to fixed alignments under both the traditional and joint models. the hiv data set consists of about 600 nucleotides from the env gene from 27 within-host strains. we compare the number of bi-partitions supported under the sequential and joint models to assess the importance of indel information. we also compare nucleotide and codon models to see if the assumption of unbreakable codons significantly decreases model fit or influences phylogeny estimates. in summary, we seek to improve the power to infer clades in rapidly emerging taxa by making use of indel information in a statistically rigorous manner. 
we also seek to determine whether indels can actually resolve extremely short branches with few substitutions. to accomplish these goals, we introduce an improved statistical model of the insertion-deletion process to improve the accuracy of the inference, and describe a novel mcmc transition kernel to improve the speed of the inference. once our statistical framework is in place, we then demonstrate that indel information can help to detect previously undetected bi-partitions in two real data examples from rna viruses. while analyzing these data, we note that alignment ambiguity may significantly affect phylogeny inference. we note that codon-based alignments can unrealistically shift indels to avoid breaking codons, and we develop the necessary statistical machinery to demonstrate that this can substantially affect phylogeny estimates. we introduce a time-dependent reversible indel process to the probabilistic framework for joint estimation of alignment and phylogeny of redelings and suchard [24]. time-dependence enables us to place indels preferentially on longer branches of the tree, producing a more realistic description of the evolutionary process. further, we also introduce a novel mcmc transition kernel to increase topology mixing so that we can estimate phylogenies and alignments containing increasingly more taxa. we review the salient features of the rs05 model here and propose the necessary extensions for a time-dependent indel process. our model starts with data y, where y is a collection of unaligned molecular sequences y_i for i = 1, ..., n taxa. each molecular sequence y_i is a collection of letters of length |y_i|. we characterize the stochastic model that describes how the sequences in y diverged from a common ancestor in terms of a number of unknown but estimable parameters.
these parameters include a multiple alignment a that specifies the positional homology between the sequences y, an evolutionary tree (τ, t) where τ is an unrooted bifurcating tree topology and t = (t_1, ..., t_(2n-3)) is a vector of branch lengths along the edges in τ, and vectors θ and λ of parameters that characterize the letter substitution and indel processes respectively. alignment a includes felsenstein wildcard sequences of random lengths at the internal nodes of τ. thus, a also depicts the complete indel history among the sequences in y. we scale branch lengths in terms of expected number of substitutions per site. in contrast to traditional methods of phylogeny estimation that arbitrarily fix the alignment, we treat the alignment a as a random variable, leading to the probability expression p(y, a, τ, t, θ, λ) = p(y|a, τ, t, θ) p(a|τ, t, λ) p(τ, t) p(θ) p(λ). (1) the substitution likelihood p(y|a, τ, t, θ) and the priors p(τ, t) and p(θ) occur in traditional bayesian models that fix the alignment. however, the alignment prior p(a|τ, t, λ) and the prior on indel process parameters p(λ) are novel in the joint model, allowing for estimation and a natural way to handle uncertainty in a. to model the substitution process that specifies p(y|a, τ, t, θ), we assume that substitutions in each column of a occur independently and follow a continuous-time markov chain (ctmc) process [26]. under this process, letters at the root of the tree arise according to some distribution π. evolution then occurs independently along each branch of τ with rate matrix q. we restrict ourselves to reversible markov chains and use π as the equilibrium distribution of q. this makes the position of the root unidentifiable and so we use unrooted trees throughout this paper. ctmc models are in common usage for letters from nucleotide-, codon-, and amino acid-based alphabets. in contrast to nucleotide-based ctmc models, codon-based models group the three nucleotides in a codon into a single letter.
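the reversible ctmc substitution machinery can be sketched concretely. a minimal numpy example using the hky parameterization with hypothetical parameter values (not the paper's fitted values), exploiting reversibility to exponentiate q through a symmetric eigendecomposition:

```python
import numpy as np

# equilibrium frequencies and transition/transversion ratio (hypothetical values)
pi = np.array([0.3, 0.2, 0.2, 0.3])   # order: a, c, g, t
kappa = 2.0

# hky rate matrix: off-diagonal rate pi_j, multiplied by kappa for transitions
transitions = {(0, 2), (2, 0), (1, 3), (3, 1)}
q = np.zeros((4, 4))
for i in range(4):
    for j in range(4):
        if i != j:
            q[i, j] = pi[j] * (kappa if (i, j) in transitions else 1.0)
np.fill_diagonal(q, -q.sum(axis=1))
q /= -pi @ np.diag(q)        # scale to 1 expected substitution per unit time

# reversibility (pi_i q_ij = pi_j q_ji) lets us exponentiate via a symmetric
# eigendecomposition: s = d^(1/2) q d^(-1/2) is symmetric for d = diag(pi)
d_half = np.sqrt(pi)
s = (d_half[:, None] * q) / d_half[None, :]
w, v = np.linalg.eigh(s)

def transition_probs(t):
    """p(t) = exp(q t), recovered from the symmetric eigendecomposition."""
    return (v * np.exp(w * t)) @ v.T * d_half[None, :] / d_half[:, None]

p = transition_probs(0.1)
assert np.allclose(p.sum(axis=1), 1.0)   # rows are probability distributions
assert np.allclose(pi @ p, pi)           # pi is stationary
```

the detailed-balance check pi_i p_ij(t) = pi_j p_ji(t) also holds for every t, which is exactly why the root placement drops out of the likelihood and unrooted trees suffice.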
given the small number of substitutions that occur during the emergence of rapidly evolving pathogens, codon-based models are preferred over amino-acid-based models because they do not discard synonymous substitutions. codon-based models can also improve model efficiency over nucleotide-based models because the codon-based models can include non-independent nucleotide frequencies and rule out nonsense mutations [25]. codon-based models may also improve the accuracy of estimation by allowing the third-codon position to evolve at a higher rate. however, when the number of observed substitutions is low it may not be possible to estimate the non-synonymous to synonymous rate ratio ω, requiring researchers to fix ω to a previously estimated value. importantly, we note that codon-based models also affect the indel process by forbidding frameshift mutations and also indels that begin or end within a codon. while the former constraint is realistic for biologically active viruses, the latter constraint may force incorrect alignments at the nucleotide level, causing up to two misaligned residues per indel. this may result in a significant bias when the total number of substitutions is small. redelings and suchard [24] make the simplifying assumption that the alignment prior p(a|τ, t, λ) = p(a|τ, λ) (2) is independent of branch lengths. while this assumption implies that indels are equally likely to occur on each branch regardless of length, it trivially enforces that sequence length distributions φ on all nodes in τ remain the same. this is a necessary condition for constructing a reversible evolutionary hidden markov model (hmm) from pair-hmms along the branches of τ. reversibility substantially decreases implementation complexity. the assumption further allows us to avoid fragment-based pair-hmms that tend to separate indels by the average indel length, which is not necessarily biologically realistic.
here we develop an alignment prior p(a|τ, t, λ) that explicitly depends on branch lengths but retains equivalent sequence length distributions on all nodes of the tree. we begin construction of the extended model by briefly summarizing how the original indel model is constructed from a pairwise alignment distribution ν. we modify this construction to build the new indel model from a parameterized distribution ν t on pairwise alignments that corresponds to a divergence time t. we then describe a new pair-hmm which serves to generate ν t . finally we describe how to calculate posterior probabilities under this model. to describe our original multiple alignment model, we begin by noting that, given a topology τ, the multiple alignment a can be decomposed into a set of pairwise alignments a (b) along each branch b of the topology. this decomposition is possible because of the inclusion of felsenstein wildcard sequences at the internal nodes of τ. imposing an arbitrary distribution ν on each pairwise alignment a (b) independently yields a joint distribution over a. however, pairwise alignments on neighboring branches are not strictly independent because they both specify the length of the random sequence at the node they share. to handle this dependence, we first choose an arbitrary internal node in τ as the root; this imposes an orientation on each branch. we then label the sequence in each branch alignment a (b) that is closest to the root as the ancestral sequence and the other sequence as the descendant sequence. we sample the sequence length at the root from a distribution and draw the pairwise alignment a (b) for each branch b from ν conditional on the length of the ancestral sequence, proceeding down the tree from the root to the leaves. we note that the pairwise alignment distribution ν induces a sequence length distribution on each sequence in the pair it emits. 
to proceed, we require that the pairwise alignment distribution ν be symmetric under interchange of the two sequences in the pair. this implies that there is no preferred direction of evolution between the two sequences. it also implies that the sequence length distributions for the ancestral and descendant sequences are equal; we call this common distribution φ. if we set the root length distribution equal to φ, then we can write the multiple alignment prior as p(a|τ, λ) = [ π_b ν(a^(b)) ] / [ π_{i∈i} φ(l_i)^2 ], (3) where i represents the set of internal nodes in τ and l_i is the length of the sequence at internal node i [24]. note that in this expression the arbitrary root is not identifiable. unfortunately the parameters that characterize our original pairwise alignment distribution ν cannot vary from branch to branch without inducing unequal length distributions. we therefore propose a new pairwise alignment prior that maintains a fixed sequence length distribution φ even when the indel probability varies from branch to branch. to accomplish this aim, we assume that each sequence consists of a series of unbreakable fragments, as in the tkf2 model. the fragment lengths are geometrically distributed with continuation probability ε and minimum length 1. the number of fragments is uniformly distributed over the non-negative integers. following an ancestral fragment at one end of a branch, a geometric number of new fragments are inserted in the descendant with continuation probability δ(t). each ancestral fragment is deleted in the descendant with probability δ'(t) = δ(t)/(1 - δ(t)). following our previous model and the tkf models, insertions and deletions are equally likely. this model can be expressed as a symmetrical pair-hmm (figure 1), implying that alignments can be considered non-directed, since the probability does not change when ancestor and descendant sequences are interchanged. this contrasts with the tkf models that induce irreversible distributions on pairwise alignments.
a major advantage of this symmetry is that it is clear how to construct alignment models on an unrooted tree and leads to greater simplicity in model implementation and, arguably, decreased computation time. the model described here diverges from our previous model in that match fragments no longer contain only a single letter, but instead follow the same length distribution as gap fragments. this is represented graphically in the pair-hmm by the addition of a loop with non-zero weight ε from the match state (+/+) to itself. to facilitate dependence of the pairwise alignment distribution ν_t on t, we seek a natural relationship between δ(t) and t. we define λ as the indel rate per residue, scaled (like branch lengths) in expected substitutions per site; in the pair-hmm representation of the fragment-based indel model, δ'(t) is the probability of a fragment being inserted or deleted. we wish to re-parameterize the fragment model in terms of a per-residue indel rate; the probability of an indel occurring between two residues is (1 - ε)δ'(t). however, if we attempt to set (1 - ε)δ'(t) = λt_b, (4) then the probability δ'(t) can become greater than 1. we therefore move the factor of (1 - ε) into the time scale, such that δ'(t_b) = 1 - exp(-λt_b/(1 - ε)). (5) we note that equation (4) agrees with equation (5) to first order in λt_b and serves to connect fragment indel rates to per-residue indel rates. the product λt_b is in general << 1, so matching on higher order terms is unnecessary. the distribution ν_t naturally gives rise to two models. in the first model, denoted "fragments", we set ν^(b) = ν_λ for all b, making the probability of an indel independent of branch length again. in the second model, denoted as "fragments+t", we set ν^(b) = ν_{λt_b}, making the probability of an indel roughly proportional to branch length t_b. we now show that the sequence length distribution induced by ν_t is independent of t.
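the re-parameterization can be illustrated numerically. a minimal sketch with hypothetical values for λ and ε, using the saturating form δ'(t) = 1 - exp(-λt/(1 - ε)) as the bounded alternative to the naive linear choice:

```python
import numpy as np

eps = 0.5          # fragment-length continuation probability (hypothetical)
lam = 0.05         # per-residue indel rate (hypothetical)

# naive choice: (1 - eps) * delta'(t) = lam * t  =>  delta'(t) = lam * t / (1 - eps),
# which exceeds 1 once t > (1 - eps) / lam
t_big = 2 * (1 - eps) / lam
assert lam * t_big / (1 - eps) > 1.0

# bounded alternative with the factor (1 - eps) moved into the time scale
def delta_prime(t):
    return 1.0 - np.exp(-lam * t / (1 - eps))

# stays a valid probability for all t, and agrees with the naive choice
# to first order in lam * t
for t in [1e-4, 1e-3, 1e-2]:
    naive = lam * t / (1 - eps)
    assert 0.0 < delta_prime(t) < 1.0
    assert abs(delta_prime(t) - naive) < naive * 0.01  # within 1% for small lam*t
```

the loop verifies the first-order agreement claimed in the text; for realistic branch lengths λt_b << 1, the two parameterizations are numerically indistinguishable.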
the pairwise alignment distribution is a uniform distribution on the number of fragments, with each fragment being a match (+/+), insertion (-/+) or deletion (+/-) with probabilities 1 - 2δ(t), δ(t) and δ(t), respectively, and with exit measure (1 - δ(t)). this results in the following probability generating function for the length of either sequence in the pair-hmm: g(s) = (1 - εs)/(1 - s). therefore, the length distribution is independent of δ(t), and is uniform except for an anomaly at length 0. this allows us to specify a different value of δ(t) in the pair-hmm on each branch of the tree without affecting φ. defining l_1 and l_2 as the emitted sequence lengths from the pair-hmm, we note that p(l_1) has finite measure and that the distribution p(l_2|l_1) on l_2 is therefore proper. this implies that the posterior distribution of the joint model is proper because the distribution conditions on the observed leaf sequence lengths. we introduce a novel mcmc transition kernel that improves mixing between topologies and alignments. the new transition kernel uses the spr operator (figure 2) to propose new trees, but is extended to be alignment-aware. our previous approach used only nearest-neighbor-interchange (nni) operators to propose new trees [24]. this resulted in long convergence times and inefficient mixing when there were many taxa. the spr operator improves on this situation by proposing non-local topology rearrangements that would require several nni moves, and thus avoids several intermediates [27]. the new kernel resamples the alignment a along with the topology τ. in our framework, it is necessary to alter a when τ is altered because a specifies the homology of internal sequences and this homology may be inconsistent with the proposed topology. this happens when some column of a contains a letter that would be deleted and reinserted given the new topology.
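the claim that the induced length distribution does not depend on δ(t) can be checked symbolically. a minimal sympy sketch of the generating-function argument, using the per-fragment probabilities given above and assuming the geometric fragment-length pgf g(s) = (1 - ε)s/(1 - εs):

```python
import sympy as sp

s, delta, eps = sp.symbols('s delta epsilon', positive=True)

# pgf of a single fragment's length: geometric, continuation eps, minimum length 1
g = (1 - eps) * s / (1 - eps * s)

# a fragment contributes length to a given sequence unless it is a gap on that
# side: match (prob 1 - 2*delta) or same-side indel (prob delta) emit length with
# pgf g; the opposite-side indel (prob delta) emits length 0 (pgf 1)
per_fragment = (1 - 2 * delta) * g + delta * g + delta * 1

# improper uniform measure on the number of fragments n, exit measure (1 - delta):
# G(s) = sum_n per_fragment**n * (1 - delta) = (1 - delta) / (1 - per_fragment)
G = sp.simplify((1 - delta) / (1 - per_fragment))

# the generating function reduces to (1 - eps*s)/(1 - s): delta cancels entirely,
# and the coefficients 1, (1-eps), (1-eps), ... are uniform except at length 0
assert sp.simplify(G - (1 - eps * s) / (1 - s)) == 0
assert delta not in G.free_symbols
```

expanding (1 - εs)/(1 - s) as a power series gives coefficient 1 at length 0 and (1 - ε) at every positive length, matching the "uniform except for an anomaly at length 0" description in the text.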
after an spr tree proposal, we note that the alignment of the subset of sequences corresponding to taxa in the pruned subtree (figure 2, blue) must remain consistent because their phylogeny remains unchanged. likewise, the alignment of the other sequences (figure 2, green) must remain consistent because the phylogeny of that subset remains unchanged. however the alignment of the complete set of sequences may not be consistent. our solution to this problem involves collapsed gibbs sampling [28]: we integrate the posterior over a set c of alignments when proposing the new topology τ', as long as c is large enough to contain alignments consistent with both τ and τ'. then we sample a single alignment from the chosen collapsed point in proportion to its posterior probability. to satisfy detailed balance, the set c must be constructed so that it contains at least the current alignment a; full conditions under which this procedure satisfies detailed balance are described in the appendix. we now seek a set c that is large enough to contain alignments consistent with τ' and yet small enough for integration and sampling to be computationally feasible. unfortunately, integration over the set of all alignments is not practical, even if we constrain the alignment of leaf sequences to be constant. therefore, we fix parts of the alignment and collapse only the remaining portions. allowing only the three branch alignments adjacent to node o (figure 2) to vary will certainly allow an alignment consistent with τ'. this is therefore a loose constraint, which we call c3(a, τ, o). it requires an o(l^3) dynamic programming algorithm for integration and resampling. to decrease the order of the dynamic programming algorithm to o(l^1), we consider imposing the additional constraint, which we call c1(a, τ, o), that the alignments between the three nodes connected to o remain fixed. figure 2: the subtree-prune-and-regraft operator. (a) first a subtree (blue) and its associated node o are detached from the rest of the tree (green).
(b) the subtree is then regrafted into a different branch through its node o. in both (a) and (b), three branches connect to node o. the phylogeny relating sequences at the pruned nodes (blue) and the phylogeny relating sequences at the remaining nodes (green) do not change. therefore alignments within each of these sequence subsets can remain unchanged from (a) to (b). however, the constraint c1(a, τ, o) is very restrictive and may not include any alignments consistent with τ'. as an alternative, we propose to fix the alignment between sequences in the pruned subtree and the alignment between sequences in the remainder, but allow the alignment between the two groups of sequences to vary. this constraint, which we call c2(a, τ, o), requires an o(l^2) dynamic programming algorithm that is significantly more computationally efficient than an o(l^3) algorithm. note that we have demonstrated above that the alignment within the two subgroups of sequences remains consistent under an spr proposal. thus, the constraint set c2(a, τ, o) contains an alignment that is consistent with τ' as well as τ, making c2(a, τ, o) a useful constraint set for collapsed sampling. triplet models coalesce three adjacent nucleotide letters into a single triplet letter. the size of the triplet alphabet is therefore approximately the cube of the size of the singlet alphabet. the larger alphabet size allows a more complex substitution model such as the codon model of goldman and yang [25, m0]. triplet substitution models can prohibit stop codons, can make use of codon frequencies instead of nucleotide frequencies and can differentiate between synonymous and non-synonymous substitutions. triplet alphabets affect the alignment model as well as the substitution model by forcing indel lengths to be multiples of 3 singlet letters and by forcing indels to start and end between triplets. while the former is biologically realistic, the latter may not be.
we describe a method of comparing triplet with singlet models to assess how forcing indels to begin and end between codons affects model fit. to accomplish this, we first remove the substitution benefits of the m0 model listed above to focus solely on the effects of the triplet alignment process. we construct a triplet substitution model that generates the same likelihood as a singlet substitution model given the same alignment. traditionally, both models are reversible and have a rate matrix q = {q_xy} that is constructed from the equilibrium letter frequencies π and a symmetric exchangeability matrix s = {s_xy} in the following way: q_xy = s_xy (π_x)^(f-1) (π_y)^f for x ≠ y. the fraction f can vary from 0 to 1 but traditionally f is fixed to 1. the fraction specifies the relative importance of unequal conservation (f = 0) and unequal replacement (f = 1) in creating the equilibrium frequency distribution [29]. given a singlet nucleotide model with exchangeability matrix s^(s), we build a triplet model with exchangeability matrix s^(t) in the following fashion. each allowable substitution from triplet α to triplet β involves only one nucleotide substitution from nucleotide i to nucleotide j. we note that for branch lengths to agree between the singlet and triplet models, q^(t) must be scaled so that the expected substitution rate at equilibrium is 3 instead of the usual 1, because q^(t) measures changes of each of the three sites in the triplet. we use hky as the singlet model in our comparison because the hky × 3 model is identical to the m0 codon model with ω = 1, stop codons included, and independent nucleotide frequencies. we analyze two data examples to demonstrate the advantages of joint bayesian estimation. while both data sets come from related genes, they differ in their sequence lengths, number of taxa, and sequence characteristics. we select these datasets for their relative sparseness of phylogenetic information, typical of rapidly evolving pathogens.
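the scaling argument can be checked numerically. a minimal sketch (hypothetical hky parameter values) that lifts a singlet matrix to a 64-state triplet matrix in which an allowable substitution changes exactly one position and inherits the singlet rate, with stop codons retained as in the hky × 3 comparison; the unscaled triplet model then has equilibrium rate 3 rather than 1:

```python
import itertools
import numpy as np

# singlet hky model, scaled to 1 expected substitution per site (hypothetical values)
pi = np.array([0.3, 0.2, 0.2, 0.3])
kappa = 2.0
transitions = {(0, 2), (2, 0), (1, 3), (3, 1)}
q1 = np.zeros((4, 4))
for i in range(4):
    for j in range(4):
        if i != j:
            q1[i, j] = pi[j] * (kappa if (i, j) in transitions else 1.0)
np.fill_diagonal(q1, -q1.sum(axis=1))
q1 /= -pi @ np.diag(q1)

# triplet model: 64 states with independent position frequencies; an allowable
# substitution changes exactly one of the three nucleotides
triplets = list(itertools.product(range(4), repeat=3))
pi3 = np.array([pi[a] * pi[b] * pi[c] for a, b, c in triplets])
q3 = np.zeros((64, 64))
for m, alpha in enumerate(triplets):
    for n, beta in enumerate(triplets):
        diff = [p for p in range(3) if alpha[p] != beta[p]]
        if len(diff) == 1:
            q3[m, n] = q1[alpha[diff[0]], beta[diff[0]]]
np.fill_diagonal(q3, -q3.sum(axis=1))

# expected rate at equilibrium is 3, not 1: q3 counts changes at all three sites,
# so dividing by 3 makes branch lengths agree with the singlet model
rate = -pi3 @ np.diag(q3)
assert np.isclose(rate, 3.0)
```

this is why q^(t) must be rescaled before branch lengths estimated under the singlet and triplet models can be compared directly.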
thus, although the joint model makes full use of both indels and substitutions shared by descent, we do not expect to recover fully resolved trees. rather, we note substantial improvement over traditional, sequential methods. we first examine a data set drawn from siv, a non-human primate lentivirus. lentiviruses contain a single-stranded rna genome that is reverse transcribed into dna upon infection. the dna then inserts into the host genome before expression. reverse transcriptase is extremely error-prone, giving lentiviruses high mutation rates. the data set consists of 9 partial env sequences sampled from within a single macaque initially infected by injection with strain sivmac251 [31]. cheynier et al. (2001) have previously presented an alignment of these sequences as a typical example of phylogenetically informative indels in siv [11]. the env gene encodes glycoprotein gp160, which is split after translation to form the smaller glycoproteins gp120 and gp41. because gp120 and gp41 are displayed on the surface of mature virions, exposed to the host immune system, env tends to mutate more quickly than other siv genes through positive selection. from the data set, we remove a phylogenetically uninformative duplication in a single sequence because our model assumes insertions of random sequence but not duplications. all sequences then range in length from 57 to 69 nucleotides, with an alignment length of 69 nucleotides independent of the method used to compute the fixed alignment. the data set contains 10 variable sites and 6 informative sites under the clustal w alignment, 12 variable and 7 informative sites under the muscle alignment, and 11 variable and 6 informative sites under the map estimate from the joint model (table 1, -indel contribution). for a prior on ln κ, we assume a double-exponential distribution with median ln 2 and standard deviation . on ln λ, we assume a double-exponential distribution with median -5 and standard deviation .
for ε we assume an exponential distribution with mean 5 on the expected indel length. we assume a uniform distribution over the topology τ. on the branch lengths we assume an exponential distribution with mean μ, and on μ we assume an exponential distribution with mean 0.04. continuous parameter estimates under the joint model are as follows: κ has median 2.4 with a 95% bayesian credible interval of (1.64, 5.32). the median of ln λ is -3.4 and its 95% bci is (-4.99, -1.85). the median of ln ε is -0.71 with a 95% bci of (-1.12, -0.428). the mean branch length μ has posterior median 0.0178 with a 95% bci of (0.00854, 0.0368). to assess the usefulness of indel information and the importance of alignment ambiguity in phylogenetic inference, we compare the posterior topology distributions for the traditional sequential model, the joint model restricted to a fixed alignment, and the full joint model. we note that the joint model increases the number of resolved internal branches by 3, 2, and 2 at posterior probability (pp) > 0.9, > 0.95, and > 0.99, respectively, over the traditional model using the clustal w alignment. the joint model supports 4, 3, and 3 branches at these levels of posterior probability, and we depict the tree with branches supported at pp > 0.99 in figure 3. this increase in resolution is sensitive to the alignment estimation method. for example, the resolution increase changes to 0, 0, and 2 under the muscle alignment, and to 1, 2, and 2 under the joint map alignment. thus, even accounting for alignment uncertainty, we achieve an increase in phylogenetic resolution. at high posterior probabilities, indels become relatively more important because they are rarer than substitutions. we note that alignment ambiguity is significant in this data set. first, estimates under the traditional or restricted models are sensitive to the alignment method used (table 1).
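the counts of variable and parsimony-informative sites reported for the three alignments can be reproduced with a short helper. this is an illustrative sketch, not the authors' code, and it assumes one common convention: gaps are ignored, and a site is parsimony-informative if at least two residues each occur in at least two sequences.

```python
from collections import Counter

def count_sites(seqs, gap="-"):
    """Count variable and parsimony-informative columns of an alignment."""
    variable = informative = 0
    for col in zip(*seqs):  # iterate over alignment columns
        counts = Counter(c for c in col if c != gap)
        if len(counts) >= 2:
            variable += 1
            # informative: >= 2 residues, each in >= 2 sequences
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return variable, informative
```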
second, fixing the alignment under the joint model yields an increase in the number of supported branches if the alignment is fixed to the clustal w estimate or the joint map estimate, but a decrease if the muscle estimate is used. furthermore, the increased support when the clustal w alignment is used includes a branch that conflicts with the joint map model, and the conflicting branch is present in the guide tree. thus, ignoring alignment ambiguity can lead to exaggerated support for branches and bias towards the guide tree, especially when indel information is used. figure 4 displays a "gold" plot [24] to summarize the posterior alignment distribution of a under the full joint model. we observe a high level of alignment uncertainty. this is borne out by the observation of only 4 unique indels under the full joint model, while the clustal w alignment contains 5 indels. this difference is reflected in the lower estimate of λ under the full joint model and in the restricted models not using the clustal w alignment (table 1). figure 3 depicts the estimated trees (a) without indels and (b) with indels for taxa s1, s5, s9, s10, s11, s15, s16, s20, and ref (scale bar: 0.02). estimates of continuous parameters κ, ln λ, and ln ε are presented as a posterior median followed by a 95% bayesian credible interval. our second data set consists of comparatively longer sequences from hiv-1, a lentivirus closely related to siv. we consider a collection of 27 partial env gene sequences sampled serially at three time points from patient 1 reported by shankarappa et al [4]. we first analyze these data using the m0 codon model [25] to assess the importance of selection in this region (table 2). we use the same prior distributions on λ, ε, κ, τ, and t as in example 1. we additionally place a double-exponential distribution on ln ω with median 0 and standard deviation 0.1. in addition to the standard m0 model in which f is fixed to 1, we consider the case in which f is a random variable with a uniform prior distribution.
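parameter estimates throughout are reported as a posterior median with an equal-tailed 95% bayesian credible interval. a minimal helper for producing such summaries from mcmc draws (an illustrative sketch, not the authors' code; the quantile uses linear interpolation):

```python
import statistics

def _quantile(sorted_vals, q):
    """Linear-interpolation quantile over an ascending list."""
    idx = q * (len(sorted_vals) - 1)
    lo = int(idx)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (idx - lo) * (sorted_vals[hi] - sorted_vals[lo])

def posterior_summary(samples, level=0.95):
    """Posterior median and equal-tailed credible interval from MCMC draws."""
    vals = sorted(samples)
    tail = (1.0 - level) / 2.0
    return (statistics.median(vals),
            (_quantile(vals, tail), _quantile(vals, 1.0 - tail)))
```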
the posterior distribution of ω has median 0.996 and a 95% bci of (0.834, 1.20). this changes little when f is free. the estimated interval is quite close to the prior 95% bci of (0.84, 1.16), so we conclude that these data possess little information about ω. allowing ω to vary does not yield much benefit, and we henceforth consider only ω = 1. we also note that fixing f = 0.5 instead of the traditional value of 1 produces a decrease in marginal likelihood of 2 log units for the hky model and a substantial increase of 12 log units. we therefore assume f = 0.5 for the remainder of our analyses. figure 4 shows the siv alignment uncertainty plot. we annotate the joint maximum a posteriori alignment estimate to indicate the approximate probability that each letter aligns to the root taxon in its column [24]. the 8 gaps in the alignment are a result of only 4 indel events under the joint model, whereas the clustal w alignment requires at least 5 indel events. colors other than red indicate that letters or gaps may shift to adjacent positions. the high frequency of the caa triplet is partially responsible for the level of alignment uncertainty. the annotated sequences are:
aaatcatcaacaacaacaacaacagcatcaacaacacc------aacatcaacaaagtcaataaacatg
s10 aaaccatcaacaacaacaacaacagcatcaacaacacc------aacatcaacaaagtcaataaacatg
s11 aaatcatcaacaataacaacaacagcaccaacaacaccaaatacaacatcaacaaagtcaataaacatg
s15 aaatcatcaacaacaacaacaacag---------caccaaatacaacatcaacagagtcaataaacatg
s16 aaatcatcaacaacaac---aacagcaccaacaccaacaaacacaacatcaacaaagtcaataaacatg
s20 aaatcatcaacaacaac---aacagcaccaacaccaacaaacacaacatcaacaaagacaataaacatg
s5 aaatcatcaacaacaacaacaacaa---------caccaagtacaacatcaacaaagtcaataaacatg
s9 aaatcatcaacaacaacaa---cac---------caccaagtacaacatcaacaaagtcaataaacatg
under the hky model we find that κ has a posterior median of 7.2 with a 95% bci of (4.6, 11.7). the posterior median of ln λ is -3.3 with a 95% bci of (-4.1, -2.7), and ln ε has a median of -1.0 and a 95% bci of (-1.3, -0.78).
this estimate of ε corresponds to a mean indel length of 1.58 nucleotides. the posterior median of μ is 0.0036 with a 95% bci of (0.00257, 0.00508). to examine the model appropriateness of forcing indels to begin and end between codons, we compared the marginal likelihoods and posterior tree lengths for the hky singlet and hky × 3 triplet models. under both models, we fixed f = 0.5 for equivalence and set independent nucleotide frequencies to their empirical estimates. the log marginal likelihood is -1555.7 ± 0.3 for the singlet model and -1579.8 ± 0.3 for the triplet model (table 2). to examine the substantial decrease of 24.1 log units between models, we calculate the posterior distribution of parsimony tree lengths under both models. the posterior median tree length is 104 substitutions with a 95% bci of (103, 106) for the singlet model and increases to 109 substitutions with a 95% bci of (108, 110) for the triplet model. to verify that this increase results from forcing indels out of phase, we first calculate the posterior distribution of the number of indels under the singlet model. the posterior mean number of indels is 11.0 and the bci is (11, 11). the posterior mean number of indels beginning 0, 1, or 2 nucleotides from the beginning of a codon is 2.6, 5.8, and 2.6, respectively. the 95% bci for the number of indels beginning inside a codon is (6, 10). inspecting the alignment estimate from the map point using a "gold" plot demonstrates alignment uncertainty (data not shown). in the map alignment we observe 11 indel events. only 5 of the indels are consistently present with unambiguous phase, and none of these indels can be placed between codons. interestingly, one indel of 3 nucleotides occurs independently in clades (16, 17), (19), and (22) according to the map estimate (figure 5).
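the correspondence between ε and mean indel length quoted above can be checked numerically. assuming a geometric length distribution with extension probability ε, so that the mean length is 1/(1 - ε), the reported ln ε values reproduce the stated means:

```python
import math

def mean_indel_length(ln_eps):
    """Mean of a geometric length distribution with extension probability
    eps = exp(ln_eps): P(length = k) = (1 - eps) * eps**(k - 1)."""
    eps = math.exp(ln_eps)
    return 1.0 / (1.0 - eps)

mean_indel_length(-1.0)  # ~1.58 nucleotides, matching the value in the text
```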
we note that augmented alignments such as those used in our model distinguish between indels shared by state and indels shared by descent through the inclusion of felsenstein wildcard sequences at internal nodes of τ. use of the triplet model instead of the singlet model has a discernible effect on phylogeny estimation. the posterior odds in favor of the clade (10, 12, 18) decrease by a factor of 8.0, from 24.6 to 3.1 (table 2). we note that in one column of the singlet map alignment estimate, only variants 10, 12, and 18 have an a residue, while other taxa have either a g residue or a gap (figure 5a). however, the triplet model shifts these gaps out of the column to avoid breaking a codon. taxa that contain a gap in this column under the singlet alignment contain a residues according to the triplet map alignment, decreasing the support for the (10, 12, 18) clade (figure 5b). thus, comparing marginal likelihoods for model selection between the singlet and triplet models may not provide the whole picture. triplet models have discernible effects on estimates of the indel parameters λ and ε, but little effect on the substitution parameters μ and κ. the hky × 3 model produces an estimate for ln ε that is significantly smaller than the hky estimate of -1.0. however, accounting for the fact that one triplet contains three nucleotides, the hky × 3 model predicts a mean indel length of 1.1 triplets and 3.2 nucleotides, whereas the hky model predicts a mean indel length of 1.6 nucleotides. this may be because a geometric distribution on the number of nucleotides in a gap does not fit the data as well as a geometric distribution on the number of triplets in a gap. this is especially true in data sets such as the present one, in which the number of triplets tends to be small. it may also be because the indel model used is fragment-based. finally, we note that the estimate of 7.5 for κ in the hky × 3 model is quite similar to the estimate of about 7.2 under the hky model.
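the factor-of-8 change in posterior odds reported above for clade (10, 12, 18) can be verified with the usual odds transform. these helpers are illustrative, not from the paper:

```python
def posterior_odds(p):
    """Odds in favor of a clade with posterior probability p."""
    return p / (1.0 - p)

def posterior_prob(odds):
    """Posterior probability corresponding to odds in favor."""
    return odds / (1.0 + odds)
```

odds of 24.6 and 3.1 correspond to posterior probabilities of about 0.961 and 0.756, and 24.6/3.1 is about 7.9, reported as a factor of 8.0.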
to assess how much indel information improves the resolution of the phylogeny, we again compare the posterior topology distributions under the joint and traditional models. while the number of branches supported at pp > 0.9 is equal, not all supported branches are the same. the number of branches supported only under the joint model is 2, 2, and 2. the joint model supports the clades (16, 17) and (21, 24) over the traditional model at all three levels of pp. the traditional model supports the clade (19, 21, 24, 25) at a pp of 0.980, compared to 0.887 with indel information. the traditional model also supports the clade (10, 12, 15, 16, 18, 19, 21, 24, 25) at pp > 0.9 that has support < 0.5 when indel information is included. this occurs because the large clade conflicts with the clade (16, 17) that is supported by two shared indels. thus, the number of branches supported in only one of the two models at each level of pp is 4, 3, and 2. since the joint model balances substitution and indel information as well as taking alignment ambiguity into account, we assume that these differences represent an improvement in the accuracy of estimation. however, because the true tree is not observed, we cannot be certain which, if any, of the predictions is correct. the partitions supported under the two models at pp > 0.99 are depicted in figure 6. in summary, indel information conflicts with one branch in the substitution-only tree and down-weights the evidence for another branch. the conflicting branch is ruled out by the support of 2 shared indels for the clade (16, 17), although one of these is homoplastic. we demonstrate that the novel mcmc transition kernel introduced in the sub-section sampling improves the computational efficiency of topology estimation when using indel information. figure 5 shows that triplet alignments may shift indels and cause misaligned residues: (a) maximum a posteriori (map) alignment estimate under the singlet hky model; (b) map alignment estimate under the triplet hky × 3 model. in the triplet alignment, two g residues (blue) and four a residues (red) are forced into a different column to avoid breaking the alignment-wide reading frame. the displaced a residues join a residues from strains 10, 12, and 18 (green), which were previously the only a residues in that column. under both models, the map alignment estimates display 8 gaps. the alignment of internal sequences (not shown) indicates that these gaps arose from 5 indel events on branches partitioning clades (20, 23), (21, 24), (16, 17), (19), and (22). thus, the gaps in sequences 19 and 22 arose independently of the gap in (16, 17) even though they have the same length and position. prefixes on sequence names indicate elapsed time in weeks between the initial infection and when the sequences were obtained. the transition kernel improves the convergence properties of the markov chain substantially, so that fewer initial samples must be discarded as "burn-in". we compare the behavior of the estimation procedure when the new transition kernel is disabled (nni-only) or enabled (nni+spr) by running 15 instances of each chain starting from a randomly chosen tree and alignment. we use the data set from example 2, which consists of 27 hiv sequences with a maximum length of 612 nucleotides. to assess convergence for each markov chain, we count the number of iterations required for the sampled tree topology to approach its equilibrium distribution of tree topologies. to accomplish this task, we need to define a distance from a single tree topology to a distribution of tree topologies. we start with the robinson-foulds distance (rf) between two tree topologies, which we denote as d_rf(τ1, τ2). this distance does not depend on branch lengths.
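the robinson-foulds distance can be computed from the nontrivial splits of each tree. a small illustrative sketch, not the authors' implementation, with trees represented simply as collections of clades (sets of taxon labels):

```python
def bipartitions(tree, taxa):
    """Nontrivial splits of a tree given as a collection of clades.

    Each clade induces the split (clade, taxa - clade); splits are stored as
    frozensets of frozensets so the two orientations compare equal.
    """
    splits = set()
    for clade in tree:
        if 1 < len(clade) < len(taxa) - 1:
            splits.add(frozenset([frozenset(clade), frozenset(taxa - clade)]))
    return splits

def rf_distance(tree1, tree2, taxa):
    """Number of splits present in exactly one of the two trees."""
    return len(bipartitions(tree1, taxa) ^ bipartitions(tree2, taxa))
```

note that some definitions halve this symmetric-difference count; only the relative scale matters for the convergence analysis here.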
we then define the distance d(τ1, ξ) from a topology τ1 to a distribution of topologies ξ as the average rf distance between τ1 and a tree τ2 ~ ξ: d(τ1, ξ) = e[d_rf(τ1, τ2)], where τ2 ~ ξ. the expectation of d(τ1, ξ) does not converge to 0 as the markov chain approaches stationarity; rather, the expectation approaches the average distance between two trees sampled from the equilibrium topology distribution. with this in mind, we consider a chain to have converged when the distance from the chain's current topology to the equilibrium distribution reaches the lower 25th percentile of distances from trees at stationarity to the equilibrium distribution. we approximate the equilibrium topology distribution with 200 topologies sampled at widely spaced intervals from a long-running mcmc analysis. we find that this distribution is not sensitive to the starting point of the markov chain, and does not change when the new transition kernel is enabled. without the new transition kernel based on spr, the median time to convergence is 2112 iterations, with an average of 1976.9. however, when the new transition kernel is enabled, the median time decreases to 66 iterations. to visualize the convergence properties of the two approaches, we project the tree samples from two typical chains into the plane using metric multidimensional scaling based on their rf distances (figure 7). some researchers question the ability of indel information to improve phylogenetic resolution of recently diverged taxa. golenberg et al. analyze non-coding spacer regions between chloroplast genes in a parsimony framework and claim that indels shared by state recur more often than substitutions shared by state [32], leading to a concern that indels are not reliable characters for phylogenetic analysis. however, simmons and ochoterena find indels to be reliable markers with low levels of homoplasy [33].
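the convergence criterion above (comparing each sampled topology's average distance to an equilibrium sample against the lower 25th percentile of such distances) can be sketched with stand-ins: here integers and absolute difference play the role of topologies and the rf distance. an illustrative sketch, not the authors' code:

```python
from statistics import fmean

def dist_to_distribution(tree, sample, rf):
    """Monte Carlo estimate of d(tau, xi): average rf distance from one
    topology to trees sampled from the equilibrium distribution."""
    return fmean(rf(tree, t) for t in sample)

def convergence_threshold(sample, rf):
    """Lower 25th percentile of distances from equilibrium samples to the
    equilibrium distribution itself."""
    dists = sorted(dist_to_distribution(t, sample, rf) for t in sample)
    return dists[int(0.25 * (len(dists) - 1))]

def burn_in(chain, sample, rf):
    """First iteration whose topology is at least as close to equilibrium
    as the threshold; the chain is considered converged from there on."""
    thresh = convergence_threshold(sample, rf)
    for i, tree in enumerate(chain):
        if dist_to_distribution(tree, sample, rf) <= thresh:
            return i
    return len(chain)
```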
this contrast is partially explained by noting that the original golenberg study incorrectly codes overlapping gaps of different lengths as homologous, leading to false homoplasy. improved methods of coding indels when gaps overlap can lead to more accurate and more informative indel characters [33, 34] . in addition, researchers note that chloroplast intergenic spacers contain indel "hotspots" and that sequence duplications or changes in the number of tandem repeats occur at a significantly higher rate than non-repeat indels [32, 10] . this high rate can lead to identical but non-homologous insertions in different taxa, and so repeat indels experience higher homoplasy than non-repeat indels [9] . repeat indels should therefore be down-weighted, but unfortunately an appropriate weighting scheme has not yet been developed [35] . we also note that current alignment algorithms do not recognize duplications or indel hotspots, so that automatic alignments must be adjusted manually. despite these difficulties with repeat indels, researchers have examined intergenic spacers in various plant species using improved indel coding and find that indel information is consistent with substitution information and largely reinforces it, improving phylogenetic resolution and support [9, 10] . in some analyses, indels are useful only in distinguishing larger groups [36] . despite the utility of indels in phylogeny estimation, most researchers note difficulties in indel coding that result from alignment ambiguity [35] . this can be true even when the number of substitutions is too small to yield well-resolved phylogenies. while alignment ambiguity causes general problems with gap placement, some specific problems are worthy of mention. for example, aligning insertions of questionable homology may create spurious evidence for common ancestry [35] . also, when the number of tandem sequence repeats decreases, it is unclear which repeat has been deleted. 
resolving these ambiguities to yield a single alignment can increase the support for some trees while decreasing the support for others, leading to bias, and so regions whose homology is uncertain should be thrown out [35]. the joint estimation approach we advocate sidesteps many of these issues by assigning uncertainty to alignments and to indel existence and placement. although the indel model described here improves on common multiple alignment algorithms by allowing indels to be shared by descent, it has some limitations. first, the model assumes that the indel rate is spatially homogeneous. however, biological sequences contain indel "hotspots" where indels are more likely to occur, as well as invariant regions where indels are prohibited. incorrectly accounting for the rates at which indels occur in different regions can lead to over-weighting of the indel evidence. clustal w attempts to place indels in hydrophilic regions of amino acid sequences, but does not have a mechanism for locating hotspots in non-coding sequences or hotspots resulting from weak selection or positive selection. second, the indel model makes the common assumption that residues in a single sequence are never homologous. duplications violate this assumption and are treated as insertions of random sequence by the indel process. third, changes in the number of tandem repeats of a short sequence often occur at a higher rate than other indels via slipped-strand mispairing (ssm). however, no commonly used alignment program accounts for within-sequence homology or ssm. an improved stochastic process model that accounts for these properties of biological sequences is highly desirable in order to accurately weight shared indel evidence and to produce both more accurate alignments and phylogenies. we extend the joint bayesian estimation framework of redelings and suchard [24] for recently diverged siv and hiv sequences to incorporate indel information into phylogeny estimates.
in both examples, the use of indel information increases the number of supported bipartitions even though the branch lengths are small, especially at high posterior probabilities. while many indels in these data sets occur in a single taxon or on a branch supported by many substitutions, some indels occur on branches with few or no substitutions. the relative weight of indels and substitutions shared by descent is specified by the relative rate λ estimated from the data. this offers an improvement over existing methods that force the relative weight to be set a priori. alignment uncertainty is significant in the siv data set. this uncertainty is illustrated by the fact that the topology distribution under the traditional model varies significantly depending on the choice of alignment (table 1). the joint estimation framework does not suffer from this sensitivity to alignment choice and allows alignment uncertainty to be estimated (figure 4). figure 7 illustrates that the alignment-aware spr transition kernel decreases burn-in time. we consider the 27-sequence data set of hiv sequences described in the results section as example 2. points represent 200 topologies sampled from markov chains with the alignment-aware spr transition kernel disabled (red; nni-only) or enabled (blue; nni+spr), or from the equilibrium distribution (green). while the convergence time for markov chains varies widely, this example illustrates the median convergence time. the nni-only chain takes 2112 iterations to converge versus only 66 iterations for the nni+spr chain. because the convergence times are so different, the figure depicts every 10th tree for the first 2000 iterations of the nni-only chain, whereas for the nni+spr chain the figure depicts every 2nd tree for the first 400 iterations. points represent trees projected onto the plane using multidimensional scaling based on the robinson-foulds distance. this distance depends only on the topology, not the branch lengths. we note that including indel information in analyses exaggerates the bias that results from fixing a single alignment choice. the high level of alignment uncertainty in the siv data set is partially explained by a large number of occurrences of the triplet caa. we note that in the hiv data set alignment uncertainty does not significantly affect the topology posterior. models such as m0 assume that codons are unbreakable, but the hiv data set shows that this can be unrealistic. forcing indels to codon boundaries results in a decrease in model fit of 24.1 log units because of an increase in the number of inferred substitutions. thus, choosing a codon model over a singlet model involves a tradeoff between a substantially improved substitution model and a possibility of incorrect homology in the alignment. because the effects of the latter can be significant when the total number of substitutions is small, we welcome the development of an improved substitution model that does not force this tradeoff. such a substitution model would be able to calculate the likelihood of a singlet alignment while making use of codon frequencies and differentiating between synonymous and non-synonymous changes. we begin by considering a probability distribution π(x) on points x ∈ ω and a function f(x) that associates a subset of ω to each point x ∈ ω. we call f(x) a collapsing function if for any x and y in ω we have x ∈ f(x), and f(x) and f(y) are either identical or disjoint. if f is a collapsing function, then it partitions ω into a set of non-overlapping subsets, which we refer to as collapsed points. we denote the set of collapsed points as f(ω), and note that the probability π*(f(x)) of each collapsed point f(x) can be naturally defined as the integral of the probabilities π(y) of points y ∈ f(x).
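the collapsing-function definition can be made concrete on a small discrete space. in this illustrative sketch (the names are ours, not the paper's), points collapse by parity, and π* is obtained by summing point probabilities within each collapsed point:

```python
from collections import defaultdict

def collapsed_distribution(points, pi, f):
    """Probability pi*(f(x)) of each collapsed point, obtained by summing
    pi(y) over the points y that collapse together."""
    pi_star = defaultdict(float)
    for x in points:
        pi_star[f(x)] += pi[x]
    return dict(pi_star)

def is_collapsing(points, f):
    """Check the two collapsing-function properties: x is in f(x), and any
    two images are either identical or disjoint."""
    if any(x not in f(x) for x in points):
        return False
    images = [f(x) for x in points]
    return all(a == b or not (a & b) for a in images for b in images)
```

because the collapsed points partition the space, the summed probabilities always form a probability distribution on collapsed points.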
because the collapsed points are disjoint sets, these probabilities sum to 1 and yield a probability distribution on collapsed points. we then consider a transition kernel p on ω that is defined in terms of a transition kernel p* on f(ω). starting from the current point x, this transition kernel consists of collapsing x to f(x), moving to some other collapsed point a, and then selecting a point y from a in proportion to its probability π(y). we note that y ∈ a implies a = f(y), and write the probability expression for this transition kernel as p(x, y) = p*(f(x), f(y)) × π(y) / π*(f(y)). the condition for p to satisfy detailed balance, π(x) × p(x, y) = π(y) × p(y, x), becomes, by cancelling common terms, π*(f(x)) × p*(f(x), f(y)) = π*(f(y)) × p*(f(y), f(x)). (13) thus, the requirement for p to satisfy detailed balance on ω is simply that p* satisfies detailed balance on f(ω). we now demonstrate that the function f that maps (a, τ, t, θ, λ) to the set {(a', τ, t, θ, λ) : a' ∈ c 2 (a, τ, po)} is a collapsing function. the directed branch po partitions the nodes of τ into two subsets excluding node o (figure 2). set c 2 contains all alignments that are consistent with a on each of the two subsets. alignment a certainly fulfills this criterion, and therefore a ∈ c 2 (a, τ, po), implying that x ∈ f(x) for any x. in addition, c 2 (a', τ, po) = c 2 (a, τ, po) for any a' in c 2 (a, τ, po), and so f(y) = f(x) for any y ∈ f(x), implying that f(x) and f(y) are either identical or non-overlapping. therefore f is a collapsing function. the transition kernel consisting of spr proposals for points collapsed using c 2 (a, τ, po) therefore satisfies detailed balance, because we use the mh rule for acceptance or rejection and mh satisfies detailed balance on the collapsed points. our method for sampling alignments samples from a distribution η that approximates the correct distribution π but does not match it exactly [24]. we therefore define an mh transition kernel that uses collapsed sampling of alignments as a proposal distribution ρ.
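the lifted kernel p(x, y) = p*(f(x), f(y)) π(y)/π*(f(y)) and its detailed-balance property can be checked numerically. an illustrative sketch, not the authors' implementation, using a two-group collapse of four points:

```python
import numpy as np

def expand_kernel(pi, groups, P_star):
    """Lift a kernel P* on collapsed points to the full space via
    p(x, y) = P*(f(x), f(y)) * pi(y) / pi*(f(y))."""
    n = len(pi)
    label = np.empty(n, dtype=int)
    for g, members in enumerate(groups):
        for x in members:
            label[x] = g
    pi_star = np.array([pi[list(m)].sum() for m in groups])
    P = np.zeros((n, n))
    for x in range(n):
        for y in range(n):
            P[x, y] = P_star[label[x], label[y]] * pi[y] / pi_star[label[y]]
    return P
```

if p* satisfies detailed balance with respect to π*, the lifted kernel satisfies detailed balance with respect to π, mirroring equation (13).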
after selecting a new topology and an alignment that goes along with it, we reject this new point j and move back to the original alignment and topology i with a small probability 1 - α_ij. the mh acceptance ratio can be calculated as follows: the ρ_ij satisfy detailed balance with respect to another probability η_i = π_i f_i, so that ρ_ji / ρ_ij = η_i / η_j = (π_i f_i) / (π_j f_j).
references
1. origin of hiv-1 in the chimpanzee pan troglodytes troglodytes
2. the causes and consequences of hiv evolution
3. integrating ambiguously aligned regions of dna sequences in phylogenetic analyses without violating positional homology
4. consistent viral evolutionary changes associated with the progression of human immunodeficiency virus type 1
5. phylogenetic analysis of polyomavirus simian virus 40 from monkeys and humans reveals genetic variation
6. timing and reconstruction of the most recent common ancestor of the subtype c clade of human immunodeficiency virus type 1
7. evidence that eukaryotes and eocyte prokaryotes are immediate relatives
8. rare genomic change as a tool for phylogenetics
9. the evolution of the atpβ-rbcl intergenic spacer in the epacrids (ericales) and its systematic and evolutionary implications
10. molecular evolution of insertions and deletions in the chloroplast genome of silene
11. insertion/deletion frequencies match those of point mutations in the hypervariable regions of the simian immunodeficiency virus surface envelope gene
12. long-term follow-up study of core gene deletion mutants in children with chronic hepatitis b virus infection
13. in vivo dynamics of equine infectious anemia viruses emerging during febrile episodes: insertions/duplications at the principal neutralizing domain
14. epidemiology consortium: molecular evolution of the sars coronavirus during the course of the sars epidemic in china
15. reassortment and insertion-deletion are strategies for the evolution of influenza b viruses in nature
16. alignment-ambiguous nucleotide sites and the exclusion of systematic data. molecular phylogenetics and evolution
17. elision: a method for accommodating multiple molecular sequence alignments with alignment-ambiguous sites
18. freeing phylogenies from artifacts of alignment
19. frequency of insertion-deletion, transversion, and transition in the evolution of 5s ribosomal rna
20. iterative pass optimization of sequence data
21. the order of sequence alignment can bias the selection of tree topology
22. an evolutionary model for maximum likelihood alignment of dna sequences
23. inching towards reality: an improved likelihood model of sequence evolution
24. joint bayesian estimation of alignment and phylogeny
25. a codon-based model of nucleotide substitution for protein-coding dna sequences
26. mathematical and statistical methods for genetic analysis
27. subtree transfer operations and their induced metrics on evolutionary trees
28. monte carlo strategies in scientific computing
29. a novel use of equilibrium frequencies in models of sequence evolution
30. dating of the human-ape splitting by a molecular clock of mitochondrial dna
31. wain-hobson s: antigenic stimulation by bcg vaccine as an in vivo driving force for siv replication and dissemination
32. evolution of a noncoding region of the chloroplast genome
33. gaps as characters in sequence-based phylogenetic analyses
34. incorporating information from length-mutational events into phylogenetic analysis
35. the evolution of the non-coding chloroplast dna and its application in plant systematics
36. indel patterns of the plastid dna trnl-trnf region within the genus poa (poaceae)
37. clustal w: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position specific gap penalties and weight matrix choice
38. muscle: multiple sequence alignment with high accuracy and high throughput
we would like to thank vladimir minin for many helpful discussions. b.d.r. is supported by nsf training grant dge9987641 and nih training grant gm008185. m.a.s. is supported in part by nih grant gm068955, the ucla aids institute and the james b.
pendleton charitable trust. therefore the acceptance ratio is α_ij = min(1, f_i / f_j). the distribution f_i is proportional to the product of the length distributions on the internal nodes and changes very slowly in i. therefore the ratio is usually quite close to one and there are few rejections. to assess alignment ambiguity, we compare the posterior topology distribution for the full joint model to the distributions generated under models restricted to a fixed alignment. as these distributions may be sensitive to the specific alignment chosen, we use three different choices. these alignments are the estimates obtained from clustal w [37], muscle [38], and bali-phy [24]. in the latter case, we fix the alignment to its maximum a posteriori (map) point determined jointly. we use the default parameters for clustal w and muscle. parameters and models used by bali-phy are described in the results section. the inference method described in this paper and implemented in the bali-phy software [24] requires significant computation time in order to handle alignment uncertainty and incorporate indel information. this means that it is often impractical to analyze data sets with greater than 12 taxa or sequence lengths longer than about 750 letters (nucleotide, amino acid, or codon). analyzing data sets of this size often takes about a week on current hardware. however, we wish to emphasize two points. first, the long computation time is not required to make a simple estimate, but to obtain measures of confidence that are accurate enough to publish. for simple estimates or unpublished results, significantly larger data sets can be analyzed. second, the amount of time required to analyze a data set depends not just on the size of the data set, but on various characteristics such as the level of uncertainty. for example, the second example in this paper contains 27 taxa of maximum length 612 and took about 3 weeks to analyze. ms formulated the problem and provided project management.
br designed the algorithms and models. br performed the actual programming and computations. ms and br analyzed the data. ms and br wrote the paper. all authors read and approved the final manuscript.
key: cord-289542-u86ujtur authors: razavian, narges; major, vincent j.; sudarshan, mukund; burk-rafel, jesse; stella, peter; randhawa, hardev; bilaloglu, seda; chen, ji; nguy, vuthy; wang, walter; zhang, hao; reinstein, ilan; kudlowitz, david; zenger, cameron; cao, meng; zhang, ruina; dogra, siddhant; harish, keerthi b.; bosworth, brian; francois, fritz; horwitz, leora i.; ranganath, rajesh; austrian, jonathan; aphinyanaphongs, yindalon title: a validated, real-time prediction model for favorable outcomes in hospitalized covid-19 patients date: 2020-10-06 journal: npj digit med doi: 10.1038/s41746-020-00343-x sha: doc_id: 289542 cord_uid: u86ujtur
the covid-19 pandemic has challenged front-line clinical decision-making, leading to numerous published prognostic tools. however, few models have been prospectively validated and none report implementation in practice. here, we use 3345 retrospective and 474 prospective hospitalizations to develop and validate a parsimonious model to identify patients with favorable outcomes within 96 h of a prediction, based on real-time lab values, vital signs, and oxygen support variables. in retrospective and prospective validation, the model achieves high average precision (88.6% 95% ci: [88.4–88.7] and 90.8% [90.8–90.8]) and discrimination (95.1% [95.1–95.2] and 86.8% [86.8–86.9]) respectively. we implemented and integrated the model into the ehr, achieving a positive predictive value of 93.3% with 41% sensitivity. preliminary results suggest clinicians are adopting these scores into their clinical workflows.
introduction. covid-19 has created a public health crisis unseen in a century.
as of july 30th, 2020, worldwide cases exceed 17 million and deaths have surpassed 667,000, with over 143,000 deaths occurring in the united states alone 1 . new york emerged as an early epicenter, and the increase in case burden strained the healthcare system. although new york's initial surge has passed, the number of infections continues to increase worldwide 2 . the significant impact of covid-19 is likely to persist until herd immunity is achieved, effective therapies are developed, or a vaccine is broadly implemented. faced with a novel disease with complex multi-organ manifestations and an uncertain disease progression course, frontline clinicians responded by sharing anecdotal management practices among peers. yet collective expert opinion is suboptimal and susceptible to selection and cognitive biases. epidemiologic studies partially address these challenges 3 , but they do not provide targeted information for individual patients at the point of care. machine learning methods are uniquely positioned to rapidly aggregate the collective experiences of thousands of patients to generate tailored predictions for each patient. as a consequence, these methods have great potential to augment covid-19 care. to be effective, solutions involving machine learning must 4,5 (1) address a clearly defined use case that clinical leaders will champion and (2) motivate changes in clinical management based on model predictions. during the covid-19 pandemic, the operational needs of frontline clinicians have rapidly shifted. for example, early in the pandemic (with testing in short supply), predicting which patients likely had covid-19 before a test result was available had great importance for triage and cohorting. as the availability and speed of testing progressed, this use case became obsolete at our center.
similarly, while predicting deterioration is clinically important, our health system had already implemented a general clinical deterioration predictive model and did not have an immediate use case for a covid-19-specific deterioration model 6 . furthermore, since intensive care unit (icu) beds were already limited to patients in immediate need of requiring higher levels of care, predicting future needs would not dramatically change clinical management. after collaboration with clinical leaders, we selected identification of patients at the lowest risk of adverse events-i.e., those predicted to have favorable outcomes-as a primary focus. this prediction task fulfills each of the requirements listed above, as handling the surge of covid-19 patients with a limited bed capacity was a critical challenge faced by many hospitals. discharging patients safely to free up beds for incoming patients is optimal as it does not require expanding human (e.g. nursing/ physician) or structural (beds/medical equipment) resources. given clinical uncertainty about patient trajectories in this novel disease, accurate predictions could help augment clinical decision making at the time the prediction is made. finally, clinical leaders overseeing inpatient units committed to support the adoption of the prediction model. as of the time of writing, at least 30 peer-reviewed papers describing prognostic covid-19 models have been published . these models use variables including patient demographics, clinical values, and radiographic images to predict adverse events, including severe pneumonia, intubation, transfer to icu, and death. most models use more than one variable and most models predict composite outcomes. of the 30 models, 23 were trained on patients in china 7-29 , 2 were trained on patients in the united states 30, 31 , and 5 were trained on patients in south korea 32 or europe [33] [34] [35] [36] . 
only 8 of the models underwent validation on either held-out or external datasets 7, 10, 14, 18, 19, 24, 26, 36 , and 1 underwent prospective validation 17 . no model predicted favorable outcomes and no studies reported clinical implementation. in this article, we describe how a collaboration among data scientists, electronic health record (ehr) programmers (vendorand health system-based), clinical informaticians, frontline physicians, and clinical leadership led to the development, prospective validation, and implementation of a machine learning model for real-time prediction of favorable outcomes within a 96 h window among hospitalized covid-19 patients. our approach differs from prior work in that we: (1) predict favorable outcomes (as opposed to adverse outcomes), (2) use a large covid-19 patient cohort admitted across our hospitals, (3) design a model that can easily be extended to other institutions, (4) prospectively validate performance, and (5) integrate our model in the ehr to provide a real-time clinical decision support tool. a retrospective cohort for model creation and validation included all covid-19 positive adults hospitalized at any of the four hospitals of our institution from march 3, 2020 through april 26, 2020. this cohort included a total of 3,317 unique patients and 3,345 admissions. these patients were largely white (44.6%) with an average age of 63.5 years (table 1 ). more men (61.6%) than women were included, consistent with other studies [37] [38] [39] . we defined a favorable outcome as absence of adverse events: significant oxygen support (including nasal cannula at flow rates >6 l/min, face mask or high-flow device, or ventilator), admission to icu, death (or discharge to hospice), or return to the hospital after discharge within 96 h of prediction (methods). patients could experience multiple adverse events during the course of their admission, e.g. requiring significant oxygen support before admission to the icu and death. 
almost half (45.6%) of patients required significant oxygen supporting devices (beyond nasal cannula) at some point during their stay and one fifth (20.3%) spent time in an icu. the all-time in-hospital mortality rate was 21.2%, with another 3.1% of patients being discharged to hospice. consistent with published literature [40] [41] [42] [43] [44] [45] [46] , we find that patients' admission laboratory values differ between those who do and do not go on to experience an adverse event during their hospitalization, e.g. lower lymphocyte percentage (12.6% among patients with an adverse event vs. 19.7% among patients without an adverse event). model development stage 1: blackbox model. four models (logistic regression; two 'blackbox' models, random forest and lightgbm; and an ensemble of these three models) were trained with all 65 variables (demographics, vital signs, laboratory results, o2 utilization variables, and length-of-stay up to prediction time) from all prediction instances (each time a cbc result becomes available) on a training set (a sample of 60% of retrospective cohort patients: 1990 unique patients, contributing 17,614 prediction instances). at every prediction instance, only variables prior to prediction time were utilized. after tuning the hyperparameters for each model via grid search and comparing the models, the best performance on the validation set (20% of the retrospective cohort: 663 unique patients, contributing 4,903 prediction instances) was achieved by a lightgbm model with the following hyperparameters: 500 decision trees, learning rate of 0.02, max of 5 leaves per tree, 0.5 sampling rate for features, max depth of 4 per tree, 1.0 l1 regularization and 2.0 l2 regularization, minimal gain to perform a split set to 0.05, and minimal sum of hessian in one leaf set to 5.0.
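the tuning loop described above (fit each hyperparameter combination on the training set, keep the one that scores best on the validation set) can be sketched generically; this is an illustrative stand-in and not the authors' pipeline, and `toy_evaluate` is a hypothetical scoring function taking the place of "train lightgbm, compute auprc on the validation set":

```python
from itertools import product

def grid_search(param_grid, evaluate):
    """score every hyperparameter combination and return the best
    (params, score) according to the validation metric."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = evaluate(params)  # e.g. fit on train, score auprc on validation
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# hypothetical stand-in for "fit the model, score it on the validation set";
# this toy metric happens to peak at learning_rate=0.02 with 5 leaves.
def toy_evaluate(p):
    return -abs(p["learning_rate"] - 0.02) - 0.01 * abs(p["num_leaves"] - 5)

grid = {"learning_rate": [0.01, 0.02, 0.05], "num_leaves": [3, 5, 31]}
best, best_score = grid_search(grid, toy_evaluate)
```

the same loop works unchanged for any model family, since the search only interacts with the black-box `evaluate` callback.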
model development stage 2: parsimonious model. using conditional independence tests, we obtained p-values for each variable (table 2) measuring conditional independence between the blackbox model predictions and that variable, conditioned on the rest of the variables. using a p-value threshold of 0.2, 16 features were selected. these features were combined into a final 'parsimonious' model as a logistic regression after quantile normalization of each variable (supplementary fig. 1). the relative magnitude of each final model coefficient (table 2) is proportional to its contribution to the final score. positive coefficients were associated with a favorable outcome, while negative coefficients were associated with a decreased likelihood of a favorable outcome. we then performed an ablation analysis to remove features in the linear model that did not improve performance. this analysis led to the removal of age, bmi, and maximum oxygen saturation (in the last 12 h). of the 13 features included in the linear model, the maximum value of nasal cannula oxygen flow rate (in the last 12 h) had a non-linear, u-shaped individual conditional expectation plot with a maximum at a value of 3 l/min and was therefore split into three binary indicators with cutoffs at 0 and 3. model performance was measured by discrimination (area under the receiver operating characteristic curve; auroc) and average precision (area under the precision-recall curve; auprc), assessed on a held-out set (independent from the training and validation sets) including 20% of the retrospective cohort: 664 unique patients, contributing 5914 prediction instances overall. the blackbox and parsimonious models achieved auprc of 90.3% (95% bootstrapped confidence interval [ci]: 90.2-90.5) and 88.6% (95% ci: 88.4-88.7) respectively, while maintaining an auroc of 95-96% (fig.
1a). deployment into the electronic health record. the final parsimonious model was implemented into our ehr to make predictions every 30 min for each eligible patient. predictions were split into three color-coded groups. the lowest-risk, green-colored group were those with a score above a threshold selected at 90% positive predictive value (ppv), 53% sensitivity within the held-out set (threshold = 0.817). the moderate-risk, orange-colored group were those patients with a score lower than green but above a second threshold corresponding to 80% ppv, 85% sensitivity (threshold = 0.583). the highest-risk, red-colored group were all remaining predictions. in the held-out set, these two thresholds separated all predictions into three groups in which favorable outcomes within 96 h are observed in 90.0% of green, 67.3% of orange, and 7.9% of red patients. prior to displaying the model predictions to clinicians, a team of medical students and practicing physicians assessed the face validity, timing, and clinical utility of predictions. a variety of patient types were reviewed, including 30 patients who had a green score, 8 of whom had left the icu and 22 who had not. overall, 76.7% (23 of 30) of the green predictions were labeled clinically valid, where the primary clinical team acknowledged the patient as low-risk or was beginning to consider discharge. timing of those green predictions either aligned with actions by the primary clinical team or preceded those actions by one or two days (a total of 34 days earlier, an average of 1.13 days). invalid green predictions typically had other active conditions unrelated to their covid-19 disease (e.g. untreated dental abscess), while those patients discharged as orange or red typically had pre-hospitalization oxygen requirements (e.g. bipap for obstructive sleep apnea).
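the operating thresholds above were chosen by targeting a positive predictive value on the held-out set; a minimal sketch of that selection and of the three-color banding follows, with toy scores and a toy target rather than the deployed 0.817/0.583 thresholds:

```python
def threshold_for_ppv(scores, labels, target_ppv):
    """lowest score threshold t such that predicting 'favorable' for all
    scores >= t attains ppv (precision) >= target_ppv; labels: 1 = favorable."""
    ranked = sorted(zip(scores, labels), reverse=True)
    threshold, tp, total = None, 0, 0
    for score, label in ranked:
        total += 1
        tp += label
        if tp / total >= target_ppv:
            threshold = score  # lowest cut seen so far that still meets the target
    return threshold

def color_band(score, green_thr, orange_thr):
    """map a score to the green / orange / red display groups."""
    if score >= green_thr:
        return "green"
    if score >= orange_thr:
        return "orange"
    return "red"

scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.4]
labels = [1, 1, 1, 0, 1, 0]
green_thr = threshold_for_ppv(scores, labels, 0.75)
```

in practice the green threshold would be picked at the stricter ppv target and the orange threshold at a looser one, exactly as the two deployed cutoffs were.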
for all patients in the held-out set discharged alive, 77.8% (361 of 464) had at least one green score, and their first green score occurred a median 3.2 (interquartile range: [1.4-5.4]) days before discharge. the vast majority of green patients who were discharged alive never received care in an icu (91.4%; 330 of 361). those that did receive icu care had a much longer length of stay before their first green score (fig. 2a) but, once green, they had a similar remaining length of stay before discharge (fig. 2b). [fig. 2 caption: time from the first green score to discharge; this analysis includes all held-out set patients with at least one green score who were discharged alive (n = 361) and stratifies that group into patients who received some of their care in an icu (n = 31) and those who received no icu care (n = 330).] the resulting scores, colors, and contributions populated both a patient list column viewable by clinicians and a patient-specific covid-19 summary report, which aggregates data important for care, including specific vitals, biomarkers, and medications. the core component of the visualization was a colored oval containing that patient's risk score (fig. 3). the column hover-bubble and report section displayed a visualization containing the colored score, a trendline of recent scores, and variables with their values and contributions (fig. 3). [fig. 3 caption: to reduce potential for confusion by clinicians, we display the inverse of the model prediction raw score (i.e. 1 - score) and scale the score from 0-100; consequently, lower scores represent patients at lower risk of adverse outcomes. negative feature contributions are protective. note that in the first prediction the variable "nasal cannula o2 flow rate max in last 12 h" has a value of "n/a" because the o2 device is greater than a nasal cannula.] after retrospective validation, the model parameters were fixed and the model was integrated into the ehr for real-time use.
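because the parsimonious model is linear, each displayed contribution is simply coefficient times (normalized) value, and the displayed score is the inverted probability rescaled to 0-100; a minimal sketch with invented coefficients and feature names (not the published model) is:

```python
import math

def predict_favorable(features, coefs, intercept):
    """logistic regression over normalized features; returns the probability
    of a favorable outcome plus per-feature contributions (positive
    contributions push toward 'favorable')."""
    contributions = {k: coefs[k] * v for k, v in features.items()}
    logit = intercept + sum(contributions.values())
    return 1.0 / (1.0 + math.exp(-logit)), contributions

def display_score(prob_favorable):
    """invert and rescale so that lower displayed scores mean lower risk."""
    return round((1.0 - prob_favorable) * 100)

# hypothetical coefficients for illustration only
coefs = {"eosinophil_pct": 0.8, "platelet_count": 0.5, "resp_rate_max": -1.2}
prob, contrib = predict_favorable(
    {"eosinophil_pct": 0.6, "platelet_count": 0.4, "resp_rate_max": 0.9},
    coefs, intercept=0.5)
```

the `contributions` dictionary is exactly what a hover-bubble explanation needs: each entry is attributable to one input, and their sum (plus the intercept) reproduces the logit.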
predictions were displayed to clinicians starting may 15, 2020. prospective performance was assessed using data collected from may 15 to may 28, 2020 (predictions until may 24 with 96 h follow-up). in those ten days, 109,913 predictions were generated for 445 patients and 474 admissions. among these prospectively scored patients, 35.1% (156) required significant oxygen support, 5.4% (24) required more than 6 l/min of oxygen while on nasal cannula, 7.2% (32) died, 2.2% (10) were discharged to hospice care, 19.8% (88) were transferred to the icu, and 1.8% (8) were discharged and readmitted within 96 h. overall, 44.0% (196 patients) experienced an adverse event within 96 h of a prediction instance, which is lower than the rate observed in our retrospective cohort (51.6%, 1712 of 3317, p = 0.003 by two-tailed fisher's exact test), consistent with prior reports from our institution showing a temporal improvement in outcomes 3 . prospective evaluation of the model achieved an auprc of 90.8% (95% ci: 90.8-90.8; fig. 4a) and auroc of 86.8% (95% ci: 86.8-86.9; fig. 4b), similar to retrospective performance (auprc: 88.6%, and auroc: 95.1%). using the predefined green threshold, the real-time model identified 41.0% of predictions as green with 93.3% ppv and 67.8% sensitivity (compared to 90% ppv and 53% sensitivity in the retrospective held-out set), and favorable outcomes were observed in 93.3%, 72.4%, and 23.5% of green, orange, and red predictions, respectively, consistently higher than in the retrospective held-out set (90.0%, 67.3%, and 7.9%). prospective validation results updated for the period may 15-july 28, 2020 are included in the supplementary note and supplementary fig. 2. adoption into clinical practice. since integration into the ehr, we monitored two high-level metrics to assess score adoption into clinical practice. the model predictions are visible in two places: multiple patients shown in a patient list column (fig.
3) and a single patient shown in a covid-19 summary report. a patient list column metric counts the number of times the model scores are shown in patient lists (not counting each patient displayed). a summary report metric counts the number of times a provider navigated to the covid-19 summary report to review data on a single patient. more specifically, during the three weeks may 16 to june 5, 2020 (omitting the partial day of may 15), scores were shown in a total of 1,122 patient lists and 3,374 covid-19 reports. temporal trends in these metrics suggest an increasing trend in the rate of patient lists per day but a decreasing trend in covid-19 reports (fig. 5). together, these metrics describe an adoption of users adding the patient list column, a result of outreach and communication to users, and a decline in the number of covid-19 reports accessed, which may be explained by a decline in the number of hospitalized covid-19 patients. future work will assess the impact of these scores on physician perspectives and decision-making. the covid-19 pandemic energized an existing inter-disciplinary collaboration at our institution to successfully develop a predictive model that was accurate and relevant for clinical care, could be rapidly deployed within our ehr, and could be readily disseminated to other institutions. the final parsimonious model exhibited strong performance for the clinical task (fig. 1) and could be maintained with only 14 of the original 65 variables combined in a logistic regression that is transparently explainable (fig. 3). yet model accuracy is not sufficient to ensure measurable success; the prediction must be clinically applicable at the time of prediction. we determined that our model predicts patients at high probability of favorable outcomes a median of 3.2 days before discharge (fig. 2b), providing sufficient lead time to commence and prepare for earlier and safer discharges.
our chart review results suggest the green transition occurs, in many cases, before any discharge planning is documented. by identifying patients at low risk of an adverse event with high precision, this system could support clinicians in prioritizing patients who could safely transition to lower levels of care or be discharged. by contrast, using published models that predict occurrence of adverse events to guide discharge decisions may not be as effective. the distinction between identification of patients at low-risk of experiencing an adverse event rather than those at high-risk is key. although the binary outcome of an adverse event or none is reciprocal, the methodology of tuning model hyperparameters to identify the best model and then selecting a threshold based on ppv is not. if the target outcome is reversed, we would expect our methodology to discover a different parsimonious model. the key strengths of our approach are twofold. first, a reduced variable set helps prevent overfitting by making it less likely that a machine learning model will learn site-specific details 47 . second, our approach is easily integrated into third-party ehr systems. collaborating with our clinical decision support (cds) experts, we incorporated our intervention directly into standard clinical workflows (fig. 5) : (1) the patient lists clinicians used when reviewing and prioritizing their patients, and (2) the standard report clinicians rely on to summarize covid-19 aspects of care. by incorporating the prediction at the appropriate time and place in the ehr for the users responsible for discharge decisions, we expect to maximize the impact of this intervention in the care of covid-19 patients 48 . although integration into an ehr system maximizes its impact and simplifies dissemination to other institutions, it also adds several significant constraints institutions must consider. 
potentially useful data available in retrospective data queries may not be reliably accessible in real time to make a prediction. for example, codified comorbidities and prior medications may be incomplete at the time of prediction, particularly for new patients who have never received care within the health system. therefore, only data collected during admission are suitable for generalizable modeling. extraction of complex features such as means is infeasible within the current ehr's computing platform. these data access challenges inside the ehr are part of the rationale behind our two-step model development that produces a parsimonious model reliant on a small number of inputs. despite the above constraints, the two-step methodology applied to construct the parsimonious model did reveal previously described 3 prognostic indicators of adverse events in covid-19 patients, including vital-sign abnormalities such as hypoxia, as well as c-reactive protein and lactate dehydrogenase (table 2). yet many features commonly associated with worsening prognosis, such as age, gender, lymphocyte count, and d-dimer, ultimately did not contribute to the final model. there are a variety of potential explanations for this apparent discrepancy. differences between patients with and without adverse events were observed for both neutrophil percent and lymphocyte percent (and their absolute counts; table 1), but the parsimonious model used only eosinophil percent, as the alternatives were not found to provide further information over eosinophils (table 2), reflecting probable redundancy between white blood cell biomarkers. both eosinophil percent and platelet count have positive coefficients (table 2), suggesting a positive association of immune characteristics 46 and thrombocytosis with fewer adverse outcomes. similar redundancy might also explain why lactate dehydrogenase and c-reactive protein contributed to the ultimate model while d-dimer and troponin did not.
while age and sex are marginally associated with adverse outcomes, neither contributes to the final model, suggesting other variables account for the variance in these demographics such that they no longer aid prediction. the reason these variables do not directly contribute is unclear. epidemiologic studies have been critical in helping clinicians understand this evolving disease entity and expedite predictive model development. yet the volume of clinical features associated with adverse events precludes easy assimilation by clinicians at the point of care. at our institution, a covid-19 specific summary report for each patient trends over 17 variables. the ability of machine learning to synthesize and weigh multiple data inputs facilitates more accurate application of the data to directly impact care. another advantage of our approach is that model explanations were made available to the clinicians along with real-time predictions. our parsimonious model, being linear, enabled a seamless computation of contributing factors. providing insight into contributing factors helps improve trust in the model and, we believe, will improve its incorporation into clinician decision making. these explanations also helped mitigate some inherent limitations of real-time models. for example, clinicians could discount the model's predictions if they found that some of the inputs, like respiratory rate, were documented inaccurately. similarly, the model could not discriminate between patients receiving bipap for chronic obstructive sleep apnea versus for acute respiratory failure. a clinician would have this background and could consider the model's score in that context. front-line clinicians continued to evolve their care for patients with covid-19 in response to research findings. particularly during the retrospective study period, march and april 2020, there were rapid changes in testing and treatment practices.
the data collected about a covid-19 patient in march is likely very different from a similar patient seen in the prospective cohort in late may 2020. for example, the volume of d-dimer values for patients increased dramatically from early march to april as clinicians incorporated d-dimer screening into their care plans. these expected differences in model variables and outcomes challenge the generalizability of any predictive model, which emphasized the importance of prospective validation. using oxygen therapy as both an input variable and an outcome measure led the model to learn that patients on o2 devices are likely going to continue to remain on o2 devices in the near future. consequently, the model coefficient for significant oxygen support overshadowed other variables and patients on significant o2 devices uniformly had very low favorable outcome scores. in consultation with our clinical leads, this model behavior was acceptable given that these patients on significant oxygen devices were clinically unlikely to be safe for discharge. furthermore, when excluding significant o2 support as an input variable or omitting periods of significant o2 support, the model performed worse overall and among patients not using o2 devices. thus, we retained this variable and analyzed the subset of patients without o2 devices separately, which demonstrated excellent performance (figs. 1c, d) . construction of the parsimonious model as a linear model also impacted how each variable's contribution was explained to the clinician. this constraint resulted in some explanations that were clinically concerning, like hypothermic temperatures displaying as a mildly protective feature (table 1 ). this phenomenon occurs because a linear model fits a linear slope to each variable and misses u-shaped risk curves. in summary, our model's predictions were accurate, clinically relevant, and presented in real time within the clinician's workflow. 
these features all enhance the likelihood that the model will be clinically successful. to assess our model's impact on clinically important outcomes, a randomized controlled trial is underway examining the effect of knowledge of the favorable outcome prediction on patient length of stay. with clinical value confirmed, we plan to collaborate with the vendor community to rapidly disseminate the model to other institutions. we followed the nyu grossman school of medicine irb protocol and completed an irb checklist for research activities that may be classified as quality improvement. this work met the nyu grossman school of medicine irb criteria for quality improvement work, not research involving human subjects, and thus did not require irb review; no informed consent was required or obtained. patients are defined as covid-19 positive (covid+) if they have any positive (detected) lab result for the sars-cov-2 virus by polymerase chain reaction (pcr) before or during their index admission. due to the rapidly evolving availability of tests and ordering practices, our definition includes sars-cov-2 pcr tests of patient sputum samples, nasopharyngeal swabs, or oropharyngeal swabs conducted by in-house or governmental laboratories. in-house testing started mid-march and produced fast results (median time from specimen to result was 2.4 h in april 2020) that enabled confirmation and inclusion of this group on their hospital day one. a favorable outcome is the absence of any adverse events. an adverse event was defined as the occurrence of any of the following within 96 h:
1. death or discharge to hospice;
2. icu admission;
3. significant oxygen support: a. mechanical ventilation, b. non-invasive positive-pressure ventilation (including bipap and cpap), c. high-flow nasal cannula, d. face mask (including partial and non-rebreather), or e. nasal cannula flow rate greater than 6 l/min;
4. if discharged, re-presentation to the emergency department or readmission.
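the favorable-outcome label (absence of any adverse event within 96 h of a prediction) reduces to a simple window check; a sketch with toy event times follows:

```python
from datetime import datetime, timedelta

def favorable_outcome(prediction_time, adverse_event_times, window_hours=96):
    """1 (favorable) iff no adverse event - death/hospice, icu admission,
    significant oxygen support, or re-presentation after discharge -
    occurs within window_hours after the prediction."""
    horizon = prediction_time + timedelta(hours=window_hours)
    return int(not any(prediction_time <= t <= horizon
                       for t in adverse_event_times))

t0 = datetime(2020, 4, 1, 8, 0)
label_with_icu = favorable_outcome(t0, [t0 + timedelta(hours=20)])  # icu at +20 h
label_clear = favorable_outcome(t0, [t0 + timedelta(hours=120)])    # event past the window
```

note that the same patient can be labeled differently at different prediction times, since the 96 h window slides with each prediction.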
each clinical event was mapped to structural fields in the ehr that are documented as part of routine clinical practice. the structural fields were then validated by ehr programmers and clinical informaticians against the target events. clinical leadership selected these events because their occurrence would indicate a patient who is unsafe for discharge. the events evolved with clinical care guidance. for example, early in the pandemic, certified home health agencies would not manage home oxygen for covid+ patients, so lower rates of oxygen supplementation with nasal cannula were considered adverse. as agencies evolved their practices, clinical leadership modified the oxygen adverse event to occur above 6 l/min of nasal cannula. retrospective cohort. all covid+ adults hospitalized at any of the four hospitals of our institution from march 3, 2020 through april 26, 2020, including adverse events through april 30, 2020, were used for model creation and validation. this cohort included a total of 3345 covid+ admissions and 3317 unique patients. we include age, sex, race, ethnicity, and smoking history in our analysis as unchanging variables throughout an admission. the following laboratory values were included: neutrophils, lymphocytes, and eosinophils (each absolute count and percent); platelet count and mean platelet volume; blood urea nitrogen (bun); creatinine; c-reactive protein; d-dimer; ferritin; lactate dehydrogenase (ldh); and troponin i. these laboratory values were selected because they were routinely obtained in our cohort and the literature reported their association with increased likelihood of covid infection or adverse outcomes among covid+ patients. as vital signs are collected many times a day and both high and low abnormal values can be prognostic, minimum and maximum vital sign values within the prior 12 h were calculated for heart rate, respiratory rate, oxygen saturation by pulse oximetry (spo2), and temperature.
weight and body mass index (bmi) are similarly aggregated. vital signs data, which were used to calculate the prior-12 h aggregate variables, were available at minute-level resolution. three mutually exclusive categories of oxygen support were included as variables: room air, nasal cannula, or an oxygen device that provides more support than a nasal cannula (most commonly high-flow nasal cannula, non-rebreather mask, or ventilators). for the subset on nasal cannula, we also include the maximum oxygen flow rate in the 12 h prior to prediction as a continuous feature. as covid-19 is associated with a characteristic decompensation that leads to death within one week of admission 37 , current length of stay was included as a candidate predictor. missing data are observed in lab values, weight/bmi, and vital signs, where data are not missing at random. for retrospective analysis, we excluded any prediction instances where all vital signs (heart rate, respiratory rate, oxygen saturation, or temperature) were completely missing and no prior measurement had been collected (4% of prediction instances). the missing rate for 12-hour aggregate vital signs in the remaining instances was under 2.3%. these missing vital signs were imputed using the last not-missing minimum and maximum values. after this adjustment, the missing rate in the retrospective data for vital sign variables dropped to under 0.02%. similarly, the highest lab value missing rate was observed for d-dimer, where 10% of patients had missing values. after forward-filling imputation, remaining missing lab values or weight/bmi were filled with zeros, which would be learnable as a separate group within the distribution. we compared imputing the remaining missing data with the mean of the observed values against the default imputation (forward fill, then filling remaining missing data with zero), and found no benefit in model performance (auprc or auroc) from mean imputation.
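the 12-h vital-sign aggregates and the forward-fill-then-zero imputation described above can be sketched as follows (times in hours and toy readings, not the production ehr code):

```python
def window_min_max(readings, prediction_time, hours=12):
    """min and max of (time, value) readings in the `hours` before the
    prediction; (None, None) marks a missing aggregate."""
    vals = [v for t, v in readings
            if prediction_time - hours <= t <= prediction_time]
    return (min(vals), max(vals)) if vals else (None, None)

def forward_fill(aggregates, fill_value=0.0):
    """carry the last observed aggregate forward over missing (None) slots;
    slots with no prior observation fall back to fill_value, which the
    model can learn as its own group."""
    filled, last = [], None
    for a in aggregates:
        if a is not None:
            last = a
        filled.append(last if last is not None else fill_value)
    return filled

heart_rate = [(1, 80), (5, 90), (11, 70)]  # (hour, beats per minute)
```

`window_min_max(heart_rate, 12)` yields both extremes of the last 12 hours, while a prediction far from any reading yields the `(None, None)` sentinel that `forward_fill` later resolves.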
when implemented into the ehr, missing values prevent a score from being generated and a "missing data" placeholder is shown to users. the goal of our model is to predict the probability of no adverse event within the 96 h after a prediction instance. we employed a two-stage approach to develop and deploy this model. in stage 1, we build a model that predicts the outcome with high performance without imposing any deployment constraints. to do so, we built a complex "blackbox" model that included all variables. in stage 2, we distill this model into a "parsimonious" secondary model that uses fewer variables and has a simpler functional form, while achieving equivalent performance. the simplicity of the parsimonious model is intended to: (1) accommodate constrained ehr implementation requirements; (2) promote understandable explanations of how the entire model arrives at its predictions and how the model evaluates individual patients; (3) facilitate generalizability to other populations and institutions. a model that uses fewer variables to achieve the same predictive performance is less likely to overfit to a particular institution. for the purposes of model creation and retrospective validation, a prediction was generated every time a complete blood count (cbc) resulted in the ehr for each included patient. in general the prediction frequency was about every 24 h, as staff were instructed to limit blood draws to one per day to limit exposure. prediction timing around each cbc result was selected after early covid-19 works reported dysregulation of different types of white blood cells 49 . cbc results prior to a confirmed pcr test were included in the retrospective modeling, given the limited testing capacities in early march, which imposed a lag between admission and covid-19 confirmation (5.2% of all held-out prediction instances). from march 16th onwards, in-house testing capacities at our institute enabled rapid testing, which reduced the rate of pre-pcr prediction to 0.9%.
during model development, we split the data (3317 unique patients, 28,431 prediction instances) into a training set (60%), validation set (20%), and held-out test set (20%) such that all predictions for any patient are allocated to one group and there is no overlap between training, validation, or held-out test patients. the training set is used to fit both blackbox and parsimonious models, while the validation set is used to select the model hyperparameters that achieve the highest performance. the final models are retrospectively validated on the final held-out set of patients. this multi-step process minimizes overfitting during parameter selection and provides a robust estimate of out-of-sample performance. the training set included 1990 unique patients, contributing 17,614 prediction instances. the validation set included 663 unique patients, contributing 4903 prediction times. the held-out test set included 664 unique patients, contributing 5914 prediction times. we built four models for this task. the first model was a logistic regression. we used the validation set to determine the regularization hyperparameter (l1 or l2) 50 . an l-bfgs 51 optimization method was used to learn the coefficients within sklearn 52 . the second model built was a random forest (rf) 53 classifier. rfs are robust and successful nonlinear ensemble models, built over multiple decision trees, each over a subset of features and a subset of samples. the subsampling of features and samples enables the decision trees within an rf to have low correlation, which is key to strong ensembling performance. in this work, we tuned the rf model for the number of trees and regularization parameters such as maximum tree depth and minimum samples per leaf. we used the gini impurity criterion for building each individual decision tree. the feature subsampling rate was set to the square root of the total feature count. we used the rf implementation within sklearn 52 .
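the patient-grouped 60/20/20 split described above (all of a patient's prediction instances land in a single partition) can be sketched as follows; this is an illustrative reimplementation, not the study's code:

```python
import numpy as np

def patient_level_split(patient_ids, fracs=(0.6, 0.2, 0.2), seed=0):
    """assign every prediction instance to train/val/test so that all
    instances of one patient land in the same partition."""
    rng = np.random.default_rng(seed)
    # shuffle the unique patients, then slice the shuffled list
    unique = rng.permutation(np.unique(patient_ids))
    n_train = int(fracs[0] * len(unique))
    n_val = int(fracs[1] * len(unique))
    train_p = set(unique[:n_train])
    val_p = set(unique[n_train:n_train + n_val])
    labels = []
    for pid in patient_ids:
        if pid in train_p:
            labels.append("train")
        elif pid in val_p:
            labels.append("val")
        else:
            labels.append("test")
    return labels
```

splitting at the patient level (rather than the instance level) is what prevents correlated instances from the same admission leaking across partitions and inflating held-out performance.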
the third model was a lightgbm model 54 , an efficient variant of the gradient boosting decision tree 55 method. lightgbm builds an ensemble of decision trees but, in contrast to rf, the decision trees are built iteratively rather than independently. at each iteration, the next tree is built to lower the residual error of the predictions made by the current set of trees. we used the lightgbm 54 open source package in this study, and tuned hyperparameters including the number of trees, a number of regularization parameters, the feature subsampling rate, and the learning rate. all three models above were optimized for a weighted loss (weighting by the inverse frequency of each class) to correct for class imbalance. the fourth model was an ensemble of the three models above based on a simple averaging of the model probabilities. for each model, we computed the average of two statistics using the validation set: the area under the receiver operating characteristic curve (auroc) and the area under the precision-recall curve (auprc). the model that achieved the highest score was chosen for model distillation in the next stage. the goal of this stage was to reduce the variable set as much as possible while maintaining performance at a level comparable to the blackbox model. using the blackbox model, we first ran a conditional independence hypothesis test. using this test, we selected important variables using a p-value threshold. finally, we built a parsimonious model using only the important variables and established that its performance was comparable to that of the blackbox model. conditional independence tests ask the question: how much additional information does a particular feature x_j contain about the outcome y over all the other variables? a simple and effective way to test this in practice is to use a hypothesis test for conditional independence 56 , which involves two pieces: a test statistic and a null distribution.
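the probability-averaging ensemble and the avg(auroc, auprc) selection rule described above can be sketched as follows. this is a toy illustration: the feature matrix is synthetic, a random forest stands in for the lightgbm member to keep the sketch dependency-light, and `class_weight="balanced"` plays the role of the inverse-frequency loss weighting:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, average_precision_score

# synthetic stand-in for the real feature matrix (assumption)
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=400) > 0).astype(int)
X_tr, X_va, y_tr, y_va = X[:300], X[300:], y[:300], y[300:]

members = [
    LogisticRegression(class_weight="balanced", max_iter=1000).fit(X_tr, y_tr),
    RandomForestClassifier(class_weight="balanced", random_state=0).fit(X_tr, y_tr),
]
probs = [m.predict_proba(X_va)[:, 1] for m in members]
probs.append(np.mean(probs, axis=0))  # simple-average ensemble

# select the candidate maximizing the average of auroc and auprc
scores = [0.5 * (roc_auc_score(y_va, p) + average_precision_score(y_va, p))
          for p in probs]
best = int(np.argmax(scores))
```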
in this case, the test statistic t* is simply the performance of our blackbox model on the validation set. to sample from the null distribution, we first created a "null dataset" using the training and validation sets. these null datasets replace the variable x_j with random values designed to be similar to the original values but have no relation to the outcome. we then fit the blackbox model to the null training set, and measure performance on the null validation set. the performance of this "null model" is a sample from the null distribution. given our test statistic t* and k independent samples from the null distribution, we can compute a p-value for every variable in our dataset. this p-value indicates whether or not we can reject the null hypothesis that x_j provides no additional information about the outcome y over all the other variables. selecting features. to deploy a model with as few variables as possible, we chose the features using a threshold on the p-values generated by our conditional independence test. we used a threshold of 0.2 with no multiple testing correction in order to boost the power of our selection process. building a parsimonious model. using our important variables, we built a logistic regression as our parsimonious model. logistic regression with a small set of variables is easy to deploy and highly interpretable. to prepare the data for this model, we applied additional preprocessing steps. as is common in medical datasets, we observed many outliers in our data. linear models are sensitive to outliers, so we quantile transformed each variable. this involved computing 1000 quantiles for each variable and replacing each feature value with its respective quantile. the result was a dataset whose variables are scaled from 0 to 1, so outliers do not significantly impact the training of our parsimonious model. we compute quantiles using only the training set, and apply these quantiles to the validation and test sets.
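the null-dataset hypothesis test described above can be sketched as follows. this is a hedged illustration, not the study's implementation: a logistic regression stands in for the blackbox model, and the "random values similar to the original" are produced by permuting column j (same marginal distribution, no relation to y):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def conditional_independence_pvalue(X_tr, y_tr, X_va, y_va, j, k=30, seed=0):
    """replace column j with a permutation of itself, refit on the null
    data, and count how often the null score reaches the real t*."""
    rng = np.random.default_rng(seed)
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    t_star = roc_auc_score(y_va, model.predict_proba(X_va)[:, 1])
    hits = 0
    for _ in range(k):
        X_tr_null, X_va_null = X_tr.copy(), X_va.copy()
        X_tr_null[:, j] = rng.permutation(X_tr_null[:, j])
        X_va_null[:, j] = rng.permutation(X_va_null[:, j])
        null_model = LogisticRegression(max_iter=1000).fit(X_tr_null, y_tr)
        t_null = roc_auc_score(y_va, null_model.predict_proba(X_va_null)[:, 1])
        hits += t_null >= t_star
    # add-one smoothing gives a valid p-value from k null samples
    return (1 + hits) / (1 + k)
```

a genuinely informative feature pushes t* well above every null score, so its p-value falls to roughly 1/(1+k) and it survives the 0.2 threshold.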
next, we filtered out variables that were not found to impact the performance of the logistic regression. using an ablation analysis, we found age and maximum oxygen saturation (previous 12 h) to have no impact on either the auroc or the average precision of the parsimonious model. we manually removed age even though it had a non-zero coefficient in the model. finally, we "linearized" our remaining important variables. for each important variable, we visualized the individual conditional expectation (ice). an ice is computed by fixing all but one of the variables and varying the remaining variable from its minimum value to its maximum. as we varied this variable, we observed the changes in the prediction of the blackbox model to create an ice plot. if the ice plot appears roughly linear, the variable interacts with the outcome in a linear manner. in some cases, the ice plot would appear to have a u-shape. for any variable with such an ice plot, we split the variable into two binary indicators. using the minimum or maximum of the u as a threshold, the indicators represented whether the observed value was above or below this threshold. using the set of important, quantile normalized, and linearized variables, we fit a logistic regression. during optimization, we used an elastic net regularizer, which is a weighted combination of l1 and l2 regularization, where the weights on each are hyperparameters. we performed a grid search to identify the hyperparameters of the linear model: the regularization and elastic net mixing parameters. we chose the setting that maximizes the average of auroc and auprc on the validation set. in order to make predictions accessible to the care team in real time through the institution's ehr (epic systems, verona, wi), the parsimonious model was implemented as a cloud-based model within epic's cognitive computing platform. each model variable is extracted directly from the operational database in real time.
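the 1000-quantile transformation described above (fit on the training set only, applied unchanged to held-out data) can be sketched with sklearn's `QuantileTransformer`; the heavy-tailed toy data stands in for the lab and vital-sign values:

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer

# heavy-tailed toy data standing in for lab/vital values (assumption)
rng = np.random.default_rng(0)
X_train = rng.lognormal(size=(2000, 3))
X_train[0, 0] = 1e6                      # an extreme outlier
X_test = rng.lognormal(size=(100, 3))

# compute 1000 quantiles on the training set only, mapping every value
# to its quantile rank in [0, 1]; out-of-range values are clipped to
# the training range, so outliers cannot dominate the linear model
qt = QuantileTransformer(n_quantiles=1000,
                         output_distribution="uniform").fit(X_train)
X_train_q = qt.transform(X_train)
X_test_q = qt.transform(X_test)
```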
exclusions are also implemented: age < 18 years, patient class not equal to inpatient, and no active covid-19 infection. an active covid-19 infection is automatically applied at the patient level when a pcr test for sars-cov-2 returns positive (or detected), typically within two hours of admission. patients tested in an outpatient setting who are then admitted are included. resulting predictions are scaled between 0 and 100 and shown in a patient list. every 30 min, an updated prediction is generated for every eligible patient to incorporate any newly collected data. probability thresholds are selected within the held-out set to separate patients at low risk of an adverse event in 96 h from moderate- and high-risk patients. these groups are colored to indicate their risk: green as low-risk, orange as moderate-risk, and red as high-risk. the planned application warrants a pure set of green, low-risk patients to aid consistent decision-making with few false positives. as such, a high positive predictive value (ppv or precision) of 90% is selected for the green-to-orange threshold (from every ten green patients, one will develop an adverse event) and 80% ppv for the orange-to-red threshold (from every ten green or orange patients, two will develop adverse events). to better understand how the model predictions could be integrated into care decisions, medical students supervised by attending physicians reviewed over 30 clinical patient encounters. the encounters were chosen to reflect a variety of patients who reached their first low-risk prediction at different time points in their stay. key questions for the review team were: (a) did the clinical team believe the patient was medically ready for discharge at the time of the prediction, and what were the barriers to discharge? (b) could the model prediction impact the care plan? each prediction is displayed with a list of variables that contribute to that score. for each variable, the raw value is stated, e.g.
minimum spo2 of 88%, along with the percentage of its contribution to the total score for that individual prediction. contributions of each variable can be positive or negative and are computed as proportions whose absolute values sum to 100%:

contribution_j = 100 · (β_j x_j) / Σ_k |β_k x_k|

where x is the vector of quantile normalized variables and β is the vector of linear coefficients. the aim of this prospective validation is to assess whether the parsimonious model can maintain its held-out set performance when deployed live into an ehr. a prospective observational cohort of hospitalized covid+ adults was collected, including scored patients spanning may 15 to may 24, 2020, with a 4-day follow-up to may 28, 2020 for potential adverse event observations. predictions were generated for 445 patients over 474 admissions every 30 min, accounting for a total of 109,913 prediction instances. in order to assess the prospective performance of the recreated parsimonious model, each score produced during this study period is used to compute auroc and auprc as well as ppv and sensitivity at the green threshold. we used a bootstrapping method with 100 iterations to compute 95% confidence intervals. at each iteration, the performance statistics were computed over a randomly selected (with replacement) 50% of the held-out samples. further information on research design is available in the nature research life sciences reporting summary linked to this article. due to specific institutional requirements governing privacy protection, data used in this study will not be available. code for model development and implementation is available upon reasonable request. much of the code for data retrieval and processing is specific to the particular data challenges of our institution and would not replicate elsewhere. the final deployed model's coefficients and intercept are available within table 2 . the deployed model can be transferred via the epic turbocharger service, upon reasonable request.
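the bootstrapped confidence intervals described above (100 iterations, each on a random 50% of the held-out samples drawn with replacement) can be sketched as follows; this is an illustrative reimplementation:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_ci(y_true, y_score, metric=roc_auc_score,
                 n_iter=100, frac=0.5, seed=0):
    """95% confidence interval: at each iteration, recompute the
    statistic on a random frac of the samples drawn with replacement."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_iter):
        idx = rng.choice(n, size=int(frac * n), replace=True)
        if len(np.unique(y_true[idx])) < 2:  # auroc needs both classes
            continue
        stats.append(metric(y_true[idx], y_score[idx]))
    return float(np.percentile(stats, 2.5)), float(np.percentile(stats, 97.5))
```

swapping `metric` for average precision, ppv, or sensitivity at a fixed threshold yields the other reported intervals.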
references.
johns hopkins coronavirus resource center
who director-general's opening remarks at the media briefing on covid-19-20
factors associated with hospital admission and critical illness among 5279 people with coronavirus disease
implementing ai in healthcare
key challenges for delivering clinical impact with artificial intelligence
helps clinicians predict when covid-19 patients might need intensive care
prediction of severe illness due to covid-19 based on an analysis of initial fibrinogen to albumin ratio and platelet count
risk factors of fatal outcome in hospitalized subjects with coronavirus disease 2019 from a nationwide analysis in china
a novel simple scoring model for predicting severity of patients with sars-cov-2 infection
a tool to early predict severe coronavirus disease 2019 (covid-19): a multicenter study using the risk nomogram in wuhan and guangdong, china
clinical characteristics of coronavirus disease 2019 and development of a prediction model for prolonged hospital length of stay
individualized prediction nomograms for disease progression in mild covid-19
prediction for progression risk in patients with covid-19 pneumonia: the call score
a predictive model for disease progression in non-severe illness patients with corona virus disease
towards an artificial intelligence framework for data-driven prediction of coronavirus clinical severity
plasma albumin levels predict risk for nonsurvivors in critically ill patients with covid-19
a simple algorithm helps early identification of sars-cov-2 infection patients with severe progression tendency
development and validation of a clinical risk score to predict the occurrence of critical illness in hospitalized patients with covid-19
neutrophil-to-lymphocyte ratio predicts critical illness patients with 2019 coronavirus disease in the early stage
prediction of the severity of corona virus disease 2019 and its adverse clinical outcomes
prognostic value of c-reactive protein in patients with covid-19
clinical decision support tool and rapid point-of-care platform for determining disease severity in patients with covid-19
the value of clinical parameters in predicting the severity of covid-19
clinical and laboratory predictors of in-hospital mortality in patients with covid-19: a cohort study in wuhan, china
an increased neutrophil/lymphocyte ratio is an early warning signal of severe covid-19
an interpretable mortality prediction model for covid-19 patients
clinical characteristics, associated factors, and predicting covid-19 mortality risk: a retrospective study in wuhan, china
d-dimer levels on admission to predict in-hospital mortality in patients with covid-19
development and validation a nomogram for predicting the risk of severe covid-19: a multi-center study in sichuan, china
using machine learning to predict icu transfer in hospitalized covid-19 patients
clinical and chest radiography features determine patient outcomes in young and middle age adults with covid-19
a classifier prediction model to predict the status of coronavirus covid-19 patients in south korea
chest x-ray severity index as a predictor of in-hospital mortality in coronavirus disease 2019: a study of 302 patients from italy
intensive care risk estimation in covid-19 pneumonia based on clinical and imaging parameters: experiences from the munich cohort
early predictors of clinical deterioration in a cohort of 239 patients hospitalized for covid-19 infection in lombardy
a clinical risk score to identify patients with covid-19 at high risk of critical care admission or death: an observational cohort study
presenting characteristics, comorbidities, and outcomes among 5700 patients hospitalized with covid-19 in the
baseline characteristics and outcomes of 1591 patients infected with sars-cov-2 admitted to icus of the lombardy region
covid-19 patients' clinical characteristics, discharge rate, and fatality rate of meta-analysis
clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in wuhan, china
hematologic, biochemical and immune biomarker abnormalities associated with severe illness and mortality in coronavirus disease 2019 (covid-19): a meta-analysis
predictors of mortality for patients with covid-19 pneumonia caused by sars-cov-2: a prospective cohort study
association of inflammatory markers with the severity of covid-19: a meta-analysis
the hemocyte counts as a potential biomarker for predicting disease progression in covid-19: a retrospective study
laboratory data analysis of novel coronavirus (covid-19) screening in 2510 patients
eosinophil responses during covid-19 infections and coronavirus vaccination
variational information maximization for feature selection
improving clinical practice using clinical decision support systems: a systematic review of trials to identify features critical to success
dysregulation of immune response in patients with covid-19 in wuhan, china
regression shrinkage and selection via the lasso
a limited memory algorithm for bound constrained optimization
scikit-learn: machine learning in python
random forests
lightgbm: a highly efficient gradient boosting decision tree
greedy function approximation: a gradient boosting machine
panning for gold: 'model-x' knockoffs for high dimensional controlled variable selection
the authors declare no competing interests.
supplementary information is available for this paper at https://doi.org/10.1038/s41746-020-00343-x. correspondence and requests for materials should be addressed to y.a. reprints and permission information is available at http://www.nature.com/reprints. publisher's note: springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. open access: this article is licensed under a creative commons attribution 4.0 international license (http://creativecommons.org/licenses/by/4.0/). key: cord-319436-mlitd45q authors: brinati, d.; campagner, a.; ferrari, d.; locatelli, m.; banfi, g.; cabitza, f. title: detection of covid-19 infection from routine blood exams with machine learning: a feasibility study date: 2020-04-25 journal: nan doi: 10.1101/2020.04.22.20075143 sha: doc_id: 319436 cord_uid: mlitd45q background the covid-19 pandemic due to the sars-cov-2 coronavirus, in the first 4 months since its outbreak, has to date reached more than 200 countries worldwide, with more than 2 million confirmed cases (probably a much higher number of infected) and almost 200,000 deaths.
amplification of viral rna by (real time) reverse transcription polymerase chain reaction (rrt-pcr) is the current gold standard test for confirmation of infection, although it presents known shortcomings: long turnaround times (3-4 hours to generate results), potential shortage of reagents, false-negative rates as large as 15-20%, and the need for certified laboratories, expensive equipment and trained personnel. thus there is a need for alternative, faster, less expensive and more accessible tests. material and methods we developed two machine learning classification models using hematochemical values from routine blood exams (namely: white blood cell counts, and the platelets, crp, ast, alt, ggt, alp, ldh plasma levels) drawn from 279 patients who, after being admitted to the san raffaele hospital (milan, italy) emergency room with covid-19 symptoms, were screened with the rrt-pcr test performed on respiratory tract specimens. of these patients, 177 resulted positive, whereas 102 received a negative response. results we have developed two machine learning models to discriminate between patients who are either positive or negative to sars-cov-2: their accuracy ranges between 82% and 86%, and their sensitivity between 92% and 95%, thus comparing well with the gold standard. we also developed an interpretable decision tree model as a simple decision aid for clinicians interpreting blood tests (even off-line) for covid-19 suspect cases. discussion this study demonstrated the feasibility and clinical soundness of using blood test analysis and machine learning as an alternative to rrt-pcr for identifying covid-19 positive patients. this is especially useful in those countries, like developing ones, suffering from shortages of rrt-pcr reagents and specialized laboratories. we made available a web-based tool for clinical reference and evaluation. this tool is available at https://covid19-blood-ml.herokuapp.com.
the pandemic disease caused by the sars-cov-2 virus, named covid-19, is requiring unprecedented responses of exceptional intensity and scope from more than 200 states around the world, after having infected, in the first 4 months since its outbreak, a number of people between 2 and 20 million, with at least 200,000 deaths. to cope with the spread of the covid-19 infection, governments all over the world have taken drastic measures like the quarantine of hundreds of millions of residents worldwide. however, because of the covid-19 symptomatology, which includes a large number of asymptomatic cases [12] , these efforts are limited by the problem of differentiating between covid-19 positive and negative individuals. thus, tests to identify the sars-cov-2 virus are believed to be crucial to identify positive cases of this infection and thus curb the pandemic. to this aim, the current test of choice is the reverse transcriptase polymerase chain reaction (rt-pcr)-based assay performed in the laboratory on respiratory specimens. taking this as a gold standard, machine learning techniques have been employed to detect covid-19 from lung ct-scans with 90% sensitivity and high auroc (~0.95) [25, 18] . although chest cts have been found associated with high sensitivity for the diagnosis of covid-19 [1] , this kind of exam can hardly be employed for screening tasks, because of the radiation doses, the relatively low number of devices available, and the related operation costs. a similar attempt was recently performed on chest x-rays [4] , which is a low-dose and less expensive test, with promising statistical performance (e.g., sensitivity 97%). however, since almost 60% of chest x-rays taken in patients with confirmed and symptomatic covid-19 have been found to be normal [40] , systems based on this exam need to be thoroughly validated in real-world settings [6] . the public health emergency requires an unprecedented global effort to increase testing capacity [29] .
the large demand for rrt-pcr tests (also commonly known as nasopharyngeal swab tests) due to the worldwide extension of the virus is highlighting the limitations of this type of diagnosis on a large scale, such as: the long turnaround times (on average over 2 to 3 hours to generate results); the need for certified laboratories; trained personnel; expensive equipment and reagents, for which demand can easily overcome supply [26] . for instance in italy, the scarcity of reagents and specialized laboratories forced the government to limit swab testing to those people who clearly showed symptoms of severe respiratory syndrome, thus leading to a number of infected people and a contagion rate that were largely underestimated [34] . for this reason, and also in light of the predictable wide adoption of mobile apps for contact tracing [14] , which will likely increase the demand for population screening, there is an urgent need for alternative (or complementary) testing methods by which to quickly identify infected covid-19 patients, to mitigate virus transmission and guarantee prompt patient treatment. in a previous work published in the laboratory medicine literature [13] , we showed how simple blood tests might help identify false positive/negative rrt-pcr tests. that work and the considerations made above strongly motivated us to apply machine learning methods to routine, low-cost 2 blood exams, and to evaluate the feasibility of predictive models in this important task for the mass screening of potential covid-19 infected individuals. in what follows we report this feasibility study in detail. the aim of this work is to develop a predictive model, based on machine learning techniques, to predict positivity or negativity for covid-19. in the rest of this section we report on the dataset used for model training and on the data analysis pipeline adopted.
the dataset used for this study was made available by the irccs ospedale san raffaele 3 and it consisted of 279 cases, randomly extracted from patients admitted to that hospital from the end of february 2020 to mid march 2020. each case included the patient's age, gender, and values from routine blood tests, as well as the result of the rt-pcr test for covid-19, performed by nasopharyngeal swab. the parameters collected by the blood test are reported in table 1 . all rights reserved. no reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. the copyright holder for this preprint this version posted april 25, 2020. https://doi.org/10.1101/2020.04.22.20075143 doi: medrxiv preprint the dependent variable "swab" is binary: it is equal to 0 in the absence of covid-19 infection (negative swab test), and equal to 1 in the case of covid-19 infection (positive swab test). the numbers of occurrences of the negative and positive classes were respectively 102 (37%) and 177 (63%), thus the dataset was slightly imbalanced towards positive cases. figure 1 shows the pairwise correlation of the features used for this study, while figure 2 focuses on the variables "age", "wbc", "crp", "ast" and "lymphocytes". first of all, the categorical feature gender has been transformed into two binary features by one-hot encoding. further, we notice that the dataset was affected by missing values in most of its features (see table 2 ). to address data incompleteness, we performed missing data imputation by means of the multivariate imputation by chained equations (mice) [5] method.
mice is a multiple imputation method that works in an iterative fashion: in each imputation round, one feature with missing values is selected and modeled as a function of all the other features; the estimated values are then used to impute the missing values and are re-used in the subsequent imputation rounds. we chose this method because multiple imputation techniques are known to be more robust and better capable of accounting for uncertainty compared with single imputation ones [33] (as they employ the joint distribution of the available features), and mice in particular can also handle different data types. we developed and compared different classes of machine learning classifiers. in particular, we considered the following classifier models:
- decision tree [35] (dt);
- extremely randomized trees [16] (et);
- k-nearest neighbors [2] (knn);
- logistic regression [20] (lr);
- naïve bayes [23] (nb);
- random forest [21] (rf);
- support vector machines [36] (svm).
we also considered a modification of the random forest algorithm, called the three-way random forest classifier [7] (twrf), which allows the model to abstain on instances for which it can express only low confidence; in so doing, a twrf achieves higher accuracy on the effectively classified instances at the expense of coverage (i.e., the number of instances on which it makes a prediction).
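the mice-style chained-equation imputation described above can be sketched with sklearn's `IterativeImputer`, which iteratively models each incomplete feature as a function of the others; the toy data (two strongly correlated columns standing in for related blood parameters) is an assumption of this sketch:

```python
import numpy as np
# sklearn's chained-equations imputer sits behind an explicit
# experimental-feature import
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# toy data with two strongly related columns (assumption)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[:, 1] = 2.0 * X[:, 0] + 0.1 * rng.normal(size=200)
X_missing = X.copy()
X_missing[::10, 1] = np.nan  # knock out every tenth value

# each round regresses one incomplete feature on the others and
# re-uses the estimates in subsequent rounds
X_imputed = IterativeImputer(max_iter=10, random_state=0).fit_transform(X_missing)
```

because column 1 is nearly a linear function of column 0, the chained-equation estimates land close to the true (removed) values, which is exactly the benefit of exploiting the joint distribution over single imputation.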
we decided to consider also this class of models as they could provide more reliable predictions in a large part of the cases, while exposing the uncertainty regarding the other cases so as to suggest further (and more expensive) tests on them. from a technical point of view, since a random forest is a probability scoring classifier (that is, for each instance the model assigns a probability score to every possible class), the abstention is performed on the basis of two thresholds α, β ∈ [0, 1]: if we denote with 1 the positive class and with 0 the negative class, then each instance is classified as positive if score(1) > α and score(1) > score(0), negative if score(0) > β and score(0) > score(1), and, otherwise, the model abstains. in these models the performance is usually evaluated only on the non-abstained instances [15] , and the coverage is a further performance element to be considered. the models mentioned above have been trained, and evaluated, through a nested cross-validation [19, 9] procedure. this procedure allows for an unbiased generalization error estimate while the hyperparameter search (including feature selection) is performed: an inner cross-validation loop is executed to find the optimal hyperparameters via grid search and an outer loop evaluates the model performance on five folds. models were evaluated in terms of accuracy, balanced accuracy 4 , positive predictive value (ppv) 5 , sensitivity, specificity and, except for the three-way random forest, the area under the roc curve (auc). after discussing this with the clinicians involved in this study, we considered accuracy and sensitivity to be the main quality metrics, since false negatives (that is, patients positive to covid-19 who are, however, classified as negative, and possibly sent home) are more harmful than false positives in this screening task. 4 we recall that balanced accuracy is defined as the average of sensitivity and specificity.
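the three-way abstention rule described above can be written directly from the two thresholds; a minimal sketch on top of any probability-scoring classifier's positive-class scores (the example scores are illustrative):

```python
import numpy as np

def three_way_predict(score_pos, alpha=0.7, beta=0.7):
    """positive if score(1) > alpha and score(1) > score(0);
    negative if score(0) > beta and score(0) > score(1);
    abstain otherwise."""
    score_pos = np.asarray(score_pos, dtype=float)
    score_neg = 1.0 - score_pos  # binary case: scores sum to one
    out = np.full(score_pos.shape, "abstain", dtype=object)
    out[(score_pos > alpha) & (score_pos > score_neg)] = "positive"
    out[(score_neg > beta) & (score_neg > score_pos)] = "negative"
    return out

preds = three_way_predict([0.95, 0.60, 0.10])
coverage = float(np.mean(preds != "abstain"))
```

raising α and β purifies the classified instances (higher accuracy) while shrinking the coverage, which is exactly the trade-off evaluated for the twrf.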
if accuracy and balanced accuracy differ significantly, the data can be interpreted as unbalanced with respect to class prevalence. 5 we recall here that ppv represents the probability that subjects with a positive screening test truly have the disease. tables 3 and 4 show the 95% confidence intervals of, respectively, the average accuracy and the average balanced accuracy (that is, the average of sensitivity and specificity) of the models (on the nested cross-validation) trained on the two best-performing sets of features: the first one, dataset a, includes all the variables, while the second one, dataset b, excludes the "gender" variable, as it was found to be of negligible predictive value. figure 3 shows the performance of the traditional models (i.e., the twrf model was excluded) on the nested cross-validation. to further validate the above findings, the entire dataset was split into training and test/validation sets, comprising respectively 80% and 20% of the total instances. the best-performing model, i.e. the random forest classifier trained on dataset b, achieved the following results on the test/validation set: accuracy = 82%, sensitivity = 92%, ppv = 83%, specificity = 65%, auc = 84%. figures 4 and 5 show the performance of this model in the roc and precision/recall space, respectively. the optimal hyperparameters found are shown in table 5. similarly, for the best three-way random forest classifier on the validation set we observed: accuracy = 86%, sensitivity = 95%, ppv = 86%, specificity = 75%, coverage = 70% (that is, for 30% of the validation instances the model abstained). the feature importance assessed for the best-performing model (random forest on dataset b) is shown in figure 6.
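all of the reported metrics follow directly from the four confusion-matrix counts; a minimal, generic sketch (the counts below are hypothetical, not the study's data):

```python
def screening_metrics(tp, fp, tn, fn):
    """Derive the screening metrics used above from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # recall on positives
    specificity = tn / (tn + fp)                 # recall on negatives
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "ppv": tp / (tp + fp),                   # positive predictive value
        "sensitivity": sensitivity,
        "specificity": specificity,
    }

# hypothetical counts: 9 true positives, 1 false negative,
# 8 true negatives, 2 false positives
m = screening_metrics(tp=9, fp=2, tn=8, fn=1)
```

note that with these balanced counts accuracy and balanced accuracy coincide, which is exactly the situation described above.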
in order to provide an interpretable overview (in the sense of explainable ai [17]) of the predictive models that we developed, we also developed a decision tree model, which is shown in figure 7. although the depicted decision tree is associated with a lower discriminative performance than the two former (inscrutable) models, such a tree can be used as a simple decision aid by clinicians interested in the use of blood values to assess covid-19 suspect cases. we have developed two machine learning models to discriminate between patients who are either positive or negative to sars-cov-2, the coronavirus causing the covid-19 pandemic. in this task, patients are represented in terms of a few basic demographic characteristics (gender, age) and a small array of routine blood tests, chosen for their convenience, low cost and because they are usually available within 30 minutes from the blood draw in a regular emergency department. the ground truth was established through rt-pcr swab tests. we presented the best traditional model, as is common practice, and a three-way model, which guarantees the best sensitivity and positive predictive value: the former is the proportion of infected (and contagious) people who will have a positive result, and is therefore useful to clinicians when deciding which test to use. on the other hand, ppv is useful for patients as it tells the odds of one having covid-19 given a positive result. the performance achieved by these two best models (sensitivity between 92% and 95%, accuracy between 82% and 86%) provides proof that this kind of data, and computational models, can be used to discriminate among potential covid-19
infectious patients with sufficient reliability, and with similar sensitivity to the current gold standard. this is the most important contribution of our study. also from the clinical point of view, the feature selection was considered valid by the clinicians involved. indeed, the specialist literature has found that covid-19 positivity is associated with lymphopenia (that is, an abnormally low level of lymphocytes in the blood), damage to liver and muscle tissue [42, 39], and significantly increased c-reactive protein (crp) levels [10]. in [27] a comprehensive list of the most frequent abnormalities in covid-19 patients has been reported: among the 14 conditions considered, they report increased aspartate aminotransferase (ast), decreased lymphocyte count, increased lactate dehydrogenase (ldh), increased c-reactive protein (crp), increased white blood cell count (wbc) and increased alanine aminotransferase (alt). these parameters are also the most predictive features identified by the best classifier (random forest), together with the age attribute. other studies also confirm the relevance of these features and their association with covid-19 positivity [8, 30, 32, 44], compared to other kinds of pneumonia [43]. this also confirms that our models are grounded in clinically relevant features and that most of these values can be extracted from routine blood exams.
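to make the notion of a "most predictive feature" concrete: the root of a decision tree is simply the feature/threshold pair whose single split best separates the classes, which is how a variable such as ast can end up at the root. a toy sketch of that search on hypothetical data (not the authors' code):

```python
from collections import Counter

def _majority(labels):
    """Most frequent label in a (possibly empty) list."""
    return Counter(labels).most_common(1)[0][0] if labels else None

def root_split(X, y):
    """Exhaustively search the (feature index, threshold) pair that
    minimizes misclassifications of a single split -- the choice a
    decision tree makes at its root node."""
    best_err, best_rule = len(y) + 1, None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [y[i] for i, row in enumerate(X) if row[j] <= t]
            right = [y[i] for i, row in enumerate(X) if row[j] > t]
            # each side predicts its majority class; count the mistakes
            err = sum(v != _majority(left) for v in left) \
                + sum(v != _majority(right) for v in right)
            if err < best_err:
                best_err, best_rule = err, (j, t)
    return best_rule

# hypothetical rows: feature 0 is noise, feature 1 separates the classes
X = [[5, 10], [3, 12], [9, 30], [1, 35]]
y = [0, 0, 1, 1]
```

here the search returns feature 1 with threshold 12, since splitting on it leaves no misclassifications, whereas no split on feature 0 does.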
the interpretable decision tree model provides a further confirmation (see figure 7) of the soundness of the approach: the clinicians (ml, gb) and the biochemist (df) involved in this study found it reasonable that ast would be the first parameter to consider (mirrored by the fact that ast is the root of the decision tree) and that it was found to be the most important predictive feature. indeed, values of ast above 25 are good predictors of covid-19 positivity (accuracy = ppv = 76%), while values below 25 are a good predictor of covid-19 negativity (accuracy = negative predictive value = 83%). similar observations can also be made about crp, lymphocytes and general wbc counts. no statistically significant difference was found between the accuracy and the balanced accuracy of the models (as mirrored by the overlap of the 95% confidence intervals), a sign that the dataset was not significantly unbalanced. moreover, we can notice that the best-performing ml classifier (random forest) exhibited a very high sensitivity (∼ 90%) but, in comparison, a limited specificity of only 65%. this gives the main motivation for the three-way classifier: this model offers a trade-off between increased specificity (a 10% increment compared with the best traditional ml model) and reduced coverage, as the three-way approach abstains on uncertain instances (i.e., the cases that cannot be classified with high confidence as either positive or negative). this means that the model yields more robust and reliable predictions for the classified instances (as mirrored by the increase in all of the performance measures), while for the other ones it is still useful in suggesting further tests, e.g., either a pcr-rna swab test or a chest x-ray.

fig. 7: an interpretable decision tree, developed in order to support the interpretation of the predictions from the other models. color gradients denote predictivity for either class (shades of blue correspond to covid-19 negativity, shades of orange to positivity).

in regard to the specificity exhibited by our models, we can further notice that even though these values are relatively low compared with other tests (which are more specific but slower and less accessible), this may not be too much of a limitation: there is a significant disparity between the costs of false positives and false negatives, and in fact our models favor sensitivity (thus, they avoid false negatives). further, the high ppv (> 80%) of our models suggests that the large majority of cases identified as positive by our models would likely be covid-19 positive cases. that said, the study presents two main limitations: the first, and more obvious one, regards the relatively low number of cases considered. this was tackled by performing nested cross-validation in order to control for bias [38], and by employing models that are known to be effective also with moderately sized samples [3, 31, 37]. nonetheless, further research should aim at confirming our findings, by integrating hematochemical data from multiple centers and increasing the number of cases considered. the second limitation may be less obvious, as it regards the reliability of the ground truth itself.
although this was built by means of the current gold standard for covid-19 detection, i.e., the rrt-pcr test, a recent study observed that the accuracy of this test may be strongly affected by problems such as inadequate procedures for collection, handling, transport and storage of the swabs, sample contamination, and the presence of interfering substances, among others [28]. as a result, some recent studies have reported up to 20% false-negative results for the rrt-pcr test [41, 24, 22], and a recent systematic review reported an average sensitivity of 92% and cautioned that "up to 29% of patients could have an initial rt-pcr false-negative result". thus, contrary to common belief and some preliminary studies (e.g., [11]), the accuracy of this test could be less than optimal, and this could have affected the reliability of the ground truth in this study as well (as in any other study using this test for ground truthing, unless cases are annotated after multiple tests). however, besides being a limitation, this is also a further motivation to pursue alternative ways to perform the diagnosis of sars-cov-2 infection, such as ours. future work will be devoted to the inclusion of more hematochemical parameters, including those from arterial blood gas assays (abg), to evaluate their predictiveness with respect to covid-19 positivity, and to the inclusion of cases whose probability of being covid-positive is almost 100%, as they resulted positive to two or more swabs or to serologic antibody tests. this would allow associating a higher weight with misidentifying those cases and thus, we conjecture, further improving sensitivity. moreover, we want to investigate the interpretability of our models further, by both having more clinicians validate the current decision tree and possibly constructing a more accurate one, so that clinicians can use it as a convenient decision aid to interpret blood tests in regard to covid-19 suspect cases (even off-line).
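the idea of weighting misidentified high-confidence positives more heavily amounts to minimizing a weighted misclassification loss rather than a plain error rate; a minimal sketch with hypothetical labels and weights (not the authors' code):

```python
def weighted_error(y_true, y_pred, sample_weight):
    """Weighted misclassification rate: errors on heavily weighted samples
    (e.g. multi-swab-confirmed positives) cost proportionally more."""
    wrong = sum(w for t, p, w in zip(y_true, y_pred, sample_weight) if t != p)
    return wrong / sum(sample_weight)

y_true = [1, 1, 0, 0]
y_pred = [0, 1, 0, 0]                                    # one missed positive
uniform = weighted_error(y_true, y_pred, [1, 1, 1, 1])   # plain error rate
boosted = weighted_error(y_true, y_pred, [3, 1, 1, 1])   # the miss now dominates
```

with uniform weights the single missed positive costs 0.25; tripling its weight raises the loss to 0.5, so a learner minimizing this loss is pushed toward catching such cases, i.e., toward higher sensitivity.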
finally, this was conceived as a feasibility study for an alternative covid-19 test on the basis of hematochemical values. by virtue of this ambitious goal, the success of this study does not exempt us from pursuing a real-world, ecological validation of the models [6]. to this aim, we deployed an online web-based tool 6 by which clinicians can test the model, by feeding it with clinical values and considering the sensibleness and usefulness of the indications provided back by the model. after this successful feasibility study, we will conceive proper external validation tasks and undertake an ecological validation to assess the cost-effectiveness and utility of these models for the screening of covid-19 infection in all the real-world settings (e.g., hospitals, workplaces) where routine blood tests are a viable test of choice. not applicable. not applicable. research involving human subjects complied with all relevant national regulations and institutional policies, is in accordance with the tenets of the helsinki declaration (as revised in 2013), and was approved by the authors' institutional review board on the 20th of april. individuals signed an informed consent authorizing the use of their anonymously collected data for retrospective observational studies (article 9.2.j; eu general data protection regulation 2016/679 [gdpr]), according to the irccs san raffaele hospital policy (iog075/2016). the developed web tool is available at the following address: https://covid19-blood-ml.herokuapp.com/ the complete dataset will be made available on the zenodo platform as soon as the work is accepted for publication.
6 the tool is available at the following address: https://covid19-blood-ml.herokuapp.com/

references:
- correlation of chest ct and rt-pcr testing in coronavirus disease 2019 (covid-19) in china: a report of 1014 cases
- an introduction to kernel and nearest-neighbor nonparametric regression
- model selection for support vector machines: advantages and disadvantages of the machine learning theory
- covid-19: automatic detection from x-ray images utilizing transfer learning with convolutional neural networks
- mice: multivariate imputation by chained equations in r
- the proof of the pudding: in praise of a culture of real-world validation for medical artificial intelligence
- the three-way-in and three-way-out framework to treat and exploit ambiguity in data
- di napoli r (2020) features, evaluation and treatment coronavirus (covid-19)
- on over-fitting in model selection and subsequent selection bias in performance evaluation
- epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in wuhan, china: a descriptive study
- detection of 2019 novel coronavirus (2019-ncov) by real-time rt-pcr
- covid-19: identifying and isolating asymptomatic people helped eliminate virus in italian village
- routine blood tests as a potential diagnostic tool for covid-19
- quantifying sars-cov-2 transmission suggests epidemic control with digital contact tracing. science
- extremely randomized trees
- explainable ai: the new 42?
- coronavirus detection and analysis on chest ct with deep learning
- the elements of statistical learning: data mining, inference, and prediction
- applied logistic regression
- random decision forest
- insufficient sensitivity of rna dependent rna polymerase gene of sars-cov-2 viral genome as confirmatory test using korean covid-19 cases
- naive (bayes) at forty: the independence assumption in information retrieval
- false-negative results of real-time reverse-transcriptase polymerase chain reaction for severe acute respiratory syndrome coronavirus 2: role of deep-learning-based ct diagnosis and insights from two cases
- artificial intelligence distinguishes covid-19 from community acquired pneumonia on chest ct
- development and clinical application of a rapid igm-igg combined antibody test for sars-cov-2 infection diagnosis
- laboratory abnormalities in patients with covid-2019 infection
- potential preanalytical and analytical vulnerabilities in the laboratory diagnosis of coronavirus disease 2019 (covid-19)
- diagnostic testing for severe acute respiratory syndrome-related coronavirus-2: a narrative review
- time course of lung changes on chest ct during recovery from 2019 novel coronavirus (covid-19) pneumonia
- random forest for bioinformatics
- dysregulation of immune response in patients with covid-19 in wuhan
- multiple imputation for nonresponse in surveys
- as covid-19 cases, deaths and fatality rates surge in italy, underlying causes require investigation
- a survey of decision tree classifier methodology
- learning with kernels: support vector machines, regularization, optimization, and beyond
- stabilizing classifiers for very small sample sizes
- bias in error estimation when using cross-validation for model selection
- clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in wuhan, china
- chest x-ray findings in 636 ambulatory
  patients with covid-19 presenting to an urgent care center: a normal chest x-ray is no guarantee
- chest ct for typical 2019-ncov pneumonia: relationship to negative rt-pcr testing
- liver injury in covid-19: management and challenges
- a comparative study on the clinical features of covid-19 pneumonia to other pneumonias
- functional exhaustion of antiviral lymphocytes in covid-19 patients

the complete code will be made available on the zenodo platform as soon as the work is accepted for publication.

key: cord-283907-ev1ghlwl authors: cao, lingyan; li, yongkui; zhang, jiansong; jiang, yi; han, yilong; wei, jianjun title: electrical load prediction of healthcare buildings through single and ensemble learning date: 2020-11-30 journal: energy reports doi: 10.1016/j.egyr.2020.10.005 sha: doc_id: 283907 cord_uid: 283907-ev1ghlwl healthcare buildings are characterized by complex energy systems and high energy usage, and therefore serve as key areas for achieving energy conservation goals in the building sector. an accurate load prediction of hospital energy consumption is of paramount importance to successful healthcare building energy management. in this study, eight machine learning models of single learning and ensemble learning were developed for predicting healthcare facilities' energy consumption. to validate the performance of the proposed models, an experiment was conducted on a general hospital in shanghai, china. it was found that the two ensemble models, the extreme gradient boosting (xgboost) model and the random forest (rf) model, outperformed single models in daily electrical load prediction. a further comparison between models trained with daily and weekly temporal resolution electrical data shows that higher accuracy is more likely to be achieved with finer time granularity. through feature importance analysis, the most influential features for the daily and weekly electrical load predictions were identified.
based on the prediction results, it is expected that hospital facility managers will be able to conveniently assess the expected energy usage of their hospitals with the machine learning models. the building sector accounts for 30% of global energy consumption and more than 55% of global electricity consumption, generating 28% of energy-related co2 emissions worldwide (oecd/iea, 2017). driven by the increasing floor area and the rapidly growing demand for energy-consuming equipment and services in buildings, energy use in the building sector has shown a continuous growth trend, with an annual average growth of 1.1% since 2000 (oecd/iea, 2017). currently, the energy consumption of the building sector mainly comes from commercial and residential buildings. large-scale commercial buildings are reported to have a high energy consumption, which can be up to 300 kwh/m², 5 to 15 times that of residential buildings (liang et al., 2016). in the commercial sector, healthcare buildings are particularly energy-intensive due to their constant need for power supply and strict requirements for air quality and disease control (garcía-sanz-calcedo et al., 2019; bawaneh et al., 2019a). a report released by the energy information administration (eia) (eia, 2012) indicated that large hospitals were responsible for 6.6% of major fuel consumption in commercial buildings, even though they constitute only 2.2% of the total commercial building area. in addition, the healthcare industry market is expected to continuously increase in the future due to factors such as a higher number of chronic diseases, overpopulation, an increase of the elderly population, and the lack of healthy lifestyle choices (gonzalez, 2019).

* corresponding author. e-mail addresses: cly900601@tongji.edu.cn (l. cao), lyk@tongji.edu.cn (y. li), zhan3062@purdue.edu (j. zhang), jiang2@purdue.edu (y. jiang), yilong.han@tongji.edu.cn (y. han), 1510322@tongji.edu.cn (j. wei).
thus, healthcare buildings play an important role in overall energy consumption worldwide. in response to this increasing trend of building energy demand, much research has been conducted on the prediction of building energy consumption, especially the prediction of electrical loads. previous studies have shown that building energy prediction is helpful for implementing a series of energy conservation tasks, such as benchmarking building energy performance (zhao and magoulès, 2012), detecting system faults, measuring building energy savings (heo and zavala, 2012), and controlling demand response (pedersen et al., 2017). the main functions of healthcare buildings include providing medical services and carrying out scientific research. compared to other commercial buildings, healthcare buildings are distinguished by overloaded schedules, increased electricity usage, diversified forms of electricity use, and higher electricity consumption per gross floor area. for example, in the hospital context, a large amount of electricity has to be provided to ensure the normal operation of cold-chain equipment for vaccine storage, pumps for clean water supply, lighting, and other life-saving medical equipment for night-time and emergency care (dholakia, 2018). therefore, it is important to understand schemes to reduce the use of energy in complex healthcare buildings. through a literature review, it was found that numerous energy demand studies have been conducted on diverse building types, such as commercial buildings (yildiz et al., 2017), office buildings (ding et al., 2017), hotel buildings (shao et al., 2020), and residential buildings (gao et al., 2019). however, there is a paucity of research focused on healthcare buildings due to the complexity of their energy consumption patterns.
even though the amount of electrical load information related to diagnostics and medical treatment has increased with the advancement of metering and sensing technologies, detailed analysis based on the measured information remains scarce. therefore, in this paper, the authors propose a one-day-ahead electrical load forecasting model based on single and ensemble machine learning algorithms. to validate the performance of the proposed model, an experiment was conducted on shanghai tenth people's hospital, a general hospital under the jurisdiction of the shanghai hospital development center (shdc). daily electrical load consumption information was collected through an intelligent-building energy support system (i-bess). because energy consumption patterns in buildings can vary greatly depending on the use of the buildings (park et al., 2016), ignoring function-related variables will result in a discrepancy between the predicted values and the real values. to fill this gap, in this study, three types of patient occupants were taken into consideration, namely inpatients, outpatients, and emergency patients. in the present study, electrical load forecasting models of healthcare buildings are developed based on single and ensemble machine learning algorithms, taking multiple factors into account simultaneously. 10 features, classified into four categories (weather parameters, occupancy data, day-type information, and operation & maintenance measures), are specified as input to the forecasting models. the daily electrical load forecasting models are trained with a database of a general hospital in shanghai containing 700 instances with complete data. to test the impact of time granularity on the prediction performance, these models are also trained with a weekly load dataset.
the performance of the prediction models is evaluated by statistical metrics such as the mean absolute percentage error (mape), the coefficient of variation of the root mean square error (cvrmse), and the normalized root mean square error (nrmse). finally, a feature importance analysis is also presented to identify the critical attributes. the rest of this paper is organized as follows. in section 2, we discuss the relevant literature on the load prediction of buildings. section 3 presents our proposed methodology. section 4 introduces the experiment, ground truth, and evaluation metrics. sections 5 and 6 show the results and discussion of the electrical load prediction, respectively. finally, in the last three sections (7 to 9), the contributions, limitations, and conclusions are presented, respectively. in the past decade, extensive research has been carried out to predict the load demand of buildings and various models have been proposed for real-world applications. these models can be broadly classified into two categories, namely physical models and data-driven models (amasyali and el-gohary, 2018) (see fig. 1). in physical models, physical principles are used to calculate the thermal dynamics and energy behaviors of buildings by taking temperature, humidity, and other physical variables into consideration (wang et al., 2018a; li, 2020). several building energy modeling tools, such as energyplus, the transient system simulation tool (trnsys), environmental systems performance-research (esp-r), and the quick energy simulation tool (equest), have been widely used in this area (bui et al., 2020). however, the performance of physical models depends on a large number of building parameters, such as building envelope structure, lighting system setup, pump water distribution, etc., making them more suitable for buildings at the design stage rather than as-built (shao et al., 2020).
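the three error metrics named at the start of this section can be written out explicitly; a generic sketch on hypothetical load values (not the authors' code):

```python
import math

def mape(y, yhat):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs(a - b) / abs(a) for a, b in zip(y, yhat)) / len(y)

def rmse(y, yhat):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y))

def cvrmse(y, yhat):
    """RMSE normalized by the mean of the measured values, in percent."""
    return 100 * rmse(y, yhat) / (sum(y) / len(y))

def nrmse(y, yhat):
    """RMSE normalized by the range of the measured values."""
    return rmse(y, yhat) / (max(y) - min(y))

# hypothetical daily loads (kWh) and one-day-ahead predictions
y, yhat = [100.0, 200.0], [110.0, 190.0]
```

the two normalized variants make errors comparable across buildings with very different load magnitudes, which is why they are preferred over raw rmse for benchmarking.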
to this end, data-driven models, which use machine learning algorithms to construct mapping functions between input and output data, have gained immense popularity due to their ease of use, adaptability and high forecasting performance (wang et al., 2018a; wang and srinivasan, 2017; somu et al., 2020). in addition, data-driven models are more practical than physical models since the data used are more readily available from buildings, such as energy consumption, climatic, temporal, and occupancy data, which can be collected via sensing and communication technologies (wang et al., 2018a). furthermore, data-driven prediction models can be categorized into single models and ensemble models based on the modeling structure and the number of prediction models (wang and srinivasan, 2017). single prediction models are created by applying one prediction algorithm (wang and srinivasan, 2017). examples of single prediction models include multiple linear regression (ji and xu, 2015), artificial neural networks (platon et al., 2015), support vector machines (wang et al., 2016b), and the long short-term memory (lstm) algorithm (zhou et al., 2019; wang et al., 2019). in ensemble learning prediction, instead of using one algorithm to build the forecasting model, multiple learning algorithms are needed to train its base models (wang and srinivasan, 2017). the commonly used ensemble techniques are bagging (tuysuzoglu and birant, 2020), boosting (kadkhodaei et al., 2020), voting (tsai, 2019), and stacking (mahendran et al., 2020). among them, bagging and boosting are two of the most widely used ensemble learning methods because of their theoretical superiority and strong experimental performance (oza, 2005), with random forest (rf) and extreme gradient boosting (xgboost) as the representative algorithms of each, respectively. the main difference between rf and xgboost is that rf combines multiple predictors in a parallel way whereas xgboost combines them sequentially (wang et al., 2020a).
based on the literature review, the prediction performance of single and ensemble prediction models in the healthcare building context has been under-investigated. since xgboost and rf are advanced algorithms that can be used for electrical load prediction in buildings, these two algorithms are described in detail as follows. (1) xgboost algorithm based on boosting. the core idea of boosting is to start training a base learner from the initial training dataset, in which each sample (i.e., data point in the training dataset) has the same weight. the weights of the training samples are then constantly adjusted in the subsequent iterations according to the performance of the base learner: a heavier weight is given to the training samples that failed the training. this process is repeated until the number of base learners reaches a value specified in advance. finally, a strong model is constructed by combining these base learners by their trained weights. in boosting, the data drawn for training the base learners depends directly on the previous steps (kadkhodaei et al., 2020). xgboost is one of the boosting algorithms widely used in the field of machine learning and in kaggle competitions for structured or tabular data in recent years (silvestro et al., 2017; kadkhodaei et al., 2020). xgboost can be used to solve both classification and regression problems. the mechanism of xgboost is to first train a decision tree using the training dataset and then use this tree to obtain the corresponding predicted values. a residual is obtained by subtracting the predicted value from the corresponding real value. then, by inputting the training samples and residuals, a second tree is trained to reduce the previous residuals. the iteration stops when the predefined thresholds of parameters, such as max_depth and min_child_weight, are reached. the final predicted value is the sum of the predicted values of the previous predictors.
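the residual-fitting loop described above can be illustrated with a minimal gradient-boosting sketch in which each base learner is a one-split regression stump. this is a toy illustration on hypothetical data, not xgboost itself (which adds regularization and second-order gradient information):

```python
def fit_stump(x, r):
    """One-feature regression stump: find the split that minimizes squared
    error, predicting the mean residual on each side."""
    best = None
    for t in sorted(set(x)):
        left = [r[i] for i in range(len(x)) if x[i] <= t]
        right = [r[i] for i in range(len(x)) if x[i] > t]
        ml = sum(left) / len(left)
        mr = sum(right) / len(right) if right else 0.0
        sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    return best[1], best[2], best[3]

def boost(x, y, rounds=50, lr=0.3):
    """Each new stump is fitted to the residuals of the current ensemble;
    the final prediction is the (shrunken) sum of all stump predictions."""
    pred = [0.0] * len(y)
    for _ in range(rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]   # what is still unexplained
        t, ml, mr = fit_stump(x, resid)
        pred = [p + lr * (ml if xi <= t else mr) for p, xi in zip(pred, x)]
    return pred

# hypothetical 1-D data: the ensemble converges to the step pattern
pred = boost([1, 2, 3, 4], [10.0, 10.0, 20.0, 20.0])
```

each round shrinks the remaining residual by the learning-rate factor, so after enough rounds the summed predictions closely match the targets.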
after adding a base learner, the value of the objective function is calculated to ensure that it gradually decreases during the iteration process. xgboost can obtain much better performance than a single prediction algorithm because it corrects the prediction errors of preceding models in the iterative process. (2) random forest based on bagging. bagging is the most famous representative of ensemble learning. the core idea of bagging is to obtain an aggregated predictor by using a combination rule (tuysuzoglu and birant, 2020). in bagging, given a dataset containing m samples, one sample is randomly picked into the sampling set and then put back into the original dataset, so that the sample still has the possibility of being selected in the next sampling round. after m rounds of random sampling, a sampling set containing m samples is formed. by repeating this sampling t times, t datasets each containing m samples can be generated. a base learner is then trained on each sample set, and all base learners are combined in the end. random forest (rf), one of the most popular machine learning algorithms, is an improvement on bagging methods. different from bagging, where all features are considered for splitting a node, rf selects only a random subset of features and uses the best splitting feature from the subset to split each node in a tree. a problem with bagging is that decision trees can have a lot of structural similarities, since they choose which variable to split on using a greedy algorithm that minimizes error, resulting in a high correlation in their predictions. rf addresses this problem by generating sub-trees in a way that makes the resulting predictions from all of the sub-trees less correlated. rf contains several weak predictors that are trained through bagging and random variable selection. therefore, rf has good anti-noise and generalization capabilities.
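the bootstrap loop described above can be sketched in a few lines; for brevity each base learner here is just the mean of its bootstrap sample, standing in for a full decision tree (a toy illustration, not the rf implementation used in the paper):

```python
import random

def bagging_fit(data, t, seed=0):
    """Draw t bootstrap samples (m draws with replacement from the m
    original samples) and train one base learner per sample."""
    rng = random.Random(seed)
    learners = []
    for _ in range(t):
        sample = [rng.choice(data) for _ in data]   # sampling with replacement
        learners.append(sum(sample) / len(sample))  # "train" a mean predictor
    return learners

def bagging_predict(learners):
    """Combination rule: average for regression (majority vote would be
    used for classification)."""
    return sum(learners) / len(learners)

learners = bagging_fit([1.0, 2.0, 3.0, 4.0, 5.0], t=500)
```

the individual bootstrap means scatter around the true mean, but their aggregate is much more stable, which is the variance-reduction effect that makes bagging (and rf) robust to noise.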
Previous research has shown that RF can perform well in the field of load forecasting (Wang et al., 2018a; Morgenstern et al., 2016). For data-driven models, parameter optimization plays an important role in improving prediction accuracy, and a variety of parameter-tuning methods are available. For example, a series of relevant works have been carried out by Chen et al. (2020)'s team, who proposed an improved ant colony optimization for feature selection to identify optimal subsets (Zhao et al., 2014). In addition, several methods, such as an enhanced moth-flame optimizer, an improved whale optimization algorithm, an enhanced bacterial foraging optimization, and the chaos-enhanced gray wolf optimization (Zhao et al., 2019), have been developed and validated in different contexts such as medical diagnosis. Based on the application context, and following Wang et al. (2020a)'s method, GridSearchCV provided by scikit-learn (Pedregosa et al., 2011) was used for hyper-parameter tuning in this study. Understanding the drivers of building electrical load consumption is critical to the advancement of building energy systems and their management. According to the International Energy Agency (IEA) project report on total energy use in buildings, six types of factors influence the total energy consumption of buildings: climate, building envelope, building equipment, operation and maintenance, occupant behavior, and indoor environmental conditions (Yoshino et al.). Beyond the general characteristics of public buildings' energy consumption, healthcare buildings differ from other commercial buildings in important ways, including their continuous operation, a high percentage of space types with special indoor-environment requirements, and a set of extremely energy-hungry diagnostic equipment, such as X-ray, computed tomography (CT), and magnetic resonance imaging (MRI) machines (Koulamas et al., 2017).
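The GridSearchCV-based tuning mentioned above can be sketched as follows. This is a minimal example on synthetic data; the parameter grid and scoring choice are illustrative assumptions, not the grids actually used in the study (those are in the appendix):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5))
y = X[:, 0] - 2 * X[:, 1] + 0.1 * rng.standard_normal(200)

# Exhaustively evaluate every parameter combination with cross-validation
# and keep the combination with the best average score.
param_grid = {
    "n_estimators": [50, 100],
    "max_features": [2, 3],
}
search = GridSearchCV(RandomForestRegressor(random_state=0), param_grid,
                      cv=3, scoring="neg_mean_absolute_error")
search.fit(X, y)
best = search.best_params_  # e.g. {"max_features": ..., "n_estimators": ...}
```

The same pattern applies to SVR, XGBoost, and the other models compared later; only the estimator and grid change.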
Energy systems differ in complexity even from hospital to hospital, depending on several factors such as the type and volume of the buildings, the healthcare services offered, geographical location, and technological plants (Silvestro et al., 2017). In general, electricity consumption in hospitals comprises air-conditioning systems, lighting, electromedical devices, safety systems, information and communications technology (ICT) systems, and treatment equipment (Silvestro et al., 2017). Furthermore, each department of a hospital may have a different energy-use pattern. Taking the outpatient and inpatient departments as an example, the main function of the outpatient department is to receive and diagnose patients who do not need to be admitted for a hospital stay (Morgenstern et al., 2016); its main energy systems and equipment are therefore HVAC systems, lighting systems, and medical equipment. For the inpatient department, whose main function is to accommodate patients who need to be hospitalized, the main energy consumption is related to the HVAC system, lighting system, domestic hot water system, catering, and cooking appliances. However, the majority of energy prediction research on healthcare buildings focuses on useful floor area and number of beds, without taking healthcare occupancy types into consideration. As for general buildings, electrical load prediction models for healthcare buildings can be classified into physical and data-driven models. For example, using dynamic thermal simulation and computational fluid dynamics, Adamu et al. (2012) explored the effects of four alternative strategies on thermal comfort and heating load, using a new ward of the Great Ormond Street Hospital London as a case study. Ascione et al.
(2013) investigated the energy, environmental, and economic effects of building-envelope rehabilitation for healthcare facilities in Mediterranean climates via EnergyPlus and the DesignBuilder interface. Because collecting the detailed building information needed by physical models is complex and time-consuming, several data-driven models have been developed to predict the energy consumption of healthcare buildings. For example, Chen et al. (2005) used an ANN to predict a hospital air-conditioning system's energy use, taking into account temperature, relative humidity, the previous-hour electricity, the time of day, and some uncontrolled variables. Bagnasco et al. (2015) performed electrical load forecasting for the Cellini medical clinic of Turin by proposing a multi-layer perceptron ANN model with (1) the type of day, (2) the time of day, and (3) weather as input data. Thinate et al. (2017b) built a multiple linear regression model for load prediction of 45 large-scale hospital buildings in Thailand, taking six factors into account: air-conditioning area, non-air-conditioning area, in-patient department, out-patient department, staff number, and temperature. In Ruiz et al.'s (2017) study, three machine learning techniques (i.e., multilayer perceptron, the M5Rules algorithm, and a tree ensemble learner) were used to predict electrical energy consumption in a hospital in Granada. Despite the several aspects of energy use in healthcare facilities explored in previous studies, the main gap in extant research is the lack of consideration of healthcare occupancy variables. To address this gap, this study takes into account the occupancy of outpatients, emergency patients, and inpatients, and employs single and ensemble machine learning algorithms to predict the electric load demand of healthcare buildings.
The research methodology consists of module 1 and module 2, which develop independent machine learning models at different temporal granularities, daily and weekly, respectively, as shown in fig. 2. Electric load prediction for healthcare buildings includes three steps: (1) identify the relevant features and gather data, (2) train single and ensemble learning models with the prepared dataset, and (3) compare the prediction performance of the different models. The electricity data of modules 1 and 2 are first randomly divided into two parts, 80% for training and 20% for testing. The training dataset is used to train the machine learning models, and the testing dataset is used to evaluate the performance of the trained models. Machine learning is increasingly adopted in the big-data era to enable data-driven decision making; empirical evidence shows that companies that adopt data-driven decision making increase their efficiency and profitability significantly (Bohanec et al., 2017). Data mining and machine learning methods are capable of handling issues in the variables such as non-normality, correlation, missing values, and dependency; they can therefore produce better prediction accuracy by mapping the nonlinear relationship between inputs and outputs more comprehensively. Electrical load forecasting is naturally treated as a regression problem in machine learning, aiming to accurately predict the energy demand of buildings based on its relationship with a given set of independent input variables. Although machine learning techniques have produced good results in electrical load prediction for many types of buildings, electrical load prediction of healthcare buildings based on machine learning algorithms has been under-investigated, leaving a gap in knowledge on the most appropriate method for predicting the electrical load of healthcare facilities.
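The 80/20 split-train-evaluate workflow described at the start of this section can be sketched as follows. Synthetic data stand in for the hospital dataset, and RF stands in for any of the compared models:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.standard_normal(500)

# Randomly hold out 20% of the data; train on the remaining 80%.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

# Evaluate only on the held-out 20%.
r2 = model.score(X_test, y_test)
```

Because the split is random rather than chronological, each model sees samples from the whole period in both partitions.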
To test the forecasting performance of the different models, three performance measures are used: MAPE, CVRMSE, and NRMSE, defined as follows (Kadkhodaei et al., 2020):

MAPE = (100%/n) · Σ_{i=1}^{n} |(y_i − ŷ_i)/y_i|,
CVRMSE = 100% · sqrt((1/n) · Σ_{i=1}^{n} (y_i − ŷ_i)²) / ȳ,
NRMSE = 100% · sqrt((1/n) · Σ_{i=1}^{n} (y_i − ŷ_i)²) / (y_max − y_min),

in which (1) n represents the number of instances in the training or testing data; (2) ŷ_i and y_i represent the forecasted and real values, respectively; (3) ȳ represents the mean of the real values; and (4) y_max and y_min represent the maximum and minimum of the real values, respectively. For all three indicators, a smaller value indicates a better prediction performance of the forecasting model. The method was applied to predicting the energy consumption of Shanghai Tenth People's Hospital (STPH), a general hospital located in the Jing'an district of Shanghai, China. Established in 1910, STPH provides healthcare across broad areas of general and specialty care. Like other hospitals located in the megacities of China, STPH is extremely busy providing medical services for patients not only from local areas but also from the surrounding regions, owing to urban agglomeration (Wang et al., 2016a). As a result, STPH has experienced capacity strain over recent years: in 2018 it had 3.06 million person-times of outpatient and emergency visits and 100 thousand person-times of hospital discharges. The total floor area of STPH is 168,575 m² (fig. 3), comprising various departments, such as operating theaters, intensive care units, examination and treatment rooms, and large-scale medical equipment, resulting in an annual electricity expenditure of 19 million Chinese yuan. The electric power consumed in STPH covers space heating and cooling, ventilation, lighting, office equipment, and medical diagnostic facilities such as CT and MRI. Generally, hospital buildings can be seen as an integration of general-purpose building space and healthcare-specific building space.
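The three indicators defined at the start of this section can be computed directly from their standard definitions; a sketch (the function and variable names are ours):

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def cvrmse(y_true, y_pred):
    """Coefficient of variation of the RMSE: RMSE normalized by the mean, in percent."""
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return 100.0 * rmse / np.mean(y_true)

def nrmse(y_true, y_pred):
    """RMSE normalized by the range of the real values, in percent."""
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return 100.0 * rmse / (np.max(y_true) - np.min(y_true))
```

All three are scale-free, so they allow daily (module 1) and weekly (module 2) models to be compared despite the different magnitudes of their targets.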
Therefore, the factors influencing energy consumption consist of common factors related to weather parameters (i.e., outdoor temperature, relative humidity, wind speed, barometric pressure, and precipitation) (Wang et al., 2018a) and special factors related to occupancy, time, and operation & maintenance (O&M). For occupancy, three types of patients are considered: outpatients, emergency patients, and inpatients. Because hospitals usually follow different schedules according to the day type, a parameter indicating weekday versus weekend/holiday was also used as an input to the forecasting model. In addition, O&M measures describing the operational status of the central air-conditioning system are included to accommodate the changing indoor climate: the central air-conditioning system is turned on for heating or cooling and turned off in the transition seasons, such as spring and autumn. In total, 10 parameters are considered in building the load forecasting model, as listed in table 1. Owing to differences in geographic latitude, topography, and other conditions, the climate of China varies greatly from one location to another. To analyze buildings according to local climate conditions, the Chinese building climate is divided into severe cold, cold, hot-summer & cold-winter, hot-summer & warm-winter, and temperate areas (Wei et al., 2018). Shanghai, a typical city in the hot-summer & cold-winter area, has a long summer and a short winter, resulting in high energy usage for both cooling and heating. As illustrated in fig. 4, a U-shaped relationship exists between outdoor temperature and daily electricity consumption in Shanghai. In this study, the weather data were collected from a weather station in the Baoshan district, whose latitude and longitude are close to those of STPH. The weather station hosts a complete set of weather sensors, including temperature, humidity, wind speed, air pressure, and precipitation.
In this study, the weather data were downloaded on a daily-average basis from the department's web server. Generally, healthcare facilities can be classified into different categories based on management and ownership type, type of care provided, facility size, patient type, etc. (Ahmed et al., 2015). For healthcare buildings in Chinese hospitals, the main classification is according to the type of patients, e.g., outpatient buildings and inpatient buildings. For example, the main components of inpatient buildings are patient bedrooms and the supporting spaces, e.g., nurses' rooms, storage space, and possibly food-heating facilities (Morgenstern et al., 2016). The occupancy of hospitals therefore mainly consists of three types of patients, i.e., outpatients, emergency patients, and inpatients. To ensure careful control of the indoor climate, healthcare buildings consume more energy than other types of commercial buildings, since occupants in hospitals are more sensitive to the physical environment. Occupancy fluctuations complicate decisions concerning a variety of auxiliary services, including physical therapy, laboratory tests, surgical services, pharmacy, and housekeeping, and may ultimately impact energy consumption (Littig and Isken, 2007). The daily occupancy data were collected from the healthcare information system (HIS) of Shanghai. Fig. 5 shows the normalized occupancy and load of STPH over one month. It can be seen that the numbers of inpatients and emergency patients are small compared with that of outpatients, and that both the number of outpatients and the electricity consumption reach their lowest levels on Sundays. According to Lusis et al.
(2017), the prediction performance of electrical load forecasting can be improved by taking calendar effects into account, because they capture the change in energy consumption patterns over different calendar periods, such as daily or weekly consumption. To investigate the impact of the interaction between healthcare buildings and calendar variables on electrical load prediction, a dummy variable was added to distinguish between weekdays and weekends/holidays. For STPH, "weekday" usually refers to Monday through Saturday, with only Sunday counted as the weekend. In addition, the calendar from the hospital's official website was used to determine the holiday schedule, and all holidays and Sundays were combined into one category. The relationship between day type and daily electrical load was statistically analyzed, and the resulting violin plot is shown in fig. 6: weekdays have higher load levels than weekends/holidays. One possible explanation is that on weekdays all departments are in normal operation, whereas on weekends/holidays only a small number of departments, e.g., the inpatient department, remain open. Currently, there are four types of air-conditioning terminals in STPH: the central air-conditioning system, variable refrigerant volume (VRV) unit systems, large-sized air-cooled heat pump units, and split air-conditioning systems. Only the split air-conditioning systems can be operated at will; the other three types do not operate unless the outdoor temperature is higher or lower than an established threshold. According to the schedule of STPH (table 2), the central air-conditioning system does not operate unless the outdoor temperature is higher than 30 °C in summer or lower than 8 °C in winter, which means that the electricity consumption is not linear with respect to the outdoor temperature and depends on the operation & maintenance policy.
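The two indicator variables described above can be sketched as follows. The holiday date and the numeric encodings are illustrative assumptions; only the Sunday rule and the 30 °C / 8 °C thresholds come from the text:

```python
import pandas as pd

# Day-type dummy: for STPH, Monday-Saturday count as weekdays; Sundays and
# holidays share one label. The holiday below is a hypothetical example of an
# entry taken from the hospital's calendar.
dates = pd.date_range("2017-01-01", "2017-01-14", freq="D")
holidays = {pd.Timestamp("2017-01-02")}
day_type = ((dates.dayofweek == 6) | dates.isin(holidays)).astype(int)

# O&M categorical variable for the central air-conditioning system.
# Encoding (1 = cooling, 2 = heating, 0 = off in transition seasons) is our choice;
# the temperature thresholds follow the hospital's schedule.
def ac_status(outdoor_temp_c: float) -> int:
    if outdoor_temp_c > 30:
        return 1
    if outdoor_temp_c < 8:
        return 2
    return 0
```

In practice the real holiday list would replace the hypothetical one, and `ac_status` would be evaluated against the daily mean outdoor temperature.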
Generally, these central air-conditioning systems have two working patterns, cooling and heating; during the transition seasons (i.e., spring and autumn) they are turned off. In this study, a categorical variable representing the operational status of the air-conditioning systems is introduced (fig. 7). As fig. 7 shows, the electrical load of the whole hospital is highest when the central air-conditioning system is operating for cooling, followed by heating, and lowest when it is turned off. In this experiment, the building electrical load data were acquired every hour from the intelligent-building energy support system (I-BESS) developed by the Shanghai Hospital Development Center (SHDC), the administrative agency for the investment, management, and operation of 38 municipal public hospitals in Shanghai, including STPH. Building electrical load data from January 1, 2017, to December 31, 2018 (a total of 730 days) were collected and used for training and testing; the time series is shown in fig. 8. In this section, the prediction performance of different single and ensemble machine learning algorithms is compared. Within the single machine learning category, linear regression, lasso regression, ridge regression, elastic net, support vector regression (SVR), and Gaussian process regression are selected. Linear regression is a statistical method for analyzing the linear relationship among multiple variables and the simplest single machine learning algorithm (Shine et al., 2018). Regularization techniques are applied to prevent overfitting by adding a regularization term to the loss term.
The regression model that uses an L1 regularization term (i.e., the sum of absolute coefficient values) is lasso regression, and the model that uses an L2 regularization term (i.e., the sum of squared coefficients) is ridge regression. Elastic net combines the regularization terms of lasso and ridge regression to overcome their dependency on the data and the resulting instability, obtaining the best of both models. SVR is a powerful and versatile machine learning algorithm capable of regression tasks, and the Gaussian process is a generalization of nonlinear multivariate regression (Heo and Zavala, 2012). In the ensemble machine learning category, XGBoost and random forest are popular energy prediction models that build decision trees sequentially and in parallel, respectively. The optimal parameters of the above-mentioned algorithms, found via the GridSearchCV function, are presented in the appendix. The MAPE, CVRMSE, and NRMSE of the tested hospital for the different forecast models over the entire training and testing period are summarized in table 3. Because the models' rankings under all three performance indicators are the same, MAPE is used to illustrate the comparison. Among the single learning models, SVR was the most accurate, with a MAPE of 10.67%, followed by Gaussian process regression and the remaining four models (i.e., linear regression, lasso regression, ridge regression, and elastic net), which had similar MAPEs. Among the ensemble learning models, RF performed slightly better than XGBoost, with MAPEs of 9.64% and 9.81%, respectively. When comparing single models with ensemble models, the prediction performance of the ensemble models was consistently better, which can be ascribed to the way ensemble models are constructed.
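The different behavior of the L1 and L2 penalties described above can be illustrated on synthetic data in which only the first feature carries signal; a sketch, with arbitrary penalty strengths:

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, Ridge

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 10))
y = 3 * X[:, 0] + 0.1 * rng.standard_normal(100)  # only feature 0 matters

# L1 (lasso): drives irrelevant coefficients exactly to zero.
lasso = Lasso(alpha=0.5).fit(X, y)

# L2 (ridge): shrinks all coefficients but keeps them nonzero.
ridge = Ridge(alpha=0.5).fit(X, y)

# Elastic net: a weighted mix of both penalties.
enet = ElasticNet(alpha=0.5, l1_ratio=0.5).fit(X, y)
```

Inspecting `lasso.coef_` versus `ridge.coef_` shows the qualitative difference: lasso performs implicit feature selection, while ridge only shrinks.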
To investigate the prediction performance of the proposed models over longer timespans, weekly electrical load prediction was also tested, corresponding to module 2 of the methodology; in healthcare facilities management practice, facility managers very often create weekly operation schedules for power systems. As presented in section 4, four types of input variables and one output variable were used in the daily electrical load prediction (i.e., module 1). For weekly data, the day-type variable is not needed; the weekly models were therefore trained with three types of input variables, namely weather, occupancy, and O&M-related variables. The weather variables (outdoor temperature, relative humidity, wind speed, pressure, and precipitation) were derived by averaging the daily data over the whole week; the occupancy variables (the numbers of outpatient visits, emergency visits, and inpatient visits) were derived by summing the daily data over the week; and the O&M variable values were identified according to the operation schedule (table 2). For the output variable, the weekly load of STPH, the sum over each week was used (fig. 9). The comparison of weekly electrical load prediction by single and ensemble machine learning algorithms is shown in table 4. In the single machine learning category, SVR outperformed the remaining five models by a large margin, with a MAPE of 10.77%. For the ensemble learning models, the MAPEs of XGBoost and RF were 11.12% and 10.69%, respectively, indicating that RF is again slightly better than XGBoost. Comparing the single and ensemble models, the best single model (i.e., SVR) performed close to the best ensemble model (i.e., RF), with the ensemble model still outperforming the single model.
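The daily-to-weekly aggregation described above (weather averaged, occupancy and load summed) can be sketched with pandas resampling; the column names and synthetic values are illustrative:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
idx = pd.date_range("2017-01-02", periods=28, freq="D")  # four Monday-Sunday weeks
daily = pd.DataFrame({
    "temperature": rng.uniform(0, 10, 28),         # weather: averaged over the week
    "outpatients": rng.integers(1000, 3000, 28),   # occupancy: summed over the week
    "load_kwh": rng.uniform(40000, 60000, 28),     # output: summed over the week
}, index=idx)

# Resample into weeks ending on Sunday, applying a per-column aggregation rule.
weekly = daily.resample("W-SUN").agg({
    "temperature": "mean",
    "outpatients": "sum",
    "load_kwh": "sum",
})
```

The same frame would then feed the module 2 models, with the O&M column added per the operation schedule.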
Traditionally, it is held that ensemble models always outperform single models. However, the performance of a prediction model depends on several factors, such as the input variables, the selected models, and the model parameters, and the results of module 2 show that in certain situations a single model can achieve performance similar to that of ensemble models. Owing to their relatively high performance, SVR, XGBoost, and RF were selected as the base models for the subsequent analysis. Comparing the results of modules 1 and 2 for SVR, XGBoost, and RF shows that the prediction models trained with daily data perform better than those trained with weekly data. This comparison indicates that dividing the data at a finer level of time granularity can improve the prediction performance of a machine learning model; the authors therefore suggest that, when there are different operating patterns in different periods, data collected at a finer level of time granularity are necessary to improve the prediction accuracy of building energy consumption. Among all the machine learning models, single and ensemble, RF achieved the highest prediction accuracy, so RF was used for the subsequent feature-importance analysis. Figs. 10 and 11 depict the feature-importance results for modules 1 and 2, respectively. The top three most influential features in both modules were the same (i.e., outdoor temperature, pressure, and the operational status of the central air-conditioning system), indicating that the daily and weekly electrical load consumption are highly correlated with the same factors. Given the high energy use related to space heating, cooling, and ventilation loads, it was expected that outdoor temperature would be the most important driver of energy consumption in hospital buildings.
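The RF feature-importance analysis described above can be sketched as follows. The synthetic target deliberately gives temperature a U-shaped effect, echoing fig. 4; the feature names and coefficients are illustrative assumptions:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(5)
n = 500
X = pd.DataFrame({
    "temperature": rng.uniform(-5, 38, n),
    "pressure": rng.uniform(990, 1030, n),
    "day_type": rng.integers(0, 2, n),
})
# Synthetic load: strong U-shaped temperature response, weak pressure effect.
y = (X["temperature"] - 18) ** 2 + 0.5 * X["pressure"] + 2 * rng.standard_normal(n)

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
importance = pd.Series(rf.feature_importances_,
                       index=X.columns).sort_values(ascending=False)
```

`feature_importances_` measures each feature's average impurity reduction across all trees and sums to 1, which is what figs. 10 and 11 report for the real variables.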
The operational status of the central air-conditioning system (i.e., ope), ranked third in importance, depends on the specific O&M measures of the tested hospital: given the same weather conditions, hospitals taking different O&M measures will consume different amounts of energy. For example, some hospitals decide to operate the central air-conditioning system based on the outdoor temperature, while others base that decision solely on the day of the year. The remaining features follow different importance patterns in the daily and weekly load predictions. Fig. 10 shows that, for hospital buildings, day type was the least influential feature, mainly owing to the around-the-clock operation of hospitals. In this experiment, eight machine learning models were compared for load prediction, comprising six single learning models and two ensemble learning models. The single prediction methods cover several AI-based prediction models, i.e., linear regression, ridge regression, lasso regression, elastic net, SVR, and Gaussian process regression. For ensemble machine learning, two of the most well-known models, XGBoost and RF, were selected to construct the load prediction model of the tested hospital. The empirical results indicate that the most accurate models in the single and ensemble learning categories were SVR and RF, respectively; the prediction results for SVR, XGBoost, and RF are shown in figs. 12 and 13. Ensemble models outperformed single models in daily load prediction of the tested hospital, demonstrating their ability to improve on the prediction performance of single models when trained with data of finer granularity. The energy consumption patterns of healthcare buildings are characterized by uninterrupted operation and high energy-use intensity, and the input variables in this study, such as weather, occupancy, day type, and O&M variables, are readily accessible to healthcare facility managers.
It is therefore feasible for them to conduct daily electrical load forecasting by adopting the machine learning algorithms used in this study, especially the ensemble models XGBoost and RF; SVR is also an excellent single-model alternative that has been recommended by various previous studies (Thinate et al., 2017b; Shao et al., 2020). As discussed in section 2, a large number of studies have focused on building load forecasting over the past decades; however, few of them address healthcare buildings, and far fewer address electrical load forecasting of healthcare buildings via machine learning algorithms. Yet healthcare buildings are characterized by energy intensiveness, owing to the constant need for available power for medical equipment and stringent air-quality requirements, making them a key area for achieving energy conservation goals in the building domain. In this study, the best prediction performance was a MAPE of 9.64% (table 3). CVRMSE indicates how much variation or randomness there is between the data and the model; for hourly and monthly load prediction, the CVRMSE limits recommended by the American Society of Heating, Refrigerating, and Air-Conditioning Engineers (ASHRAE) are 30% and 15%, respectively (Landsberg et al., 2014). On a daily basis, the CVRMSE values of the ensemble models in our experiment were within 15%, indicating that the accuracy of our models is acceptable. In particular, with the occupancy and O&M variables of the proposed procedure, the CVRMSE of XGBoost in daily load prediction was 10.11% on the training set and 12.81% on the testing set, an improvement over the XGBoost model of Wang et al. (2020a), whose CVRMSE was 14.2% on the training set and 21.1% on the testing set.
Although the electrical load of healthcare buildings is affected by many factors, such as climate conditions, building envelope, energy systems, and occupant behavior, these factors clearly influence the electrical load with different weights. In this paper, outdoor temperature appears to be the most important feature in both module 1 and module 2, consistent with many previous studies (e.g., Thinate et al., 2017b): in complex buildings such as hospitals, the air-conditioning system, whose main function is to maintain a comfortable and constant indoor climate against the outdoor climate, can account for 67% of total energy consumption. Pressure came after temperature and also has a great impact on energy performance. The importance of the third feature, the O&M-related variable, is more arguable. In most situations, the operational status of the air-conditioning system follows the change in climate conditions, operating in heating mode when the outdoor temperature is low and in cooling mode when the outdoor temperature is high; however, in the majority of public hospitals in China, the central air-conditioning system follows an operation schedule with flexible heating and cooling periods. In addition, the three types of patients taken into consideration also had relatively high importance for electrical load consumption. With respect to day type, the differentiation between weekdays and weekends/holidays appears to have little effect on the prediction performance of these models. The rankings of these features differ, however, between the daily and weekly load predictions; one possible explanation is that the variables have different patterns in daily and weekly operations.
For example, the number of outpatients is high on weekdays and low on Sundays, since the outpatient department closes on Sundays, which results in a significant increase in emergency patients on Sundays. The experiments were performed on a machine with an Intel i7-8550U 2.0 GHz CPU and 8 GB of memory, running Windows 10; all algorithms were implemented in Python. Table 5 lists the data size and computation time of the different algorithms used in modules 1 and 2 (module 1 used the daily load dataset and module 2 the weekly load dataset). The results show that the computation times of the ensemble learning algorithms were considerably greater than those of most single learning algorithms, with the exception of Gaussian process regression. The likely reason for the long computation time of Gaussian process regression is that it infers a probability distribution over all possible values. As ensemble learning algorithms, both XGBoost and RF combine the results of several base learners, resulting in relatively long computation times; however, according to Wang et al. (2018b), prediction accuracy should be given first priority in such problems, and overall the computation times of the ensemble algorithms were acceptable. Comparing the computation times of modules 1 and 2 reveals the influence of data size: all models trained with the daily dataset took longer than their counterparts trained with the weekly dataset, indicating that an increase in data size increases computation time. However, the difference in computation time of XGBoost and RF between modules 1 and 2 was acceptable, and it is reasonable to infer that the ensemble learning algorithms would remain practically efficient with large volumes of data (e.g., big data).
In recent years, studies on energy analysis and consumption optimization in hospitals have been conducted worldwide, e.g., in the U.S. (Bawaneh et al., 2019), China (Ji and Qu, 2019), and Germany (González et al., 2018). However, limited by the availability of reliable, fine-granularity data, most of them offer descriptive analysis at a macro level. This paper makes several contributions to model design. Focusing specifically on hospital modeling, the authors developed an electrical load prediction model for healthcare buildings using single and ensemble models. Although it is widely believed that ensemble models deliver higher prediction performance than single models by combining multiple individual models, this is not guaranteed and needs to be tested context by context, and few empirical studies have tested it in the healthcare-facility energy prediction context. Using the data of a general hospital in Shanghai, the experimental results of this paper confirm that ensemble models indeed outperform single models on almost all measures. In addition, the scope of the independent variables employed in this study is broader than in the majority of previous studies: not only well-known energy variables such as weather and day type were included, but the impact of occupancy and O&M variables on prediction performance was also explored, bringing the model context closer to the actual operation of hospital buildings. Furthermore, this study identified the most important features affecting energy consumption in the studied hospital, which can give healthcare facility managers insights into how to create O&M schedules more appropriately. Finally, by training the electrical load prediction models with daily and weekly data, the study shows that load prediction is more accurate at a finer time granularity.
this research has two direct practical applications. the first and most important is that the proposed model can provide accurate forecasting results and will assist healthcare facility managers in planning hospital activities and energy sources in a timely manner. it can also serve as the basis for other relevant tasks, such as fault diagnosis and energy benchmarking. with the aim of supporting administrators or facility managers of hospitals in making decisions when pursuing energy conservation goals, the prediction models are developed at the hospital level rather than the building or floor level. in addition, this paper illustrates one possible way to improve energy management performance by taking advantage of multi-sourced, heterogeneous data. for example, the four types of independent variables and the dependent variable involved in this study were gathered through different methods and tools, such as a weather station, the healthcare information system (his), and smart meters. with the increasing popularity of big data and artificial intelligence, more effort will be devoted to collecting data in healthcare facility operations using multiple technologies. from this perspective, the methodology used in this paper can be applied effectively not only in energy management systems but also in other fields of healthcare facilities management in the future. in this study, eight algorithms from two load prediction approaches (single and ensemble machine learning) were compared, which has seldom been done systematically in previous studies. the ensemble models outperformed single models in electrical load prediction for the tested hospital, demonstrating their ability to support the decision-making processes of hospital facility managers.
in addition, both general characteristics (e.g., weather parameters) and function-specific characteristics (e.g., the number of patients) were considered when building the prediction model, bringing the model context closer to the actual operating situation of hospital buildings. although the proposed model has satisfactory predictive performance, the following limitations are acknowledged. the proposed method is data-driven by nature, meaning the prediction performance depends strongly on the quality of the data. even though rigorous data preprocessing was conducted before the formal experiment, some noise may remain in the data. thus, the characteristics of data collected from a specific hospital may affect accuracy, and it would be better to test the performance of the proposed procedure across several hospitals. moreover, hospital occupancy comprises several types of people, including doctors, nurses, patients and their visitors, facility managers, etc.; in this study, due to data availability, only three types of patients were included. further research could analyze the effect of occupancy more deeply by taking all of them into consideration. healthcare facilities are a vital part of social and medical organizations, providing society with needed healthcare. they are also the main battlefield of the public health emergency system when fighting a pandemic such as coronavirus disease 2019 (covid-19). serving as the lifeline for all activities, energy consumption plays an important role in ensuring the efficient operation of healthcare facilities. predicting the electrical load of healthcare buildings accurately can support the decision-making processes of healthcare facility managers to improve building energy performance. this study compared the precision of eight machine learning models trained with daily and weekly datasets from a general hospital in shanghai.
it was found that for daily electrical load prediction, rf, xgboost, and svr were the most accurate ensemble and single learning models, with mapes on the test dataset of 9.64%, 9.81%, and 10.67%, respectively. these can be considered reliable intelligent prediction models for facility/energy managers conducting daily electrical load prediction in the future. in addition, based on the same dataset, the impact of forecasting granularity on prediction performance was tested. by comparing the results of daily and weekly load prediction, it was found that performance improves when more granular, detailed energy usage data are used. one possible explanation is that a detailed load profile decreases the uncertainty in the computation. in terms of feature importance, outdoor temperature, air pressure, and the operating status of the central air conditioning system were found to be the top three contributing factors. the authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. this study was supported by the international exchange program for graduate students of tongji university, china (no. 201902024), and the support program for young and middle-tech leading talents of tongji university ''operation management of complex and mega projects'', china. shanghai hospital development center, china, and shanghai tenth people's hospital, china, are sincerely acknowledged for providing the original data.
[hyperparameter table residue: ''the number of features to consider when looking for the best split'', search range (1,11,1), selected values 5 and 3]

references (titles recovered from extraction; authors, years, and most venues were lost):
performance evaluation of natural ventilation strategies for hospital wards - a case study of great ormond street hospital
a classification of healthcare facilities: toward the development of energy performance benchmarks for day surgery centers in australia
a review of data-driven building energy consumption prediction studies
rehabilitation of the building envelope of hospitals: achievable energy savings and microclimatic control on varying the hvac systems in mediterranean climates
electrical consumption forecasting in hospital facilities: an application case
energy consumption analysis and characterization of healthcare facilities in the united states
explaining machine learning models in sales predictions
enhancing building energy efficiency by adaptive façade: a computational optimization approach
short-term electricity forecasting of air-conditioners of hospital using artificial neural networks
an enhanced bacterial foraging optimization and its application for training kernel extreme learning machine
solar powered healthcare in developing countries
research on short-term and ultra-short-term cooling load prediction models for office buildings
commercial buildings energy consumption survey: energy usage summary
dilution effect of the building area on energy intensity in urban residential buildings
electrical and thermal energy in private hospitals: consumption indicators focused on healthcare activity
improving customer satisfaction of a healthcare facility: reading the customers
evaluation of energy consumption in german hospitals: benchmarking in the public sector
gaussian process modeling for measurement and verification of building energy savings
investigation and evaluation of energy consumption performance for hospital buildings in china
a bottom-up and procedural calibration method for building energy simulation models based on hourly electricity submetering data
hboost: a heterogeneous ensemble classifier based on the boosting method and entropy measurement
choosing measures for energy efficient hospital buildings
measurement of energy, demand, and water savings
designing a short-term load forecasting model in the urban smart grid system
development of a conceptual benchmarking framework for healthcare facilities management: case study of shanghai municipal hospitals
a data-driven strategy for detection and diagnosis of building chiller faults using linear discriminant analysis
occupancy data analytics and prediction: a case study
short term hospital occupancy prediction
short-term residential load forecasting: impact of calendar effects and forecast granularity
realizing a stacking generalization model to improve the prediction accuracy of major depressive disorder in adults
benchmarking acute hospitals: composite electricity targets based on departmental consumption intensities? energy build.
world energy balances: overview
online bagging and boosting
development of a new energy benchmark for improving the operational rating system of office buildings using various data-mining techniques
space heating demand response potential of retrofitted residential apartment blocks
scikit-learn: machine learning in python
hourly prediction of a building's electricity consumption using case-based reasoning, artificial neural networks and principal component analysis
energy consumption modeling by machine learning from daily activity metering in a hospital
prediction of energy consumption in hotel buildings via support vector machines
machine-learning algorithms for predicting on-farm direct water and electricity consumption on pasture based dairy farms
energy efficient policy and real time energy monitoring in a large hospital facility: a case study
a hybrid model for building energy consumption forecasting using long short term memory networks
energy performance study in thailand hospital building
new feature selection and voting scheme to improve classification accuracy
enhanced bagging (ebagging): a novel approach for ensemble learning
chaotic multi-swarm whale optimizer boosted support vector machine for medical diagnosis
building thermal load prediction through shallow machine learning and deep learning
forecasting district-scale energy dynamics through integrating building network and long short-term memory learning algorithm
building energy efficiency for public hospitals and healthcare facilities in china: barriers and drivers
a review of artificial intelligence based building energy use prediction: contrasting the capabilities of single and ensemble prediction models
artificial intelligent models for improved prediction of residential space heating
random forest based hourly building energy prediction
a study of city-level building energy efficiency benchmarking system for china
enhanced moth-flame optimizer with mutation strategy for global optimization
a review and analysis of regression and machine learning models on commercial building electricity load forecasting
total energy use in buildings analysis and evaluation methods - final report of iea ebc annex
feature selection based on improved ant colony optimization for online detection of foreign fiber in cotton
a review on the prediction of building energy consumption
chaos enhanced grey wolf optimization wrapped elm for diagnosis of paraquat-poisoned patients
using long short-term memory networks to predict energy consumption of air-conditioning systems

see tables a.1-a.3. see tables b.1-b.3.

key: cord-308219-97gor71p authors: elzeiny, sami; qaraqe, marwa title: stress classification using photoplethysmogram-based spatial and frequency domain images date: 2020-09-17 journal: sensors (basel) doi: 10.3390/s20185312 sha: doc_id: 308219 cord_uid: 97gor71p

stress is subjective and is manifested differently from one person to another; thus, the performance of generic models that classify stress status is crude. building a person-specific model leads to reliable classification, but it requires collecting new data to train a new model for every individual and needs periodic updates because stress is dynamic. in this paper, a new binary classification approach (stressed vs. non-stressed) for a subject's stress state is proposed, in which the inter-beat intervals extracted from a photoplethysmogram (ppg) are transformed into spatial images, and then into frequency domain images, according to the number of consecutive inter-beat intervals. a convolutional neural network (cnn) was then used to train and validate the classification of the person's stress state.
three types of classification models were built: person-specific models, generic classification models, and calibrated-generic classification models. the average classification accuracies achieved by the person-specific models were 99.9%, 100%, and 99.8% (spatial images) and 99.68%, 98.97%, and 96.4% (frequency domain images) for the training, validation, and test sets, respectively. by adding 20% of the samples collected from the test subjects to the training data, the calibrated generic models improved on the generic models' performance for both the spatial and frequency domain images. average classification accuracies of 99.6%, 99.9%, and 88.1% (spatial) and 99.2%, 97.4%, and 87.6% (frequency domain) were obtained for the training, validation, and test sets, respectively, using the calibrated generic classification method on the series of inter-beat interval (ibi) images. the main contribution of this study is the use of frequency domain images, generated from the spatial domain images of the ibis extracted from the ppg signal, to classify the stress state of the individual by building person-specific models and calibrated generic models. stress is a mental, emotional, and physical reaction experienced when a person perceives demands that exceed their ability to cope. the two common forms of stress are acute stress and chronic stress. acute stress is a short-term form caused by recent past and near-future demands, events, or pressures. money worries, losing a job, causing an accident, taking an exam, the death of a close family member, serious injury, or attending an interview can cause acute stress disorder. recovery typically requires a relief technique to relax, such as breathing exercises, getting outdoors, or muscle relaxation.
in contrast, chronic stress is a long-term form resulting from prolonged and repeated exposure to stressors, and it can lead to more severe health problems if not handled adequately [1-3]. chronic stress weakens the body's immune system, leading to several mental and physical illnesses such as depression and cardiovascular diseases [4]. prior work has investigated convolutional networks in which training is conducted in the fourier domain; the results indicated that fourier-domain convolution yields a speed-up without affecting the accuracy of image classification [26]. faster and more accurate image classification was obtained by the fourier-based convolutional neural network (fcnn). quan liu et al. designed cnn models to predict a depth of anesthesia (doa) indicator for patients from the eeg-based spectrum; the model achieved 93.5% accuracy and can provide physicians with measures to prevent the influence of patient and anesthetic drug differences [27]. koln et al. developed several neural networks to classify images in the fourier domain in order to visualize the patterns learned by the networks, and they identified the regions that are important for classifying particular objects [28]. frequency domain features are as important for image classification as spatial features, especially as the spatial resolution increases [29]. lin et al. classified pixels in frequency domain infrared microscopic images into human breast cell and non-cell categories by k-means clustering [30]. however, perceived stress is very subjective and is expressed differently by different people. a generic model can classify stress status for an unseen person, but the stress classification model needs personalization due to differences in individual stress responses and coping ability. moreover, a stressful situation for one individual may not be an issue for another, and females, in general, report higher levels of stress than males.
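the fourier-domain training exploited by [26] rests on the convolution theorem: convolving in the spatial domain equals a pointwise product of spectra in the frequency domain. a minimal numpy check of that identity (illustrative only, not the fcnn of [26]):

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.normal(size=(32, 32))   # toy "feature map"
ker = rng.normal(size=(5, 5))     # toy convolution kernel

# direct full 2-d convolution via shifted adds: out[p,q] = sum ker[i,j] * img[p-i, q-j]
H, W = img.shape
kh, kw = ker.shape
direct = np.zeros((H + kh - 1, W + kw - 1))
for i in range(kh):
    for j in range(kw):
        direct[i:i + H, j:j + W] += ker[i, j] * img

# fourier-domain convolution: pointwise product of zero-padded 2-d ffts
shape = (H + kh - 1, W + kw - 1)
fourier = np.fft.ifft2(np.fft.fft2(img, shape) * np.fft.fft2(ker, shape)).real

print(np.allclose(direct, fourier))  # True
```

for large inputs the fft route costs o(n log n) per dimension instead of the direct o(n k) sliding product, which is the source of the speed-up the cited work reports.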
likewise, there exist differences in stress vulnerability, reactivity, resilience, and responsiveness to threatening events. therefore, building a person-specific classification model is significant [31-33]. martin et al. found that student-specific models yielded better results than general and cluster-specific classification models for perceived stress detection in students using smartphone data [34]. kizito et al. proposed a hybrid stress prediction method, which increased the generic model's accuracy from 42.5% to 95.2% by adding 100 person-specific samples to the data used to train the generic model; they tested the new approach on two different datasets and found that the calibrated stress detection model outperformed the generic one. jing et al. proposed a new classification model for the driver's stress level using ibi images from the ecg signal and a cnn. they compared the accuracy of this approach with an ann method using time-domain features (mean ibi, root mean squared difference of adjacent ibis (rmssd), and standard deviation of ibis (sdnn)), and found the new approach more accurate than the ann method, which has been used frequently in recent research [35]. in this study, a new stress classification approach is proposed to classify the individual stress state as stressed or non-stressed by converting spatial images of the inter-beat intervals of a ppg signal to frequency domain images and using these images to train several cnn models. three types of stress classification models were used: person-specific models, generic models, and calibrated-generic models, taking into account intra-individual and inter-individual differences. the accuracy of the proposed models (person-specific and calibrated generic) showed the potential of using frequency domain images in stress detection.
our binary classification approach can be applied to classify an individual's daily-life stress state as stressed or non-stressed using inter-beat interval (ibi) data. moreover, it can be used to monitor a person's psychological wellbeing in everyday life and trigger clinical intervention when acute stress states are detected too frequently in a specific patient; this could prompt the clinician to look for lifestyle-related issues at the origin of the stress. this paper is structured as follows. section 2 describes the dataset used in this research, and section 3 describes the proposed image-based stress detection model. in section 4, the results for the proposed models are discussed. section 5 states the findings of this research. wearable stress and affect detection (wesad) is a publicly available dataset that contains motion and physiological data recorded from chest- and wrist-worn devices, together with self-reports, for 15 subjects in laboratory settings during three conditions (baseline, amusement, and stress) [36]. the wesad multimodal data were used for this study. the trier social stress test (tsst) was implemented to induce psychological stress [37]. the tsst is a procedure that induces acute social stress in a laboratory environment: public speaking is followed directly by a mental arithmetic task in the same session, both delivered in front of an interview panel, and both introducing novelty and uncontrollability [38]. in the baseline session, the subjects were given a neutral magazine to read for 20 min, and they watched a set of funny movies for amusement. in the stress condition, they were exposed to public speaking and mental arithmetic tasks: the participants delivered a five-minute speech in front of a panel and were then asked to count down from 2023 to zero in steps of 17. a repeat count was mandated for any mistake made in the course of the counting exercise.
for meditation, the subjects performed a controlled breathing exercise, during which ppg, ecg, emg, eda, skin temperature, acceleration, and respiration signals were recorded using the respiban professional and the empatica e4. the respiban recorded ecg, emg, eda, temp, acc, and resp data sampled at 700 hz; the e4 recorded eda (4 hz), acc (32 hz), bvp (64 hz), and temp (4 hz). the data collection was conducted in a laboratory setting. in this study, the ibi sequence provided by the empatica e4 wristband was used. the ibi is computed by a proprietary empatica algorithm that detects heartbeats from the bvp signal and calculates the lengths of the intervals between adjacent beats. in the empatica e4, the bvp signal is collected by a ppg sensor using a proprietary algorithm that combines the light signals detected during exposure to the red and green lights, with a 64 hz sampling rate. the ibi data file consists of two columns: a timestamp and the duration of the detected beat interval. incorrect peaks caused by noise in the bvp signal were removed from the file [39,40]. the ibi data for the public speaking and mental arithmetic tasks were combined to form the stress class, in order to build a binary classification model that classifies the stress state of a person into two categories: stressed or non-stressed. the ibi is a significant cardiac measure, used to detect stress and indicate the emotional state of the individual [17,41]. in this paper, the entire time range of the ibi data extracted from the ppg signals was divided into intervals according to their distribution, and n × m matrices encode the inter-beat interval distribution. spatial images were then generated from the extracted matrices and converted to frequency domain images for the stress classification models, which use deep convolutional neural networks. the output of the classification model is the stress state of the person (stressed or non-stressed), as shown in figure 1.
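the ibi computation itself is simple once beat timestamps are available: each interval is the difference between adjacent beat times, with artifacts outside a plausible range dropped. a small sketch (the beat timestamps are hypothetical; empatica's actual beat-detection algorithm is proprietary):

```python
import numpy as np

# hypothetical beat timestamps (seconds) from a ppg peak detector
beat_times = np.array([0.00, 0.82, 1.61, 2.45, 2.52, 3.30, 4.08])

ibi = np.diff(beat_times)                 # inter-beat intervals in seconds
valid = ibi[(ibi >= 0.6) & (ibi <= 1.2)]  # drop artifacts outside 0.6-1.2 s

print(np.round(ibi, 2).tolist())    # [0.82, 0.79, 0.84, 0.07, 0.78, 0.78]
print(np.round(valid, 2).tolist())  # [0.82, 0.79, 0.84, 0.78, 0.78]
```

the 0.07 s "interval" comes from a spurious peak at 2.52 s and is the kind of noise-induced value that is removed from the ibi file.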
an image can be represented as a 2d matrix in which each element is a pixel intensity; the intensity distribution of the image is called the spatial domain. for a colored image, the spatial domain can be described as a 3d vector of 2d matrices that contain the intensities of the rgb colors. first, abnormal values outside the normal ibi range (0.6-1.2 s) were removed. then, descriptive statistics were calculated, such as the range and the minimum and maximum values, and the time range of the inter-beat intervals was divided into 28 intervals according to the distribution of the inter-beat intervals, as discussed in [35]. second, an n × 1 column vector was created for each inter-beat interval, with 1 assigned to the interval in which the inter-beat interval falls and 0 to the remaining elements. third, an n × m matrix was formed by concatenating m consecutive column vectors, and the output matrix was transferred to a 28 × 28 pixel spatial domain image using matlab. a sliding window of size 28 was moved along the columns only, as shown in figure 2. figure 3 shows two different images for several subjects in both the stressed and non-stressed states. pixel intensity is the primary information stored in the pixels and the most significant feature used for image classification; the intensity of an image is the mean of all its pixels. the average pixel intensity was calculated for the non-stressed and stressed images in order to quantify the differences between the two classes. table 1 shows the mean of all pixel values over the entire image for several subjects in both conditions (stressed and non-stressed). table 2 displays the average intensity of the four segments of each image in the two conditions. the mean values of the stressed spatial images are higher than those of the non-stressed spatial images. a spatial image can be represented in the frequency domain using a transformation.
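the encoding described above (binning, one-hot column vectors, a sliding window of 28 consecutive intervals) can be sketched as follows. one simplifying assumption: equal-width bins over 0.6-1.2 s are used here, whereas the paper derives its 28 intervals from the observed ibi distribution:

```python
import numpy as np

def ibi_to_images(ibi, n_bins=28, window=28, lo=0.6, hi=1.2):
    """encode a run of consecutive ibis as binary n_bins x window images.

    each column one-hot encodes which of the n_bins intervals the ibi
    falls into; a stride-1 sliding window over the columns yields one
    image per window position.
    """
    edges = np.linspace(lo, hi, n_bins + 1)
    rows = np.clip(np.digitize(ibi, edges) - 1, 0, n_bins - 1)
    images = []
    for start in range(len(ibi) - window + 1):
        img = np.zeros((n_bins, window), dtype=np.uint8)
        img[rows[start:start + window], np.arange(window)] = 1
        images.append(img)
    return np.stack(images)

ibi = 0.9 + 0.1 * np.sin(np.arange(60))  # synthetic ibi series (seconds)
imgs = ibi_to_images(ibi)
print(imgs.shape)           # (33, 28, 28)
print(imgs[0].sum(axis=0))  # each column holds exactly one active pixel
```

60 intervals with a window of 28 and stride 1 give 60 - 28 + 1 = 33 images, matching the sliding-window construction in figure 2.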
in the output image, each point represents a particular frequency contained in the spatial domain image. in the frequency domain image, high- and low-frequency components correspond to edges and smooth regions, respectively; such a transformation helps to reveal pixel information and to detect whether repeating patterns exist. the fourier transform decomposes a spatial domain image into its cosine and sine components. for an image of size n × m pixels, the 2d discrete fourier transform (dft) is given by equation (1), in which the value of each point F(u, v) is the sum of the spatial image multiplied by the corresponding base function:

F(u, v) = sum_{x=0..n-1} sum_{y=0..m-1} f(x, y) exp(-i 2 pi (u x / n + v y / m))    (1)

where f(x, y) is the spatial domain image and the exponential term is the base function corresponding to the point F(u, v) in the fourier space. in this research, the spatial images are converted to the frequency domain by applying the fast fourier transform (fft) to obtain their frequency domain versions, based on algorithm 1, as shown in figure 4. classification performance in the fourier domain can outperform classification in the spatial domain [28,42,43]. moreover, image processing using frequency domain images provides more features and reduces the computational time of the classification model. in addition, images in the frequency domain offer a level of information that spatial domain images cannot provide: specifically, frequency domain images carry information about the rate at which pixel values change in the spatial domain, and this rate (frequency) of change can be exploited to enhance classification models. the fft is a fast algorithm for computing the dft: direct dft computation takes approximately n^2 operations (computational complexity o(n^2)), whereas the fft takes approximately n log(n) operations (computational complexity o(n log n)). table 3 shows the average pixel intensity of the frequency domain images for subjects in the two conditions, stressed and non-stressed.
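the paper's algorithm 1 is not reproduced here; the following is a common recipe for turning a spatial image into a frequency domain image, assumed for illustration: 2d fft, shift of the zero frequency to the centre, log magnitude, and rescaling to 8-bit pixel intensities:

```python
import numpy as np

def to_frequency_image(spatial, eps=1e-8):
    """convert a spatial image to a log-magnitude frequency domain image."""
    spectrum = np.fft.fftshift(np.fft.fft2(spatial))  # centre the zero frequency
    mag = np.log(np.abs(spectrum) + eps)              # compress the dynamic range
    mag = (mag - mag.min()) / (mag.max() - mag.min() + eps)
    return (255 * mag).astype(np.uint8)               # 0-255 pixel intensities

spatial = np.zeros((28, 28))
spatial[np.arange(28), np.arange(28)] = 1  # toy one-active-pixel-per-column image
freq = to_frequency_image(spatial)
print(freq.shape, freq.dtype)  # (28, 28) uint8
```

the log step matters because raw fft magnitudes span several orders of magnitude; without it, almost all structure collapses into a few bright pixels around the dc component.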
the mean values of the stressed ibi frequency domain images are lower than those of the non-stressed images. a cnn is a deep learning neural network that can be used for computer vision tasks such as image classification, processing an input image and outputting the class, or the probability that the image belongs to each class. a cnn has input, output, and hidden layers, and it extracts features from images while the network trains on a set of images: it applies several filters to the input image to build feature maps, and it trains through forward and back-propagation over many epochs until it converges to a network with trained weights and features. to classify individual stress status as stressed or non-stressed, a 19-layer cnn model was built, as illustrated in figure 6. images pass through 2d convolution layers with kernels, pooling layers, and fully connected layers, and each layer increases the complexity of the learned features. convolution is a linear operation consisting of the multiplication of a set of 2d weight arrays, called the filter or kernel, with the input data array; the output of this multiplication is a 2d array called a feature map. the feature map values are passed through nonlinear functions such as the rectified linear unit (relu). a cnn can learn abstract features for efficient object identification; through weight sharing and pooling it keeps the number of parameters manageable and, with proper regularization, is less prone to overfitting, which helps it overcome limitations of other machine learning algorithms without affecting the quality of the models.
it is used to solve complex problems in different domains, such as image classification and object detection, due to its better performance [44-48]. in our model, the input image, of size 28 × 28 pixels, goes through 8 convolution layers that produce 32, 64, 128, and 256 feature maps using filters with a 3 × 3 convolution kernel. there are 4 max-pooling layers of size 2 × 2, one after every two convolution layers; max-pooling is used to reduce the spatial dimensions of the feature maps. two dropout layers with a rate of 0.5 are used for regularization. the fully connected layers have depths of 256, 256, and 1, and relu activation layers are used to increase the nonlinearity of the network. the outputs of these networks are stressed and non-stressed. the following stress classification models were trained, tested, and evaluated using our cnn architecture and both types of images (spatial and frequency domain). 1. person-specific models using spatial images: models were trained, validated, and tested on the spatial domain images of the same subject; the dataset was divided into 70%, 15%, and 15% for training, validation, and testing, respectively. 2. person-specific models using frequency domain images: models were trained, validated, and tested on the frequency domain images of the same subject, with the same 70%/15%/15% split. 3. generic models using spatial domain images: models were trained and validated on the spatial domain images of 12 subjects (n − 3) and tested on the three subjects that were left out, in order to evaluate the model's accuracy in classifying an unseen person's stress status.
4. generic models using frequency domain images: models were trained and validated on the frequency domain images of n − 3 subjects and tested on the frequency domain images of the three left-out subjects. 5. generic models using spatial domain images with calibration samples: 20% of the test dataset was incorporated into the training pool, and the models were tested on the remaining samples. this approach was implemented because the performance of the generic model is lower than that of the person-specific model: three subjects' data were used as the test dataset, and 20% of these data were combined with the other 12 subjects' data in the training dataset. 6. generic models using frequency domain images with calibration samples: as in model 5, 20% of the three test subjects' data was combined with the training dataset, and the models were tested on the remaining samples. the above classification models were evaluated by measuring the accuracy on the training, validation, and test sets. other metrics were also measured: sensitivity (the proportion of actual positives classified as positive by the model), specificity (the proportion of actual negatives classified as negative), and precision (the proportion of samples classified as positive that were actually positive). the inputs were spatial images and frequency domain images, and the output was stressed or non-stressed. the performance of the classification models was measured by comparing the accuracy values for the training, validation, and test sets, along with the test sensitivity (true positive rate), precision, and specificity (true negative rate).
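the 19-layer architecture described above can be checked with a dependency-free shape trace: it shows how the stack of 3 × 3 'same' convolutions and 2 × 2 poolings reduces the 28 × 28 input to the 256-dimensional vector feeding the fully connected layers of depth 256, 256, and 1. the per-layer channel pairing 32-32-64-64-128-128-256-256 is an assumption consistent with "8 convolution layers producing 32, 64, 128, 256 feature maps":

```python
def conv_same(shape, c_out):
    h, w, _ = shape
    return (h, w, c_out)        # 3x3 convolution with 'same' padding keeps h x w

def pool2(shape):
    h, w, c = shape
    return (h // 2, w // 2, c)  # 2x2 max-pooling halves each spatial dimension

shape = (28, 28, 1)                              # input ibi image
channels = [32, 32, 64, 64, 128, 128, 256, 256]  # assumed pairing of the 8 conv layers
for i, c in enumerate(channels):
    shape = conv_same(shape, c)
    if i % 2 == 1:                               # max-pool after every two conv layers
        shape = pool2(shape)

flat = shape[0] * shape[1] * shape[2]
print(shape, flat)  # (1, 1, 256) 256
```

the spatial size shrinks 28 → 14 → 7 → 3 → 1 through the four poolings, so flattening the final 1 × 1 × 256 tensor yields exactly the 256 inputs expected by the first dense layer.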
the equations for calculating these performance metrics are shown in equations (2)-(5), where tp, tn, fp, and fn denote true positives, true negatives, false positives, and false negatives, respectively. the accuracy is the ratio of correct classifications to all classifications:

accuracy = (tp + tn) / (tp + tn + fp + fn)    (2)

sensitivity is the capability of the test to correctly classify a person as stressed:

sensitivity = tp / (tp + fn)    (3)

specificity is the capability of the test to correctly classify a person as non-stressed:

specificity = tn / (tn + fp)    (4)

precision measures how many of the samples classified as positive are actually positive:

precision = tp / (tp + fp)    (5)

the classification accuracy measurements for all models were satisfactory across the training, validation, and test datasets. the person-specific models achieved high performance compared to the generic models. the average classification accuracy of the person-specific models using spatial images for the training, validation, and test datasets was 99.9%, 100%, and 99.8%, respectively; for the person-specific models using frequency domain images, it was 99.68%, 98.97%, and 96.4%. the performance of the generic models varied between subjects and was lower than that of the person-specific models: the average accuracy of the generic classification models was 98.6% (train), 96.8% (validation), and 61% (test) using spatial images, and 98.9% (train), 97.6% (validation), and 62.6% (test) using frequency domain images. moreover, the accuracy of the frequency domain classification models was slightly lower than that of the spatial image classification models, as shown in tables 4 and 5. the generic models cannot fully capture inter-subject differences in the response to stress events; thus, adding some samples from the test subjects to the training data significantly increased the accuracy of the generic models, as shown in tables 6 and 7 for spatial images and tables 8 and 9 for frequency domain images.
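equations (2)-(5) translate directly into code; the confusion-matrix counts below are hypothetical, not taken from the paper's figures:

```python
def metrics(tp, tn, fp, fn):
    """equations (2)-(5): accuracy, sensitivity, specificity, precision."""
    return {
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "precision":   tp / (tp + fp),
    }

# hypothetical counts for illustration
m = metrics(tp=80, tn=90, fp=10, fn=20)
print({k: round(v, 3) for k, v in m.items()})
# {'accuracy': 0.85, 'sensitivity': 0.8, 'specificity': 0.9, 'precision': 0.889}
```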
by adding these samples, the performance of the models increased significantly, from 61% to 88.1% and from 62.6% to 87.6% on the test dataset for the generic models using the spatial and frequency domain images, respectively. a confusion matrix is a performance measurement that visualizes the performance of a classification model on test data for which the true values are known. the generic model had 179 non-stressed spatial images incorrectly classified as stressed, while it had 619 stressed images incorrectly classified as non-stressed, as shown in figure 7 (left). the majority of the spatial images were nonetheless classified correctly, and by adding 20% of the test data into the training pool, the performance of the model increased: it had 20 non-stressed spatial images incorrectly classified as stressed and 37 stressed images incorrectly classified as non-stressed, as shown in figure 8 (left). from the confusion matrices in figures 7 and 8, where the data of subjects 8, 9, and 10 were in the test dataset and 20% of their calibration samples were injected into the training dataset, the sensitivity increased to 96% and 80% and the specificity increased to 98% and 92% for spatial and frequency domain images, respectively. adding a few calibration samples allowed the model to learn more information about the unseen person and highlighted the effect of person-specific signals in classifying his/her stress state as either stressed or non-stressed. another finding is that the time for cnn training and validation using fourier domain images was lower than that for training and validation on spatial images (e.g., for the person-specific model of subject number 10, the cnn spent 143 s to train and validate 1019 frequency domain images over around 125 epochs, while using the same number of spatial images took around 214 s).
moreover, to achieve higher accuracy when using spatial and frequency domain images, more epochs are needed to train the generic models. in this study, 150 epochs were used for all generic models using both spatial and frequency domain images.

table 4. the accuracy measures for the person-specific models using spatial images (columns, following the metrics reported in the text: train, validation, and test accuracy, then test sensitivity, precision, and specificity, all in %).

subject  train  valid  test  sens  prec  spec
2        99.8   100    99    98    100   100
3        100    100    99    100   98    98
4        100    100    100   100   100   100
5        100    100    100   100   100   100
6        100    100    100   100   100   100
7        100    100    100   100   100   100
8        100    100    100   100   100   100
9        100    100    100   100   100   100
10       100    100    100   100   100   100
11       100    100    100   100   100   100
13       100    100    100   100   100   100
14       100    100    100   100   100   100
15       100    100    100   100   100   100
16       100    100    100   100   100   100
17       100    100    100   100   100   100
average  99.9   100    99.8  99.8  99.8  99.8

table 5. the accuracy measures for the person-specific models using frequency domain images (same columns as table 4).

subject  train  valid  test  sens  prec  spec
2        100    100    97    100   96    96
3        100    100    99    100   98    98
4        100    100    100   100   100   100
5        95.7   84.6   74    100   65    65
6        100    100    93    85    100   100
7        100    100    99    100   99    99
8        100    100    100   100   100   100
9        100    100    99    100   99    99
10       100    ...

table 6. the accuracy measures for the generic models using spatial domain images. table 9. the accuracy measures for the generic models with 20% calibration samples using frequency domain images. table 10 compares the results of this approach with other approaches conducted in the domain of stress detection. one of the main differences between this study and the other studies is the type of images utilized for training and validating the models. moreover, the accuracy of the calibrated model outperformed that of the generic model. compared to other approaches, the proposed method achieved high accuracy in the person-specific models and comparable scores for the generic models, taking into account the different types of data used in our study (ibi extracted from the ppg signal, and spatial and frequency domain images from the ibi).
the results show the potential of using frequency domain images in stress detection.

table 10 (excerpt). comparison with other stress detection approaches:
ref   input data        model  type     accuracy (%)
[35]  ecg-ibi spatial   cnn    generic  92.8
[49]  face              cnn    generic  85.23
[22]  respiration       cnn    generic  84.59

in this study, a new approach was proposed to classify a person's stress state using a convolutional neural network and spatial and frequency domain images for inter-beat intervals extracted from the ppg signal. the entire time interval of the extracted ibi data from ppg signals was divided into intervals according to the ibi distributions, and the output matrix was then converted to spatial images. these images were transformed into the frequency domain by using the fourier transform. frequency domain features are important for image classification, as are spatial features, especially when the spatial resolution increases. several types of binary classification models were developed: generic models, person-specific models, and calibrated generic models. the proposed models utilized the ibi files generated by empatica e4 devices found in the wesad dataset. the average accuracy for the proposed models achieved a satisfactory performance. the person-specific models were able to classify stress status with high accuracy. although these models cannot be generalized, it is necessary and effective to personalize the model, as stress is subjective and each person has a unique response and degree of vulnerability to stress. these models can be used in a health monitoring system to monitor the stress status of a patient and can be enriched by collecting new data and retraining the models. an image can be represented as a 2d matrix in which each element shows a pixel intensity. this spatial image can be transformed into the frequency domain by using a fourier transform. frequency domain features are important for image classification, as are spatial features, especially when the spatial resolution increases.
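the spatial-to-frequency conversion described above can be sketched with numpy's 2d fft. the centre-shift and log-magnitude normalisation here are a common display convention, assumed rather than taken from the paper.

```python
import numpy as np

def to_frequency_image(img):
    """2D FFT of a spatial image: shift the zero frequency to the centre and
    take a log-magnitude so the spectrum is a well-scaled real-valued image."""
    spectrum = np.fft.fftshift(np.fft.fft2(img))
    return np.log1p(np.abs(spectrum))

# toy 64x64 "spatial image" of random pixel intensities
img = np.random.default_rng(0).random((64, 64))
freq = to_frequency_image(img)
print(freq.shape)  # (64, 64)
```

the frequency image has the same shape as the spatial one, so the same cnn input layer can consume either representation, as the study does.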
image processing using frequency domain images can perform better than using spatial domain images, provide more features, and reduce the computation time. a cnn is an example of a deep learning neural network and can be used for computer vision tasks such as image classification by processing an input image and outputting the class, or the probability that the image belongs to a class. a cnn has input, output, and hidden layers in which it extracts features from images while the network trains on a set of pictures. it applies several filters to the input image to build the feature map and trains through forward and back-propagation for many epochs until it reaches a distinct network with trained weights and features. in this study, a novel approach to classify the stress state of a person by using both spatial and frequency domain ibi images and convolutional neural networks is proposed. the proposed models, using the ibi files generated by empatica e4 devices found in the wesad dataset, were tested. several classification models were built: person-specific, generic, and calibrated generic models. the generic models performed more poorly than the person-specific models when classifying the stress state of unseen people, as shown in the test accuracy measures in tables 6 and 8. these generic models cannot generalize well, as stress is subjective, and some people are more reactive to stress and have different types of physical and physiological responses. a personalized model was derived by combining a few person-specific samples with the training data to improve the performance of these generic models. in this study, 20% of the subjects' data in the test dataset were combined with the training data, which showed a substantial improvement in the performance of the stress classification models, as shown by the accuracy measurements in tables 7 and 9. these calibrated generic models introduced the subjects' identities and characteristics to the models.
to ensure that our calibrated models were not suffering from overfitting, we validated these models by using 5-fold cross-validation, which leads to an unbiased estimate of model performance and tests how the different parts of the training set perform in the model. moreover, the results show that the average accuracy for the generic classification models using frequency domain images was slightly higher than that of the models that used spatial images for training, validation, and testing. in addition, classifying the stress status using frequency domain images performed well, provided more features about the entire images, and reduced the computation time. the models proposed in this study were effective at classifying stress state and are applicable in a stress monitoring health system. our approach can be applied to monitor a person's psychological wellbeing and classify his or her state of daily life stress using inter-beat interval (ibi) data. in addition, it can trigger alerts that can be used to guide clinical interventions to prevent and treat symptoms of acute stress disorders when the occurrence of acute stress states detected in a specific patient becomes too frequent. moreover, stress detection models can be used in the military or police to detect when soldiers and police officers experience abnormally high levels of stress and to improve their performance in stressful environments. they can also be used in educational systems to identify which subjects may present issues for particular students; this would enable teachers to intervene, present the material in an alternative manner, and minimize stressful events in the classroom as much as possible to reduce stress or anxiety. one limitation of this study is that the proposed models classify the state of an individual into only two categories: stressed or non-stressed. the model is aimed at instantaneous detection of stress via classification of physiological data.
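the 5-fold cross-validation used above to check the calibrated models can be sketched with a plain index-splitting helper. this is a generic sketch, not the authors' code.

```python
import random

def k_fold_indices(n, k=5, seed=0):
    """Yield (train_idx, valid_idx) index lists for k-fold cross-validation:
    shuffle once, partition into k folds, and rotate the validation fold."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        valid = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, valid

# every sample appears exactly once as a validation sample across the 5 folds
splits = list(k_fold_indices(100, k=5))
print(len(splits), sorted(j for _, v in splits for j in v) == list(range(100)))
# 5 True
```

averaging a model's score over the 5 validation folds gives the unbiased performance estimate the text refers to.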
the model does not consider the prediction of stress, as this is out of the scope of this work. for future work, new ppg data will be collected either from lab settings or from real life using wrist-worn devices. these data will be used to train and test the proposed models to measure the accuracy and compare the results.

the authors declare no conflict of interest.

references:
continuous stress detection using wearable sensors in real life: algorithmic programming contest case study
fact sheet: health disparities and stress
acute vs. chronic stress
heart rate variability metrics for fine-grained stress level assessment
world health organization. mental health in the workplace
salivary cortisol levels as a biological marker of stress reaction
salivary cortisol as a biomarker in stress research
psychological stress detection using photoplethysmography
photoplethysmography based psychological stress detection with pulse rate variability feature differences and elastic net
assessing mental stress from the photoplethysmogram: a numerical study
can ppg be used for hrv analysis?
comparison of heart rate variability from ppg with that from ecg
mental stress assessment based on pulse photoplethysmography
instant stress: detection of perceived mental stress through smartphone photoplethysmography and thermal imaging
monitoring stress with a wrist device using context
monitoring physical activity and mental stress using wrist-worn device and a smartphone. part iii; volume 414
smartphone-based approach to enhance mindfulness among undergraduates with stress
assessing mental stress based on smartphone sensing data: an empirical study
deep ecg-respiration network (deeper net) for recognizing mental stress
deep learning of breathing patterns for automatic stress recognition using low-cost thermal imaging in unconstrained settings
ambulatory and laboratory stress detection based on raw electrocardiogram signals using a convolutional neural network
a multichannel convolutional neural network architecture for the detection of the state of mind using physiological signals from wearable devices
spectral domain convolutional neural network. arxiv 2019
fcnn: fourier convolutional neural networks
spectrum analysis of eeg signals using cnn to model patient's consciousness level based on anesthesiologists' experience
visualizing image classification in fourier domain
a survey of image classification methods and techniques for improving classification performance
classification of fourier transform infrared microscopic imaging data of human breast cells by cluster analysis and artificial neural networks
individual differences in stress susceptibility and stress inhibitory mechanisms
individual differences in biological stress responses moderate the contribution of early peer victimization to subsequent depressive symptoms
the effect of person-specific biometrics in improving generic stress predictive models
automatic detection of perceived stress in campus students using smartphones
a novel classification method for a driver's cognitive stress level by transferring interbeat intervals of the ecg signal to pictures
introducing wesad, a multimodal dataset for wearable stress and affect detection
'trier social stress test': a tool for investigating psychobiological stress responses in a laboratory setting
the trier social stress test protocol for inducing psychological stress
early detection of migraine attacks based on wearable sensors: experiences of data collection using empatica e4
e4 data-ibi expected signal
inter-beat interval estimation from facial video based on reliability of bvp signals
frequency learning for image classification
image classification in the frequency domain with neural networks and absolute value dct
conceptual understanding of convolutional neural network: a deep learning approach
diabetes detection using deep learning algorithms
automated detection of diabetes using cnn and cnn-lstm network and heart rate signals
face recognition based on convolutional neural network
research on face recognition based on cnn
emotional analysis using image processing
this article is an open access
article distributed under the terms and conditions of the creative commons attribution (cc by) license. the following abbreviations are used in this manuscript. inter

key: cord-302336-zj3oixvk authors: clift, ash k; coupland, carol a c; keogh, ruth h; diaz-ordaz, karla; williamson, elizabeth; harrison, ewen m; hayward, andrew; hemingway, harry; horby, peter; mehta, nisha; benger, jonathan; khunti, kamlesh; spiegelhalter, david; sheikh, aziz; valabhji, jonathan; lyons, ronan a; robson, john; semple, malcolm g; kee, frank; johnson, peter; jebb, susan; williams, tony; hippisley-cox, julia title: living risk prediction algorithm (qcovid) for risk of hospital admission and mortality from coronavirus 19 in adults: national derivation and validation cohort study date: 2020-10-21 journal: bmj doi: 10.1136/bmj.m3731 sha: doc_id: 302336 cord_uid: zj3oixvk

objective: to derive and validate a risk prediction algorithm to estimate hospital admission and mortality outcomes from coronavirus disease 2019 (covid-19) in adults. design: population based cohort study. setting and participants: qresearch database, comprising 1205 general practices in england with linkage to covid-19 test results, hospital episode statistics, and death registry data. 6.08 million adults aged 19-100 years were included in the derivation dataset and 2.17 million in the validation dataset. the derivation and first validation cohort period was 24 january 2020 to 30 april 2020. the second temporal validation cohort covered the period 1 may 2020 to 30 june 2020. main outcome measures: the primary outcome was time to death from covid-19, defined as death due to confirmed or suspected covid-19 as per the death certification or death occurring in a person with confirmed severe acute respiratory syndrome coronavirus 2 (sars-cov-2) infection in the period 24 january to 30 april 2020. the secondary outcome was time to hospital admission with confirmed sars-cov-2 infection.
models were fitted in the derivation cohort to derive risk equations using a range of predictor variables. performance, including measures of discrimination and calibration, was evaluated in each validation time period. results: 4384 deaths from covid-19 occurred in the derivation cohort during follow-up, 1722 in the first validation cohort period, and 621 in the second validation cohort period. the final risk algorithms included age, ethnicity, deprivation, body mass index, and a range of comorbidities. the algorithm had good calibration in the first validation cohort. for deaths from covid-19 in men, it explained 73.1% (95% confidence interval 71.9% to 74.3%) of the variation in time to death (r(2)); the d statistic was 3.37 (95% confidence interval 3.27 to 3.47), and harrell's c was 0.928 (0.919 to 0.938). similar results were obtained for women, for both outcomes, and in both time periods. in the top 5% of patients with the highest predicted risks of death, the sensitivity for identifying deaths within 97 days was 75.7%. people in the top 20% of predicted risk of death accounted for 94% of all deaths from covid-19. conclusion: the qcovid population based risk algorithm performed well, showing very high levels of discrimination for deaths and hospital admissions due to covid-19. the absolute risks presented, however, will change over time in line with the prevailing sars-cov-2 infection rate and the extent of social distancing measures in place, so they should be interpreted with caution. the model can be recalibrated for different time periods, however, and has the potential to be dynamically updated as the pandemic evolves.

the first cases of severe acute respiratory syndrome coronavirus 2 (sars-cov-2) infection were reported in the uk on 24 january 2020, with the first death from coronavirus disease 2019 (covid-19) on 28 february 2020. as of 18 august 2020, more than 41 000 deaths from covid-19 had occurred in the uk and more than 773 000 deaths globally. 1 in the initial absence of any vaccination or prophylactic or curative treatments, the uk government implemented social distancing and shielding measures to suppress the rate of infection and protect vulnerable people, thereby trying to minimise the risk of serious adverse outcomes. 2 3 emerging evidence throughout the course of the pandemic, initially from case series and then from cohorts of patients with confirmed sars-cov-2 infection, has shown associations of age, sex, certain comorbidities, ethnicity, and obesity with adverse covid-19 outcomes such as hospital admission or death. [4] [5] [6] [7] [8] [9] [10] [11] the knowledge base regarding risk factors for severe covid-19 is growing. as many countries are cautiously attempting to ease "lockdown" measures or reintroduce measures if rates are rising, an opportunity exists to develop more nuanced guidance based on predictive algorithms to inform risk management decisions. 12 better knowledge of individuals' risks could also help to guide decisions on mitigating occupational exposure and in targeting of vaccines to those most at risk. although some prediction models have been developed, a recent systematic review found that they all have a high risk of bias and that their reported performance is optimistic. 13 the use of primary care datasets with linkage to registries such as death records, hospital admissions data, and covid-19 testing results represents a novel approach to clinical risk prediction modelling for covid-19.
it provides accurately coded, individual level data for very large numbers of people representative of the national population. this approach draws on the rich phenotyping of individuals with demographic, medical, and pharmacological predictors to allow robust statistical modelling and evaluation. such linked datasets have an established track record for the development and evaluation of established clinical risk models, including those for cardiovascular disease, diabetes, and mortality. [14] [15] [16] we aimed to develop and validate population based prediction models to estimate the risks of becoming infected with and subsequently dying from covid-19 and of becoming infected and subsequently admitted to hospital with covid-19. the model we have developed is designed to be applied across the adult population so that it can be used to enable risk stratification for public health purposes in the event of a "second wave" of the pandemic, to support shared management of risk and occupational exposure, and in early targeting of vaccines to people most at risk. an ongoing companion study will externally validate the models, using datasets across all four nations of the uk, and will be reported separately. this study was commissioned by the chief medical officer for england on behalf of the uk government, who asked the new and emerging respiratory virus threats advisory group (nervtag) to establish whether a clinical risk prediction model for covid-19 could be developed in line with the emerging evidence. the protocol has been published. 17 the study was conducted in adherence with tripod 18 and record 19 guidelines and with input from our patient advisory group.

study design and data sources

we did a cohort study of primary care patients using the qresearch database (version 44). qresearch was established in 2002 and has been extensively used for the development of risk prediction algorithms across the national health service (nhs) and for epidemiological research.
by april 2020, 1205 practices in england were contributing to qresearch, covering a population of 10.5 million patients. the database is linked at individual patient level, using a project specific pseudonymised nhs number, to hospital admissions data (including intensive care unit data), positive results from covid-19 real time reverse transcriptase polymerase chain reaction tests held by public health england, cancer registrations (including detailed radiotherapy and systemic chemotherapy records), the national covid-19 shielded patient list in england, and mortality records held by nhs digital. we identified a cohort of people aged 19-100 years registered with participating general practices in england on 24 january 2020. we excluded patients (approximately 0.1%) who did not have a valid nhs number. patients entered the cohort on 24 january 2020 (date of first confirmed case of covid-19 in the uk) and were followed up until they had the outcome of interest or the end of the first study period (30 april 2020), which was the date up to which linked data were available at the time of the derivation of the model, or the second time period (1 may 2020 until 30 june 2020) for the temporal cohort validation. the primary outcome was time to death from covid-19 (either in hospital or outside hospital), defined as confirmed or suspected death from covid-19 as per the death certification or death occurring in an individual with confirmed sars-cov-2 infection at any time in the period 24 january to 30 april 2020. the secondary outcome was time to hospital admission with covid-19, defined as an icd-10 (international classification of diseases, 10th revision) code for either confirmed or suspected covid-19 or new hospital admission associated with a confirmed sars-cov-2 infection in the study period. 
we selected candidate predictor variables on the basis of the presence of existing clinical vulnerability group criteria (table 1) , associations with outcomes in other respiratory diseases, or hypothesised to be linked to adverse outcomes on clinical/biological plausibility and likely to be available for implementation. they are summarised in box 1 and supplementary box a. we defined variables according to information recorded using read codes in general practices' electronic health records at the start of the study period. the exception to this was information on chemotherapy, radiotherapy, and transplants, which was based on linked hospital records. we randomly allocated 75% of practices to the derivation dataset, which we used to develop the models. we evaluated the models' performance in the remaining 25% of practices (the validation set). all models were fitted separately in men and women. the outcomes of interest are subject to competing risks. for the primary outcome of death from covid-19, the competing risk is death due to other causes. for the secondary outcome of hospital admission, the competing risk is death from any cause before admission. we fitted a sub-distribution hazard (fine and gray 21 ) model for each outcome to account for competing risks. individuals who did not have the outcome of interest were censored at the study end date, including those who had a competing event. for all predictor variables, we used the most recently available value at the entry date (24 january 2020). we used second degree fractional polynomials to model non-linear relations for continuous variables (age, body mass index, and townsend material deprivation score, an area level score based on postcode 20 ). initially, we fitted a complete case analysis by using a model within the derivation data to derive the fractional polynomial terms. 
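second degree fractional polynomial terms of the kind used above for age, body mass index, and townsend score can be sketched as follows. this uses the conventional royston-altman power set; the powers actually selected for qcovid are not given in this excerpt.

```python
import math

FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)  # conventional FP candidate powers

def fp_term(x, p):
    """x**p with the convention that power 0 means log(x); requires x > 0."""
    return math.log(x) if p == 0 else x ** p

def fp2_basis(x, p1, p2):
    """Second-degree fractional polynomial terms for one covariate value.
    Repeated powers (p1 == p2) give x**p and x**p * log(x)."""
    t1 = fp_term(x, p1)
    t2 = t1 * math.log(x) if p1 == p2 else fp_term(x, p2)
    return t1, t2

# e.g. age scaled to decades (6.5 = 65 years), candidate power pair (-2, 1)
print(fp2_basis(6.5, -2, 1))
```

model selection then fits the outcome model for each of the candidate power pairs and keeps the best-fitting pair, which is what lets the continuous predictors enter the risk equations non-linearly.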
for indicators of comorbidities and medication use, we assumed the absence of recorded information to mean absence of the factor in question. data were missing in four variables: ethnicity, townsend score, body mass index, and smoking status. we used multiple imputation with chained equations under the missing at random assumption to replace missing values for these variables. for computational efficiency, we used a combined imputation model for both outcomes. the imputation model was fitted in the derivation data and included predictor variables, the nelson-aalen estimators of the baseline cumulative sub-distribution hazard, and the outcome indicators (death from covid-19 and hospital admission with covid-19). we carried out five imputations. each analysis model was fitted in each of the five imputed datasets. we used rubin's rules to combine the model parameter estimates and the baseline cumulative incidence estimates across the imputed datasets. we initially sought to fit models using all predictor variables. owing to sparse cells, some conditions were combined if clinically similar in nature (such as rare neurological disorders). we examined interactions between body mass index and ethnicity and interactions between predictor variables and age, focusing on predictor variables that apply across the age range (asthma, epilepsy, diabetes, severe mental illness). we explored the use of penalised models (lasso) to screen variables for inclusion, but this retained all the predictor variables and most interaction terms. 17 in line with the protocol, we subsequently removed a small number of variables with low numbers of events and adjusted (sub-distribution) hazard ratios close to 1 (as these will have minimal effect on predicted risks) or with uncertain clinical credibility, defined as counterintuitive results in light of the emerging literature. 
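rubin's rules mentioned above amount to a simple combination of within- and between-imputation variability across the five imputed datasets. the numbers below are illustrative only, not study estimates.

```python
import statistics

def rubins_rules(estimates, variances):
    """Pool one parameter across m imputed datasets (Rubin's rules):
    point estimate = mean of the m estimates; total variance =
    within-imputation variance W + (1 + 1/m) * between-imputation variance B."""
    m = len(estimates)
    q_bar = statistics.mean(estimates)
    w = statistics.mean(variances)            # within-imputation variance
    b = statistics.variance(estimates)        # between-imputation variance
    total_var = w + (1 + 1 / m) * b
    return q_bar, total_var

# five imputations of a hypothetical log hazard ratio with equal variances
est, tot = rubins_rules([0.51, 0.48, 0.53, 0.50, 0.49], [0.004] * 5)
print(round(est, 3))  # 0.502
```

the pooled variance is always at least the average within-imputation variance, which is how the uncertainty added by imputing missing ethnicity, body mass index, deprivation, and smoking values is carried into the final standard errors.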
lastly, we combined regression coefficients from the final models with estimates of the baseline cumulative incidence function evaluated at 97 days to derive risk equations for each outcome. we used all the available data in the database. we did all model evaluation using the validation data with two separate periods of follow-up. the first validation study period was the same as for the derivation cohort: 24 january to 30 april 2020. the second temporal validation covered the subsequent period of 1 may 2020 to 30 june 2020. this was carried out with the same validation cohort except for exclusion of patients who died during 24 january to 30 april 2020. in the validation cohort, we fitted an imputation model to replace missing values for ethnicity, body mass index, townsend score, and smoking status. this excluded the outcome indicators and nelson-aalen terms, as the aim was to use covariate data to obtain a prediction as if the outcome had not been observed, to reflect intended use. we applied the final risk equations developed from the derivation dataset to men and women in the validation dataset and evaluated r(2) values, 22 brier scores (where lower values indicate better accuracy), 25 and measures of discrimination and calibration for the two time periods. d statistics (a discrimination measure that quantifies the separation in survival between patients with different levels of predicted risks) and harrell's c statistics (a discrimination metric that quantifies the extent to which people with higher risk scores have earlier events) were evaluated at 97 days (the maximum follow-up period available at the time of the derivation of the model) and 60 days for the second temporal validation, with corresponding 95% confidence intervals. 26 we assessed model calibration by comparing mean predicted risks with observed risks by twentieths of predicted risk for each of the validation cohorts.
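harrell's c statistic described above can be computed directly from event times, censoring indicators, and risk scores. this is an o(n^2) sketch of the standard definition, not the study's implementation.

```python
def harrells_c(times, events, scores):
    """Harrell's C: among usable pairs, the fraction where the subject with the
    higher risk score has the earlier event (ties in score count 1/2).
    events[i] is 1 if subject i had the event, 0 if censored."""
    concordant = tied = usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # a pair is usable if i has an event strictly before j's time
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1
                elif scores[i] == scores[j]:
                    tied += 1
    return (concordant + 0.5 * tied) / usable

# toy check: risk scores perfectly ordered with event times -> C = 1
c = harrells_c(times=[2, 4, 6, 8], events=[1, 1, 1, 0], scores=[9, 7, 5, 1])
print(c)  # 1.0
```

a value of 0.5 would mean no discrimination, so the 0.928 reported for deaths in men corresponds to the model ordering almost all usable pairs correctly.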
observed risks were derived in each of the 20 groups by using non-parametric estimates of the cumulative incidences. additionally, we did a recalibration for the mortality outcome, using the method proposed by booth et al, by updating the baseline survivor function based on the temporal validation cohort with the prognostic index as an offset term. 27 we also applied the algorithms to the validation cohort for the first time period to define the centile thresholds based on absolute risk. we also defined centiles of relative risk (defined as the ratio of the individual's predicted absolute risk to the predicted absolute risk for a person of the same age and sex with a white ethnicity, body mass index of 25, and mean deprivation score with no other risk factors). we calculated the performance metrics in the whole validation cohort and in pre-specified subgroups. 17 we evaluated performance by calculating harrell's c statistics in individual general practices and combining the results using a random effects meta-analysis. 28

patient and public involvement

patients were involved in setting the research question and in developing plans for design and implementation of the study. patients were asked to aid in interpreting and disseminating the results.

overall study population

overall, 1205 practices in england met our inclusion criteria. of these, 910 practices were randomly assigned to the derivation dataset and 295 to the validation cohort. the practices had 8 256 158 registered patients aged 19-100 years on 24 january 2020. we included 6 083 102 of these in the derivation cohort, and the validation dataset comprised 2 173 056 people. table 2 shows the baseline characteristics of patients in the derivation cohort. of these patients, 3 035 409 (49.9%) were men and 990 799 (16.3%) were of black, asian, or other minority ethnic (bame) background.
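the calibration check by twentieths of predicted risk can be sketched as follows. a simple event fraction stands in here for the non-parametric cumulative-incidence estimate used in the paper, and the toy data are synthetic.

```python
import random

def calibration_by_twentieths(predicted, observed):
    """Sort subjects by predicted risk, cut into 20 equal-sized groups, and
    return (mean predicted risk, observed event fraction) for each group."""
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    out = []
    for b in range(20):
        chunk = pairs[b * n // 20:(b + 1) * n // 20]
        if chunk:
            mean_pred = sum(p for p, _ in chunk) / len(chunk)
            obs_rate = sum(o for _, o in chunk) / len(chunk)
            out.append((mean_pred, obs_rate))
    return out

# synthetic cohort whose outcomes are drawn from the predicted risks,
# so observed risk should track predicted risk across the 20 groups
rng = random.Random(1)
pred = [rng.random() * 0.2 for _ in range(2000)]
obs = [1 if rng.random() < p else 0 for p in pred]
bins = calibration_by_twentieths(pred, obs)
print(len(bins))  # 20
```

a well-calibrated model gives observed fractions close to the mean predicted risk in every group; systematic gaps in the upper twentieths are what the recalibration step is designed to correct.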
in the derivation cohort, 10 776 (0.18%) patients had a covid-19 related hospital admission and 4384 (0.07%) had a covid-19 related death during the 97 days' follow-up, of which 4265 (97.3%) were recorded on the death certificate and 119 (2.7%) were based only on a positive test (and of these <15 were based on a test more than 28 days before death). admissions and deaths due to covid-19 occurred across all regions, with the greatest numbers in london, which accounted for 3799 (35.3%) of admissions and 1287 (29.4%) of deaths. of those who died, 2517 (57.4%) were male, 732 (16.7%) were bame, 3616 (82.5%) were aged 70 and over, 1417 (32.3%) had type 2 diabetes, 1311 (29.9%) had dementia, and 1033 (23.6%) were identified as living in a care home. the characteristics of the validation cohort were similar to those of the derivation cohort, as shown in supplementary tables a and b. in the first validation period (24 january to 30 april 2020), 1722 deaths and 3703 hospital admissions due to covid-19 occurred. in the second validation period (1 may to 30 june 2020), 621 deaths and 1002 admissions due to covid-19 occurred. the variables included in the final models were fractional polynomial terms for age and body mass index, townsend score (linear), ethnic group, domicile (residential care, homeless, neither), and a range of conditions and treatments as shown in figure 1, figure 2, figure 3, and figure 4.
these conditions and treatments were cardiovascular conditions (atrial fibrillation, heart failure, stroke, peripheral vascular disease, coronary heart disease, congenital heart disease), diabetes (type 1 and type 2 and interaction terms for type 2 diabetes with age), respiratory conditions (asthma, rare respiratory conditions (cystic fibrosis, bronchiectasis, or alveolitis), chronic obstructive pulmonary disease, pulmonary hypertension or pulmonary fibrosis), cancer (blood cancer, chemotherapy, lung or oral cancer, marrow transplant, radiotherapy), neurological conditions (cerebral palsy, parkinson's disease, rare neurological conditions (motor neurone disease, multiple sclerosis, myasthenia, huntington's chorea), epilepsy, dementia, learning disability, severe mental illness), other conditions (liver cirrhosis, osteoporotic fracture, rheumatoid arthritis or systemic lupus erythematosus, sickle cell disease, venous thromboembolism, solid organ transplant, renal failure (ckd3, ckd4, ckd5, with or without dialysis or transplant)), and medications (≥4 prescriptions from general practitioner in previous six months for oral steroids, long acting β agonists or leukotrienes, immunosuppressants). figure 1 and figure 2 show the adjusted hazard ratios in the final models for covid-19 related death in the derivation cohort in women and men. figure 3 and figure 4 show the adjusted hazard ratios for the final models for covid-19 related hospital admission in the derivation cohort. supplementary figures a and b show graphs of the adjusted hazard ratios for body mass index, age, and the interaction between age and type 2 diabetes for deaths and hospital admissions due to covid-19 (which showed higher risks associated with younger ages). 
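the fractional polynomial terms for age and body mass index mentioned above follow a standard construction, with powers drawn from the conventional royston-altman set. the sketch below shows how such transforms of a scaled covariate are generated; the excerpt does not report which powers the final qcovid models selected, so the powers and scaling here are illustrative only.

```python
import math

# conventional fractional-polynomial power set
FP_POWERS = [-2, -1, -0.5, 0, 0.5, 1, 2, 3]

def fp_terms(x, powers):
    """return fractional-polynomial transforms of a positive covariate;
    by convention power 0 is taken as log(x). covariates are usually
    rescaled first (e.g. age / 10) for numerical stability."""
    return [math.log(x) if p == 0 else x ** p for p in powers]

age_scaled = 65 / 10                     # age 65 years, scaled by 10
terms = fp_terms(age_scaled, [-1, 0.5])  # an illustrative fp2 pair
print([round(t, 4) for t in terms])      # -> [0.1538, 2.5495]
```

the pair of transformed terms then enters the cox model as two ordinary covariates, which is what lets the fitted hazard ratio curves for age and body mass index be non-linear.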
supplementary figures c and d show fully adjusted hazard ratios for variables for the full model, including variables that were not retained in the final model (for example, adjusted hazard ratios close to one or those which lacked clinical credibility). other variables with too few events for inclusion were hiv, sphingolipidoses, short bowel syndrome, polymyositis, dermatomyositis, ehlers-danlos syndrome, biliary cirrhosis, hepatitis b and c, haemochromatosis, non-alcoholic fatty liver disease, chronic pancreatitis, drug misuse, asplenia, cholangitis, scleroderma, sjogren's syndrome, and pregnancy. supplementary figures e and f show fully adjusted hazard ratios for a combined outcome of either covid-19 related death or hospital admission. this gave very similar absolute risks to the hospital admission outcome. table 3 shows the performance of the risk equations in the validation cohort for women and men over 97 days for the main study period and for the temporal validation cohort evaluated from 1 may 2020 to 30 june 2020. overall, the values for the r 2, d, and c statistics were similar in women and men. values for the mortality outcome tended to be higher than those for the hospital admission outcome. for example, in the first validation period, the equation explained 74% of the variation in time to death from covid-19 in women; the d statistic was 3.46, and harrell's c statistic was 0.933. the corresponding values in men were 73.1%, 3.37, and 0.928. the results for the second validation period were similar except for covid-19 related admissions in women, for which the explained variation and discrimination were lower than for the first period (explained variation 45.4%, d statistic 1.87, and harrell's c statistic 0.776). supplementary tables c-f show the corresponding results by region, age band, and fifth of deprivation and within each ethnic group in men and women in both validation periods.
performance was generally similar to the overall results except for age, for which the values were lower within individual age bands. figure 5 shows funnel plots of harrell's c statistic for each general practice in the validation cohort versus the number of deaths in each practice in men and women in the first validation period. the summary (average) c statistic for women was 0.916 (95% confidence interval 0.908 to 0.924) from a random effects meta-analysis. the corresponding summary c statistic for men was 0.919 (0.912 to 0.926).

we have developed and evaluated a novel clinical risk prediction model (qcovid) to estimate risks of hospital admission and mortality due to covid-19. we have used national linked datasets from general practice and national sars-cov-2 testing, death registry, and hospital episode data for a sample of more than 8 million adults representative of the population of england. the risk models have excellent discrimination (harrell's c statistics >0.9 for the primary outcome). although the calibration for the hospital admission outcome was good in both time periods, some under-prediction existed for the mortality outcome in the second validation cohort, which improved after recalibration. the recalibration method could be used to transport the risk models to other settings or time periods with different absolute risks of covid-19. qcovid represents a new approach for risk stratification in the population. it could also be deployed in several health and care applications, either during the current phase of the pandemic or in subsequent "waves" of infection (with recalibration as needed). these could include supporting targeted recruitment for clinical trials, prioritisation for vaccination, and discussions between patients and clinicians on workplace or health risk mitigation, for example through weight reduction, as obesity may be an important modifiable risk factor for serious complications of covid-19 if a causal association is established.
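the random effects meta-analysis used above to pool practice-level c statistics can be sketched with a dersimonian-laird estimator. the paper states only "random effects meta-analysis", so the specific estimator is an assumption here, and the practice-level values are invented.

```python
def dersimonian_laird(estimates, variances):
    """pool per-practice estimates (e.g. c statistics) with a
    dersimonian-laird random effects model; returns the pooled
    estimate and a 95% confidence interval."""
    w = [1.0 / v for v in variances]
    fixed = sum(wi * e for wi, e in zip(w, estimates)) / sum(w)
    # cochran's q and the between-practice variance tau^2
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, estimates))
    df = len(estimates) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    # re-weight with the between-practice variance added in
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * e for wi, e in zip(w_star, estimates)) / sum(w_star)
    se = (1.0 / sum(w_star)) ** 0.5
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se)

# invented practice-level c statistics with their variances
pooled, ci = dersimonian_laird([0.90, 0.93, 0.91, 0.95],
                               [0.0004, 0.0009, 0.0004, 0.0016])
```

when the between-practice heterogeneity (tau^2) is zero the pooling reduces to a plain inverse-variance average, which is why summary c statistics close to the individual practice values, as seen in the funnel plots, indicate consistent performance across practices.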
10 although qcovid has been specifically designed to inform uk health policy and interventions to manage covid-19 related risks, it also has international potential, subject to local validation. one of the variables in our model (the townsend measure of deprivation) may need to be replaced with locally available equivalent measures, or some recalibration may be needed. previous risk prediction models based on similar uk primary care data have been validated and used in other settings. 29 30

comparison with other studies
although similarities exist between our study and the recently reported analysis of risk factors from another english general practice database using a different clinical computer system, our project had a different aim, namely to develop and evaluate a risk prediction model. we used a more comprehensive outcome (including deaths in patients with positive tests for sars-cov-2), a much wider range of predictors, and a more granular assessment of ethnicity and body mass index. our c statistic for mortality (>0.92) is substantially higher than the previous study's reported value of 0.77. 31 other prediction models have been reported, although these focus on other outcomes of covid-19, including risk of admission to intensive care or death following a positive test, or clinical decision tools that integrate biochemical and imaging parameters to aid diagnosis. 13 however, most such studies are at high risk of bias, as they have been developed in highly selected cohorts, have limited transparency, are likely to have optimistic reported performance, or did not use covid-19 specific data. 13 this study represents a substantial improvement on previously developed risk algorithms in terms of the size and representativeness of the study population, the richness of data linkages enabling accurate ascertainment of cases (including both in-hospital and out of hospital deaths) across the health network, and the breadth of candidate predictor variables considered.
importantly, it analyses risks at the population level, rather than risks in people with confirmed or suspected infection, and may have relevance for shielding or other policies that seek to mitigate risk of viral exposure.

complexities of modelling
several complexities of modelling adverse risks from covid-19 in the general population warrant discussion. we used a general population approach which, although not able to incorporate all determinants of being infected, offers an overall estimate of risk of adverse outcomes from covid-19 that could be used in discussions between clinicians and patients about adjustment of lifestyle or occupational and behavioural factors that could limit viral exposure. our model predicts risks of "catching covid-19 and then having a severe outcome," on the basis of data collected during the first peak of the pandemic. the endpoint in this study examines a risk trajectory that comprises two elements: becoming infected, which is predominantly a function of behavioural/environmental factors including occupation, local infection rate, and numbers of social interactions; and risk of hospital admission and death due to the infection, which is arguably primarily driven by "vulnerability" (that is, biological/physiological factors including age, sex, body mass index, comorbidities, and medications). although producing a prediction model for risk of "death if infected" is feasible in principle, this approach is not yet possible owing to the approach to testing in the uk and the context of an as yet incompletely quantified degree of asymptomatic background transmission. limited covid-19 testing data are available, but the difficulty is that no systematic community testing was done in the uk during the study period, so only patients unwell enough to attend hospital were tested. this means that a risk score developed in those who tested positive would overestimate risks of severe outcomes.
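the two-component structure of the endpoint ("becoming infected", then "a severe outcome if infected") also shows, with toy arithmetic, why a score developed only in test-positive patients would overestimate population-level risk; all numbers below are invented.

```python
# invented numbers for the two components of the endpoint
p_infection = 0.05               # chance of becoming infected over the period
p_severe_given_infection = 0.02  # the 'vulnerability' component

# the population-level risk of the combined endpoint multiplies the two
p_population = p_infection * p_severe_given_infection  # 0.001, i.e. 0.1%

# a score developed only in (hospital-tested) infected patients models
# p_severe_given_infection alone, overstating population-level risk
overestimation_factor = p_severe_given_infection / p_population  # 20x here
```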
as more widespread testing is done and those data become available, we will be able to update the model to take background infection rates into account and also model regional differences. although the absolute risk levels will of course change over time, depending on the incidence of the disease, our analysis over two validation time periods indicates that the relative risk measures and discrimination are likely to remain stable. secondly, the model estimates the absolute risk for a non-infected individual in the general population of becoming infected and then dying (or needing to be admitted to hospital) from the virus over a 97 day period. although many more than 40 000 people have died from covid-19 in the uk to date, when the denominator is a population of multi-millions, the absolute risk for most people may be low. therefore, when conveying this type of risk score to an individual, due emphasis is needed on the different meanings of absolute and relative risk. thirdly, the absolute risk of catching covid-19 depends not only on the incidence of the infection but also on the number of people one gets close to. for this reason, non-pharmacological interventions such as social distancing and shielding were introduced in the uk during the study period. we have included some measures of multi-occupancy, as we have factored care homes into the analysis. the data generated during the study period will therefore be affected by the uptake of interventions such as social distancing and shielding, intended to mitigate the risks of sars-cov-2 infection. this could result in underestimation of some model coefficients and hence underestimation of absolute risk in people who were shielded.
also, as this is a prediction model derived from an observational study, the associations estimated for individual predictor variables should not be interpreted as causal effects. however, ethical questions must be considered regarding how the tools may be used. we have presented two ways of stratifying risk based on either absolute or relative risk measures with associated centile values, but the choice of whether to have a threshold (given that risk is a continuous measure), and if so what threshold, will depend on the purpose for which the risk assessment tool is to be used, the available resources, and the ethical framework for decision making. we have analysed this within the "four ethical principles" framework that is widely used in medical decision making. the four principles are autonomy, beneficence, justice, and non-maleficence. 32 the new risk equations, when implemented in clinical software, are designed to provide more accurate information for patients and clinicians on which to base decisions, thereby promoting shared decision making and patient autonomy. they are intended to result in clinical benefit by identifying where changes in management are likely to benefit patients, thereby promoting the principle of beneficence. justice can be achieved by ensuring that the use of the risk equations results in fair and equitable access to health services that is commensurate with patients' level of risk. lastly, the risk assessment must not be used in a way that causes harm either to the individual patient or to others (for example, by introducing or withdrawing treatments where this is not in the patient's best interest), thereby supporting the non-maleficence principle. how this applies in clinical practice will naturally depend on many factors, especially the patient's wishes, the evidence base for any interventions, the clinician's experience, national priorities, and the available resources. 
the risk assessment equations therefore supplement clinical decision making and do not replace it. with these caveats, the predicted risk estimates can be used to identify people at higher risk, to inform shared decision making between healthcare professionals and service users, or for population level stratification.

strengths and limitations of study
our study has some major strengths, but some important limitations, which include the specific factors related to covid-19 along with others that are similar to those for a range of other widely used clinical risk prediction algorithms developed using the qresearch database. [14] [15] [16] key strengths include the use of a very large validated data source that has been used to develop other risk prediction tools; the wealth of candidate risk predictors; the prospective recording of outcomes and their ascertainment using multiple national level database linkage; lack of selection, recall and respondent biases; and robust statistical analysis. we have used non-linear terms for body mass index and age. we examined interaction terms, which show increased risks at younger ages for adults with type 2 diabetes. we also established a new linkage to the systemic anti-cancer therapy (sact) database for chemotherapy prescribed and administered in secondary care (which may not be recorded well in general practice software) to circumvent possible missing data for this important variable. specific limitations include the occurrence of shielding during the study period and that the study was conducted during the first phase of the uk epidemic. we have accounted for many risk factors for covid-19 mortality, but risks may be conferred by some rare medical conditions or other factors such as occupation that have not yet been observed or are poorly recorded in general practice or hospital data. in particular, the model does not include two important predictors, namely prevailing infection rate and personal social distancing measures.
a lack of comprehensive testing has led to some missing data on covid-19 admissions and/or deaths, which means that development of a valid model for predicting death in people infected with sars-cov-2 is not yet possible. we acknowledge that absolute risks are changing during the course of the pandemic, so these should be interpreted with caution. however, we would expect predictors of risk, relative risk measures, and discrimination to be more stable over time, which is consistent with the results from our temporal validation. although this tool was modelled on the best available data from the first wave of the pandemic, it will be updated as further testing and outcome data accrue, immunity levels change, and (potentially) a vaccine becomes available. nevertheless, having a risk score available at this stage of the pandemic may be useful to identify people at high risk before a vaccine or treatment is available. we have reported a validation in each of two time periods using practices from qresearch, but these practices were completely separate from those used to develop the model. we have used this approach previously to develop and validate other widely used prediction models. when these have been further externally validated on completely different clinical databases, by ourselves and others, the results have been very similar. [33] [34] [35] work is already under way to evaluate the models in external datasets across all four nations of the uk and to integrate the algorithms within nhs clinical software systems. this study presents robust risk prediction models that could be used to stratify risk in populations for public health purposes in the event of a "second wave" of the pandemic and support shared management of risk. we anticipate that the algorithms will be updated regularly as understanding of covid-19 increases, as more data become available, as behaviour in the population changes, or in response to new policy interventions. 
it is important for patients/carers and clinicians that a common, appropriately developed, evidence based model exists that is consistently implemented and is supported by the academic, clinical, and patient communities. this will then help to ensure consistent policy and clear national communication between policy makers, professionals, employers, and the public.

references
impact assessment of non-pharmaceutical interventions against coronavirus disease 2019 and influenza in hong kong: an observational study
effects of non-pharmaceutical interventions on covid-19 cases, deaths, and demand for hospital services in the uk: a modelling study
clinical course and risk factors for mortality of adult inpatients with covid-19 in wuhan, china: a retrospective cohort study
covid-19 and african americans
clinical characteristics of 113 deceased patients with coronavirus disease 2019: retrospective study
clinical course and mortality risk of severe covid-19
variation in covid-19 hospitalizations and deaths across new york city boroughs
writing group from obesity uk, obesity empowerment network, uk association for the study of obesity. obesity and covid-19: a call for action from people living with obesity
obesity is a risk factor for severe covid-19 infection: multiple potential mechanisms
prevalence of co-morbidities and their association with mortality in patients with covid-19: a systematic review and meta-analysis
shielding from covid-19 should be stratified by risk
prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal
development and validation of qrisk3 risk prediction algorithms to estimate future risk of cardiovascular disease: prospective cohort study
development and validation of qdiabetes-2018 risk prediction algorithm to estimate future risk of type 2 diabetes: cohort study
development and validation of qmortality risk prediction algorithm to estimate short term risk of death and assess frailty: cohort study
protocol for the development and evaluation of a tool for predicting risk of short-term adverse outcomes due to covid-19 in the general uk population
transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (tripod): the tripod statement
record working committee. the reporting of studies conducted using observational routinely-collected health data (record) statement
the black report. penguin
a proportional hazards model for the subdistribution of a competing risk
explained variation for survival models
a new measure of prognostic separation in survival data
multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors
assessing the performance of prediction models: a framework for traditional and novel measures
comparing the predictive powers of survival models using harrell's c or somers' d
temporal recalibration for improving prognostic model development and risk predictions in settings where survival is improving over time
external validation of clinical prediction models using big datasets from e-health records or ipd meta-analysis: opportunities and challenges
improvement in cardiovascular risk prediction with electronic health records
non-invasive risk scores for prediction of type 2 diabetes (epic-interact): a validation of existing models
factors associated with covid-19-related death using opensafely
medical ethics: four principles plus attention to scope
an independent external validation and evaluation of qrisk cardiovascular risk prediction: a prospective open cohort study
an independent and external validation of qrisk2 cardiovascular disease risk score: a prospective open cohort study
predicting the 10 year risk of cardiovascular disease in the united kingdom: independent and external validation of an updated version of qrisk2
web appendix: supplementary materials

we acknowledge the contribution of emis practices who contribute to qresearch and emis health and the universities of nottingham and oxford for expertise in establishing, developing, or supporting the qresearch database. this project involves data derived from patient level information collected by the nhs, as part of the care and support of cancer patients.
the data are collated, maintained, and quality assured by the national cancer registration and analysis service, which is part of public health england (phe). access to the data was facilitated by the phe office for data release. the hospital episode statistics data used in this analysis are reused by permission from nhs digital, which retains the copyright in that data. we thank the office for national statistics (ons) for providing the mortality data. nhs digital, phe, and the ons bear no responsibility for the analysis or interpretation of the data. we express our gratitude to anne rigg, nisha shaunak, tom charlton, ana montes, claire harrison, susan robinson, david wrench, matthew streetly, omer bengal, doraid alrifai, and rajjinder nijjar for aiding the authors (notably pj and jhc) with the classification of agents on the sact dataset linkage used in this study and to david coggon for general comments on the study protocol and interpretation.

contributors: jhc, cc, akc, rk, kdo, ph, and nm led study conceptualisation. all authors contributed to the development of the research question and study design, with development of advanced statistical aspects led by jhc, cc, rk, kdo, and akc. jhc, akc, cc, jb, and pj were involved in data specification, curation, and collection. jhc and akc developed, checked, or updated clinical code groups. jhc did the statistical analyses, which were checked by cc. jhc developed the software for the web calculator. all authors contributed to the interpretation of the results. akc and jhc wrote the first draft of the paper. all authors contributed to the critical revision of the manuscript for important intellectual content and approved the final version of the manuscript. the corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted. jhc is the guarantor.
funding: this study is funded by a grant from the national institute for health research (nihr) following a commission by the chief medical officer for england, whose office contributed to the development of the study question, facilitated access to relevant national datasets, and contributed to interpretation of data and drafting of the report.

competing interests: clinrisk ltd produces open and closed source software to implement clinical risk algorithms (outside this work) into clinical computer systems; cc reports receiving personal fees from clinrisk, outside this work; ah is a member of the new and emerging respiratory virus threats advisory group; pj was employed by nhs england during the conduct of the study and has received grants from epizyme and janssen and personal fees from takeda, bristol-myers-squibb, novartis, celgene, boehringer ingelheim, kite therapeutics, genmab, and incyte, all outside the submitted work; akc has previously received personal fees from huma therapeutics, outside of the scope of the submitted work; rl has received grants from health data research uk outside the submitted work; as has received grants from the medical research council (mrc) and health data research uk during the conduct of the study; cs has received grants from the dhsc national institute of health research uk, mrc uk, and the health protection unit in emerging and zoonotic infections (university of liverpool) during the conduct of the study and is a minority owner in integrum scientific llc (greensboro, nc, usa) outside of the submitted work; kk has received grants from nihr, is the national lead for ethnicity and diversity for the national institute for health applied research collaborations, is director of the university of leicester centre for black minority ethnic health, was a steering group member of the risk reduction framework for nhs staff (chair) and for adult care staff, is a member of independent sage, and is supported by the nihr applied research collaboration east midlands (arc em)
and the nihr leicester biomedical research centre (brc); rhk was supported by a ukri future leaders fellowship (mr/s017968/1); kdo was supported by a grant from the alan turing institute health programme (ep/t001569/1); no other relationships or activities that could appear to have influenced the submitted work. the views expressed are those of the author(s) and not necessarily those of the nihr, the nhs, or the department of health and social care.

data sharing: to guarantee the confidentiality of personal and health information, only the authors have had access to the data during the study in accordance with the relevant licence agreements. access to the qresearch data is according to the information on the qresearch website (www.qresearch.org).

the lead author affirms that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.

dissemination to participants and related patient and public communities: patient representatives from the qresearch advisory board have advised on dissemination of studies using qresearch data, including the use of lay summaries describing the research and its findings.

provenance and peer review: not commissioned; externally peer reviewed. this is an open access article distributed in accordance with the terms of the creative commons attribution (cc by 4.0) license, which permits others to distribute, remix, adapt and build upon this work, for commercial use, provided the original work is properly cited. see: http://creativecommons.org/licenses/by/4.0/.
key: cord-320141-892v3b7m authors: boshra, mina; godbout, justin; perry, jeffrey j.; pan, andy title: 3d printing in critical care: a narrative review date: 2020-09-30 journal: 3d print med doi: 10.1186/s41205-020-00081-6 sha: doc_id: 320141 cord_uid: 892v3b7m background: 3d printing (3dp) has gained interest in many fields of medicine, including cardiology, plastic surgery, and urology, due to its versatility, convenience, and low cost. however, critical care medicine, which is abundant with high acuity yet infrequent procedures, has not embraced 3dp as much as other fields. the discrepancy between the possible training or therapeutic uses of 3dp in critical care and what is currently utilized in other fields needs to be addressed. objective: this narrative literature review describes the documented uses of 3dp in critical care. it also discusses possible future directions based on recent technological advances. methods: a literature search on pubmed was performed using keywords and mesh terms for 3dp, critical care, and critical care skills. results: our search found that 3dp use in critical care fell under the major categories of medical education (23 papers), patient care (4 papers), and clinical equipment modification (4 papers). medical education papers described the use of 3dp in bronchoscopy, congenital heart disease, cricothyroidotomy, and medical imaging. patient care papers discussed 3dp use in wound care, personalized splints, and patient monitoring. clinical equipment modification papers reported the use of 3dp to modify stethoscopes and laryngoscopes to improve their performance. notably, only 13 of the 31 papers directly involved critical care physicians in their production or study. conclusion: the papers discussed provide examples of the possible utilities of 3dp in critical care. the relative scarcity of papers produced by critical care physicians may indicate barriers to 3dp implementation.
however, technological advances such as point-of-care 3dp tools and the increased demand for 3dp during the recent covid-19 pandemic may change 3dp implementation across the critical care field. the advantages of the use of 3-dimensional printing (3dp) technology in the medical field are numerous [1, 2]. the capability of 3dp technology to create high fidelity products has proved to be an asset in the production of patient specific models and prostheses (e.g. congenital heart disease models based on a patient's radiological data) [3]. moreover, the digital design of 3d models can be easily altered to fit its intended use by utilizing widely available software [4] [5] [6]. its high output speed and the affordability of materials enable 3dp to meet high demands during shortages. for instance, it was able to supply many healthcare institutions with the protective equipment they needed during the covid-19 pandemic [7]. over the past few decades, many medical subspecialties began using 3dp for a variety of purposes. for example, cardiac surgeons began using computed tomography scans to create 3dp models of patients' hearts to help with surgical planning [8]. this widespread use of 3dp in medicine has become prevalent enough to create special interest groups to devise appropriateness criteria for 3dp utilization in clinical settings. through these criteria, 3dp implementation in medicine can become better regulated and therefore more established [3]. nevertheless, despite the aforementioned uses and advantages, 3dp technology has not been as heavily implemented in the field of critical care. this is noteworthy since critical care has many areas where 3dp could be applied. one such area is simulation training. simulation has been shown by multiple studies to be at least as effective as standard lectures and visual aids [9, 10].
likewise, it has been shown that the "see one, do one, teach one" approach to medical education should be replaced by a model emphasizing constant practice in order to achieve a high level of competency [11]. in the critical care field, procedural competence is often hindered by virtue of the field's many risky yet infrequent procedures (e.g. cricothyroidotomy). these procedures, while relatively infrequent, carry higher risk for patients if done inappropriately. therefore, simulation models would help increase the physician's comfort with the procedure without causing any harm to patients. however, many commercially available simulators are either expensive or depend on animal substitutes (which present additional storage and procurement issues). simulation using 3dp models can avoid these issues due to the models' lower cost and ease of production [12]. manufacturing of tools and equipment can also improve through 3dp implementation. this is especially important in low-resource settings where acquiring medical equipment may be economically or logistically challenging. 3dp can also be used to educate patients, staff, or caregivers [13, 14]. we conducted a literature review to summarize the current educational and therapeutic uses of 3dp for critical care procedures or within critical care settings. the literature was collected by performing a comprehensive search of the pubmed database for all articles, from inception until july 30th, 2019, containing the keywords used in the literature to represent 3dp (e.g. 3d printing, three-dimensional print, additive manufacturing) and those used to represent critical care (e.g. critical care, emergency medicine, intensive care unit). the scarcity of results led us to modify the search strategy to include keywords representing the repertoire of skills of a critical care physician (additional file 1).
this modification was performed to include articles pertaining to the field of critical care but not necessarily authored by critical care physicians. with the assistance of a health sciences research librarian, terms and mesh categories for 3dp, critical care, and the skills required of a critical care physician were combined to create our search strategy (additional file 1). the search yielded 5846 results, which were imported into covidence (veritas health innovation, melbourne, australia). the program found no duplicates. papers were screened by title and abstract and excluded if they involved non-human subjects, described implementations not pertinent to critical care, or were reviews. this created a list of 87 papers, which were further reviewed for their applicability to the field of critical care by two critical care physicians (authors a.p. and j.g.). this left 35 papers, which underwent full-text screening (fig. 1). the papers were additionally examined for the degree of involvement of critical care in their production. involvement in this study was defined as critical care physicians being part of the research team or research participants. finally, the methodologies of the papers were divided into randomized controlled trials (rcts), technical reports, and quasi-experiments. quasi-experiments were defined as studies that aim to demonstrate a causal relationship by introducing an intervention and control groups but without randomization [15]. our search produced 31 papers that described possible uses of 3dp in critical care, which can be divided into three main themes: medical education (med-ed), patient care, and clinical equipment modification (cem) (table 1). topics under med-ed included bronchoscopy (9 studies), congenital heart disease (chd) (4 studies), cricothyroidotomy (3 studies), and medical imaging (3 studies). some single-study topics within medical education involved thoracotomy, chest tube insertion, epistaxis management, and pediatric intubation.
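the boolean structure of a search strategy like the one described above can be sketched as follows; the term lists here are illustrative placeholders, not the authors' exact strategy (which is given in their additional file 1).

```python
# Sketch of a boolean PubMed-style query combining 3DP terms with
# critical-care and skill terms. Term lists are illustrative placeholders,
# not the authors' exact strategy.

printing_terms = ["3d printing", "three-dimensional printing", "additive manufacturing"]
critical_care_terms = ["critical care", "emergency medicine", "intensive care unit"]
skill_terms = ["cricothyroidotomy", "bronchoscopy", "chest tube insertion"]

def or_block(terms):
    """Join quoted terms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

def build_query(printing, *context_groups):
    """Require a printing term AND at least one term from any context group."""
    context = [t for group in context_groups for t in group]
    return f"{or_block(printing)} AND {or_block(context)}"

query = build_query(printing_terms, critical_care_terms, skill_terms)
print(query)
```

widening the second operand with skill terms mirrors the authors' revision: it captures papers about critical-care procedures even when the words "critical care" never appear.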
studies within the patient care category included wound care (1 study), personalized splints (2 studies), and patient monitoring (1 study). cem involved a 3dp stethoscope (1 study) and laryngoscope adjustments (3 studies). the characteristics of the papers can be found in table 2. med-ed papers accounted for 74% of the studies found. bronchoscopy was the most common topic, with 39% of med-ed papers, followed by chd at 17.4%. med-ed papers were mostly technical reports and quasi-experiments, with only 21.7% of the studies containing rcts; 65.2% and 52.2% of the med-ed papers included a technical report or quasi-experiment, respectively. patient care papers comprised 17.4% of the total number of papers, with half focusing on personalized splints. moreover, patient care was the only topic that included a case study, while the other three papers were either technical reports or quasi-experiments. cem papers also made up 17.4% of the total number of papers found; however, cem was mostly comprised of laryngoscope modification projects (75%). all cem papers included a technical report, with only one using an rct. comparing the involvement of critical care medicine in the papers, we found that only 13 of the papers had critical care contribution; moreover, 12 of these were within med-ed, while patient care and cem had 0 and 1, respectively (fig. 2).

[table 2, rows recovered from extraction - study | topic | design | participants (n) | key findings:]
... | ... | ... | interventional radiology residents (19) | the model was realistic for both anatomical variants and diseased vessels and scored higher than other models in realism; the feedback was used to modify the gelatin, making it more stable and increasing the compressibility of the venous system.
yates e. et al. [35] | thoracotomy | technical report and quasi-experiment | em residents (21) | before-and-after questionnaires showed that the 3dp model increased residents' confidence in performing the procedure.
bettega a.l. et al. [36] | chest tube insertion | technical report and rct | medical students | pre/post questionnaires showed an increase in confidence equal to animal models.
estomba c.c. et al. [37] | epistaxis | technical report and quasi-experiment | em and ent residents (nr) | the multi-material model with pulsating blood vessels received positive feedback from the residents for its realism.
park l. et al. [38] | endotracheal intubation | (remainder of row lost in extraction)
note: quasi-experiments were identified as studies introducing an intervention but without randomization of the participants.

this review shows that 3dp can have a variety of utilities in the field of critical care, including medical education, patient care, and development of clinical equipment; however, med-ed takes the lead as the most common utility of 3dp, with over 70% of the papers found discussing the use of 3dp models to train medical students and/or residents. this high percentage can be explained by the key findings of the papers. first, 3dp's ability to create simulation models for numerous parts of the body, including airways, the shoulder girdle, and the nasal cavity, provides the opportunity to practice a large variety of skills. such skills may be difficult to practice on real-life patients due to their high acuity and infrequency (e.g. cricothyroidotomy) [16-22, 32, 38]. therefore, obtaining a model that can be used for frequent practice can be essential and life-conserving [12]. many of the studies showed that 3dp models were anatomically accurate and matched, if not surpassed, conventional models in their realism and student preference [18-20, 23, 28-32]. moreover, the simulators were able to assess the difference in proficiency between novices and experts by showing a clear correlation between the scores of the users and their number of years of experience [24].
this ability to discern between novices and experts enables the 3dp models to be used for assessing the competency of students as they progress in their training. in addition, the ability to help novice practitioners match experts after practicing on the simulator makes 3dp models useful training modules [22]. another advantage of 3dp models is their ability to educate the user on both normal and variant/abnormal anatomy. for instance, 3dp models of congenital heart defects have been successfully used to increase participants' knowledge of the anatomical issues and their consequences [25-28]. normal anatomical variants have also been incorporated in many simulation models [17, 34]. interestingly, the study by white and colleagues found that the 3dp group scored higher on the tetralogy of fallot test while the didactic class group scored better on the ventricular septal defect (vsd) test [28]. according to the authors, while vsd is simple enough to be learned through didactic teaching, tetralogy of fallot is a complex case in which the tactile component may be advantageous to understanding the intricate anatomy; that is why the 3dp group was able to do better on that test than the classroom group. therefore, for the optimum use of 3dp models in training, they should be used where a mix of visual and tactile information is beneficial. another noteworthy observation was that none of the papers we found discussed 3dp in critical care for adult cardiac disease. although 3dp in cardiac medicine is a well-established field of research with various reviews [8, 47], 3dp utilization in adult cases mainly revolves around defect visualization, procedural planning, and surgical device innovation [47]. the scarcity of medical education utilities in adult cardiology is represented by both our results and those reported by vukicevic and colleagues in their review of 3dp uses in cardiac medicine [47].
considering the positive results of 3dp utilization in chd for educating critical care physicians, similar training modules for the management of adult cases in cardiac intensive care units may prove beneficial. the papers found under the categories of patient care and cem represent the innovations possible through 3dp's versatility. for example, the ability to create filaments of different characteristics allows for the production of more complex products. [fig. 2 caption: number of studies involving critical care within the major utilities of 3dp in critical care.] this was shown in the use of multiple materials to simulate the different tissue densities common to human structures [16, 33, 37, 38]. furthermore, specially designed materials can be created for a particular utility. for instance, muwaffak and colleagues were able to create specialized filaments containing silver, zinc, and copper and combine them into a personalized wound dressing that boasted the antimicrobial abilities of these metals [39]. another source of versatility is the ability to create 3dp molds of the desired structures. given the sometimes-limited material properties offered by most 3dp technologies, 3dp molds can be used to shape silicone or gelatin into anatomic structures that possess properties (e.g. mechanical) that more closely resemble tissue [20, 32]. 3dp molds were used with silicone in risler and colleagues' research to create the outer shoulder shell, which provided feedback under ultrasound (us) that resembled human tissue [32]. 3dp versatility has also increased through the personalization of therapeutic devices, such as splints fitted specifically to each patient. 3dp of personalized splints such as those described in li et al. and wu p-k et al. is made possible by hand-held 3d scanners that can capture a person's exact measurements within seconds, which can then be used to custom-design the splint to fit the measured anatomy [40, 41].
overall, 8 of the 31 studies specifically discussed that their 3dp models were either cheaper than or of similar price to conventional models [16, 17, 31, 34, 35, 37, 38, 41]. the savings can be substantial: one study reported conventional models costing up to 250% more than their 3dp counterparts [17], which provides a strong motive for furthering the implementation of 3dp technology in critical care. the expiration of patents on various printers and the widespread availability of materials have driven this decrease in cost and the increased availability of 3dp [48]. despite the various advantages of 3dp implementation in critical care, only 13 of the 31 papers involved critical care physicians as authors or participants. this could be due to a variety of reasons, from 3dp illiteracy, to lack of knowledge of possible implementations, to the fact that many 3d-printed models are developed by surgeons or non-clinical researchers, for whom the applications are more widespread. another possible cause is that many physicians believe the urgency of most critical care cases leaves little time for designing and printing instruments. the relative difference in critical care involvement between the three topics supports this theory, with 52% of med-ed papers involving critical care versus 25% and 0% in cem and patient care, respectively. this higher involvement in med-ed could be explained by the fact that education usually occurs in less urgent settings. however, with innovations such as 3dp wound dressings that can be made beforehand, printers that can produce splints in only a few hours, and the shortage of supplies mitigated by 3dp during the covid-19 pandemic, we hope to see an increase in 3dp implementation in critical care tools and patient care [39, 40]. our search strategy was expanded by searching for the use of 3dp in skills pertaining to critical care. this allowed us to capture and describe results from both current and possible implementations of 3dp in critical care.
moreover, we have presented the details and key findings of each study (table 2), which can help guide future research. many of the papers discussed were technical reports of models and hence can be developed and researched further. additionally, our results were supported by the findings of other larger reviews [1, 49, 50]. nonetheless, there are a few limitations to this review. first, our results are restricted to the papers found in the pubmed database. moreover, since our search was conducted before the covid-19 pandemic, additional uses of 3dp may have emerged to battle instrument shortages. nevertheless, we believe that any extra papers would still fall under the major topics of med-ed, patient care, and cem. another limitation is the low number of papers found in both cem and patient care. furthermore, many of the papers found did not test the clinical significance of their innovations. however, the positive results from every quasi-experiment and rct reported here support the hypothesis that uses of 3dp are clinically significant. further research guided by our description of the benefits of 3dp in critical care will also help mitigate some of the issues caused by the low number of results found. with 3dp technology continuously improving, we expect a rise of new initiatives in the field of critical care. for instance, the ability of 3dp models to serve as simulation training modules for novice physicians will be crucial as the medical field begins its transition to competency-based learning. the versatility of 3dp raw materials makes it possible to create simulation models that cover an array of competencies and skills. for instance, researchers have been able to create high-quality 3dp phantoms using different materials to resemble the physical characteristics of distinctive tissue types [51]. these phantoms can be used to train novice critical care physicians on their imaging diagnostic skills as well as imaging-guided procedures [32, 33].
nevertheless, further collaboration with 3dp companies is needed in the future to improve the fidelity of these phantoms through specially designed raw materials that more accurately depict the characteristics of human tissue [51]. with the continued development of 3dp simulation models, the authors hope that an open-source library with the printing files of the models can be made available so that physicians in resource-scarce regions can still maintain their training. furthermore, the tools used in critical care can benefit from the enhancements possible through 3dp. for example, biochemical researchers [52] have designed 3d-printed materials that could be used to enhance wound healing. this ability can be applied to wound dressing manufacturing and tested in a critical care setting to determine the advantages such dressings provide over commercially available ones. another field in 3dp research that has been gaining attention is point-of-care testing (poct). poct is the field of diagnostic testing that can be done in real time, generally outside of a laboratory and by untrained individuals [53]. this field has become essential for diagnosis both in the developing world and in rural or resource-scarce areas of the developed world [54]. therefore, future research into 3dp poct projects such as abo blood typing and wireless monitoring of key metabolites may yield tools readily utilized in critical care settings [53, 55, 56]. the future implementation of 3dp in critical care has been affected greatly by the covid-19 pandemic. the shortage of personal protective equipment and ventilation valves has highlighted the value of 3dp's quick turnaround and production rate [57]. indeed, many research endeavours have utilized 3dp to overcome the scarcity of resources facing many hospitals. for example, callahan and colleagues were able to use 3dp to create nasal swabs that were comparable to commercial ones [58].
other uses of 3dp included the production of face shields, n95 masks, ventilator valves, and environmental protection (e.g. hands-free door handles) [59, 60]. the pandemic uncovered the limitations of many hospitals when they were cut off from their suppliers and faced shortages of necessary tools and equipment. however, this can be prevented in the future through two important steps. first, advocating for the development of 3dp labs within hospitals, and training staff on the protocol for employing 3dp tools during emergent situations, may mitigate some of the effects of supply shortages. second, the creation of a central depository for medical 3dp designs may help increase hospitals' access to readily available products. such a depository could also increase the number of trials a product undergoes, which can hasten its development and improvement. this narrative review has summarized the major uses of 3dp in the field of critical care, which were found to be mainly within the realms of medical education (e.g. simulation models and training modules), patient care (e.g. wound care and personalized splints), and clinical equipment modification (e.g. a 3dp laryngoscope handle). moreover, our search found that most of the research endeavours, while discussing 3dp utilities applicable to the field of critical care, were not performed by critical care physicians. this fact represents the need for critical care-specific studies that consider the needs of the field and how 3dp can fulfill them. finally, we looked at how some of the new innovations in 3dp, such as biochemically active raw materials, may be beneficial for the future of critical care. with these various advantages of 3dp and the clear demand for its role in a plethora of aspects of critical care, we expect to witness greater involvement of critical care physicians in this field in the near future. supplementary information accompanies this paper at https://doi.org/10.
1186/s41205-020-00081-6. additional file 1.

references (titles recovered from extraction):
- medical 3d printing for the radiologist
- radiographics update: medical 3d printing for the radiologist
- (rsna) 3d printing special interest group (sig): guidelines for medical 3d printing and appropriateness for clinical scenarios
- 3d printed ventricular septal defect patch: a primer for the preoperative planning and tracheal stent design in thoracic surgery
- custom-made 3d-printed face masks in case of pandemic crisis situations with a lack of commercially available ffp2/3 masks
- current and future applications of 3d printing in congenital cardiology and cardiac surgery
- prospective randomized comparison of standard didactic lecture versus high-fidelity simulation for radiology resident contrast reaction management training
- prospective randomized trial of simulation versus didactic teaching for obstetrical emergencies
- see one, do one, teach one: advanced technology in medical education
- computer-based simulation training in emergency medicine designed in the light of malpractice cases
- usage of 3d models of tetralogy of fallot for medical education: impact on learning congenital heart disease
- personalized 3d printed model of kidney and tumor anatomy: a useful tool for patient education
- the use and interpretation of quasi-experimental studies in medical informatics
- multimaterial three dimensional printed models for simulation of bronchoscopy
- novel application of rapid prototyping for simulation of bronchoscopic anatomy
- development of an innovative 3d printed rigid bronchoscopy training model
- realistic 3d-printed tracheobronchial tree model from a 1-year-old girl for pediatric bronchoscopy training
- three-dimensional printed pediatric airway model improves novice learners' flexible bronchoscopy skills with minimal direct teaching from faculty
- evaluation of a low-cost, 3d-printed model for bronchoscopy training
- development and evaluation of 3d-printed models of human tracheobronchial system for training in flexible bronchoscopy
- a randomised, controlled trial evaluating a low cost, 3d-printed bronchoscopy simulator
- assessment of bronchoscopic dexterity and procedural competency in a low-fidelity simulation model
- incorporating three-dimensional printing into a simulation-based congenital heart disease and critical care training curriculum for resident physicians
- "just-in-time" simulation training using 3-d printed cardiac models after congenital cardiac surgery
- novel, 3d display of heart models in the postoperative care setting improves cicu caregiver confidence
- utility of three-dimensional models in resident education on simple and complex intracardiac congenital heart defects
- modelling and manufacturing of a 3d printed trachea for cricothyroidotomy simulation
- evaluation of an innovative bleeding cricothyrotomy model. cureus
- a high-fidelity simulator for needle cricothyroidotomy training is not associated with increased proficiency compared with conventional simulators
- a three-dimensional printed low-cost anterior shoulder dislocation model for ultrasound-guided injection training
- an assembled prototype multimaterial three-dimensional-printed model of the neck for computed tomography- and ultrasound-guided interventional procedures
- fabrication and assessment of 3d printed anatomical models of the lower limb for anatomical teaching and femoral vessel access training in medicine
- development and utilization of 3d printed material for thoracotomy simulation
- chest tube simulator: development of low-cost model for training of physicians and medical students
- how we do it: anterior and posterior nosebleed trainer, the 3d printing epistaxis project
- increasing access to medical training with three-dimensional printing: creation of an endotracheal intubation model
- patient-specific 3d scanned and 3d printed antimicrobial polycaprolactone wound dressings
- rapid customization system for 3d-printed splint using programmable modeling technique - a practical approach
- printing a 3-dimensional, patient-specific splint for wound immobilization: a case demonstration
- determination of plasma lactate in the emergency department for the early detection of tissue hypoperfusion in septic patients
- a low-cost 3-d printed stethoscope connected to a smartphone
- impact of a custom-made 3d printed ergonomic grip for direct laryngoscopy on novice intubation performance in a simulated easy and difficult airway scenario - a manikin study
- design and evaluation of a novel and sustainable human-powered low-cost 3d printed thermal laryngoscope
- evaluation of a smartphone camera system to enable visualization and image transmission to aid tracheal intubation with the airtraq® laryngoscope
- cardiac 3d printing and its future directions
- how expiring patents are ushering in the next generation of 3d printing
- emerging technology and applications of 3d printing in the medical field
- surgical applications of three-dimensional printing: a review of the current literature & how to get started
- recent advances on the development of phantoms using 3d printing for imaging with ct, mri, pet, spect, and ultrasound
- 3d printed chitosan dressing crosslinked with genipin for potential healing of chronic wounds
- a simple and low-cost portable paper-based abo blood typing device for point-of-care testing
- point-of-care testing: applications of 3d printing
- 3d-printed smartphone-based point of care tool for fluorescence- and magnetophoresis-based cytometry
- wireless monitoring in intensive care units by a 3d-printed system with embedded electronics
- lessons learned from covid-19 and 3d printing
- open development and clinical validation of multiple 3d-printed nasopharyngeal collection swabs: rapid resolution of a critical covid-19 testing bottleneck
- applications of 3d printing technology to address covid-19-related supply shortages
- covid-19 and the role of 3d printing in medicine

publisher's note: springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. the authors would like
to acknowledge the uottawa health librarian majela guzman for her contributions to the design of the search strategy used in this paper. availability of data and materials: not applicable. ethics approval and consent to participate: not applicable. not applicable. the authors declare that they have no competing interests. key: cord-329534-deoyowto authors: mcbryde, emma s.; meehan, michael t.; adegboye, oyelola a.; adekunle, adeshina i.; caldwell, jamie m.; pak, anton; rojas, diana p.; williams, bridget; trauer, james m. title: role of modelling in covid-19 policy development date: 2020-06-18 journal: paediatr respir rev doi: 10.1016/j.prrv.2020.06.013 sha: doc_id: 329534 cord_uid: deoyowto models have played an important role in policy development to address the covid-19 outbreak from its emergence in china to the current global pandemic. early projections of international spread influenced travel restrictions and border closures. model projections based on the virus's infectiousness demonstrated its pandemic potential, which guided the global response to and prepared countries for increases in hospitalisations and deaths. tracking the impact of distancing and movement policies and behaviour changes has been critical in evaluating these decisions. models have provided insights into the epidemiological differences between higher and lower income countries, as well as vulnerable population groups within countries, to help design fit-for-purpose policies. economic evaluation and policies have combined epidemic models and traditional economic models to address the economic consequences of covid-19, which have informed policy calls for easing restrictions. social contact and mobility models have allowed evaluation of the pathways to safely relax mobility restrictions and distancing measures. finally, models can consider future end-game scenarios, including how suppression can be achieved and the impact of different vaccination strategies.
infectious diseases modelling in the current covid-19 pandemic has had more attention from government and media (both social and traditional) than for any previous pandemic. this can be attributed to the huge impact of the pandemic, changes in global communication and technical advances in modelling. these advances encompass new methods, improved computational tools, public data sharing, improved code availability and better visualisation methods, which have all advanced markedly in the decade since the h1n1 influenza pandemic in 2009 [1]. policy makers have placed greater trust in models than ever before, although models have also received considerable criticism, and their limitations need to be acknowledged.
in this paper, we describe ways in which models have influenced policy, from the early stages of the outbreak to the current date, and anticipate the future value of models in informing suppression efforts, vaccination programs and economic interventions. policy on border closures followed early estimates of trans-national spread of covid-19 and was strongly influenced by modelling. in january 2020, modelling based on travellers from wuhan found that early covid-19 case rates were significantly under-reported both in china [2] and overseas [3], and border closure policies promptly followed. china imposed an internal travel lockdown on wuhan on 23rd january, and most countries enacted limited restrictions through february and comprehensive restrictions through march, with many governments using travel risk models to anticipate case numbers with and without border closures (see, for example, shearer et al. [4]). fully connected meta-population travel models have provided additional insights. retrospective analyses have shown that the wuhan lockdown imposed by china did little to delay the outbreak within china, but had a greater impact on other countries [5]. models have also predicted the shifting of epicentres from asia to europe and from the usa to south america and africa, based on the connectedness of these regions [6], enabling enhanced surveillance in vulnerable destination countries. currently, almost every country in the world has experienced local transmission, such that border restrictions are of lesser importance. however, as countries begin to move out of lockdown and seek to reignite their economies, travel modelling will again become helpful for anticipating the risk of reintroduction of cases to jurisdictions that have successfully reduced transmission. many early models estimated a high reproduction number (the average number of secondary cases per infected case).
although precise values differed, in china prior to interventions these mostly fell between two and three [7], heralding the seriousness of the pandemic. based on these estimates, model projections were consistent in predicting that an unmitigated epidemic would overwhelm health systems and lead to unacceptable loss of life [8-10]. a prominent example was the report by imperial college london on the potential of covid-19 to cause widespread infection across the uk and the us if a mitigation (reproduction number greater than one) rather than suppression (reproduction number less than one) strategy was pursued [9]. consistent model findings of high infection rates and mortality collectively resulted in many countries grasping the seriousness of the epidemic. consequently, public health interventions and government-imposed restrictions on human movement were initiated to reduce transmission. models showing changes in transmission rates over time have been powerful tools for enabling policy-makers to demonstrate gains in epidemic control through public health policy and action. many media outlets and public health officials around the world have provided explanations of the effective reproduction number and its critical threshold of one, which is the key to escaping from lockdown. political leaders of new zealand [11], australia [12], the uk [13], indonesia [14] and germany [15] have all used this terminology in communicating decision making for easing lockdowns, demonstrating the marked increase in the public's understanding of infectious disease modelling. as countries see their epidemic incidence curves decline in response to changes in behaviour and policy, developing a strategy to exit lockdown is vital. the aim is to do so gradually, without pushing the effective reproduction number above one and risking a second pandemic wave. models can predict the impact of resuming different activities if the impact of each behavioural intervention is well quantified.
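the threshold behaviour described above (mitigation with a reproduction number above one versus suppression below one) can be illustrated with a minimal sir sketch; the parameters and initial conditions below are illustrative assumptions, not values from the cited reports.

```python
# Minimal deterministic SIR model (forward Euler) illustrating the threshold
# at a reproduction number of one. All parameters are illustrative only,
# not taken from the cited COVID-19 reports.

def sir_peak_infected(r0, gamma=0.2, i0=1e-4, days=400, dt=0.1):
    """Peak infected fraction for a given reproduction number."""
    beta = r0 * gamma              # transmission rate implied by r0 = beta / gamma
    s, i = 1.0 - i0, i0            # susceptible and infected fractions
    peak = i
    for _ in range(int(days / dt)):
        new_infections = beta * s * i * dt
        recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - recoveries
        peak = max(peak, i)
    return peak

# r0 = 2.5 (within the pre-intervention range reported for china) produces a
# large epidemic peak; a suppressed value of 0.8 never grows beyond its seed.
print(sir_peak_infected(2.5))
print(sir_peak_infected(0.8))
```

the qualitative contrast, a peak prevalence in the tens of percent versus a monotonic fade-out, is what separates the "mitigation" and "suppression" scenarios compared in the imperial college report.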
since most interventions were applied simultaneously, their individual contributions are unknown, and an alternative approach is required. one approach is to examine contact patterns and thereby infer infection risk using age-specific mixing models. in 2008, age-specific contact patterns were estimated in several european countries 16 , and synthetic matrices were later developed for 152 countries 17 . changes in contact rates resulting from covid-19 policy can be estimated using a number of sources: google mobility data provides the amount of time spent in locations (https://www.google.com/covid19/mobility/), city-mapper (http://citymapper.com/cmi) provides regularly updated data on direction requests, and nation-wide behaviour surveys provide information on the impact of policies on both the number and duration of contacts in some countries. mixing models synthesise these results to identify age-specific contact rates and use relative infectiousness and susceptibility to infer age-specific infection rates. the findings that children are probably less likely to acquire infection and are much less likely to show symptoms when infected 16 have been incorporated into these models to assist with policy development. in particular, mixing models have shown that school closure has a modest impact on disease transmission at best, which has encouraged several jurisdictions to accelerate school reopenings, or avoid school closure 18, 19, 20 . models designed to inform policy in low- and middle-income countries (lmics) need to capture variation in transmissibility of covid-19 with socio-demographic features. models forecasting the covid-19 epidemic in lmics indicate that transmission is likely to be delayed by the relatively low travel numbers and the higher proportion of children in these countries 19 . this delay has given countries critical time to prepare and strengthen their often limited health infrastructure and diagnostic and surveillance capacity 21 . 
for example, there are only 0.23 ventilators per 100,000 population in uganda, compared with nine in australia and eleven in europe 8, 22 . despite the delay, models predict that any natural protection offered by a younger population is likely to be offset by weaker health systems, overcrowding and comorbidities 23 ; hence lmics have been advised to prepare for a "slow burn" covid-19 epidemic. brazil has been one of the lmics most severely affected by covid-19 (thus far), and its epidemic trajectory provides some insight into the different pandemic trajectories of lmics compared to higher income settings. the first case of covid-19 occurred in brazil on 25th february, 2020 and by early may there were over 100,000 reported cases and 7,000 deaths, with an estimated epidemic doubling time for deaths of five days (the highest of 48 countries analysed) 24 . the rapid transmission in brazil is attributed to a combination of inadequate policy response, high population density, close living quarters, limited access to clean water, informal employment, and transmission to indigenous populations through encounters with illegal miners and loggers in the amazon rainforest 25 . on top of the direct covid-19-related morbidity and mortality, indirect effects associated with disruptions to other critical medical services are anticipated. in some lmics, health care systems will be faced with concurrent outbreaks, such as dengue in ecuador, brazil and other latin american countries 26 . models can also help estimate the indirect effects of covid-19, such as through interruptions to the supply of antiretroviral therapy (hiv treatment) in south africa due to covid-19. it has been estimated that this disruption could cause a death toll on the same order of magnitude as the deaths that would be averted by physical distancing interventions for covid-19 itself 27 . 
similarly, disruptions to vaccine delivery in lmics could lead to major outbreaks of vaccine preventable diseases, such as measles, for many years to come 28 . policies that have been effective in developed countries may not translate to impact in lmics. this has been attributed to physical distancing often being impractical due to large household sizes, overcrowded settlements, and informal economies 23 . lockdowns that severely restrict movement outside the home are less feasible and more harmful in countries where work is essential for survival. it seems almost certain that a vaccine will be needed if covid-19 is to be eliminated globally. as of 2nd june 2020, there are ten clinical trials for covid-19 vaccines in different stages of development and 123 molecules in preclinical evaluations 29 . modelling can be useful in evaluating the efficacy of vaccines during clinical trials and in diminishing biases 30, 31 . models have already shown that an immunity level of 60% is desirable to protect those who cannot or choose not to become vaccinated. however, once a vaccine becomes available, an implementation strategy will be required as supply scales up. should we vaccinate the most vulnerable groups, those at the front-line of control, or those contributing the most to transmission first? while ethical and equity considerations will appropriately drive much of this debate, modelling has a role to play. modelling can help to assess the potential effectiveness of different vaccination strategies, such as location-specific ring-vaccination for ebola 32 , age-specific vaccination for influenza 33 and assessing long-term risks and benefits of dengue vaccination 34 . for covid-19, strategies may differ between countries depending on the acuity of the epidemic, the age groups driving the infection or at higher risk for severe disease, and the age structure of the population. 
the containment and mitigation measures implemented by countries around the world to slow the spread and reduce mortality due to covid-19 have resulted in a worldwide economic crisis, with the cost likely to exceed the damage from the global financial crisis of 2007-2008. unlike previous such crises, the covid-19 pandemic has simultaneously created major downturns in both supply and demand across the world 35 . this creates significant policy challenges for governments to design strategies that take into account both the effectiveness of public health intervention measures and their negative socioeconomic consequences 36 . containment and mitigation efforts, which are aimed at saving lives, avoiding human capital losses and flattening the epidemic curve, also reduce economic activity 37, 38 . this intentional reduction of economic transactions has been undertaken by many governments as a necessary public health measure, at least in the early stages of the epidemic. apart from the humanitarian perspective, there are economic benefits to counter some of the losses of these policies, such as avoiding loss in income due to premature deaths, workplace absenteeism, and reduction in productivity. contractions in supply and demand also come from physical distancing, travel restrictions, non-essential business closures and other measures, as people work and consume less 39 . some studies have used early estimates of infection rates and case fatality ratios to assess healthcare costs due to the pandemic and provide cost-benefit analysis of non-pharmaceutical interventions 40 . others have used simulations to model epidemiological scenarios in inter-temporal general equilibrium models 41 . for example, a dynamic stochastic general equilibrium model is used to model global economic outcomes under different scenarios of the pandemic's evolution over the remainder of this year 42 . 
a limitation of this modelling approach is that it requires a large number of inputs and assumptions on economic relationships and epidemiological characteristics. as economic consequences of the covid-19 pandemic originated from epidemiological factors, and these factors will continue to play a role in the dynamics of control, the economic trade-offs between public health and economics should be considered, and analyses that combine epidemiological and economic modelling will be valuable. much of the value of models in the covid-19 pandemic is in informing immediate policy decisions. a collaborative relationship between modellers and policy-makers ensures that models focus on priority questions. providing modellers with a clear view of the policy environment allows them to propose and develop models that support decision making. this may go beyond policy-makers outlining the policy questions of interest and the interventions under consideration, and should ideally create an environment where modellers and policy-makers work in partnership. modelling analyses may also play a role in broadening the scope of options under consideration. for example, modellers advising the uk health authorities did not initially explore a strategy that involved increased testing because they had been advised that there were limits on the extent to which testing capacity could be scaled up 43 . however, modelling more ambitious scenarios can provide high benefits at low cost, even if the strategies simulated are not immediately able to be enacted. if models had shown that increasing testing, tracing, and isolation would result in dramatically better outcomes, these activities may have been prioritised sooner in the uk 44 . similarly in australia, models presented to the australian government in march 2020 examined only mitigation strategies 45 . had these early models shown the enormous potential impact of suppression, earlier and shorter lockdowns may have been planned. 
the current pandemic crisis has led to a resurgence of interest in modelling, and an increase in its use for guiding policy. modelling advice should always be developed and considered in context, allowing for setting-specific limitations in capacity. early analyses have been biased towards high-income settings where both modelling expertise and reported cases have been concentrated. however, as the pandemic now shifts towards highly vulnerable lmics, a concerted effort is required to provide international modelling support to local containment efforts.

references (titles as extracted, in citation order):
technology to advance infectious disease forecasting for outbreak management
report 1, who collaborating centre for infectious disease modelling, mrc centre for global infectious disease analysis, imperial college london, uk
identifying locations with possible undetected imported severe acute respiratory syndrome coronavirus 2 cases by using importation predictions
assessing the risk of spread of covid-19 to the asia pacific region
delaying the covid-19 epidemic in australia: evaluating the effectiveness of international travel bans
change in outbreak epicenter and its impact on the importation risks of covid-19 progression: a modelling study
the reproductive number of covid-19 is higher compared to sars coronavirus
modelling the impact of covid-19 on intensive care services in new south wales
impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
flattening the curve is not enough, we need to squash it. an explainer using a simple model
how new zealand's 'eliminate' strategy brought new coronavirus cases down to zero
press conference - australian parliament house, act: transcript
full speech on modified coronavirus lockdown plan
indonesia's r0, explained
angela merkel uses science background in coronavirus explainer
social contacts and mixing patterns relevant to the spread of infectious diseases
projecting social contact matrices in 152 countries using contact surveys and demographic data
stepping out of lockdown should start with school re-openings while maintaining distancing measures. insights from mixing matrices and mathematical models
age-dependent effects in the transmission and control of covid-19 epidemics
the effect of control strategies to reduce social mixing on outcomes of the covid-19 epidemic in wuhan, china: a modelling study. the lancet public health
africa in the path of covid-19
helping africa to breathe when covid-19 strikes
who collaborating centre for infectious disease modelling, mrc centre for global infectious disease analysis, imperial college london: short-term forecasts of covid-19 deaths in multiple countries
covid-19 in brazil: "so what?". the lancet
covid-19 and dengue, co-epidemics in ecuador and other countries in latin america: pushing strained health care systems over the edge
the potential impact of interruptions to hiv services: a modelling case study for south africa
vaccination programs for endemic infections: modelling real versus apparent impacts of vaccine and infection characteristics
the influence of incomplete case ascertainment on measures of vaccine efficacy
containing ebola at the source with ring vaccination
modelling the strategies for age specific vaccination scheduling during influenza pandemic outbreaks
benefits and risks of the sanofi-pasteur dengue vaccine: modeling optimal deployment
macroeconomic policy in the time of covid-19: a primer for developing countries
a conceptual framework for analyzing the economic impact of covid-19 and its policy implications. undp lac covid-19
mitigating the covid economic crisis: act fast and do whatever
economic consequences of the covid-19 outbreak: the need for epidemic preparedness
impacts of social and economic factors on the transmission of coronavirus disease 2019 (covid-19) in china
the benefits and costs of using social distancing to flatten the curve for covid-19
the global macroeconomic impacts of covid-19: seven scenarios
special report: the simulations driving the world's response to covid-19
modelling the pandemic
modelling the impact of covid-19 in australia to inform transmission reducing measures and health system preparedness

key: cord-317643-pk8cabxj authors: masud, mehedi; eldin rashed, amr e.; hossain, m. shamim title: convolutional neural network-based models for diagnosis of breast cancer date: 2020-10-09 journal: neural comput appl doi: 10.1007/s00521-020-05394-5 sha: doc_id: 317643 cord_uid: pk8cabxj

breast cancer is the most prevalent cancer in the world, affecting millions of women each year. it is also the cause of the largest number of cancer deaths among women. 
during the last few years, researchers have been proposing different convolutional neural network models to facilitate the diagnostic process of breast cancer. convolutional neural networks are showing promising results in classifying cancers from image datasets. there is still no standard model that can claim to be the best, because of the unavailability of large datasets for model training and validation. hence, researchers are now focusing on leveraging the transfer learning approach, using pre-trained models, trained over millions of different images, as feature extractors. with this motivation, this paper considers eight different fine-tuned pre-trained models to observe how these models classify breast cancers from ultrasound images. we also propose a shallow custom convolutional neural network that outperforms the pre-trained models with respect to different performance metrics. the proposed model shows 100% accuracy and achieves a 1.0 auc score, whereas the best pre-trained model shows 92% accuracy and a 0.972 auc score. in order to avoid bias, the model is trained using the fivefold cross-validation technique. moreover, the model is faster in training than the pre-trained models and requires a small number of trainable parameters. the grad-cam heat map visualization technique also shows how well the proposed model extracts important features to classify breast cancers. breast cancer affects millions of women every year around the world and is the leading cause of cancer deaths among women [1] . the survival rates of breast cancer vary widely across countries. in north america, the rate is greater than 80%; in sweden and japan it is around 60%; while in low-income countries it is below 40% [1] . the main reasons for the low survival rate in low-income countries are the lack of early detection programs and the shortage of diagnostic and healthcare facilities. 
therefore, it is vital to detect breast cancer at an early stage to minimize the mortality rate. mammography and ultrasound images are the common tools used to identify cancers, and their interpretation requires expert radiologists. a manual process may generate high numbers of false positives and false negatives. therefore, nowadays computer aided diagnosis systems (cads) are widely used to aid radiologists during the process of decision making in identifying cancers. cad systems now potentially reduce the efforts of radiologists and minimize the false positive and false negative numbers in diagnosis. machine learning techniques are widely used in traditional computer aided systems for disease diagnosis and patient monitoring [2] . however, the traditional machine learning techniques involve a hand-crafted feature extraction step, which is sometimes very difficult. it also requires domain knowledge and an expert radiologist. meanwhile, deep learning (dl) models develop a learning process adaptively and automatically, and can extract features from the input dataset considering the target output [3, 4] . the dl methods tremendously reduce the exhaustive process of data engineering and feature extraction while enabling the reusability of the methods. numerous studies [5] have been conducted to examine breast cancer images from various perspectives. machine learning (ml), convolutional neural networks (cnns), and deep learning methods are now widely used to classify breast cancers from breast images. cnn models have been effectively used in wide-ranging computer vision fields for years [6, 7] . in recent years, numerous studies have applied cnn-based deep learning architectures for disease diagnosis. a cnn-based image recognition and classification model was perhaps first applied in the imagenet competition [8] . 
since then, cnn-based models have been considered in various applications, for example, image segmentation in medical image processing, feature extraction from images, finding regions of interest, object detection, natural language processing, etc. a cnn has a large number of trainable parameters at its various layers, which are applied to extract important features at various abstraction levels [9] . meanwhile, a cnn model needs a huge dataset to train, and particularly in the medical field such a dataset may not always be possible to obtain. moreover, a cnn model requires high-speed computing resources to train and to tune its hyperparameters. to overcome data unavailability, transfer learning techniques are at present widely applied in the classification of medical images. applying transfer learning techniques, a model can use knowledge from pre-trained models (e.g., vgg16 [10] , alexnet [11] , densenet [12] , etc.) that are trained over a huge dataset to classify images. this lessens the requirement for data linked to the problem being tackled. the pre-trained models are often used as feature extractors, capturing image features from abstract to more detailed levels. transfer learning techniques using pre-trained models have shown promising results in different medical diagnoses, such as chest x-ray image analysis for pneumonia and covid-19 patient identification [13] , retina image analysis for blindness classification, mri image analysis for brain tumor classification, etc. deep learning models leveraging cnns are widely used to classify breast cancers. we now discuss some of the promising studies that have been proposed using cnns. authors in [14] proposed a learning framework leveraging a deep learning architecture that can learn features automatically from mammography images in order to identify cancer. the framework was tested on the bcdr-fm dataset. although they showed improved results, they did not compare against pre-trained models. 
authors in [15] considered alexnet as a feature extractor for mass diagnosis in mammography images. a support vector machine (svm) is applied as a classification model after alexnet generates features. the outcome of the proposed model is better than that of the analytical feature extraction method. in our approach, we consider eight different pre-trained models and show their performance using ultrasound images. authors in [16] considered a transfer learning approach using the googlenet [17] and alexnet pre-trained models and some preprocessing techniques. the model is applied on mammogram images, where cancers are already segmented. the authors claim that the model achieves better performance than human-involved methods. authors in [18] proposed a convolutional neural network leveraging the inception-v3 pre-trained model to classify breast cancer using breast ultrasound images. the model supports extracting multiview features. the model is trained on only 316 images and achieved 0.9468 auc, 0.886 sensitivity, and 0.876 specificity. authors in [19] developed an ensembled cnn model leveraging the vgg19 and resnet152 pre-trained models with fine-tuning. the authors considered a dataset managed by jabts. there are 1536 breast masses, including 897 malignant and 639 benign cases. the model achieves 0.951 auc, 90.9% sensitivity, and 87.0% specificity. authors in [20] developed another ensemble-based computer aided diagnosis (cad) system combining the vggnet, resnet, and densenet pre-trained models. they considered a private database that consists of 1687 images, including 953 benign and 734 malignant cases. the model achieved 91.0% accuracy and a 0.9697 auc score. the model is also tested on the busi dataset, where it achieved 94.62% accuracy and a 0.9711 auc score. 
authors in [21] implemented two different approaches, (1) a cnn and (2) transfer learning, to classify breast cancer by combining two datasets, one containing 780 images and another containing 163 images. the model showed better performance when combining traditional and generative adversarial network augmentation techniques. in the transfer learning approach, the authors compared the performance of four pre-trained models, mainly vgg16, inception [22] , resnet, and nasnet [23] . on the combined dataset, nasnet achieved the highest accuracy of 99%. authors in [24] compared three cnn-based transfer learning models, resnet50, xception, and inceptionv3, and proposed a base model that consists of three convolutional layers to classify breast cancers from a breast ultrasound images dataset. the dataset comprised 2058 images, including 1370 benign and 688 malignant cases. according to their analysis, inceptionv3 showed the best accuracy of 85.13% with an auc score of 0.91. authors in [25] analyzed four pre-trained models, vgg16, vgg19, inceptionv3, and resnet50, on a dataset that consists of 5000 breast images comprising 2500 benign and 2500 malignant cases. the inceptionv3 model achieved the highest auc of 0.905. authors in [26] proposed a cnn model for breast cancer classification considering the local and frequency domain information of histopathological images. the objective is to utilize the important information of images carried by the local and frequency domains, which sometimes yields better model accuracy. the proposed model was applied on the breakhis dataset and obtained 94.94% accuracy. authors in [27] proposed a novel deep neural network consisting of a clustering method and a cnn for breast cancer classification using histopathological images. the model is based on a cnn, a long short-term memory (lstm) network, and a mixture of the cnn and lstm models. 
in the model, both softmax and svm are applied at the classifier layer. the model achieved 91% accuracy. from the above discussion, it is evident that researchers are still searching for a better model to classify breast cancers. in order to overcome the scarcity of datasets, this research combines two publicly available ultrasound image datasets. then, eight different pre-trained models, after fine-tuning, are applied on the combined dataset to observe their breast cancer classification performance. however, the pre-trained models did not show the expected outcome. therefore, we also develop a shallow cnn-based model. the model outperforms all the fine-tuned pre-trained models in all the performance metrics. the proposed model is also faster in training. we also employed different evaluation techniques to demonstrate the better outcome of the proposed model. the details of the study methods, evaluation results, and discussion are presented in the following sections. the paper is organized as follows: sect. 2 discusses the materials and methods used for breast cancer classification. section 3 proposes the custom cnn model. section 4 discusses the evaluation results of the pre-trained models and the proposed custom model. finally, the paper concludes in sect. 5. in this research, we consider two publicly available breast ultrasound image datasets [28, 29] . the two datasets are considered mainly for two reasons: (1) to increase the size of the dataset for training purposes in order to avoid overfitting and bias and (2) to consider three classes (benign, malignant, and normal). combining the datasets will also improve the reliability of the model. the dataset in [28] contains 250 images in two categories: malignant and benign cases. the images vary in size: the minimum and maximum sizes are 57 × 75 and 61 × 199 pixels, with gray and rgb colors, respectively. 
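since the datasets mix gray and rgb images while the model expects a single channel, a conversion step is needed. below is a minimal sketch of luminance-based grayscale conversion; the itu-r bt.601 weights are an assumption for illustration, as the paper does not state which conversion it uses, and the function names are mine:

```python
def to_gray(pixel):
    """convert one (r, g, b) pixel to a single luminance value in [0, 255].
    weights are the standard itu-r bt.601 luma coefficients (assumed here)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b

def image_to_gray(image):
    """apply the per-pixel conversion to a whole image,
    represented as a list of rows of rgb pixels."""
    return [[to_gray(p) for p in row] for row in image]
```

grayscale images in the dataset can be passed through unchanged, since their single channel already matches the model input.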
therefore, all the images are transformed into gray color to fit into the model. the dataset in [29] contains 780 images in three categories: malignant, benign, and normal cases. the average image size is 500 × 500 pixels. the breast ultrasound images were collected from 600 women in 2018, with ages ranging between 25 and 75 years. table 1 shows the class distribution of the images in the two datasets. figure 1 demonstrates examples of ultrasound images of different cases in the two datasets. data normalization is an important pre-processing phase before feeding the data into a model for training. with pre-processing, the data features become easily interpretable by the model. a lack of correct pre-processing makes the model slow in training and unstable. generally, standardization and normalization techniques are used in scaling data. the normalization technique rescales the data values between 0 and 1. since the datasets considered in this research contain both gray and color images, the pixel values lie between 0 and 255. we consider the zero-centering approach, which shifts the distribution of data values in such a way that its mean becomes equal to zero. assume a dataset d that consists of n samples and m features. therefore, d[:, i] denotes the ith feature and d[j, :] denotes sample j. the equation below defines zero-centering for each sample k and feature i: d[k, i] ← d[k, i] - (1/n) Σ_{j=1}^{n} d[j, i]. in this research, we employed k-fold (k = 5) cross-validation on the dataset to overcome the overfitting problem during model training. in the k-fold cross-validation method, k folds of the same size are generated; each fold in turn is used to validate the model, while the remaining k-1 folds are used for model training. this ensures that the model produces reliable accuracy. cross-validation is a widely used mechanism to resample data for evaluating machine learning models when the dataset sample size is small. 
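the zero-centering step described above can be sketched in plain python; the function name and the list-of-lists data layout are mine for illustration:

```python
def zero_center(data):
    """shift each feature column of the dataset so its mean becomes zero.
    data: n samples, each a list of m feature values (e.g. pixel intensities)."""
    n, m = len(data), len(data[0])
    # per-feature means, i.e. (1/n) * sum over samples of d[j, i]
    means = [sum(row[i] for row in data) / n for i in range(m)]
    # subtract each column's mean from every sample
    return [[row[i] - means[i] for i in range(m)] for row in data]
```

after the shift, every feature column sums to zero, which is exactly the property the equation above requires.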
cross-validation is mainly considered to approximate the learning skill of a machine learning model on data which the model has not seen previously. the result obtained using cross-validation is normally a less biased, less optimistic estimate of the model's skill than that of the train/test split method. table 2 shows how the fivefold cross-validation generates five different datasets of ultrasound images from the two datasets. during the last few years, transfer learning algorithms have been widely used in many machine learning research problems; they concentrate on preserving knowledge acquired while solving one problem and applying that knowledge to a different but related problem. for example, an algorithm that is trained to recognize dogs can be applied to recognize horses. authors in [30] formally define transfer learning in terms of domain and task as follows: let an arbitrary domain be d = {x, p(x)}, where x denotes a feature vector {x1, x2, …, xn} and p(x) denotes the probability distribution over x. transfer learning algorithms are often used when only a small dataset is available to train a custom model, but the goal is still to produce an accurate model. a custom model employing transfer learning applies the knowledge of pre-trained models that have been trained over a huge dataset for a long duration. there are mainly two approaches to applying transfer learning: (i) developing a model and (ii) using pre-trained models. the pre-trained model approach is widely used in the deep learning domain. considering the importance of pre-trained models as feature extractors, this research implements eight pre-trained models using the weights of their convolutional layers. these weights act as feature extractors for classifying breast cancers in the ultrasound images. table 3 shows the pre-trained models that are considered in this research. 
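returning to the fivefold cross-validation of table 2, the index split can be sketched in plain python. this is an illustrative sketch, not the paper's implementation; the function name and shuffling seed are mine:

```python
import random

def kfold_splits(n_samples, k=5, seed=0):
    """yield (train_indices, val_indices) pairs: each of the k folds
    serves once as the validation set while the other k-1 folds train."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)          # fixed seed for reproducibility
    folds = [idx[i::k] for i in range(k)]     # k near-equal folds
    for i in range(k):
        val = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, val
```

with the 1030 images of the combined dataset (250 + 780) and k = 5, each fold holds 206 images for validation while 824 are used for training.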
all the models are built on convolutional neural networks and were trained on the imagenet database [31] , which consists of a million images. the models can classify 1000 objects (mouse, keyboard, pencil, and many animals) from different images. therefore, all the models have learned rich feature representations from a large number of images. from table 3 , we see that different models use different input sizes. therefore, the images in the dataset are transformed accordingly before being fed into the models. in the fine-tuning process of the pre-trained models, the final layer is substituted with a classifier that can classify three objects, since the dataset consists of images with three classes (normal, malignant, and benign). hence the models are fine-tuned at the top layers. in the fine-tuning process, the last three layers of the models are substituted with (i) a fully connected layer, (ii) a softmax activation layer, and (iii) a custom classifier. we considered three different optimizers to train the models and to determine which produces the best results. a brief description of the optimizers is given below. stochastic gradient descent with momentum (sgdm) is the fundamental optimizer in neural networks, used for convergence, i.e., moving in the direction of the optimum of the cost function. the following equations are used to update the network parameters: v_t = γ v_{t-1} + l g_t and w_{t+1} = w_t - v_t. here l: initial learning rate, v_t: momentum term (exponential average of gradients), and g_t: gradient at time t along w_j. the adam optimizer associates the heuristics of momentum and rmsprop. its update equations are: v_t = b1 v_{t-1} + (1 - b1) g_t, s_t = b2 s_{t-1} + (1 - b2) g_t², and w_{t+1} = w_t - l v_t / (√s_t + ε). here l: initial learning rate, v_t: exponential average of gradients along w_j, g_t: gradient at time t along w_j, s_t: exponential average of squares of gradients along w_j, and b1, b2 are hyperparameters. 
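a minimal plain-python sketch of a single adam step for one scalar parameter, following the update described above. the bias-correction terms and the default hyperparameter values are the commonly used ones, not taken from the paper, and the function name is mine:

```python
def adam_step(w, g, v, s, t, l=1e-4, b1=0.9, b2=0.999, eps=1e-8):
    """one adam update for a scalar parameter w with gradient g.
    v: exponential average of gradients, s: exponential average of
    squared gradients, t: 1-based step counter (for bias correction)."""
    v = b1 * v + (1 - b1) * g            # first-moment estimate
    s = b2 * s + (1 - b2) * g * g        # second-moment estimate
    v_hat = v / (1 - b1 ** t)            # bias-corrected first moment
    s_hat = s / (1 - b2 ** t)            # bias-corrected second moment
    w = w - l * v_hat / (s_hat ** 0.5 + eps)
    return w, v, s
```

as a sanity check, repeatedly applying this step to the quadratic loss f(w) = w², whose gradient is 2w, drives w towards the minimum at zero.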
the fine-tuned pre-trained models use the softmax activation function to generate a probability in the range 0 to 1 for each class outcome from the input images. using a softmax activation function at the end of a cnn model to convert its outcome scores into a normalized probability distribution is a very well-known practice. the softmax function is defined by the following equation: softmax(z)_i = e^{z_i} / Σ_{j=1}^{k} e^{z_j}, where z is an input vector, z_i are the elements of z, e^{z_i} is the exponential function applied to z_i, and Σ_{j=1}^{k} e^{z_j} is the normalization term.
3 proposed custom model
the model applies batch normalization with 20 channels. it also consists of one max pooling layer and one fully connected layer. dropout regularization is added after the fully connected layer. finally, a softmax activation function is applied, since the model needs to classify three classes. an initial learning rate of 1.0000e-04 and a mini-batch size of 8 are considered during training. the model is trained using the same three optimizers as the pre-trained models. the model is trained and validated using the configuration of table 4 . the performance of the fine-tuned pre-trained models is evaluated with various standard performance metrics: accuracy (acc), area under curve (auc), precision, recall, sensitivity, specificity, and f1-score. a confusion matrix for each model is also generated to observe the scores of true positives (tp), true negatives (tn), false positives (fp), and false negatives (fn) for the normal, malignant, and benign cases. the tp (e.g., malignant) score represents how the model correctly classifies real malignant cases as malignant. the fp (e.g., malignant) score represents how the model wrongly classifies benign cases as malignant. similarly, the tn (e.g., benign) score represents how the model correctly classifies benign cases as benign, and the fn (e.g., benign) score represents how the model wrongly classifies malignant cases as benign. 
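one way to see why such a shallow model trains quickly is to count trainable parameters per layer type. the sketch below uses the standard per-layer formulas; the example kernel and layer sizes in the test are assumptions for illustration, since the paper does not list them here:

```python
def conv2d_params(k_h, k_w, c_in, c_out):
    """trainable parameters of a conv layer: one k_h x k_w x c_in kernel
    plus a bias per output channel."""
    return (k_h * k_w * c_in + 1) * c_out

def dense_params(n_in, n_out):
    """fully connected layer: a weight per input-output pair plus one bias
    per output unit."""
    return (n_in + 1) * n_out

def batchnorm_params(channels):
    """batch normalization learns a scale and a shift per channel."""
    return 2 * channels
```

by comparison, the fully connected layers of large pre-trained models alone contribute millions of parameters, which explains the training-time gap reported later.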
another important metric is precision, which measures the proportion of cases classified as malignant, benign, or normal that truly belong to that class. meanwhile, sensitivity (recall) measures the proportion of true cases of a class (e.g., malignant) that the model correctly identifies. specificity measures the proportion of negative cases (e.g., benign) that the model classifies correctly. the f1-score combines precision and recall into a single score by taking their harmonic mean. the standard formulas are: accuracy = (tp + tn)/(tp + tn + fp + fn), precision = tp/(tp + fp), recall (sensitivity) = tp/(tp + fn), specificity = tn/(tn + fp), and f1-score = 2 · precision · recall/(precision + recall).

(fig. 2: the architecture of the custom model)

the performance scores of the fine-tuned pre-trained models as well as the custom model are reported in the tables; table 6 summarizes the models' best scores under the different evaluation metrics and compares them with the proposed cnn model. figure 3 shows the confusion matrices generated from the different pre-trained models as well as the proposed custom model; only the confusion matrices of the best pre-trained models mentioned in table 6 are shown. from the confusion matrix of the custom model, we observe high scores in all the breast-cancer classifications: for example, the model classifies 100% of the benign class, 100% of the malignant class, and 100% of the normal class using the adam optimizer. these results also outperform those of the pre-trained models. table 7 shows the classification results of the models, and table 8 compares the performance of the custom model and the pre-trained models. the custom model outperforms all the pre-trained models with respect to accuracy, prediction time, and number of parameters. it is also much faster to train than all the fine-tuned pre-trained models, because it has only one fully connected layer and requires a very small number of trainable parameters compared to the other models.
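the metrics above follow directly from the confusion-matrix counts. a minimal sketch, assuming per-class (one-vs-rest) tp/tn/fp/fn counts as inputs:

```python
def metrics_from_counts(tp, tn, fp, fn):
    """standard evaluation metrics computed from confusion-matrix counts,
    matching the formulas in the text (per-class, one-vs-rest)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also called sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "specificity": specificity, "f1": f1}
```

in the three-class setting, each class gets its own set of counts, and the per-class metrics can then be macro-averaged.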
all the models are trained on a gpu (nvidia geforce gtx 1660 ti with max-q design and 6 gb ram) with a mini-batch size of 8. figure 4 shows the execution time and the accuracy score of each model; to obtain accurate timings, we ran the code four times. the area of each marker in fig. 4 indicates the number of parameters in the network, and prediction times are reported relative to the fastest network. from the plot it is quite evident that the custom model is fast in training and produces higher accuracy than the other pre-trained models. figure 5 shows the accuracy and loss values when the custom model is trained and validated; from the graph in fig. 5, it is evident that the custom model achieves the very high accuracy claimed in table 8. the custom model's performance is also evaluated by generating heat-map visualizations with the grad-cam tool [32] to see how the model identifies regions of interest and how well it distinguishes cancer classes. grad-cam is used to judge whether a model attends to the key areas in the images for prediction: it visualizes, as a heatmap for a given class label, the portion of an image the model focuses on. figure 6 shows sample grad-cam outputs for the benign and malignant classes together with the prediction probabilities; from these outputs, we observe that the model focuses precisely on the key areas of the images to classify cancers. this study implemented eight pre-trained cnn models, fine-tuned via transfer learning, to observe breast-cancer classification performance on ultrasound images combined from two different datasets. we evaluated the fine-tuned pre-trained models using the adam, rmsprop, and sgdm optimizers. the highest accuracy, 92.4%, is achieved by resnet50 with the adam optimizer, and the highest auc score, 0.97, is achieved by vgg16.
we also proposed a shallow custom model, since the pre-trained models did not show the expected results, have many convolutional layers, and require a long training phase. the proposed custom model uses only one convolutional layer as feature extractor. it achieved 100% accuracy and an auc value of 1.0; with respect to training time, it is faster than any other model and needs only a small number of trainable parameters. our future plan is to validate the model with other datasets that include new ultrasound images.

references:
- breast cancer survival rates (url truncated in source: html#:*:text=breast%20cancer%20survival%20rates%20vary,et%20al.%2c%202008. accessed)
- cloud-supported cyber-physical localization framework for patients monitoring
- applying deep learning for epilepsy seizure detection and brain mapping visualization
- hybrid deep-learning-based anomaly detection scheme for suspicious flow detection in sdn: a social multimedia perspective
- cervical cancer classification using convolutional neural networks and extreme learning machines
- automatic fruit classification using deep learning for industrial applications
- emotion recognition using secure edge and cloud computing
- imagenet large scale visual recognition challenge
- deep relative attributes
- very deep convolutional networks for large-scale image recognition.
(arxiv preprint)
- imagenet classification with deep convolutional neural networks
- explainable ai and mass surveillance system-based healthcare framework to combat covid-19-like pandemics
- representation learning for mammography mass lesion classification with convolutional neural networks
- digital mammographic tumor classification using transfer learning from deep convolutional neural networks
- improving eeg-based emotion classification using conditional transfer learning
- going deeper with convolutions
- breast cancer classification in automated breast ultrasound using multiview convolutional neural network with transfer learning
- computer-aided diagnosis system for breast ultrasound images using deep learning
- computer-aided diagnosis of breast ultrasound images using ensemble learning from convolutional neural networks
- deep learning approaches for data augmentation and classification of breast masses using ultrasound images
- rethinking the inception architecture for computer vision
- learning transferable architectures for scalable image recognition
- comparison of transferred deep neural networks in ultrasonic breast masses discrimination
- diagnostic efficiency of the breast ultrasound computer-aided prediction model based on convolutional neural network in breast cancer
- histopathological breast-image classification using local and frequency domains by convolutional neural network
- histopathological breast cancer image classification by deep neural network techniques guided by local clustering
- dataset of breast ultrasound images (data in brief)
- intro to optimization in deep learning: momentum
- visual explanations from deep networks via gradient-based localization

publisher's note: springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
funding: not applicable.
availability of data and material: datasets are collected from public repositories [28, 29].
conflicts of interest: not applicable.
code availability: not applicable.

key: cord-319291-6l688krc
authors: hung, chun-min; huang, yueh-min; chang, ming-shi
title: alignment using genetic programming with causal trees for identification of protein functions
date: 2006-09-01
journal: nonlinear anal theory methods appl
doi: 10.1016/j.na.2005.09.048
sha:
doc_id: 319291
cord_uid: 319291-6l688krc

a hybrid evolutionary model is used to propose a hierarchical homology of protein sequences to identify protein functions systematically. the proposed model offers considerable potential, given the inconsistency of existing methods for predicting novel proteins. because some novel proteins may align without meaningful conserved domains, maximizing the sequence-alignment score is not the best criterion for predicting protein functions. this work presents a decision model that minimizes the cost of making a decision when predicting protein functions using the hierarchical homologies. the model has three characteristics: (i) it is a hybrid evolutionary model with multiple fitness functions that uses genetic programming to predict protein functions in a distantly related protein family; (ii) it incorporates modified robust point matching to accurately compare all feature points using the moment invariant and thin-plate spline theorems; and (iii) the hierarchical homologies supporting a novel protein sequence in the form of a causal tree can effectively demonstrate the relationships between proteins. this work describes comparisons of nucleocapsid proteins from the putative polyprotein of the sars virus and other coronaviruses in other hosts using the model. identification of protein function is the main theme of the post-genome era. recently, biomedical scientists have been striving to study proteomics and explore the therapeutic potential of genes for curing disease. thus, a powerful and integrated methodology is required for predicting novel protein functions.
during the past decade, many algorithms applied to computational biology have been used to solve several of the above problems; the most frequently used methods include dynamic programming [1], hidden markov models [2,3], the bayesian theorem, and probabilistic modeling [4-6]. the journal of molecular biology contains an introduction to hidden markov models for biological sequences written by krogh [7]. as a statistical model, the hidden markov model (hmm) is appropriate for many tasks in molecular biology: a profile-like hmm architecture can be used to search sequence databases for new family members and can also produce multiple alignments of protein families. moreover, the latest review by tsakonas and dounias [8] indicates that artificial-intelligence computational methodologies have been widely applied in bioinformatics, for example genetic algorithms [9], genetic programming [10], neural networks [11,12], fuzzy logic [13,14], and some classification and clustering techniques used in data mining [15]. the design of automatic models for comparing protein sequences homologous to a known family or super-family has become increasingly important. one standard approach to this problem uses 'profiles' of position-specific scoring measures estimated from a multiple sequence alignment (msa) [16]. in fact, a potential member of a distantly related protein family cannot be identified among the other members of the same family using only the primary protein structures. consequently, the functions of a new protein, such as the sars virus with its numerous unknown functions, cannot be accurately predicted by training a single learning model. this study applies a hybrid methodology based on genetic programming with a causal tree [4,28,31] model to predict protein function. the causal tree [28,31], with its cause-effect form, is proposed for locally comparing protein sequences across each pairwise alignment of the msa.
the intuition behind this proposal is that comparisons of distantly related sequences may carry more sensitive and accurate information regarding protein function if a segmented alignment provides a more comprehensive depiction of the protein family. additionally, probabilistic scoring measures, similar to profile-to-profile comparisons [17], can be used to achieve one objective of the genetic programming, namely aligning the hierarchical homologies based on profile comparisons. the resulting output is capable of function-related representation for further linkage to protein function. furthermore, the proposed model generates a tree-structured output with a special aligned format rather than a traditional local or global alignment format such as clustalw [23]. traditionally, local alignment based on the smith-waterman algorithm [7] has been used to compare sequences with gaps under various scoring systems. within such scoring systems, many methods are used to evaluate the similarity of two profile columns, for example the ffas method [18] and prof_sim [19]. the ffas method calculates correlation coefficients and thereby scores the 'dot-product' of the amino-acid frequencies in the two columns. meanwhile, the prof_sim method uses a single similarity score combining a divergence score and a significance score of the probability distributions in the two columns. recently, the method compass [17,20] generated the occurrence probability of a target residue in one profile column based on the other column, given the target residue frequencies, to construct a local profile-to-profile alignment. however, the proposed model is designed to detect the potential functions of a novel protein with a hierarchical homology structure.
the hybrid model, namely alignment using genetic programming with causal trees (agct), is a heuristic evolutionary method with three basic components: (i) genetic programming with an inner-exchanged individual strategy, (ii) causal trees [4,28,31] with probabilistic reasoning, and (iii) construction of hierarchical homologies with local block-to-block alignment using moment invariants and robust point matching (rpm) [24]. locally, the standard construction of an alignment requires at least a two-step design: the first step is a scoring evaluation of the similarity between two given positions, while the second step is an alignment extension that considers gap penalties and an extension algorithm. the agct model modifies both steps: the first step changes the compared objects from positions to blocks whose initial extents are based on a particular signal wave, while the second step iteratively adjusts the boundaries by trimming local fragments rather than running an extension algorithm. in blast [21] and its successors [22], effective local extension is used for sequence-to-sequence and sequence-to-profile comparison. similarly, in agct, the local block-to-block comparison, modified from the profile-to-profile alignment described in [17], shifts the leftmost and rightmost boundaries relative to the parent segments when high local alignment scores are achieved. notably, a detailed and refined alignment procedure such as the smith-waterman algorithm [7] is also used in part of the proposed model to confirm a detected meaningful fragment: because a short fragment reduces the computation time and only some individuals in the population have to be compared, the combination of heuristic and exhaustive search models is feasible. the proposed agct model is an evolutionary method based on a probabilistic inference model.
the proposed model also incorporates a modified rpm [24] method and a local sequence alignment into the probabilistic model. meanwhile, the model uses the moment invariant theorem [32] from pattern recognition to extract feature-nodes from the fragmented signal curves. rpm compares feature points to precisely construct a soft match between two point sets. moreover, the modification of rpm changes the original matrix-to-matrix transformation into a matrix-to-tree transformation. to achieve this matrix-to-tree transformation, the proposed constraints, namely rooted one-way winner-take-all (ro-wta), resembling the two-way winner-take-all (wta) constraints [27], are used. ro-wta first selects a unique root node by applying wta to the slack column of the matrix, which accounts for null point correspondences. wta is then applied to every row so that each row produces exactly one corresponding feature-node in the causal tree; finally, at most three nodes, those with the three highest values in the same column, are taken as the branches of a parent node. conversely, the tree-to-matrix transformation simply visits the tree in preorder to derive a set of pairwise points. finally, an equation, eq. (28), is applied to yield a correspondence matrix for optimizing the rpm. a traditional msa method for protein classification compares more than two sequences to provide aligned sequences for the observation of conserved domains. however, for a distantly related protein family, the data provide neither meaningful knowledge of biochemical function nor a consistent outcome across different msa algorithms. clearly, msa comparisons perform worst when applied to distantly related protein family members. in this model, a small protein fragment transformed into a feature-node becomes a compared target point with spatially localized features.
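the ro-wta procedure just described can be sketched as follows. this is a rough illustration under our reading of the text: the matrix layout (rows as nodes, last column as the slack column) and the tie-breaking are assumptions of this sketch, not the paper's specification.

```python
import numpy as np

def ro_wta_tree(m, max_children=3):
    """rooted one-way winner-take-all sketch.
    `m` is an n x (n+1) correspondence matrix whose last column is the
    slack column for null correspondences. wta on the slack column picks
    the unique root; wta on each remaining row picks that node's parent;
    each parent keeps at most `max_children` children, ranked by their
    correspondence values in the parent's column."""
    n = m.shape[0]
    root = int(np.argmax(m[:, -1]))            # wta on the slack column
    parent_of = {}
    for i in range(n):
        if i == root:
            continue
        row = m[i, :-1].astype(float).copy()
        row[i] = -np.inf                       # forbid self-correspondence
        parent_of[i] = int(np.argmax(row))     # wta on the row
    children = {}
    for p in set(parent_of.values()):
        cand = sorted(((m[c, p], c) for c, pp in parent_of.items() if pp == p),
                      reverse=True)
        children[p] = [c for _, c in cand[:max_children]]
    return root, children
```

the branch limit implements requirement-style pruning: only the highest-valued children survive, which keeps the causal tree narrow.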
individual feature-nodes, each simplified to six moment invariants [32], are matched with each other for the signal registration of 2-d curves [25,30]. the 2-d curve is produced by a block-to-block method that transforms a fragment of protein residues into a signal curve by iteratively accumulating a block of amino-acid properties; here, the signal curves are plotted by accumulating hydrophobicity along the protein sequence. because the problem has been transformed into one of comparing feature-nodes (points), the comparison of spatially localized features for protein function can be treated as an optimization problem of point matching for 2-d shapes. this optimization problem is difficult, especially because the mapping must also account for rigid and non-rigid deformations. a hybrid technique combining genetic programming and the causal tree is well suited to such problems. genetic programming emerged from evolution-based systems [26,40-42], and the causal tree derives from probabilistic reasoning in intelligent systems [4]. since genetic programming can avoid local minima of an energy function, the model is appropriate for recognizing potential patterns that are never found by direct comparison. this model uses a popular technique for synthesizing a special signal indicating specific protein functions by accumulating the properties of a group of protein residues. comparing these special signals with each other provides a wider window onto the world of the protein. essentially, the signals must be decomposed for comparison using a method whose choice of waveform does not destroy the protein signals; in the implementation, wavelet reconstruction of signals [29] is used to obtain a noiseless skeleton of the signal curve.
next, the reconstructed curve is differentiated with respect to the position of each protein residue, and the derivative is set to zero; this easily identifies positions where the accumulated property attains a local minimum. subsequently, to prevent the generation of fragments of fewer than 40 residues, the method discards some of the neighboring positions around the minima. all of the fragmented signals then provide the initial dataset for the evolutionary computation. although the local minima cannot determine the true boundaries of the signals at this first step, the subsequent local alignment of the fragmented proteins using the smith-waterman algorithm [7] iteratively regularizes the boundaries until suboptimal boundaries are found. the model must then extract signal features from the fragmented input signals after decomposing the original signals. technically, vector features can be associated with the signal features; however, the vector value representing a signal must remain invariant when physical linear transforms deform the signal curve (shape), such as translation, rescaling, stretching, and rotation. once the problem is cast in the image- and signal-processing domains, the rpm method is easily applied to protein-function identification. conceptually, the proposed model includes two stages of feature matching: the first stage obtains the initial signal features for signal registration [25,30], and in the second stage an inner-exchanged genetic programming repeatedly refines the rpm parameters. this work utilizes the theorem of two-dimensional moment invariants [32] for planar geometric figures, derived from the theorem of moment spaces [33], to extract features of the protein fragments. accumulating the hydrophobicity of the surrounding residues at each position of the protein enables the protein fragments to generate signal curves.
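a minimal sketch of this signal-construction idea: accumulate a residue property over a sliding block and fragment the sequence at local minima of the resulting curve. the hydrophobicity values below are a small subset of the kyte-doolittle scale used only for illustration; the paper's wavelet denoising and 40-residue constraint are omitted from this sketch.

```python
# illustrative subset of the kyte-doolittle hydropathy scale
KD = {"a": 1.8, "r": -4.5, "l": 3.8, "g": -0.4, "k": -3.9,
      "f": 2.8, "s": -0.8, "v": 4.2, "d": -3.5, "i": 4.5}

def hydropathy_curve(seq, window=3):
    """accumulate hydrophobicity over a sliding block centered at each residue."""
    half = window // 2
    return [sum(KD.get(c, 0.0) for c in seq[max(0, i - half): i + half + 1])
            for i in range(len(seq))]

def local_minima(curve):
    """candidate fragment boundaries: strict local minima of the curve."""
    return [i for i in range(1, len(curve) - 1)
            if curve[i] < curve[i - 1] and curve[i] < curve[i + 1]]
```

for a hydrophobic-hydrophilic-hydrophobic sequence, the single minimum in the middle becomes the cut point between two fragments.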
next, the signal curve is further reduced to a fixed-length vector, denoting a feature-node, using moment invariants for node matching. one of the energy functions in the model then adopts the energy function of a correspondence algorithm for non-rigid mapping [24] derived from the tps theorem [34]. finally, genetic programming with the inner-exchanged subpopulation strategy refines the comparison results. the theorem of moment invariants, extensively used in computer science, was first proposed in [32] for recognizing visual patterns. in the visual field, an essential step is to extract patterns from the original objects for recognition, and the extracted patterns must be independent of position, size, and orientation. likewise, the model uses the moment invariants of the signal curves to substantially reduce the computational complexity. because of the flexibility of the moment invariants used for recognizing protein function, the model can find more conserved domains, and these domains should attain more biological meaning through this moment-invariant based comparison. the following subsections detail how the model obtains a feature vector, denoting a feature-node in a causal tree, by using the moment invariant theorem for signal recognition. riemann integrals [32] define the two-dimensional (k + l)th-order moments of a density distribution f(x, y) in standard form as (eq. (1)) ṁ_{k,l} = ∫∫ x^k y^l f(x, y) dx dy, where f(x, y) is a piecewise continuous bounded function that is nonzero only on a finite part of the xy plane. conversely, f(x, y) uniquely determines the double moment sequence {ṁ_{k,l}} of moments of every order. by direct integration, eq. (2) expresses the central moments µ_{k,l} derived from the ordinary moments: µ_{k,l} = ∫∫ (x − x̄)^k (y − ȳ)^l f(x, y) dx dy, where x̄ = ṁ_{1,0}/ṁ_{0,0} and ȳ = ṁ_{0,1}/ṁ_{0,0} are the expected values of x and y, respectively. the point (x̄, ȳ) is termed the center of gravity, or centroid.
for example, the first four orders satisfy µ_{0,0} = ṁ_{0,0} ≡ µ, µ_{1,0} = µ_{0,1} = 0, µ_{2,0} = ṁ_{2,0} − µx̄², µ_{1,1} = ṁ_{1,1} − µx̄ȳ, µ_{0,2} = ṁ_{0,2} − µȳ², µ_{3,0} = ṁ_{3,0} − 3ṁ_{2,0}x̄ + 2µx̄³, µ_{2,1} = ṁ_{2,1} − ṁ_{2,0}ȳ − 2ṁ_{1,1}x̄ + 2µx̄²ȳ, µ_{1,2} = ṁ_{1,2} − ṁ_{0,2}x̄ − 2ṁ_{1,1}ȳ + 2µx̄ȳ², µ_{0,3} = ṁ_{0,3} − 3ṁ_{0,2}ȳ + 2µȳ³. in a similar mathematical sense, for a two-valued image b(m, n), the (k + l)th-order moment of eq. (1) is approximated by the sum of eq. (4), ṁ_{k,l} ≈ Σ_m Σ_n m^k n^l b(m, n), and the corresponding central moment is expressed in eq. (5) as µ_{k,l} ≈ Σ_m Σ_n (m − m̄)^k (n − n̄)^l b(m, n), where ṁ_{0,0} measures the area and m̄, n̄ are the centroid coordinates along the image length and width, respectively. a pattern of the two-valued image that must be independent of position, size, and rotation is derived from similitude moment invariants. likewise, the function b(m, n) with respect to a pair of fixed axes can represent the signal curve of a protein sequence, and the two-dimensional central moments µ_{k,l} express image patterns. additionally, many properties of the second central moments are equivalent to those of a covariance matrix in probability theory; the covariance matrix of the second central moments is expressed in eq. (6) as u = [µ_{2,0}, µ_{1,1}; µ_{1,1}, µ_{0,2}]. by diagonalization, eq. (7) gives u = e λ e^t, where e = [e_{1,1}, e_{1,2}; e_{2,1}, e_{2,2}], a column vector of e is an eigenvector of u, and the eigenvalues of u determine λ = diag(λ_1, λ_2). the values λ_max and λ_min corresponding to (λ_1, λ_2) and (λ_2, λ_1) are given, in standard form, by eqs. (8) and (9): λ_max,min = (µ_{2,0} + µ_{0,2})/2 ± √(4µ_{1,1}² + (µ_{2,0} − µ_{0,2})²)/2. moreover, eq. (10) gives the direction angle of the image region, θ = (1/2) arctan(2µ_{1,1}/(µ_{2,0} − µ_{0,2})). from these theoretical derivations, eqs. (1)-(10) easily formularize some basic properties of the pattern. notably, an added restriction such as µ_{2,0} > µ_{0,2} determines a unique angle θ. clearly, the discriminative power of the patterns increases when higher-order moments are used.
the higher-order moments with respect to the principal axes can also be easily determined using the method of principal axes described in [32]. because µ_{0,0} = ṁ_{0,0} measures the area, a proportionality factor √α should be used to further normalize the central moments of eq. (5), dividing the variables of the function of eq. (2) by this factor so that eq. (2) takes the form f(x/√α, y/√α). the normalized central moment then equals the original central moment of f(x, y) divided by √α^{k+l+2}; since µ_{0,0} measures the area, α = µ_{0,0}. eq. (11) thus defines the normalized central moment as ν_{k,l} = µ_{k,l} / µ_{0,0}^{(k+l+2)/2}. an identification of the pattern independent of position, size, and orientation can then be formularized via the moment invariants of any signal curve using eq. (11). to improve pattern-recognition capability, absolute moment invariants are obtained by combining multiple moment invariants. the normalized central moments ν_{k,l} of second and third order generate the six absolute and orthogonal moment invariants of eq. (12), which in standard form read: ω_1 = ν_{2,0} + ν_{0,2}; ω_2 = (ν_{2,0} − ν_{0,2})² + 4ν_{1,1}²; ω_3 = (ν_{3,0} − 3ν_{1,2})² + (3ν_{2,1} − ν_{0,3})²; ω_4 = (ν_{3,0} + ν_{1,2})² + (ν_{2,1} + ν_{0,3})²; ω_5 = (ν_{3,0} − 3ν_{1,2})(ν_{3,0} + ν_{1,2})[(ν_{3,0} + ν_{1,2})² − 3(ν_{2,1} + ν_{0,3})²] + (3ν_{2,1} − ν_{0,3})(ν_{2,1} + ν_{0,3})[3(ν_{3,0} + ν_{1,2})² − (ν_{2,1} + ν_{0,3})²]; ω_6 = (ν_{2,0} − ν_{0,2})[(ν_{3,0} + ν_{1,2})² − (ν_{2,1} + ν_{0,3})²] + 4ν_{1,1}(ν_{3,0} + ν_{1,2})(ν_{2,1} + ν_{0,3}). while the skewed invariant listed in the original study [32] is useful for distinguishing mirror images, the model discards it, since a skew-orthogonal invariant is unnecessary for signal recognition here. the six moment invariants of eq. (12) are not only independent of position, size, and orientation, but can also represent a feature-node for a fragmented protein sequence. finally, the six-valued vector ω = (ω_1, ω_2, ω_3, ω_4, ω_5, ω_6)^t denotes a feature-node for further convenient processing. suppose that a feature-node ω_a = (ω_{a1}, ω_{a2}, ω_{a3}, ω_{a4}, ω_{a5}, ω_{a6}) represents a signal curve from a fragmented protein corresponding to one point in 2-d space, and assume a set of vectors ψ = {ω_a | a = 1, 2, ..., k} and another set x = {x_a | a = 1, 2, ..., n}.
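the pipeline from raw moments to the six invariants can be sketched as follows; these are hu's six moment invariants in their standard form, computed here over a 2-d array standing in for the density f(x, y) (a sketch of the technique, not the paper's implementation).

```python
import numpy as np

def hu_six(img):
    """six absolute moment invariants (hu's phi_1..phi_6; the seventh,
    skew invariant is discarded, as in the text)."""
    m, n = img.shape
    ys, xs = np.mgrid[0:m, 0:n].astype(float)
    m00 = img.sum()
    xbar, ybar = (xs * img).sum() / m00, (ys * img).sum() / m00

    def nu(k, l):  # normalized central moment, eq. (11)
        mu = (((xs - xbar) ** k) * ((ys - ybar) ** l) * img).sum()
        return mu / m00 ** ((k + l + 2) / 2.0)

    n20, n02, n11 = nu(2, 0), nu(0, 2), nu(1, 1)
    n30, n03, n21, n12 = nu(3, 0), nu(0, 3), nu(2, 1), nu(1, 2)
    w1 = n20 + n02
    w2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    w3 = (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2
    w4 = (n30 + n12) ** 2 + (n21 + n03) ** 2
    w5 = ((n30 - 3 * n12) * (n30 + n12)
          * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
          + (3 * n21 - n03) * (n21 + n03)
          * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2))
    w6 = ((n20 - n02) * ((n30 + n12) ** 2 - (n21 + n03) ** 2)
          + 4 * n11 * (n30 + n12) * (n21 + n03))
    return np.array([w1, w2, w3, w4, w5, w6])
```

because central moments subtract the centroid and eq. (11) normalizes by the area, the vector is unchanged when the pattern is translated, which is exactly the property the feature-nodes rely on.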
since the vector ω_a with six variables would increase the problem complexity, a simple linear transform is used to obtain a scalar, so that the point-matching problem can be solved with a tps of two variables. the fragmented sequence-mapping problem is thereby transformed into a new problem: given the two feature-node sets x and ψ, fit a mapping function f(ω_a) using tps while minimizing the tps energy function of eq. (13), which in standard form is e_tps(f) = Σ_a ||x_a − f(ω_a)||² + λ ∫∫ [(∂²f/∂x²)² + 2(∂²f/∂x∂y)² + (∂²f/∂y²)²] dx dy. typically, a compared object admits two types of deformation, rigid (affine) and non-rigid (non-affine): rigid deformation includes parallel translation and rotation, while non-rigid deformation includes shrinkage, dilation, and distortion. minimizing the first error-measurement term of eq. (13) maps the feature-node set x as closely as possible to the feature-node set ψ. generally, an infinite number of mappings f can minimize the first term, since the mapping is non-rigid; for a fixed value of λ, however, the energy function of eq. (13) is uniquely minimized by a function f specified by two parameter matrices ã and w̃ in eq. (14), where ã represents a (d + 1) × (d + 1) affine matrix and the remaining term, built from the tps kernel with c being a constant for any ω [34], is related to the tps kernel. the kernel encodes the internal structural relationships of the feature-node set, and a non-rigid warp comes into play when the warping-coefficient matrix w̃ is considered in eq. (14). the second term of eq. (13), which is responsible for regularizing the mapping function between x and ψ, is essentially a smoothness constraint. the lagrange parameter λ in eq. (13) regularizes the warping of the feature-node matches: precise matches appear as λ approaches zero. when the solution for f in eq. (14) is substituted into the tps energy function of eq. (13), and the mercer-hilbert-schmidt theorems of riesz and sz.-nagy [35] are used to prove eq. (15), eq.
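the affine-plus-warp decomposition of eq. (14) can be illustrated with the textbook 2-d thin-plate-spline fit below, using the standard kernel u(r) = r² log r². this is a generic sketch of the tps construction under common conventions (exact interpolation for λ = 0, ridge-style regularization otherwise), not the paper's exact implementation.

```python
import numpy as np

def tps_fit(src, dst, lam=0.0):
    """fit a 2-d tps mapping f with affine part A and warping part W:
    f(x) = [1, x, y] @ A + U(x) @ W, where U holds kernel values
    u(r) = r^2 log r^2 against the source points."""
    n = src.shape[0]
    d2 = ((src[:, None, :] - src[None, :, :]) ** 2).sum(-1)
    K = np.where(d2 > 0, d2 * np.log(d2 + 1e-300), 0.0)   # u(r) = r^2 log r^2
    P = np.hstack([np.ones((n, 1)), src])                 # affine basis [1, x, y]
    L = np.zeros((n + 3, n + 3))
    L[:n, :n] = K + lam * np.eye(n)                       # regularized kernel block
    L[:n, n:] = P
    L[n:, :n] = P.T
    rhs = np.vstack([dst, np.zeros((3, 2))])
    sol = np.linalg.solve(L, rhs)
    W, A = sol[:n], sol[n:]                               # warping / affine coefficients

    def f(pts):
        dd = ((pts[:, None, :] - src[None, :, :]) ** 2).sum(-1)
        U = np.where(dd > 0, dd * np.log(dd + 1e-300), 0.0)
        return U @ W + np.hstack([np.ones((len(pts), 1)), pts]) @ A
    return f
```

with λ = 0 the fitted spline interpolates the correspondences exactly, which matches the remark in the text that precise matches appear as λ approaches zero.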
(16) is obtained, in the standard form e_tps(ã, w̃) = ||x − ψã − φw̃||² + λ trace(w̃^t φ w̃), where λ̃_1 ≥ λ̃_2 ≥ ... ≥ λ̃_k ≥ 0 denote the eigenvalues corresponding to the continuous eigenfunctions φ_1, φ_2, ..., φ_k, w̃^t is the transpose of w̃, and x and ψ are merely concatenated versions of the node vectors, each row of the matrix deriving from one original vector. in the model, an energy function with the affine (ã) and warping (w̃) parameters, derived from eqs. (13)-(16), is used as a fitness function within the genetic programming. furthermore, the local border between two regions of protein fragments should be considered an important feature for understanding protein functions. a distinctive characteristic of the tps is that a geometrical transformation can always be decomposed into a global affine transformation and a local non-affine warping component. owing to this characteristic, a functional signal can generally be detected for the same essential protein family even under varying environment parameters, such as the settings used to form the primary signal of the sequences (for example the fragment length in relation to the signal response). alternatively, rotation, translation, and global shear of the signals should only slightly influence the detection of the functional signals, by virtue of the geometrical characteristics (pattern) of the features. consequently, the smoothness term w̃ in eq. (16) is specifically taken as the warping component, and this term is penalized when the signal deformation does not respect the signals' essential properties. visually, two compared objects with similar patterns increase the influence of the tps in interpolating a fitting spline; correspondingly, similar protein fragments compared with one another can be identified by tps matching. the proposed model uses tps methods to compare two feature-node sets under multiple optimization problems. the tps method raises three optimization issues.
traditionally, the main issues are the optimization of the mapping and of the correspondence. generally, the correspondence step must establish a one-to-one relationship between two feature-node sets, and the mapping step must fit the two feature-node sets to an optimal function f. this model, however, defines a new issue: the problem must simultaneously optimize the hierarchical heaping structure by constructing a hierarchical relationship among feature-nodes. thus, the correspondence step in the model must achieve a hierarchical relationship of homologies rather than a one-to-one relationship, and the model must also optimize the heaping. heaping optimization should meet the following minimum requirements: (i) only a unique root node is generated; (ii) a parent node can be compared with multiple child nodes; (iii) under the limit on the number of branches, the model must choose only the first several child nodes, ranked by homology likelihood, to attach to a parent node; and (iv) the model must place the predicted feature-nodes and the predicting feature-nodes at the external and internal nodes of the causal tree, respectively. these requirements describe the one-to-many query strategy of the homology-search model; more precisely, heaping also sorts the child nodes in descending order of homology probability. nevertheless, the computation becomes considerably harder when the hierarchical correspondence, the non-rigid mapping, and the heaping must be optimized simultaneously. this subsection formally defines a correspondence matrix for a causal tree via the one-to-many query strategy. assume that a sequence {q_1} with unknown functions is used as a query sequence and compared with the fragmented member sequences {q_x}, x = 2, ..., z, to construct an optimal hierarchical structure. the optimal structure should comprise the maximum of homology relationships among the protein fragments of both query and members.
moreover, z denotes the number of input sequences classified to the same distantly related protein family. furthermore, let m denote a correspondence matrix representing the correspondence between two node-sets in a tree. assume the matrix contains (n + 1) × (k + 1) elements: where each m ia is an element of the matrix, with a value that is a real number between zero and one. one query sequence q 1 is divided into s 1 fragments. the summation of m ia in eq. (17) over each row in m must equal one, so that a node of the same label appears in the tree at most once. the wta method and the sinkhorn balancing theorem [36] can be used to normalize m ia to satisfy the row and column constraints, as in refs. [24, 27, 36] . however, this model uses only the row constraint, modifying wta into ro-wta. alternatively, the correspondence matrix for a causal tree must satisfy the constraint n+1 a=s 1 +1,i =a m ia = 1 in eq. (17) to implement a one-to-many relationship. simultaneously, a constraint n+1 a=s 1 +1 m aa = 0 is used to prevent self-correspondence; a self-correspondence, which appears when a feature-node points to itself, is invalid. furthermore, the constraint n i=s 1 +1 m i(n+1) = 1, together with m i(n+1) = 0 for all i ≤ s 1 , determines the node of the unique root. from the definition of the correspondence matrix in eq. (17), let: where k represents the number of parent nodes, numbered from one to k; the alternative representation begins at number s 1 + 1 and ends at number n. in fact, the two numbering representations are identical and are kept for convenience of representation and implementation. eqs. (18)-(20) then define the sets of parent and children nodes in a tree. where j and (l x , l y ) denote coordinates in one and two dimensions, respectively. moreover, the variable s p denotes the number of fragments in each protein sequence. next, eqs.
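the row-only constraint handling can be sketched as follows; this is an illustrative reading of ro-wta (one winner per row, zero diagonal to forbid self-correspondence), and the matrix values are hypothetical.

```python
import numpy as np

def ro_wta(m):
    """row-only winner-take-all on a correspondence matrix:
    zero the diagonal (no self-correspondence), then keep only the
    largest entry in each row, so each row sums to one."""
    m = m.copy()
    np.fill_diagonal(m, 0.0)           # forbid self-correspondence
    out = np.zeros_like(m)
    winners = np.argmax(m, axis=1)     # winning column for each row
    out[np.arange(len(m)), winners] = 1.0
    return out

# hypothetical correspondence values
m = np.array([[0.9, 0.4, 0.3],
              [0.2, 0.8, 0.7],
              [0.1, 0.6, 0.5]])
```

note how the large diagonal entries (e.g., 0.9 and 0.8) are ignored: after zeroing the diagonal, the row winners move to off-diagonal columns, satisfying the self-correspondence constraint while keeping each row summing to one.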
(21)-(25) further define a causal tree, denoted ct, using the correspondence matrix m. a set of causal trees, similar to that described in [35] , can be expressed in terms of i as follows: where denotes the set of causal trees i . a causal tree includes the components r i , i i and e i , which denote a root node, internal node-set, and external node-set, respectively. the causal tree is a tree network constructed from labeled binary random variables, as illustrated in fig. 1. fig. 2 shows an example of internal and external node assignment. of nine fragments, the five corresponding fragments of internal nodes 5, 6, 7, 8, and 9 represent a set of fragmented members from each member sequence {q x } z x=2 . the external nodes numbered 1, 2, 3, and 4 in fig. 1 are for a query sequence {q 1 }, in accordance with the query fragments of fig. 2 in the numbering representation. in fact, the internal nodes can be used for probabilistic reasoning [4] to predict protein function. the reasoning process infers along a cost path in the causal tree by assessing, in order, the conditional probabilities combined from the root-node to an external feature-node. each internal feature-node ω ∈ i indicates a given protein fragment of different properties. likewise, each external feature-node ω ∈ e indicates an unknown protein fragment function. for heaping the causal tree, the constraints h 0 , h 1 and h 2 are used to allocate a correct location for feature-nodes in the causal tree. the constraint h 0 in eq. (22) defines a root-node r by applying the wta method to column (n + 1) of the correspondence matrix m; alternatively, a feature-node is a root-node if and only if no parent node points to it. additionally, eqs. (23) and (24) define the correct locations of internal and external nodes, respectively. the directed node-pairs (a, i) and (a, a') represent a direction from a parent node to a child node. for clarity, fig.
3 illustrates an example of the correspondence matrix for a causal tree. the proposed ro-wta selects the maximum value from the real numbers within the parentheses for each row and for column n + 1 only. besides ro-wta, the diagonal elements must all be zero to eliminate self-correspondence. eventually, eq. (25) defines a causal tree with the entities of a correspondence matrix ct as follows: where ∩ and ∪ denote set intersection and union, respectively. because the causal tree ct may heap some feature-nodes into a cause-effect form, a set of given causes can also be used to infer an unknown effect. the model has thus transformed the tps correspondence of two point-sets into an optimization over the causal tree. next, the model must control the tree growth by adding the term in eq. (26) to the resulting energy function. the value n in eq. (26) denotes the total number of all input fragments, i.e., the tree size. by adding eq. (26) to the energy function, the model can attain an approximated tree in accordance with the correspondence matrix. where i denotes the labeled identification of a feature-node; if this feature-node is added to a causal tree j , the value n i returns 1, and otherwise it returns 0. finally, eq. (27) combines eqs. (16), (17) and (26) to create the final energy function, as follows: by minimizing eq. (27), the model can use genetic programming to obtain a globally suboptimal solution with respect to mapping, correspondence, and heaping. the first term in eq. (27) is an error-measure term for matching the similarity between two feature-nodes. meanwhile, the second term in eq. (27) guards against rejecting all matches, and a controlling parameter ζ determines the degree of influence of this term. next, the third term of eq. (27) , which acts as a barrier function, is an entropy term with an annealing temperature t from statistical physics.
by controlling the entropy barrier function, the distribution of the random variables m ia in ct gradually stabilizes. alternatively, the model can generate the minimum number of states required to push the objective minimum away from the discrete points, such that some rules emerge from m ia . a gradually reduced temperature t can suitably control thermal fluctuation to yield an optimal solution. regarding related applications, the use of this entropy term in statistical physics is briefly described in [37] [38] [39] . subsequently, the fourth term in eq. (27) is used to penalize the differences between an affine mapping parameter ã and an identity matrix i ; penalizing these differences discourages unphysical reflections that result from flipping the whole plane. the fifth term in eq. (27) , a standard tps regularization term, penalizes the local warping coefficients w [24] . using an approximated least-squares approach and a qr-decomposition technique, the model can solve for the parameters (ã, w ) given a fixed ct. additionally, two parameters λ 1 and λ 2 are used to control the influence of the fourth and fifth terms, respectively. finally, keeping the matrix consistent with the tree during the evolutionary process of genetic programming is very difficult; thus, the final term in eq. (27) enforces this consistency, where m ia must conform to ro-wta. currently, a parent node can connect at most three children nodes in this model; consequently, the summation of all elements in each column must not exceed three. when the model attains an optimal causal tree, probability theory is further used to build a training model for identifying protein function. given a protein family, the model can be queried with an unknown protein sequence to obtain answers about the interrelationship among the particular protein fragments. unfortunately, the calculation of a probability distribution in bayes networks is, in general, computationally intractable (np-complete).
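the role of the annealing temperature t in the entropy barrier term can be illustrated with a temperature-controlled soft assignment: at high t the assignment stays spread out, and as t decreases it approaches the discrete winner-take-all limit. the match scores below are hypothetical.

```python
import math

def soft_assign(scores, t):
    """temperature-controlled soft assignment (entropy-regularized row)."""
    exps = [math.exp(s / t) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def entropy(p):
    """shannon entropy of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)

scores = [1.0, 0.5, 0.1]           # hypothetical match scores for one row
hot = soft_assign(scores, 0.05)    # low temperature: nearly one-hot
warm = soft_assign(scores, 10.0)   # high temperature: nearly uniform
```

as the schedule lowers t, the entropy of the row distribution shrinks, which is the sense in which the entropy term lets "some rules emerge" from m ia .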
however, a causal tree of fixed structure can compute the probabilistic distribution efficiently [31] . using basic probability theory, assume that p(x i , x a ) denotes the joint probability distribution of two random variables x i and x a , and that p(x i | x a ) denotes the conditional probability of x i given x a ; this conditional probability is defined as follows: using the bayes inference described in [4] , eqs. (30) and (31) are easily derived from eq. (29) , where the variables separated by x a are conditionally independent given x a . subsequently, a well-defined bayes model can be obtained by generalizing beyond simple markov models and considering the probability distribution in the form of general trees. by the expansions of eqs. (30) and (31) , a probability distribution can be applied to learning and inference for biological models when the tree network has a specific predetermined structure. during a learning phase, the variables within internal nodes and external nodes represent a set of member states and query states, respectively; these variables are both observable during the learning phase. instead, these variables are hidden during an inference phase, when the structure and probability distributions of a causal tree are temporarily fixed. this process fully corresponds to the operation of hidden states in a hidden markov model. iteratively, the model eventually produces a refined prediction result based on fitness-function estimation. recalling the example of a causal tree, fig. 1 defines nine binary random variables on the tree. this case produces a joint probability distribution of the nine variables as a product of the following terms: where each internal node x i separates the variables above and below it. for example, eq. (31) can be used to derive the conditional probability gained by the node x 7 . initially, the root variable x 7 in eq. (32) generates a value using the result from eq. (28) .
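the product-of-conditionals factorization over a causal tree can be sketched as follows; the tree shape and probability values are hypothetical and are not taken from fig. 1.

```python
def joint_probability(root, prior, children, cond, values):
    """joint probability of binary variables on a causal tree:
    p(root) times the product of p(child | parent) along every edge."""
    p = prior[values[root]]
    stack = [root]
    while stack:
        parent = stack.pop()
        for child in children.get(parent, []):
            # cond is keyed by (child, child_value, parent_value)
            p *= cond[(child, values[child], values[parent])]
            stack.append(child)
    return p

# hypothetical tree: node 7 is the root with children 5 and 6
children = {7: [5, 6]}
prior = {0: 0.4, 1: 0.6}                  # p(x7)
cond = {(5, 0, 1): 0.3, (5, 1, 1): 0.7,   # p(x5 | x7)
        (6, 0, 1): 0.8, (6, 1, 1): 0.2,   # p(x6 | x7)
        (5, 0, 0): 0.5, (5, 1, 0): 0.5,
        (6, 0, 0): 0.5, (6, 1, 0): 0.5}
```

for the assignment x7 = 1, x5 = 1, x6 = 0, the joint probability is p(x7 = 1) · p(x5 = 1 | x7 = 1) · p(x6 = 0 | x7 = 1) = 0.6 · 0.7 · 0.8.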
given x 7 , the probabilistic model can determine the conditional probabilities p(x 6 | x 7 ), p(x 9 | x 7 ), and p(x 5 | x 7 ). the model can then generate the values of x 6 , x 9 , and x 5 , respectively, and thus calculate the remaining variables recursively. finally, the model obtains an estimated value as one of the multiple fitness functions described in the following section. by rearranging and observing the terms in eq. (32) , the ordered product of terms is also very easily implemented by a preorder tree search. besides the fitness functions of the probabilistic and rpm models mentioned above, the model needs another fitness function to determine the boundaries of protein fragments. eq. (33) shows the normalized measure: a local alignment score α, obtained by the smith-waterman algorithm [7] , divided by the sequence length. where δ ia denotes the local alignment score and ρ ia represents the sequence length. furthermore, eqs. (34) and (35) are used to determine values of the parameters ζ and λ 2 for adjusting the reasonable parameters in eq. (27). since the temperature t in eq. (27) is gradually decreased during the evolution, the m ia values rapidly decrease accordingly. from eq. (28), it is best for m ia to be in the interval [0.0, 1.0] to protect it from an unbounded drop in value that cannot be computed given reasonable computation-capacity limitations. thus, eq. (34) can be easily obtained from the inequality and the normal distribution. where e 1 is the natural logarithm base, which provides a rough scope for estimating the resolution of 256 states that sufficiently fits this computing issue. moreover, µ and σ represent the mean and standard deviation of all feature data, respectively. furthermore, hall and titterington [47] proposed a solution for estimating the smoothing parameter λ 2 by eq. (35) , where f(λ 2 ) denotes an influence matrix, i represents an identity matrix, and σ is the standard deviation of all feature data.
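a length-normalized local alignment score in the spirit of eq. (33) can be sketched with a basic smith-waterman recurrence. the scoring scheme (match +2, mismatch −1, gap −1) and the use of the longer sequence length as the normalizer are illustrative assumptions, not the values used in the original work.

```python
def smith_waterman(s1, s2, match=2, mismatch=-1, gap=-1):
    """best local alignment score between s1 and s2 (linear gap penalty)."""
    rows, cols = len(s1) + 1, len(s2) + 1
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if s1[i - 1] == s2[j - 1] else mismatch
            # local alignment: scores are clamped at zero
            h[i][j] = max(0, h[i - 1][j - 1] + sub,
                          h[i - 1][j] + gap, h[i][j - 1] + gap)
            best = max(best, h[i][j])
    return best

def normalized_score(s1, s2):
    """alignment score divided by sequence length (here: the longer one)."""
    return smith_waterman(s1, s2) / max(len(s1), len(s2))
```

dividing by length makes scores comparable across fragments of different sizes, which is the stated purpose of the normalization in eq. (33).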
a qr-decomposition method can be used to determine the value of (i − f(λ 2 )); the method decomposes the feature-node data ψ = (q 1 : q 2 )( r 0 ) into the product of an orthonormal matrix and an upper triangular matrix. finally, a process of iteratively evaluating eq. (35) is used to find the optimal value of λ 2 with respect to minimizing the error in eq. (35). this section focuses on an implementation design of genetic programming that evolves three related subpopulations with inner-exchanged individuals. in 1992, koza et al. were early pioneers of the concept of genetic programming [26, [40] [41] [42] , which was extended from the genetic algorithm. genetic programming can automatically create programs for solving general problems. genetic programming is a grammar-based evolutionary methodology built on the darwinian survival principle; therefore, it can describe highly complicated models for any domain problem even without extensive domain knowledge. first, problem solving requires randomly generating an initial population of individuals. subsequently, a fitness function can be used to evaluate the individuals in the population and to select excellent individuals for specific genetic operations. the genetic operations typically include crossover, replication, and mutation. these operations enable the parent population at a given generation to form a new offspring population at the next generation. iteratively, genetic programming continues these evolutionary steps until an optimal individual is obtained or certain conditions are satisfied. a genetic programming representation of a domain problem is composed of several functional programs, which form a rooted tree structure as a solution for solving the problem. because genetic programming uses a tree form, the model can easily provide a flexible hierarchical representation to estimate the relationship of homologies for predicting protein function.
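the rooted-tree program representation can be sketched with a tiny interpreter over function-nodes and terminal-nodes; the arithmetic function set below is a generic illustration, not the function set of the agct model.

```python
import operator

# function set (internal nodes); a program is a nested tuple, e.g.
# ('+', 'x', ('*', 'x', 'y')) encodes the program x + x * y
FUNCS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

def evaluate(node, env):
    """recursively evaluate a program tree against variable bindings."""
    if isinstance(node, tuple):                # function-node (internal)
        op, left, right = node
        return FUNCS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(node, str):                  # terminal-node: variable
        return env[node]
    return node                                # terminal-node: constant

program = ('+', 'x', ('*', 'x', 'y'))
```

crossover and mutation then reduce to swapping or replacing subtrees of such nested structures, which is why the tree form maps so naturally onto the hierarchical causal-tree solutions used here.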
characteristically, genetic programming provides a mechanism for searching for the fittest individual: the problem solution provided by a computer program within the search space of all possible computer programs. the search space comprises terminal-nodes (external nodes) and function-nodes (internal nodes) appropriate to represent the specific problem [26] . when the genetic operations produce an infeasible individual (causal tree), the model must regularize this infeasible individual using a tournament pick (tp) method. the tp method with a depth-first search (dfs) strategy is proposed to accelerate the regularization of an infeasible individual into a feasible one. the tp+dfs method recursively cures the infeasible nodes residing in the individual: in a preorder tree traversal, it replaces an infeasible node with a feasible node randomly selected from a tournament containing numerous feasible nodes. the advantage of the tp+dfs method is that mutation occurs at a suitable time and the fitness function is estimated immediately following crossover. when all individuals conform to all constraints of a causal tree, a fitness function can be used for individual estimation. to improve prediction speed and accuracy, this model uses a multiple-fitness-function strategy, implemented through an exchanged-individual mechanism involving three subpopulations. the model next considers the following parameters for developing a co-evolution with the strategy of inner-exchanged individuals in a population. the model exchanges individuals among the three subpopulations 0, 1, and 2, which contain nn, nm, and nk individuals, respectively. the subpopulation sizes should satisfy nn > nm, nk, reflecting a trade-off between speed and accuracy.
the selection method used in the model is always the tournament selection strategy rather than the fitness-proportionate selection strategy. additionally, tournament selection (ts) differs from tp in the target level: the former selects individuals from the population as its targets, using a tournament of size 7, whereas the latter selects the programs (nodes) of an individual (tree) to do the same. regarding the probabilities of the genetic operations, crossover and reproduction are assigned values of 0.9 and 0.1, respectively. to control the tree shape, the tree depth must be 20 or less after performing the crossover operation; furthermore, each internal node has between one and three children connected to it. related applications employing genetic programming can be found in [43] [44] [45] . fig. 4 illustrates the inner-exchange strategy of individuals among the three subpopulations of a population for achieving the cooperation of the multiple fitness functions. initially, a set of initial feature-nodes is fed into subpopulation 0. subsequently, two feature-node sets, produced by random orderings of all feature-nodes, are matched against each other with tps methods to find the best correspondence between them. subpopulation 0 is responsible for comparing the functional signals via heuristic matching. for effective heuristic matching, the model assumes that identical conserved protein domains demonstrate genetically similar functions given analogous functional signal shapes; consequently, a physical deformation of the signal shapes should be preserved in somewhat ancestral contours. subsequently, in fig. 4 , the n best individuals of subpopulation 0 must emigrate into subpopulation 1 for correcting the boundaries of the protein fragments. in subpopulation 1, an exhaustive local alignment of two fragments (nodes) is used to adjust the boundaries of two protein fragments.
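tournament selection of size 7, as used above, can be sketched as follows; the population and fitness function are hypothetical.

```python
import random

def tournament_select(population, fitness, size=7, rng=random):
    """sample `size` individuals uniformly (with replacement) and
    return the fittest contender."""
    contenders = [rng.choice(population) for _ in range(size)]
    return max(contenders, key=fitness)

population = list(range(100))           # hypothetical individuals
fitness = lambda ind: -abs(ind - 42)    # hypothetical fitness: best is 42
winner = tournament_select(population, fitness, size=7)
```

unlike fitness-proportionate selection, tournament selection needs only fitness comparisons (not a normalized distribution), and the tournament size directly tunes selection pressure.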
meanwhile, a moment-invariant method calculates a new set of feature data for updating the signal data; moreover, the model must realign the protein fragments once the boundaries of the related fragments have been changed. next, the m best individuals of subpopulation 1 must emigrate into subpopulation 2 to confirm the significant protein fragments by training on real cases, and the n best individuals of subpopulation 0 must also emigrate into subpopulation 2 to do the same work. finally, the k best individuals of subpopulation 2 must emigrate into subpopulation 0 to restore individuals having specific biological evidence. these restored individuals can improve the quality of the individuals of subpopulation 0 using the guidance from real cases. simultaneously, subpopulation 2 eliminates its worst individuals to prevent them from returning to subpopulation 0 or 2. when all of the individuals in the three subpopulations have been co-evolved by genetic programming, the final solution is the optimum individual selected from subpopulation 0. specifically, the solution takes the form of a causal tree; each path from the root node to any external node represents a functional heredity process. this path can provide an auxiliary guide for biologists in comprehensively understanding the unknown functions of query sequences. the strategies of exchanged individuals are feasible and essential for improving the final solution quality. more specifically, the model cooperatively evolves the causal trees via the cooperation of multiple fitness functions: eq. (27) is the fitness function of subpopulation 0, eq. (33) is the fitness function of subpopulation 1, and the product of eq. (30) over all nodes is the fitness function of subpopulation 2. fig. 5 presents the complete algorithm for constructing this agct model.
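the inner-exchange of best individuals among the three subpopulations can be sketched as follows; the fitness values and migration sizes (n, m, k) are hypothetical, and individuals are copied rather than moved for simplicity.

```python
def best(pop, fitness, count):
    """return the `count` fittest individuals of a subpopulation."""
    return sorted(pop, key=fitness, reverse=True)[:count]

def migrate(pop0, pop1, pop2, fitness, n=2, m=2, k=2):
    """one round of inner-exchange: the n best of pop0 go to pop1 and
    pop2, the m best of pop1 go to pop2, and the k best of pop2 return
    to pop0."""
    pop1 = pop1 + best(pop0, fitness, n)
    pop2 = pop2 + best(pop0, fitness, n) + best(pop1, fitness, m)
    pop0 = pop0 + best(pop2, fitness, k)
    return pop0, pop1, pop2

fitness = lambda x: x                     # hypothetical scalar fitness
p0, p1, p2 = migrate([9, 1, 8], [5, 2], [3, 4], fitness, n=1, m=1, k=1)
```

with these toy values, the best individual (9) of subpopulation 0 propagates through subpopulations 1 and 2 and then returns to subpopulation 0, illustrating how the cycle feeds evidence-backed individuals back into the heuristic-matching subpopulation.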
the nucleocapsid protein (nsp1) of the coronavirus family from various species was tested to clarify the function of nsp1 of the sars coronavirus. of the seven input sequences, one is a query sequence and six are member sequences; the model inputs the sequences simultaneously to analyze its performance. for convenience, table 1 lists the information on sequence number, abbreviated name, accession, definition, and source, obtained from the ncbi web site, for these proteins. the coronaviruses are members of a family of enveloped viruses that replicate in the cytoplasm of animal host cells [46] . sars coronavirus tw1 is a severe acute respiratory syndrome-associated coronavirus known as the tw1 isolate. for simplicity, the model inputs only the sequence of the putative nsp1 protein of tw1 for analysis against the known nsp1 proteins of other hosts. this analysis is nevertheless useful for understanding the novel protein based on inference from other proteins with known functions. although traditional msa can provide accurate knowledge regarding the conserved domains for these proteins, it cannot clearly express homology relationships over these conserved domains. in fact, the proposed model can provide a hierarchical contour of protein fragments, which can help biologists to better understand the functional heredity relationships depending only on the primary structure of proteins. due to space limitations, the matching causal tree obtained for this problem is not shown. this section analyzes the convergence performance of the energy at different evolutionary generations using the agct model. table 2 lists some of the test parameters used for the agct model to obtain a suboptimal solution to the problems. the parameters appropriate for the present case are obtained using eqs. (33)-(35) ; moreover, a reasonable value of λ 2 = 1.8 is used for eq. (27) through analysis using eq. (35) .
the analysis attains this reasonable value using generalized cross validation. to analyze the agct model, cases (i) to (v) examine typically varied situations. case (i) is designed to illustrate that an ideal solution can be found prematurely. case (ii) represents a situation with a sufficiently large subpopulation size but a relatively small migration size for the subpopulations. case (iii) represents a situation with a small population size and a small migration size. case (iv) then represents a situation with a sufficient population size and a large migration size. finally, case (v) attempts to capture the real circumstances of evolution in subpopulations 0 and 1 when the energy of subpopulation 2 demonstrates convergence. experimentally, the key parameters are the subpopulation sizes, the migration sizes, and the number of generations for each case, as listed in table 2. table 3 (test results of the agct model) lists the results of the five cases. the sensitivity and specificity columns of table 3 display the prediction and discrimination capacities, respectively. since the capacity estimation is based on averaging over all generations, the ratios comparing the heuristic signal match with the exhaustive sequence alignment are quite low. in fact, a high prediction capacity can improve the final solution quality, while a high discrimination capacity can increase the execution speed. although the model obtains low sensitivity results, it attains relatively high specificity results. actually, the results of the later generations have higher sensitivity than earlier generations (data not shown); thus, the sensitivity results, which decrease the execution performance during the co-evolution in this model, do not influence the prediction accuracy. in fact, the rises and falls depend on a threshold (cut-off) value, because of the trade-off between sensitivity and specificity. thus, eq.
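the sensitivity and specificity capacities can be computed from confusion counts in the usual way; the counts below are hypothetical and are not the values behind table 3.

```python
def sensitivity(tp, fn):
    """true-positive rate (prediction capacity): tp / (tp + fn)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """true-negative rate (discrimination capacity): tn / (tn + fp)."""
    return tn / (tn + fp)

# hypothetical confusion counts for one generation
tp, fn, tn, fp = 30, 70, 90, 10
```

a combination of low sensitivity (many missed homologies) with high specificity (few false homologies) is exactly the low-sensitivity/high-specificity pattern described for table 3; raising the cut-off trades one for the other.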
(33) specifies a threshold value of 1.5 to determine whether or not two protein fragments are homologous. moreover, eq. (27) specifies a threshold value of 0.3 to decide whether or not two signals are analogous. from the data in table 3 , case ii has better sensitivity than the other cases and is also competitive with other reported results. although fewer individuals are involved in case iii than in case ii, the results reported for case iii are the best except for the sensitivity result. the following discussion can further resolve this apparent contradiction. although cases ii and iv involve the same numbers of individuals, case iv involves more immigrated and emigrated individuals than case ii; consequently, case iv obtains worse results than case ii. this demonstrates that appropriate individual migration is essential for improving solution quality; however, the amount of migration should not be so large as to disrupt the normal evolution by genetic programming. meanwhile, population size is not particularly important in the present cases, since the rpm method and the smith-waterman algorithm can reliably obtain a locally optimal result for each generation of genetic programming. therefore, even though case iii involves fewer individuals than case ii, case iii can still achieve performance results approaching those of case ii. finally, cases i and iv attained the worst results, owing to premature convergence. clearly, case v improves on these results through long-term evolution. accordingly, fig. 6 displays the convergence performance of each subpopulation at fixed temperature. the energy of subpopulation 0 converges at 0.488 and has a rate of convergence different from the other two subpopulations; clearly, subpopulation 2 evolves faster than the other two. fig.
7 illustrates the improvement of the energy convergence of subpopulation 0 using five temperature rates (tr) for decreasing the annealing temperature: 1, 0.95, 0.90, 0.85, and 0.80. eventually, the suboptimal solution to the present experiment is yielded by genetic programming via the cooperation of the multiple fitness functions. this cooperation can continuously change the positions appropriate for splitting the protein sequence. due to space limitations, the starting and ending positions of each protein fragment are not shown. subsequently, because case ii contains more individuals than the others, fig. 8 compares the energy convergence curves under case ii to obtain an appropriate tr value; more precisely, the rate 0.85 is the best choice for situations requiring rapid and steady convergence. additionally, fig. 9 compares four tr values, 0.97, 0.95, 0.9, and 0.85, for analyzing the energy of subpopulation 0; clearly, case iv really is the worst case. next, fig. 10 simultaneously illustrates the energy convergence of the three subpopulations under case v while varying the rates of temperature decrease. clearly, the model proportionally and consistently reduces the energies of each subpopulation. from the generations shown in fig. 10 , subpopulations 1 and 2 already begin to converge at around the fortieth generation; finally, the model gradually performs the rpm work in subpopulation 0 between the fortieth and eightieth generations. the agct model utilizes a novel tree-structure representation to illustrate the concept of applying functional homologies to predict protein function. this model differs in numerous respects from the traditional msa model; nevertheless, it still includes various traditionally popular methods for the high-level application of predicting protein function.
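the temperature-rate (tr) schedules compared above follow a geometric decay, with t multiplied by tr each generation; the initial temperature and generation count below are illustrative.

```python
def temperature_schedule(t0, tr, generations):
    """geometric annealing schedule: t is multiplied by tr each generation."""
    temps = [t0]
    for _ in range(generations - 1):
        temps.append(temps[-1] * tr)
    return temps

slow = temperature_schedule(1.0, 0.95, 40)   # gentler cooling
fast = temperature_schedule(1.0, 0.85, 40)   # aggressive cooling
```

a smaller tr cools faster and risks the premature convergence noted for some cases, while tr = 1 never cools at all; the reported choice of 0.85 sits between those extremes.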
these methods include genetic programming, the wavelet transform, profile comparison, local alignment with the smith-waterman algorithm, the moment-invariant theorem, rpm with tps, bayes inference, and the causal tree model. besides the above methods, this model introduces several strategies, including inner-exchanged individuals in subpopulations, the cooperation of multiple fitness functions, guided evolution involving real cases, and tp+dfs regularization; the prediction accuracy thus can be improved. this hybrid model is developed to exploit the global search capabilities of genetic programming for predicting the protein functions of a distantly related protein family for which conserved-domain identification is difficult. moreover, the rpm involved can identify more sequence homologies through softened comparisons. essentially, this work contributes a complex and integrated methodology that enables new advances in predicting protein function in the post-genome era. we believe that the model robustness can be increased in the future if biologists generate more experimental data in laboratories. future work will apply this model to investigate issues related to protein interaction.

references (titles only):
- sequence comparison by dynamic programming
- a hidden markov model that finds genes in e. coli dna
- hidden markov models in computational biology: applications to protein modeling
- probabilistic reasoning in intelligent systems: networks of plausible inference
- modeling splice sites with bayes networks
- prediction of the secondary structure of proteins from their amino acid sequence
- identification of common molecular subsequences
- hybrid computational intelligence schemes in complex domains: an extended review
- genetic algorithms in molecular recognition and design
- searching for discrimination rules in protease proteolytic cleavage activity using genetic programming with a min-max scoring function
- logistic regression and artificial neural network classification models: a methodology review
- statistical mechanics beyond the hopfield model: solvable problems in neural network theory
- applying fuzzy logic to medical decision making in the intensive care unit
- increasing the efficiency of fuzzy logic-based gene expression data analysis
- epps: mining the cog database by an extended phylogenetic patterns search
- biological sequence analysis: probabilistic models of proteins and nucleic acids
- probabilistic scoring measures for profile-profile comparison yield more accurate short seed alignments
- comparison of sequence profiles. strategies for structural predictions using sequence information
- within the twilight zone: a sensitive profile-profile comparison tool based on information theory
- compass: a tool for comparison of multiple protein alignments with assessment of statistical significance
- basic local alignment search tool
- gapped blast and psi-blast: a new generation of protein database search programs
- clustal w: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice
- a new algorithm for non-rigid point matching
- ant colony system with extremal dynamics for point matching and pose estimation
- genetic programming: on the programming of computers by means of natural selection
- new algorithms for 2-d and 3-d point matching: pose estimation and correspondence
- probabilistic prediction of protein secondary structure using causal networks
- perfect sampling for wavelet reconstruction of signals
- a computational vision approach to image registration
- visual pattern recognition by moment invariants
- moment spaces and inequalities
- spline models for observational data
- functional analysis
- a relationship between arbitrary positive matrices and doubly stochastic matrices
- a new method for mapping optimization problems onto neural networks
- statistical physics algorithms that converge
- a novel optimizing network architecture with applications
- genetic and evolutionary algorithms come of age
- genetic programming iii: darwinian invention and problem solving: book review
- genetic programming and evolutionary generalization
- discovering knowledge from medical databases using evolutionary algorithms
- genetic programming for knowledge discovery in chest-pain diagnosis
- classifying proteins as extracellular using programmatic motifs and genetic programming
- fields virology
- common structure of techniques for choosing smoothing parameters in regression problems

key: cord-319378-li77za5e authors: schroeder, wheaton l.; saha, rajib
title: protocol for genome-scale reconstruction and melanogenesis analysis of exophiala dermatitidis date: 2020-09-11 journal: star protoc doi: 10.1016/j.xpro.2020.100105 sha: doc_id: 319378 cord_uid: li77za5e exophiala dermatitidis is a polyextremotolerant fungus with a small genome, and thus suitable as a model system for melanogenesis and carotenogenesis. a genome-scale model, iede2091, is reconstructed to increase metabolic understanding and used in a shadow price analysis of pigments, as detailed here. important to this reconstruction is optfill, a recently developed alternative gap-filling method useful in the holistic and conservative reconstruction of genome-scale models of metabolism, particularly for understudied organisms like e. dermatitidis, where gaps in metabolic knowledge are abundant. for complete details on the use and execution of this protocol, please refer to schroeder and saha (2020) and schroeder et al. (2020). a text editing software with capabilities beyond those of standard text editors (e.g., notepad for windows or textedit for mac os), such as syntax highlighting for programming languages, is recommended. the authors, using windows, use notepad++, available from notepad-plus-plus.org. those who use mac os in the authors' laboratory use bbedit, available from https://www.barebones.com/products/bbedit/, or sublime text, available from https://www.sublimetext.com/, for their text editing. 3. in this protocol, the authors make use of a supercomputing cluster (the holland computing center, hcc) to perform much of this protocol, including all code in the gams programming language. if readers of this protocol have access to resources which will increase solution speed, such as a supercomputing cluster at their own institutions, they should follow that institution's or supercomputing cluster's recommendations on creating an account, accessing the cluster and filesystems, and any other pertinent matters.
for managing files on the supercomputing cluster, the authors have used the nppftp (notepad plus plus file transfer protocol) plugin, as the authors use the windows operating system. for connecting to the supercomputing cluster, the authors use putty (compatible with windows and unix), which can be found at https://www.chiark.greenend.org.uk/~sgtatham/putty/. for ftp needs, mac os users in the authors' laboratory use cyberduck, available at https://cyberduck.io/. 4. codes related to this protocol can be found in two github repositories associated with these works: one for the iede2091 model (doi: 10.5281/zenodo.3608172, also available and periodically updated on github at https://github.com/ssbio/e_dermatitidis_model) and one for optfill (doi: 10.5281/zenodo.3518501, also available and periodically updated on github at https://github.com/ssbio/optfill). 5. for many of these protocol steps, an internet connection is required. 6. it should be noted that screenshots with some shapes overlaid are used here to help guide users through the steps of this protocol; however, these screenshots were taken in april of 2020, whereas the actual process of building the iede2091 model began as early as march of 2017. therefore, this protocol is not an exact replication of what was done to construct the iede2091 model, but rather demonstrates the steps of model reconstruction. timing: days to months. as discussed previously, exophiala dermatitidis is a highly melanized polyextremotolerant fungus which can be cultured as both a yeast and a mycelium. due to this and its small genome, it is of interest as a model polyextremotolerant organism and a model defensive-pigment-producing organism; however, lack of genome annotation results in poor knowledge of metabolic functions and their compartmentalization. this is addressed through utilizing multiple data sources, predictive tools, and knowledge of related organisms.
this portion of the protocol is focused on the creation of the first draft of the iede2091 model, an essential precursor to having a metabolic model draft sufficient for the application of optfill. this method was used for model reconstruction. the gold oval highlights the search query used, the pink ovals highlight where options may be found to expand the table to include enzyme commission (ec) numbers as an additional column, and the orange oval indicates where the table of results may be downloaded in the form of an excel table. the excerpt on the right is a partial screenshot of the menu which appears when the button in either pink oval is selected. the pink arrow indicates the checkbox which should be selected. the screenshot is from the associated database as of july 15, 2020. arrows are used to highlight codes referred to in various steps of this protocol, with the color of those arrows indicating the groupings of the codes. dark red and pink arrows are files associated with before you begin step 9; orange and pink arrows with before you begin step 10; yellow arrows with before you begin steps 13 and 15, as well as step-by-step methods step 1; brown arrows with before you begin step 13; green arrows with step-by-step methods step 27; light blue arrows with before you begin step 14 and step-by-step methods step 13; dark blue arrows with step-by-step methods step 19; and purple arrows with step-by-step methods step 31. while not a significant part of this protocol, python-based code has been provided so that users who do not have access to gams can perform some basic analyses on the iede2091 model, including fba, fva, and shadow price analyses (gray arrows). note that the code ''txttosbmlmodel.pl'' is also marked with a gray arrow as it converts iede2091 to sbml format (resulting in ''iede2091.xml'').
critical: the file ''common_functions.pl'' and the perl module lwp (the world-wide web library for perl, available at https://metacpan.org/pod/lwp) are required for the automated brenda database search; otherwise the ''uniprot_get_ecs.pl'' code will return an error before performing any function. critical: a stable internet connection is required on the order of minutes to hours, depending on the size of the input excel sheet. note: should the code as provided not work, consult the brenda webpage and compare it to the provided code, as brenda does not at present have a dedicated application programming interface (api); therefore, ''uniprot_get_ecs.pl'' works partially on the principle of ''screen scraping'' (e.g., reading the webpage, rather than the underlying database). any code relying on this principle may be outdated, yet the current code should at least provide a base which can be updated. note: generally, a warning will appear on the command line such as ''smartmatch is experimental at line ''. this is normal and is to be expected. the experimental nature of smartmatch has not, as of yet, caused any issues in code written by the authors which uses this tool. 10. get information as to proteins and genes known to be in exophiala dermatitidis (ncbi). the data used from the ncbi database can be accessed by the following series of substeps. a. perform a search (orange arrow figure 3a) from the ncbi homepage for ''exophiala dermatitidis''. b. under the ''genomes'' heading of the results, select the ''assembly'' link (orange arrow figure 3b). c. as this portion of the workflow was originally performed in 2017, only one assembly was available at that time, which is the one labeled ''exop_derm_v1'' (refseq assembly accession gcf_000230625.1, orange arrow figure 3c) at present (as of march 2020).
to recreate this work, this can be selected, or another assembly (such as the more recent asm1088354v1 shown in figure 3c) can be selected to improve upon this work or expand it. (c) partial screenshot of assembly results which highlights the assembly used in the creation of iede2091 (exop_derm_v1). this was the assembly chosen as it was the only assembly available as of 2017, when this part of the procedure was done. as of 04/20/2020 (when this screenshot was taken), there were seven genome assemblies. (d) partial screenshot of the genome assembly page; the orange arrow shows the link to follow from the assembly to the related organism page. (e) partial screenshot of the organism page, specifically the entrez record. d. follow the hyperlink after ''organism name: '' (generally the first hyperlink under the header, orange arrow in figure 3d). this will take the reader to ncbi's taxonomy browser, which was used for the creation of the iede2091 model. e. in a table on the rhs of the web page, the number of direct links in the row ''gene'' should be hyperlinked (orange arrow figure 3e); following this hyperlink leads to a tabular list of genes in the genome assembly. f. this list can be downloaded as a table by selecting the ''send to:'' drop-down menu above the upper right-hand corner of the results table. i. select the radio button ''file'', which will expand this menu. ii. select ''tabular (text)''. iii. sort by any method. iv. select ''create file'' (orange arrow figure 3f). note: the formatting of the resultant file is moderately different from that of the supplemental files, as those files were originally generated in 2017. 11. expand the known number of ec numbers (ncbi). this step strongly parallels step 9, and thus, for critical portions of this step and related notes, see step 9. a similar set of input codes is used in this step, which can also be found in the associated github repository, shown in figure 2.
the necessary codes for this step are indicated by orange arrows (''ncbiproteindetails.csv'' and ''ncbi_get_ecs.pl''), with the necessary library of functions indicated by the pink arrow (''common_functions.pl''). a. the first step for the recreation of the iede2091 model is to convert the microsoft excel file to a .csv file using the procedure described in step 9a. b. download, from the associated github repository, the following codes: ''ncbi_get_ecs.pl'' (orange arrow in figure 2), ''ncbiproteindetails.csv'' (orange arrow in figure 2, or use the file from step 10 and rename it as this), and ''common_functions.pl'' (pink arrow in figure 2, if not already downloaded). all these files should be in the same path (preferably the same as was used in step 9). 12. these results should then be combined to get a full set of ec numbers known to be in e. dermatitidis. this is done by copying-and-pasting the ec columns of the files generated in steps 9 and 11, and then performing the ''remove duplicates'' data operation on that column in microsoft excel. now that metabolic functions known to be carried out by exophiala dermatitidis have been identified, subcellular localization of those functions is necessary for modeling. 13. determining reactions associated with ec numbers through the kegg api. ec numbers can be used to link proteins to their corresponding metabolic functions. in this work, the kegg application programming interface (api, available at rest.kegg.jp) was used by the ''enzymes_to_rxns.pl'' perl language code (yellow arrow in figure 2). note: at this point, a non-kegg system of identifiers could be chosen for this protocol, if so desired, which may be partially automated.
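the combine-and-deduplicate operation of step 12 (copying the ec columns produced in steps 9 and 11 and removing duplicates in microsoft excel) can equally be scripted. a minimal python sketch, where the row data are hypothetical stand-ins for the two downloaded tables:

```python
def merge_ec_columns(rows_a, rows_b, column="ec"):
    """combine the ec-number columns of two downloaded tables and drop
    duplicates, preserving first-seen order (mirrors excel's
    'remove duplicates' operation described in step 12)."""
    seen, merged = set(), []
    for row in list(rows_a) + list(rows_b):
        ec = row.get(column, "").strip()
        if ec and ec not in seen:
            seen.add(ec)
            merged.append(ec)
    return merged

# hypothetical excerpts of the uniprot- and ncbi-derived tables
uniprot_rows = [{"ec": "1.1.1.1"}, {"ec": "2.7.1.1"}, {"ec": ""}]
ncbi_rows = [{"ec": "2.7.1.1"}, {"ec": "4.1.2.13"}]
print(merge_ec_columns(uniprot_rows, ncbi_rows))
```

the same merged list can then serve as the input enzyme set for the localization steps that follow.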
one such example conversion is included in the github repository for the iede2091 model (https://github.com/ssbio/e_dermatitidis_model, and seen in figure 2), which converts kegg identifiers to bigg identifiers at a rate of about 38% for reactions and 62% for metabolites; the remainder would require manual curation to effect the conversion. the required files for this conversion are marked with brown arrows in figure 2. 14. determining compartmentalization of ec numbers through cello. since exophiala dermatitidis is an understudied organism, it was decided to use the cello v.2.5 subcellular localization predictor (available at cello.life.nctu.edu.tw) to determine compartmentalization of reactions through the amino acid sequences of proteins associated with ec numbers (yu et al., 2006). in these predictions, the ''eukaryotes'' and ''proteins'' radio button options were selected as shown in figure 4 (orange arrows), which contains an example amino acid sequence for exophiala dermatitidis. the amino acid sequences were retrieved as fastas from either the uniprot or ncbi database, depending on the ec number for which compartmentalization is being investigated. this step was performed manually as cello lacks an application programming interface (api) which would allow for automation. from this, a list of enzyme classification numbers with subcellular localizations appended (for example, ''1.1.1.1[c]'' for alcohol dehydrogenase in the cytosol) is generated as an input to the next step, named ''ecs_with_comp.txt''. 15. mapping ec numbers to a list of reactions. next, the links between enzymes and reactions that ec numbers represent must be followed to get a list of reactions which can be catalyzed by enzymes known to be in exophiala dermatitidis. (figure 4 caption: orange arrows indicate the settings to be selected. cello is used in before you begin step 14 to determine compartmentalization of the enzymes identified in exophiala dermatitidis.)
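entries in ''ecs_with_comp.txt'' pair an ec number with a one-letter compartment tag, e.g., ''1.1.1.1[c]''. a small parser sketch for this format (the tolerance for partial ec numbers such as ''4.1.2.-'' and the single-letter tag assumption are ours, not taken from the protocol's own code):

```python
import re

# ec number (possibly partial, e.g. "4.1.2.-") followed by a compartment tag
EC_COMP = re.compile(r"^(\d+(?:\.[\d-]+){3})\[([a-z])\]$")

def parse_ec_with_comp(entry):
    """split an 'ec[compartment]' entry into its ec number and compartment tag."""
    m = EC_COMP.match(entry.strip())
    if m is None:
        raise ValueError(f"not an ec[compartment] entry: {entry!r}")
    return m.group(1), m.group(2)

print(parse_ec_with_comp("1.1.1.1[c]"))  # ('1.1.1.1', 'c')
```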
the code ''enzymes_to_rxns.pl'' (yellow arrow figure 2) was then used to get a list of reactions which might be catalyzed by these enzymes, using ''ec_with_comp.txt'' as an input file. the resultant list of reaction identifiers and stoichiometries had reaction compartments applied automatically by the code used. this provides a list of reactions with stoichiometries in the same format used by the iede2091 model file. 16. selecting related species from which to fill metabolic gaps. an analysis of the reaction list generated by the previous step showed many metabolic gaps, even in key metabolic pathways such as the tca cycle. therefore, as done in other genome-scale modeling efforts, genome-scale models of closely related species were sought to fill these metabolic gaps. this search should be conducted using the most complete available phylogenetic tree which includes the species of interest; at the time of this work, the most complete phylogenetic tree for ascomycota fungi was that of schoch et al. (2009). google scholar should be used to identify published genome-scale models of these species (if available). the aspergillus genus was identified as the genus most closely related to exophiala dermatitidis which had multiple published genome-scale models. identified at the time were four metabolic models of aspergillus species: a. nidulans by david et al. (2008), a. niger by andersen, nielsen and nielsen (2008), a. oryzae by vongsangnak et al. (2008), and a. terreus by liu et al. (2013). note: saccharomyces species, specifically the model species saccharomyces cerevisiae (bakers' yeast), were not selected as models from which to fill these metabolic gaps, as ancestors of saccharomyces and exophiala species branched very early in the evolutionary history of ascomycota fungi (schoch et al., 2009).
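the ec-to-reaction links that ''enzymes_to_rxns.pl'' retrieves come from the kegg api (step 13), whose /link endpoint returns tab-separated pairs such as ''ec:1.1.1.1&lt;tab&gt;rn:R00623''. a hedged offline sketch of the parsing step, using a hard-coded response excerpt in place of a live request (the reaction identifiers are illustrative):

```python
def parse_kegg_links(response_text):
    """parse the tab-separated body returned by kegg's /link endpoint
    (e.g., rest.kegg.jp/link/rn/ec:1.1.1.1) into {ec: [reaction ids]}."""
    links = {}
    for line in response_text.strip().splitlines():
        ec_field, rn_field = line.split("\t")
        ec = ec_field.split(":", 1)[1]   # strip the "ec:" prefix
        rxn = rn_field.split(":", 1)[1]  # strip the "rn:" prefix
        links.setdefault(ec, []).append(rxn)
    return links

# excerpt of a kegg-style response for one enzyme (shape only; entries illustrative)
sample = "ec:1.1.1.1\trn:R00623\nec:1.1.1.1\trn:R00754"
print(parse_kegg_links(sample))
```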
while a model of saccharomyces cerevisiae was used in later manual curation efforts, namely isce926 by chowdhury, chowdhury and maranas (2015), saccharomyces species were not considered sufficiently related for wide-scale adoption of metabolic functionalities. 17. download and compare selected aspergillus models. the supplemental files associated with each publication of a genome-scale model of a related species identified in step 16 (in this work, from the aspergillus genus) should then be downloaded. while each genome-scale modeling work had unique formatting, most models will list ec numbers associated with either the model or its reactions, which can then be used to compare the metabolic functionalities of each model. here, all four models listed ec numbers associated with reactions. often, through some relatively simple string handling in microsoft excel, lists of all ec numbers both with and without compartmentalization can be generated for each model, in addition to a non-compartmentalized list of all enzymes present in the four models in total. the non-compartmentalized enzyme list can be used to determine which enzymes are present in which models to generate a summary of the overlaps of the metabolic functionalities of the related genome-scale models used. this overlap analysis resulted in four ''bins'' of ec numbers. the first bin was the full consensus (enzymes common to all four species models), which were assumed to also be present in e. dermatitidis. the remaining bins were labeled common to three of four, common to two of four, and unique to one model. these three bins will be used later to generate databases of functionalities for optfill in steps 28, 32, and 36. 18. determine compartmentalization from aspergillus models. for the full consensus bin from the previous step, each model should then be consulted to determine what compartmentalization these enzymes have in these models.
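the binning in step 17 amounts to counting, for each ec number, how many of the four models contain it. a sketch with hypothetical ec complements standing in for the four aspergillus models:

```python
from collections import Counter

def bin_by_overlap(model_ec_sets):
    """bin ec numbers by how many of the input models contain them, returning
    {count: sorted ec numbers}; with four models, bin 4 is the full consensus."""
    counts = Counter(ec for ecs in model_ec_sets for ec in set(ecs))
    bins = {}
    for ec, n in counts.items():
        bins.setdefault(n, []).append(ec)
    return {n: sorted(ecs) for n, ecs in bins.items()}

# hypothetical ec complements of four related-species models
models = [
    {"1.1.1.1", "2.7.1.1", "4.1.2.13"},
    {"1.1.1.1", "2.7.1.1"},
    {"1.1.1.1", "2.7.1.1", "6.2.1.1"},
    {"1.1.1.1", "5.3.1.9"},
]
bins = bin_by_overlap(models)
print(bins[4])  # full-consensus bin
```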
all compartmentalization of the consensus enzymes found in the related species models was added to a list of enzymes with compartmentalization appended, in the same manner as ''ec_with_comp.txt'', to produce a new list ''asp_ec_with_comp.txt''. 19. mapping ec numbers to a list of reactions (aspergillus enzymes). the code ''enzymes_to_rxns.pl'' (yellow arrow figure 2) from step 15 should then be used to get a list of reactions which might be catalyzed by these enzymes, using ''asp_ec_with_comp.txt'' as an input file. the resultant list of reaction identifiers and stoichiometries should have reaction compartments applied automatically by the code. this step results in a list of reactions with stoichiometries in the same format used by model files such as iede2091's model file. 20. defining the biomass function. at this stage of the reconstruction process, literature evidence should be sought for the organism modeled in order to define a biomass equation, in this instance for exophiala dermatitidis. given the partial in vivo evidence for biomass composition available at the time of developing the model, this search was divided into three tasks: i) determine the composition of the cell wall; ii) determine the composition of the remainder of the cell (e.g., the plasma membrane and all it encloses); and iii) determine the mass fraction of biomass accounted for by carotenoids, where each accounts for a fraction of the total biomass. this method of defining the biomass composition was necessary because direct evidence for the composition of e. dermatitidis cell walls was known, yet no evidence was available for the composition of the remainder of the cell or the mass fraction of carotenoids in biomass.
for the biomass of the cell wall, it was assumed that 25% of the cell mass is cell wall, as the cell wall of this species has been described as ''thick'' (chen et al., 2014; kumar and vatsyayan, 2010; schnitzler et al., 1999) and in other ascomycetes cell walls may account for as much as 30% of dry biomass (lipke and ovalle, 1998). the composition of the cell wall of e. dermatitidis was obtained from geis (1981). as the a. terreus model is the most recently published of the four aspergillus models, its biomass composition was used for the remainder of the cell (liu et al., 2013). the lipid composition of the remainder of the cell was determined from the a. terreus composition (by weight percentage) obtained from kumar and vatsyayan (2010). once these two pieces of information were put together, one crucial piece of information was still missing: the contribution of carotenoids to the biomass of e. dermatitidis, as aspergillus species lack carotenogenesis. again, lack of information pertaining to e. dermatitidis led to using the contribution of carotenoids to organism biomass from another organism, in this case podospora anserina, for which approximately 3.47% of cell mass is carotenoids (strobel et al., 2009). both p. anserina and exophiala are ascomycota, but phylogenetically diverge at the class level. unfortunately, no similar data could be found for a species phylogenetically closer to exophiala (the search made was similar to that in step 16). the remaining cell weight, 71.53%, was assumed to be composed of the cell membrane and all biomass components enclosed within it (such as proteins and lipids). the stoichiometric ratio of these pseudometabolites in the biomass reaction was determined by using the solver tool in microsoft excel, whose objective was a biomass molecular weight of 1,000 mg/gdw·h.
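when the mass fractions and pseudometabolite molecular weights are fixed, the solver step above has a closed form: the coefficient of component i is fraction_i × 1,000 / mw_i, which makes the biomass pseudomolecule weigh exactly 1,000 mg. a sketch using the mass fractions from the text (25% cell wall, 3.47% carotenoids, 71.53% remainder) with hypothetical molecular weights:

```python
def biomass_coefficients(fractions, mol_weights, target_mw=1000.0):
    """compute stoichiometric coefficients so the biomass pseudomolecule
    weighs target_mw mg; the mass fractions must sum to 1."""
    assert abs(sum(fractions.values()) - 1.0) < 1e-9, "mass fractions must sum to 1"
    return {c: f * target_mw / mol_weights[c] for c, f in fractions.items()}

# mass fractions from the text; molecular weights (mg/mmol) are hypothetical
fractions = {"cell_wall": 0.25, "carotenoids": 0.0347, "remainder": 0.7153}
mws = {"cell_wall": 500.0, "carotenoids": 536.9, "remainder": 250.0}
coeffs = biomass_coefficients(fractions, mws)
total_mw = sum(coeffs[c] * mws[c] for c in coeffs)
print(round(total_mw, 6))
```

by construction the total molecular weight returns to the 1,000 mg target regardless of the (assumed) component weights, which is exactly the property the excel solver was asked to enforce.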
ratios between pseudometabolites were enforced in this analysis to preserve the composition described above. critical: the biomass pseudomolecule's molecular weight should be 1,000 mg/gdw·h, so that the biomass flux rate simplifies to h⁻¹. should this not be the case, the model will produce growth rates which are too fast or too slow, depending on the direction of error in the molecular weight of the biomass pseudomolecule. for a discussion of issues which can be caused by non-standard biomass weight, see chan et al. (2017). 21. creating the first draft model. at this step, the results from three previous steps, namely steps 15, 19, and 20, need to be synthesized for the creation of a genome-scale model. the results of step 15 contain reactions known to be able to be catalyzed in exophiala dermatitidis; the results of step 19 contain a set of metabolic functions core to the related aspergillus genus and assumed to also be core to e. dermatitidis; and step 20 produced the biomass pseudoreaction necessary for genome-scale models. combining all three of these outputs yields a first draft genome-scale model of exophiala dermatitidis. a. this first draft e. dermatitidis model, or any other model made by paralleling this procedure, cannot be directly utilized by any code in the gams programming language, which requires precise formatting for input files. the set of files required by gams to properly read the model are generated by running the ''convert*.py'' code (where ''*'' is a short string identifier which describes what is being converted).
note: the ''convert*.py'' code generates several files which are gams-readable inputs, including ''mets.txt'' (set of metabolites), ''rxns.txt'' (set of reactions), ''rxntype.txt'' (stores a parameter specifying reaction direction), ''sij.txt'' (stoichiometric matrix), ''ex_rxns.txt'' (subset of reactions that are exchange reactions), ''irrev_rxns_f.txt'' (subset of reactions which are irreversible and forward), ''rev_rxns_no_ex.txt'' (subset of reactions which are reversible, not including exchange reactions), ''irrev_rxns_b.txt'' (subset of reactions which are irreversible backwards), and ''reg_rxns.txt'' (subset of reactions which are turned off by regulation). only two copies of the ''convert*.py'' code have been provided in the associated github repository, as only line 20 needs to be edited to apply the code to a different model, and lines 23 through 32 can be edited to ensure that the output files of multiple ''convert*.py'' codes do not write over each other. this code can be run through the unix or mac os terminal or the windows command prompt using the command ''python convert*.py'', so long as the ''convert*.py'' code is contained in the current working directory. b. once the ''convert*.py'' code has been run on the correct model file, the fba code ''fba.gms'' can be run using the command ''gams fba.gms''. users can change which model fba is applied to by changing line 20 in the ''convert*.py'' code and rerunning that code. c. at this step, this draft model will not be able to produce biomass, as there are no transport or exchange reactions. further, there may be some thermodynamically infeasible cycles (tics), also called futile or type iii cycles, due to the lack of curation at this stage, since all reactions at this point are reversible.
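the gams input files that ''convert*.py'' emits are essentially set and sparse-parameter listings. the sketch below renders a toy model into that general shape; the exact layouts (quoting, the ''met''.''rxn'' pairing) are assumptions for illustration, not a reproduction of ''convert*.py''.

```python
def gams_set_body(elements):
    """render a gams set file body (e.g., for rxns.txt or mets.txt) as a
    slash-delimited element list; this layout is an assumption."""
    inner = "\n".join(f"'{e}'" for e in elements)
    return f"/\n{inner}\n/"

def sij_body(stoich):
    """render sparse stoichiometric entries as 'met'.'rxn' coefficient lines,
    the kind of parameter body a file like sij.txt might hold."""
    return "\n".join(f"'{m}'.'{r}' {c}" for (m, r), c in sorted(stoich.items()))

# toy model: reaction R1 converts metabolite a[c] into b[c]
stoich = {("a[c]", "R1"): -1.0, ("b[c]", "R1"): 1.0}
print(gams_set_body(["R1"]))
print(sij_body(stoich))
```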
timing: weeks to months. while the first draft of the exophiala dermatitidis model has been created at this point, this first draft lacks important metabolic modeling capabilities, such as the ability to produce the defensive pigments which make e. dermatitidis of such interest and the ability to simulate biomass production. therefore, this portion of the protocol is focused on the creation of the second draft of the iede2091 model, which is capable of melanogenesis, carotenogenesis, and biomass production. this was done before the application of optfill to ensure correct synthesis pathways for both melanins and carotenoids (rather than pathways predicted through optfill). this second draft model is distinct from the first by having melanogenesis and carotenogenesis added to the model and by ensuring biomass can be produced. this procedure was used in model reconstruction. 22. addition of melanogenesis. since one of the goals of this study was the study of melanin synthesis (melanogenesis), the reaction pathways which produce melanins, namely pyomelanin, dhn-melanin, and eumelanin, should be identified and added to the draft model at this stage. all reactions should be element and charge balanced as they are added to the model. a primary source used to identify these reactions and the genes related to melanin synthesis was chen et al. (2014). from other available literature sources covering the details of melanin synthesis pathways, including the dhn-melanin synthesis pathway (szaniszlo, 2002), several types of fungal melanin (eisenman and casadevall, 2012), and pyomelanin synthesis in the related aspergillus genus (schmaler-ripcke et al., 2009), the reactions and their associated stoichiometries for the production of melanins were determined. an effort was made to identify as many compounds and reactions as possible with kegg identifiers; however, not all reactions and metabolites could be thus identified.
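the element-balancing requirement of step 22 can be spot-checked programmatically. a sketch with illustrative metabolite formulas (charge balancing, which the protocol also requires, would need a separate charge table):

```python
import re
from collections import Counter

def formula_counts(formula):
    """count atoms in a simple formula such as 'C6H12O6' (no parentheses)."""
    counts = Counter()
    for elem, num in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[elem] += int(num) if num else 1
    return counts

def is_element_balanced(reaction, formulas):
    """check that each element sums to zero across a reaction given as
    {metabolite: stoichiometric coefficient} (negative = consumed)."""
    totals = Counter()
    for met, coef in reaction.items():
        for elem, n in formula_counts(formulas[met]).items():
            totals[elem] += coef * n
    return all(v == 0 for v in totals.values())

# illustrative check: ethanol + nad -> acetaldehyde + nadh + h
formulas = {"etoh": "C2H6O", "nad": "C21H26N7O14P2", "acald": "C2H4O",
            "nadh": "C21H27N7O14P2", "h": "H"}
rxn = {"etoh": -1, "nad": -1, "acald": 1, "nadh": 1, "h": 1}
print(is_element_balanced(rxn, formulas))
```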
for example, in the dhn-melanin synthesis pathway (showing melanins synthesized from dihydroxynaphthalene, abbreviated dhn), the final metabolite in the synthesis, 1,8-dhn, has no kegg identifier, and thus was labeled ''x00001''. as the metabolite identifier does not exist, neither do the reactions which produce and consume 1,8-dhn; therefore these reactions were arbitrarily assigned the labels ''r900'' and ''r901'' (star protocols 1, 100105, september 18, 2020). compartmentalization for these reactions was obtained from eisenman and casadevall (2012). 23. addition of carotenogenesis. another particular goal of this study was to study the synthesis of carotenoids; therefore the reaction pathways which produce carotenoids should also be identified and added to the draft model at this stage. as in step 22, all reactions should be element and charge balanced as they are added to the model. carotenoids identified as producible by exophiala dermatitidis were obtained from geis and szaniszlo (1984) and kumar (2018), consisting primarily of b-carotene, neurosporaxanthin, and b-apo-4′-carotenal (several other intermediate carotenoids are produced, but do not accumulate significantly). the carotenoid synthesis as described in kumar (2018) was used to define the carotenoid synthesis pathway in the exophiala dermatitidis model. reaction stoichiometry, reaction labels, metabolite labels, and the synthesis pathway of the carotenoid precursor geranylgeranyl diphosphate were obtained from the kegg database. these reactions were assumed to occur in the cytosol in the absence of concrete evidence as to compartmentalization. 24. addition of exchange reactions. at this point, the draft model is a collection of reactions, lacking any interactions with the environment outside the cell, and lacking interactions between the subcellular compartments.
the first step in remedying this is the addition of exchange reactions, which identify how exophiala dermatitidis interacts with its environment, specifically in terms of metabolites exchanged between the defined metabolic system (that is, the cell, cell wall, and nearby extracellular space) and the medium at large. as an aerobic organism, uptake of oxygen and export of carbon dioxide should be added. further, it has been noted in kumar (2018), through the defined medium used, that glucose and sucrose may be used as carbon sources, sulfate as a sulfur source, ammonium as a nitrogen source, and phosphate as a phosphorus source. therefore, exchange reactions for these metabolites should also be added to the model. from discussions with an expert on exophiala dermatitidis (namely our co-author dr. steven d. harris), it was indicated that e. dermatitidis is also able to grow on acetate and ethanol as carbon sources, and that this should be reflected in the model, necessitating two more exchange reactions. further, water and proton exchanges with the external environment should be specified. note: not all exchange reactions present in the final model were defined here. for instance, it was discovered that some compound was necessary which allowed for the export of waste nitrogen; here, nitrogen export is modeled as an export of uric acid. 25. addition of transport reactions. now that step 24 has established interactions between exophiala dermatitidis and its environment, interactions between the subcellular compartments (for instance, between the extracellular space and the cytosol) need to be defined using transport reactions.
at this stage, some basic transport reactions should be added to the model which are very likely to exist in exophiala dermatitidis, such as the diffusion of water across plasma membranes and the transport of exchanged metabolites from the extracellular space into the cytosol and to the subcellular compartments which utilize those resources (for instance, oxygen is transported into the mitochondria). other transport reactions which are common to the four chosen related-species models from step 17 should also be considered for addition to the model at this stage, depending on the presence of those metabolites in the current e. dermatitidis draft model. note: not all transport reactions present in the final model were defined here; a significant number of transport reactions are added to the model as it is manually curated. 26. selection of metabolic functions used in manual curation. even with the addition of exchange and transport reactions, the current exophiala dermatitidis draft model has relatively few reactions which are capable of holding flux as determined by fva (see ''general steps on how to use iede2091'' and accompanying code for a description of how to apply fva). this is due to the presence of a large number of metabolic gaps. here, a database, or databases, of functionalities which can be used to manually fill metabolic gaps should be identified. the databases selected were the isce926 model (chowdhury et al., 2015) and the set of enzymes common to three of four aspergillus models from step 17. the latter database was converted to a set of reactions using the code ''ec_to_rxns.pl''. 27. manual curation to ensure the production of biomass. the set of reactions selected in step 26 can then be used to manually curate the model such that it can produce biomass.
manual curation can involve changing the direction of reactions already in the model, adding reactions from the database to address metabolic gaps noticed in the model, and removing reactions which may participate in tics. notes about what was done during manual curation can be found in ''curation_notes.txt'' in the associated github repository (green arrow in figure 2). this curated model is the second draft model of exophiala dermatitidis in this reconstruction. as noted, the model contains 1,587 reactions, of which only 711 are capable of holding flux. note: as defensive pigments are part of the biomass equation, this step also ensures that the defensive pigments which are being studied will be produced. note: at this stage, the optfill method could be used to address the metabolic gaps in the model, using the isce926 model and the set of reactions common to three of four aspergillus models to define the database. this was not done at this stage in the original work because, at this point in time, the optfill method had not yet been devised. after attempting to use gapfind/gapfill at this stage and finding that many metabolic gaps required multiple reactions to fix, the authors desired a method which could guarantee a minimized number of reactions added on a whole-model basis when gap-filling, to achieve a conservative reconstruction (e.g., minimizing the global number of reactions added to the model to address a large number of metabolic gaps). this step in the reconstruction of an exophiala dermatitidis model provided the rationale and motivation for developing the optfill method. as optfill took more than a year to formulate, it was never applied at this stage, since this formulation effort was done in parallel with the development of this genome-scale model of exophiala dermatitidis. details on the development of optfill can be found in the associated publications.
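the metabolic gaps that motivated optfill can be located, to a first approximation, with a simple connectivity check: a metabolite that is never produced (or never consumed) by any reaction marks a gap. a sketch (a full treatment, as in gapfind or optfill, is formulated as an optimization problem):

```python
def dead_end_metabolites(stoich, reversible):
    """flag metabolites that can only be produced or only consumed, a simple
    connectivity proxy for the metabolic gaps that gap-filling targets.
    stoich maps (metabolite, reaction) -> coefficient (negative = consumed)."""
    producers, consumers = set(), set()
    for (met, rxn), coef in stoich.items():
        if coef > 0 or reversible.get(rxn, False):
            producers.add(met)
        if coef < 0 or reversible.get(rxn, False):
            consumers.add(met)
    mets = producers | consumers
    return {m for m in mets if m not in producers or m not in consumers}

# toy network: a -> b -> c, where a is never produced and c never consumed
stoich = {("a", "R1"): -1, ("b", "R1"): 1, ("b", "R2"): -1, ("c", "R2"): 1}
print(sorted(dead_end_metabolites(stoich, {"R1": False, "R2": False})))
```

making R2 reversible removes c from the dead-end set, which mirrors how curation by changing reaction directionality (mentioned above) can close gaps without adding reactions.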
throughout this work, a dell inspiron 7373 model laptop computer using the microsoft windows 10 home operating system was used. this computer has a 250 gb solid-state hard drive, an intel core i5-8250u cpu @ 1.60 ghz (1,800 mhz, 4 cores and 8 logical processors), and 8 gb random access memory (ram). uses of this computer include directly running program codes which used internet resources and secure shell (ssh) access to the crane computing cluster at the holland computing center (hcc) of the university of nebraska-lincoln. at the time of writing, the crane cluster is the most powerful cluster at the hcc. the crane computing cluster has 64 gb of ram and utilizes an intel xeon e5-2670 2.60 ghz processor with 2 cpus per node. the crane computing cluster is used primarily for running gams codes and for running codes which create necessary inputs for gams codes. it should be noted that the crane computing cluster is used for running gams code as these codes solve large linear algebra problems with matrices on the order of thousands in both dimensions. such computationally expensive problems are either very straining for a personal computer or impossible to perform in a reasonable amount of time. therefore, it is highly suggested that followers of this protocol have access to some advanced computing resource. alternatives: for running all code except that which uses the gams programming language, any reasonably up-to-date computer may be used, whether desktop or laptop, and whatever the operating system, whether windows, mac os, or unix/linux. the ''before you begin'' section details some software tools which may be used, and indeed are used by other members of the authors' research laboratory. now that the draft model is in a state such that optfill can be applied, this section details the iterative application of optfill to the second draft model.
this section specifically discusses the applications of optfill to the second, third, and fourth drafts of the iede2091 model to expand the functionality of the reconstructions. detailed in this section is the construction of each database, the application of optfill, the inclusion of a chosen solution in the previous draft model to produce the next draft model, and the application of blastp using ncbi's blast api to determine the evidence for the selected optfill solution. this method is used in in model reconstruction.
1. building the first database. from the analysis of the overlap of enzymes present in the four aspergillus models, the set of enzymes which is common to any three of those models and absent from the fourth should be used as the basis of the first database (from the analysis done in before you begin step 17). these enzymes should be allowed with all compartmentalizations which are already present in the second draft genome-scale exophiala dermatitidis model. this list of enzymes with appended compartmentalization (such as ''1.1.1.1[c]'' for alcohol dehydrogenase in the cytosol) can then be converted to a list of reactions with stoichiometry using the ''enzymes_to_rxns.pl'' perl language code (yellow arrow in figure 2), which automatically adds compartmentalization to each reaction stoichiometry based on the compartmentalization of the enzyme. the list of reactions and stoichiometries produced by this step is defined as the first database for the application of optfill.
2. applying optfill for the first time. using the database defined in step 1, optfill should then be applied to the second draft model of exophiala dermatitidis. the process of the application of optfill is the same as that described in the ''general steps on how to apply optfill'' section of this protocol, and therefore will not be repeated here (figure 5 and figure 6 are taken from the results of this application of optfill and may be used as a method for checking results obtained).
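the ''common to exactly three of four'' membership test (and the two-of-four and unique-to-one tiers used for the later databases) can be computed with simple set arithmetic over the per-model enzyme lists, before compartment tags are appended. a minimal sketch with hypothetical model names, ec numbers, and compartment tags (illustrative only, not taken from the actual aspergillus reconstructions):

```python
from collections import Counter
from itertools import chain

# Hypothetical EC-number sets for four Aspergillus models.
models = {
    "A": {"1.1.1.1", "2.7.1.1", "4.1.2.13"},
    "B": {"1.1.1.1", "2.7.1.1", "5.3.1.9"},
    "C": {"1.1.1.1", "2.7.1.1", "4.1.2.13"},
    "D": {"1.1.1.1", "6.4.1.1"},
}

def enzymes_in_exactly(models, k):
    """EC numbers present in exactly k of the models (3-of-4, 2-of-4, or unique-to-one)."""
    counts = Counter(chain.from_iterable(models.values()))
    return {ec for ec, n in counts.items() if n == k}

def with_compartments(ecs, compartments=("c", "m", "e")):
    """Append each compartment tag already used in the draft model, e.g. '1.1.1.1[c]'."""
    return sorted(f"{ec}[{comp}]" for ec in ecs for comp in compartments)
```

for example, `enzymes_in_exactly(models, 3)` yields the first-database tier, while `k=2` and `k=1` give the tiers for the second and third databases.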
optfill produces two types of results: first, the results of the tic-finding problem (tfp); and second, the results of the connecting problems (cps), which provide sets of reactions to be added to the model to fill metabolic gaps. see the ''general steps on how to apply optfill'' section of this protocol for how to read and interpret optfill results.
3. selecting and applying the first filling solution. in this instance, it is recommended to select and implement the first cps solution produced by optfill, as this solution improves the model without adding unnecessary additional reactions. the stoichiometries for the reactions selected by the first cps solution (taken from the first database file) should be added to a copy of the second draft exophiala dermatitidis model in order to make the third draft e. dermatitidis model (star protocols 1, 100105, september 18, 2020).
note: other solutions of the cps could be selected at this stage but would result in a different model and may result in different results of analyses than those obtained through the iede2091 model.
4. performing blastp to support the first solution. from the optfill solution selected, a list of enzymes which catalyze these reactions should be manually gathered from the kegg database and placed into a list in the file ''eclist_1.txt''. using this list and the code ''bidirectionalblast.pl'' (input files ''eclist_1.txt'' and ''blastspecs.txt''), an automated search should be made to determine if there is genetic evidence to support this optfill solution. this code uses an input list of enzymes; specifications as to acceptable cutoff values for percent positive substitution of residues and expect value (e-value) related to sequence similarity; and a list of related species from which to take the search sequences for the given enzyme. this code works as follows.
first, for each enzyme on the list, this code searches kegg for amino acid sequences for genes known to produce that enzyme from an acceptably related organism (these organisms are specified in the ''blastspecs.txt'' file). this sequence is then used as the query in a blastp search, utilizing the blast application programming interface (api), against only the exophiala dermatitidis genome in the ncbi database. it should be noted that the performance of the blastp analyses themselves is based on the sample perl code provided by ncbi for using the blast api, available in the developer information of the blast api documentation (blast developer information, n.d.). the time needed to receive blast results depends not on the blast code, but rather on the volume of demand for the blast tool at the time. all matches are then exported to various text files to store these ''forward'' blast results. this code then evaluates each match in terms of the cutoffs provided and, if the match passes, performs a blastp of the sequence found in the target genome against the sequence provided by the reference genome. should this ''backward'' blast also pass the provided criteria, the match is accepted. the recommended cutoffs could be 60% positive substitution (e.g., 60% of residues are conserved or substituted by a similar amino acid) and an expect value of 1e-30. this should result in a csv file which details genome-based support for the inclusion of the reactions included in the optfill results.
critical: note the formatting of input files ''blastspecs.txt'' and ''eclist_1.txt'', provided in the github associated with , as no effort was made to allow for different formatting of input files in the ''bidirectionalblast.pl'' code.
critical: a loss of internet connection while the ''bidirectionalblast.pl'' code is running could cause the program to terminate prematurely.
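the accept/reject logic of the bidirectional search can be summarized apart from the network calls: a forward hit must pass the cutoffs, and its reciprocal (''backward'') hit must pass them as well. a minimal sketch on mocked hit records (the sequence and gene identifiers are hypothetical; a real run would obtain hits from the ncbi blast api, as ''bidirectionalblast.pl'' does):

```python
# Mocked BLASTp hit records: (query, subject, percent_positive, e_value).
# A real run would obtain these from the NCBI BLAST API.
forward_hits = [
    ("kegg_seq1", "ed_gene_17", 72.0, 1e-45),
    ("kegg_seq2", "ed_gene_03", 55.0, 1e-10),   # fails cutoffs
]
backward_hits = {
    ("ed_gene_17", "kegg_seq1"): (70.0, 1e-40),  # reciprocal hit passes
}

def passes(pct_positive, e_value, min_positive=60.0, max_e=1e-30):
    # Suggested cutoffs from the protocol: >= 60% positives, E-value <= 1e-30.
    return pct_positive >= min_positive and e_value <= max_e

def bidirectional_accept(forward_hits, backward_hits):
    accepted = []
    for query, subject, pct, ev in forward_hits:
        if not passes(pct, ev):
            continue                              # forward BLAST fails cutoffs
        back = backward_hits.get((subject, query))
        if back and passes(*back):
            accepted.append((query, subject))     # reciprocal hit also passes
    return accepted
```

only matches passing the cutoffs in both directions survive, which is what provides the genome-based support recorded in the output csv.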
note: if the ''bidirectionalblast.pl'' code no longer functions, consult the blast developer information web page or other blast api documentation to determine if the api has changed or been updated.
note: it is suggested by the authors to run the ''bidirectionalblast.pl'' code such that times of peak blast usage are avoided (or minimized). in the experience of the authors, it is best to run such code overnight or over the weekend.
note: depending on how the supercomputing facilities used by the reader work, this code may need to be run on a personal or work device (such as a laptop or desktop computer) rather than the supercomputing cluster, as operations which directly interact with outside websites may not be allowed.
5. building the second database. the second database should be constructed in the same manner as the first database (detailed in step 1); however, the enzymes used for the construction of this database are those common to two of four of the aspergillus models analyzed, allowing all compartmentalizations currently in the third draft e. dermatitidis model for each enzyme.
6. applying optfill for the second time. optfill should then be applied to the third draft exophiala dermatitidis model (serving as the model), with the second database built in step 5 serving as the database for this application of optfill, applying the procedure detailed in ''general steps on how to apply optfill''. the results of this application are described in .
7. selecting and applying the second filling solution. as with step 3, the best solution should be selected from the second application of optfill, and the stoichiometries of the reactions in the optimal cps solution should be added to a copy of the third draft exophiala dermatitidis model to produce the fourth draft e. dermatitidis model.
8. performing blastp to support the second solution.
as in step 4, a list of enzymes which catalyze the reactions which participate in the optfill solution selected in step 7 is created, and the ''bidirectionalblast.pl'' code is applied to this list to attempt to find some genomic support for this second optfill solution.
9. building the third database. the third database is built in the same manner as the first and second databases (detailed in steps 1 and 5); however, the enzymes which should be used for the construction of the database are those unique to one of the aspergillus models analyzed, allowing all compartmentalizations currently in the fourth draft e. dermatitidis model for each of these enzymes.
10. applying optfill for the third time. as in steps 2 and 6, optfill should then be applied for the third time, using the fourth draft e. dermatitidis model as the model file and the third database (constructed in step 9) as the database, and following the procedure outlined in the ''general steps on how to apply optfill'' section of this protocol. the results of this application are described in .
11. selecting and applying the third filling solution. as with steps 3 and 7, the best solution should be selected from the third application of optfill, and the stoichiometries of the reactions in the optimal cps solution can then be added to a copy of the fourth draft exophiala dermatitidis model to produce the fifth draft e. dermatitidis model.
12. performing blastp to support the third solution. as in steps 4 and 8, a list of enzymes which catalyze the reactions which participate in the optfill solution of step 11 should be created, and the ''bidirectionalblast.pl'' code is applied to this list to find some genomic support for this third optfill solution.
iede2091 model: shadow price analyses
after reconstructing the exophiala dermatitidis model iede2091, it is desired to use this model in analysis of e. dermatitidis metabolism. as the defensive pigments of e.
dermatitidis are of particular interest for their ability to confer polyextremotolerant properties on an organism, shadow price analysis is selected as a tool for metabolic investigation of these molecules. the shadow price represents the cost to the objective (growth in this model) of producing one more unit (here mmol/gdw·h) of a particular metabolite. the analysis is performed using the dual form of the fba optimization problem to determine the shadow price for each metabolite. it can then be used to determine the cost of particular metabolic phenotypes to the organism (here melanogenesis and carotenogenesis) to give greater insight into an organism's metabolism. the shadow price analysis of metabolites, specifically defensive pigments, is one of the key analyses of . here is a step-by-step description of how the shadow price analysis was performed on the fifth draft exophiala dermatitidis model, which could hopefully aid others in their use of shadow price analysis. further, this analysis resulted in some additional curation of the fifth draft exophiala dermatitidis model, which finally resulted in the iede2091 model.
15. format the model file for shadow price analysis. the format is as follows (perl- and python-format regular expressions are used to describe the format; items which will be described in greater detail are bracketed by ''<>''): ''<rxn_label>\s(<#>\s<met>(\s\+\s)?)*(\->|<\->|<\-)\s(<#>\s<met>(\s\+\s)?)*''
a. example reactions and reaction formats are shown in figure 7 with the format described below.
b. ''<rxn_label>'': label of the reaction. while it can be anything, in this work the reaction label is composed of the kegg reaction identifier, if applicable, followed by square brackets surrounding a short code indicating the subcellular compartment.
exchange, transport, and reactions not in the kegg database but in the exophiala dermatitidis metabolism generally have custom reaction labels, which use three digits after the character ''r'' (such as ''r900[c]'' and ''r901[e]'' in figure 7), as opposed to five digits in the kegg identifiers.
c. ''<#>'': stoichiometric coefficient of the metabolite in the given reaction. not to be confused with a line beginning with ''#'', which denotes a comment in the model file (such as the first line of text in figure 7). these comments are ignored by the python code ''convert*.py'', so that the comments aid in the organization of the file yet have no effect on the actual function of the model.
d. ''<met>'': label of a metabolite. while it can be anything, in this work the metabolite label is composed of the kegg compound identifier, if applicable, followed by square brackets surrounding a short code indicating the subcellular compartment. metabolites which do not have kegg identifiers have custom labels beginning with ''x'', such as x00001[c] in figure 7, which stands in for 1,8-dihydroxynaphthalene, a compound not in the kegg database.
note: any changes made to the model, such as curation or the addition of reactions, should be made to the ''iede2091.txt'' model file; then repeating this step (specifically running the ''convertmodel.py'' code) will automatically update all input files required for the shadow price analysis.
16. run shadow price analysis code. run the ''get_shadow_prices.gms'' code using the command ''gams get_shadow_prices.gms''. the mathematics related to this analysis are described in detail in ; however, it may be summarized that the shadow price is the cost to the objective function (in this case, rate of biomass production) that would be incurred by producing one more mmol/gdw·h of a given metabolite. this creates several output files.
first is ''rxn_rates_out.csv'', a csv file which stores the reaction flux (in mmol/gdw·h) for each fba performed in the shadow price analysis. the second output file is ''shadow_price.csv'', which stores the shadow prices calculated for metabolites which are metabolically close to carotenoids and melanins. the next file is ''shadow_price_mcoa.csv'', which stores the calculated shadow prices for malonyl-coa and its precursor metabolites. finally, ''shadow_price_biomass.csv'' stores the shadow price of all biomass precursors for each fba analysis.
17. study shadow price analysis results. the majority of shadow price values should be negative, indicating that producing extra of any metabolite detracts from biomass production, except for metabolites which may be biomass-coupled. a discussion on biomass-coupling can be found in burgard, pharkya, and maranas (2003). this study is generally the most time-consuming step of the procedure of shadow price analysis applied to the iede2091 model.
18. model curation using shadow price analysis. when applying shadow price analysis to the fifth draft exophiala dermatitidis model, it may be the case that some compounds have positive shadow prices (indicating that if more is made then more biomass is also made) and others may have shadow prices which vary between the high, medium, and low limiting nutrient availability conditions. these issues are likely indicators of mass or charge imbalance in some reaction involving that particular metabolite, or a metabolite upstream of that metabolite in the reaction network, or that a particular metabolite is coupled with biomass production. the latter is unlikely unless the model is of a particular strain designed to have biomass production coupled with metabolite production. the former can be corrected by manual curation which addresses reaction balances. the shadow price analysis may then be run again until neither of these indicators remains present in the analysis.
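the meaning of a shadow price can be illustrated numerically without the dual formulation: approximate it by re-solving the fba problem while forcing a small extra drain of a metabolite and taking the change in optimal biomass per unit drained. a minimal sketch on a hypothetical three-reaction toy model (the stoichiometry and bounds are illustrative assumptions), using scipy:

```python
import numpy as np
from scipy.optimize import linprog

# Toy model: metabolites A, B; reactions R0: -> A, R1: A -> B, R2 (biomass): B ->
S = np.array([[1.0, -1.0,  0.0],
              [0.0,  1.0, -1.0]])
bounds = [(0, 10), (0, 1000), (0, 1000)]   # uptake R0 capped at 10
c = np.array([0.0, 0.0, -1.0])             # maximize biomass flux v2

def max_biomass(b_eq):
    res = linprog(c, A_eq=S, b_eq=b_eq, bounds=bounds, method="highs")
    return -res.fun

def shadow_price(i, eps=1e-3):
    """Change in optimal biomass per extra unit of metabolite i drained."""
    base = max_biomass(np.zeros(2))
    b = np.zeros(2)
    b[i] = eps                              # force net production (drain) of eps
    return (max_biomass(b) - base) / eps

prices = [shadow_price(i) for i in range(2)]
```

both toy metabolites lie on the single path to biomass, so each has a shadow price of about -1: producing one extra mmol/gdw·h of either costs one unit of biomass flux, matching the expectation above that most shadow prices are negative.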
in , this served as the final curation step to turn the fifth draft exophiala dermatitidis model into the final iede2091 model.
general steps on how to apply optfill
timing: minutes to 1 week
should a reader wish to apply optfill to a genome-scale model (gsm) of an organism of choice, as opposed to the iede2091 model as described in this workflow, this section will describe, in more general terms, the step-by-step procedure to apply optfill using gams. this section describes the required files, directory structure, and modifications to code provided in the github repositories related to this work which will be necessary to apply optfill to an organism and model of the user's choice.
19. getting all requisite files. for the general application of optfill, several input files are required, and the number of required files varies to some extent by the procedure used. here it will be assumed that a similar procedure is used in the general application of optfill as in the specific applications of optfill to build the iede2091 model. therefore, required files include the following (marked by dark blue arrows in figure 2):
a. one model file -the model whose metabolic gaps are to be filled using the optfill method (in this case the second draft exophiala dermatitidis model is marked).
b. one database file -the database of reactions representing metabolic functionalities to use in the filling of metabolic gaps of the model using the optfill method (the ''3of4db.txt'' file is marked, as this is what was used with the second draft exophiala dermatitidis model).
critical: these files should have the same formatting as described in step 15.
c. two ''convert*.py'' files -line 20 of each of these ''convert*.py'' files should be changed so that one file converts the provided model file, while the other converts the provided database file. further, lines 23 through 32 in these files must be somewhat changed so that unique sets of output files are made for each file to which a ''convert*.py'' code is applied.
d.
one ''prep_for_optfill.pl'' file -this file runs both of the ''convert*.py'' codes and creates certain input files for the optfill code, such as the set of all metabolites and the set of all reactions in both the model and database.
critical: attention should be paid to updating the input files in the ''convert*.py'' and ''prep_for_optfill.pl'' code based on what the codes in this step were named (specifically lines 9 and 10 of this code) and what the output files of step 15 were named (specifically, lines 15, 22, and 98 should reference the output files of the convert code which converted the model file, and lines 28, 34, and 103 should reference the output files of the convert code which converted the database file).
e. one optfill file -the optfill code file which will fill the metabolic gaps in the model file using the database file.
critical: attention should be paid to updating the input files in the code based on the names of the output files of previous steps, specifically step 15. lines particularly important to examine are 31, 34, 37, 40, 43, 46, 49, 58, 61, 64, 67, 79, 82, and 85.
20. setting up appropriate file architecture. these codes assume that all other codes are contained in the same directory.
note: the authors set up the directory on the crane computing cluster of the holland computing center which was used in this work.
21. run code generating requisite input files for optfill. run the ''prep_for_optfill.pl'' code (the command is ''perl prep_for_optfill.pl'', provided the terminal working directory is the directory set up in step 20).
critical: format the model and database files as shown in figure 7; otherwise the code will not run properly.
note: this step will create a large number of files.
22. run optfill. optfill should be run on hardware as advanced as possible which allows for a long runtime. depending on the quality and size of the model and database used, the runtime may be between a few seconds and several days.
note: in both the schroeder and saha (2020) and works, the crane computing cluster was used. in order to have an uninterrupted runtime, jobs were submitted using the slurm workload manager, which is used by all holland computing center computing clusters.
note: a discussion about the needed quality of the model and how quality influences runtime can be found in . should the model itself have a large number of inherent tics, see the general steps on how to apply the tfp. this can greatly increase the runtime, and the tfp may need to be coupled with manual curation to address model quality issues.
23. understanding optfill solutions part 1: the tfp. an example of results for the tfp is shown in figure 5 and will be used to show what a typical output would look like (important features highlighted in blue). the substeps below describe features of the output of the tfp.
a. lines 1 through 5 represent the standard output header, which is always at the top of the output file.
b. lines 6 through 9 are the standard output block which the tic-finding problem (tfp) returns when there are no more tics of a given size (where the size is indicated by the value of phi) left to find. in this case, there are no tics of size 1 (which only occur when there are duplicate reactions in the database and model) or of size 2.
c. lines 18 through 28 and 31 through 41 show two examples of the information provided by the tfp once a tic is found.
d. on lines 18 and 31, it is shown that each new tic is assigned a new whole-number label.
e. on lines 19-20 and 32-33, the objective value and the number of reactions in the tic are both reported. as discussed in , the objective function is the minimization of the number of reactions participating in the tic, and this number is also fixed by the value of phi, rendering the objective function moot, yet one is still required for optimization. each pair of objective value and number of reactions should have the same value.
if they do not, this indicates that the relaxations allowed by the solver may need to be tightened.
f. lines 21-22 and 34-35 serve as headers to make the output more human-readable and indicate the start of the table which describes the found tic in detail.
g. at the start of each line in the table is the reaction label as provided in the model file.
h. the next column in the table is an arrow indicating the direction of flux through the reaction when the reaction is participating in the tic. each reaction is allowed to proceed either forward or backward. the direction of the reaction is also indicated by the binary variables alpha and beta reported in the last two columns, where alpha indicates forward while beta indicates backward.
i. the third column of the table indicates where the reaction resides, either in the model (m) or database (db).
j. the fourth column reports on the binary variable eta, which simply reports whether (value 1) or not (value 0) a reaction is participating in the found tic. all values of this column should be 1, and if this is not the case, consider tightening the relaxations allowed by the solver.
k. the fifth column indicates the flux through the reactions participating in the tic. the values will always belong to the set [-1, -1e-5] ∪ [1e-5, 1]. this is because allowing smaller-magnitude flux rates, or more orders of magnitude for the range in flux rates, may cause issues with the relaxations used by optimization problem solvers. generally, the magnitude of the reported flux rate is unimportant; rather, the ratios between flux rates are more enlightening. see schroeder and saha (2020) for a discussion on these issues.
24. understanding optfill solutions part 2: the cps. an example of the results which may be returned from the cps is shown in figure 6.
note that as there are several output sections for each cps solution, only one example solution is shown, and some lines have been removed from that solution (replaced with ellipses) for the brevity of the figure. it should be noted that, for some portions of the output of the solutions to the cps, the formatting is not as neat as that of the tic-finding problem.
a. line 2,577 is the heading for the start of the current cps solution.
b. lines 5,279, 5,280, and 5,281 report the objective solution for the first, second, and third cps, respectively. these problems sought to maximize the number of metabolites which can be produced, minimize the number of reactions added, and maximize the number of reactions added reversibly, respectively.
c. lines 5,283 through 5,305 are the first result table of the cps solution and summarize which reactions are added in the solution and how those reactions are added. the meaning of each column is detailed in figure 6.
d. lines 5,308 through 7,142 are the same table as lines 5,283 through 5,305, though these lines detail the results for all reactions contained in the database, hence the much larger size. this is mostly used for the purposes of debugging. reactions not included in the cps solution are labeled with a direction of ''xx''.
e. lines 7,145 to 7,176 list the metabolites which can now be produced by the optfilled model which the model could not previously produce.
f. lines 7,179 through 9,507 produce a table of all metabolites in both the model and the database, and the binary variable, x(i), which determines whether or not they are produced.
g. lines 9,510 through 11,119 show the results of an fba of the optfilled model. the alignment of text is not as high quality as with other tables produced by optfill, but this can be remedied by copying-and-pasting this table into a microsoft excel worksheet and using the text import wizard tool to delimit the cells by spaces.
this will result in a more readable table of fba results. this table is mostly used for the purposes of debugging.
h. lines 11,121 through 11,123 report the model and solver statuses of the solutions, as well as the number of iterations needed for the solver to reach these solutions. should the model or solver statuses not have a value of 1 at any stage, this would be an indication of a need for debugging the code.
i. line 11,124 again states the number of reversible reactions in the cps solution.
j. line 11,125 states the growth/biomass rate of the optfilled model, as fixing metabolic gaps can often have an effect on the rate of biomass production.
k. line 11,126 finally reports the time, in seconds, which the solver takes to reach this optimal solution.
25. select solution. a set of unique filling solutions will be provided by the optfill code. users should carefully review the filling solutions produced by the optfill code and select a solution based on criteria such as how optimal the solution is as determined by optfill (e.g., order of solutions), which metabolites are fixed (literature evidence for fixed metabolites), which reactions are added (literature evidence for metabolic functions), or other user-defined criteria.
general steps on how to apply the tfp
timing: minutes to 1 week
this section describes how to apply the modified tfp of optfill to identify inherent tics utilizing gams. this is useful in order to identify tics inherent to a model so as to improve the quality of the reconstruction. this section describes the required files, directory structure, and modifications to code provided in the github repositories related to this work which will be necessary to apply the modified tfp to an organism and model of the user's choice.
26. getting all requisite files. for the general application of the tfp of optfill, a few input files are required, and the number of required files varies to some extent by the procedure used.
here it will be assumed that a similar procedure is used in the general application of the tfp as in the specific applications of optfill to build the iede2091 model. therefore, required files include the following (marked by orange arrows in figure 8):
a. one model file -the model to which the tfp will be applied. here, the ijr904 model file (''ijr904.txt'') is marked as the system to which the inherent tic-finding will be applied.
critical: these files should have the same formatting as described in step 15.
b. one ''convert*.py'' file -a python code which makes several input files necessary for the tfp. line 20 of this ''convert*.py'' file should be changed so that it converts the provided model file. (here, the file is ''convert_ijr904.py''.)
c. one tic-finding problem file -the code which performs the tic-finding problem on the input model file (here labeled ''optfill_tfp_only_ijr904.gms'').
27. setting up appropriate file architecture. these codes assume that all other codes are contained in the same directory.
28. run code generating requisite input files for optfill. the ''convert*.py'' code from step 26b should be run to create all necessary input files.
29. run tfp code. the tfp should be run on hardware as advanced as possible which allows for a long runtime. depending on the quality and size of the model and database used, the runtime may be between a few seconds and several days.
note: in both the schroeder and saha (2020) and works, the crane computing cluster was used. in order to have an uninterrupted runtime, jobs were submitted using the slurm workload manager, which is used by all holland computing center clusters.
30. analyze solution. tics inherent to the model will be included in the solution of the tfp code. these solutions will include both the reactions participating in tics and their directions within each participating tic.
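several of the manual checks described for reading tfp output (eta equal to 1, exactly one of alpha/beta set, flux magnitude within [1e-5, 1], arrow consistent with alpha/beta) can be automated over a parsed result table. a minimal sketch on hypothetical parsed rows (the labels and values are illustrative, not actual tfp output):

```python
# Hypothetical parsed rows of a TFP result table:
# (reaction_label, direction, location, eta, flux, alpha, beta)
tic_rows = [
    ("R00001[c]", "->", "M",  1,  1.0, 1, 0),
    ("R00002[c]", "<-", "DB", 1, -1.0, 0, 1),
]

def check_tic_rows(rows, tol=1e-9):
    """Sanity checks mirroring the protocol's guidance on reading TFP output."""
    problems = []
    for label, direction, loc, eta, flux, alpha, beta in rows:
        if eta != 1:
            problems.append(f"{label}: eta != 1 (consider tightening solver relaxations)")
        if alpha + beta != 1:
            problems.append(f"{label}: exactly one of alpha/beta must be 1")
        if not (1e-5 - tol <= abs(flux) <= 1 + tol):
            problems.append(f"{label}: |flux| outside [1e-5, 1]")
        if (direction == "->") != (alpha == 1):
            problems.append(f"{label}: arrow inconsistent with alpha/beta")
    return problems
```

an empty returned list means the tic table passes the basic consistency checks; any reported problem points at solver relaxations that may need tightening or at a parsing error.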
inherent tics generally need to be addressed through manual curation that involves changing reaction directions or removing reactions.
steps on how to use the iede2091 model, including accompanying codes for basic analyses (fba and fva)
this section describes how to use the iede2091 model for general analyses such as fba and fva using codes provided by the authors. the codes focused on in this section utilize gams.
31. getting all requisite files. to run fba and fva codes on the iede2091 model, four files are required (indicated by purple arrows in figure 2).
a. ''iede2091.txt'' file -the model file for the iede2091 model.
b. ''convertmodel.py'' -converts the iede2091 model file into several files required by the fba and fva codes.
c. ''fba.gms'' file -code which runs flux balance analysis on the iede2091 model.
d. ''fva.gms'' file -code which runs flux variability analysis on the iede2091 model.
please cite this article in press as: schroeder and saha, protocol for genome-scale reconstruction and melanogenesis analysis of exophiala dermatitidis, star protocols (2020), https://doi.org/10.1016/j.xpro.2020.100105
32. set up appropriate file architecture. these codes assume that all files are contained in the same directory.
33. run code which generates input files necessary for fba and fva. the ''convertmodel.py'' file should be run to create all requisite input files (the command is ''python convertmodel.py'').
34. run fba and fva. flux balance and flux variability analyses may be run using the command ''gams f*a.gms'' (where ''*'' is substituted with ''b'' or ''v'' based on the desired analysis), and each will create output files with the results of these analyses.
iede2091 model: comparison with human metabolism
timing: minutes to days
as mentioned in , e.
dermatitidis is a potential model defensive-pigment-producing organism due to its relatively small genome of 26.4 mbp, compared to other model organisms including saccharomyces cerevisiae with a genome of 11.9 mbp, drosophila melanogaster with a genome of 137.6 mbp, arabidopsis thaliana with a genome of 119.1 mbp, and homo sapiens with a genome of 2,893.9 mbp (national center for biotechnology information, n.d.). due to the size of the exophiala dermatitidis genome and the importance of melanin to the organism, the similarities and differences between e. dermatitidis and h. sapiens were investigated to determine the potential of e. dermatitidis as a model of h. sapiens melanogenesis. this section describes the procedure used to compare human and exophiala dermatitidis melanin metabolisms in . this procedure includes comparing metabolic pathways and comparing human and e. dermatitidis tyrosinase amino acid sequences.
35. comparing metabolic pathways of melanin synthesis. from the reconstruction of the iede2091 model, the metabolic pathway of melanin synthesis should be known through the reconstruction process. now the metabolic pathways for the synthesis of melanins, specifically pheomelanin and eumelanin, must be investigated. several sources were identified which detail eumelanin and pheomelanin synthesis pathways. this allows for a comparison of reaction pathways such as is shown in figure 3a of .
36. ensuring that all tyrosinase enzymes have been identified in exophiala dermatitidis. the key enzyme in the synthesis of eumelanin (and also the synthesis of pheomelanin in humans) is tyrosinase (ec number 1.14.18.1). in chen et al. (2014), four gene copies encoding the tyrosinase enzyme are listed with ncbi accessions of xp_009160170.1, xp_009156893.1, xp_009157733.1, and xp_009155657.1. the former two are said to be gene copies unique to e. dermatitidis, whereas the latter two are said to be conserved from aspergillus homologs.
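for intuition about what whole-sequence comparison measures, a percent identity between two sequences can be computed with a tiny global (needleman-wunsch) alignment. this toy uses a uniform match/mismatch/gap scheme rather than a blosum matrix and statistical e-values, so it only illustrates the idea behind the blastp comparisons; the peptide fragments below are hypothetical:

```python
def global_align_identity(a, b, match=1, mismatch=-1, gap=-1):
    """Tiny Needleman-Wunsch; returns percent identity over the aligned length."""
    n, m = len(a), len(b)
    # Fill the dynamic-programming score table.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Trace back one optimal alignment, counting identities and aligned columns.
    i, j, ident, aln_len = n, m, 0, 0
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            ident += a[i - 1] == b[j - 1]
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        aln_len += 1
    return 100.0 * ident / aln_len

# Two short hypothetical peptide fragments (one residue deleted in the second).
pct = global_align_identity("MSHTWAGK", "MSHWAGK")
```

real blastp additionally uses substitution matrices, local alignment heuristics, and e-value statistics, which is why its percent-positive and e-value cutoffs, not raw identity, are used in this protocol.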
a quick check should be done to ensure that no new tyrosinase gene copies have been identified. a. perform a search in ncbi using the string ''tyrosinase exophiala dermatitidis'', which should produce a short list of all enzymes annotated as tyrosinase in exophiala dermatitidis. in , these were the only four gene copies in the list. b. perform a blastp search to determine if there are as-yet-unidentified tyrosinase gene copies in the exophiala dermatitidis genome. do this by performing a blastp of the known e. dermatitidis gene copies against the e. dermatitidis genome. this can be done by searching for the accession number in ncbi and selecting the protein page result matching that accession number. then, select the ''fasta'' link on that page (just below the heading). this will give the amino acid sequence of the tyrosinase gene copy. c. the amino acid sequence can then be copied-and-pasted into the query sequence textbox in the blastp tool (see figure 9). figure 9 shows the blastp settings used. figure 10 shows composite results of the blast searches performed on 04/23/2020. from this, it can be seen that, at present, there is no genetic evidence for additional gene copies of tyrosinase in e. dermatitidis using this method. interestingly, some tyrosinase genes are so dissimilar, at least at the level of the whole amino acid sequence, that some gene copies do not match all other gene copies of tyrosinase. 37. comparison of human and e. dermatitidis tyrosinase enzymes through blastp. as one method for evaluating the similarity of human and exophiala dermatitidis melanin synthesis, the similarity between tyrosinase enzymes of the two species should be evaluated. the first method of evaluation is a blastp analysis, which can be performed similarly to steps 4, 8, and 12, except that the search is limited to the homo sapiens genome. the combined results for this search can be seen in figure 11.
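as an illustration of steps 36b-c, the following is a minimal python sketch of parsing the fasta text copied from an ncbi protein page into an accession and an amino acid sequence; the record shown is a shortened, hypothetical example (accession and residues invented for illustration), not a real e. dermatitidis sequence.

```python
# parse a single-record fasta string, as would be copied from the ''fasta''
# link on an ncbi protein page, into (accession, amino acid sequence).
# the sample record below is hypothetical and illustrative only.

def parse_fasta(text):
    """return (accession, sequence) from one fasta record."""
    lines = text.strip().splitlines()
    header = lines[0].lstrip(">")
    accession = header.split()[0]          # accession is the first token
    sequence = "".join(line.strip() for line in lines[1:])
    return accession, sequence

record = ">XP_000000000.1 tyrosinase [Exophiala dermatitidis]\nMKTLLVL\nGAVCSA"
acc, seq = parse_fasta(record)
```

in practice, the sequence string returned here is what would be pasted into the blastp query textbox; nothing in this sketch contacts the ncbi services.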
in summary, no tyrosinase gene copy from exophiala dermatitidis has high sequence similarity to that of homo sapiens. 38. comparing human and e. dermatitidis tyrosinase enzymes through cobalt. a blastp analysis generally looks at whole-sequence similarity; however, the whole amino acid sequence of an enzyme is generally not critical to enzyme function. rather, those amino acids in the active site or sites of an enzyme are most critical; therefore, this step is focused on evaluating the similarity and conservation of the active site using the constraint-based multiple alignment tool (cobalt) (papadopoulos and agarwala, 2007). cobalt is available through ncbi (url: https://www.ncbi.nlm.nih.gov/tools/cobalt/re_cobalt.cgi) and is relatively simple to use, as described below to compare human tyrosinase, human tyrosinase-related proteins, and exophiala dermatitidis tyrosinase gene copies. a. to compare human tyrosinase-related proteins, human tyrosinase alleles, and exophiala dermatitidis, use the settings and search query shown in figure 12 (note that since this screenshot was taken on 04/23/2020, the covid-19 banner was removed from the image so as to show what is normally seen), then select ''align''. b. once the alignment is complete, the enzymes may be compared, particularly in terms of conserved sequences. for this analysis, 3-bit conservation settings should be used (this can be changed by changing the drop-down menu labeled ''conservation setting:'' under the heading ''alignments''). c.
the active sites of tyrosinase and tyrosinase-related proteins, cua and cub (standing for copper-binding domains a and b, respectively), can be identified using information on the residue positions of these active sites from sources such as garcía-borrón and solano (2002), spritz et al. (1997), and furumura et al. (1998). it should be assumed that the sequences from e. dermatitidis gene copies of tyrosinase which align with these active sites are the active sites of these tyrosinases. in summary, these results should show relatively well-conserved active sites and particularly well-conserved key active-site residues. 39. comparing e. dermatitidis tyrosinase gene copies to hidden markov models (hmms) using pfam. to further evaluate how similar exophiala dermatitidis gene copies are to other tyrosinase sequences, the sequence of each gene copy should be evaluated against the hidden markov models (hmms) of various protein families using the pfam tool. as with the cobalt tool, it is fairly simple to use, and the steps are described below. a. the pfam tool can be found at pfam.xfam.org. to submit a sequence to query against hmms of protein families, select the link ''sequence search'' in the middle of the page. b. this will result in a text box appearing in the middle of the page. copy-and-paste the amino acid sequence of the desired protein into that text box and then select ''go'' (note that this should be done one sequence at a time). c. in general, the only protein family to which exophiala dermatitidis tyrosinase gene copies map is tyrosinase, showing that the important portions of the sequence are well conserved (expect values of 2e-38 or better). two of these gene copies, xp_009160170.1 and xp_009156893.1, also had weak matches to the tyrosinase c hmm. a combination of the results for each tyrosinase gene copy of e. dermatitidis is shown in figure 13. 40.
analyze all data gathered to this point on the similarity of human and exophiala dermatitidis melanin synthesis. to this point, a comparison of the reaction pathways which produce melanins has been made (step 35), the key enzyme (tyrosinase) has been compared between the two species (steps 37, 38, and 39), and tyrosinase gene copies of e. dermatitidis have been compared to the hidden markov models of tyrosinase. at this stage, comparisons of human and exophiala dermatitidis eumelanin and pheomelanin metabolism may be made. an example of such an analysis is provided in . there are three primary outcomes of this protocol. the first two deal directly with the subject of this protocol, that is, exophiala dermatitidis. first is the iede2091 model, which is a curated metabolic model (2008); and zhang and hua (2016). code and procedures used and described in this protocol will also make the use of this model easier, so that it may be more readily usable by non-experts. the second primary outcome is the analysis of the comparability of human and e. dermatitidis melanin synthesis pathways. this analysis is visually summarized in figure 14 using pieces of images from . the third primary outcome deals with the understudied nature of exophiala dermatitidis, yet a successful genome-scale metabolic reconstruction of the organism was achieved. here, it should be clarified what a successful reconstruction is in the context of an understudied organism. due to the nature of understudied organisms and the lack of in vivo experimental procedures included in this protocol, a ''successful'' model is not defined here as one that closely replicates in vivo behavior, because such comparisons may not be possible.
rather, a ''successful'' metabolic model of an understudied organism should be able to: i) simulate growth, ii) make full use of current knowledge of the organism, iii) be amenable to in silico metabolic analysis techniques, and iv) be usable as a hypothesis-generating tool for the design of in vivo experiments to expand knowledge of the understudied organism. the hypotheses generated through this protocol, detailed in , are currently under in vivo investigation by a collaborator. this protocol then has the potential to serve as a guideline for other investigators interested in in silico experimentation on other understudied organisms, to show how a gsm may be reconstructed. [figure caption] beginning with knowledge of the exophiala dermatitidis system and genome (upper left-hand corner) and four published genome-scale models of aspergillus species, various techniques including optfill are used to create the first major result: the genome-scale model (gsm) of e. dermatitidis. this model, in conjunction with knowledge of the human genome and the e. dermatitidis genome, was used to evaluate the similarity between human and e. dermatitidis melanogenesis for the second major result of this protocol. the third major result is the study of the shadow prices of defensive pigments, namely carotenoids (pink rods) and melanins (brown triangles), which showed that carotenoids are more expensive, suggesting a heretofore undiscovered role for carotenoids. automated reconstruction processes for gsms of metabolism exist, including through resources like kbase (arkin et al., 2018) and modelseed (modelseed.org); however, these make use of a different philosophy of model reconstruction than used in this work. these tools seek to create a well-connected metabolic network at the cost of a ''permissive'' model reconstruction.
that is, low-confidence reactions will often be added to the model during reconstruction, and few methods exist to identify and address infeasible cycles. therefore, the models reconstructed through these methods often contain functions which may not be present in the organism, as well as tics, both of which require manual curation to address. therefore, these automated methods may be best used as a method to produce draft models, rather than high-quality reconstructions. this protocol, in contrast, outlines a conservative reconstruction process, including a conservative reconstruction tool (optfill), which seeks to minimize the number of new functionalities added and to avoid tics, while sacrificing connectivity. this ''conservative'' reconstruction approach is well suited for the gsm reconstruction of understudied organisms. most of this protocol describes how to recreate the iede2091 model presented and analyzed in ; however, perfect replication of this model following this procedure is likely not possible, as much of the data used in the reconstruction was collected before 2018, yet since then knowledge of e. dermatitidis has increased through more genome assemblies, such as shown in step 10 and figure 3, which acknowledges newer genome assemblies. further, perfect replication of these works may not be desirable, as no new insights would be gained. rather, this procedure outlines a method which may be followed for the reconstruction of future e. dermatitidis genome-scale models or genome-scale models of other understudied organisms. this procedure also details how to use the optfill method, giving greater step-by-step detail of how to use the provided code than was possible in schroeder and saha (2020), which is focused on the mathematics of the optfill method. in general, this protocol should be treated as a procedure to parallel in other research efforts, rather than duplicate.
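the reliance on provided code described in this protocol can be made concrete with a small example. the following is a minimal, hypothetical python sketch in the spirit of the ''convertmodel.py'' conversion step (step 33): it splits a toy tab-separated reaction table into the reaction, metabolite, and bound records that an fba/fva code might read. the input format shown is an assumption for illustration only, not the actual iede2091 file format.

```python
# a hypothetical model-conversion sketch: parse a toy tab-separated reaction
# table (columns: reaction id, stoichiometry terms "coeff:metabolite",
# lower bound, upper bound) into the records an fba/fva code might consume.
# the format is illustrative only, not the real iede2091 format.

def convert_model(model_text):
    """split a toy model table into reaction, metabolite, and bound records."""
    reactions, metabolites, bounds = [], set(), {}
    for line in model_text.strip().splitlines():
        rxn_id, stoich, lb, ub = line.split("\t")
        reactions.append(rxn_id)
        for term in stoich.split():            # e.g. "-1:glc 1:g6p"
            _, met = term.split(":")
            metabolites.add(met)
        bounds[rxn_id] = (float(lb), float(ub))
    return reactions, sorted(metabolites), bounds

toy_model = "R1\t-1:glc 1:g6p\t0\t1000\nR2\t-1:g6p 1:f6p\t-1000\t1000"
rxns, mets, bnds = convert_model(toy_model)
```

a real conversion step would additionally write these records out as the plain-text set and parameter files read by the gams codes; that output step is omitted here for brevity.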
codes for fba, fva, and shadow price analysis are included in the github repository for iede2091, in an attempt to make the results of this protocol usable by those without access to gams. should cobrapy become significantly quicker in solving mixed integer linear programming (milp) problems, it may be worth re-investigating the implementation of optfill in this package and language in the future. in particular, it appears that cobrapy is significantly slower to identify infeasible problems. for example, when all potential tics of size 11 between the model and the database have been identified, gams concludes that no further solutions exist in 0.08 s, whereas cobrapy requires 20.1 s to make the same determination. similar differences of several orders of magnitude in solution times when determining a problem to be infeasible exist throughout the application of the tfp to the first test model and database. as optfill often makes use of infeasible results (such as when to advance the size of sought tics), one of the most promising bottlenecks to address is the quick identification of infeasibility. this can be further evidenced by the fact that, of the 553.4 s needed to solve the tfp, 498.2 s (90% of the runtime) were used while trying to conclude that the model was infeasible. this would not, however, address the runtime disparity between cobrapy- and gams-implemented cps (as infeasibilities are not used in the cps). a further limitation of this protocol is that it cannot be entirely automated, due to multiple factors. first, steps such as before you begin step 16 are difficult to automate, as they generally require complex reasoning (such as why to choose one species or group of species over another for model curation and database creation). second, online tools used in this workflow, such as cello (in before you begin step 14) and brenda (in before you begin steps 9 and 11), cannot be automated as they lack apis.
even when such steps are automated, the automated procedures may be easily broken by the reformatting of the website or tool. finally, some steps in this protocol require data which are not included in some databases. for instance, kegg does not contain identifiers for pheomelanin, dhn-melanin, and some melanin precursors, requiring careful manual curation and literature searching for identification and description of the correct melanogenesis synthesis pathways in the iede2091 model. these tools, and more broadly this protocol, were chosen and used instead of a fully automated model reconstruction procedure such as those available through modelseed and kbase because this allows for a more cautious and conservative reconstruction process, which is valued given the lack of knowledge of the system and the authors' desire not to add more unsupported metabolic functionalities than necessary. additionally, a fully automated process would not have allowed for a demonstration of a new tool to address metabolic gaps such as optfill. a particularly problematic aspect of this protocol is its use of and reliance on code files. it should be noted that programming codes, especially those reliant on websites or apis, will inevitably not function as described at some point in the future. basic familiarity with the programming language in question, and with programming in general, may allow adaptation of codes which no longer work in order to restore function. in most cases, a ''critical'' or ''note'' is provided where code is introduced which may be most subject to becoming outdated, together with some suggestions as to how issues of non-functionality may be remedied. similarly, screenshots taken of websites will also inevitably be outdated at some time in the future, though this is probably less critical. for some specific instances of codes which may be more subject to obsolescence, potential solutions are highlighted here.
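the kegg lookups mentioned above can be sketched in a few lines of python: build a kegg rest url and extract fields from a (canned) tab-separated response line with a regular expression. no network call is made here; the sample line and its tab-separated shape are assumptions based on general kegg rest conventions, so if kegg reformats its responses, it is the regex below that would need updating.

```python
# a minimal sketch of kegg api interaction: url construction plus
# regex parsing of one tab-separated response line. the sample line is
# illustrative; the real response format should be checked against the
# kegg rest documentation before use.
import re

KEGG_BASE = "https://rest.kegg.jp"

def kegg_get_url(entry):
    """build a kegg rest 'get' url for a database entry."""
    return f"{KEGG_BASE}/get/{entry}"

def parse_list_line(line):
    """extract (identifier, description) from one kegg-style list line."""
    match = re.match(r"^(\S+)\t(.+)$", line)
    return match.groups() if match else None

sample = "ec:1.14.18.1\ttyrosinase; monophenol monooxygenase"
ident, desc = parse_list_line(sample)
```

isolating the regex in one small function, as here, is the kind of structure that makes the "update the regular expressions" fix suggested in this protocol a one-line change.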
major version change(s) in programming languages may cause codes included in this protocol to be outdated and not function. for those familiar with programming and the syntax of the languages in question, the best solution would be to update the syntax in the code. if this is not possible, then there is a brief description for each coding language as to how backward compatibility may be achieved. for gams: according to gams documentation, ''gams makes every attempt to be backwards compatible and ensure that new gams systems are able to read save files generated by older gams systems.'' this being said, gams has a boolean general option called forcework which can attempt to process files which have an execution error, in order to be more backwards compatible. for python: the version of python used as the runtime can be specified from the command line. for instance, ''python2 codename.py'' (where codename is the name of the code file) will run the code using python version 2, whereas ''python3 codename.py'' will run the code using python version 3. further, several versions of python can be downloaded (from python.org), and so old releases may be obtained and used. for perl: the version used at runtime can be specified by the command ''use version'' or ''require version'', where version is replaced by a string which represents the version which is to be run, for instance ''use v5.24.0.1''. this ''use'' command will get its own line in the perl code itself. for all: another potential solution to this issue is to build an application container for each version of each programming language used in this protocol, which allows an application to be self-contained and to run a native instance of the programming language it uses. this allows the containerized application to remain viable despite language version changes or updates.
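the python-version pinning described above can also be guarded inside a script itself. the following minimal sketch (the function name is illustrative, not part of the protocol's code) checks sys.version_info up front, so a version mismatch fails with a clear message rather than an obscure syntax or runtime error:

```python
# a minimal guard against major interpreter version changes: check the
# running python version before doing any work. the function name and
# message are illustrative assumptions, not part of the protocol's codes.
import sys

def require_python(major, minor=0):
    """return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= (major, minor)

# e.g., a protocol script written for python 3 could begin with:
if not require_python(3):
    raise SystemExit("this code requires python 3; run it as 'python3 script.py'")
```

the same idea applies to the perl ''use version'' line discussed above: fail early and loudly on the wrong runtime rather than partway through an analysis.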
this protocol will not utilize application containerization, as the primary objective is to help others be able to utilize a mathematical tool (optfill) rather than programming tools (the other codes used in this protocol). two apis are used by various codes in this protocol: the blast api and the kegg api. one potential problem for this protocol in the future is the obsolescence of code due to updates to these apis. consult the documentation of the api, particularly those documents describing what has changed. for the blast api: check the provided blast perl code from ncbi to see if there is a need to update the function in common_functions.pl. the code which serves as the basis of the blast function used by the ''bidirectionablast.pl'' code can be found at https://blast.ncbi.nlm.nih.gov/blast.cgi?cmd=web&page_type=blastdocs&doc_type=developerinfo and downloaded by clicking the ''sampler perl code'' link. fixing the blast code in particular will require basic programming knowledge and knowledge of the perl programming language (since it was modified from a stand-alone script to a function). since all applications interfacing with apis are written in the perl programming language, the authors would suggest the text ''learning perl'' by foy, phoenix, and schwartz, as this text should contain all knowledge necessary to update the code. for the kegg api: this api is less complex than the blast api, and if code interacting with the kegg api is rendered obsolete, it is likely that the formatting of the urls for the kegg api, or the text resulting from those urls, has changed. this can likely be fixed by reviewing and updating calls to the kegg api used in the various perl codes accompanying this work and/or by updating the regular-expression searches on the kegg api results to address new formatting. the code which interacts with the brenda database no longer works, namely ''uniprot_get_ecs.pl'' (in before you begin step 2) and ''ncbi_get_ecs.pl'' (in before you begin step 4).
unfortunately, this is the code most susceptible to obsolescence, as brenda does not have an api and therefore the code was forced to function by using ''screen scraping'' techniques. (star protocols 1, 100105, september 18, 2020) references (titles as extracted): metabolic model integration of the bibliome, genome, metabolome and reactome of aspergillus niger; kbase: the united states department of energy systems biology knowledgebase; national center for biotechnology information; optknock: a bilevel programming framework for identifying gene knockout strategies for microbial strain optimization; standardizing biomass reactions and ensuring complete mass balance in genome-scale metabolic models; comparative genomic and transcriptomic analysis of wangiella dermatitidis, a major cause of phaeohyphomycosis and a model black yeast human pathogen; using gene essentiality and synthetic lethality information to correct yeast and cho cell genome-scale models; analysis of aspergillus nidulans metabolism at the genome-scale; synthesis and assembly of fungal melanin; the growing scope of applications of genome-scale metabolic reconstructions using escherichia coli; metal ligand-binding specificities of the tyrosinase-related proteins; molecular anatomy of tyrosinase and its related proteins: beyond the histidine-bound metal catalytic center; carotenoid pigments of the dematiaceous fungus wangiella dermatitidis; chemical composition of the yeast and sclerotic cell walls of wangiella dermatitidis; production of lipid and fatty acids during growth of aspergillus terreus on hydrocarbon substrates; adaptations of exophiala dermatitidis in stressful environments. dissertation.
etd collection for university of nebraska-lincoln; cell wall architecture in yeast: new structure and new challenges; genome-scale reconstruction and in silico analysis of aspergillus terreus metabolism; applications of genome-scale metabolic reconstructions; cobalt: constraint-based alignment tool for multiple protein sequences; production of pyomelanin, a second type of melanin; effect of melanin and carotenoids of exophiala (wangiella) dermatitidis on phagocytosis, oxidative burst, and killing by human neutrophils; the ascomycota tree of life: a phylum-wide phylogeny clarifies the origin and evolution of fundamental reproductive and ecological traits; computation-driven analysis of model polyextremo-tolerant fungus exophiala dermatitidis: defensive pigment metabolic costs and human applications; optfill: a tool for infeasible cycle-free gapfilling of stoichiometric metabolic models; mutational analysis of copper binding by human tyrosinase; carotenoids and carotenogenic genes in podospora anserina: engineering of the carotenoid composition extends the life span of the mycelium; molecular genetic studies of the model dematiaceous pathogen wangiella dermatitidis; improved annotation through genome-scale metabolic modeling of aspergillus oryzae; prediction of protein subcellular localization; applications of genome-scale metabolic models in biotechnology and systems medicine. as noted in the ''before you begin'' section, not all code provided here can be run for free, especially since access to the gams programming language requires significant capital investment. the gams requirement leaves users of this protocol with two (at present unequal) options: i) purchase gams and the cplex solver (the quickest, most tractable, and most computationally powerful option), or ii) adapt the optimization-based tools used throughout the text to some other programming language which is free of charge (such as python) or already owned (such as matlab).
to aid in the latter option, please see and for a detailed mathematical formulation of optfill, which could be adapted, with effort, to any programming language with optimization capabilities. to make this protocol as usable as possible by those lacking the resources to purchase gams, the authors have attempted to implement optfill using python and the cobrapy package, the python package for constraint-based reconstruction and analysis (cobra). upon reviewing the implementation and its performance, the authors have concluded that optfill in cobrapy is limited by the computational speed of cobrapy. for instance, the cobrapy adaptation of optfill could not duplicate the first application of optfill to the draft e. dermatitidis model in a reasonable period of time. it took more than two days to identify the first tic between the database and model utilizing a supercomputing cluster, suggesting that the full application of the algorithm could have a runtime of months. therefore, the cobrapy version of optfill has been implemented and applied to a much smaller problem, the first test model and database from , and is included in the github repository for optfill (https://doi.org/10.5281/zenodo.3518501). a short protocol detailing how to run optfill using cobrapy is included in this repository as well. this resulted in total runtimes of 553.4 s to solve the tfp utilizing cobrapy (as opposed to 13.2 s utilizing gams/cplex) and 737.6 s to solve the connecting problems utilizing cobrapy (as opposed to 32.76 s in gams). both results indicate that cobrapy is between one and two orders of magnitude slower than gams for solving milp problems such as optfill for these small test problems. as noted in , the time to solve optfill in gams increases exponentially as the model and database to which it is applied become progressively larger.
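the reported runtime comparison can be checked with quick arithmetic; the following sketch computes the base-10 log of the cobrapy/gams slowdown factors for the tfp and connecting-problem timings quoted above, and both indeed fall between one and two orders of magnitude.

```python
# arithmetic check of the cobrapy-vs-gams runtime comparison quoted above:
# tfp, 553.4 s (cobrapy) vs 13.2 s (gams/cplex); connecting problems,
# 737.6 s (cobrapy) vs 32.76 s (gams).
import math

def orders_of_magnitude(slow, fast):
    """log10 of the slowdown factor slow/fast."""
    return math.log10(slow / fast)

tfp = orders_of_magnitude(553.4, 13.2)    # roughly 1.6
cps = orders_of_magnitude(737.6, 32.76)   # roughly 1.4
```

both values sitting between 1 and 2 is exactly the "between one and two orders of magnitude slower" statement made in the text.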
should a similar pattern exist in cobrapy, this would explain why the time required to apply optfill to a model such as iede2091 would be infeasibly long. in addition to implementing optfill in python, python-based analysis codes (for fba, fva, and shadow price analysis) have also been written. the solution to the brenda screen-scraping problem noted earlier would likely be to adapt the code to the new format of brenda, often by changing the regular expressions used in the code. this work is quite out-of-date, and the time necessary to update it is not considered worthwhile. as much of this protocol utilizes interfaces (such as the blast api and kegg api), websites (such as brenda), and programming languages (gams, perl, and python) which are subject to change, the authors do not intend this to be a monolithic procedure that can be forever duplicated with the provided code. rather, it is hoped that this protocol outlines a procedure which may be followed by others for the study of exophiala dermatitidis or other understudied organisms, and which has shown success with exophiala dermatitidis. it is also hoped that this protocol will allow for more widespread use of the iede2091 model, particularly for hypothesis generation related to defensive pigment production, and that it will spur interest in e. dermatitidis as a model organism for human melanocytes. further information and requests for resources and reagents should be directed to, and will be fulfilled by, the technical contact, wheaton l. schroeder (wheaton@huskers.unl.edu), or by the lead contact, rajib saha (rsaha2@unl.edu). this study did not generate unique reagents. the datasets and code generated during this study are available in two github repositories related to this work, one for the exophiala dermatitidis model iede2091 (https://doi.org/10.5281/zenodo.3608172) and one for the optfill tool (https://doi.org/10.5281/zenodo.3518501). these dois are for single releases of these github repositories.
the repositories will be updated periodically and can be found through the laboratory group's github repository pages located at https://github.com/ssbio/e_dermatitidis_model (iede2091 model and related code) and https://github.com/ssbio/optfill (optfill and related code). all data generated in the execution of the protocol have been included in the appropriate github repository or in the supplemental files provided with the published works associated with this protocol, namely and . this work has been completed utilizing the holland computing center of the university of nebraska, which receives support from the nebraska research initiative. the authors gratefully acknowledge the apc discount given to them by star protocols which makes this work possible. the authors gratefully acknowledge funding from university of nebraska-lincoln (unl) faculty startup grant 21-1106-4038 as well as the nsf epscor center for root and rhizobiome innovation grant 25-1215-0139-025 at unl. the authors declare no competing interests. key: cord-326314-9ycht8gi authors: armstrong, eve; runge, manuela; gerardin, jaline title: identifying the measurements required to estimate rates of covid-19 transmission, infection, and detection, using variational data assimilation date: 2020-11-02 journal: infect dis model doi: 10.1016/j.idm.2020.10.010 sha: doc_id: 326314 cord_uid: 9ycht8gi we demonstrate the ability of statistical data assimilation (sda) to identify the measurements required for accurate state and parameter estimation in an epidemiological model for the novel coronavirus disease covid-19. our context is an effort to inform policy regarding social behavior, to mitigate strain on hospital capacity. the model unknowns are taken to be: the time-varying transmission rate, the fraction of exposed cases that require hospitalization, and the time-varying detection probabilities of new asymptomatic and symptomatic cases.
in simulations, we obtain estimates of undetected (that is, unmeasured) infectious populations by measuring the detected cases together with the recovered and dead, and without assumed knowledge of the detection rates. given a noiseless measurement of the recovered population, excellent estimates of all quantities are obtained using a temporal baseline of 101 days, with the exception of the time-varying transmission rate at times prior to the implementation of social distancing. with low noise added to the recovered population, accurate state estimates require a lengthening of the temporal baseline of measurements. estimates of all parameters are sensitive to the contamination, highlighting the need for accurate and uniform methods of reporting. the aim of this paper is to exemplify the power of sda to determine what properties of measurements will yield estimates of unknown parameters to a desired precision, in a model with the complexity required to capture important features of the covid-19 pandemic. the coronavirus disease 2019 (covid-19) is burdening health care systems worldwide, threatening physical and psychological health, and devastating the global economy. individual countries and states are tasked with crafting their own responses. here we demonstrate the power and versatility of the sda technique to explore what is required of measurements to complete a model with a dimension sufficiently high to capture the policy-relevant complexities of covid-19 transmission and containment -an examination that has not previously been done. to this end, we sought to estimate the parameters noted above, using simulated data representing a metropolitan-area population loosely based on new york city. we examined the sensitivity of estimations to: i) the subpopulations that were sampled, ii) the temporal baseline of sampling, and iii) uncertainty in the sampling. results using simulated data are threefold.
first, reasonable estimations of time-varying detection probabilities require the reporting of new detected cases (asymptomatic and symptomatic), dead, and recovered. second, given noiseless measurements, a temporal baseline of 101 days is sufficient for the sda procedure to capture the general trends in the evolution of the model populations, the detection probabilities, and the time-varying transmission rate following the implementation of social distancing. importantly, the information contained in the measured detected populations propagates successfully to the estimation of the numbers of severe undetected cases. third, the state evolution -and importantly the populations requiring inpatient care -tolerates low (∼ five percent) noise, given a doubling of the temporal baseline of measurements; the parameter estimates do not tolerate this contamination. finally, we discuss necessary modifications prior to testing with real data, including lowering the sensitivity of parameter estimates to noise in data. the model is written in 22 state variables, each representing a subpopulation of people; the total population is conserved. figure 1 shows a schematic of the model structure. each member of a population s that becomes exposed (e) ultimately reaches either the recovered (r) or dead (d) state. absent additive noise, the model is deterministic. five variables correspond to measured quantities in the inference experiments. as noted, the model is written with the aim to inform policy on social behavior and contact tracing so as to avoid exceeding hospital capacity. to this end, the model resolves asymptomatic-versus-symptomatic cases, undetected-versus-detected cases, and the two tiers of hospital needs: the general (inpatient, non-intensive care unit (icu)) h population versus the critical care (icu) c population. the resolution of asymptomatic versus symptomatic cases was motivated by an interest in what interventions are necessary to control the epidemic.
for example, is it sufficient to focus only on symptomatic individuals, or must we also target and address asymptomatic individuals who may not even realize they are infected? the detected and undetected populations exist for two reasons. first, we seek to account for underreporting of cases and deaths. second, we desire a model structure that can simulate the impact of increasing detection rates on disease transmission, including the impact of contact tracing. thus the model was structured from the beginning so that we might examine the effects of interventions that were imposed later on. the ultimate aim here is to inform policy on the requirements for containing the epidemic. we included both h and c populations because hospital inpatient and icu bed capacities are the key health system metrics that we aim to avoid straining. any policy that we consider must include predictions on inpatient and icu bed needs. preparing for those needs is a key response if or when the epidemic grows uncontrolled. for details of the model, including the differential equations describing the mass action between susceptible and infectious individuals and the disease progression through different sub-populations, see appendix a. sda is an inference procedure, or a type of machine learning, in which a model dynamical system is assumed to underlie any measured quantities. this model f can be written as a set of d ordinary differential equations that evolve in some parameterization t as: dx_a(t)/dt = f_a(x(t), p(t)); a = 1, 2, ..., d, where the components x_a of the vector x are the model state variables, and unknown parameters to be estimated are contained in p(t). a subset l of the d state variables is associated with measured quantities. one seeks to estimate the p unknown parameters and the evolution of all state variables that is consistent with the l measurements.
a prerequisite for estimation using real data is the design of simulated experiments, wherein the true values of parameters are known. in addition to providing a consistency check, simulated experiments offer the opportunity to ascertain which and how few experimental measurements, in principle, are necessary and sufficient to complete a model. the estimation is performed via a variational method, and we employ a method of annealing to identify a lowest minimum - a procedure that has been referred to loosely in the literature as variational annealing (va). the cost function a 0 used in this paper is written as: a 0 = Σ_j Σ_l (r m /2) [x l (j) − y l (j)]^2 + Σ_n Σ_a (r f /2) [x a (n+1) − x a (n) − ∆t · f a (x(n))]^2. one seeks the path x 0 = {x(0), ..., x(n), p(0), ..., p(n)} in state space on which a 0 attains a minimum value (footnote 1). note that equation 1 is shorthand; for the full form, see appendix a of ref [18]. for a derivation - beginning with the physical action of a particle in state space - see ref [26]. the first squared term of equation 1 governs the transfer of information from measurements y l to model states x l. the summation on j runs over all discretized timepoints j at which measurements are made, which may be a subset of all integrated model timepoints. the summation on l is taken over all l measured quantities. the second squared term of equation 1 incorporates the model evolution of all d state variables x a. the term f a (x(n)) is defined, for discretization, via the trapezoidal rule as: (1/2)[f a (x(n)) + f a (x(n + 1))]. the outer sum on n is taken over all discretized timepoints of the model equations of motion. the sum on a is taken over all d state variables. r m and r f are inverse covariance matrices for the measurement and model errors, respectively. we take each matrix to be diagonal and treat them as relative weighting terms, whose utility will be described below in this section. the procedure searches a (d (n + 1) + p (n + 1))-dimensional state space, where d is the number of state variables, n is the number of discretized steps, and p is the number of unknown parameters.
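the two-term cost function described above can be written down directly. the sketch below is a hedged reconstruction from the term-by-term description (variable names are ours, and the diagonal inverse covariances r m and r f are reduced to scalar weights):

```python
import numpy as np

# hedged reconstruction of the discretized action a0: a measurement term over
# measured indices and observation times, plus a model term penalizing
# deviation of the path from the trapezoidal update
#   x(n+1) = x(n) + (dt/2) * [f(x(n)) + f(x(n+1))].

def action(path, f, y, measured, dt, rm=1.0, rf=1e-7):
    """path: (N+1, d) candidate states; y: (N+1, len(measured)), nan = no data."""
    meas_err = path[:, measured] - y
    meas_term = 0.5 * rm * np.nansum(meas_err ** 2)

    fx = np.array([f(x) for x in path])
    model_err = path[1:] - path[:-1] - 0.5 * dt * (fx[1:] + fx[:-1])
    model_term = 0.5 * rf * np.sum(model_err ** 2)
    return meas_term + model_term

# a constant path exactly solves dx/dt = 0 and matches constant data, so a0 = 0
path = np.ones((5, 2))
y = np.ones((5, 1))
a0 = action(path, lambda x: np.zeros(2), y, measured=[0], dt=1.0)
print(a0)  # -> 0.0
```

nan entries in `y` mark timepoints without observations, which implements the statement that measurement times may be a subset of the integrated model timepoints.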
to perform simulated experiments, the equations of motion are integrated forward to yield simulated data, and the va procedure is challenged to infer the parameters and the evolution of all state variables - measured and unmeasured - that generated the simulated data. this specific formulation has been tested with chaotic models [27] [28] [29] [30] , and used to estimate parameters in models of biological neurons [13, 14, 16, 18, 31, 32] , as well as astrophysical scenarios [33] . our model is nonlinear, and thus the cost function surface is non-convex. for this reason, we iterate - or anneal - in terms of the ratio of model and measurement error, with the aim to gradually freeze out a lowest minimum. (footnote 1: it may interest the reader that one can derive this cost function by considering the classical physical action on a path in a state space, where the path of lowest action corresponds to the correct solution [26].) this procedure was introduced in ref [34], and has since been used in combination with variational optimization on nonlinear models in refs [11, 18, 31, 33] above. the annealing works as follows. we first define the coefficient of measurement error r m to be 1.0, and write the coefficient of model error r f as: r f = r f,0 · α^β, where r f,0 is a small number near zero, α is a small number greater than 1.0, and β is initialized at zero. parameter β is our annealing parameter. for the case in which β = 0, the problem is relatively free from model constraints, the cost function surface is smooth, and there exists one minimum of the variational problem that is consistent with the measurements. we obtain an estimate of that minimum. then we increase the weight of the model term slightly, via an integer increment in β, and recalculate the cost. we do this recursively, toward the deterministic limit of r f ≫ r m . the aim is to remain sufficiently near to the lowest minimum to not become trapped in a local minimum as the surface becomes resolved.
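the annealing schedule can be sketched concretely. the toy "model" below is dx/dt = 0, chosen so that each annealing step has a closed-form minimizer via a linear solve; the real procedure instead re-minimizes the full nonlinear action numerically at each β, warm-starting from the previous solution. the values α = 2.0 and r f,0 = 10^−7 are those reported later in the methods:

```python
import numpy as np

# sketch of the annealing schedule: r_m fixed at 1.0 and r_f = r_f0 * alpha**beta
# with integer increments of beta. the toy model dx/dt = 0 makes each step the
# exact minimizer of  rm*|x - y|^2 + rf * sum_n (x[n+1] - x[n])^2.

def anneal_path(y, alpha=2.0, rf0=1e-7, rm=1.0, beta_max=38):
    n = len(y)
    # path-graph laplacian L encodes  sum_n (x[n+1] - x[n])^2 = x^T L x
    L = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    L[0, 0] = L[-1, -1] = 1.0
    x = np.zeros(n)
    for beta in range(beta_max + 1):
        rf = rf0 * alpha ** beta
        x = np.linalg.solve(rm * np.eye(n) + rf * L, rm * y)
    return x

y = np.array([1.0, 1.1, 0.9, 1.0, 1.05])   # noisy "measurements"
x = anneal_path(y)
# at high beta the model term dominates and the path flattens toward a constant
print(float(np.ptp(x)) < 1e-3)  # -> True
```

at β = 0 the solution essentially interpolates the data; by β = 38 the model constraint is nearly rigid, which is the "deterministic limit" described above.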
we will show in results that a plot of the cost as a function of β reveals whether a solution has been found that is consistent with both measurements and model. we based our simulated locality loosely on new york city, with a population of 9 million. for simplicity, we assume a closed population. simulations ran from an initial time t 0 of four days prior to 2020 march 1, the date of the first reported covid-19 case in new york city [35] . at time t 0 , there existed one detected symptomatic case within the population of 9 million. in addition to that one initial detected case, we took the initial conditions on the populations to be: 50 undetected asymptomatics, 10 undetected mild symptomatics, and one undetected severe symptomatic (footnote 2). we chose five quantities as unknown parameters to be estimated (table 1) : 1) the time-varying transmission rate k i (t); 2) the detection probability of mild symptomatic cases d sym (t); 3) the detection probability of severe symptomatic cases d sys (t); 4) the fraction of cases that become symptomatic f sympt ; and 5) the fraction of symptomatic cases that become severe enough to require hospitalization f severe . here we summarize the key features that we sought to capture in modeling these parameters; for their mathematical formulations, see appendix b. the transmission rate k i (often referred to as the effective contact rate) in a given population for a given infectious disease is measured in effective contacts per unit time. this may be expressed as the total contact rate multiplied by the risk of infection, given contact between an infectious and a susceptible individual. the contact rate, in turn, can be impacted by amendments to social behavior (footnote 3). as a first step in applying sda to a high-dimensional epidemiological model, we chose to condense the significance of k i into a relatively simple mathematical form.
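the initial conditions stated above can be assembled as a simple balance. the dictionary keys below are our own shorthand for the paper's compartments; all unlisted compartments start at zero:

```python
# assembling the stated initial conditions for the simulated locality.
# compartment names are our own shorthand, not the paper's code.

N = 9_000_000                      # closed population, loosely new york city

x0 = {
    "sym_det": 1,                  # the one detected symptomatic case at t0
    "as_undet": 50,                # undetected asymptomatics
    "sym_undet": 10,               # undetected mild symptomatics
    "sys_undet": 1,                # one undetected severe symptomatic
}
x0["s"] = N - sum(x0.values())     # everyone else is susceptible

assert sum(x0.values()) == N       # closed population: totals must balance
print(x0["s"])  # -> 8999938
```

the closing assertion is the closed-population assumption made explicit: any initial-condition vector that fails it would silently create or destroy people.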
we assumed that k i was constant prior to the implementation of a social-distancing mandate, which then effected a rapid transition of k i to a lower constant value. specifically, we modeled k i as a smooth approximation to a heaviside function that begins its decline on march 22, the date that the stay-at-home order took effect in new york city [39] : 25 days after time t 0 . for further simplicity, we took k i to reflect a single implementation of a social distancing protocol, and adherence to that protocol throughout the remaining temporal baseline of estimation. detection rates impact the sizes of the subpopulations entering hospitals, and their values are highly uncertain [3, 4] . thus we took these quantities to be unknown, and -as detection methods will evolve -time-varying. we also optimistically assumed that the methods will improve, and thus we described them as increasing functions of time. we used smoothly-varying forms, the first linear and the second quadratic, to preclude symmetries in the model equations. meanwhile, we took the detection probability for asymptomatic cases (d as ) to be known and zero, a reasonable reflection of the state of testing in that population during our study period. finally, we assigned as unknowns the fraction of cases that become symptomatic ( f sympt ) and fraction of symptomatic cases that become sufficiently severe to require hospitalization ( f severe ), as these fractions possess high uncertainties (refs [40] and [41] , respectively). as they reflect an intrinsic property of the disease, we took them to be constants. all other model parameters were taken to be known and constant (appendix a); however, the values of many other model parameters also possess significant uncertainties given the reported data, including, for example, the fraction of those hospitalized that require icu care. future va experiments can treat these quantities as unknowns as well. 
the simulated experiments are summarized in the schematic of figure 2 . the base experiment (i in figure 2 ) possesses the following features: a) five measured populations: detected asymptomatic as det , detected mild symptomatic sym det , detected severe symptomatic sys det , recovered r, and dead d; b) a temporal baseline of 101 days, beginning on 2020 february 26; c) no noise in measurements. (footnote 3: the reproduction number r 0 , in the simplest sir form, can be written as the effective contact rate divided by the recovery rate. in practice, r 0 is a challenge to infer [22, 36-38].) the three variations on this basic experiment (ii through iv in figure 2 ) incorporate the following independent changes. in experiment ii, the r population is not measured - an example designed to reflect the current situation in some localities (e.g. refs [3, 4] ). experiment iii includes a ∼ five percent noise level (for the form of additive noise, see appendix c) in the simulated r data, and experiment iv includes that noise level in addition to a doubled temporal baseline. for each experiment, twenty independent calculations were initiated in parallel searches, each with a randomly-generated set of initial conditions on state variable and parameter values. for technical details of all experimental designs and implementation, see appendix c. the salient results for the simulated experiments i through iv are as follows. experiment i (table 2): the base experiment that employed five noiseless measured populations over 101 days yielded an excellent solution in terms of model evolution and parameter estimates. prior to examining the solution, we shall first show the cost function versus the annealing parameter β, as this distribution can serve as a tool for assessing the significance of a solution.
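the additive-noise recipe referenced here (and detailed in appendix c: gaussian noise with standard deviation equal to the series mean divided by 100, ten times that for the higher level, then a shift so the population never drops below zero) can be sketched as follows. the toy trajectory and random seed are our own:

```python
import numpy as np

# sketch of the appendix-c noise recipe: gaussian noise scaled to the series
# mean, followed by a shift by |min| to keep populations non-negative.

rng = np.random.default_rng(0)

def add_noise(series, factor=1.0):
    sigma = factor * np.mean(series) / 100.0
    noisy = series + rng.normal(0.0, sigma, size=len(series))
    if noisy.min() < 0:
        noisy = noisy + abs(noisy.min())   # keep populations non-negative
    return noisy

r_clean = np.linspace(0.0, 1000.0, 101)    # toy recovered-population series
r_noisy = add_noise(r_clean, factor=10.0)  # the higher noise level
print(bool((r_noisy >= 0).all()))  # -> True
```

note that the non-negativity shift, as described, translates the whole series; as the text later observes, adding noise this way breaks exact conservation of the total population.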
figure 3 shows the evolution of the cost throughout annealing, for the ten distinct independent paths that were initiated; the x-axis shows the value of annealing parameter β, or: the increasing rigidity of the model constraint. at the start of iterations, the cost function is mainly fitting the measurements to data, and its value begins to climb as the model penalty is gradually imposed. if the procedure finds a solution that is consistent not only with the measurements, but also with the model, then the cost will plateau. the estimate of k i (t), however, was poor at early times, prior to its steep decline. we noted no improvement in this estimate for k i (t) following a tenfold increase in the temporal resolution of measurements (not shown). the procedure does appear to recognize that a fast transition in the value of k i occurred at early times, and that that value was previously higher. it will be important to investigate the reason for this failure in the estimation of k i at early times, to rule out numerical issues involved with the quickly-changing derivative (footnote 4). figure 5 shows the cost as a function of annealing for the case with no measurement of recovered population r. without examining the estimates, we know from the cost(β) plot that no solution has been found that is consistent with both measurements and model: no plateau is reached. rather, as the model constraint strengthens, the cost increases exponentially. indeed, figure 6 shows the estimation, taken at β = 2, prior to the runaway behavior. note the excellent fit to the measured states and simultaneous poor fit to the unmeasured states. as no stable solution is found at high β, we conclude that there exists insufficient information in as det , sym det , sys det , and d alone to corral the procedure into a region of state-and-parameter space in which a model solution is possible. we repeated this experiment with a doubled baseline of 201 days, and noted no improvement (not shown).
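the plateau-versus-runaway distinction in the cost(β) curves can be turned into a crude automated check. the flatness threshold below is an assumption of ours, not a criterion from the paper:

```python
import numpy as np

# toy diagnostic for cost(beta) curves: a solution consistent with both data
# and model plateaus at high beta; an inconsistent one runs away exponentially.

def plateaus(cost, tail=10, tol=0.5):
    """true if the last `tail` values of log10(cost) span less than `tol`."""
    logc = np.log10(np.asarray(cost, dtype=float)[-tail:])
    return bool(logc.max() - logc.min() < tol)

beta = np.arange(40)
consistent = 1.0 + 0.5 * (1.0 - np.exp(-beta / 5.0))   # levels off
runaway = np.exp(0.3 * beta)                           # keeps climbing

print(plateaus(consistent), plateaus(runaway))  # -> True False
```

working in log10 matters here: the runaway case grows exponentially, so its tail spread is large on a log scale even when early values look comparable.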
in experiment iii, the low noise added to r yielded a poor state and parameter estimate (not shown). with a doubled temporal baseline of measurements (experiment iv), however, the state estimate became robust to the contamination. figure 7 shows this estimate. while the ∼ five percent noise added to population r propagates to the unmeasured states s, e, and p, the general state evolution is still captured well. importantly, the populations entering the hospital are well estimated. note that some low state estimates (e.g. as) are not perfectly offset by high estimates (e.g. sym). the addition of noise in these numbers -by definition -breaks the conservation of the population. finally, the parameter estimates for experiment iv do not survive the added contamination (not shown). we have endeavoured to illustrate the potential of sda to systematically identify the specific measurements, temporal baseline of measurements, and degree of measurement accuracy, required to estimate unknown model parameters in a high-dimensional model designed to examine the complex problems that covid-19 presents to hospitals. in light of our assumed knowledge of some model parameters, we restrict our conclusions to general comments. we emphasize that estimation of the full model state requires measurements of the detected cases but not the undetected, provided that the recovered and dead are also measured. the state evolution is tolerant to low noise in these measurements, while the parameter estimates are not. the ultimate aim of sda is to test the validity of model estimation using real data, via prediction. in advance of that step, we are performing a detailed study of the model's sensitivity to contamination in the measurable populations as det , sym det , sys det , r, and d. 
concurrently we are examining means to render the parameter estimation less sensitive to noise, via various additional equality constraints in the cost function, and loosening the assumption of gaussian-distributed noise. in particular, we shall require that the time-varying parameters be smoothly-varying. it will be important to examine the stability of the sda procedure over a range of choices for parameter values and initial numbers for the infected populations. this procedure can be expanded in many directions. currently we are working to divide the model subpopulations by age, and to include age-specific parameters such as susceptibility and the likelihood of requiring hospitalization and intensive care. specifically, sda might inform the question of whether the contact matrices among age groups are non-stationary - a question of high interest for predicting age-dependent susceptibility during a second wave [42] of the virus. other avenues for expansion are as follows: 1) define additional model parameters as unknowns to be estimated, including the fraction of patients hospitalized, the fraction who enter critical care, and the various timescales governing the reaction equations; 2) impose various constraints regarding the unknown time-varying quantities, particularly transmission rate k i (t), and identify which forms permit a solution consistent with measurements; 3) examine model sensitivity to the initial numbers within each population; 4) examine model sensitivity to the temporal frequency of data sampling. moreover, it is our hope that the procedure described in this paper can guide the application of sda to a host of complicated questions surrounding covid-19. thank you to patrick clay from the university of michigan for discussions on inferring exposure rates given social distancing protocols. the parameters f sympt and f severe are constant numbers, as they are assumed to reflect an intrinsic property of the disease.
the detection probability of asymptomatic cases is taken to be known and zero. (refs [40] and [41] give the fractions f sympt and f severe , respectively.) for experiments i and iv, the reported numbers are taken from the annealing iteration with a value of parameter β of 32 and 40, respectively: once the deterministic limit has been reached (see text). for experiment ii, an attempt was made to retrieve parameter estimates at β = 2; that is: before the solution grows unstable exponentially (see figure 5 ). see specific subsections for details of each experiment. the blue notation specified by overbrackets denotes the correspondence of specific terms to the reactions between the populations depicted in figure 1 . (table entry: t d , the time from entering critical care to death for severe symptomatics: 5.0 [50]. table note a: as described in [45] , viral load can be high and detectable for up to 20 days. we choose a shorter duration of infectiousness to capture the time during which transmissibility is highest.) appendix b: unknown time-varying parameters to be estimated. the unknown parameters assumed to be time-varying are the transmission rate k i , and the detection probabilities d sym and d sys for mild and severe symptomatic cases, respectively. the transmission rate in a given population for a given infectious disease is measured in effective contacts per unit time. this may be expressed as the total contact rate (the total number of contacts, effective or not, per unit time), multiplied by the risk of infection, given contact between an infectious and a susceptible individual. the total contact rate can be impacted by social behavior. in this first employment of sda upon a pandemic model of such high dimensionality, we chose to represent k i as a relatively constant value that undergoes one rapid transition corresponding to a single social distancing mandate. as noted in experiments, social distancing rules were imposed in new york city roughly 25 days following the first reported case.
we thus chose k i to transition between two relatively constant levels, roughly 25 days following time t 0 . specifically, we wrote k i (t) [51] as a smooth, steplike approximation to a heaviside function. the parameter t was set to 25, the interval beginning four days prior to the first report of a detection in nyc [35] and ending at the imposition of a stay-home order in nyc on march 22 [39] . the parameter s governs the steepness of the transformation, and was set to 10. parameters f and ξ were then adjusted to 1.2 and 1.5, to achieve a transition from about 1.4 to 0.3. for detection probabilities d sym and d sys , a linear and quadratic form, respectively, were chosen to preclude symmetries, and both were optimistically taken to increase with time: d sym (t) = 0.2 · t and d sys (t) = 0.1 · t^2. finally, each time series was normalized to the range [0:1], via division by its respective maximum value. the simulated data were generated by integrating the reaction equations (appendix a) via a fourth-order adaptive runge-kutta method encoded in the python package odeint. a step size of one (day) was used to record the output. except for the one instance noted in results regarding experiment i, we did not examine the sensitivity of estimations to the temporal sparsity of measurements. the initial conditions on the populations were: s 0 = n − 1 (where n is the total population), as 0 = 1, and zero for all others. for the noise experiments, the noise added to the simulated sym det , sys det , and r data were generated by python's numpy.random.normal package, which defines a normal distribution of noise. for the "low-noise" experiments, we set the standard deviation to be the respective mean of each distribution, divided by 100. for the experiments using higher noise, we multiplied that original level by a factor of ten. for each noisy data set, the absolute value of the minimum was then added to each data point, so that the population did not drop below zero.
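the time-varying parameter forms described above can be sketched as follows. the exact published expression for k i (t), in particular how f and ξ enter, is not reproduced in this text, so the logistic step below is our stand-in, using the reported steepness s = 10, transition day t = 25, and endpoint values 1.4 → 0.3:

```python
import numpy as np

# stand-in for the smooth heaviside-like k_i(t); the published parameterization
# via f and xi is not reproduced here, so this logistic form is an assumption.

T, S = 25.0, 10.0
K_HI, K_LO = 1.4, 0.3

def k_i(t):
    z = np.clip(S * (t - T), -50.0, 50.0)   # clip to avoid overflow in exp
    return K_LO + (K_HI - K_LO) / (1.0 + np.exp(z))

t = np.arange(0, 101)
d_sym = 0.2 * t                 # linear detection probability, mild cases
d_sys = 0.1 * t ** 2            # quadratic detection probability, severe cases
d_sym = d_sym / d_sym.max()     # normalize each series to [0, 1]
d_sys = d_sys / d_sys.max()

print(round(float(k_i(0.0)), 3), round(float(k_i(100.0)), 3))  # -> 1.4 0.3
```

the normalization step mirrors the text: each detection-probability series is divided by its maximum so that both span [0, 1] over the baseline.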
the optimization was performed via the open-source interior-point optimizer (ipopt) [52] . ipopt uses a simpson's rule method of finite differences to discretize the state space, a newton's method to search, and a barrier method to impose user-defined bounds that are placed upon the searches. we note that ipopt's search algorithm treats state variables as independent quantities, which is not the case for a model involving a closed population. this feature did not affect the results of this paper. those interested in expanding the use of this tool, however, might keep in mind this feature. one might negate undesired effects by, for example, imposing equality constraints into the cost function that enforce the conservation of n. within the annealing procedure described in methods, the parameter α was set to 2.0, and β ran from 0 to 38 in increments of 1. the inverse covariance matrix for measurement error (r m ) was set to 1.0, and the initial value of the inverse covariance matrix for model error (r f,0 ) was set to 10^−7 . for each of the four simulated experiments, twenty paths were searched, beginning at randomly-generated initial conditions for parameters and state variables. all simulations were run on a 720-core, 1440-gb, 64-bit cpu cluster. ihme covid-19 health service utilization forecasting team and christopher jl murray. forecasting covid-19 impact on hospital bed-days, icu-days, ventilator-days and deaths by us state in the next 4 months.
medrxiv, page 2020.03.
the need for data innovation in the time of covid-19.
estimating the early death toll of covid-19 in the united states.
substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov-2).
inverse problem theory and methods for model parameter estimation.
numerical weather prediction.
atmospheric modeling, data assimilation and predictability.
data assimilation: the ensemble kalman filter.
practical methods for optimal control and estimation using nonlinear programming.
the number of required observations in data assimilation for a shallow-water flow.
estimating the state of a geophysical system with sparse observations: time delay methods to achieve accurate initial states for prediction.
kalman meets neuron: the emerging intersection of control theory with neuroscience.
dynamical estimation of neuron and network properties i: variational methods.
dynamical estimation of neuron and network properties ii: path integral monte carlo methods.
real-time tracking of neuronal network structure using data assimilation.
estimating parameters and predicting membrane voltages with conductance-based neuron models.
automatic construction of predictive neuron models through large scale assimilation of electrophysiological data.
statistical data assimilation for estimating electrophysiology simultaneously with connectivity within a biological neuronal network.
towards real time epidemiology: data assimilation, modeling and anomaly detection of health surveillance data streams.
variational data assimilation with epidemic models.
bayesian tracking of emerging epidemics using ensemble optimal statistical interpolation.
real time bayesian estimation of the epidemic potential of emerging infectious diseases.
adjoint-based data assimilation of an epidemiology model for the covid-19 pandemic in 2020.
an epidemiological modelling approach for covid19 via data assimilation.
efficient bayesian inference of fully stochastic epidemiological models with applications to covid-19.
predicting the future: completing models of observed complex systems.
dynamical parameter and state estimation in neuron models. the dynamic brain: an exploration of neuronal variability and its functional significance.
estimating the biophysical properties of neurons with intracellular calcium dynamics.
accurate state and parameter estimation in nonlinear systems with sparse observations.
improved variational methods in statistical data assimilation.
nonlinear statistical data assimilation for hvc ra neurons in the avian song system.
data assimilation of membrane dynamics and channel kinetics with a neuromorphic integrated circuit.
an optimization-based approach to calculating neutrino flavor evolution.
systematic variational method for statistical nonlinear state and parameter estimation.
improved inference of time-varying reproduction numbers during infectious disease outbreaks.
a new framework and software to estimate time-varying reproduction numbers during epidemics.
different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures.
pause order in new york city takes effect.
getting a handle on asymptomatic sars-cov-2 infection.
estimating the burden of sars-cov-2 in france.
as americans brace for second wave of covid-19.
substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov-2).
estimation of incubation period distribution of covid-19 using disease onset forward time: a novel cross-sectional and forward follow-up study. medrxiv, page 2020.03.06.
virological assessment of hospitalized patients with covid-2019.
incidence, clinical outcomes, and transmission dynamics of hospitalized 2019 coronavirus disease among 9,596,321 individuals residing in california and washington.
clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia.
clinical features of patients infected with 2019 novel coronavirus in wuhan, china. the lancet.
epidemiology and transmission of covid-19 in shenzhen china: analysis of 391 cases and 1,286 of their close contacts.
clinical course and outcomes of critically ill patients with sars-cov-2 pneumonia in wuhan, china: a single-centered, retrospective, observational study. the lancet respiratory medicine.
an updated estimation of the risk of transmission of the novel coronavirus (2019-ncov).
short tutorial: getting started with ipopt in 90 minutes.

key: cord-298646-wurzy88k authors: van der merwe, rené; molfino, nestor a title: challenge models to assess new therapies in chronic obstructive pulmonary disease date: 2012-09-13 journal: int j chron obstruct pulmon dis doi: 10.2147/copd.s30664 sha: doc_id: 298646 cord_uid: wurzy88k chronic obstructive pulmonary disease (copd) is a major cause of morbidity and mortality. current therapies confer partial benefits either by incompletely improving airflow limitation or by reducing acute exacerbations, hence new therapies are desirable. in the absence of robust early predictors of clinical efficacy, the potential success of novel therapeutic agents in copd will not entirely be known until the drugs enter relatively large and costly clinical trials. new predictive models in humans, and new study designs are being sought to allow for confirmation of pharmacodynamic and potentially clinically meaningful effects in early development. this review focuses on human challenge models with lipopolysaccharide endotoxin, ozone, and rhinovirus, in the early clinical development phases of novel therapeutic agents for the treatment and reduction of exacerbations in copd. the global initiative for chronic obstructive lung disease defines chronic obstructive pulmonary disease (copd) as a … common preventable and treatable disease characterized by persistent airflow limitation that is usually progressive and associated with an enhanced chronic inflammatory response in the airways and the lung to noxious particles or gases.
exacerbations and comorbidities contribute to the overall severity in individual patients. 1 exacerbations are associated with increased airway and systemic inflammation, which lead to airway wall edema, sputum plugging, and bronchoconstriction. copd remains a major global health and economic burden that is expected to be the third leading cause of death, and the fifth leading cause of disability by 2020. 1 in 2010, copd accounted for $49.9 billion in health care expenditures in the united states alone ($29.5 billion in direct health care expenditures, $8.0 billion in indirect morbidity costs, and $12.4 billion in indirect mortality costs). 2 in europe, copd accounts for 10.3 billion euros in health care spending a year. 3 pharmacological therapy is used to control symptoms, as well as to reduce exacerbations, and to improve exercise tolerance. ambulatory copd patients are currently treated with long-acting bronchodilators and inhaled corticosteroids, along with systemic corticosteroids during exacerbations. there is a pressing need to develop novel approaches for the treatment of copd and the prevention or reduction of acute exacerbations of copd. existing therapies give partial benefits either by incompletely improving airflow limitation or reducing acute exacerbations, hence the need for newer, more effective therapies. evidence that existing medications reduce lung function decline in the long term has been inconclusive. the torch study 4 investigated the effects of combined salmeterol plus fluticasone, either component alone, and placebo, on the rate of postbronchodilator forced expiratory volume in 1 second (fev 1 ) decline. the investigators found that salmeterol plus fluticasone reduced the rate of fev 1 decline by 16 ml/year compared with placebo. 
despite this large study, a meta-analysis conducted by soriano et al 5 concluded that inhaled corticosteroids showed a significant improvement in lung function decline compared with placebo at 3 months; however, after 6 months there was no significant difference between placebo and inhaled corticosteroid treatment. one of the main challenges in developing new therapeutic agents for the treatment or prevention of acute exacerbations of copd is that their potential success cannot be entirely known until the investigational therapies enter relatively large phase ii studies, assessing clinical outcome over a 3- to 6-month period or longer. 6, 7 this article reviews the experimental challenges that can be performed relatively early in drug development for the treatment of copd in order to obtain preliminary signals of safety and efficacy in humans. these challenge models are representative of the local inflammatory response caused by an exacerbation of copd; however, it is important to note that these models do not reflect the actual exacerbation milieu. the models chosen are those that have successfully been used to date in copd drug development. depending on the dose given during inhalation, these challenge models may also cause local and systemic inflammation, thus making them ideal for assessing the inflammatory processes in the lungs during an exacerbation and the potential therapeutic benefit of novel agents, as they mimic the local inflammatory response in the lung during an exacerbation.
in the serum, lps binds to a lipid-binding protein, which facilitates the association between lps and cd14 on the cell membranes. this in turn facilitates the transfer of lps to the tlr4/md2 complex (figure 1). 9 this triggers a signaling cascade in macrophage lineage and endothelial cells, resulting in the secretion of proinflammatory cytokines and nitric oxide, and the activation of complement and the coagulation systems that contribute to characteristic features of inflammation, and with excessive stimulation, "endotoxic shock." in monocytes and macrophages, lps triggers the production of powerful inflammatory mediators including cytokines (eg, interleukin [il]-1, il-6, il-8, tumor necrosis factor [tnf]-α and platelet-activating factor), which stimulate production of prostaglandins and leukotrienes. in addition, lps activation results in enhanced macrophage phagocytic and cytotoxic activity. activation of alternative complement pathway factors c3a and c5a induces histamine release, and affects neutrophil chemotaxis and accumulation. kinin activation releases bradykinins and other vasoactive peptides, which cause hypotension. the release of these mediators and subsequent systemic response make lps a powerful research tool in evaluating inflammatory pathways. healthy subjects inhaling endotoxin show a systemic and pulmonary inflammatory response, recruiting neutrophils and macrophages to the lung tissue. inhalation of nebulized doses (up to 50 µg) of lps via a dosimeter in healthy volunteers leads to an increase in temperature, blood c-reactive protein (crp), blood and sputum neutrophils, blood monocytes and lymphocytes, and blood and sputum proinflammatory mediators, including: il-8, tnfα, myeloperoxidase (mpo), matrix metalloproteinase-9, il-6, il-1β, monocyte chemotactic protein-1, and macrophage inflammatory protein-1β.
10, 11 bronchial segmental instillation induces an early-phase response (0-24 hours), resulting in a statistically significant increase in neutrophils, tnf, il-1β, il-1r antagonist, il-6, and granulocyte colony-stimulating factor. neutrophils, macrophages, and monocytes increase 24-48 hours post instillation. 11, 12 intranasal lps challenge may be the least invasive and best-tolerated model. clinical symptoms are minimal; however, very little or no data have been published to date using this model. further validation of the intranasal model needs to be performed to compare it to the lps inhalation challenge model, and to better understand the relevance of lps-triggered nasal inflammation to the phenomena that occur in the central and distal airways in copd. many of the effects of exogenous lps can be blocked by medications. in one study, pretreatment with oral prednisolone or cilomilast had no effect on the local lps-induced inflammatory response in the lung; 13 pretreatment with prednisolone alone significantly inhibited the lps-induced crp response, while cilomilast attenuated the increase in crp, but not significantly. similarly, in another study, subjects who received simvastatin 40-80 mg demonstrated a reduction in neutrophils, mpo, tnfα, and matrix metalloproteinase-7, -8, and -9 in bronchoalveolar lavage fluid (balf), as well as a reduction in plasma crp, versus placebo. 14 hohlfeld et al 15 used lps to show that roflumilast significantly reduced the influx of total cells, neutrophils, and eosinophils into the airways of healthy subjects after segmental challenge with endotoxin. it is important to remember that inhaled lps challenge is a model of acute neutrophilic inflammation and not a model of copd. 16 the model can be used to understand the biological effects of compounds that inhibit the lps pathway (table 1). ozone (o3) is a major component of urban environmental air pollution. 
it is formed in the troposphere from primary precursor pollutants: in the presence of sunlight, no2 is cleaved to no and atomic oxygen, which then combines with molecular oxygen to form o3 (figure 2). in epidemiological studies, o3 levels have been associated with exacerbations of asthma, copd, and pneumonia. [17] [18] [19] experimental o3 exposure in healthy human subjects is known to elicit a reversible impairment in lung function, as well as acute proximal-airway neutrophilic inflammation and an increase in the concentration of several cytokines and mediators of inflammation within the airways. 20 in the first reported study of the inflammatory effects of low-level o3 exposure (80 ppb o3 for 6.6 hours) in healthy volunteers, 21 there were statistically significant increases in polymorphonuclear neutrophils, prostaglandin e2, lactate dehydrogenase, il-6, and α1-antitrypsin, and decreased phagocytosis via the complement receptor. this is similar to a more recent study of low-level exposure to o3 at 80 ppb for 6.6 hours, 22 in which there were increased airway neutrophils, monocytes, and dendritic cells, as well as modifications of the expression of cd14, hla-dr, cd80, and cd86 on monocytes. in another study, examining whether circulating cd11b plays a role in the inflammatory response following inhaled o3 exposure, 22 volunteers underwent controlled exposure to o3 (400 ppb for 2 hours) and to clean air on two separate occasions. 23 induced sputum collected from subjects exposed to o3 revealed marked neutrophilia and increased expression of mcd14 on airway macrophages and monocytes. circulating cd11b levels also predicted the magnitude of the airway neutrophil response following inhaled o3 exposure. a number of different classes of therapeutic agents have been studied in the o3 challenge model in healthy volunteers. therapeutic classes include corticosteroids (administered orally and by inhalation) and nonsteroidal anti-inflammatory drugs. 
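the photochemical formation of o3 described at the start of this section can be written as the standard tropospheric reaction pair (textbook chemistry, stated here for clarity rather than taken from the cited figure):

```latex
\mathrm{NO_2} \xrightarrow{h\nu} \mathrm{NO} + \mathrm{O} \qquad\qquad
\mathrm{O} + \mathrm{O_2} + \mathrm{M} \rightarrow \mathrm{O_3} + \mathrm{M}
```

here m denotes a third body (typically n2 or o2) that carries away the excess energy of the recombination.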
more recently, studies investigating the effects of cxc chemokine receptor (cxcr) 1 and 2 antagonists have been reported. [24] [25] [26] [27] [28] holz et al 24 conducted a double-blind, double-dummy, placebo-controlled, three-period crossover study. eighteen healthy subjects, shown at screening to produce more than a 10% increase in sputum neutrophils in response to exposure to 250 ppb o3, were randomly assigned to receive alternating single doses of inhaled fluticasone 2 mg, oral prednisolone 50 mg, and placebo, at least 2 weeks apart. compared with placebo, pretreatment with inhaled or oral corticosteroids resulted in a significant reduction of sputum neutrophils, by 62% and 64%, respectively. this was associated with statistically significant reductions in sputum mpo, by 55% for inhaled corticosteroids and 42% for oral steroids. compared with placebo, there was a mean reduction in sputum il-8 levels, by 49% after inhaled corticosteroids and 34% after oral corticosteroids. similar results were obtained in a study conducted by alexis et al. 25 in subjects receiving fluticasone 0.5 mg and 2 mg, sputum neutrophilia was significantly reduced, by 18% and 35%, respectively. the following inflammatory markers were also significantly reduced in a dose-dependent manner in subjects receiving fluticasone: cd11b, mcd14, cd64, cd16, hla-dr, and cd86 on sputum monocytes. serum clara cell protein 16 levels (a marker of pulmonary damage) were significantly increased post-o3 challenge. schelegle et al 26 pretreated healthy volunteers with indomethacin, which significantly reduced o3-induced decrements in fev1 and forced vital capacity compared to no drug or placebo. this was associated with reductions in the subjective symptoms of cough, shortness of breath, and throat tickle on indomethacin treatment, suggesting that cyclooxygenase products play a partial role in the subjective symptoms associated with o3 exposure. 
hazucha et al 29 demonstrated a similar reduction of o3-induced decrements in fev1 and forced vital capacity following single-dose treatment with either 200 mg or 800 mg of ibuprofen compared to placebo, which was associated with reduced post-o3 balf levels of prostaglandin e2, thromboxane b2, and il-6. sch527123 is a novel, selective, oral cxc chemokine receptor 2 antagonist that inhibits neutrophil activation and modulates neutrophil trafficking in animal models. eighteen healthy o3 responders (>20% increase in sputum neutrophils) underwent o3 challenge tests (250 ppb, 3 hours of intermittent exercise) 1 hour after the last treatment dose, and sputum was induced at 3 hours postchallenge. 27 following sch527123 treatment, the o3 challenge resulted in statistically significantly lower sputum neutrophil counts (0.136 × 10^6/ml) compared with prednisolone (0.846 × 10^6/ml; p = 0.001) or placebo (2.986 × 10^6/ml; p = 0.001). comparable results were obtained for total cell count, percentage of sputum neutrophils, and for il-8 and myeloperoxidase in sputum supernatant. postchallenge, sch527123 inhibited neutrophilia in peripheral blood, but significantly less than in sputum. studies in copd have been conducted using o3 concentrations in the range of 120-250 ppb with 7.5-15 minutes of exercise every 30 minutes, aiming to maintain a ventilatory rate of between 20-30 l/min. [32] [33] [34] in a study of nine subjects with copd and ten age-matched controls, gong et al 35 found an increase in specific airway resistance and a statistically significant decrease in fev1 in the copd subjects versus the age-matched controls. in summary, o3 challenge has been well tolerated in healthy volunteers and in older subjects, as well as in subjects with asthma or copd. 
the o3-challenge model potentially provides critical decision-making data on whether new compounds have the desired biological effect in healthy volunteers and patients with copd; hence it can de-risk decisions to move forward into large phase ii safety and efficacy trials. rhinovirus is responsible for the common cold and is spread through infected respiratory secretions from one person to another. human rhinovirus (hrv) replicates at 33°c-35°c and thus has been linked to upper airway infections, where mucosal surfaces are cooler. evidence exists that hrv is not limited to the upper airways: gern et al 36 detected rhinovirus rna in lower airway cells during experimentally induced infection. hrv binds to intercellular adhesion molecule (icam)-1, the major hrv receptor. 37 the low-density-lipoprotein receptor 38 binds a minor group of hrv (hrv2). the gene for icam-1 maps to human chromosome 19, as do the genes for a number of other picornavirus receptors. several studies have shown induction of proinflammatory genes implicated in neutrophil activation following rhinovirus infection of bronchial epithelial cells (eg, il-8, regulated by nf-κb signaling pathways, and groα). [39] [40] [41] rhinovirus infection of epithelial cells leads to the release of proinflammatory cytokines and chemokines, including il-6 and il-8. chemokines attract inflammatory cells (eg, neutrophils, eosinophils). these cells release toxic products, stimulating mucus production and leading to tissue damage, with possible long-term loss of lung function. some mediators, such as endothelin-1, have a direct effect in causing bronchoconstriction and vasoconstriction, resulting in airflow obstruction and impaired gas exchange. healthy subjects, subjects with asthma, and subjects with allergic asthma have been intensively studied in clinical trials inoculating them with rhinovirus 16 or other rhinovirus serotypes. [42] [43] [44] these studies demonstrated that rhinovirus infection of the lower airways is common after experimental inoculation. 
several studies looking at causes of exacerbations in copd have shown that viruses account for up to 60% of exacerbations, and that hrv is numerically the most important virus type. 30, [45] [46] [47] [48] [49] [50] figure 3 depicts the total viral and hrv exacerbation rate in seven exacerbation studies. other viruses associated with acute exacerbations of copd are coronavirus, influenza a and b, parainfluenza, adenovirus, and respiratory syncytial virus. 45, [51] [52] [53] [54] [55] to develop a model of viral exacerbation in subjects with copd, mallia et al 56 conducted a virus dose-escalating study infecting four copd subjects with rhinovirus. in this study, the median tissue culture infective dose (tcid50) of rhinovirus was administered by the inhaled route using a nebulizer, to elicit a copd exacerbation. although there was a decrease in fev1 (16%) and peak expiratory flow (12%), maximal on day 9, there was not a statistically significant increase in total sputum cell count or peripheral neutrophil count. symptoms of cold and lower respiratory tract symptoms, as well as lung function changes characteristic of viral-induced exacerbations of copd, were observed. there was an increase, although not statistically significant, in the proinflammatory cytokines il-6 and il-8. in another study, there were significant increases in total respiratory scores in both copd subjects and healthy controls. 57 peak expiratory flow fell by 23.5 ml in the controls (p = not significant) and by 50.5 ml (p < 0.05) in the copd patients. peripheral white cell counts and neutrophils increased in both groups. sputum neutrophil count also increased in the copd patients but not in the controls. more recently, both the upper and lower symptom scores were found to be significantly higher in the copd subjects. 58 in this study, ten of the 11 infected copd subjects met the criteria defining an exacerbation of copd. 
subjects in the copd group demonstrated significantly decreased peak expiratory flow from baseline, while those in the control group did not. the blood and sputum showed a significant increase in peripheral neutrophils in the copd subjects but not the controls; however, crp was significantly increased in both groups on day 5. subjects in the copd group had significantly increased sputum neutrophil elastase levels over baseline on days 9 and 15, as well as il-8 levels on day 9. sputum neutrophil elastase levels were significantly higher in subjects with copd, compared with control subjects, on days 9-15. to date, only one laboratory has published on this model and, as such, the data should be interpreted with caution. the main advantage of this model is that it gives a clear understanding of, and insight into, the molecular and cellular inflammatory processes that take place during a viral-associated exacerbation of copd. there are no published data to date on the effect of pharmacological interventions in this model. the rhinovirus challenge model has the potential for use as a preclinical and clinical tool to identify and investigate novel drug targets and to establish whether new therapeutic agents have potential clinical utility. these include (but are not limited to) agents such as soluble icam-1 that inhibit the interaction of hrv with icam-1, 59,60 inhibitors of the rhinovirus rna-dependent rna polymerase 3d, 61 activators of retinoic acid-inducible gene 1, 62 inhibitors of the rhinovirus capsid protein vp-1, 63 and inhibitors of different rhinovirus proteases (eg, 2a, 3c). 64, 65 currently, this is the only model that reflects the underlying mechanism of viral exacerbations of copd. the use of challenge models has the potential to significantly inform early decision making, before embarking on long-term phase ii and iii clinical trials designed to test interventions that may treat or avert exacerbations of copd. 
although challenge models are good predictive models of acute exacerbations of copd, there are ethical considerations associated with inducing exacerbations in subjects with copd. therefore, safety boards may advise that only subjects with mild copd be considered for inclusion in these studies. healthy subjects could be used as an alternative when determining the effects of a developmental drug's mechanism of action on lung inflammation. the lps and o3 models have been used successfully in healthy subjects. [13] [14] [15] 22, 24, [26] [27] [28] as these models represent the local inflammation in the lung during an exacerbation, and test the mechanism of action of potential novel drugs, these data may be used for future decision making. the lps challenge model is the best validated model in subjects with copd. pharmaceutical companies have used lps models as a means of establishing proof of principle early in the clinical development process, because they are relatively simple to perform and have few adverse events. the model is cost effective because it can be conducted in healthy subjects, who are easy to recruit. lps challenge data, whether positive or negative, can provide valuable information to aid investment decision making. one disadvantage of the lps model is that it is a model of lung inflammation, not of the disease state; thus, preclinical validation of the developmental drug's effects on lps pathways is essential. for anti-inflammatory targets involved in the toll-like receptor 4/nf-κb pathway, the lps challenge model is the model of choice. despite the longstanding knowledge and understanding of the adverse effects of o3 on pulmonary biology, the use of o3 as a challenge model to assess the potential of new drugs for the prevention of acute exacerbations in copd is relatively new. the model has been shown to be safe and to have few side effects in healthy volunteers and in patients with asthma and copd. 
24, 25, 28, 66 additionally, it is reliable and reproducible. it has been used successfully to demonstrate biological and systemic effects of fluticasone and the cxcr2 antagonists sch527123 and sb-656933. 27, 31 a limitation of the o3 model is that it has yet to be determined whether inhibition of neutrophilia translates into clinical benefits for patients with copd. preliminary data indicate that inhibition of the neutrophil response following o3 challenge may be associated with beneficial changes in symptom scores obtained in subjects with copd. challenge with rhinovirus 16 to elicit mild exacerbations in subjects with copd appears to be safe and well tolerated, but only a few copd subjects have been exposed to this model. the observation that fev1 does not always return to baseline after inducing an exacerbation in copd subjects may call into question the feasibility of using the challenge in a broader population of patients with copd, in addition to raising ethical considerations. this remains an exciting model with a great deal of potential, as rhinovirus models are good predictive models of viral-induced acute exacerbations of copd. in order for pharmaceutical companies to succeed in the copd arena, innovative approaches to clinical trial design and conduct are required that will generate critical, high-quality proof of efficacy and biological target engagement data to support early investment decision making, early drug termination, and the facilitation of better-informed decisions regarding those drugs for which proof of effect has been clearly demonstrated. challenge models in copd, which expose fewer individuals for short periods of time to eliciting agents, may serve as a surrogate of potential efficacy and thus may help early decision making and reduce clinical development timelines. 
references (titles as extracted):
- global initiative for chronic obstructive lung disease. global strategy for the diagnosis, management and prevention of copd
- chronic obstructive pulmonary disease (copd) fact sheet key facts
- effect of pharmacotherapy on rate of decline of lung function in chronic obstructive pulmonary disease: results from the torch study
- a pooled analysis of fev1 decline in copd patients randomized to inhaled corticosteroids or placebo
- salmeterol and fluticasone propionate and survival in chronic obstructive pulmonary disease
- for uplift study investigators. mortality in the 4-year trial of tiotropium (uplift) in patients with chronic obstructive pulmonary disease
- bacterial endotoxins: chemical structure, biological activity and role in septicaemia
- lps/tlr4 signal transduction pathway
- dose-response relationship to inhaled endotoxin in normal subjects
- inhaled endotoxin in healthy human subjects: a dose-related study on systemic effects and peripheral cd4+ and cd8+ t cells
- local inflammatory responses following bronchial endotoxin instillation in humans
- evaluation of oral corticosteroids and phosphodiesterase-4 inhibitor on the acute inflammation induced by inhaled lipopolysaccharide in human. available at: http://clinicaltrials.gov/ct2/show/nct01061671?term=nct01061671&rank=1
- roflumilast attenuates pulmonary inflammation upon segmental endotoxin challenge in healthy subjects: a randomized placebo-controlled trial
- blunting airway eosinophilic inflammation results in a decreased airway neutrophil response to inhaled lps in patients with atopic asthma: a role for cd14
- health effects of air pollution
- air pollution in asthma: effect of pollutants on airway inflammation
- committee of the environmental and occupational health assembly of the
- ozone-induced airway inflammation in human subjects as determined by airway lavage and biopsy
- exposure of humans to ambient levels of ozone for 6.6 hours causes cellular and biochemical changes in the lung
- low-level ozone exposure induces airways inflammation and modifies cell surface phenotypes in healthy humans
- responses of subjects with chronic obstructive pulmonary disease after exposures to 0.3 ppm ozone
- validation of the human ozone challenge model as a tool for assessing anti-inflammatory drugs in early development
- fluticasone propionate protects against ozone-induced airway inflammation and modified immune cell activation markers in healthy volunteers
- indomethacin pretreatment reduces ozone-induced pulmonary function decrements in human subjects
- sch527123, a novel cxcr2 antagonist, inhibits ozone-induced neutrophilia in healthy subjects
- no effect of inhaled budesonide on the response to inhaled ozone in normal subjects
- effects of cyclo-oxygenase inhibition on ozone-induced respiratory inflammation and lung function changes
- sch527123, a novel treatment option for severe neutrophilic asthma
- sb-656933, a novel cxcr2 selective antagonist, inhibits ex-vivo neutrophil activation and ozone-induced airway inflammation in humans
- short-term respiratory effects of 0.12 ppm ozone exposure in volunteers with chronic obstructive pulmonary disease
- the acute effects of 0.2 ppm ozone in patients with chronic obstructive pulmonary disease
- response to ozone in volunteers with chronic obstructive pulmonary disease
- responses of older men with and without chronic obstructive pulmonary disease to prolonged ozone exposure
- detection of rhinovirus rna in lower airway cells during experimentally induced infection
- the major human rhinovirus receptor is icam-1
- members of the low density lipoprotein receptor family mediate cell entry of a minor-group common cold virus
- role of p38 mitogen-activated protein kinase in rhinovirus-induced cytokine production by bronchial epithelial cells
- rhinovirus induction of the cxc chemokine epithelial-neutrophil activating peptide-78 in bronchial epithelium
- low grade rhinovirus infection induces a prolonged release of il-8 in pulmonary epithelium
- quantitative and qualitative analysis of rhinovirus infection in bronchial tissues
- rhinoviruses infect the lower airways
- lower airways inflammation during rhinovirus colds in normal and in asthmatic subjects
- respiratory viruses in exacerbations of chronic obstructive pulmonary disease requiring hospitalisation: a case-control study
- infections and airway inflammation in chronic obstructive pulmonary disease severe exacerbations
- a 1-year prospective study of the infectious etiology in patients hospitalized with acute exacerbations of copd
- effect of interactions between lower airway bacterial and rhinoviral infection in exacerbations of copd
- viral pathogens in acute exacerbations of chronic obstructive pulmonary disease
- a one-year prospective study of infectious etiology in patients hospitalized with acute exacerbations of copd and concomitant pneumonia
- rhinovirus infections in chronic bronchitis: isolation of eight possibly new rhinovirus serotypes
- role of infection in chronic bronchitis
- infectious agents associated with exacerbations of chronic obstructive bronchopneumopathies and asthma attacks
- respiratory viruses, symptoms, and inflammatory markers in acute exacerbations and stable chronic obstructive pulmonary disease
- respiratory viral infections in adults with and without chronic obstructive pulmonary disease
- an experimental model of rhinovirus induced chronic obstructive pulmonary disease exacerbations: a pilot study
- an experimental model of virus induced chronic obstructive pulmonary disease exacerbation
- experimental rhinovirus infection as a human model of chronic obstructive pulmonary disease exacerbation
- comparative antirhinoviral activities of soluble intercellular adhesion molecule-1 (sicam-1) and chimeric icam-1/immunoglobulin a molecule
- efficacy of tremacamra, a soluble intercellular adhesion molecule 1, for experimental rhinovirus infection: a randomized clinical trial
- genetic clustering of all 102 human rhinovirus prototype strains: serotype 87 is close to human enterovirus 70
- regulation of innate antiviral defenses through a shared repressor domain in rig-i and lgp2
- for pleconaril respiratory infection study group. efficacy and safety of oral pleconaril for treatment of colds due to picornaviruses in adults: results of 2 double-blind, randomized, placebo-controlled trials
- implications of the picornavirus capsid structure for polyprotein processing
- purification and partial characterization of poliovirus protease 2a by means of a functional assay
- acute lps inhalation in healthy volunteers induces dendritic cell maturation in vivo
this review is part of a dissertation for a master's degree (rvdm) at the university of surrey, uk. editorial assistance was provided by lourdes briz and carrie lancos, medimmune, llc. sponsorship: this manuscript was sponsored by medimmune. conflict of interest: rvdm is an employee of medimmune; nam is a former employee of medimmune. 
international journal of copd 2012:7 key: cord-326540-1r4gm2d4 authors: liu, yuliang; zhang, quan; zhao, geng; liu, guohua; liu, zhiang title: deep learning-based method of diagnosing hyperlipidemia and providing diagnostic markers automatically date: 2020-03-11 journal: diabetes metab syndr obes doi: 10.2147/dmso.s242585 sha: doc_id: 326540 cord_uid: 1r4gm2d4 introduction: auxiliary diagnosis has long been one of the most active research topics worldwide. implementing auxiliary-diagnosis support algorithms for medical text data faces challenges of interpretability and credibility. improving clinical diagnostic technique means not only improving diagnostic accuracy but also further studying the diagnostic basis. traditional research methods for diagnostic markers often require large amounts of time and money, and the research subjects are often only dozens of samples, making it difficult to synthesize large amounts of data. the comprehensiveness and reliability of traditional methods therefore remain to be improved. consequently, establishing a model that can automatically diagnose diseases and, at the same time, automatically provide the diagnostic basis has a positive effect on the improvement of medical diagnostic techniques. methods: here, we established an auxiliary diagnostic tool based on an attention deep learning algorithm to diagnose hyperlipidemia and automatically predict the corresponding diagnostic markers using hematological parameters. in this paper, we not only demonstrate the ability of the proposed model to diagnose diseases automatically using text-based medical data, such as physiological parameters, but also demonstrate its ability to forecast disease diagnostic markers. human physiological parameters are used as the input to the model, and the doctor's diagnosis results as the output. 
through the attention layer, the degree of attention the model pays to different physiological parameters can be obtained; that is, the model provides a diagnostic basis. results: the model achieved 94% accuracy (acc), 97.48% auc, 96% sensitivity and 92% specificity on the test dataset. all of the above samples were drawn from clinical practice. moreover, the model predicted the diagnostic markers of hyperlipidemia through the attention mechanism, and the results fully agreed with the gold standard. discussion: the auxiliary diagnosis system proposed in this paper not only achieves accurate and robust performance and can be used for the preliminary diagnosis of patients, but also shows great potential to discover new diagnostic markers. therefore, it can improve the efficiency of clinical diagnosis and also shorten, to an extent, the research period needed to establish a diagnostic basis. it has positive significance for the development of the medical diagnosis level. in recent years, with the gradual awakening of global health awareness, human beings have shown an urgent need for further development of the medical level. [1] [2] [3] artificial intelligence (ai) has great potential to promote the further development of medical diagnostic technology because of its performance, beyond that of human experts, in the field of data processing. despite its excellent performance in the field of automatic diagnosis using medical images, the interpretability and text-based medical data analyzability of ai still face great challenges. 4, 5, 45 in order to solve the problems above, researchers worldwide have gradually integrated deep learning technology with medical diagnosis. edward choi and his colleagues used a recurrent neural network to process electronic health records (ehr) for diagnosing heart failure onset. 6 laila rasmy et al used recurrent neural networks to predict the risk of heart failure based on a large number of mixed ehr data. 
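the reported sensitivity, specificity, and accuracy follow directly from a binary confusion matrix. the sketch below recomputes them; the counts are hypothetical (chosen only so the formulas reproduce percentages like those reported) and are not the paper's actual test-set numbers.

```python
# Recompute accuracy, sensitivity, and specificity from a binary
# confusion matrix. The counts (tp, fn, tn, fp) are HYPOTHETICAL and
# only illustrate the formulas behind the reported metrics.

def diagnostic_metrics(tp, fn, tn, fp):
    sensitivity = tp / (tp + fn)                # true-positive rate
    specificity = tn / (tn + fp)                # true-negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # overall correctness
    return accuracy, sensitivity, specificity

# e.g. an illustrative 100-sample test set: 50 hyperlipidemia, 50 healthy
acc, sens, spec = diagnostic_metrics(tp=48, fn=2, tn=46, fp=4)
print(acc, sens, spec)  # → 0.94 0.96 0.92
```

note that auc cannot be recovered from a single confusion matrix; it requires the model's raw scores across all thresholds.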
7 sasank chilamkurthy et al used a natural language processing model to recognize noncontrast head ct scans and identify various head diseases, such as intracranial haemorrhage and cranial fractures. 8 kang zhang et al used a transfer learning algorithm and google's inception-v3 model to rapidly diagnose many kinds of eye diseases and children's pulmonary diseases. 9 michael a. schwemmer et al used a deep neural network decoding framework to classify intracortical recordings, and then controlled a motor to help patients complete corresponding actions according to the classification results. 10 although deep learning technology has shown a strong competitive advantage in the field of automatic diagnosis using medical images, it still faces many major challenges, such as processing medical text information. in actual clinical diagnosis, in addition to the diseases that can be diagnosed from medical images, there are many diseases that need to be diagnosed from medical text data, such as hyperlipidemia, diabetes, etc. 11, 12 in order to diagnose diseases automatically using text-based medical data, the long short-term memory (lstm) neural network was proposed. 13, 14 the physiological parameters obtained in the clinic are usually a vector rather than image data, and sequential data also play an important role in clinical diagnosis. the convolutional neural network (cnn) is more suitable for processing image data because of its translation invariance. because of the need to learn the interrelationships between different physiological parameters, the lstm is a good choice for processing sequence data. the lstm relies on memory cells to learn long-range dependence information. as we know, human physiological parameters are not independent; they are interrelated, and this relationship is difficult to discover with simple coding or logistic regression algorithms. 
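the memory-cell mechanism mentioned above can be sketched in a few lines. the following is a minimal numpy illustration of the standard lstm gating equations stepping over a synthetic sequence of physiological-parameter vectors; the sizes and random weights are placeholders, not the architecture actually trained in the paper.

```python
import numpy as np

# Minimal single-layer LSTM sketch: standard gating equations only.
# All dimensions and weights are illustrative placeholders.

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step; W, U, b stack the four gates (i, f, o, g)."""
    H = h_prev.size
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[:H])          # input gate
    f = sigmoid(z[H:2 * H])     # forget gate: preserves long-range context
    o = sigmoid(z[2 * H:3 * H]) # output gate
    g = np.tanh(z[3 * H:])      # candidate memory content
    c = f * c_prev + i * g      # memory cell carries long-range dependencies
    h = o * np.tanh(c)          # hidden state fed to a downstream classifier
    return h, c

rng = np.random.default_rng(0)
D, H = 8, 4                     # 8 parameters per step, hidden size 4
W = rng.normal(scale=0.5, size=(4 * H, D))
U = rng.normal(scale=0.5, size=(4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for t in range(5):              # synthetic 5-step parameter sequence
    h, c = lstm_step(rng.normal(size=D), h, c, W, U, b)
```

because the forget gate multiplies the previous cell state rather than overwriting it, information from early steps can survive many updates, which is what lets the network relate physiological parameters that are far apart in the input sequence.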
therefore, we need a deep learning model that can better learn relationships between data points that are far apart, in order to process text-based medical data. simply put, the lstm neural network takes the original text-based medical data as input, uses many special neurons to extract joint features automatically from the original data, and finally uses a classification function to classify the samples automatically, thereby achieving automatic diagnosis of diseases. this architecture makes it possible to process medical text data with complex internal relationships, and deep learning technology has been widely used in various fields. 15, 16 the deep learning-based text data classification method replaces the traditional clustering methods based on mathematical distance, which greatly improves performance in processing text data. the key elements of the traditional automatic diagnosis method using medical text data are: (1) the patient describes pathological characteristics; (2) researchers manually extract features based on the patient's descriptions or ehr; (3) the extracted features are encoded as required; (4) a classification algorithm is used to classify the coded physiological features. 17 the traditional automatic diagnosis method needs features to be extracted manually, and the quality of the extracted feature vectors is greatly affected by the researcher's clinical experience and professional level, so it carries uncertainty. at the same time, the traditional method artificially loses some original information in the process of feature extraction, which may discard some joint features of the physiological parameters, so the traditional method is also somewhat one-sided. pathological features described by patients are not the only input: ehr are also widely used in automatic diagnosis research, but there is no objective, unified standard for evaluating the quality of ehr. 
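the traditional 4-step pipeline enumerated above can be made concrete with a small sketch: features are selected and encoded by hand, then passed to a fixed classifier. the feature names, coding scheme, and weights below are all hypothetical, invented only to illustrate the workflow the paper contrasts against.

```python
import numpy as np

# Sketch of the traditional pipeline: manual feature selection and
# coding (steps 2-3) followed by a fixed classifier (step 4).
# Features, codes, and weights are hypothetical.

def encode(record):
    # steps (2)-(3): researcher hand-picks and codes features
    return np.array([
        1.0 if record["chest_pain"] else 0.0,   # binary coding
        record["total_cholesterol"] / 200.0,    # crude normalization
        record["triglycerides"] / 150.0,
    ])

def classify(x, w, b):
    # step (4): logistic classifier on the hand-coded feature vector
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

w, b = np.array([0.5, 2.0, 1.5]), -3.0          # illustrative weights
record = {"chest_pain": False, "total_cholesterol": 260, "triglycerides": 240}
p = classify(encode(record), w, b)              # estimated disease probability
```

the weakness the text describes is visible here: any information not captured by the hand-written `encode` function (and any joint feature across parameters) is lost before the classifier ever sees the data.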
This is one of the important factors limiting the performance of automatic diagnosis algorithms that use EHRs.18,20 Deep learning extracts features automatically, so it overcomes the one-sidedness of manual feature extraction, saves labor, and improves the efficiency and accuracy of automatic diagnosis.[20-22] Another challenge in applying deep learning to auxiliary diagnosis is interpretability. To date, deep learning models remain black boxes that cannot explain exactly which physiological parameters play a vital role in the decision. The development of disease diagnosis technology depends not only on improving diagnostic accuracy but also on discovering more effective diagnostic markers and the relationships between different physiological parameters and diseases; an auxiliary diagnosis system that reports only the diagnosis cannot meet these requirements. As we know, the human brain tends to have an attention focus when processing information and can purposefully pick out important features according to the environment; this is called the attention mechanism. Combining the attention mechanism with a deep learning model imitates this function of the brain, has been shown to focus on important features, and has been applied in many fields such as image recognition and semantic recognition.[23-25] Therefore, to address the problems above, this paper not only studies the automatic diagnosis of diseases from human physiological parameters but also applies the attention mechanism to the auxiliary diagnosis model.
While automatically diagnosing diseases, this method reports the importance of each physiological parameter for the diagnosis, which enhances the interpretability of the model and the usefulness of the auxiliary diagnosis system for clinical research. We call this algorithm attention deep learning. The advancement of medical diagnostic technology should not rely solely on improving diagnostic accuracy; it should also rely on studying more effective diagnostic markers and diagnostic bases, so research on predicting diagnostic markers is of great significance. The traditional approach to studying diagnostic markers typically collects dozens of samples and predicts markers with a regression model; it is difficult to synthesize large numbers of samples, which makes traditional methods somewhat blind and subjective. To solve this series of problems, an auxiliary diagnosis system that automatically provides disease markers while automatically diagnosing diseases would be of positive significance for the development of medicine. Hyperlipidemia refers to excessive blood lipid levels, which can directly cause diseases that seriously endanger human health, such as coronary heart disease and atherosclerosis. However, because it lacks obvious symptoms and abnormal signs, these conditions are well concealed and difficult to detect purposefully. At the same time, as medicine has advanced, researchers have found that more and more diseases are highly related to hyperlipidemia, such as AIDS and depression.26,27 Hyperlipidemia has therefore become one of the most important diseases threatening human life and health worldwide.
Although there is no uniform international standard for diagnosing hyperlipidemia, hematological parameters are widely used in its diagnosis and in the evaluation of treatments, so it is feasible to diagnose hyperlipidemia automatically from hematological parameters.[28-31] In this paper, we propose an auxiliary diagnosis algorithm that can not only diagnose hyperlipidemia rapidly and accurately from human hematological parameters but also provide diagnostic markers automatically, which improves on the objectivity of traditional methods and the interpretability of deep learning. Compared with previous work, our new model not only automatically determines the patient's health status but also automatically provides diagnostic markers; compared with auxiliary diagnosis systems that provide only the diagnosis, it has higher interpretability and credibility. It can therefore speed up the patient's medical treatment process, further improve the efficiency of diagnostic-marker research, and has great potential for discovering new diagnostic markers. An artificial-intelligence-aided diagnosis system can effectively simplify the process of seeking medical treatment, alleviate the shortage of medical resources, and improve the survival rate of emergency patients, as shown in Figure 1. Furthermore, the improvement of medical diagnostic technology depends not only on diagnostic accuracy but also on research into diagnostic markers; traditional marker research is usually unable to synthesize hundreds or thousands of samples, and its research cycle is long and costly.
The deep-learning-based method for researching diagnostic markers proposed in this paper can automatically synthesize large quantities of data and effectively simplify the research process, thus reducing its cost, as shown in Figure 2. In this paper, the attention deep learning algorithm is demonstrated on human hematological data, and the performance of different algorithms is compared. Unlike EHRs, which lack unified evaluation standards, this paper uses human hematological parameters with unified measurement standards as training data, so the algorithm is more reliable. Our attention deep learning algorithm has been applied to the automatic diagnosis of hyperlipidemia from hematological parameters, and the corresponding diagnostic markers and their relative importance are reported at the same time. The hematological parameters used in this study are cholesterol, triglyceride, high-density lipoprotein, low-density lipoprotein, hemoglobin, and red blood cells. Although different blood parameters are often obtained by different methods, their acquisition follows unified standards, and these parameters reflect different aspects of human health.32,33 At the same time, compared with auxiliary diagnosis methods that provide only diagnostic results, the proposed algorithm can also automatically predict disease diagnostic markers and their importance, which improves the chance of finding effective new diagnostic markers and further accelerates the development of medical diagnostic technology. This paper compares the model's automatic predictions with the gold standard; although no new diagnostic markers were obtained from the limited data, the comparison shows that the model has the potential to reasonably predict the diagnostic basis.
Our work is the first to systematically study an artificial-intelligence-aided diagnosis system that integrates automatic diagnosis with automatic prediction of diagnostic markers, which is of great significance to the development of medicine. Improving recognition accuracy alone does not advance medical diagnosis; the mechanism of the disease must also be explained, and increasing the interpretability of the model will further improve the diagnosis of disease.34 In addition, a method that can automatically process a large number of samples and provide biomarkers can speed up the study of disease mechanisms. In conclusion, combining deep learning with medical diagnostic technology is of great significance for disease research. The deep learning model used in this paper is an LSTM network combined with the attention mechanism. The feature vector composed of human hematological parameters is processed by the attention layer and then fed to the LSTM layer, which automatically extracts the joint features hidden in the raw data. Finally, the extracted joint features are passed to the classification function, which classifies the samples automatically. From the attention layer, we can see which physiological parameters play a decisive role in the diagnosis and obtain the degree of influence of each physiological parameter on the disease. The global parameters of the model were updated with the Adam algorithm,35 and since this is a binary classification task, the sigmoid function was used as the classification function. LSTM is the core of the attention deep learning algorithm: it can learn features of widely separated items in text data, which supports learning the relationships between physiological parameters mentioned above and improves the performance of the auxiliary diagnosis model.
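As a rough sketch of the attention-then-LSTM pipeline described above (the paper's encoding function f is learned; the scores and feature values below are purely illustrative), the attention layer can be pictured as a softmax over per-parameter scores that rescales the input vector before it reaches the LSTM:

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def attention_weight(features, scores):
    """Scale each hematological parameter by its attention probability
    before the vector is fed to the LSTM layer."""
    alpha = softmax(scores)                      # attention distribution
    weighted = [a * x for a, x in zip(alpha, features)]
    return alpha, weighted

# Toy example with the paper's six parameters (cholesterol, TG, HDL,
# LDL, Hb, RBC); scores are made-up illustration, not learned values.
features = [5.8, 2.3, 1.1, 3.6, 140.0, 4.7]
scores = [2.0, 1.8, 0.5, 1.2, 0.1, 0.1]
alpha, weighted = attention_weight(features, scores)
```

In the real model the scores come from a trained encoding layer; here they are fixed only to show how a softmax turns raw scores into a probability distribution whose weights can later be inspected as marker importance.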
The purpose of the LSTM is to learn a joint representation of the different physiological parameters. In clinical practice, disease-related physiological parameters are not independent, so LSTM is better suited than traditional methods to analyzing textual medical data with joint characteristics. The schematic diagram of the LSTM layer is shown in Figure 3, where i is the current input of the LSTM cell, o is the current output, and s is the current state of the cell. The key idea of LSTM is to use three control switches, called gates, to control the long-term state of the cell: the forget gate, the input gate, and the output gate. The control principle is shown in Figure 4, where fg is the forget gate, og is the output gate, and ig is the input gate; the update rule for the LSTM long-term state is shown in Equation 1. The purpose of the attention mechanism is to make the system focus on the information in the input that is significantly related to the target output, so as to improve the quality of the output. In other words, the attention mechanism searches for disease-related physiological parameters in the hope of finding more disease-related biomarkers, just as the human brain purposefully focuses on the information most relevant to its goal and ignores what does not matter. Disease biomarkers can be identified by visualizing the levels of attention assigned to the different physiological parameters. The principle is shown in Equations 3 and 4, where attention is the attention vector, f is the encoding function of the original data, x is the raw data, and i is the input to the LSTM layer. The encoding process f yields the probability that the target output is related to each input physiological parameter.
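The gate mechanism above can be sketched as a single scalar-state LSTM step (the paper's Equation 1 is the line updating c below; real layers use weight matrices and vector states, and the weights here are illustrative, not learned):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, W):
    """One LSTM cell step with scalar input and state, for illustration.
    W maps each gate name to (input weight, recurrent weight, bias)."""
    f = sigmoid(W["f"][0] * x + W["f"][1] * h_prev + W["f"][2])  # forget gate
    i = sigmoid(W["i"][0] * x + W["i"][1] * h_prev + W["i"][2])  # input gate
    o = sigmoid(W["o"][0] * x + W["o"][1] * h_prev + W["o"][2])  # output gate
    g = math.tanh(W["g"][0] * x + W["g"][1] * h_prev + W["g"][2])  # candidate
    c = f * c_prev + i * g        # long-term state update (Equation 1)
    h = o * math.tanh(c)          # current output of the cell
    return h, c

# Toy weights; feed a short sequence of physiological parameters.
W = {k: (0.5, 0.5, 0.0) for k in ("f", "i", "o", "g")}
h, c = 0.0, 0.0
for x in [5.8, 2.3, 1.1]:
    h, c = lstm_step(x, h, c, W)
```

The forget gate decides how much of the previous long-term state survives, and the input gate decides how much of the candidate enters it, which is what lets the cell retain information about parameters seen far apart in the sequence.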
The output of the encoding process is then normalized by a softmax so that the attention distribution lies in the value range of a probability distribution. The attention mechanism thus provides more information about which physiological parameters matter most for diagnosing the target disease; at the same time, it helps the model concentrate on effective information and discard useless data, improving its ability to process more complex information. The softmax function is shown in Equation 5: softmax(z)_j = e^{z_j} / Σ_{n=1}^{N} e^{z_n}, j = 1, ..., N (5). The Adam algorithm differs from traditional stochastic gradient descent: it designs an independent adaptive learning rate for each parameter from first-moment and second-moment estimates of the gradient. Adam is an adaptive learning algorithm; compared with traditional stochastic gradient descent, it automatically adjusts the learning rate so that the model converges to a better value faster. The update of the global parameters of the deep learning model is shown in Equation 6, in which θ is the global parameter vector, ε is the learning rate, ŝ is the corrected first moment, r̂ is the corrected second moment, and δ is a very small value whose function is to keep the denominator nonzero. The gradient is computed on a mini-batch, as shown in Equation 7, where m is the mini-batch size, L is the loss function, x is the input data, and y is the target output. The cross-entropy loss function is used in this paper. All data in this experiment were collected at the Metabolic Disease Hospital of Tianjin Medical University, between December 15, 2017 and January 20, 2018 and between March 1, 2018 and May 20, 2018. All samples came from patients who visited the hospital for health testing.
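The Adam update of Equation 6, with the bias-corrected moments and the δ term in the denominator, can be sketched for a single scalar parameter (the hyperparameter values are the ones reported later in the paper's training setup; the constant gradient is only for illustration):

```python
import math

def adam_step(theta, grad, m, v, t, eps=0.001, p1=0.9, p2=0.999, delta=1e-8):
    """One Adam update for a single scalar parameter.
    eps is the learning rate; p1 and p2 decay the moment estimates."""
    m = p1 * m + (1 - p1) * grad            # first-moment estimate
    v = p2 * v + (1 - p2) * grad * grad     # second-moment estimate
    m_hat = m / (1 - p1 ** t)               # bias-corrected first moment
    v_hat = v / (1 - p2 ** t)               # bias-corrected second moment
    theta -= eps * m_hat / (math.sqrt(v_hat) + delta)  # Equation 6
    return theta, m, v

theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 4):                       # three steps, constant gradient
    theta, m, v = adam_step(theta, 2.0, m, v, t)
```

Note that with a constant gradient each early step moves the parameter by roughly the learning rate, which illustrates Adam's per-parameter step-size normalization.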
We obtained permission from the ethics committee of the Metabolic Disease Hospital of Tianjin Medical University and written informed consent from the patients, and before analyzing the data we anonymized the patients' names and other basic information. We collected 600 records, each consisting of triglyceride (TG), cholesterol, low-density lipoprotein (LDL), high-density lipoprotein (HDL), hemoglobin (Hb), red blood cell (RBC) values, and the diagnostic result. The data comprised 348 males (58%) and 252 females (42%), aged 21-87 with an average age of 55.6 years, and included 321 hyperlipidemia patients (53.5%). Pregnant and lactating women and patients taking long-term anti-hyperlipidemia drugs were excluded. All hematological parameters were obtained by a fellowship-trained laboratory physician according to the gold-standard criteria, and all diagnoses were determined by an endocrinologist with 8-10 years of clinical experience. Five hundred samples were used to train the model and the remaining 100 were used to evaluate its performance; the two parts are independent of each other. The testing dataset contains 50 hyperlipidemia samples (50%) and 50 healthy samples (50%) to ensure sample balance (64 male patients (64%) and 36 female patients (36%)). A completely independent testing dataset evaluates the system's ability to recognize data not in the training set. The raw data are multidimensional vectors consisting of hematological parameters, urological parameters, and the doctors' diagnostic results, as shown in Figure 5; they include blood routine, biochemical test, blood sugar, glycosylated hemoglobin, and urine routine parameters. We extracted the hematological data and diagnostic results above as training vectors, and will consider adding more types of parameters in future work.
Diagnostic results were quantified as 1 for hyperlipidemia samples and 0 for healthy samples. The order of parameters in the feature sequence was not specially designed; the LSTM model can automatically learn joint features between parameters whether they are near or far apart, and because there are complex internal relations between different physiological parameters, the LSTM model is a good choice. We used 500 records to train the deep learning model and the remaining 100 to test its final performance, feeding the physiological feature sequence to the model. The training data were divided into two parts: (1) training samples: 90% of the training data, used to optimize the model's global parameters; and (2) hyperparameter samples: the remaining 10%, used to fine-tune hyperparameters such as the number of neurons, with sample balance maintained in this part. The schematic diagram of the attention deep learning algorithm is shown in Figure 6. The model was built and trained in Keras with TensorFlow as the backend, on an Ubuntu 16.04 computer with an NVIDIA GTX 1080 GPU. The test and training datasets are completely independent and do not overlap; the test dataset contains 50 hyperlipidemia samples (50%), with 64 male patients (64%) and 36 female patients (36%). We kept the test set balanced so that each health condition is verified with equal probability. During training, the mini-batch size was 20, the loss function was the cross-entropy cost function, and the Adam algorithm was used to optimize the global parameters (ε = 0.001, p1 = 0.9, p2 = 0.999, δ = 10^-8). One-hot encoding was applied to the data labels: each dimension of the output vector represents a different health condition, with only the corresponding element set to 1 and the rest to 0.
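The label coding and loss described here can be sketched as follows (a minimal pure-Python illustration of one-hot labels and the cross-entropy cost of Equation 8; the prediction values are invented for the example):

```python
import math

def one_hot(label, n_classes=2):
    """One-hot encode a label: healthy (0) -> [1, 0], hyperlipidemia (1) -> [0, 1]."""
    v = [0.0] * n_classes
    v[label] = 1.0
    return v

def cross_entropy(target, actual):
    """Cross-entropy cost for one sample, clipped for numerical stability."""
    eps = 1e-12
    return -sum(t * math.log(max(a, eps)) +
                (1 - t) * math.log(max(1 - a, eps))
                for t, a in zip(target, actual))

t = one_hot(1)                               # a hyperlipidemia sample, coded 01
loss_good = cross_entropy(t, [0.05, 0.95])   # confident, correct prediction
loss_bad = cross_entropy(t, [0.95, 0.05])    # confident, wrong prediction
```

A confident wrong prediction is penalized far more heavily than a confident correct one, which is why cross-entropy pairs well with sigmoid outputs in this binary task.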
Because this paper distinguishes two health conditions, a two-dimensional vector was used to code the data label: a normal diagnosis was coded as 10 and a hyperlipidemia diagnosis as 01. One-hot encoding helps improve the robustness of the model. At the same time, the sigmoid function was used as the classification function because the task is binary classification. As mentioned above, cross-entropy was used as the loss function; its principle is shown in Equation 8, where t is the target output, a is the actual output, and n is the number of samples. The training process of the model is shown in Figure 7. The model achieved 94% accuracy (ACC) on the test set. The training curves show that the model's performance on the training set is similar to that on the test set, which indicates good robustness: the model can judge the health condition not only of samples in the training set but also of unknown samples. The ROC curve was also used to evaluate the model's ability to diagnose disease; it is shown in Figure 8, and the area under the ROC curve is 97.48%. The confusion matrix of the model is shown in Figure 9; from it, the specificity and sensitivity of the model can be obtained: 92% specificity and 96% sensitivity on the test set. In conclusion, our attention deep learning model achieved good performance and can diagnose hyperlipidemia automatically and accurately, even on samples not present in the training set. The diagnostic markers of hyperlipidemia predicted automatically by the model are shown in Figure 10.
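The reported specificity and sensitivity follow directly from confusion-matrix counts. As a sketch, counts consistent with the paper's balanced 100-sample test set (50 hyperlipidemia, 50 healthy) are reconstructed below; 96% sensitivity and 92% specificity imply 48 true positives and 46 true negatives, which also reproduces the 94% accuracy:

```python
def confusion_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Counts inferred from the reported rates, not read from Figure 9 directly.
sens, spec, acc = confusion_metrics(tp=48, fn=2, tn=46, fp=4)
```

This consistency check is one reason balanced test sets are convenient: each rate maps to an integer count of misclassified samples.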
In this study, our work is the first systematic study of an auxiliary diagnosis system that uses human hematological data to diagnose hyperlipidemia automatically and provide the relevant diagnostic basis (automatically suggested diagnostic markers). Experimental results show that our attention deep learning algorithm can not only diagnose hyperlipidemia accurately and automatically but also automatically provide the diagnostic markers of hyperlipidemia and the importance of each marker. As shown in Figure 7, the model achieved good and similar performance on both the training set and the validation set, and 94% ACC on a completely independent test dataset. This shows that the model has good generalization ability and still performs well on data that do not exist in the training set. As shown in Figures 8 and 9, the model achieved 97.48% AUC, 92% specificity, and 96% sensitivity, which shows that it not only reaches good diagnostic accuracy but also distinguishes different health conditions reliably. An AI system for auxiliary diagnosis can alleviate the uneven distribution of medical resources and improve the medical level in areas where resources are scarce; at the same time, it can speed up the patient's medical treatment process and enhance the patient's medical experience. Because the AI system proposed in this paper has no manual feature-extraction stage, it is more comprehensive and objective, and reduces the dependence of diagnostic results on the professional level of doctors. Within a limited search, we found similar work.
Edward Choi and his colleagues used recurrent neural networks to process electronic health records of varying lengths for early diagnosis of heart failure, reaching an AUC of 88.3%.6 Michael A. Schwemmer et al. used a deep neural network decoding framework to classify intracortical recordings, reaching 93.78% ACC.10 Oliver Faust et al. used an LSTM network on RR-interval signals for automatic diagnosis of atrial fibrillation.36 All of these works performed well on their test sets. Although EHR data are widely used in auxiliary diagnosis research, there is currently no unified standard for evaluating their quality. EHR data include manual descriptions, which limits their credibility and is one of the important factors limiting further improvement of model performance. Moreover, because EHR data have no uniform format, features must be extracted manually before the data can be used, which both loses original information and increases labor costs. In training our model, physiological parameters with standardized measurement criteria were used, with no manual description step; the proposed model also needs no manual feature extraction, so it can exploit more potentially useful information, improving both the performance and the reliability of the model. In addition, the explanation of disease mechanisms and biomarkers should be considered: improving diagnostic accuracy alone is not a comprehensive measure of progress in medical diagnostic technology, and accuracy alone can hardly represent the overall level of diagnosis. We also compared the performance of SVM and fully connected neural networks with our model. The SVM was a C-SVC with an RBF kernel, and it achieved 63% ACC on the same testing dataset.
SVMs with sigmoid and polynomial kernels achieved 50% ACC and 81% ACC, respectively, on the same testing data, and a fully connected neural network achieved 89% ACC. We speculate that traditional classification methods struggle because they can learn neither the relationships between different physiological parameters nor their relative importance for the disease. Physiological parameters are not independent; they interact with each other. The same value of one parameter can reflect different health conditions depending on the other parameters, because different physiological parameters are interrelated in their physiological mechanisms, much as the same word has different interpretations in different semantic environments. For example, when both blood lipids and HDL are high, the patient may be experiencing a temporary, diet-induced increase in blood lipids rather than hyperlipidemia; moreover, HDL reflects the synthesis side of lipid metabolism, and higher is not always better. We also compared a simple recurrent neural network (RNN), trained with the same Adam algorithm, with the model proposed in this paper; it achieved 93% ACC on the test dataset above. The performance of the two models is very close. However, LSTM can better synthesize the relationships between different physiological parameters when making a judgment, whereas the simple RNN only considers the state at the most recent moment; the more complex the data, the more obvious the performance difference between the two models becomes. We also found similar work within a limited search: Manjeevan Seera37 and colleagues classified transcranial Doppler signals using individual and ensemble RNNs, achieving 85.52% AUC. These works also achieved good results on their test sets.
However, we speculate that because human physiological features are not independent, considering only one parameter is insufficient, which is why LSTM, which can analyze joint characteristics, performs better.38,40 LSTM is therefore a better choice for physiological parameter sequences with complex intrinsic relationships, similar to recognizing semantic environments or voice signals. Furthermore, the auxiliary-diagnosis studies above perform well on their test sets but do not provide the basis on which the model classifies the data. The development of medical diagnosis depends not only on improving diagnostic accuracy but also on researching diagnostic markers, the diagnostic basis, and the influence of different physiological parameters on diseases. Compared with previous work, this paper proposes a deep learning model that integrates the attention mechanism; through it, we can observe which physiological parameters are more important for the diagnosis, and the model can automatically provide disease diagnostic markers while diagnosing diseases. Using LSTM alone, the model reached 92% ACC on the same test dataset. We suspect this is because the attention mechanism helps the model process effective information purposefully, alleviating the problem of over-fitting; in addition, the attention mechanism makes the study of diagnostic markers more convenient, since it directly reflects the importance of each physiological parameter for the diagnosis. The two performances are very close, which we attribute to the data not being very complex; in the future we will study and use more types of physiological parameters to identify more complex diseases. As shown in Figure 10, in diagnosing hyperlipidemia the model judged mainly according to cholesterol and triglyceride.
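The marker-reading step described here amounts to ranking the learned attention weights. A minimal sketch (the weights below are invented for illustration; the paper's learned values appear in Figure 10):

```python
def rank_markers(names, weights):
    """Sort physiological parameters by attention weight, descending, so
    the most influential candidate diagnostic markers are listed first."""
    return sorted(zip(names, weights), key=lambda p: p[1], reverse=True)

# Hypothetical attention weights over the six hematological parameters.
names = ["cholesterol", "triglyceride", "HDL", "LDL", "Hb", "RBC"]
weights = [0.38, 0.31, 0.14, 0.10, 0.04, 0.03]
ranking = rank_markers(names, weights)
top_marker = ranking[0][0]
```

Visualizing or tabulating this ranking is what turns the attention layer into a marker-suggestion tool rather than just a preprocessing step.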
This coincides with the diagnostic criteria for hyperlipidemia. At the same time, HDL was also found to be associated with hyperlipidemia; we speculate that this is because HDL acts as a carrier of cholesterol in the surrounding tissues and is therefore closely related to the disease.40,41 The model thus not only shows a high correlation between hyperlipidemia and its direct markers but also surfaces indirect markers, which indicates both that the model can learn the relationships between different physiological parameters and that it has great potential to discover new diagnostic markers. Although the model produced no new diagnostic markers from the limited data, its predictions agree with the gold standard, which supports the reliability of the model and its potential to reasonably uncover more evidence for disease diagnosis. As shown in Figure 10, the model pays little attention to the remaining items, but that attention is not zero; we speculate that this reflects the correlations among human physiological parameters. The model attends strongly to the physiological parameters directly related to the disease and weakly to those unrelated to it, such as red blood cells, which further supports its reliability. By visualizing the attention, the diagnostic basis of the auxiliary diagnosis model can be presented clearly, lending a degree of transparency to the black-box model. The AI diagnosis system proposed in this paper provides both accurate, robust diagnoses and the diagnostic basis of the disease (94% ACC, 97.48% AUC, 96% sensitivity, and 92% specificity on the test dataset).
This both increases the intelligence of the model and broadens the application scope of the system, for example to medical teaching (providing recommended diagnoses and evidence to inexperienced physicians). Most importantly, the traditional approach to researching diagnostic markers is to observe the clinical manifestations of dozens or hundreds of patients manually and then find the markers of the disease statistically; such methods are often unable to synthesize large quantities of data and have long research cycles.42,43 Andrei M. Beliaev et al. used 96 patient samples to discover diagnostic markers of acute cholangitis, and Akihiko Yuki et al. found that CADM1 is a diagnostic marker of early-stage mycosis fungoides using 58 cases.44 Their research achieved good results, but manual analysis of limited data (dozens of samples) is one-sided and time-consuming, which undoubtedly increases the difficulty of researching diagnostic markers. The auxiliary diagnosis system proposed in this paper can automatically provide diagnostic markers by integrating a large amount of clinical data, which reduces the blindness of marker research and speeds up the discovery of new diagnostic markers to a certain extent; in addition, automatically analyzing large numbers of samples improves the reliability of the model and reduces the contingency caused by small sample sizes. Despite its potential, the approach still has limitations. One limitation of our study is that the data include only a few human hematological parameters; some diseases cannot be determined from these parameters alone and need other information, such as biochemical testing, and diseases may also be associated with physiological parameters that are not part of the training set.
Another limitation is that the diagnosis of many chronic diseases also depends on other types of information, such as sex, age, disease history, and family history. Finally, because the experimental data were collected at a metabolic disease hospital, the training data contained many samples with metabolic diseases, which also limits further improvement of the model's performance. In future work we will therefore study how to add more types of parameters to the auxiliary diagnosis system and collect more samples covering different health states, so as to further improve the model's performance; we will also investigate more types of models that can process human physiological parameters effectively. In this paper, an attention deep learning algorithm is proposed that has the potential to automatically diagnose hyperlipidemia from human hematological parameters while providing the diagnostic markers and the importance of each marker for the diagnosis. It achieved 97.48% AUC, 92% specificity, and 96% sensitivity on the test dataset. The proposed method can accurately and automatically diagnose hyperlipidemia and provide disease diagnostic markers at the same time. Visualizing the model's diagnostic basis enhances the transparency of the black-box model, increases the interpretability of the deep learning algorithm, and enhances the credibility of the model. The attention deep learning algorithm proposed in this paper provides the diagnostic basis while diagnosing disease, which shows that it has the potential to discover new diagnostic markers and expands the application scope of auxiliary diagnosis systems. At the same time, the experimental results show that the algorithm can learn the relationships between different physiological parameters, so it has high generalization ability.
therefore, it can save medical resources, speed up the research process for diagnostic markers to a certain extent, improve the work efficiency of the hospital, and enhance the patient's medical experience. increasing the explanatory power of the model can effectively advance research on biomarkers. 34 future work will continue to focus on improving the performance of the auxiliary diagnostic system. in order to further improve the accuracy of the model, we will consider how to input more types of data into the model, such as patient history. at the same time, in order to diagnose more kinds of diseases, we will collect more data to expand our existing dataset. because some complex diseases require joint judgment across multiple types of diagnostic information, we will study how to use cross-media diagnostic data as input for training the model in the next step. due to the limited data types, no new diagnostic markers are proposed by this model. although the experiments confirmed that the diagnostic markers predicted by the model were the same as the gold standard, we will add more physiological parameter types and multiple diseases in future work, with a view to finding more disease-related biomarkers. not only in medicine but also from the perspective of engineering, we will further study optimization methods for auxiliary diagnostic systems, such as methods for tuning hyperparameters. we will also further expand the sample data and consider more factors that may influence the diagnosis of the disease, such as different races and diverse age groups, to further enhance the reliability of the model.
• the final frontier in cancer diagnosis
• deep learning for biology
• next-generation machine learning for biological networks
• developing a diagnostic decision support system for benign paroxysmal positional vertigo using a deep-learning model
• machine learning and medical education
• using recurrent neural network models for early detection of heart failure onset
• a study of generalizability of recurrent neural network-based predictive models for heart failure onset risk using a large and heterogeneous ehr data set
• deep learning algorithms for detection of critical findings in head ct scans: a retrospective study
• identifying medical diagnoses and treatable diseases by image-based deep learning
• meeting brain-computer interface user performance expectations using a deep neural network decoding framework
• albumin synthesis, albuminuria and hyperlipemia in nephrotic patients
• non-cholesterol sterols in different forms of primary hyperlipemias
• speech emotion recognition using deep 1d & 2d cnn lstm networks
• predicting hospital readmission for lupus patients: an rnn-lstm-based deep-learning methodology
• deep learning: a rapid and efficient route to automatic metasurface design
• cuffless blood pressure estimation from electrocardiogram and photoplethysmogram using waveform based ann-lstm network
• detecting glaucoma progression using guided progression analysis with oct and visual field assessment in eyes classified by international classification of disease severity codes
• methods and dimensions of electronic health record data quality assessment: enabling reuse for clinical research
• improving diagnostic accuracy using ehr in emergency departments: a simulation-based study
• detecting diseases by human-physiological-parameter-based deep learning
• an automatic diagnostic system based on deep learning, to diagnose hyperlipidemia
• classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning
• a frontal attention mechanism in the visual mismatch negativity
• bidirectional lstm with attention mechanism and convolutional layer for text classification
• using recurrent neural networks with attention for detecting problematic slab shapes in steel rolling
• effects of chinese herbal medicine on hyperlipidemia and the risk of cardiovascular disease in hiv-infected patients in taiwan
• increased risk of anxiety or depression after traumatic spinal cord injury in patients with preexisting hyperlipidemia: a population-based study
• clinical significance of analysis of the level of blood fat, crp and hemorheological indicators in the diagnosis of elder coronary heart disease
• cholesterol metabolism differs after statin therapy according to the type of hyperlipemia
• effect of gap junction uncoupler heptanol on resistance arteries reactivity in experimental models of diabetes, hyperlipemia and hyperlipemia-diabetes
• meta-analysis of the effect and safety of berberine in the treatment of type 2 diabetes mellitus, hyperlipemia and hypertension
• multidisciplinary care in the hematology clinic: implementation of geriatric oncology
• polyenylphosphatidylcholine decreases alcoholic hyperlipemia without affecting the alcohol-induced rise of hdl-cholesterol
• deep learning and medical diagnosis
• adam: a method for stochastic optimization
• automated detection of atrial fibrillation using long short-term memory network with rr interval signals
• classification of transcranial doppler signals using individual and ensemble recurrent neural networks
• modeling asynchronous event sequences with rnns
• classification of ecg arrhythmia using recurrent neural networks
• high hdl cholesterol: a risk factor for diabetic retinopathy? findings from no blind study
• diagnostic inflammatory markers in acute cholangitis
• informativeness of diagnostic marker values and the impact of data grouping
• cadm1 is a diagnostic marker in early-stage mycosis fungoides: multicenter study of 58 cases
• automated identification of normal and diabetes heart rate signals using nonlinear measures
yuliang liu and quan zhang are co-first authors; they contributed equally to this work. this work was funded by the tianjin university of science and technology's new coronavirus prevention and control research project and the ministry of education fund of china (2018a03033). all authors contributed to data analysis, drafting or revising the article, gave final approval of the version to be published, and agree to be accountable for all aspects of the work. the authors declare no conflicts of interest in this work. key: cord-324230-nu0pn2q8 authors: ardabili, s. f.; mosavi, a.; ghamisi, p.; ferdinand, f.; varkonyi-koczy, a. r.; reuter, u.; rabczuk, t.; atkinson, p. m. title: covid-19 outbreak prediction with machine learning date: 2020-04-22 journal: nan doi: 10.1101/2020.04.17.20070094 sha: doc_id: 324230 cord_uid: nu0pn2q8 several outbreak prediction models for covid-19 are being used by officials around the world to make informed decisions and enforce relevant control measures. among the standard models for covid-19 global pandemic prediction, simple epidemiological and statistical models have received more attention from authorities, and they are popular in the media. due to a high level of uncertainty and lack of essential data, standard models have shown low accuracy for long-term prediction.
although the literature includes several attempts to address this issue, the essential generalization and robustness abilities of existing models need to be improved. this paper presents a comparative analysis of machine learning and soft computing models to predict the covid-19 outbreak. among a wide range of machine learning models investigated, two models showed promising results (i.e., the multi-layered perceptron, mlp, and the adaptive network-based fuzzy inference system, anfis). based on the results reported here, and due to the highly complex nature of the covid-19 outbreak and variation in its behavior from nation to nation, this study suggests machine learning as an effective tool to model the outbreak. access to accurate outbreak prediction models is essential to obtain insights into the likely spread and consequences of infectious diseases. governments and other legislative bodies rely on insights from prediction models to suggest new policies and to assess the effectiveness of the enforced policies [1]. the novel coronavirus disease (covid-19) has been reported to have infected more than 2 million people, with more than 132,000 confirmed deaths worldwide. the recent global covid-19 pandemic has exhibited a nonlinear and complex nature [2]. in addition, the outbreak differs from other recent outbreaks, which brings into question the ability of standard models to deliver accurate results [3]. besides the numerous known and unknown variables involved in the spread, the complexity of population-wide behavior in various geopolitical areas and differences in containment strategies have dramatically increased model uncertainty [4]. consequently, standard epidemiological models face new challenges in delivering more reliable results. to overcome this challenge, many novel models have emerged which introduce several assumptions into the modeling (e.g., adding social distancing in the form of curfews, quarantines, etc.) [5] [6] [7].
to elaborate on the effectiveness of enforcing such assumptions, understanding standard dynamic epidemiological (e.g., susceptible-infected-recovered, sir) models is essential [8]. the modeling strategy is formed around the assumption of transmitting the infectious disease through contacts, considering three different classes of well-mixed populations: susceptible to infection (class s), infected (class i), and the removed population (class r is devoted to those who have recovered, developed immunity, been isolated or passed away). it is further assumed that class i transmits the infection to class s, where the number of probable transmissions is proportional to the total number of contacts [9] [10] [11]. the number of individuals in class s progresses as a time-series, often computed using a basic differential equation as follows: ds/dt = -β s i (1), where i is the infected population and s is the susceptible population, both as fractions. β represents the daily reproduction rate of the differential equation, regulating the number of susceptible infectious contacts. the value of s in the time-series produced by the differential equation gradually declines. initially, it is assumed that at the early stage of the outbreak s ≈ 1 while the number of individuals in class i is negligible. thus, the increment becomes linear and the class i eventually can be computed as follows: di/dt = β s i - γ i ≈ (β - γ) i (2), where γ regulates the daily rate of new infections by quantifying the number of infected individuals competent in the transmission. furthermore, the class r, representing individuals excluded from the spread of infection, is computed as follows: dr/dt = γ i (3). under the unconstrained conditions of the excluded group, eq. 3, the outbreak's exponential growth can be computed as follows: i(t) = i(0) e^((β-γ)t) (4). the outbreaks of a wide range of infectious diseases have been modeled using eq. 4. however, for covid-19 outbreak prediction, due to the strict measures enforced by the authorities, the susceptibility to infection has been manipulated dramatically.
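the sir dynamics of eq. 1 to 3 can be sketched with a simple forward-euler integration. the parameter values below (β = 0.4, γ = 0.1, i.e., a basic reproductive number of 4, the pre-lockdown estimate quoted later in the text) are purely illustrative and are not fitted to any outbreak data.

```python
# minimal forward-euler sketch of the sir dynamics of eq. 1 to 3
# (illustrative parameters only, not fitted to outbreak data).
def sir_step(s, i, r, beta, gamma, dt=1.0):
    """one euler step of ds/dt = -beta*s*i, di/dt = beta*s*i - gamma*i, dr/dt = gamma*i."""
    ds = -beta * s * i
    di = beta * s * i - gamma * i
    dr = gamma * i
    return s + ds * dt, i + di * dt, r + dr * dt

def simulate_sir(s0=0.999, i0=0.001, r0_frac=0.0, beta=0.4, gamma=0.1, days=120):
    """integrate the three population fractions day by day."""
    states = [(s0, i0, r0_frac)]
    for _ in range(days):
        states.append(sir_step(*states[-1], beta, gamma))
    return states

states = simulate_sir()
peak_day = max(range(len(states)), key=lambda d: states[d][1])
```

with these illustrative rates the infected fraction rises, peaks, and then decays, mirroring the qualitative behavior described in the text; since ds + di + dr = 0 at every step, the three fractions always sum to one.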
for example, in china, italy, france, hungary and spain the sir model cannot produce promising results, as individuals committed voluntarily to quarantine and limited their social interaction. however, for countries where containment measures were delayed (e.g., the united states) the model has shown relative accuracy [12]. figure 1 shows the inaccuracy of conventional models applied to the outbreak in italy by comparing the actual number of confirmed infections and the epidemiological model predictions. seir models, by considering the significant incubation period during which individuals have been infected, showed progress in improving model accuracy for the varicella and zika outbreaks [13, 14]. seir models assume that the incubation period is a random variable and, similarly to the sir model, that there would be a disease-free equilibrium [15, 16]. it is worth mentioning that the seir model will not work well where the parameters are non-stationary through time [17]. a key cause of non-stationarity is where the social mixing (which determines the contact network) changes through time. social mixing determines the reproductive number r0, which is the number of susceptible individuals that an infected person will infect. where r0 is less than 1, the epidemic will die out. where it is greater than 1, it will spread. r0 for covid-19 prior to lockdown was estimated as a massive 4, presenting a pandemic. it is expected that lockdown measures should bring r0 down to less than 1. the key reason why seir models are difficult to fit for covid-19 is the non-stationarity of mixing, caused by nudging (step-by-step) intervention measures. one can conclude that standard epidemiological models can be effective and reliable only if (a) the social interactions are stationary through time (i.e., no changes in interventions or control measures), or (b) there exists a great deal of knowledge of class r with which to compute eq. 3.
often, to acquire information on class r, several novel models included data from social media or call data records (cdr), which showed promising results [18] [19] [20] [21] [22] [23] [24] [25]. however, observation of the behavior of covid-19 in several countries demonstrates a high degree of uncertainty and complexity [26]. thus, for epidemiological models to be able to deliver reliable results, they must be adapted to the local situation with an insight into susceptibility to infection [27]. this imposes a huge limit on the generalization ability and robustness of conventional models. advancing accurate models with a great generalization ability, scalable to model both regional and global pandemics, is thus essential [28]. a further drawback of conventional epidemiological models is the short lead-time. to evaluate the performance of the models, the median success of the outbreak prediction presents useful information; the median prediction factor p is computed as the median, over the forecast days, of the ratio between the predicted and observed numbers of cases. as the lead-time increases, the accuracy of the model declines. for instance, for the covid-19 outbreak in italy, the accuracy of the model for more than 5 days into the future reduces from p = 1 for the first five days to p = 0.86 for day 6 [12]. due to the complexity and the large-scale nature of the problem in developing epidemiological models, machine learning (ml) has recently gained attention for building outbreak prediction models. ml approaches aim at developing models with higher generalization ability and greater prediction reliability for longer lead-times [29] [30] [31] [32] [33]. although ml methods were used in modeling former pandemics (e.g., ebola, cholera, swine fever, h1n1 influenza, dengue fever, zika, oyster norovirus [8, [34] [35] [36] [37] [38] [39] [40] [41] [42] [43]), there is a gap in the literature for peer-reviewed papers dedicated to covid-19. table 1 presents notable ml methods used for outbreak prediction.
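the median prediction factor used above to score lead-time accuracy can be sketched as follows. the exact ratio convention is not spelled out in the text, so the symmetric folding used here (values in (0, 1], with 1.0 meaning perfect agreement) is an assumption for illustration only.

```python
import statistics

def median_prediction_factor(predicted, observed):
    """median of the per-day ratio between predicted and observed case counts,
    folded so that over- and under-prediction are penalized symmetrically.
    the folding convention is an illustrative assumption, not the paper's formula."""
    ratios = []
    for p, o in zip(predicted, observed):
        if p <= 0 or o <= 0:
            continue  # skip days without positive counts
        r = p / o
        ratios.append(min(r, 1.0 / r))  # fold the ratio into (0, 1]
    return statistics.median(ratios)

# a forecast that tracks the observed counts closely scores near 1.0
score = median_prediction_factor([100, 210, 400], [100, 200, 420])
```

under this convention a perfect forecast scores exactly 1.0, and the score drops toward 0 as the lead-time grows and predictions drift from the data.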
these ml methods are limited to the basic methods of random forest, neural networks, bayesian networks, naïve bayes, genetic programming and classification and regression trees (cart). although ml has long been established as a standard tool for modeling natural disasters and weather forecasting [44, 45], its application in modeling outbreaks is still in the early stages. more sophisticated ml methods (e.g., hybrids, ensembles) are yet to be explored. consequently, the contribution of this paper is to explore the application of ml to modeling the covid-19 pandemic. this paper aims to investigate the generalization ability of the proposed ml models and the accuracy of the proposed models for different lead-times. the rest of this paper is organized as follows. section two describes the methods and materials. the results are given in section three. sections four and five present the discussion and the conclusions, respectively. data were collected from https://www.worldometers.info/coronavirus/country for five countries, including italy, germany, iran, the usa, and china, on total cases over 30 days. figure 2 presents the total case number (cumulative statistic) for the considered countries. currently, to contain the outbreak, governments have implemented various measures to reduce transmission by inhibiting people's movements and social activities. although information on changes in social distancing is essential for advancing epidemiological models, no such assumption is required for modeling with machine learning. as can be seen in figure 2, the growth rate in china is greater than that for italy, iran, germany and the usa in the early weeks of the disease. the next step is to find the best model for the estimation of the time-series data. logistic, linear, logarithmic, quadratic, cubic, compound, power and exponential equations (table 2) are employed to develop the desired model.
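the candidate functions of table 2 can be sketched as python callables together with the mean-square-error cost they are fitted against. the parameterizations below are illustrative assumptions (table 2 itself is not reproduced here), and the synthetic case curve stands in for the real country data.

```python
import math

# illustrative parameterizations of some table-2 candidate models
# (a, b, c, l, mu are the constants to be estimated)
def logistic(t, l, mu, a):
    return l / (1.0 + a * math.exp(-mu * t))

def exponential(t, a, b):
    return a * math.exp(b * t)

def quadratic(t, a, b, c):
    return a * t * t + b * t + c

def mse(model, params, ts, ys):
    """eq. 14-style cost: mean squared error between model output and targets."""
    return sum((model(t, *params) - y) ** 2 for t, y in zip(ts, ys)) / len(ts)

# synthetic cumulative-case curve generated from a logistic growth model
ts = list(range(30))
ys = [logistic(t, 80000, 0.3, 50) for t in ts]

# the logistic cost vanishes at the generating parameters, while a
# mis-specified exponential cannot reproduce the saturation
logistic_cost = mse(logistic, (80000, 0.3, 50), ts, ys)  # exactly 0.0
```

any optimizer (including the evolutionary ones described next) can then be pointed at this cost surface to estimate the constants of each candidate model.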
a, b, c, µ, and l are parameters (constants) that characterize the above-mentioned functions. these constants need to be estimated to develop an accurate estimation model. one of the goals of this study was to model the time-series data based on the logistic microbial growth model. for this purpose, a modified logistic regression equation was used to estimate and predict the prevalence (i.e., i/population at a given time point) of the disease as a function of time. estimation of the parameters was performed using evolutionary algorithms such as the genetic algorithm, the particle swarm optimizer, and the grey wolf optimizer. these algorithms are discussed in the following sections. evolutionary algorithms (ea) are powerful tools for solving optimization problems through intelligent methods. these algorithms are often inspired by natural processes and search among all possible answers to an optimization problem [46] [47] [48]. in the present study, three frequently used algorithms (i.e., the genetic algorithm (ga), particle swarm optimizer (pso) and grey wolf optimizer (gwo)) are employed to estimate the parameters by minimizing a cost function. genetic algorithm (ga). gas are considered a subset of "computational models" inspired by the concept of evolution [49]. these algorithms use "potential solutions", "candidate solutions" or "possible hypotheses" for a specific problem in a "chromosome-like" data structure. the ga maintains vital information stored in these chromosome data structures by applying "recombination operators" to them [50] [51] [52] [53]. in many cases, gas are employed as "function optimizer" algorithms, that is, algorithms used to optimize "objective functions", although the range of applications in which gas are used to solve problems is very wide [52, 54]. the implementation of the ga usually begins with the production of a population of chromosomes generated randomly and bounded above and below by the variables of the problem.
in the next step, the generated data structures (chromosomes) are evaluated, and the chromosomes that better represent the optimal solution of the problem are more likely to be used to produce new chromosomes. the degree of "goodness" of an answer is usually measured against the population of the current candidate answers [55] [56] [57] [58] [59]. the main steps of a ga process are demonstrated in figure 3. in the present study, the ga [59] was employed for estimation of the parameters of eq. 6 to 13. the population size was selected to be 300 and the maximum generation (as the iteration number) was determined to be 500 according to different trial-and-error processes to reduce the cost function value. the cost function was defined as the mean square error between the target and estimated values according to eq. 14: cost = (1/n) Σ_{i=1}^{n} (es_i - t_i)^2, where es refers to the estimated values, t refers to the target values and n refers to the number of data points. in 1995, kennedy and eberhart [60] introduced pso as a stochastic search method for optimization purposes. the algorithm was inspired by the mass movement of birds looking for food: a group of birds randomly looks for food in a space in which there is only one piece of food. each solution in pso is called a particle, which is equivalent to a bird in the bird flocking analogy. each particle has a value that is calculated by a fitness function and that improves as the particle approaches the target in the search space (the food in the bird movement model). each particle also has a velocity that guides its motion. each particle continues to move in the problem space by tracking the optimal particles in the current state [60] [61] [62]. the pso method is rooted in reynolds' work, an early simulation of the social behavior of birds. the mass of particles in nature represents collective intelligence. consider the collective movement of fish in water or birds during migration.
all members move in perfect harmony with each other, hunt together if they are hunting, and escape together from the clutches of a predator if they are being preyed upon [63] [64] [65]. particle properties in this algorithm include [65] [66] [67]: • each particle independently looks for the optimal point. • each particle moves at the same speed at each step. • each particle remembers its best position in the space. • the particles work together to inform each other of the places they are searching. • each particle is in contact with its neighboring particles and is aware of the particles in its neighborhood. • every particle is known as one of the best particles in its neighborhood. (this preprint is made available under a cc-by 4.0 international license; the author/funder has granted medrxiv a license to display the preprint in perpetuity. this version was posted april 22, 2020.) the pso implementation steps can be summarized as follows: the first step establishes and evaluates the primary population. the second step determines the best personal memories and the best collective memories. the third step updates the velocities and positions. if the stopping conditions are not met, the cycle returns to the second step. the pso algorithm is a population-based algorithm [68, 69]. this property makes it less likely to be trapped in a local minimum. the algorithm operates according to probabilistic rather than deterministic rules. therefore, pso is a stochastic optimization algorithm that can search unspecified and complex areas. this makes pso more flexible and robust than conventional methods. pso can deal with non-differentiable objective functions because it only uses the resulting information (the performance index or objective function) to guide the search in the problem area. the quality of the proposed solution does not depend on the initial population.
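the pso steps summarized above (initialize and evaluate the population, update personal and collective memories, update velocity and position, repeat until stopping) can be sketched as a minimal global-best pso. the inertia and acceleration coefficients below are common textbook defaults, not the paper's settings, and the quadratic bowl is only a stand-in for the eq. 14 cost.

```python
import random

def pso_minimize(cost, bounds, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """minimal global-best pso: each particle tracks its personal best position,
    and all particles are also attracted to the best position found so far."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda k: pbest_cost[k])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    for _ in range(iters):
        for k in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[k][d] = (w * vel[k][d]                       # inertia
                             + c1 * r1 * (pbest[k][d] - pos[k][d])  # personal memory
                             + c2 * r2 * (gbest[d] - pos[k][d]))    # collective memory
                pos[k][d] += vel[k][d]
            c = cost(pos[k])
            if c < pbest_cost[k]:
                pbest[k], pbest_cost[k] = pos[k][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[k][:], c
    return gbest, gbest_cost

# recover the minimum of a simple quadratic bowl at (3, -1)
best, best_cost = pso_minimize(lambda p: (p[0] - 3.0) ** 2 + (p[1] + 1.0) ** 2,
                               [(-10, 10), (-10, 10)])
```

in the paper's setting, the lambda would instead be the eq. 14 mean-square-error cost of one of the table-2 growth curves against the case counts.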
starting from anywhere in the search space, the algorithm ultimately converges on the optimal answer. pso has great flexibility to control the balance between the local and global search. this unique pso property overcomes the problem of premature convergence and increases the search capacity. all of these features make pso different from the ga and other innovative algorithms [61, 65, 67]. in the present study, pso was employed for estimation of the parameters of eq. 6 to 13. the population size was selected to be 1000 and the iteration number was determined to be 500 according to different trial-and-error processes to reduce the cost function value. the cost function was defined as the mean square error between the target and estimated values according to eq. 14. one recently developed intelligent optimization algorithm that has attracted the attention of many researchers is the grey wolf optimizer. like most other intelligent algorithms, gwo is inspired by nature. the main idea of the grey wolf algorithm is based on the leadership hierarchy in wolf packs and how they hunt [70]. in general, there are four categories of wolves in a pack of grey wolves: alpha, beta, delta and omega. alpha wolves are at the top of the pack's leadership pyramid, and the rest of the wolves take orders from the alpha group and follow them (usually there is only one alpha wolf in each pack). beta wolves are in the lower tier, but their superiority over delta and omega wolves allows them to provide advice and help to the alpha wolves. beta wolves are responsible for regulating and orienting the pack based on the alpha's movement. delta wolves, which are next in line in the power pyramid of the wolf pack, are usually made up of guards, the elderly population, caregivers of injured wolves, and so on. omega wolves are the weakest in the power hierarchy [70]. eq. 15 to 18 are used to model the hunting mechanism: d = |c · x_p(t) - x(t)| (15), x(t+1) = x_p(t) - a_1 · d (16), a_1 = 2a · r_1 - a (17), c = 2 · r_2 (18), where t represents the iteration of the algorithm, x_p is the position vector of the prey, x is the position vector of a grey wolf, a is linearly reduced from 2 to 0 over the iterations, and r_1 and r_2 are random vectors whose elements can take on realizations in the range [0, 1]. the gwo algorithm flowchart is shown in figure 4. in the present study, gwo [70] was employed for estimation of the parameters of eq. 6 to 13. the population size was selected to be 500 and the iteration number was determined to be 1000 according to different trial-and-error processes to reduce the cost function value. the cost function was defined as the mean square error between the target and estimated values according to eq. 14. ml is regarded as a subset of ai. using ml techniques, the computer learns to use patterns in "training samples" (processed information) to predict or make intelligent decisions without being explicitly programmed [71, 72]. in other words, ml is the scientific study of algorithms and statistical models used by computer systems that exploit patterns and inference to perform tasks instead of using explicit instructions [73, 74]. time-series are data sequences collected over a period of time [75], which can be used as inputs to ml algorithms. this type of data reflects the changes that a phenomenon has undergone over time. let x_t be a time-series vector, in which x_t is the outbreak at time point t and t ranges over the set of all equidistant time points. to train the ml methods effectively, we defined two scenarios, listed in table 3.
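the two training scenarios of table 3 can be sketched as sliding-window dataset constructions over such a time series. the exact lag pattern of scenario 1 is an assumption here (weekly samples over three weeks), while scenario 2 follows the stated five preceding days.

```python
def windows(series, lags):
    """build (input, target) pairs where each input is the outbreak value at the
    given lags, in days before the target day t."""
    pairs = []
    max_lag = max(lags)
    for t in range(max_lag, len(series)):
        x = [series[t - lag] for lag in lags]
        pairs.append((x, series[t]))
    return pairs

series = list(range(1, 31))  # stand-in for 30 days of cumulative case counts

# scenario 1: three weeks of history, sampled weekly (an assumed sampling)
scenario1 = windows(series, lags=[21, 14, 7])
# scenario 2: the five previous days
scenario2 = windows(series, lags=[5, 4, 3, 2, 1])
```

each (input, target) pair then becomes one training sample for the mlp or anfis models described next.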
as listed in table 3, scenario 1 employs data for three weeks to predict the outbreak on day t, and scenario 2 employs outbreak data for five days to predict the outbreak on day t. both of these scenarios were employed for fitting the ml methods. in the present research, two frequently used ml methods, the multi-layered perceptron (mlp) and the adaptive network-based fuzzy inference system (anfis), are employed for the prediction of the outbreak in the five countries. the ann is an idea inspired by the biological nervous system, which processes information like the brain. the key element of this idea is the novel structure of the information processing system [76] [77] [78]. the system is made up of several highly interconnected processing elements called neurons that work together to solve a problem [78, 79]. anns, like humans, learn by example. the neural network is set up during a learning process to perform specific tasks, such as identifying patterns and categorizing information. in biological systems, learning is regulated by the synaptic connections between nerves; this mechanism is also used in neural networks [80]. by processing experimental data, anns transfer the knowledge or law behind the data to the network structure, which is called learning. fundamentally, the learning ability is the most important feature of such a smart system. a learning system is more flexible and easier to plan, so it can better respond to new issues and changes in processes [81]. in anns, with the help of programming knowledge, a data structure is designed that can act like a neuron. this data structure is called a node [82, 83]. in this structure, the network between these nodes is trained by applying a training algorithm to it. in this neural network, the nodes have an active state (on or 1) and an inactive state (off or 0), and each edge (synapse or connection between nodes) has a weight.
positive weights stimulate or activate the next inactive node, and negative weights inactivate or inhibit the next connected node (if active) [78, 84]. in the ann architecture, for a neural cell c, the input b_p enters the cell from the previous cell p. w_pc is the weight of the input b_p with respect to cell c, and a_c is the sum of the multiplications of the inputs and their weights [85]: a_c = Σ_p w_pc · b_p (19). a non-linear function θ_c is applied to a_c; accordingly, b_c can be calculated as eq. 20 [85]: b_c = θ_c(a_c) (20). similarly, w_cn is the weight applied to b_c, the output of c, on its way to cell n. w is the collection of all the weights of the neural network in a set. for input x and output y, h_w(x) is the output of the neural network. the main goal is to learn these weights so as to reduce the error between y and h_w(x); that is, the goal is to minimize the cost function q(w), eq. 21 [85]: q(w) = Σ (y - h_w(x))^2 (21). in the present research, one of the most frequently used types of ann, the mlp [76], was employed to predict the outbreak. the mlp was trained using datasets related to both scenarios (according to table 3). for the training of the network, 8, 12, and 16 inner neurons were tried to achieve the best response. results were evaluated by the rmse and correlation coefficient to reduce the cost function value. figure 5 presents the architecture of the mlp. an adaptive neuro-fuzzy inference system is a type of ann based on the takagi-sugeno fuzzy system [86]. this approach was developed in the early 1990s. since this system integrates the concepts of neural networks and fuzzy logic, it can take advantage of both capabilities in a unified framework. this technique is one of the most frequently used and robust hybrid ml techniques.
it is based on a set of fuzzy if-then rules that can be learned to approximate nonlinear functions [87, 88]. hence, anfis has been proposed as a universal estimator. an important element of fuzzy systems is the fuzzy partition of the input space [89, 90]. for an input of dimension k, the fuzzy rules partition the input space into a fuzzy cube with k faces. achieving a flexible partition for nonlinear inversion is non-trivial. the idea of this model is to build a neural network whose outputs are the degrees to which the input belongs to each class [91] [92] [93]. the membership functions (mfs) of this model can be nonlinear and multidimensional and, thus, different to conventional fuzzy systems [94] [95] [96]. in anfis, neural networks are used to increase the efficiency of fuzzy systems. the design method is to employ fuzzy systems or fuzzy-based structures within neural networks. this model is a kind of divide-and-conquer method: instead of using one neural network for all the input and output data, several networks are created in this model: • a fuzzy separator to cluster the input-output data into multiple classes. • a neural network for each class. • training of the neural networks with the input-output data in the corresponding classes. figure 6 presents a simple architecture for anfis. in the present study, anfis is developed to tackle the two scenarios described in table 3. each input was covered by two mfs; triangular, trapezoidal and gaussian mf shapes were tried. the output mf type was selected to be linear, with a hybrid optimizer type. evaluation was conducted using the root mean square error (rmse) and correlation coefficient.
these statistics compare the target and output values and produce a score that indexes the performance and accuracy of the developed methods [87, 97] . table 4 presents the evaluation criteria equations, where n is the number of data points and p and a are, respectively, the predicted (output) and desired (target) values. tables 5 to 12 present the accuracy statistics for the logistic, linear, logarithmic, quadratic, cubic, compound, power and exponential equations, respectively. the coefficients of each equation were calculated by the three ml optimizers (ga, pso and gwo), and each table contains the country name, model name, population size, number of iterations, processing time, rmse and correlation coefficient. according to tables 5 to 12 , gwo provided the highest accuracy (smallest rmse and largest correlation coefficient) and the smallest processing time compared to pso and ga when fitting the logistic, linear, logarithmic, quadratic, cubic, power, compound, and exponential equations for all five countries. gwo can thus be regarded as a sustainable optimizer, given its acceptable processing time compared with pso and ga, and it was therefore selected as the best optimizer. in general, it can be claimed that gwo, by suggesting the best parameter values for the functions presented in table 2 , increases covid-19 outbreak prediction accuracy in comparison with pso and ga.
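the two evaluation criteria can be computed directly; this is a straightforward numpy sketch of the rmse and pearson correlation coefficient used in tables 5 to 12 (the function names are ours).

```python
import numpy as np

def rmse(target, output):
    # root mean square error over n data points (a = target, p = output)
    target, output = np.asarray(target, float), np.asarray(output, float)
    return float(np.sqrt(np.mean((target - output) ** 2)))

def correlation(target, output):
    # pearson correlation coefficient between desired and predicted values
    target, output = np.asarray(target, float), np.asarray(output, float)
    return float(np.corrcoef(target, output)[0, 1])
```

a perfect predictor gives rmse 0 and correlation 1; the paper ranks the optimizers by the smallest rmse and the largest correlation coefficient.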
therefore, the functions derived by gwo were selected as the best predictors for this research. tables 13 to 17 present the descriptions and coefficients of the linear, logarithmic, quadratic, cubic, compound, power, exponential and logistic equations estimated by gwo, together with the rmse and r-square values for each equation fitted to the data for china, italy, iran, germany and the usa, respectively. as is clear from tables 13 to 17, in general, the logistic equation, followed by the quadratic and cubic equations, provided the smallest rmse and the largest r-square values for the prediction of the covid-19 outbreak. this can also be seen in figures 7 to 11, which present the capability and trend of each model derived by gwo in the prediction of covid-19 cases for china, italy, iran, germany, and the usa, respectively. figure 11 shows the set of models for the usa fitted by gwo; figures 7 to 11 illustrate the fit of the models investigated in this paper.
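as an illustration of the fitting procedure, the sketch below uses a minimal grey wolf optimizer to estimate the three parameters of a logistic curve by minimizing the rmse. the parameterization, bounds, population settings and synthetic data are assumptions for the example, not the paper's actual configuration.

```python
import numpy as np

def logistic(t, cap, rate, midpoint):
    # logistic growth curve: cap / (1 + exp(-rate * (t - midpoint)))
    return cap / (1.0 + np.exp(-rate * (t - midpoint)))

def gwo_fit(t, y, bounds, n_wolves=20, n_iter=200, seed=1):
    """grey wolf optimizer: the three best wolves (alpha, beta, delta) lead the pack."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, float).T
    wolves = rng.uniform(lo, hi, size=(n_wolves, lo.size))
    cost = lambda p: float(np.sqrt(np.mean((y - logistic(t, *p)) ** 2)))  # rmse
    best, best_cost = None, np.inf
    for it in range(n_iter):
        scores = np.array([cost(w) for w in wolves])
        order = np.argsort(scores)
        alpha, beta, delta = wolves[order[:3]].copy()
        if scores[order[0]] < best_cost:
            best, best_cost = alpha.copy(), float(scores[order[0]])
        a = 2.0 - 2.0 * it / n_iter  # exploration parameter decays from 2 to 0
        for i in range(n_wolves):
            pos = np.zeros(lo.size)
            for leader in (alpha, beta, delta):
                r1, r2 = rng.random(lo.size), rng.random(lo.size)
                A, C = 2 * a * r1 - a, 2 * r2
                # encircling step: move toward each leader's estimated prey position
                pos += leader - A * np.abs(C * leader - wolves[i])
            wolves[i] = np.clip(pos / 3.0, lo, hi)
    return best, best_cost

# synthetic cumulative-case curve (illustrative, not real country data)
t = np.arange(60, dtype=float)
y = logistic(t, 1000.0, 0.2, 30.0)
params, err = gwo_fit(t, y, bounds=[(100, 5000), (0.01, 1.0), (0, 60)])
```

the same loop fits the other candidate equations (linear, quadratic, ...) by swapping the model function; the paper's comparison then ranks the fitted curves by rmse and r-square.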
the best fit for the prediction of covid-19 cases was achieved by the logistic model followed by the cubic and quadratic models for china (figure 7) , the logistic followed by the cubic model for italy (figure 8) , the cubic followed by the logistic and quadratic models for iran (figure 9 ), the logistic model for germany (figure 10 ), and the logistic model for the usa (figure 11 ). this section presents the results of the training stage of the ml methods. mlp and anfis were employed as the single and hybrid ml methods, respectively, and were trained using the two datasets related to scenario 1 and scenario 2. table 18 presents the results of the training phase. according to table 18 , the datasets related to scenarios 1 and 2 yield different performance values. for italy, the mlp with 16 neurons provided the highest accuracy for scenario 1 and anfis with the triangular mf provided the highest accuracy for scenario 2. considering the average rmse and correlation coefficient values, it can be concluded that scenario 1 is more suitable for modeling outbreak cases in italy, as it provides higher accuracy (the smallest rmse and the largest correlation coefficient) than scenario 2. for china, mlp with 12 and 16 neurons, for scenarios 1 and 2 respectively, provided higher accuracy than the anfis model for both scenarios. considering the average rmse and correlation coefficient values, it can be concluded that scenario 2, with a larger average correlation coefficient and a smaller average rmse than scenario 1, is more suitable for modeling the outbreak in china.
for the dataset of iran, mlp with 12 neurons in the hidden layer for scenario 1 and anfis with the gaussian mf type for scenario 2 provided the best performance for the prediction of the outbreak. considering the average rmse and correlation coefficient values, scenario 1 performed better than scenario 2, and, in general, the mlp had higher prediction accuracy than the anfis method. for germany, the mlp with 12 neurons in its hidden layer provided the highest accuracy (smallest rmse and largest correlation coefficient); considering the averages, scenario 1 is more suitable for the prediction of the outbreak in germany than scenario 2. for the usa, the mlp with 8 and 12 neurons, for scenarios 1 and 2 respectively, provided higher accuracy (the smallest rmse and the largest correlation coefficient values) than the anfis model; considering the averages, scenario 1 is more suitable than scenario 2, and mlp is more suitable than anfis for outbreak prediction. figures 12 to 16 present the model fits for italy, china, iran, germany, and the usa, respectively. by comparing figures 12 to 16 with figures 7 to 11 , it can be concluded that the mlp and the logistic model fitted by gwo provided better fits than the other models; in addition, the ml methods provided better performance than the analytical models.
the parameters of several simple mathematical models (i.e., logistic, linear, logarithmic, quadratic, cubic, compound, power and exponential) were fitted using ga, pso, and gwo. the logistic model outperformed the other functions and showed promising results when trained on 30 days of data; however, extrapolation beyond the original 30-day observation range should not be expected to be realistic considering the new statistics. the fitted models generally showed low accuracy and weak generalization ability for the five countries. although the prediction for china was promising, the model was insufficient for extrapolation, as expected. in turn, the logistic model fitted with gwo outperformed pso and ga, and the computational cost of gwo was satisfactory. consequently, for the further assessment of the ml models, the logistic model fitted with gwo was used for the comparative analysis. in the next step, to introduce the machine learning methods for time-series prediction, two scenarios were proposed. scenario 1 considered four data samples of the progress of the infection from previous days, as reported in table 3, with weekly sampling for the data processing; scenario 2 was devoted to daily sampling of all previous consecutive days. providing these two scenarios expanded the scope of this study. training and test results for the two machine learning models (mlp and anfis) were considered for the two scenarios. a detailed investigation was also carried out to explore the most suitable number of neurons: for the mlp, the performance with 8, 12 and 16 neurons was analyzed throughout the study.
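the two sampling schemes can be sketched as a single windowing function: step=7 with four lags mimics the weekly sampling of scenario 1, while step=1 mimics the daily sampling of scenario 2. the function and parameter names are illustrative, not the paper's.

```python
import numpy as np

def make_samples(series, n_lags, step):
    """build (inputs, target) pairs from a case-count series.
    each input holds n_lags past values spaced `step` days apart;
    the target is the current day's value."""
    series = np.asarray(series, float)
    X, y = [], []
    span = n_lags * step
    for t in range(span, len(series)):
        X.append([series[t - k * step] for k in range(n_lags, 0, -1)])
        y.append(series[t])
    return np.array(X), np.array(y)

cases = np.arange(30.0)                                  # toy 30-day series
X_week, y_week = make_samples(cases, n_lags=4, step=7)   # scenario 1 style
X_day, y_day = make_samples(cases, n_lags=4, step=1)     # scenario 2 style
```

note the trade-off the paper's scenarios embody: weekly spacing covers a longer history per sample but yields far fewer training pairs than daily spacing.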
for the anfis, the membership function (mf) types tri, trap, and gauss were analyzed throughout the study. the five countries of italy, china, iran, germany, and the usa were considered. the performance of both ml models for these countries varied between the two scenarios; given the observed results, it is not possible to select a single most suitable scenario, so both daily and weekly sampling can be used in machine learning modeling. comparison between the analytical and machine learning models using the deviation from the target value (figures 17 to 21) indicated that the mlp delivered the most accurate results in both scenarios. extrapolation for long-term prediction of up to 150 days using the ml models was tested, and the actual predictions of mlp and anfis for the five countries, showing the progression of the outbreak, were reported. the global pandemic of the severe acute respiratory syndrome coronavirus 2 (sars-cov-2) has become the primary national security issue of many nations. the advancement of accurate prediction models for the outbreak is essential to provide insights into the spread and consequences of this infectious disease. due to the high level of uncertainty and the lack of crucial data, standard epidemiological models have shown low accuracy for long-term prediction. this paper presented a comparative analysis of ml and soft computing models to predict the covid-19 outbreak. the results of the two ml models (mlp and anfis) showed a high generalization ability for long-term prediction. with respect to the results reported in this paper, and due to the highly complex nature of the covid-19 outbreak and the differences from nation to nation, this study suggests ml as an effective tool to model the outbreak. for the advancement of higher-performance models for long-term prediction, future research should be devoted to comparative studies of various ml models for individual countries.
due to the fundamental differences between the outbreaks in various countries, the advancement of global models with generalization ability would not be feasible; as observed and reported in many studies, it is unlikely that an individual outbreak will be replicated elsewhere [1] . although the most difficult prediction is estimating the maximum number of infected patients, estimating the ratio n(deaths) / n(infected) is also essential. the mortality rate is particularly important to accurately estimate the number of patients and the required beds in intensive care units. for future research, modeling the mortality rate would be of the utmost importance for nations to plan for new facilities.
references:
- covid-19 and italy: what next? lancet
- predicting the impacts of epidemic outbreaks on global supply chains: a simulation-based analysis on the coronavirus outbreak (covid-19/sars-cov-2) case
- the forecasting of dynamical ross river virus outbreaks
- a comparative study on predicting influenza outbreaks using different feature spaces: application of influenza-like illness data from early warning alert and response system in syria
- inter-outbreak stability reflects the size of the susceptible pool and forecasts magnitudes of seasonal epidemics
- on the predictability of infectious disease outbreaks
- real-time forecasting of hand-foot-and-mouth disease outbreaks using the integrating compartment model and assimilation filtering
- supervised forecasting of the range expansion of novel non-indigenous organisms: alien pest organisms and the 2009 h1n1 flu pandemic
- testing predictability of disease outbreaks with a simple model of pathogen biogeography
- short-term forecasting of bark beetle outbreaks on two economically important conifer tree species
- effective containment explains sub-exponential growth in confirmed cases of recent covid-19 outbreak in mainland china
- evaluation of the effect of varicella outbreak control measures through a discrete time delay seir model
- research about the optimal strategies for prevention and control of varicella outbreak in a school in a central city of china: based on an seir dynamic model
- calibration of a seir-sei epidemic model to describe the zika virus outbreak in brazil
- fitting the seir model of seasonal influenza outbreak to the incidence data for russian cities
- transmission dynamics of zika fever: a seir based model
- real-time prediction of influenza outbreaks in belgium
- forecasted size of measles outbreaks associated with vaccination exemptions for schoolchildren
- simple framework for real-time forecast in a data-limited situation: the zika virus (zikv) outbreaks in brazil from 2015 to 2016 as an example. parasites vectors
- predicting social response to infectious disease outbreaks from internet-based news streams
- effective network size predicted from simulations of pathogen outbreaks through social networks provides a novel measure of structure-standardized group size
- google trends predicts present and future plague cases during the plague outbreak in madagascar: infodemiological study
- prediction of dengue outbreaks based on disease surveillance, meteorological and socio-economic data
- forecasting respiratory infectious outbreaks using ed-based syndromic surveillance for febrile ed visits in a metropolitan city
- early prediction of the 2019 novel coronavirus outbreak in the mainland china based on simple mathematical model
- nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study
- superensemble forecast of respiratory syncytial virus outbreaks at national, regional, and state levels in the united states
- the norovirus epidemiologic triad: predictors of severe outcomes in us norovirus outbreaks
- consensus and conflict among ecological forecasts of zika virus outbreaks in the united states
- seasonal difference in temporal transferability of an ecological model: near-term predictions of lemming outbreak abundances
- a predictive management tool for blackfly outbreaks on the orange river
- predicting antigenic variants of h1n1 influenza virus based on epidemics and pandemics using a stacking model
- data mining techniques for predicting dengue outbreak in geospatial domain using weather parameters
- spatiotemporal dengue fever hotspots associated with climatic factors in taiwan including outbreak predictions based on machine-learning
- development of artificial intelligence approach to forecasting oyster norovirus outbreaks along gulf of mexico coast
- development of genetic programming-based model for predicting oyster norovirus outbreak risks
- machine learning for dengue outbreak prediction: a performance evaluation of different prominent classifiers
- prediction for global african swine fever outbreaks based on a combination of random forest algorithms and meteorological data
- a machine learning-based approach for predicting the outbreak of cardiovascular diseases in patients on dialysis
- artificial intelligence model as predictor for dengue outbreaks
- comparative evaluation of time series models for long-term predictors of dengue outbreaks in bangladesh: a data mining approach
- performance analysis of combine harvester using hybrid model of artificial neural networks particle swarm optimization
- comparative analysis of single and hybrid neuro-fuzzy-based models for an industrial heating ventilation and air conditioning control system
- evolutionary algorithms in theory and practice: evolution strategies, evolutionary programming, genetic algorithms
- multi-objective optimization using evolutionary algorithms
- comparison of multiobjective evolutionary algorithms: empirical results
- feature selection based on hybridization of genetic algorithm and particle swarm optimization. ieee geoscience and remote sensing letters
- comparative analysis of single and hybrid neuro-fuzzy-based models for an industrial heating ventilation and air conditioning control system
- advances in machine learning modeling reviewing hybrid and ensemble methods
- the parallel genetic algorithm as function optimizer
- a genetic algorithm tutorial
- a niched pareto genetic algorithm for multiobjective optimization
- a genetic algorithm for flowshop sequencing
- development and validation of a genetic algorithm for flexible docking
- advances in machine learning modeling reviewing hybrid and ensemble methods
- a genetic algorithm for function optimization: a matlab implementation
- genetic algorithms and neural networks: optimizing connections and connectivity
- particle swarm optimization
- particle swarm optimization. swarm intelligence
- particle swarm optimization
- using selection to improve particle swarm optimization
- particle swarm optimization with particles having quantum behavior
- recent approaches to global optimization problems through particle swarm optimization
- analysis of particle swarm optimization algorithm. computer and information science
- particle swarm optimization method in multiobjective problems
- an efficient method for segmentation of images based on fractional calculus and natural selection
- multilevel image segmentation based on fractional-order darwinian particle swarm optimization. advances in engineering software
- ardabili, s.; mosavi, a.; varkonyi-koczy, a. systematic review of deep learning and machine learning models in biofuels research. engineering for sustainable future
- prediction of combine harvester performance using hybrid machine learning modeling and response surface methodology
- prediction of combine harvester performance using hybrid machine learning modeling and response surface methodology
- time series analysis modelling temperature variation of mushroom growing hall using artificial neural networks. engineering for sustainable future
- prediction of wind speed and wind direction using artificial neural network, support vector regression and adaptive neuro-fuzzy inference system
- prediction of output energies for broiler production using linear
- artificial neural network modeling of hydrogen-rich syngas production from methane dry reforming over novel ni/cafe2o4 catalysts
- wavelet neural network applied for prognostication of contact pressure between soil and driving wheel. information processing in agriculture
- intelligent modeling of material separation in combine harvester's thresher by ann
- detection of walnut varieties using impact acoustics and artificial neural networks (anns)
- detection of almond varieties using impact acoustics and artificial neural networks
- fundamentals of artificial neural networks
- faizollahzadeh ardabili; mahmoudi, a.; mesri gundoshmian, t. modeling and simulation controlling system of hvac using fuzzy and predictive (radial basis function, rbf) controllers
- prediction of output energy based on different energy inputs on broiler production using application of adaptive neural-fuzzy inference system
- deep learning and machine learning in hydrological processes, climate change and earth systems: a systematic review
- state of the art survey of deep learning and machine learning models for smart cities and urban sustainability
- adaptive neuro-fuzzy inference system for classification of eeg signals using wavelet coefficients
- a hybrid anfis model based on ar and volatility for taiex forecasting
- adaptive neuro-fuzzy inference system for prediction of water level in reservoir
- evaporation estimation using artificial neural networks and adaptive neuro-fuzzy inference system techniques
- an adaptive neuro-fuzzy inference system (anfis) model for wire-edm
- ahlawat, a. modeling and analysis of significant
key: cord-331849-o346txxr
authors: cardoso, pedro j.s.; rodrigues, joão m.f.; monteiro, jânio; lam, roberto; krzhizhanovskaya, valeria v.; lees, michael h.; dongarra, jack; sloot, peter m.a.
title: computational science in the interconnected world: selected papers from 2019 international conference on computational science
date: 2020-09-21
journal: j comput sci
doi: 10.1016/j.jocs.2020.101222
sha:
doc_id: 331849
cord_uid: o346txxr
with its roots in computer science and applied mathematics, the computational science (cs) discipline is a fulcrum that brings together the general sciences and plays a major role in today's society. by exploring modelling and simulation through advanced computational methodologies, cs touches vast fields of knowledge, originating a high demand for practitioners from all types of organizations. in this context, practitioners come to intermingle, collaborate, and share resources and visions, building a large multidisciplinary community that thrives to solve highly complex computational problems which only a few dreamed of some years ago. in its line of action, for instance, cs is used to simulate phenomena whose direct study would be prohibitive because of many kinds of impracticalities, such as cost or danger. an investigation starts by identifying a real problem or phenomenon, which is simplified into an abstract model. the abstract model is then transformed into a computational model by casting it in formal mathematics and algorithms. the algorithms are later implemented in programming code, used in the final step to run simulations of the original problem. simulations allow a controlled scenario that can be repeated with different parameters, starting points, and conditions in general, most certainly diminishing the associated costs when compared to executing the real scenario.
increasing computational capacities, together with new algorithms and methods, make it possible to achieve high-production scenarios, making cs a relatively recent third pillar of science, complementing the theoretical and experimental settings known from other sciences. these achievements are supported by everything from commodity computers to powerful supercomputers [1] , [2] , built on high-performance systems comprising thousands of dedicated nodes, with optimised placement relative to network and environmental issues, power consumption, security, etc. against this background, the international conference on computational science (iccs), held annually since 2001, has grown to become a major event in the cs field, with hundreds of experts meeting and discussing their work, along with keynote lectures presented by the world's renowned researchers. in the 2019 edition, the motto was computational science in the interconnected world, highlighting the role of cs in an increasingly unified planet. as a matter of fact, the context in which this editorial is being written, just a few months after the declaration of the covid-19 pandemic, highlights the importance of this interconnected world and keeps cs at the forefront of current needs, as reflected in epidemiological research supported by computational methods [3] [6] and in the proven accuracy of many disease propagation models [7] [12] . in short, cs practitioners work on the fringe of knowledge, making this research field very attractive to new and old practitioners alike, who are called to solve modern, gratifying and intricate computational problems. iccs 2019, the 19th annual international conference on computational science, was held on june 12-14, 2019 in faro, portugal.
the virtual special issue (vsi), entitled computational science in the interconnected world: selected papers from 2019 international conference on computational science [13] , continues the series of key publication collections associated with recent iccs events, containing a restricted selection of papers extended from the ones published in the conference's proceedings [14] [18] . iccs 2019 received 573 submissions, divided into 228 submissions to the main track and 345 to the workshops; a total of 233 papers were accepted, 65 (28% acceptance rate) full papers from the main track and 168 (49%) full papers from the workshops. for this vsi, 10 papers were selected from the main track and 6 from the workshops. authors were required to extend the original submission with at least 40% new content describing mature and complete results, followed by a full review process. the vsi submissions are not devoted to a single field of research, although some major themes can be identified. for instance, in the field of high-performance and distributed computing, ames et al. [19] present distributed gpu-accelerated fluid-structure interaction (fsi) simulations of high-hematocrit cell-resolved flows with over 17 million red blood cells. scaling is compared on a fat-node system with six gpus per node and on a system with a single gpu per node. through comparison between the cpu- and gpu-based implementations, the costs of data movement in multiscale multi-grid fsi simulations on heterogeneous systems are identified, proving to be a major performance bottleneck on the gpu. in [20] , fujita et al. develop a study on the acceleration of the element-by-element kernel in matrix-vector products, essential for high performance in unstructured implicit finite-element applications.
their study focuses on random data access with data recurrence as the major issue in attaining performance, proposing a method to avoid the resulting data races for high performance on many-core cpu architectures with wide single instruction, multiple data (simd) units, exemplified by finite-element earthquake simulations. research on numerical analysis and methods is presented in [21] , where fan, qiao, and sun study a thermodynamically consistent and robust numerical scheme for the dynamical modelling of composition variation in the framework of the modified helmholtz free energy coupled with realistic equations of state. being computationally efficient and memory saving, the scheme was proved to be unconditionally stable; the implementation is based on the single-component system and does not require choosing a reference species for multicomponent fluids. in the same context, alekseev, bondarev, and kuvshinnikov [22] investigate an instance of epistemic uncertainty quantification in which the norm of the approximation error is estimated using an ensemble of numerical solutions obtained via independent numerical algorithms; the ensemble of numerical results obtained by five openfoam solvers is analyzed. in the field of agent simulation and complex systems, leonenko, arzamastsev, and bobashev [23] study an agent-based model of influenza propagation. the work aims to assess the applicability of the model to previous influenza incidence data and to demonstrate the influence of human contact patterns on the simulated influenza dynamics. the authors use several types of synthetic populations (e.g., intensive mixing of people vs. less variety of contacts), reaching different intensities of influenza outbreaks, and conclude that contact patterns may dramatically alter the course of epidemics. andelfinger et al.
[24] propose a passenger model to measure the effect of different public transport vehicle layouts on the time required for boarding and alighting. the model includes a mechanism to emulate rotation behaviour while avoiding complex geometric computations, allowing collision prediction in low-density environments. the boarding and alighting model is validated against real experiments from the literature, showing small deviations from the known values, and simulation times are reported for three autonomous vehicle interior layouts proposed by industrial designers in low- and high-density scenarios. applications to earth and natural process simulations are proposed by mills et al. [25] , who use computational methods to examine possible consequences of sea level rise, namely how salt intrusion and an increase in water volume affect the hydrodynamics and flooding areas of a major estuary in the iberian peninsula. the implemented 2d model simulated the guadiana estuary in different scenarios of sea level rise combined with different freshwater flow rates and varying tidal amplitudes, finding an increase in salinity in all areas around the estuary in response to an increase in mean sea level; the increase in flooding areas around the estuary was also positively correlated with an increase in mean sea level. assad et al. [26] evaluated the mesoscale ocean circulation of the brazilian equatorial margin using ten-year hydrodynamic backtesting, successfully representing in situ and satellite data. the results point to aspects such as the north brazil current's influence on the amazon river plume's shape and spread, and the relationship of the seasonal behaviour of the inter-tropical convergence zone migration and of the northern branch of the south equatorial current with the north brazil current pattern. also in the simulation field, in [27] , takii et al.
present a six-degrees-of-freedom flight simulation with conversion between airplane and helicopter modes as an application of the digital flight of a tiltrotor aircraft. the paper considers components such as rotors, engine nacelles, fixed wings, and a fuselage. to handle large deformations of the mesh geometry, the computational domain is decomposed into multiple domains corresponding to each component, which are moved independently to reproduce the characteristic behaviour of a tiltrotor aircraft. the airframe is treated as a rigid body, and coupled simulations considering the interaction of the aircraft with the surrounding fluid are demonstrated; the results show the complicated fluid phenomena around the tiltrotor aircraft that occur when it flies with mode changes. on data science and artificial intelligence, the work of begy et al. [28] envisages assisting data-intensive distributed job scheduling in computational grids. the study proposes a multiple linear regression model for forecasting remote access throughput, applicable to large computing workloads, derived from experimentally identified remote data access throughput parameters and a statistically tested formalization of those parameters. one of the work's goals is to optimise the network load over different data access methodologies exhibiting non-overlapping throughput bottlenecks. kadupitiya et al. [29] explore the idea of using machine learning surrogates to enhance the usability of molecular dynamics simulations of soft materials for both research and education. in this context, they integrate machine learning methods with high-performance computing to enhance their predictive power; to demonstrate the approach, a parallelised molecular dynamics simulation of self-assembling ions in nanoconfinement is used.
they conclude that the implemented regression model, supported by an artificial neural network, successfully learns nearly all the interesting features of the output ionic density profiles over a broad range of ionic system parameters. in the fields of operational research and soft computing, randall, montgomery, and lewis [30] introduce a temporally augmented version of a water management problem which allows farmers to evaluate sustainable crop choices over long time horizons, under projected changes in precipitation and temperature. they use a multiple-objective formulation solved by a differential evolution algorithm. among other conclusions, the authors suggest that, for the studied region, it will no longer be sustainable in the future to grow the crops that are grown now, nor in the same quantities. lima and adi [31] introduce two new combinatorial optimization problems involving strings, namely the chain alignment problem and the multiple chain alignment problem. while a polynomial-time algorithm using dynamic programming is presented for the first problem, the latter is proved to be np-hard, and three heuristics are proposed to approximate solutions to its instances. the proposed heuristics are assessed on simulated data, and the applicability of both problems is demonstrated by their results on a variant of a gene identification problem. oulamara, cherif-khettaf, and vallée [32] study a variant of the dial-a-ride problem encountered in a mobility service operated by a private company. customers ask for a transportation service and get a real-time answer about whether their requests are accepted or rejected, the objective being to maximize the number of accepted requests while satisfying the constraints. the authors propose reinsertion techniques to exploit solution neighbourhoods, based on destruction, repair, and ejection operators; their proposal was tested on real and hard instances provided by the company. in [33] , dur et al.
present a weak constraint gaussian process model to integrate noisy inputs into the classical gaussian process predictive distribution. to exploit a high-performance computing environment, parallelism is introduced by defining a parallel weak constraint gaussian process model based on domain decomposition. the algorithm is then applied to an optimal sensor placement problem, and experimental results are provided for pollutant dispersion within a real urban environment. da silva et al. [34] profile and analyze the power consumption of two production scientific workflow applications executed on distributed platforms. their work examines the impact of cpu utilization and i/o operations on energy usage, as well as the impact of executing multiple tasks concurrently on multi-socket, multi-core nodes. they find that power consumption is impacted non-linearly by the way in which cores in sockets are allocated to workflow tasks, and that i/o operations have a significant impact on power consumption; these findings lead them to propose a power consumption model that accounts for i/o-intensive workflows.
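the classical gaussian process predictive distribution that the weak constraint model of [33] extends can be written in a few lines. the sketch below is a plain single-domain gp with an assumed squared-exponential kernel and homoscedastic noise; it is not the weak-constraint or domain-decomposed formulation of the paper.

```python
import numpy as np

def rbf(a, b, ell=1.0, sf=1.0):
    # squared-exponential kernel on 1-d inputs
    d = a[:, None] - b[None, :]
    return sf ** 2 * np.exp(-0.5 * (d / ell) ** 2)

def gp_predict(x_tr, y_tr, x_te, noise_var=1e-2):
    # classical gp predictive mean and variance with gaussian noise
    K = rbf(x_tr, x_tr) + noise_var * np.eye(len(x_tr))
    Ks = rbf(x_tr, x_te)
    mean = Ks.T @ np.linalg.solve(K, y_tr)
    cov = rbf(x_te, x_te) - Ks.T @ np.linalg.solve(K, Ks)
    return mean, np.diag(cov)

x = np.linspace(0.0, 3.0, 20)
y = np.sin(x)                        # toy "sensor" signal
mean, var = gp_predict(x, y, np.array([1.5]))
print(mean[0], var[0])               # mean close to sin(1.5), small variance
```

the predictive variance returned here is what sensor placement criteria typically minimize over candidate locations.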
parallel programming for modern high performance computing systems
high performance computing: modern systems and practices
computational studies of drug repurposing and synergism of lopinavir, oseltamivir and ritonavir binding with sars-cov-2 protease against covid-19
artificial intelligence and machine learning to fight covid-19
preliminary identification of potential vaccine targets for the covid-19 coronavirus (sars-cov-2) based on sars-cov immunological studies
fast identification of possible drug treatment of coronavirus disease-19 (covid-19) through computational drug repurposing study
a review of modern technologies for tackling covid-19 pandemic
modeling the spatial spread of infectious diseases: the global epidemic and mobility computational model
sir-based mathematical modeling of infectious diseases with vaccination and waning immunity
a workflow for software development within computational epidemiology
computational analysis of sars-cov-2/covid-19 surveillance by wastewater-based epidemiology locally and globally: feasibility, economy, opportunities and challenges
selected papers from 2019 international conference on computational science
computational science - iccs 2019
computational science - iccs 2019
computational science - iccs 2019
computational science - iccs 2019
computational science - iccs 2019
multi-gpu immersed boundary method hemodynamics simulations
development of element-by-element kernel algorithms in unstructured finite-element solvers for many-core wide-simd cpus: application to earthquake simulation
unconditionally stable, efficient and robust numerical simulation of isothermal compositional grading by gravity
on uncertainty quantification via the ensemble of independent numerical solutions
contact patterns and influenza outbreaks in russian cities: a proof-of-concept study via agent-based modeling
a passenger model for simulating boarding and alighting in spatially confined transportation scenarios
the impact of sea level rise in the guadiana estuary
ocean climatology at brazilian equatorial margin: a numerical approach
six degrees of freedom flight simulation of tilt-rotor aircraft with nacelle conversion
forecasting network throughput of remote data access in computing grids
machine learning surrogates for molecular dynamics simulations of soft materials
an introduction to temporal optimisation using a water management problem
the chain alignment problem
new online reinsertion approaches for a dynamic dial-a-ride problem
weak constraint gaussian processes for optimal sensor placement
characterizing, modeling, and accurately simulating power and energy consumption of i/o-intensive scientific workflows
many people are entitled to be acknowledged for their work to complete this virtual special issue. first of all, the authors of the selected papers for their priceless work, and the reviewers for their in-depth reviews, which contribute to maintaining the high standards of this series of special issues, as well as the program committee members and workshop organizers for their effort to ensure the excellence and success of iccs 2019 and of this vsi. to springer and elsevier for their willingness and support in publishing the conference's proceedings and special issue. to our research laboratories, namely larsys (financed by the portuguese foundation for science and technology - fct, project larsys - fct project uidb/50009/20201) and the computational science lab at the university of amsterdam. to our corporate supporters, notably universidade do algarve, university of amsterdam, university of tennessee, nanyang technological university, oxford university press, tap air portugal, and turismo de portugal (algarve). finally, to intellegibilis (https://intellegibilis.com/) for their full support in the planning and deployment of iccs 2019. key: cord-330668-7aw17jf8 authors: chen, cheng-chang; krüger, jens; sramala, issara; hsu, hao-jen; henklein, peter; chen, yi-ming arthur; fischer, wolfgang b.
title: orf8a of sars-cov forms an ion channel: experiments and molecular dynamics simulations date: 2011-02-28 journal: biochimica et biophysica acta (bba) biomembranes doi: 10.1016/j.bbamem.2010.08.004 sha: doc_id: 330668 cord_uid: 7aw17jf8 abstract orf8a protein is 39 residues long and contains a single transmembrane domain. the protein is synthesized using solid phase peptide synthesis and reconstituted into artificial lipid bilayers, where it forms cation-selective ion channels with a main conductance level of 8.9 ± 0.8 ps at elevated temperature (38.5 °c). computational modeling studies, including multi-nanosecond molecular dynamics simulations in a hydrated popc lipid bilayer, are performed with a 22 amino acid transmembrane helix to predict a putative homooligomeric helical bundle model. a structural model of a pentameric bundle is proposed with cysteines, serines and threonines facing the pore. coronavirus (cov) is the causal agent responsible for the severe acute respiratory syndrome (sars) [1, 2] . the 29.7 kb genome of sars-cov consists of a single positive-stranded rna harboring 14 open reading frames (orfs), which are translated into a large polyprotein that is processed by a virus-encoded protease into the individual proteins [3, 4] . one of the structural proteins, encoded by orf8a, is a membrane-associated protein of 39 amino acids containing a single transmembrane domain (tmd) with a length of approximately 22 residues [5] . orf8a has been proposed not only to enhance viral replication but also to induce apoptosis through a mitochondria-dependent pathway [5] . its topology and biological function remain unknown. viral channel forming proteins (vcps) participate in several viral functions. they change chemical or electrochemical gradients by altering the ion permeability of the lipid membrane and modulate the cellular response by interacting with membrane proteins of the host [6] [7] [8] [9] .
several vcps have been detected as ion channels, especially when reconstituted into lipid bilayers. vcps are suggested to be encoded in the genomes of enveloped and naked viruses and also of the chlorella plant virus pbcv-1. vcps, like host channels and pore-forming microbial peptides [10] [11] [12] [13] , have to assemble into homo-oligomeric bundles. they are found to consist of approximately 80 to 100 amino acids. exceptions emerge with the discovery of 3a from sars-cov, with a length of 274 amino acids [14] , and, based on this study, of orf8a with 39 amino acids. in the presented study, we demonstrate experimentally that orf8a forms channels when reconstituted into artificial lipid bilayers at elevated temperature (38.5 °c). it is suggested that orf8a 3-20 is part of the tmd, and multi-nanosecond molecular dynamics simulations of orf8a 3-20 within a popc bilayer are undertaken. a set of different oligomeric bundle models ranging from 4 to 6 units is generated. the generation of the computational bundle models follows the 'two stage model' [15, 16] . in short, this model suggests that the secondary structure of a protein is generated prior to its tertiary or quaternary structure. the identification and structural characterization of the new ion channel protein, orf8a from sars-cov, will be helpful for the design of novel drugs against sars. orf8a, mkllivltci 10 slcscictvv 20 qrcasnkphv 30 ledpckvq 39, was synthesized using solid phase peptide synthesis on a 433a synthesizer using fmoc-chemistry and hbtu as coupling reagent. the following side-chain-protecting groups were used: t-butyl for ser and thr, trityl for cys, his, asn and gln, boc for lys and pbf for arg. after completion of the synthesis, the peptide was cleaved from the resin using 95% tfa/water and 3% triisopropylsilane for 3 h at room temperature. after cleavage the crude peptide was precipitated with ether and lyophilized.
the crude peptide was purified by preparative rp-hplc on a 21 × 300 mm kromasil column with a flow rate of 70 ml/min. peaks were detected by uv at 220 nm and the separated product was identified by mass spectrometry. channel recordings of orf8a were made with the protein reconstituted into a lipid bilayer mixture of pope (1-palmitoyl-2-oleoyl-sn-glycero-3-phosphoethanolamine) and dopc (1,2-dioleoyl-sn-glycero-3-phosphocholine) (1:4 by volume, total lipid 5 mg/ml). lipids were dissolved in chloroform, dried under n2 gas and resuspended in pentane and decane 9:1. approximately 1 μl of lipid suspension was brushed over the delrin cup aperture (diameter 150 μm) and dried. then 3 μl of peptide dissolved in trifluoroethanol (tfe) were added onto the aperture in the same way as the lipids. the lipid bilayer was formed by raising the level of the buffer (300 mm or 30 mm kcl, 5 mm k-hepes, ph = 7.05) on both the trans- (ground) and cis-side (head stage). the current response was recorded using a planar lipid bilayer workstation from warner instruments with a bc-353 amplifier and a 1440a data acquisition system. the temperature was set to 38.5 °c using a bipolar temperature controller model cl-100 from warner instruments. a constant positive voltage of 70 mv (cis-side) was applied during lipid bilayer formation to achieve an asymmetric orientation of the peptides within the bilayer [17] . records were filtered at 10 hz using a bessel 8-pole low pass filter, digitized at 10 hz and stored for further analysis. experiments were repeated in the presence of the reducing agent dithiothreitol (dtt), with the peptide dissolved in tfe containing 50 mm dtt. several transmembrane (tm) prediction tools, das [18] , tmhmm [19] , tmpred [20] , split4.0 [21] and sosui [22] , all of which assume helical tm segments, have been used to detect the tm helical segments of the protein.
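for an ohmic single-channel current the amplitude follows i = g·v; a quick sanity check with the main conductance level reported for orf8a (~8.9 ps) and the 70 mv potential applied during bilayer formation:

```python
# single-channel current expected from an ohmic channel: i = g * v
g = 8.9e-12        # main conductance level, siemens (8.9 pS)
v = 70e-3          # applied potential, volts (70 mV)
i = g * v          # current in amperes
print(i * 1e12)    # ~0.62 pA
```

currents of this size are why the heavy low-pass filtering of the records described above is needed before analysis.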
an ideal helix of the tmd of orf8a (llivltci 10 slcscictvv 20 , 8a 3-20) was created using the integrated protein builder of moe (www.chemcomp.com). the helix was embedded into a lipid bilayer (popc). overlapping lipid molecules were removed using the moe suite. the lipid-peptide system was then further processed using gromacs-3.3 (www.gromacs.org). the topology and structure of the popc bilayer were prepared as described elsewhere [23] . the lipid-peptide system was hydrated, and a short minimization routine using steepest descent and conjugate gradients followed to remove unfavorable interactions. a short equilibration (500 ps) followed with the peptide restrained at the cα atoms. at this stage the lipids were expected to surround the peptide adequately. in the production run (15 ns) all components were fully unrestrained. the program g_covar from the gromacs-3.3 package was used to perform principal component analysis (pca). the covariance matrix of positional variation was computed for the main chain over the full 10 ns simulation length, based on a least-squares fit of the cα atoms of tm residues 3-20. rotational and translational motions were removed by fitting the peptide structure of each time frame to the initial structure. the derived averaged monomer structure was then used in moe to generate tetrameric (tbms), pentameric (pbms) and hexameric bundle models (hbms) [24] by creating symmetric copies of monomeric units around a central pore axis. degrees of freedom such as the rotational angle, inter-helical distance and tilt angle (further on referred to as 'tilt') were changed systematically. to sample the whole conformational space of the bundles, each of the degrees of freedom was varied stepwise (inter-helical distance 0.1 å, rotational angle 2° and tilt 2°). for each position the side chains were linked with the backbone. the side chain conformation was chosen to be the most likely one for the given backbone position, as referenced in the moe library.
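the pca step carried out with g_covar (covariance matrix of fitted coordinates, then diagonalization) can be sketched on a synthetic trajectory. the frame count, atom count and motion amplitudes below are illustrative assumptions, not values from the simulations.

```python
import numpy as np

rng = np.random.default_rng(1)
n_frames, n_atoms = 500, 18          # e.g. 18 tm residues, c-alpha only

# synthetic trajectory: one dominant collective motion along a fixed
# 3n-dimensional direction plus small isotropic noise
direction = rng.normal(size=3 * n_atoms)
direction /= np.linalg.norm(direction)
amplitude = 2.0 * rng.normal(size=(n_frames, 1))
traj = amplitude * direction + 0.05 * rng.normal(size=(n_frames, 3 * n_atoms))

X = traj - traj.mean(axis=0)         # subtract the average structure
cov = X.T @ X / (n_frames - 1)       # 3n x 3n covariance of fluctuations
evals, evecs = np.linalg.eigh(cov)   # eigenvalues in ascending order

fraction = evals[-1] / evals.sum()   # variance captured by the top mode
print(fraction)                      # close to 1: one mode dominates
```

averaging frames projected onto the first few such eigenvectors is what yields the averaged monomer structure used for bundle building.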
a short energy minimization (15 steps of steepest descent) followed the linking. subsequently the potential energy was calculated using the engh-huber force field implemented in moe. more than one hundred thousand conformations were generated and stored in a database for further analysis. before embedding low energy models into lipid bilayers, two amino acid residues were added at the n and c termini of each of the helices in each bundle model, to account for the consequences of their interaction with the lipid bilayer during the simulation. short ideal helices from met-1 to ile-5 and thr-18 to arg-22 were generated. they were aligned and superimposed with those residues from the bundle models. those residues of the short helices which overlapped with the residues of the bundle models were deleted, and the remaining residues of the short helices were conjoined with the residues of the bundle models. this finally delivered bundle models of 8a. the selected bundle models were then embedded into a pre-equilibrated (70 ns) hydrated popc bilayer [23] . gromacs-3.3 with the gromos96 (ffg45a3) force field was used for the simulations, with an integration step size of 2 fs. the peptide, the lipids and the water molecules were individually coupled to a berendsen thermostat with a coupling time of 0.1 ps. isotropic pressure coupling was used with a coupling time of 1.0 ps and a compressibility of 4.5e-5 bar^-1. long range electrostatics was calculated using the particle-mesh ewald (pme) algorithm with grid dimensions of 0.12 nm and interpolation order 4. lennard-jones and short-range coulomb interactions were cut off at 1.4 and 0.8 nm, respectively. the simulation boxes contained around 26,011 atoms for the tetrameric models (protein 828 atoms, lipids 5824 atoms, water molecules 19,359 atoms), 26,218 atoms for the pentameric model (protein 1035 atoms) and 26,425 atoms for the hexameric models (protein 1242 atoms). [fig. 1 caption: identification of the putative membrane-spanning domain of the 39 amino acid sequence of orf8a using secondary structure prediction tools (a); amino acids chosen to model a helical tmd are shown in italics. orf8a 2-20 ideal helix, including the bold residues, shown in its 'gaussian contact' representation (moe) (b); hydrophilic and hydrophobic residues are highlighted in black and light gray, respectively.] calculations of the root mean square deviations (rmsd) were based on the cα atoms. the simulations were run on a dell precision 490n workstation and a 28-core opteron based compute cluster with infiniband interconnects. all pictures were generated using xmgrace, vmd-1.8.6 and moe-2007.09. application of the tm prediction tools suggests a range of amino acids within the n-terminal part of the protein (fig. 1a). the longest stretch is suggested by sosui, from lys-2 to ala-28, and the shortest by das (val-5 to thr-18). all other programs place the start of the helix between residues lys-2 and ile-4 and the end between thr-18 and ala-28. for the experimental and computational investigations a consensus length of 18 residues, from leu-3 to val-20, was chosen. channel recordings of sars orf8a at a kcl concentration of 300 mm at 38.5 °c show frequent openings (fig. 2a). the temperature was chosen to simulate the elevated body temperature occurring during fever. the lipid composition and its individual components have proven successful in previous investigations [25] . dopc is also successfully used in bilayer recordings [26] and is known to reduce packing density and enhance the integration of the peptide into the bilayers [27] . the conductance histograms calculated from the recorded data reveal a major conductance state of 8.8 ± 0.8 ps (fig. 2b) and an ohmic behavior (fig. 2c). experiments were also conducted in the presence of dtt (fig. s1a), since the sequence of the peptide contains four cysteines (fig. 1a).
the overall behavior of the collected data remains the same as for the experiments without dtt, with a mean conductance of 8.9 ± 0.1 ps (fig. s1b) and an ohmic channel behavior (fig. s1c). common to all data is an increased open duration of the channels at negative potentials. the ion selectivity of orf8a was evaluated with an asymmetric buffer concentration (ten-fold in the trans chamber), which, independent of the presence of dtt, shifts the potassium reversal potential to approximately 30-40 mv (figs. 2d and s1d). the shift is indicative of a weakly cation-selective channel. the data show a mean conductance of 8.3 ± 0.1 ps (fig. 2e), and 8.6 ± 0.1 ps in the presence of dtt (fig. s1e), with ohmic behavior (figs. 2f and s1f). the idealized monomeric tm helix based on the consensus sequence leu-3 to val-20 (fig. 1a) shows clustering of hydrophilic residues (thr-8, ser-11, ser-14 and thr-18) on one side, suggesting that the four hydrophilic amino acids form the lumen of the pore in a homooligomeric helical bundle channel model. the four cysteines form a diagonal arrangement, excluding any possibility of inter-helical cys-bridges (fig. 1b). the generation of a series of assembled pore models follows a protocol reported earlier [23, 24] . a single ideal helix (8a 3-20) is embedded in a hydrated lipid bilayer and undergoes a 10 ns md simulation. an averaged helix from this simulation is generated based on a pca analysis. calculating the eigenvectors quantifies the magnitude of the functional motions of the protein. averaging over all structures along the first few eigenvectors delivers the most likely structure of the helix. this structure is then used to build the bundle models.
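as a cross-check on the selectivity argument, the nernst potential of a perfectly k+-selective channel under the tenfold kcl gradient (300 mm vs 30 mm) at 38.5 °c is about 62 mv; the observed shift of only 30-40 mv therefore falls well short of the ideal value, consistent with weak rather than strict cation selectivity.

```python
import math

R = 8.314          # gas constant, J/(mol K)
F = 96485.0        # faraday constant, C/mol
T = 273.15 + 38.5  # experimental temperature, K

# nernst potential for k+ under a tenfold concentration gradient
e_rev = (R * T / F) * math.log(300.0 / 30.0)
print(e_rev * 1e3)  # ~61.8 mV
```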
the use of the truncated part of 8a, 8a 3-20, is rationalized by the understanding that the hydrophobic part of the tmd guides the assembly. in the current assembly protocol, any unfavorable interactions due to hydrophilic residues at either end of the helix would be overemphasized. with 8a 3-20, residues like lys-2 and arg-22 are not included during assembly. protein-protein interactions are calculated with respect to their potential and their three individual degrees of freedom. with decreasing inter-helical distance the energy values reach a minimum around 9.5 å for the tbms and pbms (fig. 3a). for both models some low energy models can also be observed around 8.5 å. the hbms have favorable energy values for inter-helical distances between 10 å and 11 å (fig. 3a, left column). the lowest energy conformation in a minimum is taken as representative of similar conformations clustering around it. energy versus rotational angle correlation plots show three minima for all models, separated by 150-160° (fig. 3a, middle column). each of the models adopts a minimum energy conformation for a left-handed tilt of around +25° to +35° (fig. 3a, right column). for the right-handed models a minimum is present at −10° and around −35°. common to all bundles are rings of cysteines (cys-13, -15, and -17), threonines (thr-8), serines (ser-11) and arginines (arg-22, for pbm2 and pbm3 only) facing the lumen of the pore (fig. 4). all hydrophilic and polarizable residues are separated by stretches of hydrophobic residues of various lengths. hbm1 and hbm3 exhibit the longest hydrophobic stretches within the lumen. three low energy pore models for each of the oligomers were chosen for another 15 ns md simulation to further minimize the proposed bundle models. the rmsd plots for all models rise during the first nanosecond and level off at around 1 to 3 å (fig. 3b).
in one of the tbms (tbm3) the spread of the rmsd values for the individual helices of the bundle is minimal, indicative of the helices retaining their starting structure (fig. 3b, upper trace). for the pbms a moderate spread is observed (fig. 3b, middle trace), whilst for the hbms the largest spread can be detected (fig. 3b, lower trace). consequently, minor conformational adjustments away from the assembled structure occur. the tilt angles averaged over the last 5 ns of each simulation are in the order of +30° to +40° and −25° to −30° (table 1). the values are almost unchanged with respect to the values found during the assembly process (fig. 3a). the tilts show the lowest deviations for the tbms and pbms. the averaged inter-helical distance of the tbms is between 6 and 7 å (table 1). for the pbms the inter-helical distance is larger, with values of 8 to 11 å. in the hbms the distance is around 9 å, but shows a large standard deviation. visual inspection of all bundles reveals that most of the bundles fail to retain a circular pore assembly, except tbm3, pbm2 and pbm3. the averaged minimum pore radii for the models are calculated (with the program hole) to be (4.4 ± 0.2) å for pbm2 and (3.5 ± 0.4) å for pbm3 (table 1). the minimum pore radius can be located with an accuracy of ±2.3 å for pbm2 and ±2.7 å for pbm3. for tbm3 the pore radius is calculated to be below 1.5 å over the entire length of the pore, which is too small to harbor any water molecules. as a result, water-filled bundles of the tmd of orf8a should adopt approximately a +40° tilt, with a distance between the helices with respect to their long axes of 11 å (fig. 5 for pbm2). with this large tilt, the hydrophilic residues thr-8 and ser-11 as well as cys-15 of one helix form the hydrophilic stretch, whilst ser-14 and thr-18 are positioned at an inter-helical site. lys-2 remains in an outward position, 'snorkeling' towards the aqueous phase and the lipid head groups.
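the tilt reported here is the angle between a helix axis and the membrane normal (z axis). one simple way to estimate it from cα coordinates is via the principal axis of the coordinate cloud; the idealized straight "helix" below is an illustrative geometry, not one of the bundle models.

```python
import numpy as np

def tilt_angle(ca_coords):
    # angle (degrees) between the principal axis of the c-alpha
    # coordinates and the membrane normal (z axis)
    centered = ca_coords - ca_coords.mean(axis=0)
    _, _, vt = np.linalg.svd(centered)
    axis = vt[0]                       # direction of largest extent
    return np.degrees(np.arccos(abs(axis[2])))

# 18 points along a straight line tilted 30 degrees away from z
t = np.linspace(0.0, 25.0, 18)[:, None]
direction = np.array([np.sin(np.radians(30.0)), 0.0, np.cos(np.radians(30.0))])
coords = t * direction
print(tilt_angle(coords))              # ~30.0
```

for a kinked helix this single-axis estimate would smear the two segment tilts together, which is one reason kink angles are reported separately.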
the side chains of all the arginines (arg-22) move from an all-inward-facing position to one in which they point either outside the pore or towards the helix interface, interacting with gln-21 of the neighboring helix. common to all assemblies is also a line of cysteines (cys-9, -13, and -17) in each helix on the outside of the bundle (fig. 5, lower left). the hourglass shape of the water column is shown in fig. 5 (lower right). the model underpins the necessity of a balanced pattern of alternating hydrophilic and hydrophobic residues lining the pore to allow the existence of a continuous water column through the pore. the calculated conductance data of the orf8a protein and its weak selectivity for cations are in accordance with findings for other viral channel forming proteins [25, 28-30]. selectivity has so far been reported for viral proteins with a single (e.g. vpu from hiv-1) or double tm topology (e.g. p7 from hcv). even assuming that selectivity is coupled with flexibility and gating [31] , it may be argued that with the helical topologies of a single, and possibly double, tmd no highly selective channels can be formed via self-assembly. in the protocol for the assembly of the tm helices of 8a, a combination of molecular dynamics simulations with a fine-grained positioning routine is used [23, 24] . the positioning routine moves the helix on the basis of the backbone atoms to a specific position and then generates the side chains for each particular position. in this protocol a vacuum force field (engh-huber [32] ) has been used. in another study, helix assembly undertaken in vacuum was compared with data derived from md simulations in a hydrated lipid bilayer and reported to explore the same conformational space [33] . this makes the use of vacuum conditions, in combination with a fine-grained search of the conformational space, a powerful tool [34] .
applying md simulations after the assembly step aims to account for the specific environment of the bundle structure, such as the phospholipid head group region and the hydrophobic slab, as well as the hydrophilic environment within the lumen of the pore. this protocol stands alongside other protocols used to model tm helix assembly and bundle formation [35] [36] [37] [38] [39] . in some of these protocols a limited number of model structures is generated [35, 36] , while in other methods larger numbers of models are generated through extended simulation protocols and assessed [37, 38] . none of the protocols so far accounts for the kinetics of the assembly of larger units such as bundles or pores. the pore-lining pattern discovered in the model supports models of other viral channel forming proteins which have been derived using more biased model generation protocols in the past [40, 41] . based on experimental findings, hydrophilic residues such as threonine and serine and charged residues such as aspartic and glutamic acids [42] , as well as arginines and lysines, are seen as an essential pore-lining motif for ion channels [43, 44] and viral channel forming proteins [6, 25, 45] . a cysteine residue has been suggested to be at least transiently part of the pore-lining motif of the human concentrative nucleoside transporter 3 [46] . for the transient receptor potential (trp) channels, conserved cysteine residues within the pore region have been reported to be essential for channel function [47] . the ampa receptor also has a cysteine facing the lumen of the pore of the channel [48] . the cysteines on the outside of the bundles may lead to covalent linkages with host cell proteins, or possibly to an attachment to lipid rafts or other membrane-based host cell factors. also, some of the threonines and serines are not solely pore-lining but also help stabilize inter-helical interactions, a feature which has been suggested by modeling of other bundle systems [49] .
the results indicate that the helices in the pentameric bundles are strongly tilted, having the largest inter-helical distance. a straightening would not allow a water-filled column to exist. the lower tilt angles of the tetra- and hexameric bundles match nmr spectroscopic data on equivalent single peptides in equivalent lipids [50, 51] and consecutive md simulations [51] . a series of md simulations on the tmd of vpu in various lipid bilayer systems has shown that there is no simple correlation between tilt angle and lipid thickness [23] . often the tilt shows rather large fluctuations, and a mismatch is compensated by an alternating kink angle. with lys-2 and arg-22 and their flexible and extended side chains [52] at either end of the tmd, a large tilt in the pentameric assembly is by all means possible for the 8a peptide bundle. [table 1 caption: interhelical distances and tilt angles calculated between the centers of mass of each helix based on the cα backbone atoms. values are averaged over the last 5 ns of the 15 ns md simulation runs. the minimum pore radius (mpr) is calculated as an average over six bundle models taken at 10, 11, 12, 13, 14 and 15 ns using the program hole [44] .] computational studies on a model including the extra-membrane parts are necessary to further evaluate the shape of the 'proper' bundle. on the basis of the applied assembly protocol, a pentameric bundle harbors a column of water molecules during the extended equilibration of an md simulation. all other models lose their circular assembly structure, the helices approaching each other as if the pore were 'squeezed'. a similar situation has also been reported for simulations on the tmds of vpu from hiv-1 [53] . the size of the pore in the pbms would allow the physiologically relevant ions to pass. the existence of a continuous water column in a particular bundle model may of course not be the exclusive criterion. functional studies will be the next level of screening.
based on the energy landscape, there is no possibility for the two bundles pbm2 and pbm3 to interchange. transforming pbm2 into pbm3 requires only minor changes in the tilt and in the distance towards the central axis, but the rotational change of almost 180° has to cross a larger barrier (fig. 3a, middle column). further computational functional analysis is needed to discriminate between the models, such as simulating the passage of an ion through each of the pores and evaluating the potential of mean force [41] . at this stage it is suggested that both models would be present if generated under the present experimental conditions. this may reflect the situation in vivo, assuming that the proteins first diffuse freely by themselves for a short while until they finally assemble. in the case of an instant assembly at the site of in vivo production in the endoplasmic reticulum, the available conformational space would possibly be limited and a single model may emerge. the protein 8a encoded by sars-cov has a single tmd and is able to form weakly cation-selective channels, most likely in a pentameric homomeric assembly. computational models of the assembled tmd reveal that a pentameric bundle is most likely to harbor a continuous water column. in such a bundle model, a polarizable cysteine residue and hydrophilic residues such as serines and threonines face the pore. the putative water-filled pentameric bundle of the tmd of 8a should be made of helices positioned at a distance of approximately 11 å with respect to their helical long axes and with a tilt of about +40°. supplementary materials related to this article can be found online at doi:10.1016/j.bbamem.2010.08.004.
sars - beginning to understand a new virus
a novel coronavirus associated with severe acute respiratory syndrome
isolation and characterization of viruses related to the sars coronavirus from animals in southern china
viral ion channels: structure and function
structure-function correlates of vpu, a membrane protein of hiv-1
viral channel forming proteins
membrane composition determines pardaxin's mechanism of bilayer disruption
structure and orientation of pardaxin determined by nmr experiments in model membranes
alamethicin aggregation in lipid membranes
nmr structure of pardaxin, a pore-forming antimicrobial peptide, in lipopolysaccharide micelles. mechanism of outer membrane permeabilization
severe acute respiratory syndrome-associated coronavirus 3a protein forms an ion channel and modulates virus release
membrane protein folding and oligomerization: the two-stage model
membrane protein folding: beyond the two stage model
towards a mechanism of function of the viral ion channel vpu from hiv-1
prediction of transmembrane alpha-helices in prokaryotic membrane proteins: the dense alignment surface method
a hidden markov model for predicting transmembrane helices in protein sequences
tmbase - a database of membrane spanning protein segments
basic charge clusters and prediction of membrane protein topology
sosui: classification and secondary structure prediction system for membrane proteins
exploring the conformational space of vpu from hiv-1: a versatile and adaptable protein
assembly of viral membrane proteins
biophysical characterisation of vpu from hiv-1 suggests a channel-pore dualism
melittin induced voltage-dependent conductance in dopc lipid bilayers
the role played by lipid unsaturation upon the membrane interaction of the helicobacter pylori hp(2-20) antimicrobial peptide analogue hpa3
identification of an ion channel activity of the vpu transmembrane domain and its involvement in the regulation of virus release from hiv-1-infected cells
the vpu protein of human immunodeficiency virus type 1 forms cation-selective ion channels
cation-selective ion channels formed by p7 of hepatitis c virus are blocked by hexamethylene amiloride
activation-dependent subconductance levels in the drk1 k channel suggest a subunit basis for ion permeation and gating
accurate bond and angle parameters for x-ray protein structure refinement
molecular dynamics simulations of the erbb-2 transmembrane domain within an explicit membrane environment: comparison with vacuum simulations
mapping the energy surface of transmembrane helix-helix interactions
experimentally based orientational refinement of membrane protein models: a structure for the influenza a m2 h+ channel
the structure of the hiv-1 vpu ion channel: modelling and simulation studies
membrane assembly of simple helix homo-oligomers studied via molecular dynamics simulations
side-chain contributions to membrane protein structure and stability
solving the membrane protein folding problem
bundles consisting of extended transmembrane segments of vpu from hiv-1: computer simulations and conductance measurements
reconstructing potentials of mean force from short steered molecular dynamics simulations of vpu from hiv-1
structure of a potentially open state of a proton-activated pentameric ligand-gated ion channel
evidence that the m2 membrane-spanning region lines the ion channel pore of the nicotinic receptor
bundles of amphipathic transmembrane α-helices as a structural motif for ion-conducting channel proteins: studies on sodium channels and acetylcholine receptors
alphavirus 6k proteins form ion channels
a proton-mediated conformational shift identifies a mobile pore-lining cysteine residue (cys-561) in human concentrative nucleoside transporter 3
conserved cysteine residues in the pore region are obligatory for human trpm2 channel function
channel-lining residues of the ampa receptor m2 segment: structural environment of the q/r site and identification of the selectivity filter
key: cord-330596-p4k7jexz authors: hu, ji; yan, chenggang; liu, xin; li, zhiyuan; ren, chengwei; zhang, jiyong; peng, dongliang; yang, yi title: an integrated classification model for incremental learning date: 2020-10-21 journal: multimed tools appl doi: 10.1007/s11042-020-10070-w sha: doc_id: 330596 cord_uid: p4k7jexz incremental learning is a particular form of machine learning that enables a model to be modified incrementally, when new data becomes available. in this way, the model can adapt to the new data without the lengthy and time-consuming process required for complete model re-training. however, existing incremental learning methods face two significant problems: 1) noise in the classification sample data, 2) poor accuracy of modern classification algorithms when applied to modern classification problems. in order to deal with these issues, this paper proposes an integrated classification model, known as a pre-trained truncated gradient confidence-weighted (pt-tgcw) model.
since the pre-trained model can extract and transform image information into a feature vector, the integrated model also shows its advantages in the field of image classification. experimental results on ten datasets demonstrate that the proposed method outperforms its original counterparts. classification tasks are widely used in image classification, personal credit evaluation, depiction of user portraits, and so on. these scenarios rely on streaming data to ensure lower latency. under the premise of big data, it is necessary to use efficient incremental learning algorithms to process streaming data in these application scenarios. clearly, these application scenarios require the highest possible accuracy. for these reasons, improving the accuracy of classification algorithms is an important problem [10] that must be addressed. at present, there are two methods to solve this problem, namely batch learning [8] and online learning [19]. in recent years, the volume of newly generated data has grown beyond the size of available memory, and online learning algorithms have attracted attention in the field of machine learning. many online incremental learning [18] algorithms have been proposed, such as the perceptron algorithm [14], the passive aggressive algorithm (pa) [1, 9], the online gradient descent algorithm (ogd) [3], the stochastic gradient descent algorithm (sgd) [2], the truncated gradient algorithm (tg) [3], the weight adaptive regularization algorithm [5, 13], and the confidence-weighted algorithm (cw) [4, 6, 7]. however, some of these, such as the perceptron and the ogd algorithms, effectively add noise to the model during each update, resulting in an inability to converge, or low and unstable convergence efficiency. other algorithms cannot produce a sparse solution while maintaining convergence speed. these problems greatly increase the complexity of prediction and affect the classification accuracy of the algorithm.
in order to deal with the existing problems, this paper proposes an integrated classification model for incremental learning. this method consists of two parts: a pre-trained (pt) model and a novel truncated gradient confidence-weighted online classification model (tgcw). the pre-trained model is obtained by training a neural network on the existing data samples and, following transfer learning theory, is used to extract the feature vectors of all data in order to eliminate the noise from the data itself. then, the feature vectors extracted by the pre-trained model are sent to the novel online classification model for incremental learning; this enables superior classification results. compared with the baseline algorithm, our method combines the existing deep learning and incremental learning methods, making better use of the advantages of neural networks in data feature extraction and the continuous learning ability facilitated by incremental learning. through the combination of these two methods the performance of the original algorithm is greatly improved. experiments have shown, conclusively, that our method has higher classification accuracy and faster convergence speed. the remainder of this paper is organized as follows. section 2 reviews other online learning algorithms related to our work. section 3 proposes the integrated classification learning framework, describing in detail the required steps. section 4 conducts extensive experiments on our proposed algorithm and other state-of-the-art algorithms. section 5 concludes this work. with the development of machine learning theory, many algorithms to deal with classification problems have been developed. most of these algorithms are based on batch learning, which relies on the assumption that all data samples can be obtained before training. the disadvantage of this kind of algorithm is that the learning stops after training on the existing data samples.
for example, a deep neural network (dnn) [15, 25] uses a multilayer artificial neural network between input and output, and finds a mathematical operation that can minimize the loss function by adjusting the weights of the network. a dnn has the ability to capture complex nonlinear behavior, however it is prone to overfitting and it typically requires a significant amount of training data. also, dnns require the training dataset to contain sufficient information so that, after training, they can be applied to previously unseen data while still obtaining reasonable results. in order to cope with the continuous growth of massive data, various online incremental learning algorithms have been proposed, such as active learning [26]. online learning is a continuous training process in which input values are fed into the model in each round of training, and the model outputs prediction results based on the current parameters [16]. if the predicted classification result agrees with the input classification, the model will continue to be used for the next round of input values. on the other hand, if the input classification differs from that predicted by the model, an update will ensue in an attempt to make improved predictions for future data. incremental learning means that a model can acquire new knowledge from new samples, while at the same time retaining most of the previously learned knowledge, similar to the human learning process: people are exposed to new information every day, in a step by step fashion, and will ideally extract the key learnings, combining them with their prior understanding, thus improving their overall knowledge. the perceptron is a linear classification algorithm known as a threshold function, which is shown below. the basic idea of the perceptron algorithm is to find a hyperplane ω^T x + b in the sample space to divide the dataset into two categories.
therefore, a suitable function used to determine the class labels may be formulated as f(x) = sign(ω^T x + b), where ω is a column vector of weight parameters and b is the bias. we can fix b and update the parameter ω; the weight adjustment of the perceptron learning algorithm then proceeds, on a misclassified sample (x_t, y_t), as ω ← ω + η y_t x_t, where η represents the learning rate, usually between 0 and 1. this is the mechanism by which the perceptron algorithm updates the model based on the prediction error. the bias causes the decision boundary to be offset from the origin and does not depend on any input value. if the training set is linearly separable, perceptron convergence can be guaranteed. in addition, there is an upper limit on the number of times the perceptron adjusts its weights during training. in addition to this, a wide range of algorithms based on convex optimization have been proposed, including the pa and ogd algorithms. the core of the pa algorithm is based on the maximum-margin principle of support vector machines: when an incorrectly classified sample is encountered, the classifier is adjusted so that its distance from the previous classifier is as small as possible, while at the same time ensuring the new sample is classified correctly. the online gradient descent algorithm is the application of the offline gradient descent algorithm to an online learning situation. this algorithm is the basis of many online learning algorithms. in each round of training, the algorithm updates the parameters in the direction of the gradient of the loss function, given the existing model and parameters. in 2008, the confidence-weighted algorithm (cw) [6, 20], based on a probability distribution hypothesis over the weights, showed positive results in natural language processing. the cw algorithm imposes a gaussian distribution on the parameter vector while updating on each new training instance, so that the probability of correct classification of the instance under the updated distribution meets the specified confidence.
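the decision function and mistake-driven update just described can be sketched as follows (a minimal illustration; the function and variable names are ours, not the paper's):

```python
import numpy as np

def perceptron_fit(stream, dim, eta=1.0):
    """Online perceptron: predict sign(w.x + b); update only on mistakes."""
    w, b = np.zeros(dim), 0.0
    mistakes = 0
    for x, y in stream:          # y in {-1, +1}
        y_hat = 1.0 if w @ x + b >= 0 else -1.0
        if y_hat != y:           # mistake-driven update: w <- w + eta*y*x
            w += eta * y * x
            b += eta * y
            mistakes += 1
    return w, b, mistakes
```

on a linearly separable stream, the number of such updates is bounded, which is the convergence guarantee mentioned above.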
subsequent to the cw algorithm, in order to generate a sparse solution, a truncated gradient algorithm (tg) [12] was proposed on the basis of convex optimization. its core idea is to control the size of the coefficients via an imposed threshold. in summary, batch learning algorithms cannot solve the problem of incremental data learning, and existing online learning algorithms are faced with problems involving sample noise and low classification accuracy. ultimately, previous researchers have not addressed these issues in a comprehensive manner. in this paper, therefore, we propose several methods in order to solve these important problems. the online learning algorithm may be applied in a wide range of fields; however, the error rate of the online learning algorithm is relatively high under the classification problem due to its optimization strategy. when the parameters are updated during the training process, problems such as low convergence speed and high noise often occur [3]. in order to solve the existing problems, this paper proposes an integrated classifier model for incremental learning, as shown in fig. 1. after the data samples are input, the neural network is trained to obtain a pre-trained model, then the obtained pre-trained model is used to process the subsequent input data to obtain feature vectors. next, the feature vectors are input into the tgcw classifier to obtain the final classification result. the integrated classification model consists of a pre-trained model and a tgcw classification model. the pre-trained model includes a dnn training module as well as a cnn training module [21]. the dnn training module and the cnn training module are two completely independent models. these two sub-modules are used to process vector data and matrix data, respectively. the type of data samples determines which module is selected for training. if the given data samples are vectors, the vector samples are input into the dnn module for training.
if the given data sample is an image, the matrix is input into the cnn module for training. after the pre-trained model is obtained, the remaining data is input into the pre-trained model. it generates a set of feature vectors, and the classification model uses these feature vectors to generate prediction labels. if the predicted label is different from the real label, the classification model parameters will be updated; otherwise, the model parameters will remain unchanged. it should be pointed out that automatic data type detection is not implemented in the input layer of our model. in our later experiment section, the datasets are manually classified and entered into the model separately. however, such an automatic detection feature could be easily realized. here we provide an operational idea: a certain amount of data can be randomly selected from the dataset to be processed, and input into the vector pre-training module and matrix pre-training module, respectively. then, using cross validation, a comparison of the performance of the two models may be carried out, with the model with superior performance then chosen for processing the complete dataset. this method is feasible because the better-performing module indicates whether a vector or a matrix representation better describes the internal structure of the dataset. cross validation can also be replaced by other validation methods that are more suitable for the current data. for vector data, the pre-trained model uses a dnn without a convolution layer for training. the network uses three fully connected layers, with 64 nodes in the first two fully connected layers, as shown in fig. 2. when the data is in vector format, a deep neural network without a convolution layer is used for pre-training. in the pre-training step, three fully-connected layers are used to learn the characteristics of the data, and the parameters of the neurons in each layer are updated through back propagation to retain the information in the vector.
the activation function uses the relu function, and the softmax function is used as the activation function in the third fully-connected layer. (fig. 2: vector pre-trained model.) at the same time, the dropout layer is used between two adjacent fully-connected layers, and each time the entire network is trained, the dropout layer will discard each node with a certain probability. for matrix data, the pre-trained model uses a cnn for training [17], as shown in fig. 3. there are five convolution layers. each convolution layer uses a 3 × 3 convolution kernel and 32 nodes. at the same time, the batch normalization layer is used to normalize the data after each convolution to speed up the search for the optimal solution of gradient descent. after the first, third and fifth convolution layers, a maxpooling layer and a dropout layer are added. the pool size of the maxpooling layer is 2 × 2, and the step size is 2. two fully-connected layers are used after the convolution layers. the first fully-connected layer has 1024 nodes. the activation function uses the relu function. after the fully-connected layer, the batch normalization layer is used to normalize the data. then the dropout layer is used to avoid overfitting. when inputting image data, convolution and max pooling are used to extract and represent the image data in the deep pre-training part containing the convolution neural network. after several convolutions, the obtained feature information is output to a fully-connected layer of 1024 nodes to obtain the image feature vector.
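a minimal numpy sketch of the forward pass of the vector pre-trained model described above (two 64-node relu layers feeding a softmax head, with the second relu layer's activations taken as the feature vector); the random weights stand in for trained parameters, and all names are ours:

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0.0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def feature_extractor(x, dims=(64, 64), n_classes=2):
    """Forward pass of the vector pre-trained model: the hidden relu
    activations serve as the feature vector fed to the tgcw classifier.
    Weights here are random placeholders for trained parameters."""
    h = np.asarray(x, dtype=float)
    for d in dims:                          # hidden relu layers
        W = rng.normal(0.0, 0.1, (h.shape[-1], d))
        h = relu(h @ W)
    W_out = rng.normal(0.0, 0.1, (h.shape[-1], n_classes))
    return h, softmax(h @ W_out)            # (features, class probabilities)
```

in the integrated model only the feature vector (the first return value) is passed on to the online classifier; the softmax head is used during pre-training.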
when there is noise in the streaming data, the cw algorithm will greatly modify the parameters, resulting in a decrease in accuracy. when used for binary classification, it will not obtain satisfactory performance. (2) the sparsity of the model parameters is poor, which will lead to low interpretability and performance degradation, at least to some extent. to overcome these disadvantages, we have considered two adjustments: (1) similar to the pa algorithm and the scw learning process, we introduce a parameter c to control the aggressiveness versus passiveness [11] of the learning algorithm; (2) streaming data makes feature selection difficult (fig. 3: matrix pre-trained model), so we introduce the tg algorithm to truncate coefficients that are smaller than the threshold θ to reduce the dimension of the streaming data. below we will elaborate our method. we assume that the coefficient follows a gaussian distribution with the mean vector μ_t and the covariance matrix Σ_t, as per the cw method, and allow an immediate update of the variables μ_{t+1} and Σ_{t+1}. to deal with the incremental streaming data, we apply the cost-sensitive strategy, where c_cs is the cost assigned to the suffered loss. the quantity ℓ_ϕ is the loss that measures the probability of misclassification based on the calculated gaussian distribution; the kl divergence measures the difference between the two distributions. online learning requires the sequential arrival of the streaming data, which cannot be reused. this constraint makes it difficult to perform feature selection. truncated gradient (tg) is proposed to select the discriminative features. after the cost-sensitive cw update at each iteration, we apply tg to select the most active features. specifically, each coefficient w_j is truncated as w_j ← max(0, w_j − γg) if w_j ∈ (0, θ]; w_j ← min(0, w_j + γg) if w_j ∈ [−θ, 0); and w_j is left unchanged otherwise, where γ is the learning rate, g is the gravity parameter, and θ is the threshold. x_t and y_t represent the sample value and label of the current iteration round, respectively.
this tg operation can be performed every k online steps. because of the feature extraction in the pre-trained part, the performance of the whole model is greatly improved. in addition, we also find that the robustness of the model has been significantly improved at the same time, for the following reasons. (1) the convolution layers and the pooling layers in the cnn can filter at least some of the noise. (2) the cnn's local connectivity and shared weights strategy can also weaken the influence of the noise. if the classification accuracy needs to be further improved, a convolution attention network can be used for preprocessing [22]. (3) for the dnn network for vector samples, the densely connected network structure of the dnn can improve the robustness of the whole model, which is more reliable than a model without a pre-trained layer. (4) the dnn network can extract features from the datasets and then pass the feature vectors to the tgcw model, which helps weaken the direct impact of noise on the model parameters. based on the above four points, we believe that the pre-trained model has significant robustness, which is also confirmed by subsequent experiments. details of our experiment are given in section 4. the main task of online classification is to minimize cumulative errors to obtain higher accuracy. therefore, in order to test the performance of our proposed integrated classification model, we carefully selected five datasets from different fields from the uc irvine machine learning repository and the keel dataset repository. glass1, glass2, glass5, ecoli3 and haberman are all well-known, classically used test datasets. the glass datasets contain attributes about several glass types that can be used in criminological investigations.
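the truncation operation above can be sketched as follows, using the γ (learning rate), g (gravity) and θ (threshold) parameters named in the text; this follows the standard truncated gradient operator, and the function name is ours:

```python
import numpy as np

def truncate(w, gamma, g, theta):
    """Truncated gradient operator: shrink coefficients whose magnitude
    is at most theta toward zero by gamma*g, clipping at zero; larger
    coefficients are left untouched. This is what yields sparsity."""
    w = np.asarray(w, dtype=float).copy()
    small_pos = (w > 0) & (w <= theta)
    small_neg = (w < 0) & (w >= -theta)
    w[small_pos] = np.maximum(0.0, w[small_pos] - gamma * g)
    w[small_neg] = np.minimum(0.0, w[small_neg] + gamma * g)
    return w
```

applied every k online steps after the cost-sensitive cw update, coefficients that stay small are gradually driven to exactly zero, producing a sparse weight vector.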
the ecoli data contains protein localization sites and the haberman dataset contains cases from a study that was conducted between 1958 and 1970 at the university of chicago's billings hospital on the survival of patients who had undergone surgery for breast cancer. in addition, the masked_faces dataset was collected by yin xiangzhi and others. it contains 2858 pictures of faces wearing masks and 1221 pictures of faces not wearing masks. the faces in the dataset have different directions and occlusion degrees. the dogs vs. cats dataset comes from the kaggle dogs vs. cats competition dataset, which includes 12,500 pictures of cats and 12,500 pictures of dogs. the color, age, and position of the cats and dogs in the pictures are different. the animals10 dataset comes from the kaggle animals-10 competition dataset. this dataset contains images of 10 kinds of animals, including dog, cat, horse, spider, butterfly, chicken, sheep, cow, squirrel and elephant. we choose horses and elephants from the animals10 dataset as the dataset used in the experiment. this part of the dataset contains 2623 images of horses and 1436 images of elephants. the mnist database is a large database of handwritten digits that is commonly used for training various image processing systems. we selected 4132 images of "0" and 4684 images of "1" from the mnist dataset for experiments. the cifar-10 dataset contains 10 classes of images. table 1 shows the specifications of the 10 datasets (fig. 4). accuracy is the most important metric for evaluating the model performance in the classification problem. to show the efficiency of our online learning algorithm, we compare our method with five kinds of state-of-the-art online learning algorithms, including perceptron, ogd, pa, cw and iellip [23, 24]. perceptron is a classifier that uses hyperplanes for binary classification, and uses new data instances, predictions, comparisons, and updates each time to adjust the location of the hyperplanes.
the learning process of pa is basically the same as that of the perceptron. when the weight is revised, a step-size parameter τ_t is added. when the prediction is correct, there is no need to adjust the weight. when the prediction is wrong, the weight is actively adjusted. each learning parameter of the cw method has a degree of trust. parameters with a small degree of trust should be trained, so more frequent opportunities for modification occur. the degree of confidence is expressed by the gaussian distribution of the parameter vector. ogd uses the obtained data to perform a gradient descent every time, and updates the parameters according to the result of the gradient descent. the key idea of iellip is to approximate the classification hypotheses by an ellipsoid. in addition to the classical ellipsoid method, an improved version for online learning is also presented. we carried out the experiments by first obtaining the optimal parameters for each algorithm on each dataset, then applied each algorithm 10 times using these optimal parameters on each dataset, each time with a randomly permuted sequence. all results are reported by averaging over these 10 runs. there are three performance metrics which are used to evaluate the performance of an online learning algorithm: (1) online cumulative mistake rate, (2) number of updates (which would be closely related to the potential number of support vectors in the kernel extension), and (3) the cost of running time. table 2 lists the results of our empirical evaluation of the cumulative performance of the proposed tgcw and other algorithms on five classes of datasets, where we show the six kinds of online learning algorithms. the bold elements indicate the best performance on the different datasets. we can draw several observations as follows. first, in terms of the overall mistake rate, our proposed tgcw outperforms the other online learning algorithms on all five datasets.
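the evaluation protocol described above (10 runs over randomly permuted sequences, averaging the online cumulative mistake rate and the number of updates) might be sketched as follows; the generic `learner_update` hook and all names are ours, shown here with a perceptron-style mistake-driven update as the plug-in:

```python
import numpy as np

def evaluate_online(learner_update, X, y, n_runs=10, seed=0):
    """Average the online cumulative mistake rate and update count of a
    mistake-driven linear learner over n_runs random permutations."""
    rng = np.random.default_rng(seed)
    rates, updates = [], []
    for _ in range(n_runs):
        order = rng.permutation(len(y))
        w = np.zeros(X.shape[1])
        mistakes = n_updates = 0
        for i in order:
            if np.sign(w @ X[i]) != y[i]:   # count an online mistake
                mistakes += 1
                w = learner_update(w, X[i], y[i])
                n_updates += 1
        rates.append(mistakes / len(y))
        updates.append(n_updates)
    return float(np.mean(rates)), float(np.mean(updates))
```

swapping in the update rules of pa, ogd, cw or tgcw under the same harness is what makes the mistake-rate and update-count columns of table 2 comparable.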
specifically, compared to the strongest baselines, tgcw decreases the mistake rate by 9.1%, 26.0%, 2.7%, 40.6% and 20.3% on the glass1, haberman, ecoli3, glass2 and glass5 datasets, respectively. second, by examining the number of updates, we found that the tgcw online learning algorithm outperforms the other methods. when the number of samples is large, the algorithm has a smaller update rate. our proposed tgcw algorithm improves on the cw algorithm, so our method inherits this excellent feature. moreover, among all the compared algorithms, tgcw often achieves the best, or close to the best, performance in terms of accuracy and number of updates. finally, fig. 5 shows the online results of the six kinds of algorithms on the five datasets of varied sizes during the online learning process. the results again validate the advantages of tgcw in both efficacy and efficiency among all the state-of-the-art algorithms. in order to verify the performance of the integrated classifier, we designed the experimental steps and flow as shown in fig. 6. first, 10% of the dataset is randomly selected as data samples and input into the neural network for training to obtain a pre-trained model; the pre-trained model is then used to process the remaining data to extract the feature vectors of the data. next, the feature vectors are input into the tgcw online classifier for online incremental learning, with 20 training rounds performed in the classifier, and the average value of the results from these 20 training rounds is taken as the final result. in addition, the remaining data is entered into the original tgcw classifier for online learning to obtain the classification result, again taking the average of 20 experimental results, then comparing the performance of the integrated classifier and the original classifier. the experimental results, as shown in the blue part of fig. 7, show that the performance of the integrated classifier is greatly improved compared to the original classifier.
the integrated classifier's use of the data features extracted by the pre-trained model is highly effective in reducing data noise and speeding up the parameter update process. on ten datasets, the error rate of classification decreased significantly. at the same time, it can be seen that the integrated classifier shows a larger improvement on the dogs vs. cats dataset, which indicates that the pre-trained model has a superior improvement effect in the face of relatively complex data. by comparing the performance on the ten datasets, we can find that the performance of the integrated classifier is greatly improved over that of the original classifier. then, we randomly added gaussian noise with a mean of 0, 0.3 or 0.5 and a variance of 0.1, 0.3 or 0.5 to 10% of the data in the original dataset. next, the tgcw classifiers and integrated classifiers are used for classification. the result is shown as the red part of fig. 7, and the experimental results show that the integrated classifier still has good classification performance after adding noise, which shows that the integrated classifier has advantages over the traditional classifier in dealing with noisy data. the change in the models before and after adding noise, and the comparison between the models, can also be seen clearly from fig. 7.
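the noise-injection step of the robustness experiment (gaussian noise with a chosen mean and variance added to a random 10% of the samples) could be sketched as follows; the function name and defaults are ours:

```python
import numpy as np

def add_noise(X, frac=0.1, mean=0.0, var=0.3, seed=0):
    """Corrupt a random fraction of the samples with gaussian noise of
    the given mean and variance, leaving the remaining samples intact."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float).copy()
    idx = rng.choice(len(X), size=int(len(X) * frac), replace=False)
    X[idx] += rng.normal(mean, np.sqrt(var), X[idx].shape)
    return X
```

running the same classifiers on the clean and corrupted versions of each dataset is what produces the blue and red parts of fig. 7, respectively.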
at the same time, the results on the noise-added datasets show that after the introduction of noise the average increase in the mistake rate of the integrated classifier is about 1.5%, which is much lower than that of the original classifier (approx. 6%). hence, the integrated classifier has advantages in processing noisy data, and this also shows that our integrated classifier can have superior robustness and generalization while maintaining the convergence speed. the dense connection structure and the convolution and pooling layers of the neural network improve the robustness of the model. our integrated classifier still has many shortcomings. at present, the ability of the integrated classifier to process complex image data is still poor. the accuracy on some image data is less than 50% and the training overhead is relatively large. our integrated classifier is still limited to binary classification problems, which restricts its application in many practical situations. future work will focus on extending our integrated classifier to multi-classification problems. in addition, we will also look for improved pre-trained models or use more classifiers for integrated learning to improve the classification accuracy on complex data.
online learning versus offline learning
stochastic gradient descent tricks
online passive-aggressive algorithms
multi-class confidence weighted algorithms
adaptive regularization of weight vectors
confidence-weighted linear classification
efficient online and batch learning using forward backward splitting
large margin classification using the perceptron algorithm
a note on the utility of incremental learning
truncated gradient confidence-weighted based online learning for imbalance streaming data
sparse online learning via truncated gradient
new adaptive algorithms for online classification
the perceptron: a probabilistic model for information storage and organization in the brain
deep learning in neural networks: an overview
online learning and online convex optimization
very deep convolution networks for large-scale image recognition
incremental learning with support vector machines
cost-sensitive online classification
soft confidence-weighted learning
automated pulmonary nodule detection in ct images using deep convolution neural networks
convolution attention networks for scene text recognition
learning with kernels
online learning by ellipsoid method
how transferable are features in deep neural networks? international conference on neural information processing systems
state-relabeling adversarial active learning
key: cord-288183-pz3t29a7 authors: mckibbin, warwick j.; wilcoxen, peter j.
title: chapter 15 a global approach to energy and the environment the g-cubed model date: 2013-12-31 journal: handbook of computable general equilibrium modeling doi: 10.1016/b978-0-444-59568-3.00015-8 sha: doc_id: 288183 cord_uid: pz3t29a7 abstract g-cubed is a multi-country, multi-sector, intertemporal general equilibrium model that has been used to study a variety of policies in the areas of environmental regulation, tax reform, monetary and fiscal policy, and international trade. it is designed to bridge the gaps between three areas of research (econometric general equilibrium modeling, international trade theory, and modern macroeconomics) by incorporating the best features of each. this chapter describes the theoretical and empirical structure of the model, summarizes its applications and contributions to the literature, and discusses two example applications in detail. from the trade literature, g-cubed takes the approach of modeling the world economy as a set of autonomous regions (12 in the version used in this paper) interacting through bilateral trade flows. 3 following the armington approach (armington, 1969), goods produced in different regions are treated as imperfect substitutes. 4 unlike most trade models, however, g-cubed distinguishes between financial and physical capital. financial capital is perfectly mobile between sectors and from one region to another, and is driven by forward-looking investors who respond to arbitrage opportunities. physical capital, in contrast, is perfectly immobile once it has been installed: it cannot be moved from one sector to another or from one region to another. in addition, intertemporal budget constraints are imposed on each region: all trade deficits must eventually be repaid by future trade surpluses. drawing on the general equilibrium literature, g-cubed represents each region by its own multisector econometric general equilibrium model.
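under the armington assumption, region-specific varieties enter demand through a ces aggregator; a small illustrative sketch (the share and elasticity values below are hypothetical, not the model's estimated parameters):

```python
import numpy as np

def armington_composite(quantities, shares, sigma):
    """CES composite of region-specific varieties:
    X = (sum_i d_i * M_i**rho) ** (1/rho), with rho = (sigma-1)/sigma,
    where sigma (!= 1) is the elasticity of substitution between the
    varieties; sigma -> infinity approaches perfect substitutes."""
    rho = (sigma - 1.0) / sigma
    m = np.asarray(quantities, dtype=float)
    d = np.asarray(shares, dtype=float)
    return (d * m**rho).sum() ** (1.0 / rho)
```

a finite sigma is what makes goods from different regions imperfect substitutes: raising one region's import quantity increases the composite, but with diminishing effect relative to a perfect-substitutes aggregate.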
5 production is broken down into n industries and each is represented by an econometrically estimated cost function. unlike many general equilibrium models, however, g-cubed draws on macroeconomic theory by representing saving and investment as the result of forward-looking intertemporal optimization. households maximize an intertemporal utility function subject to a lifetime budget constraint, which determines the level of saving, and firms choose investment to maximize the stock market value of their equity. 6 finally, g-cubed also draws on the macroeconomic literature by representing international capital flows as the result of intertemporal optimization, and by including liquidity-constrained agents, a transactions-based money demand equation and slow nominal wage adjustment. unlike typical macro models, however, g-cubed has substantial sector detail and many of its parameters are determined by estimation rather than calibration. this combination of features was chosen to make g-cubed versatile. industry detail allows the model to be used to examine environmental and tax policies, which tend to have their largest direct effects on small segments of the economy. intertemporal modeling of investment and saving allows g-cubed to trace out the transition of the economy between the short run and the long run. slow wage adjustment and liquidity-constrained agents improve the empirical accuracy with which the model captures the transition. overall, the model is designed to provide a bridge between computable general equilibrium models, international trade models and macroeconomic models by combining key features of each approach. the cost of this versatility is that g-cubed is a fairly large model. it has over 10,000 equations holding in each year, is typically solved annually for 100 years in each simulation, and has over 100 intertemporal costate variables. nonetheless, it can be solved using software developed for a personal computer.
the key features of g-cubed are summarized in table 15.1. there are several different versions of g-cubed that have been developed, depending on the question being analyzed. versions have been built with two sectors (macroeconomic issues), six sectors (trade and growth issues), 12 sectors (energy and environmental issues), 21 sectors (india) and 57 sectors (australia). there are also a large number of different country disaggregations. the key features are:

1. all versions of g-cubed are global: each represents the economic activity of all countries in the world, either modeled individually or aggregated into regions. developed and developing countries are modeled in detail, including all trade and financial links between countries.

2. a full menu of financial assets is included and the valuations of those assets are driven by the real economy.

3. international flows of financial capital are modeled. an important distinction is made between the stickiness of physical capital within sectors and within countries and the flexibility of financial capital, which quickly flows to where expected returns are highest. this distinction leads to a critical difference between the quantity of physical capital that is available at any time to produce goods and services, and the valuation of that capital as a result of decisions about the allocation of financial capital.

4. households and firms are represented as mixtures of two types of agents: one group which bases its decisions on forward-looking expectations and a second group which follows simpler rules of thumb that are optimal in the long run, but not necessarily in the short run.

5. the model allows for short-run wage rigidity (varying in degree across countries) and therefore allows for significant periods of unemployment depending on the labor market institutions in each country.
this assumption, when taken together with the explicit modeling of money and other financial assets, gives the model more realistic macroeconomic properties than conventional general equilibrium models. the most frequently used model and the version most relevant for environmental and energy questions is the 12-sector model. in this paper we will focus on the structure and specification of this version of g-cubed. within each region, production is disaggregated into 12 sectors: five energy sectors (electric utilities, natural gas utilities, petroleum refining, coal mining, and crude oil and gas extraction) and seven non-energy sectors (mining, agriculture, forestry and wood products, durable goods, non-durable goods, transportation and services). this disaggregation, summarized in table 15.2, enables us to capture the sector-level differences in the impact of alternative environmental policies. each economy or region in the model consists of several economic agents: households, the government, the financial sector and the 12 production sectors listed above. we now present an overview of the theoretical structure of the model by describing the decisions facing these agents. to keep our notation as simple as possible we have not subscripted variables by country except where needed for clarity. throughout the discussion all quantity variables will be normalized by the economy's endowment of effective labor units. thus, the model's long-run steady state will represent an economy in a balanced growth equilibrium.
regions (codes shown in parentheses):
1. us (u or usa)
2. japan (j or jpn)
3. australia (a or aus)
4. western europe (e or euw)
5. rest of the oecd (o or oec)
6. china (c or chi)
7. other developing countries (l or ldc)
8. eastern europe and the former soviet union (b or eeb)
9. oil exporting countries and the middle east (p or opc)

sectors:
1. electric utilities
2. gas utilities
3. petroleum refining
4. coal mining
5. crude oil and gas extraction
6. other mining
7. agriculture
8. forestry and wood products
9. durable goods
10. non-durables
11. transportation
12. services

we assume that each of the 12 sectors can be represented by a price-taking firm, which chooses variable inputs and its level of investment in order to maximize its stock market value. each firm's production technology is represented by a tier-structured constant elasticity of substitution (ces) function. at the top tier, output is a function of capital, labor, energy and materials:

$$q_i = a_i^o \left( \sum_{j \in \{k,l,e,m\}} (\delta_{ij}^o)^{1/\sigma_i^o}\, x_{ij}^{(\sigma_i^o - 1)/\sigma_i^o} \right)^{\sigma_i^o/(\sigma_i^o - 1)}, \quad (15.1)$$

where $q_i$ is the output of industry i, $x_{ij}$ is industry i's use of input j, and the $\delta_{ij}$ parameters reflect the weights of different inputs in production; the superscript 'o' indicates that the parameters apply to the top, or 'output', tier. without loss of generality, we constrain the $\delta$s to sum to one. at the second tier, inputs of energy and materials, $x_{ie}$ and $x_{im}$, are themselves ces aggregates of goods and services. energy is an aggregate of goods 1-5 (electricity through crude oil) and materials is an aggregate of goods 6-12 (mining through services). the functional form used for these tiers is identical to (15.1) except that the parameters of the energy tier are $a_i^e$, $\delta_{ij}^e$ and $\sigma_i^e$, and those of the materials tier are $a_i^m$, $\delta_{ij}^m$ and $\sigma_i^m$. the goods and services purchased by firms are, in turn, aggregates of imported and domestic commodities, which are taken to be imperfect substitutes. we assume that all agents in the economy have identical preferences over foreign and domestic varieties of each commodity.
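the nested ces structure described in the text can be sketched numerically. the snippet below is an illustration under assumptions, not g-cubed code: the helper name `ces`, all parameter values, and the exponent convention (weights entering as $\delta^{1/\sigma}$ with the $\delta$s summing to one) are choices made here for the sketch.

```python
import numpy as np

def ces(inputs, deltas, A, sigma):
    """ces aggregate: A * (sum_j delta_j**(1/sigma) * x_j**((sigma-1)/sigma))**(sigma/(sigma-1)).
    sigma is the elasticity of substitution (sigma != 1 here; sigma -> 1 is the
    cobb-douglas limit)."""
    x = np.asarray(inputs, dtype=float)
    d = np.asarray(deltas, dtype=float)
    rho = (sigma - 1.0) / sigma
    return A * np.sum(d ** (1.0 / sigma) * x ** rho) ** (1.0 / rho)

# second tier: energy (goods 1-5) and materials (goods 6-12) aggregates
energy = ces([3.0, 1.0, 2.0, 1.5, 0.5], [0.3, 0.1, 0.3, 0.2, 0.1], A=1.0, sigma=0.8)
materials = ces([2.0, 4.0, 1.0, 3.0, 2.5, 5.0, 1.0],
                [0.2, 0.2, 0.1, 0.2, 0.1, 0.1, 0.1], A=1.0, sigma=0.6)
# top tier: output from capital, labor, and the two second-tier aggregates
output = ces([10.0, 20.0, energy, materials], [0.3, 0.4, 0.1, 0.2], A=1.2, sigma=0.9)
```

the aggregator is homogeneous of degree one in its inputs, which is what makes the tier-by-tier nesting consistent with constant returns to scale.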
we represent these preferences by defining 12 composite commodities that are produced from imported and domestic goods. each of these commodities, $y_i$, is a ces function of domestic output, $q_i$, and imported goods, $m_i$. 7 for example, the petroleum products purchased by agents in the model are a composite of imported and domestic petroleum. by constraining all agents in the model to have the same preferences over the origin of goods we require that, for example, the agricultural and service sectors have identical preferences over domestic oil and oil imported from the middle east. 8 this accords with the input-output data we use and allows a very convenient nesting of production, investment and consumption decisions. finally, the production function includes one additional feature to allow the model to be used to examine the effects of emissions quotas or tradable permit systems: each input is used in fixed proportions to the use of an input-specific permit. the permits are owned by households and included in household wealth. permit prices are determined endogenously by a competitive market for each type of permit. to run simulations without a permit system, the supply of permits can be set large enough so that the price of a permit goes to zero. in each sector the capital stock changes according to the rate of fixed capital formation ($j_i$) and the rate of geometric depreciation ($\delta_i$):

$$\frac{dk_i}{dt} = j_i - \delta_i k_i. \quad (15.2)$$

following the cost of adjustment models of lucas (1967), treadway (1969) and uzawa (1969), we assume that the investment process is subject to rising marginal costs of installation. to formalize this we adopt uzawa's approach by assuming that in order to install $j$ units of capital a firm must buy a larger quantity, $i$, that depends on its rate of investment ($j/k$):

$$i_i = j_i \left(1 + \frac{\phi_i}{2} \frac{j_i}{k_i}\right), \quad (15.3)$$

where $\phi$ is a non-negative parameter. the difference between $j$ and $i$ may be interpreted various ways; we will view it as installation services provided by the capital-goods vendor.
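the accumulation rule and the uzawa-style installation cost described above can be sketched as follows. the quadratic form of the cost term and the discrete euler step are assumptions made for this illustration; the function names and numbers are hypothetical.

```python
def purchases_required(j, k, phi):
    """capital goods purchased to install j units: i = j * (1 + (phi/2) * (j/k)).
    phi >= 0 is the adjustment-cost parameter; phi = 0 means installation is free."""
    return j * (1.0 + 0.5 * phi * j / k)

def capital_next(k, j, delta, dt=1.0):
    """one euler step of the accumulation equation dk/dt = j - delta * k."""
    return k + dt * (j - delta * k)

k, j = 100.0, 10.0
i_spent = purchases_required(j, k, phi=0.5)   # exceeds j whenever phi > 0
k_new = capital_next(k, j, delta=0.05)
```

note that `i_spent` is strictly greater than `j` for any positive `phi`, which is exactly the "larger quantity" the vendor must supply as installation services.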
the goal of each firm is to choose its investment and inputs of labor, materials and energy to maximize intertemporal risk-adjusted net-of-tax profits. for analytical tractability, we assume that this problem is deterministic (equivalently, the firm could be assumed to believe its estimates of future variables with subjective certainty). thus, the firm will maximize: 9

$$\int_t^\infty (1-\tau_2)\,\pi_i\, e^{-(r(s)+\mu_{ei}-n)(s-t)}\, ds, \quad (15.4)$$

where $\mu_{ei}$ is a sector- and region-specific equity risk premium, $\tau_2$ is the effective tax rate on capital income, and variables are implicitly subscripted by time. the firm's profits, $\pi_i$, are given by:

$$\pi_i = p_i^*\, q_i - w\, l_i - p_i^e\, x_{ie} - p_i^m\, x_{im} - (1-\tau_4)\, p^I i_i, \quad (15.5)$$

where $\tau_4$ is an investment tax credit and $p_i^*$ is the producer price of the firm's output. $r(s)$ is the long-term interest rate between periods t and s:

$$r(s) = \frac{1}{s-t} \int_t^s i(v)\, dv. \quad (15.6)$$

as all real variables are normalized by the economy's endowment of effective labor units, profits are discounted adjusting for the rate of growth of population plus productivity growth, n. solving the top-tier optimization problem gives the equations characterizing the firm's behavior, where $\lambda_i$ is the shadow value of an additional unit of investment in industry i: equation (15.7) gives the firm's factor demands for labor, energy and materials, and equations (15.8) and (15.9) describe the optimal evolution of the capital stock. by integrating (15.9) along the optimum path of capital accumulation, it is straightforward to show that $\lambda_i$ is the increment to the value of the firm from a unit increase in its investment at time t. it is related to q, the after-tax marginal version of tobin's q (abel, 1979), as in (15.12). in order to capture the inertia often observed in empirical investment studies we assume that only a fraction $\alpha_2$ of firms making investment decisions use the fully forward-looking tobin's q described above. the remaining $(1-\alpha_2)$ use a slowly adjusting version, $\bar{q}$, driven by a partial adjustment model.
in each period, the gap between $\bar{q}$ and q closes by fraction $\alpha_3$:

$$\frac{d\bar{q}_i}{dt} = \alpha_3\,(q_i - \bar{q}_i). \quad (15.13)$$

as a result, we modify (15.12) by writing $i_i$ as a function not only of q, but also of the slowly adjusting $\bar{q}$ (15.14). this creates inertia in private investment, which improves the model's ability to mimic historical data and is consistent with the existence of firms that are unable to borrow. the weight on unconstrained behavior, $\alpha_2$, is taken to be 0.3 based on a range of empirical estimates reported by mckibbin and sachs (1991). so far we have described the demand for investment goods by each sector. investment goods are supplied, in turn, by a 13th industry that combines capital, labor and the outputs of other industries to produce raw capital goods. we assume that this firm faces an optimization problem identical to those of the other 12 industries: it has a nested ces production function, uses inputs of capital, labor, energy and materials in the top tier, incurs adjustment costs when changing its capital stock, and earns zero profits. the key difference between it and the other sectors is that we use the investment column of the input-output table to estimate its production parameters. households have three distinct activities in the model: they supply labor, they save, and they consume goods and services. within each region we assume household behavior can be modeled by a representative agent with an intertemporal utility function of the form:

$$u_t = \int_t^\infty \left(\ln c(s) + \ln g(s)\right) e^{-\theta(s-t)}\, ds, \quad (15.15)$$

where $c(s)$ is the household's aggregate consumption of goods and services at time s, $g(s)$ is government consumption at s, which we take to be a measure of public goods provided, and $\theta$ is the rate of time preference.
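the partial adjustment of the slowly moving q and the 0.3/0.7 blend described in the text can be illustrated in discrete time. this is an assumed sketch, not the model's own code; only the weight 0.3 on forward-looking behavior comes from the text, and the other numbers are hypothetical.

```python
def update_q_bar(q_bar, q, alpha3):
    """each period q_bar closes fraction alpha3 of its gap with the forward-looking q."""
    return q_bar + alpha3 * (q - q_bar)

def investment_signal(q, q_bar, alpha2=0.3):
    """blend driving investment: weight alpha2 on the forward-looking q and
    weight (1 - alpha2) on the slowly adjusting q_bar (alpha2 = 0.3 per the text)."""
    return alpha2 * q + (1.0 - alpha2) * q_bar

q, q_bar = 2.0, 1.0
q_bar = update_q_bar(q_bar, q, alpha3=0.25)   # 1.0 + 0.25 * (2.0 - 1.0) = 1.25
signal = investment_signal(q, q_bar)
```

repeated application of `update_q_bar` makes the backward-looking value converge to the forward-looking one, which is the source of the investment inertia the text describes.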
10 the household maximizes (15.15) subject to the constraint that the present value of consumption (potentially adjusted by a risk premium $\mu_h$) be equal to the sum of human wealth, h, and initial financial assets, f: 11

$$\int_t^\infty p_c(s)\, c(s)\, e^{-(r(s)+\mu_h-n)(s-t)}\, ds = h(t) + f(t). \quad (15.16)$$

human wealth is defined as the expected present value of the future stream of after-tax labor income plus transfers:

$$h(t) = \int_t^\infty \left[(1-\tau_1)\, w(s)\, l(s) + tr(s)\right] e^{-(r(s)+\mu_h-n)(s-t)}\, ds, \quad (15.17)$$

10 this specification imposes the restriction that household decisions on the allocations of expenditure among different goods at different points in time be separable. 11 as before, n appears in (15.16) because the model's scaled variables must be converted back to their original basis. where $\tau_1$ is the tax rate on labor income, tr is the level of government transfers, $l_c$ is the quantity of labor used directly in final consumption, $l_I$ is labor used in producing the investment good, $l_g$ is government employment, and $l_i$ is employment in sector i. financial wealth, f, is the sum of real money balances, mon/p, real government bonds in the hands of the public, b, net holdings of claims against foreign residents, a, the value of capital in each sector and holdings of emissions permits, $q_i^p$ (15.18). solving this maximization problem gives the familiar result that aggregate consumption spending is equal to a constant proportion of private wealth, where private wealth is defined as financial wealth plus human wealth:

$$p_c\, c = \theta\, (f + h). \quad (15.19)$$

however, based on the evidence cited by campbell and mankiw (1990) and hayashi (1982) we assume some consumers are liquidity-constrained and consume a fixed fraction $\gamma$ of their after-tax income (inc). 12 denoting the share of consumers who are not constrained, and who choose consumption in accordance with (15.19), by $\alpha_8$, total consumption expenditure is given by:

$$p_c\, c = \alpha_8\, \theta\, (f + h) + (1-\alpha_8)\, \gamma\, inc. \quad (15.20)$$

the share of households consuming a fixed fraction of their income could also be interpreted as permanent income behavior in which household expectations about income are myopic.
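the split between wealth-based and liquidity-constrained consumption described above can be sketched as below. the propensity `prop` and the fraction `gamma` are illustrative values assumed for this sketch, not estimates from the chapter, and the function name is hypothetical.

```python
def total_consumption(f, h, inc, alpha8, prop=0.04, gamma=0.9):
    """aggregate consumption: a share alpha8 of households spend a constant
    proportion `prop` of total wealth (financial f plus human h); the
    liquidity-constrained remainder spend fraction `gamma` of current
    after-tax income inc. `prop` and `gamma` are illustrative assumptions."""
    return alpha8 * prop * (f + h) + (1.0 - alpha8) * gamma * inc

c = total_consumption(f=50.0, h=200.0, inc=30.0, alpha8=0.7)
```

setting `alpha8 = 1.0` recovers the pure wealth-based rule, while `alpha8 = 0.0` gives the rule-of-thumb economy; the model sits in between.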
once the level of overall consumption has been determined, spending is allocated among goods and services according to a two-tier ces utility function. 13 at the top tier, the demand equations for capital, labor, energy and materials can be shown to be:

$$x_{ci} = \delta_{ci} \left(\frac{p_c}{p_i}\right)^{\sigma_c^o} c, \quad i \in \{k, l, e, m\}, \quad (15.21)$$

12 there has been considerable debate about the empirical validity of the permanent income hypothesis. in addition to the work of campbell, mankiw and hayashi, other key papers include hall (1978) and flavin (1981). one side-effect of this specification is that it prevents us from computing equivalent variation. since the behavior of some of the households is inconsistent with (15.19), either because the households are at corner solutions or for some other reason, aggregate behavior is inconsistent with the expenditure function derived from our utility function. 13 the use of the ces function has the undesirable effect of imposing unitary income elasticities, a restriction usually rejected by data. an alternative would be to replace this specification with one derived from the linear expenditure system. where $x_{ci}$ is household demand for good i, $\sigma_c^o$ is the top-tier elasticity of substitution and the $\delta_{ci}$ are the input-specific parameters of the utility function. the price index for consumption, $p_c$, is given by:

$$p_c = \left(\sum_i \delta_{ci}\, p_i^{1-\sigma_c^o}\right)^{1/(1-\sigma_c^o)}. \quad (15.22)$$

the demand equations and price indices for the energy and materials tiers are similar. household capital services consist of the service flows of consumer durables plus residential housing. the supply of household capital services is determined by consumers themselves, who invest in household capital, $k_c$, in order to generate a desired flow of capital services, $c_k$, according to the following production function:

$$c_k = a\, k_c, \quad (15.23)$$

where a is a constant.
accumulation of household capital is subject to the condition:

$$\frac{dk_c}{dt} = j_c - \delta_c k_c. \quad (15.24)$$

we assume that changing the household capital stock is subject to adjustment costs, so household spending on investment, $i_c$, is related to $j_c$ by:

$$i_c = j_c \left(1 + \frac{\phi_c}{2} \frac{j_c}{k_c}\right). \quad (15.25)$$

thus, the household's investment decision is to choose $i_c$ to maximize:

$$\int_t^\infty \left(p_{ck}\, a\, k_c - p_I\, i_c\right) e^{-(r(s)+\mu_z-n)(s-t)}\, ds, \quad (15.26)$$

where $p_{ck}$ is the imputed rental price of household capital and $\mu_z$ is a risk premium on household capital (possibly zero). this problem is nearly identical to the investment problem faced by firms, including the partial adjustment mechanism outlined in equations (15.13) and (15.14), and the results are very similar. the only important difference is that no variable factors are used in producing household capital services. we assume that labor is perfectly mobile among sectors within each region but is immobile between regions. thus, wages will be equal across sectors within each region, but will generally not be equal between regions. in the long run, labor supply is completely inelastic and is determined by the exogenous rate of population growth. long-run wages adjust to move each region to full employment. in the short run, however, nominal wages are assumed to adjust slowly according to an overlapping contracts model where wages are set based on current and expected inflation and on labor demand relative to labor supply. this can lead to short-run unemployment if unexpected shocks cause the real wage to be too high to clear the labor market. at the same time, employment can temporarily exceed its long-run level if unexpected events cause the real wage to be below its long-run equilibrium. we take each region's real government spending on goods and services to be exogenous and assume that it is allocated among inputs in fixed proportions, which we set to 2006 values. total government outlays include purchases of goods and services plus interest payments on government debt, investment tax credits and transfers to households.
government revenue comes from sales taxes, capital and labor taxes, and from sales of new government bonds. in addition, there can be taxes on externalities such as carbon dioxide emissions. the government budget constraint may be written in terms of the accumulation of public debt as follows:

$$\frac{db}{dt} = d = g + tr + i\,b - t, \quad (15.27)$$

where b is the stock of debt, d is the budget deficit, g is total government spending on goods and services, tr is transfer payments to households and t is total tax revenue net of any investment tax credit. we assume that agents will not hold government bonds unless they expect the bonds to be paid off eventually, and accordingly impose the following transversality condition:

$$\lim_{s \to \infty} b(s)\, e^{-(r(s)-n)s} = 0. \quad (15.28)$$

this prevents per capita government debt from growing faster than the interest rate forever. if the government is fully leveraged at all times, (15.28) allows (15.27) to be integrated to give (15.29): the current level of debt will always be exactly equal to the present value of future budget surpluses. 14 the implication of (15.29) is that a government running a budget deficit today must run an appropriate budget surplus at some point in the future. otherwise, the government would be unable to pay interest on the debt and agents would not be willing to hold it. to ensure that (15.29) holds at all points in time we assume that the government levies a lump sum tax in each period equal to the value of interest payments on the outstanding debt. 15 in effect, therefore, any increase in government debt is financed by consols and future taxes are raised enough to accommodate the increased interest costs. other fiscal closure rules are possible, such as requiring the ratio of government debt to gdp to be unchanged in the long run or that the fiscal deficit be exogenous with a lump sum tax ensuring this holds. these closures have interesting implications but are beyond the scope of this paper. the nine regions in the model are linked by flows of goods and assets.
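the debt accumulation identity in (15.27) can be stepped forward as a sketch. the discrete euler form and all numbers are assumptions for illustration only.

```python
def debt_next(b, g, tr, t, i, dt=1.0):
    """one euler step of db/dt = deficit = g + tr + i*b - t."""
    return b + dt * (g + tr + i * b - t)

# a primary surplus large enough to cover interest reduces the debt stock
b_new = debt_next(b=100.0, g=20.0, tr=5.0, t=32.0, i=0.05)
```

with tax revenue exactly equal to spending, transfers and interest, debt is unchanged, which is the knife-edge case between the explosive and repaying paths that the transversality condition rules between.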
flows of goods are determined by the import demands described above. these demands can be summarized in a set of bilateral trade matrices which give the flows of each good between exporting and importing countries. there is one nine-by-nine trade matrix for each of the 12 goods. trade imbalances are financed by flows of assets between countries. each region with a current account deficit will have a matching capital account surplus, and vice versa. 16 we assume asset markets are perfectly integrated across regions. 17 with free mobility of capital, expected returns on loans denominated in the currencies of the various regions must be equalized period to period according to a set of interest arbitrage relations of the following form:

$$i_k + \mu_k = i_j + \mu_j + \frac{1}{e_j^k} \frac{de_j^k}{dt}, \quad (15.30)$$

where $i_k$ and $i_j$ are the interest rates in countries k and j, $\mu_k$ and $\mu_j$ are exogenous risk premiums demanded by investors (possibly zero), and $e_j^k$ is the exchange rate between the currencies of the two countries. 18 however, in cases where there are institutional rigidities to capital flows, the arbitrage condition does not hold and we replace it with an explicit model of the relevant restrictions (such as capital controls). 15 in the model the tax is actually levied on the difference between interest payments on the debt and what interest payments would have been if the debt had remained at its base case level. the remainder (interest payments on the base case debt) is financed by ordinary taxes. 16 global net flows of private capital are constrained to be zero at all times: the total of all funds borrowed exactly equals the total funds lent. as a theoretical matter this may seem obvious, but it is often violated in international financial data. 17 the mobility of international capital is a subject of considerable debate; see gordon and bovenberg (1994) or feldstein and horioka (1980).
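the interest arbitrage relation can be checked numerically as below. the sign convention for the expected exchange-rate change is an assumption of this sketch; the function simply measures the deviation from equalized expected returns on loans in the two currencies.

```python
def arbitrage_gap(i_k, i_j, expected_e_change, mu_k=0.0, mu_j=0.0):
    """deviation from the interest arbitrage relation: zero when the return on
    currency-k loans equals the return on currency-j loans plus the expected
    rate of change of the exchange rate (allowing for risk premiums)."""
    return (i_k + mu_k) - (i_j + mu_j + expected_e_change)

# a 2-point rate differential exactly offset by 2% expected currency movement
gap = arbitrage_gap(i_k=0.05, i_j=0.03, expected_e_change=0.02)
```

a positive gap would attract financial capital into currency k until rates or the expected exchange-rate path adjust, which is the period-by-period equalization the text describes.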
18 the one exception to this is the oil-exporting region, which we treat as choosing its foreign lending in order to maintain a desired ratio of income to wealth. capital flows may take the form of portfolio investment or direct investment but we assume these are perfectly substitutable ex ante, adjusting to the expected rates of return across economies and across sectors. within each economy, the expected returns to each type of asset are equated by arbitrage, taking into account the costs of adjusting physical capital stock and allowing for exogenous risk premiums. however, because physical capital is costly to adjust, any inflow of financial capital that is invested in physical capital will also be costly to shift once it is in place. this means that unexpected events can cause windfall gains and losses to owners of physical capital, and ex post returns can vary substantially across countries and sectors. for example, if a shock lowers profits in a particular industry, the physical capital stock in the sector will initially be unchanged but its financial value will drop immediately. we assume that money enters the model via a constraint on transactions. 19 we use a money demand function in which the demand for real money balances is a function of the value of aggregate output and short-term nominal interest rates:

$$\frac{mon}{p} = y\, i^{\varepsilon}, \quad (15.31)$$

where y is aggregate output, p is a price index for y, i is the interest rate, and $\varepsilon$ is the interest elasticity of money demand. following mckibbin and sachs (1991) we take $\varepsilon$ to be −0.6. on the supply side, the model includes an endogenous monetary response function for each region. each region's central bank is assumed to adjust short-term nominal interest rates following a henderson-mckibbin-taylor rule as shown in the equation below.
the interest rate evolves as a function of actual inflation ($\pi$) relative to target inflation ($\pi^t$), output growth ($\Delta y$) relative to growth of potential output ($\Delta y^t$) and the change in the exchange rate ($\Delta e$) relative to the bank's target change ($\Delta e^t$):

$$i = i^t + \beta_1 (\pi - \pi^t) + \beta_2 (\Delta y - \Delta y^t) + \beta_3 (\Delta e - \Delta e^t). \quad (15.32)$$

the parameters in (15.32) vary across countries. for example, countries that peg their exchange rate to the us dollar have a very large value of $\beta_3$. to estimate g-cubed's parameters we began by constructing a consistent time series of input-output tables for the us. the procedure is described in detail in and can be summarized as follows. we started with the detailed benchmark us input-output transactions tables produced by the bureau of economic analysis (bea), converted them to a standard set of industrial classifications and then aggregated them to 12 sectors. 20 then, we corrected the treatment of consumer durables, which are included in consumption rather than investment in the us national income and product accounts (nipas) and the benchmark input-output tables. third, we supplemented the value added rows of the tables using a detailed dataset on capital and labor input by industry constructed by dale jorgenson and his colleagues. 21 finally, we obtained prices for each good in each benchmark year from the output and employment data set constructed by the office of employment projections at the bureau of labor statistics (bls). this dataset allowed us to estimate the model's parameters for the us. to estimate the production side of the model, we began with the energy and materials tiers because they have constant returns to scale and all inputs are variable. in this case it is convenient to replace the production function with its dual unit cost function. for industry i, the unit cost function for energy is:

$$c_i^e = \frac{1}{a_i^e} \left(\sum_j \delta_{ij}^e\, p_j^{1-\sigma_i^e}\right)^{1/(1-\sigma_i^e)}. \quad (15.33)$$

the cost function for materials has a similar form. assuming that the energy and materials nodes earn zero profits, c will be equal to the price of the node's output.
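the money demand function and the monetary response rule described above can be sketched together. the coefficients `b1`, `b2` and `b3` are illustrative assumptions (in the model they vary by country); only the elasticity of −0.6 comes from the text.

```python
def real_money_demand(y, i, eps=-0.6):
    """transactions money demand mon/p = y * i**eps; eps = -0.6 per the text,
    so demand falls as the nominal rate rises."""
    return y * i ** eps

def policy_rate(i_neutral, pi, pi_t, dy, dy_t, de, de_t, b1=0.5, b2=0.5, b3=0.0):
    """henderson-mckibbin-taylor style rule: the short rate responds to the
    inflation gap, the output-growth gap and the exchange-rate-change gap.
    coefficients are illustrative; a pegged country would have a very large b3."""
    return i_neutral + b1 * (pi - pi_t) + b2 * (dy - dy_t) + b3 * (de - de_t)

# higher nominal rates lower the real balances demanded
m_low_rate = real_money_demand(100.0, 0.02)
m_high_rate = real_money_demand(100.0, 0.08)
# inflation one point above target raises the rate by b1 * 0.01
i_new = policy_rate(0.04, pi=0.03, pi_t=0.02, dy=0.025, dy_t=0.025, de=0.0, de_t=0.0)
```

setting `b3` very large makes the rule defend the exchange-rate target almost exactly, which is how a peg is represented.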
using shephard's lemma to derive demand equations for individual commodities and then converting these demands to cost shares gives expressions of the form:

$$s_{ij}^e = \delta_{ij}^e \left(\frac{p_j}{a_i^e\, c_i^e}\right)^{1-\sigma_i^e}, \quad (15.34)$$

where $s_{ij}^e$ is the share of industry i's spending on energy that is devoted to purchasing input j. 22 $a_i^e$, $\sigma_i^e$ and $\delta_{ij}^e$ were found by estimating (15.33) and (15.34) as a system of equations. 23 estimates of the parameters in the materials tier were found by an analogous approach. 20 converting the data to a standard basis was necessary because the sector definitions and accounting conventions used by the bea have changed over time. 21 primary factors often account for half or more of industry costs so it is particularly important that this part of the data set be constructed as carefully as possible. from the standpoint of estimating cost and production functions, however, value added is the least satisfactory part of the benchmark input-output tables. in the early tables, labor and capital are not disaggregated. in all years, the techniques used by the bea to construct implicit price deflators for labor and capital are subject to various methodological problems. one example is that the income of proprietors is not split between capital and imputed labor income correctly. the jorgenson dataset corrects these problems and is the work of several people over many years. in addition to dale jorgenson, some of the contributors were l. christensen, barbara fraumeni, mun sing ho and dae keun park. the original source of the data is the fourteen components of income tape produced by the bureau of economic analysis. see ho (1989) for more information. 22 when $\sigma^e$ is unity, this collapses to the familiar cobb-douglas result that $s = \delta$ and is independent of prices. 23 for factors for which the value of s was consistently very small, we set the corresponding input to zero and estimated the production function over the remaining inputs.
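the unit-cost and cost-share relations used in the estimation can be sketched numerically. this follows the standard ces dual with weights $\delta$, under which the shares collapse to the constant cobb-douglas shares $\delta$ when $\sigma = 1$, as footnote 22 notes; the function names, prices and parameter values are hypothetical.

```python
import numpy as np

def ces_unit_cost(prices, deltas, A, sigma):
    """dual ces unit cost: c = (1/A) * (sum_j delta_j * p_j**(1-sigma))**(1/(1-sigma)).
    sigma != 1 is assumed here."""
    p = np.asarray(prices, dtype=float)
    d = np.asarray(deltas, dtype=float)
    return (1.0 / A) * np.sum(d * p ** (1.0 - sigma)) ** (1.0 / (1.0 - sigma))

def ces_cost_shares(prices, deltas, A, sigma):
    """shephard's lemma: the cost share of input j is delta_j * (p_j / (A*c))**(1-sigma)."""
    c = ces_unit_cost(prices, deltas, A, sigma)
    p = np.asarray(prices, dtype=float)
    d = np.asarray(deltas, dtype=float)
    return d * (p / (A * c)) ** (1.0 - sigma)

shares = ces_cost_shares([1.0, 2.0, 4.0], [0.5, 0.3, 0.2], A=1.0, sigma=0.7)
```

with equal input prices the shares equal the $\delta$s exactly, which is why observed cost shares identify the distribution parameters in the estimation.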
the output node must be treated differently because it includes capital, which is not variable in the short run. we assume that the firm chooses output, $q_i$, and its top-tier variable inputs (l, e and m) to maximize its restricted profit function (15.35), where the summation over input costs is taken over all inputs other than capital. inserting the production function into (15.35) and rewriting gives (15.36), where $k_i$ is the quantity of capital owned by the firm, $\delta_{ik}$ is the distributional parameter associated with capital, and j ranges over inputs other than capital. maximizing (15.36) with respect to variable inputs produces the factor demand equations for industry i (15.37). this system of equations can be used to estimate the top-tier production parameters. the results are listed in . much of the empirical literature on cost and production functions fails to account for the fact that capital is fixed in the short run. rather than using (15.37), a common approach is to use factor demands of the form:

$$x_{ij} = \delta_{ij}^o \left(a_i^o\right)^{\sigma_i^o - 1} \left(\frac{p_i}{p_j}\right)^{\sigma_i^o} q_i, \quad (15.38)$$

which is correct only if all inputs are variable in the short run. in we show that using equation (15.38) biases the estimated elasticity of substitution toward unity for many sectors in the model. in petroleum refining, for example, the fixed-capital estimate for the top-tier elasticity, $\sigma_3^o$, is 0.54 while in the variable-capital case it is 1.04. the treatment of capital thus has a very significant effect on the estimated elasticities of substitution. estimating parameters for regions other than the us is more difficult because time-series input-output data is often unavailable. in part, this is because some countries do not collect the data regularly and in part it is because many of g-cubed's geographic entities are regions rather than individual countries. as a result, we impose the restriction that substitution elasticities within individual industries are equal across regions.
24 by doing so, we are able to use the us elasticity estimates everywhere. the share parameters (the $\delta$s in the equations above), however, are derived from regional input-output data taken from the gtap version 7 database and differ from one region to another. in effect, we are assuming that all regions share a similar but not identical production technology. this is intermediate between one extreme of assuming that the regions share common technologies and the other extreme of allowing the technologies to differ in arbitrary ways. the regions also differ in their endowments of primary factors, their government policies, and patterns of final demands. final demand parameters, such as those in the utility function or in the production function of new investment goods, were estimated by a similar procedure: elasticities were estimated from us data and share parameters were obtained from regional input-output tables. trade shares were obtained from 2009 un standard international trade classification (sitc) data aggregated up from the four-digit level. 25 the trade elasticities are based on a survey of the literature and vary between 1 and 3. 26 g-cubed is implemented via three software components. the first consists of a sequence of programs written in the ox language that construct g-cubed's dataset from raw data. 27 the second component consists of a set of files specifying the model's economic structure in a portable, general-purpose language we developed called 'sym'. sym is a set-driven matrix language that descends from gams and gempack. it imposes rigorous conformability rules on all expressions to eliminate a broad range of potential errors in the design and coding of the model. a useful consequence of these rules is that subscripts are generally unnecessary and the model can be expressed very concisely and clearly.
the third component is a suite of ox programs that are used for setting up simulations and solving the model according to the two-point boundary value algorithm described in mckibbin (1986). 28 it allows models with large numbers of forward-looking costate variables (g-cubed has more than 100) to be solved quickly on computers with limited resources. 24 for example, the top-tier elasticity of substitution is identical in the durable goods industries of japan and the us. this approach is consistent with the econometric evidence of kim and lau (1994). this specification does not mean, however, that the elasticities are the same across industries within a country. 25 a full mapping of sitc codes into g-cubed industries is contained in mckibbin and wilcoxen (1994). 26 for a sensitivity analysis examining the role of the trade elasticities and several other key parameters, see . 27 ox is available from www.doornik.com and described in doornik (2007). 28 for a more detailed description of the algorithm, see mckibbin and sachs (1991, appendix c). because g-cubed is an intertemporal model, it is necessary to calculate a baseline, or 'business-as-usual', solution before the model can be used for policy simulations. in order to do so we begin by making assumptions about the future course of key exogenous variables. we take the underlying long-run rate of world population growth plus productivity growth to be 2.5% per annum and take the long-run real interest rate to be 5%. we also assume that tax rates and the shares of government spending devoted to each commodity remain unchanged. our remaining assumptions are listed by region in table 15.3. as these assumptions do not necessarily match the expectations held by agents in the real world, the model's solution in any given year, say 2006, will generally not reproduce that year's historical data exactly.
in particular, it is unlikely that the costate variables based on current and expected future paths of the exogenous variables in the model will equal the actual values of those variables in 2006. this problem arises in all intertemporal models and is not unique to g-cubed, but it is inconvenient when interpreting the model's results. to address the problem we add a set of constants, one for each costate variable, to the model's costate equations. for example, the constants for tobin's q for each sector in each country are added to the arbitrage equation for each sector's q. similarly, constants for each real exchange rate are added to the interest arbitrage equation for each country, and a constant for human wealth is added to the equation for human wealth. 29 to calculate the constants we use newton's method to find a set of values that will make the model's costate variables in 2006 exactly equal their 2006 historical values. after the constants have been determined, the model will reproduce the base year exactly given the state variables inherited from 2005 and the assumed future paths of all exogenous variables. 30 one additional problem is to solve for both real and nominal interest rates consistently since the real interest rate is the nominal interest rate from the money market equilibrium less the ex ante expected inflation rate. to produce the expected inflation rate implicit in historical data for 2006 we add a constant to the equation for nominal wages in each country. 31 finally, we are then able to construct the baseline trajectory by solving the model for each period after 2006 given any shocks to variables, shocks to information sets (announcements about future policies) or changes in initial conditions. 29 one interpretation of these constants is that they are risk premiums; another is that they are simply the residuals left between the actual data and the econometrically fitted values calculated by the model. 
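the calibration step (choosing additive constants so that the model's 2006 costate variables hit their historical values) can be sketched with a generic newton solver. the finite-difference jacobian and the toy linear "model" below are assumptions made for illustration; the actual model's residual function is far larger.

```python
import numpy as np

def newton_constants(residual, x0, tol=1e-10, max_iter=50):
    """newton's method: find constants x with residual(x) = 0, where residual
    returns (model costate values) minus (historical values). the jacobian is
    approximated column-by-column with forward differences."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = np.asarray(residual(x), dtype=float)
        if np.max(np.abs(r)) < tol:
            return x
        n = x.size
        jac = np.empty((n, n))
        h = 1e-7
        for col in range(n):
            xp = x.copy()
            xp[col] += h
            jac[:, col] = (np.asarray(residual(xp), dtype=float) - r) / h
        x = x - np.linalg.solve(jac, r)
    return x

# toy "model": each costate equals a base value plus its constant;
# pick the constants that reproduce the 2006 historical values exactly
historical = np.array([1.2, 0.9])
base = np.array([1.0, 1.0])
constants = newton_constants(lambda c: (base + c) - historical, np.zeros(2))
```

because this toy residual is linear, newton converges in essentially one step; the full model's residual is nonlinear, which is why an iterative solve is needed.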
30. In general, these constants affect the model's steady state, but have little or no effect on the transitional dynamics.
31. One way to interpret this is as a shift in the full-employment level of unemployment. In that case this approach is equivalent to using the full model to solve for the natural rate of unemployment in each country.

A global approach to energy and the environment: the G-Cubed model

Originally developed to evaluate climate change policies, G-Cubed has been used to analyze trade policy, monetary and fiscal policy, financial crises, projections of global economic growth, the impacts of pandemics, and global demographic change. It has been used by agencies within the governments of the US, Japan, Canada, Australia and New Zealand, as well as in reports by the Intergovernmental Panel on Climate Change, the UN, the Organisation for Economic Co-operation and Development (OECD), the World Bank, the International Monetary Fund, the Asian Development Bank, and a number of corporations. Academic users can be found in the US, the UK, Germany, Austria, Australia, Indonesia and Japan. The remainder of this section outlines key applications of G-Cubed in six areas: climate and energy policy, trade policy, analysis of financial crises, macroeconomic policy, the analysis of pandemics, and global demographic change. G-Cubed was designed to contribute to the debate on environmental policy and international trade, with particular emphasis on climate change. It has been used for that purpose since 1992, and work using the model has fallen roughly into two areas of focus. One has been generating projections of the future evolution of the world economy and exploring the sensitivity of these projections to a variety of assumptions. The second has been evaluating the impacts of a variety of policy changes on these projections. These two strands of research are dealt with separately below.
In a study for the United Nations University, Bagnoli et al. (1996) found that over a 30-year horizon, assumptions about productivity growth and structural change are crucial for understanding an economy's energy intensity. Using the model, the authors made two projections of the world economy from 1990 to 2020. The first scenario assumed that all sectors in a given region experienced a uniform rate of technical change characteristic of that region. The rate varied across regions based on their historical performance, with higher rates in particular developing economies such as China. The second scenario allowed technical change to be heterogeneous at the sector level. Within each region, sectoral technical change followed historical patterns, but scaled so that each economy had the same average economy-wide GDP growth rate as in the first scenario. The two scenarios produced dramatically different projections of world energy intensity by 2020. Countries had approximately the same GDP growth rates in both scenarios (by construction), but energy use was far lower in the second scenario. Sector-level differences in technical change caused structural changes that reduced economy-wide energy per unit of GDP by around 1% per year, independent of any autonomous energy efficiency improvement (AEEI). This difference was purely due to the changing structure of economies over time in response to relative price changes induced by different sectoral rates of technical change. The difference showed up clearly in the carbon taxes required to stabilize emissions: in the second scenario the taxes were typically half those of the first scenario. This study and subsequent papers emphasized that a simple projection of GDP growth is insufficient for projecting carbon emissions. Although overall GDP growth matters, sector-level differences in productivity are critical for future emissions.
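The cumulative power of a roughly 1%-per-year structural decline in energy intensity over the study's 1990-2020 horizon is easy to verify with compound-growth arithmetic. The snippet below is purely illustrative arithmetic on the figures quoted in the text, not a reproduction of the model's output.

```python
# Compounding the ~1%/year structural decline in energy intensity reported
# for the heterogeneous-technical-change scenario over the 1990-2020 horizon.
years = 30                    # 1990 to 2020
structural_decline = 0.01     # ~1% per year, from sector-level technical change

# Energy per unit of GDP in 2020 relative to 1990, from structure alone:
intensity_ratio = (1 - structural_decline) ** years
# (0.99)**30 is roughly 0.74, i.e. energy intensity about a quarter lower
# than under the uniform-technical-change scenario, before any AEEI.
```

This is the mechanism behind the halved carbon taxes in the second scenario: the same GDP path requires substantially less energy, so a smaller price signal stabilizes emissions.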
The other issue emphasized in this study and related studies is that small changes in low-level growth rates over 20 or more years can have enormous effects on the composition of the economy. The large range of possible outcomes from small changes in growth rates is a sobering reminder of the degree of uncertainty underlying climate policy. In particular, there is empirical evidence to suggest that many economic variables have a unit root or a stochastic trend. If this is correct, or even approximately correct, then standard errors for projected levels of variables would quickly become large. G-Cubed has been used for a range of studies of alternative greenhouse policies. Carbon taxes are examined in McKibbin and Wilcoxen (1993, 1994). These studies all highlight that a surprise carbon tax leads to a reduction in real output, with the greatest losses occurring in the short run. They also show that the adjustment of capital flows is important for the impacts of climate policy. An increase in the price of energy inputs makes goods produced using energy relatively more expensive in world markets. The conventional view is that the current account of a country would deteriorate as a result of a carbon tax. In McKibbin and Wilcoxen (1994) we showed, on the contrary, that the current account could improve if the revenue from the tax was used to reduce the fiscal deficit (i.e. holding government spending and transfers constant in spite of the rise in tax revenue). The rise in saving and fall in investment could easily lead to an improvement in the overall current account balance, reflecting a capital outflow. The composition of the trade account would follow the simple partial equilibrium reasoning, but the economy-wide general equilibrium effect could go the other way. This paper also illustrated that the way in which the revenue from a carbon tax is used can have important consequences for the costs of the carbon abatement policy.
If the revenue is used to reduce another tax in the economy, the costs of abatement can be reduced. For example, in the US, if the revenue is used to reduce the fiscal deficit, there can be a fall in interest rates which stimulates economic growth and reduces the costs of carbon abatement. This effect does not occur in a country like Australia, however, because Australia is not a major participant in global capital markets and has very little impact on world interest rates. Nonetheless, using the revenue to reduce taxes on capital can help to offset the negative effects of a carbon abatement policy in Australia. The trade implications of environmental policy are the focus of McKibbin and Wilcoxen (1993). These papers show that changes in environmental policy are unlikely to lead to major changes in trade flows through relocation of industry, because the costs of environmental policy are generally small relative to the cost of relocating production facilities. This does not mean that environmental policies lead to small losses in economic output, but that policies are unlikely to be fully offset by substitution toward goods that are not subject to the same environmental regulation. In the context of US climate policy, the papers above have shown that for every 100 tons of reduction in US emissions, global emissions fall by 80-90 tons; only 10-20 tons are offset by higher emissions elsewhere. A key insight from this research is that a significant part of energy use is for domestic transportation, which is largely non-traded and therefore unlikely to move overseas. In McKibbin and Wilcoxen (1997) we found that many aggressive permit trading scenarios were infeasible in G-Cubed because of the instability they caused in the global trade system. The main problem was the extent of stabilization proposed in the scenarios, which implied very high prices for emission permits.
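The leakage figures quoted above translate into a simple rate. The helper below is a hypothetical illustration of that arithmetic, using only the round numbers stated in the text.

```python
def leakage_rate(domestic_cut, global_cut):
    """Share of a domestic emissions cut that is offset by higher emissions
    abroad (illustrative helper, not a G-Cubed function)."""
    return (domestic_cut - global_cut) / domestic_cut

# The figures quoted in the text: a 100-ton US cut lowers global emissions
# by 80-90 tons, implying leakage of 10-20%.
low = leakage_rate(100, 90)    # 0.10, the low end of the quoted range
high = leakage_rate(100, 80)   # 0.20, the high end
```

The point of the passage is that this rate is lower in G-Cubed than in pure trade models, because capital flows and adjustment costs limit how much carbon-intensive production can relocate.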
The result was wild fluctuations in real exchange rates and consequently in patterns of international trade. This pointed to a fundamental flaw in the global emission permit trading schemes frequently proposed, such as the Kyoto Protocol. These regimes could generate large transfers of wealth between countries. Supporters of a global permit system regard this as an advantage, because it would allow developed countries to compensate developing countries for reducing their emissions. However, G-Cubed suggests that such an approach would put enormous stress on the world trade system, depending on the tightness of the emission targets, the extent to which the allocation of permits differed from the permits required to meet the targets, and the marginal cost of abatement in different countries, amongst other things. A developed country importing permits would see its balance of trade deteriorate substantially. Equally serious problems would be created for developing countries. Massive exports of permits would lead to exchange rate appreciation and a decline or collapse in traditional exports. In the international economics literature this is known as the 'Dutch disease', or in Australia as the 'Gregory thesis'. It occurs because the granting of permits has an impact on the wealth of the receiving countries, which changes their consumption patterns and comparative advantage. International capital flows are also shown to play an important role in the adjustment process to emissions policies. A rise in the price of carbon leads to a fall in the return on capital in carbon-intensive economies and to capital outflow from carbon-intensive economies into large economies and less carbon-intensive economies. Although developing countries are generally less carbon intensive, they cannot absorb a large capital inflow because of the adjustment costs in physical capital formation.
There is, therefore, much less carbon leakage in G-Cubed than in other trade models, because of the impact of capital flows and adjustment costs in developing countries. The appeal of an international permit program is strongest if participating countries have different marginal costs of abating carbon emissions. The analysis in McKibbin et al. (1999) suggests that abatement costs are quite heterogeneous and that international trading offers large potential benefits to parties with relatively high mitigation costs. The analysis also highlights that in an increasingly interconnected world in which international financial flows play a crucial role, the impact of greenhouse abatement policy cannot be determined without attention to the impact of these policies on the return to capital in different economies. To understand the full adjustment process to international greenhouse abatement policy it is essential to model international capital flows explicitly. An important but often neglected issue in climate policy design is the effect that the climate policy regime has on the transmission of economic shocks within a country and between countries. McKibbin et al. (2009d) explore potential interactions between climate policy, unanticipated macroeconomic events, and carbon emissions. They examine two kinds of unanticipated macroeconomic shocks under two global climate policy architectures and pay special attention to outcomes that could undermine individual countries' incentives to remain party to the global agreement. They find that a regime of fixed emissions targets strongly propagates growth shocks between regions while price-based systems do not. Under a quantity-based policy, a positive growth shock in developing countries can raise the global price of permits enough that GDP in some economies actually contracts, creating an incentive for such countries to withdraw from the arrangement.
They also find that in a global downturn, a price-based system exacerbates the economic decline. Overall, quantity-based policies perform badly during unexpected economic booms and price-based policies perform badly during downturns. They argue that a hybrid policy would be superior, performing like a price-based policy during a boom and like a quantity-based policy in a downturn. G-Cubed has also been used to explore the characteristics of particular international agreements, such as the Kyoto Protocol in McKibbin and Wilcoxen (2004, 2007) and the Copenhagen Accord in McKibbin et al. (2010). In the latter paper, the authors used G-Cubed to convert a heterogeneous set of commitments by countries at Copenhagen into comparable policy effort by calculating the 'carbon price equivalence' of policies. Among other results, they showed that China's intensity targets, which some observers at the time regarded as insignificant, are actually a commitment to very significant reductions relative to the expected trajectory of Chinese emissions in the absence of the policy. India's intensity targets, on the other hand, are essentially non-binding. G-Cubed has also been used to evaluate national carbon policy proposals, such as the Carbon Pollution Reduction Scheme in Australia by the Australian Government (2008) and various national schemes in the US by McKibbin et al. (2009c). In addition, we examined border tax adjustments for embodied carbon in McKibbin and Wilcoxen (2009a). Border taxes are calculated based on the carbon emissions associated with production of each imported product, and would be intended to match the cost increase that would have occurred had the exporting country adopted a climate policy similar to that of the importing country. We estimated how large such tariffs would be in practice, and then examined their economic and environmental effects.
We found that the tariffs would be small on most traded goods, would reduce leakage of emissions reductions only modestly, and would do little to protect import-competing industries. The benefits produced by border adjustments would be too small to justify their administrative complexity or their deleterious effects on international trade. A consistent theme in analyses of climate policies using G-Cubed is that climate policy design should be robust to uncertainties about future economic conditions. The sensitivity of longer-run projections to small changes in assumptions suggests that policies that rely heavily on precise forecasts about the future are likely to be vulnerable to collapse. This experience led to the development of a 'hybrid' policy of taxes and permit trading, set out in McKibbin and Wilcoxen (2002a, 2002b, 2004, 2007). A policy that is able to manage uncertainty is key in the climate policy debate. In a study for a report by the US Congressional Budget Office (CBO), G-Cubed was used to assess the North American Free Trade Agreement (NAFTA) (Congressional Budget Office, 1993; McKibbin, 1994; Manchester and McKibbin, 1994). At the time NAFTA was being evaluated, many studies suggested that it would lead to a flood of cheap goods into the US economy and a loss of jobs in the US. G-Cubed, however, showed the opposite. In these studies, the key aspect of NAFTA was not only the removal of US tariffs on Mexican goods, but also the impact of the agreement on expected future productivity in Mexico and the reduction in the risk premium attached to Mexican assets by international investors. In the studies we followed the empirical link between closer economic integration and productivity growth, surveyed in the case of Europe by Catinat and Italianer (1988). The risk premium shock was based on estimates by the Congressional Budget Office (1993a) that on average investment in Mexico required roughly a 10% higher return than investment in the US.
We assumed that the risk premium which drove this differential was eliminated in the three years following the announcement of NAFTA. G-Cubed predicted that NAFTA would lead to a large flow of financial capital from the rest of the world into the Mexican economy in response to a rise in the expected return to capital and a reduction in the Mexican risk premium. The Mexican real exchange rate was predicted to appreciate, crowding out net exports and leading to a rise in the Mexican current account deficit. The short-term impacts of NAFTA were consistent with G-Cubed's predictions. The medium- to long-run predictions from G-Cubed were more consistent with the majority of studies at the time. The additional insight from G-Cubed was that the short-run adjustment process was largely driven by capital flows driving trade adjustment. The model predicted a large impact from expected long-term productivity improvements, and showed how, through the operation of intertemporal forces, this stimulated short-term capital inflows to Mexico. In the short term, this completely dwarfed the static effect (i.e. the changing composition of trade) of the tariff changes that was the focus of other studies. The scale of economies, as well as the sectoral adjustment within economies, can change significantly in dynamic models. Financial markets contain important information about absolute and relative returns to current and future activities. G-Cubed has also been applied to the Free Trade Area of the Americas (FTAA), a proposed extension of NAFTA. That analysis is discussed in detail in Section 15.4. The six-sector version of G-Cubed has been used to explore the impact of trade liberalization under alternative regional and multilateral arrangements, as well as unilateral trade liberalization in China. In many of these studies, which are based on actual agreements, the trade liberalization is generally announced and then phased in over time.
In this case, the key dynamic adjustment to the various trade policy changes is the instantaneous change in rates of return to capital and asset prices in the liberalizing economies. Changes in the return to capital change financial capital flows, which cause exchange rate adjustments. These exchange rate adjustments then drive trade adjustment in the short run, even before substantial tariff reductions are implemented. McKibbin (1998a, 1998b) examined different regional groupings for trade liberalization. Countries were assumed to reduce tariff rates from 1996 levels to zero by 2010 for developed countries, and by 2020 for developing countries. Figure 15.1 shows the impact on Australian real GDP of liberalization in alternative groupings. Liberalization within the regional groupings that include Australia (world, APEC and Australia) results in short-term losses as the tariff reductions are phased in. Over time, however, there are significant medium- to long-term gains relative to the base scenario. There are significant additional benefits from joint liberalization in the short run, but the majority of medium- to long-term gains occur through a country's own liberalization. Liberalization by other countries (ASEAN) results in only small GDP gains for Australia. The adjustment path to phased liberalization can therefore exhibit short-run costs as resources begin to be reallocated before the trade reforms are implemented. Once the liberalization is announced, the return to capital in some sectors rises and capital flows in, appreciating the real exchange rate. This further dampens demand for exported goods as they temporarily become more expensive. Liberalization by other countries at the same time can help to reduce these short-run adjustment costs and real exchange rate changes. In the long run, own reforms give larger gains than foreign reforms and there is little benefit from a policy of free riding. The key insight provided by G-Cubed concerns the short-run adjustment process.
The impact of a policy change can be perverse in the short run, in the sense that capital flowing into a liberalizing economy can cause such a large real exchange rate appreciation that there is a significant deterioration in the trade account as real resources flow into the economy. If the adjustment process is poorly understood, policy makers can become disaffected or can implement inappropriate policy responses, such as tightening macroeconomic policy in order to improve the external balance, thus slowing down economic activity. However, the capital inflows are needed to build future capacity in expanding sectors. The appreciation of the real exchange rate and worsening of the trade balance is not a loss of underlying competitiveness caused by a bad policy change. The reallocation of resources is driven by the signals in financial markets of where expected returns are highest after the reforms are implemented. G-Cubed has been used in a number of studies to explore the role of macroeconomic policies and shocks in generating trade imbalances between regions of the world. In G-Cubed, the trade deficit of a country not only represents an excess of imports of goods and services over exports of goods and services; a trade deficit also reflects an excess of investment over savings in a country. Lee et al. (2006) used the model to explore the sensitivity of trade flows between the US and Asia. They found that the fundamental cause of the trade imbalance since 1997 has been changes in saving-investment gaps, attributed to the surge of US fiscal deficits and the decline of East Asia's private investment after the 1997 financial crisis. In exploring the impact of nominal exchange rate realignment, the results from G-Cubed show that a revaluation of East Asia's exchange rates by 10% (effectively a shift in monetary policy) cannot resolve the imbalances.
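The point that a trade deficit mirrors an excess of investment over saving is the standard national accounting identity NX = S - I. The snippet below is a minimal illustration of that identity with hypothetical round numbers, not output from the model.

```python
def trade_balance(savings, investment):
    """National accounting identity stressed in the text: net exports equal
    saving minus investment (NX = S - I).  Inputs in % of GDP; the function
    name and figures are illustrative assumptions."""
    return savings - investment

# Hypothetical example: saving of 15% of GDP against investment of 20% of GDP
# implies a trade deficit of 5% of GDP, regardless of any bilateral detail.
nx = trade_balance(15.0, 20.0)
```

This is why, in the Lee et al. (2006) results, exchange rate realignment alone cannot resolve the imbalances: unless saving-investment gaps change, the identity pins down the overall trade position.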
They also found that a concerted effort by East Asian economies to stimulate aggregate demand can have significant impacts on trade balances globally, but the impact on the US trade balance is not large. US fiscal contraction was estimated to have large impacts on the US trade position overall and on the bilateral trade balances with East Asian economies. These results suggest that in order to improve the transpacific imbalance, macroeconomic adjustment will need to be made on both sides of the Pacific. An antecedent of G-Cubed, called the McKibbin-Sachs Global (MSG) model, was originally designed to explore macroeconomic policy issues. G-Cubed has similar macroeconomic properties and has been used to explore a wide range of issues in macroeconomics. Monetary and fiscal regime design in Europe has been explored using G-Cubed by Allsop et al. (1996, 1999), Gagnon et al. (1996), Haber et al. (2001), McKibbin and Bok (2001), and Neck et al. (2000, 2005). In these papers the key insight was that the fixed exchange rate regime of the euro zone would be under serious stress if fiscal policies in Europe were not coordinated in the face of various economic shocks.
Macroeconomic policy issues in Japan have been examined using G-Cubed by McKibbin (2002) and Callen and McKibbin (2003), where the experience of Japan during the 1990s was captured by the model as a series of policy errors, particularly announcing fiscal expansions that generated crowding out through asset markets and then failing to deliver the fiscal spending, causing a persistent downward drop in GDP. In India, McKibbin and Singh (2003) showed that nominal income targeting was a far better monetary regime than inflation targeting, given the prevalence of supply-side rather than demand-side shocks in the Indian economy. In China, McKibbin and Tang (2000) and McKibbin and Huang (2000) found that financial reforms would have profound effects on economic growth and balance of payments adjustment, but that a loss of confidence in China could devastate economic growth. In Asia, McKibbin and Le (2004) and McKibbin and Chanthapun (1999) found that flexible exchange rate regimes were far better at insulating East Asian economies against global economic shocks than pegging to either the US dollar or a common Asian currency. Theoretical issues in monetary policy design are investigated using G-Cubed in Henderson and McKibbin (1993). Trade imbalances caused by macroeconomic policies and shocks are explored in Lee, McKibbin and Park (2006). The impacts of the end of the Cold War and the large shift in military spending on the global economy are explored by McKibbin and Thurman (1995) and Congressional Budget Office (1996b); the spillover of macroeconomic policies between countries is explored in McKibbin and Bok (1995) and McKibbin and Tan (2009); and theoretical issues in the design of models for policy analysis are explored in McKibbin and Vines (2000) and Pagan et al. (1998).
Global fiscal consolidation is explored in McKibbin and Stoeckel (2012), which examines the direct impact of a large-scale reduction in government outlays on economies, as well as the implications that a global fiscal adjustment might have for country risk premia. One key result in this paper is that substantial fiscal consolidation by high-income economies (in proportion to the size of their debt problem) has the temporary effect of lowering economic activity in those economies, but has a positive effect on developing countries and a few high-income economies not undertaking fiscal consolidation. The reason is that the negative flow-on effects through trade linkages, as high-income economies reduce imports and stimulate exports with the developing world, are offset by favorable financial flow-on effects, which provide capital for developing countries to increase GDP. Second, a credible phasing in of fiscal cuts can reduce the expected future tax liabilities of households and firms, which dampens the negative direct effects of cuts in government spending. The paper also explores the outcome if all countries coordinate their fiscal adjustment except the US. A coordinated fiscal consolidation in the industrial world that is not accompanied by US actions is likely to lead to a substantial worsening of trade imbalances globally, as the release of capital in fiscally contracting economies flows into the US economy, appreciates the US dollar and worsens the current account position of the US. The scale of this change is likely to be sufficient to substantially increase the probability of a trade war between the US and other economies. In order to avoid this outcome, a coordinated fiscal adjustment is clearly in the interest of the global economy. In McKibbin and Martin (1998), the six-sector version of G-Cubed was used to simulate the Asian currency and economic crisis.
Data from the key crisis economies of Thailand, Korea and Indonesia were used as inputs for simulations to see if the model could generate the scale of adjustment in asset markets as well as the sharp declines in economic activity that occurred. The study considered three key factors in explaining the qualitative and quantitative events that unfolded in the crisis economies: (i) revisions to growth prospects, (ii) changes in risk perceptions and (iii) policy responses in individual countries. The role of asset markets and financial flows was critical. A downward revision in expected growth led to falling asset prices, which reduced current income and wealth. Combined with increased risk premia, calibrated to generate an exchange rate depreciation of the size being observed in real time in these economies, this meant that investment and growth collapsed. The extent to which financial markets responded through intertemporal arbitrage was crucial to the risk shocks. Finally, the ability to model the anticipated policy responses, both through price setting and asset market adjustments, was crucial to understanding the subsequent outcomes. McKibbin (1999) focuses on the second of these factors: the impact on Asian countries of a jump in the perceived risk of investing in these economies. This paper argued that a financial shock can quickly become a real shock because of the interdependence of the real and financial economies. Too often policy makers and modelers ignore this interdependence. The reactions of policy makers, both directly and in the implications of their responses for risk, are crucial to the evolution of a crisis. Both McKibbin (1999) and McKibbin and Martin (1998) conclude that the risk shock was crucial to understanding the Asian crisis. The results for a risk shock are similar to the results for a fall in expected productivity.
The shock leads to capital outflow from crisis economies and a sharp real and nominal exchange rate depreciation. This reduces the value of capital, which, together with a significant revaluation of US dollar-denominated foreign debt, causes a sharp fall in wealth and a large collapse of private consumption expenditure. The fall in the return to capital, and the large rise in real long-term interest rates, lead to a fall in private investment. Early in the debate over the Asian crisis, the results from G-Cubed were interesting and controversial because they ran counter to popular commentary, both in Australia and in the US. The model showed that although the international trade effects were negative for countries that export to Asia, the capital outflow from crisis economies would push down world interest rates and stimulate the non-traded sectors of economies that were not affected by changes in risk assessment. The model suggested that a country like Australia would slow only slightly in the short run and that the US would experience stronger growth as a result of the capital reallocation. This is now conventional wisdom. Furthermore, for Australia in particular, the existence of markets outside Asia, and changes in relative competitiveness, meant that substitution was possible for Australian exports. Models with an aggregate world growth variable or a single exchange rate variable would not capture this international substitution effect. Models with an exogenous balance of payments could replicate the shock, but only through an exogenous change in the trade balance and other factors that are exogenous to the model. In a number of papers, McKibbin and Stoeckel (2010a, 2010b) used the approach of McKibbin and Martin (1998), together with shocks to US housing markets and the policy responses of central banks and fiscal authorities around the world, to model the global financial crisis of 2008.
Specifically, they modeled the key aspects of the crisis as: (i) the bursting of the housing bubble and the loss in asset prices and household wealth, with consumers cutting back on spending and lifting savings; (ii) a sharp reappraisal of risk, with a spike in bond spreads on corporate loans and interbank lending rates, a rising cost of credit (including trade credit) and a commensurate collapse of stock markets around the world; and (iii) a massive policy response, including monetary policy easing, bailouts of financial institutions and fiscal stimulus. Simulating the loss in confidence through higher risk premia on the US alone (the 'epicenter' of the crisis) showed several things. Had there not been contagion across other countries in terms of risk reappraisal, the effects would not have been as dramatic as subsequently occurred. The adverse trade effects from the US downturn would have been offset to some degree by positive effects from a global reallocation of capital. Were the US alone affected by the crisis, Chinese investment could actually have risen. The world could have escaped recession. When there is a reappraisal of risk everywhere, including China, investment falls sharply; in a sense, there is nowhere for the capital to go in a global crisis of confidence. The implication is that if markets, forecasters and policy makers misunderstand the effects of the crisis and the mechanisms at work, they can inadvertently fuel fears of a 'meltdown' and make matters far worse. The bursting of the housing bubble had a bigger effect on falling consumption and imports in the US than did the reappraisal of risk, but the reappraisal of risk had the biggest effect on investment. Rising risk causes several effects. The cost of capital rises, which leads to a contraction in the desired capital stock. Hence, there is disinvestment by business, and this can go on for several years (a 'deleveraging' in the popular business media).
the higher perception of risk by households causes them to discount future labor incomes at a higher risk-adjusted interest rate, which leads to higher savings and less consumption, fuelling the disinvestment process by business. when there is a global reappraisal of risk there is a large contraction in output and trade, the scale of which depends on whether the crisis is believed to be permanent or temporary. the long-run implications for growth and the outlook for the world economy are dramatically different depending on the degree of persistence of the shock. these papers found that, as expected, the effects of the crisis are deeper and last longer when the reappraisal of risk by business and households is expected to be permanent rather than temporary. a third combination was explored in mckibbin and stoeckel (2010b), where agents unexpectedly switch from believing the shock to be permanent to believing it to be temporary several years later. the dynamics for 2010 are quite different between the temporary scenario and the expectation-revision scenario even though the shocks are identical from 2010 onwards. one of the key results of both these studies was that there was a substantially larger contraction in exports than in gdp in all economies. this was observed in the actual data. this massive shift in the relationship between trade and gdp is not the result of an assumption about the income elasticity of imports. it reflects some key characteristics of the model. first, imports are modeled on a bilateral basis between countries, where imports are partly for final demand by households and government and partly for intermediate inputs across the six sectors. in addition, investment is undertaken by a capital-goods sector that draws on both domestic production and imports. as consumption and investment collapse more than gdp, imports will contract more than gdp.
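the household channel described above can be sketched numerically. this is a minimal illustration with invented numbers, not g-cubed code: raising the rate at which future labor income is discounted lowers its present value, so households perceive themselves as less wealthy, save more and consume less.

```python
def human_wealth(income, horizon, r, risk_premium=0.0):
    """present value of a flat future labor-income stream,
    discounted at a risk-adjusted rate r + risk_premium."""
    rate = r + risk_premium
    return sum(income / (1 + rate) ** t for t in range(1, horizon + 1))

base = human_wealth(100.0, 40, 0.03)           # no household risk premium
risky = human_wealth(100.0, 40, 0.03, 0.02)    # a 200-basis-point risk premium
# a higher risk-adjusted discount rate lowers perceived human wealth
assert risky < base
```

the same mechanism operates in reverse when risk premia fall back, which is why the persistence of the risk shock matters so much for the consumption path.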
one country's imports are another country's exports; thus exports will contract more than gdp unless there is a change in the trade position of a particular country. the assumption that all risk premia rise, and the result that all real interest rates fall everywhere, imply small changes in trade balances but big changes in the extent of trade in durable goods. as durable goods have a much bigger share in trade than in gdp, the compositional shift of demand away from durable goods due to higher risk premia causes a structural change in the relationship between trade and gdp. as part of research for the world health organization (who), g-cubed was adapted to explore two major pandemics: (i) the sars (severe acute respiratory syndrome) outbreak in 2003, which was explored in lee and mckibbin (2004); and (ii) the potential of a pandemic resulting from the outbreak of bird flu, which was examined in mckibbin and sidorenko (2006). in lee and mckibbin (2004) the authors used emerging data on changes in risk premiums observed in financial pricing and changes in spending behavior in the affected countries, hong kong and china, to develop shocks to country risk (based on observed exchange rate changes), sector-specific shifts in demand away from sectors with high human-to-human contact (mostly services), and an increase in the input costs of the service sector of roughly 5%. this study was the first of a new approach to analyzing the macroeconomic costs of diseases through general equilibrium modeling. the key insight for policy design and investment in public health was that the short-run cost of major disease outbreaks is significant. traditional estimates based on loss of life and income foregone underestimate the costs of large-scale changes in economic behavior and the spillovers between economies of disease outbreaks.
the authors estimated that the cost in 2003 of sars for the world economy as a whole was close to $40 billion, which is the official who estimate of the sars outbreak. the approach of lee and mckibbin (2004) on sars was significantly extended in mckibbin and sidorenko (2006) to explore the possible implications of more widespread influenza pandemics. based on historical experience of influenza pandemics, mckibbin and sidorenko (2006) considered four mortality scenarios under current economic linkages in the global economy. the scenarios were: (i) a 'mild' pandemic, modeled on the 1968-1969 hong kong flu; (ii) a 'moderate' pandemic, modeled on the asian flu of 1957; (iii) a 'severe' pandemic, similar to the lower estimates of mortality and morbidity in the spanish flu of 1918-1919; and (iv) an 'ultra' pandemic, modeled on high-end estimates of the spanish flu. these scenarios were used to generate a range of shocks to individual countries and sectors due to the pandemic (including mortality and morbidity shocks to the labor force, an increase in the cost of doing business, an exogenous shift in consumer preferences away from exposed sectors, and a re-evaluation of country risk premiums). these shocks generate a complex response of incomes and prices driving global economic outcomes. the results illustrated that even a mild pandemic can have significant consequences for global economic output, with the developing countries experiencing the largest economic loss due to the compounding effect of a weaker public health response, capital reallocation and monetary policy responses within different exchange rate regimes. the use of a general equilibrium model showed that large-scale changes in behavior, in response not only to market signals but also to changes in risk perceptions, can cause a much larger economic loss than conventional estimates of pandemics imply.
in a series of papers, mckibbin and nguyen (2004), bryant and mckibbin (2004), mckibbin (2006b) and nguyen (2011) have incorporated overlapping cohorts of generations in g-cubed, in order to explore demographic change in various countries. the approach followed is based on the work of blanchard (1985), yaari (1965) and weil (1989), as extended by faruqee (2003). this work was also adapted in batini et al. (2005) for the international monetary fund world economic outlook (international monetary fund, 2004). the basic approach was to introduce individual cohorts of agents into g-cubed. by following the blanchard approach and assuming a constant probability of death across cohorts, we are able to aggregate agents outside the model and feed in the change in productivity by agent cohort, using estimated age-earnings profiles to generate shocks to effective labor supply in the model. this shortcut of assuming a constant probability of death across cohorts is a strong assumption. abandoning it requires an explicit multi-cohort olg model, which was recently undertaken in nguyen (2011). an analysis of the impact of the global and regional differences in demographic change needs to take into account the effects of changing growth rates as well as the numbers of adults and children. mckibbin (2006b) incorporated these projections into a general equilibrium model that allows for the changing composition of the population, and captures its effect on labor supply, investment, growth potential, saving, asset markets, international trade and financial flows. there are at least two important policy implications from this research. the first is that the projected demographic transition in the global economy will likely have important macroeconomic impacts on growth, trade flows, asset prices (real interest rates and real exchange rates) and investment rates.
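the blanchard device can be illustrated with a small sketch (illustrative numbers only, not g-cubed code): with a constant probability of death each period, survival probabilities compound into the discount factor, so every cohort effectively discounts future labor income at a higher rate, and this common factor is what makes aggregation across cohorts tractable.

```python
def human_wealth(income, r, p, horizon=200):
    """present value of a flat labor-income stream when each period is
    survived with probability 1 - p (blanchard's constant probability of
    death); survival compounds into discounting, so the effective
    discount rate is roughly r + p for every cohort alike."""
    return sum(income * (1 - p) ** t / (1 + r) ** t
               for t in range(1, horizon + 1))

w_immortal = human_wealth(100.0, r=0.03, p=0.0)
w_mortal = human_wealth(100.0, r=0.03, p=0.02)
# mortality risk lowers human wealth exactly as a higher discount rate would
assert w_mortal < w_immortal
```

because the factor (1 - p)/(1 + r) is the same for all cohorts, aggregate human wealth inherits the same recursive form as an individual's, which is the aggregation result the text relies on.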
the second result is that policy makers should not ignore the global demographic transition when focusing on domestic issues related to demographics. the fact that the demographic transition is at different stages across countries, particularly in industrialized countries relative to developing countries, implies that the global nature of demographic change cannot be ignored. mckibbin (2006b) showed that the developing world has important impacts on the industrial economies. as well as creating a framework for exploring a range of possible policy responses directly related to demographics, the model could be used to explore how other policies, apparently unrelated to demographics, might impact on the macro economy to offset any negative consequences, or reinforce any positive consequences, of global demographic change. a first attempt at this is contained in batini et al. (2005), which explored the impact of productivity improvements induced by economic reform and lowering barriers to international capital flows in developing countries. by using a general equilibrium model, other policies in other parts of the economy might be shown to make a more substantial positive contribution to dealing with demographic change than the more direct policies that are usually proposed, such as increased migration, subsidies to child birth or changes in retirement ages.

in this section we present results illustrating the use of g-cubed for analysis of financial shocks and international trade agreements. the first analysis draws on mckibbin and stoeckel (2010b) and examines the effects of a financial crisis on the global economy. the second analysis draws on mckibbin and wilcoxen (2003) and examines the effects of the proposed ftaa on trade patterns and carbon emissions. to analyze a financial crisis, we used an extended version of the model (aggregation n) which has greater regional detail.
there are seven additional regions, including five new countries (canada, the uk, germany, india and new zealand) and two new aggregates (other asia and latin america). in addition, the aggregate region representing europe has been replaced with a narrower aggregate representing the euro zone countries other than germany. the full list of regions is shown in table 15.4. we model financial crises as changes in the risks perceived by investors, which are reflected in the risk premia they demand for holding assets. risk premia enter the model in a number of places. they play an important role in the model's calibration as well as having large impacts on economic outcomes through intertemporal relationships. for example, risk premia m_k and m_j appear in equation (15.30), the arbitrage equation between returns on domestic and us bonds. in addition, as shown in equation (4), there are risk premia, m_ei, between bonds and equity in each sector within each economy, which represent the sector's equity risk premium. there is also a risk premium, m_h, on the rate at which households discount future after-tax labor income, as shown in equation (15.16). in calibrating the model, these risk premia are calculated so that the model's solution values for forward-looking variables in the base year (2006) are equal to the historical values of those variables. for example, in the case of country risk (m), a constant is chosen so that the current exchange rate, which equals the expected future path of interest differentials plus the period t exchange rate, matches the actual exchange rate in the base period 2006. the equity risk premium in each sector in each country is chosen so that the stock market value for sector i in country n is equal to its actual stock market value in 2006. once the risk premia are calculated they are held constant for most simulations. however, they can be shocked to explore the impact of changes in perceived risks.
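the country-risk calibration described above can be sketched as a one-equation exercise. this is a stylized version of an uncovered-interest-parity condition with a constant risk premium; the function name and all the numbers are invented for illustration, not taken from the model.

```python
def calibrate_country_risk(e_observed, e_terminal, interest_diffs):
    """choose a constant risk premium mu so that the model's exchange
    rate, e_t = e_terminal + sum over s of (i_s - i*_s - mu),
    matches the observed base-year exchange rate."""
    n = len(interest_diffs)
    return (e_terminal + sum(interest_diffs) - e_observed) / n

# invented interest differentials and exchange rates
diffs = [0.02, 0.015, 0.01, 0.01]
mu = calibrate_country_risk(e_observed=1.00, e_terminal=0.98,
                            interest_diffs=diffs)

# check: the implied exchange rate reproduces the observed base-year value
e_model = 0.98 + sum(d - mu for d in diffs)
assert abs(e_model - 1.00) < 1e-9
```

the equity-risk and household-risk premia are calibrated in the same spirit: a constant is backed out so that one observable (a stock market value, a consumption level) is matched exactly in the base year.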
to illustrate the importance of these risk premia, two experiments are presented in this section. the first is a rise in country risk in europe (consisting of the country models for germany, the rest of the euro zone and the uk). this shock could represent a sovereign debt crisis in this region, or some other change in perceived risk that causes investors to demand a higher rate of return on government bonds from that region (relative to the us government bond rate adjusted by expected exchange rate changes). the second experiment examines a broader risk shock that extends to the us as well. in this case the relative risk between europe and the us is unchanged from the baseline, but the risk of both regions relative to all other countries rises. in the reference case the world economy is assumed to grow along the model's baseline projections. all risk premia are held constant at their calibrated values discussed above. in the first simulation, country risk in europe (including germany, the rest of the euro zone and the uk) is assumed to rise unexpectedly in 2011 by 300 basis points. this increase is assumed to be permanent. as a result, investors demand that all financial assets within europe pay an additional 300 basis points (relative to competing assets) to compensate for the additional risk. the immediate effect of the shock is a reduction in the financial value of european assets as investors reallocate their portfolios away from those assets. financial capital flows out of europe, causing a sharp fall in nominal and real european exchange rates (the real exchange rate movements are shown in figure 15.2). these changes in exchange rates and financial flows are reflected in the trade balances of each region. as shown in figure 15.3, european regions experiencing exchange rate declines and capital outflows see their trade balances move sharply toward surplus: their exports become more competitive and investors are less willing to finance trade deficits.
for similar reasons, the trade balances of the us and china (where exchange rates have strengthened and capital inflows have increased) move toward deficit. in the longer term, the change in the required financial return causes the marginal physical product of capital in europe to rise to re-equilibrate the arbitrage condition between bonds and equity. this comes about via a decline in european capital stocks: the stocks initially in place when the shock occurs are too high to generate the physical return required. the expectation that increased risk is permanent leads to a long period of falling european capital and higher european interest rates relative to the reference case. as shown in figure 15.4, the shock raises european real interest rates by more than 100 basis points. interest rates in the us and china, on the other hand, fall slightly. real investment, shown in figure 15.5, also changes as expected: a sharp immediate drop in europe, followed by a gradual recovery as european capital stocks converge to their new, lower, long-term levels; and the reverse in the us and china. the drop in asset values in europe lowers the wealth of european households and causes private consumption to fall, as shown in figure 15.6. relative to the reference case, consumption increases slightly in the us and somewhat more in china, due to china's slightly larger fall in interest rates. overall, european regions experience lower consumption and investment, and higher net exports. on balance, the effect on european gdp is negative, as shown in figure 15.7. the results are consistent with supply-side effects: european capital stocks gradually fall over time, and the loss of gdp is exacerbated in the short run by temporary unemployment. the change in risk is sufficient to cause a recession in europe, with gdp in the first year down by nearly 5% relative to its reference level.
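the long-run capital-stock mechanism can be illustrated with a simple cobb-douglas sketch. this is our own stylized example with invented parameter values, not the g-cubed production structure: when the required return on capital rises by 300 basis points, the capital stock consistent with the arbitrage condition falls.

```python
def long_run_capital(alpha, A, r, delta, mu):
    """long-run capital stock at which the marginal product of capital,
    alpha * A * K**(alpha - 1), equals the required return r + delta + mu
    (cobb-douglas output Y = A * K**alpha, illustrative parameters)."""
    return (alpha * A / (r + delta + mu)) ** (1.0 / (1.0 - alpha))

k_before = long_run_capital(alpha=0.3, A=1.0, r=0.05, delta=0.05, mu=0.00)
k_after = long_run_capital(alpha=0.3, A=1.0, r=0.05, delta=0.05, mu=0.03)

# a permanent 300-basis-point rise in the required return implies a lower
# long-run capital stock: the old stock cannot generate the new return
assert k_after < k_before
```

in the full model the transition between the two stocks is gradual because of adjustment costs in investment, which is why european investment drops sharply at first and then recovers toward the new, lower level.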
in contrast, consumption, investment and gdp all rise slightly in the us and china under the europe-only shock. interestingly, the transmission of the shock to countries outside europe is positive (i.e. other countries gain) despite the fall in european gdp. the large outflow of financial capital from europe to the rest of the world pushes down long-term real interest rates in other regions, stimulating non-european investment and expanding the supply side of other economies. the capital reallocation effect is sufficiently large that for most other regions it more than offsets the negative effect of lower import demand from a weaker europe. roughly speaking, the shock tends to reallocate financial capital rather than destroy it. although the value of capital drops sharply in europe (as reflected in the real exchange rate shown in figure 15.2), it rises in the us and china. in the second simulation, the us also experiences an increase in perceived risk (perhaps through contagion or its own fiscal crisis). as noted above, results for this simulation are shown in the earlier figures along with those for the european financial crisis. broadly speaking, the effects of the broader crisis on the us are much like those of the narrower crisis on europe: the real exchange rate falls, the trade balance moves toward surplus, the real interest rate increases, and investment, consumption and gdp decline. interestingly, however, the spread of the crisis to the us also attenuates the effects on germany and the rest of europe: european losses in consumption, investment and gdp are reduced relative to the case when the shock is confined to europe. this result reflects the role of adjustment costs in capital accumulation in each sector. the us is a large economy that can absorb a lot of the capital that flows out of europe without much being lost to adjustment costs (see equation 15.3).
in contrast, when the inflow from europe to the us is reduced by the rise in us risk, adjustment costs cause less capital to flow out of europe. the remaining countries (other than the us and europe) have much less capacity to absorb large inflows of additional capital without incurring rising adjustment costs from expanding their physical capital stocks. thus, more financial capital stays within europe under the second simulation, reducing the loss of european gdp. this result illustrates one of the benefits of intertemporal general equilibrium models that explicitly model the supply side of economies: in more traditional keynesian macroeconomic models this effect does not exist, and demands driven by trade dominate the results for the international transmission of economic shocks. finally, the results for china in the second simulation show the importance of assumptions about monetary policy. in g-cubed, china is assumed to peg its nominal exchange rate to the us dollar. as a result, china effectively loosens its monetary policy at the onset of the us shock in order to have its nominal exchange rate move with the us dollar. thus, there is a substantial monetary expansion in china. short-term interest rates fall, real consumption, investment and gdp rise sharply, and inflation spikes markedly, as shown in figure 15.8 (panel 4). our second illustration of analysis using g-cubed focuses on the proposed ftaa. we evaluated the ftaa by comparing the evolution of the world economy with and without the agreement. in mckibbin and wilcoxen (2003) we considered a range of competing assumptions about the manner in which the ftaa would be implemented and the effects it would have on individual economies. in particular, we evaluated the ftaa under two different assumptions about its effect on productivity growth and under alternative assumptions about how governments respond to a decline in tariff revenue. in this section we discuss a subset of those results.
the direct effect of reducing tariffs is to improve the efficiency of an economy's resource allocation by reducing the wedge between a buyer's willingness to pay for an imported product and the product's marginal cost. traditionally, general equilibrium studies of trade reform have focused on measuring these efficiency gains, and measuring them at a given point in time, usually either in the immediate short run after the reform has been implemented, or far in the future at the model's long-run equilibrium. by this standard, trade liberalization is usually found to improve welfare, but the magnitude of the improvement tends to be small. (other papers in the literature on trade and the environment include strutt and anderson (1999), who find that trade can improve environmental quality in some circumstances and does little harm otherwise, and tsigas et al. (2002), who find that the effect of trade on the environment is ambiguous. other general equilibrium studies of the ftaa include diao and somwaru (2000) and adkins and garbaccio (2002).) for example, hertel et al. (1999) find that a worldwide cut in tariffs of 40% would raise world gdp by 0.24%. however, liberalization leads to a host of indirect dynamic effects as well. these can cause profound changes in an economy by altering its rate of growth. unfortunately, they are often very difficult to measure, particularly because competing effects can work in opposite directions. for example, reductions in tariffs cause imports to rise, pushing a country's trade balance toward deficit, leading to a depreciation of its exchange rate and a consequent increase in its exports. deterioration of the trade balance is accompanied by inflows of capital from abroad, which augment domestic saving and tend to raise the rate of investment. at the same time, the drop in tariff revenues will push the country toward fiscal deficit, raising government borrowing and tending to crowd out private investment.
to further complicate matters, capital accumulation is also affected by reductions in the prices of imported durable goods, which tend to reduce the cost of new capital and thereby increase the rate of capital formation and growth. disentangling these effects requires a multisector intertemporal general equilibrium model with considerable financial detail. in addition to changing capital accumulation, the empirical literature suggests that trade improves industry productivity by placing additional competitive pressure on previously protected industries, and by increasing the flow of investment and embodied technical change across borders (frankel and romer, 1999; chand, 1999). moreover, these studies find that trade liberalization has much larger effects than traditional static analysis suggests: frankel and romer, for example, find that a one percentage point increase in the ratio of trade to gdp raises per capita income by 2-3%. (martin et al. (2003) point out that traditional general equilibrium measures of the gains from liberalization are biased downward very significantly by aggregation across goods. tariff rates differ sharply between individual products, and efficiency costs are proportional to the square of the price changes caused by the tariffs. when products are aggregated, however, their individual tariffs are replaced by a weighted average. the efficiency cost of an average tariff can be shown to be smaller than the average of the efficiency costs of the individual tariffs it replaced.) although there is clear evidence of a link between trade and productivity at the aggregate level, the literature is not yet sufficient to permit precise predictions about the magnitude of improvement in productivity of individual industries. as a result, we approach the issue by running two sets of simulations. the first employs the traditional assumption that firms in liberalizing economies do nothing when faced with increased competition from imports (apart from substituting toward cheaper inputs): they do not cut costs or adopt better management practices or newer technology. although this assumption is conventional, it is quite strong. it says, in effect, that a firm's technology choice is not affected by its industry's level of protection. it is hard to find any empirical support for that position, and there is much evidence for the reverse: there are countless examples of industries clinging to obsolete, high-cost technology because protection allowed them to do so. our second set of simulations introduces a link between trade and productivity by assuming that previously protected industries are able to take modest steps to reduce their costs. it captures, at least to first order, the empirical features seen in the econometric literature on trade and growth. since both sets of simulations depend on assumptions about the link between trade and productivity, neither one can be interpreted as a precise forecast of the ftaa's effects. however, they characterize the set of possible outcomes. other general equilibrium studies that have introduced a link between trade and productivity include stoeckel et al. (1999), diao and somwaru (2000), and monteagudo and watanuki (2001). finally, a third indirect effect of trade agreements is that increased openness lowers the risk premium attached to a country's sovereign debt by rating agencies such as standard and poor's and moody's (stoeckel et al., 1999). this can lead to pronounced increases in capital inflows, particularly for developing countries. however, that mechanism will not be discussed here.
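the aggregation point made by martin et al. (2003) follows directly from the fact that efficiency costs are proportional to the square of the tariff, so averaging tariffs before squaring understates the loss. a minimal numerical illustration, with invented tariffs and trade weights:

```python
def efficiency_cost(tariffs, weights):
    """trade-weighted deadweight loss, taken as proportional to the
    square of each tariff (the usual triangle approximation)."""
    return sum(w * t ** 2 for t, w in zip(tariffs, weights))

tariffs = [0.02, 0.20]   # two products: a 2% and a 20% tariff (invented)
weights = [0.5, 0.5]     # equal trade shares (invented)
avg_tariff = sum(w * t for t, w in zip(tariffs, weights))

disaggregated = efficiency_cost(tariffs, weights)     # uses each tariff
aggregated = efficiency_cost([avg_tariff], [1.0])     # uses the average

# replacing individual tariffs by their weighted average understates
# the efficiency cost, because the cost is convex in the tariff
assert aggregated < disaggregated
```

this is just jensen's inequality applied to a convex cost function, which is why disaggregated tariff data matter for measuring the gains from liberalization.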
g-cubed's basic design is well suited to the task because it has a full, integrated treatment of international trade and financial flows: each country's current account position must be offset by its capital account, which in turn leads to accumulation or erosion of its stock of foreign assets and thus to changes in its future flow of interest payments. in addition, it accounts for the relative immobility of physical capital and the high mobility of financial capital. to analyze the ftaa, we developed an extended version of the model with greater regional detail in the western hemisphere. it includes five regions particularly relevant for the ftaa: the us, canada, mexico, brazil and an aggregate region representing the rest of latin america. to keep the size of the model manageable, australia was merged into the rest of the oecd. the full list of regions is shown in table 15.5. within each region, the disaggregation of production remained the same: the 12 sectors shown in table 15.2. the structure of the model was also modified to facilitate simulations involving preferential trade agreements. the updated version allowed each region to have two sets of tariffs: one set for imports from countries within a preferred trade area and one for imports from everywhere else. for free trade agreements such as nafta or the ftaa, the tariffs on trade within the preferential area are set to zero. it should be noted that the model does not require that participants in a free trade agreement have harmonized external tariffs: each country retains its original tariffs on trade outside the free trade area unless otherwise specified in a simulation.
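the two-tier tariff structure can be sketched as a simple lookup. this is our own illustrative data structure, not the model's code: the sector dimension is omitted, the us and brazil rates are invented, and the canadian (4.14%) and mexican (11.3%) durables rates are the ones quoted in the text.

```python
def tariff(importer, exporter, external_tariffs, trade_areas):
    """applied tariff: zero on trade inside a shared free trade area,
    otherwise the importer's ordinary external tariff. note there is
    no harmonized external tariff across members."""
    for area in trade_areas:
        if importer in area and exporter in area:
            return 0.0
    return external_tariffs[importer]

external = {"usa": 0.04, "canada": 0.0414, "mexico": 0.113, "brazil": 0.10}
nafta = {"usa", "canada", "mexico"}

assert tariff("usa", "mexico", external, [nafta]) == 0.0    # inside the area
assert tariff("usa", "brazil", external, [nafta]) == 0.04   # outside the area
```

extending the area set to include brazil and the rest of latin america turns the same lookup into the ftaa case, with each member still keeping its own external tariff.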
to determine the effects of the ftaa we carried out a suite of simulations: a reference case having no multiregion free trade areas; a pair of simulations examining the effects of nafta and the ftaa under the assumption that tariff reductions and trade flows have no effect on industry productivity; a pair of simulations examining the effects of nafta and the ftaa under an alternative assumption in which tariff reductions and increased trade lead to modest improvements in the productivity of previously protected industries; and a pair of simulations investigating the effect of announcing nafta or the ftaa 5 years before it is actually implemented. in this chapter we focus only on the simulations that include productivity effects; full results including the other simulations can be found in mckibbin and wilcoxen (2003). the simulations discussed here are listed in table 15.6. the first was a reference case having no multiregion free trade agreements, not even nafta. in this simulation, each region's tariff rates do not distinguish between imports from different trading partners. for example, the us imposes a single tariff on imports of durable goods, regardless of whether any particular imported good originates in canada, europe, brazil or somewhere else. as it does not include nafta, the reference case does not represent the current world economy. however, it allows us to simulate the adoption of nafta itself, which is very useful for putting the ftaa results in context. the reference case tariff rates were derived by aggregating historical data and are listed in table 15.7. the pattern of trade under the reference case is exemplified by the figures shown in table 15.8 for year 12 of the simulation; refer to table 15.5 for a list of region codes. each panel of the table gives the bilateral trade matrix for a particular good; panel nine, for example, shows trade in durable goods.
the columns of table 15.8 indicate the origin of each trade flow and the rows indicate the destination. each entry is the us dollar value of the corresponding flow of goods. for example, in panel 5 the value in the 'u' row and 'p' column is 35.3, which indicates that shipments of crude oil from opec to the us were worth $35.3 billion. where the value of trade was less than $0.1 billion, the entry is left blank. as exports from different regions are imperfect substitutes, most goods flow in both directions between each pair of regions. for example, the us exports $45.5 billion worth of durables to japan while simultaneously importing $145.3 billion of japanese durables. at g-cubed's level of aggregation, traded goods are generally far from homogeneous due to differences in the product mix of exports from different countries. aircraft, for example, are an important component of durables exported by the us but are not a significant portion of us imports of durables from japan. the row sums in table 15.8 show the total value of imports of each good by each region; the sum of the 'u' row in panel 5, for example, shows that total us imports of crude oil from all of its trading partners were worth $96.2 billion. the column sums show the total value of exports of each good from each region; the sum of the 'p' column of panel 5 shows that total crude oil exports from opec were worth $156.4 billion. the value in the lower right corner of each panel shows the total us dollar value of trade in the good; in panel 5, the value is $297.4 billion. the model's 12 goods fall into three distinct categories in terms of the total dollar value of trade. six sectors each account for less than $100 billion: electricity ($5 billion), natural gas delivered by utilities ($12 billion), refined petroleum products ($86 billion), coal ($22 billion), non-fuel mining ($48 billion) and forestry ($80 billion).
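the row-sum and column-sum accounting in the bilateral trade matrices can be sketched with a tiny example. the two us-japan durables figures are from the text; the europe entry and the region list are invented placeholders.

```python
# bilateral durables trade in $billion, keyed (destination, origin)
flows = {
    ("usa", "japan"): 145.3,    # us imports of japanese durables (text figure)
    ("japan", "usa"): 45.5,     # japanese imports of us durables (text figure)
    ("usa", "europe"): 90.0,    # invented placeholder value
}

def imports_of(region):
    """row sum: total value of the good bought by the region."""
    return sum(v for (dst, _), v in flows.items() if dst == region)

def exports_of(region):
    """column sum: total value of the good sold by the region."""
    return sum(v for (_, src), v in flows.items() if src == region)

assert abs(imports_of("usa") - (145.3 + 90.0)) < 1e-9
assert abs(exports_of("japan") - 145.3) < 1e-9

# every flow is counted once in world imports and once in world exports,
# so the two world totals must agree
regions = {"usa", "japan", "europe"}
assert abs(sum(map(imports_of, regions)) - sum(map(exports_of, regions))) < 1e-9
```

the final identity is the table-level counterpart of "one country's imports are another country's exports."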
at the opposite extreme, two sectors each account for more than a trillion dollars: durables ($2926 billion) and non-durables ($1365 billion). the remaining four sectors fall in between: crude oil and natural gas ($297 billion), agriculture ($183 billion), transportation ($530 billion) and services ($607 billion). the overwhelming importance of durables is emphasized by the fact that shipments from europe (column 'e') to developing countries (row 'l'), a single entry in the trade matrix for durables, are worth $303 billion, or more than the total value of world trade in the six least-traded goods. in order to evaluate subsequent simulations, it is useful to group the model's regions into three aggregates: (i) the nafta countries: the us, canada and mexico; (ii) the other ftaa countries: brazil and the rest of latin america; and (iii) the rest of the world (row). table 15.9 shows the dollar value of trade flows within and between each of these aggregates. as in the detailed bilateral trade matrices, the column indicates the source of each trade flow and the row indicates its destination. for example, the entry in the 'row' column and 'nafta' row for crude oil and natural gas (good 5) shows that the us, canada and mexico together import $66 billion worth of crude oil and natural gas from the countries comprising row. the entry in the 'nafta' column and 'nafta' row, in contrast, shows the value of trade in crude oil and gas among the nafta countries. these aggregate regions will be used to assess the overall effects of nafta or the ftaa on the value of trade within the free trade area and between the trade area and the rest of the world. as discussed earlier, there is an empirical literature suggesting that tariff reform and increased international trade stimulate productivity growth. there are many mechanisms by which this could occur. industries previously protected by high tariffs and now faced with increased competition would almost certainly undertake cost-cutting measures and shift toward global best practices in production.
Lower trade barriers on durables would also allow a freer flow of new technology and embodied technical change. To incorporate this effect, we assume that when tariffs are reduced, previously protected industries are able to make modest improvements in productivity in response. In particular, we assume they are able to reduce their costs by a percentage equal to half of the change in their tariff, or by 5%, whichever is smaller. For example, the Canadian tariff on durables is initially 4.14%, so under this assumption trade reform would lead to a 2.07% improvement in the productivity of the Canadian durables sector. In contrast, the Mexican tariff on durables is initially 11.3%, so the corresponding industry's productivity improvement would be limited to 5%. The full set of productivity shocks used in this section is shown in Table 15.10. Under the NAFTA-P simulation, tariffs are reduced to zero on trade between the US, Canada and Mexico, and industries in those three countries receive the productivity improvements listed in Table 15.10. The FTAA-P simulation is similar to NAFTA-P, but also includes Brazil and the rest of Latin America. In both cases, it is important to note that the productivity shocks are one-time changes in productivity levels, not changes in the rate of productivity growth. They are very modest effects in the sense that they correspond to at most 2-3 years of ordinary productivity improvements. On the other hand, they do not include the effect of any adjustment costs or investment that might arise as the protected industries adapt. Thus, they allow us to gauge the importance of productivity changes over the medium to long run, but they are not a precise prediction. Table 15.11 shows the effect of NAFTA-P on output, exports and capital stocks by sector and region in year 12. The results are percentage changes relative to the reference case.
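The tariff-to-productivity rule just described (a cost reduction equal to half the tariff change, capped at 5%) is simple enough to state in code. The function name is ours; the rule and the two worked examples are as described in the text.

```python
def productivity_gain(initial_tariff_pct: float, cap_pct: float = 5.0) -> float:
    """Assumed productivity rule for the NAFTA-P/FTAA-P experiments:
    when a tariff is eliminated, the previously protected industry
    reduces its costs by half the tariff change, capped at 5%."""
    return min(initial_tariff_pct / 2.0, cap_pct)

# Examples from the text:
canada_durables = productivity_gain(4.14)   # half of 4.14% = 2.07%
mexico_durables = productivity_gain(11.3)   # half would be 5.65%, capped at 5%
```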
Entries with changes greater than or equal to 1% in magnitude are given in italics and those with changes less than 0.1% in magnitude are left blank. As would be expected, the reduction in tariffs leads to an increase in trade among the NAFTA countries. Exports from the US, Canada and Mexico increase significantly. Exports of durables (panel 2) rise by 2.4% in the US, 6.9% in Canada and 8.6% in Mexico. Exports of non-durables increase even more because tariffs on non-durables are initially higher and hence fall more dramatically under a free trade agreement (the high tariffs on non-durables largely reflect the high levels of protection on textiles and apparel). Exports of non-durables rise by 5.3% in the US, 18.5% in Canada and 11.9% in Mexico. The increase in trade raises the level of output of all industries in the US (panel 1), albeit by very small percentages in most cases. The corresponding capital stocks rise as well (panel 3). The output of almost all Canadian industries increases, as does the output of most Mexican industries. A notable exception, for both Canada and Mexico, is services (sector C): output of services falls under NAFTA-P even though exports of services rise slightly in percentage terms due to the change in exchange rates (as well as increased demand for services in other countries that expand as a result of the policy). The reason is straightforward: the price of services rises, in part because the traded sectors grow and consume more labor, and in part because depreciation of the exchange rate raises the price of imported intermediate goods other than those whose tariffs have been cut. This effect is a recurring theme in the results and can be seen clearly in Figure 15.9, which shows percentage changes from the reference case in selected industry prices and quantities over time for the US, Canada and Mexico.
The total value of the trade flows among the NAFTA regions increases, as shown in Table 15.12 (which also includes results for the FTAA, discussed below). The value of trade in durables among the NAFTA regions, for example, increases by $16 billion, while the value of trade in non-durables increases by $14 billion. Comparing the magnitudes of the within-NAFTA flows with those between NAFTA and the rest of the world shows that NAFTA clearly increases trade rather than just redirecting it. In fact, the increase in economic activity stimulated within NAFTA actually causes aggregate NAFTA imports of both durables and non-durables from the rest of the world to rise. Summing across goods and regions, NAFTA raises the total dollar value of trade flows in year 12 by $50 billion. The effects of NAFTA-P on trade and selected macroeconomic variables are shown in Figure 15.10. The reduction in tariffs in Canada and Mexico causes the demand for imports in those regions to increase, leading to depreciation of the two exchange rates relative to the US dollar. The trade balances in the US, Canada and Mexico initially move toward deficit. The trade balances for Brazil and the rest of Latin America (as well as most other regions, although they are omitted from the graph) move toward surplus as imports by the US, Canada and Mexico rise. Over time, however, the Canadian and Mexican trade deficits accumulate into increased stocks of foreign debt. The higher debt levels require larger interest payments, which consume an increasing share of each country's current account and eventually force the trade balance back toward surplus. This mechanism is made even stronger by the effect of exchange rates on each country's initial stocks of foreign debt. In particular, the depreciation of the Canadian and Mexican currencies relative to the US dollar increases the burden of servicing their US-dollar-denominated foreign debt.
The reduction in tariffs also has an important fiscal effect: by reducing government revenue it increases the fiscal deficit in each region. The effect is very small in the US, because tariff revenue is a small part of the government budget, but it is significant in Canada and Mexico. In the short run, the reduction in tariffs functions as a fiscal stimulus. This effect would be significantly different under alternative monetary and fiscal assumptions. For example, other taxes could be raised to compensate for the reduction in tariff revenue, or government spending could be cut. In addition, monetary policy could be altered; in these simulations the money supply has been held constant at its base-case value. Either policy could reduce or eliminate the macroeconomic effects caused by the increase in the fiscal deficit. (In G-Cubed, exchange rates are defined as the number of US dollars per unit of foreign currency. A decline in the exchange rate is a reduction in the number of dollars per unit of foreign currency and hence a depreciation of the currency.)

A global approach to energy and the environment: the G-Cubed model

Table 15.13 shows the effect of FTAA-P relative to NAFTA-P on output, exports and capital stocks by industry and region for year 12. The results are percentage changes relative to the NAFTA-P simulation; as before, entries with changes greater than or equal to 1% in magnitude are given in italics and those below 0.1% are left blank. The general nature of the results is highly analogous to those for NAFTA-P: the effect of the FTAA on Brazil (I) and the rest of Latin America (V) is very much like the effect of NAFTA-P on Canada and Mexico. The main difference is that the percentage changes tend to be larger in magnitude for Brazil and the rest of Latin America. For most industries, output, exports and capital stocks all increase substantially.
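The exchange-rate convention matters for the debt-service mechanism described above: with the rate e defined as US dollars per unit of local currency, a fall in e (a depreciation) raises the local-currency value of dollar-denominated debt even though the dollar amount is unchanged. A stylized illustration with entirely hypothetical numbers:

```python
# Hypothetical numbers; e follows the G-Cubed convention of USD per
# unit of local currency, so a fall in e is a depreciation.
debt_usd = 100.0                      # foreign debt, billions of USD
e_before, e_after = 0.50, 0.40        # exchange rate falls: depreciation

# Local-currency value of the same dollar-denominated debt.
burden_before = debt_usd / e_before   # 200 billion local-currency units
burden_after = debt_usd / e_after     # 250 billion local-currency units

# The depreciation raised the servicing burden by 25% in local terms
# even though the USD amount of debt did not change.
```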
This can be seen graphically in Figure 15.11, which shows the effects of FTAA-P relative to NAFTA-P on selected industry prices and output for all regions. In general, prices fall significantly in Brazil and the rest of Latin America, and output rises accordingly. As with NAFTA-P relative to the reference case, service-sector output falls in Brazil and the rest of Latin America; however, the magnitude is relatively small. The value of trade flows between the aggregate regions is shown in Table 15.12 above. The most notable result is a pronounced decline in imports of durables from non-member countries. The final simulation was an anticipated implementation of the FTAA: the agreement was announced in the first year of the simulation but took effect in the fifth year. To distinguish it from the previous FTAA simulation, this version will be denoted FTAA5-P, where the '5' indicates that implementation of the agreement is anticipated 5 years in advance. The effects of the anticipated FTAA will be equal to the differences between the FTAA5-P simulation and a counterfactual NAFTA simulation, NAFTA5-P, in which NAFTA was announced 5 years before its implementation. When the FTAA is announced in advance, Figure 15.13 shows that very large anticipatory changes occur in exchange rates, trade accounts and macroeconomic variables. The key mechanism by which this occurs is the real exchange rate, which can be shown to be the price to foreigners of a given country's domestic assets. When the FTAA is implemented, real exchange rates for Brazil and the rest of Latin America will drop sharply for the reasons discussed in the previous section. Investors anticipating the fall will be less willing to hold Brazilian and Latin American assets in advance, even at the beginning of the simulation. As a result, exchange rates for Brazil and Latin America depreciate immediately.
Financial capital flows out of those regions and into the US and other large economies, whose trade balances deteriorate as a result. Investment, labor demand, GDP and consumption all decline in Brazil and the rest of Latin America during the period of anticipation and then rebound after implementation. (This approach is used for the same reason that other FTAA simulations are compared to analogous NAFTA runs: the model's internal structure does not currently allow the size of a free trade area to change in the middle of a simulation. In the two simulations discussed in this section, either NAFTA or the FTAA is adopted immediately but remains dormant for 5 years. During that time, tariffs on imports from trade-area partners remain the same as tariffs on other imports. This approximates an anticipated introduction of the FTAA.) By year 12, output levels, exports and capital stocks become similar to those from the unanticipated FTAA, as can be seen in Table 15.14. Finally, Figure 15.14 shows the effect of each of the simulations on total carbon dioxide emissions from key regions. In keeping with the modest effects of NAFTA and the FTAA on industry output and GDP, the effect on carbon emissions is relatively small; in most cases, the change is 1-2% of emissions in the reference case. The only exception is the FTAA5-P experiment, which causes emissions to change by more than 3% in some years. The overall effects of the FTAA are highly analogous to those of NAFTA, but smaller in magnitude. Countries reducing their tariffs see imports rise, exchange rates fall, trade balances move toward deficit, capital inflows increase and foreign debt levels rise. Trade increases significantly between member countries, and the largest changes arise in the most heavily traded sectors: durables and non-durables. The effects are most pronounced for non-durables because the initial tariffs are highest.
Liberalization is good for importers and exporters, as would be expected. Non-traded sectors, particularly services, are hurt by the expansion of exporting industries, which draw in labor and raise wages (even though, in percentage terms, exports of services rise slightly). For several of the FTAA countries the agreement would have a significant fiscal impact by reducing an important source of government revenue. Unless another tax is increased to compensate for the reduction in tariff revenue, the FTAA raises the country's fiscal deficit, providing a short-term stimulus but crowding out some private investment in the long run. Allowing industries to respond to liberalization by adopting modest productivity improvements sharply increases the overall gain from liberalization and reduces the fiscal drag caused by the drop in tariff revenue (tax revenue from other sources rises). Both GDP and consumption rise significantly in liberalizing economies. In addition, productivity effects tend to reduce the changes in trade flows caused by liberalization because the difference in relative prices between foreign and domestic goods is smaller. This effect is particularly noticeable for durables imported by Brazil and the rest of Latin America. The FTAA and NAFTA are not analogous in one respect. NAFTA increases overall trade rather than just redirecting it away from non-NAFTA countries. The effect of the FTAA, however, is closer to trade redirection than to trade creation. The difference stems from the relative effects of the agreements on the US. NAFTA does more to stimulate the US economy and thus has a larger effect on the US demand for imports. Finally, the effect of both NAFTA and the FTAA on carbon emissions is very small. Neither agreement has much effect on energy consumption. The effect on criteria air pollutants would be small as well.
G-Cubed bridges three areas of research (econometric general equilibrium modeling, international trade theory and modern macroeconomics) to provide a versatile multi-country, multi-sector, intertemporal general equilibrium model that can be used for a wide variety of policy analyses. It distinguishes between financial and physical capital, tracking financial capital by currency and physical capital by the region and sector where it is installed. Investment, saving and international asset markets are driven by agents solving intertemporal optimization problems, with expectations driven by foresight (although not always perfect foresight). All budget constraints are imposed, including those applying to regions as a whole: all trade deficits must eventually be repaid by future trade surpluses. This combination of features allows the model to be used for a wide range of applications. Its industry detail allows it to be used to examine environmental and tax policies, which tend to have their largest direct effects on small segments of the economy. Intertemporal modeling of investment and saving allows it to trace out the transition of the economy between the short run and the long run. Slow wage adjustment and liquidity-constrained agents improve the empirical accuracy with which the model captures the transition. To date, G-Cubed has been used in nearly 80 studies covering topics ranging from climate and energy policy to pandemic influenza. Its core strengths are: (i) scenario analysis, where scenarios are made up of different shocks that might confront the world economy or an individual country, and (ii) policy evaluation, especially where dynamic adjustment towards a long-run equilibrium is important. It has also occasionally been used as a forecasting model, although it was not designed for that purpose. G-Cubed continues to evolve, and there are a number of areas where research is underway to improve it.
One project currently underway is an analysis of the effects of alternative fiscal closures on the consequences of imposing a carbon tax in the US. A second project, also underway, is further disaggregation of the energy sectors, which will allow analysis of a wider range of primary energy inputs and will include explicit treatment of alternative energy generation technologies. A third project focuses on the role of infrastructure. This is particularly important for better understanding the determinants of economic growth, especially in developing countries. It will also be critical when using the model to evaluate large fiscal consolidation programs in heavily indebted industrial economies over future years. A fourth area where more work is underway is improved estimation of G-Cubed's dynamic adjustment parameters. Although many of the intratemporal parameters of the model are estimated, the key dynamic parameters are largely calibrated. A number of alternative approaches are possible to improve this. Perhaps the most attractive, particularly in adapting the core model for use as a forecasting tool, is to further develop the approach in Pagan et al. (1998). In that approach, the impulse response functions from G-Cubed are combined with vector autoregression techniques on a time-series data set. The result is a dynamic system with the medium- and long-term properties of G-Cubed as well as the dynamics found in high-frequency macro data.
The views expressed are those of the authors and should not be interpreted as reflecting the views of the trustees, officers or other staff of the Brookings Institution, Australian National University or Syracuse University.
It has benefitted from collaboration with many coauthors including Kym

key: cord-323743-hr23ux58
authors: chen, xiaofeng; tang, yanyan; mo, yongkang; li, shengkai; lin, daiying; yang, zhijian; yang, zhiqi; sun, hongfu; qiu, jinming; liao, yuting; xiao, jianning; chen, xiangguang; wu, xianheng; wu, renhua; dai, zhuozhi
title: A diagnostic model for coronavirus disease 2019 (COVID-19) based on radiological semantic and clinical features: a multi-center study
date: 2020-04-16
journal: Eur Radiol
doi: 10.1007/s00330-020-06829-2
sha:
doc_id: 323743
cord_uid: hr23ux58

Objectives: Rapid and accurate diagnosis of coronavirus disease 2019 (COVID-19) is critical during the epidemic. We aim to identify differences in CT imaging and clinical manifestations between pneumonia patients with and without COVID-19, and to develop and validate a diagnostic model for COVID-19 based on radiological semantic and clinical features alone. Methods: A consecutive cohort of 70 COVID-19 and 66 non-COVID-19 pneumonia patients was retrospectively recruited from five institutions. Patients were divided into primary (n = 98) and validation (n = 38) cohorts. The chi-square test, Student's t test, and Kruskal-Wallis H test were performed, comparing 1745 lesions and 67 features in the two groups. Three models were constructed from radiological semantic and clinical features through multivariate logistic regression. Diagnostic efficacies of the developed models were quantified by the receiver operating characteristic curve. Clinical usage was evaluated by decision curve analysis and nomogram. Results: Eighteen radiological semantic features and seventeen clinical features were identified to be significantly different. Besides ground-glass opacities (p = 0.032) and consolidation (p = 0.001) in the lung periphery, lesion size (1-3 cm) is also significant for the diagnosis of COVID-19 (p = 0.027). Lung score presents no significant difference (p = 0.417).
Three diagnostic models achieved an area under the curve value as high as 0.986 (95% CI 0.966~1.000). The clinical and radiological semantic models provided a better diagnostic performance and more considerable net benefits. Conclusions: Based on CT imaging and clinical manifestations alone, pneumonia patients with and without COVID-19 can be distinguished. A model composed of radiological semantic and clinical features has an excellent performance for the diagnosis of COVID-19. Key points: • Based on CT imaging and clinical manifestations alone, pneumonia patients with and without COVID-19 can be distinguished. • A diagnostic model for COVID-19 was developed and validated using radiological semantic and clinical features, which had an area under the curve value of 0.986 (95% CI 0.966~1.000) and 0.936 (95% CI 0.866~1.000) in the primary and validation cohorts, respectively. Electronic supplementary material: The online version of this article (10.1007/s00330-020-06829-2) contains supplementary material, which is available to authorized users.

On January 30, 2020, the World Health Organization (WHO) declared the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) outbreak a public health emergency of international concern. This outbreak has infected all provinces of China and rapidly spread to the rest of the world. At the time of writing this article (March 16, 2020), more than 158 countries and territories had been affected [1]. Whole-genome sequencing and phylogenetic analysis reveal that SARS-CoV-2 is similar to some beta coronaviruses detected in bats, but it is distinct from severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV) [2]. Patients with COVID-19 develop pneumonia with associated symptoms of fever (98%), cough (76%), and myalgia or fatigue (44%) [3].
CT imaging plays a critical role in the diagnosis and monitoring of disease progression [4-6]. The latest studies have described the characteristic imaging manifestations of COVID-19, including ground-glass opacities (GGO) (57 to 88%), bilateral involvement (76 to 88%), and peripheral distribution (33 to 85%) [7-10]. Other imaging features such as consolidation, cavitation, and interlobular septal thickening are also reported in some patients [11-13]. However, these imaging manifestations of COVID-19 are nonspecific and are difficult to distinguish from those of other pneumonias. To our knowledge, there have been no studies explicitly comparing imaging and clinical characteristics between pneumonia patients with and without COVID-19. The current diagnostic criterion for COVID-19 is a positive result of a nucleic acid test by real-time reverse transcription polymerase chain reaction (RT-PCR) or next-generation sequencing [14]. However, false-negative results caused by unstable specimen processing are relatively common in clinical practice, which has worsened the spread of the outbreak [15-18]. Moreover, laboratory testing for SARS-CoV-2 requires a rigorous platform, which is not available in all hospitals. This necessitates specimen transfer, which may delay diagnosis for days. Early and accurate diagnosis is crucial, particularly for critically ill patients who need emergency surgery and who have pneumonia complications. To address these problems, we hypothesize that a diagnostic model can be developed based on CT imaging and clinical manifestations alone, independent of the nucleic acid test. In this study, we identify the differences in imaging and clinical manifestations between patients with and without COVID-19. We also develop and validate a model for COVID-19 diagnosis based on radiological semantic and clinical features.
Ethical approval by the institutional review boards was obtained for this retrospective analysis, and the need to obtain informed consent was waived. From January 1 to February 8, 2020, seventy consecutive patients with COVID-19 admitted to 5 independent hospitals in 4 cities were enrolled in this study (mean age, 42.9 years; range, 16-69 years), including 41 men (mean age, 41.8 years; range, 16-69 years) and 29 women (mean age, 44.5 years; range, 16-66 years). All patients were confirmed to have SARS-CoV-2 infection by real-time RT-PCR and next-generation sequencing. Of these patients, 24 were from Huizhou city, 25 from Shantou city, 15 from Yongzhou city, and the remaining 6 from Meizhou city. In the same period, another 66 pneumonia patients without COVID-19 from Meizhou People's Hospital were recruited as controls (mean age, 46.7 years; range, 0.3-93 years), including 43 men (mean age, 46.0 years; range, 0.3-93 years) and 23 women (mean age, 48.0 years; range, 1-86 years). All the controls were confirmed negative by consecutive RT-PCR assays. Figure E1 in the supplementary material shows the patient recruitment pathway for the control group, along with the inclusion and exclusion criteria. Following previous studies [19-21] with comparable sample sizes, the ratio between the primary and validation cohorts was set to approximately 7:3. In this study, a total of 136 patients were divided into primary (n = 98) and validation (n = 38) cohorts. A total of 19 COVID-19 patients from two hospitals (6 patients from Meizhou People's Hospital and 13 patients from the First Affiliated Hospital of Shantou University Medical College) and 19 randomly selected controls from Meizhou city were incorporated into the validation cohort. The remaining patients were incorporated into the primary cohort, including 51 COVID-19 patients from Huizhou, Yongzhou, and Shantou cities and 47 controls from Meizhou city.
the primary cohort was utilized to select the most valuable features and build the predictive model, and the validation cohort was used to evaluate and validate the performance of the model. the chest ct imaging data without contrast material enhancement were obtained from multiple hospitals with different ct systems, including ge ct discovery 750 hd (general electric company), scenaria 64 ct (hitachi medical), philips ingenuity ct (philips), and siemens somatom definition as (siemens). all images were reconstructed into 1-mm slices with a slice gap of 0.8 mm. detailed acquisition parameters were summarized in the supplementary material (table e1) . the clinical history, nursing records, and laboratory findings were reviewed for all patients. clinical characteristics, including demographic information, daily body temperature, blood pressure, heart rate, clinical symptoms, and history of exposure to epidemic centers, were collected. total white blood cell (wbc) counts, lymphocyte counts, ratio of lymphocyte, neutrophil count, ratio of neutrophil, procalcitonin (pct), c-reactive protein level (crp), and erythrocyte sedimentation rate (esr) were measured. all threshold values chosen for laboratory metrics were based on the normal ranges set by each individual hospital. for extraction of radiological semantic features, two senior radiologists (d.l. and x.c., more than 15 years of experience) reached a consensus, blinded to clinical and laboratory findings. the radiological semantic features included both qualitative and quantitative imaging features. the lesions in the outer third of the lung were defined as peripheral, and lesions in the inner two-thirds of the lung were defined as central [22] . the progression of covid-19 lesions within each lung lobe was evaluated by scoring each lobe from 0 to 4 [7] , corresponding to normal, 1~25% infection, 26~50% infection, 51~75% infection, and more than 75% infection, respectively. 
the scores were combined for all five lobes to provide a total score ranging from 0 to 20. a total of 41 radiological features (26 quantitative and 15 qualitative) were extracted for the analysis. the descriptions of radiological semantic features are listed in the supplementary material (table e2). figure 1 is one example of the evaluation of ct imaging. to obtain the most valuable clinical and radiological semantic features, statistical analysis, univariate analysis, and the least absolute shrinkage and selection operator (lasso) method were performed. in statistical analysis, the chi-square test, the kruskal-wallis h test, and the t test were utilized to compare the radiological semantic and clinical features between covid-19 and non-covid-19 groups. the features with p value smaller than 0.05 were selected. then, univariate analysis was performed for clinical and radiological candidate features to determine the covid-19 risk factors. the features with p value smaller than 0.05 in univariate analysis were also selected. the lasso method [23] was then utilized to select the most useful features, with penalty parameter tuning conducted by 10-fold cross-validation based on minimum criteria. diagnostic models were then constructed by multivariate logistic regression with the selected features. the flowchart of the feature selection process for these models is presented in the supplementary material (fig. e2). to develop an optimal model, we evaluated 3 models by analyzing (i) the clinical features model (c model), (ii) the radiological semantic features model (r model), and (iii) the combined clinical and radiological semantic features model (cr model) by multivariate logistic regression analysis. the classification performances of the models were evaluated by the area under the receiver operating characteristic (roc) curve.
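the lobe-involvement scoring described above (each of the five lobes scored 0-4 by percentage of infection, summed to a 0-20 total) can be sketched as follows; the function names are ours, and the thresholds follow the 1~25/26~50/51~75/>75% bands stated in the text.

```python
def lobe_score(fraction_infected):
    """Score one lobe: 0 = normal, 1 = 1-25%, 2 = 26-50%,
    3 = 51-75%, 4 = more than 75% involvement."""
    if fraction_infected <= 0:
        return 0
    if fraction_infected <= 0.25:
        return 1
    if fraction_infected <= 0.50:
        return 2
    if fraction_infected <= 0.75:
        return 3
    return 4

def total_lung_score(lobe_fractions):
    """Sum the scores of all five lobes (total range 0-20)."""
    assert len(lobe_fractions) == 5, "one fraction per lobe"
    return sum(lobe_score(f) for f in lobe_fractions)
```

for example, lobes with 0%, 10%, 30%, 60% and 90% involvement score 0 + 1 + 2 + 3 + 4 = 10 out of 20.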
the area under the curve (auc), accuracy, sensitivity, and specificity were also calculated. a decision curve analysis was conducted to determine the clinical usefulness of the diagnostic model by quantifying the net benefits at different threshold probabilities in the validation dataset [24]. the development of the decision curve is described in the supplementary materials. figure 2 depicts the flowchart of the proposed analysis pipeline described above. we also built a nomogram, a quantitative tool to predict the individual probability of covid-19 infection, based on the multivariate logistic analysis of the cr model in the primary cohort. based on the coefficients of the predictive factors in the multivariate logistic regression model, all values of each predictive factor were assigned points. a total point score was obtained by summing the points of all predictive factors. a scale in the nomogram shows the relationship between the total point score and the predicted probability. the corresponding calibration curves of the cr model in the primary and validation cohorts are shown in the supplementary material (fig. e3). statistical analysis was conducted with r software (version 3.6.4, http://www.r-project.org/). the reported significance levels were all two-sided, and the statistical significance level was set to 0.05. the multivariate logistic regression analysis was performed with the "stats" package, nomogram construction with the "rms" package, and decision curve analysis with the "dca.r" package. the differences between patients with and without covid-19 for all 67 features (41 imaging and 26 clinical features) are shown in tables 1 and 2 and the supplementary materials (tables e3 and e4). the differences between the primary and validation cohorts are shown in the supplementary materials (tables e5 and e6). all characteristics except fatigue and white blood cell count in the cr model presented no significant difference between the primary and validation cohorts.
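the nomogram's point-to-probability mapping described above follows the fitted logistic model: each predictor contributes points in proportion to its regression coefficient, and the total maps to a probability through the logistic function. a minimal sketch, with entirely hypothetical coefficients (the paper's fitted values are not reproduced here):

```python
import math

def logistic_probability(intercept, coefficients, feature_values):
    """Map a linear predictor z = b0 + sum(b_i * x_i) to a probability
    with the logistic function, as in multivariate logistic regression."""
    z = intercept + sum(b * x for b, x in zip(coefficients, feature_values))
    return 1.0 / (1.0 + math.exp(-z))

# hypothetical two-predictor example: larger predictor values push the
# predicted probability of covid-19 towards 1
p = logistic_probability(-1.0, [0.8, 1.2], [1.0, 0.5])
```

the nomogram's "points" scale is simply an affine rescaling of each term b_i * x_i, so summing points and reading off the probability scale is equivalent to evaluating this function.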
a total of 1745 lesions were identified, with 1062 from the covid-19 group and 683 from the non-covid-19 group. for imaging manifestations, 7 patients in the covid-19 group showed normal chest ct (10%). covid-19 patients have a greater number of pure ggo and mixed ggo than non-covid-19 patients (p = 0.018 and p = 0.001, respectively). for pure ggo lesions, the differences are significant both in peripheral (p = 0.032) and in central areas (p = 0.001). however, the number of mixed ggo is mainly distributed at the periphery in covid-19 patients (p < 0.001), with no statistical difference in the central area. the consolidation lesions without ggo occurred less in covid-19 patients (p = 0.001). more lesions are between 1 and 3 cm (p = 0.027), and fewer lesions are larger than half of the lung segment (p = 0.017) in covid-19 patients. other significant differences between the two groups include the pleural traction sign (p = 0.019), bronchial wall thickening (p < 0.001), interlobular septal thickening (p = 0.009), crazy paving (p < 0.001), tree-in-bud (p < 0.001), pleural effusions (p < 0.001), pleural thickening (p = 0.030), and the offending vessel augmentation in lesions (p < 0.001). the lung score presents no significant difference between the covid-19 and non-covid-19 groups. comparison of clinical features between the two groups of patients with and without covid-19 is reported in table 2 . there is no significant difference in age and sex between the two groups. significant differences are found in common symptoms between groups, including fever (p = 0.003), dry cough (p = 0.025), and fatigue (p = 0.007). the respiration rate and heart rate also show significant differences between the two groups (both p < 0.001). compared with non-covid-19 pneumonia, the reduction of the wbc count is more pronounced in covid-19 patients (p < 0.001). the ratio of lymphocyte and ratio of neutrophil also show a significant difference between covid-19 and non-covid-19 groups. 
although lymphopenia was observed in 32 covid-19 patients (45.71%), its frequency was not statistically different from that in the non-covid-19 group. c-reactive protein (crp) level and procalcitonin level are also significantly different between the two groups (p < 0.001 and p = 0.007, respectively). most covid-19 patients presented a normal procalcitonin level (82.86%). based on the results in tables 1 and 2, 18 radiological features and 17 clinical features were selected as candidate predictors. table 3 lists the features selected by univariate analysis and lasso. prediction models based on (i) clinical features (c model), (ii) radiological features (r model), and (iii) the combination of clinical and radiological features (cr model) were developed. roc analyses for the primary and validation cohorts are shown in table 4 and fig. 3. to determine the clinical usefulness of the diagnostic model, we developed the decision curve (fig. 4), which showed better performance for the cr model than for the c model and the r model. across the majority of the range of reasonable threshold probabilities, the decision curve analysis showed that the cr model had a higher overall benefit than the c model and the r model. the nomogram (fig. 5) was developed from the cr model in the primary cohort, incorporating the factors total number of mixed ggo in the peripheral area (tn_mixed_ggo_ip), tree-in-bud, offending vessel augmentation in lesions (ovail), respiration rate, heart rate, temperature, white blood cell count, cough, fatigue and lymphocyte count category. the total points were calculated by summing the points identified on the "points" scale for each factor. by comparing the "total points" scale and the "probability" scale, the individual probability of covid-19 infection could be obtained.
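the decision curves compared above quantify, at each threshold probability, the net benefit of acting on the model's predictions: true positives minus false positives, with the latter weighted by the odds of the threshold probability (this is the formulation described in the discussion of fig. 4 below). a minimal sketch, with the function name ours:

```python
def net_benefit(true_positives, false_positives, n, threshold):
    """Decision-curve net benefit at a given threshold probability pt:
    TP/n - (FP/n) * pt / (1 - pt)."""
    weight = threshold / (1.0 - threshold)
    return true_positives / n - (false_positives / n) * weight
```

for example, at a 10% threshold a model that flags 40 of 100 patients, 30 of them correctly, yields 0.3 - 0.1 * (0.1/0.9), roughly 0.289; the model with the higher curve across reasonable thresholds (here, the cr model) is the more clinically useful one.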
in this multi-center study, statistical analysis was performed to compare imaging and clinical manifestations between pneumonia patients with and without covid-19. eighteen radiological semantic features and seventeen clinical features were identified as significantly different between the two groups (p < 0.05). three models for covid-19 diagnosis were developed based on the refined features. the models were validated in both the primary and validation cohorts and achieved an auc as high as 0.986. these models will play an essential role in early and easily accessible diagnosis, especially when there are not enough rt-pcr kits or experimental platforms to test for covid-19 infection. a total of 1745 lesions were evaluated for qualitative features, location, and size in this study. consistent with previous studies, ground-glass opacities and consolidation in the lung periphery were considered to be the imaging hallmark in patients with covid-19 infection [11, 25] . however, when we subdivided the ggo into pure ggo and mixed ggo, we found that the distribution pattern differs between these two lesion types. pure ggo show differences between groups in every location of the lungs, whereas mixed ggo only show significant differences between groups in the lung periphery. recent studies defined four stages of lung involvement in covid-19 [26] . therefore, a follow-up analysis of these distributions would be significant. the lesion size in patients with covid-19 infection was another interesting observation. most lesions were between 1 and 3 cm, with few lesions larger than half of the lung segment, which was similar to the finding in mers-cov [22] . other features similar to mers-cov and sars-cov were observed in the laboratory abnormalities, such as lymphopenia, which may be associated with the cellular immune deficiency [3, 27] . however, our results showed no significant difference in lymphopenia between the covid-19 and non-covid-19 patients.
to our knowledge, no diagnostic model based on imaging and clinical features alone has been proposed for the diagnosis of covid-19. our clinical and radiological semantic (cr) model consisted of the following features: total number of ggo with consolidation in the peripheral area, tree-in-bud, offending vessel augmentation in lesions, temperature, heart rate, respiration rate, cough and fatigue, wbc count, and lymphocyte count category. the cr model outperformed the individual clinical and radiological models. this result is in accordance with a previous study in breast cancer, in which a model combining radiomics features and clinical features achieved a higher performance [24]. compared with a radiomics-based model, the extraction of radiological semantic features can overcome the image discrepancy caused by different scanning parameters and/or different ct vendors. a previous study [28] also indicated that models based on semantic features determined by an experienced thoracic radiologist slightly outperformed models based on computed texture features alone.

fig. 4 the y-axis measures the net benefit, which is calculated by summing the benefits (true-positive findings) and subtracting the harms (false-positive findings), weighting the latter by a factor related to the relative harm of undetected metastasis compared with the harm of unnecessary treatment. the decision curve shows that if the threshold probability is over 10%, the application of the combined clinical and radiological model (cr model) to diagnose covid-19 adds more benefit than the clinical model (c model) and the radiological model (r model).

there are a few limitations in this study. first, the sample size is relatively small because this is a retrospective analysis of a new disease and most of the cases outside of wuhan city are imported.
second, with the multi-center retrospective design, there is a potential bias in patient selection [29]. there may also be some deviations in marking semantic features among readers, although we made efforts to reduce these by creating pictorial examples and setting feature criteria (supplementary materials). third, a longitudinal ct study was not performed; whether this model can be used to evaluate follow-ups and help guide therapy remains an open question to be further explored. moreover, the rich high-order features of the ct image combined with radiomics or deep learning have not been studied, which may be another way to identify patients with covid-19. future work could also focus on the role of radiological features in disease monitoring, treatment evaluation, and prognosis prediction. in conclusion, 1745 lesions and 67 features were compared between pneumonia patients with and without covid-19. thirty-five features were significantly different between the two groups. a diagnostic model with an auc as high as 0.986 was developed and validated in both the primary and validation cohorts, which may help improve covid-19 diagnosis. guarantor: the scientific guarantor of this publication is zhuozhi dai. conflict of interest: one of the authors of this manuscript (yuting liao) is an employee of ge healthcare. the remaining authors declare no relationships with any companies whose products or services may be related to the subject matter of the article. statistics and biometry: one of the authors, dr. yuting liao, has significant statistical expertise. informed consent: written informed consent was waived by the institutional review board. ethical approval: institutional review board approval was obtained.
• retrospective
• case-control study
• multi-center study

fig. 5 nomogram of the cr model in the primary cohort. tn_mixed_ggo_ip represents the total number of mixed ggo in the peripheral area. ovail represents offending vessel augmentation in lesions. n denotes a negative result and p a positive result; norm represents normal. note that on the probability scale, 0 = non-covid-19 and 1 = covid-19.

open access: this article is licensed under a creative commons attribution 4.0 international license, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the creative commons licence, and indicate if changes were made. the images or other third party material in this article are included in the article's creative commons licence, unless indicated otherwise in a credit line to the material. if material is not included in the article's creative commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. to view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

references:
novel coronavirus (2019-ncov) situation reports
a novel coronavirus from patients with pneumonia in china
clinical features of patients infected with 2019 novel coronavirus in wuhan
imaging changes in patients with 2019-ncov
initial ct findings and temporal changes in patients with the novel coronavirus pneumonia (2019-ncov): a study of 63 patients in wuhan, china
outbreak of novel coronavirus (covid-19): what is the role of radiologists?
ct imaging features of 2019 novel coronavirus (2019-ncov). radiology
emerging coronavirus 2019-ncov pneumonia
chest ct findings in 2019 novel coronavirus (2019-ncov) infections from wuhan, china: key points for the radiologist. radiology
ct imaging of the 2019 novel coronavirus (2019-ncov) pneumonia. radiology
imaging profile of the covid-19 infection: radiologic findings and literature review
the many faces of covid-19: spectrum of imaging manifestations
longitudinal ct findings in covid-19 pneumonia: case presenting organizing pneumonia pattern
novel coronavirus (2019-ncov) technical guidance: laboratory testing for 2019-ncov in humans
a correlation study of ct and clinical features of different clinical types of 2019 novel coronavirus pneumonia
correlation of chest ct and rt-pcr testing in coronavirus disease 2019 (covid-19) in china: a report of 1014 cases
sensitivity of chest ct for covid-19: comparison to rt-pcr
chest ct for typical 2019-ncov pneumonia: relationship to negative rt-pcr testing
a clinical-radiomics nomogram for the preoperative prediction of lung metastasis in colorectal cancer patients with indeterminate pulmonary nodules
corpus callosum radiomics-based classification model in alzheimer's disease: a case-control study
predicting the development of normal-appearing white matter with radiomics in the aging brain: a longitudinal clinical study
ct correlation with outcomes in 15 patients with acute middle east respiratory syndrome coronavirus
least absolute shrinkage and selection operator and dimensionality reduction techniques in quantitative structure retention relationship modeling of retention in hydrophilic interaction liquid chromatography
radiomics of multiparametric mri for pretreatment prediction of pathologic complete response to neoadjuvant chemotherapy in breast cancer: a multicenter study
radiological findings from 81 patients with covid-19 pneumonia in wuhan, china: a descriptive study
time course of lung changes on chest ct during recovery from 2019 novel coronavirus (covid-19) pneumonia
comparison of prediction models with radiological semantic features and radiomics in lung cancer diagnosis of the pulmonary nodules: a case-control study
bias in research studies

publisher's note: springer nature remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.

key: cord-296565-apqm0i58 authors: togati, teodoro dario title: general theorizing and historical specificity in the 'keynes versus the classics' dispute date: 2020-10-06 journal: east econ j doi: 10.1057/s41302-020-00177-1 sha: doc_id: 296565 cord_uid: apqm0i58

this paper addresses the issues of general theorizing and historical specificity in the 'keynes versus the classics' dispute and puts forward two main arguments. first, the current macroeconomic orthodoxy wins the 'relative' generality contest because it implies that institutions influence outcomes, such as the natural rate of unemployment, in contrast with keynes's 'internalist' approach, which neglects historical specificity. secondly, mainstream macro is not truly general in an 'absolute' sense, since it only makes sense under very special real-world institutional conditions.

in their stimulating recent contributions, hodgson (2019) and o'donnell (2019a, b) discuss the issues of general theorizing and historical specificity in the 'keynes versus the classics' dispute. two limitations of their analysis emerge. first, they address the issues without placing them in the full context of the macroeconomic frameworks involved. for example, while correctly regarding the 'logic of choice' as a universal principle, they neglect that standard theorists themselves fail to apply it in full to phenomena such as money and expectations. this reveals both the impasse of their microfoundations project-stressed by hodgson himself in other insightful contributions, see, for example, 2001, pp. 15-6, 2006, p. 126-and, as i suggest below, the reasons why such theorists end up placing emphasis on institutions.
secondly, the two authors fail to compare the two theories and draw clear-cut conclusions to their generality dispute, which i believe is quite important in order to assess the foundations of current macroeconomic theory and policymaking in the light of recent crises. 1 on one side, hodgson suspends his judgement-in his important book, how economics forgot history, he holds, for example, that 'it is difficult to say whether keynes's theory or orthodox theory is the more general' (hodgson 2001, p. 221 )-as he considers them to be equally wrong instances of the universalistic approach to general theorizing (i.e. one claimed as true for all times and all places) necessarily involving the neglect of historical specificity. on the other, while holding that keynes's generality claim (against pigou) is still valid today against the axiomatic general equilibrium approach, o'donnell eschews direct comparison since he regards the general theory (hereafter, gt) as involving a criterion of general theorizing alternative to the standard one. 2 in order to overcome such limitations, this paper views theories as lakatosian 'research programmes' (rp). on one side, this notion favours direct comparison because it provides a neutral benchmark for 'internally consistent' theorizing: it suggests that in principle, all 'good' theories should possess various parts, organized in a hierarchical manner, such as 'hard core' beliefs, 'protective belt' assumptions and 'heuristics'. 3 on the other, due to its holistic nature (i.e. all parts must be considered together), the rp concept shows that historical specificity is compatible with general theorizing, while placing drastic limitations on its validity. 4 more specifically, based on this concept, the paper discusses the following arguments. first, general theorizing is not always synonymous with universal theorizing. 
for example, both keynes's theory and standard macro share a common feature: while accepting a universalistic approach to theorizing-stressing behavioural hard-core 'drivers' such as agents' optimizing choices or psychological laws-they are not truly universal in their scope or object of analysis: both theories also include among their premises 'enabling' institutions of a capitalistic type, such as the ones underlying the 'complete markets' assumption in the arrow-debreu model (hereafter, adm), or fiat money and the stock market in the gt. this means that both theories only make sense within the broad context of 'modern capitalism' (in various forms), and the essence of their dispute is one of 'relative' generality: that is, which theory provides the most general account of it. secondly, general theories can accommodate historical specificity if they allow institutions to play a causal role by influencing outcomes, like income. i argue that the current orthodoxy wins the generality contest because its rp is correctly formulated and reconciles general theorizing, i.e. the 'hard-core' logic of choice, with historical specificity captured by specific models in the 'protective belt', in which institutions influence outcomes, like the natural rate of unemployment. on the other hand, keynes's major flaw is that he did not make a sharp distinction between hard-core principles and lower-level 'modelling' claims: that is, he conflated them at a single analytical level.

footnote 1: one major question which arises today, for example, is whether the current 'consensus' macro is general enough to accommodate the post-covid-19 scenario.

footnote 2: o'donnell opposes the standard mtm (multiple-theory model) approach, consisting of three different stages (axioms, extra assumptions and particular cases), to keynes's otm (one theory model), which he correctly characterizes as a 'one-stage' approach: '[it] has several senses in which it can express its generality, all are internal to the model and do not require external assistance of any kind' (2019a, p. 718, emphasis in the text). based on this, i label keynes's approach as 'internalist'.

footnote 3: i suggest that the rp criterion is a step forward from past comparisons of theories based exclusively on formal models, such as islm. as noted by leijonhufvud, for example, it is insufficient to list model properties to distinguish between theories, in contrast with most economists, who use 'model' and 'theory' as synonyms: '"theory" is a "patterned set of substantive beliefs about how the economic system works"… a "model" is the formal representation of a "theory", or a subset of it… we will seldom… have a model that gives us an exhaustive account of the hard core of the corresponding theory. we may miss quite essential characteristics of the research programmes, therefore, if we approach the problem only by inference from "lists" of distinctive model properties' (1976, p. 70). an example is the controversy between monetarists and keynesians in the 1970s, in which economists like friedman-holding that the differences between such schools only concerned the values of certain parameters of the islm model (such as the interest-elasticity of the demand for money function)-drew the conclusion that they belonged to the same rp. but according to leijonhufvud, these schools belong to two distinct rps insofar as they imply different visions of the economy (ibid., p. 71).

footnote 4: as noted by hands, for example, the rp notion is useful, especially 'for understanding the structure of economics… economic theories do seem to have hard cores, protective belts, positive and negative heuristics…' (2001, p. 296). this notion is a more articulated version of the mtm criterion mentioned by o'donnell.
for this reason, he developed his 'internalist' approach, stressing ultimate psychological factors and, as emphasized by hodgson, neglecting historical specificity. thirdly, historical specificity also places constraints on general theorizing. i emphasize that what undermines mainstream macro is that while being general in a 'relative' sense, it is not truly general in an 'absolute' sense, namely when it is evaluated not just in terms of internal consistency, but also in terms of an 'externalist' criterion, such as its applicability to real-world contexts. the point is that this theory is not entirely self-contained due to intrinsic limitations of its microfoundations project. as noted long ago by frank hahn, for example, money and expectations are irreducible to the timeless logic of choice, as revealed by their inconsistency with the adm, which he regards as the 'best model' of the economy. on close inspection, they turn out to be time-contingent 'conventions', requiring 'external' anchors, such as institutional factors, to be sustainable over a period of time. now, analysis of such anchors in real-world contexts suggests that standard theory only makes sense under very special institutional conditions, such as those needed to grant the 'communism of models' underlying lucas' rational expectations hypothesis. to develop these points, this paper is organized as follows: 'keynes's "internalist" approach to generality' section focuses on keynes's 'internalist' approach to generality. 'the reasons for the success of the axiomatic approach in the "relative" generality dispute' section analyses the success of the axiomatic approach in the 'relative' generality contest vs keynes. 'is standard macro theory truly general?' section discusses the limitations of standard theory concerning 'absolute' generality.

footnote 5: in other words, unlike o'donnell, i regard keynes's otm approach as a vice rather than a virtue.

footnote 6: keynes did not consider, for example, that 'the scope for governmental management of the level of effective demand would depend crucially on the economic institutions of a particular country and the nature and extent of its engagement with world markets … it may be possible to regard keynes's work as a framework for viable analyses that addressed such specific circumstances, but keynes himself did not lay down guidelines for the development of historically sensitive theories' (hodgson 2001, p. 225).

footnote 7: 'the most serious challenge that the existence of money poses to the theorists is this: the best developed model of the economy cannot find room for it' (hahn 1982, p. 1), and 'we have no theory of expectations firmly founded on elementary principles comparable, say, to our theory of consumer choice' (ibid., p. 3).

there are five main reasons why we might regard keynes as pursuing an 'internalist' approach to generality. for simplicity's sake, we shall regard these reasons as 'pillars' of his generalist strategy stressing different properties of the gt. the first pillar is keynes's focus on the 'intrinsic' characteristics of a real-world 'monetary economy', which make it quite different from the 'real wage economy' abstraction underlying standard theory. as he himself put it:

moreover, the characteristics of the special case assumed by the classical theory happen not to be those of the economic society in which we actually live, with the result that its teaching is misleading and disastrous if we attempt to apply it to the facts of experience. (keynes 1936, p.
3, my italics) more specifically, keynes placed a lot of emphasis on characteristics, such as true uncertainty and conventional behaviour, as well as on institutions of modern capitalism such as the stock market, fiat money and elasticities of production and substitution, which explain both why this economy is irreducible to a barter economy (for example, the money interest rate cannot be put on a par with the 'own' rates of interest of other commodities) and a lack of effective demand is possible. in particular, i suggest that in the gt, this phenomenon is not 'caused' but simply 'enabled' by the above institutions. it is ultimately due to behavioural drivers, such as agents' decisions to hold money rather than buy goods. 8 the second pillar of keynes's internalist approach is his belief that scientific analysis should be carried out in terms of 'self-contained models', and his main aim was to improve pigou's model. as he pointed out in his letter to harrod, 'progress in economics consists almost entirely in a progressive improvement in the choice of models. the grave fault of the later classical school, exemplified by pigou, has been to overwork a too simple or out of date model, and in not seeing that progress lay in improving the model' (keynes 1938, p. 296) . indeed, his revolution, establishing macroeconomics as an autonomous subject, can be regarded as the generalization of the marshallian partial equilibrium model to the economy as a whole. 
another pillar of keynes's strategy is the link between the generality of his theory and its ability to account for multiple equilibria referring to actual, 'internal' states of the economy: that is, those corresponding to different levels of unemployment, in contrast with standard theory associated with full employment: i shall argue that the postulates of the classical theory are applicable to a special case only and not to the general case, the situation which it assumes being a limiting point of the possible positions of equilibrium […] the classical theory is only applicable to the case of full employment, it is fallacious to apply it to the problems of involuntary unemployment. (keynes 1936, pp. 3, 16) keynes's argument has two major implications. first of all, it implies the existence of a 'direct' relationship between theory and 'external reality' (see, for example, pernecky and wojic 2019, p. 773). in particular, he thought it was possible to define various employment states 'objectively' and to establish a one-to-one correspondence between such states and theories. secondly, keynes regarded his theory as covering the normal case of the economy, namely persistent high unemployment, in contrast with pigou for whom unemployment is only a cyclical phenomenon due to short-lived 'external' shocks. the fourth pillar of keynes's internalist approach is his proposed view of rationality, broader than the standard one. while the latter stresses atomistic agents' substantive rationality in abstract, oversimplified contexts, keynes focused instead on how they actually behave in real-world contexts. as he clarified, especially in his 1937 summary article formulating the 'practical theory of the future', in order to make decisions in such contexts, agents are forced to rely upon conventional techniques, such as assuming that tomorrow will be like today or falling back on other agents' views. 
it can be argued that in his works keynes mainly provided a descriptive account of such techniques, 9 which led him to capture essentially the 'psychological' dimension of conventions, i.e. the fact that they are formed through the interaction of individual agents, each seeking reassurance in the behaviour of others, rather than their theoretical dimension. he failed to underline, for example, that they are not just 'internal' market structures; being forms of precarious knowledge, they also call for 'external' anchors, including institutional factors, to be sustainable through time. 10 moreover, he took the aggregate propensities underlying his model as quite impenetrable 'data'. 11 keynes nevertheless regarded conventions as playing a key role in his theory. on the one hand, he described key variables, such as the monetary rate of interest and the money wage, as conventional features; 12 on the other, he used them to support his generality claim. he noted that standard theory is just one particular conventional technique; for example, he accused '… classical economic theory of being itself one of these pretty, polite techniques which tries to deal with the present by abstracting from the fact that we know very little about the future' (1937, p. 115). in particular, keynes hinted at two different aspects of standard theory as a convention that should be distinguished: this theory can be regarded either as a technique used by economic agents to lull their disquietude in the face of uncertainty (it amounts to projecting the existing situation into the future 'to save the face of rational men') or as a technique used by theorists themselves to lull the horror vacui of instability (for example, it amounts to relying on various ceteris paribus assumptions). 
in this case, standard theory could be regarded not as a true statement about the world or a substantive theory, but simply as a useful heuristic assumption to describe complex behaviour. this is the sense in which keynes himself treated the key standard assumption of maximizing behaviour in the gt. indeed, one key implication of his principle of effective demand-seen as substantive theory accounting, so to speak, for agents' ex ante behaviour-is that firms do not primarily act by trying to maximize profit (for example, they decide how much labour to hire on the grounds of expected demand, not by looking at the real wage and labour productivity, as in standard theory, which takes maximizing behaviour as the key starting point of the analysis). 

10 in other words, in line with hodgson's critique, it is true that in the gt keynes failed to carry out a full-blown 'conditional' analysis focusing on the specific role of institutions as anchors to conventions, an analysis which would have led him to go beyond 'self-contained' modelling and make reference to specific historical contexts. this is an important limitation for the applicability of demand policies, as they can only be devised in the light of such contexts. paradoxically, instances of this kind of analysis can be found in some of keynes's earlier writings, such as the economic consequences of the peace, where he noted that the applicability of standard theory-which he viewed as a form of convention-to the conditions of the victorian age depended upon the existence of an implicit 'social pact' between workers and capitalists: this remarkable system depended for its growth on a double bluff or deception. on the one hand the labouring classes accepted from ignorance or powerlessness, or were compelled, persuaded, or cajoled by custom, convention, authority, and the well-established order of society into accepting, a situation in which they could call their own very little of the cake that they and nature and the capitalists were co-operating to produce. and, on the other hand, the capitalist classes were allowed to call the best part of the cake theirs and were theoretically free to consume it, on the tacit underlying condition that they consumed very little of it in practice. the duty of 'saving' became nine-tenths of virtue and the growth of the cake the object of true religion. (keynes 1919, p. 4). another instance of conditional analysis is provided by a treatise on money, where keynes refers to the chartalist theory, according to which money is a convention that arises from voluntary exchanges in markets to solve the problems of trade but becomes stable only when it is supported by the state (see keynes 1930, pp. 4, 6). 

11 keynes left readers quite unclear about whether conventions or psychological laws should be taken as the ultimate foundations of his aggregate propensities and the true 'drivers' of his theory. while referring to conventions, he did not regard them as the common background of all his functions. suffice it to note, for example, that he viewed his ultimate independent variables 'as consisting of the three fundamental psychological factors, the psychological propensity to consume, the psychological attitude to liquidity and the psychological expectation of future yield from capital-assets…' (keynes 1936, p. 247). 

12 for example, he noted that 'it might be more accurate, perhaps, to say that the rate of interest is a highly conventional, rather than a highly psychological, phenomenon. for its actual value is largely governed by the prevailing view as to what its value is expected to be' (keynes 1936, p. 203). 
following his marshallian background, keynes used the optimizing assumption merely to characterize ex post equilibrium in his aggregate demand and supply model. in other words, he actually viewed firms as maximizers, but only in the context of the principle of effective demand as key driver of the analysis. the last pillar of keynes's internalist approach concerns his 'orderly method of thinking', according to which, in order to deal with complex reality, where the relevant factors are organically interdependent, a theorist cannot consider all variables at once but is compelled to make simplifying distinctions such as the isolation of the true causal factors and the 'freezing' of other variables, at least temporarily. 13 keynes clarified his approach in chapter 18 of the book in particular, where he summarizes his 'static' analysis of the determinants of income (i.e. relating to equilibrium at a point in time) by making a distinction between two types of givens, which can be labelled respectively 'primary' and 'secondary'. on the one hand, he singled out his ultimate independent variables as consisting of (1) the three fundamental psychological factors, the psychological propensity to consume, the psychological attitude to liquidity and the psychological expectation of future yield from capital-assets, (2) the wage-unit as determined by the bargains reached between employers and employed, and (3) the quantity of money as determined by the action of the central bank. 
(keynes 1936, p. 247) on the other, he isolated the 'given' factors, which essentially correspond to the deep parameters of standard theory, that are preferences and resources: we take as given the existing skill, the existing quality and quantity of available equipment, the existing technique, the degree of competition, the tastes 13 and habits of the consumer, the disutility of different intensities of labour and of the activities of supervision and organization, as well as the social structure including the forces, other than our variables set forth below, which determine the distribution of the national income. this does not mean that we assume these factors to be constant; but merely that, in this place and context, we are not considering or taking into account the effects and consequences of changes in them. (ibid., 245) on this basis, one can understand why keynes's 'orderly method' supports his generality claim. 

13 as keynes put it: the object of our analysis is, not to provide a machine, or method of blind manipulation, which will furnish an infallible answer, but to provide ourselves with an organized and orderly method of thinking out particular problems; and, after we have reached a provisional conclusion by isolating the complicating factors one by one, we then have to go back on ourselves and allow, as well as we can, for the probable interactions of the factors among themselves. this is the nature of economic thinking. any other way of applying our formal principles of thought (without which, however, we shall be lost in the wood) will lead us into error. it is a great fault of symbolic pseudo-mathematical methods of formalising a system of economic analysis … that they expressly assume strict independence between the factors involved and lose all their cogency and authority if this hypothesis is disallowed… (ibid., 297). 
while not focusing on growth explicitly, it can be argued that in principle keynes regarded effective demand as being relevant to discuss not just fluctuations but also growth. 14 one implication of his organicist perspective-in contrast with the 'strict independence' view underlying mainstream theory-is that when the passage of time is considered, the primary demand factors are not truly independent but can be shaped by interaction with the secondary factors. this means that aggregate demand represents both the cause and the effect of key trends or phenomena, such as technological progress or population growth, namely changes in the factors that keynes takes as 'secondary' givens. on the one hand, it is clear, for example, that without a sufficiently high propensity to invest there can be no technological progress. on the other, there is no doubting that the latter also influences aggregate demand. in this section, i suggest that the rp notion helps us to understand why keynes's internalist approach, while successful against pigou, has revealed itself to be a blunt weapon in the face of current standard macro based on debreu's axiomatic formulation of general equilibrium theory (get). one can single out at least five different reasons-all relating to the introduction of institutions into the analysis-for the victory of the axiomatic approach vis-à-vis keynes's gt over the 'relative' generality issue. first of all, the rp logic shows why keynes's emphasis on the intrinsic characteristics of a 'monetary economy' is not sufficient to support his generality claim against the modern standard approach. the point is that the latter builds an 'augmented' modelling dimension, capable in principle of accommodating all states of the economy, including those generated by a monetary economy. this augmentation can be understood if one considers that axiomatization-as an attempt to deal with the new unstable world, reflected in the 'crisis of foundations' in all fields at the end of the nineteenth century-creates what can be regarded as a certain degree of 'modelling freedom' by calling into question the one-to-one correspondence between theory and the 'objective' structure of the world, which still underlay walras' get and marshallian theory. the specific solution to this crisis provided by axiomatization is the shift from 'correspondence with external reality' to 'internal consistency' as the main prerequisite of successful theory. strictly speaking, the new standard axiomatic approach does not imply a sharp break with walras or marshall in terms of substantive theory or 'cosmological' beliefs as to the internal stability of the economy, the key role of individual agents pursuing self-interest and utility value theory. the new approach implies instead a streamlined or 'purified' reformulation of such beliefs, thus avoiding reference to realistic features, such as those typical of the cambridge tradition. the reason why the adm, as a formal representation of the hard-core beliefs of the neo-walrasian rp, 15 involves an augmented modelling freedom is that it implies a kind of 'relativistic' turn: it no longer assumes an 'absolute' notion of a commodity-based simply on its intrinsic nature-but radically generalizes it by differentiating commodities also by time, place of delivery and states of the world. 

14 suffice it to note that conventions are intrinsic features of a monetary economy and thus do not vanish with the passage of time simply because one analyses the rate of change of income rather than just its level. however, it is true that keynes somehow failed to present his principle of effective demand as the permanent driver of the economy. in chapter 18, he pointed out, for example, that the order between the two sets of factors is reversible and depends upon the problem at hand (ibid., p. 247). 
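the 'relativistic' generalization of the commodity notion can be given a compact formal sketch. the indexing below is a standard textbook rendering of the arrow-debreu commodity space, offered as an illustration; the notation is mine, not the source's.

```latex
% illustrative sketch (standard textbook notation, not the source's own):
% a commodity is identified not only by its physical type i, but also by
% the date t, the location l and the state of the world s of its delivery:
\[
  x_{i,t,l,s}, \qquad
  i = 1,\dots,I, \quad t = 1,\dots,T, \quad l = 1,\dots,L, \quad s = 1,\dots,S .
\]
% 'complete markets' then means that a (forward) price exists for every
% such index combination, i.e. the price system is a point
\[
  p = \bigl(p_{i,t,l,s}\bigr) \in \mathbb{R}_{+}^{\,I \cdot T \cdot L \cdot S},
\]
% so that all trades, for all dates, places and contingencies, can in
% principle be concluded at a single initial moment.
```

on this rendering, keynes's monetary economy corresponds to the case in which many of these markets simply do not exist, which is why the gt can be made to appear as a special case of the augmented framework.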
the adm thus makes a very strong assumption, namely that of 'complete markets'-i.e. there exists a market for every time period and forward prices for every commodity at all time periods, in all places and states of the world-which necessarily involves a certain role of institutions as enabling factors. indeed, the existence of all these markets has nothing to do with primitive 'barter' or the 'natural' tendency of men to exchange goods, noted for example by adam smith, but instead calls for a very 'artificial', hyper-capitalistic institutional setting to arise. 16 the reason why this, in turn, generates an augmented modelling dimension, which is in principle more general than keynes's, can thus be readily understood: when commodities are specified as conditional on various 'states of the world', the adm does incorporate not just 'actual' or 'observable' states of the economy reflecting the intrinsic properties of a monetary economy considered by keynes, but also 'potential' or 'unobservable' ones. in this context, the gt appears to apply to a special case only, namely the kind of economy in which many future markets are missing. secondly, the rp logic shows why keynes's attempt to generalize pigou's model is not sufficient to support his generality claim against the modern standard approach: it amounts to calling into question more the 'protective belt' of the classical macro rp than its 'hard core'. 17 based on the adm and its sophisticated institutional setting, this hard core plays a key role in modern economics because it turns 'choice theory' into the only possible 'natural laws'-i.e. those forms of agents' behaviour that are true whatever the context-generating the basic rules of 'grammar' of economics, from which serious theorists cannot simply depart if they want to be understood by their peers. 
this grammar represents the purely linguistic formulation of the (relatively informal) hard-core cosmological beliefs, which is 'neutral' in the sense that it is independent from all possible interpretations of external reality. 18 in other words, the neo-walrasian rp implies that there is no 'direct access' to external reality (or to models reflecting it); this access can only be mediated by interpretations based on the adm grammar. strictly speaking, this does not mean that there is no ontology-i.e. claims about the real world-behind this rp, but only that it consists of those claims about agents-that they have preferences, pursue self-interest, calculate opportunity costs and so on-that appear so elementary or 'familiar' that they can be taken for granted and treated as axioms. theorists are thus able to concentrate on the purely linguistic dimension. 'grammatical' interpretations (i.e. those that respect the basic hard-core principles) take the shape of more specific 'models' in the protective belt. it is at this stage that another form of modelling freedom becomes possible: there can be many interpretations, depending, for example, on whether theorists take the adm as a positive benchmark (namely, if one takes it as fully descriptive of external reality, as the chicago school does) or a negative benchmark (that is, if one takes it as a tool for understanding what goes wrong in actual economies, as the new keynesians do). 19 based on this, one can see once again why keynes's gt appears as a particular case within the axiomatic approach: when seen through the lens of new keynesian 'grammatical' contributions, it represents merely a specific interpretation of capitalism, giving rise to a particular class of models, among many others, in the protective belt; it does not directly call into question the axioms of choice theory underlying the hard core of this rp. 
17 keynes's approach is justified by the fact that he shared with pigou many marshallian cosmological beliefs, which were so 'open' as to appear compatible with alternative approaches. for example, marshall regarded agents as being 'reasonable' as opposed to fully rational, as contemporary standard theory does. moreover, his beliefs were not even strictly individualistic and reductionist, as shown for example by his mix of biological and mechanical metaphors and the 'representative firm' device, which represented a halfway construction between micro and macro rather than a simple magnification of micro thinking as it is today (see, for example, hodgson 1993, pp. 101-2; togati 2019). 18 as debreu noted, in his approach, 'allegiance to rigor dictates the axiomatic form of the analysis where the theory, in the strict sense, is logically entirely disconnected from its interpretations' (debreu 1959, p. x). 19 for this distinction, see, for example, hahn 1982. 

thirdly, in the light of the rp notion, by stressing the link between standard theory and full employment, keynes's multiple equilibria story also appears insufficient to establish his generality claim. in line with post-positivist approaches, this notion implies the existence of a gap between substantive theoretical claims (or cosmological beliefs) at the hard-core level and observable states of unemployment. on the one hand, cosmological beliefs are metaphysical and 'unfalsifiable', meaning that one cannot dismiss standard postulates by simply noting that they are at variance with 'facts'. on the other, 'facts do not speak by themselves': it is always possible to interpret them in alternative ways. thus, while certainly legitimate, keynes's multiple equilibria story is just one of many possible other stories that can, in principle, explain labour market outcomes. suffice it to note that standard theory itself manages to explain even high unemployment rates as equilibrium phenomena. 
in particular, two key factors account for this result. first, the adm also generates a broader notion of equilibrium. while keynes regarded equilibrium simply as a 'state of rest' at a point in time, for standard theorists it has an inter-temporal, 'stochastic' dimension: in their view, an economy following a multivariate stochastic process may be described as being in equilibrium. following this broader notion, these theorists are able in principle to conceptualize and interpret actual states of the economy-such as relatively 'mild' fluctuations-in terms of walrasian theory. more specifically, the stochastic approach is able to reconcile observed outcomes (not a remote 'long-run equilibrium' state, but the outcomes generated by the stochastic process itself) with the 'underlying structure' of the economy (described by the 'deep' parameters). secondly, thanks to its consideration of institutional factors as causal factors capable of determining outcomes, this approach is consistent not only with 'mild' fluctuations but also with keynes's persistent underemployment equilibria, thus once again reducing the gt to particular-case status. this is made clear especially by friedman, who introduces a number of institutional factors in his 'natural rate of unemployment' (nru) concept. in his view, the latter designates: (…) the level which would be ground out by the walrasian system of general equilibrium equations, provided that there are imbedded in them the actual structural characteristics of the labor and commodity markets, including market imperfections, stochastic variability in demands and supplies, the cost of gathering information about job vacancies and labor availabilities, the costs of mobility and so on. (1968, p. 8) in this way, friedman manages to interpret any actual rate of unemployment as 'structural' unemployment, rather than induced by a lack of aggregate demand. as he noted, the nru 'is not a fixed number. 
it is not 6%, or 5%, or some other magic number… the natural rate is a concept that does have a numerical counterpart-but that counterpart is not easy to estimate and will depend on particular circumstances of time and place' (friedman 1996). 20 fourthly, the lakatosian benchmark also helps us to see that keynes's 'descriptive' analysis of conventions is not sufficient to explain the 'data' of his model and thus establish his generality claim. the problem with such data is that they too 'do not speak by themselves' but must be interpreted. now, without a clear, full-blown justification of such data at the hard-core level in the gt-for example, keynes failed to regard conventions as the alternative 'natural laws' of macroeconomics (in contrast with the logic of choice) 21 or even to clarify whether psychology or conventions are the ultimate independent variables of his analysis-it is not surprising that interpreters following the axiomatic approach have been predominant ever since the late 1930s. such authors have taken two 'micro-foundation' steps, which appear to justify the generality of standard theory against the gt. the first was to replace keynes's original formulation of his aggregate functions, which were widely regarded as being ad hoc-i.e. lacking proper theoretical justification 22 -with alternative ones based on choice theory. in particular, following the key heuristic principle of methodological individualism-according to which social scientists should seek to understand all macro phenomena as the result of the interaction of individual optimizing atomistic agents-authors like modigliani and tobin managed to obtain new formulations of the consumption and liquidity preference functions capable of accommodating keynes's original insights as special cases. 23 the second step in the generalization process of standard theory vis-à-vis the gt-taken by monetarist authors like lucas-was to construct a broader analytical dimension at the protective-belt level (concerning more specific macro models) by providing an 'endogenous' account of expectations, which keynesians of all persuasions had continued to regard essentially as exogenous givens. once again, this greater generality has been achieved by placing the emphasis on institutional factors, such as policy rules, that were essentially ignored by keynes. the existence of a link between expectations and policy rules in this approach arises because it implies the 'communism of models' assumption, according to which in the real-world economy not just rational agents but also policy-makers believe in standard models, and base their expectations and actions upon them. 24 in particular, according to this assumption, the only way governments and central banks can succeed in granting stability to market economies populated by rational agents is to adopt policy rules that are in tune with the standard 'natural laws', such as the need to balance budgets or carry out structural reforms of markets. 

20 this stance seems to contradict friedman's earlier views-stated, for example, in his 1946 paper on lange-according to which 'the ultimate test of the validity of a theory is not conformity to the canons of formal logic but the ability to deduce facts that have not yet been observed, that are capable of being contradicted by observation, and that subsequent observation does not contradict' (friedman 1946, p. 631). it can be argued that friedman shifts from a marshallian stance that sees theory as an 'engine for the discovery of concrete truth' (marshall (1885) 1925, p. 159) to a stance that regards the simple macro models deriving from the adm as having descriptive or positive value on a priori grounds. this does not mean that they correctly describe all real-world phenomena but that they capture their 'essential' or 'normal' features, including cyclical co-movements and growth. as reder (1982) notes, for example, adherents to this view 'assume … that … one may treat observed prices and quantities as good approximations to their long-run competitive equilibrium values' (1982, p. 12). investigators can thus 'abstract from the effects of transitory market imperfections resulting in misallocation or underutilization of resources' (ibid.). for example, in line with the neo-walrasian hard-core claims, 'chicago economists have always recognized (that) prices (especially wage rates) are sticky in the short-run' (ibid.), but this phenomenon is either temporary or in need of reconciliation with the maintained hypothesis of continuous optimization. indeed, such theorists 'are far less willing than others to accept reports of irrational or inefficient behaviour at face value, including money illusion, and typically seek to discredit or reinterpret such reports so as to protect the basic theory' (ibid., p. 15, my italics). 

21 in principle, the conventional techniques emphasized by keynes could be regarded as the 'natural' features of agents' macroeconomic behaviour because they apply whatever the context. in this way, they could play the same role in his theory as the 'logic of choice' in standard theory, that is, abstract hard-core principles from which all the propositions of his theory, such as the key role of aggregate demand and the possibility of involuntary unemployment, follow. 

22 leontief was the first to note that keynes elevates 'a questionable theorem to the status of a fundamental postulate and interprets the … new term as an independent datum' (leontief 1937, p. 345). from the standpoint of get, to which he subscribed, keynes's aggregate propensities cannot be truly independent data because, in this theory, only the 'deep parameters' enjoy this status. (for analysis of these issues, see, for example, togati 1998, 2019.) 
ultimately, in the light of our lakatosian benchmark, it also appears that keynes's 'abstract' reference to his 'orderly method of thinking' is not sufficient to establish his generality claim. due to the fact that keynes essentially focused on the role of aggregate demand in his instantaneous equilibrium model, he did not actually implement the method he advocated (at least implicitly) to study growth, as he simply took the key propensities underlying this model as his 'primary givens'. he thus failed to provide an account of how they could change in the face of changes in the 'secondary givens', such as technology or population, which is necessary for a discussion of growth based on aggregate demand. it is not surprising, therefore, that this limitation opened the way for standard theorists to reverse keynes's original generality claim once again. initially, theorists like samuelson and solow accomplished this task by regarding the gt as relevant for the analysis of temporary fluctuations only, in contrast with neoclassical theory, which was capable of accounting for growth trends on the grounds of models stressing the key role of supply-side factors, such as technology, the saving ratio and population. in more recent times, instead, many standard theorists have established their generality claims by ruling out reference to the gt altogether. 

23 for example, the strong link between consumption and current income emphasized by keynes can be shown to be a special case within modigliani's life-cycle theory, which occurs when liquidity constraints are introduced. 

24 'the rational expectations hypothesis imposes a communism of models: the people being modeled know the model. this makes the economic analyst, the policy maker and the agents being modeled all share the same model, i.e. the same probability distribution over sequences of outcomes' (hansen and sargent 2000, p. 1). 
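the 'communism of models' assumption admits a compact formal statement. the following is a standard formulation of rational expectations, offered as an illustration; the notation is mine and is not drawn from hansen and sargent's text.

```latex
% standard rational-expectations condition (illustrative notation):
% the subjective expectation of next period's variable x equals the
% mathematical expectation taken under the model's own probability
% distribution P, conditional on the common information set Omega_t:
\[
  x^{e}_{t+1} \;=\; \mathbb{E}_{P}\!\left[\, x_{t+1} \mid \Omega_t \,\right].
\]
% 'communism of models' means that the agents being modeled, the theorist
% and the policy-maker all share the same P, so that their expectations
% over sequences of outcomes coincide by construction.
```

the link to policy rules follows directly: since the government's rule is part of the model generating P, agents' expectations respond systematically to any announced change in that rule.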
on the one hand, new classical macroeconomists-on the basis of the notion of stochastic equilibrium-regard supply-side factors as being relevant to deal with both fluctuations and growth; on the other, important strands in growth theory embrace an endogenous perspective which resembles keynes's orderly method inasmuch as it seeks to explain the supply-side factors themselves, which were taken as exogenous in old growth models à la solow. however, in this perspective, institutions play the same explanatory, causal role in determining outcomes as aggregate demand (at least in principle) does in keynes's approach; indeed, they appear to be the 'deep' factors of growth. in particular, based on the distinction between the 'proximate' causes of growth which act in the medium term and the 'deep' causes acting in the long run, this institutionalist perspective is tantamount to explaining 'fundamental' structural factors assumed as given in the medium run, such as technology, the stock of capital and labour and workers' productivity, on the grounds of further 'deep' structural and institutional factors, such as the capability to innovate, the saving ratio, the education system, the rule of law and quality of government (see, for example, acemoglu et al. 2005) . three basic implications of this perspective should be underlined. the first is a 'relativistic' one: the growth of a market economy is no longer 'automatic', but occurs only when an adequate pro-growth institutional context exists in the real world. 
secondly, in order to explain why growth may not occur, standard theorists further broaden the scope of their analysis by considering new causal factors, such as market failures-including the incapability of pure market forces to provide a sufficient amount of particular goods, such as r&d or education, which it is the task of 'structural' policies to remedy-or institutional failures, including 'extractive' institutions, corruption or bad political systems (see, for example, acemoglu and robinson 2012, 2013) that create uncertainty and impair the smooth working of markets. thirdly, it shows why standard theory appears more general than its keynesian competitor. just as the nru concept fits any labour market evidence, standard growth theory is able to account for all observable growth conditions, including 'negative' ones such as depression and stagnation, without making any reference to keynes's analysis of persistent unemployment as being due to a lack of aggregate demand. but is this really the end of the generality battle? does the endless number of models and textbooks produced on the grounds of the standard premises really incorporate superior scientific knowledge that can, always and everywhere (in the context of modern capitalism), be taken as the correct foundation for policy moves? in this section, basing myself on the rp structure, i provide a negative answer to this question by emphasising one limitation of the generality of standard theory, which arises from its approach to institutions. more specifically, there is reason to regard this theory as an 'analytical giant with institutional feet of clay'. in order to make this claim clear, it is necessary to realize that a 'basic flaw' undermines the generality of the standard rp: namely, its key heuristic principle, whereby theorists ought to search for microfoundations, cannot actually be implemented. 
as noted by frank hahn, for example, key phenomena such as money and expectations cannot fully be explained in terms of the logic of choice underlying the adm. indeed, it is because the latter relies on the complete market assumption-whereby all choices are made instantly for all dates and contingencies-that it simply cannot find room for money and expectations,[25] hence for phenomena such as fluctuations and crises induced by them, which lie at the heart of macroeconomics. based on this, the rational expectations revolution appears in a new light. instead of providing an explanation of expectations in terms of first principles, it amounts to an ingenious solution to the impasse created by the failure to find this explanation. suffice it to note that when considered together with other assumptions-such as the one whereby equilibrium is unique and stable[26] or the device of a representative agent-the rational expectations hypothesis turns out to be a heuristic, simplifying assumption or technical device that is not 'true', but allows a smooth transition from the hard-core claims of the adm (where there are complete markets) to the more specific aggregate models (where many futures markets are missing),[27] which form the protective belt of the neo-walrasian rp.[28] more specifically, this hypothesis allows standard theorists to derive macroeconomic results-such as the convergence of the actual dynamics of the economy to the 'natural' levels of output and employment generated by the real, supply-side factors, which both money and expectations are unable to affect-that are in tune with the hard-core 'natural laws'. it must be noted, however, that if modern macro models are not fully anchored to the logic of choice but essentially turn out to be conventional features, based on a set of technical devices or assumptions, two major consequences follow. first of all, standard macro theory itself can no longer be regarded as a self-contained entity, obviously and unquestionably applying to all forms of modern capitalism. indeed, despite the lucasian rhetoric on this theory as being the 'true' model of the economy or 'the only game in town', its 'truth' or validity is far from being obvious. the point is that conventions are not intrinsically true; their 'truth' or validity is contingent and conditional: it all depends upon the strength of their external anchors. secondly, we need to consider a new dimension of generality, namely that of 'absolute' generality, which concerns the relationship between standard theory and real-world economies. while to discuss the 'relative' generality of this theory it is sufficient to focus only on 'internal consistency', to discuss its 'absolute' generality we must instead clarify the conditions under which it holds in real-world economies.

[25] 'the most serious challenge that the existence of money poses to the theorists is this: the best developed model of the economy cannot find room for it' (hahn 1982, p. 1), and 'we have no theory of expectations firmly founded on elementary principles comparable, say, to our theory of consumer choice' (ibid., p. 3). strictly speaking, by placing the emphasis on such claims, i do not mean to neglect the existence of a large number of contributions in the literature that address the 'microfoundations of money' issue, that is, they seek to show its consistency with the first principles of choice theory. for a useful survey see, for example, lagos et al. (2014). what i hold instead is that such approaches do not seem to solve the basic problem identified by hahn. they show, for example, that the role of money can only be justified by relying on features emphasized by search theory or game theory, such as imperfect information and strategic interaction, and/or various types of 'frictions' or ad hoc factors-such as the consideration of money as a component of individual utility, aggregate production functions, cash-in-advance constraints-that are not part of the original adm. as lagos, rocheteau and wright themselves underline: 'a defining characteristic unifying the work discussed below is that it models the exchange process explicitly, in the sense that agents trade with each other, at least in some if not all situations. that is not true in ge (general equilibrium) theory, where agents only "trade" against their budget lines' (2014, p. 2). one advantage of our rp approach is to clarify that by departing from the adm-which is not just a model but the hard-core of the neo-walrasian rp-such approaches also implicitly depart from general equilibrium tout court and from standard macroeconomics based on it. in other words, they enter a post-walrasian world.

[26] strictly speaking, it is true that there can be a lot of walrasian equilibria for a given specification of preferences and endowments; however, this appears to be only a 'mathematical' result (that is, linked to an absence of restrictions on consumers' preferences; equilibrium can be shown to be unique and stable instead when there are restrictions on such preferences). this means that from the 'analytical' or 'theoretical' point of view, it is possible to demonstrate the existence of a unique link between the standard set of fundamentals and equilibrium outcomes. it should be noted that in the literature various authors seek to reconcile general equilibrium with multiple equilibria seen as a theoretical (rather than simply mathematical) result. however, they can only draw this conclusion by making assumptions which appear to be in contrast with the neo-walrasian rp. for example, farmer (2014, 2017) broadens the set of fundamentals considered by the adm by adding animal spirits to standard 'deep' parameters.
more specifically, we have to single out its external anchors and establish their solidity. to address this absolute generality issue, we need to reconsider in greater detail the 'communism of models' assumption underlying rational expectations. more specifically, this assumption implies that all agents form expectations (in the shape of 'subjective' probability distributions) based directly on the models and predictions of standard theory itself (formulated in terms of 'theoretical' probability distributions), which in turn conform to actual events, represented by 'objective' probability distributions. in other words, this view presupposes that three different 'worlds'-to use karl popper's terminology (see popper 1979, p. 154; also togati 1998, pp. 69-80)-converge to draw the full conventional circle in which agents' expectations (world-2) are influenced by standard models (world-3). this means that they learn the 'natural' laws of macroeconomics, such as thinking in 'real' terms, and act upon them. their actions bring about observed events that justify their expectations (world-1), thus closing the circle. on this basis, we can conclude that standard models are applicable to the real world only if the successful closure of this conventional circle actually occurs. it may be argued that this closure cannot emerge spontaneously but only if three strict institutional conditions are realized in practice. the first anchor is the affirmation of standard theory itself as the 'best' popular model of the economy. as already noted, the claim that the neo-walrasian rp is the 'best' rp is almost entirely based on its internal consistency. however, because of its 'basic flaw', this feature is not sufficient per se to command the spontaneous, unanimous consensus (among peers) based on genuine scientific prestige, comparable, say, to einstein's relativity insight.
this means that to emerge as the only game in town, the neo-walrasian rp needs institutional features, such as research funding, scientific journals and academic careers, which all support this view forcefully. the second anchor is the convergence of policymakers in identifying and adopting single 'best' policy rules, such as austerity, the taylor rule or the structural reforms agenda, in tune with standard natural laws. because of the basic flaw, which implies that money and expectations in real-world economies may generate outcomes that are far from the 'natural' ones expected on the grounds of standard models, policymakers cannot wait for spontaneous consensus over policies to emerge as a result of 'good' practice. the only way to 'save' the standard model is thus to enforce the 'best' rules a priori. one example is the friedmanite 'automatic pilot' prescription according to which central banks should stick to a simple money supply rule, whatever the context, inspired by a crude version of the quantity theory of money. another is the pre-commitment to austerity rules in the european union, which reduces the scope for discretionary fiscal policies for individual member states. the third anchor ensures that all agents eventually come to learn standard theory itself. in principle, the latter should be selected by agents on the grounds of an inductive 'natural' (psychological) learning process that weeds out errors and weaker competitors (in terms of criteria such as internal consistency or predictive ability) and thus justifies the convergence of expectations to the probability distribution captured by standard models. however, it may be argued once again that agents' learning is not really spontaneous, for various reasons.
first, there seems to be no inductive learning process at all: not only do data not speak for themselves and are unable to disconfirm theories, but the opposite often seems to be true: econometric techniques are used in a kind of 'cooking-the-books' manner to fashion empirical evidence in support of a priori thinking (e.g. calibration). secondly, due to its basic flaw, the theory does not generate predictions that 'shine through' and strike agents as being obviously correct. thirdly, in macroeconomics there are no controlled experiments capable of confirming theories, as there are in physics, for example. on these grounds, one can conclude that agents' learning of standard models, too, ultimately calls for institutional mechanisms, such as teaching, textbooks and newspaper editorials, which simply enforce thinking along orthodox lines: that is, shape public opinion in a way consistent with natural laws. but to what extent is the fulfilment of such conditions for the closure of the conventional circle likely to occur in the real world? it can be argued that today, in the aftermath of the gfc, this closure is at risk because many developments potentially undermine the solidity of the anchors of standard theory, thus revealing its particular nature: namely, that it applies to a special case only. notable developments are the emergence of divisions within the academic community and the proliferation of journals and ideas, including pleas for alternative paradigms in the discipline, that undermine the rhetoric of the 'only game in town'. moreover, policymakers seem to be even more pragmatic than in the past, as they tend to adapt their stances to circumstances rather than impose 'correct' rules that follow from theory, quantitative easing being an obvious example.
in the end, a major gap between agents' expectations and actual events also tends to emerge today, one instance of which is the recent rise of 'populist' political parties in europe that reflects growing popular dissatisfaction with the a priori implementation of austerity rules. this gap creates new scope for pluralism, as shown by the multiple, 'non-official' cultural resources and ways of learning for ordinary people and students made available by the internet. however, it is important to be aware of the existence of at least three major factors that prevent the breaking of the circle from happening even in difficult times such as these. first, the unity of method underlying current mainstream approaches: the 'only game in town' logic in particular is defended by the enormous flexibility allowed by partial equilibrium modelling along choice-theoretic lines, which allows standard macroeconomists to accommodate all topics in a piecemeal fashion, including those that appear as striking anomalies from the standpoint of simple aggregate general equilibrium models. for example, while admitting that dsge models failed to predict the crisis and to incorporate many important features of real-world economies, such as many financial factors, they believe there is nothing fundamentally wrong with the discipline at the methodological level, since partial equilibrium models addressing such factors already exist.[29] the challenge that lies ahead for macro theorists is not to change method but, at best, to integrate such contributions into current aggregate models-a task that always appears feasible in such models, with relatively minor twists. but that is not all. an important role in defending the status quo is also played by institutional features that create a kind of methodological 'entry barrier'. for example, in most leading journals, articles that are not cast in standard formal terms are rejected straightaway whatever their contents: that is, it is not that new ideas are forbidden per se, but that they can make sense or have an impact only if they are cast in 'grammatical' terms. another key example is the ever-growing technical background that economics students must acquire before thinking seriously about substantive issues, such as unemployment. the second factor is the stabilizing power of keynesian policies. indeed, the success of ultra-expansionary fiscal and monetary policies accounts for the resilience of major economies in the face of the gfc: put simply, thanks to their implementation the world has not collapsed as it did during the great depression. the last factor that contributes to the present stalemate is the lack of a full-blown alternative paradigm in post-keynesian macroeconomics capable of challenging the standard rp on its own grounds and representing a unifying reference point for the galaxy of heterodox contributions.

[29] this stance is well described by saint-paul: while the ultimate cause of the crisis is not entirely understood, many of its mechanisms are familiar to economists. this is because they already had the models to analyse those mechanisms. for example, it has been known for decades that falls in asset prices tend to reduce the wealth of households, which has a negative impact on consumption… other phenomena such as contagion or the epidemic spreading of insolvency… are less well understood. but many economists have been working on phenomena like asset bubbles and crashes, bank insolvency and illiquidity, bail-outs and moral hazard… for decades… in that sense, there is no reason why economics should change because of the crisis… the crisis is not a major challenge for economics research. in fact, the crisis has triggered an explosion of research in the areas i mentioned, but this new research uses the same methodology and assumptions as the preceding one.
this means that neoclassical macro models are the best ones not because of their scientific superiority but simply because of the tina ('there is no alternative') syndrome. based on the rp concept, this paper leads to a few conclusions that are relevant to the discussion started by hodgson and o'donnell on the general theorizing versus historical specificity issue. first of all, the key flaw of standard macro is not its neglect of historical specificity per se. institutions play a major role in its success in the 'relative' generality contest, since they appear both as premises or enabling factors, making sense of hard-core claims, and as causal factors, reconciling such claims with phenomenological reality. the true weakness of standard theory emerges instead in the absolute generality context in terms of its applicability to real-world conditions: namely, on account of its failure to accommodate money and expectations, the validity of the general theorizing it pursues (i.e. the logic of choice) is highly conditional: it only holds under highly specific 'institutional anchors', such as those implied by the 'communism of models'. for this reason, standard theory appears as 'an analytical giant with institutional feet of clay',[30] whose key characteristic is to presuppose that these very special conditions are actually always in place.[31] secondly, as noted by hodgson, keynes's major flaw is to neglect 'historical specificity'. however, while confirming his view, our analysis supports it on different grounds. this neglect is due not to the pursuit of general theorizing per se, but to the fact that keynes chose to rely on the wrong general principle, namely psychology rather than convention. indeed, what undermines an effective causal role of institutions in the gt is that keynes failed to treat conventions as 'hard-core' features: that is, as the alternative 'natural laws' of macroeconomics, in contrast with the standard logic of choice. he did not directly question the latter but attacked the more specific assumptions of the classical theory of employment, replacing it with an alternative 'model' in which the key aggregate propensities appear as exogenous 'givens', ultimately based on psychology. moreover, he neglected the fact that, insofar as they are forms of precarious knowledge, conventions also call for external 'anchors', such as institutional factors, including policy rules and implicit social pacts, to be sustainable through time. paradoxically, it is because it has managed to incorporate this feature-which accounts for the causal role of institutions-that the rational expectations revolution has conquered the macroeconomic field and virtually pushed keynes out of it. i have argued, finally, that keynes's economics calls for a new rp. the key point is that the economics profession regards keynesian policies-despite their continual success as last-resort stabilizers-as mere pragmatic remedies, devoid of major theoretical consequences. thus, as post-gfc history confirms, what this policy success ultimately achieves is, paradoxically, the restoration of the 'business as usual' atmosphere, rather than the original revolution.

[30] pasinetti, for example, notes that for post-keynesians the gt does not play the same unifying role as the adm does in standard macroeconomics, as shown by the fact that there is no comprehensive 'monetary theory of production' paradigm, which was keynes's main objective in the gt: 'the model of a pure exchange economy was in fact going through an analytical process that was making it into the most clearly formulated economic theory so far. the arrow-debreu general equilibrium model, which has become the quintessence of it, presents itself as an extremely attractive, elegant formulation. nothing similar has emerged for a monetary theory of production' (1999, p. 11).
if this analysis is correct, the reconstruction of this rp is the most important task that heterodox macroeconomists should embark on. in order to carry out this task, it would be necessary first of all to emphasize the role of conventions as the alternative natural laws of macroeconomics. indeed, the key reason why standard macro has increasingly become a technical discipline is that most economists (including many keynesians) simply take for granted the logic of choice-i.e. they all share the same grammar-so they can concentrate on the technical, modelling side. in the light of these new laws, one could then fully justify keynes's (relative) generality claims about standard theory being just a particular conventional technique based on a number of ceteris paribus conditions as well as integrate institutions as causal factors into keynes's economics, thus remedying its lack of historical specificity. this integration could allow keynes's theory also to regain its generality status in 'absolute' terms, for at least one major reason. unlike standard natural laws, which constrain the role of institutions but cannot be influenced by them, such conventional laws are much more 'malleable': they are in principle consistent with a much broader range of external anchors, including for example discretionary policy moves.

references
why nations fail: the origins of power, prosperity and poverty
the end of low hanging fruit? why nations fail blog, tuesday
institutions as a fundamental cause of long-run growth
theory of value
how the economy works: confidence, crashes and self-fulfilling prophecies
neo-paleo-keynesianism: a suggested definition. roger farmer's economic window blog post
keynesian dynamic stochastic general equilibrium theory
lange on price flexibility and employment: a methodological criticism
the role of monetary policy
the fed and the natural rate
money and inflation
reflection without rules. economic methodology and contemporary science theory
wanting robustness in macroeconomics
economics and evolution: bringing life back into economics
how economics forgot history. the problem of historical specificity in social science
economics in the shadows of darwin and marx: essays on institutional and evolutionary themes
keynes and the historical specificity of institutions: a response to rod o'donnell
on the logical properties of keynes's theorising, and different approaches to the keynes-institutionalism nexus
keynes's 'revolution'-the major event of twentieth century economics?
the problematic nature and consequences of the effort to force keynes into the conceptual cul-de-sac of walrasian economics
objective knowledge
chicago economics: permanence and change
how has the crisis changed economics? the economist
keynes and the neoclassical synthesis. einsteinian versus newtonian macroeconomics
how can we restore the generality of the general theory?
making keynes's 'implicit theorizing' explicit
general equilibrium analysis. studies in appraisal

acknowledgements: i am most grateful to two anonymous referees for their comments.

key: cord-329276-tfrjw743 authors: ledzewicz, urszula; schättler, heinz title: on the role of the objective in the optimization of compartmental models for biomedical therapies date: 2020-09-30 journal: j optim theory appl doi: 10.1007/s10957-020-01754-2 sha: doc_id: 329276 cord_uid: tfrjw743

we review and discuss results obtained through an application of tools of nonlinear optimal control to biomedical problems. we discuss various aspects of the modeling of the dynamics (such as growth and interaction terms), modeling of treatment (including pharmacometrics of the drugs), and give special attention to the choice of the objective functional to be minimized. indeed, many properties of optimal solutions are predestined by this choice, which often is made only casually using some simple ad hoc heuristics.
we discuss means to improve this choice by taking into account the underlying biology of the problem. optimal control is a methodology to obtain well-thought-through solutions to practical problems for dynamical processes. it has found numerous applications in the sciences and has become a staple of engineering design, but applications to biomedical problems are less well known. yet, optimal control as a tool in the analysis of mathematical models of cancer chemotherapy dates back to the 1970s and 1980s (e.g., see [1-6]), and these efforts have continued actively throughout the years to the present day (e.g., see [7-14] and the many references in [15]). in the modeling, considerable attention is given to the formulation of the underlying dynamics. based on the tremendous progress that has been made over the years in understanding biomedical problems, and cancer in particular, even minimally parameterized models are generally well thought through and realistic. as a matter of fact, however, considerably less effort is made when the objective that will be imposed on the process is formulated. omnipresent are papers in which the formulation of the criterion is at best a mere afterthought rooted in questionable heuristics based on the notion of a "systemic cost"-whatever that may be. naturally, the choice of the objective in an optimal control problem is a very important aspect of the overall approach, and quite often it is here that the eventual properties of the optimal solution are already determined. one should not confuse this with obtaining meaningful information on the underlying biological problem. while, on the one hand, it is not that straightforward to come up with a proper objective criterion in biological problems, on the other hand, choosing the objective based on mathematical tractability with disregard of its meaning will not pass muster.
in this paper, which is intended as a review and discussion, using mathematical models for cancer treatments as the main vehicle of presentation, we discuss the main aspects that enter into the modeling of biomedical problems as optimal control problems. we give some attention to the formulation of the dynamics (such as growth and interaction terms for the population) and treatment aspects, but our emphasis will be on the formulation of the objective to be minimized. we stress the need for a meaningful interpretation of the objective. this should be related to side effects of the treatment in models for cancer therapies and/or to the actual cost. yet, such costs are not always easily quantifiable. just think of 'social distancing' during an epidemic and its effects on the economy. using examples from our work, we illustrate how information from the underlying biology or medically relevant limitations can be used to formulate the optimization criterion so that optimal solutions then give meaningful information for the underlying biomedical problem. we point out some pitfalls to be avoided, but also make useful simplifications in the modeling, both in the dynamics and the objective. in this section, a general model is formulated which represents a class of dynamical systems common in the modeling of biomedical therapies. these models are affine in the controls u ∈ r^m with states x ∈ r^n_+. the components x_i, i = 1, ..., n, of the state x are nonnegative numbers which, in a typical example, represent volumes or cell counts of various populations (compartments) of cells relevant to the underlying biomedical condition or, in epidemiological models, population counts in various stages of the disease. the controls u_j, j = 1, ..., m, stand for possible external interventions and strategies such as drug therapy in cancer treatments or vaccination efforts in epidemiological models.
mathematically, the general structure we consider in this paper is of the form

ẋ = f(x) + u_1 g_1(x) + ... + u_m g_m(x), x(0) = x_0, u ∈ u,

with f and g_j typically nonlinear vector fields called the drift and control vector fields, respectively, and u the control set which represents restrictions on the size of the controls. the drift vector field f describes the underlying dynamics of the system without any outside interventions. it generally consists of various terms which should include the following aspects: growth terms of the populations as well as transition and interaction terms between the various compartments. these terms describe the mechanisms behind the increase in all or some of the populations in the compartments that define the state. these mechanisms can be modeled in various ways taking into account whether or not such growth saturates. the simplest model for a single population x ∈ r is given by exponential growth, ẋ = ax, with a a constant representing the growth rate. the same expression with a < 0 describes a basic death term of the population and, more generally, these two processes can be combined with a describing the net effect. while this is an adequate model for some phases of the dynamics, often used in short-term models or in epidemiology, over longer time periods saturation effects need to be taken into account. here, the most common model is logistic growth, ẋ = ax(1 − x/k), with k denoting a fixed carrying capacity of the population. another saturation-type model, which has been verified empirically as the model for late stages in breast cancer [16, 17], is gompertzian growth of the form ẋ = −ax ln(x/k). logistic growth is often preferred because of its simpler mathematical structure. other growth models exist, but are less common (e.g., see [18]). transition terms model the exchanges between populations.
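as a quick numerical illustration of the growth laws discussed above, the following python sketch (not code from the paper; the values of a and k are hypothetical) integrates the three models with a simple forward-euler scheme. both saturating laws approach the carrying capacity k, while exponential growth does not.

```python
# illustrative sketch: forward-euler integration of the three growth laws;
# the parameter values a and k are hypothetical, not taken from the paper.
import math

def integrate(rhs, x0, dt=1e-3, t_end=50.0):
    # integrate dx/dt = rhs(x) with the forward euler method
    x = x0
    for _ in range(int(t_end / dt)):
        x += dt * rhs(x)
    return x

a, k = 0.5, 100.0                                 # growth rate, carrying capacity
exponential = lambda x: a * x                     # x' = a x (no saturation)
logistic = lambda x: a * x * (1.0 - x / k)        # x' = a x (1 - x/k)
gompertz = lambda x: -a * x * math.log(x / k)     # x' = -a x ln(x/k)

x_exp = integrate(exponential, 1.0)
x_log = integrate(logistic, 1.0)
x_gom = integrate(gompertz, 1.0)
```

after a long enough horizon the logistic and gompertzian solutions are both essentially at k; the two laws differ only in how fast they saturate, which is one reason the simpler logistic form is often preferred.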
for example, in cell-cycle specific models for cancer chemotherapy [8], quiescent and proliferating cells are distinguished and proliferating cells move through growth phases, synthesis and mitosis, all represented by different coordinates of the state vector x. similarly, in models including drug resistance, random changes from sensitive to resistant cells occur which, in cases of cancer chemotherapy, also may be reversible. in epidemiological models, transitions between susceptible, exposed, infectious and recovered individuals are crucial to understand the spread of the disease. all such transitions are generally modeled by linear transition rates of the form

ẋ_i = −α_i x_i + Σ_{j≠i} β_ij x_j,

with α_i ≥ 0 representing the outflow from the ith compartment and β_ij ≥ 0 for i ≠ j representing the inflow from the jth into the ith compartment. cells in various compartments stimulate and/or inhibit cells in other compartments. often these interactions are modeled by the law of mass action, leading to products of terms representing the cell counts or volumes in various compartments. this generates expressions of the form x_i x_j or, more generally, expressions which include powers of the variables. the simplest structure shows up in epidemiological models when susceptible and infectious populations s and i interact, generating an outflow of the form β·s·i from the susceptible compartment which becomes an inflow to the infectious population i, with β accounting for the probability of such interactions. powers arise, for example, in a model for angiogenic signaling by hahnfeldt et al. [19] where the interactions are between the tumor surface through which angiogenic inhibitors need to be released and the carrying capacity of the vasculature. if both tumor volume and the carrying capacity are modeled by volumes p and q, respectively, this generates an interaction term of the form p^(2/3) q.
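to make the combination of transition and mass-action terms concrete, here is a minimal sketch (hypothetical rates β and γ, not from the paper) of an s-i-r model: the s → i flow is the mass-action term β·s·i and the i → r flow is a linear transition γ·i. since every outflow reappears as an inflow, the total population is conserved.

```python
# illustrative sketch: a minimal sir model combining a mass-action interaction
# term (beta*s*i) with a linear transition term (gamma*i); rates are hypothetical.
def sir_step(s, i, r, beta, gamma, dt):
    # one forward-euler step of s' = -beta*s*i, i' = beta*s*i - gamma*i, r' = gamma*i
    new_infections = beta * s * i * dt   # outflow from s = inflow to i
    recoveries = gamma * i * dt          # linear transition from i to r
    return s - new_infections, i + new_infections - recoveries, r + recoveries

s, i, r = 0.99, 0.01, 0.0                # population fractions (illustrative)
beta, gamma, dt = 0.3, 0.1, 0.01
for _ in range(10000):                   # integrate up to t = 100
    s, i, r = sir_step(s, i, r, beta, gamma, dt)
```

with β/γ = 3 > 1 an outbreak occurs and s decreases substantially, while s + i + r stays constant up to floating-point roundoff because each term enters once as an outflow and once as an inflow.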
in a classical model by stepanova of tumor immune system interactions [20], the fact that the immune system is able to control small tumors, but increasingly becomes overwhelmed by large tumors, is modeled by an interaction term of the form α(x − βx²)y, where x represents the tumor volume and y a non-dimensional, order of magnitude variable related to the actions of the immune system called the immunocompetent cell density. here, x = 1/β corresponds to a threshold beyond which the immunological system becomes depressed by the growing tumor. the coefficients α and β are used to calibrate these interactions and collectively describe state-dependent interactions between the cancer cells and the immune system. limiting effects of the interactions between various compartments can be taken into account by so-called michaelis-menten terms. as an example, in a model for cml (chronic myeloid leukemia) [21] which includes proliferating cancer cells p and effector cells e of the immune system, this leads to terms of the type

p_max · p/(pc_50 + p) · e,

with p_max denoting the maximum possible effect and pc_50 the concentration of proliferating cancer cells for which half of the maximum effect is realized. similar terms describe stimulating or inhibiting effects of such interactions depending on the nature of the variables. important aspects of the uncontrolled dynamics are multi-stability and the geometry of the regions of attraction of locally stable equilibria. we illustrate these features using two classical models for tumor growth and its interactions with the tumor micro-environment: the model for angiogenic signaling by hahnfeldt et al. [19] and stepanova's model for tumor immune-system interactions [20, 22, 23]. these models also serve as illustrations for the general elements of the dynamics described above. (a) angiogenic signaling. in this model by hahnfeldt et al.
[19], the spatial aspects of the underlying consumption-diffusion process that stimulate and inhibit angiogenesis are incorporated into a non-spatial 2-compartment model with the primary tumor volume, p, and the carrying capacity of the vasculature, q, as its principal variables. the dynamics consists of two odes that describe the evolution of the tumor volume and its carrying capacity, which are given by

ṗ = −ξ p ln(p/q),
q̇ = bp − d p^(2/3) q − μq.

thus, a gompertzian growth term is used to model tumor growth. the carrying capacity q and tumor volume p are balanced for p = q, while the tumor volume shrinks for inadequate vascular support (q < p) and increases if this support is plentiful (q > p). different from traditional approaches, the carrying capacity is not considered constant, but becomes a state variable whose evolution is governed by a balance of stimulatory and inhibitory effects. the term bp represents the stimulation of the vasculature by the tumor and is taken proportional to the tumor volume. the term d p^(2/3) q models the inhibition of the vasculature by the tumor which, as mentioned above, is represented as an interaction term between the tumor surface and the vasculature. the coefficients b and d are mnemonically labeled for birth and death, respectively. the last term −μq simply represents a natural death term related to the vasculature. this model has a unique equilibrium at p̄ = q̄ = ((b − μ)/d)^(3/2), which is globally asymptotically stable (relative to the natural state space d = {(p, q) : p > 0, q > 0}) [24] (see fig. 1). the dynamics has a strong differential-algebraic character with the q-dynamics of the vasculature fast and the p-dynamics for the tumor volume slow. essentially, the system follows the corresponding null-cline (defined by the equation q̇ = 0) into the unique equilibrium point. for realistic values of the parameters, this equilibrium is not biologically viable and therapy is needed to lower or even eradicate this equilibrium point.
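the equilibrium of this model can be checked directly: balancing the gompertzian term forces p = q, and balancing the stimulation term bp against the inhibition term d p^(2/3) q and the death term μq then gives p̄ = ((b − μ)/d)^(3/2). the sketch below evaluates the right-hand side at this point; the parameter values are illustrative, not the fitted values of [19].

```python
# sketch: right-hand side of the hahnfeldt et al. model, p' = -xi*p*ln(p/q),
# q' = b*p - d*p**(2/3)*q - mu*q, evaluated at the equilibrium p = q = ((b-mu)/d)**1.5.
# the parameter values below are illustrative only.
import math

def hahnfeldt_rhs(p, q, xi, b, d, mu):
    dp = -xi * p * math.log(p / q)                  # gompertzian growth against capacity q
    dq = b * p - d * p ** (2.0 / 3.0) * q - mu * q  # stimulation - inhibition - death
    return dp, dq

xi, b, d, mu = 0.1, 5.0, 0.01, 0.02
p_eq = ((b - mu) / d) ** 1.5                        # candidate equilibrium value
dp, dq = hahnfeldt_rhs(p_eq, p_eq, xi, b, d, mu)
```

at p = q the logarithm vanishes exactly, and p^(2/3) = (b − μ)/d makes the capacity equation balance, so both derivatives are (numerically) zero.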
in the latter case, the objective then becomes to drive the state of the system toward the singular point (p, q) = (0, 0). (b) tumor immune system interactions in this by now classical mathematical model of stepanova [20], two ordinary differential equations formulate the aggregated interactions between cancer cell growth and the activities of the immune system during the development of cancer. precisely because of its simplicity (a few parameters incorporate many medically important features) the underlying equations have been widely accepted as a basic model. the main features of tumor immune system interactions are aggregated into just two variables, the tumor volume, p, and the immunocompetent cell density, r, a non-dimensional, order of magnitude quantity related to various types of immune cells (t-cells) activated during the immune reaction. [fig. 1: sample trajectories for the model [19] for a fast growing cancer; the almost horizontal segments indicate the strong differential-algebraic character, with trajectories eventually following the q̇-nullcline defined by q̇ = 0.] the resulting dynamics takes the following form:

ṗ = ξ p f(p) − θ p r, (5)
ṙ = α (p − β p²) r + γ − δ r, (6)

with f a tumor growth function. greek letters denote constant coefficients which model the main interactions of the tumor with the immune system. various organs such as the spleen, thymus, lymph nodes and bone marrow each contribute to the development of immune cells in the body, and the parameter γ in eq. (6) models a combined rate of influx of t-cells generated through these primary organs; δ is simply the rate of natural death of the t-cells. the first term in this equation models the proliferation of lymphocytes. for small tumors, it is stimulated by tumor antigen and this effect is taken to be proportional to the tumor volume p. large tumors suppress the activity of the immune system, mostly because of a general suppression of immune lymphocytes. this feature is expressed in the model through the inclusion of the term −βp².
thus, 1/β corresponds to a threshold beyond which the immunological system becomes depressed by the growing tumor. the coefficients α and β are used to calibrate these interactions and collectively describe the state-dependent influence of the cancer cells on the stimulation of the immune system. the first equation, (5), models tumor growth with ξ a tumor growth coefficient that has been separated from the qualitative behavior of the growth function f. the second term, −θpr, models the beneficial effects of the immune system reaction on the cancer volume, and θ denotes the rate at which cancer cells are eliminated through the activity of t-cells. the dynamical behavior of this system is fundamentally different from the model for angiogenic signaling. for a typical set of parameter values, the dynamics is multi-stable: the system has both locally asymptotically stable microscopic and macroscopic equilibrium points as well as an unstable saddle point. figure 2 shows phase portraits of the system for the parameter values listed in table 1 for four different growth models which qualitatively all exhibit the same multi-stability. [fig. 2: phase portraits for the parameter values of table 1 and varying growth functions f; the panels use (a) gompertzian growth, (c) generalized logistic growth with exponent 2, and (d) exponential growth, f_e(p) ≡ 1. the points (p_b, r_b) and (p_m, r_m) are the locally stable benign and malignant equilibria, respectively, and (p_s, r_s) denotes the unstable saddle point; the dashed red curve is the stable manifold of this saddle.] at the microscopic equilibrium point, the tumor volumes are small and immunocompetent cell densities are upregulated. this corresponds to a situation when the immune system is controlling the tumor. we denote the corresponding equilibrium point by (p_b, r_b) and call it benign.
the macroscopic equilibrium point is characterized by more than tenfold higher tumor volumes and depressed immunocompetent cell densities. in these solutions, the tumor has suppressed the immune system and has almost reached its carrying capacity. we denote the corresponding equilibrium point by (p_m, r_m) and call it malignant. both of these equilibria are locally asymptotically stable, (p_b, r_b) a stable focus and (p_m, r_m) a stable node. we call the regions of attraction associated with the benign and malignant equilibria the benign, respectively the malignant, region. these phase portraits represent a snapshot of the true dynamics: a stationary situation for fixed parameter values. because of unmodeled aspects of the dynamics and/or random events, however, this situation will constantly change in time. thus, it is intuitively clear that a benign equilibrium solution can be disrupted by events affecting the immune system which were not included in the mathematical modeling, and this may lead to a transition of the state into the malignant region. how likely this becomes is related to the size of the region of attraction of this equilibrium: if the benign region is small, even minor events may cause the disease to flare up, whereas the immune system may be able to control the disease if this region is large. while there exist good reasons to believe that the immune system is able to control some tumors initially, over a longer period of time the neoplasm will develop various strategies to evade the actions of the immune system (immuno-editing), and this allows the tumor to recommence growing into clinically apparent tumors and eventually reach its carrying capacity [25]. tumor-immune system interactions thus exhibit a multitude of dynamic properties that include persistence of both benign and malignant scenarios.
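the bistability discussed above can be reproduced numerically. the sketch below integrates eqs. (5)-(6) with gompertzian growth from two initial conditions, one in each basin of attraction; all parameter values are assumptions chosen for illustration.

```python
import math

# illustrative parameter set for stepanova's model (all values assumed)
xi, theta = 0.5618, 1.0          # tumor growth coefficient and t-cell kill rate
alpha, beta = 0.00484, 0.00264   # immune stimulation and suppression parameters
gamma, delta = 0.1181, 0.37451   # t-cell influx and natural death rates
p_inf = 780.0                    # fixed tumor carrying capacity

def rhs(p, r):
    """eqs. (5)-(6) with gompertzian growth f(p) = -ln(p / p_inf)."""
    dp = -xi * p * math.log(p / p_inf) - theta * p * r
    dr = alpha * (p - beta * p * p) * r + gamma - delta * r
    return dp, dr

def simulate(p, r, h=0.02, n=10000):
    """integrate to t = n*h with classical rk4."""
    for _ in range(n):
        k1 = rhs(p, r)
        k2 = rhs(p + h / 2 * k1[0], r + h / 2 * k1[1])
        k3 = rhs(p + h / 2 * k2[0], r + h / 2 * k2[1])
        k4 = rhs(p + h * k3[0], r + h * k3[1])
        p += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        r += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return p, r

# small tumor with active immune response: converges to the benign focus
p_benign, r_benign = simulate(50.0, 1.0)
# large tumor with depressed immune response: converges to the malignant node
p_malig, r_malig = simulate(600.0, 0.1)
```

for this parameter set, the first trajectory settles near a small-tumor equilibrium with upregulated immunocompetent cell density, the second near a large-tumor equilibrium with depressed density, illustrating the two basins of attraction.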
from a practical point of view, the question then becomes how to move an initial condition that lies in the malignant region into the benign region. this requires therapy and can naturally be formulated and analyzed as an optimal control problem. besides the underlying dynamics represented by the drift vector field f, the other essential ingredient in biomedical problems is a mathematical model for outside interventions. these comprise therapies aimed at curing a patient in cancer treatment or actions to limit the spread of an epidemic. within the general structure (1) of the model considered in this paper, the effects of these actions are modeled linearly through control vector fields g_j, leading to a controlled dynamics of the form

ẋ = f(x) + Σ_{j=1}^m u_j g_j(x). (7)

here the control variables u_j, j = 1, ..., m, represent various therapy options such as chemotherapy, radiotherapy, immunotherapy in mathematical models for cancer treatments and/or other available actions to influence the state of the system such as, for example, social distancing measures in epidemiological models. the components g_ij(x), i = 1, ..., n, j = 1, ..., m, model the effects which the jth intervention modality u_j has on the ith compartment x_i of the state. generally, the number m of different modalities for outside interventions in a biomedical problem setting is small and m typically lies in the range m = 1, 2, 3. even for the current covid-19 pandemic, in essence all efforts to limit the spread of the pandemic are related to social distancing measures in an attempt to lower the probabilities of infection. if available, vaccination would be a second intervention strategy, but, clearly, the overall number of options is not very large. similarly, in cancer therapy, the number of possible qualitatively different treatment options is small. if one ignores surgery, as this is an option that would be pursued at the onset prior to any time frame that typically would be modeled mathematically, the main modalities are chemotherapy, radiotherapy and immunotherapy.
if chemotherapy is given through a cocktail of drugs (and this is the common procedure), then it is still typically advantageous to model the effects of the combination of all the drugs together and one may have m = 1, while the case m > 1 often is reserved for true combination therapies that combine, for example, chemo- or radiotherapy with antiangiogenic treatment. thus, the typical scenario is that the number m of controls in the system is small. the particular mathematical structure considered in eq. (7), linear in u, clearly does not cover the full range of possibilities, but it is worth emphasizing that this is by far the most common, most important, and, in many aspects, most natural structure. essentially, whether or not a control affine model (7) is valid depends on the choice of what the controls represent. if the controls are tied in with the effects of treatment (e.g., the control represents the concentration of an agent rather than its dose rate), then this model may not be appropriate. if, however, the controls u_i are related to the administration of the therapies (dose rates) and are separated from their effects, these models will be linear in the controls. below we shall discuss in more detail mathematical models for drug therapies, but we want to mention at least in passing the main model for radiotherapy, the so-called linear quadratic (lq) model. in this model, the probability of cell survival is represented in the form exp(−αd − βd²), where d denotes the total radiation dose and α and β are radiosensitivity parameters. briefly, the rationale for the model is as follows [15, 26]: radiation causes abnormalities (lesions) that correspond to ruptures of the molecular bonds on the double-stranded dna. this damage is made up of two components: (i) simultaneous breaks in both dna strands that are caused by a single particle and (ii) adjacent (both in location and time) breaks on separate strands.
while a double-strand break leads to a loss of proliferative abilities, a single-strand breakage is not considered lethal since dna has the ability to repair it. only if a second adjacent break occurs on the other strand before the first one can be repaired will this lead to a loss of proliferation properties. both scenarios (i) and (ii) are considered lethal in the sense that the cell is no longer able to proliferate. the first situation leads to a linear term in the cell survival model, while the second one contributes quadratic terms. this is accounted for by the parameters α and β which are called the lq parameters. repair of dna damage generally is incomplete. once this is taken into account as well, the surviving fraction of cancer cells can be represented in the form

s = exp( −αd − 2β ∫_0^T u(t) ( ∫_0^t e^{−ρ(t−τ)} u(τ) dτ ) dt ),

with u denoting an arbitrary, time-varying dose rate, u = u(t) ≥ 0, over the therapy interval [0, T] and ρ the tissue repair rate. the total dose d is given by d = ∫_0^T u(t) dt. this model can be written in a control-affine form [15, 27] as follows: introducing the repair state r(t) = 2 ∫_0^t e^{−ρ(t−τ)} u(τ) dτ, which satisfies ṙ = 2u − ρr with r(0) = 0, the radiation-induced cell kill takes the form ṗ = −(α + βr) p u. here, p once more represents the tumor volume and the equation for r models the temporal effects of tissue repair; α and β are the lq parameters. the specific parameter values depend on the tissue treated, and there will be separate equations modeling the effects on cancerous and healthy tissue. the control u represents a time-varying fractionated radiotherapy dose rate. the linear model in u represents the way therapy is administered, and this almost always is adequately modeled by a linear term. the quadratic effects of radiotherapy are subsumed at the expense of introducing the extra state variable r. similarly, while the actions of different treatment modalities clearly are interrelated, this concerns their effects, not the way the therapies are administered. if chemotherapy is considered with multiple drugs, then synergistic or antagonistic effects which cannot be modeled linearly become important.
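the fractionation effect implied by the quadratic term is easy to illustrate: for the same total dose, splitting the dose into well-separated fractions (with full repair in between, so each fraction contributes exp(−αd − βd²)) increases cell survival. the lq parameter values below are hypothetical.

```python
import math

# hypothetical lq parameters (assumed for illustration only), per gy and per gy^2
alpha, beta = 0.15, 0.05

def surviving_fraction(doses):
    """lq survival for well-separated fractions with complete repair in between:
    each fraction d contributes a factor exp(-alpha*d - beta*d**2)."""
    return math.exp(sum(-alpha * d - beta * d * d for d in doses))

s_single = surviving_fraction([10.0])            # one 10 gy fraction
s_fractionated = surviving_fraction([2.0] * 5)   # five 2 gy fractions, same total dose
```

the quadratic term penalizes large single fractions: with these numbers the single exposure kills far more cells than the fractionated schedule of identical total dose, which is exactly the sparing effect that the repair term in the time-varying model quantifies.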
aside from the fact that such interactions often are far from being understood, it also matters what the control u represents. if the control represents drug concentrations, then such interactions enter into the model through the controls and a linear model is inadequate. on the other hand, if the controls represent the dose rates of the various agents, then the concentrations of these agents merely become additional states and these interactions enter into the drift vector field f of a generally already nonlinear dynamics. this then preserves the control linear structure (7). in essence, if the controls u_i, i = 1, ..., m, represent the administration of the therapies (dose rates) and as variables are separated from the effects of the actions (which, for example, depend on the concentrations), then a model which is linear in the controls is not only adequate, but is the correct one. we expand further on this for the case of chemotherapy and, for simplicity of the presentation, focus on the case of a single control u. pharmacodynamics is the branch of pharmacology which quantifies the effects which a drug has at concentration c. according to the law of mass action, the speed of a chemical reaction is proportional to the product of the active masses (concentrations) of the reactants. if a drug is administered, in an ideal situation it has been postulated that cell death under cancer drugs follows first-order kinetics [28], i.e., the decrease in the number of cancer cells per unit of time is proportional to the number of cancer cells, with the rate depending on the concentration of the anti-cancer drug. any functional relation of the form e = s(c)x, with x denoting the cells in a compartment on which the drug is acting (e.g., tumor cells, immune system, healthy cells, etc.) and the coefficient s(c) representing the concentration-dependent killing rate of the drug, is considered a pharmacodynamic model.
a linear model of the form s(c) = ϕc with ϕ a positive constant, the so-called linear log-kill hypothesis based on skipper's paper [28], is the most commonly used form. incorporating this structure into a standard growth law ẋ = x f(x) results in the following growth model under chemotherapy:

ẋ = x f(x) − ϕ c x.

if the concentration c is considered the control u in the system, u = c, then the linear log-kill model fits the general linear form (7). the product ϕcx represents an interaction term as discussed in sect. 2.1.3. such terms can be both inhibiting (minus sign) and stimulating (plus sign). clearly, in the example considered above, the action is inhibiting as the agent is cytotoxic and kills the cells. on the other hand, in stepanova's model the action of an immune stimulatory agent (such as interleukin-2, il-2, representing immunotherapy) would be included through the addition of a positive term +ρru into eq. (6). while a linear model s(c) = ϕc is applicable over a specified range of the concentration c, it is not valid for low or high drug concentrations. this simply stems from the fact that most drugs have little to no effect if their concentrations are too low and, on the other hand, their effectiveness saturates at high concentrations. if side effects allow, drugs are preferably given in the upper range of this dose-effect curve, and an e_max model of the form

s(c) = e_max c / (ec_50 + c)

describes the effect of "fast" acting drugs more accurately for high concentrations. here, ec_50 denotes the concentration at which half of the maximum effect e_max is realized. sigmoidal functions also capture the behavior at lower concentrations. in these cases, if the drug concentrations are used as the control, after normalization this leads to terms of the form u/(1 + u) which no longer fit the proposed structure (7). this, however, is merely caused by using the concentrations as the controls. it can easily be removed by changing the model to treat the dose rates of the drug administration as the control.
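a minimal sketch of such an e_max model; the values of e_max and ec_50 are assumptions for illustration.

```python
# e_max pharmacodynamic model: s(c) = e_max * c / (ec50 + c)
# (the values of E_MAX and EC50 are assumed for illustration)
E_MAX, EC50 = 1.0, 2.0

def effect(c):
    """concentration-dependent kill rate: approximately linear for small c,
    saturating at e_max for large c."""
    return E_MAX * c / (EC50 + c)

half_effect = effect(EC50)   # by construction, exactly e_max / 2
low = effect(0.01)           # near-linear regime, slope ~ e_max / ec50
high = effect(1000.0)        # close to saturation at e_max
```

this makes the two limiting regimes of the dose-effect curve explicit: the linear log-kill model is recovered for small concentrations, while the effect saturates for large ones.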
the second major aspect of drug delivery is pharmacokinetics. pharmacokinetic equations model the time evolution of a drug's concentration in the blood plasma. the standard model that describes the concentration c = c(t) of the drug in the bloodstream in response to a continuous dose rate u = u(t) is one of exponential growth and decay given by

ċ = −ρc + u,

with ρ the clearance rate of the drug. more generally, if the concentrations in the bloodstream are separated from the concentrations in a peripheral compartment (such as tissue or an organ of interest) and the interactions between those compartments are taken into account, a two-dimensional linear system with negative eigenvalues is used. higher-dimensional compartmental models are used less frequently, but arise, for example, if the peripheral compartment is divided further into a 'shallow' and a 'deep' compartment. for example, a 3-compartment model is often used for insulin pumps [29]. all these models, however, are described by linear time-invariant systems with real eigenvalues. if one therefore introduces the drug dose rates as the controls, then the associated drug concentrations c become state variables and thus become an integral part of the modeling of the uncontrolled dynamics that enters the form of the drift vector field f. in particular, all the nonlinearities that arise because of saturation effects or because of synergistic properties of drug administration are subsumed in the drift vector field f, which, as the example for angiogenic signaling shows, may already be a highly nonlinear vector field to begin with. actual drug administration, however, is described by linear models. once medical professionals have decided on a course of action, a number of natural questions need to be answered: (1) how much of the agent should be given? (2) when and how often should drugs be given? (3) in what order should the treatments be applied?
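because the one-compartment equation ċ = −ρc + u is linear, its response to a constant infusion has a closed form, which makes the steady state and the wash-out behavior easy to check; ρ and the infusion rate below are assumed values.

```python
import math

# one-compartment pharmacokinetics: c' = -rho*c + u
# (the clearance rate rho and the infusion schedule are assumed values)
rho = 0.3
u_rate = 1.5   # constant infusion dose rate

def c_infusion(t):
    """closed-form solution during a constant infusion from c(0) = 0:
    c(t) = (u/rho) * (1 - exp(-rho*t)), approaching the steady state u/rho."""
    return (u_rate / rho) * (1.0 - math.exp(-rho * t))

def c_washout(t, t0):
    """after the infusion stops at time t0, the concentration decays exponentially."""
    return c_infusion(t0) * math.exp(-rho * (t - t0))

c_ss = u_rate / rho             # steady-state concentration
c_end = c_infusion(30.0)        # a long infusion: essentially at steady state
c_late = c_washout(60.0, 30.0)  # 30 time units after stopping: nearly cleared
```

treating the dose rate u as the control and c as a state in this way is exactly the step that keeps the controlled system in the control-affine form (7).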
all these questions can naturally be answered within the framework of an optimal control problem. clearly, we are only dealing with a mathematical model and no claims are made that the real problem could be solved in this way. but interesting and useful information can be obtained. to give a concrete example, chemotherapy needs the tumor vasculature to deliver cytotoxic agents that kill the tumor cells. it therefore was conventional wisdom to first give chemotherapy when antiangiogenic treatments and chemotherapy were combined in the treatment of cancer. this notion, however, was challenged by jain et al. [30, 31], who argued that a normalization of the vasculature ("pruning") prior to the administration of chemotherapy is beneficial for the delivery of the cytotoxic agents and that it gives better results. these findings can be strongly supported through a mathematical analysis of the hahnfeldt model with both chemotherapy and antiangiogenic therapy [32]. thus, while optimal control cannot provide the ultimate answers to the underlying problems, it can become a valuable tool in understanding these processes. the ultimate aim in biomedical problem formulations is to find a safe and reasonable strategy to solve the underlying medical problem. by adopting an optimal control formulation, we are aiming for what in a certain way can be considered the "best possible" solution. what this best possible solution should be, however, is often open to wide interpretation. it is here that the choice of the objective functional to be minimized matters greatly, sometimes to the extent that this choice already preprograms the eventual solution. this choice then is the third essential aspect of the problem formulation. the following general form encompasses a wide range of possibilities:

j(u) = ϕ(x(T)) + ∫_0^T ( l(x(t)) + Σ_{j=1}^m θ_j u_j^k(t) ) dt → min. (13)

here, l is a sufficiently smooth function of the state called the lagrangian, the coefficients θ_j, j = 1, ..., m, are nonnegative weights and ϕ is a penalty function on the terminal state.
the terminal time T can be fixed or free. the minimization is over a class u of admissible controls u which are locally bounded lebesgue measurable functions such that the associated controlled trajectory (x, u) satisfies all other constraints imposed on the system. these generally include control constraints and terminal point constraints, but could also come from a wide variety of other restrictions imposed possibly by state-space constraints or other limitations inherent to the problem formulation. we discuss the significance of each of the terms starting with the controls. generally, and without loss of generality, we make the assumption that the controls represent nonnegative quantities (e.g., dose rates or concentrations). thus, the control term Σ_{j=1}^m θ_j u_j^k(t) in the objective (13) for k = 1 or k = 2 represents a weighted l1- or l2-norm, respectively. including such a positive and increasing term in the objective represents what in the engineering literature would be called a soft modeling of control constraints. clearly, medical treatments, especially the strong treatment options that need to be pursued in cancer treatment such as chemo- or radiotherapy, have significant side effects and these need to be taken into account in the modeling. including the control term in the objective to be minimized inherently limits the use of the controls. this represents an attempt to make a reasonable compromise between maximizing the tumor kill, which requires giving a large amount of drugs, and limiting the side effects of treatment (these are measured indirectly through the integral of the control), which requires limiting this amount of drugs. whether the optimal control does this adequately then becomes an afterthought that always needs to be checked and, if the optimal solution is still inadequate, the weights need to be adjusted.
if the weights on the drugs are too small, optimal solutions will simply give drugs all the time at full dose; if the weights are too high, no drugs will be given. it is clearly necessary to calibrate the weights to obtain meaningful results. such back-and-forth approaches are quite common in engineering and they simply reflect the fact that the form of the objective, and especially the weights that appear in it, are variables of choice in such problems. generally, and this is the typical application of optimal control in engineering, various choices of the weights will be tested until a solution has been found which appears adequate, possibly also with respect to other criteria which were not included in the mathematical model. the choice of taking an l1- or l2-norm, however, is not irrelevant. generally speaking, an l1-formulation is to be preferred from a modeling perspective, while the l2-formulation offers significant mathematical and numerical advantages. in chemotherapy, there exists a clear interpretation for the integral ∫_0^T u(t) dt related to the total dose given, regardless of whether u represents the dose rate of the drug or its concentration. this is an essential pharmacological quantity also referred to as the auc (area under the curve), and it plays a major role in interpreting the overall efficacy and side effects of a drug. thus, making the choice of an l1-formulation as the penalty of the control leads to a realistic modeling. the handicap of an l1-type approach lies with the fact that the mathematical analysis generally becomes difficult. optimal solutions need to be synthesized from concatenations of bang and singular controls (and we shall discuss this more in the next section) and reliable numerical methods to solve such problems do not exist. it is for this reason, and in our opinion for this reason only, that the use of an l2-criterion in the literature is wide-spread.
for, in this case, the application of the necessary conditions for optimality of the pontryagin maximum principle [33-36] leads to a closed form formula for the control as a function of state and multiplier that can easily be solved numerically. this, however, does not guarantee that a numerically computed solution is optimal, as only extremals (pairs of controlled trajectories and multipliers which satisfy the first-order necessary conditions for optimality) are obtained, a fact often understated by many researchers. indeed, the projection from the space of extremals into the state space (i.e., projecting out the multiplier) can have, and often does have, singularities, the equivalent of the well-known conjugate points from the calculus of variations. for problems employing an l2-type objective, higher-order tests for local optimality are given in [15, chapter 4] or [37] and are not difficult to apply, but generally this is not done in the literature. our main objection to the use of an l2-criterion in the controls, however, does not lie with this mathematical aspect, but simply with the fact that for biomedical problems it seems rather impossible to give a satisfactory justification for such a functional form, as the meaning of the phrase "systemic cost" employed in the vast literature is hard to capture. in chemotherapy, for example, not only does the quantity ∫_0^T u(t)² dt not represent any pharmacologically meaningful quantity, but in addition, if one normalizes the controls to satisfy 0 ≤ u_i ≤ 1, as is often done, then this strongly favors smaller partial doses over the higher full doses, as half the maximum dose is only given a quarter of the weight of the maximum dose. thus, from the very setup, a bias is built into the optimal solutions to be computed which the underlying system does not have. cost is another questionable justification for an l2-type objective, as a quantity measured in squared dollars is not very meaningful.
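the bias noted here is easy to quantify: for two normalized schedules delivering the same total dose, the l1 penalties coincide while the l2 penalty favors the spread-out half-dose schedule. a small discrete sketch (the horizon and schedules are assumed for illustration):

```python
# two normalized dosing schedules (0 <= u <= 1) delivering the same total dose
# over n equal time steps: full dose for half the horizon vs half dose throughout
n = 20
schedule_full = [1.0] * (n // 2) + [0.0] * (n // 2)
schedule_half = [0.5] * n

def l1_penalty(u):
    return sum(x for x in u)       # discrete analogue of the auc, integral of u

def l2_penalty(u):
    return sum(x * x for x in u)   # discrete analogue of the integral of u^2

# both schedules give the same total dose, hence the same l1 penalty ...
auc_full, auc_half = l1_penalty(schedule_full), l1_penalty(schedule_half)
# ... but the l2 penalty of the half-dose schedule is only half as large,
# since half the dose gets a quarter of the weight per step
q_full, q_half = l2_penalty(schedule_full), l2_penalty(schedule_half)
```

so under an l2-criterion the two medically distinct schedules are no longer penalized equally: the quadratic term discriminates against full-dose administrations even though the auc is identical.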
it is certainly true that the cost need not be linear in u. this particularly holds in epidemiological models, where the cost of vaccinating the first 10% of a population clearly is much smaller than the cost of vaccinating the last 10% of this population, as much greater efforts will be needed to reach the tail end of the population. yet, while this clearly makes a point for some nonlinear relation, a quadratic term does not seem to be the answer to this. we still remark that it is often the typical scenario that optimal solutions for the l2-type term (which necessarily are continuous) exhibit sharp rises from the lower control value u = 0 to its maximum value u = u_max, mimicking optimal bang-bang controls which would be obtained by using an l1-type term. indeed, from a numerical perspective, it makes sense to consider the problem when the control terms are taken in the form u + εu² and then the limit as ε → 0 is taken (if it exists). we note, however, that our interest in solving the optimal control problems is more in trying to obtain a better global understanding of the dynamics, i.e., the underlying problem, than to obtain one particular numerical solution (which is a much easier problem). as a rule, the parameter values in the model are uncertain and the solutions can be quite sensitive to these values (e.g., for stepanova's model), which limits the practical use of small numerical studies. we illustrate some of these effects with locally optimal solutions for a cell-cycle specific model for cancer chemotherapy due to swierniak [8, 12]. in such models, the various phases of the cell cycle (a quiescent phase g_0, a first growth phase g_1, synthesis s, a second growth phase g_2, and mitosis m) are clustered into compartments and the dynamics describes the transitions, i.e., the flow between these compartments.
mathematically, the dynamics becomes a bilinear system of the form

ṅ = ( A + Σ_{j=1}^m u_j B_j ) n,

with the state n_i representing the average number of cells in the respective compartments and the matrices A and B_j modeling the flow rates between the compartments. the controls represent cytotoxic agents which indiscriminately kill cells, cytostatic agents which block the transitions of cells through the cell cycle, and recruiting agents which entice cells to leave the quiescent phase and become active again so that it is possible to kill them. a typical form of the objective is given by

j(u) = r n(T) + ∫_0^T ( q n(t) + Σ_{j=1}^m θ_j u_j(t) ) dt,

with the coefficients (the row vectors r and q and the weights θ_j) nonnegative. figure 3 compares locally optimal solutions for a 3-compartment model with the compartments n_1 ∼ g_0/g_1, n_2 ∼ s and n_3 ∼ g_2/m for l1- and l2-type functional terms on the controls, which represent a cytotoxic agent u and a cytostatic agent v [11, 15, 37]. optimal controls for the l1-type objective are bang-bang, i.e., switch between full dose therapies and rest periods, while optimal controls for the l2-type objective are necessarily continuous. for an l1-type objective, the optimal administration of the killing agent u is full dose initially followed by a rest period (and this is what medically makes most sense here), whereas for the l2-type objective the continuity of the control generates decreasing lower dose administrations of the cytotoxic agent toward the end of the therapy interval. the total doses given in both cases are similar. for the blocking agent v, the optimal l2-control clearly mimics the optimal l1-control with steep continuous intermediate segments. the behavior of the controlled trajectories for both objectives is very much the same. the terminal values for the states corresponding to the l1- and l2-type objectives are close, and the reduction in the total cancer burden is from the initial normalization of c(0) = 1 to c(21) = 0.8673 for minimizing j_1 and to c(21) = 0.8314 for minimizing j_2.
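a minimal 2-compartment instance of this bilinear structure can be simulated directly; the transit rates, initial distribution and the bang-bang schedule below are assumptions chosen for illustration. the cytotoxic control u kills a fraction u of the cells undergoing division in the second compartment.

```python
# a minimal 2-compartment instance of the bilinear structure n' = (A + u*B) n with
# A = [[-a1, 2*a2], [a1, -a2]] and B = [[0, -2*a2], [0, 0]];
# a1, a2 are assumed inverse transit times, u in [0, 1] a cytotoxic control
a1, a2 = 0.197, 0.356

def step(n1, n2, u, h):
    """one explicit euler step of n' = (A + u*B) n."""
    d1 = -a1 * n1 + 2.0 * (1.0 - u) * a2 * n2   # only a fraction 1-u re-enters n1
    d2 = a1 * n1 - a2 * n2
    return n1 + h * d1, n2 + h * d2

def run(control, h=0.01, t_end=21.0):
    """integrate over the therapy horizon and return the total cell count."""
    n1, n2 = 0.64, 0.36   # assumed initial distribution over the compartments
    for k in range(int(t_end / h)):
        n1, n2 = step(n1, n2, control(k * h), h)
    return n1 + n2

untreated = run(lambda t: 0.0)                        # exponential growth
bang_bang = run(lambda t: 1.0 if t < 10.0 else 0.0)   # full dose, then rest
```

without treatment the dominant eigenvalue of A is positive and the population grows; the assumed bang-bang schedule (full dose followed by a rest period) substantially reduces the terminal cell count.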
in both cases, the initial condition was chosen as the normalized steady-state of the uncontrolled dynamics (i.e., assuming no earlier treatment). computing the total dose of the drugs given for the l2-optimal control, one sees, however, that while minimizing the quadratic objective leads to about a 3% improvement in the reduction of the tumor, this comes at the expense of increasing the amounts of both the cytotoxic and cytostatic agents by about 20%. thus, higher side effects are encountered. minimizing j_1 would appear to be the preferred strategy. an alternative approach, and the one which seems to be preferred by medical practitioners, is to impose the total amount of drug (or radiotherapy dose, etc.) given a priori. the argument is that, to begin with, any mathematical model represents only a small view of the underlying problem and that, by sheer necessity of the complexity which, for example, cancer therapy is, many aspects need to be neglected. thus, the point of view is taken that, based on other assessments such as the side effects of the therapy, the total amount of drug to be given has been determined a priori. the question then simply becomes the following one: given this amount of drug, how can it be administered in the best possible way? mathematically, this amounts to adding the trivial equation ẏ = u, y(0) = 0, to the dynamics, and the terminal condition y(T) ≤ A is imposed as an isoperimetric constraint. this approach, for example, was taken in solving hahnfeldt's model for antiangiogenic therapy as an optimal control problem [13, 15]. in such a case, no control related term will be included in the objective. from a modeling perspective, the use of a hard constraint on the controls seems to be more adequate. mathematically, however, there are close relations between the necessary conditions for optimality for both formulations (either as a soft or hard constraint) and it makes little difference which formulation is chosen.
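the bookkeeping trick of adding ẏ = u is straightforward to implement alongside any dynamics: the extra state simply accumulates the dose, and the isoperimetric constraint becomes a terminal check. the dose-rate schedule and the bound A below are assumed values.

```python
# tracking the total dose as an extra state: y' = u, y(0) = 0, with the
# a priori bound y(T) <= A imposed as a terminal (isoperimetric) constraint;
# A_MAX and the schedule u are assumed values for illustration
A_MAX = 10.0
h, t_end = 0.01, 21.0

def u(t):
    """an assumed admissible schedule: full dose rate for the first 10 days."""
    return 1.0 if t < 10.0 else 0.0

y = 0.0
for k in range(int(t_end / h)):
    y += h * u(k * h)   # euler integration of y' = u

feasible = y <= A_MAX   # terminal check of the isoperimetric constraint
```

in a full optimal control formulation y would be integrated together with the state equations, and y(T) ≤ A enters the necessary conditions as a terminal constraint rather than as a control term in the objective.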
the lagrangian l and the penalty term ϕ allow one to judge the impact of the therapy on the state. including the lagrangian l in the objective can also be viewed as incorporating a soft constraint on the state. for example, in cancer therapies it is clearly important to limit the size of the tumor or, in another indirect measurement of the side effects, to prevent the level of the bone marrow from becoming too small. clearly, such restrictions could be modeled as state-space constraints imposing upper or lower bounds on some of the variables. for example, in [38] an explicit upper bound is imposed on the total number of cancer cells n(t) in a cell-cycle specific model for cancer chemotherapy. [fig. 4: examples of locally optimal controls (left) and their corresponding trajectories (right) for a cell-cycle specific 2-compartment model for cancer therapy over a fixed therapy horizon (T = 21 days), with lagrangian term l in the objective (top row, q = 0.1) and without lagrangian term in the objective (bottom row, q = 0); the initial condition is the normalized value of the steady-state solution for the uncontrolled system.] here, however, a modeling approach of using soft constraints is preferred, as the necessary conditions for optimality become more involved if state-space constraints are present. if explicit state-space constraints are included in the modeling, then, a priori, the associated multiplier is only known to be a positive radon measure. although it can be shown that this measure is absolutely continuous with respect to lebesgue measure if the constraint is an embedded hypersurface (e.g., see [39]), the effort of dealing with a state-space constraint is much more involved than having a lagrangian present in the objective. a common pitfall in the modeling is not to include such a term.
Then, naturally, the emphasis of the minimization is put on the terminal time, and quite often control actions will be delayed initially and concentrated near the terminal time. Once again, however, this is not a feature of the underlying problem; it has been imposed, possibly unwillingly, by the particular form of the objective that was chosen. Figure 4 illustrates this feature by comparing the optimal solutions for a 2-compartment cell-cycle specific model for cancer chemotherapy [8, 10, 15]. The two compartments represent the clustered phases N1 ∼ G1/G0 + S and N2 ∼ G2/M of the cell cycle, and the drug is a G2/M-specific cytotoxic agent. The objective was chosen in this form for q = 0.1 and q = 0 over a fixed three-week therapy period. Optimal controls are bang-bang with one switching. If only a penalty term is used, then the drug administration is pushed entirely to the end of therapy, with an unacceptable spike in the tumor cells in between. The choice of the objective is often made in a heuristic, ad hoc manner. Sometimes, however, the underlying biology can be used as a guide, and Stepanova's model provides a beautiful illustration of how this can be achieved. Assuming that the parameter values are such that the dynamics is bistable with a benign and a malignant region, the practical aim of therapy can then be formulated as moving the state of the system from the malignant region into the region of attraction of the stable, benign equilibrium point while keeping side effects tolerable. Assuming the controls are a strongly targeted cytotoxic agent u and an immuno-stimulatory agent v, this can be achieved using an objective functional of the form given in (17). All coefficients are positive, and the choice of the weights aims at striking a balance between the benefit at the terminal time and the side effects measured by the total amounts of drugs given, while it also guarantees the existence of an optimal solution by penalizing the free terminal time T [15, 40, 41].
We discuss the significance of each penalty term in (17) in detail.

(i) The term a p(T) − b r(T) at the terminal time is designed to induce a transfer of the system from the malignant into the benign region of the state space. As is seen in Fig. 2, points with small tumor volumes and depressed immunocompetent cell densities lie in the malignant region, and it may not suffice to simply minimize the tumor volume. Rather, it is the geometric shape of the stability boundary between the benign and malignant regions that matters. Even for a low-dimensional dynamical system like Stepanova's model, it is generally not possible to give a functional description of the separatrix, which is the stable manifold of the unstable equilibrium point of the system. Its tangent space, however, is easily computed, and it thus becomes a good heuristic to choose the coefficients a and b in the penalty term such that the minimization of this term corresponds to moving in the normal direction to this stable manifold, from the malignant toward the benign region. Alternatively, one could also choose these coefficients to move in the direction of the unstable eigenvector at the saddle point, for this is the tangent vector to the path which uncontrolled trajectories will closely follow when in the benign region. In the formulation (17) we have followed the first approach: v = (b, a)^T is the stable eigenvector of the saddle point, oriented so that both a and b are positive numbers (see Fig. 5). It follows from the geometry of the stable manifold that a and b have the same sign, and including in the objective a term of the form a p(T) − b r(T) gives the correct direction to minimize. The level sets of this quantity are lines parallel to the tangent space of the stable manifold of the saddle, and minimizing this quantity thus creates an incentive for the system to move into the benign region.

Fig. 5: Phase portrait of the dynamics, with some sample trajectories shown in blue. The dynamics has three equilibria: a locally asymptotically stable focus which is benign (shown as a small green circle), a saddle point (shown as a small grey circle), and a locally asymptotically stable node which is malignant (shown as a small red circle). The stable and unstable manifolds of the saddle point are shown as solid black curves. The tangent vector to the stable manifold at the saddle point is labeled v = (b, a)^T, oriented so that both entries are positive.

(ii) The integrals of the controls, ∫_0^T u(t)dt and ∫_0^T v(t)dt, measure the total amounts of drugs given and, once again, represent a soft modeling of the side effects of the drugs. Clinical data as to the severity of the drugs should be reflected in the choices for c and d. Naturally, the specific type of tumor and the stage of the cancer will need to enter into the calibration of these coefficients; in a more advanced stage, higher side effects need to be tolerated, and thus smaller values of c would be taken.

(iii) The penalty term sT on the terminal time makes the mathematical problem well-posed. This term, which can be written either under the integral or as a separate penalty term sT, is somewhat unusual. It has to do with the nature of the underlying problem and the emphasis on the regions of attraction. The existence of the asymptotically stable, benign equilibrium point generates controlled trajectories that improve the value a p(T) − b r(T) of the objective along the trivial controls u = 0 and v = 0. If no penalty is imposed on the free terminal time, this creates a "free pass" structure in which the value of the objective can be improved without incurring a cost. As a result, in such a situation an optimal solution may not exist.
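The eigenvector heuristic in (i) can be sketched numerically. The Jacobian below is an illustrative stand-in for the linearization of the dynamics at the saddle point, not Stepanova's actual model; the point is only how the coefficients of the terminal penalty are read off from the stable eigenvector.

```python
import numpy as np

# Illustrative Jacobian of the dynamics at the saddle point (a toy matrix
# standing in for the linearization of a bistable tumor-immune model).
J = np.array([[1.0, -2.0],
              [-2.0, 1.0]])

eigvals, eigvecs = np.linalg.eig(J)
stable = np.argmin(eigvals)        # at a saddle, one eigenvalue is negative
v = eigvecs[:, stable]
if v[0] < 0:                       # orient so that both entries are positive
    v = -v
b, a = v                           # terminal penalty a*p(T) - b*r(T) uses v = (b, a)^T

# The gradient (a, -b) of a*p - b*r is normal to the level sets, i.e. normal
# to the tangent space of the stable manifold spanned by v.
penalty_normal = np.array([a, -b])
```

Minimizing a·p(T) − b·r(T) then pushes the terminal state in the normal direction across the stable manifold, from the malignant toward the benign region, as described in the text.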
Intuitively, the controls can switch to (u, v) = (0, 0) immediately as the separatrix is crossed and then take an increasingly longer time as they pass near the saddle point, with the infimum arising in the limit T → ∞. From a practical point of view, this behavior is of course not acceptable. Mathematically, it is easily eliminated by including a penalty term on the final time in the objective; this limits the size of the therapy interval and creates a mathematically well-posed problem.

We now discuss general properties of optimal controls for control-affine systems when the L1-norm (k = 1) is used on the controls, and give examples of solutions for some of the problems mentioned earlier. We consider the optimal control problem to minimize the objective (13) over all admissible controls u ∈ U subject to the dynamics (1) and terminal constraints. Necessary conditions for optimality of a controlled trajectory (x*, u*) are given by the Pontryagin maximum principle [33] (for some recent references, see [34-36]). These conditions are formulated in terms of the (control) Hamiltonian H, where λ0 ≥ 0 and λ ∈ R^n are multipliers which we write as a row vector. The most important of the conditions of the maximum principle states that, for almost every time, the optimal control u* minimizes the control Hamiltonian H pointwise over the control set U along the optimal controlled trajectory x* and the multipliers λ0 and λ. For the problems considered here, the Hamiltonian H is affine in the controls, and for each component u_i of u the control set is a compact interval. Hence, the minimum condition splits into m separate scalar minimization problems that are easily solved. Defining the functions φ_j: [0, T] → R, the minimum condition determines the optimal controls in terms of the signs of these functions; the function φ_j is called the switching function corresponding to the control u_j.
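Because H is affine in each u_j, the pointwise minimization reduces to a sign test on the switching function. A minimal sketch, assuming the standard convention that φ_j multiplies u_j in H (so minimizing H takes u_j to its maximum where φ_j is negative and to zero where it is positive):

```python
import numpy as np

def bang_bang_control(phi, u_max):
    """Pointwise minimizer of a control-affine Hamiltonian: u_j = 0 where
    the switching function phi_j > 0, u_j = u_max_j where phi_j < 0.
    (Where phi_j = 0 the minimum condition is uninformative.)"""
    phi = np.asarray(phi, dtype=float)
    return np.where(phi < 0, u_max, 0.0)

# Two controls: the first switching function is positive (control off),
# the second is negative (control at full dose).
u = bang_bang_control([0.3, -1.2], u_max=np.array([1.0, 2.0]))
```

Each of the m scalar problems is solved independently in this way, which is exactly why the minimum condition "splits" as stated above.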
A priori, the control u_j is not determined by the minimum condition at times when φ_j(τ) = 0. In such a case, all control values trivially satisfy the minimum condition and thus, in principle, all are candidates for optimality. However, the switching functions are absolutely continuous, and if the derivative φ̇_j(τ) does not vanish, then the control switches at time τ between u_j = 0 and u_j = u_j^max; such a time τ is called a bang-bang switch. On the other hand, if φ_j were to vanish identically on an open interval I, then, although the minimization property by itself gives no information about the control, in this case all the derivatives of φ_j must also vanish, and this condition puts strong limitations on the controls. Extremal controls for which the switching function vanishes identically on an open interval I are called singular, while the constant controls u_j = 0 and u_j = u_j^max are called bang controls; controls that only switch between 0 and the maximum control values are called bang-bang controls. We mention that higher-order necessary conditions for optimality of singular controls are available, the so-called generalized Legendre-Clebsch conditions (e.g., see [36]), which allow us to distinguish between minimizing and maximizing singular controls. If the control represents the dose rate for the application of some therapeutic agent, then bang-bang controls correspond to treatment strategies that switch between maximum-dose therapy sessions and rest periods, the typical MTD (maximum tolerated dose) type applications of chemotherapy. Singular controls, on the other hand, correspond to time-varying administrations of the agent at intermediate and often significantly lower dose rates. There is growing interest in such structures in the medical community because of mounting evidence that "more is not necessarily better" [42, 43] and that a biologically optimal dose (BOD) with the best overall response should be sought.
Hence, the question whether optimal controls are bang-bang or singular has a rather direct interpretation, and its answer is of practical relevance for the structure of optimal treatment protocols. Summarizing, solving an optimal control problem is tantamount to determining the correct sequences and switching times for the controls as concatenations of bang and singular segments.

We close this review with two examples of optimal solutions and interpretations of these results. We consider the model (2)-(3) for angiogenic signaling with an antiangiogenic agent u and a chemotherapeutic agent v. Modeling the effects of these controls using Skipper's log-linear model and treating the concentrations of the agents as the controls gives the controlled dynamics. We take the point of view that the available amounts of the agents have been determined a priori. We add the trivial dynamics ẏ = u and ż = v, which keep track of the amounts of agents used, and impose the isoperimetric constraints ∫_0^T u(t)dt ≤ y_max and ∫_0^T v(t)dt ≤ z_max at the endpoint. The terminal time T is free. Once side effects have been modeled in this way, our objective simply becomes to minimize the tumor volume, i.e., J(u) = p(T). From a practical point of view, the optimal solution then simply answers the following question: given a priori determined amounts of agents, how can they best be administered in time to maximize the tumor reduction? For the monotherapy problem when z_max = 0, i.e., no chemotherapy is given, we have given a complete global solution to this problem in the form of a regular synthesis of optimal controls [13, 15]. This solution specifies the optimal control as a feedback function of the state (p, q, y) for arbitrary initial conditions. Figure 6 gives a two-dimensional representation of this solution in (p, q)-space, with p shown as the vertical and q as the horizontal variable.
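For concreteness, the monotherapy dynamics can be simulated in the commonly cited Hahnfeldt-type form ṗ = -ξp·ln(p/q), q̇ = bp - d·p^(2/3)·q - μq - γuq. The parameter values and the constant full-dose control below are illustrative only and are not the calibrated values used in [13, 15].

```python
import numpy as np

def hahnfeldt_rhs(p, q, u, xi=0.084, b=5.85, d=0.00873, mu=0.02, gamma=0.15):
    """Tumor volume p and vascular carrying capacity q under an
    antiangiogenic dose rate u (Hahnfeldt-type model, illustrative values)."""
    dp = -xi * p * np.log(p / q)
    dq = b * p - d * p ** (2.0 / 3.0) * q - mu * q - gamma * u * q
    return dp, dq

def simulate(u_const, T=10.0, n=20000, p0=12000.0, q0=15000.0):
    dt = T / n
    p, q = p0, q0
    for _ in range(n):
        dp, dq = hahnfeldt_rhs(p, q, u_const)
        p, q = p + dt * dp, q + dt * dq
    return p, q

p_treated, _ = simulate(u_const=75.0)   # constant full-dose therapy
p_untreated, _ = simulate(u_const=0.0)  # no therapy: tumor keeps growing
```

Even this crude constant-dose simulation shows the mechanism behind the optimal synthesis: the control shrinks the carrying capacity q, and the tumor volume p then follows q downward through the −ξp·ln(p/q) term.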
For a medically realistic initial condition (with the carrying capacity larger than the tumor volume), optimal controls start with a full-dose segment (shown as solid green curves in the figure) until an optimal singular arc S is reached; this segment is short in time. Using the conditions of the maximum principle, the curve S can be described explicitly in closed form [13, 15]. Then, and for a rather long time, the optimal controlled trajectory follows the singular arc S (shown as the solid blue curve) until all antiangiogenic agents are used up. Because of aftereffects, there still exists a small interval on which the tumor volume decreases further, shown as a dashed green curve. We note that, different from the impression given in the phase portrait, the times along the individual segments are as described above: the system has a strong differential-algebraic character, and as a result the dynamics along the full- and no-dose segments occurring in the optimal synthesis is very fast, generating the almost horizontal lines seen in Fig. 6. Optimal controls are strongly characterized by the singular control, which follows a specific time-varying concentration of the agents that maintains an optimal relation between the tumor volume p and its carrying capacity q. It is the functional relation defined by the singular arc which provides ideal conditions for tumor shrinkage. The optimal solution for the monotherapy problem provides the mathematical foundation on which optimal controls for combination therapies with antiangiogenic treatments can be computed. This holds both for combinations with radiotherapy [44] and chemotherapy [32, 45]. Here, we discuss the latter case.
Below we state the optimal concatenation structure for a typical scenario: an initial condition in which the carrying capacity significantly exceeds the tumor volume, an ample supply of antiangiogenic inhibitors (whose side effects generally are low), and a more limited total dose of chemotherapy (which generally has much higher side effects). In such a situation, optimal controls (u*, v*) consist of concatenations of four segments of the following form. Initially, no chemotherapy is given, while antiangiogenic drugs are given at full dose until the optimal singular arc S for the monotherapy problem is reached. At this point, administration switches to the optimal singular control for the monotherapy problem until a specific time τ is reached. At this time, chemotherapy commences in a single full-dose (MTD) session; in the medical literature this has been called the 'optimal therapeutic window' [30]. Also, because of the onset of chemotherapy, the formula for the optimal singular control adjusts, and this is seen as a small kink in Fig. 7, which gives a sample of the time evolution of the optimal controls u* and v*. Then, both controls are given until all available drugs are exhausted. In Eq. (25) we have formulated the controls for the case that the antiangiogenic inhibitors run out first, but this depends on the total amounts available and could also be reversed. Figure 8 illustrates the shape of the optimal controlled trajectory in (p, q)-space. Mathematical optimization of the combined therapy strongly supports the following two qualitative conclusions: (i) there is an optimal relation between the tumor volume p and its carrying capacity q along which the tumor decrease is maximized; this relation is given by the optimal singular arc S of the monotherapy problem; (ii) along this path there is an optimal point at which chemotherapy is given in one full-dose MTD administration.
These properties reflect the ideas of 'pruning' the vasculature and of a 'therapeutic window' [30, 31] proposed in the medical literature.

Fig. 7: Illustration of the qualitative shape of the optimal controls u* (left) and v* (right) for combination treatments with chemotherapy and antiangiogenic inhibitors for medically realistic initial values. After a brief initial period of full-dose therapy, administration of the antiangiogenic agent follows the singular control until, at a specific time τ, chemotherapy commences. At that time, the dose of the antiangiogenic agent adjusts to the presence of the chemotherapy: as the dose of the antiangiogenic agent intensifies along the singular arc for the monotherapy problem, in the presence of chemotherapy (which also has antiangiogenic effects) the dose increases in a small jump at time τ. The control follows the adjusted singular control until all inhibitors are exhausted. Chemotherapy is given in one full-dose segment commencing at time τ.

Fig. 8: "Optimal therapeutic window": the red curve depicts the optimal controlled trajectory for the antiangiogenic monotherapy problem (the control is given by full-dose therapy along the vertical portion and then follows the singular arc where the tumor volume decreases). Chemotherapy commences at an optimal time τ, and the optimal trajectory is then shown as a solid blue curve (the remaining optimal monotherapy trajectory is shown as a dotted red curve). When all antiangiogenic agents are exhausted, chemotherapy continues until all drugs have been administered; the junction point corresponds to the kink in the blue trajectory. Chemotherapy still lowers the tumor volume, but as no more antiangiogenic agents are given, the vasculature increases again.

We remark that the optimal controls in this model (and this is also the case in many other models where the concentration is taken as the control) are discontinuous.
Obviously, this is a quirk of the modeling, as concentrations are continuous functions. This discrepancy, however, is easily removed by adding a model for the pharmacokinetics of the drug, which smooths out the transitions between the segments while the same structure is followed. The simplified modeling with the controls representing concentrations has, from a mathematical point of view, the advantage of a lower-dimensional state space and, more importantly, the analysis of singular controls is simpler (cf. [46, 47]). This makes the overall structure of solutions more tractable. Our paper [48] gives an in-depth discussion of these aspects related to the pharmacometrics of the drugs.

We conclude with an example of a numerically computed optimal solution for Stepanova's model with a strongly targeted chemotherapeutic agent u [15, 40, 41, 49]. Again we treat the concentration of the agent as the control and use Skipper's log-linear model to describe the effect of treatment; this simply adds a term −κpu to the dynamics (5) of the tumor volume. Choosing the objective functional in the form (17) with n = 0 (as we do not consider an immune boost), optimal chemotherapy protocols follow the concatenation structure 1s01, with 1 representing a full-dose segment, s denoting administration following a singular control, and 0 standing for a rest period of the treatment. Figure 9 gives a characteristic example of a solution. The initial state z0 = (p0, r0) is characterized by a high tumor volume and a depressed immunocompetent density close to the values of the malignant equilibrium point. In the figure, the unstable equilibrium point is marked with a green dot, and the black curves crossing in it are its stable and unstable manifolds, respectively. Initially, chemotherapy is given at full dose until the state of the system has crossed over into the benign region. Then it follows a singular arc where the concentration of the drug is dropped significantly through lower-dose administrations.
The full singular curve is indicated by a dotted red curve in the figure. Finally, once the state of the system is in the benign region, chemotherapy is stopped and the uncontrolled dynamics (i.e., the actions of the immune system) lead to further reductions in the tumor volume. Interestingly, toward the end, optimal chemotherapy protocols still apply a short chemotherapy segment. This indeed seems to be mirrored by what has proven a particularly effective administration protocol in medical practice. If the effect of an immune boost is incorporated into the system, this structure is generally preserved. Figure 10 gives an example of an optimal control for the combination therapy, with the chemotherapeutic agent shown in red and the immune boost in green. Essentially, the immune boost is only given toward the end, to steer the system into a better terminal state. This figure also clearly shows the lowering of the dose rate for the cytotoxic drug along the singular segment.

It has not been our aim to discuss numerical methods in optimal control in this paper. For completeness' sake, however, we include some brief comments on the methods that were used by us and our co-workers to arrive at the solutions shown in this paper.

Fig. 9: The initial point, all junctions (where the control switches), and the terminal point of the optimal trajectory are marked by black dots. Starting at z0 = (p0, r0), full-dose chemotherapy is used initially. Along that trajectory, the system passes from the malignant region (below the stable manifold of the saddle point) into the benign region (above the stable manifold of the saddle point). The stable and unstable manifolds of the saddle point (marked by a green dot) are the black curves which intersect transversally in the saddle point. The singular arc is the U-shaped curve, shown as solid green segments where the singular control is admissible and as dotted red segments where the singular control is inadmissible. Once the optimal controlled trajectory meets the singular arc, the control switches to become singular and the trajectory follows the singular arc (at much reduced dose rates, cf. Fig. 10). Then, at a certain time, chemotherapy stops and the trajectory follows the dynamics of the uncontrolled system, approaching, and ultimately tracing, the unstable manifold of the saddle point. As the state lies in the benign region, this leads to an overall improvement of the objective value. The optimal control then ends with another brief full-dose chemotherapy session to reach the optimal terminal point.

The problem is simpler if an L2-type objective is employed in the control, as this makes the Hamiltonian H of the optimal control problem strictly convex and optimal controls are continuous. In this case, standard shooting methods can be applied to the combined system of dynamics and adjoint equations to compute numerical extremals. Their local optimality is easily verified or excluded by computing conjugate points using a second-order approach which tests the existence of a solution of the associated matrix Riccati differential equation, as described in [37]. Other approaches are based on discretizations or pseudospectral methods (e.g., see [50-52]). For the solutions shown in Sect. 4.3 we used GPOPS (General Pseudospectral Optimal Control Software), an open-source MATLAB optimal control software that implements the Gauss hp-adaptive pseudospectral methods (http://www.gpops.org/, [53]). These methods approximate the state using a basis of Lagrange polynomials and collocate the dynamics at the Legendre-Gauss nodes [50]. The continuous-time optimal control problem is thereby transformed into a finite-dimensional nonlinear programming problem that is solved using well-known, standard algorithms.

Fig. 10: The optimal controls for the solution shown in Fig. 9. The maximum chemotherapy dose rate (vertical axis) has been normalized to 1.
As can be seen, the time intervals when chemotherapy is at full dose are brief, and the optimal dose rates drop significantly along the much longer singular segment. The rest period is the most dominant time period (almost 8 units out of T = 12).

Problems with an L1-type objective which contain optimal singular arcs are more involved numerically, as the singular control is only optimal on an embedded submanifold of positive codimension. Such a structure is inherently difficult to locate numerically, and various algorithms we tried, including commercial programs, were simply unable to deal with this situation. For the antiangiogenic monotherapy problem (see Fig. 6), the complete optimal synthesis was determined by us theoretically, by analytical means [13]. The computation of optimal controlled trajectories then amounts to the 'evaluation of a function' and is a matter of seconds (cf. also [54]). For the combination therapies, based on the monotherapy results, the computations were made by Maurer. States, controls and adjoint variables were computed using discretization methods which transcribe the optimal control problem into a large-scale nonlinear programming problem; this was implemented in the framework of the applied mathematical modeling programming language AMPL, which is linked to various optimization codes. Either the interior-point method IPOPT based on [55] or the optimization code NUDOCCCS developed in [56], in conjunction with the arc-parametrization method [57], was used. On the other hand, if optimal controls are bang-bang (such as for the cell-cycle specific models for chemotherapy), the numerical problem is simple, and basic algorithms which minimize over the switching times (e.g., gradient procedures [58]) can be used. A verification of second-order conditions for optimality can easily be built into such an algorithm [15].
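When optimal controls are bang-bang, the problem indeed reduces to optimizing over the switching times. The sketch below is a crude stand-in for the gradient procedures cited above: a toy scalar model ẋ = x − 2ux with one switch from full dose (u = 1) to rest (u = 0) at time τ and cost J(τ) = x(T) + 0.5τ; all values are illustrative.

```python
import numpy as np

def cost(tau, T=5.0, n=5000):
    """J(tau) = x(T) + 0.5*tau for x' = x - 2*u*x, x(0) = 1, with a single
    bang-bang switch from u = 1 (full dose) to u = 0 (rest) at time tau."""
    dt = T / n
    x = 1.0
    for k in range(n):
        u = 1.0 if k * dt < tau else 0.0
        x += dt * (x - 2.0 * u * x)
    return x + 0.5 * tau

# Analytically x(T) = exp(T - 2*tau), so the cost is minimized near
# tau* = (T + ln 4)/2, about 3.19; a coarse search over tau recovers this.
taus = np.linspace(0.0, 5.0, 501)
costs = np.array([cost(t) for t in taus])
tau_opt = taus[np.argmin(costs)]
```

A gradient method would replace the grid search by iterating on dJ/dτ, and a second-order optimality check on the switching times can be added on top, as noted in the text.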
In this paper, we formulated an optimal control problem for a general dynamical system which is nonlinear in the state and affine in the controls. The presented framework encompasses a number of models for the treatment of biomedical problems, especially cancer therapies, as well as the more common models in epidemiology. The controls account for either therapeutic or epidemiological interventions. The terms in the cost functional represent various aspects and constraints that need to be taken into consideration when the dynamics is considered as an optimal control problem. We discussed, and illustrated using examples from our own work, how the solutions to these problems share certain common characteristics, what differences can be expected to appear and, most importantly, how the choice of the objective affects the form of the solutions both quantitatively and qualitatively. Specifically, it has been emphasized that omitting important terms in the objective, such as penalties on the tumor size during therapy (not an uncommon procedure in the literature), may lead to dubious solutions which suggest waiting and administering cancer treatment only toward the end of the therapy period. Clearly, mathematical features of the solutions (e.g., continuous versus discontinuous protocols for the dosage of a drug) are often preprogrammed through the choice of the objective functional. This is acceptable, provided one is aware that this is an outcome of the mathematical formulation and not an indication of properties of the underlying dynamical system or biology. The purpose of this article was to review, and somewhat organize and illustrate, work done by us on this topic, as well as to point out some common pitfalls and errors which are often made when biological phenomena are put into the mathematical framework of optimal control theory.
References (titles as recovered from the source; author, year and venue details were lost in extraction):
Mathematical models in cell biology and cancer chemotherapy
An optimal control problem related to leukemia chemotherapy
Applications of optimal control theory in medicine
General applications of optimal control theory in cancer chemotherapy
Role of optimal control in cancer chemotherapy
Optimal treatment protocols in leukemia: modelling the proliferation cycle
Optimal control of drug administration in cancer chemotherapy
Cell cycle as an object of control
A mathematical tumor model with immune resistance and drug therapy: an optimal control approach
Optimal bang-bang controls for a 2-compartment model in cancer chemotherapy
Analysis of a cell-cycle specific model for cancer chemotherapy
Optimal control for a class of compartmental models in cancer chemotherapy
Antiangiogenic therapy in cancer treatment as an optimal control problem
Optimal control problems for the Gompertz model under the Norton-Simon hypothesis in chemotherapy
Optimal control for mathematical models of cancer therapies
A Gompertzian model of human breast cancer growth
Mathematical models in cancer research
Fractal growth of tumors and other cellular populations: linking the mechanistic to the phenomenological modeling and vice versa
Tumor development under angiogenic signaling: a dynamical theory of tumor growth, treatment response, and postvascular dormancy
Course of the immune reaction during the development of a malignant tumour
Optimization of combination therapy for chronic myeloid leukemia with dosing constraints
Nonlinear dynamics of immunogenic tumors: parameter estimation and global bifurcation analysis
Dynamic response of cancer under the influence of immunological activity and therapy
Tumour eradication by antiangiogenic therapy: analysis and extensions of the model by
The three E's of cancer immunoediting
The molecular theory of radiation biology
Dynamic optimization of a linear-quadratic model with incomplete repair and volume-dependent sensitivity and repopulation
On mathematical modeling of critical variables in cancer treatment (goals: better understanding of the past and better planning in the future)
Closed-loop subcutaneous insulin infusion algorithm with a short-acting insulin analog for long-term clinical application of a wearable artificial endocrine pancreas
Normalizing tumor vasculature with antiangiogenic therapy: a new paradigm for combination therapy
Vascular normalization as a rationale for combining chemotherapy with antiangiogenic agents
On optimal delivery of combination therapy for tumors
The mathematical theory of optimal processes
Singular trajectories and their role in control theory (Mathématiques & Applications)
Introduction to the mathematical theory of control
Geometric optimal control
Sufficient conditions for strong local optimality in optimal control problems with L2-type objectives and control constraints
A model for cancer chemotherapy with state space constraints
A local field of extremals for optimal control problems with state constraints of relative degree 1
Optimal response to chemotherapy for a mathematical model of tumor-immune dynamics
Optimal controls for a mathematical model of tumor-immune interactions under targeted chemotherapy with immune boost
Less is more, regularly: metronomic dosing of cytotoxic drugs can target tumor angiogenesis in mice
Perspective on "more is not necessarily better": metronomic chemotherapy
Optimal combined radio- and antiangiogenic cancer therapy
Optimal and suboptimal protocols for a mathematical model for tumor anti-angiogenesis in combination with chemotherapy
Minimizing tumor volume for a mathematical model of anti-angiogenesis with linear pharmacokinetics
Singular controls and chattering arcs in optimal control problems arising in biomedicine
On the role of pharmacometrics in mathematical models for cancer treatments
On optimal chemotherapy with a strongly targeted agent for a model of tumor-immune system interactions with generalized logistic growth
Direct trajectory optimization and costate estimation via an orthogonal collocation method
The Chebyshev-Legendre collocation method for a class of optimal control problems
Zero-propellant maneuver guidance
User's manual for GPOPS: a MATLAB package for dynamic optimization using the Gauss pseudospectral method
Optimal and suboptimal protocols for a class of mathematical models of tumor anti-angiogenesis
On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming
SQP-methods for solving optimal control problems with control and state constraints: adjoint variables, sensitivity analysis and real-time control
Optimization methods for the verification of second-order sufficient conditions for bang-bang controls
A gradient method for application of chemotherapy models

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Acknowledgements: We are grateful to four anonymous referees of this paper for their useful comments, which helped us improve our presentation.

Data availability: All data generated or analyzed during this study are included in this published article.

key: cord-303651-fkdep6cp authors: thompson, robin n.; hollingsworth, t. déirdre; isham, valerie; arribas-bel, daniel; ashby, ben; britton, tom; challenor, peter; chappell, lauren h. k.; clapham, hannah; cunniffe, nik j.; dawid, a. philip; donnelly, christl a.; eggo, rosalind m.; funk, sebastian; gilbert, nigel; glendinning, paul; gog, julia r.; hart, william s.; heesterbeek, hans; house, thomas; keeling, matt; kiss, istván z.; kretzschmar, mirjam e.; lloyd, alun l.; mcbryde, emma s.; mccaw, james m.; mckinley, trevelyan j.; miller, joel c.; morris, martina; o'neill, philip d.; parag, kris v.; pearson, carl a. b.; pellis, lorenzo; pulliam, juliet r.
c.; ross, joshua v.; tomba, gianpaolo scalia; silverman, bernard w.; struchiner, claudio j.; tildesley, michael j.; trapman, pieter; webb, cerian r.; mollison, denis; restif, olivier title: key questions for modelling covid-19 exit strategies date: 2020-08-12 journal: proc biol sci doi: 10.1098/rspb.2020.1405 sha: doc_id: 303651 cord_uid: fkdep6cp combinations of intense non-pharmaceutical interventions (lockdowns) were introduced worldwide to reduce sars-cov-2 transmission. many governments have begun to implement exit strategies that relax restrictions while attempting to control the risk of a surge in cases. mathematical modelling has played a central role in guiding interventions, but the challenge of designing optimal exit strategies in the face of ongoing transmission is unprecedented. here, we report discussions from the isaac newton institute ‘models for an exit strategy’ workshop (11–15 may 2020). a diverse community of modellers who are providing evidence to governments worldwide were asked to identify the main questions that, if answered, would allow for more accurate predictions of the effects of different exit strategies. based on these questions, we propose a roadmap to facilitate the development of reliable models to guide exit strategies. this roadmap requires a global collaborative effort from the scientific community and policymakers, and has three parts: (i) improve estimation of key epidemiological parameters; (ii) understand sources of heterogeneity in populations; and (iii) focus on requirements for data collection, particularly in low-to-middle-income countries. this will provide important information for planning exit strategies that balance socio-economic benefits with public health. as of 3 august 2020, the coronavirus disease 2019 (covid19) pandemic has been responsible for more than 18 million reported cases worldwide, including over 692 000 deaths. 
mathematical modelling is playing an important role in guiding interventions to reduce the spread of severe acute respiratory syndrome coronavirus 2 (sars-cov-2). although the impact of the virus has varied significantly across the world, and different countries have taken different approaches to counter the pandemic, many national governments introduced packages of intense non-pharmaceutical interventions (npis), informally known as 'lockdowns'. although the socio-economic costs (e.g. job losses and long-term mental health effects) are yet to be assessed fully, public health measures have led to substantial reductions in transmission [1] [2] [3] . data from countries such as sweden and japan, where epidemic waves peaked without strict lockdowns, will be useful for comparing approaches and conducting retrospective cost-benefit analyses. as case numbers have either stabilized or declined in many countries, attention has turned to strategies that allow restrictions to be lifted [4, 5] in order to alleviate the economic, social and other health costs of lockdowns. however, in countries with active transmission still occurring, daily disease incidence could increase again quickly, while countries that have suppressed community transmission face the risk of transmission reestablishing due to reintroductions. in the absence of a vaccine or sufficient herd immunity to reduce transmission substantially, covid-19 exit strategies pose unprecedented challenges to policymakers and the scientific community. given our limited knowledge, and the fact that entire packages of interventions were often introduced in quick succession as case numbers increased, it is challenging to estimate the effects of removing individual measures directly and modelling remains of paramount importance. 
we report discussions from the 'models for an exit strategy' workshop (11-15 may 2020) that took place online as part of the isaac newton institute's 'infectious dynamics of pandemics' programme. we outline progress to date and open questions in modelling exit strategies that arose during discussions at the workshop. most participants were working actively on covid-19 at the time of the workshop, often with the aim of providing evidence to governments, public health authorities and the general public to support the pandemic response. after four months of intense model development and data analysis, the workshop gave participants a chance to take stock and openly share their views of the main challenges they are facing. a range of countries was represented, providing a unique forum to discuss the different epidemic dynamics and policies around the world. although the main focus was on epidemiological models, the interplay with other disciplines formed an integral part of the discussion. the purpose of this article is twofold: to highlight key knowledge gaps hindering current predictions and projections, and to provide a roadmap for modellers and other scientists towards solutions. given that sars-cov-2 is a newly discovered virus, the evidence base is changing rapidly. rather than conducting a systematic review, we asked the large group of researchers at the workshop for their expert opinions on the most important open questions, and relevant literature, that will enable exit strategies to be planned with more precision. by inviting contributions from representatives of different countries and areas of expertise (including social scientists, immunologists, epidemic modellers and others), and discussing the expert views raised at the workshop in detail, we sought to reduce geographical and disciplinary biases. all evidence is summarized here in a policy-neutral manner. the questions in this article have been grouped as follows.
first, we discuss outstanding questions for modelling exit strategies that are related to key epidemiological quantities, such as the reproduction number and herd immunity fraction. we then identify different sources of heterogeneity underlying sars-cov-2 transmission and control, and consider how differences between hosts and populations across the world should be included in models. finally, we discuss current challenges relating to data requirements, focusing on the data that are needed to resolve current knowledge gaps and how uncertainty in modelling outputs can be communicated to policymakers and the wider public. in each case, we outline the most relevant issues, summarize expert knowledge and propose specific steps towards the development of evidence-based exit strategies. this leads to a roadmap for future research (figure 1) made up of three key steps: (i) improve estimation of epidemiological parameters using outbreak data from different countries; (ii) understand heterogeneities within and between populations that affect virus transmission and interventions; and (iii) focus on data needs, particularly data collection and methods for planning exit strategies in low-to-middle-income countries (lmics) where data are often lacking. this roadmap is not a linear process: improved understanding of each aspect will help to inform other requirements. for example, a clearer understanding of the model resolution required for accurate forecasting (§3a) will inform the data that need to be collected (§4), and vice versa. if this roadmap can be followed, it will be possible to predict the likely effects of different potential exit strategies with increased precision. this is of clear benefit to global health, allowing exit strategies to be chosen that permit interventions to be relaxed while limiting the risk of substantial further transmission.
(a) how can viral transmissibility be assessed more accurately? the time-dependent reproduction number, r(t) or r t , has emerged as the main quantity used to assess the transmissibility of sars-cov-2 in real time [6] [7] [8] [9] [10] . in a population with active virus transmission, the value of r(t) represents the expected number of secondary cases generated by someone infected at time t. if this quantity is, and remains below, one, then an ongoing outbreak will eventually fade out. although easy to understand intuitively, estimating r(t) from case reports (as opposed to, for example, observing r(t) in known or inferred transmission trees [11] ) requires the use of mathematical models. as factors such as contact rates between infectious and susceptible individuals change during an outbreak in response to public health advice or movement restrictions, the value of r(t) has been found to respond rapidly. for example, across the uk, country-wide and regional estimates of r(t) dropped from approximately 2.5-4 in mid-march [7, 12] to below one after lockdown was introduced [12, 13] . one of the criteria for relaxing the lockdown was for the reproduction number to decrease to 'manageable levels' [14] . monitoring r(t), as well as case numbers, as individual components of the lockdown are relaxed is critical for understanding whether or not the outbreak remains under control [15] . several mathematical and statistical methods for estimating temporal changes in the reproduction number have been proposed. two popular approaches are the wallinga-teunis method [16] and the cori method [17, 18] . these methods use case notification data along with an estimate of the serial interval distribution (the times between successive cases in a transmission chain) to infer the value of r(t). other approaches exist (e.g. based on compartmental epidemiological models [19] ), including those that can be used alongside different data (e.g. 
time series of deaths [7, 12, 20] or phylogenetic data [21-24]). despite this extensive theoretical framework, practical challenges remain. reproduction number estimates often rely on case notification data that are subject to delays between case onset and being recorded. available data, therefore, do not include up-to-date knowledge of current numbers of infections, an issue that can be addressed using 'nowcasting' models [8, 12, 25]. the serial interval represents the period between symptom onset times in a transmission chain, rather than between times at which cases are recorded. time series of symptom onset dates, or even infection dates (to be used with estimates of the generation interval when inferring r(t)), can be estimated from case notification data using latent variable methods [8, 26] or methods such as the richardson-lucy deconvolution technique [27, 28]. the richardson-lucy approach has previously been applied to infer incidence curves from time series of deaths [29]. these methods, as well as others that account for reporting delays, are useful avenues to improve the practical estimation of r(t).

figure 1. research roadmap to facilitate the development of reliable models to guide exit strategies. three key steps are required: (i) improve estimates of epidemiological parameters (such as the reproduction number and herd immunity fraction) using data from different countries (§2a-d); (ii) understand heterogeneities within and between populations that affect virus transmission and interventions (§3a-d); and (iii) focus on data requirements for predicting the effects of individual interventions, particularly, but not exclusively, in data-limited settings such as lmics (§4a-c). work in these areas must be conducted concurrently; feedback will arise from the results of the proposed research that will be useful for shaping next steps across the different topics.

further, changes in testing practice (or capacity to conduct tests) lead to temporal changes in case numbers that cannot be distinguished easily from changes in transmission. understanding how accurately and how quickly changes in r(t) can be inferred in real time given these challenges is crucial. another way to assess temporal changes in r(t), without requiring nowcasting, is by observing people's transmission-relevant behaviour directly, e.g. through contact surveys or mobility data [31]. these methods come with their own limitations: because these surveys do not usually collect data on infections, care must be taken in using them to understand and predict ongoing changes in transmission. other outstanding challenges in assessing variations in r(t) include the decrease in accuracy when case numbers are low, and the requirement to account for temporal changes in the serial interval or generation time distribution of the disease [32, 33]. when there are few cases (such as in the 'tail' of an epidemic, §2d), there is little information with which to assess virus transmissibility. methods for estimating r(t) based on the assumption that transmissibility is constant within fixed time periods can be applied with windows of long duration (thereby including more case notification data with which to estimate r(t)) [34, 35]. however, this comes at the cost of a loss of sensitivity to temporal variations in transmissibility. consequently, when case numbers are low, the methods described above for tracking transmission-relevant behaviour directly are particularly useful. in those scenarios, the 'transmission potential' might be more important than realized transmission [36]. the effect of population heterogeneity on reproduction number estimates requires further investigation, as current estimates of r(t) tend to be calculated for whole populations (e.g. countries or regions).
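the renewal-equation ('cori method') estimator discussed above can be sketched in a few lines. this is a minimal illustration under assumptions, not the published epiestim implementation: the incidence series and serial-interval weights are synthetic, and the function name `estimate_rt` and its window-averaging scheme are our own simplification.

```python
# Minimal sketch of a Cori-style renewal-equation estimator:
# R(t) ~ (cases in window) / (total infectiousness in window), where total
# infectiousness on day s is sum_u w[u] * I[s-u] for serial-interval weights w.
# Incidence and serial-interval values below are illustrative, not real data.

def estimate_rt(incidence, w, window=7):
    """Estimate R(t) over a trailing window of `window` days.

    incidence : list of daily case counts I_0, I_1, ...
    w         : discretized serial-interval distribution; w[0] is unused and
                the remaining entries should sum to ~1.
    Returns a dict mapping day t -> R(t) estimate.
    """
    rt = {}
    for t in range(window, len(incidence)):
        num = sum(incidence[s] for s in range(t - window + 1, t + 1))
        den = 0.0
        for s in range(t - window + 1, t + 1):
            # total infectiousness at day s: sum_u w[u] * I[s-u]
            den += sum(w[u] * incidence[s - u]
                       for u in range(1, min(s, len(w) - 1) + 1))
        if den > 0:
            rt[t] = num / den
    return rt

if __name__ == "__main__":
    # Synthetic outbreak growing by a constant factor each day.
    inc = [int(10 * 1.2 ** t) for t in range(20)]
    w = [0.0, 0.2, 0.3, 0.3, 0.2]   # illustrative serial interval (mean ~2.5 d)
    rt = estimate_rt(inc, w)
    print({t: round(r, 2) for t, r in sorted(rt.items())})
```

for steadily growing incidence the estimates settle to a constant value above one, illustrating the link between the growth rate and r(t) noted in the text; in practice the choice of window length trades noise against sensitivity to change, exactly the low-case-count trade-off described above.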
understanding the characteristics of constituent groups contributing to this value is important to target interventions effectively [37, 38] . for this, data on infections within and between different subpopulations (e.g. infections in care homes and in the wider population) are needed. as well as between subpopulations, it is also necessary to ensure that estimates of r(t) account for heterogeneity in transmission between different infectious hosts. such heterogeneity alters the effectiveness of different control measures, and, therefore, the predicted disease dynamics when interventions are relaxed. for a range of diseases, a rule of thumb that around 20% of infected individuals are the sources of 80% of infections has been proposed [38, 39] . this is supported by recent evidence for covid-19, which suggests significant individual-level variation in sars-cov-2 transmission [40] with some transmission events leading to large numbers of new infections. finally, it is well documented that presymptomatic individuals (and, to a lesser extent, asymptomatic infected individuals-i.e. those who never develop symptoms) can transmit sars-cov-2 [41, 42] . for that reason, negative serial intervals may occur when an infected host displays covid-19 symptoms before the person who infected them [43, 44] . although methods for estimating r(t) with negative serial intervals exist [44, 45] , their inclusion in publicly available software for estimating r(t) should be a priority. increasing the accuracy of estimates of r(t), and supplementing these estimates with other quantities (e.g. estimated epidemic growth rates [46] ), is of clear importance. as lockdowns are relaxed, this will permit a fast determination of whether or not removed interventions are leading to a surge in cases. (b) what is the herd immunity threshold and when might we reach it? 
herd immunity refers to the accumulation of sufficient immunity in a population through infection and/or vaccination to prevent further substantial outbreaks. it is a major factor in determining exit strategies, but data are still very limited. dynamically, the threshold at which herd immunity is achieved is the point at which r(t) (§2a) falls below one for an otherwise uncontrolled epidemic, resulting in a negative epidemic growth rate. however, reaching the herd immunity threshold does not mean that the epidemic is over or that there is no risk of further infections. great care must be taken in communicating this concept to the public, to ensure continued adherence to public health measures. crucially, whether immunity is gained naturally through infection or through random or targeted vaccination affects the herd immunity threshold, which also depends critically on the immunological characteristics of the pathogen. since sars-cov-2 is a new virus, its immunological characteristics, notably the duration and extent to which prior infection confers protection against future infection, and how these vary across the population, are currently unknown [47]. lockdown measures have impacted contact structures and hence the accumulation of immunity in the population, and are likely to have led to significant heterogeneity in acquired immunity (e.g. by age, location, workplace). knowing the extent and distribution of immunity in the population will help guide exit strategies. as interventions are lifted, whether or not r(t) remains below one depends on the current level of immunity in the population as well as the specific exit strategy followed. a simple illustration is to treat r(t) as a deflation of the original (basic) reproduction number r0 (which is assumed to be greater than one): r(t) = (1 - i(t)) p(t) r0, where i(t) is the immunity level in the community at time t and p(t) is the overall reduction factor from the control measures that are in place. if i(t) > 1 - 1/r0, then r(t) remains below one even when all interventions are lifted: herd immunity is achieved. however, recent results [48, 49] show that, for heterogeneous populations, herd immunity occurs at a lower immunity level than 1 - 1/r0. the threshold 1 - 1/r0 assumes random vaccination, with immunity distributed uniformly in the community. when immunity is obtained from disease exposure, the more socially active individuals in the population are over-represented in cases from the early stages of the epidemic. as a result, the virus preferentially infects individuals with higher numbers of contacts, thereby acting like a well-targeted vaccine. this reduces the herd immunity threshold. however, the extent to which heterogeneity in behaviour lowers the threshold for covid-19 is currently unknown. we highlight three key challenges for determining the herd immunity threshold for covid-19, and hence for understanding the impact of implementing or lifting control measures in different populations. first, most of the quantities for calculating the threshold are not known precisely and require careful investigation. for example, determining the immunity level in a community is far from trivial for a number of reasons: antibody tests may have variable sensitivity and specificity; it is currently unclear whether or not individuals with mild or no symptoms acquire immunity or test seropositive; the duration of immunity is unknown. second, estimation of r0, despite receiving significant attention at the start of the pandemic, still needs to be refined within and between countries as issues with early case reports come to light. third, as discussed in §3, sars-cov-2 does not spread uniformly through populations [50].
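the effect of activity heterogeneity on the herd immunity threshold can be illustrated with a toy two-group sir model. this is a sketch under assumed parameters (two equal-sized groups with relative contact activities 2 and 1, r0 = 2.5, proportionate mixing), not the model of [48, 49]: it tracks cumulative infections until r(t) first drops below one and compares that level with the classical threshold 1 - 1/r0.

```python
# Two-group SIR with proportionate mixing: high-activity individuals are
# infected first, acting like a targeted vaccine, so the disease-induced
# herd immunity level is below the classical 1 - 1/R0 threshold.
# Group sizes and activity levels are assumptions for illustration only.

def herd_immunity_thresholds(r0=2.5, activities=(2.0, 1.0), fractions=(0.5, 0.5),
                             gamma=1.0, dt=0.005, seed_frac=1e-5):
    a, n = activities, fractions
    m2 = sum(ai * ai * ni for ai, ni in zip(a, n))   # second moment of activity
    beta = r0 * gamma / m2           # calibrate beta so the NGM eigenvalue is r0
    S = [ni * (1 - seed_frac) for ni in n]
    I = [ni * seed_frac for ni in n]
    while True:
        # effective reproduction number for this separable mixing kernel
        r_eff = (beta / gamma) * sum(ai * ai * si for ai, si in zip(a, S))
        if r_eff <= 1.0:
            break
        force = beta * sum(ai * ii for ai, ii in zip(a, I))  # shared force-of-infection term
        for g in range(len(a)):
            new_inf = a[g] * force * S[g] * dt   # forward Euler step
            S[g] -= new_inf
            I[g] += new_inf - gamma * I[g] * dt
    disease_induced = 1.0 - sum(S)   # cumulative infected when R_eff first hits 1
    classical = 1.0 - 1.0 / r0       # threshold under uniform (random) immunity
    return disease_induced, classical

if __name__ == "__main__":
    d, c = herd_immunity_thresholds()
    print(f"disease-induced threshold: {d:.3f}, classical threshold: {c:.3f}")
```

with these assumed parameters the disease-induced level comes out around 0.53, below the classical 0.6, reproducing qualitatively the effect described in the text; the gap depends strongly on how heterogeneous the assumed activity levels are.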
an improved understanding of the main transmission routes, and which communities are most influential, will help to determine how much lower disease-induced herd immunity is compared to the classical threshold. to summarize, it is vital to obtain more accurate estimates of the current immunity levels in different countries and regions, and to understand how population heterogeneity affects transmission and the accumulation of immunity. quantitative information about current and past infections is a key input to formulate exit strategies, monitor the progression of epidemics and identify social and demographic sources of transmission heterogeneities. seroprevalence surveys provide a direct way to estimate the fraction of the population that has been exposed to the virus but has not been detected by regular surveillance mechanisms [51]. given the possibility of mild or asymptomatic infections, which are not typically included in laboratory-confirmed cases, seroprevalence surveys could be particularly useful for tracking the covid-19 pandemic [52]. contacts between pathogens and hosts that elicit an immune response can be revealed by the presence of antibodies. typically, a rising concentration of immunoglobulin m (igm) precedes an increase in the concentration of immunoglobulin g (igg). however, for infections by sars-cov-2, there is increasing evidence that igg and igm appear concurrently [53]. most serological assays used for understanding viral transmission measure igg. interpretation of a positive result depends on detailed knowledge of immune response dynamics and its epidemiological correspondence to the developmental stage of the pathogen, for example, the presence of virus shedding [54, 55]. serological surveys are common practice in infectious disease epidemiology and have been used to estimate the prevalence of carriers of antibodies, force of infection and reproduction numbers [56], and in certain circumstances (e.g.
for measles) to infer population immunity to a pathogen [57] . unfortunately, a single serological survey only provides information about the number of individuals who are seropositive at the time of the survey (as well as information about the individuals tested, such as their ages [58] ). although information about temporal changes in infections can be obtained by conducting multiple surveys longitudinally [47, 59] , the precise timings of infections remain unknown. available tests vary in sensitivity and specificity, which can impact the accuracy of model predictions if seropositivity is used to assess the proportion of individuals protected from infection or disease. propagation of uncertainty due to the sensitivity and specificity of the testing procedures and epidemiological interpretation of the immune response are areas that require attention. the possible presence of immunologically silent individuals, as implied by studies of covid-19 showing that 10-20% of symptomatically infected people have few or no detectable antibodies [60] , adds to the known sources of uncertainty. many compartmental modelling studies have used data on deaths as the main reliable dataset for model fitting. the extent to which seroprevalence data could provide an additional useful input for model calibration, and help in formulating exit strategies, has yet to be ascertained. with the caveats above, one-off or regular assessments of population seroprevalence could be helpful in understanding sars-cov-2 transmission in different locations. (d) is global eradication of sars-cov-2 a realistic possibility? when r 0 is greater than one, an emerging outbreak will either grow to infect a substantial proportion of the population or become extinct before it is able to do so [61] [62] [63] [64] [65] . if instead r 0 is less than one, the outbreak will almost certainly become extinct before a substantial proportion of the population is infected. 
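the take-off-or-extinction dichotomy described above is often formalized as a branching process, in which the extinction probability q solves q = g(q) for the offspring generating function g. a minimal sketch, assuming (illustratively) negative-binomial offspring with mean r0 and dispersion k, which also connects to the superspreading discussion in §2a: smaller k means more superspreading and a higher chance of early extinction.

```python
# Extinction probability of a simple branching-process outbreak model:
# q solves q = G(q), where G is the offspring pgf. With negative-binomial
# offspring (mean R0, dispersion k), G(s) = (1 + (R0/k)*(1 - s))**(-k).
# Parameter values below are illustrative assumptions, not estimates.

def extinction_probability(r0, k, tol=1e-12, max_iter=10000):
    """Fixed-point iteration q <- G(q) from q = 0 converges to the
    minimal fixed point, i.e. the extinction probability."""
    q = 0.0
    for _ in range(max_iter):
        q_next = (1.0 + (r0 / k) * (1.0 - q)) ** (-k)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q

if __name__ == "__main__":
    for k in (10.0, 1.0, 0.1):
        print(f"R0=2.5, k={k}: extinction probability "
              f"{extinction_probability(2.5, k):.3f}")
```

for k = 1 (geometric offspring) the fixed point can be checked by hand: with r0 = 2.5 it solves 2.5q^2 - 3.5q + 1 = 0, giving q = 0.4; for r0 below one the iteration converges to q = 1, matching the almost-certain extinction described above.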
if new susceptible individuals are introduced into the population (for example, new susceptible individuals are born), it is possible that the disease will persist after its first wave and become endemic [66] . these theoretical results can be extended to populations with household and network structure [67, 68] and scenarios in which r 0 is very close to one [69] . epidemiological theory and data from different diseases indicate that extinction can be a slow process, often involving a long 'tail' of cases with significant random fluctuations (electronic supplementary material, figure s1). long epidemic tails can be driven by spatial heterogeneities, such as differences in weather in different countries (potentially allowing an outbreak to persist by surviving in different locations at different times of year) and varying access to treatment in different locations. regions or countries that eradicate sars-cov-2 successfully might experience reimportations from elsewhere [70, 71] , for example, the reimportation of the virus to new zealand from the uk in june 2020. at the global scale, smallpox is the only previously endemic human disease to have been eradicated, and extinction took many decades of vaccination. the prevalence and incidence of polio and measles have been reduced substantially through vaccination but both diseases persist. the 2001 foot and mouth disease outbreak in the uk and the 2003 sars pandemic were new epidemics that were driven extinct without vaccination before they became endemic, but both exhibited long tails before eradication was achieved. the 2014-16 ebola epidemic in west africa was eliminated (with vaccination at the end of the epidemic [72] ), but eradication took some time with flare ups occurring in different countries [73, 74] . past experience, therefore, raises the possibility that sars-cov-2 may not be driven to complete extinction in the near future, even if a vaccine is developed and vaccination campaigns are implemented. 
as exemplified by the ebola outbreak in the democratic republic of the congo that has only recently been declared over [75], there is an additional challenge of assessing whether the virus really is extinct rather than persisting in individuals who do not report disease [73]. sars-cov-2 could become endemic, persisting in populations with limited access to healthcare or circulating in seasonal outbreaks. appropriate communication of these scenarios to the public and policymakers, particularly the possibility that sars-cov-2 may never be eradicated, is essential. (a) how much resolution is needed when modelling human heterogeneities? a common challenge faced by epidemic modellers is the tension between making models more complex (and possibly, therefore, seeming more realistic to stakeholders) and maintaining simplicity (for scientific parsimony when data are sparse and for expediency when predictions are required at short notice) [76]. how to strike the correct balance is not a settled question, especially given the increasing amount of available data on human demography and behaviour. indeed, outputs of multiple models with different levels of complexity can provide useful and complementary information. many sources of heterogeneity between individuals (and between populations) exist, including the strong skew of severe covid-19 outcomes towards the elderly and individuals from specific groups. we focus on two sources of heterogeneity in human populations that must be considered when modelling exit strategies: spatial contact structure and health vulnerabilities. there has been considerable success in modelling local contact structure, both in terms of spatial heterogeneity (distinguishing local and long-distance contacts) and in local mixing structures such as households and workplaces. however, challenges include tracking transmission and assessing changes when contact networks are altered.
in spatial models with only a small number of near-neighbour contacts, the number of new infections grows slowly; each generation of infected individuals is only slightly larger than the previous one. as a result, in those models, r(t) cannot significantly exceed its threshold value of one [77]. by contrast, models accounting for transmission within closely interacting groups explicitly contain a mechanism that has a multiplier effect on the value of r(t) [67]. another challenge is the spatio-temporal structure of human populations: the spatial distribution of individuals is important, but long-distance contacts make populations more connected than in simple percolation-type spatial models [77]. clustering and pair approximation models can capture some aspects of spatial heterogeneities [78], which can result in exponential rather than linear growth in case numbers [79]. while models can include almost any kind of spatial stratification, ensuring that model outputs are meaningful for exit strategy planning relies on calibration with data. this brings in challenges of merging multiple data types with different stratification levels. for example, case notification data may be aggregated at a regional level within a country, while mobility data from past surveys might be available at finer scales within regions. another challenge is to determine the appropriate scale at which to introduce or lift interventions. although measures are usually directed at whole populations within relevant administrative units (country-wide or smaller), more effective interventions and exit strategies may target specific parts of the population [80]. here, modelling can be helpful to account for operational costs and imperfect implementation that will offset expected epidemiological gains. the structure of host vulnerability to disease is generally reported via risk factors, including age, sex and ethnicity [81, 82]. from a modelling perspective, a number of open questions exist.
to what extent does heterogeneous vulnerability at an individual level affect the impact of exit strategies beyond the reporting of potential outcomes? where host vulnerability is an issue, is it necessary to account for considerations other than reported risk factors, as these may be proxies for underlying causes? once communicated to the public, modelling results could create behavioural feedback that might help or hinder exit strategies; some sensitivity analyses would be useful. as with the questions around spatial heterogeneity, modelling variations in host vulnerability could improve proposed exit strategies, and modelling can be used to explore how these are targeted and communicated [5] . finally, heterogeneities in space and vulnerabilities may interact; modelling these may reveal surprises that can be explored further. (b) what are the roles of networks and households in sars-cov-2 transmission? npis reduce the opportunity for transmission by breaking up contact networks (closing workplaces and schools, preventing large gatherings), reducing the chance of transmission where links cannot be broken (wearing masks, sneeze barriers) and identifying infected individuals (temperature checks [83] , diagnostic testing [84] ). network models [85, 86] aim to split pathogen transmission into opportunity (number of contacts) and transmission probability, using data that can be measured directly (through devices such as mobility tracking and contact diaries) and indirectly (through traffic flow and co-occurrence studies). this brings new issues: for example, are observed networks missing key transmission routes, such as indirect contact via contaminated surfaces, or including contacts that are low risk [87] ? how we measure and interpret contact networks depends on the geographical and social scales of interest (e.g. 
wider community spread or closed populations such as prisons and care homes; or subpopulations such as workplaces and schools) and the timescales over which the networks are used to understand or predict transmission. in reality, individuals belong to households, children attend schools and adults mix in workplaces as well as in social contexts. this has led to the development of household models [67, 88-91], multilayer networks [92], bipartite networks [93, 94] and networks that are geographically and socially embedded to reflect location and travel habits [95]. these tools can play a key role in understanding and monitoring transmission, and exploring scenarios, at the point of exiting a lockdown: in particular, they can inform whether or not, and how quickly, households or local networks merge to form larger and possibly denser contact networks in which local outbreaks can emerge. regional variations and socio-economic factors can also be explored. contact tracing, followed by isolation or treatment of infected contacts, is a well-established method of disease control. the structure of the contact network is important in determining whether or not contact tracing will be successful. for example, contact tracing in clustered networks is known to be most effective [96, 97], since an infected contact can be traced from multiple different sources. knowledge of the contact network enhances understanding of the correlation structure that emerges as a result of the epidemic. the first wave of an epidemic will typically infect many of the highly connected nodes and will move slowly to less connected parts of the network, leaving behind islands of susceptible and recovered individuals. this can lead to a correlated structure of susceptible and recovered nodes that may make the networks less vulnerable to later epidemic waves [98], and has implications for herd immunity (§2b).
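the observation that a first wave preferentially removes highly connected nodes can be reproduced with a toy simulation. the network construction (half the nodes with four times as many contact 'stubs'), the transmission probability and the seeding below are all illustrative assumptions, not empirical values.

```python
# Toy one-shot SIR on a random network with heterogeneous degrees: after the
# first wave, recovered nodes have a higher mean degree than the remaining
# susceptibles, i.e. the epidemic acted preferentially on well-connected nodes.

import random

def build_network(n, seed=1):
    """Configuration-model-style graph: half the nodes get 8 contact stubs,
    the other half get 2, so degrees are strongly heterogeneous."""
    rng = random.Random(seed)
    stubs = [node for node in range(n) for _ in range(8 if node < n // 2 else 2)]
    rng.shuffle(stubs)
    adj = {node: set() for node in range(n)}
    for u, v in zip(stubs[::2], stubs[1::2]):
        if u != v:                       # discard self-loops
            adj[u].add(v)
            adj[v].add(u)
    return adj

def run_sir(adj, p_transmit=0.3, n_seeds=10, seed=2):
    """Each infectious node gets a single chance to infect each neighbour."""
    rng = random.Random(seed)
    status = {node: "S" for node in adj}
    stack = list(range(n_seeds))
    for s in stack:
        status[s] = "I"
    while stack:
        node = stack.pop()
        for nbr in adj[node]:
            if status[nbr] == "S" and rng.random() < p_transmit:
                status[nbr] = "I"
                stack.append(nbr)
        status[node] = "R"
    return status

if __name__ == "__main__":
    adj = build_network(2000)
    status = run_sir(adj)
    recovered = [n for n in adj if status[n] == "R"]
    susceptible = [n for n in adj if status[n] == "S"]
    mean_deg = lambda ns: sum(len(adj[n]) for n in ns) / len(ns)
    print(f"recovered: {len(recovered)} nodes, mean degree {mean_deg(recovered):.2f}")
    print(f"susceptible: {len(susceptible)} nodes, mean degree {mean_deg(susceptible):.2f}")
```

the surviving susceptibles have a markedly lower mean degree, the 'islands of susceptible individuals' described above, which is one mechanism by which a population can become less vulnerable to later waves than a homogeneous model would predict.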
in heterogeneous populations, relatively few very well-connected people can be major hubs for transmission. such individuals are often referred to as super-spreaders [99,100] and some theoretical approaches to controlling epidemics are based on targeting them [101]. however, particularly for respiratory diseases, whether specific individuals can be classified as potential super-spreaders, or instead whether any infected individual has the potential to generate super-spreading events, is debated [38,102,103]. as control policies are gradually lifted, the disrupted contact network will start to form again. understanding how proxies for social networks (which can be measured in near real time using mobility data, electronic sensors or trackers) relate to transmission requires careful consideration. using observed contacts to predict virus spread might be successful if these quantities are strongly correlated, but one aim of npis should be at least a partial decoupling of the two, so that society can reopen but transmission remains controlled. currently, a key empirical and theoretical challenge is to understand how households are connected and how this is affected by school opening (§3c). an important area for further research is to improve our understanding of the role of within-household transmission in the covid-19 pandemic. in particular, do sustained infection chains within households lead to amplification of infection rates between households despite lockdowns aimed at minimizing between-household transmission? even for well-studied household models, development of methods accommodating time-varying parameters, such as variable adherence to household-based policies and/or compensatory behaviour, would be valuable.
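one way to make the super-spreading discussion concrete is the common gamma-poisson (negative binomial) offspring model, in which a dispersion parameter k controls individual heterogeneity. the sketch below uses illustrative values (R0 = 2.5, k = 0.1; not estimates from this text) to show how a small fraction of individuals can account for most transmission.

```python
import math
import random

random.seed(0)

def poisson(lam):
    # Knuth's algorithm; lam capped for numerical safety
    lam = min(lam, 500.0)
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

# negative-binomial offspring distribution via a gamma-Poisson mixture:
# individual infectiousness nu ~ Gamma(k, R0/k), offspring ~ Poisson(nu).
# R0 and the dispersion k are illustrative values, not estimates
R0, k, n = 2.5, 0.1, 20000
offspring = [poisson(random.gammavariate(k, R0 / k)) for _ in range(n)]

ranked = sorted(offspring, reverse=True)
total = sum(ranked)
top20_share = sum(ranked[: n // 5]) / total
print(f"mean offspring: {total / n:.2f}")
print(f"share of transmission from the top 20%: {top20_share:.0%}")
```

with low dispersion (small k), most individuals infect nobody while a few generate many secondary cases, which is why targeting, or failing to identify, the most connected or most infectious individuals matters so much for control.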
it would be useful to compare interventions and de-escalation procedures in different countries to gain insight into: regional variations in contact and transmission networks; the role of different household structures in transmission and the severity of outcomes (accounting for different household sizes and age structures); the cost-effectiveness of different policies, such as household-based isolation and quarantine in the uk compared to out-of-household quarantine in australia and hong kong. first few x (ffx) studies [104,105], now adopted in several countries, provide the opportunity not only to improve our understanding of critical epidemiological characteristics (such as incubation periods, generation intervals and the roles of asymptomatic and presymptomatic transmission) but also to make many of these comparisons. a widely implemented early intervention was school closure, which is frequently used during influenza pandemics [106,107]. further, playgrounds were closed and social distancing has kept children separated. however, the role of children in sars-cov-2 transmission is unclear. early signs from wuhan (china), echoed elsewhere, showed many fewer cases in under-20s than expected. there are three aspects of the role of children in transmission: (i) susceptibility; (ii) infectiousness once infected; and (iii) propensity to develop disease if infected [108,109]. evidence for age-dependent susceptibility and infectiousness is mixed, with infectiousness the more difficult to quantify. however, evidence is emerging of lower susceptibility to infection in children compared to adults [110], although the mechanism underlying this is unknown and it may not be generalizable to all settings. once infected, children appear to have a milder course of infection, and it has been suggested that children have a higher probability of a fully subclinical course of infection.
reopening schools is of clear importance both in ensuring equal access to education and enabling carers to return to work. however, the transmission risk within schools and the potential impact on community transmission needs to be understood so that policymakers can balance the potential benefits and harms. as schools begin to reopen, there are major knowledge gaps that prevent clear answers. the most pressing question is the extent to which school restarting will affect population-level transmission, characterized by r(t) ( §2a). clearer quantification of the role of children could have come from analysing the effects of school closures in different countries in february and march, but closures generally coincided with other interventions and so it has proved difficult to unpick the effects of individual measures [7] . almost all schools in sweden stayed open to under-16s (with the exception of one school that closed for two weeks [111] ), and schools in some other countries are beginning to reopen with social distancing measures in place, providing a potential opportunity to understand within-school transmission more clearly. models can also inform the design of studies to generate the data required to answer key questions. the effect of opening schools on r(t) also depends on other changes in the community. children, teachers and support staff are members of households; lifting restrictions may affect all members. modelling school reopening must account for all changes in contacts of household members [112] , noting that the impact on r(t) may depend on the other interventions in place at that time. the relative risk of restarting different school years (or universities) does not affect the population r(t) straightforwardly, since older children tend to live with adults who are older (compared to younger children), and households with older individuals are at greater risk of severe outcomes. 
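the dependence of population-level transmission on school contacts can be sketched with a two-group next-generation matrix, whose dominant eigenvalue plays the role of r. all contact numbers and the per-contact transmission scaling below are hypothetical; a realistic calculation would also weight groups by susceptibility, infectiousness and infectious period, which the text notes are uncertain for children.

```python
# two-group next-generation matrix sketch: children (c) and adults (a).
# K[i][j] = mean number of group-i infections caused by one infected
# individual in group j; all values are illustrative assumptions

def dominant_eigenvalue(K):
    # dominant eigenvalue of a 2x2 matrix via the quadratic formula
    (a, b), (c, d) = K
    tr, det = a + d, a * d - b * c
    disc = (tr * tr - 4 * det) ** 0.5
    return (tr + disc) / 2

q = 0.05  # per-contact transmission scaling (assumed)

# daily contacts [[child-child, child-adult], [adult-child, adult-adult]]
community = [[4, 3], [3, 10]]
school = [[12, 2], [2, 0]]  # contacts added when schools open

K_closed = [[q * community[i][j] for j in range(2)] for i in range(2)]
K_open = [[q * (community[i][j] + school[i][j]) for j in range(2)]
          for i in range(2)]

R_closed = dominant_eigenvalue(K_closed)
R_open = dominant_eigenvalue(K_open)
print(f"R with schools closed: {R_closed:.2f}")  # ~0.56 with these values
print(f"R with schools open:   {R_open:.2f}")    # ~0.94 with these values
```

the eigenvalue structure also captures the point made above about indirect effects: adding child-child contacts raises r even for adults, because the groups are coupled through the off-diagonal entries.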
thus, decisions about which age groups return to school first and how they are grouped at school must balance the risks of transmission between children, transmission to and between their teachers, and transmission to and within the households of the children and teachers. return to school affects the number of physical contacts of teachers and support staff. schools will not be the same environments as prior to lockdown, since physical distancing measures will be in place. these include smaller classes and changes in layout, plus increased hygiene measures. some children and teachers may be less likely to return to school because of underlying health conditions, and if there is transmission within schools, there may be absenteeism following infection. models must, therefore, consider the different effects on transmission of pre- and post-lockdown school environments. post-lockdown, with social distancing in place in the wider community, reopening schools could link subcommunities of the population together, and models can be used to estimate the wider effects on population transmission as well as within schools. these estimates are likely to play a central role in decisions surrounding when and how to reopen schools. (d) the pandemic is social: how can we model that? while the effects of population structure and heterogeneities can be approximated in standard compartmental epidemiological models [2,73,113], such models can become highly complex and cumbersome to specify and solve as more heterogeneities are introduced. an alternative approach is agent-based modelling. agent-based models (abm) allow complex systems such as societies to be represented, using virtual agents programmed to have behavioural and individual characteristics (age, sex, ethnicity, income, employment status, etc.) as well as the capacity to interact with other agents [114].
in addition, abm can include societal-level factors such as the influence of social media, regulations and laws, and community norms. in more sophisticated abm, agents can anticipate and react to scenarios, and learn by trial and error or by imitation. abm can represent systems in which there are feedbacks, tipping points, the emergence of higher-level properties from the actions of individual agents, adaptation and multiple scales of organization, all features of the covid-19 pandemic and societal reactions to it. while abm arise from a different tradition, they can incorporate the insights of compartmental models; for example, agents must transition through disease states (or compartments) such that the mean transition rates correspond to those in compartmental models. however, building an abm that represents a population on a national scale is a huge challenge and is unlikely to be accomplished in a timescale useful for the current pandemic. abm often include many parameters, leading to challenges of model parametrization and a requirement for careful uncertainty quantification and sensitivity analyses to different inputs. on the other hand, useful abm do not have to be all-encompassing. there are already several models that illustrate the effects of policies such as social distancing on small simulated populations. these models can be very helpful as 'thought experiments' to identify the potential effects of candidate policies such as school re-opening and restrictions on long-distance travel, as well as the consequences of non-compliance with government edicts. there are two areas where long-term action should be taken. first, more data about people's ordinary behaviour are required: what individuals do each day (through time-use diaries), whom they meet (possibly through mobile phone data, if consent can be obtained) and how they understand and act on government regulation, social media influences and broadcast information [115].
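a 'thought experiment' abm of the kind described can be very small. the toy model below gives agents an age class and a compliance level, transitions them through SEIR-like disease states, and toggles a single policy lever (school opening). every rate in it is an assumption chosen for illustration, not a calibrated value.

```python
import random

random.seed(42)

# toy agent-based model: agents carry individual characteristics
# (age class, compliance) and interact through daily random contacts.
# all parameters are illustrative assumptions
N, DAYS, P_TRANSMIT = 500, 120, 0.05

class Agent:
    def __init__(self):
        self.age = random.choice(["child", "adult", "elder"])
        self.compliance = random.random()  # 0 = ignores guidance entirely
        self.state = "S"                   # S, E, I or R
        self.timer = 0

def run(school_open):
    agents = [Agent() for _ in range(N)]
    agents[0].state, agents[0].timer = "I", 5  # index case
    for _ in range(DAYS):
        for a in [x for x in agents if x.state == "I"]:
            contacts = 3                                  # household/community
            if school_open and a.age == "child":
                contacts += 8                             # classroom contacts
            contacts = max(1, int(contacts * (1 - 0.5 * a.compliance)))
            for other in random.sample(agents, contacts):
                if other.state == "S" and random.random() < P_TRANSMIT:
                    other.state, other.timer = "E", 3     # latent period
        for a in agents:
            if a.state in ("E", "I"):
                a.timer -= 1
                if a.timer <= 0:
                    a.state, a.timer = ("I", 5) if a.state == "E" else ("R", 0)
    return sum(a.state != "S" for a in agents) / N  # attack rate

rate_open = run(True)
rate_closed = run(False)
print(f"attack rate, schools open:   {rate_open:.2f}")
print(f"attack rate, schools closed: {rate_closed:.2f}")
```

even a model this small exposes the issues raised in the text: the outcome depends on many loosely constrained parameters, so any serious use would require uncertainty quantification and sensitivity analysis over inputs such as P_TRANSMIT and the contact numbers.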
second, a large, modular abm should be built that represents heterogeneities in populations and that is properly calibrated as a social 'digital twin' of our own society, with which we can carry out virtual policy experiments. had these developments been in place before the pandemic, they would already have been useful; addressing them now will aid the planning of future exit strategies. (a) what are the additional challenges of data-limited settings? in most countries, criteria for ending covid-19 lockdowns rely on tracking trends in numbers of confirmed cases and deaths, and assessments of transmissibility (§2a). this section focuses on the relaxation of interventions in lmics, although many issues apply everywhere. perhaps surprisingly, concerns relating to data availability and reliability (e.g. lack of clarity about sampling frames) remain worldwide. other difficulties have also been experienced in many countries throughout the pandemic (e.g. shortages of vital supplies, perhaps due in developed countries to previous emphasis on healthcare system efficiency rather than pandemic preparedness [116]). data about the covid-19 pandemic and about the general population and context can be unreliable or lacking globally. however, due to limited healthcare access and utilization, there can be fewer opportunities for diagnosis and subsequent confirmation of cases in lmics compared to other settings, unless there are active programmes [117]. distrust can make monitoring programmes difficult, and complicate control activities like test-trace-isolate campaigns [118,119]. other options for monitoring, such as assessing excess disease from general reporting of acute respiratory infections or influenza-like illness, require historical baselines that may not exist [120,121].
in general, while many lmics will have a well-served fraction of the population, dense peri-urban and informal settlements are typically outside that population and may rapidly become a primary concern for transmission [122] . since confirmed case numbers in these populations are unlikely to provide an accurate representation of the underlying epidemic, reliance on alternative data such as clinically diagnosed cases may be necessary to understand the epidemic trajectory. some tools for rapid assessment of mortality in countries where the numbers of covid-19-related deaths are hard to track are starting to become available [123] . in settings where additional data collection is not affordable, models may provide a clearer picture by incorporating available metadata, such as testing and reporting rates through time, sample backlogs and suspected covid-19 cases based on syndromic surveillance. by identifying the most informative data, modelling could encourage countries to share available data more widely. for example, burial reports and death certificates may be available, and these data can provide information on the demographics that influence the infection fatality rate. these can in turn reveal potential covid-19 deaths classified as other causes and hence missing from covid-19 attributed death notifications. in addition to the challenges in understanding the pandemic in these settings, metrics on health system capacity (including resources such as beds and ventilators), as needed to set targets for control, are often poorly documented [124] . furthermore, the economic hardships and competing health priorities in low-resource settings change the objectives of lifting restrictions-for example, hunger due to loss of jobs and changes in access to routine healthcare (e.g. hiv services and childhood vaccinations) as a result of lockdown have the potential to cost many lives in themselves, both in the short and long term [125, 126] . 
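as a minimal example of using testing metadata rather than raw counts, the sketch below corrects a confirmed-case curve for ramping test capacity by working with test positivity. the series are invented, and in practice ascertainment is not simply proportional to tests performed, so this is a crude trend correction, not a proper inference method.

```python
# sketch: adjusting a confirmed-case curve for changes in testing effort.
# all numbers below are made up for illustration
cases = [10, 14, 20, 25, 24, 22, 21]          # confirmed cases per day
tests = [100, 150, 250, 400, 420, 430, 440]   # tests performed per day

# positivity (cases per test) is less sensitive to ramping test
# capacity than raw counts
positivity = [c / t for c, t in zip(cases, tests)]

# rescale so the corrected series matches raw cases on day 0
scale = cases[0] / positivity[0]
adjusted = [p * scale for p in positivity]

for day, (raw, adj) in enumerate(zip(cases, adjusted)):
    print(f"day {day}: raw={raw:3d}  test-adjusted={adj:6.1f}")
```

with these invented numbers the raw counts rise while the test-adjusted series falls, illustrating how an apparent epidemic trajectory can be an artefact of expanding test capacity, which is exactly the kind of bias that testing metadata can help models disentangle.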
this must be accounted for when deciding how to relax covid-19 interventions. we have identified three key challenges for epidemic modellers to help guide exit strategies in data-limited settings: (i) explore policy responses that are robust to missing information; (ii) conduct value-of-information analyses to prioritize additional data collection; and (iii) develop methods that use metadata to interpret epidemiological patterns. in general, supporting lmics calls for creativity in the data that are used to parametrize models and in the response activities that are undertaken. some lmics have managed the covid-19 pandemic successfully so far (e.g. vietnam, as well as trinidad and tobago [127]). however, additional support in lmics is required and warrants special attention. if interventions are relaxed too soon, fragile healthcare systems may be overwhelmed. if instead they are relaxed too late, socio-economic consequences can be particularly severe. (b) which data should be collected as countries emerge from lockdown, and why? identifying the effects of the different components of lockdown is important to understand how, and in which order, interventions should be released. the impact of previous measures must be understood both to inform policy in real time and to ensure that lessons can be learnt. all models require information to make their predictions relevant. data from pcr tests for the presence of active virus and serological tests for antibodies, together with data on covid-19-related deaths, are freely available via a number of internet sites (e.g. [128]). however, metadata associated with testing protocols (e.g. reason for testing, type of test, breakdowns by age and underlying health conditions) and the definition of covid-19-related death, which are needed to quantify sources of potential bias and parametrize models correctly, are often unavailable.
data from individuals likely to have been exposed to the virus (e.g. within households of known infected individuals), but who may or may not have contracted it themselves, are also useful for model parametrization [129]. new sources of data range from tracking data from mobile phones [130] to social media surveys [131] and details of interactions with public health providers [132]. although potentially valuable, these data sources bring with them biases that are not always understood. these types of data are also often subject to data protection and/or costly fees, meaning that they are not readily available to all scientists. mixing patterns by age were reasonably well-characterized before the current pandemic [133,134] (particularly for adults of different ages) and have been used extensively in existing models. however, there are gaps in these data and uncertainty in the impacts that different interventions have had on mixing. predictive models for policy tend to make broad assumptions about the effects of elements of social distancing [135], although results of studies that attempt to estimate effects in a more data-driven way are beginning to emerge [136]. the future success of modelling to understand when controls should be relaxed or tightened depends critically on whether, and how accurately as well as how quickly, the effects of different elements of lockdown can be parametrized. given the many differences in lockdown implementation between countries, cross-country comparisons offer an opportunity to estimate the effects on transmission of each component of lockdown [7]. however, there are many challenges in comparing sars-cov-2 dynamics in different countries. alongside variability in the timing, type and impact of interventions, the numbers of importations from elsewhere will vary [70,137].
underlying differences in mixing, behavioural changes in response to the pandemic, household structures, occupations and distributions of ages and comorbidities are likely to be important but uncertain drivers of transmission patterns. a current research target is to understand the role of weather and climate in sars-cov-2 transmission and severity [138]. many analyses across and within countries highlight potential correlations between environmental variables and transmission [139-144], although sometimes by applying ecological niche modelling frameworks that may be ill-suited for modelling a rapidly spreading pathogen [145-147]. assessments of the interactions between weather and viral transmissibility are facilitated by the availability of extensive datasets describing weather patterns, such as the european centre for medium-range weather forecasts era5 dataset [148] and simulations of the community earth system model that can be used to estimate the past, present and future values of meteorological variables worldwide [149]. temperature, humidity and precipitation are likely to affect the survival of sars-cov-2 outside the body, and prevailing weather conditions could, in theory, tip r(t) above or below one. however, the effects of these factors on transmission have not been established conclusively, and the impact of seasonality on short- or long-term sars-cov-2 dynamics is likely to depend on other factors including the timing and impact of interventions, and the dynamics of immunity [47,150]. it is hard to separate the effect of the weather on virus survival from other factors including behavioural changes in different seasons [151]. the challenge of disentangling the impact of variations in weather on transmission from other epidemiological drivers in different locations is, therefore, a complex open problem.
in seeking to understand and compare covid-19 data from different countries, there is a need to coordinate the design of epidemiological studies, involving longitudinal data collection and case-control studies. this will help enable models to track the progress of the epidemic and the impacts of control policies internationally. it will also allow more refined conclusions than those that follow from population data alone. countries with substantial epidemiological modelling expertise should support epidemiologists elsewhere with standardized protocols for collecting data and using models to inform policy. there is a need to share models to be used 'in the field'. collectively, these efforts will ensure that models are parametrized as realistically as possible for particular settings. in turn, as interventions are relaxed, this will allow us to detect the earliest possible reliable signatures of a resurgence in cases, leading to an unambiguous characterization of when it is necessary for interventions to be reintroduced. (c) how should model and parameter uncertainty be communicated? sars-cov-2 transmission models have played a crucial role in shaping policies in different countries, and their predictions have been a regular feature of media coverage of the pandemic [135,152]. understandably, both policymakers and journalists generally prefer single 'best guess' figures from models, rather than a range of plausible values. however, the ranges of outputs that modellers provide include important information about the variety of possible scenarios and guard against over-interpretation of model results. not displaying information about uncertainty can convey a false confidence in predictions. it is critical that modellers present uncertainty in a way that is understandable and useful for policymakers and the public [76].
there are numerous and often inextricable ways in which uncertainty enters the modelling process. model assumptions inevitably vary according to judgements regarding which features are included [1, 95] and which datasets are used to inform the model [153] . within any model, ranges of parameter values can be considered to allow for uncertainty about clinical characteristics of covid-19 (e.g. the infectious period and case fatality rate) [154] . alternative initial conditions (e.g. numbers and locations of imported cases seeding national outbreaks, or levels of population susceptibility) can be considered. in modelling exit strategies, when surges in cases starting from small numbers may occur and where predictions will depend on characterizing epidemiological parameters as accurately as possible, stochastic models may be of particular importance. not all the uncertainty arising from such stochasticity will be reduced by collecting more data; it is inherent to the process. where models have been developed for similar purposes, formal methods of comparison can be applied, but in epidemiological modelling, models often have been developed to address different questions, possibly involving 'what-if?' scenarios, in which case only qualitative comparisons can be made. the ideal outcome is when different models generate similar conclusions, demonstrating robustness to the detailed assumptions. where there is a narrowly defined requirement, such as short-term predictions of cases and deaths, more tractable tools for comparing the outputs from different models in real time would be valuable. one possible approach is to assess the models' past predictive performance [33, 155] . ensemble estimates, most commonly applied for forecasting disease trajectories, allow multiple models' predictions to be combined [156, 157] . the assessment of past performance can then be used to weight models in the ensemble. such approaches typically lead to improved point and variance estimates. 
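the ensemble idea, weighting models by their past predictive performance, can be sketched as follows. the two forecasters and the synthetic case series are invented for illustration, and inverse-mean-squared-error weighting is just one simple choice of scheme.

```python
# sketch of ensemble forecasting: two toy forecasters are scored on
# past one-step-ahead predictions, then combined with inverse-MSE
# weights. the case series is synthetic
cases = [100, 120, 150, 180, 230, 280, 350]

def persistence(history):
    return history[-1]                     # "tomorrow looks like today"

def growth(history):
    return history[-1] ** 2 / history[-2]  # constant-growth extrapolation

models = [persistence, growth]

# score each model's past predictions against what actually happened
errors = [0.0 for _ in models]
for t in range(2, len(cases)):
    for m, model in enumerate(models):
        errors[m] += (model(cases[:t]) - cases[t]) ** 2

# inverse-MSE weights, normalized to sum to one
inv = [1 / e for e in errors]
weights = [v / sum(inv) for v in inv]

forecasts = [model(cases) for model in models]
ensemble = sum(w * f for w, f in zip(weights, forecasts))
print(f"weights: {[round(w, 3) for w in weights]}")
print(f"individual forecasts: {forecasts}, ensemble: {ensemble:.1f}")
```

because the ensemble is a convex combination of the individual forecasts, its squared error at any point is at most the weighted average of the individual squared errors, which is one reason such combinations typically improve point and variance estimates.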
to deal with parameter uncertainty, a common approach is to perform sensitivity analyses in which model parameters are repeatedly sampled from a range of plausible values, and the resulting model predictions compared; both classical and bayesian statistical approaches can be employed [158-160]. methods of uncertainty quantification provide a framework in which uncertainties in model structure, epidemiological parameters and data can be considered together. in practice, there is usually only a limited number of policies that can be implemented. an important question is often whether or not the optimal policy can be identified given the uncertainties we have described, and decision analyses can be helpful for this [161,162]. in summary, communication of uncertainty to policymakers and the general public is challenging. different levels of detail may be required for different audiences. there are many subtleties: for instance, almost any epidemic model can provide an acceptable fit to data in the early phase of an outbreak, since most models predict exponential growth. this can induce an artificial belief that the model must be based on sensible underlying assumptions, and that the true uncertainty about such assumptions has vanished. clear presentation of data is critical. it is important not simply to present data on the numbers of cases, but also on the numbers of individuals who have been tested. clear statements of the individual values used to calculate quantities such as the case fatality rate are vital, so that studies can be interpreted and compared correctly [163,164]. going forwards, improved communication of uncertainty is essential as models are used to predict the effects of different exit strategies. we have highlighted ongoing challenges in modelling the covid-19 pandemic, and uncertainties faced in devising lockdown exit strategies.
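a basic sensitivity analysis of the kind described can be sketched by sampling epidemiological parameters from plausible ranges and propagating them through a simple deterministic sir model, then reporting the spread of the outputs rather than a single 'best guess'. the parameter ranges below are illustrative, not estimates.

```python
import random

random.seed(7)

def sir_peak(R0, infectious_days, days=365, dt=0.1):
    # deterministic SIR, forward-Euler; returns peak prevalence and its day
    gamma = 1.0 / infectious_days
    beta = R0 * gamma
    s, i = 0.999, 0.001  # fractions of the population
    peak, peak_day = i, 0.0
    for step in range(int(days / dt)):
        new_inf = beta * s * i * dt
        s -= new_inf
        i += new_inf - gamma * i * dt
        if i > peak:
            peak, peak_day = i, step * dt
    return peak, peak_day

# sample R0 and the infectious period from plausible (illustrative) ranges
samples = [(random.uniform(1.5, 3.5), random.uniform(3.0, 9.0))
           for _ in range(200)]
results = [sir_peak(R0, d) for R0, d in samples]
peaks = [p for p, _ in results]
peak_days = [t for _, t in results]
print(f"peak prevalence: {min(peaks):.3f} to {max(peaks):.3f}")
print(f"day of peak:     {min(peak_days):.0f} to {max(peak_days):.0f}")
```

reporting the full range of peak sizes and peak timings, rather than a single trajectory, is exactly the kind of output that guards against over-interpretation: the spread shows which conclusions are robust across plausible parameter values and which are not.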
it is important, however, to put these issues into context: at the start of 2020, sars-cov-2 was unknown, and its pandemic potential only became apparent at the end of january. the speed with which the scientific and public health communities came together and the openness in sharing data, methods and analyses are unprecedented. at very short notice, epidemic modellers mobilized a substantial workforce (mostly on a voluntary basis) and state-of-the-art computational models. far from the rough-and-ready tools sometimes depicted in the media, the modelling effort deployed since january is a collective and multi-pronged effort benefitting from years of experience of epidemic modelling, combined with long-term engagement with public health agencies and policymakers. drawing on this collective expertise, the virtual workshop convened in mid-may by the isaac newton institute generated a clear overview of the steps needed to improve and validate the scientific advice to guide lockdown exit strategies. importantly, the roadmap outlined in this paper is meant to be feasible within the lifetime of the pandemic. infectious disease epidemiology does not have the luxury of waiting for all data to become available before models must be developed. as discussed here, the solution lies in using diverse and flexible modelling frameworks that can be revised and improved iteratively as more data become available. equally important is the ability to assess the data critically and bring together evidence from multiple fields: numbers of cases and deaths reported by regional or national authorities only represent a single source of data, and expert knowledge is required even to interpret these data correctly. in this spirit, our first recommendation is to improve estimates of key epidemiological parameters.
this requires close collaboration between modellers and the individuals and organizations that collect epidemic data, so that the caveats and assumptions on each side are clearly presented and understood. that is a key message from the first section of this study, in which the relevance of theoretical concepts and model parameters in the real world was demonstrated: far from ignoring the complexity of the pandemic, models draw from different sources of expertise to make sense of imperfect observations. by acknowledging the simplifying assumptions of models, we can assess the relative impacts of those assumptions and validate or replace them as new evidence becomes available. our second recommendation is to seek to understand important sources of heterogeneity that appear to be driving the pandemic and its response to interventions. agent-based modelling represents one possible framework for modelling complex dynamics, but standard epidemic models can also be extended to include age groups or any other relevant strata in the population as well as spatial structure. network models provide computationally efficient approaches to capture different types of epidemiological and social interactions. importantly, many modelling frameworks provide avenues for collaboration with other fields, such as the social sciences. our third and final recommendation regards the need to focus on data requirements, particularly (although not exclusively) in resource-limited settings such as lmics. understanding the data required for accurate predictions in different countries requires close communication between modellers and governments, public health authorities and the general public. while this pandemic casts a light on social inequalities between and within countries, modellers have a crucial role to play in sharing knowledge and expertise with those who need it most.
during the pandemic so far, countries that might be considered similar in many respects have often differed in their policies; either in the choice or the timing of restrictions imposed on their respective populations. models are important for drawing reliable inferences from global comparisons of the relative impacts of different interventions. all too often, national death tolls have been used for political purposes in the media, attributing the apparent success or failure of particular countries to specific policies without presenting any convincing evidence. modellers must work closely with policymakers, journalists and social scientists to improve the communication of rapidly changing scientific knowledge while conveying the multiple sources of uncertainty in a meaningful way. we are now moving into a stage of the covid-19 pandemic in which data collection and novel research to inform the modelling issues discussed here are both possible and essential for global health. these are international challenges that require an international collaborative response from diverse scientific communities, which we hope that this article will stimulate. this is of critical importance, not only to tackle this pandemic but also to improve the response to future epidemics of emerging infectious diseases. data accessibility. data sharing is not applicable to this manuscript as no new data were created or analysed in this study. the effect of control strategies to reduce social mixing on outcomes of the covid-19 epidemic in wuhan, china: a modelling study epidemiological models are important tools for guiding covid-19 interventions 2020 first-wave covid-19 transmissibility and severity in china outside hubei after control measures, and secondwave scenario planning: a modelling impact assessment 2020 how and when to end the covid-19 lockdown: an optimisation approach. front. 
Acknowledgements. Thanks to the Isaac Newton Institute for Mathematical Sciences, Cambridge (www.newton.ac.uk), for support during the virtual 'Infectious Dynamics of Pandemics' programme. This work was undertaken in part as a contribution to the 'Rapid Assistance in Modelling the Pandemic' initiative coordinated by the Royal Society. Thanks to Sam Abbott for helpful comments about the manuscript.
key: cord-335418-s8ugu8e1 authors: annan, james d; hargreaves, julia c title: model calibration, nowcasting, and operational prediction of the covid-19 pandemic date: 2020-04-17 journal: nan doi: 10.1101/2020.04.14.20065227 sha: doc_id: 335418 cord_uid: s8ugu8e1

We present a simple operational nowcasting/forecasting scheme based on a joint state/parameter estimate of the COVID-19 epidemic at national or regional scale, performed by assimilating the time series of reported daily death numbers into a simple SEIR model. This system generates estimates of the current reproductive rate, Rt, together with predictions of future daily deaths, and clearly outperforms a number of alternative forecasting systems that have been presented recently. Our current (14th April 2020) estimates for Rt are, respectively: UK 0.49 (0.0–1.02), Spain 0.55 (0.33–0.77), Italy 0.90 (0.74–1.06) and France 0.67 (0.40–0.94) (mean and 95% credible intervals). Thus, we believe that the epidemics have been successfully suppressed in each of these countries, with high probability. Our approach is trivial to set up for any region and generates results in a few minutes on a laptop. We believe it would be straightforward to set up equivalent frameworks using more complex and realistic models, and we hope that some experts in the field of epidemiological modelling will consider investigating this approach further.

Bayes' theorem describes how a prior estimate can be updated in the light of observational evidence:

p(θ|o) ∝ p(o|θ) p(θ)

In the above, p(θ|o) is the posterior estimate of the state θ conditioned on the evidence o, p(θ) is our prior belief before accounting for the evidence, and p(o|θ) is the likelihood function that measures the probability of obtaining the particular observations, as a function of the state variables/parameters θ.
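As a toy numerical illustration of this update (entirely ours, not from the paper: the synthetic death series, the Poisson likelihood and the Gaussian prior are all assumptions), a grid approximation of the posterior over a single exponential growth rate r, given daily death counts:

```python
import math

# synthetic daily deaths, roughly doubling every 3 days (true r ~ 0.23/day)
obs = [10, 13, 16, 20, 25, 32, 40]

def log_posterior(r):
    # assumed Gaussian prior N(0.2, 0.1^2) on the growth rate r
    lp = -0.5 * ((r - 0.2) / 0.1) ** 2
    # Poisson likelihood with expected deaths obs[0] * exp(r * t)
    for t, d in enumerate(obs):
        mu = obs[0] * math.exp(r * t)
        lp += d * math.log(mu) - mu - math.lgamma(d + 1)
    return lp

# evaluate p(theta|o) on a grid of candidate rates and normalise
grid = [i / 1000 for i in range(501)]          # r in [0, 0.5]
logs = [log_posterior(r) for r in grid]
m = max(logs)                                  # subtract max for stability
weights = [math.exp(l - m) for l in logs]
z = sum(weights)
posterior = [w / z for w in weights]
r_mean = sum(r * p for r, p in zip(grid, posterior))
```

With these synthetic data the posterior mean lands close to the true growth rate, illustrating how the likelihood dominates a moderately informative prior once a week of data is available.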
Having obtained p(θ|o) at the forecast time (i.e., the present, for a real-time forecast), we are in a position to present any parameters of interest contained within θ and also to generate a forecast for variables of interest. In this work we focus on the current reproductive rate of the epidemic, Rt, as the main parameter of interest, and also on the reported number of daily deaths, both as the most reliable source of data (i.e., our observations o in the application of Bayes' theorem above) and as the primary forecast variable of interest to the public and policy makers. However, the model we are using also naturally calculates other variables that could be of great interest, such as case numbers, from which healthcare demands could be estimated. A range of techniques have been developed for applying Bayesian theory, trading off flexibility, computational cost, precision, and ease of use. The Markov chain Monte Carlo (MCMC) method (Metropolis et al., 1953; Hammersley and Handscomb, 1964; Hastings, 1970) is particularly convenient in terms of flexibility and ease of implementation, and although it requires O(10000) model simulations, this is easily affordable for the simple model that we are using. The approach we present here is based on fitting a simple homogeneous SEIR model to data. While the MCMC method (using the package 'MCMCpack' in the R language) is easy to set up and use, it seems likely that more efficient methods are possible. In particular, it should be possible to exploit the log-linear behaviour of the dynamics to use much more efficient approaches such as Kalman filtering, and thus we expect the principles of model calibration and initialisation demonstrated here could be readily applied to more computationally demanding models.

This preprint is made available under a CC-BY 4.0 International license; the copyright holder is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity (it was not peer-reviewed).
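The MCMC step itself can be as simple as a random-walk Metropolis sampler. The sketch below (in Python rather than the paper's R/MCMCpack setup, and purely illustrative) samples from any user-supplied log-posterior:

```python
import math
import random

def metropolis(log_post, x0, n_steps=20000, step=1.0, seed=1):
    """Random-walk Metropolis: propose x' ~ N(x, step^2) and accept
    with probability min(1, p(x'|o)/p(x|o)), computed in log space."""
    rng = random.Random(seed)
    x, lp = x0, log_post(x0)
    chain = []
    for _ in range(n_steps):
        prop = x + rng.gauss(0.0, step)
        lp_prop = log_post(prop)
        if math.log(rng.random()) < lp_prop - lp:
            x, lp = prop, lp_prop       # accept the proposal
        chain.append(x)                  # (rejections repeat the old state)
    return chain

# usage: sample a standard normal target and check its moments
chain = metropolis(lambda x: -0.5 * x * x, 0.0)
tail = chain[5000:]                      # discard burn-in
mean = sum(tail) / len(tail)
var = sum((v - mean) ** 2 for v in tail) / len(tail)
```

For a handful of epidemic parameters, O(10000) such steps run in seconds even when each step integrates the SEIR model, which is consistent with the "few minutes on a laptop" claim.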
2 Methods

The model we are using is a 6-box SEIR model (https://en.wikipedia.org/wiki/compartmental_models_in_epidemiology#the_seir_model) originating from the webpages of Thomas House, Reader in the School of Mathematics at the University of Manchester, who specialises in mathematical epidemiology. Rather than describing the equations, we simply present our implementation in the R language:

#the basic dynamics is a 6-box version of SEIR based on this post from Thomas House:
#https://personalpages.manchester.ac.uk/staff/thomas.house/blog/modelling-herd-immunity.html
#there are two E and I boxes, the reasons for which we
#could speculate on but it's not our model so we won't :-)

The model is initialised with a state vector for the 6 boxes, and also requires three rate parameters, β, σ and γ, which are simple functions of the three parameters more commonly discussed in the context of epidemiology: the reproductive rate R, the latent period Lp and the infectious period Ip. In order to calculate daily deaths from the infected population, we use the gamma distribution for time to death described by Ferguson et al. (2020), except with the minor modification that we marginally reduce the mean time from 18.8 to 15 days, having found that this shorter time scale simulates the Hubei epidemic slightly better.
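Since only the comments of the R implementation survive in this extract, here is a minimal Python sketch of a 6-box SEIR model of the kind described. The exact equations, and the mapping β = R/Ip, σ = 1/Lp, γ = 1/Ip, are assumptions rather than the paper's code; splitting E and I into two sub-boxes each gives gamma-distributed (rather than exponential) residence times with the same means:

```python
def seir6_step(state, beta, sigma, gamma, dt):
    """One explicit-Euler step of a 6-box SEIR model with compartments
    (S, E1, E2, I1, I2, R); each E and I sub-stage empties at twice the
    nominal rate, so the mean latent/infectious periods are preserved."""
    S, E1, E2, I1, I2, R = state
    N = sum(state)
    new_inf = beta * S * (I1 + I2) / N      # new infections per unit time
    dS  = -new_inf
    dE1 = new_inf - 2 * sigma * E1
    dE2 = 2 * sigma * (E1 - E2)
    dI1 = 2 * sigma * E2 - 2 * gamma * I1
    dI2 = 2 * gamma * (I1 - I2)
    dR  = 2 * gamma * I2
    return [x + dt * d for x, d in zip(state, (dS, dE1, dE2, dI1, dI2, dR))]

def run_epidemic(R0, Lp, Ip, days, N=1e6, seed_inf=10.0, dt=0.1):
    # rate parameters from the epidemiological ones (assumed mapping)
    beta, sigma, gamma = R0 / Ip, 1.0 / Lp, 1.0 / Ip
    state = [N - seed_inf, 0.0, 0.0, seed_inf, 0.0, 0.0]
    for _ in range(int(days / dt)):
        state = seir6_step(state, beta, sigma, gamma, dt)
    return state

final = run_epidemic(R0=3.0, Lp=4.0, Ip=2.0, days=365)
```

The step function conserves the total population by construction, and with R0 = 3 the epidemic burns through most of the susceptible pool, consistent with the classical final-size relation.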
The fundamental scenario common to all of our simulations is that the model parameters are assumed constant in time, apart from the reproductive rate, which is piecewise constant, having a value R0 prior to the introduction of social controls and Rt subsequent to that date. For the initial reproductive rate R0 we use a Gaussian prior of N(3, 1²) to include a wide range of plausible values, and for Rt we use N(1, 0.5²) in order to represent the expectation that the controls will have a substantial effect, albeit we are uncertain whether they achieve suppression (Rt < 1) or merely mitigation (R0 > Rt > 1). Using a prior centred on unity means that one can easily discern from the posterior predictive distribution whether the observed data are more consistent with suppression or mitigation, even before the effect of the policies is known with high confidence. An alternative approach might be to use an uncertain scaling factor α such that Rt = αR0, but the interpretation of results would be marginally less clear and the prior choice on α slightly less natural for us; we expect that the optimisation technique would, however, perform similarly well. The starting size of the epidemic is also taken to be uncertain, with a very broad prior that ensures the specification of the start date is not critical so long as a reasonable choice is made. In practice this parameter tends to vary inversely with R0 in the posterior, in order to achieve a reasonable timing during the early stages of the epidemic. We have also allowed some uncertainty in the latent period, although this is probably not necessary, and it is unsurprising to find that in the posterior it varies with R0.
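In code, the scenario above reduces to a piecewise-constant reproductive rate with Gaussian priors on its two values; a minimal sketch (function and variable names are ours, not the paper's):

```python
import random

def sample_prior(rng):
    """Draw (R0, Rt) from the priors stated in the text:
    R0 ~ N(3, 1^2) before controls, Rt ~ N(1, 0.5^2) after."""
    return rng.gauss(3.0, 1.0), rng.gauss(1.0, 0.5)

def reproductive_rate(t, t_controls, R0, Rt):
    # piecewise-constant R: R0 before the controls date, Rt afterwards
    return R0 if t < t_controls else Rt

# sanity-check the prior means by Monte Carlo
rng = random.Random(0)
draws = [sample_prior(rng) for _ in range(20000)]
mean_R0 = sum(d[0] for d in draws) / len(draws)
mean_Rt = sum(d[1] for d in draws) / len(draws)
```

A prior on Rt centred on 1 is the coded expression of "we don't yet know whether controls suppress or merely mitigate": the data, not the prior, decide which side of unity the posterior falls on.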
In the setup of the system we have also included the infectious period and mortality rate as uncertainties within the MCMC procedure, but in practice we have chosen to constrain these with tight priors, as they are not identifiable with the other parameters, and including them has not been necessary in our applications, which focus primarily on numbers of deaths.

The modelling of uncertainties leading to model-data discrepancy is fundamental to the specification and calculation of the likelihood in the Bayesian updating. We consider three major sources of model-data discrepancy. Note that these uncertainties are attached to the model outputs (predictions), rather than the observed values per se, in order to perform the likelihood calculation correctly. We also remark at this point that the model-data comparison is performed in log space due to the largely log-linear dynamics, and so all of these uncertainties are treated as multiplicative terms on the values (additive errors in log space).

Firstly, and most straightforwardly, sampling uncertainty arises due to the stochastic nature of deaths occurring in a specific interval such as a day. We assume that if the expected number of deaths is n, then the relative sampling uncertainty on this value is approximately √n/n, arising as the standard deviation of the central limit theorem approximation to the binomial distribution where the probability of "success" (a tasteless term in the circumstances) is some unknown p < 0.1 and the number of "trials", i.e. the population of dangerously ill patients who could die in the next day, is n/p.
In practice we round this term up slightly to (1 + √n)/n, which further downweights the very sparse and noisy data that are seen for small numbers of deaths at the start of an outbreak, without significantly affecting larger values.

Data reporting errors have been much discussed in the UK context, and there is clear evidence of them having a weekly pattern, due presumably to working practices in the medical sector. In particular, the "Monday dip" in recent weeks has been substantially greater than can be accounted for via sampling uncertainties alone, and therefore as our second source of uncertainty we include an additional factor for these reporting errors, for which we use a Gaussian distribution with a magnitude of 20% of the expected value (again at one standard deviation). We make no attempt either to correct directly for these reporting errors or to account for their somewhat periodic and skewed nature.

Perhaps the most subtle and important source of model-data discrepancy is the inadequacy of the model itself. Even if the model is perfectly initialised, with the best possible choice of parameter values, it is inevitable that it will diverge from reality to a greater extent than could be accounted for by stochastic factors. There are in fact no 'true' parameters for which the model can track reality indefinitely. This failure does not just affect the forecast, but also prevents the model from 'shadowing' (Smith, 2006) an indefinitely long historic time series of observations. However, a consequence of the (log-)linear and deterministic nature of the model is that, in the limit of an infinite data stream of unlimited precision, the posterior probability distribution under a standard Bayesian updating methodology will converge to a point estimate and a point forecast, which therefore will not validate. This ubiquitous problem in data assimilation and parameter estimation has a number of possible solutions.
Here we treat the model inadequacy in a simple way, as a multiplicative error that increases linearly with time at the rate of 3% per day. The calibration of this number is necessarily rather subjective: an error term of 1% per day or less would be too small to significantly influence results within the time scale of interest, and a value of 10% per day would render the model largely useless. As well as inflating the forecast spread, this error also grows in time backwards from the forecast time during the calibration process. What this means in practice is that observations sufficiently distant in time from the present have little influence on the calibration, and that the posterior certainty is bounded even for an unlimited time series of perfect observations.

These three error terms are independent, Gaussian and additive in log space, and therefore can be added in quadrature to give a total uncertainty on model predictions. They apply to the model prediction at observation times in the likelihood calculation, and are also included in the presentation of our forecasts. In practice, however, the forecast uncertainty is usually dominated by our uncertainty in the current value of Rt.

We use daily death numbers as our sole source of data in the Bayesian updating procedure. In principle other data, such as numbers of infections, could easily be used if available, but in practice these appear to be so unreliable and strongly biased due to under-sampling that we do not use them. Our data are drawn largely from http://www.worldometer.com, with some minor adjustments. Regional data for Hubei have been obtained from https://www.kaggle.com/imdevskp/corona-virus-report#covid_19_clean_complete.csv.
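The quadrature combination of the three error terms described above can be written directly. In this sketch (ours, not the paper's code) the inadequacy term is taken as 3% per day of absolute separation from the forecast time, since the text says it grows both forwards and backwards from that point:

```python
import math

def total_log_sd(n_expected, days_from_forecast):
    """Standard deviation of the multiplicative model-data discrepancy
    (additive in log space), combining: sampling term (1 + sqrt(n))/n,
    a 20% reporting error, and model inadequacy growing at 3% per day
    of absolute separation from the forecast time."""
    sampling = (1.0 + math.sqrt(n_expected)) / n_expected
    reporting = 0.20
    inadequacy = 0.03 * abs(days_from_forecast)
    return math.sqrt(sampling ** 2 + reporting ** 2 + inadequacy ** 2)
```

For 100 expected deaths at the forecast time this gives roughly a 23% one-sigma band, while for a single expected death the sampling term dominates entirely, which is exactly the intended downweighting of sparse early data.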
The Hubei data set did not include the earliest deaths, and these were added by hand based on the Wikipedia page https://en.wikipedia.org/wiki/2019-20_coronavirus_pandemic_in_mainland_china. Due to the log-space model-data comparison, zeros are problematic. Erroneous zeros due to missing data during the epidemic are removed by a simple local smoothing, and zeros in the early stage of the epidemic are treated as half a death in the likelihood to avoid numerical problems. The number of fictitious deaths created through this process is never more than a handful in any of our applications, and we do not believe that this minor pre-processing affects our results other than to make the estimation process slightly more straightforward.

We have validated our system by hindcasting epidemic growth in multiple countries and regions. A representative selection of our results, focussing on countries which have experienced significant epidemics, is reported here. Our validation approach follows the basic paradigm of issuing a hypothetical forecast at a date in the past by withholding data subsequent to that forecasting date, which can then be used for validation. The performance of our system was so robust that we have not felt the need to perform quantitative analyses of the goodness of fit, and here we simply show a selection of forecasts along with the observations that were subsequently made. The initial outbreak in the city of Wuhan in Hubei province, China, provides the most comprehensive test of long-term forecasting performance due to the long time series available. As can be seen from Figure 1a, the strong decline in deaths was predicted (albeit with much uncertainty) as early as the 9th February, due to the deviation of reported deaths below the exponential continuation of the early growth rate.
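The zero-handling in the pre-processing step described earlier might look like the following (a sketch only: the paper does not reproduce its smoothing rule, so the neighbour-averaging choice and the heuristic for distinguishing early-epidemic zeros from missing-data zeros are assumptions):

```python
def preprocess_deaths(series):
    """Replace problematic zeros in a daily-deaths series: zeros with
    non-zero neighbours are treated as missing reports and filled by
    local smoothing; other zeros (early epidemic) become half a death
    so the log-space likelihood is defined."""
    out = list(series)
    for i, d in enumerate(out):
        if d == 0:
            left = out[i - 1] if i > 0 else 0
            right = series[i + 1] if i + 1 < len(series) else 0
            if left >= 1 and right >= 1:        # missing-data zero
                out[i] = 0.5 * (left + right)   # simple local smoothing
            else:                                # early-epidemic zero
                out[i] = 0.5                     # half a death
    return out
```

As in the paper's description, only a handful of fictitious deaths are introduced, and non-zero observations are left untouched.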
While the cluster of high death numbers just before the 16th February raised the forecast in Figure 1b somewhat, reality was still captured within its uncertainty range. Subsequent forecasts predicted the exponential decline very accurately.

Figure 2b contains our current forecast for daily deaths in the UK, along with our estimate for the current reproductive rate Rt, which is assumed in the model to have applied since the date of the 'lockdown' on 23rd March, and also the original reproductive rate R0 that applied prior to that date. Figure 2a shows an equivalent forecast initialised on Sunday April 5th, together with the data that have been reported since that date but which were not known to the algorithm. According to our results, Rt is now very likely less than 1, and the number of deaths will fall steadily in the future so long as their reporting is on a consistent basis with the past. The possibility that the 'Monday problem' has suppressed recent data should not be completely discounted, but even if this is the case, it is hard to reconcile a value for Rt of greater than unity.

We have also simulated the epidemics in a number of European countries, and present results from the countries with the three largest epidemics: France, Italy and Spain. As shown in Figures 4a and 5a (and also consistent with our results for Hubei), the downturn in deaths is confidently predicted with between two and three weeks of data subsequent to the lockdown. This gives us some confidence that our UK forecast is credible.
Our results for the UK, Spain, Italy and France are summarised in Table 2, together with equivalent forecasts generated by the MRC/IC Centre for Global Infectious Disease Analysis (https://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/covid-19-weekly-forecasts/). In each of the previous forecasts for which subsequent data are available, these data are consistent with our forecasts but frequently lie outside the uncertainty bounds of the MRC forecasts. It is unclear to us why the MRC approach has performed quite so poorly. We suspect it may be due to a lack of memory in their system: deaths over the coming week cannot possibly be deduced from the past few days alone, as they depend on the time series of infections stretching back over several weeks, and integrating this information requires the use of a model with appropriate time delays. An alternative forecasting system has been presented by the Institute for Health Metrics and Evaluation (IHME, https://covid19.healthdata.org/united-kingdom), and their approach, which is outlined in Murray et al. (2020), appears to include a highly detailed statistical model. While the extra information they consider, for example relating to hospital bed demand, could be very useful to know, the basic trajectory of the number of deaths is very poorly predicted by their model. Their forecasts of deaths issued on the 9th April predicted a steep rise for the UK and extraordinarily steep falls for both Italy and Spain, outside the range of what could be plausibly simulated by a mechanistic model. The fairly stable death rates in all three European countries (trending down gradually in Italy and Spain), which have been observed subsequently to their forecast date and which were correctly predicted by our mechanistic model, lie outside the uncertainty bounds of the IHME forecasts a mere 4 days after their issue.

Figure 6. Forecast for Sweden. Plot legend as for Figure 1.

It could perhaps be argued that our results have been largely obtained through a judicious choice of prior on Rt which imposes a sharp reduction in growth rate. Sweden provides a useful test of this possibility, since although it has taken some actions, it has not imposed strict controls to suppress the epidemic. Therefore, we can use Sweden as a test of our system's ability to detect failure or absence of controls. In this test, we assume imposition of controls in Sweden at around the same time as in the rest of Europe (17 March), and calibrate our model as before, with an assumed Rt ~ N(1, 0.5²) subsequent to that date. The results are shown in Figure 6. These results demonstrate that the prior on Rt is sufficiently vague that the data easily dominate it, generating a result showing essentially no change in the growth rate over the interval for which we have data. Our results for Sweden are also consistent with the latest forecast from MRC. Therefore, we are confident that we have not been merely fixing our earlier results through the choice of prior for Rt, but have in fact been detecting real changes in R which emerge from the data.
We are surprised that calibration methods are not more widely used by epidemiological modellers in order to better simulate real epidemics. For example, the widely-discussed paper of Ferguson et al. (2020) appears to have made no attempt to calibrate the exponential growth rate that had been observed in both case and death data in the UK, merely (according to the description of the authors) setting the initial size of their modelled epidemic to match the number of deaths by mid-March. By this time, there was already ample evidence in the public domain that the 5-day doubling time assumed by their choice of parameters was a substantial underestimate of the growth rate both in deaths (albeit with little data) and also in diagnosed infections. Similarly, Davies et al. (2020) used alignment with observed deaths (up to 27th March) to choose the starting date of their model simulations, but do not appear to have considered calibrating the growth rate, even though by that time it must have been apparent to all researchers that the empirical rate was very much at the high end of all prior expectations. It seems extremely remiss that such simple calibration is not performed routinely, and it may well have played a significant role in the tardy and complacent response of the UK government in the early stages of the epidemic. Compared to the expert prediction of a 5-day doubling time, simple modelling suggests that the 3-day doubling that has been widely observed will, in the absence of strong controls on behaviour, result in the peak of the epidemic being roughly a month earlier and twice as high, resulting in a hugely increased demand on healthcare resources while giving far less time to prepare.
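The practical impact of the assumed doubling time can be seen with a back-of-envelope calculation: the exponential growth rate is r = ln(2)/T_d, so a 3-day doubling time implies growth roughly 1.7 times faster than a 5-day one. The 1000-fold growth target below is an arbitrary illustrative choice, not a figure from the paper.

```python
# Back-of-envelope sketch of why the assumed doubling time matters so much.
# Growth rate r = ln(2) / T_d for exponential growth with doubling time T_d.
import math

def growth_rate(doubling_days):
    return math.log(2) / doubling_days

r5 = growth_rate(5.0)  # doubling time assumed by the criticised models
r3 = growth_rate(3.0)  # doubling time widely observed empirically

# Time for an uncontrolled epidemic to grow 1000-fold from the same start:
t5 = math.log(1000) / r5
t3 = math.log(1000) / r3
print(round(t5), round(t3))  # ~50 vs ~30 days
```

The same 1000-fold growth takes about 50 days at a 5-day doubling time but only about 30 days at a 3-day doubling time, consistent with the text's point that the peak arrives weeks earlier and leaves far less time to prepare.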
Coincident with this, the higher implied value for R also makes the epidemic rather harder to control. We urge epidemiological modellers to consider more seriously the calibration and initialisation of their models when attempting to provide policy-relevant guidance.

We have presented a simple data assimilation method that simultaneously calibrates and initialises a SEIR model for nowcasting and forecasting the COVID-19 epidemic at national and regional scale. We have shown that this method generates valid forecasts, and an initial assessment of its performance shows it to be rather more reliable than the forecasts generated by both the MRC/IC Centre for Global Infectious Disease Analysis and the Institute for Health Metrics and Evaluation. We suggest that these centres, and others like them, might like to consider the potential of model calibration for improving their forecasts. Code for the generation of these analyses will be made available on GitHub: https://github.com/jdannan/covid-19-operational-forecast. Daily nowcasts/forecasts for the UK are published on Twitter: https://twitter.com/jamesannan.

This work was not funded by anyone. We appreciate numerous comments on our blog and also via Twitter, which have helped greatly in improving the presentation of this work.

References:
- CMMID COVID-19 Working Group, et al. The effect of non-pharmaceutical interventions on COVID-19 cases, deaths and demand for hospital services in the UK: a modelling study. medRxiv.
- Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand.
- Imperial College COVID-19 Response Team.
- Monte Carlo methods.
- Assimilation of paleo-data in a simple Earth system model.
- Monte Carlo simulation methods using Markov chains and their application.
- Equations of state calculations by fast computing machines.
- IHME COVID-19 Health Service Utilization Forecasting Team, et al. Forecasting COVID-19 impact on hospital bed-days, ICU-days, ventilator-days and deaths by US state in the next 4 months. medRxiv.
- Predictability past, predictability present. Predictability of Weather and Climate.

key: cord-289447-d93qwjui authors: Helmy, Mohamed; Smith, Derek; Selvarajoo, Kumar title: Systems biology approaches integrated with artificial intelligence for optimized food-focused metabolic engineering date: 2020-10-09 journal: Metab Eng Commun doi: 10.1016/j.mec.2020.e00149 sha: doc_id: 289447 cord_uid: d93qwjui

Metabolic engineering aims to maximize the production of bio-economically important substances (compounds, enzymes, or other proteins) through the optimization of the genetics, cellular processes and growth conditions of microorganisms. This requires detailed understanding of the underlying metabolic pathways involved in the production of the targeted substances, and of how the cellular processes or growth conditions are regulated by the engineering. To achieve this goal, a large set of experimental techniques, compound libraries, computational methods and data resources, including multi-omics data, are used. The recent advent of multi-omics systems biology approaches has significantly impacted the field by opening new avenues to perform dynamic and large-scale analyses that deepen our knowledge of the manipulations.
However, with the enormous transcriptomics, proteomics and metabolomics data now available, it is a daunting task to integrate the data for a more holistic understanding. Novel data mining and analytics approaches, including artificial intelligence (AI), can provide breakthroughs that traditional low-throughput, experiment-alone methods cannot easily achieve. Here, we review the latest attempts at combining systems biology and AI in metabolic engineering research, and highlight how this alliance can help overcome the current challenges facing industrial biotechnology, especially for food-related substances and compounds produced using microorganisms.

With the growing population of our planet, food security remains a major challenge facing mankind. This is especially true for countries that do not possess large land spaces for agriculture, such as those in the Middle East (deserts), Japan (mostly mountainous), and Singapore (land scarce). Moreover, nature conservationists are mostly against the clearing of wild flora and fauna to feed the world. Thus, looking at the long term, food security can become a pressing issue for many nations. The Rome Declaration (1996) by the Food and Agriculture Organization (FAO) defines food security as follows: "food security [is achieved] when all people, at all times, have physical and economic access to sufficient, safe and nutritious food to meet their dietary needs and food preferences for an active and healthy life" [1]. On the other hand, there is also a growing awareness of healthy diets, as diet is considered to be the most significant risk factor that affects general health and causes disease, disability or premature death. The trending diets are mostly focused on eating habits that are nutritious, help lose weight, and avoid processed foods, especially those with preservatives or artificial ingredients such as artificial flavours or colours [2].
These include plant-based diets (such as vegan) and low-calorie fat-burning diets (such as ketogenic) [3, 4]. Thus, the challenge is not only to produce enough food but also food that is safe, nutritious and appealing to the customer's preference. Food security has become even more important during the ongoing COVID-19 pandemic, when countries have largely closed their borders, affecting the food import-export trade [5].

There are several types of modeling approaches today, which can be largely grouped into i) parametric approaches such as dynamic modeling using ordinary differential equations [23], and ii) non-parametric models using Boolean logic, stoichiometric matrices and Bayesian inference algorithms [25, 26]. A dynamic model built using differential equations constructs an organism's metabolism step by step using known biochemical reactions and reaction kinetics from genomic, enzymatic and biochemical information derived from experiment (Figure 2a). Using this information, the models are used to predict metabolic outcomes for different in silico perturbations, or to understand the key regulatory mechanisms (such as bottlenecks) and flux distributions for a given perturbation [27, 28]. In other words, dynamic models utilize a priori knowledge of metabolic pathways, enzymatic mechanisms and temporal experimental data to simulate the concentrations of metabolites over time. These models are usually referred to as kinetic models [29]. Although kinetic models have been widely used and have proven their benefits [24], for large-scale modeling, such as genome-scale modeling, it is a daunting challenge to use dynamic modeling due to the absence of large-scale, experimentally measured and reliable kinetics [23]. To overcome this major challenge, as a trade-off, scientists use other types of modeling such as the parameter-less stoichiometric constraint-based modeling approaches.
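The kinetic (dynamic) modeling idea described above can be sketched in a few lines: write rate laws for each reaction and integrate the resulting ODEs over time. The toy pathway S -> I -> P, the Michaelis-Menten rate constants and the forward-Euler integrator below are all hypothetical illustration, not a model from the reviewed literature.

```python
# A minimal kinetic-model sketch: a toy two-step pathway S -> I -> P with
# Michaelis-Menten kinetics, integrated by forward Euler. All rate constants
# and concentrations are hypothetical.

def simulate(s0=10.0, vmax1=1.0, km1=0.5, vmax2=0.8, km2=0.3,
             dt=0.01, t_end=50.0):
    s, i, p = s0, 0.0, 0.0  # substrate, intermediate, product
    t = 0.0
    while t < t_end:
        v1 = vmax1 * s / (km1 + s)   # rate of S -> I
        v2 = vmax2 * i / (km2 + i)   # rate of I -> P
        s += -v1 * dt
        i += (v1 - v2) * dt
        p += v2 * dt
        t += dt
    return s, i, p

s, i, p = simulate()
print(s, i, p)  # by t = 50 nearly all mass has flowed into the product P
```

In practice such models are integrated with adaptive solvers (e.g. scipy.integrate.solve_ivp) rather than fixed-step Euler, but the structure (stoichiometry plus rate laws plus initial conditions) is the same, and it is exactly the enzyme-level kinetic parameters (vmax, Km) whose scarcity limits genome-scale dynamic modeling, as noted in the text.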
Constraint-based models have constraints for each decision that represent the minimum and maximum values of the decision (e.g. the minimum and maximum reaction rates) [30]. A widely used constraint-based modeling approach is flux balance analysis (FBA) [31]. FBA models thousands of metabolites and reactions with reasonable computational cost and prediction outcome (Figure 2b).

Figure 2. Schematic representation of different modeling approaches used in metabolic engineering. a) Mathematical modeling of metabolic pathways. b) Flux balance analysis (FBA) modeling. c) Steps of promoter-strength modeling using statistical models and mutation data. d) Ensemble modeling.

Although numerous works have used metabolic regulation to control the production of targeted metabolites, recent works indicate that transcriptional and translational control can provide a significant fold increase in the intended yield output [13, 33]. Transcriptional control changes the way the gene of interest is regulated by manipulating its promoter region. This includes modifications such as mutating the ribosomal binding sites (RBS) or the transcription factor binding sites (TFBS), designing and inserting short sequences (e.g. new binding sites), or designing an artificial promoter region [33]. Transcriptional control requires a deep understanding of how the gene of interest is regulated (activators, enhancers and suppressors) as well as knowledge of its genomic structure around the binding sites (such as nucleosome positions) [34] (Figure 2c). Thus, modeling transcriptional control remains a challenge, as it requires complex data involving quantitative gene expression under each mutation condition to train a model that simulates the effect of each mutation, which is then used to predict the impact of new mutations.
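The FBA formulation described above is a linear program: maximize an objective flux subject to steady state (S·v = 0) and flux bounds. The following toy network, its bounds and the use of scipy's generic LP solver are illustrative assumptions; real FBA is done on genome-scale models with dedicated tools such as COBRApy.

```python
# A toy flux balance analysis (FBA) sketch, not any published model.
# Hypothetical 2-metabolite network:
#   v1: -> A (uptake), v2: A -> B (conversion), v3: B -> (biomass)
# Maximize the biomass flux v3 subject to steady state S.v = 0 and bounds.
from scipy.optimize import linprog

# Stoichiometric matrix S (rows: metabolites A, B; columns: fluxes v1..v3)
S = [[1, -1, 0],   # A: produced by v1, consumed by v2
     [0, 1, -1]]   # B: produced by v2, consumed by v3
b = [0, 0]         # steady state: no net accumulation of A or B

bounds = [(0, 10), (0, 8), (0, None)]  # uptake capped at 10, v2 capped at 8
c = [0, 0, -1]                          # linprog minimizes, so negate v3

res = linprog(c, A_eq=S, b_eq=b, bounds=bounds, method="highs")
print(res.x)  # optimal flux distribution
```

At steady state the three fluxes must be equal, so the optimum is set by the tightest bound (v2 <= 8): the solver returns v1 = v2 = v3 = 8, i.e. the v2 "bottleneck" limits biomass, which is exactly the kind of insight FBA is used for.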
Nevertheless, statistical approaches such as position weight matrix (PWM) modeling, which scores aligned sequences that are likely functionally related, have shown promise for understanding the mutational impact on transcriptional regulation in mammalian disease cells [35, 36]. Such methods could be explored in the future for controlling transcriptional efficiency in metabolic engineering.

In ensemble modeling, perturbation-response data play a crucial role in the development of the ensemble predictions, thereby reducing the number of models to a smaller set [38]. An example of ensemble modeling was performed for two non-native central pathways for carbon conservation, the non-oxidative glycolysis (NOG) and the reverse glyoxylate cycle (RGC) pathways, using ensemble modeling robustness analysis (EMRA). EMRA successfully determined the probability of system failure and identified possible targets for flux improvement [39]. In another study, ensemble modeling was used to help develop an L-lysine-producing strain in E. coli [40]. Nevertheless, ensemble modeling comes with some major challenges: building an ensemble with different modeling algorithms is more difficult than using any standard modeling strategy; the requirement for perturbation-response data makes it similar to many other data-dependent modeling strategies that perform poorly in the absence of reliable data; and its overall results can be difficult to interpret. These limitations hinder the utility of this powerful modeling approach.

Another widely used modeling approach for metabolic engineering is in silico three-dimensional (3D) molecular modeling for the study of receptor/enzyme-ligand docking and protein homology design [41]. It has a wide range of applications in drug design and metabolism research, therapeutic antibody design and molecular interaction research (protein-protein and protein-DNA interactions).
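The PWM scoring mentioned above can be sketched compactly: build per-position log-odds scores from aligned binding sites and sum them over a candidate sequence. The toy sites, pseudocount and uniform background below are hypothetical choices for illustration only.

```python
# A minimal position weight matrix (PWM) sketch, purely illustrative:
# build a log-odds matrix from (toy) aligned binding sites, then score
# candidate sequences against it.
import math

sites = ["TATAAT", "TATAAA", "TACAAT", "TATATT"]  # hypothetical aligned sites
background = 0.25                                  # uniform base frequencies
pseudocount = 0.5
length = len(sites[0])

pwm = []
for pos in range(length):
    col = [s[pos] for s in sites]
    scores = {}
    for base in "ACGT":
        freq = (col.count(base) + pseudocount) / (len(sites) + 4 * pseudocount)
        scores[base] = math.log2(freq / background)  # log-odds vs background
    pwm.append(scores)

def score(seq):
    """Sum of per-position log-odds scores for a candidate site."""
    return sum(pwm[i][b] for i, b in enumerate(seq))

print(score("TATAAT"), score("GGGCCC"))  # consensus-like site scores higher
```

A mutated promoter variant can then be scored against the same matrix, which is the basic operation behind using PWMs to estimate the impact of binding-site mutations on transcriptional regulation.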
In metabolic engineering, 3D modeling is used to design and simulate engineered enzymes that are indispensable for optimizing the microorganism's metabolism [42]. In protein engineering, where no structural data are available, molecular modelling is used to model the 3D structures of enzymes and, coupled with enzyme-substrate docking studies, can be used to target regions of interest to improve various attributes, such as specificity, activity and stability under a given environment. This has been used to great effect for single enzymes as in vitro industrial biocatalysts (e.g. sitagliptin [43]), as well as for entire enzyme cascades (e.g. islatravir [44]) for the production of active pharmaceutical ingredients.

Dynamic modeling strategies, as mentioned above, often depend on the parameters that are used to build the model. The parameters (such as reaction kinetics or flux ranges) can be determined using bottom-up or top-down approaches [45]. The bottom-up approach is highly dependent on experiments (such as in vitro enzymatic assays) since it requires information on the reaction kinetics of each enzyme, which is highly challenging to determine for all the enzymes in a pathway or network. Furthermore, even if information is obtained from in vitro experiments, the data are often several orders of magnitude different from actual in vivo measurements [46]. Moreover, modeling usually requires data (kinetics or flux rates) for multiple conditions or time points to train the model and test its accuracy or applicability, which requires iterative experimental work [18]. Despite the fact that bottom-up modeling approaches often use optimization algorithms to estimate the model parameters, such as the genetic algorithm, the complex and non-linear nature of the relationships between metabolites limits the usefulness of the model-fitting algorithms [45, 47]. Another limitation is the scale of the model.
Since the bottom-up approach requires detailed experimental measurements, it is more suitable for small-scale models. Extending the model size requires either more experiments (higher cost and longer time) or greater reliance on computationally estimated parameter values (lower accuracy). Thus, an accurate dynamic model based on a bottom-up approach is difficult to establish due to the extended level of uncertainty in the kinetic properties of the enzymes and their reactions [48]. Ensemble modeling helps in building large-scale models; however, it also suffers from major limitations, as mentioned earlier.

On the other hand, top-down approaches utilize time-series metabolomic data to indirectly infer the kinetics, flux rates or concentrations of metabolites, through the establishment of correlation and causation networks between metabolites [45]. The causation network establishes the cause-effect relationships between the metabolites in the network and is usually built using time-series metabolomic data, while the correlation network uses mathematical and statistical methods to determine the probable relation between the enzymes and metabolites in the network [47]. Nevertheless, the top-down approach has shown notable success in analyzing cellular pathways with simple linear-response or mass-action kinetic models with little parameter sensitivity [29, 49].

Comparative 3D protein modelling is most commonly performed using template-based methods, where homologous protein structures are used to generate models using stand-alone programs such as MODELLER [50] or through online servers such as Robetta, which incorporates the RosettaCM method [51], HHpred [52], and I-TASSER [53]. These methods produce useful models where good templates are available, but many protein sequences of interest have limited template information, so poor-quality models are common, which hinders their practical application in guiding protein engineering work.
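The top-down correlation-network idea can be sketched directly: compute pairwise Pearson correlations between metabolite time courses and keep only the strong pairs as network edges. The metabolite names, time courses and the 0.9 threshold below are hypothetical illustration.

```python
# Illustrative top-down sketch: infer a correlation network between
# metabolites from (hypothetical) time-series metabolomic data by
# thresholding pairwise Pearson correlations.
import math

series = {  # toy time courses: B tracks A closely, C is unrelated noise
    "A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "B": [2.1, 3.9, 6.2, 8.0, 9.9, 12.1],
    "C": [5.0, 1.0, 4.0, 2.0, 6.0, 1.5],
}

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

names = sorted(series)
edges = [(u, v) for i, u in enumerate(names) for v in names[i + 1:]
         if abs(pearson(series[u], series[v])) > 0.9]
print(edges)  # only the strongly correlated pair survives the threshold
```

Only the A-B edge survives, reflecting how a correlation network recovers probable functional relationships without requiring any enzyme kinetic parameters; establishing causation (direction) requires additional time-lagged or interventional analysis, as the text notes.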
Most of the above-mentioned modeling strategies require the availability of sufficient, high-quality experimental data. These data include metabolite concentrations and their chemical structures, properties, pathways, reaction rates, genomic sequences, genome annotations, transcriptome sequences, gene expression data and many other types of data, as required by the respective modeling strategies. Fortunately, a large number of bioinformatics databases and servers are now freely available with most of these data. Many of them are meta-databases that collect and aggregate data from multiple sources, such as KEGG, Pathway Commons and MetaCyc [54-56]. Despite the benefits of these bioinformatics resources, the challenge is in finding the correct dataset and modeling/analytical approaches to take advantage of this wealth of data. This raises the need for novel data mining and data analytics approaches, such as artificial intelligence (AI).

Artificial intelligence (AI) provides computers with the ability to make decisions by analyzing data independently, following predetermined rules or pattern-recognition models. Since its introduction in 1956, AI has become a hot research area after proving useful in solving several challenges across many fields [57]. AI and many of its modern techniques, such as machine learning (ML), contribute significantly to things we use in our daily life: from the voice recognition we use when interacting with smart devices, to the algorithms that decide the content we see on our social media, to the modern-day autonomous cars that will soon be cruising our streets. AI can now read, write, listen, respond to questions, play games or even engage in conversations [58]. It is also playing a significant role in science, technology and research.
In the biomedical and biotechnology fields in particular, AI is heavily employed in addressing certain research challenges while being under-utilized in other areas. The drug and vaccine discovery fields, for instance, are employing AI to address the challenges of developing new drugs, repurposing existing drugs, understanding drug mechanisms, designing and optimizing clinical trials and identifying biomarkers [59]. Recent surveys show that more than 40 pharma companies and 230 startup companies are employing AI in different aspects of drug discovery [60, 61]. This has resulted in the development of over one hundred drugs that are in different development phases in the fields of oncology, neurology and infectious diseases [62]. Furthermore, research on COVID-19 drug and vaccine development is employing AI, and this has resulted in dozens of promising drug lead compounds and vaccines in such a short period of time [63, 64]. AI is also employed in the fields of genomics, protein-protein interaction prediction, signaling pathway prediction and analysis, protein-DNA binding, cancer diagnosis, and genomic mutation variant calling, among several other applications [65-69]. On the other hand, AI is not similarly utilized in the fields of metabolomics and metabolic engineering, especially for food applications. Although the idea of combining systems biology and AI (machine learning in particular) to study metabolism is relatively old [70], its applications are still underexplored.

Machine learning (ML) is the field of AI concerned with developing computer programs that learn and improve their performance automatically based on experience, without being explicitly programmed [71]. In the last few years, ML research and techniques have improved as large datasets generated by modern analytical lab instruments have become available.
Therefore, in recent reports we are starting to see ML-based research in identifying weight-loss biomarkers [72], the discovery of food identity markers [73], farm animal metabolism [74] and many other applications in untargeted metabolomics [75, 76]. In metabolic engineering, several areas are starting to take advantage of ML and systems biology integration, including pathway identification and analysis, modeling of metabolism and growth, and 3D protein modeling (Figure 3).

Pathway identification and analysis is crucial for metabolic engineering. It is common that the biochemical pathway of a targeted substance (e.g. an enzyme or compound) is unknown or poorly studied. Furthermore, in many cases, the gene(s) or gene cluster responsible for producing the targeted substance needs to be transferred to a model organism so that it can be easily manipulated and optimized [12]. As mentioned above, the different modeling techniques have their limitations, and when combining omics data and using standard data analysis approaches for pathways, the final predictions come with uncertainty [77]. ML can be utilized to identify the pathways upstream of the substance. For instance, an ML model that used naive Bayes, decision trees, logistic regression and the pathway information of many organisms was used in MetaCyc to predict the presence of a novel metabolic pathway in a newly sequenced organism. The analysis of the model's performance showed that most of the information about the presence of a pathway in an organism is contained in a small set of the features used; in particular, the number of reactions along the path from input to output compound was the most informative feature [45]. In general, the ML models used for pathway prediction showed better performance than standard mathematical and statistical methods [78].
Nevertheless, pathway discovery still relies heavily on traditional approaches such as gene sequence similarity and network analysis; thus, better ML algorithms and methods for pathway discovery are needed.

ML can also be invaluable for the identification of important genes or enzymes in pathways of interest. ML classifiers, such as support vector machines, logistic regression and decision tree-based models, have been instrumental in predicting gene essentiality within metabolic pathways by training and testing models on labeled data of essential and non-essential genes [79]. ML has also been used to find new drug targets by determining the essential enzymes in a metabolic network, characterizing each enzyme by its local network topology, co-expression and gene homologies, and flux balance analyses [80]. Plaimas et al used an ML model trained to distinguish between essential and non-essential reactions, followed by experimental validation using the phenotypic outcome of single-knockout mutants of E. coli (the Keio collection) [80]. In an earlier study, the side effects of drugs on the metabolic network were investigated by predicting enzyme inhibitory effects with an ML model. The model used network topology and the functional classes of inhibitors and enzymes as background knowledge, with a logic-based representation and a combination of abduction and induction methods, to predict drug inhibitory side effects [81].

Newly sequenced genomes undergo two types of annotation: structural annotation and functional annotation. Structural annotation is the process of identifying the genome components and their structures (e.g. identifying genes, their exons, introns and UTRs, or their regulatory regions), while functional annotation identifies the functions of the genes and their products.
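The essentiality-classifier idea described above (train on labeled essential/non-essential genes, predict on new genes) can be sketched with a tiny logistic regression. Everything here is hypothetical: the two features (network degree and expression level), the labels and the training data are invented for demonstration, and real studies use richer features and libraries such as scikit-learn.

```python
# Illustrative sketch of a gene-essentiality classifier: a minimal logistic
# regression trained by gradient descent on invented (degree, expression)
# features with invented essential(1)/non-essential(0) labels.
import math

data = [((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.7, 0.7), 1), ((0.9, 0.6), 1),
        ((0.2, 0.1), 0), ((0.1, 0.3), 0), ((0.3, 0.2), 0), ((0.2, 0.4), 0)]

w = [0.0, 0.0]
b = 0.0
lr = 0.5

def predict(x):
    """Sigmoid of the linear score: probability the gene is essential."""
    z = w[0] * x[0] + w[1] * x[1] + b
    return 1.0 / (1.0 + math.exp(-z))

for _ in range(2000):                 # gradient descent on the log loss
    for x, y in data:
        err = predict(x) - y
        w[0] -= lr * err * x[0]
        w[1] -= lr * err * x[1]
        b -= lr * err

# Predict for two unseen genes: one hub-like, one peripheral.
print(predict((0.85, 0.8)), predict((0.15, 0.2)))
```

After training, the hub-like gene is classified as essential and the peripheral one as non-essential, mirroring the train/test workflow on labeled essentiality data that the text describes.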
Both types of annotation are important for metabolic engineering research: structural annotation identifies the genes, their sequences, length and structure, and therefore helps in finding alternative organisms where the same gene, pathways or gene clusters exist, while functional annotation helps in identifying organisms that produce the same substance or tolerate the same growth conditions. Comparative genomics, network biology and traditional bioinformatics methods, such as sequence alignment, are usually utilized in this process [82, 83]. The rapid advancement of genome sequencing technologies and the significant drop in their cost over the last decade raised the need for fast and accurate annotation methods [84]. This resulted in the development of several new annotation methods that analyse newly sequenced genomes from different sequencing platforms and that addressed many of the challenges; however, many other challenges remain, such as missing short genes and erroneous annotation of exon starts and ends [85, 86]. Thus, several other methods were introduced based on the idea of combining multi-omics data in the genome annotation process, in particular proteomic and transcriptomic data [87-90]. Despite these efforts, over 20% of the sequenced genomes in the Genomes OnLine Database (GOLD) are still awaiting annotation [91].

The high-volume and multi-dimensional nature of genome sequencing data makes it very suitable for applications of machine learning algorithms [92]. An ML model can be trained on annotated genomes to identify genome structures, e.g. genes or regulatory regions, from their features, and then identify the same structures in newly sequenced genomes [93]. Yip et al developed DeepAnnotator, an annotation tool that outperformed the NCBI annotation pipeline in RNA gene annotation [95].
The new versions of the annotation tool GeneMarkS for annotating prokaryotic genomes (GeneMarkS-2+) and the eukaryotic self-training gene finder (GeneMark-EP+) both utilize ML algorithms in the annotation process [94, 96]. Deep convolutional neural networks have been used to annotate gene-start sites in different species by training the model with sites from one species as the positive sample and random sequences from the same species as the negative sample; the model was able to identify gene-start sites in other species [97].

Although the idea of employing ML in functional annotation started relatively early, it is still underutilized in functional annotation compared to structural annotation. An early attempt at using ML for gene functional annotation from the biomedical literature utilized hierarchical text categorization (HTC) [98], while Tetko et al provided high-quality curated functional annotation data as a benchmark dataset for developers of ML-based functional annotation methods for bacterial genomes [99]. Recent reports show the application of ML-based methods to a wide variety of functional annotation tasks, such as the discovery of missing or wrong protein function annotations [100], predicting gene functions in plants [101], controlling the false discovery rate (FDR) and increasing the accuracy of protein function predictions [102], and genome-wide functional annotation of splice variants in eukaryotes [103].

The advancement of omics technologies has resulted in a huge accumulation of data (genomics, transcriptomics, proteomics and metabolomics) that is estimated to grow to exceed astronomical levels by 2025 [104]. This enormous amount of data has shifted scientific research more towards data-driven approaches such as ML [45]. Combining ML methods with omics data is a typical systems biology approach for addressing several biomedical challenges.
An ML approach was used to replace traditional kinetic models in estimating metabolite concentrations over time, by combining ML models with proteomic and metabolomic time-series data [58]. Also, proteomic and metabolomic data of yeast were combined under several perturbation conditions (97 kinase knockouts), and ML was used to predict the yeast metabolome from the enzyme expression proteome of each kinase-deficient condition. The ML quantifies the role of enzyme abundance by mapping the regulatory enzyme expression patterns and then utilizing them to predict the metabolome under the knockout condition [70]. The availability of transcriptome data and the ability of ML methods to deal with big data have led to the development of several genome-scale methods for predicting phenotype with ML models. To take advantage of the accumulated transcriptome data, a biology-guided deep learning system named DeepMetabolism was developed [105]. DeepMetabolism uses transcriptomics data to predict cell phenotypes, integrating unsupervised pre-training with supervised training to predict the phenotype with high accuracy and high speed. On the other hand, Jervis et al implemented an ML algorithm to model the bacterial ribosome binding site (RBS) sequence-phenotype relationship and accurately predicted the optimal high-producers, an approach that applies directly to a wide range of metabolic engineering applications [106]. Despite the progress in applying ML techniques in metabolic research, ML is still far from being fully utilized in some important aspects of metabolic engineering, especially in metabolic pathway identification and analysis and in bioprocess optimization for food-based research and industries.

In the field of 3D protein modeling, several AI-based advances are also notable. The most recent Critical Assessment of protein Structure Prediction (CASP) meeting, in 2018, saw AI methods come of age.
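At its simplest, the proteome-to-metabolome prediction idea above is a supervised regression: learn a mapping from enzyme abundance to metabolite concentration on measured conditions, then predict for an unseen condition. The single-feature least-squares fit and all numbers below are hypothetical; the cited studies use multivariate ML models over many enzymes.

```python
# Sketch of the proteome -> metabolome prediction idea: fit a least-squares
# line mapping (hypothetical) enzyme abundance to a metabolite concentration,
# then predict the metabolite level for an unseen (e.g. knockout) condition.

enzyme = [1.0, 2.0, 3.0, 4.0, 5.0]       # toy enzyme expression levels
metabolite = [2.2, 3.9, 6.1, 8.0, 9.8]   # toy measured metabolite levels

n = len(enzyme)
mx = sum(enzyme) / n
my = sum(metabolite) / n
slope = sum((x - mx) * (y - my) for x, y in zip(enzyme, metabolite)) \
        / sum((x - mx) ** 2 for x in enzyme)
intercept = my - slope * mx

predicted = slope * 2.5 + intercept  # predict for an unseen condition
print(slope, intercept, predicted)
```

Here the fitted slope (about 1.93) captures how strongly the metabolite tracks enzyme abundance, and the prediction for the unseen condition interpolates that relationship; real ML approaches generalize this to many enzymes, non-linear models and cross-validated training.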
The program AlphaFold [107] used a neural net to extract covariant residue pairs from sequence alignments, coupled with estimated distances between them (from 2 to 20 Å), and then used the Rosetta energy function [108] to fold the protein based on these AI-derived restraints. AlphaFold performed exceptionally well in the competition, giving high-accuracy models with template-modelling scores of 0.7 or higher for 24 out of 43 domains (compared with 14/43 for the next best method). This has been developed into a lab-based version called ProSPr [109]. Yang et al used a similar protocol, but with added estimation of relative residue orientations, resulting in trRosetta [110], which improved predictions still further. These 3D modelling methods may be implemented into a comprehensive metabolic engineering platform.

One area that could be addressed in the improvement of 3D protein modelling methods is the inclusion of cofactors. Many enzymes are folded around cofactors: small-to-large organic molecules which form part of the catalytic machinery, such as flavin adenine dinucleotide (FAD) or haem. These molecules are often removed in template-based modelling (both manual and automated versions), yet their presence is often important for the correct folding of the enzyme [111]. This has the effect of lowering the quality of the model due to the removal of key restraints from the structure, requiring extra docking or structure manipulation to reinsert the cofactor after modelling. It should be possible to include the presence of cofactors through a survey of the Protein Data Bank [112], where ML methods can be used to identify key determinants of cofactor binding, coupled with identification of these determinants within a target sequence, and application of a combined sequence-and-template-based optimization protocol inclusive of these structural features.
an extension of this might also be used to identify substrates for enzymes within a metabolic pathway, or unnatural substrates, which is particularly valuable for the development of synthetic biosynthetic pathways. one input would be enzyme sequence alignments of known function, as well as structural information for both enzyme families and substrates. a neural network could be used to identify common patterns of binding-pocket residues across multiple families of enzymes for different substrates, and to identify potential sequences suitable for inclusion in a particular metabolic pathway, inclusive of sequence determinants for ease of inclusion into heterologous expression systems. also, if no sequence is available that produces a required product, it might be possible to predict the binding-pocket residues that could be mutated to give that product. predictions can then be experimentally tested, and the results fed back into the model. in recent years, the importance of harnessing natural and food ingredients from diverse sources, such as engineered microbes or synthetic derivation, is increasingly realized, as highlighted in the introduction section. these approaches provide several benefits for producing a more sustainable bio-based economy that relies less on precious land or limited livestock. nevertheless, the bioengineering processes utilized remain suboptimal, owing to the complexity of living systems' emergent behaviors (such as feedback/feedforward inhibition, cofactor imbalances, toxicity of intermediates and bioreactor heterogeneity) that tend to reduce the overall effect of any internal modification such as adding or engineering a metabolic pathway [113, 114]. thus, achieving economically viable large-scale production of microbial-derived metabolites or compounds requires appropriately optimized production strains that generate high yields.
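as a minimal caricature of the binding-pocket idea above, the sketch below (plain python, invented aligned pocket sequences and substrate classes) builds a consensus residue pattern per substrate class and classifies a new pocket by consensus matching; a neural network would replace this with learned features, but the input/output shape is the same:

```python
from collections import Counter

# made-up aligned binding-pocket residues for two substrate classes "A"/"B"
train = [("HDSAG", "A"), ("HDSVG", "A"), ("HDTAG", "A"),
         ("YQSAG", "B"), ("YQSVG", "B"), ("YQTAG", "B")]

def consensus(seqs):
    """most common residue at each aligned position."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))

profiles = {cls: consensus([s for s, c in train if c == cls])
            for cls in {c for _, c in train}}

def classify(seq):
    """assign the class whose consensus shares the most positions with seq."""
    return max(profiles, key=lambda cls: sum(a == b for a, b
                                             in zip(seq, profiles[cls])))
```

the feedback loop described in the text would then retrain the profiles (or a real model) as new experimentally tested pockets are added to the training set.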
to date, however, metabolic engineering efforts mainly serve to broaden the range and further reduce the cost of molecules of commercial interest. to address these issues, brunk et al. engineered eight e. coli lab strains that produced three commercially important biofuels: isopentenol, limonene and bisabolene [115]. to understand the key regulatory or emergent bottlenecks that limit their industrial applicability, they undertook a large-scale omics-based systems biology approach in which they performed time-series proteomics and metabolomics measurements and analyzed the resultant high-throughput data using statistical analytics and genome-scale modeling. the integrated approach revealed several novel findings. for example, they elucidated time-dependent regulation of genes, proteins and metabolic pathways related to the tca cycle and pentose-phosphate pathway, and the resultant coupling of the pathways that affected nadph metabolism. these emergent responses were collectively implicated in downregulating the expected biofuel production. the findings subsequently led them to identify a crucial gene (ydbk) whose removal led to a 2-fold increase in the production of isopentenol in one of the e. coli strains [115]. despite this success on one strain (out of eight), the overall dynamic changes of metabolic pathways at the different stages of growth were not understood for all strains, as they employed a steady-state genome-scale model, which provided qualitative, rather than quantitative, inference. this, as mentioned earlier (in dynamic modeling), is due to the lack of the kinetic parameter values required to develop and test a dynamic model for each strain. to overcome this difficulty, costello and martin (2018) used the same time-series proteomics and metabolomics data of brunk et al. and developed an ml model to effectively predict pathway dynamics in an automated fashion [58].
their model produced both qualitative and quantitative predictions that outperformed a traditional kinetic model in a side-by-side comparison. their ml model derived a mapping function between the proteomics and metabolomics datasets with the aid of regression techniques and neural networks on training data, and then verified the predictions on test data. apart from better accuracy in the predicted dynamic profiles of the metabolites, the model did not require a detailed understanding of the regulatory steps, a requirement that is a major weakness of other modeling approaches. however, their ml model fell short of predicting effective regulator(s) for enhanced production of any of the biofuels (isopentenol, limonene and bisabolene), and no experimental verification was performed. although this remains a major weakness in current systems metabolic engineering approaches, ml-based modeling has the future potential to productively guide the bioengineering of strains without complete knowledge of metabolic regulatory processes, which is very challenging to obtain. one interesting and popular class of industrially relevant metabolic engineering products in the food and consumer care industries is the terpenes and terpenoids: secondary metabolites or organic compounds naturally found in diverse living species, especially plants. owing to their high commercial value, numerous studies have focused on producing them, or their derivatives, at industrial scale using microbes [116] [117] [118] [119]. although increases of several hundred- or even thousand-fold have been achieved at test-tube or flask scale by engineering microbes, comparable achievements in large industrial-scale bioreactors remain far from reality. it is our opinion that ml models can help to uncover the relations between output and input more accurately, and to identify sweet spots for carefully targeted steps toward generating the targeted output at bioreactor scale.
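the mapping-function idea can be reduced to a one-metabolite example: estimate reaction parameters from finite differences of a measured time series, after which the recovered parameters can re-simulate the dynamics. the sketch below (plain python; the rate law dm/dt = k1*E - k2*m and all numbers are invented, far simpler than the regression/neural-network models of [58]) shows the data-driven dynamics-fitting step:

```python
# fit dm/dt = k1 * E - k2 * m from a "measured" time series (here synthetic).
# k1_true, k2_true and E are invented values, not results from any study.

dt, E = 0.1, 2.0
k1_true, k2_true = 0.8, 0.5

# synthetic metabolite trajectory, generated by euler integration
m = [0.0]
for _ in range(100):
    m.append(m[-1] + dt * (k1_true * E - k2_true * m[-1]))

# finite-difference rate estimates, then simple linear regression:
# rate = k1*E - k2*m  ->  intercept = k1*E, slope = -k2
rates = [(b - a) / dt for a, b in zip(m, m[1:])]
xs = m[:-1]
n = len(xs)
mx, my = sum(xs) / n, sum(rates) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, rates))
         / sum((x - mx) ** 2 for x in xs))
k2_hat = -slope
k1_hat = (my - slope * mx) / E
```

with real, noisy multi-omics data the regression is replaced by flexible learners, but the principle of learning the rate function from the time series, rather than assuming known kinetic constants, is the same.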
although there is no current workable evidence for this, we believe the future looks promising on this front, provided large investments are made to generate the biological data required by dynamic or ml models to be effectively predictive. integrating systems biology and ml holds great promise for improving the way we study and understand metabolism, as well as for improving and engineering alternative food sources that are healthier, affordable and nutritious. however, as reviewed in this chapter, this integration faces several challenges and limitations that must be overcome to fully utilize the power of both systems biology and ml. a major challenge facing the application of systems biology and ml in food-grade or gras metabolic engineering is the lack of data. systems biology requires high-throughput data from multi-omics levels (genomic, transcriptomic, proteomic and metabolomic), and these data are only available for a small subset of microorganisms in general, and are significantly lacking for food-grade or gras strains in particular. the availability of such data is necessary for a more holistic study of the organism and helps in discovering new pathways or proteins, simpler and shorter directed pathways, or new enzymes with better production rates [12]. this information will also help in choosing the most appropriate organism for engineering and production projects. usually, certain model organisms called 'chassis', such as yeast and e. coli, are used in these projects, where the gene(s) or pathways for the substance of interest are transferred from the donor organism. however, the availability of sufficient information about both the donor organism and the chassis helps in choosing the correct chassis and avoids unexpected qualities such as resistance to certain conditions or missing important pathways [120].
in addition to the need for large-scale omics data for building ml models, another data problem faces the application of ml in metabolic engineering research. training an ml model for metabolic engineering requires sufficient quantitative data for multiple conditions, which can be multiple knockouts, perturbations or growth conditions. for instance, to build an ml model that predicts the engineering required (e.g. knockouts) to improve promoter strength, we need to train the model using quantitative data of the downstream gene expression under multiple knockouts or mutants. similarly, predictive ml models investigating translation control, transcription factor binding sites, ribosomal binding sites, enzyme engineering (mutation or truncation) and growth optimization require high-quality quantitative data across multiple conditions. the same data can also be used to build different mathematical and statistical models, which allows the development of more integrated methods. however, such data are hard to find online and need to be created for each project. we need more research that focuses on the generation of high-quality quantitative data, and on building online resources, such as meta-databases, that collect and combine these data to make them available to the community. another major challenge in the ml field is what is known as 'the black box problem'. the black box problem of ai techniques in general is defined as the difficulty of understanding how they work and how and why they produce the results they do [121]. this leaves the end user of the technique uncertain about the quality of the output, prevents the often biologically unfamiliar modeler from intervening to improve performance, and raises some legal concerns [122] [123] [124].
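a minimal illustration of the 'quantitative data under multiple knockouts' setting (plain python, made-up gene names, expression values and units): estimate each knockout's effect against a baseline and rank candidate engineering steps, with a deliberately naive additivity assumption for combinations:

```python
# hypothetical single-knockout screen: downstream expression per knockout
# against a baseline strain; all names and numbers are invented.
baseline = 10.0
measured = {"geneA": 14.0, "geneB": 9.0, "geneC": 18.0, "geneD": 10.5}

effects = {g: expr - baseline for g, expr in measured.items()}
ranked = sorted(effects, key=effects.get, reverse=True)
best = ranked[0]                            # top predicted single knockout

# naive additive prediction for a double knockout (a strong assumption:
# real knockouts interact, which is exactly why richer models are needed)
predicted_double = baseline + effects["geneA"] + effects["geneC"]
```

with enough multi-condition data, this lookup would be replaced by a trained model that captures interactions between knockouts rather than assuming additivity.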
for example, in the application of ml to 3d structural modelling, as well as enzyme-substrate identification, the newer ai-based modelling methods are showing promising results; however, due to the nature of neural nets, it is very difficult to interpret exactly what the programs are learning about the protein-folding problem. we can predict a structure, but without understanding the underlying model for folding. if a way could be found to capture this information, it would be of great use to the community for further study. to address the black box problem, scientists in the field of ai have developed a group of methods called explainable artificial intelligence (xai) that aim to make the results of ai methods understandable to humans. although this field is still new, it holds potential to solve the problems that prevent the systematic performance improvement of ai models [121, 125]. although genome annotation, both structural and functional, affects most aspects of biomedical research, it has a special impact on metabolic engineering in general and on applications in the food industry in particular. the food-grade or gras microorganisms are a small subset of all organisms, and many of them are either not well studied or not studied at all. hence, there is a big challenge in using these species in ml-based metabolic engineering, as many of them are either not sequenced, or sequenced with only draft annotation or no annotation. the annotations are usually automated using standard pipelines, which identify the common genes shared with other microorganisms and can miss the organism-specific features that need deeper attention. these features are exactly what make those organisms suitable for metabolic engineering and the food industry. improved ml-based genome annotation methods will help improve the annotation of food-safe and gras genomes, which will directly impact research in this area.
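one simple, model-agnostic member of the xai toolbox is permutation importance: scramble one input column and measure how much the model's error grows. the sketch below (plain python, a toy 'black box' model and made-up data; a deterministic reversal stands in for random shuffling) illustrates the idea:

```python
# permutation importance on a toy "black box": importance of a feature is
# how much the model's mean squared error grows when that column is scrambled.

X = [[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]]
y = [2.0, 4.0, 6.0, 8.0]                    # depends only on feature 0

model = lambda row: 2.0 * row[0] + 0.0 * row[1]   # pretend this is opaque

def mse(rows, targets):
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

base = mse(X, y)
importance = []
for j in range(2):
    col = [r[j] for r in X][::-1]           # deterministic "shuffle": reverse
    Xp = [r[:j] + [c] + r[j + 1:] for r, c in zip(X, col)]
    importance.append(mse(Xp, y) - base)    # error increase for feature j
```

here the procedure correctly attributes all predictive power to feature 0; the same recipe applies unchanged to any trained model, which is what makes it a useful first xai diagnostic.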
another area that needs special attention is pathway prediction in the absence of a genome sequence or genome annotation. since many of the food-safe and gras microorganisms are not yet sequenced, methods that predict the pathways for important substances using different omics data are required. it is now easy to perform whole-proteomics, phosphoproteomics or transcriptomics in different growth conditions or at different life stages of an organism. these omics data can be used, in the absence of a genome sequence, to predict the endogenous or biosynthetic pathways of the substance of interest. ml methods can be used instead of the traditional pathway prediction approaches owing to their better suitability to the nature and size of the data. overall, despite the challenges and limitations of ai and ml techniques in dealing with biological datasets, there is no better time than now to explore the full potential of these techniques and to further develop novel methods to overcome the many challenges, including 'the black box problem'. in parallel, improvements to data collection from omics technologies will, in time, help to narrow the gap of uncertainty or ambiguity for future systems biology and ml integration towards optimal metabolic engineering strategies.
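a minimal sketch of pathway inference from omics alone (plain python, made-up abundance profiles): profiles that co-vary across growth conditions are candidates for shared pathway membership, scored here by pearson correlation; real pathway-prediction methods use far richer features, but co-variation is a common starting signal:

```python
import math

# hypothetical protein abundance profiles over 4 growth conditions;
# enzX and enzY co-vary (candidate shared pathway), enzZ does not
profiles = {
    "enzX": [1.0, 2.0, 3.0, 4.0],
    "enzY": [2.1, 4.0, 6.2, 7.9],
    "enzZ": [4.0, 1.0, 3.0, 0.5],
}

def pearson(a, b):
    """pearson correlation of two equal-length profiles."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

r_xy = pearson(profiles["enzX"], profiles["enzY"])   # high: same pathway?
r_xz = pearson(profiles["enzX"], profiles["enzZ"])   # low/negative: unrelated
```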
references:

- rome declaration and plan of action
- diets for health: goals and guidelines
- healthy low nitrogen footprint diets
- the ketogenic diet: evidence for optimism but high-quality research needed
- covid-19 risks to global food security
- world bank, food security and covid-19
- metabolic engineering for higher alcohol production
- metabolic engineering of vitamin c production in arabidopsis
- bringing cultured meat to market: technical, socio-political, and regulatory challenges in cellular agriculture
- conceptual evolution and scientific approaches about synthetic meat
- engineered microorganisms for the production of food additives approved by the european union - a systematic analysis
- metabolic engineering and synthetic biology: synergies, future, and challenges
- systematic engineering for high-yield production of viridiflorol and amorphadiene in auxotrophic escherichia coli
- microbial astaxanthin biosynthesis: recent achievements, challenges, and commercialization outlook
- emerging engineering principles for yield improvement in microbial cell design
- advancing metabolic engineering through systems biology of industrial microorganisms
- predicting novel features of toll-like receptor 3 signaling in macrophages
- transcriptome-wide variability in single embryonic development cells
- order parameter in bacterial biofilm adaptive response
- systematic determination of biological network topology: nonintegral connectivity method (nicm)
- a systems biology approach to overcome trail resistance in cancer treatment
- a review of dynamic modeling approaches and their application in computational strain optimization for metabolic engineering
- formulation, construction and analysis of kinetic models of metabolism: a review of modelling frameworks
- bayesian inference of metabolic kinetics from genome-scale multiomics data
- flux analysis and metabolomics for systematic metabolic engineering of microorganisms
- signaling flux redistribution at toll-like receptor pathway junctions
- basic and applied uses of genome-scale metabolic network reconstructions of escherichia coli
- physical laws shape biology
- constraint-based models predict metabolic and associated cellular functions
- what is flux balance analysis?
- metabolic engineering to increase crop yield: from concept to execution
- design of synthetic yeast promoters via tuning of nucleosome architecture
- inferring gene regulatory logic from high-throughput measurements of thousands of systematically designed promoters
- impact of cancer mutational signatures on transcription factor motifs in the human genome
- genome-scale identification of transcription factors that mediate an inflammatory network during breast cellular transformation
- data mining process
- ensemble modeling of metabolic networks
- ensemble modeling for robustness analysis in engineering non-native metabolic pathways
- ensemble modeling for strain development of l-lysine-producing escherichia coli
- structure-based drug design strategies and challenges
- a review of metabolic and science (80-. )
- design of an in vitro biocatalytic cascade for the manufacture of islatravir
- machine learning methods for analysis of metabolic data and metabolic pathway modeling
- can complex cellular processes be governed by simple linear rules?
- constructing kinetic models of metabolism at genome-scales: a review
- in silico approach to characterization and reduction of uncertainty in the kinetic models of genome-scale metabolic networks
- macroscopic law of conservation revealed in the population dynamics of toll-like receptor signaling
- comparative protein modelling by satisfaction of spatial restraints
- high-resolution comparative modeling with rosettacm
- a completely reimplemented mpi bioinformatics toolkit with a new hhpred server at its core
- the i-tasser suite: protein structure and function prediction
- the kegg resource for deciphering the genome
- update: integration, analysis and exploration of pathway data
- the metacyc database of metabolic pathways and enzymes - a 2019 update
- siri, in my hand: who's the fairest in the land? on the interpretations, illustrations, and implications of artificial intelligence
- a machine learning approach to predict metabolic pathway dynamics from time-series multiomics data
- 43 pharma companies using artificial intelligence in drug discovery
- 230 startups using artificial intelligence in drug discovery
- 116 drugs in the artificial intelligence in drug discovery pipeline
- covid-19 vaccine tracker | raps
- dozens of coronavirus drugs are in development - what happens next?
- predicting pdz domain mediated protein interactions from structure
- predicting the sequence specificities of dna- and rna-binding proteins by deep learning
- a universal snp and small-indel variant caller using deep neural networks
- deep learning for genomics using janggu
- dermatologist-level classification of skin cancer with deep neural networks
- machine learning predicts the yeast metabolome from the quantitative proteome of kinase knockouts
- understanding machine learning: from theory to algorithms
- combining machine learning and metabolomics to identify weight gain
- discovery of food identity markers by metabolomics and machine learning technology
- metabolomics meets machine learning: longitudinal metabolite profiling in serum of normal versus overconditioned cows and pathway analysis
- machine learning in untargeted metabolomics experiments
- machine learning applications for mass spectrometry-based metabolomics
- transcriptome analysis and gene expression profiling of abortive and developing ovules during fruit development in hazelnut
- next generation models for storage and representation of microbial biological annotation
- an integrative machine learning strategy for improved prediction of essential genes in escherichia coli metabolism using flux-coupled features
- machine learning based analyses on metabolic networks supports high-throughput knockout screens
- application of abductive ilp to learning metabolic network inhibition from temporal data
- comparative genomics approaches to understanding and manipulating plant metabolism
- genome mining of the streptomyces avermitilis genome and development of genome-minimized hosts for heterologous expression of biosynthetic gene clusters
- challenges in the next-generation sequencing field
- whole-genome alignment and comparative annotation
- insect genomes: progress and challenges
- a perfect genome annotation is within reach with the proteomics and genomics alliance
- peptide identification by searching large-scale tandem mass spectra against large databases: bioinformatics methods in proteogenomics
- proteogenomics: from next-generation sequencing (ngs) and mass spectrometry-based proteomics to precision medicine
- combining rna-seq data and homology-based gene prediction for plants, animals and fungi
- genomes online database (gold) v.6: data updates and feature enhancements
- machine learning and genome annotation: a match meant to be?
- introduction to machine learning
- new machine learning algorithms for genome annotation
- genome annotation with deep learning
- modeling leaderless transcription and atypical genes results in more accurate gene prediction in prokaryotes
- genome annotation across species using deep convolutional neural networks
- functional annotation of genes using hierarchical text categorization
- mips bacterial genomes functional annotation benchmark dataset
- machine learning for discovering missing or wrong protein function annotations
- machine learning: a powerful tool for gene function prediction in plants
- protein functional annotation of simultaneously improved stability, accuracy and false discovery rate achieved by a sequence-based deep learning
- genome-wide functional annotation of human protein-coding splice variants using multiple instance learning
- big data: astronomical or genomical?
- deepmetabolism: a deep learning system to predict phenotype from genome sequencing
- machine learning of designed translational control allows predictive pathway optimization in escherichia coli
- improved protein structure prediction using potentials from deep learning
- the rosetta all-atom energy function for macromolecular modeling and design
- prospr: democratized implementation of alphafold protein distance prediction network
- improved protein structure prediction using predicted interresidue orientations
- how do cofactors modulate protein folding?
- the protein data bank
- the future of metabolic engineering and synthetic biology: towards a systematic practice
- cell-free metabolic engineering: recent developments and future prospects
- characterizing strain variation in engineered e. coli using a multi-omics-based workflow
- identifying and engineering the ideal microbial terpenoid production host
- a "plug-n-play" modular metabolic system for the production of apocarotenoids
- use of terpenoids as natural flavouring compounds in food industry, recent patents food
- agrocybe aegerita serves as a gateway for identifying sesquiterpene biosynthetic enzymes in higher
- protein folding and de novo protein design for biotechnological applications
- solving the black box problem: a normative framework for explainable artificial intelligence
- big data: a new empiricism and its epistemic and socio-political consequences
- why should i trust you?: explaining the predictions of any classifier
- how the machine 'thinks': understanding opacity in machine learning algorithms
- explainable artificial intelligence: a survey

the authors thank simon zhang congqiang for critical comments, and the singapore institute of

key: cord-321984-qjfkvu6n authors: tang, lu; zhou, yiwang; wang, lili; purkayastha, soumik; zhang, leyao; he, jie; wang, fei; song, peter x.‐k. title: a review of multi‐compartment infectious disease models date: 2020-08-03 journal: int stat rev doi: 10.1111/insr.12402 sha: doc_id: 321984 cord_uid: qjfkvu6n

multi‐compartment models have been playing a central role in modelling infectious disease dynamics since the early 20th century. they are a class of mathematical models widely used for describing the mechanism of an evolving epidemic. integrated with certain sampling schemes, such mechanistic models can be applied to analyse public health surveillance data, such as assessing the effectiveness of preventive measures (e.g. social distancing and quarantine) and forecasting disease spread patterns.
this review begins with a nationwide macromechanistic model and related statistical analyses, including model specification, estimation, inference and prediction. then, it presents a community‐level micromodel that enables high‐resolution analyses of regional surveillance data to provide current and future risk information useful for local governments and residents to make decisions on the reopening of local businesses and on personal travel. r software and scripts are provided whenever appropriate to illustrate the numerical details of algorithms and calculations. the coronavirus disease 2019 pandemic surveillance data from the state of michigan are used for illustration throughout this paper. coronavirus disease 2019, an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (sars-cov-2) (world health organization, 2020), has become a global pandemic that has spread swiftly across the world since its original outbreak in hubei, china, in december 2019. as of 27 june 2020, this pandemic had caused a total of 9 801 572 confirmed cases and 494 181 fatalities in more than 200 countries. being one of the most lethal communicable infectious diseases in human history, the covid-19 pandemic is expected to continue spreading in the world population, causing even higher numbers of infections and deaths in the future. with no effective medical treatments or vaccines currently available, public health interventions such as social distancing have been implemented in most countries to mitigate the spread of the pandemic. one of the central tasks of statistical modelling is to provide a suitable risk prediction model that enables both governments and public health workers to evaluate the effectiveness of public health policies and to predict the risk of covid-19 infection at national and regional levels.
such information is valuable for governments to assess the preparedness of medical resources (personal protective equipment and intensive care unit beds), to adjust various intervention policies and to enforce the operation of social distancing. modelling of infectious diseases has a profound role in informing public health policy across the world (heesterbeek et al., 2015; siettos & russo, 2013). the outbreak of the covid-19 pandemic in december 2019 has led to a surge of interest in disease projection, which ubiquitously relies on mathematical and statistical models. a crucial step in modelling disease evolution is to capture the key dynamics of the underlying disease transmission mechanisms from available public health surveillance data, which enables reliable projection of disease infection into the future. a prediction model may help us foresee possible future epidemic/pandemic scenarios and understand the consequent impacts of current economic and personal sacrifices due to various control measures. because of both data quality issues and data limitations in public surveillance systems, a statistical model should take the following features into account in its design and development. first, a statistical model should be able to make predictions and, more importantly, to quantify prediction uncertainties. forecasting is known to be a notoriously hard task, which depends heavily on the quality of the data at hand and on the model chosen to summarise the information from observed data and then to reproduce information beyond the observational time period. the chosen model is thus of critical importance for delivering predictions. this paper concerns a review of the family of classical compartment-based infectious disease models, which have been the most widely used mechanistic models to capture key features of infection dynamics.
we begin with the most basic susceptible-infectious-removed (sir) model to build up the framework (section 2); this three-compartment model is then generalised to have more compartments that embrace additional features of infection dynamics (section 3), such as the well-known four-compartment susceptible-exposed-infectious-removed (seir) model, which takes the incubation period of contagion into account. given the many types of factors potentially influencing the evolution of an epidemic, a single prediction value is insufficient to be trustworthy unless prediction uncertainty is reported as part of the forecast analysis. quantification of prediction uncertainty is of critical importance, especially when a forecast is made at an early phase of an epidemic with limited data. building sampling variation into infectious disease models is what distinguishes a statistical modelling approach from a mathematical modelling approach. a clear advantage of a statistical model is that the model parameters, including those in the mechanistic model, can be estimated, rather than being specified by subjectively chosen prior information. second, the incorporation of sampling uncertainties into the modelling of infectious disease is a fundamental difference between a statistical modelling approach and the mechanistic modelling approach known in the mathematical literature on dynamic systems. a mechanistic model is typically governed by a system of ordinary differential equations, such as the three-compartment sir model consisting of three differential equations, which explicitly specifies the underlying mechanisms of an epidemic. this model is assumed to govern an operational system of disease contagion and recovery or death, which, in reality, cannot be directly observed. most of the time, only public surveillance data are accessible, and they represent only a few snapshots of the underlying latent mechanistic system of an epidemic.
such gaps may be addressed by a statistical model that incorporates sampling schemes to explain how observed data are collected from the underlying infection dynamics. in turn, prediction uncertainty will reflect the forms and procedures of the sampling schemes specified in the statistical model. in this paper (section 5.1), we will introduce the state-space model as a natural and effective modelling framework to integrate the mechanistic model and sampling schemes seamlessly. third, given the scarcity of the available data in public health surveillance systems, the complexity of a model used for prediction should be aligned with the issue of parameter identifiability. for example, at the beginning of an outbreak, one should consider a simple model, which may be expanded over the course of an epidemic's evolution as data availability increases. to make the specified model useful for answering a question of practical importance, the relevant feature should be included in the model building. for example, in the study of control measures to mitigate the spread of covid-19, the model specification should incorporate a structure that is sensitive to the influence of a preventive policy. in section 5.2, we will introduce an expansion of the basic sir model in which time-varying control measures are allowed to enter. the flexibility to permit certain modifications is an important property to consider in an infectious disease model. in this field, all models need to be tailored as data increase and knowledge accumulates in the literature while a disease evolves over time. from this point of view, compartment-based models are superior to other models because, for example, it is easy to add other compartments, such as an exposure compartment, a quarantine compartment or a self-immunisation compartment, to improve the mechanistic model, to answer specific questions of practical importance and to capture distinctive data features for better prediction.
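as a toy illustration of such a sampling scheme (plain python, made-up latent curve; this is not the state-space model or mcmc machinery discussed in section 5), observed daily counts can be drawn as poisson variates around a latent epidemic trajectory, using knuth's multiplication method for small means:

```python
import math
import random

# latent mechanism vs. observed data: the latent curve below is an invented
# sequence of daily new infections; observations are poisson draws around it.

def rpois(lam, rng):
    """poisson sample via knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

rng = random.Random(42)                   # fixed seed for reproducibility
latent = [2, 5, 12, 20, 12, 5, 2]         # hypothetical latent daily counts
observed = [rpois(lam, rng) for lam in latent]
```

the statistical task is then the inverse of this simulation: given only `observed`, infer the latent dynamics and the mechanistic parameters driving them, with uncertainty that reflects the assumed sampling scheme.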
fourth, as the epidemic evolves further, surveillance data become abundant and have higher resolution. for example, in the usa, the numbers of confirmed symptomatic covid-19 cases and case fatalities are recorded for each county. the average county population size in the usa is approximately 98 000, so a microinfectious model may be built upon county-level surveillance data to make high-resolution predictions and to assess the effectiveness of control measures at the community level. this paper (section 6) will discuss this important extension of the classical sir model, essentially a temporal model, to a spatio-temporal model that enables borrowing of information across spatially correlated counties to improve risk prediction. this exemplary model generalisation sets up an illustration from a nation-level macromodel to a county-level micromodel. the latter is more relevant and useful for local governments making decisions on business reopenings and for residents wanting to be aware of local infection risk. last, to make research findings transparent and to place the resulting toolboxes into the hands of practitioners, an open-source software package must be a deliverable. this is indeed a rather demanding task, as ease of implementation and numerical stability affect the choice of statistical models and of statistical methods for estimation and prediction. note that not every statistical model permits delivery of a user-friendly computing package that is general and flexible enough to handle various types of data. in this paper, we focus on the markov chain monte carlo (mcmc) methods that have been developed in the literature to perform estimation and prediction for state-space models (section 5.3). we invite the readers on a journey through surveillance data, modelling, estimation and prediction, implementation and software development.
after reading this paper, one should be able to use existing compartment-based models or to expand them in a study of an infectious disease epidemic, to improve estimation and/or prediction methods, or to create one's own software. it is our hope that this paper may pave the path to learning, practising or developing new methodologies that are useful for a broader range of infectious disease modelling problems. multi-compartment models have been the workhorse for modelling infectious diseases since the early 20th century. they are a class of mathematical models used to describe the evolution of masses (in units of proportions or counts) among the compartments of a varying system, with broad use cases in epidemiology, physics, engineering and information science. this is a dynamic system that is typically represented by a system of ordinary differential equations (odes) with respect to time, and, given a starting condition, the mass in each of the components is regulated by a function over time. an ode is a simple mathematical model that depicts the trajectory of a functional trend. one such example used extensively in epidemiology is an exponential growth function, f(t) = e^{λt}, which may be viewed as the solution to an ode of the form df(t)/dt = λ f(t), or dy/dt = λy, where y is a function of time t; the solution is obviously y = f(t) = e^{λt}, with the initial condition f(0) = 1. it is worth pointing out that this simple ode explicitly characterises the rate of change (speed or velocity) of the function y = f(t), rather than directly specifying a form for the function f(t) itself. such a rate-based characterisation is termed 'dynamics' in the mathematical literature. clearly, this ode is not a statistical model, as it does not provide a law of data generation; in other words, there is no randomness in this ode to reflect sampling uncertainties. a typical multi-compartment model consists of several odes for a vector of rates that are linked to each other.
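as a quick numerical illustration of this simplest ode, the following python sketch (with an illustrative rate `lam = 0.3`, a stand-in symbol not taken from the text) compares a fine forward-euler grid against the closed-form exponential solution:

```python
# euler integration of df/dt = lam * f with f(0) = 1; the closed-form
# solution is f(t) = exp(lam * t). `lam` is an illustrative rate.
import math

lam, T, n = 0.3, 1.0, 10_000
dt = T / n                      # fine step so euler error stays small
f = 1.0
for _ in range(n):
    f += lam * f * dt           # euler step: f <- f + (df/dt) * dt

exact = math.exp(lam * T)       # closed-form value at t = T
```

with 10 000 steps the euler value agrees with the closed form to roughly five decimal places, which is the sense in which the ode "characterises the rate of change" while the exponential function is its trajectory.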
this is referred to as a dynamic system. the forms of the odes are specified according to relevant scientific knowledge about the underlying dynamic mechanism related to an infectious disease. in the context of infectious disease modelling, the sir model is the most basic three-compartment dynamic system that describes an epidemiological mechanism of disease evolution over time (see figure 1). in brief, the model describes the flow of infection states or conditions by (i) moving susceptible individuals to the infectious compartment through a transmission process (the first arrow) and (ii) moving infectious individuals to the removed compartment (either dead or recovered) through a removal process (the second arrow). at a given time, the total population N under study is partitioned into the three compartments, denoted by S, I and R, whose sizes satisfy S + I + R = N. with a slight abuse of notation, each symbol denotes either the type of compartment or the size of the compartment, whichever is applicable in a given context. in other words, S, I and R are used to denote the sizes of the mutually exclusive subpopulations of susceptible, infectious and removed individuals, respectively. this compositional constraint, that is, S + I + R = N, may be interpreted in terms of probability (or proportion) as follows: at a given time, an individual in the population is either at risk (susceptible), or under infection by a virus (infectious), or removed from the infectious system due to recovery or death; that is, θ_S + θ_I + θ_R = 1, where θ_S, θ_I and θ_R are, respectively, the probabilities of being susceptible, infectious and removed. this presents the primary constraint for a multi-compartment infectious disease model. more details of the sir model will be described in section 2. oftentimes, the interest in such a system lies in the function values over time, but closed-form analytical solutions for such functions may not exist.
for example, to answer the question of how many individuals will be infected with covid-19 by the end of the year 2020 (or any future time) requires a calculator that computes the cumulative numbers of susceptible, infected and removed cases over time, from the past to the future. unfortunately, in reality, functions relevant to this calculator are usually non-linear, and their exact forms are difficult to specify directly. in contrast, a set of odes helps better understand the disease transmission dynamics (i.e. the traits of infectious diseases) and more conveniently captures their key features, where each ode may correspond to one mode of disease evolution. such odes for disease spread may be regarded as a model for the expected dynamic mechanism, serving as the systematic component in a statistical model. numerical methods such as the euler discretisation method or the runge-kutta approximation method (stoer & bulirsch, 2013; butcher, 2016) can be used to obtain approximate solutions of such odes with given boundary conditions. regardless of the method used, solutions to a dynamic system are deterministic functions. we illustrate a basic mechanistic model of disease spread in the succeeding text. additional reviews of multi-compartment models from deterministic and mathematical perspectives are given by anderson et al. (1992) and hethcote (2000). example 1. consider the sir model for a hypothetical population with a constant population of N = 100 residents and an initial condition of 99 susceptible individuals, 1 infectious individual and 0 removed individuals (either died or recovered). here the 100 subjects may also be regarded as 100% if the unit of proportion is used in the interpretation. the transitions between compartments, written as odes in (1), represent population movement from one compartment to another (see figure 1). we consider an example with β = 0.5 (the rate of moving from S to I) and γ = 0.2 (the rate of moving from I to R), leading to R0 = β/γ = 2.5.
here R0 is the so-called basic reproduction number, which quantifies the average number of susceptible individuals contracting the virus from one contagious person in an environment with no preventive measures. this is a quite infectious scenario, as we will see later. the r script in the succeeding text shows a scenario of obtaining the solution to the system of odes by standard ode solvers (r package desolve) using the first-order euler method (not shown) or the runge-kutta fourth-order (rk4) approximation method (figure 2). details about the rk4 method can be found in appendix a0.1. as shown in figure 2, on each of these 100 days, the sum of the three values from the three curves is always equal to 100, presenting a time-varying redistribution of the 100 individuals. with no control measures in this hypothetical infection dynamics, the susceptible compartment quickly drops and reaches an equilibrium state after 35 days of the outbreak, and during the period of the first 35 days, the infectious compartment increases to a peak and then decreases to zero (no contagious individuals in the population) as all currently infected individuals move to the removed compartment, which is the exit of the system. despite relying on a valid infectious disease mechanism, deterministic approaches have several drawbacks: (i) the actual population in each compartment at a given time is never accurately measured, because we only obtain an observation around the mean; (ii) the nature of disease transmission and recovery is stochastic on the individual level and thus never certain; and (iii) without a random component in the model, it is possible neither to learn model parameters (e.g. R0) from available data nor to assess prediction uncertainty. the latter is of critical importance given the many unobserved and uncontrolled factors in surveillance data collection.
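the paper's solver is written in r with the desolve package; a python sketch of the same example 1 computation, using scipy's runge-kutta solver (RK45 here, standing in for the paper's rk4), looks as follows:

```python
# sir system of example 1: N = 100, S(0) = 99, I(0) = 1, R(0) = 0,
# beta = 0.5, gamma = 0.2 (so R0 = 2.5), solved over 100 days.
import numpy as np
from scipy.integrate import solve_ivp

N = 100.0
beta, gamma = 0.5, 0.2

def sir(t, y):
    S, I, R = y
    dS = -beta * S * I / N            # susceptibles lost to infection
    dI = beta * S * I / N - gamma * I # new infections minus removals
    dR = gamma * I                    # removals accumulate
    return [dS, dI, dR]

sol = solve_ivp(sir, (0, 100), [99.0, 1.0, 0.0],
                t_eval=np.arange(0, 101), rtol=1e-8)
S, I, R = sol.y
```

the daily values reproduce the qualitative picture described above: the three curves always sum to 100, S(t) only falls, and I(t) rises to a peak within the first 35 days before decaying towards zero.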
in an early stage of the current covid-19 pandemic, the daily infection and death counts reported by health agencies are highly influenced by the availability of testing kits, reporting delays, reporting and attribution schemes, and under-ascertainment of mild cases in public health surveillance databases (see discussions in angelopoulos et al., 2020; banerjee et al., 2020); both the disease transmission rate and the time to recovery or death are also highly uncertain and vary by population density, demographic composition, regional contact network structure and non-uniform mitigation schemes (ray et al., 2020). hence, statistical extensions are necessary to incorporate sampling uncertainty into estimation and inference for infectious disease models. the main focus of this paper will be given to a statistical modelling framework based on a class of state-space models, in which the systematic component is specified by multi-compartment infectious disease models while the random component is governed by a certain sampling distribution of surveillance data. note that multi-compartment infectious disease models present a class of classical mechanistic models widely used in practice and that incorporating certain sampling distributions allows for statistical estimation, inference and prediction with quantification of uncertainties. we organise the paper as follows. in the first part of the paper, we introduce a class of macromodels. we begin with the most basic sir mechanistic model in detail, followed by some important extensions used to address representative scenarios of disease spread and infection evolution. examples include the seir model, with an additional exposure compartment accounting for a potential incubation period of infection, and the susceptible-antibody-infectious-removed (sair) model, with an additional antibody compartment accounting for potential self-immunisation after being infected.
then, we formally introduce the framework of state-space models, a powerful statistical modelling approach that aims to model available surveillance data from public health databases with the utility of the underlying latent mechanistic model. in the second part of the paper, we introduce a class of micromodels. when an epidemic continues, data become abundant and of high resolution at the community level. for example, the surveillance data of the covid-19 pandemic in the usa are collected from individual counties. this allows building county-level microinfectious models in addition to country-level or state-level macromodels. as a form of subgroup analysis, such micromodelling is appealing for addressing spatial heterogeneity across the more than 3 000 counties in the usa and consequently improves prediction accuracy. as far as the spatial modelling of infection dynamics is concerned, we review the classical cellular automaton (ca), which is extensively used to describe person-to-person interaction rules associated with epidemic spreading patterns in a population via relevant interlocation connectivity functions. this ca may vary spatially and temporally, which presents a principled way to extend a state-level macroinfectious disease model to a stratified microinfectious model. in addition to the case of geographical subgroups, other types of subgroups defined by, for example, age, race, income, political party and economy are also of interest. our main objective in this paper is to introduce to readers the basics of infectious disease models, the underlying modelling assumptions, statistical analyses and possible extensions. examples will be provided for demonstration purposes. this review targets readers who have had some statistical training but no prior experience in infectious disease modelling. the first infectious disease model (mckendrick, 1925; kermack & mckendrick, 1927) is widely known as the susceptible-infectious-removed model, or in short the sir model (see figure 1).
it is a three-compartment model for studying how infectious diseases evolve over time at the population level. it defines a mechanism of disease transmission and recovery for a population at risk by a dynamic system of three disjoint states: susceptible, infectious and removed. we note an important distinction between infectious and infected individuals. infectious individuals are those who are currently infected and not yet recovered or dead (currently infected individuals become infectious immediately in the sir model, although this may not be true in reality; see the seir model in section 3, where currently infected individuals become infectious with a delay in time), whereas 'infected' individuals could mean either only currently infected individuals or both currently and previously infected ones. for clarity, we will refer to the currently infected as infectious, so that the three states in the sir model are mutually exclusive. individuals in the susceptible state are not immunised and can become infected by coming into contact with infectious cases, so they are at risk at a given time. individuals in the infectious state contribute to the transmission of the disease until they ultimately recover or die, so they are contagious. individuals in the removed state include those who either recover or die (without distinction). this is an exit from the infection system, meaning that once an individual leaves the system (recovers or dies), he or she would never return to it. this is true for people who die from the virus but may not be the case for recovered individuals. thus, in the sir model, there is a technical assumption that a recovered individual becomes self-immunised to the virus and no longer impacts the disease transmission. a possible way to relax this assumption is to create two separate compartments corresponding to the recovery and death states, respectively, leading to a four-compartment infectious disease model.
to make our presentation focused on the basic three-compartment model, we make this self-immunisation assumption in this section. given what we said earlier, the current version of sir is only applicable to diseases where long-term immunity can be developed, and does not apply to recurring infectious diseases, such as the common cold. this is because the disease transmission rate is set as a constant in sir. in this section, we introduce the sir model in its basic deterministic form (section 2.1), define reproduction numbers (section 2.2), elaborate on its assumptions (section 2.3) and properties (section 2.4) and present some technical extensions to the basic sir model. mechanistic extensions, such as modifications to the three-compartment sir model to account for additional components or disease mechanisms, are discussed in section 3. we use S(t), I(t) and R(t) to denote the time-course subpopulation sizes (i.e. the numbers of individuals) distributed into each of the three compartments at a given time t, where t is continuous. clearly, S(t) + I(t) + R(t) = N, where N is the total population size, which is a fixed constant. the starting time is denoted as t = 0. the rates of change among these subpopulations are represented by a system of odes:

dS(t)/dt = -β S(t) I(t)/N,
dI(t)/dt = β S(t) I(t)/N - γ I(t),   (1)
dR(t)/dt = γ I(t).

in (1), these three odes define a dynamic system of three deterministic functional trajectories over time, including the susceptible trajectory S(t), the infectious trajectory I(t) and the removed trajectory R(t) for t ≥ 0. this sir dynamic system is well posed in the sense that non-negative initial conditions lead to non-negative solutions of the three functional trajectories. these trajectories collectively demonstrate the evolutionary mechanism of an infectious disease. the sir dynamic system in (1) may be interpreted as follows. let us consider events occurring instantaneously at time t.
in the first ode, the ratio I(t)/N represents the proportion of contagious individuals in the population, which may be thought of as the chance that a person in the at-risk population runs into a virus carrier. if each individual at risk has an independent chance to meet a contagious person, then, according to the binomial distribution, the expected number of susceptible individuals contracting the virus is S(t)I(t)/N. in reality, a person at risk may run into β (say, 2) contagious individuals, leading to a modified chance βI(t)/N. thus, instantaneously at time t, the system gains an additional number of infected cases equal to βS(t)I(t)/N, and these cases leave the susceptible compartment to enter the infectious compartment. such a loss to S(t) is reflected by the negative sign in the first equation. in the second ode, the first term is the number of new arrivals of contagious individuals, and the second term is the loss of contagious individuals, at a rate γ, who either recover or die and then enter the removed compartment. the third ode describes an absorbing compartment that always accumulates new arrivals and has no departures. in the literature, the transition rate γ represents the fraction of the infectious population that exits the infectious system per unit time. for example, γ = 0.2 means that the infectious compartment will decay (i.e. infectious individuals recover or die) at an average rate of 20% per unit time. in other words, 1/γ describes the expected duration (5 days for γ = 0.2) over which an individual stays infectious, under an exponential distribution for the time of his or her sojourn. variations of the form in (1) are often seen in the literature. among those, the most important sir specification is given as follows.
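the bookkeeping in the three odes can be sketched as a single crude forward-euler day-step (values from example 1; a real solver would use a finer scheme such as rk4):

```python
# one euler day-step of the sir flows: beta*S*I/N cases leave S for I,
# gamma*I cases leave I for R. values are from example 1 (N = 100,
# beta = 0.5, gamma = 0.2); dt = 1 day is deliberately coarse.
N, beta, gamma, dt = 100.0, 0.5, 0.2, 1.0
S, I, R = 99.0, 1.0, 0.0

new_infections = beta * S * I / N * dt   # flow S -> I (0.495 here)
new_removals = gamma * I * dt            # flow I -> R (0.2 here)
S, I, R = (S - new_infections,
           I + new_infections - new_removals,
           R + new_removals)

mean_infectious_days = 1.0 / gamma       # expected sojourn in I: 5 days
```

note that the two flows are computed from the same pre-step values before any compartment is updated, which is exactly why the three updates conserve the total S + I + R = N.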
because the total population N remains constant over the duration of infection, dividing both sides of the ordinary differential equations by N yields the rates of change in terms of population proportions, without changing the interpretation of β and γ. that is,

dθ_S(t)/dt = -β θ_S(t) θ_I(t),
dθ_I(t)/dt = β θ_S(t) θ_I(t) - γ θ_I(t),   (2)
dθ_R(t)/dt = γ θ_I(t),

where θ_S(t), θ_I(t) and θ_R(t) are the probabilities (or proportions) of being susceptible, infectious and removed at time t, respectively. here the probability of being infectious, θ_I(t), is also known as the prevalence of disease in the epidemiology literature (see, e.g. osthus et al., 2017; wang et al., 2020). a clear advantage of this alternative form of the sir model (2) is that the population size N is implicitly absorbed into the disease transmission rate β, which may be interpreted as a per capita effective contact rate in proportion to the population (see, e.g. johnson & mcquarrie, 2009). despite the differences in notations and presentations, the two forms convey the same infection mechanism, but interpretations need to be given accordingly. although we use these model specifications interchangeably in this paper, the form given in (2) is recommended for conducting practical studies. based on the two parameters β and γ in an sir model, the ratio R0 = β/γ is termed the basic reproduction number, which captures the expected number of new individuals who directly contract the virus from one contagious individual in an environment with no preventive measures. intuitively, it is the product of the infection rate β and the infectious duration 1/γ. the basic reproduction number R0 does not depend on the distribution of people over the three compartments and presents a key disease characteristic for describing and comparing across infectious diseases (see, e.g. chowell et al., 2004; ferguson et al., 2006; khan et al., 2015; liu et al., 2020). an epidemic is expected to occur when R0 > 1, or to disappear when R0 < 1.
this is because in the sir model (1), under the condition S(t)/N ≈ 1, the former is equivalent to β > γ, leading to dI(t)/dt ≈ (β - γ)I(t) > 0, while the latter implies dI(t)/dt < 0. the earlier interpretation of R0 relies on an implicit assumption that all contacts of a contagious individual are susceptible, which contrasts with the effective reproductive number. the effective reproductive number is defined as R_e(t) = R0 S(t)/N. it represents the expected number of newly infected individuals who contract the virus directly from a contagious individual at time t, given that each susceptible individual has a chance of S(t)/N to meet this contagious individual. this is not to be confused with the notation R(t), the removed population. in the early outbreak of an infectious disease in a large population, R_e(t) ≈ R0 because S(t)/N ≈ 1. in contrast to R0, which is only descriptive of the disease itself (or the progression of disease near time 0), R_e(t) reflects the progression of the infectious disease in a population at any given time because it directs the sign of dI(t)/dt, corresponding to acceleration or deceleration of the infection dynamics. this may be seen from the second-order derivative d²I(t)/dt²; a time, say t*, at which d²I(t*)/dt² = 0, or at which the rate dI(t*)/dt reaches a peak, is referred to as a turning point (see the peak in the middle panel of figure 2). hence, R0 is of most interest during the early phase of an epidemic, whereas R_e(t) is of most interest later on during the controlling phases of an epidemic. for example, the so-called herd immunity is the natural immunity developed when an epidemic reaches R_e(t) < 1. in other words, without interventions, it requires the proportion of susceptible individuals to be no more than 1/R0, or the combined proportion of infectious and recovered to be at least 1 - 1/R0, in order to contain the spread. as another example, if an effective vaccine becomes available at time t̃ > 0, knowing R_e(t̃) allows us to estimate the remaining proportion of the population that needs to be vaccinated in order to control the epidemic (i.e. for achieving R_e(t) < 1). figure 3 shows that the effective reproductive number R_e(t) for example 1 decreases as the group of susceptible individuals, S(t), shrinks over time, eventually reaching below the threshold of 1 at time 19. the value at time 0 is R0 = R_e(0) = 2.5, while R_e(19) = 1. the time of reaching this threshold also marks a special time of interest: when the number of active contagious individuals starts decreasing, at time 19, after reaching its maximum, as shown in the middle panel of figure 2. like every mathematical model, the sir model has some assumptions and constraints, such as boundary conditions, that it needs to satisfy. these restrictions define the circumstances where the sir model may be appropriate to use in practice. although some of them have been mentioned earlier, for the sake of a self-contained summary, we list all key assumptions as follows. assumption 1: the population involved in the infection is closed, with no additions or leakage of individuals, and the size of the population is fixed, say, N. this assumption may be satisfied by an epidemic that is rapid and short lived, during which disease evolution is not affected or is minimally affected by vital changes (e.g. natural births or deaths) and migration (i.e. immigration and emigration). technically speaking, the three compartments satisfy the condition S(t) + I(t) + R(t) = N for all t ≥ 0. assumption 2: individuals in the population meet each other randomly, in that both the probability and the degree of interactions with one another remain constant over time, regardless of geographical and demographic factors. this is a strong assumption of homogeneity for the sir dynamic system, which is governed by the same transmission and recovery parameters β and γ. in practice, such a homogeneity assumption may be easily violated.
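the trajectory of the effective reproductive number can be sketched numerically; the following python fragment (a crude euler simulation of example 1, not the paper's r code) tracks R_e(t) = R0 · S(t)/N and locates its first crossing below the threshold 1, which coincides with the peak of I(t):

```python
# effective reproduction number along a simulated sir path of example 1.
# euler with a small step (dt = 0.01 day) approximates the true solution;
# the crossing of R_e below 1 is near day 19 in the paper's figure 3.
import numpy as np

N, beta, gamma = 100.0, 0.5, 0.2
R0 = beta / gamma                       # 2.5
S, I, dt = 99.0, 1.0, 0.01
S_path = [S]
for _ in range(int(100 / dt + 0.5)):    # simulate 100 days
    new_inf = beta * S * I / N * dt
    S, I = S - new_inf, I + new_inf - gamma * I * dt
    S_path.append(S)

Re = R0 * np.array(S_path) / N          # R_e(t) = R0 * S(t)/N
cross_day = float(np.argmax(Re < 1.0)) * dt
```

note that R_e(0) = 2.5 × 99/100 = 2.475 ≈ R0, illustrating the approximation R_e(t) ≈ R0 when S(t)/N ≈ 1 early in the outbreak.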
thus, modelling with heterogeneous dynamics of infection is an important and active research area in the literature on infectious diseases. assumption 3: a susceptible individual can only develop immunity (or self-immunisation with antibody against the virus) through infection (i.e. no vaccination). in other words, as shown in figure 1, the infectious compartment is the only exit from the susceptible compartment, and there is no other state to which an at-risk individual would move next. once recovered from infection, one becomes immune to the virus for the remainder of the study period and would not return to being susceptible again. in effect, this is a rigorous definition of a recovered case in the sir model. from the graphical representation in figure 1, this implies that there is no connection from the removed compartment to the susceptible compartment, or, in other words, the removed compartment is the terminal state of the infection dynamics. it is worth pointing out that, to date, the validity of this assumption for the covid-19 pandemic remains unknown. in the literature, this condition is assumed for a certain period of time over which risk prediction is considered. assumption 4: the infection has zero latent period, in that one becomes infectious once exposed. this is a key distinction of the sir model from the seir model. like many infectious diseases, covid-19 has a reported average incubation period of between 4 and 7 days (li et al., 2020; pan et al., 2020), which adds some additional complexity to the modelling of infectious disease dynamics. as a matter of fact, this latency of contagion concerns the timing of becoming contagious and not that of becoming symptomatic. some studies have found that covid-19 carriers are most contagious in the early phase of illness, prior to the occurrence of noticeable clinical symptoms (ip et al., 2017; he et al., 2020).
given these findings, it is tricky to see how a compartment of exposure for incubation would be added to extend the sir model for the covid-19 pandemic. assumption 5: because the sir model has constant transmission and recovery parameters β and γ, which are not time varying, the underlying infection is assumed to evolve in fully neutral environments with no mitigation efforts via external interventions such as a public health policy of social distancing, effective medication or fast testing kits for diagnosis. as far as the covid-19 pandemic is concerned, this is the biggest restriction of the sir model, which is not reflective of reality: almost all countries with reported covid-19 cases have issued various non-pharmacological control measures. many researchers have proposed solutions to overcome this unrealistic assumption of the sir model in the analysis of covid-19 data (see, e.g. wang et al., 2020). assumption 6: the population size N is large enough to yield a sufficient number of incidences, including the number of infections, the number of deaths and the number of recovered cases, so that the sir model parameters can be stably estimated with high precision. technically speaking, this is not a model assumption but a condition on the sample size for statistical power. because this mechanistic model will ultimately be used for risk projection, a well-trained model with reliable data is necessary not only to produce an accurate prediction but also to adequately assess the prediction uncertainty. although these six assumptions specifically concern the sir model, most of these discussions and the associated insights are useful for understanding the restrictions of the sir model extensions that will be presented in the remaining sections. knowing possible violations of a certain restriction on a multi-compartment model in data analyses gives rise to potential new research problems for further investigation.
to further understand the mechanism of infection governed by the sir model, we now give a brief summary of its analytic properties, which provide useful guidelines for building statistical models and methods to learn the sir model from available surveillance data in public health databases. property 1: the sir model may be specified equivalently in terms of counts, as in (1), or in terms of proportions, as in (2). more importantly, although the dynamic system defined by the sir model is continuous over time, available surveillance data are reported as discretised measurements at discrete time points. for example, most of the covid-19 public databases update data on a daily basis, in which 'a day' is the unit of time for measurement. knowing this discrepancy between the continuous-time underlying mechanistic model and the sampling frequency at discrete times for available data is essential to create a statistical framework linking the sir model with the data at hand. property 2: the sir model is deterministic and does not contain any probabilistic components. it is noteworthy that dynamics and stochasticity are two different mathematical properties; a dynamic system (e.g. the sir model) is not necessarily stochastic, while a stochastic system is not necessarily dynamic. as shown in figure 2, the compartment sizes S(t), I(t) and R(t) are time-varying functions with no random fluctuations, which are completely determined by the model parameters and the initial conditions of the sir model. obviously, this is a limitation of the sir model when it is applied for data analysis, where data collection is subject to profuse uncertainties and random errors. property 3: it is easy to show that the number of individuals at risk (at the entry of the system), S(t), is monotonically non-increasing and that the number of removed cases (at the exit of the system), R(t), is monotonically non-decreasing (see figure 2). hence, the total number of individuals who have been exposed to the virus is equal to N - S(t) = I(t) + R(t), which is monotonically non-decreasing.
I(t), the number of active contagious cases, or the difference between the two groups of exposed cases and recovered cases, can be either increasing or decreasing. the middle panel of figure 2 nicely conveys such directionality of movement, in which the time of I(t) reaching its peak and the time of I(t) reducing to zero are two important turning points of interest in epidemiology. the former indicates the turning point of disease mitigation, and the latter corresponds to the turning point of disease containment. property 4: it can be shown that I(∞) = 0 (or equivalently, θ_I(∞) = 0), meaning that the disease will eventually die out. this is because, as t → ∞, the rate of change of the prevalence θ_I(t), given by (β θ_S(t) - γ) θ_I(t) in (2), will become negative at a certain time and remain negative, driving the prevalence down until it converges to zero, because θ_S(t) is a decreasing function and θ_I(t) is bounded below by zero. however, this property of decaying to zero is conditional on the assumptions listed earlier. violations of assumptions 1 and 3 are most likely to cause a disease to persist, because the monotonicity of S(t) used in the earlier argument is no longer valid. an example of such diseases is seasonal influenza, where immunity does not last long. property 5: the sir model has a recursive property in that, at any given time, disease progression (i.e. the shapes of the three functions) depends only on the current values and not on other information from the past. this property of recursion should not be confused with the markov property, which has been used exclusively in the literature on stochastic processes under the conditional probability law. here there is no probability law involved in the recursive operation, which is indeed a fully deterministic recursion. this conceptual distinction may help in understanding the differences between dynamics and stochasticity.
during an epidemic, various control measures are typically issued by governments to mitigate or contain the spread of the disease. a direct impact of these external interventions is that both the transmission and recovery rates are no longer constant over time. thus, an important generalisation of the sir model is to accommodate different degrees of mitigation policies, including social distancing, limits on transportation, mandatory mask wearing and city lockdowns. as observed in the ongoing covid-19 pandemic, mitigation strategies change over time. limiting the mobility of susceptible individuals and medically isolating contagious individuals in the population would reduce the rate of contracting the virus, leading to a decreasing disease transmission rate β(t). at the same time, gaining better knowledge of both treatment and self-management of symptoms and improving medical resources may increase the rate of recovery γ(t) over the course of an epidemic. incorporating time-varying parameters leads to an important extension of the basic sir model (1):

dS(t)/dt = -β(t) S(t) I(t)/N,
dI(t)/dt = β(t) S(t) I(t)/N - γ(t) I(t),   (3)
dR(t)/dt = γ(t) I(t).

the form of β(t) can be specified mainly in two ways. one is to let β(t) be either a parametric function (e.g. an exponentially decaying function) or a non-parametric function (smirnova et al., 2019; sun et al., 2020), both of which may be estimated from available data. one useful feature of a parametric function β(t) is the ability to incorporate seasonality in the transmission rate. it is well known that many infectious diseases spread most quickly in some of the winter months. in particular, respiratory infectious diseases caused by some coronaviruses exhibit seasonal behaviours that are consistent with the trends of temperature and humidity (barreca & shimshack, 2012; sajadi et al., 2020). accounting for such seasonal periodicity in the model would produce better long-term predictions of an epidemic.
As public attention to COVID-19 pandemic projections gradually shifts from the short term to the long term, it becomes increasingly important to take seasonality into account. Following Dietz (1976), a simple way to introduce seasonality is to assume that the transmission rate fluctuates over the period of a year: β(t) = β₀{1 + κ cos(2π(t − φ)/365)}, t = 1, …, 365, where β₀ is the average contact rate, κ ∈ [0, 1] is the degree of seasonality, with κ = 0 reducing the model to the basic SIR model, and φ ∈ [0, 365) is the offset in the time horizon, so that peak transmission occurs at t = φ. Other periodic functions, or combinations of them, can also be used to model seasonality. As an alternative to a fully non-parametric function, Wang et al. (2020) assume the form β(t) = βπ(t), 0 < π(t) ≤ 1, where π(t) is a known function specified according to the control measures in place. This specification makes it possible to assess the effectiveness of a target preventive measure, as well as to compare different preventive strategies. Clearly, the model with π(t) ≡ 1 represents disease progression in the absence of any mitigation effort, which sets the baseline for policy assessment and comparison. The flexibility in specifying π(t) allows easy incorporation of future business reopening events; for example, in the COVID-19 pandemic, this function may be specified as a U-shaped curve in which control measures (e.g. social distancing) gradually relax after a certain time point (see Wang et al., 2020, for more details and some numerical results of the COVID-19 data analysis). More discussion of the time-varying transmission rate is given in Section 5.5. The assumption of a fixed population size is restrictive, especially when an epidemic lasts for a long period of time before it is contained. In this setting, natural birth and death dynamics must be included to adequately characterise the time-varying size of each compartment in the SIR model.
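The seasonal transmission rate above is easy to sketch numerically. The paper's own examples use R; the following is a minimal Python sketch, with all parameter values (β₀ = 0.3, κ = 0.4, φ = 20) chosen purely for illustration:

```python
import math

def beta_seasonal(t, beta0=0.3, kappa=0.4, phi=20):
    """Seasonal transmission rate beta(t) = beta0 * {1 + kappa*cos(2*pi*(t - phi)/365)}.

    kappa = 0 recovers the constant-rate basic SIR model; the peak occurs at t = phi.
    """
    return beta0 * (1.0 + kappa * math.cos(2.0 * math.pi * (t - phi) / 365.0))
```

At t = φ the rate attains its maximum β₀(1 + κ); with κ = 0 it reduces to the constant β₀, exactly as described in the text.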
First, let λ be the natural birth rate and μ the natural death rate. The population size then changes according to the ODE dN(t)/dt = (λ − μ)N(t). In this case, there are three exits for natural deaths, one at each compartment. An extension of the basic SIR model is given as follows: dS(t)/dt = λN(t) − βS(t)I(t)/N(t) − μS(t), dI(t)/dt = βS(t)I(t)/N(t) − γI(t) − μI(t), dR(t)/dt = γI(t) − μR(t). Summing the three equations gives dN(t)/dt = (λ − μ)N(t), as desired. Note that when model (2) is used, N(t) is automatically absorbed into the proportions and thus no longer appears in the model formulation. In this section, we review several four-compartment mechanistic models that extend the basic SIR model introduced in Section 2. Being a simple mechanistic model with three compartments, the SIR model has some limitations in real-world applications, so extensions of this basic type that account for different disease mechanisms and assumptions have been widely considered in the literature. The commonly studied SEIR model takes an incubation period into account by adding an exposed compartment between the susceptible and infectious compartments (see Figure 4). The underlying assumption is that individuals in this exposed subpopulation have contracted the virus but are not yet contagious, and are bound to become contagious. In the current literature, most infectious diseases that are suitable for the SIR model are believed to fit the SEIR model as well. The exposed compartment may be regarded as a waiting room for virus carriers who are about to spread the virus in the population. Let δ be the rate at which an exposed individual becomes contagious. Then the basic SIR model can be extended to a four-compartment model consisting of the following four ODEs: dS(t)/dt = −βS(t)I(t)/N, dE(t)/dt = βS(t)I(t)/N − δE(t), dI(t)/dt = δE(t) − γI(t), dR(t)/dt = γI(t), where E(t) is the size of the exposed compartment at time t. In this case, the compositional constraint becomes S(t) + E(t) + I(t) + R(t) = N, which is clearly satisfied by the SEIR dynamic system defined in (4). Let θ_E(t) be the probability (or proportion) of being exposed to the virus. Then the rates-based SIR model (2) can be similarly extended from model (4).
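A minimal numerical sketch of the SEIR system (4) follows. The paper's examples are in R; here we use Python with a simple forward-Euler integrator (rather than the RK4 solver mentioned later in the text), and all parameter values are illustrative:

```python
def seir_step(state, beta, gamma, delta, N, dt=0.1):
    """One forward-Euler step of the SEIR system (4)."""
    S, E, I, R = state
    new_inf = beta * S * I / N          # force of infection
    return (S + dt * (-new_inf),
            E + dt * (new_inf - delta * E),
            I + dt * (delta * E - gamma * I),
            R + dt * gamma * I)

def simulate_seir(N=10000, I0=1, beta=0.5, gamma=0.2, delta=0.25,
                  days=160, dt=0.1):
    """Integrate the SEIR ODEs and return the trajectory of (S, E, I, R)."""
    state = (float(N - I0), 0.0, float(I0), 0.0)
    traj = [state]
    for _ in range(int(days / dt)):
        state = seir_step(state, beta, gamma, delta, N, dt)
        traj.append(state)
    return traj
```

Because the four derivatives sum to zero, the compositional constraint S(t) + E(t) + I(t) + R(t) = N is preserved (up to floating-point error) at every step, mirroring the constraint stated above.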
Technically, the SEIR model often suffers from parameter identifiability issues, because determining the correct incubation period of an infectious disease, and thus the parameter δ, is rather difficult in practice. First, the incubation period varies from person to person; in the case of COVID-19, it ranges from 0 to 15 days, with a median of 5.1 days (Lauer et al., 2020). In another study of COVID-19 patients in China, Guan et al. (2020) reported an estimated incubation period between 0 and 24 days, with a median of 3 days. This quantity is clearly very person dependent. Second, ascertainment of contagion may be greatly delayed by shortages of virus-testing resources; this length-biased sampling problem is notoriously challenging for the estimation of the incubation period (Qin et al., 2020). Third, researchers have found (e.g. He et al., 2020) that COVID-19 carriers tend to be more contagious right after contracting the coronavirus than a week later, because they are not self-quarantined in the absence of clinical symptoms. In other words, in the case of COVID-19, the incubation period (the sojourn in the exposed state) is too short to play a substantial role in modelling the pandemic. Not all infectious diseases confer long-term immunity. Individuals may develop immunity after recovery only for some time and can then lose it, becoming susceptible again; recovered individuals thus rejoin the susceptible compartment after a certain duration of immunity. This disease evolution is intuitively called the susceptible-exposed-infectious-removed-susceptible (SEIRS) model. We assume no deaths in the removed compartment (see Figure 5, where the recovered branch of the removed compartment is connected back to the susceptible compartment). The common cold is an example of a disease studied with this model.
The SEIRS model is defined as follows: dS(t)/dt = −βS(t)I(t)/N + ωR(t), dE(t)/dt = βS(t)I(t)/N − δE(t), dI(t)/dt = δE(t) − γI(t), dR(t)/dt = γI(t) − ωR(t), where ω is the rate of losing immunity and becoming susceptible again after recovery. In contrast to the SEIRS model, for some infectious diseases individuals who survive their infection acquire long-term immunity. To build this self-immunisation into the infection dynamics, an antibody (A) compartment is introduced into the SIR paradigm, shown in the bottom thread of Figure 6. Because individuals who enter the antibody compartment are no longer at risk of infection for a certain period of time, this compartment is in effect an exit compartment, at least over the time window within which immunity is active, in addition to the removed compartment. In some infectious diseases, such as COVID-19, the subpopulation of self-immunised individuals is not directly observed or clinically confirmed by viral RT-PCR diagnostic tests, because of mild or absent clinical symptoms; such individuals are self-cured at home with no clinical visits. Adding this compartment to the model can greatly mitigate the under-reporting of the actual number of infected cases in the population. This dynamic system consists of four compartments, namely susceptible, self-immunised, infectious and removed, with the following ODEs: dS(t)/dt = −βS(t)I(t)/N − αS(t), dA(t)/dt = αS(t), dI(t)/dt = βS(t)I(t)/N − γI(t), dR(t)/dt = γI(t), where α is the rate of self-immunisation, which is not identifiable because of the lack of observed data. One approach to estimating the rate parameter α is to collect data from antibody serological surveys of the population; we refer the reader to the literature for more discussion. This section mainly focuses on statistical models for analysing surveillance data from an epidemic. Each statistical model consists of two components: a systematic component and a random component. In the context of infectious disease data analysis, the former may be specified by a dynamic infectious disease model from Sections 2 and 3.
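The SAIR dynamics can also be sketched numerically. Note that the displayed equations in the source were garbled, so the S-to-A flow at rate α below follows our reconstruction above; this is a Python sketch with illustrative parameter values, not the paper's own code:

```python
def sair_step(state, beta, gamma, alpha, N, dt):
    """One forward-Euler step of the four-compartment SAIR system."""
    S, A, I, R = state
    new_inf = beta * S * I / N           # force of infection
    return (S + dt * (-new_inf - alpha * S),
            A + dt * alpha * S,          # self-immunised exit compartment
            I + dt * (new_inf - gamma * I),
            R + dt * gamma * I)

def simulate_sair(N=10000, I0=10, beta=0.5, gamma=0.2, alpha=0.01,
                  days=100, dt=0.1):
    """Integrate the SAIR ODEs; returns the final (S, A, I, R) state."""
    state = (float(N - I0), 0.0, float(I0), 0.0)
    for _ in range(int(days / dt)):
        state = sair_step(state, beta, gamma, alpha, N, dt)
    return state
```

As with the SEIR sketch, the four derivatives sum to zero, so the total population is conserved; the A compartment accumulates the self-immunised individuals who would be invisible to case surveillance.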
The latter is built upon a random sampling scheme that enables a stochastic extension of the mechanistic model (e.g. the SIR model) given in the systematic component. Essentially, the notions of disease transmission, recovery and other characteristics are used to define key population attributes, or parameters, in an infection dynamic system of interest, which are then estimated from available data via a statistical modelling framework, possibly incorporating covariates to learn subgroup-specific risk profiles. A clear advantage of statistical and stochastic extensions is the ability to quantify uncertainty in both estimation and prediction in connection with sampling variability. This added uncertainty is crucial for policymaking: models then generate not only an average estimate or prediction but also the best and worst possible scenarios, allowing a more robust and confident handling of epidemics, given that surveillance data are subject to various data-collection issues. An example presented in Britton (2010) vividly shows the uncertainty in the progression of an infectious disease. Consider patient zero, who will go on to infect on average R₀ other individuals, as defined by a certain disease mechanism. The number of individuals who contract the virus from this patient is in fact stochastic, varying around the expected number of infections R₀, and can be described by a distribution (e.g. Poisson or negative binomial) with mean R₀ supported on the non-negative integers. Because the variability of human activity gives this distribution non-zero mass at zero, there is a non-negligible chance that an epidemic is completely averted; the opposite is an outbreak, occurring with non-zero probability, that infects tens of thousands of people. Without modelling such uncertainty, we cannot see all these possibilities and the likelihoods of their occurrence during the course of an epidemic (Roberts et al., 2015).
Infectious disease systems governed by multi-compartment models, though describing the population average, are useful for describing individual-based stochastic processes once suitable random components are introduced into the modelling framework; the resulting statistical models offer more natural approaches to the analysis of surveillance infectious disease data. Before introducing the statistical methodologies commonly used for parameter estimation, we divide model parameters into two categories: those that can be determined a priori with no need for estimation, which we term hyperparameters, and those that cannot be fully determined and must be estimated from the data at hand, which we term target parameters. Which parameters are treated as targets versus hyperparameters varies widely across methods. Intuitively, the more we know about the biological characteristics of a disease, the more parameters can be held fixed a priori in the analysis. It is, however, very difficult to determine most of the model parameters early in an outbreak, because knowledge and data about the disease are limited; indeed, many model parameters are not identifiable owing to the lack of relevant data. One example is the rate of self-immunisation α in the SAIR model (6). As relevant knowledge accumulates, the literature reveals an increasingly precise characterisation of the disease, such as its latency period, recovery rate, death rate, immunity duration and antibody acquisition. Such information is typically obtained from surveys of high-quality individual-level data, which may quantify these hyperparameters far better than re-estimating them with epidemic models, which are largely based on much coarser surveillance data. In the case of the COVID-19 pandemic, this survey-based approach may be too costly to carry out in countries with large and heterogeneous populations.
In general, target parameters are mostly those that are location specific, for example, the transmission rate and the fatality rate. They vary substantially across regions because of non-uniform mitigation efforts and hospital resources; hence, data-driven estimation is preferred. In Section 6, we introduce an areal spatial modelling approach that accounts for spatial heterogeneity in the analysis of infectious disease data. Because of parameter identifiability issues in some mechanistic models, specifying hyperparameters in the model fitting is inevitable. However, holding hyperparameters fixed at values taken from external data sources is controversial, and the validity of the resulting analyses depends heavily on the appropriateness of those prior values. To relax this technical weakness, in Section 5 we introduce a Bayesian framework in which such prior information (e.g. on the hyperparameters) enters the statistical model via prior distributions rather than fixed values, so that the uncertainty about the hyperparameters is adaptively balanced against the amount and quality of the observed surveillance data. Such flexibility is a great advantage in synthesising prior evidence with observed data. To keep this section at a reasonable technical level, most of the discussion below is given in the setting of the basic SIR model; generalisations to other compartment models follow with slight modifications. In closing, it is noteworthy that the frequentist statistical methods discussed below rest on a fundamental assumption about data collection: that the population-level compartment data S(t), I(t) and R(t), and others if relevant, can be collected directly from the study population. In other words, at any given time, every individual in the population can be observed directly for his or her current status of being susceptible, infectious, recovered or dead. This is practically impossible.
Thus, the interpretation of the estimation results should be treated with caution. In the SIR model (1), the transmission rate β and recovery rate γ are the two target parameters of interest. Estimation of β and γ can be carried out by optimisation, searching for the model that best fits the data; a commonly used criterion is the least squares loss. Given β and γ, numerical approximations (e.g. Runge-Kutta methods) can be used to solve for the trajectories S(t), I(t) and R(t). These expected trajectories are then compared with the observed trajectories to compute a discrepancy score, such as the sum over time of squared errors, represented as a loss function of the target parameters. It then remains to find the parameter estimates that give rise to the best-fitting curve using standard optimisation tools. In this case, the optimisation is a two-dimensional search, which is computationally straightforward; even a greedy grid search is computationally cheap. We illustrate using both simulated data and real data in Examples 2 and 3, respectively. Example 2. We first generate an observed sequence of cumulative infectious counts following Example 1, namely, the SIR model with true parameter values β = 0.5 and γ = 0.2. For simplicity, we fix γ = 0.2 in this example. We then evaluate the sum of squared errors (SSE) loss between the expected cumulative infectious count I(t) and its sample counterpart I_obs(t), and the value that minimises this loss gives an estimate of β. Figure 7 plots the SSE loss versus β using the simulated data I_obs(t), t = 1, …, T, with T = 10, 20, 50, respectively. The SSE loss is minimised at β̂ = 0.5, as expected. The longer the observed sequence, the more curved around 0.5 the SSE appears, and the better we can identify the minimum of the SSE curve. The R script shows the example for the case T = 10.
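The paper's script for this example is in R. An equivalent Python sketch of Example 2 follows: a forward-Euler SIR solver producing daily cumulative infectious counts, and a grid search over β with γ fixed at 0.2 (all function names and grid choices are ours):

```python
def sir_cumulative(beta, gamma, N, I0, days, dt=0.01):
    """Forward-Euler SIR; returns daily cumulative infectious counts N - S(t)."""
    S, I = float(N - I0), float(I0)
    daily = [N - S]
    for _ in range(days):
        for _ in range(int(1 / dt)):
            new_inf = beta * S * I / N
            S -= dt * new_inf
            I += dt * (new_inf - gamma * I)
        daily.append(N - S)
    return daily

# "observed" data generated with true beta = 0.5, gamma = 0.2
N, I0, T = 10000, 10, 10
obs = sir_cumulative(0.5, 0.2, N, I0, T)

def sse(beta):
    """Sum of squared errors between fitted and observed cumulative counts."""
    fit = sir_cumulative(beta, 0.2, N, I0, T)
    return sum((f - o) ** 2 for f, o in zip(fit, obs))

# grid search over beta in [0.30, 0.70] with step 0.01
beta_hat = min((k / 100 for k in range(30, 71)), key=sse)
```

Because the "observed" series is generated by the same solver, the SSE vanishes exactly at β̂ = 0.5, reproducing the behaviour described for Figure 7.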
Note that we defined the fit using the sequence I(t), but S(t) and R(t) can also be used in the estimation. Similarly, a two-dimensional grid search can be used to estimate β and γ jointly when γ is not fixed, in which case the data on R(t) must be used. Here we present only one replicate for illustration. Example 3. We apply the approach of Example 2 to the daily time series of COVID-19 cumulative infectious counts in Michigan from 11 March to 1 May 2020. Details of the data, including the I(t) sequence, are described in Appendix A2. The SIR function defined in Example 1 is used as the dynamic model, and the SSE function defined in Example 2 is used as the loss function. Fixing γ = 0.2 (i.e. an average contagious period of 5 days), the code computes the solution β̂ = 0.79 using the first 10 observations (11 to 20 March). We then increase the number of observations used in the estimation; as shown in Figure 8, the value of β̂ decreases as more data are used. This is noticeably different from Example 2, where β̂ remains constant regardless of the number of observations used. The gradual decrease in the estimate of β indicates a potential reduction of the transmission rate over time in Michigan due to the enforcement of statewide social distancing. In other words, the assumption of a constant transmission rate β is inappropriate for the Michigan data. This result suggests the need for a more suitable modelling technique, which will be demonstrated in Section 5.5. Although often used as a classic textbook example, this least squares approach is equivalent to maximum likelihood estimation (MLE) under the assumption that the measurement errors are independent and normally distributed with homogeneous variance. In general, the approach gives consistent estimation, does not require a distributional assumption for the data generation, and is thus applicable to non-normal data.
However, the ordinary least squares loss used in the example above assumes that the data are independently sampled over time, which is not true: the observations form a time series and are temporally correlated. Because of this, the least squares estimation is not efficient. Cintrón-Arias et al. (2009) discuss a generalised least squares approach that accounts for more complex error structures, including temporal autocorrelation. It is not always best practice to use the data on I(t) and R(t) directly when estimating the model parameters. The COVID-19 projection by Gu (https://covid19-projections.com/) adopts a loss optimisation approach based on the SEIR model using only death counts, owing to quality concerns with infection counts (e.g. under-reporting). That model uses a discrete state machine with probabilistic transitions to minimise a mixture of loss functions, such as mean squared error, absolute error and ratio error. The literature contains many other estimation procedures (e.g. Wallinga & Teunis, 2004; Cori et al., 2013; Thompson et al., 2019); some of these alternatives do not estimate β and γ but instead target the effective reproductive number R_e(t) directly in estimation and inference. Here we present the method of moments, another routine estimation approach in the statistical literature for the model parameters of the SIR model (1). During the early phase of an epidemic, one may assume S(t)/N ≈ 1 and set Δt = 1 (e.g. a time unit of 1 day for discretisation), so that the second ODE of (1) leads to the approximate exponential solution I(t) ≈ I(0) exp{(β − γ)t}, which can be matched to (1) at the discrete times at which data are actually recorded. After the estimate γ̂ is obtained, β̂ follows immediately. However, this estimation of γ is only accurate during the early phase of the outbreak, because it relies on the approximation S(t)/N ≈ 1. In the literature, other types of moments are also used to derive parameter estimates.
For instance, using a discrete-time approximation to the first ODE of the SIR model (1), one easily obtains the expression β ≈ −{S(t + 1) − S(t)}N/{S(t)I(t)}, and an estimate of β may be obtained by averaging the quantities on the right-hand side over time. When β(t) varies over time because of changes in mitigation measures, this method of moments estimator may still be applied locally, possibly with a kernel weighting function such as the Nadaraya-Watson estimator (Nadaraya, 1964; Watson, 1964). A very similar argument leads to the approximation R_e(t) ≈ {S(t) − S(t + 1)}/{R(t + 1) − R(t)}, which may serve as a non-parametric estimator of the effective reproductive number. Although R_e(t) can be identified at each time point using data solely from time t, for numerical stability the same kernel weighting idea (e.g. a running-bin smoother) is applied to estimate R_e(t) (see, e.g. Wallinga & Teunis, 2004). Linear approximations are easy to implement; however, the variances produced by such linear fits are typically inadequate for describing the true randomness of an infectious disease, and hence for valid inference and prediction. Alternatively, it is promising to investigate the local linear fitting method (Cleveland & Devlin, 1988), which produces non-parametric estimates of the time-varying model parameters that better reflect the temporal dynamics of the infection. In both the least squares and method of moments estimation, there are no explicit assumptions about the probability law of the data sampling. Implicitly, both methods are based on a sampling scheme covering the entire population; that is, the current status of every individual in the study population is recorded, which is certainly not true in practice. To overcome this, estimation methods have been proposed that account for sampling variability under certain parametric distributions. Distributional assumptions can be made for many quantities in an infectious disease model.
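These moment estimators are easy to check numerically. The following Python sketch is our own construction: daily counts are taken from a finely integrated SIR trajectory with β = 0.5 and γ = 0.2, and the daily-increment estimators above are applied to them:

```python
def sir_daily(beta, gamma, N, I0, days, dt=0.01):
    """Forward-Euler SIR; returns daily lists (S, I, R)."""
    s, i, r = float(N - I0), float(I0), 0.0
    S, I, R = [s], [i], [r]
    for _ in range(days):
        for _ in range(int(1 / dt)):
            new_inf = beta * s * i / N
            rec = gamma * i
            s -= dt * new_inf
            i += dt * (new_inf - rec)
            r += dt * rec
        S.append(s); I.append(i); R.append(r)
    return S, I, R

N = 100000
S, I, R = sir_daily(0.5, 0.2, N, 10, 30)

# method of moments: beta_t = -dS(t)*N/(S(t)*I(t)), R_e(t) = -dS(t)/dR(t)
beta_t = [-(S[t + 1] - S[t]) * N / (S[t] * I[t]) for t in range(30)]
re_t = [(S[t] - S[t + 1]) / (R[t + 1] - R[t]) for t in range(30)]
```

With daily increments, the β estimates carry some discretisation bias (they use start-of-day S and I), but the biases in the numerator and denominator of the R_e(t) ratio largely cancel, so re_t starts near β/γ = 2.5 and falls as S(t) is depleted.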
Some distributions are fully specified based on given knowledge; for example, the distribution of the incubation period of a disease can be represented as a probability mass function over days (Lauer et al., 2020). Others are only specified up to a family of shapes, with the exact form to be estimated. We illustrate the latter using a stochastic SIR model. Stochastic SIR models typically require the same assumptions as the deterministic SIR model (Section 2.3). To reflect the stochastic nature of disease transmission and recovery, stochastic processes such as Poisson processes are used to model the accumulation of cases. Following the earlier definitions of β and γ, the number of effective contacts in the population is a Poisson process with rate βN. Of these contacts, only those between contagious and susceptible individuals lead to new infections. Hence, the counting process defined by the number of ever exposed (i.e. I(t) + R(t), or equivalently N − S(t)) follows a Poisson process with rate βS(t)I(t)/N, and the number of newly exposed in an instantaneous interval of length dt follows a Poisson distribution with mean βS(t)I(t)dt/N. On the other hand, the durations for which individuals remain infectious are assumed independent and identically distributed according to an exponential distribution with rate γ, so the mean infectious duration is 1/γ. When we jointly consider all I(t) infectious subjects at time t, exit events occur independently at rate γI(t), and the gap times between two adjacent exits are exponentially distributed with mean 1/{γI(t)}. In summary, the number of removed individuals is a counting process following a Poisson process with rate γI(t). Such a stochastic formulation is commonly used, for example, in Bailey (1975) and Andersson and Britton (2000). Through these definitions, S(t), I(t) and R(t) are now random variables that can be directly sampled.
In fact, it suffices to specify only two of the three counting processes to define a stochastic SIR model, because of the constant-sample-size constraint. For demonstration, at time t, over an instantaneous interval [t, t + dt), we may specify a stochastic SIR model as follows: S(t) − S(t + dt) ~ Poisson{βS(t)I(t)dt/N} and R(t + dt) − R(t) ~ Poisson{γI(t)dt}, where I(t) = N − S(t) − R(t). As a result of this probabilistic formulation, the effective reproductive number is now defined as an expectation, that is, R_e(t) = E{βS(t)/(γN)}. The stochastic SIR model (7) is specified in continuous time, and we would hope that dt is very small; in practice, an approximation to (7) is used by letting dt = 1, a unit of one day, which is typically the smallest time unit in public surveillance data. As a result, S(t) and R(t) at time t are used to approximate their averages over the entire interval [t, t + 1). This approximation turns a continuous-time stochastic model into a discrete-time stochastic model for statistical analysis. Other distributions, such as the negative binomial or a general dispersion family (Song, 2007), may be considered to handle overdispersion in the counting processes. With the distributions in place, we turn to estimation and inference by the maximum likelihood approach. Maximum likelihood estimation is often preferred in a parametric model where the underlying probability distribution is properly chosen. For convenience, we take the day as the unit of time. Discretising time according to the observed sequences, t = 0, 1, …, T, the observed daily increments ΔS(t) = S(t) − S(t + 1) in the susceptible compartment and ΔR(t) = R(t + 1) − R(t) in the removed compartment are conditionally independent, given the historical accumulated counts S(t) and I(t), according to the definition of model (7).
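The day-by-day (dt = 1) approximation of model (7) can be simulated directly. The Python sketch below is our own: it uses a small population and a textbook (Knuth) Poisson sampler to stay dependency-free, and all parameter values are illustrative:

```python
import math
import random

def poisson(lam, rng):
    """Knuth's Poisson sampler; adequate for the modest rates used here."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def stochastic_sir(N=1000, I0=10, beta=0.5, gamma=0.2, days=60, seed=7):
    """Discrete-time (dt = 1 day) Poisson-increment SIR as in model (7)."""
    rng = random.Random(seed)
    S, R = N - I0, 0
    path = [(S, I0, R)]
    for _ in range(days):
        I = N - S - R
        if I == 0:                       # epidemic over
            path.append((S, 0, R))
            continue
        new_inf = min(S, poisson(beta * S * I / N, rng))
        new_rem = min(I, poisson(gamma * I, rng))
        S -= new_inf
        R += new_rem
        path.append((S, N - S - R, R))
    return path
```

The `min` caps keep the counts feasible after the dt = 1 discretisation; by construction S(t) + I(t) + R(t) = N at every step, and repeated runs with different seeds exhibit exactly the run-to-run variability that the deterministic model cannot express.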
The second model in (7) contains only the removal parameter γ, so the log-likelihood of γ with respect to the daily increments in the removed compartment, ΔR(t), and the daily cumulative counts of infections, I(t), can be written, up to an additive constant, as ℓ(γ) = Σ_{t=0}^{T−1} [ΔR(t) log{γI(t)} − γI(t)], where S(0) = N and I(0) = 1. However, one caveat of this simplistic likelihood formulation is that the cumulative time series S(t) and I(t) are assumed to be measured directly and without error. In other words, the likelihood accounts only for the sampling uncertainty in the increments, not in the cumulative counts, so the resulting statistical inference may suffer from underestimated standard errors. Two types of statistical inference theory are considered in this context, namely infill asymptotics and outreach asymptotics. The former pertains to the situation where the sampling points increase within a fixed time window (i.e. fixed T), while the latter, the situation of practical relevance, lets the time window of data collection tend to infinity (i.e. T → ∞). Britton et al. (2019) discuss the infill large-sample properties under the assumption that the complete epidemic data, that is, the continuously observed counting processes (S(t), I(t)), t ∈ [0, T], are available; under that setting, the asymptotic distribution of the MLE based on continuously observed trajectories is established. It is, of course, rare in practice to collect infectious disease data via such infill sampling schemes; nevertheless, for theoretical interest, we refer readers to Britton et al. (2019) and references therein. The outreach large-sample theory for the MLE with discrete time series data provides statistical inference relevant to most infectious disease applications: as an epidemic evolves, the number of equally spaced time points (say, daily) at which data are collected increases.
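The maximiser of this Poisson log-likelihood has a closed form, γ̂ = Σ_t ΔR(t) / Σ_t I(t), obtained by setting dℓ/dγ = Σ_t {ΔR(t)/γ − I(t)} = 0. A short Python sketch with made-up toy counts (our own illustration) verifies this:

```python
import math

def gamma_loglik(gamma, I, dR):
    """Poisson log-likelihood of the removal rate gamma, constants dropped:
    sum over t of dR(t)*log(gamma*I(t)) - gamma*I(t)."""
    return sum(r * math.log(gamma * i) - gamma * i
               for i, r in zip(I, dR) if i > 0)

def gamma_mle(I, dR):
    """Closed-form maximiser: gamma_hat = sum(dR) / sum(I)."""
    return sum(dR) / sum(I)

# toy data: daily infectious counts and removal increments (true gamma ~ 0.2)
I = [100, 120, 140, 150, 140]
dR = [22, 25, 27, 31, 26]
g_hat = gamma_mle(I, dR)
```

Because ℓ(γ) is strictly concave in γ, the closed-form value beats any other candidate, which makes a convenient sanity check on the likelihood code.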
When sampling errors in both I(t) and S(t) are allowed, the likelihood above is in fact a kind of conditional composite likelihood (Varin et al., 2011). The standard theory of composite likelihood estimation then implies that the asymptotic covariance of the estimator is given by the inverse of the Godambe information matrix (a sandwich estimator). The sensitivity matrix in the Godambe information is hard to obtain analytically because of the serial dependence in the time series; instead, one may take a non-parametric bootstrap approach, similar to that considered by Gao and Song (2011), to evaluate the standard errors and conduct valid statistical inference. Conditional independence is a strong assumption made for mathematical convenience in the MLE, and relaxing it has drawn some attention in the literature. For example, Lekone and Finkenstädt (2006) and Allen (2008) construct likelihood-based approaches using discrete-time Markov chain SEIR models, while Becker (1977) and Becker and Britton (1999) consider the MLE in the SIR model using martingale methods when all transition events for each individual are observed. It is, however, unlikely that such individual-level details are observed in most surveillance data used for modelling infectious disease mechanisms; estimators using less detailed data have been proposed (e.g. Becker, 1979; Rida, 1991). As part of the effort to relax the strong conditions of the stochastic SIR model (7), in Section 5.1 we review a state-space modelling approach that generalises the current likelihood model and estimation framework: S(t), I(t) and R(t) are no longer assumed to be directly measured but are treated as Markov latent processes, hyperparameters are included via prior distributions instead of fixed values, and a Bayesian estimation analogous to the MLE is established through the MCMC approach.
This class of state-space models is so far one of the most flexible statistical modelling frameworks for analysing infectious disease data. We highlight several software packages that are publicly available for estimating the parameters of multi-compartment models; overall, additional effort in this computational domain is needed. Several packages focus on estimation and inference for R₀ and R_e(t). For example, Obadia et al. (2012), in their R package R0, implement multiple methods, including a method of moments-type approach (Dietz, 1993), a Bayesian method (Bettencourt & Ribeiro, 2008) and likelihood-based estimation procedures (Forsberg White & Pagano, 2008; Wallinga & Teunis, 2004; Wallinga & Lipsitch, 2007). Along this line, Cori et al. (2013) and Thompson et al. (2019) develop Bayesian methods to estimate the effective reproductive number, made available through the R package EpiEstim and a Microsoft Excel tool (https://tools.epidemiology.net/epiestim.xls). Their methods use a moving-window approach, assuming that the reproduction number is constant within each window ending at time t; a gamma prior distribution is used to derive the posterior distribution of this windowed reproduction number given the new infectious counts. State-space models are a class of linear or non-linear hierarchical stochastic models with parametric error distributions. The conventional state-space model is not formulated as a Bayesian model, but its Bayesian formulation has since gained great popularity owing to the availability of MCMC methods for estimating the model parameters (Carlin et al., 1992).
This class of models primarily attempts to explain the dynamic features of an observed time series through a latent state process. The state-space framework is advantageous over the stochastic compartment models introduced in Section 4.4 in the following aspects of statistical modelling: (i) a state-space model does not assume that the compartment processes S(t), I(t) and R(t) are directly observed; they are treated as latent processes to be estimated from the observed data. (ii) A state-space model allows an explicit sampling scheme to be part of the model specification, which enables the quantification of both estimation and prediction uncertainty in the statistical analysis. (iii) A state-space model is built upon the compartment probabilities (or rates or proportions), which automatically adjust for a potentially varying population size; this conveniently relaxes the constant-population-size condition of the basic SIR model. (iv) A state-space model provides a flexible statistical modelling framework that embraces time-varying model parameters and integrates prior knowledge of the disease mechanism (e.g. an R₀ value from other studies) via prior distributions on the model parameters. (v) The implementation of MCMC methods in state-space modelling provides a powerful approach to parameter estimation and prediction using conditional distributions given the history; this differs from all the estimation methods of Section 4, which are formulated via marginal distributions under strong assumptions about the sampling rules.
A state-space model consists of two stochastic processes: a $d$-dimensional observation process $\{Y_t\}$ and a $q$-dimensional state process $\{\theta_t\}$, given as follows. The state process $\theta_0, \theta_1, \ldots$ is a Markov chain with initial condition $\theta_0 \sim p_0(\theta)$ and transition (conditional) distribution $\theta_t \mid \theta_{t-1} \sim g_t(\theta \mid \theta_{t-1})$. The observation process $\{Y_t\}$ is conditionally independent given the state process $\{\theta_t, t \geq 0\}$, and each $Y_t$ is conditionally independent of $\theta_s, s \neq t$, given $\theta_t$, with conditional distribution $Y_t \mid \theta_t \sim f_t(y \mid \theta_t)$. This model can be represented graphically by the comb structure shown in Figure 9. According to Cox (1981), the state-space model is a parameter-driven model, in that the processes of the compartment proportions are unknown population parameters to be estimated, whereas a stochastic multi-compartment model such as the stochastic SIR model in (7) is a data-driven model in which the compartment proportions are directly observed. As pointed out earlier, the validity of the latter is questionable in practice, especially in the analysis of COVID-19 pandemic data. Let $Y_s$ be the collection of all observations up to time $s$, namely $Y_s = (Y_1, \ldots, Y_s)$. Let $\psi$ be a generic notation for the set of model parameters. Denote the conditional density of $\theta_t$ given $Y_s = y_s$ by $f_{t|s}(\theta \mid y_s; \psi)$. Then the prediction, filter or smoother density is defined according to whether $t > s$, $t = s$ or $t < s$, respectively. This conditional density $f_{t|s}(\theta \mid y_s; \psi)$ is the key component of statistical inference in state-space models. To develop maximum likelihood inference for the model parameters in state-space models, the one-step prediction densities $f_{t|t-1}$ are the key components in the computation of the likelihood function (see Chapter 10 of Song, 2007).
Given time series data $\{y_t, t = 1, \ldots, n\}$, the likelihood of $Y_n$ is
$$L(\psi; y_n) = \prod_{t=1}^{n} f_t(y_t \mid y_{t-1}; \psi), \qquad f_t(y_t \mid y_{t-1}; \psi) = \int f_t(y_t \mid \theta_t)\, f_{t|t-1}(\theta_t \mid y_{t-1}; \psi)\, d\theta_t,$$
where, by convention, $g_1(\theta_1; \psi) = f_{1|0}(\theta_1 \mid y_0; \psi)$, conditional on an initial observation $y_0$ at time 0. In the likelihood evaluation above, the one-step prediction densities $f_{t|t-1}$ and the filter densities $f_{t|t}$ are given respectively by the recursions
$$f_{t|t-1}(\theta_t \mid y_{t-1}; \psi) = \int g_t(\theta_t \mid \theta_{t-1})\, f_{t-1|t-1}(\theta_{t-1} \mid y_{t-1}; \psi)\, d\theta_{t-1}, \qquad (9)$$
$$f_{t|t}(\theta_t \mid y_t; \psi) = \frac{f_t(y_t \mid \theta_t)\, f_{t|t-1}(\theta_t \mid y_{t-1}; \psi)}{\int f_t(y_t \mid \theta)\, f_{t|t-1}(\theta \mid y_{t-1}; \psi)\, d\theta}, \qquad (10)$$
with the recursion starting at $f_{0|0}(\theta) = p_0(\theta)$. In general, exact evaluation of the integrals in (9) and (10) is analytically unavailable, except in some simple situations, such as when both processes are linear and normally distributed. For the linear Gaussian state-space model, all $f_{t|s}$ are Gaussian, so the first two moments of (9) and (10) can easily be derived from the conventional Kalman filtering procedure, as discussed in Chapter 9 of Song (2007). With some computational cost, however, all integrals in the likelihood and the filter can be evaluated numerically by MCMC methods. Recently, Wang et al. (2020) developed an extended SIR (eSIR) model built upon a state-space model with two ($d = 2$) observed time series of daily proportions of infectious and removed cases, denoted by $Y_t^I$ and $Y_t^R$, which are generated from the $q$-dimensional underlying infection dynamics $\{\theta_t, t \geq 0\}$ governed by a mechanistic SIR model; in the case of the SIR model, $q = 3$. As shown in Figure 9, the latent process is a time series of three-dimensional latent vectors of population probabilities $\theta_t = (\theta_t^S, \theta_t^I, \theta_t^R)^\top$ that satisfy a three-dimensional Markov process of the form
$$\theta_t \mid \theta_{t-1}, \beta, \gamma, \kappa \sim \text{Dirichlet}\big(\kappa\, f(\theta_{t-1}, \beta, \gamma)\big), \qquad (11)$$
where the parameter $\kappa$ scales the variance. The function $f(\cdot)$ is a three-dimensional vector given by the solution of the SIR model (2), which determines the mean of the Dirichlet distribution via the RK4 (fourth-order Runge-Kutta) approximation.
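For the linear Gaussian special case just mentioned, the prediction-filter recursions collapse to the Kalman filter and the likelihood becomes a product of Gaussian one-step predictive densities. A minimal Python sketch (the function name and model notation are illustrative):

```python
import numpy as np

def kalman_loglik(y, F, H, Q, R, m0, P0):
    """Log-likelihood of a linear Gaussian state-space model
        theta_t = F theta_{t-1} + w_t,  w_t ~ N(0, Q)
        y_t     = H theta_t     + v_t,  v_t ~ N(0, R),
    accumulated from the one-step predictive densities f(y_t | y_{1:t-1}),
    each of which is Gaussian."""
    m, P, ll = m0, P0, 0.0
    for yt in y:
        # Prediction step: moments of f_{t|t-1}
        m_pred = F @ m
        P_pred = F @ P @ F.T + Q
        # One-step predictive density of y_t
        S = H @ P_pred @ H.T + R
        resid = yt - H @ m_pred
        ll += -0.5 * (np.log(np.linalg.det(2 * np.pi * S))
                      + resid @ np.linalg.solve(S, resid))
        # Filter step: moments of f_{t|t}
        K = P_pred @ H.T @ np.linalg.inv(S)
        m = m_pred + K @ resid
        P = (np.eye(len(m)) - K @ H) @ P_pred
    return ll
```

Maximising this log-likelihood over the model parameters gives the frequentist estimates discussed in the text.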
In comparison with the stochastic SIR model in (7), here the compartment proportions $\theta_t$ are unobserved and explicitly modelled by a Markov process to account for temporal correlations, so parameter estimation can be carried out with multivariate likelihood functions. Because serial dependence is accounted for in the state-space model, the resulting estimation and prediction are more powerful than those given in Section 4.5. The two observed time series $(Y_t^I, Y_t^R)^\top$ emitted from the underlying latent infection dynamics $\theta_t$ are assumed to follow beta distributions at time $t$:
$$Y_t^I \mid \theta_t, \lambda^I \sim \text{Beta}\big(\lambda^I \theta_t^I,\; \lambda^I (1 - \theta_t^I)\big), \qquad (12)$$
$$Y_t^R \mid \theta_t, \lambda^R \sim \text{Beta}\big(\lambda^R \theta_t^R,\; \lambda^R (1 - \theta_t^R)\big), \qquad (13)$$
where $\theta_t^I$ and $\theta_t^R$ are the respective probabilities of being infectious and removed at time $t$, and $\lambda^I$ and $\lambda^R$ are parameters controlling the respective variances of the observed proportions. It is easy to see that $Y_t^I$ and $Y_t^R$ are conditionally independent given $\theta_t$, with $E(Y_t^I \mid \theta_t) = \theta_t^I$ and $E(Y_t^R \mid \theta_t) = \theta_t^R$, and the full parameter set is $\psi = (\lambda^I, \lambda^R, \kappa, \beta, \gamma)$. Because $Y_t^I$ and $Y_t^R$ share a common latent variable $\theta_t$, their marginal correlation is modelled. In effect, these two beta distributions define a sampling scheme for the observed data, including the daily empirical proportions of infectious and removed cases, which are a collection of daily signals from the underlying latent SIR infection dynamics. The state-space model (11)-(13) above is useful for assessing the effectiveness of control measures (e.g. social distancing) via the projected epidemic evolution in future time. To build control measures into the model, one can replace the constant transmission rate $\beta$ by a time-varying transmission rate $\beta\, \pi(t)$, where $\pi(t)$ is a given transmission rate modifier. It is specified as a function of time to reflect different forms and strengths of control measures. This results in the eSIR model proposed by Wang et al. (2020), with $\pi(t) \geq 0$. Obviously, the basic SIR model is the special case of no intervention, $\pi(t) \equiv 1$.
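A forward simulation of this eSIR sampling scheme can be sketched as follows. The one-step Euler update standing in for the RK4 solution, and all parameter values, are simplifying assumptions for illustration only.

```python
import numpy as np

def simulate_esir(theta0, beta, gamma, kappa, lam_i, lam_r, pi, T, dt=1.0, seed=0):
    """Forward-simulate the eSIR state-space model: a Dirichlet latent
    Markov chain whose mean follows the SIR drift with time-varying
    transmission rate beta*pi(t), plus Beta-distributed daily proportions.
    An Euler step stands in for the RK4 solution used in the paper."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)  # (S, I, R) proportions
    ys_i, ys_r = [], []
    for t in range(1, T + 1):
        bt = beta * pi(t)
        s, i, r = theta
        # SIR drift, approximating f(theta_{t-1}, beta, gamma)
        mean = np.array([s - bt * s * i * dt,
                         i + (bt * s * i - gamma * i) * dt,
                         r + gamma * i * dt])
        mean = np.clip(mean, 1e-8, None)
        mean /= mean.sum()
        theta = rng.dirichlet(kappa * mean)                       # latent update (11)
        ys_i.append(rng.beta(lam_i * theta[1], lam_i * (1 - theta[1])))  # (12)
        ys_r.append(rng.beta(lam_r * theta[2], lam_r * (1 - theta[2])))  # (13)
    return np.array(ys_i), np.array(ys_r)
```

Simulating with $\pi(t) \equiv 1$ reproduces a basic stochastic SIR trajectory; passing a decreasing $\pi(t)$ flattens the simulated prevalence curve.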
In general, $\pi(t)$ may be specified by a practitioner to reflect a particular control measure. For the example of COVID-19 in Hubei province, China, one possible choice of $\pi(t)$ is a step function reflecting government-initiated macroisolation measures, taking a fixed value within each period between major policy changes; varying the step values and change points as in Figure 10(a)-(c) yields different types of transmission rate modifiers. Alternatively, $\pi(t)$ can be a continuous function, say $\pi(t) = \exp(-\lambda_0 t)$ or $\pi(t) = \exp\{-(\lambda_0 t)^{\omega}\}$ with $\lambda_0 > 0, \omega > 0$, reflecting steadily increased community-level surveillance and personal protection (wearing face masks and washing hands), as shown in Figure 10(d)-(f). Note that this modifier function does not have to be monotonically decreasing and may take a U-shape to capture the relaxation of control measures. With such a modelling framework, one can compare different preventive protocols via the resulting projected infection risk $\theta^I(t)$ or other epidemic features, such as the time at which the effective reproduction number satisfies $R_e(t) < 1$ and the time of a disease recurrence associated with relaxed control measures. A clear advantage of the state-space model is its resilience, with MCMC as a primary method for statistical estimation and prediction; in other words, the statistical analysis methods can easily be modified to accommodate changes made in the latent multi-compartment models and/or in the observed time series models. One example in the COVID-19 pandemic modelling of Wang et al. (2020) extends the three-compartment eSIR model to a four-compartment model by incorporating the stringent quarantine measures issued by the Hubei government via a newly added in-home quarantine compartment. This new model is termed the susceptible-quarantined-infectious-removed (SQIR) model. The quarantine compartment collects in-home isolated individuals who have no chance of meeting any infectious individuals in the infection system.
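The two families of modifiers can be coded directly. The change points and rate values below are illustrative placeholders, not the values used for Hubei.

```python
import numpy as np

def pi_step(t, change_points, values):
    """Step-function modifier: values[k] applies until change_points[k],
    and the last value applies thereafter. Placeholder schedule."""
    for cp, v in zip(change_points, values):
        if t < cp:
            return v
    return values[-1]

def pi_exp(t, lam0=0.05, omega=1.0):
    """Continuous modifier pi(t) = exp(-(lam0 * t)**omega)."""
    return float(np.exp(-(lam0 * t) ** omega))
```

Either function can be passed as the `pi` argument of a simulator or sampler; a U-shaped modifier for relaxed measures can be built the same way.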
Thus, it is another exit from the dynamic system, in addition to the removed compartment. Let $\phi(t)$ be the chance of a susceptible person being willing to take in-home isolation at time $t$. The basic SIR model in equation (2) is then extended to a four-dimensional latent process $\theta_t = (\theta_t^S, \theta_t^Q, \theta_t^I, \theta_t^R)^\top$ with $\theta_t^S + \theta_t^Q + \theta_t^I + \theta_t^R = 1$. The quarantine rate $\phi(t)$ may be specified as a Dirac delta function with jumps at the times when major quarantine policies are issued by the government; for Hubei province, for example, one may specify the time-dependent quarantine rate function $\phi(t)$ with jumps at the dates of the major lockdown announcements. Note that at each jump, the respective proportion of individuals leaves the susceptible compartment and enters the quarantine compartment. Figure 10(g)-(i) shows three different types of in-home quarantine rates during the period of the COVID-19 pandemic in Hubei province. In a similar spirit to the SQIR example of Application II above, consider an interesting extension of the basic SIR model in the analysis of the US COVID-19 data that includes an antibody compartment to handle the subpopulation of self-immunised individuals. This four-compartment model is termed the SAIR model, which has been discussed in detail in Section 3.3. Because the antibody compartment is also a second exit from the infection system, similar to the quarantine compartment, one can turn the SAIR model given in (6) into a form similar to that of the SQIR model in (14), with $\phi(t)$ replaced by $\alpha(t)$, the rate of self-immunisation. It is known that the population immunity rate cannot be estimated from observed surveillance data; it needs to be determined using large-scale serological surveys of the population. Thus, $\alpha(t)$ may be specified as a Dirac delta function (e.g. Figure 10(g)-(i)) with jumps at the times when the surveys are conducted and with function values based on the survey results.
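A minimal sketch of how such Dirac-type jumps act on the susceptible pool, using a deterministic Euler update and a hypothetical jump schedule:

```python
import numpy as np

def sqir_step(theta, beta, gamma, phi_jump, dt=1.0):
    """One Euler step of a (deterministic) SQIR mean process.
    theta = (S, Q, I, R) proportions; phi_jump is the fraction of the
    susceptible pool moving to quarantine at this step, nonzero only on
    dates when a quarantine policy jump occurs. Illustrative sketch."""
    s, q, i, r = theta
    # Dirac-type jump: a fraction phi_jump of S enters Q instantaneously
    jump = phi_jump * s
    s, q = s - jump, q + jump
    # Standard SIR flows for the remaining susceptibles
    new_inf = beta * s * i * dt
    new_rem = gamma * i * dt
    return np.array([s - new_inf, q, i + new_inf - new_rem, r + new_rem])

# Hypothetical jump schedule: 10% of S quarantined on day 20, 30% on day 40
def phi(t):
    return {20: 0.1, 40: 0.3}.get(t, 0.0)
```

Each update conserves total mass, so the four proportions continue to sum to one, mirroring the constraint $\theta_t^S + \theta_t^Q + \theta_t^I + \theta_t^R = 1$.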
It is worth pointing out that although the SQIR and SAIR models have very similar model structures, their interpretations are very different. The former is applicable to the case of very stringent self-isolation control measures in Hubei, while the latter reflects the situation of self-immunisation under mild control measures in the USA, where a substantial proportion of individuals contracted the virus, recovered and became immunised. Markov chain Monte Carlo has been used extensively for estimation and prediction in state-space models (see, e.g. Carlin et al., 1992; Chan & Ledolter, 1995; Czado & Song, 2008; De Jong & Shephard, 1995; Zhu et al., 2011, for a vast literature on this topic). Such popularity of MCMC in state-space modelling is rooted in its power to handle the evaluation of the high-dimensional integrals involved in the likelihood function (8). The essential strategy for calculating each high-dimensional integral is to approximate it by a sample mean of the involved integrand, obtained from many MCMC draws from the posterior distributions of the model parameters, including the time series of the latent probability vectors $\theta_t$. Let $t_0$ be the current time up to which we have observed data. At the $m$-th MCMC iteration, one proceeds roughly as follows: (1) draw the parameters $\psi^{(m)}$ from their posterior given the data; (2) draw $\theta_t^{(m)}$ from the posterior $[\theta_t \mid \theta_{t-1}^{(m)}, \psi^{(m)}]$ of the $q$-dimensional latent process, at $t = 1, \ldots, t_0, t_0+1, \ldots, T$; and (3) draw $Y_t^{(m)}$ from $[Y_t \mid \theta_t^{(m)}, \psi^{(m)}]$ according to the observed process, at $t = 1, \ldots, t_0, t_0+1, \ldots, T$, respectively. Prior distributions are specified for some of the hyperparameters; for example, $\theta_0 \sim \text{Dirichlet}(1 - Y_1^I - Y_1^R,\; Y_1^I,\; Y_1^R)$, $R_0 = \beta/\gamma$ and $\gamma$ follow log-normal distributions, and $\lambda^I$, $\lambda^R$ and $\kappa$ follow gamma or inverse-gamma distributions, respectively.
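The sampling scheme above can be caricatured with a much simpler sampler. The sketch below, in Python rather than the rjags implementation used in the package, replaces the latent Dirichlet draws by the deterministic SIR mean path and runs a random-walk Metropolis on $(\beta, \gamma)$ with flat priors and a Beta likelihood; all names, initial values and settings are illustrative simplifications.

```python
import numpy as np
from math import lgamma, log

def sir_path(theta0, beta, gamma, T):
    """Deterministic SIR mean path via Euler steps (a simplification of
    the latent Dirichlet process, so the sketch targets only (beta, gamma))."""
    th = np.asarray(theta0, dtype=float)
    out = []
    for _ in range(T):
        s, i, r = th
        th = np.array([s - beta * s * i, i + beta * s * i - gamma * i, r + gamma * i])
        out.append(th)
    return np.array(out)

def beta_logpdf(x, a, b):
    return (lgamma(a + b) - lgamma(a) - lgamma(b)
            + (a - 1) * log(x) + (b - 1) * log(1 - x))

def log_post(y_i, theta0, beta, gamma, lam=2000.0):
    """Beta log-likelihood of observed infectious proportions, flat priors."""
    mean_i = np.clip(sir_path(theta0, beta, gamma, len(y_i))[:, 1], 1e-8, 1 - 1e-8)
    return sum(beta_logpdf(y, lam * m, lam * (1 - m)) for y, m in zip(y_i, mean_i))

def metropolis(y_i, theta0, n_iter=2000, step=0.05, seed=1):
    """Random-walk Metropolis on (log beta, log gamma)."""
    rng = np.random.default_rng(seed)
    cur = np.log([0.3, 0.15])  # arbitrary initial guess
    cur_lp = log_post(y_i, theta0, *np.exp(cur))
    chain = []
    for _ in range(n_iter):
        prop = cur + rng.normal(0.0, step, size=2)
        prop_lp = log_post(y_i, theta0, *np.exp(prop))
        if log(rng.uniform()) < prop_lp - cur_lp:
            cur, cur_lp = prop, prop_lp
        chain.append(np.exp(cur))
    return np.array(chain)
```

On synthetic data generated from the SIR mean path, the post-burn-in draws concentrate around the generating $(\beta, \gamma)$, which is the essence of step (1) of the full scheme.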
Convergence diagnostics for the MCMC algorithm may use standard tools such as the Gelman-Rubin statistic based on multiple chains with different initial values, monitoring of trace plots of the model parameters, and so forth. The R package coda provides a comprehensive toolbox of convergence diagnostics (Brooks & Gelman, 1998). Using the MCMC draws collected after burn-in, various summary statistics may be obtained to estimate model parameters, conduct inference and make predictions. Summary statistics (e.g. posterior mean and posterior mode) from the in-sample draws of the model parameters provide point estimates and 95% credible intervals, with the left and right limits set at the 2.5th and 97.5th percentiles respectively; those of the observed processes may be used to check the goodness of fit of a proposed model and to perform model selection via the deviance information criterion (Spiegelhalter et al., 2002; Gelman et al., 2013). More importantly, the summary statistics from the out-of-sample draws of the latent process $\theta_t, t > t_0$, provide point predictions and their 95% credible prediction intervals. It is interesting to note that the MCMC implementation above does not depend much on the form of the Runge-Kutta solution $f(\theta_{t-1}, \beta, \gamma)$ in the latent process (11): as long as a mechanistic infectious disease model has an approximate analytic solution $f(\cdot)$, Bayesian estimation and inference can be carried out using MCMC. Such flexibility is appealing for developing software applicable to a broad range of practical studies. MCMC procedures are well suited for estimation and inference in the setting of state-space models because of their fast and reliable numerical performance. For the Michigan data analysis example in Section 5.5, using an average personal computer, we spend 1.5 h completing all MCMC calculations of 200 000 draws with a thinning bin size of 10 after the burn-in, judged from four separate MCMC chains.
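The Gelman-Rubin statistic mentioned above is simple to compute from multiple chains. Below is a minimal Python version for a scalar parameter; coda provides the production implementation.

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for one scalar parameter.
    chains: array of shape (m_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)          # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()    # within-chain variance
    var_hat = (n - 1) / n * W + B / n        # pooled variance estimate
    return np.sqrt(var_hat / W)
```

Values close to 1 indicate that the chains have mixed; values well above 1 (say, above 1.1) suggest running the chains longer.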
This computing speed can be improved by using high-performance computing facilities and/or some recent posterior sampling methods. As suggested by Zhou and Ji (2020) for a state-space SIR model, one may obtain a more efficient sampler over highly correlated posterior spaces via the parallel-tempering MCMC algorithm (Geyer, 1991), which provides rapid mixing of MCMC chains. Along the line of online learning, sequential Monte Carlo methods for posterior sampling (Doucet et al., 2001; Dukic et al., 2012) are also promising, as they permit efficient updating of existing posteriors with sequentially arriving data, in the hope of avoiding refitting the model by rerunning MCMC from scratch on the updated complete data. Wang et al. (2020) have developed a series of extended SIR models by introducing a time-varying transmission rate, a quarantine process and an asymptomatic immunisation process (details in Section 5.2). The proposed methods are implemented in an open-source R package eSIR, available on GitHub (https://github.com/lilywang1988/esir). This package calls rjags to generate MCMC chains and retains a few MCMC controllers from rjags. The package is also updated weekly with newly summarised US state-level count data for the COVID-19 pandemic. Several robust methods developed specifically for the prediction of COVID-19 are cited by the Centers for Disease Control and Prevention (https://www.cdc.gov/coronavirus/2019-ncov/covid-data/forecasting-us.html). To name a few, the Bayesian approach (Verity et al., 2020) developed by researchers at Imperial College London (featured in Adam, 2020) and the hybrid modelling approach (IHME COVID-19 Health Service Utilization Forecasting Team & Murray, 2020) adopted by the University of Washington Institute for Health Metrics and Evaluation (IHME) (discussed by Jewell et al., 2020) have attracted great public and government attention. We refer to their original work for modelling details.
It is difficult to appreciate the original work and subsequent commentary without running real COVID-19 data through their software, which is lacking for the IHME models, among some others. To increase research transparency, releasing to the public the software or computing code used in statistical methods is strongly encouraged. We now illustrate the use of the R package eSIR to analyse the COVID-19 surveillance data from Michigan state, USA, during the period 11 March to 9 June 2020. The Michigan data used in this analysis are listed in Appendix A2, including both $I(t)$ and $R(t)$. In the data analysis, we demonstrate the use of both the state-space model described in Application I and the MCMC method, where the transmission rate modifier $\pi(t)$ is set as an exponential function. From the package eSIR we can extract many useful statistics related to estimation and forecasting; for example, we can obtain both mean and median projections of the prevalence curve $\theta^I(t), t > t_0$, as well as their 95% credible prediction intervals. In addition, this package provides the estimated first and second turning points of an epidemic. The former is the time when the daily number of new infectious cases stops increasing, while the latter is the date when the daily number of new infections becomes zero. Mathematically, the first corresponds to the time $t$ at which $\ddot{\theta}^I_t = 0$, that is, the gradient $\dot{\theta}^I_t$ of the prevalence reaches its maximum, and the second is the time $t$ at which the rate of prevalence is zero, $\dot{\theta}^I_t = 0$. The data analysis is performed with an R script calling the eSIR package. In this analysis, we consider a time-dependent declining transmission rate with modifier $\pi(t) = \exp(-\lambda_0 t)$, where the parameter $\lambda_0$ is chosen so that the modifier equals 0.6 on 2 May. This value is determined from the social distancing scoreboard posted by Unacast, Inc. (https://www.unacast.com/covid19/social-distancing-scoreboard). One needs to set exponential = TRUE to activate this setting.
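Given any projected prevalence curve, the two turning points can be located numerically from its first and second differences. A small illustrative sketch, not the eSIR implementation:

```python
import numpy as np

def turning_points(prev):
    """Locate the two turning points of a prevalence curve theta_I(t).
    First: the growth of daily new cases stops increasing, i.e. the first
    difference (the growth rate) is at its maximum. Second: prevalence
    peaks, i.e. the first difference crosses zero from above."""
    d1 = np.diff(prev)            # ~ d theta_I / dt
    first = int(np.argmax(d1))    # steepest ascent of the prevalence curve
    cross = np.where((d1[:-1] > 0) & (d1[1:] <= 0))[0]
    second = int(cross[0]) + 1 if len(cross) else None
    return first, second
```

On a unimodal prevalence curve the first turning point precedes the second, matching the definitions above.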
Alternatively, as shown in Figure 10(a)-(c), one may use a step function by providing a vector of $\pi(t)$ values in pi0 and the corresponding vector of change dates in change_time. In the main function, we let the starting date be 11 March and conduct estimation and projection 200 days ahead (t_fin = 200), i.e. for 10 June and after. We run four separate MCMC chains with different initial values, each of length $5 \times 10^5$, keeping every tenth draw (thn = 10, a thinning operation to reduce autocorrelation) after the first $2 \times 10^5$ draws are dropped. Thus, with this relatively generous setting, we expect good convergence and a reliable quantification of prediction uncertainty via sample quantiles. Two different prior settings are used for sensitivity analysis: one follows the example code, with the prior mean of the log-normal distribution of the basic reproduction number set to 3.5, the removal rate to 0.1 and thus the mean transmission rate to 0.35; the other sets these values to 4, 0.2 and 0.8, respectively. The two distinct settings provide similar estimation and forecast results, as can be seen in Figure 11. The estimated reproduction numbers are 3.154 (95% credible interval [2.162, 4.369]) and 3.143 (95% credible interval [2.294, 4.147]), respectively, which are similar considering that the prior settings are quite different. The output Gelman-Rubin statistics (Gelman & Rubin, 1992) are close to 1 (data not shown). Both pieces of evidence, as well as stationary trace plots, warrant the convergence of the MCMC algorithm. The Michigan COVID-19 data have been preprocessed to smooth away some unnatural gaps caused by the clustered reporting issue discussed in Appendix A2. Figure 11 shows an adequate model fit, with all observed numbers of confirmed infections falling within the 95% in-sample credible intervals of the prevalence $\theta^I_t, t \leq 9$ June.
In contrast, the 95% out-of-sample credible intervals of the projected proportion $Y^I_t$ are much wider, reflecting the significant amount of uncertainty in the prediction; this uncertainty grows as time moves further away from the present. Despite the large uncertainty, the projected mean and median prevalence curves show a decreasing trend over time, which means that social distancing works to mitigate the epidemic in Michigan, although the rate of improvement is moderate. The fact that the two estimated turning points occurred before 9 June is another piece of evidence for the positive effect of the series of social distancing orders issued by the state governor since 23 March 2020. Model diagnosis is an important part of a statistical analysis and is typically conducted using various residual plots. As an illustration in this Michigan data analysis, let $\bar{\theta}_t$ be the posterior means over the period 11 March to 9 June, and consider residuals of the two observed processes, defined as the observed proportions minus the corresponding posterior means. Figure 12 shows a dominant lag-1 autocorrelation (the three coefficients are about 0.97) and no additional significant autocorrelations beyond the lag-1 dependence. This supports the assumption that the three latent processes are all first-order Markov processes. All mechanistic models discussed in the previous sections are useful for analysing the infection dynamics of a large population, such as a country or a state, in which most model parameters may be assumed homogeneous and representative of the entire population. This type of macromodelling approach is particularly valuable in the early phase of a disease outbreak, when the national public health administration aims to devise nationwide macrointervention protocols with very limited amounts of relevant data available.
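The lag-1 autocorrelation check described here amounts to computing sample autocorrelations of each residual series. A minimal sketch:

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of a series at a given lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))
```

A strongly persistent residual series yields a lag-1 coefficient near 1 (about 0.97 in the Michigan analysis), while white-noise residuals give values near zero.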
Once an epidemic evolves further into its middle phase, with more and more surveillance data collected from local communities, a macromodel is no longer suitable for an in-depth analysis of microinfection dynamics, owing to the existence of substantial heterogeneity across local communities. This section reviews significant extensions of infectious disease models that incorporate spatial heterogeneity across different geographical locations into the modelling and analysis. The focus is on the recent development of integrating classical spatial cellular automata (CA) (von Neumann & Burks, 1966) with the previously discussed temporal multi-compartment models, leading to an important class of spatio-temporal multi-compartment models. This class of models is useful for predicting local infection risk. Technically speaking, the majority of existing macromechanistic models of infectious disease spread are based on the assumption that the system is homogeneous in space: spatial characteristics that could potentially play a non-trivial role in the development and outcome of the epidemic are not taken into consideration. This is a valid assumption if the population vulnerable to the infectious disease is well mixed and human interventions (e.g. vaccination strategies) are homogeneous across different spatial locations. In reality, however, there is substantial heterogeneity in urbanisation, ethnic distribution, political views, governance and economic composition across different subgroups of individuals distributed over geographical locations, all of which influence the spread of infectious disease and make the macromechanistic models above inadequate for addressing the dynamics spatially. One possible extension is to utilise partial differential equations (PDEs) (Murray et al., 1986), in which the assumption of spatial homogeneity is relaxed to allow area-specific spread patterns of epidemics.
As noted in the literature, one limitation of PDEs is that they ignore the fact that infectious disease spreads through person-to-person interactions rather than through a continuous population; thus, PDEs may lead to impractical results about the dynamics of an epidemic (Mollison, 1991). A natural strategy is to embrace a micromodel mimicking an interacting particle system, and CA is one of the best-studied such systems, with particular strength in modelling spatially varying infection dynamics. Originating in the works of von Neumann and Burks (1966) and Ulam (1962), the CA paradigm has been used in many applied fields, including the modelling of infectious diseases. When applied to model spatial variation in epidemic spread, CA has three distinctive features: (i) it treats individuals as discrete entities, in order to study person-level movements in the infection dynamics. This high-resolution paradigm necessitates incorporating an individual's heterogeneity, such as residential address, age, race, pre-existing medical conditions and others, into the modelling; in surveillance data, geographical information is publicly available (e.g. the county in which an individual lives), so it is feasible to utilise this variable in the extension of the macromechanistic model. (ii) CA allows the introduction of local stochasticity; for example, the CA paradigm may be built upon a person-to-person infection mechanism if individual-level information is available, and otherwise upon a group-to-group infection process. (iii) CA is formulated on a network of particles (e.g. individuals, groups, villages and counties) with certain rules of connectivity and stochastic laws of disease transmission; this network topology is well suited to computation and simulation.
Because of these unique advantages, the CA paradigm has been employed by researchers as an efficient method for studying spread patterns of epidemics (Beauchemin et al., 2005; Ahmed & Agiza, 1998; Boccara et al., 1994; Quan-Xing & Zhen, 2005; Fuks & Lawniczak, 2001; Willox et al., 2003; Rousseau et al., 1997; Sirakoulis et al., 2000; Fuentes & Kuperman, 1999; Liu et al., 2006; Yakowitz et al., 1990; Sun et al., 2011). In the modelling of infectious diseases, the basic CA formulation involves three primary components: (i) a two-way array of cells (e.g. by age group or county) that contain the groups of individuals under study, with each individual belonging to one cell; (ii) a set of discrete states (e.g. susceptible, self-immunised, contagious, recovered and dead) that describe the different conditions of individuals during an epidemic; and (iii) specific rules or updating functions that determine spatially how local interactions between a target cell and its neighbouring cells influence and change the states of individuals in the target cell; all cells in a CA system achieve a global propagation of infection status updates instantaneously and continuously. In applying CA, determining the neighbouring cells is tricky, and different types of neighbourhood topology have been proposed in the literature, including the von Neumann, Moore, MvonN and extended neighbourhoods (Hasani & Tavakkoli, 2007) (see Figure 13 for an example of these four neighbourhood types). In the modelling of influenza A viral infections, Beauchemin et al. (2005) use a simple two-dimensional CA model to investigate the influence of spatial heterogeneity on viral kinetics. Their study population consists of two cell species, the epithelial cells and the immune cells: the epithelial cells are the target of viral infection, and the immune cells are those fighting the infection.
The CA model is built upon a two-dimensional square lattice with the Moore neighbourhood (see Figure 13(b)), in which the condition of a cell is influenced only by the eight cells closest to it. The set of states for the epithelial cells includes healthy, infected, expressing, infectious and dead, while an immune cell can be in one of two states: virgin or mature. Decision rules for updating the CA system are governed by parameters such as infect_rate, which models the probability that a healthy epithelial cell is infected by contact with each infectious nearest neighbour. Detailed updating functions are discussed in Beauchemin et al. (2005). Simulations show that the proposed CA model is sophisticated enough to reproduce the basic dynamic features of cell-to-cell infection. Differently from the modelling of influenza A viral infection above, Fuks and Lawniczak (2001) propose a lattice gas CA closely connected to an SIR framework of an epidemic, in which the interaction patterns of individuals are modelled. It is assumed that the status of individuals changes between three types, susceptible, infectious and recovered, denoted $\{S, I, R\}$. The space in which the epidemic takes place is a group of regular hexagonal cells: individuals are located at the centre of each cell and can move through channels created by connecting the centres of adjacent cells. The evolution of the CA occurs in discrete time steps under the operation of three basic functions, contact $C$, randomisation $R$ and propagation $P$. Under the contact function $C$, a susceptible individual becomes infected with probability $1 - (1-\beta)^{n_I}$, where $\beta$ is the transmission rate and $n_I$ is the number of infectious individuals within the same cell; meanwhile, an infectious individual recovers with probability $\gamma$, where $\gamma$ is the recovery rate.
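The contact function $C$ can be sketched directly from these two probabilities. The state encoding and parameter values below are illustrative:

```python
import numpy as np

def contact_step(states, beta=0.3, gamma=0.1, rng=None):
    """One application of the contact function C within a single cell of
    a lattice gas CA. states: list of 'S'/'I'/'R' labels for individuals
    sharing the cell. A susceptible is infected with probability
    1 - (1 - beta)**n_I; an infectious individual recovers with
    probability gamma."""
    rng = rng or np.random.default_rng()
    n_i = states.count("I")
    p_inf = 1.0 - (1.0 - beta) ** n_i
    out = []
    for s in states:
        if s == "S" and rng.uniform() < p_inf:
            out.append("I")
        elif s == "I" and rng.uniform() < gamma:
            out.append("R")
        else:
            out.append(s)
    return out
```

A full lattice gas simulation would alternate this step with the randomisation and propagation functions described next.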
The randomisation function $R$ randomly assigns individuals in each cell to move through the channels, which models the mixing process of individuals; in the final propagation step, individuals simultaneously move to the cells to which $R$ randomly assigned them. In addition to the basic epidemic dynamics modelled by the proposed lattice gas CA, Fuks and Lawniczak (2001) also study the effect of a heterogeneous spatial distribution of individuals in states $S$, $I$ and $R$, and the influence of different types of barriers in controlling the spread of an epidemic. In the CA-SIR model, $\beta$ is the population macrotransmission rate and $\gamma$ is the population macrorecovery rate. First, when the neighbourhood set $V = \emptyset$, that is, an empty set, the CA-SIR model for cell $(i,j)$ reduces to a cell-level SIR model similar to that given in (2). Second, the numerator $N_{i+p,j+q}\, \theta^I_{i+p,j+q}(t-1)$ is the expected number of infectious cases yesterday (time $t-1$) in a neighbouring cell $(p,q) \in V$ whose cell population is $N_{i+p,j+q}$; the corresponding ratio is an empirical probability that a person in cell $(i,j)$ randomly runs into a contagious person from its neighbouring cell $(p,q)$. Third, this random chance is weighted by a factor of intercell connectivity, denoted $\omega^{(i,j)}_{pq}$: the stronger the tie of cell $(i,j)$ with cell $(p,q)$, the higher the likelihood of a person from cell $(i,j)$ running into contagious individuals from cell $(p,q)$. Fourth, summing all such likelihoods gives the total likelihood that an individual from cell $(i,j)$ would run into virus carriers from all the neighbouring cells. A typical form of the intercell connectivity coefficient is $\omega^{(i,j)}_{pq} = c^{(i,j)}_{pq}\, m^{(i,j)}_{pq}$, where $c^{(i,j)}_{pq}$ and $m^{(i,j)}_{pq}$ are broadly defined as a connection factor and a movement factor, respectively; they characterise the intercell mobility, that is, how easily individuals can move between the centre cell and its neighbouring cells.
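The weighted-neighbour reasoning in points two to four can be sketched as a small function. The names and the Moore-neighbourhood choice are illustrative assumptions:

```python
import numpy as np

def infection_pressure(theta_i, pop, omega, i, j, neighbourhood):
    """Total chance that a person in cell (i, j) encounters contagious
    individuals, summed over its own cell and its neighbours, following
    the weighted-neighbour logic of the CA-SIR model. omega[(p, q)] is
    the connectivity weight for offset (p, q)."""
    n_rows, n_cols = theta_i.shape
    total = theta_i[i, j]  # within-cell contagion probability
    for (p, q) in neighbourhood:
        r, c = i + p, j + q
        if 0 <= r < n_rows and 0 <= c < n_cols:
            # expected infectious in the neighbour, scaled by home population
            total += omega[(p, q)] * pop[r, c] * theta_i[r, c] / pop[i, j]
    return total

# Moore neighbourhood: the eight offsets around the target cell
MOORE = [(p, q) for p in (-1, 0, 1) for q in (-1, 0, 1) if (p, q) != (0, 0)]
```

Multiplying this pressure by $\beta$ and the susceptible proportion of the cell yields the cell-level infection flow of the CA-SIR update.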
this ca-sir system, which is integrated with the sir model, can serve as a basis for developing useful algorithms that emulate real-world epidemic infection spatially. assume that there is a 5 × 5 square array of 25 cells that holds the population under the study of a certain epidemic; our target cell is the one at the centre (see figure 13). we illustrate the predicted risk of infection with covid-19 for all 83 counties in michigan state using the state-space model with the mechanistic ca-esair latent process. in the first step, we apply the mcmc method to estimate the model parameters (β and γ) and the vector of four probabilities θ_t of being susceptible, self-immunised, infectious and removed by fitting the esair model with the state-level surveillance data since 11 march. this can be performed easily using the r package esir, which has been illustrated in section 5.5. both the antibody rate function α(t) and the transmission rate modifier π(t) are pre-specified using other data sources, with the details given in the succeeding text. after obtaining the estimates of the model parameters, we use them as the initial values to make county-level risk predictions with the ca-esair model (15). in this example, we consider only a 1-day-ahead infection rate prediction (i.e. 3 may 2020) for all the counties in michigan, namely, θ^I_c(t_0 + 1). given that the covid-19 pandemic evolved fast in the state of michigan in early may 2020, this kind of short-term forecast or nowcast is of great interest to the michigan government for timely decision making on either extending an existing governor's 'stay-at-home' order or relaxing this executive order. to perform the prediction, one important task is to specify the inter-county connectivity coefficient ω_{cc'}(t). as discussed earlier, it is challenging to define ω_{cc'}(t) objectively, as it involves many variables. in this illustration, we specify this coefficient as ω_{cc'}(t) = m_{cc'} exp{−η r(c, c')}, where η is a tuning parameter to be determined. briefly speaking, the first factor m_{cc'} is the inter-county mobility factor characterising the decrease of human encounters in terms of their potential movements between counties, which is available online (https://www.unacast.com/covid19/social-distancing-scoreboard). the second factor r(c, c') is a certain travel distance between two counties c and c' in terms of both geodesic distance (karney, 2013) and 'air distance' based on the accessibility to nearby airports. in addition, the tuning parameter η adjusts the scale of the travel distance by minimising the sum of (county-level) weighted absolute prediction errors for the one-step-ahead risk prediction of the infection rate. in addition to the specification of the connectivity coefficient ω_{cc'}(t), the self-immunisation rate α_c(t) is calculated based on the results of the new york statewide antibody test surveys released by the new york governor andrew cuomo on 29 april (new york state report, 2020), and the transmission modifier function π_c(t) is specified by the effectiveness score of state-specific social distancing using cell phone data in the usa from the transportation institute at the university of maryland (https://data.covid.umd.edu/). additional details of the determination of m_{cc'}, r(c, c'), α_c(t) and π_c(t) and the tuning of η can be found in the referenced work. figure 14(a) shows the 1-day-ahead projected infectious rate for 83 counties in michigan on 3 may, and figure 14(b) plots the corresponding county-level weighted prediction errors (wpe), which are of the order of 10^{−7} for the counties. the r package ca-esair is available on github (https://github.com/leyaozh/ca-esair). in this paper, we have presented the basics of multi-compartment infectious disease models from both deterministic and stochastic perspectives.
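the connectivity specification and the tuning of η by minimising the weighted absolute one-step-ahead prediction error can be sketched as follows; this is a hypothetical grid search, and `predict_one_step` stands in for the full ca-esair prediction pipeline:

```python
import math

def connectivity(m_cc, r_cc, eta):
    # omega_{cc'}(t) = m_{cc'} * exp(-eta * r(c, c')), the form given in the text
    return m_cc * math.exp(-eta * r_cc)

def tune_eta(candidates, predict_one_step, observed, weights):
    """Pick eta minimising the sum of weighted absolute one-day-ahead
    prediction errors over counties (illustrative grid search)."""
    best_eta, best_err = None, float("inf")
    for eta in candidates:
        preds = predict_one_step(eta)  # county-level 1-day-ahead predictions
        err = sum(w * abs(p - o) for w, p, o in zip(weights, preds, observed))
        if err < best_err:
            best_eta, best_err = eta, err
    return best_eta
```

with the tuned η, `connectivity` gives the weight applied to each neighbouring county's infectious contribution.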
we emphasise the probabilistic extension of mechanistic models, which opens the door to a suite of statistical modelling techniques while still preserving the infectious disease dynamics in multi-compartment models. within the stochastic modelling framework, both the frequentist and the bayesian schools of modelling considerations and statistical methods are visited, along with a high-level review and illustrative examples. epidemic models have played a key role in the past century in providing understanding of past and ongoing infectious diseases, and it is our belief that they will continue to be valued and improved to help us better understand the current covid-19 pandemic as well as future infectious diseases. we conclude with several remarks on future directions of stochastic infectious disease modelling. although publicly available surveillance data are useful to build preliminary models for the understanding of spreading patterns of infectious diseases, their data quality in terms of measurement biases and under-reporting has been known to be an outstanding issue that significantly impacts the validity of statistical analysis results (angelopoulos et al., 2020). this is indeed an open problem to date with no appropriate solutions yet. without assurance of reliable data, statistical methods, whether macromodels or micromodels, would fail to produce meaningful results. one potentially promising solution to such a fundamental concern is to build reliable and well-validated open-source benchmark databases that include not only traditional surveillance data but also personal clinical data from various sources such as hospital electronic health records, drug trials and vaccine trials. in addition, data from serological surveys and data from mobile devices and similar sources are also useful to increase information resolution and reliability, to remove major measurement biases and to calibrate data analytics.
this task also requires efforts of data integration and international collaboration. research on the covid-19 pandemic certainly gives rise to a new opportunity for developing data integration methods that not only address challenges of data multi-modality but also overcome many data-sharing barriers and data confidentiality concerns. the population of self-immunised individuals is a significant source of bias in covid-19 surveillance data; they have never been captured by public health monitoring systems. according to survey results (new york state report, 2020), 20% of individuals in the city of new york have tested antibody positive for the coronavirus. this simply means that a nationwide serological survey is a must in order to come up with an appropriate assessment of the underlying epidemiological features of the covid-19 pandemic in the usa. the design of this nationwide serological survey is a challenging statistical problem. solving it requires some innovative ideas and methods; for example, a cost-effective design of pooling several serum samples to perform a pooled test (e.g. gollier & gossner, 2020), and an efficient design of hierarchical stratified survey sampling schemes. the sair model introduced in section 3.3 presents a basic framework for statistical models incorporating antibody serological surveys into the multi-compartment dynamics of infectious diseases. large-scale tracking data have played an important role in evaluating the effectiveness of social distancing in communities. precise evaluation of intervention efficacy helps improve both estimation and prediction, which directly impact a government's decisions on tightening, extending or lifting control measures. one emerging data source pertains to the information of real-time cell phone locations, which allows better contact tracing so that individual data sequences can be recovered and used for modelling of personal risk and regional hotspots.
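as an illustration of the pooled-testing idea (classical dorfman-style pooling, used here as a worked example rather than a method from this paper), the expected number of tests per person and a cost-minimising pool size can be computed as:

```python
def expected_tests_per_person(p, k):
    """Expected tests per person under Dorfman pooling: one pooled test per
    k samples, plus k individual retests if the pool is positive.
    p: prevalence, k: pool size."""
    pool_positive = 1.0 - (1.0 - p) ** k
    return 1.0 / k + pool_positive

def best_pool_size(p, max_k=50):
    # smallest expected cost over candidate pool sizes 2..max_k
    return min(range(2, max_k + 1), key=lambda k: expected_tests_per_person(p, k))
```

for low prevalence the savings are substantial: at p = 1%, pooling cuts the expected testing cost to roughly a fifth of individual testing.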
a research group at the university of maryland (https://data.covid.umd.edu/) proposes several algorithms to process the cell phone data in the usa to extract key features of personal mobility, including location identification, trip identification, imputation of missing trip information, a multilevel data weighting scheme, comprehensive trip data validation, and data integration and aggregation (ghader et al., 2020). however, these types of data are proprietary and subject to the issue of personal privacy (ienca & vayena, 2020). integrating such data types or their summary statistics into infectious disease models should be encouraged, but in a cautious and responsible manner. in this field, statistical learning methods with differential privacy (dwork, 2008) are of great interest. statistical methodologies have been greatly challenged in the modelling and analysis of infectious diseases; almost every troubling methodological issue known in the statistical literature surfaces, which presents new opportunities for statisticians and data scientists to develop innovative solutions. among many challenges, we emphasise a few of critical importance, which may be easily ignored in new methodology development. we strongly advocate for the urgent need to build models that are transparent and reproducible (peng, 2011). as most methods and models for the covid-19 pandemic are fairly recent and many have not yet been carefully peer reviewed, researchers should document the sources of data used, data preprocessing protocols, source code and sufficient modelling details to allow external validation from the public. such details are also necessary to allow others, who may have better-quality data but insufficient statistical expertise, to easily adopt new methodologies to obtain high-quality results.
as mentioned in an original post by dr nilanjan chatterjee (https://link.medium.com/hquqilead6), transparency, reproducibility and validity are three criteria to assess and assure the quality of prediction models. his essay also mentioned the difficulty of reproducing the work given by the ihme to obtain accurate predictions and appropriate confidence intervals. similar to the ihme method, which has no software available, gu's method for covid-19 prediction (https://covid19-projections.com/), which has recently received much attention, unfortunately does not provide software either. without clear guidance and full reproducibility, even models that currently do well might fail in the future, because predictions rely on certain kinds of extrapolation assumptions that need to be unveiled to the scientific community with full transparency for validation and comparison. given that model projections for the covid-19 pandemic have been changing dramatically from day to day, primarily because the underlying models are changing, the primary aim may be set at optimising prediction models for nowcasting or short-term projections, while being aware of probable worst-case scenarios for longer-term trends. as shown in the data example in section 6.5, the optimal tuning parameter is determined by the minimal short-term 1-day-ahead prediction error. as pointed out by huppert and katriel (2013), transmission models with different underlying mechanisms may lead to similar outcomes in one context (e.g. short term) but fail to do so in another (e.g. long term). the further we project, the more uncertain we are about the validity of model assumptions. hence, extra caution is needed when reporting and interpreting long-term projection results.
with the available surveillance data, making a nowcast of infection risk in the next few hours is difficult; but it may become feasible when certain sources of local information are accessible, such as electronic health records from local hospitals, viral testing results from local testing centres and mobile tracking data from individual cell phones. this requires a finer-resolution prediction machinery that may be established by generalising the ca to certain spatial point processes. despite being challenging, such a prediction paradigm would be very useful and worth serious exploration. because of the potential biases in surveillance data, either delayed reporting of infected cases or inaccurate ascertainment of deaths caused by a virus, there are many measurement errors in the data. this calls for statistical methods that can directly handle various data collection biases or are robust to such biases. little work has been performed in this important field of statistical modelling and analysis. in the current literature, model diagnostics for infectious disease models are largely lacking. given that most of the existing mechanistic models are based on certain parametric distributions (e.g. poisson processes), checking model assumptions is required; for example, for the proposed poisson process, the assumptions of incremental independence and overdispersion should be checked. in addition, procedures for validating prediction accuracy are also important, in which the choice of test data is tricky and needs to be guided by some objective criteria. a major weakness noted for the existing mechanistic models is the inflexibility of adding individual or subgroup covariates (e.g. age and race). the current strategy of handling these extra variables is stratification, which would end up with strata of small sample sizes, so that subsequent statistical analyses lose power in both estimation and prediction of infection dynamics.
an extension of the ca seems promising, as the ca presents a system of particles distributed in different cells (or strata), where individual characterisations of particles may be added via covariates. the resulting model would assess and predict personal risk, as well as identify hotspots of new infection. this is worth serious exploration in the future with appropriate data available (e.g. electronic health records from hospitals). for a global pandemic such as covid-19, which affects over 200 countries in the world, an integrative analysis is appealing to understand common features of the pandemic and to learn from different control measures. given that a pandemic typically evolves with a certain time lag across regions, experiences from countries with earlier outbreaks may be shared with countries with later outbreaks, where statistical methods may borrow relevant information to set up prior distributions in the model fitting. for example, the reproduction number estimated from the european covid-19 data may serve as a hyperparameter in the statistical analysis of the us covid-19 data. there is a clear need for more comprehensive meta-analysis methods that better integrate data from different countries than using the data to create hyperparameters. along this line, one of the earliest attempts is to combine covid-19 forecasts from various research teams using ensemble learning (see, e.g. https://github.com/reichlab/covid19-forecast-hub). most investigation efforts made by quantitative researchers have been relatively independent in an academic setting, and it is high time that policymakers and stakeholders were involved and played an active role in such modelling efforts. long-term projection of the covid-19 pandemic is most sensitive to and highly dependent on public health policy.
a major source of uncertainty is the conflicting demands between public health (disease mitigation) and the need to sustain economic growth (livelihood), and the balance of the two is a moving target. one way to account for the modelling uncertainty is to factor economic planning into projection models as a time-varying modifier. although some efforts have been made to incorporate economic data, most are retrospectively oriented, and we believe more effort should be spent prospectively incorporating expert inputs and economic forecasts. this is a research area of great importance worth serious exploration. we would like to close this review paper by casting a few open questions of great interest to the public (at least to ourselves) to which statisticians may help deliver answers with existing data or new data collected by innovative study designs. we also hope that these questions motivate new methodological developments. question 1: how would researchers assess both the timing and the strength of the second wave of the covid-19 pandemic? is the second wave worse than the first one? answers to these questions need a relatively accurate long-term prediction of the infection dynamics. among the many different statistical models able to predict future spreading patterns, we need to identify a few, or combinations thereof, that are particularly useful for making long-term predictions. question 2: as many countries and regions started to reopen business, how would governments monitor the likelihood of a recurring surge of covid-19 caused by business reopenings? do social distancing measures help reduce a potentially rising risk? answers to these questions require adequate data that may not be easily collected by routine approaches. statisticians may work with practitioners to develop good sampling instruments and schemes for community risk surveillance. question 3: are face masks protective? if so, how can the compliance of face mask wearing be assessed?
questions about the causal effect of face mask wearing on disease progression are very challenging. this is because there is no randomisation in the intervention allocation and many confounding factors are unobserved. question 4: is there evidence that the contagion of the coronavirus decays over time because of an increasing recovery rate of virus carriers and a decreasing rate of case fatality? statisticians ought to work out some thoughtful and convincing answers for the public. in such surveillance data, there are data reporting gaps, shown in figure a1, that are possibly caused by so-called clustered reporting; that is, the recovered cases have not been released on a daily basis. to mitigate this data reporting artefact, we invoked a simple local polynomial regression procedure (loess) to smooth such unnatural jumps, resulting in the smooth fitted curve shown in figure a1. the calibrated cumulative numbers of removed cases from the fitted curve (rounded to the corresponding integers) are available from the corresponding author upon request. the total population of michigan is set at 9.99 million. the summarised us state-level count data, which are updated weekly, can also be found directly from the esir package introduced in section 5.4.
references:
- special report: the simulations driving the world's response to covid-19
- on modeling epidemics including latency, incubation and variable susceptibility
- an introduction to stochastic epidemic models
- infectious diseases of humans: dynamics and control
- stochastic epidemic models and their statistical analysis
- on identifying and mitigating bias in the estimation of the covid-19 case fatality rate
- the mathematical theory of infectious diseases and its applications. charles griffin & company ltd: 5a crendon street
- estimating excess 1-year mortality associated with the covid-19 pandemic according to underlying conditions and age: a population-based cohort study
- absolute humidity, temperature, and influenza mortality: 30 years of county-level evidence from the united states
- a simple cellular automaton model for influenza a viral infections
- on a general stochastic epidemic model
- an estimation procedure for household disease data
- statistical studies of infectious disease incidence
- real time bayesian estimation of the epidemic potential of emerging infectious diseases
- a probabilistic automata network epidemic model with births and deaths exhibiting cyclic behaviour
- stochastic epidemic models: a survey
- stochastic epidemic models with inference
- general methods for monitoring convergence of iterative simulations
- numerical methods for ordinary differential equations
- a monte carlo approach to nonnormal and nonlinear state-space modeling
- monte carlo em estimation for time series models involving counts
- model parameters and outbreak control for sars
- the estimation of the effective reproductive number from disease outbreak data
- locally weighted regression: an approach to regression analysis by local fitting
- a new framework and software to estimate time-varying reproduction numbers during epidemics
- statistical analysis of time series: some recent developments
- state space mixed models for longitudinal observations with binary and binomial responses
- the simulation smoother for time series models
- the incidence of infectious diseases under the influence of seasonal fluctuations
- the estimation of the basic reproduction number for infectious diseases
- sequential monte carlo methods in practice
- tracking epidemics with google flu trends data and a state-space seir model
- differential privacy: a survey of results
- strategies for mitigating an influenza pandemic
- a likelihood-based method for real-time estimation of the serial interval and reproductive number of an epidemic
- cellular automata and epidemiological models with spatial dependence
- individual-based lattice model for spatial spread of epidemics
- composite likelihood em algorithm with applications to multivariate hidden markov model
- bayesian data analysis
- inference from iterative simulation using multiple sequences
- markov chain monte carlo maximum likelihood. interface foundation of north america
- observed mobility behavior data reveal social distancing inertia
- group testing against covid-19
- clinical characteristics of 2019 novel coronavirus infection in china. medrxiv
- a multi-objective structural optimization using optimality criteria and cellular automata
- temporal dynamics in viral shedding and transmissibility of covid-19
- modeling infectious disease dynamics in the complex landscape of global health
- the mathematics of infectious diseases
- mathematical modelling and prediction in infectious disease epidemiology
- forecasting covid-19 impact on hospital bed-days, icu-days, ventilator-days and deaths by us state in the next 4 months
- on the responsible use of digital data to tackle the covid-19 pandemic
- viral shedding and transmission potential of asymptomatic and paucisymptomatic influenza virus infections in the community
- caution warranted: using the institute for health metrics and evaluation model for predicting the course of the covid-19 pandemic
- mathematical modeling of diseases: susceptible-infected-recovered (sir) model
- algorithms for geodesics
- a contribution to the mathematical theory of epidemics
- estimating the basic reproductive ratio for the ebola outbreak in liberia and sierra leone
- the incubation period of coronavirus disease 2019 (covid-19) from publicly reported confirmed cases: estimation and application
- statistical inference in a stochastic epidemic seir model with control intervention: ebola as a case study
- early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia
- the reproductive number of covid-19 is higher compared to sars coronavirus
- spatial organization and evolution period of the epidemic model using cellular automata
- applications of mathematics to medical problems
- dependence of epidemic and population velocities on basic parameters
- on the spatial spread of rabies among foxes
- on estimating regression
- amid ongoing covid-19 pandemic, governor cuomo announces results of completed antibody testing study of 15,000 people showing 12.3 percent of population has covid-19 antibodies
- the r0 package: a toolbox to estimate reproduction numbers for epidemic outbreaks
- forecasting seasonal influenza with a state-space sir model
- association of public health interventions with the epidemiology of the covid-19 outbreak in wuhan
- reproducible research in computational science
- predictions, role of interventions and effects of a historic national lockdown in india's response to the covid-19 pandemic: data science call to arms
- asymptotic properties of some estimators for the infection rate in the general stochastic epidemic model
- nine challenges for deterministic epidemic models
- dynamical phases in a cellular automaton model for epidemic propagation
- temperature and latitude analysis to predict potential spread and seasonality for covid-19
- mathematical modeling of infectious disease dynamics
- a cellular automaton model for the effects of population movement and vaccination on epidemic propagation
- forecasting epidemics through nonparametric estimation of time-dependent transmission rates using the seir model
- correlated data analysis: modeling, analytics, and applications
- bayesian measures of model complexity and fit
- phase transition in spatial epidemics using cellular automata with noise
- tracking reproductivity of covid-19 epidemic in china with varying coefficient sir model
- improved inference of time-varying reproduction numbers during infectious disease outbreaks
- on some mathematical problems connected with patterns of growth of figures
- an overview of composite likelihood methods
- estimates of the severity of coronavirus disease 2019: a model-based analysis
- theory of self-reproducing automata
- how generation intervals shape the relationship between growth rates and reproductive numbers
- different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures
- an epidemiological forecast model and software assessing interventions on covid-19 epidemic in china (with discussion)
- smooth regression analysis
- modeling epidemics using cellular automata
- epidemic dynamics: discrete-time and cellular automaton models
- naming the coronavirus disease (covid-19) and the virus that causes it
- cellular automaton modeling of epidemics
- an interactive covid-19 mobility impact and social distancing analysis platform
- semiparametric bayesian inference for the transmission dynamics of covid-19 with a state-space model
- a spatiotemporal epidemiological prediction model to inform county-level covid-19 risk in the usa
- semiparametric stochastic modeling of the rate function in longitudinal studies
the authors are very grateful to the co-editors for the invitation to contribute a review paper on statistical modelling and analysis of infectious diseases and for their helpful feedback towards improving the manuscript. this research is partially supported by the national science foundation grant dms1811734. although the two applications discussed earlier in section 6.2 give a framework of how ca models the dynamics of epidemic spread, white et al. (2007) provide a more direct incorporation of spatial ca with the temporal sir compartments at the population level, where each cell stands for a small population (e.g. a county) with different proportions of susceptible, infectious or recovered individuals. the resulting ca-sir model given in white et al. (2007) is formulated by four parts (c, q, v and f).
first, c = {(i, j) : 1 ≤ i ≤ r, 1 ≤ j ≤ c} defines the cellular space, or a collection of r × c cells on a two-way array, where r × c is referred to as the dimension of the cells. second, q represents a finite set that contains all the possible states of a cellular space; in the case of the sir model, q = {s, i, r}, corresponding to the susceptible, infectious and removed states. third, v = {(p_k, q_k) : 1 ≤ k ≤ n} is the finite set of indices defining the neighbourhood of each cell, and consequently, v_ij = {(i + p_1, j + q_1), ..., (i + p_n, j + q_n)} denotes the set of neighbouring cells for the central cell (i, j). specifically, v* = v \ {(0, 0)} represents all the neighbouring cells without the cell at the centre of consideration. fourth, the function f stands for certain updating rules that govern the dynamics of interactions between cells in the ca-sir system. for each cell at a discrete time t (say, today), its current status is described by three cell-specific compartments {θ^S_ij(t), θ^I_ij(t), θ^R_ij(t)}, where θ^S_ij(t), θ^I_ij(t), θ^R_ij(t) ∈ [0, 1] represent the cell-specific probabilities of being susceptible, infectious and recovered, respectively. clearly, θ^S_ij(t) + θ^I_ij(t) + θ^R_ij(t) = 1, forming a microcell-level sir model. the ca-sir model is updated based on transition functions defined for each cell (i, j) ∈ v. based on the basic ca-sir model proposed in white et al. (2007), extensions can easily be applied to better model the dynamics of infectious diseases using real data. a spatio-temporal epidemiological forecast model has been proposed that combines ca with an extended sair (esair) model to project the county-level covid-19 prevalence over 3,109 counties in the continental united states. this model is termed the ca-esair model, in which a county is treated as a cell. to carry out cell-level infection prevalence updates, the macroparameters β and γ need to be estimated from the macrolevel esair model.
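an illustrative discrete-time cell-level update consistent with the sir structure described here might look as follows; this is a sketch under assumed drift terms, and the exact transition functions of white et al. (2007) differ in detail:

```python
def update_cell(theta_s, theta_i, theta_r, beta, gamma, neighbour_pressure):
    """One discrete-time step of a cell-level SIR update (illustrative form).

    neighbour_pressure: the connectivity-weighted infectious contribution
    from neighbouring cells, as described in the text.
    """
    new_infections = beta * theta_s * (theta_i + neighbour_pressure)
    recoveries = gamma * theta_i
    s = theta_s - new_infections          # susceptibles become infected
    i = theta_i + new_infections - recoveries
    r = theta_r + recoveries              # infectious are removed at rate gamma
    return s, i, r
```

note that the three probabilities still sum to one after each step, preserving the microcell-level sir constraint stated above.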
in comparison with the esir model discussed in section 5.2, a new antibody compartment (a) is included in the esair model to account for the individuals who are self-immunised and have developed antibodies to the coronavirus. the inclusion of the antibody compartment can address the under-reporting issue known for available public databases and build self-immunisation into the infection dynamics. in this way, better estimation of the macromodel parameters can be obtained. the esair model can be described using odes that govern the law of interactive movements among four compartments or states of susceptible (s), self-immunised (a), infectious (i) and removed (r), where α(t) is the self-immunisation rate, π(t) is a time-varying transmission rate modifier, β is the basic disease transmission rate and γ is the rate of being removed from the system (either dead or recovered). this esair model is an alternative expression of model (6) based on the compartment probabilities. in order to apply the ca-esair system to model the epidemic spread in the usa, the classical ca-esair is relaxed from spatial lattices (or cells) to areal locations of counties. let c be the collection of 3,109 counties. here we consider the extended neighbourhood type (all counties are neighbouring ones, given the high mobility of the us population). for a county c ∈ c, n_c denotes the county population size, and c_c denotes the set of all the other counties except county c. for county c at time t, the county-specific probability vector is denoted by θ_c(t) = (θ^S_c(t), θ^A_c(t), θ^I_c(t), θ^R_c(t))^T. the ca-esair model at discrete times takes a form in which α_c(t) is the county-specific self-immunisation rate and π_c(t) is the county-specific transmission modifier. as with the ca-sir model (15) mentioned earlier, ω_{cc'}(t) is a connectivity coefficient that quantifies the inter-county movements between counties c and c'.
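the four-compartment dynamics can be sketched with a simple forward-euler step; the drift terms below assume the standard esair form stated above (transmission modified by π(t), self-immunisation at rate α(t)) and are illustrative only:

```python
def esair_step(s, a, i, r, beta, gamma, alpha_t, pi_t, dt):
    """One forward-Euler step of the eSAIR compartment probabilities
    (illustrative; the paper's inference uses the full ODE/MCMC machinery).

    s, a, i, r -- probabilities of susceptible / self-immunised /
                  infectious / removed, summing to one.
    """
    ds = -beta * pi_t * s * i - alpha_t * s  # leave S via infection or antibodies
    da = alpha_t * s                         # enter the antibody compartment
    di = beta * pi_t * s * i - gamma * i     # new infections minus removals
    dr = gamma * i
    return s + ds * dt, a + da * dt, i + di * dt, r + dr * dt
```

the increments cancel exactly, so the four probabilities continue to sum to one at every step.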
by applying the proposed ca-esair model, a t-day-ahead risk forecast of covid-19, as well as a personal risk related to a travel route, has been proposed. the runge-kutta method is an efficient and widely used approach to solving ordinary differential equations when analytic closed-form solutions are unavailable. it is typically applied to derive a numerical solution of high-order accuracy with no need for high-order derivatives of functions. the most well-known runge-kutta approximation is the runge-kutta fourth-order (rk4) method. for example, in the case of the mechanistic sir model (1), consider an ode of the form y' = f(t, y), where y is an unknown function in time t, which can be either a scalar or a vector. then, for a preselected (small) step size h > 0, a fourth-order approximate solution of y satisfies, at a sequence of equally spaced grid points t_n, n = 0, 1, ..., with |t_n − t_{n−1}| = h, y_{n+1} = y_n + (1/6) h (k_1 + 2 k_2 + 2 k_3 + k_4), n = 0, 1, ..., where k_1 = f(t_n, y_n); k_2 = f(t_n + h/2, y_n + (h/2) k_1); k_3 = f(t_n + h/2, y_n + (h/2) k_2); k_4 = f(t_n + h, y_n + h k_3). because the four terms k_1, k_2, k_3 and k_4 are used in the approximation, the above method is termed an rk4 method for the ode solution of the function y. for a general rk approximation, refer to stoer and bulirsch (2013). in the succeeding text, we list michigan data from 11 march to 10 june 2020. the numbers of daily confirmed cases and deaths are obtained from the github repository jhu csse (https://github.com/cssegisanddata/covid-19), and the daily recovery data are collected from 1point3acres (https://coronavirus.1point3acres.com). the daily cumulative numbers of deaths and recovered cases are then summed as the cumulative number of removed cases.
key: cord-295786-cpuz08vl authors: castillo-sánchez, gema; marques, gonçalo; dorronzoro, enrique; rivera-romero, octavio; franco-martín, manuel; de la torre-díez, isabel title: suicide risk assessment using machine learning and social networks: a scoping review date: 2020-11-09 journal: j med syst doi: 10.1007/s10916-020-01669-5 doc_id: 295786 cord_uid: cpuz08vl according to the world health organization (who) report in 2016, around 800,000 individuals have committed suicide. moreover, suicide is the second cause of unnatural death in people between 15 and 29 years of age. this paper reviews the state of the art in the literature concerning the use of machine learning methods for suicide detection on social networks. consequently, the objectives, data collection techniques, development process and validation metrics used for suicide detection on social networks are analyzed. the authors conducted a scoping review using the methodology proposed by arksey and o'malley, and the prisma protocol was adopted to select the relevant studies. this scoping review aims to identify the machine learning techniques used to predict suicide risk based on information posted on social networks. the databases used are pubmed, science direct, ieee xplore and web of science. in total, 50% of the included studies (8/16) explicitly report the use of data mining techniques for feature extraction, feature detection or entity identification. the most commonly reported method was linguistic inquiry and word count (4/8, 50%), followed by latent dirichlet analysis, latent semantic analysis, and word2vec (2/8, 25%). non-negative matrix factorization and principal component analysis were each used in only one of the included studies (12.5%). in total, 3 out of 8 research papers (37.5%) combined more than one of those techniques. support vector machines were implemented in 10 out of the 16 included studies (62.5%).
finally, 75% of the analyzed studies implement machine learning-based models using python. supplementary information: the online version contains supplementary material available at 10.1007/s10916-020-01669-5. according to the world health organization (who) report in 2016, nearly 800,000 people have committed suicide [1]. suicide is a tragic situation that affects families and neighbours, leaving significant effects on those who survive. it is considered the second cause of unnatural death in people between 15 and 29 years old [2]. the report on "death statistics according to cause of death in spain", published by the national statistics institute in 2017, states a total of 3679 suicides. moreover, 140 fewer suicides (3539) were reported in 2018 than in the previous year [3]. the multiple scenarios that families and individuals face in their daily routine can lead to this tragic situation. consequently, suicide is a critical public health challenge that numerous countries address in different manners [4]. suicidal behaviours are a complex phenomenon influenced by multiple factors such as biological, clinical, psychological, and social considerations [5]. on the one hand, suicide is preceded by milder manifestations, such as thoughts of death or suicidal ideation [6]. on the other hand, suicide is closely related to the model of society in which an individual lives [7]. moreover, it is directly related to the experience of high-stress circumstances and lifestyle changes [8]. currently, the effects of covid-19 and isolation will cause a significant emotional impact worldwide [9]. in particular, people who have suffered from mental health diseases are in an even more fragile situation [10]. therefore, an increase in anxiety and depression disorders, drug use, loneliness, domestic violence and even suicide is expected to occur in these individuals [11].
consequently, the risk of suicide attempts has increased among the population [9]. multiple novel factors contribute to an increase in suicide risk [12]. in particular, the measures for prevention of covid-19 that include social distancing plans are strictly related to suicide risk [9]. the reduction in physical contact can lead to a loss of protection against suicide [9]. these factors will be even more relevant among people who have previous mental health problems [13]. social distancing is necessary to control the covid-19 pandemic and decrease the propagation of the virus [14]. however, a global perspective on indirect mortality is also essential [15]. social distancing is connected to an increased risk of suicidal behavior [16]. therefore, social distancing must be addressed through a global intervention plan that implements new models to combat physical distancing using social networks [17]. in this context, several new technologies have been identified as a crucial resource to detect people at suicide risk [18]. furthermore, young people, who constitute a vulnerable group, commonly use social networks [19]. social networks are a popular method of communication between people [20]. consequently, social networks are an appropriate means to recognize the behaviour of a person according to the content of their posts [21]. the analysis of users' posts on social media is a complex problem [22]. the complexity is even higher if the objective is to estimate suicide risk [23]. also, if the analysis is carried out manually by experts, discrepancies usually occur due to the peculiarities of the language used in social networks [24]. therefore, automatic architectures that use machine learning (ml) methods should be developed. nevertheless, many of these automated systems require the availability of datasets that allow the training of predictive models, which is a critical limitation [25].
on the one hand, these datasets currently do not exist, or they have limited specifications. on the other hand, unsupervised models do not require training. however, these models need datasets for validation [26]. currently, the use of ml techniques to analyze health-related data is a trending topic. moreover, the use of systems based on ml in different areas, such as disease diagnosis and bioinformatics, presents promising results [27] [28] [29] [30] [31] [32] [33]. in particular, for mental health, various models and tools for suicide risk prevention have been proposed in the literature [34]. this scoping review aims to identify the current ml techniques used to predict suicide risk based on information posted on social networks. this paper reviews the state of the art on this topic, focusing on the ml methods, the objectives, the data collection techniques, the development process and the validation metrics used. the main contribution of this study is to summarize the state of the art and to provide a description of the common outcomes and limitations of current research to support future investigations. the remainder of this paper is organized as follows. section 2 presents the methodology concerning the search strategy, study selection criteria, screening process, and data extraction. the included studies are analyzed in section 3 and are discussed in section 4. finally, the most relevant findings and the limitations of the study are summarized in section 5. the prisma extension for conducting scoping reviews, the technical details of the machine learning techniques, internal validation strategies and main outcomes of the selected studies are included as supplementary material. this study summarizes the requirements and methods for enhanced suicide risk assessment using social networks. consequently, the authors conducted a scoping review using the methodology proposed by arksey and o'malley [35].
furthermore, the authors have followed the prisma-scr proposed by tricco et al. [36]. the overall procedure is annexed as supplementary material (appendix i). on the one hand, the arksey and o'malley framework [35] is widely used in scoping reviews in the health domain. this framework presents relevant recommendations to summarize findings and identify research gaps in the existing literature. on the other hand, the prisma extension for scoping reviews developed by tricco et al. [36] defines a checklist of the significant items to be reported when a scoping review is conducted. the authors have performed a systematic search to identify relevant papers that use suicide risk assessment models in social networks. the search was conducted during march 11-13, 2020. the databases used are pubmed, science direct, ieee xplore and web of science, since they are the most relevant sources and include the most significant scientific work. the authors defined the search terms, and the selection of the studies focused on literature written in the english language. the search string used in the databases was: ["suicide" and ("social networks" or "social network") and "algorithm"]. to select the relevant studies on this topic, the authors defined the following inclusion criterion: the studies must include algorithms or models to estimate suicide risk using social networks. the research papers were excluded if they were not written in the english language, did not include a specific suicide intervention or did not report information regarding technical aspects of the model/algorithm used to detect suicide risk on social networks. the screening process of the papers obtained through the search strategy was performed by two authors independently (gc and gm). the process was divided into two phases. firstly, the authors reviewed the title and abstract. secondly, the authors analyzed the full manuscript. the conflicts were resolved by common consensus.
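when two reviewers screen papers independently, as described above, their agreement is commonly checked with a chance-corrected statistic such as cohen's kappa before conflicts are resolved by consensus. the following is a generic stdlib-only sketch of that statistic, not code taken from the review.

```python
# minimal cohen's kappa between two annotators; a generic illustration,
# not a method reported in the reviewed studies.
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """agreement between two annotators, corrected for chance agreement."""
    n = len(labels_a)
    # observed proportion of items both annotators labelled identically
    observed = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n
    # expected agreement if both labelled independently at their own rates
    ca, cb = Counter(labels_a), Counter(labels_b)
    expected = sum(ca[k] * cb[k] for k in set(ca) | set(cb)) / (n * n)
    return (observed - expected) / (1 - expected)

if __name__ == "__main__":
    # "include" = 1, "exclude" = 0 for eight hypothetical screened papers
    reviewer_a = [1, 1, 0, 0, 1, 0, 1, 0]
    reviewer_b = [1, 1, 0, 0, 0, 0, 1, 1]
    print(round(cohens_kappa(reviewer_a, reviewer_b), 3))
```

a kappa of 1 means perfect agreement, 0 means agreement no better than chance; values in between are usually interpreted with conventional benchmarks (e.g. above 0.6 as substantial).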
the extraction of the data from the selected studies was performed by four authors (gc, gm, or and ed). the authors examined the completed form for consistency and accuracy. the extracted data is split into two sets: general and technical information. general information refers to the title, year, authors, objectives and methods included in the study. the technical information set is based on luo et al.'s guidelines [37] and contains the following categories:

- objectives: the main goals of the proposed ml models. a taxonomy was defined to describe those goals:
  - text classification: models that aim to classify posts into several categories, including binary classification, based on post content.
  - entity recognition: models that aim to identify public entities in the text.
  - emotion recognition: models that aim to identify emotions expressed in the post content.
  - feature extraction: models that aim to collect information regarding characteristics of the post content, such as lexical, semantic or sentiment features (word polarity).
  - topic identification: models that aim to analyze the themes addressed in the dataset or the posts.
  - feature selection: models that aim to automatically select features, including optimization and feature reduction, to be included as predictor parameters in the predictive model.
  - score estimation: models that aim to estimate a quantitative suicide risk value.
- data sources: where the dataset for the study is collected. we have followed the taxonomy used by gonzalez-hernandez et al. [38]:
  - generic social network (gsn): social networks containing information about a range of topics (e.g. twitter, facebook and instagram).
  - online health community (ohc): domain-specific networks dedicated exclusively to discussions associated with health.
- inclusion and exclusion criteria: information regarding the method followed to include the data in the dataset.
the authors define the following possible categories:

- keywords: this category includes all studies that defined a set of keywords, hashtags, or phrases to be used as queries or filters.
- direct selection: a set of participants is selected, and then data from their social networks are included.

the remaining categories of the technical information set are:

- dataset annotation: the labelling process followed for dataset annotation. the authors defined the following possible methods:
  - manual annotation: the annotation process involved the participation of humans who assessed post contents and assigned one of the possible classes defined.
  - corpus: the authors used an existing annotated corpus to train and test the proposed predictive models.
  - previous scores: an assessment using a standard scale or another quantitative instrument was previously conducted; posts were then labelled according to the user's score.
- ml techniques: the general ml techniques used in the study.
- platform: the platform or programming language used to develop the proposed ml models.
- strategy: how datasets were split into training and testing data.
- performance metrics: the metrics used to evaluate the performance of the models.
- outcomes: the predictive performance of the final model.

the authors retrieved 426 articles in the search conducted in the research databases. after removing duplicates, 424 items were selected for screening. the title and abstract review stage resulted in the exclusion of 344 articles, since most of the studies do not cumulatively focus on suicide risk, social networks and ml methods. after the application of the inclusion and exclusion criteria, 19 papers remained for full-text review. three articles were excluded in the full-review stage. one study was excluded since it is based on suicidal behaviour without including a social media analysis [39]. another study was excluded because it proposes an approach to analyze social media posts for suicide detection, but the authors did not develop any model [40].
finally, the last exclusion in this stage was made because the study proposed in [41] does not include ml techniques. from the full-text review, 16 articles were then selected for inclusion [26, 42-56]. the flow diagram representing the search process is shown in fig. 1. furthermore, the detailed information is presented as supplementary material (appendix i). the results of the application of artificial intelligence algorithms or models for suicide risk identification using data collected from social networks have been analyzed in this study. furthermore, this paper presents a summary and comparison of the state-of-the-art methods and technical details that address this critical public health challenge. this section introduces a brief description of the articles included in this scoping review. ambalavan et al. 2019 [42] developed several methods based on nlp and ml to study the suicidal behaviour of individuals who attempted suicide. the authors built a set of linguistic, lexical, and semantic features that improved the classification of suicidal thoughts, experiences, and suicide methods, obtaining the best performance using a support vector machine (svm) model. birjali et al. 2017 [43] presented a method based on ml classification for the social network twitter to identify tweets with risk of suicide. the authors used svm, where sequential minimal optimization (smo) achieved the best results in terms of precision (89.5%), recall (89.11%) and f-score (89.3%) for suspected tweets with a risk of suicide. burnap et al. 2017 [44] developed a set of ml models (using lexical, structural, emotive and psychological features) to classify texts relating to communications around suicide on twitter. this study presents an improved baseline of the classifier using the random forest (rf) algorithm and a maximum probability voting classification decision method.
furthermore, the proposed method achieves an f-score of 72.8% overall and 69% for the suicidal ideation class. chiroma et al. 2018 [45] measured the performance of five ml algorithms, namely prism, decision tree (dt), naïve bayes (nb), rf and svm, in classifying suicide-related text from twitter. the results showed that the prism algorithm outperformed the other ml algorithms with an f-score of 84% for the target classes (suicide and flippant). desmet et al. 2018 [46] implemented a system for automatic emotion detection based on binary svm classifiers. the researchers used lexical and semantic features to represent the data, as emotions seemed to be lexicalized consistently. the classification performance varied between emotions, with f-scores up to 68.86%. nevertheless, f-scores above 40% were achieved for six of the seven most frequent emotions: thankfulness, guilt, love, information, hopelessness and instructions. du et al. 2018 [47] investigated several techniques for recognizing suicide-related psychiatric stressors from twitter using deep learning-based methods and transfer learning strategies. the results show that these techniques offer better results than traditional ml methods. using a convolutional neural network (cnn), they improved the performance of identifying suicide-related tweets with a precision of 78% and an f1-score of 83%, outperforming svm, extra trees (et), and other ml algorithms. the recurrent neural network (rnn) based recognition of psychiatric stressors achieved the best f1-scores of 53.25% by exact match and 67.94% by inexact match, outperforming conditional random fields (crf). fodeh et al. 2018 [48] proposed a suicidal ideation detection framework that requires minimum human effort in annotating data by incorporating unsupervised discovery algorithms. this study includes latent semantic analysis (lsa), latent dirichlet allocation (lda), and non-negative matrix factorization (nmf) to identify topics. the authors conducted two analyses with k-means clustering and dt algorithms.
dt showed better precision (84.4%), sensitivity (91.2%) and specificity (82.9%). grant et al. 2018 [49] automatically extracted informal latent recurring topics of suicidal ideation found in social media posts using word2vec. the proposed method uses descriptive analysis and can identify issues similar to the experts' risk factors. jung et al. 2018 [50] implemented an ontology and terminology method to provide a semantic foundation for analyzing social media data on adolescent depression. they evaluated the ontology, obtaining the best values of precision (76.1%) and accuracy (75%) using dt algorithms. liu et al. 2019 [51] performed a study to evaluate the feasibility and acceptability of proactive suicide prevention online (pspo). pspo is a new approach based on social media that combines proactive identification of suicide-prone individuals with specialized crisis management. they evaluated different ml models in terms of accuracy, precision, recall and f-measure to identify the best performance. the svm model showed the best performance overall, indicating that pspo is feasible for identifying populations at risk of suicide and providing effective crisis management. o'dea et al. 2015 [52] studied whether the level of concern for a suicide-related post on twitter could be determined based solely on the content of the post, as judged by human coders and then replicated by ml. they evaluated ml models and found that the best performing algorithm was the svm with term frequency weighted by inverse document frequency (tf-idf). the results show a prediction accuracy of 76%. parraga-alava et al. 2019 [26] present an approach to categorize potential suicide messages in social media based on unsupervised learning using traditional clustering algorithms. the computational results showed that the hierarchical clustering algorithm (hca) was the best model for binary clustering, achieving average f1-scores of 79% and 87% for english and spanish, respectively. sawhney et al.
2019 [53] investigated feature selection using the firefly algorithm to build an efficient and robust supervised approach for suicide risk detection using tweets. after applying different ml techniques, rf + bfa and cnn-lstm obtained the best results in accuracy, precision, recall and f1-scores on specific datasets. shahreen et al. 2018 [54] used svm and neural networks (nn) for text classification on twitter. the researchers used three types of weight optimizers, namely limited-memory bfgs, stochastic gradient descent, and adam, an extension of stochastic gradient descent, to obtain maximum accuracy. the results show an accuracy of 95.2% using svm and 97.6% using neural networks. they used 10-fold cross-validation for model performance evaluation. sun et al. 2019 [55] proposed a hybrid model that combines a convolutional neural network with long short-term memory (cnn-lstm) and a markov chain monte carlo (mcmc) method to identify users' emotions, sample users' emotional transitions and detect anomalies according to the transition tensor. the results show that emotions can be well sampled to conform to the user's characteristics, and that anomalies can be detected using this model. zhang et al. 2014 [56] used nlp methods and ml models to estimate suicide probability based on linguistic features. the experiments performed by the researchers indicate that the lda method finds topics that are related to suicide probability and improves the performance of prediction. they obtained the best root mean square error (rmse) value of 11 with a linear regression on a 1-32 scale. this paper presents a detailed analysis of the results in the following sections: study objectives, data collection, and the model development process, the latter covering data pre-processing, data preparation, sentiment analysis, dataset annotation, ml techniques, platforms and internal validation. the distribution of the included studies according to the year of publication is presented in fig. 2.
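several of the models summarized above weight terms by tf-idf before classification (e.g. the svm in o'dea et al. [52]). the following stdlib-only sketch shows the plain tf x log(n/df) variant of that weighting on a hypothetical toy corpus; the reviewed studies used library implementations, and other smoothing variants exist.

```python
# minimal tf-idf weighting; the corpus below is a made-up illustration.
import math
from collections import Counter

def tfidf(corpus):
    """return one {term: weight} dict per document, weight = tf * idf.

    tf  = raw count of the term in the document
    idf = log(n_docs / df), where df is the number of documents
          containing the term; a term in every document gets weight 0
    """
    docs = [doc.lower().split() for doc in corpus]
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    return [{t: c * math.log(n / df[t]) for t, c in Counter(tokens).items()}
            for tokens in docs]

if __name__ == "__main__":
    weights = tfidf(["suicide risk post", "holiday photo post", "risk factors"])
    # "risk" appears in 2 of 3 documents, so its idf is log(3/2)
    print(round(weights[0]["risk"], 3))
```

the resulting per-document weight vectors are what a downstream classifier such as an svm consumes; rare, document-specific terms get high weights while ubiquitous terms are damped toward zero.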
most of the included studies propose models to classify collected text into suicide-related categories. text classification is the most common objective in the included studies (12/16, 75%) [26, 42-48, 51-54]. a score estimation of suicidal probability based on post content was proposed in one of the included studies (1/16, 6.25%) [56]. feature extraction and feature selection were identified as main objectives in four different studies (4/16, 25%) [48, 50, 53, 56]. the remaining categories (entity recognition, theme identification and emotion recognition) were each identified in only one study (1/16, 6.25%) [47, 49, 55]. in total, 4 of 16 studies (25%) can be grouped in two categories, involving text classification (3/4) [47, 48, 53] or score estimation (1/4) [56]. different data sources were selected to perform data collection for the training and testing of the proposed models. in total, 13 out of the 16 included studies (81.25%) used generic social networks (gsns) for data collection [26, 43-48, 50, 52-56]. the most popular gsn used as the data source in the included studies was twitter (10/16, 62.5%), followed by forums or microblogs (3/16, 18.75%). other gsns used were weibo (2/16, 12.5%), facebook, instagram, tumblr, and reddit (1/16, 6.25%). three studies used ohcs (18.75%): two of them used suicide-related subreddits [42, 49], and the other one used the sina microblog [51]. the three studies that collected data from ohcs used all posts/comments without defining inclusion/exclusion criteria. most of the remaining studies defined suicide-related keywords or phrases to filter posts (10/13, 76.92%) [43-48, 50, 52-54]. zhang et al. [56] recruited potential participants, and the selected participants' posts on weibo were then used. finally, two studies that used gsns did not define inclusion/exclusion criteria (2/13, 15.38%) [26, 55].
the data collection time span must be reported in ml-based studies, as defined by luo et al.'s guidelines [37]. however, seven of the included studies did not report the time span over which data collection was performed (43.75%) [42, 43, 45, 46, 54-56]. one of the included studies did not report the dataset size (1/16, 6.25%) [54]. the dataset sizes were between 102 posts (minimum) and 1,100,000 posts (maximum). four out of the remaining 15 studies used sample sizes between 100 and 999 posts (26.67%) [26, 42-44]. three of them used sample sizes with more than 800 posts. five studies reported dataset sizes between 1000 and 5000 posts (33.33%) [45, 47, 50, 52, 53]. finally, six studies used large datasets, including more than 10,000 posts (40%). the number of users/participants represented in those datasets was reported in only three studies (18.75%). one of those three studies recruited 697 participants and then collected data from their weibo accounts [56]. the other two studies analyzed the collected user data to report the number of unique users involved in the study (n = 3873; n = 63,252) [48, 49]. although using basic statistics to describe the dataset is defined as a relevant factor regarding the reliability of ml-based studies in the health domain, as suggested by luo et al.'s guidelines [37], three of the included studies did not report any dataset description (3/14, 21.43%) [42, 54, 55]. moreover, only three studies included information regarding ethical issues in collecting and managing social media data (3/16, 18.75%). two of those studies obtained ethical approval from an ethics committee: liu et al. [51] from the institutional review board of the institute of psychology, chinese academy of science, and o'dea et al. [52] from the university of new south wales human research ethics committee and the csiro ethics committee. the remaining study, conducted by ambalavan et al.
[42], adhered to the guidelines defined by kraut et al. 2004 [57]. it is highlighted that zhang et al. [56] assessed participants' suicide probability using a standard scale and collected personal data; however, information regarding ethical approval was not reported in the article. table 1 presents the summary of the results in terms of the objectives of the study, data sources, ethical aspects, inclusion and exclusion criteria, time span, number of posts, number of participants and the description of the data of the papers included in this work. data pre-processing is a typical stage in the development process of ml-based models. this stage includes several techniques such as data cleaning, word removal (stop words and punctuation), data transformation, and addressing challenges of outlier or missing values. the reported information regarding data pre-processing is critical for study reproducibility. most of the included studies reported information regarding the pre-processing stage (14/16, 87.5%) [26, 42, 44-50, 52-56]. several of these studies only reported vague information and did not include details on the specific techniques and tools used. however, the inclusion of a (sub)section describing data pre-processing is not mandatory. in total, 4 studies included a section/subsection reporting information regarding pre-processing. the remaining studies reported this information in the text. moreover, some studies presented this information in a different part of the article. the data mining techniques for feature extraction, feature detection, or entity recognition used in the included studies are summarized in table 2. in total, 50% of the included studies (8/16) report the use of data mining techniques for feature extraction, feature detection or entity identification [26, 44, 46, 48, 49, 51, 53, 56].
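the pre-processing steps listed above (cleaning, stop-word and punctuation removal) reduce to a short pipeline. the sketch below is a generic stdlib-only illustration; the stop-word list and the example post are made up, not taken from any of the reviewed studies.

```python
# minimal social-media text pre-processing: lowercase, strip urls and
# mentions, remove punctuation, drop stop words. illustrative only.
import re
import string

# tiny illustrative stop-word list; real pipelines use much larger ones.
STOP_WORDS = {"a", "an", "the", "is", "to", "of", "and", "i", "my"}

def preprocess(post):
    """return the cleaned token list for one social-media post."""
    text = post.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)  # urls and @mentions
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in text.split() if tok not in STOP_WORDS]

if __name__ == "__main__":
    print(preprocess("I can't cope with the stress... @friend http://t.co/x"))
```

note that stripping punctuation wholesale also erases emoticons, which is exactly the limitation concerning visual communication items raised later in the discussion.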
the most commonly reported technique was liwc (4/8, 50%), followed by lda, lsa, and word2vec (2/8, 25%). moreover, nmf and pca were used in only one of the included studies (12.5%). in total, 3 out of 8 studies (37.5%) combined more than one of those techniques. seven out of the 16 included studies include sentiment analysis (43.75%). a sentiment ratio or polarity value was assigned to words or features in these studies. two of these studies used sentiwordnet to obtain the sentiment value [43, 44]. also, two studies used the categories defined in liwc as a basis for sentiment value estimation [44, 56]. furthermore, two studies used previously published lexicons to calculate it [46, 53]. finally, two studies calculated those values automatically [50, 55]. supervised learning techniques require labelled, coded, or annotated datasets to train and test the models. in total, 15 out of the 16 included studies required annotated datasets. one of those studies did not report how annotations were performed (6.67%) [54]. most of the studies followed a manual process to annotate the training and test datasets, involving experts in the codification process (10/15, 66.67%) [42-47, 50-53]. some of these studies reported in detail how the annotation process was performed. two studies used an existing annotated corpus (13.33%) [26, 55]. in one study (6.67%), the authors designed an algorithm to generate the labels automatically [48]. finally, a study recruited participants and assessed the participants' suicide probability using a standard scale, the suicide probability scale, and the model results were compared to those obtained using the scale (6.67%) [56]. the ml techniques identified in the included studies include, among others, svm, dt, lr, rf, dl, nb, km, nn, lir, knn, gbm, rof, pam, hierarchical clustering analysis (hca), and association rules (ar). table 3 shows the distribution of these techniques in the included studies. svm was the most used technique, being implemented in 10 out of the 16 included studies (62.5%) [42-47, 51-54].
the second most used technique was dt (7/16, 43.75%) [43-45, 47, 48, 50, 51], followed by lr (5/16, 31.25%) [42, 50-53] and rf (4/16, 25%) [45, 47, 51, 53]. dl, nb and km were used in 3 out of the 16 included models (18.75%). in total, 2 models based on nn were proposed (12.5%) [42, 50]. finally, 7 of those 15 techniques were used in only one study (lir, knn, gbm, rof, pam, hca, and ar). in total, 25% of the included articles used only one technique to implement the proposed model [46, 49, 55, 56]. the remaining studies developed the proposed models using 2 different techniques (3/16, 18.75%) [48, 52, 54], 3 techniques (3/16, 18.75%) [26, 42, 50], 4 techniques (5/16, 31.25%) [43-45, 47, 51], or 5 techniques (1/16, 6.25%) [53]. the platform or software tool used to implement the ml-based models is identified in half of the included studies. python was the most used tool (6/8, 75%) [26, 42, 46, 49, 52, 54]. one of these studies combines python and r [26]. two out of the 8 studies used the weka software to develop the proposed models [43, 44]. one of the included studies focuses on topic identification, and the authors followed a manual analysis of the topics proposed by the models to estimate their validity [49]. five of the remaining included studies did not report information regarding the internal validation strategy followed to assess the validity of the proposed models (33.33%) [26, 43, 48, 50, 55]. 10-fold cross-validation was the most implemented strategy in the included studies (8/10, 80%) [44-46, 51-54, 56]. one study followed a 70-30 proportion rule to split the dataset into training and test datasets (10%) [42]. however, the technique used to split the data is not reported. another study followed a 7-1-2 proportion to split the dataset for classifier model validation and a manual selection for classifier validation (10%) [47]. all studies reported the performance parameters used in the validation process.
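the validation strategies named above (k-fold cross-validation and fixed-proportion splits such as 70-30) reduce, at the index level, to partitions like the following generic stdlib-only sketch; it is an illustration, not code from the reviewed studies.

```python
# generic dataset-splitting helpers for k-fold cross-validation and
# fixed-proportion holdout; illustrative only.
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]   # k nearly equal folds
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test

def train_test_split(n_samples, train_frac=0.7, seed=0):
    """simple fixed-proportion split, e.g. the 70-30 rule."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    cut = int(train_frac * n_samples)
    return idx[:cut], idx[cut:]

if __name__ == "__main__":
    tr, te = train_test_split(100, 0.7)
    print(len(tr), len(te))  # 70 30
```

in k-fold cross-validation every sample appears in exactly one test fold, so each model is scored on data it never saw during training; the k scores are then averaged into the reported metric.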
precision, recall and f-score are the most used performance parameters (12/15, 80%). in total, 66.67% (10/15) of the studies used accuracy as a performance value. fodeh et al. [48] used specificity, sensitivity and the area under the receiver operating characteristic (roc) curve (6.67%). zhang et al. [56] used the rmse value to validate their estimation model. social networks are an effective method to detect some behaviours. moreover, they are particularly relevant to identify subjects at suicide risk. the extensive use of social networks led the authors to investigate the current scenario concerning suicide prevention. this is the primary motivation of the presented research. this study verifies the trends and results of applying ml algorithms and the methods used by various researchers to address this critical situation. indeed, considering the covid-19 pandemic, social networks are one of the most used methods of communication. therefore, it is relevant to survey the main techniques, algorithms and models applied to social networks to detect suicidal risk behaviours. in total, 43.75% (7/16) of the studies do not provide the time span information concerning the experiments conducted. this is a relevant limitation, as proposed by luo et al.'s guidelines [37]. moreover, 81.25% (13/16) do not specify the number of participants involved. the anonymization of the participant information should be justified. however, it is possible to characterize the participants involved in the studies and maintain their privacy at the same time. this information allows us to conclude that the quality of the reports of suicide risk prediction models must be increased. the authors must report relevant items to ensure reliability. furthermore, the details of the datasets used are not presented in 18.75% (3/16) of the analyzed literature.
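the performance parameters named above can all be computed from the confusion-matrix counts (for classification) or from the residuals (for score estimation). the following stdlib-only sketch, with made-up predictions, illustrates precision, recall, f-score and rmse; it is a generic illustration, not code from the reviewed studies.

```python
# minimal computation of the validation metrics reported by the reviewed
# studies; the example labels and predictions are made up.
import math

def precision_recall_f1(y_true, y_pred, positive=1):
    """precision, recall and f-score for one positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def rmse(y_true, y_pred):
    """root mean square error, as used for suicide-probability estimation."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
                     / len(y_true))

if __name__ == "__main__":
    p, r, f = precision_recall_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
    print(round(p, 3), round(r, 3), round(f, 3))  # 0.667 0.667 0.667
```

for imbalanced tasks such as suicide-risk detection, precision and recall on the positive class are usually more informative than accuracy, which a trivial majority-class predictor can inflate.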
the use of basic statistics to describe the dataset is defined as a relevant factor regarding the reliability of ml-based studies in the health domain, as proposed by luo et al.'s guidelines [37]. the dataset description is of utmost importance, since the efficiency of the reported results and their future improvements are closely connected with the sample size. furthermore, three studies did not report any dataset description (3/14, 21.43%) [42, 54, 55]. consequently, it is critical to question what reasons can justify the absence of a dataset description. indeed, this can be related to confidentiality concerns. however, it is essential to mention that without the complete dataset information it is not possible to ensure the absence of bias or deficiencies in the information used. moreover, it is not possible to ensure the reproduction of the experiments. in total, 76.92% (10/13) of the studies defined suicide-related keywords or phrases for text analysis. furthermore, text classification is the objective of 75% of the analyzed studies. consequently, this denotes a significant limitation concerning the multiple forms of visual communication items, such as emoticons, that are currently used. however, the reason why most of the authors do not consider visual components in sentences is not clear. this can be related to technical limitations of the software tools used. consequently, it is necessary to promote new research activities to solve this critical limitation. the data pre-processing stage is required to develop or replicate an ml-based model. therefore, most of the included studies indicated information about the pre-processing stage (14/16, 87.5%) [26, 42, 44-50, 52-56]. moreover, it should be noted that the majority of the studies only present vague information regarding data pre-processing methods and validation strategy. pre-processing is an essential aspect of detecting suicide risk using ml.
however, according to the results achieved, there is a significant limitation related to the unstandardized information of each analyzed research work. additionally, the authors note that most of the reviewed papers do not present the data processing methods in detail. consequently, the real reason for this scenario remains a significant open question. this can be related to methodological or practical difficulties. however, the question about what motivates this trend still exists. furthermore, there is no justification for this scenario in the before-mentioned studies. therefore, future research on the subject should ensure detailed reporting of the pre-processing methods. the lack of a specific annotated dataset for suicide risk on social media is also a critical limitation. in total, 10 of the 15 papers (66.67%) performed manual annotation. however, it should be noted that the peculiarities of the multiple languages used in social networks can be a relevant limitation for data labelling [38, 58]. sentiment analysis has in most cases been performed by assigning polarity to words [59]. however, these polarities could vary according to specific domains such as suicide and considering the terminology used in social networks. therefore, it is relevant to perform sentiment analysis that encompasses linguistic entities such as phrases [60]. stakeholders have reported several ethical issues as critical factors in the use of social media as a participatory health tool [61]. in this sense, those relevant issues must also be addressed appropriately in ml research applied to the health domain. despite this relevance, ethics is not appropriately discussed by the authors in their reports. there is a lack of information regarding ethical issues in the included studies. only three studies included information regarding ethical issues in the collection and treatment of social media data (3/16, 18.75%).
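returning to the word-polarity approach to sentiment analysis discussed above, a minimal lexicon-based sketch might look as follows; the lexicon and its domain-tuned values are entirely hypothetical and stand in for real resources such as those cited in [59].

```python
# toy word-polarity lexicon; values are invented and deliberately
# domain-tuned, echoing the point that generic polarities may shift
# in a suicide-related context.
LEXICON = {"hopeless": -2.0, "alone": -1.0, "pain": -1.5,
           "better": 1.0, "hope": 1.5, "support": 1.0}

def polarity(tokens):
    """Sum word polarities; out-of-lexicon words contribute 0."""
    return sum(LEXICON.get(t, 0.0) for t in tokens)

print(polarity(["feeling", "hopeless", "and", "alone"]))  # → -3.0
```

the obvious weakness, which motivates the phrase-level analysis recommended above, is that this word-by-word sum cannot capture negation or multi-word expressions.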
however, the doubt regarding the reasons that justify the inexistent ethical agreements of the majority of the works still exists. consequently, a critical limitation is found regarding the ethical concerns involved in the collection and analysis of this sensitive type of data. two of those studies obtained ethical approval from an ethics committee ([42, 52]). however, the data gathering method itself raises ethical and privacy concerns and remains a controversial practice. to justify its use, formal prospective studies analyzing if and how physician access to a patient's social media influences care should be performed [62]. this paper has presented a scoping review on the main techniques, algorithms and models applied to social networks to detect suicidal risk. in total, 75% of the included studies propose models to classify collected text into suicide-related categories, making text classification the main objective of the reviewed literature. furthermore, 50% of the included studies (8/16) explicitly report the use of data mining techniques for feature extraction, feature detection or entity identification. the most commonly reported method was liwc (4/8, 50%), followed by lda, lsa, and word2vec (2/8, 25%). nmf and pca were each used in only one of the included studies (12.5%). in total, 3 out of 8 research papers (37.5%) combined more than one of those techniques. on the one hand, svm was the most used technique, being implemented in 10 out of the 16 included studies (62.5%). on the other hand, the second most used technique was dt (7/16, 43.75%), followed by lr (5/16, 31.25%) and rf (4/16, 25%). the most used platform to implement the ml-based models is python (6/8, 75%). furthermore, all studies reported the performance parameters used in the validation process. precision, recall and f-score were the most used performance parameters (12/15, 80%). in total, 10 out of 15 studies used accuracy as a performance evaluation metric (66.67%).
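to make the dominant pipeline shape concrete (vectorise posts, then train a linear classifier such as an svm), here is a dependency-free sketch; a simple mistake-driven perceptron stands in for the svm to avoid external libraries, and the training posts and labels are invented, not taken from any reviewed dataset.

```python
# sketch of a linear text classifier: bag-of-words features plus a
# perceptron as a stdlib stand-in for the svm used in most studies.
from collections import Counter

def featurise(post):
    """Bag-of-words token counts."""
    return Counter(post.lower().split())

def train_perceptron(data, epochs=20):
    """Labels: +1 = at-risk, -1 = not at-risk (toy convention)."""
    w = Counter()
    for _ in range(epochs):
        for post, label in data:
            score = sum(w[t] * c for t, c in featurise(post).items())
            pred = 1 if score >= 0 else -1
            if pred != label:                     # mistake-driven update
                for t, c in featurise(post).items():
                    w[t] += label * c
    return w

def predict(w, post):
    return 1 if sum(w[t] * c for t, c in featurise(post).items()) >= 0 else -1

data = [("i want to disappear forever", 1),
        ("lovely walk in the park today", -1),
        ("no reason to go on", 1),
        ("great dinner with friends", -1)]
w = train_perceptron(data)
print(predict(w, "lovely walk in the park today"))
```

a real study would replace the token counts with tf-idf or embedding features and the perceptron with a margin-based svm, but the train/predict structure is the same.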
in summary, ml methods for suicide risk detection and prevention can be adjusted to each region, supporting the current pandemic scenario towards enhanced public health and well-being. nevertheless, this scoping review has some limitations related to its primary objective. this paper only reviews studies that focus on suicide risk. the papers have been selected using a scoping review methodology in four research databases, and only papers written in english were considered. however, other research studies may be available in different languages and databases. moreover, the authors are aware that multiple algorithms based on statistical assessment are available. still, this review only surveys articles that include ml methods to detect suicide risk on social networks. as future work, several activities can be conducted, such as creating an annotated corpus for various languages and developing new ml models, especially in languages other than english. these activities aim to classify posts, estimate suicide risk, analyze potential predictive parameters, optimize predictive parameters, and analyze topics considering the temporal component of user posts and specific tools to analyze sentiment.
world health organization: who | suicide data
a systematic literature review of technologies for suicidal behavior prevention
instituto nacional de estadistica: españa en cifras
suicide and suicide risk
psychosocial and psychiatric risk factors for suicide: case-control psychological autopsy study
the suicidal process; prospective comparison between early and later stages
beyond risk theory: suicidal behavior in its social and epidemiological context
gene-environment interaction and suicidal behavior
suicide mortality and coronavirus disease 2019-a perfect storm?
awareness of mental health problems in patients with coronavirus disease 19 (covid-19): a lesson from an adult man attempting suicide
the mental health consequences of covid-19 and physical distancing: the need for prevention and early intervention
coronavirus disease 2019 (covid-19) and firearms in the united states: will an epidemic of suicide follow?
covid-19 and mental health: a transformational opportunity to apply an evidence-based approach to clinical practice and research
management of patients with multiple myeloma in the era of covid-19 pandemic: a consensus paper from the european myeloma network (emn)
modelling the covid-19 epidemic and implementation of population-wide interventions in italy
applying principles of behaviour change to reduce sars-cov-2 transmission
psychological impact of the 2015 mers outbreak on hospital workers and quarantined hemodialysis patients
use of new technologies in the prevention of suicide in europe: an exploratory study
development of an early-warning system for high-risk patients for suicide attempt using deep learning and electronic health records
the effect of social networks structure on innovation performance: a review and directions for research
social network analysis: characteristics of online social networks after a disaster
using neural networks with routine health records to identify suicide risk: feasibility study
perceptions of suicide stigma: how do social networks and treatment providers compare? crisis: the journal of crisis intervention and suicide prevention
smartphones, sensors, and machine learning to advance real-time prediction and interventions for suicide prevention: a review of current progress and next steps
exploring and learning suicidal ideation connotations on social media with deep learning
an unsupervised learning approach for automatically to categorize potential suicide messages in social media
data-driven advice for applying machine learning to bioinformatics problems
an application of machine learning to haematological diagnosis
performance analysis of statistical and supervised learning techniques in stock data mining
big data and machine learning algorithms for health-care delivery
machine learning applications in cancer prognosis and prediction
prevalence and diagnosis of neurological disorders using different deep learning techniques: a meta-analysis
diagnosis of human psychological disorders using supervised learning and nature-inspired computing techniques: a meta-analysis
quantifying the propagation of distress and mental disorders in social networks
scoping studies: towards a methodological framework
prisma extension for scoping reviews (prisma-scr): checklist and explanation
guidelines for developing and reporting machine learning predictive models in biomedical research: a multidisciplinary view
capturing the patient's perspective: a review of advances in natural language processing of health-related text
emotion detection in suicide notes
intend to analyze social media feeds to detect behavioral trends of individuals to proactively act against social threats
a content analysis of depression-related tweets
unveiling online suicide behavior: what can we learn about mental health from suicide survivors of reddit? stud
machine learning and semantic sentiment analysis based algorithms for suicide sentiment prediction in social networks
multi-class machine classification of suicide-related communication on twitter
suicide related text classification with prism algorithm
online suicide prevention through optimised text classification
extracting psychiatric stressors for suicide from social media using deep learning
using machine learning algorithms to detect suicide risk factors on twitter
automatic extraction of informal topics from online suicidal ideation
ontology-based approach to social data sentiment analysis: detection of adolescent depression signals
proactive suicide prevention online (pspo): machine identification and crisis management for chinese social media users with suicidal thoughts and behaviors
detecting suicidality on twitter
exploring the impact of evolutionary computing based feature selection in suicidal ideation detection
suicidal trend analysis of twitter using machine learning and neural network
dynamic emotion modelling and anomaly detection in conversation based on emotional transition tensor
using linguistic features to estimate suicide probability of chinese microblog users
report of board of scientific affairs' advisory group on the conduct of research on the internet
information extraction from medical social media
sentiment analysis of health care tweets: review of the methods used
artificial intelligence for participatory health: applications, impact, and future implications: contribution of the imia participatory health and social media working group
ethical considerations for participatory health through social media: healthcare workforce and policy maker perspectives: contribution of the imia participatory health and social media working group
social media and suicide: a review of technology-based epidemiology and risk assessment
publisher's note springer nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
acknowledgements: this research has been partially supported by the european commission and the ministry of industry, energy and tourism under project aal-20125036, named wetake care: ict-based solution for (self-)management of daily living. thanks to the research grants from senacyt, panama. ed receives funding and is supported by the v plan propio de investigación de la universidad de sevilla, spain.
conflict of interest: the authors declare that they have no conflict of interest.
ethical approval: this article does not contain any studies with human participants or animals performed by any of the authors.
key: cord-309010-tmfm5u5h
authors: dietert, kristina; gutbier, birgitt; wienhold, sandra m.; reppe, katrin; jiang, xiaohui; yao, ling; chaput, catherine; naujoks, jan; brack, markus; kupke, alexandra; peteranderl, christin; becker, stephan; von lachner, carolin; baal, nelli; slevogt, hortense; hocke, andreas c.; witzenrath, martin; opitz, bastian; herold, susanne; hackstein, holger; sander, leif e.; suttorp, norbert; gruber, achim d.
title: spectrum of pathogen- and model-specific histopathologies in mouse models of acute pneumonia
date: 2017-11-20
journal: plos one
doi: 10.1371/journal.pone.0188251
sha: doc_id: 309010
cord_uid: tmfm5u5h
pneumonia may be caused by a wide range of pathogens and is considered the most common infectious cause of death in humans. murine acute lung infection models mirror human pathologies in many aspects and contribute to our understanding of the disease and the development of novel treatment strategies. despite progress in other fields of tissue imaging, histopathology remains the most conclusive and practical read-out tool for the descriptive and semiquantitative evaluation of mouse pneumonia and therapeutic interventions. here, we systematically describe and compare the distinctive histopathological features of established models of acute pneumonia in mice induced by streptococcus (s.)
pneumoniae, staphylococcus aureus, klebsiella pneumoniae, acinetobacter baumannii, legionella pneumophila, escherichia coli, middle east respiratory syndrome (mers) coronavirus, influenza a virus (iav) and superinfection of iav-induced pneumonia with s. pneumoniae. systematic comparisons of the models revealed striking differences in the distribution of lesions, the characteristics of pneumonia induced, principal inflammatory cell types, lesions in adjacent tissues, and the detectability of the pathogens in histological sections. we therefore identified core criteria for each model suitable for practical semiquantitative scoring systems that take into account the pathogen- and model-specific patterns of pneumonia. other critical factors that affect experimental pathologies are discussed, including infectious dose, time kinetics, and the genetic background of the mouse strain. the substantial differences between the model-specific pathologies underscore the necessity of pathogen- and model-adapted criteria for the comparative quantification of experimental outcomes. these criteria also allow for the standardized validation and comparison of treatment strategies in preclinical models. as one of the most frequent infectious diseases, pneumonia causes a tremendous socioeconomic burden in industrialized countries [1] and is the leading infectious cause of death in children worldwide [2]. numerous classes of pathogens can cause acute pneumonia [3] and the risk of pneumonia is greatly enhanced under conditions of impaired pulmonary host defense, including preceding viral infections [4], mechanical ventilation [5] and sepsis [6]. the leading causative pathogen of community-acquired pneumonia (cap) is the gram-positive bacterium streptococcus (s.) pneumoniae [7, 8] which accounts for the majority of bacterial upper and lower respiratory tract infections and is responsible for millions of deaths annually [9, 10].
as another cause of cap, influenza a virus (iav) infection leads to rapid progression of lung failure with limited treatment options and frequent fatal outcome [3, 11, 12]. moreover, iav infections are commonly complicated by bacterial superinfection, mostly caused by s. pneumoniae, resulting in severe progressive pneumonia associated with increased mortality [13]. in contrast, the gram-negative and facultatively intracellular bacterium legionella (l.) pneumophila is the causative agent of the severe cap known as legionnaires' disease, and the second most commonly detected pathogen in pneumonia in patients admitted to intensive care units (icu) in industrialized countries [14, 15]. however, in addition to cap, ventilator-associated pneumonia (vap) is also a major cause of hospital morbidity and mortality in icus [16] and the spectrum of pathogens is shifted in these forms of pneumonia. here, staphylococcus (s.) aureus, klebsiella (k.) pneumoniae, acinetobacter (a.) baumannii, and escherichia (e.) coli have been isolated with varying prevalences [17] [18] [19]. more specifically, the gram-negative k. pneumoniae is a significant opportunistic pathogen causing severe life-threatening hospital-acquired respiratory tract infections [20] [21] [22], while s. aureus, a gram-positive bacterium, is one of the most prevalent pathogens of community- and hospital-acquired lower respiratory tract infections in humans and accounts for a significant health and economic burden [23] [24] [25]. a. baumannii and e. coli are ubiquitous gram-negative bacteria which have recently emerged as major causes of community-associated, nosocomial [26, 27] and ventilator-associated pneumonia [19, 28] as well as septicemia-induced acute lung injury (ali) [29, 30]. in addition, more recently discovered pulmonary pathogens indicate that novel emerging diseases may add to the list of highly relevant pneumonias of interest for study in animal models.
for example, the middle east respiratory syndrome coronavirus (mers-cov), which is transmitted by dromedary camels as vectors [31], has emerged as the cause of severe human respiratory disease worldwide [32, 33], with elderly and immunocompromised individuals, particularly in saudi arabia, being at highest risk [34] [35] [36]. the various forms of pneumonia have been successfully reproduced in specific murine models of experimentally induced acute pneumonia [37] [38] [39]. these models have substantially contributed to our understanding of the pathogenesis of community- and hospital-acquired pneumonia as well as emerging lung infections worldwide and are indispensable for the development of novel therapeutic strategies [40] [41] [42]. histopathology has been a powerful, reliable, and reproducible read-out tool for the evaluation of morphological changes in animal lung infection experiments for many decades [43, 44]. qualitative diagnoses are based on a summation of microscopically observable changes in the morphology and cellular composition of the tissue and the cell types involved. for a more comparative inclusion of histopathologic information in biomedical research, scoring systems have been widely applied which allow for a first semiquantitative assessment of lesions compared to controls [44, 45]. moreover, all preclinical models used for the development of novel treatment strategies and acceptance by regulatory agencies need to be assessed histopathologically by board-certified pathologists as the gold standard for qualitative and semiquantitative evaluation of tissue alterations in experimental animals [46] [47] [48]. previous studies have revealed fundamental differences in histopathologic lesions caused by different pathogens in mouse lungs [38, 41, 42]. however, scoring schemes for acute murine pneumonia existing to date are very superficial, addressing only a few, rather unspecific parameters [45, 49, 50].
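purely as an illustration of how a pathogen-adapted semiquantitative scoring scheme of the kind discussed here might be tallied computationally, consider the sketch below; the criteria, weights and grades are hypothetical and are not the validated criteria of this or any cited study.

```python
# hypothetical scoring sheet: each criterion is graded 0 (absent) to
# 3 (severe) by the pathologist; the composite is a weighted sum so
# that model-defining lesions (e.g. pleuritis in a pneumococcal model)
# can be emphasised. all names and numbers are illustrative.
CRITERIA_WEIGHTS = {
    "bronchopneumonia_extent": 2.0,
    "neutrophil_infiltration": 1.0,
    "necrosis": 1.5,
    "perivascular_edema": 1.0,
    "pleuritis": 2.0,
}

def composite_score(grades):
    """Weighted sum of 0-3 grades over the model-specific criteria."""
    for criterion, grade in grades.items():
        assert criterion in CRITERIA_WEIGHTS and 0 <= grade <= 3, \
            f"bad entry: {criterion}={grade}"
    return sum(CRITERIA_WEIGHTS[c] * g for c, g in grades.items())

mouse_a = {"bronchopneumonia_extent": 3, "neutrophil_infiltration": 3,
           "necrosis": 2, "perivascular_edema": 2, "pleuritis": 3}
print(composite_score(mouse_a))  # → 20.0
```

the point of such a tally is simply that group comparisons (e.g. wild-type versus knockout, treated versus untreated) become reproducible numbers rather than free-text impressions, while the criteria themselves remain pathogen- and model-specific.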
more importantly, they hardly allow for a differentiating perspective between distinct pathogens or for group comparisons, e.g., infections of wild-type versus genetically modified mice. clearly, there is a strong need for more precise and pathogen- as well as model-specific parameters to allow for an accurate description and semiquantification of the inflammatory phenotype for reliable and reproducible comparisons between experimental groups within each model. therefore, we have recently adapted more specific scoring criteria for s. pneumoniae and s. aureus-induced pneumonia [38, 42]. however, such pathogen-specific scoring criteria have not been employed for other lung pathogens in mice. here, we systematically describe and compare the histopathologies at their peaks of inflammation and injury of nine previously established acute lung infection models induced by s. pneumoniae, s. aureus, k. pneumoniae, a. baumannii, l. pneumophila, e. coli, mers-cov, iav and superinfection with iav and pneumococci. we provide model-specific criteria that can be used for appropriate histological quantitative comparisons, e.g., when different therapeutic interventions are evaluated within these established models. whole mouse lung sections were used to obtain complete overviews, particularly of the distributions of lesions and inflammatory patterns. on the basis of the different and oftentimes quite pathogen- and model-specific changes, we identified the most suitable evaluation criteria for each model that will allow for more accurate semiquantitative assessments of the severities and distributions of pneumonic lesions. the lung tissues examined here were derived from experiments primarily conducted for purposes other than this study and most have been published elsewhere [38, 41, 42, 51-53], except for the a. baumannii and e. coli experiments, which will be published elsewhere.
all animal procedures and protocols were approved by the institutional ethics committees of charité-university of berlin, justus-liebig university of giessen, philipps university of marburg, university hospital of jena and local governmental authorities (landesamt für gesundheit und soziales (lageso) berlin, regierungspräsidium (rp) gießen and darmstadt, landesamt für verbraucherschutz (tlv) thüringen), respectively. permit numbers were g 0356/10, a-0050/15 (s. pneumoniae), g 0358/11 (s. aureus), g 75/2011, g 110/2014 (k. pneumoniae), a 0299/15 (a. baumannii), g 0175/12 (l. pneumophila), 02-049/12 (e. coli), 114/2012 (mers-cov), g 0152/12, and g 0044/11 (iav and superinfection). all animal studies were conducted in strict accordance with the federation of european laboratory animal science associations (felasa) guidelines and recommendations for the care and use of laboratory animals, and all efforts were made to minimize animal discomfort and suffering. all mice, except for mers-cov infected mice, were monitored at 12-hour intervals throughout the experiment to assess appearance, behavior, grooming, respiration, body weight, and rectal temperature. humane endpoints were defined (body temperature <30˚c, body weight loss ≥20%, cumbersome breathing, accelerated breathing in combination with staggering, pain or paleness) but not reached by any of the mice at the indicated time points of termination of the experiments. mers-cov infected mice were monitored once daily, and appearance, behavior, grooming, respiration and body weight were recorded. here, a single humane endpoint (loss of body weight of >15%) was defined but not reached by any of the mice employed due to their favourable clinical outcome at the infection dose used. for all experimental infection models, except for k. pneumoniae, female mice (aged 8-12 weeks and weighing 17-22 g) were randomly assigned to groups (n = 2-4) per cage, whereas in the k.
pneumoniae model female and male mice (aged 23-25 weeks and weighing 22-24 g) were used for model-specific reasons [41]. furthermore, for all experimental infection models, specific-pathogen-free (spf) mice on c57bl/6 (all except for mers-cov) or balb/c (as previously used for the mers-cov model [52, 54]) background were used and housed in individually ventilated cages under spf conditions with a room temperature of 22 ± 2˚c and a relative humidity of 55 ± 10%. a 12-hour light/12-hour dark cycle was maintained and the animals had unlimited access to standard pellet food and tap water. all experimental details of the infection models compared here were applied following previously published and well-established protocols that partly vary in terms of infection doses, routes of infection and time point of examination due to pathogen- or model-specific reasons, as given below. for bacterial infections, except for e. coli, mice were anesthetized intraperitoneally with ketamine (80 mg/kg) (ketavet; pfizer, berlin, germany) and xylazine (25 mg/kg) (rompun; bayer, leverkusen, germany). for experimental viral infections, mice were anesthetized using inhalation of isoflurane (forene; abbott, wiesbaden, germany). for lung histology, all mice except for the mers-cov model were humanely euthanized by exsanguination via the caudal vena cava after anesthesia by intraperitoneal injection of premixed ketamine (160 mg/kg) and xylazine (75 mg/kg). mers-cov infected mice were humanely euthanized by cervical dislocation after isoflurane anesthesia. s. pneumoniae (serotype 3 strain, nctc 7978), s. aureus newman (nctc 10833), k. pneumoniae (serotype 2, atcc 43816), a. baumannii (ruh 2037), and l. pneumophila (serogroup 1 strain, jr 32) were cultured as described [37, 38, 40, 55, 56] and resuspended in sterile pbs. mice were anesthetized intraperitoneally (i.p.) with ketamine (80 mg/kg) and xylazine (25 mg/kg) and transnasally inoculated with 5 x 10^6 cfu s.
pneumoniae (n = 14 mice), 5 x 10^7 cfu s. aureus (n = 4), or 5 x 10^8 cfu a. baumannii (n = 8) in 20 μl pbs. mice transnasally infected with l. pneumophila (n = 8) received 1 × 10^6 cfu in 40 μl pbs. mice infected with k. pneumoniae (n = 16) received 3.5 x 10^5 cfu intratracheally in 50 μl nacl (0.9%) via a microsprayer aerosolizer (model ia-1b, penn-century, inc., wyndmoor, pa) using intubation-mediated intratracheal instillation through intact airways [57], which has previously been optimized for this model [41, 57-59]. e. coli (atcc 25922) from a -80˚c glycerol stock was added to lb broth (carl roth, karlsruhe, germany) and incubated for 12 hours at 200 rpm and 37˚c with 5% co2. an optical density of 0.03 was adjusted in lb broth, followed by incubation until mid-logarithmic phase for 1.5 hours at 200 rpm and 37˚c. after centrifugation, the pellet was resuspended in sterile 0.9% nacl at 8 x 10^5 cfu e. coli / 200 μl and administered intraperitoneally (n = 10). for initial transduction of human dpp4 for subsequent infection of balb/c mice with mers-cov (hcov emc), viruses were cultured and prepared as described [52, 54, 60]. mice were transduced transnasally with 20 μl of an adenovirus vector encoding human dpp4 and mcherry with a final titer of 2.5 x 10^8 pfu per inoculum (adv-hdpp4; viraquest inc.), resulting in hdpp4 expression in the epithelial compartment of the lung [60], and transnasally infected with a final titer of 7 x 10^4 tcid50 of mers-cov in 20 μl dmem (n = 17) under isoflurane anesthesia (forene; abbott, wiesbaden, germany). influenza a/pr/8/34 virus (h1n1; pr8) was grown as described [42] and mice were transnasally infected with 100 pfu pr8 in 50 μl pbs (n = 4) under isoflurane anesthesia. for superinfection experiments, the iav infection procedure was applied as described above with 40 pfu pr8 in 50 μl pbs. 8 days after viral infection, s. pneumoniae was cultured as described [37] and resuspended in sterile pbs.
mice were anesthetized intraperitoneally and transnasally inoculated with 1 x 10^3 cfu s. pneumoniae in 20 μl pbs (n = 4). mice were humanely euthanized at model-specific time points as indicated (table 1). between 2 and 6 repetitions of the entire experimental procedures were performed in each model with similar group sizes in each repetition. lungs were carefully removed after ligation of the trachea to prevent alveolar collapse, immersion-fixed in formalin ph 7.0 for 24 to 48 hours (mers-cov for 7 days), embedded in paraffin, cut in 2 μm sections and stained with hematoxylin and eosin (he) after dewaxing in xylene and rehydration in decreasing ethanol concentrations. bacteria were visualized using the giemsa and gram (modified by brown and brenn) stains. for the display of whole lung overviews, he-stained slides of entire lung sections were automatically digitized using the aperio cs2 slide scanner (leica biosystems imaging inc., ca, usa) and image files were generated using the imagescope software (leica biosystems imaging inc.). three evenly distributed whole-organ horizontal sections throughout the entire lungs were microscopically evaluated to assess the distribution and character of pathologic alterations, generating a modified panel of specific lung inflammation parameters adapted to each pathogen used (table 1 and table 2). all examinations were performed by trained veterinary experimental pathologists. for immunohistochemical detection of s. pneumoniae and iav (h1n1), antigen retrieval was performed with microwave heating (600 w) in 10 mm citric acid (750 ml, ph 6.0) for 12 minutes (min). lung sections were then incubated with a purified rabbit antibody polyclonal to s. pneumoniae (1:2,000, kindly provided by s. hammerschmidt) or with a purified goat antibody polyclonal to iav h1n1 (1:4,000, obt155, bio-rad, puchheim, germany) at 4˚c overnight.
incubation with an immuno-purified rabbit or goat antibody at the same dilution served as negative controls. subsequently, slides were incubated with a secondary, alkaline phosphatase-conjugated goat anti-rabbit (1:500, ap-1000, vector, burlingame, ca) antibody for 30 min at room temperature. the alkaline chromogen triamino-tritolyl-methane chloride (neufuchsin) was used as phosphatase substrate for color development. all slides were counterstained with hematoxylin, dehydrated through graded ethanols, cleared in xylene and coverslipped. transnasal infection of mice with s. pneumoniae serotype 3 resulted in a broad spectrum of tissue lesions and immune cell infiltrations that are typical of aerogenic bacterial pneumonia. specific for this model, lesions widely expanded down to the periphery of the lung lobes (fig 1a), with inflammation closely surrounding the airways and blood vessels. pneumococcal spread led to an early immune response which was mainly characterized by predominantly intrabronchial (fig 1b) and intraalveolar (fig 1c) infiltrations of neutrophils, provoking a lobular, suppurative bronchopneumonia with consolidation of affected lung areas. large areas of coagulation and liquefaction necrosis (fig 1d, arrowhead), as indicated by cellular fragmentation, decay, and loss of cellular details, accumulation of cellular and karyorrhectic debris as well as karyorrhexis, karyopyknosis, and karyolysis with consecutive hemorrhage, were also present. the perivascular interstitium was widely expanded by edema due to vascular leakage [53], with massive extravasation of neutrophils recruited into perivascular spaces (fig 1e). furthermore, suppurative and necrotizing vasculitis accompanied by hyaline thrombi within small-sized blood vessels was occasionally present, indicating early histological evidence of incipient septicemia.
increased pulmonary vascular permeability [53] also led to expanded areas of protein-rich alveolar edema, which presented as homogeneous, lightly pink material within the alveolar spaces in the he stain (fig 1f, asterisk). a distinctive histopathological feature of pneumococcal pneumonia was the occurrence of massive suppurative to necrotizing pleuritis (fig 1g, arrowhead) and steatitis (fig 1h) with widespread dispersion of bacteria into the thoracic cavity, likely accounting for the painful and morbid clinical behavior with rapid progression in affected mice. myriads of pneumococci were clearly visible as bluish to purple dots of approximately 1 μm in size in the standard he stain, mostly located on the pleural surface, in the mediastinal adipose tissue or within perivascular spaces in the lungs. [table residue: semiquantitative scores (+/++) per infection model for abscess formation, granuloma formation, alveolar edema, perivascular edema, perivascular lymphocytic cuffing, vasculitis, fibrinoid degeneration of vascular walls, vascular thrombosis, and pleuritis; the column-to-model assignment is not recoverable from the extracted text.] in contrast, transnasal infection with s. aureus resulted in multifocally extensive but non-expansive bronchopneumonia predominantly located near the lung hilus (fig 2a), affecting the bronchi, alveoli and interstitium. the main inflammatory cell population consisted of neutrophils, leading to mainly suppurative (fig 2b and 2c) lesions with a tendency towards abscess formation. in contrast to pneumococci, macrophages were also present, albeit at lower numbers than neutrophils (fig 2c). similar to klebsiella and streptococci, large areas of necrosis and hemorrhage (fig 2d) were present. the perivascular areas were predominantly infiltrated by lymphocytes and fewer neutrophils (fig 2e). compared to the s.
pneumoniae model mentioned above [53], vascular permeability seemed only slightly increased, as reported before [38], and perivascular edema (fig 2e) as well as protein-rich alveolar edema (fig 2f, asterisk) were also present, albeit to a lesser extent. neither pleuritis nor steatitis was observed, consistent with a rather favorable clinical outcome under the conditions used. furthermore, staphylococci were largely undetectable by he stain, possibly due to the low bacterial spread within the lungs. intratracheal infection of mice with k. pneumoniae resulted in severe, widely expansive bronchopneumonia with increased lesion severity in the lung periphery (fig 3a). recruited immune cells predominantly consisted of neutrophils, leading to suppurative (fig 3b) to abscessing (fig 3c) bronchopneumonia with hemorrhage and necrosis, as well as neutrophilic interstitial pneumonia (fig 3d) in less affected areas. increased vascular permeability, as reported [40], was associated with massive alveolar (fig 3d, asterisk) and perivascular edema (fig 3e), admixed with myriads of bacteria easily recognizable as purple dots in the he stain. suppurative to necrotizing vasculitis, pleuritis (fig 3f, arrowhead), and steatitis were also present and associated with marked bacterial spread and the rapidly lethal clinical outcome. after transnasal infection with a. baumannii, mice developed a widely expansive (fig 4a) bronchopneumonia with predominantly infiltrating neutrophils causing a suppurative (fig 4b) to abscessing inflammation with areas of hemorrhage within alveoli and interstitium and large areas of parenchymal necrosis, as well as alveolar edema. perivascular spaces showed mild to moderate edema and infiltration of lymphocytes and neutrophils (fig 4c). vascular thrombosis was a common change (fig 4d, arrowhead) in small-sized blood vessels. similar to staphylococci, acinetobacter was invisible by he stain, and neither pleuritis nor steatitis was present.
transnasal infection of mice with l. pneumophila resulted in slightly different lesion patterns depending on the time point of examination after infection. at 48 hours after infection, non-expansive interstitial pneumonia was found in close proximity to the hilus (fig 5a) with prominent alveolar wall necrosis (fig 5b). at the 6-day time point, specifically the numbers of infiltrating macrophages were clearly increased, leading to accentuated perivascular granuloma formation (fig 5c, arrowhead). here, marked lymphocytic cuffing of most blood vessels as well as highly activated endothelial cells (fig 5d, arrowhead) were observed. at both time points, neither pleuritis nor steatitis was present, and bacteria were invisible in the he-stained sections. after intraperitoneal infection, the hematogenous spread of e. coli to the lungs resulted in diffuse, interstitial suppurative pneumonia affecting the entire lungs, modelling sepsis-induced ali (fig 6a). the interalveolar interstitium was heavily infiltrated with neutrophils (fig 6b), with the most prominent aggregation around blood vessels (fig 6c), consistent with bacterial entry via the circulation. numerous hyaline thrombi were present within small-sized blood vessels (fig 6d, arrowhead), suggestive of disseminated intravascular coagulopathy (dic) due to bacterial septicemia. large, rod-shaped bacteria were easily detectable only outside the lungs, mostly present in the adipose tissue surrounding the esophagus, possibly due to local spread of e. coli via the abdominal cavity. transnasal infection with mers-cov following adenoviral transduction of human dpp4 yielded an expansive (fig 7a) interstitial pneumonia with severe alveolar epithelial cell necrosis and infiltration of mainly macrophages, lymphocytes, and fewer neutrophils (fig 7b).
only moderate peribronchial (fig 7c) and perivascular (fig 7d) lymphocytic infiltrations were present, while most venous blood vessels showed marked fibrinoid degeneration and necrosis of vascular walls (fig 7d, asterisk). additional hallmarks of mers-cov infection were large areas of protein-rich alveolar edema (fig 7e, arrowhead), pronounced hemorrhage within perivascular and alveolar spaces and the interstitium (fig 7f, arrowhead), and the formation of hyaline thrombi within small-sized blood vessels. after transnasal infection with iav, mouse lungs displayed a diffusely distributed bronchointerstitial pneumonia restricted to single lung lobes only (fig 8a). alveolar necrosis was prominent, and alveolar septae were diffusely distended by infiltrating inflammatory cells (fig 8b). bronchial epithelial cells were markedly necrotic (fig 8c, arrowhead) and extensively sloughed off into the bronchial lumen. alveoli and interstitium were filled with macrophages and lymphocytes as the major effector cells (fig 8d), and prominent perivascular lymphocytic cuffing (fig 8e) was a characteristic change. furthermore, large areas of alveolar edema (fig 8f, asterisk) and, albeit to a much lesser extent, areas of hemorrhage within alveoli and interstitium were present, suggesting vascular damage and increased permeability. when mice had been infected with iav prior to infection with s. pneumoniae, a combined and potentiated phenotype of both models was observed 24 hours later. lesions were widely expansive to the lung periphery but restricted to the single lung lobes pre-damaged by iav (fig 9a). the character of the pneumonia included massive infiltration of neutrophils into alveoli (fig 9b) and bronchi, typical features of severe, suppurative bronchopneumonia. bronchial epithelium was almost entirely necrotic, and bronchi were filled up with pus (fig 9c).
perivascular spaces were edematous and infiltrated by neutrophils and lymphocytes (fig 9d), whereas only mild lymphocytic perivascular cuffing (fig 9e) was present. a severe protein-rich alveolar edema was seen, similar to that seen in the s. pneumoniae mono-infection (fig 9f). pneumococci were difficult to visualize by he stain, possibly due to the lower infectious dose used here when compared to the mono-infection. prior to processing for histopathology, small tissue samples from experimentally infected mouse lungs are commonly removed for molecular analyses of gene and/or protein expression or other readout systems to obtain additional information. to obtain representative data from such samples that can be correlated with the histological changes, it is crucial to know about the homogeneity of the distribution of lesions. also, some experimental protocols recommend using the left and right halves of the lungs, respectively, for different analytical procedures, again anticipating lesion homogeneity and symmetry. however, when we analyzed the distributions and bilateral symmetry of lung lesions for each of the infection models, we found a wide spectrum of distinct distributions and asymmetries (fig 10). in principle, lesion distributions followed the route of pathogen entry into the lungs. however, the tendency to spread towards the periphery of the lobes after aerogenous infection varied between pathogens despite similar aerogenous infection routes. the mostly centrally focused lesions induced by s. aureus and l. pneumophila remained close to the hilus with no trend towards peripheral expansion. infection with s. pneumoniae, a. baumannii and mers-cov resulted in lesions closely surrounding major airway segments with centrifugal expansion towards the periphery. in contrast, lesions induced by k.
pneumoniae were mostly located in the periphery of the lobes and airways and were much weaker adjacent to the hilus, despite aerogenous infection. hematogenously induced sepsis with e. coli was associated with an entirely diffuse distribution of lesions affecting the whole lung, with myriads of inflammatory hot spots commonly surrounding blood vessels. iav-induced lesions were restricted to individual lung lobes only, with a rather homogeneous distribution within affected lobes. superinfection of s. pneumoniae into an iav pneumonia resulted in a pattern virtually identical to that seen after iav infection alone. except for blood-borne e. coli pneumonia, which was consistently and evenly distributed throughout the entire lungs, affected areas in all other models tested here were randomly distributed more or less asymmetrically between the right and left halves of the lungs and also between adjacent lobes (fig 10). for more than 100 years, a wide range of special stains have been used for the histological visualization of pathogens and other relevant structures, based on their more or less specific affinities to certain dyes. here, gram stain modified by brown and brenn was used for the visualization of gram-positive bacteria, including s. pneumoniae (fig 11a, arrowhead), which appeared as easily recognizable, dark blue cocci. in contrast, giemsa stain was used predominantly for the detection of gram-negative bacteria such as k. pneumoniae (fig 11b, arrowhead), which appeared as light blue to greenish rods. for more specific pathogen detection on slides, particularly for viruses, immunohistochemistry is the method of choice. here, s. pneumoniae and iav were detected by immunohistochemistry using specific anti-s. pneumoniae or anti-iav antibodies, respectively. s.
pneumoniae-positive signals were obtained as myriads of red cocci predominantly in the perivascular interstitium (fig 11c), within neutrophils in alveoli and interstitium, and on pleural surfaces as well as in mediastinal adipose tissue. in addition, pneumococci were also visualized in the marginal sinuses of tracheal lymph nodes, both in macrophages and extracellularly. iav antigen was localized to the apical surface and cytosol of intact and necrotic bronchial epithelial cells (fig 11d) and within alveolar macrophages. [fig 10 caption, comparative histopathology of mouse models of acute pneumonia: k. pneumoniae lesions were mostly located in the periphery of the lobes and airways and much weaker adjacent to the hilus; hematogenous infection with e. coli was associated with entirely diffuse distribution of lesions affecting the whole lungs with myriads of inflammatory hot spots, commonly surrounding blood vessels; iav-induced lesions were restricted to individual lung lobes with a rather homogeneous distribution within affected lobes, the affected lobes following a rather random and inconsistent pattern; superinfection of s. pneumoniae into an iav pneumonia resulted in a pattern virtually identical to that seen after iav infection alone; except for e. coli-induced pneumonia, virtually all lung lesions were distributed asymmetrically between the left and right lung halves with no tendency of either half to be more often or more strongly affected. https://doi.org/10.1371/journal.pone.0188251.g010] the diversity of lesions, and in particular the presence or absence of specific patterns in several of the models used (table 1), strongly suggested that a uniform scoring scheme for the semiquantification of mouse pneumonia is inconceivable. instead, scoring systems should take into account the more or less pathogen-specific lesion patterns that can be distilled from the comparative characterizations given above.
for this purpose, we carved out the most characteristic lesion patterns that appear suitable for the development of specific scoring schemes for each model (table 2). the different mouse models of acute pneumonia differ widely, with an obvious strong dependence on pathogen-specific features of virulence and spread, route of infection, infectious dose and other factors. here, we provide a detailed descriptive overview of the histopathological features and distributions of lesions within infected lungs and compare them between nine relevant and commonly used infection models at their peaks of injury and inflammation. the models employed here all represent well-established protocols that have been optimized and successfully used in previous studies, with model-specific variations in infection doses, routes of pathogen administration and analyzed time points [37-42, 51, 52, 54-56]. our model-specific description parameters (table 2) provide a rationale for the selection of histopathological quantification criteria, in order to best reflect the model-specific lesion and distribution characteristics that appear to be most relevant. clearly, the severity of lesions in terms of the outcome of quantification systems will depend on several other factors that will have to be addressed separately in each model, such as the infection dose, time point of examination or therapeutic interventions. the model-associated characteristics of tissue lesions and immune cell infiltrates are widely consistent with well-established properties of the different pathogens used. for example, the destructive tissue damage with mostly neutrophilic infiltrations, as seen with s. pneumoniae, s. aureus, and a. baumannii, is typically seen with extracellular bacteria that express cytotoxic virulence factors, such as pneumolysin and hydrogen peroxide [61] from s. pneumoniae, or immunogenic cell wall components such as lipoteichoic acid (lta) from s.
aureus [62, 63]. on the other hand, the intracellular pathogen l. pneumophila, which primarily infects macrophages [64], resulted in a histiocytic infiltrate at 48 h that developed into granulomatous inflammation at 6 days after infection, typical of a th1 response [65, 66]. however, several of the pathogens used were associated with additional distinctive features. for example, histology revealed marked pleuritis and steatitis due to pathogen invasion into adjacent extrapulmonary tissues after infections with s. pneumoniae and k. pneumoniae. this massive bacterial spreading throughout the thoracic cavity was exclusively present in these two models and most likely associated with sepsis, responsible for the rapid clinical progression and unfavourable outcome [41, 67]. however, only k. pneumoniae showed a tendency towards abscess formation, which was not seen in pneumococcal pneumonia. infection with s. aureus and a. baumannii also resulted in similar lesion patterns, except that acinetobacter-induced lesions expanded widely to the lung periphery while staphylococcus-induced pneumonia was restricted to the lung hilus. a second difference between the two was the presence of prominent vascular thrombosis in a. baumannii-induced pneumonia, which was absent from staphylococcus pneumonia. the clinical outcome of mice infected with a. baumannii and s. aureus was more favourable than that of mice infected with s. pneumoniae or k. pneumoniae [38], which may be explained by the lack of bacterial spreading throughout the thoracic cavity and adjacent tissues, and possibly sepsis. e. coli infection was included here as a model for sepsis-associated ali [29, 30, 68] and consequently induced widespread vascular thrombosis and vasculitis, most likely due to its blood-borne entry into the lungs and concurrent septicemia with disseminated intravascular coagulopathy and associated vascular lesions.
vascular thrombosis with or without vasculitis was also observed in other models, including s. pneumoniae, a. baumannii and mers-cov, however to a much lesser extent and only within the most strongly affected areas. mers-cov- and iav-associated lesions clearly reflected the known cellular tropisms of these viruses, with necrosis of alveolar walls or bronchial epithelial cells, respectively, being the most characteristic histopathologic features [69-72]. typical of virally induced lesions, the inflammatory cell infiltrates in mers-cov and iav infections were dominated by lymphocytes, with no or only few neutrophils. nevertheless, the two viral models could be clearly distinguished from each other by additional histological characteristics. only the mers-cov infection led to a marked vascular phenotype with necrosis and degeneration of blood vessels, vasculitis, and consecutive vascular thrombosis as well as pronounced hemorrhages [69, 73]. in contrast, iav-induced pneumonia did not display any of these features, but was dominated by marked perivascular lymphocytic cuffing and alveolar edema [42, 74]. subsequent superinfection with low-dose s. pneumoniae potentiated the severity of the iav-induced lesions and aggravated the course of pneumonia. however, it did not alter the principal histological characteristics of iav pneumonia. the patterns seen after single infection with s. pneumoniae were not repeated in this superinfection model, likely owing to the much lower inoculation dose, which is usually rapidly cleared from virus-naive lungs. when the distributions of lesions were compared among the 9 models tested, four distinct patterns could be clearly distinguished. the most common pattern, in which lesions were focused around central airways and blood vessels close to the lung hilus with the periphery less or not affected, can likely be explained by the aerogenous route of infection and pathogen entry. the opposite pattern, characteristic of k.
pneumoniae infection, where the periphery of the lobes was more strongly affected than the hilus areas despite a similar aerogenous route of infection, may be due to the aerosolic intratracheal application [75] of these bacteria, which is typical of and necessary in this model [41, 57-59]. these differences are therefore more likely attributable to the model-specific route of infection rather than to pathogen-specific properties. similarly, the very homogeneous distribution of e. coli-induced pneumonia likely followed the diffuse blood-borne entry of the pathogen into the lungs after intraperitoneal infection. again unique among the pathogens tested here, the iav-associated pattern affected entire but only select lung lobes with almost complete sparing of others. this distribution probably followed a random spread of the virus along major airways, but why some lobes remained virtually unaffected after transnasal infection remains hard to explain. apart from helping to understand differences in pathogen spread, the uneven and often quite asymmetrical distributions have a tremendous impact in practical terms when acute mouse pneumonia is sampled for molecular studies. when quantitative data on mrna or protein expression levels or other biochemical information are to be compared with one another or with tissue lesions, it is imperative that only identically affected areas are compared. since this is impossible to predict or recognize at the macroscopic level for most models, the practice of sampling different regions of such lungs for different readout systems appears highly problematic. another implication of the distinct lesion characteristics, immune cell reactions and distributions among the different models appears highly relevant for histological scoring systems that aim at quantitative comparisons [45].
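the impact of such asymmetries on split-lung sampling can be made concrete with a simple left-right asymmetry index; the sketch below is only an illustration (the function name, the index definition and the example lesion-area fractions are assumptions, not values from this study):

```python
def asymmetry_index(left_area: float, right_area: float) -> float:
    """left-right asymmetry of lesioned area: 0.0 = symmetric, 1.0 = one-sided."""
    total = left_area + right_area
    if total == 0:
        return 0.0  # no lesions at all: treat as symmetric
    return abs(left_area - right_area) / total

# hypothetical lesion-area fractions (left half, right half) per model
lesion_fractions = {"e_coli_sepsis": (0.48, 0.52), "iav": (0.70, 0.10)}
indices = {model: asymmetry_index(l, r) for model, (l, r) in lesion_fractions.items()}
```

under this toy index, a diffusely and evenly affected lung (such as the blood-borne e. coli model) scores near 0, while a model with lesions concentrated in one half scores near 1, which is one way to flag lungs for which halving the organ between readout systems would be misleading.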
to narrow down the list of parameters appropriate for each pathogen and to exclude features that are likely irrelevant for some of the models, we selected 23 single histopathologic criteria for the design of semiquantitative scoring systems suitable for each model. these criteria are partly composed of standard parameters, such as the determination of the affected lung area, the distribution of lesions or the type of pneumonia induced. however, numerous other, more model-specific parameters were identified which precisely describe particular aspects and allow for a differentiation between the models, such as the presence of perivascular edema, vascular thrombosis, pleuritis or steatitis. appropriate scoring systems may thus encompass more general parameters when different pathogens are compared to one another, or more pathogen-specific parameters when specific pathogen features are the focus of an experiment. for example, some of the parameters selected here have proven helpful in the discovery and semiquantification of different phenotypes of mouse pneumonia following genetic engineering of pathogens or mice [38, 51, 76]. however, scoring systems that claim universality for all mouse models of acute pneumonia seem neither generally applicable nor meaningful for all specific experimental goals. even the list of 23 parameters selected here may become inappropriate or insufficient when genetic changes on the pathogen or host side result in different types of lesions, immune cell responses, time courses or other relevant features. in those cases, the list selected here may have to be adjusted or extended to better meet the specific challenges of each new study. as standard hematoxylin and eosin (he) staining of tissue sections failed to visualize most pathogens, traditional special stains as well as immunohistochemical techniques were employed, depending on the specific staining properties of the pathogens and the availability of appropriate antibodies. while s.
pneumoniae, k. pneumoniae and e. coli were easily visible in he-stained tissue sections in areas with a low density of inflammatory cells, e.g., in perivascular spaces, they were very difficult to identify in heavily infiltrated and consolidated lung parenchyma. in contrast, s. aureus, a. baumannii, l. pneumophila and both viruses were entirely invisible by he staining and thus had to be visualized by appropriate histotechnical stains or immunohistochemistry. both approaches will likely also allow for a rough quantification of pathogen numbers in tissues when appropriate image analysis tools are used. in this first comparative study of its kind, we examined previously established models with their optimized routes of infection, time points, and infection doses and volumes, specific for each model, to reach the peaks of lung injury and inflammation. variations of such factors can be expected to result in different lesion severities, composition of the cellular infiltrates, and, for some models, in different expansions of lesions within the lung. still, the conditions used here are all based on observations that have evolved during extensive previous establishment studies of these models [37-39, 41, 42, 51, 52, 54, 56-58, 60, 77-79]. among the most important reasons is that most human pathogens are not pathogenic for mice under non-experimental conditions, so the decisive factor for obtaining a useful pneumonia model appears to be the determination of the appropriate infection dose and route of infection. in addition, the exact time points of tissue analysis after infection had to be determined with care for virtually all models to obtain a useful model, including a precise definition of the strain or variant of the pathogen used [51, 53, 80, 81]. another variable to consider is the mouse strain used.
except for balb/c mice, which were used in the mers-cov infection model here for model-specific reasons [60], all models were conducted with c57bl/6 mice, which is among the most commonly used mouse strains in infection research and therefore allows for comparisons with similar studies. however, variations of the strain or genetic background may have a dramatic impact on the type, severity and outcome of inflammation, particularly in innate immune responses [82-84]. again, the criteria suggested here for scoring procedures should make it possible to recognize and quantify such differences related to changes in infection dose and volume, time point of examination, strain and age of the mice used, pathogen variant and other variables. histopathology of the lungs may be complex and requires fundamental knowledge in species-specific anatomy, physiology, organ-specific immunology, pathology, and histotechnical procedures. furthermore, various background lesions in mice, including strain-specific spontaneous degenerative or inflammatory conditions and the possibility of accidental infections unrelated to the experiment, should not be confused with the experimental outcome. thus, despite our efforts to specify and simplify the criteria relevant for model-specific assessment and quantification of lesions, it appears crucial that trained histopathology experts be involved in the microscopical examination of mouse lungs [46, 85]. clearly, in addition to descriptive or semiquantitative histology, a number of other parameters may be useful for quantitative comparisons between experimental groups to determine the role of specific cell types, molecules, and therapeutic interventions, depending on the strategy and goal of the study [44].
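a model-specific semiquantitative scoring scheme of the kind advocated above can be represented as a simple per-model checklist of graded criteria; this sketch is purely illustrative (the criterion names, the 0-3 grading range and the function are assumptions, not the actual 23 criteria of table 2):

```python
# hypothetical per-model criterion lists; each criterion is graded 0 (absent) to 3 (severe)
SCORING_CRITERIA = {
    "s_pneumoniae": ["suppurative_bronchopneumonia", "alveolar_edema",
                     "perivascular_edema", "pleuritis", "steatitis"],
    "iav": ["bronchointerstitial_pneumonia", "bronchial_epithelial_necrosis",
            "perivascular_lymphocytic_cuffing", "alveolar_edema"],
}

def total_score(model, grades):
    """sum the 0-3 grades over the criteria defined for one model.

    criteria absent from the model's checklist are ignored; ungraded criteria count as 0,
    so scores are only comparable between animals scored under the same model's checklist.
    """
    if any(not 0 <= g <= 3 for g in grades.values()):
        raise ValueError("grades must be within 0-3")
    return sum(grades.get(criterion, 0) for criterion in SCORING_CRITERIA[model])
```

keeping the checklist per model, rather than one universal list, mirrors the argument above: a pleuritis criterion is informative for pneumococcal pneumonia but pure noise for iav, where pleuritis never occurred.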
such parameters could include flow cytometric immune cell identifications and quantifications, elisa or quantitative rt-pcr for the probing of cytokines, chemokines or matrix proteins involved in lung pathology and remodeling, and plaque/colony forming assays for the identification or quantification of pathogens, as previously published for most of the models used here [38, 41, 42, 51-53, 86, 87]. all of these methods, however, lack the spatial resolution that only histological assessments offer. only the combination of these techniques will lead to a better understanding of the disease in the complex context of the entire lung pathology. in conclusion, we have identified a spectrum of pathogen- and model-specific lesion characteristics in mouse models of acute pneumonia. our findings underscore the necessity of model-specific criteria for the accurate histopathological characterization and quantitative assessment of experimental pneumonia. this comparative landscaping of acute mouse pneumonia histology provides a comprehensive framework for future studies on the role of individual pathogen or host factors, complex disease mechanisms, and novel therapeutic strategies that could help to treat pneumonia in human patients.
managing community-acquired pneumonia: a european perspective. respiratory medicine
fact sheet n˚331
respiratory viral infection predisposing for bacterial disease: a concise review. fems immunology and medical microbiology
evidence on measures for the prevention of ventilator-associated pneumonia
immunostimulation is a rational therapeutic strategy in sepsis. novartis foundation symposium
recent advances in our understanding of streptococcus pneumoniae infection
the burden of pneumococcal pneumonia: experience of the german competence network capnetz
pneumococcal conjugate vaccine for childhood immunization: who position paper. releve epidemiologique hebdomadaire
mortality from invasive pneumococcal pneumonia in the era of antibiotic resistance, 1995-1997
hospitalized patients with 2009 h1n1 influenza in the united states
neuraminidase inhibitors for preventing and treating influenza in healthy adults and children. the cochrane database of systematic reviews
epidemiology, microbiology, and treatment considerations for bacterial pneumonia complicating influenza
legionnaires' disease: update on epidemiology and management options
legionella as a cause of severe pneumonia. clinical infectious diseases: an official publication of the infectious diseases society of america
american journal of respiratory and critical care medicine
ventilator-associated pneumonia in a tertiary care hospital in india: incidence and risk factors
incidence, bacteriology, and clinical outcome of ventilator-associated pneumonia at tertiary care hospital
development of immunization trials against klebsiella pneumoniae
the epidemiology of nosocomial infections caused by klebsiella pneumoniae
current epidemiology and growing resistance of gram-negative pathogens. the korean journal of internal medicine
association between staphylococcus aureus strains carrying gene for panton-valentine leukocidin and highly lethal necrotising pneumonia in young immunocompetent patients. archives de pediatrie: organe officiel de la societe francaise de pediatrie
european respiratory society: standards for quantitative assessment of lung structure. american journal of respiratory and critical care medicine
an official american thoracic society workshop report: features and measurements of experimental acute lung injury in animals
multiparametric and semiquantitative scoring systems for the evaluation of mouse model histopathology: a systematic review
the european college of veterinary pathologists (ecvp): the professional body for european veterinary pathologists
a minimum core outcome dataset for the reporting of preclinical chemotherapeutic drug studies: lessons learned from multiple discordant methodologies in the setting of colorectal cancer. critical reviews in oncology/hematology
assessment of reproductive toxicity under reach. regulatory toxicology and pharmacology: rtp
the virulence variability of different acinetobacter baumannii strains in experimental pneumonia. the journal of infection
the role of tlr2 in the host response to pneumococcal pneumonia in absence of the spleen
ifns modify the proteome of legionella-containing vacuoles and restrict infection via irg1-derived itaconic acid
a highly immunogenic and protective middle east respiratory syndrome coronavirus vaccine based on a recombinant measles virus vaccine platform
moxifloxacin is not anti-inflammatory in experimental pneumococcal pneumonia. the journal of antimicrobial chemotherapy
protective efficacy of recombinant modified vaccinia virus ankara delivering middle east respiratory syndrome coronavirus spike glycoprotein
flagellin-deficient legionella mutants evade caspase-1- and naip5-mediated macrophage immunity
differential roles of cd14 and toll-like receptors 4 and 2 in murine acinetobacter pneumonia. american journal of respiratory and critical care medicine
correlation of klebsiella pneumoniae comparative genetic analyses with virulence profiles in a murine respiratory disease model
intubation-mediated intratracheal (imit) instillation: a noninvasive, lung-specific delivery system
distinct contributions of neutrophils and ccr2+ monocytes to pulmonary clearance of different klebsiella pneumoniae strains
rapid generation of a mouse model for middle east respiratory syndrome
streptococcus pneumoniae: virulence factors, pathogenesis, and vaccines. microbiological reviews
lipoteichoic acid synthesis and function in gram-positive bacteria. annual review of microbiology
role of lipoteichoic acid in infection and inflammation. the lancet infectious diseases
interaction between the legionnaires' disease bacterium (legionella pneumophila) and human alveolar macrophages. influence of antibody, lymphokines, and hydrocortisone
legionella pneumophila pathogenesis and immunity. seminars in pediatric infectious diseases
early recruitment of neutrophils determines subsequent t1/t2 host responses in a murine model of legionella pneumophila pneumonia
immunostimulation with macrophage-activating lipopeptide-2 increased survival in murine pneumonia. american journal of physiology lung cellular and molecular physiology
emerging human middle east respiratory syndrome coronavirus causes widespread infection and alveolar damage in human lungs. american journal of respiratory and critical care medicine
differential expression of the middle east respiratory syndrome coronavirus receptor in the upper respiratory tracts of humans and dromedary camels
avian flu: influenza virus receptors in the human airway
influenza virus-induced lung injury: pathogenesis and implications for treatment
dipeptidyl-peptidase iv from bench to bedside: an update on structural properties, functions, and clinical aspects of the enzyme dpp iv. critical reviews in clinical laboratory sciences
attenuation of immune-mediated influenza pneumonia by targeting the inducible co-stimulator (icos) molecule on t cells
rnai-mediated suppression of constitutive pulmonary gene expression by small interfering rna in mice
moraxella catarrhalis induces an immune response in the murine lung that is independent of human ceacam5 expression and long-term smoke exposure. american journal of physiology lung cellular and molecular physiology
multiple myd88-dependent responses contribute to pulmonary clearance of legionella pneumophila
chadox1 and mva based vaccine candidates against mers-cov elicit neutralising antibodies and cellular immune responses in mice
recovery from the middle east respiratory syndrome is associated with antibody and t-cell responses
surface proteins and exotoxins are required for the pathogenesis of staphylococcus aureus pneumonia
cell-specific interleukin-15 and interleukin-15 receptor subunit expression and regulation in pneumococcal pneumonia: comparison to chlamydial lung infection
influenza h3n2 infection of the collaborative cross founder strains reveals highly divergent host responses and identifies a unique phenotype in cast/eij mice
strain differences in a murine model of air pollutant-induced nonatopic asthma and rhinitis. toxicologic pathology
murine strain differences in inflammatory angiogenesis of internal wound in diabetes
the ecvp/esvp summer school in veterinary pathology: high-standard, structured training for young veterinary pathologists
miniaturized bronchoscopy enables unilateral investigation, application, and sampling in mice
il-37 causes excessive inflammation and tissue damage in murine pneumococcal pneumonia
the authors thank charlene lamprecht and angela linke for excellent technical assistance and nancy a. erickson for helpful discussions.
key: cord-309301-ai84el0j authors: li, yaqi; tang, peiyuan; cai, sanjun; peng, junjie; hua, guoqiang title: organoid based personalized medicine: from bench to bedside date: 2020-11-02 journal: cell regen doi: 10.1186/s13619-020-00059-z sha: doc_id: 309301 cord_uid: ai84el0j three-dimensional cultured organoids have become a powerful in vitro research tool that preserves the genetic, phenotypic and behavioral traits of in vivo organs, and can be established from both pluripotent stem cells and adult stem cells. organoids derived from adult stem cells can be established directly from diseased epithelium and matched normal tissues, and organoids can also be genetically manipulated by crispr-cas9 technology. applications of organoids in basic research involve the modeling of human development and diseases, including genetic, infectious and malignant diseases. importantly, accumulating evidence suggests that biobanks of patient-derived organoids for many cancers and for cystic fibrosis have great value for drug development and personalized medicine. in addition, organoids hold promise for regenerative medicine. in the present review, we discuss the applications of organoids in basic and translational research. two-dimensional (2d) cultured cell lines have been the main in vitro research tool for the past decades. cell lines are relatively cheap, easy to handle and can be applied to multiple experimental techniques. however, the establishment of a cell line is time-consuming and involves extensive genetic and phenotypic adaptation to culture conditions. thus, most cell lines are derived from tumors or have acquired oncogenic potential in vitro, while matching normal cells are usually lacking. a further problem with cell lines is their homogeneity: they lack the differentiated cell types present in the original tissue. these problems limit the use of cell lines in personalized medicine and make them less suited to tissue physiology research requiring differentiated cell types.
in cancer research, a preclinical model that can phenocopy tumor heterogeneity is highly needed for research on the mechanisms of cancer progression and acquired drug resistance. in 1953, the first patient-derived xenograft (pdx) models were successfully established (toolan 1953). in this model, primary tumor tissue is transplanted into immune-deficient mice, while the tumor structure and the relative proportion of tumor cells and stromal cells are largely preserved (byrne et al. 2017). thus, pdxs better retain the complexity and heterogeneity of the parental tumor than do cell lines, but establishment is still inefficient and early-stage tumors are hard to engraft (john et al. 2011). besides, genetic manipulations cannot be carried out, and high-throughput analyses are expensive and hampered by complex logistics. the last decade has witnessed a booming development of three-dimensional (3d) cell culture technologies. the advent of organoids avoids many of the disadvantages associated with cell lines and pdxs. an organoid is characterized as a 3d structure, grown from stem and progenitor cells and consisting of variant organ-specific cell types, that self-organizes via cell differentiation and spatially restricted lineage commitment (clevers 2016). organoids can be grown from two types of cells: (i) pluripotent stem cells (pscs), such as embryonic stem cells (escs) and induced pluripotent stem cells (ipscs), or (ii) adult stem cells (ascs) (clevers 2016; rookmaaker et al. 2015). organoids have proved amenable to all standard laboratory techniques, as well as to genetic modification (drost et al. 2015; schwank et al. 2013). organoids can be rapidly expanded, cryopreserved and applied to high-throughput analyses. though organoid cultures cannot mimic interactions with vasculature and stroma, organoids are a promising research model bridging the gap between cell lines and pdxs (fig. 1) (drost 2018; sachs and clevers 2014).
since cell lines of escs and ipscs were established, researchers began to apply insights to induce these stem cells to generate differentiated cell types (chen et al. 2014; cherry 2012). yoshiki sasai and colleagues went further by asking whether such an in vitro model could mimic in vivo development, and thus developed methods to culture brain, retina and pituitary structures 'in a dish' (eiraku et al. 2008). later, ipscs-derived organoids from optic cup, intestine, stomach, liver, lung, thyroid and kidney followed (chen et al. 2017; kurmann et al. 2015; mccracken et al. 2014; mccracken et al. 2011; nakano et al. 2012; takasato et al. 2015; takebe et al. 2013). of note, each germ layer (endoderm, mesoderm, and ectoderm) is represented among this set of organs. typically, ipscs are expanded and subsequently differentiated through a multi-step protocol that moves towards a fully differentiated structure, and specific cocktails of growth factors are required for each step (fig. 2). the differentiation process usually takes about 2-3 months, depending on the specific type of organ (mccracken et al. 2011). the structure of ipscs-derived organoids is complex and may contain mesenchymal, as well as epithelial and endothelial, components. because differentiation protocols recapitulate development in vitro, ipscs-derived organoids are excellent models for studying development (takasato et al. 2015), genetic diseases (freedman et al. 2015), and infectious disease (garcez et al. 2016). another method, the air-liquid interface (ali) method, was introduced allowing for the preservation of both the epithelium and a matched in vitro stromal microenvironment (neal et al. 2018). the ali method employs a boyden chamber-like structure where primary tissue is seeded in ecm (extracellular matrix) gel in an inner transwell dish, which is exposed to air to enhance oxygenation (dimarco et al. 2014; li et al. 2014; ootani et al. 2009).
fig. 1 comparison of cell lines, patient-derived xenografts and organoids. cell lines have low cost, are easy to handle and can be applied to multiple experimental techniques. pdxs preserve tumor heterogeneity and tumor-stromal interactions. pdos can be derived from both epithelial cancer cells and normal epithelium and cultured in an extracellular matrix (ecm)-providing basement membrane extract.
culture medium is added to the outer dish and can diffuse through the permeable transwell into the inner dish (fig. 3). the ali method has lately been applied to pscs-derived organoid culture. koike and colleagues (koike et al. 2019) reported the continuous patterning and dynamic morphogenesis of hepatic, biliary and pancreatic structures, invaginating from ali culture of anterior and posterior gut spheroids differentiated from human pscs. adapted ali cultures of human cerebral organoids (giandomenico et al. 2019) and neocortical organoids (qian et al. 2020) derived from pscs were also developed. complementary to pscs-derived organoids that recapitulate development in vitro, ascs-derived organoids model adult tissue repair (clevers 2016) and can be established only from regenerative tissue compartments. in 1987, researchers began to explore 3d culture by culturing primary cells on a reconstituted basement membrane from engelbreth-holm-swarm (ehs) tumor (li et al. 1987; shannon et al. 1987). li and colleagues found that mammary epithelial cells cultured on ehs matrix could form ducts, ductules, and lumina and resemble secretory alveoli. shannon and colleagues cultured adult rat type ii cells on ehs matrix with feeder layer cells (mainly fibroblasts) and revealed that cell-matrix interactions help type ii cells preserve their original cuboidal shape and the morphological characteristics of variably differentiated cells. exploration based on ascs-derived 3d culture has since advanced to a new stage.
two decades later, ascs-derived organoids were successfully developed from lgr5-positive intestinal stem cells in culture conditions modeling the stem cell niche of the intestine (sato et al. 2009). by providing the wnt agonist r-spondin, epidermal growth factor (egf), and the bone morphogenetic protein (bmp) inhibitor noggin, and embedding the cells in an extracellular matrix (ecm)-providing basement membrane extract (the wenr method, after wnt, egf, noggin and r-spondin-1), lgr5-positive stem cells are able to self-organize, proliferate and form differentiated crypt-villus-like organoids (fig. 4a, b).
fig. 2 culture strategy and applications of pluripotent stem cell (psc)-derived organoids. pscs-derived organoids can be differentiated toward each of the three germ layers (endoderm, mesoderm, and ectoderm) under specific differentiation signals. pscs-derived organoids can be applied to studying development, genetic diseases, and infectious disease.
since then, by modifying cocktails of growth factors and cell isolation procedures, cultures of patient-derived organoids (pdos) have been successfully established for various human tissues obtained by biopsy or resection, including the esophagus, stomach (bartfeld et al. 2015), colon, liver (broutier et al. 2017; hu et al. 2018; huch et al. 2015), pancreas (boj et al. 2015), salivary gland (nanduri et al. 2014), fallopian tube (kessler et al. 2015), ovary (hill et al. 2018; kopper et al. 2019), prostate (karthaus et al. 2014), breast, airway (sachs et al. 2019), taste buds (ren et al. 2014), endometrium (turco et al. 2017), kidney (schutgens et al. 2019), bladder (lee et al. 2018), thyroid (saito et al. 2018), biliary tract (saito et al. 2019), oral mucosa (driehuis et al. 2019a) and glioblastoma (fig. 5, table 1). a counterintuitive phenomenon is that normal epithelium organoids often outgrow tumor organoids, which, in some instances, can be prevented by using cancer-specific selection methods.
for example, tumor organoids from colorectal cancer (crc) can be selectively expanded upon withdrawal of wnt3a and r-spondin-1. nearly all crcs harbor activating mutations in the wnt pathway or fusions of rspo (r-spondin) genes, allowing for the expansion of cancer cells without wnts and r-spondins, while normal epithelial cells arrest (nusse 2017; sato et al. 2011; seshagiri et al. 2012; van de wetering et al. 2015). another approach to selectively culture tumor cells is to stabilize wild-type p53 by adding the mdm2 inhibitor nutlin-3 (drost et al. 2015). tumor cells are not affected by nutlin-3 due to loss of tp53 (olivier et al. 2010), while normal cells in culture undergo cell cycle arrest and death, allowing for the selection of tumor cells. in general, pdos using the wenr method can be derived from any epithelium of normal tissues as well as from malignant or otherwise diseased tissues within approximately 7 days after embedding the cells into the ecm (fig. 3c; fig. 5). pdos can be expanded long term and cryopreserved while remaining genetically stable, making organoids an ideal tool for disease modeling. in addition, this type of organoid culture allows the direct parallel expansion of diseased cells and matched normal cells from individual patients, which allows for the generation of living tumor organoid biobanks and facilitates potential applications in personalized therapy (fig. 6). however, to date, nearly all pdo types represent only the epithelial parts of organs; stroma, nerves, and vasculature are absent. adopting the ali method, researchers can generate ascs-derived organoids from various murine tissues including small intestine, colon, stomach, and pancreas (li et al. 2014; ootani et al. 2009), extending later to the culture of clinical tumor samples (neal and kuo 2016; neal et al. 2018), accurately recapitulating stem cell populations and their multi-lineage differentiation.
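the two selection strategies just described boil down to a simple filter on growth-factor dependence and p53 status. the toy python sketch below illustrates that logic; all function and field names are hypothetical and purely illustrative, not part of any organoid protocol or software.

```python
# Toy model of cancer-specific organoid selection (illustrative only;
# a simplification of the withdrawal/nutlin-3 strategies described above).

def survives(cell, medium):
    """Return True if a cell keeps proliferating under the given medium.

    - Without Wnt3a/R-spondin, only cells with an activated Wnt pathway
      (e.g. a Wnt-pathway mutation or RSPO fusion) keep growing.
    - With the MDM2 inhibitor nutlin-3, wild-type p53 is stabilized and
      arrests the cell; TP53-mutant cells are unaffected.
    """
    if not medium["wnt_rspo"] and not cell["wnt_activated"]:
        return False          # normal epithelium arrests without Wnt cues
    if medium["nutlin3"] and cell["tp53_wildtype"]:
        return False          # stabilized p53 triggers arrest/death
    return True

normal = {"wnt_activated": False, "tp53_wildtype": True}
crc    = {"wnt_activated": True,  "tp53_wildtype": False}  # typical CRC cell

selective_medium = {"wnt_rspo": False, "nutlin3": True}
print(survives(normal, selective_medium))  # False: normal cells selected out
print(survives(crc, selective_medium))     # True: tumor cells expand
```

under a permissive medium (wnt/r-spondin present, no nutlin-3), both cell types would survive, which is why normal organoids can otherwise outgrow the tumor ones.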
the ali model preserves the tumor microenvironment, with tumor parenchyma and stroma, including functional tumor infiltrating lymphocytes (tils), providing a promising model for immunotherapy research for patients with cancer (neal et al. 2018). in the remainder of this review, we will discuss how pscs-derived organoids and ascs-derived organoids are applied in basic and translational research.
fig. 3 comparison of different culture methods for adult stem cells-derived organoids. in the wenr method, epithelial organoids are derived from tumor biopsies directly in matrigel with cocktail growth factors, with long-term expansion but no tumor microenvironment. in the air-liquid interphase (ali) method, tumor biopsies are cultured in ali in the entire tumor microenvironment as a cell suspension of all cell types, including immune cells and other non-epithelial cell types, but with limited expansion.
tissue physiology
organoids as a research tool for stem cell biology
organoids are an ideal in vitro tool for the identification of novel stem cell markers and for the study of physiological phenomena requiring the coculture of multiple cell types. lgr5+ cells located at the crypt base were verified as the real intestinal stem cells (barker et al. 2007). enlightened by the finding that lgr5+ intestinal stem cells can undergo thousands of cell divisions in vivo, sato and colleagues (sato et al. 2009) successfully established epithelial organoids from a single lgr5+ stem cell, also known as "mini-guts". wnt signals, notch signals, egf signals and bmp signals together maintain stem cell niche homeostasis (clevers 2013). other stem cell biomarkers have been explored to initiate intestinal organoid cultures, including cd24 (von furstenberg et al. 2011), ephb2 (jung et al. 2011), and cd166+/grp78 (wang et al. 2013). mini-guts contain multiple differentiated cell types.
rapidly dividing, transit-amplifying (ta) daughter cells derived from lgr5+ cells can differentiate to enterocytes, paneth cells, goblet cells, enteroendocrine cells, tuft cells, and the m cells that cover peyer's patches (clevers 2013), contributing to the study of the physiology of the crypt-villus axis.
fig. 4 morphology of several types of human adult stem-cell organoids. a a schematic diagram showing the growth pattern of organoids. typically, isolated cells or functional aggregates are embedded in extracellular matrix domes and cultured in media with essential niche factors. they gradually build tissue-like 3d structures within 1-2 weeks. b bright-field image of a typical murine small intestinal organoid culture. c bright-field image and he staining image of typical human normal colon epithelium, adenoma and adenocarcinoma organoids.
lindeboom and colleagues (lindeboom et al. 2018) applied a multi-omics framework on stem cell-enriched and enterocyte-enriched mouse intestinal organoids to reveal multiple layers of gene expression regulation contributing to lineage specification and plasticity of the intestine, and found that hnf4g is a major driver of enterocyte differentiation. as another example, beumer and colleagues (basak et al. 2017; beumer et al. 2018) used organoids to study the effect of growth factors on hormone expression in enteroendocrine cells, after establishing a protocol to obtain enteroendocrine cells in organoids. in organoids, hormones in enteroendocrine cells were differentially expressed based on the presence or absence of bmp4. this finding was then studied in a mouse model, and it was found that the bmp gradient along the crypt-villus axis in vivo dictates a switch in the hormones expressed by enteroendocrine cells that migrate up this gradient. beumer and colleagues further constructed an organoid-based platform for functional studies of human enteroendocrine cells, which can be induced by transient expression of neurog3.
by using single-cell mrna sequencing and mass spectrometry, they revealed differences between human and mouse enteroendocrine cells, and several secreted products were identified and validated by functional experiments. the mini-gut culture approach has been applied to the generation of organoids derived from the epithelial compartments of a variety of murine and human tissues of ecto-, meso- and endodermal origin, and promotes the study of the stem cell biology of tissues beyond the intestine. for example, long-term expanding organoids modeling mature pyloric epithelium can be efficiently generated from single lgr5+ stem cells located at the base of pyloric glands (barker et al. 2010). later, stange and colleagues (stange et al. 2013) discovered that troy+ chief cells can spontaneously dedifferentiate to act as multipotent epithelial stem cells in vivo, particularly upon damage. importantly, single troy+ chief cells can initiate long-term expanding gastric organoids containing various cell types of the corpus glands. this finding further confirms the role of troy+ chief cells as "reserve" stem cells upon challenge to tissue homeostasis. organoid culture allows for the generation of specific cell types that were previously impossible in 2d cultures. for example, hepatocytes can be successfully established and expanded in organoid culture (peng et al. 2018). based on adult bile duct-derived bipotent progenitor organoids, culture conditions were developed that supported the growth of human hepatocyte organoids.
table 1 comparison of the conditioned media requirements for patient-derived organoid culture of the respective cancer types.
the organoids proliferate greatly after transplantation into mice. the resulting hepatocytes maintained their original physiological functions, including secreting cytoplasmic glycogen particles, forming bile canaliculi, and expressing albumin and cytochrome p450 enzymes.
based on the organoid culture system for hepatocytes, peng and colleagues (peng et al. 2018) described a unique effect of tumor necrosis factor-α, a cytokine essential for liver regeneration, and found that the addition of regeneration-enhancing cytokines facilitates the in vitro expansion of cell types that are otherwise difficult to culture. as another example, yin and colleagues (yin et al. 2014) showed modulation of wnt and notch signaling in intestinal organoids to direct lineage differentiation into mature enterocytes, goblet cells and paneth cells. specifically, the combination of iwp-2 (inhibitor of wnt production 2; a wnt pathway inhibitor) and vpa (valproic acid; a notch activator) specifically induced enterocyte differentiation, presumably by combining the effects of both modulators, in which iwp-2 induced enterocyte differentiation while vpa suppressed the differentiation of lgr5+ stem cells toward secretory cell types. the combination of dapt (a notch inhibitor) and chir99021 (a gsk3β inhibitor) mainly induced paneth cell differentiation, and the combination of iwp-2 and dapt primarily induced goblet cell differentiation. these methods provide new tools for the study and application of multiple intestinal epithelial cell types. organoids can be established from a single cell, which makes it possible to study the mutational status of single stem cells. the gradual accumulation of genetic mutations in stem cells throughout life is related to a variety of age-related diseases, including cancer. in this way, blokzijl and colleagues (blokzijl et al. 2016) were able to unveil mutation rates and patterns in normal stem cells throughout life by whole-genome sequencing (wgs) analysis (with peripheral blood as a reference for germline mutations). interestingly, the mutation rate, around 40 novel mutations per year per stem cell, was similar in liver, small intestine, and colon stem cells, regardless of the large variation in cancer incidence among these organs. however, the types of mutations detected and the resulting mutational signatures in colon and small intestine cells were different from those in liver cells. of note, the inter-individual variation in mutation rate and spectra is low, indicating organ-specific activity of common mutational processes throughout life.
fig. 6 applications of adult stem cells-derived organoids. a organoids derived from normal tissue are useful for studying physiology. for disease modeling, organoids can be genetically engineered to model genetic and malignant diseases by using crispr-cas9. normal organoids can also be infected with different types of pathogens to model infectious disease. normal organoids can be transplanted to wounds for tissue repair. b tumor-derived organoids can be used for basic research by genetic modification and modeling rare cancers. for translational research, tumor-derived organoids can be used for biobanking, genetic repair and drug screening studies, both for personalized medicine (to choose the most effective treatment for a specific patient) and drug development (to test a compound library on a specific set of tumor organoids), as well as immunotherapy research.
disease modeling
crispr-cas9 technology as a useful tool for disease modeling of organoids
the clustered regularly interspaced short palindromic repeats (crispr)-associated protein 9 (cas9) system has become a major technology for mammalian genome editing. the system consists of the cas9 nuclease derived from streptococcus pyogenes and a guide rna which can recognize and target a specified dna sequence adjacent to the protospacer adjacent motif (pam). crispr-cas9 can generate dna double-strand breaks at specific genomic sites. mammalian double-strand dna breaks can be repaired in two ways, non-homologous end joining (nhej) and homology-directed repair (hdr).
nhej inserts indels randomly in the process of repair, and biallelic introduction of indels leads to gene knock-out (komor and badran 2017). on the other hand, hdr can replace the damaged allele using an intact dna template, so when tailored dna templates are co-delivered with crispr-cas9, hdr can be used for gene knock-in (komor and badran 2017). although crispr-cas9 technology has broadened its applications to a series of purposes, including dna base editing, rna targeting, epigenome editing and gene expression manipulation (adli 2018; komor and badran 2017), the use of crispr-cas9 in organoids still basically harnesses nhej and hdr to engineer genes of interest. indeed, organoids are an ideal model for investigating gene function by genome editing, as the organoid system allows for fast expansion with stable genetics and phenotype. previous studies have successfully achieved genome editing by delivering crispr-cas9 into organoids using various approaches, including liposomal transfection, electroporation and viral infection (fig. 7). however, variable experimental conditions limit the efficiency of genome editing in organoids, including the recovery after single-cell isolation, the approach chosen for crispr-cas9 delivery, and the cleavage efficiency of the guide rna. selection and enrichment of positive organoids are necessary after crispr-cas9-mediated genome editing; otherwise, labor-intensive organoid cloning, followed by sequencing of expanded organoid clones, is needed. recently, ringel and colleagues (ringel et al. 2020) developed a genome-wide pooled-library crispr screen approach, capturing sgrna (single-guide rna) integrations in single human intestinal organoids to dissect oncogenic signaling pathways. their screening method should be broadly applicable to various organoid models and selection assays, which may contribute to dissecting human disease mechanisms and facilitating biological discovery in primary 3d tissue models.
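the reason random nhej indels knock a gene out, while some deletions do not, is the reading frame: an indel whose length is not a multiple of 3 scrambles every downstream codon. a minimal python sketch (toy sequence, not from the paper) makes this concrete:

```python
# Minimal illustration of frameshift knock-out after an NHEJ indel:
# a 1-bp deletion shifts the reading frame and changes all downstream
# codons, while a 3-bp (in-frame) deletion only removes one codon.

def codons(seq):
    """Split a DNA string into complete 3-base codons (extras dropped)."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]

orf = "ATGGCTGAAGGTTTCTAA"          # toy open reading frame: ATG ... TAA
print(codons(orf))

# NHEJ deletes 1 bp at position 5 -> frameshift, downstream codons change
del1 = orf[:5] + orf[6:]
print(codons(del1))

# a 3-bp deletion removes one codon but preserves the frame and stop codon
del3 = orf[:3] + orf[6:]
print(codons(del3))
```

the same frame logic is why biallelic indels of arbitrary size are an effective knock-out strategy: roughly two thirds of random indel lengths break the frame.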
currently, organoids have become a useful tool to model genetic diseases. generally, two types of methods have been adopted: (i) organoids established from patient-derived biopsies; (ii) specific genetic mutations introduced into wild-type organoids using crispr-cas9 technology. cystic fibrosis (cf) is the best example of pdos modeling a human genetic disease. cf is a monogenic channelopathy caused by inactivating mutations in the cf transmembrane conductance regulator (cftr) gene. the disease involves multiple organs, including the intestine, lung, pancreas, liver, kidney and sweat gland. in gastrointestinal organs and lungs, decreased cftr function results in reduced chloride transport through cftr toward the extracellular space, leading to reduced water flow by osmosis and, hence, increased density of mucus. in the sweat glands, loss of cftr function leads to a high saline concentration in sweat. organoid models were first derived from the rectum of cf patients. early work with these rectal organoids revealed cftr function: wild-type organoids rapidly swell upon opening of the cftr channel in a cyclic adenosine monophosphate (camp)-dependent manner through the addition of forskolin (fsk). rectal organoids from cf patients do not respond to fsk, but the response is restored upon pre-incubation with cftr-restoring compounds or upon correction of the cftr mutation by crispr-cas9 (schwank et al. 2013). based on this finding, the organoid-based fsk-induced swelling assay began to be used to test drug responses in organoids isolated from patients harboring different cftr mutations, including rare variants. the lung is another organ that can be severely harmed by cf. due to cftr mutation, a thick sticky mucus forms in the lungs, which impairs breathing and provides a fertile environment for pathogen reproduction, leading to premature respiratory failure. cf airway organoids can be established from patient-derived ipscs (firth et al.
2015), bronchial epithelial cells based on both ali (fulcher et al. 2005) and wenr cultures (sachs et al. 2019), or bronchoalveolar lavage fluids (no biopsy needed) (sachs et al. 2019; sondo et al. 2014). airway organoids from cf patients had an increased mucus layer, recapitulating the disease phenotype. fsk-induced swelling in airway organoids was reduced compared with organoids from normal controls and could be restored with cftr-restoring compounds. however, in contrast to rectal organoids, fsk-induced swelling in lung organoids did not depend on cftr alone; it was also influenced by the chloride transporter tmem16a, which is considered an alternative therapeutic target for cf patients. therefore, airway organoids may function as an additional platform for assessing drug responses in cf, particularly for drugs acting on tmem16a. besides, liver, pancreatic and kidney organoids derived from cf patients can also be established. sampaziotis and colleagues (sampaziotis et al. 2015) generated ipscs from skin fibroblasts of a cf patient with a homozygous f508 mutation and differentiated them into cholangiocyte-like cells (clcs). cf-clcs displayed defective expression of cftr protein. cf-clc organoids treated with the experimental cf drug vx-809 showed increased cftr function and improved intraluminal fluid secretion. cf pancreatic organoids can be established from pscs of cf patients, including both human escs and an ipsc line, differentiated into pancreatic ductal epithelial cells (pdecs) (simsek et al. 2016). pdecs derived from cf-ipscs showed decreased expression of cftr protein and impaired chloride channel activity, recapitulating the functional defects of cf patients at the cellular level. in addition, a tubuloid line from the urine of a cf patient was established (schutgens et al. 2019).
at the morphological level, the kidney tubuloids remained folded over long-term culture, instead of showing the typical cystic phenotype, probably due to the lack of cftr function caused by the cftr mutations f508del/s1251n. in tubuloids derived from the urine of cf patients, fsk caused slight swelling in a concentration-dependent manner, suggesting residual cftr function, while after pre-incubation with the cftr-potentiator drug vx-770 (ivacaftor, kalydeco), swelling increased significantly.
fig. 7 genome editing in organoids by crispr-cas9 technology. workflow of genetic engineering in organoids using crispr-cas9. either nhej or hdr can be exploited for gene knock-out or knock-in, respectively. in most cases, expansion of single organoid clones after the selection procedure is necessary to obtain isogenic organoid populations.
all the above cf organoid models of different related organs allow in vitro assessment of treatment response and development of novel drugs. for a discussion of the use of cf-pdos for personalized medicine, see section 3.1. additionally, intestinal organoids harboring an inactivating mutation of ttc7a have been successfully derived from patients with intestinal atresia, recapitulating how ttc7a deficiency results in the loss of apical-basal cell polarity in the intestinal epithelium, which can be rescued by adding rho kinase inhibitors (bigorgne et al. 2014). besides, intestinal organoids were derived from patients with microvillus inclusion disease (mvid) caused by homozygous truncating mutations of the syntaxin-3 (stx3) gene. the model revealed that partial loss of brush border microvilli and subapical accumulation of vesicles are typical histological phenomena of the disease (wiegerinck et al. 2014). liver organoids have been generated from patients with α1-antitrypsin (a1at) deficiency. accumulation of mutant a1at in the endoplasmic reticulum in the liver leads to fibrosis or cirrhosis.
liver organoids derived from these patients indeed contained a1at aggregates and presented increased apoptosis, which might contribute to fibrosis and cirrhosis. alagille syndrome is caused by loss-of-function mutations in jag1 or notch2 and leads to partial or complete biliary atresia. accordingly, organoids generated from a patient with alagille syndrome cannot differentiate toward the biliary fate, whereas under proliferation conditions no differences were observed compared with healthy controls (andersson et al. 2018; sachs et al. 2018). ipscs-derived organoids can also be manipulated by crispr-cas9 technology to model diseases in different tissues. in human ipscs-derived kidney organoids, knock-out of the podocalyxin and pkd genes (freedman et al. 2015) recapitulated defects that mimic nephrotic syndrome and polycystic kidney disease, respectively, and contributed to understanding the functions of these genes in their pathogenic context. engineered ipscs-derived liver organoids helped illustrate the different effects that mutations of the jag1 gene can have on the development of bile ducts and the genesis of alagille syndrome: the c829x mutation of jag1 causes significant alterations, while the g274d mutation does not affect organoid properties (guan et al. 2017). in brain tissue, patient-specific ipscs-derived brain organoids can be used to model lissencephaly (bershteyn et al. 2017), down syndrome, and neuronal heterotopia (klaus et al. 2019). engineered ipscs-derived brain organoids have been established to model microcephaly by rna interference of reprogramming factors (lancaster et al. 2013), autism by overexpression of the transcription factor foxg1 (forkhead box g1) (mariani et al. 2015), macrocephaly by deletion of pten (li et al. 2017), timothy syndrome by introducing mutations in the cav1.2 calcium channel (birey et al. 2017) and aicardi-goutières syndrome by introducing an inactivating mutation of trex-1 (thomas et al. 2017).
fused organoid culture was recently established to study more complex biology, and has mostly been applied in brain research. bagley and colleagues (bagley et al. 2017) first showed a co-culture method combining brain regions of choice within one organoid tissue, generating a dorsal-ventral axis by fusing organoids of dorsal and ventral forebrain identities. combined with reprogramming technology, their novel fused-organoid culture should offer researchers the possibility to analyze complex neurodevelopmental defects using cells from patients with neurological disease and to test potential therapeutic compounds. xiang and colleagues (xiang et al. 2017) successfully established and fused medial ganglionic eminence (mge) and cortex-specific organoids from human pluripotent stem cells, followed by live imaging, to investigate mge development and human interneuron migration and integration, offering deeper insight into molecular dynamics during human brain development. the same research team developed a new 3d system to create the reciprocal projections between thalamus and cortex by fusing the two distinct region-specific organoids, providing a platform for understanding human thalamic development and modeling circuit organization and related disorders in the brain (xiang et al. 2019). generally, engineered organoids can faithfully recapitulate genetic diseases and thus provide a valid resource for basic research and for the development of novel therapeutics. organoids are closed 3d structures that exhibit the apical side of the epithelium towards the lumen and the basal membrane towards the outside. the apical membrane, facing the lumen, is the surface initially exposed to pathogens in vivo. three different methods have been established to reproduce the interaction between microbes and the host in organoid culture (fig. 8).
in the first method, organoids are mechanically sheared or digested into single-cell suspensions or large particles and then incubated with pathogens, which leads to infection of the cells. after being embedded in the 3d matrix, infected cells can reform organoids that can be used to model infectious disease (dang et al. 2016; forbester et al. 2015; nigro et al. 2014; zhang et al. 2014). this method is easy to handle and does not require special equipment. however, the efficiency of infection varies among pathogens, and the method cannot reflect the initial interaction between microbes and the host. besides, during the process not only the apical side but also the basal side of the cells is exposed, so nonspecific responses may be introduced by the interaction of pathogens with the basal side of the cells. in the second method, pathogens are injected directly into the lumen of the organoids by microinjection; thus the initial interaction of pathogens and the early response of the host cells can be captured, and either apical or basal interaction can be investigated separately (bartfeld et al. 2015; leslie et al. 2015; mccracken et al. 2014). this is now the mainstream method for building infection models. however, it needs special equipment such as a microinjector, and it is hard to perform quantitatively owing to the different sizes of organoids. in the third method, single cells digested from organoids are seeded onto a 3d matrix-coated dish, where they grow as a 2d monolayer with the apical side exposed. adding pathogens directly into the culture media allows interaction between microbes and the host cells (ettayebi et al. 2016). the 2d culture contains various differentiated cells and allows quantitative experiments, but it does not resemble the in vivo 3d structure of host tissues.
based on the above approaches, organoids have been adopted to model viral, bacterial, and parasitic infectious diseases of different tissues, including diseases caused by pathogens that previously could not be studied in vitro (table 2). these models recapitulate features of in vivo infection and could help identify therapeutic targets and develop novel drugs and vaccines. pscs-derived organoids were first used to model viral disease. in neuroscience, the development of pscs-derived brain organoids helped decipher the sequence of disease progression in zika virus infection (garcez et al. 2016; qian et al. 2016). zika virus (zikv) mainly spreads by the aedes aegypti mosquito and its infection can lead to microcephaly, which was declared a public health emergency by the world health organization (who) in 2016. (fig. 8: approaches to studying infectious diseases using organoids. organoids can be microinjected with the microbe, putting it in direct contact with the apical side of epithelial cells; this has become the mainstream method for building infection models. organoids can also be sheared into smaller aggregates, incubated with pathogens, and re-seeded into matrigel. alternatively, 3d organoids can be enzymatically digested into single cells and grown as 2d monolayer cultures, with microbes added into the culture media.) however, the pathogenesis of zikv infection was not fully understood until brain organoids emerged. multiple studies demonstrated that zikv infection can disrupt the cortical layers of cerebral organoids, abrogating growth and thus halting neurogenesis. researchers found that zikv infection leads to the activation of toll-like receptor 3 (tlr3), contributing to deregulated neurogenesis and decreased functional neurons (cugola et al. 2016). gabriel and colleagues (garcez et al. 2016) further illustrated that two newly isolated zikvs have different patterns of pathogenicity.
unlike the highly passaged mr766 strain of zikv, the new strains infected apical proliferating progenitors, interfering with centrosomal protein assembly, which in turn led to their premature differentiation and apoptosis, resulting in microcephaly (wells et al. 2016). the organoid model of zikv infection also promotes the development of treatments. in a high-throughput drug screen of 6000 compounds, the caspase-3 activity inhibitors emricasan and niclosamide were found to be effective in limiting zikv-induced death of neural cortical progenitors and zikv replication (xu et al. 2016). beyond zikv, japanese encephalitis virus (jev) infection, which leads to japanese encephalitis (je), was modeled in telencephalon organoids. researchers found that jev infection caused a decline in cell proliferation and an increase in cell death, and infected astrocytes and neural progenitors. in addition, they revealed variable antiviral immunity in brain organoids at different stages of culture, providing clues for developing effective therapeutics against such diseases. another example of the application of pscs-derived organoids in human disease concerns hepatitis b virus (hbv). in recent decades, treatments against hbv infection have improved; however, the development of personalized treatments has been hindered by the absence of personalized infection models. nie and colleagues (nie et al. 2018) generated pscs-derived liver organoids that recapitulated the genetic background of the donor and found that hbv infection of these organoids could reproduce the hbv life cycle and hbv-induced hepatic dysfunction, indicating that pscs-derived liver organoids may provide a promising personalized infection model for the development of personalized treatments for hepatitis. in recent years, ascs-derived organoids have become the main force in infectious disease modeling. ascs-derived organoids can be adopted to model viral infection of the intestine.
gastric diarrhea in humans is mostly caused by human norovirus (hunov) and rotavirus infection (zheng et al. 2006). although both of these viruses are rampant, no proper vaccine has been developed owing to the lack of an in vitro culture method supporting their replication. intestinal organoids cultured as monolayers allowed extensive replication of multiple strains of norovirus. for some strains, the addition of bile to the culture medium was required for replication (ramani and atmar 2014), indicating that not only in vivo-like host cells but also an in vivo-like environment is required for productive infection. ettayebi and colleagues (ettayebi et al. 2016) reproduced hunov infection in an organoid-virus co-culture system, with only a specific gii.3 hunov strain requiring the presence of bile. furthermore, lack of histo-blood group antigen (hbga) expression in intestinal organoids limits hunov replication, suggesting that this culture system allows the evaluation of potential treatments and preventions. similarly, researchers have shown that rotavirus strains (simian sa11 and strains from clinical samples) can proliferate in pscs-derived intestinal organoids (finkbeiner et al. 2012; yin et al. 2015). in the urinary system, bk virus, a tubule-specific circular dna virus, infects 1-10% of transplanted kidneys, leading to loss of the donor organ in 10-80% of these infected kidneys, and no curative treatment exists (hirsch et al. 2005). infection of kidney tubuloids (kidney-derived organoids in which only the tubular epithelium of the kidney is represented and glomeruli are lacking) with bk virus yielded a patchy infection with enlarged nuclei (due to intranuclear basophilic viral inclusions), similar to what is observed in kidney biopsies from patients with bk virus nephropathy (bohl and brennan 2007; schutgens et al. 2019). respiratory infections pose a major global disease burden (ferkol 2014).
respiratory syncytial virus (rsv) alone causes hundreds of thousands of deaths annually among children, mostly in developing countries (nair et al. 2010). ipscs-derived organoids of human airway epithelium can be infected by rsv, reproducing the morphological features of rsv infection in the distal lung (chen et al. 2017). persson and colleagues (persson et al. 2014) established an ali culture system for infecting human airway epithelium with rsv and found that rsv has the potential to influence the cellular composition of the airway epithelium. besides, ascs-derived airway organoids cultured with wenr methods can also be infected with rsv, recapitulating syncytia formation, cytoskeletal changes, and shedding of epithelial cells (mueller et al. 2005). rsv-infected organoids attracted neutrophils more than mock-infected control organoids did, making this the first organoid model suitable for studying neutrophil-epithelium interactions (sachs et al. 2019). intriguingly, rsv infection strongly increased organoid motility and ultimately resulted in organoid fusion. influenza viruses also pose a major public health problem worldwide, and novel emerging viruses may be lethal, as evidenced by the poultry-derived h7n9 virus infection that has had a 39% mortality rate since 2013. infection of differentiated airway organoids with distinct strains of influenza virus can discriminate between poorly infective and highly infective strains (zhou et al. 2018). importantly, hui and colleagues (hui et al. 2018) compared human and avian strains of influenza a virus in in vitro human bronchus and airway organoids and found that infection of airway organoids yielded similar results regarding virus replication and cytokine response. in addition, infection of airway organoids with enterovirus 71 (ev71) showed that ev71 replication kinetics are strain-dependent, and the model helped identify new infectivity markers for ev71 (van der sanden et al. 2018).
the year 2020 witnessed the outbreak of coronavirus disease-19 (covid-19), caused by severe acute respiratory syndrome-coronavirus 2 (sars-cov-2), which presents influenza-like symptoms ranging from mild disease to severe lung injury and multi-organ failure, eventually leading to death, especially in older patients with co-morbidities. the who has declared covid-19 a public health emergency of pandemic proportions. organoids have been used as a powerful platform to research how covid-19 affects humans and causes damage, and to identify possible drug targets for covid-19. lamers and colleagues (lamers and beumer 2020) infected enterocytes in human small intestinal organoids with sars-cov and sars-cov-2 and found that the intestinal epithelium supports sars-cov-2 replication, so organoids can serve as an experimental model for coronavirus infection and biology. zhou and colleagues (zhou et al. 2020) established the first expandable organoid culture system of bat intestinal epithelium, which was fully susceptible to sars-cov-2 infection and sustained robust viral replication. they also found active replication of sars-cov-2 in human intestinal organoids, indicating that the human intestinal tract might be a transmission route of sars-cov-2. in addition, monteil and colleagues (monteil et al. 2020) found that sars-cov-2 can directly infect engineered human blood vessel organoids and human kidney organoids, and that infection can be inhibited by human recombinant soluble ace2 (hrsace2), demonstrating that hrsace2 could be a possible drug for early stages of covid-19. salmonella typhi (s. typhi) and clostridium difficile (c. difficile) are two major bacterial intestinal pathogens that can cause diarrhea and gastrointestinal failure in humans. infection with these pathogens has been successfully modeled using pscs-derived intestinal organoids. forbester and colleagues (forbester et al. 2015) microinjected live s.
typhi into the lumen of ipscs-derived intestinal organoids and revealed that upon injection, nf-κb signaling was activated and inflammatory factors were secreted, consistent with previous findings in animal models. likewise, the spence lab (leslie et al. 2015) microinjected c. difficile toxin a (tcda) and toxin b (tcdb) into the lumen of ipscs-derived intestinal organoids to model anaerobic c. difficile infection (cdi). the injection of tcda recapitulates the impairment of epithelial barrier function and structure observed in organoids colonized with viable c. difficile. in another study, the worrell lab (engevik et al. 2015) observed decreased expression of nhe3 (sodium/hydrogen exchanger 3) and muc2 (mucoprotein 2) protein in c. difficile-infected organoids compared with normal organoids, which may help create a favorable environment for its colonization. infection with the bacterium helicobacter pylori (h. pylori) is a major risk factor for peptic ulcers, gastric adenocarcinoma and gastritis (salama and hartung 2013). gastric organoids derived from both ipscs and ascs can be used to model h. pylori infection by microinjecting an h. pylori strain into the lumen of the organoids (mccracken et al. 2014), which ensures that the apical side is exposed to h. pylori. luminal injection of h. pylori induces a potent nf-κb-mediated inflammatory response (bartfeld et al. 2015), connecting excessive microbial colonization by h. pylori with the occurrence of gastric cancer. in a follow-up study, researchers adopted gastric organoids to find out how h. pylori finds its gastric niche: urea, a potent chemoattractant produced by the gastric epithelium, is essential for the colonization of h. pylori in the gastric mucosa (huang et al. 2015a). chronic salmonella infection of the gall bladder is associated with gallbladder carcinoma (shukla et al. 2000). scanu and colleagues (scanu et al.
2015) showed that after salmonella infection, mouse gallbladder organoids exhibited loss of polarity, similar to that seen in the mouse model of gallbladder cancer. the same work also found that gallbladder organoids lacking functional tp53 and pre-exposed to salmonella showed neoplastic transformation through activation of the akt (protein kinase b) and mapk (mitogen-activated protein kinase) pathways and could grow in culture media free of growth factors (scanu et al. 2015). the protozoan parasite cryptosporidium causes life-threatening diarrhea in immunocompromised individuals (e.g. people living with hiv and malnourished children), and infection may spread to the lungs (checkley et al. 2015). drug development requires detailed pathophysiological information on cryptosporidium, but the lack of an optimal in vitro culture system hinders experimental approaches. heo and colleagues (heo et al. 2018) infected epithelial organoids derived from human small intestine and lung with cryptosporidium and found that the parasite can reproduce within the organoids and complete its complex asexual and sexual life cycles over multiple rounds. plasmodium parasites cause malaria, which poses a significant global health burden, with over 200 million cases every year. plasmodium parasites are maintained between anopheles mosquitoes and mammalian hosts in a complex life cycle, and models to study them are challenging to establish, particularly for plasmodium species that infect humans (mellin and boddey 2020). recently, several studies reported the application of ipscs-derived hepatocyte-like cells to model liver stage infections in vitro with p. berghei, p. yoelii, p. falciparum, and p. vivax (ng et al. 2015). it was found that p. yoelii and p. falciparum infections of organoids recapitulated the primaquine sensitivity found in vivo. chua and colleagues (chua et al. 2019) infected organoids derived from simian and human hepatocytes with p. cynomolgi and p.
vivax, and found that organoids could support the complete liver stage of both simian and human parasites, from initial infection with sporozoites to the release of merozoites capable of erythrocyte infection. this study also illustrated the use of infected organoids to evaluate the response to an anti-relapse drug, highlighting the potential of organoids as a parasite drug screening platform, particularly for parasites with life cycles longer than those of their host cells. though organoids derived from tumors and matched normal epithelial tissues provide valuable research tools for cancer biology, one of the most remarkable improvements in organoid research is the capacity to manipulate the genomes, transcriptomes and epigenomes of normal epithelial organoids to study the role of specific alterations in the process of tumorigenesis. murine organoid cultures were first used to study the early stages of tumorigenesis. li and colleagues (li et al. 2014) adopted the ali culture approach, combined with genetically engineered mouse models and retrovirus-mediated delivery of shrna constructs, to model multi-step tumorigenesis in organoids derived from the digestive tract, including the colon, stomach, and pancreas. pancreatic and gastric organoids exhibited dysplasia as a result of kras g12d expression, p53 loss, or both, while colon organoids needed combined apc, p53, kras g12d and smad4 mutations for malignant transformation to an invasive adenocarcinoma-like morphology. all engineered organoids presented histologic characteristics of adenocarcinoma after subcutaneous implantation into immunocompromised mice. subsequent research modeled the multi-step tumorigenesis of conventional crc, which is characterized by chromosomal instability (cin) (fig. 9). drost and colleagues (drost et al.
2015) adopted crispr-mediated knock-out of the tumor suppressors apc, tp53, and smad4, combined with crispr-mediated knock-in of the oncogene kras g12d, to model multi-step tumorigenesis. after selection by niche factors in the culture media, organoid cultures were successfully built with complex oncogenic multi-gene modules containing up to four simultaneous changes. the 4-hit akst (apc, kras g12d, smad4, and tp53) organoids could grow without stem cell niche factors such as wnt-3a, r-spondin-1 and egf. akst organoids were able to generate tumors with characteristics of invasive carcinoma upon subcutaneous implantation into immunocompromised mice. matano and colleagues (matano et al. 2015) applied a similar method to model tumorigenesis, adding a crispr-mediated knock-in of the oncogene pik3ca e545 on top of akst. both studies showed that organoids with apc and tp53 mutations exhibited extensive aneuploidy, the hallmark of the cin pathway. xenotransplantation of engineered colorectal tumor organoids makes it possible to study cancer stem cells in vivo (de sousa e melo et al. 2017; shimokawa et al. 2017) and leads to metastatic disease, making organoids a useful research tool for studying metastasis mechanisms (fumagalli et al. 2017; roper et al. 2017). de sousa e melo and colleagues (de sousa e melo et al. 2017) combined a crc mouse model with the lgr5 dtr/egfp allele. the resulting animals carry two of the most frequently mutated genes, apc and kras g12d, and in addition express a diphtheria toxin receptor fused to egfp under the endogenous regulatory region of lgr5, allowing specific elimination and visualization of lgr5-positive stem cells. using this model, it was found that in the absence of cancer stem cells, liver metastases did not occur, whereas primary tumors did not regress, indicating that lgr5-positive cancer stem cells are required for metastasis. in another study, fumagalli and colleagues (fumagalli et al.
2017) orthotopically transplanted crispr-mediated kras, apc, tp53, and smad4 co-mutated human colon organoids into mice and showed that metastases to the liver and lungs occurred in 44% of the mice. almost no metastasis occurred when organoids carrying mutations in only three of these four genes were transplanted into mice; however, the lack of the fourth mutation could be overcome by providing the niche factor upstream of the absent mutation. for example, triple-mutant organoids lacking smad4 inactivation metastasized when noggin was added to the cells. these findings indicate that metastatic potential is directly related to the loss of niche factor dependency. crc arising from the serrated neoplasia pathway differs from crc arising from the conventional cin pathway. activation of the oncogene braf initiates the serrated pathway, followed by extensive cpg island hypermethylation (the cpg island methylator phenotype) and subsequent inactivation of tumor suppressor genes (bae and kim 2016). fessler and colleagues (fessler et al. 2016) first built organoids modeling the serrated pathway of crc by introducing the braf v600e mutation into normal human colon epithelial organoids via homologous recombination. they revealed that induction of a mesenchymal phenotype upon tgf-β treatment prevails in the braf v600e-mutated organoids, generating sessile serrated adenomas (ssas). in a recent study, by analyzing genomic data from tcga, crc-associated gene alterations including braf v600e, cdkn2a, tgfbr2, znrf3 and rnf43 were selected to be introduced into murine colon organoids using crispr-cas9 technology to model the serrated pathway (lannagan et al. 2019). (fig. 9: assessing the tumorigenicity and metastasis mechanisms of combinations of mutations in colon cancer, based on normal colon organoids engineered with crispr-cas9 technology.)
upon subcutaneous implantation into immunocompromised mice, these engineered organoids could generate tumors with characteristics of serrated crc, including desmoplastic stromal responses, infiltrative growth, mucinous differentiation, tumor budding, and the formation of colon tumors spontaneously metastasizing to the liver. notably, transplantation of organoids carrying only the braf mutation failed to generate tumors, while injection of organoids with mutations in both braf and tgfbr2 led to invasive adenocarcinoma. implantation of organoids with those and additional mutations (cdkn2a, znrf3 and rnf43) resulted in increased tumor initiation and decreased survival time. moreover, to study the development of the r-spondin-driven serrated pathway, another study introduced crispr-mediated braf v600e and/or tp53 mutations accompanied by the r-spondin fusion eif3e (eukaryotic translation initiation factor 3 subunit e)-rspo2 into normal human colon epithelial organoids (kawasaki et al. 2020). the resulting braf v600e organoids showed aberrant crypt formation ability with poor engraftment capacity, while organoids carrying braf v600e and tp53 mutations with the eif3e-rspo2 fusion generated flatly elevated lesions and hyperplastic crypt structures with 'v'-shaped serration and basal dilation, together with enhanced engraftment capacity compared with organoids carrying only braf v600e. the loss of dna mismatch repair enzymes, such as mutl homolog 1 (mlh1), is common in crcs, resulting in tumors with a high mutational load. these tumors are characterized by microsatellite instability (msi), in which repetitive short sequences in the genome undergo changes in copy number following the loss of mismatch repair enzymes. drost and colleagues used crispr-mediated knock-out of mlh1 and cultured the organoids for 2 months to allow accumulation of mutations.
subsequent dna analyses of the cultured organoids revealed an increase in mutational load compared with controls, similar to that of msi colorectal tumors. this study shows that organoids faithfully recapitulate in vivo mutagenesis and allow the identification of mechanisms of tumor development. in addition, rare types of colorectal cancer can also be modeled in organoid culture. li and colleagues established a novel organoid line of colon signet-ring cell carcinoma (srcc), which accounts for only 1% of colorectal cancers. the novel organoid line resembles the primary tumor histologically and molecularly and can efficiently generate tumors in xenografts. targeted dna sequencing together with drug screening of 88 compounds identified jak2 (janus kinase 2) as a potential treatment target. an in vitro drug screening experiment showed that srcc organoids are sensitive to at9283 and pacritinib, two jak2 inhibitors, consistent with the in vivo xenograft response. the study provides a novel in vitro research tool for colorectal srcc and sets an example for organoid-based personalized medicine for rare cancers. pancreatic ductal adenocarcinoma (pdac) accounts for about 90% of all pancreatic malignancies, and over 90% of pdac patients harbor activation of the oncogene kras. oncogenic kras activation is typically accompanied by inactivation of various tumor suppressor genes, including tp53, cdkn2a, smad4 and brca2, which accelerates pdac development and progression (kanda et al. 2012; morris and wang 2010). the establishment of murine pancreatic organoids bearing a lox-stop-lox (lsl) kras g12d allele provided valuable insights into the development of pdac. in one study (li et al. 2014), pancreatic organoids derived from lsl-kras g12d (k), tp53 flox/flox (p), and lsl-kras g12d; tp53 flox/flox (kp) mice were successfully built using the ali method.
k, p, and kp organoids exhibited dysplasia and increased invasive behavior in vitro, and could generate well- to poorly-differentiated adenocarcinoma in vivo upon implantation into immunocompromised nog mice. kp organoids had the most poorly differentiated morphology, with significant loss of e-cadherin, indicating increased epithelial-to-mesenchymal transition. researchers in another study generated low-grade preinvasive pancreatic intraepithelial neoplasias (panins) in vivo by crossing the murine lsl-kras g12d allele to a pancreas-specific pdx1-cre driver; organoids derived from panin lesions could be expanded long-term and produced panin-like lesions that persisted for up to 2 months upon transplantation (boj et al. 2015). two studies have adopted crispr-cas9 technology to manipulate the pdac driver genes kras g12v, cdkn2a, tp53 and smad4 (kcts) in normal human pancreatic ductal organoids (seino et al. 2018). upon orthotopic transplantation into immunocompromised mice, different combinations of mutants in organoids exhibited distinct panin features, but only kcts organoids displayed pdac histopathological transformation. notably, kc and kt organoids died within 1-3 weeks of wnt removal, while kct and kcts organoids survived and expanded for at least 3 months, suggesting that cdkn2a and tp53 mutations are essential for organoids to grow independently of stem cell niche factors. in all, these studies indicate that organoids combined with crispr-mediated sequential mutations can recapitulate tumorigenesis and progression from panin to pdac. moreover, organoids could also be used to explore the roles of various factors, such as the redox regulator nrf2 and the transcriptional enhancer foxa1, in pdac progression (chio et al. 2016; roe et al. 2017). further research is needed to apply this improved knowledge of molecular mechanisms in the clinic.
gastric cancers are classified into four molecular subtypes based on deep sequencing: epstein-barr virus positive, msi, cin and genomically stable (cancer genome atlas research 2014). nanki and colleagues adopted organoids to illustrate genotype-phenotype associations in gastric cancer. phenotype analyses of organoids derived from gastric cancer patients indicated multiple genetic and epigenetic routes to wnt and r-spondin niche independency. they found that induction of rnf43 and znrf3 mutations was sufficient for gastric cancer organoids to gain r-spondin independency, a finding then validated in crispr-cas9-engineered gastric organoids. in a similar study, seidlitz and colleagues (seidlitz et al. 2019) generated human and murine gastric cancer organoids that recapitulated the typical features and altered pathways of each of the four molecular subtypes of gastric cancer. the combination of organoid and crispr-cas9 technologies promotes research on the molecular mechanisms of gastric cancer tumorigenesis and progression, thereby accelerating the development of preclinical gastric cancer models for novel drug development and personalized medicine. recently, dekkers and colleagues attempted to model multi-step carcinogenesis in breast epithelial organoids derived from human reduction mammoplasties using crispr-cas9 technology. they introduced crispr-mediated knock-out of four breast cancer-associated tumor suppressor genes, p53, pten, rb1 and nf1 (ptrn), into mammary progenitor cells. the mutated organoids could be expanded long-term, and 1 out of 6 ptr-mutated and 3 out of 6 ptrn-mutated organoid lines generated er+ luminal tumors upon transplantation into mice. these organoids showed varied responses to endocrine therapy or chemotherapy, indicating the potential of this model to facilitate our understanding of the molecular mechanisms of specific subtypes of breast cancer.
in prostate cancers, 40-80% of tumors harbor a gene fusion between the androgen receptor (ar)-responsive transmembrane protease serine 2 (tmprss2) gene and an e26 transformation-specific (ets) family transcription factor, most often the oncogene ets-related gene (erg) (tomlins et al. 2005). tmprss2 and erg, both located on chromosome 21, are separated by about 3 million base pairs. using crispr-cas9, a tmprss2-erg fusion was successfully introduced into mouse prostate organoids using a template that brought these two dna regions together. this genetic alteration resulted in ar-mediated overexpression of erg, an effect that could be prevented by an androgen receptor antagonist, consistent with observations in vivo (driehuis 2017). 
living organoid biobanks as a tool for personalized treatment and drug development 
as mentioned above, organoids can be efficiently established from patient-derived normal and tumor tissue samples, which can be cryopreserved and stored in living organoid biobanks. pdos resemble the tumor epithelium they were derived from both phenotypically and genetically. however, the molecular profile alone is not sufficient to adequately predict drug sensitivity: even in patients with the same genotype, drug response varies. besides, some mutations are rare, making clinical trials impractical, so drug efficacy testing must be conducted on an individual basis. thus, combined molecular and therapeutic profiling of pdos may help predict treatment response and contribute to personalized cancer treatment and drug development. a number of organoid biobanks have been reported since 2014 (table 3). a colon cancer-derived biobank of 22 lines was established in 2015, setting an example for a standard biobank. all samples underwent rna sequencing and whole-genome sequencing analysis. the molecular characteristics of the pdos covered all five consensus molecular subtypes of crc.
the mutations in the organoids were largely concordant with those of the original tumors, as validated in a set of organoids established from colorectal metastases (weeber et al. 2015). high-throughput screening of a panel of 83 compounds found differences in drug sensitivity among the organoid lines that in some cases correlated with specific mutations. for example, rnf43-mutant organoids were sensitive to wnt secretion inhibitors, and kras-mutant organoids were resistant to egfr (epidermal growth factor receptor) inhibitors, including cetuximab and afatinib. later, schütte and colleagues (schütte et al. 2017) reported a biobank of 35 organoid lines from crc. they found that organoid models reproduce most of the genetic and transcriptomic characteristics of the donors but yield less complex molecular subtypes owing to the absence of stroma. drug screening with therapeutic compounds representing the standard of care for crc, combined with molecular profiles, helped identify a signature outperforming ras/raf mutation status in predicting sensitivity to the egfr inhibitor cetuximab. correlations between drug response in organoids and clinical response have also been observed, showing that the in vitro organoid response reflects the in vivo response. a clinical study of pdos derived from metastatic gastroesophageal and colorectal cancer showed a strong correlation (100% sensitivity, 93% specificity, 88% positive predictive value, and 100% negative predictive value) between the in vitro organoid response to a set of targeted therapies and chemotherapies and the response of the tumor in patients (vlachogiannis et al. 2018). another study adopted organoids for colon cancer chemoprediction, showing that the pdo test predicted response in more than 80% of patients treated with irinotecan-based therapies (ooft et al. 2019). together, these studies indicate the potential of tumor-derived organoids to predict patients' responses.
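as a side note, the prediction statistics quoted above (sensitivity, specificity, positive and negative predictive value) all derive from a simple 2x2 table comparing predicted organoid response with observed patient response. the sketch below illustrates the arithmetic only; the counts are hypothetical and are not the actual data of the cited study:

```python
# illustrative sketch: deriving sensitivity, specificity, ppv and npv
# from a 2x2 confusion matrix (organoid prediction vs. patient response).
# the example counts (tp=7, fp=1, fn=0, tn=13) are hypothetical, chosen
# only to show how values on the order of those quoted above arise.

def predictive_metrics(tp, fp, fn, tn):
    """return (sensitivity, specificity, ppv, npv) as fractions."""
    sensitivity = tp / (tp + fn)  # responders correctly predicted to respond
    specificity = tn / (tn + fp)  # non-responders correctly predicted not to
    ppv = tp / (tp + fp)          # predicted responders who actually respond
    npv = tn / (tn + fn)          # predicted non-responders who truly don't
    return sensitivity, specificity, ppv, npv

sens, spec, ppv, npv = predictive_metrics(tp=7, fp=1, fn=0, tn=13)
# with these hypothetical counts: sens = 1.0, spec = 13/14, ppv = 0.875, npv = 1.0
```

note that with small cohorts, a single reclassified patient shifts these percentages considerably, which is why such correlations need validation in larger series.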
recently, two studies showed the application of pdos derived from rectal cancer to predicting patient responses to neoadjuvant chemoradiation therapy. yao and colleagues (yao et al. 2020) generated a rectal cancer-derived biobank (n=80) and tested the pdos' sensitivity to 5-fu, irinotecan, or radiation. they correlated the in vitro responses of the organoids with the histopathologically determined tumor regression grades (trgs) after surgical resection to define prognostic cut-offs. using these parameters, the in vitro responses could predict clinical responses with an impressive area under the curve (auc) of 0.88 and an accuracy of 84%. in the other study, ganesh and colleagues (ganesh et al. 2019) established 65 pdo lines from rectal cancer to test responses to neoadjuvant chemoradiation therapy, including the standard folfox chemotherapy and radiation. the pdo responses significantly reflected the patients' progression-free survival. moreover, upon transplantation into murine rectal mucosa, the pdos generated invasive rectal cancer followed by metastasis, exhibiting the same in vivo metastatic route as in the corresponding patients. for pancreatic cancer, boj and colleagues (boj et al. 2015) were the first to successfully develop organoids from patient-derived pdacs. subsequently, seino and colleagues generated an extensive organoid biobank of pdacs (n=39) covering both classical and basal subtypes according to gene expression signatures. they found that organoids derived from basal-type pdacs, which are clinically more invasive and aggressive, are more independent of wnt signaling, indicating that progression of pdacs is accompanied by loss of stem cell niche dependence. recently, tiriac and colleagues (tiriac et al. 2018) generated a much larger pdac biobank (n=114) and exposed a subset of these organoid lines to the standard-of-care chemotherapies; their sensitivities paralleled clinical responses in patients.
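the auc quoted above is the area under the roc curve, which can equivalently be read as the probability that a randomly chosen responder's organoid-sensitivity score exceeds that of a randomly chosen non-responder (ties counting one half). a minimal sketch of this rank-based formulation, with made-up scores for illustration:

```python
def roc_auc(scores_pos, scores_neg):
    """AUC via the Mann-Whitney formulation: probability that a responder's
    score beats a non-responder's score, counting ties as 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# hypothetical organoid-sensitivity scores for responders vs. non-responders
auc = roc_auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
```

an auc of 0.88, as reported by yao and colleagues, therefore means a responder's in vitro score outranks a non-responder's roughly 88% of the time.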
besides, gene expression signatures of chemosensitivity based on organoids were developed to help predict responses to chemotherapy in both the adjuvant and advanced disease settings, and high-throughput drug screening nominated alternative treatment strategies for chemorefractory pdos. another study also used pdos (n=30) to identify novel therapeutics targeting pancreatic tumor cells in a biobank covering different histological subtypes, including pdacs, acinar cell carcinoma, cholangiocarcinoma, adenosquamous pdacs, intraductal papillary mucinous neoplasm (ipmn)-derived pdacs and papilla of vater adenocarcinomas (driehuis et al. 2019c). pdos were exposed to 76 therapeutic agents currently not exploited in the clinic. the prmt5 (protein arginine methyltransferase 5) inhibitor ezp015556 was shown to target mtap (methylthioadenosine phosphorylase)-negative tumors, but also appeared to constitute an effective therapy for a subset of mtap-positive tumors, indicating the importance of personalized approaches for cancer treatment. huch and colleagues (broutier et al. 2017) described a liver tumor biobank (n=13) containing hepatocellular carcinoma and cholangiocarcinoma, as well as the rarer lymphoepithelioma-like cholangiocarcinoma. in drug screening experiments with 29 compounds, the erk (extracellular regulated protein kinase) inhibitor sch772984 was found to effectively inhibit the growth of tumor organoids, which was validated in vivo using xenotransplanted organoid lines in mice (histological types were not comprehensively reported), highlighting sch772984 as a possible therapeutic agent. a biobank of biliary tract carcinoma-derived organoids was also established, covering intrahepatic cholangiocarcinoma, gallbladder cancer, and neuroendocrine carcinoma of the ampulla of vater (saito et al. 2019). gene expression profiling of the organoids indicated that sox2, klk6 and cpb2 could be potential prognostic biomarkers.
drug screening using a compound library of 339 drugs showed that the antifungal drugs amorolfine and fenticonazole significantly suppressed the growth of biliary tract carcinoma organoids with little toxicity to normal biliary epithelial cells. an organoid biobank of metastatic prostate cancer covering both ar (androgen receptor)-positive and -negative subtypes was the first reported biobank, established by gao and colleagues. the biobank captured the most common genetic aberrations in prostate cancer, including the tmprss2-erg fusion, homozygous deletions of pten and chd1, as well as typical copy number variations. notably, organoids derived from circulating tumor cells were also successfully established in this biobank, showing that, at least in some cases, organoids can be established from less invasive blood samples. a bladder cancer-derived organoid biobank (n=20) was established containing urothelial carcinomas and one squamous cell carcinoma (lee et al. 2018). organoid lines were interconvertible with orthotopic xenografts and recapitulated the mutational spectrum of the corresponding tumor type, including activation of fgfr3 and mutations in epigenetic regulators such as arid1a. drug screening of 40 compounds based on bladder tumor organoids showed partial correlations with mutational profiles as well as treatment resistance, and the drug responses could be validated using xenografts in vivo. a biobank of breast cancer organoids (n=95) has been described covering the major histological subtypes (invasive ductal carcinoma and invasive lobular carcinoma) and all molecular subtypes based on gene expression. organoid morphologies matched the histopathology of the original tumors, and hormone receptor [estrogen receptor (er), progesterone receptor (pr)], her2 status and copy number variations were retained. er and pr status have predictive value for the outcome of endocrine therapy (e.g. tamoxifen), while her2 is a target for targeted therapy (e.g.
trastuzumab) and also has predictive value for chemotherapy outcome. the responses of breast cancer-derived organoids to the her2 inhibitor afatinib and to the endocrine therapy tamoxifen were consistent with in vivo xenotransplantations and patient responses in the clinic. an organoid biobank of high-grade serous ovarian cancer (hgsc) (n=33) was established by hill and colleagues (hill et al. 2018). up to 50% of all patients with hgsc have dna repair defects, typically mutation of brca1 or brca2, and these patients were thought to benefit from treatment with poly (adp-ribose) polymerase (parp) inhibitors. in the clinical setting, however, mutation analysis alone is not sufficient to adequately predict drug sensitivity. the study showed that functional assays in organoids are a better predictor than genomic analysis, implying that such assays may improve the prediction of drug sensitivity beyond what can be achieved with genomic analysis alone. kopper and colleagues (kopper et al. 2019) established a second ovarian cancer biobank (n=56) that captured all of the main histological subtypes, including borderline tumors, endometrioid carcinomas, mucinous carcinomas, lgsc and hgsc. notably, a novel single-cell dna sequencing method was used to demonstrate that intra-patient heterogeneity was preserved in organoids when compared with the original tumor. pdos can be used for drug-screening analyses and capture the distinct responses of different histological subtypes to platinum-based chemotherapy, including the acquisition of chemoresistance in recurrent disease. besides, pdos can also be xenografted, enabling in vivo drug sensitivity analyses. taken together, pdos of ovarian cancer have potential applications for translational research and precision medicine. driehuis and colleagues (driehuis et al. 2019a) established an organoid biobank (n=31) derived from head and neck squamous cell carcinoma (hnscc).
pdos recapitulate the genetic and molecular characteristics of the original hnsccs and can generate tumors upon transplantation into immunocompromised mice. the authors observed different responses in vitro to drugs commonly used in the clinic, including cisplatin, carboplatin, cetuximab, and radiotherapy. besides, drug screens revealed selective sensitivity to targeted drugs that are not normally used in the clinic for patients with hnscc. these findings may inspire the personalized treatment of hnscc and expand the list of hnscc drugs. in another study, the authors reported that pdos derived from hnscc can also be used to evaluate responses to targeted photodynamic therapy, while simultaneously assessing the safety of the treatment by testing it on organoids derived from matched normal tissues (driehuis et al. 2019b). all organoids discussed above were derived from tumors of epithelial origin, known as carcinomas. recent advances show that organoids can be derived from primary glioblastoma tissue, setting the stage for growing organoids from non-epithelial tumors (hubert et al. 2016). glioblastoma presents great heterogeneity, and thus it is difficult to generate an ideal in vitro model that recapitulates the in vivo situation of the donor. by modifying the method used to develop cerebral organoids, pdos could be successfully derived both from primary glioblastoma lesions and from brain metastases. once formed, the pdos presented hypoxia gradients and mimicked cancer stem cell heterogeneity, with rapidly dividing outer cells surrounding a hypoxic core of differentiated cells and diffuse, quiescent cancer stem cells. drug testing based on these organoids showed that non-stem cells were sensitive to radiation therapy, whereas adjacent cancer stem cells were radioresistant. orthotopic transplantation of pdos resulted in tumors recapitulating histological features of the parental tumor. based on this method, jacob et al.
(2020) established a larger organoid biobank derived from glioblastoma (n=70), recapitulating the histological characteristics, cellular diversity, gene expression, and mutational profiles of the donors. the organoids generated rapid, aggressively infiltrating tumors upon transplantation into adult rodent brains. the authors observed different responses to radiation with concurrent temozolomide, which were consistent with the patients' responses and survival in the clinic. notably, the authors further demonstrated the utility of organoids in modeling immunotherapy by co-culturing chimeric antigen receptor t (car-t) cells with organoids, and observed that specific tumor cells were targeted and killed by the car-t cells. the study thus expands the application of pdos in personalized treatment to include immunotherapy. several biobanks containing organoids derived from mixed cancers have been established for pan-cancer research. pauli et al. (2017a, 2017b) developed a robust precision cancer platform by integrating whole-exome sequencing with a living biobank that allows high-throughput drug screens on pdos. the biobank included tumors derived from prostate, breast, colorectum, esophagus, brain, pancreas, lung, small intestine, ovary, uterus, soft tissue, bladder, ureter and kidney. in another study, to model the tumor immune microenvironment, kuo and colleagues (neal et al. 2018) established pdos based on the ali method from >100 individual patient tumors of 19 distinct organs and 28 histological subtypes. these pdos included common cancers such as colon, pancreas, and lung, and rarer histologies such as bile duct ampullary adenocarcinoma, brain schwannoma, and salivary gland pleomorphic adenoma. pdos in this biobank retain immune cells and should enable immuno-oncology investigations and facilitate personalized immunotherapy testing.
as mentioned above, cf is a lethal genetic disease caused by cftr mutations that impair the function of many organs, including the intestine, lung, pancreas, sweat gland, liver, and kidney. the disease is characterized by the buildup of viscous, sticky mucus, which clogs airways, causes chronic digestive system problems and leads to cf-related diabetes. approximately 2,000 cf-causing mutations of cftr have been described, and drug efficacy varies among the different genotypes (cutting 2015). thus, there is a need for a personalized medicine approach to predict treatment response. beekman and colleagues (dekkers et al. 2016) first established an organoid biobank derived from the rectum of 71 cf patients with 28 different cftr genotypes. based on this biobank, they developed a personalized medicine approach using the fsk-induced swelling assay to select clinical responders to cf modulators. two patients with the rare and uncharacterized f508del/g1249r genotype responded in vitro to a specific cf modulator, ivacaftor (kalydeco, vertex pharmaceuticals). the responses were consistent with their in vivo clinical responses to the treatment, reflected by improved pulmonary function and sweat chloride tests. in a prospective follow-up study involving 24 participants (berkers et al. 2019), the predictive power of the fsk assay was further substantiated, as the in vitro assay correlated with changes in pulmonary function and sweat chloride tests conducted in vivo. besides cf-rectal organoids, pancreatic organoids and cholangiocyte-like organoids derived from cf-ipscs have also shown potential for drug screening (sampaziotis et al. 2015; simsek et al. 2016). in the netherlands, the licensing of orkambi (lumacaftor/ivacaftor, vertex pharmaceuticals) allows treatment of cf patients solely based on a positive organoid swelling response, demonstrating the potential of organoid-based assays for delivering personalized medicine (fig. 10).
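the fsk-induced swelling assay is typically read out by imaging organoid cross-sectional area over time after forskolin addition, normalizing to the baseline area, and integrating the percent increase into a single swelling score per organoid line. a minimal sketch of that common quantification scheme (the exact protocol and thresholds in dekkers et al. differ in detail; the function and numbers here are illustrative):

```python
def fis_score(areas):
    """Forskolin-induced swelling (FIS) score sketch: organoid areas measured
    at equally spaced time points are normalized to the first (baseline)
    measurement and the percent increase is integrated by the trapezoid rule."""
    base = areas[0]
    # percent area increase over baseline at each time point
    norm = [100.0 * (a / base - 1.0) for a in areas]
    # trapezoidal integration over unit time steps gives one score per line
    return sum((norm[i] + norm[i + 1]) / 2.0 for i in range(len(norm) - 1))

# illustrative series: areas growing from 100 to 120 arbitrary units
score = fis_score([100.0, 110.0, 120.0])
```

a responder line would show a large positive score (areas grow after forskolin), while a non-responder's score stays near zero, which is how such an assay can separate clinical responders from non-responders.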
in summary, organoid biobanks have been established for multiple tumor types (table 3), including non-epithelial glioblastoma, and several principles can be drawn from them. first, pdos can be established from small biological samples, such as biopsies and even circulating tumor cells, and these organoids can generate tumors upon xenotransplantation. second, the pdos in biobanks recapitulate histological and genetic aspects of the original tumors, which holds not only for localized primary tumors but also for metastatic tumors. third, high-throughput drug screening experiments in organoids correlate with the response in patients and lead to the identification of new therapeutic targets. combining genetic sequencing with functional assays in pdos will facilitate personalized treatment, even including immunotherapy, which will be discussed in detail in section 3.4. schwank and colleagues (schwank et al. 2013) first demonstrated that it is feasible to repair genetic defects in organoids. intestinal organoids derived from two cf patients with a homozygous cftr f508 deletion were repaired using crispr-cas9 technology. after repair, fsk-induced swelling was restored, functionally demonstrating cftr activity. later, firth and colleagues (firth et al. 2015) generated ipscs from cf patients with a homozygous deletion of cftr f508 and corrected this mutation using crispr technology to target corrective sequences to the endogenous cftr genomic locus, combined with a selection system. the corrected ipsc-derived organoids were able to differentiate into mature airway epithelial cells with normal cftr expression and function. however, cftr gene repair in organoids and subsequent transplantation into patients is difficult to apply in the clinic. first, the loss of cftr function results in disease in multiple organ systems, which would require the transplantation of organoids into multiple tissue sites. second, a high percentage of repaired cells per organ would be required for functional restoration.
third, the ethical problems surrounding organoid transplantation remain controversial, and a consensus needs to be reached in the research field. human and murine organoids have been orthotopically transplanted into mice to model disease or to show tumorigenic potential. here, we discuss studies that used organoid transplantation as therapy. yui and colleagues (yui et al. 2012) first demonstrated the feasibility of applying organoids to repair damaged epithelium. gfp+ murine colon organoids derived from lgr5+ stem cells were reintroduced into mice with dss (dextran sulfate sodium salt)-induced acute colitis. transplanted cells readily integrated into the mouse colon and covered superficially damaged tissue. 4 weeks after transplantation, the donor cells constituted a single-layered epithelium, which formed functionally and histologically normal crypts that were able to self-renew. further, engrafted mice had higher body weights than ungrafted ones, indicating that donor cells contributed to the recovery from dss-induced acute colitis. although further optimization is still needed, the study indicates that in vitro expansion and transplantation of organoids may be a promising treatment choice for patients with severe gastrointestinal epithelial injuries. orthotopic transplantation of liver organoids has also shown promising results. in a mouse model of toxicity-induced acute liver failure, transplantation of mouse differentiated biliary duct organoids derived from lgr5+ stem cells generated detectable organoid-derived nodules in 20-40% of cases. although the engraftment rate was low (approximately 1%), a significant increase in survival of the grafted group was observed compared to the ungrafted group, indicating that the transplanted cells contributed to liver function repair. in follow-up studies using the same injury model, the transplantation of mouse (peng et al.
2018) and human fetal hepatocyte organoids generated much more extensive engraftment, indicating that the efficiency of engraftment may be enhanced by transplantation of the most physiologically relevant cell type. recently, xiao and colleagues (xiao and deng 2020) generated induced sensory ganglion organoids exhibiting molecular features, subtype diversity, electrophysiological and calcium response properties, and innervation patterns characteristic of peripheral sensory neurons, which may serve as a source for cell replacement therapy. (fig. 10: organoids for modeling cystic fibrosis (cf) and its multiple applications. organ-specific pathologies of cf can be studied separately using organoids derived from distinct tissues and applied to personalized drug screening.) yoshihara and colleagues (yoshihara and o'connor 2020) generated human islet-like organoids (hilos) from ipscs, which provide a promising alternative to cadaveric and device-dependent therapies in the treatment of diabetes. hilos contain endocrine-like cell types that, upon transplantation, rapidly re-establish glucose homeostasis in diabetic nod/scid mice. overexpression of pd-l1 protected hilo xenografts such that they were able to restore glucose homeostasis in immune-competent diabetic mice. organoids combined with 3d printing were recently introduced to build the 3d architecture of tissues, which may find widespread applications in regenerative medicine. first, zhang and colleagues (zhang et al. 2016) established a technique using human cardiomyocytes derived from ipscs to construct endothelialized human myocardium. then, creff and colleagues (creff et al. 2019) demonstrated the possibility of creating artificial 3d scaffolds that match the size and structure of mouse intestinal crypts and villi. moreover, homan and colleagues (homan et al.
2019) built a model with the ability to induce substantial vascularization and morphogenesis of renal organoids in vitro under flow conditions, opening up a new way to study renal development, disease and regeneration. though the above studies indicate the potential application of organoids in regenerative medicine, many problems need to be solved before organoids can be put into clinical use. for example, integration upon transplantation requires optimization, and the animal-based 3d ecm matrix used for organoid culture needs to be replaced with a synthetic matrix. the tumor microenvironment consists of various non-epithelial cell types, including immune cells and stromal cells, which greatly affect therapeutic responses. however, modeling the tumor microenvironment is a major challenge. much of our knowledge regarding the tumor microenvironment comes from studies on cell lines and pdx models. however, cell lines are insufficient to recapitulate the heterogeneity of tumor cells, while the microenvironment of pdx models mainly depends on the mouse immune system, which cannot adequately recapitulate the human immune system. cancer immunotherapy has emerged as a promising therapeutic development that takes advantage of a patient's own immune system to eradicate tumor cells (mellman and coukos 2011), and several organoid-based models have been established to study immunotherapy response (fig. 11). recently, a holistic approach based on the ali method (mentioned in section 1.2), which preserves the tumor epithelium and its stromal microenvironment in vitro, was described using pdos of various cancer types, including colorectal and lung cancers (neal et al. 2018). in addition to stromal fibroblasts, cellular immune components such as tumor-associated macrophages, ctls (cytotoxic t lymphocytes), th cells (t helper cells), b cells, natural killer (nk) cells and natural killer t (nkt) cells were also readily maintained for up to 30 days in the organoid cultures.
the organoid cultures also preserved the t cell receptor (tcr) heterogeneity of the t cells found in the parental tumor. the authors used these organoids to model immune checkpoint blockade, leading to the expansion and activation of tumor antigen-specific t cells and subsequent killing of tumor cells. adoptive cell therapy is another promising immunotherapy. in this method, autologous immune cells are expanded in vitro and subsequently transplanted back into the patient to enhance the immune response against a tumor. using this strategy, durable regression of melanoma was achieved by in vitro expansion of autologous tumor-infiltrating lymphocytes (tils) (rosenberg 2015). however, this approach requires resected specimens from which tils can be obtained. a strategy to avoid resection is to isolate peripheral blood lymphocytes and activate them in vitro by co-culture with tumor cells. for this strategy, tumor-derived organoids are a highly useful source of tumor cells for co-culture: tumor organoid cultures can be efficiently established from a small tissue sample obtained through biopsy, and tumor-derived organoids are heterogeneous and recapitulate the genetic and histological characteristics of the parental tumors. dijkstra and colleagues (dijkstra et al. 2018) were thus able to obtain tumor-reactive t lymphocytes from peripheral blood lymphocytes after 2 weeks of co-culture with tumor organoids derived from non-small cell lung cancer and msi-h crc. before co-culture, organoids were stimulated with ifn-γ to enhance antigen presentation, and a pd-1 blocking antibody, il-2, and anti-cd28 were added to enhance t cell activation. after co-culture, t lymphocytes were activated, as demonstrated by expression of ifn-γ and cd107a. accordingly, after an additional 3 days of co-culture of the activated t lymphocytes with tumor organoids, the survival of the tumor organoids was reduced, while matched normal organoids were unaffected.
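reduced organoid survival in such killing assays is typically quantified against an effector-free control, for example via a viability or luminescence signal from surviving tumor cells. a minimal sketch of that standard percent-specific-lysis calculation, using hypothetical signal values (not data from the cited studies):

```python
def specific_lysis(signal_cocultured, signal_target_only):
    """Percent specific lysis: surviving labeled organoid cells emit a
    viability/luminescence signal; loss of signal relative to an
    effector-free control is read as killing by the immune cells."""
    return 100.0 * (1.0 - signal_cocultured / signal_target_only)

# hypothetical readings: tumor organoids lose 75% of signal after co-culture,
# matched normal organoids are essentially unaffected
tumor_lysis = specific_lysis(signal_cocultured=2500.0, signal_target_only=10000.0)
normal_lysis = specific_lysis(signal_cocultured=9800.0, signal_target_only=10000.0)
```

comparing the two values captures the key result of dijkstra and colleagues: strong killing of tumor organoids with near-zero lysis of matched normal organoids.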
cancers with a low mutational burden and stable tumor antigen presentation may be suitable targets for chimeric antigen receptor (car)-engineered t cells. studies on b cell malignancies have shown promising results (june 2018). in solid tumors, the therapeutic application of car t cells has been hindered by side effects arising from targeting overexpressed native antigens that are not exclusively expressed by tumors. thus, for solid tumors, preclinical models are needed to allow for car-mediated cytotoxicity testing. recently, a luciferase-based quantification assay has been developed to test car t cell-mediated cytotoxicity against pdos (schnalzger et al. 2019). instead of using car-engineered t cells, the researchers adopted an nk cell line with non-mhc-restricted cytolytic activity. the efficiency of the system was confirmed using car-engineered nk-92 cells directed toward epcam, an epithelial marker overexpressed by cancers: car-mediated cytotoxicity was observed against organoids derived from both normal colon tissue and crc tissue, both of which present the antigen (schnalzger et al. 2019). subsequently, the authors engineered cars targeting egfrviii, a neoantigen that is widely expressed by solid cancers. in a competitive co-culture assay, egfrviii-specific car nk cells efficiently killed egfrviii-expressing organoids derived from tumors but not organoids derived from normal tissue. finally, the team generated cars targeting antigens specific to a subgroup of crc that overexpresses the wnt ligand receptor fzd upon loss of expression of its antagonists rnf43/znrf3. to test possible side effects of fzd-specific cars, the authors evaluated cytotoxicity against normal colon organoids as well as different gene-edited organoid lines deficient for both rnf43 and znrf3 or for apc. these co-culture assays illustrated that the cytolytic activity of the fzd-specific car nk cells was not specific to the mutated organoid lines, suggesting that such approaches may have marked side effects if used therapeutically. though an effective target has not yet been found, the platform can be widely used to evaluate car efficacy and tumor specificity in a personalized manner. in summary, co-cultures of cancer organoids and immune cells have become a highly promising strategy for personalized immunotherapy for cancer patients. (fig. 11: co-culture systems of organoids and immune cells in immuno-oncology research. two main methods are currently adopted. a, holistic approach (top): tumor biopsies are cultured at the ali with the entire tumor microenvironment as a cell suspension of all cell types, including immune cells and other non-epithelial cell types. b, reductionist approach (bottom): epithelial organoids are derived from tumor biopsies and are then co-cultured with autologous immune cells derived from the peripheral blood of the same patient. crc, colorectal cancer; ali, air-liquid interphase; ecm, extracellular matrix; nk cell, natural killer cell.) organoids are robust research tools for the study of human development and disease. however, there are hurdles and limitations associated with using organoids (fig. 12). first, culture approaches are not standardized. asc-derived organoids are established under distinct culture conditions in each laboratory. to reduce costs, cost-efficient small-molecule compounds have been used to replace growth factors (yin et al. 2014). besides, homemade niche factors produced by various cell lines have commonly been used to culture organoids. however, this practice introduces experimental variation into organoid studies across different research labs. second, organoid culture requires the use of matrigel or another animal-based matrix extract to enable cells to aggregate into 3d structures.
these extracts suffer from batch-to-batch variability in their composition, which may affect the reproducibility of experiments. in addition, they may carry unknown pathogens and are potentially immunogenic when transplanted into humans, limiting the application of organoids in a clinical transplantation setting. this may be solved by culturing with clinical-grade collagen, which has been successfully used for colon organoid culture and expansion (yui et al. 2012). steps toward fully defined culture conditions have been made with the development of a synthetic polyethylene glycol-based gel that sustains the short-term growth of mouse asc-derived intestinal organoids (gjorevski et al. 2016). however, this matrix remains to be optimized for the long-term expansion of intestinal organoids and of non-intestinal organoids. third, obtaining pure tumor organoids is another critical problem for researchers. many studies have reported that tumor organoids can be overgrown and contaminated by matched normal organoids. (fig. 12: schematic summary of current limitations in organoid culture development (yellow) and methods to overcome these issues (brown).) one of the most widely used methods to acquire pure tumor organoids is to select tumor cells that harbor the most common mutations of the cancer type in question. for example, tumor organoids derived from crc can be selected for by withdrawal of wnt3a and r-spondin1. however, this method is not applicable to all cancers or to some specific cancer subtypes. more importantly, intra-tumoral heterogeneity would inevitably be lost upon selection. some labs have suggested obtaining pure tumor organoids based on growth phenotypes; however, not all tumor organoids show clear morphological differences from normal organoids. more research is needed to establish appropriate approaches for obtaining pure tumor organoids.
fourth, asc-derived organoids represent only the epithelial compartment of organs, while blood vessels, immune cells, stroma, and nerves are lacking. as mentioned above, many groups have focused on co-culturing organoids with various types of cells, analogous to what has already been done with immune cells (dijkstra et al. 2018), or have adopted unconventional organoid culture methods, such as the ali method (neal et al. 2018; ootani et al. 2009). besides, several psc-based organoids are able to undergo mesenchymal differentiation to generate subepithelial myofibroblasts and smooth muscle. last, the development of a model based on living human tissues that can be stored and expanded in biobanks, potentially indefinitely, has raised a set of ethical issues regarding informed consent and ownership (bredenoord et al. 2017). for organoid biobanks, patient consent is required. the most common type of patient consent restricts the use of a patient's material to a single specific research aim. however, biobanks are useful for researchers in multiple fields, and the use of biobanks across a combination of fields may provide potentially synergistic data. to solve this problem, bredenoord and colleagues (boers and van delden 2015; bredenoord et al. 2017) have suggested that broad consent be used for governance. this broad consent would allow donors to make informed decisions about how their samples are used after they have been provided with relevant information about the establishment and regulation of the biobank. another issue that has arisen with the development of living biobanks is ownership. organoids are increasingly used by commercial parties as tools for drug development or in validation studies. such uses will inevitably result in patentable compounds. it may be helpful to include regulations covering the distribution of any financial gains from intellectual property among stakeholders in the governance of the biobanks.
furthermore, in terms of the culture conditions of tumor organoids, intra-tumoral heterogeneity could be lost during passaging because the culture media might not favor the growth of all tumor subclones equally effectively. additionally, novel mutations may be acquired during long-term expansion. collectively, these drawbacks deserve further attention, and more work is needed to improve organoid culture technologies. despite these limitations, organoid technology holds great promise as a robust tool for basic, translational and clinical research into human development and disease modeling. in this review, we have demonstrated the potential of organoid technology for modeling genetic, infectious and malignant disease, as well as for drug development and personalized medicine. the basic and translational applications of organoids are expected to expand in the future. regarding epithelial genetics, organoids have particular potential for rare diseases owing to their high expansion ability and genetic stability. in infectious diseases, the mechanisms of virus-induced malignant transformation, for example by epstein-barr virus in gastric and nasopharyngeal cancers, may be studied in long-term co-cultures of the virus with normal epithelium from the respective organ. in the study of malignancies, culture conditions need to be developed for many types of carcinomas and for sarcomas and melanomas. the optimization of drug screening will be a pivotal use of organoids in personalized medicine through biobanks of different cancers. this enables many compounds to be screened for a specific disease, or a specific compound to be screened for many forms of a given disease. moreover, multiple organoids can be derived from the same cancer patient over time to assess drug response and developing resistance to targeted drugs, and to predict patient outcome. in the area of regenerative medicine, there is still a long way to go before the transplantation of organoids can be used as therapy.
Several hurdles, including the development of a non-animal-based alternative to Matrigel and efficient delivery procedures, remain to be overcome. In conclusion, the worldwide application of organoids has contributed to unprecedented advances in research on human development and diseases. Even though current organoid systems show some limitations and require further optimization for use in disease modeling and personalized medicine, they will continue to be valuable tools in basic and translational research.

Abbreviations: ECM: extracellular matrix; PDOs: patient-derived organoids; WENR method: Wnt3a + EGF + Noggin + R-spondin-1; Nico: nicotinamide; Rspo: R-spondin-1; PGE2: prostaglandin E2; DHT: dihydrotestosterone; CRC: colorectal cancer; ALI: air-liquid interface; TILs: tumor infiltrating lymphocytes; TA: transit-amplifying; WGS: whole-genome sequencing; CRISPR: clustered regularly interspaced short palindromic repeats; Cas9: CRISPR associated protein 9; NHEJ: non-homologous end joining; HDR: homology-directed repair; CFTR: CF transmembrane conductance regulator; cAMP: cyclic adenosine monophosphate; FSK: forskolin; CLC: cholangiocyte-like cells; PDECs: pancreatic ductal epithelial cells; MVID: microvillus inclusion disease; STX3: syntaxin-3; A1AT: α1-antitrypsin; AGS: Aicardi-Goutières syndrome; ZIKV: Zika virus; WHO: World Health Organization; TLR3: toll-like receptor 3; JEV: Japanese encephalitis virus; JE: Japanese encephalitis; HBV: hepatitis B virus; HuNoV: human norovirus; HBGA: histo-blood group antigen; RSV: respiratory syncytial virus; S. Typhi: Salmonella Typhi; C. difficile: Clostridium difficile; CDI: C. difficile infection; H. pylori: Helicobacter pylori; CIN: chromosomal instability; HIV: human immunodeficiency virus; shRNA: short hairpin RNA; SSAs: sessile serrated adenomas; MLH1: MutL homolog 1; MSI: microsatellite instability; PDAC: pancreatic ductal adenocarcinoma; SRCC: signet-ring cell carcinoma; AR: androgen receptor; TMPRSS2: transmembrane protease serine 2; ETS: E26 transformation-specific; ERG: ETS-related gene; AUC: area under the curve; IPMN: intraductal papillary mucinous neoplasm; HGSC: high-grade serous ovarian cancer; LGSC: low-grade serous ovarian cancer; PARP: poly (ADP-ribose) polymerase; HNSCC: head and neck squamous cell carcinoma; CAR-T: chimeric antigen receptor T; NK cell: natural killer cell; NKT cell: natural killer T cell; TCR: T cell receptor; CAR: chimeric antigen receptor; DSS: dextran sulfate sodium; IFN-γ: interferon-gamma; MHC: major histocompatibility complex; NHE3: sodium/hydrogen exchanger 3; MUC2: mucoprotein 2; AKT: protein kinase B; MAPK: mitogen-activated protein kinase; eIF3e: eukaryotic translation initiation factor 3 subunit e; EGFR: epidermal growth factor receptor; PRMT5: protein arginine methyltransferase 5; MTAP: methylthioadenosine phosphorylase; ERK: extracellular regulated protein kinases; SARS-CoV-2: severe acute respiratory syndrome coronavirus 2; MGE: medial ganglionic eminence; hrsACE2: human recombinant soluble ACE2; IWP-2: inhibitor of Wnt production 2; VPA: valproic acid; DAPT: dual antiplatelet therapy; sgRNA: single-guide RNA; FOXG1: forkhead box G1; HILOs: human islet-like organoids.

References

Adli M.
the crispr tool kit for genome editing and beyond mouse model of alagille syndrome and mechanisms of jagged1 missense mutations kang gh molecular subtypes of colorectal cancer and their clinicopathologic features, with an emphasis on the serrated neoplasia pathway knoblich ja fused cerebral organoids model interactions between brain regions lgr5(+ve) stem cells drive selfrenewal in the stomach and build long-lived gastric units in vitro identification of stem cells in small intestine and colon by marker gene lgr5 in vitro expansion of human gastric epithelial stem cells and their responses to bacterial infection induced quiescence of lgr5+ stem cells in intestinal organoids enables differentiation of hormone-producing enteroendocrine cells rectal organoids enable personalized treatment of cystic fibrosis human ipsc-derived cerebral organoids model cellular features of lissencephaly and reveal prolonged mitosis of outer radial glia van den born m, van es jh. clevers h enteroendocrine cells switch hormone expression along the crypt-to-villus bmp signalling gradient high-resolution mrna and secretome atlas of human enteroendocrine cells ttc7a mutations disrupt intestinal epithelial apicobasal polarity assembly of functionally integrated human forebrain spheroids tissue-specific mutation accumulation in human adult stem cells during life bredenoord al broad consent is consent for governance bk virus nephropathy and kidney transplantation organoid models of human and mouse ductal pancreatic cancer human tissues in a dish: the research and ethical implications of organoid technology human primary liver cancer-derived organoid cultures for disease modeling and drug screening interrogating open issues in cancer precision medicine with patient-derived xenografts cancer genome atlas research n. 
comprehensive molecular characterization of gastric adenocarcinoma a review of the global burden, novel diagnostics, therapeutics, and vaccine targets for cryptosporidium robey pg human pluripotent stem cell culture: considerations for maintenance, expansion, and therapeutics a three-dimensional model of human lung development and disease from pluripotent stem cells daley gq reprogramming cellular identity for regenerative medicine nrf2 promotes tumor maintenance by modulating mrna translation in pancreatic cancer hepatic spheroids used as an in vitro model to study malaria relapse single luminal epithelial progenitors can generate prostate organoids in culture the intestinal crypt, a prototype stem cell compartment modeling development and disease with fabrication of 3d scaffolds reproducing intestinal epithelium topography by high-resolution 3d stereolithography organoid cystogenesis reveals a critical role of microenvironment in human polycystic kidney disease the brazilian zika virus strain causes birth defects in experimental models cystic fibrosis genetics: from molecular understanding to clinical application rana tm zika virus depletes neural progenitors in human cerebral organoids through activation of the innate immune receptor tlr3 a distinct role for lgr5(+) stem cells in primary and metastatic colon cancer characterizing responses to cftr-modulating drugs using rectal organoids derived from subjects with cystic fibrosis modeling breast cancer using crispr/ cas9-mediated engineering of human breast organoids a functional cftr assay using primary cystic fibrosis intestinal organoids generation of tumor-reactive t cells by co-culture of peripheral blood lymphocytes and tumor organoids heilshorn sc engineering of threedimensional microenvironments to promote contractile behavior in primary intestinal organoids clevers h crispr-induced tmprss2-erg gene fusions in mouse prostate organoids oral mucosal organoids as a potential platform for personalized cancer 
therapy patient-derived head and neck cancer organoids recapitulate egfr expression levels of respective tissues and are responsive to egfr-targeted photodynamic therapy pancreatic cancer organoids recapitulate disease and allow personalized drug screening clevers h organoids in cancer research use of crispr-modified human stem cell organoids to study the origin of mutational signatures in cancer sequential cancer mutations in cultured human intestinal stem cells self-formation of layered neural structures in three-dimensional culture of es cells sasai y self-organized formation of polarized cortical tissues from escs and its active manipulation by extrinsic signals worrell rt human clostridium difficile infection: inhibition of nhe3 and microbiota profile replication of human noroviruses in stem cell-derived human enteroids schraufnagel d the global burden of respiratory disease tgf-β signaling directs serrated adenomas to the mesenchymal colorectal cancer subtype stem cell-derived human intestinal organoids as an infection model for rotaviruses functional gene correction for cystic fibrosis in lung epithelial cells generated from patient ipscs dougan g interaction of salmonella enterica serovar typhimurium with intestinal organoids derived from human induced pluripotent stem cells modelling kidney disease with crispr-mutant kidney organoids derived from human pluripotent epiblast spheroids a colorectal tumor organoid library demonstrates progressive loss of niche factor requirements during tumorigenesis well-differentiated human airway epithelial cell cultures genetic dissection of colorectal cancer progression by orthotopic transplantation of engineered cancer organoids recent zika virus isolates induce premature differentiation of neural progenitors in human brain organoids a rectal cancer organoid platform to study individual responses to chemoradiation organoid cultures derived from patients with advanced prostate cancer zika virus impairs growth in human
neurospheres and brain organoids tripodi m cerebral organoids at the air-liquid interface generate diverse nerve tracts with functional output lutolf mp designer matrices for intestinal stem cell and organoid culture human hepatic organoids for the analysis of human genetic diseases modelling cryptosporidium infection in human small intestinal and lung organoids prediction of dna repair inhibitor response in short-term patient-derived ovarian cancer organoids polyomavirusassociated nephropathy in renal transplantation: interdisciplinary analyses and recommendations flow-enhanced vascularization and maturation of kidney organoids in vitro long-term expansion of functional mouse and human hepatocytes as 3d amieva mr chemodetection and destruction of host urea allows helicobacter pylori to locate the epithelium ductal pancreatic cancer modeling and drug screening using human pluripotent stem cell-and patient-derived tumor organoids rich jn a three-dimensional organoid culture system derived from human glioblastomas recapitulates the hypoxic gradients and cancer stem cell heterogeneity of tumors found in vivo in vitro expansion of single lgr5+ liver stem cells induced by wnt-driven regeneration long-term culture of genomestable bipotent stem cells from adult human liver chan mcw tropism, replication competence, and innate immune responses of influenza virus: an analysis of human airway organoids and ex-vivo bronchus cultures a patient-derived glioblastoma organoid model and biobank recapitulates inter-and intra-tumoral heterogeneity high compressioninduced conductivity in a layered cu-br perovskite the ability to form primary tumor xenografts is predictive of increased risk of disease recurrence in early-stage non-small cell lung cancer sadelain m chimeric antigen receptor therapy isolation and in vitro expansion of human colonic stem cells presence of somatic mutations in most early-stage pancreatic intraepithelial neoplasia identification of multipotent luminal 
progenitor cells in human prostate organoid cultures chromosome engineering of human colon-derived organoids to develop a model of traditional serrated adenoma the notch and wnt pathways regulate stemness and differentiation in human fallopian tube organoids gene-edited human kidney organoids reveal mechanisms of disease in podocyte development altered neuronal migratory trajectories in human cerebral organoids derived from individuals with neuronal heterotopia modelling human hepato-biliarypancreatic organogenesis from the foregut-midgut boundary liu dr crispr-based technologies for the manipulation of eukaryotic genomes an organoid platform for ovarian cancer captures intra-and interpatient heterogeneity regeneration of thyroid function by transplantation of differentiated pluripotent stem cells sars-cov-2 productively infects human gut enterocytes knoblich ja cerebral organoids model human brain development and microcephaly genetic editing of colonic organoids provides a molecularly distinct and orthotopic preclinical model of serrated carcinogenesis reconstituting development of pancreatic intraepithelial neoplasia from primary human pancreas duct cells tumor evolution and drug response in patientderived organoid models of bladder cancer spence jr persistence and toxin production by clostridium difficile within human intestinal organoids result in disruption of epithelial paracellular barrier function bissell mj influence of a reconstituted basement membrane and its components on casein gene expression and secretion in mouse mammary epithelial cells oncogenic transformation of diverse gastrointestinal tissues in primary organoid culture a growth factorfree culture system underscores the coordination between wnt and bmp signaling in lgr5(+) intestinal stem cell maintenance induction of expansion and folding in human cerebral organoids a novel human colon signet-ring cell carcinoma organoid line: establishment, characterization and application integrative 
multi-omics analysis of intestinal organoid differentiation foxg1-dependent dysregulation of gaba/glutamate neuron differentiation in autism spectrum disorders modeling colorectal cancer using crispr-cas9-mediated engineering of human intestinal organoids modelling human development and disease in pluripotent stem-cell-derived gastric organoids generating human intestinal tissue from pluripotent stem cells in vitro organoids for liver stage malaria research dranoff g cancer immunotherapy comes of age inhibition of sars-cov-2 infections in engineered human tissues using clinical-grade soluble human ace2 hebrok m kras, hedgehog, wnt and the twisted developmental biology of pancreatic ductal adenocarcinoma fishman ja early weaning of piglets fails to exclude porcine lymphotropic herpesvirus global burden of acute lower respiratory infections due to respiratory syncytial virus in young children: a systematic review and meta-analysis sasai y self-formation of optic cups and storable stratified neural retina from human escs coppes rp purification and ex vivo expansion of fully functional salivary gland stem cells divergent routes toward wnt and r-spondin niche independency during human gastric carcinogenesis organoids as models for neoplastic transformation organoid modeling of the tumor immune microenvironment bhatia sn human ipsc-derived hepatocyte-like cells support plasmodium liver-stage infection in vitro recapitulation of hepatitis b virus-host interactions in liver organoids from human induced pluripotent stem cells the cytosolic bacterial peptidoglycan sensor nod2 affords stem cell protection and links microbes to gut epithelial regeneration organoid models of human liver cancers derived from tumor needle biopsies clevers h wnt/β-catenin signaling, disease, and emerging therapeutic modalities tp53 mutations in human cancers: origins, consequences, and clinical use patient-derived organoids can predict response to chemotherapy in metastatic colorectal cancer
patients sustained in vitro intestinal epithelial culture within a wnt-dependent stem cell niche personalized in vitro and in vivo cancer models to guide precision medicine inflammatory cytokine tnf-α promotes the long-term expansion of primary hepatocytes in 3d respiratory syncytial virus can infect basal cells and alter human airway epithelial differentiation brain-region-specific organoids using mini-bioreactors for modeling zikv exposure sliced human cortical organoids for modeling distinct cortical layer formation estes mk epidemiology of human noroviruses and updates on vaccine development jiang p single lgr5- or lgr6-expressing taste stem/progenitor cells generate taste bud cells ex vivo genome-scale crispr screening in human intestinal organoids identifies drivers of tgf-β resistance enhancer reprogramming promotes pancreatic cancer metastasis development and application of human adult stem or progenitor cell organoids in vivo genome editing and organoid transplantation models of colorectal cancer and metastasis restifo np adoptive cell transfer as personalized immunotherapy for human cancer organoid cultures for the analysis of cancer phenotypes a living biobank of breast cancer organoids captures disease heterogeneity long-term expanding human airway organoids for disease modeling establishment of patient-derived organoids and drug screening for biliary tract carcinoma development of a functional thyroid model based on an organoid culture system müller a life in the human stomach: persistence strategies of the bacterial pathogen helicobacter pylori cholangiocytes derived from human induced pluripotent stem cells for disease modeling and drug validation long-term expansion of epithelial organoids from human colon, adenoma, adenocarcinoma, and barrett's epithelium single lgr5 stem cells build crypt-villus structures in vitro without a mesenchymal niche salmonella manipulation of host
signaling pathways provokes cellular transformation associated with gallbladder carcinoma a novel human gastric primary cell culture system for modelling helicobacter pylori infection in vitro farin hf 3d model for car-mediated cytotoxicity using patientderived colorectal cancer organoids tubuloids derived from human adult kidney and urine for personalized disease modeling molecular dissection of colorectal cancer in pre-clinical models identifies biomarkers predicting sensitivity to egfr inhibitors functional repair of cftr by crispr/cas9 in intestinal stem cell organoids of cystic fibrosis patients human gastric cancer modelling using organoids human pancreatic tumor organoids reveal loss of stem cell niche factor dependence during disease progression recurrent r-spondin fusions in colon cancer functional differentiation of alveolar type ii epithelial cells in vitro: effects of cell shape, cell-matrix interactions and cellcell interactions sato t visualization and targeting of lgr5(+) human colon cancer stem cells carcinoma of the gallbladder--is it a sequel of typhoid? 
modeling cystic fibrosis using pluripotent stem cell-derived human pancreatic ductal epithelial cells the tmem16a chloride channel as an alternative therapeutic target in cystic fibrosis directed differentiation of human pluripotent stem cells into intestinal tissue in vitro differentiated troy+ chief cells act as reserve stem cells to generate all lineages of the stomach epithelium kidney organoids from human ips cells contain multiple lineages and model human nephrogenesis vascularized and functional human liver from an ipsc-derived organ bud transplant modeling of trex1-dependent autoimmune disease using human stem cells highlights l1 accumulation as a source of neuroinflammation organoid profiling identifies common responders to chemotherapy in pancreatic cancer recurrent fusion of tmprss2 and ets transcription factor genes in prostate cancer growth of human tumors in cortisone-treated laboratory animals: the possibility of obtaining permanently transplantable human tumors long-term, hormone-responsive organoid cultures of human endometrium in a chemically defined medium enterovirus 71 infection of human airway organoids reveals vp1-145 as a viral infectivity determinant patient-derived organoids model treatment response of metastatic gastrointestinal cancers henning sj sorting mouse jejunal epithelial cells with cd24 yields a population with characteristics of intestinal stem cells isolation and characterization of intestinal stem cells based on surface marker combinations and colony-formation assay preserved genetic diversity in organoids cultured from biopsies of human colorectal cancer metastases genetic ablation of axl does not protect human neural progenitor cells and cerebral organoids from zika virus infection loss of syntaxin 3 causes variant microvillus inclusion disease hesc-derived thalamic organoids form reciprocal projections when fused with cortical organoids fusion of regionally specified hpsc-derived organoids models human brain development and 
interneuron migration generation of self-organized sensory ganglion organoids and retinal ganglion cells from fibroblasts identification of small-molecule inhibitors of zika virus infection and induced neural cell death via a drug repurposing screen olig2 drives abnormal neurodevelopmental phenotypes in human ipsc-based organoid and chimeric mouse models of down syndrome a comprehensive human gastric cancer organoid biobank captures tumor subtype heterogeneity and enables therapeutic screening patient-derived organoids predict chemoradiation responses of locally advanced rectal cancer niche-independent highpurity cultures of lgr5+ intestinal stem cells and their progeny modeling rotavirus infection and antiviral therapy using primary intestinal organoids immune-evasive human islet-like organoids ameliorate diabetes functional engraftment of colon epithelium expanded in vitro from a single adult lgr5 + stem cell differential antiviral immunity to japanese encephalitis virus in developing cortical organoids sun j salmonella-infected crypt-derived intestinal organoid culture system for host-bacterial interactions bioprinting 3d microfibrous scaffolds for engineering endothelialized myocardium and heart-on-a-chip monroe ss norovirus classification and proposed strain nomenclature infection of bat and human intestinal organoids by sars-cov-2 differentiated human airway organoids to assess infectivity of emerging influenza virus authors' contributions yl and pt conceived and wrote the paper. sc and jp participated in conceptualization, validation and review of the draft. gh participated in the supervision and review of the final draft. the author(s) read and approved the final manuscript. this work was supported by the national natural science foundation of china (31470826, 31670858, and 81773357 to hua), shanghai sailing program (19yf1409500 to li) and shanghai anticancer association eyas project (saca-cy1a05 to li). 
The authors declare that they have no competing interests.

Author details 1

key: cord-333490-8pv5x6tz
authors: Liao, Yi; Peng, Yuxuan; Shi, Songlin; Shi, Victor; Yu, Xiaohong
title: Early Box Office Prediction in China's Film Market Based on a Stacking Fusion Model
date: 2020-10-06
journal: Ann Oper Res
doi: 10.1007/s10479-020-03804-4
doc_id: 333490
cord_uid: 333490-8pv5x6tz

Artificial intelligence has been increasingly employed to improve operations for various firms and industries. In this study, we construct a box office revenue prediction system for a film at its early stage of production, which can help management overcome resource allocation challenges given the significant investment and risk involved in film production. We focus on China's film market, the second-largest box office in the world. Our model is based on data regarding the nature of a film itself, without word-of-mouth data from social platforms. Combining extreme gradient boosting, random forest, light gradient boosting machine, the k-nearest neighbor algorithm, and stacking model fusion theory, we establish a stacking model for film box office prediction. Our empirical results show that the model exhibits good prediction accuracy, with a 1-away accuracy of 86.46%. Moreover, our results show that star influence has the strongest predictive power in this model.

In the past few years, China's film market has developed rapidly. According to a 2019 annual film market report, the box office revenue (henceforth called 'box office') of China's film market in 2019 was nearly $9 billion, the second-highest in the world (Leung and Lee 2019). Although the growth of China's film market has been significant, the expansion may not continue. In 2019, the number of people who watched films in urban cinemas across the country was 1.727 billion, a growth rate of only 0.6%, the lowest level in the past decade, as shown in Fig. 1.

Fig. 1 The number of movie-goers in urban cinemas in China (2008-2019)

More importantly, the COVID-19 outbreak has brought severe economic impacts worldwide (Ivanov 2020; Queiroz et al. 2020), especially for the film industry, since cinemas are traditionally places where people gather. As many countries and communities continue to ban large public gatherings in response to the pandemic, not only have countless films in production been canceled, but any new film investment will also be examined extremely carefully. On the other hand, filmmaking itself is highly risky (Ahmed et al. 2019; De Vany and Walls 1999). Before any revenue can be expected, a significant amount of capital investment is needed. For example, the box office revenue of cinemas usually depends on several popular movies, with around half of the rest losing money (Ghiassi et al. 2015). Furthermore, the movie production cycle is usually long, and market competition is fierce. Thus, it is critical to predict the box office of a movie at an early stage to avoid potentially huge losses down the road (Ahmed et al. 2019). In addition, such prediction can help film production companies minimize the opportunity cost of wasting limited development resources on mediocre or failed projects (Ghiassi et al. 2015). A few models have been developed for movie box office prediction based on word-of-mouth (WOM) data, which are normally generated after a movie's official announcement. However, at that stage, the movie project has been largely developed, and its box office revenue prediction has a rather limited impact on the movie's investment risk (Ahmed et al. 2019). Thus, in this research, we ask the following question: how can a film's box office revenue be predicted at its early stage without using post-release or post-production data? To answer this question, we develop a model to predict a movie's box office revenue at its early stage based on artificial intelligence techniques.
Our research is of practical significance as it can help avoid investment failure and save millions of dollars (Sharda and Delen 2006; Delen and Sharda 2010; Ghiassi et al. 2015). Specifically, combining the extreme gradient boosting (XGBoost), random forest (RF), light gradient boosting machine (LightGBM), and k-nearest neighbor (KNN) algorithms, we establish a stacking model for box office prediction during a film's early stage of production (shooting period). The model requires data related to the nature of the film itself instead of word-of-mouth or post-release data from social platforms. Extensive numerical experiments show that the proposed model achieves a bingo accuracy of 69.16% and a 1-away accuracy of 86.46%. The rest of the paper is organized as follows. Section 2 presents a literature review highlighting the existing research on the prediction of domestic box office revenue. In Sect. 3, the data used in this research and its processing are described, including data acquisition, cleaning, feature extraction, and test data preparation. Next, Sect. 4 provides details of the proposed model. Through numerical experiments, Sect. 5 demonstrates the model's prediction performance and compares it to that of other models. Lastly, the contributions of this study and potential future research are presented in Sect. 6.

Litman and Kohl (1989) proposed a box office prediction model based on linear regression, where film rental is used to predict film revenue. Following this influential work, several improved box office prediction methods have been proposed. More recently, various prediction models, such as Bayesian models (Neelamegham and Chintagunta 1999) and support vector machines and neural networks (Ghiassi et al. 2015), have been developed. According to Kim et al. (2015), there are four major types of box office prediction models: statistical models such as linear regression models (Litman and Kohl 1989; Eliashberg and Shugan 1997), probabilistic models (Neelamegham and Chintagunta 1999; Sawhney and Eliashberg 1996; Eliashberg et al. 2000), time series models, and machine learning models such as deep neural network models (Sharda and Delen 2006; Delen and Sharda 2010; Ghiassi et al. 2015). In the following, we review existing studies that are closely related to our work.

Statistical models based on linear regression have been popular because they can explain the influence of each variable on box office prediction (Kim et al. 2015). Mestyán et al. (2013) defined different prediction features and developed a linear regression model to predict the first-weekend box office revenue of 312 American films. They assessed the popularity of movies based on the effects of external events on the activity of Wikipedia editors and the number of page views. In Litman and Kohl (1989), linear regression was used as the basic model; the adjusted production cost, major distributors, awards, and seven other film characteristics were then taken as input variables for regression analysis. Eliashberg and Shugan (1997) also used a linear regression prediction model to analyze the success of movies, but focused on the influence of screens and of positive, negative, and mixed reviews on the box office.

Probabilistic models can be considered an alternative to linear regression models (Kim et al. 2015). Neelamegham and Chintagunta (1999) proposed a box office prediction system based on a Bayesian model, which can predict the box office of new films at different stages. Their research found that the number of screens has the greatest impact on the number of viewers. Sawhney and Eliashberg (1996) developed a parsimonious box office prediction system named BOXMOD-I based on queuing theory. Eliashberg et al. (2000) proposed an improved box office prediction model named MOVIEMOD based on a Markov chain model. This model can explain the effects of different information sources on consumers' purchase intentions.

Time series models predict future box office performance according to past patterns. These models are often used to predict the box office after a film is released, based on its box office performance during the previous few weeks. A significant limitation of these models is that they rely only on the historical performance of the film and ignore many valuable exogenous factors (Kim et al. 2015).

Dogru and Keskin (2020) define artificial intelligence as an interdisciplinary subject of computer science, statistics, operations management, mathematics, humanities, social science, and philosophy. Its goal is to develop non-biological systems that perform tasks usually requiring human intelligence. Artificial intelligence and big data technology have shaped various aspects of business and management (Grover et al. 2020; Kyriakou et al. 2019). For example, Akter et al. (2020) introduced Netflix's recommendation system, which first mines user data and then adopts different models to determine the most suitable recommendations. However, research on applying artificial intelligence to a film's box office prediction, especially at its early stage, is still limited. Hur et al. (2016) proposed a box office prediction model based on film reviews and an independent subspace method, which uses audience review text and nonlinear machine learning to improve prediction accuracy. Sharda and Delen (2006) set up an effective box office prediction model using artificial neural networks, logistic regression, discriminant analysis, and the classification and regression tree algorithm. Ghiassi et al. (2015) added the MPAA rating, competition, star value, sequels, and the number of screens to the prediction variables and proposed a pre-release box office prediction model based on a dynamic artificial neural network algorithm. Overall, for box office prediction, models based on artificial intelligence and machine learning have gradually replaced traditional ones. In this paper, we contribute to this research stream by building an improved box office prediction model for the pre-production stage. Our model is based on the stacking framework with data related to the nature of the film itself.

In this section, we describe different aspects of the data, including data acquisition, cleaning, feature extraction, and test data preparation. Before describing the data acquisition stage, we first briefly explain the timeline of film production, since the timing is critical for explanatory variable selection, which in turn determines data acquisition. According to the schedule of film production, box office prediction can be divided into pre-production prediction, pre-release prediction, and post-release prediction, as illustrated in Fig. 2.

Fig. 2 Film cycle and prediction type
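The stacking fusion idea behind the proposed model can be illustrated independently of the specific XGBoost/RF/LightGBM/KNN implementations: each base learner produces out-of-fold predictions on the training set, and those predictions become the input features of a second-level (meta) model. Below is a minimal NumPy sketch with two toy base learners, a global-mean predictor and a 1-nearest-neighbour predictor, and a least-squares meta-learner. All learners, data, and names here are illustrative stand-ins, not the paper's implementation:

```python
import numpy as np

def mean_predict(X_tr, y_tr, X_te):
    """Constant global-mean predictor (trivial stand-in base learner)."""
    return np.full(len(X_te), y_tr.mean())

def knn1_predict(X_tr, y_tr, X_te):
    """1-nearest-neighbour regressor on a single feature (KNN stand-in)."""
    d = np.abs(X_te[:, None, 0] - X_tr[None, :, 0])  # pairwise distances
    return y_tr[np.argmin(d, axis=1)]

def stack_fit_predict(X_tr, y_tr, X_te, base_learners, k=2):
    """Stacking: out-of-fold base predictions feed a least-squares meta-model."""
    n = len(X_tr)
    meta_tr = np.zeros((n, len(base_learners)))
    folds = np.array_split(np.arange(n), k)
    for j, learn in enumerate(base_learners):
        for fold in folds:  # each fold is predicted by a model trained without it
            mask = np.ones(n, dtype=bool)
            mask[fold] = False
            meta_tr[fold, j] = learn(X_tr[mask], y_tr[mask], X_tr[fold])
    # meta-learner: linear least squares on the stacked base predictions
    w, *_ = np.linalg.lstsq(meta_tr, y_tr, rcond=None)
    # base learners are refit on the full training set for test-time features
    meta_te = np.column_stack([learn(X_tr, y_tr, X_te) for learn in base_learners])
    return meta_te @ w

# Hypothetical film features and box office values (toy numbers)
X_tr = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
y_tr = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0])
X_te = np.array([[2.5], [5.5]])
pred = stack_fit_predict(X_tr, y_tr, X_te, [mean_predict, knn1_predict])
```

The key design point, as in the paper's stacking framework, is that the meta-model is trained only on out-of-fold base predictions, so it learns how to weight the base learners without seeing their in-sample (overfit) outputs.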
the three prediction stages compare as follows (table 1). pre-production prediction uses fewer features, so it has the lowest prediction accuracy, but the earliest prediction period, and therefore the highest practical application value. pre-release prediction adds social media and search platform data to the characteristics of the film itself; its accuracy is higher than that of pre-production prediction, and it can guide operational decision-making for cinemas, but it has little value for early investment decision-making. post-release prediction additionally draws on a large amount of theatre data, heat index, and audience comment information; it contains the most information and has the best predictive effectiveness, but the application value of its results is very low. in general, box office prediction based on post-show data has the most information to draw on and the best prediction accuracy. unfortunately, for producers or investors who have already spent money, such a late forecast is of little value (ghiassi et al. 2015). there is no doubt that prediction before production is very difficult, but it has the greatest practical value and significance in guiding decision-making in the film industry chain. from table 1, we can see that pre-production prediction is generally based on the nature of the film itself, including the release date, type, content, star value, sequel, film length, and other factors. since we aim at pre-production prediction, we narrow our scope to those factors in the data selection phase. furthermore, because domestic films have a strong performance in china's film market (their box office revenues reached 41.175 billion yuan, accounting for 64.07% of the market, and eight of the top 10 films at the box office in 2019 were domestic; see table 2), it is appropriate to focus on domestic films as a first step.
the original data were obtained from the movie box office database, douban movie, sina microblog, and other websites. the original data set contains basic information on films, directors, and actors in the chinese film market, and the data sources are exhibited in table 3. after data cleaning and feature extraction, the experimental data set was formed. it includes 1182 domestic films from 2010-2019 (domestic films are defined as those originating in china, for which the director and most of the main actors are from china). effective features are the core of box office prediction. considering the availability of data and the predictive power of features, five pre-production factors are selected based on the film itself: genre, star value, release date, release area, and sequels. because a film's story has depth and complexity, the classification of film types is diverse; moreover, people's tastes in film types change with the market. this study therefore measures the dynamic influence of a film type by the average box office performance of films of that type in the past year. the genre classification by main story type is shown in table 4. the acting staff (mainly actors and directors) can significantly affect the film's quality and the audience's expectations for the film. high-quality actors or directors and high-quality script content are mutually dependent, and together form a basis for the film's market value. delen and sharda (2010) measured the star value of actors in three categories: high, medium, and low. without loss of generality, we consider the main director and the top three main actors and use the total box office, average box office, highest box office, lowest box office, and the number of acting or directing credits of the actor or director within the past 10 years to measure the dynamic influence of the actor or director in that year.
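the dynamic genre influence described above (the average box office of same-type films in the prior year) can be sketched as follows; the record layout and function name are hypothetical, not from the original data set:

```python
# hypothetical sketch: the dynamic influence of a genre in a given year is the
# average box office of that genre's films released in the previous year.
# `films` is an assumed list of (year, genre, box_office_yuan) records.
def genre_dynamic_influence(films, genre, year):
    revenues = [r for (y, g, r) in films if g == genre and y == year - 1]
    return sum(revenues) / len(revenues) if revenues else 0.0

films = [
    (2018, "comedy", 3.0e8), (2018, "comedy", 1.0e8), (2018, "action", 9.0e8),
]
# influence of "comedy" for films released in 2019: mean of 2018 comedy revenues
print(genre_dynamic_influence(films, "comedy", 2019))  # 200000000.0
```

the same pattern extends to the actor/director dynamic features by aggregating each person's credits over the past 10 years instead of one.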
the static influence of an actor or director is measured by the number of fans of the actor's microblog and the number of major film awards in the greater china area (such as the golden rooster awards, golden horse awards, and hong kong film awards) won by the actor or director. concerning the release date, the release schedule plays an especially crucial role in the film's success (einav 2007), as the release of multiple films at the same time will have a negative impact on each film's revenue due to the smaller market share (sharda and delen 2006; delen and sharda 2010). according to the film's release month, delen and sharda (2010) classified the parameter 'competition' into three categories: high, medium, and low. in terms of the income of chinese films at different times of the year, we divide the release time into spring festival, national day, summer holidays, and normal days depending on the distribution schedule, with codes of 4, 3, 2 and 1, respectively. the area in which a film is released represents its market. the statistics are all for chinese films and are categorized into 'mainland china', 'hong kong' and 'taiwan'. these markets have obvious differences: 'mainland china' has a large area as well as a large population, so it has the largest number of moviegoers. a one-hot coding method is used to describe the diversity of film release areas. for example, chinese films released only on the mainland are encoded (1, 0, 0), whilst chinese films released both on the mainland and in hong kong are encoded (1, 1, 0). sequels are also an important feature for early film prediction (delen and sharda 2010). due to the brand effect of the parent film, the audience will have higher and more specific expectations for a sequel, so sequels perform better than general films (dhar et al. 2012; moon et al. 2010). high-quality sequels have many fans and topics while providing potentially high-quality content and excellent production teams.
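the release-schedule coding and one-hot release-area encoding just described can be sketched as below; the dictionary keys and function name are illustrative, not from the original:

```python
# hypothetical encoding sketch: release-schedule category (spring festival=4,
# national day=3, summer holidays=2, normal days=1) and one-hot release areas
# in the order (mainland china, hong kong, taiwan), as described in the text.
SCHEDULE_CODE = {"spring_festival": 4, "national_day": 3, "summer": 2, "normal": 1}
AREAS = ["mainland", "hong_kong", "taiwan"]

def encode_release(schedule, areas):
    one_hot = [1 if a in areas else 0 for a in AREAS]
    return [SCHEDULE_CODE[schedule]] + one_hot

# a film released during the spring festival on the mainland and in hong kong:
print(encode_release("spring_festival", {"mainland", "hong_kong"}))  # [4, 1, 1, 0]
```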
following ghiassi et al. (2015), we use a binary variable to specify whether a film is a sequel (assigned a value of 1) or not (assigned a value of 0). in addition, we use the income of the film's parent film to measure its sequel index. for example, 'war wolf' is the parent film of 'war wolf ii', with a revenue of 525 million yuan, so the sequel index of 'war wolf ii' is 52,500. the sequel index of non-sequels is 0. the proposed model was trained on seven years of data (2010 to 2016) and used to predict 2017-2019 films. the whole feature set is shown in table 5. it is a historical data set based on the film's release year, using features of the film itself without relying on related social media data. the data include two parts: dynamic data and static data. dynamic data refers to data that changes each year, such as the dynamic influence of actors, directors, and film types, which is used to measure their influence in the film's release year. no previous box office prediction study has used dynamic data. after data cleaning and feature processing of the original data, the experimental feature set includes 37 feature vectors (feature numbers 1 to 37), covering five factors: genre, star value, release date, release area, and sequel, as shown in table 5. the last column of the data set is the output label (y0). the category label divides the films into seven categories a-g according to the box office revenue range, representing seven levels from flop to blockbuster. the box office revenue range corresponding to each category is shown in table 6. the 2010-2016 film data set was used to build the model, and the 2017-2019 film data set was used as the test set to verify the effectiveness of the model. the model is implemented in python 3.7.
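the sequel features can be sketched as follows; the 'war wolf ii' example (revenue of 525 million yuan, index 52,500) implies the sequel index is the parent revenue in units of 10,000 yuan, which we assume here:

```python
# sketch of the sequel features described above: a binary sequel flag and a
# sequel index derived from the parent film's revenue. the 10,000-yuan scaling
# is inferred from the 'war wolf ii' example, not stated explicitly.
def sequel_features(parent_revenue_yuan):
    if parent_revenue_yuan is None:          # not a sequel
        return 0, 0.0
    return 1, parent_revenue_yuan / 10_000   # sequel index

print(sequel_features(525_000_000))  # (1, 52500.0)
print(sequel_features(None))         # (0, 0.0)
```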
in this part, after introducing the stacking framework, we describe the sample division and training method used to avoid repeated learning in the stacking model. the accuracy and feature contribution of various models in box office prediction are then analyzed, and the base models used in this paper are selected based on these two factors. finally, the whole box office prediction process based on the stacking model is described. in the stacking mechanism, each model outputs a class prediction, and these outputs are then used to predict the final class (doumpos and zopounidis 2007). the method can be described as follows: take the outputs of the base models as new features, input them into another model, and stack the models in this way. that is to say, the output of the first-level models is taken as the input features of the second-level model, and so on, and the output of the last-level model is taken as the result. for a data set t = {(x_n, y_n), n = 1, 2, ..., n}, x_n is the feature vector of the n-th sample, y_n is the true class of the n-th sample, and p is the number of features; that is, each feature vector x contains the components (x_1, x_2, ..., x_p). the primary learners are m_1, m_2, ..., m_m, the secondary learner is g, and the test data is t_2 = {(x, y)}. formula (1) gives the prediction of the two-layer stacking model:

ŷ = g(m_1(x), m_2(x), ..., m_m(x)) (1)

the principle of the stacking integration method is to synthesize the learning results of the primary learners via the secondary learner, adjust the result bias caused by model overfitting in the primary learners, and correct the model prediction error. with this method, we can obtain better prediction performance than with a single algorithm. it should be noted that the training set input to the secondary learner is generated from the prediction results output by the multiple primary learners.
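formula (1) can be illustrated with a minimal sketch; the toy learners below are stand-ins for the trained primary and secondary models, not the actual algorithms used in the paper:

```python
# minimal sketch of the two-layer stacking prediction in formula (1): the
# primary learners' outputs become the feature vector fed to the secondary
# learner g. the learners here are toy stand-in functions.
def stacking_predict(primary_learners, g, x):
    meta_features = [m(x) for m in primary_learners]  # first-level outputs
    return g(meta_features)                           # second-level prediction

# toy stand-ins: three "learners" that each vote a class label, and a
# secondary learner that takes the majority vote.
m1 = lambda x: 0 if sum(x) < 3 else 1
m2 = lambda x: 1
m3 = lambda x: 0 if x[0] < 1 else 1
majority = lambda votes: max(set(votes), key=votes.count)

print(stacking_predict([m1, m2, m3], majority, [2, 2]))  # 1
```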
if the secondary learner is trained on the data set directly, the data may be learned repeatedly, resulting in overfitting. it is therefore necessary to divide the data reasonably. taking a fivefold sample data set as an example, the original training data set is divided into five non-overlapping sub-data sets. for each primary learner, four sub-data sets are used to train the model, and the remaining one is used as the validation data set to verify the effect of the learner and to tune parameters. all five sub-data sets can then obtain prediction results through the learned models. training each primary learner in turn, the prediction results of the multiple primary learners are combined into a new data set and used as the input of the secondary learner. in this way, the data is converted from output to input. for each primary learner, there is a different sub-data set that does not participate in training that learner. this arrangement ensures that the data sets involved in model training differ, preventing repeated training on the same data. the model training method with a fivefold data set is shown in fig. 3. the prediction ability of each single primary learner is the basis of the integrated model's prediction ability. at the same time, there should be a certain degree of difference between primary learners so that the models complement one another. in this part, we comprehensively analyze the prediction performance and feature contribution of various machine learning algorithms and then choose the base learners of our stacking model on that basis. box office prediction is a multi-class classification problem with no obvious classification boundary. though the support vector machine (svm) algorithm performs well on small-sample and nonlinear problems, it has difficulty dealing effectively with multi-class problems.
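the fivefold out-of-fold procedure above can be sketched as follows; the toy majority-class learner is a stand-in for the actual primary learners, and the function names are ours:

```python
# sketch of the fivefold out-of-fold scheme described above: each primary
# learner is trained on four sub-data sets and predicts on the held-out one,
# so every sample's meta-feature comes from a model that never saw it.
def kfold_indices(n, k=5):
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def out_of_fold(train_fn, X, y, k=5):
    oof = [None] * len(X)
    for held_out in kfold_indices(len(X), k):
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = train_fn([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in held_out:
            oof[i] = model(X[i])       # prediction from a model that never saw i
    return oof

# toy learner: predict the majority class of its training labels
train_majority = lambda Xs, ys: (lambda x: max(set(ys), key=ys.count))
X = list(range(10)); y = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]
print(out_of_fold(train_majority, X, y))  # 0 is the majority in every split
```

in the full pipeline, each primary learner produces one such out-of-fold column, and the columns together form the secondary learner's training set.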
classification and regression trees (cart), logistic regression, and plain decision tree algorithms perform poorly because of the low complexity of the models. however, ensemble models based on decision tree learning, such as xgboost, lightgbm, and rf, perform well on multi-class classification and regression problems due to their strong branching ability. the xgboost and lightgbm algorithms are boosting ensembles based on gbdt; the former uses a level-wise tree-growth strategy, and the latter a leaf-wise growth strategy. for the same number of splits, the latter can reduce error further, but it is more likely to overfit. in contrast to those two, the rf algorithm is a decision tree ensemble based on bagging. to evaluate the effectiveness of each model on the experimental data in this paper, a sub-data set is used to train the models, and the feature contributions under the xgboost, lightgbm, and rf algorithms are obtained, as shown in fig. 4. among all classes of feature factors, star influence (features no. 2-30) involves the most features and has the strongest predictive power, because stars not only affect the film's box office through public praise and publicity, but also correlate strongly with latent factors such as the film's budget and script quality. three other factors, namely release time, release area, and film type, have similar predictive power. the sequel factors are limited by the number of sequel-film samples, so their predictive power is weak. the xgboost model can make full use of the internal characteristics of the input data; thus it has good applicability to both continuous and discrete features and ensures that the contributions of the various features in the prediction are relatively uniform.
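as a model-agnostic companion to the tree-based importances in fig. 4, permutation importance (our illustrative choice, not necessarily the authors' procedure) measures how much accuracy drops when one feature column is shuffled:

```python
import random

# assumed, model-agnostic sketch of feature-contribution analysis:
# permutation importance is the accuracy drop after shuffling one feature.
# (the tree libraries report built-in importances; this is a stand-in.)
def accuracy(model, X, y):
    return sum(model(x) == t for x, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, feature, seed=0):
    base = accuracy(model, X, y)
    rng = random.Random(seed)
    col = [x[feature] for x in X]
    rng.shuffle(col)
    X_perm = [x[:feature] + [v] + x[feature + 1:] for x, v in zip(X, col)]
    return base - accuracy(model, X_perm, y)

# toy model that only looks at feature 0, so feature 1 is irrelevant
model = lambda x: 1 if x[0] > 0 else 0
X = [[0, 5], [1, 5], [0, 5], [1, 5]]
y = [0, 1, 0, 1]
print(permutation_importance(model, X, y, feature=1))  # 0.0 (irrelevant feature)
```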
the lightgbm algorithm grows trees leaf-wise, which causes the main features to be fitted more closely and to contribute more strongly. in the lightgbm prediction, the sum of the average box office of directors and actors (feature no. 22) and the sum of the three actors' microblog fans (feature no. 30) make significantly higher contributions than other features, while some features' contributions are lost. the rf method is based on bagging, which is not sensitive to discrete sparse features; it focuses mainly on a small number of influential features. it performs similarly to the lightgbm algorithm in terms of feature contribution but uses a different ensemble method. in the prediction, the rf model is better than the other two models at extracting the sequel features (feature numbers 36 and 37). in sum, the differences between these three primary learners can help the models complement each other's strengths and obtain a better ensemble effect. to obtain stable output results, the secondary learner should be a model with strong generalization ability, so as to reduce and correct the bias of the multiple primary learning algorithms toward the training set and to balance the overfitting problem in the decision trees. to summarize, the primary learners used in this paper are xgboost, rf, and lightgbm, while the secondary learner is knn. the box office prediction method based on model fusion in the stacking framework is shown in fig. 5. our box office prediction process based on the stacking model is as follows: 1. processing data • after data cleaning and feature extraction, feature data set t is generated from original data d. • feature data set t is divided into a training set t1 and a testing set t2. • based on the number of samples, the training set is divided into five sub-datasets. according to the training method mentioned above, the primary models are trained, and appropriate parameters are selected.
• the prediction classes of the training set from the multiple primary models are combined into a new data set. • at this stage, the new data set is used as the input features to train the secondary learner. at this point, the training of the stacking model based on multi-model fusion is complete. • the test data set is input into the trained primary learners. • the generated results are then input into the secondary learner to obtain the model's predicted class. • the performance of the model is evaluated with respect to prediction accuracy. to measure the effectiveness of the model, the average percentage hit rate (aphr) is used as the evaluation index, which represents the percentage of correctly classified samples relative to the total number of samples (ahmed et al. 2019). in this paper, two types of aphr, bingo and 1-away, are used to judge the accuracy across categories. (1) absolute accuracy (bingo): the exact hit rate; only classification into the correct class is counted. (2) relative accuracy (1-away): in addition to the absolute hits (bingo), classification results in which the predicted class is adjacent to the real class are also counted. the aphr can be calculated as follows:

aphr = (1/n) Σ_{i=1}^{k} c_i (2)

where n represents the total number of samples, k represents the total number of categories, and c_i represents the number of samples correctly classified as category i. to evaluate the model's performance, the prediction results (bingo and 1-away) are measured by the aphr index. we compare the prediction results of the three primary learners, the stacking model, and the knn model, as shown in table 7. xgboost performs best among the single prediction models, with a bingo accuracy of 65.13%, followed by rf with 64.84%, and then the lightgbm model with 63.98%; these are all much higher than knn's accuracy of 41.21%.
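the bingo and 1-away hit rates defined above can be sketched as follows, with classes a-g encoded as integers 0-6 (the function name is ours):

```python
# sketch of the aphr evaluation described above: bingo counts exact class
# hits; 1-away also counts predictions adjacent to the true class.
def aphr(y_true, y_pred, one_away=False):
    tol = 1 if one_away else 0
    hits = sum(abs(p - t) <= tol for p, t in zip(y_pred, y_true))
    return hits / len(y_true)

# classes a-g encoded as 0-6
y_true = [0, 1, 2, 3, 4]
y_pred = [0, 2, 2, 6, 4]
print(aphr(y_true, y_pred))                 # 0.6 (3 exact hits out of 5)
print(aphr(y_true, y_pred, one_away=True))  # 0.8 (the off-by-one also counts)
```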
however, the stacking model performs better than all of these single-prediction models, achieving 69.16% bingo accuracy and 86.46% 1-away accuracy (table 8). the prediction accuracy for low-income (class a) samples is slightly higher than for other types of samples, presumably because these films' box-office income is kept low by their lack of reputation and influence, and, owing to the 'long tail effect' of the film market, such samples account for the highest proportion of the market. high-level films, by contrast, can obtain correspondingly high returns only subject to many factors, such as market environment, audience preference, and even weather, and thus carry greater uncertainty. in general, the box office prediction model is more sensitive to risk than to revenue; when applied to budget allocation, it should focus on avoiding excessive investment in films with low predicted box-office incomes. from 2010 to 2016, the chinese film market grew rapidly; during this period, the market continued to expand, and the influence of stars also showed an upward trend. since 2017, the growth rate has slowed, and the influence of the market and stars has stabilized. in this paper, the 2017-2019 films are used as the prediction set. at this stage, the model may be slightly affected by the shift, but as the market stabilizes, the impact of data fluctuations on the model will become less significant, the prediction effect will gradually stabilize, and prediction performance should improve in the future. to further evaluate the effectiveness of this feature model for box office prediction, its results are compared with those of previous studies in china and abroad. one of the most representative studies in the field of box office prediction is the model advanced by delen and sharda (2010).
their model selected seven pre-release factors, such as competitiveness and star influence, and used five machine learning methods plus model fusion to predict the box office for 214 hollywood films before their release. the prediction accuracy ranged from 40.46 to 56.07%, with the mixed model having the best predictive effectiveness. quader et al. (2018) collected 755 film and television works from around the world, extracted 15-dimensional features from before and after release, and classified the box office data into five categories by size. seven classic machine learning methods were used for prediction, with an accuracy range of 43.29% to 58.5%, among which the multilayer perceptron (mlp) model had the best predictive effectiveness. compared with the box office prediction models proposed by delen and sharda (2010) and quader et al. (2018), the prediction model proposed in this paper adopts multidimensional data sets and new feature processing methods. its prediction time point is earlier than in those models, and it can predict box office performance before the film production period. the comparison of model prediction results is shown in table 9. the results show that the features and models proposed in this paper are effective for box office prediction: the accuracy of new-movie prediction (bingo) is improved by at least 13.09% and 10.66%, respectively. in the literature, existing box office prediction models usually depend on data from after the film's production period, which has a rather limited effect for stakeholders, because at that stage it can only affect the later refinement of advertising or distribution strategies. in this paper, we propose a novel model to predict the box office revenue of a film during the early stage of its production. this model uses data about the nature of the film itself, not word-of-mouth or post-film data from social platforms.
by incorporating the xgboost, rf, lightgbm, and knn algorithms, we establish a stacking model for film box office prediction. our results show that the model performs well, with 69.16% bingo accuracy and 86.46% 1-away accuracy. to better understand the impacts of different variables, we further apply the trained prediction model to films in preparation. in this experiment, the decision makers of entertainment companies can identify with high accuracy how much a specific actor, a specific genre, a specific release date, or sequel status can contribute to the success of a movie. through the analysis of the factors contributing to film performance, management can better plan film production and distribution. for example, star influence is crucial to films, but famous stars obviously require higher pay. it is unwise to blindly choose the most famous movie stars, as doing so may increase market demand for top stars, further driving up their pay. de vany and walls (1999) and walls (2005) confirmed the 'curse of the superstar': if a star captures the expected revenue growth associated with his or her performance, the movie almost always loses money. fortunately, the match between stars and film production companies is a two-way choice. film companies can screen out more cost-effective actors through the box office prediction system, or negotiate with them about remuneration and control the budget to maximize profits. when building the prediction model, we considered the availability and nature of the data for five factors, including release area and star influence. however, due to data unavailability, several influencing factors (e.g., budget) are omitted. movie box office prediction methods rely on reliable data, some of which are confidential. hence, first, future research can endeavor to cooperate with film studios and obtain additional data that are not available to outsiders.
second, the current research emphasizes a box office prediction model for the early stage of film production, incorporating the characteristics of the film only. however, a movie project may be adapted from intellectual property with a huge fan base. therefore, future research could also utilize social media data from before film production to build an even better box office prediction model.

references:
pre-production box-office success quotient forecasting. soft computing.
transforming business using digital innovations: the application of ai, blockchain, cloud and data analytics.
uncertainty in the movie industry: does star power reduce the terror of the box office?
predicting the financial success of hollywood movies using an information fusion approach.
the long-term box office performance of sequel movies.
ai in operations management: applications, challenges and opportunities.
model combination for credit risk assessment: a stacked generalization approach.
seasonality in the u.s. motion picture industry.
moviemod: an implementable decision-support system for prerelease market evaluation of motion pictures.
film critics: influencers or predictors?
pre-production forecasting of movie revenues with a dynamic artificial neural network.
understanding artificial intelligence adoption in operations management: insights from the review of academic literature and social media discussions.
box-office forecasting based on sentiments of movie reviews and independent subspace method.
viable supply chain model: integrating agility, resilience and sustainability perspectives--lessons from and thinking beyond the covid-19 pandemic.
box office forecasting using machine learning algorithms based on sns data.
forecasting benchmarks of long-term stock returns via machine learning.
the chinese film industry: emerging debates.
predicting financial success of motion pictures: the '80s experience.
early prediction of movie box office success based on wikipedia activity big data.
dynamic effects among movie ratings, movie revenues, and viewer satisfaction.
a bayesian model to forecast new product performance in domestic and international markets.
performance evaluation of seven machine learning classification techniques for movie box office success prediction.
impacts of epidemic outbreaks on supply chains: mapping a research agenda amid the covid-19 pandemic through a structured literature review.
a parsimonious model for forecasting gross box-office revenues of motion pictures.
predicting box-office success of motion pictures with neural networks.
modeling movie success when 'nobody knows anything': conditional stable-distribution analysis of film returns.

the authors are very thankful to the editor and the referees, whose detailed reviews and suggestions helped improve this article.
the research is supported by the national natural science foundation of china (no. 71871186 and no. 71871184) and the fundamental research funds for the central universities (jbk18jyt02, jbk1902009, and jbk190504). key: cord-309096-vwbpkpxd authors: magdon-ismail, malik title: machine learning the phenomenology of covid-19 from early infection dynamics date: 2020-03-20 journal: nan doi: 10.1101/2020.03.17.20037309 sha: doc_id: 309096 cord_uid: vwbpkpxd we present a robust data-driven machine learning analysis of the covid-19 pandemic from its early infection dynamics, specifically infection counts over time. the goal is to extract actionable public health insights. these insights include the infectious force, the rate of a mild infection becoming serious, estimates for asymptomatic infections, and predictions of new infections over time. we focus on usa data starting from the first confirmed infection on january 20 2020. our methods reveal significant asymptomatic (hidden) infection, a lag of about 10 days, and we quantitatively confirm that the infectious force is strong, with about a 0.14% transition from mild to serious infection. our methods are efficient, robust and general, being agnostic to the specific virus and applicable to different populations or cohorts. as of march 1 2020, there was still much public debate on properties of the covid-19 pandemic (see the cnn article, cohen (2020)). for example, is asymptomatic spread of covid-19 a major driver of the pandemic? there was no clear unimodal view, highlighting the need for robust tools to generate actionable quantitative intelligence on the nature of a pandemic from early and minimal data. one approach is scenario analysis. recently, chinazzi et al. (2020) used the global epidemic and mobility model (gleam) to perform infection scenario analyses on china covid-19 data, using a networked meta-population model based on transportation hubs.
a similar model for the us was reported in wilson (2020), where the web-app predicted from 150,000 to 1.4 million infected cases by april 30, depending on the intervention level. such scenario analysis requires user input such as infection sites and contagion properties. however, a majority of infection sites may be hidden, especially if asymptomatic transmission is significant. further, the contagion parameters are unknown and must be deduced, perhaps using domain expertise. data-driven approaches are powerful. a long-range regression analysis of covid-19 out to 2025 on us data using immunological, epidemiological and seasonal effects is given in kissler et al. (2020), which predicts recurrent outbreaks. we also follow a data-driven machine learning approach to understand early dynamics of covid-19 on the first 54 days of us confirmed infection reports (downloadable from the european center for disease control). we address the challenge of realtime data-intelligence from early data. our approach is simple, requires minimal data or human input and generates actionable insights. for example, is asymptomatic spread significant? our data-driven analysis says yes, emphatically. we even give quantitative estimates for the number of asymptomatic infections. (figure 1 caption: observed infections fall away from the predictions, indicating that social distancing is working, in agreement with our lag of 10 days (the two "kinks" in the curve). the figure emphasizes early data for learning about the pandemic, as later data is "contaminated" by public health protocols whose effects are hard to quantify. note: dates are the time-stamps on the ecdc report (ecdc, 2020), which captures the previous day's activity.) early data is both a curse and a blessing. the curse is that "early" implies not much information, so quantitative models must be simple and robust to be identifiable from the data.
the blessing is that early data is a peek at the pandemic as it roams free, unchecked by public health protocols, for to learn the true intentions of the lion, you must follow the beast on the savanna, not in the zoo. as we will see, much can be learned from early data, and these insights, early in the game, can be crucial to public health governance. we analyzed daily confirmed covid-19 cases from january 21 to march 14, the training or model calibration phase (the gray region in figure 1). a more detailed plot of the model fit to the training data is in figure 2. qualitatively we see that the model captures the data, and it does so by setting four parameters: β, the asymptomatic infectious force governing exponential spread; γ, the virulence, i.e. the fraction of mild cases that later become serious; k, the lag time for a mild infection to become serious (an incubation time); and m0, the unconfirmed mild asymptomatic infections at time 0. in figure 1, the blue envelope shows the model predictions and the red circles the observed infection counts. how do we know the model predictions are honest, in that the red circles in no way influenced the predictions? we are in a unique position to test the model because it is time-stamped as version 1 of the preprint magdon-ismail (2020). the model has provably not changed since 03/14, and we just added test data as it arrived. the predictions in figure 1 are in no way forward looking, data snooped, or overfitted. we observe that the model and observed counts agree, modulo two "kinks" around 03/24 and 03/30, when the observed infections start falling away from the model. to understand the cause, the lag is important. aggressive social distancing was implemented around 03/13 and lockdowns around 03/21. a lag of k = 10 means the effects of these protocols become apparent around 03/23 and 03/31 respectively. the methods are general and can be applied to different cohorts. in section 3.2 we do a cross-sectional country-based study.
our contributions are:

• a methodology for quickly and robustly machine learning simple epidemiological models given coarse aggregated infection counts.
• a simple model with lag for learning from early pandemic dynamics.
• application of the methodology within the context of covid-19 to usa data. our methods reveal significant asymptomatic (hidden) infection, a lag of about 10 days, and we quantitatively confirm that the infectious force is strong, with about a 0.14% transition from mild to serious infection.
• a cross-sectional analysis of the pandemic dynamics across several countries.
• to our knowledge, the only tested predictions for covid-19, due to our time-stamping of the predictions.

our results demonstrate the effectiveness of simple robust models for predicting pandemic dynamics from early data. our model is simple and robust. the majority of disease models aggregate individuals according to disease status, such as si, sir, sis; kermack and mckendrick (1927); bailey (1957); anderson and may (1992). we use a similar model by considering a mild infected state which can transition to a serious state. early data allows us to make simplifying assumptions. in the early phase, when public health protocols have not kicked in, a confirmed infection is self-reported. that is, you decide to get tested. why? because the condition became serious. this is important. (this preprint is made available under a cc-by-nc-nd 4.0 international license; the author/funder, who has granted medrxiv a license to display the preprint in perpetuity, is the copyright holder for this preprint, which was not peer-reviewed.) a confirmed case is a transition from mild infection to serious. this is not true later when, for example, public health protocols may institute randomized testing. at time t let there be c(t) confirmed cases and correspondingly m(t) mild unreported asymptomatic infections.
the new confirmed cases at time t correspond to mild infections at some earlier time t − k which have now transitioned to serious and hence got self-reported. let a fraction γ of those mild cases transition to serious. another advantage of early dynamics is that we may approximate the growth from each infection as independent and exponential, according to the infectious force of the disease. so, here, the second term is the loss of mild cases that transitioned to serious, the third term is the remaining cases that don't transition to serious, recovering at some later time r, and the fourth term accounts for new infections from confirmed cases. we will further simplify and assume that confirmed cases are fully quarantined, so α = 0, and that recovery to a non-infectious state occurs late enough not to factor into early dynamics. our simplified model is:

m(t) = β·m(t − 1) − γ·m(t − k)
s(t) = s(t − 1) + γ·m(t − k)

we set t = 1 at the first confirmed infection (jan 21 in the usa). given k, m0, we get an approximate fit to the data by using a perturbation analysis to solve for γ, β that fit two points s(τ) and s(t), where r = t − k, s = τ − k, κ = (s(t) − s(1))/(s(τ) − s(1)) and ρ = κ − 1 (for details see the appendix). from this solution as a starting point, we can further optimize γ, β using a gradient descent which minimizes an error measure that captures how well the parameters β, γ, k, m0 reproduce the observed dynamics in figure 2; see for example abu-mostafa et al. (2012). we used a combination of root-mean-squared error and root-mean-squared percentage error between the observed dynamics and the model predictions. by optimizing over k, m0, we obtain an optimal fit to the training data (figure 2) using the model parameters given in equation (6). the asymptomatic infectious force, β, is very high, and corresponds to a doubling time of 2.6 days. the virulence at 0.14% seems comparable to a standard flu, though the virus may be affecting certain demographics much more severely than a flu.
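the simplified recursion above can be put directly into code. the sketch below is our own illustration (function and variable names are not from the paper); it simulates m(t) and s(t) forward from m0 mild infections and a single confirmed case at t = 1:

```python
def simulate_lag_model(beta, gamma, k, m0, T):
    """simulate the simplified lag model:
    m(t) = beta*m(t-1) - gamma*m(t-k)   (mild, unconfirmed infections)
    s(t) = s(t-1) + gamma*m(t-k)        (cumulative confirmed cases)
    index 0 is unused; t = 1 is the first confirmed infection."""
    m = [0.0] * (T + 1)
    s = [0.0] * (T + 1)
    m[1], s[1] = float(m0), 1.0
    for t in range(2, T + 1):
        # mild cases transition to serious only after the lag k
        transitioned = gamma * m[t - k] if t - k >= 1 else 0.0
        m[t] = beta * m[t - 1] - transitioned
        s[t] = s[t - 1] + transitioned
    return m, s
```

with the fitted values (β, γ, k, m0) = (1.3, 0.0014, 10, 4) this reproduces the early geometric growth m(t) = m0·β^(t−1) for t ≤ k, with confirmed cases starting to accumulate only after the lag.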
the incubation period of 10 days seems in line with physician observations. the data analysis predicts that when the first confirmed case appeared, there were 4 other infections in the usa. the parameters β*, γ* and m0* are new knowledge, gained with relative ease by calibrating a simple robust model to the early dynamics. but these optimal parameters are not the whole story, especially when it comes to prediction. the exhaustive search over k, m0, fixing β and γ to the optimal for that specific k, m0, produces several equivalent models. we show the quality of fit for various (k, m0) in figure 3(a). the deep-blue region contains essentially equivalent models within 0.5% of the optimal fit, our (user-defined) error tolerance. the deep-blue region shows the extent to which the model is ill-identified by the data. indeed, all these deep-blue models fit the data equally well, which results in a range of predictions. for robustness, we pick the white square in the middle of the deep-blue region, but note that it is only one of the models consistent with the data. in making predictions, we should consider all equivalent models to get a range for the predictions that are all equally consistent with the data. similarly, in figure 3(b), we fix k* and m0* to their optimal robust values and show the uncertainty with respect to β and γ (the deep-blue region). again, we pick the white square in the middle of the deep-blue region of equivalent models with respect to the data. hence we arrive at our optimal parameters in equation (6). (medrxiv preprint, https://doi.org/10.1101/2020.03.17.20037309)
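the equivalence-region idea is easy to operationalize: score every candidate model with the combined error measure and keep all candidates within the 0.5% tolerance of the best. a minimal sketch follows; the function names and the equal weighting of the two error terms are our assumptions, since the paper does not specify how the two errors are combined:

```python
import numpy as np

def rmse_rmspe(pred, obs):
    """combined root-mean-squared error and rms percentage error
    between model predictions and observed counts."""
    pred, obs = np.asarray(pred, float), np.asarray(obs, float)
    rmse = np.sqrt(np.mean((pred - obs) ** 2))
    rmspe = np.sqrt(np.mean(((pred - obs) / np.maximum(obs, 1.0)) ** 2))
    return rmse + rmspe

def equivalence_class(errors, tol=0.005):
    """indices of candidate models within tol (default 0.5%) of the best fit,
    i.e. the 'deep-blue region' of essentially equivalent models."""
    errors = np.asarray(errors, float)
    return np.flatnonzero(errors <= errors.min() * (1.0 + tol))
```

prediction ranges then come from running every model in the equivalence class forward, not just the single "robust" one in the middle of the region.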
by considering all models which are equally consistent with the data, we get the estimates together with the ranges in equation (1). we emphasize that these error ranges have nothing to do with noise in the data; they are simply due to the ill-posedness of the inverse problem of inferring the model from finite data. several models fit the data essentially equivalently. we do not include in our ranges the possible measurement errors in the data, although the two are related through the error tolerance used in defining "equivalent" models: more noise in the data would result in more models being treated as equivalent. as already mentioned, to get honest estimates, we must consider all models which are equally consistent with the data (the deep-blue regions in figure 3).

(figure 3 caption: blue is a better fit, red is worse. (a) the deep-blue region corresponds to comparable models, within 0.5% of the optimal fit. the white square is the model chosen, (k*, m0*) = (10, 4), which has optimal fit and is also "robust", being in the middle of the deep-blue region. (b) model fit for the chosen k* and m0*. again, the deep-blue is an equivalence class of optimal models. the "robust" model is the white square in the middle of the deep-blue region, (β*, γ*) = (1.30, 0.0014). the deep-blue regions represent uncertainty.)

the model in equation (1) gives the prediction of new infections in the table below. the cumulative predicted infections are the data plotted in figure 1.
the predictions use the model in (1). in the supplementary material we give details of our cross-sectional study across countries. the different countries have different cultures, social networks, mobility and demographics, as well as different times at which the first infection was reported (the "delay"). we calibrated independent models for each country and the resulting model parameters are in table 1. we primarily focused on the infectious force β, which has significant variability, and we studied how β statistically depends on a number of country-specific factors. in the supplementary material, we give details of the study and the quantitative results. qualitatively, we find:

• a larger delay in the virus reaching a country indicates a larger β. the more that has been witnessed, the faster the spread. that seems unusual, but is strongly indicated. we do not have a good explanation for this effect. it could be an artefact of testing procedures not being streamlined, so that early serious cases of the pandemic went undetected.
• population density at the infection site has a strong positive effect, but the country's overall population density does not.
• there is faster spread in countries with more people under the poverty level, defined as the percentage of people living on less than $5.50 per day.
• median age has a strong effect. spread is faster in younger populations. the youth are more mobile and perhaps also more carefree.
• wealth and per-capita income have a slight negative effect.
spread is slower in richer countries, perhaps due to risk-aversion, higher levels of education and less reliance on public transportation. whatever the cause, it does have an impact, but a relatively smaller one than the other effects. early dynamics allows us to learn useful properties of the pandemic. later dynamics may be contaminated by human intervention, which renders the data less useful without a more general model. we learned a lot from the early dynamics of covid-19: its infectious force, virulence, incubation period, unseen infections and predictions of new confirmed cases. all this, albeit within error tolerances, from a simple model and little data. asymptomatic infection is strong, around 30%, converting to serious at a rate of at most 1.2%. there is significant uncertainty in the lag, from 1 up to 13 days, and we estimate 5.3 million asymptomatic infections as of 03/14, the range being from 1 to 26 million. such information is useful for planning and health-system preparedness. are our parameters correct? we were in a unique position to test our predictions because our model was time-stamped as version 1 of the preprint magdon-ismail (2020). a side benefit of the model predictions is as a benchmark against which to evaluate public health interventions. if, moving forward, observed new infections are low compared to the predictions, it means the interventions are working, most likely by reducing β. starting on about march 25, the observed infections start falling off and we observe a flattening by march 28. the us instituted broad and aggressive social distancing protocols starting on or before march 13 and an even stronger lockdown around march 21, which is consistent with the data and the model's lag of k = 10. without such quantitative targets to compare with, it would be hard to evaluate intervention protocols. our approach is simple and works with coarse, aggregated data. but there are limitations.
• the independent evolution of infection sites only applies to early dynamics. hence, when the model infections increase beyond some point and the pandemic begins to saturate the population, a more sophisticated network model that captures the dependence between infection sites would be needed (balcan et al.).
• while we did present an optimal model, it should be taken with a grain of salt, because many models are nearly equivalent, resulting in prediction uncertainty.
• the model and the interpretation of its parameters will change once public health protocols kick in. the model may have to be re-calibrated (for example if β decreases) and the parameters may have to be reinterpreted (for example, γ is a virulence only in the self-reporting setting, not for random testing). it is also possible to build a more general model with an early-phase β1 and a later-phase β2 (after social distancing). but beware: a more general, sophisticated model looks good a priori until it comes time to calibrate it to data, at which point it becomes unidentifiable.
• the model was learned on usa data. the learned model parameters may not be appropriate for another society. the virulence could be globally invariant, but it could also depend on genetic and demographic factors like age, as well as on what "serious" means for the culture, that is, when do you get yourself checked. in a high-strung society, you expect a high virulence parameter, since the threshold for serious is lower. one certainly expects the infectious force to depend on the underlying social network and interaction patterns between people, which can vary drastically from one society to another and depending on interventions. hence, one should calibrate the model to country-specific data to gain more useful insights.
the lag, k, and public policy. the lag k is important for public policy because public policy can be driven by human psychology. the human tendency is to associate any improvement in outcome with recent actions. however, if there is a lag, one might prematurely reward those recent actions instead of the earlier actions whose effects are actually being seen. such lags are present in traditional machine learning, for example the delayed reward in reinforcement learning settings. credit assignment to prior actions in the face of delayed reward is a notoriously difficult task, and this remains so with humans in the loop. knowledge of the lag helps to assign credit appropriately to prior actions, and the public health setting is no exception.

• wikipedia, source: world bank (2020). list of countries by percentage of population living in poverty. www.wikipedia.com.
• wilson, c. (2020). exclusive: here's how fast the coronavirus could infect over 1 million americans. time. web-applet available for scenario analysis.
• world-bank (2017). adjusted net national income per capita (current us$). https://data.worldbank.org.
• worlddata-info (2015). average income around the world. https://data.worlddata.info.

recall the model. for fixed k, m0, we must perform a gradient descent to optimize β, γ. unfortunately, the dependence on β is exponential and hence very sensitive, so if the starting point is not chosen carefully, the optimization gets stuck in a very flat region and many millions of iterations are needed to converge. hence it is prudent to choose the starting conditions carefully. to do so, we need to analyze the recursion.
first, we observe that the recursion for m(t) is a standalone k-th order recurrence. for 1 ≤ t ≤ k, m(t) = m0·β^(t−1); hence we can guess a solution m(t) = m0·β^(k−1)·φ^(t−k) for t > k, which requires φ^k − βφ^(k−1) + γ = 0. we do a perturbation analysis in γ → 0. at γ = 0, φ = β, so we set φ = β + ε, which to first order in ε is solved by ε ≈ −γ/β^(k−1), and so φ ≈ β − γ/β^(k−1). given this approximation, we can solve for s(t): since φ = β + o(γ), for t > 2k we can approximate s(t) − s(1) as proportional to φ^(t−k) − 1. we can independently control the two parameters φ and γ, and we use this to match the observed s(t) at two time points. since the growth is exponential, we match the end time, s(t), and some time τ in the middle, for example τ = 3t/4. let ∆t = (s(t) − s(1))/m0 and ∆τ = (s(τ) − s(1))/m0. then dividing gives ∆t/∆τ = (φ^(t−k) − 1)/(φ^(τ−k) − 1) ≈ φ^(t−τ), because φ > 1. let us consider the equation κ = (φ^r − 1)/(φ^s − 1), which gives φ^r − κφ^s + κ − 1 = 0, or more generally φ^r − κφ^s + ρ = 0, where r > s > 1 and κ > ρ ≥ 1. this means φ > 1. when ρ = 0, we have φ^(r−s) = κ, so we do a perturbation analysis with φ^(r−s) = κ + ε, and our perturbation parameter is ε. then φ^r = (κ + ε)φ^s, and plugging into the equation gives εφ^s = −ρ, so to first order ε ≈ −ρκ^(−s/(r−s)). (10) for our setting, r = t − k, s = τ − k, κ = ∆t/∆τ and ρ = κ − 1. finally, since φ is approximate, we may not be able to satisfy both equations in (9), hence we can instead minimize the mean squared error, which gives (11). we now need β, which satisfies φ = β(1 − γ/β^k). again, we do a perturbation analysis, omitting the details, to obtain β ≈ φ + γ/φ^(k−1). (12) if one wishes, a fixed-point iteration starting at the above will quickly approach a solution of φ = β(1 − γ/β^k). we show the approximate fit on the us data (figure 4): the optimal fit, and the initial fit using the parameters constructed from (12) and (11).
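the initialization described above is a few lines of code. the sketch below is our own rendering of the appendix recipe (names are ours): it computes the zeroth-order root φ ≈ κ^(1/(r−s)) and then runs the fixed-point iteration β ← φ + γ/β^(k−1), whose fixed point satisfies φ = β(1 − γ/β^k):

```python
def phi_initial(kappa, r, s):
    """zeroth-order root of phi**r - kappa*phi**s + rho = 0:
    at rho = 0 the equation reduces to phi**(r - s) = kappa."""
    return kappa ** (1.0 / (r - s))

def beta_from_phi(phi, gamma, k, iters=100):
    """fixed-point iteration beta = phi + gamma/beta**(k-1),
    started at beta = phi; the fixed point solves
    phi = beta*(1 - gamma/beta**k)."""
    beta = phi
    for _ in range(iters):
        beta = phi + gamma / beta ** (k - 1)
    return beta
```

because the correction γ/β^(k−1) is tiny for γ ≪ 1 and β > 1, the iteration is a strong contraction and converges in a handful of steps, giving a starting point good enough that the subsequent gradient descent avoids the flat region.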
the parameters and fit error are:

β / γ / fit error
optimal: 1.306 / 0.0013282 / 1.6521
equations (12) and (11): 1.3055 / 0.0013423 / 1.6619

the approximate fit works pretty well, and is certainly good enough to initialize an optimization. note that, to get an even better starting point for the gradient descent optimization, assuming t, τ ≥ 2k, one could simultaneously solve the three equations. we perform our analysis on early-dynamics data available from the ecdc, giving infection numbers starting from december 31, 2019 (ecdc, 2020). we use data for 20 countries, selected qualitatively because they appear to have reasonably efficient testing procedures for self-reported cases. from the public health perspective, perhaps the most important parameter is β, since actions can be taken to mitigate the spread by reducing β, whereas γ, k and m0 are somewhat given for the country. we show the fits in table 3. as you can see, there is much variability in β. we perform a simple statistical analysis to test whether β can be explained by any of the country parameters in table 2. we include the delay as a global explanatory variable, which would account for a global increase in vigilance as time passes and awareness of the pandemic increases. one expects β to decrease with the delay. a table of correlations of β with the various parameters is shown below. for our analysis we use the best-case β, although similar results follow from the optimal β.
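the correlation table can be reproduced with a few lines of numpy; here is a sketch (our own helper, and the data arrays in the example are placeholders, not the paper's values):

```python
import numpy as np

def beta_correlations(beta, features):
    """pearson correlation of the per-country beta with each explanatory
    feature; `features` maps a feature name to an array aligned with `beta`."""
    beta = np.asarray(beta, float)
    return {name: float(np.corrcoef(beta, np.asarray(x, float))[0, 1])
            for name, x in features.items()}
```

positive entries (e.g. for delay or poverty) indicate faster spread as the feature increases, negative entries (e.g. for median age) the opposite.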
as expected, there is a very significant correlation of β with delay, but in the opposite direction.

• the larger the delay, the larger is β. the more a country has observed, the faster the spread in that country. that seems unusual, but is strongly indicated by the data.
• population density at the infection site has a strong positive effect, but the country's overall population density does not.
• there is faster spread in poorer countries.
• median age has a strong effect. spread is faster in younger countries. the youth are more mobile and perhaps also more carefree.
• there is a slight negative effect from wealth and per-capita income. spread is slower in richer countries, perhaps due to more risk-aversion, perhaps higher levels of education, perhaps less use of public transportation. whatever the cause, it does have an impact, but a relatively smaller one than the other effects.

(figure 5 caption: the optimal feature to predict β, within a cross-validation setting to select the regularization parameter. the gray region is the range of the predicted value for each country. r^2 = 0.57, so the cross-validation-based optimal linear feature captures 57% of the residual variance.)

we now use regularized regression to perform a linear model fit to explain β. to make the weight magnitudes meaningful, we normalize the data. we use leave-one-out cross-validation to select the optimal regularization parameter (which happens to be 20). the optimal regularized fit with this regularization parameter gives a new feature

x = w1·(del) + w2·(pop) + w3·(age) + w4·(wlth) + w5·(inc) + w6·(pov) + w7·(pop-init)

the learned weights and their ranges which yield a cross-validation error within 10% of optimal are shown in the table.
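the regularized fit with leave-one-out selection of the regularization parameter can be sketched as follows. this uses a generic ridge (l2) penalty as a stand-in, since the paper does not state which penalized regression it uses, and the function names are ours:

```python
import numpy as np

def ridge_fit(X, y, lam):
    """closed-form ridge regression on (already normalized) features."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def loo_cv_error(X, y, lam):
    """leave-one-out cross-validation error for a given regularizer."""
    n = len(y)
    errs = []
    for i in range(n):
        mask = np.arange(n) != i
        w = ridge_fit(X[mask], y[mask], lam)   # fit without country i
        errs.append((X[i] @ w - y[i]) ** 2)    # predict the held-out country
    return float(np.mean(errs))

def select_lambda(X, y, lams):
    """pick the regularization parameter with smallest loo-cv error."""
    return min(lams, key=lambda lam: loo_cv_error(X, y, lam))
```

with normalized features, the entries of the final weight vector are directly comparable, which is what makes the per-feature weights (and their ranges within 10% of the optimal cv error) interpretable.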
the predictions of β using this feature are also shown in figure 5. a statistical regression model using these data produces a fit that identifies positive weights on delay, population density at the initial site and poverty, in that order of significance. as we observed from the correlations, delay, poverty and population density at the initial infection site have strong positive weights. age has a strong negative weight. wealth and income have weak negative effects, but non-zero. the population density of the country as a whole seems to have no effect, with a weight range that includes 0.

references:
• abu-mostafa et al. (2012). learning from data: a short course. amlbook.
• anderson and may (1992). infectious diseases of humans.
• bailey (1957). the mathematical theory of epidemics.
• balcan et al. multiscale mobility networks and the spatial spreading of infectious diseases.
• the effect of travel restrictions on the spread of the 2019 novel coronavirus.
• infected people without symptoms might be driving the spread of coronavirus more than we realized.
• ecdc (2020). geographic distribution of covid-19 cases worldwide.
• infectious disease modeling of social contagion in networks.
• networks and epidemic models.
• kermack and mckendrick (1927). a contribution to the mathematical theory of epidemics.
• kissler et al. (2020). projecting the transmission dynamics of sars-cov-2 through the post-pandemic period.
• magdon-ismail (2020). machine learning the phenomenology of covid-19 from early infection dynamics.

acknowledgments. we thank abhijit patel, sibel adali and zainab magdon-ismail for provoking discussions on this topic.
key: cord-297530-7zbvgvk8 authors: kühnert, denise; wu, chieh-hsi; drummond, alexei j. title: phylogenetic and epidemic modeling of rapidly evolving infectious diseases date: 2011-08-31 journal: infect genet evol (issn 1567-1348, © 2011 elsevier b.v.) doi: 10.1016/j.meegid.2011.08.005 sha: doc_id: 297530 cord_uid: 7zbvgvk8 epidemic modeling of infectious diseases has a long history in both theoretical and empirical research. however, the recent explosion of genetic data has revealed the rapid rate of evolution that many populations of infectious agents undergo and has underscored the need to consider both evolutionary and ecological processes on the same time scale. mathematical epidemiology has applied dynamical models to study infectious epidemics, but these models have tended not to exploit – or take into account – evolutionary changes and their effect on the ecological processes and population dynamics of the infectious agent. on the other hand, statistical phylogenetics has increasingly been applied to the study of infectious agents. this approach is based on phylogenetics, molecular clocks, genealogy-based population genetics and phylogeography. bayesian markov chain monte carlo and related computational tools have been the primary source of advances in these statistical phylogenetic approaches. recently the first tentative steps have been taken to reconcile these two theoretical approaches. we survey the bayesian phylogenetic approach to epidemic modeling of infectious diseases, describe the contrasts it provides to mathematical epidemiology, and emphasize the significance of the future unification of these two fields. molecular phylogenetics has had a profound impact on the study of infectious diseases, particularly rapidly evolving infectious agents such as rna viruses. it has given insight into the origins, evolutionary history, transmission routes and source populations of epidemic outbreaks and seasonal diseases.
one of the key observations about rapidly evolving viruses is that the evolutionary and ecological processes occur on the same time scale. this is important for two reasons. first, it means that neutral genetic variation can track ecological processes and population dynamics, providing a record of past evolutionary events (e.g., genealogical relationships) and past ecological/population events (geographical spread and changes in population size and structure) that were not directly observed. second, the concomitance of evolutionary and ecological processes leads to their interaction which, when non-trivial, necessitates joint analysis. arguably the most studied infectious disease agent to date has been human immunodeficiency virus (hiv), which has been the subject of thousands of phylogenetic studies. these have shed light on many aspects of hiv evolutionary biology, epidemiology, origins, phylogeography, transmission dynamics and drug resistance. in fact, the vast body of literature on hiv makes it clear that almost every aspect of the biology of a rapidly evolving pathogen can be better understood in the context of the evolution of the virus. whether it is retracing the zoonotic origins of the hiv pandemic or describing the interplay between the virus population and its host's immune system, a phylogenetic analysis frequently sheds light. although probabilistic modeling approaches to phylogenetics predate sanger sequencing (edwards and cavalli-sforza, 1965), it was not until the last decade that probabilistic modeling became the dominant approach to phylogeny reconstruction. part of that dominance has been due to the rise of bayesian inference (huelsenbeck et al., 2001), with its great flexibility in describing prior knowledge, its ability to be applied via the metropolis-hastings algorithm to complex highly parametric models, and the ease with which multiple sources of data can be integrated into a single analysis.
the history of probabilistic models of molecular evolution and phylogenetics is a history of gradual refinement: a process of selection of those modeling variations that have the greatest utility in characterizing the ever-growing empirical data. the utility of a new model has been evaluated either by how well it fits the data (formal model comparison or goodness-of-fit tests) or by the new questions that it allows a researcher to ask of the data. in this review we describe the modern phylogenetic approach to the field of infectious diseases, with particular reference to bayesian inference of the phylogenetic epidemiology of rapidly evolving viral pathogens such as hepatitis c virus (hcv), hiv and influenza a virus. the review is separated into two main sections. in section 2 we discuss phylogenetic methods for reconstructing the history of infectious epidemics, including identification of origins, dating of common ancestors, relaxed phylogenetics and coalescent-based population dynamics. in section 3 we review epidemiological models and finish by outlining progress in the development of phylodynamic models that marry statistical phylogenetics with dynamical modeling. the introduction of an efficient means of calculating the probability of a sequence alignment given a phylogenetic tree (known as the phylogenetic likelihood; felsenstein, 1981) heralded the beginning of practical phylogenetic tree reconstruction in a statistical framework. at around the same time the coalescent was introduced: a theory relating the shape of the genealogy of a random sample of individuals to the size of the population from which they came (kingman, 1982; see section 2.3 for details). both of these advances have been subsequently developed to the point that, together, they enable the estimation of viral evolutionary histories and past population dynamics.
bayesian inference brings together the likelihood, pr(d|h) (the probability of the data given the model parameters), and the prior, p(h) (the probability of the model parameters prior to seeing the data), so that the posterior probability of the model parameters h given the data d is:

p(h|d) = pr(d|h) p(h) / pr(d)

in a standard phylogenetic setting, the probabilistic model parameters include the phylogenetic tree, coalescent times and substitution parameters, and a prior probability distribution over these parameters must be specified. by using kingman's coalescent as a prior density on trees, bayesian inference can be used to simultaneously estimate the phylogeny of the viral sequences and the demographic history of the virus population (drummond et al., 2002, 2005; see box 1). extension of phylogenetic inference methods to accommodate time-stamped sequence data (rambaut, 2000; drummond et al., 2002) and relaxation of the assumption of a strict molecular clock (thorne et al., 1998; kishino et al., 2001; sanderson, 2002; drummond et al., 2006; rannala and yang, 2007) provided sophisticated methods for ancestral divergence time estimation. for virus species that occupy more than one host species (e.g., influenza a), models that aim to detect cross-species transmission may provide clues to the origin of a virus strain in a host population (reis et al., 2009). when a new epidemic emerges, one of the first goals is to trace it back to its genetic and geographic origin. the reconstruction of phylogenetic trees to infer evolutionary relationships has been a key tool to uncover the origin of regional epidemics such as those resulting from hiv (gao et al., 1999; santiago et al., 2002), hcv (markov et al., 2009) and sars coronavirus (sars-cov). some studies have also attempted to use phylogenetic trees to draw conclusions about the transmission history and geographic spread of viral epidemics (motomura et al., 2003; santiago et al., 2005; gilbert et al., 2007).
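the sampler behind most bayesian phylogenetic software is markov chain monte carlo; the essential metropolis-hastings step works from the un-normalized posterior pr(d|h)·p(h), so the normalizing constant pr(d) never needs to be computed. the following is a toy sketch of our own for a single scalar parameter, illustrative only: real phylogenetic samplers such as those in beast or mrbayes propose changes to trees and many parameters jointly.

```python
import math
import random

def metropolis_hastings(log_likelihood, log_prior, init,
                        steps=5000, step_size=0.1, seed=0):
    """minimal random-walk metropolis sampler for one scalar parameter.
    the target posterior(theta | data) is known only up to a constant,
    via log_likelihood(theta) + log_prior(theta)."""
    rng = random.Random(seed)
    theta = init
    log_post = log_likelihood(theta) + log_prior(theta)
    samples = []
    for _ in range(steps):
        prop = theta + rng.gauss(0.0, step_size)      # symmetric proposal
        lp = log_likelihood(prop) + log_prior(prop)
        # accept with probability min(1, posterior ratio)
        if math.log(rng.random()) < lp - log_post:
            theta, log_post = prop, lp
        samples.append(theta)
    return samples
```

after a burn-in period, the retained samples approximate draws from the posterior, and posterior summaries (means, credible intervals) are computed directly from them.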
however, great care should be taken when coming to conclusions about aspects of the epidemic process that are not explicitly modeled in the reconstruction of the phylogenetic tree, and even if they are, the user needs to consider the appropriateness of the underlying model assumptions. one common and straightforward method used to identify the origin of an epidemic involves determining the non-epidemic genotype or lineage most closely related to the epidemic, i.e., the molecular sequences clustered most closely with the epidemic strain on a phylogenetic tree. while the method is intuitive, its success depends heavily on the collected data. the closest simian immunodeficiency virus (siv) relative of hiv-1 is sivcpz (gao et al., 1999; santiago et al., 2002), which is harbored in the chimpanzee sub-species pan troglodytes troglodytes and p. t. schweinfurthii in the form of the respective sub-species-specific siv lineages sivcpzptt and sivcpzpts. although sivcpz became the prime candidate for the zoonotic source of hiv-1 as soon as it was identified, alternative sources could not be ruled out due to the paucity of identified chimpanzee infections (vanden haesevelde et al., 1996). the source of hiv-1 was confirmed much later, after the collection of sivcpz from fecal samples of wild p. t. troglodytes apes in the cameroon forest. hiv-1 groups m and n are much more closely related to sequences from the fecal samples than to previously identified sivcpz strains. this finding uncovered the distinct origins of hiv-1 group m (pandemic) and group n (non-pandemic), traced to chimpanzee communities of southeastern and central cameroon respectively. the precise geographic identification of these wildlife chimpanzee reservoirs of hiv-1 by phylogenetic techniques provided the crucial evidence that sivcpz gave rise to the hiv/aids pandemic.
conversely, if strains sufficiently closely related to the epidemic strain cannot be identified, then phylogenetic trees cannot easily provide answers about origins. for example, there has been much heated debate on the origin of the 1918 h1n1 influenza a pandemic: whether its source was avian, non-human mammalian or even human. the uncertainty mainly stems from the absence of sequences from the immediate ancestral source population of the 1918 virus (gibbs and gibbs, 2006). a similar, though less severe, problem has been encountered in the search for the origin of hiv-1 group o. strains of hiv-1 group o have been revealed to be most closely related to sivgor found in western lowland gorillas (gorilla gorilla gorilla) (takehisa et al., 2009). however, hiv-1 group o sequences are moderately divergent from the known sivgor sequences and, consequently, the route of transmission that gave rise to hiv-1 group o and sivgor is still indeterminate. the interspersion of an emergent viral strain with other strains in a phylogenetic tree is often interpreted as evidence supporting multiple independent viral introductions. for example, hiv lineages are paraphyletic with siv lineages, creating several separate clusters of hiv and suggesting multiple zoonotic viral transmissions into the human population (santiago et al., 2005; keele et al., 2006). while it is intuitive that separate clusters of the emergent virus suggest multiple introductions, it is not clear from the number of clusters alone how many independent events are responsible for the observed pattern. incomplete taxon sampling will lead to undercounting: for example, there may exist an unsampled sequence that would split an emergent viral cluster, or an additional unsampled emergent cluster. both scenarios, if detected, would increase the lower bound on the inferred number of events. the number of events could also be incorrectly estimated due to phylogenetic estimation error.
finally, in situations where the event is potentially reversible, such as with drug-resistance mutations, e.g., adamantane resistance in h3n2 influenza virus (nelson et al., 2009), it is quite possible that reversions are also present in the phylogenetic history, and these are not always detectable by a simple parsimony reconstruction, again leading to undercounting. for all these reasons, the application of bayesian modeling of phylogeography and character evolution on phylogenies is crucial to quantitatively assess the uncertainty generated from these different sources of error (see section 2.4). in contrast to hiv-1, it has been clearly established for almost two decades that the progenitor of hiv-2 is sivsm from the sooty mangabey (cercocebus torquatus atys) (hirsch et al., 1989; gao et al., 1992). santiago et al. (2005) suggested that the geographic origins of hiv-2 groups a and b lie in the eastern sooty mangabey range, according to the clear geographic clustering displayed in the phylogenetic tree and the branching positions of the hiv-2 strains. although this heuristic approach to locating phylogeographic origins is commonly used, it has several disadvantages aside from the sampling error mentioned earlier. first, it relies on strong geographic signals to produce an unambiguous geographic clustering pattern in the trees. second, the lack of a formal statistical framework results in an inability to quantify the uncertainty associated with the geographic estimates. a number of statistical phylogenetic methods aim to reconstruct the migration process by treating geographic location as another state that evolves down the tree. the states are either discrete (lemey et al., 2009b), denoted by names of cities or provinces, or continuous, represented by the latitude and longitude of the location (biek et al., 2006, 2007; lemey et al., 2010).
even with comprehensive sampling, using a single phylogenetic tree is insufficient to reflect the complex genetic origin of virus species that undergo recombination or reassortment. reassortment arises when segments of the viral genome come from different viruses, while recombination also requires the genetic material from one source to (break and) join with that from another. these two processes enable the generation of novel combinations from two existing genotypes. moreover, these often large genetic changes may provide the potential for adaptation to a new host species (parrish et al., 2008). reassortment has played an important role in the evolution of the influenza a virus (lindstrom et al., 2004; holmes et al., 2005; nelson et al., 2008). evidence for recombination has also been found in dengue (holmes et al., 1999), hiv, hcv and sars-cov. there are many phylogenetic methods that aim to detect recombination by identifying discordance in the topologies of different parts of the alignment (grassly and holmes, 1997; salminen et al., 1995; lole et al., 1999; smith, 1992; robertson et al., 1995; paraskevis et al., 2005), which is a potential consequence of recombination. most of these methods use a sliding-window approach to compute a summary statistic along the length of the sequence. phylogenetic approaches are based on estimating either (i) bootstrap values or (ii) clade posterior probabilities for each window; a sudden change in bootstrap value, clade posterior probability or site percentage identity indicates the presence of a breakpoint around the region. other methods explicitly estimate the position of the breakpoint in an alignment, providing a means to test the strength of support for recombination (holmes et al., 1999). finally, some approaches portray the evolutionary history as networks to incorporate horizontal transfer (huson, 1998) or as ancestral recombination graphs (bloomquist and suchard, 2010).
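the sliding-window idea behind many recombination scans can be sketched as follows; the sequences, window size and identity statistic here are toy choices, not any particular published method:

```python
# Toy sliding-window scan for recombination: compare a query
# sequence against two putative parents in fixed-size windows and
# record which parent is closer in each window.  A switch in the
# closest parent along the alignment suggests a breakpoint.
def identity(a, b):
    """Fraction of matching sites between two equal-length strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def window_scan(query, p1, p2, win=10, step=5):
    calls = []
    for i in range(0, len(query) - win + 1, step):
        w = slice(i, i + win)
        id1 = identity(query[w], p1[w])
        id2 = identity(query[w], p2[w])
        calls.append((i, "P1" if id1 >= id2 else "P2"))
    return calls

# Synthetic recombinant: first half copied from parent1, second
# half from parent2.
parent1 = "A" * 40
parent2 = "C" * 40
query = parent1[:20] + parent2[20:]
calls = window_scan(query, parent1, parent2)
```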
as a rule, rna viruses mutate rapidly, so that viruses isolated only a few months apart may exhibit measurable genetic differences (drummond et al., 2003a, and references therein). indeed, the mutation rate of some rna viruses is so high that it can result in evolutionary change within a host during the course of infection. this is particularly true of long-term chronic infections caused by viruses such as hiv and hcv. it is therefore not appropriate to analyze sequences that have been sampled years apart as if they were contemporaneous. sequence data with this type of temporal structure are called heterochronous, and from such data the substitution rate can be estimated and divergence times calibrated to a calendar scale. here, a tree with branch lengths in calendar units is termed a ''time tree''. fig. 1 depicts an example of a serially sampled time tree of a rapidly evolving virus. to account for temporal structure in sequence data, the earliest methods estimated the time scale by estimating a gene tree with unconstrained branch lengths and then performing a linear regression of root-to-tip genetic distance against sampling times (see drummond et al., 2003b, for a review). this method was used to provide the first estimate of the time of the most recent common ancestor (t_mrca) of the hiv-1 m group, placing it in the 1930s (korber et al., 2000). despite its simplicity, this method also accurately estimated the age of the oldest hiv sequence, sampled in 1959. a maximum likelihood based method (the single rate dated tips (srdt) model; rambaut, 2000) estimates ancestral divergence times and the overall substitution rate on a fixed tree, assuming a strict molecular clock. the srdt model was used to date the most recent common ancestor of hiv-2 subtype a to 1940 ± 16 and that of subtype b to 1945 ± 14 (lemey et al., 2003).
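the root-to-tip regression described above can be sketched on synthetic, noise-free data generated under a strict clock; the rate and t_mrca values are made up for illustration:

```python
# Sketch of root-to-tip regression: regress root-to-tip genetic
# distance against sampling time.  The slope estimates the
# substitution rate and the x-intercept estimates the tMRCA.
# Data are synthetic and noise-free (rate 2e-3 subst/site/year,
# tMRCA 1930 are illustrative values, not estimates).
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

years = [1990, 1995, 2000, 2005, 2010]
rate_true, tmrca_true = 2e-3, 1930.0
dists = [rate_true * (y - tmrca_true) for y in years]

rate, b = fit_line(years, dists)
tmrca = -b / rate    # x-intercept: where expected divergence is zero
```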
using the serial coalescent as a tree prior in bayesian coalescent methods (drummond et al., 2002, 2005; drummond and rambaut, 2007) allows the time scale to be estimated simultaneously with other phylogenetic and demographic parameters. recently, a relaxed-clock bayesian coalescent analysis that included two historical viral samples from 1959 (zr59) and 1960 (drc60) (worobey et al., 2008) pushed back the estimated t_mrca of the hiv-1 m group to 1908 (1884-1924). besides estimating the time of an epidemic outbreak, it may also be important to know how long the ancestors of the epidemic strain had circulated in the source population prior to the epidemic. this can sometimes be indicated by the length of the branch ancestral to the epidemic clade. in the case of the 2009 swine-origin influenza a virus, the length of the branch leading to the 2009 s-oiv strains is estimated to be 9-17 years, depending on the viral segment analyzed, suggesting roughly a decade of unsampled diversity (smith et al., 2009). to estimate the age of the common ancestor of sivsm strains, the t_mrca of hiv-2/sivsm has been dated, indicating that the common ancestry prior to the zoonotic origins of hiv-2 groups a and b spans only the last few centuries (wertheim and worobey, 2009). this does not necessarily indicate that sivsm first arose only centuries ago, just that the common ancestor of all current sivsm may be recent. however, even this conclusion has recently been questioned (worobey et al., 2010) as a result of independent calibration evidence suggesting that the t_mrca could in fact be greater than 32,000 years ago, leading to debate about the fidelity of the statistical substitution models commonly employed for divergence time dating when the true divergence times are very ancient compared to the sampling interval.
as demonstrated by wertheim and pond (2011), substitution models that do not take into account the effects of selection can produce underestimated branch lengths, leading to much younger age estimates in the presence of purifying selection. this will be more problematic for data sets in which the total sampling interval is only a small fraction of the total age of the tree. while incorporating sampling dates provides additional information to phylogenetic inference, it also implies that the reliability of those dates has a heavy impact on the validity of the inference. the h1n1 influenza virus that re-emerged in 1977 was found to have missed decades of evolution and was genetically remarkably similar to the h1n1 1950 virus (nakajima et al., 1978). it is thus thought to be descended from a strain that was kept frozen in an unknown laboratory for perhaps decades before becoming a ''wild'' strain again (zimmer and burke, 2009). if the missing evolution is not corrected for, analyses including the re-emergent strains produce biased date estimates and increased variances of the t_mrca of the re-emergent lineages and across the phylogeny (wertheim, 2010). in cases where the sampling dates of sequences are contentious or unknown, a method that can handle sequences with unknown dates is required. for example, the leaf-dating method estimates the unknown date or age of a sequence as a parameter, treating it the same way as the ages of internal nodes (drummond et al., 2003c; nicholls and gray, 2008; shapiro et al., 2010). unrealistic sampling dates may also be the result of human error and thus go unrecognized prior to an analysis; diagnostics for unrealistic dates are therefore important for picking up errors in the recorded dates. one possible method is to plot the root-to-tip genetic distance against sampling year, provided the virus does not display a significant departure from a constant rate (wertheim, 2010).
another is to check calibrations by dropping each calibration point in turn and re-estimating the date, to confirm that the estimated dates are consistent (ryder and nicholls, 2011). early methods that accommodated heterochronous data assumed a strict clock model. however, a comprehensive study of heterochronous rna viral sequences using the srdt model (rambaut, 2000) demonstrated that the majority of the 50 rna viral species studied rejected the constant-rate molecular clock hypothesis (jenkins et al., 2002). the unrooted phylogeny is at the other extreme of the scale of rate variability across the branches of a phylogenetic tree. neither is a realistic representation of the underlying evolutionary process, and the reality lies somewhere between the two. this has spawned the development of numerous methods that relax the molecular clock assumption and differ in their assumptions about the pattern of rate variation across branches. the local clock model approach assigns different rates to clades/regions of the tree. however, without external information, it is difficult to know a priori the best partitioning of the tree into local clock models. bayesian model averaging overcomes the challenge of rate assignment by averaging over all possible local clock models, estimating the substitution rates and the number and positions of changes in substitution rate simultaneously. another category of relaxed clock models is based on 'rate smoothing', including non-parametric rate smoothing (sanderson, 1997), penalized likelihood (sanderson, 2002) and bayesian autocorrelated relaxed clock methods (thorne et al., 1998; kishino et al., 2001; aris-brosou and yang, 2002; rannala and yang, 2007). these methods restrict the rates on parent and descendant branches to be similar by penalizing large departures from parent branch rates; hence, rate variation is expected to occur through small and frequent changes.
different bayesian autocorrelated clock models differ in the distribution used to model a branch rate given its parent rate (thorne et al., 1998; kishino et al., 2001). however, analyses of sequence data from influenza a and dengue-4 do not provide any evidence of autocorrelation of branch rates, suggesting that autocorrelated models may not be appropriate when analyzing a genealogy of sequences from a single virus species. whereas lineage effects may be expected to cause autocorrelation of rates (through incremental changes to life history, metabolic rate, et cetera), the gene-specific action of darwinian selection will also cause apparent rate variation among lineages, by producing a general over-dispersion of the molecular clock over the entire phylogeny (takahata, 1987, 1991). this second source of rate variation among lineages may be better modeled by uncorrelated relaxed clock models, which make no assumption about the autocorrelation of rates between ancestral and descendant branches. published analyses have provided strong evidence supporting the uncorrelated relaxed clock model (e.g., salemi et al., 2008; worobey et al., 2008) over the strict clock model. as well as estimating the ages of ancestral divergences, it is also of interest to estimate the time of cross-species transmission if the disease is zoonotic in origin. one method of identifying the time of the host switch is to apply non-homogeneous substitution models. the motivation of non-homogeneous substitution models is to acknowledge possible differences in the pattern of substitution of the virus within different host species, which violate the assumptions of homogeneity and stationarity underlying standard substitution models. it may therefore be more appropriate to apply different substitution models to different parts of the tree (forsberg and christiansen, 2003).
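the contrast with autocorrelated clocks can be illustrated by sampling branch rates under an uncorrelated lognormal relaxed clock, in which each branch draws its rate independently; the distribution parameters below are illustrative, not estimates from any data set:

```python
# Sketch of an uncorrelated relaxed clock: each branch draws its
# substitution rate independently from a lognormal distribution,
# with no autocorrelation between parent and child branches.
# mean_log and sd_log are illustrative parameter values.
import math
import random

def lognormal_branch_rates(n_branches, mean_log=-7.0, sd_log=0.5, seed=1):
    rng = random.Random(seed)
    return [math.exp(rng.gauss(mean_log, sd_log)) for _ in range(n_branches)]

rates = lognormal_branch_rates(1000)
```

an autocorrelated model would instead draw each branch rate from a distribution centered on its parent's rate, producing small, gradual changes along the tree.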
non-homogeneous substitution models permit the equilibrium frequencies, and hence the model parameters, to change on a branch, and all the lineages descending from the point of change are assumed to have different equilibrium base frequencies from the lineages prior to that point. this technique has been used to suggest that the immediate ancestral population of the 1918 influenza a virus resided in a mammalian host (reis et al., 2009). however, it does not indicate whether the most recent common ancestor of the swine influenza virus and the 1918 virus resided in humans or other mammals. interpretation of estimated divergence times can be difficult. there may be direct ancestors that are more ancient, but the lineages that would reveal them have not been sampled or did not survive to the present due to processes such as genetic drift. therefore, the estimated t_mrca may not answer the question of interest. for epidemics that resulted from a zoonotic transmission, the host-switch event is of paramount interest, but estimating the t_mrca of the epidemic strain does not directly estimate the time of the transmission, and only serves as a lower bound. likewise, if there have been processes causing a loss of genetic diversity in the past, or if the sampling is not comprehensive, then the estimated t_mrca could be substantially younger than the age of the viral lineage. an obvious example of the former occurs in seasonal influenza, due to seasonal population fluctuations and also to strong positive darwinian selection caused by immune surveillance (fitch et al., 1991; bush et al., 1999), leading to rapid lineage turnover and a recent common ancestor of any single-season sample. similarly, the analysis by worobey et al. (2008) shows that the t_mrca of hiv-1 group m seems to have been pushed back due to the inclusion of an additional pre-epidemic sample from 1960, which is highly divergent from the 1959 sequence (zr59).
in general, the inclusion of older samples can increase the estimated age of the root by (i) revealing previously unsampled lineages that are outgroups to the t_mrca estimated without them, or (ii) simply because more temporal sampling breaks up long internal branches, as well as potentially revealing ancient evidence of variants that were assumed modern, resulting in a slower estimated rate and therefore an older estimated root height. finally, it is likely that current techniques alone cannot always recover accurate divergence dates in the distant past, as illustrated by recent analyses suggesting a much deeper history of siv (worobey et al., 2010) than previously suggested (sharp et al., 2000; wertheim and worobey, 2009). fig. 2 illustrates the problem with three estimated viral time-trees that have vastly different inferred ages of their most recent common ancestors. we would expect the greatest confidence in the inferred age of the human influenza a time-tree, where the sample period is a large fraction of the total age of the time-tree, and the least confidence in the inferred age of the hepatitis c time-tree, in which the sampling period is a small fraction of the inferred age of greater than 1000 years. so, apart from better models of rate variation across lineages (see guindon et al., 2004, for early steps in this direction), future research in divergence time dating will likely focus on models that more accurately account for purifying selection and its role in maintaining the structure and function of the encoded genes. the impact of darwinian selection is expressed in distortions of both the genealogy (o'fallon, 2010; o'fallon et al., 2010) and the substitution process (e.g., bloom et al., 2007; cartwright et al., 2011) away from neutral expectations.
consideration of the action of pervasive purifying selection is especially important for viral genomes, which are prone to clonal interference and are compact, information-rich and subject to great levels of functional and structural constraint in their evolutionary trajectories, especially when considering long time periods. beyond that, there is also a need for more statistically rigorous methods of incorporating diverse sources of calibration information, such as biogeographic, archaeological and paleontological evidence. bayesian statistical frameworks are uniquely suited for this sort of integration of multiple sources of information. genealogy-based population genetics can be used to infer demographic parameters including population size, rate of growth or decline, and population structure. when the characteristic time scale of demographic fluctuations is comparable to the rate of accumulation of substitutions, past population dynamics are ''recorded'' in the substitution patterns of molecular sequences. coalescent theory can therefore be combined with the temporal information in heterochronous sequences to uncover past epidemiological events and pinpoint them on a calendar time scale. kingman's coalescent (kingman, 1982) describes the relationship between the coalescent times in a sample genealogy and the population size, assuming an idealized wright-fisher population (fisher, 1930; wright, 1931). the original formulation was for a constant population size, but the theory has since been generalized to any deterministically varying population size function n(t) for which the integral ∫_{t0}^{t1} n(t)^{-1} dt can be computed (griffiths and tavaré, 1994). parametric models with a pre-defined population function, such as the exponential growth, expansion and logistic growth models, can easily be used in a coalescent framework (see fig. 3 and box 1 for details).
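kingman's coalescent can be simulated directly: with k lineages in a constant population of size n, the waiting time to the next coalescence is exponential with rate k(k-1)/(2n). a minimal sketch (the sample size and population size are made up):

```python
# Sketch of Kingman's coalescent for a constant-size Wright-Fisher
# population: with k lineages, the time to the next coalescence is
# exponential with rate k(k-1)/(2N).  Averaged over replicates, the
# mean TMRCA approaches the theoretical value 2N(1 - 1/n).
import random

def sim_tmrca(n, N, rng):
    """Simulate one coalescent TMRCA for n samples, population size N."""
    t, k = 0.0, n
    while k > 1:
        rate = k * (k - 1) / (2.0 * N)
        t += rng.expovariate(rate)
        k -= 1
    return t

rng = random.Random(42)
n, N = 10, 1000.0       # illustrative values
reps = 2000
mean_tmrca = sum(sim_tmrca(n, N, rng) for _ in range(reps)) / reps
# theory: E[TMRCA] = 2N(1 - 1/n) = 1800 for n = 10, N = 1000
```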
for example, a ''piecewise-logistic'' population model was employed in a bayesian coalescent framework to estimate the population history of hcv genotype 4a infections in egypt. this analysis demonstrated a rapid expansion of hcv in egypt between 1930 and 1955, consistent with the hypothesis that public health campaigns to administer anti-schistosomiasis injections had caused the expansion of the hcv epidemic in egypt. the coalescent process is highly variable, so sampling multiple unlinked loci (felsenstein, 2006; heled and drummond, 2008) or increasing the temporal spread of sampling times (seo et al., 2002) can be used to increase the statistical power of coalescent-based methods and improve the precision of estimates of both population size and substitution rate (seo et al., 2002). however, in many virus species the entire genome acts as a single locus, or undergoes recombination only when the opportunity arises through superinfection; the lack of independent loci therefore places an upper limit on the precision of estimates of population history. in many situations the precise functional form of the population size history is unknown, and simple population growth functions may not adequately describe the population history of interest. non-parametric coalescent methods provide greater flexibility by estimating the population size as a function of time directly from the sequence data, and can be used for data exploration to guide the choice of parametric population models for further analysis. these methods first cut the time tree into segments, then estimate the population size of each segment separately according to the coalescent intervals within it. the main differences among these methods are (i) how the population size function is segmented along the tree, (ii) the statistical estimation technique employed and (iii), in bayesian methods, the form of the prior density on the parameters governing the population size function.
in the 'classic skyline plot' (pybus et al., 2000) each coalescent interval is treated as a separate segment, so a tree of n taxa has n - 1 population size parameters. however, the true number of population size changes is likely to be substantially fewer, and the generalized skyline plot (strimmer and pybus, 2001) acknowledges this by grouping the intervals according to the small-sample akaike information criterion (aic_c) (burnham and anderson, 2002). the epidemic history of hiv-2 was investigated using the generalized skyline plot (strimmer and pybus, 2001), indicating that the population size was relatively constant in the early history of hiv-2 subtype a in guinea-bissau, before expanding more recently (lemey et al., 2003). using this information, the authors then employed a piecewise expansion growth model to estimate the time of expansion to the range 1955-1970. while the generalized skyline plot is a good tool for data exploration and for assisting in model selection (e.g., pybus et al., 2003; lemey et al., 2004), it infers demographic history from a single input tree and therefore accounts neither for the sampling error produced by phylogenetic reconstruction nor for the intrinsic stochasticity of the coalescent process. [displaced fragment of the fig. 2 caption: (b) the sampling interval [1978, 2005] represents a significant fraction (≈0.32 t_mrca) of the overall tree height, but is still small enough that the estimated root should be viewed with caution. (c) a phylogeny of human influenza a subtype h3n2: the sampling interval spans 12.2 years [1993.1, 2005.3] and represents almost the full height of the tree (≈0.94 t_mrca), and all divergence times are likely to be quite accurately estimated, since interpolation between many known sample times is inherently less error-prone than extrapolation to ancient divergence times.] this shortcoming is overcome by implementing the skyline plot method in a bayesian statistical
framework, which simultaneously infers the sample genealogy, the substitution parameters and the population size history. further extensions of the generalized skyline plot include modeling the population size by a piecewise-linear function instead of a piecewise-constant one, allowing continuous changes over time rather than sudden jumps. the bayesian skyline plot (drummond et al., 2005) has been used to suggest that the effective population size of hiv-1 group m may have grown at a relatively slow rate in the first half of the twentieth century, followed by much faster growth (worobey et al., 2008). on a much shorter time scale, a bayesian skyline plot analysis of a dataset collected from an hiv-1 donor-recipient pair was used to reveal a substantial loss of genetic diversity following virus transmission (edwards et al., 2006). further analysis with a constant-logistic growth model estimated that more than 99% of the genetic diversity of hiv-1 present in the donor is lost during horizontal transmission. this has important implications, as the process underlying the bottleneck determines the viral fitness in the recipient host. one disadvantage of the bayesian skyline plot is that the number of changes in the population size has to be specified by the user a priori, and the appropriate number is seldom known. one solution is provided by methods that perform bayesian model averaging on the demographic model utilizing either reversible-jump mcmc (opgen-rhein et al., 2005) or bayesian variable selection (heled and drummond, 2008), in which case the number of population size changes is a random variable estimated as part of the model. the methods for demographic inference discussed so far assume no subdivision within the population of interest.
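the classic skyline estimator has a simple closed form: while k lineages are present, the expected interval length is 2n/(k(k-1)), so each interval of length g_k yields the estimate g_k·k(k-1)/2. a minimal sketch, with interval lengths chosen to match a constant population of size 100:

```python
# Sketch of the 'classic skyline plot' estimator: each coalescent
# interval gives one population-size estimate.  While k lineages
# are present the interval length g_k has expectation 2N/(k(k-1)),
# so N is estimated per interval as g_k * k * (k - 1) / 2.
def classic_skyline(intervals):
    """intervals: list of (k, length) from tips (k = n) down to root (k = 2)."""
    return [(k, length * k * (k - 1) / 2.0) for k, length in intervals]

# interval lengths set to their expectations under constant N = 100
est = classic_skyline([(4, 100 / 6.0), (3, 100 / 3.0), (2, 100.0)])
```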
like changes in size, population structure can also affect the pattern of coalescent interval sizes, and thus the reliability of results can be questioned when population structure exists. in the next section we will discuss approaches to phylogeographic inference, including coalescent approaches to population structure. phylogeography is a field that studies the evolutionary and dispersal processes that have given rise to the observed spatial distribution of populations or taxa. phylogeographic methods can be divided into two approaches: the first performs post-tree-reconstruction analysis to answer phylogeographic questions, while the second jointly estimates the phylogeny and the phylogeographic parameters of interest. when treating geographic location as a discrete state, the former approach has been popular in the past couple of decades. it has the advantage of being less computationally intensive, but the outcome of the analysis depends on the input tree. due to its simplicity, the most popular method for inferring ancestral locations has been maximum parsimony (slatkin and maddison, 1989; swofford, 2003; maddison and maddison, 2005; wallace et al., 2007); however, this method does not allow for any probabilistic assessment of the uncertainty associated with the reconstruction of ancestral locations. a 'mugration' model is a mutation model used to analyze a migration process. a recent study of influenza a h5n1 virus introduced a fully probabilistic mugration approach by modeling the process of geographic movement of viral lineages via a continuous-time markov process whose state space consists of the locations from which the sequences have been sampled (lemey et al., 2009b). this facilitates the estimation of migration rates between pairs of locations. [displaced fig. 3 caption: the underlying wright-fisher population and serially-sampled genealogies from two populations. the first population has a constant population size over the history of the genealogy, while the second has been growing exponentially. the coalescent likelihood calculates the probability of a genealogy given a particular background population history (e.g., constant or exponentially growing) and can therefore be employed to estimate the population history that best reflects the shape of the co-estimated phylogeny.] furthermore, the method estimates ancestral locations for internal nodes in the tree and employs bayesian variable selection (bvs) to infer the dominant migration routes and provide model averaging over uncertainty in the connectivity between different locations (or host populations). this method has helped with the investigation of the origin of influenza a h5n1 and the paths of its global spread, and also with the reconstruction of the initial spread of the novel h1n1 human influenza a pandemic (lemey et al., 2009b). however, a shared limitation of models with discrete location states is that ancestral locations are limited to sampled locations. as demonstrated by the analysis of a data set on rabies in dogs in west and central africa, the absence of sequences sampled close to the root can hinder the accurate estimation of viral geographic origins (lemey et al., 2009b). phylogeographic estimation is therefore improved by increasing both the spatial density and the temporal depth of sampling; however, dense geographic sampling leads to large phylogenies and computationally intensive analyses. the structured coalescent (hudson, 1990) can also be employed to study phylogeography. the structured coalescent has been extended to heterochronous data (ewing et al., 2004), thus allowing the estimation of migration rates between demes in calendar units. the serial structured coalescent was first applied to an hiv dataset with two demes to study the dynamics of subpopulations within a patient (ewing et al., 2004), but the same type of inference can be made at the level of the host population.
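the discrete-state mugration model treats movement among locations as a continuous-time markov chain, so transition probabilities over a branch of length t are given by the matrix exponential of the rate matrix. a minimal sketch with two locations and an illustrative migration rate (real analyses estimate the rate matrix from the data):

```python
# Sketch of a 'mugration' model: geographic movement of a lineage
# among discrete locations as a continuous-time Markov chain.
# Transition probabilities over a branch of length t are
# P(t) = exp(Q t), computed here with a truncated Taylor series.
# The rate matrix values are illustrative only.
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(Q, t, terms=40):
    """Truncated-series matrix exponential of Q*t (small matrices only)."""
    n = len(Q)
    Qt = [[q * t for q in row] for row in Q]
    P = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in P]
    for k in range(1, terms):
        term = mat_mul(term, Qt)                     # now Qt^k * (k-1)!...
        term = [[x / k for x in row] for row in term]  # ...scaled to Qt^k / k!
        P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return P

# two locations with a symmetric migration rate of 0.1 per unit time
Q = [[-0.1, 0.1], [0.1, -0.1]]
P = expm(Q, 5.0)   # analytically, P[0][1] = (1 - exp(-1)) / 2
```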
further development of the model allowed the number of demes to change over time (ewing and rodrigo, 2006a). migrate (beerli and felsenstein, 2001) also employs the structured coalescent to estimate subpopulation sizes and migration rates in both bayesian and maximum likelihood frameworks, and has recently been used to investigate spatial characteristics of viral epidemics (bedford et al., 2010). additionally, some studies have focused on the effect of ghost demes (beerli, 2004; ewing and rodrigo, 2006b); however, no models explicitly incorporating population structure, heterochronous samples and a nonparametric population size history are yet available. one ad hoc solution involves modeling the migration process along the tree in a way that is conditionally independent of the population sizes estimated by the skyline plot (lemey et al., 2009a): given the tree, the migration process is considered independent of the coalescent prior. however, this approach does not capture the interaction between migration and coalescence that is implicit in the structured coalescent, since coalescence rates should depend on the population size of the deme the lineages are in. as we will see in the following section, statistical phylogeography is one area where the unification of phylogenetic and mathematical epidemiological models looks very promising. in some cases it is more appropriate to model the spatial aspect of the samples as a continuous variable. the phylogeography of wildlife host populations has often been modeled in a spatial continuum using diffusion models, since viral spread and host movement tend to be poorly modeled by a small number of discrete demes. one example is the expansion of the geographic range of the raccoon-specific rabies virus in the eastern united states (biek et al., 2007; lemey et al., 2010).
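continuous phylogeographic diffusion can be sketched as a brownian random walk in latitude and longitude along a branch, with displacement variance proportional to branch length; the coordinates and diffusion rate below are illustrative:

```python
# Sketch of phylogeographic Brownian diffusion in a spatial
# continuum: along a branch of length t, latitude and longitude
# each perform an independent Gaussian random walk with variance
# sigma2 * t.  The root coordinates and diffusion rate are made up.
import random

def diffuse(start, t, sigma2=1.0, rng=None):
    """Return the endpoint of a Brownian walk of duration t from start."""
    rng = rng or random.Random()
    sd = (sigma2 * t) ** 0.5
    return tuple(x + rng.gauss(0.0, sd) for x in start)

rng = random.Random(7)
root = (46.9, -113.9)   # illustrative (lat, lon)
tips = [diffuse(root, 10.0, sigma2=0.5, rng=rng) for _ in range(500)]
mean_lat = sum(p[0] for p in tips) / len(tips)  # should stay near the root
```

under a relaxed diffusion model, sigma2 would itself vary across branches, analogous to a relaxed molecular clock.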
Brownian diffusion, via the comparative method (Felsenstein, 1985; Harvey and Pagel, 1991), has also been utilized to model the phylogeography of feline immunodeficiency virus collected from the cougar (Puma concolor) population around western Montana. The resulting phylogeographic reconstruction was used as a proxy for the host demographic history and population structure, due to the predominantly vertical transmission of the virus (Biek et al., 2006). However, one of the assumptions of Brownian diffusion is rate homogeneity on all branches. This assumption can be relaxed by extending the concept of relaxed clock models to the diffusion process. Simulations show that the relaxed diffusion model has better coverage and statistical efficiency than Brownian diffusion when the underlying process of spatial movement resembles an over-dispersed random walk. Like their mugration model counterparts, these models ignore the interaction of population density and geographic spread in shaping the sample genealogy. However, there has been progress in the development of mathematical theory that extends the coalescent framework to a spatial continuum (Barton et al., 2002, 2010a), although no methods have yet been developed providing inference under these models. Box 1: The anatomy of a Bayesian coalescent analysis using MCMC. Bayesian phylogenetic inference by Markov chain Monte Carlo (MCMC) (Yang and Rannala, 1997; Mau et al., 1999) involves the simulation of the joint posterior distribution of substitution model parameters (φ) and the phylogenetic tree given the sequence data (D). By restricting the phylogenetic model to time-trees (see Fig.
1) and coupling the phylogenetic likelihood with a coalescent prior, the parameters (θ) of the population history, N(t), can also be estimated simultaneously by sampling from the posterior probability distribution (Drummond et al., 2002):

f(g, φ, θ | D) ∝ Pr(D | g, φ) f_G(g | θ) f_θ(θ) f_φ(φ)

The term Pr(D | g, φ) is often referred to as the phylogenetic likelihood, and is the probability of the data given the time-tree g and substitution model parameters. It can be computed by the pruning algorithm (Felsenstein, 1981), which efficiently sums over all ancestral sequence states at the internal nodes of the tree. An extension of the likelihood accommodates heterogeneity across sites (Yang, 1994). If the time-tree g relates a heterochronous sample of sequences, then the substitution parameters φ also include the overall substitution rate μ, and this can be estimated from the heterochronous data, so that the population history is estimated on a calendar scale. The normalizing constant Pr(D) is also known as the partition function or marginal likelihood, and its magnitude provides a measure of model support, although its estimation requires advanced MCMC techniques (e.g., thermodynamic integration or transdimensional MCMC). Coalescent models come into play when determining the prior density for the time-tree topology and coalescent/divergence times. The coalescent provides a probability distribution, f_G(g | θ), conditional on a deterministic model of population size history, N(t). Its parameters (θ) can in turn be estimated as hyperparameters. Given a time-tree g = {E_g, t} of n contemporaneous samples, composed of an edge graph E_g and coalescent times t = {t_n = 0, t_{n−1}, …, t_2, t_1}, the coalescent density is:

f_G(g | θ) = ∏_{k=2}^{n} [1 / N(t_{k−1})] exp( −∫_{t_k}^{t_{k−1}} [k(k−1)/2] / N(t) dt )

The prior distributions f_θ(θ) and f_φ(φ) are usually selected from standard univariate or multivariate distributions.
In the previous section we have seen that phylogenetics can be used to infer the date of an outbreak, its source population and the viral transmission history, directly from time-stamped genomic data. Whereas phylogenetic models mainly address questions about evolutionary history, dynamical models are often used to make predictions about the future. Predictive models are important because they provide the possibility of anticipating certain aspects of the outcome of emerging epidemics, assessing the risk of pandemics, and evaluating the potential effects of planned interventions. Phylogenetic inference is based on genetic data such as sampled DNA sequences from infected hosts. Current models using such data to infer information about the past often require simplifying assumptions about the population size, e.g., that it is constant or subject to pure exponential growth. Epidemiologists, on the other hand, fit their models to prevalence or incidence data. Standard epidemiological models are described by sets of ordinary differential equations tracking the (often non-linear) changes in numbers of susceptible and infected individuals. Consequently, the simple prior assumptions for the population sizes (of infected individuals) used in phylogenetics appear inadequate from an ecological perspective. Epidemiological models play a major role in deciding which measures of disease control are taken to avoid or stop viral outbreaks. The effects of isolation, vaccination and other measures are estimated through model simulations, serving as a basis for decisions on which public health policies to institute and actions to take. However, knowledge of the phylogenetic history of viral outbreaks can be vital in reconstructing transmission pathways, which contributes to effective management and future prevention efforts (e.g., Cottam et al., 2008).
The epidemiological and ecological processes determining the diversity of fast-evolving RNA viruses act on the same time scale as that on which mutations arise and are fixed in the population (Holmes, 2004). This implies that genetic sequence data can provide independent evidence on transmission histories. Whereas epidemiological data typically provide information about who was infected and when, they generally do not provide positive evidence about transmission history. Thus the combination of these sources of information should open the way to more detailed epidemiological inference, including Bayesian estimation of contact networks and transmission histories (Welch et al., 2011). Standard epidemiological models are based on flux between host compartments dividing the host population, e.g., into susceptible (S), infected (I) and recovered or removed (R) individuals. Standard models are termed SI, SIS and SIR. The choice of model is based on the characteristics of the considered disease: the existence of a latent period, immunity after infection, et cetera (see Box 2) (Anderson and May, 1991; Keeling and Rohani, 2008). Restricting the focus to the time evolution of the number of individuals in each compartment, these models grasp the overall progress of an epidemic. Certain disease characteristics require adaptations or extensions of standard models, for example the inclusion of asymptomatic infections, which accounts for a sampling bias towards symptomatic infections in case the virus of interest does not always cause noticeable symptoms (e.g., Aguas et al., 2008). An important threshold ratio is the basic reproduction ratio R_0, the expected number of secondary infections caused by one primary infection in a completely susceptible population (Diekmann et al., 1990). Based on its value epidemiologists make predictions on the effect of the disease.
In classical deterministic epidemiological models, if the basic reproduction ratio is larger than one, an epidemic is expected. Box 2: Compartmental models for infectious diseases (Keeling and Rohani, 2008). Let S, E, I and R be the fractions of susceptible, exposed, infected and recovered/removed individuals in the host population. The left-hand side of each equation block gives the model equations, the right-hand side the (non-trivial) endemic equilibria, which are only obtainable for R_0 > 1. The basic reproduction ratio R_0 depends on the corresponding model. Apart from the SI model, the overall population is assumed to be constant, such that the sum of the fractions for each model equals one. Under the assumption of homogeneous mixing in the population the transmission term βSI can be derived, which determines the total rate of new infections. SI model: fatal infections, eventually killing the infected, can be modeled with only two compartments, susceptible and infected; assume a fixed birth rate ν and death rate μ. SIR model: transmission of the disease to susceptibles leads to a period of illness until recovery, which in turn implies immunity. Demography is described by the birth and death rate μ, and recovery is obtained at rate γ; its reciprocal 1/γ is the mean infectious period. Here, R_0 = β/(μ + γ). The last equation is redundant since S + I + R = 1. SIS model: instead, after infection the individuals go back to the susceptible stage; therefore the disease can persist even without including newborns in the population. Ignoring demography, the dynamics are characterized by the coupled differential equations Ṡ = γI − βSI and İ = βSI − γI. Since S = 1 − I, they can be replaced by one equation. SEIR model: in order to account for a latent period with assumed average duration 1/σ, the SIR model can be extended by including exposed individuals composing a fraction E of the population. Exposed individuals are infected, but not yet infectious.
The differential equations for S (and R) are as in the SIR model; the dynamics in E and I are described by Ė = βSI − (σ + μ)E and İ = σE − (γ + μ)I. Further models are SIRS, SEIS, MSIR, MSEIR, MSEIRS, etc., where M denotes passively immune infants, allowing for diseases where an individual can be born with a passive immunity inherited from its mother. Typically, epidemiologists fit a suitable set of deterministic differential equations to empirical data, often the number of infections or related hospitalizations in a population. Consequently, the model can be used to estimate whether an epidemic can be kept under control by measures such as (i) vaccination and (ii) antiviral prophylaxis for susceptible individuals, (iii) treatment of infected individuals, or (iv) isolation of infected individuals from susceptible individuals. Decisions on public health policies are often based on these estimates. The simplest epidemiological models assume homogeneous mixing within a population. In many cases this assumption is not valid. Due to host contact dynamics, viral infections spread easily within social units such as schools, cities and farms, less so among them. Integration of population structure is therefore essential. However, even within subpopulations individual dynamics might differ stochastically (see Fig. 4). Such randomness can be accounted for by considering stochastic models (see, e.g., the survey by Britton, 2010). Before introducing stochastic compartmental models thoroughly, we illustrate them based on a stochastic SIR model simulation. We simulate the spread of a virus strain in a population divided into N subpopulations which are connected by comparatively rare migration events. Let L = {0, …, N − 1} denote the set of locations. A single infected individual initiates the epidemic in one of the N completely susceptible populations. After an exponentially distributed waiting time one of the following events happens: infection at mass-action infection rate β; migration at migration rate m_ik for i, k ∈ L.
Birth of a susceptible individual at rate μ; death of an individual at rate μ. Fig. 5 shows a realization of the simulated dynamics for N = 3 populations. The epidemic starts in population 1 (blue) and many individuals get infected before the first individuals in population 2 (yellow) and eventually population 3 (red) get infected. Let S_k, I_k and R_k be the fractions of individuals in each subpopulation k ∈ L. The sum S_k + I_k + R_k equals one for every k ∈ L. The deterministic analogue of our model can be described with the following differential equations:

Ṡ_k = μ − βS_k I_k − μS_k + Σ_{j≠k} (m_{jk} S_j − m_{kj} S_k)
İ_k = βS_k I_k − (γ + μ)I_k + Σ_{j≠k} (m_{jk} I_j − m_{kj} I_k)
Ṙ_k = γI_k − μR_k + Σ_{j≠k} (m_{jk} R_j − m_{kj} R_k)

However, it is important to realize that this set of differential equations cannot capture all of the behaviors of its stochastic counterpart. In fact, starting from a deterministic representation like this, there are multiple stochastic Markov processes that exhibit the same deterministic limit but can potentially have exponentially different behavior in their stochastic properties, such as the time to extinction. Formally, two distinct sources of variance can be considered in stochastic models of populations (Engen et al., 1998). The first is environmental stochasticity, which is often modeled by admitting temporal variation in the parameters of the population model. The second is demographic stochasticity, which describes the fluctuations in populations of finite size due to the inherent unpredictability of individual outcomes. To model demographic stochasticity (also known as internal stochasticity; Chen and Bokka, 2005) in the absence of environmental (external) stochasticity, the time evolution of an epidemic can be represented by a jump process and its corresponding master equation (Gardiner, 2009). The master equation describes the time evolution of the probability distribution over the discrete state space.
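The deterministic deme equations can be sketched numerically. The following minimal illustration (not from the source) integrates a three-deme SIR analogue by forward Euler, simplifying the migration matrix to a single symmetric rate m between all pairs of demes; all parameter values are invented for illustration.

```python
# Sketch: forward-Euler integration of a three-deme deterministic SIR
# model with symmetric migration at rate m between every pair of demes.
# Parameters (beta, gamma, mu, m, i0) are illustrative assumptions.

def deme_sir(beta=0.3, gamma=0.1, mu=0.0, m=0.001, i0=1e-3,
             n_demes=3, dt=0.01, days=600):
    s = [1.0 - i0] + [1.0] * (n_demes - 1)   # epidemic seeded in deme 0
    i = [i0] + [0.0] * (n_demes - 1)
    r = [0.0] * n_demes
    for _ in range(int(days / dt)):
        ds, di, dr = [], [], []
        for k in range(n_demes):
            # net migration into deme k (symmetric rates cancel globally)
            mig_s = m * sum(s[j] - s[k] for j in range(n_demes) if j != k)
            mig_i = m * sum(i[j] - i[k] for j in range(n_demes) if j != k)
            mig_r = m * sum(r[j] - r[k] for j in range(n_demes) if j != k)
            ds.append(mu - beta * s[k] * i[k] - mu * s[k] + mig_s)
            di.append(beta * s[k] * i[k] - (gamma + mu) * i[k] + mig_i)
            dr.append(gamma * i[k] - mu * r[k] + mig_r)
        for k in range(n_demes):
            s[k] += ds[k] * dt
            i[k] += di[k] * dt
            r[k] += dr[k] * dt
    return s, i, r

s, i, r = deme_sir()
print([round(x, 3) for x in r])  # recovered fraction per deme
```

With these invented rates (R_0 = β/γ = 3), rare migration eventually seeds every deme, so the deterministic system always produces an epidemic in all demes; as the text notes, the stochastic counterpart need not.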
For the closed SIR model (Kermack and McKendrick, 1927) the master equation for the numbers of individuals in each of the three compartments (n_S, n_I, n_R) is:

Ṗ_{n_S, n_I, n_R}(t) = β(n_S + 1)(n_I − 1) P_{n_S+1, n_I−1, n_R}(t) + γ(n_I + 1) P_{n_S, n_I+1, n_R−1}(t) − (βn_S n_I + γn_I) P_{n_S, n_I, n_R}(t)    (4)

A single realization of this epidemic jump process is described by a sequence of timed transition events (individual infection or recovery events). In the closed SIR model, the waiting or sojourn time between a pair of sequential events is exponentially distributed (i.e., the transition process is memoryless), and thus the process is a continuous-time Markov process. Stochastic models of this form can also be viewed in terms of their reaction kinetics. For the closed stochastic SIR model above, the two 'reactions' are infection and recovery, indicating that a susceptible contacts an infectious individual and gets infected at reaction rate β, whereas an infected individual recovers at reaction rate γ. More precisely, the times (τ) an individual spends in the susceptible and infected compartments are exponentially distributed with rates βn_I and γ, respectively. It is the binary infection reaction that leads to the non-linear dynamics of the system. For stochastic models, R_0 > 1 does not necessarily imply an outbreak of the disease. Instead, a higher basic reproduction ratio suggests a higher probability of an outbreak, but the precise relationship depends on the specific model considered and the initial condition. Algorithms have been developed that allow exact and approximate simulation of coupled reactions such as the closed SIR (Bartlett, 1957; Gillespie, 1976, 2001). Fig. 6 shows simulated viral outbreaks under a stochastic SIR and SIS model with R_0 ≈ 2.3 in a population divided into three distinct subpopulations. Note that there is no outbreak in (3) although R_0 > 1.
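The two reactions of the closed stochastic SIR model map directly onto Gillespie's direct method. A minimal sketch, with all parameter values chosen for illustration rather than taken from the source:

```python
# Sketch of the Gillespie (1976) direct method for the closed stochastic
# SIR model with two reactions: infection (propensity beta*nS*nI) and
# recovery (propensity gamma*nI). Parameters are illustrative assumptions.
import random

def gillespie_sir(n=1000, i0=5, beta=0.0003, gamma=0.1, rng=None):
    rng = rng or random.Random(42)        # seeded for reproducibility
    n_s, n_i, n_r, t = n - i0, i0, 0, 0.0
    while n_i > 0:                        # process ends at disease extinction
        a_inf = beta * n_s * n_i          # infection propensity
        a_rec = gamma * n_i               # recovery propensity
        a_tot = a_inf + a_rec
        t += rng.expovariate(a_tot)       # exponential sojourn time
        if rng.random() < a_inf / a_tot:  # choose which reaction fires
            n_s, n_i = n_s - 1, n_i + 1
        else:
            n_i, n_r = n_i - 1, n_r + 1
    return n_s, n_i, n_r, t

n_s, n_i, n_r, t_end = gillespie_sir()
print(n_s, n_i, n_r)  # final state once n_I has hit zero
```

Here R_0 = βn/γ = 3, yet any single run may still go extinct early, illustrating the point above that R_0 > 1 only makes an outbreak probable, not certain.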
Deterministic epidemic models can be derived from the underlying jump process and can represent useful macroscopic laws of motion in the appropriate limit. However, such approaches are not adequate for modeling systems in which small numbers of individuals are frequently involved. For a similar reason, it is awkward to reconcile large-limit deterministic models with the small sample genealogies that are obtained with molecular phylogenetic approaches. Therefore, stochastic continuous-time discrete-state formulations of epidemic models may be better suited to forming connections between the two disciplines. The forward simulations of a stochastic epidemic model introduced with Fig. 5 demonstrate the relationship between epidemic models and genealogies. Knowing the exact parameters and resulting dynamics throughout the simulated outbreak, we can build a full transmission history for the outbreak (which is not unique given only the time evolution of the number of infected individuals, since at each event the infected individuals involved are chosen randomly). An infection event in the forward simulation corresponds to a bifurcation in the transmission tree. Restricting the full tree to a "sample genealogy" that only includes the individuals that were infectious at a specific sampling time yields very different results for different times during the outbreak, which underlines the importance of sampling methods (see, e.g., Stack et al., 2010). As we can see in the simulations, virus transmission often depends on spatial structure. The interaction among humans living in the same city, for example, differs from among-city interaction, which is important whenever viral transmission exceeds city borders. There are many other social and spatial units this concept applies to: households, schools, or on a larger scale, regions, countries and continents.
In fact, most phylogenetic and epidemiological studies model the dynamics of spatially distributed systems, although many of them ignore spatial structure for the sake of simplicity. Durrett and Levin demonstrate that models ignoring spatial structure yield qualitatively different results than spatial models (Durrett and Levin, 1994). Phylodynamics is a term used to describe a synthetic approach to the study of rapidly evolving infectious agents that considers the action (and interaction) of both evolutionary and ecological processes. The term phylodynamics was introduced by Grenfell et al. (2004) to describe the "melding of immunodynamics, epidemiology, and evolutionary biology" that is required to analyse the interacting evolutionary and ecological processes, especially of rapidly evolving viruses for which both processes act on the same time scale. Two distinct pursuits have been labeled phylodynamics by recent studies. The first relies on the idea that ecological processes and population dynamics can effectively be tracked by neutral genetic variation, such that past ecological and population events are "imprinted" in genetic variation within populations and can be reconstructed along with the reconstruction of evolutionary history. The idea is sound for truly neutral variation, but the compact genomes of rapidly evolving viruses are not simple recording devices. Instead they are packed with functional information, and mutations play an active role in population and ecological processes through the action of Darwinian selection. Hence, the more challenging second phylodynamic pursuit is the analysis of the inevitable interaction of evolutionary and ecological processes, which requires the joint analysis of both. We will call the former pursuit phylogenetic epidemiology, and reserve the term phylodynamics for approaches that aspire to model the interaction of ecological and evolutionary processes.
The effects of novel mutations on population dynamics through their interaction with the immune system or antiviral drugs are examples of phylodynamics in this stricter sense. The focus of many studies aspiring to combine population genetic and epidemiological approaches is the basic reproduction ratio R_0, estimates of which are used to develop containment strategies for emerging pandemics. Such estimates can be obtained from phylogenetic analysis, e.g., through estimating population growth rates. Another popular way to infer population dynamic information from genomic data is the application of parametric and non-parametric coalescent models (Strimmer and Pybus, 2001; Drummond et al., 2005; Minin et al., 2008). Phylogenetic methods can be used to estimate R_0, which can then be used to investigate transmission patterns and the number of generations of transmission. Depending on the distribution of the generation time (i.e., the duration of infectiousness), the relationship between R_0 and the growth rate r of the population can be used to compute the basic reproduction number (Wallinga and Lipsitch, 2007). Since little is known about generation time distributions, the usual approach is to fit the epidemic models to the observed data. Wallinga and Lipsitch list the resulting equations for R_0 for exponential, normal, or delta distributions of the generation time. They show that without knowledge of the generation time distribution an upper bound for the reproductive number can still be estimated. Others obtain R_0 estimates based on coalescent theory; for example, Rodrigo et al. (1999) estimated it in vivo for HIV-1. In a recent study on the influenza A (H1N1) outbreak in 2009, both epidemiological and Bayesian coalescent approaches for the computation of R_0 were applied (Fraser et al., 2009). Whereas the epidemic approaches gave estimates of 1.4-1.6 for R_0, the Bayesian coalescent approach yielded a posterior median of 1.22.
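The Wallinga and Lipsitch relationships can be illustrated concretely. They follow from the general result R_0 = 1/M(−r), where M is the moment generating function of the generation-time distribution; the numeric values below (growth rate r, mean generation time, standard deviation) are invented for illustration.

```python
# Sketch of R0 computed from an observed exponential growth rate r under
# different assumed generation-time distributions (Wallinga & Lipsitch,
# 2007): R0 = 1 / M(-r), with M the moment generating function.
# The numeric inputs are illustrative assumptions, not outbreak estimates.
import math

def r0_exponential(r, tg):
    # Exponentially distributed generation time with mean tg.
    return 1.0 + r * tg

def r0_delta(r, tg):
    # Fixed (delta-distributed) generation time: upper bound exp(r*tg).
    return math.exp(r * tg)

def r0_normal(r, tg, sd):
    # Normally distributed generation time with mean tg, std dev sd.
    return math.exp(r * tg - 0.5 * (r * sd) ** 2)

r, tg, sd = 0.14, 3.0, 1.0   # per-day growth rate; days
print(r0_exponential(r, tg), r0_normal(r, tg, sd), r0_delta(r, tg))
```

For the same r and mean generation time, the exponential assumption gives the smallest estimate and the fixed generation time the largest, which is the upper-bound property mentioned above.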
All estimates are larger than one, correctly indicating that the virus spreads successfully rather than dying out. However, an age-dependent heterogeneous epidemic model best fits the data and results in an estimate of R_0 = 1.58. Structures determining host interaction are often modeled as contact networks (Welch et al., 2011). The transmission of foot-and-mouth disease virus is highly dependent on the interaction among farms, and the detection of infected farms is essential. A plausible approach is to consider each farm as an individual in a contact network. Through phylogenetic analysis of consensus sequences (one sequence for each farm), contacts between farms can be traced in order to find infected but non-detected farms (Cottam et al., 2008). Changes in effective population size estimated through phylogenetic analyses can indicate past changes in population size. Therefore, many recent studies infer the demographic history of a virus using Bayesian skyline plot models (Drummond et al., 2005). For example, Siebenga et al. (2010) are interested in the epidemic expansion of norovirus GII.4, which they investigate by reconstructing the changes in population structure using Bayesian skyline plots. Similarly, Hughes et al. (2009) explore the heterosexual HIV epidemic in the UK. Analyses of the genomic and epidemiological dynamics of human influenza A virus explore the sink-source theory and investigate the spatial connections of a seasonal global epidemic (Rambaut et al., 2008; Lemey et al., 2009b; Bedford et al., 2010). Coalescent theory has also been adapted to fit an epidemic SIR model to sequence data (Volz et al., 2009). Frost and Volz (2010) provide an overview of how the appropriate interpretation of coalescent rates differs among the different population dynamic approaches it is being used with.
Interpretation of coalescent-based skyline plots must be made with caution. As opposed to generation times referring to durations of infection in epidemiological theory, for coalescent approaches applied to infectious diseases the generation times usually describe times between transmission events. Accordingly, although prevalence does affect phylogenetic reconstruction through sampling, the population dynamic patterns are mainly determined by incidence (Frost and Volz, 2010). One early attempt to integrate dynamical and population genetic models used coupled differential equations and Markov chain theory to model the within-host time evolution of viral genetic diversity under basic dynamic models of a persistent infection (Kelly et al., 2003). The main focus was the impact of the dynamical model on the variance in the number of replication cycles, as this is a key determinant of the rate of genetic divergence and thus the potential for adaptation. Interestingly, the model reveals that multiple cell-type infections can decrease viral evolutionary rates and increase the likelihood of persistent infection. Genetic diversity within hosts is closely related to between-host dynamics: Gordo and Campos (2007) develop structured population genetic models, explicitly incorporating epidemiological parameters, to analyze the relationship between genetic variability and epidemiological factors. A simple SIS model is simulated based on two different models of host contact structure, the island model and a scale-free contact network. For low clearance rates and low intra-host effective population size, levels of genetic variability turn out to be maximal when transmission levels are intermediate, independent of the host population structure. In a scale-free contact network the population consists of many low-connectivity hosts and very few high-connectivity hosts, a common pattern for sexually transmitted diseases (e.g., Lloyd and May, 2001; Liljeros et al., 2001).
In this setting genetic variation appears to be lower in highly connected than in weakly connected hosts. With their study, Gordo and Campos (2007) underline that an integration of population genetics and epidemiology can have important implications for public health policies. In a deterministic framework, Day and Gandon (2007) model the interaction of evolutionary and ecological processes by coupling SIS host dynamics with viral evolution. The interaction of evolution and ecology is incorporated through the fitness of each virus strain. For strain i they define a fitness r_i = β_i N_S − μ − ν_i − γ, where β_i is the strain-specific transmission rate per susceptible, ν_i is the strain-specific virulence (determining the increase in mortality rate due to infection), μ is the baseline mortality rate and γ is the recovery rate. The evolutionary dynamics of strain frequencies are tracked quantitatively and are intimately linked with the overall infection dynamics of the host population via the strain-specific virulence and transmission rates. Their analysis provides insight into the mechanistic laws of motion connecting genetic evolution with the evolution of virulence and transmission rates. An exceptional feature of influenza viruses is their limited genetic diversity, which appears to contradict the viruses' high mutation rate. Integrating single virus strain features and host immunity into a stochastic transmission model, Ferguson et al. (2003) search for an explanation of this. Although epidemiological factors play a role in limiting influenza diversity, strain-transcendent immunity must be relevant as well. Through a phylodynamic analysis of interpandemic influenza in humans, Koelle et al. (2006) underline the importance of the viral structure for antigenicity and the immune recognition dynamics of influenza epitopes.
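The strain fitness expression can be evaluated directly; the sketch below applies it to two hypothetical strains, with every numerical value (rates, susceptible pool) an invented assumption.

```python
# Illustrative evaluation of the per-strain fitness from the text:
# r_i = beta_i * N_S - mu - v_i - gamma, where beta_i is the strain-specific
# transmission rate per susceptible, v_i the strain-specific virulence,
# mu the baseline mortality rate and gamma the recovery rate.
# All numerical values below are hypothetical, for illustration only.

def strain_fitness(beta_i, v_i, n_s, mu=0.01, gamma=0.1):
    return beta_i * n_s - mu - v_i - gamma

strains = {"A": (0.0004, 0.02), "B": (0.0006, 0.08)}  # (beta_i, v_i)
n_s = 800  # current number of susceptible hosts
for name, (b, v) in strains.items():
    print(name, strain_fitness(b, v, n_s))
```

Because fitness depends on the current number of susceptibles N_S, the ranking of strains can shift as the epidemic depletes the susceptible pool: a higher-virulence, higher-transmission strain can outcompete a milder one early on yet lose its advantage later.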
They consider clusters that contain strains with similar conformations of HA epitopes, such that there is high cross-immunity among strains within each cluster. A genotype-phenotype model that implements neutral networks (the clusters) is coupled with an epidemiological transmission model in which the numbers of susceptible, infected and recovered individuals in each cluster are modeled. Model simulations result in time series of infected cases that agree with the typical annual outbreaks in temperate regions and the empirical dominance of certain antigenic clusters. According to this model, years in which a formerly dominant cluster is replaced by a new one have the highest numbers of infections. In the following year there are particularly few infections, presumably due to higher host immunity caused by the previous year's outbreak. Thereafter follow "average" years until the next cluster transition occurs, i.e., until another cluster becomes dominant again. Another natural explanation of the contradiction between high mutation rates and constant genetic diversity is the fixation of many deleterious mutations that leads to the extinction of the respective strains. Recent population genetic models account for population dynamics, e.g., in order to enhance the understanding of allele fixation processes and the importance of demographic stochasticity (Parsons and Quince, 2007; Champagnat and Lambert, 2007; Parsons et al., 2010).

[Fig. 6. Simulated viral outbreaks under stochastic SIR (1-3) and SIS (4) models among three populations (denoted by blue, yellow and red curves). The initial condition is a single infected individual in the blue population. In (3) the disease does not break out (numbers of susceptibles in dotted lines, infected in solid lines).]
Structured models do not only allow for more realistic dynamics; they can also bridge the gap to phylogenetic and phylogeographic methods, since most of them are sample-based, ideally with each sample representing one infected individual. Modeling coupled host-virus dynamics, Welch et al. (2005) embed an epidemic population model into a branching and coalescent structure, producing a scaled coalescent process that describes the inter-host dynamics given a virus sample genealogy. Their simulations show that, for large sample sizes, the model provides accurate estimates of the contact rate and the selection parameter. Overall, phylodynamic methods have been developed and proven useful for the analysis of various viruses. However, phylogenetic reconstruction is still quite restricted by coalescent assumptions. An alternative to the coalescent for cases in which sample sizes are large compared to the overall population is the birth-death model with incomplete sampling (Gernhard, 2008; Stadler, 2009), and this framework has recently been extended to include heterochronous data (Stadler, 2010), opening the way for an alternative approach to phylodynamic inference from time-stamped virus data. Bayesian phylogenetic inference has led to an explosion of analyses of rapidly evolving viruses in recent years. While this explosion has been fruitful in elucidating the manifold variation in origin, transmission routes and evolutionary rates underlying the present diversity of infectious agents, there is a nascent field that promises to extend the conceptual reach of molecular sequence data through a unification of phylogenetics and mathematical epidemiology.
This new field of phylodynamics encompasses both the inference of classical epidemiological parameters using phylogenetics and exciting new approaches that aim to investigate the consequences of the inevitable interaction between evolutionary (mutation, drift, Darwinian selection) and ecological (population dynamics and ecological stochasticity) processes. The research being pursued has broader consequences for evolutionary biology and molecular ecology. This interaction of evolution and ecology will occur whenever a population contains genotypes with different intrinsic dynamical properties (e.g., virulence, transmission rates, recovery rates). Whereas this condition is almost always met in real populations, and is frequently definitive in its role in shaping outcomes, the mathematical and theoretical analysis of Darwinian selection within epidemiological models is the most challenging and least studied area within the emerging field of phylodynamics. It is thus ripe for future research. In the meantime, it is likely that phylodynamic research will rapidly develop new methods for statistical phylogeography and structured population dynamics.
key: cord-325321-37kyd8ak authors: iftikhar, h.; iftikhar, m. title: forecasting daily covid-19 confirmed, deaths and recovered cases using univariate time series models: a case of pakistan study date: 2020-09-22 journal: nan doi: 10.1101/2020.09.20.20198150 sha: doc_id: 325321 cord_uid: 37kyd8ak the increasing confirmed cases and death counts of coronavirus disease 2019 (covid-19) in pakistan have disturbed not only the health sector, but also all other sectors of the country.
for precise policy making, accurate and efficient forecasts of confirmed cases and death counts are important. in this work, we used five different univariate time series models: autoregressive (ar), moving average (ma), autoregressive moving average (arma), nonparametric autoregressive (npar) and simple exponential smoothing (ses) models, for forecasting confirmed, death and recovered cases. these models were applied to pakistan covid-19 data, covering the period from 10 march to 3 july 2020. to evaluate model accuracy, we computed two standard error measures: mean absolute error (mae) and root mean square error (rmse). the findings show that time series models are useful in predicting covid-19 confirmed, deaths and recovered cases. furthermore, the ma model outperformed all other models for predicting confirmed and death counts, while arma was the second best. the ses model seems superior to the other models for predicting recovered counts, though ma is competitive. on the basis of the best selected models, we forecast from 4th july to 14th august 2020, which will be helpful for decision making in public health and other sectors of pakistan. one study examined the spread of covid-19 using the case of malaysia and scrutinized its linkage with some external factors, e.g. inadequate medical resources and incorrect diagnosis problems; the authors used an epidemiological model and dynamical-systems techniques and observed that such complexities might misrepresent the assessment of the severity of covid-19. to produce forecasts in agreement with the publicly available data, the work in [21] used a fractional time delay dynamic system (ftdd). the author in [22] used a generalized logistic model and found that the pandemic growth in china was exponential in nature. the author in [23] used genetic programming (gp) models for confirmed cases and death cases in three highly covid-19-affected states of india, i.e. maharashtra, gujarat and delhi, and for india as a whole.
they statistically validated the evolved models, finding that the proposed gep-based models use simple interactive functions and can be relied upon for time series forecasting of covid-19 cases in the context of india. based on the spreading behaviour of covid-19 in the population, [24] estimated three novel quarantine epidemic models. they found that isolation at home and quarantine in hospitals are the two most effective control strategies under the current circumstances, when the disease has no known available treatment. the work in [25], using positive cases over 50 days of disease progression in pakistan, analysed the graphical trend and used exponential growth to forecast the behaviour of disease progression for the next 30 days. they assumed different possible trajectories and projected an estimated 20k-456k positive cases within 80 days of disease spread in pakistan. due to the mutating nature of the virus, the situation has become graver, with little known about the cure, and there remains great uncertainty about the probable timeline of this disease. hence, short-term forecasting is immensely important for predicting the flattening of the curve and the revival of routine social and economic life [26]. statistical models using evidence from real-world data can help predict the location, timing, and size of outbreaks, allowing governments to allocate resources more effectively, to conduct scenario and signal analysis, and to determine policy approaches. epidemiological tools can then be applied to limit the scope and spread of outbreaks. however, these approaches are sensitive to the underlying assumptions, and hence their impact varies [27]. it is important to ensure oversight, to check the assumptions in modelling, and to ensure the veracity, reliability, and accountability of these tools in order to address bias and other potential harms.
in this work, we attempt to look at the projections for covid-19 infections in pakistan using a number of different univariate time series methods. the rest of the article is arranged as follows: section two describes the forecasting models and section three discusses the out-of-sample and forecasting results. finally, section four comprises the conclusion and discussion. it is made available under a cc-by 4.0 international license; the copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. this version posted september 22, 2020. https://doi.org/10.1101/2020.09.20.20198150 in this work, we consider five different univariate time series models: autoregressive (ar), moving average (ma), autoregressive moving average (arma), nonparametric autoregressive (npar) and simple exponential smoothing (ses). these models are described in detail in the following. a linear autoregressive (ar) process describes m_t as a linear function of the previous n observations and is defined as

m_t = \alpha + \sum_{i=1}^{n} \gamma_i m_{t-i} + \epsilon_t,

where \alpha and \gamma_i (i = 1, 2, ..., n) are the intercept and slope coefficients of the underlying ar process and \epsilon_t is the disturbance term. after a graphical examination (plotting the series residuals, acf and pacf), we fit an ar(2) model to each time series m_t. the moving average (ma) model primarily removes the periodic fluctuations in the time series data, for example fluctuations due to seasonality. the moving average model can mathematically be written as:
the additive nonparametric counterpart of ar process leads to additive model, where the association between m t , and its previous lags have non-liner relationship, which may be describe as: where g i are showing smoothing functions and describe the association between m t and its previous values. in the recent case, functions g i are denoted by cubic regression splines. as in case of parametric form, we utilized 2 lags while estimating npar. autoregressive moving average (arma) model can be define as, the response variable m t is regressed on the previous n lags also with residuals (errors) as well. mathematically, where α denotes intercept, γ i (i = 1, 2, · · · , n) and φ k (k = 1, 2, ·, m) are the parameters of ar and ma process respectively, and t is a gaussian white noise series with mean zero and variance σ 2 . the arma model order selection is established through inspecting the correlograms (i.e. partial and auto-correlation function (p-acf)). in our case, fit an arma (1, 1) model to each series m t . . cc-by 4.0 international license it is made available under a is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint the copyright holder for this this version posted september 22, 2020. . https://doi.org/10.1101/2020.09.20.20198150 doi: medrxiv preprint the simple exponential smoothing (ses) model of forecasting allows the researchers to smooth the time series data and then use it for out of sample forecasting. ses model is applicable when the data is stationary i.e., no trend and no seasonal pattern but the data at level changing gradually over time. where γ 1 is the smoothing constant, m t is showing the actual series,m t,k is representing the forecasted value of the underlying series for period t andm t+1,k is denoting the forecasted value for the period t + 1. 
this method assigns the weights in such a way that moving back from the recent value, the weights exponentially decreases. for the modelling purpose, a prime assumption of time series data is stationarity. a cases stationary process is defined as that the mean, variance and autocorrelation structure are time invariant. if the underlying series is nonstationary, it must be transform to stationary. in the literature, different techniques are used to achieve stationarity, for example, taking natural log, differencing the series or box-cox transformation etc [28] . in this work, the covid-19 confirmed, deaths and recovered counts times series are plotted in figure 1 (left-column) daily and figure 1 (right-column) cumulative cases. clearly seen, all the three daily time series having an upward increasing linear trend, which show that the series is non-stationery, hence need to make stationary using differencing method. also, to check the unit root issue of the underlying series that are conformed, deaths and recovered cases, we apply augmented dickey fuller test (adf) test. the results are tabulated in table1, which suggested that the all three series are non-stationary at level. however, taking first order difference, the series are turned out to be stationary. the first order differencing series of daily confirmed, deaths and recovered cases are piloted in figure 2 , where now the series do not contain any trend, hence its become stationary. . cc-by 4.0 international license it is made available under a is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint in this paper, we used daily covid-19 conformed, deaths, and recovered cases for pakistan. the dataset was obtained by who[3], the each series ranges from 10, march 2020 to 3, july 2020. 
the complete dataset covers 116 days, of which data from 10, march 2020 to 19, may 2020 (71 days) were used for model training and from 21, may to 3, july 2020 (45 days) for one-day ahead post-sample (testing) predictions. for the predicting accuracy, two accuracy measures, root mean square error (rmse) and 7 . cc-by 4.0 international license it is made available under a is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. the copyright holder for this this version posted september 22, 2020. . mean absolute error (mae) for each model were computed as follows: where m t = observed andm t = predicted values for t th day (t: 1, 2, · · · , 45). to evaluate the best model of among the previously described models for each series, we computed two standard accuracy measures and presented the outcomes in table 2 figure 3 where the superiority of ma (confirmed and deaths cases)and ses (recovered cases) models can be evidently seen in both cases training and testing exercise. the day-specific confirmed, deaths and recovered case are plotted in figure 4 , over the period of 21, march to 19, june 2020. from the figure 4 (left-column) can be observed that variation among the different weeks, while figure 4 (right-column) mean of days are plotted for conformed, deaths, and recovered cases. where clearly seen that the an increasing pattern saturday to friday, which is show that the effect of working and non-working days. once the best models assessed through the out-of-sample mean errors (rmse, mae), then we proceed for future forecasting with the superior model in each case. we used ma for confirmed and deaths cases and ses for recovered cases and forecast from 4, july to 14, august 2020 for both daily and cumulative cases. the forecasted values are seen in figures 5, clearly revealing that deaths and recovered cases are monotonically increasing, while conformed counts are not. 
the confirmed cases on 14 august 2020 are expected to be 7,325, with 413,639 cumulative cases; deaths by mid august are expected to be 121, with cumulative counts of 9,279; and the recovered cases are expected to be 10,730, with 455,661 cumulative. overall, the results suggest that the increase in confirmed cases is gradually slowing, which was the outcome of steps imposed earlier by the government: cancelled conferences, disrupted supply chains, travel restrictions, the closing of borders (which tremendously affected the travel industry), suspended flights, disrupted work within the country, and the closing of shopping malls, schools, colleges and universities. for public awareness, various tv programmes, commercials and advertisements were organized, and face masks and sanitizer were used by everyone. the main purpose of this work was to forecast confirmed, deaths and recovered cases of covid-19 for pakistan using five different univariate time series models: autoregressive (ar), moving average (ma), autoregressive moving average (arma), nonparametric autoregressive (npar) and simple exponential smoothing (ses) models. the dataset of confirmed, deaths and recovered cases, ranging from 10 march to 3 july 2020, was used. data from 10 march 2020 to 19 may 2020 were used for model estimation/training, and data from 20 may to 3 july 2020 were used for one-day-ahead out-of-sample predictions. to check the predictive performance of all models, we used rmse and mae as mean errors. the ma model beat all other models for predicting confirmed and death counts, and ses appears superior to the other models for predicting recovered cases.
finally, on the basis of these best models, we forecast the period 4 july to 14 august 2020, which can help decision making in public health and other sectors for the entire country. furthermore, this work may help in addressing the present socio-economic and psychosocial distress caused by covid-19 amongst the public in pakistan.
coronavirus infections - more than just the common cold
dyngen: a multi-modal simulator for spearheading new single-cell omics analyses
short-term forecasting covid-19 cumulative confirmed cases: perspectives for brazil
predicting the evolution and control of the covid-19 pandemic in portugal
day level forecasting for coronavirus disease (covid-19) spread: analysis, modeling and recommendations
real-time forecast of final outbreak size of novel coronavirus (covid-19) in pakistan: a data-driven analysis
using the kalman filter with arima for the covid-19 pandemic dataset of pakistan
epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in wuhan, china: a descriptive study
statistical analysis of forecasting covid-19 for upcoming month in pakistan
predicting optimal lockdown period with parametric approach using three-phase maturation sird model for covid-19 pandemic
coronavirus pandemic: a predictive analysis of the peak
outbreak epidemic in south africa, turkey, and brazil
the impact of covid-19 pandemic upon stability and sequential irregularity of equity and cryptocurrency markets
artificial intelligence forecasting of covid-19 in china
data-based analysis, modelling and forecasting of the covid-19 outbreak
automated detection of covid-19 cases using deep neural networks with x-ray images
unravelling the myths of r0 in controlling the dynamics of covid-19 outbreak: a modelling perspective
the reconstruction and prediction algorithm of the fractional tdd for the local outbreak of covid-19
real-time forecasts of the covid-19 epidemic in china from
time series analysis and forecast of the covid-19 pandemic in india using genetic programming
prolonged presence of sars-cov-2 viral rna in faecal samples
forecasting unusual trend of covid-19 progression in pakistan
forecasting the novel coronavirus covid-19
covid-19 diagnostics, tools, and prevention
introduction to time series and forecasting
key: cord-326280-kjjljbl5 authors: abdo, mohammed s.; panchal, satish k.; shah, kamal; abdeljawad, thabet title: existence theory and numerical analysis of three species prey–predator model under mittag-leffler power law date: 2020-05-27 journal: adv differ equ doi: 10.1186/s13662-020-02709-7 sha: doc_id: 326280 cord_uid: kjjljbl5 in this manuscript, the fractional atangana–baleanu–caputo model of prey and predator is studied theoretically and numerically. the existence and ulam–hyers stability results are obtained by applying fixed point theory and nonlinear analysis. approximate solutions of the considered model are discussed via the fractional adams-bashforth method. moreover, the behavior of the solution to the given model is explained by graphical representations through numerical simulations. the obtained results play an important role in developing the theory of the fractional analytical dynamics of many biological systems.
a predator-prey model is a two-component system, where one species lives at the expense of the other. a diversity of mathematical techniques is applied in modeling a predator-prey system due to the numerous factors that may affect its evolution. in this regard, some models have been introduced in [1-7], of which the first, which accounts in a simplified way only for the substantial phenomena (gluttony and fertility), is of the type

p'(t) = p(t)[a - a_2 q(t)],
q'(t) = q(t)[a_3 p(t) - a_1],

where p(t) and q(t) are the number of prey and the number of predators, respectively, a denotes the fertility rate of the prey, and a_1, a_2 and a_3 are the average death rate of predators, the measure of the tendency of prey to predation, and the predatory capability, respectively. the model (1) has a unique solution. however, the solutions of (1) are not structurally stable with respect to perturbations of the initial conditions. within the restricted scope of quadratic differential equations, those which cover competition as well as predation must be slightly more realistic. a second model, with competition within the prey, is formulated as

p'(t) = p(t)[a_1 - a_2 q(t) + a_5 p(t)],
q'(t) = q(t)[-a_3 + a_4 p(t)],

where a_1 a_3 > a_4 a_5, and a_5 > 0 describes the competition of the prey. according to biologically sensible hypotheses, there exists a unique positive solution of model (2), which is asymptotically stable. dai and zhao investigated the dynamic complexities of a predator-prey model (3) with state-dependent impulsive effects; the authors used the analogue of the poincaré map to obtain the existence and stability results for the model (3). for details, see [8]. the following dynamic model, which addresses the case of predator-prey with disease, was analyzed by das et al. [9]:

dy/dt = (1 - (y + x)/k) r_2 y + a_2 x y - \alpha_2 x (z + w) - m y,

where x(0), y(0), z(0), w(0) > 0. the authors showed that the model is globally stable around the interior equilibrium point under certain standard conditions.
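for intuition, the classical two-species system (1) can be integrated numerically; the sketch below is illustrative only: the parameter values are made up, and a fixed-step fourth-order runge-kutta scheme is used rather than any method from the paper. it shows the familiar positive, oscillatory prey-predator behavior:

```python
# classical two-species prey-predator system integrated with fixed-step rk4
# (illustrative sketch; parameter values are assumptions, not from the paper).

def rhs(p, q, a=1.0, a2=0.5, a3=0.2, a1=0.6):
    dp = p * (a - a2 * q)        # prey: fertility minus predation pressure
    dq = q * (a3 * p - a1)       # predator: conversion of prey minus mortality
    return dp, dq

def rk4_step(p, q, h):
    k1 = rhs(p, q)
    k2 = rhs(p + h / 2 * k1[0], q + h / 2 * k1[1])
    k3 = rhs(p + h / 2 * k2[0], q + h / 2 * k2[1])
    k4 = rhs(p + h * k3[0], q + h * k3[1])
    p += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    q += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return p, q

def simulate(p0, q0, h=0.01, steps=5000):
    traj = [(p0, q0)]
    for _ in range(steps):
        traj.append(rk4_step(*traj[-1], h))
    return traj

traj = simulate(4.0, 2.0)        # both populations stay positive and oscillate
```

a useful sanity check on the integrator is the conserved quantity of this system, v = a_3 p - a_1 ln p + a_2 q - a ln q, which should drift only negligibly along an accurate trajectory.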
so, their analysis shows that the force of infection and the predation rate are the main parameters governing the dynamics of the model. fractional calculus deals with differentiation and integration of fractional order, which is advantageous over the ordinary integer order in the explanation of real-world problems, as well as in the modeling of real phenomena, due to its characterization of memory and hereditary properties [10, 11]. further, the integer-order derivative does not describe the dynamics between two different points. various types of fractional-order or nonlocal derivatives have been proposed in the literature to overcome the limitations of the traditional derivative. for instance, based on a power law, riemann and liouville introduced the idea of a fractional derivative. afterwards, caputo and fabrizio [12] proposed a new fractional derivative utilizing an exponential kernel; this derivative has a few problems related to the locality of its kernel. recently, to overcome the caputo-fabrizio problem, atangana and baleanu (ab) [13] proposed a modified version of the fractional derivative with the aid of the generalized mittag-leffler function (mlf) as a kernel that is nonsingular and nonlocal. since the generalized mlf is used as the kernel, the absence of singularity is guaranteed. furthermore, the ab fractional derivative supplies a description of memory, as discussed in [14-20]. most of the published work describes the mathematical system of predators and prey as a cauchy-type problem for a system of classical differential equations [21-25]. however, recently there has been great interest in studying the behavior of solutions of some biological systems using fractional differential equations involving the atangana-baleanu operator, for the purpose of investigating several real-world systems and modeling infectious diseases; see [26-36]. some fractional-order models have been investigated via the new operators recently. for instance, their use has been suggested for the dynamics of smoking in [32]. along the same lines, the transmission model for the ebola virus together with the ab operator was studied in [31]. a fractional-order model of leptospirosis infection was considered in [26]. the dynamical behavior of a coronavirus (covid-19) epidemic infection model under the abc derivative has been studied in [33]. also, existence results and analytic solutions of the fractional-order dynamics of covid-19 with the abc derivative have been obtained in [34]. there is no literature available on prey-predator fractional models with three species under the aforesaid derivative; only some fractional models have appeared in previous years, and they have been confined to the standard fractional derivative. furthermore, in the presence of the mentioned derivatives, some fruitful results have recently been published in [37-39]. due to the success of this operator in modeling biological systems and infectious diseases, we have studied the dynamical behavior of the mathematical model (4) describing three prey-predator species by the nonlocal atangana-baleanu-caputo (abc) derivative operator with 0 < α ≤ 1, subject to the initial conditions (5), where abc d^α_{0+}(·) is the abc fractional derivative of order α, p_0 is the initial population density of prey, s_0 is the initial population density of susceptible predators, and i_0 is the initial population density of infected predators. here a denotes the saturation constant when susceptible predators threaten the prey, b is the search rate of the prey by a susceptible predator, c is the conversion rate of the susceptible predator due to prey, and d is the disease transmission coefficient. the symbol k represents the carrying capacity of the prey population, the proportionality constant is denoted by b_1, and the growth rate of the prey population is represented by r_1.
some fractional-order models have been investigated via the new operators recently. for instance its use has been suggested for the dynamics of smoking in [32] . along the same line, the transference model for the ebola virus together with ab operator was studied in [31] . a fractional-order model of leptospirosis infection was considered in [26] . the dynamical behavior of coronavirus (covid-19) epidemic infection model through the abc derivative has been studied in [33] . also, the existence results and analytic solutions of fractional-order dynamics of covid-19 with abc derivative has been obtained in [34] . there is no literature available on prey-predator fractional models with three species under the aforesaid derivative. just some fractional models have been found in the previous years; however, they have been confined to a standard fractional derivative. furthermore, in the presence of the mentioned derivatives, recently some fruitful results have been published in [37] [38] [39] . due to the success of this operator in modeling the biological systems and infectious diseases, we have studied the dynamical behavior of the mathematical model which describes three prey-predator species by a nonlocal atangana-baleanu-caputo (abc) derivative operator with 0 < α ≤ 1 as with the initial conditions where abc d α 0 + (·) is the abc fractional derivative of order α, p 0 is the initial population density of prey, s 0 is the initial population density of susceptible predator, and i 0 initial population density infected predator. here a denotes the saturation constant whereas susceptible predators threaten the prey, b is a search rate of the prey across a susceptible predator, c is the conversion rate of the susceptible predator due to prey, and d is the disease transmission coefficient. the symbol k represents the carrying capacity of the prey population, the proportionality constant is denoted by b 1 , the growth rate of the prey population is represented as r 1 . 
in the proposed model, m and n denote the death rates of the susceptible predator and of the infected predator, respectively. further, we remark that the right-hand sides of our considered model (4) under the abc fractional derivative are assumed to vanish at zero (for details, see theorem 3.1 in [28] ). the main aim of the paper is to demonstrate the existence, uniqueness and ulam stability of the solution of the model (4)-(5) by using picard and fixed point techniques. moreover, numerical simulations via the fractional version of the adams-bashforth technique, approximating the abc fractional operator, are performed. graphical presentations of the numerical results are also given. this paper is organized as follows: sect. 1 presents an introduction containing a survey of the literature. section 2 consists of some foundational preliminaries related to fractional calculus and nonlinear analysis. the existence and ulam stability results for the proposed model are obtained in sects. 3 and 4. the numerical solution and numerical simulations of the model at hand are presented in sect. 5. for the subsequent analysis, we need the following notions.

definition 1 ([13] ) let α ∈ (0, 1] and σ ∈ h 1 (0, t). then the left-sided abc fractional derivative with lower limit zero of order α of a function σ is defined by

abc d α 0+ σ(t) = (abc[α]/(1 − α)) ∫ 0 t σ′(s) e α (−α (t − s) α /(1 − α)) ds,

where abc[α] is the normalization function, defined as abc[α] = 1 − α + α/γ(α), 0 < α ≤ 1, which satisfies abc(0) = abc(1) = 1, and e α is the mittag-leffler function defined by the series

e α (z) = ∑ k=0 ∞ z k /γ(αk + 1), re(α) > 0,

where γ(·) is the gamma function. let α ∈ (0, 1] and σ ∈ l 1 (0, t). then the left-sided ab fractional integral with lower limit zero of order α of a function σ is defined by

ab i α 0+ σ(t) = ((1 − α)/abc[α]) σ(t) + (α/(abc[α] γ(α))) ∫ 0 t (t − s) α−1 σ(s) ds.

the laplace transform of the abc derivative is

l[ abc d α 0+ σ(t) ](s) = (abc[α]/(s α (1 − α) + α)) ( s α l[σ(t)](s) − s α−1 σ(0) ).

by lemma 1, the solution of the proposed problem for α ∈ (0, 1] is given by the corresponding ab fractional integral equation. here κ denotes the lipschitz constant of an operator π ; if κ < 1 we say that π is a contraction.
now we address the existence and uniqueness results for the model (4)-(5) by utilizing the fixed point technique. let us first reformulate model (4) in the appropriate compact form (8), with kernels w ℓ (ℓ = 1, 2, 3) collecting the right-hand sides. utilizing lemma 1, the model (8) can be turned into a fractional integral equation in the sense of the ab fractional integral, denoted (10).

theorem 2 the kernels w ℓ (ℓ = 1, 2, 3) satisfy the lipschitz and contraction conditions if there exist constants l ℓ such that 0 ≤ l ℓ < 1, ℓ = 1, 2, 3. for w 1 , let p and p * be two functions; then we obtain a lipschitz bound with constant l 1 := (b 1 + b 1 k (a 1 + a * 1 ) + abc 1 + r 1 d 1 ), where p, p * , s, i are functions bounded by the constants a 1 , a * 1 , c 1 , d 1 , respectively. consequently, the lipschitz condition is verified for w 1 ; besides, w 1 is a contraction since 0 ≤ l 1 < 1. likewise, we can show that w 2 and w 3 satisfy the contraction and lipschitz conditions, with l 2 := ( cb a 1 /(a + a 1 ) + d d 1 + m(c 1 + c * 1 )) and l 3 := (n + d + c k a 1 ).

theorem 3 assume that conditions (11)-(13) hold. then the solution of the fractional model given in (4)-(5) exists and is unique. writing the model (10) in recurrence form with the given initial conditions, the successive difference between the terms is defined in (15). taking the norm of eqs. (15) , it follows from the conditions (11)-(13) that the successive differences contract. considering p, s and i as bounded functions that comply with the lipschitz condition, it follows from eqs. (16) and (17) that the successive approximations converge; this shows the existence of the solutions. moreover, to prove that the limits in eqs. (18) are solutions of the model (4)-(5), we estimate the remainder terms; using recursive techniques we get m 1n (t) → 0 as n → ∞, and in a similar way we conclude that m 2n (t) and m 3n (t) tend to 0. next, we address the uniqueness of the solution of the proposed model (4)-(5). to this end, let p * (t), s * (t) and i * (t) be other solutions. from our hypothesis it then follows that p(t) − p * (t) = 0, and likewise s(t) − s * (t) = 0 and i(t) − i * (t) = 0.
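the picard successive-approximation argument used in the existence proof can be illustrated on a toy integer-order problem. the sketch below (an illustration of the iteration idea only, not the fractional model itself) iterates y m+1 (t) = y(0) + ∫ 0 t y m (s) ds for y′ = y, y(0) = 1, whose iterates are exactly the partial sums of e t .

```python
def picard_value(n_iters=25, n_grid=2001, big_t=1.0):
    """Run Picard iteration for y' = y, y(0) = 1 on [0, big_t] and return
    the approximation of y(big_t); integrals use the cumulative trapezoid
    rule on a uniform grid."""
    h = big_t / (n_grid - 1)
    y = [1.0] * n_grid                 # initial guess y_0(t) = y(0)
    for _ in range(n_iters):
        new = [1.0] * n_grid
        integral = 0.0
        for i in range(1, n_grid):
            integral += 0.5 * h * (y[i - 1] + y[i])   # trapezoid increment
            new[i] = 1.0 + integral
        y = new
    return y[-1]
```

each sweep contributes one more term of the exponential series, mirroring how the successive differences between approximations contract in the proof.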
for the notion of ulam stability, see [42, 43] . this stability has been scrutinized for classical fractional derivatives in many research articles; we refer to some of them, like [44] [45] [46] [47] . additionally, since stability is a prerequisite for approximate solutions, we investigate ulam-type stability for the model (4) using nonlinear functional analysis. the model is ulam-hyers stable if there exist λ = max(λ 1 , λ 2 , λ 3 ) > 0 and ε = max(ε 1 , ε 2 , ε 3 ) > 0 such that, for each approximate solution (p, s, i) ∈ e × e × e satisfying the stated inequalities, there exists a solution ( p, s, i) ∈ e × e × e of the coupled system (4) with the corresponding initial conditions such that ( p, s, i) − (p, s, i) ω ≤ λ ε.

remark 1 consider a small perturbation g 1 ∈ c[0, t] that depends only on the solution and satisfies g 1 (0) = 0 together with the following properties: 1. |g 1 (t)| ≤ ε 1 for t ∈ [0, t] and ε 1 > 0; 2. the correspondingly perturbed equation holds. note that we will only discuss the first equation of the proposed system; the remaining equations are treated by the same technique, i.e. we show p − p e ≤ λ 1 ε 1 .

lemma 2 the solution p g 1 (t) of the perturbed problem (21) satisfies the relation |p g 1 (t) − p(t)| ≤ κ ε 1 , where p(t) satisfies (19-a) and κ := (γ(α) − γ(α + 1) + t α )/(abc[α] γ(α)).

proof thanks to remark 1 and lemma 1, the solution of (21) is given by the corresponding ab integral equation, and the stated bound then follows from remark 1. under condition (11), the system (4)-(5) is ulam-hyers stable in ω. indeed, let p ∈ e be a solution of the inequality (19-a) and let p ∈ e be the unique solution of eq. (4-a) with the same initial condition; that is, due to (22) , p 0 = p 0 , and eq. (23) simplifies accordingly. thus, by condition (11) and lemma 2, we obtain p(t) − p(t) ≤ p(t) − p g 1 (t) + p g 1 (t) − p(t) , with λ 1 := κ l 1 < 1. for λ 1 = 2κ/(1 − λ 1 ), we get p − p e ≤ λ 1 ε 1 . similarly, we conclude that s − s e ≤ λ 2 ε 2 and i − i e ≤ λ 3 ε 3 , where λ ℓ = 2κ/(1 − λ ℓ ) (ℓ = 2, 3). hence, for some ε, λ > 0, ( p, s, i) − (p, s, i) ω ≤ λ ε, and the model (4)-(5) is ulam-hyers stable. in this part, we give approximate solutions of the abc fractional model (4)-(5).
then the numerical simulations are acquired via the suggested scheme. to this aim, we employ the modified fractional version of the adams-bashforth method [48] to approximate the fractional integral in the ab sense. to obtain an iterative scheme, we proceed with the first equation of the model (10). setting t = t n+1 , for n = 0, 1, 2, . . . , and approximating the function w 1 (θ, p) on each interval [t ℓ , t ℓ+1 ] through the two-point interpolation polynomial, the integrals i ℓ,α and i ℓ−1,α evaluate to

i ℓ,α = (h α+1 /(α(α + 1))) ( (n + 1 − ℓ) α (n − ℓ + 2 + α) − (n − ℓ) α (n − ℓ + 2 + 2α) ),    (26)

i ℓ−1,α = (h α+1 /(α(α + 1))) ( (n + 1 − ℓ) α+1 − (n − ℓ) α (n − ℓ + 1 + α) ),    (27)

and substituting (26) and (27) into (25) yields the iterative formula

p n+1 = p 0 + ((1 − α)/abc[α]) w 1 (t n , p n ) + (α/abc[α]) ∑ ℓ=1 n [ (h α w 1 (t ℓ , p ℓ )/γ(α + 2)) ( (n + 1 − ℓ) α (n − ℓ + 2 + α) − (n − ℓ) α (n − ℓ + 2 + 2α) ) − (h α w 1 (t ℓ−1 , p ℓ−1 )/γ(α + 2)) ( (n + 1 − ℓ) α+1 − (n − ℓ) α (n − ℓ + 1 + α) ) ].    (28)

similar formulas (29) and (30) hold for s and i with the kernels w 2 and w 3 . now, to present the numerical simulations of the abc fractional model (4)-(5), we apply the iterative solution contained in (28)-(30). we take the time range up to 100 units. the numerical values of the parameters applied in the simulations are specified in table 1 . the graphical representations of the numerical solution for the species p, s, i at various fractional orders, α = 0.4, 0.6, 0.8, 1.0, of the considered model (4) are given in figs. 1-3 , respectively. from figs. 1-3 , we observe that species i depends on species p and s. the population densities of species p and s gradually decrease, at rates depending on the fractional order, in the first 50 days: the lower the fractional order, the faster the decay rate, and hence the more rapidly the system becomes stable, and vice versa. on the other hand, species i increases, at a rate again depending on the order: the lower the order, the slower the growth rate until it becomes stable, and vice versa.
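the iterative formula of the two-step fractional adams-bashforth construction can be implemented directly. the sketch below is a scalar python version under two stated assumptions: the normalization abc(α) = 1 − α + α/γ(α), and the history sum started at ℓ = 1 so that the first step needs no past values. it is a sketch of the construction described above, not the authors' code.

```python
import math

def abc_normalization(alpha):
    # assumed normalization ABC(a) = 1 - a + a/Gamma(a), with ABC(0) = ABC(1) = 1
    return 1.0 - alpha + alpha / math.gamma(alpha)

def abc_ab2(f, y0, alpha, h, n_steps):
    """Two-step fractional Adams-Bashforth scheme for the ABC-fractional
    equation D^alpha y = f(t, y) on a uniform grid of step h."""
    big_b = abc_normalization(alpha)
    pref = alpha * h ** alpha / (big_b * math.gamma(alpha + 2))
    ys, fs = [y0], [f(0.0, y0)]
    for n in range(n_steps):
        acc = 0.0
        for k in range(1, n + 1):   # full-memory sum over the history
            a1 = ((n + 1 - k) ** alpha * (n - k + 2 + alpha)
                  - (n - k) ** alpha * (n - k + 2 + 2 * alpha))
            a2 = ((n + 1 - k) ** (alpha + 1)
                  - (n - k) ** alpha * (n - k + 1 + alpha))
            acc += fs[k] * a1 - fs[k - 1] * a2
        y_next = y0 + (1.0 - alpha) / big_b * fs[n] + pref * acc
        ys.append(y_next)
        fs.append(f((n + 1) * h, y_next))
    return ys
```

for α = 1 the weights reduce to the classical two-step adams-bashforth coefficients 3/2 and −1/2, which gives a quick sanity check against y′ = −y.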
the fractional order greatly affects the stability of the system and also reveals the global nature of the dynamics of the considered model (figure 3 gives the graphical representation of the numerical solution for species i at various fractional orders of the considered model (4)). here we claim that the established numerical technique is powerful and converges for the abc fractional derivative, whereas iterative techniques like perturbation and decomposition methods do not show the proper behavior for the said derivatives in many cases when computing approximate solutions. in this paper, the population density model of prey, susceptible predators and infected predators has been studied theoretically and numerically. theoretically, the existence and stability results in the sense of ulam-hyers have been obtained with the help of fixed point theory and nonlinear analysis. numerically, the approximate solution of the abc fractional model (4) has been discussed via a fractional adams-bashforth method. moreover, the behavior of the solutions of the model (4) has been illustrated through graphs using some numerical values for the parameters. the obtained results play an important role in developing the theory of fractional analytical dynamics of various real-world phenomena.

théorie mathématique de la lutte pour la vie
elements of physical biology
sulla theoria di volterra della lotta per l'esistenza
mathematical biology. harrap
models in ecology
mathematical biology
nonlinearities in mathematical ecology: phenomena and models, would we live in volterra's world
mathematical and dynamic analysis of a prey-predator model in the presence of alternative prey with impulsive state feedback control
a predator-prey mathematical model with both the populations affected by diseases
theory and applications of fractional differential equations
fractional differential equations
a new definition of fractional derivative without singular kernel
new fractional derivatives with non-local and non-singular kernel: theory and application to heat transfer model
non validity of index law in fractional calculus: a fractional differential operator with markovian and non-markovian properties
fractional derivatives with no-index law property: application to chaos and statistics
decolonisation of fractional calculus rules: breaking commutativity and associativity to capture more natural phenomena
dynamical study of fractional order mutualism parasitism food web module
stability and numerical simulation of a fractional order plant-nectar-pollinator model
green function's properties and existence theorems for nonlinear singular-delay-fractional differential equations
a fractional order hiv-tb coinfection model with nonsingular mittag-leffler law
a predator-prey mathematical model with both the populations affected by disease
the dynamical analysis of a prey-predator model with a refuge-stage structure prey population
a fractional calculus approach to rosenzweig-macarthur predator-prey model and its solution
an impulsively controlled three-species prey-predator model with stage structure and birth pulse for predator
a hybrid predator-prey model with general functional responses under seasonal succession alternating between gompertz and logistic growth
fractional derivatives with mittag-leffler kernel
a study of behaviour for immune and tumor cells in immunogenetic tumour model with non-singular fractional derivative
on a class of ordinary differential equations in the frame of atangana-baleanu fractional derivative
solution for fractional generalized zakharov equations with mittag-leffler function
analysis of lakes pollution model with mittag-leffler kernel
modelling the spread of ebola virus with atangana-baleanu fractional operators
on a nonlinear fractional order model of dengue fever disease under caputo-fabrizio derivative
semi-analytical study of pine wilt disease model with convex rate under caputo-fabrizio fractional order derivative
study on krasnoselskii's fixed point theorem for caputo-fabrizio fractional differential equations
hyers-ulam stability and existence criteria for coupled fractional differential equations involving p-laplacian operator
on the existence of positive solutions for a non-autonomous fractional differential equation with integral boundary conditions
analysis of some generalized abc-fractional logistic models
fractional logistic models in the frame of fractional operators generated by conformable derivatives
a fractional-order epidemic model with time-delay and nonlinear incidence rate
discrete fractional differences with nonsingular discrete mittag-leffler kernels
basic theory of fractional differential equations
problems in modern mathematics
a collection of mathematical problems. interscience
investigation of ulam stability results of a coupled system of nonlinear implicit fractional differential equations
on ulam's stability for a coupled systems of nonlinear implicit fractional differential equations
ulam stability to a toppled systems of nonlinear implicit fractional order boundary value problem
existence and ulam-hyers stability for caputo conformable differential equations with four-point integral conditions
new numerical approximation of fractional derivative with non-local and non-singular kernel: application to chaotic models

the authors are very thankful to the reviewers for their useful suggestions.
the fourth author would like to thank prince sultan university for funding this research work. data sharing is not applicable to this article as no data sets were generated or analyzed during the current study. the authors declare that they have no competing interests. all authors contributed equally to this manuscript and approved the final version. springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. received: 15 april 2020 accepted: 19 may 2020

key: cord-342591-6joc2ld1
authors: higazy, m.
title: novel fractional order sidarthe mathematical model of the covid-19 pandemic
date: 2020-06-13
journal: chaos solitons fractals
doi: 10.1016/j.chaos.2020.110007
sha:
doc_id: 342591
cord_uid: 6joc2ld1

nowadays, covid-19 has put a significant responsibility on all of us around the world, from its detection to its remediation. the globe suffers from lockdowns due to the covid-19 pandemic. researchers are doing their best to discover the nature of this pandemic and to produce possible plans to control it. one of the most effective methods to understand and control the evolution of this pandemic is to model it via an efficient mathematical model. in this paper, we propose to model the covid-19 pandemic by a fractional order sidarthe model, which has not appeared in the literature before. the existence of a stable solution of the fractional order covid-19 sidarthe model is proved, and the fractional order necessary conditions of four proposed control strategies are produced. the sensitivity of the fractional order covid-19 sidarthe model to the fractional order and to the infection rate parameters is displayed. all studies are numerically simulated using matlab software via a fractional order differential equation solver. all the world's governments have introduced great effort and vital measures to eliminate the outbreak of covid-19 [22] .
covid-19 is a new progeny of coronavirus, sars-cov-2, first detected in wuhan, china [52, 57] . in the few months after its discovery, the number of patients increased exponentially. the measures taken against covid-19 up to the time of writing have not prevented the growth of infected cases around the globe. the world health organization situation report published on 25 may 2020 reported 5,304,772 total cases and 342,029 deaths around the globe [54] . using a mathematical model to predict epidemics is very useful in order to understand the nature of the epidemic and to design efficient strategies to control it [4, 8, 14, 26, 27] . it is common to study the diffusion of epidemics in human populations via sir or seir models [11, 13, 33, 48] . various models have been proposed to model and study the covid-19 pandemic. taking into account risk perception and the cumulative number of cases, the covid-19 pandemic has been modeled by lin et al. by extending the seir model [38] , where s signifies the susceptible, e the exposed, i the infected and r the removed cases. in [3] , anastassopoulou et al. suggested the sir model in discrete time, taking into account the dead cases. in [9] , casella expanded the sir model to study the effect of delays and to compare containment policies. in [56] , wu et al. estimated the severity of covid-19 using transition dynamics. in [25, 34] , random transition models have been studied. in [12] , a general multi-group seira model was presented and numerically tested for modelling the diffusion of covid-19 in a non-homogeneous population. the basic mathematical tool used to model several epidemics is differential equations in various modes (ordinary, fractional, with delay, random or partial) [27] [28] [29] . many research efforts have been widely devoted to controlling the outbreaks of epidemics via optimal control [20, [44] [45] [46] [47] .
the idea of optimal control is to look for the most powerful plan that decreases the rate of infection to the minimum possible limit with the minimum cost of circulating a treatment or preventive inoculation [39, [44] [45] [46] [47] 51] . these plans may include treatments, inoculation with vaccines, social distancing and educational programs [6, 10 ] . studying epidemiological diseases mathematically has become very important [6, 7, 16, 17, 37, 40] . the literature contains several control studies, for example for models of hiv [24] , dengue fever [2] , tuberculosis [50] , delayed sir [1] and delayed sirs [29, 58] . fractional order differential equations add an extra dimension to the study of the dynamics of epidemiological models. therefore the fractional versions of many epidemic models have been investigated, as in [29] [30] [31] [32] , [39, 41] and [53] . here, a new epidemiological fractional mathematical model for the covid-19 epidemic is proposed as an extension of the classical sir model, similar to that introduced by gumel et al. for sars in [23] and as a generalization of the sidarthe model proposed in [21] . as explained in section 3, in the sidarthe model the infected cases are classified into five different classes depending on detection and on the appearance of symptoms [21] . in this work, we consider the fractional order sidarthe model and derive the fractional order necessary conditions for the existence of a stable solution. in addition, we study optimal control plans for the fractional order sidarthe model via four control strategies that include the availability of vaccination and the existence of treatments for the three detected infected population fraction phases. applying a fractional order differential equation numerical solver in matlab, we show the dynamics of the state variables of the model and display the effect of changing the fractional derivative order on the system response.
we also show the effect of changing the infection rates on the fractional order sidarthe model's state variables, and we implement the optimal control strategies numerically for the fractional order sidarthe model. the remaining parts of the paper are organized as follows. in section 2, preliminaries and the basic definition of the fractional derivative are introduced. the sidarthe fractional mathematical model describing the covid-19 epidemic is introduced in section 3. the details of the optimal control strategy and its implementation are given in section 4. numerical simulations of the uncontrolled fractional order sidarthe model, the effects of changing the fractional derivative order on the system response and the effects of changing the infection rates are all given in section 5. numerical simulations of the controlled fractional order sidarthe model and the effects of applying the proposed control strategies are presented in section 6. the concluding remarks are given in section 7, followed by the list of cited references. many definitions of fractional order derivatives exist, such as the riemann-liouville derivative, the grünwald-letnikov derivative, caputo's derivative, caputo-fabrizio, atangana-baleanu, etc. the interested reader can consult, for example, [42, 43] and the references therein for more details about fractional order definitions with applications. we use caputo's definition throughout the paper. the caputo fractional derivative operator d q of order q, 0 < q ≤ 1 (see [18, 30, 39, 42, 43, 49] ), is defined as

d q f(t) = (1/γ(1 − q)) ∫ 0 t f′(s) (t − s) −q ds,

where γ(·) symbolizes the gamma function. for more details about the basic definitions and characteristics of fractional derivatives, see [18, 30, 39, 49] . notation: for the numerical simulations, the predictor-corrector pece method of adams-bashforth-moulton type, described in detail in [15, 19] , has been utilized and programmed with matlab software. giulia giordano et al. in [21] modeled the covid-19 epidemic via sidarthe and compared its response with the real data in italy.
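the predictor-corrector pece method of adams-bashforth-moulton type can be sketched for a scalar caputo equation d q y = f(t, y), 0 < q ≤ 1. the following is a bare-bones python version with the standard fractional rectangle-rule predictor and trapezoid-rule corrector weights; it is a sketch, not the fracpece code of [15, 19] or the paper's matlab implementation.

```python
import math

def fde_pece(f, y0, q, h, n_steps):
    """Adams-Bashforth-Moulton predictor-corrector (PECE) for the Caputo
    equation D^q y = f(t, y), 0 < q <= 1, on a uniform grid of step h."""
    g1, g2 = math.gamma(q + 1), math.gamma(q + 2)
    ys, fs = [y0], [f(0.0, y0)]
    for n in range(n_steps):
        t_next = (n + 1) * h
        # predict with the fractional rectangle (Euler) rule
        y_pred = y0 + h ** q / g1 * sum(
            ((n + 1 - j) ** q - (n - j) ** q) * fs[j] for j in range(n + 1))
        # correct with the fractional trapezoid rule, evaluating f at the predictor
        a0 = n ** (q + 1) - (n - q) * (n + 1) ** q
        acc = a0 * fs[0] + sum(
            ((n - j + 2) ** (q + 1) + (n - j) ** (q + 1)
             - 2 * (n - j + 1) ** (q + 1)) * fs[j] for j in range(1, n + 1))
        y_next = y0 + h ** q / g2 * (f(t_next, y_pred) + acc)
        ys.append(y_next)
        fs.append(f(t_next, y_next))
    return ys
```

for q = 1 the weights collapse to the classical euler predictor and trapezoid corrector, and for the constant-forcing test d 1/2 y = 1, y(0) = 0 the scheme reproduces the known solution t 1/2 /γ(3/2) essentially exactly.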
sidarthe distinguishes between determined (detected) infected cases and undetermined infected cases and between various degrees of illness (doi). in the sidarthe covid-19 epidemic model, the total population is partitioned into eight phases of malady, as recorded in table 1 . figure 1 shows the interaction graph between the eight phases of malady. in figure 1 , the susceptible population partition is split into four sub-phases, namely si, sd, sa and sr, to show the hidden sub-phases of the susceptible population partition. the detailed interaction digraph in figure 1 may be studied using graph theory tools to discover more features about the model. the eight phases of table 1 are:

s: susceptible population fraction.
i: infected (symptomless, undetermined) population fraction.
d: diagnosed (infected, symptomless, determined) population fraction.
a: ailing (infected, with symptoms, undetermined) population fraction.
r: recognized (infected, with symptoms, determined) population fraction.
t: threatened (infected, with life-menacing symptoms, determined) population fraction.
h: healed (recuperated) population fraction.
e: extinct (died out) population fraction.

table 1 also records the sub-phases si, sd, sa and sr. the covid-19 epidemic sidarthe model is described mathematically by eight ordinary differential equations [21] . the deterministic characteristic is essential in modeling epidemic transition phenomena; the fractional derivative is very useful in modeling epidemic transition systems because it considers the memory effect and the universal properties of the system, which are primary in the deterministic feature. a system is said to have a memory effect if its future states depend on its current state and on the history of the states; the fractional operator has this memory-effect feature, so it is very helpful in modeling the covid-19 diffusion model.
here, as recorded in equations (3.1) to (3.8) , the dynamics of the population in each phase over time is described by eight fractional order (q) differential equations, whose parameters are estimated in [21] using the real data. figure 1 shows the impact of the different phases of the epidemic graphically. the sidarthe covid-19 model parameters have the following real meaning:

a signifies the rate of infection resulting from contact between a susceptible case and an infected case.
b signifies the rate of infection resulting from contact between a susceptible case and a diagnosed case.
c signifies the rate of infection resulting from contact between a susceptible case and an ailing case.
d signifies the rate of infection resulting from contact between a susceptible case and a recognized case.
e signifies the detection probability rate of infected symptomless cases.
θ signifies the detection probability rate of infected cases with symptoms.
z signifies the probability rate at which an infected case is not conscious of being infected.
h signifies the probability rate at which an infected case is conscious of being infected.
m signifies the rate at which an undetermined infected case develops life-menacing signs.
v signifies the rate at which a determined infected case develops life-menacing signs.
τ signifies the death rate (for infected cases with life-menacing signs).
g, k, x, r and σ signify the rates of healing for the five phases of infected cases.

for more details about the model choices, see [21] and the references cited there. from equations (3.1) to (3.8) , and since the states h and e are sink vertices in the model graph (see figure 1 ), they are considered accumulative state variables that rely only on their own starting conditions and on the other state variables.
since summing up all equations (3.1) to (3.8) gives zero, the system is compartmental and shows the conservation-of-mass property d/dt (s + i + d + a + r + t + h + e) = 0, as can be directly verified, which implies that the total population (the sum of all state variables) is constant. let (3.10) denote the state variables vector. since the state variables signify population fractions, we can suppose that s + i + d + a + r + t + h + e = 1, where 1 signifies the total population, dead included. writing the right-hand sides of (3.1)-(3.8) in the form (3.19)-(3.26), with coefficients that are all positive constants, each of the eight functions satisfies the lipschitz condition [18, 39] with respect to the eight arguments; therefore all eight functions are absolutely continuous. in this section, the existence of the fractional order sidarthe model's optimal control is investigated, and the hamiltonian of the optimal control problem is constructed in order to produce the optimal control necessary requirements. we compute the optimal values of the vaccination and treatment strategies that maximize the healed population phase (h) and minimize the determined infected phases (d, r, t) and the susceptible (s) population phase. in addition, the costs of utilizing the vaccination and treatment methods are minimized. an optimal control problem of the following form is then considered (see for example [1, 2, 5, 6, 10, 20, 24, 25, 34, 37, [44] [45] [46] [47] ), subject to the stated constraints. the control function u 1 represents the vaccination strategy applied to the susceptible phase, and the four constants c 1 , c 2 , c 3 and c 4 are the costs corresponding to utilizing each control function. for the uncontrolled system (put u i = 0 in (4.2)) and for initial conditions whose sum equals one, it is obvious that the final values of the state variables approach an equilibrium, which means that the epidemic phenomenon is finished (see [21] ).
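the conservation property is easy to verify on the right-hand side itself. the python sketch below writes the integer-order (q = 1) sidarthe right-hand side following the transition structure of giordano et al. [21] and the parameter glossary above; this mapping is our reading of the model, since equations (3.1)-(3.8) are not reproduced here, `h_` denotes the parameter h (to avoid clashing with the healed state), and the numeric rates are illustrative, not the fitted italian values.

```python
def sidarthe_rhs(y, p):
    """Integer-order SIDARTHE right-hand side; y = (s, i, d, a, r, t, h, e)."""
    s, i, d, a, r, t, h, e = y
    new_inf = s * (p['a'] * i + p['b'] * d + p['c'] * a + p['d'] * r)
    ds = -new_inf
    di = new_inf - (p['e'] + p['z'] + p['g']) * i          # detection, symptoms, healing
    dd = p['e'] * i - (p['h_'] + p['k']) * d               # symptoms, healing
    da = p['z'] * i - (p['theta'] + p['m'] + p['x']) * a   # detection, worsening, healing
    dr = p['h_'] * d + p['theta'] * a - (p['v'] + p['r']) * r
    dt = p['m'] * a + p['v'] * r - (p['sigma'] + p['tau']) * t
    dh = p['g'] * i + p['k'] * d + p['x'] * a + p['r'] * r + p['sigma'] * t
    de = p['tau'] * t
    return (ds, di, dd, da, dr, dt, dh, de)

# illustrative rates only; every outflow reappears as an inflow elsewhere
pars = dict(a=0.57, b=0.011, c=0.456, d=0.011, e=0.171, theta=0.371,
            z=0.125, h_=0.125, m=0.017, v=0.027, tau=0.01,
            g=0.034, k=0.017, x=0.017, r=0.034, sigma=0.017)
state = (0.99, 0.004, 0.002, 0.001, 0.001, 0.001, 0.0005, 0.0005)  # sums to 1
derivs = sidarthe_rhs(state, pars)
```

because every outflow term reappears as an inflow, the eight derivatives sum to zero, so an euler step preserves the total population fraction.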
the possible equilibrium points of the system are of the form (s, 0, 0, 0, 0, 0, h, e). the incidence function of the system is given accordingly. in the following result, the existence of the optimal control is proved. proof. since s + i + d + a + r + t + h + e = 1, all the state variables are bounded, and the existence of the optimal control quadruple follows. this subsection records the conditions needed for the optimal control to exist. the necessary conditions are computed here by constructing the hamiltonian and satisfying pontryagin's maximum principle [5] . let us denote by u * the optimal controls and by (s * , i * , d * , a * , r * , t * , h * , e * ) the related optimal population phase fractions. consequently, there exist co-state variables such that the necessary conditions for the optimal control are produced (see for example [5] ): from (4.4) and (4.6), the optimization constraints can be found; from (4.7) and the state variables system given in (4.2), the co-state conditions follow, which can be simplified to produce the co-state system; the transversality conditions complete the characterization, and the collection of permissible controls u is convex. the effect of applying the different control strategies will be simulated numerically in section 6. in this fifth section, we solve the fractional order sidarthe model numerically, utilizing the predictor-corrector pece method of adams-bashforth-moulton type described in detail in [15, 19] . the parameter values used for the numerical simulation are estimated from the italian real-life statistics published in [21] : the total population is taken as 100 million, and the initial values of the different population phases are normalized by the total population. in the following, we display the results of the numerical simulation of the uncontrolled sidarthe model. figure 2 displays the state variables with different fractional derivative orders q. figure 9 displays the phase plane of the state variables total infected and susceptible cases (s(t)) with different fractional derivative orders q.
figure 10 displays the phase plane of the state variables total infected and healed cases (h(t)) with different fractional derivative orders q. from the results and the figures mentioned here, we can state that decreasing the fractional derivative order decreases the number in each population phase (except the susceptible population fraction, as expected), flattens the curves, and also delays reaching the maximum in each population phase. in this subsection, we show the effect of changing certain system parameters on the different population phases at day 60 with various fractional derivative orders. the mean rate of separation or contraction of tiny phase-space disturbances of a dynamical system starting from nearby points is measured by the lyapunov exponents (les) [55] . thus, they can be utilized to study the stability of dynamical systems and to examine sensitive dependence on initial conditions, that is, the presence of hidden chaotic dynamics. it is important to check whether an epidemic transition model is chaotic or not by calculating the les. corresponding techniques for the les calculation and their distinctions are studied, e.g., in [35, 36] . the studied system, represented by equations (3.1)-(3.8), is stable [21] . here, we confirm its stability for different values of the fractional derivative order by plotting the relationship between the fractional derivative order and the eight lyapunov exponents of the system. figure 23 shows that all eight lyapunov exponents of the system are negative for different fractional derivative orders as time approaches infinity (here, time is taken as 1500). figure 24 shows the dynamics of the system's eight lyapunov exponents with time.
in figures 24-(a) to 24-(d), with different fractional derivative orders, the system's eight lyapunov exponents approach negative values, which confirms the system's stability for different fractional derivative orders. for more details about lyapunov exponents, see [55] . here, we use the method in [13] for calculating the fractional order lyapunov exponents. this research has been devoted to the analysis of an eight-dimensional fractional-order sidarthe covid-19 mathematical model. in this 8-d covid-19 mathematical model, the infected population fraction is partitioned into five different population fractions: i, d, a, r, t. it is the first time such a model has been studied with fractional order. the existence of a stable solution of the fractional order sidarthe model is proved. the fractional order necessary conditions for four optimal control strategies are implemented. in addition, the system dynamics are displayed via the fractional order numerical solver in matlab software with different fractional orders, and the effects of changing the infection rate parameters are presented in this manuscript with different fractional orders. the effects of changing the fractional order on the system's lyapunov exponents are also displayed. the dynamics of the system are presented before and after control. from our study, we can state that decreasing the fractional derivative order decreases the number of cases in all population fraction phases and delays the maximum; moreover, changing the value of the fractional derivative order has no effect on the stability of the system, since all its lyapunov exponents remain negative. the proposed fractional order covid-19 sidarthe model predicts the evolution of the covid-19 epidemic and tries to help in understanding the impact of different plans to limit the diffusion of this epidemic with different values of the fractional order. our results confirm the importance of decreasing the infection rates.
decreasing the infection rates involves taking various actions, such as ensuring social distancing, closing airports, closing all educational institutions, randomly testing asymptomatic cases, and contact tracing. the authors hope that the study of covid-19 using the proposed model continues and that, by utilizing real data, the optimal fractional order can be estimated. the authors declare no competing interests. this research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

references:
- the hopf bifurcation analysis and optimal control of a delayed sir epidemic model
- an optimal control problem arising from a dengue disease transmission model
- data-based analysis, modelling and forecasting of the covid-19 outbreak
- infectious diseases of humans
- optimal control of a sir epidemic model with general incidence function and time delays
- control of emerging infectious diseases using responsive imperfect vaccination and isolation
- the use of epidemic models
- can the covid-19 epidemic be managed on the basis of daily data?
- optimal control of an epidemic through educational campaigns
- transmission model of endemic human malaria in a partially immune population
- a multi-group seira model for the spread of covid-19 among heterogeneous populations
- matlab code for lyapunov exponents of fractional-order systems
- mathematical epidemiology of infectious diseases: model building
- the fracpece subroutine for the numerical solution of differential equations of fractional order
- the first epidemic model: a historical note on p.d. en'ko
- modeling and analysis of the polluted lakes system with various fractional approaches
- the fractional sirc model and influenza a
- on linear stability of predictor-corrector algorithms for fractional differential equations
- optimal control applied to vaccination and treatment strategies for various epidemiological models
- modelling the covid-19 epidemic and implementation of population-wide interventions in italy
- clinical characteristics of coronavirus disease 2019 in china
- modelling strategies for controlling sars outbreaks
- optimal control of a delayed hiv infection model with immune response using an efficient numerical method
- feasibility of controlling covid-19 outbreaks by isolation of cases and contacts
- the mathematics of infectious diseases
- some epidemiological models with nonlinear incidence
- on the dynamics of a delayed sir epidemic model with a modified saturated incidence rate
- numerical simulation for the fractional sirc model and influenza a
- modeling the dynamics of novel coronavirus (2019-ncov) with fractional derivative
- fractional order sir model with generalized incidence rate
- a contribution to the mathematical theory of epidemics
- early dynamics of transmission and control of covid-19: a mathematical modelling study
- invariance of lyapunov exponents and lyapunov dimension for regular and irregular linearizations
- finite-time lyapunov dimension and hidden attractor of the rabinovich system
- optimal control of a delayed sirs epidemic model with vaccination and treatment
- a conceptual model for the coronavirus disease 2019 (covid-19) outbreak in wuhan, china with individual reaction and governmental action
- approximate solutions for solving nonlinear fractional order smoking model
- a filippov-type lemma for functions involving delays and its application to time-delayed optimal control problems
- sir epidemic model with mittag-leffler fractional derivative
- second-grade fluid model with caputo-liouville generalized fractional derivative
- fractional diffusion equation with new fractional operator
- optimal control for a nonlinear mathematical model of tumor under immune suppression: a numerical approach
- optimal control for a fractional tuberculosis infection model including the impact of diabetes and resistant strains
- fractional optimal control in transmission dynamics of west nile model with state and control time delay: a numerical approach
- numerical treatments of the transmission dynamics of west nile virus and its optimal control
- vaccination strategies for epidemics in highly mobile populations
- fractional differential equations
- on the delayed ross-macdonald model for malaria transmission
- optimal control strategies for tuberculosis treatment: a case study in angola
- the covid-19 epidemic
- analysis and numerical simulation of fractional model of bank data with fractal-fractional atangana-baleanu derivative
- determining lyapunov exponents from a time series
- estimating clinical severity of covid-19 from the transmission dynamics in wuhan, china
- characteristics of and important lessons from the coronavirus disease 2019 (covid-19) outbreak in china: summary of a report of 72,314 cases from the chinese center for disease control and prevention
- optimal treatment of an sir epidemic model with time delay

in this section, we show numerically the effect of applying the four control strategies studied in section 3 (figure 25 (c)). figure 29 shows the effect of the control strategies (vaccination, and treatment of the diagnosed population phase) on the different population phases at day 70 with a given fractional derivative order. figure 30 shows the effect of the same control strategies on the different population phases at day 70 with a different fractional derivative order.
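the effect of a vaccination-type control can be sketched with a classical integer-order sir model. this is an illustrative stand-in with invented parameters, not the paper's fractional sidarthe control problem: moving susceptibles to the removed class at a constant rate nu lowers the infected peak.

```python
def sir_peak(beta, gamma, nu, s0, i0, h, steps):
    """Forward-Euler SIR with a constant vaccination rate nu:
    S' = -beta*S*I - nu*S, I' = beta*S*I - gamma*I, R' = gamma*I + nu*S.
    Returns the peak infected fraction over the simulated horizon."""
    S, I, R = s0, i0, 1.0 - s0 - i0
    peak = I
    for _ in range(steps):
        dS = -beta * S * I - nu * S
        dI = beta * S * I - gamma * I
        dR = gamma * I + nu * S
        S, I, R = S + h * dS, I + h * dI, R + h * dR
        peak = max(peak, I)
    return peak

# same illustrative epidemic with and without a vaccination control
peak_uncontrolled = sir_peak(0.4, 0.1, 0.00, 0.99, 0.01, 0.2, 500)
peak_vaccinated   = sir_peak(0.4, 0.1, 0.05, 0.99, 0.01, 0.2, 500)
```

comparing the two runs reproduces, in miniature, the before-and-after-control comparison the paper makes for its four sidarthe control strategies: the controlled peak is strictly lower.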
key: cord-264408-vk4lt83x authors: ruiz, sara i.; zumbrun, elizabeth e.; nalca, aysegul title: animal models of human viral diseases date: 2017-06-23 journal: animal models for the study of human disease doi: 10.1016/b978-0-12-809468-6.00033-4 sha: doc_id: 264408 cord_uid: vk4lt83x as the threat of exposure to emerging and reemerging viruses within a naïve population increases, it is vital that the basic mechanisms of pathogenesis and immune response be thoroughly investigated. recent outbreaks of middle east respiratory syndrome coronavirus, ebola virus, chikungunya virus, and zika virus illustrate the emerging threats that are encountered. by utilizing animal models in this endeavor, the host response to viruses can be studied in a more complex and integrated context to identify novel drug targets, and assess the efficacy and safety of new products rapidly. this is especially true with the advent and implementation of the fda animal rule. although no one animal model is able to recapitulate all aspects of human disease, understanding the current limitations allows for a more targeted experimental design. important facets to consider prior to an animal study are route of viral exposure, species of animal, biomarkers of disease, and a humane endpoint. this chapter covers the current animal models for medically important human viruses, and demonstrates where the gaps in knowledge exist. well-developed animal models are necessary to understand disease progression, pathogenesis, and immunologic responses to viral infections in humans. furthermore, to test vaccines and medical countermeasures, animal models are essential for preclinical studies. ideally, an animal model of human viral infection should mimic the host-pathogen interactions and the disease progression that is seen in the natural disease course.
a good animal model of viral infection should allow assay of many parameters of infection, including clinical signs, growth of virus, clinicopathological parameters, cellular and humoral immune responses, and virus-host interactions. furthermore, viral replication should be accompanied by measurable clinical manifestations and pathology should resemble that of human cases such that a better understanding of the disease process in humans is attained. there is often more than one animal model that closely represents human disease for a given pathogen. small animal models are typically used for first-line screening, and for testing the efficacy of vaccines or therapeutics. in contrast, nonhuman primate (nhp) models are often used for pivotal preclinical studies. this approach is also used for basic pathogenesis studies, with most studies in small animal models when possible, and studies in nhps to fill in the remaining gaps in knowledge. the advantages of using mice to develop animal models are low cost, low genetic variability in inbred strains, and abundant molecular biological and immunological reagents. specific pathogen free (spf), transgenic and knockout mice are also available. a major pitfall of mouse models is that the pathogenesis and protection afforded by vaccines and therapeutics cannot always be extrapolated to humans. additionally, blood volumes for sampling are limited in small animals, and viruses often need to be adapted through serial passage in the species to induce a productive infection. the ferret's airways are anatomically and histologically similar to that of humans, and their size enables collection of larger or more frequent blood samples, making them an ideal model for certain respiratory pathogens. ferrets are outbred, with no standardized breeds or strains, thus greater numbers are required in studies to achieve statistical significance and overcome the resulting variable responses. 
additionally, spf and transgenic ferrets are not available, and molecular biological reagents are lacking. other caveats making ferret models more difficult to work with are their requirement for more space than mice (rabbit-style cages) and the development of aggressive behavior with repeated procedures. nhps are genetically the closest species to humans; thus, disease progression and host-pathogen responses to viral infections are often the most similar to those of humans. however, ethical concerns pertaining to experimentation on nhps, along with the high cost and lack of spf nhps, raise barriers for such studies. nhp studies should be carefully designed to ensure the fewest number of animals are used, and the studies should address the most critical questions regarding disease pathogenesis, host-pathogen responses, and protective efficacy of vaccines and therapeutics. well-designed experiments should carefully evaluate the choice of animal, including the strain, sex, and age. furthermore, depending on the pathogen, the route of exposure and the dose should mimic those of human disease. the endpoint for these studies is also an important criterion. depending on the desired outcome, the model system should emulate the host responses in humans when infected with the same pathogen. in summary, small animal models are helpful for the initial screening of vaccines and therapeutics, and are often beneficial in obtaining a basic understanding of the disease. nhp models should be used for a more detailed characterization of pathogenesis and for pivotal preclinical testing studies. ultimately, an ideal animal model may not be available. in this case, a combination of different well-characterized animal models should be considered to understand the disease progression and to test medical countermeasures against the disease.
in this chapter, we will be reviewing the animal models for representative members of numerous virus families causing human diseases. we will focus on viruses for each family that are of the greatest concern for public health worldwide. norovirus, the genus of which norwalk virus is the prototypic member, is the most common cause of gastroenteritis in the united states (hall et al., 2013). there are five distinct genogroups (gi-gv) and numerous strains of norwalk virus, including the particularly significant human pathogens gi.1 norwalk virus, gii.2 snow mountain virus, and gii.1 hawaii virus. in developing countries, norwalk virus, also known as "winter vomiting virus," is responsible for approximately 200,000 deaths annually (patel et al., 2008). a typical disease course is self-limiting, but there have been cases of necrotizing enterocolitis and seizures in infants (chen et al., 2009; lutgehetmann et al., 2012; turcios-ruiz et al., 2008). symptoms of infection include diarrhea, vomiting, nausea, abdominal cramping, dehydration, and fever. incubation is normally 1-3 days, with symptoms persisting for 2-3 days (koopmans and duizer, 2004). viral shedding can range from 6 to 55 days in healthy individuals (atmar et al., 2014). however, longer illness duration can be indicative of immunocompromised status, with the elderly and young having a prolonged state of shedding (harris et al., 2008; rockx et al., 2002). interestingly, individuals vary greatly in susceptibility to norovirus infection depending on their fucosyl transferase 2 (fut2) allele functionality and histo-blood group antigen status, with types a and o individuals susceptible and types ab and b resistant (hutson et al., 2005). transmission occurs predominantly through the oral-fecal route, with contaminated food and water being a major vector (atmar and estes, 2001; becker et al., 2000; koopmans and duizer, 2004).
vomiting results in airborne dissemination of the virus, with areas of 7.8 m^2 being contaminated, and subsequent transmission from oral deposition of airborne particles or contact with contaminated fomites, which can remain contaminated for up to 42 days (makison booth, 2014; tung-thompson et al., 2015). each vomiting event in a classroom setting elevates the risk of norovirus illness among elementary students, with proximity correlating with attack rates (evans et al., 2002; marks et al., 2003). viral titers in emesis and fecal suspensions are as high as 1.2 × 10^7 and 1.6 × 10^11 ges (genomic equivalent copies per milliliter), respectively, and the 50% infectious dose is 1320 ges (atmar et al., 2014). therefore, outbreaks can be extremely difficult to contain. therapeutic intervention consists of rehydration therapy and antiemetic medication (bucardo et al., 2008; moe et al., 2001). no approved vaccine or therapeutic is available, and development has been challenging given that immunity is short-lived after infection, new strains rapidly evolve, and the correlates of protection are not completely understood (chen et al., 2013). however, one promising strategy utilized a virus-like particle (vlp)-based vaccine that protected or reduced infection by almost 50% in human volunteers (aliabadi et al., 2015; atmar et al., 2011). given the relatively benign disease in adults, experimental challenge has been carried out on human volunteers (ball et al., 1999; tacket et al., 2000). viral titers are determined by shedding in feces and sera, with histopathology changes monitored by biopsies, particularly of the duodenum. the ph of collected emesis samples containing virus is consistent with viral replication in the small intestine with reflux to the stomach (kirby et al., 2016). additionally, norwalk virus has been shown to bind to duodenal tissue (chan et al., 2011).
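the reported 50% infectious dose of 1320 ges can be turned into a rough risk curve using an exponential dose-response model, a common quantitative-microbial-risk-assessment assumption that is not taken from this chapter; it is calibrated only so that the probability of infection at the id50 is exactly one half.

```python
import math

ID50_GES = 1320.0  # 50% infectious dose for norwalk virus, in GEs

def infection_probability(dose_ges, id50=ID50_GES):
    """Exponential dose-response model: p(d) = 1 - exp(-d * ln(2) / ID50),
    so p(ID50) = 0.5 by construction. An illustrative assumption, not the
    chapter's model."""
    return 1.0 - math.exp(-dose_ges * math.log(2.0) / id50)

# a small-inoculum exposure vs. a high-titer emesis exposure
p_low = infection_probability(100.0)     # small inoculum (assumed dose)
p_emesis = infection_probability(1.2e7)  # titer reported for emesis
```

with titers as high as 1.2 × 10^7 ges per milliliter in emesis, the modeled infection probability saturates near 1, consistent with the chapter's remark that outbreaks are extremely difficult to contain.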
however, this type of research is technically difficult and expensive, and thus other models have been developed. a major hindrance to basic research into this pathogen is the lack of permissive cell culture systems or animal models for norwalk virus. nhps including marmosets, cotton-top tamarins, and rhesus macaques infected with norwalk virus are monitored for the extent of viral shedding; however, no clinical disease is observed in these models. disease progression and severity are measured exclusively by assay of viral shedding (rockx et al., 2005). incidentally, more virus was needed to create an infection when challenging by the oral route than by the intravenous (iv) route (purcell et al., 2002). chimpanzees were exposed to a clinical isolate of norwalk virus by the iv route (bok et al., 2011). although none of the animals developed disease symptoms, viral shedding within the feces was observed within 2-5 days postinfection and lasted anywhere from 17 days to 6 weeks. viremia never occurred and no histopathological changes were detected. the amount and duration of viral shedding were in line with what is observed upon human infection. as such, chimeric chimpanzee-human anti-norovirus neutralizing antibodies have been explored as a possible therapeutic strategy (chen et al., 2013). a recently identified calicivirus of rhesus origin, named tulane virus, has been used as a surrogate model of infection. unlike norwalk virus, tulane virus can be cultured in cells. rhesus macaques exposed to tulane virus intragastrically developed diarrhea and fever 2 days postinfection. viral shedding was detected for 8 days. the immune system produced antibodies that dropped in concentration within 38 days postinfection, mirroring the short-lived immunity documented in humans. the intestine developed moderate blunting of the villi, as seen in human disease (sestak et al., 2012).
a murine norovirus has been identified and is closely related to human norwalk virus (karst et al., 2003). however, clinically the virus presents a different disease. the murine norovirus model does not include observable gastrointestinal clinical signs, possibly in part because rodents lack a vomiting reflex. additionally, mice infected with norovirus develop a persistent infection, in contrast to human disease (hsu et al., 2006, 2007; khan et al., 2009). porcine enteric caliciviruses can induce diarrheal disease in young pigs, and an asymptomatic infection in adults (wang et al., 2006). gnotobiotic pigs can successfully be infected with a passaged clinical norovirus isolate by the oral route. diarrheal disease developed in 74% of the animals and virus was detected in the stool of 44% of the animals. no major histopathological changes or viral persistence was noted (cheetham et al., 2006). calves are naturally infected with bovine noroviruses (scipioni et al., 2008). experimental challenge of calves by oral inoculation with a bovine isolate resulted in diarrheal disease 14-16 h postinfection. recovery of virus was achieved after 53.5 and 67 h postinfection (otto et al., 2011). eastern equine encephalitis virus (eeev), western equine encephalitis virus (weev), and venezuelan equine encephalitis virus (veev) present with nearly identical symptoms. the majority of human cases are asymptomatic, but disease can present as a flu-like illness progressing to central nervous system (cns) involvement, including seizures and paralysis. mortality rates vary among the viruses, with the highest reported for eeev at 36%-75%, followed by weev, and lastly veev at less than 1% (ayers et al., 1994; griffin, 2007; steele and twenhafel, 2010). there are currently no licensed vaccines or therapies, but a recent phase 1 clinical trial of a veev dna vaccine resulted in veev-neutralizing antibody responses in 100% of the subjects (hannaman et al., 2016).
mouse models have been developed for numerous routes of infection including cutaneous, intranasal (in), intracranial (ic), and aerosol. eeev susceptibility in mouse models is correlated with age, with younger mice being more susceptible than adults. importantly, eeev pathogenesis is dependent on route of infection, with delayed progression upon subcutaneous (subq) exposure (honnold et al., 2015). newborn mice display neuronal damage with rapid disease progression, resulting in death (murphy and whitfield, 1970). similarly, eeev produces fatal encephalitis in older mice when administered via the intracerebral route, while inoculation via the subq route causes a pantropic infection eventually resulting in encephalitis (liu et al., 1970; morgan, 1941). a general drawback to the usage of the mouse model is the lack of vascular involvement during the disease course (liu et al., 1970). after subq inoculation with weev, suckling mice started to show signs of disease by 24 h and died within 48 h (aguilar, 1970). the heart was the only organ in which pathologic changes were observed. conversely, adult mice exhibited signs of lethargy and ruffled fur on day 4-5 postinfection. mice were severely ill by day 8 and appeared hunched and dehydrated. death occurred between days 7 and 14, with involvement of the brain and mesodermal tissues, such as heart, lungs, liver, and kidney (aguilar, 1970; monath et al., 1978). intracerebral and in routes of infection resulted in a fatal disease that was highly dependent on dose, while intradermal (id) and subq inoculations caused only 50% fatality in mice regardless of the amount of virus (liu et al., 1970). comparing susceptibility of inbred and outbred strains revealed that cd-1, balb/c, a/j, and c57bl/6 mice were all highly susceptible to experimental infection via subq inoculation when challenged prior to 10 weeks of age, with cns involvement and lethality (blakely et al., 2015).
subq/dermal infection in the mouse model results in encephalitic disease very similar to that seen in horses and humans (macdonald and johnston, 2000). virus begins to replicate in the draining lymph nodes at 4 h postinoculation. eventually, virus enters the brain primarily via the olfactory system. furthermore, aerosol exposure of mice to veev can result in massive infection of the olfactory neuroepithelium, olfactory nerves, and olfactory bulbs and viral spread to the brain, resulting in necrotizing panencephalitis (charles et al., 1995; steele et al., 1998). aerosol and dermal inoculation routes cause neurological pathology in mice much faster than other routes of exposure. the clinical signs of disease in mice infected by aerosol are ruffled fur, lethargy, and hunching progressing to death (charles et al., 1995; steele and twenhafel, 2010; steele et al., 1998). in challenge of c3h/hen mice with high-dose veev caused high morbidity and mortality (julander et al., 2008b). viral titers in brain peaked on day 4 postchallenge and remained elevated until animals succumbed on day 9-10 postchallenge. protein cytokine array performed on brains of infected mice showed elevated il-1a, il-1b, il-6, il-12, mcp-1, ifnγ, mip-1a, and rantes levels. this model was used successfully to test antivirals against veev (julander et al., 2008a). additionally, a veev vaccine, v3526, inactivated with 1,5-iodonaphthyl azide protects against both footpad and aerosol challenge with virulent veev in a mouse model (gupta et al., 2016). guinea pigs and hamsters have also been developed as animal models for eeev studies (paessler et al., 2004; roy et al., 2009). guinea pigs developed neurological involvement with decreased activity, tremors, circling behavior, and coma. neuronal necrosis was observed in brain lesions in the experimentally challenged animals (roy et al., 2009). subq inoculation of eeev produced lethal biphasic disease in hamsters with severe lesions of nerve cells.
the early visceral phase with viremia was followed by neuroinvasion, encephalitis, and death. in addition, parenchymal necrosis was observed in the liver and lymphoid organs (paessler et al., 2004). harlan sprague-dawley hamsters develop viremia and progress to respiratory, gastrointestinal, and nervous system involvement when inoculated via the subq route. vasculitis and encephalitis were both evident in this model, which mirrors the human disease clinical spectrum (paessler et al., 2004). weev is highly infectious to guinea pigs and has been utilized for prophylactic screening (sidwell and smee, 2003). studies demonstrated that although the length of the incubation period and the disease duration varied, weev infection resulted in mortality in hamsters by all routes of inoculation. progressive lack of coordination, shivering, rapid and noisy breathing, corneal opacity, and conjunctival discharge resulting in closing of the eyelids were indicative of disease in all cases (zlotnik et al., 1972). cns involvement was evident with intracerebral, intraperitoneal (ip), and id inoculations (zlotnik et al., 1972). ip inoculation of weev is fatal in guinea pigs regardless of the amount of virus inoculum, with the animals exhibiting signs of illness on day 3-4, followed by death on day 5-9 (nalca, unpublished results). id, im, or iv inoculations of eeev in nhps cause disease, but do not reliably result in neurological symptoms (dupuy and reed, 2012). intracerebral infection of eeev produces nervous system disease and fatality in monkeys (nathanson et al., 1969). the differences in these models indicate that the initial viremia and the secondary nervous system infection do not overlap in nhps when they are inoculated by the peripheral route (wyckoff, 1939). in and intralingual inoculations of eeev also cause nervous system symptoms in monkeys, but are less drastic than intracerebral injections (wyckoff, 1939).
the aerosol route of delivery will result in uniformly lethal disease in cynomolgus macaques (reed et al., 2007). in this model, fever was followed by elevated white blood cell counts and liver enzymes. neurological signs subsequently developed, and nhps became moribund and were euthanized between 5 and 9 days postexposure. meningoencephalomyelitis was the main pathology observed in the brains of these animals (steele and twenhafel, 2010). similar clinical signs and pathology were observed when common marmosets were infected with eeev by the in route (adams et al., 2008). both aerosol and in nhp models had similar disease progression and pathology as seen in human disease. very limited studies have been performed with nhps. reed et al. exposed cynomolgus macaques to low and high doses of aerosolized weev. the animals subsequently developed fever, increased white blood cell counts, and cns involvement, demonstrating that the cynomolgus macaque model could be useful for testing of vaccines and therapeutics against weev (reed et al., 2005). veev infection causes a typical biphasic febrile response in nhps. initial fever was observed at 12-72 h after infection and lasted less than 12 h. secondary fever generally began on day 5 and lasted 3-4 days (gleiser et al., 1961). veev-infected nhps exhibited mild symptoms, such as anorexia, irritability, diarrhea, and tremors. leukopenia was common in animals exhibiting fever (monath et al., 1974). supporting the leukopenia, microscopic changes in lymphatic tissues, such as early destruction of lymphocytes in lymph nodes and spleen, a mild lymphocytic infiltrate in the hepatic triads, and focal myocardial necrosis with lymphocytic infiltration, have been observed in monkeys infected with veev. surprisingly, characteristic lesions of the cns were observed histopathologically in monkeys in spite of the lack of any clinical signs of infection (gleiser et al., 1961).
the primary lesions were lymphocytic perivascular cuffing and glial proliferation, and were generally observed at day 6 postinfection during the secondary febrile episode. similar to these observations, when cynomolgus macaques were exposed to aerosolized veev, fever, viremia, lymphopenia, and clinical signs of encephalitis were observed, but the nhps did not succumb to disease (reed et al., 2004). a common marmoset model was utilized for comparison studies of south american (sa) and north american (na) strains of eeev (adams et al., 2008). previous studies indicated that the sa strain is less virulent than the na strain for humans. common marmosets were infected by the in route with either the na or sa strain of eeev. na strain-infected animals showed signs of anorexia and neurological involvement and were euthanized 4-5 days after the challenge. although sa strain-infected animals developed viremia, they remained asymptomatic and survived until the end of the study. chikungunya virus (chikv) is a member of the genus alphavirus, specifically the semliki forest complex, and has been responsible for a multitude of epidemics centered within africa and southeast asia (griffin, 2007). the virus is transmitted by aedes aegypti and aedes albopictus mosquitoes. given the widespread endemicity of aedes mosquitoes, chikv has the potential to spread to previously unaffected areas. this is typified by the emergence of disease reported for the first time in 2005 in the islands of the south-west indian ocean, including the french la reunion island, and the appearance in central italy in 2007 (charrel et al., 2007; rezza et al., 2007). the incubation period following a mosquito bite is 2-5 days, leading to a self-limiting acute phase that lasts 3-4 days. symptoms during this period include fever, arthralgia, myalgia, and rash. headache, weakness, nausea, vomiting, and polyarthralgia have all been reported (powers and logue, 2007). individuals typically develop a stooped posture due to the pain.
for approximately 12% of infected individuals, joint pain can last months after resolution of primary disease, and has the possibility to relapse. underlying health conditions, including diabetes, alcoholism, or renal disease, increase the risk of developing a severe form of disease that includes hepatitis or encephalopathy. children between the ages of 3 and 18 years old have an increased risk of developing neurological manifestations (arpino et al., 2009). there is currently no approved vaccine or antiviral. wild-type c57bl/6 adult mice are not permissive to chikv infection by id inoculation. however, it was demonstrated that neonatal mice were susceptible and severity was dependent upon age at infection. six-day-old mice developed paralysis by day 6, and all succumbed by day 12, whereas 50% of 9-day-old mice were able to recover from infection. by 12 days of age, mice were no longer permissive to disease. symptomatic mice developed loss of balance, hind limb dragging, and skin lesions. neonatal mice were also used as a model for neurological complications (couderc et al., 2008; ziegler et al., 2008). an adult mouse model has been developed by injection of the ventral side of the footpad of c57bl/6j mice. viremia lasted 4-5 days, accompanied by foot swelling and noted inflammation of the musculoskeletal tissue (morrison et al., 2011). adult ifnα/βr knockout mice also developed mild disease with symptoms including muscle weakness and lethargy, symptoms that mirrored human infection. all adult mice died within 3 days. this model was useful in identifying the viral cellular tropism for fibroblasts (couderc et al., 2008). icr and cd-1 mice can also be utilized as a disease model. neonatal mice inoculated subq with a passaged clinical isolate of chikv developed lethargy, loss of balance, and difficulty walking. mortality was low: 17% and 8% for newborn cd-1 and icr mice, respectively. the remaining mice fully recovered within 6 weeks after infection (ziegler et al., 2008).
a drawback of both the ifnα/βr and cd-1 mice is that the disease is not a result of immunopathogenesis as occurs in human cases, given that the mice are immunocompromised (teo et al., 2012). a chronic infection model was developed using recombination activating gene 1 (rag1−/−) knockout mice. in this study, mice inoculated via the footpad lost weight in comparison to the control group. both footpad- and subq-injected mice developed viremia 5-6 days postinfection, which was detectable up to 28 days postinfection. inflammation was evident in the brain, liver, and lung of the subq-inoculated animals at 28-56 days postinfection. despite minimal footpad swelling on day 2 postinfection, on day 14 there was severe muscle damage noted at necropsy, which resolved by day 28 (seymour et al., 2015). golden hamsters serve as another option for small animal modeling. although hamsters do not appear to develop overt clinical symptoms following subq inoculation, viremia developed in the majority of animals within 1 day postinfection, with clearance following from day 3 to 4. histologically, inflammation was noted at the skeletal muscle, fascia, and tendon sheaths of numerous limbs. this study was limited in the number of animals utilized, and more work is needed to further develop the hamster model (bosco-lauth et al., 2015). nhp models of disease include adult, aged, and pregnant rhesus macaques in addition to cynomolgus macaques (broeckel et al., 2015). differing routes of infection (subq, iv, and im) have been successfully administered, although there is not a clear understanding of the role that route of transmission plays in subsequent pathogenesis and clinical symptoms. typically, viremia is observed 4-5 days postinfection, with a correlation between infectious titer and time to viremia observed in cynomolgus but not rhesus macaques (labadie et al., 2010; messaoudi et al., 2013).
Fever began at 1-2 days postinfection, persisted for 2-7 days in cynomolgus and 3-7 days in rhesus macaques, and coincided with rash (Chen et al., 2010; Labadie et al., 2010; Messaoudi et al., 2013). Overall, blood chemistries changed in conjunction with the initiation of viremia and returned to baseline 10-15 days postexposure (Chen et al., 2010). CNS involvement has been difficult to reproduce in NHP models, although it was reported that a high inoculum in cynomolgus macaques did result in meningoencephalitis (Labadie et al., 2010). The NHP models have been utilized to conduct efficacy testing on novel vaccines and therapeutics (Broeckel et al., 2015).

Dengue virus (DENV) is transmitted via the mosquito vectors A. aegypti and A. albopictus (Moore and Mitchell, 1997). Given the endemicity of the vectors, it is estimated that half of the world's population is at risk for exposure to DENV. This results in approximately 50 million cases of dengue each year, with the burden of disease in the tropical and subtropical regions of Latin America, South Asia, and Southeast Asia (Gubler, 2002). It is estimated that there are 20,000 deaths each year due to dengue hemorrhagic fever (DHF) (Guzman and Kouri, 2002). There are four distinct serotypes of DENV, numbered 1-4, which are capable of causing a wide clinical spectrum ranging from asymptomatic infection to severe disease with the development of DHF (World Health Organization, 1997). Incubation can range from 3 to 14 days, with the average being 4-7 days. The virus targets dendritic cells and macrophages following a mosquito bite (Balsitis et al., 2009). Typical infection results in classic dengue fever (DF), which is self-limiting and presents with flu-like symptoms in conjunction with retroorbital pain, headache, skin rash, and bone and muscle pain. DHF can follow, with vascular leak syndrome and a low platelet count, resulting in hemorrhage.
In the most extreme cases, dengue shock syndrome (DSS) develops, characterized by hypotension, shock, and circulatory failure (World Health Organization, 1997). Thrombocytopenia is a hallmark clinical sign of infection and aids in differential diagnosis (Gregory et al., 2010). Severe disease has a higher propensity to occur upon secondary infection with a different DENV serotype (Thein et al., 1997). This is hypothesized to occur due to antibody-dependent enhancement (ADE). There is no approved vaccine or drug, and hospitalized patients receive supportive care, including fluid replacement. In order to further progress toward an effective drug or vaccine, small human cohort studies have taken place. However, to provide statistically relevant results, testing must progress in an animal model. In developing an animal model, it is important to note that mosquitoes typically deposit 10^4-10^6 PFU, which is considered the optimal range during experimental challenge. DENV does not naturally replicate effectively in rodent cells, creating the need for mouse-adapted strains, engineered mouse lines, and a variety of inoculation routes to overcome the initial barrier. Several laboratory mouse strains, including A/J, BALB/c, and C57BL/6, are permissive to dengue infection. However, the resulting disease bears little resemblance to human clinical signs, and death results from paralysis (Huang et al., 2000; Paes et al., 2005; Shresta et al., 2004). A higher dose of an adapted DENV strain induced DHF symptoms in both BALB/c and C57BL/6 mice (Souza et al., 2009). This model can also yield asymptomatic infections. A mouse-adapted strain of DENV 2 introduced into AG129 mice produced vascular leak syndrome similar to the severe disease seen in humans (Shresta et al., 2006). Passive transfer of monoclonal dengue antibodies within mice leads to ADE. During the course of infection, viremia was increased and animals died due to vascular leak syndrome (Balsitis et al., 2010).
Another mouse-adapted strain injected into BALB/c mice caused liver damage, hemorrhagic manifestations, and vascular permeability (Souza et al., 2009). IC injection of suckling mice with DENV leads to death by paralysis and encephalitis, which is rare in human infection (Lee et al., 2015; Parida et al., 2002; Zhao et al., 2014a). Immunocompromised mice have also been used to gain an understanding of the pathogenesis of DENV. The most well-defined model is the AG129 mouse, which is deficient in IFNα/β and γ receptors and can recapitulate DHF/DSS if a mouse-adapted strain is utilized (Yauch et al., 2009). SCID mice engrafted with human tumor cells develop paralysis upon infection and thus are not useful for pathogenesis studies (Blaney et al., 2002; Lin et al., 1998). DF symptoms developed after infection in NOD/SCID/IL2Rγ KO mice engrafted with CD34+ human progenitor cells (Mota and Rico-Hesse, 2011). RAG-hu mice developed fever, but no other symptoms, upon infection with a passaged clinical isolate and a lab-adapted strain of DENV 2 (Kuruvilla et al., 2007). A passaged clinical isolate of DENV 3 was used to create a model in immunocompetent adult mice. IP injection in C57BL/6J and BALB/c mice caused lethality by day 6-7 postinfection in a dose-dependent manner. The first indication of infection was weight loss beginning on day 4, followed by thrombocytopenia. A drop in systolic blood pressure, along with noted increases in the liver enzymes AST and ALT, was also observed. Viremia was established by day 5. This model mimicked the characteristic symptoms observed in human DHF/DSS cases (Costa et al., 2012). Vascular leakage was also observed when C57BL/6 mice were inoculated with DENV 2 (St John et al., 2013). A murine model was developed that utilized infected mosquitoes as the route of transmission to hu-NSG mice. Female mosquitoes were intrathoracically inoculated with a clinical isolate of DENV 2.
Infected mosquitoes then fed upon the mouse footpad to allow for transmission of the virus via the natural route. The amount of virus detected within the mouse was directly proportional to the number of mosquitoes it was exposed to, with 4-5 being optimal. Detectable viral RNA was in line with historical human infection data. Severe thrombocytopenia developed on day 14. This model is notable in that disease was enhanced with mosquito delivery of the virus in comparison to injection of the virus (Cox et al., 2012). NHP models have used SubQ inoculation in an attempt to induce disease. Although the animals are permissive to viral replication, it occurs to a lower degree than that observed in human infection (Marchette et al., 1973). The immunosuppressive drug cyclophosphamide enhances infection in rhesus macaques by allowing the virus to invade monocytes (Marchette et al., 1980). Throughout these preliminary studies, no clinical disease was detected. In order to circumvent this, a higher dose of DENV was used in an IV challenge of rhesus macaques. Hemorrhagic manifestations appeared by day 3 and resulted in petechiae, hematomas, and coagulopathy; however, no other symptoms developed (Onlamoon et al., 2010). A robust antibody response was observed in multiple studies (Marchette et al., 1973; Onlamoon et al., 2010). Marmosets also mirror human dengue infection, developing fever, leukopenia, and thrombocytopenia following SubQ inoculation (Omatsu et al., 2011, 2012). NHPs are able to produce antibodies similar to those observed during the course of human infection, making them advantageous in studying ADE. Sequential infection led to a cross-reactive antibody response, which has been demonstrated in both humans and mice (Midgley et al., 2011). This phenotype can also be seen upon passive transfer of a monoclonal antibody to dengue and subsequent infection with the virus.
Rhesus macaques exposed in this manner developed viremia that was 3- to 100-fold higher than previously reported; however, no clinical signs were apparent (Goncalvez et al., 2007). The lack of inducible DHF or DSS symptoms hinders further examination of pathogenesis within this model.

West Nile virus (WNV) was first isolated from the blood of a woman in the West Nile district of Uganda in 1937 (Smithburn et al., 1940). After the initial isolation of WNV, the virus was subsequently isolated from patients, birds, and mosquitoes in Egypt in the early 1950s (Melnick et al., 1951; Taylor et al., 1953) and was shown to cause encephalitis in humans and horses. WNV is recognized as the most widespread of the flaviviruses, with a geographical distribution that includes Africa, the Middle East, western Asia, Europe, and Australia (Hayes, 1989). The virus first reached the Western Hemisphere in the summer of 1999, during an outbreak involving humans, horses, and birds in the New York City metropolitan area (Centers for Disease Control and Prevention, 1999; Lanciotti et al., 1999). Since 1999, the range of areas affected by WNV has quickly extended. Older people and children are most susceptible to WNV disease. WNV generally causes asymptomatic disease or a mild undifferentiated fever (West Nile fever), which can last from 3 to 6 days (Monath and Tsai, 2002). The mortality rate following neuroinvasive disease ranges from 4% to 11%, with the most severe complications commonly seen in the elderly (Asnis et al., 2000; Hayes, 1989; Hubalek and Halouzka, 1999; Komar, 2000). Hepatitis, myocarditis, and pancreatitis are unusual, severe, nonneurologic manifestations of WNV infection. Inoculation of WNV into NHPs intracerebrally resulted in the development of either encephalitis, febrile disease, or an asymptomatic infection, depending on the virus strain and dose.
Viral persistence is observed in these animals regardless of the outcome of infection (i.e., asymptomatic, fever, or encephalitis) (Pogodina et al., 1983). Thus, viral persistence is regarded as a typical result of NHP infection with various WNV strains. After both intracerebral and SubQ inoculation, the virus localizes predominantly in the brain and may also be found in the kidneys, spleen, and lymph nodes. WNV does not result in clinical disease in NHPs, although the animals show a low level of viremia (Lieberman et al., 2009; Pletnev et al., 2003). This is mirrored in New Zealand white rabbits, which only develop fever and low levels of viremia following inoculation via the footpad (Suen et al., 2015). ID inoculation of both marmosets and rhesus macaques did not yield any clinical signs of disease, including fever. Viremia was detected in both NHP species, but marmosets developed a higher titer for a greater duration than rhesus macaques (Verstrepen et al., 2014). WNV has also been extensively studied in small animals. All classical laboratory mouse strains are susceptible to lethal infection by the intracerebral and IP routes, resulting in encephalitis and 100% mortality. ID-route pathogenesis studies indicated that Langerhans dendritic cells are the initial viral replication sites in the skin (Brown et al., 2007; Johnston et al., 1996). The infected Langerhans cells then migrate to lymph nodes, and the virus enters the blood through the lymphatic and thoracic ducts and disseminates to peripheral tissues for secondary viral replication. The virus eventually travels to the CNS and causes pathology that is similar to human cases (Byrne et al., 2001; Cunha et al., 2000; Diamond et al., 2003; Fratkin et al., 2004). The Swiss mouse strain was inoculated IP in order to screen a variety of viral lineages to assess differences in pathogenesis (Bingham et al., 2014). Tesh et al. developed a model for WN encephalitis using the golden hamster, Mesocricetus auratus.
Hamsters appeared asymptomatic during the first 5 days, became lethargic at approximately day 6, and developed neurologic symptoms between days 7 and 10. Many of the severely affected animals died 7-14 days after infection. Viremia was detected in the hamsters within 24 h after infection and persisted for 5-6 days. Although there were no substantial changes in internal organs, progressive pathologic differences were seen in the brain and spinal cord of infected animals. Furthermore, similar to the previously mentioned monkey experiments by Pogodina et al. (1983), persistent WNV infection was found in the brains of hamsters.

Zika virus recently came to the forefront of public health concerns with the outbreak in Brazil at the end of 2015. The clinical disease spectrum is highly variable, with reports of a flu-like illness accompanied by rash, Guillain-Barré syndrome, and microcephaly in newborns (Ramos da Silva and Gao, 2016). To date, the correlation between the gestational age at which exposure to the virus occurs and the severity of microcephaly is not fully understood (Brasil et al., 2016). However, a recent study of pregnant women in Colombia found that infection with Zika virus during the third trimester was not associated with any obvious structural abnormalities of the fetus (Pacheco et al., 2016). Transmission of the virus occurs via the bite of an infected A. aegypti or A. albopictus mosquito (Ramos da Silva and Gao, 2016). Other reported routes of exposure include sexual transmission and blood transfusion (Cunha et al., 2016; D'Ortenzio et al., 2016; Hills et al., 2016; McCarthy, 2016). The emergence of this virus, with no approved vaccine or therapy and few diagnostic options, demonstrates the utility of well-characterized animal model development. It was first demonstrated in 1956 that experimentally infected mosquitoes could be used to transmit the virus to mice and NHPs (Boorman and Porterfield, 1956).
A129 mice were susceptible to nonadapted Zika virus infection following SubQ inoculation of the limbs. Mice began to lose weight 3 days postinfection and met euthanasia criteria by day 6. Microscopic lesions within the brain were noted upon necropsy. In conjunction, viral RNA was detected in the blood, brain, ovary, spleen, and liver of the infected mice. Wild-type 129Sv/Ev mice were also challenged, with no observable clinical disease. However, viral RNA was detected at day 3 postinfection in the blood, ovary, and spleen, and remained at detectable levels in the ovaries and spleen on day 7 (Dowall et al., 2016). Footpad inoculation of the virus leads to fatal disease in AG129 mice by day 7 postinoculation, with significant histopathological changes in the brain noted at necropsy (Aliota et al., 2016). AG129 mice were also observed to develop neurologic disease by day 6 postexposure (Rossi et al., 2016). Immunocompetent mice are resistant to infection via the SubQ route (Rossi et al., 2016). Recently, a mouse model was identified to verify vertical transmission of the virus. Pregnant C57 mice were injected either IP or in utero into the lateral ventricle of the fetal brain. IP inoculation induced transient viremia in the pregnant mice on day 1. Viral RNA was detected in five out of nine placentas on day 3 postinfection. The virus was able to infect the radial glia cells in the fetal brain and led to a reduction in the cortical neural progenitors. Viral exposure via the cerebroventricular space/lateral ventricle of the fetal brain resulted in small brain size at day 5 postexposure, in addition to cortical thinning (Cugola et al., 2016; Li et al., 2016a). IFNAR1−/− pregnant mice exposed to the virus had nonviable fetuses. In the same study, wild-type mice were given an anti-IFNAR antibody prior to and during infection, resulting in detectable virus in the fetal head with mild intrauterine growth restriction (Miner et al., 2016).
All of these murine studies will further the study of the pathogenesis of vertical transmission and the resulting neurological disorders, in conjunction with screening novel countermeasures. NHP studies are currently ongoing for animal model development.

Numerous viruses from the coronavirus (CoV) family exist that infect a wide range of animals. Six species have been identified that can infect humans. Two of these are alphacoronaviruses: HCoV-229E and HCoV-NL63. Four are betacoronaviruses: HCoV-OC43, HCoV-HKU1, HCoV-SARS, and MERS-CoV. HCoV-229E and HCoV-OC43 were first detected in the 1960s from the nasal passages of humans with the "common cold" (Gaunt et al., 2010). HCoV-NL63, which was first isolated in 2004, causes upper and lower respiratory infections of varying intensity and has been continuously circulating among humans (van der Hoek et al., 2006). HCoV-HKU1, first isolated in 2002, has been identified more sporadically but also causes respiratory infections (Lau et al., 2006). A significant portion of common cold infections in humans are caused by coronaviruses. In 2002 and 2012, two human coronaviruses, SARS-CoV and MERS-CoV, emerged that caused a great deal of alarm, since these infections have resulted in nearly 10% and 40% fatality, respectively (Assiri et al., 2013; Peiris et al., 2004). The etiologic agent of severe acute respiratory syndrome (SARS), SARS-CoV, emerged in 2002 and spread throughout 32 countries in a period of 6 months, with 8437 confirmed infections and 813 deaths (Roberts and Subbarao, 2006; World Health Organization, 2003). No additional cases of community-acquired SARS-CoV infection have been reported since 2004. The natural reservoir of SARS-CoV is the horseshoe bat, and the palm civet is an intermediate host (Lau et al., 2005).
The main mechanism of transmission of SARS-CoV is through droplet spread, but the virus is also viable in dry form on surfaces for up to 6 days and can be detected in stool, suggesting that other modes of transmission are also possible (Pearson et al., 2003; Rabenau et al., 2005; Rota et al., 2003). SARS-CoV infection has a 10% case fatality rate, with the majority of cases in people over the age of 15 (Peiris et al., 2003; Wang et al., 2004). After an incubation period of 2-10 days, clinical signs of SARS include general malaise, fever, chills, diarrhea, dyspnea, and cough (Drosten et al., 2003). In some SARS cases, pneumonia may develop and progress to acute respiratory distress syndrome (ARDS). Fever usually dissipates within 2 weeks and coincides with the induction of high levels of neutralizing antibodies (Tan et al., 2004). In humans, SARS-CoV replication destroys the respiratory epithelium, and a great deal of the pathogenesis is due to the subsequent immune responses (Chen and Subbarao, 2007; Perlman and Dandekar, 2005). Infiltrates persisting within the lung and diffuse alveolar damage (DAD) are common sequelae of SARS-CoV infection (Perlman and Dandekar, 2005). Virus can be isolated from secretions of the upper airways during early, but not later, stages of infection, as well as from other tissues (Cheng et al., 2004). SARS-CoV can replicate in many species, including dogs, cats, pigs, mice, rats, ferrets, foxes, and monkeys (Roper and Rehm, 2009). No model captures all aspects of human clinical disease (pyrexia and respiratory signs), mortality (∼10%), viral replication, and pathology (Roberts et al., 2008). In general, the SARS-CoV disease course in the model species is much milder and of shorter duration than in humans. Viral replication in the various animal models may occur without clinical illness and/or histopathologic changes. The best-characterized models utilize mice, hamsters, ferrets, and NHPs.
Mouse models of SARS-CoV are typically inoculated by the IN route under light anesthesia (Roberts et al., 2005). Young, 6- to 8-week-old BALB/c mice exposed to SARS-CoV have viral replication detected in the lungs and nasal turbinates, with a peak on day 2 and clearance by day 5 postexposure (McAuliffe et al., 2004). There is also viral replication within the small intestines of young BALB/c mice. However, young mice have no clinical signs, aside from reduced weight gain, and have little to no inflammation within the lungs (pneumonitis). SARS-CoV infection of C57BL/6 (B6) mice also yields reduced weight gain and viral replication in the lungs, with a peak on day 3 and clearance by day 9 (Glass et al., 2004). In contrast, BALB/c mice 13-14 months of age show weight loss, hunched posture, dehydration, and ruffled fur on days 3-6 postexposure (Bisht et al., 2004). Interstitial pneumonitis, alveolar damage, and death also occur in old mice, resembling the age-dependent virulence observed in humans. 129S and B6 mice show outcomes of SARS-CoV infection similar to those observed for BALB/c mice but have lower titers and less prolonged disease. While the aged mouse model is more frequently used than young mice, it is more difficult to obtain large numbers of mice older than 1 year (Table 33.1). A number of immunocompromised knockout mouse models of IN SARS-CoV infection have also been developed. 129SvEv mice infected with SARS-CoV by the IN route develop bronchiolitis, with peribronchiolar inflammatory infiltrates and interstitial inflammation in adjacent alveolar septae. Viral replication and disease in these mice resolve by day 14 postexposure. Beige, CD1−/−, and RAG1−/− mice infected with SARS-CoV have outcomes similar to those of infected BALB/c mice with regard to viral replication, timing of viral clearance, and a lack of clinical signs (Glass et al., 2004).
STAT1 KO mice infected IN with SARS-CoV have severe disease, with weight loss, pneumonitis, interstitial pneumonia, and some deaths. The STAT1 KO mouse model is therefore useful for studies of pathogenicity and pathology and for the evaluation of vaccines. Angiotensin-converting enzyme 2 (ACE2) and CD209L were identified as cellular receptors for SARS-CoV, with affinity for the spike (S) protein of the virus (Jeffers et al., 2004). Variations in the ACE2 sequence across animal species could partially explain the differences in infection severity (Li et al., 2016b; Sutton and Subbarao, 2015). Since mice in particular have a greater number of sequence differences in ACE2, transgenic mice were created that express human ACE2 (McCray et al., 2007; Netland et al., 2008; Yang et al., 2007). Unlike other murine models of SARS-CoV, mice expressing hACE2 had up to 100% mortality, with severity correlating with the level of hACE2 expression (Tseng et al., 2007). With high levels of hACE2 expression, mice developed a severe lung and brain infection. However, CNS infection is only rarely observed in humans infected with SARS-CoV. Syrian golden hamsters (strain LVG) are also susceptible to IN exposure to SARS-CoV. After the administration of 10^3 TCID50, along with a period of transient viremia, SARS-CoV replicates in the nasal turbinates and lungs, resulting in pneumonitis (Roberts et al., 2005). There are no obvious signs of disease, but exercise wheels can be used to monitor the decrease in nighttime activity. Limited mortality has been observed, but it was not dose dependent and could have more to do with genetic differences between animals, because the strain is not inbred (Roberts et al., 2008). Damage is not observed in the liver or spleen despite detection of virus within these tissues.
Several studies have shown that intratracheal (IT) inoculation of SARS-CoV in anesthetized ferrets (Mustela furo) results in lethargy, fever, sneezing, and nasal discharge (Skowronski et al., 2005). Clinical disease has been observed in all but one of several studies, perhaps due to characteristics of the inoculating virus (Kobinger et al., 2007). SARS-CoV is detected in pharyngeal swabs, the trachea, and tracheobronchial lymph nodes, with high titers within the lungs. Mortality has been observed around day 4 postexposure, as well as mild alveolar damage in 5%-10% of the lungs, occasionally accompanied by severe pathology within the lungs (Martina et al., 2003; ter Meulen et al., 2004). With fever, overt respiratory signs, lung damage, and some mortality, the ferret intratracheal model of SARS-CoV infection is perhaps most similar to human SARS, albeit with a shorter time course. SARS-CoV infection of NHPs by the intranasal or IT route generally results in a very mild infection that resolves quickly. SARS-CoV infection of Old World monkeys, such as rhesus macaques, cynomolgus macaques (cynos), and African green monkeys (AGMs), has been studied with variable results, possibly due to the outbred nature of the groups studied or previous exposure to related pathogens. Clinical illness and viral loads have not been consistent; however, replication within the lungs and DAD are features of the infections for each of the primate species. Some cynos have no illness, but others have rash, lethargy, and respiratory signs and pathology (Martina et al., 2003; McAuliffe et al., 2004; Rowe et al., 2004). Rhesus macaques have little to no disease and only mild findings upon histopathological analysis (Rowe et al., 2004). AGMs infected with SARS-CoV have no overt clinical signs, but DAD and pneumonitis have been documented (McAuliffe et al., 2004). Viral replication has been detected for up to 10 days in the lungs of AGMs; however, the infection resolves and does not progress to fatal ARDS.
Farmed Chinese masked palm civets, sold in open markets in China, were involved in the SARS-CoV outbreak. IT and IN inoculation of civets with SARS-CoV results in lethargy, decreased aggressiveness, fever, diarrhea, and conjunctivitis. Leukopenia, pneumonitis, and alveolar septal enlargement, with lesions similar to those observed in ferrets and NHPs, have also been observed in laboratory-infected civets. Squirrel monkeys, mustached tamarins, and common marmosets have not been susceptible to SARS-CoV infection (Greenough et al., 2005; Roberts et al., 2008). Vaccines have been developed for related animal CoVs in chickens, cattle, dogs, cats, and swine, and have included live-attenuated, killed, DNA, and viral-vectored vaccine strategies (Cavanagh, 2003). An important issue to highlight from work on these vaccines is that CoV vaccines, such as those developed for cats, may induce a more severe disease (Perlman and Dandekar, 2005; Weiss and Scott, 1981). Similarly, immune mice had Th2-type immunopathology upon SARS-CoV challenge (Tseng et al., 2012). Severe hepatitis in vaccinated ferrets, with antibody enhancement in the liver, has been reported (Weingartl et al., 2004). Additionally, rechallenge of AGMs showed limited viral replication but significant lung inflammation, including alveolitis and interstitial pneumonia, which persisted for long periods of time after viral clearance (Clay et al., 2012). Mouse and NHP models with increased virulence may be developed by adapting the virus through repeated passage within the species of interest. A mouse-adapted SARS-CoV with uniform lethality was developed through 15 serial passages in the lungs of young BALB/c mice (McCray et al., 2007; Roberts et al., 2007; Rockx et al., 2007).

Middle East respiratory syndrome coronavirus (MERS-CoV) emerged in Saudi Arabia and is associated with fever, severe lower respiratory tract infection, and oftentimes renal failure (Al-Tawfiq et al., 2016; Omrani et al., 2015).
MERS patients can also occasionally present with neurological symptoms. MERS-CoV infection has a high fatality rate, although infections in humans can also be asymptomatic. As of October 2015, there were 1589 confirmed cases and 567 deaths (Li et al., 2016b). Bats serve as the likely natural reservoir, since virus with 100% nucleotide identity to the index case was isolated from Egyptian tomb bats (Memish et al., 2013). Spread to humans likely comes from infected dromedary camels (Adney et al., 2014; Azhar et al., 2014). The host range for MERS-CoV is dependent on the binding of the viral S protein to the host receptor, human dipeptidyl peptidase 4 (hDPP4), also known as CD26 (Raj et al., 2013). The expression and distribution of DPP4 in the human respiratory tract has recently been well characterized (Meyerholz et al., 2016). Interestingly, DPP4 expression is preferentially localized to alveolar regions, perhaps explaining why MERS predominantly manifests as an infection of the lower respiratory tract. Humans with preexisting pulmonary disease have increased DPP4 expression in the alveolar epithelia. Small animals typically used for viral disease research, such as mice, hamsters, guinea pigs, and ferrets, are naturally nonpermissive to MERS-CoV infection due to the low binding efficiency of the viral S protein to the host DPP4 (Sutton and Subbarao, 2015). In contrast, the rhesus macaque and common marmoset have complete homology to human DPP4, allowing productive MERS-CoV infection to occur (Falzarano et al., 2014; Munster et al., 2013; Yao et al., 2014). New Zealand white rabbits can be infected with MERS-CoV, and virus was isolated from the upper respiratory tract, but there were no clinical symptoms or significant histopathological changes (Haagmans et al., 2015). Due to the lack of strong binding affinity of the MERS-CoV S protein to the murine DPP4 receptor, wild-type mice are not susceptible to MERS-CoV infection.
As such, several approaches have been used to create susceptible murine models of MERS-CoV infection by inducing the expression of hDPP4. One approach utilized an adenovirus vector expressing hDPP4 to transduce mice (Zhao et al., 2014b). These mice developed pneumonia but survived MERS-CoV infection. IN MERS-CoV infection of mice with global expression of hDPP4 resulted in ID50 and LD50 values of <1 and 10 TCID50, respectively (Tao et al., 2016). Thus, MERS-CoV infection of these transgenic mice can be either sublethal or uniformly lethal depending on the dose. Inflammatory infiltrates were found in the lungs and brain stems of the mice, with some focal infiltrates in the liver as well. Another strategy uses transgenic mice expressing hDPP4 under either a surfactant protein C or cytokeratin 10 promoter (Li et al., 2016b). IN MERS-CoV infection of these mice resulted in a uniformly lethal disease characterized by alveolar edema, microvascular thrombosis, and mononuclear cell infiltration in the lungs. The brain stem was also affected by the infection. hDPP4 expression under a ubiquitously expressing cytomegalovirus promoter also produced a uniformly lethal infection with predominant lung and brain involvement, but numerous other tissues were also affected and contained virus (Agrawal et al., 2015). Common marmosets infected with 5.2 × 10^6 TCID50 of MERS-CoV (EMC-2012) by the combined IN, oral, ocular, and IT routes recapitulate the severe disease seen in human infections (Falzarano et al., 2014). The animals manifested moderate to severe clinical disease, with interstitial infiltration of both lower lung lobes. Two of nine animals became moribund between days 4 and 6. Viral RNA was detected in nasal and throat swabs, various organs, and the blood of some animals, indicating a systemic infection. Histologically, animals showed evidence of acute bronchial interstitial pneumonia as well as other pathological defects.
Infection of rhesus macaques with MERS-CoV results in a mild clinical disease characterized by a transient lung infection with pneumonia. Rhesus macaques were inoculated with at least 10^7 TCID50 of MERS-CoV (EMC-2012) either by the IT route or by a combined IN, IT, oral, and ocular inoculation. The result was a mild respiratory illness, including nasal swelling and a short fever, with all animals surviving. Viral RNA was recovered from nasal swab samples, and replicating virus was found in lung tissue. Mild pathological lesions were found only in the lungs. Radiographic imaging of the lungs revealed interstitial infiltrates, which are signs of pneumonia (Yao et al., 2014). Interestingly, MERS-CoV infection is more severe in marmosets than in rhesus macaques (Falzarano et al., 2014). This is despite the finding that both species have complete homology with humans within the DPP4 domain that interacts with the viral S protein. Other host factors influencing disease severity have not yet been identified. Transgenic mouse models expressing hDPP4 are ideal for initial development and screening of MERS-CoV countermeasures, and marmosets can be used for final selection and characterization.

The family Filoviridae consists of three genera: Ebolavirus, Marburgvirus, and a newly discovered group, Cuevavirus (Kuhn, 2008). It is thought that various species of bats are the natural host reservoir for these viruses, which have lethality rates of 40%-82% in humans. There is evidence that the Egyptian rousette bat (Rousettus aegyptiacus) is the natural reservoir for marburgviruses, but it may not be for ebolaviruses (Jones et al., 2015). Marburg virus (MARV) first emerged in 1967 in Germany, when laboratory workers contracted the virus from AGMs (Chlorocebus aethiops) that were shipped from Uganda. Ebolaviruses Sudan and Zaire (SUDV and EBOV) caused nearly simultaneous outbreaks in 1976 in what is now the Democratic Republic of the Congo (DRC).
the most recent outbreak of ebov in west africa was by far the largest, with over 28,000 suspected, probable, and confirmed cases and over 11,000 deaths. bundibugyo virus (bdbv) first emerged in 2007 in bundibugyo, uganda, with 56 confirmed cases. two other ebolaviruses are known: taï forest virus (tafv; previously named côte d'ivoire ebolavirus, ciebov) and reston virus (restv), which have not caused major outbreaks or lethal disease in humans. filovirus disease in humans is characterized by aberrant innate immunity and a number of clinical symptoms: fever, nausea, vomiting, arthralgia/myalgia, headaches, sore throat, diarrhea, abdominal pain, and anorexia, as well as numerous others (mehedi et al., 2011; wauquier et al., 2010). approximately 10% of patients develop petechiae, and a greater percentage, depending on the specific strain, may develop bleeding from various sites (gums, puncture sites, stools, etc.) (table 33.2). natural transmission in an epidemic is through direct contact or needle sticks in hospital settings. however, much of the research interest in filoviruses primarily stems from biodefense needs, particularly from aerosol biothreats. as such, im, ip, and aerosol models have been developed in mice, hamsters, guinea pigs, and nhps for the study of pathogenesis, correlates of immunity, and for testing countermeasures. since filoviruses have such high lethality rates in humans, scientists have looked for models that are uniformly lethal in order to stringently test the efficacy of candidate vaccines and therapeutics. one issue to note in animal model development of filovirus infection is the impact of particle to plaque-forming unit (pfu) ratios on lethality, wherein it is possible that increasing the dose could actually decrease infectivity due to an immunogenic effect produced by inactive virions in the stock.
additionally, the plaque assay used to measure live virions in a stock may greatly underestimate the true quantity of infectious virions in a preparation (alfson et al., 2015; smither et al., 2013a). immunocompetent mice have not been successfully infected with wild-type filoviruses due to the control of the infection by the murine type 1 interferon response (bray, 2001). however, wild-type inbred mice are susceptible to filovirus that has been mouse adapted (ma) by serial passage in mice (bray et al., 1999). marv angola was particularly resistant to adaptation, but after 24 serial passages in scid mice, infection caused severe disease in balb/c and c57bl/6 mice when administered in or ip (qiu et al., 2014). these mice had pathology with some similarities to infection in humans, including lymphopenia, thrombocytopenia, liver damage, and viremia. balb/c mice, which are the strain of choice for ip inoculation of ma-ebov, are not susceptible by the aerosol route (bray et al., 1999; zumbrun et al., 2012a). for aerosol infection of immunocompetent mice, a panel of bxd (balb/c x dba) recombinant inbred strains was screened, and one strain, bxd34, was particularly susceptible to airborne ma-ebov, with 100% lethality at low or high doses (approximately 100 or 1000 pfu) (zumbrun et al., 2012a). these mice developed weight loss of greater than 15% and succumbed to infection between days 7 and 8 postexposure. the aerosol infection model utilizes a whole-body exposure chamber to expose mice aged 6-8 weeks to ma-ebov aerosols with a mass median aerodynamic diameter (mmad) of approximately 1.6 µm and a geometric standard deviation (gsd) of approximately 2.0 for 10 min. another approach uses immunodeficient mouse strains, such as scid, stat1 ko, ifn receptor ko, or perforin ko, with a wild-type ebov inoculum by ip or aerosol routes (bray, 2001; lever et al., 2012; zumbrun et al., 2012a).
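the mmad and gsd figures used above to characterize aerosols summarize a (typically lognormal) particle size distribution: mmad is the diameter below which half of the aerosol mass lies, and gsd describes the spread of that mass on a log scale. a minimal sketch of how both can be computed from binned impactor data, assuming a lognormal distribution (the diameters and mass fractions below are invented for illustration):

```python
import math

def mmad_and_gsd(diameters_um, mass_fractions):
    """Mass-weighted geometric mean diameter (equal to the mass median,
    i.e., the MMAD, under a lognormal assumption) and geometric standard
    deviation from binned aerosol impactor data."""
    total = sum(mass_fractions)
    logs = [math.log(d) for d in diameters_um]
    # mass-weighted mean and variance of log-diameter
    mean_log = sum(m * l for m, l in zip(mass_fractions, logs)) / total
    var_log = sum(m * (l - mean_log) ** 2
                  for m, l in zip(mass_fractions, logs)) / total
    return math.exp(mean_log), math.exp(math.sqrt(var_log))

# invented impactor bins: mass centered on 1.6 µm, spreading about 2-fold
mmad, gsd = mmad_and_gsd([0.8, 1.6, 3.2], [1.0, 2.0, 1.0])
```

a gsd near 1.0 would indicate a monodisperse aerosol; the ~2.0 reported for these exposures indicates a moderately polydisperse one.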
mice are typically monitored for clinical disease "scores" based on activity and appearance, weight loss, and moribund condition (survival). coagulopathy, a hallmark of filovirus infection in humans, has been observed, with bleeding in a subset of animals and failure of blood samples to coagulate late in infection (bray et al., 1999). liver, kidney, spleen, and lung tissue taken from moribund mice have pathology characteristic of filovirus disease in nhps (zumbrun et al., 2012a). while most mouse studies have used ma-ebov or ebov, an ip mouse-adapted marv model is also available (warfield et al., 2007, 2009). ma-marv and ma-ebov models are particularly useful for screening novel antiviral compounds (panchal et al., 2012). recently, a model was created using immunodeficient nsg [nonobese diabetic (nod)/scid/il-2 receptor chain knockout] mice transplanted with human hematopoietic stem cells from umbilical cord blood. these mice were susceptible to lethal wt (nonadapted) ebov by ip and in exposure (ludtke et al., 2015). the transplanted mice had all of the cellular components of a fully functional adaptive human immune system and developed disease upon ebov challenge (brannan et al., 2015; lever et al., 2012). interestingly, inoculation of ifn-α/β receptor knockout mice with tafv and restv does not result in clinical signs. yet another strategy uses knockout mice lacking possible receptors for filovirus entry, such as niemann-pick c1 and c2 (npc1 and npc2). npc2 (−/−) mice were fully susceptible to infection with ebov, but npc1 (−/−) mice were completely resistant (herbert et al., 2015). hamsters are frequently used to study cardiovascular disease and coagulation disorders and thus serve as the basis for numerous viral hemorrhagic fever models (gowen and holbrook, 2008; herbert et al., 2015). an ip ma-ebov infection model has been developed in syrian hamsters (gowen and holbrook, 2008; herbert et al., 2015; tsuda et al., 2011).
this model, which has been used to test a vesicular stomatitis virus vectored vaccine approach, utilizes male 5- to 6-week-old syrian hamsters infected with 100 ld50 of ma-ebov. virus is present in tissues and blood collected on day 4, and all animals succumbed to the disease by day 6. infected hamsters had severe coagulopathy and uncontrolled host immune responses, similar to what is observed in primates (ebihara et al., 2010). guinea pig models of filovirus infection have been developed for ip and aerosol routes using guinea pig-adapted ebov (gp-ebov) and marv (gp-marv) (choi et al., 2012; connolly et al., 1999; twenhafel et al., 2015; zumbrun et al., 2012c). guinea pig models of filovirus infection are quite useful in that the animals develop fever, which can be monitored at frequent intervals by telemetry. additionally, the animals are large enough for regular blood sampling, in which measurable coagulation defects are observed as the infection progresses. a comparison of ip infection of outbred guinea pigs with guinea pig-adapted marv angola and marv ravn revealed similar pathogenesis (cross et al., 2015). infection with either strain resulted in features of the disease that are similar to what is seen in human and nhp infection, such as viremia, fever, coagulopathy, lymphopenia, elevated liver enzymes (alt and ast), thrombocytopenia, and splenic, gastrointestinal, and hepatic lesions. gp-marv-ravn had a delayed disease progression relative to gp-marv-ang. hartley guinea pigs exposed to aerosolized gp-ebov develop lethal interstitial pneumonia. this is in contrast to subq infection of guinea pigs, aerosol ebov challenge of nhps, and natural human infection (twenhafel et al., 2015). both subq and aerosol exposure of guinea pigs to gp-ebov resulted in only mild lesions in the liver and spleen.
by aerosol exposure, gp-ebov is uniformly lethal at both high and low doses (100 or 1000 pfu target doses), but lethality drops with low (less than 1000 pfu) presented doses of airborne gp-marv, and more protracted disease is seen in some animals (our unpublished observations) (zumbrun et al., 2012c). weight loss of between 15% and 25% is a common finding in guinea pigs exposed to gp-ebov or gp-marv. fever, which becomes apparent by day 5, occurs more rapidly in gp-ebov-exposed guinea pigs than with gp-marv exposure. lymphocytes and neutrophils increase during the earlier part of the disease, and platelet levels steadily drop as the disease progresses. increases in coagulation time can be seen as early as day 6 postexposure. blood chemistries (i.e., alt, ast, alkp, and bun) indicating problems with liver and kidney function are also altered late in the disease course. transmission of ebov has been documented from swine to nhps via the respiratory tract (kobinger et al., 2011). as such, guinea pigs have been used to establish transmission models (wong et al., 2015a,b). nonexposed guinea pigs were placed in cages with infected guinea pigs 1 day postexposure to gp-ebov. guinea pigs challenged intranasally were more likely to transmit virus to naive cagemates than those exposed by the ip route. nhp models of filovirus infection are the preferred models for more advanced disease characterization and testing of countermeasures because they most closely mimic the disease and immune correlates seen in humans (dye et al., 2012). old world primates have been primarily used for development of ip, im, and aerosol models of filovirus infection (twenhafel et al., 2013). uniformly lethal filovirus models have been developed for most of the virus strains in cynomolgus macaques, rhesus macaques, and, to a lesser degree, agms (alves et al., 2010; carrion et al., 2011; davis et al., 1997; hensley et al., 2011a; reed et al., 2007; zumbrun et al., 2012b).
low-passage human isolates that have not been passaged in animals have been sought for development of nhp models to satisfy the food and drug administration (fda) animal rule. ebov-makona, the strain responsible for the recent large outbreak in west africa, was compared to the "prototype" 1976 ebov strain (marzi et al., 2015). the disease in cynomolgus macaques was similar for both viruses, but disease progression was delayed for ebov-makona. this delay, as well as the lower fatality rate in the 2014 epidemic compared to the 1976 outbreak, suggests that ebov-makona is less virulent. the large number of cases in the 2014-15 ebov outbreak brought to light previously underappreciated eye pathology and ocular viral persistence in survivors. while survivors of nhp filovirus infection are infrequent, necrotizing scleritis, conjunctivitis, and other ocular pathology have been observed in ebov-infected animals (alves et al., 2016). prominent features of filovirus infections in nhps are onset of fever by day 5 postexposure, viremia, lymphopenia, tachycardia, azotemia, alteration in liver function enzymes (alt, ast, and alkp), decrease in platelets, and increased coagulation times. petechial rash is a common sign of filovirus disease and may be more frequently observed in cynomolgus macaques than in other nhp species (zumbrun et al., 2012b). immunological parameters have been evaluated, and t, b, and natural killer cells are greatly diminished as the infection progresses (fernando et al., 2015). a cytokine storm occurs with rises in ifnγ, tnf, il-6, and ccl2 (fernando et al., 2015). however, there is also evidence from transcriptional profiling of circulating immune cells that the early immune response is skewed toward a th2 response (connor et al., 2015). strikingly, animals surviving challenge may have a delay in the production of inflammatory cytokines and chemokines (martins et al., 2015). clinical disease parameters may have a slightly delayed onset in aerosol models.
dyspnea late in infection is a prominent feature of disease after aerosol exposure (zumbrun et al., 2012b). aerosol filovirus infection of nhps results in early infection of respiratory lymphoid tissues, dendritic cells, alveolar macrophages, blood monocytes, and fibroblastic reticular cells, followed by spread to regional lymph nodes and then multiple organs (ewers et al., 2016; twenhafel et al., 2013). pronounced pathology findings include multifocal hepatic necrosis and fibrin accumulation, particularly within the liver and the spleen. for aerosolized marv infection of rhesus macaques, the most significant pathology included destruction of the tracheobronchial and mediastinal lymph nodes (ewers et al., 2016). lymphocytolysis and lymphoid depletion are also observed (alves et al., 2010). multilead, surgically implanted telemetry devices are useful for continuous collection of temperature, blood pressure, heart rate, and activity levels. these recordings show that blood pressure drops as animals become moribund and that heart rate variability (standard deviation of the heart rate) is altered late in infection (zumbrun et al., 2012b). the most recently developed telemetry devices can also aid in plethysmography to measure respiratory minute volume for accurate delivery of presented doses during aerosol exposure. standardized filovirus-infected nhp euthanasia criteria have also been developed to enhance reproducibility for studies that evaluate therapeutic and vaccine countermeasures (warren et al., 2014). filovirus infection of common marmosets (callithrix jacchus) is also a viable model to study the disease course. respiratory infection of marmosets with marv results in a lethal infection with fever, hemorrhaging, transient rash, disseminated viral infection, increases in liver function enzymes, coagulopathy, hepatitis, and histological lesions, particularly in the kidney and liver (smither et al., 2013b).
marmosets are similarly susceptible to infection with ebov-kikwit (smither et al., 2015). thus, ebov or marv infection of marmosets produces features of the disease that are very similar to those of other nhps and humans.

hendra and nipah virus are unusual within the paramyxoviridae family given that they can infect a large range of mammalian hosts. both viruses are grouped under the genus henipavirus. the natural reservoirs of the viruses are fruit bats of the genus pteropus. hendra and nipah have the ability to cause severe disease in humans with the potential for a high case fatality rate (rockx et al., 2012). outbreaks due to nipah virus have been recognized in malaysia, singapore, bangladesh, and india, while hendra virus outbreaks have yet to be reported outside of australia (luby et al., 2009a,b). hendra was the first member of the genus identified and was initially associated with an acute respiratory disease in horses. all human cases have been linked to transmission through close contact with an infected horse. there have been no confirmed cases of direct transmission from bat to human. nipah has the distinction of human-to-human transmission, although the exact route is unknown (homaira et al., 2010). the virus is susceptible to ph, temperature, and desiccation, and thus close contact is hypothesized to be needed for successful transmission (fogarty et al., 2008). both viruses have a tropism for the neurological and respiratory tracts. the incubation period for hendra virus is 7-17 days and is marked by a flu-like illness. symptoms at this initial stage include myalgia, headache, lethargy, sore throat, and vomiting (hanna et al., 2006). disease progression can continue to pneumonitis or encephalitic manifestations, with the person succumbing to multiorgan failure (playford et al., 2010). nipah virus has an incubation period of 4 days to 2 weeks (goh et al., 2000). much like hendra, the first signs of disease are nondescript.
severe neurological symptoms subsequently develop, including encephalitis and seizures that can progress to coma within 24-48 h (lo and rota, 2008). survivors of infection typically make a full recovery; however, 22% suffer permanent sequelae, including persistent convulsions (tan and chua, 2008). at this time, there is no approved vaccine or antiviral, and treatment is purely supportive. animal models are being used not only to test novel vaccines and therapeutics, but also to deduce the early events of disease, because documentation of human cases begins at terminal stages. the best small animal model is the syrian golden hamster due to its high susceptibility to both henipaviruses. clinical signs upon infection recapitulate the disease course in humans, including acute encephalitis and respiratory distress. challenged animals died within 4-17 days postinfection. the progression of disease and timeline is highly dependent on dose and route of infection. in inoculation leads to imbalance, limb paralysis, lethargy, and breathing difficulties, whereas ip inoculation resulted in tremors and paralysis within 24 h before death. virus was detected in lung, brain, spleen, kidney, heart, spinal cord, and urine, with the brain having the highest titer. this model is used for vaccination and passive protection studies (guillaume et al., 2009; rockx et al., 2011; wong et al., 2003). the guinea pig model has not been widely used due to the lack of a respiratory disease upon challenge (torres-velez et al., 2008; williamson et al., 2001). inoculation with hendra virus via the subq route leads to a generalized vascular disease with 20% mortality. clinical signs were apparent 7-16 days postinfection, with death occurring within 2 days of cns involvement. a higher inoculum has been associated with development of encephalitis and cns lesions. id and in injection do not lead to disease, although the animals seroconvert upon challenge.
the inoculum source does not affect clinical progression. nipah virus challenge only causes disease upon ip injection and results in weight loss and transient fever for 5-7 days. virus was shed through urine and was present in the brain, spleen, lymph nodes, ovary, uterus, and urinary bladder (hooper et al., 1997). ferrets infected with hendra or nipah virus display the same clinical disease as seen in the hamster model and human cases (bossart et al., 2009; pallister et al., 2011). upon inoculation by the oronasal route, ferrets develop severe pulmonary and neurological disease within 6-9 days, including fever, coughing, and dyspnea. lesions do develop in the ferrets' brains, but to a lesser degree than seen in humans. cats have also been utilized as an animal model for henipaviruses. disease symptoms are not dependent upon the route of infection. the incubation period is 4-8 days and leads to respiratory and neurological symptoms (mungall et al., 2007; johnston et al., 2015; westbury et al., 1996). this model has proven useful for vaccine efficacy studies. squirrel monkeys and agms are representative of the nhp models. for squirrel monkeys, nipah virus is introduced by either the in or iv route and subsequently leads to clinical signs similar to those in humans, although in challenge results in milder disease. upon challenge, only 50% of animals develop disease manifestations, including anorexia, dyspnea, and acute respiratory syndrome. neurological involvement is characterized by uncoordinated motor skills, loss of consciousness, and coma. viral rna can be detected in lung, brain, liver, kidney, spleen, and lymph nodes, but only upon iv challenge (marianneau et al., 2010). agms are a very consistent model of both viruses. it inoculation results in 100% mortality, with death within 8.5 and 9-12 days postinfection for hendra and nipah viruses, respectively.
the animals develop severe respiratory and neurological disease with generalized vasculitis (rockx et al., 2010). the reservoir of the viruses, gray-headed fruit bats, has been experimentally challenged. due to their status as the host organism for henipaviruses, the bats do not develop clinical disease. however, hendra virus can be detected in kidneys, heart, spleen, and fetal tissue, and nipah virus can be located in urine. pigs develop a respiratory disease upon infection with both nipah and hendra viruses (berhane et al., 2008; li et al., 2010; middleton et al., 2002). oral inoculation does not produce a clinical disease, but subq injection represents a successful route of infection. live virus can be isolated from the oropharynx as early as 4 days postinfection. nipah virus can also be transmitted between pigs. nipah virus was able to induce neurological symptoms in 20% of the pigs, even though virus was present in all neurological tissues regardless of symptoms (weingartl et al., 2005). within the pig model, it appeared that nipah virus had a greater tropism for the respiratory tract, while hendra virus favored the neurological system. horses are also able to develop a severe respiratory tract infection accompanied by fever and general weakness upon exposure to nipah and hendra viruses. oronasal inoculation led to systemic disease, with viral rna detected in nasal swabs within 2 days (marsh et al., 2011; williamson et al., 1998). animals died within 4 days postexposure and had interstitial pneumonia with necrosis of alveoli (murray et al., 1995a,b). virus could be detected in all major systems. mice, rats, rabbits, chickens, and dogs have been tested but are nonpermissive to infection (westbury et al., 1995; wong et al., 2003). suckling balb/c mice succumb to infection if the virus is inoculated intracranially (mungall et al., 2006).
in exposure with nipah does not induce a clinical disease; however, there is evidence of a subclinical infection in the lungs following euthanasia of the mice (dups et al., 2014). in addition, a human lung xenograft model in nsg mice demonstrated that the human lung is highly susceptible to nipah viral replication and damage (valbuena et al., 2014). embryonated chicken eggs have been inoculated with nipah virus, leading to a universally fatal disease within 4-5 days postinfection (tanimura et al., 2006).

annually, respiratory syncytial virus (rsv) is responsible for lower respiratory tract infections in 33 million children under the age of 5, which in turn result in 3 million hospitalizations and approximately 200,000 deaths (nair et al., 2010). within the united states, hospital costs alone amount to over 600 million dollars per year (paramore et al., 2004). outbreaks are common in the winter (yusuf et al., 2007). the virus is transmitted by large respiratory droplets and replicates initially within the nasopharynx before spreading to the lower respiratory tract. the incubation period for the virus is 2-8 days. rsv is highly virulent, leading to very few asymptomatic infections (collins and graham, 2008). disease manifestations are highly dependent upon the age of the individual. rsv infections in neonates produce nonspecific symptoms, including overall failure to thrive, apnea, and feeding difficulties. infants present with a mild upper respiratory tract disease that can develop into bronchiolitis and bronchopneumonia. contracting rsv at this age results in an increased chance of developing childhood asthma (wu et al., 2008). young children develop recurrent wheezing, while adults have exacerbation of previously existing respiratory conditions (falsey et al., 2005). common clinical symptoms are runny nose, sneezing, and coughing accompanied by fever.
mortality rates from rsv in hospitalized children are 1%-3%, with the greatest burden of disease seen in 3-4 month olds (ruuskanen and ogra, 1993). hematopoietic stem cell transplant patients, solid organ transplant patients, and copd patients are particularly vulnerable to rsv infection and have mortality rates between 7.3% and 13.3% upon infection (anderson et al., 2016). although almost 60 rsv vaccine candidates are in preclinical and clinical phases, there is no licensed vaccine available, and ribavirin is not recommended for routine treatment (american academy of pediatrics subcommittee on diagnosis and management of bronchiolitis, 2006; higgins et al., 2016; kim and chang, 2016). animal models of rsv were developed in the hopes of formulating an effective and safe vaccine, unlike the formalin-inactivated rsv (fi-rsv) vaccine. that vaccine induced severe respiratory illness in infants who received the vaccine and were subsequently infected with live virus (kim et al., 1969). mice can be used to model rsv infection, although a very high in inoculum is needed to achieve clinical symptoms (jafri et al., 2004; stark et al., 2002). strain choice is crucial to reproducing a physiologically relevant response (stokes et al., 2011). age does not affect primary disease manifestations (graham et al., 1988). however, it does play a role in later sequelae, showing increased airway hyperreactivity. primary rsv infection produces increased breathing with airway obstruction (jafri et al., 2004; van schaik et al., 1998). virus was detected as early as day 3 and reached maximum titer at day 6 postinfection. clinical illness is defined in the mouse by weight loss and ruffled fur, as opposed to the runny nose, sneezing, and coughing seen in humans. a humanized mouse model was recently developed using in inoculation. the challenged mice experienced weight loss and demonstrated a humoral and cellular immune response to the infection (sharma et al., 2016).
cotton rats are useful given that rsv replicates to high titers within the lungs and can be detected in both the upper and lower airways after in inoculation (boukhvalova et al., 2009; niewiesk and prince, 2002). viral replication is 50- to 1000-fold greater in the cotton rat model than in the mouse model (wyde et al., 1993). cotton rats develop mild to moderate bronchiolitis or pneumonia (grieves et al., 2015; prince et al., 1999). although age does not appear to factor into clinical outcome, it has been reported that older cotton rats tend to take longer to achieve viral clearance. viral loads peak by the 5th day, dropping to below the level of detection by day 8. the histopathology of the lungs appears similar to that of humans after infection (piazza et al., 1993). this model has limited use in modeling the human immune response to infection, as challenge with the virus induces a th2 response in cotton rats, whereas humans tend to have a response skewed toward th1 (culley et al., 2002; dakhama et al., 2005; ripple et al., 2010). fi-rsv vaccine-enhanced disease was recapitulated in cotton rats vaccinated twice with fi-rsv and then challenged with live virus. chinchillas have been challenged experimentally with rsv via in inoculation. the virus was permissive within the nasopharynx and eustachian tube. the animals displayed an acute respiratory tract infection. this model is therefore useful in studying mucosal immunity during infection (gitiban et al., 2005). ferrets infected by the it route were found to have detectable rsv in throat swabs up to day 7 postinfection, and positive qpcr up to day 10. immunocompromised ferrets were observed to have higher viral loads accompanied by detectable viral replication in the upper respiratory tract (stittelaar et al., 2016). chimpanzees are permissive to replication and display clinical symptoms of rsv, including rhinorrhea, sneezing, and coughing.
adult squirrel monkeys, newborn rhesus macaques, and infant cebus monkeys were also challenged but did not exhibit any disease symptoms or high levels of viral replication (belshe et al., 1977). bonnet monkeys developed an inflammatory response by day 7, with viral rna detected in both bronchial and alveolar cells (simoes et al., 1999). the chimpanzee model has proven useful for vaccine studies (hancock et al., 2000; teng et al., 2000). sheep have also been challenged experimentally, since they develop respiratory disease when exposed to ovine rsv (meyerholz et al., 2004). lambs are also susceptible to human respiratory syncytial virus infection (olivier et al., 2009; sow et al., 2011). when inoculated intratracheally, the lambs developed an upper respiratory tract infection with cough after 6 days. some lambs went on to develop lower respiratory disease, including bronchiolitis. the pneumonia resolved within 14 days. rsv replication peaked at 6 days and rapidly declined. studying respiratory disease in sheep is beneficial given the structural features their airways share with humans (plopper et al., 1983; scheerlinck et al., 2008).

the influenza viruses consist of three types, influenza a, b, and c, based on antigenic differences. influenza a is further classified by subtype; 16 ha and 9 na subtypes are known. seasonal influenza is the most common infection and usually causes a self-limited febrile illness with upper respiratory symptoms and malaise that resolves within 10 days (taubenberger and morens, 2008). the rate of infection is estimated at 10% in the general population and can result in billions of dollars of loss annually from medical costs and reduced workforce productivity. approximately 40,000 people in the united states die each year from seasonal influenza (dushoff et al., 2006). thus, vaccines and therapeutics play a critical role in controlling infection, and development using animal models is ongoing (braun et al., 2007b).
influenza virus replicates in the upper and lower airways, peaking at approximately 48 h postexposure. infection can be more severe in infants and children under the age of 2, people over the age of 65, or immunocompromised individuals, in whom viral pneumonitis or pneumonia can develop, or bacterial superinfection can result in pneumonia or sepsis (barnard, 2009; glezen, 1982). pneumonia from secondary bacterial infection, such as streptococcus pneumoniae, streptococcus pyogenes, and neisseria meningitidis, and more rarely, staphylococcus aureus, is more common than viral pneumonia from the influenza virus itself, accounting for ∼27% of all influenza-associated fatalities (alonso et al., 2003; ison and lee, 2010; speshock et al., 2007). death, often due to ards, can occur as early as 2 days after onset of symptoms. lung histopathology in severe cases may include diffuse alveolar damage (dad), alveolar edema, hemorrhage, fibrosis, and inflammation (taubenberger and morens, 2008). the h5n1 avian strain of influenza has lethality rates of ∼50% (of known cases), likely because the virus preferentially binds to the cells of the lower respiratory tract, and thus the potential for global spread is a major concern (matrosovich et al., 2004; wang et al., 2016). h7n9 is another avian influenza a strain; it has infected more than 130 people and was implicated in 37 deaths. approximately 75% of infected people had a known exposure to birds. there is no evidence of sustained spread between humans, but these viruses are of great concern for their pandemic potential (zhang et al., 2013). the most frequently used animal models of influenza infection include mice, ferrets, and nhps. a very thorough guide to working with mouse, guinea pig, ferret, and cynomolgus macaque models was published by kroeze et al. (2012).
swine are not frequently utilized but are a potentially useful model for influenza research, since they share many similarities with humans in anatomy, genetics, susceptibility, and pathogenesis (rajao and vincent, 2015). lethality rates can vary with the virus strain used (with or without adaptation), dose, route of inoculation, age, and genetic background of the animal. the various animal models can capture the differing diseases caused by influenza: benign; severe; superinfection and sepsis; severe with ards; and neurologic manifestations (barnard, 2009). models can also utilize seasonal or avian strains and have been developed to study transmission, which is important for understanding the potential of more lethal strains, such as h5n1, to spread among humans. mouse models of influenza infection are very predictive for antiviral activity and tissue tropism in humans, and are useful in testing and evaluating vaccines (gilbert and mcleay, 2008; hagenaars et al., 2008; ortigoza et al., 2012). inoculation is by the in route, utilizing approximately 60 µl of inoculum in each nare of anesthetized mice. exposure may also be to small-particle aerosols containing influenza with a mmad of <5 µm. most inbred strains are susceptible, with particularly frequent use of balb/c followed by c57bl/6j mice. males and females have equivalent disease, but influenza is generally more infectious in younger 2- to 4-week-old (8-10 g) mice. mice are of somewhat limited use in characterizing the immune response to influenza. most inbred laboratory mice lack the mxa gene, which is an important part of the human innate immune response to influenza infection. the mouse homolog of mxa, mx1, is defective in most inbred mouse strains (staeheli and haller, 1987). mice with the knocked-in mx1 gene have a 1000-fold higher ld50 for an influenza a strain (pr8) than wild-type background c57bl/6 mice (grimm et al., 2007).
weight loss or reduced weight gain, decreased activity, huddling, ruffled fur, and increased respiration are the most common clinical signs in influenza-infected mice. for more virulent strains, mice may require euthanasia as early as 48 h postexposure, but most mortality occurs from 5 to 12 days postexposure, accompanied by decreases in rectal temperature (sidwell and smee, 2000). pulse oximeter readings and measurement of blood gases for oxygen saturation are also used to determine the impact of influenza infection on respiratory function (sidwell et al., 1992). virus can be isolated from bronchoalveolar lavage (bal) fluids throughout the infection and from tissues after euthanasia. for influenza strains with mild to moderate pathogenicity, disease is nonlethal and virus replication is detected within the lungs, but usually not other organs. increases in serum alpha-1-acid glycoprotein and lung weight also frequently occur. however, mice infected with influenza do not develop fever, dyspnea, nasal exudates, sneezing, or coughing. mice can be experimentally infected with influenza a or b, but the virus generally requires adaptation to produce clinical signs. mice express the receptors for influenza attachment in the respiratory tract; however, the distribution varies and sa 2,3 predominates over sa 2,6, which is why h1, h2, and h3 subtypes usually need to be adapted to mice while h5n1, h2, h6, and h7 viruses do not require adaptation (o'donnell and subbarao, 2011). to adapt a virus, mice are infected intratracheally or intranasally, virus isolated from the lungs is reinfected into mice, and the process is repeated a number of times. once adapted, influenza strains can produce severe disease, systemic spread, and neurotropism. h5n1 and the 1918 pandemic influenza virus can cause lethal infection in mice without adaptation (gao et al., 1999; taubenberger, 2006).
h5n1 infection of mice results in viremia and viral replication in multiple organ systems, severe lung pathology, fulminant diffuse interstitial pneumonia, pulmonary edema, high levels of proinflammatory cytokines, and marked lymphopenia (dybing et al., 2000; gubareva et al., 1998; lu et al., 1999). as in humans, the virulence of h5n1 is attributable to damage caused by an overactive host immune response. additionally, mice infected with the 1918 h1n1 influenza virus develop severe lung pathology and oxygen saturation levels that decrease with increasing pneumonia (barnard et al., 2007). reassortant influenza viruses of the 2009 h1n1 virus and a low-pathogenicity avian h7n3 virus can also induce disease in mice without adaptation. in superinfection models, a sublethal dose of influenza is given to mice followed 7 days later by in inoculation of a sublethal dose of a bacterial strain, such as s. pneumoniae or s. pyogenes (chaussee et al., 2011). morbidity, characterized by inflammation in the lungs but not bacteremia, begins a couple of days after superinfection and may continue for up to 2 weeks. at least one transmission model has also been developed in mice. with h2n2 influenza, transmission rates of up to 60% among cagemates can be achieved after infection by the aerosol route and cocaging after 24 h (schulman, 1968). rats (f344 and sd) inoculated with rat-adapted h3n2 developed inflammatory infiltrates and cytokines in bronchoalveolar lavage fluids, but had no lethality and few histopathological changes (daniels et al., 2003). additionally, an influenza transmission model has been developed in guinea pigs as an alternative to ferrets (lowen et al., 2006). cotton rats (sigmodon hispidus) have been used to test vaccines and therapeutics in a limited number of studies (eichelberger et al., 2004).
cotton rats have an advantage over mice in that their immune system is similar to that of humans (including the presence of the mx gene) and influenza viruses do not have to be adapted (eichelberger et al., 2006; ottolini et al., 2005). nasal and pulmonary tissues of cotton rats were infected, with upregulated cytokines and lung viral load peaking at 24 h postexposure. virus was cleared from the lung by day 3 and from the nares by day 66, but animals had bronchial and alveolar damage, and pneumonia, for up to 3 weeks. there is also a s. aureus superinfection model in cotton rats (braun et al., 2007a). coinfection resulted in bacteremia, high bacterial load in lungs, peribronchiolitis, pneumonitis, alveolitis, hypothermia, and higher mortality. domestic ferrets (mustela putorius furo) are frequently the animal species of choice for influenza studies because their susceptibility, clinical signs, peak virus shedding, kinetics of transmission, local expression of cytokine mrnas, and pathology resemble those of humans (lambkin et al., 2004; maines et al., 2012; mclaren and butchko, 1978). like humans, ferrets exclusively express neu5ac, which acts as a receptor for influenza a virus, a feature likely contributing to the susceptibility of ferrets to human-adapted influenza a virus strains (ng et al., 2014). the glycomic characterization of ferret respiratory tract tissues demonstrated some similarities and some differences to humans in terms of the potential glycan binding sites for the influenza virus (jia et al., 2014). ferrets also have airway morphology, respiratory cell types, and a distribution of influenza receptors (sa 2,6 and sa 2,3) within the airways similar to those of humans (van riel et al., 2007).
influenza was first isolated from ferrets inoculated in with throat washes from humans harboring the infection, and ferret models have since been used to test the efficacy of vaccines and therapeutic treatments (huber et al., 2008; lambkin et al., 2004; maines et al., 2012). when performing influenza studies in ferrets, animals should be serologically negative for circulating influenza viruses. infected animals should be placed in a separate room from uninfected animals. if animals must be placed in the same room, uninfected ferrets should be handled before infected ferrets. anesthetized ferrets are experimentally exposed to influenza by in inoculation of 0.25-0.5 ml containing approximately 10^4-10^6 egg id50, applied dropwise to each nostril. however, a larger inoculum volume of 1.0 ml has also been explored as being more appropriate, yielding more severe and consistent respiratory tract pathology, likely because the larger inoculum is more widely distributed in the lower respiratory tract (moore et al., 2014). video tracking to assign values to activity levels in ferrets can aid ferret studies, eliminating the need for collection of subjective and arbitrary clinical scores (oh et al., 2015). viral replication in the upper respiratory tract is typically measured by nasal washes, but virus can also be measured in bronchoalveolar lavage fluid using a noninvasive technique (lee et al., 2014). influenza types a and b naturally infect ferrets, resulting in an acute illness that usually lasts 3-5 days for mild to moderately virulent strains (maher and destefano, 2004). ferrets are more susceptible to influenza a than influenza b strains and are also susceptible to avian influenza h5n1 strains without adaptation (zitzow et al., 2002). however, the localized immune responses within the respiratory tract of ferrets infected with influenza a and b have been characterized and are similar (carolan et al., 2015).
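the ferret inoculum above is expressed in egg id50 units, i.e., the dose at which 50% of inoculated eggs become infected. as an illustrative sketch only (not described in the text), the classical reed-muench method estimates such a 50% endpoint from a serial-dilution titration; all dilution counts below are hypothetical example values.

```python
# illustrative sketch (not from the text): reed-muench estimation of a 50%
# endpoint titer (e.g., egg id50) from serial-dilution infectivity data.
# all dilution counts below are hypothetical example values.

def reed_muench_id50(log10_doses, n_infected, n_total):
    """return the log10 dose that infects 50% of inoculated units.

    `log10_doses` must be ordered from highest to lowest dose.
    """
    # per the reed-muench convention, infected units accumulate toward
    # higher doses and uninfected units accumulate toward lower doses.
    cum_inf = []
    running = 0
    for inf in reversed(n_infected):
        running += inf
        cum_inf.append(running)
    cum_inf.reverse()
    cum_uninf = []
    running = 0
    for inf, tot in zip(n_infected, n_total):
        running += tot - inf
        cum_uninf.append(running)
    rates = [ci / (ci + cu) for ci, cu in zip(cum_inf, cum_uninf)]
    # interpolate within the dose interval that brackets the 50% point
    for i in range(len(rates) - 1):
        if rates[i] >= 0.5 > rates[i + 1]:
            prop = (rates[i] - 0.5) / (rates[i] - rates[i + 1])
            return log10_doses[i] + prop * (log10_doses[i + 1] - log10_doses[i])
    raise ValueError("50% endpoint not bracketed by the dilution series")

# hypothetical titration: log10 dose and eggs infected out of 6 per dilution
log_doses = [6.0, 5.0, 4.0, 3.0, 2.0]
infected = [6, 6, 4, 1, 0]
totals = [6, 6, 6, 6, 6]
print(f"log10 id50 = {reed_muench_id50(log_doses, infected, totals):.2f}")
```

the returned value is a log10 titer, so an inoculum such as "10^4 egg id50" simply means 10^4 times this 50% endpoint dose.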
virulence and degree of pneumonitis caused by different influenza subtypes and strains vary from mild to severe and generally mirror those seen in humans (stark et al., 2013). nonadapted h1n1, h2n2, and h3n2 have mild to moderate virulence in ferrets. the sequencing of the ferret genome has allowed for the characterization of the ferret host response using rnaseq analysis. distinct signatures were obtained depending on the particular influenza strain used to inoculate the ferrets. also helpful is the sequencing and characterization of the influenza ferret infectome during different stages of the infection in naïve or immune ferrets (leon et al., 2013). since influenza infection is particularly devastating to the elderly population, an aged ferret model of h1n1 influenza infection was developed (paquette et al., 2014). features associated with increased clinical disease are weakened hemagglutinin antibody generation and attenuated th1 responses. pregnant and breastfeeding women and infants are also susceptible to more severe illness from influenza virus. to study this dynamic, a breastfeeding mother-infant ferret influenza infection model was created (paquette et al., 2015). notably, the mammary gland itself harbored virus, and transcript analysis showed downregulation of milk production genes. in support of the development of therapies, a ferret influenza model for pharmacokinetic/pharmacodynamic studies of antiviral drugs has also been developed (reddy et al., 2015). critical to this model is ensuring pronounced clinical signs and robust viral replication upon influenza infection. strains of low virulence replicate predominantly in the nasal turbinates of ferrets. clinical signs and other disease indicators in ferrets are similar to those of humans with mild respiratory disease: sneezing, nasal secretions containing virus, fever, weight loss, high viral titers, inflammatory infiltrate in the airways, bronchitis, and pneumonia (svitek et al., 2008).
replication in both the upper and lower airways is associated with more severe disease and greater mortality. additionally, increased expression of proinflammatory mediators and reduced expression of antiinflammatory mediators in the lower respiratory tract of ferrets correlates with severe disease and lethal outcome. h5n1-infected ferrets develop severe lethargy, a greater interferon response, transient lymphopenia, and replication in the respiratory tract, brain, and other organs (peng et al., 2012; zitzow et al., 2002). immunocompromised humans have influenza illness of greater duration and with more complications. immunocompromised ferrets infected with influenza similarly had prolonged virus shedding (van der vries et al., 2013). interestingly, antiviral resistance emerged in both immunocompromised humans and ferrets infected with influenza. alveolar macrophage-depleted ferrets infected with 2009 pandemic h1n1 influenza also had more severe disease with greater viral replication in the lungs and greater induction of inflammatory chemokines (kim et al., 2013). a superinfection model resembling that of mice has been developed by in instillation of influenza in 6- to 8-week-old ferrets followed by in inoculation of s. pneumoniae 5 days later (peltola et al., 2006). this typically resulted in otitis media, sinusitis, and pneumonia. transmission models in ferrets have recently met with worldwide media attention and controversy with regard to the study of h5n1 (enserink, 2013; fouchier et al., 2012; herfst et al., 2012; oh et al., 2014). in general, some subtypes, such as the 2009 h1n1, can transmit easily through aerosols and respiratory droplets (munster et al., 2009). of concern, h7n9 isolated from humans was more pathogenic and readily transmissible between ferrets by larger respiratory droplets and smaller particle aerosols (kreijtz et al., 2013; richard et al., 2013; zhang et al., 2013).
h5n1 became transmissible by adopting just four mutations, spreading between ferrets in separate cages (imai et al., 2012). transmission occurs more readily at the height of pyrexia, but for the 2009 h1n1 in particular, can occur before fever is detected (roberts et al., 2012). ferret-to-ferret transmission of a mouse-adapted influenza b virus has also been demonstrated (kim et al., 2015). since ferrets can be expensive and cumbersome, influenza infection has been characterized and a transmission model developed in the guinea pig; however, this is a newer model with infrequent utilization thus far (lowen et al., 2014). old and new world primates are susceptible to influenza infection and have an advantage over ferret and mouse models, which are deficient for h5n1 vaccine studies because of a lack of correlation with hemagglutination inhibition (murphy et al., 1980). of old world primates, the cynomolgus macaque (macaca fascicularis) is most frequently utilized for studies of vaccines and antiviral drug therapies (stittelaar et al., 2008). h5n1 and 1918 h1n1 infection of cynos is very similar to that of humans (rimmelzwaan et al., 2001). cynos develop fever and ards upon in inoculation of h5n1, with necrotizing bronchial interstitial pneumonia. nhps are challenged by multiple routes (ocular, nasal, and tracheal) simultaneously with 1 × 10^6 pfu per site. virus antigen is primarily localized to the tonsils and pulmonary tissues. infection of cynos with h5n1 results in fever, lethargy, nasal discharge, anorexia, weight loss, virus in nasal and tracheal washes, pathologic and histopathologic changes, and alveolar and bronchial inflammation. the 1918 h1n1 caused a very high mortality rate due to an aberrant immune response and ards, with more than 50% lethality (humans had only a 1%-3% lethality) (kobasa et al., 2007).
ards and mortality also occur with the more pathogenic strains, but nhps show reduced susceptibility to less virulent strains, such as h3n2 (o'donnell and subbarao, 2011). influenza-infected rhesus macaques represent a mild disease model for vaccine and therapeutic efficacy studies (baas et al., 2006). host microarray and qrt-pcr analyses proved useful for examining infected lung tissues. other nhp models include influenza infection of pigtailed macaques as a mild disease model and infection of new world primates, such as squirrel and cebus monkeys (baskin et al., 2009). domestic pig models have been developed for vaccine studies for swine flu. pigs are susceptible in nature as natural or intermediate hosts but are not readily susceptible to h5n1 (isoda et al., 2006; lipatov et al., 2008). while pigs infected with influenza may have fever, anorexia, and respiratory signs, such as dyspnea and cough, mortality is rare (van der laan et al., 2008). size and space requirements make this animal difficult to work with, although the development of minipig models may provide an easier-to-use alternative. cat and dog influenza models have primarily been utilized to study their susceptibility to h5n1, with the thought that these animals could act as sentinels or could serve to transmit the virus to humans (giese et al., 2008; rimmelzwaan et al., 2006). these models are not generally used to better understand the disease in humans or for testing vaccines or antivirals. rift valley fever virus (rvfv) causes epizootics and human epidemics in africa. rvfv mainly infects livestock, such as sheep, cattle, and goats. after a 2- to 4-day incubation period, animals show signs of fever, hepatitis, and abortion, the latter a hallmark diagnostic sign known among farmers (balkhy and memish, 2003).
mosquito vectors, unpasteurized milk, aerosols of infected animals' body fluids, or direct contact with infected animals are the important routes of transmission to humans (abu-elyazeed et al., 1996; mundel and gear, 1951). after a 2- to 6-day incubation period, rvfv causes a wide range of signs and symptoms in humans, ranging from asymptomatic to severe disease with hepatitis, vision loss, encephalitis, and hemorrhagic fever (ikegami and makino, 2011; laughlin et al., 1979; peters and linthicum, 1994). depending on the severity of the disease when the symptoms start, 10%-20% of hospitalized patients might die 3-6 days or 12-17 days after disease onset (ikegami and makino, 2011). hepatic failure, renal failure or dic, and encephalitis are found in patients at postmortem examination. live domestic animals, especially sheep and goats, were used to develop animal models of rvfv (weingartl et al., 2014). this study indicated that goats were more resistant to the disease than sheep. the viremia in goats was lower and of shorter duration, with only some animals developing fever. susceptibility is influenced by route of infection, breed of animal, rvfv strain, and growth conditions as well as passage history. therefore, it might be difficult to establish an animal model with domestic ruminants. mice are one of the most susceptible animal species to rvfv infection. several mouse models, including balb/c, ifnar −/−, mbt/pas, 129, and c57bl/6, were exposed to rvfv via parenteral or aerosol routes of infection (ross et al., 2012). subq or ip routes of infection cause acute hepatitis and lethal encephalitis at a late stage of the disease in mice (mims, 1956; smith et al., 2010). mice start to show signs of decreased activity and ruffled fur by day 2-3 postexposure. immediately following these signs, they become lethargic and generally die 3-6 days postexposure.
ocular disease or the hemorrhagic form of the disease has not been observed in mouse models so far (ikegami and makino, 2011). increased viremia and tissue tropism were reported in mice (smith et al., 2010), with increased liver enzymes and lymphopenia observed in sick animals. aerosolized rvfv causes faster and more severe neuropathology in mice compared to the parenteral route (dodd et al., 2014; reed et al., 2014). the liver is a target organ following aerosol exposure, and liver failure results in fatality. rats and gerbils are also susceptible to rvfv infection. the rat's susceptibility depends on the rat strain utilized for the challenge model and the route of exposure. there is also a noted age dependence in the susceptibility of rats. while wistar-furth and brown norway strains, and young rats, are highly susceptible to rvfv infection, fisher 344, buffalo, and lewis strains, and old rats, demonstrated resistance to infection via the subq route (findlay and howard, 1952; peters and slone, 1982). similar pathologic changes, such as liver damage and encephalopathy, were observed in both rats and mice. the recent study by bales et al. (2012) showed that aerosolized rvfv caused similar disease outcomes in wistar-furth and aci rats, while lewis rats developed fatal encephalitis that was much more severe than with the subq route of infection. there was no liver involvement in the gerbil model, and animals died from severe encephalitis. the mortality rate was dependent on the strain used and the dose given to gerbils (anderson et al., 1988). similar to the rat model, the susceptibility of gerbils was also dependent on age. natural history studies with syrian hamsters indicated that the liver was the target organ, with highly elevated alt levels and viral titers (scharton et al., 2015). lethargy, ruffled fur, and hunched posture were observed in hamsters by day 2 post-subq inoculation, and the disease was uniformly lethal by day 2-3 postexposure.
this model has been successfully used to test antivirals against rvfv (scharton et al., 2015). studies thus far showed that rvfv does not cause uniform lethality in an nhp model. ip, in, iv, and aerosol routes have been utilized to develop nhp models. rhesus macaques, cynomolgus macaques, african green monkeys, and south american monkeys were some of the nhp species used for this effort. monkeys showed a variety of signs ranging from febrile disease to hemorrhagic disease and mortality. temporal viremia, increased coagulation parameters (pt, aptt), and decreased platelets were some other signs observed in nhps. animals that succumbed to disease showed pathogenesis very similar to that of humans, such as pathological changes in the liver and hemorrhagic disease. there was no ocular involvement in this model. smith et al. compared iv, in, and subq routes of infection in common marmosets and rhesus macaques (peng et al., 2012). marmosets were more susceptible to rvfv infection than rhesus macaques, with marked viremia, acute hepatitis, and late onset of encephalitis. increased liver enzymes were observed in both species. necropsy results showed enlarged livers in the marmosets exposed by iv or subq routes. although there were no gross lesions in the brains of marmosets, histopathology showed encephalitis in the brains of in-challenged marmosets. a recent study by hartman et al. (2014) demonstrated that aerosolized rvfv caused only mild fever in cynomolgus macaques and rhesus macaques, while agms and marmosets had encephalitis and succumbed to disease between days 9 and 11 postexposure. in contrast to other lethal models, the brain was the target organ in agms and marmosets. although no change was observed in ast levels, alp levels were increased in marmosets. little or no change was observed in hepatic enzyme levels in agms. the lack of information regarding human disease via the aerosol route of exposure makes it difficult to evaluate these animal models.
crimean-congo hemorrhagic fever virus (cchfv) generally circulates in nature unnoticed in an enzootic tick-vertebrate-tick cycle and, similar to other zoonotic agents, appears to produce little or no disease in its natural hosts, but causes severe disease in humans. cchfv transmits to humans via ixodid ticks, direct contact with sick animals/humans, or body fluids of animals/humans (ergonul and whitehouse, 2007). incubation, prehemorrhagic, hemorrhagic, and convalescence are the four phases of the disease seen in humans. the incubation period lasts 1-9 days. during the prehemorrhagic phase, patients show signs of nonspecific flu-like disease for approximately a week. the hemorrhagic period results in circulatory shock and dic in some patients (mardani and keshtkar-jahromi, 2007; swanepoel et al., 1989). over the years, several attempts have been made to establish an animal model for cchf in adult mice, guinea pigs, hamsters, rats, rabbits, sheep, nhps, etc. (fagbami et al., 1975; nalca and whitehouse, 2007; shepherd et al., 1989; smirnova, 1979). until recently, the only animal that manifested disease was the newborn mouse. ip infection of infant mice with cchfv resulted in fatality around day 8 postinfection (tignor and hanham, 1993). pathogenesis studies showed that virus replication was first detected in the liver, with subsequent spread to the blood (serum). virus was detected very late in the disease course in other tissues, including the heart (day 6) and the brain (day 7). recent studies utilizing knockout adult mice succeeded in developing a lethal small animal model for cchfv infection (bente et al., 2010; bereczky et al., 2010). bente et al. infected stat1 knockout mice by the ip route. in this model, after showing fever, leukopenia, thrombocytopenia, viremia, elevated liver enzymes, and proinflammatory cytokines, mice became moribund and succumbed to disease 3-5 days postexposure.
the second model was developed using interferon alpha/beta (ifnα/β) receptor knockout mice (ifnar −/−) (bereczky et al., 2010). similar observations were made in this model as in the stat1 knockout mouse model. animals were moribund and died 2-4 days after exposure, with high viremia levels in liver and spleen. characterization studies with ifnar −/− mice challenged by different routes (ip, in, im, and subq) showed that cchfv causes acute disease with high viral loads, pathology in liver and lymphoid tissues, an increased proinflammatory response, severe thrombocytopenia, coagulopathy, and death, all of which are characteristics of human disease. proinflammatory cytokines and chemokines, such as g-csf, ifnγ, cxcl10, and ccl2, increased dramatically by day 3 postchallenge, and gm-csf, il-1a, il-1b, il-2, il-6, il-12p70, il-13, il-17, cxcl1, ccl3, ccl5, and tnf-α concentrations were extremely elevated at the time of death/euthanasia. this model has also been used successfully to test therapeutics such as ribavirin, arbidol, and t-705 (favipiravir) (oestereich et al., 2014). experimental vaccines developed for cchf and evaluated in this model provided protection compared to unvaccinated mice (buttigieg et al., 2014; canakoglu et al., 2015). thus, the ifnar −/− mouse model would be a good choice to test medical countermeasures against cchfv, although these mice have an impaired ifn and immune response phenotype. other laboratory animals, including nhps, show little or no sign of infection or disease when infected with cchfv (nalca and whitehouse, 2007). butenko et al. utilized agms (cercopithecus aethiops) for experimental cchfv infections. except for one monkey with a fever on day 4 postinfection, the animals did not show signs of disease. antibodies to the virus were detected in three out of five monkeys, including the one with fever. fagbami et al. (1975) infected two patas monkeys (erythrocebus patas) and one guinea baboon (papio papio) with cchfv.
whereas all three animals had low-level viremia between days 1 and 5 after inoculation, only the baboon serum had neutralizing antibody activity on day 137 postinfection. similar results were obtained when horses and donkeys were used for experimental cchfv infections. donkeys develop a low-level viremia (rabinovich et al., 1972), and horses developed little or no viremia but high levels of virus-neutralizing antibodies, which remained stable for at least 3 months. these studies suggest that horses may be useful in the laboratory for obtaining serum for diagnostic and possibly therapeutic purposes (blagoveshchenskaya et al., 1975). shepherd et al. (1989) infected 11 species of small african wild mammals and laboratory rabbits, guinea pigs, and syrian hamsters with cchfv. whereas scrub hares (lepus saxatilis), cape ground squirrels (xerus inauris), red veld rats (aethomys chrysophilus), white-tailed rats (mystromys pumilio), and guinea pigs had viremia; south african hedgehogs (atelerix frontalis), highveld gerbils (gerbilliscus brantsii), namaqua gerbils (desmodillus auricularis), two species of multimammate mouse (mastomys natalensis and mastomys coucha), and syrian hamsters were negative for virus. all species, regardless of viremia levels, developed antibody responses against cchfv. iv and intracranially infected animals showed onset of viremia earlier than those infected by the subq or ip routes. the genus hantavirus is unique among the family bunyaviridae in that it is transmitted not by an arthropod vector, but rather by rodents (schmaljohn and nichol, 2007). rodents of the family muridae are the primary reservoir for hantaviruses. infected host animals develop a persistent infection that is typically asymptomatic. transmission is achieved by inhalation of infected rodent saliva, feces, and urine (xu et al., 1985).
human infections can normally be traced to a rural setting, with activities such as farming, land development, hunting, and camping as possible sources of transmission. rodent control is the primary route of prevention (lednicky, 2003). the viruses have a tropism for endothelial cells within the microvasculature of the lungs (zaki et al., 1995). there are two distinct clinical diseases that infection can yield: hemorrhagic fever with renal syndrome (hfrs), due to infection with old world hantaviruses, or hantavirus pulmonary syndrome (hps), caused by new world hantaviruses (nichol, 2001). hfrs is mainly seen outside of the americas and is associated with the hantaviruses dobrava-belgrade (also known as dobrava), hantaan, puumala, and seoul (lednicky, 2003). incubation lasts 2-3 weeks, and the illness presents as flu-like in the initial stages but can further develop into hemorrhagic manifestations and ultimately renal failure. thrombocytopenia subsequently develops, which can further progress to shock in approximately 15% of patients. the overall mortality rate is 7%. infections with dobrava and hantaan viruses are typically linked to development of severe disease. hps was first diagnosed in 1993 in the southwestern united states when healthy young adults became suddenly ill, progressing to severe respiratory distress and shock. the etiological agent responsible for this outbreak was identified as sin nombre virus (snv) (centers for disease control and prevention, 1993). this virus is still the leading cause of hps within north america. hps due to other hantaviruses has been reported in argentina, bolivia, brazil, canada, chile, french guiana, panama, paraguay, and uruguay (padula et al., 2000; stephen et al., 1994). the first report of hps in maine was recently documented (centers for disease control and prevention, 1993). andes virus (andv) was first identified in outbreaks in chile and argentina.
this hantavirus is distinct in that it can be transmitted between humans (wells et al., 1997). the fulminant disease is more lethal than that observed in hfrs, with a mortality rate of 40%. there are four phases of disease: prodromal, pulmonary, cardiac depression, and hematologic manifestation (peters and khan, 2002). incubation typically lasts 14-17 days following exposure (young et al., 2000). unlike hfrs, renal failure is not a major contributing factor to the disease. there is a short prodromal phase that gives way to cardiopulmonary involvement accompanied by cough and gastrointestinal symptoms. it is at this point that individuals are typically admitted to the hospital. pulmonary function deteriorates within 48 h of cardiopulmonary involvement. interstitial edema and air-space disease normally follow. in fatal cases, cardiogenic shock has been noted (hallin et al., 1996). syrian golden hamsters are the most widely utilized small animal model for hantavirus infection. hamsters inoculated im with a passaged andes viral strain died within 11 days postinfection. clinical signs did not appear until 24 h prior to death, at which point the hamsters were moribund and in respiratory distress. mortality was dose dependent, with high inocula leading to a shorter incubation before death. during the same study, hamsters were inoculated with a passaged snv isolate. no hamsters developed any symptoms during the course of observation. however, an antibody response to the virus that was not dose dependent was detected via elisa. hamsters infected with andv have significant histopathological changes in their lung, liver, and spleen. all had an interstitial pneumonia with intraalveolar edema. infectious virus could be recovered from these organs. viremia began on day 8 and lasted up to 12 days postinfection.
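the dose-dependent mortality noted above is commonly summarized as an ld50 from a dose-response fit. as an illustrative sketch only (the study's actual challenge data are not given in the text), the following fits a logistic dose-response curve to hypothetical grouped mortality counts by maximum likelihood over a simple grid search.

```python
import math

# illustrative sketch: estimate a log10 ld50 by fitting a logistic
# dose-response curve to grouped mortality data. all numbers are hypothetical.

def neg_log_lik(ld50, slope, log_doses, deaths, totals):
    """binomial negative log-likelihood of a logistic mortality curve."""
    nll = 0.0
    for x, d, n in zip(log_doses, deaths, totals):
        p = 1.0 / (1.0 + math.exp(-slope * (x - ld50)))
        p = min(max(p, 1e-9), 1.0 - 1e-9)  # guard against log(0)
        nll -= d * math.log(p) + (n - d) * math.log(1.0 - p)
    return nll

def fit_ld50(log_doses, deaths, totals):
    """grid-search maximum-likelihood estimate of (log10 ld50, slope)."""
    best = (float("inf"), None, None)
    for i in range(101):                 # ld50 candidates: 1.0 .. 5.0
        ld50 = 1.0 + 4.0 * i / 100
        for j in range(1, 101):          # slope candidates: 0.1 .. 10.0
            slope = 0.1 * j
            nll = neg_log_lik(ld50, slope, log_doses, deaths, totals)
            if nll < best[0]:
                best = (nll, ld50, slope)
    return best[1], best[2]

# hypothetical challenge data: log10 dose and deaths out of 6 animals per group
log_doses = [1.0, 2.0, 3.0, 4.0, 5.0]
deaths = [0, 1, 3, 5, 6]
totals = [6, 6, 6, 6, 6]
ld50, slope = fit_ld50(log_doses, deaths, totals)
print(f"estimated log10 ld50 = {ld50:.2f}")
```

a grid search is used here only to keep the sketch dependency-free; in practice a probit or logistic regression routine from a statistics package would be the usual choice.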
infection of hamsters with andv yielded a clinical disease progression similar to that seen in human hps, including rapid progression to death, fluid in the pleural cavity, and significant histopathological changes to the lungs and spleen. a major deviation in the hamster model is the detection of infectious virus within the liver. normally, snv does not cause disease in hamsters (wahl-jensen et al., 2007), but a recent study showed that immunosuppression with dexamethasone and cyclophosphamide in combination causes lethal disease with snv in hamsters (brocato et al., 2014). the disease was very similar to that caused by andv in hamsters. lethal disease can be induced in newborn mice, but it does not recapitulate the clinical symptoms observed in human disease (kim and mckee, 1985). the disease outcome is very much dependent on the age of the mice: younger mice are much more susceptible to the virus than adult mice. exposure of adult mice to hantavirus can lead to fatal disease, dependent upon viral strain and route of infection. the disease progression is marked by neurological or pulmonary manifestations that do not mirror human disease (seto et al., 2012; wichmann et al., 2002). knockout mice lacking ifnα/β are highly susceptible to hantavirus infection (muller et al., 1994). in a study of a panel of laboratory strains of mice, c57bl/6 mice were most susceptible to a passaged hantaviral strain injected ip. animals progressed to neurological manifestations, including paralyses and convulsions, and succumbed to infection within 24-36 h postinfection. clinical disease was markedly different from that observed in human cases (wichmann et al., 2002). in a recent study, 2-week-old icr mice were exposed to htnv strain aa57 via the subq route (seto et al., 2012). mice started to show signs of disease by day 11 postinoculation. piloerection, trembling, hunching, loss of body weight, labored breathing, and severe respiratory disease were observed in mice.
studies to develop nhp models were not successful until recently. nhps have been challenged with new world hantaviruses; however, no clinical signs were reported (hooper et al., 2006; mcelroy et al., 2002). cynomolgus monkeys challenged with a clinical isolate of puumala virus developed a mild disease (klingstrom et al., 2002; sironen et al., 2008). challenge of cynomolgus macaques with andv by both iv and aerosol exposure led to no signs of disease, although all animals displayed a drop in total lymphocytes within 5 days postinfection. four of six aerosol-exposed monkeys and 8 of 11 iv-injected monkeys developed viremia; infectious virus could not be isolated from any of the animals. in a recent study, rhesus macaques were inoculated by the intramuscular route with snv passaged only in deer mice (safronetz et al., 2014). characteristics of hps disease, including rapid onset of respiratory distress, severe pulmonary edema, thrombocytopenia, and leukocytosis, were observed in this promising model. viremia was observed 4-10 days prior to respiratory signs of the disease, which appeared on days 14-16 postinoculation. in all aspects, this animal model would be very useful for testing medical countermeasures against hantaviruses. the family arenaviridae is composed of two serogroups: the old world arenaviruses, including lassa fever virus and lymphocytic choriomeningitis virus, and the new world viruses, including pichinde virus and junin virus. all of these viruses share common clinical manifestations (mccormick and fisher-hoch, 2002). lassa fever virus is endemic in parts of west africa, and outbreaks are typically seen in the dry season between january and april (curtis, 2006). this virus is responsible for 100,000-500,000 infections per year, leading to approximately 5000 deaths (khan et al., 2008). outbreaks have been reported in guinea, sierra leone, liberia, nigeria, and the central african republic. 
however, cases have appeared in germany, the netherlands, the united kingdom, and the united states due to transmission to travelers on commercial airlines (amorosa et al., 2010). transmission of this virus typically occurs via rodents, in particular the multimammate rat, mastomys species complex (curtis, 2006). humans become infected by inhaling aerosolized virus or eating contaminated food. human-to-human transmission by direct contact with infected secretions or needle-stick injuries has also been noted. the majority of infections are asymptomatic; however, severe disease occurs in 20% of individuals. the incubation period is from 5 to 21 days, and initial onset is characterized by flu-like illness. this is followed by diarrheal disease that can progress to hemorrhagic symptoms and neurological involvement including encephalopathy, encephalitis, and meningitis. a third of patients develop deafness in the early phase of disease, which is permanent in a third of those affected. the overall fatality rate is about 1%; however, among those admitted to the hospital it is between 15% and 25%. there is no approved vaccine, and besides supportive measures, ribavirin is effective only if started within 7 days (mccormick et al., 1986a,b). the primary animal model used to study lassa fever is the rhesus macaque (jahrling et al., 1980). aerosolized infection with lymphocytic choriomeningitis virus has been a useful surrogate model for lassa fever: both rhesus and cynomolgus monkeys exposed to the virus developed disease, but rhesus macaques more closely mirrored the disease course and histopathology observed in human infection (danes et al., 1963). iv or intragastric inoculation of the virus led to severe dehydration, erythematous skin, submucosal edema, necrotic foci in the buccal cavity, and respiratory distress. the liver was severely affected by the virus, as reflected by elevated liver enzymes ast and alt (lukashevich et al., 2003). 
disease was dose dependent, with iv, intramuscular, and subq inoculation requiring the least amount of virus to induce disease. aerosol infection and ingestion of contaminated food could also be utilized and mimic a more natural route of infection (peters et al., 1987). within this model, the nhp becomes viremic after 4-6 days; clinical manifestations are present by day 7, and death typically occurs within 10-14 days (lukashevich et al., 2004; rodas et al., 2004). intramuscular injection of lassa virus into cynomolgus monkeys also produced a neurological disease due to lesions within the cns (hensley et al., 2011b). this pathogenicity is seen in select cases of human lassa fever (cummins et al., 1992; gunther et al., 2001). a marmoset model has recently been defined utilizing subq injection of lassa fever virus. virus was initially detected by day 8, and viremia was achieved by day 14. liver enzymes were elevated, and an enlarged liver was noted upon autopsy. there was a gradual reduction in platelets, and interstitial pneumonitis was diagnosed in a minority of animals. the physiological signs were the same as those seen in fatal human cases (carrion et al., 2007). mice develop a fatal neurological disorder upon intracerebral inoculation with lassa virus, although the outcome of infection is dependent on the mhc background, age of the animal, and inoculation route (salvato et al., 2005). stat1 knockout mice inoculated ip with both lethal and nonlethal lassa virus strains develop hearing loss accompanied by damage to the inner ear hair cells and auditory nerve (yun et al., 2015). the inbred guinea pig strain 13 is highly susceptible to lassa virus infection. the outbred hartley strain is less susceptible, and thus strain 13 has been the preferred model given its assured lethality. the clinical manifestations mirror those seen in humans and rhesus macaques (jahrling et al., 1982). infection with pichinde virus passaged in guinea pigs has also been used. 
disease signs include fever, weight loss, vascular collapse, and eventual death (lucia et al., 1990; qian et al., 1994). the guinea pig is an excellent model given that infection results in a disease pattern, viral distribution, histopathology, and immune response similar to those in humans (connolly et al., 1993; katz and starr, 1990). infection of hamsters with a cotton rat isolate of pirital virus is similar to what is characterized in humans and in the nhp and guinea pig models. the virus was injected ip, resulting in lethargy and anorexia within 6-7 days. virus was first detected at 3 days and reached maximum titers within 5 days. neurological symptoms began to appear at the same time, and all animals died by day 9. pneumonitis, pulmonary hemorrhage, and edema were also present (sbrana et al., 2006). these results were recapitulated with a nonadapted pichinde virus (buchmeier and rawls, 1977; gowen et al., 2005; smee et al., 1993). the lentiviruses are a subfamily of retroviridae, which includes human immunodeficiency virus (hiv), a virus that infects 0.6% of the world's population. a greater proportion of infections and deaths occur in sub-saharan africa. worldwide, there are approximately 1.8 million deaths per year, with over 260,000 being children. transmission of hiv occurs by exposure to infectious body fluids. there are two species, hiv-1 and hiv-2, with hiv-2 having lower infectivity and virulence (confined mostly to west africa); the vast majority of cases worldwide are hiv-1 (de cock et al., 2011). hiv targets t-helper cells (cd4+), macrophages, and dendritic cells (fields et al., 2007). acute infection occurs 2-4 weeks after exposure, with flu-like symptoms and viremia, followed by chronic infection. symptoms in the acute phase may include fever, body aches, nausea, vomiting, headache, lymphadenopathy, pharyngitis, rash, and sores in the mouth or esophagus. 
cd8+ t-cells are activated and kill hiv-infected cells, while b-cells are responsible for antibody production and seroconversion. acquired immune deficiency syndrome (aids) develops when cd4+ t-cells decline to fewer than 200 cells/µl; cell-mediated immunity then becomes impaired, and the person is more susceptible to opportunistic infections as well as certain cancers. hiv has a narrow host range, likely because the virus is unable to antagonize and evade effector molecules of the interferon response in other species (thippeshappa et al., 2012). humanized mice, created by engrafting human cells and tissues into scid mice, have been critical for the development of mouse models for the study of hiv infection. a number of different humanized mouse models allow for the study of hiv infection in the context of intact and functional human innate and adaptive immune responses (berges and rowan, 2011). the scid-hu hiv infection model has proven useful, particularly in screening antivirals and therapeutics (denton et al., 2008; melkus et al., 2006). a number of different humanized mouse models have been developed for the study of hiv, including rag1−/−γc−/−, rag2−/−γc−/−, nod/scid γc−/− (hnog), nod/scid γc−/− (hnsg), nod/scid blt, and nod/scid γc−/− (hnsg) blt mice (karpel et al., 2015; li et al., 2015; shimizu et al., 2015). cd34+ human stem cells derived from umbilical cord blood or fetal liver are used for humanization (baenziger et al., 2006; watanabe et al., 2007). hiv-1 infection by ip injection can be successful with as little as 5% peripheral blood engraftment (berges et al., 2006). vaginal and rectal transmission models have been developed in blt scid-hu mice, in which mice harbor human bone marrow, liver, and thymus tissue. hiv-1 viremia occurs within approximately 7 days postinoculation. in many of these models, the spleen, lymph nodes, and thymus tissues are highly positive for virus, similar to humans (brainard et al., 2009). 
importantly, depletion of human t-cells can be observed in the blood and lymphoid tissues of hiv-infected humanized mice, and at least some mechanisms of pathogenesis that occur in hiv-infected humans also occur in the hiv-infected humanized mouse models (baenziger et al., 2006; neff et al., 2011). the advantage of these models is that the mice are susceptible to hiv infection, and thus the impact of drugs on the intended viral targets can be tested. one caveat is that while mice have a "common mucosal immune system," humans do not, due to differences in the distribution of addressins (holmgren and czerkinsky, 2005); thus, murine mucosal immune responses to hiv do not reflect those of humans. another strategy uses a human cd4- and human ccr5-expressing transgenic luciferase reporter mouse to study hiv-1 pseudovirus entry (gruell et al., 2013). hiv-1 transgenic (tg) rats are also used to study hiv-related pathology, immunopathogenesis, and neuropathology (lentz et al., 2014; reid et al., 2001). the clinical signs include skin lesions, wasting, respiratory difficulty, and neurological signs. decreases in brain volume have been documented, and the hiv-1 tg rat is thus used as a model of neuropathology in particular. there are a number of important nhp models for human hiv infection (hessell and haigwood, 2015). an adapted hiv-1 was obtained by four passages in pigtailed macaques transiently depleted of cd8(+) cells during acute infection (hatziioannou et al., 2014). the resulting disease has several similarities to aids in humans, such as depletion of cd4(+) t-cells (kimata, 2014). simian immunodeficiency virus (siv) infection of macaques has been widely used as a platform for modeling hiv infection of humans (demberg and robert-guroff, 2015; walker et al., 2015). importantly, nhps have pharmacokinetics, metabolism, mucosal t-cell homing receptors, and vascular addressins similar to those of humans. 
thus, while the correlates of protection against hiv are still not completely known, immune responses to hiv infection and vaccination are likely comparable. these models mimic infection through use of contaminated needles (iv), sexual transmission (vaginal or rectal), and maternal transmission in utero or through breast milk (keele et al., 2009; miller et al., 2005; stone et al., 2009). there are also macaque models to study the emergence and clinical implications of hiv drug resistance (van rompay et al., 2002). these models most routinely utilize rhesus macaques (macaca mulatta), cynomolgus macaques (macaca fascicularis), and pigtailed macaques (macaca nemestrina). all ages are used, depending on the needs of the study. for instance, use of newborn macaques may be more practical for evaluating the effect of prolonged drug therapy on disease progression; however, adult nhps are more frequently employed. female pigtailed macaques have been used to investigate the effect of the menstrual cycle on hiv susceptibility (vishwanathan et al., 2015). studies are performed in bsl-2 animal laboratories, and nhps must be simian type-d retrovirus free and siv seronegative. siv infection of pigtailed macaques is a useful model for hiv peripheral nervous system pathology, wherein an axotomy is performed and regeneration of axons is studied (ebenezer et al., 2012). exposure in model systems is typically through a single high-dose challenge. iv infection of rhesus macaques with 100 tcid50 of the highly pathogenic siv/deltab670 induces aids in most macaques within 5-17 months (mean of 11 months) (fuller et al., 2012). peak viremia occurs around week 4. aids in such models is often defined as cd4+ t-cell counts that have dropped to less than 50% of baseline values. alternatively, repeated low-dose challenges are often utilized, depending on the requirements of the model (henning et al., 2014; moldt et al., 2012; reynolds et al., 2012). 
since nhps infected with hiv do not develop an infection with a clinical disease course similar to that in humans, siv or siv/hiv-1 laboratory-engineered chimeric viruses (shivs) are used as surrogates. nhps infected with pathogenic siv may develop clinical disease that progresses to aids, and are thus useful pathogenesis models. a disadvantage is that siv is not identical to hiv-1 and is more closely related to hiv-2. however, the polymerase region of siv is 60% homologous to that of hiv-1, and it is susceptible to many reverse transcriptase (rt) and protease inhibitors. siv is generally not susceptible to nonnucleoside inhibitors, so hiv-1 rt is usually put into siv for such studies (uberla et al., 1995). sivmac239 is similar to hiv in the polymerase region and is therefore susceptible to nucleoside, rt, or integrase inhibition (witvrouw et al., 2004). nhps infected with sivmac239 have an asymptomatic period and disease progression resembling aids in humans, characterized by weight loss/wasting and cd4+ t-cell depletion. additionally, sivmac239 utilizes the ccr5 chemokine receptor as a coreceptor, similar to hiv, which is important for drugs that target entry (veazey et al., 2003). nhps infected with shiv strains may not develop aids, but these models are useful in testing vaccine efficacy. for example, rt-shivs and env-shivs are useful for testing and evaluation of drugs that target the rt or the envelope, respectively (uberla et al., 1995). one disadvantage of the highly virulent env-shiv (shiv-89.6p) is that it uses the cxcr4 coreceptor; of note, env-shivs that use the ccr5 coreceptor are less virulent, and viremia develops then resolves without further disease progression (humbert et al., 2008). simian-tropic (st) hiv-1 contains the vif gene from siv. infection of pigtailed macaques with this virus results in viremia, which can be detected for 3 months, followed by clearance (haigwood, 2009). 
a number of routes are utilized for siv or shiv infection of nhps, with iv inoculation the most common. mucosal routes include vaginal, rectal, and intracolonic, and they require a higher one-time dose than the iv route for infection. for the vaginal route, female macaques are treated with depo-provera (a progestin) 1 month before infection to synchronize the menstrual cycle, thin the epithelial lining of the vagina, and increase susceptibility to infection by atraumatic vaginal instillation (burton et al., 2011). upon vaginal instillation of 500 tcid50 of shiv-162p3, peak viremia was seen around 12 days postexposure at greater than 10^7 copies/ml, dropping thereafter to a constant level of 10^4 rna copies/ml at 60 days and beyond. in another example, in an investigation of the effect of vaccine plus vaginal microbicide on preventing infection, rhesus macaques were vaginally infected with a high dose of sivmac251 (barouch et al., 2012). an example of an intrarectal model utilized juvenile (2-year-old) pigtailed macaques challenged intrarectally with 10^4 tcid50 of sivmne027 to study the pathogenesis related to the virulence factor vpx (belshan et al., 2012). here, viremia peaked at approximately 10 days at more than 10^8 copies/ml. viral rna was expressed in the cells of the mesenteric lymph nodes. the male genital tract is seen as a viral sanctuary, with persistent high levels of hiv shedding even with antiretroviral therapy. to better understand the effect of haart therapy on virus and t-cells in the male genital tract, adult (3- to 4-year-old) male cynomolgus macaques were intravenously inoculated with 50 aid50 of sivmac251, and the male genital tract tissues were tested after euthanasia by pcr, ihc, and in situ hybridization (moreau et al., 2012). 
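the viremia kinetics reported above for vaginal shiv-162p3 challenge (a peak above 10^7 copies/ml around day 12, settling to roughly 10^4 copies/ml by day 60) imply an average rate of decline that can be worked out directly. the following back-of-the-envelope sketch is illustrative only, not an analysis from the cited studies, and the day/titer values are the rounded figures quoted in the text:

```python
# illustrative sketch (not from the cited studies): estimate the average
# log10 decline per day of plasma viral rna between the reported peak
# (~10^7 copies/ml around day 12) and the set point (~10^4 copies/ml by day 60)
import math

peak_copies, peak_day = 1e7, 12
setpoint_copies, setpoint_day = 1e4, 60

log10_drop = math.log10(peak_copies) - math.log10(setpoint_copies)  # 3 log10 units
days = setpoint_day - peak_day                                      # 48 days
decline_per_day = log10_drop / days                                 # ~0.0625 log10/day

print(f"{log10_drop:.0f} log10 drop over {days} days "
      f"= {decline_per_day:.3f} log10/day")
```

on these rounded numbers, the 3-log10 fall over roughly 48 days corresponds to a decline of about 0.06 log10 copies/ml per day before the set point is reached.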
pediatric models have been developed in infant rhesus macaques through infection with siv, allowing for the study of the impact of developmental and immunological differences on the disease course (abel, 2009). importantly, mother-to-infant transmission models have also been developed (jayaraman et al., 2004). pregnant female pigtailed macaques were infected during the second trimester with 100 mid50 of shiv-sf162p3 by the iv route. four of nine infants were infected, one in utero and three either intrapartum or immediately postpartum through nursing. this model is useful for the study of factors involved in transmission as well as the underlying immunology. nhps infected with siv or shiv are routinely evaluated for weight loss, activity level, stool consistency, appetite, virus levels in blood, and t-cell populations. cytokine and chemokine levels, antibody responses, and cytotoxic t-lymphocyte responses may also be evaluated. the ultimate goal of an hiv vaccine is sterilizing immunity (preventing infection); however, a more realistic result may be to reduce the severity of infection and permanently prevent progression. strategies have included live attenuated, nonreplicating, and subunit vaccines. these have variable efficacy in nhps due to the genetics of the host (mhc and trim alleles), differences between challenge strains, and challenge routes (letvin et al., 2011). nhp models have led to the development of antiviral treatments that are effective at reducing viral load and indeed transmission of hiv among humans. one preferred variation on the models for testing the long-term clinical consequences of antiviral treatment is to use newborn macaques and treat from birth onward, in some cases for more than a decade (van rompay et al., 2008). unfortunately, however, successes in nhp studies do not always translate to success in humans, as seen with the recent step study, which used an adenovirus-based vaccine approach (buchbinder et al., 2008). 
vaccinated humans were not protected and may have even been more susceptible to hiv; viremia was not reduced, and the infections were not attenuated as hoped. with regard to challenge route, iv exposure is more difficult to protect against than mucosal exposure and is used as a "worst case scenario"; however, efficacy at one mucosal route is usually comparable to that at other mucosal routes. human and animal papillomaviruses cause benign epithelial proliferations (warts) and malignant tumors of the various tissues that they infect (bosch and de sanjose, 2002). there are over 100 human papillomaviruses, with different strains causing warts on the skin, oropharynx, nasopharynx, larynx, and anogenital tissues. approximately one third of papillomaviruses are transmitted sexually. of these, virulent subtypes such as hpv-16, hpv-18, hpv-31, hpv-33, and hpv-45 place individuals at high risk for cervical and other cancers. up to 35% of head and neck cancers are caused by hpv-16, particularly oropharyngeal cancers. major challenges in the study of these viruses are that papillomaviruses generally do not infect any species outside of their natural hosts and can cause a very large spectrum of severity. thus, no wild-type animal models have been identified that are susceptible to hpv. however, a number of useful surrogate models exist that use animal papillomaviruses in their natural host or a very closely related species (borzacchiello et al., 2009; brandsma, 1994; campo, 2002). these models have facilitated the recent development of useful and highly effective prophylactic hpv vaccines (rabenau et al., 2005). wild-type inbred mice cannot be used to study disease caused by papillomaviruses unless they are engrafted with relevant tissue, orthotopically transplanted, or transgenic, but they are often used to examine the immunogenicity of vaccines (jagu et al., 2011; oosterhuis et al., 2011). 
transgenic mice used for hpv animal modeling typically express the viral oncogenes e5, e6, e7, or the entire early region of hpv-16 from the keratin 14 promoter, which is only active in the basal cells of the mouse epithelium (chow, 2015). cancers in these models develop upon extended estrogen exposure (maufort et al., 2010; ocadiz-delgado et al., 2009; stelzer et al., 2010; thomas et al., 2011). transgenic mice with constitutively active wnt/β-catenin signaling in cervical epithelial cells expressing the hpv oncoprotein e7 develop invasive cervical squamous carcinomas (bulut and uren, 2015); the tumors occur within 6 months approximately 94% of the time. another model uses c57bl/6 mice expressing the hpv16-e7 transgene, which are then treated topically with 7,12-dimethylbenz(a)anthracene (dmba) (de azambuja et al., 2014); these mice developed benign and malignant cutaneous lesions. cervical cancers can also be induced in human cervical cancer xenografts transplanted onto the flanks of athymic mice and serially transplanted thereafter (hiroshima et al., 2015; siolas and hannon, 2013). a wild-type immunocompetent rodent model uses mastomys coucha, which is naturally infected with mastomys natalensis papillomavirus (mnpv) (vinzon et al., 2014). mnpv induces papillomas, keratoacanthomas, and squamous cell carcinomas, and provides a means to study vaccination in an immunocompetent small animal model. wild cottontail rabbits (sylvilagus floridanus) are the natural host for cottontail rabbit papillomavirus (crpv), but this virus also infects domestic rabbits (oryctolagus cuniculus), a very closely related species (breitburd et al., 1997). in this model, outcomes range from cutaneous squamous cell carcinomas at one end of the spectrum to spontaneous regression at the other. lesions resulting from crpv in domestic rabbits do not typically contain infectious virus. 
canine oral papillomavirus (copv) causes florid warty lesions in the mucosa of the oral cavity within 4-8 weeks postexposure in experimental settings (johnston et al., 2005). the mucosotropic nature of these viruses and the resulting oropharyngeal papillomas, which are morphologically similar to human vaginal papillomas caused by hpv-6 and hpv-11, make this a useful model (nicholls et al., 1999). these lesions typically regress spontaneously 4-8 weeks after appearing; this model is therefore useful in understanding the interplay between host immune defense and viral pathogenesis. male and female beagles, aged 10 weeks to 2 years, with no history of copv, are typically used for these studies. infection is achieved by application of a 10 µl droplet of virus extract to multiple 0.5 cm^2 scarified areas within the mucosa of the upper lip of anesthetized beagles (nicholls et al., 2001). some investigators have raised concerns that dogs are not a suitable model for high-risk hpv-induced oral cancer (staff, 2015). bovine papillomavirus (bpv) has a wider host range than most papillomaviruses, infecting the fibroblasts of numerous ungulates (campo, 2002). bpv-4 infection of cattle feeding on bracken fern, which is carcinogenic, can result in lesions of the oral and esophageal mucosa that lack detectable viral dna. bpv infections in cattle can result in a range of diseases, such as skin warts, cancer of the upper gastrointestinal tract and urinary bladder, and papillomatosis of the penis, teats, and udder. finally, rhesus papillomavirus (rhpv), a sexually transmitted papillomavirus of rhesus and cynomolgus macaques, is very similar to hpv-16 and is associated with the development of cervical cancer (ostrow et al., 1990; wood et al., 2007). monkeypox virus (mpxv) causes disease in both animals and humans. human monkeypox, which is clinically almost identical to ordinary smallpox, occurs mostly in the rainforests of central and western africa. 
the virus is maintained in nature in rodent reservoirs, including squirrels (charatan, 2003; khodakevich et al., 1986). mpxv was discovered during a pox-like disease outbreak among laboratory monkeys (mostly cynomolgus and rhesus macaques) in denmark in 1958; no human cases were observed during this outbreak. the first human case was not recognized as a distinct disease until 1970 in zaire (the present drc), with continued occurrence of a smallpox-like illness despite eradication efforts of smallpox in this area. during the global eradication campaign, extensive vaccination in central africa decreased the incidence of human monkeypox, but the absence of immunity in the generation born since that time and increased dependence on bush meat have resulted in renewed emergence of the disease. in the summer of 2003, a well-known outbreak in the midwest was the first occurrence of monkeypox disease in the united states and the western hemisphere. of 72 reported cases, 37 human cases were laboratory confirmed (nalca et al., 2005; sejvar et al., 2004). it was determined that native prairie dogs (cynomys sp.) housed with rodents imported from ghana in west africa were the primary source of the outbreak. the virus is mainly transmitted to humans while handling infected animals or by direct contact with an infected animal's body fluids or lesions. person-to-person spread occurs by large respiratory droplets or direct contact (ježek and fenner, 1988). most of the clinical features of human monkeypox are very similar to those of ordinary smallpox (breman and arita, 1980). after a 7- to 21-day incubation period, the disease begins with fever, malaise, headache, sore throat, and cough. the main sign that distinguishes monkeypox from smallpox is swollen lymph nodes (lymphadenitis), which is observed in most patients before the development of the rash (di giulio and eckburg, 2004; ježek and fenner, 1988). 
a typical maculopapular rash follows the prodromal period, which generally lasts 1-3 days. skin lesions average 0.5-1 cm in size, and their progression follows the order macules, papules, vesicles, pustules, umbilication, then scabbing and desquamation, typically lasting 2-4 weeks. the fatality rate is 10% among the unvaccinated population, and death generally occurs during the second week of the disease (ježek and fenner, 1988; nalca et al., 2005). mpxv is highly pathogenic for a variety of laboratory animals, and many animal models have been developed using different species and different routes of exposure (table 33.3). because variola virus (smallpox) is unavailable for animal model development and mpxv causes similar disease manifestations in humans, mpxv is one of the poxviruses utilized most heavily to develop small animal models via different routes of exposure. wild-derived inbred mice, stat1-deficient c57bl/6 mice, icr mice, prairie dogs, african dormice, ground squirrels, and gambian pouched rats are highly susceptible to mpxv by different exposure routes (americo et al., 2010; falendysz et al., 2015; hutson et al., 2009; osorio et al., 2009; sbrana et al., 2007; schultz et al., 2009; sergeev et al., 2016; stabenow et al., 2010; tesh et al., 2004; xiao et al., 2005). cast/eij mice, one of 38 inbred mouse strains tested for susceptibility to mpxv, showed weight loss and dose-dependent mortality after intranasal (in) exposure to mpxv. studies with the ip route of challenge indicated a 50-fold higher susceptibility to mpxv compared to the in route (americo et al., 2010). scid-balb/c mice were also susceptible to the ip challenge route, and the disease resulted in mortality on day 9 postinfection (osorio et al., 2009). similarly, c57bl/6 stat1−/− mice infected by the in route showed weight loss and mortality 10 days postexposure. recently sergeev et al. 
(2016) showed that in challenge of icr mice with mpxv resulted in purulent conjunctivitis, blepharitis, and ruffled fur, although there were no deaths. the mouse models mentioned here are very promising for screening therapeutics against poxviruses, but testing in additional models will be required for advanced development. high doses of mpxv by the ip or in routes caused 100% mortality in ground squirrels at 6 and 8 days postexposure, respectively (tesh et al., 2004). the disease progressed very quickly, and most of the animals were lethargic and moribund by day 5 postexposure without any pox lesions or respiratory changes. a comparison of a usa mpxv strain and a central african mpxv strain in ground squirrels by the subq route resulted in systemic disease and mortality in 6-11 days postexposure. the disease resembled hemorrhagic smallpox, with nosebleeds, impaired coagulation parameters, and hemorrhage in the lungs of the animals. another study by sergeev et al. (2017) showed that in challenge with mpxv caused fever, lymphadenitis, and skin rash in ground squirrels 7-9 days postexposure; mortality was observed in 40% of the animals 13-22 days postexposure (sergeev et al., 2017). since mpxv was transmitted by infected prairie dogs in the us outbreak, this animal model has been more thoroughly studied and utilized to test therapeutics and vaccines than other small animal models (hutson et al., 2009; keckler et al., 2011; smith et al., 2011; xiao et al., 2005). studies using in, ip, and id routes of exposure showed that mpxv was highly infectious to prairie dogs; ip infection with the west african mpxv strain caused more severe disease and higher mortality (100%) than challenge by the in route. anorexia and lethargy were common signs of the disease for both exposure routes. 
in contrast to the ip route, the in route of exposure caused severe pulmonary edema and necrosis of the lungs in prairie dogs, while splenic necrosis and hepatic lesions were observed in ip-infected animals (xiao et al., 2005). hutson et al. (2009) compared west african and congo basin strains and showed that both strains and routes caused smallpox-like disease with longer incubation periods and, most importantly, generalized pox lesions; therefore, this model has utility for testing therapeutics and vaccines against poxviruses. furthermore, mpxv-challenged prairie dogs were used to perform in vivo bioluminescent imaging (bli) studies (falendysz et al., 2015). bli studies showed real-time spread of virus in prairie dogs as well as potential routes of shedding and transmission. the african dormouse is susceptible to mpxv by footpad injection or in routes (schultz et al., 2009). mice had decreased activity, hunched posture, dehydration, conjunctivitis, and weight loss. viral doses of 200 and 2000 pfu produced 100% mortality with a mean time to death of 8 days. upper gastrointestinal hemorrhage, hepatomegaly, lymphadenopathy, and lung hemorrhage were observed during necropsy. with hemorrhage in several organs, this model resembles hemorrhagic smallpox. in a recent study, disease pathogenesis was compared using live bioluminescence imaging in cast/eij mice and african dormice challenged with a low dose of mpxv (earl et al., 2015). following in challenge, mpxv dissemination occurred through the blood or lymphatic system in dormice, compared to dissemination through the nasal cavity and lungs in cast/eij mice. the disease course was much faster in cast/eij mice (earl et al., 2015). 
considering the limited availability of prairie dogs, ground squirrels, and african dormice, the lack of reagents specific for these species, and the absence of commercial sources of these species, these small animal models are less attractive for further characterization and for vaccine and countermeasure testing studies. nhps were exposed to mpxv by several different routes to develop animal models of mpxv infection (edghill-smith et al., 2005; johnson et al., 2011; nalca et al., 2010; stittelaar et al., 2006; zaucha et al., 2001). during our studies using an aerosol route of exposure, we observed that macaques had mild anorexia, depression, fever, and lymphadenopathy on day 6 postexposure (nalca et al., 2010). complete blood counts and clinical chemistries showed abnormalities similar to human monkeypox cases, with leukocytosis and thrombocytopenia (huhn et al., 2005). viral loads in whole blood and throat swabs peaked around day 10 and, in survivors, gradually decreased until day 28 postexposure. since doses of 4 × 10^4 pfu, 1 × 10^5 pfu, or 1 × 10^6 pfu resulted in lethality for 70% of the animals, whereas a dose of 4 × 10^5 pfu resulted in 85% lethality, survival was not dose dependent. the main pitfall of this model was the lack of pox lesions: with the high dose, animals succumbed to disease before developing pox lesions, while with the low challenge dose, pox lesions were observed but were few in comparison to the iv model. a recent study also evaluated cytokine levels in aerosol-challenged animals: tree et al. (2015) showed that ifnγ, il-1rα, and il-6 increased dramatically on day 8 postexposure, the day on which death was most likely to occur, and viral dna was detected in most of the tissues. these results support the idea of a cytokine storm causing mortality in monkeypox disease. mpxv causes dose-dependent disease in nhps when given by the iv route (johnson et al., 2011). 
studies showed that a 1 × 10^7 pfu iv challenge results in systemic disease with fever, lymphadenopathy, maculopapular rash, and mortality. an it infection model skips the upper respiratory system and deposits virus into the trachea, delivering the virus directly to the airways without regard to particle size and the physiological deposition that occurs during the process of inhalation. fibrinonecrotic bronchopneumonia was described in animals that received 10^7 pfu of mpxv intratracheally (stittelaar et al., 2006). although a similar challenge dose of it mpxv infection resulted in viremia in nhps comparable to the aerosol route of infection, the timing of the first peak was delayed by 5 days in intratracheally exposed macaques compared to aerosol infection, and the amount of virus detected by qpcr was approximately 100-fold lower. this suggests that local replication is more prominent after aerosol delivery than after it delivery. an intrabronchial route of exposure resulted in pneumonia in nhps (johnson et al., 2011). delayed onset of clinical signs and viremia was observed during disease progression. in this model, as in the aerosol and it infection models, the number of pox lesions was much lower than in the iv infection model. a major downside of the iv, it, and intrabronchial models is that the initial infection of respiratory tissue and the incubation and prodromal phases are circumvented by direct inoculation of virus into the blood stream or the lung. this is an important limitation when these models are used to test vaccines and treatments whose efficacy may depend on protecting the respiratory mucosa and targeting the subsequent early stages of infection, which are not represented in these challenge models. although the aerosol model reflects the natural route of transmission for human varv infections and a secondary route for human mpxv infections, the lack of pox lesions is its main drawback. 
therefore, when this model is used to test medical countermeasures, the endpoints and the biomarkers used to initiate treatment should be chosen carefully. hepatitis b virus (hbv) is one of the most common infections worldwide, with over 400 million people chronically infected and 316,000 cases per year of liver cancer due to infection (lee, 1997). the virus can naturally infect both humans and chimpanzees (guha et al., 2004). hbv is transmitted parenterally or perinatally from infected mothers. it can also be transmitted by sexual contact, iv drug use, blood transfusion, and acupuncture (lai et al., 2003). the age at which one is infected dictates the risk of developing chronic disease (hyams, 1995). acute infection during adulthood is self-limiting and results in flu-like symptoms that can progress to hepatocellular involvement, as observed with the development of jaundice. the clinical symptoms of hbv infection last for a few weeks before resolving (ganem and prince, 2004). after this acute phase, lifetime immunity is achieved (wright and lau, 1993). of those infected, less than 5% will develop the chronic form of the disease. chronicity is the most serious outcome of the disease, as it can result in cirrhosis or liver cancer. hepatocellular carcinoma is 100 times more likely to develop in a chronically infected individual than in a noncarrier (beasley, 1988). the viral determinant for cellular transformation has yet to be determined, although studies involving the woodchuck hepatitis virus suggest that the x protein may be responsible (spandau and lee, 1988). many individuals are asymptomatic until complications emerge related to chronic hbv carriage. chimpanzees have a unique strain that circulates within the population (hu et al., 2000). it was found that 3%-6% of all wild-caught animals from africa are positive for hbv antigen (lander et al., 1972). 
natural and experimental challenge with the virus follows the same course as human disease; however, this is only an acute model of disease (prince, 1972). to date, chimpanzees are the only reliable means to ensure that plasma vaccines are free of infectious particles (prince and brotman, 2001). this animal model has been used to study new therapeutics and vaccines. chimpanzees are especially well suited for these studies given that their immune response to infection directly mirrors that of humans (nayersina et al., 1993). recent regulations by the national institutes of health (nih) and restrictions on the use of great apes as animal models have forced researchers to find alternative models for hbv infection. other nhps that have been evaluated are gibbons, orangutans, and rhesus monkeys. although these animals can be infected with hbv, none develops hepatic lesions or liver damage, as noted by monitoring of liver enzymes (pillot, 1990). mice are not permissive to infection, and thus numerous transgenic and humanized lines that express hbv proteins have been created to facilitate their use as animal models. these include both immunocompetent and immunosuppressed hosts. the caveat to all of these mouse lines is that they reproduce only the acute form of disease (guha et al., 2004). recently, the entire genome of hbv was transferred to an immunocompetent mouse line via adenovirus, providing a model for persistent infection (huang et al., 2012). another model that has been developed is hydrodynamic injection of hbv genomes into the liver of mice (liu et al., 1999; yang et al., 2002). although this model is very stressful to the mice and causes liver toxicity, it has been used successfully to evaluate antivirals against hbv (mccaffrey et al., 2003). liver chimeric mouse models are an additional set of surrogate models for hbv infection (dandri and lutgehetmann, 2014). in these models, human hepatocytes are integrated into the murine liver parenchyma (allweiss and dandri, 2016). 
this model might be used to test antivirals as well as to study the molecular biology of hbv infection. hbv can also be studied using surrogate viruses, the naturally occurring mammalian hepadnaviruses (mason et al., 1982). the woodchuck hepatitis virus induces hepatocellular carcinoma (summers et al., 1978). within a population, 65%-75% of all neonatal woodchucks are susceptible to chronic infection (cote et al., 2000). a major difference between the two hepatitis viruses is the rate at which they induce cancer: almost all chronic carriers among woodchucks developed hepatocellular carcinoma within 3 years of the initial infection, whereas human carcinogenesis takes much longer (gerin et al., 1989). the acute infection strongly resembles what occurs during the course of human disease: there is a self-limiting acute phase resulting in a transient viremia that has the potential for chronic carriage (tennant, 2001). challenge with virus in neonates leads to chronic infection, while adults develop only the acute phase of disease (buendia, 1992). a species closely related to the woodchuck is the himalayan marmot (marmota himalayana). this animal is also susceptible to the woodchuck hepadnavirus upon iv injection and develops an acute hepatitis with a productive infection (lucifora et al., 2010). hepatitis d virus (hdv) is dependent upon hbv to undergo replication and successful infection in its human host (gerin, 2001). there are two modes of infection possible between the viruses: coinfection, in which a person is simultaneously infected with both, or superinfection, in which a chronic carrier of hbv is subsequently infected with hdv (purcell et al., 1987). coinfection leads to a disease similar to that seen with hbv alone; however, superinfection can result in chronic hdv infection and severe liver damage (guilhot et al., 1994). both coinfection and superinfection can be demonstrated in the chimpanzee and the woodchuck by inoculation of human hepatitis d (ponzetto et al., 1991). 
a recently published report demonstrated the use of a humanized chimeric upa mouse to study interactions between the two viruses and for drug testing (lutgehetmann et al., 2012). new models, ranging from nhps to small animals and representing the disease characteristics seen in humans, are necessary to study the viral and host factors that drive disease pathogenesis and to evaluate medical countermeasures. the ideal animal model for a human viral disease should closely recapitulate the spectrum of clinical symptoms and pathogenesis observed during the course of human infection. whenever feasible, the model should use the same virus and strain that infects humans. it is also preferable that the virus be a low-passage clinical isolate; animal passage or adaptation should be avoided if susceptible model species can be identified. ideally, the experimental route of infection would mirror that which occurs in natural disease. in order to understand the interplay and contribution of the immune system during infection, an immunocompetent animal should be used. the aforementioned characteristics cannot always be satisfied, however, and often the virus must be adapted, knockout mice must be used, and/or the disease is not perfectly mimicked in the animal model. well-characterized animal models are critical for licensure under the fda "animal rule." this rule applies to situations in which vaccine and therapeutic efficacy cannot safely or ethically be tested in humans; thus licensure will come only after preclinical tests are performed in animal models. many fields in virology are moving toward standardized models that can be used across institutions to test vaccines and therapeutics. a current example of such an effort is within the filovirus community, where animal models, euthanasia criteria, assays, and virus strains are in the process of being standardized. the hope is that these efforts will enable the results of efficacy tests on medical countermeasures to be compared across institutions. 
this chapter has summarized the best models available for each of the viruses described.

references (titles as extracted; author and year details were lost in extraction):
- the rhesus macaque pediatric siv infection model - a valuable tool in understanding infant hiv-1 pathogenesis and for designing pediatric hiv-1 prevention strategies
- prevalence of anti-rift-valley-fever igm antibody in abattoir workers in the nile delta during the 1993 outbreak in egypt
- common marmosets (callithrix jacchus) as a nonhuman primate model to assess the virulence of eastern equine encephalitis virus strains
- replication and shedding of mers-cov in upper respiratory tract of inoculated dromedary camels. emerg.
- generation of a transgenic mouse model of middle east respiratory syndrome coronavirus infection and disease
- pathological changes in brain and other target organs of infant and weanling mice after infection with nonneuroadapted western equine encephalitis virus
- particle-to-pfu ratio of ebola virus influences disease course and survival in cynomolgus macaques
- progress toward norovirus vaccines: considerations for further development and implementation in potential target populations
- characterization of lethal zika virus infection in ag129 mice
- experimental in vitro and in vivo models for the study of human hepatitis b virus infection
- a model of meningococcal bacteremia after respiratory superinfection in influenza a virus-infected mice
- middle east respiratory syndrome coronavirus: current situation and travel-associated concerns
- aerosol exposure to the angola strain of marburg virus causes lethal viral hemorrhagic fever in cynomolgus macaques
- necrotizing scleritis, conjunctivitis, and other pathologic findings in the left eye and brain of an ebola virus-infected rhesus macaque (macaca mulatta) with apparent recovery and a delayed time of death
- american academy of pediatrics subcommittee on diagnosis and management of bronchiolitis
- identification of wild-derived inbred mouse strains highly susceptible to monkeypox virus infection for use as small animal models
- the gerbil, meriones unguiculatus, a model for rift valley fever viral encephalitis
- morbidity and mortality among patients with respiratory syncytial virus infection: a 2-year retrospective review
- chikungunya and the nervous system: what we do and do not know
- the west nile virus outbreak of 1999 in new york: the flushing hospital experience
- hospital outbreak of middle east respiratory syndrome coronavirus
- diagnosis of noncultivatable gastroenteritis viruses, the human caliciviruses
- norovirus vaccine against experimental human norwalk virus illness
- determination of the 50% human infectious dose for norwalk virus
- an epizootic attributable to western equine encephalitis virus infection in emus in texas
- evidence for camel-to-human transmission of mers coronavirus
- integrated molecular signature of disease: analysis of influenza virus-infected macaques through functional genomics and proteomics
- disseminated and sustained hiv infection in cd34+ cord blood cell-transplanted rag2−/− gamma c−/− mice
- choice of inbred rat strain impacts lethality and disease course after respiratory infection with rift valley fever virus
- rift valley fever: an uninvited zoonosis in the arabian peninsula
- recombinant norwalk virus-like particles given orally to volunteers: phase i study
- tropism of dengue virus in mice and humans defined by viral nonstructural protein 3-specific immunostaining
- lethal antibody enhancement of dengue disease in mice is prevented by fc modification
- animal models for the study of influenza pathogenesis and therapy
- effect of oral gavage treatment with znal42 and other metallo-ion formulations on influenza a h5n1 and h1n1 virus infections in mice
- macaque studies of vaccine and microbicide combinations for preventing hiv-1 sexual transmission
- early and sustained innate immune response defines pathology and death in nonhuman primates infected by highly pathogenic influenza virus
- hepatitis b virus: the major etiology of hepatocellular carcinoma
- transmission of norwalk virus during football game
- vpx is critical for sivmne infection of pigtail macaques
- experimental respiratory syncytial virus infection of four species of primates
- pathogenesis and immune response of crimean-congo hemorrhagic fever virus in a stat-1 knockout mouse model
- crimean-congo hemorrhagic fever virus infection is lethal for adult type i interferon receptor-knockout mice
- the utility of the new generation of humanized mice to study hiv-1 infection: transmission, prevention, pathogenesis, and treatment
- hiv-1 infection and cd4 t cell depletion in the humanized rag2−/− gamma c−/− (rag-hu) mouse model
- bacterial infections in pigs experimentally infected with nipah virus
- evaluation of a mouse model for the west nile virus group for the purpose of determining viral pathotypes
- severe acute respiratory syndrome coronavirus spike protein expressed by attenuated vaccinia virus protectively immunizes mice
- study of susceptibility to crimean hemorrhagic fever (chf) virus in european and long-eared hedgehogs. tezisy konf
- manipulation of host factors optimizes the pathogenesis of western equine encephalitis virus infections in mice for antiviral drug development
- genetic basis of attenuation of dengue virus type 4 small plaque mutants with restricted replication in suckling mice and in scid mice transplanted with human liver cells
- chimpanzees as an animal model for human norovirus infection and vaccine development
- a simple technique for infection of mosquitoes with viruses; transmission of zika virus
- human papillomavirus research: do we still need animal models?
- human papillomavirus in cervical cancer
- development of a hamster model for chikungunya virus infection and pathogenesis
- a neutralizing human monoclonal antibody protects against lethal disease in a new ferret model of acute nipah virus infection
- the cotton rat model of respiratory viral infections
- correlates of immunity to filovirus infection
- filovirus vaccines
- induction of robust cellular and humoral virus-specific adaptive immune responses in human immunodeficiency virus-infected humanized blt mice
- animal models of human-papillomavirus-associated oncogenesis
- interferon alpha/beta receptor-deficient mice as a model for ebola virus disease
- zika virus outbreak in rio de janeiro, brazil: clinical characterization, epidemiological and virological aspects
- co-infection of the cotton rat (sigmodon hispidus) with staphylococcus aureus and influenza a virus results in synergistic disease
- effectiveness of influenza vaccination
- the role of the type i interferon response in the resistance of mice to filovirus infection
- a mouse model for evaluation of prophylaxis and therapy of ebola hemorrhagic fever
- the rabbit viral skin papillomas and carcinomas: a model for the immunogenetics of hpv-associated carcinogenesis
- the confirmation and maintenance of smallpox eradication
- a lethal disease model for hantavirus pulmonary syndrome in immunosuppressed syrian hamsters infected with sin nombre virus
- nonhuman primate models of chikungunya virus infection and disease
- tissue tropism and neuroinvasion of west nile virus do not differ for two mouse strains with different survival rates
- pediatric norovirus diarrhea in nicaragua
- efficacy assessment of a cell-mediated immunity hiv-1 vaccine (the step study): a double-blind, randomised, placebo-controlled, test-of-concept trial
- variation between strains of hamsters in the lethality of pichinde virus infections
- hepatitis b viruses and hepatocellular carcinoma
- generation of k14-e7/n87betacat double transgenic mice as a model of cervical cancer
- limited or no protection by weakly or nonneutralizing antibodies against vaginal shiv challenge of macaques compared with a strongly neutralizing antibody
- a novel vaccine against crimean-congo haemorrhagic fever protects 100% of animals against lethal challenge in a mouse model
- interleukin-1beta but not tumor necrosis factor is involved in west nile virus-induced langerhans cell migration from the skin in c57bl/6 mice
- animal models of papillomavirus pathogenesis
- immunization of knock-out alpha/beta interferon receptor mice against high lethal dose of crimean-congo hemorrhagic fever virus with a cell culture based vaccine
- characterization of the localized immune response in the respiratory tract of ferrets following infection with influenza a and b viruses
- lassa virus infection in experimentally infected marmosets: liver pathology and immunophenotypic alterations in target tissues
- a small nonhuman primate model for filovirus-induced disease
- severe acute respiratory syndrome vaccine development: experiences of vaccination against avian infectious bronchitis coronavirus
- outbreak of acute illness - southwestern united states
- outbreak of west nile-like viral encephalitis - new york, 1999. mmwr morb. mortal.
- in vitro whole-virus binding of a norovirus genogroup ii genotype 4 strain to cells of the lamina propria and brunner's glands in the human duodenum
- animal models for studying dengue pathogenesis and therapy
- us doctors investigate more than 50 possible cases of monkeypox
- mechanism of neuroinvasion of venezuelan equine encephalitis virus in the mouse
- chikungunya outbreaks - the globalization of vectorborne diseases
- inactivated and live, attenuated influenza vaccines protect mice against influenza: streptococcus pyogenes super-infections
- pathogenesis of a genogroup ii human norovirus in gnotobiotic pigs
- the immunobiology of sars
- induction of tetravalent protective immunity against four dengue serotypes by the tandem domain iii of the envelope protein
- norovirus infection as a cause of diarrhea-associated benign infantile seizures
- comparative pathogenesis of epidemic and enzootic chikungunya viruses in a pregnant rhesus macaque model
- development of norwalk virus-specific monoclonal antibodies with therapeutic potential for the treatment of norwalk virus gastroenteritis
- viral shedding patterns of coronavirus in patients with probable severe acute respiratory syndrome
- a single sublingual dose of an adenovirus-based vaccine protects against lethal ebola challenge in mice and guinea pigs
- model systems to study the life cycle of human papillomaviruses and hpv-associated cancers
- primary severe acute respiratory syndrome coronavirus infection limits replication but not lung inflammation upon homologous rechallenge
- viral and host factors in human respiratory syncytial virus pathogenesis
- pathogenesis of pichinde virus infection in strain 13 guinea pigs: an immunocytochemical, virologic, and clinical chemistry study
- pathogenesis of experimental ebola virus infection in guinea pigs
- transcriptional profiling of the immune response to marburg virus infection
- the use of a neonatal mouse model to study respiratory syncytial virus infections
- a model of denv-3 infection that recapitulates severe disease and highlights the importance of ifn-gamma in host resistance to infection
- effects of age and viral determinants on chronicity as an outcome of experimental woodchuck hepatitis virus infection
- a mouse model for chikungunya: young age and inefficient type-i interferon signaling are risk factors for severe disease
- mosquito bite delivery of dengue virus enhances immunogenicity and pathogenesis in humanized mice
- comparison of the pathogenesis of the angola and ravn strains of marburg virus in the outbred guinea pig model
- the brazilian zika virus strain causes birth defects in experimental models
- age at first viral infection determines the pattern of t cell-mediated disease during reinfection in adulthood
- lassa fever encephalopathy: clinical and laboratory findings
- profound and prolonged lymphocytopenia with west nile encephalitis
- first complete genome sequence of zika virus (flaviviridae, flavivirus) from an autochthonous transmission in brazil
- viral haemorrhagic fevers caused by lassa, ebola, and marburg viruses
- the enhancement or prevention of airway hyperresponsiveness during reinfection with respiratory syncytial virus is critically dependent on the age at first infection and il-13 production
- mouse models of hepatitis b and delta virus infection
- [experimental inhalation infection of monkeys of the macacus cynomolgus and macacus rhesus species with the virus
- kinetic profile of influenza virus infection in three rat strains
- pathology of experimental ebola virus infection in african green monkeys. involvement of fibroblastic reticular cells
- validation of an hpv16-mediated carcinogenesis mouse model
- middle east respiratory syndrome coronavirus (mers-cov) causes transient lower respiratory tract infection in rhesus macaques
- selection of unadapted, pathogenic shivs encoding newly transmitted hiv-1 envelope proteins
- b-cells and the use of non-human primates for evaluation of hiv vaccine candidates
- antiretroviral pre-exposure prophylaxis prevents vaginal transmission of hiv-1 in humanized blt mice
- innate and adaptive immune responses determine protection against disseminated infection by west nile encephalitis virus
- rift valley fever virus encephalitis is associated with an ineffective systemic immune response and activated t cell infiltration into the cns in an immunocompetent mouse model
- evidence of sexual transmission of zika virus
- a susceptible mouse model for zika virus infection
- identification of a novel coronavirus in patients with severe acute respiratory syndrome
- subclinical infection without encephalitis in mice following intranasal exposure to nipah virus-malaysia and nipah virus-bangladesh
- nonhuman primate models of encephalitic alphavirus infection: historical review and future perspectives
- mortality due to influenza in the united states - an annualized regression approach using multiple-cause mortality data
- distinct pathogenesis of hong kong-origin h5n1 viruses in mice compared to that of other highly pathogenic h5 avian influenza viruses
- postexposure antibody prophylaxis protects nonhuman primates from filovirus disease
- comparative live bioluminescence imaging of monkeypox virus dissemination in a wild-derived inbred mouse (mus musculus castaneus) and outbred african dormouse (graphiurus kelleni)
- siv-induced impairment of neurovascular repair: a potential role for vegf
- smallpox vaccine does not protect macaques with aids from a lethal monkeypox virus challenge
- influenza-induced tachypnea is prevented in immune cotton rats, but cannot be treated with an anti-inflammatory steroid or a neuraminidase inhibitor
- distinct cellular immune responses following primary and secondary influenza virus challenge in cotton rats
- an outbreak of viral gastroenteritis following environmental contamination at a concert hall
- natural history of aerosol exposure with marburg virus in rhesus macaques
- experimental congo virus (ib-an7620) infection in primates
- further assessment of monkeypox virus infection in gambian pouched rats (cricetomys gambianus) using in vivo bioluminescent imaging
- respiratory syncytial virus infection in elderly and high-risk adults
- infection with mers-cov causes lethal pneumonia in the common marmoset
- immune response to marburg virus angola infection in nonhuman primates
- fields' virology
- the susceptibility of rats to rift valley fever in relation to age
- henipavirus susceptibility to environmental variables
- pause on avian flu transmission research
- aetiology: koch's postulates fulfilled for sars virus
- spinal cord neuropathology in human west nile virus infection
- therapeutic dna vaccine induces broad t cell responses in the gut and sustained protection from viral rebound and aids in siv-infected rhesus macaques
- hepatitis b virus infection - natural history and clinical consequences
- biological heterogeneity, including systemic replication in mice, of h5n1 influenza a virus isolates from humans in hong kong
- chikungunya virus arthritis in adult wild-type mice
- epidemiology and clinical presentations of the four human coronaviruses 229e, hku1, nl63, and oc43 detected over 3 years using a novel multiplex real-time pcr method
- development of an acute and highly pathogenic nonhuman primate model of nipah virus infection
- animal models of hepatitis delta virus infection and disease
- hepadnavirus-induced liver cancer in woodchucks
- experimental infection and natural contact exposure of dogs with avian influenza virus (h5n1). emerg.
- megaribavirin aerosol for the treatment of influenza a virus infections in mice
- discovery of novel human and animal cells infected by the severe acute respiratory syndrome coronavirus by replication-specific multiplex reverse transcription-pcr
- chinchilla and murine models of upper respiratory tract infections with respiratory syncytial virus
- mechanisms of host defense following severe acute respiratory syndrome coronavirus (sars-cov) pulmonary infection of mice
- studies on the virus of venezuelan equine encephalomyelitis. i. modification by cortisone of the response of the central nervous system of macaca mulatta
- serious morbidity and mortality associated with influenza epidemics
- a novel respiratory model of infection with monkeypox virus in cynomolgus macaques
- clinical features of nipah virus encephalitis among pig farmers in malaysia
- monoclonal antibody-mediated enhancement of dengue virus infection in vitro and in vivo and strategies for prevention
- animal models of highly pathogenic rna viral infections: hemorrhagic fever viruses
- interferon alfacon-1 protects hamsters from lethal pichinde virus infection
- primary respiratory syncytial virus infection in mice
- pneumonitis and multiorgan system disease in common marmosets (callithrix jacchus) infected with the severe acute respiratory syndrome-associated coronavirus
- clinical and laboratory features that differentiate dengue from other febrile illnesses in an endemic area - puerto rico
- acute and chronic airway disease after human respiratory syncytial virus infection in cotton rats (sigmodon hispidus)
- alphaviruses
- replication fitness determines high virulence of influenza a virus in mice carrying functional mx1 resistance gene
- antibody and antiretroviral preexposure prophylaxis prevent cervicovaginal hiv-1 infection in a transgenic mouse model
- characterization of influenza a/hongkong/156/97 (h5n1) virus in a mouse model and protective effect of zanamivir on h5n1 infection in mice
- epidemic dengue/dengue hemorrhagic fever as a public health, social and economic problem in the 21st century
- cell culture and animal models of viral hepatitis. part i: hepatitis b
- expression of the hepatitis delta virus large and small antigens in transgenic mice
- acute hendra virus infection: analysis of the pathogenesis and passive antibody protection in the hamster model
- lassa fever encephalopathy: lassa virus in cerebrospinal fluid but not in serum
- 1,5-iodonaphthyl azide-inactivated v3526 protects against aerosol challenge with virulent venezuelan equine encephalitis virus
- dengue: an update
- pegylated interferon-alpha protects type 1 pneumocytes against sars coronavirus infection in macaques
- asymptomatic middle east respiratory syndrome coronavirus infection in rabbits
- head-to-head comparison of four nonadjuvanted inactivated cell culture-derived influenza vaccines: effect of composition, spatial organization and immunization route on the immunogenicity in a murine challenge model
- update on animal models for hiv research
- norovirus disease in the united states. emerg.
- cardiopulmonary manifestations of hantavirus pulmonary syndrome
- serum neutralizing antibody titers of seropositive chimpanzees immunized with vaccines coformulated with natural fusion and attachment proteins of respiratory syncytial virus
- hendra virus infection in a veterinarian
- a phase 1 clinical trial of a dna vaccine for venezuelan equine encephalitis delivered by intramuscular or intradermal electroporation
- deaths from norovirus among the elderly, england and wales. emerg.
- aerosolized rift valley fever virus causes fatal encephalitis in african green monkeys and common marmosets
- hiv-1-induced aids in monkeys
- west nile fever
- short communication: viremic control is independent of repeated low-dose shivsf162p3 exposures
- pathogenesis of marburg hemorrhagic fever in cynomolgus macaques
- pathogenesis of lassa fever in cynomolgus macaques
- niemann-pick c1 is essential for ebolavirus replication and pathogenesis in vivo
- airborne transmission of influenza a/h5n1 virus between ferrets
- animal models in hiv-1 protection and therapy
- advances in rsv vaccine research and development - a global agenda
- transmission of zika virus through sexual contact with travelers to areas of ongoing transmission - continental united states
- establishment of a patient-derived orthotopic xenograft (pdox) model of her-2-positive cervical cancer expressing the clinical metastatic pattern
- resolution of primary severe acute respiratory syndrome-associated coronavirus infection requires stat1
- mucosal immunity and vaccines
- nipah virus outbreak with person-to-person transmission in a district of bangladesh
- eastern equine encephalitis virus in mice i: clinical course and outcome are dependent on route of exposure
- the lesions of experimental equine morbillivirus disease in cats and guinea pigs
- a lethal disease model for hantavirus pulmonary syndrome
- hantaan/andes virus dna vaccine elicits a broadly cross-reactive neutralizing antibody response in nonhuman primates
- persistent infection with and serologic cross-reactivity of three novel murine noroviruses
- molecular characterization of three novel murine noroviruses
- identification of hepatitis b virus indigenous to chimpanzees
- manifestation of thrombocytopenia in dengue-2-virus-infected mice
- transfer of hbv genomes using low doses of adenovirus vectors leads to persistent infection in immune competent mice
- west nile fever - a reemerging mosquito-borne viral disease in europe. emerg.
- live, attenuated influenza virus (laiv) vehicles are strong inducers of immunity toward influenza b virus
- clinical characteristics of human monkeypox, and risk factors for severe disease
- shiv-1157i and passaged progeny viruses encoding r5 hiv-1 clade c env cause aids in rhesus monkeys
- norwalk virus infection associates with secretor status genotyped from sera
- a prairie dog animal model of systemic orthopoxvirus disease using west african and congo basin strains of monkeypox virus
- risks of chronicity following acute hepatitis b virus infection: a review
- the pathogenesis of rift valley fever
- experimental adaptation of an influenza h5 ha confers respiratory droplet transmission to a reassortant h5 ha/h1n1 virus in ferrets
- pathogenicity of a highly pathogenic avian influenza virus, a/chicken/yamaguchi/7/04 (h5n1) in different species of birds and mammals
- respiratory syncytial virus induces pneumonia, cytokine response, airway obstruction, and chronic inflammatory infiltrates associated with long-term airway hyperresponsiveness in mice
- a multimeric l2 vaccine for prevention of animal papillomavirus infections
- lassa virus infection of rhesus monkeys: pathogenesis and treatment with ribavirin
- pathogenesis of lassa virus infection in guinea pigs
- perinatal transmission of shiv-sf162p3 in macaca nemestrina
- human monkeypox and other poxvirus infections of man
- cd209l (l-sign) is a receptor for severe acute respiratory syndrome coronavirus
- glycomic characterization of respiratory tract tissues of ferrets: implications for its use in influenza virus infection studies
- comparative analysis of monkeypox virus infection of cynomolgus macaques by the intravenous or intrabronchial inoculation route
- phenotypic changes in langerhans' cells after infection with arboviruses: a role in the immune response to epidermally acquired viral infection?
- protection of beagle dogs from mucosal challenge with canine oral papillomavirus by immunization with recombinant adenoviruses expressing codon-optimized early genes
- detailed analysis of the african green monkey model of nipah virus disease
- experimental inoculation of egyptian rousette bats (rousettus aegyptiacus) with viruses of the ebolavirus and marburgvirus genera
- treatment of venezuelan equine encephalitis virus infection with (-)-carbodine
- c3h/hen mouse model for the evaluation of antiviral agents for the treatment of venezuelan equine encephalitis virus infection
- blt humanized mice as a small animal model of hiv infection
- stat1-dependent innate immunity to a norwalk-like virus
- pichinde virus infection in strain 13 guinea pigs reduces intestinal protein reflection coefficient with compensation
- establishment of the black-tailed prairie dog (cynomys ludovicianus) as a novel animal model for comparing smallpox vaccines administered preexposure in both high- and low-dose monkeypox virus challenges
- low-dose rectal inoculation of rhesus macaques by sivsme660 or sivmac251 recapitulates human mucosal infection by hiv-1
- new opportunities for field research on the pathogenesis and treatment of lassa fever
- gastrointestinal norovirus infection associated with exacerbation of inflammatory bowel disease
- isolation of monkeypox virus from wild squirrel infected in nature
- in hot pursuit of the first vaccine against respiratory syncytial virus
- pathogenesis of hantaan virus infection in suckling mice: clinical, virologic, and serologic observations
- respiratory syncytial virus disease in infants despite prior administration of antigenic inactivated vaccine
- the severe pathogenicity of alveolar macrophage-depleted ferrets infected with 2009 pandemic h1n1 influenza virus
- mouse adaptation of influenza b virus increases replication in the upper respiratory tract and results in droplet transmissibility in ferrets
- stepping toward a macaque model of hiv-1 induced vomiting as a symptom 
and transmission risk in norovirus illness: evidence from human challenge studies wild-type puumala hantavirus infection induces cytokines, c-reactive protein, creatinine, and nitric oxide in cynomolgus macaques aberrant innate immune response in lethal infection of macaques with the 1918 influenza virus adenovirus-based vaccine prevents pneumonia in ferrets challenged with the sars coronavirus and stimulates robust immune responses in macaques replication, pathogenicity, shedding, and transmission of zaire ebolavirus in pigs west nile viral encephalitis foodborne viruses: an emerging problem low pathogenic avian influenza a(h7n9) virus causes high mortality in ferrets upon intratracheal challenge: a model to study intervention strategies filoviruses: a compendium of 40 years of epidemiological, clinical, and laboratory studies pathology of human influenza a (h5n1) virus infection in cynomolgus macaques (macaca fascicularis) dengue virus infection and immune response in humanized rag2(-/-)gamma(c) (-/-) (rag-hu) mice chikungunya disease in nonhuman primates involves long-term viral persistence in macrophages viral hepatitis b strong local and systemic protective immunity induced in the ferret model by an intranasal virosome-formulated influenza subunit vaccine origin of the west nile virus responsible for an outbreak of encephalitis in the northeastern united states antibody to hepatitis-associated antigen. frequency and pattern of response as detected by radioimmunoprecipitation severe acute respiratory syndrome coronavirus-like virus in chinese horseshoe bats coronavirus hku1 and other coronavirus infections in hong kong epidemic rift valley fever in egypt: observations of the spectrum of human illness hantaviruses. 
a short review hepatitis b virus infection quantitative measurement of influenza virus replication using consecutive bronchoalveolar lavage in the lower respiratory tract of a ferret model characterization of the activity of 2'-c-methylcytidine against dengue virus replication diffusion tensor and volumetric magnetic resonance measures as biomarkers of brain damage in a small animal model of hiv sequencing, annotation, and characterization of the influenza ferret infectome lethality and pathogenesis of airborne infection with filoviruses in a129 alpha/beta −/− interferon receptor-deficient mice experimental inoculation study indicates swine as a potential host for hendra virus early initiation of antiretroviral therapy can functionally control productive hiv-1 infection in humanized-blt mice zika virus disrupts neural progenitor development and leads to microcephaly in mice middle east respiratory syndrome coronavirus causes multiple organ damage and lethal disease in mice transgenic for human dipeptidyl peptidase 4 immunogenicity and protective efficacy of a recombinant subunit west nile virus vaccine in rhesus monkeys study of dengue virus infection in scid mice engrafted with human k562 cells a comparative study of the pathogenesis of western equine and eastern equine encephalomyelitis viral infections in mice by intracerebral and subcutaneous inoculations hydrodynamics-based transfection in animals by systemic administration of plasmid dna the emergence of nipah virus, a highly pathogenic paramyxovirus the guinea pig as a transmission model for human influenza viruses transmission in the guinea pig model a mouse model for the evaluation of pathogenesis and immunity to influenza a (h5n1) viruses isolated from humans transmission of human infection with nipah virus recurrent zoonotic transmission of nipah virus into humans the effect of an arenavirus infection on liver morphology and function hepatitis b virus replication in primary macaque hepatocytes: crossing 
the species barrier toward a new small primate model ebola virus disease in mice with transplanted human hematopoietic stem cells arenavirus-mediated liver pathology: acute lymphocytic choriomeningitis virus infection of rhesus macaques is characterized by high-level interleukin-6 expression and hepatocyte proliferation humanized chimeric upa mouse model for the study of hepatitis b and d virus interactions and preclinical drug evaluation role of dendritic cell targeting in venezuelan equine encephalitis virus pathogenesis detection of hepatitis b virus infection in wild-born chimpanzees (pan troglodytes verus): phylogenetic relationships with human and other primate genotypes proportion of deaths and clinical features in bundibugyo ebola virus infection the ferret: an animal model to study influenza virus local innate immune responses and influenza virus transmission and virulence in ferrets vomiting larry: a simulated vomiting system for assessing environmental contamination from projectile vomiting related to norovirus infection studies on the pathogenesis of dengue infection in monkeys. 3. sequential distribution of virus in primary and heterologous infections studies on dengue 2 virus infection in cyclophosphamide-treated rhesus monkeys crimean-congo hemorrhagic fever experimental infection of squirrel monkeys with nipah virus. emerg a school outbreak of norwalk-like virus: evidence for airborne transmission virology: sars virus infection of cats and ferrets characterization of clinical and immunological parameters during ebola virus infection of rhesus macaques delayed disease progression in cynomolgus macaques infected with ebola virus makona strain. 
emerg asymmetric replication of duck hepatitis b virus dna in liver cells: free minusstrand dna human and avian influenza viruses target different cell types in cultures of human airway epithelium a role for hpv16 e5 in cervical carcinogenesis replication of sars coronavirus administered into the respiratory tract of african green, rhesus and cynomolgus monkeys inhibition of hepatitis b virus in mice by rna interference zika virus was transmitted by sexual contact in texas, health officials report lassa fever. effective therapy with ribavirin lassa virus hepatitis: a study of fatal lassa fever in humans lethal infection of k18-hace2 mice infected with severe acute respiratory syndrome coronavirus andes virus infection of cynomolgus macaques regional t-and b-cell responses in influenza-infected ferrets clinical aspects of marburg hemorrhagic fever humanized mice mount specific adaptive and innate immune responses to ebv and tsst-1 isolation from human sera in egypt of a virus apparently identical to west nile virus middle east respiratory syndrome coronavirus in bats, saudi arabia. emerg chikungunya virus infection results in higher and persistent viral replication in aged rhesus macaques due to defects in anti-viral immunity reduced clearance of respiratory syncytial virus infection in a preterm lamb model dipeptidyl peptidase 4 distribution in the human respiratory tract: implications for the middle east respiratory syndrome experimental nipah virus infection in pigs and cats experimental nipah virus infection in pteropid bats (pteropus poliocephalus) an in-depth analysis of original antigenic sin in dengue virus infection propagation and dissemination of infection after vaginal transmission of simian immunodeficiency virus rift valley fever virus in mice. i. 
general features of the infection zika virus infection during pregnancy in mice causes placental damage and fetal demise outbreaks of acute gastroenteritis associated with norwalk-like viruses in campus settings a nonfucosylated variant of the anti-hiv-1 monoclonal antibody b12 has enhanced fcgammariiiamediated antiviral activity in vitro but does not improve protection against mucosal shiv challenge in macaques flaviviruses experimental studies of rhesus monkeys infected with epizootic and enzootic subtypes of venezuelan equine encephalitis virus necrotizing myocarditis in mice infected with western equine encephalitis virus: clinical, electrocardiographic, and histopathologic correlations aedes albopictus in the united states: ten-year presence and public health implications. emerg severity of clinical disease and pathology in ferrets experimentally infected with influenza viruses is influenced by inoculum volume impact of short-term haart initiated during the chronic stage or shortly post-exposure on siv infection of male genital organs influence of age on susceptibility and on immune response of mice to eastern equine encephalomyelitis virus a mouse model of chikungunya virusinduced musculoskeletal inflammatory disease: evidence of arthritis, tenosynovitis, myositis, and persistence dengue virus tropism in humanized mice recapitulates human dengue fever functional role of type i and type ii interferons in antiviral defense the occurrence of human cases in johannesburg. 
s feline model of acute nipah virus infection and protection with a soluble glycoprotein-based subunit vaccine vertical transmission and fetal replication of nipah virus in an experimentally infected cat pathogenesis and transmission of swine-origin 2009 a(h1n1) influenza virus in ferrets pneumonia from human coronavirus in a macaque model eastern equine encephalitis virus infection: electron microscopic studies of mouse central nervous system evaluation of three strains of influenza a virus in humans and in owl, cebus, and squirrel monkeys a novel morbillivirus pneumonia of horses and its transmission to humans. emerg a morbillivirus that caused fatal disease in horses and humans global burden of acute lower respiratory infections due to respiratory syncytial virus in young children: a systematic review and meta-analysis cchf infection among animals reemergence of monkeypox: prevalence, diagnostics, and countermeasures experimental infection of cynomolgus macaques (macaca fascicularis) with aerosolized monkeypox virus eastern equine encephalitis. 
distribution of central nervous system lesions in man and rhesus monkey hla a2 restricted cytotoxic t lymphocyte responses to multiple hepatitis b surface antigen epitopes during hepatitis b virus infection an aptamer-sirna chimera suppresses hiv-1 viral loads and protects from helper cd4(+) t cell decline in humanized mice severe acute respiratory syndrome coronavirus infection causes neuronal death in the absence of encephalitis in mice transgenic for human ace2 ferrets exclusively synthesize neu5ac and express naturally humanized influenza a virus receptors field's virology naturally occurring, nonregressing canine oral papillomavirus infection: host immunity, virus characterization, and experimental infection regression of canine oral papillomas is associated with infiltration of cd4+ and cd8+ lymphocytes diversifying animal models: the use of hispid cotton rats (sigmodon hispidus) in infectious diseases induction of focal epithelial hyperplasia in tongue of young bk6-e6/e7 hpv16 transgenic mice the contribution of animal models to the understanding of the host range and virulence of influenza a viruses evaluation of antiviral efficacy of ribavirin, arbidol, and t-705 (favipiravir) in a mouse model for crimean-congo hemorrhagic fever evaluation of oseltamivir prophylaxis regimens for reducing influenza virus infection, transmission and disease severity in a ferret model of household contact a novel video tracking method to evaluate the effect of influenza infection and antiviral treatment on ferret activity human respiratory syncytial virus a2 strain replicates and induces innate immune responses by respiratory epithelia of neonatal lambs common marmoset (callithrix jacchus) as a primate model of dengue virus infection: development of high levels of viraemia and demonstration of protective immunity changes in hematological and serum biochemical parameters in common marmosets (callithrix jacchus) after inoculation with dengue virus middle east respiratory 
syndrome coronavirus (mers-cov): animal to human interaction dengue virus-induced hemorrhage in a nonhuman primate model preclinical development of highly effective and safe dna vaccines directed against hpv 16 e6 and e7 a novel small molecule inhibitor of influenza a viruses that targets polymerase function and indirectly induces interferon comparison of monkeypox viruses pathogenesis in mice by in vivo imaging a rhesus monkey model for sexual transmission of a papillomavirus isolated from a squamous cell carcinoma infection of calves with bovine norovirus giii. 1 strain jena virus: an experimental model to study the pathogenesis of norovirus infection the cotton rat provides a useful small-animal model for the study of influenza virus pathogenesis zika virus disease in colombia-preliminary report genetic diversity, distribution, and serological features of hantavirus infection in five countries in south america liver injury and viremia in mice infected with dengue-2 virus the hamster as an animal model for eastern equine encephalitis-and its use in studies of virus entrance into the brain a recombinant hendra virus g glycoprotein-based subunit vaccine protects ferrets from lethal hendra virus challenge identification of an antioxidant small-molecule with broad-spectrum antiviral activity impaired heterologous immunity in aged ferrets during sequential influenza a h1n1 infection influenza transmission in the mother-infant dyad leads to severe disease, mammary gland infection, and pathogenesis by regulating host responses economic impact of respiratory syncytial virus-related illness in the us: an analysis of national databases inhibitory potential of neem (azadirachta indica juss) leaves on dengue virus type-2 replication systematic literature review of role of noroviruses in sporadic gastroenteritis. emerg sars: what have we learned? 
the severe acute respiratory syndrome severe acute respiratory syndrome bacterial sinusitis and otitis media following influenza virus infection in ferrets neuropathology of h5n1 virus infection in ferrets the draft genome sequence of the ferret (mustela putorius furo) facilitates study of human respiratory disease immunopathogenesis of coronavirus infections: implications for sars hantavirus pulmonary syndrome: the new american hemorrhagic fever rift valley fever inbred rat strains mimic the disparate human response to rift valley fever virus infection experimental studies of arenaviral hemorrhagic fevers experimental rift valley fever in rhesus macaques bovine respiratory syncytial virus protects cotton rats against human respiratory syncytial virus infection human hendra virus encephalitis associated with equine outbreak molecularly engineered live-attenuated chimeric west nile/dengue virus vaccines protect rhesus monkeys from west nile virus structure as revealed by airway dissection. a comparison of mammalian lungs study on west nile virus persistence in monkeys experimental hepatitis delta virus infection in the animal model changing patterns of chikungunya virus: re-emergence of a zoonotic arbovirus grune and stratton. 
hepatitis and blood transfusion perspectives on hepatitis b studies with chimpanzees pulmonary lesions in primary respiratory syncytial virus infection, reinfection, and vaccine-enhanced disease in the cotton rat (sigmodon hispidus) experimental hepatitis delta virus infection in the chimpanzee relative infectivity of hepatitis a virus by the oral and intravenous routes in 2 species of nonhuman primates cardiovascular and pulmonary responses to pichinde virus infection in strain 13 guinea pigs establishment and characterization of a lethal mouse model for the angola strain of marburg virus stability and inactivation of sars coronavirus possibility of extracting hyperimmune gammaglobulin against chf from doneky blood sera dipeptidyl peptidase 4 is a functional receptor for the emerging human coronavirus-emc swine as a model for influenza a virus infection and immunity zika virus: an update on epidemiology, pathology, molecular biology, and animal model oseltamivir population pharmacokinetics in the ferret: model application for pharmacokinetic/pharmacodynamic study design aerosol exposure to western equine encephalitis virus causes fever and encephalitis in cynomolgus macaques severe encephalitis in cynomolgus macaques exposed to aerosolized eastern equine encephalitis virus differences in aerosolization of rift valley fever virus resulting from choice of inhalation exposure chamber: implications for animal challenge studies an hiv-1 transgenic rat that develops hiv-related pathology and immunologic dysfunction a trivalent recombinant ad5 gag/pol/nef vaccine fails to protect rhesus macaques from infection or control virus replication after a limiting-dose heterologous siv challenge infection with chikungunya virus in italy: an outbreak in a temperate region a single dose of an iscom influenza vaccine induces long-lasting protective immunity against homologous challenge infection but fails to protect cynomolgus macaques against distant drift variants of influenza a 
(h3n2) viruses influenza a virus (h5n1) infection in cats causes systemic disease with potential novel routes of virus spread within and between hosts immunomodulation with il-4r alpha antisense oligonucleotide prevents respiratory syncytial virus-mediated pulmonary disease animal models for sars aged balb/c mice as a model for increased severity of severe acute respiratory syndrome in elderly humans a mouse-adapted sars-coronavirus causes disease and mortality in balb/c mice transmission of a 2009 h1n1 pandemic influenza virus occurs before fever is detected, in the ferret model experimental norovirus infections in non-human primates synthetic reconstruction of zoonotic and early human severe acute respiratory syndrome coronavirus isolates that produce fatal disease in aged mice a novel model of lethal hendra virus infection in african green monkeys and the effectiveness of ribavirin treatment clinical outcome of henipavirus infection in hamsters is determined by the route and dose of infection recent progress in henipavirus research: molecular biology, genetic diversity, animal models mucosal arenavirus infection of primates can protect them from lethal hemorrhagic fever sars vaccines: where are we? animal models of rift valley fever virus infection characterization of a novel coronavirus associated with severe acute respiratory syndrome macaque model for severe acute respiratory syndrome pathogenesis of aerosolized eastern equine encephalitis virus infection in guinea pigs pathophysiology of hantavirus pulmonary syndrome in rhesus macaques virulence and pathophysiology of the congo basin and west african strains of monkeypox virus in non-human primates arenaviridae. 
virus taxonomy, viiith report of the international committee on taxonomy of viruses clinical laboratory, virologic, and pathologic changes in hamsters experimentally infected with pirital virus (arenaviridae): a rodent model of lassa fever comparative pathology of north american and central african strains of monkeypox virus in a ground squirrel model of the disease biomedical applications of sheep models: from asthma to vaccines bunyaviruses the use of an animal model to study transmission of influenza virus infection experimental infection of an african dormouse (graphiurus kelleni) with monkeypox virus animal noroviruses human monkeypox infection: a family cluster in the midwestern united states the possibility of using the icr mouse as an animal model to assess antimonkeypox drug efficacy using the ground squirrel (marmota bobak) as an animal model to assess monkeypox drug efficacy experimental inoculation of juvenile rhesus macaques with primate enteric caliciviruses a rodent model of chikungunya virus infection in rag1 −/− mice, with features of persistence, for vaccine safety evaluation respiratory syncytial virus (rsv) pulmonary infection in humanized mice induces human anti-rsv immune responses and pathology viremia and antibody response of small african and laboratory animals to crimean-congo hemorrhagic fever virus infection early activation of natural killer and b cells in response to primary dengue virus infection in a/j mice murine model for dengue virus-induced lethal disease with increased vascular permeability in vitro and in vivo assay systems for study of influenza virus inhibitors viruses of the bunya-and togaviridae families: potential as bioterrorism agents and means of control potential role of immunomodulators for treatment of phlebovirus infections of animals patient-derived tumor xenografts: transforming clinical samples into mouse models treatment of lethal pichinde virus infections in weanling lvg/lak hamsters with ribavirin, ribamidine, 
selenazofurin, and ampligen a comparative study of the crimean hemorrhagic fever-congo group of viruses the pathogenesis of rift valley fever virus in the mouse model effective antiviral treatment of systemic orthopoxvirus disease: st-246 treatment of prairie dogs infected with monkeypox virus a neurotropic virus isolated from the blood of a native uganda comparison of the plaque assay and 50% tissue culture infectious dose assay as methods for measuring filovirus infectivity experimental respiratory marburg virus haemorrhagic fever infection in the common marmoset experimental respiratory infection of marmosets (callithrix jacchus) with ebola virus kikwit essential role of platelet-activating factor receptor in the pathogenesis of dengue virus infection respiratory syncytial virus is associated with an inflammatory response in lungs and architectural remodeling of lung-draining lymph nodes of newborn lambs trans-activation of viral enhancers by the hepatitis b virus x protein filamentous influenza a virus infection predisposes mice to fatal septicemia following superinfection with streptococcus pneumoniae serotype 3 contributions of mast cells and vasoactive products, leukotrienes and chymase, to dengue virus-induced vascular leakage. elife 2, e00481. 
stabenow correction: a retrospective investigation on canine papillomavirus 1 (cpv1) in oral oncogenesis reveals dogs are not a suitable animal model for high-risk hpv-induced oral cancer clinical profiles associated with influenza disease in the ferret model comparative neurovirulence and tissue tropism of wild-type and attenuated strains of venezuelan equine encephalitis virus administered by aerosol in c3h/hen and balb/c mice a mouse model for human anal cancer first reported cases of hantavirus pulmonary syndrome in canada antiviral treatment is more effective than smallpox vaccination upon lethal monkeypox virus infection evaluation of intravenous zanamivir against experimental influenza a (h5n1) virus infection in cynomolgus macaques ferrets as a novel animal model for studying human respiratory syncytial virus infections in immunocompetent and immunocompromised hosts differential pathogenesis of respiratory syncytial virus clinical isolates in balb/c mice limited dissemination of pathogenic siv after vaginal challenge of rhesus monkeys immunized with a live, attenuated lentivirus experimental west nile virus infection in rabbits: an alternative model for studying induction of disease and virus control a virus similar to human hepatitis b virus associated with hepatitis and hepatoma in woodchucks development of animal models against emerging coronaviruses: from sars to mers coronavirus severe seasonal influenza in ferrets correlates with reduced interferon and increased il-6 induction the clinical pathology of crimean-congo hemorrhagic fever human immune responses to a novel norwalk virus vaccine delivered in transgenic potatoes nipah virus encephalitis profiles of antibody responses against severe acute respiratory syndrome coronavirus recombinant proteins and their potential use as diagnostic markers distribution of viral antigens and development of lesions in chicken embryos inoculated with nipah virus characterization and demonstration of the value of a 
lethal mouse model of middle east respiratory syndrome coronavirus infection and disease the origin and virulence of the 1918 "spanish" influenza virus the pathology of influenza virus infections isolation of west nile virus from culex mosquitoes recombinant respiratory syncytial virus that does not express the ns1 or m2-2 protein is highly attenuated and immunogenic in chimpanzees animal models of hepadnavirus-associated hepatocellular carcinoma mouse models for chikungunya virus: deciphering immune mechanisms responsible for disease and pathology human monoclonal antibody as prophylaxis for sars coronavirus infection in ferrets experimental infection of ground squirrels (spermophilus tridecemlineatus) with monkeypox virus. emerg persistent west nile virus infection in the golden hamster: studies on its mechanism and possible implications for other flavivirus infections risk factors in dengue shock syndrome breaking barriers to an aids model with macaque-tropic hiv-1 derivatives dominant role of hpv16 e7 in anal carcinogenesis ribavirin efficacy in an in vivo model of crimean-congo hemorrhagic fever virus (cchf) infection histopathologic and immunohistochemical characterization of nipah virus infection in the guinea pig sequence of pathogenic events in cynomolgus macaques infected with aerosolized monkeypox virus severe acute respiratory syndrome coronavirus infection of mice transgenic for the human angiotensin-converting enzyme 2 virus receptor immunization with sars coronavirus vaccines leads to pulmonary immunopathology on challenge with the sars virus protective efficacy of a bivalent recombinant vesicular stomatitis virus vaccine in the syrian hamster model of lethal ebola virus infection persistence of human norovirus rt-qpcr signals in simulated gastric fluid outbreak of necrotizing enterocolitis caused by norovirus in a neonatal intensive care unit pathology of experimental aerosol zaire ebolavirus infection in rhesus macaques experimental aerosolized 
key: cord-344115-gtbkwuqv authors: grimm, volker; johnston, alice s. a.; thulke, h.-h.; forbes, v. e.; thorbek, p. title: three questions to ask before using model outputs for decision support date: 2020-09-30 journal: nat commun doi: 10.1038/s41467-020-17785-2 sha: doc_id: 344115 cord_uid: gtbkwuqv decision makers must have sufficient confidence in models if they are to influence their decisions. we propose three screening questions to critically evaluate models with respect to their purpose, organization, and evidence. they enable a more transparent, robust, and secure use of model outputs. question 1: what is the model's purpose? models are developed for a specific purpose and by the need to address certain questions about real systems. models therefore focus on aspects of the real system that are considered important in answering these questions. consequently, different models exist for the same system. without knowing its purpose, it is impossible to assess whether a model's outputs can be used to support decisions affecting the real world. model purposes fall into three main categories: demonstration, understanding, and prediction. given these different purposes, models also reflect different scopes. models for demonstration are designed to explore ideas, demonstrate the consequences of certain assumptions, and thereby help communicate key concepts and mechanisms.
for example, at the onset of the covid-19 pandemic simple mathematical models were used to demonstrate how lowering the basic reproduction value, r 0 , would lead to "flattening the curve" of infections over time. this is an important logical prediction that helped to make key decisions, but it does not, and cannot, say anything about how effective interventions like social distancing are in reducing r 0 . models for understanding are aimed at exploring how different components of a system interact to shape observed behavior of real systems. for example, a model can mechanistically represent movement and contact rates of individuals. the model can be run to let r 0 emerge and then explore how r 0 changes with interventions such as social distancing. such models are not necessarily numerically precise, but they provide mechanistic understanding that helps to evaluate the consequences of alternative management measures. finally, models for prediction focus on numerical precision. they tend to be more detailed and complex and rely heavily on data for calibration. their ability to make future projections therefore depends on the quality of data used for model calibration. such models still do not predict the future with precision, as this is impossible 11 , but they provide important estimates of alternative future scenarios 12 . decision makers can benefit from all three types of models if they use them according to their given purpose. modelers should therefore state a model's purpose clearly and upfront. by asking this first screening question, one of the most common misuses of models can be prevented: using them for purposes for which they were not designed 13 . models are often used to support decisions without sufficient understanding of a model's basic elements. modelers need to ensure that their models are comprehensively documented, so that these elements can be understood by decision makers. 
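the "flattening the curve" demonstration described above can be reproduced with a few lines of code. the following is our own minimal sir sketch for illustration only (parameter values are placeholders and it is not one of the models cited in the text):

```python
# Minimal SIR "demonstration model": lowering R0 both reduces and delays the
# epidemic peak ("flattening the curve"). Parameters are illustrative.

def sir_peak(r0, gamma=1/7, days=365, dt=0.1, i0=1e-4):
    """Integrate dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I with forward
    Euler; return (peak prevalence, day of peak)."""
    beta = r0 * gamma              # transmission rate implied by R0
    s, i = 1.0 - i0, i0
    peak, peak_day, t = i, 0.0, 0.0
    while t < days:
        new_inf = beta * s * i * dt
        rec = gamma * i * dt
        s, i = s - new_inf, i + new_inf - rec
        t += dt
        if i > peak:
            peak, peak_day = i, t
    return peak, peak_day

high_peak, high_day = sir_peak(3.0)
low_peak, low_day = sir_peak(1.5)
# A lower R0 flattens the curve: smaller peak, arriving later.
assert low_peak < high_peak and low_day > high_day
```

as the text notes, such a sketch demonstrates the logical consequence of lowering r0 but says nothing about how effectively any given intervention actually lowers it.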
the relevant questions about a model's organization are: what is in the model, and how do the parts of the model work together (fig. 1)? decision makers can quickly understand which aspects of the real world are included, and which are excluded, by assessing: what entities are present in the model (e.g., individuals, populations, companies), what state variables characterize these entities (e.g., age, nationality, bank balance), what processes (e.g., movement patterns, meeting rates) link entities and their variables to system dynamics, and what are the temporal and spatial resolution and extent? recognizing the entities, processes and scales of a model provides vital information about a model's scope and hence its potential utility in a decision situation. for example, a model that does not include features of the economy cannot be used to explore the economic impact of lockdowns. likewise, a model's temporal resolution determines whether, for example, short-term behavioral changes can be considered. taken together, information about a model's structure, processes, and scales provides a quick overview of the key assumptions of the model, including the factors that the model ignores. these assumptions are then open to discussion, they can be compared, and their empirical basis can be checked. according to gmp, the rationale underlying these key assumptions should be documented 8 . question 3: is there evidence the model works? real systems are characterized by features that persist over time and can be quantified. models are designed and parameterized to reproduce one or more of these features, which can broadly be referred to as "patterns". the patterns used for model design and parameterization, however, can be idiosyncratic. that is, experts often disagree about the important characteristics of a system and tend to focus on the patterns with which they are familiar.

[fig. 1: three screening questions on a model's purpose, organization, and evidence. purpose: what is the model's purpose? was the model designed to demonstrate, understand, or predict real-world system dynamics? is the model's domain of applicability suitable to address the proposed questions? organization: how is the model organized? what are the entities and their state variables? what processes are modeled, and how do these processes link entities to system dynamics? what is the temporal and spatial resolution? evidence: is there evidence the model works? what patterns can the model reproduce? can the model make independent predictions, and under what conditions? the model should be transparently and comprehensively documented and communicated (e.g. trace, odd). the three screening questions support decision makers to assess whether a model is suitable for addressing real-world decisions and provide a common language for communication between modelers and decision makers. they help to evaluate models and take them into account according to their purpose and evidence (trace: transparent and comprehensive model "evaludation" 8 ; odd: overview, design concepts, details 10 ).]

the only way to deal with this uncertainty is to be explicit about the patterns and data used and why they were considered important. nevertheless, models need to replicate features of the system pertinent to answering the questions at hand, and therefore should be able to demonstrate that incorrect assumptions lead to poor replication of important patterns. models for demonstration focus on generic features such as the existence of an equilibrium, which confirms the modeled system exists. models for understanding focus on one or a subset of mechanisms and how they help to replicate observed patterns. models for prediction focus on quantitatively reproducing sets of patterns, observed at different scales and levels of the system's organization.
each pattern is used as a filter to reject unsuitable parameter values or inaccurate representations of processes. the more patterns a model can reproduce simultaneously, for instance, by using a pattern-oriented modeling approach, the more reliably it captures the essential features of a real system's organization 14 . patterns need not be highly distinctive. a combination of broad weak patterns can be as informative as a strong high-resolution pattern. we are all familiar with this power of weak but combined filters: we can pick up a person unknown to us at the airport if we have their photo (strong pattern), or if we are told the person is, e.g., male, wears glasses, and has a red suitcase (combination of weak patterns). for situations such as the covid-19 pandemic, where few data and little knowledge are available at first, it is particularly important that models can predict multiple patterns, which by themselves may not contain much information but in combination can reduce model uncertainty. in crisis situations, like in a pandemic or the conservation of threatened species, models are often limited by incomplete data. nevertheless, models can help identify the most sensitive factors so that data collection can be prioritized. it is also essential that models can be readily updated with emerging information, which is facilitated by clear communication of a model's organization. a way forward: models tend to look right as they are presented with the claim of being realistic enough for their purpose. looking right, though, has many dimensions, and we suggest the three screening questions presented here as a simple approach for decision makers to disentangle them.
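the pattern-as-filter idea can be sketched in code. the following toy example is ours, not the authors': a logistic growth model whose candidate parameter sets must pass three weak pattern filters simultaneously, so that the combination rejects most candidates even though each filter alone is weak:

```python
# Pattern-oriented rejection filtering (illustrative sketch): candidate
# parameter sets survive only if they reproduce ALL observed "weak" patterns.
import random

def simulate_logistic(r, k, n0=10.0, steps=200):
    """Discrete logistic growth; returns the population trajectory."""
    n, traj = n0, []
    for _ in range(steps):
        n = n + r * n * (1 - n / k)
        traj.append(n)
    return traj

# Three weak patterns, each satisfied by many parameter sets on its own:
patterns = [
    lambda tr: tr[10] > tr[0],                                   # early growth
    lambda tr: 90 < tr[-1] < 110,                                # equilibrium near 100
    lambda tr: next((t for t, n in enumerate(tr) if n > 50), 999) < 40,  # fast rise
]

random.seed(0)
candidates = [(random.uniform(0.01, 1.0), random.uniform(50, 200))
              for _ in range(500)]
accepted = [p for p in candidates
            if all(f(simulate_logistic(*p)) for f in patterns)]
# Combined weak filters narrow the plausible parameter region substantially.
assert 0 < len(accepted) < len(candidates)
```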
once we know a model's purpose, we can assess whether its organization is meaningful; once we know its purpose and organization, we can assess whether it represents the real system dynamics relevant to a specific problem; once we know which patterns it can reproduce, we can assess whether its model outputs can support decisions; and once we have answered all three questions for a suite of models they can be compared and taken into account to varying degrees 5 . the three questions do not replace more detailed guidelines on gmp 6,7 , but they provide a simple and effective common language that will allow us to develop models and use their outputs for decision support in a more transparent, robust, and safe way. it is still important to keep in mind that models can only support decisions. the responsibility for using model outputs lies with decision makers. answers to the three screening questions allow them to transparently base their decisions on a weight-of-evidence approach 15 , and to update their decisions when new data and updated models are available. population dynamics of fox rabies in europe emergency vaccination of rabies under limited resources-combating or containing? useless arithmetic: columbia university press modeling infectious disease dynamics harnessing multiple models for outbreak management computational modelling for decision-making: where, why, what, who and how making ecological models adequate towards better modelling and decision support: documenting model development, testing, and analysis using trace to what extent does evidence support decision making during infectious disease outbreaks?
a scoping literature review the odd protocol for describing agent-based and other simulation models: a second update to improve clarity, replication, and structural realism verification, validation, and confirmation of numerical models in the earth sciences global warming under old and new scenarios using ipcc climate sensitivity range estimates different modelling purposes pattern-oriented modeling of agent-based complex systems: lessons from ecology weight of evidence: a review of concept and methods competing interests: p.t. is working for basf se. correspondence and requests for materials should be addressed to v.g. peer review information: nature communications thanks aaron king and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. key: cord-318079-jvx1rh7g authors: hinch, r.; probert, w. j.
m.; nurtay, a.; kendall, m.; wymatt, c.; hall, m.; lythgoe, k.; bulas cruz, a.; zhao, l.; stewart, a.; ferritti, l.; montero, d.; warren, j.; mather, n.; abueg, m.; wu, n.; finkelstein, a.; bonsall, d. g.; abeler-dorner, l.; fraser, c. title: openabm-covid19 an agent-based model for non-pharmaceutical interventions against covid-19 including contact tracing date: 2020-09-22 journal: nan doi: 10.1101/2020.09.16.20195925 sha: doc_id: 318079 cord_uid: jvx1rh7g sars-cov-2 has spread across the world, causing high mortality and unprecedented restrictions on social and economic activity. policymakers are assessing how best to navigate through the ongoing epidemic, with models being used to predict the spread of infection and assess the impact of public health measures. here, we present openabm-covid19: an agent-based simulation of the epidemic including detailed age-stratification and realistic social networks. by default the model is parameterised to uk demographics and calibrated to the uk epidemic, however, it can easily be re-parameterised for other countries. openabm-covid19 can evaluate non-pharmaceutical interventions, including both manual and digital contact tracing. it can simulate a population of 1 million people in seconds per day allowing parameter sweeps and formal statistical model-based inference. the code is open-source and has been developed by teams both inside and outside academia, with an emphasis on formal testing, documentation, modularity and transparency. a key feature of openabm-covid19 is its python interface, which has allowed scientists and policymakers to simulate dynamic packages of interventions and help compare options to suppress the covid-19 epidemic. a particular focus of our work applying openabm-covid19 has been exploring different ways in which contact tracing, and in particular digital contact tracing using mobile phone apps that record proximity events, can contribute to epidemic control [9] . 
several other groups have approached this problem with similar agent-based models [10, 11] ; compared to those, our model places more emphasis on simulating larger populations, computational efficiency, and on code generalisability that allows other researchers to use and develop the code. we developed the agent-based model (abm) openabm-covid19 to simulate an outbreak of covid-19 in an urban environment. the default population is one million inhabitants with demographic structure based upon uk-wide census data, and household size and age-structure matched to data from the uk 2011 census survey (for example, older people tend to live together and young children tend to live with younger adults). on a daily basis all individuals in the model move between networks representing households and either workplaces, schools, or regular social environments for older people. individuals also interact through random networks representing public transport, transient social gatherings etc. membership of each type of network is determined by age, giving rise to age-assortative mixing patterns. network parameters are chosen such that the average number of interactions match age-stratified data reported in [12] . the number of daily interactions in random networks is drawn from a negative binomial distribution, allowing for rare super-spreading events. infections are seeded in the population and spread through the networks. biological and epidemiological characteristics of covid-19 disease have been derived from the scientific literature. the model takes into account asymptomatic infections and different stages of severity, and includes the simulation of hospitalisations and icu admissions. since symptoms, disease progression and infectiousness are highly age-dependent, disease pathways in the model are age-stratified.
the abm was developed to simulate different non-pharmaceutical interventions including lockdown, physical distancing, self-isolation on symptoms, testing and contact tracing. modelling contact tracing requires the model to keep a record of previous interactions for a set number of days. a variety of contact tracing algorithms are included in the abm, including tracing on symptoms and/or after a positive test, notifying first-degree contacts only or second-degree contacts as well, testing of traced contacts, and imperfections in test-trace-isolate programmes such as delays, missed contacts and partial compliance. the model reports both aggregated data, such as incidence, tests required, individuals quarantined for various reasons etc., and individual data such as transmission relationships. openabm-covid19 is available on github ( https://github.com/bdi-pathogens/openabm-covid19 ), including model documentation, dictionaries for input parameters and output files, over 200 tests in a consistent testing framework used in model validation, and examples for running the model. the core of the model is implemented in the c language for speed; however, the model is run via python using a swig-interface. this interface allows for dynamic intervention strategies to be modelled, as well as providing full transparency about the state of the model. this manuscript was prepared using v0.3 of the model (commit number d14351e) and code for reproducing all figures in this manuscript from model output is publicly available online ( https://github.com/bdi-pathogens/openabm-covid19-model-paper ). openabm-covid19 enables simulation of interventions to help policymakers determine the best options to suppress the covid-19 epidemic in various settings. default demographic parameters were chosen to reflect the uk and fit well to the uk epidemic after calibration; however, all parameters of the model can be changed by the user.
within the abm, individuals are categorised into nine age groups by decade, from "0-9 year" to "80+ years". decades were used because of the strong age-structure of the disease progression. by default, the demographics of the abm are set to uk national data for 2018 from the office of national statistics (ons). the proportion of individuals in each age group is the same as that specified by the population level statistics in supplementary table 1 . since we only consider simulating the epidemics up to a year, we do not consider changes in the population due to births, deaths due to other causes, and migration. in each of the interaction networks, individuals are represented as nodes. constant and dynamic connections occur between the nodes in the networks, representing interactions between individuals. the three networks represent different types of daily interactions: household, occupation, and random ( figure 1 ). the interaction networks have two roles in the abm. first, the infection can be transmitted between two individuals on a day that they interact. second, the interactions for each individual are stored and can be used for contact tracing. the membership of different networks leads to age-group assortativity in the interactions. a previous study of social contacts for infectious disease modelling, based on participants being asked to recall their interactions over the past day, has estimated the mean number of interactions that individuals have by age group [12] . we estimate mean interactions by age group by aggregating data (supplementary table 2 ). every individual is assigned to live in a single household. the household network is formed by all members of every household interacting with each other every day. the distribution of household sizes is the ons estimate for the uk in 2018 (supplementary table 1 ).
in addition to the household size, the mix of ages in households is important since multi-generational households provide a path by which the infection can be transmitted from young to old. to model this we used a reference panel of 10,000 households taken by down-sampling the uk-wide household composition data from the 2011 census produced by the ons. the overall household structure was generated by sampling from the reference household panel with replacement and using rejection-sampling to match the aggregate statistics for the age demographics and household size. each individual is also a member of a recurring occupation network to model school, workplace or social networks. the occupation networks are modelled as small-world networks [13] . the network has a fixed set of connections between individuals, and each day a random subset (50%) of these connections is chosen as the interactions between individuals. when constructing the occupation networks, the abm ensures the absence of overlaps between the household interactions and the local interactions on the small-world network. for children, there are separate occupation networks for the 0-9 year age group (i.e. nursery/primary school) and the 10-19 year age group (i.e. secondary school). on each of these networks we introduce a small number of adults (1 adult per 5 children) to represent teaching and other school staff. similarly for the 70-79 year age group and the 80+ year age group we created separate networks representing daytime social activities among elderly people (again with 1 younger adult per 5 elderly people to represent some mixing between the age groups). all remaining adults (the vast majority) are part of the 20-69 network. due to the difference in total number of daily interactions, each age group has a different number of interactions in their occupation network. parameters and values corresponding to the occupation network are shown in supplementary table 3 .
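the recurring-network mechanism can be sketched as follows. this is our own illustration of a watts-strogatz-style small-world network with daily 50% subsampling of its fixed connections; the node count and parameters are illustrative, not the abm's defaults:

```python
# Sketch of a small-world occupation network: fixed connections, of which a
# random 50% subset is active each day. Parameter values are illustrative.
import random

def small_world(n, k, p_rewire, rng):
    """Ring lattice of n nodes, each linked to its k nearest neighbours on one
    side, with each edge rewired to a random node with probability p_rewire."""
    edges = set()
    for u in range(n):
        for j in range(1, k + 1):
            v = (u + j) % n
            if rng.random() < p_rewire:
                v = rng.randrange(n)
                while v == u or (min(u, v), max(u, v)) in edges:
                    v = rng.randrange(n)
            edges.add((min(u, v), max(u, v)))
    return edges

def daily_interactions(edges, fraction, rng):
    """Each day a random subset of the fixed connections is active."""
    return {e for e in edges if rng.random() < fraction}

rng = random.Random(1)
net = small_world(n=1000, k=5, p_rewire=0.1, rng=rng)
today = daily_interactions(net, fraction=0.5, rng=rng)
assert today <= net                        # active contacts come from the fixed network
assert 0.4 < len(today) / len(net) < 0.6   # roughly half the connections active per day
```

storing the fixed edge set while resampling the active subset daily mirrors the text's two roles of the networks: transmission on the day of contact, and a retrievable contact history for tracing.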
in addition to the recurring structured networks of households and occupations, we include random interactions. these are drawn randomly each day, independent of previous connections. the number of random connections an individual makes is the same each day (in the absence of interventions), drawn at the start of the simulation from an over-dispersed negative-binomial distribution. this variation in the number of interactions introduces some "super-spreaders" into the network who have many more interactions than average. the mean numbers of connections were chosen so that the total number of daily interactions matched that from a previous study of social interaction [12] . the number of random interactions was chosen to be lower in children in comparison to other age groups. interactions in the random network are listed in supplementary table 4 . the infection is spread by interactions between infected (source) and susceptible (recipient) individuals. the rate of transmission is determined by three factors: the infectiousness of the source, the age-dependent susceptibility of the recipient, and the type of interaction, i.e. on which network it occurred. infectiousness varies over the natural course of an infection, i.e. as a function of the amount of time the source has been infected, . infectiousness starts at zero at the point of infection ( = 0), increases to a peak at an intermediate time, and decreases to zero a long time after infection (large ). following [7] , we took the functional form of infectiousness to be a scaled gamma distribution. we chose the mean and standard deviation as intermediate values between different studies [7, 14, 15] . once infected, we split individuals into three groups based upon the eventual severity of the disease: asymptomatic, mild symptomatic and moderate-severe symptomatics. the level of infectiousness depends upon the eventual severity of the disease, i.e. 
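the overdispersed random-network contact numbers can be sketched with a gamma-poisson mixture, a standard way to draw negative-binomial counts; the mean and dispersion below are illustrative, not the abm's calibrated values:

```python
# Overdispersed daily contact counts via a Gamma-Poisson (negative binomial)
# mixture: a few individuals draw many more contacts ("super-spreaders").
import math
import random

def poisson(lam, rng):
    """Knuth's multiplication algorithm; adequate for the rates used here."""
    l, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= l:
            return k
        k += 1

def negative_binomial(mean, dispersion, rng):
    """N ~ Poisson(Lambda) with Lambda ~ Gamma(shape=dispersion, mean=mean)."""
    lam = rng.gammavariate(dispersion, mean / dispersion)
    return poisson(lam, rng)

rng = random.Random(42)
contacts = [negative_binomial(mean=4.0, dispersion=0.5, rng=rng)
            for _ in range(20000)]
m = sum(contacts) / len(contacts)
v = sum((c - m) ** 2 for c in contacts) / len(contacts)
assert 3.5 < m < 4.5          # sample mean near the target
assert v > 2 * m              # overdispersed relative to a Poisson (var = mean)
assert max(contacts) > 20     # rare individuals with many interactions
```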
pre-symptomatic individuals who go on to develop moderate-severe symptoms are more infectious than those who go on to develop mild symptoms. by default, the overall infectiousness of asymptomatic individuals and individuals with mild symptoms, is 0.33 and 0.72 times that of individuals with moderate-severe symptoms respectively [16] . an example of how transmissions can be stratified by the infection status of the source and the age of both source and recipient is depicted in figure 3 . in this simulation of an uncontrolled epidemic, most transmissions occur from pre-symptomatic individuals with mild disease who are more numerous than individuals who go on to develop severe disease, followed by symptomatic individuals with mild disease. interventions that reduce the rate of growth of transmission will change the relative contributions of different symptomatic stages. the susceptibility of the recipient to infection is modelled with a scale factor dependent on the recipient's age. to calibrate these factors, we identified studies of whether or not transmission occurred from index cases to monitored close contacts [17] [18] [19] [20] [21] [22] [23] [24] . lower probability of infection in children was reported in almost all studies, including that of zhang et al [17] , which observed more infections than the rest of the studies combined, with consistent adjustment for other covariates of transmission risk. we used the susceptibility by age of zhang et al., interpolated to match our ten-year age categories. the merged data and fit are shown in supplementary table 5 . finally, we model the type of interaction, i.e. on which network the interaction took place.
whilst we do not have data on the length of interactions, interactions which take place within a person's home are likely to be closer than other types of interactions, leading to higher rates of transmission. this is modelled using a scale factor, which is 2 by default. combining all effects, we model the hazard rate per interaction at which the virus is transmitted as

λ(t; d, a, n) = (r / ī) · s_a · a_d · b_n · ∫_{t-1}^{t} f_γ(u; μ_i, σ_i²) du

where t is the time since the source was infected; d indicates the disease severity of the source (asymptomatic, mild, moderate/severe); a is the age of the recipient; n is the type of network where the interaction occurred; ī is the mean number of daily interactions; f_γ(u; μ, σ²) is the probability density function of a gamma distribution; μ_i and σ_i are the mean and width of the infectiousness curve; r scales the overall infection rate; s_a is the relative susceptibility of the recipient based on age; a_d is the relative infectiousness of the source based on disease severity; and b_n is the scale factor for the network on which the interaction occurred. supplementary table 6 contains the values of the parameters used in simulations. the transmission hazard rate is converted to a probability of transmission via p = 1 − e^(−λ). the epidemic is seeded by randomly infecting individuals on the day before the simulation starts. a fraction φ_asym(age) of individuals are asymptomatic and do not develop symptoms, a fraction φ_mild(age) will eventually develop mild symptoms, and the remainder develop moderate/severe symptoms. each of these proportions depends on the age of the infected individual (supplementary table 7 ).
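a numerical sketch of this calculation follows. it is our illustration: the infectiousness-curve parameters and rate values are placeholders, and the day-by-day integral of the gamma profile is our reading of the definitions above rather than code from the paper:

```python
# Per-interaction transmission probability: hazard built from the scale
# factors defined in the text, then p = 1 - exp(-hazard). Values illustrative.
import math

def gamma_pdf(u, mean, sd):
    """Gamma density parameterised by mean and standard deviation."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return (u ** (shape - 1) * math.exp(-u / scale)
            / (math.gamma(shape) * scale ** shape))

def transmission_prob(t, r, s_a, a_d, b_n, mean_interactions,
                      mu=5.5, sigma=2.1, steps=100):
    """p = 1 - exp(-lambda), with the infectiousness profile f_gamma
    integrated numerically over the day [t-1, t] (trapezoid rule)."""
    h = 1.0 / steps
    integral = sum(h * 0.5 * (gamma_pdf(t - 1 + i * h, mu, sigma)
                              + gamma_pdf(t - 1 + (i + 1) * h, mu, sigma))
                   for i in range(steps))
    lam = (r / mean_interactions) * s_a * a_d * b_n * integral
    return 1.0 - math.exp(-lam)

# Infectiousness peaks near mu days after infection and decays afterwards:
p_peak = transmission_prob(t=6, r=5.0, s_a=1.0, a_d=1.0, b_n=2.0,
                           mean_interactions=10)
p_late = transmission_prob(t=20, r=5.0, s_a=1.0, a_d=1.0, b_n=2.0,
                           mean_interactions=10)
assert 0 < p_late < p_peak < 1
```

note how the household scale factor b_n = 2 described in the text simply doubles the hazard for home interactions before conversion to a probability.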
those who are asymptomatic are infectious at a lower level (see infection dynamics section) and will move to a recovered state after a time τ a,rec drawn from a gamma distribution. once an individual is recovered the model allows immunity to wane through time using two parameters: a fixed period for which every individual must wait, τ waning-shift , and then a geometric distribution of waiting times until individuals become susceptible, parameterised by its mean τ waning-mean . by default, the model assumes τ waning-shift to be 10,000 days (essentially no waning immunity). during this waiting period, infection is assumed to be completely immunising (recovered individuals cannot be reinfected). individuals who will develop symptoms start by being in a pre-symptomatic state, in which they are infectious but have no symptoms. the pre-symptomatic state is important for modelling interventions because individuals in this state do not realise they are infectious and therefore will not self-isolate based on symptoms to prevent infecting others. individuals who develop mild symptoms do so after time τ sym and then recover after time τ rec (both drawn from gamma distributions). the remaining individuals develop moderate/severe symptoms after a time τ sym drawn from the gamma distribution. whilst most individuals recover without requiring hospitalisation, a fraction φ hosp (age) of those with moderate/severe symptoms will require hospitalisation. this fraction is age-dependent. those who do not require hospitalisation recover after a time τ rec drawn from a gamma distribution, whilst those who require hospitalisation are admitted to hospital after a time τ hosp , which is drawn from a shifted bernoulli distribution. among all hospitalised individuals, a fraction φ crit (age) develop critical symptoms and require intensive care treatment, with the remainder recovering after a time τ hosp , rec drawn from a gamma distribution. 
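the branching disease pathway can be sketched as follows; the branching fractions and gamma parameters below are placeholders rather than the paper's supplementary-table values:

```python
# Age-stratified disease pathway sampling (sketch): severity class drawn from
# age-dependent fractions, waiting times from gamma distributions.
import random

# Hypothetical fractions phi_asym / phi_mild for age-decade groups 0..8:
PHI_ASYM = [0.50, 0.50, 0.40, 0.35, 0.30, 0.25, 0.20, 0.20, 0.20]
PHI_MILD = [0.45, 0.45, 0.40, 0.35, 0.30, 0.25, 0.20, 0.15, 0.10]

def sample_pathway(age_group, rng):
    """Return (severity, time to symptoms or None, time to recovery)."""
    u = rng.random()
    if u < PHI_ASYM[age_group]:
        # asymptomatic: recover after tau_a,rec ~ Gamma (placeholder shape/scale)
        return "asymptomatic", None, rng.gammavariate(14.0, 1.0)
    severity = ("mild" if u < PHI_ASYM[age_group] + PHI_MILD[age_group]
                else "moderate/severe")
    tau_sym = rng.gammavariate(5.0, 1.2)   # pre-symptomatic period
    tau_rec = rng.gammavariate(10.0, 1.0)  # symptomatic period
    return severity, tau_sym, tau_rec

rng = random.Random(7)
paths = [sample_pathway(age_group=8, rng=rng) for _ in range(5000)]
assert {s for s, _, _ in paths} == {"asymptomatic", "mild", "moderate/severe"}
frac_severe = sum(s == "moderate/severe" for s, _, _ in paths) / len(paths)
# With these placeholder fractions, the oldest group is mostly moderate/severe:
assert 0.6 < frac_severe < 0.8
```

the hospitalisation, icu and death branches described in the text extend this same pattern: each branch draws a bernoulli outcome from an age-dependent fraction and a waiting time from its own distribution.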
the time from hospitalisation to developing critical symptoms, τ_crit, is drawn from a shifted bernoulli distribution. of those who develop critical symptoms, a fraction φ_icu(age) will receive intensive care treatment. for patients receiving intensive care treatment, a fraction φ_death(age) die after a time τ_death drawn from a gamma distribution, with the remainder leaving intensive care after a time τ_crit,surv. patients who require critical care but do not receive intensive care treatment are assumed to die upon developing critical symptoms. patients who survive critical symptoms remain in hospital for τ_hosp,rec before recovering. the age-dependent infection fatality ratio (ifr) is depicted in figure 5; other age-dependent outcomes are shown in supplementary figure 1. supplementary figure 2 shows the corresponding waiting-time distributions. figure 5: age-stratified infection fatality ratio (ifr) as output from a single simulation in a population of 1 million with uk-like demography and with a lockdown when prevalence reached 2%. grey numbers on each bar show the ifr within each age group. main outputs of the model include the number of infected individuals, hospitalisations, icu admissions and deaths (figure 6). additional outputs are the number of people in quarantine and the number of tests required, which is of particular interest when comparing different interventions. transmissions can be broken down by type (pre-symptomatic, symptomatic and asymptomatic). the model provides a good fit to uk data (figure 6).
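a minimal sketch of the hospital pathway for one moderate/severe case, using hypothetical (not the paper's) age-band fractions φ_hosp, φ_crit, φ_icu and φ_death:

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical fractions for a single age band -- NOT the paper's supplementary values
phi_hosp, phi_crit, phi_icu, phi_death = 0.15, 0.3, 0.8, 0.4

def severe_case_outcome(rng) -> str:
    """walk one moderate/severe case through the hospital pathway described in the text."""
    if rng.random() >= phi_hosp:
        return "recovered_at_home"
    if rng.random() >= phi_crit:
        return "recovered_in_hospital"
    if rng.random() >= phi_icu:
        # critical care required but no intensive care received -> death on critical symptoms
        return "died_without_icu"
    return "died_in_icu" if rng.random() < phi_death else "recovered_after_icu"

outcomes = [severe_case_outcome(rng) for _ in range(10_000)]
```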
openabm-covid19 can model a range of non-pharmaceutical interventions (npis). given the many types of intervention and the interest in introducing them at different times, the interventions are controlled dynamically in the simulation through the python interface. this allows policy interventions to be applied in response to changes in the growth of the epidemic (e.g. stricter policies such as lockdown can be applied when prevalence is above a threshold). below we give brief descriptions of the interventions; sample python code is given in the supplementary materials with links to jupyter notebooks. all model parameters involved with npis are given in supplementary tables 10 and 11. 1. self-isolation upon symptoms: a proportion of individuals self-isolate upon developing symptoms. self-isolation is modelled by stopping interactions on the individual's occupation network and greatly reducing their number of interactions on the random network. the default time for self-isolation is 7 days, with a daily dropout. the abm contains the option to quarantine everybody within the household of the symptomatic individual. the abm also considers individuals without covid-19 who develop flu-like symptoms. supplementary figure 4a is a jupyter notebook demonstrating how self-isolation upon symptoms reduces the rate of spread of the infection. 2. hospitalisation: once admitted to hospital, a patient immediately stops interacting with the household, occupation and random networks. we do not model interactions within hospitals, but will add this in future work. 3. lockdown: modelled by reducing the number of interactions that people have on their occupation and random networks (by default by 71%).
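the dynamic control loop described above can be illustrated with a deliberately simple toy model (this is not the openabm-covid19 api): a "lockdown" that cuts the daily growth multiplier is switched on once prevalence crosses a threshold.

```python
# toy threshold-triggered lockdown, mimicking the dynamic control loop described above.
# this is NOT the openabm-covid19 api -- just a deterministic growth sketch.
population = 1_000_000
prevalence_threshold = 0.02                    # lock down when 2% of the population is infected
growth_normal, growth_lockdown = 1.15, 0.92    # illustrative daily growth multipliers

infected = 100.0
lockdown = False
history = []
for day in range(120):
    if not lockdown and infected / population >= prevalence_threshold:
        lockdown = True                        # stricter policy applied in response to growth
    infected *= growth_lockdown if lockdown else growth_normal
    history.append(infected)
```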
additionally, given that during lockdown people stay at home, the transmission rate for interactions on the household network is increased by a factor of 1.5. supplementary figure 4b is a jupyter notebook demonstrating the rapid reduction in new infections when a lockdown is imposed. the impact of lockdown on the reproduction number, r, is given in supplementary figure 5, and an animation showing the age-stratified breakdown is in supplementary figure 6. 4. shielding: contact reductions can be applied to certain age groups only. for example, given that the fatality ratio is highly skewed towards the over-70s, we have the option of applying a reduction in contacts to this demographic group only. supplementary figure 4c is a jupyter notebook demonstrating how new infections can be kept low in a shielded group. 5. physical distancing: measures such as physical distancing and mask wearing reduce the probability of transmission in certain types of interactions (e.g. random interactions but not household interactions). the abm allows this to be modelled by allowing the network-specific transmission multipliers to be adjusted during a simulation. supplementary figure 4d is a jupyter notebook demonstrating how new infections can be kept low after a lockdown with (extreme) social distancing measures. openabm-covid19 is able to model contact tracing (both manual and digital) and how it operates with or without an integrated testing system.
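a minimal sketch of network-specific transmission multipliers being adjusted mid-simulation, as described above; the names and values are hypothetical.

```python
import math

# network-specific transmission multipliers, adjustable mid-simulation (hypothetical values)
base_rate = 0.05
multipliers = {"household": 2.0, "occupation": 1.0, "random": 1.0}

def per_interaction_rate(network: str) -> float:
    return base_rate * multipliers[network]

before = per_interaction_rate("random")
multipliers["random"] *= 0.5       # physical distancing / masks on transient contacts only
multipliers["household"] *= 1.5    # lockdown: more time at home raises household transmission
after = per_interaction_rate("random")
```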
the model contains many of the real-world imperfections which affect test and contact tracing programmes, such as test sensitivity and specificity, delays in testing and contact tracing, incomplete coverage, failure to recall contacts, contact tracer resource limitations and partial adherence to quarantine requests. it also has the ability to model recursive contact tracing with and without testing. below we give descriptions of the test and contact tracing features, with sample code given in the supplementary materials along with links to jupyter notebooks. 1. testing: testing can occur in both the community and hospital (where an immediate clinical diagnosis is allowed). tests are assumed to be sensitive from 3 days post-infection to 14 days post-infection, with a sensitivity of 80% and a specificity of 99%. for community testing, delays can be introduced for ordering a test and for receiving the test result. testing of an individual in the community is triggered by reporting symptoms and can also be triggered by being contact traced. supplementary figure 4e demonstrates the importance of quick testing if self-isolation only occurs after a positive test (as opposed to on symptoms). 2. digital contact tracing: contact tracing is vital to control epidemics with a high level of pre-symptomatic transmission. a variable fraction of individuals in each group can be assigned to have the app. ownership of smartphones is based on age-stratified ofcom data (supplementary figure 3 and supplementary table 9). digital contact tracing can only occur between two app users. digital proximity sensing is likely to miss some interactions, so a number of interactions are randomly dropped when contact tracing. for contact tracing, the model takes into account all interactions the individual has had with other app users over the past seven days which have not been dropped. the model can simulate different app-based contact tracing algorithms.
the app can send out notifications with the request to quarantine based on symptoms, or based on a positive test result of the index case. it can ask the household members of the index case and/or the household members of the contacts to quarantine, and can also send notifications deeper into the network if desired. it can request tests for contacts of index cases if desired. supplementary figure 4f demonstrates how digital contact tracing following rapid testing can prevent a second wave even when average uptake is only 50% of the total population. 3. manual contact tracing: manual contact tracing works in a similar way to digital contact tracing, with a few key differences. first, since it does not rely on an individual being a smartphone user, it can originate from anybody who tests positive (particularly important in the elderly, where smartphone usage is lower). however, since the identification of interactions relies on the index case recalling them, only a fraction of actual interactions are traced. in particular, the fraction of interactions recalled depends on the type of interaction (e.g. occupation-based interactions are more likely to be recalled than random interactions). manual contact tracing only occurs after a delay following a positive test, to account for contact tracers contacting both the index and traced individuals. finally, during a peak in the epidemic the amount of contact tracing required increases and risks overwhelming a manual contact tracing service; the model therefore contains constraints on the total number of interviews that contact tracers can perform on a single day. supplementary figure 4g demonstrates how well-staffed manual contact tracing following rapid testing can lessen a second wave. 4. quarantine: contact-traced individuals can be asked to quarantine (default 14 days), either because they are directly traced or because they are a household member of somebody who has been traced.
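a hedged sketch of the two manual-tracing imperfections described above, type-dependent recall and a daily interview capacity; the recall probabilities and the capacity are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

# hypothetical recall probabilities by interaction type (occupation recalled more than random)
recall = {"household": 1.0, "occupation": 0.8, "random": 0.3}
DAILY_INTERVIEW_CAPACITY = 50    # hypothetical cap on interviews tracers can perform per day

def trace_contacts(contacts, rng):
    """index case recalls a type-dependent fraction of contacts; capacity caps the rest."""
    recalled = [c for c in contacts if rng.random() < recall[c[1]]]
    return recalled[:DAILY_INTERVIEW_CAPACITY]

contacts = [(i, "random") for i in range(100)] + [(100 + i, "occupation") for i in range(50)]
traced = trace_contacts(contacts, rng)
```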
like self-isolation, quarantine is modelled by stopping interactions on the occupation network and greatly reducing the number of interactions on the random network. the model includes a daily dropout rate to simulate imperfect adherence. quarantine can be ended if the index case later tests negative (after tracing based upon their symptoms), or if the quarantined individual tests negative. the core of openabm-covid19 is coded in c using an object-oriented coding style. the code is written in a modular manner to ease readability and encourage extension of the code base. it is open source and is being actively developed by multiple teams. the model uses the gnu scientific library (gsl) for mathematical functions, statistical distributions and random number generation [25], so any distribution or function available within the gsl can be easily incorporated into the model (for instance in modelling waiting-time distributions). memory is pre-allocated at the start of the simulation for efficiency. an important feature of the implementation is the python interface, built using swig. running the model via python allows complex dynamic intervention strategies to be easily modelled (see examples in supplementary figures 4a-h). all states of the model (e.g. transmission events, interactions, individual characteristics) are exposed in python, which gives full transparency to the results of the model. for example, supplementary figure 4h is a notebook showing how to calculate the relative personal protective effect for app users versus non-app users when digital contact tracing is used. python is also a ubiquitous language amongst data scientists, and the interface allows them to interact fully with the model whilst keeping the high speed and memory performance of c. performance: for 1 million individuals the abm takes approximately 3 s per simulated day and requires 5 gb of memory (reduced to 1.7 gb if contact tracing is disabled) on a 2015 macbook pro.
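the daily dropout used to model imperfect quarantine adherence can be sketched as follows (the dropout rate is a hypothetical value):

```python
import numpy as np

rng = np.random.default_rng(4)

QUARANTINE_DAYS = 14     # default quarantine length from the text
DAILY_DROPOUT = 0.02     # hypothetical daily probability of abandoning quarantine

def days_in_quarantine(rng) -> int:
    """days actually spent quarantining, with a daily dropout for imperfect adherence."""
    for day in range(1, QUARANTINE_DAYS + 1):
        if rng.random() < DAILY_DROPOUT:
            return day
    return QUARANTINE_DAYS

durations = [days_in_quarantine(rng) for _ in range(10_000)]
mean_days = sum(durations) / len(durations)
```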
both speed and memory are linear in population size (tested from 100k to 1m). the majority of the cpu usage is spent on rebuilding the daily interaction networks and updating individuals' interaction diaries. we present openabm-covid19, a covid-19-specific agent-based model suitable for simulating the epidemic in different settings and assessing non-pharmaceutical interventions, including contact tracing using a mobile phone app. the model is well documented with a simple interface, allowing non-experts to easily evaluate complex dynamic intervention strategies in a few lines of python code. openabm-covid19 is an open-source project and is easily extensible, with new features already being added by multiple external teams. the model is fully documented and thoroughly tested in a formal testing framework. the model was designed to be as parsimonious as possible, with complexity added only when it was essential to model important features of covid-19 or details of non-pharmaceutical interventions, and with parameters inferred from published studies. due to the substantial pre-symptomatic and asymptomatic transmission of the virus, it is necessary to model each individual's normal daily interactions. further, on developing symptoms or during interventions such as contact tracing, the interaction pattern of an individual changes to include only those in the household. we therefore took the decision to model interactions using three social networks (household/occupation/random), with non-pharmaceutical interventions affecting each network differently. recurring small-world networks were used to model interactions at home and at work, whereas a transient random network was used to model other daily interactions, such as on public transport or in shops. the strong association of covid-19 disease progression with age, along with the age assortativity of social networks, led us to use a ten-year age structure.
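assuming the networkx library, the two kinds of network described above can be sketched as a recurring watts-strogatz small-world graph plus a transient random graph redrawn each day:

```python
import networkx as nx

N = 1_000   # toy population (the paper simulates 1 million)

# recurring small-world network for regular contacts (home/work-style interactions)
occupation = nx.watts_strogatz_graph(N, k=10, p=0.1, seed=5)

# transient random network, redrawn every day, for shops / public-transport contacts
def daily_random_network(n: int, mean_contacts: int, seed: int) -> nx.Graph:
    return nx.gnm_random_graph(n, m=n * mean_contacts // 2, seed=seed)

day1 = daily_random_network(N, 4, seed=1)
day2 = daily_random_network(N, 4, seed=2)   # different edges from day1: transient contacts
```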
the model simulated an urban population of 1 million rather than the population of a whole country, to allow realistic estimates for hospitalisation and icu admission forecasts at a regional level. large national epidemics will also exhibit meta-population dynamics rather than the spatially unstructured mixing modelled here. one of the key aims of openabm-covid19 was to model non-pharmaceutical interventions and, in particular, different forms of contact tracing. the model of digital contact tracing allows questions such as the role of testing delays, different quarantine requests, compliance rates, recursive testing and app uptake to be investigated. the model of manual contact tracing allows questions such as resource limitations, partial contact recall and interview delays to be investigated. importantly, due to the simple python interface, it is possible for non-experts to simulate all these features and to investigate the effect of applying multiple intervention policies at different stages of the epidemic. the current version of the model does not include events in hospitals, care-home settings, non-hospital deaths, gender/sex of individuals, comorbidities, or any geographical structure apart from that implicit within the three modelled networks. all of these limitations are currently being addressed by collaborators and will become available on the github repository in the near future. openabm-covid19 is a versatile tool to model the covid-19 epidemic in different settings and simulate different non-pharmaceutical interventions, including contact tracing. openabm-covid19 is a modular tool that will help scientists and policymakers weigh decisions during this epidemic. our vision is that, with the help of the world-wide modelling community, it will develop into a family of models for infectious diseases that are at risk of causing pandemics in the future, adding to the international toolkit for epidemic preparedness.
coronavirus disease weekly epidemiology report
a literature review of the economics of covid-19
mathematical models to guide pandemic response
epidemiology, transmission dynamics and control of sars: the 2002--2003 epidemic
special report: the simulations driving the world's response to covid-19
modelling transmission and control of the covid-19 pandemic in australia
quantifying sars-cov-2 transmission suggests epidemic control with digital contact tracing
effectiveness of isolation, testing, contact tracing, and physical distancing on reducing transmission of sars-cov-2 in different settings: a mathematical modelling study
effective configurations of a digital contact tracing app: a report to nhsx
covasim: an agent-based model of covid-19 dynamics and interventions. medrxiv
social contacts and mixing patterns relevant to the spread of infectious diseases
collective dynamics of "small-world" networks
epidemiological parameters of coronavirus disease 2019: a pooled analysis of publicly reported individual data of 1155 cases from seven countries
estimating the generation interval for coronavirus disease (covid-19) based on symptom onset data
changes in contact patterns shape the dynamics of the covid-19 outbreak in china
characteristics of household transmission of covid-19
modes of contact and risk of transmission in covid-19 among close contacts
household secondary attack rate of covid-19 and associated determinants in guangzhou, china: a retrospective cohort study
household transmission of covid-19
sars-cov-2 transmission in different settings: analysis of cases and close contacts from the tablighi cluster in brunei darussalam
covid-19 in 391 cases and 1286 of their close contacts in shenzhen, china: a retrospective cohort study
characteristics of pediatric sars-cov-2 infection and potential evidence for persistent fecal viral shedding
key: cord-319885-8qyavs7m authors: chan, stephen; chu, jeffrey; zhang, yuanyuan; nadarajah, saralees title: count regression models for covid-19 date: 2021-02-01 journal: physica a doi: 10.1016/j.physa.2020.125460 sha: doc_id: 319885 cord_uid: 8qyavs7m at the end of 2019, the novel coronavirus emerged as a severe acute respiratory disease that has since become a worldwide pandemic. future generations will look back on this difficult period and see how our society as a whole united and rose to this challenge. many reports have suggested that this new virus is becoming comparable to the spanish flu pandemic of 1918. we provide a statistical study on the modelling and analysis of the daily incidence of covid-19 in eighteen countries around the world.
in particular, we investigate whether it is possible to fit count regression models to the number of daily new cases of covid-19 in various countries and make short-term predictions of these numbers. the results suggest that the biggest advantage of these methods is that they are simple and straightforward, allowing us to obtain preliminary results and an overall picture of the trends in the daily confirmed cases of covid-19 around the world. the best-fitting count regression model for the number of new daily covid-19 cases in all countries analysed was shown to be a negative binomial distribution with a log link function. whilst the results cannot solely be used to determine and influence policy decisions, they provide an alternative to more specialised epidemiological models and can help to support or contradict results obtained from other analyses. the novel coronavirus disease (covid-19), first identified in wuhan, the capital of hubei, china, in october 2019, is a severe acute respiratory disease that has now become a worldwide pandemic. fig. 1 shows the extreme extent to which this pandemic has spread across the world, with the total number of confirmed global cases exceeding 700,000 as of 30th march 2020 and still increasing exponentially. the total number of cases worldwide has already surpassed the number due to severe acute respiratory syndrome (sars) in the early 2000s. many reports have suggested that this new virus is becoming comparable to the spanish flu pandemic of 1918. the most common symptoms of covid-19 are almost identical to those of the flu, e.g. high fever, fatigue, cough and shortness of breath. individuals have been required to self-isolate if they believe that they are exhibiting these symptoms. the most severe symptoms have been linked to pneumonia, multi-organ failure, and death.
other symptoms of covid-19 include the loss of the sense of smell (anosmia), and in some cases individuals may display no symptoms at all but still carry the virus. the global effects of covid-19 have led many countries, for example china, the uk, italy, spain and france, to lock down their international borders, cities and towns for extended periods. hence, many are fearful that another global recession is on the horizon. in contrast, after a two-month national lockdown, china has shown the world that a strict lockdown can contribute to a reduction in the number of new cases and deaths from covid-19, with the number of new cases recently decreasing to zero. the current literature relating to covid-19 is limited, and the majority of the known work focuses exclusively on china, where the first major outbreaks occurred. the existing research has focused on topics such as determining the populations most at risk, the factors increasing the risk of infection, the medical characteristics of those who become infected, the factors that can improve clinical outcomes and reduce the spread of the virus, the biological properties of the virus, and many others; see for example [1-8]. more specifically, the literature relating to covid-19 analysis outside of china has been limited. since these countries lag behind china in terms of the overall spread of the disease, much of the literature has focused on modelling and predicting the disease in the early stages of the outbreak, particularly the daily incidence (number of new confirmed cases per day) and the basic reproductive number: for example, in italy [9,10], in france [11], and in japan [12-14], to name but a few.
thus far, a wide range of statistical and predictive methods have been applied to the analysis of covid-19 in china, for example traditional epidemic models such as the sir model [15,16] and the basic reproductive number [17]; neural networks [18]; regression models [19,20]; experimental frameworks [21]; and correlation analysis [22]. from the literature, it is evident that the majority of the analyses of covid-19 are limited to china and a small number of countries in asia and europe, arguably the countries that first identified known cases of covid-19. however, we should note that since this is an ongoing situation, new research is being published daily and the literature is being updated continuously. hence, our main motivation is to provide a statistical analysis of the modelling of the number of confirmed cases of covid-19 in eighteen countries around the world. the main contributions of this paper are: (i) to provide a statistical analysis of covid-19 worldwide; (ii) to investigate whether it is possible to utilise count regression models for fitting and predicting the number of daily confirmed cases of covid-19 globally. the contents of this paper are organised as follows. section 2 describes the data used in our analysis. section 3 details the methodology and models used. section 4 outlines and discusses the results. section 5 provides a conclusion and summary of our results. the data we analyse consist of the historical daily new cases of covid-19 confirmed in eighteen different countries worldwide (china, denmark, estonia, france, germany, italy, malaysia, philippines, qatar, south korea, sri lanka, sweden, taiwan, thailand, uae, uk, usa, vietnam), listed on the eu open data portal from 31st december 2019 to 25th march 2020. these countries were chosen because they were among the earliest countries to detect covid-19 infections.
the data were downloaded from the website of the european centre for disease prevention and control (ecdc), which sources its data from the who, and our analysis is limited to the data available at the time of writing. the eighteen countries were chosen based on their ranking in terms of the highest numbers of cases; thus we believe that the data obtained give a satisfactory representation of the main countries affected by the virus at these times. in epidemiology and the study of infectious diseases, count-based data related to incidence are commonplace. in particular, data such as the daily incidence (number of cases) of an infectious disease can be modelled and predicted using a wide variety of methods, including compartmental (or deterministic) models such as the sir and seir models, and stochastic models such as discrete-time and continuous-time markov chains and stochastic differential equations. in this study, we apply discrete-time count regression models with the aim of modelling and predicting the daily incidence of covid-19 across the world. such models are preferred because they provide an appropriate, rich, and flexible modelling environment for non-negative integers. in addition, the models are robust for estimating constant relative policy effects; when applied to policy evaluations, such models can move beyond the consideration of mean effects and determine the effect on the entire distribution of outcomes instead [23]. poisson count regression models are part of the family of generalised linear models that are commonly used in epidemiological studies [24]. the poisson and negative binomial regression models are widely used for modelling discrete count data where the count takes a non-negative integer value with no upper limit and the data are highly skewed. negative binomial regression has the added advantage of being able to deal with the problem of overdispersion [25].
the four models below are due to christou and fokianos [26,27], fokianos and fried [28], fokianos et al. [29] and fokianos and tjostheim [30]. the models due to christou and fokianos [26] are based on the negative binomial distribution; the models due to the others are based on the poisson distribution. both the poisson and negative binomial distributions are commonly used when dealing with count data and observations occurring at a specific rate. let z_t denote the number of newly confirmed cases in a country on day t, t = 1, ..., T; in other words, z_t is the change in the cumulative confirmed cases from day t − 1 to day t. for each of the eighteen countries selected, the following four regression models were fitted to the corresponding daily incidence data: z_t | f_{t−1} ~ poisson(λ_t) with identity link λ_t = α + β t; z_t | f_{t−1} ~ poisson(λ_t) with log link log λ_t = α + β t; z_t | f_{t−1} ~ negative binomial(λ_t, φ) with identity link λ_t = α + β t; and z_t | f_{t−1} ~ negative binomial(λ_t, φ) with log link log λ_t = α + β t, where f_{t−1} denotes the history up to day t − 1, α represents the intercept parameter, and β is the slope parameter. each of the four models was fitted by the method of maximum likelihood, that is, by maximising the corresponding poisson or negative binomial log-likelihood with respect to α, β and, for the negative binomial models, the dispersion parameter φ. we denote the maximum likelihood estimates by α̂, β̂ and φ̂. for a more in-depth discussion of the four regression models we refer readers to the literature cited above. the models were fitted using the command tsglm in the r package tscount [31]. for each fitted model, we computed the akaike information criterion, aic = −2 log L̂ + 2k, and the bayesian information criterion, bic = −2 log L̂ + k log T, where L̂ is the maximised likelihood and k is the number of estimated parameters, together with p-values obtained by re-sampling. the values are given in table 1. according to the aic and bic values, the best of the four models is the negative binomial model with a logarithmic link function. table 2 gives the estimates of the intercept and slope parameters, along with their corresponding standard errors, for this model; also given in table 2 are the p-values quantifying the significance of the slope parameter.
in line with standard significance levels, if the p-value is less than 0.05 then the slope estimate is deemed to be significant. we applied the models specified in section 3 to our data on the number of new daily cases of individuals infected with covid-19 in eighteen countries worldwide. according to table 2, the majority of p-values corresponding to the best fitting model (the negative binomial model with a logarithmic link function) for each country's data are smaller than 0.05, indicating significance of the slope coefficient estimates at the 5 percent level. a notable exception is china, whose p-value is substantially greater than 0.05. this result is perhaps not surprising, as china was the first country to be majorly affected by covid-19: by the time most other countries started to see significant increases in new cases, its numbers had already peaked and new cases in china were being confirmed at a slower rate. among the countries where the model appears to show a reasonable fit, the slope estimate was positive in all cases, indicating that the expected number of new cases confirmed each day increases with time. in particular, the uk and vietnam have the largest and smallest slope estimates, respectively; hence the rate of increase in new daily covid-19 cases with time is highest for the uk and lowest for vietnam. the predicted median at time t, say m(t), was computed as the solution of F(m(t)) = 0.5, and the predicted 95 percent confidence interval at time t, say [l(t), u(t)], was computed as the solutions of F(l(t)) = 0.025 and F(u(t)) = 0.975, where F denotes the cumulative distribution function of the fitted model at time t. the actual number of new cases falls within the 95 percent confidence intervals for each of the eighteen countries (for 7 days, 10 days and 15 days), suggesting that the fitted model is robust in spite of being simple. for some countries, such as denmark, malaysia, and the philippines, the actual and predicted values are reasonably close.
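the median and 95 percent interval described above can be read off the fitted negative binomial cumulative distribution function; a hedged sketch using scipy, with an illustrative mean and dispersion rather than fitted values:

```python
from scipy.stats import nbinom

# NB2 parameterisation: variance = mu + alpha * mu^2, so size n = 1/alpha, p = n / (n + mu)
def nb_prediction_band(mu: float, alpha: float):
    """median m(t) and 95% band [l(t), u(t)] solving F(q) = 0.025, 0.5, 0.975."""
    n = 1.0 / alpha
    p = n / (n + mu)
    return (nbinom.ppf(0.025, n, p), nbinom.ppf(0.5, n, p), nbinom.ppf(0.975, n, p))

lo, med, hi = nb_prediction_band(mu=200.0, alpha=0.2)  # illustrative fitted mean/dispersion
```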
on the other hand, for many countries the predicted values overestimate the actual number of new cases (estonia, france, germany, italy, etc.). however, in a few instances, e.g. qatar and the united arab emirates, the actual number of new daily cases starts to outgrow the predicted values in the latter half of the 10 days (the same was observed for 7 days and 15 days). note that these countries do not appear to share a common connection. although the regression model accounts for the historical number of daily cases (and the average rate of new daily cases), a possible explanation for why it may under- or overestimate the true number of new daily cases is that it does not take into account many other factors that can influence the spread of infectious diseases, such as the behaviour of individuals (e.g. social, travel, etc.), government action, and economic policies. whilst this method has the advantages of being simple, straightforward and yet robust, the results should be interpreted with caution. they allow us to capture the general trend of the new daily cases in each country and generate some basic predictions in the short term. however, arguably, this approach misses key factors that are accounted for in other types of available models. therefore, it would not be wise to use the results presented here alone to make policy decisions; rather, these results should be used in conjunction with those from other analyses, which they can help to support or contradict. furthermore, we do not consider here the historical daily mortality due to covid-19, as there exist many dependent factors that should be considered when modelling these numbers. examples include available treatments, the susceptible population, hospital capacity, transmission rate, location and elevation risk, socio-economic factors and many more. such data can often be limited or hard to obtain due to restrictions such as data privacy or unreliable reporting.
for further information we refer readers to booth and tickle [32]. finally, we check the robustness of the (log, negative binomial) model. we fitted all four models ((identity, poisson), (log, poisson), (identity, negative binomial) and (log, negative binomial)) to the two halves of the data set. the first half was taken as the data from 31 december 2019 to 11 february 2020, and the second half as the data from 12 february 2020 to 25 march 2020. the values of aic and bic for the four models for each half are given in tables 3 and 4. we see that the (log, negative binomial) model gives the smallest values for each country and for each half. we have provided a statistical study on the modelling and analysis of the daily incidence of covid-19 in eighteen countries around the world. in particular, we have investigated whether it is possible to fit count regression models to the number of daily new cases of covid-19 in various countries and make short-term predictions of these numbers. the results suggest that the biggest advantage of these methods is that they are simple and straightforward, allowing us to obtain preliminary results and an overall picture of the trends in the daily confirmed cases of covid-19 in different countries. the best-fitting count regression model for the number of new daily covid-19 cases in all countries was shown to be a negative binomial distribution with a log link function. the best fitted model was robust in that the 95 percent confidence intervals for prediction contained the actual number of new cases for each country. however, the model was not able to predict the trends of new daily cases well for china. we believe this could be related to the fact that china was the first country to be significantly affected; by the time other countries started to be affected by covid-19, china had already reached its peak in confirmed cases and its confirmed cases had declined dramatically.
these results suggest that the model may be more useful for modelling the early stages of an outbreak, when the number of new cases is increasing; more specifically, a count regression model appears better suited for modelling new daily cases when the trend is increasing linearly, semi-exponentially, or exponentially. among the countries that fit this model well, the slope estimate was positive in all cases, indicating that the expected number of new cases being confirmed each day increases with respect to time. the uk and vietnam have the largest and smallest slope estimates, respectively; hence the rate of the daily increase in covid-19 cases is highest for the uk and lowest for vietnam. the model is beneficial for short term predictions, in order to see the short term trend and the rate of growth of new cases when no intervention measures are taken. in addition, the results could be useful in contributing to health policy decisions or government intervention, but more importantly, these results should be used in conjunction with the results from other mathematical models that are more specific to epidemiology. nevertheless, direct extensions to the current work could include modelling the daily mortality due to covid-19. such models could incorporate dependent factors that influence the mortality rate, such as available treatments, susceptible population, hospital capacity, transmission rate, location and elevation risk, socio-economic factors and many more. a further extension is to seek models that are theoretically motivated for covid data.

author contributions: stephen chan: introduction, motivation, data. jeffrey chu: analysis, discussion. yuanyuan zhang: analysis, discussion. saralees nadarajah: methods, fitting.

the authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
clinical course and risk factors for mortality of adult inpatients with covid-19 in wuhan, china: a retrospective cohort study
early prediction of disease progression in 2019 novel coronavirus pneumonia patients outside wuhan with ct and clinical characteristics
epidemiological and transmission patterns of pregnant women with 2019 coronavirus disease in china
estimation of the transmission risk of the 2019-ncov and its implication for public health interventions
modelling the epidemic trend of the 2019 novel coronavirus outbreak in china
the 2019 new coronavirus epidemic: evidence for virus evolution
incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: a statistical analysis of publicly available case data
the epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (covid-19)-china
modelling and predicting the spread of coronavirus (covid-19) infection in nuts-3 italian regions
covid-19 and italy: what next?
mechanistic-statistical sir modelling for early estimation of the actual number of cases and mortality rate from covid-19
prediction of the epidemic peak of coronavirus disease in japan
estimating the asymptomatic proportion of coronavirus disease 2019 (covid-19) cases on board the diamond princess cruise ship
estimation of the reproductive number of novel coronavirus (covid-19) and the probable outbreak size on the diamond princess cruise ship: a data-driven analysis
network-based prediction of the 2019-ncov epidemic outbreak in the chinese province hubei
epidemic analysis of covid-19 in china by dynamical modeling
the reproductive number of covid-19 is higher compared to sars coronavirus
finding an accurate early forecasting model from small dataset: a case of 2019-ncov novel coronavirus outbreak
prediction and analysis of coronavirus disease
prediction of the number of new cases of 2019 novel coronavirus (covid-19) using a social media search index
prediction of epidemic spread of the 2019 novel coronavirus driven by spring festival transportation in china: a population-based study
temporal relationship between outbound traffic from wuhan and the 2019 coronavirus disease (covid-19) incidence in china
evidence-based policy making in labor economics: the iza world of labor guide
poisson regression analysis in clinical research
comparison of statistical approaches to evaluate factors associated with metabolic syndrome
quasi-likelihood inference for negative binomial time series models
estimation and testing linearity for non-linear mixed poisson autoregressions
interventions in log-linear poisson autoregression
poisson autoregression
log-linear poisson autoregression
r development core team, r: a language and environment for statistical computing, r foundation for statistical computing
mortality modelling and forecasting: a review of methods

the authors would like to thank the editor and the three referees for careful reading and comments
which greatly improved the paper.

key: cord-310844-7i92mk4x
authors: hryhorowicz, magdalena; lipiński, daniel; hryhorowicz, szymon; nowak-terpiłowska, agnieszka; ryczek, natalia; zeyland, joanna
title: application of genetically engineered pigs in biomedical research
date: 2020-06-19
journal: genes (basel)
doi: 10.3390/genes11060670
sha:
doc_id: 310844
cord_uid: 310844

progress in genetic engineering over the past few decades has made it possible to develop methods that have led to the production of transgenic animals. the development of transgenesis has created new directions in research and possibilities for its practical application. generating transgenic animal species is not only aimed towards accelerating traditional breeding programs and improving animal health and the quality of animal products for consumption but can also be used in biomedicine. animal studies are conducted to develop models used in research on gene function and regulation and on the genetic determinants of certain human diseases. another direction of research, described in this review, focuses on the use of transgenic animals as a source of high-quality biopharmaceuticals, such as recombinant proteins. a further aspect discussed is the use of genetically modified animals as a source of cells, tissues, and organs for transplantation into human recipients, i.e., xenotransplantation. numerous studies have shown that the pig (sus scrofa domestica) is the most suitable species both as a research model for human diseases and as an optimal organ donor for xenotransplantation. a short pregnancy, short generation interval, and high litter size make the production of transgenic pigs less time-consuming in comparison with other livestock species. this review describes genetically modified pigs used for biomedical research and the future challenges and perspectives for the use of swine animal models.
pigs have been extensively used in biomedical research due to anatomical and physiological similarities to humans. moreover, progress in gene editing platforms and construct delivery methods allows efficient, targeted modifications of the porcine genome and has significantly broadened the application of pig models in biopharming and biomedicine. targeted editing is possible through site-specific nucleases, of which the following are most commonly used: zinc finger nucleases (zfns), transcription activator-like effector nucleases (talens), and nucleases from the crispr/cas (clustered regularly interspaced short palindromic repeats/crispr associated) system. introducing modifications at a specific site of the genome is possible due to the cellular processes that repair the double-strand breaks induced by site-specific nucleases. double-strand breaks may be repaired in two ways: by non-homologous end joining (nhej) or by homologous recombination (hr). repair provided by nhej may lead to the formation of indel (insertion/deletion) mutations at the break site.

table 1. summary of the most important advantages and disadvantages of the methods for obtaining genetically modified animals.
advantages: increased efficiency of the transgene integration; precise transformation and selection of modified cells used in cloning; the number of damaged zygotes does not exceed 10%; the obtained animals do not exhibit mosaicism; the modification is also present in germ cells (transgenic offspring).
disadvantages: low process efficiency (2%-3% in pigs); very low efficiency; the possibility of random integration of the transgene; early fetal mortality; high process invasiveness; the possibility of genetic defects.

genetically modified animals as research models for human diseases are a very important tool in searching for and developing new methods of therapy.
a suitable model organism should be characterized by rapid growth, a high number of offspring, easy and inexpensive breeding, ease of manipulation, and a sequenced genome. initially, only rodents were used as models in biomedical research. experiments on mice contributed to understanding the genetic background of numerous diseases. however, not every genetic disease induced in mice has the same clinical manifestation as in humans. furthermore, the short life span, together with a higher metabolic rate, makes the analysis of some hereditary diseases challenging. currently, pigs are one of the most important large animal models for biomedical research. many human diseases, such as cardiovascular diseases, obesity, and diabetes, have their counterparts in this species. the use of model organisms makes it possible to analyze diseases that occur naturally in animals with specific mutations as well as those deliberately induced. introduced changes in the animal genome may reflect mutations occurring in people suffering from specific genetic disorders. moreover, accurate and efficient genome editing can be used in the treatment of monogenic diseases. a slightly different direction of research is the use of genetically modified animal models in toxicological studies for testing drugs. genetically modified pigs are used as model organisms in research into various diseases, including cardiovascular and neurodegenerative diseases, neoplasms, and diabetes. cystic fibrosis (cf) is an autosomal recessive disorder manifested by bronchopulmonary failure and pancreatic enzyme insufficiency. cf is caused by a mutation in the gene responsible for the synthesis of the cftr (cystic fibrosis transmembrane conductance regulator) chloride channel, altering the mucosal function in the respiratory epithelium, pancreatic ducts, intestines, and sweat glands. the most common cystic fibrosis-causing mutation is the deletion of a phenylalanine at amino acid position 508 (δf508).
cf lung disease is the main cause of morbidity and mortality in cf patients. porcine lungs share many anatomical and histological similarities with human lungs. it has been shown that pigs in which the cftr gene was inactivated develop all symptoms of the disease occurring in humans, such as meconium ileus, defective chloride transport, pancreatic destruction, and focal biliary cirrhosis. this makes them a very good model species for this disease [14] [15] [16]. cystic fibrosis is a monogenic disease, and the insertion of the functional cftr gene into cf patient cells should theoretically restore the cftr channel function. therefore, pigs have also been used in gene therapy studies. treatment with viral vectors successfully improved anion transport and inhibited bacterial growth [17, 18]. currently, research focuses on improving cf gene therapy with the use of the crispr/cas9 system. these efforts concentrate on increasing the delivery efficiency of crispr/cas9 elements to the target locus and obtaining sustained expression of the cftr transgene [19, 20]. it was demonstrated that precise integration of the human cftr gene at a porcine safe harbor locus through crispr/cas9-induced hdr-mediated knock-in allowed persistent in vitro expression of the transgene in transduced cells. these results can help design effective gene therapy to treat cf patients [20]. duchenne muscular dystrophy (dmd) is a progressive, monogenic, x-linked lethal disease characterized by degenerative changes in muscle fibers and the connective tissue. it involves the degeneration of successive muscle groups (skeletal, respiratory, and cardiac) and progressive muscular dystrophy. muscular dystrophy is caused by a frameshift mutation in the dmd gene, which encodes dystrophin, a protein in muscle cells that connects the cytoskeleton with the cell membrane. the dystrophin gene contains 79 exons, with exons 3-7 and 45-55 being the most susceptible to mutations.
dmd gene mutations are usually large deletions or duplications of one or several exons, as well as point mutations, leading to a change in the reading frame, the appearance of a premature stop codon, and failure to produce a stable protein. muscular dystrophy is most often diagnosed in early childhood, and patients become wheelchair dependent by 12 years of age. untreated boys die of cardiorespiratory complications around the age of 20. the rapid progress in gene editing gives hope for effective targeted therapies for dmd. moreover, the use of an animal model can facilitate the development of personalized treatment approaches. pigs with a dmd gene mutation (exon 52 deletion) develop human disease symptoms, such as lack of dystrophin in skeletal muscles, increased serum creatine kinase levels, progressive muscle dystrophy, and impaired mobility [21]. however, these animals died prematurely (at most 3 months old), which precluded natural breeding. the histological evaluation of skeletal muscles and diaphragm confirmed the presence of excessive fiber size variation, hypercontracted fibers, and segmentally necrotic fibers, resembling that of human dmd patients [21]. moreover, a proteome analysis of the biceps femoris muscle was performed. an increased amount of muscle repair-related proteins and a reduced amount of respiratory chain proteins were found in tissue from 3-month-old dmd pigs. this indicated severe disturbances in aerobic energy production and a decrease in functional muscle tissue [22]. as the deletion of exon 52 in the human dmd gene is a common cause of duchenne muscular dystrophy, pigs can make an accurate research model for gene therapy. another porcine dmd model is genetically modified miniature pigs with a mutation in exon 27 of the dmd gene obtained by the crispr/cas9 system. in addition, these animals have shown symptoms of skeletal and heart muscle degeneration, characteristic of human patients with duchenne muscular dystrophy.
reduced thickness of smooth muscle in the stomach and intestine was also observed in the pigs studied. however, the founder pigs died of unreported causes [23]. although mutations in exon 27 are not reported in human dmd patients, pigs with this deletion constitute another useful animal dmd model. recently, moretti et al. demonstrated the restoration of dystrophin by intramuscular injection of crispr/cas9 components with the use of adeno-associated viral vectors in a pig model. in this study, pigs with dmd carrying a deletion of dmd exon 52 (d52dmd), resulting in a complete loss of dystrophin expression, were used. the restoration of dystrophin expression was possible due to the excision of exon 51 and the restoration of the dmd reading frame. the internally truncated d51-52 dmd protein sufficed to improve skeletal muscle function, prevent malignant arrhythmias, and prolong the lifespan of dmd pigs [24]. in the future, this strategy may prove useful in the clinical treatment of patients with d52dmd. alzheimer's disease (ad) is an age-related, progressive neurodegenerative disorder characterized by memory dysfunction followed by cognitive decline and disorientation. ad accounts for 50%-80% of human dementia cases. familial forms of ad are caused by autosomal mutations in the genes encoding presenilin 1 (psen1), presenilin 2 (psen2), and amyloid precursor protein (app). these mutations are associated with the accumulation of amyloid β (aβ) peptide in senile plaques and phosphorylated tau protein in neurofibrillary tangles (nfts), which leads to synaptic damage and neuronal dysfunction [25]. the first ad model with the use of transgenic pigs was generated in 2009 by kragh et al. they produced göttingen minipigs carrying a randomly integrated construct containing the cdna of the human app gene with a dominant ad-causing mutation known as the swedish mutation (appsw) and a human pdgfβ promoter fragment [26].
although the transgene was specifically expressed in brain tissue at a high level, no ad phenotype was observed in the mutant pigs. the same group also obtained göttingen minipigs with the human psen1 gene carrying the ad-causing met146ile mutation (psen1 m146i), driven by a cytomegalovirus (cmv)-enhanced human ubic promoter. the pigs were generated with the use of a site-specific integration system, recombinase-mediated cassette exchange (rmce) [27]. the psen1 m146i protein was expressed and well tolerated in the porcine brain, but in this case also, no symptoms of ad were observed. therefore, this group generated double transgenic göttingen minipigs with both the appsw and psen1 m146i mutations. this combination allowed an increase in the intraneuronal accumulation of aβ [28]. in turn, another group obtained ad transgenic pigs using a retroviral multi-cistronic vector containing three ad-related human genes: app, tau, and psen1, with a total of six well-characterized mutations under the control of a fusion promoter: the cmve + hpdgfβ promoter region. they confirmed that the transgenes were expressed at high levels in brain tissue and demonstrated a two-fold increase in aβ levels in the brains of transgenic pigs compared to wild-type animals [29]. cancer is a genetic disease involving uncontrolled, abnormal cell growth in the blood or solid organs resulting from acquired or inherited mutations. pigs represent a useful animal for the development and validation of new medicines and procedures in human tumor models. there are many resemblances in cancer biology between pigs and humans. these animals can correctly mimic human tumors and show pharmacokinetic responses similar to those of humans. adam et al. demonstrated that autologous transplantation of primary porcine cells transformed with retroviral oncogenic vectors caused tumorigenesis akin to that found in humans [30]. in turn, schook et al.
induced tumor formation in pigs by introducing random transgenes encoding cre-dependent kras (kirsten rat sarcoma viral oncogene homolog) g12d and tp53 (tumor protein 53) r167h oncogenic mutations (orthologous to human tp53 r175h) [31]. moreover, saalfrank et al. reported that porcine mesenchymal stem cells (mscs) resemble human mscs in requiring disturbance of the p53, kras, and myc signaling pathways to acquire a fully transformed phenotype [32]. at present, pig models commonly used in cancer research include the tp53 knock-out model of osteosarcoma and the apc (adenomatous polyposis coli) mutation model of familial adenomatous polyposis (fap). tp53 is a known tumor suppressor gene, and a germline mutation within this gene leads to li-fraumeni syndrome, a rare, autosomal dominant disorder that predisposes carriers to cancer development. the first model of li-fraumeni syndrome using genetically modified pigs was described by leuchs et al., who generated pigs carrying a latent tp53 r167h mutation that can be activated by the cre-lox recombinase system [33]. after several years of observation, it was noted that both pigs with homozygous tp53 knock-out and pigs with heterozygous tp53 knock-out showed osteosarcoma development. the heterozygous knock-out caused the development of spontaneous osteosarcoma in older animals, while the homozygous tp53 knock-out resulted in multiple large osteosarcomas in 7- to 8-month-old pigs [32]. moreover, sieren and colleagues generated genetically modified yucatan minipigs carrying the tp53 r167h mutation. animals heterozygous for this mutant allele showed no tumorigenesis, whereas homozygotes that reached sexual maturity developed lymphomas, osteogenic tumors, and renal tumors at varying rates. the tumor formations were validated by computed tomography, histopathological evaluation, and magnetic resonance imaging [34].
familial adenomatous polyposis is an inherited disorder characterized by the development of numerous adenomatous polyps in the colon and rectum, which greatly increases the risk of colorectal cancer. mutations in the apc tumor-suppressor gene are responsible for fap and may result in a hereditary predisposition to colorectal cancer. flisikowska et al. generated gene-targeted cloned pigs with translational stop signals at codon 1311 in porcine apc (apc 1311), orthologous to the common germline apc 1309 mutation in human fap. evaluation of one-year-old pigs carrying the apc 1311 mutation showed aberrant crypt foci and adenomatous polyps with low- to high-grade intraepithelial dysplasia, with tumor progression similar to that in human fap [35]. the apc 1311 pig model, resulting in the development of polyposis in the colon and rectum, can be useful in the diagnosis and therapy of colorectal cancer. cardiovascular diseases (cvds) are the major cause of morbidity and mortality worldwide. cvd is a group of disorders of the heart and blood vessels that involves coronary heart disease (such as angina and myocardial infarction), deep vein thrombosis and pulmonary embolism, peripheral arterial disease, cerebrovascular disease, and rheumatic heart disease. the dominant cause of cvd is atherosclerosis, which is characterized by the narrowing of arteries due to lipid accumulation and plaque formation. the plaque buildup restricts blood flow, and plaque rupture can trigger blood clots. similarities in heart anatomy and physiology, vessel size, blood parameters, coronary artery system anatomy, and lipoprotein metabolism make pigs a suitable model for the human cardiovascular system. atherosclerosis starts with the buildup of serum low-density lipoprotein (ldl), and mutations in the ldl receptor (ldlr) gene may cause familial hypercholesterolemia (fh).
a porcine fh model has been generated in yucatan miniature pigs through recombinant adeno-associated virus-mediated targeted disruption of the endogenous ldlr gene. ldlr+/− heterozygous pigs exhibited mild hypercholesterolemia, while homozygous ldlr−/− animals were born with severe hypercholesterolemia and developed atherosclerotic lesions in the coronary arteries. these phenotypes were accelerated by high-fat, high-cholesterol diets [36]. the utility of ldlr-deficient yucatan minipigs in the preclinical evaluation of therapeutics was also demonstrated. ldlr+/− and ldlr−/− pigs were used to assess the ability of a novel drug, bempedoic acid (bema), to reduce cholesterol biosynthesis. long-term treatment with bema decreased ldl cholesterol and attenuated aortic and coronary atherosclerosis in this fh model [37]. moreover, a model of fh and atherosclerosis was created by using yucatan miniature pigs with liver-specific expression of a human proprotein convertase subtilisin/kexin type 9 (pcsk9) carrying the gain-of-function mutation d374y. pcsk9 plays important functions in cholesterol homeostasis by reducing ldlr levels on the plasma membrane. gain-of-function mutations in this protein cause increased levels of plasma ldl cholesterol, which in turn may result in greater susceptibility to coronary heart disease. pcsk9 d374y transgenic pigs exhibited decreased hepatic ldlr levels, severe hypercholesterolemia on high-fat, high-cholesterol diets, and atherosclerotic lesions [38]. it is also considered that hypertriglyceridemia is an independent risk factor for coronary heart disease, in which apolipoprotein (apo)ciii is associated with plasma triglyceride levels. a hypertriglyceridemic apociii transgenic miniature pig model was generated for the examination of the correlation between hyperlipidemia and atherosclerosis.
transgenic pigs expressing human apociii exhibited increased plasma triglyceride levels with delayed clearance and reduced lipoprotein lipase activity compared to non-transgenic controls [39]. diabetes mellitus (dm) is a group of metabolic disorders characterized by hyperglycemia (elevated levels of blood sugar over a prolonged period), which results from deficiency or ineffectiveness of insulin. over time, dm may lead to cardiovascular disease, chronic kidney disease, and damage to the nerves and eyes. there are two main types of diabetes mellitus, called type 1 and type 2. type 1 dm, also referred to as juvenile diabetes or insulin-dependent diabetes mellitus, is caused by the pancreas's failure to produce enough insulin. the most common is type 2 diabetes, which is characterized by insulin resistance (reduced tissue sensitivity to insulin) that may be combined with relative insulin deficiency. the anatomical and physiological resemblance to the human pancreas and islets makes pigs excellent animals for modeling metabolic diseases. moreover, the structure of porcine insulin is very similar to that of human insulin (it differs by only one amino acid). a transgenic pig model for type 2 dm was generated to evaluate the role of impaired glucose-dependent insulinotropic polypeptide (gip). the main function of the incretin hormones gip and glucagon-like peptide-1 (glp1) is to stimulate insulin secretion from pancreatic beta cells in a glucose-dependent manner. in type 2 dm, the insulinotropic action of gip is impaired, which may suggest its association with early disease pathogenesis. transgenic pigs expressing a dominant negative gip receptor (giprdn) in pancreatic cells were produced using lentiviral vectors. a significant reduction in oral glucose tolerance due to delayed insulin secretion, as well as in β-cell mass caused by diminished cell proliferation, was observed in giprdn animals [40].
these observations resemble the characteristic features of human type 2 diabetes, which makes the porcine giprdn model useful for testing incretin-based therapeutic strategies. further analyses revealed characteristic changes in the concentrations of seven amino acids (phe, orn, val, xleu, his, arg, and tyr) and specific lipids (sphingomyelins, diacylglycerols, and ether phospholipids) in the plasma of 5-month-old giprdn transgenic pigs that correlate significantly with β-cell mass [41]. these metabolites represent possible biomarkers for the early stages of prediabetes. moreover, the porcine giprdn model has been used to test liraglutide, a glp1 receptor agonist that improves glycemic control in type 2 diabetic patients. ninety-day liraglutide treatment of adolescent transgenic pigs resulted in improved glycemic control and insulin sensitivity as well as a reduction in body weight gain and food intake compared to placebo-treated animals. however, the use of liraglutide did not stimulate beta-cell proliferation in the endocrine pancreas [42]. another form of diabetes is maturity-onset diabetes of the young type 3 (mody3). mody3 is a noninsulin-dependent type of diabetes with an autosomal dominant inheritance and is caused by mutations in the human hepatocyte nuclear factor 1α (hnf1α) gene. a mutation in the hnf1α gene leads to pancreatic β-cell dysfunction and impaired insulin secretion. a pig model for mody3 was generated by expressing a mutant human hnf1α gene (hnf1α p291fsinsc) using intracytoplasmic sperm injection-mediated gene transfer and somatic cell nuclear transfer. the transgenic piglets exhibited the pathophysiological characteristics of diabetes, including high glucose levels and reduced insulin secretion from the small and irregularly formed islets of langerhans [43]. furthermore, hnf1α p291fsinsc pigs revealed nodular lesions in the renal glomeruli, diabetic retinopathy, and cataract, complications similar to those in patients with dm [44].
mutations in the insulin (ins) gene may result in permanent neonatal diabetes mellitus (pndm) in humans. a pndm large animal model was established by generating pigs expressing a mutant porcine ins gene (ins c94y), orthologous to human ins c96y. the transgenic animals showed signs of pndm, such as lower fasting insulin levels, decreased β-cell mass, reduced body weight, and cataract development. in addition, ins c94y pigs exhibited significant β-cell impairment, including a reduction in insulin secretory granules and dilation of the endoplasmic reticulum [45]. the porcine ins c94y model was further used to analyze pathological changes in the retinas and to evaluate the liver of transgenic pigs. the studies revealed several features of diabetic retinopathy, such as intraretinal microvascular abnormalities and central retinal edema [46]. moreover, a multi-omics analysis of the liver demonstrated higher activities in amino acid metabolism, oxidation of fatty acids, gluconeogenesis, and ketogenesis, characteristic of insulin-deficient diabetes mellitus [47]. the genetically modified pig models for human diseases described in this review are summarized in table 2. human-derived proteins have long been used as therapeutics in the treatment of numerous diseases. however, their quantities are limited by the availability of human tissues. thanks to the development of biotechnology and genetic engineering, modified animals can be used as "bioreactors" to produce recombinant proteins for pharmaceutical use. by using adequate regulatory sequences and promoters, the expression of transgenes can be directed to selected cells and organs. the therapeutic proteins can be obtained from milk, blood, urine, seminal plasma, egg white, or salivary glands and can be collected, purified, and used at an industrial scale. moreover, it is possible to generate multi-transgenic animals that produce many biopharmaceuticals or vaccines in a single organism.
the use of an animal platform allows for the relatively low-cost production of pharmacologically valuable preparations in high quantity and quality. the mammary gland is considered to be an excellent bioreactor system for pharmaceutical protein production. the advantages of milk are that it can carry large amounts of foreign protein without affecting the animal's health during lactation, and that the product is easy to collect and purify. while cows are the best species for obtaining large amounts of pharmaceuticals in milk, the cost and time necessary to carry out successful transgenesis make rabbits, sheep, goats, and pigs more popular species. although the pig is not a typical dairy animal, a lactating sow can give about 300 l of milk per year. velander et al. generated transgenic pigs that synthesized human protein c in the mammary gland. protein c plays an important role in human blood clotting, which makes it a potentially attractive drug. the collected milk contained 1 g/l of this protein [48]. other recombinant human proteins involved in the coagulation process, such as factor viii [49], factor ix [50, 51], and von willebrand factor [52], were also successfully obtained in the porcine mammary gland. furthermore, a line of transgenic pigs producing functional recombinant human erythropoietin in their milk was demonstrated. erythropoietin regulates red blood cell production (erythropoiesis) in the bone marrow by binding to a specific membrane receptor and has been used in the treatment of anemia. this bioreactor system generates active recombinant human erythropoietin at concentrations of approximately 877.9 ± 92.8 iu/ml [53]. in turn, lu et al. generated transgenic cloned pigs expressing large quantities of recombinant human lysozyme in milk. lysozyme is a natural broad-spectrum antimicrobial enzyme that constitutes part of the innate immune system.
the authors demonstrated that the highest concentration of recombinant human lysozyme with in vitro bioactivity was 2759.6 ± 265.0 mg/l [54]. biopharmaceuticals can also be synthesized in pigs with the use of alternative systems, such as blood, urine, and semen. the blood of transgenic animals can be a source of human blood proteins, such as hemoglobin. swanson et al. and sharma et al. obtained transgenic pigs that produced recombinant human hemoglobin in their blood cells at a high level, with an oxygen-binding ability identical to that of human blood hemoglobin [55, 56]. there remains, however, the issue of obtaining large amounts of animal-generated therapeutics easily and inexpensively, without killing the animal. moreover, blood cannot store high levels of recombinant proteins, which are innately unstable, for a long time, and bioactive proteins in the blood may affect the metabolism of the animals [57]. for this reason, research is being conducted into the production of recombinant proteins secreted into the urine or semen. the advantage of semen is that it is easily obtained and produced in high amounts in species such as pigs (boars can produce 200-300 ml of semen 2-3 times a week), while the advantage of urine is that proteins can be obtained from animals of both sexes throughout their lives. in addition, urine contains few proteins, which facilitates the purification of the protein product, and urine-based systems pose a low risk to the animal's health. however, the limitation of protein production in the bladder is its low yield [58]. the recombinant pharmaceutical proteins produced from transgenic pigs are listed in table 3. genetically modified pigs can also be used as a source of cells, tissues, and organs for transplantation into human recipients. despite the growing knowledge and ability to perform transplants, the shortage of organs means that the number of patients awaiting a transplant is constantly increasing.
xenotransplantation is any procedure involving the transplantation, implantation, or infusion into a human recipient of cells, tissues, or organs from an animal donor, or of human body fluids, cells, tissues, or organs (or their fragments) that have had ex vivo contact with animal cells, tissues, or organs. organ xenotransplantation would give us an unlimited and predictable source of organs and enable careful planning of the surgery and preoperative drug treatment of the donor. the animal that best meets the criteria for xenotransplantation is the domestic pig (sus scrofa domestica). pig and human organs show great anatomical and physiological similarities. however, the significant phylogenetic distance results in serious immunological problems after transplantation. despite major difficulties, the pig is currently the focus of research aimed at eliminating the problem of organ shortage for human transplantation in the future. thus, the challenge now is to overcome the interspecies differences that cause xenograft rejection by the human immune system. the solution, therefore, is to modify pigs in such a way that their organs are not rejected as belonging to another species. advances in genetic engineering have brought scientists closer to obtaining modified animals that would be useful for pig-to-human transplants. a number of studies have reached the preclinical stage, using primates as model organisms. pig organs transplanted into human recipients are immediately rejected as a result of the so-called hyperacute immunological reaction. xenograft rejection is mainly caused by the gal antigen found on the donor's cell surface, which is synthesized by the ggta-1 enzyme. humans lack both the gal antigen and the ggta-1 enzyme, but have xenoreactive antibodies directed against the porcine gal antigen, which leads to activation of the complement cascade in the recipient.
this sequence of reactions results in the formation of the membrane attack complex (mac), lysis, and destruction of the graft cells. the best possible solution to the problem of hyperacute rejection is to inactivate the gene encoding the ggta-1 enzyme responsible for the formation of the gal antigen. in 2001, the first heterozygous ggta1 knock-out pigs were produced [59], and one year later, the first piglets with two knock-out alleles of the ggta1 gene were born [60]. a series of ggta1 knock-out pigs has also been generated by using zfns [61], talens [62], and the crispr/cas9 system [63]. moreover, other carbohydrate xenoantigens present on pig cells but absent in humans have been identified, including the neu5gc antigen (n-glycolylneuraminic acid), whose synthesis is catalyzed by cytidine monophosphate-n-acetylneuraminic acid hydroxylase (cmah), and the sda antigen produced by beta-1,4-n-acetyl-galactosaminyltransferase 2 (β4galnt2). pigs with ggta1/cmah/β4galnt2 triple gene knock-out were generated using the crispr/cas9 system. cells from these genetically modified animals exhibited a reduced level of human igm and igg binding, resulting in diminished porcine xenoantigenicity [64]. to prevent hyperacute rejection, it is possible to introduce human genes regulating the complement cascade into the porcine genome. as the complement system may undergo spontaneous autoactivation and attack the body's own cells, defense mechanisms have developed in the course of evolution. they regulate complement activity through a family of structurally and functionally similar proteins that block complement activation and prevent the formation of the mac. introduction of human genes encoding complement inhibitors, such as cd55 (daf, decay-accelerating factor), cd46 (mcp, membrane cofactor protein), and cd59 (membrane inhibitor of reactive lysis), into the porcine genome may overcome xenogeneic hyperacute organ rejection [65].
it was demonstrated that the expression of human complement-regulatory proteins can prevent complement-mediated xenograft injury and prolong the survival time of the xenotransplant [66][67][68]. studies have shown that combining ggta1 knock-out with additional human cd55, cd59, or cd46 expression yields greater survival rates than ggta1 knock-out alone [69,70]. many genetically modified pigs with human complement inhibitors and other modifications important for xenotransplantation were also generated [71][72][73]. the modifications of the porcine genome described above largely resolved the problem of hyperacute rejection. however, the xenogeneic transplant then becomes subject to less acute rejection mechanisms resulting from coagulation dysregulation, natural killer (nk) cell-mediated cytotoxicity, macrophage-mediated cytotoxicity, and the t-cell response. the coagulative disorders result from incompatibilities between pig anticoagulants and human coagulation factors. overcoming coagulation dysregulation in xenotransplantation will require the introduction of human genes encoding coagulation-regulatory proteins into the porcine genome, for example, thrombomodulin (tbm), endothelial cell protein c receptor (epcr), tissue factor pathway inhibitor (tfpi), and ectonucleoside triphosphate diphosphohydrolase 1 (cd39). thrombomodulin binds thrombin and functions as a cofactor for the activation of protein c, which is strongly anticoagulative. porcine tbm binds human thrombin less strongly and cannot effectively activate protein c. it was demonstrated that expressing human tbm (htbm) in porcine aortic endothelial cells (paecs) suppresses prothrombinase activity and delays clotting time [71]. the endothelial protein c receptor enhances the activation of protein c and decreases proinflammatory cytokine synthesis. in vitro studies revealed a correlation between human epcr (hepcr) expression in paecs and reduced human platelet aggregation [74].
a meta-analysis of multiple genetic modifications on pig lung xenotransplants showed that hepcr was one of the modifications with a positive effect on xenograft survival prolongation in the ex vivo organ perfusion model with human blood [75]. a further study demonstrated that kidneys from genetically engineered pigs (carrying six modifications) functioned in baboons for 237 and 260 days. the authors suggested that the prolonged survival time was associated, among other factors, with the expression of the human epcr gene [76]. tissue factor pathway inhibitor is the primary physiological regulator of the early stage of coagulation. tfpi binds to factor xa, and the xa/tfpi complex then inhibits the procoagulant activity of the tissue factor (tf)/factor viia complex. it was demonstrated that the expression of human tfpi in paecs can inhibit tf activity, suggesting potential for controlling the tf-dependent pathway of blood coagulation in xenotransplantation [77]. more recently, multi-modified pigs carrying a human tfpi transgene were produced [76,78]. cd39 is an ectoenzyme that plays a key role in reducing platelet activation. cd39 converts adenosine triphosphate (atp) and adenosine diphosphate (adp) to adenosine monophosphate (amp), which in turn is further degraded by ecto-5′-nucleotidase (cd73) to antithrombotic adenosine. transgenic pigs carrying the human cd39 (hcd39) gene were generated. the study showed that hcd39 expression protects against myocardial injury in a model of acute myocardial ischemia-reperfusion injury [79]. another approach to xenograft protection may be introducing human genes that protect against the inflammatory response into the porcine genome. transgenic pigs expressing antiapoptotic and anti-inflammatory proteins, such as human heme oxygenase-1 (ho-1) and human tumor necrosis factor-alpha-induced protein 3 (a20), were produced [80,81].
porcine aortic endothelial cells derived from pigs carrying the human a20 transgene were protected against tnf-α (tumor necrosis factor alpha)-mediated apoptosis and were less susceptible to cell death induced by cd95 (fas) ligands [81]. similarly, overexpression of human ho-1 prolonged porcine kidney survival in an ex vivo perfusion model with human blood and protected paecs from tnf-α-mediated apoptosis [80]. furthermore, pigs with combined expression of human a20 and ho-1 on a ggta1 knock-out background were generated. this transgenic approach alleviated rejection and ischemia-reperfusion damage during ex vivo kidney perfusion [82]. the cellular immune response is another barrier to xenotransplantation. human nk cells can activate the endothelium and lyse porcine cells through direct nk cytotoxicity and by antibody-dependent cellular mechanisms. direct nk cytotoxicity is regulated by activating and inhibitory receptor-ligand interactions. to prevent nk-mediated lysis through the inhibitory cd94/nkg2a receptor, pigs expressing human leukocyte antigen-e (hla-e) were obtained [83,84]. the study showed that the expression of hla-e in endothelial cells from transgenic pigs markedly reduces xenogeneic human nk responses. in addition, it was demonstrated that the introduction of the hla-e gene into the porcine genome may also protect pig cells from macrophage-mediated cytotoxicity [85]. more recently, ggta1 knock-out pigs with hcd46 and hla-e/human β2-microglobulin transgenes were produced. the study showed that these multiple genetically modified porcine hearts were protected from complement activation and myocardial natural killer cell infiltration in an ex vivo perfusion model with human blood [86]. another approach to inhibiting direct xenogeneic nk cytotoxicity is the elimination of porcine ul-16-binding protein 1 (ulbp1), which binds to the activating nk receptor nkg2d.
crispr technology was adapted to create genetically modified pigs with a disrupted ul16-binding protein 1 gene. in vitro studies confirmed that porcine aortic endothelial cells derived from ulbp1 knock-out pigs were less susceptible to the cytotoxic effects of nk cells [87]. macrophages also play an important role in xenograft rejection and can be activated by direct interactions between receptors present on their surface and donor endothelial antigens as well as by xenoreactive t lymphocytes. the binding of the cd47 antigen to macrophage signal regulatory protein alpha (sirp-α) delivers a signal that prevents phagocytosis. however, the interaction between porcine cd47 and human sirp-α does not produce this inhibitory effect on macrophages [88]. therefore, the introduction of human cd47 (hcd47) into the porcine genome can overcome macrophage-mediated responses in xenotransplantation. the overexpression of hcd47 in porcine endothelial cells suppressed the phagocytic and cytotoxic activity of macrophages, decreased inflammatory cytokine (tnf-α, il-6, il-1β) secretion, and inhibited the infiltration of human t cells [89]. pigs with ggta1 knock-out and hcd47 expression were obtained [90]. it was demonstrated that the expression of human cd47 markedly prolonged the survival of donor porcine skin xenografts on baboons in the absence of immunosuppression [91]. another challenge in xenotransplantation is the prevention of t cell-mediated rejection. t cells can be induced directly by swine leukocyte antigen (sla) class i and class ii on porcine antigen-presenting cells (apcs) or by swine donor peptides presented on recipient apcs. the main co-stimulatory signals regulating t cell function include the cd40-cd154 and cd28-cd80/86 pathways. cytotoxic t-lymphocyte antigen 4-immunoglobulin (ctla4-ig) can inhibit the cd28-cd80/86 co-stimulatory pathway. therefore, the introduction of human ctla4-ig (hctla4-ig) into the porcine genome may alleviate the t cell response in xenografts.
it was shown that neuronal expression of hctla4-ig in pigs reduced human t lymphocyte proliferation [92]. moreover, transgenic hctla4-ig protein in pigs extended the survival time of porcine skin grafts in a xenogeneic rat transplantation model [93]. another approach to inhibiting the t-cell immune response may be the deletion of swine leukocyte antigen class i. reyes et al. created sla class i knock-out pigs using grna and the cas9 endonuclease. the resulting animals showed decreased levels of cd4−cd8+ t cells in peripheral blood [94]. recently, pigs carrying functional knock-outs of ggta1, cmah, b4galnt2, and sla class i on a multi-transgenic background (hcd46, hcd55, hcd59, hho1, ha20) were produced. an in vitro study showed that the quadruple knock-out reduced the binding of human igg and igm to porcine kidney cells [95]. beyond the immune barriers in xenotransplantation, there is also concern about the risk of cross-species pathogen transmission. the main problem is posed by porcine endogenous retroviruses (pervs), which are integrated into multiple locations in the pig genome. the crispr/cas9 technology gives great hope for the complete elimination of the risk of perv transmission. niu et al., using the crispr/cas9 system, inactivated all 25 copies of functional pervs in a porcine primary cell line and successfully generated healthy perv-inactivated pigs via somatic cell nuclear transfer. what is more, no reinfection was observed in the obtained pigs [96]. advances in genetic engineering and immunosuppressive therapies prolong organ survival time in preclinical pig-to-non-human primate (nhp) xenotransplantation models. the first xenotransplantation using pig hearts with the eliminated gal antigen into immunosuppressed baboons was performed in 2005. the longest surviving heterotopic graft functioned in the recipient for 179 days [97], in comparison to 4-6 hours of survival time with the use of wild-type pig hearts [98].
introducing additional modifications extended the xenograft survival time even more. the longest survival was obtained for heterotopic cardiac xenotransplantation: up to 945 days. the authors used hearts derived from genetically multi-modified pigs (ggta1 knock-out, hcd46, htbm) and chimeric 2c10r4 anti-cd40 antibody therapy [99]. additional expression of htbm in ggta1 knock-out, hcd46 genetically modified pigs prevented early dysregulation of coagulation and prolonged cardiac xenograft survival time [99,100]. using the same genetic background, orthotopic heart xenotransplantation was performed, resulting in a maximum survival of 195 days [101]. however, xenograft survival time depends on the type of transplanted organ. in the case of kidneys in pig-to-nhp transplantation models, the longest survival of a life-sustaining xenograft was 499 days. ggta1 knock-out pigs carrying the hcd55 gene, together with immunosuppression comprising transient pan-t cell depletion and an anti-cd154-based regimen, were used in the experiments. moreover, the selection of recipients with low-titer anti-pig antibodies improved the long-term survival of pig-to-rhesus macaque renal xenotransplants [102]. the success of porcine liver and lung xenotransplantation remains limited, which is mainly associated with the occurrence of coagulation disorders [103]. the longest survival time for orthotopic liver xenografts (29 days) was achieved using ggta1 knock-out pigs, exogenous human coagulation factors, and immunosuppression, including co-stimulation blockade [104]. in turn, watanabe et al. demonstrated a prolonged survival time of lung xenotransplants (14 days) from ggta1 knock-out, hcd47/hcd55 donor pigs in immunosuppressed baboons [105]. the authors indicated the important role of hcd47 expression in reducing immunologic damage and extending lung graft survival in the pig-to-nhp model.
however, additional genetic modifications of the porcine genome and refined immunosuppressive regimens are necessary for the clinical application of xenotransplantation. table 4 summarizes the most important genetic modifications of the porcine genome for xenotransplantation purposes. the anatomical and physiological similarity between pigs and humans makes this species very interesting for biomedical research [110]. the rapid development of genetic engineering in recent years has allowed for precise and efficient modification of the animal genome using site-specific nucleases. the nuclease-mediated editing of the porcine genome, as well as potential applications of genetically modified pigs in biomedicine, are shown in figure 1. certainly, the driving force for development is the human mind and the ideas that arise in it. one of the factors limiting the realization of these ideas is technical. the transfer of new technologies tested on smaller animal models is often limited and requires optimization for large animal models. in the case of crispr/cas9 technology, a lot of emphasis should be placed on reducing the risk of so-called off-target effects by improving this system. paired nicking has the potential to reduce off-target activity in mice by 50-1000-fold without compromising on-target performance [111]. another strategy to limit the number of undesirable off-target effects is to increase the specificity of the system. here, one can focus on engineering the cas9 protein or modifying the sgrna. the properties of the cas9 protein can be modified, or its lifespan can be changed [112,113]. for future clinical success, it is also important to improve the efficiency of hr-mediated gene correction, especially for treating diseases in which a template sequence is delivered to replace the mutated variant.
another important goal is to make hr applicable not only to dividing cells but also to cells in the post-mitotic stage. hopes are placed in the combination of the crispr/cas9 technique with aav (adeno-associated virus) as a donor template provider [114]. considering the low immunogenicity of aav, its ability to transduce a wide spectrum of cells in terms of both type and developmental stage, and its strongly limited packaging capacity, one should consider minimizing the crispr/cas9 system or using more than one virus in order to exploit the potential of both technologies simultaneously. the use of other delivery systems, e.g., nanoparticles, is also worth considering [115]. the different site-specific nucleases (zfn, talen, crispr/cas9) used for genome editing and the two techniques (somatic cell nuclear transfer and microinjection) used to produce genetically modified pigs are shown. biomedical applications for which genetically engineered pigs are generated include modeling human diseases, production of pharmaceutical proteins, and xenotransplantation. genetically modified pigs serve as an important large animal model for studying the genetic background of human diseases, testing novel drugs and therapy methods, and developing models for gene therapy [116][117][118]. pigs can be used as anatomical (e.g., endovascular), surgical, behavioral, and cytotoxic models. ideas for new large animal models are driven by current demand. a lot has been done (e.g., the pig model for influenza a infection), but there is still a need for pig models of other human viral diseases (hepatitis b; human immunodeficiency virus, hiv; severe acute respiratory syndrome coronavirus 2, sars-cov-2) [119]. hiv has been modeled in mice, and filoviruses (ebolavirus, marburgvirus) have been modeled in small animals (i.e., mice, hamsters), but we still need large animal models to investigate vaccines and antiviral drugs [120,121].
transgenic pigs can also be a promising source of recombinant proteins used as pharmacological preparations. actually, the possibility of using pigs for the production of biopharmaceuticals has been slowed in recent years. some studies demonstrated that the pig mammary gland can be used as a complex recombinant protein source with appropriate post-translational modifications [122] . despite the advantages of pig animal platform (natural secretion, correct posttranslational modifications, constant production), some ethical doubts are probably limiting the boost. finally, the use of genetically engineered pigs for xenotransplantation is becoming an increasingly feasible alternative to standard allogeneic transplants and a potential solution to the problem of organ shortage. the combination of various multi-modified pigs and immunosuppressive therapies is required for overcoming immune rejection and effective xenotransplantation of different solid organs [123] [124] [125] . when it comes to treating end-stage organ failure, biomedical research could go a step further and try to create chimeric genetically modified pigs that would be carriers of human organs [126] . the publication was co-financed within the framework of a ministry of science and higher education program as "regional initiative excellence" in the years 2019-2022, project no. 005/rid/2018/19. the authors declare no conflict of interest. genetically engineered pigs as models for human disease trends in recombinant protein use in animal production. microb. 
cell fact genetically modified pigs as organ donors for xenotransplantation genetic transformation of mouse embryos by microinjection of purified dna dramatic growth of mice that develop from eggs microinjected with metallothionein-growth hormone fusion genes pronuclear microinjection transgenic technology in farm animals-progress and perspectives increased efficiency of transgenic livestock production increased transgene integration efficiency upon microinjection of dna into both pronuclei of rabbit embryos a simple and highly efficient transgenesis method in mice with the tol2 transposon system and cytoplasmic microinjection efficient generation of gene-modified pigs via injection of zygote with cas9/sgrna sheep cloned by nuclear transfer from a cultured cell line a background to nuclear transfer and its applications in agriculture and human therapeutic medicine disruption of the cftr gene produces a model of cystic fibrosis in newborn pigs the deltaf508 mutation causes cftr misprocessing and cystic fibrosis-like disease in pigs sequential targeting of cftr by bac vectors generates a novel pig model of cystic fibrosis lentiviral-mediated phenotypic correction of cystic fibrosis pigs cftr gene transfer with aav improves early cystic fibrosis pig phenotypes efficient gene editing at major cftr mutation loci in vitro validation of a crispr-mediated cftr correction strategy for preclinical translation in pigs dystrophin-deficient pigs provide new insights into the hierarchy of physiological derangements of dystrophic muscle progressive muscle proteome changes in a clinically relevant pig model of duchenne muscular dystrophy porcine zygote injection with cas9/sgrna results in dmd-modified pig with muscle dystrophy somatic gene editing ameliorates skeletal and cardiac muscle failure in pig and human models of duchenne muscular dystrophy the role of cell-derived oligomers of abeta in alzheimer's disease and avenues for therapeutic intervention hemizygous minipigs 
produced by random gene insertion and handmade cloning express the alzheimer's disease-causing dominant mutation appsw generation of minipigs with targeted transgene insertion by recombinase-mediated cassette exchange (rmce) and somatic cell nuclear transfer (scnt) expression of the alzheimer's disease mutations aβpp695sw and psen1m146i in double-transgenic göttingen minipigs production of transgenic pig as an alzheimer's disease model using a multi-cistronic vector system genetic induction of tumorigenesis in swine a genetic porcine model of cancer a porcine model of osteosarcoma inactivation and inducible oncogenic mutation of p53 in gene targeted pigs development and translational imaging of a tp53 porcine tumorigenesis model a porcine model of familial adenomatous polyposis targeted disruption of ldlr causes hypercholesterolemia and atherosclerosis in yucatan miniature pigs bempedoic acid lowers low-density lipoprotein cholesterol and attenuates atherosclerosis in low-density lipoprotein receptor-deficient (ldlr+/− and ldlr−/−) yucatan miniature pigs familial hypercholesterolemia and atherosclerosis in cloned minipigs created by dna transposition of a human pcsk9 gain-of-function mutant characterization of a hypertriglyceridemic transgenic miniature pig model expressing human apolipoprotein ciii glucose intolerance and reduced proliferation of pancreatic beta-cells in transgenic pigs with impaired glucose-dependent insulinotropic polypeptide function changing metabolic signatures of amino acids and lipids during the prediabetic period in a pig model with impaired incretin function and reduced β-cell mass effects of the glucagon-like peptide-1 receptor agonist liraglutide in juvenile transgenic pigs modeling a pre-diabetic condition dominant-negative mutant hepatocyte nuclear factor 1α induces diabetes in transgenic-cloned pigs diabetic phenotype of transgenic pigs introduced by dominant-negative mutant hepatocyte nuclear factor 1α permanent neonatal diabetes in 
ins(c94y) transgenic pigs retinopathy with central oedema in an ins c94y transgenic pig model of long-term diabetes multi-omics insights into functional alterations of the liver in insulin-deficient diabetes mellitus high-level expression of a heterologous protein in the milk of transgenic swine using the cdna encoding human protein c transgenic pigs produce functional human factor viii in milk recombinant human factor ix produced from transgenic porcine milk engineering protein processing of the mammary gland to produce abundant hemophilia b therapy in milk production of recombinant human von willebrand factor in the milk of transgenic pigs recombinant human erythropoietin produced in milk of transgenic pigs production of transgenic-cloned pigs expressing large quantities of recombinant human lysozyme in milk production of functional human hemoglobin in transgenic swine an isologous porcine promoter permits high level expression of human hemoglobin in transgenic swine production of pharmaceutical proteins by transgenic animals expression systems and species used for transgenic animal bioreactors production of alpha-1,3-galactosyltransferase knockout pigs by nuclear transfer cloning production of alpha 1,3-galactosyltransferase-deficient pigs production of zfn-mediated ggta1 knock-out pigs by microinjection of gene constructs into pronuclei of zygotes production of α1,3-galactosyltransferase targeted pigs using transcription activator-like effector nuclease-mediated genome editing technology generation of ggta1 mutant pigs by direct pronuclear microinjection of crispr/cas9 plasmid vectors evaluation of human and non-human primate antibody binding to pig cells lacking ggta1/cmah/β4galnt2 genes expression of a functional human complement inhibitor in a transgenic pig as a model for the prevention of xenogeneic hyperacute organ rejection cardiac xenotransplantation: recent preclinical progress with 3-month median survival cytomegalovirus early promoter induced 
expression of hcd59 in porcine organs provides protection against hyperacute rejection long-term survival of nonhuman primates receiving life-supporting transgenic porcine kidney xenografts generation of gtko diannan miniature pig expressing human complementary regulator proteins hcd55 and hcd59 via t2a peptide-based bicistronic vectors and scnt b-cell depletion extends the survival of gtko.hcd46tg pig heart xenografts in baboons for up to 8 months potential value of human thrombomodulin and daf expression for coagulation control in pig-to-human xenotransplantation complement dependent early immunological responses during ex vivo xenoperfusion of hcd46/hla-e double transgenic pig forelimbs with human blood production of ulbp1-ko pigs with human cd55 expression using crispr technology regulation of human platelet aggregation by genetically modified pig endothelial cells and thrombin inhibition meta-analysis of the independent and cumulative effects of multiple genetic modifications on pig lung xenograft performance during ex vivo perfusion with human blood immunological and physiological observations in baboons with life-supporting genetically engineered pig kidney grafts atorvastatin or transgenic expression of tfpi inhibits coagulation initiated by anti-nongal igg binding to porcine aortic endothelial cells generation of α-1,3-galactosyltransferase knocked-out transgenic cloned pigs with knocked-in five human genes transgenic swine: expression of human cd39 protects against myocardial injury transgenic expression of human heme oxygenase-1 in pigs confers resistance against xenograft rejection during ex vivo perfusion of porcine kidneys transgenic expression of the human a20 gene in cloned pigs provides protection against apoptotic and inflammatory stimuli kidneys from α1,3-galactosyltransferase knockout/human heme oxygenase-1/human a20 transgenic pigs are protected from rejection during ex vivo perfusion with human blood hla-e/human beta2-microglobulin transgenic 
pigs: protection against xenogeneic human anti-pig natural killer cell cytotoxicity characterization of three generations of transgenic pigs expressing the hla-e gene the suppression of inflammatory macrophage-mediated cytotoxicity and proinflammatory cytokine production by transgenic expression of hla-e multiple genetically modified gtko/hcd46/hla-e/hβ2-mg porcine hearts are protected from complement activation and natural killer cell infiltration during ex vivo perfusion with human blood the production of ul16-binding protein 1 targeted pigs using crispr technology role for cd47-sirpalpha signaling in xenograft rejection by macrophages role of human cd200 overexpression in pig-to-human xenogeneic immune response compared with human cd47 overexpression transgenic expression of human cd47 markedly increases engraftment in a murine model of pig-to-human hematopoietic cell transplantation prolonged survival of pig skin on baboons after administration of pig cells expressing human cd47 transgenic expression of ctla4-ig by fetal pig neurons for xenotransplantation transgenic expression of human cytoxic t-lymphocyte associated antigen4-immunoglobulin (hctla4ig) by porcine skin for xenogeneic skin grafting creating class i mhc-null pigs using guide rna and the cas9 endonuclease viable pigs after simultaneous inactivation of porcine mhc class i and three xenoreactive antigen genes ggta1, cmah and b4galnt2 inactivation of porcine endogenous retrovirus in pigs using crispr-cas9 galactosyltransferase gene-knockout pig heart transplantation in baboons with survival approaching 6 months intravenous infusion of galα-3gal oligosaccharides in baboons delays hyperacute rejection of porcine heart xenografts chimeric 2c10r4 anti-cd40 antibody therapy is critical for long-term survival of gtko.hcd46.htbm pig-to-primate cardiac xenograft cardiac xenografts show reduced survival in the absence of transgenic human thrombomodulin expression in donor pigs consistent success in 
life-supporting porcine cardiac xenotransplantation long-term survival of pig-to-rhesus macaque renal xenografts is dependent on cd4 t cell depletion overcoming coagulation dysregulation in pig solid organ transplantation in nonhuman primates: recent progress prolonged survival following pig-to-primate liver xenotransplantation utilizing exogenous coagulation factors and costimulation blockade intra-bone bone marrow transplantation from hcd47 transgenic pigs to baboons prolongs chimerism to >60 days and promotes increased porcine lung transplant survival production of biallelic cmp-neu5ac hydroxylase knock-out pigs the generation of transgenic pigs as potential organ donors for humans a human cd46 transgenic pig model system for the study of discordant xenotransplantation pigs transgenic for human thrombomodulin have elevated production of activated protein c transgenic pigs as models for translational biomedical research double nicking by rna-guided crispr cas9 for enhanced genome editing specificity high-fidelity crispr-cas9 nucleases with no detectable genome-wide off-target effects enhanced homology-directed human genome engineering by controlled timing of crispr/cas9 delivery virus-mediated genome editing via homology-directed repair in mitotic and postmitotic cells in mammalian brain improved delivery of crispr/cas9 system using magnetic nanoparticles into porcine fibroblast current progress of genetically engineered pig models for biomedical research genome editing of pigs for agriculture and biomedicine genetically engineered pigs to study cancer the microminipig as an animal model for influenza a virus infection in vitro and in vivo models of hiv latency animal models for filovirus infections production of recombinant human coagulation factor ix by transgenic pig the role of genetically engineered pigs in xenotransplantation research genetically modified pigs as donors of cells, tissues, and organs for xenotransplantation xenotransplantation: current 
key: cord-340244-qjf23a7e authors: bernstein, daniel j. title: further analysis of the impact of distancing upon the covid-19 pandemic date: 2020-04-16 journal: nan doi: 10.1101/2020.04.14.20048025 sha: doc_id: 340244 cord_uid: qjf23a7e

this paper questions various claims from the paper "social distancing strategies for curbing the covid-19 epidemic" by kissler, tedijanto, lipsitch, and grad: most importantly, the claim that china's "intense" distancing measures achieved only a 60% reduction in r0. the 22 march 2020 paper "social distancing strategies for curbing the covid-19 epidemic" [5] reports calculations in a model where distancing reduces r0 by at most 60%, and claims that 60% is "on par with the reduction in r0 achieved in china through intense social distancing measures (3)". reference "(3)" is a 9 march 2020 paper [1] that does not say what [5] claims it says. what [1] actually says about the effect of china's distancing upon r0 is the following: with an early epidemic value of r0 of 2.5, social distancing would have to reduce transmission by about 60% or less, if the intrinsic transmission potential declines in the warm summer months in the northern hemisphere. this reduction is a big ask, but it did happen in china. in other words, if r0 in china was originally 2.5 without distancing, and if china's interventions had less than a 60% effect, then the new r0 would have been larger than 1, so the epidemic would have continued to spread exponentially in china; but the news reports say that the epidemic was stopped in china, so r0 there must have been reduced by at least 60%. how, then, does [5] conclude that "intense" distancing reduces r0 by at most 60%?
there do not seem to be any further comments on this topic in [5]; also, [1] does not seem to present any justification for its claim that reducing r0 by 60% is a "big ask". is it plausible that an "intense" lockdown, presumably a drastic reduction in the amount of contact between typical people, would reduce transmission by only 60%? the actual level of reduction directly affects the main quantitative conclusions of [5]. the main objective of this paper is to see what the distancing model in [5] says regarding the initial covid-19 outbreak and subsequent lockdown in china. the software used in [5] does not seem to be available; this paper is accompanied by new public-domain software intended to implement the same model. the new computations conclude that, with the minimum r0 allowed in the model of [5], china would have a 43-day period of hospitalizations being within 50% of peak, and a 53-day period of critical-care cases being within 50% of peak. for comparison, china's reports of "severe cases" between 24 january and 28 march show only a 26-day period of "severe cases" being within 50% of peak. this comparison is consistent with the theory that "intense" distancing cut china's rate of new infections much more sharply than assumed in [5]. there was no way for [5] to rule out this theory, so [5] should have considered much smaller r0 values. this paper also covers various other problems with [5]. the theory stated in the previous paragraph should be treated with caution, for several reasons. the hope of strong effects from interventions creates a bias towards believing that such effects exist. overly optimistic covid-19 models have slowed down interventions (see, e.g., [2]), and newer models suggest that this slowness will turn out to have caused many unnecessary deaths; further policy decisions based on other unproven theories can be similarly disastrous.
there are many other possible explanations for the relatively short period of "severe cases" in china: for example, perhaps increased contact tracing and wearing of masks helped reduce transmission; perhaps changes in treatment or in reporting policies reduced the number of reported "severe" cases. furthermore, the fragments of data considered here cannot reliably distinguish between, e.g., the theory that "intense" distancing reduces r0 by 90% and the theory that "intense" distancing reduces r0 by 99%. policymakers trying to find the best combinations of non-pharmaceutical interventions to keep r0 low until a vaccine is available (for example, keeping it safely below 1, the "dance" explained in [7]) want to know much more regarding the impact of various types of distancing, the impact of encouraging widespread wearing of masks, etc. this paper should not be viewed as answering these questions; it merely disputes the overconfident answer given in [5]. the only policy recommendation in this paper is as follows. governments should systematically allocate 5% of their daily covid-19 testing capacity to testing people chosen at random from census rolls, with whatever incentives are necessary to create compliance (e.g., in the united states, a $100 reward to each person tested in this way). the resulting data regarding infection rates would help rule out incorrect theories regarding the prevalence of covid-19, and would provide essential feedback regarding the impact of interventions. no claims of novelty are made regarding this recommendation. this section reviews [5]'s model of covid-19 spread. 2.1. introduction to sir models. an "sir model" works as follows. there are n people in the population. this population is partitioned into three subpopulations ("compartments"): • some people are "susceptible": people who have never been exposed to the disease. the fraction of people who are susceptible is called s, so in total there are sn susceptible people.
• some people are "infectious" people who have been exposed to the disease and could be spreading it. the fraction of people who are infectious is called i, so in total there are in infectious people. • some people are "recovered" people who have been exposed to the disease and are no longer spreading it. the fraction of people who are recovered is called r, so in total there are rn recovered people. the fractions s, i, r have s + i + r = 1. these fractions change over time. note that the literature often relabels sn, in, rn as s, i, r, which also changes the equations shown below. each infectious person is assumed to take, on average, 1/γ days to recover, where γ is a parameter in the model. one can imagine several different ways to define the exact timing of recovery: • each infectious person recovers after exactly 1/γ days. (there is no variance in the recovery time.) • each infectious person has probability γ of recovering each day. (there is much more variance in the recovery time.) • each infectious person has probability γ/24 of recovering each hour. • et cetera. an sir model uses the limit of these possibilities, continuously compounding the probability of recovery. mathematically, if there are no new infections, then this model says that the derivative i (t) is −γi(t) where t is time measured in days, so i(t) = i(0) exp(−γt). meanwhile each infectious person is assumed to transmit the disease to an average of β people each day, spread uniformly at random through the entire population. this transmission affects only susceptible people: each infectious person infects, on average, βs people each day. the in infectious people are assumed to infect separate people, a total of βisn people on average each day, increasing i at a rate βis. this again is modeled as a continuous process: diagrams for discrete automata should observe that the notation here does not include self-loops, the default possibility of staying in the current compartment. 
without doing the work of analyzing solutions to these equations, one can guess that each infectious person will transmit the disease to an average of β/γ people, and thus infect an average of (β/γ)s people, during the 1/γ days of an average infection. the ratio β/γ is called r0. this back-of-the-envelope calculation suggests that an epidemic with r0 < 1 will remain under control, while an epidemic with r0 > 1 will explode exponentially until s drops to 1/r0 ("herd immunity"). one can see the same effects in the model: any nonzero rate of infectious people will increase (i.e., i'(t) > 0) as long as βs(t) > γ, i.e., r0 s(t) > 1. once s(t) drops to 1/r0, the quantity i(t) stops increasing. the remaining infectious people continue infecting more, pushing s(t) somewhat below 1/r0, but the recovery rate begins to reduce i(t). real diseases seem to be less infectious at first but become more infectious after an incubation period; an sir model does not account for this. an "seir model" tries to address this by adding an extra compartment for "exposed" people who are infected but not yet infectious: s' = -βis, e' = βis - νe, i' = νe - γi, r' = γi. these equations are abbreviated s --βi--> e --ν--> i --γ--> r. the equations say that an exposed person takes, on average, 1/ν days to become infectious. note that this delay slows down the progress of the epidemic, but also slows down the effect of interventions that are based only on observations of i and not on observations of e. real diseases also have different effects on different people, and, presumably, different levels of transmission, but sir models and seir models do not account for this. the paper [5] tries to address this for covid-19 by modeling three different levels of disease severity: • mild infections ("i_r") progress to recovery ("r_r"). • moderate infections ("i_h") progress to hospitalization ("h_h"), and then progress to recovery ("r_h").
• critical infections ("i_c") progress to hospitalization ("h_c"), and then to critical care ("c_c"), and then to recovery ("r_c"). the model thus has 11 compartments (including s and e). this makes the model sound more complicated than it actually is: one can compute the same results with just 7 compartments, namely s, e, i = i_r + i_h + i_c, h_h, h_c, c, and r. more broadly, when someone objects that an sir/seir model is oversimplified, the standard response is to add more transitions to the model. for example: • perhaps observations show that immunity lapses. the standard response is to add an r → s transition. • perhaps the observed variance of progression times for a disease is smaller than predicted in an seir model. the standard response is to add more stages of progression, such as s → e_1 → e_2 → i_1 → i_2 → r, narrowing the variance while preserving the average. • perhaps the period of infectiousness begins before the period of symptoms, and people with symptoms take more precautions to reduce transmission. the standard response, assuming, e.g., an s → e_1 → e_2 → i_1 → i_2 → r model, is to model a larger transmission speed from i_1 than from i_2. the terminology used for such a model distinguishes the "latent period", meaning the non-infectious time spent in e, from the "incubation period", meaning the pre-symptomatic time spent in e and i_1. • part of the rationale for health-care workers to wear protective equipment is that hospitalized and critical-care patients, and the health-care workers themselves, are believed to be sources of infection. this is not accounted for in the model of [5].
the standard response would be to add further compartments to track infections of health-care workers. • people are spread around the globe. infections are sometimes carried by travelers or possibly by packages, but presumably infections are much more likely to spread locally. the standard response is to have local compartments, perhaps with different transmission rates that try to account for local factors such as population density and frequency of mask use. all of these models depend heavily upon parameters such as β and γ. small mistakes in estimating those parameters can easily produce vastly larger errors in predictions. this difficulty is exacerbated by the practice of adding more and more compartments to models, along with more and more dimensions in the parameter space, since the problem of computing parameters accurately from output data typically becomes exponentially more difficult as the dimension grows. furthermore, modeling a discrete probabilistic process as a continuous deterministic process becomes increasingly difficult to justify as the number of compartments increases. in the model of [5], the speed (per day) of progression from being infectious (i → r or i → h_h or i → h_c, depending on the disease severity) is 1/5 ("γ"); the speed of progression from hospitalization to recovery (h_h → r) for moderate cases is 1/8 ("δ_h"); the speed of progression from hospitalization to critical care (h_c → c) is 1/6 ("δ_c"); and the speed of progression from critical care to recovery (c → r) is 1/10 ("ξ_c"). the speed of progression from being exposed to being infectious is not stated in [5] but appears to be modeled as 1/5. in this model, a critical infection takes an average of 5 + 5 + 6 = 16 days before critical care, and an average of 16 + 10 = 26 days before the end of critical care.
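the 16-day and 26-day figures follow directly from the stage rates, since the average residence time in a compartment is the reciprocal of its exit rate; a one-line check in python:

```python
# average days from exposure to entering critical care, for the rates quoted
# above: e (1/nu = 5 days) + i (1/gamma = 5 days) + h_c (1/delta_c = 6 days);
# then critical care itself lasts 1/xi_c = 10 days on average.
nu, gamma, delta_c, xi_c = 1 / 5, 1 / 5, 1 / 6, 1 / 10
days_to_critical_care = 1 / nu + 1 / gamma + 1 / delta_c
days_to_end_of_critical_care = days_to_critical_care + 1 / xi_c
```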
after a rapid burst of new infections, an intervention that drastically reduces new infections would not produce a peak of critical-care patients in this model until about three weeks later. the model of [5] assumes that 95.6% of infections are mild while the other 4.4% require hospitalization. the procedures that have produced these numbers are not robust, as emphasized in [3], but this paper's policy recommendation (see section 1) would rapidly produce robust numbers. this is not an endorsement of the controversial idea from [3] of delaying serious interventions until robust numbers are available. within hospitalizations, the model of [5] assumes that 30% (i.e., 1.32% of total infections) require critical care. there is one more parameter in the model of [5], the most important parameter: namely, r0, which in turn determines β = γr0. this is modeled in five stages: • the wintertime r0 is modeled as either 2 or 2.5. beware that the literature includes a wider range of estimates for r0, and underestimating r0 is dangerous. • the summertime r0 is modeled as either 70% (more optimistic) or 100% (less optimistic) of the wintertime r0. • the current-time r0 is modeled as a cosine curve from the wintertime r0 down to the summertime r0 and back up. the maximum is assumed to occur 3.8 weeks (26.6 days) before the end of the year. • the distancing r0 takes 0%, 20%, 40%, or 60% away from the current-time r0, reflecting four different hypotheses regarding the effectiveness of distancing. • the final r0 is modeled as either the current-time r0 or the distancing r0, depending on whether distancing is "on" or "off".
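a sketch of this five-stage schedule in python. the cosine and the 52-week period are as described above; the phase convention is my reading of that description, and the default values are just one of the allowed combinations:

```python
import math

# final r0 as described above: a cosine between the summertime and wintertime
# values, peaking 26.6 days (3.8 weeks) before the end of a 52-week (364-day)
# year, with a fixed fraction removed whenever distancing is "on".
def final_r0(day, winter_r0=2.0, summer_frac=0.7,
             distancing_on=False, distancing_cut=0.6, period=52 * 7):
    summer_r0 = summer_frac * winter_r0
    mid = (winter_r0 + summer_r0) / 2
    amp = (winter_r0 - summer_r0) / 2
    current = mid + amp * math.cos(2 * math.pi * (day - (period - 26.6)) / period)
    return current * (1 - distancing_cut) if distancing_on else current

peak_day = 52 * 7 - 26.6
r_winter = final_r0(peak_day)                          # wintertime maximum
r_summer = final_r0(peak_day - 182)                    # half a period later
r_distanced = final_r0(peak_day, distancing_on=True)   # 60% removed
```

with the defaults, r_winter is 2.0, r_summer is 70% of that (1.4), and turning distancing "on" at the winter peak leaves 40% of 2.0, i.e. 0.8.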
the paper analyzes the consequences in this model of a few different strategies for deciding when to turn distancing on or off. to summarize, the differential equations for the model of [5] are as follows, where r0 is the final r0 defined above, β = γr0, and (p_r, p_h, p_c) = (0.956, 0.0308, 0.0132) are the severity fractions: s' = -βis; e' = βis - νe; i' = νe - γi; h_h' = p_h γi - δ_h h_h; h_c' = p_c γi - δ_c h_c; c' = δ_c h_c - ξ_c c; r' = p_r γi + δ_h h_h + ξ_c c. as mentioned earlier, this is presented with 11 compartments in [5], but these 7 compartments produce the same results in a simpler way. further issues with the model. according to wikipedia, more than 10000 people in italy have been reported dead with covid-19 at the time of this writing, including more than 600 per day for each of the past 9 days. it is plausible (although not proven) that almost all of those deaths were not merely with covid-19 but caused by covid-19; and that if covid-19 is not brought under control then it will cause millions of deaths worldwide within a year. perhaps many of these deaths can be avoided through, e.g., widespread mask usage, combined with several periods of distancing over the next year, followed by widespread vaccination. however, this cannot even be expressed, let alone analyzed, in the model of [5]: • the model includes on/off distancing, but does not include masks or any other non-pharmaceutical interventions. • the model does not include the possibility of future vaccination (s → r). • the model does not include deaths. dead people are treated as "recovered": they have been exposed to the disease and are no longer spreading it. this simplification does not change the analysis of the spread of the epidemic, but the terminology is ethically questionable and presumably reduces the amount of attention given to one of the most important variables.
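returning to the 7-compartment summary above, the system can be integrated directly. the following sketch uses the per-day rates quoted in the text and the severity split 95.6% / 3.08% / 1.32%; holding β constant (no seasonality, no distancing) is my simplification for illustration:

```python
# forward-euler integration of the 7-compartment reduction discussed above.
# beta is held constant here (no seasonality, no distancing), which is a
# simplification for this sketch.
nu, gamma = 1 / 5, 1 / 5
delta_h, delta_c, xi_c = 1 / 8, 1 / 6, 1 / 10
p_r, p_h, p_c = 0.956, 0.0308, 0.0132
r0 = 2.0
beta = gamma * r0

state = {'s': 1 - 1e-4, 'e': 1e-4, 'i': 0.0,
         'hh': 0.0, 'hc': 0.0, 'c': 0.0, 'r': 0.0}
dt = 1 / 72                      # the daydelta used later in the text
peak_i = peak_c = 0.0
peak_i_day = peak_c_day = 0.0
for step in range(400 * 72):     # 400 days
    s, e, i, hh, hc, c, r = (state[k] for k in ('s', 'e', 'i', 'hh', 'hc', 'c', 'r'))
    state['s'] += -beta * i * s * dt
    state['e'] += (beta * i * s - nu * e) * dt
    state['i'] += (nu * e - gamma * i) * dt
    state['hh'] += (p_h * gamma * i - delta_h * hh) * dt
    state['hc'] += (p_c * gamma * i - delta_c * hc) * dt
    state['c'] += (delta_c * hc - xi_c * c) * dt
    state['r'] += (p_r * gamma * i + delta_h * hh + xi_c * c) * dt
    day = step * dt
    if state['i'] > peak_i:
        peak_i, peak_i_day = state['i'], day
    if state['c'] > peak_c:
        peak_c, peak_c_day = state['c'], day
```

in this run the critical-care peak arrives well after the infectiousness peak, consistent with the pipeline delays discussed earlier; the compartment fractions always sum to 1.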
all of the parameters and distancing strategies considered in [5] result in mass covid-19 infection within a few years, but a closer look shows that some scenarios have far fewer infections than others by mid-2021. in an extended model that accounts for deaths and the possibility of widespread vaccination, these scenarios would have far fewer deaths than other scenarios. there could be an even larger reduction in covid-19 infections and deaths if non-pharmaceutical interventions are more effective than assumed in [5]. this again highlights the importance of understanding the actual impact of distancing, masks, etc. this section reviews, and disputes some of, (1) the calculations that [5] carries out within its model, and (2) the conclusions that [5] claims on the basis of these calculations. 3.1. software engineering. https://cr.yp.to/2020/gigo-20200329.py, an attachment to this paper, is a python script that produces (as pdf files) all of the graphs shown here. as noted in section 1, the software used in [5] does not seem to be available. it is easy to find software online for other sir/seir/... models, including software that appears reasonably easy to adapt to the model of [5], but all of this software seems to depend on ode-solving packages (e.g., scipy.integrate.odeint), and it is not clear how much review those packages have had from a safety-engineering perspective. this paper's python script begins with a table of transitions, for example expressing e --1/5--> i as follows: ('e','i',1/5.0). transition speeds can be functions: for example, the transition ('s','e',paperinitialpulse) uses the following function: def paperinitialpulse(day,distancing): return 0.01/7 if day < startday+3.5 else 0. this appears to match the description in [5] of how the initial infection is modeled: "infection is introduced through a half-week pulse in the force of infection". the height of this pulse is not stated in [5], but an earlier paper suggests 0.01 per week. the most complicated transition expresses the distancing trigger; it matches the main distancing strategy highlighted in [5], where distancing is "turned 'on' when the prevalence of infection rose above . . . 37.5 cases per 10,000 people", and off when the prevalence drops below 10.0 cases per 10,000 people. ("infection" here appears to refer specifically to the i compartment, so it should have said "infectiousness".) if these calculations were performed in exact arithmetic (or interval arithmetic in higher and higher precision), rather than floating-point arithmetic, then the outputs over any compact interval would converge to the exact solution of the equations as daydelta converges down to 0. this does not imply that the outputs of the python script are exact: the script uses floating-point arithmetic and a specific nonzero daydelta. more precisely, the script takes daydelta as 1/72 days (20 minutes), but decreases daydelta towards 0 when i approaches the 37.5/10000 or 10.0/10000 cutoffs. some spot-checks suggest that smaller choices of daydelta produce visually identical graphs, but this is very far from a thorough error analysis.
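the table-driven solver described above is easy to sketch. the following is a simplified reimplementation of the idea (a plain fixed-step euler loop over a transition table), not the script itself; the tuple format and the 1/72-day step mirror the description above, while the seir table and parameter values below are illustrative:

```python
# a table-driven fixed-step (euler) integrator in the spirit described above:
# each transition is (source, destination, rate), where rate is either a
# constant per-day speed or a function of the current state.
def step(state, transitions, dt):
    flows = {}
    for src, dst, rate in transitions:
        r = rate(state) if callable(rate) else rate
        flows[(src, dst)] = r * state[src] * dt
    for (src, dst), f in flows.items():
        state[src] -= f
        state[dst] += f

# illustrative seir table: s -> e at rate beta*i, e -> i at 1/5, i -> r at 1/5.
beta = 0.5
transitions = [
    ('s', 'e', lambda st: beta * st['i']),
    ('e', 'i', 1 / 5.0),
    ('i', 'r', 1 / 5.0),
]
state = {'s': 0.999, 'e': 0.0, 'i': 0.001, 'r': 0.0}
for _ in range(72 * 100):        # 100 days at daydelta = 1/72
    step(state, transitions, 1 / 72)
```

computing all flows from the current state before applying any of them keeps the update synchronous, and every flow leaves one compartment and enters another, so the fractions sum to 1 throughout.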
differential equations are normally solved by more complicated algorithms that are designed to obtain better tradeoffs between equation-solving time and accuracy. however, simpler algorithms are easier to review. the first 98 lines of this python script include the transition tables, underlying functions, differential-equation solver, and recording of the history of output data, using no libraries other than π and cos. the script then has more lines, which also need review, for turning the data into graphs using matplotlib, a relatively complicated library. the paper [5] claims, within its model, that the (37.5, 10.0) distancing strategy explained above achieves the "goal of keeping the number of critical care patients below 0.89 per 10,000 adults" under the following assumptions: wintertime r0 = 2, and distancing achieves a 60% reduction in r0. this conclusion is stated for both the "seasonal and non-seasonal cases", i.e., for summertime r0 being either 70% or 100% of the wintertime r0. the conclusion is backed by a non-seasonal graph [5, figure 3(a)] and by a seasonal graph [5, figure 3(b)]. the seasonal graph comes close to exceeding 0.89 (in july 2021), but does not exceed it. the (37.5, 10.0) choice is close to, but not at, the edge of safe possibilities shown in [5, figure s3]. the assumption that r0 = 2 is important: other graphs [5, figures s6(a), s6(b)] say that this distancing strategy does not achieve the goal if r0 = 2.5. the orange curve shows 10000h = 10000h_h + 10000h_c (which is not plotted in [5]). as expected, each peak of infectiousness (black) triggers a later peak of hospitalizations (orange), followed by a peak of critical-care patients (red). these graphs go several months beyond the graphs in [5], so it is unsurprising that they show an extra period of distancing. the second and third plots are two different recalculations of the seasonal [5, figure 3(b)]. the second plot seems to match [5, figure 3(b)].
the third plot looks the same at the beginning, but looks different in 2022, and in particular crosses the 0.89 line. the only difference between the models used in the second and third plots is as follows: • one of the plots uses a 52-week seasonality period as specified in [5]. • the other plot is a calculation in a model that was modified through the introduction of a typographical error, replacing 52 with 55. this illustrates how safety conclusions can be undermined by errors hidden inside epidemic-modeling software. this illustration is not hypothetical: the second plot, the one that seems to match [5, figure 3(b)], is the one that includes the typographical error. if the third plot here is correct then the (37.5, 10.0) safety claim in [5] is wrong. occam's razor suggests that the safety claim in [5], and the underlying calculations in [5], arise from the typographical error mentioned above. in all of these models, the distancing trigger creates an immediate reduction in e, a less sharp reduction in i within a week, and a reduction in c in under a month. for comparison, peaks in i that occur without intervention (because 1 - s has increased enough compared to r0) continue to trigger new infections for some time, and relatively large peaks in c.
this is why the mid-2022 red-versus-black gap in the third plot is relatively large compared to the other red-versus-black gaps: a peak in i occurs just below the trigger for intervention. it should be clear that this variation can occur within this model, even under the assumptions that (1) there are mistakes in the third plot and (2) a corrected version of the same plot would not show this variation. one can see this variation in gaps in [5, figure s6(d)], but these phenomena, and their safety consequences, do not seem to have been noted in [5]. the error analyses in [5] were too limited to catch this error, and there is no evidence that the software was subjected to any double-checks or other review. one can respond that 0.89 is not exceeded by much in the third plot, and that we should have enough critical-care capacity by then to handle this. such a response is missing the broader point. if a change from 52 to 55 was not caught, why should one expect larger errors to be caught? current practices in modeling epidemics obviously do not have adequate guarantees of the correctness of software claimed to be implementing models. the dangers of inaccuracies in computations are added to the dangers of inaccuracies in the models per se. email dated 25 mar 2020 05:19:27 -0000 to the contact authors for [5] included a preliminary public version of this paper's python script, noted the discrepancy with [5, figure 3(b)], noted the contradiction with the (37.5, 10.0) safety claim, and suggested posting software "for public review so that errors can be more easily located and corrected". this email did not elicit a response. considerable effort spent reverse-engineering [5, figure 3(b)] eventually identified the theory of a 55 typographical error. the unavailability of the software continues to make the theory unnecessarily difficult to confirm. as further checks on this paper's python script, figure 3.4 includes the same calculations for r0 = 2.5.
compare the top two plots to [5, figures s6(a), s6(b)]. 3.5. implementability of distancing strategies. recall that this distancing strategy is triggered by i crossing particular cutoffs, namely 37.5/10000 and 10.0/10000. a policymaker considering this strategy immediately runs into the problem that the real-world i is unknown at each moment. (this is separate from the question of whether the strategy is safe; see above.) the paper [5] claims that, to "implement an effective intermittent social distancing strategy", it will be "necessary to carry out widespread surveillance to monitor when the prevalence thresholds that trigger the beginning or end of distancing have been crossed". collecting data regarding covid-19 prevalence is valuable for other reasons, and is the only policy recommendation in this paper. this does not imply, however, that such data collection would enable implementation of the strategy in [5] with a useful level of accuracy. an error analysis is required here, accounting for lags in testing, errors in testing, possible correlations between infectiousness and non-compliance, etc., but does not appear in [5]. furthermore, no justification is provided in [5] for the claim that this data collection is "necessary" for a distancing strategy to be "effective". imagine policymakers today implementing a simple one-month-on-one-month-off distancing strategy to be adjusted later; this is effective in some of the scenarios covered by the paper's model, contradicting the blanket claim of ineffectivity. perhaps this simple strategy is unimplementable for cost reasons or political reasons, but this does not justify the claim that the strategy is ineffective. 3.6.
the effect of increased critical-care capacity. the obvious way to handle the possibility of overruns in critical-care capacity is to increase the critical-care capacity. this is not controversial. what is much more controversial is the idea of actively trying to exploit all available critical-care capacity, tuning the amount of distancing to almost (but not quite) overload this capacity. one reason this is controversial is that such tuning is prone to error, whether the errors come from miscalculations within models (as illustrated above) or from the models being wrong; see, e.g., [8]. another, more fundamental, reason this is controversial is that less distancing means more infections. the claim that these infections are inevitable might be correct but (1) is not justifiable from the available data and (2) does not justify policy decisions that make the infections happen now. the paper [5] claims that increasing critical-care capacity "allows population immunity to be accumulated more rapidly, reducing the overall duration of the epidemic and the total length of social distancing measures". it is not ethical to say that exploiting increased critical-care capacity in this way is "allowed" without mentioning that this has the side effect of more infections now, including painful hospitalizations and deaths that would have been avoided by more distancing. furthermore, the predictions made by the model of [5], in particular regarding the epidemic duration and the amount of distancing, are not robust against modifications to the model that account for more effective interventions and the possibility of widespread mid-2021 vaccinations.
the paper [5] also claims, in its "summary" at the top, that "intermittent efforts require greater hospital capacity". this is contradicted by, e.g., [5, figure 3 (b)], which claims-for r 0 = 2 with seasonality-that the capacity of 0.89 critical-care beds per 10000 adults would not be overrun with (roughly) halftime intermittent distancing in 2020 and less distancing in subsequent years. the second claim appears to be based on a miscalculation, as noted above, but correcting this miscalculation and adding more distancing in subsequent years would again contradict the "require" claim. this criticism of various unjustified claims in [5] should not be interpreted as opposition to the idea of increasing hospital capacity. current news reports and simple extrapolations are consistent with the theory that the united states will already need more than 0.89 critical-care beds per 10000 adults in april, as a direct result of inadequate initial interventions in march. 3.7. the possibility of interventions being more effective than assumed in the model. the first sentence of the summary of [5] is as follows: "one-time distancing results in a fall covid-19 peak." as for intermittent distancing, the paper's lead author [4] summarizes the paper's main quantitative conclusion as follows: intermittent social distancing can prevent critical care capacity from being exceeded but such measures may be required for 12-18 months. depending on r0 and the amount of seasonality, social distancing must be 'on' for as little as 25% but up to 70% of the time. the paper itself says that staying below 0.89 requires "social distancing measures to be in place between 25% . . . and 70% of that time" (into 2022). these claims could be correct, but they go beyond what is shown in the paper. any purpose without crediting the original authors. for peer-reviewed) in the public domain. it is no longer restricted by copyright. 
figure 3.8: extension of the model from [5] to consider the possibility of interventions further reducing r0. first plot: r0 = 2, 52-week 30% seasonality, 80% reduction from distancing, stopped at 10.0/10000. second plot: r0 = 2, 52-week 30% seasonality, 99% reduction from distancing, stopped at 2.0/10000. third plot: r0 = 1.5, 99% reduction from distancing, 52-week 30% seasonality, stopped at 2.0/10000. see text for further details. concretely, the paper shows within its model that a single month of distancing now will not prevent mass infection by october; and that extending this to, say, three months will not prevent mass infection by the end of the year. the paper also shows within its model that a particular intermittent-distancing strategy, applied between 25% and 70% of the time, comes close to 0.89/10000 critical-care patients again and again. it is reasonable to conjecture that, within this model, every intermittent-distancing strategy with fewer than 6 months of total distancing will break the 1/10000 barrier. however, all of these conclusions depend upon the exact values of r0 during and after distancing. perhaps the effect of distancing is stronger than assumed in [5]. perhaps other interventions (increased hand-washing, increased wearing of masks and mask substitutes, etc.; see [7] for many more possibilities) will keep r0 below 1 after one-time distancing.
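the sensitivity to the post-distancing value of r0 can be illustrated with a toy model. the sketch below is a minimal seir integrator, not the model of [5] (no seasonality, no hospitalization or critical-care compartments), and every parameter value in it is illustrative only:

```python
def seir(r0=2.0, distancing=(0.0, 0.0, 1.0), days=400, dt=0.1,
         incubation=4.6, infectious=5.0, e0=1e-4):
    """integrate a basic seir model with forward euler.

    distancing = (start_day, end_day, multiplier): between start and end
    the transmission rate is multiplied by `multiplier` (0.2 models an
    80% reduction, 0.01 a 99% reduction). returns (peak prevalence of i,
    final attack rate)."""
    sigma, gamma = 1.0 / incubation, 1.0 / infectious
    beta0 = r0 * gamma
    start, end, mult = distancing
    s, e, i, r = 1.0 - e0, e0, 0.0, 0.0
    peak_i = 0.0
    for k in range(int(days / dt)):
        t = k * dt
        beta = beta0 * (mult if start <= t < end else 1.0)
        new_e = beta * s * i * dt       # s -> e: new exposures
        new_i = sigma * e * dt          # e -> i: end of incubation
        new_r = gamma * i * dt          # i -> r: recovery/removal
        s, e, i, r = s - new_e, e + new_e - new_i, i + new_i - new_r, r + new_r
        peak_i = max(peak_i, i)
    return peak_i, r

# one-time distancing (days 30-60, 80% reduction) delays the rebound, but
# once distancing ends most of the population is still eventually infected;
# only keeping the effective r0 low afterwards suppresses the epidemic.
peak_none, final_none = seir()
peak_dist, final_dist = seir(distancing=(30.0, 60.0, 0.2))
```

this reproduces the qualitative point above: within such a model, a one-time distancing window merely postpones mass infection unless r0 stays low after it ends.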
as examples of how heavily the results depend upon hypotheses regarding the effectiveness of interventions, figure 3.8 shows what happens with the following strategies and extensions of the model: (1) distancing reduces r0 by 80%; (2) distancing is turned off at 2.0/10000, and reduces r0 by 99%; (3) same but with r0 starting at 1.5 instead of 2, as a model of non-distancing interventions reducing transmission by 25%. the clear differences between the graphs again highlight how important it is to understand the actual impact of interventions. this section returns to the claim in [5] that a 60% reduction in r0 is "on par with the reduction in r0 achieved in china through intense social distancing measures". as noted in section 1, this claim is incorrectly attributed to [1], and is not otherwise justified in [5]. sections 2 and 3 have highlighted the importance of understanding what value of r0 can actually be achieved. figure 4.2 contains four more plots produced by the same python script used in section 3. each plot stretches over a 4-month period, includes 52-week seasonal forcing, and includes "intense" distancing starting on 23 january 2020. the second plot takes the wintertime r0 to be 2.0, the most optimistic possibility allowed in [5]. "intense" distancing is assumed to reduce r0 by 60%, again the most optimistic possibility allowed in [5]. the first (less optimistic) plot takes the wintertime r0 to be 2.5. the third (more optimistic) plot takes the wintertime r0 to be 2.0 and uses an extended model where "intense" distancing has more of an effect, reducing r0 by 99%. the fourth plot is like the third but takes the wintertime r0 to be 3.5. in each plot, there is an initial 2-day pulse of exposure. this is chosen shorter than the half-week pulse from [5] to limit the impact of the pulse time upon the width of the resulting peaks.
the pulse starts on 11 january, 5 january, 1 january, and 10 january respectively; these dates are chosen so that the red curves have approximately the same peak heights, simplifying comparison of other features of the curves. reported "severe cases" from china. https://cr.yp.to/2020/nhc-20200329.py is a python script that is intended to plot, for each day, the number of cases reported by china's nhc as being "severe" on that day. the data incorporated into the script was extracted manually, with considerable help from google translate, from http://www.nhc.gov.cn/yjb/pzhgli/new_list.shtml, the primary source of china covid-19 case counts cited by wikipedia. double-checking the data, and other aspects of the script, would be useful. the output of the script is shown in figure 4.4. figure 4.2: what the model from [5] predicts regarding a lockdown in china starting 23 january 2020. first plot: r0 = 2.5, with 60% reduction from distancing, with initial 2-day pulse starting 11 january 2020. second plot: r0 = 2, with 60% reduction from distancing, with initial 2-day pulse starting 5 january 2020. third plot: r0 = 2, in an extended model with 99% reduction from distancing, with initial 2-day pulse starting 1 january 2020. fourth plot: r0 = 3.5, in an extended model with 99% reduction from distancing, with initial 2-day pulse starting 10 january 2020. see text for more details. the reason for selecting reports of "severe" cases, rather than (e.g.) reports of "confirmed" cases, is the common-sense guess that more severe cases are more
likely to be tested and reported, hopefully producing a curve close to reality. however, there could still be biases in the testing or reporting procedures, and the nhc reports do not state the exact definition of a "severe" case. this raises the question of how to build a model that is not obviously inconsistent with the nhc reports, while at the same time avoiding the clear danger of overfitting. there are several contributing factors to the widths of the peaks in figure 4.2. in the first plot, wintertime r0 = 2.5 is reduced by distancing to 1, so the epidemic is controlled primarily by seasonal forcing (plus a gradual increase in 1 − s, the green curve). this is not a large effect, so new infections continue to occur for a long time. this phenomenon is smaller in the second plot, with wintertime r0 = 2, and much smaller in the third and fourth plots. the width of the e peak (not plotted) then produces a somewhat wider i peak (the black curve), since e → i has a high variance. this in turn produces a somewhat wider h peak (the orange curve) followed by a somewhat wider c peak (the red curve). as noted in section 2, seir models are easily adjusted to match observations of lower variance in an individual's disease progression. for example, the analysis of [6] indicates that the covid-19 incubation time has standard deviation only about half of its mean. modifying the model of [5] to include more e stages
(while preserving the mean) would reduce the width of the resulting i, h, and c peaks, producing a sharper increase towards the maximum and a sharper decrease from the maximum. it is also easy to adjust the model of [5] to delay the c peak. however, a large part of the width of the peak in the first and second plots in figure 4.2 arises directly from the assumption that r0 is large even after distancing. it is not obvious how to reconcile the model of [5] with the reports from figure 4.4 without dropping this assumption. this analysis should not be interpreted as confidently concluding that [5] is wrong in claiming that china achieved only a 60% reduction in r0. one can imagine various ways that the claim could still be correct. however, [5] is not justified in the level of confidence that it states in this claim, and is not justified in the level of confidence that it states in the conclusions that rely on this claim. medrxiv preprint doi: https://doi.org/10.1101/2020.04.14.20048025.
references:
[1] how will country-based mitigation measures influence the course of the covid-19 epidemic?
[2] mathematics of life and death: how disease models shape national shutdowns and other pandemic policies.
[3] a fiasco in the making? as the coronavirus pandemic takes hold, we are making decisions without reliable data.
[4] intermittent social distancing can prevent critical care capacity from being exceeded but such measures may be required for 12-18 months. depending on r0 and the amount of seasonality, social distancing must be 'on' for as little as 25% but up to 70% of the time.
[5] social distancing strategies for curbing the covid-19 epidemic.
[6] estimate the incubation period of coronavirus 2019 (covid-19).
[7] coronavirus: the hammer and the dance.
[8] lesson from a long experience with model blowups: @neil ferguson, if you need a model w/ "thousands of lines", this is not a model useable for real world risk & decisions, rather something with the fragility of a house built with matches to impress some tenure committee.
revision notes:
- (this typo did not appear in this paper's differential equations, and did not appear in this paper's software, but illustrates the importance of double-checking everything.)
- broadened "sensitivity analysis" to "error analysis", for readers who expect "sensitivity" to refer specifically to the sensitivity of a test.
- added comment that the notation here does not include self-loops, for readers familiar with state-transition diagrams for discrete automata.
key: cord-332618-8al98ya2 authors: barraza, néstor ruben; pena, gabriel; moreno, verónica title: a non-homogeneous markov early epidemic growth dynamics model. application to the sars-cov-2 pandemic date: 2020-09-18 journal: chaos solitons fractals doi: 10.1016/j.chaos.2020.110297 sha: doc_id: 332618 cord_uid: 8al98ya2
this work introduces a new markovian stochastic model that can be described as a non-homogeneous pure birth process. we propose a functional form of the birth rate that depends on the number of individuals in the population and on the elapsed time, allowing us to model a contagion effect. thus, we model the early stages of an epidemic. the number of individuals then becomes the infectious cases and the birth rate becomes the incidence rate. we obtain in this way a process that depends on two competitive phenomena, infection and immunization. variations in those rates allow us to monitor how effective the actions taken by government and health organizations are.
from our model, three useful indicators for the epidemic evolution over time are obtained: the immunization rate, the infection/immunization ratio and the mean time between infections (mtbi). the proposed model allows either positive or negative concavities for the mean value curve, provided the infection/immunization ratio is either greater or less than one. we apply this model to the present sars-cov-2 pandemic, still in its early growth stage in latin american countries. as is shown, the model accomplishes a good fit for the real number of both positive cases and deaths. we analyze the evolution of the three indicators for several countries and perform a comparative study between them. important conclusions are obtained from this analysis.
- we present a mathematical model based on a new stochastic process described by a pure birth process.
- the proposed model matches the subexponential growth in the early stage of an epidemic.
- the mathematical expression of the cumulative case incidence and cumulative death curves is obtained, with a quite accurate fit in both cases.
- the model contains two parameters, the immunization and infection rates. the behavior in time of those parameters allows us to assess the evolution of the outbreak.
- we obtain a new indicator, the mean time between infections. this indicator allows not only to monitor the epidemic growth but also to predict the peak of cases.
the coronavirus disease transmitted by sars-cov-2 appeared in china by the end of 2019. more than 17,000,000 cases and 675,000 deaths have been reported since then. the lack of a vaccine and an effective treatment for this disease forced governments to adopt lockdown actions in order to protect the population and to slow down the spread of the outbreak. despite those actions, health systems in several countries were overwhelmed.
many mathematical models were developed in order to predict the behavior of the outbreak in several ways, most of them based on the well-known sir (susceptible-infectious-recovered) models. we present in this work a different approach having the advantage of a rapid and clear interpretation of its parameters. we obtain in this way a quite useful new indicator, the mean time between infections (mtbi). modeling contagion has attracted the interest of statisticians for decades. these models are useful to model the spread of contagious diseases in a population or plantation. a well-known example is the polya urn model, where balls of different colors are extracted from an urn in such a way that when a given ball is extracted, not only is it put back but also a certain number of balls of the same color is added. the probability of getting a ball of that color in the next drawing is thus increased, modeling a contagious process. contagion modeling has already been applied in many areas of engineering, see for example [1, 2, 3]. the polya stochastic process is obtained as the limit from the polya urn model in a similar way as the poisson process is obtained from a bernoulli process. as it happens with the homogeneous and non-homogeneous poisson processes, the polya stochastic process can be described by the pure birth equation (see chapter 17 of [4]). the main difference between these two types of pure birth processes is that, in the polya process, the event rate is not just a function of time but also of the number of individuals in the population. there are also cases where the event rate depends only on the number of previous events but not on time, like the yule process. however, since the probability of births rises due to an increase in the population while the probability of an individual birth remains constant, it does not describe a true contagion process. the stochastic mean of the polya process is a linear function of time, as will be shown.
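the urn scheme just described is easy to simulate. the sketch below is not from either paper and its parameter choices are illustrative; it shows the contagion mechanism (returning `extra` balls of the drawn colour makes that colour more likely next time) while the expected fraction of each colour stays fixed:

```python
import random

def polya_urn(red=1, black=1, extra=1, draws=200, rng=None):
    """one polya urn run: each drawn ball is returned together with
    `extra` balls of the same colour, so an already-drawn colour becomes
    more likely to be drawn again (the contagion effect). returns the
    final fraction of red balls."""
    rng = rng or random.Random()
    for _ in range(draws):
        if rng.random() < red / (red + black):
            red += extra
        else:
            black += extra
    return red / (red + black)
```

single runs wander far from the initial 50% because contagion amplifies early luck, yet the average over many runs stays at 50% (the red fraction is a martingale); correspondingly, the mean of the limiting polya process grows only linearly in time.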
this characteristic makes the polya process unsuitable for application in many real cases of disease growth dynamics and engineering. hence, we propose a different model based on a pure birth process with an event rate that, like polya's, depends on both the elapsed time and the number of previous events, but with a different functional form. we obtain in this way a mean number of events that is a nonlinear function of time, with either an increasing or decreasing first derivative provided a given parameter is greater or less than one. an important application of our proposed model is the study of the spreading of diseases. we obtain in this way a process controlled by two competitive phenomena, infection and immunization, i.e. two opposing forces. hence, we get a model with two parameters: the transmission and immunization rates. the mean value function shows different behavior according to whether the ratio between infection and immunization rates is greater or lower than one. in the former case, its second derivative is positive and the mean value curve is convex, whereas in the latter case its second derivative is negative and the resulting curve is concave. the limit case of infection/immunization ratio γ/ρ = 1 is also possible, resulting in a mean value function that grows linearly over time. the evolution of the parameters shows how well the pandemic is being controlled, either by lowering the infection rate via isolation and quarantine measures or by increasing the immunization rate with a vaccination campaign. we obtain in this way three useful indicators: the infection/immunization ratio, the immunization rate and the mean time between infections (mtbi). our model is quite suitable to be applied to early epidemic growth dynamics, where exponential models fail (this matter is discussed in [5]). since ours falls in the category of subexponential models, the results presented in this article support the idea developed in the cited reference.
another advantage of our approach is that, as a subset of the subexponential cases, we are able to model cases with linear and sublinear (concave) cumulative case incidence curves, as will be shown. as an interesting and quite important application, we apply our model to the recent sars-cov-2 pandemic as an alternative to the sir models. our theoretical values match real data, which allows us to assess the evolution of the outbreak and to predict its trend a few days ahead. the parameters can be easily estimated through simple methods, such as least squares, applied either to positive cases or deaths. we show applications to data from several countries with good agreement. this paper is organized as follows: motivation and related work are exposed in section 2, the proposed model is presented in section 3, applications to the sars-cov-2 pandemic are developed in section 4 and some discussions on the results we obtained are presented in section 5. the final conclusions are exposed in section 6. our main motivation is to obtain a model that describes an epidemic outbreak at its first stage, before it reaches the inflection point in the case incidence curve, which is useful to monitor how contagion is spreading out. this first stage corresponds to an r0 greater than one. our model is inspired by the polya-lundberg process, which comes from the contagious urn model as explained in the previous section. since the mean value function of the polya-lundberg process is a linear function of time (see appendix b), we introduce a modification in the event rate in order to get a mean value function that grows subexponentially with either positive or negative concavity, as we observe in the early epidemic growth curves usually reported. there exists a wide variety of models that can be used to describe the evolution of an epidemic. the main standard is held by the so-called compartmental models, i.e., the family of sir based models (sir, seir, sirs, etc.
see for instance [6, 7, 8, 9]). these models consist of a set of differential equations that take into account transitions between different compartments. the first classical sir model was proposed by kermack and mckendrick [10]. due to the recent covid-19 coronavirus pandemic, most of the scientific community is dedicated to studying its behavior, both by using sir based models and by introducing new ones. in [11], a stochastic term is introduced in the system of differential equations to simulate noise in the detection process. in [12] the authors use an arima based method to correct errors arising from the delay between the day the sample is taken and the day a diagnosis is made; other compartmental models can be found in [15]. in [16, 17] some well-known machine learning and time series algorithms (like arima and svr) are used to learn from a present dataset and forecast the case incidence curve up to the following six days. combinations of sir models embedded with some random components were proposed in very recent articles, see for instance [18, 19, 20]. a theoretical framework to model epidemics by a non-homogeneous birth-and-death process was proposed in [21]. an extensive analysis of epidemics at their early stage was performed in [5]. in this study, the authors propose a phenomenological model (see also [22, 23]) and analyze the application of sir based models with either exponential or subexponential behavior. our approach is quite different. on one hand, our model is stochastic whereas theirs is deterministic. on the other hand, we obtain our contagion model as a special case of a very general and well-known probabilistic model, the pure birth process. it is interesting to compare the differential equations the cited authors arrive at with ours, which will be done in a future publication. furthermore, and unlike them, we are also able to model cumulative case incidence curves that grow with negative concavity.
the model we propose is based on a pure birth process. these processes describe the evolution of a population where individuals can only be born. we are then modeling the epidemic spread as births in a population, where every birth corresponds to a new infection case. our novel approach consists of finding a proper incidence rate that describes the contagion phenomenon. we develop next the fundamentals of pure birth processes and our proposal. we propose to get the probability of having r infections in a given time t from a pure birth stochastic differential equation. pure birth processes describe the behavior of a population where individuals can only be born and are not allowed to die. all the individuals are assumed to be identical. the probability of having r individuals in the population at a given time t, p_r(t), is given by the solution of the following differential equation, see [4, 24]: p'_r(t) = −λ_r(t) p_r(t) + λ_{r−1}(t) p_{r−1}(t), with λ_{−1}(t) ≡ 0 (1). assuming we have 0 individuals at time 0, we impose the following initial conditions on eq. (1): p_0(0) = 1 and p_r(0) = 0 for r ≥ 1. the process given by eq. (1) is also markovian, where the number of individuals corresponds to the state of the system, s_r(t). the only transitions allowed are s_r(t) → s_r(t + dt) and s_r(t) → s_{r+1}(t + dt), with probabilities 1 − λ_r(t) dt and λ_r(t) dt respectively. the dependence of λ_r(t) on t defines the type of process. if λ_r(t) is a function of t only or a constant, then the process is non-homogeneous or homogeneous respectively, with independent increments in both cases. if λ_r(t) is also a function of r, the process has dependent increments. from eq.
(1), the probability of having no births in a certain time interval greater than t − s, given the population has r individuals by the time s, is given by the well-known exponential waiting time: p(no birth in (s, t] | r individuals at s) = exp(−∫_s^t λ_r(τ) dτ) (3). the mean number of infections in a given time is m(t) = Σ_{r=1}^∞ r p_r(t) (4). under certain conditions (see appendix b to verify those are satisfied by our model; regarding the transitions above, other transitions have probabilities that are higher-order infinitesimals) we have the following differential equation for the mean value: m'(t) = Σ_{r=0}^∞ λ_r(t) p_r(t) (5). depending on the proposed function for λ_r(t), the mean number of infections may or may not be easily obtained. our formulation allows us to calculate the mean time between infections (mtbi), this is, the mean time between births in a pure birth process. from eq. (3), we can predict the mean time between infections (mtbi) after r infected individuals were detected by the time s, using a model with incidence rate λ_r(t), as: mtbi(r, s) = (1/z) ∫_s^∞ exp(−∫_s^t λ_r(τ) dτ) dt (6). here, z is a normalizing constant to consider cases where the probability of having no infections in an infinite time interval is greater than zero, and is given by eq. (7) (see appendix c for a full deduction): z = 1 − lim_{t→∞} exp(−∫_s^t λ_r(τ) dτ) (7). eq. (6) gives the expected time to the next infection given that by the time s the infected population consists of r individuals, this is, the mean time between two consecutive detections. this indicator is, in most cases, u-shaped (see fig. 1; an exceptional case is also shown in fig. 4b, and will be discussed later). it is expected to decrease at first, showing the acceleration of the spread (infections occur more often). when the cumulative case incidence curve reaches the inflection point and the epidemic starts to mitigate, the mtbi shows a flattening portion (infections occur at a constant rate). then, due to the deceleration of the epidemic (infections occur less often), the mtbi tends to get larger again.
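the exponential waiting time of eq. (3) gives a direct way to simulate a pure birth process. the sketch below is an illustration of the framework, not code from the paper, and handles only the time-homogeneous case where λ_r depends on the current population r but not on t, so each sojourn is exactly exponential with rate λ_r:

```python
import random

def pure_birth_path(lam, t_end, r0=1, seed=None):
    """simulate one path of a pure birth process whose rate lam(r) depends
    only on the current population r (the time-homogeneous case): by the
    exponential waiting time of eq. (3), the sojourn in state r is
    exponential with rate lam(r). returns the population at time t_end."""
    rng = random.Random(seed)
    t, r = 0.0, r0
    while True:
        t += rng.expovariate(lam(r))   # waiting time to the next birth
        if t > t_end:
            return r
        r += 1
```

for the yule process (lam(r) = λ·r) the mean population starting from one individual is e^(λt), so a monte carlo average over paths should reproduce that value; the simulated inter-birth gaps are exactly the waiting times whose mean the mtbi of eq. (6) describes.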
this curve resembles the bathtub curve, well known in reliability engineering, which is also obtained when this class of models is applied to reliability studies, see [25]. 3.2. proposed incidence rate. we propose a two-parameter incidence rate given by the following expression: λ_r(t) = ρ (1 + (γ/ρ) r)/(1 + ρ t) (8). the parameters involved in eq. (8) indicate how fast the outbreak is spreading. we can see that the γ parameter is related to the strength of the contagion effect, since it is a factor of the number of infections, and the ρ parameter is related to the outbreak mitigation, either by natural immunization or due to external measures. that is the way these two parameters, once estimated, are useful to monitor the disease progress and the impact of actions taken by health institutions. these actions can affect either the γ parameter via the population isolation and quarantine, or the ρ parameter by, for instance, vaccination campaigns. as we will show, the ratio between both rates is the exponent of time t on the mean value function (eq. (11)), and it determines the curve's concavity, provided it is greater or less than one. we can compare our proposed incidence rate with other formulations by rewriting eq. (8) as follows: λ_r(t) = ρ/(1 + ρ t) + γ r/(1 + ρ t) (9).
on the other hand, following the usual behavior of ρ, this term rapidly vanishes with time. being the ratio γ ρ the coefficient of r in the contagion rate (eq. (8)), it takes into account both the infection and immunization rates, so it can be directly rethe evolution of these two parameters, the ratio γ ρ and ρ as a function of time, is indicative of how well the epidemic is being controlled by actions taken from health institutions. we expect a strong decrease in γ ρ , and in consequence, a strong increase in ρ. our definition of the outbreak parameters deserves a detailed analysis. the well-known definition of the basic reproduction number r 0 assumes all the population is susceptible and no immunization is present. this would imply ρ to be zero; in that case, eq. (8) reduces to λ r (t) = λ r. which is the event rate of another classic stochastic model, called the yule 205 process. in that case, the event rate given by eq. (10) grows linearly with the population size, which results in an exponential expression for the mean value function (see [27] for more details about the yule process). this means that, if we allow our ρ to eventually be zero, the yule process becomes a special case of ours, hence exponential growth can also be modeled. however, as it can be 210 seen in table 1 , even the worst case scenarios (very sharp incidence curves) differ from exponential growth, being this another empyrical confirmation of chowell's thesis presented in [5] . it should be remarked that our proposed rate (eq. (8) (5), as demonstrated in appendix b. from the incidence rate (eq. (9)), we are able to get the exact solution of p r (t) (see appendix a) and therefore we can show that m (t) is finite and eq. as seen from eq. (11), the mean value obtained from our model is a nonlinear function of time with a positive or negative concavity whether γ ρ is greater or lower than one, as shown in fig. 2 . 
this expression is the functional form we fit to the reported data. for our proposed model, it is straightforward to obtain an expression for the mtbi by inserting eq. (8) into eq. (6). this leads to eq. (12): mtbi(r, s) = (1 + ρ s)/(γ r) (12). replacing r by the mean number of infections given by eq. (11), we get eq. (13), which is a useful conditional expression, as shown in appendix c: mtbi(s) = (1 + ρ s)/(γ m(s)) (13). in this section, we show the indicators obtained from several countries' reports. data were taken up to early july 2020 from https://ourworldindata.org/coronavirus-source-data. for asian and european countries, and also the usa, only data from the early epidemic stages were considered for analysis, this is, up to the inflection point. real data and fitted curves are shown in figs. 3 and 5. we fit eq. (11) to the incidence and death curves of the recent sars-cov-2 pandemic in order to obtain the three mentioned indicators. the case incidence curves are shown in fig. 3. in order to show that the proposed model fits well not just positive concavities as shown before, we also consider the data from uruguay, which presents a concave cumulative incidence curve. cumulative number of deaths curves are shown in fig. 5. the parameters and the coefficient of determination are shown in table 2. in this section we show the evolution of the γ/ρ parameter obtained from our model as a function of time, for the analyzed countries. in order to obtain the evolution of this parameter over time, we perform several estimation runs with successive subsets of the dataset and record each estimate. this picture is indicative of how effective the measures taken by different countries have been, or how fast the outbreak is reaching the mitigation stage. as the transmission rate approaches the immunization rate, this parameter reaches the value of 1 as a limiting case. curves are depicted in fig. 6a. as previously discussed, the force that contains the outbreak is
then, we analyze the evolution of this parameter in the same way as it was done for γ ρ in the previous section. values of ρ as a function of time are depicted in fig. 6b. it is interesting to analyze the current behavior of the parameters for latin american countries, since they are still at the first stage of the pandemic and where, with the exception of brazil, the lockdown was relaxed and turned strict again more than once. curves are depicted in fig. 7 . spreading out and the inflection point has not been reached yet. this indicator is useful in order to estimate the moment when the curve will start to flatten. in fig. 9 we depict the mtbi for china in order to see the flattening section 285 of this indicator as the cumulative cases of incidence reaches its inflection point. by the time we are writing this report (early july 2020), contrarily to europe and the usa, latin american countries have not yet reached the inflection point in the cumulative incidence curve, which can also be seen through the mtbi characteristics. this inflection point in the cumulative incidence curve 290 corresponds to a minimum in the mtbi indicator, as seen in fig. 8 for italy. it table 3 . cumulative incidence figure 9 : mean time between infections and cumulative cases incidence for china. the well goodness of fit obtained for several countries shows that the spreading is governed by the same law. although with different parameter values, the cumulative case incidence curve follows the same mathematical expression. those parameters indicate the level of contagion and the effectiveness of the actions taken by governments. the model fits rather well the number of in-300 fected and deaths in the population previous to the inflection point, at the early pandemic growth stage, though its prediction beyond four or five days ahead is generally too pessimistic and overestimates the actual data. the mtbi minimum values indicated in table 3 show an interesting result. 
The code in R that implements our proposed model, and which was used to obtain the shown curves, is available at [28]. It can be easily proved that Eq. (1) has a unique solution (see also [29]), and that this solution is a probability mass function provided that Σ_{r=0}^∞ p_r(t) = 1; this is shown by following the same steps as in Section 4 of Chapter XVII of [4] (see also [30], Section 4.3.2), where it is proved for the case in which λ_r(t) depends on r but not on t. The authors are working on a future publication where this property will be properly generalized to functions that depend on t as well. Recall our proposed functional form of λ_r(t), and let µ_r(t) be the function defined in Eq. (A.3); it can be seen that µ′_r(t) = λ_r(t) and µ_r(0) = 0 for all r ≥ 0. We now define the auxiliary function µ_∆(t), which does not depend on r; note that µ_∆(0) = 0. The equality in Eq. (A.5) holds: the proof can be done by differentiating the right side of Eq. (A.5) to show it is indeed a primitive of the left side's integrand. Since the functions are also equal at t = 0, the fundamental theorem of calculus yields that the equality is valid for every t. Considering the initial condition p_0(0) = 1 (the process begins with 0 infections), the probability mass functions are given by Eq. (A.6). This is demonstrated by induction over r, using Eq. (A.6) as inductive hypothesis and p_0(t) from Eq. (A.1). In fact, we can rewrite Eq. (A.6) in a form which, using the fact that µ_r(t) = µ_0(t) + r µ_∆(t) and replacing µ_0(t), µ_0(0) and µ_∆(t) by their expressions, yields Eq. (A.8). It should be noted that, since the expression in Eq. (A.1) is also valid for the case λ_r(t) = ρ(γ_ρ + r)/(1 + ρt), we can define µ_r(t) and µ_∆(t) so that the same properties are achieved. Therefore, following the same steps, we get the expression of the pmfs for the Pólya-Lundberg process. A similar procedure provides the pmfs for the Yule process. Multiplying Eq. (1) by r and summing we obtain, since the condition lim_{r→∞} r λ_r(t) p_r(t) = 0
(B.1) is attained for our model, the differential equation dM/dt = Σ_r λ_r(t) p_r(t) for the mean value function M(t) = Σ_r r p_r(t); its solution for our model is Eq. (B.3). The same procedure that leads to Eq. (B.3) is also valid for the Pólya-Lundberg process: replacing λ_r(t) by its expression and solving the differential equation results in a linear mean value function. In the same way, with λ_r(t) = λr, we get an exponential mean value function, which corresponds to the Yule process.

Appendix C. Mean time between infections calculation for the contagion model. We begin from the exponential waiting time (Eq. (3)), which we can rewrite as P(T_r ≥ t | T_{r−1}) = exp(−…).

References
- Image segmentation and labeling using the Pólya urn model
- An introduction to probability theory and its applications
- Mathematical models to characterize early epidemic growth: a review
- An introduction to mathematical modeling of infectious diseases
- Infectious diseases of humans: dynamics and control
- Modeling the spread of infectious diseases: a review
- Modeling epidemics: a primer and Numerus Model Builder implementation
- A contribution to the mathematical theory of epidemics
- Analysis of stochastic delayed SIRS model with exponential birth and saturated incidence rate
- Llanovarced-Kawles, Álvaro Olivera-Nappa, Statistically-based methodology for revealing real contagion trends and correcting delay-induced errors in the assessment of COVID-19 pandemic
- Inference of the generalized-growth model via maximum likelihood estimation: a reflection on the impact of overdispersion
- COVID-ABS: an agent-based model of COVID-19 epidemic to simulate health and economic effects of social distancing interventions
- Agent-based modeling vs. equation-based modeling: a case study and users' guide
- Short-term forecasting COVID-19 cumulative confirmed cases: perspectives for Brazil
- L. dos Santos Coelho, Forecasting Brazilian and American exogenous variables
- SIR epidemics with stochastic infectious periods, Stochastic Processes and their Applications
- Dynamic behaviors of a two-group stochastic SIRS epidemic model with standard incidence rates
- A stochastic SIR epidemic model with Lévy jump and media coverage
- Approximation of epidemics by inhomogeneous birth-and-death processes
- A generalized-growth model to characterize the early ascending phase of infectious disease outbreaks
- Using phenomenological models for forecasting the 2015 Ebola challenge
- Introduction to stochastic processes
- Increasing failure rate software reliability models for agile projects: a comparative study
- The ABC of terms used in mathematical models of infectious diseases
- Wiley Series in Probability and Statistics
- Introducing the non-homogeneous compound-birth process
- An introduction to Markov processes

The authors declare no conflict of interest related to this article.

key: cord-284617-uwby8r3y
authors: area, iván; hervada vidal, xurxo; nieto, juan j.; purriños hermida, maría jesús
title: determination in galicia of the required beds at intensive care units
date: 2020-10-06
journal: nan
doi: 10.1016/j.aej.2020.09.034
sha: doc_id: 284617 cord_uid: uwby8r3y

By using a recent mathematical compartmental model that includes the super-spreader class, developed by Ndaïrou, Area, Nieto, and Torres, a procedure to estimate in advance the number of required beds at intensive care units is presented. Numerical simulations are performed to show the accuracy of the predictions as compared with the real data in Galicia. In the context of the present coronavirus pandemic, the COVID-19 epidemic has dramatically challenged the critical care capacity in every region and country in the world, and has therefore increased the required intensive care units (ICUs). One of the main questions authorities have to address is: are the resources to treat infected cases enough?
In this respect, hospital beds, intensive care units (ICUs), and ventilators are crucial for the treatment of patients with severe illness. We have employed a compartmental mathematical model for COVID-19 to estimate in advance the number of required beds at intensive care units. The predictions have been performed in real time in Galicia (Spain) during the previous two months (March-April 2020) by using real data from the epidemiological service of our region. In this work we have estimated the number of required beds at ICUs and compared the predictions with the real data. The numerical predictions show great agreement with the real data, and the model could be used in other regions or countries and with this or other epidemics. The delay between new infected individuals and required beds at ICUs can be used even in the case of not having an appropriate mathematical model to predict the number of infected individuals in advance. The so-called coronavirus disease 2019 can be defined as an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). At this time, the origin of the pandemic is yet unknown, but the first reported cases were identified by December 2019 in Wuhan, the capital city of Hubei province in the People's Republic of China. Since 21st January 2020 the World Health Organization has published daily reports with the reported data from the affected countries. By March 11th, 2020, the World Health Organization characterized this outbreak as a pandemic. At the time of submission of the first version, the number of total confirmed cases was about 3 million and the number of deaths approximately two hundred thousand people. Nowadays, globally, as of 20th September 2020, there have been 30,675,675 confirmed cases of COVID-19, including 954,417 deaths, reported to WHO. As indicated in [1], how to determine the needed intensive care units (ICUs) is neither easy nor fully understood.
The ICU selection process is complex, and much more so during an epidemic like this one and under bed pressure, prioritization, justice, expected outcome and extreme circumstances [2]. In the context of the COVID-19 epidemic, this coronavirus disease pandemic has dramatically challenged the critical care capacity in Galicia, Spain, Europe and most countries in the world. Galicia is an autonomous community of Spain, located in the northwest of the Iberian Peninsula, with a population of about 2,700,000 and a total area of 29,574 km². One of the main questions authorities have to address is: are the resources to treat infected cases enough? In this respect, hospital beds, ICUs, and ventilators are crucial for the treatment of patients with severe illness [3, 4]. From the mathematical point of view, it is possible to analyze the evolution of an infectious disease by using different techniques [5, 6]. One option is to consider compartmental models, dividing the population into different classes and determining the rate of change among the different classes. The simplest model is usually referred to as SIR, denoting susceptible, infected and recovered individuals, and it can be extended to more complex models [7]. Following previous works [8-10], a model including the super-spreader class [14, 15] has been presented in [11] and applied to give an estimation of the infected and dead individuals in Wuhan. Many mathematical models and dynamical systems have already been developed. We highlight a mathematical model introduced taking into account the possibility of transmission of COVID-19 from dead bodies to humans and the effect of lockdown [16], the dynamical model of [17] considering the interaction among bats and unknown hosts (wild animals) and among humans and the infection reservoir (seafood market), and the eight-stage infection compartmental model [18] where the authors implement an effective control strategy.
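As a minimal illustration of the compartmental approach just described, the classical SIR model can be integrated numerically; the rates below are illustrative, not fitted to any data set.

```python
import numpy as np
from scipy.integrate import solve_ivp

def sir(t, y, beta, gamma, N):
    # Classical SIR model: susceptible, infected, recovered.
    S, I, R = y
    new_inf = beta * S * I / N
    return [-new_inf, new_inf - gamma * I, gamma * I]

N = 10_000
beta, gamma = 0.5, 0.2              # illustrative rates, so R0 = beta/gamma = 2.5
sol = solve_ivp(sir, (0, 160), [N - 1.0, 1.0, 0.0],
                args=(beta, gamma, N), t_eval=np.arange(161), rtol=1e-8)
S, I, R = sol.y
print(f"peak infected: {I.max():.0f}, final epidemic size: {R[-1]:.0f}")
```

Peak prevalence and final size behave as standard SIR theory predicts for R0 = 2.5.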
For a detailed statistical analysis in some countries (Turkey and South Africa), we refer the reader to [19]. For the role of quarantine and isolation, see [20, 21]. In this context it is crucial to know beforehand the needed beds at ICUs, due to the huge resources required for each of them. This problem cannot be left to improvisation, since the technical needs are extremely specific, as are the human resources. When there is no other option, improvisation; if possible, planning. The manuscript is organized as follows. In Section 2, we recall the compartmental model for COVID-19 [11]. The usefulness of our model is then illustrated in Section 3 with numerical simulations, where by using the real data from Galicia we estimate the number of required beds at ICUs and compare the predictions with the real data. We end with Section 4 of conclusions, discussion, and future research. The mathematical model described in [11] considers a constant total population of size N, which is subdivided into eight epidemiological classes: 1. susceptible class (S), 2. exposed class (E), 3. symptomatic and infectious class (I), 4. super-spreaders class (P), 5. infectious but asymptomatic class (A), 6. hospitalized (H), 7. recovery class (R), and 8. fatality class (F). It is then described by a system of eight nonlinear ordinary differential equations. As for the parameters, next we provide a description of each of them as well as the numerical values used to model the spread of the disease in Wuhan:

1. β = 2.55 day⁻¹ stands for the transmission coefficient from infected individuals;
2. β′ = 7.65 day⁻¹ stands for the transmission coefficient due to super-spreaders;
3. l = 1.56 is dimensionless and denotes the relative transmissibility of hospitalized patients;
4. κ = 0.25 day⁻¹ stands for the rate at which exposed individuals become infectious;
5. ρ_1 = 0.580 is dimensionless and stands for the rate at which exposed individuals become infected;
6. ρ_2 = 0.001 is dimensionless and stands for the rate at which exposed individuals become super-spreaders;
7. 1 − ρ_1 − ρ_2 is dimensionless and denotes the progression from the exposed to the asymptomatic class;
8. γ_a = 0.94 day⁻¹ denotes the rate of being hospitalized;
9. γ_i = 0.27 day⁻¹ is the recovery rate without being hospitalized;
10. γ_r = 0.5 day⁻¹ is the recovery rate of hospitalized patients;
11. δ_i = 1/23 day⁻¹ is the disease-induced death rate due to infected individuals;
12. δ_p = 1/23 day⁻¹ is the disease-induced death rate due to super-spreader individuals;
13. δ_h = 1/23 day⁻¹ is the disease-induced death rate due to hospitalized individuals.

A flowchart of model (1) is presented in [11] and included as Fig. 1. The initial mathematical guess of considering the class of super-spreaders has been confirmed in places such as Bhilwara (India), Brighton (UK), or Daegu (South Korea), just to mention some cases. One of the key tools in any compartmental model is to determine the basic reproduction number, which can be read as a measure of the spread of the disease in the population. Using the next generation matrix approach [12], this quantity has been determined for model (1) by considering the generation matrices F and V and computing the spectral radius of F·V⁻¹; the basic reproduction number R_0 is therefore obtained, and its numerical value follows by considering the values of the parameters given before. The following result has been obtained in [11]. Theorem 1. The disease-free equilibrium of system (1), that is, (N, 0, 0, 0, 0, 0, 0, 0), is locally asymptotically stable if R_0 < 1 and unstable if R_0 > 1. Proof (sketch). First of all, it is remarkable that in system (1), equations 5, 7 and 8 are uncoupled.
From the Jacobian matrix associated with the remaining variables one obtains the characteristic polynomial. In order to apply the Liénard-Chipart test [13], it must be proved that a_i > 0, i = 1, 2, 3, 4, as well as a_1 a_2 > a_3. In doing so, the coefficients a_i can be rewritten in terms of the basic reproduction number; furthermore, this implies the result. □ During the pandemic, several reports have been produced to predict the number of required beds at the intensive care units. In doing so, during March and April 2020 we have used the same values for the parameters in the differential system (1) as for the Wuhan predictions [11]. In Fig. 2 we have plotted both the real data and the predictions of the mathematical model. In the simulations of Wuhan it was fixed N = 11,000,000/250, and in the Galician case N = 2,700,000/(1.55 × 250). The extra factor of 1.55 is due to the spread of the Galician population. It was determined in the first days of the pandemic and has later proved to be an adequate value. Moreover, we have fixed the initial conditions. In order to numerically solve the system of differential Eqs. (1), the MATLAB code ode45 has been used, which is based on an explicit Runge-Kutta (4,5) formula. The initial conditions have been fixed taking into account the real data provided by the Galician government during the first days of the pandemic, which allowed us to make the prediction in Fig. 2. The real data are also shown in Fig. 2, which demonstrates the accuracy of the model and the predictions. This is the main tool to afterwards predict the number of beds that would be necessary, day by day, at the intensive care units. By using the simulation, we have made a prediction of the requirements at the intensive care units, computing 2.5% of the sum of the new infected individuals of the previous 20 days.
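The pipeline described above (solve system (1), then take 2.5% of the rolling 20-day sum of new infections) can be reproduced with any Runge-Kutta solver in place of MATLAB's ode45. In the sketch below the equation structure is assumed from the model of [11], which is not written out in the text; the initial conditions are illustrative rather than the fitted Galician values, and "new infected individuals" is read as the daily flow out of the exposed class into I and P.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Parameter values as listed above (Wuhan calibration used for Galicia).
beta, beta_s, l = 2.55, 7.65, 1.56            # transmission coefficients
kappa = 0.25                                  # rate exposed -> infectious
rho1, rho2 = 0.580, 0.001                     # exposed -> I and -> P fractions
gamma_a, gamma_i, gamma_r = 0.94, 0.27, 0.5   # hospitalization/recovery rates
delta_i = delta_p = delta_h = 1.0 / 23.0      # disease-induced death rates
N = 2_700_000 / (1.55 * 250)                  # effective population (Galicia)

def model(t, y):
    # Eight compartments S, E, I, P, A, H, R, F; structure assumed from [11].
    S, E, I, P, A, H, R, F = y
    foi = (beta * I + l * beta * H + beta_s * P) * S / N
    return [
        -foi,
        foi - kappa * E,
        kappa * rho1 * E - (gamma_a + gamma_i + delta_i) * I,
        kappa * rho2 * E - (gamma_a + gamma_i + delta_p) * P,
        kappa * (1 - rho1 - rho2) * E,
        gamma_a * (I + P) - (gamma_r + delta_h) * H,
        gamma_i * (I + P) + gamma_r * H,
        delta_i * I + delta_p * P + delta_h * H,
    ]

# Illustrative initial conditions (the paper fits them to early Galician data).
y0 = [N - 6, 5, 1, 0, 0, 0, 0, 0]
days = np.arange(121)
sol = solve_ivp(model, (0, 120), y0, t_eval=days, rtol=1e-8, atol=1e-8)

# ICU demand: 2.5% of the sum of new infected individuals of the previous
# 20 days, with the daily flow into I + P standing in for "new infected".
new_inf = kappa * (rho1 + rho2) * sol.y[1]
icu = 0.025 * np.convolve(new_inf, np.ones(20), mode="full")[:len(days)]
print(f"peak predicted ICU beds: {icu.max():.1f}")
```

The same rolling-sum step can be applied directly to observed case counts when no fitted model is available, as the conclusions suggest.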
In Table 1 we show the predicted values as well as the real number of beds that have been required. The management of resources during the pandemic is essential, and this work provides a method for predicting the number of beds at ICUs. The prediction of beds at the ICUs has been, and is, one of the keystones of this pandemic. The data about severity of the confirmed cases have changed several times since the beginning of the pandemic. This is normal in emerging diseases, in which initially the worst cases are detected and, as the disease progresses, it is possible to identify milder cases. Following the data from Wuhan, 31% of the first 99 cases required intensive care, but among 1,099 cases from 532 hospitals in China, only 5% were admitted to intensive care units. From the data of the European Union and the United Kingdom, 30% of the confirmed cases have been hospitalized, and 4% have been considered critical. In a similar way, in Spain, of the first 18,609 cases with complete information, 43% have been hospitalized and 3.9% have been at ICUs. In the case study of Galicia, as mentioned, there was no important sustained community transmission, which allowed for a better scenario as compared with other regions of Spain and Europe. The approach we have followed is to predict in advance the number of new infected individuals. In doing so, we have considered the mathematical model (1), for which we have computed the basic reproduction number and analyzed the local stability. As a second step, with this prediction in advance, we have predicted the number of beds at ICUs as shown in the manuscript, which has proved to be accurate. It seems extremely important to have a tool which allows one to predict the number of beds at ICUs that would cover real needs, assuming confinement of the population for at least one month. This would help with new outbreaks of the disease.
If confinement is not applied to the population, then different values of the parameters in the differential system (1) might be considered, but the prediction based on the curve produced by the model would remain extremely useful for the management of resources. Not applicable. All the data used in this work have been obtained from official sources. Moreover, the system of nonlinear differential Eqs. (1) can be numerically solved to obtain the same results as shown. The work of Area and Nieto has been partially supported by the Agencia Estatal de Investigación (AEI) of Spain, cofinanced by the European Fund for Regional Development (FEDER) corresponding to the 2014-2020 multiyear financial framework, project MTM2016-75140-P, as well as by Instituto de Salud Carlos III, grant COV20/00617. Moreover, Nieto also thanks partial financial support by Xunta de Galicia under grant ED431C 2019/02. This research was supported by the Portuguese Foundation for Science and Technology (FCT) within the "Project n. 147 - Controlo Ótimo e Modelação Matemática da Pandemia COVID-19: contributos para uma estratégia sistémica de intervenção em saúde na comunidade", in the scope of the "RESEARCH 4 COVID-19" call financed by FCT. The authors contributed equally to this work. All authors read and approved the final manuscript. The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References
- How many intensive care beds are enough?
- The novel coronavirus (SARS-CoV-2) infections in China: prevention, control and challenges
- Projecting hospital utilization during the COVID-19 outbreaks in the United States
- Surge capacity of intensive care units in case of acute
- A mathematical model for simulating the phase-based transmissibility of a novel coronavirus
- Effective containment explains subexponential growth in recent confirmed COVID-19 cases in China
- On a fractional order Ebola epidemic model
- Mathematical modeling of Zika disease in pregnant women and newborns with microcephaly in Brazil
- Ebola model and optimal control with vaccination constraints
- Mathematical modeling of COVID-19 transmission dynamics with a case study of Wuhan
- Reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission
- The theory of matrices
- One world, one health: the novel coronavirus COVID-19 epidemic
- The role of super-spreaders in infectious disease, Cell Host & Microbe
- Modelling the spread of COVID-19 with new fractal-fractional operators: can the lockdown save mankind before vaccination?
- Abdon Atangana, Modeling the dynamics of novel coronavirus (2019-nCoV) with fractional derivative
- Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy
- Mathematical model of COVID-19 spread in Turkey and South Africa: theory, methods and applications, medRxiv
- A nonlinear epidemiological model considering asymptotic and quarantine classes for SARS CoV-2 virus
- The dynamics of COVID-19 with quarantined and isolation

The authors are grateful to the anonymous reviewers for their suggestions and comments that improved a preliminary version of the manuscript.
key: cord-320953-1st77mvh
authors: overton, christopher e.; stage, helena b.; ahmad, shazaad; curran-sebastian, jacob; dark, paul; das, rajenki; fearon, elizabeth; felton, timothy; fyles, martyn; gent, nick; hall, ian; house, thomas; lewkowicz, hugo; pang, xiaoxi; pellis, lorenzo; sawko, robert; ustianowski, andrew; vekaria, bindu; webb, luke
title: using statistics and mathematical modelling to understand infectious disease outbreaks: covid-19 as an example
date: 2020-07-04
journal: infect dis model
doi: 10.1016/j.idm.2020.06.008
sha: doc_id: 320953 cord_uid: 1st77mvh

During an infectious disease outbreak, biases in the data and complexities of the underlying dynamics pose significant challenges in mathematically modelling the outbreak and designing policy. Motivated by the ongoing response to COVID-19, we provide a toolkit of statistical and mathematical models, beyond the simple SIR-type differential equation models, for analysing the early stages of an outbreak and assessing interventions. In particular, we focus on parameter estimation in the presence of known biases in the data, and on the effect of non-pharmaceutical interventions in enclosed subpopulations, such as households and care homes. We illustrate these methods by applying them to the COVID-19 pandemic. Mathematical epidemiology is a well-developed field. Since the pioneering work of Ross in malaria modelling [67] and Kermack and McKendrick's general epidemic models [40], there has been gathering interest in using mathematical tools to investigate infectious diseases. The allure is clear, since mathematical models can provide powerful insight into how these complex systems behave, which in turn can enable these problems to be better controlled or prevented. Not only is the power of the mathematical tools increasing, but the availability of data on infectious diseases, whether this be a rapid release of data during an outbreak or detailed collection of data for endemic pathogens, is also increasing.
Rapid interpretation of epidemiological data is critical for the development of effective containment, suppression and mitigation interventions, but there are many difficulties in interpreting case data in real time. These include interpreting symptom progression and fatality ratios with delay distributions and right-censoring, exacerbated by exponential growth in cases leading to the majority of case data being on recently infected individuals; lack of clarity and consistency in denominators; inconsistency of case definitions over time; and the eventual impact of interventions and changes to behaviour on transmission dynamics. Mathematical and statistical techniques can help overcome some of these challenges to interpretation, aiding in the development of intervention strategies and management of care. Examining key epidemiological quantities alongside each other in a transmission model can provide quantitative insights into the outbreak, testing the potential impact of intervention strategies and predicting the risk posed to the human (or animal) host population and to healthcare preparedness. Mathematical modelling has been used as part of the planning process during outbreak response by governments worldwide for many recent outbreaks. For example, the UK Department of Health has a long-established committee, the Scientific Pandemic Influenza Group on Modelling (SPI-M), to advise on new and emerging respiratory infections [22]. One of the largest instances of such an outbreak in recent history was the 2009 H1N1 pandemic. The World Health Organisation developed a network of modelling groups and public health experts to work on exploring various characteristics of the outbreak [12, 78]. These ranged from characterising the dynamics of the outbreak to investigating the effectiveness of different intervention strategies. This integration of mathematics into policy design indicates the important insights that modelling and statistics can provide.
This paper is a collection of work-streams addressing various technical questions faced by the group as part of the ongoing response to COVID-19, and as such is written to reflect the experience we have gone through and are currently going through. Therefore, to aid the reader, each section includes results and a short discussion. Many of the questions and techniques presented here can be further developed as the availability of data and research interests evolve, but they are compiled into this manuscript as an overview of methodology and scientific approaches, beyond the standard SIR textbook model, that benefit the ongoing efforts in tackling this and other outbreaks. First documented in December 2019, an outbreak of community-acquired pneumonia began in Wuhan, Hubei province, China. In January, this outbreak was attributed to a novel coronavirus, SARS-CoV-2. The initial spread of the pathogen in Wuhan was fast, and after a period of case-finding and contact tracing, China moved to implement a 'shutdown' of Wuhan on January 23, and of other cities in China in the following days, to try to suppress the growth of the epidemic. These measures may have succeeded in slowing down the rate at which cases have been seeded elsewhere, but in many countries initial importation of cases and transmission has not been contained. Countries around the world are now seeing outbreaks that are overwhelming, or have the potential to overwhelm, healthcare systems and cause a high number of deaths even in high-income countries [64]. While the majority of documented symptomatic cases are mild, characterised in many reports by a persistent cough and fever, a significant proportion of these individuals go on to develop pneumonia, with some then developing acute respiratory failure and a small proportion of overall cases becoming fatal.
Severity of symptoms has been observed to increase with age and with the presence of underlying health conditions such as diabetes [23] and cardiac conditions, with some evidence that severity of symptoms might depend on gender and ethnicity [31, 65, 82, 83, 85]. SARS-CoV-2 has a fast doubling time (the time it takes for the number of cases in a region to double, estimated at approximately 3 days [63]) and, potentially, a very large R0 (the average number of infections caused by each infected individual, with estimates ranging from 1.4 to 6.47 [48, 50, 51, 59]). It is possible that there is a significant degree of asymptomatic and/or pre-symptomatic transmission [46, 52, 56], though without robust serosurveys this is difficult to quantify with certainty. These characteristics result in the pathogen being able to spread widely, rapidly and undetected, presenting a significant risk to public health. Typically, the aim of an intervention strategy would be to push and keep the reproduction number R_t, defined as the average number of cases generated by a typical infective at time t, below 1. At this point each infected individual subsequently infects, on average, less than one individual, such that the number of cases should decline. The basic reproduction number, R0, represents the initial value of R_t, before any intervention is put in place and while the population can be assumed to be fully susceptible. High R0, fast growth, and possible pre- or asymptomatic infection make the design of potential interventions, and the modelling that would inform them, particularly challenging.
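The cited doubling time translates into an exponential growth rate via r = ln 2 / T_d. As a rough illustration (not a computation from the text), the SIR-type approximation R0 ≈ 1 + r/γ shows how the assumed mean infectious period 1/γ moves R0 across a range comparable to the cited 1.4-6.47; the infectious periods below are assumptions.

```python
import math

doubling_time = 3.0                 # days, as cited above
r = math.log(2) / doubling_time     # exponential growth rate per day
print(f"growth rate r = {r:.3f}/day")

# Illustrative SIR-type approximation R0 ~= 1 + r/gamma for several
# assumed mean infectious periods 1/gamma (assumptions, not from the text).
for period in (2.0, 5.0, 10.0):
    print(f"1/gamma = {period:4.1f} days -> R0 ~= {1 + r * period:.2f}")
```

This sensitivity to the generation-time assumption is one reason the published R0 estimates span such a wide interval.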
Large values of R0 mean a substantial amount of transmission needs to be halted; fast growth causes the number of cases in the absence of interventions to rise rapidly, so that the time scale of interventions to reduce R0 must also be fast in order to effect substantive early changes on a population level; finally, the resulting interventions must encompass possible pre- and asymptomatic cases, a challenging prospect when in many instances these individuals are indistinguishable from healthy individuals. Consequently, we must consider the possibility of interventions that are massively disruptive to society and may have to be sustained for a long period of time in order to cause the number of infections to decline towards zero [25]. If infections remain, and the susceptible proportion of the population remains above the herd immunity threshold, these interventions must be upheld to prevent a second wave of the epidemic. There is not yet conclusive evidence as to the degree and duration of immunity conferred by infection with SARS-CoV-2, nor the feasibility of a vaccine, the timeline for which is unlikely to be any shorter than 18 months at the time of writing [35]. Therefore, short-term extreme interventions are not as effective as they might be in other circumstances, since after their removal there remains a long period of time in which cases can rise again. The longer these significantly suppressive and disruptive interventions are in effect, the more severe the effect on the economy and on broader societal health and well-being. Furthermore, adherence to interventions will likely vary with their duration and severity. We are further challenged by the lack of transferable intuition. Early work drew on intuition gained from the SARS and MERS outbreaks, also caused by coronaviruses. Some parameters do appear to be similar to these pathogens, such as the average length of the incubation period [44, 79, 80].
However, there are also clear differences, with both SARS and MERS being more fatal but seemingly less efficient at spreading, since they did not seed major global pandemics. Another complication is the spread of the infection during the Chinese Spring Festival, a time period during which movement, social, and contact patterns vary significantly. This presents significant challenges, as experience and intuition from other studies regarding population mixing and spatial patterns must either be modified or set aside. Furthermore, the pandemic has received a proportionately larger level of public attention than, e.g., the 2009 H1N1 pandemic [19, 68], largely boosted by social media. This greater level of public awareness, and the successive, staggered interventions placed to prevent disease spread, are responsible for significant variations in behaviour [16, 27] and adherence to public guidance, both in China and abroad. The structure of this paper follows two main themes. In Section 2, we discuss various biases that are present in outbreak data and techniques for estimating epidemiological parameters. Accounting for biases and producing robust parameter estimates is important throughout the duration of an epidemic, both for increasing our understanding of the underlying dynamics and for feeding into models. Firstly, we discuss a bias-corrected method for estimating the incubation period, which can also be applied to serial intervals, onset-to-death times, and other delay distributions. We then present a method for estimating the true growth rate of the epidemic, accounting for the bias encountered since infected individuals may be exported from the region. Our next method is a tool for estimating the expected size of the next generation of infectives based on the rate of observed cases. This tool provides insight into the size of small outbreaks, which can inform decision making when trying to prevent a major outbreak taking off.
In Section 3, we propose a variety of mathematical models looking at disease impact and intervention strategies, with particular focus on non-pharmaceutical interventions due to the current lack of widely deployable, targeted pharmaceutical treatments. These models focus on enclosed populations, since this is the level at which most interventions are implemented. Since the disease is particularly fatal in the elderly and other at-risk groups, we develop a care home model to investigate how the pathogen may spread through care homes. We also develop household models to investigate the impact of different intervention/control strategies. These models can inform policy design for mitigating or controlling epidemic spread. Finally, in the context of relaxing strong social distancing policies, we investigate the extinction probability of the pathogen. We first consider the extinction probability after lifting restrictions. We then develop a household-based contact tracing model, with which we investigate the extinction probability under weaker isolation policies paired with contact tracing, thus shedding light on possible combinations of interventions that allow us to feasibly manage the infection while minimising the social impact of control policies.

2. Biases and estimation during outbreaks

2.1. Potential biases in the outbreak data

Techniques are constantly developing that enable higher volumes of more accurate data to be collected in real time during an epidemic. These data present a large opportunity for analysis to gain insight into the pathogen and the dynamics of the outbreak. However, although the quality of the data is constantly increasing, there are still many biases present. Some of these are due to the data collection methods, and in an ideal world we would be able to eliminate them; some are simply due to the nature of the outbreak, and will be present regardless of data collection methods.
during an outbreak, many parameters depend on delay distributions (the length of time between two events), such as the time from infection to symptom onset (the incubation period). if an individual could be followed indefinitely, it would be easy to determine the length of these delays. in reality, however, only events that occur before a given date are observed, so the data are subject to censoring and truncation issues. for the incubation period, for example, censoring comes into play because, if we have observed an infection but the individual has not yet developed symptoms, we only have a lower bound on how long it will take them to develop symptoms. to account for this, we can instead condition on observing symptom onset before the cut-off date. however, this leads to a truncation issue, since individuals who were infected close to the cut-off date will only be observed if they have a short incubation period, which leads to an over-representation of short delays. the number of cases tends to grow exponentially during the early stages of an outbreak, causing the force of infection and the number of reported cases to increase with time. this further complicates the truncation issue, since recent cases are not only truncated but also account for the majority of cases. the growing force of infection also needs to be accounted for: if the potential time of infection is interval-censored rather than observed directly, the probability that the case was infected on each day of that interval is not constant. in theory, both of these biases are relatively straightforward to account for. in practice, however, there are other biases in the data. one of the major biases is the reporting rate. although the total number of cases may be reasonably described as growing exponentially at a constant rate in the early stages of an outbreak, high-resolution data may exhibit more complex behaviour.
this can be due to a variety of reasons, such as the workload becoming overwhelming, the availability of individual-level data decreasing, laboratories or offices slowing down activity over the weekend, the case definition changing, testing capabilities increasing, and so on. another uncertainty arises because generally only the date of each event is recorded rather than the time. this introduces a large window of uncertainty in the length of a delay: the time of each event can vary by up to 24 hours, so a delay distribution, which depends on two events, can vary by up to 48 hours. the travel rate is another source of bias in the data. for example, travel changes the density of observed cases in a region, which can change the apparent growth rate. intervention strategies present a further bias, because they can change both the growth rate of the epidemic and the reporting rate. additionally, estimates of certain parameters may vary depending on the interventions that are implemented, so these need to be considered carefully. to model the incubation period, we require information on when an individual was infected and when they expressed symptoms. observing the exact time of infection is unlikely, but it can be possible to find potential exposure windows. we consider three different data sets. the first two consist of individuals who travelled from wuhan before expressing symptoms. we can assume these individuals were infected in wuhan, since at the time of this data the force of infection was significantly higher in wuhan than elsewhere. the length of time spent in wuhan therefore provides a window during which each individual became infected, and for many of these individuals we also have the date of symptom onset. in the early stages, the growth rate in reported cases was constant, and depended on the epidemic growth rate in wuhan and the rate at which people left wuhan.
by using travel to estimate the true number of cases, we estimate the exponential growth rate in wuhan as r = 0.25 (see section 2.3). therefore, the force of infection on day i, g(i), is proportional to e^{0.25 i}. after 23 january, when significant travel bans were introduced, the rate at which individuals left wuhan diminished significantly, causing the reporting rate for our sample dataset to drop suddenly. this occurs because cases are only included if we have a fixed window of time spent in wuhan prior to developing symptoms. therefore, if the data are truncated after 23 january, the reporting rate must be appropriately adjusted. this is illustrated in figure 1a. the difference between the first two datasets is the truncation date, with the first truncated at 20 january and the second at 9 february. the third dataset contains cases that were infected through a discrete infection event, such as spending time with a known infected case. in this "non-wuhan" dataset, the reporting rate is constant and the force of infection can be assumed constant over each exposure window. the source we use for these three data sets is a publicly available line-list [74]. incubation periods, and many other delay distributions, are generally observed to be right-skewed. we therefore choose to use a gamma distribution, though other distributions, such as the weibull and log-normal, can also be applied with the proposed methods. to fit the data, we use maximum likelihood estimation. to adjust for the biases, we use a "forwards" approach [55, 69, 73, 75], where we condition on the time of the first event (the time of exposure) and find the distribution looking forward to the second event (the time of symptom onset).
for a data point {a_i, b_i, y_i}, where infection occurs between a_i and b_i and y_i is the symptom onset date, the likelihood contribution is

L_i(θ) = ∫_{a_i}^{b_i} g(t) f_θ(y_i − t) dt / ∫_{a_i}^{b_i} g(t) dt,

where g(·) is the density function of the infection date and f_θ(·) is the density function of the incubation period, parameterised by θ. from this, the likelihood function for our dataset x is given by L(θ; x) = ∏_i L_i(θ). this approach is independent of the reporting-rate bias, since the reporting rate depends on the date an individual leaves wuhan (b_i), which is conditioned on (see appendix 5). we use the mean and standard deviation to characterise the maximum likelihood estimate. since the tail of the incubation period is important when designing quarantine strategies, we then calculate the probability that the incubation period is longer than 14 days and find the earliest day by which 99% of cases will have expressed symptoms (excluding true asymptomatic cases). we also investigate the reporting-date uncertainty mentioned in section 2.1 by considering the different extremes that the data could represent; this is achieved by adding or subtracting a day from all recorded dates. methods accounting for truncation and growth biases in epidemic data have been discussed widely in the literature [36, 55, 72, 76], but there are fewer applications to outbreaks [24]. in the context of covid-19, some estimates have considered the growing force of infection, for example [44], and some approaches have considered truncation, for example [47]. however, these attempts do not adjust for the reporting rate in the data or use the correct force of infection, causing the incubation period to be overestimated. although the method presented here is independent of the reporting rate, other approaches for estimating the incubation period are not. here we demonstrate the importance of truncation (table 1). we use the data truncated at 20 january, which has exposure windows between 1 december and 19 january.
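the forward-likelihood fit described above can be sketched numerically. the following is a minimal illustration (not the authors' code): it fits a gamma incubation period by maximising the forward likelihood, weighting the infection-date density inside each exposure window by the assumed exponential growth e^{0.25 t}. all function names and the synthetic data are our own.

```python
import numpy as np
from scipy import stats, optimize, integrate

R = 0.25  # assumed exponential growth rate of the force of infection

def point_loglik(params, a, b, y, r=R):
    # forward log-likelihood for one case with exposure window [a, b] and
    # symptom onset y; infection-date density g(t) is proportional to exp(r t)
    shape, scale = params
    if shape <= 0 or scale <= 0:
        return -np.inf
    ts = np.linspace(a, b, 200)
    g = np.exp(r * (ts - a))                 # unnormalised infection density
    f = stats.gamma.pdf(y - ts, a=shape, scale=scale)
    num = integrate.trapezoid(g * f, ts)
    den = integrate.trapezoid(g, ts)
    return np.log(num / den + 1e-300)

def fit_incubation(windows, onsets):
    # maximise the summed forward log-likelihood over (shape, scale)
    nll = lambda p: -sum(point_loglik(p, a, b, y)
                         for (a, b), y in zip(windows, onsets))
    res = optimize.minimize(nll, x0=[3.0, 1.5], method="Nelder-Mead")
    return res.x

# quick check on synthetic data with a known gamma(3.0, 1.6) incubation period
rng = np.random.default_rng(0)
inf_t = rng.uniform(0.0, 10.0, 150)
onsets = inf_t + rng.gamma(3.0, 1.6, 150)
windows = [(t - 1.0, t + 1.0) for t in inf_t]
shape, scale = fit_incubation(windows, onsets)
```

on synthetic data with a true mean of 4.8 days, the fitted mean (shape × scale) lands close to the truth, illustrating that conditioning on the exposure window recovers the underlying distribution.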
this data set is chosen since it is the most sensitive to truncation, owing to the exponentially growing force of infection and high reporting rate. without accounting for truncation, the length of the incubation period is significantly underestimated, which could have a large impact on the success of intervention strategies. to demonstrate the effectiveness of the bias-correction method, we compare three different data sets (table 2). the similar distributions predicted across these datasets suggest the method is robust. figure 1b compares the full distributions for these three estimates. here we investigate the effect that uncertainty in the reporting date can have on the results, using the data truncated at 9 february (table 3). the standard interval is the recorded data; wide intervals are obtained by removing a day from the exposure window's lower bound and adding a day to the upper bound, and narrow intervals vice versa. the uncertainty in the reporting date can impact the estimated incubation period, showing that it is important to consider this risk when designing interventions. when constructing intervention strategies for an epidemic, the incubation period is an important parameter. for example, consider the quarantine strategy deployed in many countries during the early stages of the epidemic, aimed at preventing cases being imported from wuhan. this strategy quarantined individuals upon their return from wuhan for 14 days. for such a strategy to be effective, we require most incubation periods to be less than 14 days, so that the majority of infected people would develop symptoms before quarantine ended, enabling them to be further isolated. in this analysis, we show that in the worst-case scenario we would expect 1 in 62 cases to slip through this quarantine, with the best fit predicting 1 in 101 cases. therefore, the 14-day quarantine period would capture the majority of cases.
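the quarantine tail calculation above can be reproduced from any fitted gamma distribution. a minimal sketch, using illustrative moments (mean 4.84, sd 2.79 days, close to the table 2 fits) rather than the paper's exact estimates:

```python
from scipy import stats

def gamma_from_moments(mean, sd):
    # moment-match a gamma: shape k = (mean/sd)^2, scale = sd^2/mean
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return stats.gamma(a=shape, scale=scale)

incubation = gamma_from_moments(4.84, 2.79)   # illustrative moments
p_escape = incubation.sf(14.0)   # P(incubation > 14 days): quarantine escape
day_99 = incubation.ppf(0.99)    # earliest day by which 99% show symptoms
```

with these illustrative moments the escape probability is below 1%, the same order of magnitude as the 1-in-62 to 1-in-101 range quoted above.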
throughout the epidemic, this seems to have been reasonably successful and prevented early seeding of cases in many countries. however, potentially due to complicated travel patterns or asymptomatic transmission, some cases have slipped through detection and not been quarantined, which unfortunately has led to the situation observed today. in addition to the incubation period, there are many other delay distributions that must be estimated while an epidemic is growing, and these can be estimated using the same technique. they include the generation time, the time between two infection events in a transmission chain; the serial interval, the time between symptom onset of an infector and their infectee; and the onset-to-death delay, the time from symptom onset to death. transportation modelling plays a crucial role in the early stages of an outbreak: an infected individual may travel outside of the region in which they were originally infected and seed further infections across geographical scales that are impossible to contain. furthermore, as the rate of travel increases, the number of observed cases within the known "origin" region decreases, and if exportation is not taken into account this results in an underestimation of the number of cases. these underestimates can be improved by looking at the total number of cases across all known affected regions, but doing so introduces further complications. for example, an individual with less severe symptoms may not seek medical assistance, and thereby not be recorded as a case at their destination. this underestimation of cases can have significant effects if the traveller is able to infect more people: a new transmission chain can be started that remains undetected for some time due to the lack of a known connection to the "origin" region. in the "origin" region, an individual with mild symptoms may still be tested for infection due to a higher level of alertness in the local health care system.
however, this level of active case-finding may not be present elsewhere, or may not have been allocated a comparable level of resources. further complications arise from the incubation period of individuals, during which detection is unlikely, and from the variations in movement and mixing between people when preventative measures are put in place. we consider a metapopulation model seeded with an infection in one of the regions, o, and investigate how exportation from this region, combined with variability in case-finding, can alter estimates of the doubling time and the expected proportion of the population we expect to identify. this in turn bounds the proportion of the infected population one would be able to target for personal intervention (e.g. quarantine or treatment). note that the proportion of identified cases need not necessarily correlate with the proportion of the infected population who exhibit symptoms. let us assume that movement from o begins at a given time, and let φ denote the mean case-finding ability across all destinations. in the presence of real-time transition probabilities p_ij of moving between two regions, these estimates can be further elaborated. we assume that detection occurs immediately following the end of the incubation period. similarly, we assume a gamma-distributed incubation period with shape and scale parameters k and θ, respectively. we can parametrise this distribution using the "non-wuhan" estimate of the incubation period in table 2, which yields a gamma distribution with mean 4.84 and standard deviation 3.22. in contrast to other values in the table, this estimate is obtained from discrete infection events, e.g. contact with a known infected case, and therefore has a constant reporting rate and a constant force of infection over each exposure window. this estimate of the incubation period therefore does not rely on the exponential growth rate, unlike other estimates from section 2. accounting for φ alters the estimated doubling time.
this difference may seem small, but it reduces the doubling time by approximately 12 hours. the expected value of r grows linearly with the exportation rate, which has also been observed with real-time travel models [42]. further models have been developed that consider travel and exportation of cases in greater detail [20, 30]. the relationship between the observed cases in our origin and destinations can be used to determine the case-finding ability, though it should be noted that φ likely varies with time as burdens on public services increase and the number of cases grows. early estimates using data from [1] indicate at most an 80% case-finding ability, suggesting thousands of undetected cases exported to other regions of china, a sufficient quantity to sustain further transmission post-exportation, independently of the number of asymptomatic cases present. the intention of these estimates is not to provide specific values for the doubling time of the spread of covid-19 in china (the estimates above use historic travel data and are limited by the availability of data), but to bring attention to the unusual circumstances surrounding changes in contact patterns and mobility during the chinese spring festival, the largest human migration on earth [38]. failing to account for the significant level of dispersion or exportation of cases under these circumstances will significantly skew our estimates. in a scenario where a single individual exposes a group to infection, it can be unclear how many people have been infected, since those infected do not immediately develop symptoms. however, knowing the true prevalence in the population is essential to determine the most effective interventions to put in place, and to estimate future burdens on public services. using the probability density function of the incubation period, we consider the efficacy of using the time it takes for people to present with symptoms as a predictor of the size of the infected group.
this analysis is an effective ready reckoner in the early stages of a novel infection, or in close-contact environments, and is useful for predicting generation size when a complete data set is not yet available. in this analysis we focus on a scenario where the infection time is known. in reality, we may only know an exposure window. for short exposure windows this method can still be valid, but for longer windows it would need extending to account for the added uncertainty. we assume that the number of individuals who have been exposed to potential infection is known, in which case the number of people infected can be assumed to be binomially distributed with an unknown probability p that each individual has been infected. to determine the distribution of infected individuals, we use the available information on the number of individuals who have expressed symptoms. this yields two cases. in the first case, we assume that the true number of symptomatic individuals is observed. in the second case, we take the number of observed symptomatic individuals as a lower bound on the true value. we wish to determine the probability that the first generation has e_0 individuals, E_0 = e_0, given that i_τ symptomatic individuals have been observed by day τ, I_τ = i_τ; the resulting expression for P(E_0 = e_0 | I_τ = i_τ) is derived in appendix 6. this gives a distribution of the generation size based on the number of observed symptomatic individuals by time τ. we can extend it to investigate a scenario where no symptomatic individuals have been observed by time τ by using a value of 0 for i_τ: this can be used to illustrate worst- and best-case scenarios given that time τ has passed without symptomatic individuals. additionally, by considering the probability that E_0 = 0, we can find the value of τ at which we can have 95% confidence that there will not be a second generation. this analysis considers the case where the number of observed symptomatic individuals to date is the true number.
in practice, however, we do not generally observe every symptomatic individual, so the number of observations is only a lower bound on the true number. to address this, rather than considering i_τ as the total number of people who have developed symptoms by time τ, we can define i_τ as the minimum number of people who have developed symptoms by time τ. we assume that each true number of symptomatic individuals consistent with this lower bound is equally likely, with probability 1/(i_τ + 1). we can then use the same methods as above to infer a distribution for p. details are provided in appendix 7.

figure 2: distributions of the generation size, E_0, in an infection event in which 20 people were exposed. a, b and c show the density when the number of observed symptomatics is taken to be the true number of symptomatics; d, e and f consider the case where the observed symptomatics are a lower bound on the true symptomatics. a and d consider the case when zero symptomatics are observed after 5 days, b and e when 5 are observed after 5 days, and c and f when 5 are observed after 10 days. the incubation period has been modelled as a gamma distribution with a mean of 4.84 and standard deviation of 2.79 (table 2).

as we can see from figure 2, this method can be used to predict the number of infected individuals in the original exposed group. however, we have also demonstrated the importance of caution when interpreting these data. if there is uncertainty surrounding the presentation of symptomatic patients, using i_τ as a lower bound is a robust method to ensure the size of the generation is not underestimated.

3. modelling intervention strategies

when designing intervention strategies, we need to consider how adherence may alter their effectiveness. this is important, since highly effective interventions may not be adhered to if they impose a great individual cost on a population. in this case, a theoretically less effective intervention may perform better, if it sufficiently reduces the individual-level cost.
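the generation-size calculation (in the first case, where the true number of symptomatics is observed) can be sketched as follows. with a uniform prior on the infection probability p, integrating p out gives a flat prior over the generation size, so the posterior is proportional to the binomial probability of observing i_τ symptomatic cases by day τ; this is our reading of the appendix-6 formula, and the function name and default parameters are illustrative.

```python
import numpy as np
from scipy import stats

def generation_size_posterior(n_exposed, i_obs, tau, mean=4.84, sd=2.79):
    # posterior over the number initially infected, e, given i_obs
    # symptomatic by day tau; a uniform prior on the per-person infection
    # probability p implies a flat prior over e
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    F = stats.gamma.cdf(tau, a=shape, scale=scale)  # P(symptoms by tau | infected)
    e = np.arange(0, n_exposed + 1)
    like = stats.binom.pmf(i_obs, e, F)             # zero for e < i_obs
    return e, like / like.sum()
```

for example, with 20 exposed and zero symptomatics after 5 days, the posterior mode is at zero infections, mirroring panel a of figure 2.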
in this section, we illustrate the potential impact of adherence on the effectiveness of interventions using a toy model. consider a standard sir model, and denote by s(t) and r(t), respectively, the susceptible and recovered/immune fractions of the population at time t. we can write s in terms of r as s(t) = s(0) e^{−r_0 r(t)}. if an intervention is put in place that (with full adherence) reduces r_0 below 1, then the outbreak will be controlled. indeed, let us assume that r_0 is reduced to zero by the intervention: for example, assume that social distancing is perfect, so that the number of contacts of a fully adherent individual is zero. if only 50% of people adhere to the intervention, then the average number of contacts is effectively halved, and logically r̂_0 = r_0 / 2 = 1.5 (the hat representing quantities post-intervention), giving r̂(∞) ≈ 0.58 in this case. however, this assumes that adherence is an independent random process at each contact: for each contact an individual would ordinarily make, they "toss a coin" to decide whether to isolate or not. in reality, individuals are more likely to show polarity, where some individuals reduce all their contacts and follow the measures, and a proportion of individuals choose not to adhere to the intervention at all. if there were distinct polarity in the population, such that 50% adhered perfectly and 50% ignored policy, then a toy model can be created with two infectious groups, i_a and i_b, that behave differently. in this case

ṡ = −β s i_b, i̇_a = (β/2) s i_b − γ i_a, i̇_b = (β/2) s i_b − γ i_b, ṙ = γ (i_a + i_b),

where a dot over a variable represents its time derivative. such an epidemic model, where the two groups have the same susceptibility but different infectivity, has the same final size as an epidemic in a single-type model with the same r_0 (e.g. see [5]).
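the contrast between independent and polarised adherence can be checked numerically. a minimal sketch under invented rates (β = 0.6, γ = 0.2, so r_0 = 3 and r̂_0 = 1.5 at 50% adherence), verifying that the two variants share the same final size:

```python
import numpy as np
from scipy.integrate import solve_ivp

beta, gamma = 0.6, 0.2   # pre-intervention r0 = beta / gamma = 3

def single(t, y):
    # independent adherence: every contact halved, transmission at beta/2
    s, i, r = y
    inf = 0.5 * beta * s * i
    return [-inf, inf - gamma * i, gamma * i]

def polarised(t, y):
    # 50% adhere fully (group a, no onward transmission), 50% ignore (group b)
    s, ia, ib, r = y
    foi = beta * s * ib
    return [-foi, 0.5 * foi - gamma * ia, 0.5 * foi - gamma * ib,
            gamma * (ia + ib)]

sol_s = solve_ivp(single, (0, 400), [1 - 1e-4, 1e-4, 0.0],
                  rtol=1e-8, atol=1e-10)
sol_p = solve_ivp(polarised, (0, 400), [1 - 1e-4, 0.5e-4, 0.5e-4, 0.0],
                  rtol=1e-8, atol=1e-10)
final_s = 1 - sol_s.y[0, -1]   # attack rate, independent adherence
final_p = 1 - sol_p.y[0, -1]   # attack rate, polarised adherence
```

both runs converge to an attack rate of roughly 0.58, consistent with r̂(∞) ≈ 0.58 above, while the epidemic curves differ in timing.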
however, the two epidemics have different durations, as can be seen in figure 3, which shows that the two assumptions about the nature of adherence predict the same growth rate and final size, but that the more polarised adherence gives faster early growth and therefore an earlier peak. more complicated model structures could be constructed by incorporating adherence to the intervention by susceptible states, which would lead to core-group dynamics (see for example [37]). this issue of independent versus polarised adherence is related to the idea of all-or-nothing versus leaky vaccination [29, 49], where one either vaccinates a fraction of the population with 100% efficacy or vaccinates 100% of the population with reduced efficacy [34]. note, however, that vaccination reduces susceptibility (either solely or in addition to infectivity), rather than only infectivity as in the model discussed above, and variation in susceptibility does reduce the final size, with imperfect coverage with a perfect vaccine (all-or-nothing) leading to a lower final size than full coverage with a leaky vaccine (all individuals having the same mean susceptibility). the ongoing covid-19 outbreak is known to have higher mortality rates amongst the elderly, the immunocompromised, and those with respiratory and health complications [31, 82, 83, 84, 85]. in this section, we model the introduction of an infectious disease into care homes, in order to obtain estimates of the final size of the epidemic in this vulnerable population as well as predictions for the number of hospitalisations and fatalities. modelling of care homes in the uk is conducted against the backdrop of a wider epidemic in the general population, which we here assume to be following seir dynamics with a basic reproduction number r_0 that might differ from the within-care-home reproduction number r_c. care homes are assumed to be closed populations, with the infection entering each of them independently with a certain probability.
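the all-or-nothing versus leaky comparison can be illustrated by iterating the standard final-size fixed-point equations. a minimal sketch with invented values r_0 = 3 and 50% coverage/efficacy:

```python
import math

def final_size(f, z0=0.99, iters=500):
    # iterate the final-size fixed-point equation z = f(z) from a high start,
    # converging to the nontrivial root
    z = z0
    for _ in range(iters):
        z = f(z)
    return z

r0, c = 3.0, 0.5   # illustrative reproduction number and coverage/efficacy

# all-or-nothing: a fraction c is fully immune, the rest fully susceptible
z_aon = final_size(lambda z: (1 - c) * (1 - math.exp(-r0 * z)))

# leaky: everyone susceptible, with susceptibility scaled by (1 - c)
z_leaky = final_size(lambda z: 1 - math.exp(-(1 - c) * r0 * z))
```

under these assumed values the all-or-nothing final size is noticeably smaller than the leaky one, illustrating the claim that variation in susceptibility reduces the final size.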
infection is seeded only once, and within-care-home outbreaks then evolve independently from, and do not contribute to, other care home outbreaks and the epidemic in the background population. to keep track of hospitalisations, we model the within-care-home infection dynamics using a compartmental model that, in addition to the seir compartments, has compartments for mildly symptomatic prodromal cases (p), who show at most mild symptoms but are capable of transmitting the virus; those who recover from the disease after mild symptoms that did not require hospitalisation (m); those who have severe symptoms and are admitted to hospital (h); those who recover after hospitalisation (r); and those who die (d). this is illustrated in figure 4. the infection pressure up to time t for a median-sized care home is the integral from 0 to t of the force of infection (foi) applied to the care home from all infectious sources, multiplied by a probability p. this probability represents the probability of the infection being introduced to a median-sized care home. for other care homes, we allow this probability to be proportional to their size, under the assumption that larger care homes employ more staff and are therefore at higher risk of introduction. when the infection pressure exceeds an individual care home's resilience threshold, that care home begins its own deterministic infection dynamics with a single initial infected case. the equations describing the background epidemic and the within-care-home epidemic are given in appendix 8. figure 4: compartmental model for disease dynamics within a care home. we extend a deterministic seir model to include compartments for prodromal (infectious) cases (p), mildly symptomatic cases that recover without requiring hospitalisation (m), cases that do require hospitalisation and are removed from the care home (h), cases that die in hospital (d), and cases that recover in hospital (r).
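the within-care-home dynamics of figure 4 can be sketched as a small ode system. the rates and branching probabilities below are placeholders, not the fitted values of table 4:

```python
import numpy as np
from scipy.integrate import solve_ivp

def care_home(t, y, beta, sigma, delta, gamma, p_hosp, nu, p_die):
    # s-e-p-i dynamics, then either mild recovery (m) or hospitalisation (h),
    # with hospitalised cases recovering (r) or dying (d); h, r, d have left
    # the home and do not contribute to the force of infection
    s, e, p, i, m, h, r, d = y
    n = s + e + p + i + m
    foi = beta * (p + i) / max(n, 1e-9)
    return [
        -foi * s,                     # susceptible
        foi * s - sigma * e,          # exposed, not yet infectious
        sigma * e - delta * p,        # prodromal, infectious
        delta * p - gamma * i,        # symptomatic infectious
        gamma * (1 - p_hosp) * i,     # recovered after mild disease
        gamma * p_hosp * i - nu * h,  # in hospital
        nu * (1 - p_die) * h,         # recovered after hospitalisation
        nu * p_die * h,               # died
    ]

n0 = 29   # roughly the mean uk care home size
y0 = [n0 - 1, 0, 1, 0, 0, 0, 0, 0]   # one initial prodromal case
sol = solve_ivp(care_home, (0, 200), y0,
                args=(0.6, 0.25, 0.5, 0.2, 0.5, 0.1, 0.3),
                rtol=1e-8, atol=1e-10)
```

the compartments sum to the initial population throughout, and an uncontrolled within-home outbreak at these rates produces a substantial number of hospitalisations and deaths.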
we run this model on data for the entire care home population in the uk: approximately 15,000 care homes with a total population of approximately 450,000 residents [21]. care home sizes range from 1 to 215, with a mean size of 29.4. in this model we only consider the vulnerable population within care homes. we assume r_0 = 1.5 in the background epidemic, a relatively low value that loosely accounts for a certain degree of control, and r_c = 3, to allow relatively explosive epidemics in care homes due to potentially frailer individuals, difficulty in isolation, and staff inadvertently passing the infection from one case to the next. the other parameters in the baseline scenario are reported in table 4. apart from the reproduction number, the background epidemic uses the same parameters as the care home epidemic. however, the background model only has rates from e to i and from i to r, which are taken to be the rates from e to p and from i to m, respectively. there are a variety of assumptions underpinning this model. firstly, the background epidemic ignores structure and assumes homogeneous mixing. this is likely to make the peak more pronounced, so it presents a worst-case scenario for the demand on hospital beds. the value of r_0 assumed for the background epidemic only affects the shape and duration of the background epidemic, since the tuneable parameter p controls the risk of introduction to the care home. that is, if r_0 is small, a large p still presents a high force of infection into the care homes. therefore, for a fixed p, we expect that changing r_0 does not affect the total number of deaths, but it does change the peak hospitalisation incidence, because a faster and more explosive background epidemic makes the epidemics in care homes more synchronised.
in fact, when testing the impact of a longer, flatter background epidemic, for example obtained by simulating three slightly desynchronised background seir epidemics, the results show lower peaks and much more variable timing (not shown). assuming each care home is independent may not be realistic, since staff are likely shared between multiple homes, in which case they can act as vectors of transmission between homes. however, in the current model outbreaks are already quite synchronised (the within-care-home outbreaks occur at similar times), so the effect of this assumption is likely to be minimal. the final major assumption is that the epidemic within each care home is deterministic. this removes the possibility of random extinction and random delays, and should ideally be relaxed with a stochastic model, given that half of care homes have fewer than 25 residents. however, the extinction probability is very low with r_c = 3, so this stochastic effect is unlikely to have a large impact. random delays, instead, may change the shape and timing of the epidemic, which could potentially reduce the peak burden. therefore, this model represents a worst-case scenario. in the absence of a cure or vaccine for covid-19, governments worldwide must rely on non-pharmaceutical interventions (npis) to control the outbreak [32]. a natural such intervention is to ask individuals who express symptoms similar to those of covid-19 to isolate themselves, but variants of such individual isolation include policies sometimes referred to as household isolation, household quarantine, and mixed isolation. in this section, we investigate how such strategies affect the spread of the epidemic, bearing in mind that adherence to each intervention may differ. individual isolation relies on individuals staying in isolation when they express symptoms, thereby stopping transmission. however, there is potential for asymptomatic or prodromal transmission before they go into isolation.
additionally, isolation strategies generally ask infected individuals to remain at home, which presents an infection risk to the other members of their household, who may go on to spread the infection. the term 'household isolation' refers to a policy where, upon first detection of symptoms within a household, all individuals in the household go into isolation for a fixed duration of time. this strategy reduces the risk that other household members, if infected within the household, transmit in the community while pre-symptomatic (and hence before they themselves self-isolate) or while asymptomatic but still infectious. a blanket policy invoking a fixed duration of household isolation might cover the full epidemic in a small household. however, a larger household might experience multiple generations of infection, potentially extending the within-household outbreak beyond the fixed duration of the household isolation policy. to address this issue, 'household quarantine' is another potential strategy: upon detection of symptoms, the entire household is isolated until a fixed duration of time after the last symptomatic case within the household expresses symptoms. this ensures that no symptomatic cases evade the intervention, but applies quite drastic measures to the household. a fourth strategy, which reduces the cost relative to household quarantine, is mixed isolation. here, upon detection of symptoms, the entire household is isolated for a fixed length of time; any subsequent cases within the household then undergo individual isolation as described above. this reduces the risk of cases not being isolated whilst allowing recovered individuals to return to work. there is still some remaining risk that infected individuals may not yet have expressed symptoms by the end of the isolation period, but this risk can be controlled through the duration of each isolation.
although there is now a rich theoretical literature on household models [7, 8, 9], the mainstream methodological tools in this research area have important limitations that make them not directly applicable to studying these control policies. first, exact theoretical or asymptotic results in these models are mostly restricted to time-integrated quantities, i.e. quantities that do not depend on the detailed temporal shape of the infectivity spread by an individual: these are r_0 (or any other reproduction number [10, 60], e.g. the household reproduction number r_*), the probability of a large epidemic, and the epidemic final size [4]. for this reason, the vast majority of the literature relies on the standard stochastic sir model [4], despite its unrealistic infectivity profile. although more recent work has expanded beyond time-integrated quantities, for example considering the real-time growth rate [10, 62], if the interest is in tracking the dynamics of infection spread, a model based on a full temporal representation of between- and within-household dynamics [33] appears necessary. a second limitation of standard household models is the key assumption of constant parameter values. this appears essential for any form of analytical progress. however, in the context of the interventions discussed above, a reduction in transmission between households, as well as a potential increase within the household, requires parameters to change over time. to overcome these limitations, we consider two approaches. the first fully captures both within- and between-household dynamics with a master-equation formalism, i.e. by relying on markovian within-household dynamics and keeping track of the expected number of households in each possible state of their internal dynamics. the second approach places greater emphasis on within-household dynamics, and is fundamentally an independent-households, individual-based, stochastic simulation.
the more limited mathematical tractability is the price to pay for increased flexibility, as the within-household markov assumption is relaxed and exact distributions for delays between events, typically informed by the data, can be input explicitly. although both approaches can account for increased within-household transmission as isolation and quarantine are imposed, we only consider this for the second method here. this aspect allows us to study the increased risk of infection a vulnerable individual in the household would experience following the implementation of a control policy. to model the households in the uk, we construct a realistic distribution of household sizes (which is given in the supplied code). we take this demographic data from the 2001 census [58] . more recent information, though less specific on large household sizes, shows that the sizes of smaller households are largely unchanged over time [57] . 3.3 in this section, we investigate the above intervention strategies under the assumption that a fraction of households adhere 100% to an intervention and the remaining households ignore it. to model the interventions, we implement a dynamical household model that explicitly represents the small sizes of households. the dynamics of the outbreak are simulated using an sepir model. this model assumes five possible states for an individual: susceptible, latent, mildly symptomatic prodromal, symptomatic infectious and removed. individuals are infectious during the mildly symptomatic prodromal state and the symptomatic infectious state. following [17] , we assume that within-household transmission scales with the inverse of the household size to a specified power η. such a model can be used to investigate how the pathogen spreads through and between households. the methodology involved is the use of self-consistent differential equations, first written down by ball [6] . 
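the size-dependent scaling just described can be sketched as follows (the rate constant beta and the function name are ours; only the n^(-η) form, with η taken as 0.8 later in the text, comes from the paper):

```python
def within_household_rate(beta, n, eta=0.8):
    """within-household transmission rate in a household of size n,
    scaled by the inverse of household size to the power eta, following
    the frequency-dependent assumption in the text ([17])."""
    return beta * n ** (-eta)

# larger households have a lower per-contact rate under this scaling:
r2 = within_household_rate(1.0, 2)
r6 = within_household_rate(1.0, 6)
assert r6 < r2
```

this captures the intuition that contacts within a large household are diluted relative to a small one, without transmission vanishing entirely as size grows.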
more recent developments, including numerical methods for these equations, include [13, 33, 41, 66] . important features of this approach include allowing for the small, finite size of each household, in which random effects are important and each pair can only participate in one infection event. here Λ represents infections imported from outside the population of households, and the other terms represent between-household transmissions. in our code, we assume Λ is a step function. results are largely insensitive to the precise choice of Λ; compared to, for example, random seeding of infections in households, starting the whole population susceptible and exposing it to a small amount of external infection for a fixed time period leaves less room for the precise initial condition to influence results, and is more realistic for the situation observed in countries other than china. we take a 'global' intervention as part of the baseline: in particular, we can model phenomena such as school closures that hold during a set of times t, and we call ε the global reduction in transmission. we will generally drop this t-indexing for simplicity, and will also consider only a household isolation strategy (though the other strategies can be considered similarly, with an example of how other strategies could be captured in this model framework given in appendix 9). instead of isolating for a fixed duration, we assume that a fraction α_w of households isolates when there is at least one symptomatic case in the household, and that isolating households leave isolation when no symptomatic cases remain. we make this assumption since it may better capture the behaviour of real households, who are more likely to remain isolated based on the presence of symptoms rather than for a fixed duration, as in the non-markovian household model in section 3.3. using the methods in [13, 66] , it is possible to fit household models of this kind to the overall growth rate r, which we take to correspond to a doubling time of three days. natural history parameters can then be set directly from reasonable estimates: the rate of leaving the latent state is set to the inverse of the latent period, the rate of leaving the prodromal state to the inverse of the prodromal period, and the removal rate to the inverse of the symptomatic period. shaw [71] analyses various household datasets for respiratory pathogens and estimates values for η close to 1, so this is taken to be 0.8. the remaining degrees of freedom are the relative infectiousness of the prodrome (taken as a third) and the probability of transmitting within a pair, which we take as a typical value given by shaw [71] . for the numerical results in figures 6 and 7, using the given parameter values for our baseline scenario (table 5) , we consider a combination of household isolation (which follows all-or-nothing adherence) with a global reduction in transmission (which follows leaky adherence) for three weeks and show the results in figure 6 . the distribution of infectious individuals varies with household size, which is shown in figure 7 for different durations of global intervention. applying household isolation at 65% adherence (α_w = 0.65) manages to reduce the spread of infection, but appears insufficient, in this model and with baseline parameters, for controlling the outbreak in the long term unless other intervention strategies that reduce the global transmission (increasing ε) are adopted at the same time. alternatively, different levels of adherence can be considered to determine if and when control may be achieved purely through household-based interventions. for the model proposed in the next section, we look into the effectiveness of increasing adherence. parameters are given in the main text. 
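the relation between a growth rate and a reproduction number that underlies fitting of this kind can be illustrated with a discretised euler-lotka equation (a sketch under our own simplifying assumption of a point-mass generation interval of 5 days; this is not the paper's household-level fitting procedure):

```python
import math

def reproduction_number_from_growth(r, gen_times, weights):
    """discretised euler-lotka relation: with a generation-interval
    distribution given by (gen_times, weights summing to 1),
    1 = R * sum_i w_i * exp(-r * tau_i), so
    R = 1 / sum_i w_i * exp(-r * tau_i)."""
    return 1.0 / sum(w * math.exp(-r * tau)
                     for tau, w in zip(gen_times, weights))

# growth rate implied by a three-day doubling time, as in the text,
# with an assumed point-mass generation interval of 5 days:
r = math.log(2.0) / 3.0
print(round(reproduction_number_from_growth(r, [5.0], [1.0]), 2))  # about 3.17
```

with a zero growth rate the relation returns a reproduction number of exactly 1, which is the shared threshold discussed later in the section.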
the model described above has the advantage of being able to track the dynamics within the household, as well as the overall epidemic in the population, in a relatively efficient manner. we now discuss a different framework that loses part of the capability of tracking the overall epidemic, but offers further flexibility, both in the impact of policies on the within-household dynamics and in the distributions of delays between events in the infectious life of an individual. we use this model to investigate the relative effectiveness of the different control policies. we also consider allowing recovered individuals to leave the household, even in the context of household isolation or household quarantine. this has no impact on the transmission dynamics, but reduces the individuals' life disruption and the potential economic cost of any policy implemented. this model assumes that there is no reintroduction within households, so each household can only be isolated or quarantined once. the assumption that only one household member is infected from outside is approximately satisfied if we assume homogeneous mixing between households and a large number of households, all fully susceptible at the start of the epidemic. however, the reality of heterogeneous mixing makes reintroduction a likely possibility even early on in the epidemic. this model, therefore, lacks an explicit description of the social network structure beyond the household. for simplicity, we assume that within households all individuals are identical in terms of their disease dynamics, although the method might be extended to allow for different age/risk groups with different disease dynamics. we assume that the level of within-household transmission in a household of size n scales proportionally to 1/(n − 1), though we acknowledge that true transmission is slightly more complex [18] . 
we consider independent households of a given size. each individual is given an indicator of whether they are symptomatic or not (individuals show symptoms independently of each other with probability p_s) and a resilience threshold. this last quantity is drawn from an exponential distribution with mean 1, and represents the overall infection pressure this individual is able to withstand before they get infected. the infection pressure up to time τ is the integral from 0 to τ of the force of infection (foi) applied to this individual from all infectious sources. at the beginning of the within-household epidemic, a single initial case is assumed. time is discretised with a predefined time step of δt = 0.1 days. at any time step, the current infectivity of all infectives in that time step is summed, keeping separate track of the infectivity spread outside and inside the household. an overall measure of the accumulated infectivity within the household is updated at each time step, and when this crosses the resilience threshold of a susceptible individual, they acquire the infection. we assume an individual spends half of their time outside and half inside the household. when self-isolation starts, the assumed adherence a_i represents the fraction of the time spent outside that is shifted from outside to within the household. therefore, under perfect adherence, from the moment symptoms occur the individual stops transmitting outside, but their infectivity within the household grows by 100%. we also explore variations in this compensatory behaviour, so that the time of an individual is split in a more flexible proportion than 1:1. the same argument applies to the other control policies, with adherence levels a_h for household isolation and a_q for household quarantine. 
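the sellke-type construction described above can be sketched for a single household as follows (constant infectivity per infective over a fixed infectious period, and all parameter values and names, are our simplifying assumptions; the paper's model additionally splits infectivity between inside and outside the household and applies policy effects, which are omitted here):

```python
import random

def household_outbreak_sellke(n, beta, infectious_period, dt=0.1, t_max=60.0):
    """single-household sellke sketch: each of the n-1 non-primary
    members gets an Exp(1) resilience threshold and is infected once the
    accumulated within-household infection pressure crosses it."""
    thresholds = [random.expovariate(1.0) for _ in range(n - 1)]
    infected_at = [None] * (n - 1)
    infectives = [(0.0, infectious_period)]  # primary case: (start, end)
    pressure, t = 0.0, 0.0
    while t < t_max:
        # summed current infectivity of all infectives in this time step
        foi = beta * sum(1 for start, end in infectives if start <= t < end)
        pressure += foi * dt
        for i in range(n - 1):
            if infected_at[i] is None and pressure >= thresholds[i]:
                infected_at[i] = t
                infectives.append((t, t + infectious_period))
        t += dt
    return sum(1 for x in infected_at if x is not None)  # secondary cases

random.seed(1)
sizes = [household_outbreak_sellke(4, beta=0.5, infectious_period=7.0)
         for _ in range(200)]
print(sum(sizes) / len(sizes))  # mean secondary cases; near 3 for these values
```

because the force of infection is the same for every susceptible in the household, one shared pressure accumulator with individual thresholds suffices; the per-individual thresholds reproduce the heterogeneous infection times.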
when multiple control policies are in place at the same time, their effect is assumed to be multiplicative: if an individual has symptoms and the household isolates, the outside transmission rate from that individual is reduced from baseline by the product of the corresponding adherence factors. the area under the expected curve of infectivity spread outside the household is known in the literature as the household reproduction number, typically denoted r*. if enough transmission is prevented, so that r* < 1, the epidemic is controlled. the basic reproduction number r0 and r* share the same threshold at one, so they are simultaneously larger than, equal to or smaller than unity; however, in a growing epidemic r0 < r* [10, 60] . the real-time growth rate r is related to r* by the lotka-euler equation. figure 8 shows example realisations of the within-household epidemic under each policy: asymptomatic cases are assumed to be half as infectious as symptomatic ones, and the total number of cases under isolation (possibly compounded, e.g. both household and individual isolation) is shown in red (right axis). all random numbers involved in the realisation of the stochastic epidemic are drawn at the start, before the impact of each control policy is implemented. row 1 shows no isolation and individual isolation. rows 2, 3 and 4 show, respectively, household isolation, mixed isolation and household quarantine. the difference between the columns is that the basic policy on the left is 'upgraded' to the more cost-effective version on the right, which allows recovered individuals to leave the house as they cannot transmit outside anymore. when no control is implemented, the primary case (individual a, infected at time 0) infects another individual (b) around time 11. after a long latent period (i.e. incubation minus prodromal), b becomes infectious and infects a further individual (c). the last individual (d) escapes infection. when different intervention strategies are in place, within-household infectivity is increased. 
this can result in individual c becoming infected earlier in the outbreak and individual d no longer escaping infection, both due to the increased force of infection. in this simulation, the dynamics for individual b do not change, since they are infected before a becomes symptomatic. individual d is infected earliest under mixed isolation, because within-household transmission is higher than under household isolation alone, due to the increased adherence from individual isolation also being in place. adherence levels to household quarantine are lower than those to household isolation, due to the higher demands of full quarantine, thus leading to less enhanced within-household transmission. we assume that adherence to individual isolation is 90%, household isolation 80% and household quarantine 60%. the more severe the intervention, the better it captures the infectious periods of infected individuals within the household. in figure 9, the lower x-axis gives the adherence to household quarantine, and the upper x-axis the adherence to household isolation. we assume that household isolation is less demanding, and therefore adherence is assumed to be 'twice as high', meaning it is at the midpoint between that of household quarantine and 1 (e.g. 0.6 for an x value of 0.2, 0.9 for an x value of 0.8, etc.). the black dash-dotted line in (a) gives the amount needed to control the spread by achieving r* = 1. notice how: the effect of individual isolation is independent of adherence to household quarantine (dotted lines); the effect of household isolation is independent of adherence to individual isolation (overlapping dash-dotted lines); mixed isolation is always superior to household isolation; household quarantine is only optimal at really high levels of adherence (for these baseline parameters, generally beyond the level needed to achieve control), but quickly becomes suboptimal to mixed isolation as adherence is reduced. 
when a sufficiently large reduction in r* is achieved, control follows; similarly, in all households, individual isolation would total exactly 7 days if all cases were symptomatic and all individuals in the household were ultimately infected. in (d), we assume 100% adherence to each intervention. under the baseline parameter values (table 6) , control can in principle be achieved via certain interventions, but only for high levels of adherence, which might be difficult to enforce for a prolonged length of time (figure 9a ). more importantly, the model's conclusions are highly sensitive to variations in parameter choices, which are uncertain. parameters that present problems here are the delay from symptom onset to isolation (with control failing for a 1-day detection delay unless adherence is essentially perfect), the proportion of asymptomatic infections (any chance of control is lost at 50%) and the strength of asymptomatic transmission. the short delay before symptomatic individuals isolate may be unrealistic unless the susceptible population is very well informed about the symptoms that call for isolation, and so likely does not apply in the very early stages of an outbreak. overall, in the face of the many uncertainties, household-based interventions triggered purely by symptoms appear useful to slow the spread, but need to be complemented by other policies. comparing the different strategies (figure 9b ), household quarantine can be optimal (as one might expect), but this requires high adherence levels. as adherence drops, this strategy becomes suboptimal to mixed isolation. mixed isolation is significantly better than household isolation on its own and requires little extra social cost, so should not cause adherence to drop (relative to household isolation adherence levels). the difference between the two strategies comes down to the transmission that slips through after the 14-day household isolation. 
the cheapest strategy, when considering working-age adults, is individual isolation (figure 9d ), but the effect is limited compared to the other strategies and cannot achieve control in the baseline scenario even with 100% adherence. overall, the mixed isolation strategy appears to be the most cost-effective; however, this is dependent on the assumption that adherence is better for a 14-day isolation than for a very long quarantine. it can be observed that household-based interventions are more effective than individual isolation, demonstrating the importance of these strategies in designing intervention policy. figure 8 shows how the different isolation strategies contain the infectious periods of individuals within the household, and also indicates the number of individuals being isolated within the household. to study the impact such increased within-household transmission has on the chance that a vulnerable individual is infected in the household, we randomly choose one non-primary case in the household as the vulnerable one and count how many of the n_e epidemics result in this individual being infected under the different control policies (figure 9c ). under these interventions, the risk of a vulnerable individual getting infected within the household, conditional on the infection entering it in the first place, is in the range 5–15%. since this model relies on the sellke construction, we can calculate the infection pressure that accumulates (within a household) during the outbreak. in relation to figure 8 , we report in figure 10 the infection pressure that accumulates under the different control policies, showing the different impact each intervention can have on the within-household dynamics. figure 10 : accumulated infection pressure in the simulation presented in figure 8 for different control policies. horizontal dotted lines represent individuals' resilience thresholds. 
as time progresses, the accumulated infection pressures (coloured lines) increase, and when they cross the resilience thresholds, the corresponding individual acquires infection. notice that: in the absence of control, one individual escapes infection; with household isolation only, the infection pressure reaches a relatively low endpoint because the last symptomatic individual slips through and does not transmit much in the household; with mixed isolation, the infection pressure is higher due to the combined adherence; and with household quarantine, the infection pressure builds up more slowly at the beginning due to lower adherence. we assume that adherence to individual isolation is 90%, household isolation 80% and household quarantine 60%. social distancing, isolation and lockdowns act to mitigate the spread of an infectious disease and reduce the number of cases. however, such interventions, particularly widespread lockdowns, cannot be maintained indefinitely and must be lifted at some point. for the disease to be controlled, these interventions can be kept in place until pharmaceutical interventions, such as a vaccine, are developed, or until case numbers are low enough that the disease may go extinct. here, we consider the situation where interventions are lifted just before extinction, when the number of cases has reached a low but non-zero initial value n_0: at this point, the number of cases might rebound, or might go extinct by random chance despite r0 > 1. we use a time-inhomogeneous birth-death chain model [39] to describe the number of cases. the associated probability generating function q(t, s) satisfies a differential equation subject to the initial condition q(0, s) = s. solving for q and setting s = 0 gives the probability that, at time t, the number of cases has reached zero and the disease has become extinct. 
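as a simulation-based check of this extinction probability, one can run the birth-death chain directly (a homogeneous chain without immigration, with our own illustrative rates, rather than the time-inhomogeneous model of the text):

```python
import random

def extinction_probability(beta, gamma, n0=1, n_cap=500, t_max=365.0,
                           n_sims=2000, seed=0):
    """monte carlo sketch of a birth-death chain for case numbers: each
    case transmits at rate beta and recovers at rate gamma. runs that
    reach n_cap cases are treated as established (not extinct)."""
    rng = random.Random(seed)
    extinct = 0
    for _ in range(n_sims):
        n, t = n0, 0.0
        while 0 < n < n_cap and t < t_max:
            t += rng.expovariate((beta + gamma) * n)  # time to next event
            n += 1 if rng.random() < beta / (beta + gamma) else -1
        extinct += (n == 0)
    return extinct / n_sims

# for beta > gamma, branching-process theory gives extinction probability
# gamma / beta from a single case: here 0.2 / 0.3 = 2/3.
print(extinction_probability(beta=0.3, gamma=0.2))  # close to 2/3
```

the cap on case numbers is a practical device: supercritical runs that escape low numbers grow without bound, and treating them as established matches the theoretical classification with negligible error.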
we denote this probability by q(t) [3] . when importation of cases is included, the corresponding generating function r(t, s) for this random variable satisfies a similar equation; again, solving for r(t, s) gives the extinction probability. we simulate data based on one initial case, n_0 = 1, though this may easily be extended to any number of initial cases. we run simulations both with and without immigration, choosing an importation function η(t) that decays from an initial (constant) rate of importation of cases before any controls on immigration are put into effect; we set this initial rate to 5 imported cases per day. with these choices of parameters, the resulting extinction probabilities are given in figure 11 . note that we are assuming the immigration rate decreases to 0, so if the infection is controlled internally for long enough, an overall ultimate extinction is possible in this model. for these parameter choices, the final probability of extinction is the long-time limit of q(t). these probabilities suggest that, without widespread immunity, stochastic extinction might be aided by social distancing but is heavily compromised by immigration. border controls, therefore, while of limited use when transmission is self-sustaining, become key when the number of cases is low. note that we have assumed an importation function η(t) that goes to 0 for large t, in line with a pandemic that goes extinct in other geographical regions. however, the presence of an animal reservoir might lead to an importation function that is non-zero over longer time scales, effectively making ultimate extinction impossible unless the effective reproduction number is kept below one by a systematic and permanent intervention (e.g. a technology-based change in behaviour) or herd immunity. contact tracing is a complementary control policy to isolation or quarantine. when a case is discovered, attempts are made to identify and isolate individuals who may have been infected. 
in doing so, some of the secondary cases will be discovered and isolated early in their infection, decreasing their effective infectious period. if contact tracing is successful, it can greatly reduce the effective reproduction number of the infection, and in combination with other interventions may drive an epidemic extinct, as was seen in the case of sars [81] . contact tracing in itself presents numerous challenges, which are exacerbated by the fact that its success relies not only on the effectiveness of the tracing process but also on the underlying transmission characteristics. for covid-19, some of these challenges include mild symptoms that cause infections not to be reported, pre-symptomatic transmission that occurs before a case is reported, and short generation times [28] that can cause the epidemic to outrun contact tracing. additionally, contact tracing is only feasible for smaller case numbers, because each case generates multiple contacts to follow up, so the tracing workload expands dramatically and an increasing number of chains remain unobserved. this makes it a viable strategy in the early days of an outbreak or, if containment has failed, following a period of severe interventions, such as a lockdown. combining contact tracing with isolation is being considered by many countries as part of a test, trace and isolate strategy to be implemented once lockdowns or comparable measures are lifted, provided these lockdowns succeed at driving case numbers sufficiently low. in this section, we develop a household-level contact tracing model for an emerging outbreak, since we do not wish to make assumptions about immunity or the depletion of susceptibles. these assumptions can be added to the model as the availability of data on immunity improves. we are interested in the likelihood that the contact tracing process is overwhelmed by large case numbers, and in the likelihood that, combined with isolation, it can drive the disease to extinction. 
the early days of an outbreak can be modelled using a branching process, where generations of infections produce infectious offspring. contact tracing processes can be incorporated as a superinfection along the tree generated by the branching process [11] . when a node is 'superinfected' by the contact tracing process, it is isolated. we model the infection spreading through a fully susceptible population of individuals, segmented into households of different sizes according to the 2019 ons survey [57] , and progress through discrete time steps of 1 day. as such, our branching process is at the household level, coupled with localised within-household epidemics. this allows us to model contact tracing strategies that isolate whole households, which may contain several undetected infections. it also enables a wider range of contact tracing strategies to be modelled, each with a different intervention scope and cost. each day, individuals (or nodes) make contacts with a random set of individuals, divided into local contacts with members of the same household and global contacts with members of other households. the number of individuals contacted in a day follows an overdispersed negative binomial distribution, parameterised using estimates from the polymod social contact survey [54] , stratified by household size. since the probability that a contact causes infection cannot be directly observed, we use improper hazard rates that give rise to the 5-day covid-19 generation time [26] and r0 = 3. for contact tracing to begin, an infection must be diagnosed, which we assume occurs 70% of the time among infected individuals, due to flaws in reporting or very mild symptoms in those infected. we assume a gamma-distributed incubation period with mean 4.84 days (table 2 ) and a geometrically distributed reporting delay from symptom onset with mean 4.8 days [42] . intuition suggests that if r0 = 3 then tracing two thirds of contacts will control the epidemic. 
however, in practice transmission may occur before tracing, so this will not reduce the number of infectious contacts by two thirds. to demonstrate this, we assume that contact tracing successfully traces two thirds of contacts. trained professionals have to trace all reported contacts from the last 14 days, so we assume that the contact tracing delay follows a geometric distribution with a mean of 2 days. individuals are considered recovered 21 days after infection, as the chance that they are still transmitting by then is negligible. though our general framework can be modified extensively, we assume the following contact tracing strategy. when an individual reports infection, their household is immediately isolated. contact tracing attempts are then made for all households connected to one of the individuals in this household, whether symptomatic or not. when a connected household is identified (after the contact tracing delay), all individuals within the household are immediately placed under observation. if any of the individuals in the observed households develop symptoms, then the household becomes isolated and the contact tracing process continues to connected households. when a household is isolated, we assume all individuals isolate with 100% adherence and cannot transmit the virus within or outside the household. the assumption that isolation prevents local infections is unrealistic, but does not change the overall behaviour of the process, as there are no more global infections. this strategy imposes a high individual-level cost since, by isolating all individuals within a household, it isolates individuals who have not had direct contact with an infected individual. in practice, such a strategy may have poor adherence. figure 13a shows an example contact tracing network. 3.5 when choosing contact tracing strategies, a balance must be struck between the effectiveness of a strategy and the resources that it requires. 
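why tracing two thirds of contacts is not enough can be sketched with a stripped-down individual-level calculation (the exponential generation and delay distributions and all parameter values are our illustrative choices; this is not the household-level model of this section):

```python
import random

def effective_r(r0=3, trace_prob=2/3, mean_gen=5.0, mean_delay=2.0,
                n_cases=20000, seed=0):
    """each case makes exactly r0 infectious contacts at exponential
    times after its own infection; a traced case (probability
    trace_prob) is isolated after an exponential tracing delay, so only
    contacts made before isolation are realised."""
    rng = random.Random(seed)
    realised = 0
    for _ in range(n_cases):
        traced = rng.random() < trace_prob
        isolation = rng.expovariate(1.0 / mean_delay) if traced else float("inf")
        for _ in range(r0):
            if rng.expovariate(1.0 / mean_gen) < isolation:
                realised += 1
    return realised / n_cases

# naive reasoning predicts r0 * (1 - 2/3) = 1; pre-isolation transmission
# keeps the realised reproduction number well above 1.
print(effective_r())
```

even perfect tracing of two thirds of contacts leaves the contacts each traced case makes before its isolation, so the realised reproduction number stays well above the naive value of one.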
some strategies are only feasible when there are few infections, since the resources required can grow rapidly depending on the dynamics of the outbreak and the contact tracing process. to define the capacity of the contact tracing process, we consider the ability of a public health agency to observe the condition of those asked to self-isolate due to their recent exposure to an infected individual. the health agency must remain in contact for the duration of the 14-day self-isolation period, so that if any individual under isolation develops symptoms and then tests positive, the contact tracing process can be initiated on this node. we define the capacity of the contact tracing process as the number of people that can be placed under observation, and assume two possible capacities: 800 and 8000. we assume that when a node is contact traced, they are asked to report their global contacts for the last 14 days. all global contacts are assumed to be with a new person, since we are in the early stages of an outbreak. parameters are given in table 8 , which also reports a probability of 76.8% of hitting the 8000 capacity. we carried out 6507 simulations of the contact tracing process over 150 days. contact tracing capacity was reached in 5000 simulations, and in 180 the epidemic neither went extinct nor was the 8000 capacity reached; in the remaining simulations, the epidemic went extinct. figure 12a and table 8 show that increasing the contact tracing capacity tenfold less than doubles the time until that capacity is reached; however, it does increase the odds of driving the epidemic to extinction without hitting the capacity by about 10% (table 8) . different contact tracing strategies will strain different aspects of the health agency. a strategy that generates a large amount of work is only feasible if there are few active infections. the optimal strategy will need to compromise, and may need to change depending on the number of active infections, which cannot be directly observed. 
figure 12 : capacity hitting times for the contact tracing model. when there is a small number of cases in a single country, it may be possible to drive the pathogen to extinction. this small case number could correspond to the start of an outbreak or to the lifting of severe interventions. we consider the latter case, but conservatively assume a fully susceptible population. we assume that social distancing is enforced on day 0 and reduces global contacts by 70%. full parameters are given in table 8 . since we are interested in extinction, we no longer consider the contact tracing capacity. under these baseline parameter assumptions and 10000 simulations, the combined force of this contact tracing strategy and isolation is enough to drive the epidemic extinct (figure 13b ), but measures will need to be in place for months in some cases. if the infection is ever re-imported, the process would begin again, since herd immunity is not achieved. note that the minimum extinction time is 21 days, this being the time after which an infected individual is labelled recovered. additionally, this model only considers extinction under the assumption that no cases are imported. in section 3.4, we showed that importation of cases significantly reduces the extinction probability. this suggests that extinction may no longer be guaranteed, and that the time to extinction will be significantly increased. this analysis has focused on a single contact tracing strategy using indicative parameters for covid-19. the proposed model can be extended to more strategies and region-specific parameters to inform the design of control policies. also, as shown in table 8 , contact tracing capacity is likely to be reached, which may prevent extinction from being achieved. this complication is compounded by the issues of loss of immunity or the presence of an animal reservoir discussed in section 3.4. 
figure 13 : example of the contact tracing process (a) and the distribution of extinction times (b) for the contact tracing model. 4. discussion in this manuscript we have presented a range of mathematical tools for tackling infectious disease outbreaks. in particular, these tools address various technical questions posed by the authors to support the ongoing public health response to covid-19. this toolkit considers both estimation efforts for key parameters and investigative efforts (often numerical simulations) in gauging the effectiveness of various intervention or control measures. joint consideration of estimation and simulation efforts is critical: parameter estimates are obtained under a certain set of assumptions regarding the data, and investigations or simulations utilising these estimates should ensure that their underlying assumptions are consistent. these challenges in model construction and in the applicability of statistical methods are compounded by the limitations of the data with which decisions must be made. some of the biases present in the data can be addressed with an improved data collection methodology (often challenging in the context of a fast-moving outbreak), but many are also inherent to the nature of early outbreak data [15, 45] . the consequent lack of intuitive insight from these data underscores the need for careful parametric estimates, especially considering the large variability in predicted outcomes resulting from small differences in parameters. even with robust estimates for some parameters, many other parameters are challenging to estimate using the available data. therefore, models need to address this variability and uncertainty in order to inform public health policy. we have presented methods to address biases arising from a growing force of infection, changes in the reporting rate, truncated data samples and a varying travel rate. 
we use these methods to account for these biases when estimating delay distributions, such as the incubation period, and the growth rate/doubling time. these biases can have a significant impact when estimating key parameters: the mean incubation period estimate for covid-19 increases from 3.48 days without correcting for truncation to 4.84 days with the correction, and the doubling time in hubei province decreases from 3.15 days without correcting for travel to 2.77 days with the correction. these differences can significantly alter our understanding of the outbreak and could have a large impact on policy and public health. for instance, underestimating the incubation period may lead to quarantine strategies failing to identify infected individuals if the quarantine length is too short. overestimating the doubling time (or underestimating the growth rate) will underestimate the risk posed to the host population, both in terms of the final size of the epidemic and the rate at which it spreads, which can have significant public health impacts as discussed in [63]. it is important to note that the above-mentioned biases, and the consequent impact of implementing the methods correcting for their presence, may vary across different settings. as an example, the potential underestimation of the covid-19 growth rate is exacerbated by the overlap of the early outbreak with a period of significant travel and movement in china, and would be less pronounced had the disease first been observed in other populations such as italy. also, for the incubation period, we have shown two different types of data: one set from wuhan and one of discrete infection events. in the wuhan data set, truncation and force-of-infection biases are very important, whereas in the other data set there is no force-of-infection bias, since the infection events are observed. when an outbreak occurs in an enclosed group, such as a large gathering, we may wish to know how many individuals are likely to be infected.
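the question just posed, how many members of an enclosed group are likely to be infected, can be explored with a simplified bayesian sketch. this is a stand-in for the paper's derivation, not a reproduction of it: it places a uniform(0,1) prior on the infection probability p, treats observed symptomatics as observed infections, assumes binomial sampling, and replaces the closed form (which would involve the hypergeometric function) with a numerical integration over a p-grid; all input values are illustrative:

```python
from math import comb

def generation_size_posterior(n_group, k_symptomatic, n_observed, grid=2000):
    """posterior over the total number infected in the group, given that
    k_symptomatic infections were seen among n_observed sampled members,
    under a Uniform(0,1) prior on the infection probability p."""
    rest = n_group - n_observed
    ps = [(j + 0.5) / grid for j in range(grid)]          # midpoint p-grid
    # unnormalised posterior weight of each p value: p^k (1-p)^(n_obs - k)
    w = [p ** k_symptomatic * (1 - p) ** (n_observed - k_symptomatic) for p in ps]
    z = sum(w)
    post = []
    for m in range(rest + 1):
        # P(m further infections among the unobserved members), averaged over p
        val = sum(wi * comb(rest, m) * p ** m * (1 - p) ** (rest - m)
                  for p, wi in zip(ps, w)) / z
        post.append(val)
    return post   # post[m] = P(total infected = k_symptomatic + m)

post = generation_size_posterior(n_group=20, k_symptomatic=3, n_observed=10)
```

with 3 of 10 sampled members symptomatic, the posterior mean for the remaining 10 members sits near 10/3, reflecting the beta(4, 8) posterior on p implied by the uniform prior.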
we developed a statistical method to estimate the first generation size based on the number of symptomatic individuals, taking care to account for the uncertainty in this quantity. this ready reckoner can inform the testing of large groups to help control disease spread, but does not apply to later generations or to possible interventions enacted on the population. building on these enclosed population scenarios, we have developed a set of models that investigate public control measures or interventions in enclosed populations, such as households and care homes. these structured descriptions improve the population risk profiles relative to assumptions of homogeneous mixing. a complementary aspect to a structured population when modelling interventions is adherence. motivated by vaccination modelling, we consider leaky adherence, where every household chooses whether or not to adhere each time an event occurs, and all-or-nothing adherence, where some households adhere every time and some never adhere. we observed that in a homogeneous population, although the two types of adherence predict the same growth rate and final size, the timing of the peak and the early growth can be faster under all-or-nothing adherence. this insight, combined with lessons from the vaccination literature, suggests that efforts should focus on ensuring complete adherence in individuals or households with some level of pre-existing adherence, rather than on pushing non-adherent individuals or households to change behaviour. dedicated modelling of disease spread in care homes is essential given the documented comorbidities of their residents and their vulnerability during pandemics [31, 82, 83, 84, 85]. we do so by regarding care homes as closed populations that are subjected to a force of infection from an external epidemic. we develop a tool for analysing the risk posed to this population by determining the peak size of the epidemic within the care homes and the number of deaths.
applying this model to covid-19, we find that by "cocooning" the care homes, i.e. shielding them to reduce the chance of introduction from the external outbreak, we can significantly reduce the size of the peak and therefore reduce the number of deaths. however, assessing the necessary level of shielding requires accurate characterisation of the external force of infection, and underestimating this may invalidate shielding efforts. a limitation of the proposed model is the deterministic within-care-home epidemic. however, since the average size of care homes is relatively large and we assume a high r0 within care homes, the deterministic assumption is unlikely to significantly alter the conclusions. when modelling households, however, we are concerned with much smaller population sizes. therefore, it is important to consider stochastic effects within each household, combined with between-household dynamics. we consider two different household models: one which contains features of both within- and between-household transmission, where small-scale transmission can be linked to the epidemic on a population level, and another which allows more detail in the within-household transmission and delay distributions, but with reduced correspondence to the population-wide transmission. with the first model, a 65% adherence to household isolation appears insufficient to control the epidemic without severe global reductions in transmission. coupled with a short-term global reduction, the epidemic can be controlled, but upon lifting the global intervention, which could take the form of a lockdown, household isolation is insufficient to maintain control. for the second model, we look into changing the strength of adherence, and the impact this can have on achieving control. indicative but reasonable parameter values suggest that the covid-19 outbreak can potentially be controlled using household isolation strategies, provided the level of adherence is sufficiently high.
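the importance of stochastic effects at household scale can be seen in a minimal chain-binomial (reed-frost) sketch. this is not the paper's household model, and the household size and escape probability are illustrative; the point is the within-household final-size variability that a deterministic description would average away:

```python
import random

def reed_frost_final_size(n, p, rng):
    """chain-binomial (reed-frost) household epidemic starting from one
    introduced case: each susceptible escapes infection from each current
    case independently with probability 1 - p per generation."""
    s, i, total = n - 1, 1, 1
    while i > 0 and s > 0:
        escape = (1 - p) ** i              # prob. a susceptible escapes this generation
        new_i = sum(1 for _ in range(s) if rng.random() > escape)
        s -= new_i
        total += new_i
        i = new_i
    return total

rng = random.Random(0)
sizes = [reed_frost_final_size(n=4, p=0.3, rng=rng) for _ in range(10000)]
mean_size = sum(sizes) / len(sizes)
frac_index_only = sizes.count(1) / len(sizes)  # outbreaks that stop at the index case
```

even with these parameters, roughly a third of introductions infect nobody beyond the index case while others sweep the whole household, a spread a single deterministic trajectory cannot represent.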
such a high level of adherence may, however, be difficult to maintain in the long term, and this degree of control is in any case highly sensitive to the chosen parameters. we further investigated the efficacy of various isolation or quarantine measures. a policy of individual isolation struggles to curtail the epidemic at any level of adherence. instead, mixed isolation, whereby first the whole household isolates and any individual infected during isolation goes on to self-isolate after household isolation is lifted, appears to be the most cost-effective strategy. countries have put in place strict social distancing and lockdown interventions to suppress or regain control of epidemics that threaten to overwhelm the health system and cause massive mortality, but these cannot be sustained in the long term without growing social and economic costs. we have shown, however, that the probability of the epidemic becoming extinct once these policies are lifted, even when very few cases remain, is very small. we therefore consider a contact tracing intervention as a potential strategy for managing the covid-19 outbreak once severe lockdown interventions are lifted. we developed a household-level contact tracing model to explore the feasibility of combining these strategies to control the epidemic. firstly, we noted that by using knowledge of household structure, we can reduce the burden on the contact tracing process by isolating households and removing them from the contact tracing process once an infected member has been identified. secondly, we investigated how contact tracing combined with household isolation may drive the disease to extinction, finding that aggressive contact tracing coupled with household isolation can drive the epidemic to extinction under the indicative parameters assumed, when starting with a single infection.
however, the time until extinction can be impractically long, which risks the contact tracing capacity being overwhelmed, suggesting that such a strategy may be infeasible in practice. a less aggressive strategy could be implemented that would be less likely to overwhelm local health agencies. whilst it may not lead to extinction, this can still be beneficial in mitigating and controlling the spread of an outbreak as part of a test, trace and isolate strategy. there are many complexities when modelling an outbreak of a novel infectious disease. to address some of these, we have described a variety of techniques to serve as part of a generally applicable toolkit. however, our proposed models, like many other models, are subject to important limitations which must be considered prior to their application. key among these is the lack of heterogeneous population mixing, such as through age-stratification [61] and different risk groups [77], and of spatiotemporal variation [43], all of which influence modelling estimates and predictions. nevertheless, the relative simplicity of the presented models allows for the development of qualitative intuition regarding the efficacy of various intervention methods, whilst providing tractable theoretical frameworks which can be further developed to better inform policy-makers.

5. independence of the incubation period likelihood function and the reporting rate

to estimate the incubation period distribution, we need to find the distribution that maximises the probability of observing the sampled data. however, the sampled data do not directly record the incubation period; instead they contain the infection exposure window and the symptom onset date. additionally, the sample does not contain all individuals, and therefore there is a reporting rate that must be incorporated into the likelihood function. if the reporting rate is constant, it can be ignored.
however, in the data coming out of wuhan, the reporting rate varies significantly, since individuals are no longer exported from wuhan after the travel restrictions. since the reporting rate depends on individuals leaving wuhan, the main factor affecting the probability that a case is included in the data set is the date on which an individual leaves wuhan. therefore, the reporting rate depends on the days an individual spends in wuhan and is independent of their symptom onset date. we need the likelihood that an individual was infected between days a and b, had symptom onset on day y, and was included in the data set. we condition on the infection window, a < I < b, on the case being included in the data set, X ∈ D, and on symptom onset occurring before the truncation date, Y ≤ T, and determine the probability that for such an individual we observe the given symptom onset date, Y = y. here f_θ is the probability density function of the incubation period distribution. therefore, the likelihood function P(Y = y | a < I < b, Y ≤ T) is independent of the reporting rate for the data coming out of wuhan in the early days of the outbreak.

6. generation-size derivation

to solve this, we need to determine the distribution of the infection probability p given the number of observed symptomatics. assuming that p is uniformly distributed, the resulting expression involves the hypergeometric function f [2]; substituting this into equation (9) gives the required distribution.

7. estimating the generation size using a lower bound on the number of symptomatic individuals

in the analysis in section 2.4, it is assumed that every person who has developed symptoms by time τ is known to the observer. however, depending on the disease, symptoms can be subjective: one person may not notice something for which another person may visit hospital.
additionally, one person may not want to come forward with symptoms if they are worried about the repercussions of doing so (for example, being isolated against their will). to address this, rather than taking Iτ to be the total number of people who have developed symptoms by time τ, we can take Iτ to be the number of people who have presented with symptoms by time τ. we do not know the true number of symptomatic individuals, but we know that it cannot be below Iτ. we assume that, for a given value of Iτ, the distribution of the true count over its admissible values is uniform, each value having equal probability.

dI/dt = τE - γI,  (12)
dR/dt = γI.  (13)

this provides a force of infection that is used to model seeding within each care home via the sellke construction, with the details provided in the main text. once infection has been seeded within a care home, the epidemic progresses according to a further set of ordinary differential equations.

references:
- handbook of mathematical functions with formulas, graphs, and mathematical tables
- pre-existence and emergence of drug resistance in a generalized model of intra-host viral dynamics
- stochastic epidemics in dynamic populations: quasi-stationarity and extinction
- the final size of an epidemic and its relation to the basic reproduction number
- stochastic and deterministic models for sis epidemics among a population partitioned into households
- seven challenges for metapopulation models of epidemics, including households models
- household epidemic models with varying infection response
- epidemics with two levels of mixing
- reproduction numbers for epidemic models with households and other social structures ii: comparisons and implications for vaccination
- stochastic epidemic models featuring contact tracing with delays
- coordinating the real-time use of global influenza activity data for better public health planning
- characterising pandemic severity and transmissibility from data collected during first few hundred studies
- the final size of a serious epidemic
- estimation in emerging epidemics: biases and remedies
- models overestimate ebola cases
- a bayesian mcmc approach to study transmission of influenza: application to household longitudinal data
- pandemics in the age of twitter: content analysis of tweets during the 2009 h1n1 outbreak
- the effect of travel restrictions on the spread of the 2019 novel coronavirus (covid-19) outbreak. science
- care quality commission. cqc care directory, with ratings
- department of health and social care. spi-m modelling summary for pandemic influenza
- are patients with hypertension and diabetes mellitus at increased risk for covid-19 infection?
- sars incubation and quarantine times: when is an exposed individual known to be disease free?
- impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
- quantifying sars-cov-2 transmission suggests epidemic control with digital contact tracing
- the spread of awareness and its impact on epidemic outbreaks
- estimating the generation interval for covid-19 based on symptom onset data. medrxiv
- reproductive numbers, epidemic spread and control in a community of households
- estimated effectiveness of symptom and risk screening to prevent the spread of covid-19. elife
- clinical characteristics of coronavirus disease 2019 in china
- oxford covid-19 government response tracker (oxcgrt)
- deterministic epidemic models with explicit household structure
- epidemic prediction and control in clustered populations
- coronavirus: gsk and sanofi join forces to create vaccine
- regression models for right truncated data with applications to aids incubation times and reporting lags
- modeling infectious diseases in humans and animals
- china braces for world's biggest travel rush around spring festival
- on the generalized "birth-and-death" process, containing papers of a mathematical and physical character
- scabies in residential care homes: modelling, inference and interventions for well-connected population sub-units
- the effect of human mobility and control measures on the covid-19 epidemic in china
- spatial and temporal dynamics of superspreading events in the 2014-2015 west africa ebola epidemic
- the incubation period of 2019-ncov from publicly reported confirmed cases: estimation and application
- the epidemiologic toolbox: identifying, honing, and using the right tools for the job
- substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov2)
- incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: a statistical analysis of publicly available case data
- the reproductive number of covid-19 is higher compared to sars coronavirus
- china coronavirus: what do we know so far?
- early in the epidemic: impact of preprints on global discourse about covid-19 transmissibility. the lancet global health
- estimating the asymptomatic proportion of coronavirus disease 2019 (covid-19) cases on board the diamond princess cruise ship
- the largest human migration on the planet: what happens in china when 1.4bn go on holiday at the same time
- social contacts and mixing patterns relevant to the spread of infectious diseases
- time variations in the generation time of an infectious disease: implications for sampling to appropriately quantify transmission potential
- world health organisation. coronavirus disease 2019 (covid-19). situation report 69
- reproduction numbers for epidemic models with households and other social structures. i. definition and calculation of r0
- systematic selection between age and household structure for models aimed at emerging epidemic predictions
- epidemic growth rate and household reproduction number in communities of households, schools and workplaces
- challenges in control of covid-19: short doubling time and long delay to effect of interventions. arxiv
- covid-19 and italy: what next? the lancet
- covid-19: disproportionate impact on ethnic minority healthcare workers will be explored by government
- calculation of disease dynamics in a population of households
- the prevention of malaria
- public perceptions, anxiety, and behaviour change in relation to the swine flu outbreak: cross sectional telephone survey
- some model based considerations on observing generation times for communicable diseases
- on the asymptotic distribution of the size of a stochastic epidemic
- sir epidemics in a population of households
- modeling left-truncated and right-censored survival data with longitudinal covariates
- empirical estimation of a distribution function with truncated and doubly interval-censored data and its application to aids studies
- early epidemiological analysis of the coronavirus disease 2019 outbreak based on crowdsourced data: a population-level observational study. the lancet digital health
- a note on generation times in epidemic models
- evaluating factors associated with std infection in a study with interval-censored event times and an unknown proportion of participants not at risk for disease
- reorganization of nurse scheduling reduces the risk of healthcare associated infections. medrxiv
- epidemic and intervention modelling: a scientific rationale for policy decisions? lessons from the 2009 influenza pandemic
- investigation of a nosocomial outbreak of severe acute respiratory syndrome (sars) in toronto, canada
- comparison of incubation period distribution of human infections with mers-cov in south korea and saudi arabia
- can we contain the covid-19 outbreak with the same measures as for sars?
- estimating clinical severity of covid-19 from the transmission dynamics in wuhan, china
- characteristics of and important lessons from the coronavirus disease 2019 (covid-19) outbreak in china: summary of a report of 72 314 cases from the chinese center for disease control and prevention
- prevalence of comorbidities in the novel wuhan coronavirus (covid-19) infection: a systematic review and meta-analysis
- clinical course and risk factors for mortality of adult inpatients with covid-19 in wuhan

author contributions: co and hs compiled the manuscript. all authors were involved in the research and in revising the manuscript. the authors declare no competing interests. all data and code used in this analysis are provided in the github repositories https://github.com/thomasallanhouse/covid19-stochastics and https://github.com/thomasallanhouse/covid19-growth, with the exception of uk-specific data; this data is provided by public health england under a data sharing agreement and we are unable to share it.

here c is a normalising constant. we add further compartments to the within-care-home model, since we are interested in the different pathways that these individuals may take.
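the sellke-construction seeding described above can be sketched in a few lines. the background epidemic here is a bare sir model, and every parameter value, including the `eps` scaling of the external force of infection felt by a home, is an illustrative assumption rather than a value from the paper:

```python
import random

def seeding_time_sellke(rng, beta=0.3, gamma=0.1, n=100000, i0=10,
                        eps=0.5, dt=0.1, t_max=200.0):
    """sellke-construction sketch: euler-integrate a background sir epidemic
    and return the time at which a care home is first seeded, i.e. when the
    home's accumulated external exposure exceeds an exp(1) threshold."""
    s, i = n - i0, float(i0)
    q = rng.expovariate(1.0)        # resistance threshold ~ exp(1)
    exposure, t = 0.0, 0.0
    while t < t_max:
        foi = eps * beta * i / n    # external force of infection on the home
        exposure += foi * dt
        if exposure >= q:
            return t                # seeding time
        new_inf = beta * s * i / n * dt
        new_rec = gamma * i * dt
        s -= new_inf
        i += new_inf - new_rec
        t += dt
    return None                     # home never seeded within t_max

times = [seeding_time_sellke(random.Random(k)) for k in range(200)]
seeded = [t for t in times if t is not None]
frac_seeded = len(seeded) / len(times)
```

because the exposure integral tracks the background prevalence, seeding times cluster around the epidemic peak, and homes with large exp(1) thresholds escape introduction altogether.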
for the background epidemic this is not important, since all we need is a force of infection provided by individuals in the infectious class.

9. population and household transmission: individual isolation and household quarantine

individual isolation does not intimately involve the household, and so we assume that a fraction α_I of symptomatic cases self-isolates and ceases transmission outside the household, meaning that we take the baseline model with this modification. to capture the essential features of household quarantine, we need to add states to the dynamical variables, and we suppose that a fraction of households adheres to quarantine. in this model, we assume that the duration of isolation after the absence of symptoms is exponentially distributed with mean equal to the fixed isolation period, given by 1/τ. this assumption aids the modelling and is justifiable because, in reality, although a household may choose to isolate, it may not strictly follow the fixed period. it is likely that within-household isolation will increase transmission to other members of the household. we do not incorporate this property into the model, but this limitation is worth bearing in mind when drawing conclusions about the effectiveness of different strategies.

key: cord-340805-qbvgnr4r
title: forecasting for covid-19 has failed
authors: ioannidis, john p.a.; cripps, sally; tanner, martin a.
date: 2020-08-25
journal: int j forecast
doi: 10.1016/j.ijforecast.2020.08.004
doc_id: 340805 cord_uid: qbvgnr4r

epidemic forecasting has a dubious track-record, and its failures became more prominent with covid-19.
poor data input, wrong modeling assumptions, high sensitivity of estimates, lack of incorporation of epidemiological features, poor past evidence on the effects of available interventions, lack of transparency, errors, lack of determinacy, looking at only one or a few dimensions of the problem at hand, lack of expertise in crucial disciplines, groupthink and bandwagon effects, and selective reporting are some of the causes of these failures. nevertheless, epidemic forecasting is unlikely to be abandoned. some (but not all) of these problems can be fixed. careful modeling of predictive distributions rather than focusing on point estimates, considering multiple dimensions of impact, and continuously reappraising models based on their validated performance may help. if extreme values are considered, extremes should be considered for the consequences of multiple dimensions of impact so as to continuously calibrate predictive insights and decision-making. when major decisions (e.g. draconian lockdowns) are based on forecasts, the harms (in terms of health, economy, and society at large) and the asymmetry of risks need to be approached in a holistic fashion, considering the totality of the evidence.

modeling resurgence after reopening also failed (table 2): e.g., a massachusetts general hospital model 10 predicted over 23,000 deaths within a month of georgia reopening; actual deaths were 896. table 3 lists some main reasons underlying this forecasting failure. unsurprisingly, models failed when they used more speculation and theoretical assumptions and tried to predict long-term outcomes, e.g. using early sir-based models to predict what would happen over the entire season. however, even forecasting built directly on data alone fared badly, 11, 12 failing not only in icu bed predictions (figure 1) but even in next-day death predictions, where issues of long-term chaotic behavior do not come into play (figures 2 and 3).
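part of the failure mode is the extreme sensitivity of sir-type projections to small parameter differences. in this illustrative sketch (all values hypothetical), a 20% change in the transmission rate multiplies the projected epidemic peak severalfold:

```python
def sir_peak(beta, gamma=0.2, days=300, n=1_000_000, i0=10):
    """forward-euler deterministic sir model; returns the peak infectious count."""
    s, i = n - i0, float(i0)
    peak = i
    for _ in range(days):
        new_inf = beta * s * i / n   # new infections this day
        new_rec = gamma * i          # new recoveries this day
        s -= new_inf
        i += new_inf - new_rec
        peak = max(peak, i)
    return peak

p1 = sir_peak(beta=0.25)   # basic reproduction number 1.25
p2 = sir_peak(beta=0.30)   # basic reproduction number 1.5
```

an uncertainty in beta that is well within the measurement error of early outbreak data thus translates into a hospital-demand forecast that differs by a factor of roughly three.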
even for short-term forecasting as the epidemic wave waned, models presented confusingly diverse predictions with huge uncertainty (figure 4). failure in epidemic forecasting is an old problem. in fact, it is surprising that epidemic forecasting has retained much credibility among decision-makers, given its dubious track record. modeling for swine flu predicted 3,100-65,000 deaths in the uk; 13 eventually 457 deaths occurred. 14 models on foot-and-mouth disease by top scientists in top journals 15, 16 were subsequently questioned 17 by other scientists challenging why up to 10 million animals had to be slaughtered. predictions for bovine spongiform encephalopathy expected up to 150,000 deaths in the uk, 18 although the lower bound predicted was as low as 50 deaths. 18

let's be clear: even if millions of deaths did not happen this season, they may happen with the next wave, next season, or some new virus in the future. a doomsday forecast may come in handy to protect civilization, when and if calamity hits. however, even then, we have little evidence that aggressive measures focusing on only a few dimensions of impact actually reduce the death toll and do more good than harm. we need models which incorporate multicriteria objective functions. isolating infectious impact from all other health, economic and social impacts is dangerously narrow-minded. more importantly, with epidemics becoming easier to detect, opportunities for declaring global emergencies will escalate. erroneous models can become powerful, recurrent disruptors of life on this planet. civilization is threatened by epidemic incidentalomas. cirillo and taleb thoughtfully argue 19 that when it comes to contagious risk, we should take doomsday predictions seriously: major epidemics follow a fat-tail pattern and extreme value theory becomes relevant. examining 72 major epidemics recorded through history, they demonstrate a fat-tailed mortality impact.
however, they analyze only the 72 most-noticed outbreaks, a sample with astounding selection bias: for example, according to their dataset, the first epidemic originating from sub-saharan africa did not occur until 1920 ad. 20, 21 several coronaviruses are already endemic in humans; one of them, oc43, seems to have been introduced in humans as recently as 1890, probably causing a "bad influenza year" with over a million deaths. 22 based on what we know now, sars-cov-2 may be closer to oc43 than to sars-cov-1. this does not mean it is not serious: its initial human introduction can be highly lethal, unless we protect those at risk. a heavy-tail distribution ceases to be as heavy as taleb imagines when the middle of the distribution becomes much larger. one may also argue that pandemics, as opposed to epidemics without worldwide distribution, are more likely to be heavy-tailed. however, the vast majority of the 72 contagious events listed by taleb were not pandemics, but localized epidemics with circumscribed geographic activity. overall, when a new epidemic is detected, it is difficult even to pinpoint which distribution of which known events it should be mapped against. blindly acting based on extreme value theory alone would be sensible if we lived in the times of the antonine plague, or even in 1890, with no science to identify the pathogen, elucidate its true prevalence, estimate accurately its lethality, and carry out good epidemiology to identify which people and settings are at risk. until we accrue this information, immediate better-safe-than-sorry responses are legitimate, trusting extreme forecasts as possible (not necessarily likely) scenarios. however, the caveats of these forecasts should not be ignored 1, 23 and new evidence on the ground truth needs continuous reassessment. upon acquiring solid evidence about the epidemiological features of new outbreaks, implausible, exaggerated forecasts 24 should be abandoned.
otherwise, they may cause more harm than the virus itself. the insightful recent essay of taleb 25 offers additional opportunities for fruitful discussion. taleb 25 ruminates on the point of making point predictions. serious modelers (whether frequentist or bayesian) would never rely on point estimates to summarize skewed distributions; even an early popular presentation 26 from 1954 has a figure (see page 33) with a striking resemblance to taleb's figure 1. 25 in a bayesian framework, we rely on the full posterior predictive distribution, not single points. 27 moreover, taleb's choice of a three-parameter pareto distribution is peculiar. it is unclear whether this model provides a measurably better fit to his (hopelessly biased) pandemic data 19 than, say, a two-parameter gamma distribution fitted to log counts. regardless, either skewed distribution would then have to be modified to allow for the use of all available sources of information in a logically consistent, fully probabilistic model, e.g. via a bayesian hierarchical model (which can certainly be formulated to accommodate fat tails if needed). in this regard, we note that on examining the ny daily death count data studied in ref. 12, these data are found to be characterized as stochastic rather than chaotic. 28 taleb seems to fit an unorthodox model, and then abandons all effort to predict anything. he simply assumes doomsday has come, much like a panic-driven roman would have done in the antonine plague, lacking statistical, biological, and epidemiological insights. taleb 25 caricatures the position of a hotly debated mid-march op-ed by one of us, 29 alleging it "made statements to the effect that one should wait for "more evidence" before acting with respect to the pandemic", an obvious distortion of the op-ed. anyone who reads the op-ed unbiasedly realizes that it says exactly the opposite.
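the model-comparison point raised above (a three-parameter pareto versus a two-parameter fit on log counts) can be illustrated on synthetic data. here a lognormal fit, which is equivalent to a normal fit on log counts and stands in for the gamma alternative, is compared with a pareto fit by maximised log-likelihood; the sample is simulated and is not the cirillo-taleb dataset:

```python
import math, random

rng = random.Random(7)
# synthetic heavy-ish tailed "death toll" sample (72 events, like their dataset size)
data = [math.exp(rng.gauss(8.0, 2.0)) for _ in range(72)]
n = len(data)
x_min = min(data)

# pareto with scale x_min: mle for the tail index, and its log-likelihood
alpha = n / sum(math.log(x / x_min) for x in data)
ll_pareto = sum(math.log(alpha) + alpha * math.log(x_min) - (alpha + 1) * math.log(x)
                for x in data)

# lognormal, i.e. a two-parameter fit on log counts: mle and log-likelihood
logs = [math.log(x) for x in data]
mu = sum(logs) / n
sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)
ll_lognorm = sum(-math.log(x * sigma * math.sqrt(2 * math.pi))
                 - (math.log(x) - mu) ** 2 / (2 * sigma ** 2) for x in data)
```

on this lognormally generated sample the two-parameter fit wins on log-likelihood; on a genuinely pareto-tailed sample the ordering would flip, which is exactly why the fit should be checked against the data rather than assumed.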
the op-ed starts with the clear, unquestionable premise that the pandemic is taking hold and is a serious threat. immediate lockdown certainly makes sense when an estimated 50 million deaths are possible. this was stated emphatically on multiple occasions in those days, in interviews in multiple languages; for examples see refs. 30-32. certainly, the adverse consequences of a short-term lockdown cannot match 50 million lives. however, better data can recalibrate estimates, allowing downstream re-assessment of the relative balance of benefits and harms of a longer-term prolongation of lockdown. that re-appraised balance changed markedly over time. 9 another gross distortion propagated in social media is that the op-ed 29 had supposedly predicted only 10,000 deaths in the usa. the key message of the op-ed was that we lack reliable data, i.e. we don't know. the self-contradicting misinterpretation "we don't know, but actually we do know that 10,000 deaths will happen" is impossible. the op-ed discussed two extreme scenarios to highlight the tremendous uncertainty absent reliable data: 10,000 deaths in the us, and 40,000,000 deaths. we needed reliable data, quickly, to narrow this vast uncertainty. we did get data and did narrow the uncertainty. science did work eventually, even if forecasts, including those made by one of us (confessed and discussed in box 1), failed. taleb 25 offers several analogies to assert that all precautionary actions are justified in pandemics, deriding "waiting for the accident before putting the seat belt on, or evidence of fire before buying insurance". 25 the analogies assume that the costs of precautionary actions are small in comparison to the cost of the pandemic, and that the consequences of the action have little impact on it. however, precautionary actions can backfire badly when they are misinformed. in march, modelers were forecasting collapsed health systems, e.g.
140,000 beds would be needed in new york, when only a small fraction of that number were available. precautionary actions damaged the health system, increased covid-19 deaths, 33 and exacerbated other health problems (table 4). seat belts cost next to nothing to produce in cars and have unquestionable benefits; despite some risk compensation and some excess injury with improper use, seat belts eventually prevent ~50% of serious injuries and deaths. 34 measures for pandemic prevention equivalent to seat belts in terms of benefit-harm profile are simple interventions like hand washing, respiratory etiquette and mask use in appropriate settings: large proven benefit, no or little harm and cost. 35,36 even before the covid-19 pandemic, we had randomized trials showing 38% reduced odds of influenza infection with hand washing and (non-statistically significant, but possible) 47% reduced odds with proper mask wearing. 35 despite the lack of trials, it is sensible and minimally disruptive to avoid mass gatherings and decrease unnecessary travel. prolonged draconian lockdown is not equivalent to seat belts; it resembles forbidding all commuting. similarly, fire insurance offers a misleading analogy: fire insurance makes sense only at a reasonable price, and draconian prolonged lockdown may be equivalent to paying fire insurance at a price higher than the value of the house. taleb refers to the netherlands, where maximum values for flooding, not the mean, are considered. 25 anti-flooding engineering has substantial cost, but a favorable decision-analysis profile after considering multiple types of impact. lockdown measures were decided based on examining only one type of impact, covid-19. moreover, the observed flooding maximum to date does not preclude even higher future values. the netherlands aims to avoid devastation from floods occurring once every 10,000 years in densely populated areas. 37 a more serious flooding event (e.g.
one that occurs every 20,000 years) may still submerge the Netherlands next week. However, prolonged total lockdown is not equivalent to building higher sea walls. It is more like abandoning the country, asking the Dutch to emigrate because their land is quite unsafe. Other natural phenomena also exist where high maximum risks are difficult to pinpoint and where new maxima may be reached. E.g., following Taleb's argumentation, one should forbid living near active volcanoes. Living at the Santorini caldera is not exciting but foolish: that dreadful island should be summarily evacuated. The same applies to California: earthquake devastation may strike at any moment. Prolonged lockdown zealots might barely accept a compromise: whenever substantial seismic activity occurs, California should be temporarily evacuated until all seismic activity ceases. Furthermore, fat-tailed uncertainty and approaches based on extreme value theory may be useful before a potentially high-risk phenomenon starts and during its early stages. However, as more data accumulate and the high-risk phenomenon can be understood more precisely with plenty of data, the laws of large numbers may apply, and stochastic rather than chaotic approaches may become more relevant and useful than continuing to assume unlikely extremes. Further responses to Taleb25 appear in Table 5. The short answer is: using science and more reliable data. We can choose measures with a favorable benefit-risk ratio when we consider together multiple types of impact, not only on COVID-19, but on health as a whole, as well as society and the economy. Currently we know that approximately half of the COVID-19 deaths in Europe and the USA affected nursing home residents.38,39 Another sizeable proportion were nosocomial infections.
40 If we protect these locations with draconian hygiene measures and intensive testing, we may avert 70% of the fatalities without large-scale societal disruption and without adverse consequences on health. Other high-risk settings, e.g. prisons, homeless shelters and meat-processing plants, also need aggressive protection. For the rest of the population, we have strong evidence of a very steep age gradient, with ~1000-fold differences in death risk for people >80 years old versus children.41 We also have detailed insights on how different background diseases modify COVID-19 risk of death or other serious outcomes.42 We can use hygiene and some least disruptive distancing measures to protect people. We can use intensive testing (i.e., again, use science) to detect resurgence of epidemic activity and extinguish it early; the countries that faced the first wave most successfully, e.g. Singapore and Taiwan, did exactly that. We can use data to track how the epidemic and its impact evolve. Data can help inform more granular models and titrate decisions considering distributions of risk (Figure 5).42 Poorly performing models, and models that perform well for only one dimension of impact, can cause harm. It is not just an issue of academic debate; it is an issue of potentially devastating, wrong decisions.36 Taleb25 seems self-contradicting: does he espouse abandoning all models (since they are so wrong) or using models but always assuming the worst? However, there is no single worst scenario, but a centile of a distribution: should we prepare for an event that has a 0.1%, 0.001%, or 0.000000000001% chance of happening? Paying what price in harms? Abandoning all epidemic modeling appears too unrealistic. Besides identifying the problems of epidemic modeling, Table 3 also offers suggestions on addressing some of them.
To summarize here some necessary (although not always sufficient) targets for amendments: • invest more in collecting, cleaning, and curating real, unbiased data, rather than just theoretical speculations that unfortunately flirt with this slippery, defensive path.45 Total lockdown is a bundle of dozens of measures. Some may be very beneficial, but some others may be harmful. Hiding uncertainty can cause major harm downstream and leaves us unprepared for the future. For papers that fuel policy decisions with major consequences, transparent availability of data, code, and named peer-review comments is also a minimum requirement. The possibility of calibrating model predictions by looking at extremes rather than just means is sensible, especially in the early days of pandemics, when much is unknown about the virus and its epidemiological footprint. However, when calibration/communication on extremes is adopted, one should also consider similar calibration for the potential harms of adopted measures. For example, tuberculosis has killed 1 billion people in the last 200 years, it still kills 1.5 million people (mostly young and middle-aged ones) annually, and prolonged lockdown may add to this burden. Use of extreme case predictions for COVID-19 deaths should be co-examined with extreme case predictions for deaths and impact from many other lockdown-induced harms. Models should provide the big picture of multiple dimensions. Similar to COVID-19, as more reliable data accrue, predictions on these other dimensions should also be corrected accordingly. Eventually, it is probably impossible (and even undesirable) to ostracize epidemic forecasting, despite its failures. Arguing that forecasting for COVID-19 has failed should not be misconstrued to mean that science has failed. Developing models in real time for a novel virus, with poor quality data, is a formidable task, and the groups who attempted this and made public their predictions and data in a transparent manner should be commended.
We readily admit that it is far easier to criticize a model than to build one. It would be horrifically retrograde if this debate ushered in a return to an era where predictions, on which huge decisions are made, are kept under lock and key (e.g. by the government, as is the case in Australia). We wish to end on a more positive note, namely where we feel forecasting has been helpful. Perhaps the biggest contribution of these models is that they serve as a springboard for discussions and debates. Dissecting variation in the performance of various models (e.g. casting a sharp eye on circumstances where a particular model excelled) can be highly informative, and a systematic approach to the development and evaluation of such models is needed.12 This demands a coherent approach to collecting, cleaning and curating data, as well as a transparent approach to evaluating the suitability of models with regard to predictions and forecast uncertainty. What we have learned from the COVID-19 pandemic can be passed to future generations, who hopefully will be better prepared to deal with a new, different pandemic, learning from our failures. There is no doubt that an explosive literature of models and forecasting will emerge again as soon as a new pandemic is suspected. However, we can learn from our current mistakes to be more cautious in interpreting, using, and optimizing these models. Being more cautious does not mean failing to act decisively, but it requires looking at the totality of the data; considering multiple types of impact; having scientists from very different disciplines involved; replacing speculations, theories and assumptions with real, empirical data as quickly as possible; and modifying and aligning decisions to the evolving best evidence. In the current pandemic, we largely failed to protect people and settings at risk. We could have done much better in this regard.
It is difficult to correct mistakes that have already led to people dying, but we can avoid making the same mistakes in future pandemics from different pathogens. We can avoid making the same mistakes even for COVID-19 going forward, since this specific pandemic has not ended as we write. In fact, its exact eventual impact is still unknown. For example, the leader of the US task force, Dr. Anthony Fauci, recently warned of reaching 100,000 COVID-19 US cases per day.48 Maybe this prediction is already an underestimate, because with over 50,000 cases diagnosed per day in early July 2020, the true number of infections may be many times larger. There is currently wide agreement that the number of infections in many parts of the United States is more than 10 times higher than the reported rates.49 "According to the Penn Wharton Budget Model (PWBM), reopening states will result in an additional 233,000 deaths from the virus - even if states don't reopen at all and with social distancing rules in place. This means that if the states were to reopen, 350,000 people in total would die from coronavirus by the end of June, the study found." Yahoo, May 3, 2020 (https://www.yahoo.com/now/reopeningstates-will-cause-233000-more-people-to-die-from-coronavirusaccording-to-wharton-model-120049573.html). Based on the JHU dashboard death count, the number of additional deaths as of June 30 was 5,700 instead of 233,000, i.e. total deaths were 122.7 thousand instead of 350 thousand. It is also unclear whether any of the 5,700 deaths were due to reopening rather than error in the original model calibration of the number of deaths without reopening. "Dr.
Ashish Jha, the director of the Harvard Global Health Institute, told CNN's Wolf Blitzer that the current data shows that somewhere between 800 to 1,000 Americans are dying from the virus daily, and even if that does not increase, the US is poised to cross 200,000 deaths sometime in September. "I think that is catastrophic. I think that is not something we have to be fated to live with," Jha told CNN. "We can change the course. We can change course today." "We're really the only major country in the world that opened back up without really getting our cases as down low as we really needed to," Jha told CNN." Business Insider, June 10, 2020 (https://www.businessinsider.com/harvard-expert-predictscoronavirus-deaths-in-us-by-september-2020-6). Within less than 4 weeks of this quote, the number of daily deaths was much less than the 800-1,000 quoted (516 daily average for the week ending July 4). Then it increased again. The number of actual total deaths as of September will be added here when available. Lack of consensus as to what is the "ground truth" even for seemingly hard-core data such as the daily number of deaths: counts may vary because of reporting delays, changing definitions, data errors, and more reasons. Different models were trained on different and possibly highly inconsistent versions of the data. As above: investment should be made in the collection, cleaning and curation of data. Wrong assumptions in the modeling: many models assume homogeneity, i.e. all people having equal chances of mixing with each other and infecting each other. This is an untenable assumption; in reality, tremendous heterogeneity of exposures and mixing is likely to be the norm.
Unless this heterogeneity is recognized, estimates of the proportion of people eventually infected before reaching herd immunity can be markedly inflated. There is a need to build probabilistic models that allow for more realistic assumptions, quantify uncertainty, and continuously readjust based on accruing evidence. High sensitivity of estimates: for models that use exponentiated variables, small errors may result in major deviations from reality. This is inherently impossible to fix; one can only acknowledge that the uncertainty in the calculations may be much larger than it seems. Lack of incorporation of epidemiological features: almost all COVID-19 mortality models focused on numbers of deaths, without considering age structure and comorbidities. This can give very misleading inferences about the burden of disease in terms of quality-adjusted life-years lost, which is far more important than the simple death count. For example, the Spanish flu killed young people with an average age of 28, and its burden in terms of quality-adjusted person-years lost was about 1000-fold higher than that of COVID-19 (at least as of June 8, 2020). The fix is to incorporate the best epidemiological estimates of age structure and comorbidities in the modeling, and to focus on quality-adjusted life-years rather than deaths. The core evidence to support "flatten-the-curve" efforts was based on observational data from the 1918 Spanish flu pandemic on 43 US cities. These data are >100 years old, of questionable quality, unadjusted for confounders, based on ecological reasoning, and pertaining to an entirely different (influenza) pathogen that had ~100-fold higher infection fatality rate than SARS-CoV-2. Even then, the impact on reduction of total deaths was of borderline significance and very small (10-20% relative risk reduction); conversely, many models have assumed a 25-fold reduction in deaths (e.g. from 510,000 deaths to 20,000 deaths in the Imperial College model) with adopted measures. While some interventions in the broader package of lockdown measures are likely to have beneficial effects, assuming huge benefits is incongruent with the past (weak) evidence and should be avoided. Large benefits may be feasible from precise, focused measures (e.g. early, intensive testing with thorough contact tracing for the early detected cases, so as not to allow the epidemic wave to escalate, as in Taiwan or Singapore; or draconian hygiene measures and thorough testing in nursing homes) rather than from blind lockdown of whole populations. Lack of transparency: many models used by policy makers were not disclosed as to their methods; most models were never formally peer-reviewed, and the vast majority have not appeared in the peer-reviewed literature even many months after they shaped major policy actions. While formal peer-review and publication may unavoidably take more time, full transparency about the methods, and sharing of the code and data that inform these models, is indispensable. Even with peer-review, many papers may still be glaringly wrong, even in the best journals.
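The heterogeneity point above can be illustrated with a short calculation. Under one approximation proposed in the literature (not from this paper), gamma-distributed susceptibility with coefficient of variation cv gives a herd immunity threshold of 1 - (1/R0)^(1/(1+cv^2)); cv = 0 recovers the homogeneous 1 - 1/R0. A minimal sketch, with illustrative numbers:

```python
# Hypothetical illustration of how heterogeneity deflates the herd immunity
# threshold (HIT). The formula is one approximation from the literature for
# gamma-distributed susceptibility; it is NOT taken from this paper.

def herd_immunity_threshold(r0: float, cv: float = 0.0) -> float:
    """Fraction infected/immune at which the epidemic turns over."""
    return 1.0 - (1.0 / r0) ** (1.0 / (1.0 + cv ** 2))

for cv in (0.0, 1.0, 2.0):
    # Homogeneous mixing (cv = 0) gives the classical 1 - 1/R0 = 0.60 at R0 = 2.5;
    # more heterogeneity (larger cv) lowers the threshold.
    print(f"R0=2.5, cv={cv}: HIT={herd_immunity_threshold(2.5, cv):.2f}")
```

The direction of the effect, lower thresholds under heterogeneity, is the point the table is making about inflated homogeneous estimates.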
Errors: complex code can be error-prone, and errors can happen even with experienced modelers; using old-fashioned software or languages can make things worse, and lack of sharing of code and data (or sharing them late) does not allow detecting and correcting errors. The fix is to promote data and code sharing and to use up-to-date, well-vetted tools and processes that minimize the potential for error through auditing loops in the software and code. Lack of determinacy: many models are stochastic and need a large number of iterations to be run, perhaps also with appropriate burn-in periods; superficial use may lead to different estimates. Promoting data and code sharing allows checking the use of stochastic processes and their stability. Looking at only one or a few dimensions of the problem at hand: almost all models that had a prominent role in decision-making focused on COVID-19 outcomes, often just a single outcome or a few outcomes (e.g. deaths, or hospital needs). Models prime for decision-making need to take into account the impact on multiple fronts (e.g. other aspects of health care, other diseases, dimensions of the economy). Interdisciplinarity is desperately needed; since it is unlikely that single scientists or even teams can cover all this space, it is important for modelers from diverse walks of life to sit at the same table. Major pandemics happen rarely, and what is needed are models that fuse information from a variety of sources. Information from data, from experts in the field, and from past pandemics needs to be fused in a logically consistent fashion if we wish to get any sensible predictions.
Lack of expertise in crucial disciplines: the credentials of modelers are sometimes undisclosed; when they have been disclosed, these teams are led by scientists who may have strengths in some quantitative fields, but these fields may be remote from infectious diseases and clinical epidemiology, and modelers may operate in a subject-matter vacuum. Make sure that the modelers' team is diversified and solidly grounded in terms of subject-matter expertise. Groupthink and bandwagon effects: models can be tuned to get desirable results and predictions, e.g. by changing the input of what are deemed to be plausible values for key variables. This is especially true for models that depend on theory and speculation, but even data-driven forecasting can do the same, depending on how the modeling is performed. In the presence of strong groupthink and bandwagon effects, modelers may consciously fit their predictions to the dominant thinking and expectations, or they may be forced to do so. Maintain an open-minded approach; unfortunately, models are very difficult, if not impossible, to pre-register, so subjectivity is largely unavoidable and should be taken into account in deciding how much forecasting predictions can be trusted. Forecasts may be more likely to be published or disseminated if they are more extreme. This is very difficult to diminish, especially in charged environments, and needs to be taken into account in appraising the credibility of extreme forecasts. Informing the public that we are doing our best but that hospitals will likely be overwhelmed by COVID-19, rather than pursuing honest communication with the general public: patients with major problems like heart attacks did not come to the hospital to be treated,5 while these are diseases that are effectively treatable only in the hospital; an unknown, but probably large, share of excess deaths in the COVID-19 weeks were due to these causes rather than COVID-19 itself.54 Re-orient
all hospital operations to focus on COVID-19, so as to be prepared for the COVID-19 wave and strengthen the response to the crisis: most hospitals saw no major COVID-19 wave and also saw a massive reduction in overall operations with major financial cost, leading to furloughs and lay-offs of personnel; this makes hospitals less prepared for any major crisis in the future. Table 5. Taleb's main statements and our responses. Statement: forecasting single variables in fat-tailed domains is in violation of both common sense and probability theory; serious statistical modelers (whether frequentist or Bayesian) would never rely on point estimates to summarize a skewed distribution. Response: using data as part of a decision process is not a violation of common sense, irrespective of the distribution of the random variable. Possibly, using only data and ignoring what is previously known (or expert opinion or physical models) may be unwise in small-data problems. We advocate a Bayesian approach, incorporating different sources of information into a logically consistent, fully probabilistic model. We agree that higher-order moments (or even the first moment, in the case of the Cauchy distribution) do not exist for certain distributions. This does not preclude making probabilistic statements such as P(a < X < b). A small, identifiable fraction of the population accounts for >90% of the potential deaths; >90% of the population could possibly continue with non-disruptive measures, since they account for only <10% of the total potential deaths.
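The point about probabilistic statements for distributions without finite moments can be made concrete. For the standard Cauchy distribution, which has no mean or variance, interval probabilities P(a < X < b) are still well defined through the CDF F(x) = 1/2 + arctan(x)/pi. A small sketch (ours, purely illustrative, not from the paper):

```python
import math

# The standard Cauchy distribution has no mean or variance, yet interval
# probabilities are perfectly well defined via its CDF. Purely illustrative.

def cauchy_cdf(x: float) -> float:
    """CDF of the standard Cauchy distribution: F(x) = 1/2 + atan(x)/pi."""
    return 0.5 + math.atan(x) / math.pi

def interval_prob(a: float, b: float) -> float:
    """P(a < X < b) for a standard Cauchy random variable X."""
    return cauchy_cdf(b) - cauchy_cdf(a)

# By symmetry, P(-1 < X < 1) is exactly 0.5 for the standard Cauchy.
p = interval_prob(-1.0, 1.0)
```

So the non-existence of moments constrains which summaries are meaningful, not whether probabilistic statements can be made at all.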
References:
Wrong but useful: what COVID-19 epidemiologic models can and cannot tell us
Will have 100 million cases of COVID-19 in four weeks, doubling every four days
Forecasting COVID-19 impact on hospital bed-days, ICU-days, ventilator-days and deaths by US state in the next 4 months
Reduced rate of hospital admissions for ACS during COVID-19 outbreak in northern Italy
Collateral damage: the impact on outcomes from cancer surgery of the COVID-19 pandemic
Years of life lost due to the psychosocial consequences of COVID-19 mitigation strategies based on Swiss data
Should governments continue lockdown to slow the spread of COVID-19? The totality of the evidence
Learning as we go: an examination of the statistical accuracy of COVID-19 daily death count predictions
A case study in model failure? COVID-19 daily deaths and ICU bed utilisation predictions
The 2009 influenza pandemic: review
Transmission intensity and impact of control policies on the foot and mouth epidemic in Great Britain
The foot-and-mouth epidemic in Great Britain: pattern of spread and impact of interventions
Use and abuse of mathematical models: an illustration from the 2001 foot and mouth disease epidemic in the United Kingdom
Estimating the human health risk from possible BSE infection of the British sheep flock
Tail risk of contagious diseases
Clinical impact of human coronaviruses 229E and OC43 infection in diverse adult populations
An outbreak of human coronavirus OC43 infection and serological cross-reactivity with SARS coronavirus
Complete genomic sequence of human coronavirus OC43: molecular clock analysis suggests a relatively recent zoonotic coronavirus transmission event
Five ways to ensure that models serve society: a manifesto
Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe
The illusory effects of non-pharmaceutical interventions on COVID-19 in Europe
US could see 100,000 new COVID-19 cases per day
key: cord-347791-wofyftrs; authors: hao, tian; title: prediction of
coronavirus disease (COVID-19) evolution in USA with the model based on the Eyring rate process theory and free volume concept; date: 2020-04-22; journal: nan; doi: 10.1101/2020.04.16.20068692; doc_id: 347791; cord_uid: wofyftrs. A modification arguing that the human movement energy may change with time is made to our previous infectious disease model, in which infectious disease transmission is treated as a sequential chemical reaction whose rate constants obey Eyring's rate process theory and the free volume concept. The modified model is employed to fit current COVID-19 outbreak data in the USA and to make predictions on the numbers of the infected, the removed, and the dead in the foreseeable future. Excellent fitting curves and regression quality are obtained, indicating that the model is working and the predictions may be close to reality. Our work could provide some ideas on what we may expect in the future and how we can prepare accordingly for this difficult period. During this global pandemic outbreak of coronavirus 2019 (COVID-19), the USA has become number one in terms of how many people are infected. What is going to happen next, and how many people may be infected and may die, has become an urgent question for policy makers drawing up mitigation plans. Mathematical modeling and analysis of infectious disease transmission [1-8] have been utilized to make predictions. Precise prediction remains challenging due to the randomness of human interactions and the unpredictability of virus growth patterns. Human mobility and virus transmission, however, should follow basic physical and chemical laws. Two very powerful theories in the physics and chemistry fields are Eyring's rate process theory and the free volume concept.
Eyring's rate process theory9 argues that every physical or chemical phenomenon is a rate-controlled process, while the free volume concept [10-14] argues that the transmission speed also depends on the available free volume. Many seemingly unrelated systems and phenomena can be successfully described with these two theories, such as glassy liquids,13 colloids and polymers,15,16 granules,[17-19] electrical and proton conductivity,20,21 superconductivity,22 and the Hall effect,23 etc. The infectious disease transmission phenomenon, a very complicated macroscopic process, could be properly analyzed with these two theories, too. Attempts were made to integrate these two theories for modeling infectious disease transmission under the assumption that an infectious disease transmission is a sequential chemical reaction.1 Focus was placed on analyzing the COVID-19 outbreak in China to validate the newly formulated model and to make predictions on peak time and peak infected. In this article, an infectious disease is still considered a sequential chemical reaction, following the popular SIR (susceptible, infectious, and removed) and SEIR (susceptible, exposed, infectious, and removed) compartment categorization methods proposed in the literature [2-8]. To better fit the data, a modification is made to our previous model1 by introducing the idea that the energy for human individuals to transmit diseases is time dependent, in line with other systems like granular powder under a tapping process, where the energy of particles is time dependent, too.18 The modified model is used to analyze COVID-19 transmission in the USA and to make predictions on potential infections and the death toll.
According to the model proposed previously,1 the whole infectious disease transmission process can be expressed as a sequential reaction scheme: schematically, S -(k1)-> I -> R for the MSIR model and S -(k3)-> E -(k4)-> I -> R for the MSEIR model, with the removal steps governed by k2, k2' and k5, k5' respectively, where S, E, I, and R represent the fractions or concentrations of the susceptible, the exposed, the infected, and the removed in the sequential chemical reaction. The difference between the MSIR (modified susceptible, infectious, and removed) and MSEIR (modified susceptible, exposed, infectious, and removed) models is that the MSIR model assumes that the susceptible transform directly into the infected, while the MSEIR model assumes an intermediate "exposed" state. k1, k2, k2', k3, k4, k5, k5' are the chemical reaction rate constants of the two reactions shown above. Once these parameters are known, the fractions S, E, I, R can be predicted. For the first-step reaction in the MSIR model, I will follow the same approach used previously, i.e. two steps are involved in the process from "S" to "I": human individual movement and virus particle movement. Human movement is an athermal stochastic random process. For athermal granular powder under a tapping process, we have demonstrated that it behaves like a thermal system and follows the stretched exponential pattern in terms of tap density changing with the number of taps.17 Our approach is mainly based on Theodor Förster's theory,24-27 which deals with the energy transfer from donors to randomly distributed acceptors. Other research has shown that human collective motion and the individual walking patterns of animals behave like thermal systems and follow the Boltzmann distribution, though the term temperature needs to be defined differently in these athermal systems.28,29
For the process of transformation from the susceptible to the exposed, human movements may play a major role, and the reaction rate of this process, k3 in the MSEIR model, may be expressed17 as a stretched-exponential function of the human movement energy barrier Ea (schematically, k3 ~ exp[-(Ea/w)^beta]), where w is the basic/unit energy that a person may need under normal circumstances, identical to the product of the Boltzmann constant and the temperature. Following the similar treatment applied to powder particles,17 we may assume that Ea is proportional to time and thus write Ea = n t, where n is a constant and t is the time. In the previous article, Ea was considered a constant, independent of time. After an individual is exposed, the transmission of virus particles from one person to another will make an "exposed" person become "infected". The transmission rate will depend on how fast these virus particles travel and how large a free volume is available for them to travel in. This is analogous to the viscosity or conductivity of an entity, which has been addressed for many systems in my previous articles,13,15,20,21,30 with the free volume estimated using the inter-particle spacing concept.15,31 The chemical reaction rate should be proportional to the "viscosity" of this entity; k4 can then be written30 in terms of the following quantities: V is the volume under consideration, r is the radius of a virus particle, σ is the shear stress applied when virus particles transmit from one place to another, γ̇ is the shear rate, NA is the Avogadro number, m is the mass of a virus particle, kB is the Boltzmann constant, T is the temperature, φ is the volume fraction of virus particles in the volume V, φm is the maximum packing fraction of virus particles, and E0 is the energy barrier for virus particles. In the MSIR model, we assume that during the transmission process from the susceptible to the infected, both human movement and virus particle transmission are involved, and the "exposed" is only a transient state.
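As a toy illustration of the time-dependent rate idea, a stretched-exponential rate with Ea = n·t can be sketched as follows; the numerical values of n, w, and beta are invented for demonstration and are not the paper's fitted values:

```python
import math

# Toy sketch of a stretched-exponential, time-dependent rate constant,
# k3(t) ~ exp(-(E_a/w)**beta) with E_a = n*t, as described in the text.
# The values of n, w and beta are invented for illustration only.

def k3(t: float, n: float = 0.05, w: float = 10.0, beta: float = 0.6) -> float:
    """Human-movement rate constant; decays as the energy barrier E_a = n*t grows."""
    return math.exp(-((n * t) / w) ** beta)

# k3 starts at 1 (no barrier at t = 0) and decays monotonically with time,
# mimicking progressively restricted human movement.
rates = [k3(t) for t in (0.0, 10.0, 50.0, 100.0)]
```

The monotone decay is the qualitative behavior the modification is meant to capture: transmission slows as the effective movement barrier accumulates over time.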
According to the transient state theory of chemical reactions,9 we may easily obtain the overall rate constant k1 for the MSIR first step. Eqs. (1)-(7) indicate that infectious disease transmission is a complicated process, dependent on many factors such as the human movement energy barrier, the particle size and volume fraction of virus particles, the mass of a virus particle, the temperature, and the volume under consideration. A smaller volume leads to a lower transmission rate, and isolation is definitely a good method for preventing the virus from spreading. For a sequential chemical reaction, the fraction of each reactant can be expressed with a series of differential equations.32 For the MSIR model we may write the corresponding rate equations for S, I, and R; for the MSEIR model, those for S, E, I, and R. Assuming that the initial fraction of the susceptible is s0 and that n t is always smaller than w, we may easily obtain the solutions; for the SEIR model this yields Eq. (16). Since the contribution from k2'R to the infected is negligible, based on the fact that the recovered may gain immunity from the disease and the fraction of the recovered is relatively small at early stages, the first-step reaction products E and I may be written down directly. (All rights reserved; no reuse allowed without permission. The copyright holder for this preprint, which was not certified by peer review, is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. This version was posted April 22, 2020.) Both equations above are first-order differential equations of the standard form dy/dt + P(t)y = Q(t), which has the standard solution y = e^(-∫P dt) (∫ Q e^(∫P dt) dt + C). Using exp(x) ≈ 1 + x when x < 1, we therefore obtain I for the MSIR model. Similarly, ignoring the contribution from k5'R, we may obtain I in MSEIR. Assuming that S + I + R = s0 for the MSIR model and S + E + I + R = s0 for the MSEIR model, we can obtain the removed fraction. Both I and E should have a peak value that can be simply determined by differentiating Eq.
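Since the displayed rate equations did not survive extraction, here is a minimal numerical sketch of the sequential-reaction backbone the text describes, S → E → I → R with first-order kinetics and constant rates k3, k4, k5. The paper's model additionally makes the first step time dependent and includes small reverse terms; all values below are illustrative:

```python
# Minimal Euler integration of a sequential first-order reaction S->E->I->R,
# the backbone of the MSEIR model described in the text. Rate constants and
# step size are illustrative; the paper's model adds a time-dependent first
# step and small reverse terms that are omitted here.

def mseir_euler(s0=1.0, k3=0.4, k4=0.2, k5=0.1, dt=0.01, t_end=100.0):
    s, e, i, r = s0, 0.0, 0.0, 0.0
    t, history = 0.0, []
    while t < t_end:
        ds = -k3 * s                # susceptible deplete
        de = k3 * s - k4 * e        # exposed: fed by S, drained into I
        di = k4 * e - k5 * i        # infected: fed by E, drained into R
        dr = k5 * i                 # removed accumulate
        s, e, i, r = s + ds * dt, e + de * dt, i + di * dt, r + dr * dt
        t += dt
        history.append((t, s, e, i, r))
    return history

hist = mseir_euler()
# The infected fraction rises to an interior peak and then declines, while
# S + E + I + R stays equal to s0 up to floating-point error.
```

Because every term that leaves one compartment enters the next, the total S + E + I + R is conserved at each Euler step, which is a quick sanity check on any implementation of such a scheme.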
24 and Eq. 25 against time. If there is no exact analytical solution, we may determine approximate peak values after the equations are plotted out. The fractions of the susceptible, exposed, infected, and recovered are functions of time, virus particle volume fraction, and environmental temperature. The trends between these parameters were graphed previously1 and shouldn't be impacted by the modification of the human movement energy term. Both the exposed and the infected peak at a certain time, increase dramatically with virus particle volume fraction, and decrease with temperature increase. Please refer to my previous article for further information. Focus in this section will be placed on how the infected changes with other critical parameters like β, and on whether these equations can be used to fit current data and make predictions. The infected against both time and the stretched exponential parameter β is plotted in Figure 1. The infected peaks with time and increases with β. The parameter β basically enlarges peak heights, implying that when β is large, more people are infected. The physical meaning of β, according to Phillips,33,34 is as follows: β = 3/5 for intrinsic molecular-level short-range interactions, β = 3/7 for intrinsic long-range Coulomb interactions, and β = 2/3 for extrinsic interactions. With the increase of β, more interaction between the entities is expected, i.e. more infections, which seems logical in terms of how infection transmission evolves. The rate constants could play a critical role in infection transmission. The parameters k1 and k3 are replaced with other terms in the equations, and the remaining parameters, like k2, k4 and k5, will be focused on.
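The peak-finding step described above has a closed form in the simplest case. For a plain sequential reaction S → I → R with constant first-order rates k1, k2 (a simplification of the paper's modified model; the values below are illustrative), setting dI/dt = 0 gives the peak time directly:

```python
import math

# For the unmodified sequential reaction S -(k1)-> I -(k2)-> R, the textbook
# intermediate is I(t) = s0*k1/(k2-k1)*(exp(-k1*t) - exp(-k2*t)); differentiating
# against time and setting dI/dt = 0 yields the peak time in closed form.
# k1, k2 are illustrative; the paper's modified rates require a numerical or
# graphical search instead, as the text notes.

def infected(t: float, s0: float, k1: float, k2: float) -> float:
    return s0 * k1 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))

def peak_time(k1: float, k2: float) -> float:
    """Solution of dI/dt = 0: t* = ln(k2/k1) / (k2 - k1), requires k1 != k2."""
    return math.log(k2 / k1) / (k2 - k1)

t_star = peak_time(0.3, 0.1)  # peak of the infected curve for these rates
```

When no such closed form exists, evaluating I(t) on a grid and taking the maximum, as the text suggests, is the practical fallback.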
figure 2 shows the infected against both time and the parameter k 2 for msir and the parameter k 4 for mseir. the infection peak against time does not show up in the msir graph, probably due to unreasonable variations of k 2 ; peaks do show up in figure 1 when a fixed value of k 2 is assigned. an infection peak against k 2 is observed, indicating that the recovery rate is important in disease transmission and may flatten the infection at very early stages. for the mseir model shown in figure 2 (b), a high infection rate indicated by k 4 means that the infection peaks at a very early time and disappears quickly afterwards, possibly due to the fact that a large number of people are infected and "herd immunity" may be generated to stop further spreading. another two parameters in the mseir model are amp and k 5 ; the impact of these two on the infected is shown in figure 3 . similar to β, amp enlarges or amplifies infection peak heights, so increasing amp may result in more people infected. the impact of k 5 on the infected is similar to that of k 4 shown in figure 2 : the infected peaks at an early time when more people can be removed from the system, including the recovered and the dead. if the equations created are correct, they should be able to fit data and make predictions on what is going to happen next. figure 4 shows the fraction of the infected in usa from march 12 to april 4, 2020, fitted with both the msir and mseir models. the r 2 of both fittings is larger than 0.99, as demonstrated in figure 4 (a). the same equations with the same fitting parameters are plotted again on a larger scale and shown in figure 4 (b).
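the claim that a larger recovery-rate constant flattens the infection peak can be checked with a minimal parameter sweep. the standard mass-action s -> i -> r chain and the constants below are illustrative, not the article's exact msir equations.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Sweep the recovery-rate constant (playing the role of k2 in the text's
# MSIR notation) and record the infection peak height.
def sir_chain(t, y, k1, k2):
    s, i, r = y
    return [-k1 * s * i, k1 * s * i - k2 * i, k2 * i]

peaks = []
for k2 in (0.05, 0.1, 0.2):
    sol = solve_ivp(sir_chain, (0, 300), [0.999, 0.001, 0.0],
                    args=(0.4, k2), dense_output=True)
    i_curve = sol.sol(np.linspace(0.0, 300.0, 3001))[1]
    peaks.append(i_curve.max())
```

a larger recovery rate should give a lower peak, matching the observation that recovery "may flatten the infection at very early stages".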
another important piece of information is the removed, including both the recovered and the dead. figure 5 shows two sets of data: one is the recovered alone and the other includes both the recovered and the dead. both sets of data are fitted with the msir and mseir models. again, the fitting is very good with r 2 larger than 0.99, though different fitting parameters are used for the recovered and for the recovered plus the dead. in both fitting processes β = 0.01 is used, implying that a very weak interpersonal interaction is found and there is no "herd immunity" happening. the large-scale graphs calculated with the same fitting parameters are shown in figure 6 for both the recovered alone (a) and the recovered plus the dead (b). a huge number of people is projected to recover in the future. however, we'd better separate the recovered and the dead from the removed. eq. 26 is thus used to fit the death-only data, as shown in figure 7 , with good fits obtained for all regressions. the predicted peak time, peak infected, death toll, and death rate are listed in table i .
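the goodness-of-fit criterion used throughout (r 2 > 0.99) can be made concrete with a small sketch. the logistic curve, the synthetic data, and the parameter values below are illustrative; the article fits its own derived msir/mseir expressions, but the r 2 computation is the same.

```python
import numpy as np
from scipy.optimize import curve_fit

# Illustrative goodness-of-fit check: fit a generic logistic curve to
# synthetic cumulative-count data and compute the coefficient of
# determination R^2.
def logistic(t, a, k, t0):
    return a / (1.0 + np.exp(-k * (t - t0)))

rng = np.random.default_rng(0)
t = np.arange(24, dtype=float)          # e.g. march 12 - april 4
y = logistic(t, 1e6, 0.3, 15.0) * (1.0 + 0.02 * rng.standard_normal(t.size))

popt, _ = curve_fit(logistic, t, y, p0=[1e6, 0.1, 10.0])
resid = y - logistic(t, *popt)
r2 = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
```

with only a few percent of noise, this kind of fit routinely reaches r 2 > 0.99, which is why a high r 2 alone does not uniquely validate a model.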
the predictions made in this article are based on the data collected in usa and released at https://en.wikipedia.org/wiki/2020_coronavirus_pandemic_in_the_united_states. the main sources of these data are johns hopkins university, https://www.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6, and https://www.coronavirus.gov/. the differences among the various sources are small, dependent on when the data are updated and released. the accuracy of the predictions shouldn't be affected by using different sources of data. the integration of eyring's rate process theory and the free volume concept has been demonstrated to work well for many multi-scale systems ranging from electrons to granular particles, and even the universe 35 . this approach is therefore naturally applied to disease transmissions, as human movement and virus particle transmissions should follow the same physical and chemical principles. excellent fitting quality, in terms of r 2 > 0.99, using the derived equations may indicate that the unified approach across multi-disciplinary areas does unveil fundamental operating mechanisms behind various phenomena. an interesting and critical parameter introduced in this article is β, which defines the interaction level between human individuals during disease transmissions. β = 0.2 is found for the infected data regression and β = 0.01 is found for all other regressions, indicating that the interpersonal transmission is weak in usa at this moment and the isolation policy is working.
the theoretical framework proposed in this article can be applied to other countries and other transmissible diseases, though the focus is currently put on covid-19 spreading in usa. with the argument that the human movement energy is time dependent, we have modified our infectious disease transmission model proposed previously. the remaining formulation and structure of the model are unchanged: the infectious disease transmission process from the susceptible, to the exposed, to the infected, and to the removed in the end continues to be treated as a sequential chemical reaction process, and the reaction rate at each step follows eyring's rate process theory and the free volume concept. the obtained equations are employed to describe the covid-19 outbreak currently ongoing in usa. excellent fitting curves are obtained with r 2 larger than 0.99 for all regressions, including the infected, the removed (the recovered with and without the dead), and the death toll alone. the msir and mseir models give different predictions: the msir model predicts that the infected will peak on august 20, 2020, with 4.75 million infected and 450000 deaths, while the mseir model predicts that the infected will peak on may 6, 2020, with 850000 infected and 50000 deaths. the difference may be caused by the "exposed" category in the mseir model, which may take a huge portion of the infected in the msir model. the death rate predicted with the mseir model, 5.9%, is closer to the current data, implying that the prediction made with the mseir model may be more realistic than that with the msir model. the infection peak time is strongly dependent on the stretched exponential parameter β, which substantially amplifies peak heights. large values of β mean that more people will be infected. for all regressions, the parameter β is less than 0.2, indicating that the interpersonal transmission is at a "weak" point and the "stay at home" isolation and travel
restriction is working for preventing covid-19 from spreading. the infection peak height is also dependent on the reaction rate constants such as k 2 , k 4 , and k 5 . small k 2 means large peak heights, while large k 4 and k 5 lead to large peak heights, i.e. a substantially larger number of infections.
infection dynamics of coronavirus disease 2019 (covid-19) modeled with the integration of the eyring rate process theory and free volume concept
a contribution to the mathematical theory of epidemics
seasonality and period-doubling bifurcations in an epidemic model
mathematical models of infectious disease transmission
unraveling r 0 : considerations for public health applications
stochastic epidemic dynamics on extremely heterogeneous networks
mathematical modeling of infectious disease dynamics
realistic distributions of infectious periods in epidemic models: changing patterns of persistence and dynamics
the theory of rate process
molecular transport in liquids and glasses
free-volume model of the amorphous phase: glass transition
source of non-arrhenius average relaxation time in glass-forming liquids
unveiling the relationships among the viscosity equations of glass liquids and colloidal suspensions for obtaining universal equations with the generic free volume concept
notes on free volume theories
viscosities of liquids, colloidal suspensions, and polymeric systems under zero or nonzero electric field
electrorheological fluids: the non-aqueous suspensions
derivation of stretched exponential tap density equations of granular powders
tap density equations of granular powders based on the rate process theory and the free volume concept
defining temperatures of granular powders analogously with thermodynamics to understand the jamming phenomena
electrical conductivity equations derived with the rate process theory and free volume concept
conductivity equations of protons transporting through 2d crystals obtained with the rate process theory and free volume concept
exploring high temperature superconductivity mechanism from the conductivity equation obtained with the rate process theory and free volume concept
integer, fractional, and anomalous quantum hall effect explained with eyring's rate process theory and free volume concept
energy migration and fluorescence (english translation, t. förster)
experimental and theoretical investigation of the intermolecular transfer of electronic excitation energy (experimentelle und theoretische untersuchung des zwischenmolekularen übergangs von elektronenanregungsenergie), z. naturforsch
excitation transfer from a donor to acceptors in condensed media: a unified approach
on the relationship among three theories of relaxation in disordered systems
collective motion of humans in mosh and circle pits at heavy metal concerts
variation in individual walking behavior creates the impression of a lévy flight
analogous viscosity equations of granular powders based on eyring's rate process theory and free volume concept
calculation of interparticle spacing in colloidal systems
stretched exponential relaxation in molecular and electronic glasses
topological derivation of shape exponents for stretched exponential relaxation
exploring the inflation and gravity of the universe with eyring's rate process theory and free volume concept
the author sincerely appreciates professor yuanze xu for his constructive feedback and comments, which substantially improved the readability and rationality of this article.
key: cord-324924-5f7b02yq authors: agarwal, a.; tyagi, u.
title: a transparent, open-source sird model for covid19 death projections in india date: 2020-06-04 journal: nan doi: 10.1101/2020.06.02.20119917 sha: doc_id: 324924 cord_uid: 5f7b02yq as india emerges from the lockdown with ever higher covid19 case counts and a mounting death toll, reliable projections of case numbers and death counts are critical in informing policy decisions. we examine various existing models and their shortcomings. given the amount of uncertainty surrounding the disease, we choose a simple sird model with minimal assumptions, enabling us to make robust predictions. we employ publicly available mobility data from google to estimate social distancing covariates which influence how fast the disease spreads. we further present a novel method for estimating the uncertainty in our predictions based on first principles. to demonstrate, we fit our model to three regions (spain, italy, nyc) where the peak has passed and obtain predictions for the indian states of delhi and maharashtra where the peak is desperately awaited. india has just emerged from a long and strict lockdown. there are doubts about how effective the lockdown has been and where the country is going from here. given the steep economic cost of lockdowns, it is important to understand their impact on case numbers and death counts. further, given the lack of information about the novel coronavirus, it is extremely hard to model the disease realistically. in our work, we choose a simple sird compartmental model in order to model death counts due to covid19. at a high level, the sird compartmental model in epidemiology partitions the population into four compartments: susceptible (those people who can potentially be infected by the virus), infected, recovered and dead.
it also takes as input the reproduction number (r 0 ), which depends on the social distancing behaviour of the people in the country; this allows us to account for variation in the infection by capturing details of the policies being implemented with respect to lockdowns in the country and summarizing them in a single number. we list popular models being referred to by the indian and international community and their shortcomings, and then we discuss how we overcome them. ihme model: after being cited by the united states government for its covid19 projections, the ihme model [1] has received a lot of attention from the public and professional community. it relies on data from china and italy for training the model parameters and extends the model to the united states. one of the criticisms of the model is that it uses statistical definitions to make predictions: it cannot give information on the actual parameters underlying the disease spread due to its departure from epidemiological theory. it has consistently under-performed compared to the actual statistics, and needs to be re-updated every few weeks. the code for this model is only partly open-sourced, so it is very hard for a third party to replicate their results. to overcome the limitations of statistical models like the ihme projections, stanford published a nine-compartment model [2] , which explicitly tracks nine compartments, including exposed, asymptomatic, pre-symptomatic, symptomatic, hospitalized, and recovered. a similar model was developed in the indian context [3] . the criticism of such complex epidemiological models is that they have a large number of tunable and learnable parameters which interact in unpredictable ways. further, given the uncertainty surrounding the disease, it is not clear how these values can be fixed reliably. since the data provided by the government becomes unreliable as the cases increase, the modelling in such cases becomes unreliable as well.
pracriti model by m3rg, iit delhi: the model by the m3rg group of iit delhi [4] is an extension of the seir model that creates adaptive, interacting, cluster-based seir compartments for district-level populations. this model suffers from the limitations stated for both previous models: they have added a lot of mathematical parameters which do not have physical meaning and thus cannot be checked against real-world data. with an additional migration term, they assume the migration statistics irrespective of the actual movement of the migrants and their social distancing measures. next, the model does not hold up well even on simple inspection. as of the time of writing, the r 0 for mumbai is reported as 0.73, while the same value for the state of maharashtra is 1.33. this is unreasonable because mumbai is known to be contributing the most to the case load of maharashtra. further, there are no confidence intervals in their plots, which makes their predictions meaningless, since it is impossible to extrapolate cases with 100% accuracy. further, they have not open-sourced their code. our model is inspired by the ones mentioned above and aims to combine various techniques to avoid the pitfalls outlined above. salient features of our model include -modelling on daily deceased data: we use daily deceased data for time-series forecasting of covid19. in this work, we only include projections for future deaths, although this can be extended to projections for other quantities like the number of infections and the number of icu beds required. the benefit of using deaths rather than infections is that the latter crucially depends on the scale of testing and reporting by the relevant government agencies. further, cases are reported as and when they are discovered, which means that the time when a person was infected does not show up in the data. on the other hand, deaths due to covid19 occur in hospital icus and the exact time of death is known.
further, since all patients who eventually die end up in the hospitals, it is unlikely that deaths are being under-counted (barring deliberate under-reporting). in this way, we bypass the uncertainty of testing. further, it can be argued that the death toll and the number of icu beds required are more important to estimate than the total number of infections, since most infected people recover from covid19 without medical assistance. simplicity: the model implements a four-compartment sird-based model where we vary the reproduction number (r 0 ) with respect to the social distancing measures. this means that r 0 effectively summarizes the entire suite of prevention strategies adopted by a region as well as migration and mixing patterns. this gives us the flexibility to monitor real-time policy changes in the data and update r 0 accordingly. also, because of just four compartments, we require only a few disease-specific parameters (the infectious period and the mortality rate). more complicated models need a larger number of parameters. since the values of these parameters are not known accurately at this point in time, such models are prone to over-fitting. fewer parameters also mean that our model is region agnostic and can be implemented at national and state levels, all over the world, with minimal modification. (it is made available under a cc-by-nc 4.0 international license. the copyright holder for this preprint, which was not certified by peer review, is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. this version posted june 4, 2020. https://doi.org/10.1101/2020.06.02.20119917) transparency: since our model is based on parameters well-documented in epidemiological theory, we can do a sanity check on the inferred values to see if they agree with what is known at this point in time.
this can also be used, in principle, to compare the effects of different policies in mitigating the spread of the virus by comparing the variation in the reproduction number. further, we believe that any modelling effort must strive to be as transparent as possible. this is because, to the non-expert, the projections churned out by sophisticated mathematical machinery seem to carry more weight than they really do. in reality, given the large number of variables involved, most mathematical models end up being nothing more than educated guesses and are only as strong as the assumptions they implicitly make. therefore, we believe it is critical that all results should be made publicly available and the methodology explained in as detailed a manner as possible. many models we have mentioned above do not do this: they have not open-sourced their code completely, and it is not possible, or at least very difficult and time-consuming, to replicate their results by reading the technical reports alone. in contrast, we have completely open-sourced our code with relevant documentation so that the community can critique our assumptions and contribute their own ideas to improve the model. since it is impossible to predict the death count into the future with 100% accuracy, no model can be complete without providing confidence intervals for its projections. we present a novel strategy to compute these confidence intervals from first principles, with the empirical claim that most of the time the observed death counts will fall within these intervals. an sird model is described by the following coupled differential equations: ds/dt = -(r/t_inf) s i, di/dt = (r/t_inf) s i - i/t_inf, dx/dt = γ_x i. here, s, i, x are respectively the fraction of the population that is susceptible, infected and deceased. we omit the number of people who have recovered because we do not fit our model on that data.
t_inf denotes the median amount of time a person stays infectious, and γ_x is the average number of people who die from covid19 in a day as a fraction of the total number of active cases on that day. r is the reproduction number of the disease, which measures the average number of people an infected person transmits the disease to. r > 1 implies that the case count rises over time, while r < 1 implies that the case count diminishes over time, with the rate of spread being determined by r. note that in general r varies with time depending on the extent of social distancing practiced. note that γ_x and t_inf are spatially-invariant and time-invariant properties of the disease: they depend on the virus specifics and how the human body responds to it. thus, these parameters can be bounded in a small range based on preliminary studies from wuhan and europe, where covid19 case counts have been large. we then see that all the information about the progression of the disease lies in a single parameter -r. this is the major advantage of using a simple model: we do not have to deal with lots of uncertain parameters that influence the final curve in unpredictable ways, and we can focus on estimating r as best we can. further, r is a well-established measure of disease spread in the epidemiological literature, which means there are already many existing estimates of r which our model can leverage. r is a function of time and depends on, among other things, the lockdown and social distancing measures adopted by each country. to estimate r we leverage open-source, real-time social distancing data published by google [5] , which allows us to model various mitigation measures with just two parameters, as described below. while the social mobility data does not account directly for various measures such as contact tracing and mask usage, we nonetheless postulate that the timing of these measures is correlated with the timing of the social distancing measures indicated by the mobility data.
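the threshold behaviour of r described above can be verified with a two-line day-by-day discretization of the sird infection equation; the parameter values and horizon below are illustrative.

```python
# Discrete-time check of the epidemic threshold: with s ~ 1 early in an
# outbreak, the infected fraction grows when r > 1 and shrinks when r < 1.
def infected_after(r, t_inf=10.0, i0=1e-6, days=60):
    s, i = 1.0, i0
    for _ in range(days):
        new_inf = (r / t_inf) * s * i   # daily new infections
        removed = i / t_inf             # daily removals (recovered + dead)
        s, i = s - new_inf, i + new_inf - removed
    return i

growing = infected_after(1.5)    # r > 1
shrinking = infected_after(0.8)  # r < 1
```

the per-day growth factor is approximately 1 + (r - 1)/t_inf while s stays close to 1, which is why a small change in r near 1 has a large effect on the trajectory.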
the data, available in aggregated form, shows how the number of visitors who go to (or spend time in) categorized places changes compared to pre-covid days. a baseline day represents a normal value for that day of the week; the baseline day is the median value from the 5-week period jan 3 - feb 6, 2020. [table 1 : raw and smooth social distancing data for three different regions from 15 feb.] the places are categorized into retail and recreation, grocery and pharmacy, parks, transit stations, workplaces, and residential. additionally, as a sanity check, we looked at the smartphone penetration in the country to validate the model. the report by statcounter [6] suggests that android-based smartphones constituted more than 95% of all smartphones being used in the country in april 2020. with this market share, the open-source data model performs well. we construct a social distancing covariate s(t) from the changes in social mobility at various locations, where each of the terms on its right-hand side denotes the percentage change from baseline in mobility at the following locations
• ∆i r (t) -retail and recreation
• ∆i g (t) -grocery and pharmacy
note that we ignore residential mobility data, as residential mobility does not contribute to the spread of the disease. next, we smooth the covariate s(t) by applying a savitzky-golay filter followed by convolution with a localized gaussian multiple times. the goal is to smooth out weekly variations in the data but not distort the overall profile of the curve. this gives us s(t), the smooth social distancing covariate.
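the smoothing pipeline above (a savitzky-golay filter followed by repeated gaussian convolution) can be sketched as follows; the window sizes, gaussian width, and the synthetic mobility series are illustrative choices, not the values used by the authors.

```python
import numpy as np
from scipy.signal import savgol_filter
from scipy.ndimage import gaussian_filter1d

rng = np.random.default_rng(0)
days = np.arange(120)
# Synthetic raw covariate: a lockdown-style drop in mobility plus a strong
# weekly ripple plus noise.
raw = (-40.0 / (1.0 + np.exp(-(days - 30) / 4.0))
       + 8.0 * np.sin(2 * np.pi * days / 7.0)
       + rng.normal(0.0, 2.0, days.size))

# Savitzky-Golay pass, then several localized Gaussian passes.
smooth = savgol_filter(raw, window_length=15, polyorder=2)
for _ in range(3):
    smooth = gaussian_filter1d(smooth, sigma=2.0)
```

the test of a good choice of windows is exactly the one the text states: the day-to-day (weekly) variation should shrink while the overall drop in the curve survives.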
since we only care about the timing of social distancing measures, to relate s(t) to r we introduce two parameters r_min and r_max: the r values when s(t) is at its minimum and maximum respectively. we then define r as a linear interpolation function of s between these two values. mathematically, r(t) = r_min + (r_max - r_min) (s(t) - s_min)/(s_max - s_min), where s_max, s_min are the global maximum and minimum values of s(t). further, we introduce a fixed lag δ_sd, which equals the median time from infection to death. this is because s(t) influences the number of infections at time t, while the effect on deaths is seen only later. where social distancing data is not available, we naively extrapolate the existing data into the future as well as the past. concretely, we assume that past values equal the earliest value we know and future values equal the latest value. this amounts to assuming that existing social distancing measures will continue into the future. this assumption can be altered as we learn more about the disease and mitigation strategies in the future. we solve the differential equations using a simple iterative procedure where the values of the next day are determined by the values of the previous day. we avoid more complicated numerical techniques like the runge-kutta methods because they proved to be too computationally intensive for fitting a large number of models; additionally, for our purposes the above recurrences yield a reasonable approximation. note that to solve the system of differential equations above we need to specify an initial condition. in particular, we need to specify initial values for the time t and each of s, i, x.
since the set of differential equations (1) - (3) is valid at all points of time, we can arbitrarily choose a starting point. we start the model just before we get the first death. obviously, s_t0 = 1 and x_t0 = 0. we choose i_t0 = 1/(γ_x p), where p is the population. this choice implies that on day 1 there will be exactly one death. since real death counts are discrete, we choose t_0 in a narrow interval around where the actual death counts start to rise. to prepare the daily death counts, we obtain raw death counts from two sources -johns hopkins [7] and covid19india.org [8], a volunteer-driven tracker project. we then smooth this death count using a combination of savitzky-golay and gaussian convolution filters. care needs to be taken to not distort the peak too much, as with a large amount of smoothing the peak tends to decrease in height. finally, to fit the data we do a fine-grained brute-force grid search [9] over the possible parameter values we provide and obtain the prediction with the lowest mean squared loss. in general, we fix γ_x = 1.6e-3 (this implies a mortality rate of 0.8%) and δ_sd = 23, vary t_0 within a small margin near the beginning of the death count curve, and vary r_max from 1.4 - 2.8 and r_min from 0.7 - 0.95. predicting the peak (height and position) is quite tricky because the beginning of the curve looks quite similar for different values of r. further, the peak can be quite sensitive to the values of r_max and r_min. r_min is especially hard to estimate because it depends on the death count close to or after the peak, after social distancing measures have been put into place. we discuss these issues in greater detail in the following section. let m be the random vector corresponding to the choices for the (variable) parameters in the model. then the density function f(m) constitutes a prior on the choice of these parameters. as an approximation, we assume uniform priors on the parameters (this can in principle be extended to other priors).
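the fitting loop described above — forward-euler integration with the i_t0 = 1/(γ_x p) initial condition, plus a brute-force grid search minimizing mean squared loss — can be sketched as below. the synthetic target data, the population, and the simplification of a single constant r per run are illustrative; the authors interpolate r from mobility data instead.

```python
import numpy as np

GAMMA_X, T_INF, POP = 1.6e-3, 10.0, 2e7   # illustrative fixed parameters

def daily_deaths(r, days):
    # Forward-Euler SIRD with the initial condition i_t0 = 1/(gamma_x * p),
    # so that approximately one death occurs on day 1.
    s, i = 1.0, 1.0 / (GAMMA_X * POP)
    deaths = []
    for _ in range(days):
        new_inf = (r / T_INF) * s * i
        s, i = s - new_inf, i + new_inf - i / T_INF
        deaths.append(GAMMA_X * i * POP)  # absolute deaths on this day
    return np.array(deaths)

# Pretend these are the smoothed observed counts, then recover r by a
# fine-grained brute-force grid search over the allowed range.
observed = daily_deaths(1.9, 90)
grid = np.arange(1.4, 2.8, 0.01)
losses = [np.mean((daily_deaths(r, 90) - observed) ** 2) for r in grid]
best_r = grid[int(np.argmin(losses))]
```

on synthetic data the grid search recovers the generating r to within the grid resolution, which is a useful sanity check before fitting real counts.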
this is reasonable because, from our knowledge of other countries, we can place bounds on r quite confidently, whereas pinpointing a single value of r is very hard. further, we assume that the data y_t ~ h(t; m) + n(0, σ^2), where h is the hypothesis (the sird model), y_t is the observed deaths on day t, and y is the vector of observed deaths. note that we can assert that we only include in our confidence interval those parameters which have at least a chosen minimum probability, where l(y, m) is the average root-mean-squared loss between y and the model output for m. this means that (7) is equivalent to: simplifying this, we get: note that because of the uniform prior the rhs is independent of m. further, for a theoretically perfect fit, p(m = m) = 1, making the first term zero. therefore, we can interpret the second term, 2σ^2 ln(1/(√(2π) σ)), as the minimum loss (or the loss of the best-fit curve). this gives us a metric for choosing admissible values of the parameter m: we accept those m whose loss l(y, m) lies within a margin of the best-fit loss, where α is a constant we choose and m* are the best-fit parameters. since our brute-force algorithm gives us the average loss for each possible m, we can select those m for which the loss satisfies the above inequality. having obtained a set of values of m, we obtain the corresponding curves and plot the minimum and maximum predictions over all these curves to obtain a confidence interval. note that the acceptable range of l(y, m) grows smaller as t_max increases, i.e. as we get more data. this agrees with our intuition, which says that as we get more data the confidence interval should become narrower (for fixed t, α). in practice, we start with a conservatively high value of α (= 200).
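the interval construction can be sketched as follows: keep every parameter choice whose loss is within an α-controlled margin of the best-fit loss, then take the pointwise minimum and maximum of the accepted curves. the toy one-parameter hypothesis, the exact margin rule (α divided by the number of data points), and the α value are all illustrative assumptions, not the authors' precise formula.

```python
import numpy as np

t = np.arange(40, dtype=float)

def curve(m):
    # Toy one-parameter hypothesis h(t; m).
    return np.exp(m * t / 10.0)

rng = np.random.default_rng(1)
y = curve(0.8) + rng.normal(0.0, 0.5, t.size)   # data with gaussian noise

grid = np.arange(0.5, 1.1, 0.01)
losses = np.array([np.sqrt(np.mean((curve(m) - y) ** 2)) for m in grid])

alpha = 200.0
margin = alpha / t.size                  # margin shrinks as data accumulate
admissible = grid[losses <= losses.min() + margin]

# Pointwise envelope of all admissible curves = the confidence band.
band = np.array([curve(m) for m in admissible])
lower, upper = band.min(axis=0), band.max(axis=0)
```

because the margin is divided by the number of observations, the admissible set, and hence the band, tightens automatically as more data arrive, matching the intuition stated in the text.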
as we get more data, we increase α if the actual values fall outside the confidence interval. the initial value of α is chosen based on empirically fitting the model to different countries and observing that this value gives reasonably sized confidence intervals. here we present results for three different regions whose death count peaks have passed. all three -spain, nyc, italy -were badly affected by the coronavirus, as india is likely to be. we use these curves to validate the values we have chosen for the fixed parameters and demonstrate the effectiveness of our model. more detailed plots can be found in the git repository. table 2 contains the graphs of our projections. each row contains projections for one region. across a row, we vary the number of data points we fit the model on, obtain projections for the remaining times, and compare them to the actual death counts. note that in each figure, the area shaded red contains points the model has not been fitted on. tables 3, 4, 5 contain the numerical values for the parameters that are inferred/used by our model in each case. here, train loss is the mean squared loss of the solid blue line with respect to the data it is fitted on. breakpoint denotes the number of data points from the beginning which are included in the train set; for example, a breakpoint of 60 implies that the first 60 data points are used for fitting and the rest are ignored. it is worth noting that for nyc, italy and spain our model predicts r_min < 1, indicating that social distancing measures have been effective in these places, which is indeed the case. also notice that, as we expect, with more data the uncertainty interval narrows and converges to the observed data. we now include predictions for two critical regions in india -delhi and maharashtra, both badly affected by the virus. note that the uncertainty intervals for maharashtra in the beginning of the curve are very high.
This indicates that the model does not fit well to the initial part of the curve. This might be a consequence of (a) the fact that Maharashtra is a large state and different parts of the state are affected differently by the virus (a model fit to Mumbai would perform better), and (b) the timing of the drop in S(t) not tracking well the timing of prevention measures taken in the state. On the other hand, the model performs quite well on Delhi. This is likely because Delhi is much more homogeneous in terms of demographics and is better connected, which means that Google mobility data is likely to reflect well how much people are social distancing. Note that the current projections for Delhi assume that social distancing will continue at lockdown levels. This is unlikely to remain the case, as Delhi has started re-opening; nevertheless, the government is still attempting to aggressively identify and quarantine so-called containment zones. Table 2: predictions with uncertainty intervals for three different regions, New York City (NYC), Spain, and Italy (from top to bottom). In each figure, the red area contains points the model has not been fitted on and the shaded blue region is a confidence interval. A clear conclusion from the data is that even during the lockdown, which has been called one of the strictest in the world, R_min remained above 1, unlike in other countries. This can indeed be seen from our model as well, which predicts
Table 6: predictions with uncertainty intervals for Maharashtra and Delhi (from top to bottom). In each figure, the red area contains points the model has not been fitted on and the shaded blue region is a confidence interval. We give projections for the death counts into the future in both cases. R_min < 1 for Italy, NYC and Spain, but R_min > 1 for Maharashtra and Delhi. Now that the country is re-opening, R can only be expected to increase further. In particular, our model for Delhi predicts that even at lockdown levels of social distancing, the peak is around 75 days out and we can expect to see as many as 400 deaths per day near the peak. This is a horrifying scenario to contemplate and is going to severely affect the elderly, those with co-morbidities, and our frontline workers. It is hoped that through these results we are able to emphasize the urgency with which India needs to find an effective strategy to contain the virus. Based on our discussion in the previous sections, we can see the following directions in which the model can be improved:
• Improving quality of R estimation. With the wide usage of the Aarogya Setu app, the government has accurate raw data available on people's movement patterns. If we can acquire this data through official channels, we can further improve our R estimates.
• Constructing an online dashboard. We can construct an online dashboard which shows projections with uncertainty intervals in real time for all districts in India. Further, we can allow the user to transparently adjust R to understand how critical social distancing is to containing the spread.
• Improving uncertainty estimation. Currently we choose α based on empirical conditions; that is, we run the model on many different countries and choose the α for which a large majority of predictions fall within the confidence interval. Can we choose α in a more principled manner?
• Collaboration with MoHFW. A big reason for undertaking this project was that we recognized the urgent need to come up with effective strategies to combat COVID-19. Given that IIT Delhi is a well-respected institution, we hoped that through the I4 Students Challenge we would be able to communicate this urgency to the government. If we can collaborate with policy-makers at the Ministry of Health and Family Welfare, we believe this model can help many people in the days to come.
Please feel free to contact the authors or open an issue/pull request on the GitHub repository to request clarification, suggest improvements or features.
References (titles as extracted):
• IHME | COVID-19 projections
• Potential long-term intervention strategies for COVID-19
• A state-level epidemiological model for India: INDSCI-SIM
• An adaptive, interacting, cluster-based model accurately predicts the transmission dynamics of COVID-19
• Google COVID-19 Community Mobility Reports
• Mobile operating system market share India
• CSSEGISandData/COVID-19
• COVID-19 projections using machine learning

key: cord-350240-bmppif8g
authors: Girardi, Paolo; Greco, Luca; Mameli, Valentina; Musio, Monica; Racugno, Walter; Ruli, Erlis; Ventura, Laura
title: Robust inference for nonlinear regression models from the Tsallis score: application to COVID-19 contagion in Italy
date: 2020-08-12
journal: Stat (Int Stat Inst)
doi: 10.1002/sta4.309
sha:
doc_id: 350240
cord_uid: bmppif8g
We discuss an approach for robust fitting of nonlinear regression models, in both frequentist and Bayesian frameworks, which can be employed to model and predict the contagion dynamics of COVID-19 in Italy.
The focus is on the analysis of epidemic data using robust dose-response curves, but the methodology applies to arbitrary nonlinear regression models. We aim to discuss a robust approach to model and predict the spread of coronavirus disease 2019 in Italy, due to the worldwide epidemic outbreak of the new coronavirus SARS-CoV-2. In particular, we focus on deaths and intensive care unit (ICU) hospitalization data, which are expected to aid the detection of the time when the peaks and the upper asymptotes of contagion, both in daily new cases and in total cases, are reached, so that preventive measures (such as mobility restrictions) can be applied and/or relaxed. For these data, robust procedures are particularly useful since they allow us to deal with model misspecification and data reliability simultaneously. Nonlinear regression is an extension of classical linear regression, in which data are modeled by a function which is a nonlinear combination of unknown parameters and depends on an independent variable. A relevant application of nonlinear regression models concerns the modeling of so-called dose-response relations, useful in toxicology, pharmacology, and the analysis of epidemic data. In these frameworks, the parameters of the model have a relevant interpretation, such as the upper limit and the inflection point. A normal nonlinear regression model is obtained by replacing the linear predictor x_i^T β by a known nonlinear function µ(x, β), called the mean function. The model (see, e.g., Bates and Watts, 2007) is y_i = µ(x_i, β) + ε_i, called a nonlinear regression model, where x_i is a scalar covariate, β is an unknown p-dimensional parameter, and the ε_i are independent and identically distributed N(0, σ²) random variables. Likelihood inference is the usual approach to deal with nonlinear models.
The log-likelihood function for θ = (β, σ²) follows directly from the normal assumption, and all likelihood quantities (maximum likelihood estimates, tests, confidence intervals, predictions, etc.) can be easily derived. (This article has been accepted for publication and has undergone full peer review, but has not been through the copyediting, typesetting, pagination and proofreading process, which may lead to differences between this version and the Version of Record. Please cite this article as doi: 10.1002/sta4.309.) Using the statistical environment R, the package drc (Ritz et al., 2015) provides a user-friendly interface to specify the model assumptions about the nonlinear relationship and comes with a number of extractors for summarizing fitted models and carrying out inference on derived parameters. A large number of more or less well-known mean functions are available (log-logistic, Weibull, gamma, etc.; see for instance Ritz et al., 2015, Table 1). These functions are parameterized using a unified structure, with a coefficient b denoting the steepness of the curve, c and d the lower and upper asymptotes or limits of the response, and, for some models, e the inflection point. For instance, the five-parameter log-logistic curve assumes µ(x, β) = c + (d − c)/{1 + exp[b(log x − log e)]}^f. However, in the presence of model misspecification or deviations of the observed data from the assumed model, classical likelihood inference may be inaccurate (see, e.g., Huber and Ronchetti, 2009, Farcomeni and Ventura, 2012, Farcomeni and Greco, 2015, and references therein). This paper aims to discuss the use of robust inference for nonlinear regression models. In particular, we discuss a general approach based on the Tsallis score (Basu et al., 1998; Ghosh and Basu, 2013; Dawid et al., 2016; Mameli et al., 2018; Giummolé et al., 2019), in both frequentist and Bayesian frameworks. To deal with model misspecification, useful surrogate likelihoods are given by proper scoring rules (see Dawid et al., 2016, and references therein).
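The paper fits these curves with R's drc package; the following Python sketch writes out the same mean functions under the standard drc parameterisation (an assumption about the formula that was lost in extraction):

```python
import numpy as np

def ll5(x, b, c, d, e, f):
    # Five-parameter log-logistic mean function, drc-style parameterisation:
    # b = steepness, c/d = lower/upper asymptotes, e = inflection (on the dose
    # scale), f = asymmetry.
    return c + (d - c) / (1.0 + np.exp(b * (np.log(x) - np.log(e)))) ** f

def ll3(x, b, d, e):
    # Three-parameter special case used later in the paper (c = 0, f = 1);
    # here e is the point where the curve reaches d/2 (the median time).
    return ll5(x, b, 0.0, d, e, 1.0)
```

With f = 1, the curve passes through d/2 exactly at x = e, which is why the paper can read off e as the median time of the outbreak.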
A scoring rule is a loss function used to measure the quality of a given probability distribution for a random variable Y, in view of the realized outcome y of Y. When working with a parametric model with probability density function f(y; θ), with θ ∈ Θ ⊆ ℝ^d, an important example of a proper scoring rule is the log-score, which corresponds to minus the log-likelihood function (Good, 1952). In this paper, to deal with robustness, we focus on the Tsallis score (Tsallis, 1988), given for γ > 1 by S(y; θ) = (γ − 1) ∫ f(z; θ)^γ dz − γ f(y; θ)^(γ−1); the density power divergence d_α of Basu et al. (1998) is just (4), with γ = α + 1, multiplied by 1/α. The Tsallis score gives in general robust procedures (Ghosh and Basu, 2013), and the parameter γ is a trade-off between efficiency and robustness. For the nonlinear regression model (1), the total Tsallis score for θ = (β, σ²) is the sum of the contributions over the observations. The validity of inference about θ = (β, σ²) using scoring rules can be justified by invoking the general theory of unbiased M-estimating functions. Indeed, inference based on proper scoring rules is a special kind of M-estimation (see, e.g., Dawid et al., 2016, and references therein). The class of M-estimators is broad and includes a variety of well-known estimators, for example the maximum likelihood estimator (MLE) and robust estimators (see, e.g., Huber and Ronchetti, 2009), among others. Let s(y; θ) = ∂S(y; θ)/∂θ be the gradient vector of S(y; θ) with respect to θ. Under broad regularity conditions (see Mameli and Ventura, 2015, and references therein), the scoring rule estimator θ̂ is the solution of the unbiased estimating equation s(θ) = Σ_{i=1}^n s(y_i; θ) = 0, and it is asymptotically normal, with mean θ and covariance matrix V(θ)/n, where V(θ) = K(θ)^{-1} J(θ) K(θ)^{-T}, with K(θ) = E_θ(∂s(Y; θ)/∂θ^T) and J(θ) = E_θ(s(Y; θ) s(Y; θ)^T) the sensitivity and variability matrices, respectively.
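A small sketch of the Tsallis score for a normal model; the closed form of ∫ f^γ dz for the normal density is standard, and the test below checks it against numerical integration. The robustness mechanism is visible directly: for a gross outlier the density term vanishes, so the score contribution stays bounded, unlike the log-score.

```python
import numpy as np

def normal_pdf(y, mu, sigma):
    return np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (np.sqrt(2.0 * np.pi) * sigma)

def tsallis_score(y, mu, sigma, gamma=1.5):
    # Tsallis (density power) score for a normal model, gamma > 1:
    #   S(y; theta) = (gamma - 1) * int f(z)^gamma dz - gamma * f(y)^(gamma - 1).
    # For N(mu, sigma^2) the integral has the closed form
    #   (2*pi*sigma^2)^((1 - gamma)/2) / sqrt(gamma).
    integral = (2.0 * np.pi * sigma ** 2) ** ((1.0 - gamma) / 2.0) / np.sqrt(gamma)
    return (gamma - 1.0) * integral - gamma * normal_pdf(y, mu, sigma) ** (gamma - 1.0)
```

As y moves far from mu, f(y)^(γ−1) → 0 and the score flattens at (γ−1)·∫f^γ, which is the bounded-influence behaviour the paper exploits.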
The matrix G(θ) = V(θ)^{-1} is known as the Godambe information, and its form is due to the failure of the information identity since, in general, K(θ) ≠ J(θ). Asymptotic inference on the parameter θ can be based on the Wald-type statistic, which has an asymptotic chi-square distribution with d degrees of freedom. In contrast, the asymptotic distribution of the scoring rule ratio statistic is a linear combination of independent chi-square random variables, with coefficients related to the eigenvalues of the matrix J(θ)K(θ)^{-1} (Dawid et al., 2016). More formally, W_S(θ) ∼ Σ_{j=1}^d µ_j Z_j², with µ_1, …, µ_d the eigenvalues of J(θ)K(θ)^{-1} and Z_1, …, Z_d independent standard normal variables. Adjustments of the scoring rule ratio statistic have received consideration in Dawid et al. (2016), extending results of Pace et al. (2011) for composite likelihoods; in particular, the rescaling factor A(θ) = (s(θ)^T J(θ) s(θ))/(s(θ)^T K(θ) s(θ)) can be used. Estimates of J(θ) and K(θ) can be obtained by using a parametric bootstrap; see Varin et al. (2011) for a detailed discussion of the issues related to their estimation. However, for the Tsallis score (5) the matrices K(θ) and J(θ) can be derived analytically. Indeed, under the same assumptions as Theorem 3.1 of Ghosh and Basu (2013), it is possible to show that K(θ) has the same structure as in the linear regression case, with ξ_α and ς_α as given in Ghosh and Basu (2013) (see their Section 6), namely ξ_α = (2π)^{-α/2} σ^{-(α+2)/2} (1 + α)^{-3/2} and ς_α = (1/4)(2π)^{-α/2} σ^{-(α+4)/2} (2 + α²)(1 + α)^{-5/2}. Moreover, the computation of J(θ) proceeds analogously. These matrices can then be used in (6) to derive the asymptotic distribution of θ̂. Note that β̂ and σ̂² are asymptotically independent. From the general theory of M-estimators, the influence function (IF) of the estimator θ̂ is given by IF(y; θ̂) ∝ K(θ)^{-1} s(y; θ), and it measures the effect on the estimator θ̂ of an infinitesimal contamination at the point y, standardized by the mass of the contamination.
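The sandwich (Godambe) covariance V(θ) = K^{-1} J K^{-T} can be estimated generically for any M-estimator; the sketch below uses an empirical J and a numerically differentiated K (function names are ours):

```python
import numpy as np

def sandwich_variance(score, theta, ys, eps=1e-6):
    # Godambe/sandwich covariance V/n = K^{-1} J K^{-T} / n for an M-estimator
    # solving sum_i s(y_i; theta) = 0. J is the empirical variability matrix;
    # K is the sensitivity matrix, obtained here by central differences.
    s = np.array([score(y, theta) for y in ys])   # n x d score contributions
    J = s.T @ s / len(ys)
    d = len(theta)
    K = np.zeros((d, d))
    for j in range(d):
        tp, tm = theta.copy(), theta.copy()
        tp[j] += eps
        tm[j] -= eps
        sp = np.mean([score(y, tp) for y in ys], axis=0)
        sm = np.mean([score(y, tm) for y in ys], axis=0)
        K[:, j] = (sp - sm) / (2.0 * eps)
    Kinv = np.linalg.inv(K)
    return Kinv @ J @ Kinv.T / len(ys)
```

For the log-score of the normal mean (s(y; θ) = y − θ), K = −1 and the sandwich reduces to the familiar variance-of-the-mean estimate, which the test below checks.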
The estimator θ̂ is B-robust if and only if s(y; θ) is bounded in y (see Hampel et al., 1986). Note that the IF of the MLE is proportional to the score function; therefore, in general, the MLE has unbounded IF, i.e. it is not B-robust. Sufficient conditions for the robustness of the Tsallis score are discussed in Basu et al. (1998) and Dawid et al. (2016). For the Tsallis score (5), straightforward calculation shows that the IF for the Tsallis estimator of the regression coefficients, and the IF for the Tsallis estimator of the error variance, depend on the standardized residual s through the functions s exp{−s²} and s² exp{−s²}, respectively. Since these functions are bounded in s ∈ ℝ, both influence functions IF(y; β̂) and IF(y; σ̂²) are bounded in y for all γ > 1. The use of surrogate likelihoods in Bayes' formula has received considerable attention in the last decade (see the review by Ventura and Racugno, 2016, and references therein). Paralleling the derivation of posterior distributions based on composite likelihoods, a Tsallis posterior distribution can be obtained by using the scoring rule instead of the full likelihood in Bayes' formula. Let π(θ) be a prior distribution for the parameter θ. The proposed SR-posterior distribution is defined by exponentiating the (calibrated) negative total score in place of the likelihood; a possible choice of the calibration matrix C is given in Giummolé et al. (2019) and references therein. The choice of a prior distribution π(θ) to be used in (9) involves the same issues as in the standard Bayesian perspective. For instance, for objective Bayesian inference the prior minimizing the expected α-divergence to the Tsallis posterior distribution can be used (Giummolé et al., 2019), and it is given by π_G(θ) ∝ |G(θ)|^{1/2}. We aim to build a data-driven model that can provide support to policymakers engaged in contrasting the spread of COVID-19. In particular, we have applied robust inference for model (1) to the available data, which cover the period from February 24th to April 24th, 2020.
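The boundedness argument for the regression-coefficient IF can be verified numerically; the redescending shape s·exp(−s²), peaking near s = 1/√2 and vanishing for gross outliers, is exactly what distinguishes it from the MLE's linearly growing influence:

```python
import numpy as np

def if_shape(s):
    # Shape of the Tsallis influence function for the regression coefficients
    # as a function of the standardized residual s: proportional to
    # s * exp(-s**2), hence bounded and redescending to zero (B-robustness).
    # The MLE's influence, by contrast, grows linearly in s.
    return s * np.exp(-s ** 2)
```

The maximum of |s·exp(−s²)| is (1/√2)·e^{−1/2} ≈ 0.429, attained at |s| = 1/√2, so no single observation can move β̂ by more than a fixed amount.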
The data sources are the daily reports of the Protezione Civile (https://github.com/pcm-dpc/covid-19/). We consider two independent applications: daily deaths (DD) for COVID-19, i.e. deaths confirmed by the Istituto Superiore di Sanità (ISS); and intensive care unit (ICU) hospitalizations with a positive COVID-19 swab, which can be interpreted as a "department use index". Moreover, our analyses are limited to two geographical extensions, Italy and Lombardia; however, the proposed methodology has been applied to all the Italian regions. We illustrate statistical modelling of cumulative DD and cumulative ICU in Italy and in the Italian region Lombardia using the Tsallis scoring rule. Following Dawid et al. (2016), for DD and ICU data the Tsallis score (5) represents a composite scoring rule based on marginals only; in particular, it can be interpreted as an independence scoring rule (see also Varin et al., 2011, for composite likelihoods). Figures 1-2 display the robust fitted models for Italy and Lombardia, respectively, for both DD and ICU data. Here, a three-parameter log-logistic curve has been used, i.e. the curve obtained by setting c = 0, f = 1 in (3); in this setting, the parameter e represents the median time. In each plot, the blue points denote the observed data, and the red curves are the estimates/forecasts from our robust nonlinear models. Furthermore, the plots on the left show the cumulative data, whereas those on the right give the daily data (new deaths and new hospitalizations each day). Looking at the plots of cumulative data, we appreciate the characteristic S-shape of the log-logistic curve describing the evolution of total cases. The upper asymptote of the curves represents the expected number of total deaths or intensive care unit hospitalizations due to the COVID-19 outbreak. On the other hand, inspection of the panels devoted to the evolution of daily data is informative about the peak of daily new cases.
The peak in the right panels corresponds to the mode of the distribution of cumulative cases, and it is always dominated by the parameter e due to the right asymmetry. We remark that, in all the figures, the models fit the observed data well. In particular, the plots on the right show that the peak in new cases has already been reached for both DD and ICU data, highlighting the effectiveness of the restrictions. It is also evident that such counts grew faster at the beginning of the outbreak than they are decreasing after the peak. The plots on the left reveal that the end of the outbreak is still to come, and total numbers are expected to increase; when the cumulative data attain the upper asymptote, the daily data decrease to zero. The robust fits (Tsallis estimates and 95% confidence intervals) of the parameters e (inflection point) and d (upper asymptote) are summarized in Tables 1 and 2 for DD and ICU, respectively. For the DD Italy data, the model predicts an impressive total of more than 31k deceased at the end of the outbreak. According to the fitted model, the expected number of new deaths will be below 15 counts by the end of June. The inflection point is on April 5th (day 42), whereas the fitted peak is on March 30th. As for ICU, the total expected number of ICU occupancy-days is about 186k, whereas the inflection point is on April 8th (day 45) and the peak on April 1st. The number of ICU patients will decline to the same level as at the end of February, the very beginning of the outbreak in Italy, during the first days of July. Moving to Lombardia, the model estimates a total of about 14.8k deaths at the end of the epidemic, an inflection point on April 1st (day 38), and a peak on March 27th. For ICU counts, the total is about 76k occupancy-days, the inflection point is on April 10th (day 47), and the peak on March 31st.
By the end of May, the death toll will go below 15 units per day, whereas by the end of June the number of ICU patients will be well below 100. In order to assess the accuracy of the fitted models, some numerical studies have been carried out to investigate the actual sampling distribution of the proposed estimator. To this end, 10000 samples have been drawn from the model fitted to the Italy death data. The sampling distributions of the Tsallis estimates for (b, d, e, σ) are displayed in Figure 3. They all exhibit reasonable accuracy and precision, as confirmed by the comparison with the normal approximation to the distribution of the Tsallis estimator based on the fitted model. In the simplest instance of prediction, the object of inference is a future, or yet unobserved, random variable Z. Let p_Z(z; θ) be the density of Z. The basic frequentist approach to prediction of Z, on the basis of the observed y from Y, consists in using the estimative predictive density function p_e(z) = p_Z(z; θ̂), obtained by substituting the unknown θ with a consistent estimator, such as the Tsallis estimator or the MLE. Figure 4 reports the estimative predictive densities based on both estimators for DD and cumulative ICU; note that the Tsallis estimative predictive density is shifted to the right and exhibits larger variability. To compare the predictive performance of the Tsallis method with respect to the MLE, a simulation study has been performed, based on N = 10000 Monte Carlo replications. Mixing the robust procedure with the Bayesian approach allows us to include prior information (objective or subjective) on the parameters of the model. Moreover, plots of the posterior distributions of the model parameters can be quite useful in practice, since they are more informative than a simple point or interval estimate; for instance, the marginal posterior distributions allow us to assign probabilities to intervals of the parameters.
Figure 6 gives the violin plots of the marginal posterior distributions of the model parameters and the expected mean, both for DD and for ICU data, using the reference prior (Giummolé et al., 2019). The posterior medians (the white dots) for the upper asymptotes and the inflection points are consistent with the frequentist robust estimates. Note that the classical marginal posterior distributions show overly narrow tails with respect to the robust posterior distributions, which, on the contrary, can take into account the actual large uncertainty in the available data. To conclude, we believe that our procedure can constitute a useful statistical tool in modelling Italian COVID-19 contagion data. Indeed, the Tsallis robust procedures allow us to take into account the inevitable inaccuracy of the Italian COVID-19 data, which are often underestimated. Examples are deaths of patients who died with symptoms compatible with COVID-19 but who never received a swab test, or what has been described by many media outlets regarding the growing number of elderly people who remain in their homes although needing to be hospitalized in intensive care. Thus, for these data, robust procedures are particularly useful since they allow us to deal at the same time with model misspecification (with respect to the normal assumption, independence, or homoscedasticity) and data reliability. The estimation of the model and, as a consequence, the calculation of the expected asymptote and inflection point are based on the assumption that the adopted restrictions will not be subject to change. For this reason, these fitted models cannot be used for predictive purposes, since it is not possible to predict how the data will change when the restrictions are modified at the end of the lockdown, scheduled for May 3rd.
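Sampling an SR-posterior needs nothing beyond a generic MCMC kernel once the surrogate log-posterior (log prior minus the calibrated total Tsallis score) is available. This random-walk Metropolis sketch is our own illustration, not the sampler the paper used:

```python
import numpy as np

def metropolis(logpost, theta0, n=5000, step=0.5, seed=0):
    # Random-walk Metropolis for a scalar parameter. `logpost` plays the role
    # of the surrogate log-posterior: log prior minus the (calibrated) total
    # Tsallis score, used in place of the log-likelihood in Bayes' formula.
    rng = np.random.default_rng(seed)
    theta, lp = theta0, logpost(theta0)
    out = []
    for _ in range(n):
        prop = theta + step * rng.standard_normal()
        lp_prop = logpost(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # accept/reject
            theta, lp = prop, lp_prop
        out.append(theta)
    return np.array(out)
```

The resulting draws are what violin plots like those in Figure 6 summarize: posterior medians and tail mass per parameter.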
The day-by-day monitoring of the stability of the model fit will allow us to evaluate deviations from the current lockdown situation as reopening approaches. As a final remark, since the variables are daily counts, we will investigate the use of the Tsallis scoring rule in the context of nonlinear Poisson regression models. Updates on the results and on the Italian regions may be found on the web page of the RobBayes-C19 research group, https://homes.stat.unipd.it/lauraventura/content/ricerca, where the R code is also available. The reader is also pointed to the work of the StatGroup-19 research group, which models the mean function using the five-parameter Richards curve (Divino et al., 2020). The data that support the findings of this study are available in the GitHub repository of the Italian Protezione Civile account (pcm-dpc) and are described in Morettini et al. (2020). These data were derived from the following resource available in the public domain: https://github.com/pcm-dpc/covid-19/.
References (titles as extracted):
• Robust and efficient estimation by minimising a density power divergence
• Nonlinear regression analysis and its applications
• Minimum scoring rule inference
• An overview of robust methods in medical research
• Robust estimation for independent non-homogeneous observations using density power divergence with applications to linear regression
• Robust Bayes estimation using the density power divergence
• Objective Bayesian inference with proper scoring rules
• Robust statistics
• Bootstrap adjustments of signed scoring rule root statistics
• Higher-order asymptotics for scoring rules
• COVID-19 in Italy: dataset of the Italian Civil Protection Department
• Dose-response analysis using R
• Possible generalization of Boltzmann-Gibbs statistics
• An overview of composite likelihood methods
• Pseudo-likelihoods for Bayesian inference

key: cord-332412-lrn0wpvj
authors: Ibrahim, Mohamed R.; Haworth, James; Lipani, Aldo; Aslam, Nilufer; Cheng, Tao; Christie, Nicola
title: Variational-LSTM autoencoder to forecast the spread of coronavirus across the globe
date: 2020-04-24
journal: nan
doi: 10.1101/2020.04.20.20070938
sha:
doc_id: 332412
cord_uid: lrn0wpvj
Modelling the spread of coronavirus globally while learning trends at global and country levels remains crucial for tackling the pandemic. We introduce a novel variational LSTM-autoencoder model to predict the spread of coronavirus for each country across the globe.
This deep spatio-temporal model does not only rely on historical data of the virus spread but also includes factors related to urban characteristics, represented by locational and demographic data (such as population density, urban population, and fertility rate), and an index representing governmental measures and responses aimed at mitigating the outbreak (comprising 13 measures: 1) school closing, 2) workplace closing, 3) cancelling public events, 4) closing public transport, 5) public information campaigns, 6) restrictions on internal movements, 7) international travel controls, 8) fiscal measures, 9) monetary measures, 10) emergency investment in health care, 11) investment in vaccines, 12) virus testing framework, and 13) contact tracing). In addition, the introduced method learns to generate a graph that adjusts the spatial dependencies among different countries while forecasting the spread. We trained two models for short- and long-term forecasts. The first is trained to output one step into the future from the three previous timestamps of all features across the globe, whereas the second model is trained to output 10 steps into the future. Overall, the trained models show high validity in forecasting the spread for each country at both horizons, which makes the introduced method a useful tool to assist decision- and policy-making in the different corners of the globe. As a new contagious disease among humans, COVID-19 has currently reached 803,126 confirmed cases with 39,032 deaths in 201 countries across the world (Worldometer, 2020). Although there are a number of statistical and epidemic models for analysing the COVID-19 outbreak, these models rely on many assumptions when evaluating the impact of intervention plans, which leads to low accuracy and uncertain predictions (Hu et al., 2020).
Therefore, there is a vital need to develop new frameworks and methods to curb and control the spread of coronavirus immediately (Botha and Dednam, 2020; Hu et al., 2020). The epidemic outbreak of COVID-19 is investigated in the literature using a mathematical compartmental model named Susceptible-Infected-Recovered (SIR) (Kermack and McKendrick, 1927). The SIR model partitions a population into three categories: 1) susceptible (the number of people presently not infected), 2) infected (the number of people currently infected), and 3) removed (the number of people either recovered or dead). The model is described by a system of differential equations and is completely determined by the transmission rate, the recovery rate, and the initial condition, which can be estimated using least-squares error, Kalman filtering, or BMC. The model is sometimes renamed based on additional compartments, such as Susceptible-Infectious-Quarantined-Recovered (SIQR) or Susceptible-Exposed-Infected-Recovered (SEIR). The main ideas across all SIR-type models are four-fold: first, identification and better understanding of the current epidemic (Crokidakis, 2020); second, simulation of the behaviour of the system (Castro, 2020); third, forecasting of future behaviour (Toda, 2020); and last, controlling the current situation (Sameni, 2020). However, the results of these models, including their accuracy, are only valid under their assumptions for a given slice of available data, and their scope is limited to assisting healthcare strategies in the decision-making process. On the other hand, agent-based modelling has been utilised to explore and estimate the number of COVID-19 contagions, specifically for certain countries (Chang et al., 2020; Simha et al., 2020).
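The SIR dynamics described above can be sketched with a simple forward-Euler integrator; the parameter values used in testing are illustrative, not fitted:

```python
import numpy as np

def sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    # Forward-Euler integration of the Kermack-McKendrick SIR equations:
    #   dS/dt = -beta*S*I/N,  dI/dt = beta*S*I/N - gamma*I,  dR/dt = gamma*I,
    # where beta is the transmission rate and gamma the recovery rate.
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    traj = [(s, i, r)]
    for _ in range(int(days / dt)):
        new_inf = beta * s * i / n * dt
        new_rec = gamma * i * dt
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        traj.append((s, i, r))
    return np.array(traj)
```

The basic reproductive rate is R0 = beta/gamma: an epidemic grows only when R0 > 1, which is why the restriction measures discussed throughout this section aim to push the effective R0 below 1.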
Also, statistical methods (Singer, 2020), simple time-series modelling (Deb, 2020), and the logistic map (Al-Qaness et al., 2020) have been utilised for similar objectives, whereas Botha and Dednam (2020) focused on modelling the spread of coronavirus based on the parameters of the basic SIR model in three-dimensional iterative maps to provide a wider picture of the globe. Petropoulos and Makridakis (2020) forecasted the total global spread relying on an exponential smoothing model based only on historical data. Put together, the drawback of these models is that they are not flexible enough to fit each country or region, due to the lack of necessary measures, government responses, and spatial factors related to each specific location. There are few examples of predictive modelling of the coronavirus spread based on machine learning approaches, whether through shallow or deep models. While this can be explained by the limitation of data at the early stage of the outbreak, machine learning remains an essential tool: according to Pham and Luengo-Oroz (2020), machine learning approaches could certainly assist forecasting with improved prediction quality. One of the few studies is presented by Hu et al. (2020). They applied real-time short-term forecasting using data compiled from January 11th to February 27th, 2020, collected by the World Health Organization (WHO) for the 31 provinces of China. The data are trained on a deep learning model for real-time forecasting of new cases in the provinces. Their model has the flexibility to be trained at the city, provincial, or national level. In addition, the latent variables of the trained model are used to extract necessary features for each region and are fed into a k-means algorithm to cluster similar features of infected or recovered patients. Bearing this in mind, there is still a knowledge gap for machine learning models that predict coronavirus cases at global as well as regional scales (Pham and Luengo-Oroz, 2020).
While SIR models of different types, in addition to the aforementioned ones, are essential, the challenge remains to forecast different regions and countries across the globe with a single model, without any assumptions or scenario-based rules, using only the current situation, features related to countries, and the measures aimed at reducing the impact of the outbreak. Accordingly, in this paper we introduce a new method for learning and encoding information related to the historical data of coronavirus per country, the features of countries, the spatial dependencies among different countries, and, last, the time- and location-dependent measures taken by each country aimed at reducing the impact of coronavirus. Relying on deep learning, we introduce a novel variational long short-term memory (LSTM) autoencoder model to forecast the spread of coronavirus per country across the globe. This single deep model aims to provide robust assistance to policymakers in understanding the future of the pandemic at both the global and the country level, for short-term and long-term forecasts. The main advantages of the proposed method are: 1) it can structure and learn from different data sources, whether they belong to spatial adjacency, urban and population factors, or various related historical data; 2) the model is flexible enough to apply at different scales: currently it provides predictions at global and country scales, but it could also be applied at the city level; and 3) the model is capable of learning global trends for countries that have similar measures, spread patterns, or urban and population features. After this introduction, the article is structured in five sections. Section 2 introduces the method and materials used. In Section 3, we show the model evaluations and the experimental results at country and global levels. In Section 4, we discuss our results, compare our model to existing base models, and highlight limitations.
last, in section 5 we conclude and present our recommendations for future works. the model algorithms are constructed based on four assumptions that the model needs to learn in order to predict the next day's spread. first, the model needs to extract features regarding the historical data of coronavirus spread for a given country, bearing in mind the historical values of the virus spread in all other countries simultaneously, before it outputs a prediction for that country. second, before the model gives a predicted value for each country, it should consider the predicted values of all other countries instantaneously, similar to the first point. third, the spatial relationship between different countries is multidimensional; it can vary based on geographical location, adjacency, accessibility, or even policies banning accessibility. the model needs to deal with variations in time and location of the different inputted scenarios while sampling outcomes. last, apart from the virus features, each country has unique demographic and geographical features that may show an association with the spread of the virus, which the learning process of the model needs to consider each time before it gives a predicted value. the structure of the input data is key for any model to learn. figure 1 shows the concept of the overall structure of the proposed graph of multi-dimensional data sets for forecasting the spread. it illustrates how different types of data can be linked and clustered for the model to learn the spread of a virus. this data can be seen as dynamic features related to both the virus and the location, with long temporal scales (i.e. the population data) or short ones.
it shows how the local and global trend of a virus can be forecasted for a given country, with urban features that include both spatial and demographic factors, sharing a spatial weight with the other countries in the graph, while government mitigation measures are applied. put together, the model needs to differentiate between factors that characterise countries or regions and those which characterise the virus spread, in order to understand the patterns of spread at global and country levels. to meet these hypotheses and assumptions during the learning process, the architecture of the proposed model is based on the combination of three main components: 1) lstm, 2) self-attention, and 3) a variational autoencoder graph. lstm represents the main component of the proposed model. it has been shown to learn long-term dependencies more easily than a simple recurrent architecture (goodfellow et al., 2017; lecun et al., 2015). unlike traditional recurrent units, it has an internal recurrence, or self-loop, which allows the timestamps to create paths through which the gradient of the model can flow for a long duration without the vanishing-gradient issues present in a normal recurrent unit. even for an lstm with fixed parameters, the integrated time scale can change based on the input sequence, simply because the time constants are outputted by the model itself. these self-loops are controlled by a forget gate unit $f_i^{(t)}$ for a given time $t$ and cell $i$, which fits this weight to a value between 0 and 1 with a sigmoid unit $\sigma$. (this preprint is made available under a cc-by-nc-nd 4.0 international license; the author/funder, who has granted medrxiv a license to display the preprint in perpetuity, holds the copyright; this version, not certified by peer review, was posted april 24, 2020.)
the forget gate can be written as:

$$f_i^{(t)} = \sigma\Big(b_i^f + \sum_j U_{i,j}^f x_j^{(t)} + \sum_j W_{i,j}^f h_j^{(t-1)}\Big) \quad (1)$$

where $x^{(t)}$ is the current input vector, $h^{(t)}$ is the current hidden-layer vector containing the outputs of all the lstm cells, $b^f$ are the biases for the forget gates, $U^f$ are the input weights, and $W^f$ are the recurrent weights for the forget gates. the internal state of the lstm is updated with a conditioned self-loop weight $f_i^{(t)}$ as:

$$s_i^{(t)} = f_i^{(t)} s_i^{(t-1)} + g_i^{(t)}\,\sigma\Big(b_i + \sum_j U_{i,j} x_j^{(t)} + \sum_j W_{i,j} h_j^{(t-1)}\Big) \quad (2)$$

where $b$ represents the biases, $U$ the input weights, $W$ the recurrent weights into the lstm cell, and $g_i^{(t)}$ the external input gate unit, computed similarly to the forget gate but with its own parameters:

$$g_i^{(t)} = \sigma\Big(b_i^g + \sum_j U_{i,j}^g x_j^{(t)} + \sum_j W_{i,j}^g h_j^{(t-1)}\Big) \quad (3)$$

last, the lstm cell output $h_i^{(t)}$ can also be controlled and shut off with an output gate $q_i^{(t)}$, computed like the aforementioned gates with a sigmoid unit. the output is:

$$h_i^{(t)} = \varphi\big(s_i^{(t)}\big)\, q_i^{(t)} \quad (4)$$

where $\varphi$ represents an activation function such as tanh. put together, this control of the time scale and the forgetting behaviour of the different units allows the model to learn long- and short-term dependencies for a given vector. not only does the model learn from the previously defined timestamps for each country, it can also extract features from the other countries at each given timestamp, since the dimensions of the input vector and cell states include the dimensions of the different countries. it is worth mentioning that the input to the lstm cells can be seen as a three-dimensional tensor, representing the sample size for both training and testing, the defined timestamps for the model to look back over, and the timestamps of the other countries as a global feature extractor. while the lstm cells learn from their input sequence to output the predicted sequences through the long and short dependencies of the time constants, and their additional features for each country, the relations between their inputs remain missing.
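the gate equations (1)-(4) above can be sketched in a few lines of numpy. this is a minimal illustration only, not the paper's actual implementation; the parameter names, toy shapes, and random initialisation are our own assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, s_prev, p):
    """one lstm step following eqs. (1)-(4): forget, input, state update, output."""
    f = sigmoid(p["bf"] + p["Uf"] @ x_t + p["Wf"] @ h_prev)   # eq. (1) forget gate
    g = sigmoid(p["bg"] + p["Ug"] @ x_t + p["Wg"] @ h_prev)   # eq. (3) input gate
    s = f * s_prev + g * np.tanh(p["b"] + p["U"] @ x_t + p["W"] @ h_prev)  # eq. (2)
    q = sigmoid(p["bq"] + p["Uq"] @ x_t + p["Wq"] @ h_prev)   # output gate
    h = np.tanh(s) * q                                        # eq. (4)
    return h, s

# tiny example: 3 input features (e.g. three countries' counts), 2 hidden cells
rng = np.random.default_rng(0)
n_in, n_h = 3, 2
p = {k: rng.standard_normal((n_h, n_in)) * 0.1 for k in ("Uf", "Ug", "U", "Uq")}
p.update({k: rng.standard_normal((n_h, n_h)) * 0.1 for k in ("Wf", "Wg", "W", "Wq")})
p.update({k: np.zeros(n_h) for k in ("bf", "bg", "b", "bq")})

h, s = np.zeros(n_h), np.zeros(n_h)
for x_t in rng.standard_normal((5, n_in)):   # a sequence of 5 timestamps
    h, s = lstm_step(x_t, h, s, p)
```

because the output is a tanh of the state gated by a sigmoid, each component of $h$ stays strictly inside $(-1, 1)$, which is what makes the unit's forgetting behaviour stable over long sequences.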
a self-attention mechanism allows the lstm units to understand the representation of their inputs by relating the positions of each sequence (goodfellow et al., 2017; vaswani et al., 2017). in the case of the proposed model, this mechanism is crucial to assist the model in deciding which pieces of information to consider and which to forget when making a prediction. following vaswani et al. (2017), the self-attention mechanism can be written as:

$$\mathrm{attention}(Q, K, V) = \mathrm{softmax}\Big(\frac{QK^{T}}{\sqrt{d_k}}\Big)\, V \quad (5)$$

where $Q$, $K$ and $V$ are the query, key and value projections of the input sequence and $d_k$ is the key dimension. we initialise the first graph based on the spatial weights of the geographical locations of all infected countries (more details follow in sub-section 3.1.4). however, despite attempts to create a sophisticated adjacency matrix among the infected countries (based on flight routes, spatial networks, migration networks, etc.), the output may be misleading for any learning method over time or for a given location. the spatial weights since the outbreak may look completely different from the initial day to the latest day, due to the different policies and measures taken by countries. because of this high uncertainty and variation, inputting a static graph, or even a dynamic one based on limited data, may exacerbate the learning process. accordingly, the third vital component of our model is the variational autoencoder (vae), which allows the model to generate information from a given input. it can be defined as a generative directed method that makes use of learned approximate inference (goodfellow et al., 2017; ha and schmidhuber, 2018). the model is based on the idea of passing latent variables $z$ through a coded distribution $q(z)$ over samples, using a differentiable generator network $g(z)$. subsequently, $x$ is sampled from the distribution $p(x; g(z))$, which is equal to the distribution $p(x \mid z)$.
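the scaled dot-product attention of eq. (5) can be sketched directly in numpy. the dimensions below are arbitrary toy values, not those of the trained model:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv):
    """scaled dot-product self-attention over a sequence x of shape (t, d)."""
    q, k, v = x @ wq, x @ wk, x @ wv
    d_k = k.shape[-1]
    weights = softmax(q @ k.T / np.sqrt(d_k))  # (t, t): each row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(1)
t, d, d_k = 4, 6, 3                           # 4 timestamps, 6 input features
x = rng.standard_normal((t, d))
wq, wk, wv = (rng.standard_normal((d, d_k)) for _ in range(3))
out, w = self_attention(x, wq, wk, wv)
```

each row of the weight matrix is a distribution over the sequence positions, which is exactly what lets the model decide "which piece of information to consider and what to forget" per timestamp.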
the model is trained by maximising the variational lower bound $\mathcal{L}(q)$, which belongs to $x$:

$$\mathcal{L}(q) = \mathbb{E}_{z \sim q(z \mid x)}\big[\log p(z, x)\big] + \mathcal{H}\big(q(z \mid x)\big) \quad (6)$$

the first term describes the joint log-likelihood of the visible and hidden variables under the approximate posterior over the latent variables, $\log p(z, x)$, and the second is the entropy of the approximate posterior $\mathcal{H}(q(z \mid x))$, where $q$ is chosen to be a gaussian distribution with noise added to the predicted mean value. in a traditional variational autoencoder, the reconstruction log-likelihood tries to equalise the approximate posterior distribution $q(z \mid x)$ and the model prior $p(z \mid x)$. however, in the case of our model, the encoded $q(z \mid x)$ is conditioned and penalised based on the predicted value of the next forecast of the spread, instead of the log-likelihood of the similarity with $p(z \mid x)$, as explained further in the proposed framework. we propose a sequence-to-sequence architecture relying on a mixture of vae and lstm. the model comprises two branches trained in parallel in an end-to-end fashion. figure 2 shows the overall proposed framework. the first branch is a self-attention lstm model fed by the spatio-temporal data of coronavirus spread per day and per country, the government policies per day and per country, and the urban features per country, in which the vector is repeated to cover the duration of training (the three urban features used are population density, urban population percentage and fertility rate, covered in detail in the upcoming section). each input is reshaped as a 3d tensor of shape (samples, timestamps, number of features × number of countries). the three input data sets are concatenated along the last axis (the feature dimension) and passed to the first branch of the model through two parts: 1) the self-attention lstm sequence encoder, and 2) the lstm sequence decoder.
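the gaussian sampling step of the vae (the "noise added to the predicted mean") is usually implemented with the reparameterisation trick, so the sampling path stays differentiable; a minimal numpy sketch, with our own toy values and a latent dimension of 10 as stated later in the paper:

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """sample z = mu + sigma * eps with eps ~ N(0, I); gradients flow through mu, log_var."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """closed-form KL( N(mu, sigma^2) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

rng = np.random.default_rng(2)
mu = np.zeros(10)        # latent dimension of 10, as in the paper's encoder
log_var = np.zeros(10)   # unit variance
z = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)   # exactly 0 when posterior == prior
```

note that in this paper the latent code is penalised through the forecast error rather than through a reconstruction likelihood, so the kl term here is only the generic vae regulariser, shown for illustration.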
the first sequence encodes the input data and extracts features for the second part of the lstm sequence, which outputs the prediction of the spread for the next day (in the case of the short-term forecast) per country. in parallel to the self-attention encoder sequence, the second branch of the model is the encoder of the vae. it is fed by a spatial matrix of dimensions (number of countries × number of countries), repeated for the entire duration of training and timestamps (in the next section, more details follow on how it is selected and computed). this encoder part is mainly a convolutional structure, consisting of three 1d convolution layers with 32, 64, and 128 filters respectively, all with a kernel size of 1, activated by a relu function and followed by a dropout layer of rate 0.2. after the dropout, two lstm layers follow, containing 100 and 494 lstm cells respectively; the first is activated by a relu function, the second by a linear function. a fully connected layer with a number of neurons equal to the number of countries is then applied. last, the latent space is defined with a dimension of 10, in which the z-values are generated by sampling over the gaussian distribution of the previously inputted layer (as explained in section 2.2.3). it is worth mentioning that, to visualise the generated graph for representation purposes, the encoder of the second branch of the model can be decoded to output the generated samples for each predicted sequence by passing it into a vae decoder, where the 1d convolution layers are transposed to a final output shape equal to the inputted dimension.
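a 1d convolution with kernel size 1, as used throughout this encoder, is equivalent to applying the same dense layer independently at every timestep; a small numpy check of that equivalence (toy shapes standing in for the 264-country matrix, not the trained network):

```python
import numpy as np

def conv1d_k1(x, w, b):
    """1d convolution with kernel size 1: x is (timesteps, in_channels),
    w is (in_channels, out_channels). no mixing across timesteps occurs."""
    return x @ w + b

rng = np.random.default_rng(3)
n_countries = 5                              # toy stand-in for the 264 regions
x = rng.standard_normal((7, n_countries))    # 7 timestamps of spatial-matrix rows
w = rng.standard_normal((n_countries, 32))   # 32 filters, as in the first layer
b = np.zeros(32)
y = conv1d_k1(x, w, b)

# applying the same dense layer per timestep gives an identical result
y_dense = np.stack([x[t] @ w + b for t in range(x.shape[0])])
```

this is why a kernel size of 1 is a reasonable choice here: the layer re-projects each day's spatial weights into a richer feature space without assuming any temporal ordering, leaving the temporal modelling to the lstm layers that follow.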
as future work, this could be an interesting approach to understanding the variation of the graph for each predicted day for all countries. both outputs of the self-attention lstm encoder and the encoder of the vae are concatenated over the feature dimension and passed to the lstm decoder sequence, which contains a single lstm layer with a number of cells equal to the total number of countries. it is followed by two fully connected layers of shape (1 × number of countries) for predicting the value of the next day, in the case of the short-term forecast, or shaped (number of future steps × number of countries) for any number of future steps the model needs to output per country. data sets are split into training and testing along the first dimension of the data shape (the total duration of the temporal data), in such a way that the model can be tested on the last 6 days. we trained two different models: one as a single-step model for the short-term forecast (one day), and the other as a multi-step model (10-day forecast). there are two crucial differences between these two models. the first is the structure of the output layer and of y-train and y-test: in the first model they are shaped (1 × n), whereas in the second model the output layer is shaped (10 × n), irrespective of the number of samples. the second is that the trained and tested samples are reduced not only by the number of timestamps — at the beginning of each sequence — as in the case of the first model, but also by the number of future steps — at the end of the sequence — in the case of the second model. last, based on trial and error, we structured the data with 3 timestamps for both models to look back over for all the input features of each country, for which we found optimal results. the weights of the model are initialised with random weights.
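the windowing described above (3 look-back timestamps; single-step vs 10-step targets; samples lost at both ends of the sequence) can be sketched as follows. this is a toy reconstruction of the data preparation, not the authors' code:

```python
import numpy as np

def make_windows(series, look_back=3, horizon=1):
    """series: (days, countries). returns x of shape (samples, look_back, countries)
    and y of shape (samples, horizon, countries)."""
    xs, ys = [], []
    for t in range(look_back, series.shape[0] - horizon + 1):
        xs.append(series[t - look_back:t])   # the 3 days the model looks back over
        ys.append(series[t:t + horizon])     # the next 1 or 10 days to predict
    return np.stack(xs), np.stack(ys)

days, n = 79, 4                        # 79 days as in the paper, 4 toy countries
series = np.arange(days * n, dtype=float).reshape(days, n)

x1, y1 = make_windows(series, look_back=3, horizon=1)     # single-step model
x10, y10 = make_windows(series, look_back=3, horizon=10)  # multi-step model
```

with 79 days, the single-step model gets 76 samples (3 lost at the start), while the multi-step model gets 67 (3 lost at the start plus 9 at the end), which is exactly the second difference described above.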
the model is compiled based on backpropagation of error with stochastic gradient descent, relying on the 'adam' optimiser (kingma and ba, 2014), with a learning rate of 0.001 and momentum 0.9 (preprint doi: https://doi.org/10.1101/2020.04.20.20070938). the model is trained for 500 training cycles (epochs). the performance of the proposed method is evaluated on three different scales: 1) a global loss-based evaluation, 2) a country-based evaluation and, last, 3) a step-based evaluation. the short-term forecast model (single-step model) relies only on the first two evaluation metrics, whereas the multi-step model includes all three levels of evaluation. the first loss function evaluates the overall performance of the model at a global level, and it drives the adjustment of the model weights during training for both trained models. it is the mean squared error (mse), calculated as:

$$\mathrm{mse} = \frac{1}{m} \sum_{i=1}^{m} \big(\hat{y}^{(i)} - y^{(i)}\big)^2 \quad (7)$$

where $m$ is the total number of samples, $\hat{y}^{(i)}$ are the predicted values of the test set, and $y^{(i)}$ are the observed values of the test set. we also computed the kullback-leibler divergence ($d_{kl}$), or so-called 'relative entropy', which measures the difference between the probability distributions of two sequences. it is a common approach for assessing the vae; nevertheless, it can also be a good indicator for evaluating the predicted sequences globally. it is calculated as:

$$d_{kl}\big(p(x)\,\|\,q(x)\big) = \sum_{x} p(x)\,\log\frac{p(x)}{q(x)} \quad (8)$$

where $p(x)$ and $q(x)$ represent the probability distributions of two random discrete sequences of $x$; in the case of the model, they represent the true distribution of the data and the predicted one ($y^{(i)}$, $\hat{y}^{(i)}$). it is worth mentioning that $d_{kl}(p(x)\,\|\,q(x)) \neq d_{kl}(q(x)\,\|\,p(x))$. the second loss evaluates the performance of the model at the local level of each country or region.
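both metrics are a few lines of numpy. note that applying kl divergence to forecast sequences requires normalising them into distributions first; the normalisation convention below is our own assumption, as the paper does not spell out its version:

```python
import numpy as np

def mse(y_true, y_pred):
    """eq. (7): mean squared error over all samples."""
    return float(np.mean((np.asarray(y_pred, float) - np.asarray(y_true, float)) ** 2))

def kl_divergence(p_seq, q_seq, eps=1e-12):
    """eq. (8): KL(P || Q) after normalising both sequences to sum to 1."""
    p = np.asarray(p_seq, dtype=float); p = p / p.sum()
    q = np.asarray(q_seq, dtype=float); q = q / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

y_true = [10.0, 20.0, 30.0]
y_pred = [12.0, 18.0, 33.0]
err = mse(y_true, y_pred)            # (4 + 4 + 9) / 3
kl = kl_divergence(y_true, y_pred)   # small but nonzero; note KL is asymmetric
```

the asymmetry noted in the text, $d_{kl}(p\,\|\,q) \neq d_{kl}(q\,\|\,p)$, falls straight out of the formula, since the log-ratio is weighted by $p$ in one direction and by $q$ in the other.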
strictly, $\hat{y}^{(i)}$ and $y^{(i)}$ should ideally fit a statistically significant linear model, whose strength can be computed with the r-squared value for further interpretation, in addition to the computed mse or its root, for each country over the entire duration. similar to the second loss, the performance of the second model (the multi-step model) includes a calculated loss (based on the root of the mse) for each predicted step. last, comparing our results to other models remains a challenge, due to the absence of a unified model similar to what we have achieved, forecasting each country globally, and due to the absence of general benchmark data with common evaluation metrics. however, we do our best to compare and discuss the performance of our method against existing models, such as simple or deep time-series models for specific countries or at specific times. to forecast the spread of the coronavirus for the next day, we synchronised different types of data to allow the model to learn. this wide range of data comprises the historical data of the coronavirus spread for each country; the dynamic policies and government responses aimed at mitigating coronavirus at each timestamp and for each country; static urban features that characterise each country and show significant correlations with the virus spread; and, last, the spatial weights among the different countries. these different data types are integrated and synchronised by country — and by time step in the case of dynamic data — to be fed to the introduced framework. we used the historical data for coronavirus spread published by johns hopkins university (dong et al., 2020; jhu csse, 2020). after integrating this data with the following data sources, the version we used contains timestamps from 22/01/2020 to 09/04/2020 (79 days) for 264 regions or countries across the globe, as shown in figure 3 for the confirmed cases on the start and end days of the examined duration.
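the per-country evaluation described above — fitting a linear model between predicted and observed values and reporting r-squared alongside rmse — can be sketched with a least-squares fit (illustrative only; the synthetic numbers are not from the paper):

```python
import numpy as np

def fit_r2_rmse(y_true, y_pred):
    """fit y_true ~ a * y_pred + b by least squares; return (r_squared, rmse)."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    a, b = np.polyfit(y_pred, y_true, 1)          # slope and intercept
    fitted = a * y_pred + b
    ss_res = np.sum((y_true - fitted) ** 2)       # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    rmse = float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
    return float(r2), rmse

# predictions with a constant offset are perfectly linear in the truth,
# so r-squared is 1 even though rmse is nonzero
r2, rmse = fit_r2_rmse([1, 2, 3, 4], [1.1, 2.1, 3.1, 4.1])
```

the example also shows why the paper reports both numbers: r-squared measures linear association (model "strength"), while rmse measures the absolute size of the error.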
we used demographic and locational data representing the population of each region or country from the aforementioned data set (worldometer, 2020). there is a wide range of factors; however, we selected only three: 1) population density, 2) fertility rate and 3) urban population. the two reasons for selecting these features are: first, the selection is based on enhancing the model prediction after several rounds of trial and error with and without several features. second, and most importantly, the selected features show a statistically significant association with the spread of coronavirus over time for all countries across the globe. figure 5 shows the outputs of the spearman correlation for the three selected factors. in figure 5-a, the population density was significant, with decaying positive correlation coefficients (rho) from the starting date until the last 14 days of the examined duration. this means that the higher the population density, the more likely a higher coronavirus spread. in figure 5-b, the fertility rates across the globe show a significant association over the entire tested duration, with negative rho values, which means countries with higher fertility rates are less likely to have a higher spread of coronavirus. this could explain the lower spread of the virus in africa (as shown in figure 3); however, this feature may be time-dependent, or an artefact of reporting inaccuracy or the low percentage of virus testing in africa. last, in figure 5-c, the percentage of the urban population started to show a significant association with the spread of the virus, with positive rho values, from the middle of the tested duration until the last day.
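the spearman rank correlation behind figure 5 can be computed without any statistics library by applying the pearson formula to the ranks. this toy sketch (our own hypothetical numbers) ignores tie handling, which a full implementation would need:

```python
import numpy as np

def ranks(x):
    """0-based rank of each value (no tie correction in this sketch)."""
    order = np.argsort(x)
    r = np.empty_like(order)
    r[order] = np.arange(len(x))
    return r

def spearman_rho(x, y):
    """spearman correlation = pearson correlation of the ranks."""
    rx = ranks(np.asarray(x)) - 0.0
    ry = ranks(np.asarray(y)) - 0.0
    rx = rx - rx.mean()
    ry = ry - ry.mean()
    return float(np.sum(rx * ry) / np.sqrt(np.sum(rx ** 2) * np.sum(ry ** 2)))

density = [100, 250, 80, 400, 30]          # hypothetical population densities
cases = [5000, 9000, 3000, 20000, 900]     # hypothetical case counts, same ordering
rho = spearman_rho(density, cases)         # monotonically related, so rho = 1.0
```

because spearman correlation only looks at rank order, it captures the monotone "higher density, higher spread" relationships reported above without assuming the relationship is linear.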
this means that countries with a higher percentage of urban population are more likely to have a higher coronavirus spread. different countries took, and continue to take, different measures and responses towards the coronavirus outbreak. these time- and location-dependent measures include 13 indicators: 1) school closing, 2) workplace closing, 3) cancelling public events, 4) closing public transport, 5) public information campaigns, 6) restrictions on internal movements, 7) international travel controls, 8) fiscal measures, 9) monetary measures, 10) emergency investment in health care, 11) investment in vaccines, 12) virus testing framework, and 13) contact tracing. put together, the oxford covid-19 government response tracker (hale et al., 2020) aims to measure the variation of government responses weighted by these indicators in a scaled index, the so-called stringency index. we used this index to weight the different countries based on their government responses, after integrating and matching the time and location of the previously mentioned data sets. we computed a spatially weighted adjacency matrix based on the geolocation of each region or country, relying on the geodesic distance between each region or country. we used the haversine formula to compute the distance on the sphere.
it is calculated as:

$$d = 2r \cdot \mathrm{atan2}\big(\sqrt{a},\, \sqrt{1-a}\big), \qquad a = \sin^2\!\Big(\frac{\Delta\phi}{2}\Big) + \cos\phi_1 \cos\phi_2\, \sin^2\!\Big(\frac{\Delta\lambda}{2}\Big) \quad (10)$$

where $\phi_1$, $\phi_2$ represent the origin and destination latitudes in radians respectively, $\Delta\lambda$ represents the change between the origin and destination longitudes in radians, and $r$ is the earth's radius. the adjacency matrix is conditioned primarily on eliminating long-distance connections, such as the connections between the us and europe, the us and china, and the direct connections between china and the rest of the world. this hypothetical assumption comes from the early international measures taken by the us to ban flights to and from europe and china for non-american citizens. given that these spatial weights may vary or carry a high degree of uncertainty, the model only self-learns from their representation while generating various samples with the vae encoder, as discussed earlier, instead of using these data as a fixed and constant factor during training and testing. however, these are only a few easily interpretable examples; the challenge for the model is to self-learn the representation of the graph, to adjust the different weights and to generate graphs that could aid in forecasting the spread globally. in algorithm 1, we show how we initialised the adjusted spatially weighted matrix for all countries. it shows three main elements of computing the graph: first, how a complete graph between the origin and destination countries is computed; second, how the relative distance is computed and conditioned; and last, how the array is scaled and reshaped. figure 7 shows examples of the variation that could be more significant and realistic for predicting a given day for a given country.
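algorithm 1 is not reproduced in this text, but the haversine distance of eq. (10) and a distance-thresholded, scaled adjacency matrix in its spirit can be sketched as follows. the coordinates, threshold, and linear scaling are our own toy choices; the paper's actual conditioning rules are richer:

```python
import numpy as np

R_EARTH_KM = 6371.0

def haversine(lat1, lon1, lat2, lon2):
    """great-circle distance in km between two (lat, lon) points given in degrees."""
    p1, p2 = np.radians(lat1), np.radians(lat2)
    dp = p2 - p1                          # delta latitude in radians
    dl = np.radians(lon2 - lon1)          # delta longitude in radians
    a = np.sin(dp / 2) ** 2 + np.cos(p1) * np.cos(p2) * np.sin(dl / 2) ** 2
    return 2 * R_EARTH_KM * np.arctan2(np.sqrt(a), np.sqrt(1 - a))

def adjacency(coords, max_km=3000.0):
    """complete distance graph, then zero out long-distance links and
    scale the remaining weights into [0, 1] (closer = heavier)."""
    n = len(coords)
    w = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                d = haversine(*coords[i], *coords[j])
                if d <= max_km:
                    w[i, j] = 1.0 - d / max_km
    return w

# toy coordinates: london, paris, beijing
coords = [(51.5, -0.13), (48.86, 2.35), (39.9, 116.4)]
w = adjacency(coords)
```

with a 3000 km cutoff, london and paris stay connected with a heavy weight while both links to beijing are eliminated, mirroring the "eliminate long-distance connections" conditioning described above.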
for instance, the first graph in figure 7 can represent countries with strict measures on international travel; the second is more likely to reflect the period when travel from the us to europe or china was banned; and the last two show how the world looks under business-as-usual. after 500 epochs, the training and testing curves of the model show a steady output with no sign of overfitting; the mse losses for both curves are at a minimum, with values less than 0.01, whereas the kl loss for the test set is less than 0.37 for both trained models. in figure 8, we show the distribution of the confirmed and predicted cases globally with the single-step model. the total number of predicted cases per day is close to the actual data, with slightly higher predictions in africa than what has been confirmed. in figure 9, we show the sum of the accumulated predicted cases — predicted at a country level — across the globe for each day against the actual data. the results are highly accurate at a global level, with only a fractional difference between the actual and predicted values on the last examined day, 09/04/2020. the prediction of the model is nonlinear; however, its output at a given sample, when compared to its ground truth, is linear. therefore, fitting a linear regression model between the predicted results and the observed ones and providing an r-squared value can be a good indicator for understanding the model's strength.
therefore, here we also show the r-squared value and the root mean squared error (rmse) for a linear regression model fitted on the predicted and actual values of our single-step model. the computed metrics show a high linear association between them. what makes this method more reliable than any simple time-series model is that the predicted global curve matches the actual one without the model ever learning the actual global curve, and without any explicit rules extracted at the global level to mimic the global spread curve of the virus. the model learns the patterns at country level, whereas the error is minimised at both local and global levels. this is a crucial point, because changes across the globe are more likely to happen at a country level, whereas the global level is rather the aggregate impact of the different countries. the model shows high performance not only globally but also at a country level. figure 10 (in the annexes section) shows the performance of the single-step model for different countries. overall, the model shows higher performance for countries with higher spread, whereas its performance decreases for countries with fewer cases over a short period. however, the model shows overall reliable results at a country level. in table 1, we extend the evaluation of the single-step model, showing further variation of the predictions for selected countries on different continents. while the model performance varies from country to country, overall it shows reliable results at a country level. in table 2 (in the annexes section), we show the performance of the 10-step model for a group of selected countries. this model is evaluated per country and per step. while the model performance decreases as the number of steps increases, compared to the single-step model, the results remain to a high degree consistent at a country level even at the 10th step.
in this article, we introduce a state-of-the-art method for predicting the spread of coronavirus for each country across the globe, for both short- and long-term forecasts. it has three main advantages. first, the model learns not only from the historical data but also from the applied governmental measures for each country, urban factors, and the spatial graph that represents the dependencies among the different countries. the second advantage of the model is its ability to be applied at various scales: currently, it can forecast the spread at global, country, and region level (i.e. the cases of china and the uk), but it can also be applied at the city level. last, the model can produce short- and long-term forecasts, which could make it a reliable tool for decision-making. 5.1 base model evaluations. there have been different attempts to rely on a simple time-series model, whether based on machine learning or a simple mathematical rule, for a single country or for the total cases globally. however, the drawbacks of such methods are the following. first, fitting an exponential smoothing function to a model with no control point would imply that the virus will continue to spread regardless of the size of the population or the actions taken. second, if a simple rule for a given country works for the last few days, how long will that logic continue to work? what happens when values remain constant, decrease, or even increase at a different rate? there are different possible scenarios that such an approach cannot answer. third, beyond the first two arguments, how many rules would be needed to fit each country globally over a longer period? a simple time-series model that does not consider the factors characterising countries, or the policies taken, in order to find "general rules and features", would amount to finding simple rules for each country at a given time — in the simplest case, only while the curve is increasing at the initial spread time.
last, even if these previous issues were solved, the world is connected: the spatial weights may vary from country to country or day to day, based on the restrictions and measures taken. even if there were simple rules that could ultimately fit all countries, the challenge would remain in how to weight the changes around the world; most importantly, a single case in one region could influence the spread elsewhere. the generative graph of the model, along with its other factors, has empirically enhanced the predicted values for each country globally (based on trial and error). however, it remains a challenge that countries with spread over a longer period are more likely to be predicted accurately than countries with no prior cases. re-training the model with more data in the future would yield better results at both global and country levels. beyond data improvement, there are three main domains in which the model algorithms can be advanced in future works. first, finding more spatial or demographic factors with significant associations with the spread may further enhance the model's forecast. second, applying the same concept and goals of the model to other aspects of coronavirus could lead to a better understanding of its future; for instance, estimating deaths or recoveries, bearing in mind health system capability and capacity, in addition to the governmental responses, could be another assisting tool. put together, more data, more factors, and different forecasting models could also lead to a better long-term forecast (1-3 months) for each country, based on the lessons learned from the global and country-level trends of spread.
in this article, we introduced a new deep learning model relying on a variational-lstm autoencoder to predict the spread of coronavirus for 264 regions/countries across the globe. the introduced learning process and the structure of the data are key. the model learned from various types of dynamic and static data, including the historical spread data for each country; urban and demographic features such as urban population, population density, and fertility rate; and the government responses of each country aimed at mitigating the coronavirus outbreak. the model also learned to sample different conditions and adjustments of a spatially weighted adjacency matrix among the different infected countries. overall, the model shows high validity for forecasting the spread at global and country levels, which makes it a useful tool to assist decision- and policymaking in the different corners of the globe. there are several lessons learned from conducting this research. first, concerning urban features, we found associations of several factors with the spread of coronavirus globally. most significantly, countries with a higher population density per km² and a larger portion of the population living in urban areas are associated with higher coronavirus spread, with different coefficients and levels of statistical significance during the examined duration, whereas countries with higher fertility rates are associated with fewer spread cases in the studied period (22/01/2020-08/04/2020). however, we also found associations with other factors not used in this research, such as net migration: countries with higher migration flows are associated with higher spread, which could be explained by their likelihood of having a higher influx of job opportunities.
second, concerning the computed adjacency matrix graph, we found that at very short distances among the different countries infected with coronavirus, western european countries (such as germany, italy, and spain) are fully or partially connected, relative to other countries globally that at the same distances are completely isolated. this can reflect the relatively shorter distances — as a physical attribute — among these countries when compared to other countries, or the non-physical accessibility of the european market, which could lead to a higher influx of migration and accordingly higher spread cases. references: optimization method for forecasting confirmed cases of covid-19 in china; a simple iterative map forecast of the covid-19 pandemic; sir model for covid-19 calibrated with existing data and projected for colombia; modelling transmission and control of the covid-19 pandemic in australia; data analysis and modeling of the evolution of covid-19 in brazil; a time series method to analyze incidence pattern and estimate reproduction number of covid-19; an interactive web-based dashboard to track covid-19 in real time; deep learning (adaptive computation and machine learning series); world models,
key: cord-327784-xet20fcw authors: rieke, nicola; hancox, jonny; li, wenqi; milletarì, fausto; roth, holger r.; albarqouni, shadi; bakas, spyridon; galtier, mathieu n.; landman, bennett a.; maier-hein, klaus; ourselin, sébastien; sheller, micah; summers, ronald m.; trask, andrew; xu, daguang; baust, maximilian; cardoso, m. jorge title: the future of digital health with federated learning date: 2020-09-14 journal: npj digit med doi: 10.1038/s41746-020-00323-1 sha: doc_id: 327784 cord_uid: xet20fcw data-driven machine learning (ml) has emerged as a promising approach for building accurate and robust statistical models from medical data, which is collected in huge volumes by modern healthcare systems. existing medical data is not fully exploited by ml primarily because it sits in data silos and privacy concerns restrict access to this data. however, without access to sufficient data, ml will be prevented from reaching its full potential and, ultimately, from making the transition from research to clinical practice.
this paper considers key factors contributing to this issue, explores how federated learning (fl) may provide a solution for the future of digital health and highlights the challenges and considerations that need to be addressed. research on artificial intelligence (ai), and particularly the advances in machine learning (ml) and deep learning (dl) 1 have led to disruptive innovations in radiology, pathology, genomics and other fields. modern dl models feature millions of parameters that need to be learned from sufficiently large curated data sets in order to achieve clinical-grade accuracy, while being safe, fair, equitable and generalising well to unseen data [2] [3] [4] [5] . for example, training an ai-based tumour detector requires a large database encompassing the full spectrum of possible anatomies, pathologies, and input data types. data like this is hard to obtain, because health data is highly sensitive and its usage is tightly regulated 6 . even if data anonymisation could bypass these limitations, it is now well understood that removing metadata such as patient name or date of birth is often not enough to preserve privacy 7 . it is, for example, possible to reconstruct a patient's face from computed tomography (ct) or magnetic resonance imaging (mri) data 8 . another reason why data sharing is not systematic in healthcare is that collecting, curating, and maintaining a high-quality data set takes considerable time, effort, and expense. consequently such data sets may have significant business value, making it less likely that they will be freely shared. instead, data collectors often retain fine-grained control over the data that they have gathered. federated learning (fl) 9-11 is a learning paradigm seeking to address the problem of data governance and privacy by training algorithms collaboratively without exchanging the data itself. 
originally developed for different domains, such as mobile and edge device use cases 12, it recently gained traction for healthcare applications [13] [14] [15] [16] [17] [18] [19] [20]. fl enables gaining insights collaboratively, e.g., in the form of a consensus model, without moving patient data beyond the firewalls of the institutions in which they reside. instead, the ml process occurs locally at each participating institution and only model characteristics (e.g., parameters, gradients) are transferred as depicted in fig. 1. recent research has shown that models trained by fl can achieve performance levels comparable to ones trained on centrally hosted data sets and superior to models that only see isolated single-institutional data 16, 17. a successful implementation of fl could thus hold a significant potential for enabling precision medicine at large-scale, leading to models that yield unbiased decisions, optimally reflect an individual's physiology, and are sensitive to rare diseases while respecting governance and privacy concerns. however, fl still requires rigorous technical consideration to ensure that the algorithm is proceeding optimally without compromising safety or patient privacy. nevertheless, it has the potential to overcome the limitations of approaches that require a single pool of centralised data. we envision a federated future for digital health and with this perspective paper, we share our consensus view with the aim of providing context and detail for the community regarding the benefits and impact of fl for medical applications (section "data-driven medicine requires federated efforts"), as well as highlighting key considerations and challenges of implementing fl for digital health (section "technical considerations"). ml and especially dl is becoming the de facto knowledge discovery approach in many industries, but successfully implementing data-driven applications requires large and diverse data sets.
however, medical data sets are difficult to obtain (subsection "the reliance on data"). fl addresses this issue by enabling collaborative learning without centralising data (subsection "the promise of federated efforts") and has already found its way to digital health applications (subsection "current fl efforts for digital health"). this new learning paradigm requires consideration from, but also offers benefits to, various healthcare stakeholders (section "impact on stakeholders"). the reliance on data data-driven approaches rely on data that truly represent the underlying data distribution of the problem. while this is a well-known requirement, state-of-the-art algorithms are usually evaluated on carefully curated data sets, often originating from only a few sources. this can introduce biases where demographics (e.g., gender, age) or technical imbalances (e.g., acquisition protocol, equipment manufacturer) skew predictions and adversely affect the accuracy for certain groups or sites. however, to capture subtle relationships between disease patterns, socioeconomic and genetic factors, as well as complex and rare cases, it is crucial to expose a model to diverse cases. the need for large databases for ai training has spawned many initiatives seeking to pool data from multiple institutions. this data is often amassed into so-called data lakes. these have been built with the aim of leveraging either the commercial value of data, e.g., ibm's merge healthcare acquisition 21, or as a resource for economic growth and scientific progress, e.g., nhs scotland's national safe haven 22, french health data hub 23, and health data research uk 24.
substantial, albeit smaller, initiatives include the human connectome 25 , the uk biobank 26 , the cancer imaging archive (tcia) 27 , nih cxr8 28 , nih deeplesion 29 , the cancer genome atlas (tcga) 30 , the alzheimer's disease neuroimaging initiative (adni) 31 , as well as medical grand challenges 32 such as the camelyon challenge 33 , the international multimodal brain tumor segmentation (brats) challenge [34] [35] [36] or the medical segmentation decathlon 37 . public medical data is usually task-or disease-specific and often released with varying degrees of license restrictions, sometimes limiting its exploitation. centralising or releasing data, however, poses not only regulatory, ethical and legal challenges, related to privacy and data protection, but also technical ones. anonymising, controlling access and safely transferring healthcare data is a non-trivial, and sometimes impossible task. anonymised data from the electronic health record can appear innocuous and gdpr/phi compliant, but just a few data elements may allow for patient reidentification 7 . the same applies to genomic data and medical images making them as unique as a fingerprint 38 . therefore, unless the anonymisation process destroys the fidelity of the data, likely rendering it useless, patient reidentification or information leakage cannot be ruled out. gated access for approved users is often proposed as a putative solution to this issue. however, besides limiting data availability, this is only practical for cases in which the consent granted by the data owners is unconditional, since recalling data from those who may have had access to the data is practically unenforceable. the promise of federated efforts the promise of fl is simple-to address privacy and data governance challenges by enabling ml from non-co-located data. 
in a fl setting, each data controller not only defines its own governance processes and associated privacy policies, but also controls data access and has the ability to revoke it. this includes both the training and the validation phase. in this way, fl could create new opportunities, e.g., by allowing large-scale, inter-institutional validation, or by enabling novel research on rare diseases, where the incident rates are low and data sets at each single institution are too small. moving the model to the data and not vice versa has another major advantage: high-dimensional, storage-intense medical data does not have to be duplicated from local institutions in a centralised pool and duplicated again by every user that uses this data for local model training. as the model is transferred to the local institutions, it can scale naturally with a potentially growing global data set without disproportionately increasing data storage requirements. as depicted in fig. 2, a fl workflow can be realised with different topologies and compute plans. the two most common ones for healthcare applications are via an aggregation server [16] [17] [18] and peer to peer approaches 15, 39. in all cases, fl implicitly offers a certain degree of privacy, as fl participants never directly access data from other institutions and only receive model parameters that are aggregated over several participants. in a fl workflow with aggregation server, the participating institutions can even remain unknown to each other. however, it has been shown that the models themselves can, under certain conditions, memorise information [40-43]. therefore, mechanisms such as differential privacy 44, 45 or learning from encrypted data have been proposed to further enhance privacy in a fl setting (cf. section "technical considerations").

(fig. 1 caption: example federated learning (fl) workflows and difference to learning on a centralised data lake. a: fl aggregation server-the typical fl workflow in which a federation of training nodes receive the global model, resubmit their partially trained models to a central server intermittently for aggregation and then continue training on the consensus model that the server returns. b: fl peer to peer-alternative formulation of fl in which each training node exchanges its partially trained models with some or all of its peers and each does its own aggregation. c: centralised training-the general non-fl training workflow in which data-acquiring sites donate their data to a central data lake from which they and others are able to extract data for local, independent training.)

overall, the potential of fl for healthcare applications has sparked interest in the community 46 and fl techniques are a growing area of research 12, 20. current fl efforts for digital health since fl is a general learning paradigm that removes the data pooling requirement for ai model development, the application range of fl spans the whole of ai for healthcare. by providing an opportunity to capture larger data variability and to analyse patients across different demographics, fl may enable disruptive innovations for the future but is also being employed right now. in the context of electronic health records (ehr), for example, fl helps to represent and to find clinically similar patients 13, 47, as well as to predict hospitalisations due to cardiac events 14, mortality and icu stay time 19. the applicability and advantages of fl have also been demonstrated in the field of medical imaging, for whole-brain segmentation in mri 15, as well as brain tumour segmentation 16, 17. recently, the technique has been employed for fmri classification to find reliable disease-related biomarkers 18 and suggested as a promising approach in the context of covid-19 48.
it is worth noting that fl efforts require agreements to define the scope, aim and technologies used which, since it is still novel, can be difficult to pin down. in this context, today's large-scale initiatives really are the pioneers of tomorrow's standards for safe, fair and innovative collaboration in healthcare applications. these include consortia that aim to advance academic research, such as the trustworthy federated data analytics (tfda) project 49 and the german cancer consortium's joint imaging platform 50 , which enable decentralised research across german medical imaging research institutions. another example is an international research collaboration that uses fl for the development of ai models for the assessment of mammograms 51 . the study showed that the fl-generated models outperformed those trained on a single institute's data and were more generalisable, so that they still performed well on other institutes' data. however, fl is not limited just to academic environments. by linking healthcare institutions, not restricted to research centres, fl can have direct clinical impact. the on-going healthchain project 52 , for example, aims to develop and deploy a fl framework across four hospitals in france. this solution generates common models that can predict treatment response for breast cancer and melanoma patients. it helps oncologists to determine the most effective treatment for each patient from their histology slides or dermoscopy images. another large-scale effort is the federated tumour segmentation (fets) initiative 53 , which is an international federation of 30 committed healthcare institutions using an open-source fl framework with a graphical user interface. the aim is to improve tumour boundary detection, including brain glioma, breast tumours, liver tumours and bone lesions from multiple myeloma patients. another area of impact is within industrial research and translation. 
fl enables collaborative research even for competing companies. in this context, one of the largest initiatives is the melloddy project 54, which aims to deploy multi-task fl across the data sets of 10 pharmaceutical companies. by training a common predictive model, which infers how chemical compounds bind to proteins, partners intend to optimise the drug discovery process without revealing their highly valuable in-house data. impact on stakeholders fl comprises a paradigm shift from centralised data lakes and it is important to understand its impact on the various stakeholders in a fl ecosystem. clinicians. clinicians are usually exposed to a sub-group of the population based on their location and demographic environment, which may cause biased assumptions about the probability of certain diseases or their interconnection. by using ml-based systems, e.g., as a second reader, they can augment their own expertise with expert knowledge from other institutions, ensuring a consistency of diagnosis not attainable today. while this applies to ml-based systems in general, systems trained in a federated fashion are potentially able to yield even less biased decisions and higher sensitivity to rare cases, as they were likely exposed to a more complete data distribution. however, this demands some up-front effort, such as compliance with agreements, e.g., regarding the data structure, annotation and report protocol, which is necessary to ensure that the information is presented to collaborators in a commonly understood format.

(fig. 2 caption, partial: fl topologies. b: decentralised-each training node is connected to one or more peers and aggregation occurs on each node in parallel. c: hierarchical-federated networks can be composed from several sub-federations, which can be built from a mix of peer to peer and aggregation server federations (d). fl compute plans-trajectory of a model across several partners. e: sequential training/cyclic transfer learning. f: aggregation server. g: peer to peer.)
patients. patients are usually treated locally. establishing fl on a global scale could ensure high quality of clinical decisions regardless of the treatment location. in particular, patients requiring medical attention in remote areas could benefit from the same high-quality ml-aided diagnoses that are available in hospitals with a large number of cases. the same holds true for rare, or geographically uncommon, diseases, that are likely to have milder consequences if faster and more accurate diagnoses can be made. fl may also lower the hurdle for becoming a data donor, since patients can be reassured that the data remains with their own institution and data access can be revoked. hospitals and practices. hospitals and practices can remain in full control and possession of their patient data with complete traceability of data access, limiting the risk of misuse by third parties. however, this will require investment in on-premise computing infrastructure or private-cloud service provision and adherence to standardised and synoptic data formats so that ml models can be trained and evaluated seamlessly. the amount of necessary compute capability depends of course on whether a site is only participating in evaluation and testing efforts or also in training efforts. even relatively small institutions can participate and they will still benefit from collective models generated. researchers and ai developers. researchers and ai developers stand to benefit from access to a potentially vast collection of realworld data, which will particularly impact smaller research labs and start-ups. thus, resources can be directed towards solving clinical needs and associated technical problems rather than relying on the limited supply of open data sets. at the same time, it will be necessary to conduct research on algorithmic strategies for federated training, e.g., how to combine models or updates efficiently, how to be robust to distribution shifts 11, 12, 20 . 
fl-based development also implies that the researcher or ai developer cannot investigate or visualise all of the data on which the model is trained, e.g., it is not possible to look at an individual failure case to understand why the current model performs poorly on it. healthcare providers. healthcare providers in many countries are affected by the on-going paradigm shift from volume-based, i.e., fee-for-service-based, to value-based healthcare, which is in turn strongly connected to the successful establishment of precision medicine. this is not about promoting more expensive individualised therapies but instead about achieving better outcomes sooner through more focused treatment, thereby reducing the cost. fl has the potential to increase the accuracy and robustness of healthcare ai, while reducing costs and improving patient outcomes, and may therefore be vital to precision medicine. manufacturers. manufacturers of healthcare software and hardware could benefit from fl as well, since combining the learning from many devices and applications, without revealing patient-specific information, can facilitate the continuous validation or improvement of their ml-based systems. however, realising such a capability may require significant upgrades to local compute, data storage, networking capabilities and associated software. fl is perhaps best-known from the work of konečný et al. 55, but various other definitions have been proposed in the literature 9, 11, 12, 20. a fl workflow (fig. 1) can be realised via different topologies and compute plans (fig. 2), but the goal remains the same, i.e., to combine knowledge learned from non-co-located data. in this section, we will discuss in more detail what fl is, as well as highlighting the key challenges and technical considerations that arise when applying fl in digital health.
federated learning definition fl is a learning paradigm in which multiple parties train collaboratively without the need to exchange or centralise data sets. a general formulation of fl reads as follows: let L denote a global loss function obtained via a weighted combination of K local losses {L_k}_{k=1}^K, computed from private data X_k, which resides at the individual involved parties and is never shared among them:

    min_φ L(X; φ)   with   L(X; φ) = Σ_{k=1}^K w_k L_k(X_k; φ),   (1)

where w_k > 0 denote the respective weight coefficients. in practice, each participant typically obtains and refines a global consensus model by conducting a few rounds of optimisation locally before sharing updates, either directly or via a parameter server. the more rounds of local training are performed, the less it is guaranteed that the overall procedure is minimising (eq. 1) 9, 12. the actual process for aggregating parameters depends on the network topology, as nodes might be segregated into sub-networks due to geographical or legal constraints (see fig. 2). aggregation strategies can rely on a single aggregating node (hub and spokes models), or on multiple nodes without any centralisation. an example is peer-to-peer fl, where connections exist between all or a subset of the participants and model updates are shared only between directly connected sites 15, 56, whereas an example of centralised fl aggregation is given in algorithm 1. note that aggregation strategies do not necessarily require information about the full model update; clients might choose to share only a subset of the model parameters for the sake of reducing communication overhead, ensuring better privacy preservation 10 or producing multi-task learning algorithms that have only part of their parameters learned in a federated manner. a unifying framework enabling various training schemes may disentangle compute resources (data and servers) from the compute plan, as depicted in fig. 2.
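the weighted combination in eq. (1) and the centralised aggregation step can be sketched as follows. this is a minimal illustration in the spirit of fedavg, assuming plain parameter vectors and weights proportional to local data set sizes; it is not the paper's algorithm 1 verbatim:

```python
def aggregate(updates, weights):
    """weighted average of client model parameters (one aggregation round).

    `updates`: list of parameter vectors, one per participating client.
    `weights`: positive coefficients w_k, e.g., proportional to each
    client's local data set size; normalised here to sum to 1.
    """
    total = sum(weights)
    coeffs = [w / total for w in weights]
    n_params = len(updates[0])
    return [sum(c * u[p] for c, u in zip(coeffs, updates))
            for p in range(n_params)]

def federated_round(global_model, clients, local_train):
    """one round: broadcast the global model, train locally, aggregate.

    `local_train(model, client)` is assumed to return the client's
    refined parameters; the client's data never leaves the client.
    """
    updates = [local_train(list(global_model), c) for c in clients]
    weights = [c["n_samples"] for c in clients]
    return aggregate(updates, weights)
```

only the parameter vectors cross institutional boundaries here, mirroring the text's point that model characteristics, not patient data, are exchanged.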
the latter defines the trajectory of a model across several partners, to be trained and evaluated on specific data sets. these issues have to be solved for both federated and non-federated learning efforts via appropriate measures, such as careful study design, common protocols for data acquisition, structured reporting and sophisticated methodologies for discovering bias and hidden stratification. in the following, we touch upon the key aspects of fl that are of particular relevance when applied to digital health and need to be taken into account when establishing fl. for technical details and in-depth discussion, we refer the reader to recent surveys 11, 12, 20. data heterogeneity. medical data is particularly diverse-not only because of the variety of modalities, dimensionality and characteristics in general, but even within a specific protocol, due to factors such as acquisition differences, brand of the medical device or local demographics. fl may help address certain sources of bias through potentially increased diversity of data sources, but inhomogeneous data distribution poses a challenge for fl algorithms and strategies, as many assume independently and identically distributed (iid) data across the participants. in general, strategies such as fedavg 9 are prone to fail under these conditions 9, 57, 58, in part defeating the very purpose of collaborative learning strategies. recent results, however, indicate that fl training is still feasible 59, even if medical data is not uniformly distributed across the institutions 16, 17 or includes a local bias 51. research addressing this problem includes, for example, fedprox 57, a part-data-sharing strategy 58 and fl with domain adaptation 18. another challenge is that data heterogeneity may lead to a situation in which the globally optimal solution may not be optimal for an individual local participant.
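as a rough illustration of how fedprox-style methods counteract client drift under non-iid data, the sketch below adds a proximal penalty to each client's local loss. the function names and the value of mu are hypothetical, not taken from the cited fedprox work:

```python
def proximal_penalty(local_params, global_params, mu):
    """fedprox-style proximal term (mu/2) * ||w - w_global||^2 that is
    added to each client's local loss to keep local updates from
    drifting too far from the global model under non-iid data."""
    return 0.5 * mu * sum((w - g) ** 2
                          for w, g in zip(local_params, global_params))

def local_objective(base_loss, local_params, global_params, mu=0.01):
    """client objective: the original local loss plus the proximal penalty."""
    return base_loss + proximal_penalty(local_params, global_params, mu)
```

with mu = 0 this reduces to plain local training; larger mu ties each client more tightly to the consensus model.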
the definition of model training optimality should, therefore, be agreed by all participants before training. privacy and security. healthcare data is highly sensitive and must be protected accordingly, following appropriate confidentiality procedures. therefore, some of the key considerations are the trade-offs, strategies and remaining risks regarding the privacy-preserving potential of fl. privacy vs. performance: it is important to note that fl does not solve all potential privacy issues and-similar to ml algorithms in general-will always carry some risks. privacy-preserving techniques for fl offer levels of protection that exceed those of currently commercially available ml models 12. however, there is a trade-off in terms of performance and these techniques may affect, for example, the accuracy of the final model 10. furthermore, future techniques and/or ancillary data could be used to compromise a model previously considered to be low-risk. level of trust: broadly speaking, participating parties can enter two types of fl collaboration: trusted-for fl consortia in which all parties are considered trustworthy and are bound by an enforceable collaboration agreement, we can eliminate many of the more nefarious motivations, such as deliberate attempts to extract sensitive information or to intentionally corrupt the model. this reduces the need for sophisticated counter-measures, falling back to the principles of standard collaborative research. non-trusted-in fl systems that operate on larger scales, it might be impractical to establish an enforceable collaborative agreement. some clients may deliberately try to degrade performance, bring the system down or extract information from other parties.
hence, security strategies will be required to mitigate these risks, such as advanced encryption of model submissions, secure authentication of all parties, traceability of actions, differential privacy, verification systems, execution integrity, model confidentiality and protections against adversarial attacks. information leakage: by definition, fl systems avoid sharing healthcare data among participating institutions. however, the shared information may still indirectly expose the private data used for local training, e.g., by model inversion 60 of the model updates, by the gradients themselves 61 or by adversarial attacks 62, 63. fl differs from traditional training insofar as the training process is exposed to multiple parties, thereby increasing the risk of leakage via reverse-engineering if adversaries can observe model changes over time, observe specific model updates (i.e., a single institution's update), or manipulate the model (e.g., induce additional memorisation by others through gradient-ascent-style attacks). developing countermeasures, such as limiting the granularity of the updates, adding noise 16, 18 and ensuring adequate differential privacy 44, may be needed and is still an active area of research 12. traceability and accountability. as for all safety-critical applications, the reproducibility of a system is important for fl in healthcare. in contrast to centralised training, fl requires multi-party computations in environments that exhibit considerable variety in terms of hardware, software and networks. traceability of all system assets, including data access history, training configurations and hyperparameter tuning throughout the training processes, is thus mandatory. in particular in non-trusted federations, traceability and accountability processes require execution integrity.
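the countermeasure mentioned above of limiting update granularity and adding noise can be sketched as clipping an update's norm and perturbing it before sharing. this is a minimal illustration only; calibrating the noise to a formal (epsilon, delta) differential-privacy guarantee requires a proper privacy accountant, which is not shown:

```python
import math
import random

def privatise_update(update, clip_norm=1.0, noise_std=0.1, rng=None):
    """clip an update's l2 norm and add gaussian noise before sharing.

    sketch of the kind of countermeasure described in the text; the
    default parameter values are illustrative assumptions."""
    rng = rng or random.Random(0)
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [x * scale for x in update]
    return [x + rng.gauss(0.0, noise_std) for x in clipped]
```

clipping bounds any single institution's influence on the aggregate, and the noise masks what remains, at the cost of some model accuracy, which is the privacy vs. performance trade-off discussed earlier.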
after the training process reaches the mutually agreed model optimality criteria, it may also be helpful to measure the amount of contribution from each participant, such as computational resources consumed, quality of the data used for local training, etc. these measurements could then be used to determine relevant compensation and establish a revenue model among the participants 64. one implication of fl is that researchers are not able to investigate the data upon which models are being trained to make sense of unexpected results. moreover, taking statistical measurements of their training data as part of the model development workflow will need to be approved by the collaborating parties as not violating privacy. although each site will have access to its own raw data, federations may decide to provide some sort of secure intra-node viewing facility to cater for this need or may provide some other way to increase the explainability and interpretability of the global model. system architecture. unlike large-scale fl amongst consumer devices, such as in mcmahan et al. 9, healthcare institutional participants are equipped with relatively powerful computational resources and reliable, higher-throughput networks, enabling training of larger models with many more local training steps and sharing more model information between nodes. these unique characteristics of fl in healthcare also bring challenges, such as ensuring data integrity when communicating by use of redundant nodes, designing secure encryption methods to prevent data leakage, or designing appropriate node schedulers to make best use of the distributed computational devices and reduce idle time. the administration of such a federation can be realised in different ways. in situations requiring the most stringent data privacy between parties, training may operate via some sort of "honest broker" system, in which a trusted third party acts as the intermediary and facilitates access to data.
this setup requires an independent entity controlling the overall system, which may not always be desirable, since it could involve additional cost and procedural viscosity. however, it has the advantage that the precise internal mechanisms can be abstracted away from the clients, making the system more agile and simpler to update. in a peer-to-peer system, each site interacts directly with some or all of the other participants. in other words, there is no gatekeeper function; all protocols must be agreed up-front, which requires significant agreement efforts, and changes must be made in a synchronised fashion by all parties to avoid problems. additionally, in a trustless architecture the platform operator may be cryptographically locked into being honest by means of a secure protocol, but this may introduce significant computational overheads. conclusion ml, and particularly dl, has led to a wide range of innovations in the area of digital healthcare. as all ml methods benefit greatly from the ability to access data that approximates the true global distribution, fl is a promising approach to obtain powerful, accurate, safe, robust and unbiased models. by enabling multiple parties to train collaboratively without the need to exchange or centralise data sets, fl neatly addresses issues related to the egress of sensitive medical data. as a consequence, it may open novel research and business avenues and has the potential to improve patient care globally. however, already today, fl has an impact on nearly all stakeholders and the entire treatment cycle, ranging from improved medical image analysis providing clinicians with better diagnostic tools, over true precision medicine by helping to find similar patients, to collaborative and accelerated drug discovery, decreasing cost and time-to-market for pharma companies. not all technical questions have been answered yet and fl will certainly be an active research area throughout the next decade 12.
despite this, we truly believe that its potential impact on precision medicine, and ultimately on improving medical care, is very promising.

reporting summary. further information on research design is available in the nature research reporting summary linked to this article.

received: 17 march 2020; accepted: 12 august 2020

references
- deep learning
- deep learning in medicine-promise, progress, and challenges
- deep learning: a primer for radiologists
- clinically applicable deep learning for diagnosis and referral in retinal disease
- revisiting unreasonable effectiveness of data in deep learning era
- a systematic review of barriers to data sharing in public health
- estimating the success of reidentifications in incomplete datasets using generative models
- identification of anonymous mri research participants with face-recognition software
- communication-efficient learning of deep networks from decentralized data
- federated learning: challenges, methods, and future directions
- federated machine learning: concept and applications
- advances and open problems in federated learning
- privacy-preserving patient similarity learning in a federated environment: development and analysis
- federated learning of predictive models from federated electronic health records
- braintorrent: a peer-to-peer environment for decentralized federated learning
- privacy-preserving federated brain tumour segmentation
- multi-institutional deep learning modeling without sharing patient data: a feasibility study on brain tumor segmentation
- multi-site fmri analysis using privacy-preserving federated learning and domain adaptation: abide results
- patient clustering improves efficiency of federated machine learning to predict mortality and hospital stay time using distributed electronic medical records
- federated learning for healthcare informatics
- ibm's merge healthcare acquisition
- nhs scotland's national safe haven
- the french health data hub and the german medical informatics initiatives: two national projects to promote data sharing in healthcare
- the human connectome: a structural description of the human brain
- uk biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age
- the cancer imaging archive (tcia): maintaining and operating a public information repository
- chestx-ray8: hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases
- deeplesion: automated mining of large-scale lesion annotations and universal lesion detection with deep learning
- the cancer genome atlas (tcga): an immeasurable source of knowledge
- the alzheimer's disease neuroimaging initiative (adni): mri methods
- grand challenge-a platform for end-to-end development of machine learning solutions in biomedical imaging
- 1399 h&e-stained sentinel lymph node sections of breast cancer patients: the camelyon dataset
- the multimodal brain tumor image segmentation benchmark (brats)
- identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the brats challenge
- advancing the cancer genome atlas glioma mri collections with expert segmentation labels and radiomic features
- a large annotated medical image dataset for the development and evaluation of segmentation algorithms
- quantifying differences and similarities in whole-brain white matter architecture using local connectome fingerprints
- distributed deep learning networks among institutions for medical imaging
- membership inference attacks against machine learning models
- white-box vs black-box: bayes optimal strategies for membership inference
- understanding deep learning requires rethinking generalization
- the secret sharer: evaluating and testing unintended memorization in neural networks
- deep learning with differential privacy
- privacy-preserving deep learning
- a roadmap for foundational research on artificial intelligence in medical imaging: from the 2018 nih/rsna/acr/the academy workshop
- federated tensor factorization for computational phenotyping
- federated deep learning via neural architecture search
- trustworthy federated data analytics (tfda)
- medical institutions collaborate to improve mammogram assessment ai
- the federated tumor segmentation (fets) initiative
- machine learning ledger orchestration for drug discovery
- federated optimization: distributed machine learning for on-device intelligence
- peer-to-peer federated learning on graphs
- federated optimization in heterogeneous networks
- federated learning with non-iid data
- on the convergence of fedavg on non-iid data
- p3sgd: patient privacy preserving sgd for regularizing deep cnns in pathological image classification
- deep leakage from gradients
- beyond inferring class representatives: user-level privacy leakage from federated learning
- deep models under the gan: information leakage from collaborative deep learning
- data shapley: equitable valuation of data for machine learning

acknowledgements. …analytics") and the prime programme of the german academic exchange service (daad) with funds from the german federal ministry of education and research (bmbf). the content and opinions expressed in this publication are solely the responsibility of the authors and do not necessarily represent those of the institutions they are affiliated with, e.g., the u.s. department of health and human services or the national institutes of health. open access funding provided by projekt deal.

supplementary information is available for this paper at https://doi.org/10.1038/s41746-020-00323-1. correspondence and requests for materials should be addressed to n.r.
publisher's note: springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

key: cord-332583-5enha3g9 authors: bodine, erin n.; panoff, robert m.; voit, eberhard o.; weisstein, anton e. title: agent-based modeling and simulation in mathematics and biology education date: 2020-07-28 journal: bull math biol doi: 10.1007/s11538-020-00778-z sha: doc_id: 332583 cord_uid: 5enha3g9 with advances in computing, agent-based models (abms) have become a feasible and appealing tool to study biological systems. abms are seeing increased incorporation into both the biology and mathematics classrooms as powerful modeling tools to study processes involving substantial amounts of stochasticity, nonlinear interactions, and/or heterogeneous spatial structures. here we present a brief synopsis of the agent-based modeling approach with an emphasis on its use to simulate biological systems, and provide a discussion of its role and limitations in both the biology and mathematics classrooms.
agent-based models (abms) are computational structures in which system-level (macro) behavior is generated by the (micro) behavior of individual agents, which may be persons, cells, molecules or any other discrete quantities. typical abms contain three elements: agents, an environment, and rules governing each agent's behavior and its local interactions with other agents and with the environment. decades of advancement in computer power have made agent-based modeling a feasible and appealing tool to study a variety of complex and dynamic systems, especially within the life sciences. as the use of abms in research has grown, so too has the inclusion of abms in life science and mathematical modeling courses as a means of exploring and predicting how individual-level behavior and interactions among individuals lead to system-level observable patterns. abms are now one of the many types of models that students of the life sciences or applied mathematics should encounter in their undergraduate education. prior to the introduction of abms into biological and applied mathematics curricula, the clear model format of choice was the ordinary differential equation (ode), or perhaps a pair of them; occasionally, discrete difference equations and/or matrix equations would also be introduced. exponential growth and decay were ready examples, paving the way for extensions of the exponential growth process toward a carrying capacity in the form of the logistic growth process (voit 2020). this logistic process was easily generalized to two populations, which were at first independent, but then allowed to interact. depending on these interactions, the result was a pair of populations competing for the same resource or a simple predator-prey model in the format of a two-variable lotka-volterra system.
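the three elements named above (agents, environment, rules) can be made concrete with a minimal sketch. the toy model below — random-walking agents on a toroidal grid — and all of its names are our own illustrative choices:

```python
import random

# Minimal sketch of the three ABM ingredients: agents, an environment
# (a toroidal grid), and a rule each agent applies locally per step.

class Agent:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def step(self, width, height, rng):
        # Rule: move one cell in a random direction, wrapping at edges.
        dx, dy = rng.choice([(0, 1), (0, -1), (1, 0), (-1, 0)])
        self.x = (self.x + dx) % width
        self.y = (self.y + dy) % height

def run_abm(n_agents=10, width=20, height=20, steps=50, seed=0):
    rng = random.Random(seed)            # seeded for reproducibility
    agents = [Agent(rng.randrange(width), rng.randrange(height))
              for _ in range(n_agents)]
    for _ in range(steps):
        for a in agents:
            a.step(width, height, rng)
    return agents
```

more realistic abms differ only in the richness of the rule and the environment; the simulation loop keeps this same shape.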
although odes and any other types of "diff-e-qs" are a priori dreaded by almost all but mathematicians and physicists, the concept of an ode, if adequately explained, becomes quite intuitive. for instance, one may ease a novice into the world of odes by considering changes in the water level w of a lake over time (ayalew 2019). whereas these dynamics are difficult to formulate as an explicit function w(t), newcomers readily understand that changes in the water level depend on influxes from tributaries, rain, and other sources on the supply side, and on effluxes, evaporation and water utilization on the side of reducing the amount of water. just putting these components into an equation leads directly to a differential equation of the system (weisstein 2011). on the left side, one finds the change over time as dw/dt, and this change is driven, on the right-hand side, by a sum of augmenting and diminishing processes. there is hardly a limit to what can be achieved with odes in biology, with the very important exception of processes that have genuine spatial features. and while it is not difficult to ease a biology undergraduate into ordinary differential equations, the same is not necessarily true for partial differential equations (pdes). however, spatial phenomena in biology seldom occur in homogeneous conditions. as examples, consider the formation of tumors with angiogenesis and necrosis; the local patterns of cell-to-cell signaling that govern embryonic development; the spread of the red fire ant (solenopsis invicta) from mobile, al, its alleged port of entry into the usa, all along the gulf and east coasts; or the population size and dynamics of the santa cruz island fox (urocyon littoralis santacruzae) being driven by territory size, which in turn depends on local vegetation (scott 2019). until relatively recently, the conundrum of space was often relegated to the final chapter of texts on mathematical modeling in biology.
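the lake example can be turned into a small numerical experiment. the sketch below integrates dw/dt = influx − efflux with euler's method; the constant flux values and function name are made up for illustration:

```python
# Euler integration of the lake ODE dW/dt = inflow - outflow.
# With constant fluxes the exact solution is linear in time, which
# makes the numerical result easy for a novice to check by hand.

def water_level(w0, inflow, outflow, dt=0.1, n_steps=100):
    """Return the water level at each Euler step, starting from w0."""
    w = w0
    levels = [w]
    for _ in range(n_steps):
        w += dt * (inflow - outflow)   # right-hand side of the ODE
        levels.append(w)
    return levels
```

starting at w0 = 100 with inflow 2 and outflow 1, 100 steps of size 0.1 advance time by 10 units, so the level ends near 100 + 10·(2 − 1) = 110, matching the hand calculation.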
a sea change came with the development of abms, which are natural formats for both stochasticity and spatial phenomena. by their nature, these models are computationally expensive, which initially prevented their use in most classrooms. however, this situation has obviously changed. as the bio2010 report (2003) stated: "computer use is a fact of life of all modern life scientists. exposure during the early years of their undergraduate careers will help life science students use current computer methods and learn how to exploit emerging computer technologies as they arise." classroom use of abms has thus become not just logistically feasible, but also very appealing for demonstrating spatial dynamics in a wide range of biological systems (kottonau 2011; triulzi & pyka 2011; shiflet 2013; pinder 2013). supporting this appeal is a repertoire of software tools, such as simbio and netlogo (see sect. 3), that contain predefined examples and require minimal computer coding skills for model analysis. here, we present a brief synopsis and history of this modeling approach with emphasis on life science applications (sect. 2), describe some of the software tools most frequently used in the classroom (sect. 3), and then focus on some of its roles and limitations in the classroom (sect. 4).

the origins of agent-based modeling. the true origins of any method or procedure are seldom identifiable in an unambiguous manner. in the case of agent-based modeling, one could think of craig reynolds' seminal 1987 article on the formation of bird flocks (with the agents denoted as boids, short for "bird-oid object"), which he was able to represent with just three rules of behavior: (1) avoid collisions with nearby birds; (2) attempt to match the velocity of nearby birds; and (3) attempt to stay close to nearby birds in the flock (reynolds 1987; gooding 2019). the result of simulations with this simple abm was very realistic-looking flocking behavior.
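reynolds' three rules can be sketched as a single synchronous update step. the neighbourhood radius and rule weights below are illustrative guesses, not values from the 1987 paper:

```python
# One synchronous update of Reynolds' three boid rules:
# separation (avoid collisions), alignment (match velocity),
# and cohesion (stay close to nearby flockmates).

def boid_step(boids, radius=5.0, w_sep=0.05, w_ali=0.05, w_coh=0.01):
    """boids: list of [x, y, vx, vy]; returns the updated list."""
    new = []
    for i, (x, y, vx, vy) in enumerate(boids):
        nbrs = [b for j, b in enumerate(boids) if j != i
                and (b[0] - x)**2 + (b[1] - y)**2 < radius**2]
        if nbrs:
            cx = sum(b[0] for b in nbrs) / len(nbrs)   # centre of mass
            cy = sum(b[1] for b in nbrs) / len(nbrs)
            avx = sum(b[2] for b in nbrs) / len(nbrs)  # mean velocity
            avy = sum(b[3] for b in nbrs) / len(nbrs)
            vx += w_coh*(cx - x) + w_ali*(avx - vx) + w_sep*sum(x - b[0] for b in nbrs)
            vy += w_coh*(cy - y) + w_ali*(avy - vy) + w_sep*sum(y - b[1] for b in nbrs)
        new.append([x + vx, y + vy, vx, vy])
    return new
```

note that there is no leader anywhere in this code: each boid only reads its neighbours, yet iterating `boid_step` produces coordinated group motion, which is precisely the emergence the text describes.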
particularly intriguing in this study was the fact that there was no leader or global organizing principle. instead, the virtual birds were truly individual agents that self-organized locally, thereby generating a globally coherent flight pattern. while reynolds' work was a milestone, key concepts leading to modern abms can be found much earlier. one notable contributor of ideas was nobel laureate enrico fermi, who used mechanical addition machines to generate probabilities for stochastic models with which he solved otherwise unwieldy problems (gooding 2019). this procedure was an early form of monte carlo simulation, which was later independently developed and published by stanislaw ulam, like fermi a member of the manhattan project (metropolis & ulam 1949; metropolis 1987). another very important contribution to the budding development of abms was the turing machine (turing 1936), a mathematical model of computation that uses a set of rules to manipulate symbols in discrete cells on an infinite tape. much closer to abms were ideas of ulam, who was fascinated by the "automatic" emergence of patterns in two-dimensional games with very simple rules (ulam 1950; metropolis 1987). together with the new concept of game theory (von neumann & morgenstern 1944), all these ideas were developed into the concept of cellular automata, which are direct predecessors of abms (ulam 1950; ulam et al. 1947; von neumann & morgenstern 1944). a very appealing implementation of a cellular automaton was john conway's famous game of life (gardner 1970). the social sciences adopted computational methods in the 1960s for microanalytic simulations or, simply, microsimulations (gilbert & troitzsch 2005; gooding 2019). in contrast to today's abms, which use simple rules to recreate observed or unknown patterns, the agents in the original microsimulations acted according to empirical data (bae 2016).
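conway's game of life is easy to reproduce, and makes a good first cellular-automaton exercise. a minimal sketch on a small toroidal grid (the wrapping boundary is our simplifying choice):

```python
# Conway's Game of Life on a toroidal grid: a live cell survives with
# 2 or 3 live neighbours; a dead cell becomes alive with exactly 3.

def life_step(grid):
    """Apply one synchronous Game of Life update to a 0/1 grid."""
    h, w = len(grid), len(grid[0])

    def live_nbrs(r, c):
        return sum(grid[(r + dr) % h][(c + dc) % w]
                   for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0))

    return [[1 if (grid[r][c] and live_nbrs(r, c) in (2, 3))
             or (not grid[r][c] and live_nbrs(r, c) == 3) else 0
             for c in range(w)] for r in range(h)]
```

a horizontal "blinker" of three live cells turns vertical after one step and back to horizontal after two, a handy sanity check for students implementing the rules themselves.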
the advantage of this strategy is that model analysis can reveal systemic behaviors under different realistic scenarios (gooding 2019). a seminal paper in this context described generic segregation processes (schelling 1971). other agent-based modeling work in sociology and economics was gleaned from the biological abm work of smith (1982), who formulated darwin's ideas of evolution as a computer simulation. this idea inspired nelson & winter to apply similar concepts and implementations to market studies, where firms were modeled like animals that followed specific routines. in particular, the firms were in competition, and the market weeded out bad routines while rewarding the fittest. nelson & winter's influential book an evolutionary theory of economic change (nelson & winter 1982) strongly proposed the use of computer simulations, which today would fall within the scope of agent-based modeling. their ideas led to a school of thought called evolutionary economics (hanappi 2017). an early and particularly influential paper in this context tried to shed light on the stock market (palmer 1994). initially, abms of artificial life simulated simple homogeneous agents that acted like huge colonies of ants that could just move and eat in the pursuit of food. somewhat more sophisticated economic simulations used homo economicus as the main agent: a consistently rational human pursuing the optimization of some economic goal or utility with exclusive self-interest (persky 1995). from these humble beginnings, growing sophistication in computing soon permitted heterogeneous agents and much more complicated landscapes than before. the successes in economics were so tantalizing that simulation studies eventually reached the most prestigious journals of economics and the social sciences (geanakoplos 2012; hanappi 2017). modern abms in economics are capable of capturing much of the complexity of macroeconomic systems (e.g., caiani 2016).
following directly the principles of cellular automata, kauffman studied large grids with elements that changed features in a binary fashion (kauffman 1993). for instance, a white agent could turn black, and this shift occurred according to boolean rules that usually involved some or all neighboring grid points. kauffman was able to demonstrate the emergence of complex patterns, such as oscillations and percolation. starting in the 1980s, wolfram performed systematic studies of cellular automata, which led to his influential 2002 book a new kind of science (wolfram 2002), which assigns cellular automata a wide range of applications in a variety of fields.

biological applications of agent-based modeling. abms have been constructed to study a wide range of biological phenomena. numerous reviews and research efforts using abms have focused on specific biomedical systems. issues of gene expression were modeled by thomas (2019). morphogenetic and developmental processes were discussed in grant (2006), robertson (2007), tang (2011), and glen (2019). models of tissue mechanics and microvasculature are captured in, e.g., van liedekerke (2015). inflammation, wound healing and immune responses were addressed in, e.g., chavali (2008) and castiglione & celada (2015). other authors (wang 2015; segovia 2004; cisse 2013) used abms to model cancer growth, tuberculosis, and schistosomiasis, respectively. lardon and colleagues (lardon 2011) used abms to analyze biofilm dynamics. butler and colleagues described the use of abms in pharmacology (butler 2015). abms studying multicellular systems provide a unique capability to examine interactions and feedback loops across the different hierarchies of the system (hellweger 2016). reviews of abms in the context of ecology, environmental management, and land use include bousquet (2004), matthews (2007), grimm & railsback (2005), caplat (2008), and deangelis & diaz (2019).
in some applications, interventions or treatments were addressed and therefore required the adaptation of agents to changing scenarios (berry 2002). abms have also been used to simulate epidemics, with analyses examining the impact of implemented or potential intervention measures (e.g., quarantining/physical distancing, mask wearing, and vaccination) (mniszewski 2013; perez & dragicevic 2009; tracy 2018). visual representations of epidemiological abms have even been used by news outlets during the covid-19 pandemic to help explain to the public how various intervention methods change the shape of an epidemic (i.e., "flatten the curve") or the basic reproduction number (r0) of an epidemic; see, for example, fox (2020) and stevens (2020).

rationale & pitfalls of abms. the two most frequent goals of an abm analysis are (1) the elucidation and explanation of emergent behaviors of a complex system and (2) the inference of rules that govern the actions of the agents and lead to these emerging system behaviors. this type of inference is based on large numbers of simulations, i.e., replicate experiments with the abm using the same assumptions and parameters, and different experiments over which assumptions or parameter values are systematically changed. simulations of abms essentially always yield different outcomes, because movements, actions and interactions of agents with each other or with the environment are stochastic events. the inference of rules from simulation results is an abductive process (voit 2019) that is challenging, both because one can easily demonstrate that different rule sets may lead to the emergence of the same systemic behaviors, and because even numerous simulations seldom cover the entire repertoire of a system's possible responses. in fact, hanappi (2017) warned: "assumptions on microagents that play the role of axioms from which the aggregate patterns are derived need not-and indeed never should-be the end of abm research."
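a minimal stochastic agent-based epidemic of the kind referenced above can be sketched as follows. the contact scheme (one random contact per infected agent per step) and all parameter values are illustrative, not taken from any cited model:

```python
import random

# Minimal stochastic agent-based SIR epidemic: each step, every
# infected agent contacts one random agent (possibly transmitting)
# and recovers with some probability. Seeded for reproducibility.

def sir_abm(n=200, n_infected=5, p_transmit=0.3, p_recover=0.1,
            steps=100, seed=0):
    """Return a list of (S, I, R) counts, one tuple per time step."""
    rng = random.Random(seed)
    state = ['I'] * n_infected + ['S'] * (n - n_infected)
    history = []
    for _ in range(steps):
        infected = [i for i, s in enumerate(state) if s == 'I']
        for i in infected:
            j = rng.randrange(n)             # random contact
            if state[j] == 'S' and rng.random() < p_transmit:
                state[j] = 'I'
            if rng.random() < p_recover:     # recovery
                state[i] = 'R'
        history.append(tuple(state.count(s) for s in 'SIR'))
    return history
```

re-running with lower `p_transmit` (a stand-in for distancing or mask wearing) visibly flattens the infection curve, which is the classroom point the news visualizations make.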
arguably the greatest appeal of abms, and at the same time a treacherous pitfall, is their enormous flexibility, which is attributable to the fact that any number of rules can be imposed on the agents, and that the environment may be very simple but can also be exceedingly complicated. for instance, the environment may exhibit gradients or even different individually programmable patches (barth 2012; gooding 2019), including imported geographic information systems (gis) data that define the characteristics of each patch (see scott (2019) for an example with a detailed explanation of how gis data are incorporated into an abm). in particular, the repeated addition of new elements to a model can quickly increase the complexity of the model, thereby possibly distracting from the core drivers of the system's behavior, obscuring the importance of each of the rules the agents must follow, and generally making interpretations of results more difficult. related to this option of adding elements is the critique that abms can be 'tuned' by researchers to create outcomes that support the researcher's narrative (gooding 2019). to counteract increases in complexity, some authors have begun to develop methods of model reduction that retain the core features of a model but eliminate unnecessary details (zou 2012). another critique of abms is that simulations are not readily reproducible, because the rules can be complicated and stochasticity rarely repeats the same model trajectories. a strategy to increase the reproducibility of abms was the establishment of protocols for the standardized design and analysis of abms (lorek & sonnenschein 1999; grimm 2006, 2010; heard 2015). in a slight extension of abms, lattilä (2010) and cisse (2013) described hybrid simulations involving abms and dynamic systems, and heard (2015) discussed statistical methods of abm analysis.
these introductory tutorials and reviews, however, are typically not designed for undergraduates with limited mathematical or computational modeling experience. the scientific community has also worked hard on facilitating the use of abms by offering software like netlogo, swarm, repast, and mason. summaries and evaluations of some of the currently pertinent software are available in berryman (2008) and abar (2017), and netlogo is described more fully in sect. 3. a variety of software tools can be used to construct, simulate, and analyze abms. when abms are taught or used in biology or mathematics courses, software should be chosen to align with pedagogical objectives. note that the pedagogy of abms in life science and mathematical modeling courses is discussed in more detail in sect. 4. in this section, we highlight some of the most widely used software packages in an educational setting.

ecobeaker & simbio. ecobeaker was developed by eli meir and first released in 1996 as software that ran simulated experiments designed to explore ecological principles. in 1998, simbio (meir 1998) was founded (then called beakerware) for the release of the second version of ecobeaker, and has since grown to include simulated experiments in evolution, cell biology, genetics, and neurobiology. many of the simulated experiments in the simbio virtual labs software are agent-based simulations. in a simbio virtual lab, the user interacts with a graphical interface portraying the agents (individuals in the experiment) and sometimes their environment, which can be manipulated in various ways for different experimental designs. the graphical interface also contains relevant graphs which are updated as the simulated experiment runs. ecobeaker and simbio virtual labs are examples of software where the focus is on experimental design and the use of simulation to understand biological concepts.
the user never interfaces with the code and does not need to understand the underlying algorithms which produce the simulation. simbio virtual labs are used in many biology classrooms across the united states. the software is priced per student (currently $6/lab/student or $49/student for unlimited labs).

netlogo. the agent-based modeling environment netlogo (wilensky 1999) was developed by uri wilensky and first released in 1999. netlogo is free and continues to be improved and updated (the software is currently on version 6). additionally, a simplified version of netlogo can be run through a web browser at http://netlogoweb.org/. netlogo is a programming platform allowing the implementation of any abm a user might design. as such, its user interface includes a tab where abm code is written in the netlogo programming language, and a tab for the user to view a visualization of the abm and user-specified outputs as the abm is simulated. textbooks by railsback & grimm (2012) and wilensky & rand (2015) provide introductions to the netlogo programming language as well as a thorough overview of the algorithmic structures of abms. since its initial development, netlogo has built up an extensive model library of abms. additionally, over the years, faculty at various institutions have developed abm modules through netlogo that allow students to explore a variety of biological phenomena. for example, the virtual biology lab (jones 2016) has created 20 different virtual laboratory modules, using netlogo through a web browser, for exploring topics in ecology, evolution, and cell biology. the virtual biology labs are similar in scope to the ecobeaker and simbio labs. another example is infections on networks (iontw), which provides an abm framework and teaching modules for examining aspects of disease dynamics on various network structures (just 2015a, b, c).
one of the responses to the bio2010 report (2003) has been a push to create biocalculus courses or to insert more biological application examples and projects within traditional calculus courses. indeed, studies have shown that including applications from the life sciences in classic math courses like calculus leads to students gaining equivalent or better conceptual knowledge than from similar courses without life science applications (comar 2008). however, many mathematics and biology educators have pointed out that the subset of mathematics applicable to biology extends well beyond calculus, and undergraduates (especially those majoring in biology) should be exposed to a variety of mathematical models and methods of analysis across biology and mathematics courses (bressoud 2004; gross 2004; robeva & laubenbacher 2009). the only prerequisites for analyzing the simulations of an abm are a basic understanding of the underlying biology and, in some instances, knowledge of how to perform basic statistical calculations or generate graphical representations of results. many pre-built abms in simbio virtual labs and netlogo generate relevant graphs and/or produce spreadsheets or text files of the relevant data. thus, a student can utilize abms in life science courses without having to learn how to implement (by writing code) an abm. the prerequisites for learning how to implement abms are not as extensive as for other forms of mathematical models. the most essential prerequisite is some exposure to the fundamentals of computer programming, such as understanding loop structures and conditional statements, implementing stochastic processes, and understanding how the order of executed operations within a program impacts the program's output.
agent-based modeling software like netlogo (wilensky 1999) keeps model implementation and analysis relatively simple by providing built-in model visualization tools and automatically randomizing the order in which agents execute programmed operations. due to their appealing visual representation, abms can easily be used in the classroom to demonstrate biological processes ranging from chemical reactions to interactions among species, as if the model were simply an animation. however, using abms in this way is much like using a cell phone to hammer nails (q.v., theobald 2004): it may work in the desired fashion, but represents an utter waste of the tool's real potential. adding just one more step, the collection of model-generated data, transforms a passive learning experience into an active one. students can be asked to calculate means and variances, graph relationships between variables, discuss the sample size needed for reliable results, and generate quantitative predictions under different hypotheses, all in the context of a specific biological question. this can be done even in a large classroom with only a single, instructor-controlled computer, bridging the gap between lecture and lab. if students have access to individual computers, much more is possible. either individually or in small groups, students can use abms to collect and analyze their own data. free file-sharing resources such as google docs make it easy to pool data across many groups, thereby crowd-sourcing problems that would be too large for any one group to handle on their own. in smaller classes and lab sections, individuals or groups can be assigned to model different scenarios (e.g., the interaction effects between different parameters), prompting discussions of the most appropriate parameter values and model settings. such models can even be extended into miniature research projects.
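the data-collection step described above (replicate runs, then means and variances) might look like the sketch below. the "model" here is a stand-in coin-flip simulation, purely illustrative, since any stochastic abm output could take its place:

```python
import random
import statistics

# Pooling replicate runs: run a stochastic model several times and
# summarise one output with its mean and variance across replicates.

def one_run(rng, n_trials=100, p=0.5):
    """A toy stochastic model: number of successes in n_trials."""
    return sum(rng.random() < p for _ in range(n_trials))

def summarise(n_replicates=30, seed=0):
    """Mean and sample variance of the model output over replicates."""
    rng = random.Random(seed)
    outcomes = [one_run(rng) for _ in range(n_replicates)]
    return statistics.mean(outcomes), statistics.variance(outcomes)
```

increasing `n_replicates` shrinks the spread of the estimated mean, which is exactly the sample-size discussion the text recommends having with students.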
for example, in a unit on community ecology, students might be assigned a question about two interacting species, then use online resources to find relevant information and parameter estimates, design and conduct a series of model runs, analyze their data using simple statistical techniques, and present their findings to the class. although abms can be used to simulate almost any biological process, meaningful exploration of a model typically requires a substantial commitment of class time and instructor engagement. as a result, except in modeling courses, it is seldom practical to incorporate abms into every lesson plan. in our experience, their educational value is highest for studying processes involving (a) substantial amounts of stochasticity, (b) nonlinear interactions, and/or (c) a defined spatial structure. the inclusion of abms in math courses generally comes in two different modes: (1) abms are taught as one modeling technique in a course covering multiple modeling techniques, and (2) the construction and analysis of abms are taught in a course where abms are the only type of modeling being used. however, due to the minimal prerequisites for learning agent-based modeling, both types of courses can potentially be offered with one or no prerequisite courses. bodine (2018) provides an example of a course where abms are taught as one type of discrete-time modeling technique in an undergraduate course designed for biology, mathematics, biomathematics, and environmental science majors that has no prerequisites beyond high-school algebra. an example of a course where abms are the only type of modeling being used is given by bodine (2019); this course has only a single prerequisite: either a prior math modeling course which introduces basic computer programming or an introduction to computer science course. 
the use of biological applications in teaching mathematical modeling (including modeling with abms) is often viewed as having a lower entry point with less new vocabulary overhead than other types of applications (e.g., those from physics, chemistry, or economics). in particular, models of population dynamics, even those involving interactions between multiple subpopulations or different species, usually do not require any new vocabulary for most students, which allows for a more immediate focus on mechanisms of population change and impact of interactions between individuals within the populations. video game vs. scientific process. in our experience, students sometimes react to an abm's many controls and visual output by treating the model as a video game, clicking buttons at random to see what entertaining patterns they can create. other students prefer to complete the exercise as quickly as possible by blindly following the prescribed instructions. neither of these approaches substantially engages students in thinking about the question(s) underlying the model, different strategies for collecting and analyzing data, or the model's limitations. to foster these higher-order cognitive skills, use of the abm should be explicitly framed as an example of the scientific process. this approach begins with a set of initial observations and a specific biological question. for example, what management practices would be most effective in controlling the invasive brown tree snake? after familiarizing themselves with the biological system, students propose hypotheses about the factors contributing to the snake's success on guam, suggest management strategies, and set model parameters that reflect their chosen strategy. finally, they run the model multiple times to collect data that allow them to measure their strategy's effectiveness. 
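one way to organise such repeated model runs is a small factorial design over candidate management parameters. a minimal sketch of enumerating such a design is given below; the factor names, level labels, and replicate count are illustrative assumptions, not values taken from the exercise:

```python
from itertools import product

# candidate management factors and levels (names and levels are assumptions)
cargo_levels = ["none", "partial", "full"]      # cargo checks on inbound flights
bait_levels = ["none", "low", "high"]           # poison baits around airstrips
replicates = 3

# full factorial design: every combination of levels, replicated
runs = [(cargo, bait, rep)
        for cargo, bait in product(cargo_levels, bait_levels)
        for rep in range(replicates)]

print(len(runs))  # → 27
```

enumerating the design up front lets students check their run count against the simulation 'budget' before touching the model.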
under this pedagogical approach, abms become a vehicle for designing and conducting a miniature research project, enabling experiments that would not otherwise be practical due to cost, logistics, or ethical considerations. the modeling exercise can also reinforce lessons on how scientific knowledge is constructed and tested (e.g., the three p's of science, namely, problem posing, problem solving, and peer persuasion (watkins 1992)). as part of this exercise, students should engage in the process of deciding how best to collect, analyze, and present their data. for example, as part of the brown tree snake project, students might be asked to explore the practical steps that other pacific islands could take to prevent invasions or eradicate invaders. one group of students decides to focus on two different control measures: cargo checks of all inbound flights and deployment of poison baits around airstrips. following an overview of different statistical approaches, the students determine that a multiple regression analysis would best allow them to address their question. allowed only a limited 'budget' of 30 model runs, students settle on a factorial design using three treatment levels of cargo checks, three levels of baiting, and three replicates of each combination. the students set up a spreadsheet to record the data from each model run, graph their data in a scatter plot, and use software such as microsoft excel's analysis toolpak to conduct their analysis. a model as a caricature of the real world. students at early stages of their academic career often envision science as a collection of factual information and fixed procedures. students with this mindset may dismiss as useless any model that does not incorporate every detail of a particular biological system.
by contrast, scientists recognize that models, whether mathematical, physical, or conceptual, are deliberate simplifications that attempt to capture certain properties of a biological system while ignoring others (dahlquist 2017). for example, the standard epidemiological sir model (diagrammed in fig. 1a) divides a population into three subpopulations (susceptible, infectious, and removed) while ignoring any potential heterogeneity within each subpopulation (e.g., age, treatment status, groups at higher risk for transmission). students will need to engage in activities that frame the abm as a hypothesis about the organization and function of a specific biological system (weisstein 2011). after a description (and possible exploration) of the basic model, students can work in groups to suggest additional processes and variables that seem relevant to understanding the system. they can then choose one or two of the factors that they consider most important to addressing the question being asked. finally, they should consider how to modify the model to incorporate the chosen features. for example, a standard epidemiological model divides the host population into susceptible, infectious, and removed subpopulations, and studies the movement of individuals among these subpopulations (fig. 1a). a group of students decides to modify this model to track a malarial epidemic. after discussing mortality rates, prevention and treatment options, and genetic and age-related variation in host susceptibility, the students decide to focus on incorporating vector transmission into their model. through guided discussion with the instructor, they realize that transmission now occurs in two directions: from infected vectors to susceptible hosts and from infected hosts to uninfected vectors. they therefore develop a schematic model (fig. 1b) that depicts these revised rules for each agent in the abm.
even if the students do not actually build the corresponding computational model, this exercise in extending a model to reflect specific biological assumptions helps students understand the iterative process by which models are developed and the utility of even simple models to clarify key features of the system's behavior. algorithms vs. equations. the concept of an equation is introduced fairly early in mathematics education. in the united states, children can encounter simple algebraic equations in elementary school (common core 2019) and then continue to see increasingly complex equations in math classes through college. because of this long exposure to equations, the use of functions and systems of equations to model systems in the natural world feels "natural" or logical to students when they first encounter differential equation models or matrix models. abms, on the other hand, can seem confusing to students because they cannot be expressed as an equation or set of equations. an abm is constructed as an algorithm describing when and how each agent interacts with its local environment (which may include other agents). often these interactions are governed by stochastic processes, and thus "decisions" by agents are made through the generation of random numbers. when first introducing students to abms, it can be helpful to teach students how to read and construct computer program flowcharts and to create a visual representation of what is occurring within an algorithm or portion of an algorithm (see bodine (2019) for example assignments that utilize program flowchart construction). in life science classes where abms are being analyzed but not constructed and implemented, a flow diagram can be a useful tool for conveying the order in which processes occur in the model. class discussions can question whether the order of processes makes biological sense, and whether there are alternatives.
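as a concrete illustration of such an algorithm, below is a minimal sketch of one tick of a toy infection abm; the agent structure, ring neighbourhood, and infection probability are illustrative assumptions rather than a model from the text. each agent's stochastic "decision" is just a random draw, and the execution order is randomised each tick:

```python
import random

def step(agents, infect_prob, rng):
    """one tick of a toy abm: agents execute in a randomised order, and each
    susceptible agent 'decides' (by a random draw) whether it is infected by
    any of its infected neighbours."""
    order = list(range(len(agents)))
    rng.shuffle(order)                      # randomise execution order each tick
    for k in order:
        if agents[k]["state"] == "S":
            for j in agents[k]["neighbours"]:
                if agents[j]["state"] == "I" and rng.random() < infect_prob:
                    agents[k]["state"] = "I"
                    break
    return agents

rng = random.Random(0)
# ten agents on a ring, each seeing its two neighbours; one starts infected
agents = [{"state": "S", "neighbours": [(k - 1) % 10, (k + 1) % 10]}
          for k in range(10)]
agents[0]["state"] = "I"
for _ in range(20):
    step(agents, 0.3, rng)
print(sum(a["state"] == "I" for a in agents))  # infected count after 20 ticks
```

a flowchart of this algorithm would show the tick loop, the per-agent loop, and the random-draw decision point, which is exactly the structure students are asked to diagram.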
in math modeling classes, the construction of flowcharts, even for simple abms, can help students elucidate where decision points are within the code, and what procedures are repeated through loop structures. the construction of flowcharts as students progress to more complicated abms can help students reconcile the order of events in their implemented algorithm against the order in which events should be occurring biologically. whether students are working with abms in life science or math modeling classes, it is helpful for them to learn how to read and understand flow diagrams as they are often included in research publications that use agent-based modeling. describing an abm. much to the alarm of many math students beginning to develop abms, the formal description of an abm requires more than just writing the computer code. the standard method for describing abms in scientific publications, referred to as the overview, design concepts, and details (odd) protocol (grimm 2006, 2010; railsback & grimm 2012), often requires more than one page to fully describe even a simple abm. this can make abms seem overwhelming to students as they first begin to explore abms. in courses which teach the implementation and description of abms, instructors should take care not to introduce the computer code implementation simultaneously with the model description via the odd protocol. note that the introductory text on agent-based modeling by railsback & grimm (2012) does not introduce the concept of the odd protocol until chapter 3, which comes after the introduction and implementation (in netlogo) of a simple abm in chapter 2. in the course materials by bodine (2019), the concept of the odd protocol is introduced prior to project 2, but the students are not required to write their own odd protocol description until the final project, once they have seen odd descriptions for multiple models. model implementation vs. the modeling cycle.
courses that aim to teach methods in mathematical modeling often start with a discussion of the modeling cycle, which is typically presented as a flow diagram showing the loop of taking a real world question, representing it as a mathematical model, analyzing the model to address the question, and then using the results to ask the next question or refine the original real world question. figure 2 shows an example of a modeling cycle diagram. in courses where the mathematical models are encapsulated in one or a small handful of equations, the time spent on representing the real world as a mathematical model (the green box in fig. 2) is relatively short. the construction of abms, however, can be a fairly lengthy process, as abms are designed to simulate interactions between individuals and the local environment. when students are in the middle of constructing their first few abms, they often lose sight of where they are in the modeling cycle because the model implementation becomes a cycle of its own: writing bits of code, testing the code to see if it runs and produces reasonable results, and repeating this process to slowly add all the components needed for the full algorithm of the abm. as students are first learning agent-based modeling, they need to be reminded often to pull back and view where they are in the modeling cycle; to see the flock for the boids, as it were. model validation. within the modeling cycle, there is a smaller cycle of model validation (see dashed line in fig. 2). in a course covering the classic lotka-volterra predator-prey model, students are usually first shown a predator-prey data set (like the 200-year data set of canadian lynx and snowshoe hare pelts purchased by the hudson's bay company (maclulich 1937; elton & nicholson 1942)), which shows the oscillating population densities of the predator and prey populations.
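a minimal simulation of the kind used in such a validation exercise can be sketched as below; the parameter values are illustrative assumptions, not values fitted to the pelt data:

```python
def lotka_volterra(x, y, alpha, beta, delta, gamma, dt, steps):
    """explicit-euler integration of the classic predator-prey system:
    dx/dt = alpha*x - beta*x*y (prey), dy/dt = delta*x*y - gamma*y (predator)."""
    prey, pred = [x], [y]
    for _ in range(steps):
        dx = alpha * x - beta * x * y
        dy = delta * x * y - gamma * y
        x, y = x + dt * dx, y + dt * dy
        prey.append(x)
        pred.append(y)
    return prey, pred

# illustrative parameters; the equilibrium sits at (gamma/delta, alpha/beta) = (5, 5)
prey, pred = lotka_volterra(10.0, 5.0, 1.0, 0.2, 0.1, 0.5, 0.001, 40000)
print(min(prey) < 4.0 < 10.0 <= max(prey))  # → True: prey dips well below its start, i.e. oscillates
```

checking that the simulated trajectories reproduce the oscillations seen in the pelt records is exactly the validation step described above; a parameter set that produced flat trajectories would fail it.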
when the students then simulate the lotka-volterra model for various parameter sets, they find that they are able to produce the same oscillating behavior of the predator and prey populations. this is a form of model validation, and a model which did not display this distinctive trend seen in the data would be considered invalid for that data set. a similar process must occur for validating abms against observed biological patterns. however, in order to engage in this validation process for abms, students must first understand how decisions of individual agents and interactions between neighboring agents can lead to system-level observable patterns, a phenomenon referred to as "emergence" or the "emergent properties" of an abm. the classic abm example for easily identifying an emergent property is a flocking example (a stock example of flocking exists in the netlogo models library, and is explored in chapter 8 of the abm textbook by railsback & grimm (2012)). the concept of an emergent property can take a little while for students to fully understand. by definition, it is an observable outcome that the system was not specifically programmed to produce. in particular, it is not a summation of individual-level characteristics, and is typically not easily predicted from the behaviors and characteristics of the agents. the netlogo library example is a variation of the reynolds (1987) flocking model. in the model, each agent moves based on three rules:
1. separate: maintain a minimum distance from nearby agents
2. align: move in the same direction as nearby agents
3. cohere: move closer to nearby agents
all agents move at the same speed, and different schemes for determining who is a nearby agent can be used.
additionally, there are model parameters for the minimum distance to be maintained between agents, and the degree to which an agent can turn left or right in a single time step in order to align, cohere, and separate. it is not immediately evident from this set of rules that individual agents might be able to form flocks (or swarm together), and indeed that system-level behavior does not emerge for all parameter sets. however, certain parameter sets do lead to the agents forming one or more flocks that move together through the model landscape. student observations: algorithms for pattern recognition. one of the most exciting moments for students when they first begin running simulations of an abm is seeing system-level patterns emerge before their eyes. one of the challenges for students (and agent-based modelers in general) is to develop algorithms to identify and/or quantify the patterns we can easily identify by sight. for example, in the flocking abm discussed above, an observer watching the locations of individual agents change at each time step can easily see the formation of flocks. if, however, the observer wanted to systematically explore the parameter space of the flocking abm and determine the regions of the parameter space under which flocking occurs (a process which might involve running hundreds or thousands of simulations), it would be a tedious and time-consuming task to physically watch each simulation and record whether the agents formed a flock or not. instead, the observer must choose the criteria that indicate the formation of a flock, and then determine a measure or an algorithm for determining whether a flock (or multiple flocks) have formed. in a course designed to teach the construction and analysis of abms, this is a point where students should be encouraged to be both creative and methodical about developing such measures and algorithms.
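one simple candidate measure (our suggestion, not one prescribed by the text) is the polarization of the agents' headings: the length of the mean heading vector. a minimal sketch, with example heading lists chosen purely for illustration:

```python
import math

def polarization(headings):
    """mean resultant length of agent headings (in radians): near 1.0 when
    agents share a common direction (a flock), near 0.0 when headings are
    spread evenly (no flock)."""
    x = sum(math.cos(h) for h in headings) / len(headings)
    y = sum(math.sin(h) for h in headings) / len(headings)
    return math.hypot(x, y)

aligned = [0.10, 0.00, -0.05, 0.08]                        # near-common heading
scattered = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]   # evenly spread
print(polarization(aligned) > 0.95)    # → True
print(polarization(scattered) < 0.01)  # → True
```

a scalar measure like this can be logged automatically at the end of each run, so that hundreds of parameter combinations can be classified as flocking or non-flocking without watching any of them.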
the development of these observational measures and algorithms also provides a great opportunity for collaboration between students. it is especially helpful if students with a diversity of academic backgrounds can be brought together to brainstorm ideas; for instance, mixing students with various levels of exposure to mathematics, computer science, and biology can be very beneficial. rapid advances in computing power over the past decades have made agent-based modeling a feasible and appealing tool to study biological systems. in undergraduate mathematical biology education, there are multiple modes by which abms are utilized and taught in the classroom. in biology classrooms, abms can be used to engage students in hypothesis testing and in the experimental design and data collection processes of otherwise infeasible experiments, and to enable students to utilize models as a part of the scientific process. all of this can be done without students having to learn a programming language. by contrast, students who have had some exposure to computer programming can learn the construction, implementation, and analysis of agent-based models in a math or computer science modeling class. biological applications are ideal systems for first attempts at agent-based models as they typically do not necessitate learning extensive new vocabulary and theory to understand the basic components that need to be included in the model. throughout this article, we endeavored to articulate the benefits and challenges of including abms in undergraduate life science and math modeling courses. consistent with the bio2010 report (2003), we recommend that undergraduate biology and life science curricula be structured to ensure that all students have some exposure to mathematical modeling. we additionally recommend that this include agent-based modeling.
while not every student necessarily needs to take a course exclusively focused on agent-based modeling, every undergraduate biology student should have the opportunity to utilize an abm to perform experiments and to collect and analyze data. as we educate the next generation of life scientists, let us empower them with the ability to utilize abms to simulate and better understand our complex and dynamic world.
references
- agent based modelling and simulation tools: a review of the state-of-art software
- in silico experiments of existing and hypothetical cytokine-directed clinical trials using agent-based modeling
- agent-based models in translational systems biology
- aligning simulation models: a case study and results
- integration of biology, mathematics and computing in the classroom through the creation and repeated use of transdisciplinary modules. primus, in revision
- bae jw: multi-cell agent-based simulation of the microvasculature to study the dynamics of circulating inflammatory cell trafficking
- typical pitfalls of simulation modeling-lessons learned from armed forces and business
- adaptive agents, intelligence, and emergent human organization: capturing complexity through agent-based modeling
- land operations division dsto, defence science and technology organisation, po box 1500
- agent-based modeling: methods and techniques for simulating human systems
- multi-agent simulations and ecosystem management: a review
- the changing face of calculus: first- and second-semester calculus as college courses
- mathematics for the life sciences
- discrete math modeling with biological applications (course materials)
- agent-based modeling course materials
- agent-based modeling in systems pharmacology
- agent based-stock flow consistent macroeconomics: towards a benchmark model
- symmetric competition causes population oscillations in an individual-based model of forest dynamics
- immune system modelling and simulation
- characterizing emergent properties of immunological systems with multicellular rule-based computational modeling
- assessing the spatial impact on an agent-based modeling of epidemic control: case of schistosomiasis
- the integration of biology into calculus courses
- common core state standards initiatives
- an invitation to modeling: building a community with shared explicit practices
- decision-making in agent-based modeling: a current review and future prospectus
- the case for biocalculus: design, retention, and student performance
- the ten-year cycle in numbers of the lynx in canada
- a (2020) how epidemics like covid-19 end (and how to end them faster)
- mathematical games-the fantastic combinations of john conway's new solitaire game
- getting at systemic risk via an agent-based model of the housing market
- review: agent-based modeling of morphogenetic systems-advantages and challenges
- agent-based model history and development. in: economics for a fairer society-going back to basics using agent-based models
- simulation of biological cell sorting using a two-dimensional extended potts model
- simulating properties of in vitro epithelial cell morphogenesis
- a standard protocol for describing individual-based and agent-based models
- the odd protocol: a review and update
- individual-based modeling and ecology
- interdisciplinarity and the undergraduate biology curriculum: finding a balance
- a systematic review of agent-based modelling and simulation applications in the higher education domain
- agent-based modelling. history, essence, future
- the history, philosophy, and practice of agent-based modeling and the development of the conceptual model for simulation diagram. doctoral dissertation
- a survey of agent-based modeling practices
- agent-based models and microsimulation
- advancing microbial sciences by individual-based modelling
- cellular potts modeling of complex multicellular behaviors in tissue morphogenesis
- virtual biology lab: an inquiry-based learning environment
- transmission of infectious diseases: data, models, and simulations
- disease transmission dynamics on networks: network structure versus disease dynamics
- an interactive computer model for improved student understanding of random particle motion and osmosis
- idynomics: next-generation individual-based modelling of biofilms
- hybrid simulation models-when, why, how?
- modelling and simulation software to support individual-oriented ecological modelling
- tutorial on agent-based modelling and simulation
- fluctuations in the numbers of the varying hare (lepus americanus)
- agent-based land-use models: a review of applications
- the beginning of the monte carlo method
- the monte carlo method
- episims: large-scale agent-based modeling of the spread of disease
- bio2010: transforming undergraduate education for future research biologists. the national academies press
- an evolutionary theory of economic change
- agent-based computing from multi-agent systems to agent-based models: a visual survey
- artificial economic life: a simple model of a stockmarket
- retrospectives: the ethology of homo economicus
- an agent-based approach for modeling dynamics of contagious disease spread
- an active learning exercise for introducing agent-based modeling
- agent-based and individual-based modeling: a practical introduction
- flocks, herds and schools: a distributed behaviour model
- multiscale computational analysis of xenopus laevis morphogenesis reveals key insights of systems-level behavior
- mathematical biology education: beyond calculus
- identifying control mechanisms of granuloma formation during m. tuberculosis infection using an agent-based model
- dynamic models of segregation
- an agent-based model of the spatial distribution and density of the santa cruz island fox. integrated population biology and modeling part b
- undergraduate module on computational modeling: introducing modeling the cane toad invasion
- why outbreaks like coronavirus spread exponentially, and how to "flatten the curve"
- phenotypic transition maps of 3d breast acini obtained by imaging-guided agent-based modeling
- 29+ evidences for macroevolution
- intrinsic and extrinsic noise of gene expression in lineage trees
- agent-based modeling of multicell morphogenic processes during development
- agent-based modeling in public health: current applications and future directions
- learning by modeling: insights from an agent-based model of university-industry relationships
- intelligent machinery, a heretical theory
- statistical methods in neutron diffusion
- ulam s (1950) random processes and transformations
- simulating tissue mechanics with agent-based models: concepts, perspectives and some novel results
- perspective: dimensions of the scientific method
- von neumann j, morgenstern o (1944) theory of games and economic behaviour
- multi-scale modeling in morphogenesis: a critical analysis of the cellular potts model
- teaching students to think like scientists: software enables experiments to be conducted that would be impossible in a laboratory
- simulating cancer growth with multiscale agent-based modeling
- building mathematical models and biological insights in an introductory biology course
- an introduction to agent-based modeling: modeling natural, social, and engineered complex systems with netlogo
- agent-based modeling: an introduction and primer
- model reduction for agent-based social simulation: coarse-graining a civil violence model
key: cord-332922-2qjae0x7 authors: mbuvha,
rendani; marwala, tshilidzi title: bayesian inference of covid-19 spreading rates in south africa date: 2020-08-05 journal: plos one doi: 10.1371/journal.pone.0237126 sha: doc_id: 332922 cord_uid: 2qjae0x7 the severe acute respiratory syndrome coronavirus 2 (sars-cov-2) pandemic has highlighted the need for performing accurate inference with limited data. fundamental to the design of rapid state responses is the ability to perform epidemiological model parameter inference for localised trajectory predictions. in this work, we perform bayesian parameter inference using markov chain monte carlo (mcmc) methods on the susceptible-infected-recovered (sir) and susceptible-exposed-infected-recovered (seir) epidemiological models with time-varying spreading rates for south africa. the results find two change points in the spreading rate of covid-19 in south africa as inferred from the confirmed cases. the first change point coincides with state enactment of a travel ban and the resultant containment of imported infections. the second change point coincides with the start of a state-led mass screening and testing programme which has highlighted community-level disease spread that was not well represented in the initial largely traveller based and private laboratory dominated testing data. the results further suggest that due to the likely effect of the national lockdown, community level transmissions are slower than the original imported case driven spread of the disease. the first reported case of the novel coronavirus (sars-cov-2) in south africa was announced on 5 march 2020, following the initial manifestation of the virus in wuhan china in december 2019 [1] [2] [3] . due to its further spread and the severity of its associated clinical outcomes, the disease was subsequently declared a pandemic by the world health organisation (who) on 11 march 2020 [1, 2] . 
in south africa, by 26 april 2020, 4546 people had been confirmed to have been infected by the coronavirus with 87 fatalities [4]. numerous states have attempted to minimise the growth in number of covid-19 infections [1, 5, 6]. these attempts are largely based on non-pharmaceutical interventions (npis) aimed at separating the infectious population from the susceptible population [1]. these initiatives aim to strategically reduce the increase in infections to a level where their healthcare systems stand a chance of minimising the number of fatalities [1]. some of the critical indicators for policymaker response planning include projections of the infected population, estimates of health care service demand and whether current containment measures are effective [1]. as the pandemic develops in a rapid and varied manner in most countries, calibration of epidemiological models based on available data can prove to be difficult [7]. this difficulty is further escalated by the high number of asymptomatic cases and the limited testing capacity [1, 2]. a fundamental issue when calibrating localised models is inferring parameters of compartmental models such as the susceptible-infectious-recovered (sir) and the susceptible-exposed-infectious-recovered (seir) that are widely used in infectious disease projections. in the view of public health policymakers, a critical aspect of projecting infections is the inference of parameters that align with the underlying trajectories in their jurisdictions. the spreading rate is a parameter of particular interest which is subject to changes due to voluntary social distancing measures and government-imposed contact bans. the uncertainty in utilising these models is compounded by the limited data in the initial phases and the rapidly changing dynamics due to rapid public policy changes. to address these complexities, we utilise the bayesian framework for the inference of epidemiological model parameters in south africa.
the bayesian framework allows for both incorporation of prior knowledge and principled embedding of uncertainty in parameter estimation. in this work we combine bayesian inference with the compartmental seir and sir models to infer time-varying spreading rates that allow for quantification of the impact of government interventions in south africa. compartmental models are a class of models that is widely used in epidemiology to model transitions between various stages of disease [1, 8, 9]. we now introduce the susceptible-exposed-infectious-recovered (seir) and the related susceptible-infectious-recovered (sir) compartmental models that have been dominant in covid-19 modelling literature [1, 5, 6, 10]. the susceptible-exposed-infectious-recovered model. the seir is an established epidemiological model for the projection of infectious diseases. the seir models the transition of individuals between four stages of a condition, namely:
• being susceptible to the condition,
• being infected and in incubation,
• having the condition and being infectious to others, and
• having recovered and built immunity for the disease.
the seir can be interpreted as a four-state markov chain which is illustrated diagrammatically in fig 1. the seir relies on solving the system of ordinary differential equations below, representing the analytic trajectory of the infectious disease [1]:
ds/dt = -λ s i / n
de/dt = λ s i / n - σ e
di/dt = σ e - μ i
dr/dt = μ i
where s is the susceptible population, e is the exposed population, i is the infected population, r is the recovered population and n is the total population, where n = s + e + i + r. λ is the transmission rate, σ is the rate at which individuals in incubation become infectious, and μ is the recovery rate. 1/σ and 1/μ therefore become the incubation period and contagious period respectively. we also consider the susceptible-infectious-recovered (sir) model, which is a subclass of the seir model that assumes direct transition from the susceptible compartment to the infected (and infectious) compartment.
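a minimal numerical sketch of the seir system described above, using explicit euler steps; the population size, initial conditions, and rates are illustrative assumptions (chosen so that λ/μ = 3.2 and the incubation and contagious periods are five days each):

```python
def seir_step(s, e, i, r, lam, sigma, mu, n, dt):
    """one explicit-euler step of the seir system:
    ds/dt = -lam*s*i/n, de/dt = lam*s*i/n - sigma*e,
    di/dt = sigma*e - mu*i, dr/dt = mu*i."""
    new_inf = lam * s * i / n
    return (s - dt * new_inf,
            e + dt * (new_inf - sigma * e),
            i + dt * (sigma * e - mu * i),
            r + dt * mu * i)

# illustrative values: lam/mu = 3.2, 5-day incubation, 5-day contagious period
n = 1_000_000.0
s, e, i, r = n - 10.0, 0.0, 10.0, 0.0
lam, sigma, mu = 0.64, 0.2, 0.2
for _ in range(200):            # 200 days with dt = 1 day
    s, e, i, r = seir_step(s, e, i, r, lam, sigma, mu, n, 1.0)
print(round(s + e + i + r))     # → 1000000 (the compartments always sum to n)
```

because every term that leaves one compartment enters another, the compartments sum to n at every step, which is a useful sanity check on any implementation.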
the sir is represented by three coupled ordinary differential equations rather than the four in the seir. fig 2 depicts the three states of the sir model. the basic reproductive number r0. the basic reproductive number (r0) represents the mean number of additional infections created by one infectious individual in a susceptible population. according to the latest available literature, without accounting for any social distancing policies the r0 for covid-19 is between 2 and 3.5 [2, 6, 10, 11]. r0 can be expressed in terms of λ and μ as:
r0 = λ / μ
extensions to the seir and sir models. we use an extended version of the seir and sir models of [6] that incorporates some of the observed phenomena relating to covid-19. first we include a delay d between becoming infected (i_new) and being reported in the confirmed case statistics, such that the confirmed reported cases cr_t at some time t are in the form [6]:
cr_t = i_new(t - d)
we further assume that the spreading rate λ is time-varying rather than constant, with change points that are affected by government interventions and voluntary social distancing measures. we follow the framework of [6] to perform bayesian inference for model parameters on the south african covid-19 data. the bayesian framework allows for the posterior inference of parameters which updates prior beliefs based on a data-driven likelihood. the posterior inference is governed by bayes theorem as follows:
p(w|d, m) = p(d|w, m) p(w|m) / p(d)
where p(w|d, m) is the posterior distribution of a vector of model parameters (w) given the model (m) and observed data (d), p(d|w, m) is the data likelihood, p(w|m) is the parameter prior and p(d) is the evidence. the likelihood. the likelihood indicates the probability of observing the reported case data given the assumed model. in our study, we adopt the student-t distribution as the likelihood as suggested by [6]. similar to a gaussian likelihood, the student-t likelihood allows for parameter updates that minimise discrepancies between the predicted and observed reported cases. priors.
parameter prior distributions encode prior subject-matter knowledge into parameter estimation. in the case of epidemiological model parameters, priors incorporate literature-based expected values of parameters such as the recovery rate (μ), the spreading rate (λ), and change points based on policy interventions. the prior settings for the model parameters are listed in table 1 . we follow [6] in selecting lognormal distributions for λ and μ such that the initial mean basic reproductive number is 3.2, which is consistent with the literature [2, 5, 6, 10, 12] . we set a lognormal prior for σ such that the mean incubation period is five days. we use the history of government interventions to set priors on change points in the spreading rate. the priors on change points include 19/03/2020, when a travel ban and school closures were announced, and 28/03/2020, when a national lockdown was enforced. we keep the priors for the lognormal distributions of the spreading rates after the change points weakly informative by setting the same mean as λ 0 and higher variances across all change points. this has the effect of placing greater weight on the data-driven likelihood. similar to [6] , we adopt weakly informative half-cauchy priors for the initial conditions of the infected and exposed populations. markov chain monte carlo (mcmc). given that closed-form inference of the posterior distributions of the parameters listed in table 1 is infeasible, we make use of markov chain monte carlo to sample from the posterior, with posterior expectations approximated from these samples [6, 10] . in this work, we explore inference using metropolis-hastings (mh), slice sampling and the no-u-turn sampler (nuts). metropolis-hastings (mh). mh is one of the simplest algorithms for generating a markov chain which converges to the correct stationary distribution. mh generates proposed samples using a proposal distribution.
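as an illustration of how a lognormal prior can be centred on a target mean (e.g. a prior mean basic reproductive number of 3.2), one can solve for the log-scale location from the lognormal mean formula exp(m + σ²/2). this is a stdlib sketch of the idea only, not the probabilistic-programming setup actually used for inference; the spread parameter 0.3 is an assumed placeholder:

```python
import math
import random

def lognormal_loc_for_mean(target_mean, sigma):
    """Log-scale location m such that exp(m + sigma**2 / 2) == target_mean."""
    return math.log(target_mean) - sigma ** 2 / 2

# centre a lognormal prior on a mean r0 of 3.2 (spread 0.3 is illustrative)
m = lognormal_loc_for_mean(3.2, 0.3)
random.seed(0)
draws = [random.lognormvariate(m, 0.3) for _ in range(200_000)]
empirical_mean = sum(draws) / len(draws)
```

the same construction applies to the incubation-period prior (a lognormal with mean five days for 1/σ).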
a new parameter state w* is accepted or rejected probabilistically based on the posterior likelihood ratio: a = min{1, p(w*|d, m) / p(w t |d, m)}, assuming a symmetric proposal. a common proposal distribution is a symmetric random walk obtained by adding gaussian noise to the previously accepted parameter state. the random-walk behaviour of such a proposal typically results in low sample acceptance rates. while sample acceptance is guaranteed with slice sampling, a large slice window can lead to computationally inefficient sampling, while a small window can lead to poor mixing. hybrid monte carlo (hmc) and the no-u-turn sampler (nuts). metropolis-hastings (mh) and slice sampling tend to exhibit excessive random-walk behaviour, where the next state of the markov chain is randomly proposed from a proposal distribution [13] [14] [15] . this results in low proposal acceptance rates and small effective sample sizes. hmc, proposed by [16] , reduces random-walk behaviour by adding auxiliary momentum variables to the parameter space [15] . hmc creates a vector field around the current state using gradient information, which assigns the current state a trajectory towards a high-probability next state [15] . the dynamical system formed by the model parameters w and the auxiliary momentum variables p is represented by the hamiltonian h(w, p), written as follows [15, 16] : h(w, p) = m(w) + k(p), where m(w) is the negative log of the posterior distribution in eq 7, also referred to as the potential energy. k(p) is the kinetic energy, defined by the kernel of a gaussian with a covariance matrix m [17] : k(p) = (1/2) p^t m^{-1} p. the trajectory vector field is defined by considering the parameter space as a physical system that follows hamiltonian dynamics [15] . the dynamical equations governing the trajectory of the chain are then given by hamilton's equations at a fictitious time t as follows [16] : dw/dt = ∂h/∂p, dp/dt = -∂h/∂w. in practical terms, the dynamical trajectory is discretised using the leapfrog integrator.
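the mh acceptance rule above can be illustrated with a minimal random-walk sampler on a toy one-dimensional posterior (a standard normal). this is a didactic sketch, not the sampler configuration used in the study; step size and seed are arbitrary:

```python
import math
import random

def metropolis_hastings(log_post, x0, n_samples, step=1.0, seed=1):
    """Random-walk MH: propose x* ~ N(x, step^2); accept with probability
    min(1, p(x*)/p(x)) -- the symmetric proposal terms cancel."""
    rng = random.Random(seed)
    x = x0
    lp = log_post(x)
    samples, accepted = [], 0
    for _ in range(n_samples):
        x_star = x + rng.gauss(0.0, step)
        lp_star = log_post(x_star)
        if rng.random() < math.exp(min(0.0, lp_star - lp)):
            x, lp = x_star, lp_star
            accepted += 1
        samples.append(x)
    return samples, accepted / n_samples

# toy target: standard normal, log p(x) = -x^2/2 + const
samples, acc_rate = metropolis_hastings(lambda x: -0.5 * x * x, 0.0, 50_000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

the empirical mean and variance of the chain should approach the target's 0 and 1; the correlated, random-walk character of the samples is exactly the inefficiency that hmc and nuts address.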
in the leapfrog integrator, to reach the next point in the path, we take half a step in the momentum direction, followed by a full step in the direction of the model parameters, ending with another half step in the momentum direction. due to the discretisation errors arising from leapfrog integration, a metropolis acceptance step is then performed in order to accept or reject the new sample proposed by the trajectory [15, 18] . in the metropolis step, the parameters w* proposed by the hmc trajectory are accepted with probability [16] : min{1, p(w*|d, a, b, h) / p(w|d, a, b, h)}. algorithm 1 shows the pseudo-code for hmc, where ε is the discretisation stepsize. the leapfrog steps are repeated until the maximum trajectory length l is reached. the hmc algorithm has multiple parameters that require tuning for efficient sampling, such as the stepsize and the trajectory length. a trajectory length that is too short leads to random-walk behaviour similar to mh, while a trajectory length that is too long results in a trajectory that inefficiently traces back on itself. the stepsize is also critical for sampling: small stepsizes are computationally inefficient, leading to correlated samples and poor mixing, while large stepsizes compound discretisation errors, leading to low acceptance rates. tuning these parameters requires multiple time-consuming trial runs. nuts automates the tuning of the leapfrog stepsize and trajectory length. in nuts, the stepsize is tuned during an initial burn-in phase by targeting particular levels of sample acceptance. the trajectory length is tuned by iteratively adding steps until either the chain starts to trace back (u-turn) or the hamiltonian explodes (becomes infinite).
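the leapfrog scheme just described (half momentum step, full parameter step, closing half momentum step) takes only a few lines; on a one-dimensional gaussian target it approximately conserves the hamiltonian, which is what keeps hmc acceptance rates high. a minimal sketch with a unit mass matrix and illustrative step size:

```python
def leapfrog(w, p, grad_u, eps, n_steps):
    """Leapfrog discretisation of Hamiltonian dynamics (unit mass matrix):
    half momentum step, alternating full steps, closing half momentum step."""
    p = p - 0.5 * eps * grad_u(w)
    for step in range(n_steps):
        w = w + eps * p                  # full step in parameter space
        if step < n_steps - 1:
            p = p - eps * grad_u(w)      # full momentum step except the last
    p = p - 0.5 * eps * grad_u(w)
    return w, p

# 1-d standard-normal target: potential u(w) = w^2/2, kinetic k(p) = p^2/2
u = lambda w: 0.5 * w * w
grad_u = lambda w: w
w0, p0 = 1.0, 0.5
w1, p1 = leapfrog(w0, p0, grad_u, eps=0.1, n_steps=20)
h0 = u(w0) + 0.5 * p0 * p0
h1 = u(w1) + 0.5 * p1 * p1
```

the residual energy error |h1 - h0| is of order ε² and is exactly what the metropolis correction step accepts or rejects on.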
we use the samplers described above to calibrate the seir and sir models on daily new case and cumulative case data for south africa up to and including 20 april 2020, provided by johns hopkins university's center for systems science and engineering (csse) [3] . sir and seir model parameter inference was performed using confirmed case data up to and including 20 april 2020 and the mcmc samplers described in the methodology section. each sampler is run such that 5000 samples are drawn with 1000 burn-in and tuning steps. we use the leave-one-out (loo) cross-validation error of [19] to evaluate the goodness of fit of each model. table 2 shows the loo validation errors of the various models. the sir model with two change points emerges as the best model fit, with the lowest mean loo of 448.00. the seir model with two change points showed a mean loo of 459.94. we note that [6] similarly finds that the sir model displayed superior goodness of fit to the seir on german data. we now present detailed results of the sir and seir models with inference using nuts; the trace plots from these models, indicating stationarity in the sampling chains, are provided in s2 and s5 figs. the trace plots for the sir and seir models using mh are provided in s3 and s6 figs, while similar trace plots for slice sampling are provided in s4 and s7 figs. the trace plots largely indicate that the nuts sampler displays greater agreement between parallel chains and thus lower rhat values. time-varying spreading rates allow for inference of the impact of various state and societal interventions on the spreading rate. fig 5 shows the fit and projections based on sir models with zero, one and two change points. as can be seen from the plot, the two-change-point model best captures the trajectory in the development of new cases relative to the zero- and one-change-point models. the superior goodness of fit of the two-change-point model is also illustrated in table 2 .
the fit and projections showing similar behaviour for the seir model with various change points are shown in fig 6. the mean reporting delay was found to be 6.848 days (ci [5.178, 8.165]); literature suggests this delay includes both the incubation period and test reporting lags. the posterior distribution of the incubation period from the seir model is shown in fig 7. the inference of parameters is dependent on the underlying testing processes that generate the confirmed case data. the effect of the mass screening and testing campaign was to change the underlying confirmed-case data generating process by widening the criteria of those eligible for testing. while initial testing focused on individuals that either had exposure to known cases or had travelled to known covid-19 affected countries, mass screening and testing further introduced detection of community-level transmissions, which may involve undocumented contact and exposure to covid-19 positive individuals. we have performed bayesian parameter inference of the sir and seir models using mcmc and publicly available data as at 20 april 2020. the resulting parameter estimates fall in line with the existing literature in terms of mean baseline r 0 (before government action), mean incubation time and mean infectious period [2, 5, 6, 10] . we find that initial government action, which mainly included a travel ban, school closures and stay-home orders, resulted in a mean decline of 80% in the spreading rate. further government action through mass screening and testing campaigns resulted in a second trajectory change point. this latter change point is mainly driven by the widening of the population eligible for testing, from travellers (and their known contacts) to the generalised community, who would probably not have afforded the private lab testing which dominated the initial data. this resulted in an increase of r 0 to 1.304.
the effect of mass screening and testing can also be seen in fig 9, indicating a mean increase in daily tests performed from 1639 to 4374. the second change point illustrates the possible existence of multiple pandemics, as suggested by [20] . thus testing after 28 march is more indicative of community-level transmissions that were possibly not as well documented in terms of contact tracing and isolation, relative to the initial imported-infection driven pandemic. this is also supported by the documented increase in public laboratory testing (relative to private) past this change point, suggesting health care access might also play a role in the detection of community-level infections [21] . we have utilised a bayesian inference framework to infer time-varying spreading rates of covid-19 in south africa. the time-varying spreading rates allow us to estimate the effects of government actions on the dynamics of the pandemic. the results indicate a decrease in the mean spreading rate of 60%, which mainly coincides with the containment of imported infections, school closures and stay-at-home orders. the results also indicate the emergence of community-level infections, which are increasingly being highlighted by the mass screening and testing campaign. the development of the community-level transmissions (r 0 ≈ 1.3041, ci [0.887, 1.7748]) of the pandemic at the time of publication appears to be slower than that of the initial traveller-based pandemic (r 0 ≈ 3.278, ci [2.715, 3.73]). a future improvement to this work could include extensions to regional and provincial studies, as current data suggests varied spreading rates both regionally and provincially. as more government interventions come into play, priors on more change points might also become necessary.
references:
- on data-driven management of the covid-19 outbreak in south africa. medrxiv
- coronavirus disease (covid-19)
- an interactive web-based dashboard to track covid-19 in real time. the lancet infectious diseases
- coronavirus disease (covid-19) case data - south africa
- impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
- inferring covid-19 spreading rates and potential change points for case number forecasts
- an epidemiological forecast model and software assessing interventions on covid-19 epidemic in china
- compartmental models in epidemiology
- an introduction to compartmental modeling for the budding infectious disease modeler
- fundamental principles of epidemic spread highlight the immediate need for large-scale serological surveys to assess the stage of the sars-cov-2 epidemic
- feasibility of controlling 2019-ncov outbreaks by isolation of cases and contacts. medrxiv
- detecting suspected epidemic cases using trajectory big data
- bayesian automatic relevance determination for feature selection in credit default modelling
- sampling techniques in bayesian finite element model updating
- automatic relevance determination bayesian neural networks for credit card default modelling
- bayesian learning via stochastic dynamics
- bayesian learning for neural networks
- mcmc using hamiltonian dynamics. handbook of markov chain monte carlo
- practical bayesian model evaluation using leave-one-out cross-validation and waic
- sa's covid-19 pandemic trends and next steps
- update on covid-19
key: cord-336747-8m7n5r85 authors: grossmann, g.; backenkoehler, m.; wolf, v. title: importance of interaction structure and stochasticity for epidemic spreading: a covid-19 case study date: 2020-05-08 journal: nan doi: 10.1101/2020.05.05.20091736 sha: doc_id: 336747 cord_uid: 8m7n5r85
in the recent covid-19 pandemic, computer simulations are used to predict the evolution of the virus propagation and to evaluate the prospective effectiveness of non-pharmaceutical interventions. as such, the corresponding mathematical models and their simulations are central tools to guide political decision-making.
typically, ode-based models are considered, in which fractions of infected and healthy individuals change deterministically and continuously over time. in this work, we translate an ode-based covid-19 spreading model from literature to a stochastic multi-agent system and use a contact network to mimic complex interaction structures. we observe a large dependency of the epidemic's dynamics on the structure of the underlying contact graph, which is not adequately captured by existing ode-models. for instance, existence of super-spreaders leads to a higher infection peak but a lower death toll compared to interaction structures without super-spreaders. overall, we observe that the interaction structure has a crucial impact on the spreading dynamics, which exceeds the effects of other parameters such as the basic reproduction number r0. we conclude that deterministic models fitted to covid-19 outbreak data have limited predictive power or may even lead to wrong conclusions while stochastic models taking interaction structure into account offer different and probably more realistic epidemiological insights. on march 11th, 2020, the world health organization (who) officially declared the outbreak of the coronavirus disease 2019 (covid-19) to be a pandemic. by this date at the latest, curbing the spread of the virus then became a major worldwide concern. given the lack of a vaccine, the international community relied on non-pharmaceutical interventions (npis) such as social distancing, mandatory quarantines, or border closures. such intervention strategies, however, inflict high costs on society. hence, for political decision-making it is crucial to forecast the spreading dynamics and to estimate the effectiveness of different interventions. mathematical and computational modeling of epidemics is a long-established research field with the goal of predicting and controlling epidemics. 
it has developed epidemic spreading models of many different types: data-driven and mechanistic as well as deterministic and stochastic approaches, ranging over many different temporal and spatial scales (see [49, 15] for an overview). computational models have been calibrated to predict the spreading dynamics of the covid-19 pandemic and influenced public discourse. most models and in particular those with high impact are based on ordinary differential equations (odes). in these equations, the fractions of individuals in certain compartments (e.g., infected and healthy) change continuously and deterministically over time, and interventions can be modeled by adjusting parameters. in this paper, we compare the results of covid-19 spreading models that are based on odes to results obtained from a different class of models: stochastic spreading processes on contact networks. we argue that virus spreading models taking into account the interaction structure of individuals and reflecting the stochasticity of the spreading process yield a more realistic view on the epidemic's dynamic. if an underlying interaction structure is considered, not all individuals of a population meet equally likely as assumed for ode-based models. a wellestablished way to model such structures is to simulate the spreading on a network structure that represents the individuals of a population and their social contacts. effects of the network structure are largely related to the epidemic threshold which describes the minimal infection rate needed for a pathogen to be able to spread over a network [37] . in the network-free paradigm the basic reproduction number (r 0 ), which describes the (mean) number of susceptible individuals infected by patient zero, determines the evolution of the spreading process. the value r 0 depends on both, the connectivity of the society and the infectiousness of the pathogen. 
in contrast, in the network-based paradigm the interaction structure (given by the network) and the infectiousness (given by the infection rate) are decoupled. here, we focus on contact networks as they provide a universal way of encoding real-world interaction characteristics like super-spreaders, grouping of different parts of the population (e.g. senior citizens or children with different contact patterns), as well as restrictions due to spatial conditions and mobility, and household structures. moreover, models based on contact networks can be used to predict the efficiency of interventions [38, 34, 5] . here, we analyze in detail a network-based stochastic model for the spreading of covid-19 with respect to its differences to existing ode-based models and the sensitivity of the spreading dynamics on particular network features. we calibrate both, ode-models and stochastic models with interaction structure to the same basic reproduction number r 0 or to the same infection peak and compare the corresponding results. in particular, we analyze the changes in the effective reproduction number over time. for instance, early exposure of superspreaders leads to a sharp increase of the reproduction number, which results in a strong increase of infected individuals. we compare the times at which the number of infected individuals is maximal for different network structures as well as the death toll. our results show that the interaction structure has a major impact on the spreading dynamics and, in particular, important characteristic values deviate strongly from those of the ode model. in the last decade, research focused largely on epidemic spreading, where interactions were constrained by contact networks, i.e. a graph representing the individuals (as nodes) and their connectivity (as edges). many generalizations, e.g. to weighted, adaptive, temporal, and multi-layer networks exist [31, 44] . here, we focus on simple contact networks without such extensions. 
spreading characteristics on different contact networks based on the susceptible-infected-susceptible (sis) or susceptible-infected-recovered (sir) compartment model have been investigated intensively. in such models, each individual (node) successively passes through the individual stages (compartments). for an overview, we refer the reader to [35] . qualitative and quantitative differences between network structures and network-free models have been investigated in [22, 2] . in contrast, this work considers a specific covid-19 spreading model and focuses on those characteristics that are most relevant for covid-19 and which have, to the best of our knowledge, not been analyzed in previous work. sis-type models require knowledge of the spreading parameters (infection strength, recovery rate, etc.) and the contact network, which can partially be inferred from real-world observations. currently for covid-19, inferred data seems to be of very poor quality [24] . however, while the spreading parameters are subject to a broad scientific discussion, publicly available data, which could be used for inferring a realistic contact network, practically does not exist. therefore real-world data on contact networks are rare [30, 45, 23, 32, 43] and not available for large-scale populations. a reasonable approach is to generate the data synthetically, for instance by using mobility and population data based on geographical diffusion [46, 17, 36, 3] . for instance, this has been applied to the influenza virus [33] . due to the major challenge of inferring a realistic contact network, most of these works, however, focus on how specific network features shape the spreading dynamics. literature abounds with proposed models of the covid-19 spreading dynamics. very influential is the work of neil ferguson and his research group that regularly publishes reports on the outbreak (e.g. [11] ). they study the effects of different interventions on the outbreak dynamics. 
the computational modeling is based on a model of influenza outbreaks [19, 12] . they present a very high-resolution spatial analysis based on movement-data, air-traffic networks etc. and perform sensitivity analysis on the spreading parameters, but to the best of our knowledge not on the interaction data. interaction data were also inferred locally at the beginning of the outbreak in wuhan [4] or in singapore [40] and chicago [13] . models based on community structures, however, consider isolated (parts of) cities and are of limited significance for large-scale model-based analysis of the outbreak dynamic. another work focusing on interaction structure is the modeling of outbreak dynamics in germany and poland done by bock et al. [6] . the interaction structure within households is modeled based on census data. inter-household interactions are expressed as a single variable and are inferred from data. they then generated "representative households" by re-sampling but remain vague on many details of the method. in particular, they only use a single value to express the rich types of relationships between individuals of different households. a more rigorous model of stochastic propagation of the virus is proposed by arenas et al. [1] . they take the interaction structure and heterogeneity of the population into account by using demographic and mobility data. they analyze the model by deriving a mean-field equation. mean-field equations are more suitable to express the mean of a stochastic process than other ode-based methods but tend to be inaccurate for complex interaction structures. moreover, the relationship between networked-constrained interactions and mobility data remains unclear. other notable approaches use sir-type methods, but cluster individuals into age-groups [39, 28] , which increases the model's accuracy. rader et al. 
[41] combined spatial, urbanization and census data and observed that the crowding structure of densely populated cities strongly shaped the epidemic's intensity and duration. in a similar way, a meta-population model for a more realistic interaction structure has been developed [8] without considering an explicit network structure. the majority of research, however, is based on deterministic, network-free sir-based ode-models. for instance, the work of josé lourenço et al. [29] infers epidemiological parameters based on a standard sir model. similarly, dehning et al. [9] use an sir-based ode-model, but the infection rate may change over time. they use their model to predict a suitable time point to loosen npis in germany. khailaie et al. analyze how changes in the reproduction number ("mimicking npis") affect the epidemic dynamics using epidemic simulations [25] , where a variant of the deterministic, network-free sir-model is used and modified to include states (compartments) for hospitalized, deceased, and asymptomatic patients. otherwise, the method is conceptually very similar to [29, 9] and the authors argue against a relaxation of npis in germany. another popular work is the online simulator covidsim (available at covidsim.eu). the underlying method is also based on a network-free sir-approach [50, 51] . however, the role of an interaction structure is not discussed and the authors explicitly state that they believe that the stochastic effects are only relevant in the early stages of the outbreak. a very similar method has been developed at the german robert koch institute (rki) [7] . jianxi luo et al. proposed an ode-based sir-model
to predict the end of the covid-19 pandemic (available at ddi.sutd.edu.sg), which is regressed with daily updated data. ode-models have also been used to project the epidemic dynamics into the "post-pandemic" future by kissler et al. [27] . some groups also resort to branching processes, which are inherently stochastic but not based on a complex interaction structure [21, 42] . a very popular class of epidemic models is based on the assumption that during an epidemic individuals are either susceptible (s), infected (i), or recovered/removed (r). the mean number of individuals in each compartment evolves according to the following system of ordinary differential equations: ds/dt = -λ ode s i / n, di/dt = λ ode s i / n - β i, dr/dt = β i, where n denotes the total population size, and λ ode and β are the infection and recovery rates. typically, one assumes that n = 1, in which case the equations refer to fractions of the population, leading to the invariance s(t) + i(t) + r(t) = 1 for all t. it is trivial to extend the compartments and transitions. a stochastic network-based spreading model is a continuous-time stochastic process on a discrete state space. the underlying structure is given by a graph, where each node represents one individual (or other entities of interest). at each point in time, each node occupies a compartment, for instance: s, i, or r. moreover, nodes can only receive or transmit infections from neighboring nodes (according to the edges of the graph). for the general case with m possible compartments, this yields a state space of size m^n, where n is the number of nodes. the jump times until events happen are typically assumed to follow an exponential distribution.
note that in the ode model, residual residence times in the compartments are not tracked, which naturally corresponds to the exponential distribution in the network model. hence, the underlying stochastic process is a continuous-time markov chain (ctmc) [26] . the extension to non-markovian semantics is trivial. we illustrate the three-compartment case in fig. 1 . the transition rates of the ctmc are such that an infected node transmits infections at rate λ. hence, the rate at which a susceptible node is infected is λ·#neigh(i), where #neigh(i) is the number of its infected direct neighbours. spontaneous recovery of a node occurs at rate β. the size of the state space renders a full solution of the model infeasible and approximations of the mean-field [14] or monte-carlo simulations are common ways to analyze the process. general differences to the ode model. this aforementioned formalism yields some fundamental differences to network-free ode-based approaches. the most distinct difference to network-free ode-based approaches is the decoupling of infectiousness and interaction structure. the infectiousness λ (i.e. the infection rate) is assumed to be a parameter expressing how contagious a pathogen inherently is. it encodes the probability of a virus transmission if two people meet. that is, it is independent from the social interactions of individuals (it might however depend on hygiene, masks, etc.). the influence of social contacts is expressed in the (potentially time-varying) connectivity of the graph. loosely speaking, it encodes the possibility that two individuals meet. in the ode-approach both are combined in the basic reproduction number. note that, throughout this manuscript, we use λ to denote the infectiousness of covid-19 (as an instantaneous transmission rate). another important difference is that ode-models consider fractions of individuals in each compartment. 
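the ctmc semantics described above (a susceptible node is infected at rate λ · #infected neighbours; an infected node recovers at rate β) can be simulated exactly with a gillespie-style algorithm rather than mean-field approximations. the sketch below is a minimal stdlib implementation on a small assumed ring contact network, chosen purely for illustration; rates and seed are placeholders:

```python
import random

def gillespie_network_sir(adj, lam, beta, patient_zero=0, seed=42):
    """Exact CTMC simulation of network SIR: susceptible v is infected at
    rate lam * (#infected neighbours of v); infected nodes recover at beta."""
    rng = random.Random(seed)
    state = {v: "S" for v in adj}
    state[patient_zero] = "I"
    t = 0.0
    while True:
        events = []  # (rate, node, next_state)
        for v, st in state.items():
            if st == "I":
                events.append((beta, v, "R"))
            elif st == "S":
                k_inf = sum(1 for u in adj[v] if state[u] == "I")
                if k_inf:
                    events.append((lam * k_inf, v, "I"))
        total_rate = sum(rate for rate, _, _ in events)
        if total_rate == 0.0:         # no infectious node left: epidemic over
            return state, t
        t += rng.expovariate(total_rate)     # exponential jump time
        pick = rng.uniform(0.0, total_rate)  # choose event proportional to rate
        acc = 0.0
        for rate, v, nxt in events:
            acc += rate
            if pick <= acc:
                state[v] = nxt
                break

# small illustrative ring contact network: every node has two neighbours
n = 50
ring = {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}
final_state, end_time = gillespie_network_sir(ring, lam=2.0, beta=1.0)
removed = sum(1 for s in final_state.values() if s == "R")
```

recomputing all event rates every step is o(n) per event and only suitable for small graphs; production simulators maintain the rates incrementally.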
in the network-based paradigm, we model absolute numbers of entities in each compartment and extinction of the epidemic may happen with positive probability. while ode-models are agnostic to the actual population size, in network-based models, increasing the population by adding more nodes inevitably changes the dynamics. another important connection between the two paradigms is that if the network topology is a complete graph (resp. clique), then the ode-model gives an accurate approximation of the expected fractions of the network-based model. in systems biology this assumption is often referred to as well-stirredness. in the limit of an infinite graph size, the approximation approaches the true mean. to transform an ode-model to a network-based model, one can simply keep the rates relating to spontaneous transitions between compartments, as these transitions do not depend on interactions (e.g., recovery at rate β or an exposed node becoming infected). translating the infection rate is more complicated. in ode-models, one typically has a given infection rate and assumes that each infected individual can infect susceptible ones. to make the model invariant to the actual number of individuals, one typically divides the rate by the population size (or assumes the population size is one and the odes express fractions). naturally, in a contact network, we do not work with fractions; each node relates to one entity. here, we propose to choose an infection rate such that the network-based model yields the same basic reproduction number r 0 as the ode-model. the basic reproduction number describes the (expected) number of individuals that an infected person infects in a completely susceptible population.
we calibrate our model to this starting point of the spreading process, where there is a single infected node (patient zero). we assume that r 0 is either explicitly given or can implicitly be derived from an ode-based model specification. hence, when we pick a random node as patient zero, we want it to infect on average r 0 susceptible neighbors (all neighbors are susceptible at that point in time) before it recovers or dies. let us assume that, as in the aforementioned sir-model, infectious nodes infect their susceptible neighbors with rate λ and that an infectious node loses its infectiousness (by dying, recovering, or quarantining) with rate β. according to the underlying ctmc semantics of the network model, each susceptible neighbor gets infected with probability λ/(β+λ) [26] . note that we only take direct infections from patient zero into account and, for simplicity, assume all neighbors are only infected by patient zero. hence, when patient zero has k neighbors, the expected number of neighbors it infects is k λ/(β+λ). since the mean degree of the network is k mean , the expected number of nodes infected by patient zero is r 0 = k mean λ/(β+λ). now we can calibrate λ to relate to any desired r 0 , that is λ = β r 0 /(k mean - r 0 ). note that we generally assume that r 0 > 1 and that no isolates (nodes with no neighbors) exist in the graph, which implies k mean ≥ 1. hence, by construction, it is not possible to have an r 0 which is larger than (or equal to) the average number of neighbors in the network. in contrast, in the deterministic paradigm this relationship is given by the equation r 0 = λ ode /β (cf. [29, 9] ). note that the recovery rate β is identical in the ode- and network-model. we can translate the infection rate of an ode-model to a corresponding network-based stochastic model with the equation λ = β λ ode /(β k mean - λ ode ) while keeping r 0 fixed. in the limit of an infinite complete network, this yields lim n→∞ λ = λ ode /n, which is equivalent to the effective infection rate λ ode /n in the ode-model for population size n (cf. eq.
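the calibration described above, r0 = k_mean · λ/(β + λ) solved for λ, is a one-liner; the sketch below also checks the stated complete-graph limit λ → λ_ode/n. function and variable names are chosen here for illustration:

```python
def calibrate_lambda(r0, beta, k_mean):
    """Per-contact infection rate so that patient zero infects r0 neighbours
    in expectation: r0 = k_mean * lam / (beta + lam), solved for lam."""
    if not (0.0 < r0 < k_mean):
        raise ValueError("need 0 < r0 < mean degree")
    return beta * r0 / (k_mean - r0)

# the paper's 5-regular example: r0 = 2, beta = 1 => lam = 2/3
lam = calibrate_lambda(r0=2.0, beta=1.0, k_mean=5.0)

# complete-graph limit: with lam_ode = r0 * beta, lam approaches lam_ode / n
n = 10 ** 6
lam_big = calibrate_lambda(r0=2.0, beta=1.0, k_mean=n - 1)
```

note the built-in guard: an r0 at or above the mean degree is impossible in this construction, exactly as stated in the text.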
(1) ). example. consider a network where each node has exactly 5 neighbors (a 5regular graph) and let r 0 = 2. we also assume that the recovery rate is β = 1, which then yields λ ode = 2. the probability that a random neighbor of patient zero becomes infected is 2 5 = λ (β+λ) , which gives λ = 2 3 . it is trivial to extent the compartments and transitions, for instance by including an exposed compartment for the time-period where an individual is infected but not yet infectious. the derivation of r 0 remains the same. the only the only requirement is the existence of a distinct infection and recovery rate, respectively. in the next section, we discuss a more complex case. we consider a network-based model that is strongly inspired by the ode-model used in [25] , we document it in fig. 2 . we use the same compartments and transition-types but simplify the notation compared to [25] to make the intuitive meaning of the variables clearer 3 . we denote the compartments by c = {s, e, c, i, h, u, r, d}, where each node can be susceptible(s), exposed (e), a carrier (c), infected (i), hospitalized (h), in the (intensive care unit (u), dead (d), or recovered (r). exposed agents are already infected but symptom-free and not infectious. carriers are also symptomfree but already infectious. infected nodes show symptoms and are infectious. therefore, we assume that their infectiousness is reduced by a factor of γ (γ ≤ 1, sick people will reduce their social activity). individuals that are hospitalized (or in the icu) are assumed to be properly quarantined and cannot infect others. note that accurate spreading parameters are very difficult to infer in general and the high number of undetected cases complicate the problem further in the current pandemic. here, we choose values that are within the ranges listed in [25] , where the ranges are rigorously discussed and justified. we document them in table 1 . 
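the sir calibration and the worked example above can be sketched in a few lines of code (a minimal sketch; the function name and the input check are ours, not from the paper):

```python
def calibrate_lambda(r0, beta, k_mean):
    """Invert r0 = k_mean * lam / (beta + lam) for the per-edge infection rate lam.
    By construction, r0 must be strictly smaller than the mean degree k_mean."""
    if not 0 < r0 < k_mean:
        raise ValueError("r0 must lie strictly between 0 and k_mean")
    return beta * r0 / (k_mean - r0)

# worked example from the text: 5-regular graph, r0 = 2, recovery rate beta = 1
lam = calibrate_lambda(2, 1, 5)   # lam = 2/3
p_neighbor = lam / (1 + lam)      # per-neighbor infection probability = 2/5
```

with these values, k mean · p_neighbor = 5 · 2/5 recovers r 0 = 2, matching the example.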
we remark that there is a high amount of uncertainty in the spreading parameters. however, our goal is not a rigorous fit to data but rather a comparison of network-free ode-models to stochastic models with an underlying network structure. note that the mean number of days in a compartment is the inverse of the cumulative instantaneous rate to leave that compartment. for instance, the mean residence time in compartment h is 1/µ h . as a consequence of the race condition of the exponential distribution [47], r h modulates the probability of entering the successor compartment. that is, with probability r h , the successor compartment will be r and not u. inferring the infection rate λ for a fixed r 0 is somewhat more complex than in the previous section because this model admits two compartments for infectious agents. we first consider the expected number of nodes that a randomly chosen patient zero infects while being in state c. we denote the corresponding basic reproduction number by r̂ 0 . we calibrate the only unknown parameter λ accordingly (the relationships from the previous section remain valid). we explain the relation to r 0 when taking c and i into account in appendix a. substituting β by µ c gives λ = µ c r̂ 0 /(k mean − r̂ 0 ). naturally, it is extremely challenging to reconstruct large-scale contact-networks based on data. here, we test different types of contact networks with different features, which are likely to resemble important real-world characteristics. the contact networks are specific realizations (i.e., variates) of random graph models. different graph models highlight different (potential) features of the real-world interaction structure. the number of nodes ranges from 100 to 10^5. we only use strongly connected networks (where each node is reachable from all other nodes). we refer to [10] or the networkx [18] documentation for further information about the network models discussed in the sequel. we provide a schematic visualization in fig. 3.
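the two-phase relation can be sketched as follows; note that the (1 - r_c) factor, accounting for patient zero recovering from c before ever reaching i, is our reading of the model in fig. 2 and should be treated as an assumption:

```python
def r0_carrier_only(lam, mu_c, k_mean):
    """Expected infections caused by patient zero while in compartment C
    (the quantity r-hat-0 used for calibrating lam)."""
    return k_mean * lam / (lam + mu_c)

def r0_full(lam, mu_c, mu_i, gamma, r_c, k_mean):
    """Expected infections via both infectious phases C and I (cf. appendix a)."""
    p_c = lam / (lam + mu_c)                  # infect a given neighbor while a carrier
    p_to_i = (1 - r_c) * mu_c / (lam + mu_c)  # reach I before infecting that neighbor (assumption)
    p_i = gamma * lam / (gamma * lam + mu_i)  # infect while symptomatic (reduced by gamma)
    return k_mean * (p_c + p_to_i * p_i)
```

with γ = 0 the second phase contributes nothing and r0_full collapses to r0_carrier_only; with γ > 0 the full r 0 exceeds the carrier-only value, consistent with the experiments where r̂ 0 = 1.8 yields a larger r 0 .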
we consider erdős-rényi (er) random graphs as a baseline, where each pair of nodes is connected with a certain (fixed) probability. we also compute results for watts-strogatz (ws) random networks. they are based on a ring topology with random re-wiring. the re-wiring yields a small-world property of the network. colloquially, this means that one can reach each node from each other node with a small number of steps (even when the number of nodes increases). we further consider geometric random networks (gn), where nodes are randomly sampled in a euclidean space and randomly connected such that nodes closer to each other have a higher connection probability. we also consider barabási-albert (ba) random graphs that are generated using a preferential attachment mechanism among nodes, and graphs generated using the configuration model (cm-pl), which are, apart from the constraint of having a power-law degree distribution, completely random. both models contain a very small number of nodes with very high degree, which act as super-spreaders. (the copyright holder for this preprint: this version posted may 8, 2020. https://doi.org/10.1101/2020.05.05.20091736 doi: medrxiv preprint)
table 1 (spreading parameters):
- infection rate of a susceptible node: λ · #neigh(c) + λγ · #neigh(i)
- rate of transitioning from e to c
- r c = 0.08: recovery probability when node is a carrier
- µ c : rate of leaving c
- r i = 0.8: recovery probability when node is infected
- µ i = 1/5: rate of leaving i
- r h = 0.74: recovery probability when node is hospitalized
- µ h : rate of leaving h
- r u = 0.46: recovery probability when node is in the icu
- µ u = 1/8: rate of leaving u
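as an illustration, the er baseline can be generated with a few lines of standard-library python (the adjacency-set representation and seed handling are our choices); the other topologies have ready-made networkx generators, e.g. nx.erdos_renyi_graph, nx.watts_strogatz_graph, nx.random_geometric_graph, nx.barabasi_albert_graph, and nx.configuration_model:

```python
import random

def erdos_renyi(n, p, seed=0):
    """ER random graph as a list of neighbor sets: each of the n*(n-1)/2 node
    pairs is connected independently with probability p."""
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def mean_degree(adj):
    """The k_mean used in the calibration formulas."""
    return sum(len(neighbors) for neighbors in adj) / len(adj)
```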
we also test a synthetically generated household (hh) network that was loosely inspired by [2]. each household is a clique; the edges between households represent connections stemming from work, education, shopping, leisure, etc. we use a configuration model to generate the global inter-household structure that follows a power-law distribution. we also use a complete graph (cg) as a sanity check. it allows the extinction of the epidemic, but otherwise results similar to those of the ode are expected. we are interested in the relationship between the contact network structure, r 0 , the height and time point of the infection-peak, and the number of individuals ultimately affected by the epidemic. therefore, we run different network models with different r 0 (which is equivalent to fixing the corresponding values for λ or for λ ode ). for one series of experiments, we fix r̂ 0 = 1.8 and derive the corresponding infection rate λ and the value for λ ode in the ode model. in the second experiment, we calibrate λ and λ ode such that all infection peaks lie on the same level. in the sequel, we do not explicitly model npis. however, we note that the network-based paradigm makes it intuitive to distinguish between npis related to the probability that people meet (by changing the contact network) and npis related to the probability of a transmission happening when two people meet (by changing the infection rate λ). political decision-making is faced with the challenge of transforming a network structure which inherently supports covid-19 spreading into one which tends to suppress it.
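a toy version of the hh construction can be sketched as follows; for brevity, the global layer here uses uniformly random inter-household edges instead of the power-law configuration model described in the text:

```python
import random

def household_network(n_households, size, inter_edges, seed=1):
    """Cliques (households) plus random edges between households (global layer)."""
    rng = random.Random(seed)
    n = n_households * size
    adj = [set() for _ in range(n)]
    for h in range(n_households):          # make each household a clique
        members = list(range(h * size, (h + 1) * size))
        for i, u in enumerate(members):
            for v in members[i + 1:]:
                adj[u].add(v)
                adj[v].add(u)
    for _ in range(inter_edges):           # simplified global layer (assumption)
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj
```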
here, we investigate how changes in λ affect the dynamics of the epidemic in section 5 (experiment 3). we compare the solution of the ode model (using numerical integration) with the solution of the corresponding stochastic network-based model (using monte-carlo simulations). code will be made available. we investigate the evolution of mean fractions in each compartment over time, the evolution of the so-called effective reproduction number, and the influence of the infectiousness λ. setup. we used contact networks with n = 1000 nodes (except for the complete graph, where we used 100 nodes). to generate samples of the stochastic spreading process, we utilized event-driven simulation (similar to the rejection-free version in [16]). the simulation started with three random seed nodes in compartment c (and with an initial fraction of 3/1000 for the ode model). one thousand simulation runs were performed on a fixed variate of a random graph. we remark that results for other variates were very similar. hence, for better comparability, we refrained from taking an average over the random graphs. the parameters to generate a graph are:
- er: k mean = 6
- ws: k = 4 (number of neighbors), p = 0.2 (re-wire probability)
- gn: r = 0.1 (radius)
- ba: m = 2 (number of nodes for attachment)
- cm-pl: γ = 2.0 (power-law parameter), k min = 2
- hh: household size is 4; global network is cm-pl with γ = 2.0, k min = 3
experiment 1: results with homogeneous r̂ 0 . in our first experiment, we compare the epidemic's evolution (cf. fig. 4) while λ is calibrated such that all networks admit an r̂ 0 of 1.8, with λ set (w.r.t. the mean degree) according to eq. (6). thereby, we analyze how well certain network structures generally support the spread of covid-19. the evolution of the mean fraction of nodes in each compartment is illustrated in fig. 4 and fig. 5.
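the event-driven simulation can be illustrated with a plain sir special case (not the full eight-compartment model, and a naive event selection rather than the rejection-free scheme of [16]); function and variable names are ours:

```python
import random

def gillespie_sir(adj, lam, beta, seeds, seed=42, t_max=1000.0):
    """Exact continuous-time SIR on a contact network.
    adj: list of neighbor sets; lam: per-edge infection rate; beta: recovery rate."""
    rng = random.Random(seed)
    state = ['S'] * len(adj)
    for s in seeds:
        state[s] = 'I'
    t = 0.0
    while t < t_max:
        # enumerate all possible events with their rates
        events = []
        for v, st in enumerate(state):
            if st != 'I':
                continue
            events.append((beta, 'rec', v))        # recovery of v
            for u in adj[v]:
                if state[u] == 'S':
                    events.append((lam, 'inf', u))  # infection along the edge v-u
        if not events:
            break                                   # epidemic died out
        total = sum(rate for rate, _, _ in events)
        t += rng.expovariate(total)                 # time to next event
        x = rng.random() * total                    # pick an event proportional to its rate
        chosen = events[-1]
        for ev in events:
            x -= ev[0]
            if x <= 0:
                chosen = ev
                break
        _, kind, node = chosen
        state[node] = 'R' if kind == 'rec' else 'I'
    return state
```

running this on a small complete graph mimics the cg sanity check: the epidemic either dies out early or sweeps most of the graph.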
based on the monte-carlo simulations, we analyzed how the number r t of neighbors that an infectious node infects changes over time (cf. fig. 6; code: github.com/gerritgr/stochasticnetworkedcovid19). hence, r t is the effective reproduction number at day t (conditioned on the survival of the epidemic until t). for t = 0, the estimated effective reproduction number always starts around the same value and matches the theoretical prediction. independent of the network, r̂ 0 = 1.8 yields r 0 ≈ 2.05 (cf. appendix a). in fig. 6 we see that the evolution of r t differs tremendously for different contact networks. unsurprisingly, r t decreases on the complete graph (cg), as nodes that become infectious later will not infect more of their neighbors. this also happens for gn- and ws-networks, but they cause a much slower decline of r t , which is around 1 in most parts (the sharp decrease at the end stems from the end of the simulation being reached). this indicates that the epidemic slowly "burns" through the network. in contrast, in networks that admit super-spreaders (cm-pl, hh, and also ba), it is in principle possible for r t to increase. for the cm-pl network, we have a very early and intense peak of the infection while the number of individuals ultimately affected by the virus (and consequently the death toll) remains comparably small (when we remove the super-spreaders from the network while keeping the same r 0 , the death toll and the time point of the peak increase; plot not shown). note that the high value of r t in fig. 6 c) in the first days results from the fact that super-spreaders become exposed, and later infect a large number of individuals. as there are very few super-spreaders, they are unlikely to be part of the seeds.
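the r t estimate can be reproduced from simulation output along these lines, assuming the simulator logs who infected whom and the day each node became exposed (this log format is hypothetical, not from the paper):

```python
from collections import defaultdict

def effective_r(infections, day_exposed):
    """infections: victim -> infector; day_exposed: node -> day of exposure.
    Returns, for each day, the mean number of nodes that the nodes exposed on
    that day went on to infect (an r_t estimate, conditioned on survival)."""
    offspring = defaultdict(int)
    for victim, infector in infections.items():
        offspring[infector] += 1
    by_day = defaultdict(list)
    for node, day in day_exposed.items():
        by_day[day].append(offspring[node])
    return {day: sum(counts) / len(counts) for day, counts in by_day.items()}

# toy run: node 0 infects 1 and 2, node 1 infects 3
rt = effective_r({1: 0, 2: 0, 3: 1}, {0: 0, 1: 1, 2: 1, 3: 2})
# rt == {0: 2.0, 1: 0.5, 2: 0.0}
```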
however, due to their high centrality, they are likely to be one of the first exposed nodes, leading to an "explosion" of the epidemic. in hh-networks this effect is much more subtle but follows the same principle. experiment 2: calibrating r 0 to a fixed peak. next, we calibrate λ such that each network admits an infection peak (regarding i total ) of the same height (0.2). results are shown in fig. 7. they emphasize that there is no direct relationship between the number of individuals affected by the epidemic and the height of the infection peak, which is particularly relevant in the light of limited icu capacities. it also shows that vastly different infection rates and basic reproduction numbers are acceptable when aiming at keeping the peak below a certain threshold. (fig. 6 caption: x-axis: day at which a node becomes exposed; y-axis: (mean) number of neighbors this node infects while being a carrier or infected. note that at later time points results are noisier as the number of samples decreases. the first data point is the simulation-based estimate of r 0 and is shown as a blue square.) experiment 3: influence of the infectiousness λ. the relationship between λ and the infection peak is shown in fig. 8. noticeably, the relationship is concave for most network models but almost linear for the ode model. this indicates that the network models are more sensitive to small changes of λ (and r 0 ). this suggests that the use of ode models might lead to a misleading sense of confidence because, roughly speaking, they will tend to yield similar results when adding some noise to λ. that makes them seemingly robust to uncertainty in the parameters, while in reality the process is much less robust.
assuming that ba-networks resemble some important features of real social networks, the non-linear relationship between infection peak and infectiousness indicates that small changes of λ, which could be achieved through proper hand-washing, wearing masks in public, and keeping distance to others, can significantly "flatten the curve". (fig. 8 caption: for the network models, r 0 , given by eq. (7), is shown as a scatter plot; note the different scales on the x- and y-axes.) in this series of experiments, we tested how various network types influence an epidemic's dynamics. the network types highlight different potential features of real-world social networks. most results are in line with real-world observations. for instance, we found that better hygiene and the truncation of super-spreaders will likely reduce the peak of an epidemic by a large amount. we also observed that, even when r 0 is fixed, the evolution of r t largely depends on the network structure. for certain networks, in particular those admitting super-spreaders, it can even increase. an increasing reproduction number can be seen in many countries, for instance in germany [20]. how much of this can be attributed to super-spreaders is still being researched. note that super-spreaders do not necessarily have to correspond to certain individuals. the notion can also, on a more abstract level, refer to types of events. we also observed that cm-pl networks have a very early and very intense infection peak. however, the number of people ultimately affected (and therefore also the death toll) remains comparably small. this is somewhat surprising and requires further research.
we speculate that the fragmentation in the network makes it difficult for the virus to "reach every corner" of the graph, while it "burns out" relatively quickly for the high-degree nodes. we presented results for a covid-19 case study that is based on the translation of an ode model to a stochastic network-based setting. we compared several interaction structures using contact graphs, where one was (a finite version of) the implicit underlying structure of the ode model, the complete graph. we found that inhomogeneity in the interaction structure significantly shapes the epidemic's dynamics. this indicates that fitting deterministic ode models to real-world data might lead to qualitatively and quantitatively wrong results. the interaction structure should be included in computational models and should undergo the same rigorous scientific discussion as other model parameters. contact graphs have the advantage of encoding various types of interaction structures (spatial, social, etc.) and they decouple the infectiousness from the connectivity. we found that the choice of the network structure has a significant impact, and it is very likely that this is also the case for the inhomogeneous interaction structure among humans. specifically, networks containing super-spreaders consistently lead to the emergence of an earlier and higher peak of the infection. moreover, the almost linear relationship between r 0 , λ ode , and the peak intensity in ode-models might also lead to misplaced confidence in the results. regarding the network structure in general, we find that super-spreaders can lead to a very early "explosion" of the epidemic. small-worldness, by itself, does not admit this property. generally, it seems that, unsurprisingly, a geometric network is best at containing a pandemic. this would provide evidence in favor of corresponding mobility restrictions.
surprisingly, we found a trade-off between the height of the infection peak and the fraction of individuals affected by the epidemic in total. for future work, it would be interesting to investigate the influence of non-markovian dynamics. ode-models naturally correspond to exponentially distributed residence times in each compartment [48, 16]. moreover, it would be interesting to reconstruct more realistic contact networks. they would allow us to investigate the effect of npis in the network-based paradigm and to have a well-founded scientific discussion about their efficiency. from a risk-assessment perspective, it would also be interesting to focus more explicitly on worst-case trajectories (taking the model's inherent stochasticity into account). this is especially relevant because the costs to society do not scale linearly with the characteristic values of an epidemic. for instance, when icu capacities are reached, a small additional number of severe cases might lead to dramatic consequences. appendix a. the reachability probability from s-c to e-c is related to r 0 . it expresses the probability that an infected node (patient zero, in c) infects a specific random (susceptible) neighbor. the infection can happen via two paths. furthermore, we assume that this happens for all edges/neighbors of patient zero independently. assume a randomly chosen patient zero that is in compartment c. we are interested in r 0 in the model given in fig. 2, assuming γ > 0. again, we consider each neighbor independently and multiply by k mean . moreover, we have to consider the likelihood that patient zero infects a neighbor while being in compartment c and the possibility of transitioning to i and then transmitting the virus.
this can be expressed as a reachability probability (cf. fig. 9) and gives rise to the equation: r 0 = k mean ( λ/(λ+µ c ) + (1−r c ) µ c /(λ+µ c ) · γλ/(γλ+µ i ) ). in the brackets, the first part of the sum expresses the probability that patient zero infects a random neighbor as long as it is in c. in the second part of the sum, the first factor expresses the probability that patient zero transitions to i before infecting a random neighbor. the second factor is then the probability of infecting a random neighbor as long as being in i. note that, as we consider a fixed random neighbor, we need to condition the second part of the sum on the fact that the neighbor was not already infected in the first step. references:
- derivation of the effective reproduction number r for covid-19 in relation to mobility restrictions and confinement
- analysis of a stochastic sir epidemic on a random network incorporating household structure
- generation and analysis of large synthetic social contact networks
- epidemiology and transmission of covid-19 in shenzhen china: analysis of 391 cases and 1,286 of their close contacts
- controlling contact network topology to prevent measles outbreaks
- mitigation and herd immunity strategy for covid-19 is likely to fail
- modellierung von beispielszenarien der sars-cov-2-ausbreitung und schwere in deutschland
- the effect of travel restrictions on the spread of the 2019 novel coronavirus
- inferring covid-19 spreading rates and potential change points for case number forecasts
- a first course in network theory
- strategies for mitigating an influenza pandemic
- community transmission of sars-cov-2 at two family gatherings, chicago, illinois
- binary-state dynamics on complex networks: pair approximation and beyond
- mathematical models of infectious disease transmission
- rejection-based simulation of non-markovian agents on complex networks
- epidemic spreading in urban areas using agent-based transportation models
- exploring network structure, dynamics, and function using networkx
- modeling targeted layered containment of an influenza pandemic in the united states
- schätzung der aktuellen entwicklung der sars-cov-2-epidemie in deutschland (nowcasting)
- feasibility of controlling covid-19 outbreaks by isolation of cases and contacts
- representations of human contact patterns and outbreak diversity in sir epidemics
- insights into the transmission of respiratory infectious diseases through empirical human contact networks
- coronavirus disease 2019: the harms of exaggerated information and non-evidence-based measures
- estimate of the development of the epidemic reproduction number rt from coronavirus sars-cov-2 case data and implications for political measures based on prognostics
- mathematics of epidemics on networks
- projecting the transmission dynamics of sars-cov-2 through the post-pandemic period
- contacts in context: large-scale setting-specific social mixing matrices from the bbc pandemic project
- fundamental principles of epidemic spread highlight the immediate need for large-scale serological surveys to assess the stage of the sars-cov-2 epidemic
- an infectious disease model on empirical networks of human contact: bridging the gap between dynamic network data and contact matrices
- temporal network epidemiology
- comparison of three methods for ascertainment of contact information relevant to respiratory pathogen transmission in encounter networks
- a small community model for the transmission of infectious diseases: comparison of school closure as an intervention in individual-based models of an influenza pandemic
- analysis and control of epidemics: a survey of spreading processes on complex networks
- epidemic processes in complex networks
- an agent-based approach for modeling dynamics of contagious disease spread
- threshold conditions for arbitrary cascade models on arbitrary networks
- optimal vaccine allocation to control epidemic outbreaks in arbitrary networks
- the effect of control strategies to reduce social mixing on outcomes of the covid-19 epidemic in wuhan, china: a modelling study
- investigation of three clusters of covid-19 in singapore: implications for surveillance and response measures
- crowding and the epidemic intensity of covid-19 transmission. medrxiv
- pattern of early human-to-human transmission of wuhan 2019 novel coronavirus (2019-ncov)
- a high-resolution human contact network for infectious disease transmission
- spreading processes in multilayer networks
- interaction data from the copenhagen networks study
- impact of temporal scales and recurrent mobility patterns on the unfolding of epidemics
- probability, markov chains, queues, and simulation: the mathematical basis of performance modeling
- non-markovian infection spread dramatically alters the susceptible-infected-susceptible epidemic threshold in networks
- an introduction to infectious disease modelling
- modelling the potential health impact of the covid-19 pandemic on a hypothetical european country
key: cord-331374-3gau0vmc authors: giorgi, gabriele; montani, francesco; fiz-perez, javier; arcangeli, giulio; mucci, nicola title: expatriates' multiple fears, from terrorism to working conditions: development of a model date: 2016-10-13 journal: front psychol doi: 10.3389/fpsyg.2016.01571 sha: doc_id: 331374 cord_uid: 3gau0vmc companies' internationalization appears to be fundamental in the current globalized and competitive environment and seems important not only for organizational success, but also for societal development and sustainability. on one hand, global business increases the demand for managers for international assignment.
on the other hand, emergent fears, such as terrorism, seem to be developing around the world, enhancing the risk of expatriates' potential health problems. the purpose of this paper is to examine the relationships of the emergent concept of fear of expatriation with further workplace fears (economic crisis and dangerous working conditions) and with mental health problems. the study uses a quantitative design. self-reported data were collected from 265 italian expatriate workers assigned to both italian and worldwide projects. structural equation model analyses showed that fear of expatriation mediates the relationship of mental health with fear of economic crisis and with perceived dangerous working conditions. as expected, in addition to fear, worries about expatriation are also related to further fears. although the study is based on self-reports and the cross-sectional study design limits the possibility of making causal inferences, the new constructs introduced add to previous research. the globalization of markets that has taken place in recent decades was a great opportunity for companies to become known and to operate in a wider context (biemann and andresen, 2010; andresen et al., 2014). this phenomenon led to the possibility that many of the managers who served in a national territory could be transferred to foreign countries, characterized by different cultures and work processes (sims and schraeder, 2005). however, working globally involves changes in occupational dynamics and in the levels of job complexity, and it also requires great skills of adaptation and adjustment (black et al., 1991; black and gregersen, 1999). most of the research conducted on adjustment in a foreign country has concerned so-called expatriate workers, or expatriates (sims and schraeder, 2005), who are those workers sent from their own organization to follow projects or to lead company sectors abroad.
an expatriate can be properly defined as one who works in a foreign country for a period of at least 6 months (birdseye and hill, 1995; jones, 2000; chan et al., 2015). however, shorter forms of expatriation also exist, as shown in the present study. supporting expatriates in performing their tasks in a new environment is nowadays essential for companies. accordingly, researchers have studied expatriates' performance and adaptation and evaluated the influence of specific practices of human resource management on their behavior. relevant dimensions have been identified (mol et al., 2005; cheng and lin, 2009), such as training, to support expatriates in dealing with different cultural values, unexpected behavioral rules, and language barriers. on the other hand, the dark sides of the expatriation experience have also been studied, such as the possible failure of the assignment, leaving without having finished the task, or psychologically withdrawing and performing worse than usual. failure may be particularly expensive in human and monetary terms (baruch, 2004). adaptation/adjustment may be defined as the degree of comfort (or the absence of stress) associated with the role of the expatriate (bhaskar-shrinivas et al., 2005); expatriates who fail to face the demands of a job and do not properly adapt to a new environment may experience high levels of stress (perone et al., 2008). the scenario of stressors among expatriates seems complex, ranging from the micro-environment and the macro-environment to the mega-environment (lei et al., 2004). in particular, according to bhaskar-shrinivas et al.
(2005), work assignments to be carried out abroad lead to greater stress when the following situations occur: (a) when the leader perceives his role as unclear, or rather does not understand which tasks are actually his and what the company expects from him; (b) when the leader feels he has low decision latitude, that is, if he does not feel free to make decisions without first having to ask and obtain the green light from his company; (c) if the position is considered too demanding, difficult or new, or is a situation that the leader does not feel up to handling because of lack of experience or lack of capacity; (d) when the manager recognizes that there is a conflict, such as in a case where certain tasks cannot be completed because that would hamper the achievement of other business objectives he is trying to achieve. black et al. (1991) observed that expatriates tend to suffer a greater number of relapses after periods of stress. jones (2000), in a review, identified some risk factors that could not only adversely affect health, but could also lead to developing fear and anxiety of expatriation: risk of being involved in accidents; quality of living conditions; working conditions; risk of disease contagion; fear of being involved in violence, kidnappings and terrorist acts. these risk factors are analyzed below:
- risk of being involved in accidents. this fear is typically supported by the objective evidence that in some countries there are very low driving standards and poor road safety. moreover, in some countries the roads are of low quality.
- quality of living conditions. the quality of food and hygiene is one of the most important factors to ensure the adaptation of an expatriate to the new job environment. for example, good water quality cannot be ensured in all countries. drinking poor quality water could cause the development of oral infections or gastrointestinal problems. the same effects can also be produced by eating non-controlled food.
as far as lifestyle is concerned, a lack of leisure activities and difficulties in communication (for example, poor internet and telephone functionality) may be a concern.
- working conditions. there are higher psychological and physical strains in developing countries, which can inhibit the expatriate's ability to cope with perceived stress and can eventually increase unsafe practices. heavy traffic and low control of industrial gas emissions could also affect the health of expatriates. also, the presence of pristine nature in some working locations might interact negatively with a lower standard of safety and health.
- chances of disease contagion. expatriates should be informed about the prevalence of diseases in the host country before their trip or during their stay. the possibility of having specific vaccines would be an important protective factor against possible contagion. however, the fear of contagion from some illness might always be present in some countries. psychological susceptibility to becoming stressed by the potential contagion also appears important.
- fear of being involved in violence, kidnappings, and terrorist acts. this issue, once confined to a few world regions, seems now to be more widespread (bader and schuster, 2015).
given that anxiety could significantly decrease people's psychological well-being and mental health, there is increasing empirical research on the effect of fear in the workplace (mueller and tschan, 2011). fear, especially if chronic, may weaken the human body and damage, in particular, the immune, nervous and cardiovascular systems; the gastrointestinal tract and the reproductive system are equally not spared (shiba et al., 2016).
in particular, fear may compromise the decoding of emotions and decision-making processes, making the subject susceptible to intense emotions and impulsive reactions and, consequently, to taking inappropriate actions. fatigue, depression, accelerated aging, and even premature death may be further consequences of long-term fear (shiba et al., 2016). furthermore, the literature shows that expatriate workers have an increased risk of consuming psychotropic and narcotic substances as well as of abusing psychotropic drugs (aubry et al., 2012; bianchi et al., 2014; kaeding and borchers, 2016). our study enhances the literature by being the first to look at a set of important fears among expatriates. in particular, we aimed to find out how the emergence of fear of expatriation, induced by mental health problems, might impact on the expatriate's further fears in the workplace, using data from a survey of 265 italian expatriate workers. building on the stress perspective (lazarus and folkman, 1984; lazarus, 1991), we examined, in particular, the following issues: how mental health is associated with fear; the relationship of fear of expatriation with fear of economic crisis as well as with perceived dangerous working conditions; and the mediating role of fear of expatriation between mental health problems and the development of further fears. we conceptualize fear of expatriation in terms of the risk factors discussed above. indeed, this study contributes to the literature on expatriates' health by testing an emergent model for the prevention of mental health issues. this paper proceeds as follows. first, we present the conceptual model and the derived hypotheses. then, we explain the methodology used. finally, the results and discussion (including limitations and research perspectives) are considered.
expatriate workers often experience difficulties in their adjustment to new work and living situations and, consequently, they are at risk of developing mental health problems (costa et al., 2015; zhu et al., 2016). this situation may enhance the fear of violence and of poor living and working conditions during the experience abroad (lazarus and folkman, 1984). this fundamental concept is the basis of our conceptual model (figure 1). with this in mind, workers can be severely traumatized not only by actual violence but also by any potential violence. for instance, terrorism is quickly spreading (leistedt, 2013). data on global terrorism from the global terrorism database (gtd) of the national consortium for the study of terrorism and responses to terrorism (start) (2015) show that in 2014 such attacks occurred in 95 countries. in 2014, the worldwide attacks numbered 13,463 (35% more than in 2013), leading to more than 32,700 deaths and more than 34,700 injured people. the geographical distribution is highly concentrated: sixty percent of these attacks took place in five countries (iraq, india, afghanistan, pakistan, and nigeria), while 78% of the fatalities caused by terrorism took place in five countries (iraq, nigeria, afghanistan, pakistan, and syria). the strategy of the most developed terrorist groups - e.g., islamic state of iraq and the levant (isil) and boko haram - includes not only violent attacks on military or civilian targets, but also kidnapping, torture, and rape. these practices increase the fear of the people - particularly those located in the directly involved geographical areas - and of international public opinion. workers with mental health problems might be particularly vulnerable to developing fears of these practices.
the use of social media by isil has allowed for the extreme visibility of this organization, with widespread dissemination of its terroristic content (united states department of state publication bureau of counterterrorism, 2015), and might increase anxiety in workers with pre-existing mental health problems or stress (solberg et al., 2015; glad et al., 2016; paz garcía-vera et al., 2016). the context of living and working conditions in the host country is another factor associated with expatriates' psychological well-being. frazee (1998) pointed out that healthcare is one of the main issues for expatriates: more than one-third of international assignees are dissatisfied with the health assistance they receive. the standard of healthcare varies considerably around the world; discrepancies may exist even among different regions of the same country. in addition, expatriates might be afraid of not receiving adequate and timely treatment for all types of injury; moreover, the sanitary conditions might not be good, increasing the risks of contagion or illness. these concerns affect virtually all expatriate workers, but may result in real states of fear in subjects with mental health problems and may generate the acute and chronic worsening of any already existing clinical situation (pierre et al., 2013; cleary et al., 2014; wilde and gollogly, 2014). hypothesis 1: mental health problems generate fear of expatriation. the second part of the model is focused on the development of further fears in the workplace. despite the numerous relevant stressors in global assignments, in our conceptual framework we mainly focused on two areas of fear of expatriation. the first is related to violence, both physical and psychological. the second is related to the perceived impeded living and working conditions (including workplace safety, illness contagion, and lifestyle).
as already explained, the presence of a pre-existing state of fear or anxiety may enhance the likelihood of a negative stimulus eliciting fear. indeed, emotions are specific to the context and imply a person-environment relationship. more specifically, emotions embody a particular theme, reflecting the way the individual sees his/her relationship with the environment in a given situation (lazarus, 2001). the fear of expatriation might negatively influence the perception of the safety environment and the anxiety caused by the economic crisis. moreover, expatriates are often sent to societies with weaker and less expensive h&s policies (heymann, 2003), raising a perception of unsafe working conditions (curcuruto et al., 2015). in our model we expect fear of expatriation to mediate between mental health problems and further fears in the workplace. first, mental health problems generate anxiety and fear. fear can impair the formation of long-term memories and can cause damage to certain parts of the brain, such as the hippocampus (besnard and sahay, 2016). this can make it even more difficult to regulate fear and can leave a person anxious most of the time. threats to our security, whether real or perceived, impact our mental well-being, generating multiple fears. hypothesis 2: mental health problems, through the mediation of fear of expatriation, influence further fears in the workplace: dangerous working conditions and economic stress. this study was conducted in a large international company dealing with technology and services in heavy industry. the expatriate managers employed in this company were all invited to participate in the study. expatriate assignments in this company are usually short-term: expatriates cyclically spend 28 days away from their home workplace (often on platforms or in yards located worldwide).
the final respondents were 265 employees (response rate = 70%) working in multiple locations (italy, europe, middle east, asia, africa, australia, etc.). the survey was administered through the corporate intranet, ensuring anonymity and privacy. a video, in which an industrial psychologist and an occupational physician explained the procedure of questionnaire compilation and the survey aims, was also made available through the corporate intranet. the sample consisted only of men in managerial positions. workers were, on average, relatively young: 18.9% were 30 years old or younger, 48.3% from 31 to 40 years old, 23.4% from 41 to 50 years old, and only 9.4% were over 50. regarding job tenure, 23.8% of the participants had worked from 0 to 5 years, 30.9% from 6 to 10 years, 30.9% from 11 to 20 years, and 14.3% more than 20 years. finally, the majority of employees had long working hours (17.4% up to 50 h per week, 26.6% 50-60 h per week, 56% more than 60 h per week). after collecting some socio-demographic variables, participants completed the scales on fear of expatriation, economic stress, dangerous working conditions and psychological distress. the scales used in this study are described below. fear of expatriation was measured by a new questionnaire, developed by our research group and called the fear of expatriation scale (supplementary material). the measure is composed of two dimensions: (a) fear of violence/terrorism (two items): employees are scared of being subjected to violence/terrorism (e.g., "i am scared of being the object of physical violence - kidnapping, terrorism, etc."); (b) fear of the working and living conditions (three items): employees are worried about the working and living conditions and about healthcare (e.g., "i am scared of contracting a disease").
the scores were collected, for each dimension, through a five-point likert scale (from 1: "strongly disagree" to 5: "strongly agree"). as this instrument was developed for this research, we evaluated the construct validity and reliability of the fear of expatriation scale in order to investigate its psychometric properties. we assessed the construct validity (convergent validity and discriminant validity) of the scale by conducting a confirmatory factor analysis (cfa) in order to compare the hypothesized factorial model involving two distinct factors - fear of violence/terrorism and fear of the working and living conditions - with a one-factor model. results showed that the hypothesized two-factor model yielded a good fit to the data (χ2[4] = 9.35, ns; cfi = 0.99; rmsea = 0.07; srmr = 0.02) and outperformed the one-factor solution (χ2[4] = 20.25, p < 0.01; cfi = 0.97; rmsea = 0.11; srmr = 0.02; Δχ2[1] = 10.90, p < 0.01), thus supporting the distinctiveness of the two sub-dimensions of fear of violence/terrorism and fear of the working and living conditions. furthermore, standardized regression coefficients of items on each factor were all higher than 0.50 (hair et al., 2007), thus supporting the convergent validity of the factors (range = 0.62-0.85). however, cfa results also indicated that the correlation among the latent constructs was higher than 0.89. this therefore suggests that the two dimensions might best be combined into an overall scale of fear of expatriation (kline, 2011). accordingly, in our subsequent analyses to test hypotheses 1 and 2, we considered only the overarching fear of expatriation scale, and not its separate dimensions. finally, internal consistency, assessed by the calculation of reliability coefficients (cronbach's alpha), was 0.86, 0.76, and 0.80 for the overall fear of expatriation scale, the fear of violence/terrorism dimension and the fear of the working and living conditions dimension, respectively.
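the alpha coefficients above follow from the classical formula α = k/(k − 1)·(1 − Σσ²ᵢ/σ²ₜ), where σ²ᵢ are the item variances and σ²ₜ is the variance of the scale sum score. a minimal sketch of the computation, on synthetic item scores (not the study data):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # per-item sample variances
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the scale sum score
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

# illustrative only: five respondents answering three perfectly parallel items
base = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
parallel = np.column_stack([base, base, base])
print(round(cronbach_alpha(parallel), 2))  # perfectly correlated items give 1.0
```

with real likert data the rows would be the 265 respondents and the columns the items of each sub-scale.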
thus, this indicated good internal consistency of the measure (nunnally, 1978). economic stress was measured with the scale on subjective economic stress included in the recent stress questionnaire (sq), developed and validated in italy (giorgi et al., 2013; mucci et al., 2015). the economic stress measure is composed of two dimensions: (a) fear of the economic crisis (five items) - employees perceive that the organization is suffering from the economic crisis (e.g., "i am scared that my organization is affected by the economic crisis; i am scared that my organization, due to the economic crisis, will be subjected to downsizing"); (b) non-employability (five items) - employees perceive that their working competencies would not permit them to acquire another job in the market (caricati et al., 2016) [e.g., "my professionalism is not spendable (recognized) in the labor market; my staying in the organization is linked to the difficulty of outplacement in the labor market"]. each dimension includes five items rated on a five-point likert scale (from 1: "strongly disagree" to 5: "strongly agree"). perceived dangerous working conditions were measured with a scale, included in the above-mentioned stress questionnaire (giorgi et al., 2013; mucci et al., 2015), that covers two factors rated on a five-point likert scale (from 1: "strongly disagree" to 5: "strongly agree"). mental health problems were measured with the general health questionnaire (ghq-12; goldberg and hillier, 1979; fraccaroli et al., 1991). the scale asks whether the respondent has recently experienced a particular symptom or behavior related to general psychological health. each item is rated on a four-point likert-type scale (0-1-2-3). a higher score indicates a greater degree of psychological distress. in this study we particularly focus on the sub-dimension "anxiety and insomnia" (seven items, e.g., "considering the last few weeks, have you recently [. . .] felt constantly under strain?").
following anderson and gerbing's (1988) two-step structural equation modeling (sem) procedure, we tested a measurement model (cfa) by determining whether each measure's estimated loading on its expected underlying factor was significant. this allowed us to establish discriminant validity among the study constructs. then, a structural model was estimated to assess the fit to the data of the hypothesized model in which fear of expatriation mediates the relationship of mental health problems with economic stress and perceived dangerous working conditions (hypotheses 1 and 2). a cfa was, therefore, performed with mplus, version 7.11 (muthén and muthén, 1998-2010), with the four variables measuring mental health problems, fear of expatriation, economic stress, and perceived dangerous working conditions. moreover, the variables' dimensions were used as indicators of their corresponding latent constructs in the measurement and structural models. these dimensions were formed by averaging the items of each sub-scale for the four latent variables. we therefore obtained three indicators for mental health problems, two indicators for fear of expatriation, two indicators for economic stress, and two indicators for perceived dangerous working conditions. to evaluate the model fit, we considered the chi-square (the higher the value, the worse the model's correspondence to the data), and used both absolute and incremental fit indexes. absolute fit indexes evaluate how well an a priori model reproduces the sample data. in our study, we focused on the following absolute fit indexes: the standardized root mean square residual (srmr), for which values of less than 0.08 are favorable, and the root-mean-square error of approximation (rmsea), which should not exceed 0.10 (browne and cudeck, 1993; kline, 2011).
incremental fit indexes measure the proportionate amount of improvement in fit when a target model is compared with a more restricted, nested baseline model (schreiber et al., 2006). we considered the comparative fit index (cfi), for which values of 0.90 or greater are recommended (schreiber et al., 2006). as expected, the hypothesized four-factor model yielded a good fit to the data: χ2(20) = 61.17, cfi = 0.95, rmsea = 0.09, srmr = 0.05 (table 1). additionally, as shown in table 2, this model had a significantly better fit than alternative, more parsimonious models (p < 0.01), supporting the distinctiveness of the study variables. table 2 displays the descriptive statistics, correlations, and reliability coefficients of the variables. in order to examine the hypothesized model, we performed sem with mplus. sem offers the following advantages: (a) controlling for measurement errors when the relationships among variables are analyzed (hoyle and smith, 1994); (b) comparing the goodness-of-fit of the hypothesized model with other alternative models (cheung and lau, 2008). thus, we tested our proposed structural model and compared it with alternative models. additionally, when conducting sem analyses, we controlled for the effects of age and organizational tenure on both the mediator and the dependent variables. fit indexes for each tested model are presented in table 3. the hypothesized model (model 1), which is a full mediation model, displayed a good fit to the data: χ2(35) = 70.37, cfi = 0.96; rmsea = 0.06; srmr = 0.06. specific inspection of direct relationships further revealed that mental health problems were positively associated with fear of expatriation (β = 0.64, p < 0.01), thus supporting hypothesis 1. additionally, fear of expatriation, in turn, was positively related to economic stress (β = 0.40, p < 0.01) and perceived dangerous working conditions (β = 0.66, p < 0.01), thus providing preliminary support for hypothesis 2.
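the rmsea values reported above can be reproduced (to rounding) from the model chi-square and degrees of freedom with the standard point-estimate formula rmsea = sqrt(max(χ² − df, 0)/(df·(n − 1))); note that conventions differ slightly across programs (some use n rather than n − 1), so this is a sketch of the usual textbook form, not the authors' software:

```python
import math

def rmsea(chisq, df, n):
    """RMSEA point estimate from the model chi-square (n - 1 convention)."""
    return math.sqrt(max(chisq - df, 0.0) / (df * (n - 1)))

# figures reported in the text, with n = 265 respondents:
print(round(rmsea(61.17, 20, 265), 2))  # measurement model chi2(20) = 61.17 -> 0.09
print(round(rmsea(70.37, 35, 265), 2))  # structural model chi2(35) = 70.37 -> 0.06
```

both values match the rmsea figures reported for table 1 and model 1, which is a useful sanity check on the fit statistics.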
completely standardized path coefficients for model 1 are depicted in figure 2. to assess whether the hypothesized model was the best representation of the data, we then compared its fit to that of different alternative models. first, we assessed a partial mediation model, which included two additional direct paths from mental health problems to economic stress and perceived dangerous working conditions. this model yielded an adequate fit to the data (χ2[33] = 67.21, cfi = 0.96; rmsea = 0.06; srmr = 0.05), but it was not significantly better than model 1, as revealed by the chi-square difference (Δχ2[1] = 3.16, ns). moreover, the additional direct relationships of mental health problems with economic stress (β = 0.13, ns) and dangerous working conditions (β = 0.03, ns) were not significant. next, we compared the hypothesized model with a non-mediation model (model 3), which only included direct paths from mental health problems and fear of expatriation to economic stress and perceived dangerous working conditions. results revealed that the non-mediation model was a slightly worse fit to the data than the hypothesized fully-mediated model. however, because this model had the same degrees of freedom as the hypothesized model, the statistical significance of the chi-square difference could not be calculated. accordingly, we used the akaike information criterion (aic), instead of the chi-square, to compare the two models. the hypothesized model is considered superior to the non-mediation model if the former has an aic value lower than the latter by four or more units (burnham and anderson, 2002). results revealed that model 3 had an aic of 7301.58 compared to an aic of 7290.35 for model 1, suggesting that the hypothesized full mediation model represents a better fit to the data than the non-mediation model (Δaic = 11.23).
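the two comparison rules used above - the chi-square difference test for nested models and the four-or-more-unit aic rule for models with equal degrees of freedom - can be sketched as follows (illustrative helper functions, not the authors' code):

```python
from scipy.stats import chi2

def chi_square_difference(chisq_restricted, df_restricted, chisq_full, df_full):
    """Delta-chi-square test between nested models; the 'full' model frees more parameters."""
    d_chisq = chisq_restricted - chisq_full
    d_df = df_restricted - df_full
    return d_chisq, d_df, chi2.sf(d_chisq, d_df)  # p-value of the difference

def aic_prefers_first(aic_first, aic_second, threshold=4.0):
    """Burnham & Anderson rule of thumb: prefer the first model if its AIC is lower by >= threshold."""
    return (aic_second - aic_first) >= threshold

# full mediation (model 1) vs. non-mediation (model 3), using the AICs reported in the text
print(aic_prefers_first(7290.35, 7301.58))  # True: delta AIC = 11.23 >= 4
```

for nested models, a non-significant p-value from `chi_square_difference` favors the more parsimonious (restricted) model, which is the logic behind retaining model 1 over the partial mediation model.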
furthermore, because mental health problems, fear of expatriation, economic stress, and perceived dangerous working conditions were all measured at the same time, reverse relationships could also be expected between the four variables. in order to rule out this possibility, we therefore compared the hypothesized model against a set of alternative models that specified all the possible reverse indirect relationships among the study variables, namely: the indirect relationship of economic stress and perceived dangerous working conditions with fear of expatriation via mental health problems (model 4); the indirect relationship of economic stress and perceived dangerous working conditions with mental health problems via fear of expatriation (model 5); the indirect relationship between mental health problems and fear of expatriation via economic stress and perceived dangerous working conditions (model 6); the indirect relationship of fear of expatriation with economic stress and perceived dangerous working conditions via mental health problems (model 7); the relationship between fear of expatriation and mental health problems via economic stress and perceived dangerous working conditions (model 8). again, because models 6-8 had the same degrees of freedom as the hypothesized model, we compared the model fit by using the aic difference test. as can be seen from table 3, models 4-8 all yielded a worse fit to the data than the hypothesized model. (note to table 3: n = 265; cfi, comparative fit index; rmsea, root-mean-square error of approximation; srmr, standardized root mean square residual; Δχ2, chi-square difference tests between the best-fitting model (model 1) and alternative models; Δaic, akaike difference test between the best-fitting model (model 1) and alternative models; fex, fear of expatriation; fec, fear of economic crisis; dwc, dangerous working conditions; mhp, mental health problems. *p < 0.05.)
overall, results from the model comparison suggested that model 1 was the best-fitting model. we therefore retained the hypothesized fully-mediated model. finally, in order to assess whether the indirect relationship of mental health problems with economic stress and perceived dangerous working conditions through fear of expatriation was significant (hypothesis 2), we calculated 95% bootstrapping confidence intervals (preacher and hayes, 2008; preacher and kelley, 2011). based on 5,000 bootstrap replications, results indicated that mental health problems had an indirect positive effect on economic stress (indirect effect = 0.18; 95% ci = 0.12, 0.24) and perceived dangerous working conditions (indirect effect = 0.18; 95% ci = 0.12, 0.24) via fear of expatriation. hypothesis 2 was therefore fully supported. in a globalized working environment, with turbulence in the economy and in security, expatriate workers are confronted with several stressors, making international assignments potentially stressful. accordingly, expatriate managers might report lower psychological well-being and greater anxiety (wang and kanungo, 2004; shaffer et al., 2006; wang and nayir, 2006). at the same time, fears are now increasing in the workplace, marked by emotional discomfort, apprehension, or concerns about the internal and external environment, and expatriates seem particularly at risk. these symptoms can progress to more severe psychosomatic symptoms, including further anxiety and additional fears (blum et al., 2010; besnard and sahay, 2016). in our model, fear of expatriation is particularly associated with mental health problems. this result is in line with the field literature (i.e., perone et al., 2008; andresen et al., 2014; aracı, 2015). in addition, an expatriate might go through several personal and professional problems.
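the percentile-bootstrap procedure used above for an indirect effect a×b (regress the mediator on the predictor, the outcome on mediator and predictor, resample cases, and take percentiles of the products) can be sketched on synthetic single-mediator data; the variable names, effect sizes, and regression form below are illustrative assumptions, not the study data or the authors' latent-variable model:

```python
import numpy as np

rng = np.random.default_rng(42)

# synthetic single-mediator data: X -> M -> Y (illustrative effect sizes only)
n = 265
x = rng.normal(size=n)
m = 0.6 * x + rng.normal(scale=0.8, size=n)
y = 0.4 * m + rng.normal(scale=0.8, size=n)

def ols_coefs(y, X):
    """OLS coefficients of y on the columns of X plus an intercept."""
    design = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def bootstrap_indirect(x, m, y, reps=5000, alpha=0.05):
    """Percentile-bootstrap confidence interval for the indirect effect a*b."""
    n = len(x)
    products = np.empty(reps)
    for i in range(reps):
        idx = rng.integers(0, n, size=n)         # resample cases with replacement
        a = ols_coefs(m[idx], x[idx, None])[0]   # path a: mediator on predictor
        b = ols_coefs(y[idx], np.column_stack([m[idx], x[idx]]))[0]  # path b, controlling for X
        products[i] = a * b
    lo, hi = np.quantile(products, [alpha / 2, 1 - alpha / 2])
    return lo, hi

lo, hi = bootstrap_indirect(x, m, y)
print(lo > 0)  # a CI excluding zero indicates a significant indirect effect
```

the study's analysis estimates the paths within an sem rather than with separate ols regressions, but the resampling-and-percentile logic for the interval is the same.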
in fact, expatriation is associated with many unhealthy issues such as stress, anxiety, loneliness and homesickness, generating a sort of potential and prolonged cultural shock (barón et al., 2014). finally, it must be emphasized that expatriates cannot count on the support of family and/or other trusted people when they need it (bhaskar-shrinivas et al., 2005). fear of expatriation might generate a spiraling effect in which people, feeling more anxious, might become less engaged in the workplace, developing beliefs that money and extrinsic reward are the most important aspects of employment (gerhart and fang, 2015). at the same time, they may be more worried about their financial situation and their ability to hold on to their jobs and their benefits, developing a fear of economic crisis. on the one hand, from a subjective point of view, h&s measures - under certain circumstances (for instance, the emergence of fear in the workplace) - do not necessarily convey feelings of safety, but rather can be interpreted as latent danger (bader and berg, 2012), generating a widespread anxiety. on the other hand, from an objective point of view, expatriates are often sent to societies with weaker and less expensive h&s policies and less organized labor forces, bringing potentially stressful and unsafe working conditions (heymann, 2003) to expatriates. this might raise a perception of dangerous working conditions. moreover, h&s procedures are usually highly specific to each country, with very important differences related to the various national legislations and traditions, and working conditions abroad are generally perceived as less familiar and as presenting higher risk (taylor et al., 2007).
expatriates are a group of people with a high cumulative risk of exposure to illness and injury (including the increased risk of certain vaccine-preventable illnesses) due to changes in travel patterns and activities, lifestyle alterations, and increased interaction with local populations. pre-travel immunization management provides one safe and reliable method of preventing infectious illness in this group; however, this might not be enough to cope with anxiety (vaid et al., 2013; shepherd and shoff, 2014). in addition, there are diseases that are not preventable with vaccines. these diseases, in particular those of a viral nature, seem to spread faster nowadays - such as, for example, the recent ebola or middle east respiratory syndrome (mers) outbreaks - and might be frightening for expatriates (cohen et al., 2016; regan et al., 2016). in summary, the non-optimal h&s perception (beeley et al., 1993), the risk of contracting infectious diseases (jones, 1999; hamlyn et al., 2007) and the unsuitability of medical care (pierre et al., 2013) were also evaluated under the construct of fear of expatriation. in our study, expatriates reported being frightened by the risk of becoming involved in accidents during their frequent moves from one country to another. in particular, this fear seemed to be greater for traffic accidents (wilks et al., 1999), but a fear of flying was also described (hack-polay, 2012). a further and significant stress factor for expatriate workers is the fear of terrorist attacks or other fatal events in the countries where the companies' sites are located (leistedt, 2013; bader and schuster, 2015). finally, in our model we consider the concerns about the working and living conditions (costa et al., 2015; zhu et al., 2016). in addition, our model has shown that the presence of fear of expatriation may, in turn, generate further fears in the workplace.
in particular, fears of both the economic crisis and the foreign working conditions are mediated by fear of expatriation. in fact, fear could adversely disturb human thinking and decision-making processes, leaving the individual more susceptible to generating further fears, as in a vicious circle (barón et al., 2014; besnard and sahay, 2016). our findings are in line with the basic propositions of lazarus and with the "affective events theory" (aet; weiss and cropanzano, 1996). the latter pointed out that an emotion experienced by a worker (e.g., fear) may impact on later within-person emotions (e.g., fear again), influencing different organizational outcomes. as far as the economic crisis is concerned, economic stress might be more frightening for those who have invested more in the company, such as expatriates. in fact, these workers, being away from home for extended periods of time, are completely absorbed by their job and, therefore, may be an easier target for contagious negative emotional cycles (quantin et al., 2012). moreover, they could be most affected by the psychological impacts of the economic crisis and its consequences, as they have lower levels of social support (fernandez et al., 2015). similarly, fear of expatriation may significantly lead to perceiving a priori all foreign working conditions as more dangerous. these findings support our assumption and the literature (perone et al., 2008; andresen et al., 2014). the impact of working and living conditions - resulting, for instance, in perceived risks of harm - might be higher if expatriates are scared by the impeded living conditions or by the threat of violence. this might have several negative implications, as expatriates often have the tendency to "get the job done" as smoothly as possible so they can return home again (aracı, 2015). however, this specific issue should be investigated in future studies.
our findings provide interesting contributions both to the literature and to managerial practices in the field of foreign work. first, we believe that subjects in the process of leaving their own country should be mentally healthy and not frightened by either the place of destination or the assigned tasks. according to lazarus and folkman (1984), if an expatriate is worried and anxious, it is less likely that he/she will ever adjust. therefore, it is essential to help expatriates prevent the development of any type of fear. strategies of prevention and rationalization, particularly useful in this sense, can be implemented through several instruments: specific training for foreign service, the company's reference facilities in the countries where employees go to work, remote counseling (on-line/phone) provided by occupational physicians and psychologists affiliated with the company, company procedures for immediate repatriation in the case of adverse events (e.g., terrorism, infective outbreaks, and health problems), etc. secondly, stress management training is also recommended. from this perspective, issue-focused coping strategies seem crucial to counteract fearful feelings as well as to minimize states of anxiety and distress (folkman et al., 1986). thirdly, companies should conduct preventive screening to identify the human resources to be sent to foreign countries, favoring those who have demonstrated both highly qualified professionalism in their field and robust mental health (stone, 1991; di fabio, 2014, 2015). consequently, a process for the successful management of expatriation needs to be adopted, using both individual and organizational strategies to reduce the possibility of stress among expatriates. at the organizational level, selection, training, healthcare activities, and counseling need to be implemented and monitored in order to prevent the diffusion of workplace fears.
at the individual level, expatriates should be psychologically supported, e.g., with mentoring and coaching, analyzing competencies, health perceptions, and mood over time (de paul and bikos, 2015). our study has several innovative strengths but is not without limitations. although the expatriates worked worldwide, the sample was limited to a single company, limiting the generalizability of the results. in addition, the sample was composed only of men; however, by virtue of family demands, men expatriate much more frequently than women. our fear of expatriation scale is new in the literature and, consequently, a replication of this study is needed. in addition, we look forward to larger studies whose starting point is the results of this first application of our scale. in particular, comparative research evaluating stress responses between italian and other ethnic populations around the world would be particularly helpful. this study used a cross-sectional design, resulting in the impossibility of determining causal relationships. longitudinal research is needed in order to provide further evidence that mental health problems cause fear of expatriation, which, in turn, may generate additional workplace fears. in summary, following this new research path, we have developed a new model, formulated a new theory that found an association between mental health and fears in the workplace, and explored different fears in the workplace and their links. our results confirmed our innovative hypothesis, and we suggest that companies' key people take into account the construct of fear of expatriation for business health purposes. with this in mind, companies need proper advice from qualified consultants such as occupational physicians and industrial psychologists.
an investment dedicated to the prevention and protection of the h&s of expatriate workers is not only an instrument of risk assessment -in the context of the obligations under the laws of eu countries in the field (e.g., the italian legislative decree n. 81 and subsequent amendments) -but also, and moreover, a significant tool to improve the company's business. some studies have estimated that the cost associated with the failure of expatriation would be about one million usd (insch and daniels, 2002; wentland, 2003) . overall, considering the aggregate data about the american situation, punnett (1997) has calculated that us companies spend a total of up to two billion usd annually to address the failures of their expatriate managers. in such a context, it is legitimate to expect that intervention strategies -such as a careful selection of personnel to be devoted to foreign missions and the development of actions aimed at improving the real and perceived well-being in destination countries -will lead to significant returns on investment (roi) for the companies. gg, fm, jf-p, ga, and nm equally contributed to all the following issues of the research: conception and design of the work; acquisition, analysis, or interpretation of data for the work; drafting the work and critically revising it; final approval of the version to be published; agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. the supplementary material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg. 
2016.01571
structural equations modeling in practice: a review and recommended two-step approach
addressing international mobility confusion - developing definitions and differentiations for self-initiated and assigned expatriates as well as migrants
the barriers to increasing the productivity in expatriate management: examples in the world and turkey
demographics, health and travel characteristics of international travellers at a pre-travel clinic in marseille, france
an empirical investigation of terrorism-induced stress on expatriate attitudes and performance
expatriate performance in terrorism-endangered countries: the role of family and organizational support
expatriate social networks in terrorism-endangered countries: an empirical analysis in afghanistan
cultural shock, acculturation and reverse acculturation: the case of mexican citizens traveling to and back from china and their competitiveness level
managing careers: theory and practice
environmental hazards and health
adult hippocampal neurogenesis, fear generalization, and stress
input-based and time-based models of international adjustment: meta-analytic evidence and theoretical extensions
ionic liquid-based solid phase microextraction necklaces for the environmental monitoring of ketamine
self-initiated foreign expatriates versus assigned expatriates: two distinct types of international careers
individual, organizational/work, and environmental influences on expatriate turnover tendencies: an empirical study
the right way to manage expats
toward a comprehensive model of international adjustment: an integration of multiple theoretical perspectives
psychiatric manifestations as the leading symptom in an expatriate with dengue fever
alternative ways of assessing model fit
model selection and multimodel inference: a practical information-theoretic approach
real and perceived employability: a comparison among italian graduates
occupational health management system: a study of expatriate construction professionals
do as the large enterprises do? expatriate selection and overseas performance in emerging markets: the case of taiwan smes. int
testing mediation and suppression effects of latent variables: bootstrapping with structural equation models
contemplating an expatriate health care position? key factors to consider
travel and border health measures to prevent the international spread of ebola
pre-travel health advice guidelines for humanitarian workers: a systematic review
the role of prosocial and proactive safety behaviors in predicting safety performance
perceived organizational support: a meaningful contributor to expatriate development professionals' psychological well-being
intrapreneurial self-capital: a new construct for the 21st century
beyond fluid intelligence and personality traits in social support: the role of ability based emotional intelligence
effects of the economic crisis and social support on health-related quality of life: first wave of a longitudinal study in spain
dynamics of a stressful encounter: cognitive appraisal, coping, and encounter outcomes
l'uso del general health questionnaire di goldberg in una ricerca su giovani disoccupati
keeping your expatriates happy
pay, intrinsic motivation, extrinsic motivation, performance, and creativity in the workplace: revisiting long-held beliefs
stress questionnaire (sq)
posttraumatic stress disorder and exposure to trauma reminders after a terrorist attack
a scaled version of the general health questionnaire
when home isn't home - a study of homesickness and coping strategies among migrant workers and expatriates
multivariate data analysis, 7th edn
sexual health and hiv in travellers and expatriates
global inequalities at work: work's impact on the health of individuals, families, and societies. new york
formulating clinical research hypotheses as structural equation models: a conceptual overview
causes and consequences of declining early departures from foreign assignments
hiv and the returning expatriate
medical aspects of expatriate health: health threats
issues for the traveling team physician
principles and practice of structural equation modeling
emotion and adaptation
relational meaning and discrete emotions
stress, appraisal and coping
stress in expatriates
behavioural aspects of terrorism
predicting expatriate job performance for selection purposes: a quantitative review
work-related stress assessment in a population of italian workers: the stress questionnaire
consequences of client-initiated workplace violence: the role of fear and perceived prevention
mplus user's guide, 6th edn
psychometric theory
a systematic review of the literature on posttraumatic stress disorder in victims of terrorist attacks
stress and mental health in expatriates
expatriates: special considerations in pretravel preparation
asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models
effect size measures for mediation models: quantitative strategies for communicating indirect effects
towards effective management of expatriate spouses
comparison of british and french expatriate doctors' characteristics and motivations
tracing airline travelers for a public health investigation: middle east respiratory syndrome coronavirus (mers-cov) infection in the united states
reporting structural equation modeling and confirmatory factor analysis results: a review
you can take it with you: individual differences and expatriate effectiveness
vaccination for the expatriate and long-term traveler
beyond the medial regions of prefrontal cortex in the regulation of fear and anxiety
expatriate compensation: an exploratory review of salient contextual factors and common practices
the aftermath of terrorism: posttraumatic stress and functional impairment after the 2011 oslo bombing
expatriate selection and failure
differences in national legislation for the implementation of lead regulations included in the european directive for the protection of the health and safety of workers with occupational exposure to chemical agents (98/24/ec)
post-exposure prophylaxis in resource-poor settings: review and recommendations for pre-departure risk assessment and planning for expatriate healthcare workers
nationality, social network and psychological well-being: expatriates in china
how and when is social networking important? theoretical examination and a conceptual model
affective events theory: a theoretical discussion of the structure, causes and consequences of affective experiences at work
a new practical guide for determining expatriate compensation: the comprehensive model
health and social welfare of expatriates in southeast asia
international drivers in unfamiliar surroundings: the problem of disorientation
ups and downs of the expatriate experience? understanding work adjustment trajectories and career outcomes
the authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. copyright © 2016 giorgi, montani, fiz-perez, arcangeli and mucci. this is an open-access article distributed under the terms of the creative commons attribution license (cc by). the use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. no use, distribution or reproduction is permitted which does not comply with these terms.
key: cord-351411-q9kqjvvf authors: moghadas, seyed m; haworth-brockman, margaret; isfeld-kiely, harpa; kettner, joel title: improving public health policy through infection transmission modelling: guidelines for creating a community of practice date: 2015 journal: can j infect dis med microbiol doi: nan sha: doc_id: 351411 cord_uid: q9kqjvvf background: despite significant research efforts in canada, real application of modelling in public health decision making and practice has not yet met its full potential. there is still room to better address the diversity of the canadian population and ensure that research outcomes are translated for use within their relevant contexts.
objectives: to strengthen connections to public health practice and to broaden its scope, the pandemic influenza outbreak research modelling team partnered with the national collaborating centre for infectious diseases to hold a national workshop. its objectives were to: understand areas where modelling terms, methods and results are unclear; share information on how modelling can best be used in informing policy and improving practice, particularly regarding the ways to integrate a focus on health equity considerations; and sustain and advance collaborative work in the development and application of modelling in public health. method: the use of mathematical modelling in public health decision making for infectious diseases workshop brought together research modellers, public health professionals, policymakers and other experts from across the country. invited presentations set the context for topical discussions in three sessions. a final session generated reflections and recommendations for new opportunities and tasks. conclusions: gaps in content and research include the lack of standard frameworks and a glossary for infectious disease modelling. consistency in terminology, clear articulation of model parameters and assumptions, and sustained collaboration will help to bridge the divide between research and practice. improving public health policy through infection transmission modelling: guidelines for creating a community of practice. background: despite the extent of research in canada, the implementation of modelling has not yet reached its full potential in public health decision making and practice. there is room to better integrate the diversity of the canadian population and to use research results in their relevant contexts.
objectives: to strengthen ties with public health practice and broaden its scope, the pandemic influenza outbreak research modelling team partnered with the national collaborating centre for infectious diseases to organize a national workshop. the workshop aimed to identify the areas in which modelling terminology, methods and results lack clarity; to share information on the optimal use of modelling to inform policy and improve practice, notably by giving greater weight to questions of health equity; and to sustain and advance collaboration in developing and implementing modelling in public health. method: the workshop on the use of mathematical modelling in public health decision making for infectious diseases brought together research modellers, public health professionals, policymakers and other experts from across the country. invited speakers set the context for the discussions held over three sessions. a final session generated reflections and recommendations on future tasks and opportunities. conclusions: gaps in content and research include the absence of standardized frameworks and of a glossary for modelling infectious diseases. uniform terminology, clear formulation of model parameters and assumptions, and sustained collaboration will help to close the gap between research and practice. seyed m moghadas phd 1 , margaret haworth-brockman msc 2 , harpa isfeld-kiely ma 2 , joel kettner md 2 research defines a mathematical model as a framework "representing some aspects of reality at a sufficient level of detail to inform a clinical or policy question" (1).
mathematical, computational and statistical models and techniques have been applied in the canadian public health system, especially after the 2003 severe acute respiratory syndrome (sars) epidemic, but it is unclear to what degree their outcomes have been used to shape policy and improve practice. furthermore, the diversity of the canadian population has not been adequately addressed in public health models, and research outcomes are often not translated for use within their relevant contexts. to improve the applicability and impact of models in public health, it is necessary to understand areas where modelling results are unclear, the value of a common language between modelling and public health, and how to sustain and enhance the application of modelling in public health. established during the early stages of the 2009 h1n1 pandemic in canada, the pandemic influenza outbreak research modelling (pan-inform) team has a mandate to develop innovative modelling frameworks and knowledge translation methods that inform public health by linking theory, policy and practice. aligned with its mandate, on october 6 and 7, 2014, pan-inform held its fourth biannual workshop (2) (3) (4) cohosted by the national collaborating centre for infectious diseases. this workshop brought together public health practitioners and leading research modellers (a list of attendees is available at ) to enhance cross-discipline communications by providing a forum for knowledge to flow freely in a 'jargon-free' setting. this open-access article is distributed under the terms of the creative commons attribution non-commercial license (cc by-nc) (http://creativecommons.org/licenses/by-nc/4.0/), which permits reuse, distribution and reproduction of the article, provided that the original work is properly cited and the reuse is restricted to noncommercial purposes. for commercial reuse, contact support@pulsus.com
the expected outcome was to identify the infrastructure, expertise and resources necessary to establish a 'communities of practice' (cop) network. the cop concept, initially developed by jean lave and etienne wenger, refers to groups of people who share a concern, a set of problems or a passion about something they do, and learn how to do it better by interacting regularly (5) (6) (7) (8) . the proposed cop, as a new initiative to be catalyzed from this workshop, would offer new approaches to addressing problems at different levels of health care and population health and enable the development of strategic plans to move evidence to action. an important role for this cop is to develop a common language that can be used in understanding the outcomes of health research and disease modelling. the workshop objectives were to: understand areas where modelling terms, methods and results are unclear; share information on how modelling can best be used in informing policy and improving practice, particularly regarding the ways to integrate a focus on health equity considerations; and sustain and advance collaborative work in the development and application of modelling in public health. the two-day event unfolded in four sessions. the first two, "modelling in public health: opportunities and challenges" and "mathematical modelling in public health practice", helped to set the context regarding scientific methods and research applications, particularly for the evaluation of research uptake and provided public health perspectives on the utility of modelling in decision making. the third session, "muddling through modelling: communication, common language and health equity", spurred discussion regarding the need for common language between researchers and knowledge users, to improve the use of modelling study results. this included foundational issues regarding access to data and the involvement of indigenous and other community representation. 
in the final session, "developing our network and communities of practice", participants reflected on earlier presentations and discussions to clarify what is needed to continue collaboration and knowledge exchange that can increase the value of research modelling in public health. the presentations and discussions that ensued created compelling arguments that the most prominent and observable outcomes can be achieved when communication barriers between disciplines are eliminated. the present report discusses key presentations and discussions that took place, and summarizes the outcomes and action plans that emerged from the workshop. setting the context: opportunities and challenges for modelling in public health sessions 1 and 2 of the workshop began with a presentation regarding the development of infectious diseases modelling in canada. before the 2003 sars epidemic, modelling activities were largely driven by research interests of individuals or small groups, with a significant emphasis on the theoretical aspects of exploring complex mathematical phenomena. for the most part, these activities were carried out in isolation, with minimal communication and engagement with public health professionals and policymakers (2) . during and following the sars epidemic, various groups of disease modellers were formed to engage with, and develop models for application to public and population health in more specific contexts. despite the importance and relevance of these initiatives, knowledge translation remained a challenge that the pan-inform was established in part to address (9). the canadian institutes of health research supported the establishment of pan-inform to address the limited knowledge exchange between modelling researchers and those who could potentially make use of models to inform health policy and improve practice. 
since its inception, pan-inform has undertaken several national initiatives for knowledge brokering, including the evaluation of canada's response to the spring and winter waves of the novel h1n1 pandemic, identification of strategies for protecting vulnerable populations from emerging infectious diseases, and development of approaches that can enrich existing links with aboriginal health organizations and foster multijurisdictional collaborative efforts in canada (2) (3) (4). in canada, public health decision making occurs across all orders of government. situations are often very complex for a number of reasons, including the availability and adequacy of health resources; inconsistent or absent evidence regarding the effectiveness and cost effectiveness of intervention strategies; pressure from the public, media and government under which public health must operate; other competing public health services; and ethical considerations to balance the protection of community health against individuals' rights and freedoms. other pertinent challenges include the lack of data to estimate potential outcomes of a public health program, occasional paralysis resulting from having too much information, differing opinions and short timelines. in this context, as one presenter put it, decision makers face three questions: what is the benefit of the public health program or intervention; who will benefit from the program; and is the program cost-effective? often, the evidence to answer these questions is not available in a timely manner. ideally, one would address these questions by investigating the effects experimentally. however, controlled trials may not be feasible or ethical, and can also be time consuming, laborious, expensive or inconclusive.
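the third of the decision questions above (is the program cost-effective?) is commonly answered with an incremental cost-effectiveness ratio (icer): the extra cost of a program divided by the extra health benefit it delivers relative to an alternative. the sketch below is a minimal illustration of that calculation; the function name and all numbers are hypothetical, not figures from the workshop.

```python
# illustrative incremental cost-effectiveness ratio (icer) calculation;
# all numbers are hypothetical, not estimates from the workshop.

def icer(cost_new, cost_old, effect_new, effect_old):
    """cost per additional unit of health effect (e.g., per qaly gained)."""
    d_effect = effect_new - effect_old
    if d_effect == 0:
        raise ValueError("interventions have equal effect; icer is undefined")
    return (cost_new - cost_old) / d_effect

# hypothetical comparison: a vaccination program vs. the status quo
ratio = icer(cost_new=2_500_000, cost_old=1_000_000,
             effect_new=1_200.0, effect_old=900.0)  # effects in qalys
print(f"icer: ${ratio:,.0f} per qaly gained")
```

in practice the ratio is compared against a willingness-to-pay threshold set by the decision maker; the threshold itself is a policy choice, not a model output.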
models provide a useful tool to overcome these challenges and systematically evaluate possible effects by using existing data and knowledge, generating quantitative outcomes and mapping out interdependencies that may be key factors for determining policy needs. given these capabilities, models can be used to identify key uncertainties in the parameters and generate qualitative predictions, such as the effect of behavioural changes on the trends and distribution of an infection in the population. indeed, the overarching goal of modelling is to support evidence-based public health policy. to enhance the utility of models, communication and collaboration between modellers and public health leaders must take place early in the decision-making process. models are more valuable when end users are engaged in formulating the questions, because the models can then be built to truly reflect the public health question. end users who understand a model are likely to be better able to assess its results. during the construction and validation of a model, the relevance and importance of input parameters must be understood, and the sources for their values and ranges, uncertainty about the parameters, and sensitivity of the model outcomes with respect to parameter variation and original model assumptions must be determined. new knowledge generated by a model should address the target question and be translated and disseminated for uptake and action appropriate to the context. furthermore, when data are limited, it is essential to quantify any uncertainty in parameterization, because different sets of parameters may fit equally well. ideally, the process of improving the model structure and its outcomes is iterative.
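compartmental models of the kind discussed here turn a handful of parameters into quantitative outcomes, and scanning plausible parameter ranges is one simple way to expose the sensitivity to parameter variation mentioned above. the sketch below integrates a minimal seir system with a forward-euler step and compares several r0 values; it is illustrative only, and every parameter value is an assumption rather than a calibrated estimate.

```python
# minimal seir model integrated with forward euler; all parameter values
# are illustrative assumptions, not calibrated estimates.

def seir(r0, incubation=5.0, infectious=7.0, n=1_000_000,
         e0=10, days=500, dt=0.1):
    beta = r0 / infectious          # transmission rate
    sigma = 1.0 / incubation        # e -> i rate
    gamma = 1.0 / infectious        # i -> r rate
    s, e, i, r = float(n - e0), float(e0), 0.0, 0.0
    peak_i = 0.0
    for _ in range(int(days / dt)):
        new_e = beta * s * i / n * dt   # s -> e
        new_i = sigma * e * dt          # e -> i
        new_r = gamma * i * dt          # i -> r
        s -= new_e
        e += new_e - new_i
        i += new_i - new_r
        r += new_r
        peak_i = max(peak_i, i)
    return peak_i, r / n                # peak prevalence, attack rate

# a crude uncertainty scan: equally plausible r0 values imply very
# different epidemic sizes, which is why parameter uncertainty must be
# quantified and reported alongside model outcomes.
for r0 in (1.5, 2.0, 2.5):
    peak, ar = seir(r0)
    print(f"r0={r0}: peak infectious ~{peak:,.0f}, attack rate ~{ar:.0%}")
```

a real analysis would sample the full parameter space (for example with latin hypercube sampling) rather than varying one parameter, but the qualitative point is the same: outcome ranges, not point estimates, are what should reach decision makers.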
the value of direct conversations between modellers and public health leaders, in particular with regard to the availability of and access to data and other critical information that are essential for model inputs in real-time scenarios (10), was exemplified in the use of modelling and the implementation of model recommendations for antiviral use and vaccination in canada's response to the 2009 h1n1 pandemic (11,12). table 1 summarizes key issues presented and discussed for modelling in public health during the workshop. the international society for pharmacoeconomics and outcomes research guidelines highlight the importance of a common language for drafting a health decision question and addressing it through a modelling framework. the guidelines for transparency and validation state: every model should have non-technical documentation that is freely accessible to any interested reader. at a minimum, it should describe, in nontechnical terms, the type of model and intended applications; funding sources; structure of the model; inputs, outputs and other components that determine the model's function and their relationships; data sources, validation methods and results; and limitations (1). good communication flows to and from knowledge producers and users, and requires a common language to build effective partnerships and an understanding of the groups' respective concerns. there are a number of challenges to developing a common language: determining a common lexicon; understanding priorities and contributions, which may shift depending on the political climate or population health status; asking the right questions, appropriate to the given context; knowing the right audience; and being able to communicate findings to others outside the research community. the lack of such a common language may have been an impediment to addressing key parameters in 'determinants of health' and 'health equity'.
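one lightweight way to honour the documentation items quoted above is to make them a structured record that travels with the model. the sketch below mirrors the guideline's minimum fields; the class name and all example values are hypothetical illustrations, not part of the ispor-smdm guideline itself.

```python
# a structured "model documentation" record whose fields mirror the
# minimum items in the transparency guideline quoted in the text;
# the example values below are hypothetical.

from dataclasses import dataclass

@dataclass
class ModelDocumentation:
    model_type: str
    intended_applications: str
    funding_sources: str
    structure: str
    inputs: list
    outputs: list
    data_sources: list
    validation: str
    limitations: str

doc = ModelDocumentation(
    model_type="deterministic seir compartmental model",
    intended_applications="exploring school-closure timing scenarios",
    funding_sources="hypothetical public research grant",
    structure="four compartments (s, e, i, r) linked by odes",
    inputs=["r0", "incubation period", "infectious period"],
    outputs=["daily incidence", "peak prevalence", "final attack rate"],
    data_sources=["illustrative values from published literature"],
    validation="outcomes compared against historical outbreak curves",
    limitations="homogeneous mixing; no age structure; uncertain parameters",
)
print(doc.model_type)
```

keeping such a record alongside the code makes the non-technical documentation a by-product of model development rather than an afterthought, and gives knowledge users a fixed checklist to interrogate.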
in the canadian context, one needs to take into account the differential health status and population structure of first nations, inuit and métis people, population-level patterns of abuse, poverty and historical trauma, challenges regarding access to health services in rural and remote areas, and limits in identifying "vulnerable" populations in available datasets with no real markers. building partnerships and an iterative exchange allows goals and facts to be clearly identified, and outcomes to be assessed for their value in informing decisions about the potential benefits and risks of policy development and program delivery. effective partnerships require willingness and commitment, alignment of values, mechanisms to engage early and continuously, and plans to regularly review goals, objectives, roles, responsibilities and outcomes. a recent review of the literature highlights the inconsistency in definitions and interpretations of epidemiological terms in several modelling studies and the need for a common language to sustain and enhance the application of models in public health (13). the review found that disparate outcomes and interpretations for policy decisions may arise from inconsistent use of terms in model structures, even when the assumptions and input parameters are identical. discrepancies in how terms are used in modelling generally have two main causes. first, it is often assumed that particular terms are well defined or well understood. for example, 'infectiousness' and 'infectious' were found to be used interchangeably; the former describes a characteristic of the disease and/or how readily the disease is transmitted, while the latter describes a patient state (13). second, the definitions of some terms have drifted over time as understanding of the mechanisms of disease processes and control has evolved.
for example, the way terms such as 'prevention', 'protection' and 'reduced susceptibility' are used in relation to communicable disease may lead to different results depending on how they are used in modelling. developing a common language will help to reduce possible variation in study results produced by different research communities. this will in turn decrease misinterpretation of the outcomes by allowing for comparisons of scientific evidence from multiple disciplines involving health research, and helping knowledge users and policymakers to better understand research outcomes and their applicability to policy and practice.
table 1. key issues presented and discussed for modelling in public health during the workshop:
- modelling has an important place in public health policy and practice, but its utilization has been far less than its potential in the canadian context. action: create a national infrastructure or network in canada to develop useful and applicable models based on realistic assumptions and quality data.
- closer working relationships: collaboration, engagement and exchange between modellers and policymakers are needed to facilitate iterative processes that optimize the value and understanding of models and their results. action: identify partners at the provincial level within acute care, emergency services and public health divisions; formalize exchange processes for regular communication and education.
- applying health equity and other lenses: limited attention has been paid to using health equity or sex and gender analyses, and the availability of aboriginal-specific information has been inconsistent at best. action: modellers and users can be called on to create model frameworks and ask questions that will provide better information about where there are inequities and inequalities; involve the people who understand equity issues.
- data quality and access: access to good-quality, population-level data is essential to validate a model and its outcomes; such data may not necessarily be available or accessible in a timely fashion during an emerging infectious disease. action: evaluate data quality and the type of information provided by surveillance for its potential to be used for research modelling; engage with provinces to determine the nature and availability of data required for modelling.
- standardization of approaches: to develop useful models, three aspects of the modelling will need to be standardized: what (ie, frameworks that are context specific and take into account the population demographic and geographic characteristics); who (ie, involvement of policymakers, knowledge users and modellers with relevant expertise); and how (ie, develop an iterative process from the formulation of health policy questions to the dissemination of model outcomes). action: a communities of practice network can be tasked with the standardization of this process to ensure that synergies exist when models are formulated to inform clinical or health policy decisions.
- roles and responsibilities: clarification of the roles of health agencies and jurisdictions is needed to engage partners from academic institutes, government health organizations and health industries. action: the national collaborating centre for infectious diseases will lead the initiative to forge the linkages and develop appropriate channels and effective methods of communication between the involved partners.
- capacity: some jurisdictions lack modelling capacity, and there is a lack of information about which modellers are available to work with public health and what their expertise is. action: a centralized list or network could contribute to greater capacity for public health jurisdictions; develop opportunities for public health personnel to learn more about models and their value.
there are other factors responsible for variation in model findings, including different strategies or approaches and assumptions, different population demographic variables, and the objectives for evaluating policy effectiveness, which can vary from one situation to another. the latter can be exemplified by two recent studies on the effectiveness of school closure during pandemic influenza outbreaks. when assessing the effect of school closure strategies in reducing community attack rates, halder et al (14) found that, due to the difficulty in determining the true degree of epidemic spread and its severity in the early stages of an outbreak, a strategy of individual school closures would be more effective than simultaneous closures across a region. the outcomes are drawn from an agent-based simulation model of albany, a small community in western australia with a population of approximately 30,000 individuals. in contrast, to evaluate the impact of local reactive school closures on critical care provision in the united kingdom population setting, house et al (15) concluded that school closures should be coordinated in time (simultaneous) and location (all schools within a school district) to become an effective strategy to reduce infection transmission and, consequently, relieve capacity pressures on hospital intensive care unit admissions. the population demographics and the objectives for closing schools are distinctly different between the two studies, suggesting that different modelling approaches are required for measuring the effectiveness of school closures. understanding scenario-specific outcomes and their applications requires a critical evaluation to address the following questions: • is the methodology appropriate for the specific population setting? • do the assumptions and parameters address the reality of demographic and geographic characteristics? • can the outcomes be compared with other studies and validated with observed data?
• how generalizable are the outcomes to address different scenarios or population settings? a consensus emerged during the workshop regarding the need to develop a common language for modelling to enhance its application in a public health context and promote bidirectional communication (table 2). to address this need, the fourth session of the workshop provided an opportunity for participants to discuss the establishment and potential impact of a community of practice. during the final discussion session, a number of important issues related to the development of a cop network were discussed, including its structure and governance, leadership and research capacity, memberships and partnerships, strategic plans for sustainability and resources, and the impact and uptake of outcomes (table 3). the october 2014 national workshop propelled new discussion on the value of mathematical models in public health planning and the need for greater cohesion and collaboration among stakeholders. the workshop concluded with a consensus among participants that there is work to be done and a willingness to continue working together. the creation of a common lexicon is a tangible initial task that should be undertaken as an immediate response to the workshop discussions. we expect that, through sustained cross-disciplinary dialogue, a cop will initially produce a 'book of terminology' that describes current usage and proposes common terminology (community standards) in different areas, including medical and infectious diseases epidemiology, public health and disease modelling. this reference book can then be updated regularly when new terms need clarification for shared understanding and agreement in use.
furthermore, in times of uncertainty, the virtual cop network will provide opportunities to access, analyze, synthesize and utilize reliable information and databases in a timely fashion, and drive a broad consensus around plausible alternatives and integrated courses of action. it is also true, however, that ongoing discussions between modellers and public health personnel will help to clarify language use and break down perceived barriers.
references:
- ispor-smdm modeling good research practices task force. modeling good research practices-overview: a report of the ispor-smdm modeling good research practices task force-1
- managing public health crises: the role of models in pandemic preparedness
- canada in the face of the 2009 h1n1 pandemic
- indigenous populations health protection: a canadian perspective
- the development, design, testing, refinement, simulation and application of an evaluation framework for communities of practice and social-professional networks
- communities of practice: an opportunity for interagency working
- communities of practice: learning, meaning and identity
- pandemic influenza outbreak research modelling team (pan-inform), fisman d. modelling an influenza pandemic: a guide for the perplexed
- pandemic influenza: modelling and public health perspectives
- annex e. the use of antiviral drugs during a pandemic
- review of terms used in modelling influenza infection
- developing guidelines for school closure interventions to be used during a future influenza pandemic
- modelling the impact of local reactive school closures on critical care provision during an influenza pandemic
- modelling for public health (mod4ph)
21% of the total reported deaths worldwide) so far. however, as the new cases, hospital admissions, and deaths began to decline in mid-may, most states in the u.s. began phased lifting of their social intervention measures.
for example, florida adopted a three-phased approach: phase i (which began on may 18, 2020) allowed most businesses and workplaces to reopen with up to 50% of their building capacities and with large events constrained to 25%; phase ii began on june 5, 2020 and allowed all businesses to reopen for up to 50-75% of their capacities, also permitting events in large venues at no more than 50% of their capacities; phase iii will be akin to a complete reopening, for which neither a date nor the criteria have been declared. a summary of florida's phased intervention plan can be seen in figure a2 (in the appendix). as the reopening entered phase ii, florida, along with many other states, began to see sharp increases in daily new infections (e.g., florida reported over 15,000 new cases on july 11, 2020, along with a test positivity rate reaching over 15%). in this paper, we investigate a few 'what-if' scenarios for social intervention policies, including: if the stay-at-home order were not lifted; if the phase ii order continues unaltered; what impact universal face mask usage will have on infections and deaths; and, finally, how the benefits of contact tracing vary with various target levels for identifying asymptomatic and pre-symptomatic cases. we conduct our investigation by first developing a comprehensive agent-based simulation model for covid-19, and then using a major urban outbreak region (miami-dade county hospitalization (if infected with acute illness); and 10) recovery or death (if infected). the ab model reports daily and cumulative values of actual infected, doctor visits, tested, reported cases, hospitalized, recovered, and deaths, for each age category. a schematic diagram depicting the algorithmic sequence and parameter inputs for the ab simulation model is presented in figure 1. our ab simulation model works as follows. it begins by generating the individual people according to the u.s.
census data that gives population attributes including age (see table a1) and occupational distribution (see table a4). thereafter, it generates the households based on their composition, characterized by the number of adults and children (see table a2). the model also generates, per census data, schools (see table a3) and the workplaces and other community locations (see table a4). each individual is assigned a household, while maintaining the average household composition, and, depending on age, either a school or a workplace (considering employment levels). a daily (hour-by-hour) schedule is assigned to every individual, chosen from a set of alternative schedules, based on their attributes. the schedules vary between weekdays and weekends and also depend on the prevailing social intervention orders (see table a5). simulation begins on the day when one or more infected people are introduced to the region (referred to as simulation day 1). the simulation model tracks hourly movements of each individual (susceptible and infected) every day, and records, for each susceptible, the number of infected contacts and their identification at each location. based on the level of infectiousness of each infected contact (which depends on the day of his/her infectiousness period), the model calculates the daily force of infection received by each susceptible from all infected contacts at all hours of the day [18]. the daily force of infection is considered to accumulate. however, it is assumed that if a susceptible does not gather any additional force of infection (i.e., does not come in contact with any infected) for two consecutive days, the cumulative force of infection for the susceptible reduces to zero. at the end of each day, the infection status and disease stage of each individual are updated (see table a6 for average lengths of the periods). epidemiological models and other parameters that guide the ab model are described next.
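the accumulation-and-reset bookkeeping described above (force of infection accumulates daily, and resets after two consecutive days without infected contacts) can be sketched as follows; the class and field names are illustrative, not taken from the authors' code.

```python
class Susceptible:
    """tracks the cumulative force of infection for one susceptible agent."""

    def __init__(self):
        self.cumulative_foi = 0.0
        self.days_without_contact = 0

    def end_of_day(self, foi_today):
        """accumulate today's force of infection; reset to zero after two
        consecutive days with no infected contacts, per the model's rule."""
        if foi_today > 0.0:
            self.cumulative_foi += foi_today
            self.days_without_contact = 0
        else:
            self.days_without_contact += 1
            if self.days_without_contact >= 2:
                self.cumulative_foi = 0.0
```

note that a single contact-free day leaves the accumulated force intact; only the second consecutive one clears it.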
figure 3 presents a schematic of the disease natural history of covid-19, parameters of which are given in table a6. once infected, an individual simultaneously begins the latency and the incubation periods. the individual becomes infectious after the latent period is complete but displays symptoms (unless asymptomatic) at the end of the incubation period. the period between the end of latency and the end of incubation is referred to as pre-symptomatic, a time when infectiousness grows rapidly and almost reaches its peak. symptomatic cases either follow a non-acute progression (the majority of cases, not requiring hospitalization) or an acute progression (requiring hospitalization). cases for whom the disease does not become acute enter a recovery period after infectiousness ends. those with acute disease progression are hospitalized (generally toward the end of the infectious period). after the hospital stay period, cases either recover or die. for average lengths of the recovery and hospitalized periods used in the ab model, see table a6. there is some evidence based on animal experimentation that recovered individuals may become immune to reinfection [31, 40], but other studies remain inconclusive [32]. hence, due to the lack of established data on this matter, our model considers the recovered cases to be immune to further covid-19 infections. the duration and intensity of infectiousness is considered to be guided by a lognormal density function (see figure 4). the function is truncated at the average length of the infectiousness period (which is considered to be 9.5 days). asymptomatic cases are assumed to follow a similar infectiousness intensity profile but scaled by a factor (used in the force of infection calculation (1); see table a7). the ab model estimates the probability of infection for a susceptible using the accumulated value of the daily force of infection, which is calculated as follows.
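a minimal sketch of this calculation, assuming a two-component form of (1) (household and workplace/community, per the description that follows) with the truncated lognormal infectiousness profile; the shape parameters, transmission coefficients, and the exponential closing formula are illustrative assumptions, not the authors' values.

```python
import math

def infectiousness(t, mu=0.0, sigma=0.6, t_max=9.5, asymptomatic_scale=1.0):
    """relative infectiousness at t days after the latent period, following
    a lognormal density truncated at the 9.5-day average infectiousness
    period; mu and sigma are illustrative shape parameters, and
    asymptomatic cases reuse the same profile scaled down by a factor."""
    if t <= 0.0 or t > t_max:
        return 0.0
    x = (math.log(t) - mu) / sigma
    return (asymptomatic_scale * math.exp(-0.5 * x * x)
            / (t * sigma * math.sqrt(2.0 * math.pi)))

def daily_force_of_infection(home_contacts, workplace_contacts,
                             beta_home, beta_work):
    """two-component force of infection: one term for infected household
    members, one for infected contacts at schools/workplaces/community
    errand locations; beta_home and beta_work stand in for the
    transmission coefficients of table a7, and each list holds the
    current infectiousness values of the infected contacts."""
    return (beta_home * sum(home_contacts)
            + beta_work * sum(workplace_contacts))

def probability_of_infection(cumulative_foi):
    """map the accumulated force of infection to a daily infection
    probability; the form 1 - exp(-foi) is a common choice assumed here,
    as the closing formula is not spelled out in the text."""
    return 1.0 - math.exp(-cumulative_foi)
```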
the first component in (1) accounts for the force experienced by a susceptible individual at home from infected household members. the second component captures the force experienced at schools/workplaces/community places for work and also at community places visited for daily errands; this happens when a susceptible is in the same location type where an infected individual is at a given hour. the definitions and values of the parameters of (1) are given in table a7. equation (1) is a modified version of the force of infection equation given in [18], which has three components that separately calculate the force of infection received at home, at indoor workplaces, and in the outdoor community. for the sake of simplicity, we have considered only the first two components, home and indoor workplaces, where most covid-19 transmission is assumed to be taking place. we have assumed that the mode of virus transmission at indoor community places that are routinely visited by people as part of their daily errands (like grocery stores, home goods stores, dine-in/take-out restaurants, etc.) is similar to that of indoor workplace transmission. the force of infection is gathered by a susceptible individual each day from all infected contacts in his/her mixing groups (home, school/workplace, and community places). the cumulative value is used at the end of each day to calculate the probability of infection. the ab model incorporates all applicable intervention orders, like stay-at-home, school and workplace closure, isolation of symptomatic cases at home, and quarantine of household members of those who are infected.
the model also considers: varying levels of compliance for isolation and quarantine, lower on-site staffing levels of essential work and community places during the stay-at-home order, restricted daily schedules of people during various social intervention periods, phased lifting of interventions, use of face masks in workplaces, schools and community places with varying compliance levels, and contact tracing with different target levels to identify asymptomatic and pre-symptomatic cases. the timeline for social interventions implemented in the model is summarized in table a8. other salient considerations in the implementation of our ab model are as follows. across all age groups, 35% of the infected cases were considered asymptomatic [7]. approximately twenty percent (20%) of florida residents are reported as uninsured and do not have access to a primary care physician [49]. uninsured people were thus considered not to have the doctor referral required for most of the testing facilities in miami-dade county, and hence were not tested. all symptomatic cases with health insurance were assumed to visit/consult with a doctor. depending on their symptoms, travel history, and contact history, some of them were given referrals for testing. we considered that only a small percentage of cases visiting/consulting a doctor were given referrals in the early months of the pandemic (until the middle of april 2020), due to the shortage of testing and restrictive cdc guidelines for who could be tested [6]. however, as the cdc relaxed its test eligibility guidelines [39] and the capacity to test increased in florida, we gradually increased the probability of getting a test referral from a doctor to closer to 100% by early june 2020 for symptomatic cases (see table a9).
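the gradually increasing referral probability described above can be sketched as a simple ramp; the linear shape, starting probability, and day indices are illustrative assumptions (the paper's actual time-varying values are in its table a9).

```python
def referral_probability(day, ramp_start, ramp_end, p_start=0.1, p_end=1.0):
    """probability that a symptomatic, insured case consulting a doctor
    receives a test referral: low during the early shortage period,
    ramped (here, linearly) to near-100% as test capacity grows."""
    if day <= ramp_start:
        return p_start
    if day >= ramp_end:
        return p_end
    frac = (day - ramp_start) / (ramp_end - ramp_start)
    return p_start + frac * (p_end - p_start)
```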
we also considered in our model that a small fraction (reaching only up to 10% over time) of the asymptomatic cases are randomly tested through various community testing protocols, e.g., at elderly care facilities, healthcare facilities, workplaces, etc. note that we did not consider co-infection, and therefore all cases that were tested in our simulation model had covid-19. hence, each test yielded a positive outcome with a probability equal to the test sensitivity (see table a9). based on the data reported on the florida covid-19 dashboard, a test result reporting delay of up to 10 days on average was considered at the start of the pandemic, which was progressively reduced (see table a9). all symptomatic cases, with or without testing, were considered to isolate at home with a given probability of compliance. the probability of compliance was considered to vary during the length of the symptomatic period of infection. for this purpose, we divided the symptomatic period into three parts: i, ii, and iii, and assumed a lower isolation compliance in parts i and iii and higher in part ii, when the illness is more apparent. see table a10 for the isolation compliance probabilities. susceptible members of households with one or more infected cases are considered to quarantine themselves. we also assumed a level of compliance for the quarantine (see table a10). we used hospitalization and death data reported for miami-dade county [20] for each age group to obtain probabilities of hospitalization of the reported cases, and probabilities of death for those who are hospitalized (see table a11). though we have implemented our ab simulation model for a specific region, it is quite general in its usability for other urban regions with similar demography, societal characteristics, and intervention measures.
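the three-part isolation compliance profile described above can be sketched as follows; the equal thirds and the two probability levels are illustrative placeholders for the values in the paper's table a10.

```python
def isolation_compliance(day, symptomatic_length, p_low=0.5, p_high=0.9):
    """compliance with home isolation over the symptomatic period:
    lower in parts i and iii, higher in part ii when illness is most
    apparent; the period is split into equal thirds here for simplicity."""
    third = symptomatic_length / 3.0
    if day < third or day >= 2.0 * third:
        return p_low   # parts i and iii
    return p_high      # part ii
```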
in our model, tables a1-a4 summarize the demographic inputs (age and household distribution, number of schools for various age groups, and number of workplaces of various types and sizes). these data will need to be curated from both national and local census records. social interventions vary from region to region, and hence the data in table a8 will need to be updated. similarly, testing availability, test sensitivity, and test outcome reporting delay may also vary significantly from region to region, and thus table a9 will also need to be updated. the rest of the data (in tables a5, a6, a7, a10, and a11) are related to the epidemiology of covid-19. these are unlikely to be significantly different, though some adjustments based on population demographics may be needed. the ab model utilizes a large number of parameters: demographic parameters, epidemiological parameters, and social intervention parameters. we kept almost all of the above parameters fixed at their respective calibrated values. once the model was calibrated and validated with available reported data on infected and dead, we extended the model into the future to predict outcomes. the only parameters that were altered after the calibration period were those reflecting expected changes in social interventions, e.g., an order mandating use of face masks, re-closing of some community places, an expected increase in contact tracing, and changes in community response via daily schedule restrictions. hence, the parameters that were changed after the calibration period included those for daily schedules, transmission coefficients, testing and contact tracing rates, and compliance to isolation and quarantine. most of the parameter values used in the ab model were obtained from government archives and research literature, for which references are provided (see tables a1-a11). for some of the parameters for which we could not find an archived data source, we used expert opinion and current media reports.
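the split between parameters held fixed after calibration and those updated for prediction runs can be summarized as a small configuration; the dictionary keys are illustrative names, not identifiers from the authors' code.

```python
# parameter grouping implied by the text: demographic and epidemiological
# inputs stay fixed after calibration, while intervention-driven
# parameters are the only ones altered for prediction runs.
FIXED_AFTER_CALIBRATION = {
    "age_distribution": "table a1",
    "household_composition": "table a2",
    "disease_natural_history": "table a6",
    "hospitalization_death_probabilities": "table a11",
}
UPDATED_FOR_PREDICTION = {
    "daily_schedules": "restricted during intervention periods",
    "transmission_coefficients": "reduced under face mask mandates",
    "testing_and_tracing_rates": "increased over time",
    "isolation_quarantine_compliance": "varies with policy",
}
```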
we first present a summary of the key results of our study (see table 1), from which a number of key insights can be derived that may apply to other similar urban regions experiencing respiratory/influenza-type virus outbreaks. early imposition of the stay-at-home order appears to have been quite effective in first flattening and then reversing the growth curve. per our model, if the stay-at-home order were allowed to remain enforced, the pandemic would have subsided with a relatively low percentage of the population (5.8%) infected and approximately 0.04% dead within six months of inception; 50 or fewer daily new infections was used as the criterion for considering that the pandemic had subsided in miami-dade county. if the extent of social mixing akin to phase ii reopening of florida is in place for an urban region (without the use of face masks and contact tracing), the pandemic would likely have raged for 8-9 months and subsided only after reaching herd immunity, with over 75% of the population infected and 1.3% of the population dead. universal use of face masks of the surgical variety was shown by the model to reduce average total infected, hospitalized, and dead by 20%, 19%, and 15%, respectively. aggressive contact tracing with a goal to identify 50% of the asymptomatic and pre-symptomatic cases was also projected to have a very significant positive impact, with an average reduction of 66% in total infected. the average reductions in total infected with 40%, 30%, and 20% contact tracing targets were found to be 58%, 41%, and 14%, respectively. in what follows, we expound the results from our study.
[figure 6: average cumulative infected (fig. (a)) and hospitalizations and deaths (fig. (b)) if the stay-at-home order were not lifted.]
figure 6 shows a strong influence of continuing with the stay-at-home order in curbing covid-19 growth within approximately 6 months from its inception, with on average less than 5.8% of the population infected, 0.15% hospitalized, and 0.037% dead; 50 or fewer daily new infections was used as the criterion for considering that the pandemic had subsided in miami-dade county. such a quick suppression of a virus outbreak always leaves the possibility of resurgence, for which an effective plan of contact tracing, testing, isolation, and support for those isolated (when needed) should be in place.
[figure 7: outcomes of continuing the phase i order (fig. (a) and fig. (b)) and phase ii reopening without face masks and contact tracing (fig. (c) and fig. (d)).]
figure 7 shows the expected outcomes of continuing with the phase i order and the phase ii order. regarding face mask usage, it is shown in studies of prior outbreaks that the adjusted odds ratio (aor) of getting an infection while wearing a surgical-variety face mask versus not wearing a face mask is 0.33 on average [12]. this can be interpreted as the likelihood of getting infected while wearing a surgical-variety face mask being one third of what it would be without a mask. hence, we considered a 67% reduction in the transmission coefficient used in calculating the force of infection (see equation (1)), assuming 100% compliance in the use of surgical-variety masks at workplaces, schools and community places. we also tested the impact of 30% and 45% reductions in the transmission coefficient value, which translate to approximately 50% and 70% compliance for face mask usage, respectively. the anticipated impact of face mask usage together with the phase ii order on the average cumulative numbers of infected is shown in figure 8(a). it also depicts the risk difference between the average values of cumulative infected without and with the universal use of face masks.
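the mask-compliance arithmetic above can be made explicit with a small helper: a 67% per-contact reduction (aor of 0.33 per the cited meta-analysis), with partial compliance assumed (here, linearly) to scale the reduction, which reproduces roughly the 30% and 45% reductions the authors pair with 50% and 70% compliance.

```python
def effective_transmission_coefficient(beta, compliance, mask_reduction=0.67):
    """scale the transmission coefficient for mask usage: full compliance
    cuts transmission by mask_reduction; partial compliance is assumed
    to cut it proportionally (a linear-mixing simplification)."""
    return beta * (1.0 - mask_reduction * compliance)
```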
it may be noted that, since infections grow slower with the use of face masks, the cumulative risk difference rises to almost 875k in the middle of august and then settles down close to 430k when the pandemic is predicted to subside by the end of november 2020. figure 8(b) depicts the daily values of the average infected for phase ii without and with a universal face mask policy. as expected, the peak of daily infections with face mask usage is shifted to a slightly later date, and the downward trend begins after a smaller percentage (31%) of the total population is infected, compared to 36% without the use of face masks. our agent-based model has several limitations. first and foremost, the simulation model is an abstraction of how a pandemic impacts a large and complex society. though our model deliberately introduces some variabilities, somewhat pre-defined daily schedules are used to approximate a highly dynamic contact process of an urban region. also, the contact process does not account for significant variabilities in the types and lengths of interactions even within each mixing group. we did not assign geographic locations (latitude and longitude) to households, businesses, schools, and community places, and assumed them to be uniformly distributed over the region. it is common for urban population centers and associated establishments to grow in clusters, for which the contact patterns are expected to be different from those in uniformly dispersed regions. we did not consider special events like parties, games, and street protests, some of which are known to have caused superspreading of the virus and case increases. finally, and perhaps most importantly, the model uses a large number of parameters (listed in tables a1-a11). each scenario of our case study with 10 replicates (with different seeds) takes approximately 8-12 hours to run on a standard desktop computer with an intel core i7 and 16gb memory.
in the interest of presenting our observations quickly to public health decision makers, while covid-19 is still rampant in the region, we chose to use a limited number (10) of replicates. as the main purpose of this paper is to conduct a broad what-if analysis, we do not believe that the use of a small number of replicates has negatively influenced our observations. the trends and observations derived from our results are only intended to be used for the planning and guidance of public health decision makers. as part of our continuing (future) work, we plan to use our model to examine the impact of the reopening of k-12 schools and colleges/universities for the new academic year, which began in late august and early september. we also plan to use our ab model to assess the efficacy of various prioritization strategies (based on age, risk, and work groups) for the vaccines that are anticipated to be available in limited quantities by the beginning of 2021. the authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
county can be found in [a11].
c) use of face mask: boston implemented a face mask policy in early may in an executive order by the state [a9].
d) implementation of contact tracing: the massachusetts state government provides a dashboard on community health outcomes for covid-19. details on the success of contact tracing in the communities, along with the outcome measures varying over time, can be found in [a10].
e) policy for returning to school: school reopening policies also vary widely from state to state and also among counties within a state. information on boston's public school reopening policy can be found in [a12]. it is important to frequently check sources on school policy, as they are transient. for example, boston planned to reopen on october 15, but shifted to october 22 after seeing an increase in the number of cases.
10) time-varying testing of symptomatic and asymptomatic: limited testing availability has been a serious concern in many u.s. regions that suffered from a high level of disease spread. time-varying data on test availability and test outcome reporting delay are difficult to find in the indexed literature during a pandemic. hence, these can be assumed from regional news reports, test reporting data, and/or other grey literature.
11) number of reported, hospitalized and dead: daily data and archived data on the number of people reported positive, hospitalized and dead can be found from the dashboard in [a10].
12) probability of hospitalization for reported cases and probability of death for hospitalized: these can be calculated from [a10] based on age-specific reporting. the information contained in the following tables for miami-dade county is likely to be the same for other regions like boston city/suffolk county.
13) daily schedules for people can be assumed to be the same as in table a5.
14) disease natural history parameters for covid-19 can be assumed to be the same across regions within a country; see table a6.
15) some of the parameters in table a7 for calculating the force of infection need to be calibrated (see step 3); however, the remaining parameters in table a7 can be assumed to be the same.
16) self-isolation compliance for symptomatic cases, and quarantine compliance for household members, can be assumed to be the same for different urban regions within a country, as in table a10.
step 2: updating of the simulation model. once the input data collection is complete, the next step is to update the model parameters as follows.
1) update the simulation model with all gathered input data from step 1: after gathering data, it needs to be curated and transformed into .txt files to be read by the simulation model. some of the data are directly coded in the model, where applicable.
2) decide simulation begin date: the simulation begin date depends on the outbreak region and is based on the date of the first reported case. up to 14 days before the first reported case can be used as a potential date for the simulation model to begin.
3) decide simulation end date: the simulation end date is chosen as desired by the modeler.
4) number of initial infected cases: most departments of health (doh) provide a count and characterization of the number of initial infected cases with travel histories. one can identify these initial infected cases during the first month or so of the outbreak and use those cases to initiate social mixing and community spread.
step 3: calibrate and validate the simulation model. once the simulation model is updated with the input data for the region, the model is calibrated using a small applicable subset of input data, and the model output is validated with actual surveillance data from the region, as follows.
1) generate multiple seeds for the uniform random variables that are used to calculate the probabilities of infection, hospitalization, death, testing, symptomatic status, disease severity, test sensitivity, and compliance for isolation and quarantine, among others. simulation output from each seed is considered a replicate. using output data from all replicates, an average value and a corresponding confidence interval for each output measure are calculated.
2) a set of initial values of the transmission coefficients for home, school, work, and community places is assumed (based on current literature and published models for outbreaks of similar diseases). these transmission coefficients (along with other parameters; see table a7) are used for calculating the force of infection, which is then used to calculate the probability of infection.
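the begin-date rule in step 2 above can be sketched as a one-line helper; the 14-day default lead is the upper bound the text suggests, and the modeler may choose a shorter one.

```python
from datetime import date, timedelta

def simulation_begin_date(first_reported_case, lead_days=14):
    """start the simulation up to 14 days before the first reported case,
    to allow for undetected early community spread."""
    return first_reported_case - timedelta(days=lead_days)
```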
different sets of transmission coefficient values are selected for different reference points in time in the simulation, depending on changes in social intervention status and significant current events. for example, the transmission coefficients are appropriately calibrated (reduced) on the day universal use of face masks is announced. also, the percentage of asymptomatic and pre-symptomatic cases tested is increased when contact tracing begins. street protests combined with the independence day holiday in early july 2020 are examples of current events that may require adjustment of the transmission coefficient values.
3) other parameters that are considered suitable for ab model calibration are the probability of running errands, which guides the daily schedule, and the probability of employees reporting to work for essential and non-essential businesses. these values can also be assumed to change over time during a pandemic, depending on the phased intervention policies implemented by the government in the outbreak region.
4) the simulation is calibrated for a chosen period. in this study, we chose to calibrate the model up to july 15th, as we had reported data available until that date for validation purposes at the time of calibrating the model.
5) results for reported cases, hospitalized, and dead for all age groups are gathered from the simulation model for each seed.
6) average values (with confidence intervals) are computed for the numbers of reported, hospitalized and dead.
7) for model validation, the simulated average values for the reported, hospitalized, and dead are compared with actual surveillance data.
8) alter calibration parameters as needed to obtain the desired level of validation accuracy. validation accuracy is measured as the difference in the seven-day moving average between simulated and surveillance data.
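the validation measure and the replicate summary used in the steps above can be sketched as follows; the use of a mean absolute difference between the moving averages is an assumption (the text only says the difference in seven-day moving averages is measured), and the normal-approximation confidence interval is one standard choice.

```python
import math

def seven_day_moving_average(series):
    """seven-day trailing moving average of a daily series."""
    return [sum(series[max(0, i - 6):i + 1]) / (i - max(0, i - 6) + 1)
            for i in range(len(series))]

def validation_accuracy(simulated, surveillance):
    """mean absolute difference between the seven-day moving averages of
    simulated and reported daily counts (lower is better)."""
    sim = seven_day_moving_average(simulated)
    obs = seven_day_moving_average(surveillance)
    return sum(abs(a - b) for a, b in zip(sim, obs)) / len(sim)

def mean_and_ci(replicates, z=1.96):
    """average and normal-approximation 95% confidence interval of an
    output measure across simulation replicates (seeds)."""
    n = len(replicates)
    mean = sum(replicates) / n
    var = sum((x - mean) ** 2 for x in replicates) / (n - 1)
    half = z * math.sqrt(var / n)
    return mean, (mean - half, mean + half)
```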
step 4: implement the calibrated model for prediction.
1) run the calibrated simulation model for all seeds for a desired prediction period beyond the calibration/validation time.
2) extract age-specific data for total infected, reported cases, hospitalized, and dead from the simulation for each seed.
3) report the mean and confidence interval.
figure a1: florida's phased social intervention plan for the covid-19 pandemic [44]
table a10: self-isolation compliance for symptomatic cases and quarantine compliance for household members
references:
- modeling the impact of social distancing, testing, contact tracing and household quarantine on second-wave scenarios of the covid-19 epidemic
- the proximal origin of sars-cov-2
- estimating the infection horizon of covid-19 in eight countries with a data-driven approach
- evolution and epidemic spread of sars-cov-2 in brazil
- centers for disease control and prevention. cdc releases consolidated covid-19 testing recommendations
- centers for disease control and prevention. covid-19 pandemic planning scenarios
- centers for disease control and prevention. interim clinical guidance for management of patients with confirmed coronavirus disease
- modelling transmission and control of the covid-19 pandemic in australia
- flute, a publicly available stochastic influenza epidemic simulation model
- covid-19 virus outbreak forecasting of registered and recovered cases after sixty day lockdown in italy: a data driven model approach
- physical distancing, face masks, and eye protection to prevent person-to-person transmission of sars-cov-2 and covid-19: a systematic review and meta-analysis
- modeling the worldwide spread of pandemic influenza: baseline case and containment interventions
- a large-scale simulation model of pandemic influenza outbreaks for development of dynamic mitigation strategies
- how to restart? an agent-based simulation model towards the definition of strategies of covid-19 "second phase"
- an influenza simulation model for immunization studies
- transmission dynamics of the covid-19 outbreak and effectiveness of government interventions: a data-driven analysis
- strategies for mitigating an influenza pandemic
- strategies for containing an emerging influenza pandemic in southeast asia
- coronavirus: summary of persons being monitored, persons under investigation, and cases
- covid-19: summary of persons being monitored, persons tested, and cases
- division of disease control and health protection. florida's covid-19 data and surveillance dashboard
- hospitalization rates and characteristics of patients hospitalized with laboratory-confirmed coronavirus disease
- mitigation strategies for pandemic influenza in the united states
- containing bioterrorist smallpox
- evidence summary for covid-19 viral load over course of infection
- the effectiveness of quarantine of wuhan city against the corona virus disease 2019 (covid-19): a well-mixed seir model analysis
- infectious disease modeling: creating a community to respond to biological threats. statistical communications in infectious diseases
- a contribution to the mathematical theory of epidemics
- covid-19 and postinfection immunity
- lack of reinfection in rhesus macaques infected with sars-cov-2. biorxiv
- containing pandemic influenza with
- facs: a geospatial agent-based simulator for analyzing covid-19 spread and public health measures on local regions
- design of non-pharmaceutical intervention strategies for pandemic influenza outbreaks
- miami matters. households/income data for county: miami-dade
- covid-19/parameter_estimates
- coronavirus screening test developed at johns hopkins
- will we see protection or reinfection in covid-19?
- epidemic analysis of covid-19 in china by dynamical modeling. medrxiv
- covid-abs: an agent-based model of covid-19 epidemic to simulate health and economic effects of social distancing interventions
- estimating disease burden of a potential a(h7n9) pandemic influenza outbreak in the united states
- the city of miami, florida. covid-19 updates
- all sectors: county business patterns by legal form of organization and employment size class for u.s., states, and selected geographies
- all sectors: county business patterns by legal form of organization and employment size class for u.s., states, and selected geographies
- annual business survey: statistics for employer firms by industry, sex, ethnicity, race, and veteran status for the
- a predictive decision-aid methodology for dynamic mitigation of influenza pandemic
- a new coronavirus associated with human respiratory disease in china
- early estimation of the case fatality rate of covid-19 in mainland china: a data-driven analysis
- modified seir and ai prediction of the epidemics trend of covid-19 in china under public health interventions
- the impact of social distancing and epicenter lockdown on the covid-19 epidemic in mainland china: a data-driven seiqr model study. medrxiv
- the boston planning and development agency
- annual business survey: statistics for employer firms by industry, sex, ethnicity, race, and veteran status for the
- state-by-state guide to face mask requirements
- covid-19 response reporting
- covid-19 state of emergency
table a4: distribution of different types of workplaces in miami-dade county. all industries and community places are classified as essential or non-essential. essential industries remain functional with a percentage of their workforce reporting during interventions like stay-at-home or phased reopening. non-essential industries are considered to operate remotely.
essential industries include wholesale trade, waste management and remediation services, agriculture, forestry, fishing and hunting, mining, quarrying, oil and gas extraction, utilities, construction, manufacturing, and transportation and warehousing. non-essential industries include finance and insurance, real estate and rental and leasing, professional, scientific and technical services, management of companies and enterprises, administrative and support and waste management and remediation services, educational services, and other services except public administration. essential community places include grocery stores, convenience stores, pharmacies and drug stores, home centers, and health care and social assistance. non-essential community places include retail, arts, entertainment and recreation, and accommodation and food services. key: cord-315462-u2dj79yw authors: hewitt, judith a.; lutz, cathleen; florence, william c.; pitt, m. louise m.; rao, srinivas; rappaport, jay; haigwood, nancy l. title: activating resources for the covid-19 pandemic: in vivo models for vaccines and therapeutics date: 2020-10-01 journal: cell host microbe doi: 10.1016/j.chom.2020.09.016 sha: doc_id: 315462 cord_uid: u2dj79yw the preclinical working group of accelerating covid-19 therapeutic interventions and vaccines (activ), a public-private partnership spearheaded by the national institutes of health, was charged with identifying, prioritizing, and communicating sars-cov-2 preclinical resources. reviewing sars-cov-2 animal model data facilitates standardization and harmonization and informs knowledge gaps and prioritization of limited resources. to date, mouse, hamster, ferret, guinea pig, and non-human primate models have been investigated. several species are permissive for sars-cov-2 replication, often exhibiting mild disease with resolution, reflecting most human covid-19 cases.
more severe disease develops in a few models, some associated with advanced age, a risk factor for human disease. this review provides a snapshot that recommends the suitability of models for testing vaccines and therapeutics, which may evolve as our understanding of covid-19 disease biology improves. covid-19 is a complex disease and individual models recapitulate certain aspects of disease; therefore, the coordination and assessment of animal models is imperative. covid-19 took the world by surprise. while the disease spread around the globe, no vaccines or drugs were available to ameliorate its effects. as a result, the national institutes of health (nih) and the foundation for the nih (fnih) established the accelerating covid-19 therapeutic interventions and vaccines (activ) partnership in april 2020, comprising experts in the fields of virology, public health, and vaccine and drug development. this public-private partnership brings together industry, academic and government stakeholders in an unprecedented collaboration to share information and resources to impact the trajectory of the pandemic. there are four working groups: preclinical, clinical therapeutics, clinical trial capacity and vaccines (collins and stoffels, 2020; nih, 2020). the preclinical working group (wg) was charged "to standardize and share preclinical evaluation resources and methods and accelerate testing of candidate therapies and vaccines to support entry into clinical trials" (collins and stoffels, 2020). under normal circumstances, the response to emerging pathogens is a methodical linear progression of in vitro, in vivo, and clinical studies, yet the urgency of the response to the current pandemic requires these activities to be carried out in parallel. prior experience with other coronaviruses, including both in vitro and in vivo models, provides a guiding foundation for current research and effectively enables interventions against the disease.
activ's recommendations on accelerating preclinical in vitro studies are described in the accompanying manuscript (grobler et al.). animal models help us not only to understand the pathogenesis and mechanisms of sars-cov-2 disease biology, but also to elucidate aspects of pharmacology, toxicology and immunology of the therapeutic and vaccine strategies. these data may provide confidence that the products being developed can prevent or treat disease; however, no single animal model fully recapitulates human covid-19 disease. while there is no substitute for randomized controlled clinical trials, there are great advantages to adjunctive research in animal models: exploring in-depth mechanisms, immunology and pharmacology, and vaccine or therapeutic efficacy; discovering biomarkers that may be useful in the clinic; and helping to ensure the safety of the candidates in humans. understanding the pharmacokinetic/pharmacodynamic relationships, such that effective and safe doses can be extrapolated from in vitro antiviral activity, preclinical pharmacokinetics, toxicology and clinical modeling (fda, 2020), can allow compounds to proceed into clinical testing without necessarily demonstrating efficacy in an animal model. although animal model data may not be required to advance to the clinic, they can provide useful comparison data, especially informative in a pandemic. the selection of appropriate animal models of infection, disease manifestation, and efficacy measurements is important for vaccines and therapeutics to be compared under activ's umbrella using master protocols with standardized endpoints and assay readouts. models of sars-cov-2 infection include mice (ace2 transgenic strains, mouse-adapted virus, and aav-transduced ace2 mice), hamsters, rats, ferrets and non-human primates (nhps). the urgency of the need to coordinate efforts to reduce duplication and cycle times also necessitated an assessment of supply and demand.
specifically, resource optimization of all the models, but particularly that of the nhps, and priority setting of experimental therapeutic and vaccine interventions was immediately warranted. the national primate research centers (nprcs), established 60 years ago with funding from the nih, serve as a major resource in this effort to provide nhp expertise, models for human diseases, and nhp resources to nih and other investigators (www.nprcresearch.org). similarly, nih-funded mouse repositories stand ready to supply mice to the broader scientific community for research and testing. the activ preclinical wg efforts are facilitating a coordinated effort of the seven nprcs, which work together as a consortium to assure the highest degree of rigor and ethics, as well as to manage the relative paucity of available primates and nprc absl-3 facilities in which to perform sars-cov-2 in vivo research. the nprcs are playing a major role in activ, although the animal models in our assessment are not restricted to the nprcs or those available through nih-sponsored repositories. beyond activ, operation warp speed (ows) is the us government's effort to make 300 million doses of vaccine available by january 2021 (hhs, 2020). ows is also making use of master protocols, and activ and ows are sharing information around protocols being developed. ows plans to perform animal model testing at contract research organizations under contract to the biomedical advanced research and development authority (barda) and niaid, as well as government labs in the department of defense. coordinating protocols and plans between activ and ows for the supply of non-human primates is critical to the success of all pandemic-related efforts.
the activ preclinical wg aims to assess the status and applicability of all animal models that are under development and to share information in real-time including: taking inventory of the resources and annotating the salient features of the models, establishing a data collection tool to assess the status and applicability, highlighting the gaps in effort and knowledge, and suggesting redirection to fill the gaps. this initial publication describes the current status of animal models, as determined from the available data in the inventory. as features of the covid-19 disease are described, for example the rare but serious multisystem inflammatory syndrome in children (mis-c) (rowley, 2020) , new animal models can help address these complications, ideally in a coordinated fashion to accelerate our understanding. additional animal model data are welcomed, particularly to address gaps in our knowledge, and we will describe how members of the scientific and medical community can submit data for inclusion. it is appropriate to understand the course of human infections in order to model disease in animals; therefore, we present a summary of human disease first. since the outbreak began in china, some of the earliest and most cited publications report numerous cases from china (chauhan, 2020; huang et al., 2020; wu and mcgoogan, 2020) . covid-19 is characterized by phases of increasing severity, generally described as asymptomatic, mild, moderate or severe disease. mild disease includes general symptoms such as fever, fatigue, cough, headache, diarrhea, sore throat, congestion, muscle or body aches, and/or vomiting; loss of sense of smell and/or taste and a dry, non-productive cough are more pathognomonic of covid-19. moderate disease is characterized by mild pneumonia, shortness of breath, and/or pressure in the chest. 
severe symptoms include difficulty breathing, bluish lips, inability to stay awake, confusion, acute respiratory distress, septic shock and/or multi-organ failure, which may lead to death. the median time from symptom onset to pneumonia is about 5 days and to severe hypoxemia is 7-12 days (chauhan, 2020). progression to severe disease is associated with advanced age and/or comorbidities. siddiqi and mehra (siddiqi and mehra, 2020) noted that the progression of disease was associated with transition from an antiviral response to an inflammatory response. garcia (garcía, 2020) posits that responses at various immunological checkpoints determine the course of disease, including relatively weak stimulation of innate immune responses, adaptive antibody and t cell responses, and potentially strong activation of proinflammatory chemokines. garcia urges increased monitoring of immunological responses to better understand the checkpoints that prevent progression of or promote disease severity, noting that much of what we understand comes from comparing immunological findings in moderate and severe cases, with few reports following asymptomatic or mild cases. finally, ziegler et al. recently described regulation of ace2 cell surface expression in upper airway cells by ifn-α, with significant upregulation in humans and nhps but not in mice (ziegler et al., 2020). this highlights the need to further understand the risk/benefit of antiviral or ifn therapy treatments in infected humans; further understanding of this dynamic is critical to balancing the immune response and efficacious treatments. descriptions of viral loads determined by pcr in clinical infections note that respiratory viral loads (sputum or nasopharyngeal swabs) were significantly higher, peaked later and were detected longer in severe vs. mild infections (zheng et al., 2020).
while virus could be detected for longer periods from stool samples, there was not a difference between mild and severe disease (zheng et al., 2020). a larger study of hospitalized cases demonstrated that viral load detected by pcr from nasopharyngeal swabs was a significant predictor of mortality, with mean log10 viral loads of 5.2 copies per ml for survivors vs. 6.4 copies per ml for those who died (pujadas et al., 2020). gulati and colleagues published a comprehensive review of coronavirus disease, comparing sars-cov, mers-cov and sars-cov-2, focusing not just on pulmonary disease but encompassing other organ systems as well: cardiovascular, hepatobiliary, gastrointestinal, renal, neurologic, musculocutaneous and hematologic (gulati et al., 2020). the covid-19 reports were retrospective, clinicopathologic or case reports, and generally noted increased frequency of extrapulmonary involvement with more severe disease (icu) and comorbidities. dyspnea is predictive of icu admission, and many patients exhibited abnormal chest computed tomography, predominantly ground-glass opacities. histopathology demonstrated interstitial fibrosis and inflammatory infiltrates. severe disease requiring intensive care was associated with lymphopenia, leukocytosis, neutrophilia, and increases in alanine aminotransferase, aspartate aminotransferase, lactate dehydrogenase, total bilirubin, creatinine, creatine kinase, blood urea nitrogen, troponin-i and d-dimer (gulati et al., 2020). these extrapulmonary organ systems are not well understood in most animal models; hence, the primary endpoints of animal models have been viral load and immune and inflammatory responses. no single animal model has ever fully predicted human infection, or exactly modelled it. human infection happens by chance, and very little is known about how long a person needs to be in contact with the pathogen or how much pathogen is needed before symptoms present.
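as a rough back-of-the-envelope illustration (not from the paper itself), the reported difference in mean log10 viral loads corresponds to roughly a 16-fold higher load in fatal cases; a minimal sketch of the arithmetic:

```python
# illustrative arithmetic only: converts the reported mean log10 viral
# loads (pujadas et al., 2020) into an approximate fold difference.
def fold_difference(log10_low: float, log10_high: float) -> float:
    """fold change implied by a difference in log10 viral load."""
    return 10 ** (log10_high - log10_low)

survivors = 5.2  # mean log10 copies per ml, survivors
deceased = 6.4   # mean log10 copies per ml, fatal cases
print(round(fold_difference(survivors, deceased), 1))  # prints 15.8
```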
for animal models to be useful, we need to identify the most reproducible methods for challenge that lead to infection and symptoms that emulate those seen in humans. host tropism for viruses is complicated by the need for host factors to facilitate virus entry and replication, the host's immune response and pathways by which viruses evade immune responses (douam et al., 2015). models for the closely related sars-cov and mers-cov viruses have paved the way for sars-cov-2 infection and pathogenesis studies and have bolstered expertise in studying coronavirus vaccines and therapeutics. angiotensin-converting enzyme 2 (ace2) was discovered as the receptor for sars-cov and has proven to be the receptor for sars-cov-2 as well (zhou et al., 2020b). phylogenetic comparisons of ace2 proteins allowed for predictions of species that are susceptible to sars-cov-2 infection (luan et al., 2020; qiu et al., 2020; wan et al., 2020), as ace2 is a key determinant of infectivity. mice have mismatched ace2 receptors, and two approaches have been used to solve this problem: (1) mice that express the human ace2 receptor; or (2) a sars-cov-2 strain that is adapted to recognize the murine ace2 receptor. although both approaches provide a way to model covid-19 infection in mice, neither is a perfect substitute for human infection. the goal for animal models is to understand and/or provide the most accurate and complete predictor of results in a variety of human clinical settings. at this time, there are several models for sars-cov-2 in development (table 1). nhp models are traditionally considered the most translational models to humans. nhps bear close similarities to human genetic, neurological, cognitive, physiological, reproductive, anatomical, and immunological systems. their susceptibility to most human pathogens is not surprising, and as such, they serve as models for many of the most intractable acute and persistent pathogens.
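at its simplest, the kind of cross-species ace2 comparison described above reduces to percent identity over aligned residues. the sketch below uses short hypothetical fragments (not real ace2 sequence data) purely to illustrate the calculation:

```python
# toy percent-identity calculation over pre-aligned sequence fragments.
# the fragments below are hypothetical placeholders, not real ace2 data.
def percent_identity(seq_a: str, seq_b: str) -> float:
    assert len(seq_a) == len(seq_b), "sequences must be pre-aligned"
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)

human_frag = "QAKTFLDKFNHEAEDLFYQS"  # hypothetical aligned fragment
mouse_frag = "QAKTFLDKFNHEAEDLSYQN"  # hypothetical, two mismatches
print(percent_identity(human_frag, mouse_frag))  # prints 90.0
```

in practice such comparisons are made over full-length, phylogenetically aligned ace2 sequences, with particular weight on the residues at the virus-receptor interface rather than raw overall identity.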
their size and longevity make them excellent models for pathogenesis, allowing repeated sampling and imaging in vivo for longitudinal studies. however, their limited supply in the face of numerous drug and vaccine candidates makes them an even more precious resource. it is therefore imperative to prioritize agents to be tested by demonstrating tolerability and efficacy in smaller mammalian models: mice, hamsters and ferrets are currently the small animal species of choice. small animal models are presented first, as the tractable models used in early discovery and development. mice and hamsters predominate, given the many choices that are becoming available. guinea pigs do not appear to be productively infected, as virus cannot be detected despite early (day 3) pulmonary histological changes; they are therefore not the best model of disease (dick bowen, personal communication). mice hold a prominent position in disease research. their small size, ease of use, rapid breeding, and ability to be inbred as well as readily genetically modified have made the mouse the go-to model in biomedical research. mice can be used to rapidly screen vaccines, antivirals, and other therapeutics in a relatively high-throughput, pipeline approach. there are several mouse models for sars-cov-2 infection listed in table 1, each with advantages and limitations. it bears emphasizing that mouse models that are not available from public repositories or commercial vendors cannot be scaled effectively to meet the high demand for animals and supply the high-throughput needs of the community. those models that remain in private labs, no matter how good they may be, will be of limited value to the large-scale efforts needed to evaluate the many vaccines, antivirals and other therapeutics being investigated for the pandemic response, unless they are made available through commercial suppliers or repositories.
the use of the standard laboratory mouse for infection with sars-cov and sars-cov-2 has been limited due to amino acid differences in the ace2 receptor between mouse and humans that result in reduced susceptibility to infection. older balb/c mice are more susceptible to infection, but the infectivity is still modest, and the requirement of an aged animal can be a significant disadvantage in screening therapeutics (roberts et al., 2005a; vogel et al., 2007). one solution to the lack of infectivity in laboratory mice is to use an adapted virus, where multiple passages and/or selected mutations in the virus make the mice more susceptible to infection. a recent publication demonstrated a mouse-adapted virus (q498t/p499y) that was not only able to readily infect mice but was used to demonstrate the effectiveness of neutralizing antibodies in reducing viral replication in vivo. a second solution to the lack of infectivity in common laboratory mice is to express the human ace2 gene (hace2), either by viral transduction or genetic engineering. two labs have used adenovirus transduction to express human ace2 in the lungs of mice, resulting in mild disease noted by viral replication and weight loss, that could be ameliorated by neutralizing monoclonal antibodies (hassan et al., 2020) or antivirals and convalescent sera (sun et al., 2020a). several transgenic mouse models carrying the human ace2 gene were created to study sars-cov infection. five transgenic models currently exist and are rapidly being assessed for their use in the study of covid-19. the first model is a transgenic mouse developed in china by chuan qin, where the human ace2 cdna is under the control of the mouse ace2 promoter, icr-tg(ace2-ace2)1cqin/j. sars-cov-2 infection in these mice leads to weight loss and virus replication in lung, with histopathology indicative of interstitial pneumonia. sars-cov and sars-cov-2 cause mild disease, and no death was reported in this model.
a second model, developed by clarence peters in 2007, employed another transgenic construct, tg(cag-ace2)ac70ctkt, with the human ace2 gene expressed under the control of a ubiquitous cag promoter. these mice are highly susceptible to sars-cov infection and demonstrate weight loss, along with other clinical manifestations, before reaching 100% mortality within 8 days after intranasal infection (tseng et al., 2007; yoshikawa et al., 2009). these mice are currently being bred in order to determine how they respond to infection with sars-cov-2 (kent tseng, personal communication). two other transgenic mouse models also expressing the human ace2 gene using different epithelial-based promoters, tg(foxj1-ace2)1rba and b6.cg-tg(k18-ace2)2prlmn/j, were developed by ralph baric and stanley perlman, respectively (mccray et al., 2007; menachery et al., 2016). both models are considered severe models of sars-cov, with significant weight loss, disease pathology, evidence of encephalitis and death by 5-7 days post infection. interestingly, they differ in severity for covid-19: the b6c3tg(foxj1-ace2)1rba model displayed a variable response to infection with sars-cov-2, while the b6.cg-tg(k18-ace2)2prlmn/j model develops a uniformly severe response. half of the infected b6c3tg(foxj1-ace2)1rba mice recovered from infection, demonstrating no significant signs of weight loss or illness; the other half demonstrated severe illness and body weight loss and required euthanasia. the b6.cg-tg(k18-ace2)2prlmn/j mouse develops a severe infection and lung pathology in response to sars-cov-2. multiple independent research groups report that upon infection, mice lose significant body weight, have a hunched posture and require euthanasia 5-7 days post challenge in 100% of infected mice (moreau et al., 2020; oladunni et al., 2020; rathnasinghe et al., 2020; winkler et al., 2020).
importantly, these mice are not hypersensitive to infection, and lower doses of the virus result in less severe disease progression. viral loads are detected in the brain of these mice, but initial reports indicate the brain infection to be less than the severity-associated encephalitis seen with sars-cov infection in this model, with minimal histopathological changes observed (oladunni et al., 2020). the tg(foxj1-ace2)1rba and b6.cg-tg(k18-ace2)2prlmn/j mice are available for distribution to research scientists by the mmrrc and the jackson laboratory, respectively. another mouse model, c57bl/6-ace2 em1(ace2)yowa, was recently developed by sun et al. this model utilizes crispr/cas9 to knock in the human ace2 cdna at the mouse ace2 locus, utilizing the endogenous mouse ace2 promoter. this model also supports infection; however, the mice recover from body weight loss post infection with sars-cov-2, with mild lung pathology (sun et al., 2020b). given the ease of genetic engineering in mice, one can expect to see a variety of new models developed in the immediate future. these models will differ in terms of random transgenesis, targeted knock-ins, variations in constructs, reporter tags and new ways to deliver the human ace2 gene, such as through recently reported viral transgenesis (hassan et al., 2020; sun et al., 2020a). humanized mice represent another potential approach, and while there are distinct advantages, there are additional challenges that impact rigorous testing of vaccine and therapeutic efficacy, namely the required technical expertise, preparation time, immunocompromised status of mice, variability, and limited throughput (skelton et al., 2018). xenografts of human lung tissue into immunodeficient mice have resulted in lung-only mice or bone marrow/liver/thymus-lung (blt-l) mice, which demonstrated infection with mers-cov, rsv, zika and cytomegalovirus and, in the case of blt-l mice, produced a human immune response (wahl et al., 2019).
the use of nod scid gamma (nsg) mice for engraftment of human immune cells will prove to be a powerful tool in studying the interaction of the human immune system with the virus. new models incorporating the nsg genetic background and the human ace2 gene are underway. while the current models will be critical for high-throughput screening of therapeutics, the newer and milder models will be critical to understanding how infection responds to comorbidities such as type 1 diabetes, hypertension and obesity. these comorbidities can be induced, but can also be achieved through genetic crosses to existing mouse models engineered with mutations in key genes to produce the desired phenotypes. furthermore, differences between mice and humans can be manipulated to more faithfully represent human disease, such as by exploiting differential regulation of ace2 by interferons (ziegler et al., 2020). other applications of the mouse models include exploring the effects of genetic diversity and various genetic backgrounds, again helping to explain why some individuals are more susceptible to severe disease manifestations, while others recover quickly or are asymptomatic. golden syrian hamsters have been shown to have distinct advantages as models for diseases involving respiratory viral infections, including influenza virus, adenovirus and sars-cov (miao et al., 2019; roberts et al., 2005b). following infection by the intranasal route, golden syrian hamsters demonstrate clinical features, viral kinetics, histopathological changes, and immune responses that closely mimic the mild to moderate disease described in human covid-19 patients (chan et al., 2020b; imai et al., 2020; sia et al., 2020). in this form of non-lethal disease, the clinical signs include rapid breathing, decreased activity and weight loss that is most severe by day 6 post infection.
airway involvement is evident with histopathology showing progression from the initial exudative phase of diffuse alveolar damage with extensive apoptosis to the later proliferative phase of tissue repair. micro-ct analysis of infected hamsters revealed severe lung injury with the degree of lung abnormalities related to the infectious dose. commonly reported imaging features of covid-19 patients with pneumonia were present in all infected animals (imai et al., 2020) . high dose sars-cov-2 infection led to severe weight loss and partial mortality while older hamsters appear to exhibit more pronounced and consistent weight loss (osterrieder et al., 2020) . other findings include intestinal mucosal inflammation, myocardial degenerative changes, viral rna in brain stem and lymphoid necrosis. there is a marked activation of the innate immune response with high levels of chemokines/cytokines induced by the infection (chan et al., 2020b) . transmission of covid-19 from infected hamsters to naïve cage mates suggests utility of the model for studying transmission (chan et al., 2020a; chan et al., 2020b; sia et al., 2020) . in addition, passive transfer studies, with either convalescent sera (chan et al., 2020b) or neutralizing monoclonal antibodies (rogers et al., 2020) show great promise for studies related to immunity and vaccine development. the golden syrian hamster model of sars-cov-2 infection appears to be a suitable model for the evaluation of antiviral agents (kaptein et al., 2020; rosenke et al., 2020) and candidate vaccines . hamsters carrying the hace2 receptor under the control of the epithelial k18 promoter are also being evaluated as a model. in an initial study of sars-cov-2 infection of hace2-hamsters, clinical signs were observed including elevated body temperatures, slow or reduced mobility, weight loss and mortality (1 out of 4 animals). 
virus titers were detected in lung, heart and brain tissues, with the highest titers observed in lungs on days 1-3 (>6 logs) (bart tarbet, personal communication). hamsters with compromised immune systems, by either cyclophosphamide treatment or rag2 deficiency, demonstrated more severe disease that was longer in duration (cyclophosphamide induction) or resulted in mortality (rag2-/-), and could be protected by human antibody given prophylactically. ferrets are considered good models for respiratory diseases, as the physiology of their lungs and airways is close to that of humans, and they have been used extensively to model disease caused by many respiratory viruses including influenza (thangavel and bouvier, 2014), rsv (stittelaar et al., 2016) and sars-cov (van den brand et al., 2008). unlike rodents, ferrets cough and possess a sneeze reflex, making them a particularly useful model in the study of disease transmission. ferrets exhibit lethargy and appetite loss following infection with sars-cov-2 via the intranasal route, but the disease does not progress to acute respiratory disease, and the animals recover from the infection (shi et al., 2020). virus shedding from the upper respiratory tract (nasal washes, saliva) can persist for up to 21 days post-infection; the length of shedding appears to be dependent on the initial viral challenge dose and can be intermittent after 14 days. mild multifocal bronchopneumonia is observed early post-infection (day 3 in animals receiving 4 to 6 logs of virus) and develops further after one week. fever has been reported in some studies, but neither coughing nor dyspnea has been observed (shi et al., 2020). ferrets re-challenged 28 days after initial infection appear to be completely protected (ryan et al., 2020).
sars-cov-2 was transmitted readily to naïve direct-contact ferrets but less efficiently to naïve indirect-contact ferrets (shi et al., 2020), and efficiently via the air, resulting in a productive infection and the detection of infectious virus in indirect recipients (richard et al., 2020). disease in ferrets following sars-cov-2 infection appears to be very mild, less severe than in ferrets infected with sars-cov. the use of small animals for preclinical research in the study of sars-cov-2 infection involves a broad spectrum of models, from infecting wild-type animals with adapted viruses to multiple methods of introducing human ace2 receptors. disease is typically mild, although transgenic mice present with more severe disease. each model offers selected advantages that will be useful not only for the testing of therapies, but also for understanding disease enhancement and related co-morbidities. nhps are indispensable models for evaluating medical countermeasures against infectious diseases and are considered the gold-standard animal model for modeling human infectious diseases. the lack of suitable substitutes for nhp models for predicting response in humans serves as a bottleneck for the development of countermeasures against infectious diseases like sars-cov-2. summarized below is the progress to date in establishing models for covid-19 in different primate species. rhesus macaques (m. mulatta) exposed to sars-cov-2 become infected and display a mild, non-lethal shedding disease phenotype, with little to no clinical observations. if clinical observations are reported, they are typically transient and include reduced appetite, mild dehydration, tachypnea, piloerection and dyspnea (munster et al., 2020). when reported, fever is mild and transient, beginning shortly after exposure (i.e., day 2 pe) and resolving within 2 or 3 days (munster et al., 2020).
body weight loss findings, if reported, are a mild, transient drop in weight followed by recovery (munster et al., 2020). clinical chemistry and hematology are generally unremarkable; however, transient leukocytosis, neutrophilia, monocytosis and lymphopenia are reported (munster et al., 2020; singh et al., 2020). imaging (radiographs or pet/ct) confirms rhesus macaques are infected, with infiltrates and ground-glass appearances in radiographs beginning early after exposure (day 2 or 3) and resolution occurring by day 10-14 post-exposure (munster et al., 2020; singh et al., 2020). anecdotal evidence suggests that older rhesus develop a chronic infection in which infiltrates persist throughout the study. when available, pet/ct images corroborate the radiograph findings (singh et al., 2020). virus is detected in nasal, throat, rectal and ocular swabs and in bronchoalveolar lavage (bal) via median tissue culture infectious dose (tcid50) beginning approximately day 2 post-exposure (pe), peaking around day 4/5 pe and decreasing after day 6 pe (chandrashekar et al., 2020; chao shan, 2020; deng et al., 2020; munster et al., 2020; yu et al., 2020b). finally, exposed rhesus macaques seroconvert, as demonstrated by a sars-cov-2 anti-spike elisa and neutralization assays to various endpoint titers depending on the laboratory and assay format utilized, and are protected from reinfection (bao et al., 2020a; munster et al., 2020). cynomolgus macaques (m. fascicularis) have been used to study the pathogenesis of sars-cov, where aged animals were more likely to develop disease. when exposed to sars-cov-2, they become infected but show no overt clinical signs of disease. weight loss is not observed, but in some studies infected animals have a fever on day two to three (johnston et al., 2020).
virus shedding from the upper respiratory tract occurs, peaking early at day one post-infection in young animals and day four in aged (15-20 years) animals, then decreasing rapidly but still detected intermittently up to day 10 post-infection. overall, higher levels of virus shedding were measured in aged animals compared with young animals. they develop mild to moderate lung abnormalities, with macroscopic lesions in the lungs including alveolar and bronchiolar epithelial necrosis, alveolar edema, hyaline membrane formation and accumulation of immune cells (finch et al., 2020; johnston et al., 2020). while self-limiting, the disease in cynomolgus macaques does recapitulate many aspects of human covid-19 and could be utilized to test preventative and therapeutic strategies. african green monkeys (agms) exposed to sars-cov-2 as young adults display a mild, non-lethal shedding disease phenotype, with little to no clinical observations (hartman et al., 2020). if clinical observations, such as fever, are reported, they are typically transient and mild with no serious manifestations (hartman et al., 2020; woolsey et al., 2020). body weight findings are generally unremarkable. clinical chemistry and hematology reveal mild and transient shifts in leukocyte populations, mild thrombocytopenia, and elevations in selected liver enzymes (woolsey et al., 2020). crp, a measure of acute inflammation, is elevated early in infection (woolsey et al., 2020). imaging (radiographs or pet/ct) confirms agms are infected, with infiltrates and ground-glass appearances in radiographs beginning early after exposure (day 2 or 3) and resolving by day 10-14 pe. when available, pet/ct images corroborate the radiograph findings. finally, plethysmography suggests respiratory disease, but there is no consistent trend (hartman et al., 2020).
presence of sars-cov-2 in bal via rt-pcr and plaque assay is detected by approximately day 3 pe and persists at least through day 7 pe (hartman et al., 2020; woolsey et al., 2020). finally, exposed agms do seroconvert, as demonstrated by a sars-cov-2 anti-spike elisa and neutralization assays to various endpoint titers, depending on the laboratory and assay format utilized (hartman et al., 2020; woolsey et al., 2020). of note, one study (hartman et al., 2020) used a viral isolate from munich carrying the d614g amino acid substitution in the spike protein, which has been demonstrated to be more infectious to cells in culture and was reported to be more infectious in humans (korber et al., 2020). disease in those animals was similarly mild to that seen in agms infected with the washington isolate without the d614g substitution. a limited dataset suggests that age is a co-morbidity for sars-cov-2 disease severity in agms. investigators at the tulane national primate research center infected 4 older agms with sars-cov-2, two by multiple routes of infection (intranasal, intratracheal, and conjunctival) and two by the aerosol route. one of the two animals from each group exhibited acute respiratory distress syndrome (ards), with radiologic and histologic abnormalities observed primarily within the right caudal lung lobes (blair et al., 2020). of the two animals with severe disease, one met the criteria for euthanasia at day 8 (aerosol) and the other at day 22 (multi-route infection) (blair et al., 2020). there did not appear to be significant differences in viral replication or disease pathogenesis associated with aerosol versus multi-route infection (blair et al., 2020). in the surviving older agms, clinical disease was mild: some transient fevers and lethargy were observed, but no serious manifestations. clinical chemistry and hematology were generally unremarkable.
radiographs confirm the older agms are infected, with infiltrates and ground-glass appearances beginning early after exposure (day 2 or 3) through at least day 14 pe. pet/ct images corroborate the radiograph findings. finally, viral shedding (rt-pcr) in pharyngeal, nasal, buccal, bronchial brush, rectal and vaginal swabs begins ~d2-d7/14 pe in exposed animals (blair et al., 2020). in view of the severe disease observed in 2 of 4 animals, as well as the observation of cytokine storm in 3 of 4 animals, older agms may serve as faithful if impractical models of severe disease to evaluate therapeutic strategies such as immune modulators, and may also provide insights into sars-cov-2 pathogenesis. however, additional studies are clearly needed to corroborate these initial findings. other non-human primates: limited data are available describing disease presentation following sars-cov-2 exposure in pigtail macaques, baboons, and marmosets. post-exposure data show these animals can be infected, but they are not as widely used. baboons had more severe pathology and shed virus longer than macaques, and may be a good model for cardiovascular and diabetic comorbidities (singh et al., 2020); they have also been used for immunogenicity studies (tian et al., 2020). marmosets did not exhibit fever, were difficult to monitor by radiograph, and did not mount an immune response even though they were positive for virus by nasal swabs; pathology was reduced compared to macaques (singh et al., 2020). thus far, pigtail macaques have only been used for immunogenicity studies (erasmus et al., 2020). the authors recognize that the development of new and refined animal models for covid-19 disease is a rapidly evolving field and that this summary is not a complete review, but rather a description of the currently available data and the resulting guidance that can be drawn from it. some themes are emerging and warrant monitoring.
mice expressing hace2 and hamsters are reasonable models for early testing of potential countermeasures, displaying mostly mild disease with reproducible endpoints of weight loss and viral burden, and, in transgenic mice, severe disease resulting in lethality. intranasal inoculation is generally the route of infection, and the dose of virus varies and has an impact on the course of disease. given the availability of animals and the ability to rigorously test for statistical significance using groups of 10 or more animals, small animal models should be utilized to the greatest extent possible and may be sufficient for moving into clinical studies. based on available data from multiple challenge routes (intratracheal, intranasal, oral, ocular and combinations of these, along with small-particle aerosol) and multiple dose ranges (10^3 - 10^6), route and dose of challenge do not appear to have a significant impact on disease presentation (i.e., mild yet reproducible). as the nhp model matures, focused efforts will be required to demonstrate any dose response and differences in disease presentation following a range of doses and routes. anecdotal evidence suggests severe disease may be more common than currently reported; however, focused studies are needed to confirm this. based on a fisher's exact test with alpha level 0.05, a sample size of n=8 per group ensures 80% power to detect 77% vaccine efficacy in a comparison of protection between vaccinated and sham animals (measured by absence of pathology in lung, assuming that 90% of sham animals become infected). analysis of selected nhp studies shows that n=8 per group ensures 80% power to detect a mean difference of ~1.88 log10 viral burden between groups (as measured by genomic rt-pcr in bal on day 1 post-challenge; allan decamp, personal communication). additional data are needed to allow these calculations to be applicable across studies and laboratories.
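the fisher's-exact power statement above can be checked by exact enumeration of the two binomial outcome distributions. the sketch below is illustrative, not the authors' original computation: it assumes a one-sided test, a 90% infection rate in sham animals, and a vaccinated infection rate of 0.9 * (1 - 0.77), as quoted in the text.

```python
from math import comb

def fisher_one_sided_p(a, b, n):
    """one-sided fisher exact p-value for observing <= a infected vaccinated
    animals, given a+b infections total across two groups of size n.
    uses the hypergeometric distribution; comb(n, k) is 0 when k > n."""
    k = a + b
    total = comb(2 * n, k)
    return sum(comb(n, x) * comb(n, k - x) for x in range(a + 1)) / total

def binom_pmf(x, n, p):
    # binomial probability of x infections among n animals
    return comb(n, x) * p**x * (1 - p)**(n - x)

def exact_power(n, p_sham, efficacy, alpha=0.05):
    """exact power: probability of rejecting at level alpha, summed over
    every possible pair of group outcomes weighted by its probability."""
    p_vax = p_sham * (1 - efficacy)
    pw = 0.0
    for a in range(n + 1):          # infected among vaccinated
        for b in range(n + 1):      # infected among sham
            if fisher_one_sided_p(a, b, n) < alpha:
                pw += binom_pmf(a, n, p_vax) * binom_pmf(b, n, p_sham)
    return pw

# n=8 per group, 90% sham infection, 77% efficacy (values from the text)
print(round(exact_power(8, 0.9, 0.77), 3))
```

under these assumptions the enumeration lands close to the 80% power quoted in the text; a two-sided test or a different infection-rate assumption would shift the result somewhat.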
young adult rhesus macaques have extremely mild symptoms that may be difficult to associate with symptomatic benefit for interventions; however, viral load differences may prove a useful biomarker. african green monkeys may prove to be more useful, and data are still being collected for cynomolgus macaques. while there may be a desire to select the best animal model for comparison across studies, in practice all good models should be utilized to meet the demand for countermeasure testing. few models faithfully recapitulate severe disease, and further exploration is needed, including assessment of comorbidities. to understand the landscape and to facilitate rapid data sharing, activ wg members created an inventory tool to collect information on animal studies from published literature, preprints, and unpublished studies in progress. this tool started as a spreadsheet to collect the most important information about studies, such as information on the animals, the virus isolates used, the challenge process, parameters measured in the studies, and relevant observations/endpoints, along with the laboratory performing the study. the group then made assessments as to whether disease was none-to-minimal, mild, moderate or severe. a few species that are not infected are included, to prevent unnecessary duplication or perhaps even to spur enhancements, for example the development of adapted viruses. a summary of the information is available on the ncats open portal, under the animal models tab (https://opendata.ncats.nih.gov/covid19/animal). as we learn more about models, this site will be updated. activ preclinical wg members have curated this information by combing peer-reviewed publications and preprints, along with information that is not publicly available but is available to us.
additional efforts are currently underway to develop a nonhuman primate studies covid-19 coordinating center (nhpsccc) that will bridge the covid-19 work at the national primate research centers and could be a repository for data from many animal models. recognizing that activ does not constitute the entire universe of activity in this area, we invite others to submit data to activ, particularly as it fills gaps or further informs the models we have summarized. the information on submitting such animal model data will be found on the ncats open portal animal model page when it becomes available. it is important to consider these animal models in the context of human disease. we adapted a framework for human disease from siddiqi and mehra (siddiqi and mehra, 2020) and placed the animal models summarized here in that framework (figure 1) . this graphical framework includes the full spectrum of disease but does not indicate the frequency of severity. the authors noted that 81% of cases recovered after mild disease, while 5% progressed to the most severe form, with half of those cases succumbing to disease, for a 2.3% mortality rate (siddiqi and mehra, 2020) . it is therefore not surprising and completely consistent with human disease that most animal models present with mild disease and recover. some aged mice and aged african green monkeys present with more severe disease than younger animals; additional comorbidities have not yet been explored but would be expected to model more severe disease. each of the current models in development is yielding valuable information about infectivity, routes of infection, viral persistence, reinfection, and relative level and types of pathogenesis per species. as researchers around the world seek to discover effective therapies and vaccines to treat or to prevent covid-19, judicious choices will be needed to assure that the models are available for comparative studies. 
the complex interplay between the host and the virus means that each model is unlikely to represent every aspect of disease in humans, and thus different models may be recommended for therapeutics compared with vaccines, due to animal availability and endpoints of studies. indeed, investigators have begun testing vaccines and therapeutics as soon as species are known to get disease, more rapidly than under normal circumstances, where additional studies might be performed to better understand the disease model prior to testing countermeasures. table 1 indicates which models could be used for testing vaccines, antivirals, neutralizing antibodies or other therapies, but does not make specific recommendations. this is based on the relevant endpoints one can measure that would inform vaccine or therapeutic efficacy: viral load (swabs, bal, etc.), body weight, body temperature, lung imaging, lung function, and cytokines. antivirals have been tested in mice (sheahan et al., 2020), hamsters (kaptein et al., 2020; rosenke et al., 2020), cynomolgus macaques (maisonnasse et al., 2020) and rhesus macaques (rosenke et al., 2020; williamson et al., 2020). antibodies have been assessed in hamsters (rogers et al., 2020) and mice. finally, vaccines have been tested for immunogenicity in small animals, but efficacy studies have largely been done in rhesus macaques (van doremalen et al., 2020; yu et al., 2020a). the activ animal model pages on the ncats open portal will be updated with publications reflecting the utility of various models. master protocols currently in development will define key parameters and timepoints to follow, along with group sizes, allowing comparison across studies. the severity of disease, rapid transmission and global spread of this new virus make the development of vaccines and therapeutics an urgent priority.
typical development plans have been greatly accelerated by performing many activities in parallel, often at risk, shortening timelines with unprecedented speed (figure 2) . this includes animal model development and testing, which can contribute to the identification and confirmation of correlates and surrogate markers for use in the clinic. the vaccines that have been chosen by ows can be compared in animal models while phase i and phase ii clinical work continues (corey et al., 2020) . selection of an appropriate animal model depends on the questions to be addressed. table 2 presents our recommendations on which of the currently available animal models to select, focusing on testing classes of therapeutics and vaccines for reducing various aspects of covid-19. for example, carboxylesterase activity is high in rodents but not humans (bahar et al., 2012; li et al., 2005; rudakova et al., 2011) , impacting the pharmacokinetics of antivirals such as remdesivir and the suitability of models for preclinical testing of specific drugs (warren et al., 2016) . it is clear that many models are available which faithfully reproduce the early phases of infection and lung disease, followed by recovery; recovery is a prevalent feature of covid-19 in humans. models of severe disease are largely determined by fatal outcomes when untreated but may not capture all signs and symptoms associated with human covid-19 disease, for example coagulopathies. one difference between some models and human disease is the observation of virus in the brain in mice and hamsters, which requires further exploration. there is a report of virus in csf in a single clinical case (zhou et al., 2020a) though most neurological complications are considered to be inflammatory responses (gulati et al., 2020) . comorbidities have not been rigorously explored in animal models thus far, yet we know the strong association of age, hypertension, diabetes, lung and heart disease with poor prognoses in covid-19. 
immune responses likely play a key role in determining the severity of disease, yet they are not as easily manipulated in animal species that are phylogenetically closer to humans than they are in mice. for these preclinical studies to have the most impact, their design is critically important to assure that results are rigorous and reproducible. studies will require statistical justification based on clinical and laboratory measures so that they are appropriately powered, preventing uninterpretable results that can be misleading and wasteful of animal resources. one method to enhance reproducibility that is currently under development is the adoption of shared master protocols for design and sampling. not only can experiments be compared across sites, but controls located at different sites can also be combined for increased power, while retaining contemporaneous infection controls at each site. this strategy can reduce the number of controls used overall, conserving precious resources. if assays and challenge stocks can be harmonized across experimental sites, many of the variables can be further reduced. defining animal models and their use is a prerequisite to performing studies to compare vaccines and therapeutics, so that the most promising ones advance to the next phase, whether it is testing in nhps or the clinic. the prioritization schemes for the development and testing of treatments and vaccines are beyond the scope of this manuscript. the aim of this manuscript is to define the utility of animal models to inform the requisite prioritization of animal studies, particularly in nhps, where resources are limited relative to the number of candidate vaccines and therapeutics. covid-19 is a multi-faceted, multi-factorial, multi-systemic, highly infectious disease that evokes wide-ranging responses in humans, from asymptomatic to severe disease, with respiratory, gastrointestinal, circulatory and neurological involvement and normal to hyper immune responses.
no single animal model recapitulates the totality of pathogenesis or predicts interventional responses as faithfully as the human. yet, as with the response to other emerging infections, animal models will play a key role, even as candidate vaccines and therapeutics are entering clinical trials at a record pace. we have presented various animal models that are being developed and the role they may play in responding to covid-19. our summary table of animal models and their applications has been posted to the ncats open portal (https://opendata.ncats.nih.gov/covid19/animal) and will be continuously updated as models are advanced and further interrogated. the key models appear to be mice (various), hamsters and, for nhps, rhesus macaques and african green monkeys. most mimic the mild form of covid-19 disease, with the exception of transgenic mice, and additional research may result in the development of more severe disease models. activ invites submission of information on animal models to be included in our assessment, as we work toward standardized and harmonized animal models, while balancing resources and availability. table 2. recommended animal models for specific stages of human covid-19 disease. within each medical need, the disease aspects are presented in order of increasing complexity, while animal models are presented in the order in which they should be approached. advantages and limitations are also presented to help in the selection of the appropriate animal model. agm = african green monkey; ards = acute respiratory distress syndrome; mabs = monoclonal antibodies; nhp = non-human primates; pk = pharmacokinetics.
references:
species difference of esterase expression and hydrolase activity in plasma
lack of reinfection in rhesus macaques infected with sars-cov-2. biorxiv
the pathogenicity of sars-cov-2 in hace2 transgenic mice
ards and cytokine storm in sars-cov-2 infected caribbean vervets
disruption of adaptive immunity enhances disease in sars-cov-2
surgical mask partition reduces the risk of non-contact transmission in a golden syrian hamster model for coronavirus disease
simulation of the clinical and pathological manifestations of coronavirus disease 2019 (covid-19) in golden syrian hamster model: implications for disease pathogenesis and transmissibility
sars-cov-2 infection protects against rechallenge in rhesus macaques. science
infection with novel coronavirus (sars-cov-2) causes pneumonia in the rhesus macaques
comprehensive review of coronavirus disease 2019 (covid-19)
accelerating covid-19 therapeutic interventions and vaccines (activ): an unprecedented partnership for unprecedented times
a strategic approach to covid-19 vaccine r&d
ocular conjunctival inoculation of sars-cov-2 can cause mild covid-19 in rhesus macaques
a mouse-adapted sars-cov-2 model for the evaluation of covid-19 medical countermeasures. biorxiv
genetic dissection of the host tropism of human-tropic pathogens
an alphavirus-derived replicon rna vaccine induces sars-cov-2 neutralizing antibody and t cell responses in mice and nonhuman primates
translating in vitro antiviral activity to the in vivo setting: a crucial step in fighting covid-19
characteristic and quantifiable covid-19-like abnormalities in ct- and pet/ct-imaged lungs of sars-cov-2-infected crab-eating macaques (macaca fascicularis). biorxiv
rapid development of an inactivated vaccine for sars-cov-2. biorxiv
immune response, inflammation, and the clinical spectrum of covid-19
human angiotensin-converting enzyme 2 transgenic mice infected with sars-cov-2 develop severe and fatal respiratory disease. biorxiv
a comprehensive review of manifestations of novel coronaviruses in the context of deadly covid-19 global pandemic
sars-cov-2 infection of african green monkeys results in mild respiratory disease discernible by pet/ct imaging and prolonged shedding of infectious virus from both respiratory and gastrointestinal tracts. biorxiv
a sars-cov-2 infection model in mice demonstrates protection by neutralizing antibodies
fact sheet: explaining operation warp speed
clinical features of patients infected with 2019 novel coronavirus in wuhan
syrian hamsters as a small animal model for sars-cov-2 infection and countermeasure development
pathogenesis of sars-cov-2 in transgenic mice expressing human angiotensin-converting enzyme 2. cell
development of a coronavirus disease 2019 nonhuman primate model using airborne exposure. biorxiv
antiviral treatment of sars-cov-2-infected hamsters reveals a weak effect of favipiravir and a complete lack of effect for hydroxychloroquine
infection and rapid transmission of sars-cov-2 in ferrets
spike mutation pipeline reveals the emergence of a more transmissible form of sars-cov-2. biorxiv
butyrylcholinesterase, paraoxonase, and albumin esterase, but not carboxylesterase, are present in human plasma
rapid selection of a human monoclonal antibody that potently neutralizes sars-cov-2 in two animal models. biorxiv
functional and genetic analysis of viral receptor ace2 orthologs reveals a broad potential host range of sars-cov-2. biorxiv
viral dynamics in mild and severe cases of covid-19
comparison of sars-cov-2 infections among 3 species of non-human primates. biorxiv
spike protein recognition of mammalian ace2 predicts the host range and an optimized ace2 for sars-cov-2 infection
hydroxychloroquine in the treatment and prophylaxis of sars-cov-2
lethal infection of k18-hace2 mice infected with severe acute respiratory syndrome coronavirus
sars-like wiv1-cov poised for human emergence
syrian hamster as an animal model for the study on
evaluation of k18-hace2 mice as a model of sars-cov-2 infection. biorxiv
respiratory disease in rhesus macaques inoculated with sars-cov-2. nature
accelerating covid-19 therapeutic interventions and vaccines (activ)
lethality of sars-cov-2 infection in k18 human angiotensin converting enzyme 2 transgenic mice. biorxiv, 2020
age-dependent progression of sars-cov-2 infection in syrian hamsters
sars-cov-2 viral load predicts covid-19 mortality
predicting the angiotensin converting enzyme 2 (ace2) utilizing capability as the receptor of sars-cov-2
comparison of transgenic and adenovirus hace2 mouse models for sars-cov-2 infection. biorxiv
sars-cov-2 is transmitted via contact and via the air between ferrets
aged balb/c mice as a model for increased severity of severe acute respiratory syndrome in elderly humans
severe acute respiratory syndrome coronavirus infection of golden syrian hamsters
comparative pathogenesis of covid-19, mers, and sars in a nonhuman primate model
rapid isolation of potent sars-cov-2 neutralizing antibodies and protection in a small animal model. biorxiv
hydroxychloroquine proves ineffective in hamsters and macaques infected with sars-cov-2. biorxiv
understanding sars-cov-2-related multisystem inflammatory syndrome in children
comparative analysis of esterase activities of human, mouse, and rat blood
dose-dependent response to infection with sars-cov-2 in the ferret model: evidence of protection to re-challenge. biorxiv
an orally bioavailable broad-spectrum antiviral inhibits sars-cov-2 in human airway epithelial cell cultures and multiple coronaviruses in mice
susceptibility of ferrets, cats, dogs, and other domesticated animals to sars-coronavirus 2
pathogenesis and transmission of sars-cov-2 in golden hamsters
covid-19 illness in native and immunosuppressed states: a clinical-therapeutic staging proposal
sars-cov-2 infection leads to acute infection with dynamic cellular and inflammatory flux in the lung that varies across nonhuman primate species. biorxiv
a hitchhiker's guide to humanized mice: new pathways to studying viral infections
ferrets as a novel animal model for studying human respiratory syncytial virus infections in immunocompetent and immunocompromised hosts. viruses
generation of a broadly useful model for covid-19 pathogenesis, vaccination, and treatment
a mouse model of sars-cov-2 infection and pathogenesis
animal models for influenza virus pathogenesis, transmission, and immunology
sars-cov-2 spike glycoprotein vaccine candidate nvx-cov2373 elicits immunogenicity in baboons and protection in mice. biorxiv
ad26 vaccine protects against sars-cov-2 severe clinical disease in hamsters
severe acute respiratory syndrome coronavirus infection of mice transgenic for the human angiotensin-converting enzyme 2 virus receptor
pathology of experimental sars coronavirus infection in cats and ferrets
chadox1 ncov-19 vaccination prevents sars-cov-2 pneumonia in rhesus macaques. biorxiv
utility of the aged balb/c mouse model to demonstrate prevention and control strategies for severe acute respiratory syndrome coronavirus (sars-cov)
precision mouse models with expanded tropism for human pathogens
receptor recognition by the novel coronavirus from wuhan: an analysis based on decade-long structural studies of sars coronavirus
clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in
therapeutic efficacy of the small molecule gs-5734 against ebola virus in rhesus monkeys
clinical benefit of remdesivir in rhesus macaques infected with sars-cov-2. biorxiv
sars-cov-2 infection in the lungs of human ace2 transgenic mice causes severe inflammation, immune cell infiltration, and compromised respiratory function. biorxiv
establishment of an african green monkey model for covid-19. biorxiv
characteristics of and important lessons from the coronavirus disease 2019 (covid-19) outbreak in china: summary of cases from the chinese center for disease control and prevention
differential virological and immunological outcome of severe acute respiratory syndrome coronavirus infection in susceptible and resistant transgenic mice expressing human angiotensin-converting enzyme 2
dna vaccine protection against sars-cov-2 in rhesus macaques
age-related rhesus macaque models of covid-19
the d614g mutation in the sars-cov-2 spike protein reduces s1 shedding and increases infectivity
viral load dynamics and disease severity in patients infected with sars-cov-2 in zhejiang province
sars-cov-2: underestimated damage to nervous system
a pneumonia outbreak associated with a new coronavirus of probable bat origin
sars-cov-2 receptor ace2 is an interferon-stimulated gene in human airway epithelial cells and is detected in specific cell subsets across tissues
the authors would like to thank the activ preclinical working group for discussions and the spark for this manuscript.
we would also like to thank annaliesa anderson for critical review, along with michael diamond, prabha fernandes, and tomas cihlar. we would especially like to thank kara carter and jay grobler for their review and discussions on figures and shared content with the companion manuscript. we thank katinka vigh-conrad for assistance with figures.
key: cord-325738-c800ynvc
authors: shi, pengpeng; cao, shengli; feng, peihua
title: seir transmission dynamics model of 2019 ncov coronavirus with considering the weak infectious ability and changes in latency duration
date: 2020-02-20
journal: nan
doi: 10.1101/2020.02.16.20023655
sha:
doc_id: 325738
cord_uid: c800ynvc
pneumonia patients with 2019-ncov in the latent period are not easy to quarantine effectively, but there is evidence that they have strong infectious ability. here, we assumed that the infectious ability of patients in the latent period is slightly less than that of symptomatic infected patients. we established a new seir propagation dynamics model that comprehensively considers the weak transmission ability during the incubation period, the variation in incubation period length, and government intervention measures to track and quarantine contacts. based on the raw epidemic data of china from january 23, 2020 to february 10, 2020, the dynamic parameters of the new seir model were fitted. solving the model with a forward euler integration scheme, we analyzed the effect of the infectious ability of incubation-period patients on the theoretical estimates of the model and predicted the time at which the number of infections would peak in china. the new coronavirus (2019-ncov) is a major epidemic that humans are experiencing. as a kind of coronavirus, it poses a continuing threat to human health because of its high transmission efficiency and serious infection consequences [1]. since the first case of novel coronavirus pneumonia was discovered in early dec 2019, it has spread widely over the past two months.
by 10:00 on feb 10, 2020, there were 42708 confirmed cases of 2019-ncov infection and 21675 suspected cases in china. at present, 2019-ncov pneumonia cases have been confirmed in dozens of countries and regions including the united states, germany, france, canada, australia, and japan. the rapid spread of 2019-ncov pneumonia around the world poses a major threat to the closely connected and interdependent world of today. the reproductive number r refers to the average number of secondary cases generated from primary cases, and has become a key quantity for determining the intensity of interventions needed to control epidemics [2]. on january 29, 2020, li et al conducted a study of the first 425 confirmed cases in wuhan, showing that the r of 2019-ncov was 2.2, and revealed that person-to-person transmission occurred between close contacts [3]. on january 26, 2020, new research showed that the reproductive ratio of 2019-ncov is 2.90, which is higher than the 1.77 estimated for the sars epidemic [4]. in the absence of comprehensive treatments or vaccines, china has adopted the most effective isolation prevention and control measures, quarantining patients diagnosed with 2019-ncov to control the spread of infection, but the number of 2019-ncov infections has far exceeded that of the sars epidemic. existing basic research results and the actual course of epidemic spread have shown that 2019-ncov has a higher pandemic risk than the sars outbreak of 2003 [4]. this paper discusses the feasibility of using a deterministic transmission dynamics model of infectious diseases to assess the development of the 2019-ncov epidemic in china. some researchers have tried to study and evaluate the development trend of the 2019-ncov pneumonia epidemic through transmission dynamics.
on jan 24, 2020, the british scholars read et al [5] estimated that the number of 2019-ncov infection cases in wuhan would reach 190000 on feb 4, which clearly overestimated the development trend of the epidemic. on jan 31, 2020, wu et al [6], scholars from hong kong, china, predicted that the number of infections in wuhan on jan 25 exceeded 75815, which also clearly overestimated the spread of the epidemic. the chinese scholars shen et al. [7] used dynamics models to predict the peak time and scale of the epidemic, and estimated that the peak number of infections would be less than 20,000, which was lower than the raw data released on feb 10. the chinese scholars xiao et al. [8] established a transmission dynamics model that considers intervention strategies such as close tracking and quarantine. based on their model, they predicted that the epidemic would reach its peak around feb 5 [8], an early estimate of the peak time. in summary, although some researchers have carried out research on the transmission dynamics of the 2019-ncov pneumonia epidemic, the real epidemic development has deviated far from the predictions of previous studies. in previous studies of infectious disease transmission dynamics, the epidemic was in its early stages of development and sufficient raw data were lacking, so it was difficult to accurately predict its development. in addition, the shortcomings of the transmission dynamics models themselves have also become a direct cause limiting their prediction performance. patients in the latent period are not easy to quarantine effectively, and recent evidence shows that they have strong infectious ability, but the existing epidemic transmission dynamics models [5] [6] [7] [8] often ignore the transmission risk posed by patients in the latent period. in addition, researchers have found that estimates of the average latency of 2019-ncov are also changing.
the incubation period was determined to be 7 days in january and was recently estimated to be 3 days, which means that people infected with 2019-ncov become more likely to develop symptoms quickly. obviously, it is still necessary to make a new assessment of the spread of the 2019-ncov epidemic, which has important practical significance for the study, judgment, and prevention of the epidemic situation. in this paper, we established a new seir propagation dynamics model considering the weak transmission ability of patients in the incubation period, the variation of the incubation period length, and comprehensive government intervention measures of tracking and quarantine. based on this new seir propagation dynamics model, the effect of the infectious ability of incubating patients on the theoretical estimation of the present seir model is analyzed, and the time of the peak number of infections in china is predicted. considering that the chinese spring festival began on january 23 and that the chinese government initiated effective prevention, control and quarantine measures for the whole population, it can be assumed that there are few new imported cases in the provinces and that the impact of migrant population flow can be basically ignored; therefore, the premises for a transmission dynamics modeling study are satisfied. the model in this paper is based on the classical seir model, which divides the population into s (susceptible), i (infected), e (exposed), and r (recovered). it is assumed that all individuals in the population are at risk of infection, and that antibodies are produced when an infected individual recovers, so the recovered population r will not be infected again. in addition, quarantine measures to prevent and treat the infectious disease are taken into account in the model.
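the classic seir structure just described can be written as four odes and integrated with a simple forward-euler loop. the sketch below is an illustration in python (rather than the paper's matlab), with assumed parameter values for beta, sigma, and gamma that are not the paper's fitted numbers:

```python
def seir_step(S, E, I, R, beta=0.6, sigma=1/5, gamma=0.2, N=1.0, dt=0.01):
    # classic SEIR: susceptible -> exposed -> infected -> recovered;
    # recovered individuals are assumed immune and never re-enter S
    dS = -beta * S * I / N
    dE = beta * S * I / N - sigma * E   # sigma: rate exposed become infectious
    dI = sigma * E - gamma * I          # gamma: recovery rate
    dR = gamma * I
    return S + dS * dt, E + dE * dt, I + dI * dt, R + dR * dt

# integrate 120 days from a small seed of exposed individuals
S, E, I, R = 0.999, 0.001, 0.0, 0.0
for _ in range(int(120 / 0.01)):
    S, E, I, R = seir_step(S, E, I, R)
```

because the four derivatives sum to zero, s + e + i + r stays equal to the initial population at every euler step, which is a convenient sanity check on any implementation.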
the population components in the model also include sq (isolated susceptible), eq (isolated exposed), and iq (isolated infected). since isolated infected individuals are immediately sent to designated hospitals for quarantine and treatment, this part of the population is converted to the hospitalized class h in this model. therefore, the original populations s, i, and e refer, respectively, to susceptible, infected, and exposed individuals who have been missed by the quarantine measures. an isolated susceptible person is reconverted to a susceptible person after the isolation period. both the infected and the exposed have the ability to infect the susceptible, turning them into the exposed. the transformation relationship among the different groups of people is shown in fig. 1.
figure 1. modified seir propagation dynamics model.
it is assumed that the quarantine ratio is q, the probability of transmission is β, and the contact rate is c. θ is the ratio of the transmission ability of the exposed to that of the infected: when θ = 0, the infection ability of patients in the latent period is ignored, and when θ = 1 it is the same as that of patients with symptoms. λ is the rate of isolation release, hence λ = 1/14 (the quarantine duration is 14 days). with the growth of statistical data, researchers' estimates of the average latent period of 2019-ncov have also been changing: the average incubation period has changed from 7 days, determined in january, to 3 days, estimated recently, which means that people infected with 2019-ncov are more likely to show symptoms. σ is the transformation rate from the exposed to the infected. considering the change in the actual latent period, a smooth transition curve from 1/7 to 1/3 is used to represent the change of this transformation rate as the latent period changes.
the curve equation is σ(t) = (a − b) / (1 + exp(k (t − t0))) + b, where a and b correspond to the values of the transformation rate from the exposed to the infected in the early and late stages of the studied epidemic period. the epidemic data in this letter run from jan. 23 to feb. 10 in china, so a and b correspond to the conversion rates on jan. 23 and feb. 10 respectively. in eq. (2), t0 represents the time when the conversion rate equals the average of a and b, and k can be determined directly from the rate of change (the derivative) of the conversion rate at that time. the newly established 2019-ncov modified seir propagation dynamics model is a system of odes in which δ_i is the quarantine rate of the infected, γ_i is the recovery rate of the infected, δ_q is the transformation rate from the exposed to the isolated infected, γ_h is the recovery rate of the isolated infected, α is the disease-induced death rate, and σ is the transformation rate from the exposed to the infected. the epidemic data used in this paper come from the raw epidemic notification data published on the official website of the national health commission of the people's republic of china (http://www.nhc.gov.cn/). the governing eqs. 2 and 3 are solved using euler's numerical method with an integration step of 0.01 (days). the initial value of the dynamic system refers to the epidemic data officially released by the chinese government on january 23, 2020, and some parameters have been reasonably estimated; the specific parameter values are shown in table 1. the other model parameters are the results of fitting optimization based on the raw data: for each assumed infectious ability of the exposed (each value of θ), the corresponding model parameters are obtained by fitting and optimization, and the future epidemic situation is predicted. as can be seen from fig. 2, the corresponding optimal model parameters are found for the different infectious abilities.
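the euler scheme described here (step 0.01 days) can be sketched in python, mirroring the structure of the paper's matlab fragment. the parameter values below, including k and t0 in the logistic transition, are illustrative assumptions, not the fitted values of table 1:

```python
import math

def sigma_t(t, a=1/7, b=1/3, k=1.0, t0=4.0):
    # smooth logistic transition of the exposed-to-infected rate
    # from a (early epidemic) to b (late epidemic); k and t0 assumed
    return (a - b) / (1 + math.exp(k * (t - t0))) + b

def euler_step(state, t, p, dt=0.01):
    # one forward-Euler step of the modified SEIR system with
    # quarantine compartments (Sq, Eq) and hospitalization (H)
    S, E, I, Sq, Eq, H, R, D = state
    sig = sigma_t(t)
    contact = p["c"] * S * (I + p["theta"] * E)   # effective contacts
    dS  = -(p["beta"] + p["q"] * (1 - p["beta"])) * contact + p["lam"] * Sq
    dE  = p["beta"] * (1 - p["q"]) * contact - sig * E
    dI  = sig * E - (p["delta_i"] + p["alpha"] + p["gamma_i"]) * I
    dSq = (1 - p["beta"]) * p["q"] * contact - p["lam"] * Sq
    dEq = p["beta"] * p["q"] * contact - p["delta_q"] * Eq
    dH  = p["delta_i"] * I + p["delta_q"] * Eq - (p["alpha"] + p["gamma_h"]) * H
    dR  = p["gamma_i"] * I + p["gamma_h"] * H
    dD  = p["alpha"] * (I + H)                    # cumulative deaths
    derivs = (dS, dE, dI, dSq, dEq, dH, dR, dD)
    return [x + d * dt for x, d in zip(state, derivs)]

params = dict(beta=0.2, c=4.0, q=0.2, theta=0.5, lam=1/14,
              delta_i=0.13, gamma_i=0.1, delta_q=0.13,
              gamma_h=0.1, alpha=0.01)            # all illustrative
state = [1 - 1e-4, 1e-4, 0, 0, 0, 0, 0, 0]       # fractions of the population
t = 0.0
for _ in range(int(30 / 0.01)):
    state = euler_step(state, t, params)
    t += 0.01
```

the eight derivatives sum to zero, so the total population is conserved at every step, while the hospitalized class h and the cumulative death count fill up as e feeds i and the quarantine flows feed h.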
under the optimal parameters, the number of infected persons predicted by the theoretical model is in good agreement with the raw data from january 23, 2020 to february 10, 2020. next, we can analyze the effect of different infectious abilities of patients in the latent period on the model estimation. note that when the infectious capacity of patients in the incubation period is not considered, that is, θ = 0, the present model degenerates to the recent infectious-disease transmission dynamics model established in ref. [8]. the results in figure 2 show that the peak estimate of the number of infected people using the theory of ref. [8] is much higher than the estimate made by the present theoretical model. this is due to the neglect of the infectious ability of patients in the latent period in previous models [6][7][8]: because only the symptomatic infected are assumed to transmit, the probability of infection must be overestimated to adequately fit the raw data, which ultimately leads to an overestimation of the number of infected people.
figure 2. effect of the infectious ability of patients in the latent period on the theoretical estimation.
figure 3. impact of the variation of incubation period length on the theoretical estimation.
next, figure 3 analyzes the effect of the variation of incubation period length on the theoretical estimation. it can be seen from fig. 3 that the peak number of infected persons in the model considering the variation of the incubation period is lower than that in the model with a constant incubation period. this corresponds to an easily understood fact: with the development of the epidemic, the length of the incubation period has shortened, which directly leads patients in the incubation period to show symptoms faster and to be hospitalized and quarantined sooner. therefore, the
gradual shortening of the incubation period is obviously conducive to suppressing the spread of the epidemic, eventually leading to a reduction in the peak number of infections.
figure 4. evaluation of the spread of china's 2019-ncov epidemic.
finally, the trend of china's 2019-ncov epidemic was evaluated using the new seir model. in order to obtain an interval estimate of the peak number of infections and its occurrence time, the contact rate was set to vary within the interval [2.8, 4.2]. the prediction of the current theoretical model indicates that the number of 2019-ncov infections in china reaches its peak after february 19; the upper and lower limits of the estimated peak time of the epidemic correspond to the abscissas of the peaks of the upper and lower envelopes of the theoretical forecast of the number of infected. the current theoretical estimate of the number of infected people in china is in good agreement with the raw numbers in the period from january 23, 2020 to february 10, 2020. it is worth noting that the theoretical model and parameters of this paper were initially established before february 5; the predicted numbers of infected persons from february 6 to february 10 are largely consistent with the actual data subsequently reported, which initially demonstrates the feasibility of the model in predicting the short-term development of the epidemic. this paper attempts to estimate the short-term development of china's 2019-ncov epidemic and predicts that the number of people infected in china will peak after february 19. it should be noted that, because the infectious ability of patients in the latent period and the variation of the incubation period length are considered, the present model is closer to the real situation, so the estimates based on it should also be more reliable.
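the interval-estimate procedure above, varying the contact rate over [2.8, 4.2] and reading off the envelope of the predicted peaks, can be sketched as follows. this is a simplified seir without the quarantine compartments, with assumed rate values; it illustrates the procedure, not the paper's exact numbers:

```python
def peak_of_epidemic(c, beta=0.1, sigma=0.2, gamma=0.1, days=365, dt=0.05):
    # simplified SEIR (no quarantine compartments): returns the day and
    # height of the infection peak for a given daily contact rate c;
    # beta is the per-contact transmission probability (assumed value)
    S, E, I, R = 1 - 1e-4, 1e-4, 0.0, 0.0
    peak_day, peak_I = 0.0, 0.0
    for n in range(int(days / dt)):
        new_inf = beta * c * S * I
        dS, dE = -new_inf, new_inf - sigma * E
        dI, dR = sigma * E - gamma * I, gamma * I
        S, E, I, R = S + dS * dt, E + dE * dt, I + dI * dt, R + dR * dt
        if I > peak_I:
            peak_I, peak_day = I, n * dt
    return peak_day, peak_I

# envelope of the peak over the contact-rate interval [2.8, 4.2]
lo_day, lo_peak = peak_of_epidemic(2.8)
hi_day, hi_peak = peak_of_epidemic(4.2)
```

the abscissas of the two peaks then bound the estimated peak time, in the spirit of fig. 4: a higher contact rate gives a higher and earlier peak, a lower one a lower and later peak.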
however, some necessary assumptions are still made in establishing the model in this paper, and the mathematical description necessarily differs from the complex reality, which leads to unavoidable deviations in the prediction results. (medrxiv preprint, not peer-reviewed: https://doi.org/10.1101/2020.02.16.20023655)
references:
transmission dynamics and control of severe acute respiratory syndrome
early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia
transmission dynamics of 2019 novel coronavirus (2019-ncov). biorxiv
novel coronavirus 2019-ncov: early estimation of epidemiological parameters and epidemic predictions. medrxiv
nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study. the lancet
modelling the epidemic trend of the 2019 novel coronavirus outbreak in china. biorxiv
qian li, et al. estimation of the transmission risk of 2019-ncov and its implication for public health interventions
this work was financially supported by the national natural science foundation of china (no. 11802225).
% matlab code fragment from the paper (the first statement, which computes
% the time-varying rate sigma, is truncated in the extracted text):
*t-4) ./ii*t/0.2))+1/3;
ds = -(beta*c+c*q*(1-beta))*s*(i+theta*e)+lam*sq;  % susceptible
de = beta*c*(1-q)*s*(i+theta*e)-sigma*e;           % exposed
di = sigma*e-(delta_i+alpha+gama_i)*i;             % infected
dsq = (1-beta)*c*q*s*(i+theta*e)-lam*sq;           % isolated susceptible
deq = beta*c*q*s*(i+theta*e)-delta_q*eq;           % isolated exposed
dh = delta_i*i+delta_q*eq-(alpha+gama_h)*h;        % hospitalized
dr = gama_i*i+gama_h*h;                            % recovered
dde = alpha*(i+h);                                 % cumulative deaths
%% euler integration algorithm
s = s+ds*t; e = e+de*t; i = i+di*t;
sq = sq+dsq*t; eq = eq+deq*t; h = h+dh*t; r = r+dr*t;
aa = [aa; s e i sq eq h r];
end
%% theoretical estimation
infected(:,1) = round(aa(1:1/t:size(aa,1),3));
cured(:,1) = round(aa(1:1/t:size(aa,1),7));
plot(
key: cord-342855-dvgqouk2 authors: anzum, r.; islam, m. z.
title: mathematical modeling of coronavirus reproduction rate with policy and behavioral effects date: 2020-06-18 journal: nan doi: 10.1101/2020.06.16.20133330 sha: doc_id: 342855 cord_uid: dvgqouk2 in this paper a modified mathematical model based on the sir model is used which can predict the spreading of the coronavirus disease (covid-19) and its effects on people in the days ahead. the model takes into account the death, infected and recovered characteristics of this disease. to determine the extent of the risk posed by this novel coronavirus, the transmission rate (r0) is tracked over a time period from the beginning of the spread of the virus. in particular, the model includes a novel policy term to capture the response of r0 to the spread of the virus over time. the model estimates the vulnerability of the pandemic and predicts new cases by estimating a time-varying r0 that captures changes in the behavior of the sir model in response to the policies taken at different times and in different locations of the world. this modified sir model, with different values of r0, can be applied to different country scenarios using the real-time data reports provided by the authorities during this pandemic. the effective evaluation of r0 can forecast the necessity of lockdown as well as the reopening of the economy. this is a new virus and the world is facing a new situation [10], as no vaccine has been made to combat it. in this situation, on 30 january 2020, the world health organization (who) declared it a public health emergency of global concern [11]. as of 20 may 2020, the disease was confirmed in more than 4,998,097 cases reported globally, with 325,304 deaths reported. the world health organisation (who) declared the coronavirus spread a pandemic [12]. in this paper, the literature review covers some relevant mathematical models that have tried to describe the dynamics of the evolution of covid-19.
some of the phenomenological models tried to generate and assess short-term forecasts using the cumulative number of reported cases. the sir model is a traditional one for predicting the vulnerability of a pandemic and can also be used to predict the future scenario of coronavirus cases. this model is, however, modified [13], [14] by including other variables to calibrate the possible infection rate over time. coronavirus transmission (how quickly the disease spreads) is indicated by its reproduction number (r0, pronounced r-nought or r-zero), which indicates the number of people to whom the virus can be transmitted by a single infected case. who predicted (on jan. 23) that r0 would be between 1.4 and 2.5. other research measured r0 with various values somewhere between 3.6 and 4.0, and 2.24. the long-used value of r0 is 1.3 for flu and 2.0 for sars [15] [2]. in this research, firstly the sir model is simulated taking a constant r0 (with β = 0.5 and γ = 0.2, r0 = β/γ = 2.5) to observe the response. since reducing face-to-face contact among people and staying home in lockdown can reduce the further infection rate, r0 is then taken as time-varying rather than fixed, to observe the overall scenario of coronavirus spreading. the organization of the paper is as follows: a brief study of some mathematical models related to coronavirus spreading is given in section 2, the literature review; section 3 contains the modeling of coronavirus spreading; section 4 is the result and discussion part; and section 5 concludes the paper. the sir epidemic model used in this work is one of the simplest compartmental models, first used by kermack and mckendrick (1927). a compartmental model denotes mathematical modeling of infectious diseases where the population is separated into various compartments, for example s, i, or r (susceptible, infectious, or recovered).
many works have been undertaken during the coronavirus pandemic utilizing compartmental models [16], [17], and the imperial college covid-19 response team (2020) provides a useful overview of this classic model, showing that such models can be applied to understand the current health hazards. much interest has been generated among economists in identifying the impact of the current pandemic on economic sectors by exploring compartmental models along with standard economic models using econometric techniques [18], [19] (cc-by-nd 4.0; medrxiv preprint posted june 18, 2020). it has been argued by economists that many of the parameters controlling the moves among compartments are not structural but depend on individual decisions and policies. for example, according to eichenbaum et al. [20] and farboodi, jarosch and shimer [21], the number of new infections is a function of the endogenous labor supply and consumption choices of individuals, which is determined by the rate of contact, and the rate of contact is amenable to standard decision-theory models. similarly, the death rates and recovery rates are not just clinical parameters: they can be treated as functions of policy decisions, since for example expanding emergency hospital capacity may increase the recovery rate and decrease death rates. also, the fatality ratio is a complex function because it depends on many clinical factors. therefore the selection-into-disease mechanisms are themselves partly the product of endogenous choices [22].
moreover, concerning the identification problems of compartmental models, the economists atkeson [23] and korolev [24] have found that these models admit many sets of parameters that fit the observed data equally well while implying very different long-run consequences. some researchers, linton [25] and kucinskas [26], used time-series models in the econometric tradition rather than compartmental models. however, many economists are pushing the study of compartmental models in a multitude of dimensions. acemoglu et al. [27] and alvarez et al. [28] identified the optimal lockdown policy for a planner who wants to control the fatalities of a pandemic while minimizing the output cost of the lockdown. berger et al. [29] examine the role of testing and case-dependent quarantines. bethune and korinek [30] estimate the infection externalities associated with covid-19. bodenstein et al. [31] examine a compartmental model combined with a multi-sector dynamic general equilibrium model. other researchers, like garriga, manuelli and sanghi [32], hornstein [33], and karin et al. [34], study a variety of containment policies. toda [35] estimated a sird model to explore the optimal mitigation policy that controls the timing and intensity of social distancing. flavio toxvaerd [36] also developed a simple economic model emphasizing endogenous social distancing. furthermore, many economists have commented that coronavirus transmission cannot be a biologically induced constant; rather, it varies with human behavior, and changes in human behavior can be predicted in response to changing social policies. another form, a multi-risk sir model with an assumed targeted lockdown period, is provided by the economists daron acemoglu et al. [37], an epidemiological model with economic incentives.
the authors of this paper argued that the herd-immunity threshold might be much lower if super-spreaders, like people in hospitals, emergency service providers, bus drivers, etc., are made immune first. therefore in this paper we have tried to build on these ideas by allowing the infection rate to change over time as social distancing is imposed among people. moreover, we focus on the reproduction rate r0, which is not a constant but varies with time based on demographic and policy heterogeneity. in the conventional sir model, for a constant population of n people, each person at each point in time is in one of five states, which following [38] can be taken as susceptible, infected, resolving, dead, and recovered, so that their counts sum to n at every date. a susceptible person can catch the disease when coming in contact with an infectious person. the sir model is used to predict the vulnerability of any pandemic, but it may not be directly applicable to the coronavirus case, since it assumes a constant reproduction rate. the virus seems to diminish only when all affected people have recovered, which is practically not possible; people affected by the coronavirus are highly contagious to those who come in contact with them, and thus the spread of coronavirus infection is increasing day by day. therefore, a mathematical model of its spread can help predict its vulnerability and indicate when to take efficient measures to lower the contact rate. a modified behavioral sir model is presented by [38].
in the standard sir model, β is a constant, and hence we use a constant reproduction rate r0 = β/γ. with exponential growth of the disease, the growth can be confined only by reducing the number of susceptible persons in the total population. moreover, each infected person is assumed to recover after a certain time, but in the actual scenario this does not always happen because of deaths from coronavirus infection. in the actual sir model a lowered β was used; by lowering the reproduction rate r0, the further spread of the coronavirus can be suppressed. after the lockdown, most people reduced their contacts, which decreases the growth in the number of infectious people proportionally. since β governs the rate at which susceptible people become infected and cannot be negative, a logarithmic specification can be used. however, in the absence of sufficient continuous testing, the number of infectious people at any time cannot be observed; the model is therefore further expanded to let the transmission rate respond to the current death rate. the logarithmic function is used to capture the early declines, while super-spreading activities are considered to be eliminated. let us assume the total population is 1 million. in this model we use the same parameter values as the researchers chad and jesús [38]: γ = 0.2, corresponding to remaining infectious for 5 days on average, and a death rate of 1%; the calibration of α is done at an infection rate of 0.5%, i.e., 5,000 deaths per million population.
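as an illustration of this death-responsive mechanism, the sketch below adds a behavioral transmission rate to a sird model. the exponential-decay form beta_t = beta * exp(-alpha * daily_deaths) and the value of alpha are assumptions made for this sketch, not the exact specification of [38]:

```python
import math

def simulate_sird(N=1_000_000, beta=0.5, gamma=0.2, delta=0.01,
                  alpha=0.0, days=300, dt=0.1):
    # forward-Euler SIRD; with alpha > 0 the transmission rate falls
    # as the current daily death toll rises (behavioral feedback)
    S, I, R, D = N - 1.0, 1.0, 0.0, 0.0
    peak_I = 0.0
    for _ in range(int(days / dt)):
        daily_deaths = delta * gamma * I              # deaths per day now
        beta_t = beta * math.exp(-alpha * daily_deaths)
        new_inf = beta_t * S * I / N
        dS, dI = -new_inf, new_inf - gamma * I
        dR, dD = (1 - delta) * gamma * I, delta * gamma * I
        S, I, R, D = S + dS * dt, I + dI * dt, R + dR * dt, D + dD * dt
        peak_I = max(peak_I, I)
    return peak_I, D

peak_fixed, deaths_fixed = simulate_sird(alpha=0.0)    # constant r0 = 2.5
peak_behav, deaths_behav = simulate_sird(alpha=0.05)   # people react to deaths
```

with alpha = 0 the run reproduces the uncontrolled epidemic (r0 = beta/gamma = 2.5, with cumulative deaths near 1% of the roughly 90% of the population that ends up infected); with alpha > 0 the epidemic is flattened, the infection peak is lower, and total deaths fall.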
the standard sir model simulation with the above-mentioned parameter values is as follows. from the graph, the first day starts with only one infected person; the number of infected people then increases exponentially, peaking at almost half the population. after about 25 days of the infection spreading, herd immunity begins as the maximum number of infected people is reached, and the number of sick people who are resolving peaks a few days later. the pandemic period lasts over 2 months and is expected to die out after that, while nearly everyone in the total population of 1 million gets infected, resulting in 0.8%, or 8,000, deaths. furthermore, in the behavioral modified sir model simulation, the system reacts by reducing the transmission rate, i.e., the r function varies over time. the graph shows that about 4,000 people will be infected, rather than the total population of 1 million. the pandemic gets going exponentially (blue line), but when the infection rate reaches 1,000 per million there is a noticeable, acute reduction in the further reproduction rate, indicated by the black dashed line.
moreover, the transmission rate r is asymptotic in this model, with a decline in the infection rate as well as fewer deaths per day. in this process, it may take a long period to achieve herd immunity if all the initiatives taken to lessen contacts are stopped. assuming the reproduction rate asymptotes to r0 = 1, the next graph shows an aggressive response: α is increased by a factor of 5, shown as the dashed line (red) in the first graph. from the vertical scale of this graph it is clear that fewer people are vulnerable to infection, implying that the overall infection rate is comparatively low. being stricter about social gatherings and possible contacts does not change the situation, since the reproduction (transmission) rate is still asymptotic, such that r declines to one; however, in such a scenario we get a low infection rate on a daily basis. after reducing the number of infected people over time, letting r vary over time and increasing α by a factor of 2 for 100 days, deaths from the infections are found to be steady over time after a few weeks, and the situation is then considered under control. after letting people go back to normal life with no restrictions, the situation may come back with more severe effects.
then the coronavirus pandemic might be uncontrolled, with high deaths in the first wave, while taking possible measures, together with people's awareness in avoiding contact, can reduce the infection rate and death rate. furthermore, after the death rate comes down, people ease up and a second wave of infection appears, and so forth. therefore, the reproduction rate r cannot be constant, as it varies with the behavioral changes of people and the policies taken to control the pandemic. covid-19 is a current global issue which has spread to almost every country of the world and caused restrictions to the free movement of people, resulting in a massive economic loss worldwide. the transmission rate r0, the number of newly infected individuals derived from a single case, is the factor that calibrates the reproduction of the coronavirus in the sir model. in the traditional sir model, taking r0 as a constant cannot capture the actual scenario of coronavirus spreading: the value of r0 can be different for different places and time periods, and r0 can change with the behavioral changes of people induced by the policies adopted by the respective authorities. therefore, in this research r0 is not considered a constant but is used as a time-varying function. by taking possible measures to reduce social contact, r0 can be minimized over time, causing fewer deaths and providing a forecast for reopening the economy.
references:
population biology of infectious diseases: part 1
mathematical formulation and validation of the be-fast model for classical swine fever virus spread between and within farms
a novel spatial and stochastic model to evaluate the within- and between-farm transmission of classical swine fever virus. i. general concepts and description of the model
the global dynamics for an age structured tuberculosis transmission model with the exponential progression rate
a novel coronavirus outbreak of global health concern
severe acute respiratory syndrome-related coronavirus: the species and its viruses - a statement of the coronavirus study group. biorxiv
naming the coronavirus disease (covid-19) and the virus that causes it
unique epidemiological and clinical features of the emerging 2019 novel coronavirus pneumonia (covid-19) implicate special control measures
statement on the second meeting of the international health regulations
director-general's opening remarks at the media briefing on covid-19 - 11
estimating and simulating a sird model of covid-19 for many countries, states, and cities. no. w27128
early dynamics of transmission and control of covid-19: a mathematical modelling study. the lancet infectious diseases
the mathematics of infectious diseases
data gaps and the policy response to the novel coronavirus
policy implications of models of the spread of coronavirus: perspectives and opportunities for economists
the macroeconomics of epidemics
internal and external effects of social distancing in a pandemic
what does the case fatality ratio really measure?
how deadly is covid-19? understanding the difficulties with estimation of its fatality rate
identification and estimation of the seird epidemic model for covid-19
when will the covid-19 pandemic peak?
tracking r of covid-19
a multi-risk sir model with optimally targeted lockdown
an seir infectious disease model with testing and conditional quarantine
covid-19 infection externalities: trading off lives vs. livelihoods
social distancing and supply disruptions in a pandemic
optimal management of an epidemic: an application to covid-19
social distancing, quarantine, contact tracing, and testing: implications of an augmented seir-model
adaptive cyclic exit strategies from lockdown to suppress covid-19 and allow economic activity
susceptible-infected-recovered (sir) dynamics of covid-19 and economic impact. cambridge working papers in economics
individual variation in susceptibility or exposure to sars-cov-2 lowers the herd immunity threshold
key: cord-338466-7uvta990 authors: singh, brijesh p. title: modeling and forecasting the spread of covid-19 pandemic in india and significance of lockdown: a mathematical outlook date: 2020-10-31 journal: nan doi: 10.1016/bs.host.2020.10.005 sha: doc_id: 338466 cord_uid: 7uvta990 a very special type of pneumonic disease that generated covid-19 was first identified in wuhan, china in december 2019 and is spreading all over the world. the ongoing outbreak presents a challenge for data scientists to model covid-19 when its epidemiological characteristics are yet to be fully explained. the uncertainty around covid-19, with no vaccine and no effective medicine available to date, creates additional pressure on epidemiologists and policy makers. in such a crucial situation, it is very important to predict infected cases to support prevention of the disease and aid in the preparation of healthcare services. india is fighting efficiently against covid-19 and facing greater challenges because of its large population and high population density. though the government of india is taking all needful steps to prevent its spread, these have not been enough to control and stop the spread of the disease so far, perhaps due to the defiant nature of people living in india.
to take effective measures to control this disease, medical professionals need to know the estimated size of this pandemic and its pace. in this study, an attempt has been made to understand the spreading capability of covid-19 in india through some simple models. findings suggest that the lockdown strategies implemented in india have not reduced the pace of the pandemic significantly after the first lockdown. a novel corona virus, responsible for the epidemic popularly known as covid-19, is a new strain that had not been identified previously in humans. the world health organization (who) declared covid-19 a pandemic on march 11, 2020. the virus that caused the incidence of severe acute respiratory syndrome (sars) in 2002 in china, the virus that caused middle east respiratory syndrome (mers) in 2012 in saudi arabia, and the virus that causes covid-19 are genetically related to each other, but the diseases they cause are quite different (who). these viruses, in general, belong to a family of viruses that target and affect mammals' respiratory systems. the sars corona virus spread to humans via civet cats, while the mers virus spread via dromedaries. in the case of the novel corona virus, transmission typically happens via contact with an infected animal; the common carriers are perhaps bats, with initial reports from a seafood market in central wuhan, china. the novel corona virus (covid-19) started from wuhan, china, and was thus initially known as the wuhan virus; it expanded its circle to south korea, japan, italy, iran, the usa, france and spain, finally spreading in india. it is named novel because it is a never-before-seen mutation of an animal corona virus, but the certain source of this pandemic is still unidentified. it is said that the virus might be connected with a wet market (with seafood and live animals) in wuhan that was not complying with health and safety rules and regulations.
as of july 16, 2020, with the continuously increasing global risk, more than 14 million confirmed positive cases and more than 0.58 million deaths have occurred in the world. as the number of cases grows day by day in most countries of the world, some of the most populous countries like china, india, brazil and the usa are badly affected. in this context, modeling, transmission dynamics and estimation of the development of covid-19 play a crucial role. population based mathematical models, especially growth models, are in this scenario the most preferable techniques for understanding the future trajectory of the epidemic. epidemiological characteristics such as propagation dynamics, severity, susceptibility, and the effects of control measures for covid-19 have produced great concern among researchers (cowling and leung, 2020; lipsitch et al., 2020). since preventive measures like lockdown and social distancing put immense pressure on the economy of the country, quantitative estimates and predictions are necessary to learn the impact of the spread and to help plan the strategies against covid-19. given the paucity of such quantitative measures, the predictions based on the different ideas given in this paper become critical for knowing when covid-19 will stop. in the recent past a number of studies with various techniques and tools have been carried out to understand the dynamics of propagation of the disease and the future course of action. for covid-19, various models capable of providing worthy insights for health care policy making are being continuously developed and used to explain this pandemic retrospectively as well as to project future events (batista, 2020; koo et al., 2020; kucharski et al., 2020; tuite and fisman, 2020; wu et al., 2020). wu et al. (2020) analyzed the pace of virus transmissibility by estimating the value of r 0 with the help of a stochastic markov chain monte carlo method.
another analysis, with a mathematical incidence decay and exponential adjustment model, has also been performed. further, to explain the growth behavior of covid-19, a statistical exponential growth model adopting the serial interval from severe acute respiratory syndrome was applied by zhao et al. (2020). a three-parameter logistic growth function has been applied for china as well as some other countries, and its predictions are found very satisfying (shen, 2020). in the context of india, an early study of covid-19 (when it started spreading in india) by singh and adhikari (2020) rightly believed that the countrywide lockdown imposed on march 24 for 21 days might be insufficient for controlling the covid-19 pandemic. malhotra and kashyap (2020) tried to forecast the endpoints of the progression of covid-19 in indian states using sir and logistic growth models, and found the endpoint of covid-19 in india to be july 23, 2020. india has a huge population of about 1.3 billion, the majority of whom live in poor hygienic conditions, and medical facilities such as the number of doctors and hospitals are fewer in india than in developed countries; this indicates that the situation of india could become very critical, although india has a comparatively better public health system and political control than the above developed countries. the picture of india is not so good, with more than 1 million confirmed positive cases and more than 26 thousand deaths. although the death rate of this pandemic is low in comparison with other pandemics and diseases, its high rate of spread and the absence of a proper cure so far are the major concerns at the present time. right now in india only 29 districts out of 739 have more than 4000 covid-19 cases.
these districts are mainly metropolitan; if we implement preventive measures properly, the spread can be kept under control at the desired level, but due to the defiant nature of some people living in india and to political desire and rivalry, indian society is still facing problems created by covid-19. the first case of covid-19 was reported in india on january 30, 2020, when a student returned from wuhan, china (covid19india.org). the government of india was quick to issue various levels of travel advisories beginning from february 26, 2020, with restrictions on travel to china and nonessential travel restrictions to singapore, south korea, iran and italy. the efforts to control the disease by the hon'ble prime minister narendra modi ji through the janata curfew (public curfew) on march 22, 2020, can be seen as the beginning of wide-scale public preventive measures. india launched several social distancing measures and personal hygiene measures during the second week of march. symptoms of covid-19 are reported as cough, acute onset of fever and difficulty in breathing. of all the cases that have been confirmed, up to 20% have been deemed severe. cases vary from mild forms to severe ones that can lead to serious medical conditions or even death. it is believed that symptoms may appear in 2-14 days, as the incubation period for the disease has not yet been confirmed. however, in india a minimum quarantine period of 14 days has been declared by the government for suspected cases. since it is a new type of virus, a lot of research is being carried out across the world to understand the nature of the virus, the origins of its spread to humans, its structure, and possible cures/vaccines to treat covid-19. india also became a part of these research efforts after the first two confirmed cases were reported here on january 31, 2020.
then in india screening of travelers at airports was started, chinese visas were immediately canceled, and those found affected by covid-19 were kept in quarantine centers (ministry of home affairs, government of india, advisory). for the spread of covid-19, when disease dynamics are still unclear, mathematical modeling helps us to estimate the cumulative number of positive cases in the present scenario. now india is entering the mid stages of the epidemic. it is important to predict how the virus is likely to grow among the population. the covid-19 pandemic presents a challenge for data scientists to model it; however, the epidemiological characteristics of covid-19 are yet to be fully explained. the uncertainty around covid-19, with no vaccine and no effective medicine available until today, creates additional pressure on epidemiologists and policy makers. in such a crucial situation, it is very important to predict infected cases to support prevention of the disease and support the preparation of healthcare services. a mathematical modeling approach is a suitable tool to understand the dynamics of an epidemic. in this study some mathematical approaches to understand the dynamics of the novel covid-19 in india have been discussed. in the absence of a definite treatment modality like a vaccine, physical distancing has been accepted globally as the most efficient strategy for reducing the severity of the disease and gaining control over it (ferguson et al., 2020). also, it is reported that india is well short of the who's recommended minimum threshold of 2.28 skilled health professionals per 1000 population (anand and fan, 2016). therefore, on march 24, 2020, the government of india under prime minister narendra modi ji ordered a nationwide lockdown for 21 days, limiting movement of the entire 1.3 billion population of india as a preventive measure against the covid-19 pandemic.
it was ordered after a 14-h voluntary public curfew on 22 march. the lockdown was imposed when the number of confirmed covid-19 cases in india was approximately 500. on 14 april, the prime minister of india extended the nationwide lockdown until 3 may, with a conditional relaxation after 20 april for some regions. on 4 may, the government of india again extended the nationwide lockdown by a further 2 weeks, until 17 may. also, the government divided the entire nation into three zones, viz. green, red and orange, with relaxations applied accordingly. various measures such as social distancing, lockdown, masking and regular hand washing have already been implemented to prevent the spread of covid-19, but in the absence of a particular medicine and vaccine it is very important to predict how the infection is likely to develop among the population, to support prevention of the disease and aid in the preparation of healthcare services. this will also be helpful in estimating the health care requirements and sanctioning a measured allocation of resources. it is a well known fact that covid-19 has spread differently in different countries, so any planning for mounting a fresh response has to be adaptable and situation-specific. data obtained on the covid-19 outbreak have been studied by various researchers using different mathematical models (srinivasa rao arni et al., 2020). many other studies (anastassopoulou et al., 2020; corman et al., 2020; gamero et al., 2020; huang et al., 2020; hui et al., 2020; rothe et al., 2020) on this recent epidemic have reported many meaningful modeling results based on different principles of mathematics. most pandemics follow an exponential curve during the initial spread and eventually flatten out (junling et al., 2014).
the sir model is one of the best suited models for projecting the spread of infectious diseases like covid-19, where a person once recovered is not likely to become susceptible to the infection again (kermack and mckendrick, 1927). the susceptible-infectious-recovered (sir) compartment model (herbert, 2000) is used to include considerations for susceptible, infectious, and recovered or deceased individuals. these models have shown significant predictive ability for the growth of covid-19 in india on a day-to-day basis so far. time dependent sir models have been defined to observe undetectable infected persons with covid-19 (chen et al., 2020). a recent study by mandal et al. (2020) has shown that social distancing can reduce cases by up to 62%. further, time series models have been employed for predicting the incidence of covid-19 disease. as compared to other prediction models, for instance the support vector machine (svm) and the wavelet neural network (wnn), the arima model is more capable in the prediction of natural adversities (zhang et al., 2019). chatterjee et al. (2020) studied a stochastic mathematical model of the covid-19 epidemic in india. the logistic growth regression model has been used for the estimation of the final size and the peak time of the covid-19 pandemic in many countries of the world, with results similar to those obtained by the sir model (batista, 2020). it is well known that the effects of social distancing become visible only after a few days from the lockdown. this is because the symptoms of covid-19 normally take some time to appear after getting infected. an estimate indicates that, with a hard lockdown and continued social distancing, the peak total infections in india will be 97 million and the number of infectives by september is likely to be over 1100 million (schueller et al., 2020). the study of infectious diseases is called epidemiology.
a disease is called endemic if it persists in a population and pandemic when it occurs worldwide. the spread of an infectious disease involves not only disease related factors such as the infectious agent, mode of transmission, latent period, infectious period, susceptibility and resistance, but also social, cultural, demographic, economic and geographic factors. mainly there are three types of models for infectious diseases that spread directly through person to person contact in a population. some simple models are formulated and analyzed mathematically using differential equations. parameters are estimated for infectious diseases and also used to compare the vaccination levels necessary for herd immunity. the three models considered here are simple epidemiological models suitable for diseases which are transmitted directly from person to person. more complicated models must be used when there is transmission by insects, called vectors, or by a reservoir of nonhuman infectives. epidemiological models are widely used to understand patterns and to support policy development. even though vaccines are available for many infectious diseases, these diseases still cause suffering and mortality in the world, especially in developing countries. in developed countries chronic diseases such as cancer and heart disease have received more attention than infectious diseases, but infectious diseases are still a more common cause of death in the world. the transmission mechanism from an infective to a susceptible is understood for nearly all infectious diseases, and the spread of diseases through a chain of infections is known. however, the transmission interactions in a population are very complex, so it is difficult to comprehend the large scale dynamics of disease spread without the formal structure of a mathematical model.
an epidemiological model uses a microscopic description (the role of an infectious individual) to predict the macroscopic behavior of disease spread through a population. in many sciences it is possible to conduct experiments to obtain information and test hypotheses. experiments with infectious disease spread in human populations are often impossible, unethical or expensive. data are sometimes available from naturally occurring epidemics or from the natural incidence of endemics; however, the data are often incomplete due to underreporting. this lack of reliable data makes accurate parameter estimation difficult, so it may only be possible to estimate a range of values for some parameters. since repeatable experiments and accurate data are usually not available in epidemiology, mathematical models and computer simulations can be used to perform the needed theoretical experiments. mathematical models have both limitations and capabilities that must be recognized. sometimes questions cannot be answered by using epidemiological models, but sometimes the modeler is able to find the right combination of available data, an interesting question and a mathematical model which can lead to the answer. comparisons can lead to a better understanding of the processes of disease spread. modeling can often be used to compare different diseases in the same population, the same disease in different populations, or the same disease at different times. comparisons of diseases such as measles, rubella, mumps, chickenpox, whooping cough, poliomyelitis and others have been made (hethcote, 1983; yorke and london, 1973; yorke et al., 1979) and in the article on rubella in this volume by hethcote (1989). quantitative predictions of epidemiological models are always subject to some uncertainty, since the models are idealized and the parameter values can only be estimated.
however, predictions of the relative merits of several control methods are often robust in the sense that the same conclusions hold over a broad range of parameter values and a variety of models. optimal strategies for vaccination can be found theoretically by using modeling. longini et al. (1978) use an epidemic model to decide which age groups should be vaccinated first to minimize cost or deaths in an influenza epidemic. hethcote (1988) uses a modeling approach to estimate the optimal age of vaccination for measles. within a short period of time, covid-19 has traumatized the world with a greater magnitude and coercion than older pandemics. its eventuality is grabbed by the fact that it has infected millions and killed thousands across the globe. global markets, accessible transportation and large scale production have largely contributed to making this pandemic spread faster. this has drastically affected the social life and the mental as well as physical health of human beings worldwide. the already burdened health infrastructure across the globe has been virtually exposed up to an irreparable point. the who declared the 2019-2020 corona virus outbreak a public health emergency of international concern (pheic) on january 30, 2020, and a pandemic on march 11, 2020. with its outbreak in wuhan, china, the pandemic seems to occupy and include all the vitals of the world, thereby affecting the mechanistic processes of every nation. the countries are trying hard to combat and contain this outbreak by following suitable sets of protocols that tend to alter the transmission rate effectively. in the initial phase of the spread of covid-19, italy, spain, france and some other european countries were among the worst sufferers of the pandemic, and the coercive measures resulted in the disruption of all the necessary services. on the other hand, the case is virtually less severe in south asia.
india is less affected by covid-19; however, china is its neighboring country, with a border through buffer states like nepal and bhutan. being the second most populous country of the world, india is fighting hard to minimize the damage of covid-19. as on 15th april, the total number of infected cases in india was 12,370, with 422 deaths and many recoveries (covid19india.org). india reported its first case on 30th january and entered the countrywide lockdown on march 24th, 2020, with a constant increase in the number of covid-19 cases. the indian government as well as the state governments issued early guidelines and travel advisories to limit the further damage of the disease. also, the timely precautions taken by the government have contributed greatly toward combating this pandemic. this paper attempts to devise a model that would conveniently help in assessing the predictability of the covid-19 pandemic in future time periods. this can be achieved by evaluating the different parameters that directly or indirectly affect the ongoing rate of the pandemic. moreover, theoretical explanation, quantitative analysis and other parameters are highly required to predict the peak and size of any pandemic. we obtained information on the cumulative number of covid-19 confirmed cases in india from covid19india.org. all cases are laboratory confirmed following the case definition by the government of india. some studies have modeled the epidemic curve as obeying exponential growth (de silva et al., 2009). the nonlinear least squares framework is adopted for data fitting and parameter estimation for covid-19 at this early stage. first an exponential and then a logistic growth curve is used to model the covid-19 pandemic, since epidemics grow exponentially, not linearly. but the exponential growth curve always provides an increasing number of daily new cases; there is no saturation point.
another deterministic model used for understanding the dynamics of an epidemic is the susceptible-infectious-recovered (sir) model, which has been used to accurately predict the incidence of diseases like sars. in the sir model, we first need to know the input parameters, the stats we feed into the model (chatterjee et al., 2020; mandal et al., 2020; singh and adhikari, 2020). the first one is r 0, called the basic reproduction number. it is essentially the number of new cases a single infected person will cause during their infectious period. it is one of the most important parameters for assessing any epidemic. the corona virus has an r 0 of about 2.4. in contrast, the swine flu virus had an r 0 of about 1.5 in the 2009 swine flu epidemic (gupta, 2020). r 0 informs us about how many people will get infected by one infected person. the other one is the case fatality rate (cfr), which is the percentage of infected people that will die due to the infection. the cfr for the corona virus has been reported between 0.5% and 4%; the lower values are more appropriate in settings with better medical facilities. but the sir model assumes that every person is moving and has an equal chance of contact with each and every other person in the population, irrespective of the space or distance between different people. it is assumed that the transmission rate remains constant throughout the period of the pandemic. the model also assumes the same transmission rate for those who have been diagnosed and are in quarantine and for those who have not been quarantined. harmonic analysis methods and a dynamic model (rao srinivasa arni et al., 2020) estimate that the number of covid-19 infected would be 9225 (if there were 10 infected individuals as of march 1, 2020, who were not taking any precautions against spreading), 17,986 (if there were 20) and 44,265 (if there were 50).
the sir model is a theoretical epidemiological model in which the population is categorized into three components: susceptible (s), the group of people who are vulnerable to exposure to infectious people; infected (i), those who have the disease and can transmit it to the susceptible; and the third component, the individuals who have recovered from the infectious disease, developed immunity and are not susceptible to the same illness anymore (r). this framework enables us to understand the dynamics of any epidemic. thus the sir model is a compartmental model in which individuals are separated into different compartments based on their status, and the corresponding population sizes are followed over time. the diagrammatic representation of the three-compartment model (kermack and mckendrick, 1927) is given with s(t) = proportion of individuals susceptible to covid-19 at time t, i(t) = proportion of individuals who have been infected by covid-19 and are capable of infecting others at time t, and r(t) = proportion of individuals who have been infected by covid-19 and recovered at time t, such that s(t) + i(t) + r(t) = 1. here β is the transmission parameter controlling how much the disease can be transmitted. this is the average number of individuals that one infected individual will infect per unit time; it is determined by the chance of contact and the probability of disease transmission. γ is the parameter representing the rate of recovery in a particular period. the model allows us to describe the numbers or proportions of persons in each compartment by solving the following ordinary differential equations: ds/dt = -β s i, di/dt = β s i - γ i, dr/dt = γ i. several assumptions have been discussed with respect to the sir model (brauer and castillo-chavez, 2012; daley and gani, 1999). based on the sir model, the basic reproduction number is defined as r 0 = β/γ. here, r 0 is the average number of new covid-19 cases produced by a single covid-19 infected case over time.
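the sir dynamics above can be sketched numerically; the following is a minimal forward-euler integration with assumed illustrative parameters (beta = 0.5, gamma = 0.2, so r 0 = 2.5), not values fitted to the indian data in this chapter.

```python
# Minimal SIR sketch with forward-Euler integration.
# beta and gamma are assumed illustrative values, not estimates from the text.
def simulate_sir(beta, gamma, s0=0.999, i0=0.001, days=160, dt=0.1):
    """Integrate dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I, dR/dt = gamma*I."""
    s, i, r = s0, i0, 1.0 - s0 - i0
    for _ in range(int(days / dt)):
        new_inf = beta * s * i * dt   # flow S -> I over one step
        new_rec = gamma * i * dt      # flow I -> R over one step
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
    return s, i, r

s, i, r = simulate_sir(beta=0.5, gamma=0.2)   # r0 = beta/gamma = 2.5
```

with r 0 = 2.5 the simulated epidemic burns out leaving roughly 10% of the population never infected, consistent with the standard sir final-size relation.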
in order to fit the sir model, the parameters are obtained by minimizing the residual sum of squares between the observed active cases and the predicted active cases. the utility of the seir model lies in the fact that it focuses on the basic processes that are directly related to this growing pandemic. in the preparation of this model, the population needs to be divided into some subdivisions: the susceptible subdivision s(t), denoting the population which is susceptible to catching the virus; the exposed subdivision e(t), denoting the population which is infected but in which the symptoms are not yet visible; the infected subdivision i(t), denoting the population which has been infected by the virus and is showing the symptoms; and the recovered subdivision r(t), denoting the population which has immunity to the infection. the basic assumption in formulating this model is that recovered patients acquire permanent active immunity. it can be justified by the strong reason that none of the patients has been re-infected by covid-19. there have been numerous cases where patients died after being discharged from the hospital, but it was found that those patients were either discharged while having mild symptoms or the testing machine reported wrongly. now we normalize these components as s + e + i + r = 1. furthermore, suppose that there are equal birth and death rates μ, that 1/α is the mean latent period for the disease, that 1/γ is the mean infectious period, and that recovered individuals are permanently immune. the contact rate β may or may not be a function of time. thus the seir model is defined as ds/dt = μ - β s i - μ s, de/dt = β s i - (α + μ) e, di/dt = α e - (γ + μ) i, and the variable r is determined from the other variables according to the equation s + e + i + r = 1. a growth curve is an empirical model of the evolution of a quantity over time.
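under the stated assumptions (equal birth and death rate μ, mean latent period 1/α, mean infectious period 1/γ, and r recovered from s + e + i + r = 1), the seir system can be sketched the same way; the parameter values below are illustrative assumptions, not fitted estimates.

```python
# Minimal SEIR sketch (forward Euler). mu = 0 here ignores vital dynamics;
# beta, alpha, gamma are assumed illustrative values, not fitted estimates.
def simulate_seir(beta, alpha, gamma, mu=0.0, e0=0.001, days=200, dt=0.1):
    s, e, i = 1.0 - e0, e0, 0.0
    for _ in range(int(days / dt)):
        ds = mu - beta * s * i - mu * s        # births in, infection and death out
        de = beta * s * i - (alpha + mu) * e   # incubation (exposed) compartment
        di = alpha * e - (gamma + mu) * i      # infectious compartment
        s, e, i = s + ds * dt, e + de * dt, i + di * dt
    return s, e, i, 1.0 - s - e - i            # R recovered from normalization

s, e, i, r = simulate_seir(beta=0.5, alpha=0.2, gamma=0.2)
```

compared with the sir sketch, the latent compartment delays the peak but (with μ = 0) leaves the final epidemic size unchanged for the same r 0.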
growth curves are widely used in biology for quantities such as population size, in population ecology and demography for population growth analysis, and for individual body height in physiology for growth analysis of individuals. growth is also a key property of many systems such as an economic expansion, the spread of an epidemic, the formation of a crystal, an adolescent's growth and the condensation of a stellar mass. the simplest growth model is linear growth, in which the population grows at a constant rate over time. linear growth is described by the equation p t+1 = p t + a, where p t represents the numbers or size of the system at time t, p t+1 represents the system's numbers or size one time unit later, and a is the system's (linear) growth rate. many times this model fails to explain natural phenomena. another simple model describes exponential growth, in which the population grows at a constant proportional rate over time. the relation may be expressed in either of two forms, depending on whether reproduction is assumed to be continuous or periodic (shryock and siegel, 1973). exponential growth results in a continuous curve of increase or decrease, whose slope varies in direct relation to the size of the population. one form is p t = p 0 e^(rt), where r is the constant rate of growth, p 0 is the initial population size, and the variables t and p t respectively represent time and the population at time t (method 1). another form of the exponential curve is p t = p 0 k^t, where k = (p n / p 0)^(1/n), so that the growth rate r in eq. (3) equals log k. with the current incidence of covid-19 going on, we hear about exponential growth. in this study, an attempt has been made to understand and analyze the data through the exponential growth curve. the reason for using the exponential growth curve for studying the pattern of covid-19 incidence is that epidemiologists have studied these types of happenings, and it is well known that the first period of an epidemic follows exponential growth.
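the two exponential forms are equivalent, with r = log k; a short sketch on made-up counts (the numbers below are hypothetical, not indian case data):

```python
import math

# Hypothetical counts: 100 cases growing to 800 over n = 3 periods.
p0, pn, n = 100.0, 800.0, 3
k = (pn / p0) ** (1.0 / n)    # per-period growth factor, k ~ 2.0 here
r = math.log(k)               # equivalent continuous rate in p_t = p0 * e^(r t)
predicted = p0 * k ** n       # reproduces p_n ~ 800
```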
the exponential growth function is not necessarily a perfect representation of the epidemic. i have tried to fit the exponential curve first, and at the next step studied the logistic growth curve, because the exponential curve fits the epidemic only at the beginning. at some point, recovered people will not spread the virus anymore, and when everyone is or has been infected, the growth will stop. logistic growth is characterized by increasing growth in the beginning period, but decreasing growth after the point of inflection. for example, in the corona virus case, the maximum limit would be the total number of exposed people in india, because when everybody is infected, the growth will stop. after the point of inflection the increasing rate of the curve starts to decline and reaches its minimum. the logistic model reveals that the growth rate of the population is determined by its biotic potential and the size of the population as modified by the natural resistance, or, in other words, by all the various effects of inherent characteristics that are density dependent (pearl and reed, 1920). natural resistance increases as the population size gets closer to the carrying capacity. logistic growth is similar to exponential growth except that it assumes an essential sustainable maximum point. in the exponential growth curve, the rate of growth of y per unit of time is directly proportional to y, but in practice the rate of growth cannot be in the same proportion always. the logistic curve will continue up to a certain level, called the level of saturation, sometimes called the carrying capacity; after reaching the carrying capacity it starts declining. a system far below its carrying capacity will at first grow almost exponentially; however, this growth gradually slows as the system expands, finally bringing it to a halt specifically at the carrying capacity (pearl and reed, 1920; shryock and siegel, 1973).
the logistic relationship can be expressed as y t = k / (1 + e^(a + bt)), where a, b and k are constants and y t is the value of the time series at time t. the reciprocal of y t follows the modified exponential law; hence the given time series observations y t will follow the logistic law if their reciprocals 1/y t follow the modified exponential law. thus in general we may take dy/dt = α y (k - y), where the factor y is called the momentum factor, which increases with time t, and the factor (k - y) is known as the retarding factor, which decreases with time. when the process of growth approaches the saturation level k, the rate of growth tends to zero. now we have dy / [y (k - y)] = α dt; integrating, we get log [y / (k - y)] = αkt + γ, where γ is the constant of integration. then k/y = 1 + e^(-αkt) e^(-γ), so that y = k / (1 + e^(-(γ + αkt))), which is the same as eq. (4) with a = -γ and b = -αk. the logistic curve has a point of inflection at half of the carrying capacity k. this point is the critical point from where the increasing rate of the curve starts to decline. the time of the point of inflection can be estimated as -a/b. for the estimation of the parameters of the logistic curve, the method of three selected points given by pearl and reed (1920) has been used. the estimate of the carrying capacity can be obtained with the equation k = [y2^2 (y1 + y3) - 2 y1 y2 y3] / (y2^2 - y1 y3), where y1, y2 and y3 are the cumulative numbers of covid-19 cases at given times t1, t2 and t3 respectively, provided that t2 - t1 = t3 - t2. one may also estimate the parameters a and b by the method of least squares after fixing k. to predict confirmed corona cases on different days, the logistic growth curve has been used and found to give very exciting results. the truncated information (meaning not from the beginning to the present date) on confirmed cases in india has been taken from march 13 to april 2, 2020.
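the three-selected-points estimator can be checked on an exact logistic curve; the curve below (k = 1000, a = 2, b = -0.5) is a synthetic example, not the covid-19 series used in the text:

```python
import math

def three_point_k(y1, y2, y3):
    """Pearl-Reed estimator of the carrying capacity k from three equally
    spaced observations of a logistic series."""
    return (y2 ** 2 * (y1 + y3) - 2.0 * y1 * y2 * y3) / (y2 ** 2 - y1 * y3)

# Synthetic logistic y_t = 1000 / (1 + exp(2 - 0.5 t)), sampled at t = 0, 4, 8.
y = lambda t: 1000.0 / (1.0 + math.exp(2.0 - 0.5 * t))
k_hat = three_point_k(y(0), y(4), y(8))
print(round(k_hat))   # recovers the true carrying capacity, 1000
```

for equally spaced points the estimator is exact on a true logistic, which is why equal spacing t2 - t1 = t3 - t2 is required.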
the estimated values of the parameters are k = 18,708.28, a = 5.495 and b = -0.174; with these estimates predicted values have been obtained and found considerably lower than what we observed. on april 1 and 2, 2020 the number of confirmed corona cases increased drastically in some parts of india due to some unavoidable circumstances; thus there was an earnest need to increase the carrying capacity of the model, so it was increased to 22,000 and the other parameters a and b were estimated again as a = 5.657 and b = -0.173. the predicted cumulative number of cases is very close to the observed cumulative number of cases to date. the time of the point of inflection is obtained as 32.65, i.e., about 33 days after the beginning. we have taken data from march 13, 2020, so the time of the point of inflection should be april 14, 2020, and by may 30, 2020 there should be no new cases found in the country. the exponential growth model and the model given by swanson provide a natural estimate that the total infected cases by june 30, 2020 would be almost all the people in india. this estimate is obtained assuming that no preventive measures are taken by the government of india. the testing rate was lower in india than in many western countries in the months of march and april, so our absolute numbers were low; when the government initiated a faster testing process, we observed more cases and found that this logistic model fails to provide the cumulative number of confirmed corona cases after april 17, 2020, so there is a need to modify this model (fig. 1). for the modification, i have taken the natural log of the cumulative number of confirmed corona cases instead of the cumulative number itself as taken in the previous model. this model gives a carrying capacity of about 80,000 cases and a time of point of inflection of april 30, 2020.
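as a quick check, plugging the re-estimated values reported above (k = 22,000, a = 5.657, b = -0.173) into the logistic form y t = k / (1 + e^(a + bt)) gives an inflection time of -a/b ≈ 32.7 days, matching the 32.65 figure in the text up to rounding of the reported coefficients:

```python
import math

k, a, b = 22000.0, 5.657, -0.173   # estimates reported in the text
inflection = -a / b                # time at which the logistic curve inflects
y_at_inflection = k / (1.0 + math.exp(a + b * inflection))   # equals k/2
print(round(inflection, 2))        # -> 32.7, i.e. mid-april from march 13
```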
the present model provides a reasonable estimate of the cumulative number of confirmed cases and suggests that by the end of july 2020 no new cases would be found in the country. however, the number of covid-19 cases kept increasing and the model estimates no longer matched the observed numbers, so the data period had to be changed: since the logistic curve is a data-driven model, this provides a new estimate of the point of inflection, of the maximum number of corona-positive cases, and of the date by which the disease should disappear, which helps in planning strategies. finally, in this study the data period was changed to april 15th to july 16th, 2020. this gives a carrying capacity of about 45 lakh cases and a time of point of inflection of august 15th, 2020, with a maximum of about 30,000 new cases per day. the model based on these data (from april 15th to july 16th, 2020) provides a reasonable estimate of the cumulative number of confirmed cases; predicted values along with 95% confidence intervals are provided up to august 15th, 2020 (see table 1), and by the end of march 2021 we expect no new cases in the country in the absence of any effective medicine or vaccine (fig. 2). to assess the significance of the lockdown we define the covid-19 case transmission as c_t = ln(x_t) − ln(x_(t−1)), where x_t is the cumulative number of confirmed cases on the t-th day. we have calculated c_t and the doubling time of corona case transmission in india; the doubling time is calculated as ln 2 / c_t = 0.693 / c_t. c_t has been computed on the basis of a 5-day moving average of daily confirmed cases (in the beginning the indian data fluctuate strongly) and is found to be gradually decreasing in india. this is a good sign for the government's attempts to combat this pandemic through implementing the lockdown. these findings indicate that the future burden of corona should decrease if the current situation persists.
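the case-transmission and doubling-time calculations described above can be sketched as follows, assuming c_t is the day-to-day difference of log cumulative cases (an assumption consistent with the stated doubling time ln 2 / c_t):

```python
import math

def transmission_rate(cumulative):
    """daily case transmission c_t = ln(x_t / x_(t-1)), computed from a
    list of cumulative confirmed-case counts (assumed form of c_t)."""
    return [math.log(b / a) for a, b in zip(cumulative, cumulative[1:])]

def doubling_time(c_t):
    """doubling time of the epidemic, ln 2 / c_t (in days)."""
    return math.log(2.0) / c_t

def moving_average(xs, window=5):
    """simple trailing moving average used to smooth the daily counts."""
    return [sum(xs[i:i + window]) / float(window)
            for i in range(len(xs) - window + 1)]
```

for cases that double every day, c_t = ln 2 and the doubling time is exactly one day.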
in table 2, an attempt has been made to summarise the corona case transmission c_t during the various lockdown periods in india. it is observed that the average covid-19 case transmission was highest (0.16, with standard deviation 0.033) in the period prior to the lockdown. during the first lockdown period the average covid-19 case transmission was 0.14 with standard deviation 0.032; in lockdown 2 it was 0.07 with standard deviation 0.009; in lockdown 3 it was 0.06 with standard deviation 0.007; and in the fourth lockdown the average case transmission was 0.05 with standard deviation 0.005. it is thus clear that both the average transmission load and its standard deviation are decreasing. table 3 gives the result of the anova for average c_t during the various lockdown periods, which is significant, meaning that the average corona case transmission differs significantly between the lockdown periods considered. a group-wise comparison of the average covid-19 case transmission c_t during the various lockdown periods is shown in table 4, which reveals that the first lockdown affected the spread of corona case transmission significantly more than the others, but the second lockdown period is not significantly different from the third and fourth; the same result is observed for the third and fourth lockdown periods. this indicates that covid-19 transmission is not yet under control. fig. 3 shows corona case transmission and doubling time in india: case propagation is decreasing and the doubling time is increasing day by day. let us define a function called the tempo of disease, the first difference of the natural logarithms of the cumulative corona-positive cases on a day: r_t = ln(p_t) − ln(p_(t−1)), where p_t and p_(t−1) are the cumulative numbers of corona-positive cases for periods t and t − 1, respectively. when p_t and p_(t−1) are equal, r_t becomes zero.
if this value of r_t, i.e., zero, persists for a week, we can assume that no new corona cases will appear. in the initial phase of the disease spread the tempo of disease increases, but after some time, when preventive measures are taken, it decreases. since r_t is a function of time, its first derivative is modelled as dr_t/dt = k(r_t − r*_t), (8) where r_t denotes the tempo, i.e., the first difference of the natural logarithms of the cumulative corona-positive cases on a day, r*_t is the desired level of the tempo (zero in this study), t denotes time and k is a constant of proportionality. eq. (8) is an example of an ordinary differential equation that can be solved by the method of separation of variables. with r*_t = 0, eq. (8) can be written as dr_t/r_t = k dt. (9) integrating eq. (9), we get ln r_t = kt + c, (10) where c is an arbitrary constant. taking antilogarithms of both sides of eq. (10), we have r_t = e^(kt + c) = e^(kt) e^c, so that r_t = a e^(kt), where a = e^c. (11) this eq. (11) is the general solution of eq. (8). if k is less than zero, eq. (11) tells us how the covid-19 cases will decrease over time until the tempo reaches zero. the values of a and k are estimated by the least-squares procedure using the data sets. the government of india implemented the lockdown on march 24th, 2020, expecting the tempo of disease to decrease; the government suggested and implemented social distancing and lockdown to control the spread of covid-19 in society. in table 5, the predicted values of covid-19 cases obtained with this method are given along with 95% confidence intervals. about 21.5 lakh cases are expected by august 15th, 2020. with this model it is expected that about 45 lakh people will be infected in india by the end of october, after which no new cases should occur since the tempo of disease r_t will have become zero (fig. 4). in table 6 an attempt has been made to summarise the tempo of covid-19, r_t, during the various lockdown periods in india.
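the least-squares fit of a and k in r_t = a e^(kt), and the projection of cumulative cases it implies, can be sketched as follows (an illustration; the projection step, which compounds the fitted tempo forward, is our reading of the method):

```python
import math

def tempo(cumulative):
    """tempo of disease r_t: first differences of the natural logs of
    the cumulative corona-positive case counts."""
    return [math.log(q) - math.log(p)
            for p, q in zip(cumulative, cumulative[1:])]

def fit_exponential_decay(ts, rs):
    """fit r_t = a * e^(k*t) by linear least squares on ln(r_t)."""
    zs = [math.log(r) for r in rs]
    n = float(len(ts))
    tbar, zbar = sum(ts) / n, sum(zs) / n
    k = (sum((t - tbar) * (z - zbar) for t, z in zip(ts, zs))
         / sum((t - tbar) ** 2 for t in ts))
    a = math.exp(zbar - k * tbar)
    return a, k

def project(p0, a, k, days):
    """project cumulative cases forward using ln p_t - ln p_(t-1) = a*e^(k*t)."""
    p, out = p0, []
    for t in range(1, days + 1):
        p *= math.exp(a * math.exp(k * t))
        out.append(p)
    return out
```

since ln r_t is linear in t under the model, exact exponential data are recovered exactly by the fit.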
it is observed that the average tempo is highest (0.17, with standard deviation 0.062) in the period prior to the lockdown. during the first lockdown period the average tempo is 0.14 with standard deviation 0.044, and after that it is found to decrease through the various lockdowns. table 7 gives the anova for the average tempo, and a group-wise comparison across the various lockdown periods is shown in table 8, which reveals that the first lockdown is significantly different from the others. the consecutive mean differences show that a decrease in disease spread has been observed but is insignificant, meaning there is no demonstrable impact of the lockdown on controlling the disease spread. to analyse the temporal trends and to identify important changes in the trends of the covid-19 outbreak, joinpoint regression was used in china (al hasan et al., 2020); here we perform a joinpoint regression analysis for india to understand the pattern of covid-19. joinpoint regression analysis enables us to identify the times at which a meaningful change in the slope of a trend is observed over the study period. the best-fitting points, known as joinpoints, are chosen where the slope changes significantly. to tackle the above problem, joinpoint regression analysis (kim et al., 2000) has been employed in this study to present the trend analysis. the goal of joinpoint regression analysis is not only to provide the statistical model that best fits the time series data, but also to provide the model that best summarises the trend in the data (marrot, 2010). let y_i denote the reported covid-19 positive cases on day t_i, with t_1 < t_2 < … < t_n. then the joinpoint regression model is defined as ln y_i = α + β_1 t_i + δ_1 u_1 + δ_2 u_2 + … + δ_j u_j + ε_i, (12) where u_j = (t_i − k_j) if t_i > k_j and u_j = 0 otherwise, and k_1 < k_2 < … < k_j are the joinpoints. the details of joinpoint regression analysis are given elsewhere (kim et al., 2004). joinpoint regression analysis is used when the temporal trend of an amount such as incidence, prevalence or mortality is of interest (doucet et al., 2016).
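the design matrix of the joinpoint model in eq. (12) can be sketched as follows, assuming the usual hinge form u_j = (t_i − k_j)_+ for the joinpoint terms:

```python
def joinpoint_design(ts, joinpoints):
    """design-matrix rows [1, t, (t-k1)+, ..., (t-kJ)+] for the model
    ln y_i = alpha + beta1*t_i + sum_j delta_j * (t_i - k_j)+  (assumed
    hinge form of u_j); fitting these rows against ln y_i by least
    squares gives a continuous piecewise-linear trend in ln y."""
    return [[1.0, float(t)] + [max(t - k, 0.0) for k in joinpoints]
            for t in ts]
```

each slope change at a joinpoint k_j is then beta1 plus the accumulated delta_j terms to its left.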
however, this method has generally been applied with the calendar year as the time scale (akinyede and soyemi, 2016; chatenoud et al., 2015; missikpode et al., 2015; mogos et al., 2016). joinpoint regression analysis can also be applied in epidemiological studies in which the starting date can be easily established, such as the day the disease is detected for the first time, as is the case in the present analysis (rea et al., 2017). estimated regression coefficients (β) were calculated for the trends extracted from the joinpoint regression. additionally, the average daily percent change (adpc) was calculated as a geometric weighted average of the daily percent changes (clegg et al., 2009). the joinpoints are selected based on the data-driven bayesian information criterion (bic) method (zhang and siegmund, 2007). the equation for computing the bic for a k-joinpoint regression is bic(k) = ln(sse(k)/n) + 2(k + 1) ln(n)/n, where sse(k) is the sum of squared errors of the k-joinpoint regression model and n is the number of observations. the model with the minimum value of bic(k) is selected as the final model. there are other methods for identifying the joinpoints, such as the permutation test method and the weighted bic method; the relative merits and demerits of the different methods are discussed elsewhere (national cancer institute, 2013). the permutation test method is regarded as the best method but is computationally very intensive; it controls the error probability of selecting the wrong model at a certain level (e.g., 0.05). the bic method, on the other hand, is computationally less complex.
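the bic-based selection of the number of joinpoints can be sketched as follows, under the assumption that bic(k) = ln(sse_k/n) + 2(k + 1) ln(n)/n (the joinpoint software convention, with 2(k + 1) free regression parameters):

```python
import math

def bic(sse, n, k):
    """bic for a k-joinpoint regression (assumed convention):
    bic(k) = ln(sse/n) + 2*(k + 1) * ln(n) / n."""
    return math.log(sse / n) + 2.0 * (k + 1) * math.log(n) / n

def select_joinpoints(sses, n):
    """return the number of joinpoints k minimising bic, given the
    residual sums of squares sses[k] for k = 0, 1, ..."""
    return min(range(len(sses)), key=lambda k: bic(sses[k], n, k))
```

the penalty term grows with k, so a larger model is accepted only if it reduces the residual sum of squares enough to pay for its extra parameters.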
in the present case, data on the reported confirmed cases of covid-19 are available daily; thus the daily percent change (dpc) from day t to day (t + 1) is defined as dpc = 100 × (y_(t+1) − y_t)/y_t. if the trend in the daily reported confirmed cases of covid-19 is modelled as ln(y_t) = b_0 + b_1 t, then it can be shown that dpc = 100 × (e^(b_1) − 1). it is worth noting that a positive value of dpc indicates an increasing trend while a negative value of dpc suggests a declining trend. the dpc reflects the trend in the reported covid-19 positive cases in the different time segments of the reference period identified through the joinpoint regression technique. for the entire study period, it is possible to estimate the average daily percent change (adpc), the weighted average of the dpcs of the different time segments with weights equal to the lengths of those segments; however, when the trend changes frequently, the adpc has little meaning. the model assumes that the random errors are heteroscedastic (have non-constant variance). heteroscedasticity is handled in joinpoint regression by weighted least squares (wls); the weights in wls are the reciprocals of the variances and can be specified in several ways, and here the standard error is used to control for heteroscedasticity over the entire period. to observe the trend of reported cases, the moving average method has also been used in this study. the daily percent change in the daily reported confirmed cases of covid-19 during the period march 14th, 2020 through july 16th, 2020 is used for forecasting the daily reported confirmed cases in the immediate future, under the assumption that the trend remains unchanged. the number of cases increased at a rate of 6.20% per day in india overall; however, the rate differs between segments.
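the dpc and adpc calculations described above can be sketched as follows (the geometric weighting of the adpc follows the verbal description; the helper names are ours):

```python
import math

def dpc_from_slope(b1):
    """daily percent change implied by the slope b1 of ln(y) on t:
    dpc = 100 * (e^b1 - 1)."""
    return 100.0 * (math.exp(b1) - 1.0)

def adpc(dpcs, segment_lengths):
    """average daily percent change: geometric average of the segments'
    daily growth factors, weighted by segment length in days."""
    total = float(sum(segment_lengths))
    g = 1.0
    for d, w in zip(dpcs, segment_lengths):
        g *= (1.0 + d / 100.0) ** (w / total)
    return 100.0 * (g - 1.0)
```

as a check, a slope of ln(1.062) per day corresponds to a dpc of exactly 6.2%, and the adpc of segments sharing one dpc is that dpc.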
table 9 also reveals that the growth rate is positive and significant (about 19%) from 16th march to 3rd april; after that the growth rate decreases in comparison with the first segment over the 28 days from 3rd april to 30th april, the likely reason being the lockdown imposed in india. in the third segment, from 30th april to 4th may, a sharp increase is observed but it is insignificant. from 4th may to 13th may the rate is positive but dramatically lower than the growth rates of the previous segments. in the next (fifth) segment, of 8 days, we observe a significant increase of 6.55% in covid-19 cases. in the last (sixth) segment, from 20th may to 14th july, i.e., 56 days, the growth rate is again positive and significant (3.03% per day). fig. 5 shows that the trend in india is still increasing sharply, with no sign of a decline in covid-19 cases. fig. 2 shows the forecast of daily covid-19 cases in india; covid-19 cases will increase further if the same trend prevails. table 10 presents the forecast of the predicted cases of covid-19 in india along with 95% confidence intervals. this exercise suggests that by august 15th, 2020, the confirmed cases of covid-19 in india are likely to reach 2,587,007, with a 95% confidence interval of 2,571,896-2,602,282, and daily reported cases 78,729, with a 95% confidence interval of 77,516-79,961. these daily reported covid-19 positive cases may change only if an appropriate set of new interventions is introduced to fight the covid-19 pandemic. the analysis indicates that in the month of august india faces more than 50 thousand cases per day (fig. 6). india is in a comparatively comfortable zone, with a lower growth rate than other countries.
the logistic model shows that the epidemic is likely to stabilise at about 45 lakh cases by the end of march 2021, with the peak in mid-august; the propagation model also estimates a maximum of about 45 lakh covid-19 cases but with different timing (by the end of october). the logistic model needs the data to be monitored from time to time for good long-term prediction. the projections produced by the models, after validation, can be used to determine the scope and scale of the measures the government needs to initiate. the joinpoint regression, based on the daily reported confirmed cases of covid-19, asserts that the nationwide lockdown, as well as the relaxations of restrictions, has had virtually no impact on the progress of the covid-19 pandemic in india. the joinpoint regression analysis provides better estimates of the confirmed covid-19 cases up to 15th august than the other two methods. a better understanding of the progress of the epidemic in the country may be obtained by analysing the progress of the epidemic at the regional level. in conclusion, if the current mathematical model results can be validated within the ranges provided here, then the social distancing and other prevention and treatment policies that the central and various state governments and the people are currently implementing should continue until no new cases are seen. the spread from urban to rural and from rich to poor populations should be monitored and controlled; this is an important point of consideration. mathematical models have certain limitations: they make assumptions about the homogeneity of the population in terms of urban/rural or rich/poor status that do not capture variations in population density. if effective protective measures are not taken, these rates may change.
however, the government of india under the leadership of modi ji has already taken various protective measures, such as lockdown in several areas and the provision of quarantine facilities, to reduce the rate of increase of covid-19; we may therefore hope that the country will be successful in reducing the rate of this pandemic.
joinpoint regression analysis of pertussis crude incidence rates
the novel coronavirus disease (covid-19) outbreak trends in mainland china: a joinpoint regression analysis of the outbreak data
the health workforce in india. who; human resources for health observer
data-based analysis, modelling and forecasting of the covid-19 outbreak
estimation of the final size of the second phase of the coronavirus covid 19 epidemic by the logistic model
mathematical models in population biology and epidemiology
modelling transmission and control of the covid-19 pandemic in australia
laryngeal cancer mortality trends in european countries
healthcare impact of covid-19 epidemic in india: a stochastic mathematical model
a time-dependent sir model for covid-19 with undetectable infected persons
estimating average annual per cent change in trend analysis
detection of 2019 novel coronavirus (2019-ncov) by real-time rt-pcr
epidemiological research priorities for public health control of the ongoing global novel coronavirus (2019-ncov) outbreak
epidemic modelling: an introduction
a preliminary analysis of the epidemiology of influenza a (h1n1) v virus infection in thailand from early outbreak data
prevalence and mortality trends in chronic obstructive pulmonary disease
impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
forecast of the evolution of the contagious disease caused by novel corona virus (2019-ncov) in china
corona virus in india: make or break
the mathematics of infectious diseases
measles and rubella in the united states
optimal ages of vaccination for measles and rubella
clinical features of patients infected with 2019 novel coronavirus in wuhan
the continuing 2019-ncov epidemic threat of novel coronaviruses to global health: the latest 2019 novel coronavirus outbreak in wuhan
estimating initial epidemic growth rates
a contribution to the mathematical theory of epidemics
permutation tests for joinpoint regression with applications to cancer rates
comparability of segmented line regression models
interventions to mitigate early spread of sars-cov-2 in singapore: a modelling study
early dynamics of transmission and control of covid-19: a mathematical modelling study
defining the epidemiology of covid-19: studies needed
an optimization model for influenza a epidemics
progression of covid-19 in indian states: forecasting endpoints using sir and logistic growth models
prudent public health intervention strategies to control the corona virus disease 2019 transmission in india: a mathematical model-based approach
colorectal cancer network (crcnet) user documentation for surveillance analytic software: joinpoint. cancer care ontario
trends in non-fatal agricultural injuries requiring trauma care
differences in mortality between pregnant and nonpregnant women after cardiopulmonary resuscitation
joinpoint regression program. national institutes of health, united states department of health and human services
on the rate of growth of the population of the united states since 1790 and its mathematical representation
joinpoint regression analysis with time-on-study as time-scale: application to three italian population-based cohort studies
transmission of 2019-ncov infection from an asymptomatic contact in germany
covid-19 in india: potential impact of the lockdown and other longer term policies
a logistic growth model for covid-19 proliferation: experiences from china and international implications in infectious diseases
age-structured impact of social distancing on the covid-19 epidemic in india
model-based retrospective estimates for covid-19 or coronavirus in india: continued efforts required to contain the virus spread
reporting, epidemic growth, and reproduction numbers for the 2019 novel coronavirus (2019-ncov) epidemic
nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study
recurrent outbreaks of measles, chickenpox and mumps. ii. seasonality and the requirements for perpetuation and eradication of viruses in populations
a modified bayes information criterion with applications to the analysis of comparative genomic hybridization data
comparison of the ability of arima, wnn and svm models for drought forecasting in the sanjiang plain
preliminary estimation of the basic reproduction number of novel coronavirus (2019-ncov) in china, from 2019 to 2020: a data-driven analysis in the early phase of the outbreak
further reading: covid-19
key: cord-350603-ssen3q08 authors: albrecht, randy a.; liu, wen-chun; sant, andrea j.; tompkins, s. mark; pekosz, andrew; meliopoulos, victoria; cherry, sean; thomas, paul g.; schultz-cherry, stacey title: moving forward: recent developments for the ferret biomedical research model date: 2018-07-17 journal: mbio doi: 10.1128/mbio.01113-18 sha: doc_id: 350603 cord_uid: ssen3q08 since the initial report in 1911, the domestic ferret has become an invaluable biomedical research model.
while widely recognized for its utility in influenza virus research, ferrets are used for a variety of infectious and noninfectious disease models due to the anatomical, metabolic, and physiological features they share with humans and their susceptibility to many human pathogens. however, there are limitations to the model that must be overcome for maximal utility for the scientific community. here, we describe important recent advances that will accelerate biomedical research with this animal model. populations. it is likely that new models and transgenic animals will be developed in the near future. the sequencing of the ferret genome (15) was instrumental in advancing functional genomic analysis. numerous groups developed reagents to monitor gene-specific mrna expression levels via taqman-based or sybr green-based real-time reverse transcription-pcr assays for a plethora of targets. many of these primers are available free of charge through the national institute of allergy and infectious diseases (niaid) established bei resources (https://www.beiresources.org/home.aspx). bruder et al. described the development of an expression microarray platform that included the identification of 41 genes with consistent baseline transcription profiles across tissues that could be used as housekeeping genes (16) . our group developed and is validating a fluidigm panel with 144 distinct immune response and lung injury and repair genes. beyond transcription, tisoncik-go et al. described an integrated omics analysis that profiles lipids, metabolites, and proteins in the respiratory compartments of influenza virus-infected ferrets (17) . combined, these tools provide powerful resources to the research community. despite its relevance for biomedical research, there are limitations of the ferret model for immunologic studies due to the dearth of reagents. 
screening of commercially available antibodies for cross-reactivity with markers on innate and adaptive cell subsets and cytokines in ferrets has yielded limited success (table 2). to resolve this, a group of researchers from around the world are working together to develop validated reagents and assays to improve our understanding of the innate and adaptive immune responses in the ferret. to date, recombinant proteins representing a range of intrinsic, innate, and adaptive immune markers are under development, and some are already available from commercial sources (18, 19). these include type i and iii interferons (ifns), rig-i and toll-like receptors, cytokines, and chemokines, as well as cell surface markers for immune and nonimmune cells. in terms of adaptive immune responses, kirchenbaum and ross recently developed a monoclonal antibody against the ferret b cell receptor light chain that is useful in distinguishing kappa versus lambda b cell responses (20, 21). enzyme-linked immunosorbent spot (elispot) and flow cytometric assays have been developed to quantify the isotypes of antibody-secreting cells (igg or iga) (22), pan-b cells (cd20+, cd79α+), and ig+ b cells (18, 19). t cell phenotyping has been limited to quantification of overall cd3+ t cells, including cd4+ and cd8+ subsets, by flow cytometric assays and identification of antigen-specific effector responses by detecting ifn-γ secretion in flow-based intracellular cytokine secretion assays or elispot assays (18). an in vivo depletion of cd8 t cells using a cross-reactive human monoclonal antibody has been shown to delay influenza virus clearance (23). to increase our toolbox, the centers of excellence for influenza research and surveillance (ceirs) network has undertaken a large project to rapidly produce monoclonal antibodies and develop assays to support the universal influenza vaccine initiative (24).
antibodies in production include b cell markers (cd83, cd86, cd95, cd19, cd20, cd25, cd27, cd38, cd138, cxcr5, and fcr), t cell markers (cd4, ccr7, cd3e, cd40, cd40l, cd44, cd62l, cd69, cd103, pd-1, cxcr3, interleukin-7 receptor [il-7r], and il-15ra) and others (cxcr4, cd140, il-2, il-21, and il-4). these much-needed reagents will facilitate efforts to establish immunologic assays to interrogate the innate and adaptive immune responses to infection and vaccination at the level of detail that is routinely applied to studies of mouse or human immunology. importantly, the ferret model will allow correlates of protection to be established after vaccination and infection in conjunction with transmission studies, which are not available in the mouse models. additionally, the longer life span of the ferret relative to the mouse will allow analysis of the evolution of the immune response to sequential infection and/or vaccination (25) , permitting more accurate modeling of the immune response in humans. while there has been exciting progress, much work remains to move the ferret model forward. toward this goal, the ceirs group has produced fibroblasts and primary nasal and tracheal epithelial cells and cell lines, established a repository of defined tissues and cell types (table 3) , and are working with the j. craig venter institute to define the ferret major histocompatibility complex (mhc). an exciting achievement is the completion of the pacbio sequencing of the ferret mhc (granger sutton, personal communication). while these are important steps, the ultimate goal is to provide the biomedical research community with validated reagents and protocols they can trust to ensure the rigor and reproducibility in experiments utilizing the ferret model. 
in support of this goal, many of the reagents created through the ceirs network will be made publicly available through the ceirs data processing and coordinating center (dpcc) website (http://www.niaidceirs.org/resources/ceirs-reagents/). we thank everyone involved in team ferret, whose names we will not list for fear we might miss someone, as well as others producing reagents for the ferret model. we also thank diane post (niaid) and the members of the ceirs network for feedback, advice, and constructive criticism.
studies in the embryology of the ferret
anatomy of the ferret heart: an animal model for cardiac research
a model of spinal cord injury
drug effects on after discharge and seizure threshold in lissencephalic ferrets: an epilepsy model for drug evaluation
a ferret model of copd-related chronic bronchitis
airway disease phenotypes in animal models of cystic fibrosis
development of ferret as a human lung cancer model by injecting 4-(n-methyl-n-nitrosamino)-1-(3-pyridyl)-1-butanone (nnk)
building the ferretome
cloned ferrets produced by somatic cell nuclear transfer
adeno-associated virus-targeted disruption of the cftr gene in cloned ferrets
crispr/cas9-mediated genome engineering of the ferret
live attenuated influenza vaccine is safe and immunogenic in immunocompromised ferrets
influenza transmission in the mother-infant dyad leads to severe disease, mammary gland infection, and pathogenesis by regulating host responses
impaired heterologous immunity in aged ferrets during sequential influenza a h1n1 infection
the draft genome sequence of the ferret (mustela putorius furo) facilitates study of human respiratory disease
transcriptome sequencing and development of an expression microarray platform for the domestic ferret
integrated omics analysis of pathogenic host responses during pandemic h1n1 influenza virus infection: the crucial role of lipid metabolism
flow cytometric and cytokine elispot approaches to characterize the cell-mediated immune response in ferrets following influenza virus infection
screening monoclonal antibodies for cross-reactivity in the ferret model of influenza infection
generation of monoclonal antibodies against immunoglobulin proteins of the domestic ferret (mustela putorius furo)
infection of ferrets with influenza virus elicits a light chain-biased antibody response against hemagglutinin
vaccine-specific antibody secreting cells are a robust early marker of laiv-induced b-cell response in ferrets
contemporary seasonal influenza a (h1n1) virus infection primes for a more robust response to split inactivated pandemic influenza a (h1n1) virus vaccination in ferrets
a universal influenza vaccine: the strategic plan for the national institute of allergy and infectious diseases
immune history shapes specificity of pandemic h1n1 influenza antibody responses
a virus obtained from influenza patients
the pathogenesis of respiratory syncytial virus infection in infant ferrets
ferrets as a novel animal model for studying human respiratory syncytial virus infections in immunocompetent and immunocompromised hosts
identification of small-animal and primate models for evaluation of vaccine candidates for human metapneumovirus (hmpv) and implications for hmpv vaccine design
comparison of wild-type and subacute sclerosing panencephalitis strains of measles virus. neurovirulence in ferrets and biological properties in cell cultures
assessment of the ferret as an in vivo model for mumps virus infection
infection of mice, ferrets, and rhesus macaques with a clinical mumps virus isolate
further studies on the neonatal ferret model of infection and immunity to and attenuation of human parainfluenza viruses
effect of upper respiratory infection on hearing in the ferret model
virology: sars virus infection of cats and ferrets
a neutralizing human monoclonal antibody protects against lethal disease in a new ferret model of acute nipah virus infection
the domestic ferret (mustela putorius furo) as a lethal infection model for 3 species of ebolavirus
rift valley fever: a report of three cases of laboratory infection and the experimental transmission of the disease to ferrets
influenza enhances susceptibility to natural acquisition of and disease due to streptococcus pneumoniae in ferrets
in vivo localization of staphylococcus aureus in nasal tissues of healthy and influenza a virus-infected ferrets
helicobacter infections in laboratory animals: a model for gastric neoplasias
a new experimental infection model in ferrets based on aerosolised mycobacterium bovis
animal models of pneumocystosis
immune system cells in healthy ferrets: an immunohistochemical study
cellular immune response in the presence of protective antibody levels correlates with protection against 1918 influenza in ferrets
evaluation of the humoral and cellular immune responses elicited by the live attenuated and inactivated influenza vaccines and their roles in heterologous protection in ferrets
key: cord-350510-o4libq5d authors: grinfeld, m.; mulheran, p. a.
title: on linear growth in covid-19 cases date: 2020-06-22 journal: nan doi: 10.1101/2020.06.19.20135640 sha: doc_id: 350510 cord_uid: o4libq5d we present an elementary model of covid-19 propagation that makes explicit the connection between testing strategies and rates of transmission and the linear growth in new cases observed in many parts of the world. an essential feature of the model is that it captures the population-level response to the infection statistics information provided by governments and other organisations. the conclusions from this model have important implications regarding the benefits of wide-spread testing for the presence of the virus, something that deserves greater attention. apart from being a world-changing calamity, the present novel coronavirus pandemic is an intellectual challenge for biologists, statisticians and applied mathematicians. modelling efforts that purport to predict the course of the pandemic and the effect of public health policies usually take the form of substantial individual-based models, implemented in code running to thousands of lines. their predictive ability is disputed, but it is doubtless that they do not help us to understand the pandemic. we suggest exactly the opposite: we formulate an essentially two-equation model of one aspect of the pandemic, and claim that it can very simply explain the following puzzling phenomenon: in many countries the rate of appearance of new cases is linear. as an example, we present the data for sweden in figure 1 [1]. in fact, sweden is a good case to work with as there are no complications to do with lockdown; similar graphs can be created, for example, from the data for the state of georgia [2], among many others. the modelling of an epidemic on the population level usually divides it into cohorts such as susceptible, infective and recovered (a so-called sir model), plus possibly some further sub-populations (e.g. asymptomatic or exposed (seir models)) [3].
the evolution of these cohorts is then modelled using rate equations that include the probability that the disease is transmitted through random contacts between them, amongst other events. in doing this, time is considered as a continuous variable. however, in order to understand the linear growth phenomenon mentioned above, we believe that it is essential to include the public response to the data that are usually made available on a daily basis. indeed, we would argue that capturing the response of the population to the information stream is essential if the model is to be of use in truly understanding the pandemic. therefore, in our approach we will consider time as a discrete variable measured in days, and develop a model for the discrete evolution of the number of infectives from day to day. although unusual, this approach has been successfully used elsewhere in epidemiology; for a recent example, please see [4]. ii. models. we derive, in its simplest and most illuminating form, a system of two difference equations for the rate of growth of new positive test results and the number of people that have been exposed to the virus; that is, we neglect the asymptomatics. the time variable n that we use is measured in days. we denote the average latent period (here and below we use epidemiological data from [5, 6]) by l; it is about a week. it is known (again, see [5, 6]) that individuals start shedding virus and so are infective very soon (1-2 days) after exposure and about 4-5 days before the appearance of symptoms. once the simplest model is derived, we consider the case with asymptomatics, which does not offer any substantial new illumination, but is more realistic. let us call the number of positive tests on day n, t(n).
then, not taking into account false positives and negatives, but not assuming that every person showing active symptoms is tested, t(n) = qj(n) + d_s(n). (1) here j(n) are people who have shown covid-19 symptoms on day n, qj(n) is the fraction of those who have been tested on that day, and d_s(n) are the positively testing members of the public who show no symptoms (perhaps yet). the subscript s is to indicate that this is only from a sample. now, if the rate of testing is p, d_s(n) = pd(n), where now d(n) is the population number of people who carry virus (detectable by pcr) but do not show symptoms yet. let us denote by e(n) the people who got exposed on day n. in a model without asymptomatics, j(n) = e(n − l), where l is the latent period and for simplicity we have assumed that people become infectious immediately after exposure. now we need to model the dynamics of e(n). (a1) since the numbers of infectives are very small compared to the number of susceptibles s(n), we assume that the number of susceptibles is roughly constant and that s(n)/n ≈ 1, n being the total population size. (a2) we assume that the number of infectives available for infecting the rest of the population on day n is approximately d(n), as p is small (of the order of 2 · 10^−4 in the uk). (this preprint is made available under a cc-by 4.0 international license; the author/funder has granted medrxiv a license to display the preprint in perpetuity; this version was posted june 22, 2020.) thus, we assume that the moment a person shows symptoms of the disease, she is removed from circulation by hospitalisation or quarantine. that is, we are making an assumption of perfect isolation. it follows from the reasoning above that the exposures e(n) are determined from the following simple equation: e(n + 1) = f(·, d(n)), where the other arguments of f will be discussed later. (a3) we assume that f(·, d(n)) can be written as f(·, d(n)) = cg(·)h(d(n)).
the constant c, which hethcote and van den driessche [7] call the "constant contact rate", in our view should only incorporate the probability that an encounter between a susceptible and an infective leads to disease. so presumably wearing face masks or other personal protection measures will be expected to reduce c. we take h(d(n)) to be a monotone increasing bounded function with the property that h(0) = 0, e.g. a function of michaelis-menten type; see [7] for other examples. (a4) we assume that the function g expresses the information stream of the population, that is, it is the translation of the information that people have into behavioural strategies governing the contact rates of the population (the same function also governs the contact rates between two susceptibles, as there is no sure-fire way to determine, in a contact between people not showing symptoms, who is infected and who is not). (a5) we assume that the information stream is dominated by the rate of increase of the numbers of new positive tests. it would be interesting to investigate models in which g is a function of more than the last day's data, or of undominated maxima in the number of new cases, but we assume here for simplicity that r(n) is a reasonable proxy for the information stream. in other words, we assume that g(·) is a function of r(n). common sense suggests that it is a monotone decreasing function defined on r+. a possibility is g(z) = a/(1 + bz^r), r ≥ 1. (2) then g(0) = a can be interpreted in terms of the norms of sociability in a population. the logic is that if the public is aware of a high rate of increase in new cases, it becomes more risk-averse. note that the information stream is in terms of what is publicly known. thus the dynamics of the exposed cohort is governed by e(n + 1) = cg(r(n))h(d(n)). (3)
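as a sanity check of the mechanism, the pair of equations (1) and (3) can be iterated directly. the sketch below is a minimal simulation under stated assumptions: the parameter values are illustrative rather than fitted to any country's data, h is taken to be of michaelis-menten type, g is as in (2), and the pre-symptomatic pool is taken to be the last l days' exposures, d(n) = e(n − 1) + … + e(n − l), consistent with the fixed-point expression h(le*).

```python
# a minimal simulation of the two-equation model; all numbers are
# illustrative assumptions, not fitted values
L = 7            # latent period, days
c = 30.0         # contact-rate constant
q = 0.2          # fraction of symptomatic people tested
p = 2e-4         # random-testing rate of the general population
a, b, r = 1.0, 0.05, 1.0   # g(z) = a / (1 + b z^r), eq. (2)
K = 100.0        # michaelis-menten constant for h(d) = d / (K + d)

def g(z):
    return a / (1.0 + b * z ** r)     # behavioural response to the news

def h(d):
    return d / (K + d)                # saturating infection term

E = [10.0] * (L + 1)    # seed exposures on days 0..L
T = []                  # new positive tests per day, eq. (1)
for n in range(L, 500):
    D = sum(E[n - k] for k in range(1, L + 1))   # pre-symptomatic pool
    J = E[n - L]                                 # newly symptomatic today
    T.append(q * J + p * D)                      # eq. (1)
    E.append(c * g(T[-1]) * h(D))                # eq. (3)

print(T[-3:])
```

the daily count t(n) settles to a constant, so the cumulative number of positive tests grows linearly, which is exactly the phenomenon the model is built to explain.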
what we have to say about covid-19 is then summarised in one sentence: if the information stream is based on the number of new cases (for which r(n) is a proxy) and quarantining/hospitalisation of symptomatic cases is perfect, linear increase in the number of positive tests is to be expected. this is obvious. clearly, from (1), at a fixed point (r*, e*) we have r* = qe* + ple* and cg(r*)h(le*) = e*, which, apart from the fixed point (r*, e*) = (0, 0), in which the disease is stopped, may admit a unique non-trivial fixed point, the e-component of which solves equation (4). (if such a fixed point does not exist, the epidemic disappears.) if this fixed point is stable, the number of positive tests necessarily grows linearly, and from that rate of growth the number of newly exposed people can be estimated. note that the value of this equilibrium rate is an increasing function of c and also of l, since the function g is monotone decreasing. under reasonable assumptions on g and h (for example, h being of michaelis-menten type and g as in (2)), it is easily seen from (4) that e* is a decreasing function of both p and q, since the right-hand side of (4) is monotone increasing and g is monotone decreasing. under these assumptions on h and g, the equilibrium number of positive tests will grow (sublinearly) with p and q, as is to be expected. a similar analysis, with the same conclusions, can be performed in the case when the information stream is a weighted average r̄(n, m) of the rates from a number m of days, i.e. if r̄(n, m) = Σ_k w_k r(n − k) with Σ_k w_k = 1; the value of the steady rate is independent of m and the weights w_k, but these of course influence the stability of the fixed point.
this model does not add much to our understanding beyond the previous model, and the purpose of this subsection is simply to show that it subtracts nothing either. we need to introduce three new parameters: α, β, and k. α and β are both in (0, 1). α measures the proportion of exposures leading to an asymptomatic state (realistically this seems to be about 0.4 − 0.5) and β measures how infectious an asymptomatic individual is relative to an infected one. k > l is the average duration of an asymptomatic disease. we just need to modify both (1) and (3), as now the testing also finds some of the asymptomatics who are beyond the latent period. now j(n) = (1 − α)e(n − l) and e(n + 1) = cg(r(n))h(i_eff(n)), where i_eff(n), the number of people available for effective infection on day n, now also counts asymptomatic carriers weighted by β. the argument from now on is as before. for example, if indeed α ≈ 0.5 and, as in the uk, q ≈ 0.2, with r* in the uk being approximately 2000, we have that e* ≈ 20000, i.e. the number of people exposed to the virus each day is about 10 times the number of new cases. we found the linear growth rate of the number of covid-19 positive tests puzzling and have provided a simple framework in which such a dynamic can be expected. in other words, our elementary analysis is in the framework of peircian abduction, for a good review of which see psillos [8]. the linear rate is "democratically" determined by the behaviour of the individuals as well as by the rate of testing. we also presented reasons why our assumptions are sensible. we hope the present work is a contribution to the effort to "come to grips" with the pandemic, albeit in a very rough-and-ready and partial fashion; this
rough and ready way still allows us, if we have access to additional information, such as that available for the uk from the kcl and zoe site [9], to estimate the number of asymptomatics, exposed, and effective spreaders. as a last remark, note that in the proposed model, government strategy, expressed in the parameters p and q that determine the published numbers of new cases, directly influences individual behaviour, a feedback loop that does not seem to have been sufficiently discussed. this feedback has to be understood thoroughly in order to craft more effective public health policy. an exit strategy from the covid-19 lockdown based on risk-sensitive resource allocation the incubation period of coronavirus disease 2019 (covid-19) from publicly reported confirmed cases: estimation and application clinical presentation and virological assessment of hospitalized cases of coronavirus disease 2019 in a travel-associated transmission cluster some epidemiological models with linear incidence an explorer upon untrodden ground: peirce on abduction key: cord-266626-9vn6yt8m authors: lei, howard; o’connell, ryan; ehwerhemuepha, louis; taraman, sharief; feaster, william; chang, anthony title: agile clinical research: a data science approach to scrumban in clinical medicine date: 2020-10-22 journal: intell based med doi: 10.1016/j.ibmed.2020.100009 sha: doc_id: 266626 cord_uid: 9vn6yt8m the covid-19 pandemic has required greater minute-to-minute urgency of patient treatment in intensive care units (icus), rendering the use of randomized controlled trials (rcts) too slow to be effective for treatment discovery. there is a need for agility in clinical research, and the use of data science to develop predictive models for patient treatment is a potential solution. however, rapidly developing predictive models in healthcare is challenging given the complexity of healthcare problems and the lack of regular interaction between data scientists and physicians.
data scientists can spend significant time working in isolation to build predictive models that may not be useful in clinical environments. we propose the use of an agile data science framework based on the scrumban framework used in software development. scrumban is an iterative framework: in each iteration, larger problems are broken down into simple, do-able tasks for data scientists and physicians. the two sides collaborate closely in formulating clinical questions and developing and deploying predictive models into clinical settings. physicians can provide feedback or new hypotheses given the performance of the model, and refinement of the model or clinical questions can take place in the next iteration. the rapid development of predictive models can now be achieved with increasing numbers of publicly available healthcare datasets and easily accessible cloud-based data science tools. what is truly needed are data scientist and physician partnerships ensuring close collaboration between the two sides in using these tools to develop clinically useful predictive models to meet the demands of the covid-19 healthcare landscape. the covid-19 pandemic has greatly altered the recent healthcare landscape and has brought about greater minute-to-minute urgency of patient treatment, especially in intensive care units (icus). this greater urgency for treatment implies a greater need for agility in clinical research, rendering traditional approaches such as randomized controlled trials (rcts) [1] too slow to be effective. one approach for meeting the agility needs is the use of data science for the development of predictive models to assist in patient treatment. predictive models can be rapidly and non-invasively developed by leveraging existing data and computational tools, and various efforts have been undertaken [2] [3] [4]. if successful, predictive models can rapidly process volumes of patient information to assist physicians in making clinical decisions. however, the development and deployment of predictive models that are useful in clinical environments within short timeframes is challenging. traditionally, the development and deployment of models employs a sequential process that resembles the waterfall methodology used in software development [5]. only after every preceding step is complete would the model be deployed into a real-world setting for the domain experts to evaluate and provide feedback. one main disadvantage of this approach is that it prescribes little collaboration between the day-to-day operations of data scientists and domain experts such as physicians, resulting in data scientists potentially working in isolation for long periods of time. figure 1 illustrates this process.
a breakdown of the tasks data scientists typically perform in isolation includes data collection, data pre-processing and augmentation, model selection, model hyper-parameter tuning, model training, and model testing. data pre-processing is used to put the data in a format that's suitable for use by the predictive model. data augmentation is used to artificially increase the size of the data. for example, if the input data consist of images, augmentation can include translation, scaling, rotation, and adjusting the brightness of images to present more example images for the predictive model to learn from. note that one popular technique for compensating for data size is the synthetic minority oversampling technique (smote) [6], which addresses class imbalance in datasets by artificially increasing the amount of data in the minority class. class imbalance is commonly encountered when working with electronic medical record (emr) data in healthcare. the class representing the patients with a target condition is typically smaller in size (i.e. with fewer samples) than the class representing patients without the target condition, and this can adversely affect the accuracy of predictive models developed on such data. hyper-parameter tuning involves adjusting the parameters used in the model training process [7], where the model is taught how to make predictions given the training data. one example of a hyper-parameter is the number of times - or iterations - the training data is presented to the model to learn. each iteration is known as an epoch. after each epoch the model increases its learning from the training data, and after many epochs the learning is completed. a second example of a hyper-parameter is the percentage of training data that's used by the model in each epoch. the more epochs and the more data presented in each epoch, the better the model learns from the training data. a final example of a hyper-parameter is the learning rate of predictive models.
the learning rate inversely correlates with the amount of time the model takes to reach its "learned state". models trained using higher learning rates can reach their final state faster and complete their training sooner; however, they may not have learned as well as models trained using lower learning rates. depending on the amount of training data, the complexity of the data, the number of parameters in the predictive model, and the computing resources, model training can potentially take days to complete. the model would then be evaluated against separate test data to verify that performance meets requirements. if not, some or all previous steps must be repeated until performance becomes acceptable. after the performance is deemed acceptable, the model would be deployed into a real-world environment. in the end, the process from the conception of the problem to model deployment can take months, and the opportunity for domain experts to evaluate comes only after the deployment of the model. one risk is that after deployment, the model would no longer be relevant if the goals have shifted; another risk is that the model may not meet the performance requirements in a real-world setting. in either situation, time or resources allocated to model development would have been wasted. this can be particularly damaging for data science efforts addressing the covid-19 pandemic, where rapid development of approaches for detection and diagnosis of symptoms is critical.
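the learning-rate trade-off sketched above can be seen on a toy problem. the snippet below is a minimal, illustrative gradient descent on f(w) = (w − 3)², not any particular library's training loop: a higher rate reaches the optimum in fewer steps, while a rate that is too high overshoots and diverges.

```python
def gradient_descent(lr, steps=20, w=0.0):
    """minimise f(w) = (w - 3)**2 by gradient descent; f'(w) = 2 * (w - 3)."""
    for _ in range(steps):
        w -= lr * 2.0 * (w - 3.0)
    return w

slow = gradient_descent(lr=0.05)   # steady, but still short of the optimum
fast = gradient_descent(lr=0.4)    # essentially converged after 20 steps
wild = gradient_descent(lr=1.1)    # overshoots further on every step
print(slow, fast, wild)
```

the same trade-off holds for neural-network training, with the extra caveat that too low a rate also wastes compute.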
the development of predictive models often requires well-structured and well-labeled data; hence, there is a greater need for data exploration, pre-processing and/or filtering when processing emr data. furthermore, it may be discovered upon exploration of the available training data that the initial clinical questions and goals may not be achievable by predictive models developed using the data. those questions and goals would need to be refined before model development can proceed. furthermore, for predictive models to be usable in a clinical setting, physicians must have confidence that their performance is reliable. models that perform well under common metrics used by data scientists, such as the area under the curve (auc), do not guarantee that important clinical decisions can be made based on the model [9]. that is because the auc is a metric that measures model performance across a broad range of sensitivities and specificities of the model. when making important clinical decisions related to patients in the icu, such as proning versus ventilation, which drugs to use, or whether to administer anti-coagulants, knowing that the model has an excellent auc of 0.95 out of 1.0 is not as helpful as knowing that a decision based on the model has a 95% chance of being correct (i.e. the model's specificity). some clinical decisions also need to be made within minutes, implying that the model must meet real-time performance standards in order to become the "partner" that can assist physicians in on-the-spot decision making. the fact that predictive models could fall short in performance after being deployed into a clinical setting implies an even greater need for a framework that allows physicians to collaborate with data scientists to continuously monitor model development and performance.
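the distinction between a global metric like auc and the error rate at one clinical operating point can be made concrete with a handful of scores. the numbers below are invented for the example, not from any model:

```python
# toy scores: higher score = model predicts "has condition"
pos = [0.9, 0.8, 0.75, 0.6]        # patients with the condition
neg = [0.7, 0.4, 0.3, 0.2, 0.1]    # patients without it

# auc = probability that a random positive outranks a random negative
pairs = [(p > n) + 0.5 * (p == n) for p in pos for n in neg]
auc = sum(pairs) / len(pairs)

# but a clinical decision happens at one threshold
t = 0.5
sensitivity = sum(p >= t for p in pos) / len(pos)  # correct positives
specificity = sum(n < t for n in neg) / len(neg)   # correct negatives
print(auc, sensitivity, specificity)
```

here the auc is 0.95, yet at the decision threshold of 0.5 one negative patient in five would still be flagged (specificity 0.8), which is precisely the gap the text is pointing at.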
furthermore, the minute-to-minute urgency of treatment needed for the covid-19 pandemic implies that the lengthy process prescribed by the traditional waterfall approach -with little communication between data scientists and physicians -is inadequate. the agile framework has been traditionally used in software development and has recently been introduced in data science [10] . the framework is an expedient approach that encourages greater velocity towards accomplishing goals. it includes the scrum and kanban frameworks, and a hybrid framework called scrumban [11] . the scrum framework prescribes consecutive "sprint cycles", which each cycle spanning a few weeks. within each cycle, team members set and refine goals, produce implementations, and perform a retrospective with stakeholders. new goals and refinements are established for the next sprint cycle. one of the team members also acts as the scrum master, who facilitates daily team meetings (called standups), and ensures that the team is working towards goals and requirements [12] . the kanban framework involves breaking down larger tasks into simple, do-able tasks. each task proceeds through a sequence of well-defined steps from start to finish. tasks are displayed as cards on a kanban board, and their positions on the board indicate how much progress has been made [13] . certain tasks may be "blocked", meaning that something needs to resolve before progress on the task can continue. figure 2 shows an example of a kanban board. one advantage to using a kanban board is that the set of all necessary tasks, along with progress for each task, is transparent to members of the development team and anyone else who is interested. overall, the kanban framework helps bring clarity in tackling larger problems. domain experts can visualize how the team is tackling the problems, along with what has been accomplished, what is in progress, what still needs to be done, and what needs to resolve before progress can be made. 
the proposed agile framework is shown in figure 3. unlike the waterfall approach, the tasks in the agile approach are to be done collaboratively between data scientists and physicians, and we note that the use of cloud-based storage and computing helps by providing a common platform for accessing the data and model(s). complex problems can be broken down into tasks that can be visualized by both data scientists and physicians, enabling physicians to better understand the work that data scientists must do within each sprint cycle. the framework encourages continuous deployment of predictive models in clinical settings (such as the icu), during which time data scientists can round with physicians and receive feedback on the model's performance. the physician's insight or gestalt can be leveraged to determine whether the results of the model are believable [7]. it may be that the predictive model performs well only in certain settings, such as with certain patient populations or across certain periods of time; if so, the clinical questions can be refined or new hypotheses developed at the beginning of the next sprint cycle. the point at which the sprint cycles should end, either because the model has finally become clinically useful or because the team needs to pivot completely to a different direction, is determined by the physicians. while the traditional waterfall approach could take many months for clinically useful models to be developed, the agile approach could take just a fraction of that time, depending on the level of collaboration between data scientists and physicians. for agile data science to work in the healthcare domain, certain infrastructure must be in place to ensure that sprint cycles can be completed within shorter timeframes. these include the ability to: 1. rapidly acquire large datasets. 2. parse and query data in real time. 3. use established platforms and libraries rather than develop tools de novo.
these platforms and libraries should reside in a cloud framework that allows collaborative efforts to take place. the availability of publicly accessible health information databases for research is increasing despite a multitude of regulatory and financial roadblocks. one such database is the medical information mart for intensive care iii (mimic-iii), which contains de-identified data generated by over fifty thousand patients who received care in the icu at beth israel deaconess medical center [15]. the hope is that as researchers adopt the use of mimic, new insights, knowledge, and tools from around the world can be generated [16]. another publicly available database is the eicu collaborative research database, a multi-center collaborative database containing intensive care unit (icu) data from many hospitals across the united states [17]. both the mimic-iii and the eicu databases can be immediately obtained upon registration and completion of training modules. the popularity of these two databases illustrates the potential for large amounts of data to be gathered from hospitals and icus around the world and made immediately accessible to researchers. covid-19-specific clinical datasets for research have also been released [18]. the cerner real-world data is another covid-19 research database that contains de-identified data and is freely offered to health systems [19]. finally, databases for medical imaging studies also exist, such as the chest x-ray dataset released by the nih, which contains over 100,000 chest x-ray images [20]. once datasets are obtained, storage and compute power are easily purchased and accessible from an ever-increasing number of vendors. the compute power needed for analyzing large datasets can often be met using cloud computing resources, with amazon web services (aws), google cloud platform (gcp), and microsoft azure being the providers of popular cloud services [21] [22] [23]. the need for cloud computing tools rests mainly on the availability of specialized elastic compute instances.
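the "parse and query data in real time" requirement above largely amounts to running simple filters over structured records. the sketch below uses python's standard csv module on a toy, in-memory stand-in for an icu extract; the column names and rows are invented for illustration and are not real patient data:

```python
import csv
import io

# toy, in-memory stand-in for a mimic-style icu extract (illustrative only)
raw = io.StringIO(
    "subject_id,age,icu_los_days,mortality\n"
    "1,64,3.2,0\n"
    "2,71,8.5,1\n"
    "3,55,1.1,0\n"
)
rows = list(csv.DictReader(raw))

# a real-time "query" is just a filter over the parsed records
long_stays = [r["subject_id"] for r in rows if float(r["icu_los_days"]) > 3]
print(long_stays)
```

against a real database the same filter would be a sql query, but the shape of the operation, parse once and filter cheaply, is identical.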
the elasticity implies that computing resources can be accessed in real time and scaled up or down as needed to balance computing power and cost. another advantage of a cloud framework is that it allows multiple data scientists and physicians to conveniently collaborate on and access the work. this shift to elastic cloud resources has seen one of the major electronic medical records (emr) providers, cerner corporation [24], develop tools for agile data science that use cloud computing resources as the underlying computing engine. these tools for agile data science often use jupyter notebook as the underlying front-end programming interface. jupyter is an open source computational environment that supports programming frameworks and languages such as apache spark [25], python and r, required for processing the data and developing predictive models [26]. open source machine learning libraries like keras [27], which enable the rapid development of advanced predictive models such as convolutional neural networks (cnns) [28], can also be integrated. finally, the jupyter notebook framework supports collaboration amongst multiple individuals, where data scientists and physicians can query data, add and modify code and/or visualize results in real time [26]. the availability of the development tools and the accessibility of data allow data scientists to rapidly acquire data, query the parts of the data relevant for addressing clinical questions, and develop predictive models. the outcomes of the model can lead to refinement of the clinical questions, the data, or the model itself. the combination of data scientists, physicians, and agile data science tools will help revolutionize the entire data science process and accelerate discoveries in healthcare and other application domains. agile data science is quickly becoming a necessity in healthcare, and it is especially critical given the covid-19 pandemic.
the agile framework prescribes a rapid, continuous-improvement process enabling physicians to understand the work of data scientists and regularly evaluate predictive model performance in clinical settings. physicians can provide feedback or form new hypotheses for data scientists to implement in the next cycle of the process. this is a departure from the traditional waterfall approach, with data scientists tackling a sequence of tasks in isolation, without regularly deploying the models in real-world settings or engaging domain experts such as physicians. given the rapidly shifting healthcare landscape, the goals and requirements for the predictive models may change by the time the model is deployed; this renders the slower, traditional model development approaches unsuitable. as the agile framework encourages rapid development and deployment of predictive models, it requires data scientists to have easy access to data and to the infrastructure needed for model development, deployment, and communication of outcomes. fortunately, there are now publicly available datasets such as mimic-iii, and cloud-based infrastructure such as amazon web services (aws), to achieve this. aws contains a suite of popular tools such as jupyter notebook, python, and r, allowing data scientists to rapidly upload data and develop and deploy models with short turn-around times. given the increasing amounts of healthcare data, the plethora of clinical questions to address, as well as the minute-to-minute urgency of treating icu patients given the covid-19 pandemic, the rapid development of predictive models to address these challenges is more important than ever. we hope that the agile framework can be embraced by increasing numbers of physician and data scientist partnerships in the process of developing clinically useful models to address these challenges.
a method for assessing the quality of a randomized control trial artificial intelligence (ai) applications for covid-19 pandemic artificial intelligence-enabled rapid diagnosis of patients with covid-19 smote: synthetic minority over-sampling technique how to read articles that use machine learning (users' guide to the medical literature) data processing and text mining technologies on electronic medical records: a a physician's perspective on machine learning in healthcare. invited talk presented at machine learning for health care (mlhc) agile data science 2.0. o'reilly media, inc mvm - minimal viable model mimic-iii, a freely accessible critical care database making big data useful for health care: a summary of the inaugural mit critical data conference the eicu collaborative research database, a freely available multi-center database for covid-19 clinical data sets for research faq: covid-19 de-identified data cohort access offer chestx-ray8: hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases apache spark - unified analytics engine for big data toward collaborative open data science in metabolomics using jupyter notebooks and cloud computing reading checks with multilayer graph transformer networks no external funding is provided for this work. highlights: • agile data science in healthcare is becoming a necessity, given the covid-19 pandemic and the minute-to-minute urgency of patient treatment. • the proposed agile data science framework is based on scrumban, used in software development. • publicly available healthcare datasets and cloud-based infrastructure enable the agile framework to be widely adopted. • collaboration between physicians and data scientists is needed in order to implement the agile framework.
the authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. key: cord-354254-89vjfkfd authors: peng, shanbi; chen, qikun; liu, enbin title: the role of computational fluid dynamics tools on investigation of pathogen transmission: prevention and control date: 2020-08-31 journal: sci total environ doi: 10.1016/j.scitotenv.2020.142090 sha: doc_id: 354254 cord_uid: 354254-89vjfkfd the transmission mechanisms of infectious pathogens in various environments are highly complex and have long attracted researchers' attention. as a cost-effective and powerful method, computational fluid dynamics (cfd) plays an important role in numerically solving problems of environmental fluid mechanics. moreover, with the development of computer science, an increasing number of researchers have started to analyse pathogen transmission using cfd methods. motivated by the impact of covid-19, this review summarizes research on pathogen transmission based on cfd methods with different models and algorithms. defining the pathogen as a particle or a gaseous species is a common approach in cfd simulation, and epidemic models are used in some investigations to raise the realism of the calculation. although it is not so difficult to describe the physical characteristics of pathogens, describing their biological characteristics remains a major challenge in cfd simulation. a series of investigations that analysed pathogen transmission in different environments (hospitals, teaching buildings, etc.) demonstrated the effect of airflow on pathogen transmission and emphasized the importance of reasonable ventilation. finally, this review presents three advanced methods: the lattice boltzmann method (lbm), the porous media method, and the web-based forecasting method.
although the cfd methods mentioned in this review may not alleviate the current pandemic situation, they help researchers understand the transmission mechanisms of pathogens such as viruses and bacteria and provide guidelines for reducing infection risk in epidemic or pandemic situations. droplet volume, contact angle and environmental temperature were analysed, and the lifetime of droplets under those conditions was investigated. the evaporation of droplets is also affected by dust in the air, and this factor should be considered in future work. unlike other particles, pathogens are much smaller, with diameters generally no more than 100 nm, and their motion is largely driven by the flowing air; hence it is difficult to analyse their trajectories directly in the atmospheric environment. with the development of computer science, methods based on computational fluid dynamics (cfd) can be used to solve this problem, and they have already been well developed over the years [20]; in recent decades, an increasing number of investigations of air pollution, the atmospheric environment and pathogen transmission can be found. as mentioned above, droplets can carry pathogens into the airflow and thereby spread infectious diseases. normally, pathogen-laden droplets are generated by coughing or sneezing from an infected person; the process of droplet generation by sneezing is shown in fig. 2. the impact of covid-19 is global, and the pandemic situation is closely related to the health of every individual. although effective vaccines are unavailable, this does not mean that there is no way to prevent or control it. understanding the transmission of infections such as covid-19 in various media is of great importance. in this review, the principles of different cfd algorithms are described concisely and intuitively, and the theories and applications of cfd in investigations of pathogen transmission are summarized.
the objective of this research work is to show the important role of the cfd method in analysing pathogen transmission. by summarizing various applications of the cfd method, the transmission mechanisms of pathogens and methods of prevention are also covered in this work. three steps are necessary for numerical analysis with cfd tools: (1) generating a high-quality mesh model, which is key to the accuracy of the calculation; (2) setting boundary conditions, which define the variables at the domain boundary; (3) selecting among the different algorithms available in cfd, which determine the iteration scheme. this section summarizes the features of cfd from three aspects: simplification, algorithm diversity and maneuverability. unlike experimental methods, the cfd method, based on mathematical models that run on computers, is effective and cost-saving. in numerical simulations, pathogens carried by small particles, such as solid particles or droplets, can be represented by calculation models. although the biological properties of pathogens are complicated and varied, the shape of the pathogen carrier is relatively simple to describe. in cfd, these particles can be defined as spheres, tetrahedra, hexahedra, or through a shape factor, and pathogen transmission in different environmental fluids can then be solved using multiphase models. moreover, the species transport model can also be applied to such simulations; in this model, the infectious pathogen in the air is defined as a pollutant source with a constant concentration (generally measured by field experiment).
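as a minimal illustration of the species-transport idea above (a sketch, not code from any reviewed paper; the velocity, diffusivity and grid values are hypothetical), a one-dimensional pathogen-concentration field can be advanced with an explicit upwind advection and central diffusion scheme on a periodic domain:

```python
import numpy as np

def advect_diffuse_1d(c0, u, D, dx, dt, steps):
    """Advance a scalar concentration field c (e.g. airborne pathogen
    concentration) with first-order upwind advection (assumes u > 0)
    and central diffusion; periodic boundaries, CFL-stable dt assumed."""
    c = c0.copy()
    for _ in range(steps):
        adv = -u * (c - np.roll(c, 1)) / dx                      # upwind
        dif = D * (np.roll(c, -1) - 2 * c + np.roll(c, 1)) / dx**2
        c = c + dt * (adv + dif)
    return c

c0 = np.zeros(100)
c0[45:55] = 1.0                                                   # initial puff
c = advect_diffuse_1d(c0, u=1.0, D=0.01, dx=0.1, dt=0.005, steps=200)
```

with periodic boundaries the scheme conserves total mass exactly, and the puff drifts downstream by u·t while smearing out, which is the qualitative behaviour the transport equation encodes.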
the fourier law describes the heat exchange, q = -k ∇T, and the fick law describes the mass exchange, J_1 = -ρ D_12 ∇Y_1, where J_1 is the diffusion flux of the source species and D_12 is the diffusion coefficient between the two species. the reynolds transport theorem calculates the source quantity in a control volume at time t, which can be described as dB/dt = ∂/∂t ∫_cv ρ b dV + ∮_cs ρ b (u · n) dA. introducing the continuity equation, ∂ρ/∂t + ∇·(ρu) = 0, and the gauss (divergence) theorem, the improved transport equation can be written as ∂(ρφ)/∂t + ∇·(ρuφ) = ∇·(Γ ∇φ) + S_φ. combining the balance equation of forces, the particle equation of motion can be written as du_p/dt = F_D (u − u_p) + g (ρ_p − ρ)/ρ_p + F, where F_D (u − u_p) is the drag force per unit particle mass and F collects the accelerations caused by forces other than drag; the particle motion is solved at each step by iterative calculation. besides, the volume of fluid (vof) model performs well in simulating pathogen transmission, especially at the gas-liquid interface. in a control volume, the volume fractions of all phases sum to 100%, and three situations arise for the volume fraction α of phase q: (1) α = 0, no phase q in this cell; (2) α = 1, the cell is full of phase q; (3) 0 < α < 1, an interface between phase q and other phases can be found in the cell. the momentum equation and the energy equation of vof are determined and shared by each phase. the momentum equation mainly depends on the characteristics and volume fraction of each phase and can be written as ∂(ρu)/∂t + ∇·(ρuu) = −∇p + ∇·[μ(∇u + ∇uᵀ)] + ρg + F. the energy equation of vof can be written as ∂(ρE)/∂t + ∇·(u(ρE + p)) = ∇·(k_eff ∇T − Σ_j h_j,q J_j,q) + S_h, where k_eff is the effective thermal conductivity, J_j,q is the diffusion flux of species j in phase q, h_j,q the corresponding enthalpy, and S_h the volumetric heat source defined by users. the energy is defined as a mass-averaged variable in vof: E = Σ_q α_q ρ_q E_q / Σ_q α_q ρ_q. for porous media, the resistance characteristics in all directions are assumed the same, and the momentum sink can be written as S_i = −((μ/α) u_i + C_2 (ρ/2) |u| u_i). generally, the inertial resistance coefficient C_2 and the viscous resistance coefficient 1/α are needed as parameters of the boundary condition.
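the particle equation of motion above can be sketched with a simple explicit integrator; this is a generic lagrangian-tracking sketch, where the relaxation time tau_p and the flow field u_fluid are hypothetical stand-ins for the drag model and the cfd-resolved air velocity:

```python
import numpy as np

def track_particle(u_fluid, x0, v0, tau_p, g, dt, steps):
    """Integrate one droplet's motion with linear (Stokes-type) drag:
        dv/dt = (u_f(x) - v) / tau_p + g
    tau_p is the particle relaxation time (assumed known);
    u_fluid(x) returns the local air velocity (hypothetical field)."""
    x, v = np.array(x0, float), np.array(v0, float)
    path = [x.copy()]
    for _ in range(steps):
        a = (u_fluid(x) - v) / tau_p + g      # drag + gravity accelerations
        v = v + dt * a
        x = x + dt * v
        path.append(x.copy())
    return np.array(path)

uniform = lambda x: np.array([0.5, 0.0])       # steady horizontal draft, m/s
path = track_particle(uniform, x0=[0.0, 2.0], v0=[0.0, 0.0],
                      tau_p=0.05, g=np.array([0.0, -9.81]),
                      dt=1e-3, steps=2000)
```

after a few relaxation times the droplet settles at the draft velocity plus its terminal settling velocity g·tau_p, which is the balance the force equation describes.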
taking the medical mask as an example, the way of obtaining these parameters is shown in fig. 3: the pressure difference (p_1 − p_2) across the mask material, as shown in fig. 3, can be measured under a velocity input v_i. the relationship between the pressure difference and the input velocity can be described as Δp = ((μ/α) v_i + C_2 (ρ/2) v_i²) Δn, so the coefficients C_2 and 1/α can be calculated when the density ρ and the thickness Δn are known. some investigations using the species transport model and multiphase models are listed in a table below; the algorithms mainly used in recent studies are summarized in table 3. although the mechanism of pathogen transmission in a fluid is complex, the motion of pathogens still follows the laws of hydrodynamics and can be solved by the mathematical models of cfd algorithms. for example, the lbm method can be used to solve pathogen transmission at small scale, while the fdm method can be applied to large-scale transmission of the pathogen. cfd tools are highly compatible and their computing files can be transferred among a variety of software. in general, the structure of cfd software consists of three parts: pre-processing, solver and post-processing; fig. 4 shows some options for each part. pathogen transmission in the environment is a complex process. for the sake of an accurate simulation result, the calculation model and the simulation parameters are essential; furthermore, epidemic models should be taken into account in numerical simulations. using experimental methods to obtain the data required by the boundary conditions is important, and experimental data are also needed to validate the simulation results. this section summarizes the epidemic models that can be used in simulations; moreover, the experimental methods that can be applied to analysing pathogen transmission are presented as well.
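the coefficient extraction described above amounts to a least-squares fit of the measured pressure drop against v and v²; a sketch with synthetic data (the material values and "true" coefficients below are assumed for the demonstration, not taken from the text):

```python
import numpy as np

# Synthetic (v_i, dp) pairs for a mask layer; true coefficients are assumed.
mu, rho, dn = 1.8e-5, 1.2, 2e-3          # air viscosity, density, thickness [SI]
inv_alpha_true, c2_true = 5e9, 1e4       # hypothetical 1/alpha and C2
v = np.linspace(0.1, 2.0, 10)
dp = (mu * inv_alpha_true * v + c2_true * rho / 2 * v**2) * dn

# Least-squares fit of dp/dn = (mu/alpha) v + (C2 rho/2) v^2
A = np.column_stack([mu * v, rho / 2 * v**2])
coef, *_ = np.linalg.lstsq(A, dp / dn, rcond=None)
inv_alpha, c2 = coef
```

with real measurements the same two-column regression recovers both resistance coefficients at once, which is exactly the pair of boundary-condition parameters the porous model needs.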
although some of the experiments summarized here were not used in combination with cfd methods, they provide valuable references for similar studies using numerical simulation. poussou et al. [44] investigated the effects of moving people on pollutant diffusion and airflow; combining piv experimental technology, they used a cfd method with a second-order upwind scheme to simulate the airflow. the re-normalization group (rng) k-ε model was used in the simulation to solve the turbulence, with good accuracy, efficiency and robustness. in the study of gao and niu [45], an rng k-ε model including low-reynolds-number effects was used to solve the airflow, and the diffusion of a tracer gas representing the contaminant transmission was calculated from the transport equation ∂(ρφ)/∂t + ∇·(ρuφ) = ∇·(Γ ∇φ), where t, ρ and φ are time, air density and tracer gas concentration, respectively. unsteady flow is a big challenge for accurate simulation, as zhang et al. [46] indicated: flow in environmental channels is always unsteady, which increases the complexity of simulating pathogen transmission, and how to treat an unsteady flow as a steady flow in practice is still a difficulty. defining the wave in the hydraulic calculation can effectively simplify the disturbance in unsteady flow. a capillary wave [47] reflects the disturbance brought by various factors to the fluid surface, with dispersion relations (1) for shallow water, ω² = (g k + σ k³/ρ) tanh(k h), and (2) for deep water, ω² = g k + σ k³/ρ. ku [48] presented a "waveform" method for modelling unsteady blood flow, which reflects the relationship between time and volumetric flow rate (fig. 5, volumetric flow rate). mantha et al. [49] used this method in the simulation of biological flows and found a relationship between wall shear stress and the location of aneurysms; nanduri et al. [50] also used the waveform approach to solve unsteady laminar flow, the objective of their study being to build a human-body surface model to simulate the airflow around the body.
although this method is useful for analysing particle transport in biological flows, it is not suitable for simulating unsteady flow in the atmosphere, where the disturbances are greater. moreover, in order to simplify the modelling of airflow in buildings, axley [51] presented a multi-zone model that allows users to calculate the hourly airflow rates between rooms, and dols and walton [52] improved this model by providing a mass-conservation equation of the form dm_i/dt = Σ_j ṁ_ji − Σ_j ṁ_ij + s_i, i.e. the mass of zone i changes with the flows received from and delivered to the other zones j, plus any source. based on dispersal theory, which is not limited to the well-mixed region, multi-zone model parameters should also be considered. airflow caused by temperature differences will affect pathogen transmission: chen et al. [56] simulated three cases combining the multi-zone model and a two-way airflow effect in order to demonstrate the effect of temperature difference on indoor air quality. in one of their cases, the airflow generated by the temperature difference between bathroom and corridor transported infectious pathogens; hence, as they suggest, the door of the infected room should be kept closed. closing doors and windows in a room is not equivalent to obtaining a closed space: the cracks around doors and windows are often ignored by researchers when simulating airflow or pathogen spread in a building or a single room. using the multi-zone method in cfd simulation, yang et al. [57] analysed the effect of the stack and wind effects on contaminant dispersion and found that these factors cause the contamination to spread horizontally or vertically. their research also indicated that pollutant gas can be transported through the cracks of doors and windows and may cause infectious disease.
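a minimal steady multi-zone mass balance can be sketched as a linear system, assuming a linearized flow law ṁ_ij = c_ij (p_i − p_j); real multi-zone codes use power-law crack models, so this is only an illustrative simplification, and the zone layout and conductances are hypothetical:

```python
import numpy as np

# Zones: 0 (supply fan injects m_in), 1 (corridor), 2 (leaks to outside, p = 0).
# Steady state: net mass flow into each zone is zero.
C01, C12, C2out = 0.02, 0.015, 0.01       # kg/(s*Pa), hypothetical conductances
m_in = 0.05                                # kg/s supplied to zone 0

# Assemble the conductance ("Laplacian") matrix for p = (p0, p1, p2),
# grounded through the leak to the outside at zero pressure.
A = np.array([[ C01,       -C01,        0.0       ],
              [-C01,  C01 + C12,       -C12       ],
              [ 0.0,       -C12,  C12 + C2out]])
b = np.array([m_in, 0.0, 0.0])
p = np.linalg.solve(A, b)                  # zone pressures relative to outside
m_exfiltration = C2out * p[2]              # must equal m_in at steady state
```

the solved pressures decrease from the supplied zone toward the leak, and the exfiltration exactly balances the supply, which is the mass-conservation statement of the multi-zone model.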
because of its great effectiveness, the multi-zone method has been widely used in many cases, as can be found in the work of wu. the wells-riley model gives the expected number of new infections as c = s (1 − e^(−i·q·p·t/q_r)), where s is the number of susceptible people in an area, i is the number of infectious people, p is the pulmonary ventilation rate of susceptible people, q is the quanta generation rate of an infector, t the exposure time, and q_r the room ventilation rate. zhu et al. [64] investigated the potential infection risk in public transportation by using the wells-riley model in cfd simulations. their study proved that the closer an infected person is to the operating exhaust in the bus, the smaller the infection risk posed to others. besides, this study indicated that the ventilation system of most buses is not effective, because only a single exhaust is located in the middle of the cabin or on the back wall. yan et al. [65] studied the transmission of cough particles in the breathing zone of people; in their investigation, a method combining the wells-riley model and the lagrangian model in cfd was used. this study illustrated that the release location of the particles affects the particle travel distance. based on the wells-riley model, this research work also presented a quantifiable approach to assess the infection risk of passengers. although these studies are helpful for improving the design of vehicle ventilation systems and hence reducing infection risk, they did not consider the effect of altitude on airflow patterns in vehicles. the wells-riley model can also be applied to building simulation (niu). based on physical characteristics, such as the aerodynamics, of respiratory droplets, chaudhuri et al. [82] proposed a numerical model for the early stage of the covid-19 pandemic by integrating a chemical mechanism with pandemic evolution equations; the characteristic time derived in this work using collision-rate theory represents the lifetime of the droplet.
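the standard wells-riley calculation reads directly as code; the quanta generation rate q and room ventilation rate q_room used in the call below are illustrative values, not figures from the text:

```python
import math

def wells_riley(s, i, q, p, t, q_room):
    """Wells-Riley estimate of airborne infection risk:
        risk = 1 - exp(-i*q*p*t / q_room)
        expected new cases = s * risk
    q: quanta generation rate [quanta/h], p: pulmonary ventilation
    rate [m^3/h], t: exposure time [h], q_room: ventilation [m^3/h]."""
    risk = 1.0 - math.exp(-i * q * p * t / q_room)
    return s * risk, risk

# 30 susceptible people, 1 infector, 2 h exposure in a room
# ventilated at 600 m^3/h (all values illustrative).
cases, risk = wells_riley(s=30, i=1, q=14, p=0.48, t=2.0, q_room=600.0)
```

doubling the ventilation rate q_room roughly halves the exponent and therefore (for small risks) roughly halves the expected number of new cases, which is why the reviewed studies emphasize exhaust placement and airflow rates.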
some investigations based on these models are listed in tab. 5. moreover, hathway [92] combined the cfd method with the sir model in order to analyse pathogen transmission in hospital spaces, and asanuma and ito [93] predicted the exposure risk of the population in a hospital by using cfd with the sir model. from these investigations, it can be found that such epidemic models perform well in simulating the spread of infectious diseases. however, the number of studies applying these models in cfd simulation is still small, owing to the complexity of modelling and computing the airflow or particle transport among a crowd of people. in cfd simulation, not only the mesh model but also the simulation parameters are of great importance for obtaining the required results. generally, boundary conditions such as velocity, pressure and turbulence intensity can be measured from experiments. recently, micro-particle experiments and tracer gas experiments have been the most used in investigations of airborne transmission. romano et al. [94] simulated the airflow pattern and the concentration of airborne particles in an operating theatre (ot) using the cfd method. they also conducted an experiment in order to verify the accuracy of the simulation results: a six-way aerosol distributor was used to convey the generated aerosol particles; an optical particle counter (opc) equipped with a dilution system measured the particle concentration; and a rotating-vane anemometer and a thermo-anemometer measured the velocity and temperature, respectively. they validated the simulation by comparing it with the data measured from the experiment and found that the experimental and numerical data coincided well (errors of less than 2% for temperature and 10% for velocity).
although the mean absolute percentage error for particle concentration is 42%, the experimental and numerical curves show similar trends. therefore, experiments involving particle-laden flow are more suitable for qualitative analysis, because it is hard to accurately control conditions such as temperature, pressure and a stable flow velocity. zhou et al. [95] established a model to predict the distribution of negative ions produced by an air ionizer and the efficiency of this device. in their experiment, an emission system consisting of a compressor and a nebulizer was used to compress the filtered air and aerosolize the bacteria, and an ion counter was used to measure the emission concentration. the installed experimental system is shown in fig. 6 (the detailed experimental setup [82]). the objective of the experiment carried out in this work was to measure the susceptibility. besides, it was proved that the bacterial load in shower air increases when the shower spray is turned on. the effect of droplet velocity and distribution on aerosolized bacterial groups was not given in this study; moreover, shower parameters such as water temperature and nozzle structure should also be considered in the experiment. choi et al. [97] classified airborne particles according to their optical properties using experimental methods. an ink-jet aerosol generator (ijag) was used to generate and dry the airborne particles, and the light-scattering signal was used to estimate the correlation value in the classification analysis of airborne particles. the correlation value proposed in this work is helpful for particle detection and classification, although how to apply this method to detect other airborne pathogens with more complicated biological characteristics requires further study.
mei pointed out two requirements: (1) the conveyed air in the experiment needs to be filtered; (2) particles should be uniformly delivered. experiments that analyse particles are useful for understanding their laws of motion; however, it is difficult to measure the characteristics of nano-scale particles at large scale. the tracer gas method is also a common method for analysing pollution diffusion and airflow patterns. a tracer gas can be mixed with air without any changes and can be easily detected because of its special physical characteristics; helium, nitrogen, argon and carbon dioxide are often chosen as tracer gases. gao et al. [102] combined experiment and the cfd method to study airborne transmission between different flats of a high-rise building; to verify their simulation, tracer gas data from aalborg university, denmark [103] were used. the analysis of this work is comprehensive in illustrating the transmission mechanism of airborne viruses, though how to control virus transmission in a high-rise building on the basis of this investigation needs further study. to investigate airborne transmission between horizontally adjacent units, wu et al. [104] analysed the factors influencing the transmission route, especially the contributions of wind force and thermal buoyancy, and found that wind force is the main driving force affecting inter-unit dispersion. their experiment was conducted in a slab-type building in hong kong: sf6 was used as the tracer gas and injected via the air samplers, while co2, monitored by a tsi q-trak co2 sensor, was used to calculate the ventilation rate. although the spread risk may be overestimated in the analysis, because the cracks around doors and windows can cause pathogen aerosols to deposit, this work still provides a valuable study identifying possible airborne transmission routes.
ai et al. [105] used a tracer gas (no2) experiment to examine the characteristics of airborne transmission of exhaled droplets between two people in an experimental room. two manikins were used to represent an exposed person and an infected person; air velocity was measured with a swema 3000 omnidirectional anemometer; a pt100 sensor monitored the air temperature; and a fast concentration meter (fmc) together with an innova multi-gas sampler and monitor were used to test the tracer gas concentration. this work indicated an interaction between the exhaled gas and the supply flow and analysed the impact of these factors on the infection risk of an exposed person facing an infectious person. although the experiment carried out in this work was based on a steady-state condition, without taking the impact of time into consideration, it provides an effective method for later research. in [110], bacteria were traced by: (1) culturing and filtering suitable bacteria; (2) aerosolizing the bacterial particles and conveying them into the measuring environment; (3) analysing the airborne transmission of e. coli by pcr. the flow chart is shown in fig. 7 (experiment process of tracing bacteria [110]). it would be more persuasive if this process could be carried out in studies using the tracer gas method or particle experiments; however, it also increases the risk of conducting experiments if the bacteria or viruses are highly infectious. overall, both particle experiments and tracer gas experiments can help people understand the process of pathogen transmission, and they provide crucial information for cfd users: on the one hand, the experimental data can be used as boundary conditions in cfd simulation; on the other hand, the simulation results can be quantitatively or qualitatively verified against the experiments to ensure the accuracy of the cfd simulation.
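a common use of tracer-gas measurements is the decay method for estimating the air-change rate: with c(t) = c0·exp(−ach·t), the slope of ln c against time gives −ach. a sketch with synthetic decay data (the concentrations below are invented for the demonstration):

```python
import numpy as np

def air_change_rate(t, c):
    """Estimate air changes per hour from a tracer-gas decay test.
    Fits ln(c) = ln(c0) - ACH*t by least squares; t in hours,
    c in any consistent concentration unit."""
    slope, _ = np.polyfit(t, np.log(c), 1)
    return -slope

t = np.array([0.0, 0.25, 0.5, 0.75, 1.0])   # hours
c = 400 * np.exp(-3.0 * t)                   # synthetic decay, true ACH = 3
ach = air_change_rate(t, c)
```

the same regression applied to measured concentrations gives the ventilation rate that the reviewed studies feed into cfd boundary conditions or into risk models.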
therefore, designing an effective experiment for analysing pathogen transmission is necessary, as it makes the simulation results more convincing. the transmission of pathogens can differ among spaces, and when an epidemic caused by infectious pathogens breaks out, hospitals become high-risk places and may lead to secondary infection. how to control pathogens in hospitals using an effective ventilation system has become a great concern. kao et al. [111, 112] used the tracer gas no2 to replace the viral gas emitted from the patient and simulated three cases under different volumes of supplied and exhausted air; the simulation results present the diffusion process of the tracer gas (fig. 9, the simulation of tracer gas diffusion [112]). in the same year, this research group studied a similar topic using the tracer gas and cfd methods. in this analysis, the stack effect of a high-rise building on airflow was considered, and the simulation model was based on the general hospital k in korea, as shown in fig. 10 (the prince of wales hospital and its simulation model [113]). they simulated the spread of tracer gas in wards on both a lower floor (5f) and a higher floor (15f) to demonstrate the stack effect; some of the simulation results are shown in fig. 11 (simulation results of tracer gas transmission in wards on different floors [113]). the studies above mainly investigated pathogen transmission inside the hospital, and they are meaningful for protecting patients and hospital staff. however, not only is pathogen transmission inside the hospital dangerous; the pollutant emission from the hospital is also a great concern for public health. chang et al. [114] used cfd to model the atmospheric environment outside the hospital and simulated the spread of the viral (sars) gas emitted from it.
the mesh model of the simulation is shown in fig. 12 (mesh model of simulation [114]). this model was generated with tetrahedral grids; the wind velocity, as a boundary parameter, was measured by a hot-film probe and anemometry equipment, and 16 wind directions were considered in the calculation. moreover, in order to verify the model, tracer gas was used in an experimental model at 1:50 scale. the simulation results in fig. 13 show the concentration contours of the pollutant gas at the height of the roof chimney (right) and at 1.5 m above the ground (left) (diffusion of pollutant gas emitted from the hospital [114]). from the simulation results, they indicated that both the maximum and mean concentrations of the pollutant gas are small and would not affect residents' health. however, when a large number of sars patients are housed in the hospital, some risk still remains for people active in the high-concentration area at ground level. the research works above mainly focused on airflow patterns or the impact of ventilation on pathogen transmission. however, cross-infection happens frequently in hospitals and should receive attention in case studies of pathogen transmission. based on the eulerian-lagrangian method, a case study proposed by wang et al. [115] illustrated the sneezing process of a virus carrier. since ventilation is easier to control, it is necessary to ensure safety when the viral gas is emitted from the hospital exhaust system. without professional medical equipment, buildings with high population density, such as residential, commercial and campus buildings, are at higher infection risk. yang et al. [116] studied natural ventilation in teaching buildings using the cfd method.
in their investigation, phoenics with a rans model was used to simulate the ventilation; the simple algorithm was used for the calculation, and the presto scheme was used for the pressure interpolation on the staggered grid. the wind profile at the inlet boundary of the simulation was determined by the ashrae power-law equation [117]: u(y) = u_ref (y/h_ref)^0.22 (27). through the simulation, they indicated that the ventilation of a teaching building with a "line-type" corridor is better than that of one with an inside corridor; they also presented an optimization design for better ventilation in teaching buildings by determining the best wind angle. moreover, cuce et al. [118] studied natural ventilation in school buildings based on its working principles and the limitations of passive ventilation. in a crowded room, the concentration of volatile organic substances generated by human skin oil is high, as studied by xiong et al. turning to sanitary facilities, it can be observed from simulations that particles spread during flushing because of the turbulence generated by the high-speed flushing water; moreover, it was found that 40%-60% of particles can reach above the toilet seat. this research is meaningful, and it indicated that laying down the lid before flushing is useful for preventing virus transmission; moreover, washing the toilet seat is necessary, because the floating virus may deposit on its surface. this research group also analysed the movement of virus-laden particles in the process of urinal flushing [122]: without prevention, over 57% of particles can escape from the urinal, and particles can reach a height of 0.84 m in only 5.5 s, so it is mandatory to wear a mask in public to reduce infection risk. extending this study to virus transmission in squat toilets by applying the method proposed in this work is important because, in many places such as china, squat toilets are used more than sitting toilets in public.
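the power-law inlet profile quoted above is a one-liner; the reference speed and height below are illustrative values, not from the cited study:

```python
def wind_profile(u_ref, h_ref, y, exponent=0.22):
    """Power-law atmospheric boundary-layer inlet profile,
    u(y) = u_ref * (y / h_ref)**exponent, with the 0.22 exponent
    quoted from the ASHRAE-type equation in the text."""
    return u_ref * (y / h_ref) ** exponent

u10 = wind_profile(u_ref=5.0, h_ref=10.0, y=10.0)   # speed at reference height
u2 = wind_profile(5.0, 10.0, 2.0)                    # slower near the ground
```

evaluating the profile across the inlet face is how a cfd user imposes a realistic wind boundary condition instead of a uniform velocity.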
some investigations of cfd simulations of ventilation or pathogen transmission in the building environment are summarized in tab. 6. from these investigations it can be concluded that: (1) ventilation is one of the most effective methods of controlling pathogen transmission, and a reasonable arrangement of the ventilation system is necessary; (2) the stack effect should be considered when analysing ventilation in high-rise buildings; (3) rooms with infected patients need to be diluted with plenty of fresh air. traffic vehicles are also dangerous when infectious patients are present: with high occupant density and a weak ventilation system, it is difficult to control pathogens such as airborne viruses. in response to this problem, more and more researchers have investigated airflow in various kinds of vehicles using cfd methods. in lbm, three assumptions are common: (1) mainly two-particle collisions are considered; (2) the velocity distribution of each particle exists independently; (3) external forces do not affect the dynamic behaviour of the local collision. various models can be used in lbm simulation, defined by the layout of the lattice; some models in common use are shown in fig. 17 (2d). face masks have been used to prevent virus transmission and are necessary in an epidemic situation. li [155] simulated the aerodynamic behaviour of a gas mask consisting of two filter layers (fig. 19, grid model of the gas mask (left) and the flow field of the simulation (right) [155]). this research indicated that design features of the mask, such as the hole properties, are important: a larger hole area and a wider hole distribution lead to a lower pressure drop, a smaller dead zone, and so on. this work is mainly a theoretical analysis, and it also provides a reference for designing an effective mask.
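to illustrate the mesoscale viewpoint, a minimal d2q9 bgk collision-and-streaming step can be sketched as follows; this is a generic textbook sketch on a periodic lattice (not code from the cited studies), initialized at a uniform resting equilibrium:

```python
import numpy as np

# D2Q9 lattice weights and discrete velocities.
w = np.array([4/9] + [1/9]*4 + [1/36]*4)
e = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])

def equilibrium(rho, u):
    """Second-order Maxwell-Boltzmann equilibrium distribution."""
    eu = np.einsum('kd,xyd->xyk', e, u)
    uu = np.einsum('xyd,xyd->xy', u, u)
    return rho[..., None] * w * (1 + 3*eu + 4.5*eu**2 - 1.5*uu[..., None])

def lbm_step(f, tau=0.8):
    """One BGK collision followed by periodic streaming."""
    rho = f.sum(-1)
    u = np.einsum('xyk,kd->xyd', f, e) / rho[..., None]
    f = f + (equilibrium(rho, u) - f) / tau          # BGK relaxation
    for k in range(9):                               # stream along e_k
        f[..., k] = np.roll(f[..., k], e[k], axis=(0, 1))
    return f

f = equilibrium(np.ones((16, 16)), np.zeros((16, 16, 2)))
for _ in range(10):
    f = lbm_step(f)
```

density and momentum are recovered as moments of f, and a uniform resting fluid is a fixed point of the scheme; obstacles, inlets and particle sources are what turn this skeleton into the pathogen-transport simulations discussed in the review.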
dbouk et al. [156] analysed the role of the mask in preventing droplet transmission by utilizing openfoam with a combination of a turbulence model and a porous model; in the simulation model, the fitting of the mask to the face was considered, as shown in fig. 20. turbulent flow in the atmosphere is unsteady because of the changing weather, and it is difficult to measure airborne transmission in the atmospheric environment directly. although aerodynamic models of airborne transmission based on the cfd method have been greatly developed, applying them to the large-scale environment is still a big challenge. seo et al. [159] presented a method based on meteorological information from a web system that can help solve this problem. this research group analysed the relationship between the spread of foot-and-mouth disease (fmd) and hourly wind in anseong. moreover, they collected the infection data and built a model using the gis method. then, they used code-division multiple access (cdma) to send the weather data to a weather data acquisition server (wdas) every 10 minutes and interlocked the data with the geographical information. the openfoam code was used to simulate the spread of the airborne virus, and the simulation results describe the virus transmission well. the process of the cfd simulation based on the web-based forecasting system is shown in fig. 22 (detailed process of cfd simulation based on a web-based forecasting system). web-based forecasting systems have been widely used in various cases such as flooding (li et al. [160]), tourism demand (song et al. [161]) and monitoring of marine pollution (kulawiak et al. [162]); however, there are few studies of pathogen transmission that combine the web-based forecasting system and the cfd method.
hence, more databases of pathogen transmission and meteorological information are needed to develop web-based forecasting systems for the analysis of pathogen transmission. from the investigations summarized in this review, it can be found that ventilation is one of the most effective methods to control pathogen transmission in the air. different environments require different ventilation systems: in building environments such as teaching buildings and residential buildings, natural ventilation is the main way to dilute the concentration of the pathogen. however, in high-risk zones such as hospitals, not only is reasonable indoor ventilation required, but the infectious risk due to emission also needs to be considered. besides, pathogen transmission differs among vehicles, so a proper ventilation strategy is necessary for transportation, especially airplanes and high-speed trains with enclosed environments. this review also presented some advanced methods for cfd application to pathogen transmission according to recent investigations: (1) lbm simulation allows researchers to investigate pathogen transmission at the mesoscale level; (2) based on the porous media model, researchers can better analyze the transport of pathogens in complex media, such as medical masks, human organs, etc.; (3) a web-based forecasting system can be combined with the cfd method to analyze the transmission of infectious pathogens in the atmospheric environment and predict the cross-regional transmission of pathogens.
covid-19): dashboard data last updated: 2020/8/10, 3:06pm cest phagocytic cells contribute to the antibody-mediated elimination of pulmonary-infected sars coronavirus immunobiology of ebola and lassa virus infections clonal vaccinia virus grown in cell culture as a new smallpox vaccine limited airborne transmission of h7n9 influenza a virus between ferrets violent expiratory events: on coughing and sneezing sneezing and asymptomatic virus transmission on coughing and airborne droplet transmission to humans likelihood of survival of coronavirus in a respiratory droplet deposited on a solid surface computational fluid dynamics computational fluid dynamics fundamentals of computational fluid dynamics cfd analysis of the flow around the x-31 aircraft at high angle of attack numerical simulation of turbulent flow around helicopter ducted tail rotor cfd fire simulation of the swissair flight 111 in-flight fire -part 1: prediction of the pre-fire air flow within the cockpit and surrounding areas numerical analysis of particle erosion in the rectifying plate system during shale gas extraction analysis of particle deposition in a new-type rectifying plate system during shale gas extraction formation mechanism of trailing oil in product oil pipeline flow field and noise characteristics of manifold in natural gas transportation station. oil & gas science and technology-revue d'ifp energies nouvelles the chemistry of the lipids of tubercle bacilli lxxi. 
the determination of terminal methyl groups in branched chain fatty acids study of a fogging system using a computational fluid dynamics simulation multiphase simulation of lng vapour dispersion with effect of fog formation investigating the performance of thermonebulisation fungicide fogging system for loaded fruit storage room using cfd model a time-dependent eulerian model of droplet diffusion in turbulent flow computers & fluids:s0045793016300469 cross diffusion effects in the interfacial mass and heat transfer of multicomponent droplets multi-component droplet heating and evaporation: numerical simulation versus experimental data modeling aerosol number distributions from a vehicle exhaust with an aerosol cfd model development of metamodels for predicting aerosol dispersion in ventilated spaces rectangular slit atmospheric pressure aerodynamic liens aerosol concentrator mechanisms of dust diffuse pollution under forced-exhaust ventilation in fully-mechanized excavation faces by cfd-dem diffusion and pollution of multi-source dusts in a fully mechanized coal face assessment of a cfd model for short-range plume dispersion: applications to the fusion field trial 2007 (fft-07) diffusion experiment cfd simulation of nozzle characteristics in a gas aggregation cluster source influence of gravity and lift on particle velocity statistics and transfer rates in turbulent vertical channel flow simulations of pollutant dispersion within idealised urban-type geometries with cfd and integral models the impacts of roadside vegetation barriers on the dispersion of gaseous traffic pollution in urban street canyons cfd simulation of air-steam flow with condensation cfd modeling of a headbox with injecting dilution water in a central step diffusion tube a microfluidics-based on-chip impinger for airborne particle collection can a toilet promote virus transmission? 
from a fluid dynamics perspective vof-dem simulation of single bubble behavior in gas-liquid-solid mini-fluidized bed on respiratory droplets and face masks aerodynamic behavior of a gas mask canister containing two porous media flow and contaminant transport in an airliner cabin induced by a moving body: model experiments and cfd predictions transient cfd simulation of the respiration process and inter-person exposure assessment developments in computational fluid dynamics-based modeling for disinfection technologies over the last two decades: a review unsteady flow in open channels blood flow in arteries hemodynamics in a cerebral artery before and after the formation of an aneurysm cfd mesh generation for biological flows: geometry reconstruction using diagnostic images multi-zone dispersal analysis by element assembly 38. a multi-zone indoor air quality and ventilation analysis software tool multi-zone modeling of probable sars virus transmission by airflow between flats in block e, amoy gardens investigating a safe ventilation rate for the prevention of indoor sars transmission: an attempt based on a simulation approach multi-zone simulation of outdoor particle penetration and transport in a multi-story building significance of two-way airflow effect due to temperature difference in indoor air quality the transport of gaseous pollutants due to stack effect in high-rise residential buildings air infiltration induced inter-unit dispersion and infectious risk assessment in a high-rise residential building principles and applications of probability-based inverse modeling method for finding indoor airborne contaminant sources experimental and numerical study on particle distribution in a two-zone chamber model-based optimal control of a dedicated outdoor air-chilled ceiling system using liquid desiccant and membrane-based total heat recovery identifying index (source), patient location of sars transmission in a hospital ward airborne contagion and air hygiene: an 
ecological study of droplet infections an advanced numerical model for the assessment of airborne transmission of influenza in bus microenvironments evaluation of airborne disease infection risks in an airliner cabin using the lagrangian-based wells-riley approach cfd simulation of spread risks of infectious disease due to interactive wind and ventilation airflows via window openings in high-rise buildings modelling the transmission of airborne infections in enclosed spaces infection risk of indoor airborne transmission of diseases in multiple spaces preventing airborne disease transmission: review of methods for ventilation design in health care facilities risk assessment of airborne infectious diseases in aircraft cabins risk of indoor airborne infection transmission estimated from carbon dioxide concentration a probabilistic transmission dynamic model to assess indoor airborne infection risks. risk analysis : an official publication of the society for risk analysis perspectives on the basic reproductive ratio role of air distribution in sars transmission during the largest nosocomial outbreak in hong kong influenza virus in human exhaled breath: an observational study association of classroom ventilation with reduced illness absence: a prospective study in california elementary schools quantifying the routes of transmission for pandemic influenza a contribution to the mathematical theory of epidemics infectious diseases of humans:dynamics and control modeling the role of respiratory droplets in covid-19 type pandemics seasonal dynamics of recurrent epidemics stability analysis and optimal vaccination of an sir epidemic model stability analysis and optimal control of an sir epidemic model with vaccination global stability of a delayed sirs epidemic model with saturation incidence and temporary immunity epidemics of sirs model with nonuniform transmission on scale-free networks phase transitions in some epidemic models defined on small-world networks global 
dynamics of a seir model with varying total population size tracking epidemics with google flu trends data and a state-space seir model statistical inference in a stochastic epidemic seir model with control intervention: ebola as a case study cfd modelling of pathogen transport due to human activity integrated approach of cfd and sir epidemiological model for infectious transmission analysis in hospital numerical and experimental analysis of airborne particles control in an operating theater numerical and experimental study on airborne disinfection by negative ions in air duct flow droplet distribution and airborne bacteria in an experimental shower unit experimental studies on the classification of airborne particles based on their optical properties predicting airborne particle deposition by a modified markov chain model for fast estimation of potential contaminant spread experimental and cfd study of unsteady airborne pollutant transport within an aircraft cabin mock-up an assessment of the airborne route in hepatitis b transmission the airborne transmission of infection between flats in high-rise residential buildings: tracer gas simulation short-time airing by single-sided natural ventilation-part 1: measurement of transient air flow rates experimental analysis of driving forces and impact factors of horizontal inter-unit airborne dispersion in a residential building airborne transmission of exhaled droplet nuclei between occupants in a room with horizontal air distribution influence of human breathing modes on airborne cross infection risk person to person droplets transmission characteristics in unidirectional ventilated protective isolation room: the impact of initial droplet size airborne transmission and precautions: facts and myths experimental and numerical investigation of the wake flow of a human-shaped manikin: experiments by piv and simulations by cfd a tracing method of airborne bacteria transmission across built environments virus diffusion in 
isolation rooms the influence of ward ventilation on hospital cross infection by varying the location of supply and exhaust air diffuser using cfd the predictions of infection risk of indoor airborne transmission of diseases in high-rise hospitals: tracer gas simulation computational fluid dynamics simulation of air exhaust dispersion from negative isolation wards of hospitals an air distribution optimization of hospital wards for minimizing cross-infection ventilation effect on different position of classrooms in "line" type teaching building american society of heating, refrigerating and air conditioning engineers, fundamentals (si) sustainable ventilation strategies in buildings: cfd research. sustainable energy technologies and assessments modeling the time-dependent concentrations of primary and secondary reaction products of ozone with squalene in a university classroom prolonged presence of sars-cov-2 viral rna in faecal samples can a toilet promote virus transmission? from a fluid dynamics perspective virus transmission from urinals investigation on the contaminant distribution with improved ventilation system in hospital isolation rooms: effect of supply and exhaust air diffuser configurations bioaerosol deposition in single and two-bed hospital rooms: a numerical and experimental study cfd study on the transmission of indoor pollutants under personalized ventilation possible role of aerosol transmission in a hospital outbreak of influenza spatial distribution of infection risk of sars transmission in a hospital ward modelling the performance of upper room ultraviolet germicidal irradiation devices in ventilated rooms: comparison of analytical and cfd methods numerical investigation of airborne infection in naturally ventilated hospital wards with central-corridor type. 
indoor and built environment novel air distribution systems for commercial aircraft cabins identification of contaminant sources in enclosed environments by inverse cfd modeling identification of contaminant sources in enclosed spaces by a single sensor identify contaminant sources in airliner cabins by inverse modeling of cfd with information from a sensor current studies on air distributions in commercial airliner cabins identification of pollution sources in urban areas using reverse simulation with reversed time marching method inverse modeling of indoor instantaneous airborne contaminant source location with adjoint probability-based method under dynamic airflow field the method of quasi-reversibility : applications to partial differential equations experimental and numerical investigation of micro-environmental conditions in public transportation buses effects of the window openings on the micro-environmental condition in a school bus performance evaluation of air distribution systems in three different china railway high-speed train cabins using numerical simulation the risk of airborne influenza transmission in passenger cars lattice boltzmann method for fluid flows lattice bgk models for navier-stokes equation the lattice boltzmann equation: for fluid dynamics and beyond. 
the lattice boltzmann equation for fluid dynamics and beyond pore-scale modelling of dynamic interaction between svocs and airborne particles with lattice boltzmann method lattice boltzmann method simulation of svoc mass transfer with particle suspensions modeling of dynamic deposition and filtration processes of airborne particles by a single fiber with a coupled lattice boltzmann and discrete element method lattice boltzmann method and rans approach for simulation of turbulent flows and particle transport and deposition multi-block lattice boltzmann simulations of solute transport in shallow water flows large-scale oil spill simulation using the lattice boltzmann method, validation on the lebanon oil spill case lattice boltzmann simulations of anisotropic permeabilities in carbon paper gas diffusion layers drag correlation for dilute and moderately dense fluid-particle systems using the lattice boltzmann method lattice boltzmann simulation of liquid water transport in microporous and gas diffusion layers of polymer electrolyte membrane fuel cells an immersed boundary-lattice boltzmann model for simulation of malaria-infected red blood cell in micro-channel aerodynamic behavior of a gas mask canister containing two porous media on respiratory droplets and face masks 3-d numerical simulation of main sieve diaphragm with three types passageway design in a gas mask canister the role of porous media in modeling fluid flow within hollow fiber membranes of the total artificial lung web-based forecasting system for the airborne spread of livestock infectious disease using computational fluid dynamics a web-based flood forecasting system for shuangpai region developing a web-based tourism demand forecasting system interactive visualization of marine pollution monitoring and forecasting data via a web-based gis the authors are grateful for the research support received from applied basic the authors declared that they have no conflicts of interest to this work as: we 
declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.

key: cord-326831-dvg0isgt authors: muhammad, l. j.; islam, md. milon; usman, sani sharif; ayon, safial islam title: predictive data mining models for novel coronavirus (covid-19) infected patients' recovery date: 2020-06-21 journal: sn comput doi: 10.1007/s42979-020-00216-w sha: doc_id: 326831 cord_uid: dvg0isgt the novel coronavirus (covid-19 or 2019-ncov) pandemic has neither a clinically proven vaccine nor drugs; however, its patients are recovering with the aid of antibiotic medications, anti-viral drugs, and chloroquine as well as vitamin c supplementation. it is now evident that the world needs a speedy and quick solution to contain and tackle the further spread of covid-19 across the world with the aid of non-clinical approaches such as data mining, augmented intelligence and other artificial intelligence techniques, so as to mitigate the huge burden on the healthcare system while providing the best possible means for patients' diagnosis and prognosis of the 2019-ncov pandemic effectively. in this study, data mining models were developed for the prediction of covid-19 infected patients' recovery using an epidemiological dataset of covid-19 patients of south korea. the decision tree, support vector machine, naive bayes, logistic regression, random forest, and k-nearest neighbor algorithms were applied directly on the dataset using the python programming language to develop the models. the models predicted a minimum and maximum number of days for covid-19 patients to recover from the virus, the age group of patients who are at high risk of not recovering from the covid-19 pandemic, those who are likely to recover, and those who might be likely to recover quickly from the covid-19 pandemic.
the results of the present study have shown that the model developed with the decision tree data mining algorithm is the most efficient for predicting the possibility of recovery of infected patients from the covid-19 pandemic, with an overall accuracy of 99.85%, which stands as the best among the models developed with the other algorithms, including support vector machine, naive bayes, logistic regression, random forest, and k-nearest neighbor. severe acute respiratory syndrome coronavirus 2 (sars-cov-2), the causative agent of novel coronavirus disease (covid-19 or 2019-ncov), emerged in late 2019 and is believed to have originated in wuhan, hubei province, china [16, 25]. 2019-ncov (covid-19) is rapidly spreading in humans; it is believed to have first derived from bats and to have been transmitted to humans through intermediate hosts, probably the raccoon dog (nyctereutes procyonoides) and palm civet (paguma larvata) [8, 18, 21]. the major symptoms of sars-cov-2 infection include fever, cough, and shortness of breath, which in many instances appear similar to those of flu [16]. covid-19 has since reached a decisive point and pandemic potential, claiming the lives of many people across the world, and human-to-human transmission of covid-19 from infected individuals with mild symptoms has been reported [16, 20]. according to worldometers, the covid-19 pandemic affects 210 countries and territories around the world and two international conveyances, with 6,033,875 confirmed cases, 2,661,213 recovered cases, and 366,894 deaths as of may 30th, 2020, 05:37 gmt [27]. however, there is no drug or vaccine clinically proven to treat the covid-19 pandemic; therefore, other non-clinical or non-medical therapeutic techniques are urgently needed to contain and prevent further outbreak of the covid-19 pandemic, such as data mining techniques, machine learning, and expert systems, among other artificial intelligence techniques.
data mining (dm) is an advanced artificial intelligence (ai) technique that is used for discovering novel, useful, and valid hidden patterns or knowledge from a dataset [6, 14]. the technique reveals relationships and knowledge or patterns within a single dataset or across several datasets [15, 16]. it has also been widely used for the prognosis and diagnosis of many diseases, including severe acute respiratory syndrome coronavirus (sars-cov) and middle east respiratory syndrome coronavirus (mers-cov), which were discovered in 2003 and 2012, respectively [16]. the huge datasets related to the 2019-ncov pandemic generated around the world every day are a treasured resource to be mined and analyzed for useful, valid, and novel knowledge or patterns, supporting better decision-making to contain the outbreak of the covid-19 pandemic. in the healthcare sector, data mining has been widely applied in many different applications such as predicting patient outcomes, modeling health outcomes, hospital ranking, evaluation of treatment effectiveness, and infection control, stability, and recovery [1, 23, 29]. in this study, we developed several data mining models for the prediction of 2019-ncov-infected patients' recovery. the models predict when covid-19 infected patients will recover and be released from isolation centers, as well as which patients may likely not recover and lose their lives to the covid-19 pandemic. the models help health workers to determine the recovery and stability of persons newly infected with pandemic covid-19. the models are developed with the dataset obtained from the korea centers for disease control and prevention (kcdc), and dataset instances of the death and recovery records of patients infected in the 2019-ncov pandemic were considered.
data mining algorithms, which include decision tree, support vector machine, naive bayes, logistic regression, random forest, and k-nearest neighbor, were applied directly on the dataset using the python programming language to develop the models. the rest of the paper is organized as follows. section 2 describes the overall methodology of the proposed system, including data collection and preparation with data mining techniques. the experimental results with detailed discussions are described in sect. 3. lastly, sect. 4 concludes the paper. the dataset was obtained from the kcdc and was made available on the kaggle website [3]. we used the epidemiological dataset of covid-19 patients of south korea. the dataset has 3254 instances with attributes which include patient id, global number (the number given by the kcdc), sex, birth year, age, country, province, city, disease (true: underlying disease / false: no disease), infection case, infection order (the order of infection), infected by (the id of who infected the patient), contact number (the number of contacts with people), symptom onset date (the date of symptom onset), confirmed date (the date of being confirmed), released date (the date of being released), deceased date (the date of being deceased), and state (state of the patient: isolated/released/deceased). the dataset was prepared and cleaned, and only the relevant attributes were extracted from the original dataset. the extracted dataset has only 1505 data instances with 5 attributes, which include gender, age, infection_case, no_days (the number of days between the date the disease was confirmed and the date the patient was released or died), and the state of the patient (released or deceased). we considered only two states of the patient, released and deceased, while the isolation state was excluded. table 1 shows the data types of the attributes and table 2 shows a sample of some instances of the dataset.
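the reduction from the raw kcdc records to the five attributes described above can be sketched in plain python. this is a hypothetical illustration, not the authors' code; the field names of the raw record are assumed for the example.

```python
from datetime import date

def days_to_outcome(confirmed_date, outcome_date):
    """number of days between confirmation and release/death (the no_days attribute)."""
    return (outcome_date - confirmed_date).days

def prepare_record(record):
    """reduce a raw kcdc-style record to the five attributes used in the study.

    field names are hypothetical for illustration; only the 'released' and
    'deceased' states are kept, mirroring the paper's exclusion of isolated
    patients.
    """
    if record["state"] not in ("released", "deceased"):
        return None  # isolated patients are excluded from the extracted dataset
    outcome = (record["released_date"] if record["state"] == "released"
               else record["deceased_date"])
    return {
        "gender": record["sex"],
        "age": record["age"],
        "infection_case": record["infection_case"],
        "no_days": days_to_outcome(record["confirmed_date"], outcome),
        "state": record["state"],
    }
```

a mapping of this function over the 3254 raw instances, dropping the `None` results, would yield the 1505-row extracted dataset the study describes.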
the missing values in the dataset reduce the prediction power and produce biased estimates, leading to invalid conclusions [28]. therefore, we used the last-observation-carried-forward imputation technique to handle the missing values in the dataset. figures 1, 2, 3, 4, and 5 show the frequency of each attribute of the dataset. logistic regression (lr) is used to determine the association between a categorical dependent variable and the independent variables [9]. lr is used when the dependent variable has two values, such as 0 and 1, yes and no, or true and false, and is thus called binary logistic regression [22]; when the dependent variable has more than two values, it is called multinomial logistic regression. support vector machine (svm) is one of the supervised learning algorithms used for classification and regression [24]. the classification task in svm involves testing and training data which contain instances of the dataset [10]. each instance in the training dataset contains one or more target values; the main goal of svm is therefore to produce a model that will predict the target value or values [24]. for regression, svm is applied by introducing an alternative loss function, which can be linear or nonlinear [10]. decision tree (dt) is used for classification tasks in data mining and is a successful technique due to its ability to handle both categorical and continuous data, its simplicity, and its comprehensibility. dt builds the tree in two phases, a growth phase and a pruning phase [14, 15, 26]. in the first phase, a tree is built by partitioning the data into smaller sets until each partition is pure; the split type depends on the data type [19]. splits for a numerical attribute c take the form value(c) ≤ y, where y is a value in the domain of c. splits for a categorical attribute d take the form value(d) ∈ g, where g is a subset of domain(d) [7].
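the two split forms just described can be written as small predicate functions (an illustrative sketch; the attribute values in the example are hypothetical):

```python
def numeric_split(y):
    """split predicate for a numerical attribute c: value(c) <= y."""
    return lambda value: value <= y

def categorical_split(g):
    """split predicate for a categorical attribute d: value(d) in subset g."""
    members = frozenset(g)
    return lambda value: value in members

# example: partition patients by age, or by membership in a set of infection cases
is_younger = numeric_split(64)
is_known_route = categorical_split({"contact with patient", "overseas inflow"})
```

each internal node of the grown tree holds one such predicate, sending instances for which it is true to one child and the rest to the other.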
to remove noise in the dataset, a pruning technique is used to obtain the final tree once it is fully grown [11]. the growth phase is, however, computationally more expensive than the pruning phase of the decision tree [6]. naive bayes (nb) is a data mining classification algorithm used to discriminate dataset instances based on specified features or attributes [13]. nb is a probabilistic classifier and uses bayes' theorem for classification tasks [5]. bayes' theorem is:

p(a|b) = p(b|a) p(a) / p(b)    (2)

random forest (rf) is an ensemble learning technique for data mining classification and regression tasks. the algorithm constructs a multitude of decision trees at training time and aggregates their outputs [12]. the rf data mining algorithm is well suited for cases where a single decision tree overfits its training dataset [13]. k-nearest neighbor (k-nn) is a non-parametric and supervised data mining classifier used for regression and classification tasks [2]. in both tasks, the input consists of the k closest training examples in the feature space. k-nn relies on labeled input data to learn a function that produces an appropriate output when given unlabeled data [17]. in k-nn classification, the output is a class membership: a data instance is classified by a plurality vote of its neighbors, with the instance being assigned to the class most common among its k nearest neighbors. in k-nn regression, the output is the property value of the data instance, computed as the average of the values of its k nearest neighbors [4]. the python programming language was used for the data mining predictive tasks. python is a well-known general-purpose and dynamic programming language that is used in different fields such as data mining [30], machine learning [31, 32], and the internet of things [33, 34]. data mining algorithms are implemented in python with the help of special-purpose libraries.
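the k-nn voting rule described above can be illustrated with a minimal from-scratch classifier. this is a sketch for intuition only, not the library implementation a study like this would use in practice; the example points and labels are invented.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    """classify `query` by a plurality vote among its k nearest training
    points, using euclidean distance in the feature space."""
    ranked = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

for k-nn regression the same neighborhood would be used, but returning the mean of the neighbors' target values instead of a vote.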
the models were developed using fivefold cross-validation. in the decision tree model shown in fig. 6, the number-of-days attribute appears as the first splitting attribute, which indicates that it is the most important attribute. the model predicted a minimum of 5 days and a maximum of 35 days as the number of days for covid-19 patients to recover from the pandemic virus. the model also shows that another important attribute for predicting the recovery of covid-19 patients is the age attribute. patients aged between 65 and 85 years are at high risk of not recovering from the covid-19 pandemic, patients aged between 26 and 64 years are likely to recover, while patients aged between 1 and 24 years recover quickly from the covid-19 pandemic. from the model, we conclude that old patients are at high risk of developing covid-19 complications which may result in death. data mining models are evaluated using evaluation techniques to determine their accuracy [14]. these techniques determine the quality and efficiency of the model built with the data mining or machine learning algorithms. the main performance evaluation measures for a data mining model include specificity, sensitivity, and accuracy; however, in this study, only accuracy is considered to evaluate the developed models. accuracy is the percentage of dataset instances correctly classified by the model developed with the data mining algorithm, expressed as:

accuracy = (tp + tn) / (tp + tn + fp + fn)

where tp is the true positives, tn is the true negatives, fp is the false positives, and fn is the false negatives. the model developed with dt was the most efficient, with the highest accuracy of 99.85%; table 3 shows the results of the performance evaluation of the developed models. the data mining algorithms, which include dt, svm, nb, lr, rf, and k-nn, were applied directly on the dataset using the python programming language.
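the accuracy measure above can be computed directly from predicted and true labels. a minimal sketch, with the positive class chosen as "released" purely for illustration:

```python
def confusion_counts(y_true, y_pred, positive="released"):
    """tp, tn, fp, fn counts for a binary outcome; the positive class
    ('released' here) is an assumption made for this illustration."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """fraction of dataset instances classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)
```

with fivefold cross-validation, the reported figure would be this accuracy averaged (or pooled) over the five held-out folds.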
however, the model developed with the dt algorithm was found to be the most accurate, with 99.85% accuracy, which is the highest among the others, as shown in fig. 7. the model predicted a minimum and maximum number of days for covid-19 patients to recover from the virus. the model also predicted the age group of patients who are at high risk of not recovering from the covid-19 pandemic, those who are likely to recover, and those who might be likely to recover quickly from the covid-19 pandemic. from the performance evaluation results of the models, the model developed with the dt data mining algorithm is efficiently capable of predicting the possibility of recovery of infected patients from the covid-19 pandemic, with an overall accuracy of 99.85%, compared with rf, svm, k-nn, nb, and lr with overall accuracies of 99.60%, 98.85%, 98.06%, 97.52%, and 97.49%, respectively. in the present study, data mining models were developed for the prediction of covid-19 infected patients' recovery using an epidemiological dataset of covid-19 patients of south korea. the dt, svm, nb, lr, rf, and k-nn algorithms were applied directly on the dataset using the python programming language. the model developed with dt was found to be the most efficient, with the highest accuracy of 99.85%, followed by rf with 99.60% accuracy, svm with 98.85%, k-nn with 98.06%, nb with 97.52%, and lr with 97.49%. the developed models would be very helpful in healthcare for the combat against covid-19. funding: no funding sources. building predictive models for mers-cov infections using data mining techniques an introduction to kernel and nearest-neighbor nonparametric regression (pdf) coronavirus dataset of korea centers for disease control & prevention (kcdc) miscellaneous clustering methods in cluster analysis naive bayes classifier, towards data science information and communication technology for intelligent systems.
smart innovation, systems and technologies decision tree discovery. handbook of data mining and knowledge discovery a machine learning-based model for survival prediction in patients with severe covid-19 infection medrxiv logistic regression. ncss a geometric approach to support vector machine (svm) classification sliq: a fast scalable classifier for data mining performance evaluation of random forests and artificial neural networks for the classification of liver disorder performance evaluation of classification data mining algorithms on coronary artery disease dataset performance evaluation of classification data mining algorithms on coronary artery disease dataset using decision tree data mining algorithm to predict causes of road traffic accidents, its prone locations and time along kano-wudil highway power of artificial intelligence to diagnose and prevent further covid-19 outbreak: a short communication machine learning basics with the k-nearest neighbors algorithm, towards data science mandell, douglas, and bennett's principles and practice of infectious diseases improved c4.5 algorithm for the analysis of sales transmission of 2019-ncov infection from an asymptomatic contact in germany structural basis of receptor recognition by sars-cov-2 coronary artery heart disease prediction: a comparative study of computational intelligence techniques developing iot based smart health monitoring systems: a review. 
this article is part of the topical collection "advances in computational approaches for artificial intelligence, image processing, iot and cloud applications" guest edited by bhanu prakash k n and m. shivakumar. conflict of interest: the authors have declared that no conflict of interest exists. key: cord-340564-3fu914lk authors: cohen, joseph paul; dao, lan; roth, karsten; morrison, paul; bengio, yoshua; abbasi, almas f; shen, beiyi; mahsa, hoshmand kochi; ghassemi, marzyeh; li, haifang; duong, tim title: predicting covid-19 pneumonia severity on chest x-ray with deep learning date: 2020-07-28 journal: cureus doi: 10.7759/cureus.9448 sha: doc_id: 340564 cord_uid: 3fu914lk introduction the need to streamline patient management for coronavirus disease-19 (covid-19) has become more pressing than ever. chest x-rays (cxrs) provide a non-invasive (potentially bedside) tool to monitor the progression of the disease. in this study, we present a severity score prediction model for covid-19 pneumonia for frontal chest x-ray images. such a tool can gauge the severity of covid-19 lung infections (and pneumonia in general), which can be used for escalation or de-escalation of care as well as for monitoring treatment efficacy, especially in the icu.
methods images from a public covid-19 database were scored retrospectively by three blinded experts in terms of the extent of lung involvement as well as the degree of opacity. a neural network model that was pre-trained on large (non-covid-19) chest x-ray datasets is used to construct features for covid-19 images which are predictive for our task. results this study finds that training a regression model on a subset of the outputs from this pre-trained chest x-ray model predicts our geographic extent score (range 0-8) with 1.14 mean absolute error (mae) and our lung opacity score (range 0-6) with 0.78 mae. conclusions these results indicate that our model’s ability to gauge the severity of covid-19 lung infections could be used for escalation or de-escalation of care as well as monitoring treatment efficacy, especially in the icu. to enable follow up work, we make our code, labels, and data available online. as the first countries explore deconfinement strategies [1] , the death toll of coronavirus disease-19 (covid-19) keeps rising [2] . the increased strain caused by the pandemic on healthcare systems worldwide has prompted many physicians to resort to new strategies and technologies. chest x-rays (cxrs) provide a non-invasive (potentially bedside) tool to monitor the progression of the disease [3, 4] . as early as march 2020, chinese hospitals used artificial intelligence (ai)-assisted computed tomography (ct) imaging analysis to screen covid-19 cases and streamline diagnosis [5] . many teams have since launched ai initiatives to improve triaging of covid-19 patients (i.e., discharge, general admission, or icu care) and allocation of hospital resources (i.e., direct non-invasive ventilation to invasive ventilation) [6] . while these recent tools exploit clinical data, practically deployable cxr-based predictive models remain lacking.
in this work, we built and studied a model which predicts the severity of covid-19 pneumonia, based on cxrs, to be used as an assistive tool when managing patient care. the ability to gauge the severity of covid-19 lung infections can be used for escalation or de-escalation of care, especially in the icu. an automated tool can be applied to patients over time to objectively and quantitatively track disease progression and treatment response. we used a retrospective cohort of 94 posteroanterior (pa) cxr images from a public covid-19 image data collection [7] . while the dataset currently contains 153 images, it only counted 94 images at the time of the experiment, all of which were included in the study. all patients were reported as covid-19 positive by their physicians (most using rt-pcr) and sourced from many hospitals around the world from december 2019 to march 2020. the images were de-identified prior to our use and there was no missing data. the ratio between male/female was 44/36 with an average age of 56±14.8 (55±15.6 for male and 57±13.9 for female). radiological scoring was performed by three blinded experts: two chest radiologists (each with at least 20 years of experience) and a radiology resident. they staged disease severity using a score system [8] , based on two types of scores (parameters): extent of lung involvement and degree of opacity. they were only presented with a single cxr image at a time without any clinical context of the patient. 1. the extent of lung involvement by ground glass opacity or consolidation for each lung (right lung and left lung separately) was scored as: 0 = no involvement; 1 = <25% involvement; 2 = 25%-50% involvement; 3 = 50%-75% involvement; 4 = >75% involvement. the total extent score ranged from 0 to 8 (right lung and left lung together). 2. the degree of opacity for each lung (right lung and left lung separately) was scored as: 0 = no opacity; 1 = ground glass opacity; 2 = consolidation; 3 = white-out. 
the total opacity score ranged from 0 to 6 (right lung and left lung together). a spreadsheet was maintained to pair filenames with their respective scores. fleiss' kappa for inter-rater agreement was 0.45 for the opacity score and 0.71 for the extent score. the pre-training data came from seven public datasets:
- radiological society of north america (rsna) pneumonia detection challenge [9];
- chexpert dataset from stanford university [10];
- chestx-ray8 dataset from the national institute of health (nih) [11];
- chestx-ray8 dataset from the nih with labels from google [12];
- mimic-cxr dataset from massachusetts institute of technology (mit) [13];
- padchest dataset from the university of alicante [14];
- openi [15].
these seven datasets were manually aligned to each other on 18 common radiological finding tasks in order to train a model using all datasets at once (atelectasis, consolidation, infiltration, pneumothorax, edema, emphysema, fibrosis, effusion, pneumonia, pleural thickening, cardiomegaly, nodule, mass, hernia, lung lesion, fracture, lung opacity, and enlarged cardiomediastinum). for example, "pleural effusion" from one dataset is considered the same as "effusion" from another dataset in order to treat these labels as equal. in total, 88,079 non-covid-19 images were used to train the model on these tasks. in this study, we used a densenet model [16] from the torchxrayvision library [17, 18]. densenet models have been shown to predict pneumonia well [19]. images were resized to 224x224 pixels, utilizing a center crop if the aspect ratio was uneven, and the pixel values were scaled to (-1024, 1024) for the training. before even processing the covid-19 images, a pre-training step was performed using the seven datasets to train the feature extraction layers and a task prediction layer (figure 1).
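the preprocessing described above (center crop to a square, resize to 224x224, rescale pixel values to (-1024, 1024)) can be sketched with numpy. the nearest-neighbour resize below is a stand-in assumption for the proper image interpolation the real pipeline would use.

```python
import numpy as np

def preprocess(img, out_size=224):
    """center-crop a 2-d uint8 array to a square, resize to out_size x out_size
    (nearest neighbour, as a stand-in for proper interpolation), and rescale
    pixel values from [0, 255] to [-1024, 1024]."""
    h, w = img.shape
    side = min(h, w)
    top, left = (h - side) // 2, (w - side) // 2
    img = img[top:top + side, left:left + side]          # center crop
    idx = (np.arange(out_size) * side / out_size).astype(int)
    img = img[np.ix_(idx, idx)]                          # nearest-neighbour resize
    return (img.astype(np.float32) / 255.0) * 2048.0 - 1024.0

# example: a fake 300x400 radiograph
x = preprocess(np.random.randint(0, 256, size=(300, 400), dtype=np.uint8))
```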
this "pre-training" step was performed on a large set of data in order to construct general representations about lungs and other aspects of cxrs that we would have been unable to achieve on the small set of covid-19 images available. some of these representations are expected to be relevant to our downstream tasks. there are a few ways we can extract useful features from the pre-trained model as detailed in figure 1 . the two dataset blocks show that covid-19 images were not used to train the neural network. the network diagram is split into three sections. the feature extraction layers are convolutional layers which transform the image into a 1024 dimensional vector which is called the intermediate network features. these features are then transformed using the task prediction layer (a sigmoid function for each task) into the outputs for each task. the different groupings of outputs used in this work are shown. similarly to the images from non-covid-19 datasets used for pre-training, each image from the covid-19 dataset was preprocessed (resized, center cropped, rescaled), then processed by the feature extraction layers and the task prediction layer of the network. the network was trained on existing datasets before the weights were frozen. covid-19 images were processed by the network to generate features used in place of the images. as was the case with images from the seven non-covid-19 datasets, the feature extraction layers produced a representation of the 94 covid-19 images using a 1024 dimensional vector, then the fully connected task prediction layer produced outputs for each of the 18 original tasks. we build models on the pre-sigmoid outputs. 
linear regression was performed to predict the aforementioned scores (extent of lung involvement and opacity) using these different sets of features in place of the image itself: -intermediate network features -the result of the convolutional layers applied to the image resulting in a 1024 dimensional vector which is passed to the task prediction layer; -18 outputs -each image was represented by the 18 outputs (pre-sigmoid) from the pre-trained model; -four outputs -a hand picked subset of outputs (pre-sigmoid) were used containing radiological findings more frequent in pneumonia (lung opacity, pneumonia, infiltration, and consolidation); -lung opacity output -the single output (pre-sigmoid) for lung opacity was used because it was very related to this task. this feature was different from the opacity score that we would like to predict. for each experiment performed, the 94 images covid-19 dataset was randomly split into a train and test set roughly 50/50. multiple timepoints from the same patient were grouped together into the same split so that a patient did not span both sets. sampling was repeated throughout training in order to obtain a mean and standard deviation for each performance. as linear regression was used, there was no early stopping that had to be done to prevent the model from overfitting. therefore, the criterion for determining the final model was only the mean squared error (mse) on the training set. in order to ensure that the models are looking at reasonable aspects of the images [20] [21] [22] , a saliency map is computed by computing the gradient of the output prediction with respect to the input image (if a pixel is changed, how much will it change the prediction). in order to smooth out the saliency map, it is blurred using a 5x5 gaussian kernel. keep in mind that these saliency maps have limitations and only offer a restricted view into why a model made a prediction [22, 23] . 
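the evaluation protocol above — a roughly 50/50 split grouped by patient so no patient spans both sets, then linear regression from a single pre-sigmoid output to the severity score, reported as mae — can be sketched with numpy. the data below are synthetic stand-ins; the feature/score relationship is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# synthetic stand-ins: one pre-sigmoid "lung opacity" output per image,
# two images (timepoints) per patient, and a 0-8 geographic extent score
patient_id = np.repeat(np.arange(40), 2)
feature = rng.normal(size=patient_id.size)
score = np.clip(4 + 2 * feature + rng.normal(scale=0.5, size=feature.size), 0, 8)

# roughly 50/50 split on *patients*, so no patient spans both sets
patients = np.unique(patient_id)
rng.shuffle(patients)
train_pat = set(patients[: patients.size // 2])
train = np.array([p in train_pat for p in patient_id])

# ordinary least squares with an intercept: score ~ a * feature + b
A = np.stack([feature[train], np.ones(train.sum())], axis=1)
coef, *_ = np.linalg.lstsq(A, score[train], rcond=None)

# mean absolute error on the held-out patients
pred = coef[0] * feature[~train] + coef[1]
mae = np.mean(np.abs(pred - score[~train]))
print(f"test mae: {mae:.2f}")
```

only two parameters (slope and intercept) are fit, mirroring the paper's point that restricting model complexity limits overfitting on 94 images.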
the single "lung opacity" output as a feature yielded the best correlation (0.80), followed by the four-output set (lung opacity, pneumonia, infiltration, and consolidation) (0.79) (table 1). building a model on only a few outputs provides the best performance. the mean absolute error (mae) is useful to understand the error in units of the scores that are predicted, while the mse helps to rank the different methods based on their furthest outliers. one possible reason that fewer features work best is that having fewer parameters prevents overfitting. some features could serve as proxy variables for confounding attributes such as sex or age, and excluding these features prevents such confounding from hurting generalization performance. hand-selecting feature subsets which are intuitively related to this task imparts domain knowledge as a bias on the model, which improves performance. thus, the top-performing model (using the single "lung opacity" output as a feature) is used for the subsequent qualitative analysis. figure 2 shows the top-performing model's predictions against the ground truth score (given by the blinded experts) on held-out test data. the majority of the data points fall close to the line of unity. the model overestimates scores between 1 and 3 and underestimates scores above 4. however, the predictions generally seem reasonable given the agreement of the raters. evaluation is on a held-out test set. the grey dashed line is a perfect prediction. red lines indicate error from a perfect prediction. r^2: coefficient of determination. in figure 3, we explore what the representation used by one of the best models looks at in order to identify signs of overfitting and to gain insights into the variation of the data.
a t-distributed stochastic neighbor embedding (t-sne) [24] is computed on all data (even those not scored) in order to project the features into a two-dimensional (2d) space. each cxr is represented by a point in a space where relationships to other points are preserved from the higher-dimensional space. the cases of the survival group tend to cluster together, as do the cases of the deceased group. this clustering indicates that score predictions align with clinical outcomes. figure 3: a spatial representation of pneumonia-specific features (lung opacity, pneumonia, infiltration, and consolidation) when projected into two dimensions (2d) using a t-distributed stochastic neighbor embedding (t-sne). in this 2d space, the high-dimensional (4d) distances are preserved, specifically what is nearby. cxr images which have similar outputs are close to each other. features are extracted for all 208 images in the dataset and the geographic extent prediction is shown for each image. the survival information available in the dataset is represented by the shape of the marker. in figure 4, images are studied which were not seen by the model during training. for most of the results, the model is correctly looking at opaque regions of the lungs. figure 4b shows no signs of opacity and the model is focused on the heart and diaphragm, which is likely a sign that they are used as a color reference when determining what qualifies as opaque. in figure 4c and figure 4d, we see erroneous predictions. figure 4: examples of correct (a,b) and incorrect (c,d) predictions by the model, shown with a saliency map generated by computing the gradient of the output prediction with respect to the input image and then blurring with a 5x5 gaussian kernel. the assigned and predicted scores for geographic extent are shown to the right. in the context of a pandemic and the urgency to contain the crisis, research has increased exponentially in order to alleviate the healthcare system's burden.
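the saliency procedure described for figure 4 — the gradient of the scalar output with respect to each input pixel, blurred with a 5x5 gaussian kernel — can be sketched numerically. the real model uses autograd through a deep network; here a toy linear "severity" model and a finite-difference gradient stand in, and the blur is implemented directly in numpy.

```python
import numpy as np

def model(img, w):
    # toy scalar "severity" output: weighted sum of pixels (stand-in model)
    return float(np.sum(img * w))

def saliency(img, w, eps=1e-3):
    """finite-difference gradient of the model output w.r.t. each pixel
    (a numerical stand-in for autograd), blurred with a 5x5 gaussian kernel."""
    grad = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            bumped = img.copy()
            bumped[i, j] += eps
            grad[i, j] = (model(bumped, w) - model(img, w)) / eps
    # 5x5 gaussian kernel (sigma ~ 1), normalized to sum to 1
    ax = np.arange(-2, 3)
    g1 = np.exp(-ax**2 / 2.0)
    kernel = np.outer(g1, g1)
    kernel /= kernel.sum()
    # blur via direct 5x5 convolution with edge padding
    padded = np.pad(grad, 2, mode="edge")
    blurred = np.zeros_like(grad)
    for i in range(grad.shape[0]):
        for j in range(grad.shape[1]):
            blurred[i, j] = np.sum(padded[i:i + 5, j:j + 5] * kernel)
    return blurred

rng = np.random.default_rng(0)
img = rng.random((16, 16))   # small image so finite differences stay cheap
w = rng.random((16, 16))
sal = saliency(img, w)
```

for the linear toy model the unblurred gradient is exactly the weight map, so the saliency map highlights exactly the pixels the model relies on; with a real network the same gradient-then-blur recipe gives only the restricted view the authors caution about.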
however, many prediction models for diagnosis and prognosis of covid-19 infection are at high risk of bias and model overfitting as well as poorly reported, their alleged performance being likely optimistic [25] . in order to prevent premature implementation in hospitals [26] , tools must be robustly evaluated along several practical axes [18, 27] . indeed, while some ai-assisted tools might be powerful, they do not replace clinical judgment and their diagnostic performance cannot be assessed or claimed without a proper clinical trial [28] . existing work focuses on predicting severity from a variety of clinical indicators which include findings from chest imaging [29] . models such as the one presented in this work can complement and improve triage decisions from cxr as opposed to ct [30] . challenges in creating a predictive model involve labelling the data and achieving good interrater agreement as well as learning a representation which will generalize to new images when the number of labelled images is so low. in the case of building a predictive tool for covid-19 cxr images, the lack of a public database made it difficult to conduct large-scale robust evaluations. this small number of samples prevents proper cohort selection which is a limitation of this study and exposes our evaluation to sample bias. however, we use a model which was trained on a large dataset with related tasks which provided us with a robust unbiased covid-19 feature extractor and allows us to learn only two parameters for our best linear regression model. restricting the complexity of the learned model in this way reduces the possibility of overfitting. our evaluation could be improved if we were able to obtain new cohorts labelled with the same severity score to ascertain the generalization of our model. also, it is unknown if these radiographic scores of disease severity reflect actual functional or clinical outcomes as the open data do not have those data. 
we make the images, labels, model, and code public from this work so that other groups can perform follow-up evaluations. our model's ability to gauge the severity of covid-19 lung infections could be used for escalation or de-escalation of care as well as monitoring treatment efficacy, especially in the icu. the use of a score combining geographical extent and degree of opacity allows clinicians to compare cxr images with each other using a quantitative and objective measure. also, this can be done at scale for a large scale analysis. human subjects: consent was obtained by all participants in this study. comité d'éthique de la recherche en sciences et en santé (cerses) issued approval #cerses-20-058-d. data was collected from existing public sources such as research papers and online radiology sharing platforms. animal subjects: all authors have confirmed that this study did not involve animal subjects or tissue. conflicts of interest: in compliance with the icmje uniform disclosure form, all authors declare the following: payment/services info: this research is based on work partially supported by the cifar ai and covid-19 catalyst grants. this funding does not pose a conflict of interest. it is funding for open academic research with no expected result or intellectual property expectations. financial relationships: all authors have declared that they have no financial relationships at present or within the previous three years with any organizations that might have an interest in the submitted work. other relationships: all authors have declared that there are no other relationships or activities that could appear to have influenced the submitted work. 
- children in spain allowed to play outdoors as country eases covid-19 lockdown
- records more than 2,000 coronavirus deaths in one day, global death toll reaches 100,000
- chest radiographic and ct findings of the 2019 novel coronavirus disease (covid-19): analysis of nine patients treated in korea
- imaging profile of the covid-19 infection: radiologic findings and literature review
- a rapid advice guideline for the diagnosis and treatment of 2019 novel coronavirus (2019-ncov) infected pneumonia (standard version)
- ai can help hospitals triage covid-19 patients
- covid-19 image data collection
- frequency and distribution of chest radiographic findings in patients positive for covid-19
- augmenting the national institutes of health chest radiograph dataset with expert annotations of possible pneumonia
- chexpert: a large chest radiograph dataset with uncertainty labels and expert comparison
- chestx-ray8: hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases
- chest radiograph interpretation with deep learning models: assessment with radiologist-adjudicated reference standards and population-adjusted evaluation
- mimic-cxr, a de-identified publicly available database of chest radiographs with free-text reports. sci data
- padchest: a large chest x-ray image dataset with multi-label annotated reports
- preparing a collection of radiology examinations for distribution and retrieval
- torchxrayvision: a library of chest x-ray datasets and models
- on the limits of cross-domain generalization in automated x-ray prediction
- chexnet: radiologist-level pneumonia detection on chest x-rays with deep learning
- neural smithing: supervised learning in feedforward artificial neural networks
- a variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: a cross-sectional study
- underwhelming generalization improvements from controlling feature attribution
- right for the right reasons: training differentiable models by constraining their explanations
- visualizing data using t-sne
- prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal
- hospitals are using ai to predict the decline of covid-19 patients - before knowing it works
- practical guidance on artificial intelligence for health-care data
- artificial intelligence versus clinicians: systematic review of design, reporting standards, and claims of deep learning studies
- clinical and chest radiography features determine patient outcomes in young and middle age adults with covid-19
- the role of chest imaging in patient management during the covid-19 pandemic: a multinational consensus statement from the fleischner society
this research is based on work partially supported by the cifar ai and covid-19 catalyst grants. this work utilized the supercomputing facilities managed by compute canada and calcul quebec. key: cord-340354-j3xsp2po authors: noll, n. b.; askamentov, i.; druelle, v.; badenhorst, a.; jefferies, g.; albert, j.; neher, r. title: covid-19 scenarios: an interactive tool to explore the spread and associated morbidity and mortality of sars-cov-2 date: 2020-05-07 journal: nan doi: 10.1101/2020.05.05.20091363 sha: doc_id: 340354 cord_uid: j3xsp2po the ongoing sars-cov-2 pandemic has caused large outbreaks around the world and every heavily affected community has experienced a substantial strain on the health care system and a high death toll. communities therefore have to monitor the incidence of covid-19 carefully and attempt to project the demand for health care.
to enable such projections, we have developed an interactive web application that simulates an age-structured seir model with separate compartments for severely and critically ill patients. the tool allows the users to modify most parameters of the model, including age specific assumptions on severity. infection control and mitigation measures that reduce transmission can be specified, as well as age-group specific isolation. the simulation of the model runs entirely on the client side in the browser; all parameter settings and results of the simulation can be exported for further downstream analysis. the tool is available at covid19-scenarios.org and the source code at github.com/neherlab/covid19_scenarios. the novel coronavirus sars-cov-2 was first detected in the city of wuhan within the hubei province of china at the end of december 2019 . in the following months, sars-cov-2 has shown to be highly transmissible -the basic reproductive number, r 0 , has been estimated to be within 2-3 (riou and althaus, 2020; zhang et al., 2020) with an estimated serial interval of 5-7 days (ganyani et al., 2020; nishiura et al., 2020) . the basic reproduction number likely varies between communities and is affected by intervention measures. the illness caused by sars-cov-2 infection, covid-19, clinically presents with a large variance of symptoms that range from mild and asymptomatic infection to acute severe respiratory illness. the clinical presentation of the infection strongly depends upon patient age (surveillances, 2020) and certain comorbidities (fang et al., 2020) . the who declared the covid-19 outbreak a pandemic on march 11th, 2020 (the who covid-19 group, 2020). as of april 20th, 2020, there have been over 2.4 million confirmed covid-19 cases from 210 countries. a critical component of the global response to the covid-19 pandemic is the possibility to explore different scenarios for local outbreaks within communities across the world using mathematical modelling. 
modelling is important not only to guide governmental public health policy but also to inform hospital readiness and educate the general public on the importance of social distancing efforts. the spectrum of models used to analyze covid-19 outbreaks ranges from computationally intensive agent-based simulation (neil m ferguson, 2020), variants of sir/seir models (kermack et al., 1927), to phenomenological curve fitting approaches ("imhe covid-19 forecasting team" and murray, 2020). however, traditional epidemiological modelling protocols do not scale for a global pandemic -modelling has to be done on a region-by-region basis. thus, to make such modeling widely available, we have developed an interactive, online tool that allows users to efficiently explore covid-19 scenarios based upon different epidemiological assumptions and potential mitigation strategies. the dynamics are modelled by an age-stratified seir model, with additional novel compartments that correspond to hospital and icu utilization with finite capacity. our deterministic approach strikes a compromise between the accuracy of the approximation of the outbreak dynamics and the speed of the simulation. on a typical modern computer and browser, the simulation will complete in under one second such that many different parameter values can be explored interactively. the output of the model is a time series of simulated covid-19 infections, hospitalizations, and icu usage. surveillance data such as case counts, covid-19-related fatalities, and hospitalizations can be compared to the model output when such data are available. additionally, we utilize these data to estimate a few basic parameters for each provided scenario to provide reasonable starting points for further parameter explorations. however, we stress that the focus of this tool is on the exploration of scenarios and not on parameter inference.
we have designed our tool with the following principles: (i) users should be able to interact dynamically with the simulation such that changing underlying assumptions manifests instantly in the results, (ii) empirical surveillance data should be plotted with the simulation results to allow for easy assessment of parameter assumptions, (iii) results should be easily shareable via urls, exported raw data, and parameter files. our tool, covid-19 scenarios, was first released on march 9, 2020 and was one of the first publicly available interactive models. it has been utilized consistently throughout the covid-19 pandemic, averaging roughly 8 thousand page loads per day. since we first released, we have been dedicated to improving the tool, in both its underlying scientific accuracy as new data emerged, as well as the overall user experience. all source code and the aggregated surveillance data are made freely available through github. we approximate the dynamics of a covid-19 outbreak using a generalized seir model in which the population is partitioned into age-stratified compartments of: susceptible (s), exposed (e), infected (i), hospitalized (h), critical (c), icu overflow (o), dead (d) and recovered (r) individuals (kermack et al., 1927) . the progression of illness is approximated by the following compartment transitions: susceptible individuals are exposed to the virus by contact with an infected individual; exposed individuals progress towards a infectious state; infectious individuals either recover without hospitalization or progress towards a severe illness that requires hospitalization; hospitalized individuals either recover or worsen towards a critical state; individuals with a critical illness either transition to the icu or, if the hospital is at capacity, to an "overflow" compartment and either return to the hospital state or die; recovered individuals can not be infected again. see fig. 1 for an illustration of the model. 
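the compartment flow described above can be sketched as a simple forward-euler integration for a single age class. the parameter values, the euler scheme, and the exact rule for filling icu beds before spilling into the overflow compartment are illustrative assumptions, not the web tool's actual implementation (which is age-stratified and runs in the browser).

```python
# single-age-class sketch of the s-e-i-h-c-o-d-r flow with finite icu beds.
# all rates are per day; the parameter values are illustrative assumptions.
def simulate(days=300, dt=0.1, N=1e6, beta=0.6, gamma_e=1/3, gamma_i=1/3,
             gamma_h=1/4, gamma_c=1/14, m=0.9, c=0.3, f=0.3, xi=2.0,
             icu_beds=100.0):
    fo = min(xi * f, 1.0)                  # fatality fraction in overflow
    S, E, I, H, C, O, D, R = N - 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0
    peak_icu = 0.0
    for _ in range(int(days / dt)):
        infect = beta * S * I / N          # new exposures
        to_crit = c * gamma_h * H          # hospitalized turning critical
        # critical cases fill free icu beds first; the rest overflow
        to_icu = min(to_crit, max(icu_beds - C, 0.0) / dt)
        to_over = to_crit - to_icu
        S += dt * (-infect)
        E += dt * (infect - gamma_e * E)
        I += dt * (gamma_e * E - gamma_i * I)
        H += dt * ((1 - m) * gamma_i * I + (1 - f) * gamma_c * C
                   + (1 - fo) * gamma_c * O - gamma_h * H)
        C += dt * (to_icu - gamma_c * C)
        O += dt * (to_over - gamma_c * O)
        D += dt * (f * gamma_c * C + fo * gamma_c * O)
        R += dt * (m * gamma_i * I + (1 - c) * gamma_h * H)
        peak_icu = max(peak_icu, C)
    return {"deaths": D, "peak_icu": peak_icu, "recovered": R,
            "total": S + E + I + H + C + O + D + R}

res = simulate()
```

because every outflow reappears as an inflow elsewhere, the total population is conserved exactly, and the icu compartment can never exceed the bed count — a quick sanity check on the capacity constraint.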
we note that direct comparisons between the model predictions and available surveillance data are difficult since only a fraction of cases are confirmed by a positive test and this fraction varies between regions. the number of covid-19 deaths is often a more robust measure. let a, b ∈ [1, 2, ..., n_a] denote the different age classes of each compartment. the parameters of the model fall into three broad categories: a time-dependent infection rate β_a(t); the rates of transition out of the exposed, infectious, hospitalized, and critical/overflow compartments, γ_e, γ_i, γ_h, and γ_c respectively; and the age-specific fractions m_a, c_a, and f_a of mild, critical, and fatal infections respectively. below, we expound upon each class of parameter. the rate of transmission, β_a(t), is nominally determined by both the basic reproductive number r_0 and the time period of patient infectivity 1/γ_i. additionally, the rate of transmission can be effectively slowed by mitigation efforts (e.g. social distancing), which we account for phenomenologically by a multiplicative factor m(t) (see below). lastly, empirical data show a strong, consistent seasonal variation of the four endemic coronaviruses, suggesting similar seasonality in the transmissibility of sars-cov-2 (neher et al., 2020). taken together, the rate of transmission is modelled by β_a(t) = r_0 γ_i χ_a m(t) (1 + ε cos(2π(t − t_max))), where χ_a models demographic-specific isolation, and ε and t_max denote the (currently unknown) amplitude of seasonal variation in transmissibility and the time of year of peak transmission, respectively. after an individual is infected (i.e. exposed), it takes some time before the individual itself is infectious. in our model, the average value of this latency is given by 1/γ_e. the incubation time of covid-19 has been estimated to be well approximated by an erlang distribution (lauer et al., 2020).
as such, we approximate the distribution of incubation times within our framework by chaining three exposed states, in which the mean time to pass through all three states is 1/γ_e. the mean infectious time of a covid-19 case is 1/γ_i, which together with the incubation period 1/γ_e defines the serial interval. the residence times in the remaining compartments are assumed to be exponentially distributed and thus each is taken to be a single state. as noted above, the fractions of covid-19 infections that are asymptomatic/mild, of severe cases which progress to a critical state, and of critical cases that are fatal are denoted as m_a, c_a, and f_a, respectively (see below for more detail). however, it is important to consider the effects of hospital capacity and overutilization in forecasting potential scenarios. finite hospital resources and staffing acutely impact the outcome for critical covid-19 patients and thus the overall covid-19-related fatalities. we phenomenologically capture this effect by introducing a non-linear constraint: a finite number of icu beds c that can accommodate critical patients. once the number of critical cases exceeds this parameter, additional critical cases are redirected to an "overflow" compartment. we take the mortality rate of an overflow patient relative to a patient with an icu bed to be ξ. (this preprint is made available under a cc-by-nc 4.0 international license; the copyright holder is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. this version was posted may 7, 2020.) with all parameters explicitly defined, our full model
whether this also holds for sars-cov-2 is not yet clear, but reinfection and herd immunity are of minor relevance in the early phase of a pandemic (neher et al., 2020). reinfection might be added to the model in the future if evidence accumulates that it is important. epidemiological models, including the one defined by eqn 2, have dozens of parameters, many of which are not accurately known and are difficult to measure. additionally, each model dramatically simplifies reality; our parameters are phenomenological summaries of the "true" heterogeneous dynamics. therefore, we give the user control over all model parameters in order to facilitate the exploration of the dependence of the predicted results on the input parameter values; see fig. 2a for a screenshot of the ui. we note that users specify timescales instead of rates in the ui, e.g. γ_i^{-1} corresponds to the "infectious period" input box of fig. 2a, as we felt timescales are easier to interpret directly. for ease of use, the web application has presets for many countries and states that can be used as a starting point for exploration. in order to model both the historical transmission of covid-19 and project its further spread, one must model the enacted social distancing measures, case isolation, and quarantine policies. as such, our model gives the user the ability to specify individual interventions, indexed by i, each with a well-defined start and end date and an "effectiveness" parameter α_i ∈ [0, 1] that quantifies the mitigation's multiplicative effect on the rate of transmission. see fig. 2b for a depiction of the ui for the input of different mitigation measures. at each point in time, the cumulative efficacy of all interventions is calculated as

m(t) = ∏_i (1 − α_i)

where the product runs over all measures i in effect at time t. in the absence of any mitigation strategy, m = 1.
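the cumulative-mitigation product described above can be sketched as follows; the data layout (a list of start/end/efficacy tuples) is our assumption:

```python
def cumulative_mitigation(interventions, t):
    """interventions: list of (start, end, efficacy) tuples, efficacy in [0, 1].
    Returns M(t), the product of (1 - efficacy) over all measures active
    at time t; with no active measure the factor is 1 (no mitigation)."""
    m = 1.0
    for start, end, eff in interventions:
        if start <= t < end:
            m *= (1.0 - eff)
    return m
```

overlapping interventions multiply: a 30% and a 50% measure active together give a factor of 0.7 * 0.5 = 0.35 on transmission.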
(https://doi.org/10.1101/2020.05.05.20091363; this version posted may 7, 2020.) figure 2: user interface for model parameters and mitigations. a) an example screenshot of the ui of the web application used to vary model parameters. on the left, we have grouped parameters of the population under study, e.g. the initial number of cases and the size of the population. conversely, on the right, we grouped phenomenological parameters related to covid-19 epidemiology. all numbers can be entered manually with a keyboard or stepped with the scrollbox. b) individual mitigation measures can be added or removed from the model via the shown interface. each intervention has a unique name, a range of times it is applied for, and a range of possible efficacies. the net mitigation of covid-19 transmission is calculated via eq. 3. the overall mitigation efficacy modulates covid-19 transmissibility as seen in eqn. 1. additionally, our model allows for the input of simple, time-independent, age-specific isolation measures. as can be seen in fig. 3, we provide a column for "age-specific" isolation. these numbers result in a reduction in exposure of individuals from specific age groups to the general population. for example, this feature could explore the effect of measures specific to the elderly. the clinical outcome of a covid-19 infection strongly depends on the age of the patient. hence, the overall burden of a covid-19 epidemic within a given region strongly depends on the age demographics of the population. in order to facilitate the integration of such effects within the model, we aggregated age distributions for most countries, obtained from the unsd database api (united nations statistics division, 2020) with a custom python script, to provide as presets.
additionally, we allow for custom age distributions to be specified within the ui; see fig. 3. the provided age distributions determine the fraction of people in each age group in the simulation. the chinese cdc provided extensive statistics on the severity of covid-19 in different age groups (the novel coronavirus pneumonia emergency response epidemiology team, 2020), broadly compatible with estimates by (verity et al., 2020). we used these data to parameterize the expected burden on health care systems. our severity assumptions are summarized in an editable table in the tool, shown in fig. 3. each column can be edited and changed if users want, and the implied infection fatality for each group is calculated. elements of the table directly correspond to model parameters: (i) m_a is the product of the percentage confirmed and the complement of the severe fraction, (ii) c_a is set by the critical column, and (iii) f_a is set by the fatal fraction. the dynamics of an exponentially growing process such as the covid-19 pandemic are naturally most sensitive to the growth rate. in the context of the model, the growth rate of infections is primarily a function of the basic reproductive number r_0 and the societal interventions enacted to slow the spread of covid-19. additionally, it is a priori difficult to know the efficacy of mitigation measures. covid-19 scenarios therefore allows the user to specify ranges for r_0 as well as for the efficacy α of mitigation measures; see fig. 2ab for an example of each. the tool will randomly sample a user-specified number of parameter combinations uniformly from these ranges (by default set to 10 combinations).
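the uniform sampling of parameter combinations and the percentile summary of the resulting ensemble can be sketched as below; the function names are ours, and the percentile rule here is a plain linear interpolation, which is one common convention but not necessarily the one the tool uses:

```python
import random

def sample_scenarios(r0_range, eff_range, n=10, seed=0):
    """Uniformly sample (r0, mitigation efficacy) combinations from
    user-specified ranges; n defaults to the tool's 10 combinations."""
    rng = random.Random(seed)
    return [(rng.uniform(*r0_range), rng.uniform(*eff_range))
            for _ in range(n)]

def percentile(values, q):
    """Linear-interpolation percentile, q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = int(pos), min(int(pos) + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)
```

running the model once per sampled combination and taking the 20th and 80th percentiles of the trajectories at each time point yields the shaded uncertainty band described in the results.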
the output displays the median as well as a shaded area denoting the 20th and 80th percentiles; see fig. 4 for an example of the displayed results. the primary results of the tool are trajectories of the number of cases, people in need of hospitalization, and fatalities; see fig. 4 for an example of the predictions for new york city. additionally, all predicted trajectories can be exported as a single age-stratified table for further downstream analysis. a short executive summary of the results can additionally be printed to pdf. where available, the app graphs the recent case counts, deaths, and hospitalizations for the community under study on the plot with the model results. the surveillance data enable the user to adapt parameters to tune the simulation to the data; see fig. 4. once the model fits past data, the user can explore future scenarios by adjusting interventions and seasonality. while covid-19 scenarios is not intended as an inference tool, we nevertheless provide parameter presets that are estimated from empirical data. the primary intent behind fitting to data is not to provide values with high confidence, but rather to facilitate the immediate utility of our tool for different scenarios from across the globe with reasonable presets. we note that care must be taken not to overfit the data; there are many more parameters in the model than features within the available data. furthermore, the testing and reporting patterns are heterogeneous across regions and change over time, which ultimately distorts the raw numbers. we therefore elect to estimate only three model parameters for each region: (i) r_0, which is not solely a property of the virus but also of the social structure of the population, (ii) the initial date of the epidemic t_min, and (iii) the size of the initial cluster i_0. in addition, we preset mitigation measures that set in when case counts rise above certain levels.
again, these are not meant as fit parameters but as templates to be adjusted by the user. we assume the remaining parameters don't vary across regions. these have to be adjusted by the user if the data or other information suggest values different from the defaults. we try to fit data solely from the onset of the epidemic, prior to mitigation efforts, for individual regions. due to the heterogeneity of both the timing and the efficacy of policies implemented across the regions provided for interactive exploration, we opted for a simple solution. as more data from more regions become available, we might fit more parameters to observations. both the initial estimates for scenario values and the interactive calibration of the model require empirical observations of covid-19 infections and hospitalizations. due to the scope of scenarios provided, we utilize a number of online resources to aggregate information on new covid-19 cases, deaths, and hospitalizations. these resources include the daily updated case counts by the ecdc (european centre for disease control, 2020), the us covid tracking project (the covid tracking project, 2020), other official governmental agencies from around the world, and data aggregated by volunteers. a full list of all sources we use can be found in the file data/sources.json in https://github.com/neherlab/covid19_scenarios. the empirical case data in the app are updated every 2-3 days. covid-19 scenarios is implemented as a single-page web application using the react web framework, typescript, and numerous packages from the node.js ecosystem. the simulation itself runs on the client side in a webworker to ensure interactivity during the computation. the application can be hosted on any static web server, or run locally. we host the latest release version publicly on aws infrastructure, accessible at https://covid19-scenarios.org.
data fetching, processing, parameter estimation, and scenario generation are implemented using python and common data science packages, as an additional build step. figure 3: age-dependent parameters. parameters that depend on patient age are summarized in a table that contains the distribution of age groups, severity parameters, and age-specific isolation. the severity parameters are approximately based on data by surveillances (2020). the first column, "confirmed", is our assumption on what fraction of total infections in the different age groups enter as cases in the data analyzed in surveillances (2020). young individuals are often asymptomatic and hence less likely to be tested. the following columns specify what fraction of confirmed cases fall severely ill and require medical attention, what fraction of the former fall critically ill and require intensive care, and lastly what fraction of critically ill patients die. the implied infection fatality rate is given in the second-to-last column. the full source code is available under the mit license on github at https://github.com/neherlab/covid19_scenarios. instructions on how to run the application are documented there. covid-19 scenarios was first released on march 9, 2020 and has been updated consistently since. countries, states, and communities across the world have to plan and prepare for outbreaks and the potential reemergence of covid-19. many countries have expert research groups that develop tailored models and sophisticated inferences (kucharski et al., 2020; neil m ferguson, 2020) to predict individualized outcomes.
however, governments and other public organizations without such resources need a flexible tool that models local outbreaks, explores the effect of interventions, and compares results to past dynamics in an interactive workflow. this is the gap that covid-19 scenarios has filled and continues to fill. to date, we average roughly 8 thousand page loads per day (we don't track users, but estimated these numbers from cloudfront usage statistics and the number of requests per page load). these requests come from more than 50 countries, with most visitors coming from the usa, germany, switzerland, russia, austria, and the uk. in order to estimate the potential future burden on the health care system, users need flexible ways to adjust demographic parameters in addition to local public health care policies (who gets admitted to the icu, how long patients are hospitalized). at the same time, sensible defaults are required to provide a useful starting point for exploration. covid-19 scenarios was written with the explicit purpose of aiding in this regard. the past few months have shown that social distancing measures can effectively slow the spread of covid-19. the future trajectory of covid-19 will therefore primarily depend upon the level of social distancing and infection control that is maintained. covid-19 scenarios therefore cannot confidently predict outcomes, but rather helps to explore potential future scenarios under specific assumptions made by the user. this difficulty is further compounded by the fact that predictions of absolute numbers are exceedingly sensitive to small variations in input parameters. due to the nature of the exponential growth experienced within an epidemic, a small uncertainty in either the growth rate or the initial date will naturally result in large uncertainty in case numbers. therefore, it is critical that these uncertainties are communicated effectively to policy makers.
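to illustrate the sensitivity to the growth rate noted above, a toy projection with purely illustrative numbers:

```python
import math

def projected_cases(i0, growth_rate, days):
    """Exponential projection: cases(t) = i0 * exp(r * t)."""
    return i0 * math.exp(growth_rate * days)

# two growth-rate estimates only ~11% apart in relative terms
low = projected_cases(10, 0.200, 90)   # r = 0.200 / day
high = projected_cases(10, 0.223, 90)  # r = 0.223 / day
```

after 90 days the two projections differ by a factor of several, even though the growth rates are barely distinguishable from early data, which is exactly why ranges and percentile bands matter more than point estimates.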
we therefore allow the user to specify plausible ranges for the parameter r_0 and the efficacy of the interventions. in the case of several interventions, this results in a high-dimensional space of possibilities that we sample uniformly. percentiles of the sampled results are displayed to capture the range of potential outcomes. we stress that in addition to parameter uncertainty, a simple seir model is a drastic abstraction and simplification that does not capture the full complexity and heterogeneity of the outbreak. nevertheless, we hope that the tool is helpful for understanding the dynamics of the outbreak and exploring the effect of past and future interventions. acknowledgement: we gratefully acknowledge input from members of the lab, adam kucharski, rosalind eggo, and christian althaus. nils ole tippenhauer and aitana lebrand have helped to parse, aggregate, and update surveillance data from various sources. in addition, we received many invaluable contributions from the open source community. vercel.com has supported the development with free access to their tool-stack. figure 4: example simulation results for new york city. (top) a plot of the time-dependent effective reproductive number. this is controlled via mitigation interventions and seasonality. the colored line corresponds to the median and the shaded area is bounded by the 20% and 80% percentiles. (middle) plot that shows both the interval (length) and ranges of possible efficacies (width) for the applied mitigation interventions. (bottom) plot that shows the resulting trajectories of all compartments. the colored line for each compartment shows the median trajectory while the shaded area is bounded by the 20% and 80% percentiles. additionally, the aggregated case count data is plotted (where available) as individual points. the display of individual compartments can be toggled by clicking on the legend.

references (titles as extracted):
the time course of the immune response to experimental coronavirus infection of man
are patients with hypertension and diabetes mellitus at increased risk for covid-19 infection
forecasting covid-19 impact on hospital bed-days, icu-days, ventilator-days and deaths by us state in the next 4 months
a contribution to the mathematical theory of epidemics
effectiveness of isolation, testing, contact tracing and physical distancing on reducing transmission of sars-cov-2 in different settings
the incubation period of coronavirus disease 2019 (covid-19) from publicly reported confirmed cases: estimation and application
early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia
potential impact of seasonal forcing on a sars-cov-2 pandemic
report 9 - impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
serial interval of novel coronavirus (covid-19) infections
pattern of early human-to-human transmission of wuhan
the epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (covid-19) - china, 2020
the covid tracking project (library catalog: covidtracking)
data retrieved from population demographic total datasets using the python pandas interface
estimates of the severity of coronavirus disease 2019: a model-based analysis
evolving epidemiology and impact of non-pharmaceutical interventions on the outbreak of coronavirus disease
estimation of the reproductive number of novel coronavirus (covid-19) and the probable outbreak size on the diamond princess cruise ship: a data-driven analysis
clinical course and risk factors for mortality of adult inpatients with covid-19 in wuhan, china: a retrospective cohort study

key: cord-337915-usi3crfl authors: vo, khuong; le, tai; rahmani, amir m.; dutt, nikil; cao, hung title: an efficient and robust deep learning method with 1-d octave convolution to extract fetal electrocardiogram date: 2020-07-04 journal: sensors (basel) doi: 10.3390/s20133757 sha: doc_id: 337915 cord_uid: usi3crfl the invasive method of fetal electrocardiogram (fecg) monitoring, with electrodes directly attached to the fetal scalp, is widely used. there are potential risks such as infection and, thus, it is usually carried out during labor in rare cases. recent advances in electronics and technologies have enabled fecg monitoring from the early stages of pregnancy through fecg extraction from the combined fetal/maternal ecg (f/mecg) signal recorded non-invasively in the abdominal area of the mother. however, cumbersome algorithms that require the reference maternal ecg, as well as heavy feature crafting, make out-of-clinic fecg monitoring in daily life not yet feasible. to address these challenges, we proposed a pure end-to-end deep learning model to detect fetal qrs complexes (i.e., the main spikes observed on a fetal ecg waveform). additionally, the model has the residual network (resnet) architecture that adopts the novel 1-d octave convolution (octconv) for learning multiple temporal frequency features, which in turn reduces memory and computational cost. importantly, the model is capable of highlighting the contribution of regions that are more prominent for the detection. to evaluate our approach, data from the physionet 2013 challenge with labeled qrs complex annotations were used in the original form, and the data were then modified with gaussian and motion noise, mimicking real-world scenarios.
the model can achieve an f1 score of 91.1% while saving more than 50% of the computing cost with less than 2% performance degradation, demonstrating the effectiveness of our method. the u.s. fetal mortality rate has remained unchanged from 2006 through 2012 at 6.05 per 1000 births [1], motivating the need for proactive fetal monitoring techniques to alert and reduce fetal mortality. the covid-19 pandemic has revealed the weaknesses of our healthcare system in providing remote monitoring for essential services. although mobile health and telemedicine technologies have been introduced for more than a decade, expectant mothers still need to visit clinics in person for regular checkups, especially twice or thrice a week in the last month of pregnancy. additional technological improvements can enable these fetal well-being monitoring and non-stress tests to be performed remotely, which can result in significant cost and time reduction, as well as mitigating the burden for the hospitals, while supporting social distancing when needed. figure 1: decomposition of fetal/maternal ecg (f/mecg) into low-frequency and high-frequency components (from data a74-channel 1): (a) low-frequency f/mecg part with the dominant frequencies below 1 hz, as shown in the power spectral density (psd) plot in the top right corner; (b) high-frequency f/mecg part dominantly belonging to the frequency range above 100 hz, as shown in the psd plot in the top right corner. the rest of our paper is organized as follows. section 2 presents the experimental data and our proposed architecture, as well as provides the efficiency analysis. in section 3, experimental results are discussed along with the interpretation of the model's predictions.
finally, section 4 concludes the paper. in this research, set a from the physionet/computing in cardiology challenge database (pcdb) [17] was used as the experimental data. this is the largest publicly available, non-invasive fecg database to date, which consists of a collection of one-minute abdominal ecg recordings. each recording includes four non-invasive abdominal channels, each acquired at 1000 hz. the data were obtained from multiple sources using a variety of instrumentation with differing frequency response, resolution, and configuration. we used set a, containing 75 records with reference annotations, excluding a number of recordings (a33, a38, a47, a52, a54, a71, and a74) having inaccurate reference annotations [18]. the annotations of the locations of fetal qrs complexes were manually produced by a team of experts, usually with reference to direct fecg signals that were acquired from a fetal scalp electrode [17]. as suggested in [10], the validation set and test set comprised six recordings (a08-a13) and seven recordings (a01-a07), respectively. the 55 remaining recordings formed the training set. to evaluate the effectiveness of our method in practical scenarios, gaussian noise was added at various noise levels, making the signal-to-noise ratio (snr) of the modified datasets 50.6 db, 36.8 db, and 29.1 db, respectively. specifically, normally distributed random noise ranging from −4 µv to 4 µv was generated by the randn function in matlab (r2018b, the mathworks, inc., natick, ma, usa). to obtain different noise levels (i.e., different snr values), we multiplied the initial random noise by constant numbers (e.g., 3, 6, and 9) before adding it to the original dataset. additionally, since the dataset was collected in a clinical setting where motion artifacts were mostly avoided because the subjects were in resting positions, motion artifacts were added to the data.
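the gaussian-noise procedure above can be sketched as follows; unit-variance gaussian noise stands in for matlab's randn, and the paper's ±4 µv range and exact units are glossed over here:

```python
import math
import random

def add_scaled_noise(signal, scale, seed=0):
    """Add unit-variance Gaussian noise multiplied by `scale`, mimicking
    the described constant multipliers (e.g. 3, 6, 9) on a base noise."""
    rng = random.Random(seed)
    return [s + scale * rng.gauss(0.0, 1.0) for s in signal]

def snr_db(signal, noisy):
    """SNR in dB between a clean signal and its noisy version."""
    p_sig = sum(s * s for s in signal)
    p_noise = sum((n - s) ** 2 for s, n in zip(signal, noisy))
    return 10.0 * math.log10(p_sig / p_noise)
```

tripling the noise multiplier lowers the snr by 20·log10(3) ≈ 9.5 db, which matches the pattern of successively lower snr values reported for the modified datasets.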
for this purpose, the ecg data were recorded from a healthy subject during different types of activities such as walking and jogging. the acquired data were normalized between [−1, 1], and the motion noise and filtered ecg data were then obtained by using an extended kalman filter. the f/mecg was normalized with the same threshold of [−1, 1] before the motion noise was added to the normalized f/mecg. in the context of fetal ecg extraction, a detected fetal qrs is considered a true positive if it is within 50 ms of the reference annotation. therefore, every input window frame of 4 × 100 is labeled as class 1 if it contains the location of the complex, and as class 0 otherwise. as shown in figure 2, a fully convolutional network with shortcut residual connections is applied to the input data of size 4 × 1000 (a 1-second window of time), followed by a recurrent network layer to output 10 predictions over 10 consecutive window frames of 4 × 100. a general 1-d convolution is applied for a centered time stamp t by the following equation:

y_t = f(ω · x_{t−l/2 : t+l/2} + b)

where y denotes the resulting feature map from a dot product of the time series x of length t with a filter ω of length l, a bias parameter b, and a non-linear function f. as can be seen in figure 3a, all input and output feature maps in vanilla convolution retain the same temporal resolution across the channels, which possibly introduces temporal redundancy, since low-frequency information is captured by some feature maps that can be further compressed.
figure 3: (a) vanilla 1-d convolution, with all input and output feature maps at the same temporal resolution; (b) 1-d octave convolution on decomposed feature maps where low-frequency channels have 50% resolution. red arrows denote inter-frequency information exchange (f^{h→l}, f^{l→h}), while green arrows denote intra-frequency information update (f^{h→h}, f^{l→l}), where f^{a→b} denotes the convolutional operation from feature map group a to group b.
the l and c denote the temporal dimension and the number of channels, respectively. the ratio α of input channels (α_in) and output channels (α_out) is set at the same value throughout the network, except that the first octconv has α_in = 0 and α_out = α, while the last octconv has α_in = α and α_out = 0. with this observation, we adapted the 2-d octave convolution (octconv) [13] to the multi-channel f/mecg with a 1-d octconv, as illustrated in figure 3b. the input feature maps of a convolutional layer x can be factorized along the channel dimension into low-frequency maps x^l and high-frequency maps x^h. the high-frequency channels retain the feature map's resolution, while the low-frequency channels down-sample it. practically, the x^l feature representation has its temporal dimension, or length, divided by 2 (an octave) to produce two frequency blocks. there is a hyper-parameter α ∈ [0, 1] to determine the ratio of channels allocated to the low-resolution features. the octave convolution effectively processes feature maps directly in their corresponding frequency tensors and manages the information exchange between frequencies, by having different filters focus on different frequencies of the signal. for convolving on x^h and x^l, the original convolutional kernel w is split into two components, w^h and w^l, which are further broken down into intra- and inter-frequency parts:

w^h = [w^{h→h}, w^{l→h}], w^l = [w^{l→l}, w^{h→l}]

let y^h and y^l represent the high-frequency and low-frequency output tensors; the octave convolution can be written as:

y^h = f(x^h; w^{h→h}) + upsample(f(x^l; w^{l→h}), 2)
y^l = f(x^l; w^{l→l}) + f(pool(x^h, 2); w^{h→l})

where upsample(x; k) is an up-sampling operation by a factor of k via nearest interpolation, pool(x, k) is an average pooling operation with kernel size k and stride k, and f(x; w) denotes a convolution with parameters w. the benefit of the new feature representation and convolution operation is the reduction in computational cost and memory footprint, enabling implementation on resource-constrained devices.
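the four octconv paths above can be sketched in pure python on toy nested-list tensors; a real implementation would of course use a deep learning framework, and all names here are ours:

```python
def conv1d(x, w):
    """'same' 1-d convolution, stride 1. x: c_in channels, each a list of
    length l; w: c_out filters, each with c_in kernels of odd length k."""
    c_in, l, k = len(x), len(x[0]), len(w[0][0])
    pad = k // 2
    out = []
    for filt in w:
        row = []
        for t in range(l):
            s = 0.0
            for i in range(c_in):
                for j in range(k):
                    tt = t + j - pad
                    if 0 <= tt < l:
                        s += filt[i][j] * x[i][tt]
            row.append(s)
        out.append(row)
    return out

def pool(x, k=2):
    """Average pooling with kernel size k and stride k along time."""
    return [[sum(ch[t:t + k]) / k for t in range(0, len(ch) - k + 1, k)]
            for ch in x]

def upsample(x, k=2):
    """Nearest-neighbour up-sampling by factor k along time."""
    return [[v for v in ch for _ in range(k)] for ch in x]

def octconv1d(xh, xl, w_hh, w_lh, w_ll, w_hl):
    """1-d octave convolution:
    y_h = f(x_h; w_hh) + upsample(f(x_l; w_lh), 2)
    y_l = f(x_l; w_ll) + f(pool(x_h, 2); w_hl)"""
    fh = conv1d(xh, w_hh)
    fl2h = upsample(conv1d(xl, w_lh))
    yh = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(fh, fl2h)]
    fl = conv1d(xl, w_ll)
    fh2l = conv1d(pool(xh), w_hl)
    yl = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(fl, fh2l)]
    return yh, yl
```

with delta kernels [0, 1, 0] every path acts as identity, so the high-frequency output is the high branch plus the up-sampled low branch, and the low-frequency output is the low branch plus the pooled high branch, exactly mirroring the two equations.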
this also helps convolutional layers capture more contextual information through larger receptive fields: the low-frequency part x l is convolved with 1 × k kernels over a half-length signal, which can improve classification performance.

the 1-d cnn comprises nine convolutional layers and one global average pooling layer. the nine convolutional layers are divided into three residual blocks, with shortcut residual connections between consecutive blocks. these connections are linear operations that link a residual block's output to its input, alleviating the vanishing-gradient effect by allowing the gradient to propagate directly through them. in each residual block, the first, second, and third convolutions have filter lengths of 8, 5, and 3, respectively. the total numbers of filters are 64, 128, and 128 for the first, second, and third block, respectively. each block is followed by a batch normalization (bn) operation [19] and then a rectified linear unit (relu) activation. with strides of 1 and appropriate padding, the convolutions do not alter the length of the input time series. this resnet architecture can be kept invariant across different time-series datasets [15], making it suitable for transfer learning techniques [20], in which the model is first trained on source datasets and then fine-tuned on the target dataset to further improve performance.

after the convolutional layers, we have a feature representation of size 128 × 1000. this feature map is split into 10 parts of size 128 × 100, and each part is global-average pooled across the time dimension, resulting in a 128-dimensional vector. the resulting 10 × 128 tensor of 10 timesteps is then fed to a recurrent layer of gated recurrent units (grus) [21] with a hidden state size of 32 to capture the sequential nature of qrs complexes. grus have gating mechanisms that adjust the information flow inside the unit.
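to make the tensor bookkeeping above concrete, here is a short trace of feature shapes through the described pipeline. this is a sketch based on the sizes stated in the text; the 4-channel, 1000-sample input and the 2 × 32 bi-directional output size are our reading of it:

```python
def trace_shapes(c_in=4, length=1000, timesteps=10, hidden=32):
    # channels per residual block; stride-1 'same' convolutions keep the length
    blocks = [64, 128, 128]
    shapes = {"input": (c_in, length)}
    for i, c in enumerate(blocks, start=1):
        shapes[f"block{i}"] = (c, length)
    # split 128 x 1000 into 10 windows of 128 x 100, then global-average
    # pool each window over time -> one 128-d vector per timestep
    shapes["gru_input"] = (timesteps, blocks[-1])
    # bi-directional gru with hidden state size 32 -> 2 x 32 per timestep
    shapes["gru_output"] = (timesteps, 2 * hidden)
    return shapes
```

a shared-weight classifier applied per timestep then yields one fqrs decision per 4 × 100 input window.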
a gru memory cell is updated at every timestep t as detailed in the following equations:

z t = σ(w z x t + u z h t−1 + b z )
r t = σ(w r x t + u r h t−1 + b r )
ĥ t = tanh(w h x t + u h (r t ⊙ h t−1 ) + b h )
h t = (1 − z t ) ⊙ h t−1 + z t ⊙ ĥ t

where σ denotes the logistic sigmoid function, ⊙ is the elementwise multiplication, r is the reset gate, z is the update gate, and ĥ t denotes the candidate hidden state. in addition, since grus only learn dependencies in one direction, under the assumption that the output at timestep t depends only on previous timesteps, we deploy bi-directional grus to capture dependencies in both directions. finally, a shared-weight softmax layer is applied in each feature region corresponding to each 4 × 100 window of input data to detect the fqrs complexes. note that the recurrent layer and the softmax layer on top add negligible computing cost, since they model only a short sequence of 10 timesteps with a minimal hidden state size.

memory cost. for a feature representation with vanilla 1-d convolution, the storage cost is l × c, where l and c are the length and the number of channels, respectively. in octconv, the low-frequency tensor is stored at 2× lower temporal resolution in α × c channels, as illustrated in figure 3, so the storage cost of the multi-frequency feature representation is l × (1 − α) × c + (l/2) × α × c. therefore, the memory cost ratio between 1-d octconv and regular convolution is

[l × (1 − α) × c + (l/2) × α × c] / (l × c) = 1 − α/2.

let k be the filter length; the floating point operations (flops) (i.e., multiplications and additions) for computing the output feature map in vanilla 1-d convolution scale as

l × k × c in × c out .

in octconv, the main computation comes from the convolution operations in the two paths of inter-frequency information exchange ( f h→l and f l→h ) and the two paths of intra-frequency information update ( f h→h and f l→l ). they are estimated for each path as

f h→h : l × k × (1 − α) c in × (1 − α) c out ,
f l→l : (l/2) × k × α c in × α c out ,
f h→l : (l/2) × k × (1 − α) c in × α c out ,
f l→h : (l/2) × k × α c in × (1 − α) c out .
sensors 2020, 20, 3757

hence, the total cost for computing the output feature map with octconv is

l × k × c in × c out × [(1 − α)² + α²/2 + α(1 − α)] = l × k × c in × c out × (1 − α + α²/2),

so the computational cost ratio between 1-d octconv and regular convolution is 1 − α + α²/2. from equations (4) and (8), we can derive the theoretical gains of the 1-d octave convolution per layer for different values of α, as shown in table 1.

as a deep learning method, the model is inherently a black-box function approximator, which is one of the primary obstacles to its use in medical diagnosis. in order to make the model more transparent, we extended the gradient-weighted class activation mapping (grad-cam) technique [16] to highlight the discriminative time-series regions that contribute to the final classification. as opposed to the model-agnostic interpretation methods [22, 23], which require many forward passes, grad-cam has a fast-computation advantage, requiring only a single pass. in contrast to the cam method [24, 25], which trades off model complexity and performance for more transparency, grad-cam has the flexibility of applying to any cnn-based model in which cnn layers may be followed by recurrent neural network layers or fully connected layers. the method relies on a linear combination of the channels of the output feature maps of the last convolutional layer, which is the 128-channel feature map that appears before the global-average pooling layer in our network. grad-cam computes the weights a c k , the importance of channel k for the target class c, by averaging the gradients of the score for class c (y c ) with respect to the channel activations a k across the time dimension:

a c k = (1/l) Σ t ∂y c / ∂a k (t).

since only features that have positive influences on the target class are important, relu is applied to the linear combination of maps to derive the heat map for the prediction of class c:

heatmap c = relu( Σ k a c k a k ).

model parameters θ are obtained by minimizing the weighted binary cross-entropy loss between the ground truth targets or labels y and the estimates or predictions ŷ.
that is,

loss(θ) = −(1/m) Σ i=1..m Σ j=1..t [ β y ij log ŷ ij + (1 − y ij ) log(1 − ŷ ij ) ],

where m denotes the number of training samples, t is equal to 10, which is the number of successive sub-sequences in which fqrs complexes are detected, y ij signifies the label at sequence j of the ith training sample, and the multiplicative coefficient β is set to 2, which up-weights the cost of a positive error relative to a negative error in order to alleviate the imbalanced-class problem between the numbers of fqrs and non-fqrs complexes.

the model was trained with the adam optimizer [26] with an initial learning rate of 0.001, exponential decay rates β 1 = 0.9 and β 2 = 0.999, and the constant for numerical stability ε = 10⁻⁸. all trainable weights were initialized with the xavier/glorot initialization scheme [27]. all dropout rates [28] were set at 0.4 to prevent the neural network from overfitting. the training process ran for a total of 100 epochs with a batch size of 64, and the best model on the validation set was chosen to report its performance on the test set.

the fundamental measures for evaluating classification performance are precision, recall, and f 1 score. precision is a measure of exactness that depicts the capacity of the model to detect true fqrs complexes out of all the detections it makes. recall is a measure of completeness that depicts the model's capacity to find the true fqrs complexes. the f 1 score is the harmonic mean of precision and recall. they are calculated as

precision = tp / (tp + fp), recall = tp / (tp + fn), f 1 = 2 × precision × recall / (precision + recall),

where tp is the number of true positives (correctly identified fqrs), fp is the number of false positives (wrongly detected fqrs), and fn is the number of false negatives (missed fqrs).

in table 2, we measure the performance of the proposed model with varying α ∈ {0, 0.25, 0.5, 0.75} on the original physionet dataset. the number of model parameters was held constant at 0.56 million. at α = 0, the model with the vanilla convolution had a computational cost of 0.52 gigaflops (gflops).
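the theoretical per-layer gains can be checked numerically. the closed forms below (memory ratio 1 − α/2 and flops ratio 1 − α + α²/2) are our derivation from the per-path convolution costs described earlier, not values copied from table 1:

```python
def octconv1d_ratios(alpha):
    # relative cost of a 1-d octconv layer versus a vanilla conv layer
    memory = (1 - alpha) + alpha / 2          # low-freq maps stored at half length
    flops = ((1 - alpha) ** 2                 # f: h->h at full length
             + alpha ** 2 / 2                 # f: l->l at half length
             + alpha * (1 - alpha))           # f: h->l and l->h, both half length
    return memory, flops

for a in (0.0, 0.25, 0.5, 0.75):
    m, f = octconv1d_ratios(a)
    print(f"alpha={a:.2f}  memory x{m:.3f}  flops x{f:.3f}")
```

at α = 0.25 the flops ratio comes out near 0.78, i.e. a reduction on the order of 20%, which is consistent with the reduction reported for the convolutional layers in the results.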
the computation of the recurrent layer and the softmax layer (gru-fc-gflops) was fixed at approximately 0.0003 gflops, which contributed only 0.06% of the total cost. we observed that the f 1 score on the test dataset (f 1 -test) first increased marginally and then slowly declined as α grew. the highest f 1 score of 0.911 was reached at α = 0.25, when the computation of the convolutional layers (cnn-gflops) was reduced by around 20%. we attribute the increase in f 1 score to octconv's effective design of multi-frequency processing and the resulting augmentation of contextual information through enlarged receptive fields. it is interesting to note that compressing 75% of the feature maps to half resolution resulted in only a 1.4% drop in f 1 score. to better support the generalizability of our approach, we also performed cross-validation, obtaining means and standard deviations of the model's performance over 10 folds of the dataset (f 1 -cross). the cross-validated performance showed a trend similar to the results on the selected test set, with even smaller f 1 score gaps between different α. likewise, the gpu inference time diminished with the drop in the number of flops as α increased. these results demonstrate 1-d octconv's capability of grouping and compressing smoothly changing time-series feature maps. note that octconv is orthogonal and complementary to existing methods for improving cnn efficiency; by combining octconv with popular techniques such as pruning [29] and depth-wise convolutions [30], the computing cost of the model can be cut down further. figure 4 depicts the effect of the number of timesteps fed into the recurrent layer with grus. at each timestep, the model processes the feature corresponding to an input window frame of 4 × 100. the f 1 score grew sharply with the increase in sequence length and reached its peak at 10 timesteps before leveling off.
also, it is worth noting that the computational cost and memory footprint of the model grow with the input sequence length. this result validates our input segmentation strategy and shows the necessity of the recurrent layer for modeling the sequential nature of the fqrs complexes.

we also compared our approach with two recent deep learning algorithms reported in the literature [9, 10] in figure 5. our method had a higher recall (90.32%) than the other algorithms (89.06% and 80.54%), while its precision (91.82%) was slightly lower than lee's algorithm (92.77%) but significantly higher than zhong's approach (75.33%). lee's model requires manual architecture tuning as well as post-processing steps to be best optimized for the specific task, while our proposed method performs only a single stage of processing and can generalize easily to other signal-processing schemes, since the resnet architecture is invariant across various time-series classification tasks, as shown in [15].

table 3 shows the model performance on the added-noise datasets, in order to evaluate the effectiveness of our method in real-life scenarios. regarding gaussian noise, the performance fell sharply with the increase in the noise level.
the f 1 score was 0.815 at noise level 3 but decreased to 0.627 at noise level 9, when the data was completely corrupted. besides, the model achieved an f 1 score of 0.844 on the dataset disturbed with motion artifacts. these promising results demonstrate the robustness of our method against different types of noise in practical scenarios.

figure 6 shows our attempt to interpret the inner workings of our network. in figure 6a, we show examples of the heatmap outputs in a window frame using the grad-cam approach. the technique enables the classification-trained model not only to answer whether a time series contains the location of a complex, but also to localize class-specific regions. this gives us confidence that the detection of qrs complexes inside a window frame occurs because the model correctly focuses on the most discriminative subsequence around the r-peak (i.e., the maximum amplitude in the r wave of the ecg signal), and not for unknown reasons. figure 6b,c illustrate the attention mappings for our model's detections of fqrs complexes on the entire input signals. there were two and three fqrs complexes detected, associated with the attention spikes in figure 6b,c, respectively. although the maternal ecg signal is the predominant interference source, possessing much greater amplitude than the fetal ecg, our network still managed to pay high attention to the right regions of the waveform that contain fetal qrs complexes. moreover, when deciding on each window frame, the model not only focused on the local morphology of fqrs complexes but also took the surrounding complexes into account to reinforce its decision, which demonstrates the capability of the recurrent layer in our network. besides the high accuracy, this shows that our model understands the problem properly, which, in turn, makes the model more trustworthy and gives clinicians higher confidence in using it for medical diagnosis.
in this work, we explored a highly effective, end-to-end, deep neural network-based approach for the detection of fetal qrs complexes in non-invasive fetal ecg. we extended the novel octave convolution operation to the time-series data of the non-invasive fecg to address the temporal redundancy problem in conventional 1-d cnns. the improvement in the computational and memory efficiency of the model facilitates its deployment on resource-constrained devices (e.g., wearables). our proposed method achieved 91.82% precision and 90.32% recall on the physionet dataset while demonstrating robustness in practical scenarios with noisy data. our approach holds promise to enable fetal and maternal well-being monitoring in the home setting, saving cost and labor as well as supporting society in special pandemic scenarios such as covid-19. moreover, with an approach to making the model more transparent through the interpretation of its decisions, the method can be better adopted by clinicians to augment diagnosis. funding: this research was funded by the national science foundation career award #1917105 to hung cao.
references:
trends in fetal and perinatal mortality in the united states
electronic fetal monitoring in the united states
unobtrusive continuous monitoring of fetal cardiac electrophysiology in the home setting
fetal phonocardiography signal processing from abdominal records by non-adaptive methods
congenital heart defects in the united states: estimating the magnitude of the affected population in 2010
a review of signal processing techniques for non-invasive fetal electrocardiography
blind signal separation: statistical principles
noninvasive fetal electrocardiogram extraction: blind separation versus adaptive noise cancellation
fetal ecg extraction during labor using an adaptive maternal beat subtraction technique
a deep learning approach for fetal qrs complex detection
fetal qrs detection based on convolutional neural networks in noninvasive fetal electrocardiogram
deep learning for detection of fetal ecg from multi-channel abdominal leads
drop an octave: reducing spatial redundancy in convolutional neural networks with octave convolution. ieee/cvf international conference on computer vision (iccv)
time series classification from scratch with deep neural networks: a strong baseline
deep learning for time series classification: a review
visual explanations from deep networks via gradient-based localization
noninvasive fetal ecg: the physionet/computing in cardiology challenge
non-invasive fecg extraction from a set of abdominal sensors
batch normalization: accelerating deep network training by reducing internal covariate shift
transfer learning for time series classification
learning phrase representations using rnn encoder-decoder for statistical machine translation
"why should i trust you?": explaining the predictions of any classifier
a unified approach to interpreting model predictions
towards understanding ecg rhythm classification using convolutional neural networks and attention mappings
learning deep features for discriminative localization
a method for stochastic optimization. arxiv
understanding the difficulty of training deep feedforward neural networks
dropout: a simple way to prevent neural networks from overfitting
clip-q: deep network compression learning by in-parallel pruning-quantization
xception: deep learning with depthwise separable convolutions

this article is an open access article distributed under the terms and conditions of the creative commons attribution (cc by) license. the authors would like to thank q. d. nguyen, s. sarafan, and a. m. naderi at uc irvine for providing us with the new dataset with motion noise and gaussian noise added. the authors declare no conflict of interest.

key: cord-330474-c6eq1djd title: rapid translation of clinical guidelines into executable knowledge: a case study of covid‐19 and on‐line demonstration date: 2020-06-18 journal: learn health syst doi: 10.1002/lrh2.10236 sha: doc_id: 330474 cord_uid: c6eq1djd the polyphony programme is a rapidly established collaboration whose aim is to build and maintain a collection of current healthcare knowledge about the detection, diagnosis and treatment of covid-19 infections, and to use artificial intelligence (knowledge engineering) techniques to apply the results in patient care. the initial goal is to assess whether the platform is adequate for rapidly building executable models of clinical expertise, while the longer-term goal is to use the resulting covid-19 knowledge model as a reference and resource for medical training and research and, with partners, to develop products and services for better patient care. in this polyphony progress report we describe the first prototype of a care pathway and decision support system that is accessible on openclinical.net, a knowledge sharing repository.
pathfinder 1 demonstrates services including situation assessment and inference, decision making, outcome prediction and workflow management. pathfinder 1 represents encouraging evidence that it is possible to rapidly develop and deploy practical clinical services for patient care, and we hope to validate an advanced version in a collaborative internet trial. finally, we discuss wider implications of the polyphony framework for developing rapid learning systems in healthcare, and how we may prepare for using ai in future public health emergencies. this article is protected by copyright. all rights reserved.

the covid-19 emergency is a massive challenge to human expertise and organisation, but it is also widely recognised as an opportunity to demonstrate, test and improve medical technologies, including ai techniques for delivering rapid learning systems. over recent years we have been developing a flexible methodology for creating executable models of specialist clinical expertise and a platform for sharing these models called openclinical (www.openclinical.net; fox et al, 2013). openclinical is one of a number of efforts in recent years to use knowledge engineering and other techniques to formalise clinical guidelines as "executable knowledge" (e.g. peleg et al, 2010; friedman 2019). although systematic reviews, clinical guidelines etc have been important tools of the evidence-based medicine movement, their impact in improving consistency and quality of care has been less than hoped for, because they are disseminated purely as human-readable content (e.g. text, diagrams) in a traditional, often slow, publication and revision cycle.
openclinical has made use of one particular approach to formalising clinical guidelines, based on a specialised modelling language called proforma (fox and das, 2000; sutton and fox 2003); we adopted this approach because it has been used and trialled successfully in many medical applications (fox et al, in press) and we have wide experience using it. we see openclinical as the basis of a rapid learning system, as illustrated in figure 1, which shows a knowledge life cycle for creating and maintaining executable models of care using the openclinical.net knowledge modelling and publishing platform. the top arrow of the cycle represents the creation and testing of models using proforma authoring software; the next step, on the right, is to publish them on the openclinical.net repository (which currently carries 50+ examples of executable models for many clinical settings and specialties: https://dev.openclinical.net/index.php?id=69). figure 1: the openclinical knowledge-to-data cycle for rapid learning systems. this lifecycle overlaps with other rls proposals, notably friedman et al's mcbk proposals (2019), but is distinctive in using proforma, a specialised ai language, for modelling and in supporting open-source sharing of models. although proforma has been successful in these roles, the goal of openclinical is primarily to provide a platform for crowdsourcing, validating and disseminating models, not to deliver clinical services directly. a key aim of the openclinical project is to "close the loop" (left arrow) by incorporating software for acquiring clinical data from clinical implementations and trials of the models. this will permit the use of data mining, machine learning and other techniques to update the evidence base and refine clinical guidelines and other care quality standards.
the polyphony project was initiated on 18 march 2020 with the following mission: to create, validate, publish and maintain knowledge of best medical practice regarding the detection, diagnosis and management of covid-19 infections, in a computer-executable form. the purpose is to provide a resource for clinicians and researchers, healthcare provider organisations, technology developers and other users, to (1) develop point-of-care products and services which (2) embody best clinical practice in decision-making, workflow, data analysis and other "intelligent" services across the covid patient journey. 2. how useful is the openclinical knowledge sharing framework for empowering clinicians to critique and improve models of decision-making and care across the patient journey? 3. is it possible to adapt components of the model for use in different clinical settings or in local variants of care pathways? 4. what is the potential for combining knowledge engineering methods with techniques from data science (e.g. statistical analysis, data mining and machine learning)? this article is a progress report on our first prototype, pathfinder 1, and the design and engineering framework we have developed to support the polyphony mission. the on-line demonstrator is not complete or "clinical strength", but this is the goal of future cycles of figure 1. however, sufficient progress has been made that it may be of interest to the rls community. the proforma language is based on a general framework for modelling tasks, including reasoning, decision-making and planning. task models can be applied to knowledge about a particular medical domain, such as the diagnosis and treatment of covid-19 patients, formalised in proforma or available in external resources. the openclinical knowledge model is illustrated schematically in figure 2. at the bottom of the ladder are symbols (e.g. "fever", "38.6") which can be combined to represent data (e.g.
presenting symptoms include fever) and concepts (e.g. diagnosis is a kind of decision; pathway is a kind of plan) and descriptions (e.g. patient histories). proforma can be used to model knowledge as rules for inference or action. where inference is uncertain, rules can participate in complex decisions as a basis for constructing arguments for and against competing decision options. finally, decisions, actions and enquiries (actions that acquire information) can be composed into plans to achieve particular objectives. in principle, plans can be encapsulated as agents that can carry out complex behaviour autonomously, though agent modelling is not within the scope of the polyphony programme. proforma can be used to model clinical guidelines of various types, including medical logic and recommendations, decision trees, clinical algorithms etc. early development of the covid-19 knowledge base in pathfinder 1 relied primarily on documentation published by bmj best practice in march 2020. the openclinical modelling and testing platform was used by an experienced proforma modeller, advised by clinicians with specialist knowledge in public health and primary care. in pathfinder 1, the concepts and components of the bmj best practice guidance were mapped directly to levels of the knowledge ladder. the resulting covid-19 model consists of five main sections: the data model; clinical contexts (data sets and scenarios); rules (inference and alerts); decisions (arguments and evidence); and care pathways. in the following paragraphs we summarise how these are modelled in the various modules of pathfinder 1. a module overview can also be seen in the graphic on the pathfinder page on openclinical. the openclinical platform includes a tool to create data definitions for relevant clinical and other parameters, their types (strings, integers, date-times etc) and other properties.
the full model for pathfinder 1 consists of 46 data definitions based on the bmj best practice documentation (march 2020). these sit at the symbol level of the knowledge ladder in figure 2. a clinical context is typically a scenario on the patient journey in which one or more decisions may be taken and for which subsets of the data model are relevant. one scenario is "patient triage", where a handful of questions may be asked, relevant to a decision whether to do nothing, advise self-isolation or book an ambulance. another context is "hospital work-up", which covers a detailed patient history to inform a provisional diagnosis and an initial selection of investigations. based on the bmj guidance (op cit), ten contexts were identified and modelled. pathfinder 1 emphasises pre-hospital scenarios, with less emphasis on hospital and acute care, though by the time of publication bmj best practice had added more guidance on the latter stages of the journey. as shown in the knowledge ladder, proforma supports several knowledge types, of which the simplest is the if…then… rule. rules respond to situations or events expressed in boolean logic. for example, if a patient's temperature is > 38.6, pathfinder infers there is a fever; when hazardous situations are detected, "red flags" are raised. other uses of rules are to alert the user if a task has not been carried out or is overdue, or when a patient is eligible for inclusion in an open research trial (e.g. patkar et al, 2013). a proforma decision model consists of a set of options and a set of "arguments" (pros and cons) associated with each option. if an argument expression is evaluated against current patient data or other information and found to be valid, it is recorded as a reason for (or against) the relevant option, with an explanation and, if required, supporting evidence that justifies the argument.
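as a toy illustration of the argument scheme just described, here is a sketch in python rather than proforma syntax. the options, weights and most conditions are invented for the example; only the fever threshold of 38.6 comes from the text:

```python
# each argument: (option, weight, condition over patient data)
# positive weight argues for the option, negative weight against it
ARGUMENTS = [
    ("advise_self_isolation", +1.0, lambda d: d["temperature"] > 38.6),  # fever rule
    ("advise_self_isolation", +1.0, lambda d: "cough" in d["symptoms"]),
    ("book_ambulance",        +2.0, lambda d: d["breathless"]),
    ("do_nothing",            +1.0, lambda d: not d["symptoms"]),
    ("do_nothing",            -1.0, lambda d: d["temperature"] > 38.6),
]

def evaluate(data):
    # aggregate the valid arguments into a per-option confidence score
    scores = {}
    for option, weight, condition in ARGUMENTS:
        if condition(data):
            scores[option] = scores.get(option, 0.0) + weight
    return scores

patient = {"temperature": 39.1, "symptoms": ["cough"], "breathless": False}
scores = evaluate(patient)
best = max(scores, key=scores.get)
```

a real proforma engine attaches explanations and evidence to each valid argument and supports several aggregation algorithms; the simple weighted sum here stands in for that machinery.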
all options in an active decision context accumulate patient-specific pros and cons as data are acquired, and the decision engine aggregates these to provide a continuously updated measure of confidence in each option. in the initial modelling stage arguments are usually modelled qualitatively, but if statistical or other data are available they can be assigned quantitative weights, which can be aggregated using various possible decision algorithms. eight main decisions have been modelled for the covid-19 patient journey, including triage, diagnosis, prognosis, prediction of complications and choice of management plan. all decisions are concurrently active in pathfinder 1, but they can also be deployed at specific points or in particular scenarios in the care pathway. a "pathway" is a network of decisions and tasks for acquiring data and carrying out plans in a sequential and/or conditional way, as illustrated in figure 3. in the first version there is an enquiry (green diamond) about a small number of key data relevant to an initial assessment (e.g. presenting complaints, age of patient and whether the patient seems ill). if required, a more detailed history is taken, but either way an escalation decision follows. a later version, shown in the second panel, is based on more detailed guidance to gps published later by bmj. pathfinder 1 can be accessed via the link https://www.openclinical.net/index.php?id=68, where there are instructions for running the demonstrator against the example cases provided or against the user's own cases. figure 4 shows three screenshots from a typical run against example case 1. when running the on-line demonstrator the following points should be borne in mind: (1) the model is developed as a "standalone" resource for testing and validation and is not intended to be deployed directly into clinical use.
(2) the model is incomplete with respect to current covid-19 guidance, which has evolved significantly since completion of the first modelling cycle; we expect to make significant revisions to the data model and knowledge content in light of experience and feedback before starting on the next modelling cycle (pathfinder 2). assessment against project objectives 1. creation of an executable model of best practice for supporting care of covid-19 patients. proforma is capable of modelling the main decisions and workflows across the covid-19 patient journey. pathfinder was developed in about three weeks but is incomplete with respect to published medical knowledge and recommended practice, and the data, decision and pathway models all require improvement and validation by qualified clinicians. our provisional assessment is that the model could be completed and maintained on a rapid timescale. a complementary pathway for hospital care of covid-19 patients has also been rapidly developed in proforma by deontics ltd 4 . the focus of pathfinder 2 is on extending the pathways for additional clinical contexts (e.g. comorbidity management). 2. appraisal of the openclinical platform as a practical knowledge sharing infrastructure. pathfinder 1 was deployed on openclinical.net as a globally accessible, open access demonstrator in early april 2020. a modified version was made in response to comments and suggestions of the clinical authors and made privately available for testing by them shortly after. the incorporation of example test cases provided by clinical collaborators and independent reviewers helps to quickly familiarise users with the demonstration and to critique the decision models and pathways against realistic patient data.
3. reusability of the data and knowledge models at different points in the care journey. the "escalation decision" in the reference model was initially used in a self-triage pathway and reused in the residential care triage pathway of pathfinder 1, but was replaced with three different decisions in the gp consultation pathway in light of clinical comments and new published guidance. in the latter pathway the diagnosis decision from the pathfinder 1 model was reused without change, but significant changes were made to the initial assessment scenario and additional decisions about diagnosis and appropriate actions were added to the later pathway model. at this early stage of the project we have not been able to progress this question, but we are seeking to collaborate with data science specialists in the next cycle of the project. initial discussions suggest, however, that once case records for a population of patients are available it will be practical to exploit well established statistical and machine learning methods to calibrate argument weights for patient populations, for example, and there is also scope for symbolic machine learning methods to suggest extensions to the logical knowledge model (e.g. rule induction methods). as explained above, the main goal of the polyphony project is to develop a "reference" data and knowledge model, not to deploy the model directly but to be a resource for others to use in developing clinical services. this is in part due to the complexities of integrating decision support and other services with existing it infrastructure. as mentioned above, deontics ltd have developed a decision support application for hospital patient assessment and deployed it in the emergency department of the liverpool and broadgreen university hospital. integration with the hospital emr and infrastructure has been achieved, but it was significantly more complex than the proforma modelling step.
readers interested in deployment of services will ask how proforma engages with relevant technical standards, such as medical coding and terminology standards (e.g. icd10; loinc), ontologies (e.g. snomed ct, uml), logical expressions (e.g. gello) and rules (e.g. arden syntax; cql). proforma overlaps conceptually but does not comply with these standards. the emphasis in proforma is on capturing knowledge of these and other kinds in a single language for which it offers a unified syntax and execution semantics (sutton and fox 2003) . this is highly preferable to modelling a complex body of knowledge with ad hoc combinations of notations . the development of hl7 fhir resources has a similar motivation and we are interested in exploring whether proforma can implement certain classes of fhir resources, such as the fhir plan definition (https://www.hl7.org/fhir/plandefinition.html) and fhir executable knowledge artifact (https://www.hl7.org/fhir/clinicalreasoning-knowledge-artifact-representation.html). we think that this may make integration and deployment easier without losing the simplicity and intuitiveness of the proforma and polyphony framework. if the polyphony model is to have a sustainable future it will need to be repeatedly updated as knowledge of covid-19 and current clinical practice change. maintenance can take place at two levels, the implementation (e.g. software) level and the practice (knowledge) level. we anticipate that the maintenance of the implementation can be managed through standard software engineering and version control methods. this allows messages and documentation outlining the rationale for the change to be associated with each iteration of the implementation. present version control tools also keep a history of all changes, including highlighting differences between versions for easy review/audit. classical version control would, for example, be appropriate at the level of hl7 resources and their mappings to the knowledge model. 
maintenance at the knowledge level is likely to be very different, because the principal stakeholders here are professional clinicians, researchers and other subject matter experts. polyphony was established under the auspices of the openclinical knowledge sharing project, which seeks to establish a framework for knowledge dissemination analogous to open access publishing. key here is the need for models to be intuitive and open to review and criticism by professionals who have little expertise or interest in technicalities. new versions of knowledge models need to specify their provenance and the evidence on which changes are based, and in clinical use any unexpected behaviour must be explainable in appropriate medical terms. if covid-19 remains clinically challenging in the medium term, it would be desirable to recruit experienced healthcare professionals to contribute to modelling the knowledge underlying decisions and pathways. through openclinical we hope to support a sustainable community of practice to promote discussion and debate and to own and maintain the reference knowledge base. a possible organisational structure could be analogous to the "chromosome committees" of the human genome project, in that specialist professional groups of gps, emergency medicine clinicians etc. would take responsibility for data and knowledge modelling in specific contexts along the patient journey, while adopting common data and knowledge representation standards. in the longer term, with the hoped-for arrival of an effective vaccine or treatments, the covid-19 emergency may pass or become tolerated as a seasonal burden like flu. these futures are controversial, and we take no position on them, but it is widely accepted that the covid-19 pandemic is only the latest in a series of infections with major consequences for human populations and there will be more to come. it will be important to have "rapid response" as well as "rapid learning" mechanisms in place.
polyphony may help to inform the design of policies and mechanisms by which expert and experienced healthcare professionals can form rapid response teams to address emerging threats. a longer-term objective is based on the proposition that the polyphony approach is not limited to the covid-19 emergency, nor even only to infections. the methods outlined here are applicable to rapid deployment of executable clinical guidelines and quality standards generally. we believe they can be used to create open access and open source models of practice for many conditions, whether acute or chronic, commonplace or rare, "from home to hospital to home". references: implementing nice guidance; cognitive systems at the point of care: the credo programme; openclinical.net: a platform for creating and sharing knowledge and promoting best practice in healthcare; syntax and semantics of the proforma guideline modelling language; delivering clinical decision support services: there is nothing as practical as a good theory; using computerised decision support to improve compliance of cancer multidisciplinary meetings with evidence-based guidance; comparing computer-interpretable guideline models: a case-study approach. thanks to ali rahmanzadeh and david sutton, who were the original developers of the tallis authoring suite and proforma execution engine and generously provided technical support during the project. john fox is founder and non-executive director of deontics ltd. he also works pro bono as director of openclinical cic, a non-profit community interest company. joht singh chandan, omar khan, jenny cooper, neil cockburn, krishnarajah nirantharakumar and carla pal state that they have no conflicts of interest. both hywel curtis and andrew wright, who worked in advisory capacities as consultants on the project, also assert they have no conflicts of interest. this article is protected by copyright. all rights reserved.
key: cord-325862-rohhvq4h authors: zhang, yong; yu, xiangnan; sun, hongguang; tick, geoffrey r.; wei, wei; jin, bin title: applicability of time fractional derivative models for simulating the dynamics and mitigation scenarios of covid-19 date: 2020-06-04 journal: chaos solitons fractals doi: 10.1016/j.chaos.2020.109959 sha: doc_id: 325862 cord_uid: rohhvq4h fractional calculus provides a promising tool for modeling fractional dynamics in computational biology, and this study tests the applicability of fractional-derivative equations (fdes) for modeling the dynamics and mitigation scenarios of the novel coronavirus for the first time. the coronavirus disease 2019 (covid-19) pandemic radically impacts our lives, while the evolution dynamics of covid-19 remain obscure. a time-dependent susceptible, exposed, infectious, and recovered (seir) model was proposed and applied to fit and then predict the time series of covid-19 evolution observed over the last three months (up to 3/22/2020) in china. the model results revealed that 1) the transmission, infection and recovery dynamics follow the integral-order seir model with significant spatiotemporal variations in the recovery rate, likely due to the continuous improvement of screening techniques and public hospital systems, as well as full city lockdowns in china, and 2) the evolution of number of deaths follows the time fde, likely due to the time memory in the death toll. the validated seir model was then applied to predict covid-19 evolution in the united states, italy, japan, and south korea. in addition, a time fde model based on the random walk particle tracking scheme, analogous to a mixing-limited bimolecular reaction model, was developed to evaluate non-pharmaceutical strategies to mitigate covid-19 spread. preliminary tests using the fde model showed that self-quarantine may not be as efficient as strict social distancing in slowing covid-19 spread. 
therefore, caution is needed when applying fdes to model the coronavirus outbreak, since specific covid-19 kinetics may not exhibit nonlocal behavior. particularly, the spread of covid-19 may be affected by the rapid improvement of health care systems, which may remove the memory impact in covid-19 dynamics (resulting in a short-tailed recovery curve), while the death toll and mitigation of covid-19 can be captured by the time fdes due to the nonlocal, memory impact in fatality and human activities. fractional calculus can provide a promising tool in modeling biological phenomena, as reviewed recently by ionescu et al. [1] . for example, fractional-derivative equation (fde) models have been applied to capture complex dynamics in biological tissues [2] , tumor growth [3] , dna sequencing [4] , drug uptake [5] , and salmonella bacterial infection in animal herds [6] . most recently, fde models have been applied to model the pine wilt disease [7] , the human respiratory syncytial virus (hrsv) disease [8] , the harmonic oscillator with a position-dependent mass [9] , the human liver using the caputo-fabrizio fractional derivative [10, 11] , and tumor-immune surveillance [12] . motivated by these successful applications, this study tests whether fdes can be applied to model the dynamics and mitigation scenarios of coronavirus, an emerging and critical research area that has not been a focus of the fractional calculus community before. the novel coronavirus disease 2019 (covid-19) outbreak, a respiratory illness that started (was first detected) in late december 2019, is a pandemic infecting >336,000 people in more than 140 countries with an average fatality rate of 4.4% globally (data up to 3/22/2020) [13] . the covid-19 pandemic is infiltrating almost every aspect of life, damaging the global economy, and altering both man-made and natural environments.
urgent actions have been taken, but further effective and efficient strategies are promptly needed to confront this global challenge. to address this challenge and promptly guide the next efforts, it is critical to model the covid-19 outbreak. mathematical models are among the necessary tools to quantify covid-19 and, therefore, testing the applicability of such fde models under this new global pandemic is the primary objective motivating this study. this study aims to model the covid-19 evolution dynamics (i.e., transmission, infection, recovery, and death evolution) for representative countries with apparent coronavirus cases, including china, the united states (u.s.), italy, japan, and south korea, using mathematical models, most specifically the fde models described previously. it should be noted that the covid-19 spread in these countries experienced different starting (initiation and detection) times, an important fact to consider when applying these models. for example, china had passed the peak of the coronavirus outbreak, finally reaching a milestone with no new local infections on 3/19/2020 (79 days from the onset, 12/31/2019, in wuhan, china), while the u.s. coronavirus cases soared past 10,000 on the same day. this study will apply the core characteristics of the covid-19 outbreak obtained in china to estimate the covid-19 spread in the u.s., as well as other countries where the number of affected people has not yet reached its peak. in addition, this new pandemic may last for a relatively longer time than expected [14] . no vaccine against sars-cov-2 (severe acute respiratory syndrome coronavirus 2) is currently available [15] . indeed, a vaccine for prevention and infection control may not be ready before march 2021 for covid-19, considering a minimum of 4~5 weeks for trials and at least 1 year for safety evaluation and final deployment. efficient strategies are therefore needed to mitigate the covid-19 outbreak.
possible non-pharmaceutical scenarios, such as isolation of cases and contact tracing, can be evaluated using mathematical models [16] in order to identify the most efficient mitigation strategies going forward. this is another major motivation and the secondary task of this study using the fde models. to address the questions mentioned above, this study is organized as follows. section 2 proposes an updated seir model with an fde-based component for covid-19, where "s", "e", "i", and "r" stand for susceptible, exposed, infectious, and recovered people, respectively [17] . this model is then applied to fit and predict the covid-19 spread in various provinces and major cities in china, resulting in abundant datasets to derive the core characteristics of the evolving covid-19 dynamics. section 3 predicts the spread of covid-19 in the other countries using the knowledge gained from the china case study. section 4 proposes a fully lagrangian approach with the time fde to model the spatiotemporal evolution of covid-19, and then applies it to evaluate non-pharmaceutical scenarios to mitigate the virus spread. section 5 reports the main conclusions, specifically the feasibility of fdes for capturing covid-19 evolution. in the appendix, the impacts of the non-singular kernel and fractional derivative type on model results are further discussed for readers particularly interested in these fractional calculus techniques. the main contributions of this work, therefore, include 1) the first application of fdes in modeling the evolution of the covid-19 death toll, 2) an updated seir model with a transient recovery rate to better capture the dynamics of the covid-19 pandemic within china and for other countries, and 3) a particle-tracking approach based on stochastic bimolecular reaction theory to evaluate the mitigation of the spread of the covid-19 outbreak. several mathematical models have been applied for epidemic analysis of covid-19.
the most widely used one, to date, is the well-known seir model. for example, peng et al. [18] (using data up to 2/16/2020) proposed a generalized seir model to successfully estimate the key epidemic parameters of covid-19 in china, and predicted the inflection point and ending time of confirmed covid-19 cases. the seir model was also applied by li et al. [19] (using the observed data up to 2/6/2020) to compare the effect of city lockdowns on the transmission dynamics for different cities in china. the sir model with time-dependent transmission and recovery rates was used by chen et al. [20] (using data up to 2/20/2020) to analyze and predict the number of confirmed cases of covid-19 in china. the sir model was extended by wang et al. [21] (using data up to 2/12/2020) to incorporate various time-varying quarantine protocols for assessing interventions on the covid-19 epidemic in china. the seir model and its modifications were also successfully applied by others [22] [23] [24] [25] , mostly for assessing the early spreading of covid-19 in china. previous applications of the popular seir model, however, may contain high uncertainty since they had limited data access for only a short period of the covid-19 outbreak. as will be shown below, the covid-19 dynamics have changed dramatically in the last three months, likely due to the improvement/adjustment of screening/testing techniques, public hospital system capabilities, and the government's control policies for contagious diseases. an updated version of the seir model is therefore needed and can be reliably built now for china, as more detailed and complete datasets are available that include the coronavirus outbreak data (i.e., # of people infected) well beyond the peak in china.
other models have also been used for specific purposes related to covid-19, including the global metapopulation disease transmission model to predict the impact of travel limitations on the epidemic spread [26] , the transmission model for risk assessment [27] , and the synthetic contact matrix model for quantifying reproductive ratios for covid-19 [28] . to the best of our knowledge, the classical stochastic epidemic models, such as the discrete or continuous time markov chain model and the stochastic differential equation model, have not yet been applied for covid-19 spread scenarios. a preliminary stochastic model built upon the fde for evaluating covid-19 spread in a city will be developed and applied in section 4. this section focuses on the deterministic model. the classical seir model considers constant parameters, which may not adequately capture time-dependent dynamics of epidemic (the covid-19) spread. hence, the seir model is updated in section 2.1, and then tested in section 2.2. the classical seir model containing four populations (s, e, i, and r) takes the form [29] :

$$\frac{dS(t)}{dt} = -\frac{r\,p}{N}\,S(t)\,I(t), \quad \frac{dE(t)}{dt} = \frac{r\,p}{N}\,S(t)\,I(t) - \alpha\,E(t), \quad \frac{dI(t)}{dt} = \alpha\,E(t) - \gamma\,I(t), \quad \frac{dR(t)}{dt} = \gamma\,I(t), \tag{1}$$

where $S$ is the stock of susceptible population; $E$ is the number of persons exposed to or in the latent period of the disease; $I$ is the stock of infected population; $R$ is the stock of recovered population; $r$ denotes the number of susceptible people whom the infected people contact daily; $N = S(t) + E(t) + I(t) + R(t)$ is the sum of all the four groups of people (which is a constant, representing the constancy of population $N$); $p$ is the constant rate of infection (i.e., representing the probability for the infected people to transform the susceptible people into infected ones); $\alpha$ is the constant rate for the exposed person to be transformed into an infected one; and $\gamma$ is the constant recovery rate (defining the speed for the infected person to be cured or expired).
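the classical seir model (1) can be integrated with a simple explicit scheme. the sketch below uses forward euler with illustrative (not fitted) parameter values:

```python
# Forward-Euler integration of a classical SEIR model. The parameter values
# below are illustrative only, not the paper's fitted values.

def seir(S, E, I, R, r, p, alpha, gamma, dt, steps):
    N = S + E + I + R                      # total population, constant
    out = []
    for _ in range(steps):
        new_exposed = r * p * S * I / N    # susceptibles contacted and exposed
        dS = -new_exposed
        dE = new_exposed - alpha * E
        dI = alpha * E - gamma * I
        dR = gamma * I
        S, E, I, R = S + dS*dt, E + dE*dt, I + dI*dt, R + dR*dt
        out.append((S, E, I, R))
    return out

traj = seir(S=9990, E=0, I=10, R=0, r=10, p=0.02, alpha=0.2,
            gamma=0.1, dt=0.1, steps=1000)
print(round(sum(traj[-1])))   # the four compartments always sum to N: 10000
```

because the four right-hand sides sum to zero, the scheme conserves the total population exactly, a useful sanity check on any implementation.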
to allow for possible time-sensitive rates and time nonlocal-dependency for covid-19 evolution, we revise model (1) as:

$$\frac{\partial^{\eta_1} S}{\partial t^{\eta_1}} = -\frac{r\,p}{N}\,S\,I - \frac{r_2\,p_2}{N}\,S\,E, \tag{2a}$$
$$\frac{\partial^{\eta_2} E}{\partial t^{\eta_2}} = \frac{r\,p}{N}\,S\,I + \frac{r_2\,p_2}{N}\,S\,E - \alpha\,E, \tag{2b}$$
$$\frac{\partial^{\eta_3} I}{\partial t^{\eta_3}} = \alpha\,E - \gamma(t)\,I, \tag{2c}$$
$$\frac{\partial^{\eta_4} R}{\partial t^{\eta_4}} = (1-\kappa)\,\gamma(t)\,I, \tag{2d}$$
$$\frac{\partial^{\eta_5} D}{\partial t^{\eta_5}} = \kappa\,\gamma(t)\,I, \tag{2e}$$

where $D$ represents the number of deaths (which is one component of $I$); $p_2$ is the rate for the healthy, susceptible person to be transferred to an infected one from exposed persons (note that covid-19 patients in the incubation period might be contagious too); and $r_2$ is the number of healthy, susceptible people that are contacted by exposed people daily. now the infection rates can change with time, and the infected persons are removed from the risk of infection via a time-dependent rate term $\gamma(t)\,I$. if the recovered individuals can return to a susceptible status due to, for example, a loss of immunity, then the partial differential equation (pde) (2d) for the time rate of change of $R$ requires one more (sink) term: $-\delta\,R$, where $\delta$ is the rate of the recovered individuals returning to a susceptible status. we replace the integer-order derivatives in the classical seir (1) by the fractional-order derivatives in (2), to capture the possible nonlocal impact, such as the memory impact or any apparent delay, in the covid-19 outbreak. herein we use the death evolution (2e) as an example. the fractional derivative in the fde (2e), which contains the death probability $\kappa$ (while the other patients are cured), is defined as:

$$\frac{\partial^{\eta_5} D(t)}{\partial t^{\eta_5}} = \frac{1}{\Gamma(1-\eta_5)} \int_0^t \frac{dD(\tau)}{d\tau}\,\frac{d\tau}{(t-\tau)^{\eta_5}},$$

which is the caputo fractional derivative [30, 31] with order $\eta_5 \in (0, 1]$ (note that all the other indexes $\eta_1$, $\eta_2$, $\eta_3$, and $\eta_4$ in model (2) are in the same range of (0, 1]). the caputo fractional derivative was selected here because the initial condition for the caputo derivative takes the same form as that for the integer-order pde. particularly, the caputo derivative allows the utilization of initial values (of the integer order) with known physical interpretations. when the order $\eta_5 = 1$, model (2e) reduces to the classical integer-order pde for the death evolution. the fractional pde (2e) is used here for two reasons.
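the caputo derivative in an equation like (2e) can be discretized with the standard l1 finite-difference scheme. the sketch below solves a generic equation of the form (caputo derivative of order eta of D) = f(t) and checks it against a known analytical solution; for the death equation, f(t) would be the death source term (death probability times recovery rate times infected population), which is not modelled here:

```python
import math

# L1 finite-difference scheme for a Caputo fractional derivative of order
# 0 < eta < 1, used to step an equation of the form
#   d^eta D / dt^eta = f(t),   D(0) = D0.
# This is a generic numerical sketch, not the paper's solver.

def solve_caputo(f, D0, eta, dt, steps):
    g = math.gamma(2.0 - eta)
    # L1 history weights: b_k = (k+1)^(1-eta) - k^(1-eta)
    b = [(k + 1)**(1 - eta) - k**(1 - eta) for k in range(steps)]
    D = [D0]
    for n in range(1, steps + 1):
        # memory term: weighted sum over all past increments of D
        hist = sum(b[k] * (D[n - k] - D[n - k - 1]) for k in range(1, n))
        D.append(D[n - 1] - hist + g * dt**eta * f(n * dt))
    return D

# check: the Caputo derivative of u(t) = t is t^(1-eta)/Gamma(2-eta), so
# solving with that right-hand side should recover u(t) = t exactly
eta, dt, steps = 0.8, 0.01, 100
rhs = lambda t: t**(1 - eta) / math.gamma(2 - eta)
D = solve_caputo(rhs, 0.0, eta, dt, steps)
print(abs(D[-1] - 1.0) < 1e-6)   # True: the linear solution is reproduced
```

note the growing history sum: every step revisits all past increments, which is precisely the "memory" that distinguishes the fde from the time-local model.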
first, the evolution of deaths and cures may be characterized by a random process, due to the fact that the exact time for the (recovered) person to be initially infected is unknown (i.e., the patient that died or was cured today may have been diagnosed yesterday or last week). second, some patients may not be treated in time after being infected, making the death toll evolve with a time memory. therefore, we extend the classical mass-balance equation of death cases using the fde to characterize the random property and memory impact embedded in the temporal evolution of mortality. the fractional pde (2e) and its classical version will be compared below using real (observed) data. we apply the seir model (2) to the observed data. the late-time "current infected population" does not exhibit any tailing behavior, which supports the application of the integer-order, time-local model (2c). the fast decline of the late-time "current infected population" is most likely due to the improvement of health care facilities, which tends to accelerate the recovery of infected people and remove the possible memory impact on the disease recovery evolution. other complex seir models were also applied to model covid-19 spread, such as the one proposed by tang et al. [32] , which has 12 rates/probabilities and 8 groups of people in seir. numerical results show that, compared with the seir model (2), the complex model proposed by tang et al. [32] (with solutions shown by the dotted lines in fig. 1a) accurately fits the observed data at the early stage, but then overpredicts the spread of covid-19 observed after 2/12/2020. the dynamics of transmission for covid-19, especially the recovery rate, therefore changed over time in china, likely due to time-dependent conditions (i.e., improvements/adjustments in medical care) as mentioned above. a dynamic time-local seir model, therefore, may be preferred for modeling covid-19 spread in china.
in addition, compared to the fde (2e), the best-fit solution using the classical pde for death evolution (see the black, dotted line in fig. 1a) slightly overestimates the late-time growth of mortality. the actual death toll in hubei province grew slower than that estimated by a constant rate model, indicating that the memory impact may affect the late-time dynamics of death and can be better captured by the fde (2e). the best-fit solutions using model (2) fit the evolution of the infected and recovered populations well for the data recorded from hubei province and three large cities closely related (such as in transportation or economic cooperation) to wuhan, china (fig. 1). the model was also shown to predict well the observed time series of covid-19 spread from 3/8/2020 to 3/22/2020 for most places, except for shanghai city (fig. 1d). this is likely due to the number of overseas covid-19 cases imported into shanghai, whereby the number of cases was observed to increase rapidly after 3/3/2020, causing inconsistency in the affected population and failure of the model. shanghai pudong international airport, one of the two airports located in shanghai city, is the eighth-busiest airport in the world and the busiest international gateway of mainland china. when excluding the coronavirus cases imported from overseas, model (2) predicts the covid-19 evolution data in shanghai (fig. 2). therefore, model (2) works well for various places in china that do not experience a significant influx of imported cases, as external sources can easily break the internal evolution, especially the asymptotic status, of covid-19 in china. the resultant time-dependent recovery rate γ(t) is depicted in fig. 3, where the rate fitted by the latest observation data point within the fitting period (i.e., 3/7/2020) remains stable in the following prediction period.
the best-fit recovery rate is the highest for shanghai (except for the impulse of γ(t) for wenzhou as discussed below), which is expected since shanghai has the best public health system of all of these cities. contrarily, hubei shows the lowest recovery rate, likely due to its delayed response and the relatively limited public health capability at the beginning of the outbreak compared with shanghai. wenzhou exhibits an impulse in the infection and recovery dynamics of coronavirus (fig. 1b), different from that observed for the other places studied herein. on february 27, 2020, the number of wenzhou's infected people suddenly declined, combined with a sudden increase in the number of people cured. this abrupt change can be effectively captured by the seir model (2) with an impulse in the recovery rate γ(t). this recovery impulse is most likely due to a new hospital, the no. 2 affiliated hospital of wenzhou medical university, recently built in this city in early february, which significantly improved the public health system. the first discharged cases of coronavirus from this hospital appeared in late february 2020, resulting in the sudden increase in the total recovery rate. in addition, a relatively large number of people working in wuhan returned to wenzhou in late january, and it appears that the improved, efficient screening process successfully identified the number of infected cases. the new cases were ~29,000 from 1/24/2020 to 1/31/2020 in wenzhou (with an average of 3,600 new patients per day), who were then immediately centralized for treatment. it appears that this fast response helped to alleviate the spread of coronavirus in wenzhou. the best-fit parameters of model (2) are listed in table 1, and the initial values for each group of people are listed in table 2. we reveal three behaviors in the model parameters. first, the best-fit "s"-shaped recovery rate γ(t) (fig.
3) can be described by a sigmoid function of the form γ(t) = c / (1 + a e^(−bt)) (where a, b, and c are factors), showing that the recovery rate increases exponentially before reaching a stable condition. this increase is likely due to the fact that healthcare facility capabilities improved with time (with an accelerating rate) before reaching their asymptote or maximum capacity. second, the rates and probabilities (r, r2, p, and p2) affecting the covid-19 transmission/infection evolution slightly change in space and remain stable for a given site (table 1). the small spatial fluctuation of these rates may be due to the similar control strategies implemented across these regions. we also introduce an index c to quantify the infection severity of covid-19 at different places, defined using the maximum number of cumulative infected people at the given site. a smaller c represents a greater infection severity of coronavirus. there is a power-law relationship between the regional population n and the maximum cumulated number of infected people (fig. 4). this empirical formula may be used to approximate the largest cumulative number of infections, which will be applied in the subsequent section for predicting the covid-19 evolution outside of china, where the coronavirus infection has not yet reached its peak number of cases. different countries are applying different modes to slow the spread of covid-19. in the next sub-sections, we discuss several representative countries and then fit/predict the virus spread there. to decrease the acceleration of covid-19 spread, italy's mode is now similar to china's: lockdown of the full population. the predicted covid-19 spread in italy is plotted in fig. 6a. although italy has followed china's mode of national isolation, the number of infected people increased rapidly from 3/14/2020 to 3/19/2020 (~495 new cases per day). to account for the delayed national quarantine compared with china, we decrease the c index (while also increasing the upper limit of the cumulative infection).
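the s-shaped recovery rate described above can be sketched as a logistic (sigmoid) curve. the exact functional form and the parameter values below are illustrative assumptions, not the paper's fitted factors:

```python
import math

# Sketch of an "S"-shaped recovery rate gamma(t) as a logistic (sigmoid)
# function: it grows roughly exponentially at first, then saturates at a
# plateau c as health-care capacity levels off. Form and values are
# illustrative, not the paper's fitted factors.

def gamma(t, a, b, c):
    return c / (1.0 + a * math.exp(-b * t))

a, b, c = 50.0, 0.15, 0.1          # c is the asymptotic recovery rate
early, late = gamma(0, a, b, c), gamma(60, a, b, c)
print(early < late <= c)           # monotone rise toward the plateau c: True
```

fitting a, b, and c against the observed cure counts (e.g. by least squares) would reproduce the kind of γ(t) curves shown in fig. 3.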
the covid-19 evolution prediction results show that there may be a turning point in the next two weeks, when the current infected cases begin to decline. we also separate the death toll from the number of recovered cases. south korea's mode of combating the spread of covid-19 is through fast detection and tracking of the disease. south korea is using efficient mobile diagnostic tests and accurate tracing of infected cases to maintain a low death rate even with a large infected population. the mobile method can test 20,000 people per day (the maximum capability on 3/12/2020), and apps for cell phones and/or credit cards can accurately track the routes of infected people with the help of local government (without an invasion of privacy), so that warnings can be immediately delivered to the general population to avoid the places with high risk. the current infected population may have passed its peak number of cases around 3/20/2020, and the prediction shows that the covid-19 outbreak may be well controlled in ~35 days from 3/22/2020 (fig. 6b). japan's mode of combating the spread of covid-19 is comparable with that of the u.s., in addition to other changes such as enhanced education/outreach and rapid treatment of the infected cases. specific policies include social distancing (which might be a key barrier to the spread of the novel coronavirus), personal hygiene, and quarantine of the infected cases. the current data and modeling results (fig. 6c) show that japan has so far found an efficient way to limit the maximum population infected and slow the spread of covid-19, while this outbreak may last for a while. the model-predicted covid-19 spread in the u.s., using the fitted infection and recovery rates from china (fig. 5), reveals the impact of one possible mitigation scenario for covid-19: coronavirus lockdowns, which have now been implemented by some states in the u.s. such as california and new york.
other non-pharmaceutical options can and should also be evaluated using mathematical models, considering the recent surge of infected cases in the u.s. when the number of infected persons is initially small compared to that of susceptible people, the infected and susceptible people are not well mixed and hence the system is not homogeneous. under such conditions, a stochastic model is needed, because deterministic, continuum models (such as the seir model) assume well-mixed components in a homogeneous system [33, 34]. hence, this section develops and applies a stochastic model to evaluate non-pharmaceutical scenarios for mitigating covid-19 with a small number of initial infections. the random-walk-based stochastic model for covid-19 spread is analogous to a mixing-limited bimolecular reaction mechanism [35]. when a reactant a particle (representing a susceptible individual a) meets a reactant b particle (representing an infectious person b), a chemical reaction may occur if the collision energy is large enough to break the chemical bond (meaning that the susceptible person a may be infected if additional criteria are satisfied, such as a and b being close enough and a touching his/her face after receiving coronavirus from b). therefore, the condition of a being infected is not deterministic but rather a random, probability-controlled process. this probability is related to various factors, such as the duration that a and b are in contact, the infectivity rate, and the distance between the two people, which may be characterized parsimoniously by the interaction radius r that controls the number of reactant pairs (susceptible + infectious) in a potential reaction (infection) [35]. hence, the core of the random-walk stochastic model used to simulate covid-19 spread is to define the interaction radius r. the analogous development and similarities between bimolecular reactions and the sir model can also be seen from their governing equations.
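this random-walk infection mechanism can be sketched in a few lines of python; the following is a minimal, hypothetical illustration (all parameter values are illustrative assumptions for demonstration, not the values used in the study):

```python
import numpy as np

def simulate_outbreak(n_people=200, n_infected=2, radius=0.05, p_infect=0.3,
                      removal_steps=10, n_steps=60, step_size=0.02, seed=1):
    """Random-walk sketch of the mixing-limited infection analogy:
    a susceptible 'A particle' may become infected when it comes within
    the interaction radius of an infectious 'B particle'."""
    rng = np.random.default_rng(seed)
    pos = rng.random((n_people, 2))           # people in a unit square
    state = np.zeros(n_people, dtype=int)     # 0 = S, 1 = I, 2 = removed
    state[:n_infected] = 1
    clock = np.zeros(n_people, dtype=int)     # steps since infection
    history = []
    for _ in range(n_steps):
        # Brownian displacement, kept inside the domain by clipping
        pos = np.clip(pos + rng.normal(0.0, step_size, pos.shape), 0.0, 1.0)
        susceptible = np.where(state == 0)[0]
        for i in np.where(state == 1)[0]:
            d = np.linalg.norm(pos[susceptible] - pos[i], axis=1)
            # infection is probability-controlled, not deterministic
            hits = susceptible[(d < radius) & (rng.random(d.size) < p_infect)]
            state[hits] = 1
        clock[state == 1] += 1
        state[(state == 1) & (clock > removal_steps)] = 2  # cured or dead
        history.append(int(np.sum(state == 1)))
    return history

curve = simulate_outbreak()   # number of currently infectious people per step
```

the returned curve plays the role of the "currently infected" time series discussed for the mitigation scenarios below; shrinking `radius` or `p_infect` mimics social distancing.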
the time-dependent sir model takes the form [36]:

ds/dt = -β(t) s(t) i(t) / n,     (4a)
di/dt = β(t) s(t) i(t) / n - γ(t) i(t),     (4b)
dr/dt = γ(t) i(t),     (4c)

where β(t) and γ(t) denote the transmission rate and recovery rate at time t, respectively. the rate equations for the irreversible bimolecular reaction a + b → c take the form [35]:

dc_a/dt = -k_f(t) c_a c_b,     (5a)
dc_b/dt = -k_f(t) c_a c_b,     (5b)
dc_c/dt = k_f(t) c_a c_b,

where c_a, c_b, and c_c denote the concentrations of a, b, and c, respectively, and k_f(t) is the forward kinetic coefficient of the reaction. equations (5a) and (5b) are functionally similar to equations (4a) and (4b), respectively, if the recovery rate γ(t) = 0. therefore, following the argument in zhang et al. [35] and lu et al. [37], we derive analytically the interaction radius r for the sir model (4) (equation (6)); r depends on the volume v of the domain, the time step Δt used in the random-walk particle tracking, the mass (or weight) m_a carried by each a particle, the initial concentration c_a0 of a (which can be taken as the normalized value 1 here), and the initial number s_0 of susceptible people. the movement of a, b, and c can be described by the following time fde (a fractional mobile/immobile transport equation) [35]:

∂c_j/∂t + β ∂^γ c_j/∂t^γ = -v ∂c_j/∂x + d ∂²c_j/∂x²,  j = a, b, c,     (7)

where c_j denotes the concentration of a, b, or c; β denotes the fractional capacity coefficient (which controls the ratio between the immobile and mobile population); v denotes the mean moving speed; and d denotes the macrodispersion coefficient. after defining the interaction radius r, the particle tracking scheme proposed by zhang et al. [35] and lu et al. [37], with particle trajectories defined by the time fde (7), can be applied to model the transmission of coronavirus between susceptible and infectious people. in addition to pharmaceutical strategies, including vaccine and therapeutic drug development, and herd immunity, which may either take a while or carry a high risk, non-pharmaceutical scenarios can be tested. several particle-tracking based stochastic models were proposed recently [38] to evaluate non-pharmaceutical scenarios to mitigate coronavirus spreading in a city. here we evaluate three related scenarios (described below) using the stochastic model proposed above.
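the time-dependent sir equations can be integrated numerically with a simple forward-euler scheme; the rate functions below are hypothetical choices for illustration, not fitted values from the study:

```python
def sir_time_dependent(beta, gamma, s0, i0, n, t_max, dt=0.1):
    """Forward-Euler sketch of the time-dependent SIR model:
    ds/dt = -beta(t)*s*i/n,  di/dt = beta(t)*s*i/n - gamma(t)*i,
    dr/dt = gamma(t)*i."""
    steps = int(round(t_max / dt))
    s, i, r = float(s0), float(i0), float(n - s0 - i0)
    out = [(0.0, s, i, r)]
    for k in range(steps):
        t = k * dt
        new_inf = beta(t) * s * i / n * dt   # newly infected this step
        new_rec = gamma(t) * i * dt          # newly recovered this step
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        out.append(((k + 1) * dt, s, i, r))
    return out

# hypothetical rates: transmission drops after a lockdown imposed at t = 20
traj = sir_time_dependent(beta=lambda t: 0.5 if t < 20 else 0.2,
                          gamma=lambda t: 0.1,
                          s0=9990, i0=10, n=10000, t_max=100)
```

making β(t) piecewise, as above, is the simplest way to represent an intervention such as a lockdown within this model family.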
in the stochastic model, we assume that 10 days after being infected, the person is removed because of being cured or having died; this choice is informed by the median disease incubation period, which has been estimated to be 5.1 days [39]. for simplicity, the interaction radius r (6) remains constant, since a constant interaction radius was found to efficiently capture the temporal variation of effective reaction rates in mixing-controlled reaction experiments and simulations [35, 37]. the initial numbers of a and b particles are 10,000 and 4, respectively. the lagrangian solutions of the covid-19 outbreak for the three scenarios are depicted in fig. 7. modeling results for scenarios 1, 2, and 3 show a peak in the curve of newly infected people at times t = 28, 65, and 32, respectively, demonstrating that the virus spreads fastest in the scenario without mitigation constraints (i.e., scenario 1, where the number of total infected people or cases increases by one order of magnitude every 10 days in the rising limb), as expected. however, the peak value for scenario 1 (198 people) is lower than that for scenario 3 (267 people), although the total number of infected people for scenario 1 (9,336) is slightly larger than for scenario 3 (9,322). this may be due to a greater separation of infection cases for the higher number of initial coronavirus carriers in scenario 1, which causes a lower and relatively flatter covid-19 evolution peak compared to that of scenario 3. scenario 2 has the lowest peak value (121 people) and the most-delayed peak in the curve for new cases, and the total infection time is almost doubled compared to the other two scenarios, indicating that people living with strict social distancing may also suffer from a much longer period of covid-19 threat. it is also noteworthy that the overall trend of the solution (simulation) of scenario 1 (initial surge without special constraints, fig.
7a) is similar to that for italy, which initially had a delayed response to the covid-19 outbreak (fig. 6a), and the scenario 3 solution (a lower peak value and a longer duration due to social distancing, fig. 7b) is similar to that observed in japan, where social distancing actions have been implemented (fig. 6c). the simulated particle plumes plotted in fig. 8 reveal the subtle discrepancies between the three mitigation scenarios. scenario 1 assumes that four initial cases were located on the right side of the city, while the whole population (10,000 susceptible persons) was distributed randomly in the 1×1 domain (fig. 8a). the trajectory of each person is assumed to follow (two-dimensional) brownian motion with retention (described by eq. (7)), to capture the random vector of each displacement and the random waiting time between two consecutive motions (described by the time fractional derivative term in eq. (7)). the virus moved quickly from east to west (figs. 8b and 8c), spreading over the entire city before all the infected people were cured or had died at time t = 69 (fig. 8d). a total of 664 susceptible people (6.6% of the total population), distributed randomly around the city, were never infected. scenario 2 assumes that social distancing can reduce the infection probability, which can be characterized by a smaller reaction rate or a smaller interaction radius in our lagrangian approach. the virus spread at a much lower rate than in scenario 1 (figs. 8e-8g), reaching a stable condition (i.e., the number of cases neither increasing nor decreasing) at a later time (t = 125) and leaving more susceptible people unaffected (2,734 in total, which is 27.3% of the population). therefore, social distancing is effective in limiting the spread of coronavirus among people.
note, however, that this scenario assumes that every person in the city strictly maintains social distancing; otherwise, a surge of infections may occur in the same way as shown in scenario 1. scenario 3 assumes self-quarantine. notably, not all infected people can be effectively quarantined, for the following reasons: 1) people can be infected without coronavirus symptoms; 2) people in the incubation period can transmit the infection; and 3) health care facilities and capabilities are limited relative to the large influx of patients. for example, according to imai et al. [40], many infected people could not be appropriately screened initially in wuhan city, china. under this condition, we assume that 50% of infected and diagnosed people are immediately quarantined, while the remaining infected people (fig. 8i) can still cause the spread of coronavirus (figs. 8j-8l). self-quarantine, therefore, may not be as effective as maintained social distancing. fractional calculus provides a useful tool for modeling complex dynamics in biology, and this study extended the fde approach to model the coronavirus outbreak of the covid-19 pandemic. third, a stochastic model based on the lagrangian scheme for the time fdes, analogous to a mixing-limited reaction mechanism model, showed that self-quarantine may not be as effective as strict social distancing, since not all infected people can be diagnosed and immediately quarantined. while strict social distancing can apparently slow covid-19 spread, the pandemic may then last longer. this is another case in which fractional calculus may be used to explore the covid-19 outbreak. therefore, one of the main contributions of this study is to extend the application of fdes to model the dynamics and mitigation scenarios of the coronavirus spread. this appendix quantifies the potential impacts of the non-singular kernel and the type of fractional derivative on the covid-19 death toll simulation.
first, the nonsingular time-fractional definitions also provide promising modeling tools for real-world fractional dynamics. for example, the atangana-baleanu fractional derivative (in the caputo sense) is defined as [41]:

D_t^α f(t) = [b(α)/(1-α)] ∫_0^t f'(s) E_α[-α (t-s)^α/(1-α)] ds,     (8)

where E_α(·) represents the single-parameter mittag-leffler function and b(α) is a normalization function. we employ this definition to extend the seir model and then use the finite difference method to build a discrete solver and simulate the evolution of the covid-19 death toll (fig. 9); the result is consistent with the conclusion of the previous study [42]. the resulting death toll peak is lower than that simulated with the conventional kernel function, but this does not mean that the kernel used in the caputo fractional derivative is no longer valid. indeed, the solution of the caputo fractional derivative captures well the observed death peak (fig. 1a). applications of the nonsingular kernel functions deserve further study in the future. second, the riemann-liouville fractional derivative is defined as [43]:

D_t^α f(t) = [1/Γ(n-α)] (d^n/dt^n) ∫_0^t f(s) (t-s)^{n-α-1} ds,  n-1 < α < n.     (9)

the caputo fractional derivative listed in section 2.1 and the riemann-liouville fractional derivative (9) are related by [44]:

D_{RL}^α f(t) = D_C^α f(t) + Σ_{k=0}^{n-1} [f^{(k)}(0)/Γ(k-α+1)] t^{k-α},     (10)

where n-1 < α < n, and the operators D_C^α and D_{RL}^α represent the caputo and riemann-liouville fractional derivatives, respectively. the authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
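the single-parameter mittag-leffler function appearing in the atangana-baleanu kernel can be evaluated numerically by truncating its power series; a minimal sketch (the truncation length is an arbitrary choice):

```python
import math

def mittag_leffler(z, alpha, n_terms=80):
    """Truncated power series of the single-parameter Mittag-Leffler
    function E_alpha(z) = sum_{k>=0} z**k / Gamma(alpha*k + 1), the
    kernel function of the Atangana-Baleanu fractional derivative."""
    return sum(z**k / math.gamma(alpha * k + 1) for k in range(n_terms))

# sanity check: for alpha = 1 the series reduces to the exponential
e1 = mittag_leffler(1.0, 1.0)   # approximately e = 2.71828...
```

for alpha = 1 the function reduces to exp(z), which makes the classical (integer-order) model a limiting case of the fractional one; for large |z| the plain series becomes ill-conditioned and asymptotic expansions are preferred.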
credit author statement:
the role of fractional calculus in modeling biological phenomena: a review
fractional calculus models of complex dynamics in biological tissues
modeling with fractional difference equations
fractional dynamics in dna
simulation of drug uptake in a two compartmental fractional model for a biological system
on fractional sirc model with salmonella bacterial infection
semi-analytical study of pine wilt disease model with convex rate under caputo-fabrizio fractional order derivative
a new fractional hrsv model and its optimal control: a non-singular operator approach
the fractional features of a harmonic oscillator with position-dependent mass
a new study on the mathematical modelling of human liver with caputo-fabrizio fractional derivative
new aspects of time fractional optimal control problems within operators with nonsingular kernel
a new fractional model and optimal control of a tumor-immune surveillance with non-singular derivative operator
johns hopkins coronavirus resource center
a mathematical model for the novel coronavirus epidemic in wuhan, china
covid-19 - new insights on a rapidly changing epidemic
feasibility of controlling covid-19 outbreaks by isolation of cases and contacts
a contribution to the mathematical theory of epidemics
epidemic analysis of covid-19 in china by dynamical modeling
the lockdown of hubei province causing different transmission dynamics of the novel coronavirus (2019-ncov) in wuhan and beijing. medrxiv
a time-dependent sir model for covid-19
an epidemiological forecast model and software assessing interventions on covid-19 epidemic in china
effectiveness of control strategies for coronavirus disease 2019: a seir dynamic modeling study
seir transmission dynamics model of 2019 ncov coronavirus with considering the weak infectious ability and changes in latency duration
nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study
preliminary prediction of the basic reproduction number of the wuhan novel coronavirus 2019-ncov
the effect of travel restrictions on the spread of the 2019 novel coronavirus (covid-19) outbreak
risk assessment of novel coronavirus covid-19 outbreaks outside china
estimation of country-level basic reproductive ratios for novel coronavirus (covid-19) using synthetic contact matrices
linear model of dissipation whose q is almost frequency independent-ii
impact of absorbing and reflective boundaries on fractional derivative models: quantification, evaluation and application
estimation of the transmission risk of the 2019-ncov and its implication for public health interventions
an introduction to stochastic processes with applications to biology
discrete and continuous sis epidemic models: a unifying approach
evaluation and linking of effective parameters in particle-based models and continuum models for mixing-limited bimolecular reactions
networks: an introduction
lagrangian simulation of multi-step and rate-limited chemical reactions in multi-dimensional porous media
why outbreaks like coronavirus spread exponentially and how to "flatten the curve"
the incubation period of coronavirus disease 2019 (covid-19) from publicly reported confirmed cases: estimation and application
report 2: estimating the potential total number of novel coronavirus cases in wuhan city
new fractional derivatives with nonlocal and non-singular kernel: theory and application to heat transfer model
time fractional derivative model with mittag-leffler function kernel for describing anomalous diffusion: analytical solution in bounded-domain and model comparison
fractional integrals and derivatives
stochastic models for fractional calculus
resources. bin jin: supervision, funding acquisition

key: cord-346921-3hfxv6h8 authors: nave, ophir; hartuv, israel; shemesh, uziel title: θ-seihrd mathematical model of covid19-stability analysis using fast-slow decomposition date: 2020-09-21 journal: peerj doi: 10.7717/peerj.10019 sha: doc_id: 346921 cord_uid: 3hfxv6h8 in general, a mathematical model that contains many linear/nonlinear differential equations, describing a phenomenon, does not have an explicit hierarchy of system variables. that is, the identification of the fast and slow variables of the system is not explicitly clear. the decomposition of a system into fast and slow subsystems is usually based on intuitive ideas and knowledge of the mathematical model being investigated. in this study, we apply the singularly perturbed vector field (spvf) method to the covid-19 mathematical model to expose the hierarchy of the model. this decomposition enables us to rewrite the model in new coordinates in the form of fast and slow subsystems and, hence, to investigate only the fast subsystem with different asymptotic methods. in addition, this decomposition enables us to investigate the stability of the model, which is important in the case of covid-19. we found the stable equilibrium points of the mathematical model and compared the results of the model with those reported by the chinese authorities, finding a fit of approximately 96 percent. the coronavirus belongs to the severe acute respiratory syndrome (sars) family. it usually does not cause disease in humans, but infects animals (mammals and poultry).
if humans are infected with the virus, the disease usually causes a mild cold, and the infection passes without any treatment. conversely, if the infected person has a weak immune system for some reason, the coronavirus can be fatal. it is a common virus that attacks almost every person at least once in his/her lifetime (especially during childhood) and, as is usually the case, it is not dangerous. however, the current epidemic is a recently mutated virus that has become fatal (which is why the world health organization initially called it the "new coronavirus" of 2019). it should be noted that, in 2002-2003, there was an outbreak of a similar virus from the sars family, causing the death of more than 800 people (general health fund in israel, https://www.clalit.co.il/he/yourhealth/family/). the rest of the paper is organized as follows. in "the θ-seihrd model of covid-19", we describe the mathematical model of the coronavirus called the θ-seihrd model. in "results and analysis", we present the results of our comparative analysis of the cases reported by the chinese authorities. in addition, we apply the spvf method to the θ-seihrd model and find the stability of the equilibrium points. "discussion" presents our discussion. finally, "conclusions" presents the conclusions of this paper. in this section, we introduce the mathematical model of the θ-seihrd model as presented in (ivorra et al., 2020). this model is based on the be-codis model presented in (ivorra, ramos & ngom, 2015). it is important at this point to note that this model is not the standard sir/seir model. it considers the known special characteristics of the considered disease, such as the existence of infectious undetected cases and the sanitary conditions and infectiousness of hospitalized patients. the assumptions of the model can be found in (ivorra et al., 2020). here, we present the mathematical formulation of the covid-19 spread.
the system of equations includes nine nonlinear ordinary differential equations, where ṽ = (s, e, i, i_u, h_r, h_d, r_d, r_u, d); γ_e, γ_i, γ_{h_r} and γ_{i_u} are the transition rates of e, i, h_r, and i_u, respectively [1/day]; β(·) is the disease contact rate [1/day]; m is the fatality rate [1/day]; and τ indicates the infected people who move from one territory to another per day. the initial conditions of the system are given at time t_0 (the initial time can change for each country; for example, for wuhan, china, t_0 is taken in early december 2019, because the first case of a patient with symptoms was confirmed on 7 december 2019). as can be seen above, the presentation of this mathematical model has no explicit hierarchy and, hence, it is impossible to apply different asymptotic methods directly. in the next section, we apply the spvf method to expose the hierarchy of the considered model. in this section, we apply the spvf method to the system of eqs. (1)-(9), as presented in (nave, 2017). applying the spvf method exposes the hierarchy of the model. this procedure enables us to rewrite the model under consideration in new coordinates and, hence, the "new" model can be decomposed into the so-called fast and slow subsystems. the parameters and functions used in our calculations are as follows; the functions include:

β_{i_u} = 0.375 + 0.375 (1 - v(t)) (1 - h),
v(t) = 0.003 · m_i + 0.997 · (1 - m_i).

here, we present only the absolute values of the eigenvalues of the spvf method (and not the eigenvectors) to avoid overwhelming the reader with too much information. as we observed from these results, and according to the spvf method, the maximal gap of the eigenvalues is between λ_1 and λ_2. this gap implies that the first dynamic variable of the system, when written in the new coordinates, is fast compared with the rest of the variables.
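the eigenvalue-gap criterion used by the spvf method to separate fast from slow variables can be sketched as follows; the sample spectrum below is hypothetical:

```python
import numpy as np

def fast_slow_split(eigvals):
    """Locate the largest gap between consecutive eigenvalue magnitudes,
    as used by the SPVF method: the k eigenvalues above the gap mark the
    k fast variables of the transformed system."""
    lam = np.sort(np.abs(np.asarray(eigvals, dtype=float)))[::-1]
    ratios = lam[:-1] / lam[1:]        # |lambda_i| / |lambda_{i+1}|
    k = int(np.argmax(ratios)) + 1     # number of fast variables
    return k, ratios

# hypothetical spectrum with a clear gap after the first eigenvalue
k, ratios = fast_slow_split([2.4e6, 310.0, 120.0, 45.0, 9.0])
```

here k = 1 would correspond to the situation described in the text, where only the first equation of the transformed model forms the fast subsystem.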
the results of the spvf algorithm are presented in figs. 1 and 2. we compute the value of |λ_{i+1}|/|λ_i| (i = 1, …, 8) for 365 days and plot only the maximum value of this quotient for every day. from fig. 1 we can see that a gap indeed exists on every day (the graph is never zero), which means that the spvf algorithm exposes the hierarchy of the system in the new coordinates. however, it is not clear from these results alone how the system divides, e.g., whether the fast subsystem is eqs. (1)-(6) and the slow subsystem is eqs. (7)-(9). the results are presented in fig. 2: on some days the fast subsystem includes only the first equation, while on other days it includes the first two equations of the model in the new coordinates. one can also read the results of the spvf algorithm from the other direction: looking at fig. 2 on the 100th day, for example, only the first equation forms the fast subsystem, whereas on the 250th day the first two equations form the fast subsystem. the spvf algorithm aims to find a frame of coordinates in which the system has a singularly perturbed system (sps) form; hence, we have a degree of freedom in choosing the frame of coordinates for this purpose. in our analysis, we chose the frame of coordinates, that is, the eigenvectors, belonging to the 49th day, on which we obtained the maximal gap among all the gaps returned by the spvf algorithm. on this day, the fast subsystem is the first equation of the model. the new model is rewritten in the new coordinates using the eigenvectors ũ_1, ũ_2, …, ũ_9, which correspond to the eigenvalues λ_1, λ_2, …, λ_9. this means that the new dynamic variables of the model are linear combinations of its old variables, with the coefficients of the linear combinations taken from the eigenvectors, that is,

ṽ_i = ũ_i^t v,  i = 1, …, 9,     (15)

where t refers to the transpose operation.
in matrix form, system (15) can be written as

ṽ = b^t v,     (16)

where b denotes the matrix whose columns are the eigenvectors of the spvf method. the right-hand side (rhs) of this system is a function of the old variables v, whereas its left-hand side is a function of the new variables ṽ. to write both sides in terms of the same variables, we differentiate the system with respect to time,

dṽ/dt = b^t dv/dt = b^t f(v),     (17)

where f(v) is the vector field of system (1)-(9). to rewrite both sides of the model using the new variables, we express the old variables as a function of the new variables using the inverse of the eigenvector matrix,

v = (b^t)^{-1} ṽ.     (19)

substituting eq. (19) into (17) gives

dṽ/dt = b^t f((b^t)^{-1} ṽ) ≡ f̃(ṽ),     (20)

where f̃(ṽ) is the vector field of the new model (the rhs of the model written in the new coordinates). after the model has been transformed into the new coordinates using the eigenvectors of the spvf method, it can be decomposed into fast and slow subsystems based on the gap of the eigenvalues. the first eigenvalue, λ_1, indicates that the first variable of the system, say s̃, is the fast variable, while ẽ, ĩ, ĩ_u, h̃_r, h̃_d, r̃_d, r̃_u and d̃ are the slow ones. hence, the model in eq. (20) can be decomposed into a fast equation for s̃ and slow equations for the remaining variables (system (21)). according to the spvf method, the stability analysis can be carried out on the fast subsystem. to find the equilibrium points of the model, we solve the system obtained by setting the fast equation of (21) to zero (system (22)). this means that we look for the equilibrium point of the fast variable (denoted by an asterisk), that is, s̃*, while the rest of the variables are frozen at their initial values. system (22) is then one equation with one unknown (fast) variable, s̃*. we solve this system, substitute the value s̃* into the slow subsystem, and solve for the slow variables ẽ*, ĩ*, ĩ_u*, h̃_r*, h̃_d*, r̃_d*, r̃_u* and d̃*. now we have eight equations with eight unknown (slow) variables.
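the change to eigenvector coordinates that decouples fast and slow directions can be illustrated on a toy two-variable linear system; this sketch assumes the transform ṽ = b^{-1} v, with b collecting the eigenvectors as columns (the matrix a below is a hypothetical example, not the covid-19 model):

```python
import numpy as np

# toy linear vector field dv/dt = A v with one fast and one slow direction;
# the numbers are illustrative only
A = np.array([[-100.0, 1.0],
              [1.0, -0.1]])
lam, B = np.linalg.eig(A)      # columns of B are the eigenvectors

def f(v):
    """Vector field in the original coordinates."""
    return A @ v

def f_tilde(v_tilde):
    """Vector field expressed in the eigenvector coordinates
    v_tilde = B^{-1} v; the fast and slow components decouple."""
    return np.linalg.solve(B, f(B @ v_tilde))

# in the new coordinates the Jacobian is (numerically) diagonal,
# making the fast-slow hierarchy explicit
J_tilde = np.linalg.solve(B, A @ B)
```

for the nonlinear covid-19 model the transformed field does not become diagonal, but the same linear change of variables is what exposes which combinations of compartments evolve fast.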
to check the stability of these equilibrium points, we compute the jacobian matrix evaluated at the equilibrium point. it is important to note that the equilibrium points in the new coordinates have no direct biological meaning. to provide a biological interpretation of the equilibrium points obtained, we have to transfer them back to the old coordinates, for which we need the inverse of the eigenvector matrix, that is, v* = (b^t)^{-1} ṽ*. here, we inverse transform only the stable equilibrium points. we obtained only one stable point in the new coordinates and, therefore, we transform only this one stable equilibrium point. according to our numerical simulations, if the initial condition is set on 12 january 2020, then the equilibrium point is reached after 49 days. according to official reports (from the wuhan county authorities), the closure of wuhan ended after 62 days, meaning that the situation stabilized after 62 days. comparing the results of the mathematical model with what actually happened, there is a difference of 13 days. medically, another 13 days of lockdown would not have harmed the civilians; however, the economic considerations apparently outweighed the medical ones, and the authorities decided to lift the lockdown earlier than expected. in fig. 3, we present the evolution of the number of cases in china for different values of the parameters. as we can see from these results, the equilibrium point is attained at approximately 83,000 cases. the results are summarized in table 1, where we present only the important variables that are stable. as shown in the previous section, we obtained the stable equilibrium points of the mathematical model owing to the application of the spvf method. the application of this method enables us to decompose the system into fast and slow subsystems. after the decomposition, we explored only the fast subsystem.
in general, instead of decomposing a given system into fast and slow subsystems, one can solve the original system and determine the stable equilibrium points numerically. however, the big problem with the numerical approach is that the equilibrium points are then represented only by their numerical values, not as analytical expressions depending on the original system's parameters, as can be obtained with the considered decomposition. that is, if we want to change the system parameters and determine new equilibrium points, we have to re-solve the mathematical model numerically each time, which is time consuming. moreover, the spvf method allows us to first find all the equilibrium points of the original model analytically. in addition, the equilibrium points depend on the parameters of the original system; therefore, if we want to change the model parameters, we do not have to re-solve the mathematical model, but only need to change the parameters in the stable equilibrium points that have been determined. given the results obtained from the mathematical model and the results reported by the authorities, the relative error of the model can be calculated. we define the relative error of the equilibrium points for each dynamic variable of the system as

err = |u_model - u_real| / |u_real|,

where u_model is the stable equilibrium point of the original model (after the inverse transform of the stable equilibrium points) and u_real denotes the value reported by the authorities. as can be seen from the resulting relative errors, the discrepancies are indeed small, as was expected from the decomposition method. in this paper, we investigated the stability of the θ-seihrd mathematical model of covid-19. the mathematical model of the coronavirus is presented with a hidden hierarchy, which implies that we cannot know which of the dynamic variables of the model progresses fast and which progresses slowly.
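the relative-error measure can be written directly; the numbers in the usage line are hypothetical placeholders, not the paper's reported values:

```python
def relative_error(u_model, u_real):
    """Relative error between a model equilibrium value and the
    officially reported value: err = |u_model - u_real| / |u_real|."""
    return abs(u_model - u_real) / abs(u_real)

# hypothetical placeholder values, not the paper's reported numbers
err = relative_error(83000.0, 80000.0)
```

a relative error of a few percent on each stable variable corresponds to the roughly 96 percent agreement with the official figures quoted in the abstract.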
the hidden hierarchy of the model does not allow for an asymptotic analysis in general, nor an analytical investigation in particular, but only the running of numerical simulations. in particular, the equilibrium points of the system and their stability cannot be investigated and obtained from the model in its original form. therefore, in this study, we implemented the spvf method, which explicitly exposes the system hierarchy. this method transforms the model into new coordinates using the eigenvectors of the given vector field. in the new coordinates, the model is presented in sps form, that is, with an explicit hierarchy. the hierarchy of the new mathematical model enables us to decompose it into subsystems; in our case, we divided it into two: a fast and a slow subsystem. according to the spvf theory, the fast subsystem can be investigated while the slow subsystem is frozen. we found the equilibrium points of the fast subsystem in the new coordinates analytically. we analyzed the stability of the equilibrium points and obtained a single stable equilibrium point for the new model. to obtain a biological interpretation of this equilibrium point, we inverse transformed it into the old coordinates. comparing our analytical results with the official reports of the chinese authorities, we found that the stable equilibrium points obtained from the mathematical model are very close to those of the official reports.
s (susceptible): the person is not infected by the disease pathogen.
e (exposed): the person is in the incubation period after being infected by the disease pathogen and has no visible clinical signs. the individual could infect other people, but with a lower probability than the people in the infectious compartments.
after the incubation period, the person moves to the infectious compartment i.
i (infectious): the first compartment of the infectious period, in which nobody is expected to be detected yet. the person has completed the incubation period, may infect other people, and starts developing clinical signs. after this period, people in this compartment are either taken in charge by the sanitary authorities (and we classify such people as hospitalized) or remain undetected by the said authorities and continue to be infectious (but in another compartment, i_u).
i_u (infectious but undetected): after being in compartment i, the person can still infect other people and have clinical signs, but has not yet been detected and reported by the authorities. we assume that only people with low or medium symptoms can be included in this compartment, and not the people who die. after this period, people in this compartment move to the recovered compartment r_u.
h_r (hospitalized): the person is in the hospital (or under quarantine at home) and can still infect other people. at the end of this stage, the person moves to the recovered compartment r_d.
h_d (hospitalized who will die): the person is hospitalized and can still infect other people. at the end of this stage, the person is transferred to the dead compartment.
r_d (recovered after being previously detected as infectious): the person was previously detected as infectious, survived the disease, is no longer infectious, and has developed a natural immunity to the virus. when a person enters this compartment, he/she remains in the hospital for a convalescence period of c_0 days (on average).
r_u (recovered after being previously infectious but undetected): the person was not previously detected as infectious, survived the disease, is no longer infectious, and has developed a natural immunity to the virus.
d (dead by covid-19): the person did not survive the disease.
t: duration.
singularly perturbed vector fields
mathematical modeling of the spread of the coronavirus disease 2019 (covid-19) taking into account the undetected infections. the case of china
mathematical formulation and validation of the be-fast model for classical swine fever virus spread between and within farms
validation of the forecasts for the international spread of the coronavirus disease 2019 (covid-19) done with the be-codis mathematical model
application of the be-codis mathematical model to forecast the international spread of the 2019 wuhan coronavirus outbreak
be-codis: a mathematical model to predict the risk of human diseases spread between countries - validation and application to the 2014 ebola virus disease epidemic
coronavirus covid-19 global cases by the center for systems science and engineering (csse), 2020
early dynamics of transmission and control of covid-19: a mathematical modelling study
substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov2)
transmission dynamics of 2019 novel coronavirus
singularly perturbed vector field method (spvf) applied to combustion of monodisperse fuel spray
the global dynamics for an age-structured tuberculosis transmission model with the exponential progression rate
the authors received no funding for this work. the authors declare that they have no competing interests. ophir nave conceived and designed the experiments, performed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft. israel hartuv conceived and designed the experiments, performed the experiments, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft.
uziel shemesh conceived and designed the experiments, performed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft. the following information was supplied regarding data availability: data are available at worldometers.info: https://www.worldometers.info/coronavirus/country/china/ key: cord-316393-ozl28ztz authors: enrique amaro, josé; dudouet, jérémie; nicolás orce, josé title: global analysis of the covid-19 pandemic using simple epidemiological models date: 2020-10-22 journal: appl math model doi: 10.1016/j.apm.2020.10.019 sha: doc_id: 316393 cord_uid: ozl28ztz several analytical models have been developed in this work to describe the evolution of fatalities arising from coronavirus covid-19 worldwide. the death or 'd' model is a simplified version of the well-known sir (susceptible-infected-recovered) compartment model, which allows the transmission-dynamics equations to be solved analytically by assuming no recovery during the pandemic. by fitting to available data, the d model provides a precise way to characterize the exponential and normal phases of the pandemic evolution, and it can be extended to describe additional spatial-time effects such as the release of lockdown measures. more accurate calculations using the extended sir or esir model, which includes recovery, and more sophisticated monte carlo grid simulations – also developed in this work – predict similar trends and suggest a common pandemic evolution with universal parameters. the evolution of the covid-19 pandemic in several countries shows the typical behavior in accord with our model trends, characterized by a rapid increase of death cases followed by a slow decline, typically asymmetric with respect to the pandemic peak. 
the fact that the d and esir models predict similar results – without and with recovery, respectively – indicates that covid-19 is a highly contagious virus, but that most people become asymptomatic (d model) and eventually recover (esir model). • similar results from the esir and d models suggest that most susceptibles become infected, asymptomatic, and eventually recover. • similar trends suggest a common pandemic evolution with universal parameters. the sir (susceptible-infected-recovered) model is widely used as a first-order approximation to the viral spreading of contagious epidemics [1], mass immunization planning [2, 3], marketing, informatics and social networks [4]. its cornerstone is the so-called "mass-action" principle introduced by hamer, which assumes that the course of an epidemic depends on the rate of contact between susceptible and infected individuals [5]. this idea was extended to a continuous-time framework by ross in his pioneering work on malaria transmission dynamics [6] [7] [8], and finally put into its classic mathematical form by kermack and mckendrick [9]. the sir model was further developed by kendall, who provided a spatial generalization of the kermack and mckendrick model in a closed population [10] (i.e. neglecting the effects of spatial migration), and bartlett, who – after investigating the connection between the periodicity of measles epidemics and community size – predicted a traveling wave of infection moving out from the initial source of infection [11, 12]. more recent implementations have considered the typical incubation period of the disease and the spatial migration of the population. the pandemic has ignited the submission of multiple manuscripts in recent weeks. 
most statistical distributions used to estimate disease occurrence are of the binomial, poisson, gaussian, fermi or exponential types. despite their intrinsic differences, these distributions generally lead to similar results, assuming independence and homogeneity of disease risks [13]. in this work, we propose a simple and easy-to-use epidemiological model – the death or d model [14] – that can be compared with data in order to investigate the evolution of the infection and deviations from the predicted trends. the d model is a simplified version of the sir model with analytical solutions under the assumption of no recovery, at least during the time of the pandemic. we apply it globally to countries where the infestation of the covid-19 coronavirus has become widespread and caused thousands of deaths [15, 16]. additionally, d-model calculations are benchmarked with more sophisticated and reliable calculations using the extended sir (esir) and monte carlo planck (mcp) models – also developed in this work – which provide similar results, but allow for a more coherent spatial-time disentanglement of the various effects present during a pandemic. a similar esir model has recently been proposed by squillante and collaborators for infected individuals as a function of time, based on the ising model – which describes ferromagnetism in statistical mechanics – and a fermi-dirac distribution [17]. this model also reproduces a posteriori the covid-19 data for infestations in china as well as other pandemics such as ebola, sars, and influenza a/h1n1. the sir model considers the three possible states of the members of a closed population affected by a contagious disease. it is, therefore, characterized by a system of three coupled non-linear ordinary differential equations [18], which involve three time-dependent functions: • susceptible individuals, s(t), at risk of becoming infected by the disease. • infected individuals, i(t). 
• recovered or removed individuals, r(t), who were infected and may have developed immunity or died. the sir model describes well a viral disease, where individuals typically go from the susceptible class s to the infected class i, and finally to the removed class r. recovered individuals cannot go back to the susceptible or infected classes, as is, potentially, the case for a bacterial infection. the resulting transmission-dynamics system for a closed population is described by ds/dt = -λ s i (1), di/dt = λ s i - β i (2), dr/dt = β i (3), n = s + i + r (4), where λ > 0 is the transmission or spreading rate, β > 0 is the removal rate and n is the fixed population size, which implies that the model neglects the effects of spatial migration. currently, there is no vaccination available for covid-19, and the only way to reduce the transmission or infection rate λ – which is often referred to as "flattening the curve" – is by implementing strong social distancing and hygiene measures. the system can be reduced to a first-order differential equation, which does not possess an explicit solution, but can be solved numerically. the sir model can then be parametrized using actual infection data to solve for i(t), in order to investigate the evolution of the disease. in the d model, we make the drastic assumption of no recovery in order to obtain an analytical formula to describe – instead of infestations – the death evolution. this can be useful as a fast method to foresee the global behavior as a first approach, before applying more sophisticated methods. we shall see that the resulting d model describes well enough the data of the current pandemic in different countries. the main assumption of the d model is the absence of recovery from coronavirus, i.e. r(t) = 0, at least during the pandemic time interval. this assumption may be reasonable if the spreading time of the pandemic is much faster than the recovery time, i.e. λ ≫ β. 
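the sir system above can be integrated numerically with a simple forward-euler scheme; a minimal sketch (the parameter values are illustrative, not fitted to any data set):

```python
def sir_step(s, i, r, lam, beta, dt):
    # forward-euler step of ds/dt = -lam*s*i, di/dt = lam*s*i - beta*i, dr/dt = beta*i
    new_infections = lam * s * i * dt
    removals = beta * i * dt
    return s - new_infections, i + new_infections - removals, r + removals

def simulate_sir(n=1e6, i0=10.0, lam=3e-7, beta=0.1, days=200, steps_per_day=100):
    # illustrative parameters: basic reproduction number lam*n/beta = 3
    s, i, r = n - i0, i0, 0.0
    dt = 1.0 / steps_per_day
    trajectory = []
    for _ in range(days):
        for _ in range(steps_per_day):
            s, i, r = sir_step(s, i, r, lam, beta, dt)
        trajectory.append((s, i, r))
    return trajectory

trajectory = simulate_sir()
```

the infected curve rises while λ s > β and declines afterwards; the sum s + i + r stays fixed at n, as required for a closed population.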
the sir equations are then reduced to the single equation of the well-known si model, which represents the simplest mathematical form of all disease models, where the infection rate is proportional to both the infected individuals, i, and the susceptible individuals, n - i: di/dt = λ i (n - i) (5), whose solution can be written as i(t) = i0 e^(t/b) / (1 + c e^(t/b)) (8), where we have defined the constants b = 1/(λ n) (6) and c = i0/n (7). the parameter b is the characteristic evolution time of the initial exponential increase of the pandemic. the constant c is the initial infestation rate with respect to the total population n, assuming c ≪ 1. in order to predict the number of deaths in the d model we assume that the number of deaths at some time t is proportional to the infestation at some former time τ, that is, d(t) = µ i(t - τ) (9), where µ is the death rate, and τ is the death time. with this assumption we can finally write the d-model equation as d(t) = a e^(t/b) / (1 + c e^(t/b)) = a / (c + e^(-t/b)) (10), where a = µ i0 e^(-τ/b), the constant c has been redefined to absorb the delay, c → c e^(-τ/b), and a/c yields the total number of deaths predicted by the model. this is the final equation for the d model, which presents a shape similar to the well-known woods-saxon potential for the nucleons inside the atomic nucleus or to the bacterial growth curve. the rest of the parameters, µ, τ, i0 and n, are embedded in the parameters a, b, c, which represent space-time averages and can be fitted to the timely available data. consequently, the d-function passes into the well-known logistic model, which is described by the riccati equation, but with different constants (e.g. see ref. [20] and references therein). in fig. 1, we present the fit of the d model to the covid-19 death data for china, where its evolution has apparently been controlled and the d function has reached the plateau zone, with few increments over time, or fluctuations that are beyond the model assumptions. this plot shows the duration of the pandemic – about two months to reach the top end of the curve – and the agreement, despite the crude assumptions, between the data and the evolution trend described by the d model. 
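the closed-form d model, written as d(t) = a/(c + e^(-t/b)), can be evaluated directly; a minimal sketch with illustrative parameter values (not fitted to any country):

```python
import math

def d_model(t, a, b, c):
    """cumulative deaths at day t: d(t) = a / (c + exp(-t/b)); d(t -> inf) = a/c."""
    return a / (c + math.exp(-t / b))

# illustrative parameters; a/c is the asymptotic total number of deaths
a, b, c = 50.0, 6.0, 0.005
total_deaths = a / c
```

at early times (c e^(t/b) ≪ 1) the curve grows exponentially as a e^(t/b); at late times it saturates at a/c.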
this agreement encourages the application of the d model to other countries in order to investigate the different trends. in order to get insight into the stability and uncertainty of our predictions, fig. 2 shows the evolution of a, b, and c and other model predictions from fits to the daily data in spain. the meaning of these quantities is explained below: • the parameter a is the theoretical number of deaths on the day corresponding to t = 0. in general, it differs from the experimental value and can be interpreted as the expected value of deaths that day. note that experimental data may be subject to unknown systematic errors and different counting methods. • the parameter b, as mentioned above, is the characteristic evolution time: it sets the timescale on which the number of deaths doubles during the initial exponential behavior. moreover, 1/b is proportional to the slope of the almost linear behavior in the mid region of the d function. that behavior can be obtained by a taylor expansion around t0 = -b ln(c), which gives d(t) ≈ (a/2c)[1 + (t - t0)/(2b)]. • the parameter c is called the inverse death factor because d(t → ∞) = a/c provides the asymptotic or expected total number of deaths. figure 2 shows the stable trend of the parameters between days 19 to 24 (corresponding to march 27-30), right before reaching the peak of death cases, which occurred in spain around april 1. such stability validates the d-model predictions during this time. however, a rapid change of the parameters is observed, especially for a, once the peak is reached, drastically changing the prediction of the number of deaths given by a/c. this sudden change results in the slowing down of deaths per day and longer time predictions t95 and t99. the parameters of the d model correspond to average values over time of the interaction coefficients between individuals, i.e. they are sensitive to additional external effects on the pandemic evolution. 
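the characteristic times t95 and t99 (times to reach 95% and 99% of the asymptotic total a/c) follow in closed form from the logistic shape of d(t); the derivation below is ours, but it is consistent with that definition:

```python
import math

def d_model(t, a, b, c):
    return a / (c + math.exp(-t / b))

def t_fraction(p, b, c):
    # time at which d(t) reaches a fraction p of the asymptotic total a/c:
    # a/(c + e^(-t/b)) = p*a/c  =>  t = b * ln(p / (c*(1 - p)))
    return b * math.log(p / (c * (1.0 - p)))

a, b, c = 50.0, 6.0, 0.005     # illustrative values, not fitted
t95 = t_fraction(0.95, b, c)
t99 = t_fraction(0.99, b, c)
```

a rapid change of the fitted parameters, as observed after the peak, shifts t95 and t99 through their dependence on b and c.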
these may include the lockdown effect imposed in spain on march 14 and other effects such as new sources of infection or a sudden increase of the total susceptible individuals due to social migration and large mass gatherings [21]. it is not possible to identify a specific cause because its effects are blurred by the stochastic evolution of the pandemic, which is why any reliable forecast presents large errors. one can also determine deaths/day rates by applying the first derivative to eq. 10, which allows for a determination of the pandemic peak and the evolution after its turning point. the d model describes well the cumulative deaths because the sum of discrete data reduces the fluctuations, in the same way as the integral of a discontinuous function is a continuous function. however, the daily data have large fluctuations – both statistical and systematic – which normally gives a slightly different set of parameters when compared with the cumulative fit. using the d model fitted to cumulative deaths allows one to compute deaths/day as the finite difference d(t) - d(t - ∆t), where ∆t = 1 day. figure 3 shows that the finite difference and the first derivative d'(t) yield similar parameters, as the time increment is small enough compared with the time evolution of the d(t) function. hence, the first derivative d'(t) can be used to describe deaths per day. in addition, fig. 4 shows that the parameters may differ between the d and d' functions fitted to cumulative and daily deaths, respectively, as shown for spain on april 5. it is also important to note that b is directly proportional to the full width at half maximum (fwhm) of the d'(t) distribution. the b parameter presents typical values between 4 and 10 for most countries undergoing the initial exponential phase, which yields a minimum and maximum time of 14 and 35 days, respectively, between the two extreme values of the fwhm. some models [22] include changes in the transmission rate due to various interventions implemented to contain the outbreak. 
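the proportionality between b and the fwhm of d'(t) can be checked numerically: for the logistic form of d(t) the derivative has fwhm = 2 ln(3 + 2√2) b ≈ 3.53 b (a standard property of the logistic derivative, consistent with the 14-35 day window quoted for b between 4 and 10):

```python
import math

def d_prime(t, a, b, c):
    # first derivative of d(t) = a/(c + exp(-t/b)), i.e. deaths per day
    e = math.exp(-t / b)
    return a * e / (b * (c + e) ** 2)

def fwhm(a, b, c, step=0.001):
    t_peak = -b * math.log(c)          # position of the deaths-per-day maximum
    half = d_prime(t_peak, a, b, c) / 2.0
    t = t_peak
    while d_prime(t, a, b, c) > half:  # scan left for the half-maximum crossing
        t -= step
    left = t
    t = t_peak
    while d_prime(t, a, b, c) > half:  # scan right for the half-maximum crossing
        t += step
    return t - left

b = 6.0
width = fwhm(50.0, b, 0.005)
expected = 2.0 * math.log(3.0 + 2.0 * math.sqrt(2.0)) * b   # about 3.53 * b
```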
the simple d model does not allow one to do this explicitly, but changes in the spread can be taken into account by considering the total d or d' function as the sum of two or more independent d-functions with different parameters, which may reveal the existence of several independent sources, or virus channels. an example is shown in fig. 5, where the two-channel function has been fitted with six parameters to the spanish data up to april 13. the fit reveals a second, smaller death peak, which substantially increases the number of deaths per day and the duration of the pandemic. this is equivalent to adding a second, independent source of infection several weeks after the initial pandemic. the second peak may as well represent a second pandemic phase driving the effects of quarantine during the descendant part of the curve. additionally, the cumulative d-function can also be computed with a two-channel function, which provides, as shown in fig. 6, a more accurate prediction for the total number of deaths and clearly illustrates the separate effect of both source peaks. it is interesting to note that for large t, a ≈ a2, c ≈ c2 and b2 ≈ 2b. in such a case, the total number of deaths expected during the pandemic is given by d2(∞) = 2a/c. the d model can also be used to estimate i(t) using the initial values i0 = i(0) and the total number of susceptible people n = s(0). the initial value of n is unknown, and not necessarily equal to the population of the whole country, since the pandemic started in localized areas. here, we shall assume n = 10^6, although plausible values of n can be tens of millions. note that the no-recovery assumption of the d model is unrealistic, and this calculation only provides an estimation of the number of individuals that were infected at some time, independently of whether they recovered or not. from the definition of d(t) in eq. 
9, the following relations between the several parameters of the model can be extracted: a = µ i0 e^(-τ/b), c = (i0/n) e^(-τ/b), and b = 1/(λ n). solving the first two equations for µ and i0 we obtain µ = a/(n c) and i0 = n c e^(τ/b). hence, µ can be computed by knowing n. however, to obtain i0 one needs to know the death time τ. this has been estimated to be about 15 to 20 days for covid-19 cases, which can be used to compute two estimates of i(t). these are given in fig. 7 for the case of spain. since there is no recovery in the d model, the total number of infected people is i ∼ n for large t, i.e. n = 10^6 in our case. in fig. 7, we have labeled the beginning of the lockdown in spain (march 15). for τ = 15 days, most of the susceptible individuals were already infected by that date, and even more so for τ = 20 days, as the pandemic had started almost two months earlier. most of the individuals got infected, even if a great part of them – approximately 99% – had no symptoms of illness or disease. moreover, the top panel of fig. 8 shows the ratio d(t)/i(t) (deaths over infected), as given by eqs. 8 and 10, which also depends on n and τ. for n = 10^6, the ratio d/i increases, similarly to the separate functions d and i, between its initial and final values. these results depend on the total susceptible population n. however, the ratio of infected with respect to susceptibles, i/n, is independent of n. this function depends only on τ and is shown in the bottom panel of fig. 8 for τ = 15 and 20 days, which reveals the rapid spread of the pandemic. accordingly, between 10% and 30% of the susceptibles were infected by march 7, and one month later (april 6), when the fit was made, all susceptibles had been infected. this does not mean that the full population of the country got infected, since the number n is unknown and, for instance, excludes individuals in isolated regions, and it may additionally change because of spatial migration, not considered in the model. 
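solving the parameter definitions a = µ i0 e^(-τ/b) and c = (i0/n) e^(-τ/b) for µ and i0 gives µ = a/(n c) and i0 = n c e^(τ/b); the algebra below is ours, but it matches the statement that µ requires only n while i0 also requires the death time τ:

```python
import math

def invert_parameters(a, b, c, n, tau):
    # from a = mu*i0*exp(-tau/b) and c = (i0/n)*exp(-tau/b):
    #   mu = a / (n*c)            (needs only the susceptible pool n)
    #   i0 = n * c * exp(tau/b)   (needs also the death time tau)
    mu = a / (n * c)
    i0 = n * c * math.exp(tau / b)
    return mu, i0

# illustrative fit values, with n = 1e6 and tau = 15 days as in the text
mu, i0 = invert_parameters(a=50.0, b=6.0, c=0.005, n=1e6, tau=15.0)
```

the round trip mu * i0 * e^(-tau/b) = a and (i0/n) * e^(-tau/b) = c recovers the original definitions.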
d-model predictions can be compared with more realistic results given by the complete sir model [9, 11], which is characterized by eqs. 1, 2, 3 and 4 with initial conditions r(0) = 0, i(0) = i0, s(0) = n - i0. the sir system of dynamical equations can be reduced to a non-linear differential equation. first, dividing eq. 1 by eq. 3 one obtains ds/dr = -(λ/β) s, which yields the following exponential relation between the susceptible and the removed functions: s = s0 e^(-λr/β). moreover, eq. 4 provides a relation between the infected and the removed functions, i = n - r - s0 e^(-λr/β), which yields, by inserting it into eq. 3, the final sir differential equation dr/dt = β (n - r - s0 e^(-λr/β)). in order to obtain r(t) we only need to solve this first-order differential equation with the initial condition r(0) = 0. moreover, if we normalize the functions s, i and r to 1, so that s + i + r = 1, then r(t) verifies dr/dt = β (1 - r - s0 e^(-λnr/β)), which can be solved numerically, or by approximate methods in some cases. in ref. [9], a solution was found for small values of the exponent λnr/β. for the coronavirus pandemic, however, this number is expected to increase and be close to one at the pandemic end. at this point, we propose a modification of the standard sir model. instead of solving the normalized equation numerically and fitting the parameters to data, the solution can be parametrized as r(t) = a/(c + e^(-t/b)), which presents the same functional form as the d model and, conveniently, provides a faster way to fit the model parameters by avoiding the numerical solution of the differential equation. in fact, numerical solutions of the sir model present a similar step function for r(t). additionally, one can assume that d(t) is proportional to r(t), and can also be written in an analogous logistic form, where a2, c2 = s0 and b2 = β/(λn) are unknown parameters to be fitted to deaths-per-day data, together with the three parameters of the r(t)-function: a, b, c. figure 9 shows fits of the esir model to daily deaths in spain during the coronavirus spread. 
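the normalized removal equation, dr/dt = β(1 - r - s0 e^(-λnr/β)) (consistent with the exponent λnr/β quoted in the text), can be integrated with a forward-euler scheme and compared with the step-like parametrization; a sketch with illustrative parameters (λn/β = 3):

```python
import math

def esir_r(beta=0.1, lam_n=0.3, s0=1.0 - 1e-6, r_init=1e-6,
           days=300, steps_per_day=100):
    # forward-euler integration of dr/dt = beta*(1 - r - s0*exp(-lam_n*r/beta)),
    # with s, i, r normalized to 1 and lam_n standing for lambda*n
    r = r_init
    dt = 1.0 / steps_per_day
    out = [r]
    for _ in range(days * steps_per_day):
        r += beta * (1.0 - r - s0 * math.exp(-lam_n * r / beta)) * dt
        out.append(r)
    return out

r_traj = esir_r()
```

r(t) rises along an s-shaped curve and saturates at the final epidemic size (about 0.94 for λn/β = 3), which is the step-like behavior that the parametrization r(t) = a/(c + e^(-t/b)) is designed to mimic.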
the use of no boundary condition for the number of deaths (left panel) does not correspond to an exact solution of the sir differential equation. a way to solve this problem is to impose the condition d'(∞) = 0, as the number of deaths must stop increasing at some time. numerically, it is enough to choose a small value of d'(t) for an arbitrarily large t. the middle and right panels of fig. 9 show different boundary conditions of d'(100) = 10 and d'(100) = 5, respectively, which yield the same results and the expected behavior for a viral disease spreading and declining. it is also consistently observed (e.g. see middle and right panels of fig. 9) that at large t, r(t) → a/c ≈ 1, which essentially means that most of the susceptible population n recovers, as we previously inferred from the d model. as shown in fig. 10, the esir model, where c2 has been adjusted to 1, is characterized by a broad plateau structure which, again, does not consider additional spatial-time effects. as previously done with the d model, one can also expand the esir model to accommodate its failure to take additional spatial-time effects into account. similarly, the esir2 model is proposed as a sum of two esir channels, where we have assumed that a' = a and c' = 2a to accommodate that r(∞) → 1 and c2 = 1. hence, we are left with five free parameters. figure 11 shows the comparison between the esir2 and d2 fits to real data for some european countries where covid-19 has widely spread: germany, italy, france, spain, the united kingdom, sweden, belgium and the netherlands, which indicate a common pattern for the evolution of the covid-19 pandemic. death data are taken from refs. [19, 23, 24] and consider 7-day average smoothing to correct for anomalies in data collection, such as the typical weekend staggering observed in various countries, where weekend data are counted at the beginning of the next week. real error intervals are extracted from the correlation matrix. 
as discussed in section 2.3, the reduced d2 model has been used with a = a2 and c = c2. although arising from different assumptions – no recovery (d model) and recovery (esir model) – both models provide similar patterns following the data trends, with slightly better values of χ² per degree of freedom for the esir2 model. (figure 11: reduced esir2 and d2 model fits to deaths-per-day data up to august 9.) it is also interesting to note that the reduced esir2 model with five parameters yields similar results to the full esir2 model, with eight parameters. as data become available, daily predictions vary for both the esir2 and d2 models. this is because the model parameters are actually statistical averages over space-time of the properties of the complex system. no model is able to predict changes over time of these properties if the physical causes of these changes are not included. the values of the model parameters are only well defined when the disease spread is coming to an end and time changes in the parameters have little influence. contrarily, fig. 12 shows clear discrepancies between the d2 and esir2 fits to data with larger χ²/ndf values. there are several reasons for these anomalies: 1) a second wave surges as lockdown measures are suddenly released, as clearly shown in the case of iran; 2) different spatial-time effects arise as the virus spreads throughout a large country; or, generally, 3) simply defective counting (e.g. weekend and backlog effects). more sophisticated calculations can be compared with the d2 and esir2 predictions. in particular, monte carlo (mc) simulations have also been performed in this work for the spanish case [25], which consist of a lattice of cells that can be in four different states: susceptible, infected, recovered or dead. an infected cell can transmit the disease to any other susceptible cell within some random range r. 
the transmission mechanism follows principles of nuclear physics for the interaction of a particle with a target. each infected particle interacts a number n of times over the interaction region, according to its energy. the number of interactions is proportional to the interaction cross section σ and to the target surface density ρ. the discrete energy follows a planck distribution law depending on the 'temperature' of the system. for any interaction, an infection probability is applied. finally, time-dependent recovery and death probabilities are also applied. the resulting virus spread for different sets of parameters can be adjusted from covid-19 pandemic data. in addition, parameters can be made time dependent in order to investigate, for instance, the effect of an early lockdown or of large mass gatherings at the rise of the pandemic. as shown in fig. 13, our mc simulations present similar results to the d2 model, which validates the use of the simple d model as a first-order approximation. more details on the mc simulation will be presented in a separate manuscript [25]. interestingly, mc simulations follow the data trend up to may 11 without any changes in the parameters for nearly two weeks. an app for android devices, where the monte carlo planck model has been implemented to visualize the simulation, is available from ref. [26]. in order to investigate the universality of the pandemic, it is interesting to compare all countries by plotting the d model in terms of the variable (t - t0)/b, where t0 is the maximum of the daily curve, given by t_max = -b ln(c). by shifting eq. 10 by t_max = -b ln(c) and dividing by the asymptotic value a/c, the normalized d function is given by d_norm(x) = 1/(1 + e^(-x)), with x = (t - t_max)/b. the left panel of fig. 14 shows similar trends for the normalized d curves of different countries, which suggests a universal behavior of the covid-19 pandemic. only iran seems to slightly deviate from the global trend, which may indicate an early and more effective initial lockdown. 
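the collapse onto a universal curve can be verified directly: shifting any d-curve by t_max = -b ln(c) and dividing by its asymptote a/c reduces it to d_norm(x) = 1/(1 + e^(-x)); a sketch:

```python
import math

def d_model(t, a, b, c):
    return a / (c + math.exp(-t / b))

def d_norm(x):
    # universal normalized curve, independent of a, b, c
    return 1.0 / (1.0 + math.exp(-x))

def collapse(t, a, b, c):
    # return the rescaled point (x, d/d_max) for a given country's parameters
    t_max = -b * math.log(c)
    return (t - t_max) / b, d_model(t, a, b, c) / (a / c)

# two "countries" with different (illustrative) parameters fall on the same curve
x1, y1 = collapse(30.0, 50.0, 6.0, 0.005)
x2, y2 = collapse(80.0, 120.0, 9.0, 0.02)
```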
a similar approach can be taken for the daily data using the d' and esir2 models, as shown in the middle and right panels of fig. 14, respectively. although different countries show similar trends, statistical fluctuations in the daily data do not result in as clean a universal behavior as for d_norm. however, the d' and esir2 plots show that an effective lockdown is characterized by flatter and broader peaks, best exemplified by the iranian case, whereas spain and germany present the sharpest peaks. the global models considered in this work present some differences with respect to other existing models. first, in this work we have tried to keep the models as simple as possible. this allows the use of theoretically-inspired analytical expressions or semi-empirical formulae to perform the data analysis. the use of semi-empirical expressions for describing physical phenomena is recurrent in physics; one of the most famous is the semi-empirical mass formula from nuclear physics. of course, the free parameters need to be fitted from known data, but this allows one to obtain predictions for unknown elements. in our case we were inspired by the well-known statistical sir-type models, slightly modified to obtain analytical expressions that carry the leading time dependence. we have found that the d and d2 models allow a fast and efficient analysis of the pandemic in the initial and advanced stages. our results show that the time dependence of the pandemic parameters due to the lockdown can be effectively simulated by the sum of two d-functions with different widths and heights, centered at different times. the distance between the maxima of the two d-functions should be a measure of the time between the effective pandemic beginning and the lockdown. in the spanish case this is about 20 days. taking into account that the lockdown started on march 14, this marks the pandemic starting time as about february 22. 
had the lockdown started on that date, the number of deaths would have been greatly reduced. the smooth blending between the two peaks provides a transition between the two statistical regimes (or physical phases) with and without lockdown. the monte carlo simulation results are in agreement with our previous analysis with the d and d2 models. the monte carlo generates events in a population of individuals in a lattice or grid of cells. we simulate the movement of individuals outside of the cells and their interactions with the susceptible individuals within a finite range. the random events follow statistical distributions based on the exponential laws of statistical mechanics for a system of interacting particles, driven by macroscopic magnitudes such as the temperature, and by interaction probabilities between individuals that can be related to interaction cross sections. the monte carlo simulation spreads the virus in space-time, and also allows a space-time dependence of the parameters. in this work we have made the simplest assumptions, only allowing for a lockdown effect by reducing the range of the interaction starting on a fixed day. this simple modification allowed us to reproduce nicely the spanish deaths-per-day curve. the lockdown produces a relatively long broadening of the curve and a slow decay. similar mc calculations can be performed for several countries to infer the devastating effect of a late lockdown as compared with early lockdown measures. the latter is the case of south africa and other countries, which have not reached the exponential growth. the death (d) and extended sir (esir) models are simple enough to provide fast estimations of the pandemic evolution by fitting spatial-time average parameters, and present a good first-order approximation to understand secondary effects during the pandemic, such as lockdown and population migrations, which may help to control the disease. 
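a toy version of such a grid simulation, not the authors' code but an illustration of the mechanism (random contacts within a finite range, an infection probability per contact, and a recovery probability per day), can be sketched as:

```python
import random

S, I, R = 0, 1, 2   # cell states: susceptible, infected, recovered/removed

def step(grid, size, max_range, p_infect, p_recover, rng):
    # each infected cell contacts a few random cells within max_range and may
    # infect them; it then recovers with probability p_recover
    new = [row[:] for row in grid]
    for x in range(size):
        for y in range(size):
            if grid[x][y] != I:
                continue
            for _ in range(4):   # contacts per day (illustrative)
                tx = (x + rng.randint(-max_range, max_range)) % size
                ty = (y + rng.randint(-max_range, max_range)) % size
                if grid[tx][ty] == S and rng.random() < p_infect:
                    new[tx][ty] = I
            if rng.random() < p_recover:
                new[x][y] = R
    return new

def run(size=40, days=60, max_range=3, p_infect=0.3, p_recover=0.1, seed=1):
    rng = random.Random(seed)
    grid = [[S] * size for _ in range(size)]
    for k in range(5):                       # five initial infected cells
        grid[size // 2][(size // 2 + k) % size] = I
    counts = []
    for _ in range(days):
        grid = step(grid, size, max_range, p_infect, p_recover, rng)
        flat = [cell for row in grid for cell in row]
        counts.append((flat.count(S), flat.count(I), flat.count(R)))
    return counts

counts = run()
```

a lockdown on a fixed day can be modeled by reducing max_range from that day onwards, which broadens and flattens the infected curve, as described in the text.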
similar models are available [17, 27], but challenges in epidemiological modeling remain [28] [29] [30] [31]. this is a very complex system, which involves many degrees of freedom and millions of people, and even assuming consistent disease reporting – which is rarely the case – there remains an important open question: can any model predict the evolution of an epidemic from partial data? or, similarly, is it possible, at any given time and with the data at hand, to measure the validity of an epidemic growth curve? we finally hope that we have added new insightful ideas with the death, extended sir and monte carlo models, which can now be applied to any country which has followed the initial exponential pandemic growth. it is important to note that the esir and d models predict similar patterns of infected and death cases assuming very different premises: recovery and no recovery, respectively. this, together with the fact that the esir model predicts that r → 1 for large t, i.e. that most infected cases eventually recover, leads to the logical conclusion that most people in a fixed population n become asymptomatic in the d model and eventually recover from covid-19. one remaining important question is what n exactly is: is it the whole country, a state or a province, or is it localized to specific areas? the reference titles cited in the text are: discussion: the kermack-mckendrick epidemic threshold theorem; stability analysis of sir model with vaccination; seasonality and the effectiveness of mass vaccination; application of sir epidemiological model: new trends; epidemic disease in england – the evidence of variability and of persistency of type; report on the prevention of malaria in mauritius; an application of the theory of probabilities to the study of a priori pathometry – part i; an application of the theory of probabilities to the study of a priori pathometry – part iii; a contribution to the mathematical theory of epidemics; discussion of 'measles periodicity and community size'; measles periodicity and community size; deterministic and stochastic models for recurrent epidemics; basic models for disease occurrence in epidemiology; the d model for deaths by covid-19; the continuing 2019-ncov epidemic threat of novel coronaviruses to global health – the latest 2019 novel coronavirus outbreak in wuhan, china; situation report – 83; attacking the covid-19 with the ising-model and the fermi-dirac distribution function; the sir model and the foundations of public health; impact of non-pharmaceutical interventions against covid-19 in europe: a quasi-experimental study; inferring covid-19 spreading rates and potential change points for case number forecasts; special issue on challenges in modelling infectious disease dynamics; modeling infectious disease dynamics in the complex landscape of global health; mathematical epidemiology: past, present, and future; true epidemic growth construction through harmonic analysis. the authors thank useful comments from emmanuel clément, araceli lopez-martens, david jenkins, ramon wyss, azwinndini muronga, liam gaffney and hans fynbo. this work was supported by the spanish ministerio de economía y competitividad and european feder funds (grant fis2017-85053-c2-1-p), junta de andalucía (grant fqm-225) and the south african national research foundation (nrf) under grant 93500. key: cord-335465-sckfkciz authors: gupta, rishi k.; marks, michael; samuels, thomas h. 
a.; luintel, akish; rampling, tommy; chowdhury, humayra; quartagno, matteo; nair, arjun; lipman, marc; abubakar, ibrahim; van smeden, maarten; wong, wai keong; williams, bryan; noursadeghi, mahdad title: systematic evaluation and external validation of 22 prognostic models among hospitalised adults with covid-19: an observational cohort study date: 2020-09-25 journal: eur respir j doi: 10.1183/13993003.03498-2020 sha: doc_id: 335465 cord_uid: sckfkciz background: the number of proposed prognostic models for covid-19 is growing rapidly, but it is unknown whether any are suitable for widespread clinical implementation. methods: we independently externally validated the performance of candidate prognostic models, identified through a living systematic review, among consecutive adults admitted to hospital with a final diagnosis of covid-19. we reconstructed candidate models as per original descriptions and evaluated performance for their original intended outcomes using predictors measured at admission. we assessed discrimination, calibration and net benefit, compared to the default strategies of treating all and no patients, and against the most discriminating predictor in univariable analyses. results: we tested 22 candidate prognostic models among 411 participants with covid-19, of whom 180 (43.8%) and 115 (28.0%) met the endpoints of clinical deterioration and mortality, respectively. highest areas under the receiver operating characteristic (auroc) curve were achieved by the news2 score for prediction of deterioration over 24 h (0.78; 95% ci 0.73–0.83), and a novel model for prediction of deterioration <14 days from admission (0.78; 0.74–0.82). the most discriminating univariable predictors were admission oxygen saturation on room air for in-hospital deterioration (auroc 0.76; 0.71–0.81), and age for in-hospital mortality (auroc 0.76; 0.71–0.81).
no prognostic model demonstrated consistently higher net benefit than these univariable predictors, across a range of threshold probabilities. conclusions: admission oxygen saturation on room air and patient age are strong predictors of deterioration and mortality among hospitalised adults with covid-19, respectively. none of the prognostic models evaluated here offered incremental value for patient stratification to these univariable predictors. coronavirus disease 2019 , caused by severe acute respiratory syndrome coronavirus-2 (sars-cov-2), causes a spectrum of disease ranging from asymptomatic infection to critical illness. among people admitted to hospital, covid-19 has reported mortality of 21-33%, with 14-17% requiring admission to high dependency or intensive care units (icu) [1] [2] [3] [4] . exponential surges in transmission of sars-cov-2, coupled with the severity of disease among a subset of those affected, pose major challenges to health services by threatening to overwhelm resource capacity [5] . rapid and effective triage at the point of presentation to hospital is therefore required to facilitate adequate allocation of resources and to ensure that patients at higher risk of deterioration are managed and monitored appropriately. importantly, prognostic models may have additional value in patient stratification for emerging drug therapies [6, 7] . as a result, there has been global interest in development of prediction models for covid-19 [8] . these include models aiming to predict a diagnosis of covid-19, and prognostic models, aiming to predict disease outcomes. at the time of writing, a living systematic review has already catalogued 145 diagnostic or prognostic models for covid-19 [8] . critical appraisal of these models using quality assessment tools developed specifically for prediction modelling studies suggests that the candidate models are poorly reported, at high risk of bias and over-estimation of their reported performance [8, 9] . 
however, independent evaluation of candidate prognostic models in unselected datasets has been lacking. it therefore remains unclear how well these proposed models perform in practice, or whether any are suitable for widespread clinical implementation. we aimed to address this knowledge gap by systematically evaluating the performance of proposed prognostic models, among consecutive patients hospitalised with a final diagnosis of covid-19 at a single centre, when using predictors measured at the point of hospital admission. we used a published living systematic review to identify all candidate prognostic models for covid-19 indexed in pubmed, embase, arxiv, medrxiv, or biorxiv until 5th may 2020, regardless of underlying study quality [8]. we included models that aim to predict clinical deterioration or mortality among patients with covid-19. we also included prognostic scores commonly used in clinical practice [10] [11] [12], but not specifically developed for covid-19 patients, since these models may also be considered for use by clinicians to aid risk-stratification for patients with covid-19. for each candidate model identified, we extracted predictor variables, outcome definitions (including time horizons), modelling approaches, and final model parameters from original publications, and contacted authors for additional information where required. we excluded scores where the underlying model parameters were not publicly available, since we were unable to reconstruct them, along with models for which included predictors were not available in our dataset. the latter included models that require computed tomography imaging or arterial blood gas sampling, since these investigations were not routinely performed among unselected patients with covid-19 at our centre.
our study is reported in accordance with transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (tripod) guidance for external validation studies [13]. investigations were routinely performed. data were collected by direct extraction from electronic health records, complemented by manual curation. variables of interest in the dataset included: demographics (age, gender, ethnicity), comorbidities (identified through manual record review), clinical observations, laboratory measurements, radiology reports, and clinical outcomes. each chest radiograph was reported by a single radiologist, who was provided with a short summary of the indication for the investigation at the time of request, reflecting routine clinical conditions. chest radiographs were classified using british society of thoracic imaging criteria, and using a modified version of the radiographic assessment of lung edema (rale) score [14, 15]. for each predictor, measurements were recorded as part of routine clinical care. where serial measurements were available, we included the measurement taken closest to the time of presentation to hospital, with a maximum interval between presentation and measurement of 24 hours. for models that used icu admission or death, or progression to 'severe' covid-19 or death, as composite endpoints, we used a composite 'clinical deterioration' endpoint as the primary outcome. we defined clinical deterioration as initiation of ventilatory support (continuous positive airway pressure, non-invasive ventilation, high flow nasal cannula oxygen, invasive mechanical ventilation or extra-corporeal membrane oxygenation) or death, equivalent to world health organization clinical progression scale ≥ 6 [16]. this definition does not include standard oxygen therapy. we did not apply any temporal limits on (a) the minimum duration of respiratory support; or (b) the interval between presentation to hospital and the outcome.
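purely as an illustration (this is not the study's code, and the support-category labels are our own shorthand), the composite 'clinical deterioration' endpoint described above can be expressed as a simple classification rule:

```python
# Illustrative sketch of the composite "clinical deterioration" endpoint:
# any ventilatory support or death (WHO clinical progression scale >= 6).
# Standard oxygen therapy alone does NOT meet the endpoint.
# Category labels below are our own shorthand, not from the paper.

VENTILATORY_SUPPORT = {
    "cpap",   # continuous positive airway pressure
    "niv",    # non-invasive ventilation
    "hfno",   # high-flow nasal cannula oxygen
    "imv",    # invasive mechanical ventilation
    "ecmo",   # extra-corporeal membrane oxygenation
}

def clinical_deterioration(respiratory_support: str, died: bool) -> bool:
    """True if the participant meets the composite endpoint."""
    return died or respiratory_support.lower() in VENTILATORY_SUPPORT

assert clinical_deterioration("CPAP", died=False) is True
assert clinical_deterioration("standard oxygen", died=False) is False
assert clinical_deterioration("none", died=True) is True
```

note that death counts towards the endpoint regardless of the level of respiratory support received, matching the "ventilatory support or death" definition.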
the rationale for this composite outcome is to make the endpoint more generalisable between centres, since hospital respiratory management algorithms may vary substantially. defining the outcome based on level of support, as opposed to ward setting, also ensures that it is appropriate in the context of a pandemic, when treatments that would usually only be considered in an icu setting may be administered in other environments due to resource constraints. where models specified their intended time horizon in their original description, we used this timepoint in the primary analysis, in order to ensure unbiased assessment of model calibration. where the intended time horizon was not specified, we assessed the model to predict in-hospital deterioration or mortality, as appropriate. all deterioration and mortality events were included, regardless of their clinical aetiology. participants were followed-up clinically to the point of discharge from hospital. we extended follow-up beyond discharge by cross-checking nhs spine records to identify reported deaths post-discharge, thus ensuring >30 days' follow-up for all participants. for each prognostic model included in the analyses, we reconstructed the model according to authors' original descriptions, and sought to evaluate the model discrimination and calibration performance against our approximation of their original intended endpoint. for models that provide online risk calculator tools, we validated our reconstructed models against original authors' models, by cross-checking our predictions against those generated by the web-based tools for a random subset of participants. for all models, we assessed discrimination by quantifying the area under the receiver operating characteristic curve (auroc) [17] . for models that provided outcome probability scores, we assessed calibration by visualising calibration of predicted vs. 
observed risk using loess-smoothed plots, and by quantifying calibration slopes and calibration-in-the-large (citl). for points-based models this did not allow assessment of citl, but allowed us to examine the calibration slope in our dataset. we also assessed the discrimination of each candidate model for standardised outcomes of: (a) our composite endpoint of clinical deterioration; and (b) mortality, across a range of pre-specified time horizons from admission (7 days, 14 days, 30 days and any time during hospital admission), by calculating time-dependent aurocs (with cumulative sensitivity and dynamic specificity) [18]. the rationale for this analysis was to harmonise endpoints, in order to facilitate more direct comparisons of discrimination between the candidate models. in order to further benchmark the performance of candidate prognostic models, we then computed aurocs for a limited number of univariable predictors considered to be of highest importance a priori, based on clinical knowledge and existing data, for prediction of our composite endpoints of clinical deterioration and mortality (7 days, 14 days, 30 days and any time during hospital admission). the a priori predictors of interest examined in this analysis were age, clinical frailty scale, oxygen saturation at presentation on room air, c-reactive protein and absolute lymphocyte count [8, 19]. decision curve analysis allows assessment of the clinical utility of candidate models, and is dependent on both model discrimination and calibration [20]. we performed decision curve analyses to quantify the net benefit achieved by each model for predicting the intended endpoint, in order to inform clinical decision making across a range of risk:benefit ratios for an intervention or 'treatment' [20]. in this approach, the risk:benefit ratio is analogous to the cut point for a statistical model above which the intervention would be considered beneficial (deemed the 'threshold probability').
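the calibration measures described above have standard definitions: the calibration slope is the coefficient from refitting the outcome on the model's linear predictor, and citl is the intercept when the linear predictor is included as an offset. the paper's analyses were done in r; as an illustrative sketch only (numpy, function names ours), these can be computed with newton-raphson logistic fits:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def calibration_slope(y, lp, n_iter=50):
    """Slope b from refitting y ~ a + b*lp by Newton-Raphson (IRLS).
    A slope of 1 indicates good calibration; < 1 suggests overfitting."""
    lp = np.asarray(lp, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(lp), lp])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        W = p * (1.0 - p)                      # IRLS weights
        H = X.T @ (X * W[:, None])             # Hessian
        beta = beta + np.linalg.solve(H, X.T @ (y - p))
    return beta[1]

def calibration_in_the_large(y, lp, n_iter=50):
    """Intercept a from y ~ a + offset(lp); 0 means calibrated-in-the-large."""
    lp = np.asarray(lp, dtype=float)
    y = np.asarray(y, dtype=float)
    a = 0.0
    for _ in range(n_iter):
        p = sigmoid(a + lp)
        a += np.sum(y - p) / np.sum(p * (1.0 - p))
    return a
```

on well-calibrated simulated data (outcomes drawn with probability sigmoid(lp)), the slope is close to 1 and citl close to 0; systematic over- or under-estimation of risk, as reported for several models here, shows up as a non-zero citl.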
net benefit was calculated as sensitivity × prevalence − (1 − specificity) × (1 − prevalence) × w, where w is the odds at the threshold probability and the prevalence is the proportion of patients who experienced the outcome [20]. we calculated net benefit across a range of clinically relevant threshold probabilities, ranging from 0 to 0.5, since the risk:benefit ratio may vary for any given intervention (or 'treatment'). we compared the utility of each candidate model against strategies of treating all and no patients, and against the best performing univariable predictor for in-hospital clinical deterioration, or mortality, as appropriate. to ensure that fair, head-to-head net benefit comparisons were made between multivariable probability-based models, points-score models and univariable predictors, we calibrated each of these to the validation dataset for the purpose of decision curve analysis. probability-based models were recalibrated to the validation data by refitting logistic regression models with the candidate model linear predictor as the sole predictor. we calculated 'delta' net benefit as net benefit when using the index model minus net benefit when: (a) treating all patients; and (b) using the most discriminating univariable predictor. decision curve analyses were done using the rmda package in r [21]. we handled missing data using multiple imputation by chained equations [22], using the mice package in r [23]. all variables and outcomes in the final prognostic models were included in the imputation model to ensure compatibility [22]. a total of 10 imputed datasets were generated; discrimination, calibration and net benefit metrics were pooled using rubin's rules [24]. all analyses were conducted in r (version 3.5.1).
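the paper computed decision curves with the rmda package in r; purely as an illustration of the net benefit formula above (python, function names ours), note that sensitivity × prevalence equals the true-positive fraction tp/n, and (1 − specificity) × (1 − prevalence) equals the false-positive fraction fp/n:

```python
import numpy as np

def net_benefit(y, p, threshold):
    """Net benefit of treating everyone with predicted risk >= threshold.
    NB = TP/n - (FP/n) * w, where w = threshold / (1 - threshold) is the
    odds at the threshold probability."""
    y = np.asarray(y, dtype=bool)
    treat = np.asarray(p, dtype=float) >= threshold
    n = y.size
    tp = np.sum(treat & y)
    fp = np.sum(treat & ~y)
    w = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * w

def net_benefit_treat_all(y, threshold):
    """'Treat all' comparator: every patient classified positive."""
    prev = np.mean(np.asarray(y, dtype=float))
    w = threshold / (1.0 - threshold)
    return prev - (1.0 - prev) * w

# Worked example: 2 events among 4 patients, threshold probability 0.2.
y = np.array([1, 1, 0, 0])
p = np.array([0.9, 0.3, 0.4, 0.1])
print(net_benefit(y, p, 0.2))          # 2/4 - (1/4)*(0.2/0.8) = 0.4375
print(net_benefit_treat_all(y, 0.2))   # 0.5 - 0.5*0.25 = 0.375
```

the 'delta' net benefit described above is then just the difference between two such curves evaluated at the same threshold.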
in sensitivity analyses, we recalculated discrimination and calibration parameters for each candidate model using (a) a complete case analysis (in view of the large amount of missingness for some models); (b) excluding patients without pcr-confirmed sars-cov-2 infection; and (c) excluding patients who met the clinical deterioration outcome within 4 hours of arrival to hospital. we also examined for non-linearity in the a priori univariable predictors using restricted cubic splines, with 3 knots. finally, we estimated optimism for discrimination and calibration parameters for the a priori univariable predictors using bootstrapping (1,000 iterations), using the rms package in r [25]. the pre-specified study protocol was approved by east midlands - nottingham 2 research ethics committee (ref: 20/em/0114; iras: 282900). we identified a total of 37 studies describing prognostic models, of which 19 studies (including 22 unique models) were eligible for inclusion (supplementary figure 1 and table 1). of these, 5 models were not specific to covid-19, but were developed as prognostic scores for emergency department attendees [26], hospitalised patients [12, 27], people with suspected infection [10] or community-acquired pneumonia [11], respectively. of the 17 models developed specifically for covid-19, most (10/17) were developed using datasets originating in china. overall, discovery populations included hospitalised patients and were similar to the current validation population with the exception of one study that discovered a model using community data [28], and another that used simulated data [29]. a total of 13/22 models use points-based scoring systems to derive final model scores, with the remainder using logistic regression modelling approaches to derive probability estimates. a total of 12/22 prognostic models primarily aimed to predict clinical deterioration, while the remaining 10 sought to predict mortality alone.
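the bootstrap optimism estimate for the univariable predictors, described in the methods above, was done in r with the rms package; as an illustrative sketch only (numpy, function names ours, rank-based auroc ignoring ties), the idea can be shown as follows. for a single monotone predictor there is no refitting step, so the estimated optimism is expected to be near zero, consistent with what the paper reports:

```python
import numpy as np

def auroc(y, x):
    """Rank-based AUROC (Mann-Whitney U): P(x_case > x_control).
    Ties are not averaged in this sketch (fine for continuous x)."""
    y = np.asarray(y, dtype=bool)
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty_like(x)
    ranks[order] = np.arange(1, x.size + 1)
    n1 = y.sum()           # cases
    n0 = (~y).sum()        # controls
    u = ranks[y].sum() - n1 * (n1 + 1) / 2
    return u / (n1 * n0)

def bootstrap_optimism(y, x, n_boot=1000, seed=0):
    """Optimism = mean over bootstraps of (apparent AUROC in the bootstrap
    sample - AUROC of the same predictor in the original sample)."""
    y = np.asarray(y)
    x = np.asarray(x)
    rng = np.random.default_rng(seed)
    original = auroc(y, x)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, y.size, y.size)
        diffs.append(auroc(y[idx], x[idx]) - original)
    return float(np.mean(diffs))
```

with a perfectly separating predictor the auroc is 1.0; with a univariable predictor and no model fitting, the bootstrap optimism averages out near zero.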
when specified, time horizons for prognosis ranged from 1 to 30 days. candidate prognostic models not included in the current validation study are summarised in supplementary table 1. during the study period, 521 adults were admitted with a final diagnosis of covid-19, of whom 411 met the eligibility criteria for inclusion (flowchart shown in supplementary figure 2). median age of the cohort was 66 years (interquartile range (iqr) 53-79), and the majority were male (252/411; 61.3%). (figure 4). for all models that provide probability scores for either deterioration or mortality, calibration appeared visually poor with evidence of overfitting and either systematic overestimation or underestimation of risk (figure 1). supplementary figure 5 shows associations between prognostic models with points-based scores and actual risk. in addition to demonstrating reasonable discrimination, the news2 and curb-65 models demonstrated approximately linear associations between scores and actual probability of deterioration at 24 hours and mortality at 30 days, respectively. next, we sought to compare the discrimination of these models for both clinical deterioration and mortality across the range of time horizons, benchmarked against preselected univariable predictors associated with adverse outcomes in covid-19 [8, 19]. we recalculated time-dependent aurocs for each of these outcomes, stratified by time horizon to the outcome (supplementary figures 6 and 7). these analyses showed that aurocs generally declined with increasing time horizons. admission oxygen saturation on room air was the strongest predictor of in-hospital deterioration (auroc 0.76; 95% ci 0.71-0.81), while age was the strongest predictor of in-hospital mortality (auroc 0.76; 95% ci 0.71-0.81).
we compared net benefit for each prognostic model (for its original intended endpoint) to the strategies of treating all patients, treating no patients, and using the most discriminating univariable predictor for either deterioration (i.e. oxygen saturation on air) or mortality (i.e. patient age) to stratify treatment (supplementary figure 8). although all prognostic models showed greater net benefit than treating all patients at the higher range of threshold probabilities, none of these models demonstrated consistently greater net benefit than the most discriminating univariable predictor, across the range of threshold probabilities (figure 2). recalculation of these parameters in the sensitivity analyses yielded similar results (supplementary figure 9). finally, internal validation using bootstrapping showed near zero optimism for discrimination and calibration parameters for the univariable models (supplementary table). in this observational cohort study of consecutive adults hospitalised with covid-19, we systematically evaluated the performance of 22 prognostic models for covid-19. these included models developed specifically for covid-19, along with existing scores in routine clinical use prior to the pandemic. for prediction of clinical deterioration or mortality, aurocs ranged from 0.56 to 0.78. news2 performed reasonably well for prediction of deterioration over a 24-hour interval, achieving an auroc of 0.78, while the carr 'final' model [31] also had an auroc of 0.78, but tended to systematically underestimate risk. all covid-specific models that derived an outcome probability of either deterioration or mortality showed poor calibration. we found that oxygen saturation (auroc 0.76) and patient age (auroc 0.76) were the most discriminating single variables for prediction of in-hospital deterioration and mortality respectively. these predictors have the added advantage that they are immediately available at the point of presentation to hospital.
in decision curve analysis, which is dependent upon both model discrimination and calibration, no prognostic model demonstrated clinical utility consistently greater than using these univariable predictors to inform decision-making. while previous studies have largely focused on novel model discovery, or evaluation of a limited number of existing models, this is the first study to our knowledge to evaluate systematically-identified candidate prognostic models for covid-19. we used a comprehensive living systematic review [8] to identify eligible models and sought to reconstruct each model as per the original authors' description. we then evaluated performance against its intended outcome and time horizon, wherever possible, using recommended methods of external validation incorporating assessments of discrimination, calibration and net benefit [17] . moreover, we used a robust approach of electronic health record data capture, supported by manual curation, in order to ensure a high-quality dataset, and inclusion of unselected and consecutive covid-19 cases that met our eligibility criteria. in addition, we used robust outcome measures of mortality and clinical deterioration, aligning with the who clinical progression scale [16] . a weakness of the current study is that it is based on retrospective data from a single centre, and therefore cannot assess between-setting heterogeneity in model performance. second, due to the limitations of routinely collected data, predictor variables were available for varying numbers of participants for each model, with a large proportion of missingness for models requiring lactate dehydrogenase and d-dimer measurements. we therefore performed multiple imputation, in keeping with recommendations for development and validation of multivariable prediction models, in our primary analyses [32] . findings were similar in the complete case sensitivity analysis, thus supporting the robustness of our results. 
future studies would benefit from standardising data capture and laboratory measurements prospectively to minimise predictor missingness. thirdly, a number of models could not be reconstructed in our data. for some models, this was due to the absence of predictors in our dataset, such as those requiring computed tomography imaging, since this is not currently routinely recommended for patients with suspected or confirmed covid-19 [15]. we were also not able to include models for which the parameters were not publicly available. this underscores the need for strict adherence to reporting standards in multivariable prediction models [13]. finally, we used admission data only as predictors in this study, since most prognostic scores are intended to predict outcomes at the point of hospital admission. we note, however, that some scores are designed for dynamic in-patient monitoring, with news2 showing reasonable discrimination for deterioration over a 24-hour interval, as originally intended [27]. future studies may integrate serial data to examine model performance when using such dynamic measurements. despite the vast global interest in the pursuit of prognostic models for covid-19, our findings show that none of the covid-19-specific models evaluated in this study can currently be recommended for routine clinical use. in addition, while some of the evaluated models that are not specific to covid-19 are routinely used and may be of value among in-patients [12, 27], people with suspected infection [10] or community-acquired pneumonia [11], none showed greater clinical utility than the strongest univariable predictors among patients with covid-19. our data show that admission oxygen saturation on air is a strong predictor of clinical deterioration and may be evaluated in future studies to stratify in-patient management and for remote community monitoring.
we note that all novel prognostic models for covid-19 assessed in the current study were derived from single-centre data. future studies may seek to pool data from multiple centres in order to robustly evaluate the performance of existing and newly emerging models across heterogeneous populations, and develop and validate novel prognostic models, through individual participant data meta-analysis [33] . such an approach would allow assessments of between-study heterogeneity and the likely generalisability of candidate models. it is also imperative that discovery populations are representative of target populations for model implementation, with inclusion of unselected cohorts. moreover, we strongly advocate for transparent reporting in keeping with tripod standards (including modelling approaches, all coefficients and standard errors) along with standardisation of outcomes and time horizons, in order to facilitate ongoing systematic evaluations of model performance and clinical utility [13] . we conclude that baseline oxygen saturation on room air and patient age are strong predictors of deterioration and mortality, respectively. none of the prognostic models evaluated in this study offer incremental value for patient stratification to these univariable predictors when using admission data. therefore, none of the evaluated prognostic models for covid-19 can be recommended for routine clinical implementation. future studies seeking to develop prognostic models for covid-19 should consider integrating multi-centre data in order to increase generalisability of findings, and should ensure benchmarking against existing models and simpler univariable predictors. 
mews = modified early warning score; qsofa = quick sequential (sepsis-related) organ failure assessment; rems = rapid emergency medicine score; news = national early warning score; tactic = therapeutic study in pre-icu patients admitted with covid-19; avpu = alert / responds to voice / responsive to pain / unresponsive; crp = c-reactive protein; ldh = lactate dehydrogenase; rale = radiographic assessment of lung edema; ards = acute respiratory distress syndrome; icu = intensive care unit; ecmo = extra-corporeal membrane oxygenation. units, unless otherwise specified, are: age in years; respiratory rate in breaths per minute; heart rate in beats per minute; blood pressure in mmhg; temperature in °c; oxygen saturation in %; crp in mg/l; ldh in u/l; neutrophils, lymphocytes, total white cell count and platelets x 10^9/l; d-dimer in ng/ml; creatinine in μmol/l; estimated glomerular filtration rate in ml/min/1.73 m2, albumin in g/l. ^clinician-defined obesity. for each model, performance is evaluated for its original intended outcome, shown in 'primary outcome' column. auroc = area under the receiver operating characteristic curve; ci = confidence interval. for each plot, the blue line represents a loess-smoothed calibration curve from the stacked multiply imputed datasets and rug plots indicate the distribution of data points. no model intercept was available for the caramelo or colombi 'clinical' models; the intercepts for these models were calibrated to the validation dataset, by using the model linear predictors as offset terms. the primary outcome of interest for each model is shown in the plot sub-heading. for each analysis, the endpoint is the original intended outcome and time horizon for the index model. each candidate model and univariable predictor was calibrated to the validation data during analysis to enable fair, head-to-head comparisons. 
delta net benefit is calculated as net benefit when using the index model minus net benefit when: (1) treating all patients; and (2) using the most discriminating univariable predictor. the most discriminating univariable predictor is admission oxygen saturation (spo2) on room air for deterioration models and patient age for mortality models. delta net benefit is shown with loess-smoothing. black dashed line indicates threshold above which index model has greater net benefit than the comparator. individual decision curves for each candidate model are shown in supplementary figure 8. for each model, performance is evaluated for an approximation of its original intended outcome, shown in 'primary outcome' column. auroc = area under the receiver operating characteristic curve; ci = confidence interval.
[table: score | primary outcome | auroc (95% ci) | calibration slope (95% ci) | calibration-in-the-large (95% ci); row values not recovered in this extraction]
references:
presenting characteristics, comorbidities, and outcomes among 5700 patients hospitalized with covid-19 in the
features of 20 133 uk patients in hospital with covid-19 using the isaric who clinical characterisation protocol: prospective observational cohort study
critical care utilization for the covid-19 outbreak in lombardy, italy
report 17 - clinical characteristics and predictors of outcomes of hospitalised patients with covid-19 in a london nhs trust: a retrospective cohort study | faculty of medicine | imperial college london
the demand for inpatient and icu beds for covid-19 in the us: lessons from chinese cities
remdesivir for the treatment of covid-19 - preliminary report
effect of dexamethasone in hospitalized patients with covid-19: preliminary report
prediction models for diagnosis and prognosis of covid-19: systematic review and critical appraisal
probast: a tool to assess the risk of bias and applicability of prediction model studies
assessment of clinical criteria for sepsis
defining community acquired pneumonia severity on presentation to hospital: an international derivation and validation study
royal college of physicians. national early warning score (news) 2 | rcp london
transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (tripod): the tripod statement
frequency and distribution of chest radiographic findings in covid-19 positive patients
covid-19 resources | the british society of thoracic imaging
a minimal common outcome measure set for covid-19 clinical research
prognosis research in healthcare: concepts, methods, and impact
time-dependent roc curve analysis in medical research: current methods and applications
the effect of frailty on survival in patients with covid-19 (cope): a multicentre, european, observational cohort study
a simple, step-by-step guide to interpreting decision curve analysis
rmda: risk model decision analysis
multiple imputation using chained equations: issues and guidance for practice
multivariate imputation by chained equations in r
multiple imputation for nonresponse in surveys
rms: regression modeling strategies
rapid emergency medicine score: a new prognostic tool for in-hospital mortality in nonsurgical emergency department patients
the ability of the national early warning score (news) to discriminate patients at risk of early cardiac arrest, unanticipated intensive care unit admission, and death
predicting mortality due to sars-cov-2: a mechanistic score relating obesity and diabetes to covid-19 outcomes in mexico
estimation of risk factors for covid-19 mortality - preliminary results
sample size considerations for the external validation of a multivariable prognostic model: a resampling study
evaluation and improvement of the national early warning score (news2) for covid-19: a multi-hospital study
transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (tripod): explanation and elaboration
cochrane ipd meta-analysis methods group. individual participant data (ipd) meta-analyses of diagnostic and prognostic modeling studies: guidance on their use
validation of a modified early warning score in medical admissions
well-aerated lung on admitting chest ct to predict adverse outcome in covid-19 pneumonia
a clinical risk score to identify patients with covid-19 at high risk of critical care admission or death: an observational cohort study
development and validation of an early warning score (ewas) for predicting clinical deterioration in patients with coronavirus disease 2019
early prediction of mortality risk among severe covid-19 patients using machine learning
prognostic factors for covid-19 pneumonia progression to severe symptom based on the earlier clinical features: a retrospective analysis
prediction for progression risk in patients with covid-19 pneumonia: the call score
acp risk grade: a simple mortality index for patients with confirmed or suspected severe acute respiratory syndrome coronavirus 2 disease (covid-19) during the early stage of outbreak in wuhan, china
host susceptibility to severe covid-19 and establishment of a host risk score: findings of 487 cases outside wuhan
development and external validation of a prognostic multivariable model on admission for hospitalized patients with covid-19
an interpretable mortality prediction model for covid-19 patients
risk prediction for poor outcome and death in hospital inpatients with covid-19: derivation in wuhan, china and external validation in london
comparing rapid scoring systems in mortality prediction of critically ill patients with novel coronavirus disease
predicting covid-19 malignant progression with ai techniques
performing risk stratification for covid-19 when individual level data is not available, the experience of a large healthcare organization
holistic ai-driven quantification, staging and prognosis of covid-19 pneumonia
predicting community mortality risk due to covid-19 using machine learning and development of a prediction tool
a tool for early prediction of severe coronavirus disease 2019 (covid-19): a multicenter study using the risk nomogram in wuhan and guangdong, china
towards an artificial intelligence framework for data-driven prediction of coronavirus clinical severity
development and validation of a survival calculator for hospitalized patients with covid-19
development and validation of a clinical risk score to predict the occurrence of critical illness in hospitalized patients with covid-19
prediction of the clinical outcome of covid-19 patients using t lymphocyte subsets with 340 cases from wuhan, china: a retrospective cohort study and a web visualization tool
clinical decision support tool and rapid point-of-care platform for determining disease severity in patients with covid-19
predicting mortality risk in patients with covid-19 using artificial intelligence to help medical decision-making
machine learning-based ct radiomics model for predicting hospital stay in patients with pneumonia associated with sars-cov-2 infection: a multicenter study
a machine learning model reveals older age and delayed hospitalization as predictors of mortality in patients with covid-19
evaluating a widely implemented proprietary deterioration index model among hospitalized covid-19 patients
machine learning to predict mortality and critical events in covid-19 positive
toward a covid-19 score-risk assessments and registry
association of radiologic findings with mortality of patients infected with 2019 novel coronavirus in wuhan
risk assessment of progression to severe conditions for patients with covid-19 pneumonia: a single-center retrospective study
key: cord-332093-iluqwwxs authors: lessler, justin; cummings, derek a. t. title: mechanistic models of infectious disease and their impact on public health date: 2016-02-17 journal: american journal of epidemiology doi: 10.1093/aje/kww021 sha: doc_id: 332093 cord_uid: iluqwwxs from the 1930s through the 1940s, lowell reed and wade hampton frost used mathematical models and mechanical epidemic simulators as research tools and to teach epidemic theory to students at the johns hopkins bloomberg school of public health (then the school of hygiene and public health). since that time, modeling has become an integral part of epidemiology and public health. models have been used for explanatory and inferential purposes, as well as in planning and implementing public health responses. in this article, we review a selection of developments in the history of modeling of infectious disease dynamics over the past 100 years. we also identify trends in model development and use and speculate as to the future use of models in infectious disease dynamics.
from the 1930s through the 1940s, lowell reed and wade hampton frost used mathematical models and mechanical epidemic simulators as research tools and to teach epidemic theory to students at the johns hopkins bloomberg school of public health (then the school of hygiene and public health) (1, 2) . though never published by reed and frost (versions of the model were eventually published by their students (3, 4) ), their model was one of the first mechanistic models of infectious disease transmission, and at a time long before digital computing, they may have been the first to use simulation methods to understand the epidemic process. reed and frost were pioneers in the study of infectious disease dynamics using mechanistic models, a field of epidemiology that has developed in parallel with the associative statistical models and methods of causal inference that dominate much of epidemiologic research. over the past century, mechanistic models have played an essential role in shaping public health policy, the way we study interventions aimed at controlling infectious diseases, and the theory on which disease control is based. mechanistic models differ from traditional statistical models such as regression models because their structure makes explicit hypotheses about the biological mechanisms that drive infection dynamics. such hypotheses range from simple representations of the time it takes to complete some part of the disease process (e.g., sartwell's lognormal representation of the incubation period (5)) to complex agent-based models that attempt to explicitly represent social interactions of people in an entire country (6, 7) or even the world (8) . regardless of scale, approach, and complexity, these models have more of the flavor of models in physics than the statistical models that are used in other branches of epidemiology, and in many cases they can be used to predict the effectiveness of hypothetical interventions in controlling disease spread. 
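The Reed-Frost model described above is usually presented as a stochastic chain binomial. A minimal sketch of its deterministic (expected-value) form, with an assumed effective-contact probability, is:

```python
def reed_frost(s0, i0, p, generations):
    """Expected-value Reed-Frost chain binomial.

    s0: initial susceptibles; i0: initial infectives;
    p: probability of effective contact between any two people.
    Each susceptible escapes infection in a generation with
    probability (1 - p) ** i, where i is the current number of
    infectives. Returns the (susceptible, infective) trajectory.
    """
    s, i = float(s0), float(i0)
    trajectory = [(s, i)]
    for _ in range(generations):
        new_i = s * (1.0 - (1.0 - p) ** i)  # expected new infections
        s, i = s - new_i, new_i
        trajectory.append((s, i))
    return trajectory

# one introduction into a population of 100; p is chosen so the
# index case initially causes about 2 secondary infections
traj = reed_frost(s0=99, i0=1, p=0.02, generations=30)
```

In the stochastic version that Reed and Frost simulated mechanically, the number of new infections would instead be drawn from a Binomial(s, 1 − (1 − p)^i) distribution each generation; the deterministic form above tracks its expectation.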
perhaps the first mechanistic model of infectious disease transmission used in assessing intervention strategies was a mathematical model of malaria transmission developed and refined by ronald ross in a series of papers published between 1908 and 1921 (9) (10) (11) , pre-dating the work of reed and frost by decades. this model had a direct and powerful message for public health: malaria could be controlled and even eliminated through mosquito control, even if the vector could not be completely eliminated. ross used his theoretical framework to develop and advocate for multiple indices, including the prevalence rate and the entomological inoculation rate (12) , that could effectively characterize the intensity of transmission in an area and identify goals for control. in the wake of the founding of the global malaria eradication program by the world health organization, george macdonald (13) extended ross's work in order to justify the use of insecticide as a tool for global malaria eradication (14) . in particular, he showed that increasing daily mosquito mortality from 5% to 45% would be adequate to eliminate malaria even in locations with the highest transmission intensities in africa. mechanistic models continue to play an important role in the fight against malaria. the work of ross and macdonald looms large to this day, with a recent review finding that the majority of models published since 1940 depart from central hypotheses of the ross-macdonald model in only a few key assumptions, if any (15) . although there are numerous instances over the past century in which mechanistic models have contributed to the control of a single disease (see figure 1 for some examples), their larger contribution may be in our general understanding of disease control. the prime example is the concept of herd immunity and the critical vaccination threshold. 
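Macdonald's insecticide argument above can be made concrete before turning to herd immunity. In his formulation, R0 is proportional to p^n / (−ln p), where p is the daily mosquito survival probability and n the extrinsic incubation period in days; the sketch below (assuming n = 10 and holding all other factors constant) shows why raising daily mortality from 5% to 45% is so effective:

```python
import math

def survival_factor(p, n=10):
    """Mosquito-survival component of the Macdonald R0 formula,
    p**n / (-ln p). Biting rate, mosquito density, transmission
    efficiencies, and the human recovery rate are held constant,
    so ratios of this factor equal ratios of R0.
    """
    return p ** n / (-math.log(p))

low_mortality = survival_factor(0.95)   # 5% daily mortality
high_mortality = survival_factor(0.55)  # 45% daily mortality
fold_reduction = low_mortality / high_mortality
```

The fold reduction is in the thousands, which is why Macdonald could argue that even the most intensely transmitting settings could be driven below the epidemic threshold by adult mosquito mortality alone.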
herd immunity is the indirect protection offered to members of the population susceptible to the disease (i.e., not immune and with the potential to be infected) by the immunity of surrounding individuals, and the critical vaccination threshold is the percentage of the population that must be vaccinated in order for the introduction of an infectious case to not spark an epidemic (16) . to estimate the critical vaccination threshold, we must first understand one of the most critical concepts of infectious disease dynamics, the basic reproductive number, r0. r0 is the number of cases that a single infectious individual is expected to cause in a fully susceptible population. this concept was first introduced in demography and underwent significant development by lotka while on a visit to the johns hopkins university school of hygiene and public health in 1925 (see heesterbeek (17) for a full history of the development of r0 in infectious disease). although this value does vary by setting, for many pathogens it is remarkably consistent across contexts and serves as a rough quantification of pathogen transmissibility. based on dynamic models, it has been shown that if we vaccinate a proportion of the population equivalent to 1 − 1/r0, then the pathogen will fail to spread in that population. this is the critical vaccination threshold, and it has helped to set vaccination goals for a number of diseases, particularly when elimination is the goal. however, the dynamics of vaccines in real populations are complex, and mechanistic models have helped us to understand what to expect after changes in vaccination policy. for instance, immediately after the introduction of a vaccine or improvement in vaccination rates, a disease may appear to be eliminated from a population.
however, this long honeymoon period may be followed by a large, resurgent outbreak that is bigger than the yearly epidemics seen before the introduction of vaccination (though the cumulative number of cases is still less than what would have been seen without vaccination) (18) . these results have helped public health officials to understand that initial apparent vaccine successes may not last, as well as what to expect after introducing a new vaccine. mechanistic models have also been used to understand the optimal age range for vaccination campaigns (19, 20) , how such campaigns should be timed (21) , and how best to use vaccines when supplies are limited (20, 22) . models have also been used to design active response strategies for vaccine use, including ring vaccination strategies such as those implemented in the smallpox eradication campaign (23) . models were also used to assess strategies to respond to a bioterrorist release of smallpox in the early part of the 21st century and were influential in setting policy for response (24) (25) (26) (27) . one counterintuitive prediction of mechanistic models is that in rare cases, increased population immunity from vaccination can actually increase the incidence of severe disease. the poster child for this phenomenon is congenital rubella syndrome (crs). for most people, rubella infection causes a relatively minor illness characterized by fever and rash; however, when pregnant women are infected during the first trimester of pregnancy, it causes crs, which results in severe complications of pregnancy including congenital disorders and death of the fetus (28) . because vaccination increases the average age of infection (by decreasing the hazard of infection), a vaccination program that does not achieve sufficient coverage can increase the number of pregnant women who are infected, thereby increasing the incidence of crs (29) .
this is not simply a theoretical concept; although there have been no sustained increases in the incidence of crs (in part due to the public health response), both costa rica and greece experienced transient increases in crs burden after rubella vaccination (30, 31) . in light of the threat of crs, mechanistic models have played an important role in setting world health organization recommendations for the introduction of a rubella vaccine. these recommendations encourage countries to wait to introduce the vaccine until measles vaccination rates (measles and rubella vaccines are usually given together) are high enough to guarantee a reduction in crs cases and to strongly consider vaccination campaigns in women of childbearing age before the vaccine is introduced (32) . vaccination is only 1 of a suite of control measures. another that is of particular importance in the control of macroparasite infections is mass drug administration. a key difference between microparasite and macroparasite dynamics is the huge variation in transmission potential of human hosts, with some individuals experiencing huge pathogen loads that contribute disproportionately to transmission within populations (33) . here, strategies have taken an eye toward reducing overall population burdens of macroparasites, including targeting those with the highest burdens. theoretical explorations of the impact of heterogeneity in transmissibility have helped inform interventions and aided in the development of theory exploring the impact of heterogeneities in microparasites (34) . mechanistic models were also central early in the epidemic of human immunodeficiency virus (hiv) and acquired immunodeficiency syndrome (aids). ron brookmeyer (35) used the incubation period distribution of hiv to "back calculate" the number of hiv infections that must have occurred over the previous course of the epidemic and predict the number of future hiv/aids cases in those already infected with hiv. he thereby linked an observable quantity (the number of aids cases) with an unobservable one (the number of people living with hiv). longini et al.
(36) then fit a more mechanistic model of disease progression to data from hiv-infected individuals in the united states army, achieving similar results by explicitly representing the biological process. when attempting to estimate global mortality from measles infection, simons et al. (37) used a state-space model (i.e., a hidden markov model) which linked an underlying model of measles epidemic dynamics (the process model) with nationally reported measles incidence via an observation model (38) . they were thereby able to estimate the extent to which national reports underestimated measles cases by reconciling these reports with what was likely given birth rates and a known epidemic process. planning for so-called "black swans," which are unlikely but catastrophic events, is essential to ensuring security and population health. the prime example of an infectious disease black swan is the 1918 influenza pandemic, which is estimated to have killed 50-100 million people in 2 years (39) . governments and policy makers depend on simulations built on mechanistic models to decide the extent of these threats and what can be done to confront them. for the past decade and a half, there have been ongoing concerns that one of several strains of influenza a that have been known to infect humans from domestic poultry (h5n1, h9n2, etc.) might develop the ability to transmit efficiently in humans and cause a major pandemic. h5n1 strains are seen as particularly concerning because of their high case fatality rate and the substantial increase in the number of human cases (particularly in southeast asia) that started in 2003 (40) . independent teams of disease modeling experts developed sophisticated agent-based models of potential emergence events to determine whether effective antiviral agents could be used to contain an emerging influenza at the source (6, 41) .
these models showed that under reasonable expectations of the transmissibility of an emerging influenza (i.e., r 0 in the 1.5-2.0 range), containment was possible, though perhaps not practical, as it would require the deployment of millions of courses of antiviral medication, very early detection of the disease, and rapid response. in parallel work, groups considered how the impact of a pandemic could be mitigated in the united states if the initial containment attempt was unsuccessful (42) (43) (44) . the efforts of independent groups showed that something more than social-distancing measures (e.g., school closure, case isolation) would be needed to control a pandemic and that effective antivirals could help. in part on the basis of this work, the united states and other countries decided to stockpile antivirals to combat a future pandemic, a decision that has since been criticized by some (45) . however, these criticisms have been focused on concerns about the efficacy of the stockpiled antiviral drugs (46) rather than the results of the modeling work itself. the question of the probability of h5n1 influenza evolving to become transmissible in humans has itself been the focus of mechanistic modeling (47) . after 2 research groups had identified 2 different sets of mutations to the h5n1 virus that would be sufficient to allow airborne transmission in a mammalian host (48, 49) , russell et al. (47) developed a mathematical model of the within host dynamics of influenza evolution. although the authors were unable to confidently estimate the probability of the emergence of a pandemic h5n1 strain because of uncertainties about the underlying biological processes involved, they were able to identify the biological factors on which this probability would most strongly depend and recommend studies (e.g., deep sequencing of viral samples from h5n1-infected hosts) that might help to develop more precise predictions. 
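One way to see why containment of a single introduction is plausible at these R0 values: in a simple Galton-Watson branching process with Poisson-distributed secondary cases (a far cruder sketch than the agent-based models cited above), the probability that one introduction dies out on its own is the smallest root of q = exp(R0(q − 1)):

```python
import math

def extinction_probability(r0, iterations=200):
    """Probability that a transmission chain started by one case
    goes extinct, assuming Poisson(r0) offspring per case: the
    smallest solution of q = exp(r0 * (q - 1)), found by
    fixed-point iteration starting from q = 0.
    """
    q = 0.0
    for _ in range(iterations):
        q = math.exp(r0 * (q - 1.0))
    return q
```

For r0 at or below 1 every chain eventually dies out (q = 1); at r0 = 1.5 roughly 40% of introductions fade out unaided, so active containment only has to extinguish the remainder, provided it acts before the chain grows large.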
there has been considerable debate surrounding the ethics of gain-of-function experiments for h5n1 influenza (50) , but if such experiments are to be justified, they must provide us a way to have advanced warning of a coming pandemic, a task that may only be possible through mechanistic models. however, to be successful, these models will require substantial additional theoretical work on how viral evolution interacts with the distribution of immunity in the population. in the event that an outbreak of an emerging disease does occur, mechanistic models are one of the first tools used to characterize the threat and plan a response. when a pandemic influenza strain emerged in 2009, it was critical to quickly assess whether it had the potential to cause illness with high rates of fatality, like the virus that emerged in the pandemic of 1918, or was a more mild disease, akin to what was seen in the pandemics of 1957 and 1968. initial assessments relied heavily on dynamic models of a variety of types, including phylogenetic techniques paired with demographic models, models based on the probability of the observed number of introductions of pandemic h1n1 into populations outside of mexico, analysis of epidemic curves, and the results of detailed investigations of early outbreaks (51, 52) . analyses by a number of groups quickly showed that the emergent pandemic h1n1 virus was behaving very much like alreadycirculating strains, and although it was still potentially a significant public health threat, it was unlikely to have a qualitatively different impact on mortality or morbidity than circulating influenza strains. 
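Rapid assessments of this kind often start from the early epidemic curve. A common rough chain of reasoning, sketched here with purely illustrative numbers (not estimates from the 2009 pandemic): convert an observed doubling time into a growth rate, approximate R0 as 1 + r·Tg (exact for an exponentially distributed generation interval of mean Tg), and from R0 derive the critical vaccination threshold discussed earlier:

```python
import math

def r0_from_growth(doubling_time, generation_interval):
    """Approximate R0 from early exponential growth: growth rate
    r = ln(2) / doubling_time, then R0 = 1 + r * Tg. This is one
    of several standard approximations, each resting on different
    generation-interval assumptions."""
    r = math.log(2.0) / doubling_time
    return 1.0 + r * generation_interval

def critical_vaccination_threshold(r0):
    """Herd-immunity threshold 1 - 1/R0; zero when R0 <= 1."""
    return max(0.0, 1.0 - 1.0 / r0)

# illustrative inputs: cases doubling weekly, 3-day generation interval
r0 = r0_from_growth(doubling_time=7.0, generation_interval=3.0)
threshold = critical_vaccination_threshold(r0)
```

With these assumed inputs R0 comes out near 1.3 and the threshold near a quarter of the population; both numbers are illustrations of the arithmetic, not published estimates.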
in addition to its role in the response to the 2009 influenza pandemic, mechanistic modeling has played a role in the response to most of the emerging disease threats of this century, from foot and mouth disease in the united kingdom (53) , to severe acute respiratory syndrome coronavirus (54) , to middle east respiratory syndrome coronavirus in saudi arabia (55) , to ebola in west africa (56) . the last of these shows both the power of mechanistic approaches and the dangers of their misuse. in the summer and fall of 2014, the number of ebola cases in west africa was continuing to grow, and it was unclear how severe the epidemic would eventually become. to address this issue, as well as the threat of spread to other countries, a number of modeling exercises were conducted (e.g., gomes et al. (57) ). of particular note was a model released by the centers for disease control and prevention that predicted that, without further intervention, 1.4 million cases of ebola would occur in liberia and sierra leone by mid-january 2015 (58) . this did not come to pass, and although the authors noted that such long-term projections were tenuous, the media and many in the public health community made much of this number. of course interventions and behavior change did occur, but the authors had also made tenuous assumptions about how the populations of liberia and sierra leone mix together, essentially treating each country as a homogeneous entity. in contrast, the world health organization ebola response team, who also made projections based on an unconstrained epidemic, declined to forecast further than 2 months into the future (56) , and though theirs was a moderate overestimate of total cases, they avoided publishing any panic-inducing overestimations (they projected approximately 20,000 cases by november 2, 2014; approximately 13,000 were actually reported by that point) (59) .
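The fragility of long-horizon projections is easy to demonstrate: under unconstrained exponential growth, cases multiply by R every serial interval, so a modest disagreement about R compounds dramatically with the forecast horizon. All numbers below are illustrative (a 15-day serial interval is in the range reported for ebola, but nothing here reproduces the cited projections):

```python
def projected_cases(current_cases, r, days, serial_interval=15.0):
    """Unconstrained projection: cases grow by a factor of r
    every serial interval."""
    return current_cases * r ** (days / serial_interval)

# two analysts disagree modestly about the reproduction number
at_60_lo = projected_cases(100, r=1.5, days=60)
at_60_hi = projected_cases(100, r=1.7, days=60)
at_135_lo = projected_cases(100, r=1.5, days=135)
at_135_hi = projected_cases(100, r=1.7, days=135)

gap_2_months = at_60_hi / at_60_lo      # modest disagreement
gap_4_5_months = at_135_hi / at_135_lo  # much larger disagreement
```

A roughly 1.6-fold disagreement at a 2-month horizon grows to roughly 3-fold at 4.5 months, and this ignores the interventions and behavior change that make real long-range forecasts even more tenuous.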
forecasting the course of disease spread is difficult to do well, particularly in the context of an active response. it also may be the least of what mechanistic approaches to disease epidemiology have to offer. the aforementioned work, particularly that of the world health organization ebola response team, also characterized important aspects of ebola's natural history and epidemiology, including its basic reproductive number (r 0 ), the decline in r over the course of the epidemic, the incubation period, and the serial interval, properties of the disease that will be important to understand should it re-emerge. mechanistic and mathematical approaches aid not only in the response to particular diseases but also in illuminating basic epidemiologic principles and important parameters that dictate whether a novel (or existing) pathogen can be controlled. in a 2004 paper, fraser et al. (60) confronted the question of why severe acute respiratory syndrome coronavirus was successfully contained, whereas influenza, hiv, and numerous others were not. they were particularly interested in the effectiveness of the tools available when first confronting a novel pathogen: contact tracing, isolation, and quarantine. they presented evidence that a critical determinant of the controllability of a pathogen is the amount of transmission that occurs before symptom onset, expressed by their parameter θ. pathogens that had a low proportion of all transmission occurring before symptom onset are easier to control because symptomatic individuals can be targeted with isolation or pharmaceuticals before they transmit to others. although forecasting is difficult, particularly in the response to an emerging disease threat, it remains a major goal of the disease-modeling community. because disease reporting is often delayed, forecasting includes not only projections into the future but also "now casting" of incidence based on more readily available information. 
this has led to a number of approaches in which models have been used to either process a data stream that is a proxy of the data of interest but available more quickly (e.g., google flutrends) (61) or in analyses of ongoing outbreaks to assess (with available data) what might be the current situation given the limitations of the observation process and temporal lags in both reporting and outcomes being generated (e.g., calculating case fatality rates for the severe acute respiratory syndrome coronavirus and middle east respiratory syndrome coronavirus outbreaks when many patients had yet to resolve) (55, 62) . at a larger time horizon, several efforts have attempted to forecast the impact of interventions on future incidence. one of the most successful was a project that forecasted the impact of respiratory syncytial virus immunization campaigns on the temporal pattern of incidence in the united states. using mechanistic transmission models, pitzer et al. (63, 64) made detailed predictions of the impact of vaccination on the multiannual dynamics of rotavirus, as well as the impact of the vaccine on genotype circulation. these forecasts of broad qualitative impacts of interventions are critical tests of models. detailed prospective predictions of changes that will occur with changes in health policy, which are then validated, will provide the best evidence of the utility of mechanistic models in the future. dependent happenings is the term coined by ronald ross (10) to capture the fact that for infectious diseases, an individual's risk of infection depends on the disease status of those around them (65) . this presents challenges for trial design and the interpretation of observational studies. cluster randomization and adjustment for intra-class correlation can be used to account for this effect in some cases (66) , but mechanistic models are often useful in trial design or in interpretation of results when cluster randomization is imperfect or impossible. 
under these conditions, simulation studies have been used to help with study design in settings including vaccine studies (67, 68) and combination approaches to hiv prevention (69) . mechanistic models have been particularly revealing for studies of vaccine effectiveness. for example, a naïve approach would be to assume that all vaccines act in the same way, providing complete protection for some fraction of the population. however, in reality vaccines may be leaky and provide protection only in some dimensions (65) . vaccines may prevent infection altogether (e.g., the measles vaccine) (70) , offer protection against pathogenic disease but still allow individuals to become infected and transmit the disease (e.g., acellular pertussis vaccines (71) ), or only prevent onward transmission of the disease (e.g., transmission-blocking vaccines for malaria (72) ). in order to anticipate and assess the impact of vaccines once scaled up to widespread use, the specific actions of the vaccine in reducing infection, onward transmission, and disease must be disentangled. these specific mechanisms will contribute differently to the direct, indirect, and total effects of a vaccine. these effects are increasingly targets of inference during trials (73) , and developments in infectious disease theory have driven development of both inference tools and study design to measure specific impacts (65) . in emerging outbreaks, simulation models have often been used as the framework to quickly and quantitatively compare policy alternatives. the application of these models has yielded results ranging from broad information about the feasibility and potential impact of interventions to detailed recommendations about targeting of interventions. in the foot and mouth disease outbreak of 2001, models were used to determine optimal culling strategies that specified operational details of those strategies, including the timing and spatial extent of culling.
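The direct, indirect, and total effects mentioned above are each defined as one minus a risk ratio, but with different comparison groups. A minimal sketch with hypothetical attack rates (the framework follows the study-design literature cited as (65, 73); the numbers are invented):

```python
def effect(attack_rate, reference_attack_rate):
    """Generic vaccine-effect measure: 1 - risk ratio."""
    return 1.0 - attack_rate / reference_attack_rate

# hypothetical attack rates: community a runs a vaccination
# program, community b does not
ar_vaccinated_a = 0.02    # vaccinated people in community a
ar_unvaccinated_a = 0.08  # unvaccinated people in community a
ar_unvaccinated_b = 0.20  # unvaccinated people in community b

direct = effect(ar_vaccinated_a, ar_unvaccinated_a)      # within community a
indirect = effect(ar_unvaccinated_a, ar_unvaccinated_b)  # unvaccinated, a vs b
total = effect(ar_vaccinated_a, ar_unvaccinated_b)       # vaccinated a vs unvaccinated b
```

In this invented example the total effect exceeds the direct effect because vaccinated individuals in community a also benefit from reduced transmission around them, exactly the dependent-happenings structure that complicates trial design.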
even outside of public health crises, infectious disease models play an important role in setting public health policy. cost-effectiveness analyses are often built on mechanistic models of disease spread (74, 75) . models can help investigators choose between different intervention strategies, determine the potential of specific interventions, and compare investments across pathogens. infectious disease models play a critical role in incorporating indirect effects that can vary substantially across alternative programs. the design of immunization campaigns against human papillomavirus has to weigh the direct effects protecting women from human papillomavirus infection, as well as indirect protection resulting from immunization of both women and men. the tradeoffs of alternative programs in protecting individuals at risk of the most severe outcomes and those at little risk have been best evaluated in transmission models (64) . increasingly important is the marrying of mechanistic disease models with operations research by explicitly modeling the logistical constraints on public health intervention. this approach can be key when preparing for outbreaks or bioterrorism, as speed of deployment, hospital capacity, and other logistical factors can severely impact the efficiency of disease containment and its subsequent spread (25, 76) . likewise, a logistical analysis can assess the feasibility of novel diseasecontrol strategies, showing whether they are practical as well as efficacious; for instance, an analysis of the feasibility and potential effectiveness of passive immunotherapy in hong kong showed that this intervention could play an important role in controlling a mildly severe pandemic (77) . as the price of computation drops and we enter the era of "big data," the role of mechanistic models will only increase. 
a powerful new synergy is the combination of mechanistic models of disease spread with phylogenetic techniques outlining the evolutionary relationship between infecting pathogens. genetic sequence data present samples of pathogens taken from a large population of pathogens both within a host and among all hosts. understanding the impact of different selective pressures on pathogens is inherently a task of population genetics. models of the population dynamics of pathogens have been combined with phylogenetic methods in order to explain the phylogenetic structure of pathogen populations. sequence data have been used to infer basic reproductive numbers of pathogens (51, 78) , harkening back to lotka's first use of the term to describe replication of organisms. in future work, we expect to see more direct integration of models with data at both population scales, as has been the tradition, and within-host scales. traversing these scales will be a key challenge to the field. targeted funding and the relatively new paradigm (at least for epidemiology) of sanctioned competitions to identify the best methods of disease forecasting continue to invigorate the field. in the united states, the models of infectious disease agent study and the recently completed research and policy for infectious disease dynamics program have led to well over 1,000 publications and continue to invigorate research and training in the field (79, 80) . similar initiatives in the united kingdom and other parts of europe, such as that of the medical research council's centre for outbreak analysis and modelling, have also been successful (81) .
competitions such as the national oceanic and atmospheric administration's dengue forecasting project (82) , the defense advanced research projects agency's forecasting chikungunya challenge (83) , and the us centers for disease control and prevention's predict the influenza season challenge (84) require researchers to assess and compare the performance of their models and stand by their predictions in the face of actual events. such initiatives should serve to greatly improve the quality and number of models of infectious diseases, but this will only translate into improved public health if it is paired with greater engagement with policy and practice. in limited space, it is impossible to cover every important contribution that mechanistic models have made over the past century, and there is much important work that we have not covered. these contributions range from work showing the potential impact of test-and-treat strategies in hiv control (85) , to analyses of how to best use a limited supply of cholera vaccines to control disease (22, 86) , to fundamental work on the link between demographic characteristics and disease incidence (87) . these omissions should not be seen as a reflection of the quality of the work, but rather merely as the result of our need to select only a few of many good options. the use of mechanistic models in infectious disease epidemiology has shifted over the course of 100 years. the arc of their use spans beginnings as 1 of a group of statistical and mathematical tools used by epidemiologists to understand a multitude of phenomena, to use and development by an increasingly specialized group of researchers over the course of the 20th century, to more general use by a broader group of researchers. this arc still bends. at their core, these methods provide frameworks of analysis that can be treated in the same way as other statistical tools of analysis.
refinement of methods has led to a theoretical base and an application toolkit that allow nonspecialists to analyze and understand infectious disease dynamics with mechanistic models. this broader ecosystem of modelers, which includes methods-focused researchers and public health practitioners, has led to encouraging progress in tying models increasingly to data and to the most salient infectious disease problems facing global health.

references (titles as extracted):
memoir on the reed-frost epidemic theory
a commentary on the mechanical analogue to the reed-frost epidemic model
an examination of the reed-frost theory of epidemics
some mathematical developments on the epidemic theory formulated by reed and frost
the distribution of incubation periods of infectious disease
strategies for containing an emerging influenza pandemic in southeast asia
modelling disease outbreaks in realistic urban social networks
modeling the spatial spread of infectious diseases: the global epidemic and mobility computational model
prevention of malaria in mauritius
some quantitative studies in epidemiology
the principle of repeated medication for curing infections
revisiting the basic reproductive number for malaria and its implications for malaria control
epidemiological basis of malaria control
ross, macdonald, and a theory for the dynamics and control of mosquito-transmitted pathogens
a systematic review of mathematical models of mosquito-borne pathogen transmission: 1970-2010
herd immunity: history, theory, practice
a brief history of r0 and a recipe for its calculation
measles in developing countries. part ii. the predicted impact of mass vaccination
impact of birth rate, seasonality and transmission rate on minimum levels of coverage needed for rubella vaccination
optimizing the dose of prepandemic influenza vaccines to reduce the infection attack rate
pulse mass measles vaccination across age cohorts
reactive vaccination in the presence of disease hotspots
smallpox and its eradication
containing bioterrorist smallpox
containing a large bioterrorist smallpox attack: a computer simulation approach
smallpox bioterror response
smallpox transmission and control: spatial dynamics in great britain
the history and medical consequences of rubella
vaccination against rubella and measles: quantitative investigations of different policies
congenital rubella syndrome: progress and future challenges
increase in congenital rubella occurrence after immunisation in greece: retrospective survey and systematic review
how does herd immunity work?
world health organization. rubella vaccines: who position paper-recommendations
heterogeneities in the transmission of infectious agents: implications for the design of control programs
superspreading and the effect of individual variation on disease emergence
reconstruction and future trends of the aids epidemic in the united states
the dynamics of cd4+ t-lymphocyte decline in hiv-infected individuals: a markov modeling approach
assessment of the 2010 global measles mortality reduction goal: results from a model of surveillance data
tracking measles infection through non-linear state space models
updating the accounts: global mortality of the 1918-1920 "spanish" influenza pandemic
avian influenza a (h5n1) infection in humans
containing pandemic influenza at the source
strategies for mitigating an influenza pandemic
mitigation strategies for pandemic influenza in the united states
modeling targeted layered containment of an influenza pandemic in the united states
report disputes benefit of stockpiling tamiflu
neuraminidase inhibitors for preventing and treating influenza in healthy adults: systematic review and meta-analysis
the potential for respiratory droplet-transmissible a/h5n1 influenza virus to evolve in a mammalian host
airborne transmission of influenza a/h5n1 virus between ferrets
experimental adaptation of an influenza h5 haemagglutinin (ha) confers respiratory droplet transmission to a reassortant h5 ha/h1n1 virus in ferrets
ethical alternatives to experiments with novel potential pandemic pathogens
pandemic potential of a strain of influenza a (h1n1): early findings
outbreak of 2009 pandemic influenza a (h1n1) at a new york city school
dynamics of the 2001 uk foot and mouth epidemic: stochastic dispersal in a heterogeneous landscape
transmission dynamics and control of severe acute respiratory syndrome
middle east respiratory syndrome coronavirus: quantification of the extent of the epidemic, surveillance biases, and transmissibility
who ebola response team. ebola virus disease in west africa-the first 9 months of the epidemic and forward projections
assessing the international spreading risk associated with the 2014 west african ebola outbreak
estimating the future number of cases in the ebola epidemic-liberia and sierra leone
world health organization
factors that make an infectious disease outbreak controllable
detecting influenza epidemics using search engine query data
methods for estimating the case fatality ratio for a novel, emerging infectious disease
demographic variability, vaccination, and the spatiotemporal dynamics of rotavirus epidemics
modeling rotavirus strain dynamics in developed countries to understand the potential impact of vaccination on genotype distributions
design and analysis of vaccine studies
statistics notes: the intracluster correlation coefficient in cluster randomisation
modeling human papillomavirus vaccine effectiveness: quantifying the impact of parameter uncertainty
design and evaluation of prophylactic interventions using infectious disease incidence data from close contact groups
hptn 071 (popart): a cluster-randomized trial of the population impact of an hiv combination prevention intervention including universal testing and treatment: mathematical model
acellular pertussis vaccines protect against disease but fail to prevent infection and transmission in a nonhuman primate model
development of a transmission-blocking malaria vaccine: progress, challenges, and the path forward
effects of vaccination on invasive pneumococcal disease in south africa
a review of typhoid fever transmission dynamic models and economic evaluations of vaccination
cost-effectiveness analyses of human papillomavirus vaccination
emergency response to an anthrax attack
logistical feasibility and potential benefits of a population-wide passive-immunotherapy program during an influenza pandemic
evolutionary analysis of the dynamics of viral infectious disease
models of infectious disease agent study (midas)
about the centre for outbreak analysis and modelling
national oceanic and atmospheric administration. dengue forecasting project
chikungunya threat inspires new darpa challenge
predict the influenza season challenge
universal voluntary hiv testing with immediate antiretroviral therapy as a strategy for elimination of hiv transmission: a mathematical model
the impact of a one-dose versus two-dose oral cholera vaccine regimen in outbreak settings: a modeling study
a simple model for complex dynamical transitions in epidemics
on the course of epidemics of some infectious diseases

we thank c. jessica metcalf for suggestions and useful discussions. conflict of interest: none declared.

key: cord-336687-iw3bzy0m authors: kraemer, m. u. g.; perkins, t. a.; cummings, d. a. t.; zakar, r.; hay, s. i.; smith, d. l.; reiner, r. c.
title: big city, small world: density, contact rates, and transmission of dengue across pakistan date: 2015-10-06 journal: j r soc interface doi: 10.1098/rsif.2015.0468 sha: doc_id: 336687 cord_uid: iw3bzy0m macroscopic descriptions of populations commonly assume that encounters between individuals are well mixed; i.e. each individual has an equal chance of coming into contact with any other individual. relaxing this assumption can be challenging though, due to the difficulty of acquiring detailed knowledge about the non-random nature of encounters. here, we fitted a mathematical model of dengue virus transmission to spatial time-series data from pakistan and compared maximum-likelihood estimates of ‘mixing parameters’ when disaggregating data across an urban–rural gradient. we show that dynamics across this gradient are subject not only to differing transmission intensities but also to differing strengths of nonlinearity due to differences in mixing. accounting for differences in mobility by incorporating two fine-scale, density-dependent covariate layers eliminates differences in mixing but results in a doubling of the estimated transmission potential of the large urban district of lahore. we furthermore show that neglecting spatial variation in mixing can lead to substantial underestimates of the level of effort needed to control a pathogen with vaccines or other interventions. we complement this analysis with estimates of the relationships between dengue transmission intensity and other putative environmental drivers thereof. the transmission dynamics of all infectious diseases depend on a few basic but key determinants: the availability of susceptible and infectious hosts, contacts between them and the potential for transmission upon contact. susceptibility is shaped primarily by historical patterns of transmission, the natural history of the pathogen, the host's immune response and host demography [1] . 
what constitutes an epidemiologically significant contact depends on the pathogen's mode of transmission [2], and structure in contact patterns can be influenced by transportation networks and the spatial scale of transmission [3], by host heterogeneities such as age [4], and dynamically in response to the pathogen's influence on host behaviour [5]. whether transmission actually occurs during contact between susceptible and infectious hosts often depends heavily on environmental conditions [6-8]. disentangling the relative roles of these factors in driving patterns of disease incidence and prevalence is a difficult but central pursuit in infectious disease epidemiology, and mathematical models that capture the biology of how these mechanisms interact represent an indispensable tool in this pursuit [9]. the time-series susceptible-infected-recovered (tsir) model was developed by finkenstädt & grenfell [10] to offer an accurate and straightforward way to statistically connect mechanistic models of infectious disease transmission with time-series data. among other features, tsir models readily account for inhomogeneous mixing in a phenomenological way by allowing rates of contact between susceptible and infectious hosts to depend on their densities nonlinearly. although this is a simple feature that can be incorporated into any model based on mass-action assumptions-indeed, earlier applications pertained to inhomogeneous mixing in predator-prey systems [11]-the 'mixing parameters' that determine the extent of this nonlinearity have primarily been fitted to empirical data in applications of the tsir model to measles, cholera, rubella and dengue [12-15].
applied to discrete-time models such as the tsir, mixing parameters also have an interpretation as corrections for approximating a truly continuous-time process with a discrete-time model [16, 17]. in no application of the tsir model to date has the potential for variation in these parameters been assessed, leaving the extent to which inhomogeneity of mixing varies across space and time as an open question in the study of infectious disease dynamics. there are a number of reasons why mixing might vary in time or space. seasonal variation in mixing might arise because of travel associated with labour [18], religious events [19] or vacation [20], or because of the timing of school openings in the case of influenza [21]. spatial variation in mixing could arise because of cultural differences at geographical scales [3, 22, 23], because of variation in the density and quality of roads [24], or because of variation in human densities and myriad associated factors [13, 25]. for vector-borne diseases, variation in mixing is amplified even further by variation in vector densities [26], which effectively mediate contact between susceptible and infectious hosts. dengue is a mosquito-borne viral disease with a strong potential for spatial variation in mixing [27, 28]. the dominant vectors of dengue viruses (aedes spp.) thrive in areas where they are able to associate with humans, as humans provide not only a preferred source of blood but also the water containers that the mosquitoes use for egg laying and for larval and pupal development [29]. two additional aspects of aedes ecology-limited dispersal distance of the mosquito [30] and daytime biting [31]-imply that human movement should be the primary means by which the viruses spread spatially [2]. indeed, analyses of dengue transmission dynamics at a variety of scales have strongly supported this hypothesis [32-35].
to the extent that human movement in dense urban environments is more well mixed than elsewhere, there are likely to be differences in the extent of inhomogeneous mixing in peri-urban and rural areas. this is presumably also the case for directly transmitted pathogens, but with a potentially even stronger discrepancy for dengue due to the urban-rural gradient in mosquito densities. to assess the potential for spatial variation in the inhomogeneity of mixing as it pertains to dengue transmission, we performed an analysis of district-level time series of dengue transmission in the punjab province of pakistan using a tsir model with separate mixing parameters for urban and rural districts. we likewise made estimates of the relationships between density-independent transmission potential and putative drivers thereof, such as temperature, to allow the relative roles of extrinsic and intrinsic factors to be teased apart. finally, we performed mathematical analyses of the fitted model to assess the significance of spatial variation in mixing inhomogeneity for how time-series data are interpreted and used to guide control efforts. we obtained daily dengue case data aggregated at hospital level from punjab province, provided by the health department punjab, pakistan, between 2011 and 2014. in total, 47 156 suspected and confirmed dengue cases were reported in 109 hospitals. all hospitals were subsequently geo-located using 'google maps' (http://www.maps.google.com), similar to methods described in [36]. hospitals that could not be identified were removed from the database. the hospitals were then assigned to a district within punjab, pakistan by their spatial location. a total of 21 182 cases were reported in the year 2011 alone, an epidemic that affected almost the entire province. many more cases occurred in lahore (35 348) compared to all other districts (8808) (table 1).
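the aggregation step described above, from daily hospital-level counts to district-level time series, can be sketched in a few lines. this is an illustrative sketch only: the column names ("date", "district", "cases") and the toy records are assumptions, not the authors' actual schema.

```python
# Hypothetical sketch of the case-data aggregation: daily hospital-level
# counts are assigned to districts and summed into a monthly district-level
# time series, matching the temporal resolution of the covariate rasters.
import pandas as pd

records = pd.DataFrame({
    "date": pd.to_datetime(["2011-09-01", "2011-09-02", "2011-09-01"]),
    "district": ["lahore", "lahore", "rawalpindi"],
    "cases": [120, 95, 7],
})

# Sum daily counts per district, then resample to monthly bins ("MS" =
# month start), yielding one time series per district.
monthly = (records
           .set_index("date")
           .groupby("district")
           .resample("MS")["cases"]
           .sum())
print(monthly)
```

the resulting series is indexed by (district, month) and can be fed directly into a district-level tsir fit.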
a breakdown per year and per district is provided in electronic supplementary material, table s1, and additional information about data collection can be found in the electronic supplementary material, appendix. no information on dengue serotypes was available; however, the predominant serotype circulating in punjab province, pakistan is denv-2 [37]. environmental conditions are instrumental in defining the risk of transmission of dengue [28]. transmission is limited by the availability of a competent disease vector. due to a lack of resources and political instability, no comprehensive nationwide entomological surveys have been performed in pakistan. therefore, we use a probabilistic model to infer the probability of occurrence of aedes aegypti and aedes albopictus in pakistan, derived from a globally comprehensive dataset containing more than 20 000 records for each species (figure 1a,b). in short, a boosted regression tree model was applied that predicts a continuous spatial surface of mosquito occurrence from a fitted relationship between the distribution of these mosquitoes and their environmental drivers. a detailed description of the occurrence database and modelling outputs is available in [36, 38, 39]. such model outputs have proved useful in identifying areas at risk of transmission of dengue as well as malaria [28, 40, 41]. other important environmental conditions defining the risk of transmission of dengue are temperature, water availability and vegetation cover [42]. to account for such variation, raster layers of daytime land surface temperature were processed from the mod11a2 satellite product, gap-filled to remove missing values, and then averaged to a monthly temporal resolution for all 4 years [43]. the density of vegetation coverage has been shown to be associated with vector abundance [44].
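the boosted regression tree idea above can be illustrated with a small sketch: a gradient-boosted classifier is trained on labelled occurrence/background points with environmental covariates and then predicts a continuous probability-of-occurrence surface. this is not the authors' fitted model; the data, covariate choices and threshold rule here are synthetic assumptions.

```python
# Illustrative boosted-regression-tree occurrence model (synthetic data).
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 500
# Synthetic covariates: temperature (deg C) and enhanced vegetation index.
temp = rng.uniform(10, 35, n)
evi = rng.uniform(0, 1, n)
X = np.column_stack([temp, evi])
# Synthetic presence labels: occurrence more likely in warm, vegetated pixels.
presence = ((temp > 22) & (evi > 0.3)).astype(int)

model = GradientBoostingClassifier(n_estimators=100, max_depth=3)
model.fit(X, presence)

# Predicted probability of occurrence for a warm, vegetated location;
# applied pixel-by-pixel this yields a continuous suitability surface.
p_occ = model.predict_proba([[30.0, 0.6]])[0, 1]
print(p_occ)
```

in practice such a surface would be predicted for every raster pixel and then averaged to the district level like the other covariates.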
moreover, vegetation indices are useful proxies for precipitation and may be used to infer the presence of standing water containers that are habitats for the aedes mosquitoes [45]. the same method was again applied to derive the enhanced vegetation index (evi) from the mod11a2 satellite product to produce 16-day and monthly pixel-based estimates for 2011-2014 (figure 1g) [46]. due to the inherent delay between rainfall and daily temperature influencing mosquito population dynamics and those mosquitoes contributing to an increase in denv transmission, we consider both the influence of current temperature, vegetation index and precipitation data on current transmission and the values of those covariates at the previous time step (figure 1f). we used population count estimates at a 100 m resolution that were subsequently aggregated to match all other raster layers at a 5 × 5 km resolution for the year 2015 (http://www.worldpop.org) (figure 1e). in a follow-up analysis to our primary investigation into the climatological drivers of dengue transmission, we included several density-based covariates. we derived a weighted accessibility metric that includes both population density and urban accessibility, a metric commonly used to derive relative movement patterns [24, 47]. this map shows a friction surface, i.e. the time needed to travel through a specific pixel (figure 1d). we also used an urban, peri-urban and rural classification scheme to quantify patterns of urbanicity based on a globally available grid [48] (figure 1c). all covariates and case data were aggregated and averaged (where appropriate) to the district level. following finkenstädt & grenfell [10], we assume a general transmission model of the form

i_{t+1,i} = β_{t,i} s_{t,i} i_{t,i}^{α_i} ε_{t,i} / n_i,    (2.1)

where i_{t,i} is the number of infected and infectious individuals and s_{t,i} the number of susceptible individuals at time t in district i, n_i is the population of district i, and β_{t,i} is the covariate-driven contact rate.
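the tsir update just described (expected new infections proportional to β s i^α / n) can be iterated forward to see the effect of the mixing parameter. the sketch below uses illustrative parameter values, not fitted estimates, and omits the log-normal noise term.

```python
# Minimal forward simulation of a deterministic TSIR update:
# I_{t+1} = beta * S_t * I_t**alpha / N.
# All parameter values are illustrative assumptions.
N = 1_000_000        # district population
alpha = 0.69         # mixing parameter (alpha < 1: inhomogeneous mixing)
beta = 3.0           # constant contact rate for this sketch

S, I = N - 10.0, 10.0
trajectory = [I]
for t in range(20):
    new_I = beta * S * I**alpha / N   # expected new infections
    S = max(S - new_I, 0.0)
    I = new_I
    trajectory.append(I)

# With alpha < 1, growth is sub-exponential even while S is close to N:
# the epidemic saturates far below what homogeneous mixing would predict.
print(trajectory[:5])
```

note the qualitative point this makes: the mixing parameter damps the force of infection at high case counts, which is exactly why α and β trade off against each other when fitted to the same time series.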
we assume every individual to be initially susceptible, as the 2011 epidemic was the first large dengue outbreak. the mixing parameter for the ith district is given by α_i; when α_i is equal to 1, the population mixes homogeneously, whereas values less than one can indicate either inhomogeneous mixing or a need to correct for the discretization of the continuous-time process. β_{t,i} was fitted using the covariates shown in figure 1. finally, the error terms ε_{t,i} are assumed to be independent and identically log-normally distributed random variables. the term β_{t,i} is fit using generalized additive model regression [49-51]. the time-varying climatological covariates are all fit as smooth splines, while all other covariates enter β_{t,i} linearly. for example, if covariates x_1 and x_2 are time varying and x_3 and x_4 are temporally constant, then we fit β as

log β_{t,i} = s_1(x_{1,t,i}) + s_2(x_{2,t,i}) + c_3 x_{3,i} + c_4 x_{4,i},    (2.2)

where the s_j are smooth splines and the c_j are linear coefficients. additionally, unexplained seasonal variation is accounted for using a 12-month periodic smooth spline. model selection was performed using backwards selection. two base models were investigated. first, a climate-only model was created using only the climatological and environmental suitability covariates. second, a 'full' model combined the density-dependent covariates and the climatological covariates into a single model, which was then subjected to backwards selection. for both models, the mixing coefficient was initially set equal for each district; once a final model was arrived upon, the mixing coefficient for lahore was allowed to vary separately from the other coefficients. all model fitting was conducted using r [52] and the 'mgcv' package [51]. models are fit by maximizing the restricted maximum likelihood [53] to reduce bias and over-fitting of the smooth splines. the model source code and processing of covariates will be made available in line with previous projects [54].
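the estimation idea behind this fit can be seen in a stripped-down form: because the errors are log-normal, taking logs of the tsir update turns it into a linear regression, log i_{t+1} − log(s_t/n) = log β + α log i_t + noise. the sketch below recovers α and β from synthetic data with a constant β by ordinary least squares; it is an assumption-laden stand-in for the spline-based gam/reml fit in mgcv described above, not the authors' procedure.

```python
# Recovering alpha and beta from a log-linear regression on synthetic
# TSIR data (constant beta, small log-normal noise) - a simplification
# of the covariate-driven GAM fit used in the paper.
import numpy as np

rng = np.random.default_rng(1)
N, alpha_true, beta_true = 1e6, 0.7, 3.0
S, I = N - 10.0, 10.0
logI_t, logI_next, logSN = [], [], []
for _ in range(30):
    new_I = beta_true * S * I**alpha_true / N * rng.lognormal(0, 0.05)
    logI_t.append(np.log(I))
    logI_next.append(np.log(new_I))
    logSN.append(np.log(S / N))
    S, I = max(S - new_I, 1.0), new_I

# Regress log I_{t+1} - log(S/N) on log I_t:
# slope estimates alpha, intercept estimates log(beta).
y = np.array(logI_next) - np.array(logSN)
A = np.column_stack([np.ones(len(y)), np.array(logI_t)])
(log_beta_hat, alpha_hat), *_ = np.linalg.lstsq(A, y, rcond=None)
print(alpha_hat, np.exp(log_beta_hat))
```

the full model replaces the constant intercept with smooth functions of the climatological covariates, which is what mgcv's reml machinery estimates.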
to explore the potential significance of spatial variation in mixing parameters, we conducted an analysis to probe the inherent mathematical trade-off between the mixing parameter α and the density-independent transmission coefficient β; specifically, to answer the question: what difference in local transmission would be necessary to account for a difference in mixing while achieving identical transmission dynamics? to explore this, we used equation (2.1) to establish

β_2 / β_1 = λ^{α_1 − α_2},    (2.4)

where λ denotes the number of infectious individuals at which the two parameterizations produce identical dynamics. we then examined how variation in λ and α_2 − α_1 affected the left-hand side of equation (2.4), and likewise the critical proportion of the population to control in order to effect pathogen elimination, which, under our model, is p_c = 1 − (1/β). the majority of cases were clustered in lahore, the capital of punjab province. ongoing transmission appeared to be focal in three districts (vehari, rawalpindi and lahore) and to have spread through infective 'sparks' to smaller, more rural districts. to disentangle the different aspects of dengue dynamics and their drivers, we used a model containing only the climatological covariates and performed backwards model selection until each covariate in the model was significant at the 5% level. this resulted in a model that explained 76.9% of the deviance and had an adjusted r^2 of 0.746. among the yearly averaged covariates, evi and precipitation remained in the model, as well as the derived a. albopictus range map (p = 8.21 × 10^−4, 0.01 and 3.9 × 10^−5, respectively). interestingly, when we substituted the derived a. aegypti map for the a. albopictus map, the deviance explained changed very little, to 76.8%. for climatological covariates that were fit as smooth splines, temperature, lagged temperature and evi remained in the model (figure 2, p-values of 0.010, 0.030 and 0.030 with effective degrees of freedom 7.55, 5.47 and 1.83, respectively).
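the trade-off and the control threshold just described are easy to evaluate numerically. the sketch below plugs in the paper's two mixing estimates and the lahore r0 estimate; the matching infectious density λ is an illustrative assumption.

```python
# Numerical illustration of the alpha-beta trade-off and the critical
# control proportion. lam (the infectious density at which dynamics are
# matched) is an assumed value for illustration.
lam = 1000.0                     # assumed infectious density
alpha_1, alpha_2 = 0.74, 0.59    # mixing estimates: lahore vs. other districts

# beta_2 / beta_1 = lam**(alpha_1 - alpha_2): how much larger the
# transmission coefficient must be to compensate for lower mixing.
ratio = lam ** (alpha_1 - alpha_2)
print(ratio)

def critical_fraction(beta):
    """fraction of the population to protect for elimination: p_c = 1 - 1/beta."""
    return 1.0 - 1.0 / beta

# using the lahore r0 estimate of 3.28 from the climatological model:
print(critical_fraction(3.28))
```

with a mixing difference of 0.15 and λ = 1000, the implied transmission coefficients differ by a factor of roughly λ^0.15 ≈ 2.8, which is the scale of bias one risks by ignoring spatial variation in mixing.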
there was a significant amount of periodic variation unexplained by the climatological covariates alone, as the 'seasonality' covariate was retained by the model selection algorithm (figure 2, p = 0.0034). the estimated median values for r0 per district were clustered around 2 (mean = 2.1), and their geographical distribution indicated a clear trend towards districts with larger populations (figure 3). finally, the mixing coefficient was significantly lower than 1 (α = 0.69, 95% ci (0.614, 0.771), p = 1.6 × 10^−14). to understand these differences, the final model was then compared to a nested model in which the coefficient for lahore was allowed to vary independently of all other districts. deviance explained increased to 77% and adjusted r^2 increased to 0.753. further, the mixing coefficient for lahore (α = 0.74) was significantly larger than the mixing coefficient for the other districts (α = 0.59, p = 0.0068) (electronic supplementary material, figure s1). the median r0 for lahore was estimated at 3.28, the highest among all districts. to assess the extent to which the variation in mixing coefficients could be explained by other covariates, we considered the possibility that movement accounted for the differences in the mixing coefficients between lahore and the other districts. the density-dependent covariates (described earlier) were then added to the full model and backwards selection was performed again. the resulting model explained 78.6% of the deviance, had an adjusted r^2 of 0.763 and was superior to the final climatological model based on aic (699.23 versus 714.83). yearly averaged evi, normalized difference vegetation index and precipitation were all significant (p = 8.7 × 10^−5, 0.00024 and 0.00028, respectively). again, the derived a. albopictus map was significant (p = 0.00816).
for climatological covariates fit as smooth splines, only temperature and lagged temperature were found to be significant (figure 2b, p = 4.0 × 10^−5 and 0.0013, with effective degrees of freedom 7.61 and 4.81, respectively), and there was still a significant 'seasonality' effect (figure 2b, p = 4.0 × 10^−7, effective degrees of freedom 4.48). the best-fit mixing coefficient was α = 0.58, barely lower than the mixing coefficient for non-lahore districts in the climatological model. the estimated median r0 again clustered around 2 (mean = 1.8), and again the r0 for lahore was largest, but in this model it was considerably larger than in the climatological model (lahore r0 = 7.82, figure 2b). full details of the best-fit model parameters are shown in electronic supplementary material, tables s2-s5. two of the density-dependent covariates remained in the model: the urban map (p = 0.01) and the weighted access map (p = 3 × 10^−5). when the nested model that allowed lahore's mixing coefficient to vary was fitted, there was no significant difference between the two mixing coefficients (p ≈ 1). given a difference in estimates of the mixing parameters between lahore and elsewhere of 0.15, we analysed equation (2.4). our results point to considerable spatial heterogeneity in the inhomogeneity of mixing and the strength of an associated nonlinearity in transmission along an urban-rural gradient.

[figure 2 caption: model outputs using a backwards model selection procedure in the model using climatological variables (a, i-iv), and including the density-dependent variables (b, i-iv). every subplot shows the predictions of the model for the indicated parameter varying across the indicated range and every other parameter set to its mean. panel (b, iv) shows the differences in the transmission coefficient for lahore (green) and all other districts (red). rsif.royalsocietypublishing.org j. r. soc. interface 12: 20150468]
this regional variability in mixing has direct implications for estimates of the basic reproductive number of dengue in our study region and elsewhere. although the potential for such bias in estimates of the basic reproductive number has been shown in a theoretical context [26, 55], we provide quantitative estimates of the extent of this problem by interfacing models with a rich spatio-temporal dataset. our results have implications for estimates of population-level parameters not only for dengue but also for other infectious diseases [13, 56-60] and possibly even more broadly in ecology [11]. our analysis revealed significant differences in the inhomogeneity of mixing between urban and rural settings and found that a population-weighted urban accessibility metric was able to account for differences in mixing between these settings. mixing is presumably influenced directly by human behaviour and has been shown to be highly unpredictable, largely dependent on the local context and the spatial and temporal scale [61]. in this study, however, we showed that the density-dependent covariate we considered was able to capture the influence of these behavioural effects at a district level. once differences in the inhomogeneity of mixing were accounted for, estimated r0 values indicated considerably larger differences between transmission potential in lahore and all other districts. synchronizing more accurate geo-referenced data would allow assessment of the extent to which the relationship between 'mixing parameters' and urban accessibility depends on the spatial scale at which data are aggregated [26, 62]. in the case of dengue, this has been limited specifically by the availability of high-resolution data [63].
complementing such an analysis with measurement of social contact patterns could be important for exploring this relationship in even more detail [22, 64, 65] and could be informed by mathematical models that explored this relationship previously for other diseases [13, 66]. another encouraging result from our analysis was the finding that large-scale mosquito suitability surfaces helped capture the environmental determinants of dengue transmission [28]. intervention strategies are contingent on both understanding key environmental drivers of transmission and the dynamics of ongoing human-to-human transmission, particularly in outbreak situations [67]. environmental drivers such as seasonal fluctuations in rainfall, temperature, vegetation coverage or mosquito abundance will help guide surveillance and control efforts targeted mostly towards the mosquito vector and its ecology [68]. once infection occurs, an important and unresolved question for dengue is how best to optimize the delivery of intervention strategies to reduce disease incidence, which is largely determined by r0. our analysis shows that the interaction between mixing parameters and force of infection has potentially large implications for optimizing targeted intervention, particularly in countries where transmission is high and resources are scarce [69]. in fact, this may be even more important in areas of low transmission, where incidence appears to be more focal [70]. again, however, more attention is needed to determine the spatial and temporal resolution of appropriate intervention strategies and the effects of key covariates and model parameters [62]. empirical understanding of the spatial scale that is most appropriate for carrying out large-scale interventions remains lacking.
once transmission has occurred in one place, understanding not only spatial heterogeneity in transmission dynamics but also their subsequent spread in mechanistic stochastic models would help to empirically determine the propagation of the disease [71] . interest in spatial spread dynamics has risen with increasing importation of dengue into heretofore non-endemic areas due to travel and trade continentally and internationally [72] . exploration of the case data in pakistan that we analysed here suggests that the virus spreads along major transport routes from lahore to karachi and north to rawalpindi. using results presented here on mixing coefficients and environmental drivers will help pinpoint areas of major risk of importation more accurately, especially in the case of recurring epidemics. we explored the consequences of a spatially differentiated mixing coefficient in the context of transmission potential within this analysis. using the fitted relationships of the environmental drivers of transmission and r 0 will enable future analyses and comparisons between diseases and geographical regions. in this context, it will be instrumental to integrate a variety of movement and social network models with the evidence presented here to infer more accurately how the geographical spread of dengue is determined. 
references:
unifying the epidemiological and evolutionary dynamics of pathogens
the role of human movement in the transmission of vector-borne pathogens
the hidden geometry of complex, network-driven contagion phenomena
host heterogeneity dominates west nile virus transmission
adaptive human behavior in epidemiological models
air temperature suitability for plasmodium falciparum malaria transmission in africa 2000-2012: a high-resolution spatiotemporal prediction
predicting the risk of avian influenza a h7n9 infection in live-poultry markets across asia
absolute humidity and the seasonal onset of influenza in the continental united states
infectious diseases of humans: dynamics and control
time series modelling of childhood diseases: a dynamical systems approach
from individuals to population densities: searching for the scale of intermediate determinism
interactions between serotypes of dengue highlight epidemiological impact of cross-immunity
dynamics of measles epidemics: estimating scaling of transmission rates using a time series sir model
disentangling extrinsic from intrinsic factors in disease dynamics: a nonlinear time series approach with an application to cholera
the epidemiology of rubella in mexico: seasonality, stochasticity and regional variation
dynamical behavior of epidemiological models with nonlinear incidence rates
interpreting time-series analyses for continuous-time biological models: measles as a case study
explaining seasonal fluctuations of measles in niger using nighttime lights imagery
estimating potential incidence of mers-cov associated with hajj pilgrims to saudi arabia
dynamic population mapping using mobile phone data
using gps technology to quantify human mobility, dynamic contacts and infectious disease dynamics in a resource-poor urban environment
social mixing patterns in rural and urban areas of southern china
a universal model for mobility and migration patterns
heterogeneity, mixing, and the spatial scales of mosquito-borne pathogen transmission
refining the global spatial limits of dengue virus transmission by evidence-based consensus
the global distribution and burden of dengue
defining challenges and proposing solutions for control of the virus vector aedes aegypti
dispersal of the dengue vector aedes aegypti within and between rural communities
seasonal distribution and species composition of daytime biting mosquitoes
phylogeography and population dynamics of dengue viruses in the americas
socially structured human movement shapes dengue transmission despite the diffusive effect of mosquito dispersal
house-to-house human movement drives dengue virus transmission
the spatial dynamics of dengue virus in kamphaeng phet
the global compendium of aedes aegypti and ae. albopictus occurrence. sci. data
emergence and diversification of dengue 2 cosmopolitan genotype in pakistan
the global distribution of the arbovirus vectors aedes aegypti and ae. albopictus. elife 4, e08347
data from the global compendium of aedes aegypti and ae. albopictus occurrence
a global map of dominant malaria vectors
a new world malaria map: plasmodium falciparum endemicity in 2010
the many projected futures of dengue
re-examining environmental correlates of plasmodium falciparum malaria endemicity: a data-intensive variable selection approach
use of mapping and spatial and space-time modeling approaches in operational control of aedes aegypti and dengue
variation in aedes aegypti (diptera: culicidae) container productivity in a slum and a suburban district of rio de janeiro during dry and wet seasons
an effective approach for gap-filling continental scale remotely sensed time-series. remote sens.
spatial accessibility and the spread of hiv-1 subtypes and recombinants
global rural urban mapping project (grump): gridded population of the world
generalized additive models
on the use of generalized additive models in time-series studies of air pollution and health
fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models
r: a language and environment for computing
recovery of interblock information when block sizes are unequal
enhancing infectious disease mapping with open access resources
the scaling of contact rates with population density for the infectious disease models
the critical community size for measles in the u
networks and epidemic models
the effect of network mixing patterns on epidemic dynamics and the efficacy of disease contact tracing
predicting the spatial dynamics of rabies epidemics on heterogeneous landscapes
disease extinction and community size: modeling the persistence of measles
limits of predictability in commuting flows in the absence of data for calibration
the spatial resolution of epidemic peaks
the availability and consistency of dengue surveillance data provided online by the world health organization
modeling infectious disease dynamics in the complex landscape of global health
highly localized sensitivity to climate forcing drives endemic cholera in a megacity
estimating drivers of autochthonous transmission of chikungunya virus in its invasion of the americas
chikungunya on the move
geographical and socioeconomic inequalities in women and children's nutritional status in pakistan in 2011: an analysis of data from a nationally representative survey
revealing the microscale spatial signature of dengue transmission and immunity in an urban population
travelling waves and spatial hierarchies in measles epidemics
dengue and dengue vectors in the who european region: past, present, and scenarios for the future

acknowledgements. the authors thank the who punjab office, health department punjab and pid for providing the epidemiological data, and all participants that helped collect the data. competing interests. we declare we have no competing interests. funding. m.u.g.k. acknowledges funding from the german academic

key: cord-333693-z2ni79al
authors: wu, lin; wang, lizhe; li, nan; sun, tao; qian, tangwen; jiang, yu; wang, fei; xu, yongjun
title: modeling the covid-19 outbreak in china through multi-source information fusion
date: 2020-08-06
journal: innovation
doi: 10.1016/j.xinn.2020.100033
sha: doc_id: 333693
cord_uid: z2ni79al

modeling the outbreak of a novel epidemic, such as coronavirus disease 2019 (covid-19), is crucial for estimating its dynamics, predicting future spread and evaluating the effects of different interventions. however, there are three issues that make this modeling a challenging task: uncertainty in data, roughness in models, and complexity in programming. we addressed these issues by presenting an interactive individual-based simulator, which is capable of modeling an epidemic through multi-source information fusion.

and reported. this gap includes the incubation period and any delay in medical visit, diagnosis, or reporting. to make matters worse, modelling may be influenced by authors' prejudice, interest relationships or preconceived ideas.
therefore, we argue that scientific research should combine multiple sources of data worldwide rather than be based on a single source, and it should treat estimates from global researchers as elastic constraints imposed on models. the second challenge is roughness in popular models, which is caused by oversimplification. it introduces significant errors and makes it impossible to reduce uncertainty by combining different types of information. the most common epidemic dynamics models are compartment models such as seir (susceptible, exposed, infectious and removed) and sir (susceptible, infectious and removed), which have been adopted widely in the simulation of covid-19. the state vector of each person in a compartment is simplified as homogeneous and markovian (memoryless), and transitions among compartments are modeled by differential equations with fixed parameters such as the incubation rate, transmission rate and recovery rate. however, oversimplified models are not capable of incorporating multi-type uncertain information like clinical courses, viral shedding, subclinical transmission, infections, confirmations, deaths, or interventions, so they cannot reduce uncertainty by multi-source information fusion. the last, but far from smallest, challenge is the complexity of programming. this is a problem for researchers and reviewers who do not have a background in computer science. when it comes to individual-based models (ibms), implementation is impossible without rich experience in object-oriented programming (oop), which compartmentalizes data into objects and describes object contents and behavior through the declaration of classes. therefore, scientists have begun to call for sharing model codes so that the results of papers can be replicated and evaluated.
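the compartment dynamics criticized above (transitions among seir compartments driven by fixed incubation, transmission and recovery rates) can be written down in a few lines. the sketch below is a generic illustration with invented parameter values, not the model of any paper discussed here:

```python
def seir_step(s, e, i, r, beta, sigma, gamma, n, dt):
    """One Euler step of the classic SEIR equations."""
    new_exposed = beta * s * i / n   # transmission
    new_infectious = sigma * e       # end of incubation
    new_removed = gamma * i          # recovery or death
    s -= new_exposed * dt
    e += (new_exposed - new_infectious) * dt
    i += (new_infectious - new_removed) * dt
    r += new_removed * dt
    return s, e, i, r

def simulate_seir(days, n=1_000_000, i0=10, beta=0.6, sigma=1 / 5.2,
                  gamma=1 / 10, dt=0.05):
    # illustrative (not fitted) rates: beta = transmission, sigma = incubation,
    # gamma = recovery; note they are single fixed scalars for everyone
    s, e, i, r = n - i0, 0.0, float(i0), 0.0
    peak_infectious = 0.0
    for _ in range(int(days / dt)):
        s, e, i, r = seir_step(s, e, i, r, beta, sigma, gamma, n, dt)
        peak_infectious = max(peak_infectious, i)
    return (s, e, i, r), peak_infectious
```

with beta/gamma well above 1 the simulated outbreak grows until susceptibles are depleted; the limitation discussed above is visible in the signature itself, since beta, sigma and gamma are fixed numbers shared by the whole population rather than distributions.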
however, sharing model codes is not enough given that there are many programming languages and it is no small task to install and configure corresponding development environments and run publicly shared codes. to tackle the three challenges of modelling epidemic dynamics, we have developed an interactive simulator for individual-based models in this paper. this is described in detail in supplemental materials. in contrast to compartment models, individual-based models represent each individual via an independent set of specific characteristics that may change over time. this feature allows a more realistic and informative analysis of an epidemic. it is capable of interactively modeling parameter ranges as probability distributions, heterogeneity as independent objects and randomness of transmission as stochastic processes through a terminal or webpage without coding. the output of this model consists of daily values of infected cases, confirmed cases, recovered cases, deaths, and effective reproduction numbers. we can fit input parameters and output results with reported and inferred data from multiple sources. supplemental information includes supplemental materials, methods and results. the authors declare no competing interests.
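the individual-based representation described here can be made concrete: each person carries its own sampled incubation and infectious period, and transmission is a stochastic process rather than a deterministic rate. this is a toy sketch with invented parameter values, not the authors' simulator:

```python
import random

class Person:
    """One individual with its own sampled disease-course parameters."""
    def __init__(self, rng):
        self.state = "S"            # S, E, I or R
        self.days_in_state = 0
        # per-individual periods drawn from distributions instead of a
        # single fixed rate shared by a whole compartment
        self.incubation = max(1, round(rng.lognormvariate(1.6, 0.4)))
        self.infectious_for = rng.randint(5, 12)

def step_one_day(pop, rng, contacts_per_day=8, p_transmit=0.03):
    """Advance the population by one day (stochastic transmission)."""
    for person in [p for p in pop if p.state == "I"]:
        # each infectious individual draws a set of daily contacts
        for other in rng.sample(pop, contacts_per_day):
            if other.state == "S" and rng.random() < p_transmit:
                other.state, other.days_in_state = "E", 0
    for p in pop:
        p.days_in_state += 1
        if p.state == "E" and p.days_in_state >= p.incubation:
            p.state, p.days_in_state = "I", 0
        elif p.state == "I" and p.days_in_state >= p.infectious_for:
            p.state = "R"

rng = random.Random(42)
population = [Person(rng) for _ in range(2000)]
for p in population[:5]:
    p.state = "I"   # seed cases
for _ in range(60):
    step_one_day(population, rng)
```

because every person is an independent object, heterogeneity (age, location, behavior) and memory of time spent in each state come for free, which is exactly what the homogeneous markovian compartments cannot express.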
key: cord-322577-5bboc1z0
authors: parola, anna; rossi, alessandro; tessitore, francesca; troisi, gina; mannarini, stefania
title: mental health through the covid-19 quarantine: a growth curve analysis on italian young adults
date: 2020-10-02
journal: front psychol
doi: 10.3389/fpsyg.2020.567484
sha: doc_id: 322577
cord_uid: 5bboc1z0

introduction: health emergencies, such as epidemics, have detrimental and long-lasting consequences on people's mental health, which are higher during the implementation of strict lockdown measures. despite several recent psychological studies on the coronavirus disease 2019 (covid-19) pandemic highlighting that young adults represent a high-risk category, no studies specifically focused on young adults' mental health status have been carried out yet. this study aimed to assess and monitor italian young adults' mental health status during the first 4 weeks of lockdown through the use of a longitudinal panel design. methods: participants (n = 97) provided self-reports in four time intervals (1-week intervals) in 1 month. the syndromic scales of adult self-report 18-59 were used to assess the internalizing problems (anxiety/depression, withdrawn, and somatic complaints), externalizing problems (aggressive, rule-breaking, and intrusive behavior), and personal strengths. to determine the time-varying effects of prolonged quarantine, growth curve modeling was performed. results: the results showed an increase in anxiety/depression, withdrawal, somatic complaints, aggressive behavior, rule-breaking behavior, and internalizing and externalizing problems and a decrease in intrusive behavior and personal strengths from t1 to t4.
conclusions: the results contributed to the ongoing debate concerning the psychological impact of the covid-19 emergency, helping to plan and develop efficient intervention projects able to take care of young adults’ mental health in the long term. the novel coronavirus disease 2019 (covid-19) is a highly infectious disease that began as a viral pneumonia in late december 2019. in march 2020, the world health organization (who) declared the state of pandemic. as rapidly pointed out (fiorillo and gorwood, 2020; jakovljevi et al., 2020) , the covid-19 global pandemic has affected-and is still affecting-not only physical health but also individual, family, and collective mental health. in line with recent studies (horesh and brown, 2020; masiero et al., 2020) , the covid-19 pandemic should be classified as a critical event with a potential traumatic nature, which may be overwhelming and could lead to complex emotional responses that can negatively affect individuals and collective psychological systems. starting with china and followed by other states, extraordinary measures and containment efforts (e.g., lockdown) aimed to prevent the high risk of contagion and limit the covid-19 outbreak have been adopted. in europe, italy was the first country that had to face the pandemic. here, on march 09, 2020, strict lockdown measures were imposed by the government. a series of decrees imposed restrictions on the movements of individuals in the entire national territory from march 10 until may 3. during the lockdown, people were allowed to leave their homes only for limited and documented purposes. schools, universities, theaters, and cinemas, as well as any shops selling non-essential goods were, therefore, temporarily closed. as previous studies demonstrated (tucci et al., 2017) , health emergencies, such as epidemics, have detrimental and longlasting consequences on people's mental health. 
concerning the covid-19 pandemic, initial studies carried out in china reported high levels of anxiety, depression, and trauma-related symptoms (qiu et al., 2020), both during the epidemic peak and 1 month later. moreover, the detrimental effect of epidemics on mental health seems to be higher during the implementation of strict lockdown measures. specifically, previous studies have associated quarantine with higher levels of trauma-related disorders (wu et al., 2009), depression (hawryluck et al., 2004), irritability and insomnia (lee et al., 2005), acute stress (bai et al., 2004), and avoidance behaviors and anger (marjanovic et al., 2007). in a recent review, brooks et al. (2020) identified the major stress factors as being the long duration of quarantine, the fear of infection, inadequate supplies and information, boredom, and frustration. in a recent italian study carried out during the third week of lockdown, cellini et al. (2020) highlighted that italians reported high levels of depression, anxiety, and sleep disturbances. similarly, high rates of negative mental health outcomes were found in the general population 3 weeks into the covid-19 lockdown. within the stream of research investigating the impact of quarantine during epidemics on individuals' mental health, there have been very few longitudinal investigations aimed at understanding and monitoring the changes in mental health status during quarantine (brooks et al., 2020). where longitudinal research designs were carried out, they were limited to investigating people's mental health during and after quarantine (jeong et al., 2016; wang et al., 2020). recent psychological research on covid-19 has also highlighted that specific target groups, such as medical workers, marginalized people (i.e., homeless people and migrants), and young adults, are more at risk than others of developing a wide variety of psychological problems.
regarding young adults (18-30 years old), recent research has highlighted that they present higher levels of anxiety, distress, and depression than do other adult groups (cao et al., 2020; huang and zhao, 2020; qiu et al., 2020). these findings have also been confirmed in italy (rossi r. et al., 2020). according to cheng et al. (2014), one of the possible reasons can be found in young adults' tendency to obtain information from social media, which can represent a high stress factor for mental health. these initial findings strongly suggest the need to assess and monitor young adults' psychological situation during the epidemic and the weight of their mental health outcomes. to the best of the authors' knowledge, there are no previous studies specifically aimed at evaluating the impact of lockdown measures on italian young adults' mental health and monitoring the changes in their mental health status. to fill this gap, the current study presents a longitudinal panel design aimed at assessing italian young adults' mental health status and monitoring their mental health trends during the first 4 weeks of lockdown imposed by the italian government during the covid-19 outbreak. on the basis of recent literature on the general population, an increase in mental health problems among young adults during quarantine was hypothesized. participants were enrolled online and provided self-reports over 1 month (1-week intervals, t1-t2-t3-t4). participants were considered eligible for participation if they met the following inclusion criteria: (a) were between 19 and 29 years old and (b) were in a lockdown condition. exclusion criteria were as follows: (a) diagnosis of psychiatric disorder and/or psychopharmacological treatment (assessed with filter questions in the survey) and (b) not "absolute" lockdown condition (workers who were allowed to work outside their home during the lockdown measures).
from the initial sample size at t1 (n = 120), nine participants did not participate at t2 (n = 111); four other participants did not participate at t3 (n = 107); and 10 other participants did not participate at t4. these participants were, therefore, excluded from the data analysis. the final sample was composed of 97 participants. approval from the university research ethics committee was obtained for collecting data. data collection took place during the italian lockdown from mid-march 2020 to mid-april 2020. the administration took place in four time intervals (1-week intervals) in 1 month. the first survey (t1) was administered at the end of the first week of lockdown. the second survey (t2) coincided with the end of the second week of the lockdown. the third survey (t3) coincided with the end of the third week of the lockdown. the fourth survey (t4) coincided with the end of the fourth week of the lockdown. participants were informed about a complete guarantee of confidentiality and the voluntary nature of participation and their right to discontinue at any point. the enrollment procedure was carried out through online advertising on social platforms. participants voluntarily accessed the online platform used for data collection once a week for the 4 weeks of administration. to ensure anonymity, a request was made to create a personal identification code to be used for the four administrations. adult self-report (asr/18-59) the syndromic scales of adult self-report 18-59 (achenbach and rescorla, 2003) were used to assess the internalizing and externalizing problems. the asr is especially valuable when used routinely, as in this study design. the asr norms provide a standardized benchmark with which to compare what is reported by each individual. standardized reassessments over a regular interval make it possible to identify reported stabilities and changes in a group who have particular kinds of problems.
in this case, the asr instrument was administered at regular intervals of 1 week for 4 weeks in the period of the italian lockdown. the asr was developed both to document specific problems and to identify syndromes of co-occurring problems. in this study, six specific syndromic scales, anxious/depressed, withdrawn, somatic complaints, aggressive behavior, rule-breaking behavior, and intrusive were used. anxious/depressed (18 items) refers to anxiety and depressive symptoms (e.g., "i feel lonely" and "i am too fearful or anxious"). withdrawn (8 items) mainly refers to attitudes of isolation and lack of contact with others (e.g., "i don't get along with other people" and "i keep from getting involved with others"). somatic complaints (12 items) include physical illness, without a known medical cause (e.g., "i feel dizzy or lightheaded" and "physical problems without a known medical cause: stomachaches"). aggressive behavior (15 items) includes behaviors and attitudes characterized by poor control of one's aggression (e.g., "i blame others for my problems" and "i scream or yell a lot"). rule-breaking behavior (14 items) refers to transgressive behavior and violation of social norms (e.g., "i am impulsive or act without thinking" and "i lie or cheat"). intrusive (6 items) refers to the difficulty faced in the interpersonal relationships and to the prevalence of intrusive behavior (e.g., "i damage or destroy my things" and "i drink too much alcohol or get drunk"). in addition, the broadband scales, internalizing and externalizing, were computed. internalizing problems reflect internal distress, while externalizing problems reflect conflicts with other people. the internalizing scale consists of the syndrome scales anxious/depressed, withdrawn, and somatic complaints, whereas the externalizing scale consists of aggressive behavior and rule-breaking behavior. 
moreover, the scale of personal strengths (11 items) was used to assess the adaptive functioning of the individuals (e.g., "i try to get a lot of attention" and "i am louder than others"). the items are scored on a three-point rating scale: 0 (not true), 1 (somewhat or sometimes true), and 2 (very true or often true); and a total score may be calculated. higher raw scores indicate more problematic behaviors on each scale. then, a normalized t score-weighted for sex and age-was assigned for the syndromic scales and for the internalizing and externalizing problem scales. raw scores of both types of scales were converted to gender- and age-specific t scores. the clinically significant threshold is indicated by t scores ≥ 70; the borderline range is from 65 to 69. the asr is a reliable and valid measure for the 18-59 general population (achenbach and rescorla, 2003). cronbach's alpha (α) and mcdonald's omega (ω) are reported in table 1. statistical analyses were performed with r software (v. 3.5.3; r core team, 2014, 2015) and the following packages: psych (v. 1.8.12; revelle, 2018), irr (v. 0.84.1; gamer et al., 2019), lme4 (v. 1.1-21; bates et al., 2015), lmertest (v. 3.1-2; kuznetsova et al., 2017), esvis (v. 0.3.1; anderson, 2020), aiccmodavg (v. 2.3-0; mazerolle, 2020), and ggplot2 (v. 3.1.0; wickham, 2016). no data were missing for any of the participants on any of the asr scales at any of the measurement points. reliability was evaluated by internal consistency analysis, using cronbach's alpha (α) and mcdonald's omega for categorical data (ω). first, the mean differences between the four time intervals (t1, t2, t3, and t4) were examined. the unbiased sample estimate of standardized mean difference effect sizes (hedges' g; hedges, 1981) was computed to evaluate the magnitude of these differences.
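the estimator just mentioned is simple to write out; below is a plain sketch of the bias-corrected standardized mean difference (hedges' g), with invented data in the test values:

```python
from math import sqrt
from statistics import mean, variance

def hedges_g(x, y):
    """Hedges' g: Cohen's d with the small-sample bias correction (Hedges, 1981)."""
    nx, ny = len(x), len(y)
    # pooled variance weighted by degrees of freedom
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    d = (mean(y) - mean(x)) / sqrt(pooled_var)   # Cohen's d
    correction = 1 - 3 / (4 * (nx + ny) - 9)     # small-sample correction factor
    return d * correction
```

in the study's setting, hedges_g(t1_scores, t2_scores) would give the t1-to-t2 increment of a scale in pooled standard-deviation units.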
the following established ranges guide interpreting standardized mean difference magnitude: from 0.20 to 0.49 = small; from 0.50 to 0.79 = medium; and ≥ 0.80 = large (cohen, 1988). growth curve analysis (gca) models were used to estimate the growth trajectories (i.e., slopes) of the syndromic scales of the asr-both internalizing and externalizing scales-and the personal strength scale. models also estimated subject variability in change across time, as represented in random-intercepts coefficients. parameters in each gca model were computed with maximum likelihood (ml) estimation. several models were estimated for each of the outcome variables, separately. specifically, it was hypothesized that the time (the week of quarantine) could have had an effect on the asr syndromic scales. in addition, it was also hypothesized that covariates, such as sex and the experience of covid-19 (exp-cvd19), intended as the experience of direct proximity with relatives and/or friends affected by covid-19, could have had an effect on the shape of the growth curve across time. models were sequentially specified according to the guidelines (long, 2012; grimm et al., 2017). first, a null model was estimated to provide a baseline comparison and to calculate the intraclass correlation coefficient (model 0-intercept only). second, a null model with covariates was specified (model 1-intercept model with covariates). third, a linear model with time as predictor and covariate interactions was estimated (model 2-linear model with covariates). fourth, a quadratic model was specified with linear interaction effects of the covariates (model 3-quadratic model with linear covariate interactions). fifth, a quadratic model with all covariate interactions was specified (model 4). the best model fit was assessed with several indices. first of all, the likelihood ratio test (lrt) was performed between one model and the following one in a step-up approach analysis: model 0 vs. model 1; model 1 vs. model 2; model 2 vs. model 3; and model 3 vs. model 4-the most parsimonious model being preferred (long, 2012).
in addition, "information criteria" indices were also computed by comparing the abovementioned models. first, the schwarz bayesian information criterion (bic; schwarz, 1978; burnham and anderson, 2002) was calculated: the model with the lower bic indicated the best model-and it is recommended when model parsimony is overriding (kadane and lazar, 2004; long, 2012). moreover, considering that the bic tends to favor simpler models (long, 2012), the corrected akaike information criterion (aicc; akaike, 1973; azari et al., 2006) was also computed: even in this case, the model with the lower aicc indicated the best model. in addition, considering that-on a theoretical level-the bic is less desirable for model evaluation than the aicc (long, 2012), several effect sizes based on the aicc were computed: (i) the difference of aicc (Δaicc); (ii) the weight of evidence (w h): given a set of competing models and the unknowable true model, the w h indicates the probability that a model h is the best approximate model (the model with the largest w h is the best-fitting model: the more probable the model, the closer it approximates the true model); (iii) the evidence ratio (e h), which expresses the difference-in odds-between the best-fitting model and the first worst-fitting model: the higher the e h, the more plausible is the best-fitting model. of 97 participants, 48 were male (49.5%) and 49 were female (50.5%). the mean age of the sample was 24.62 (sd = 2.88; range = 19-29). a total of 29 participants (29.9%) had experienced proximity with a covid-19-infected relative or friend. most of the participants lived with their parents during the quarantine (80.4%). all participants came from the campania region, in southern italy, and attended the university. means and standard deviations between the four time intervals (t1, t2, t3, and t4) and the effect size of means difference (hedges' g) are displayed in table 1.
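the information-criterion machinery described above is compact enough to sketch; this is a generic illustration (the log-likelihood values in the test are invented), not the authors' r code:

```python
from math import exp

def aicc(log_lik, k, n):
    """Corrected Akaike information criterion for k parameters and n observations."""
    return -2 * log_lik + 2 * k + (2 * k * (k + 1)) / (n - k - 1)

def akaike_weights(aicc_values):
    """w_h: probability that model h is the best approximating model in the set."""
    deltas = [a - min(aicc_values) for a in aicc_values]   # the delta-aicc values
    rel_lik = [exp(-d / 2) for d in deltas]
    total = sum(rel_lik)
    return [r / total for r in rel_lik]

def evidence_ratio(weights, best, other):
    """e_h: odds of the best-fitting model relative to a competitor."""
    return weights[best] / weights[other]
```

the weights always sum to one, and the evidence ratio reproduces the "n times greater weight of evidence" statements reported in the results.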
the preliminary analysis showed that the increments tended to be small from t1 to t2 for each syndromic scale and broadband scale (0.45 was the highest value). from t2 to t3, the results highlighted a medium increase for the anxious/depressed, withdrawn, and internalizing scales. from t3 to t4, the increase was null. for somatic complaints, aggressive behaviors, rule-breaking behavior, and externalizing scales, the magnitude of the effect size was medium only considering the increments from t1 to t4. across the weeks of quarantine, the somatic complaints scale increased with an almost null effect. finally, the personal strengths showed a small increase only from t1 to t3 and from t1 to t4. a scatterplot (figure 1) shows the change of the syndromic scale and broadband scale scores over time. the following figures show the growth of the syndromic scales, as well as the related broadband scales, across the weeks of quarantine. specifically, figure 3 was split by sex (males vs. females), and figure 4 was split by the experience of covid-19 (yes vs. no). finally, figure 5 shows the interaction between sex and experience of covid-19. the broken lines demarcate a borderline clinical range from the 93rd to 97th percentiles for the syndromic scales and from the 84th to 90th percentiles for the internalizing and externalizing broadband scales. scores above the top broken line, i.e., above the 97th percentile for the syndromic scales and above the 90th for the internalizing and externalizing broadband scales, indicate that the individual reported enough problems to be of clinical concern. scores below the bottom broken line are in the normal range. as shown in figure 2, the anxious/depressed scale is above the clinical threshold in t3, and the withdrawn scale is above the normal threshold in t3 with an increase in t4.
preliminary analyses (m.0) revealed that the variance related to the random intercept of the participants was equal to 24.41. the null model with covariates (m.1) revealed a non-statistically significant effect of the interaction between sex and exp-cvd19 (b = −0.961, se = 3.460, t = 0.077, p = 0.782) or their main effects (sex: the linear model with covariates (m.2) revealed a non-statistically significant effect of the interaction between time and exp-cvd19 (b = 0.310, se = 1.065, t = 0.291, p = 0.771) or the two simple main effects (sex: b = 4.797, se = 2.908, t = 1.649, p = 0.099; exp-cvd19: b = −1.076, se = 3.176, t = −0.339, p = 0.735). however, the model revealed a statistically significant interaction effect between time and sex (b = −2.066, se = 0.975, t = −2.118, p = 0.035) as well as the principal effect of time (b = 5.114, se = 0.760, t = 6.732, p < 0.001). figure 3 shows a greater increase in males from t1 to t2 and from t2 to t3 than in females. the preliminary analyses (m.0) revealed that the variance related to the random intercept of the participants was equal to 6.25. the w h of m.3 suggested that this model had 54% probability of being the best approximate model (the w h of m.2 was 32%), and the e h suggested that m.3 had a weight of evidence almost two times (1.7) greater than m.2 of being the best approximate model (table 3). preliminary analyses (m.0) revealed that the variance related to the random intercept of the participants was equal to 14.70. the null model with covariates (m.1) revealed a non-statistically significant effect of the interaction between sex and the comparison of the different multilevel growth curve models suggested that the linear model with covariates (m.2) showed the lower bic and the lower aicc. the lrt showed that m.2 was statistically significantly different from m.1 (intercept model with covariates).
however, although m.2 was not statistically significantly different from m.3, it was the most parsimonious-and thus, it was chosen as the best model. however, the effect size indices suggested a negligible preference (table 3). the comparison of the different multilevel growth curve models suggested that the linear model with covariates (m.2) showed the lower bic and the lower aicc. the lrt showed that m.2 was statistically significantly different from m.1 (intercept model with covariates). however, although m.2 was not statistically significantly different from m.3, it was the most parsimonious-and thus, it was chosen as the best model. however, the effect size indices suggested a small preference for m.2. indeed, the Δaicc suggested a small difference between m.2 and m.3 (1.78), the w h of m.2 indicated that this model had 68% probability of being the best approximate model, and the e h suggested that m.2 had a weight of evidence more than two times (2.24) greater than m.3 of being the best approximate model (table 3). preliminary analyses (m.0) revealed that the variance related to the random intercept of the participants was equal to 2.81. the Δaicc suggested a small difference (1.44), the w h of m.2 suggested that this model had 47% probability of being the best approximate model (the w h of m.0 was 23%), and the e h suggested that m.2 had a weight of evidence two times (2.05) greater than m.0 of being the best approximate model (table 3). the comparison of the different multilevel growth curve models suggested that the quadratic model with linear covariates interaction (m.3) showed the lower bic and the lower aicc. the lrt showed that m.3 was statistically significantly different from m.2 (linear model with covariates). however, the lrt suggested that m.3 was not statistically significantly different from m.4, but it was more parsimonious-and thus, m.3 was chosen as the best model. however, the effect size indices suggested a small preference for m.3.
indeed, the Δaicc suggested a small difference between m.3 and m.4 (2.38), the w h of m.3 suggested that this model had 69% probability of being the best approximate model, and the e h suggested that m.3 had a weight of evidence more than three times (3.29) greater than m.4 of being the best approximate model (table 3). as stated above, in addition to being a public physical health emergency, the covid-19 pandemic also implies a global mental health emergency that may have a potential traumatic nature and provoke complex emotional responses that could negatively affect individual and collective mental health (jakovljevi et al., 2020; masiero et al., 2020). therefore, this global pandemic constantly requires researchers and professionals to monitor and assess the current mental health situation, in order to plan and develop efficient strategies aimed at reducing its negative psychological impacts. this study assessed and monitored italian young adults' mental health status during the first 4 weeks of lockdown imposed by the government during the covid-19 outbreak, from march 16 to april 16. to the authors' knowledge, this is the first study specifically focused on young adults' mental health status during covid-19 quarantine, both in italy and worldwide. a longitudinal panel design was carried out in order to assess internalizing and externalizing problems in 97 italian young adults living in the campania region, southern italy. a gca (jackson et al., 2018) was performed to monitor the changes during the first 4 weeks of quarantine. first of all, in line with the global trend reported by previous studies carried out on the general population (cao et al., 2020; huang and zhao, 2020; qiu et al., 2020; rossi r. et al., 2020), this study confirmed the negative behavioral and emotional responses provoked by covid-19 quarantine and also highlighted the high vulnerability of young adults to developing psychological distress.
Comparing the internalizing and externalizing domains, the results showed an analogous increase in both areas from T1 to T4, even though higher rates of internalizing manifestations were registered. Specifically, the growth curve modeling highlighted that, within the internalizing problems area, the levels of anxiety/depression, withdrawal, and somatic complaints increased overall from T1 to T4, i.e., while the lockdown measures were in place. In this context, in line with results obtained on medical health workers (Zhang et al., 2020), having experienced closeness with a COVID-19-infected relative or friend resulted in an increase of somatic complaints. Similarly, within the externalizing problems area, the levels of aggressive behavior and rule-breaking behavior increased from T1 to T4. Among the internalizing domains, youth reported clinical-level symptoms of anxiety and depression. According to the recent review on the psychological impact of quarantine (Rajkumar, 2020), anxiety as well as depressive symptomatology was the most common. Furthermore, the results showed that the withdrawal level was above the normal threshold. This finding could be related to the specific situation of quarantine and the impossibility of engaging in social behaviors due to the lockdown. Indeed, physical distance can intensify feelings of loneliness that in turn trigger intense anxiety (Boffo et al., 2012; Banerjee and Rai, 2020). If, broadly, the results obtained confirmed the general detrimental effects of social isolation due to epidemics on young adults' mental health (Hawryluck et al., 2004; Tucci et al., 2017; Qiu et al., 2020; Wang et al., 2020), some brief reflections need to be outlined about the specificities of young adults' condition. Indeed, young adults live through a specific transition period in which their identity development process is founded on continuous affective investments in social and extra-familiar relationships (Sica et al., 2018).
In this context, the lockdown measures may be interpreted as a forced regression that further triggers negative mental health outcomes. Across the range from T1 to T4, higher levels of internalizing and externalizing problems were registered at T3, whereas a sort of stabilization from T3 to T4 emerged. The peak reported at T3 probably indicated a gradual cognitive and emotional recognition by young adults of the seriousness of the pandemic, which increased feelings of anxiety, depression and worry, and irritability and anger. Regarding the stabilization of both internalizing and externalizing problems between T3 and T4, these findings might need to be interpreted in relation to the specific historical context of the COVID-19 pandemic in Italy. Specifically, T4 corresponded to the final week of assessment, ending on April 16, in which a double attitude was observed in Italy. On the one hand, despite the lockdown, the Italian "Civil Protection" continued to alert the general population about the very high levels of contagion; on the other hand, in that period, Italians also started to receive the first information about the so-called "Phase 2," which would follow the forced lockdown. It might be hypothesized that the high levels of viral load continued to worry participants, even though the approach of Phase 2 assumed a sort of protective function against a possible worsening of mental health. Alongside the increase in mental health distress, the results also showed a gradual decrease in participants' perception of their personal strengths, suggesting the need for researchers to strengthen individuals' psychological resources in order to mediate the individual reaction to the COVID-19 pandemic (Di Giuseppe et al., 2020). Finally, regarding gender differences, a significantly greater increase in the levels of anxiety/depression from T1 to T2 and, to a lesser extent, from T2 to T3 emerged in males than in females.
These findings were in line with previous studies that pointed out higher symptoms of anxiety and depression under conditions of social isolation in boys than in girls (Troop-Gordon and Ladd, 2005; Derdikman-Eiron et al., 2011). The results reported no other statistically significant differences between sexes. These findings seemed to contrast with recent studies that have investigated the impact of COVID-19 on mental health and highlighted a higher vulnerability of women to developing negative mental health outcomes, as compared with men (Qiu et al., 2020; Rossi R. et al., 2020). In the context of gender studies, a wide range of recent literature has tended to connect these results to the reinforced gender inequalities promoted by the lockdown measures. According to these studies (Adams-Prassl et al., 2020; Béland et al., 2020; Etheridge and Spantig, 2020), during the lockdown, the increase in unemployment rates as well as the burden of domestic work and childcare represented a high risk factor for women compared with men. Within the same interpretative framework, the lack of significant gender differences in the present results might be related to the nature of the sample itself, which mostly involved university students who probably faced the same challenges and tasks and did not experience greater or smaller demands connected to specific gender roles that could produce such differences. The present study is not free from limitations. First of all, the number of participants should be increased in future studies, and the results need to be replicated in other geographical areas to determine their generalizability. Furthermore, the sample was only composed of university students who came from the Campania region in southern Italy, where the COVID-19 outbreak was more under control. To assess the mental health of young people during the quarantine, only a self-report measure was used.
Consequently, the data may be influenced by a reporting bias (e.g., social desirability). Moreover, despite the longitudinal panel design, this is an observational study; experimental manipulations and a control group are lacking. Future research needs to extend the assessment of young adults' mental health to other Italian regions, taking into consideration that in the south of Italy, where the study was carried out, the COVID-19 outbreak was moderate and under control compared with the north. Higher levels of distress might be hypothesized in places where very high numbers of losses and deaths have been registered. Moreover, the present study investigated internalizing and externalizing problems as individual responses to the COVID-19 pandemic; further investigations measuring the traumatic symptomatology and the characteristics of post-traumatic effects caused by such stressful events are needed (Troisi, 2018; Margherita and Tessitore, 2019). Follow-up investigations are also needed. Considering the high levels of withdrawal that emerged from the results, future investigations should explore in depth the function and the role played by virtual environments and e-communities during the pandemic, taking into account the roles played by online environments and by the use of social media in terms of both risk and protective functions (Faccio et al., 2019; Gargiulo and Margherita, 2019; Procentese et al., 2019; Boursier et al., 2020). In this sense, future investigations might also be directed to investigate the changes in the dynamics of social and love relationships (Mannarini et al., 2013, 2017a; Balottin et al., 2017) as well as the role of social support (Ratti et al., 2017) post-lockdown and post-pandemic.
Finally, recognizing the fundamental value of qualitative investigations to shed light on the inner aspects and subjective meanings of personal experiences is also vital (Margherita et al., 2017; Felaco and Parola, 2020; Parola, 2020; Parola and Felaco, 2020; Tessitore and Margherita, 2020). These are much-needed actions in order to develop an in-depth understanding of the emotional and affective dimensions connected to the experience of the COVID-19 pandemic, as well as of possible risk and protective factors for mental health. In conclusion, the present study could contribute to the ongoing debate concerning the psychological impact of the COVID-19 emergency, helping to develop efficient and person-centered intervention projects able to take care of young adults' mental health in the medium and long terms, understanding their specific needs and susceptibilities (Benedetto et al., 2018; Parola and Donsì, 2019; Fusco et al., 2019). This is even more urgent considering that, despite the distressing and prolonged situation, a significant number of people avoid seeking psychological help. On the one hand, some of these people may be reluctant to seek professional help due to the associated stigma (Mannarini et al., 2017b, 2018; Faccio et al., 2019; Mannarini and Rossi, 2019). On the other hand, some individuals may deny the problem, leading them to think that it will probably resolve itself naturally (Sareen et al., 2007; Rossi Ferrario et al., 2019; Rossi Ferrario and Panzeri, 2020), thus choosing to manage the psychological issue on their own (Wilson and Deane, 2012). The datasets generated in this article are not readily available in order to ensure the privacy of the participants. Requests to access the datasets should be directed to AP, anna.parola@unina.it.
The studies involving human participants were reviewed and approved by the Ethical Committee of Psychological Research of the University of Naples Federico II and were carried out in accordance with the American Psychological Association rules. The patients/participants provided their written informed consent to participate in this study. AP developed the theoretical framework of the present study, designed the study, and developed the methodological approach. AR performed all the analyses and designed tables and figures. FT and GT led the literature search and interpretation of data. SM critically revised the manuscript. All authors read and approved the final version of the work.

References
- Manual for the ASEBA adult forms & profiles
- Inequality in the impact of the coronavirus shock: evidence from real time surveys
- Information theory and an extension of the maximum likelihood principle
- esvis: visualization and estimation of effect sizes
- Longitudinal data model selection
- Survey of stress reactions among health care workers involved with the SARS outbreak
- Triadic interactions in families of adolescents with anorexia nervosa and families of adolescents with internalizing disorders
- Social isolation in COVID-19: the impact of loneliness
- Fitting linear mixed-effects models using lme4
- The short-term economic consequences of COVID-19: exposure to disease, remote work and government response
- Metacognitive beliefs and alcohol involvement among adolescents
- Exploratory structure equation modeling of the UCLA Loneliness Scale: a contribution to the Italian adaptation
- Objectified body consciousness, body image control in photos, and problematic social networking: the role of appearance control beliefs
- The psychological impact of quarantine and how to reduce it: rapid review of the evidence
- Model selection and multimodel inference: a practical information-theoretic approach
- The psychological impact of the COVID-19 epidemic on college students in China
- Changes in sleep pattern, sense of time and digital media use during COVID-19 lockdown in Italy
- Psychological health diathesis assessment system: a nationwide survey of resilient trait scale for Chinese adults
- Statistical power analysis for the behavioral sciences
- Gender differences in subjective well-being, self-esteem and psychosocial functioning in adolescents with symptoms of anxiety and depression: findings from the Nord-Trøndelag health study
- Psychological impact of COVID-19 among Italians during the first week of lockdown
- The gender gap in mental well-being during the COVID-19 outbreak: evidence from the UK (No. 2020-08). Institute for Social and Economic Research
- The power of weight and the weight of power in adolescence: a comparison between young and adult women
- Young in university-work transition. The views of undergraduates in southern Italy
- The consequences of the COVID-19 pandemic on mental health and implications for clinical practice
- From creativity to future: the role of career adaptability
- irr: various coefficients of interrater reliability and agreement
- Narratives of self-harm: the experience of young women through the qualitative analysis of blogs
- Growth modeling: structural equation and multilevel modeling approaches
- SARS control and psychological effects of quarantine
- Distribution theory for Glass's estimator of effect size and related estimators
- Traumatic stress in the age of COVID-19: a call to close critical gaps and adapt to new realities
- Mental health burden for the public affected by the COVID-19 outbreak in China: who will be the high-risk group?
- Brief strategic therapy and cognitive behavioral therapy for women with binge eating disorder and comorbid obesity: a randomized clinical trial one-year follow-up
- COVID-19 pandemia and public and global mental health from the perspective of global health security
- Mental health status of people isolated due to Middle East respiratory syndrome
- Methods and criteria for model selection
- lmerTest package: tests in linear mixed effects models
- The experience of SARS-related stigma at Amoy Gardens
- Longitudinal data analysis for the behavioral sciences using R
- A Rasch-based dimension of delivery experience: spontaneous vs. medically assisted conception
- Etiological beliefs, treatments, stigmatizing attitudes toward schizophrenia. What do Italians and Israelis think?
- Assessing conflict management in the couple: the definition of a latent dimension
- The role of secure attachment, empathic self-efficacy, and stress perception in causal beliefs related to mental illness - a cross-cultural study: Italy versus Israel
- Assessing mental illness stigma: a complex issue
- How do education and experience with mental illness interact with causal beliefs, eligible treatments and stigmatising attitudes towards schizophrenia? A comparison between mental health professionals, psychology students, relatives and patients
- A comparison between pro-anorexia and non-suicidal self-injury blogs: from symptom-based identity to sharing of emotions
- Italian validation of the Capacity to Love Inventory: preliminary results
- Teen mothers who are daughters of teen mothers: psychological intergenerational dimensions of early motherhood
- From individual to social and relational dimensions in asylum-seekers' narratives: a multidimensional approach
- The relevance of psychosocial variables and working conditions in predicting nurses' coping strategies during the SARS crisis: an online questionnaire survey
- From individual to social trauma: sources of everyday trauma in Italy, the US and UK during the COVID-19 pandemic
- AICcmodavg: model selection and multimodel inference based on (Q)AIC(c)
- Novel coronavirus outbreak and career development: a narrative approach into the meaning of Italian university graduates
- Suspended in time. Inactivity and perceived malaise in NEET young adults
- Time perspective and employment status: NEET categories as negative predictor of future
- A narrative investigation into the meaning and experience of career destabilization in Italian NEET
- Families and social media use: the role of parents' perceptions about social media impact on family systems in the relationship between family collective efficacy and open communication
- A nationwide survey of psychological distress among Chinese people in the COVID-19 epidemic: implications and policy recommendations
- Social support, psychological distress and depression in hemodialysis patients
- R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna: R Core Team
- COVID-19 and mental health: a review of the existing literature
- psych: procedures for personality and psychological research
- The Italian version of the Attitudes Toward Seeking Professional Psychological Help Scale - Short Form: the first contribution to measurement invariance
- The anxiety-buffer hypothesis in the time of COVID-19: when self-esteem protects from loneliness and fear to anxiety and depression
- COVID-19 pandemic and lockdown measures impact on mental health among the general population in Italy. An N=18147 web-based survey
- Exploring illness denial of LVAD patients in cardiac rehabilitation and their caregivers: a preliminary study
- Development and psychometric properties of a short form of the Illness Denial Questionnaire
- Perceived barriers to mental health service utilization in the United States, Ontario, and the Netherlands
- Estimating the dimension of a model
- I became adult when... Pathways of identity resolution and adulthood transition in Italian freshmen's narratives
- Pre- and postmigratory experiences of refugees in Italy: an interpretative phenomenological analysis
- Female Nigerian asylum seekers in Italy: an exploration of gender identity dimensions through an interpretative phenomenological analysis. Health Care Women Int
- Land of care seeking: pre- and postmigratory experiences in asylum seekers' narratives
- Measuring intimate partner violence and traumatic affect: development of VITA, an Italian scale
- Trajectories of peer victimization and perceptions of the self and schoolmates: precursors to internalizing and externalizing problems
- The forgotten plague: psychiatric manifestations of Ebola, Zika, and emerging infectious diseases
- A longitudinal study on the mental health of general population during the COVID-19 epidemic in China
- ggplot2: elegant graphics for data analysis
- Brief report: need for autonomy and other perceived barriers relating to adolescents' intentions to seek professional mental health care
- The psychological impact of the SARS epidemic on hospital employees in China: exposure, risk perception, and altruistic acceptance of risk
- Mental health and psychosocial problems of medical health workers during the COVID-19 epidemic in China

key: cord-348010-m3a3utvz authors: Wolff, Michael title: On build-up of epidemiologic models - development of a SEI(3)RSD model for the spread of SARS-CoV-2 date: 2020-10-13 journal: Z Angew Math Mech doi: 10.1002/zamm.202000230 sha: doc_id: 348010 cord_uid: m3a3utvz

The present study investigates essential steps in the build-up of models for the description of the spread of infectious diseases. Combining these modules, a SEI(3)RSD model will be developed, which can take into account a possible passive immunisation by vaccination as well as different durations of latent and incubation periods. Besides, infectious persons with and without symptoms can be distinguished. Due to the current world-wide SARS-CoV-2 pandemic (COVID-19 pandemic), models for the description of the spread of infectious diseases and their application for forecasts have come into the focus of the scientific community as well as of the broad public more than usual.
Currently, many papers and studies have appeared and continue to appear dealing with the virus SARS-CoV-2 and the COVID-19 disease caused by it. This occurs under medical, virological, economic, sociological and further aspects as well as from mathematical points of view. Concerning the last-mentioned point, the main focus lies on the application of existing models and their adaptation to data about the course of infection available at the current time. Clearly, the aim is to predict the possible further development, for instance in Germany. It is of particular interest to investigate the influence of political and administrative measures, like contact restrictions or the closing and re-opening of schools, restaurants, hotels etc., on the course of infection. The steps considered here for building up suitable models have been well known for a long time. However, understandably, they are not dealt with in an extended way in current application-oriented works. Therefore, it is the aim of this study to present some existing steps of modelling, without any pretension of completeness. Thus, on the one hand we give assistance and, on the other hand, we develop a model capable of taking into account already known properties of COVID-19 as well as a later possible passive immunisation by vaccination and a possible loss of immunity of recovered persons.
Introduction
At first, in December 2019, several cases of a serious lung disease occurred in the Chinese city of Wuhan. Shortly after, a new virus of the corona family was identified and its complete genome sequence was published, see [1], e.g. Afterwards, the virus spread to almost all countries. On 11 March 2020, the WHO classified this new disease as a pandemic.
The virus was officially named SARS-CoV-2, the triggered disease COVID-19. Meanwhile, the SARS-CoV-2 pandemic has become a serious challenge for the whole world; its social, economic, political and, first of all, medical consequences can hardly be estimated. Studies show that the virus is spread quickly, primarily via droplet infection and by aerosols, especially if persons are in close and prolonged contact. Without any pretension of reasonable completeness, we refer to [1, 2] for an overview and current information as well as to [3-7] and the references cited therein. Besides, we refer to [8] for fluid-dynamical investigations concerning the spread of SARS-CoV-2 in air. In [9], the role of relative humidity in the airborne transmission of SARS-CoV-2 in indoor environments has been discussed. In [10], the possible protection of masks, including simple ones, against infection has been studied; see also [11] for a corresponding meta-analysis. In some meat-processing plants in Germany and other countries, super-spreading events occurred, promoted by special climatic working conditions, see [12]. Concerning the risk of an infection in trains, we refer to [13]. The spread of infection is promoted by the wide absence of immunity within the population as well as by the lack, so far, of vaccines and medicines. The initial hope of immunity after a survived infection could not be confirmed, see [14]. In many laboratories all over the world, scientists are intensively searching for an adequate vaccine. Scientists of very different disciplines investigate the virus, the disease, possible vaccines and medicines, and the economic, political, social, psychological and other consequences. Current data concerning the course of infection are collected and provided by the Johns Hopkins University in Baltimore, USA, and, in Germany, by the Robert Koch-Institut (RKI) [2].
The serious situation in many countries caused by the pandemic has challenged not only policy makers but also the scientific community. In some countries, the spread of SARS-CoV-2 could be decelerated by partly deep cuts into social and economic life. However, the course of infection differs remarkably, locally and temporally, in various countries and regions. After the Chinese region around Wuhan, the virus came to western Europe, at first to Italy, then to other European countries, to the USA, Brazil, Russia, India, South Africa and nearly all countries. Unfortunately, after a successful deceleration, the number of new infections is growing again in many countries. Now, in August 2020, considering the whole world, the SARS-CoV-2 pandemic goes on and a point of culmination has not yet been reached. Due to the SARS-CoV-2 pandemic, models describing the spread of infectious diseases are suddenly no longer the object of research only at specialized institutions and among comparatively few scientists, but have come strongly into the focus of many medics, virologists, economists and mathematicians. There is a long tradition of investigating infectious diseases and their spread within human communities, as well as of the corresponding development of mathematical models for their description. Exemplarily, we refer to [15-17], to [18] for an introduction to biomathematics and modelling in biology and epidemiology, to [19] for the medical and epidemiological background, as well as to [20] for comprehensive modelling of infectious diseases. It is impossible to appreciate here in an adequate manner all the mathematical or highly mathematically oriented works concerning the pandemic and its possible consequences that have been published recently. A small overview can be found in [3]. Moreover, we refer to [21]. In contrast to standard models, the individual groups of persons, like susceptible and infective ones, are there further subdivided to address specific aspects of their behaviour.
The resulting complex model is strongly tailored to the COVID-19 epidemic in Germany. In [22], the authors investigate several possible scenarios after easing the existing restrictions of public life. In [23], the effects of several containment measures performed in Germany are investigated a posteriori. In [24], some aspects of parameter estimation from available data are considered, particularly concerning the reproduction number. Moreover, we refer to [25] and [26] for the effects of a strict lockdown on the course of infection. In [27], a mathematical model has been developed which describes the beginning of the SARS-CoV-2 epidemic in France. In [28], the standard SIR model is analysed in a comprehensive way; based on currently available data, a special variation is presented allowing for an improved monitoring of the pandemic. This study is a revised and updated version of the former papers [29, 30]. Its aim is to systematically provide some general components for building up deterministic models for the spread of infectious diseases. The approaches summarized here are well known; however, they may assist interested colleagues who do not deal with mathematical modelling every day. In this study, modelling means building up models, i.e., the derivation of mathematical descriptions of real processes, here the spread of infectious diseases, starting from some basic assumptions confirmed by empirical findings. Outline of the remaining paper: (i) In Section 2, we investigate some general aspects of modelling the spread of infectious diseases in human communities. Although this work is also a consequence of the COVID-19 pandemic, our presentation is not too closely oriented to the known specifics of SARS-CoV-2. (ii) In Section 3, we obtain a SEI3RSD model, which can take into account the essential features of COVID-19 known today. I3 means that three classes of infectious persons are distinguished.
In contrast to [21], we subdivide the classes of the population only with respect to the course of infection and disease, but not with respect to age or social behaviour. However, the mentioned modular design principle allows the model to be extended if necessary. (iii) In Section 4, we deal with some mathematical questions arising from the developed models. In particular, we investigate the solution behaviour of the corresponding mathematical problems, like unique global solvability and nonnegativity of each solution component. Besides this, some aspects of dimensional analysis and its application to the models are treated. Here we present briefly some elements of deterministic models for courses of infection in human communities. We follow the review paper [16] as well as the books [20] and [18], Ch. 5. Concerning epidemiological issues, we go into details only if this seems necessary for general understanding. At some places, we refer to current findings related to SARS-CoV-2. This allows a reasonable selection of modelling steps in order to develop a basic model for the spread of SARS-CoV-2 in Section 3. We provide some useful definitions and explanations; other ones will be introduced later. Thus, misunderstandings can be excluded, and we can limit the frame of our study. We follow approximately [20], pp. xxi-xxvi, and Ch. 1. An infection is understood as the invasion of one organism by a smaller one (the infecting organism). If the latter is harmful, it is called a pathogenic agent or pathogen and can cause an infectious disease. We only consider microscopic pathogens (viruses, bacteria, protozoa) and infections in human communities.
The majority of pathogens affecting humans live only in humans or vertebrate animals; their transmission from one host to another occurs in a variety of ways:
(i) by direct contact (leprosy, e.g.),
(ii) via the respiratory route (influenza, SARS-CoV-2, e.g.),
(iii) via the faecal-oral route (dysentery, e.g.),
(iv) by sexual contact (HIV, gonorrhoea, e.g.),
(v) by contact with insects (vector-borne infection) (malaria, e.g.).
We note that sexually transmitted diseases require special modelling, since not all parts of the population have nearly the same sexual activity. This is not in the focus here. Moreover, the modelling of HIV/AIDS contains special features; we refer to [20], Ch. 8 and 9.4-9.6, to [31] and to the references cited therein. In order to model the spread of infectious diseases, the term contact needs more explication. Clearly, each contact between two individuals is a singular event: a pathogen can be transmitted or not. Moreover, depending on the individuals, a formally equal contact can lead to a transmission or not. Thus, for a suitable modelling, generalisation and averaging are required. For this reason, the concepts of an adequate contact and of contact and reproduction numbers were introduced.
Definition 2.1 (Adequate contacts, reproduction and contact numbers).
(i) A contact is called adequate (also effective) if it leads to a transmission of the pathogen from an infectious person to another one and, if the affected individual is susceptible, an infection is provoked.
(ii) The basic reproduction number R_0 is the average number of adequate contacts of an infectious individual during its infectiousness, if it is introduced into a host population where everyone is susceptible.
(iii) The contact number is the average number of adequate contacts of a typical infectious person during its infectiousness with all persons.
(iv) The replacement number ϱ (also called reproduction number) is the average number of adequate contacts of an infectious individual during its infectiousness with only susceptible persons. Here we use a letter different from the R used in [16, 20], to avoid confusion with the number R of recovered persons (see Paragraph 2.1.5). In other words, during an adequate contact the pathogen is transmitted with probability one ([31], p. 66). Sometimes the result of an adequate contact of an infectious with a susceptible person is called a secondary infection. Thus one can say that ϱ is the number of secondary infections produced by an infectious person during its infectiousness. The number ϱ takes into account that not every adequate contact produces an infection, because the fraction of susceptible persons generally shrinks in the course of the infection; therefore ϱ is time-dependent. The basic reproduction number R0 is constant; it refers to the beginning of an infection course. The contact number σ may be time-dependent; this is the case if infectious persons change their contact behaviour voluntarily or due to restrictions mandated by authorities. Thus, at the beginning, σ equals R0 and ϱ. Summarising these thoughts, in almost all cases there holds R0 ≥ σ(t) ≥ ϱ(t). In concrete models one generally uses the current contact and replacement numbers σ(t) and ϱ(t), which reflect the current infection behaviour; moreover, these current numbers can be related in a natural way to other quantities describing the model. We return to this in Paragraph 2.2.2. As we will see later, R0, σ and ϱ are dimensionless quantities, i.e., they do not have any units. Finally, we note that an adequate contact, and hence the contact number, are infection-dependent: a physically equal contact may be adequate for one pathogen but not for another. Clearly, the contact number depends on the mean duration of infectiousness. The concept of the contact number averages the different and random behaviour of individuals.
When investigating the spread of infectious diseases, two mutually influencing processes generally have to be considered: the development of the infection itself and the dynamics of the population in which the infection spreads, see [16]. The assumptions listed below must be chosen in accordance with real courses of infection; moreover, they should be supported by as many empirical and medical findings as possible. In [16], many cases of infections, from earlier times and recent ones, are discussed in detail. (i) (Closed-population model) An assumed constant number of community members (see Remark 2.2) seems justified if the infection spreads quickly, approximately within a year, and/or if there is a balance between births, migration and non-disease-related deaths. In connection with disease modelling, the latter deaths are often referred to as "natural deaths". The deaths caused by the infection can be listed as an extra class, or they can be included in the group of (immune) recovered persons. (ii) (Dynamic-population model) A variable number of community members should be taken into account if the infection leads to many deaths, or if strong population growth and/or essential migration substantially influence the population balance. In this case it is necessary to consider an additional equation for the population development, either with given rates of births, deaths and migration or as a logistic equation (see Paragraph 2.2.1). Generally, the model to be developed becomes simpler for a constant population than for a variable one. For a better overview, we first deal with a constant number of community members; in Subsection 2.3.3 a variable population is considered in short. In accordance with current findings about SARS-CoV-2 and with the demographic development, at least in Germany, an assumed constant population seems justified.
Assumptions concerning the course of infection: (i) The whole population is divided into several disjoint classes with respect to the course of infection. The temporal development of each class and its interaction with the others is described by its own equation. In the simplest case there are two classes: persons who are susceptible to the infection and infected individuals. This approach leads to compartment models; in Section 2.1.5 we deal with this item in detail. (ii) The specifics of an infection process are taken into account, such as the mean duration of infectiousness of an individual, the delay of infectiousness of a newly infected person (latent period), and the mean duration of acquired immunity after an overcome infection. Detailed explications are given in Section 2.2. (iii) Outer influences on the infection process are considered, such as vaccinations and their temporal delay after the begin of the infection (see Paragraph 3.3), available capacities of intensive care, or political measures to contain further infections, see for instance [21, 22]. We note that for modelling one assumes homogeneity within the separate classes of the population with respect to contact behaviour and course of infection; otherwise a further subdivision is necessary. In the same manner, the durations mentioned above are averaged quantities: the heterogeneity existing even within the same class is averaged out, based on the large number of individuals. Depending on the specific characteristics of the infection, the population is divided into disjoint subsets (classes, compartments); see Remarks 2.2 concerning terms like 'number' and 'fraction'. (i) The number of persons susceptible to the infection is mostly abbreviated by S; the fraction of S with respect to the whole population is abbreviated by s. This class is also named vulnerable, since its members are potentially at risk from the infection. (ii) The number of infected persons is denoted by I, the fraction by ι (Greek iota).
(Using i, difficulties would arise with the dot indicating the time derivative.) If the model is to take a latent period into account, the class of infected is divided into subclasses in the following way. (a) The number and fraction of exposed persons who have had an adequate contact with an infectious infected individual are abbreviated by E and e, respectively. These persons already carry the pathogen of the infection but cannot yet transmit it to further persons. The mean duration of stay in this class is referred to as the latent period. (b) The number and fraction of infectious persons are abbreviated by I and ι, respectively. This class contains former exposed persons who have become infectious after the end of the latent period; via an adequate contact they can infect susceptible persons. The mean time between infection and the first appearance of symptoms is called the incubation period. The latent period can be shorter than the incubation period; besides, the infectious stage can end before the symptoms disappear. Depending on concrete circumstances, the class I can be further divided, with regard to time and/or in parallel. A splitting with respect to time can be done into (a) a class I1 of infectious persons without symptoms, i.e., before the end of the incubation period; (b) a class I2 of infectious persons with symptoms, i.e., after the end of the incubation period. If necessary, the last class I2 can be divided in parallel into (α) infectious persons I21 showing only weak or no symptoms. These individuals are not aware of their disease and of the danger of infection for others; therefore they do not reduce their contact behaviour, at least not more than is currently usual. (β) infectious persons I22 exhibiting distinct symptoms. They are aware of their disease and therefore strongly change their contact behaviour, or they are hospitalised.
Sometimes a further subdivision in parallel is performed; for instance, infectious persons (without symptoms) in quarantine can be considered, see [21]. (iii) The number and fraction of recovered persons are usually abbreviated by R and r, respectively. Depending on the concrete disease, recovered individuals may be temporarily or permanently immune, or immediately susceptible again; see [16] and [20] for examples. (iv) The number and fraction of immune persons are often denoted by M and m, respectively. Immune individuals are often included in the class of recovered ones, in particular if their acquired immunity is permanent. Persons can be immune by birth (passive immunisation) or, temporarily or permanently, via vaccination or after a survived disease (active immunisation), see [16]. (v) The number and fraction of dead persons are usually abbreviated by D and d, respectively. Sometimes, to simplify the model, this class is included in the recovered or recovered immune persons, because dead individuals are not involved in the further course of the infection. However, to get a complete result, dead individuals should constitute an extra class. This is particularly the case if the lethality is influenced by outer circumstances such as the availability of intensive-care units in hospitals, see [21]. A further subdivision of the classes defined above by attributes like age, danger, social state or sex is discussed in detail in [16] and [20]. Moreover, pursuing this way, the authors of [21] develop a specific model, as suitable as possible, for the description of the SARS-CoV-2 epidemic in Germany. How many classes have to be considered for a concrete model depends on virological and medical findings, on how well the model is to represent the course of infection, and on the purpose for which it is intended. In [16] and [20] many examples are discussed. Concerning the virus SARS-CoV-2, up to now it seems assured that infected persons are already infectious before showing symptoms.
Therefore the latent period is essentially shorter than the incubation period. Moreover, there are infected persons showing no or only weak symptoms after the incubation period while nevertheless being infectious; we refer to [1-4] and the references cited therein. A permanent immunity after a survived disease is not yet assured; some studies indicate only a temporary immunity, see [14] and [1]. Hence a model for the spread of SARS-CoV-2 should take the classes S, E, I1, I21, I22, R and D into account. In Section 3 we discuss this and present a corresponding SEI3RSD model. If future findings show an essential difference between the durations of immunity obtained after surviving the infection and by vaccination, then an additional class of immune individuals has to be considered. Based on a division into classes, different models have been investigated; they are usually abbreviated SI, SIS, SIR, SEIR etc. Sometimes there are additions like "with delay" or notes on population dynamics. We refer to [15-18, 20, 32, 33] and the references cited therein. Remarks 2.2. (i) The numbers of individuals in classes like S, I etc. are quantities equipped with the dimension 'persons', measured for instance in units like one person or one thousand persons. However, we note that the term number is often used for dimensionless quantities in physics and the natural sciences; an important example is the Reynolds number in fluid mechanics. Thus there may be a source of confusion. In Sections 2.2.2 and 4.1 we deal with these questions in the framework of dimensional analysis. (ii) Generally, a fraction is the ratio of two quantities with the same dimension, and thus it is dimensionless, like mass or volume fractions in physics and chemistry. Again, we refer to Section 4.1 for details. Now let us come to concrete steps in modelling. At first, it is important to model the infection mechanism of susceptible persons by infectious infected individuals.
Clearly, the medical and virological details of infection processes are beyond this study; see for instance [20], Ch. 1, and the references cited therein. A mathematical modelling can be performed in a discrete manner using difference equations, or in a continuous way using differential or integral equations. Here we pursue only the second way; however, many basic ideas of modelling are the same. Besides the approach with ordinary differential equations (ODE), in Subsection 2.2.3 we deal briefly with an alternative approach using an integral equation. An advantage of differential-equation models is the availability of broadly developed theoretical results and numerical procedures. For difference-equation models and further aspects of modelling we refer to [20], Ch. 2. Approach for I with a differential equation. At first we suppose that infected persons are infectious from the beginning of their infection and that they remain permanently infectious. Moreover, we assume that there are only susceptible S and infected I, leading to an SI model. Preparation: exponential growth. As for other comparable growth processes (a bacterial culture, an uncontrolled chain reaction in nuclear fission), it seems plausible to assume exponential growth, at least at the beginning, see [34], e.g. Let the increment of infected ΔI(t) (the number of newly infected persons) during a time period Δt be proportional to the number of already existing infected I(t) as well as to the considered time period: ΔI(t) = α I(t) Δt, (2.2) where α > 0 is the (generally time-dependent) factor of proportionality (see Remark 2.3 for time dependence). If there are I(t) infected individuals at time t, then there are already I(t) + α I(t) Δt at time t + Δt. As usual, after division by Δt and performing the limit Δt → 0, from (2.2) one obtains the well-known differential equation İ(t) = α I(t). (2.3) This equation is completed by the initial condition I(0) = I0 (2.4) with 0 ≤ I0 ≤ N(0), where N = N(t) is the generally time-dependent number of population members.
It is well known that for constant α the unique global solution of the initial-value problem (2.3), (2.4) is given by the exponential function I(t) = I0 exp(α t); (2.5) see Remark 2.4 for further comments. Clearly, the initial-value problem (2.3), (2.4) also has a unique global solution for a variable continuous α = α(t), see [34, 35]. This solution is given by I(t) = I0 exp(∫0t α(τ) dτ). (2.6) Due to (2.5), a constant α is not realistic after some time for real processes; thus there remains the task of finding out the detailed structure of α for an infection process. We return to this in the next Paragraph 2.2.1. Remarks 2.3. (i) In the empirical sciences, parameters are calculated from measured data or, in some cases, determined by direct measurements. As a rule one obtains discrete values, for instance the maximum air temperature for each day measured at one chosen place. For further mathematical treatment, a parameter function is often constructed from the obtained discrete values. A simple way to do this is to build a step function; the drawback is that step functions are not continuous at all arguments. The way out is an interpolation to a piecewise linear continuous function or to functions exhibiting differentiability of some order. (ii) For convenience, in this study we assume continuity of the arising parameter functions. However, some mathematical results remain valid under slightly changed conditions for step functions, see Remark 4.4 (i); this point plays some role in Section 4. Remarks 2.4. (Exponential growth and decay) (i) For α < 0, equation (2.3) describes an exponential decay, for instance radioactive decay. (ii) Typical issues of exponential growth and decay (with constant α) are the doubling time and the half-value time (or half-life), respectively. These quantities denote the time duration during which a growing quantity doubles and a shrinking quantity halves, respectively. Suppose a growing quantity doubles during the time interval [t1, t2] with 0 ≤ t1 < t2 < ∞.
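The exponential growth described by (2.3)-(2.5) can be illustrated with a short numerical sketch. The following Python fragment is illustrative only: the values of α and I0 are arbitrary, not fitted to any data. It integrates (2.3) with the forward Euler method, compares the result with the closed-form solution (2.5), and checks the doubling-time relation from Remark 2.4 (ii).

```python
import math

def simulate_exponential(alpha, i0, t_end, dt=1e-4):
    """Forward-Euler integration of I'(t) = alpha * I(t), equation (2.3)."""
    i, t = i0, 0.0
    while t < t_end - 1e-12:
        i += dt * alpha * i
        t += dt
    return i

alpha, i0 = 0.3, 10.0                  # illustrative values, not fitted to data
t_end = 5.0

numeric = simulate_exponential(alpha, i0, t_end)
exact = i0 * math.exp(alpha * t_end)   # closed-form solution (2.5)
assert abs(numeric - exact) / exact < 1e-3

# Doubling time t_d = ln(2)/alpha (Remark 2.4 (ii)): I doubles over any
# interval of that length, independently of its position on the time axis.
t_d = math.log(2.0) / alpha
assert abs(i0 * math.exp(alpha * t_d) - 2.0 * i0) < 1e-9
```

For variable α(t) the same Euler loop applies with `alpha` replaced by a function of `t`, matching solution (2.6).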
Then the doubling time td := t2 − t1 = ln(2)/α is determined only by the length t2 − t1 of the interval and not by its position on the real line. An analogous assertion holds for α < 0: the half-value time th is given by th := t2 − t1 = −ln(2)/α. When the half-value time has passed, half of the initially existing radioactive substance has decayed. Moreover, the half-value time is also the mean "life span" of a radioactive atom: the value exp(α t) (with α < 0) can be interpreted as the probability that a radioactive atom has not yet decayed at time t. This last thought plays an important role later on, when dealing with latent and incubation periods. Limited growth: a non-linear equation for I. In real applications, the unlimited exponential growth predicted by equation (2.5) is no longer observed after some time. Concerning infectious diseases, after some time an infectious person also has adequate contacts (see Definition 2.1 (i)) with already infected persons; these contacts do not produce new infected individuals. Therefore a constant proportionality factor α in (2.3) is not realistic, and we change the equation in the following way. Instead of α there is the expression β(t) S(t)/N(t) with a positive continuous β. The coefficient β expresses how many adequate contacts one infectious individual has on average during a time unit with all persons of the population; hence β is also named contact coefficient or contact rate. We prefer the first name. In Paragraph 2.2.2 we discuss the relations of β to the contact and reproduction numbers mentioned above (see Definition 2.1 (ii)-(iv)). According to [16], β does not essentially depend on the size of a human population, contrary to infections spreading among animals. But there is generally a density dependence of β, see [20], p. 31: if more people live in the same place, the mean number of contacts grows. Thus, in applications, β is related to the territory considered, for instance to the whole of Germany, or only to its capital Berlin.
Here β has the unit 'per time', as in [16]; contrary to this, in [20], p. xxi, and Ch. 2, β has the unit 'per person', due to the difference-equation approach there. The factor S(t)/N(t) = s(t) is the fraction of susceptible persons within the whole population; it expresses the probability of meeting an individual of the class S at time t. Even if β is constant, the expression β S(t)/N(t) decreases, due to the class S shrinking in favour of I. Thus β(t) S(t)/N(t) is also named the effective contact coefficient or effective contact rate; from the viewpoint of I one can speak of an active effective infection coefficient. The whole expression on the right-hand side of İ(t) = β(t) S(t) I(t)/N(t) (2.7) is often named the standard incidence; it states how many cases of infection occur in a time unit. Equation (2.7) means that for S ≈ N, i.e. at the beginning with only a few infective individuals, the fraction S/N approximately equals one; thus, at this time, exponential growth still occurs. If I grows, this fraction becomes smaller, and the growth of I decelerates essentially; the effective contact coefficient in (2.7) tends to zero, at least for constant β and N. The realistic assumption S/N ≈ 1 at the beginning can be applied to linearise equation (2.7) as well as equations (2.8) and (2.43), (2.44) below. Based on this, qualitative investigations can be performed, see [27] and the references therein. We note that the contact coefficient β is a parameter of the infection course that is strongly influenced from "outside": it depends on the pathogen's properties as well as on social behaviour, namely on the contact habits within the considered population. β can be considerably reduced, voluntarily and by means of administrative measures. During the COVID-19 pandemic, political authorities in many countries have ordered strong rules concerning limitations of contacts, distances to other persons, closings of schools and universities, churches, restaurants etc.
As a result, one observes a decreasing β and an end of exponential growth of accumulated cases in some countries, for instance in Germany. However, an easing of mandated measures and more carelessness may lead to an increasing β and to essentially more cases, as in Israel or Australia in July, and now, in August 2020, in some European countries. Equation (2.7) describes the special situation that all susceptible individuals will be infected in the course of time and will remain infectious (see (2.11)). If convalescence has to be taken into account, then this equation has to be changed to İ(t) = β(t) S(t) I(t)/N(t) − γ I(t). (2.8) That means that in a time unit the class I loses γ I(t) individuals due to recovery (and loss of infectiousness). The coefficient γ > 0 is the reciprocal of the mean duration of infectiousness Tinf, i.e., γ = 1/Tinf. Since the average duration of infectiousness is disease-specific, an assumed constant Tinf (and thus a constant γ) seems plausible. The value exp(−γ t) can be interpreted as the probability that an infected person is still infectious at time t (see Remark 2.4 (ii) and Paragraph 2.2.3); conversely, 1 − exp(−γ t) is the probability that an infected person has lost its infectiousness by time t. If infectiousness and disease have very different mean durations, the model could be extended. In the more complex models, one has to divide by N − D instead of N in (2.7) and (2.8), respectively, where D is the number of persons who died of the infection, see Paragraph 2.3.2. In accordance with our assumption at the beginning of Paragraph 2.2.1, there holds S(t) + I(t) = N. Contrary to (2.7), in the case of equation (2.8) the evolution of S and I is more complex, due to the "outflow" γ I(t); there will be an equilibrium, see the second case in Paragraph 2.2.1. Since the fractions s and ι lie between zero and one, they can also be interpreted as probabilities of meeting a susceptible and an infectious person, respectively, see [16], e.g. We return to fractions in connection with dimensional analysis in Paragraphs 4.1 and 4.3.1.
Relation to the logistic equation. (SI model) Substituting S by N − I in (2.7), one gets İ(t) = β(t) I(t)(1 − I(t)/N). (2.11) For constant β and N this is the (classical) logistic differential equation. For an initial value 0 < I0 < N, N = const. > 0, and continuous β(t) ≥ 0, the unique solution to (2.11) is given by I(t) = N I0 / (I0 + (N − I0) exp(−∫0t β(τ) dτ)). (2.12) For β with ∫0∞ β(τ) dτ = ∞ the solution grows asymptotically to N; this case includes β = const. > 0. The logistic differential equation is discussed in detail in many textbooks on ordinary differential equations and on biomathematics, see for instance [18, 34, 35]. (SIS model) Substituting S by N − I in (2.8), one gets İ(t) = β I(t)(1 − I(t)/N) − γ I(t). (2.14) For constant β, γ and N with 0 < γ < β and 0 < I0 < N, the unique solution to (2.14) is given by I(t) = K I0 / (I0 + (K − I0) exp(−(β − γ) t)) with K := N (1 − γ/β). (2.15) Hence the solution grows asymptotically to N(1 − γ/β) < N, and a dynamic equilibrium between infected and susceptible persons is reached after a certain time. In Definition 2.1 (ii)-(iv), contact and reproduction numbers were defined independently of concrete infection models; generally, these numbers play an important role in epidemiology. Now the aim is to find concrete expressions for R0, σ and ϱ in the case of models with equation (2.8) for the infectious infected persons. In doing so, some modifications arise compared with Definition 2.1. At first we assume a constant contact coefficient β. As described after equation (2.8), this leads to σ(t) = β(t)/γ and ϱ(t) = σ(t) S(t)/N = β(t) S(t)/(γ N); (2.17) as already stated, R0 remains constant in either case. The factor S(t)/N takes into account that the replacement number ϱ counts only adequate contacts with susceptible persons (during the period of infectiousness); this is the difference to σ, cf. Definition 2.1 (iii), (iv). A detailed inspection shows that σ(t) and ϱ(t) need some explanation. σ(t) and ϱ(t) are indeed the current contact and replacement numbers, respectively: they give the average number of adequate contacts (with all persons and with susceptible ones, respectively) of a newly infectious person, assuming that β(t) (and S(t)/N) remain constant during its infectiousness.
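The asymptotic behaviour of the SIS equation (2.14) can be verified with a minimal numerical sketch (forward Euler, arbitrary illustrative parameters, not fitted to data): for β > γ the solution approaches the equilibrium N(1 − γ/β) from (2.15), while for β < γ the infection dies out.

```python
def simulate_sis(beta, gamma, n, i0, t_end, dt=1e-3):
    """Forward-Euler integration of the SIS equation (2.14):
    I'(t) = beta * I * (1 - I/n) - gamma * I."""
    i, t = i0, 0.0
    while t < t_end - 1e-12:
        i += dt * (beta * i * (1.0 - i / n) - gamma * i)
        t += dt
    return i

# Illustrative parameters: beta > gamma, so an epidemic occurs.
beta, gamma, n, i0 = 0.5, 0.2, 10_000.0, 10.0

i_final = simulate_sis(beta, gamma, n, i0, t_end=200.0)
i_star = n * (1.0 - gamma / beta)   # dynamic equilibrium from (2.15)
assert abs(i_final - i_star) / i_star < 1e-3

# With beta < gamma the infection dies out instead of reaching an equilibrium.
assert simulate_sis(0.1, 0.2, n, i0, t_end=200.0) < 1e-2
```

The equilibrium of the Euler iteration coincides with the exact equilibrium of (2.14), since fixed points of the Euler map are exactly the zeros of the right-hand side.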
Thus, in some sense, σ(t) and ϱ(t) are virtual contact and replacement numbers, describing a frozen state. However, just these numbers play an important role, see Paragraphs 4.3.1 and 4.3.2. Based on Definition 2.1, the individual (real) contact and replacement numbers of an individual (still with averaged contact behaviour) can be calculated by integrating β(t) and β(t) S(t)/N, respectively, over the individual's period of infectiousness; here the mean duration of infectiousness Tinf is used, together with the begin of infectiousness of the individual. It is thinkable to use an individual duration of infectiousness. At least in this study, these individual numbers do not play any role. Even for constant β, the replacement number ϱ depends on time. Equation (2.17) confirms the assertion mentioned above that the quantities σ, ϱ and R0 are dimensionless; in other words, they are numbers in the sense of Remark 2.2 (i). Note that for more complex models the expressions for σ and ϱ generally differ from (2.17), see Remark 3.1 and Section 4.3.1. Equation (2.8) and the last relation in (2.17) allow the following interpretation: the current replacement number ϱ(t) is the ratio of the inflow β S I/N and the outflow γ I of infectious persons, see [24], also for an application of this idea. Thus the meaning of the replacement number can be well explained. In accordance with many deterministic epidemiological models, an infection within a completely susceptible population can start if and only if R0 > 1. For further discussion and for the question of how to determine R0 with the help of available data, we refer to [16, 20] and, especially concerning SARS-CoV-2, to [2, 24, 36] and to the references therein. In connection with dimensional analysis we return to ϱ in Paragraphs 4.3.1 and 4.3.2. We close this paragraph with remarks concerning the use of some terms. Remarks 2.5. (On parameters, numbers and rates) (i) A parameter and a coefficient are understood as additional quantities in equations or as quantities formed by them. Parameters can be provided with units; then they have a dimension.
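The interpretation of the current replacement number as the ratio of inflow and outflow can be checked along a simulated trajectory. The sketch below (Euler integration of (2.8) with S = N − I, arbitrary illustrative parameters) computes ϱ(t) = β S(t)/(γ N) as in (2.17): at the start, with S ≈ N, ϱ is close to R0 = β/γ, and at the dynamic equilibrium the inflow balances the outflow, so ϱ tends to one.

```python
def sis_trajectory(beta, gamma, n, i0, t_end, dt=1e-3):
    """Euler integration of (2.8) with S = n - I; returns (I(t_end), rho(t_end)),
    where rho(t) = beta*S(t)/(gamma*n) is the current replacement number (2.17)."""
    i, t = i0, 0.0
    while t < t_end - 1e-12:
        s = n - i
        i += dt * (beta * s * i / n - gamma * i)
        t += dt
    return i, beta * (n - i) / (gamma * n)

beta, gamma, n = 0.5, 0.2, 10_000.0
_, rho_start = sis_trajectory(beta, gamma, n, i0=10.0, t_end=0.0)
_, rho_end = sis_trajectory(beta, gamma, n, i0=10.0, t_end=200.0)

# Near the start, S ~ N, so rho ~ R0 = beta/gamma = 2.5 > 1: the epidemic grows.
assert abs(rho_start - beta / gamma) < 0.01
# At the dynamic equilibrium, inflow equals outflow, so rho -> 1.
assert abs(rho_end - 1.0) < 1e-3
```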
If they occur without units, they are called dimensionless; in that case they are also called (characteristic) numbers or ratios. As an example, the coefficient α in (2.3), and likewise the contact coefficient β, has the dimension 'reciprocal time', i.e. 1/T (T stands for the dimension time). The basic reproduction number R0, the replacement number ϱ and the contact number σ are dimensionless and thus (characteristic) numbers. Note that in current discussions about the SARS-CoV-2 pandemic the terms reproduction rate and infection rate are sometimes used incorrectly; this may lead to confusion among the broad public. (ii) A rate is often understood as a quantity whose dimension has time in the denominator. For instance, the contact coefficient β is a rate; moreover, the whole right-hand side β(t) S(t) I(t)/N(t) of (2.7) is a rate, the total contact rate. The derivations of the differential equations (2.3) and (2.7) have been performed under the assumption that each infected person is infectious from the beginning and for all time. Generally this is not the case. Equation (2.8) models a finite infectiousness, assuming an exponential decay, cf. Remark 2.4 (ii), which need not be the case. Thus general approaches were developed which contain stochastic moments explicitly and lead to integral equations for I; in a special case this is equivalent to differential equations. We explain this in short, following [32, 33]. Since models with differential equations are in the focus of this study, we do not apply this approach to more complex models like SEIRD ones. For t ≥ 0 we denote by P(t) the probability that an infected person is still infectious at time t. We assume: (i) the infectiousness begins immediately after an adequate contact; (ii) there are no death cases, neither infection-related nor other ones. These two assumptions are made to focus on the essential items; if necessary, the model can be suitably extended (see Remark 2.7 (i), (ii)).
Let P fulfil P(0) = 1 and monotone non-increase (2.20) as well as ∫0∞ P(s) ds = Tinf; (2.21) again, 0 < Tinf < ∞ is the average duration of infectiousness (see Remark 2.4 (ii) as well as the explications after (2.8)). For the case Tinf = ∞ we refer to [33]. We follow [33] with small changes, furthermore allowing a continuous β = β(t). The approach for the infected persons is given by I(t) = I0 P(t) + ∫0t β(τ) S(τ) I(τ)/N(τ) · P(t − τ) dτ. (2.22) Here I0 P(t) is the (probable) number of initially infected persons still infectious at time t; the integral is the sum of persons infected at time τ who are (probably) still infectious at time t. Remark 2.6. Contrary to (2.8), the approach in (2.22) contains a stochastic moment explicitly. However, let us remark that the approach in (2.8) also contains a stochastic moment, indirectly, via the concept of adequate contacts, which is averaged over a generally large number of individuals. Now we consider two special cases for P. Exponentially decaying infectiousness. Assume P(t) = exp(−γ t). (2.23) From condition (2.21) it follows that γ = 1/Tinf; as mentioned above, γ is the reciprocal mean duration of infectiousness. In [15] it was proved that under assumption (2.23) the integral approach (2.22) is equivalent to the one with the differential equation (2.8). A probability tending to zero causes a decay of infective individuals. Infectivity of equal duration. Now it is assumed that the common duration of infectiousness for all infective persons amounts to 0 < Tinf < ∞. This corresponds to the function P(t) = 1 for t ≤ Tinf and P(t) = 0 for t > Tinf. (2.26) The resulting two integral equations are equivalent to an initial-value problem for a differential equation as well as to an initial-value problem for a differential equation with delay, more precisely to the problem (2.27) on [0, Tinf] and, for t > Tinf, its continuation with the delayed loss term, whose history on [0, Tinf] is the solution of problem (2.27). In order to avoid a jump function, regularisations can be used (see Remark 2.7 (iii)). If I0 = 0, then I is also continuous at t = Tinf; however, it is then identically zero. We close this paragraph with additional remarks. Remarks 2.7. (i) The case of a delayed infectiousness of infected persons has not been considered in this paragraph.
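The equivalence of the integral approach (2.22) with the differential equation (2.8) for an exponentially decaying P, proved in [15], can also be checked numerically. The sketch below uses an explicit left-endpoint quadrature of (2.22), the two-class setting S = N − I, and arbitrary illustrative parameters (not fitted to data), and compares the result with an Euler discretisation of (2.8).

```python
import math

def integral_model(beta, gamma, n, i0, t_end, dt=0.02):
    """Explicit discretisation of the integral equation (2.22) with
    exponentially decaying infectiousness P(t) = exp(-gamma*t), equation (2.23),
    and S(t) = n - I(t) (two classes only)."""
    steps = int(round(t_end / dt))
    I = [i0]
    for k in range(1, steps + 1):
        t_k = k * dt
        acc = i0 * math.exp(-gamma * t_k)       # initially infected, still infectious
        for j in range(k):                       # left-endpoint quadrature of the integral
            s_j = n - I[j]
            acc += beta * s_j * I[j] / n * math.exp(-gamma * (t_k - j * dt)) * dt
        I.append(acc)
    return I[-1]

def ode_model(beta, gamma, n, i0, t_end, dt=0.02):
    """Euler integration of the equivalent differential equation (2.8)."""
    i = i0
    for _ in range(int(round(t_end / dt))):
        i += dt * (beta * (n - i) * i / n - gamma * i)
    return i

beta, gamma, n, i0 = 0.5, 0.2, 1_000.0, 10.0
a = integral_model(beta, gamma, n, i0, t_end=20.0)
b = ode_model(beta, gamma, n, i0, t_end=20.0)
assert abs(a - b) / b < 0.05   # both discretisations approximate the same I(20)
```

Both schemes converge to the same solution as the step size dt tends to zero, reflecting the analytical equivalence.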
Two possibilities may be offered. First, one can introduce a further class E of already infected persons who cannot yet infect others (see Paragraph 2.1.5); the further procedure is analogous, and a probability has to be defined that determines how long an individual in E is not yet infectious. Another possibility would be to keep the class I as before and to define the function P differently from (2.20), setting it to zero on some initial time interval. Of course, this violates the monotonicity condition in (2.20); moreover, it is not clear whether the mathematical results in [32, 33] can be extended to this case. (ii) Cases of death, caused by the disease or not, can be taken into account by additional exponential terms within the integral equations for I, see [32, 33]. (iii) The jump function (2.26) can be regularised by a piecewise linear function (2.30) on an interval around Tinf of half-width between zero and Tinf: up to the left end of this interval all infected individuals are infectious, from its right end onward nobody is infectious any more, and between these points in time there is a linear decay of infectiousness. (iv) In [33] the presentation is more general than in our study, and mathematical results are presented. After dealing with equations for the class I of infectious infected persons, the aim is now to add further equations for the remaining classes like S, R and D. As a result, we get complete models suitable for several kinds of infection courses. In this Section 2.3 the focus lies on the development and reasoning of the models; in connection with the mathematical investigations in Section 4, the needed equations will be presented in a compact way and completed with initial and other conditions. In this paragraph we consider only models described with the two classes S and I. Either the infected persons remain in I for all time (SI model), or they return to S after a survived disease (SIS model). Therefore we need a suitable equation for S. We distinguish two basic cases, oriented to Paragraphs 2.2.1 and 2.2.3. Modelling with differential equations. Let us suppose that the evolution of I is given by the differential equation (2.8).
Clearly, the growth rate of I is the loss rate of S; therefore it yields Ṡ(t) = −β(t) S(t) I(t)/N(t) + γ I(t). (2.31) Neglecting birth and death rates as well as migration, i.e., setting N(t) = N0 = const., or assuming N = N(t) as given, we get a closed simple SIS model formed by (2.8) and (2.31) together with initial conditions for S and I. Clearly, for γ ≡ 0 it turns into an SI model. For mathematical and numerical investigations it is mostly sufficient to solve only one equation, after substituting S in (2.31) by N − I, for instance. Supplement: modelling with integral equations. As explained in Paragraph 2.2.3, the approach with a differential equation for I can be regarded as a special case corresponding to an exponentially decaying infectiousness. Let us repeat equation (2.22) again: since there are only the classes S and I, due to N(t) = S(t) + I(t), for P ≡ 1 (corresponding to permanent infectiousness and γ ≡ 0) and N(t) = N0 = const. the well-known SI model (2.8), (2.31) with γ ≡ 0 easily follows by taking the time derivative. In this simple case, as a rule, the formulation with differential equations is more convenient. The following models contain more than two classes. In SIRD and SIRSD models, the class I loses members to the class R of recovered persons as well as to the class D of infection-died individuals. In SEIRD and SEIRSD models there additionally occurs a special class E of already infected but not yet infectious persons. As a new issue, N in the denominator is substituted by N − D, since dead persons do not have any contacts. The numbers of individuals in R will not be subtracted from N: persons from R still have adequate contacts with S individuals, but these contacts do not lead to infections. The class I loses members to R and to D; thus equation (2.8) must be changed to İ(t) = β(t) S(t) I(t)/(N(t) − D(t)) − γ I(t) − μ I(t). (2.35) Again γ > 0, and Tinf = 1/γ is the mean duration of infectiousness. In this approach, the infectiousness ends with the disease, or the patient is regarded as recovered after his or her infectious period. The term −γ I(t) describes that infected persons leave class I.
here, a permanent immunity is assumed. the sum of the infectious, recovered and deceased classes represents the whole number of infected individuals, currently and formerly (accumulated cases). furthermore, the susceptible number can be excluded on the right-hand side via the balance with the total population. in applications, equation (2.40) can be used for determining parameters from measured data, or more precisely from daily reported numbers of newly infected, recovered and deceased persons. (ii) (sir model) sometimes, in the case of small lethality, the deceased persons are included in the recovered class, and the sird model becomes a sir one, see [28] , e.g. (iii) (sirsd model) a loss of immunity of recovered persons can be integrated in the following way. assuming a mean duration of immunity and taking its reciprocal as the rate of immunity loss, the corresponding term is the rate of loss for the recovered class and of benefit for the susceptible one. thus, it must be subtracted on the right-hand side of (2.36) and added on the right-hand side of (2.34). therefore, equations (2.34) and (2.36) are replaced by correspondingly modified ones. seird and seirsd models. now we consider a more general case. the susceptible class directly loses members to a class consisting of infected persons who are not yet infectious. after expiration of the latent period the individuals of this class become infectious; then they belong to an infectious class. besides this, we assume that after an overcome infection former infected persons go into the class of recovered individuals. either they remain there, being permanently immune, or they become susceptible again after some time. finally, infectious individuals may die of the infection-caused disease. the equation for the susceptible class is given by (2.34), assuming a permanent immunity. the special feature of the exposed class is expressed in the modelling, too. individuals from this class cannot be (newly) infected, and they cannot infect others. thus, the class can only grow if there are adequate contacts between susceptible and infectious persons. based on (2.8) we can write down the corresponding balance equation. here, the positive parameter can be referred to the reciprocal value of the latent period, see [21] . the removal term describes the loss of the exposed class to the infectious one.
this loss is the benefit for the infectious class; therefore, we can write the corresponding gain term there. again, the recovery rate is positive, and its reciprocal is the mean duration of infectiousness. contrary to the situation in subsection 2.2 and in paragraph 2.3.2, the infectiousness now begins only after the end of the latent period. the removal term describes that infected persons leave the infectious class. as in the previous paragraph, the infection-specific lethality coefficient is positive. the equations for the recovered and deceased classes are the same as in (2.36) and (2.37). finally, the differential equations (2.34), (2.36), (2.37), (2.43), (2.44), together with corresponding initial conditions (see subsection 3.2), describe a seird model for an assumed constant population. (seirsd model) the difference between seird and seirsd models is that recovered persons lose their immunity. again, as a consequence, equations (2.34) and (2.36) must be replaced by (2.41) and (2.42). now we want to discuss briefly how a variable population number can be taken into account within the models presented above. there are two possible approaches, see [33] : (i) the population development is controlled by rates for births, migration and deaths not caused by the infection under consideration. (ii) a modified logistic equation is used. here, we consider only the first case in more detail; for the second, more complex one we refer to [33] and to the references cited therein. in the case of a variable population number, deceased persons are not included in the population count, independently of the cause of death. the classes of deceased individuals can be calculated a posteriori. the temporal course of the population number is described by a differential equation along with an initial condition. the non-negative birth and death coefficients are related to the whole population, an additional non-negative death coefficient is related to infected persons, and there is a migration rate. all these quantities may be time-dependent continuous functions.
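a hedged sketch of one euler step of the seird system (2.34), (2.36), (2.37), (2.43), (2.44); the parameter names are illustrative stand-ins for the stripped symbols, and, following the text, the incidence denominator excludes the deceased:

```python
def seird_step(state, beta, eta, gamma, delta, dt):
    """One Euler step of a SEIRD system in the spirit of (2.34), (2.36),
    (2.37), (2.43), (2.44).  Illustrative parameter names:
    beta  - transmission coefficient,
    eta   - reciprocal latent period (E -> I),
    gamma - reciprocal infectious period (I -> R),
    delta - infection-specific lethality coefficient (I -> D).
    The incidence denominator is the living population N - D,
    since the deceased have no contacts."""
    s, e, i, r, d = state
    n_living = s + e + i + r          # equals N - D for a constant N
    ds = -beta * s * i / n_living
    de = beta * s * i / n_living - eta * e
    di = eta * e - (gamma + delta) * i
    dr = gamma * i
    dd = delta * i
    return (s + dt * ds, e + dt * de, i + dt * di, r + dt * dr, d + dt * dd)
```

since the right-hand sides sum to zero, the total population is conserved step by step, which gives a cheap sanity check for any implementation of the model.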
the migration rate can be regarded as the difference of immigration and emigration rates. here, it has been supposed that all newborns are susceptible (otherwise, an additional class is needed), and that only infectious infected persons can die of the infection. the initial values of the four classes should meaningfully lie between zero and the initial population number, fulfilling the condition (2.54) that they sum to it. it is easy to see that the right-hand side of (2.45) for the population number can be obtained by adding the right-hand sides of (2.50)-(2.53). hence, condition (2.54) ensures that for an existing unique solution the population number equals the sum of the class sizes for all times, see (2.55). therefore, in equations (2.50) and (2.51), the population number can be substituted, reducing the whole system of equations. the number of deceased persons can be calculated a posteriori. remark 2.9. the coefficients introduced above are all non-negative. thus, the model allows an unlimited immigration, but not a suchlike emigration. strictly speaking, all emigration rates must be equipped with switch-off functions which prevent negative class sizes. we drop this here, because an analogous problem arises in section 3 in connection with the vaccination rate; there, a suitable switch-off function will be constructed. now it is the aim to develop a model which is better adjusted to the findings known so far about sars-cov-2, without aspiring to the elaborateness of [21] . in particular, we do not divide the classes into further subclasses in accordance with social or other aspects. some findings relevant for modelling are: latent and incubation periods differ, infected persons are already infectious before the end of the incubation period, and there exist infectious infected persons without or only with light symptoms, see [1, 2], e.g. thus, we introduce an exposed class and subdivide the infectious class. additionally, we take into account that recovered persons lose their acquired immunity and that susceptible individuals become temporarily immune via vaccination.
for a better overview we assume a constant population number. the considerations concerning a variable population in paragraph 2.3.3 can be included without any difficulties, if needed. a specific feature of the present model is that the class of infectious infected individuals is subdivided (see also paragraph 2.1.5). (i) the class of infected, but not yet infectious persons loses to the class i1, consisting of individuals who are already infectious but do not have any disease symptoms. the mean length of stay in i1 equals the difference between the incubation and latent periods. (ii) after the end of the incubation period the class i1 loses in parallel to two further classes, more precisely to (a) the class i21, consisting of furthermore infectious infected persons without or with only weak symptoms; thus, the affected are not aware of their illness. the loss of infectiousness is assumed to be the beginning of recovery after a mean 'disease duration'; (b) the class i22, consisting of furthermore infectious infected persons exhibiting stronger symptoms and feeling ill; thus, the affected are either in home isolation or in a hospital. their recovery also starts with the loss of infectiousness after a mean 'disease duration'. if the start of recovery and the end of infectiousness do not coincide, then, when indicated, the classes i21 and i22 have to be subdivided further. we drop this here. however, the model allows different lengths of disease periods. the partition into i21 and i22 is relevant because it is plausible that the individuals exhibit different contact behaviour: people with remarkable disease symptoms generally behave more carefully or are even hospitalized. in both cases one can assume that they infect essentially fewer susceptible persons. summarising the considerations above, we now write down all equations. for the susceptible class we present a modified version of (2.41). the partition parameter fulfils the strict inequalities between zero and one.
(3.7) the two complementary fractions reflect the partition of infectious infected individuals from i1 into i21 and i22 after the end of the incubation period. the partition parameter is also called the manifestation index, since it characterizes the fraction of ill persons with remarkable symptoms. the mean 'disease durations' are the reciprocal values of the recovery rates of i21 and i22, counted from the respective points of time of recovery since infection. that the recovery point for i22 lies later than that for i21 seems plausible, but it has no mathematical relevance. the lethality coefficient is positive; the model approach is that only heavily diseased persons die of the infection. now, the recovered class consists of individuals stemming from i21 and i22 as well as of persons immunized by vaccination. if recovered persons lose their immunity, they return to the susceptible class. thus, the equations for the recovered and deceased classes follow. mostly, one chooses the initial number of recovered persons to be zero, but this is not mathematically required. the initial values should be consistent with the initial population number; therefore, we assume the sum condition (3.17). for the start of an infection there must hold, as a minimal requirement, that the initial numbers of exposed and infectious persons do not all vanish, see (3.18). again, this is not an actual mathematical assumption, but a vanishing sum in (3.18) greatly simplifies the problem. remarks 3.1. (i) (different durations of immunity) in the model it is implemented that recovered and vaccinated susceptible persons have immunity of the same mean duration. if this is not the case, one could consider a further class of "immune after vaccination". equally, a further class could be convenient if, for instance, recovered persons are permanently immune and vaccinated ones only for some time. in the replacement number below, the first addend represents the i1 individuals, the second one the i21 members, and, finally, the third one is related to the i22 individuals, taking the infection-related death coefficient into account (see remark 4.6 for the reasoning). moreover, after the incubation period, an infected individual belongs to i22 with the probability given by the manifestation index and to i21 with the complementary probability.
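the complete sei3rsd right-hand side can be sketched as follows; all parameter names are assumptions reconstructed from the verbal description (manifestation index sigma, latent rate eta, and so on), not the paper's notation, and the vaccination term v is taken here as a given rate:

```python
def sei3rsd_rhs(state, p):
    """Right-hand side sketch of the SEI3RSD model of section 3.
    State: (S, E, I1, I21, I22, R, D).  Illustrative parameter names:
    beta1, beta21, beta22 - transmission from I1, I21, I22,
    eta    - reciprocal latent period (E -> I1),
    kappa  - reciprocal of (incubation - latent) period (I1 -> I21/I22),
    sigma  - manifestation index, fraction entering I22,
    g21, g22 - recovery rates of I21 and I22,
    delta  - lethality, acting on I22 only,
    rho    - rate of immunity loss (R -> S),
    v      - effective vaccination rate (S -> R)."""
    s, e, i1, i21, i22, r, d = state
    n0 = s + e + i1 + i21 + i22 + r + d
    force = (p["beta1"] * i1 + p["beta21"] * i21 + p["beta22"] * i22) / (n0 - d)
    ds = -force * s + p["rho"] * r - p["v"]
    de = force * s - p["eta"] * e
    di1 = p["eta"] * e - p["kappa"] * i1
    di21 = (1.0 - p["sigma"]) * p["kappa"] * i1 - p["g21"] * i21
    di22 = p["sigma"] * p["kappa"] * i1 - (p["g22"] + p["delta"]) * i22
    dr = p["g21"] * i21 + p["g22"] * i22 - p["rho"] * r + p["v"]
    dd = p["delta"] * i22
    return (ds, de, di1, di21, di22, dr, dd)
```

as in the simpler models, the derivatives sum to zero, so the constant-population assumption of this section can be checked numerically.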
hence, one gets the corresponding (virtual) replacement number (reproduction number), observing (2.16). the vaccination term in (3.1) indicates how many persons are vaccinated in a time unit. one part of it is the actual vaccination rate; it is justified if there are sufficiently many persons available for vaccination, which is surely the case at the beginning of a vaccination campaign. the other part is a control function: if there are no persons left to be vaccinated, the vaccination process ends; otherwise, the susceptible number would at some time become negative for purely mathematical reasons. for the actual vaccination rate we propose a piecewise linear approach: formula (3.21) means that if the starting time is zero, the vaccination campaign starts with the beginning of the infection. in the current case of covid-19 this is impossible; hence, a positive starting time reflects the real situation. after that, the rate grows linearly up to the final rate, i.e., until the limit of capacity is reached. this approach seems realistic for a new vaccine whose production has to be started up and whose distribution surely has to be improved. clearly, other approaches than (3.21) are possible. the final rate has the dimension of persons per time. the control function ensures that the vaccination campaign ends if the number of susceptible persons approaches zero. at first we define a dimensionless cut function for an arbitrarily chosen and fixed positive threshold via (3.23): it is linear for arguments between zero and the threshold, one above it, and zero for negative arguments. the slope between zero and one is the reciprocal of the threshold. if the threshold tends to zero, the cut function tends (pointwise, but not uniformly) to a jump function; due to its discontinuity at zero, we want to avoid the latter for mathematical reasons. finally, we define the control term by (3.24). for reasons of dimensional homogeneity, the susceptible fraction has been chosen as the argument of the cut function. this will be beneficial later on. as (3.25) shows, in applications the threshold value has to be chosen suitably small. generally, at the beginning of the infection course, one can expect the susceptible fraction to be close to one.
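the piecewise-linear vaccination rate (3.21), the cut function (3.23) and their product (3.24) can be sketched as follows; the names t1 (campaign start), t2 (end of ramp-up), w1 (capacity limit) and eps (cut threshold) are illustrative, since the original symbols did not survive extraction:

```python
def ramp_rate(t, t1, t2, w1):
    """Piecewise-linear vaccination rate in the spirit of (3.21):
    zero before the campaign start t1, linear ramp-up until t2,
    then the capacity limit w1 (persons per time unit)."""
    if t <= t1:
        return 0.0
    if t >= t2:
        return w1
    return w1 * (t - t1) / (t2 - t1)


def cut(x, eps):
    """Dimensionless cut function in the spirit of (3.23): zero for
    x < 0, linear with slope 1/eps on [0, eps], one above eps."""
    if x < 0.0:
        return 0.0
    if x <= eps:
        return x / eps
    return 1.0


def vaccination_term(t, s, n0, t1, t2, w1, eps):
    """Effective vaccination rate, cf. (3.24): the ramp is switched off
    smoothly as the susceptible fraction s/n0 approaches zero."""
    return ramp_rate(t, t1, t2, w1) * cut(s / n0, eps)
```

with the susceptible fraction close to one and eps chosen much smaller, the cut factor equals one and the model is unaffected at the beginning, exactly as argued in the text.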
thus, the threshold should be small, more precisely much smaller than the initial susceptible fraction, in order to avoid an essential influence on the model at the beginning. for mathematical reasons the control function is also defined for negative arguments. under mild assumptions the non-negativity of the susceptible number can be proved (see section 4.2); thus, there is no practical consequence in applications. the sei3rsd model developed above takes several known findings about sars-cov-2 and covid-19 into account. however, it is less complex than the model developed and numerically investigated in [21] . if necessary, the modular design principle allows extensions and simplifications of the model described above. this model is a continuous one; the arising mathematical task is an initial-value problem for a system of ordinary differential equations (ode). as an advantage, the extensive mathematical instruments on the theory and numerics of ode are readily available. on the other hand, the population is discrete. therefore, to correspond as well as possible to reality, "sufficiently large" populations must be assumed. an alternative to a continuous model is a discrete one based on difference equations, see [20] , e.g. generally, concerning models of phenomena in the empirical sciences, an important task consists in validation and verification, in order to decide under which conditions the applied models yield sufficiently good approximations to reality. in particular, a typical question is whether a sird model can describe a real course of infection sufficiently well, or whether a seird model must be chosen. a general drawback of modelling with ode is the absence of spatial dependence of the functions sought, here the class sizes s, e, i1, i21, i22, r and d. however, in many countries the spread of sars-cov-2 differs essentially from region to region. therefore, the models discussed here only reflect a mean situation, for instance in germany or in a single federal state.
a way out with partial differential equations (pde) containing terms describing local movements of population parts seems theoretically possible. surely, large difficulties would arise in determining the various parameter functions. instead of population numbers, corresponding density functions would have to be used (persons per square kilometre, e.g.). additionally, there may be non-local effects caused by travelling that exceeds the usual daily movement in the surrounding area. hence, it is not a surprise that the author did not find corresponding references to pde. in [16] and in [20] , pde are briefly addressed to model a dependence on age. now, the aim is to investigate the mathematical problem arising from the sei3rsd model. at first, we formulate in paragraph 4.1 an equivalent problem in dimensionless quantities. this so-called non-dimensionalized form may have some advantages. after that, in paragraph 4.2, we prove the existence and uniqueness of a global solution and study its properties. we focus on the sei3rsd model, although many results can be applied in modified form to less complex models like sir ones. finally, in paragraph 4.3, we discuss some issues related to contact and replacement numbers for some models. now, we want to deal in more detail with the dimensions of the functions and parameters involved in problem (3.1), (3.3)-(3.6), (3.8)-(3.16). for lack of space, we do not repeat all equations and initial conditions here, but we write them in an equivalent form with dimensionless quantities. this procedure is also called non-dimensionalisation. an advantage of this procedure is generally a reduction of the number of parameters determining the equivalent problem. sometimes, there can be an advantage in numerical investigations. moreover, the influence of the parameters and their interplay can be well investigated. for more information we refer exemplarily to [37] [38] [39] [40] as well as to a compact presentation in the lecture notes [41] (chapter 7).
we record that all numbers of persons, s, e, i1, i21, i22, r, d and their initial values, have the dimension of persons; corresponding units may be a single person as well as other ones like 100 persons. the time derivatives of these functions have the dimension of persons per time. in connection with epidemics, time is often measured in the unit of days. bearing this information in mind, it follows that all rate parameters have the dimension of a reciprocal time. here, the existence interval may be finite or infinite; the index points to 'existence interval'. the choice in (4.1) and (4.2) is motivated by the findings on sars-cov-2 known up to now. the transmission parameters for i1 and i22 strongly depend on human contact behaviour; the lethality coefficient seems to be influenced by the capacities of the health systems in the various countries. the constancy of the remaining parameters is not mathematically required. a vanishing immunity-loss rate means permanent immunity. the continuity requirements in (4.1) allow us to work within the framework of continuously differentiable solutions, see remark 4.4 (i). the next step is to define a dimensionless time (number), see remark 4.2 (iii); we choose the approach (4.3). for a better overview we usually do not write the time argument; derivatives with respect to the dimensionless time are also indicated by a dot. the dimensionless vaccination function has the form (4.14). taking (3.23), (4.13) and (4.14) into account, the dimensionless control term is again a piecewise linear cut function (4.15): linear on the initial interval and zero for negative arguments. the initial values satisfy the dimensionless sum condition (4.23), stating that the seven initial fractions sum to one. the additional condition (3.18), necessary for a release of infection, leads to its dimensionless counterpart (4.24). the mathematical problem (4.6)-(4.12), (4.16)-(4.22) consists in determining seven dimensionless functions s, e, i1, i21, i22, r, d, whose argument is the dimensionless time (number).
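as a small illustration of the non-dimensionalisation, a plain sir model scaled by the initial population and by the mean infectious period is governed by the single ratio of transmission to recovery rate; the function below is a sketch under these assumptions, not taken from the paper:

```python
def nondimensionalize_sir(beta, gamma, s0, i0, n0):
    """Non-dimensionalisation sketch for a plain SIR model: scaling
    population numbers by N0 and time by the mean infectious period
    1/gamma turns  S' = -beta*S*I/N0,  I' = beta*S*I/N0 - gamma*I
    into  ds/dtau = -r0*s*i,  di/dtau = r0*s*i - i,
    governed by the single dimensionless parameter r0 = beta/gamma
    instead of the pair (beta, gamma).  Returns (r0, s0/N0, i0/N0)."""
    r0 = beta / gamma
    return r0, s0 / n0, i0 / n0
```

two parameter pairs with the same ratio produce the same dimensionless problem, which is exactly the reduction of the parameter count mentioned in the text.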
this problem is governed by the seven (dimensionless) parameter functions in (4.5), by three numbers and the threshold value induced by the vaccination rate, as well as by seven dimensionless initial conditions. now, the aim is to define a solution to the non-dimensionalized problem (4.6)-(4.12), (4.16)-(4.22) and hence also for the original problem (3.1), (3.3)-(3.6), (3.8)-(3.16). we use standard mathematical notations for spaces of continuous and continuously differentiable functions with values in ℝ as well as in ℝⁿ (n ∈ ℕ), see [42] , e.g. for the theory of ordinary differential equations (ode) we refer exemplarily to [34, 35, 43] . the preceding definition also contains assumptions which are not mandatory; however, in doing so, we do not need to repeat them below. for convenience we define local solutions on closed intervals. this is also not mandatory. analogously, local and global solutions to the original problem (3.1), (3.3)-(3.6), (3.8)-(3.16) can be defined. the following important assertion holds. in many cases, the corresponding non-dimensionalized problems are simplifications of the equivalent original ones. (here, the non-dimensionalized problem is governed by only seven parameters instead of eight.) as a consequence, theoretical and numerical investigations may be easier, see section 4.3 for examples. after dealing with the non-dimensionalized problem, the transformation back to the solution of the original problem is performed with the simple scaling in (4.25) for all solution components, without noteworthy effort. we close this paragraph with some remarks. for instance, in [29] , the above problem is considered with variable parameters, and the time scale is chosen via the initial value of the transmission rate. another approach uses the latent period for the time scaling. again, based on further investigations, a chosen approach can prove to be more or less convenient.
(ii) as can be seen in (4.5), the behaviour of the solution of the non-dimensionalized problem does not depend on the absolute magnitudes of the parameters, but only on their ratios. this general finding plays an important role in model theory, in particular in hydro- and aerodynamics as well as in heat conduction; we refer to [39, 40] , e.g. (iii) a great advantage of the dimensional analysis is that characteristics of the dimensionless functions s, e, i1, i21, i22, r, d, like possible maxima or inflection points of the curves, depend only on the dimensionless parameters and the (dimensionless) initial values. based on practical experience, on theoretical considerations as well as on real and numerical experiments, the real influence of the characteristic numbers (dimensionless parameters) can be estimated. theorem 4.3 (on the solution behaviour of the initial-value problem) states, among other things, two-sided exponential bounds of the form (4.36) for the component i22, and that the function d is monotonically increasing. proof. the proof consists of several steps. at first, the existence of a unique local solution is proven via a special auxiliary problem. as a result, the original problem has exactly one local solution whose components are non-negative on the existence interval and whose sum is one. additionally, the inequalities in (4.32)-(4.38) are valid. finally, a suitable a-priori estimate on an arbitrarily chosen interval allows the continuation to a unique global solution. (i) (construction of an auxiliary problem) to prove the non-negativity of the components of a local solution and to get a good basis for a-priori estimates, we consider a special auxiliary problem. for this purpose we define two functions, and we change the equations (4.6)-(4.12) in the following way.
every local solution to the auxiliary problem is also a solution to the following problem. now, we want to derive an estimate for the function s, taking into account that the local solution to the auxiliary problem is at once a local solution to the problem under consideration fulfilling (4.47). from (4.48) one obtains the representation formula for a solution to a linear differential equation of first order. taking (4.47), (3.2), the bound by one, (4.13) and (4.15) into account, the assertions in (4.32) follow on the local existence interval. in particular, the resulting estimate is of special importance for the continuation of the local solution. the remaining assertions in (4.32)-(4.38) follow in the same manner via corresponding representation formulas, for the present on the local existence interval. (iv) (continuation of the local solution to a global one) let an arbitrary finite time horizon be chosen and fixed. due to (4.31) the quantities defined there are monotonically increasing with a growing horizon. hence, suitable uniform bounds hold, and we can define the number needed for the standard continuation argument in (4.59), see [43] , e.g. (ii) since s is positive on each finite interval, the remaining functions cannot be equal to one on any finite interval. (iii) positive initial values of e, i1, i21, i22, r, d lead to positive values on finite intervals. (iv) the essential assertions of theorem 4.3 can also be proven for initial-value problems arising from other models, for instance from a sird model [44] . in [29] , the above problem with variable coefficients has been dealt with; the changes are only technical. above all, one can conjecture that analogous assertions hold for more complex models like that in [21] ; however, the technical effort would be considerably larger. we end the present study with some remarks on qualitative properties of some of the models considered above. the focus lies on a connection with items of the dimensional analysis. an important question is where the current contact and replacement numbers, respectively, occur in the non-dimensionalized equations.
clearly, there are many papers on qualitative behaviour, including numerical examples and discussions of former real infection courses; we refer to [16, 20, 27] and to the literature cited therein. this paragraph is structured as follows. at first, we consider sirsd and sird models; they are of middle complexity. after that, we deal briefly with sir and sis models; they are special cases in some sense, but they obey some features of their own. at the end, we consider seirsd and sei3rd models. obviously, under corresponding mild assumptions, lemma 2 and theorem 4.3 can easily be applied to the subsequent mathematical problems arising from sirsd, sird, sir and sis models; in particular, all solution components like s, i etc. are non-negative. we want to return to the role of the replacement number (reproduction number) in the light of the dimensional analysis, see paragraph 2.2.2. for this reason, we consider a sirsd model given by the corresponding system of differential equations. based on lemma 2, the denominator 1 − d equals s + i + r, and hence equations (4.68) and (4.69) can be reformulated without d. for convenience, we repeat all equations and get the equivalent system: as a consequence, the first three equations do not contain d. they can be solved separately; after that, d can be obtained via simple integration. if additionally the immunity-loss rate vanishes, at first the subsystem for s and i can be solved, and afterwards r and d can be obtained. this is an advantage for numerical studies. now, we want to get the specific expressions for the basic reproduction number r0, the contact number and the replacement number. we recall the ideas in paragraph 2.2.2. an easy way to find the contact number consists in taking the ratio of inflow and outflow in (4.74) (or in (4.62)), see (4.77). thus, the (current) contact number and the basic reproduction number r0 follow accordingly. as already stated, these quantities are dimensionless, and r0 is always a constant. the following assertions easily follow.
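the inflow/outflow recipe for the contact number can be sketched as follows; since the outflow terms of (4.77) are not fully recoverable from the extracted text, the chosen denominator (recovery plus infection lethality plus an optional general death rate) is an assumption:

```python
def contact_number(beta, gamma, delta, mu=0.0):
    """Contact number for a SIRD-type model, obtained as the ratio of
    inflow (transmission beta) and outflow (recovery gamma, infection
    lethality delta, general death rate mu) of the infectious class,
    in the spirit of (4.77).  Parameter names are illustrative."""
    return beta / (gamma + delta + mu)


def replacement_number(beta, gamma, delta, s_fraction, mu=0.0):
    """Current replacement (reproduction) number r = c * s: it equals
    the basic reproduction number r0 when the whole population is
    susceptible (s = 1), cf. (4.78)."""
    return contact_number(beta, gamma, delta, mu) * s_fraction
```

as the text notes, a larger lethality enlarges the outflow and therefore brakes the spread, which is visible directly in the denominator.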
(4.79) gives a necessary condition; a sufficient and necessary condition is given by the sharper inequality (4.80). however, since the initial susceptible fraction is approximately one in many cases, in reality condition (4.79) is often also sufficient. (iv) the lethality coefficient has a significant influence only if it has the same order of magnitude as the recovery rate. and if this is the case, then the spread of infection is essentially braked by death cases. this can explain why evolution led to pathogens which (as a rule) do not have too high a lethality. (v) the function i can only grow as long as the replacement number exceeds one. (vi) if the contact number times the susceptible fraction falls below one, then the replacement number does as well, and i decreases. (vii) if the immunity-loss rate is zero, then r can only grow, and s can only fall; the case of a positive immunity-loss rate is more complex and needs detailed investigations. remark 4.6. (death-adjusted infectious period) in simple models like sis and sir, the contact number equals the transmission rate times the mean infectious period (see formula (2.16)). these models are applied if infection-related deaths (and death cases generally) do not play a significant role. however, considering a sird model as above, the mean effective infectious period (or the mean death-adjusted infectious period, see [16] ) is not simply the reciprocal recovery rate, but has to be corrected for deaths; the corrected quantity represents the mean effective infectious period (see formula (2.21)). thus, besides using the ratio of inflow and outflow, there is an alternative way to obtain the relations (4.77) and (4.78) via a mean effective duration of infectiousness. obviously, sirs and sir models are special cases of sirsd and sird ones. thus, the assertions in the preceding paragraph remain valid after slight modifications. however, neglecting the lethality coefficient, and thus the deceased class, the contact number directly occurs in the equations for s and i. this can be done if the lethality is much smaller than the recovery rate. therefore, setting the lethality and immunity-loss rates to zero, from (4.77) and (4.78) one obtains the classical expressions (4.84). for a vanishing immunity loss a sirs model becomes a sir one. thus, due to (4.86), i grows until s has fallen to its threshold value 1/r0. concerning covid-19, r0 = 3 seems to be realistic.
therefore, the threshold susceptible fraction is 0.33. if there is no change of contact behaviour, two-thirds (herd immunity) of an initially fully susceptible population will be infected before the epidemic comes to an end. note that this consideration is based on an assumed permanent immunity of recovered persons. (ii) (si and sis models) these models have already been considered in paragraph 2.2.1. in [18] , chapter 5.2, the corresponding system (without deaths) is studied for the class fractions, without transforming the time. in [28] , a sir model is investigated in detail, analytically as well as numerically. now, we want to study how important dimensionless numbers influence the infection course in the case of more complex models. in [27] , a special seir1r2 model has been investigated, addressing the beginning of the sars-cov-2 epidemic in france. to focus on the main items, we deal at first with the following seir model (4.88)-(4.93). for a better overview we drop a possible loss of immunity and vaccination; besides, infection-related deaths are included in r. if necessary, these items can be added without any special effort. obviously, the system (4.88)-(4.91) is a simplification of (4.6)-(4.12). we set the initial conditions as in (4.16)-(4.21) and the condition (4.23) with d0 = 0. to avoid trivial cases, we assume that the initial exposed and infectious numbers do not both vanish. thus, for a contact number above one and small initial infected fractions (4.96), the sum e + i grows as long as the replacement number exceeds one, or, equivalently, as long as s exceeds the reciprocal contact number. the growing sum e + i allows three cases: (i) e and i grow; (ii) e grows, i decreases; (iii) e decreases, i grows. these cases are determined by the initial conditions for e and i. let e0 > 0. then, due to theorem 4.3, e > 0 for all time. hence, we can rewrite the equations (4.89) and (4.90) in terms of the ratio i/e. the initial value i0/e0 of the ratio i/e determines the beginning of the evolution of e and i. if this ratio lies between the two critical values in (4.100), then at least for some time after the beginning both functions e and i grow. in any case the ratio tends to a constant limit; in other words, (4.101) holds.
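the numerical statements of this passage can be checked with a short sketch; all names are illustrative, and the seir parameters assume s close to one during the early growth phase:

```python
def herd_immunity_fraction(r0):
    """Fraction of an initially fully susceptible population that is
    infected before the epidemic ends, under permanent immunity: the
    susceptible fraction can only fall to 1/r0 before the infection
    stops growing, so at least 1 - 1/r0 must have been infected."""
    return 1.0 - 1.0 / r0


def seir_ei_ratio(beta, eta, gamma, e0, i0, dt, steps):
    """Euler sketch of the early phase of the SEIR model (4.88)-(4.91)
    with s ~ 1 (illustrative names):
        E' = beta*I - eta*E,   I' = eta*E - gamma*I.
    During exponential growth the ratio E/I approaches the constant
    limit discussed around (4.101), namely the dominant-eigenvector
    ratio (lambda + gamma)/eta."""
    e, i = e0, i0
    for _ in range(steps):
        de = beta * i - eta * e
        di = eta * e - gamma * i
        e, i = e + dt * de, i + dt * di
    return e / i
```

for beta = 0.6 and eta = gamma = 0.2, the dominant eigenvalue is (-0.4 + sqrt(0.48))/2, so the limiting ratio e/i equals the square root of three, independently of the initial values.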
(4.101) the remaining cases in (4.102) imply a growing e and a decreasing i, and a decreasing e and a growing i, respectively, at least for some time. additionally, in these cases the relation (4.101) remains valid. the case i0 > 0 can be dealt with analogously, rewriting (4.89) and (4.90) in terms of the ratio e/i; finally, the asymptotic behaviour of e/i is given by (4.105). clearly, this is equivalent to (4.101). obviously, for a replacement number of at most one, the sum e + i cannot grow, and so the infection cannot spread. depending on the initial values e0 and i0, one of these two functions can grow for a short time. finally, we want to deal with complex models of sei3r kind. in order to focus, at first we drop the deceased class, the loss of immunity and the vaccination within the sei3rsd model presented above in section 3 and paragraph 4.1. using the balance that all class fractions sum to one, we obtain from (4.6)-(4.11) a reduced system. the definition of the parameters as well as the initial conditions (with d0 = 0) are given as in subsection 4.1. contrary to the previous models, it is not clear where contact and replacement numbers occur in the equations. based on the definitions in paragraph 2.2.2 and in remark 3.1, the contact and replacement numbers are presented for the full problem. an investigation like that for the previous models turns out to be more complex; we only sketch some ideas. the addition of equations (4.107)-(4.110) yields an equation for the sum e + i1 + i21 + i22. again, this is a rough sufficient condition, disregarding the remaining terms in (4.100). the author thanks pd dr. georg quaas, leipzig, and prof. dr. michael böhm, bremen, for fruitful proposals and discussions when preparing this study.

detection of sars-cov-2 in human breastmilk
wo das corona-infektionsrisiko am größten ist
viable sars-cov-2 in the air of a hospital room with covid-19 patients.
medrxiv
sars-cov-2 blog, hermann rietschel institut
an overview on the role of relative humidity in airborne transmission of sars-cov-2 in indoor environments
maskenpflicht und ihre wirkung auf die corona-pandemie: was die welt von jena lernen kann
physical distancing, face masks, and eye protection to prevent person-to-person transmission of sars-cov-2 and covid-19: a systematic review and meta-analysis
investigation of a superspreading event preceding the largest meat processing plant-related sars-coronavirus 2 outbreak in germany
the risk of covid-19 transmission in train passengers: an epidemiological and modelling study
next generation sequencing of t and b cell receptor repertoires from covid-19 patients showed signatures associated with severity of disease
periodicity and stability in epidemic models: a survey
the mathematics of infectious diseases
vertically transmitted diseases, biomathematics
grundkurs biomathematik: mathematische modelle in biologie, biochemie, medizin und pharmazie mit computerlösungen in mathematica
infektionsepidemiologie: methoden, moderne surveillance, mathematische modelle, global public health
an introduction to infectious disease modelling
extensions of the seir model for the analysis of tailored social distancing and tracing approaches to cope with
inferring change points in the covid-19 spreading reveals the effectiveness of interventions
welche maßnahmen brachten corona unter kontrolle
the reproduction number in the classical epidemiological model
bringing accountability to the peak of the pandemic using linear response theory
a very flat peak: exponential growth phase of covid-19 is mostly followed by a prolonged linear growth phase
ein mathematisches modell der anfänge der coronavirus-epidemie in frankreich
on covid-19 modelling
zum aufbau epidemiologischer modelle - entwicklung eines sei3rsd-modells zur ausbreitung von sars-cov-2
on build-up of epidemiologic models - development of a sei3rsd model for the spread of sars-cov-2
on the
structure of the epidemic spread of aids: the influence of an infectious coagent an sis epidemic model with variable population size and a delay two sis epidemiologic models with delays gewöhnliche differentialgleichungen gewöhnliche differentialgleichungen -eine einführung schätzung einer zeitabhängigen reproduktionszahl r für daten mit einer wöchentlichen periodizität am beispiel von sars-cov-2-infektionen und covid-19 fluid-und thermodynamik: eine einführung continuum methods of physical modeling: continuum mechanics, dimensional analysis, turbulence scale-up -modellübertragung in der verfahrenstechnik, 2.aufl mathematische modellierung oxford users' guide to mathematics gewöhnliche differentialgleichungen -2., überarb. aufl on build-up of epidemiologic models-development of a sei 3 rsd model for the spread of sars-cov-2 key: cord-326908-l9wrrapv authors: duchêne, david a.; duchêne, sebastian; holmes, edward c.; ho, simon y.w. title: evaluating the adequacy of molecular clock models using posterior predictive simulations date: 2015-07-10 journal: mol biol evol doi: 10.1093/molbev/msv154 sha: doc_id: 326908 cord_uid: l9wrrapv molecular clock models are commonly used to estimate evolutionary rates and timescales from nucleotide sequences. the goal of these models is to account for rate variation among lineages, such that they are assumed to be adequate descriptions of the processes that generated the data. a common approach for selecting a clock model for a data set of interest is to examine a set of candidates and to select the model that provides the best statistical fit. however, this can lead to unreliable estimates if all the candidate models are actually inadequate. for this reason, a method of evaluating absolute model performance is critical. we describe a method that uses posterior predictive simulations to assess the adequacy of clock models. 
we test the power of this approach using simulated data and find that the method is sensitive to bias in the estimates of branch lengths, which tends to occur when using underparameterized clock models. we also examine the performance of the multinomial test statistic, originally developed to assess the adequacy of substitution models, and find that it has low power in identifying the adequacy of clock models. we illustrate the performance of our method using empirical data sets from coronaviruses, simian immunodeficiency virus, killer whales, and marine turtles. our results indicate that methods of investigating model adequacy, including the one proposed here, should be routinely used in combination with traditional model selection in evolutionary studies. this will reveal whether a broader range of clock models needs to be considered in phylogenetic analysis. analyses of nucleotide sequences can provide a range of valuable insights into evolutionary relationships and timescales, allowing various biological questions to be addressed. the problem of inferring phylogenies and evolutionary divergence times is a statistical one, such that inferences are dependent on reliable models of the evolutionary process (felsenstein 1983). bayesian methods provide a powerful framework for estimating phylogenetic trees and evolutionary rates and timescales using parameter-rich models (huelsenbeck et al. 2001; yang and rannala 2012). model-based phylogenetic inference in a bayesian framework has several desirable properties: it is possible to include detailed descriptions of molecular evolution (dutheil et al. 2012; heath et al. 2012); many of the model assumptions are explicit (sullivan and joyce 2005); large parameter spaces can be explored efficiently (nylander et al. 2004; drummond et al. 2006); and uncertainty is naturally incorporated in the estimates.
as a consequence, the number and complexity of evolutionary models for bayesian inference have grown rapidly, prompting considerable interest in methods of model selection (xie et al. 2011; baele et al. 2013). evolutionary models can provide useful insight into biological processes, but they are incomplete representations of molecular evolution (goldman 1993). this can be problematic in phylogenetic inference when all the available models are poor descriptions of the process that generated the data (gatesy 2007). traditional methods of model selection do not allow the rejection, or falsification, of every model in the set of candidates being considered. gelman and shalizi (2013) recently referred to this as a critical weakness in current practice of bayesian statistics. a different approach to model selection is to evaluate the adequacy, or plausibility (following brown 2014a), of the model. this involves testing whether the data could have been generated by the model in question (gelman et al. 2014). assessment of model adequacy is a critical step in bayesian inference in general (gelman and shalizi 2013), and phylogenetics in particular (brown 2014a). one method of evaluating the adequacy of a model is to use posterior predictive checks (gelman et al. 2014). among the first of such methods in phylogenetics was the use of posterior predictive simulations, proposed by bollback (2002). the first step in this approach is to conduct a bayesian phylogenetic analysis of the empirical data. the second step is to use simulation to generate data sets of the same size as the empirical data, using the values of model parameters sampled from the posterior distribution obtained in the first step. the data generated via these posterior predictive simulations are considered to represent hypothetical alternative or future data sets, but generated by the model used for inference.
if the process that generated the empirical data can be described with the model used for inference, the posterior predictive data sets should resemble the empirical data set (gelman et al. 2014) . therefore, the third step in assessing model adequacy is to perform a comparison between the posterior predictive data and the empirical data. this comparison must be done using a test statistic that quantifies the discrepancies between the posterior predictive data and the empirical data (gelman and meng 1996) . the test statistic is calculated for each of the posterior predictive data sets to generate a distribution of values. if the test statistic calculated from the empirical data falls outside this distribution of the posterior predictive values, the model in question is considered to be inadequate. previous studies using posterior predictive checks of nucleotide substitution models have implemented a number of different test statistics. some of these provide descriptions of the sequence alignments, such as the homogeneity of base composition (huelsenbeck et al. 2001; foster 2004) , site frequency patterns (bollback 2002; lewis et al. 2014) , and unequal synonymous versus nonsynonymous substitution rates (nielsen 2002; rodrigue et al. 2009 ). brown (2014b) and reid et al. (2014) introduced test statistics based on phylogenetic inferences from posterior predictive data sets. some of the characteristics of inferred phylogenies that can be used as test statistics include the mean tree length and the median robinson-foulds distance between the sampled topologies in the analysis (brown 2014b) . although several test statistics are available for assessing models of nucleotide substitution (brown and eldabaje 2009; brown 2014a; lewis et al. 2014) , there are no methods available to assess the adequacy of molecular clock models. 
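the three-step check just described (fit the model, simulate replicates from posterior draws, compare a test statistic) can be sketched generically. the gaussian toy model, the `simulate` callable, and the variance discrepancy statistic below are illustrative assumptions for the sketch, not the phylogenetic machinery used in this study.

```python
import random
import statistics

def posterior_predictive_pvalue(empirical_data, posterior_draws, simulate, test_statistic):
    """Generic posterior predictive check.

    posterior_draws: parameter values sampled from the posterior (step 1).
    simulate(theta, n): generates one replicate data set of size n (step 2).
    Returns the fraction of replicate statistics at least as large as the
    empirical one, i.e. a posterior predictive p value (step 3)."""
    t_emp = test_statistic(empirical_data)
    t_rep = [test_statistic(simulate(theta, len(empirical_data)))
             for theta in posterior_draws]
    return sum(t >= t_emp for t in t_rep) / len(t_rep)

# toy check: overdispersed data against a model that fixes the spread at 1,
# using the sample variance as the discrepancy statistic
rng = random.Random(1)
data = [rng.gauss(0.0, 3.0) for _ in range(200)]      # "empirical" data, sd = 3
draws = [0.0] * 500                                   # posterior draws of the mean
simulate = lambda mu, n: [rng.gauss(mu, 1.0) for _ in range(n)]  # model assumes sd = 1
p = posterior_predictive_pvalue(data, draws, simulate, statistics.variance)
```

here p falls near zero because no replicate can reproduce the empirical variance, which is exactly the kind of signal that flags an inadequate model.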
molecular clocks have become an established tool in evolutionary biology, allowing the study of molecular evolutionary rates and divergence times between organisms (kumar 2005; ho 2014 ). molecular clock models describe the pattern of evolutionary rates among lineages, relying on external temporal information (e.g., fossil data) to calibrate estimates of absolute rates and times. the primary differences among the various clock models include the number of distinct substitution rates across the tree and the degree to which rates are treated as a heritable trait (thorne et al. 1998; drummond et al. 2006; drummond and suchard 2010 ; for a review see ho and duchêne 2014) . for example, the strict clock assumes that the rate is the same for all branches, whereas some relaxed clock models allow each branch to have a different rate. we refer to models that assume a large number of rates as being more parameter rich than models with a small number of rates . although molecular clock models are used routinely, the methods of assessing their efficacy are restricted to estimating and comparing their statistical fit. for example, a common means of model selection is to compare marginal likelihoods in a bayesian framework (baele et al. 2013 ). however, model selection can only evaluate the relative statistical fit of the models, such that it can lead to false confidence in the estimates if all the candidate models are actually inadequate. in this study, we introduce a method for assessing the adequacy of molecular clock models. using simulated and empirical data, we show that our approach is sensitive to underparameterization of the clock model, and that it can be used to identify the branches of the tree that are in conflict with the assumed clock model. in practice, our method is also sensitive to other aspects of the hierarchical model, such as misspecification of the node-age priors. 
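the distinction drawn above between a strict clock and an uncorrelated lognormal relaxed clock amounts to how branch rates are generated. a minimal sketch, with invented rate values and the tree flattened to a list of branches:

```python
import math
import random

def strict_clock_rates(n_branches, rate):
    # strict clock: a single substitution rate shared by every branch
    return [rate] * n_branches

def uncorrelated_lognormal_rates(n_branches, median_rate, stdev_log, rng):
    # uncorrelated lognormal relaxed clock: each branch draws an independent
    # rate from a lognormal distribution whose median is median_rate
    mu = math.log(median_rate)
    return [rng.lognormvariate(mu, stdev_log) for _ in range(n_branches)]

rng = random.Random(42)
strict = strict_clock_rates(8, 1e-3)                  # subs/site per unit time
relaxed = uncorrelated_lognormal_rates(8, 1e-3, 0.5, rng)
```

in this sense the relaxed clock is more parameter rich: it describes one rate per branch plus the lognormal spread, whereas the strict clock describes a single rate.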
we highlight the importance of methods of evaluating the adequacy of substitution models in molecular clock analyses. to evaluate the adequacy of molecular clock models, we propose a method of generating and analyzing posterior predictive data. in this method, the posterior predictive data sets are generated using phylogenetic trees inferred from branch-specific rates and times from the posterior samples (fig. 1). because this method uses branch-specific estimates, it requires a fixed tree topology. the first step in our method is to conduct a bayesian molecular clock analysis of empirical data. we assume that this analysis obtains samples from the posterior distribution of branch-specific rates and times. these estimates are given in relative time, or in absolute time if calibration priors are used. in the second step, we take a random subset of these samples. for each of these samples, we multiply the branch-specific rates and times to produce phylogenetic trees in which the branch lengths are measured in substitutions per site (subs/site), known as phylograms. to assess model adequacy, we randomly select 100 samples from the posterior, excluding the burn-in. from these samples, posterior predictive data sets are generated by simulation along the phylograms and using the estimates of the parameters in the nucleotide substitution model. the third step in our approach is to use a clock-free method to estimate a phylogram from each of the posterior predictive data sets and from the empirical data set. for this step, we find that the maximum likelihood approach implemented in phangorn (schliep 2011) is effective. to compute our adequacy index, we consider the branch lengths estimated from the posterior predictive data sets under a clock-free method, such that there is a distribution of length estimates for each branch. we calculate a posterior predictive p value for each branch using the corresponding distribution obtained with the posterior predictive data sets.
this value is important for identifying the length estimates for individual branches that are in conflict with the clock model. our index for overall assessment is the proportion of branches in the phylogram from the empirical data that have lengths falling within the 95% quantile range of those estimated from the posterior predictive data sets. we refer to our index as a, or overall plausibility of branch length estimates. we also provide a measure of the extent to which the branch length estimates from the clock-free method differ from those obtained using the posterior predictive simulations. to do this, we calculate for each branch the absolute difference between the empirical branch length estimated using a clock-free method and the mean branch length estimated from the posterior predictive data. we then divide this value by the empirical branch length estimated using a clock-free method. this measure corresponds to the deviation of posterior predictive branch lengths from the branch length estimated from the empirical data. for simulations and analyses of empirical data, we present the median value across branches to avoid the effect of extreme values. we refer to this measure as "branch length deviation," of which low values represent high performance. we also investigated the uncertainty in the estimates of posterior predictive branch lengths. this is useful because it provides insight into the combined uncertainty in estimates of rates and times. the method we used was to take the width of the 95% quantile range from the posterior predictive data sets, divided by the mean length estimated for each branch. this value, along with the width of the 95% credible interval of the rate estimate from the original analysis, can then be compared among clock models to investigate the increase in uncertainty that can occur when using complex models. we first evaluated the accuracy and uncertainty of substitution rate estimates from simulated data.
to do this, we compared the values used to generate the data with those estimated using each of three clock models: strict clock, random local clocks (drummond and suchard 2010), and the uncorrelated lognormal relaxed clock (drummond et al. 2006). we regarded the branch-specific rates as accurate when the rate used for the simulation was contained within the 95% credible interval. we found that rate estimates were frequently inaccurate under five circumstances: clock model underparameterization; rate autocorrelation among branches (kishino et al. 2001); uncorrelated beta-distributed rate variation among lineages; misleading node-age priors (i.e., node calibrations that differ considerably from the true node ages); and when data were generated under a strict clock but analyzed with an underparameterized substitution model (fig. 2a). when analyses were performed using the correct or an overparameterized clock model, more than 75% of branch rates were accurately estimated, such that the true value was contained within the 95% credible interval (fig. 2a). in most simulation schemes, the uncorrelated lognormal relaxed clock had high accuracy, at the expense of a small increase in the uncertainty compared with the other models (fig. 2b). these results are broadly similar to those of drummond et al. (2006), who also found that underparameterization of the clock model resulted in low accuracy in rate estimates, whereas overparameterization had a negligible effect on accuracy.
(fig. 1 caption: the top right box shows the first step in assessing model adequacy using pps. in our analyses, this step is performed using branch-specific rates and times. the bottom box shows our procedure for testing the clock model, which is based on the clock-free posterior predictive distribution of the length of each branch. the thin arrows indicate that the test statistic is the posterior predictive p value for each branch. pps, posterior predictive simulations.)
we analyzed data generated by simulation to test our method of assessing the adequacy of molecular clock models. the a index was approximately proportional to the branch length deviation (fig. 3a). we found a to be at least 0.95 (indicating high performance) when the model used in the analyses matched that used to generate the data, or when it was overparameterized. when the assumed model was underparameterized, a was 0.92. the uncertainty obtained using posterior predictive branch lengths was sensitive to the rate variance in the simulations. for this reason, estimates from data generated according to a strict clock or an uncorrelated lognormal relaxed clock had lower uncertainty than estimates from data generated under local clocks, regardless of the model used for analysis (fig. 3b). estimates made using the uncorrelated lognormal relaxed clock had a larger variance in three analysis schemes: when data were generated with autocorrelated rates across branches; when data were generated with beta-distributed rates across branches; and when there was a misleading prior for the node ages. for analyses with substitution model underparameterization, our method incorrectly provided greater support for the more complex clock model, indicating that rate variation among lineages was overestimated (fig. 3). we used our simulated data and posterior predictive simulations to investigate the performance of the multinomial test statistic for evaluating the adequacy of molecular clock models. this test statistic was originally designed to assess models of nucleotide substitution (goldman 1993; bollback 2002) and can perform well compared with some of the other existing test statistics (brown 2014b). the multinomial test statistic for the empirical alignment can be compared with the distribution of test statistics from posterior predictive data sets to produce a posterior predictive p value.
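the multinomial (unconstrained) likelihood underlying this test statistic is a simple function of the observed site-pattern counts: the sum over patterns of n_i ln(n_i / N), where n_i is the count of pattern i and N the number of sites (goldman 1993). a minimal sketch on an invented toy alignment, with taxa represented as rows of equal-length strings:

```python
import math
from collections import Counter

def multinomial_test_statistic(alignment):
    """alignment: list of equal-length sequences (rows = taxa).
    Returns sum_i n_i * ln(n_i / N) over the observed site-pattern
    (alignment column) counts n_i, with N the number of sites."""
    n_sites = len(alignment[0])
    patterns = Counter(tuple(seq[i] for seq in alignment) for i in range(n_sites))
    total = sum(patterns.values())
    return sum(n * math.log(n / total) for n in patterns.values())

# toy 3-taxon alignment with 4 sites; three distinct column patterns
aln = ["ACGA", "ACGA", "ATGA"]
t = multinomial_test_statistic(aln)
```

computed for the empirical alignment and for each posterior predictive alignment, these values give the replicate distribution against which the empirical statistic is compared.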
we find that the multinomial test statistic correctly identified when the substitution model was matched or underparameterized (fig. 4). the multinomial likelihood did not have the power to detect clock model adequacy, but it was sensitive to rate variation among lineages, primarily from the simulation involving autocorrelated rates and when the node-age prior was misleading (fig. 4).
(fig. 2 caption: mean values of (a) accuracy and (b) uncertainty of branch rate estimates from molecular clock analyses of simulated data. each cell shows the results of 100 replicate analyses. accuracy is measured as the proportion of data sets for which the rate used for simulation was contained in the 95% credible interval of the estimate. darker shades in (a) represent high accuracy. uncertainty is measured as the width of the 95% credible interval as a proportion of the mean rate. dark shades in (b) represent small ranges in branch length estimates, and therefore low uncertainty. the initials stand for each of the schemes for estimation or simulation. sc, strict clock; loc, local clock; ucl, uncorrelated lognormal relaxed clock; rlc, random local clock; acl, autocorrelated relaxed clock; bim, beta-distributed bimodal clock; pri, misleading node-age prior; gtrg, data simulated under the parameter-rich general time-reversible substitution model with among-site rate heterogeneity.)
(fig. 3 caption: mean values of (a) plausibility, a, and (b) uncertainty as described by the posterior predictive simulations from clock analyses of simulated data. each cell shows the results of 100 replicate analyses. values in parentheses are the branch length deviations, of which lower values indicate good performance. the darker shades represent higher values of a and less uncertainty. high values of a represent good performance. in the case of uncertainty, small values indicate small ranges in posterior predictive branch lengths, and therefore low uncertainty. the initials stand for each of the schemes for estimation or simulation. sc, strict clock; loc, local clock; ucl, uncorrelated lognormal relaxed clock; rlc, random local clock; acl, autocorrelated relaxed clock; bim, beta-distributed bimodal clock; pri, misleading node-age prior; gtrg, data simulated under the parameter-rich general time-reversible substitution model with among-site rate heterogeneity.)
we used three clock models, as in our analyses of simulated data, to analyze a broad range of nucleotide sequence data sets: the m (matrix) gene of a set of coronaviruses; the gag gene of simian immunodeficiency virus (siv; wertheim and worobey 2009); complete mitochondrial genomes of killer whales orcinus orca (morin et al. 2010); and 13 mitochondrial protein-coding genes of marine turtles (duchene et al. 2012). the uncorrelated lognormal relaxed clock was the best-fitting clock model according to the marginal likelihood for the coronaviruses, siv, and the killer whales (table 1). for the marine turtles, the random local clock provided the best fit. in all the analyses of empirical data sets, the uncorrelated lognormal relaxed clock had the best performance according to our a index. the highest a index was 0.78 for the siv and the killer whales, and the lowest uncertainty in posterior predictive branch lengths was 0.7 for the killer whales. the uncertainty for all other data sets was above 1, indicating that it was larger than the mean of the posterior predictive branch lengths. we calculated the multinomial test statistic for the empirical data sets using the posterior predictive data from a clock model analysis, as well as under a clock-free method. the multinomial test statistic from both methods suggested that the substitution model was inadequate for the siv and the marine turtles, with posterior predictive p values below 0.05.
the substitution model was identified as inadequate for the coronavirus data set by the multinomial test statistic estimated using posterior predictive data sets from a clock analysis (p < 0.05); however, it was identified as adequate when using a clock-free method (p = 0.20). the mitochondrial data set from killer whales represented the only case in which the substitution model was adequate according to both multinomial likelihood estimates. for the data sets from coronaviruses and killer whales, the clock models with the highest performance had a indices of 0.53 and 0.78, respectively (table 1). these indices are substantially lower than those obtained in analyses of simulated data when the clock model used for simulation and estimation was matched. however, we evaluated the posterior predictive p values for all branches in these empirical data sets and found that at least two-thirds of the incorrect estimates correspond to relatively short terminal branches (supplementary information, supplementary material online). the branch length deviation in the empirical data ranged between 0.09 for the uncorrelated lognormal relaxed clock in the turtle data and 0.48 for the killer whale data analyzed with a strict clock (table 1). low values for this metric indicate small differences between the posterior predictive and the empirical branch lengths. although scores for this metric varied considerably between data sets, they were closely associated with the a indices for the different models for each data set individually. for example, in every empirical data set, the lowest branch length deviation was achieved by the model with the highest a index (indicative of higher performance). importantly, the branch length deviation was not directly comparable with the a index between data sets. this is probably because the posterior predictive branch lengths have different amounts of uncertainty.
in particular, the a index will tend to be low if the posterior predictive branch length estimates are similar to the empirical value but have low uncertainty. this would create a scenario with a small branch length deviation but also a low a index. this appears to be the case for the coronaviruses, for which all the clock models appear inadequate according to the a index, but with the uncorrelated lognormal relaxed clock having a small branch length deviation. assessing the adequacy of models in phylogenetics is an important process that can provide information beyond that offered by traditional methods for model selection. although traditional model selection can be used to evaluate the relative statistical fit of a set of candidates, model adequacy provides information about the absolute performance of the model, such that even the best-fitting model can be a poor predictor of the data (gelman et al. 2014). there have been important developments in model adequacy methods and test statistics in the context of substitution models (ripplinger and sullivan 2010; brown 2014b; lewis et al. 2014) and estimates of gene trees (reid et al. 2014). here we have described a method that can be used for assessment of molecular clock models, and which should be used in combination with approaches for evaluating the adequacy of substitution models. the results of our analyses suggest that our method is able to detect whether estimates of branch-specific rates and times are consistent with the expected number of substitutions along each branch. for example, in the coronavirus data set analyzed here, the best-fitting clock model was a poor predictor of the data, as was the substitution model. our index is sensitive to underparameterization of clock models and has the benefit of being computationally efficient.
in addition, our metric of uncertainty in posterior predictive branch lengths is sensitive to some cases of misspecification of clock models and node-age priors, but not to substitution model misspecification, as shown for our analyses of the coronavirus data set. analyses based on the random local clock and the data simulated under two local clocks generally produced low accuracy ( fig. 2a) , with lower a indices than the other models that were matched to the true model ( fig. 3a) . the substandard performance of the random local clock when it is matched to the true model is surprising. a possible explanation is that our simulations of the local clock represented an extreme scenario in which the rates of the local clocks differed by an order of magnitude. previous studies based on simulations and empirical data demonstrated that this model can be effective when the rate differences are smaller (drummond and suchard 2010; dornburg et al. 2012) . in our analyses of empirical data, even the highest values of our index were lower than the minimum value obtained in our analyses of simulated data when the three models matched those used for simulation. this is consistent with the results of previous studies of posterior predictive simulations, which have suggested that the proposed threshold for a test statistic using simulations is conservative for empirical data (bollback 2002; ripplinger and sullivan 2010; brown 2014b) . it is difficult to suggest a specific threshold for our index to determine whether a model is inadequate. however, the interpretation is straightforward: a low a index indicates that a large proportion of branch rates and times are inconsistent with the expected number of substitutions along the branches. under ideal conditions, an a index of 0.95 or higher means that the clock model accurately describes the true pattern of rate variation. 
however, our method allows the user to inspect the particular branches with inconsistent estimates, which can be useful for identifying regions of the tree that cause the clock model to be inadequate. measuring the effect size of differences in the branch length estimates of the posterior predictive and empirical data can also be useful for quantifying potential errors in the estimates of node times and branch-specific rates. an important finding of our study is that overparameterized clock models typically have higher accuracy than those that are underparameterized. this is consistent with a statistical phenomenon known as the bias-variance trade-off, with underparameterization leading to high bias, and overparameterization leading to high uncertainty. this was demonstrated for molecular clock models by . although our results show a bias when the model is underparameterized, we did not detect high uncertainty with increasing model complexity. this probably occurs because the models used here are not severely overparameterized. this is consistent with the fact that bayesian analyses are robust to mild overparameterization because estimates are integrated over the uncertainty in additional parameters (huelsenbeck and rannala 2004; lemmon and moriarty 2004) . we note that our index is insensitive to the overparameterization in our analyses. this problem is also present in some adequacy statistics for substitution models (bollback 2002; ripplinger and sullivan 2010) . identifying an overparameterized model is challenging, but a recent study proposed a method to do this for substitution models (lewis et al. 2014 ). an equivalent implementation for clock models would also be valuable. another potential solution is to select a pool of adequate models and to perform model selection using methods that penalize an excess of parameters, such as marginal likelihoods or information criteria. 
we find that our assessment of clock model adequacy can be influenced by other components of the analysis. for example, multiple calibrations can create a misleading node-age prior that is in conflict with the clock model (warnock et al. 2012; duchêne et al. 2014; heled and drummond 2014). although our simulations with misleading node calibrations were done using a strict clock, our method identified this scenario as clock model inadequacy when the models for estimation were the strict or random local clocks (fig. 3a). in the case of the uncorrelated lognormal relaxed clock, our method identified a misleading node-age prior as causing an increase in uncertainty (fig. 3b). this highlights the critical importance of selecting and using time calibrations appropriately, and we refer the reader to the comprehensive reviews of this topic (benton and donoghue 2007; ho and phillips 2009). another component of the analysis that can have an impact on the adequacy of the clock model is the tree prior, which can influence the estimates of branch lengths. although one study suggested that the effect of the tree prior is not substantial (lepage et al. 2007), its influence on divergence-time estimates remains largely unknown. we found that substitution model underparameterization led to a severe reduction in accuracy. overconfidence in incorrect branch lengths in terms of substitutions can cause bias in divergence-time estimates (cutler 2000). however, this form of model inadequacy is incorrectly identified by the methods we used for estimation as a form of rate variation among lineages. for our data generated using a strict clock and an underparameterized substitution model, the a index rejected the strict clock and supported the overparameterized uncorrelated lognormal relaxed clock.
on the other hand, the multinomial test statistic was sensitive to substitution model underparameterization, and to some forms of rate variation among lineages. the sensitivity of the multinomial likelihood to rate variation among lineages might explain why the substitution model was rejected for the coronavirus data set when using a clock model, but not when using a clock-free method. due to this sensitivity and the substantial impact of substitution model misspecification, we recommend the use of a clock-free method to assess the substitution model before performing analyses using a clock model. our results suggest that it is only advisable to perform a clock model analysis when an adequate substitution model is available. other methods for substitution model assessment that are less conservative than the multinomial likelihood represent an interesting area for further research. we find that the a index is sensitive to patterns of rate variation among lineages that conflict with the clock model used for estimation. this is highlighted in the simulations of rate variation among lineages under autocorrelated rates and under the unusual beta-distributed rates. in these cases, the a index identified the uncorrelated lognormal clock as the only adequate clock model, despite an increase in uncertainty in both cases. although other studies have also suggested that the uncorrelated lognormal relaxed clock can account for rate autocorrelation (drummond et al. 2006; ho et al. 2015), an increase in uncertainty can impair the interpretation of divergence-time estimates. we suggest caution when the uncertainty values are above 1, which occurs when the widths of the 95% credible intervals are greater than the mean parameter estimates. in our analyses of the two virus data sets, the multinomial test statistic suggested that the best-fitting substitution model was inadequate.
in the analyses of the siv data, our index of clock model adequacy was 0.78, similar to that of killer whales, for which the substitution model appeared adequate. we recommend caution when interpreting estimates of evolutionary rates and timescales when the substitution model is inadequate. this typically suggests that the substitution process is not being modeled correctly, which can affect inferences of branch lengths regardless of whether a clock model is used or not. for this reason, the a index of 0.78 for the siv data set might be overconfident compared with the same index obtained for the killer whale data. previous research has also suggested that there are processes in the evolution of siv that are not accounted for by current evolutionary models . we also found that all the clock models were inadequate for the coronavirus sequence data. our results might provide an explanation for the lack of consensus over the evolutionary timescale of these viruses. for example, a study of mammalian and avian coronaviruses estimated that these viruses originated at most 5,000 years ago (woo et al. 2012 ). this result stands in contrast with a subsequent study that suggested a much deeper origin of these viruses, in the order of millions of years (wertheim et al. 2013) . our results suggest that estimating the timescale of these viruses might not be feasible with the current clock models. our analysis of mitochondrial genomes of killer whales shows that even if the clock model performance is not as high as that obtained in the simulations that match the models used for estimation, a large proportion of the divergence-time estimates can still be useful. examining the estimates of specific branch lengths can indicate whether many of the node-age estimates are reliable, or whether important branches provide unreliable estimates. 
we recommend this practice when the substitution model has been deemed adequate and when a substantial proportion of the branch lengths are consistent with the clock model (i.e., when the a index is high). we note that the mitochondrial genomes of killer whales have the lowest a index of any data set when analyzed using a random local clock. this might occur because the model identified only between 0 and 3 rate changes along the tree (an average of 0.79 rate changes; table 1). although rate variation is likely to be higher in this data set, it might not be sufficiently high for the model to detect it. analyses of mitochondrial protein-coding genes from marine turtles identified the substitution model as inadequate using the multinomial test statistic. the clock model with the highest performance had an a index of 0.70, which might be considered sufficient to interpret the divergence-time estimates for at least some portions of the tree. again, the fact that the substitution model is inadequate precludes further interpretation of the estimates of evolutionary rates and timescales. this is a surprising result for a mitochondrial data set with several internal-node calibrations. a potential solution is to assess substitution-model adequacy for individual genes and to conduct the molecular clock analysis using only those genes for which an adequate substitution model is available. we believe that, with the advent of genomic data sets, this will become a feasible strategy in the near future. some of the reasons for the paucity of studies that assess model adequacy in phylogenetics include computational demand and the lack of available methods. in this study, we have presented a method of evaluating clock model adequacy, using a simple test statistic that can be computed efficiently.
assessment of clock model adequacy is an important complement to traditional methods of model selection for two primary reasons: it allows the researcher to reject all the available models if they are inadequate; and, as implemented in this study, it can be used to identify the branches with length estimates that are implausible under the assumed model. the results of our analyses of empirical data underscore the importance of evaluating the adequacy of the substitution and clock models. in some cases, several models might be adequate, particularly when they are overparameterized. in this respect, methods for traditional model selection are important tools because they can be used to select a single best-fitting model from a set of adequate models. further research into methods, test statistics, and software for evaluating model adequacy is needed, both to improve the existing models and to identify data sets that will consistently provide unreliable estimates. we generated 100 pure-birth trees with 50 tips and root-node ages of 50 my using beast v2.1 (bouckaert et al. 2014). we then simulated branch-specific rates under five clock model treatments using the r package nelsi (ho et al. 2015). this program simulates rates under a given model and multiplies rates by time to produce phylogenetic trees in which the branch lengths represent subs/site, known as phylograms. these phylograms were then used to simulate the evolution of dna sequences of 2,000 nt in the r package phangorn. the five clock model treatments included the following: 1) a strict clock with a rate of 5 × 10⁻³ subs/site/my; 2) an uncorrelated lognormal relaxed clock (drummond et al. 2006), with a mean rate of 5 × 10⁻³ subs/site/my and a standard deviation of 0.1; 3) a treatment in which a randomly selected clade with at least ten tips experienced an increase in the rate, representing a scenario with two local clocks (yoder and yang 2000), with rates of 1 × 10⁻² and 1 × 10⁻³ subs/site/my; 4) a treatment with rate autocorrelation, with an initial rate of 5 × 10⁻³ subs/site/my and an autocorrelation parameter of 0.3 (kishino et al. 2001); and 5) a treatment with rate variation that followed a beta distribution with equal shape parameters of 0.4 and centered at 5 × 10⁻³ subs/site/my, resulting in a bimodal shape. in every simulation, the mean rate was 5 × 10⁻³ subs/site/my, which is approximately the mean mitochondrial evolutionary rate in mammals, birds, nonavian reptiles, and amphibians (pereira and baker 2006). we selected this mean rate instead of sampling from the prior because our estimation methods involved an uninformative rate prior, and random samples from this can produce data sets with high sequence saturation or with low information content. we used the jukes-cantor substitution model for simulation (jukes and cantor 1969). this model allows us to avoid making arbitrary parameterizations of more parameter-rich models, which is not the focus of this study. to explore the effect of substitution model underparameterization, we simulated additional data sets under a strict clock and a general time-reversible model with gamma-distributed rates among sites, using parameters from empirical data (murphy et al. 2001). we analyzed these data sets using the same method as for the rest of the simulated data, including the use of the simpler jukes-cantor substitution model. we also explored the effect of using misleading node-age priors. to do this, we placed two time calibrations with incorrect ages.
one calibration was placed in one of the two nodes descending from the root selected at random, with an age prior of 0.1 times its true age (i.e., younger than the truth). the other calibration was placed on the most recent node in the other clade descending from the root, with an age of 0.9 of the root age (i.e., older than the truth). for this scenario, we only used trees with more than one descendant in each of the two oldest clades. we show an example of the simulated phylogeny compared with this kind of marginal prior on node ages in the supplementary information, supplementary material online. our study had 100 simulated data sets for each simulation treatment, for a total of 700 simulated alignments. we analyzed the simulated alignments using bayesian markov chain monte carlo (mcmc) sampling as implemented in beast. we used three different clock models to analyze each of the simulated alignments: the strict clock, uncorrelated lognormal relaxed clock (drummond et al. 2006), and random local clock (drummond and suchard 2010). we used the same tree prior and substitution model for estimation as those used for simulation. we fixed the age of the root to 50 my and fixed the tree topology to that used to simulate sequence evolution in every analysis. we analyzed the simulated data with an mcmc chain length of 2 × 10⁷ steps, with samples drawn from the posterior every 2 × 10³ steps. we discarded the first 10% of the samples as burn-in, and assessed satisfactory sampling from the posterior by verifying that effective sample sizes for all parameters were above 200 using the r package coda (plummer et al. 2006). we performed analyses using each of the three clock models for each of the 300 simulated data sets, for a total of 900 clock analyses. we assessed the accuracy and uncertainty of the estimates made using each of the analysis schemes (fig. 2). to do this, we compared the simulated rates with the branch-specific rates in the posterior.
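the rate-by-time construction that nelsi performs (sampling branch-specific rates and multiplying them by branch durations to obtain a phylogram) can be outlined in a short sketch; this is a hypothetical python illustration, not nelsi itself, and the function name is invented:

```python
import numpy as np

rng = np.random.default_rng(42)

def phylogram_branch_lengths(durations_my, clock="strict",
                             mean_rate=5e-3, sdlog=0.1):
    """Turn branch durations (My) into expected substitutions/site by
    sampling a branch-specific rate and multiplying rate by time.
    'strict' applies one rate to every branch; 'lognormal' draws
    uncorrelated branch rates whose mean equals mean_rate."""
    t = np.asarray(durations_my, dtype=float)
    if clock == "strict":
        rates = np.full_like(t, mean_rate)
    elif clock == "lognormal":
        # shift the log-mean so the lognormal mean equals mean_rate
        mu = np.log(mean_rate) - 0.5 * sdlog ** 2
        rates = rng.lognormal(mu, sdlog, size=t.shape)
    else:
        raise ValueError(f"unknown clock model: {clock}")
    return rates * t

print(phylogram_branch_lengths([10.0, 20.0]))  # one length per branch
```

the resulting branch lengths in subs/site would then be used to simulate sequence evolution, as done here with phangorn.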
next, we tested the power of our method for assessing clock model adequacy using the simulated data under each of the scenarios of simulation and analysis. we provide example code and results in a public repository on github (https://github.com/duchene/modadclocks, last accessed july 1, 2015). we also tested the power of the multinomial test statistic to assess clock model adequacy in each of the 900 analyses. this test statistic quantifies the frequency of site patterns in an alignment and is appropriate for testing the adequacy of models of nucleotide substitution (bollback 2002; brown 2014b). we used four published data sets to investigate the performance of our method of assessing clock model adequacy in empirical data. for each data set, we performed analyses in beast using each of the three clock models used to analyze the simulated data sets. to select the substitution model for each empirical data set, we used the bayesian information criterion as calculated in the r package phangorn.
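the multinomial test statistic described above is the unconstrained log-likelihood of the observed site-pattern frequencies: each distinct alignment column with count n_i contributes n_i log(n_i / N) for N sites. a minimal sketch (an illustration, not the published implementation):

```python
from collections import Counter
from math import log

def multinomial_log_likelihood(alignment):
    """Unconstrained (multinomial) log-likelihood of an alignment:
    sum over distinct site patterns (columns) of n_i * log(n_i / N),
    where n_i is the pattern count and N the number of sites."""
    n_sites = len(alignment[0])
    columns = ["".join(seq[j] for seq in alignment) for j in range(n_sites)]
    pattern_counts = Counter(columns)
    return sum(n * log(n / n_sites) for n in pattern_counts.values())

# three taxa, four sites: patterns AAA, CCC, GGG, TAT each occur once
aln = ["ACGT", "ACGA", "ACGT"]
print(multinomial_log_likelihood(aln))  # 4 * log(1/4)
```

in a posterior predictive setting, the same statistic would be computed for alignments simulated under the fitted model and compared with the observed value.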
accurate model selection of relaxed molecular clocks in bayesian phylogenetics
paleontological evidence to date the tree of life
bayesian model adequacy and choice in phylogenetics
beast 2: a software platform for bayesian evolutionary analysis
predictive approaches to assessing the fit of evolutionary models
detection of implausible phylogenetic inferences using posterior predictive assessment of model fit
puma: bayesian analysis of partitioned (and unpartitioned) model adequacy
estimating divergence times in the presence of an overdispersed molecular clock
relaxed clocks and inferences of heterogeneous patterns of nucleotide substitution and divergence time estimates across whales and dolphins (mammalia: cetacea)
relaxed phylogenetics and dating with confidence
bayesian coalescent inference of past population dynamics from molecular sequences
bayesian random local clocks, or one rate to rule them all
marine turtle mitogenome phylogenetics and evolution
the impact of calibration and clock-model choice on molecular estimates of divergence times
efficient selection of branch-specific models of sequence evolution
statistical inference of phylogenies
a tenth crucial question regarding model use in phylogenetics
bayesian data analysis
model checking and model improvement
simulating normalizing constants: from importance sampling to bridge sampling to path sampling
philosophy and the practice of bayesian statistics
statistical tests of models of dna substitution
a dirichlet process prior for estimating lineage-specific substitution rates
calibrated birth-death phylogenetic timetree priors for bayesian inference
the changing face of the molecular evolutionary clock
molecular-clock methods for estimating evolutionary rates and timescales
simulating and detecting autocorrelation of molecular evolutionary rates among lineages
accounting for calibration uncertainty in phylogenetic estimation of evolutionary divergence times
frequentist properties of bayesian posterior probabilities of phylogenetic trees under simple and complex substitution models
bayesian inference of phylogeny and its impact on evolutionary biology
evolution of protein molecules
performance of a divergence time estimation method under a probabilistic model of rate evolution
molecular clocks: four decades of evolution
computing bayes factors using thermodynamic integration
the importance of proper model assumption in bayesian phylogenetics
a general comparison of relaxed molecular clock models
posterior predictive bayesian phylogenetic model selection
complete mitochondrial genome phylogeographic analysis of killer whales (orcinus orca) indicates multiple species
resolution of the early placental mammal radiation using bayesian phylogenetics
mapping mutations on phylogenies
bayesian phylogenetic analysis of combined data
a mitogenomic timescale for birds detects variable phylogenetic rates of molecular evolution and refutes the standard molecular clock
coda: convergence diagnosis and output analysis for mcmc
poor fit to the multispecies coalescent is widely detectable in empirical data
assessment of substitution model adequacy using frequentist and bayesian methods
computational methods for evaluating phylogenetic models of coding sequence evolution with dependence between codons
mrbayes 3.2: efficient bayesian phylogenetic inference and model choice across a large model space
phangorn: phylogenetic analysis in r
model selection in phylogenetics
estimating the rate of evolution of the rate of molecular evolution
exploring uncertainty in the calibration of the molecular clock
a case for the ancient origin of coronaviruses
relaxed molecular clocks, the bias-variance trade-off, and the quality of phylogenetic inference
dating the age of the siv lineages that gave rise to hiv-1 and hiv-2
discovery of seven novel mammalian and avian coronaviruses in deltacoronavirus supports bat coronaviruses as the gene source of alphacoronavirus and betacoronavirus and avian coronaviruses as the gene source of gammacoronavirus and deltacoronavirus
improving marginal likelihood estimation for bayesian phylogenetic model selection
molecular phylogenetics: principles and practice
estimation of primate speciation dates using local molecular clocks

we thank the editor, tracy heath, and an anonymous reviewer for suggestions and insights that helped improve this article. this research was undertaken with the assistance of resources from the national computational infrastructure, which is supported by the australian government.

for each analysis of the empirical data sets, we ran the mcmc chain for 10⁸ steps, with samples drawn from the posterior every 10³ steps. we discarded the first 10% of the samples as burn-in and assessed satisfactory sampling from the posterior by verifying that the effective sample sizes for all parameters were above 200 using the r package coda. we used stepping-stone sampling to estimate the marginal likelihood of the clock model (gelman and meng 1998; lartillot and philippe 2006; xie et al. 2011). for each bayesian analysis, we performed posterior predictive simulations as done for the simulated data sets, and assessed the substitution model using the multinomial test statistic. in addition, to estimate the clock-free multinomial test statistic, we analyzed each of the empirical data sets using mrbayes 3.2 (ronquist et al. 2012). for these analyses we used the same chain length, sampling frequency, sampling verification method, and substitution model as in the analyses using clock models. our empirical data sets included nucleotide sequences of coronaviruses. this data set contained 43 sequences of 638 nt of a portion of the m (matrix) gene, as used by wertheim et al. (2013). these sequences were sampled between 1941 and 2011. the best-fitting substitution model for this data set was gtr+g.
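the sampling checks used in these analyses (discarding the first 10% as burn-in and requiring effective sample sizes above 200, as computed with coda) can be sketched with a crude python stand-in; the estimator below is a simplified autocorrelation-based ess, not coda's implementation, and both function names are hypothetical:

```python
import numpy as np

def discard_burnin(chain, frac=0.10):
    """Drop the first `frac` of the samples as burn-in."""
    chain = np.asarray(chain, dtype=float)
    return chain[int(len(chain) * frac):]

def effective_sample_size(chain):
    """Crude ESS: chain length divided by an integrated autocorrelation
    time, summing lag autocorrelations until the first negative value."""
    x = np.asarray(chain, dtype=float)
    x = x - x.mean()
    n = len(x)
    acf = np.correlate(x, x, mode="full")[n - 1:] / (x @ x)
    tau = 1.0
    for rho in acf[1:]:
        if rho < 0:
            break
        tau += 2.0 * rho
    return n / tau

rng = np.random.default_rng(7)
iid = rng.normal(size=2_000)               # independent draws: high ess
walk = np.cumsum(rng.normal(size=2_000))   # random walk: low ess
```

a strongly autocorrelated chain (the random walk) yields a far lower ess than independent draws, which is why a threshold such as 200 is checked for every parameter.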
we also used a data set of the gag gene of sivs, which comprised 78 sequences of 477 nt, sampled between 1983 and 2004 (wertheim and worobey 2009). the best-fitting substitution model for this data set was gtr+g. we used the bayesian skyline demographic model (drummond et al. 2005) for the analyses of both of the virus data sets, and used the sampling times for calibration. we analyzed a data set of the killer whale (o. orca), which contained 60 complete mitochondrial genome sequences of 16,386 nt (morin et al. 2010). we calibrated the age of the root using a normal distribution with a mean of 0.7 and a standard deviation of 5% of the mean, as used in the original study. the best-fitting substitution model for this data set was hky+g. finally, we analyzed a data set of several genera of marine turtles, which comprised 24 sequences of the 13 mitochondrial protein-coding genes (duchene et al. 2012), and we selected the gtr+g substitution model. following the scheme in the original study, we used calibrations at four internal nodes. the pure-birth process was used to generate the tree prior in the analyses of the killer whales and the marine turtles. supplementary information is available at molecular biology and evolution online (http://www.mbe.oxfordjournals.org/).

key: cord-305318-cont592g authors: lancaster, madeline a.; huch, meritxell title: disease modelling in human organoids date: 2019-07-01 journal: dis model mech doi: 10.1242/dmm.039347 sha: doc_id: 305318 cord_uid: cont592g

the past decade has seen an explosion in the field of in vitro disease modelling, in particular the development of organoids. these self-organizing tissues derived from stem cells provide a unique system to examine mechanisms ranging from organ development to homeostasis and disease. because organoids develop according to intrinsic developmental programmes, the resultant tissue morphology recapitulates organ architecture with remarkable fidelity.
furthermore, the fact that these tissues can be derived from human progenitors allows for the study of uniquely human processes and disorders. this article and accompanying poster highlight the currently available methods, particularly those aimed at modelling human biology, and provide an overview of their capabilities and limitations. we also speculate on possible future technological advances that have the potential for great strides in both disease modelling and future regenerative strategies. organoids (see box 1) are a powerful new system and are being increasingly adopted for a wide range of studies. indeed, since 2009, over 3000 papers have been published that make use of organoids in some form or another (see poster and box 2). organoids can be derived either from pluripotent stem cells (pscs), adult-tissue-resident cells (stem or differentiated cells) or embryonic progenitors (huch and koo, 2015) . because organoids follow the same basic intrinsic patterning events as the organ itself, they are a useful tool for investigating developmental organogenesis or processes of adult repair and homeostasis. their more accurate organ-like organization makes them a valuable tool for disease modelling, although further improvements, particularly in scalability, are still needed. nonetheless, because organoids have such potential, extensive effort has been made at generating organoids for a range of organ types. we describe the methods and applications for these in more detail in this article and in the accompanying poster, and discuss future directions for improving this technology and furthering its applications. pluripotent stem-cell-derived and adult-tissue-derived organoids one of the major leaps that has led to the development of organoid methods was the realization that, to better recapitulate organ morphology in vitro, one must first understand the development of that organ and try to mimic it. 
thus, years of research on the patterning events and signalling cascades at play during organogenesis have provided the necessary foundation to make organoid research possible. the relatively recent advent of human pluripotent stem cell (psc) cultures has provided the starting point for this process. remarkably, human pscs can be induced to spontaneously undergo differentiative and morphogenetic behaviours that mimic the formation of embryonic germ layers, especially when they are forced to form three-dimensional (3d) aggregates called embryoid bodies (ebs; box 3). ebs form germ layers that express well-described molecular markers and even segregate to form patches of individual germ-layer tissue within the aggregate. by applying specific growth factors, these aggregates can be directed towards a specific germ layer. biologists have taken advantage of the knowledge on developmental events to construct or reconstruct a tissue in vitro (organoids) that recapitulates many of the key structural and functional features of the organ (box 1). the range of organoids established so far is rapidly increasing, and although in many cases these have been first developed from rodent pscs, researchers are translating experimental methods to human cells. the use of pscs as a starting point to generate 3d organoid cultures allows the in vitro recapitulation of the developmental processes and tissue organogenesis that occur in vivo. however, once organ growth is terminated, adult tissues still require tissue maintenance and repair to ensure proper functionality of the adult organ. recently, a better understanding of the signalling pathways important for adult tissue maintenance and repair has enabled researchers to grow primary tissues as self-expanding organoids that retain all the characteristics of the tissue of origin, as well as their genetic stability over time (see huch and koo, 2015 for an extended review; table 1 ). 
here, we summarize the methods for derivation of organoids from human cells, be it pscs or tissue-resident cells. the development of 3d organoid cultures has proven very successful for several endoderm-derived organs, such as the intestine, stomach, liver, pancreas and lung. in this section, we discuss the findings and conditions used to develop these organoids. the term 'organoid' is actually not new. a simple pubmed search reveals its first usage in 1946 in reference to a tumour case study (smith and cochrane, 1946) . however, at that time, the term was used to describe certain histological features of tumours, such as glandular organization. over time, its meaning evolved to generally refer to tissues or structures that resemble an organ, and so this term became increasingly used in in vitro biology. however, it wasn't until the development of intestinal organoids in 2009 (sato et al., 2009 ) that it began to be used more specifically for these self-organizing in vitro structures. nonetheless, the term 'organoid' continues to be used for a wide variety of tissues or structures that may or may not fully recapitulate key features of an organ. therefore, in an attempt to clarify some of the confusion surrounding this term, we use a previously proposed working definition (lancaster and knoblich, 2014; huch and koo, 2015; clevers, 2016 ) that fulfils the most basic definition; namely, 'resembling an organ'. more specifically, a genuine organoid should satisfy several criteria: (1) a 3d structure containing cells that establish or retain the identity of the organ being modelled; (2) the presence of multiple cell types, as in the organ itself; (3) the tissue exhibits some aspect of the specialized function of the organ; and (4) self-organization according to the same intrinsic organizing principles as in the organ itself (sasai, 2013a,b) . since the dawn of biology as a scientific field, scientists have sought to understand mechanisms of human disease. 
while studying patients can give insight into symptoms and help describe the course of the disease, the underlying causes are often enigmatic. therefore, animal models have been a staple of human disease research, as they can be engineered to develop the disease, for example, by introducing relevant mutations. however, because of evolutionary divergence, there are many features of human biology and disease mechanisms that cannot be accurately modelled in animals. for instance, rodent models of alzheimer's disease do not ubiquitously display the characteristic plaques and tangles (asaad and lee, 2018) . when key disease features are absent from animal models, it is difficult to study the underlying mechanisms. for these reasons, efforts over the past several decades have focused on modelling human disease biology in a dish. while hela cells and other immortalized human cell lines have proven to be a powerful tool, their use has several considerable drawbacks, including their genomic instability and limited tissue identities (adey et al., 2013) . furthermore, the expansion of adult primary tissues far beyond the predicted hayflick limit is a real challenge (hayflick, 1965) . thus, more recent approaches have focused on in vitro models derived from stem cells, which allow for a broader array of tissue identities, long-term expansion, better genomic integrity and improved modelling of healthy biology. a stem cell is any cell that is able to self-renew and generate differentiated progeny, i.e. capable of generating several defined identities. thus, differentiated organ cell types can be generated in vitro from pluripotent stem cells (pscs) by following a series of differentiation steps that mimic early embryonic development. alternatively, differentiated cells derived from resident adult stem/progenitor cells, such as the stem cells of the intestinal crypt, can be used but historically these have proven difficult to expand in vitro (bjerknes and cheng, 2006) . 
in order to overcome these limitations and also better recapitulate tissue architecture, a new field of 3d in vitro biology called tissue engineering has come into the spotlight. by combining biology and engineering, more elaborate conformations of cells have been established, allowing for multiple cell types to be combined in a configuration that more closely mimics organ architecture. perhaps the most exciting developments have been the recent organ-on-a-chip methods, which allow for the construction of connected chambers that mimic different organ compartments, for example liver ducts and blood vasculature (huh et al., 2011) . while tissue engineering has provided a number of highly useful models for looking at the interaction between different cell types, there are certain artificial aspects that affect their ability to accurately model organ structure and therefore function. for example, the use of artificial membranes and matrices means that the cells are positioned via exogenous processes, rather than through bottom-up self-organizing principles as in organ development in vivo. thus, to more accurately model organ architecture, very recent efforts have focused on supporting the self-organizing development of these tissues in vitro. and so, the organoid field was born. in the intestine, wnt, notch, fibroblast growth factor (fgf)/ epidermal growth factor (egf) and bone morphogenetic protein (bmp)/nodal signalling are required during tissue development and in adult homeostasis and repair. by combining the knowledge on stem-cell populations and on the intestinal stem-cell niche requirements, intestinal organoids from either postnatal or adult intestinal epithelium (ootani et al., 2009) or from a single adult intestinal stem cell (sato et al., 2009) have been established, with an expansion potential far beyond the hayflick limit (>1 year in culture). 
similarly, mouse- and human-derived colonic stem cells have also been expanded in organoid cultures with minor adjustments in the medium composition. the culture conditions that support intestinal organoids include embedding the intestinal stem cells (or cells with the ability to acquire stem-cell potential, i.e. immediate daughters of the stem cells) in a 3d extracellular matrix (e.g. matrigel) and culturing them in a medium supplemented with egf, noggin and the wnt agonist r-spondin (rspo). these conditions allow the long-term expansion of adult intestinal epithelium in culture while retaining its ability to differentiate into individual derivatives (ootani et al., 2009). in parallel studies, the wells lab successfully differentiated pscs into gut epithelium in vitro by treating definitive endoderm-specified cells with wnt3a and fgf4 to induce posterior endoderm patterning and hindgut specification (spence et al., 2011). these conditions result in the formation of spheroids after 2 days. then, upon transfer into matrigel and incubation in culture conditions that support the growth of adult-tissue-derived organoids, these psc-derived spheroids formed bona-fide small-intestinal organoids. recently, these psc-derived intestinal organoids have been combined with neural crest cells to recapitulate the normal and functional intestinal enteric nervous system (workman et al., 2017; schlieve et al., 2017). gastric organoids have been obtained either by expansion of adult gastric stem cells from both the corpus and the antropyloric epithelia, or by directed differentiation of pscs. the identification of lgr5 as a marker for pyloric stem cells (barker et al., 2010) led to the development of the first long-term culture system of mouse gastric stem cells. corpus and pylorus stem cells were grown in matrigel and in medium supplemented with wnt3a and fgf10 (barker et al., 2010; stange et al., 2013).
blockade of transforming growth factor-beta (tgfβ) signalling enabled the expansion of human gastric organoids from both pylorus and corpus epithelium that would contain all gland and pit cell types (bartfeld et al., 2015) . differentiation into acid-producing parietal cells proved more difficult, though. only upon co-culture with the mesenchymal niche, corpus organoids from neonatal (li et al., 2014) and adult (bertaux-skeirik et al., 2016) mouse stomach tissue could differentiate into this mature cell type. the human counterpart has not been achieved, yet. to obtain psc-derived gastric organoids, mccracken et al., upon establishing definitive endoderm with activin (kubo et al., 2004; d'amour et al., 2005) , exposed the cells to wnt, fgf4 and noggin to enable gastric specification. noggin is essential to prevent intestinalization and to promote a foregut fate (mccracken et al., 2014) , while retinoic acid facilitates antrum epithelium specification. matrigel is essential for the formation of 3d foregut structures. when maintained in an egf-rich medium, these structures generate gastric organoids that contain all antral epithelium cell types (mccracken et al., 2014) . noguchi and colleagues used mouse pscs to generate a functional corpus epithelium in vitro that contained acid-producing cells (noguchi et al., 2015) . in a different approach, wells and colleagues observed that, after foregut patterning, sustained wnt signalling using the gsk3β inhibitor chir enabled fundus (corpus) specification instead of antrum. these fundic cells could be subsequently differentiated into all cell types of the gastric epithelium, including functional parietal cells (mccracken et al., 2017) . 
interestingly, the authors demonstrate that their psc-differentiated fundic organoids contain bona-fide stomach progenitors that can be isolated, cultured and further propagated in the tissue-derived organoid medium described by barker, huch and colleagues (barker et al., 2010) to expand adult-tissue-derived organoids (mccracken et al., 2017), thus linking, for the first time, the development of both types of organoids.

box 3. germ-layer formation: the starting point of organogenesis
all organs develop from primitive tissues that form at the very early onset of embryogenesis. after fertilization, cells within the early blastula establish two main compartments: the extraembryonic tissue, which will give rise to the supportive foetal environment including placenta and amniotic sac, and the inner cell mass (icm), which will give rise to the embryo proper. human pluripotent stem cells (pscs) can be thought of as roughly equivalent to icm stem cells, and indeed human escs are taken from the icm of the human blastocyst. upon gastrulation, the embryonic disc, which derives from the icm, undergoes morphogenetic movements that establish the three germ layers: endoderm, mesoderm and ectoderm. progenitors within these germ layers have restricted potentials to generate primordial organ structures, and their specific identity will depend on their spatial context relative to one another, and on where they lie on the anterior-posterior and dorsal-ventral axes (zernicka-goetz and hadjantonakis, 2014). this is due to the early establishment of gradients of morphogens that will influence differentiation towards particular subregions of these germ layers. for example, the endoderm will give rise to the entire gastrointestinal tract and, depending on their location, endodermal progenitors can give rise to more anterior identities, such as the stomach, or to more posterior identities, such as the colon. this is due to an anterior-posterior gradient of signalling factors like wnt, bmp and fgfs (zorn and wells, 2009). similarly, bmp4 promotes formation of the mesendoderm, a precursor to certain mesoderm and endoderm types, whereas early activation of wnt can promote the presomitic mesoderm, the precursor to the kidney (little et al., 2016). in contrast, in the absence of growth factors or serum, the embryoid body tends to be biased towards the ectoderm as the default. by understanding these specific morphogen gradients and their effects on the primordial tissue, these growth factors can be provided at specific concentrations and with specific timing to direct the differentiation of progenitors towards certain organ identities.

adult hepatocytes and cholangiocytes (also known as ductal cells) are the two endodermal-derived cell types in the adult liver, yet the organ also contains mesoderm-derived hepatic mesenchymal cells. michalopoulos et al. first described liver-derived 3d structures back in 2001. in these studies, these 'organoids' were very different from what we now consider liver organoid cultures, as they would only survive for a short period of time in culture, yet they retained some of the function and structure of the hepatocyte epithelium (michalopoulos et al., 2001). it was not until 2013 that the first liver organoids as we know them now were described. huch et al. established the first adult murine-tissue-derived liver organoid culture that sustains the long-term expansion of liver cells in vitro (huch et al., 2013b). by combining matrigel with hepatocyte growth factor (hgf), egf and the liver-damage-induced factors fgf (takase et al., 2013) and rspo1 (huch et al., 2013b), the isolated liver cells self-organized into 3d structures that retained the ability to differentiate into functional hepatocyte-like cells, even when grown from a single cell (huch et al., 2013b).
addition of an activator of cyclic adenosine monophosphate (camp) signalling and inhibition of tgfβ signalling adapted this culture system to the expansion of adult human liver cells as self-renewing organoids that recapitulate some function of ex vivo liver tissue. of note, both human and mouse cultures could be established from single cells, hence enabling, for the first time, the study of mutational processes in healthy tissue (blokzijl et al., 2016). takebe et al. took a completely different approach and, by mixing human induced psc (ipsc)-derived hepatocytes with mesenchymal stem cells (mscs) and umbilical vein endothelial cells (huvecs), obtained the first embryonic liver bud organoids formed by proliferating hepatoblasts and supporting cells. when transplanted into different mouse sites, these developed into mature functional hepatic tissue (takebe et al., 2013). in a follow-up study, takebe and colleagues also differentiated ipscs towards the three hepatic progenitors (hepatic endoderm, endothelium and septum transversum mesenchyme), hence overcoming the issue of using postnatal tissue-derived stromal progenitors (takebe et al., 2017). in 2015, sampaziotis et al. also generated cholangiocyte organoids from ipscs (sampaziotis et al., 2015). whether combining both protocols could generate structures containing both hepatocytes and cholangiocytes or whether, instead, a better specification of true bi-potent hepatoblasts, like the novel lgr5+ population described by prior et al. (2019), is required first to achieve this goal remains to be determined. since these seminal papers, altered protocols have enabled the establishment of liver models from different species, from rat to dog (nantasanti et al., 2015; kuijk et al., 2016; kruitwagen et al., 2017), as well as human disease models ranging from alpha-1-antitrypsin (a1at) deficiency and alagille syndrome to polycystic liver disease (wills et al., 2016) or cancer (broutier et al., 2017; nuciforo et al., 2018). 
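the mouse-to-human adaptation of the liver organoid expansion conditions described above can be summarized as a simple set difference. the set names and helper below are our own illustrative encoding, assuming only the components named in the text, not a published media formulation.

```python
# Our own shorthand (not a published recipe) for the liver organoid media
# described above: the mouse expansion conditions of huch et al. (2013b), and
# the two additional interventions reported to adapt them to adult human cells.
MOUSE_LIVER_MEDIUM = {"matrigel", "hgf", "egf", "fgf", "rspo1"}
HUMAN_LIVER_MEDIUM = MOUSE_LIVER_MEDIUM | {"camp activator", "tgfb inhibitor"}

def human_specific_components(mouse, human):
    """Components in the human medium that the mouse medium lacks."""
    return human - mouse
```

the set view makes the point of the passage explicit: the human system is the mouse system plus two targeted signalling interventions, rather than a wholly different formulation.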
by modifying the huch conditions to expand human organoids, the vallier lab has recently established extrahepatic biliary organoids and showed that these can form bile-duct-like tubes that can reconstruct the gallbladder wall upon transplantation into mice (sampaziotis et al., 2017a). finally, the nusse and clevers labs recently reported the establishment of hepatocyte organoids derived from adult mouse hepatocytes (peng et al., 2018; hu et al., 2018), while the huch lab recently established clonal mouse hepatoblast cultures (prior et al., 2019). while human adult hepatocyte cultures are yet to be established, human embryonic cultures that excellently recapitulate the bile canaliculi structure in vitro have recently been described. whether these cultures can differentiate into more mature hepatocyte or cholangiocyte derivatives and/or exhibit the extensive cellular plasticity of the in vivo tissue is still to be investigated. similarly to the liver, both embryonic and adult pancreas cells are difficult to expand or maintain in culture. by isolating mouse embryonic pancreas progenitors and culturing them in matrigel, grapin-botton and colleagues elegantly showed that the cells could recapitulate pancreatic embryonic development in vitro and generate exocrine (acinar) and endocrine (insulin-producing) cell derivatives (greggio et al., 2013). using a culture system similar to that of the adult liver organoids described above, huch et al. established the first adult mouse pancreas organoid model by seeding pancreas ductal cells in matrigel in the presence of fgf10, noggin, rspo1 and egf (huch et al., 2013a). a similar system was later applied to human pancreatic cancer cells (boj et al., 2015), although long-term expansion of healthy human pancreas tissue is still to be achieved. recently, loomans et al. expanded human pancreas tissue for a few passages and obtained structures formed exclusively of ductal epithelium. 
after transplantation into mice, these expressed some endocrine markers, although they did not differentiate into islets (loomans et al., 2018). unlike the grapin-botton lab's embryonic cultures described above, loomans' adult-tissue-derived ones cannot differentiate into endocrine or acinar cells in vitro. whether that is because adult pancreas ductal cells have lost their endocrine differentiation potency or because the external cues have not yet been identified is unknown. the term 'lung organoids' describes both upper- and lower-airway organoids. protocols for both psc- and adult-tissue-derived organoids have been established. rossant and colleagues differentiated human ipscs to lung epithelium using air-liquid interface culture (wong et al., 2012). this initial protocol was later modified by the spence lab to instruct the cells towards a foregut fate by adding tgfβ/bmp inhibitors, fgf4 and wnt activators. subsequent activation of the hedgehog pathway with the smoothened agonist sag resulted in lung specification. transferring these specified 3d spheroids into matrigel and culturing them in fgf10-rich medium enabled the differentiation to lung organoids containing airway-like structures formed by basal, ciliated and club cells, surrounded by a mesenchymal compartment (dye et al., 2015). interestingly, when transplanted in a bioartificial microporous poly(lactide-co-glycolide) scaffold, full maturation and differentiation of secretory lineages was observed, while long-term engraftment enabled these to differentiate into distal lung cells (dye et al., 2016). parallel studies by the snoeck lab also showed the formation of 3d structures that developed into branching airway and early alveolar epithelium after xenotransplantation (chen et al., 2017). establishing organoids from adult lung tissue has been more difficult. hogan and colleagues pioneered upper-airway organoids, reporting that single basal cells would self-organize into bronchiolar lung organoid cultures. 
these had limited expansion potential, yet were able to differentiate to both basal and luminal cells (rock et al., 2009). building on these cultures, the rajagopal lab was able to expand airway basal stem cells by inhibiting smad signalling (mou et al., 2016). the generation of distal (lower)-airway lung organoids has been more difficult. alveoli consist of surfactant-secreting type ii (at2) and gas-exchanging type i (at1) cells. by identifying and expanding bronchioalveolar stem cells (bascs), kim and colleagues showed that single bascs had bronchiolar and alveolar differentiation potential in lung organoids co-cultured with lung endothelial cells (lee et al., 2014). similarly, barkauskas et al., by co-culturing human at2 cells with lung fibroblasts, established self-renewing human alveolar organoids (barkauskas et al., 2013). co-culturing macrophages with at2 cells promoted the formation of organoids from at2 cells (lechner et al., 2017) in a dose-dependent manner (i.e. the more macrophages, the more at2 growth). only recently, two labs have been able to expand lung progenitors in the absence of a mesenchymal or endothelial niche: the rawlins lab described the first human long-term self-renewing lung organoids derived from embryonic lung (nikolić et al., 2017), while the clevers lab generated 3d human airway organoids containing ciliated, goblet, club and basal cells (sachs et al., 2019). the generation of self-renewing human alveolar organoids in a defined medium still awaits development, though. although it was well known by the mid-20th century that dissociated rat thyroid cells aggregate and reconstitute functionally active thyroid tissue in vitro (mallette and anthony, 1966), it was not until very recently that researchers successfully generated thyroid organoids from pscs. this delay was due to the lack of knowledge on how thyroid tissue develops. 
by studying the transcription factors important for thyroid development, antonica and colleagues found that forced expression of thyroid-specific transcription factors in pscs resulted in thyroid cells that formed follicles in vitro and upon transplantation (antonica et al., 2012). more recently, kotton and colleagues identified the right combination of bmp and fgf signalling to enable the differentiation of ipscs to a thyroid fate. by isolating these thyroid progenitors and cultivating them in matrigel, the authors generated follicular organoids consisting of a monolayer of hormone-producing thyroid cells. of note, following xenotransplantation, these organoids restored thyroid hormone (th) levels in a hypothyroid mouse model (kurmann et al., 2015). whether self-renewing human thyroid organoids can be obtained from adult thyroid epithelium remains to be discovered. in light of the early studies from mallette and anthony and the recent report from nagano and colleagues suggesting that mouse thyroid organoids can be developed from adult tissue (saito et al., 2018), we speculate that the human counterpart will soon also be developed. the embryonic mesoderm gives rise to a variety of internal organs, including kidney, heart, cartilage, bone, reproductive organs and muscle. we detail below the methods to generate the respective human organoids. kidney organoids are an ideal illustration of the successful generation of an in vitro model based upon careful characterization of in vivo development. in particular, several recent studies have begun to shed light on where the key kidney precursors come from. work from taguchi et al. (2014) was instrumental in characterizing the early stages of metanephric kidney development, particularly the formation of metanephric mesenchyme (mm), then applying the identified signalling factors to direct differentiation of mouse and human pscs specifically towards mm cells that could form 3d structures when co-cultured with mouse tissues. 
this study helped define the developmental steps leading to kidney formation. while taguchi et al. demonstrated the successful induction of mm, takasato et al. (2014) and xia et al. (2013) described conditions that would also give rise to the ureteric bud (ub). in both studies, the authors initially directed pscs towards the intermediate mesoderm by first mimicking signalling events involved in primitive streak formation. in the xia et al. study, the cells then gave rise to ub precursors that could mature and integrate into co-cultures with embryonic mouse kidney. takasato et al. instead simultaneously generated human ub and mm identities. not only could these cells integrate into embryonic mouse kidney, but they could also form rudimentary tubule structures upon aggregation even without the supportive mouse tissue. the variety of approaches for generating early self-organising renal structures generally points to important roles for several growth factors, including bmp4, retinoic acid, fgfs, wnt and activin a. however, a breakthrough came with the establishment of the entirely self-organizing 3d human renal organoids by takasato et al. (2015), which involved minimal use of growth factors (just chir99021 to activate wnt, and fgf9), and led to structures with all the major components of the developing kidney, including the collecting duct, proximal and distal tubules, glomeruli, and even endothelial networks. although proper bone organoids have not yet been established, kale et al. described a promising approach to form spheroids of human adult bone precursor cells (kale et al., 2000). these are a heterogeneous population of bone progenitors, including osteoblasts, that can be induced to form small pieces of crystalline bone called microspicules. this requires 3d aggregation of the cells as well as the removal of serum and addition of tgfβ1. 
given that this approach lacks the varied cell types and tissue architecture, it will be interesting to see whether current improvements in 3d culture methods could further develop these spheroids to bona-fide bone organoids. adult-derived organoid methodology has recently been applied to the fallopian tube of the female reproductive tract. the method described (kessler et al., 2015) has many similarities to the barker and huch et al. stomach organoid method and involves culture of human adult fallopian tube epithelial cells in matrigel in the presence of mouse gastric organoid medium (barker et al., 2010) with tgfβ inhibition. the resulting 3d cystic spheres develop various invaginations as they mature, and contain both ciliated and secretory cells. these tissues also respond to the sex hormones estradiol and progesterone in a manner reminiscent of the response of this tissue in vivo during the menstrual cycle. two independent groups have derived organoids from the adult endometrium using similar conditions as those described for adult liver organoids (turco et al., 2017; boretto et al., 2017) . hence, these required the presence of rspo, egf, fgf10 and noggin, but no wnt supplementation. both studies demonstrated that the endometrial organoids respond to estradiol and progesterone in specific and differential manners. specifically, estradiol stimulated increased proliferation while progesterone stimulated a more mature morphology, mimicking the later secretory phase of the menstrual cycle. these morphological and transcriptomic changes recapitulated the changes in the endometrium during the menstrual cycle. furthermore, turco et al. exposed the endometrial organoids to pregnancy signals (prolactin, human chorionic gonadotropin and human placental lactogen), which led to a decidua-like morphology similar to that seen in early pregnancy. 
the embryonic ectoderm gives rise to two main tissue types: surface ectoderm, which will develop into skin and its associated glands and hair; and neural ectoderm, which will give rise to the brain, spinal cord and neural crest. the neural crest is a highly multipotent entity that can further differentiate into the peripheral nervous system, as well as bone, cartilage, connective tissue, and vasculature of the head, and even contributes to the heart. for decades, mammary cells have been cultured in 3d extracellular matrix gels as a model for mammary gland development, homeostasis and tumorigenesis (hall et al., 1982) . indeed, isolated mouse mammary epithelial cells were some of the first cells to be cultured in matrigel (li et al., 1987) , and could give rise to branched structures reminiscent of the mammary gland. however, differences between mouse and human mammary stroma have made it difficult to directly translate these methodologies to a human in vitro model. the first 3d branched mammary-gland-like structures from human mammary cells were described by gudjonsson et al. (2002) , who isolated a progenitor population that could generate several cell types and the typical branched morphology of the mammary gland when embedded in matrigel. similarly, dontu et al. described the formation of mature ductal-acinar-like structures from multipotent human mammary epithelial cells upon embedding in matrigel and exposure to prolactin (dontu et al., 2003) . importantly, these studies successfully generated mammary-gland-like organoids from cells first expanded in vitro either as a cell line or in mammospheres. more recently, researchers devised methods to generate branching structures directly from isolated human mammary epithelial cells (linnemann et al., 2015) . 
the formation of these structures similarly relies upon embedding in 3d matrix gels and the presence of growth factors, including egf, hydrocortisone and insulin, to promote mammary epithelial-stem-cell proliferation. under these conditions, and with the addition of forskolin (fsk), human mammary epithelial cells spontaneously generated structures reminiscent of the terminal ductal lobular units of the mammary gland. importantly, these structures contained multiple cell types at correct positions within the branched structure and showed contractile activity, a feature of the basal/myoepithelial cells that eject milk during lactation. there is some controversy about the developmental origin of the salivary glands, whether ectodermal or endodermal (see patel and hoffman, 2014 and de paula et al., 2017 for details), yet mouse lineage-tracing studies suggest an ectodermal origin (rothova et al., 2012), although there is no direct evidence for this yet. by applying the clevers organoid method to primary adult salivary gland cells, coppes and colleagues efficiently generated long-term-expanding organoids from the mouse and human salivary gland. the culture conditions required high levels of wnt activation to obtain organoids containing all differentiated salivary gland cell types. transplantation of these organoids into murine submandibular glands restored saliva secretion and increased the number of functional salivary gland acini in vivo. these studies hold the promise of salivary gland transplantation as a potential treatment for xerostomia (severe hyposalivation). the retina is the neural part of the eye and contains photoreceptors, supportive cells and interneurons. self-organizing retinal tissues were the first entirely 3d neural organoids to be successfully established from mouse (eiraku et al., 2011) and later human (nakano et al., 2012) escs. as with many organoids, matrigel turned out to be a key component. 
both studies described a very minimal medium, along with wnt inhibition, to develop retinal identities from escs. but it was the addition of dissolved matrigel that allowed for the formation of a more rigid epithelium that could adopt the morphology of the retinal primordium, the optic cup. these optic-cup-like organoids begin as aggregates containing large spherical vesicles of neuroepithelium with an early retinal identity and appear similar to optic vesicles. as development proceeds, these vesicles spontaneously invaginate to form a cup-like morphology reminiscent of the developing optic cup. optic-cup organoids form the typical stratified architecture of the developing retina, including rods and cones. however, generating functional photoreceptors, including light-sensitive outer segments, has been a challenge. since the initial description of self-organizing retinal tissues from human escs, others have improved upon these methods and recently described extended culture times and further maturation of the photoreceptors to include rudimentary outer-segment discs and even occasional light responsiveness (wahlin et al., 2017). in vitro modelling of human brain development has been a fast-evolving field that builds upon decades of in vivo and ex vivo work. the discovery in 2001 that human escs could form 2d rosette-like structures after an initial phase of 3d culture as an eb (zhang et al., 2001) revealed the self-organization potential of neural progenitors to form neural-tube-like structures. eiraku et al. then expanded upon this by extending the initial 3d culture phase before plating on coated dishes, which allowed the formation of even more complex stratified structures reminiscent of the developing cerebral cortex (eiraku et al., 2008; mariani et al., 2012). 
specifically, neural progenitors could self-organize to form a continuous neuroepithelium, similar to retinal organoids, but as these brain-regionalized structures developed, progenitors generated neurons that properly migrated away from germinal zones to populate a more basal region reminiscent of the pre-plate of the cerebral cortex. this germinal zoning and segregation of post-mitotic neurons is highly reminiscent of the developing human brain. these initial cultures used dual smad and wnt inhibition (watanabe et al., 2005; eiraku et al., 2008) to promote the formation of relatively pure neural identities as well as to direct differentiation towards a telencephalic fate. subsequent work showed that, in the absence of these inhibitors, simply providing a very minimal medium to prevent the expansion of non-neuroectodermal identities could generate a broader brain regional identity (lancaster et al., 2013). furthermore, embedding in matrigel led to a dramatic reorganization and expansion of the neuroepithelium that allowed the formation of neural-tube-like buds that further expanded without the need for replating the aggregates. like the telencephalic structures described above, these cerebral organoids developed cortical regions with characteristic germinal and differentiated zones. however, these 3d tissues also formed a variety of brain regions, from hindbrain to retinal identities, within the same organoid. similarly, maintenance as a floating aggregate was also applied to the telencephalic structures to generate 3d forebrain organoids with improved tissue architecture (kadoshima et al., 2013). 
since the establishment of telencephalic and cerebral organoids, subsequent studies further modified these approaches and generated organoids with more specific brain regional identities, including cortical spheroids (paşca et al., 2015) , hippocampal organoids (sakaguchi et al., 2015) , midbrain organoids (jo et al., 2016) , pituitary and hypothalamic organoids (suga et al., 2011; ozone et al., 2016) , and cerebellar organoids (muguruma et al., 2015) . more recently, researchers fused regionalized brain organoids, revealing that interneurons generated within the ventral telencephalic region could indeed migrate to dorsal cortical tissue as they do in vivo (birey et al., 2017; bagley et al., 2017; xiang et al., 2017) . finally, extended culture of cerebral organoids resulted in neuron maturation within the organoid, allowing for the formation of rudimentary networks and even rare responses to light in these whole-brain organoids containing retinal cells (quadrato et al., 2017) . the inner ear develops from non-neural ectoderm and contains thousands of hair cells that respond to minute air vibrations and thus allow us to hear, as well as detecting head movements and gravity. koehler et al. developed human inner-ear organoids from escs by using bmp4 to promote the non-neural ectoderm lineage as opposed to the neural ectoderm (koehler et al., 2017) . however, tgfβ inhibition prevented unwanted mesoderm and endoderm formation. this approach led to the successful formation of an epithelium on the surface of 3d aggregates of escs that expressed otic placode markers, the precursor to the inner ear. wnt activation subsequently led to the development of otic vesicles with supportive mesenchyme and later formation of epithelia-containing sensory hair-like cells. 
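the koehler et al. (2017) inner-ear route just described is a strictly ordered sequence of lineage decisions, which can be sketched as a pipeline. the stage labels below are our own shorthand for the steps named in the text, not terminology or code from the paper.

```python
# Sketch only: the inner-ear organoid route described above, written as an
# ordered pipeline. Stage labels are our own shorthand.
INNER_EAR_PIPELINE = [
    ("non-neural ectoderm", "bmp4 activation"),
    ("lineage restriction", "tgfb inhibition (blocks mesoderm/endoderm)"),
    ("otic placode epithelium", "self-forms on 3d esc aggregates"),
    ("otic vesicles", "wnt activation"),
    ("sensory hair-like cells", "later epithelial self-organization"),
]

def precedes(earlier, later):
    """True if `earlier` occurs before `later` in the pipeline."""
    order = [name for name, _ in INNER_EAR_PIPELINE]
    return order.index(earlier) < order.index(later)
```

writing the route this way highlights that the early steps are exclusions (keeping cells out of neural ectoderm, mesoderm and endoderm) and only the later steps are positive otic instructions.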
a primary goal of human organoids is their use in modelling human diseases in order to establish paradigms for drug screening, genotype-phenotype testing and even biobanking for specific diseases and future personalized treatments, including cell therapies. while the establishment of a human organoid model for most organs is still quite new, and therefore disease modelling in this manner is still in its infancy, there have already been several examples of using organoid cultures to study congenital or acquired human diseases. the first human condition to be modelled in organoids was cystic fibrosis (cf), in the intestinal organoid system. cf is caused by mutations in the cystic fibrosis transmembrane conductance regulator (cftr) chloride channel, which is normally expressed in epithelial cells of many organs. dekkers and colleagues (2013) generated cf-patient-derived intestinal organoids that could recapitulate the disease in vitro. they developed a swelling assay in which wild-type organoids respond to the activation of camp by importing fluid to the lumen and subsequently swelling, whereas this response is abolished in cf organoids. this approach demonstrated the excellent predictive value of intestinal organoids for identification of responders to cftr modulators, and has become the first personalized treatment test for cf patients in the netherlands (berkers et al., 2019) . in a parallel approach, vallier and colleagues also showed that ipscs from cf patients differentiated into liver cholangiocytes can also model cf in vitro, as these cells also lack the ability to swell compared to wild-type controls (sampaziotis et al., 2015) . in addition, huch et al. showed that liver organoids from patients with a1at deficiency recapitulated the epithelial features of the disease, whereby precipitates of the misfolded a1at protein accumulated in the differentiated hepatocyte-like cells in vitro . 
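the readout logic of the forskolin/camp swelling assay described above can be sketched numerically: organoid area is measured before and after stimulation, and the relative increase separates functional cftr from cf organoids. the 20% cutoff and the function names below are our own illustrative choices, not part of the published dekkers et al. (2013) analysis.

```python
# Hedged sketch of the swelling-assay readout described above. The cutoff is
# an illustrative threshold of ours, not the published analysis.

def swelling_ratio(area_before, area_after):
    """Relative increase in organoid cross-sectional area after stimulation."""
    if area_before <= 0:
        raise ValueError("area_before must be positive")
    return (area_after - area_before) / area_before

def cftr_functional(area_before, area_after, cutoff=0.20):
    """True if swelling exceeds the (illustrative) cutoff, i.e. the organoid
    imports fluid into its lumen as wild-type or modulator-responsive
    epithelium does."""
    return swelling_ratio(area_before, area_after) > cutoff
```

a readout of this shape is what makes the assay usable for per-patient modulator screening: each drug condition reduces to a single swelling ratio that can be compared against controls.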
similarly, failure to develop mature biliary cells in liver organoids from an alagille syndrome patient mirrored the biliary defects observed in patients. when cerebral organoids were first established, they were also applied to the study of primary microcephaly, a genetic condition caused by a mutation in cdk5rap2 (lancaster et al., 2013). the brain organoids generated from patient-derived ipscs were overall much smaller and the individual cortical regions were especially hypoplastic. a series of observations at various time points and specific examination of mitotic spindle orientation during progenitor divisions revealed that the patient neural stem cells began dividing asymmetrically and generating neurons too early, which led to a depletion of the progenitor pool and an eventual decrease in overall neuron number. because mice could not fully recapitulate the extent of brain-size reduction seen in humans, the organoids revealed certain morphological differences that could only be seen in this human-specific model system. a modification of the more regionally specified telencephalic protocol summarized above (eiraku et al., 2008) was also used to model a human neurodevelopmental condition, idiopathic autism spectrum disorder (asd). mariani et al. (2015) established ipsc lines from four asd patients along with closely related unaffected controls. these were initially grown as 3d aggregates, followed by plating of rosettes as described previously (eiraku et al., 2008; mariani et al., 2012), with the modification that rosettes were then lifted off and grown again as 3d aggregates to yield forebrain organoids. although generally very similar between probands and controls, the asd organoids displayed an increased number of inhibitory interneurons as a result of upregulated foxg1, an important factor in forebrain patterning. 
more recently, a genetic condition causing lissencephaly (smooth brain) was modelled using forebrain organoids, revealing defects in progenitors, including mitosis timing, spindle orientation and defective wnt signalling (bershteyn et al., 2017; iefremova et al., 2017). leber congenital amaurosis is a ciliopathy that affects the retina and leads to inherited blindness. to model this condition, parfitt et al. (2016) generated retinal organoids from ipscs with a mutation in cep290, a known genetic cause of this condition. these organoids displayed normal initial development into optic cups, but the resulting tissues showed decreased ciliation and reduced cilia lengths. by restoring the expression of full-length cep290, the authors were able to rescue the cilia length and protein trafficking in the cilium. not only can organoids be used to model congenital conditions from stem cells of patients carrying the genetic mutation, but they can also be used to model acquired diseases, such as those caused by infectious agents or by acquired mutations, as in the case of cancer. since the first report that hela cells could be grown in vitro, researchers have established cancer cell lines for the majority of tumour types, and these have facilitated seminal discoveries in cancer biology. in addition, their ease of culture has made them excellent for large-scale drug screening and development (alley et al., 1988), and has enabled the identification of genomic markers of drug sensitivity (garnett et al., 2012). however, they have certain drawbacks: (1) they lack the tissue architecture of the organ in question, which is often intimately related to differentiation and disease progression; (2) their establishment usually requires a strong cellular selection; and (3) they present extensive heterogeneity between labs, with marked differences in gene expression and proliferation, and considerable variability in drug responses (ben-david et al., 2018; hynds et al., 2018). 
in an attempt to obtain better models that recapitulate the architecture, genetics and drug responsiveness of patients' tumours, patient-derived xenografts (pdxs) emerged as a very suitable alternative. unfortunately, they are not applicable to all cancer models, are expensive and are impractical for large drug screenings. [for an extended review on pdx models, see hidalgo et al. (2014).] in that regard, the discovery that healthy tissue from colon (jung et al., 2011; sato et al., 2011) and stomach (barker et al., 2010) could be expanded in vitro prompted many researchers to invest significant effort in using organoid technology to model cancer for personalized medicine, drug testing and drug discovery. indeed, the use of organoids to model cancer in vitro has become of great interest to the cancer field, and organoids derived from tissue resections, biopsies or even circulating tumour cells have now been established (see table 2 for a detailed list). in all cases, cancer-derived organoids more faithfully maintain the genetic and phenotypic features of the tumour of origin. in that sense, they resemble pdxs, but with the advantage of a higher establishment success rate; they can be easily expanded in vitro and are amenable to drug screening (weeber et al., 2017; pauli et al., 2017). the ability to expand primary cancer tissue in a dish has opened up the possibility of living biobanks of cancer-derived organoid cultures from different tumour types, including colorectal, gastric (yan et al., 2018), breast and bladder (mullenders et al., 2019) cancer. these provide opportunities for drug screening as well as drug development. hence, organoids from many tissues, ranging from the ones listed above to liver (broutier et al., 2017) and oesophageal cancer, have proven suitable for large drug-screening tests. 
While the maintenance of genotypic and phenotypic features, as well as suitability for drug testing, does inform us about the translational potential of cancer organoids, it would be even more informative to investigate their predictive value by correlating drug-sensitivity data with clinical or genomic data. To date, only two studies, one in colon (Vlachogiannis et al., 2018) and the other in bladder cancer organoids, have demonstrated the potential of organoids for predicting patient response. We envision that these are only the tip of the iceberg, with many more studies to come that correlate the predictive value of organoids with clinical outcome. Organoids have also proven a good model with which to study infectious diseases and the mechanisms behind human-specific infectious agents. Hence, human small-intestinal organoids have been used to model norovirus (HuNoV) infection and propagation, and have enabled the identification of bile as a critical factor for HuNoV replication (Ettayebi et al., 2016). Similarly, Cryptosporidium has been shown to infect and complete its life cycle in intestinal and lung organoids (Heo et al., 2018), which has facilitated the identification of not only type II but also type I interferon signalling as a response to parasite infection. The ability to use human intestinal organoids to propagate coronaviruses in vitro has enabled the identification of the small intestine as an alternative infection route for Middle East respiratory syndrome coronavirus, which causes severe human respiratory infections (Zhou et al., 2017a). Similarly, gastric organoids have been used to develop models of Helicobacter pylori infection (Bartfeld et al., 2015; McCracken et al., 2014), while lung organoids have been used to model influenza virus infection in vitro (Zhou et al., 2018). For extended reviews, see Bartfeld (2016), Lanik et al. (2018) and Dutta et al. (2017).
Along the same lines, cerebral organoids that model genetic microcephaly were recently adapted by a number of research groups to study the mechanisms of microcephaly caused by Zika virus infection. This area of research has highlighted the power of brain organoid methods and the ease with which independent investigators could adopt the technology. Remarkably, initial observations of the effect of Zika virus on brain organoids were described as soon as 3 months after the World Health Organization (WHO) declared Zika a global health emergency in 2016. These initial reports (Garcez et al., 2016; Cugola et al., 2016; Qian et al., 2016) described overall smaller sizes of infected organoids, consistent with the microcephaly seen in patients, and a specific effect on neural progenitors, leading to cell death, reduced proliferation and premature differentiation. More recently, these Zika-infected brain organoids have been used by at least five independent groups to test treatment strategies that would prevent the effects of Zika virus infection on neural progenitors (Zhou et al., 2017b; Sacramento et al., 2017; Xu et al., 2016; Li et al., 2017). The rapidity with which a completely new disease model has been established and used for drug testing speaks to the power and future potential of these in vitro models for a variety of disorders. The discovery of CRISPR/Cas9 gene editing as a user-friendly genetic-engineering tool, compared with TALEN or zinc-finger nuclease technologies, has prompted many investigators to adapt it to almost any cell type in any organism (for extended reviews, see Adli, 2018; Sander and Joung, 2014). The Clevers lab pioneered the application of CRISPR/Cas9 technology to organoid cultures and used it to correct mutations and restore the chloride channel function of CF-patient-derived intestinal organoids (Schwank et al., 2013).
After this seminal paper, many others have highlighted the technology's applicability to organoid cultures and its relevance to furthering our understanding of human pathologies (see Driehuis and Clevers, 2017 for an extended review). In particular, gene editing in colon organoids has enabled the step-wise recapitulation of tumorigenesis in vitro (Drost et al., 2015; Matano et al., 2015), the identification of cancer signatures in microsatellite-unstable tumours (Drost et al., 2017), the discovery of new genes involved in liver cancer (Artegiani et al., 2019) and even the development of the first human brain organoid cancer model, for primitive neuroectodermal tumours (Bian et al., 2018). While organoids are powerful tools for modelling human organogenesis, homeostasis, injury repair and disease aetiology, the technology needs to overcome many hurdles to reach the next level in modelling human disease. Specifically, because of their 3D nature, the size of all organoids is limited by nutrient supply; because organoids lack vasculature, their development and growth depend upon diffusion from the surrounding media. While this is sufficient for smaller organoids or those without complex stratification, such as the branched ductal organoids and the epithelial tissues of the endoderm, the thicker tissues of the brain experience dramatic necrosis in the organoid interior. Thus, extensive effort will likely focus on improving nutrient supply and even vascularization. Some organoids have been implanted into highly angiogenic sites in rodents to enable vascularization and blood perfusion by the host (Takebe et al., 2015). This approach has even revealed that certain human-specific by-products, for example liver metabolites, are detectable in the rodent host blood (Takebe et al., 2013; Hu et al., 2018).
Furthermore, brain organoids transplanted into a highly angiogenic site of the rodent brain could be vascularized to improve survival and attract host-derived microglia (Mansour et al., 2018). Finally, recent work applying fluid flow to kidney organoids revealed the ability of endogenous endothelial cells to form a vascular network and improve the maturation of the kidney tissue (Homan et al., 2019). This vascularization approach has the potential to overcome tissue growth/survival limitations while maintaining in vitro accessibility and scalability. Another hurdle to disease modelling and drug testing is the scalability of organoid cultures. Because of their 3D nature and complex morphologies, organoids are typically examined using laborious and time-consuming assays such as immunohistochemistry. Furthermore, their culture requires larger culture vessels and volumes of media, such that it is usually quite difficult to perform drug testing in the commonly used 384-well plate format. Thus, other scaling approaches are being developed, such as mini-spinning bioreactors, or Spin-Ω, for scaling up the production of brain organoids. The complexity that arises from their self-organization also introduces some unpredictability to this model system, particularly in the case of human PSC-derived organoids. Thus, a desired tissue identity is not always reproducible, and even when it is present, it rarely (if ever) arises in the same configuration from organoid to organoid. This means that researchers using these methods must carefully control which organoids to use for analysis and must examine many organoids in order to separate the real phenotype from the noise of inhomogeneity.
However, a number of new methods have recently been introduced to begin to address this issue, such as bioengineering methods using scaffolds or micropatterned substrates that help guide the development of stem cells to particular identities and morphologies (Warmflash et al., 2014; Lancaster et al., 2017; Knight et al., 2018). As discussed above, organoids have proven amenable to drug testing with relatively small catalogues of drug compounds. More large-scale drug testing to identify novel compounds that may treat a range of patients will require the scaling up of organoids, and the pharmaceutical industry is beginning to investigate this approach. However, in the more immediate term, organoids could be combined with the already established arsenal of 2D and bioengineered disease models. For example, cells generated from organoids could be isolated and cultured in 2D as a more accessible and scalable source. Furthermore, organ-on-a-chip approaches may be applied to organoids, or to cells isolated from organoids, in order to capture their cellular diversity in a defined configuration. Microfluidics can also be applied to organoids to introduce fluid flow or to restrict their growth to defined spatial conformations (Karzbrun et al., 2018). Finally, even with further improvements, organoids will not replace existing models in drug development, and should be thought of as complementary to other methods. In particular, organoids cannot replace animal models, as the in vivo whole-organism context will remain a necessity when evaluating a particular drug candidate. Not only could disease-model organoids be used for drug testing, but there is also an increasing use of liver organoids and liver-on-a-chip models to test drug metabolism (Kimura et al., 2018).
Since the US Food and Drug Administration (FDA) now requires drug companies to demonstrate downstream metabolites before approval, human in vitro liver models allow for initial testing before testing in patients, in whom potentially unpredicted metabolites could do harm. Finally, while still far from being a reality, a long-term goal of organoid technologies will be to apply them to cell replacement or even whole-organ transplantation. Currently, organ donors are in short supply and many patients fail to receive a vital organ transplant in time. A source of functioning cells or tissues that could replace the faulty ones in a patient would be a major advance and could potentially save thousands of lives each year. While vast improvements to the technology are still required to achieve this goal, we envision that the field might see this type of application, and many others, in the future. M.A.L. is an inventor on patent applications describing the development of cerebral organoid methods. M.H. is an inventor on patents describing the development of gastric, pancreas and liver organoids. Work in the Lancaster laboratory is supported by the Medical Research Council (MC_UP_1201/9) and the European Research Council (ERC STG 757710). M.H. is a Wellcome Trust Sir Henry Dale Fellow and is jointly funded by the Wellcome Trust and the Royal Society (104151/Z/14/Z). In addition, work in the Huch lab is also funded by an H2020 European Research Council LSMF4LIFE grant (ECH2020-668350) and an NC3Rs project grant (NC/R001162/1).
the haplotype-resolved genome and epigenome of the aneuploid hela cancer cell line the crispr tool kit for genome editing and beyond feasibility of drug screening with panels of human tumor cell lines using a microculture tetrazolium assay generation of functional thyroid from embryonic stem cells probing the tumor suppressor function of bap1 in crispr-engineered human liver organoids a guide to using functional magnetic resonance imaging to study alzheimer's disease in animal models fused cerebral organoids model interactions between brain regions type 2 alveolar cells are stem cells in adult lung lgr5+ve stem cells drive self-renewal in the stomach and build long-lived gastric units in vitro modeling infectious diseases and host-microbe interactions in gastrointestinal organoids in vitro expansion of human gastric epithelial stem cells and their responses to bacterial infection genetic and transcriptional evolution alters cancer cell line drug response rectal organoids enable personalized treatment of cystic fibrosis human ipsc-derived cerebral organoids model cellular features of lissencephaly and reveal prolonged mitosis of outer radial glia co-culture of gastric organoids and immortalized stomach mesenchymal cells genetically engineered cerebral organoids model brain tumor formation assembly of functionally integrated human forebrain spheroids intestinal epithelial stem cells and progenitors tissue-specific mutation accumulation in human adult stem cells during life organoid models of human and mouse ductal pancreatic cancer development of organoids from mouse and human endometrium showing endometrial epithelium physiology and long-term expandability human primary liver cancer-derived organoid cultures for disease modeling and drug screening modeling liver cancer and therapy responsiveness using organoids derived from primary mouse liver tumors a three-dimensional model of human lung development and disease from pluripotent stem cells modeling development and 
disease with organoids the brazilian zika virus strain causes birth defects in experimental models efficient differentiation of human embryonic stem cells to definitive endoderm overview of human salivary glands: highlights of morphology and developing processes a functional cftr assay using primary cystic fibrosis intestinal organoids in vitro propagation and transcriptional profiling of human mammary stem/progenitor cells crispr/cas 9 genome editing and its applications in organoids sequential cancer mutations in cultured human intestinal stem cells use of crisprmodified human stem cell organoids to study the origin of mutational signatures in cancer disease modeling in stem cell-derived 3d organoid systems in vitro generation of human pluripotent stem cell derived lung organoids a bioengineered niche promotes in vivo engraftment and maturation of pluripotent stem cell derived human lung organoids self-organized formation of polarized cortical tissues from escs and its active manipulation by extrinsic signals self-organizing optic-cup morphogenesis in three-dimensional culture replication of human noroviruses in stem cell-derived human enteroids organoid cultures derived from patients with advanced prostate cancer zika virus impairs growth in human neurospheres and brain organoids systematic identification of genomic markers of drug sensitivity in cancer cells artificial three-dimensional niches deconstruct pancreas development in vitro isolation, immortalization, and characterization of a human breast epithelial cell line with stem cell properties lumen formation by epithelial cell lines in response to collagen overlay: a morphogenetic model in culture the limited in vitro lifetime of human diploid cell strains modelling cryptosporidium infection in human small intestinal and lung organoids patient-derived xenograft models: an emerging platform for translational cancer research flowenhanced vascularization and maturation of kidney organoids in vitro long-term 
expansion of functional mouse and human hepatocytes as 3d organoids a threedimensional organoid culture system derived from human glioblastomas recapitulates the hypoxic gradients and cancer stem cell heterogeneity of tumors found in vivo modeling mouse and human development using organoid cultures unlimited in vitro expansion of adult bi-potent pancreas progenitors through the lgr5/rspondin axis in vitro expansion of single lgr5+ liver stem cells induced by wnt-driven regeneration long-term culture of genome-stable bipotent stem cells from adult human liver from 3d cell culture to organson-chips the secret lives of cancer cell lines an organoid-based model of cortical development identifies non-cell-autonomous defects in wnt signaling contributing to miller-dieker syndrome midbrain-like organoids from human pluripotent stem cells contain functional dopaminergic and neuromelaninproducing neurons isolation and in vitro expansion of human colonic stem cells self-organization of axial polarity, inside-out layer pattern, and species-specific progenitor dynamics in human es cell-derived neocortex three-dimensional cellular development is essential for ex vivo formation of human bone identification of multipotent luminal progenitor cells in human prostate organoid cultures human brain organoids on a chip reveal the physics of folding the notch and wnt pathways regulate stemness and differentiation in human fallopian tube organoids organ/body-on-a-chip based on microfluidic technology for drug discovery engineering induction of singular neural rosette emergence within hpsc-derived tissues generation of inner ear organoids containing functional hair cells from human pluripotent stem cells long-term adult feline liver organoid cultures for disease modeling of hepatic steatosis development of definitive endoderm from embryonic stem cells in culture generation and characterization of rat liver stem cell lines and their engraftment in a rat model of liver failure regeneration 
of thyroid function by transplantation of differentiated pluripotent stem cells organogenesis in a dish: modeling development and disease using organoid technologies cerebral organoids model human brain development and microcephaly guided selforganization and cortical plate formation in human brain organoids stem cellderived models of viral infections in the gastrointestinal tract recruited monocytes and type 2 immunity promote lung regeneration following pneumonectomy lung stem cell differentiation in mice directed by endothelial cells via a bmp4-nfatc1-thrombospondin-1 axis tumor evolution and drug response in patient-derived organoid models of bladder cancer influence of a reconstituted basement membrane and its components on casein gene expression and secretion in mouse mammary epithelial cells oncogenic transformation of diverse gastrointestinal tissues in primary organoid culture 25-hydroxycholesterol protects host against zika virus infection and its associated microcephaly in a mouse model organoid cultures recapitulate esophageal adenocarcinoma heterogeneity providing a model for clonality studies and precision therapeutics quantification of regenerative potential in primary human mammary epithelial cells understanding kidney morphogenesis to guide renal tissue regeneration expansion of adult human pancreatic tissue yields organoids harboring progenitor cells with endocrine differentiation potential long-term in vitro expansion of salivary gland stem cells driven by wnt signals growth in culture of trypsin dissociated thyroid cells from adult rats an in vivo model of functional and vascularized human brain organoids modeling human cortical development in vitro using induced pluripotent stem cells foxg1-dependent dysregulation of gaba/glutamate neuron differentiation in autism spectrum disorders modeling colorectal cancer using crispr-cas9-mediated engineering of human intestinal organoids modelling human development and disease in pluripotent 
stem-cellderived gastric organoids wnt/betacatenin promotes gastric fundus specification in mice and humans histological organization in hepatocyte organoid cultures dual smad signaling inhibition enables long-term expansion of diverse epithelial basal cells self-organization of polarized cerebellar tissue in 3d culture of human pluripotent stem cells mouse and human urothelial cancer organoids: a tool for bladder cancer research self-formation of optic cups and storable stratified neural retina from human escs disease modeling and gene therapy of copper storage disease in canine hepatic organoids human embryonic lung epithelial tips are multipotent progenitors that can be expanded in vitro as longterm self-renewing organoids generation of stomach tissue from mouse embryonic stem cells organoid models of human liver cancers derived from tumor needle biopsies sustained in vitro intestinal epithelial culture within a wnt-dependent stem cell niche functional anterior pituitary generated in selforganizing culture of human embryonic stem cells identification and correction of mechanisms underlying inherited blindness in human ipscderived optic cups functional cortical neurons and astrocytes from human pluripotent stem cells in 3d culture salivary gland development: a template for regeneration personalized in vitro and in vivo cancer models to guide precision medicine inflammatory cytokine tnfalpha promotes the long-term expansion of primary hepatocytes in 3d culture human salivary gland stem cells functionally restore radiation damaged salivary glands lgr5(+) stem and progenitor cells reside at the apex of a heterogeneous embryonic hepatoblast pool brain-regionspecific organoids using mini-bioreactors for modeling zikv exposure cell diversity and network dynamics in photosensitive human brain organoids basal cells as stem cells of the mouse trachea and human airway epithelium lineage tracing of the endoderm during oral development a living biobank of breast cancer 
organoids captures disease heterogeneity long-term expanding human airway organoids for disease modeling the clinically approved antiviral drug sofosbuvir inhibits zika virus replication development of a functional thyroid model based on an organoid culture system generation of functional hippocampal neurons from self-organizing human embryonic stem cell-derived dorsomedial telencephalic tissue cholangiocytes derived from human induced pluripotent stem cells for disease modeling and drug validation directed differentiation of human induced pluripotent stem cells into functional cholangiocyte-like cells reconstruction of the mouse extrahepatic biliary tree using primary human extrahepatic cholangiocyte organoids crispr-cas systems for editing, regulating and targeting genomes cytosystems dynamics in self-organization of tissue architecture next-generation regenerative medicine: organogenesis from stem cells in 3d culture single lgr5 stem cells build crypt-villus structures in vitro without a mesenchymal niche long-term expansion of epithelial organoids from human colon, adenoma, adenocarcinoma, and barrett's epithelium neural crest cell implantation restores enteric nervous system function and alters the gastrointestinal transcriptome in human tissue-engineered small intestine functional repair of cftr by crispr/cas9 in intestinal stem cell organoids of cystic fibrosis patients human gastric cancer modelling using organoids human pancreatic tumor organoids reveal loss of stem cell niche factor dependence during disease progression cystic organoid teratoma: (report of a case) directed differentiation of human pluripotent stem cells into intestinal tissue in vitro differentiated troy+ chief cells act as reserve stem cells to generate all lineages of the stomach epithelium self-formation of functional adenohypophysis in three-dimensional culture redefining the in vivo origin of metanephric nephron progenitors enables generation of complex kidney structures from 
pluripotent stem cells directing human embryonic stem cell differentiation towards a renal lineage generates a self-organizing kidney kidney organoids from human ips cells contain multiple lineages and model human nephrogenesis fgf7 is a functional niche signal required for stimulation of adult liver progenitor cells that support liver regeneration vascularized and functional human liver from an ipsc-derived organ bud transplant vascularized and complex organ buds from diverse tissues via mesenchymal cell-driven condensation massive and reproducible production of liver buds entirely from human pluripotent stem cells successful creation of pancreatic cancer organoids by means of eus-guided fine-needle biopsy sampling for personalized cancer treatment long-term, hormone-responsive organoid cultures of human endometrium in a chemically defined medium trophoblast organoids as a model for maternal-fetal interactions during human placentation prospective derivation of a living organoid biobank of colorectal cancer patients patient-derived organoids model treatment response of metastatic gastrointestinal cancers photoreceptor outer segment-like structures in long-term 3d retinas from human pluripotent stem cells a method to recapitulate early embryonic spatial patterning in human embryonic stem cells directed differentiation of telencephalic precursors from embryonic stem cells self-organized cerebral organoids with human-specific features predict effective drugs to combat zika virus infection tumor organoids as a pre-clinical cancer model for drug discovery chromosomal abnormalities in hepatic cysts point to novel polycystic liver disease genes directed differentiation of human pluripotent stem cells into mature airway epithelia expressing functional cftr protein engineered human pluripotent-stem-cell-derived intestinal tissues with a functional enteric nervous system directed differentiation of human pluripotent cells to ureteric bud kidney progenitor-like cells fusion 
of regionally specified hpsc-derived organoids models human brain development and interneuron migration selfrenewal and multilineage differentiation in vitro from murine prostate stem cells identification of small-molecule inhibitors of zika virus infection and induced neural cell death via a drug repurposing screen a comprehensive human gastric cancer organoid biobank captures tumor subtype heterogeneity and enables therapeutic screening from pluripotency to differentiation: laying foundations for the body pattern in the mouse embryo in vitro differentiation of transplantable neural precursors from human embryonic stem cells human intestinal tract serves as an alternative infection route for middle east respiratory syndrome coronavirus high-content screening in hpsc-neural progenitors identifies drug candidates that inhibit zika virus infection in fetal-like organoids and adult brain differentiated human airway organoids to assess infectivity of emerging influenza virus vertebrate endoderm development and organ formation

key: cord-333088-ygdau2px authors: roy, manojit; pascual, mercedes title: on representing network heterogeneities in the incidence rate of simple epidemic models date: 2006-03-31 journal: ecological complexity doi: 10.1016/j.ecocom.2005.09.001 sha: doc_id: 333088 cord_uid: ygdau2px

abstract: Mean-field ecological models ignore space and other forms of contact structure. At the opposite extreme, high-dimensional models that are both individual-based and stochastic incorporate the distributed nature of ecological interactions. In between, moment approximations have been proposed that represent the effect of correlations on the dynamics of mean quantities. As an alternative closer to the typical temporal models used in ecology, we present here results on "modified mean-field equations" for infectious disease dynamics, in which only mean quantities are followed and the effect of heterogeneous mixing is incorporated implicitly.
We specifically investigate the previously proposed empirical parameterization of heterogeneous mixing, in which the bilinear incidence rate SI is replaced by a nonlinear term kS^pI^q, for the case of stochastic SIRS dynamics on different contact networks, from a regular lattice to a random structure via small-world configurations. We show that, for two distinct dynamical cases involving a stable equilibrium and a noisy endemic steady state, the modified mean-field model successfully approximates the steady-state dynamics as well as the respective short and long transients of decaying cycles. This result demonstrates that, early on in the transients, an approximate power-law relationship is established between global (mean) quantities and the covariance structure in the network. The approach fails in the more complex case of persistent cycles observed within the narrow range of small-world configurations.

Most population models of disease (Anderson and May, 1992) assume complete homogeneous mixing, in which an individual can interact with all others in the population. In these well-mixed models, the disease incidence rate is typically represented by the term βSI, which is bilinear in S and I, the numbers of susceptible and infective individuals (Bailey, 1975), with β being the transmission coefficient. With these models it has been possible to establish many important epidemiological results, including the existence of a population threshold for the spread of disease and the vaccination levels required for eradication (Kermack and McKendrick, 1927; Anderson and May, 1992; Smith et al., 2005). However, individuals are discrete and not well mixed; they usually interact with only a small subset of the population at any given time, thereby imposing a distinctive contact structure that cannot be represented in mean-field models.
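The population-threshold result mentioned above can be illustrated numerically. The sketch below is a minimal Euler integration of the bilinear mean-field SIR model; the parameter values are invented for the demonstration and are not from the paper. An outbreak grows only when R0 = βS0/γ exceeds 1.

```python
# Minimal mean-field SIR with bilinear incidence beta*S*I (illustrative
# parameters only). The epidemic grows only when R0 = beta*S0/gamma > 1
# (the Kermack-McKendrick threshold).

def simulate_sir(beta, gamma, s0, i0, dt=0.01, steps=20000):
    s, i = s0, i0
    peak_i = i0
    for _ in range(steps):
        new_inf = beta * s * i          # bilinear incidence term
        s += dt * (-new_inf)
        i += dt * (new_inf - gamma * i)
        peak_i = max(peak_i, i)
    return peak_i

gamma = 0.1
above = simulate_sir(beta=0.3, gamma=gamma, s0=0.99, i0=0.01)   # R0 ~ 3
below = simulate_sir(beta=0.05, gamma=gamma, s0=0.99, i0=0.01)  # R0 ~ 0.5

print(above > 0.011)   # outbreak: infectives rise above their initial level
print(below <= 0.011)  # below threshold: infectives only decay
```

With R0 above 1 the infective fraction rises well above its initial value before burning out; below threshold it decays monotonically.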
Explicit interactions within discrete spatial and social neighborhoods have been incorporated into a variety of individual-based models on a spatial grid and on networks (Bolker and Grenfell, 1995; Johansen, 2005).
Simplifications of these high-dimensional models have been developed to better understand their dynamics, make them more amenable to mathematical analysis, and reduce computational complexity (Keeling, 1999; Eames and Keeling, 2002; Franc, 2004). These approximations are based on moment closure methods and add corrections to the mean-field model due to the influence of covariances, as well as equations for the dynamics of these second-order moments (Pacala and Levin, 1997; Bolker, 1999; Brown and Bolker, 2004). We address here an alternative simplification approach closer to the original mean-field formulation, which retains the basic structure of the mean-field equations but incorporates the effects of heterogeneous mixing implicitly via modified functional forms (McCallum et al., 2001). Specifically, the bilinear transmission term (SI) in the well-mixed equations is replaced by a nonlinear term S^pI^q (Severo, 1969), where the exponents p, q are known as "heterogeneity parameters". This formulation allows an implicit representation of distributed interactions when the details of individual-level processes are unavailable (as is often the case; see Gibson, 1997), and when field data are collected in the form of a time series (e.g., Koelle and Pascual, 2004). We henceforth refer to these modified equations as the heterogeneous mixing, or "HM", model, following Maule and Filipe (in preparation). The HM model is known to exhibit important properties not observed in standard mean-field models, such as the presence of multiple equilibria and periodic solutions (Liu et al., 1986, 1987; Hethcote and van den Driessche, 1991; Hochberg, 1991). This model has also been successfully fitted to experimental time series data of a lettuce fungal disease to explain its persistence (Gubbins and Gilligan, 1997).
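As a concrete illustration of the HM formulation, the sketch below integrates an SIRS system in which the bilinear incidence is replaced by kS^pI^q. All parameter values (k, p, q, the recovery rate and the immunity-loss rate) are hypothetical choices for the demonstration, not values estimated in the paper.

```python
# Heterogeneous-mixing (HM) SIRS model: the bilinear incidence S*I is
# replaced by k * S**p * I**q. Parameter values are illustrative only.

def simulate_hm_sirs(k, p, q, gamma, eps, s0, i0, r0, dt=0.01, steps=100000):
    s, i, r = s0, i0, r0
    for _ in range(steps):
        inc = k * s**p * i**q            # nonlinear incidence term
        ds = -inc + eps * r              # loss of immunity returns R to S
        di = inc - gamma * i
        dr = gamma * i - eps * r
        s, i, r = s + dt * ds, i + dt * di, r + dt * dr
    return s, i, r

s, i, r = simulate_hm_sirs(k=0.5, p=0.9, q=0.9, gamma=0.1, eps=0.05,
                           s0=0.99, i0=0.01, r0=0.0)
print(round(s + i + r, 6))   # total population is conserved
print(i > 0.01)              # settles to a positive endemic level
```

Note that with q < 1 the incidence term dominates the linear recovery term for small I, so the infection invades from arbitrarily small initial levels, one of the qualitative differences from the standard bilinear model.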
however, it is not well known whether these modified mean-field equations can indeed approximate the population dynamics that emerge from individual-level interactions. motivated by infectious diseases of plants, maule and filipe (in preparation) have recently compared the dynamics of the hm model to a stochastic susceptible-infective (si) model on a spatial lattice. in this paper, we implement a stochastic version of the susceptible-infective-recovered-susceptible (sirs) dynamics, to consider a broader range of dynamical behaviors including endemic equilibria and cycles (bailey, 1975; murray, 1993; johansen, 1996). recovery from disease leading to the development of temporary immunity is also relevant to many infectious diseases in humans, such as cholera (koelle and pascual, 2004). for the contact structure of individuals in the population we use a small-world algorithm, which is capable of generating an array of configurations ranging from a regular grid to a random network (watts and strogatz, 1998). theory on the structural properties of these networks is well developed (watts, 2003), and these properties are known to exist in many real interaction networks (dorogotsev and mendes, 2003). a small-world framework has also been used recently to model epidemic transmission processes of severe acute respiratory syndrome or sars (masuda et al., 2004; verdasca et al., 2005). we demonstrate that the hm model can accurately approximate the endemic steady states of the stochastic sirs system, including its short and long transients of damped cycles under two different parameter regimes, for all configurations between the regular and random networks. we show that this result implies the establishment early on in the transients of a double power-law scaling relationship between the covariance structure on the network and global (mean) quantities at the population level (the total numbers of susceptible and infective individuals).
we also demonstrate the existence of a complex dynamical behavior in the stochastic system within the narrow small-world region, consisting of persistent cycles with enhanced amplitude and a well-defined period that are not predicted by the equivalent homogeneous mean-field model. in this case, the hm model captures the mean infection level and the overall pattern of the decaying transient cycles, but not their phases. the model also fails to reproduce the persistence of the cycles. we conclude by discussing the potential significance and limitations of these observations.

2. the model

2.1. stochastic formulation

the population structure, that is, the social contact pattern among individuals in the population, is modeled using a small-world framework as follows. we start with a spatial grid with the interaction neighborhood restricted to eight neighbors (fig. 1a) and periodic boundary conditions, and randomly rewire a fraction f of the local connections (avoiding self and multiple connections) such that the average number of connections per individual is preserved at n_0 (= 8 in this case). we call f the ''short-cut'' parameter of the network. this is a two-dimensional extension of the algorithm described in watts and strogatz (1998). as pointed out by newman and watts (1999), a problem with these algorithms is the small but finite probability of the existence of isolated sub-networks. we consider only those configurations that are completely connected. for f = 0 we have a regular grid (fig. 1a), whereas f = 1 gives a random network (fig. 1c). in between these extremes, there is a range of f values near 0.01 within which the network exhibits small-world properties (fig. 1b). in this region, most local connections remain intact, making the network highly ''clustered'' like the regular grid, with occasional short-cuts that lower the average distance between nodes drastically as in the random network.
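the rewiring procedure described above can be sketched in a few lines. the following is a minimal illustration (not the authors' code), assuming a moore neighborhood of eight with periodic boundaries and edge-count-preserving rewiring; it does not enforce the connectedness check mentioned in the text.

```python
import random

def small_world_grid(side, f, seed=0):
    """Moore-neighbourhood grid (n_0 = 8, periodic boundaries) with a
    fraction f of edges rewired at random, avoiding self-loops and
    duplicate edges, so the total number of edges is preserved.
    Returns an adjacency dict: node -> set of neighbours."""
    rng = random.Random(seed)
    n = side * side
    adj = {v: set() for v in range(n)}
    # Build the regular grid: 8 neighbours per node, wrap-around edges.
    for r in range(side):
        for c in range(side):
            v = r * side + c
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr or dc:
                        w = ((r + dr) % side) * side + (c + dc) % side
                        adj[v].add(w)
                        adj[w].add(v)
    # Rewire each edge with probability f, keeping the edge count fixed.
    for v, w in [(v, w) for v in adj for w in adj[v] if v < w]:
        if rng.random() < f:
            candidates = [u for u in range(n) if u != v and u not in adj[v]]
            if candidates:
                u = rng.choice(candidates)
                adj[v].discard(w); adj[w].discard(v)
                adj[v].add(u); adj[u].add(v)
    return adj
```

for f = 0 this returns the regular grid of fig. 1a; for f = 1, a randomized network with the same mean degree n_0 = 8.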
ecological complexity 3 (2006) 80-90

these properties are illustrated with two quantities, the ''clustering coefficient'' c and the ''average path length'' l (watts, 2003). c denotes the probability that two neighbors of a node are themselves neighbors, and l denotes the average shortest distance between two nodes in the network. the small-world network exhibits the characteristic property of having a high value of c and simultaneously a low value of l (fig. 1d). once the network structure is generated using the algorithm described above, the stochastic sirs dynamics are implemented with the following rules: a susceptible individual gets infected at a rate n_i b, where n_i is the number of infective neighbors and b is the rate of disease transmission across a connected susceptible-infective pair. an infective individual loses infection at a rate g and recovers temporarily. a recovered individual loses its temporary immunity at a rate d and becomes susceptible again. stochasticity arises because the rate of each event specifies a stochastic process with poisson-distributed time-intervals between successive occurrences of the event, with a mean interval of (rate)^(-1). total population size is assumed constant (demography and disease-induced mortality are not considered), and infection propagates from an infective to a susceptible individual only if the two are connected. correlations develop as the result of the local transmission rules and the underlying network structure. therefore, holding b, g and d constant while varying the short-cut parameter f allows us to explore the effects of different network configurations (such as fig. 1a-c) on the epidemic.

2.2. analytical considerations

one way to analytically treat the above stochastic system is by using a pair-wise formulation (keeling, 1999), which considers partnerships as the fundamental variables, and incorporates the pair-wise connections into model equations.
using the notations of keeling et al. (1997), this formulation gives the following set of equations for the dynamics of disease growth,

d[s_n]/dt = d [r_n] - b sum_m [s_n i_m],
d[i_n]/dt = b sum_m [s_n i_m] - g [i_n],
d[r_n]/dt = g [i_n] - d [r_n],

where [s_n], [i_n] and [r_n] denote respectively the number of susceptible, infective and recovered individuals each with exactly n connections, and [s_n i_m] denotes the number of connected susceptible-infective pairs with n and m connections. by writing sum_n [s_n] = [s] = s, where s is the total number of susceptible individuals, and sum_n sum_m [s_n i_m] = sum_n [s_n i] = [si], where [si] denotes the total number of connected susceptible-infective pairs, we can rewrite the equations for the number of susceptible, infective and recovered individuals as

ds/dt = d r - b [si],
di/dt = b [si] - g i,     (1)
dr/dt = g i - d r.

even though this set of equations is exact, it is not closed, and additional equations are needed to specify the dynamics of the [si] pairs, which in turn depend on the dynamics of triples, etc., in an infinite hierarchy that is usually closed by moment closure approximations. however, a satisfactory closure scheme for a locally clustered network is still lacking (but see keeling et al., 1997; rand, 1999). here we pursue a different avenue to approximate the stochastic system with modified mean-field equations, which consider only the dynamics of mean quantities but replace the standard bilinear term b si with a nonlinear transmission rate as follows,

ds/dt = d r - b k s^p i^q,
di/dt = b k s^p i^q - g i,     (2)
dr/dt = g i - d r,

where k, p, q are the ''heterogeneity'' parameters (severo, 1969; liu et al., 1986; hethcote and van den driessche, 1991; hochberg, 1991). we call eq. (2) the ''heterogeneous mixing'' (hm) model (maule and filipe, in preparation). we note from eq. (1) that the incidence rate of the epidemic can be estimated by counting the number of connected susceptible-infective pairs [si] in the network. furthermore, [si] is directly related to the correlation c_si that arises between susceptible and infective individuals in the network (keeling, 1999). therefore, comparing eqs.
(2) with (1) we see that the hm model implicitly assumes a double power-law relationship between this covariance structure and the abundances of infective and susceptible individuals. for instance, in a homogeneous network (such as a regular grid) with an identical number of connections n_0 for all individuals, we have

[si] = (n_0/n) c_si s i,     (3)

where n = s + i + r is the population size (keeling, 1999). relationships such as eq. (3) provide an important first step towards understanding how the phenomenological parameters k, p and q are related to network structure. for a homogeneous random network in which every individual is connected to exactly n_0 randomly distributed others (see appendix a), the susceptible and infective individuals are uncorrelated and the total number of interacting pairs [si] = (n_0/n)si. eq. (1) then reduces to

ds/dt = d r - b (n_0/n) s i,
di/dt = b (n_0/n) s i - g i,     (4)
dr/dt = g i - d r.

these equations incorporate the familiar bilinear term si for the incidence rate, and provide a mean-field approximation for the stochastic system in which each individual randomly mixes with n_0 others. note that the transmission coefficient b is proportionately reduced by a factor n_0/n, which is the fraction of the population in a contact neighborhood of each individual. in a completely well-mixed population, n_0 = n, and these equations reduce to the standard kermack-mckendrick form (kermack and mckendrick, 1927). eq. (4) exhibits either a disease-free equilibrium, i(t) = 0, or an endemic equilibrium, i(t) = [dn/(g + d)][1 - g/(b n_0)], depending on whether the basic reproductive ratio r_0 = n_0 b/g is less than or greater than unity. it is to be noted that while eq. (4) describes a homogeneous random network exactly, it provides only an approximation for the random network with f = 1, in which individuals have a binomially distributed number of connections around a mean n_0 (appendix a). details of the implementation of the stochastic system are described in appendix b.
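the stochastic rules of section 2.1 define a continuous-time markov process, which can be sampled with a standard gillespie algorithm. the sketch below is an illustration under those rules, not the implementation of appendix b (which is not included in this excerpt): inter-event times are exponential, and one node transition fires per event.

```python
import random

def gillespie_sirs(adj, beta, gamma, delta, i0, tmax, seed=0):
    """Event-driven SIRS on a contact network (Gillespie algorithm).
    S -> I at rate beta * (# infective neighbours), I -> R at rate
    gamma, R -> S at rate delta; inter-event times are exponential.
    Returns a list of (time, number of infectives) after each event."""
    rng = random.Random(seed)
    state = {v: 'S' for v in adj}
    for v in rng.sample(sorted(adj), i0):   # seed i0 random infectives
        state[v] = 'I'
    t, series = 0.0, []
    while t < tmax:
        rates = {}
        for v, s in state.items():
            if s == 'S':
                rates[v] = beta * sum(state[w] == 'I' for w in adj[v])
            elif s == 'I':
                rates[v] = gamma
            else:
                rates[v] = delta
        total = sum(rates.values())
        if total == 0:      # everyone susceptible: infection extinct
            break
        t += rng.expovariate(total)
        # Pick the node whose event fires, weighted by its rate.
        x, acc = rng.random() * total, 0.0
        for v, r in rates.items():
            acc += r
            if acc >= x:
                state[v] = {'S': 'I', 'I': 'R', 'R': 'S'}[state[v]]
                break
        series.append((t, sum(s == 'I' for s in state.values())))
    return series
```

recomputing all rates at every event is o(n) and only suitable for small illustrative networks; an efficient implementation would update rates incrementally.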
one practical approach to estimate the parameters k, p and q of the hm model, when the individual-level processes are unknown, would be to fit these parameters using time series data (gubbins and gilligan, 1997; bjørnstad et al., 2002; finkenstädt et al., 2002). indeed, with a sufficient number of parameters a satisfactory agreement between the model and the data is almost always possible. a direct fit of time series, however, will not tell us whether the disease transmission rate is well approximated by the functional form ks^p i^q of the model. we instead fit the parameters k, p, q to the transmission rate ''observed'' in the output of the stochastic simulation. specifically, we obtain least-squared estimates of k, p, q by fitting the term ks^p i^q to the computed number of pairs [si] that gives the disease transmission rate of the stochastic system (see eq. (1)). we then incorporate these estimates in eq. (2), and compare the infective time series produced by this hm model to that generated by the original stochastic network simulation. in this way, we can address whether the transmission rate is well captured by the modified functional form, and if that is the case, whether the hm model approximates successfully the aggregated dynamics of the stochastic system. we compare the stochastic simulation with the predictions of three sets of model equations, representing different degrees of approximation of the system. besides the hm model described above, we consider the bilinear mean-field model given by eq. (4), which assumes k = n_0/n and p = q = 1. this comparison demonstrates the inadequacy of the well-mixed assumption built into the bilinear formulation. we also discuss a restricted hm model with an incidence function of the form (n_0/n) s^(p_r) i^(q_r) in eq. (2), with only two heterogeneity parameters p_r and q_r, as originally proposed by severo (1969) and studied by liu et al. (1986), hethcote and van den driessche (1991) and hochberg (1991).
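the fitting step described above can be sketched as follows (an illustration, not the authors' code): taking logarithms turns [si] ≈ k s^p i^q into a linear regression for (log k, p, q), solved here via the 3x3 normal equations; the fitted parameters can then be fed into a simple euler integration of eq. (2), assuming, as in the reconstruction above, that b multiplies the nonlinear term.

```python
import math

def fit_heterogeneity(S, I, SI):
    """Least-squares estimate of (k, p, q) in SI ~ k * S^p * I^q,
    linearised as log SI = log k + p log S + q log I."""
    X = [[1.0, math.log(s), math.log(i)] for s, i in zip(S, I)]
    y = [math.log(v) for v in SI]
    # Normal equations A a = b for a = (log k, p, q).
    A = [[sum(x[i] * x[j] for x in X) for j in range(3)] for i in range(3)]
    b = [sum(x[i] * v for x, v in zip(X, y)) for i in range(3)]
    # Gaussian elimination with partial pivoting, then back-substitution.
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, 3):
            m = A[r][col] / A[col][col]
            for c in range(col, 3):
                A[r][c] -= m * A[col][c]
            b[r] -= m * b[col]
    a = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        a[r] = (b[r] - sum(A[r][c] * a[c] for c in range(r + 1, 3))) / A[r][r]
    return math.exp(a[0]), a[1], a[2]

def hm_sirs(k, p, q, beta, gamma, delta, N, i0, tmax, dt=0.01):
    """Forward-Euler integration of the HM model (eq. (2)) with
    transmission rate beta * k * S^p * I^q; returns final (S, I, R)."""
    S, I, R = N - i0, float(i0), 0.0
    for _ in range(int(tmax / dt)):
        new = beta * k * S ** p * I ** q
        S, I, R = (S + dt * (delta * R - new),
                   I + dt * (new - gamma * I),
                   R + dt * (gamma * I - delta * R))
    return S, I, R
```

with k = n_0/n and p = q = 1, hm_sirs reduces to the bilinear mean-field model of eq. (4), so its long-time infective level can be checked against the endemic equilibrium [dn/(g + d)][1 - g/(b n_0)].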
the stochastic sirs dynamics are capable of exhibiting a diverse array of dynamical behaviors, determined by both the epidemic parameters b, g, d and the network short-cut parameter f. we choose the following three scenarios:

stable equilibrium: infection levels in the population reach a stable equilibrium relatively rapidly after a short transient (fig. 3a).

noisy endemic state: infection levels exhibit stochastic fluctuations around an endemic state following a long transient of decaying cycles (fig. 3b).

persistent cycles: fluctuations with a well-defined period and enhanced amplitude persist in the small-world region near f = 0.01 (fig. 3b).

the reason for choosing these different temporal patterns is to test the hm model against a wider range of behaviors of the stochastic system. the oscillatory case has epidemiological significance because of the observed pervasiveness of cyclic disease patterns (cliff and haggett, 1984; grenfell and harwood, 1997; pascual et al., 2000). fig. 3a presents simulation examples of the epidemic time series for three values of the short-cut parameter f, representing the regular grid (f = 0), the small-world network (f = 0.01) and the random network (f = 1). the transient pattern depends strongly on f: a high degree of local clustering in a regular grid slows the initial buildup of the epidemic, whereas in a random network with negligible clustering (fig. 1d) the disease grows relatively fast. the transient for the small-world network lies in between these two extremes. by contrast, the stable equilibrium level of the infection remains insensitive to f, implying that the equilibrium should be well predicted by the bilinear mean-field approximation (eq. (4)) itself.
least-squared estimates of the two sets of heterogeneity parameters [k, p, q] and [k_r = n_0/n, p_r, q_r], for the full and restricted versions of the hm model respectively, are obtained for a series of f values corresponding to different network configurations, as described in section 3. the disease parameters b, g and d are kept fixed throughout, making the epidemic processes operate at the same rates across different networks, so that the effects of the network structure on the dynamics can be studied independently. the transient patterns, however, present a different picture. the mean-field trajectory deviates the most from the stochastic simulation for the regular grid (f = 0), and the least for the random network (f = 1). the full hm model with its three parameters k, p and q, on the other hand, demonstrates an excellent agreement with the stochastic transients for all values of f. by comparison, the transient patterns of the restricted hm model with only two fitting parameters p_r and q_r differ significantly for low values of f (fig. 4a and b). the poor agreement of the restricted hm and the mean-field transients with the stochastic data for a clustered network (low f) is due to the failure of their respective incidence functions to fit the transmission rate of the stochastic system (fig. 2a). on the other hand, the random network has negligible clustering, and the interaction between susceptible and infective individuals is sufficiently well mixed for the restricted hm model to provide as good an approximation of the stochastic transient as the full hm model (fig. 4c). the estimates [k, p, q] = [0.0001, 0.94, 0.97] and [n_0/n, p_r, q_r] = [0.00005, 0.99, 1] for these two models are also quite similar. the discrepancy for the mean-field transients (fig. 4c) is due to the fact that the mean-field model gives only an approximate description of the random network with f = 1, as noted before.
at the other extreme, for a regular grid the estimates of the full and restricted hm models are [k, p, q] = [1.66, 0.3, 0.69] and [n_0/n, p_r, q_r] = [0.00005, 0.84, 1.13], which differ considerably from each other. fig. 5a and b demonstrate how the parameters k, p and q of the full hm model depend on the short-cut parameter f. all three of them approach their respective well-mixed values (k = 0.00005, p = q = 1) as f → 1, and they deviate the most as f → 0, in accord with the earlier discussion. in particular, k is significantly higher, and likewise p and q are lower, for the regular grid than a well-mixed system, implying a strong nonlinearity of the transmission mechanism in a clustered network. such a large value of k can be understood within the context of the incidence function ks^p i^q, and explains why only two parameters, p_r and q_r in the restricted hm model, cannot absorb the contribution of k. in a homogeneous random network with n_0 connections per individual, the term (n_0/n)si gives the expected total number of pairs [si] that govern disease transmission in the network, and the exponent values p = q = 1 indicate random mixing (of susceptible and infective individuals). by contrast, local interactions in a clustered network lower the availability of susceptible individuals (infected neighbors of an infective individual act as a barrier to disease spread), resulting in a depressed value of the exponent p significantly below 1. this nonlinear effect, combined with a low initial infective number i_0 (0.5% of the total population randomly infected), requires k in the hm model to be large enough to match the disease transmission in the network. indeed, as table 1 demonstrates, both k and p are quite sensitive to i_0 for a regular grid, unlike the other exponent q that does not depend on initial conditions.
increasing i_0 facilitates disease growth by distributing infective and susceptible individuals more evenly, which causes an increase of the value of p and a compensatory reduction of k. an interesting pattern in fig. 5a and b is that the values of the heterogeneity parameters remain fairly constant initially for low f, in particular within the interval 0 <= f < 0.01 for the exponents p and q (the range is somewhat shorter for k), and then start approaching respective mean-field values as f increases to 1. this pattern of variation is reminiscent of the plot for the clustering coefficient c shown in fig. 1d, and suggests that the clustering of the network, rather than its average path length l, influences disease transmission strongly. a measure of the accuracy of the approximation can be defined by an error function erf, computed as a mean of the point-by-point deviation of the infective time series i_m(t) predicted from the models, relative to the stochastic simulation data i_s(t), over the length t of the transient (the equilibrium values of the models coincide with the simulation, see fig. 4):

erf = (100/t) sum_{t'=1}^{t} |i_m(t') - i_s(t')| / i_s(t'),     (5)

where multiplication by 100 expresses erf as a percentage of the simulation time series. fig. 5c shows erf as a function of f for the three models. the total failure of the mean-field approximation to predict the stochastic transients is evident in the large magnitudes of error (it is 25% even for the random networks). by contrast, the excellent agreement of the full hm model for all f results in a low error throughout. on the other hand, the restricted version of the hm model gives over 30% error for low f whereas it is negligible for high f. interestingly, erf for the restricted hm and mean-field models shows similar patterns of variation with f as in fig. 5b, staying relatively constant within 0 <= f < 0.01 and then decreasing relatively fast.
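the error measure of eq. (5), as reconstructed here (the absolute value in the point-by-point deviation is our assumption), is straightforward to compute:

```python
def erf_percent(I_model, I_sim):
    """Reconstructed eq. (5): mean point-by-point relative deviation of
    the model infective series from the simulated one, times 100.
    The absolute value is an assumption of this sketch."""
    return 100.0 / len(I_sim) * sum(
        abs(m - s) / s for m, s in zip(I_model, I_sim))
```

for example, a model series that is off by 10% at every point yields erf = 10.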
local clustering in a network with low f causes disease transmission to deviate from a well-mixed approximation, and thus influences the pattern of erf for these simpler models. the second type of dynamical behavior of the stochastic system exhibits a relatively long oscillatory transient that settles onto a noisy endemic state for most values of f, near 0 as well as above (fig. 3b). stochastic fluctuations are stronger for f = 0 than f = 1. however, in a significant exception, the cycles tend to persist with a considerably increased amplitude and well-defined period for a narrow range of f near 0.01, precisely where the small-world behavior arises in the network. such persistent cycles are not predicted by the homogeneous epidemic dynamics given by eq. (4), and are therefore a consequence of the correlations generated by the contact structure. to our knowledge such a nonmonotonic pattern for the amplitude of the cycles with the network parameter f has not been observed before (see section 5 for a comparison of these results with those of other studies). we estimate two quantities, the ''coefficient of variation'' (cv) and the ''degree of coherence'' (dc), which determine respectively the strength and periodicity of the cycles for different values of f. cv has the usual definition

cv = sd(i) / <i>,     (6)

where the numerator denotes the standard deviation of the infective time series of length t_s (in the stationary state, excluding transients), and the denominator denotes its mean over the same time length t_s. fig. 6a exhibits a characteristic peak for cv near f = 0.01, demonstrating a maximization of the cycle amplitudes in the small-world region compared to both the high and low values of f. the plot also shows that the fluctuations at the left side tail of the peak are stronger than at its right side tail. consistent with this pattern, sustained fluctuations in a stochastic sirs model on a spatial grid (f = 0) were also shown by johansen (1996).
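eq. (6) can be computed directly from the stationary part of the infective series; a minimal sketch (using the population standard deviation, since the sample/population convention is not specified in the text):

```python
import math

def coefficient_of_variation(series):
    """CV (eq. (6)): standard deviation of the stationary infective
    series divided by its mean over the same window.  Uses the
    population sd; the paper does not state which convention it uses."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / n
    return math.sqrt(var) / mean
```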
by contrast, the low variability in the random network (f → 1) is due to the fact that the corresponding mean-field model (eq. (4)) does not have oscillatory solutions. dc provides a measure of the sharpness of the dominant peak in the fourier power spectrum of the infective time series, and is defined as

dc = h_max (v_max / dv),     (7)

where h_max, v_max and dv are the peak height, peak frequency and the width at one-tenth maximum, respectively, of a gaussian fit to the dominant peak. the sharp nature of the peak, particularly for the small-world network, makes it unfeasible to use the standard ''width at half-maximum'' (gang et al., 1993; lago-fernández et al., 2000), which is often zero here. the modified implementation in eq. (7) therefore considerably underestimates the sharpness of the dominant peak. even then, fig. 6b depicts a fairly narrow maximum for dc near f = 0.01, indicating that the cycles within the small-world region have a well-defined period. the low value of dc for f = 0 implies that the fluctuations in the regular grid are stochastic in nature.

[fig. 6 - the coefficient of variation, cv (eq. (6)), and the degree of coherence, dc (eq. (7)), are plotted against f in a and b, respectively (see text for definitions). each point in b represents estimates using a fourier power spectrum averaged over 10 independent realizations of the stochastic simulations.]

a likely scenario for the origin of these persistent cycles is as follows. stochastic fluctuations are locally maintained in a regular grid by the propagating fronts of spatially distributed infective individuals, but they are out of phase across the network. the infective individuals are spatially correlated over a length j ∝ d^(-1) in the grid (johansen, 1996), which typically has a far shorter magnitude than the linear extent of the grid used here (increasing d reduces the correlation length j further, which weakens these fluctuations and gives the stable endemic state observed for instance in fig. 3a).
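a dc estimate in the spirit of eq. (7) can be sketched as follows. this is an illustration only: the paper fits a gaussian to the spectral peak, whereas here dv (the width at one-tenth maximum) is read directly off a plain periodogram, and the sampling interval dt is an assumed parameter.

```python
import cmath, math

def degree_of_coherence(series, dt=1.0):
    """Sketch of DC (eq. (7)): h_max * (v_max / dv) for the dominant
    periodogram peak, with dv measured as the width at one-tenth of
    the peak height (no Gaussian fit, unlike the paper)."""
    n = len(series)
    mean = sum(series) / n
    x = [v - mean for v in series]
    # Plain O(n^2) periodogram, skipping the zero-frequency bin.
    power = []
    for k in range(1, n // 2):
        c = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(c) ** 2 / n)
    kmax = max(range(len(power)), key=power.__getitem__)
    h = power[kmax]
    thresh = h / 10.0
    lo, hi = kmax, kmax
    while lo > 0 and power[lo - 1] > thresh:
        lo -= 1
    while hi < len(power) - 1 and power[hi + 1] > thresh:
        hi += 1
    v_max = (kmax + 1) / (n * dt)          # bin index k -> frequency
    dv = max(hi - lo, 1) / (n * dt)        # at least one bin wide
    return h * v_max / dv
```

a pure sinusoid concentrates its power in a single bin and therefore yields a large dc, while broadband noise spreads power and yields a small one.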
the addition of a small number of short-cuts in a small-world network (fig. 1b) couples together a few of these local fronts, thereby effectively increasing the correlation length to the order of the system size and creating a globally coherent periodic response. as more short-cuts are added, the network soon acquires a sufficiently random configuration and the homogeneous dynamics become dominant. another important point to note in fig. 3b is that, in contrast to fig. 3a, the mean infection level i of the cycles is not independent of f: i now increases slowly with f. an immediate implication of this observation is that, unlike the earlier case of a stable equilibrium, the bilinear mean-field model of eq. (4) will no longer be able to accurately predict the mean infection for all f. fig. 7 shows the same examples of the stochastic time series as in fig. 3b, along with the solutions of the three models. as expected, the mean-field time series fails to predict the mean infection level at varying degrees in all three cases, deviating most for the regular grid (f = 0) and least for the random network (f = 1). by comparison, the equilibrium solutions of the full and restricted versions of the hm model both demonstrate good agreement with the mean infection level of the stochastic system. for the transient patterns, the two hm models exhibit similar decaying cycles of roughly the same period, and also of the same transient length, as the stochastic time series, but they occur at a different phase. even though the transient cycles of the hm models persist the longest for f = 0.01, they eventually decay onto a stable equilibrium and thus fail to predict the persistent oscillations of the small-world network. the mean-field model, on the other hand, shows damped cycles of much shorter duration and hence is a poor predictor overall. the close agreement of the two hm time series with each other for f = 0.01 (fig.
7b) is due to the fact that the least-squared estimate of k for the full hm model is 0.00005, equal to n_0/n of the restricted hm, and the exponents p, q likewise reduce to p_r, q_r. within the entire range 0 <= f <= 1, the estimates of p and p_r for the full and restricted hm models lie between [0.68, 1.08] and [0.82, 0.93], respectively, whereas q and q_r both stay close to a value of 1.1. it is interesting to note here that a necessary condition for limit cycles in an sirs dynamics with a nonlinear incidence rate s^p i^q is q > 1 (liu et al., 1986), which both the full and restricted hm models appear to satisfy. one possible reason then for their failure to reproduce the cycles in the small-world region is the overall complexity of the stochastic time series, which results from nontrivial correlation patterns present in the susceptible-infective dynamics. the three-parameter incidence function ks^p i^q of the full hm model may not have sufficient flexibility to adequately fit the cyclic incidence pattern of the stochastic system. we emphasize here that if the cycles are not generated intrinsically, but are driven by an external variable such as a periodic environmental forcing, the outcome is well predicted when an appropriate forcing term is included in eq. (2) (results not shown). as a final note, all of the above observations for both stable equilibria and cyclic epidemic patterns have been qualitatively validated for multiple sets of values of the disease parameters b, g and d.

5. discussion

stochastic sirs dynamics implemented on a network with varying degrees of local clustering can generate a rich spectrum of behaviors, including stable and noisy endemic equilibria as well as decaying and persistent cycles. persistent cycles arise in our system even though the homogeneous mean-field dynamics do not have oscillatory solutions, thereby revealing an interesting interplay of network structure and disease dynamics (also see rand, 1999).
our results demonstrate that a three-variable epidemic model with a nonlinear incidence function ks^p i^q, consisting of three ''heterogeneity'' parameters [k, p, q], is capable of predicting the disease transmission patterns, including the transient and stable equilibrium prevalence, in a clustered network. the relatively simpler (and more standard) form s^(p_r) i^(q_r) with two parameters [p_r, q_r] falls short in this regard. this restricted model, however, is an adequate predictor of the dynamics in a random network, for which the bilinear mean-field approximation cannot explain the transient pattern. interestingly, even the function ks^p i^q cannot capture the complex dynamics of persistent cycles in a small-world network that has simultaneously high local clustering and long-distance connectivity. it is worth noting, however, that such persistent cycles appear within a small region of the parameter space for f, and therefore the hm model appears to provide a reasonable approximation for most cases of clustered as well as randomized networks. an implication of these findings is that an approximate relationship is established early on in the transients, lasting all the way to equilibrium, between the covariance structure of the [si] pairs and the global (mean) quantities s and i. this relationship is given by a double power law of the number of susceptible and infective individuals. it allows the closure of the equations for mean quantities, making it possible to approximate the stochastic dynamics with a simple model (hm) that mimics the basic formulation of the mean-field equations but with modified functional forms. it reveals an interesting scaling pattern from individual to population dynamics, governed by the underlying contact structure of the network.
in lattice models for antagonistic interactions, which bear a strong similarity to our stochastic disease system, a number of power-law scalings have been described for the geometry of the clusters (pascual et al., 2002). it is an open question whether the exponents for the dynamic scaling (i.e., parameters p and q here) can be derived from such geometrical properties. it also needs to be determined under what conditions power-law relationships will hold between local structure and global quantities. the failure of the hm model to generate persistent cycles may result from an inappropriate choice of the incidence function ks^p i^q. it remains to be seen if there exists a different functional form that better fits the incidence rate of the stochastic system and is capable of predicting the variability in the data. it is also not known whether a moment closure method including the explicit dynamics of the covariance terms themselves (pacala and levin, 1997; keeling, 1999) can provide a good approximation to the mean infection level in a network with a high degree of local clustering. of course, heterogeneities in space or in contact structure are not the only factors contributing to the nonlinearity in the transmission function s^p i^q; a number of other biological mechanisms of transmission can lead to such functional forms. by rewriting bks^p i^q as [bks^(p-1) i^(q-1)]si = b~si, where b~(s, i) now represents a density-dependent transmission efficiency in the bilinear (homogeneous) incidence framework, one can relate b~ to a variety of density-dependent processes such as those involving vector-borne transmission, or threshold virus loads, etc. (liu et al., 1986). interestingly, it has been suggested that in such cases cyclic dynamics are likely to be stabilized, rather than amplified, by nonlinear transmission (hochberg, 1991). it appears then that network structure can contribute to the cyclic behavior of diseases with relatively simple transmission dynamics.
it is interesting to consider the persistent cycles we have discussed here in light of other studies on fluctuations in networks. on one side, cycles have been described for random networks with f = 1 because the corresponding well-mixed dynamics also have oscillatory solutions (lago-fernández et al., 2000; kuperman and abramson, 2001). at the opposite extreme, johansen (1996) reported persistent fluctuations in a stochastic sirs model on a regular grid (f = 0), strictly generated by the local clustering of the grid since the mean-field equations do not permit cycles. recent work by verdasca et al. (2005) extends johansen's observation by showing that fluctuations do occur in clustered networks from a regular grid to the small-world configuration. they describe a percolation type transition across the small-world region, implying that the fluctuations fall off sharply within this narrow interval. this observation is in significant contrast to our results, where the amplitudes of the cycles are maximized by the small-world configuration, and therefore require both local clustering and some degree of randomization. one difference between the two models is that verdasca et al. (2005) use a discrete time step for the recovery of infected individuals, while in our event-driven model, time is continuous and the recovery time is exponentially distributed. a more systematic study of parameter space for these models is warranted. we should also mention that there are other ways to generate a clustered network than a small-world algorithm. for example, keeling (2005) described a method that starts with a number of randomly placed focal points in a two-dimensional square, and draws a proportion of them towards their nearest focal point to generate local clusters. network building can also be attempted from the available data on selective social mixing (morris, 1995).
the advantage of our small-world algorithm is that, besides being simple to implement, it is also one of the best-studied networks (watts, 2003). this algorithm generates a continuum of configurations from a regular grid to a random network, and many real systems have an underlying regular spatial structure, as in the case of hantavirus of wild rats within the city blocks of baltimore (childs et al., 1988). moreover, emergent diseases like the recent outbreak of severe acute respiratory syndrome (sars) have been studied by modeling human contact patterns using small-world networks (masuda et al., 2004; verdasca et al., 2005). the network considered here remains static in time. while this assumption is reasonable when disease spreads rapidly relative to changes of the network itself, there are many instances where the contact structure would vary over comparable time scales. examples include group dynamics in wildlife resulting from schooling or spatial aggregation, as well as territorial behavior. dynamic network structure involves processes such as migration among groups that establishes new connections and destroys existing ones, but also demographic processes such as birth and death as well as disease-induced mortality. another topic of current interest is the effect of predation on disease growth, which splices together predator-prey and host-pathogen dynamics in which the prey is an epidemic carrier (ostfeld and holt, 2004). simple dynamics assuming the homogeneous mixing of prey and predators makes interesting predictions about the harmful effect of predator control in aggravating disease prevalence, with potential spill-over effects on humans (packer et al., 2003; ostfeld and holt, 2004). it remains to be seen if these conclusions hold under an explicit modeling framework that binds together the social dynamics of both prey and predator.
more generally, future work should address whether modified mean-field models provide accurate simplifications for stochastic disease models on dynamic networks. so far, the work presented here for static networks provides support for the empirical application of these simpler models to time series data.

references
statistical mechanics of complex networks
infectious diseases of humans: dynamics and control
the mathematical theory of infectious diseases
dynamics of measles epidemics: estimating scaling of transmission rates using a time series sir model
analytic models for the patchy spread of plant disease
space, persistence and dynamics of measles epidemics
the effects of disease dispersal and host clustering on the epidemic threshold in plants
the ecology and epizootiology of hantaviral infections in small mammal communities of baltimore: a review and synthesis
island epidemics
evolution of networks
modelling dynamic and network heterogeneities in the spread of sexually transmitted disease
a stochastic model for extinction and recurrence of epidemics: estimation and inference for measles outbreaks
metapopulation dynamics as a contact process on a graph
stochastic resonance without external periodic force
(meta)population dynamics of infectious diseases
a test of heterogeneous mixing as a mechanism for ecological persistence in a disturbed environment
some epidemiological models with nonlinear incidence
non-linear transmission rates and the dynamics of infectious disease
a simple model of recurrent epidemics
correlation models for childhood epidemics
the effects of local spatial structure on epidemiological invasions
the implications of network structure for epidemic dynamics
a contribution to the mathematical theory of epidemics
disentangling extrinsic from intrinsic factors in disease dynamics: a nonlinear time series approach with an application to cholera
modeling infection transmission
small world effect in an epidemiological model
fast response and temporal coherent oscillations in small-world networks
influence of nonlinear incidence rates upon the behavior of sirs epidemiological models
dynamical behavior of epidemiological models with non-linear incidence rate
how should pathogen transmission be modelled?
relating heterogeneous mixing models to spatial processes in disease epidemics
transmission of severe acute respiratory syndrome in dynamical small-world networks
data driven network models for the spread of disease
mathematical biology
the spread of epidemic disease on networks
scaling and percolation in the small-world network model
are predators good for your health? evaluating evidence for top-down regulation of zoonotic disease reservoirs
biologically generated spatial pattern and the coexistence of competing species
keeping the herds healthy and alert: impacts of predation upon prey with specialist pathogens
cholera dynamics and el niño-southern oscillation
simple temporal models for ecological systems with complex spatial patterns
epidemic spreading in scale-free networks
correlation equations and pair approximations for spatial ecologies
persistence and dynamics in lattice models of epidemic spread
percolation on heterogeneous networks as a model for epidemics
generalizations of some stochastic epidemic models
the impacts of network topology on disease spread
ecological theory to enhance infectious disease control and public health policy
contact networks and the evolution of virulence
recurrent epidemics in small world networks
small worlds
collective dynamics of small-world networks

acknowledgments: we thank juan aparicio for valuable comments about the work, and ben bolker and an anonymous reviewer for useful suggestions on the manuscript. this research was supported by a centennial fellowship of the james s. mcdonnell foundation to m.p.

appendix: it is important to distinguish among the different types of random networks that are used frequently in the literature.
one is the random network with f = 1 that is generated using the small-world algorithm as described in section 2 (fig. 1c), which has a total of n n_0/2 distinct connections, where n_0 is the original neighborhood size (= 8 here) in the regular grid and n is the size of the network. each individual in this random network has a binomially distributed number of contacts around a mean n_0. there is also the homogeneous random network discussed in relation to the mean-field eq. (4), which by definition has a fixed number n_0 of random contacts per individual (keeling, 1999). these two networks are, however, different from the random network of erdős and rényi (albert and barabási, 2002), generated by randomly creating connections with a probability p among all pairs of individuals in a population. the expected number of distinct connections in the population is then p n(n − 1)/2, and each individual has a binomially distributed number of connections with mean p(n − 1). for moderate values of p and large population sizes, the erdős-rényi network is much more densely connected than the first two types. all three of them, however, have negligible clustering c and path length l, since the individuals do not retain any local connections (all connections are short-cuts). an appropriate network is constructed with a given f, and the stochastic sirs dynamics are implemented on this network using the rules described in section 2. for the initial conditions, we start with a random distribution of a small number of infective individuals, only 0.5% of the total population (= 0.005 n) unless otherwise stated, in a pool of susceptible individuals. all generated time series used for least-squares fitting of the transmission rate have a length of 20,000 time units. the structure of the network remains fixed during the entire stochastic run. stochastic simulations were carried out with a series of network sizes ranging from n = 10^4 to 10^6.
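the small-world construction described above (regular lattice, fraction f of connections rewired at random) can be sketched in a few lines. for simplicity this version uses a one-dimensional ring lattice rather than the paper's two-dimensional grid, so clustering values differ, but the total edge count n·n_0/2 and the role of the rewiring fraction f are the same; all names and values are illustrative:

```python
import random

def small_world(n, n0=8, f=0.0, seed=0):
    """ring-lattice sketch of the small-world construction:
    f = 0 gives the regular lattice, f = 1 a random network."""
    rng = random.Random(seed)
    edges = []
    # regular lattice: each node linked to its n0 nearest neighbours,
    # giving n*n0/2 distinct connections in total
    for i in range(n):
        for k in range(1, n0 // 2 + 1):
            edges.append((i, (i + k) % n))
    rewired = []
    for (i, j) in edges:
        if rng.random() < f:
            # redirect one end to a uniformly random node (no self-loops;
            # as a sketch, occasional duplicate edges are not prevented)
            j = rng.randrange(n)
            while j == i:
                j = rng.randrange(n)
        rewired.append((i, j))
    return rewired
```

the edge count is preserved by rewiring, matching the n n_0/2 connections stated in the text.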
the results presented here are those for n = 160,000 and are representative of other sizes. the values for the epidemic rate parameters β, γ and δ are chosen so that the disease successfully establishes in the population (a finite fraction of the population remains infected at all times).

key: cord-313046-3g2us5zh title: uncertainty quantification in epidemiological models for covid-19 pandemic date: 2020-06-03 journal: nan doi: 10.1101/2020.05.30.20117754 sha: doc_id: 313046 cord_uid: 3g2us5zh the main goal of this paper is to develop the forward and inverse modeling of the coronavirus (covid-19) pandemic using novel computational methodologies in order to accurately estimate and predict the pandemic. this leads to governmental decisions support in implementing effective protective measures and prevention of new outbreaks. to this end, we use the logistic equation and the sir system of ordinary differential equations to model the spread of the covid-19 pandemic. for the inverse modeling, we propose bayesian inversion techniques, which are robust and reliable approaches, in order to estimate the unknown parameters of the epidemiological models. we use an adaptive markov-chain monte-carlo (mcmc) algorithm for the estimation of a posteriori probability distribution and confidence intervals for the unknown model parameters as well as for the reproduction number. furthermore, we present a fatality analysis for covid-19 in austria, which is also of importance for governmental protective decision making. we perform our analyses on the publicly available data for austria to estimate the main epidemiological model parameters and to study the effectiveness of the protective measures by the austrian government. the estimated parameters and the analysis of fatalities provide useful information for decision makers and makes it possible to perform more realistic forecasts of future outbreaks.
the coronavirus covid-19 pandemic is a new infectious disease which emerged from china in fall 2019 and then spread around the world. the pandemic spreads through (micro-)droplets and its outbreak speed is very high. the first reported case of sars-cov-2 was identified in wuhan, china. the first case outside of china was reported in thailand on 13 january 2020 [1]. since then, this ongoing outbreak has spread all over the world [2]. so far (at the time of writing), this pandemic has infected around 5 230 000 individuals around the world and caused more than 335 000 deaths. out of more than 2 780 000 active cases around the world, 2% are critical patients. the source of the data is the johns hopkins csse database (https://github.com/cssegisanddata/covid19). the covid-19 pandemic was confirmed to have spread to austria on 25 february 2020 by a 24-year-old man and a 24-year-old woman (according to the federal ministry of social affairs, health, care and consumer protection, republic of austria, https://info.gesundheitsministerium.at) traveling from lombardy, italy, who were treated at a hospital in innsbruck. so far in austria more than 16 400 people have been infected and 633 deaths have been reported. furthermore, out of 833 active cases, 4% are critical patients. figure 1 displays daily confirmed cases in austria, and figure 2 illustrates the total cumulative count of confirmed and active cases in austria. by removing deaths and recoveries from total cases, we obtain the "currently infected cases" or "active cases" (cases still awaiting an outcome). infected people need breathing assistance and a large number of them require medical treatment in an intensive care unit (icu). countries affected by covid-19 attempt to keep the daily number of cases below the capacity of their health care system. in order to avert a disastrous inundation of hospitals, the virus must be kept from spreading fast.
to this end, countries have been implementing protective measures such as closing schools, canceling mass gatherings, working from home (home office), self-quarantine, self-isolation, avoiding crowds, social distancing, wearing protection masks, etc. in this work, we propose bayesian inference for the analysis of the covid-19 data in order to estimate the crucial unknown quantities of the pandemic models. we use an adaptive mcmc method to find the probability distributions and confidence intervals of the epidemiological model parameters using the austrian infection data. we use this analysis for the prediction of the duration of the epidemic in austria as well as of the total number of infected people and fatalities until the end of the epidemic. the model validation shows a very good agreement between the computational and measurement data of infections in austria, which proves the reliability and accuracy of the predictions. this is of great importance for making governmental decisions in implementing the measures in order to prevent the spread of the virus. (all rights reserved; no reuse allowed without permission. this preprint, which was not certified by peer review, is displayed under a license granted by the author/funder to medrxiv in perpetuity; this version was posted june 3, 2020.) this paper is organized as follows: section 2 presents the logistic and sir epidemiological models and introduces their unknown parameters. section 3 is devoted to bayesian analysis as the inversion method proposed for quantifying the uncertain model parameters in the epidemiological models. numerical results of the forward and inverse epidemic models, including the quantification of the models' uncertain parameters, model validation using the measurement data, the pandemic forecast, a fatality analysis and the effect of governmental protective measures, are presented in section 4. finally, conclusions are drawn in section 5.
predictive mathematical models are essential for the quantitative understanding of epidemics and for supporting decision makers in order to implement the most effective protective measures. many mathematical models for the spread of infectious diseases [3] [4] [5] [6] and in particular for the novel covid-19 [7] [8] [9] [10] [11] have been presented and analyzed. here we start with the logistic equation as a preliminary model for epidemics and continue with the sir model and its extensions [12] [13] [14] [15] [16] [17]. the logistic equation is a nonlinear ordinary differential equation, which is used for modeling population growth. this ode, also well known as the logistic growth model, is given by

dy/dt = α y (1 − y/β), y(0) = y_0,

where y_0 ≠ 0 is the initial population size (initial number of confirmed cases), y denotes the population size (total accumulated confirmed cases) and t time. furthermore, α and β are respectively the growth rate (infection rate) and the carrying capacity (maximum number of confirmed cases), which are positive constants. the solution of the logistic model equation is

y(t) = β / (1 + (β/y_0 − 1) e^(−αt)),

which can be rewritten as

y(t) = β y_0 e^(αt) / (β + y_0 (e^(αt) − 1)).

furthermore, the expected peak date of the outbreak is calculated as

t_peak = (1/α) ln(β/y_0 − 1),

which is the time when the expected maximal rate of confirmed cases (growth rate) occurs. the susceptible-infected-recovered (sir) model is an epidemiological model that computes the number of people infected with a contagious disease in a closed population over time. the kermack-mckendrick model is one of the sir models; it is defined by the system of ordinary differential equations

dS/dt = −β S I / N,
dI/dt = β S I / N − γ I,
dR/dt = γ I,

where β and γ are the infection and recovery rates, respectively. the model consists of three components: s for the number of susceptible, i for the number of infectious, and r for the number of recovered or deceased (or immune) individuals. furthermore, n denotes the constancy of the population, i.e., N = S(t) + I(t) + R(t).
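the logistic solution and its peak time can be evaluated directly. a minimal sketch, where the values of α, β and y_0 are illustrative and not the fitted austrian values:

```python
import math

# logistic growth: dy/dt = alpha*y*(1 - y/beta), y(0) = y0
def logistic(t, alpha, beta, y0):
    """closed-form solution of the logistic equation."""
    return beta / (1.0 + (beta / y0 - 1.0) * math.exp(-alpha * t))

def peak_time(alpha, beta, y0):
    """time of the maximal growth rate, where y reaches beta/2."""
    return math.log(beta / y0 - 1.0) / alpha

alpha, beta, y0 = 0.2, 16000.0, 2.0   # illustrative values only
t_peak = peak_time(alpha, beta, y0)
```

at t_peak the curve passes through half the carrying capacity, and for large t it saturates at β.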
moreover, the dynamics of the infectious class depends on the reproduction number/ratio, which is defined as

R_0 = β / γ.

if the reproduction number is high, the probability of a pandemic is high, too. this number is also used to estimate the herd immunity threshold (hit). if the reproduction number multiplied by the percentage of susceptible individuals is equal to 1, this indicates an equilibrium state, and thus the number of infectious people is constant. additionally, the recovery period is defined by

T_r = 1 / γ

and describes the average number of days to recover from infection. the transmission period, in the sense of the average number of days to transmit the infection to a person, is defined by

T_t = 1 / β.

however, in a population with vital dynamics, new births can provide more susceptible individuals to the population, which sustains an epidemic or allows new introductions to spread in the population. taking the vital dynamics into account, the sir model is extended to

dS/dt = µ N − β S I / N − ν S,
dI/dt = β S I / N − γ I − ν I,
dR/dt = γ I − ν R,

where µ and ν denote the birth and death rates, respectively. to maintain a constant population, we assume µ = ν is the natural mortality rate. we propose bayesian inversion methods in order to solve the backward/inverse problem of covid-19, which is the problem of accurately estimating the epidemiological model parameters as well as the reproduction ratio. bayesian inference in the context of statistical inversion theory is based on bayes' theorem and, compared to traditional inverse methods, has the advantage of updating the prior knowledge about the unknown quantity using the measurement/observation data.
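the forward sir dynamics can be integrated with a simple explicit scheme. a minimal forward-euler sketch; β, γ and the initial data are illustrative (R_0 = β/γ = 3 here), not the parameters fitted by the paper's mcmc procedure:

```python
def sir_step(S, I, R, beta, gamma, N, dt):
    # kermack-mckendrick right-hand side, one euler step
    dS = -beta * S * I / N
    dI = beta * S * I / N - gamma * I
    dR = gamma * I
    return S + dS * dt, I + dI * dt, R + dR * dt

def simulate_sir(beta, gamma, S0, I0, R0=0.0, T=100.0, dt=0.05):
    """forward-euler integration of the sir system; returns (t, S, I, R) rows."""
    N = S0 + I0 + R0                     # population stays constant
    S, I, R = S0, I0, R0
    traj = [(0.0, S, I, R)]
    for k in range(int(T / dt)):
        S, I, R = sir_step(S, I, R, beta, gamma, N, dt)
        traj.append(((k + 1) * dt, S, I, R))
    return traj

traj = simulate_sir(beta=0.3, gamma=0.1, S0=8.9e6, I0=100.0)  # illustrative
```

since dS + dI + dR = 0 at every step, S + I + R is conserved up to rounding, mirroring the constancy of the population N in the model.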
furthermore, bayesian analysis is a robust inversion technique for parameter extraction and gives the (a posteriori) probability distribution and confidence intervals for the unknowns instead of providing a single estimate. we have already successfully applied bayesian inversion techniques to various pde models in engineering and medicine in order to identify parameters (see for instance [18] [19] [20] [21]). as mentioned, the bayesian inversion approach is a robust and reliable technique to quantify the uncertain parameters of the epidemic models. in fact, the solution of the inverse problem is the posterior density that best reflects the distribution of the parameter based on the observations. as the observations or measurements are subject to noise, and the observational noise, i.e., the error E due to modeling and measurement, is unbiased and iid, it can be represented by random variables as

M = G(Q) + E,

where E is a mean-zero random variable and M is a given random variable representing observed data or measurements, for which we have a model G(Q) (observation operator) dependent on a random variable Q with realizations q = Q(ω) representing the parameters to be estimated [22]. assume a given probability space (Ω, F, P), where Ω is the set of elementary events (sample space), F a σ-algebra of events, and P a probability measure. furthermore, assume that all the random variables are absolutely continuous. bayes' theorem in terms of probability densities can be written as

π(q|y) = π(y|q) π_0(q) / π(y),

where the unknown parameters q = (q_1, . . . , q_p) ∈ R^p and the observed data y are realizations of the random variables Q and M, respectively. furthermore, π_0(q), π(q|y), and π(y|q) are the probability density functions of the prior, posterior, and (data) sampling distributions, respectively. the density π(y|q) of the data provides information from the measurement data to update the prior knowledge, and it is well known as the likelihood density function.
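bayes' rule can be illustrated on a one-dimensional parameter grid, where the normalising integral is approximated by a simple quadrature; this brute-force evaluation is exactly what becomes infeasible in higher dimensions and motivates mcmc. function names and the example likelihood/prior are illustrative:

```python
import math

def posterior_on_grid(qs, log_likelihood, log_prior):
    """discretised bayes' rule: pi(q|y) proportional to pi(y|q)*pi0(q),
    normalised by a riemann-sum approximation of the integral."""
    dq = qs[1] - qs[0]
    unnorm = [math.exp(log_likelihood(q) + log_prior(q)) for q in qs]
    Z = sum(unnorm) * dq                 # normalising constant pi(y)
    return [u / Z for u in unnorm]

# example: one gaussian observation y with unit noise and a flat prior
y = 1.5
qs = [i * 0.01 for i in range(-500, 501)]           # grid on [-5, 5]
post = posterior_on_grid(qs,
                         lambda q: -0.5 * (y - q) ** 2,  # log pi(y|q)
                         lambda q: 0.0)                  # flat log prior
```

the returned density integrates to one on the grid, and with a flat prior its mode sits at the observation.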
the goal of bayesian inversion is to estimate the posterior probability density function π(q|y), which reflects the uncertainty about the quantity of interest Q using the measurement data y. equation (6) gives the posterior density and summarizes our beliefs about q after we have observed y. therefore, bayes' theorem for inverse problems can be stated as follows. theorem 1 (bayes' theorem for inverse problems [22, 23]). let π_0(q) be the prior probability density function of the realizations q of the random parameter Q. let y be a realization or measurement of the random observation variable M. then the posterior density of Q given the measurements y is

π(q|y) = π(y|q) π_0(q) / ∫_{R^p} π(y|q) π_0(q) dq.

computing the integral appearing in bayes' theorem 1 is costly, especially if the parameter space R^p is high-dimensional. another problem with quadrature rules is that they require a relatively good knowledge of the support of the probability distribution, which is usually part of the information that we seek [22, 23]. in section 3.2 we briefly discuss the algorithms for bayesian estimation, which do not require evaluations of the integral and which are used to obtain the numerical results for the nonlinear model equation. markov-chain monte-carlo methods are a class of monte-carlo methods with the general idea of constructing markov chains whose stationary distribution is the posterior density [22]. the metropolis-hastings algorithm is an mcmc algorithm to draw samples from a desired distribution. in this algorithm, the first state of the chain q_0 is chosen, and then the new state q_k, k = 1, 2, . . . , n, of the chain is constructed based on the previous state q_{k−1}.
to this end, a new value q* is proposed using the proposal density function J(q*|q_{k−1}). admissibility of this proposed value is tested by calculating the acceptance ratio α(q*|q_{k−1}). if the proposed value is admissible, it is accepted as q_k; otherwise the old value is kept and a new proposal is made. for more details about mcmc methods see for example [24] [25] [26] [27]. although the convergence speed is determined by the choice of a good proposal distribution, at least tens or hundreds of thousands of samples are necessary to converge to the target distribution. choosing the optimal proposal scaling is a crucial issue and affects the mcmc results; if the covariance of the proposal distribution is too small, the generated markov chain moves too slowly, and if it is too large, the proposals are rejected. hence, optimal proposal values should be found to avoid both extremes, which leads to adaptive mcmc methods [28] [29] [30]. in the following section, we will consider an adaptive algorithm that helps sample from potentially complicated distributions. searching for a good proposal value can be done manually through trial and error, but this becomes intractable in high dimensions. therefore, adaptive algorithms that find optimal proposal scales automatically are advantageous. the delayed-rejection adaptive-metropolis (dram) algorithm is an efficient adaptive mcmc algorithm [29]. it is based on the combination of two powerful ideas to modify the markov-chain monte-carlo method, namely adaptive metropolis (am) [31, 32] and delayed rejection (dr) [33, 34], which are used as global and local adaptive algorithms, respectively. am finds an optimal proposal scale and updates the proposal covariance matrix, while dr updates the proposal value when q* is rejected.
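the basic (non-adaptive) metropolis-hastings loop described above can be sketched for a one-dimensional target with a symmetric gaussian random-walk proposal, so the proposal densities cancel in the acceptance ratio; the target and tuning values are illustrative:

```python
import math
import random

def metropolis_hastings(log_post, q0, n_samples, prop_sd=1.0, seed=1):
    """random-walk metropolis-hastings; log_post is the unnormalised
    log posterior density. with a symmetric proposal J, the ratio
    J(q_{k-1}|q*)/J(q*|q_{k-1}) cancels in the acceptance probability."""
    rng = random.Random(seed)
    q, lp = q0, log_post(q0)
    chain = [q]
    for _ in range(n_samples):
        q_star = q + rng.gauss(0.0, prop_sd)          # propose
        lp_star = log_post(q_star)
        # alpha = min(1, pi(q*|y) / pi(q_{k-1}|y))
        if rng.random() < math.exp(min(0.0, lp_star - lp)):
            q, lp = q_star, lp_star                   # accept
        chain.append(q)                               # else keep old value
    return chain

# standard normal target (log density up to a constant), illustrative
chain = metropolis_hastings(lambda q: -0.5 * q * q, q0=0.0, n_samples=20000)
```

for this target the chain's long-run mean and variance should approach 0 and 1, a quick sanity check on the sampler.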
the basic idea of the dr algorithm is that, if the proposal q* is rejected, delayed rejection (dr) provides an alternative candidate q** as a second-stage move rather than just retaining the previous value q_{k−1}. this process is called delayed rejection, and it can be carried out for one or many stages. furthermore, the acceptance probability of the new candidate(s) is calculated. therefore, in the dr process, the previous state of the chain is updated using the optimal parameter scale or proposal covariance matrix that has been calculated via the am algorithm. the am algorithm is a global adaptive strategy, where a recursive relation is used to update the proposal covariance matrix. in this algorithm, we take the gaussian proposal centered at the current state of the chain q_k and update the chain covariance matrix at the k-th step using

V_k = s_p cov(q_0, . . . , q_{k−1}) + s_p ε I_p,

where s_p is a design parameter and depends only on the dimension p of the parameter space. this parameter is specified as s_p := 2.38²/p as the common choice for gaussian targets and proposals [35], as it optimizes the mixing properties of the metropolis-hastings search in the case of gaussians. furthermore, I_p denotes the p-dimensional identity matrix, and ε > 0 is a very small constant to ensure that V_k is not singular; in most cases it can be set to zero [29]. the adaptive metropolis algorithm employs the recursive relation

V_{k+1} = ((k − 1)/k) V_k + (s_p/k) (k q̄_{k−1} q̄_{k−1}^T − (k + 1) q̄_k q̄_k^T + q_k q_k^T + ε I_p)

to update the proposal covariance matrix, where the sample mean q̄_k is calculated recursively by

q̄_k = (k q̄_{k−1} + q_k) / (k + 1).

a second-stage candidate q** is chosen using the proposal function

J_2(q** | q_{k−1}) = N(q_{k−1}, γ² V_k),

where V_k is the covariance matrix produced by the adaptive algorithm (am) as the covariance of the first stage and γ² < 1 is a constant.
the probability of accepting the second-stage candidate, having started at q_{k−1} and rejected q*, is

α_2(q** | q_{k−1}, q*) := min{ 1, [π(q**|y) J(q*|q**) (1 − α(q*|q**))] / [π(q_{k−1}|y) J(q*|q_{k−1}) (1 − α(q*|q_{k−1}))] },

where α is the acceptance probability (12) in the non-adaptive approach. the acceptance probability is computed so that the reversibility of the posterior markov chain is preserved (for more details see for example [22, §8.6]). the dram technique is summarized in algorithm 1. in this section, we present simulation results of bayesian inversion and the adaptive mcmc method (see algorithm 1) for the two epidemic models, namely the logistic and the sir models, using the data of the covid-19 outbreak in austria. the results include model parameter estimation, model validation and outbreak forecasting.

algorithm 1 (dram):
choose the first state of the chain q_0 such that π_0(q_0) > 0.
choose the number n_samples of samples or iterations.
choose the parameter ε.
choose the initial proposal covariance matrix V_0 (diagonal or symmetric).
choose the factor γ (often γ := 1/5) for the second-stage proposal distribution.
for k = 1 : n_samples do
1. (adaptivity:) the covariance matrix V_k in the k-th step is updated by (9).
2. a new value q* is generated from the proposal density J(q*|q_{k−1}) = N(q_{k−1}, V_k).
3. the new value q* is accepted with probability α(q*|q_{k−1}) := min{1, [π(q*|y) J(q_{k−1}|q*)] / [π(q_{k−1}|y) J(q*|q_{k−1})]}.
4. if the new state is accepted, we set q_k = q*. otherwise:
(a) (delayed rejection:) a second-stage proposal q** is generated from the proposal density N(q_{k−1}, γ² V_k), where V_k is the adapted covariance matrix.
(b) the new value q** is accepted with probability α_2(q**|q_{k−1}, q*) := min{1, [π(q**|y) J(q*|q**) (1 − α(q*|q**))] / [π(q_{k−1}|y) J(q*|q_{k−1}) (1 − α(q*|q_{k−1}))]}.
(c) if the new state is accepted, we set q_k := q**, otherwise q_k := q_{k−1}.
end for

according to the bayesian analysis, the unknown parameters of the logistic and sir models using the data of the covid-19 outbreak in austria were found and are summarized in table 1 and table 3, respectively. these tables show the confidence intervals for the model parameters as well as the means of the markov chains obtained in the bayesian inference. furthermore, tables 2 and 4 include temporal quantities such as the peak time of the outbreak estimated using the bayesian inference for the logistic and sir models. figure 4 illustrates marginal histograms of the posterior distribution for the four quantities of interest in the sir model, namely β, γ, n and r_0, using bayesian analysis. here, we aim to evaluate the logistic and sir models for forecasting the covid-19 outbreak by comparing the simulation results and the reported data. figure 5 illustrates the number of infected individuals in austria until now, as well as our prediction of the number of infected people for the coming days. this prediction is according to the bayesian inversion for the logistic equation as the epidemic model. now we use the sir model and estimate the unknown quantities of this epidemic model using bayesian inference for the spread of the coronavirus.
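the adaptive covariance update used in step 1 of algorithm 1 can be sketched in scalar form (p = 1). here the sample variance is accumulated with welford's recursion, which is algebraically equivalent to the recursive am update, and s_p = 2.38²/p with p = 1; the function name and constants are illustrative:

```python
def am_proposal_variance(chain, s_p=2.38 ** 2, eps=1e-8):
    """scalar sketch of the am proposal scale:
    V_k = s_p * var(q_0, ..., q_k) + s_p * eps,
    with the variance accumulated by welford's one-pass recursion."""
    mean, M2 = 0.0, 0.0
    for k, q in enumerate(chain):
        delta = q - mean
        mean += delta / (k + 1)          # recursive sample mean
        M2 += delta * (q - mean)         # recursive sum of squared deviations
    n = len(chain)
    var = M2 / (n - 1) if n > 1 else 0.0
    return s_p * var + s_p * eps         # eps keeps the proposal non-degenerate
```

in the full algorithm this update runs once per iteration over the growing chain (or, equivalently, is maintained incrementally), and in p dimensions the scalar variance becomes the empirical covariance matrix plus ε I_p.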
quantifying the uncertain parameters such as the reproduction number makes it possible to calculate the average number of days to recover from the infection and gives useful information about properly and accurately implementing protective measures in order to prevent the spread of the virus. furthermore, the parameter identification in the epidemic model makes it possible to predict the length of the pandemic, the number of infected individuals and the fatality rate. figure 6 displays a similar prediction using the bayesian inference for the sir model, which shows a very good agreement between the measurements and the simulation. in figure 7, the actual and estimated infection rates are depicted. this rate is defined by

infection rate_n := ΔI_n / I_{n−1},

where ΔI_n = I_n − I_{n−1}, and I_n and I_{n−1} are the infected populations at consecutive times (e.g. in days or weeks), which are obtained using the estimated infections from the sir model. figure 8 shows the estimated and actual reproduction number for austria during the infection time. the first recovery in austria was reported on march 26, when the reproduction number is estimated as 3. this quantity decays to 1 on april 1, and afterwards remains below 1. in this figure, the threshold of r = 1 is also displayed, in the sense that there is no immediate public health emergency any more when the reproduction number is below this threshold.
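the per-interval infection rate can be computed directly from an infection time series. a sketch, taking the rate at step n as ΔI_n / I_{n−1} with ΔI_n = I_n − I_{n−1} (an assumption about the paper's exact definition; the series values below are made up for illustration):

```python
def infection_rates(infected):
    """relative infection rate per reporting interval:
    rate_n = (I_n - I_{n-1}) / I_{n-1}; assumes all I_{n-1} > 0."""
    return [(infected[n] - infected[n - 1]) / infected[n - 1]
            for n in range(1, len(infected))]

# illustrative series of infected counts at consecutive intervals
rates = infection_rates([100, 140, 182, 200, 202])
```

a decaying sequence of such rates, as in table 5 of the paper, indicates a slowing outbreak.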
social-distancing measures started in austria around march 16, 2020, and improvements were first observed around march 21, which is consistent with the incubation time. in figure 9, the daily fatalities in austria as well as the relative change in fatalities are illustrated. applying the fatality ratio as well as the confirmed infected cases, we present a fatality analysis, which is of importance for governmental protective decision making. in epidemiology, a case fatality rate (cfr) is the proportion of deaths from a certain disease compared to the total number of people diagnosed with the disease for a certain period of time. figure 10 depicts the case fatality rate (cfr), which is defined by

CFR := deaths / (deaths + recovered),

and we call it the true cfr here, in contrast to the naive cfr defined by

naive CFR := deaths / confirmed cases.

the straightforward calculations using the recorded data in austria show that both cfr and naive cfr converge to the same value of cfr* = 0.04 (figure 10). we can roughly predict the fatalities using the confirmed infections, a shift (the number of days between confirmation of infection and death), and the cfr. the shift is approximately equal to the time between infection and death (currently estimated to be 18 days) minus the incubation time (currently estimated to be up to 7 days) minus the time for testing and reporting (see figure 11).
The estimator of fatalities in Austria is defined by

fatality cases := CFR* × confirmed infected cases (shifted by 10 days).

Figure 11 shows good agreement between the estimated fatalities and the true values in Austria. Here, in order to study the effect of the protective measures implemented by the Austrian government, we compare the infection rate and the infected population in different time intervals with and without the measures. The Austrian government decided to implement protective measures such as social distancing around March 10, 2020. Although public-health measures were in place from March 16 to control the spread of COVID-19, Austrians started to practice social distancing and to wear masks much earlier; for instance, Austrian universities stopped their physical activities starting March 11 and adopted a distance-learning and home-office strategy. Table 5 shows the weekly infection rate in Austria and how it decays in subsequent weeks. The comparison between the estimated infection rates in subsequent weeks, before and after implementing the protective measures, highlights the importance and effectiveness of measures such as social distancing and lockdown in controlling and slowing the spread of COVID-19. Furthermore, the fatality forecast in Section 4.3 is valid only as long as protective measures are in place; otherwise the number of fatalities will increase due to a large number of infected people and the limited capacity of intensive care unit (ICU) beds in Austria as the number of intensive cases rises dramatically.
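The shifted fatality estimator above is simple enough to sketch directly; the case series below is illustrative, not the Austrian data:

```python
def forecast_fatalities(confirmed, cfr, shift=10):
    """Estimated cumulative deaths at day n: cfr * confirmed[n - shift].

    Days before the shift window has any data are reported as 0.0.
    """
    return [cfr * confirmed[n - shift] if n >= shift else 0.0
            for n in range(len(confirmed))]

# Illustrative cumulative confirmed counts over 12 days
cases = [10, 30, 90, 200, 400, 700, 1000, 1300, 1500, 1600, 1650, 1680]
estimated = forecast_fatalities(cases, cfr=0.04, shift=10)
# Day 10 uses cases[0], day 11 uses cases[1], and so on.
```

Because the estimator only consumes already-confirmed cases, it gives roughly a `shift`-day fatality forecast for free, which is exactly how the paper projects deaths 10 days ahead.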
In fact, the main goal of protective measures and lockdown is to "flatten the curve", i.e., to decrease the infection rate so that the healthcare system is kept from becoming overwhelmed with too many critical cases at the same time. [Table: estimated infection rate (Eq. (15)) and total cumulative infected population at the end of different time intervals, with and without implementing the protective measures in Austria.] In countries where the counterfactual scenario, i.e., no public-health interventions, is applied, for instance Sweden, the ICU demand is estimated to be almost 20 times higher than the intensive care capacity of the country, and a much larger number of deaths is predicted [36]. If the Austrian governmental protective measures had not taken place, after March 18 the infection rate would have remained almost constant (around 6), and we would therefore expect thousands of infected people and fatalities in Austria. In this work, we developed an adaptive Bayesian inversion for epidemiological models, namely the logistic and SIR models, in order to solve the inverse problem of estimating unknown quantities for the novel coronavirus COVID-19. Quantifying the uncertainties in these models is essential, since it describes the characteristics of the epidemic on the one hand and enables accurate forecasting of the pandemic on the other. The proposed inversion recipe is robust and yields probability distributions and confidence intervals for the unknown parameters of the epidemic models, including the growth rate of the outbreak, the transmission and recovery rates, and the reproduction number, whose quantification is crucial.
We applied our methodology to the publicly available data for Austria to estimate the main epidemiological model parameters, to study the effectiveness of the protective measures taken by the Austrian government, and to carry out a fatality analysis. We also validated the presented models by comparing the simulated and measured data, which show very good agreement. Based on the Bayesian analysis for the logistic model, the means of the growth rate α and the carrying capacity β are estimated as 0.28 and 14,974, respectively, together with 95% confidence intervals for these parameters. The estimated reproduction number decays to 1 on April 1 and has since remained below 1. Analyzing the data on infected, recovered, and dead cases, we find that the case fatality rate (CFR) has converged to the value 4%. This estimation makes it possible to forecast the fatalities in the coming 10 days. According to our analysis, the total number of deaths in Austria is estimated as 633 on May 21, which matches the measured data very well. Furthermore, we estimated the infection rate for consecutive weeks, starting from before the protective measures were implemented, which shows a significant decay after the measures are in place. If the governmental protective measures had not been implemented, the infection rate estimated before March 18 would have remained almost constant, and thus thousands of people would have been infected and would have died. These results indicate the impact of measures such as social distancing and lockdown in controlling the spread of COVID-19.

References:
- The Continuing 2019-nCoV Epidemic Threat of Novel Coronaviruses to Global Health: The Latest 2019 Novel Coronavirus Outbreak in Wuhan
- World Health Organization (WHO). Coronavirus Disease 2019 (COVID-19) Situation Report 97
- Infectious Diseases of Humans: Dynamics and Control
- Mathematical Epidemiology of Infectious Diseases: Model Building, Analysis and Interpretation
- The Mathematics of Infectious Diseases
- Mathematical Models in Population Biology and Epidemiology
- Preliminary Prediction of the Basic Reproduction Number of the Wuhan Novel Coronavirus 2019-nCoV
- Epidemic Analysis of COVID-19 in China by Dynamical Modeling
- A Time Delay Dynamical Model for Outbreak of 2019-nCoV and the Parameter Identification
- Statistics Based Predictions of Coronavirus 2019-nCoV Spreading in Mainland China
- Time-to-Death Approach in Revealing Chronicity and Severity of COVID-19 Across the World
- The Basic Reproductive Number of Ebola and the Effects of Public Health Measures: The Cases of Congo and Uganda
- Phase-Adjusted Estimation of the Number of Coronavirus Disease 2019 Cases in Wuhan
- A SEIQR Model for Pandemic Influenza and Its Parameter Identification
- Optimization of Prognostication Model About the Spread of Ebola Based on SIR Model
- (with Athanasios Tsakris and Constantinos Siettos) Data-Based Analysis, Modelling and Forecasting of the COVID-19 Outbreak
- Modelling the COVID-19 Epidemic and Implementation of Population-Wide Interventions in Italy
- Bayesian Estimation of Physical and Geometrical Parameters for Nanocapacitor Array Biosensors
- Bayesian Inversion for Electrical-Impedance Tomography in Medical Imaging Using the Nonlinear Poisson-Boltzmann Equation
- Bayesian Inversion for a Biofilm Model Including Quorum Sensing
- Reliability of Poisson-Nernst-Planck Anomalous Models for Impedance Spectroscopy
- Uncertainty Quantification: Theory, Implementation, and Applications
- Statistical and Computational Inverse Problems
- Markov Chain Monte Carlo in Practice
- Monte Carlo Statistical Methods
- Monte Carlo Methods in Geophysical Inverse Problems
- Monte Carlo Analysis of Inverse Problems
- Optimal Proposal Distributions and Adaptive MCMC
- Handbook of Markov Chain Monte Carlo, ch. 4
- DRAM: Efficient Adaptive MCMC
- MCMC Methods for Functions: Modifying Old Algorithms to Make Them Faster
- Adaptive Proposal Distribution for Random Walk Metropolis Algorithm
- An Adaptive Metropolis Algorithm
- Some Adaptive Monte Carlo Methods for Bayesian Inference
- Delayed Rejection in Reversible Jump Metropolis-Hastings
- Efficient Metropolis Jumping Rules
- COVID-19 Health Care Demand and Mortality in Sweden in Response to Non-Pharmaceutical (NPI) Mitigation and Suppression Scenarios

The authors acknowledge support by FWF (Austrian Science Fund) START project no. Y660, "PDE Models for Nanotechnology."

key: cord-340375-lhv83zac  authors: Bliznashki, Svetoslav  title: A Bayesian Logistic Growth Model for the Spread of COVID-19 in New York  date: 2020-04-07  journal: nan  doi: 10.1101/2020.04.05.20054577  sha: doc_id: 340375  cord_uid: lhv83zac

Abstract: We use Bayesian estimation for the logistic growth model in order to estimate the spread of the coronavirus epidemic in the state of New York. Models weighting all data points equally, as well as models with a normal error structure, prove inadequate to model the process accurately. On the other hand, a model with larger weights for more recent data points and with t-distributed errors seems reasonably capable of making at least short-term predictions.

1. Introduction. The logistic growth model is frequently used to model the spread of viral diseases, and of COVID-19 in particular (e.g., Batista, 2020; Wu et al., 2020). The differential equation is given in (1):

dC/dt = r C (1 − C/K),   (1)

where C is the cumulative number of infected individuals, r is the infection rate, and K is the upper asymptote (i.e., the upper limit of individuals infected during the epidemic). Unlike other models, such as SIR, Eq. 1 has an explicit analytical solution:

C(t) = K / (1 + A e^(−rt)),   (2)

where A = (K − C_0)/C_0 and C_0 is the initial number of infectees. The parameters of Eq.
2 can easily be estimated via least squares (LS), but in this note we use a Bayesian approach, which allows us to make use of explicit posterior distributions in order to make probabilistic predictions. We apply the above model to the state of New York, which represents a relatively geographically homogeneous population with sufficient data points to build a reliable model that is not affected by different trends present in different regions.

2. Simulation 1. We begin with a simple model that estimates the parameters of Eq. 2 under the assumption of normally distributed homoscedastic errors. We use the data for 28 consecutive days of the epidemic, beginning with March 4 (11 infectees) and ending with March 31 (75,832 infectees). Prior to estimation we standardized our data by dividing all data points by 70,000 in order to avoid numerical problems; after the posteriors were obtained, we back-transformed the results to their original scale. We assumed that the errors are normally distributed with mean 0 and standard deviation (σ) estimated by the model. We used the blockwise random-walk Metropolis algorithm to sample from the joint posterior distribution of the four parameters of the model (K, A, r, and σ). The proposal distribution was multivariate normal, with a scaled variance-covariance matrix estimated on the basis of pilot runs. Uninformative improper uniform priors ranging from 0 to +∞ were employed for all parameters. A pilot chain showed an acceptance rate within the optimal range of 23% (e.g., Chib & Greenberg, 1995). The LS estimates include r = 0.34 (95% CI = [0.31; 0.37]). (Preprint notice: CC-BY-NC-ND 4.0 International license; the author/funder, who has granted medRxiv a license to display the preprint in perpetuity, was not peer-reviewed.)
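The random-walk Metropolis scheme described above can be sketched compactly. This is a simplified illustration with synthetic, noiseless data and a fixed diagonal proposal scale (the paper uses a pilot-run covariance), not the author's actual implementation:

```python
import math
import random

def logistic(t, K, A, r):
    # Explicit solution of the logistic ODE: C(t) = K / (1 + A * exp(-r t))
    return K / (1.0 + A * math.exp(-r * t))

def log_post(params, ts, ys):
    """Log posterior for (K, A, r, sigma): normal likelihood, flat priors on (0, inf)."""
    K, A, r, sigma = params
    if min(params) <= 0:
        return -math.inf
    ll = 0.0
    for t, y in zip(ts, ys):
        resid = y - logistic(t, K, A, r)
        ll += -0.5 * (resid / sigma) ** 2 - math.log(sigma)
    return ll

def metropolis(ts, ys, start, steps=2000, scale=(5.0, 0.5, 0.01, 0.5), seed=0):
    """Blockwise random-walk Metropolis: propose all parameters jointly."""
    rng = random.Random(seed)
    cur, cur_lp = list(start), log_post(start, ts, ys)
    chain = []
    for _ in range(steps):
        prop = [c + rng.gauss(0.0, s) for c, s in zip(cur, scale)]
        lp = log_post(prop, ts, ys)
        # accept with probability min(1, exp(lp - cur_lp)); 1 - random() avoids log(0)
        if lp > -math.inf and math.log(1.0 - rng.random()) < lp - cur_lp:
            cur, cur_lp = prop, lp
        chain.append(list(cur))
    return chain

# Synthetic data from a known curve (K=100, A=9, r=0.3)
ts = list(range(20))
ys = [logistic(t, 100, 9, 0.3) for t in ts]
chain = metropolis(ts, ys, start=[80.0, 5.0, 0.5, 2.0])
```

A symmetric proposal lets the Hastings correction drop out; the paper's lognormal df proposal later in the note is the asymmetric case, where the correction must be kept.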
We see that the estimates are very similar to the posterior EAPs reported above, which is to be expected given the uninformative nature of our priors. Still, the Bayesian analysis gives slightly wider intervals for the estimates which, as we will see below, is a positive: the model's predictions prove overly conservative. This is a commonly observed situation for phenomenological (i.e., purely data-driven) models of this type. At the time of writing this note, there is information on the number of infectees for three days after the 28 days used to fit the model. We used the posterior estimates to predict the number of future infectees. More precisely, we used the posterior estimates (including σ) to simulate data for future values of t, thereby constructing what are known as posterior predictive distributions. For example, for a given future day (e.g., the 28th day) we sampled all posterior values for the Eq. 2 parameters, and for each sample we plugged in the value t = 28 to obtain a mean prediction value; then we added a random number generated from N(0, σ), where the value of σ is sampled from the posterior alongside the other parameters available for the given step. The resulting predictive distribution has an observed mean, variance, etc., and can be used to make point and/or interval predictions (HDIs) as usual. Some results are shown in Table 2. We see that the predictive distributions fail to capture even the immediate true value, which once again suggests that the model is inadequate and fails to capture the true trends in the data. Note, however, that the ranges of the prediction intervals increase for later data points, which is a desirable quality of a model and is intrinsic to the Bayesian approach employed here. As Fig. 2 (upper right portion) suggests, the model converges too quickly to its upper asymptote, and hence its predictions are too low and probably too narrow.
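The posterior predictive construction just described (plug each posterior draw into Eq. 2, then add noise using that draw's σ) can be sketched as follows; the posterior draws here are stand-ins, not samples from the paper's actual chain:

```python
import math
import random

def logistic(t, K, A, r):
    # Eq. 2: C(t) = K / (1 + A * exp(-r t))
    return K / (1.0 + A * math.exp(-r * t))

def posterior_predictive(draws, t, seed=0):
    """One simulated future observation per posterior draw (K, A, r, sigma):
    the mean prediction from Eq. 2 plus N(0, sigma) noise."""
    rng = random.Random(seed)
    return [logistic(t, K, A, r) + rng.gauss(0.0, sigma)
            for (K, A, r, sigma) in draws]

# Stand-in posterior draws (K, A, r, sigma); a real chain would supply thousands.
draws = [(98.0, 9.0, 0.3, 1.0), (99.0, 9.0, 0.3, 1.0),
         (100.0, 9.0, 0.3, 1.0), (101.0, 9.0, 0.3, 1.0),
         (102.0, 9.0, 0.3, 1.0)]
preds = posterior_predictive(draws, t=28)
mean_pred = sum(preds) / len(preds)
```

Summarizing `preds` with its mean and an empirical interval gives exactly the point and HDI-style interval predictions the note reports in its tables.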
This observation is not surprising, given that it is well known that the simple logistic model is applicable only during specific stages of an outbreak and/or when enough data are available (see Wu et al., 2020, for a review). Possible solutions include: improving the model (e.g., Richards, 1959) by adding more parameters that can better account for the deviation of the observed data points from the symmetric S-shaped curve suggested by the logistic growth model; adjusting the prior distributions so as to reflect our expectation of a much higher upper asymptote (K); or switching to a different, preferably more mechanistic, approach altogether. Instead, we attempted to construct a more accurate model within the same logistic growth paradigm in a different way: we introduced weights for our data points, with later data points receiving higher weights than older ones, in the hope that this would alleviate some of the problems observed above. Specifically, we weighted the points according to a rectified-linear-like function (e.g., Glorot et al., 2011), whereby the first 20 observations received constant low weights (0.008) and the last 8 observations received linearly increasing higher weights (last 8 weights = [0.77, 1.55, 2.32, 3.09, 3.87, 4.64, 5.41, 6.19]). The idea behind this scheme was to force the model to account better for the observations following the approximately linear trend observed in the upper half of Fig. 2. Note also that the weights sum to the number of original observations (28). The weights pattern is shown in Fig. 3. In the subsequent simulations we used the proposed weights to weight the likelihood function of the model. Following Simeckova (2005), assuming we have observations y_1, …, y_n, where y_i has density f_i(y_i | θ) and θ is the vector of parameters (see Eq. 2), we apply the weights vector w = [w_1, …, w_n]. If we let l_i(θ) = log(f_i(y_i | θ)), the weighted log-likelihood function of our model becomes:

L_w(θ) = Σ_{i=1..n} w_i l_i(θ).

3. Simulation 2. We used the above weighting scheme and repeated the previous simulation. In that sense we altered the likelihood function while leaving the prior distributions intact. Everything else (including simulation details such as the number of posterior draws, thinning, etc.) was the same as reported in Section 2. Again, we observed good convergence for all parameters (see Fig. 4, depicting a traceplot and a histogram for the K parameter). Table 3 gives a summary of the posterior estimates: means, medians, standard deviations (SD), and 95% HDIs for the parameters of Eq. 2 obtained from the posterior of our second (weighted) model. We see that our weighting scheme appears to give more reasonable results and that the estimates for the upper asymptote are higher. This is confirmed by plotting the fitted equation against the observed data (Fig. 5): it is clear that the fitted curve is much more affected by the later points, and consequently the upper asymptote is higher than before (compare also Tables 3 and 1). We see that this time the model accurately predicts two consecutive data points and fails to predict the third.
This is still not a satisfactory performance, however, and it hints at the possibility that the actual process exhibits a steeper rise than the one suggested by the model. Likewise, it appears that the HDIs are not wide enough to accommodate the actual uncertainty. Looking at Figures 2 and 5, we see that the errors, in all likelihood, both lack homoscedasticity and possess an autocorrelated structure. In order to (partially) alleviate these problems, we removed the normality assumption used above and replaced it with the assumption that the errors follow a t-distribution with location parameter equal to 0 and with scale (similar to the standard deviation used above) and degrees-of-freedom (df) parameters estimated from the data (see Kruschke, 2012, for the same approach in the context of a linear model). We used the same weighting scheme as above and introduced the two new parameters (scale and df) describing the t-distribution governing the model's errors. We again proposed the first four parameters of the model (i.e., K, A, r, scale) from a multivariate normal distribution centered on the previous values of the chain, with a scaled variance-covariance matrix estimated from pilot chains; the df parameter was proposed separately from a lognormal distribution, which was transformed back to the original scale after the end of the simulation. 30 million samples from the posterior were obtained, with every 300th step retained (i.e., a thinning parameter of 300). We used improper uniform priors for all parameters except df, for which a shifted exponential with mean 30 was specified, as suggested by Kruschke (2012). The results indicated good convergence (a traceplot for the K parameter is shown in Fig. 6 below; histograms from the posterior for all parameters are shown in Appendix A). Since the lognormal proposal distribution is not symmetric, we use the full Metropolis-Hastings acceptance probability (e.g.,
Chib & Greenberg, 1995) during the step of sampling from the df posterior. Table 5 reports means, medians, standard deviations (SD), and 95% HDIs for the parameters of Eq. 2 obtained from the posterior of our last model. Consistently with our expectations, the 95% HDI for the df parameter suggests a noticeable deviation from normality. Fig. 7 shows the predicted trend based on the EAP estimates shown in Table 5. Finally, Table 6 specifies the predictive distributions for the next 7 days and for the estimate of the final cumulative number of infectees. We see that this model accurately predicts at least three future data points. In the next several days we should be able to observe how the model deals with data points further away in time. As can be seen in Appendix A, the posteriors for the different parameters no longer resemble parametric distributions (for the previous two models, the posteriors for K, A, and r definitely resemble normal/t-distributions, while the σ parameter is pronouncedly positively skewed and thus resembles a gamma distribution). Nevertheless, this model appears to be best suited to the modeled phenomenon.

5. Discussion. It appears that a logistic growth model with a weighted likelihood function and a t-distribution imposed on the error structure is able to make accurate short-term predictions of the spread of a disease. The Bayesian estimation gives more accurate estimates than traditional least squares and maximum likelihood approaches, with more accurate interval estimates.
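The Student-t error density used in the final model (the `gentdst` helper in the paper's MATLAB appendix) has a standard closed form; below is a log-space version, which is how one would typically evaluate it inside a likelihood. The comparison at the end shows why heavy tails help with outlying residuals:

```python
import math

def log_t_density(x, loc, scale, df):
    """Log density of a Student-t with given location, scale and df."""
    z = (x - loc) / scale
    return (math.lgamma((df + 1.0) / 2.0) - math.lgamma(df / 2.0)
            - 0.5 * math.log(df * math.pi) - math.log(scale)
            - 0.5 * (df + 1.0) * math.log1p(z * z / df))

# A residual 5 scale units out is far less surprising under t(df=3)
# than under a standard normal, so one outlier no longer dominates the fit.
log_normal_at_5 = -0.5 * 5.0 ** 2 - 0.5 * math.log(2.0 * math.pi)
log_t_at_5 = log_t_density(5.0, 0.0, 1.0, 3.0)
```

As df grows, the t density approaches the normal, which is why estimating df from the data lets the model choose how robust it needs to be.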
Moreover, the Bayesian posteriors (including the predictive distributions) have a straightforward probabilistic interpretation, which cannot be said of traditional frequentist confidence intervals. As a rule, the posterior distributions show high correlations between the parameters, which makes algorithms like blockwise Metropolis-Hastings more effective in general than algorithms that explore a single posterior distribution at a time, such as the Gibbs sampler and the componentwise Metropolis-Hastings. The fact that some posteriors lack closed-form solutions is another impediment to the Gibbs sampler, but not necessarily to the componentwise Metropolis-Hastings, as demonstrated in Section 3. The weighting scheme employed here proves beneficial over modeling the raw data by forcing the model to pay more attention to more recent observations. Other weighting schemes are certainly possible, and investigating the properties of different approaches seems a potentially fruitful future enterprise. On the whole, it appears that the combination of Bayesian estimation, differentially weighting the observations, and employing a more robust approach to modeling the errors (i.e., assuming a t-distribution with scale and df parameters estimated from the data) results in more reliable HDIs and prediction intervals than more traditional approaches. Far from being perfect, the proposed model appears to be somewhat useful. Of course, such a model can be continuously augmented by including new data points and applying the same or a similar weighting procedure. Presumably, continuously adjusting the model by adding new observations as they become available would improve its accuracy.
That being said, our simulations suggest that we should be somewhat skeptical of logistic growth models applied to the raw data describing an outbreak, especially when the number of available data points is relatively small and the upper asymptote appears not to have been reached yet.

Appendix A: histograms of the posterior distributions for the five parameters of our last model (K, A, r, scale, df).

Appendix B (MATLAB fragment for the df sampling step; gentdst evaluates the t density):

    if propdf < 0
        alp = -99999;
    else
        pred = cur(1) ./ (1 + cur(2) * exp(-cur(3) * t));
        ln = gentdst(y, pred, cur(4), exp(propdf));
        ln = sum(wei .* log(ln));
        ln = ln + log((1/29) * exp(-(1/29) * (exp(propdf) - 1)));  % shifted exponential prior on df
        alp = (ln - lp) + (log(propdf) - log(curdf));  % accounting for the log-normal proposal
    end

    function y = gentdst(x, m, s, v)
    % x - data point, m - location, s - scale, v - degrees of freedom
    c = (1 / sqrt(v)) * (1 / beta(v/2, 0.5));
    y = (c / s) * (1 + ((x - m).^2) / (v * (s^2))).^(-0.5 * (v + 1));
    end

References:
- Estimation of the Final Size of the Coronavirus Epidemic by the Logistic Model. medRxiv
- Understanding the Metropolis-Hastings Algorithm
- Deep Sparse Rectifier Neural Networks.
  Proceedings of the 14th International Conference on Artificial Intelligence and Statistics
- Bayesian Estimation Supersedes the t Test
- A Flexible Growth Function for Empirical Use
- Maximum Weighted Likelihood Estimation in Logistic Regression
- Generalized Logistic Growth Modeling of the COVID-19 Outbreak in 29 Provinces in China

Appendix (further MATLAB fragments; the columns of par correspond to the posteriors for K, A, r, scale, and df, respectively):

    nit = 30000000;
    wei = wei / sum(wei) * 28;

key: cord-332729-f1e334g0  authors: Shah, Nirav R.; Lai, Debbie; Wang, C. Jason  title: An Impact-Oriented Approach to Epidemiological Modeling  date: 2020-09-21  journal: J Gen Intern Med  doi: 10.1007/s11606-020-06230-1  sha: doc_id: 332729  cord_uid: f1e334g0

The COVID-19 pandemic has propelled epidemiological modeling into the public and political consciousness, beyond the strict purview of scientific and public-health experts. Models have emerged as crucial tools for decision-makers, with calls for government-mandated non-pharmaceutical interventions (NPIs), such as stay-at-home orders, to be based on data-driven thresholds such as case numbers and transmission rates [1]. And it goes both ways: data drive the use of NPIs, which then affect models in an iterative process. Meanwhile, the outputs of COVID-19 models have become a subject of public fixation and a mainstay of media headlines. There is a growing body of evidence supporting the efficacy of NPIs such as shelter-in-place orders and mask-wearing, whose effects depend on the extent of the public's buy-in and compliance. Studies have shown that NPIs averted a 67-fold increase in cases in China by February 29, 2020, and that even lax compliance can reduce transmission by as much as 25% [2]. Other studies suggest that suppression will minimally require social distancing by the entire population [3].
Under such circumstances, public awareness and consensus become paramount, particularly in the USA, where societal and cultural norms may limit imposed lockdowns akin to those that occurred in Wuhan and other parts of China. Thus, there emerges an unprecedented need to build a shared understanding of the disease, not just among experts and policymakers but also for the public. Those who develop epidemiological models are no longer only creating specialty tools, but consumer products as well, and thus face a new, non-traditional set of considerations. We propose that this requires an impact-oriented approach, i.e., asking what the cumulative impact of these models upon the public is. We call this impact-oriented modeling. Traditionally, epidemiological models have been valued for their ability to inform decision-makers who possess prior knowledge of disease management [4]. In the wake of the H1N1 pandemic in 2009, the World Health Organization (WHO) convened a mathematical modeling network of public-health experts and academics [5]. The Centers for Disease Control and Prevention (CDC) recently added policy development as a sixth item in its list of the major tasks of epidemiology in public health, but there remains no mention of the impact on the general public [6]. Impact-oriented modeling values more than accuracy, which remains non-negotiable. Beyond simply the outputs of such a model, consideration must be given to the presentation of those outputs, including design, visualization, and supporting content, all of which affect the utility, user experience, downstream policy, and, ultimately, impact. To this end, we outline a set of 8 key considerations for impact modeling. Though these eight considerations will not be easily met in totality, we recommend incorporating as many as possible into modeling for the COVID-19 pandemic (Table 1).

1. Agility: Is the data and model providing timely information?
The fast-changing nature of COVID-19 highlights the need for models to reflect the most recent information, which may differ drastically from recent, prior information. With COVID-19, journalists have become an active source not only of news but also of data. The New York Times' repository of COVID-19 cases (available at https://github.com/nytimes/covid-19-data), collected by reporters who monitor news conferences, analyze data releases, and seek clarification from public officials, is updated daily and is among the best sources of this fundamental metric.

2. Responsiveness: Do the data and model respond to new evidence? Not only do models allow the public and decision-makers to react to data, but the models themselves should also react to data in an iterative fashion. A feedback loop of action-information-reaction should drive models to evolve continuously, along with COVID-19 and our knowledge of it. For instance, on May 4, 2020, over five weeks after it first launched on March 26, the Institute for Health Metrics and Evaluation (IHME) pivoted from a poorly performing curve-fitting model drawing on prior death reporting to a traditional SEIR model (available at https://covid19.healthdata.org/), which led to a substantial increase in forecasted COVID-19 deaths and more accurate outputs.

3. Transparency: Are the data and model's mechanisms and data sources publicly available for fact-checking and validation? This issue has already been raised in the field of machine learning, where the plethora of options likewise renders the task of selection difficult. In the absence of perfect knowledge and the presence of myriad approaches, open-source models and databases enable users of these models to make more informed choices between models and data sources. They also enable the validation of models and data sources, which is critical not only for verifying accuracy but also for enabling iteration and improvement.
For instance, the Covid Act Now (CAN) model is fully open-source, along with its data inputs (available at https://covidactnow.org). The mechanisms of its models, its assumptions, and its references are made publicly available [6]. This enables the public and experts to escalate questions and concerns, which has allowed the model to be refined, for example by ingesting more accurate data.

4. Usability: Can the data and model be used easily, effectively, and efficiently? Intuitively, we know that when users are not able to easily access and use a product, they are less likely to continue using it. Developers of consumer products are thus familiar with the need to consider user expectations, desires, and requirements [7]. COVID-19 models may benefit from doing the same. For example, user research, a common component in the development of consumer products, may become increasingly important in order to better understand the barriers that prospective users of epidemiological models face.

5. Accessibility: Can the data and model be understood and used by a broad audience, irrespective of scientific, technical, and other capabilities? The majority of the USA has not received training in epidemiology or data science. Elderly populations, which are more vulnerable to infection, typically have less experience using technology. As progress in containing the virus depends on the cumulative behavior of millions of individuals, a broad understanding of a model determines its success or failure; hence, models must use language and visuals that forgo specialized jargon and excessive complexity.

6. Universality: Do the data and model draw on inputs that are defined and measured consistently across geographies? Given the unprecedented nature of COVID-19, countries, states, counties, and cities depend upon learning from each other, and what happens across artificial political boundaries matters across a region. Standardization and consistency of data across regions can enable this.
for example, the covid tracking project is an open-source initiative of the atlantic and provides one of the most complete data sets available about covid-19 in the usa (available at https://covidtracking.com/). 7. adaptability: can the model be modified and adapted? in particular, efforts to provide useful covid-19 data for the usa have run into the following quandary: even as the implementation of tactical strategies exists primarily at the local level, it is also at the local level that the big data required to feed epidemiological models becomes most difficult to obtain. it may be that the models most easily customized by cities and counties will be those that have the greatest impact. the covid-19 hospital impact model for epidemics (chime) allows for custom inputs, such as estimates of the regional population, hospital market share, and currently hospitalized covid-19 patients, in order to assist local officials with hospital capacity planning (available at https://penn-chime.phl.io/). 8. actionability: does the model reflect current government policies? given the role of epidemiological models in shaping public discourse and behavior, there is a responsibility to also inform actionability. models that fail to do so may contribute to anxiety, confusion, or even actions that violate federal, state, or local regulations. on the other hand, models that clearly communicate the actionable implications of their outputs can contribute to a positive rather than a negative impact. both the new york times and georgetown university's center for global health, science, and security (available at https://covidamp.org/) have begun to collect data on covid-19 policies by state and effective dates, including shelter-in-place and reopening orders. to our knowledge, no data source or model currently fulfills all the considerations that we have set forth.
these eight considerations may enable covid-19 data and models to become better harbingers of actionable, behavior-changing, and even life-saving information; to bridge the gap between scientific public health expertise and mainstream, layperson knowledge; and to generate more positive impact than noise. in summary: 3. transparency: are the data and model's mechanisms and data sources publicly available for fact-checking and validation? 4. usability: can the data and model be used easily, effectively, and efficiently? 5. accessibility: can the data and model be understood and used by a broad audience, irrespective of scientific, technical, and other capabilities? 6. universality: do the data and model draw on inputs that are defined and measured consistently? 7. adaptability: can the model be easily modified and adapted? 8. actionability: are there clear calls-to-action that reflect current government policies? as the british statistician george box said, "all models are wrong, but some are useful."
references:
wrong but useful: what covid-19 epidemiologic models can and cannot tell us
effect of non-pharmaceutical interventions to contain covid-19 in china
impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
a proposal for standardized evaluation of epidemiological models. developing the theory and practice of epidemiological forecasting (delphi)
world health organization. mathematical modelling of the pandemic h1n1
an introduction to applied epidemiology and biostatistics, third edition. the centers for disease control and prevention
using websites to engage consumers in managing their health and healthcare

key: cord-353200-5csewb1k
authors: jehi, lara; ji, xinge; milinovich, alex; erzurum, serpil; merlino, amy; gordon, steve; young, james b.; kattan, michael w.
title: development and validation of a model for individualized prediction of hospitalization risk in 4,536 patients with covid-19 date: 2020-08-11 journal: plos one doi: 10.1371/journal.pone.0237419 sha: doc_id: 353200 cord_uid: 5csewb1k background: coronavirus disease 2019 is a pandemic that is straining healthcare resources, mainly hospital beds. multiple risk factors of disease progression requiring hospitalization have been identified, but medical decision-making remains complex. objective: to characterize a large cohort of patients hospitalized with covid-19 and their outcomes, and to develop and validate a statistical model that allows individualized prediction of future hospitalization risk for a patient newly diagnosed with covid-19. design: retrospective cohort study of patients with covid-19, applying a least absolute shrinkage and selection operator (lasso) logistic regression algorithm to retain the most predictive features for hospitalization risk, followed by validation in a temporally distinct patient cohort. the final model was displayed as a nomogram and programmed into an online risk calculator. setting: one healthcare system in ohio and florida. participants: all patients infected with sars-cov-2 between march 8, 2020 and june 5, 2020. those tested before may 1 were included in the development cohort, while those tested on may 1 or later comprised the validation cohort. measurements: demographic, clinical, social influencers of health, exposure risk, medical co-morbidities, vaccination history, presenting symptoms, medications, and laboratory values were collected on all patients and considered in our model development. results: 4,536 patients tested positive for sars-cov-2 during the study period. of those, 958 (21.1%) required hospitalization. by day 3 of hospitalization, 24% of patients were transferred to the intensive care unit, and around half of the remaining patients were discharged home. ten patients died.
hospitalization risk was increased with older age, black race, male sex, former smoking history, diabetes, hypertension, chronic lung disease, poor socioeconomic status, shortness of breath, diarrhea, and certain medications (nsaids, immunosuppressive treatment). hospitalization risk was reduced with prior flu vaccination. model discrimination was excellent with an area under the curve of 0.900 (95% confidence interval of 0.886–0.914) in the development cohort, and 0.813 (0.786, 0.839) in the validation cohort. the scaled brier score was 42.6% (95% ci 37.8%, 47.4%) in the development cohort and 25.6% (19.9%, 31.3%) in the validation cohort. calibration was very good. the online risk calculator is freely available and found at https://riskcalc.org/covid19hospitalization/. limitation: retrospective cohort design. conclusion: our study crystallizes published risk factors of covid-19 progression, but also provides new data on the role of social influencers of health, race, and influenza vaccination. in the context of a pandemic and limited healthcare resources, individualized outcome prediction through this nomogram or online risk calculator can facilitate complex medical decision-making. based on the latest estimates from the centers for disease control (week ending june 6, 2020), hospitalization rates in the united states due to coronavirus disease 2019 (covid-19) range from 5.6/100,000 population in patients 4 years or younger to 273.8/100,000 population in those 65 years or older, posing a significant capacity challenge to the healthcare system. strategies to address this challenge have focused on imposing social distancing to reduce viral transmission and increasing hospital bed capacity by drastically reducing usual occupancy, eliminating elective surgical procedures, and creating makeshift surge hospitals [1].
social distancing practices have indeed helped in curbing the acute need for hospital beds, at least momentarily, but the long-term healthcare capacity requirements remain unclear as strategies for lifting restrictions and resuming normal activities are in flux. improving our understanding of the clinical outcomes of patients infected with covid-19 is therefore paramount. in addition, we need predictive algorithms that identify the covid-19 patients at highest risk of progressing to severe disease to develop alternative approaches to safely manage them. these predictive algorithms could also be used at a population level to guide social distancing and other risk limiting strategies in a focused fashion, rather than the blanket approaches of shelter-in-place for society. (data availability: the data are subject to ethical restrictions by the cleveland clinic regulatory bodies, including the institutional review board and legal counsel. in particular, variables like the patient's address, date of testing, dates of hospitalization, date of icu admission, and date of mortality are hipaa protected health information and legally cannot be publicly shared. since these variables were critical to the generation and performance of the model, a partial dataset excluding them is not fruitful either, because it will not help in efforts of academic advancement such as model validation or application. we will make our data sets available upon request, under appropriate data use agreements with the specific parties interested in academic collaboration. requests for data access can be made to mascar@ccf.org.)
older age [2, 3], smoking [4], and medical co-morbidities such as diabetes, hypertension, cardiovascular disease, chronic kidney disease, chronic lung disease [5], and cancer [5, 6] have been correlated with disease worsening in patients who are already hospitalized with covid-19. it is unclear how these comorbidities, or other patient characteristics, factor into clinical worsening that leads to hospitalization. translating their significance at an individual patient care level when faced with a decision to hospitalize patients presenting with symptoms of covid-19 is even more elusive. the end result is patients being told to go home from the emergency room only to return much more ill and be admitted days later, or patients hospitalized for observation for several days without any significant clinical deterioration. we present the clinical characteristics and outcomes of patients with covid-19, including a subset who were hospitalized. we also develop and validate a statistical model that can assist with individualized prediction of hospitalization risk for a patient with covid-19. this model allows us to generate a visual statistical tool (a nomogram) that can consider numerous variables to predict an outcome of interest for an individual patient [7]. we included all patients, regardless of age, who had positive covid-19 testing at cleveland clinic between march 8, 2020 and june 5, 2020. the study cohort included all covid positive patients, whether they were hospitalized or not, from across the cleveland clinic health system, which includes >220 outpatient locations and 18 hospitals in ohio and florida. as testing demand increased, we adapted our organizational policies and protocols to reconcile demand with patient and caregiver safety. prior to march 18, any primary care physician could order a covid-19 test.
after that date, testing resources were streamlined through a "covid-19 hotline," which followed recommendations from the centers for disease control (which recommended focusing on high-risk patients, defined by any of the following: age older than 60 years or younger than 36 months; on immune therapy; having comorbidities of cancer, end-stage renal disease, diabetes, hypertension, coronary artery disease, heart failure with reduced ejection fraction, lung disease, hiv/aids, or solid organ transplant; contact with known covid-19 patients; physician discretion was still allowed). demographics, co-morbidities, travel and covid-19 exposure history, medications, presenting symptoms, socioeconomic measures, treatment, disease progression, and outcomes were collected. registry variables were chosen to reflect available literature on covid-19 disease characterization, progression, and proposed treatments, including medications thought to have benefits through drug-repurposing studies [8]. capture of detailed research data was facilitated by the creation of standardized clinical templates that were implemented across the healthcare system as patients were seeking care for covid-19-related concerns. outcome capture was facilitated by a home monitoring program whereby patients who tested positive were called daily for 14 days after their test result to monitor their disease progression. data were extracted via previously validated automated feeds [9] from our electronic health record (epic, epic systems corporation) and manually by a study team trained on uniform sources for the study variables. the covid-19 research registry team includes a "reviewer" group and a "quality assurance" group.
the reviewers were responsible for manually abstracting and entering a subset of variables (signs and symptoms upon presentation) that cannot be automatically extracted from the electronic health record, and for verifying high-priority variables (co-morbidities) that had been automatically pulled into the database from the electronic health record. the quality assurance group provided an independent second layer of review. study data were collected and managed using redcap electronic data capture tools hosted at cleveland clinic [10, 11]. redcap (research electronic data capture) is a secure, web-based software platform designed to support data capture for research studies, providing 1) an intuitive interface for validated data capture; 2) audit trails for tracking data manipulation and export procedures; 3) automated export procedures for seamless data downloads to common statistical packages; and 4) procedures for data integration and interoperability with external sources. this research was approved by the cleveland clinic institutional review board (irb# 20-283). consent was waived by the irb. nasopharyngeal and oropharyngeal swab specimens were both collected from all patients and pooled for testing by trained medical personnel. given previous beliefs that co-infection with severe acute respiratory syndrome coronavirus 2 (sars-cov-2) and other respiratory viruses is rare [12, 13], a reflex testing algorithm was implemented to conserve resources. all patient specimens were first tested for the presence of influenza a/b and respiratory syncytial virus (rsv), and only those negative for influenza and rsv were subsequently tested for sars-cov-2. infection with sars-cov-2 was confirmed by laboratory testing using the centers for disease control and prevention (cdc) reverse transcription polymerase chain reaction (rt-pcr) sars-cov-2 assay that was validated in the cleveland clinic robert j. tomsich pathology and laboratory medicine institute.
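The reflex testing protocol described above (test every specimen for influenza a/b and rsv first; run the sars-cov-2 rt-pcr only on specimens negative for both) can be written as a small decision function. This is an illustrative reconstruction, not code from the study; the function and argument names are hypothetical.

```python
def reflex_testing(flu_a_b_positive: bool, rsv_positive: bool) -> str:
    """Resource-conserving reflex algorithm: specimens positive for
    influenza A/B or RSV are not tested for SARS-CoV-2."""
    if flu_a_b_positive:
        return "influenza A/B detected; SARS-CoV-2 test not performed"
    if rsv_positive:
        return "RSV detected; SARS-CoV-2 test not performed"
    return "proceed to SARS-CoV-2 RT-PCR"
```

Note that this ordering rests on the stated assumption that co-infection is rare; if that assumption fails, flu- or rsv-positive covid-19 cases would be missed.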
this assay uses roche magnapure extraction and abi 7500 dx pcr instruments. between march 8 and 13, the tests were sent out to labcorp, burlington, north carolina. all testing was authorized by the food and drug administration under an emergency use authorization (eua), and in accordance with the guidelines established by the cdc. baseline data are presented as median [interquartile range (iqr)] and number (%). continuous variables were compared using the mann-whitney u test, and categorical variables were compared using the chi-square test. the outcome of interest was hospitalization anytime within three days of a positive covid test. the model was built using a development cohort (patients whose positive covid test resulted before may 1, 2020), and subsequently tested in a validation cohort (patients whose positive covid test resulted between may 1 and june 5, 2020). this allowed us to test the model's validity over time. a full multivariable logistic model was initially constructed to predict hospital admission with covid-19 based on demographic variables, comorbidities, immunization history, symptoms, travel history, lab variables, and medications that were identified pre-admission. for modeling purposes, methods of missing value imputation for lab variables were compared using median values and using values from multivariate imputation by chained equations (mice) via the r package mice. restricted cubic splines with 3 knots were applied to continuous variables to relax the linearity assumption. a least absolute shrinkage and selection operator (lasso) logistic regression algorithm was performed to retain the most predictive features. a 10-fold cross validation method was applied to find the regularization parameter lambda that gave the minimum mean cross-validated concordance index. predictors with nonzero coefficients in the lasso regression model were chosen for calculating predicted risk.
the final model was internally validated by assessing the discrimination and calibration with 1000 bootstrap resamples. discrimination was measured with the concordance index [14]. calibration was assessed visually by plotting the nomogram predicted probabilities against the observed event proportions over a series of equally spaced values within the range of the predicted probabilities. the closer the calibration curve lies along the 45° line, the better the calibration. a scaled brier score, called the index of predictive accuracy (ipa) [15], was also calculated, as this has some advantages over the more popular concordance index. the ipa ranges from -1 to 1, where a value of 0 indicates a useless model, and negative values imply a harmful model. we adhered to the tripod checklist for reporting the prediction model [16]. we calculated sensitivity, specificity, positive predictive value, and negative predictive value at different cutoffs of predicted risk. we used r, version 3.5.0 (r project for statistical computing) [17], with the tidyverse [18], mice [19], caret [20], and riskregression [21] packages for all analyses. statistical tests were 2-sided and used a significance threshold of p < .05. we included all covid positive patients during the study period in this model development and validation to optimize model performance: no specific sample size calculations were performed. an outcome of "hospitalized versus not" allows us to predict the likelihood that the patient is actually getting admitted to the hospital. this decision, however, is influenced by multiple "non-medical" factors including bed availability, regulatory systems, and individual physician preferences.
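The scaled brier score (ipa) mentioned above has a simple closed form: one minus the ratio of the model's brier score to that of a null model that predicts the overall event rate for everyone. The sketch below is my own illustrative implementation, not the authors' riskregression code.

```python
import numpy as np

def index_of_prediction_accuracy(y, p):
    """Scaled Brier score (IPA): 1 - Brier(model)/Brier(null), where the
    null model predicts the overall event rate for every patient.
    1 = perfect, 0 = useless model, negative = harmful model."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    brier_model = np.mean((p - y) ** 2)
    brier_null = np.mean((y.mean() - y) ** 2)
    return 1.0 - brier_model / brier_null

# sanity checks: perfect predictions give IPA = 1, predicting the base rate gives 0
y = np.array([0, 0, 1, 1, 1, 0, 0, 1])
print(index_of_prediction_accuracy(y, y))                      # 1.0
print(index_of_prediction_accuracy(y, np.full(8, y.mean())))   # 0.0
```

Unlike the concordance index, this score is sensitive to calibration: a model that ranks patients correctly but reports systematically miscalibrated probabilities is penalized.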
to test the applicability of our model towards a determination of whether a patient should have been admitted or not, we subdivided patients included in our development and validation cohorts into 4 categories: a-hospitalized and not sent home within 24 hours; b-sent home (not initially hospitalized) but ultimately hospitalized within 1 week of being sent home; c-not hospitalized at all; d-hospitalized but sent home within 24 hours. in this construct, categories a and c represent patients who were "correctly managed", and categories b and d represent those who were "incorrectly managed". we then tested the discrimination of our model in each one of those categories separately. no model recalibration was done. 4,536 patients tested positive during the study period, including 2,852 patients in the development cohort (dc) of whom 582 (20.4%) were hospitalized, and 1,684 patients in the validation cohort (vc) of whom 376 (22.3%) were hospitalized. table 1 provides demographic, exposure, clinical, laboratory, social characteristics, and medication history of covid-19 patients who were hospitalized versus those who completed their treatment on an outpatient basis in both the dc and vc. at the time of hospital admission, 260 patients were known to have covid-19, while the results of the rt-pcr sars-cov-2 nasopharyngeal assay were still pending on 698. six hundred and sixty-five were admitted from the emergency room, 32 were transferred from other hospitals, and 261 were directly admitted from the outpatient areas. overall outcomes illustrated in fig 1 show the cumulative incidence of hospital discharge, transfer to intensive care unit, and death in our hospitalized cohort. imputation methods were evaluated with 1000 repeated bootstrapped samples. we found that models based on median imputation appeared to outperform those based on data from mice imputation, so median imputation was selected as the basis of the final model.
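The four-way split described above can be written down directly; categories a and c are "correctly managed" and b and d "incorrectly managed". The function and argument names below are hypothetical, chosen only to mirror the text.

```python
def management_category(hospitalized: bool,
                        discharged_within_24h: bool = False,
                        admitted_within_7d_of_discharge_home: bool = False) -> str:
    """a: hospitalized and not sent home within 24 hours
       b: sent home, but hospitalized within 1 week
       c: not hospitalized at all
       d: hospitalized but sent home within 24 hours"""
    if hospitalized:
        return "d" if discharged_within_24h else "a"
    return "b" if admitted_within_7d_of_discharge_home else "c"
```

Evaluating discrimination separately inside each category, as the authors do, asks whether the model could have flagged the b and d patients whose initial disposition turned out to be wrong.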
variables that we examined but that were not found to add value beyond those included in our final model for predicting hospitalization included exposure to covid-19, other family members with covid-19, fever, fatigue, sputum production, flu-like symptoms, recent international travel, coronary artery disease, heart failure, being on immunosuppressive treatment, other heart disease, other lung disease, pneumovax vaccine, bun, being on an angiotensin converting enzyme inhibitor or angiotensin receptor blocker, toremifene, and paroxetine. model discrimination was excellent with an area under the curve of 0.900 (95% confidence interval of 0.886-0.914) in the development cohort, and 0.813 (0.786, 0.839) in the validation cohort. the scaled brier score was 42.6% (95% ci 37.8%, 47.4%) in the development cohort and 25.6% (19.9%, 31.3%) in the validation cohort. the nomogram is presented in fig 2, and an online version of the statistical tool (fig 3) is available at https://riskcalc.org/covid19hospitalization/. the calibration curves are shown in fig 4 and suggest that predicted risk matches observed proportions relatively well throughout the risk range. table 2 shows the sensitivity, specificity, negative predictive value, and positive predictive value at different cutoffs of predicted risk. appropriately managed patients represented the majority of the cohort: 750 patients were hospitalized with a length of stay that exceeded 24 hours (431 in dc and 319 in vc), and 3549 patients were not hospitalized at all (2258 in dc and 1291 in vc). a minority of patients (237 patients, 5.4%) fell in the category of inappropriate initial management: 208 had been initially sent home from the emergency room but were then admitted within 1 week of the emergency room visit (151 in dc, 57 in vc), and 29 patients were hospitalized but then discharged within 24 hours (12 in dc, and 17 in vc).
when tested in each one of those categories, the predictive model performed very well in the appropriately managed subgroup (area under the curve of 0.821), but its performance was inadequate in the 5.4% of patients who fell in the inappropriate initial management category. our results confirm a higher risk of hospitalization with older age (median age in hospitalized patients of 65.5 years compared to 48.0 years in non-hospitalized patients), male sex (56.9% of hospitalized vs 48.3% of non-hospitalized), and medical co-morbidities, most prominently hypertension, diabetes, and immunosuppressive disease (variables significant on univariable analysis in table 1, but also relevant in the final model). the significant association of shortness of breath and diarrhea with hospitalization may reflect the need for inpatient supportive care with these symptoms, regardless of the etiology. beyond the expected, our results provide some insights that advance the existing literature: 1. smoking: the world health organization warns of a higher morbidity for covid-19 in smokers, and proposes multiple possible mechanisms including frequent touching of face and mouth during the act of smoking, sharing cigarettes, and underlying lung disease [22]. we found that former smokers rather than current smokers are at higher risk of covid-related hospitalization (table 1), favoring the underlying lung disease mechanism. 2. medications: we found a higher risk of hospitalizations in covid-19 patients who were on angiotensin converting enzyme (ace) inhibitors or angiotensin ii type-i receptor blockers (arbs) on univariable analysis [16, 23, 24]. however, being on these medications did not influence the final multivariable model, suggesting that prior associations of aceis and arbs with covid severity may be confounded by the underlying medical comorbidities (hypertension and diabetes) that are linked to the highest covid hospitalization rates, and which are most often treated with these same drugs.
ace2 can also be increased by thiazolidinediones and ibuprofen, potentially explaining the higher hospitalization risk seen in our patients on non-steroidal anti-inflammatory drugs (nsaids); in fact, the latest fda guidance cautions against the use of nsaids in covid patients [25]. overall, we recommend caution in using retrospective data to draw robust conclusions assigning causation to drugs vs underlying co-morbidity vs genetically driven ace2 polymorphism. we highlight the need for carefully designed, large observational studies or randomized clinical trials to address these critical questions. 3. race: african american race was correlated with a higher hospitalization risk (36.2% of hospitalized vs 21% of non-hospitalized). this is consistent with a recent look at hospitalizations for covid-19 across 14 states from march 1 to 30 [26]. race data, which were available for 580 of 1,482 patients, revealed that african americans accounted for 33 percent of the hospitalizations, but only 18 percent of the total population surveyed [26]. the authors proposed explanations like higher rates of medical co-morbidities, higher exposure risks, and distrust of the medical community. our data, however, show that the effect of race on the individualized hospitalization risk prediction far outweighs that of any medical co-morbidity (fig 2). it is already known that race influences the effectiveness of an immune response [27]. a deeper exploration of the underlying genetics and biology of race in the defense against and the response to a sars-cov-2 infection is needed. this should be paired with a deeper exploration of social influencers of health such as population per square kilometer and population per household, which were also relevant in our nomogram. in our online risk calculator, only the zip code entry is required: the relevant social influencers data are derived from the zip code by our program.
given the multitude of risk factors discussed, the nomogram and online risk calculator help overcome the challenges of translating complex information to patient-level clinical decision-making [28]. during a pandemic, with hospital beds in short supply, it is critical to empower front-line healthcare providers with tools that can supplement and support decision-making about whom to admit. advances in tele-health can be leveraged for home monitoring to guide care delivery in an outpatient setting for those determined to be low risk based on the nomogram calculation. models like ours, developed with data obtained through automated abstraction from the electronic health record (ehr), offer the promise of integration within the ehr to facilitate rapid and efficient implementation into the clinical workflow. such a strategy is a pragmatic application of overdue calls for a learning health system [29]. model performance, as measured by the concordance index, is excellent (c-statistic = 0.900). this level of discrimination is clearly superior to a coin toss or assuming all patients are at equivalent risk (both c-statistics = 0.5). the calibration of the model is excellent in both the dc and vc (see fig 4). the metric that considers calibration, the ipa value, confirms that the model predicts substantially better than chance or no model at all. overall, the model performs very well. our next step will be to integrate this model into the clinical workflow. manually abstracting data and entering it into an online calculator is cumbersome in a busy clinical practice. interpreting the prediction without some frame of reference is complex. however, failing to see beyond these hurdles risks wasting opportunities to innovate and improve patient care. it is therefore imperative to develop a clear implementation strategy that aligns with the existing clinical needs and clinical operations of a health organization.
one could start by identifying the clinical problems that would benefit from this prediction tool, and reference the information in table 2 on sensitivity, specificity, positive predictive value, and negative predictive value at different prediction cutoffs to provide a framework for clinical application. an illustrative example now being explored within our own health system is the use of this calculator to tailor the intensity of home monitoring for covid positive patients. currently, every patient who tests positive for covid is being called daily for 14 days to check on their symptoms and identify disease progression early enough for intervention. with only 20-30% of covid positive patients progressing to the point of requiring hospitalization, the nurses can use our prediction tool to identify the high-risk group and call only them daily, while reducing the intensity of follow-up with the rest. this is not a multicenter study. it is important to note, though, that it includes all hospitals and outpatient facilities of the cleveland clinic health system within the us (>220 outpatient locations and 18 hospitals in ohio and florida), creating robust sampling of the covid-19 population. as with any other statistical model, other hospital systems may elect to validate this model internally for their specific patient populations as they contemplate options for integrating it in their workflow. given the alternative of no or constantly changing practice guidelines, implementation of this nomogram into our clinical workflow will allow prospective evaluation of its impact on patient care and outcomes. our model includes age as a predictor: this may limit our ability to identify risk factors for disease progression specific to the younger population, and may underestimate the risks in the younger population with less severe disease and less likely to seek medical care.
lastly, although our model performs very well in the majority of covid positive patients, more research is needed to optimize it for the subgroup (5.4% of the total cohort in our series) with either delayed or unnecessary admission. drivers of disease progression and worsening in covid-19 are multiple and complex. we developed a statistical model with excellent predictive performance (c-statistic of 0.926) to individualize the hospitalization risk assessment at the patient level. this could help guide clinical decision-making and resource allocation.
supporting information: s1 checklist (pdf).
author contributions: conceptualization: lara jehi, steve gordon, michael w. kattan. data curation: alex milinovich.
references:
american hospital capacity and projected need for covid-19 patient care
clinical progression of patients with covid-19 in shanghai
clinical course and risk factors for mortality of adult inpatients with covid-19 in wuhan, china: a retrospective cohort study
analysis of factors associated with disease outcomes in hospitalized patients with 2019 novel coronavirus disease
characteristics of and important lessons from the coronavirus disease 2019 (covid-19) outbreak in china: summary of a report of 72314 cases from the chinese center for disease control and prevention
cancer patients in sars-cov-2 infection: a nationwide analysis in china
network-based drug repurposing for novel coronavirus 2019-ncov/sars-cov-2
extracting and utilizing electronic health data from epic for research
research electronic data capture (redcap): a metadata-driven methodology and workflow process for providing translational research informatics support
the redcap consortium: building an international community of software partners
public health response to the initiation and spread of pandemic covid-19 in the united states
rapid sentinel surveillance for covid-19
evaluating the yield of medical tests
the index of prediction accuracy: an intuitive measure useful for evaluating risk prediction models
transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (tripod): the tripod statement
r: a language and environment for statistical computing. r foundation for statistical computing
tidyverse: easily install and load the 'tidyverse'
mice: multivariate imputation by chained equations in r
caret: classification and regression training
riskregression: risk regression models and prediction scores for survival analysis with competing risks
receptor recognition by novel coronavirus from wuhan: an analysis based on decade-long structural studies of sars
the vasoprotective axes of the renin-angiotensin system: physiological relevance and therapeutic implications in cardiovascular, hypertensive and kidney diseases
hospitalization rates and characteristics of patients hospitalized with laboratory-confirmed coronavirus disease 2019-covid-net, 14 states
genetic adaptation and neandertal admixture shaped the immune system of human populations
complete blood count might help to identify subjects with high probability of testing positive to sars-cov-2
national academies press (us)

key: cord-331646-j5mkparg
authors: sze to, g. n.; chao, c. y. h.
title: review and comparison between the wells–riley and dose‐response approaches to risk assessment of infectious respiratory diseases
date: 2009-07-31
journal: indoor air
doi: 10.1111/j.1600-0668.2009.00621.x
sha: doc_id: 331646 cord_uid: j5mkparg
abstract: infection risk assessment is very useful in understanding the transmission dynamics of infectious diseases and in predicting the risk of these diseases to the public. quantitative infection risk assessment can provide quantitative analysis of disease transmission and the effectiveness of infection control measures. the wells–riley model has been extensively used for quantitative infection risk assessment of respiratory infectious diseases in indoor premises.
some newer studies have also proposed the use of dose-response models for this purpose. this study reviews and compares these two approaches to infection risk assessment of respiratory infectious diseases. the wells-riley model allows quick assessment and does not require interspecies extrapolation of infectivity. dose-response models can consider disease transmission routes other than the airborne route and can express the infectious source strength of an outbreak in terms of the quantity of the pathogen rather than a hypothetical unit. the spatial distribution of airborne pathogens is one of the most important factors in infection risk assessment of respiratory diseases. respiratory deposition of aerosols induces heterogeneous infectivity of intake pathogens and randomness in the intake dose, which are not well accounted for in current risk models. some suggestions for further development of the risk assessment models are proposed. practical implications: this review article summarizes the strengths and limitations of the wells-riley and the dose-response models for risk assessment of respiratory diseases. even with many efforts by various investigators to develop and modify the risk assessment models, some limitations still persist. this review serves as a reference for further development of infection risk assessment models of respiratory diseases. the wells-riley model and the dose-response model each offer specific advantages. risk assessors can select the approach that is suitable to their particular conditions to perform risk assessment. quantitative infection risk assessment can serve as a useful tool in epidemic modeling, parametric studies on disease transmission and evaluating the effectiveness of infection control measures. it describes the infection risk of an individual or a population to an infectious disease quantitatively. infection risk is expressed as a probability of infection between 0 and 1.
by comparing infection risks, the influence of different environmental factors on disease transmission and the effectiveness of different infection control measures can be evaluated. quantitative infection risk assessment can also be used in epidemiological studies such as outbreak investigations. currently, there are two approaches to quantitative infection risk assessment of respiratory diseases that can be transmitted via the airborne route: the wells-riley model and the dose-response model. the wells-riley equation was developed by riley and colleagues in an epidemiological study on a measles outbreak (riley et al., 1978). the equation is based on the concept of the 'quantum of infection' as proposed by wells (1955) and is therefore termed the wells-riley equation. the wells-riley model has been extensively used in analyzing ventilation strategy and its association with airborne infections in clinical environments (e.g., escombe et al., 2007; fennelly and nardell, 1998; nardell et al., 1991). the dose-response relationship is used to describe the effect on organisms of exposure to different doses of chemicals, drugs, radiation, bio-agents, or other stressors. risk assessment models based on the dose-response relationship are called dose-response models. originally, dose-response models were mainly used for risk assessment of hazardous chemicals. they were then developed for assessing the infection risk of foodborne and waterborne pathogens (haas, 1983). some newer studies have proposed using dose-response models for assessing the infection risk of airborne-transmissible pathogens (e.g., armstrong and haas, 2007a; nicas, 1996; sze to et al., 2008). this article reviews the fundamental theories, formulations, model developments, and modifications of these two approaches. the strengths and limitations of the two approaches to infection risk assessment as well as to outbreak investigations are compared. suggestions on further development of risk assessment models regarding their limitations are proposed. infection risk assessment models should be based on theories and mathematical equations that are biologically plausible or conformable to clinical or laboratory evidence. airborne respiratory pathogens can be generated from expiration actions and other activities that introduce pathogen-laden aerosols into the air. pathogens released from the infectious source must reach the target infection site of the receptor to commence the infection.
even after the pathogen has successfully reached the target infection site, it must survive the immune defenses of the receptor organism to induce infection. a number of influencing factors affect this process and its outcome; they are listed in table 1. these factors add complexities to the exposure and risk assessment of pathogenic microorganisms. many of them are not well understood, especially the pathogen-host interactions. as a result, statistics and probabilities are often employed to formulate quantitative infection risk assessment models. infection risk assessment in general consists of two components: the estimation of the intake dose of the infectious agent and the estimation of the probability of infection under a given intake dose. the intake dose is the amount of the infectious agent reaching the target infection site. for airborne pathogens, estimation of the intake dose requires knowledge of the exposure level to the infectious agent, the pulmonary ventilation rate, the exposure time interval, and the respiratory deposition of the infectious particles. knowing the intake dose, the probability of infection can then be modeled by a mathematical function. infection risk assessment models can be divided into two categories: deterministic models and stochastic models. in deterministic models, each individual is hypothesized to have an inherent tolerance dose toward the infectious agent (haas et al., 1999). when a receptor organism intakes a dose of pathogens equivalent to or exceeding his/her tolerance dose, infection will occur; below this tolerance dose, the receptor organism will remain uninfected. following this hypothesis, the model can determine whether an individual will be infected or not under a certain intake dose. by contrast, stochastic models do not determine whether an individual will acquire infection or not under a certain intake dose. instead, the models estimate the probability of acquiring the infection under the intake dose.
more details on these two concepts will be discussed in further sections. some infection risk assessment models are classified as threshold models. when a population intakes a dose lower than the threshold dose, none of the individuals would acquire the infection, i.e., the infection risk would be zero. it should be noted that the threshold dose is different from the tolerance dose in the deterministic models (haas et al., 1999). threshold dose is the minimum amount of pathogens required to initiate infection. when the intake dose exceeds the threshold dose, there will be a non-zero probability of infection. tolerance dose is a deterministic indicator: when an individual receives an intake dose exceeding his/her tolerance dose, that individual will be infected. examples and the assumptions of threshold models will be discussed in further sections.

wells-riley model: the quantum of infection and the poisson probability distribution

wells (1955) proposed a hypothetical infectious dose unit: the quantum of infection. a quantum is defined as the number of infectious airborne particles required to infect a person and may consist of one or more airborne particles. these particles are assumed to be randomly distributed throughout the air of confined spaces. riley et al. (1978) considered the intake dose of airborne pathogens in terms of the number of quanta to evaluate the probability of escaping the infection, as a modification of the reed-frost equation (abbey, 1952).
together with the poisson probability distribution describing the randomly distributed discrete infectious particles in the air, the wells-riley equation was derived as follows:

p_i = c/s = 1 - exp(-i·p·q·t/Q) (equation 1)

where p_i is the probability of infection, c is the number of infection cases, s is the number of susceptibles, i is the number of infectors, p is the pulmonary ventilation rate of a person, q is the quanta generation rate, t is the exposure time interval, and Q is the room ventilation rate with clean air. the quanta generation rate, q, cannot be directly obtained, but is estimated epidemiologically from an outbreak case in which the attack rate of the disease during the outbreak is substituted into p_i. if the exposure time and ventilation rate are known, the quanta generation rate of the disease can be calculated from equation 1. the exponential term of any exponential equation should always be dimensionless. following the definition by wells (1955), a 'quantum' has a unit describing the number of infectious particles (or the number of airborne pathogens). hence, the exponential term in equation 1 is not dimensionless but has the unit of the number of infectious particles. two different interpretations can be made of equation 1: • there is a unity infectivity term, with the unit of per infectious particle, in the exponential term. • the infectivity term is implicitly included in the backward-calculated quanta generation rate in the equation, i.e., q = infectivity term × number of quanta/unit time. the infectivity term may not be one. adding an infectivity term to the exponential term would make it dimensionless. the infectivity term describes the probability of each infectious particle initiating the infection. it should be noted that in the first interpretation, a unity infectivity term implies that the host is completely vulnerable to the pathogen.
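the wells-riley relation and the backward calculation of q described above can be sketched in a few lines of python; the function names and the illustrative parameter values are my own, not from the paper:

```python
import math

def wells_riley_risk(I, p, q, t, Q):
    """probability of infection by the wells-riley equation (equation 1).

    I: number of infectors; p: pulmonary ventilation rate (m^3/h);
    q: quanta generation rate (quanta/h); t: exposure time (h);
    Q: room ventilation rate with clean air (m^3/h).
    """
    return 1.0 - math.exp(-I * p * q * t / Q)

def quanta_rate_from_attack_rate(attack_rate, I, p, t, Q):
    """back-calculate q from an observed attack rate c/s by inverting equation 1."""
    return -Q * math.log(1.0 - attack_rate) / (I * p * t)
```

substituting the back-calculated q into equation 1 reproduces the observed attack rate exactly, which is how q is estimated epidemiologically from an outbreak case.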
this will make the wells-riley equation only suitable for diseases such as tuberculosis, in which the definition of tuberculosis infection fulfills this condition (huebner et al., 1993). a unity infectivity term also indicates that one quantum is equal to one infectious particle/pathogen and makes the model deterministic, because the individual is determined to be infected if his/her intake dose is equal to or greater than one pathogen. the first interpretation also assumes that all inhaled infectious particles will successfully deposit on the target infection site in the respiratory tract, which is not correct in general. adopting the second interpretation, the equation is applicable to many diseases and it is a stochastic model. respiratory deposition of infectious particles is implicitly considered in the calculated quanta generation rate.

dispersion and distribution of airborne pathogens

how airborne pathogens disperse and distribute in the room air governs the exposure levels of the susceptible persons. the spatial distribution of airborne pathogens depends on the proximity to the infectious source, ventilation, and the geometry of the premises. the susceptible people would generally have different exposure levels and hence different degrees of infection risk. assuming a uniform airborne pathogen distribution may cause significant error in the assessment (noakes and sleigh, 2008).

ventilation strategy

airborne pathogens can be dispersed to different locations by airflow. the ventilated airflow pattern has a strong correlation with the spreading of airborne transmissible diseases (li et al., 2007). the spatial distribution of infectious particles is very dependent on the airflow pattern. infectious particles can be removed from the air by ventilation dilution, which depends on the ventilation rate.
survival of pathogen

pathogens may lose their viability to cause infection through biological decay during the airborne stage, which is a sink mechanism for respiratory pathogens. airborne survival of pathogens often depends on the temperature and humidity (e.g., schaffer et al., 1976).

aerosol size

expiratory aerosols and many other bioaerosols are polydispersed. the transport of aerosols depends on their aerodynamic size. therefore, the dispersion of pathogen-laden aerosols is dependent on aerodynamic size, and the exposure levels to these aerosols usually have spatial variations. the deposition loss of infectious particles also depends on their aerosol size. when airborne pathogens are inhaled by the receptor organism, not all but only a fraction of the inhaled pathogen-laden aerosols may deposit on the target infection site in the respiratory tract. in addition, because of aerosol dynamics, the respiratory deposition of these aerosols is dependent on aerodynamic size. because of the difference in respiratory deposition of aerosols with different sizes, the aerosols have different deposition fractions in different regions of the respiratory tract. for example, aerosols with sizes >6 µm are trapped increasingly in the upper respiratory tract, aerosols with sizes >20 µm generally do not deposit in the lower respiratory tract, and those with sizes >10 µm generally do not reach the alveolar region (hinds, 1999; tellier, 2006). different regions of the respiratory tract may have different immune mechanisms. in other words, pathogens generally have different infectivity in different regions of the respiratory tract. for example, the id50 of influenza virus is about two orders of magnitude higher when the virus is introduced to the nasal cavity by intranasal drop than when it is introduced to the lower respiratory tract via aerosol inhalation (alford et al., 1966; douglas, 1975).
as the respiratory deposition of aerosols depends on their sizes, variation of pathogen infectivity when carried by infectious particles of different sizes has also been observed, as shown by many experimental infection studies (e.g., day and berendt, 1972; wells, 1955). as induced by air turbulence, airborne pathogens tend to be randomly distributed in air. any estimated exposure level or intake dose would be an expected value rather than an exact value. air turbulence also exists in respiratory tracts. the respiratory deposition fraction of aerosols is likewise an expected value rather than an exact value (hinds, 1999). in other words, when the respiratory deposition fraction of aerosols with a particular size is b, each aerosol with this size would have a probability of successful deposition equal to b. when a host organism is exposed to the pathogen, whether the organism will be infected or not depends on the infectivity of the pathogen and the immune status of the host organism (haas et al., 1999). control measures such as respiratory protection, ultraviolet irradiation and particle filtration can reduce the exposure level of the susceptibles to airborne pathogens (nazaroff et al., 1998). the quanta generation rate will be a combination of the infectivity of the pathogen and the infectious source strength in the outbreak. when the wells-riley equation is used in risk assessment of pathogens with a threshold dose greater than one pathogen, it will provide more conservative assessment results at low intake doses. a more accurate approach is to use a multiple-hit exponential form (haas, 1983; nicas et al., 2005). the wells-riley equation assumes well-mixed room air and a steady-state infectious particle concentration which varies with the ventilation rate. although riley et al.
(1978) assumed that the biological decay of the airborne pathogen could be neglected, the biological decay of the pathogen during aerosolization and in the airborne state is implicitly considered in the calculated quanta generation rate. many complexities in airborne disease transmission are also implicitly considered in the quanta generation rate. the wells-riley equation provides a simple and quick assessment of the infection risk of airborne transmissible diseases. the basic reproduction number of the infection is calculated as c/i, which can be used to estimate the disease spreading risk in a large community. many epidemic modeling studies have used the wells-riley equation as part of their mathematical models (e.g., liao et al., 2005, 2008; noakes et al., 2006). dose-response type infection risk assessment models require infectious dose data to construct the dose-response relationship. the term 'dose' refers to the quantity of the pathogen (who, 2003). infectious dose data are obtained from experimental infections of test animals (or human subjects) by the pathogen. for example, when a group of test animals is exposed to a certain dose of pathogens and half of the test animals acquire the infection, this particular dose of pathogen is the 50% infectious dose. interspecies extrapolation may be required when human infectious dose data are unavailable. there are both deterministic and stochastic types of dose-response models, which interpret dose-response relationships in different ways. deterministic models are empirical models. following the tolerance dose concept, infectious dose data are interpreted as the dose of pathogens that exceeds the tolerance dose of a portion of the population. for example, the 50% infectious dose (id50) exceeds the tolerance dose of half of the susceptible population. when each person in a susceptible population intakes a dose of the pathogen equal to id50, half of the people will be infected.
when the frequency distribution of this tolerance dose is known, the infection risk of a certain intake dose can be assessed. figure 1 illustrates this idea. the cumulative curve describes the dose-response relationship. when each member of a susceptible population receives the same dose of pathogens, the infection risk is equal to the (cumulative) relative frequency of infection at this dose. the tolerance dose concept is biologically plausible in the sense that the immune status and the host's sensitivity to the pathogen vary between individuals, as do their tolerance doses. in addition, some infection symptoms may only be observed after the host acquires a certain amount of pathogen in the body. however, it is not biologically plausible in the sense that the pathogens would inherently be assumed to be acting cooperatively, whereby infection is the consequence of their joint action (armitage et al., 1965; haas, 1983). some examples of deterministic dose-response models are shown in table 2. in contrast to the deterministic models, stochastic models are semi-empirical models. they assume that at any intake dose, the host will have a probability of getting infected. generally, the greater the intake dose, the greater the probability of infection will be. in the stochastic single-hit models, the host must intake a dose containing at least one pathogen. at least one of the pathogens has to reach the infection site and survive until symptoms are provoked in the host. the models are formulated by solving these conditional probabilities. some examples are shown in table 2. stochastic dose-response models are more biologically plausible than the deterministic ones, as they are not based on the tolerance dose concept. in addition, some stochastic properties regarding the exposure and intake of the pathogens cannot be considered by the deterministic models. for example, the pathogens, as discrete matter, are randomly distributed in the suspension medium.
the distribution of these pathogens in the air is also random, as induced by air turbulence. therefore, the estimated exposure level and intake dose of airborne pathogens are always expected values rather than exact values. deterministic models often regard the intake dose as an exact value and ignore this randomness, which may cause error in the assessment. in practice, the model providing the best fit to the infectious dose data on the disease should be selected for infection risk assessment. the selection of the model will also depend on the availability of the infectious dose data. if there is only one available infectious dose value, only the exponential model can be used, as the other models require at least two infectious dose values to calculate the fitting parameters. the exponential dose-response model and the beta-poisson model belong to the category of non-threshold models, as they assume that an infection could be initiated by a single pathogen reaching the infection site and

deterministic model: lognormal. some experimental infection results suggested that the distribution of tolerance doses can be described lognormally (e.g., nicas and hubbard, 2002). therefore, the lognormal model is one of the deterministic dose-response models:

p_i = ∫_0^n (1/(x·σ·√(2π))) exp(−(ln x − µ)²/(2σ²)) dx (equation 2)

where n is the intake dose, and µ and σ are the mean and standard deviation of the natural logarithm of the tolerance dose, respectively. equation 2 can be rewritten using the cumulative distribution function:

p_i = (1/2)[1 + erf((ln n − µ)/(σ√2))] (equation 3)

where erf is the error function. µ and σ are determined by fitting the infectious dose data of the disease. the infectivity of the pathogen and the pathogen-host interactions are implicitly considered by the probability distribution of the tolerance doses, hence µ and σ. these two deterministic models use different probability distributions in describing the distribution of the tolerance dose (haas et al., 1999). the host organism must intake a dose containing at least one pathogen.
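as a sketch, the cumulative-lognormal dose-response of equation 3 is direct to evaluate with the standard error function; the function name and the example parameters below are illustrative only:

```python
import math

def lognormal_response(n, mu, sigma):
    """deterministic lognormal dose-response (equation 3).

    n: intake dose; mu, sigma: mean and sd of ln(tolerance dose).
    returns the cumulative relative frequency of infection at dose n.
    """
    if n <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(n) - mu) / (sigma * math.sqrt(2.0))))
```

at n = exp(mu) the curve passes through 0.5, so exp(mu) plays the role of the id50 in this parameterization.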
at least one of the pathogens has to reach the infection site and survive until symptoms are provoked in the host. these conditions can be expressed by the following equation:

p_i = Σ_j p_1(j) · Σ_{k≥1} p_2(k | j) (equation 4)

where p_1(j) is the probability of inhaling a number of j pathogens, and p_2(k | j) is the probability of a number of k pathogens from those j inhaled pathogens surviving inside the host to initiate the infection. the pathogens, as discrete matter, are distributed in a medium in a random manner described by the poisson probability distribution. when the medium is aerosolized, the pathogen distribution in the aerosols and hence their distribution in the air also follows the poisson probability distribution. substituting the poisson probability function into p_1(j) in equation 4 and using a constant, r, to express the probability of a pathogen surviving inside the host to initiate the infection, the probability of infection with an intake dose, n, is derived (haas, 1983):

p_i = 1 − Σ_j (e^(−n)·n^j/j!)·(1 − r)^j (equation 5)

simplifying the summation series, it becomes the exponential dose-response model:

p_i = 1 − exp(−rn) (equation 6)

the infectivity of the pathogen and the pathogen-host interactions are implicitly considered by r. the variation of host sensitivity is not considered in the exponential dose-response model. to complement that, a distribution of the value of r rather than a fixed value can be considered. it is believed that the beta-distribution is the most plausible description for the r values (moran, 1954). this results in the beta-poisson model:

p_i = 1 − [Γ(a + b)/(Γ(a)·Γ(b))] ∫_0^1 exp(−rn)·r^(a−1)·(1 − r)^(b−1) dr (equation 7)

where Γ is the gamma function. the equation can be approximated as follows (furumoto and mickey, 1967):

p_i ≈ 1 − (1 + n/b)^(−a) (equation 8)

the infectivity of the pathogen and the pathogen-host interactions are implicitly considered by r, a, and b in the equations. similar to µ and σ in equations 2 and 3, r in equation 6 as well as a and b in equations 7 and 8 are determined by fitting the infectious dose data of the disease (see footnotes a and b). the approximate form does not work well when b is small and/or n is large.
in the example of norwalk virus, the estimates are a = 0.040 and b = 0.055 (teunis et al., 2008). if n = 25 viruses, the exact equation predicts a 50% chance of infection, whereas the approximation predicts only a 22% chance of infection. footnote a: when calculating the fitting parameters, whether or not the respiratory deposition of pathogen-laden aerosols should be considered depends on the infectious dose data. if the infectious dose data refer to the inhaled dose, respiratory deposition of pathogen-laden aerosols can be implicitly considered by the fitting parameters. the intake dose would be: n = p·c_t, where c_t is the total exposure concentration of viable pathogens. if the infectious dose data refer to the deposited dose of pathogen-laden aerosols on the respiratory tract, the deposition fraction of pathogen-laden aerosols should be considered explicitly. the intake dose would be: n = b·p·c_t, where b is the deposition fraction of pathogen-laden aerosols onto the respiratory tract. footnote b: taking r in the exponential form as an example, with an id50 value, r can be calculated by substituting 0.5 for p_i and the id50 value for n in equation 6, which gives r = −ln 0.5/id50. surviving in the host. in threshold models, infection risk is generally zero if the intake dose is lower than the threshold dose. figure 2 illustrates the difference between threshold and non-threshold models. the threshold dose will be reflected in the distribution of the tolerance dose when a deterministic model is used. for a stochastic model, to incorporate the effect of the threshold dose, a multiple-hit model needs to be used (nicas et al., 2005). a simple multiple-hit model can be obtained by modifying equation 5 (haas et al., 1999):

p_i = 1 − Σ_{k=0}^{k_min−1} e^(−rn)·(rn)^k/k! (equation 9)

where k_min is the threshold dose. more complicated threshold models can be found in haas et al. (1999).
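the exponential model, the exact beta-poisson model (via kummer's confluent hypergeometric function, since equation 7 evaluates to 1 − 1F1(a, a + b, −n)) and the furumoto-mickey approximation can be compared numerically on the norwalk example; this is a sketch with a hand-rolled series for 1F1, and the helper names are my own:

```python
import math

def _hyp1f1(a, b, z, terms=200):
    """kummer's confluent hypergeometric function 1F1 by power series.

    for negative z, apply the kummer transformation
    1F1(a, b, z) = exp(z) * 1F1(b - a, b, -z) so all series terms are positive.
    """
    if z < 0:
        return math.exp(z) * _hyp1f1(b - a, b, -z, terms)
    total, term = 1.0, 1.0
    for k in range(terms):
        term *= (a + k) / (b + k) * z / (k + 1)
        total += term
    return total

def exponential_risk(n, r):
    """exponential dose-response model (equation 6)."""
    return 1.0 - math.exp(-r * n)

def beta_poisson_exact(n, a, b):
    """exact beta-poisson model: p_i = 1 - 1F1(a, a + b, -n)."""
    return 1.0 - _hyp1f1(a, a + b, -n)

def beta_poisson_approx(n, a, b):
    """furumoto-mickey approximation (equation 8)."""
    return 1.0 - (1.0 + n / b) ** (-a)
```

with a = 0.040, b = 0.055 and n = 25, the exact form gives roughly 0.50 while the approximation gives roughly 0.22, reproducing the norwalk virus discrepancy noted in the text.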
although the threshold dose concept is not the same as the tolerance dose concept, threshold models also inherently assume that the pathogens act cooperatively (rubin, 1987). this assumption is not biologically plausible, as pathogen attacks on an organ or cell are spontaneous and independent actions, and they do not have 'joint actions' or 'cooperative attacks'. in addition, after pathogens successfully attack the organ or cell, they may quickly replicate inside the host body and eventually reach a critical amount sufficient to provoke infection symptoms in the host. there is sufficient evidence to support the argument that only a single pathogen is required to commence infection for some diseases, including tuberculosis and smallpox (nicas et al., 2004; wells, 1955). however, some arguments suggest that threshold models do provide more accurate assessment results for some diseases, especially under low intake doses. some experimental infection studies have observed a threshold dose among the test animals (e.g., cafruny and hovinen, 1988; dean et al., 2005). in such cases, the threshold models would provide a better fit to these infectious dose data. the observation of a threshold dose may involve some complex biology. it could also be attributed to the limited number of test animals used when conducting an experimental infection study. to obtain the dose-response relationship of a pathogen, different doses of the pathogen are given to different groups of test animals in an experimental infection study. for instance, if each group consists of 10 test animals and the given dose has a true probability of infection of the test animals less than 0.05, it is most likely that no test animal in that group would be infected. if no test animal is infected under this given dose or other doses lower than this given dose, this given dose will be an observed threshold dose of the pathogen.
with this limitation, a threshold dose may be observed even if the pathogen does not have such a threshold. after all, the model providing the best fit to the infectious dose data of the pathogen should be used in the infection risk assessment. to complement some of the limitations and increase the feasibility of the infection risk assessment models, subsequent developments and modifications have been made by various researchers.

incorporating additional influencing factors

the original wells-riley model considered the ventilation rate as the only factor influencing the infection risk. there are many other factors and control measures that can affect the infection risk. use of a respirator will reduce the number of inhaled infectious particles. it is feasible to incorporate a parameter in the wells-riley equation indicating this reduction. the ipqt/Q term in equation 1 is the intake dose with the unit of quanta. the effect of respiratory protection can be considered by multiplying this term by a fraction (fennelly and nardell, 1998; nazaroff et al., 1998; nicas, 1996):

p_i = 1 − exp(−r·ipqt/Q) (equation 10)

where r is the fraction of particle penetration of the respirator. it equals 1 when no respirator is used. particle filtration and air disinfection, such as ultraviolet irradiation, are additional airborne pathogen sink mechanisms other than ventilation removal. the effect of these control measures can also be incorporated into the wells-riley model (fisk et al., 2005; nazaroff et al., 1998):

p_i = 1 − exp(−ipqt/(Q + k_uv·V + Q_r·η_r)) (equation 11)

where k_uv is the rate coefficient of inactivation by ultraviolet irradiation, V is the room volume, Q_r is the flow rate to the filter, and η_r is the filtration efficiency.
some studies have also suggested that the deposition loss of infectious particles and the viability loss of pathogens while airborne can be considered by adding these sink terms to the denominator, similar to equation 11 (fisk et al., 2005; franchimon et al., 2008):

p_i = 1 − exp(−ipqt/(Q + k_uv·V + Q_r·η_r + k_v·V + k_dep·V)) (equation 12)

where k_v is the rate coefficient of viability loss of the pathogen in the airborne state and k_dep is the rate coefficient of deposition loss of the infectious particles. however, readers should be aware that when the quanta generation rate, q, is backward-calculated from an outbreak case by equation 1, removal by ventilation is implicitly assumed to be the sole sink mechanism for the airborne pathogen during that outbreak case. therefore, the calculated q has already implicitly considered the deposition loss of infectious particles and the viability loss of pathogens in the airborne state of that outbreak case. if this quanta generation rate is used in equation 12, the effects of these influencing factors would be considered twice, leading to underestimation of the infection risk. to consider these two influencing factors using the wells-riley model, knowledge of the deposition and viability losses during the outbreak case is required. by substituting these two parameters into equation 12, a quanta generation rate of the disease can be obtained that does not implicitly fold in ventilation removal, deposition and viability losses as airborne pathogen sink mechanisms during the outbreak. risk assessors can then perform risk assessment with consideration of these influencing factors using equation 12 with this quanta generation rate. likewise, if the place of an outbreak case is equipped with ultraviolet irradiation or air filtration devices, or the occupants have used respiratory protection, using equation 1 to calculate the quanta generation rate during the outbreak will give a q implicitly considering these influencing factors.
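equations 10-12 amount to scaling the intake term by the respirator penetration and enlarging the quanta-removal denominator with each additional sink. a minimal sketch (parameter names are my own, and q must not already implicitly fold these sinks in, per the caveat above):

```python
import math

def risk_with_controls(I, p, q, t, Q, theta=1.0, k_uv=0.0, V=1.0,
                       Q_r=0.0, eta_r=0.0, k_v=0.0, k_dep=0.0):
    """wells-riley risk with respirator penetration theta (equation 10) and
    extra quanta sinks in the denominator (equations 11-12): uv inactivation
    k_uv*V, filtration Q_r*eta_r, viability loss k_v*V, deposition k_dep*V.
    with all extras at their defaults this reduces to equation 1.
    """
    sinks = Q + k_uv * V + Q_r * eta_r + (k_v + k_dep) * V
    return 1.0 - math.exp(-theta * I * p * q * t / sinks)
```

each sink enlarges the denominator, so any added control measure strictly lowers the predicted risk for the same q.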
it is inappropriate to use such quanta generation rates for parametric studies of these factors. risk assessors should either calculate q using equation 10 and/or equation 11, or use a q calculated from another outbreak case that does not include the influence of these factors. allowance for non-steady-state and imperfect mixing. the assumption of a steady-state and well-mixed airborne pathogen concentration is one of the major limitations of the original wells-riley equation, and subsequent modifications were made to overcome it. gammaitoni and nucci (1997a) described the changes in the quanta level in room air using a differential equation. by considering the time-weighted average pathogen concentration in the room air, instead of assuming the concentration has reached a steady state, they developed a risk assessment equation that incorporates non-steady-state conditions: where l is the air change rate or disinfection rate. equation 13 still relies on the well-mixed assumption, and adopting it implies that the susceptible person or population is present in the premises starting from t = 0, or that the initial quanta concentration in the room air is 0. when this is not the case, the initial quanta concentration in the room air has to be considered (gammaitoni and nucci, 1997b): where n_o is the initial quanta level in the room air. equation 13 can also be used to calculate the quanta generation rate from an outbreak; in general it provides a more accurate estimate than equation 1, especially when the exposure time interval is short. rudnick and milton (2003) have also proposed a modified wells-riley equation that uses the exhaled air volume fraction to estimate the number of quanta to which the susceptible people are exposed: where f is the average volume fraction of room air that is exhaled breath and g is the total number of people in the premises. 
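the gammaitoni-nucci idea can be sketched as follows: the quanta level obeys dn/dt = i*q - l*n, whose solution n(t) = n_ss + (n_0 - n_ss)e^(-l*t) is averaged over the exposure interval instead of using the steady-state value. this is a minimal sketch of the approach, not the authors' exact equations 13-14, and the parameter values used below are invented:

```python
import math

def gn_risk(I, q, p, t, V, ach, N0=0.0):
    """non-steady-state risk in the spirit of gammaitoni and nucci: the
    quanta level in room air follows dN/dt = I*q - lam*N, and the
    susceptible inhales the time-averaged concentration rather than the
    steady-state one."""
    lam = ach                      # air change (or disinfection) rate, 1/h
    Nss = I * q / lam              # steady-state quanta level in the room
    # integral of N(t') over 0..t, with initial quanta level N0
    integral = Nss * t + (N0 - Nss) * (1.0 - math.exp(-lam * t)) / lam
    inhaled = (p / V) * integral   # expected number of quanta inhaled
    return 1.0 - math.exp(-inhaled)
```

for n_0 = 0 and short exposures this gives a lower risk than the steady-state equation 1, since the quanta level is still building up; for long exposures the two estimates converge.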
in this equation, the exponential term is equal to the number of quanta inhaled by each susceptible person. the model estimates the pathogen concentration in room air indirectly; investigators may need to monitor the carbon dioxide concentration in the room in order to estimate f. to obtain the spatial variation of infection risk, some investigators used multiple box models or divided the premises into multiple zones (e.g., ko et al., 2001, 2004), in which the susceptibles may have different degrees of exposure in terms of quanta and thus different levels of infection risk. rudnick and milton (2003) developed equation 15 to incorporate non-steady-state conditions, but the model still adopts the well-mixed assumption. with their proposed concept, however, the equation can also incorporate spatially distributed infection risk. when the amount of exhaled breath generated by the infectors and inhaled by a susceptible person at a particular spatial location is known, the susceptible person's exposure in terms of number of quanta, and hence the infection risk, can be estimated. this can be done by conducting tracer gas measurements: tracer gas is released from the locations of the infectors and its concentration at the location of each susceptible person is then measured. the amount of exhaled breath inhaled by a susceptible person can be calculated from the measured concentrations and the released concentration. this parameter can also be obtained numerically by computational fluid dynamics (cfd): a numerical model of the premises is constructed, a gas surrogate is injected into the model at the locations of the infectors, and its dispersion is simulated numerically. the spatial distribution of gas concentration is then obtained, and the amount of exhaled breath inhaled by a susceptible person at different locations can be calculated. 
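the rebreathed-fraction idea lends itself to a short sketch: f is estimated from an indoor co2 measurement and then drives an equation-15-style risk. the 38,000 ppm co2 content of exhaled breath is a typical figure, and the scenario numbers below are hypothetical, chosen only for illustration:

```python
import math

def rebreathed_fraction(co2_indoor_ppm, co2_outdoor_ppm, co2_exhaled_ppm=38000.0):
    """estimate f, the average volume fraction of room air that is exhaled
    breath, from a co2 measurement (the indirect approach noted in the text).
    38,000 ppm is a typical co2 content of exhaled breath."""
    return (co2_indoor_ppm - co2_outdoor_ppm) / co2_exhaled_ppm

def rudnick_milton_risk(I, g, q, t, f):
    """equation-15-style risk: each susceptible inhales f*I*q*t/g quanta,
    where g is the total number of people in the premises."""
    return 1.0 - math.exp(-f * I * q * t / g)
```

with, say, 1,400 ppm indoors against 400 ppm outdoors, roughly 2.6% of each breath is rebreathed air, and the risk follows directly without ever measuring the ventilation rate, which is the practical appeal of this formulation.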
some risk assessment studies have used these approaches to incorporate spatial variation into infection risk (e.g., gao et al., 2008; tung and hu, 2008). these approaches can provide more realistic results, but they are more time-consuming than using the multiple box or multi-zone model. from foodborne to airborne. dose-response assessment has a long history of use in analyzing the risk from chemical toxins, and the concept has also been found feasible for assessing the risk of pathogenic microorganisms. dose-response models have been widely adopted in quantitative risk assessment of infectious diseases transmitted via foodborne and waterborne routes and are recommended by the world health organization (2003). in waterborne and foodborne infections, as the contaminated water or food is consumed by ingestion, the pathogenic microorganisms can directly reach the gastrointestinal region and hence the target site of infection. however, when assessing the risk of airborne infection, not all inhaled airborne pathogens will reach, or be retained in, the target infection site, so the respiratory deposition of aerosols has to be considered. exposure to airborne pathogens has to be assessed when estimating the intake dose. exposure assessment is recognized by the national academy of sciences as one of the four components of the risk assessment paradigm for human health effects (national academy of sciences, 1983). once the exposure level to the airborne pathogen is known, the intake dose can be estimated from the pulmonary ventilation rate and the deposition fraction of the aerosols, and the probability of infection can then be predicted using the dose-response equation. it is therefore critical to obtain a realistic estimate of the susceptibles' exposure level to airborne pathogens when adopting dose-response risk assessment models. 
to the best of our knowledge, the first study using a dose-response model to assess airborne infection risk was performed by nicas (1996) on tuberculosis: where g is the number of airborne tuberculosis bacilli released per infector per unit time and b is the deposition fraction of infectious particles in the alveolar region. readers may notice that the equation is similar to the wells-riley equation with the quanta generation rate q replaced by gb, and also to the exponential dose-response equation with r equal to 1. the equation implicitly assumes that infection will occur if a single bacillus is successfully deposited in the alveolar region, and that infectious particles have a poisson distribution in the air. as a result, the probability of infection equals 0.63 when the exponential term equals 1. the first assumption delineates the host as completely vulnerable to the pathogen: equation 16 describes the probability of the susceptible person being exposed to the tuberculosis bacilli, i.e., getting a positive skin test. as having a positive skin test equates to tuberculosis infection (huebner et al., 1993), equation 16 is adequate for describing the probability of tuberculosis infection. for other diseases, the assumption that the host is completely vulnerable to the pathogen may not be appropriate. nicas's work has shown the possibility of using the dose-response model to assess the infection risk of diseases transmitted via the airborne route. nicas and his colleagues later modified the equation by expressing the infectious source strength term, g, as a multiple of the cough frequency, the pathogen concentration in respiratory fluid and the volume of expiratory droplets introduced into the air in a cough (nicas et al., 2005). other sink mechanisms for the airborne pathogen were also considered in the modified equation, with a formulation similar to equation 11. 
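the relation between equation 16 and the exponential dose-response model can be made concrete in a few lines. the variable names below follow the text (g, b, p, t, q for the ventilation rate); the numbers used are illustrative only:

```python
import math

def exponential_dose_response(dose, r=1.0):
    """P = 1 - exp(-r * dose); with r = 1 the host is treated as completely
    vulnerable, matching the implicit assumption of equation 16."""
    return 1.0 - math.exp(-r * dose)

def nicas_tb_risk(G, b, p, t, Q):
    """equation-16-style risk: G bacilli released per infector per unit
    time, b the alveolar deposition fraction, so the retained (deposited)
    dose under the well-mixed assumption is G*b*p*t/Q."""
    return exponential_dose_response(G * b * p * t / Q)
```

as noted above, the exponential model returns a probability of about 0.63 whenever the exponent equals 1, i.e. 1 - e^(-1), which is the single-organism-infects reading of the r = 1 assumption.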
these two dose-response models utilize the steady-state and well-mixed assumptions on the airborne pathogen concentration. the adequacy of using dose-response models to assess airborne infection risk was further demonstrated by the work of armstrong and haas on legionnaires' disease. because of the unavailability of human data on legionella, interspecies extrapolation of infectious dose data from animal models was performed (armstrong and haas, 2007a). risk extrapolation under low-dose conditions was used to obtain results with better relevance to the infectious dose data (armstrong and haas, 2007a), and a near field-far field model was used to estimate the spatial variation of the exposure level (armstrong and haas, 2007b). the risk assessment results were validated by comparing the estimated risk with the reported attack rates from documented outbreak cases (armstrong and haas, 2007b, 2008). this series of studies has set a good example and put in place rigorous procedures for using dose-response models to assess airborne infection risk. the studies have also signified the potential of dose-response models for assessing the infection risk of exposure to pathogen-laden aerosols other than those generated by infected people. sze to and co-workers have developed an exposure assessment model that can incorporate the aerodynamic size-dependent factors of airborne pathogens: where e(x, t_o) is the exposure level of the pathogen at location x during the exposure time interval t_o; c is the pathogen concentration in the respiratory fluid; f(t) is the viability function of the virus in the aerosols; and v(x, t) is the volume density of expiratory droplets at the location. v(x, t) can be obtained by cfd modeling or by experiments. the spatial distribution of infectious particles can be reflected by this parameter. 
it is tedious and time-consuming to model every cough during the exposure time interval to obtain v(x, t) at different locations. an alternative approach is to model the transport of the expiratory droplets after a single cough to obtain v(x, t) at different locations, and then to multiply the right-hand side of the equation by the total number of coughs during the exposure time interval. with other aerodynamic size-dependent factors considered, a stochastic non-threshold dose-response model for airborne pathogens can be formed: where m is the total number of size bins, v(x, t)_j is the volume density of droplets of the j-th size bin and f_s is the cough frequency. as the infectivity (reflected in r) and b are aerosol size-dependent, v(x, t) is split into different size bins. generally, equation 17 provides more realistic exposure estimates, but it is more time-consuming than obtaining the exposure level under the well-mixed assumption or with other simple models. equation 18 is especially suitable for parametric studies on the effect of environmental control, such as the ventilation strategy or airflow pattern, on the infection risk via airborne transmission. droplet and indirect contact transmission. many respiratory infectious diseases can be transmitted via the airborne route, and many of them can also be transmitted via droplet and indirect contact routes. it is also believed that the airborne mode may not be the major, or only, route of transmission for some respiratory diseases, such as sars and influenza, and indirect contact transmission may be an important route for many respiratory pathogens (beggs, 2003; boone and gerba, 2007). the result of a risk assessment will be incomplete without considering these transmission routes. dose-response models can assess the infection risk of these exposure pathways provided that the intake dose of the pathogen via these transmission routes can be estimated. 
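the size-binned structure of the equation-18 idea can be sketched schematically. this is a simplified reading, not the exact equation: the per-bin intake dose is taken here as c · v_j · b_j · p · t · f_s, and every symbol value is invented for illustration:

```python
import math

def size_binned_risk(c, bins, p, t, f_s=1.0):
    """schematic size-binned dose-response: the exposure is split into
    aerodynamic size bins, each with its own droplet volume density v_j,
    deposition fraction b_j and infectivity r_j, and the per-bin doses
    combine through a summed exponent. `bins` is a list of
    (v_j, b_j, r_j) tuples; this is a sketch, not equation 18 itself."""
    exponent = 0.0
    for v_j, b_j, r_j in bins:
        dose_j = c * v_j * b_j * p * t * f_s   # pathogens taken in from bin j
        exponent += r_j * dose_j
    return 1.0 - math.exp(-exponent)
```

the point of the structure is that infectivity and deposition are allowed to differ per size bin, so adding a coarse, poorly depositing bin changes the risk far less than adding the same droplet volume in a fine, well-depositing bin.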
droplet transmission occurs when pathogens are carried in relatively large expiratory droplets. unlike small expiratory aerosols, which can remain airborne for a long time and disperse over long distances, these large droplets can only travel short distances before settling. under the definitions of the centers for disease control and prevention (2003), disease transmission via expiratory aerosols with sizes greater than 5 µm is in the droplet mode, and via aerosols with sizes smaller than or equal to 5 µm in the airborne mode. however, studies have found that expiratory droplet nuclei with sizes up to about 20 µm may also travel long distances, similarly to aerosols with sizes smaller than or equal to 5 µm, depending on the airflow pattern and ventilation strategy (chao and wan, 2006). therefore, a more rigorous approach to dose-response infection risk assessment is not to distinguish between the airborne and droplet modes, but to directly split the exposure level into different aerodynamic size ranges of the infectious particles, as in equation 18. other dose-response models described in this article can also assess the infection risk via droplet transmission when the aerodynamic size-dependent factors are incorporated and an exposure assessment method that includes the spatial variation of aerosol exposure levels is adopted. other than directly inhaling aerosolized respiratory pathogens, susceptibles may also be exposed to them via contact with surfaces contaminated by the deposited pathogens, as the conjunctiva and nasal mucous membrane can be portals of entry for some respiratory pathogens such as the measles and influenza viruses. when infectious particles are deposited on solid surfaces, these surfaces become fomites. people contacting these contaminated surfaces may then deliver the pathogen to their eyes or nasal mucous membranes and may become infected. 
the first study assessing the infection risk via indirect contact transmission was performed by nicas and sun (2006). a markov chain model was used to estimate the intake dose of pathogens via indirect contact and also via other exposure pathways; once the intake dose is known, the infection risk can be assessed by the dose-response model. their model assumes a steady-state pathogen load on contaminated surfaces, in which the rate of introducing pathogens onto the surface equals the loss rate of the pathogen on the surface due to decay. this assumption is adequate for pathogens with a fast or medium decay rate on the surface. however, some pathogens survive on solid surfaces for days or even weeks (walther and ewald, 2004), and at such a slow decay rate the steady-state pathogen load takes a long time to be reached. in this case, the error associated with this assumption will be large, especially for short and medium exposure time intervals. nicas and best (2008) have proposed an analytical model assessing the infection risk via indirect contact transmission by considering an average pathogen load on the hand over the concerned exposure time interval. wan et al. (2009) have also developed a mathematical model describing the process of delivering pathogen to the mucous membranes via indirect contact transmission, which likewise estimates the intake dose via this route. their model can incorporate non-steady-state pathogen load conditions, but it assumes that the decay of the pathogen on the contaminated surface is insignificant during the exposure time interval. the equation is therefore suitable when the pathogen has a slow decay rate on the contaminated surface; in contrast, nicas and sun's model should be used when the pathogen has a fast or medium decay rate. 
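the bookkeeping these indirect-contact models formalize can be illustrated with a discrete-time simulation: pathogen accumulates on the hand from surface contacts, decays, and is partly delivered to the mucous membranes. this is only a sketch in the spirit of the nicas-best and wan et al. models, not their closed forms; the parameter names (c_s, a_s, c_h, c_m, f_h, f_m, u) follow the table definitions in the text, and every value is invented:

```python
def fomite_intake_sketch(C_s, A_s, C_h, C_m, f_h, f_m, u, t_o, dt=0.01):
    """discrete-time sketch of indirect-contact exposure: each step the hand
    picks up pathogen from contaminated surface contacts (load C_s over
    area A_s, transfer efficiency C_h, frequency f_h), loses some to decay
    at rate u, and delivers a fraction to the mucous membranes via
    hand-to-mucous-membrane contacts (efficiency C_m, frequency f_m)."""
    hand = 0.0      # current pathogen load on the hand
    mucous = 0.0    # cumulative dose delivered to the mucous membranes
    for _ in range(int(t_o / dt)):
        hand += C_s * A_s * C_h * f_h * dt   # pick-up from surface contacts
        hand -= u * hand * dt                # decay of pathogen on the hand
        transfer = C_m * f_m * hand * dt     # delivery to mucous membranes
        hand -= transfer
        mucous += transfer
    return mucous
```

the returned cumulative mucous-membrane dose can then be fed into a dose-response equation; the simulation form makes it easy to see why the steady-state shortcut fails for slowly decaying pathogens, since the hand load keeps ramping up over the exposure interval.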
with pathogens that can survive on inanimate surfaces for days or even weeks, the contaminated surfaces can serve as reservoirs for the pathogens for up to weeks without effective disinfection. these fomites can impose a potential infection risk on susceptible people for a long period of time, even after the source of the infectious particles has been removed; in this case, indirect contact transmission is the only exposure pathway to the susceptible people. their model can also estimate the intake dose in this scenario. table 3 shows these models. with the estimated intake dose, the infection risk can be assessed using the dose-response model. to assess the combined infection risk via multiple exposure pathways, the simplest way is to sum up all the intake doses from the different exposure pathways and then substitute the total into the dose-response model. however, as discussed in table 1, pathogens have heterogeneous infectivity in different regions of the respiratory tract. using a summed intake dose from different exposure pathways in dose-response assessment allows only a single fitting parameter (or a single set of fitting parameters) that treats the pathogens as having homogeneous infectivity. essentially, pathogens encased in small aerosols will generally infect the lower respiratory tract, pathogens encased in large aerosols will mainly infect the upper respiratory tract, and pathogens acquired via indirect contact will primarily infect the mucous membranes. risk assessors should separate the intake doses for the different exposure pathways and use different fitting parameters to obtain more realistic risk assessment results, unless the available infectious dose data are insufficient. to consider multiple intake doses from different exposure pathways, the dose-response equations need to be reformulated. 
the exponential model is modified as follows: where r_1, r_2, ..., r_m and n_1, n_2, ..., n_m stand for the fitting parameters and intake doses of the 1st, 2nd, ..., and m-th exposure pathways, respectively. in the beta-poisson model, the definite integral resulting in equation 7 (haas, 1983) is modified to: where the r value is described by a frequency distribution. for deterministic models, the probability of escaping infection from each exposure pathway has to be considered: where p_i,1, p_i,2, ..., and p_i,m are the infection risks via the 1st, 2nd, ..., and m-th exposure pathways, respectively. the heterogeneous-infectivity stochastic models can also be formulated using this 'escaping the infection' concept. infectious particles become more diluted as they disperse farther from the source, so the exposure level, and hence the infection risk from respiratory pathogens, is always expected to show spatial variation. as observed in many outbreaks of infectious respiratory diseases, the infection cases are often distributed with an obvious proximity relationship to the index case (e.g., gustafson et al., 1982; marsden, 2003). the spatial distribution of the infectious particles is thus an important consideration in the risk assessment of infectious respiratory diseases. when the well-mixed assumption is adopted, this spatial variation is ignored and all the susceptible people have the same infection risk. this approach may cause underestimation of the infection risk for susceptible people in close proximity to the infectious source (noakes and sleigh, 2008). newer models in both approaches allow the spatial variation of infection risk to be considered in the assessment. 
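the two multi-pathway formulations above agree with each other when the per-pathway risks are themselves exponential, which is a useful sanity check. a short sketch in python, with illustrative (r, n) pairs:

```python
import math

def combined_risk_exponential(doses_and_rs):
    """modified exponential model for m exposure pathways:
    P = 1 - exp(-(r_1*n_1 + r_2*n_2 + ... + r_m*n_m)).
    `doses_and_rs` is a list of (r_i, n_i) pairs."""
    return 1.0 - math.exp(-sum(r * n for r, n in doses_and_rs))

def combined_risk_escape(pathway_risks):
    """'escaping the infection' formulation for deterministic models:
    P = 1 - (1 - p_i,1)(1 - p_i,2)...(1 - p_i,m)."""
    escape = 1.0
    for p in pathway_risks:
        escape *= (1.0 - p)
    return 1.0 - escape
```

if each pathway risk is p_i = 1 - exp(-r_i n_i), the escape product collapses to exp(-sum r_i n_i), so both functions return the same combined risk; the escape form is the one that generalizes to per-pathway risks computed by different models.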
however, as the wells-riley models rely on backward-calculated infectivity and infectious source strength (the quanta generation rate), there is currently no method other than the well-mixed approach to estimate the infectious source strength from an outbreak. when the infectious source strength term is backward calculated from an outbreak under the well-mixed assumption, the influences of the geometry of the premises, the airflow pattern and the location of the infectious source on the spatial distribution of the infectious particles are implicitly folded into that infectious source strength. when this infectious source strength is then used in risk assessments, these influencing factors cannot be explicitly adjusted and become errors, even if the risk assessor uses a model that can incorporate spatial variation in exposure levels. therefore, the errors associated with the well-mixed assumption persist even when a modified wells-riley model that can consider spatial effects is used. dose-response models do not rely on backward calculation of infectivity and infectious source strength from outbreaks, so using dose-response models that can consider spatial effects avoids those errors. in epidemic modeling, where the spread of the disease in the community is concerned, it is difficult to specify the geometries, airflow patterns and locations of the infectious sources in every indoor premises, as these factors vary from case to case; in that setting, adopting the well-mixed assumption is generally more reasonable than hypothesizing particular environments and scenarios. as described in table 1, air turbulence induces randomness, which directly affects the intake dose of the airborne pathogen. 
the randomness induced by air turbulence consists of two components: the randomness of the pathogen distribution in air and the randomness of the respiratory deposition of the inhaled pathogen. the former component can be described by poisson probability, as is done in some of the current risk models. for the latter component, if the respiratory deposition fraction is simply multiplied by the inhaled dose, the deposited dose is only the expected value and this randomness is not fully adjusted. when a person inhales an amount of n* pathogens and the respiratory deposition fraction of the aerosols is b*, an amount of b*n* pathogens is most likely to be successfully deposited in his/her respiratory tract. however, it is also possible for zero pathogens to be successfully deposited, if the person has lottery-winning good luck, or for all n* pathogens to be successfully deposited, if the person has very bad luck. a binomial probability distribution is therefore needed to describe this randomness. taking the exponential dose-response model as an example, the equation should be modified as: where the first parenthesis is the binomial coefficient. [table 3 parameter definitions: e_m is the dose of pathogen delivered to the mucous membrane, c_h is the transfer efficiency of the pathogen from the surface to the hand after a contact, c_m is the transfer efficiency of the pathogen from the hand to the mucous membrane after a contact, f_h is the frequency of hand-to-contaminated-surface contact, f_m is the frequency of hand-to-mucous-membrane contact, a_s is the average contaminated surface area touched per hand contact, t_o is the concerned time interval, c_s is the pathogen load per area of the contaminated surface, and u is the decay rate of pathogen on the hand; n is the number of hand-to-mucous-membrane contacts, n_0 is the pathogen load on the contaminated surface at time 0, an additional amount n_x of pathogen is deposited on the contaminated surface after each cough, c_h in equation 20 is the transfer efficiency of the pathogen from the surface to the hand factoring in the ratio of the fingerpad area to the contaminated surface area, and b is the decay rate of the pathogen on the hand.] generally, this binomial distribution property is implicitly considered in the backward-calculated quanta generation rate or in empirically obtained inhalation infectious dose data, and the binomial probability of airborne pathogen deposition is thus reflected by these parameters. in a more rigorous sense, although the property has been implicitly considered by these parameters, it may not be fully adjusted, as the binomial probabilities differ under different values of the expected intake dose. this is a limitation of both infection risk assessment approaches. in addition, the respiratory deposition of aerosols is not yet well understood; scientists are still investigating aerosol deposition in the respiratory tract and characterizing regional deposition (e.g., choi and kim, 2007; park and wexler, 2008). many other factors are also implicitly considered by the backward-calculated quanta generation rate or infectious dose data, as previously discussed, and this may cause errors in the risk assessment results. quanta generation rates and the fitting parameters of the dose-response equation describe the infectivity of the pathogen, and the pathogen-host interaction is implicitly considered by these parameters. 
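the binomial modification can be written out directly. the sketch below compares an equation-24-style risk, summed over the binomial deposition probabilities, with the conventional expected-dose shortcut; the parameter values are illustrative:

```python
import math

def risk_with_binomial_deposition(n_inhaled, b, r):
    """equation-24-style exponential model: instead of using the expected
    deposited dose b*n, sum over the binomial probability that exactly k of
    the n inhaled pathogens deposit successfully."""
    p = 0.0
    for k in range(n_inhaled + 1):
        p_k = math.comb(n_inhaled, k) * b**k * (1.0 - b)**(n_inhaled - k)
        p += p_k * (1.0 - math.exp(-r * k))
    return p

def risk_with_expected_dose(n_inhaled, b, r):
    """the conventional shortcut: the deposited dose is taken as the
    expectation b*n and substituted into the exponential model."""
    return 1.0 - math.exp(-r * b * n_inhaled)
```

since exp is convex, the binomial-averaged risk is never larger than the expected-dose risk (the sum collapses to 1 - (1 - b + b·e^(-r))^n), which makes concrete the text's point that the shortcut does not fully adjust for the deposition randomness.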
although the pathogen-host interaction is not well understood or quantified, it is reasonable to assume that the interaction between a particular species of pathogen and a particular species of host is rather consistent across different scenarios, while the variation in the individual host's immune status is either reflected statistically in the formulation of the model or implicitly considered in the infectivity terms. however, other influencing factors that are implicitly considered by the quanta generation rate or by the fitting parameters may vary much more across cases. the backward-calculated quanta generation rate also does not distinguish whether the infection cases were caused by airborne, droplet or indirect contact transmission, or by a combination of all three, but assumes that all infection cases were caused by airborne infection. this too induces implicit error in the backward-calculated quanta generation rate. spatial heterogeneity, pathogen survivability, deposition loss of infectious particles and other influencing factors are likewise implicitly considered in the backward-calculated quanta generation rate. these factors vary from case to case, and the calculated quanta generation rate inherits all of them from that particular outbreak case. with so many influencing factors implicitly lumped into a single parameter, the case-to-case variations, and hence the implicit errors, can be huge: as reviewed in one study, the quanta generation rates of tuberculosis calculated from different outbreaks ranged from 1.25 to 30,840 quanta/h. other than the variation of infectious source strengths, this huge variation is also likely attributable to these implicit errors. in dose-response models, as many of these influencing factors can be considered explicitly, there are in general fewer implicit errors. 
the quanta generation rate describes the infectivity of the pathogen as well as the infectious source strength of the outbreak. this hypothetical infectious dose unit offers convenience for risk assessment but provides less information about the outbreak, as it cannot separate the pathogen emission rate from the pathogen's infectivity. when the quanta generation rates of two diseases are compared, epidemiologists cannot ascertain whether the one with the greater quanta generation rate is more infective, or whether its infector simply shed more pathogens during the outbreak. the infectivity of different pathogens can be compared far more easily in infectious dose units, in which the quantities of the pathogens can be compared directly. the dose-response model can also be used to calculate the infectious source strength of an outbreak, expressed in terms of the quantity of the pathogen rather than a hypothetical unit. to demonstrate this idea, we selected an influenza outbreak during an air flight in australia in 1999 (marsden, 2003) as an example, as shown in the accompanying table. although interspecies extrapolation can be used to adapt animal infectious dose data to humans, using extrapolated animal data may still not fully adjust for the differences in pathogen-host interactions between the two species. risk assessors should be aware that the respiratory deposition of animals differs from that of humans (asgharian et al., 1995), and this difference should also be considered when performing interspecies extrapolations. some pathogens, such as the sars coronavirus, are too dangerous to be aerosolized for infection experiments, and their infectious dose data are often unavailable. in the wells-riley model, as the quanta generation rate is backward calculated from an outbreak case of the disease, the infectivity described by the quanta generation rate always refers to the infectivity of the pathogen in humans. 
therefore, it does not require interspecies extrapolation of infectivity. these are great advantages over the dose-response models; naturally, an outbreak case of the disease has to be available for the quanta generation rate to be obtained. to minimize the uncertainty in the respiratory deposition of airborne pathogens induced by air turbulence, the binomial probability property of respiratory deposition should be considered explicitly, for example using equation 24. the fitting parameters should then be calculated from the infectious dose data using this type of dose-response equation, so that the binomial probability property is not implicitly absorbed into the calculated fitting parameters. if the infectious dose data were not obtained from the inhalation of aerosolized pathogens, for example because intranasal inoculation was used, they do not contain the uncertainty of respiratory deposition; in this case, the fitting parameters should be calculated with a dose-response equation that does not consider the binomial probability distribution, such as equation 6. other influencing factors should also be considered explicitly with a similar procedure. this can reduce the implicit errors in the quanta generation rate or fitting parameters. the current wells-riley model only models airborne transmission of respiratory infectious diseases, and aerodynamic size-dependent dispersion and deposition of the infectious particles cannot be considered. with advances in numerical modeling techniques, these shortcomings may be overcome: newer cfd models allow using a gaseous surrogate to model the dispersion and deposition of polydispersed aerosols (e.g., lai and chen, 2007; zhang and chen, 2007). combined with rudnick and milton's concept, the risk assessment model should be able to incorporate aerodynamic size-dependent dispersion and deposition loss, allowing assessment of the risk via the droplet transmission route. 
when the amount of aerosols deposited on the contaminated surfaces is known, the exposure of the susceptible person to the pathogen via indirect contact transmission, in terms of the number of quanta, can be estimated by equation 19 or 20; the model should therefore also be able to assess the risk of indirect contact transmission of the disease. the wells-riley model implicitly considers many influencing factors, which provides convenience for risk assessment: with the backward-calculated quanta generation rate, the wells-riley models can be used to perform risk assessment even when the infectious dose data of the pathogen are unavailable. dose-response models are able to consider many influencing factors explicitly and therefore inherit fewer implicit errors when performing risk assessment. [table notes for the worked example: the influenza outbreak related to air travel (marsden, 2003) was selected for the calculation; with a 3-h-and-20-min exposure time interval, 20 out of 74 susceptible persons were infected (27%). (a) for comparison purposes, the exponential dose-response model was used. (b) also for comparison purposes, the airborne mode was assumed to be the only transmission route during the outbreak and ventilation dilution the only sink for the airborne pathogen; both calculations adopted the steady-state and well-mixed assumptions, and b was assumed to be 0.6 (alford et al., 1966). (c) the attack rate of the disease during the outbreak was substituted into p_i in both equations. (d) an air change rate of 25 per hour was assumed during the outbreak, which is typical of commercial aircraft (hunt and space, 1995). (e) infectious dose data of influenza reported by alford et al. (1966) were used (mean id_50 = 1.8 tcid_50); more details of the r estimation are given in footnote b of table 2. (f) tcid_50 (50% tissue culture infectious dose) is a unit quantifying the amount of viable viruses.] 
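the back-calculation of a quanta generation rate from an outbreak, as performed for this example, can be reproduced by inverting equation 1. the attack rate (20/74), exposure time (3 h 20 min) and air change rate (25 per hour) come from the notes above; the cabin volume, single infector and breathing rate below are purely illustrative assumptions, so the resulting q is for demonstration only and, as emphasized in the text, implicitly bundles all sinks and the well-mixed assumption of the original outbreak:

```python
import math

def back_calculate_q(P, I, p, t, Q):
    """invert the steady-state, well-mixed wells-riley equation
    P = 1 - exp(-I*q*p*t/Q) to recover the quanta generation rate q
    from an observed attack rate P."""
    return -Q * math.log(1.0 - P) / (I * p * t)

P = 20 / 74          # attack rate from the outbreak notes
ach = 25             # typical commercial-aircraft air change rate, 1/h
V = 150.0            # assumed cabin volume in m^3 (illustrative, not sourced)
q = back_calculate_q(P=P, I=1, p=0.48, t=10 / 3, Q=ach * V)
print(round(q, 1), "quanta/h")
```

note how strongly q scales with the assumed ventilation volume flow: doubling the cabin volume doubles the back-calculated quanta rate, which illustrates why quanta rates from different outbreaks span such a wide range.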
dose-response models can incorporate droplet and indirect contact transmission of respiratory infectious diseases, allowing them to provide a more complete risk assessment than the current wells-riley models do. dose-response models can also calculate the infectious source strength of an outbreak in terms of the quantity of pathogen rather than the number of quanta, which provides further information for epidemiologists in understanding disease transmission. the spatial distribution of airborne pathogens is an important consideration, as it governs the exposure levels of the susceptible people, and the respiratory deposition of aerosols also plays an important role in the intake and infection risk of respiratory pathogens. heterogeneous infectivity is observed in airborne respiratory pathogens because of differences in their carrier aerosol sizes and the subsequent respiratory deposition; it is also observed when exposure to the pathogen occurs via different exposure pathways. air turbulence induces a binomial probability property in the respiratory deposition of airborne pathogens, which is not well adjusted in current risk models. newer numerical modeling techniques shed light on overcoming the existing shortcomings of current risk models. multidisciplinary knowledge is always necessary in the study of disease transmission and in formulating infection control strategies. with further developments in the two risk assessment approaches, we believe that both can serve as useful tools for understanding disease transmission mechanisms and developing infection control strategies. 
references:
an examination of the reed frost theory of epidemics
human influenza resulting from aerosol inhalation
birth-death and other models for microbial infection
a quantitative microbial risk assessment model for legionnaires' disease: animal model selection and dose-response modeling
quantitative microbial risk assessment model for legionnaires' disease: assessment of human exposures for selected spa outbreaks
legionnaires' disease: evaluation of a quantitative microbial risk assessment model
empirical modeling of particle deposition in the alveolar region of the lungs: a basis for interspecies extrapolation
the airborne transmission of infection in hospital buildings: fact or fiction?
the transmission of tuberculosis in confined spaces: an analytical review of alternative epidemiological models
significance of fomites in the spread of respiratory and enteric viral disease
the relationship between route of infection and minimum infectious dose: studies with lactate dehydrogenase-elevating virus
guidelines for environmental infection control in health-care facilities
a study of the dispersion of expiratory aerosols in unidirectional downward and ceiling-return type air flows using multiphase approach
transport and removal of expiratory droplets in hospital ward environment
mathematical analysis of particle deposition in human lungs: an improved single path transport model
experimental tularemia in macaca mulatta: relationship of aerosol particle size to the infectivity of airborne pasteurella tularensis
minimum infective dose of mycobacterium bovis in cattle
influenza in man
natural ventilation for the prevention of airborne contagion
the relative efficacy of respirators and room ventilation in preventing occupational tuberculosis
economic benefits of an economizer system: energy savings and reduced sick leave
the feasibility of indoor humidity control against avian influenza
a mathematical model for the infectivity-dilution curve of tobacco mosaic virus: theoretical considerations
using a mathematical model to evaluate the efficacy of tb control measures
using maple to analyze a model for airborne contagion
the airborne transmission of infection between flats in high-rise residential buildings: tracer gas simulation
an outbreak of airborne nosocomial varicella
estimation of risk due to low doses of microorganisms: a comparison of alternative methodologies
quantitative microbial risk assessment
aerosol technology
the tuberculin skin test (1993)
the airplane cabin environment: issues pertaining to flight attendant comfort
estimation of tuberculosis risk and incidence under upper room ultraviolet germicidal irradiation in a waiting room in a hypothetical scenario
estimation of tuberculosis risk on a commercial airliner
comparison of a new eulerian model with a modified lagrangian approach for particle distribution and deposition indoors
role of ventilation in airborne transmission of infectious agents in the built environment - a multidisciplinary systematic review
a probabilistic transmission dynamic model to assess indoor airborne infection risks
modelling respiratory infection control measure effects
influenza outbreak related to air travel
the dilution assay of viruses
airborne infection: theoretical limits of protection achievable by building ventilation
risk assessment in the federal government: managing the process
framework for evaluating measures to control nosocomial tuberculosis transmission
an analytical framework for relating dose, risk, and incidence: an application to occupational tuberculosis infection
a study quantifying the hand-to-face contact rate and its potential application to predicting respiratory tract infection
a risk analysis for airborne pathogens with low infectious doses: application to respirator selection against coccidioides immitis spores
an integrated model of infection risk in a health-care environment
the infectious dose of variola (smallpox) virus
toward understanding the risk of secondary airborne infection: emission of respirable pathogens
applying the wells-riley equation to the risk of airborne infection in hospital environments: the importance of stochastic and proximity effects
modelling the transmission of airborne infections in enclosed spaces
size-dependent deposition of particles in the human lung at steady-state breathing
airborne spread of measles in a suburban elementary school
bacterial colonization and infection resulting from multiplication of a single organism
risk of indoor airborne infection transmission estimated from carbon dioxide concentration
survival of airborne influenza virus: effects of propagating host, relative humidity, and composition of spray fluids
a methodology for estimating airborne virus exposures in indoor environments using the spatial distribution of expiratory aerosols and virus viability characteristics
review of aerosol transmission of influenza a virus
norwalk virus: how infectious is it?
infection risk of indoor airborne transmission of diseases in multiple spaces
pathogen survival in the external environment and the evolution of virulence
transport characteristics of expiratory droplets and droplet nuclei in indoor environments with different ventilation air flow patterns
dispersion of expiratory aerosols in a general hospital ward with ceiling mixing type mechanical ventilation system
modeling the fate of expiratory aerosols and the associated infection risk in an aircraft cabin environment
airborne contagion and air hygiene
hazard characterization for pathogens in food and water-guidelines
comparison of the eulerian and lagrangian methods for predicting particle transport in enclosed spaces
this research was financially supported by the government of hong kong s.a.r. through general research fund accounts 611506 and 611308.
key: cord-339649-ppgmmeuz authors: klein, michael g.; cheng, carolynn j.; lii, evonne; mao, keying; mesbahi, hamza; zhu, tianjie; muckstadt, john a.; hupert, nathaniel title: covid-19 models for hospital surge capacity planning: a systematic review date: 2020-09-10 journal: disaster medicine and public health preparedness doi: 10.1017/dmp.2020.332 sha: doc_id: 339649 cord_uid: ppgmmeuz objective: health system preparedness for coronavirus disease (covid-19) includes projecting the number and timing of cases requiring various types of treatment. several tools were developed to assist in this planning process. this review highlights models that project both caseload and hospital capacity requirements over time. methods: we systematically reviewed the medical and engineering literature according to preferred reporting items for systematic reviews and meta-analyses (prisma) guidelines. we completed searches using pubmed, embase, isi web of science, google scholar, and the google search engine. results: the search strategy identified 690 articles. for a detailed review, we selected 6 models that met our predefined criteria. half of the models did not include age-stratified parameters, and only 1 included the option to represent a second wave. hospital patient flow was simplified in all models; however, some considered more complex patient pathways. one model included fatality ratios with length of stay (los) adjustments for survivors versus those who die, and accommodated different los for critical care patients with or without a ventilator. conclusion: the results of our study provide information to physicians, hospital administrators, emergency response personnel, and governmental agencies on available models for preparing scenario-based plans for responding to the covid-19 or similar type of outbreak. 
the covid-19 pandemic caused unprecedented stress on health care systems around the globe, requiring treatment capabilities and resources exceeding "normal" emergency surge capacity. radical efforts to increase treatment space were undertaken, ranging from state-wide cancellation of elective surgeries to exhortations for hospitals to double medical and surgical ward beds (eg, in new york state). at one large medical center, hospital administrators canceled elective procedures, then converted operating rooms and postanesthesia care units to intensive care units (icus). this effort created a 50% increase in icu capacity. 4 however, the pandemic severely taxed new york hospitals in late march and april 2020. personal protective equipment (ppe) was scarce, isolation capacity was insufficient, critical resource supply chains were strained, emergency departments were overwhelmed, and unstable patients were transferred between hospitals. 5 in a large medical center in new york city (nyc), 23.6% of the first 1000 covid-19 patients were admitted or transferred to an icu. covid-19 patients in these icus required a very long length of stay (los), with a median of 23 days. furthermore, the challenges extended beyond bed capacity; 57.6% of patients admitted to the icu needed a ventilator and 35.2% needed dialysis. 6 naturally, concerns were reported that deaths due to shortages of ventilators and dialysis machines could have been avoided if hospitals had had enough critical equipment and personnel to meet the needs of covid-19 patients. 5, 7, 8 planning for the resources needed to respond to the covid-19 virus or a future pandemic is based on projecting the number and timing of cases requiring various types of treatment. several tools were developed to assist hospital administrators, physicians, emergency response personnel, and governmental agencies in this planning process. these tools were typically used to consider a variety of possible scenarios at the beginning of the pandemic.
in places where the peak of the first wave had already occurred, hospital surge capacity planning tools can help prepare for future waves. currently, in the united states, a surge of new covid-19 cases is occurring in multiple states that have not yet experienced a major first wave but nevertheless have relaxed physical distancing measures. resumption of large group gatherings and the occurrence of mass protests in many parts of the country may be contributing to the current rise of cases. 9 this study highlights planning models that can be used to estimate hospital capacity requirements due to surges of patients with covid-19. typically, for a planning horizon of 1 month or longer, the models can be used to consider different scenarios with different parameters. for each user-defined scenario, these tools identify an epidemic curve of the expected number of covid-19 cases per day and the expected hospital occupancy per day in medical-surgical wards and icus. we provide the input parameters, highlight key features, and explain the output that can be produced from each model. we compare distinguishing features and provide a discussion on the usefulness and limitations of these models. it is imperative to note that these models should be used only to estimate resource requirements. they do not indicate how supply chains need to be designed and operated to meet these needs. thus, they provide the basis for understanding the scope of the problems facing decision-makers but do not indicate how to address them. our focus is on models that help with both caseload projection and hospital capacity management. we conducted our review according to the preferred reporting items for systematic reviews and meta-analyses (prisma) guidelines. 10 we conducted our study from may 15, 2020, to july 15, 2020. the date of the last performed search was july 9, 2020. 
we completed database searches using pubmed (national library of medicine), embase (elsevier), the institute for scientific information (isi) web of science (thomson reuters), and google scholar. we used regular google searches to identify additional models created by researchers and publicly available on university websites. the search key words included "covid," "hospital," "surge," "estimate," "predict," "bed," "caseload," "capacity," "tool," and "model." we combined the search key words with the boolean operators OR and AND as follows. to search the academic literature in pubmed and embase, we entered: "covid" AND ("tool" OR "model") AND ("hospital" OR "surge") AND ("estimate" OR "estimating" OR "predict" OR "predicting" OR "bed" OR "caseload" OR "capacity"). in isi web of science, we entered: "covid" AND ("tool" OR "model") AND ("hospital" OR "surge" OR "estimate" OR "estimating" OR "predict" OR "predicting" OR "bed" OR "caseload" OR "capacity"). in google scholar, we entered: "covid" AND "hospital" AND "surge" AND ("estimate" OR "estimating" OR "predict" OR "predicting") AND ("bed" OR "caseload") AND "capacity" AND ("tool" OR "model"). finally, using the regular google search engine, we entered: "covid" AND "hospital" AND "surge" AND ("bed" OR "caseload") AND "capacity" AND "tool" AND "model." the first inclusion criterion was to ensure that the article described a computer model or tool. as a second criterion, the article needed to describe a model or tool for covid-19. the third inclusion criterion was that the article must have investigated surge capacity management, including hospital occupancy. fourth, we ensured that the model input parameters included the possibility to define a population served by a single hospital. the fifth criterion was that the model had to include at least 1 parameter pertaining to hospital los. sixth, we ensured that the model considered ventilator capacity. we limited our search to english language articles.
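the search strings above all follow one pattern: groups of synonyms OR'ed together, with the groups AND'ed. as a hedged illustration (the helper name and structure are ours, not part of the study's protocol), the strings can be assembled programmatically:

```python
def boolean_query(and_groups):
    """join groups of synonyms into a database-style boolean query.

    and_groups: list of lists of terms; terms within a group are OR'ed,
    and the groups are AND'ed, mirroring the search strings in the text.
    """
    parts = []
    for group in and_groups:
        quoted = [f'"{term}"' for term in group]
        clause = " OR ".join(quoted)
        parts.append(f"({clause})" if len(group) > 1 else clause)
    return " AND ".join(parts)

# the pubmed/embase query from the text, rebuilt from its term groups
pubmed_query = boolean_query([
    ["covid"],
    ["tool", "model"],
    ["hospital", "surge"],
    ["estimate", "estimating", "predict", "predicting", "bed", "caseload", "capacity"],
])
```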
second, we excluded models that focused on forecasting cases or the epidemic curve without hospital parameters. third, we excluded models that focused primarily on the impact of non-pharmaceutical interventions on potential epidemic curves. finally, we also excluded models that focused on hospital resources without consideration of covid-19 caseloads or hospital los. our search returned a total of 690 articles. this number includes all records returned from pubmed, embase, isi web of science, and google scholar plus additional records identified from the first 50 results returned from the regular google search query. figure 1 provides a flow diagram to illustrate the search and selection process according to prisma guidelines. 10 after eliminating duplicates, 537 articles remained. for pubmed, embase, isi web of science, and google scholar, we screened titles for possible hospital surge capacity planning models. we then read abstracts and full-text articles of the remaining 126 articles. for regular google search results, we visited each link to determine whether the link referred to a hospital surge capacity planning model. after considering all inclusion and exclusion criteria, we selected 6 models for a detailed review. the other articles did not meet the inclusion criteria or met the exclusion criteria because (i) 19 articles did not describe a computer model, (ii) 25 articles used a population from a larger region such as a nation or state without the option for a hospital-level analysis, and (iii) 22 articles focused on the impact of non-pharmaceutical interventions on potential epidemic curves. in addition, 17 papers reported models that did not include an epidemic curve, and 37 had models that did not have at least 1 parameter pertaining to hospital los. academic researchers created 5 of the models included in the detailed review, and the us centers for disease control and prevention (cdc) created the sixth model.
all models included in this review are available at no cost, either through an online user interface or as a spreadsheet tool available for downloading. a brief description of each model is provided along with the model's input parameters, key features, and output. a comparison of the 6 models follows in the discussion section. the first model, a scalable interactive tool (cornell's c5v), was designed to estimate covid-19 caseloads and project the critical resources needed to treat those cases for any user-designed scenario. for the epidemic curve, the tool provides the option to model 1 wave, 2 waves, or use an empirical distribution supplied directly by the user. with a single wave, covid-19 hospital admissions are assumed to be distributed according to a gamma distribution with a median of 30-90 days and a dispersion parameter ranging from a relatively peaked to a relatively flat-looking arrival curve. the optional second wave can be distributed according to another gamma distribution with a median day of up to 1 year, and its dispersion parameter can differ from the first wave's. given the scenario, the hospital system projections are broken down into medical-surgical and icu beds and ventilators while considering a variety of outbreak characteristics described in supplementary table 1. the next model, an online tool, provides an estimate of the maximum manageable daily number of incident covid-19 cases that a health care system could serve based on an age-stratified case distribution and severity, as well as available medical resources, such as the number of available acute and critical care beds. created at the university of toronto, ontario, canada, in collaboration with university-affiliated health networks, the authors provided versions in 3 different languages: english, french, and spanish. supplementary table 2 provides the model parameters and output for the caic-rt. the source code and a video demonstration are available online, 13 and a research paper provides additional information.
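the single-wave, gamma-shaped admissions curve described above for cornell's c5v can be sketched in a few lines. this is our own illustrative parameterization (shape/scale chosen so the mode lands on a chosen peak day, with shape > 1), not the tool's actual code:

```python
import math

def gamma_pdf(x, shape, scale):
    """probability density of a gamma(shape, scale) distribution."""
    if x <= 0:
        return 0.0
    return (x ** (shape - 1) * math.exp(-x / scale)
            / (math.gamma(shape) * scale ** shape))

def single_wave_admissions(total_admissions, peak_day, dispersion, horizon_days):
    """daily hospital admissions for one gamma-shaped wave (a sketch).

    `dispersion` acts as the gamma shape parameter (> 1): larger values
    give a more peaked curve, smaller values a flatter one. the scale is
    chosen so the mode of the density falls on `peak_day`.
    """
    shape = dispersion
    scale = peak_day / (shape - 1)  # mode of gamma = (shape - 1) * scale
    weights = [gamma_pdf(day, shape, scale) for day in range(1, horizon_days + 1)]
    norm = sum(weights)
    return [total_admissions * w / norm for w in weights]
```

sweeping `dispersion` reproduces the "relatively peaked to relatively flat-looking" family of arrival curves the text mentions.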
14 the covid-19 hospital impact model for epidemics (chime) is an online tool developed by the predictive healthcare team at the university of pennsylvania in philadelphia. it offers users the ability to visualize forecasts for several outcomes of the covid-19 outbreak, for example, the cumulative number of hospitalizations, the number of new daily hospitalizations, and the cumulative number of susceptible individuals in a population. hospital administrators, personnel, and public health officials can use it to predict caseloads and epidemic curves, to adjust medical resources accordingly, and, overall, to enable a data-driven resource requirements plan for responding to the outbreak. chime also offers an optional spreadsheet-based ppe tool. chime uses a discrete-time susceptible, infected, removed (sir) model. the parameters are estimated from "other locations … based on logical reasoning, and best guesses from the american hospital association." 15 while some model parameters cannot be changed directly, others are user-configurable. we provide those parameters and the output in supplementary table 3. a video demonstration is available on the chime website, 15 and a research paper provides additional information. 16 stanford university's covid-19 icu and floor projection model is an online model designed to facilitate hospital planning by estimating bed demand for covid-19 patients. the model estimates the daily number of covid-19-related medical resources required, such as intensive care beds, acute care beds, and ventilators, necessary to balance the patient population requiring hospitalization against hospital capacity. the model is available on the systems utilization research for (surf) stanford medicine website. 17 we provide the model parameters and output in supplementary table 4. a recent working paper provides additional information.
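the discrete-time sir update that chime is described above as using can be sketched minimally. this is our own generic sketch with illustrative parameter values, not chime's actual implementation:

```python
def sir_step(s, i, r, beta, gamma, n):
    """one day of a discrete-time sir model.

    new infections: beta * s * i / n; recoveries: gamma * i.
    """
    new_infections = beta * s * i / n
    new_recoveries = gamma * i
    return (s - new_infections,
            i + new_infections - new_recoveries,
            r + new_recoveries)

def simulate_sir(n, i0, beta, gamma, days):
    """run the daily sir update and return the (s, i, r) trajectory."""
    s, i, r = float(n - i0), float(i0), 0.0
    history = [(s, i, r)]
    for _ in range(days):
        s, i, r = sir_step(s, i, r, beta, gamma, n)
        history.append((s, i, r))
    return history
```

the daily infected counts from such a trajectory, multiplied by assumed hospitalization and icu fractions, are the kind of quantity these tools turn into bed forecasts.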
18 covid-19surge is a spreadsheet-based tool created by the us cdc that can be used to estimate the surge in demand for hospital resources during the covid-19 pandemic. users can estimate the number of covid-19 patients with different needs, such as hospitalization, ventilators, and icu care. at the same time, users can input the current number of patients and the available medical resources to assess which community mitigation strategy is more appropriate. this allows a comparison of the predicted values with the hospital's existing resources to support reasonable allocation. model parameters and output are provided in supplementary table 5. the model can be downloaded from the cdc website, 19 where additional documentation is available. the final tool is a spreadsheet-based model that helps project, up to 30 days in advance, hospital bed demand and occupancy (census), icu beds, critical equipment, and ppe consumption, also known as the burn rate. this model offers both deterministic and stochastic options for calculating predictions and also has an accuracy tracker to ensure proper inputs and outputs. by providing inputs such as admission rates and los for medical, icu, and ventilated patients, the model can be used to help address capacity concerns, supply consumption concerns, and operational decisions amid the covid-19 pandemic. we provide the model parameters and output for the bed demand tool in supplementary table 6. optional functionality for ppe and staff planning is excluded from supplementary table 6. the entire spreadsheet model is available freely to any health system and can be downloaded from northeastern university's healthcare systems engineering institute website. 20 a recent working paper provides additional information. 21 public health officials and hospital leaders worldwide continue to face unprecedented challenges due to the covid-19 pandemic. uncertainty existed, and still exists, about the disease's spread over time.
this uncertainty makes resource planning exceptionally difficult. models used to estimate resource needs are based on assumptions about how a pandemic unfolds over time. in particular, the parameter values used in a specific scenario represent a user's estimate of one possible way that a pandemic and its response might arise. for example, the age-stratified cdc modeling parameters 22 changed from their earliest iterations in mid-february 2020 to the later version in april 2020; the "true" values for many parameters relating to covid-19 hospitalization may still be quite different from the cdc estimates. a recent editorial compared penn's chime model with the university of toronto's caic-rt model. its comments include: "as with hurricane-tracking models, they make varying projections; yet in the face of uncertainty, they provide useful real-time forecasts to prepare for the pandemic, as evidenced by their broad use." 23 it can be misleading to run a single scenario and report it as a forecast of what is going to happen. instead, multiple scenarios should be run with a variety of parameter values. the models included in this review are used as intended only if multiple scenarios are examined. table 1 provides a summary of the similarities and differences of the 6 hospital surge capacity planning models we reviewed. three of the models have online interfaces only, 2 have spreadsheet interfaces only, and cornell's c5v has all 3 versions: an online version, a spreadsheet version, and a desktop version. the university of toronto's caic-rt model is available in english, french, and spanish, whereas the other 5 models are offered in english only. for most models, the planning horizon is limited to 30 days.
exceptions are versions of cornell's c5v (the spreadsheet version of which covers pre- and post-peak periods up to 180 days, and the online version of which can model up to 360 days) and the spreadsheet version of the cdc covid-19surge, which supports a 1-year planning horizon. a longer planning horizon is particularly helpful for modeling an outbreak with multiple waves. most of the tools provide the option to model a single wave, whereas the online version of cornell's c5v is the only tool that provides the option to model a second wave. studies show that the odds of death from covid-19 increase with age, with different proportions reported in wuhan, china, 24 northern italy, 25 and nyc. 26 furthermore, an nyc study of 5700 patients hospitalized with covid-19 reported that los and ventilator needs differ by age group. 26 therefore, another important distinguishing feature is age-stratification, usually based on cdc modeling parameters. 22 naturally, the cdc covid-19surge supports age-stratification but, surprisingly, only for the 5 age ranges originally released, not the newer division into 7 age ranges. the online version of the university of toronto's caic-rt model supports the newer 7-strata age distribution, the spreadsheet and desktop versions of c5v support the older 5 age ranges, and the online version of c5v provides the option to use either 7 or 5 age ranges. the other 3 models do not include age-stratification; hence, they use the same hospitalization and icu proportions regardless of a patient's age. for the epidemic curve, some modelers adopted a mechanistic approach. in some cases, this requires inputting a parameter called the doubling time: the number of days it takes for cases to double. with the doubling time as a parameter, mechanistic models assume that the number of cases will grow exponentially.
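the doubling-time assumption reduces to one line of arithmetic: if cases double every T days, the implied daily growth rate g satisfies (1 + g)^T = 2, and cases(t) = c0 · 2^(t/T). a minimal sketch (function names are ours):

```python
def daily_growth_rate(doubling_time_days):
    """daily growth rate g implied by a doubling time: (1 + g)^T = 2."""
    return 2.0 ** (1.0 / doubling_time_days) - 1.0

def projected_cases(initial_cases, doubling_time_days, day):
    """exponential-phase projection: cases double every T days."""
    return initial_cases * 2.0 ** (day / doubling_time_days)
```

with a 4-day doubling time, 100 cases today implies roughly 400 cases in 8 days, which is why small differences in this single parameter dominate short-horizon bed forecasts.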
for example, penn medicine's chime model uses a doubling time that is configurable, and its other parameters are based on estimates and the shape of curves observed at the beginning of the covid-19 outbreak in china and italy. the university of toronto's caic-rt model takes a different approach: instead of the epidemic curve, the model focuses on identifying the maximum number of patients that the hospital can handle in a surge situation. in practice, there are different patient pathways for covid-19 patients. for example, a patient could be admitted to a medicine ward and later be transferred to the icu and need a ventilator. however, most models make the simplifying assumption that each patient's entire los will be only in a medical-surgical ward, in the icu with a ventilator, or in the icu without a ventilator. exceptions are the surf stanford medicine and northeastern university models, which consider different patient pathways and provide the option to set a different los for each pathway. most models further simplify the complexity of hospital patient flow by using average los. however, the los used in cornell's c5v is set randomly according to a uniform distribution with minimum and maximum parameters set by the user. the c5v also includes fatality ratios with los adjustments for survivors versus those who die, and accommodates different los for critical care patients who do or do not require a ventilator. on-screen output with tabular data and graphs is available for all 6 models. considering that many users may wish to perform additional analyses or generate reports, the majority of the tools include options to download output to a spreadsheet. the download option for the university of toronto's caic-rt model is a .pdf report, whereas the surf stanford medicine model provides tabular data and graphs on screen only.
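the census bookkeeping these tools share is simple: under the fixed average-los simplification described above, a patient admitted on day d occupies a bed through day d + los − 1. a hedged sketch of that convolution (our own code, not any reviewed tool's):

```python
def daily_census(daily_admissions, los_days):
    """hospital occupancy implied by an admissions curve and a fixed
    average length of stay: each admission occupies a bed for
    `los_days` consecutive days starting on the admission day."""
    horizon = len(daily_admissions)
    census = [0.0] * horizon
    for day, admitted in enumerate(daily_admissions):
        for occupied in range(day, min(day + los_days, horizon)):
            census[occupied] += admitted
    return census
```

running this separately per pathway (medical-surgical, icu with ventilator, icu without) with pathway-specific los values reproduces the more detailed patient-flow option some of the models offer.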
the models from the cdc, cornell university, and northeastern university provide the user with the option to download a spreadsheet with output that includes both tabular data and graphs, whereas penn's chime model provides the option to download tabular data without graphs. the main goal of this review was to identify models that can project both covid-19 caseload and surge capacity requirements over time for hospital-level analysis with parameters including los, occupancy, and ventilator capacity. we provided detailed documentation of the input parameters and key features, and explained the output that can be produced from each model. we also provided a comparison table to highlight the similarities and differences of the models that are available to assist in this planning process. the details provided in this review may help physicians, hospital administrators, emergency response personnel, and governmental agencies. there are other existing models that are useful but did not meet the inclusion criteria or the exclusion criteria for this study. for example, the university of washington's institute for health metrics and evaluation (ihme) model 27 is widely used for covid-19 projections. the ihme model supports analysis at the country or state level, whereas the models reviewed in this paper can specify a population served by a hospital, hospital network, region, state, or nation. the ihme model is also different in that it does not have the option to enter all the user-defined parameters that the models in this review include, especially those relating to hospital los and capacity. without a vaccine for covid-19, communities around the world are sheltered in place and engaged in physical distancing to try to reduce the spread of covid-19. naturally, with the emphasis on distancing and other non-pharmaceutical interventions (npis), there are also many models emerging that focus on npis. some of the models included in this review include npis.
however, there are other models that focus more heavily on npis than the models included in this review do. for example, the covid-19 international modelling (como) consortium model considers many different npis, including handwashing, working at home, school closures, an international travel ban, vaccination, shielding the elderly, and self-isolation. 28 it also models health care capacity but does not yet have a complete description in a working paper. the models included in this systematic review can help predict and prevent health system capacity constraints by estimating hospital bed and ventilator requirements before they reach a crisis point. because of shortages of critical health care resources, including ppe, some of the models included in this review also include a ppe calculator. these and future models may buttress the global health care supply chain's preparedness for challenges caused by the first wave and potentially subsequent waves of covid-19. the covid-19 pandemic continues to create extraordinary challenges for hospital leaders. in this systematic review, we identified and reviewed surge capacity planning models that handle both caseload projection and hospital capacity management for this novel pandemic disease. these models have key differences: some can be used for a longer planning horizon, some have age-stratified parameters, and some incorporate different patient pathways and more detailed patient flow. an enhanced understanding of model similarities and differences may help physicians, hospital administrators, emergency response personnel, and public health agencies determine which existing models are appropriate for their use. these models help users quantify resource requirements over time for a particular set of scenarios, providing a quantitative way to describe complex health system capacity constraints under covid-19.
the crucial problem now facing health systems worldwide is to determine how to construct and operate the complex supply chain needed to create the required resources.
references:
detail/30-01-2020-statement-on-the-secondmeeting-of-the-international-health-regulations
world health organization. who director-general's opening remarks at the media briefing on covid-19-11
covid-19 dashboard
transforming ors into icus
how new york's coronavirus response made the pandemic worse
characterization and clinical course of 1000 patients with coronavirus disease 2019 in new york: retrospective case series
new york hospitals face another covid-19 equipment shortage: dialysis machines. marketwatch
will protests set off a second viral wave? the new york times
preferred reporting items for systematic reviews and meta-analyses: the prisma statement
cornell covid caseload calculator with capacity and ventilators
modeling hospital capacity for a novel pandemic: development, validation, and lessons of the cornell covid caseload calculator
severe outcomes among patients with coronavirus disease 2019 (covid-19) - united states
covid-19 acute and intensive care resource tool (caic-rt)
estimating the maximum capacity of covid-19 cases manageable per day given a health care system's constrained resources
covid-19 hospital impact model for epidemics (chime)
locally informed simulation to predict hospital capacity needs during the covid-19 pandemic
covid-19 icu and floor projection
a model to estimate bed demand for covid-19 related hospitalization. medrxiv. 2020;epub
covid-19surge. centers for disease control and prevention
hospital surge capacity bed, equipment, and staff demand planning model
a hospital surge capacity bed, equipment, and staff demand planning model
pandemic surge models in the time of severe acute respiratory syndrome coronavirus-2: wrong or useful?
- clinical course and risk factors for mortality of adult inpatients with covid-19 in wuhan, china: a retrospective cohort study
- case-fatality rate and characteristics of patients dying in relation to covid-19 in italy
- presenting characteristics, comorbidities, and outcomes among 5700 patients hospitalized with covid-19 in the new york city area
- covid-19 pandemic modelling in context: uniting people and technology across nations

the authors have no conflicts of interest to declare. to view supplementary material for this article, please visit https://doi.org/10.1017/dmp.2020.332

key: cord-352431-yu7kxnab authors: langbeheim, elon; perl, david; yerushalmi, edit title: science teachers' attitudes towards computational modeling in the context of an inquiry-based learning module date: 2020-08-25 journal: j sci educ technol doi: 10.1007/s10956-020-09855-3 sha: doc_id: 352431 cord_uid: yu7kxnab this study focuses on science teachers' first encounter with computational modeling in professional development workshops. it examines the factors shaping the teachers' self-efficacy and attitudes towards integrating computational modeling within inquiry-based learning modules for 9th grade physics. the learning modules introduce phenomena, the analysis of measurement data, and offer a method for coordinating the experimental findings with a theory-based computational model. teachers' attitudes and self-efficacy were studied using survey questions and workshop activity transcripts. as expected, prior experience in physics teaching was related to teachers' self-efficacy in teaching physics in 9th grade. also, teachers' prior experience with programming was strongly related to their self-efficacy regarding the programming component of model construction. surprisingly, the short interaction with computational modeling increased the group's self-efficacy, and the average rating of understanding and enjoyment was similar among teachers with and without prior programming experience. 
qualitative data provides additional insights into teachers' predispositions towards the integration of computational modeling into physics teaching. electronic supplementary material: the online version of this article (10.1007/s10956-020-09855-3) contains supplementary material, which is available to authorized users.

al harrison: "maybe we've been thinking about this all wrong."
paul stafford: "how's that?"
al: "maybe it's not new math at all."
katherine johnson: "it could be old math, something that looks at the problem numerically and not theoretically… math is always dependable."
al: "for you it is." (leaving the room)
katherine: "euler's method!"
paul: "euler's method? but that's ancient!"
katherine: "it is ancient, but it works…"
(scene from theodore melfi's film "hidden figures," 2016)

this excerpt describes the breakthrough of the nasa team working on the calculation of the re-entry coordinates for john glenn's first orbital flight outside the atmosphere. when faced with a problem that could not be solved analytically, katherine johnson, the woman who worked as the team "calculator," suggested using the euler method - a step-by-step calculation for solving the differential equations of motion (develaki 2019). today, the euler method is implemented in computer models that are investigated in introductory physics courses (e.g., chabay and sherwood 2008) and allows educators to expand the scope of problems that students can tackle. developing and using computational models reflects an important goal for k-12 science education, within a range of scientific practices such as designing and employing experimental investigations, analyzing measurement data and communicating them (nrc 2012). indeed, there is growing evidence of classroom inquiry that entails a dual focus on experimental measurement and computational modeling in primary and secondary schools (e.g., farris et al. 2019; fuhrmann et al. 2018). 
the extent to which teachers adopt inquiry-based science teaching practices may be related to their views of inquiry learning (lotter et al. 2007; osborne 2014; blanchard et al. 2009), their subject matter knowledge (tseng et al. 2013) and their teaching self-efficacy (lakshmanan et al. 2011). however, research on teachers' attitudes towards integration of computational models in the context of experimental investigations is scarce (gerard et al. 2011), and thus teachers' implementation of instruction that involves both experimental classroom inquiry and construction of computational models remains an open field of research. in this paper, we describe science teachers' engagement with inquiry-based curricular modules for 9th grade physics that introduce computational models. the modules include a structured, inquiry-based experimental component, and a theoretical component in which computational models are used to verify the experimental observations. drawing on studies that address teachers' attitudes and self-efficacy as indicators for implementation of inquiry teaching, we ask: how do teachers' prior experiences in teaching physics influence their self-efficacy and attitudes towards inquiry-based learning practices? specifically, in relation to computational modeling, we ask: how does teachers' prior involvement with programming influence their self-efficacy in, and experience of, computational modeling?

coordinating theory and experimental evidence: a model-based perspective

a scientific investigation integrates two endeavors: the formation or revision of predictive theoretical constructs and the design of experiments that produce reliable observations or measurements (klahr and dunbar 1988). in studying the development of scientific practices among students, researchers can focus on practices of experimental design such as the selection and control of variables (chen and klahr 1999) or the evaluation of measured data (allie et al. 
1998; kanari and millar 2004), or they can study how students use and conceptualize theoretical ideas. in practice, the two activities are difficult to separate. specifically, unsubstantiated hypotheses, or "implicit theories" (kuhn 2011), have a strong influence on students' planning and interpretation of experiments. conversely, flawed experimental design, or misinterpretations of measurements, can shape theoretical misconceptions. judging experimental results as implausible in light of learners' "naïve" theories is a crucial aspect of the theory-experiment interaction (kuhn 2011). in many introductory college and high school courses, the interaction between theory and experiment plays out in structured labs designed to "prove" or concretize theoretical statements. although the purpose of these labs is to substantiate the theory students learn in lecture courses, studies show that they do not contribute to students' content knowledge, as measured by test performance (holmes and wieman 2018). moreover, students' perceptions of their own expertise in conducting experiments tend to decline during such structured laboratory courses (wilcox and lewandowski 2018). an alternative is the inquiry or design lab that entails discovery experiments that extend the theoretical knowledge students acquired in prior learning. for example, in the investigative science learning environment (isle) curriculum (etkina et al. 2006a), students design experiments to develop and then test model-based explanations of observed phenomena. this approach brings the design of experiments and the analysis of measurement data to the fore, and often challenges students' prior knowledge. to do so, an extended amount of time is allocated to judging the reliability of measurement data. educational research shows that in these transformed labs, there is no decline in students' perceptions of their expertise in conducting experiments as in the structured lab (wilcox and lewandowski 2018). 
a specific case of coordinating theory and experiments is the construction and evaluation of models. models are simplified, and often abstracted, quantitative representations of the system under investigation, constructed to describe, explain or predict the system's behavior (etkina et al. 2006b). in introductory physics courses, models are usually instantiations of general theoretical principles such as newton's laws, in generic systems such as orbital motion under the influence of a centripetal force (halloun and hestenes 1987). recently, researchers who design upper-level laboratories suggested a new modeling framework, in which model construction, evaluation and revision is utilized for two inter-connected modeling cycles: the measurement model cycle and the theoretical model cycle (zwickl et al. 2015; dounas-frazer and lewandowski 2018). this framework provides two possible reactions to a misalignment of theory-based model and measurements: revising the measurement model - the instruments used or the interpretation of the measurement data - or revising the theoretical model. while this framework has been suggested for upper-level undergraduate labs that often involve sophisticated measurement equipment, it is a useful construct for guiding scientific investigations at various levels. it suggests designing investigations with explicit awareness of both the judgment of measurement data and the limitations of the theory-based, physical system models. the measurement and/or the theoretical models can be revised to achieve better alignment of theory and the behavior of the actual physical system. while most examples of theoretical system models are mathematical/analytical (e.g., dounas-frazer and lewandowski 2018), these models can also be realized using computer simulations, as we will show below. one approach uses ready-made simulations (de jong et al. 2013). in this approach, students manipulate parameters of the simulated system to learn about its mechanism. 
the simulations are introduced as substitutes for the real system, and the interaction with them resembles a structured laboratory experiment. for example, students investigate a simulation of collisions and compare the velocities before and after the collision to deduce the law of conservation of momentum (de jong et al. 1999). a different educational approach is to engage students in constructing the computational model (vanlehn 2013; wagh and wilensky 2018). engaging in model construction requires learning environments that allow students to add or change model features. the students need to plan how to add or change certain aspects of the model, and then run the computational model and compare the outcome vis-à-vis other realizations of the theory or experimental results. for example, in a computational model of a falling paratrooper, students decide whether the magnitude of the drag force depends on the velocity or not. they then examine whether their definition of the drag force yields a velocity pattern that reaches a relatively low terminal velocity as expected (develaki 2017). constructing computational models, and especially accessing their code, has been explored in secondary school physics (aiken et al. 2013; sherin et al. 1993; langbeheim et al. 2019; hutchins et al. 2020), and in a few introductory college texts (redish and wilson 1993; chabay and sherwood 2008; orban et al. 2018). however, there are only a few examples of using simulations side by side with experiments. for example, farris et al. (2019) investigated how 4th graders learn the concepts of speed and acceleration by measuring the motion of toy cars and building models of motion in a computational environment. despite its basic level of content, this study shows that even young students are able to think about variations in measurement, and relate real-world observations to computational models. similarly, in fuhrmann et al. 
(2018) students studied the osmosis of sugar water through an egg membrane experimentally and then explored a computer, particle-based model that mimics the experimental results. they found that both exploring a ready-made model and designing a model in this approach foster the development of conceptual knowledge about the topic. however, only actively designing the model results in a nuanced understanding of the role of models in scientific investigations. this finding emphasizes the epistemic importance of constructing models, compared to just exploring them. engaging students in building computational models reflects the actual practice of scientists (develaki 2017; winsberg 2010) and contributes to the development of creativity and computational thinking (hutchins et al. 2020; popat and starkey 2019). however, letting students construct models, especially when construction involves understanding and manipulating computer code, is a much greater leap for educators. in order to guide students in understanding and modifying code, teachers need to become proficient in programming themselves and to produce new learning materials for their science and mathematics classrooms. this requires teachers and students to transition from being users of computer programs to being builders of programs, which can be viewed as a substantial change in computing culture (ben-david kolikant and ben-ari 2008) or an educational paradigm shift (disessa 2001). implementing inquiry-based teaching that involves designing experiments, interpreting measurement data, and constructing models is challenging. in this study, we consider two main factors that determine the extent to which teachers implement such teaching in their classrooms: (1) teachers' self-efficacy in teaching the subject matter (e.g., lakshmanan et al. 2011); (2) teachers' attitudes regarding inquiry-based learning and their beliefs in its effectiveness (blanchard et al. 2009). 
self-efficacy is a set of beliefs regarding one's capabilities to perform well in a certain field (bandura 1997). these beliefs are one of the main driving forces that motivate people to put in effort and pursue challenging tasks. self-efficacy beliefs are dynamic, especially while learning new skills or beginning new professional endeavors. in the preliminary acquisition of an expertise, there are several factors influencing self-efficacy, such as observing others demonstrating successful performances, and most importantly, one's own mastery experiences. mastery experiences are successful performances of tasks, such as conducting an effective lesson or lesson sequence. teaching self-efficacy determines teachers' beliefs about their competence and performance in teaching (tschannen-moran et al. 1998). teachers' years of experience in teaching are significantly correlated with their teaching self-efficacy (tschannen-moran and hoy 2007) but not with their tendency to implement inquiry-based teaching (marshall et al. 2009). although physics is considered a challenging topic, studies of self-efficacy in teaching physics are very rare (e.g., tanel 2013). the second factor influencing implementation of inquiry-based teaching is teachers' beliefs regarding the importance and fruitfulness of this type of teaching. teachers who were less inclined to believe that inquiry-based teaching methods lead to effective student learning were less likely to implement them in their lessons (lakshmanan et al. 2011). similarly, teachers who had meaningful views of the pedagogical approaches to inquiry, grounded in learning theories, were more likely to adopt inquiry-based learning practices in their classrooms (blanchard et al. 2009). reformed teaching pedagogies and curricula are usually introduced in professional development (pd) workshops, so that successful implementations can be related to the teachers' backgrounds and/or to the pd process. 
the structure, duration, and content of pds are important factors influencing the implementation of reformed instruction and curricular innovations. a survey study showed that pds that focus on content knowledge and provide opportunities for active learning have the strongest positive effect on teachers' self-reported increase in knowledge and skills (garet et al. 2001). moreover, pds that include mastery experiences have the strongest effect on teachers' self-efficacy (tschannen-moran and mcmaster 2009) and consequently on classroom implementation (penuel et al. 2007). there are only a few examples of teacher training programs that include computer programming (e.g., repenning et al. 2020). such courses are necessary for boosting teachers' self-efficacy in computational modeling and knowledge of computational thinking (papadakis and kalogiannakis 2019). a growing number of studies indicate that elementary and secondary school students are able to construct computational models and to modify their code in the context of physics (e.g., aiken et al. 2013; langbeheim et al. 2019; farris et al. 2019), but less is known about the views of in-service science teachers regarding adopting computational modeling that involves manipulation of computer code. therefore, the goal of this study is to examine science teachers' attitudes towards introducing computational model construction in the context of inquiry-based learning in physics. our main research questions are:
1. how do teachers' prior experiences in teaching physics influence their self-efficacy and attitudes towards inquiry-based learning practices in a pd workshop?
2. how does teachers' prior involvement with programming influence their self-efficacy in, and experience of, computational modeling that involves coding in a pd workshop? 
we investigated these questions in the context of workshops that introduced two curricular modules for 9th grade physics: the focus of the first is the oscillations of a spring-mass system, and the focus of the second is free-fall with air drag. each module begins with experimental investigations of motion patterns, and concludes with a theoretical section that compares the experimental measurements to the output of computational models. the experimental section of both modules comprised six 90-minute lessons that are summarized in table 1. the lessons are based on the following design guidelines:
a) capturing student attention - the main purpose of the initial lessons is to raise attention to aspects of the phenomenon through problematization (reiser 2004; phillips et al. 2017). in the first module, the theoretical exploration focuses on the "peculiar" lack of covariance of the period of the oscillating spring system and its amplitude. in the second, it focuses on the apparent discrepancy between the aristotelian view (that mass influences the falling rate of objects) and the galilean view (that it does not).
b) scaffolding measurement and data analysis practices - the measurement and data analysis methods are introduced in a guided manner in the initial lessons. in subsequent lessons, the teacher removes some of the direct instructions: first in asking students to plan and justify an experimental design to answer a common research question (e.g., how does the falling rate of paper cups change when increasing their mass?) and then in planning and justifying an experimental design to study a question that students generate by themselves (see table 1 for examples).
c) raising student accountability - students build an evaluation rubric for experimental investigations based on fabricated student reports, in order to develop awareness of reporting standards. then, they share their evaluation criteria to produce a unified rubric. 
finally, they use the consensus evaluation rubric to provide feedback on the presentations of their peers. after completing the experimental section, and sharing their inquiry projects using posters or presentations, the students proceed to study a theoretical model of the phenomenon. the theoretical section includes a qualitative, conceptual section and a quantitative computational section. the qualitative, conceptual section comprises two lessons in which students construct a paper-and-pencil theoretical model that explains the phenomenon that they investigated experimentally. for example, in the oscillating spring-mass unit, students analyze the trace of the oscillating mass, using motion patterns produced by the tracker video analysis software (brown and cox 2009). their analysis leads to the discovery that the maximal speed of the mass corresponds to the amplitude, or the distance covered by the mass during one period of oscillation. this finding is used to hypothesize qualitatively that the length of the motion path and the average speed of the mass cancel each other out, and therefore their ratio - the period of one oscillation - is constant. the purpose of the final unit in the module is to introduce a quantitative approach for comparing the measurements and the theory-based model. the analytical equations of motion of both processes we discussed - harmonic oscillations and falling with a drag force - are nonlinear and involve mathematics beyond that of 9th grade. in order to overcome the mathematical complexity, we introduced the theoretical model using a computer program that calculates the trajectory using the euler method (cromer 1981; develaki 2019). the computational models of motion with air drag and of the oscillating spring-mass systems were realized using trinket.io, a free online tool for building coding activities and courses. we chose this platform since it runs vpython - a 3d graphical package for python (scherer et al. 2000). 
python was chosen since it has replaced the introductory-level programming languages that were used in the past, such as pascal, basic, and fortran. figure 1 illustrates the trinket interface, which has the code on the left side and the graphic output on the right. the unit introduced all the necessary programming ideas, so that students (and teachers in the pd workshop) were not expected to have prior experience with vpython. the supplementary material provides detailed information on the introduction of the computational model. the euler method presents the theoretically predicted motion pattern using three linear equations, calculated repeatedly every time step. the first equation calculates the net force according to the current location or velocity of the object; the second calculates the new velocity of the object, based on newton's 2nd law; and the third updates the location of the object based on the new velocity. for example, the implementation of these three equations for the falling-with-air-drag problem is shown in lines 14-16 of the code shown in fig. 1. in addition to using relatively simple linear equations, this approach promotes the analysis of the dynamics of specific moments during objects' motion patterns, thereby facilitating the comprehension of the process as a whole (sherin et al. 1993; sherin 2001). the two modules were introduced in two summer pd workshops that spanned four full days and a total of 30 hours of contact. the workshops included an overview of the principles of inquiry-based learning, engaging in measurement cycles and in coordinating the experimental results with the computational model. the workshops were advertised as introducing inquiry for 9th grade physics, and did not present themselves as focusing on computational modeling. the first 2 days of the pd focused on experimental investigations, and the last 2 days focused on the related theory and computational modeling. 
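the three-step euler update described above can be sketched in plain python, without the vpython graphics; the function name and the parameter values below are illustrative and not taken from the module's actual code:

```python
# euler-method model of falling with quadratic air drag, following the three
# equations described in the text: (1) net force from the current velocity,
# (2) new velocity from newton's 2nd law, (3) new position from the new
# velocity. mass and drag coefficient are made-up illustrative values.

def simulate_fall(mass=0.005, drag_coeff=0.002, g=9.8, dt=0.001, t_end=3.0):
    y, v = 0.0, 0.0                 # position (m, downward positive), velocity (m/s)
    for _ in range(int(t_end / dt)):
        f_net = mass * g - drag_coeff * v**2    # (1) net force on the object
        v = v + (f_net / mass) * dt             # (2) newton's 2nd law
        y = y + v * dt                          # (3) update the position
    return y, v

y, v = simulate_fall()
v_terminal = (0.005 * 9.8 / 0.002) ** 0.5       # sqrt(m*g/b) for quadratic drag
print(f"v after 3 s: {v:.2f} m/s (terminal velocity: {v_terminal:.2f} m/s)")
```

as in the module, the simulated object approaches a terminal velocity; changing `drag_coeff`, or making the drag linear in `v`, changes the resulting motion pattern that students would compare against the tracker data.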
teachers worked in small groups and shared their experiences in discussions with the rest of the participants and the workshop instructors. of 49 participating teachers, 41 completed all of the activities. within this group, 26 teachers are considered "experienced teachers" - teachers who taught physics at the 9th grade level for at least 3 years, or at the high school level for at least 1 year (high-school teachers usually come with a bachelor in physics, and teach more physics in 1 year than 9th grade science teachers do in 3 years). fifteen teachers were considered "novice teachers" if they had taught physics in 9th grade for less than 3 years. three years is a common cutoff for differentiating experienced and novice teachers in educational research (e.g., tschannen-moran and hoy 2007). this group consisted of new teachers, and a few experienced teachers in other fields such as math, who were new to teaching physics topics.

table 1 (lesson sequence of the two modules: spring-mass oscillations / free-fall with air drag):
1. does the mass of an object influence its falling rate? the dispute between aristotle and galileo.
2. initial problematization of measurement: how to measure the period of oscillation of a mass on a spring when the motion fades? / why does the mass of paper cups influence their rate of falling whereas the mass of clay balls does not?
3. joint investigations, drawing conclusions despite measurement uncertainties: students systematically investigate the influence of amplitude on the period of oscillation of the mass-spring system / students systematically investigate the differences between falling rates of paper cups, while increasing their mass, and discover that the differences decrease with mass.
4. developing an evaluation rubric: students were given an empty evaluation rubric and three fictitious student reports. they suggested categories, and differentiated between performances on each category.
5. students chose a research question related to the period of oscillation of a spring-mass system (e.g., what happens when increasing the suspended mass) / related to falling motion (e.g., how does the diameter of a balloon influence its falling rate), suggested hypotheses and built an experimental setup to test their hypotheses.
6. peer evaluation: students present their posters/presentations to their peers, and peers provide written or oral feedback based on the evaluation rubrics.

the first research question asked how teachers' prior experiences in teaching physics influence their self-efficacy and attitudes towards inquiry-based learning practices in a pd workshop. to examine this question, we used the following data sources:
1a. physics teaching self-efficacy survey - before the workshop, we administered a self-efficacy survey with six likert-scale items of the same format: "i am confident in teaching quantitative problem solving / introducing computational models / textbook experiments, in 9th grade physics" on a scale of 1 to 5 (5 = strongly agree, 4 = agree, etc.) and a background survey indicating years of experience in teaching physics and experience in computational modeling. the internal consistency of the self-efficacy scale was high (cronbach α = 0.85).
1b. workshop appreciation survey - at the end of the workshop, we administered a feedback survey with six five-point likert-scale items such as "the workshop contributed to my ability to explain fundamental concepts such as newton's 1st law / textbook experiments etc." (strongly agree, agree, etc.) and open-ended items asking to state aspects of the workshop that they found significant and aspects that were lacking.
1c. attitudes towards physics lab goals - we administered a 15-item survey to evaluate teachers' beliefs about the goals their students should achieve in the physics lab. 
some of the items were adapted from the e-class (colorado learning attitudes about science survey for experimental physics; wilcox and lewandowski 2018). the e-class includes items such as "the primary purpose of doing physics experiments is to confirm previously known results" (strongly agree, agree, etc.). our adapted version was: "rate the following statements representing goals for students to achieve in the physics lab: to confirm the theory discussed in class." this statement reflects a common goal for the structured lab, while statements such as "to acquire tools for evaluating the work of peers" represent a goal of an inquiry-based lab. the survey utilized a 4-level likert scale. we used exploratory factor analysis to aggregate the fifteen statements into two categories: (a) goals of inquiry-based laboratory practices, and (b) goals of structured laboratory practices. the full list of items, the corresponding categories and internal consistency scores are shown in table 2.
2. in order to investigate the 2nd research question, regarding the influence of teachers' prior involvement with programming on their self-efficacy in, and experience of, computational modeling that involves coding in a pd workshop, we used the following data sources:
a. programming self-efficacy - before and after the computational modeling activity we administered a self-efficacy survey with three five-point likert-scale items such as: "i think i can understand / write and modify computer code" (strongly agree, agree, etc.). thirty-eight teachers responded to the pre and post self-efficacy items related specifically to the computational modeling activity. the internal consistency of the self-efficacy scale was substantial (cronbach α = 0.76), but the distribution of the ratings in the posttest was not normally distributed (shapiro-wilk w = 0.94, p = 0.03).

fig. 1: vpython code of the computational model (left), and graphical interface (right). the code includes two definitions of graphical objects: the "box" (line 9), which imports the motion diagram of the real object (marked in red dots), and a green sphere (defined as "s" in line 10), which moves according to the computational model. the green trace of the sphere is compared to the red pattern of dots that was produced from a video of a falling paper clip using the tracker software.

b. programming activity appreciation - three statements addressed the coding activity specifically: "the coding activity was interesting," "the coding activity improved my understanding of the physics of the spring-mass system / free-fall with drag," and "i understood everything i did in the coding activity" (strongly agree, agree, etc.). the internal consistency of the activity appreciation scale was substantial (cronbach α = 0.74). the full survey can be found in the online supplement. in addition, we used transcripts from the activity itself to corroborate our survey findings.

the self-efficacy related to teaching physics in 9th grade was relatively high before the workshop, with an average rating of 3.91 out of 5. the novice teachers (less than 3 years of teaching) had an average rating of 3.566, and the experienced teachers' average rating was 4.10. this difference is significant (t = −2.56, p = 0.017). the characterization of teachers' perceptions of the contribution of the workshop was based on two forms of data: likert-scale ratings and open-ended questions. the average ratings were calculated and analyzed according to the teachers' experience, as shown in table 3. both groups rated the contribution of computational modeling (item 6) and designing the experimental setup (item 3) higher than the rest of the topics. the contributions to two common aspects of teaching - conducting standard textbook experiments (item 2) and clarifying fundamental physics concepts (item 5) - were rated higher by the novice teachers, and these differences were marginally significant. 
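the internal-consistency coefficients reported above (cronbach's α) follow a simple formula: α = k/(k−1) · (1 − Σ item variances / variance of total scores). a dependency-free sketch with invented ratings, purely for illustration (rows are respondents, columns are survey items; none of this is the study's data):

```python
# cronbach's alpha for a set of likert-scale items.
# alpha = k/(k-1) * (1 - sum of per-item variances / variance of totals),
# using sample variance (n-1 denominator). the ratings are invented.

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(rows):
    k = len(rows[0])                               # number of items
    item_vars = sum(variance(list(col)) for col in zip(*rows))
    total_var = variance([sum(r) for r in rows])   # variance of respondent totals
    return k / (k - 1) * (1 - item_vars / total_var)

ratings = [
    [5, 4, 5], [4, 4, 4], [3, 3, 4],
    [5, 5, 5], [2, 3, 3], [4, 4, 5],
]
print(f"alpha = {cronbach_alpha(ratings):.2f}")
```

items that rise and fall together across respondents, as in the made-up data above, yield a high α, which is the sense in which the paper's scales (α = 0.85, 0.76, 0.74) are called internally consistent.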
in the other categories, the differences between the experienced and novice teachers were very small. also, an index measure combining all of the items reveals no significant difference between groups (t = −0.146, p = 0.88) or between workshops (t = −1.21, p = 0.23). the responses to the open question corroborate these findings. for example, in their suggestions for two or more aspects of the workshop that contributed to their teaching, vera, an experienced teacher, wrote: the inquiry module for ninth grade, structured and original, will greatly facilitate my work next year. i learned to use excel for analyzing an experimental graph and fitting it to an equation. the main contributing aspects were the parallel investigations and the discussions about classroom inquiry, its meaning and its implementation. also, the acquaintance with the step-by-step method using the software (although i'm not sure i would implement it in class).
table 2: the grouping of the items regarding the goals of the physics lab ("rate the importance of the following statements representing goals for students to achieve in the physics lab").
(a) inquiry-based laboratory goals (cronbach's α: 0.82 pretest, 0.79 posttest):
- "an active, hands-on acquaintance with physical systems"
- "acquire tools for evaluating the work of peers"
- "acquire tools to improve group work"
- "to present and report findings to peers (using presentations/posters)"
- "to identify sources of measurement error"
- "to make inferences from experimental findings"
- "to learn to improve the experimental setup to increase measurement accuracy"
- "to design an experimental setup to investigate the quantitative relation between two variables"
(b) structured laboratory goals (cronbach's α: 0.64 pretest, 0.81 posttest):
- "to illustrate and concretize the theory discussed in class"
- "to learn to write reports of experimental findings"
- "to confirm the theory discussed in class"
- "to learn methods for analyzing and presenting data"
- "to understand the role of experiments in building theories"
- "to work with the safety regulations related to equipment and materials in the lab"
- "to search and find information sources" (relevant to the experiment)
both teachers mentioned the structured inquiry sequence as a major contribution to their teaching. in relation to the computational tools, vera mentioned excel, which was used to analyze the experimental data, and ana mentioned the step-by-step computational modeling. overall, teachers mentioned five main components of the pd: the two most common topics, mentioned by 54% of the teachers, were the structured inquiry sequence and/or the computational modeling activity. in addition, 41% mentioned the measurement error analysis and/or the active learning format of the workshop. finally, 24% of the teachers mentioned the contribution of the workshop to their understanding of fundamental physics concepts. the differences in attitudes towards laboratory experiments before and after the pd were relatively minor, as shown in table 4. the teachers' rating of the importance of lab purposes associated with the traditional structured lab (category b) decreased during the workshop, while the ratings of goals related to inquiry practices slightly increased. however, a closer examination shows that these differences were not the same across groups. the ratings of inquiry lab goals among novice teachers increased, whereas the experienced teachers' ratings slightly decreased. this difference is marginally significant (p < 0.1). the average rating of programming self-efficacy before the activity was 3.11 (sd = 1.01), and at the end of the workshop it increased to 3.65 (sd = 0.93). a wilcoxon signed-rank test shows that the difference between pretest and posttest is significant (w = 930, p = 0.030). fifteen teachers stated that they had prior knowledge in programming, and 23 responded that they did not.
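the internal-consistency figures reported for these scales (cronbach's α = 0.74, 0.82, 0.64, and so on) follow the standard formula α = k/(k−1) · (1 − Σ s²_item / s²_total). a minimal numpy sketch of that computation, using invented likert ratings rather than the study's actual data:

```python
import numpy as np

def cronbach_alpha(items):
    """items: (respondents x items) matrix of likert scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # sample variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the summed scale
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

# hypothetical ratings of the three appreciation items by five teachers
ratings = [[4, 4, 3],
           [5, 4, 4],
           [3, 3, 2],
           [4, 5, 4],
           [2, 3, 2]]
alpha = cronbach_alpha(ratings)   # ≈ 0.92 for these invented, highly consistent scores
```

the closer the items track one another across respondents, the closer α gets to 1, which is why a 0.74 for a three-item scale is read as substantial.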
we found significant differences in self-efficacy, both before and after the activity, between teachers who declared prior knowledge in programming and teachers who did not. teachers with prior programming knowledge rated their self-efficacy significantly higher than teachers without such knowledge, both in the pretest (t = −3.49, p = 0.0013) and in the posttest (w = 80, p = 0.0055). a wilcoxon test shows that appreciation of the coding activity was similar among teachers with (m = 3.93, sd = 0.55) and without (m = 3.46, sd = 1.08) prior experience in programming (p = 0.239). we note that the standard deviation among teachers with no background in programming was larger than among teachers with a programming background. this variance is also evident in the boxplot in fig. 2 (left). a scatter plot of the activity appreciation ratings vs. the self-efficacy of the teachers is shown in fig. 2 (right). the red circle marks seven teachers with no prior programming experience and low self-efficacy prior to the activity who gave high ratings to the programming activity. in order to examine the variation in the appreciation of the activity among the teachers without prior programming knowledge, we present excerpts from the activity of two teachers: vera, an experienced physics teacher with over 20 years of teaching, and ana, a novice teacher in her first year of teaching. the two teachers were trying to modify the code in the first part of the computational modeling activity (see online supplement): v: if i want to do this, and place it (the graphical object) in a different initial location then… (trying to modify the code) it doesn't work. why didn't it change anything? oh! what have i done? i erased everything! a: maybe press here? v: no, here. but i didn't save, so now what? start over? a: no, do this (points to the "remix" button) to go back to the original. that is odd, it leaves the shapes, but not the code… v: i hate this!
i really, really, hate this entire thing. it's annoying. a: i really like (unclear) v: i like excel a: excel is really awesome v: it is helpful. it is the most convenient tool! vera is frustrated with the computational modeling activity because of the technical problem that erased the changes in her code. this incident results in her saying that she "really, really hates this entire thing." ana, on the other hand, responded with more patience, and even said she likes the modeling activity. they both also agreed that excel is a useful tool. vera also rated her appreciation of the activity lower than ana at the end of the workshop. the following excerpt indicates that while vera is an experienced physics teacher who is very confident in her physics teaching, she is insecure when it comes to computational modeling: instructor: how is it going? v: it's annoying… i cannot compete with them (the students) in this (points to the screen), they will defeat me. our students learn python programming in 8th grade... inst: but they cannot defeat you in physics… v: listen, when i teach them excel, i am the (authority)… vera is afraid that her authority will be undermined by the students who know more programming than she does. again, she mentions excel as software that she is confident in using. ana, on the other hand, is a new teacher and does not feel the tension between her confidence in teaching physics and introducing this new task. it also reminds her of something she seemed to like as a child. v: let's move on a: do you remember basic? (programming language), "if-then, goto" (functions in basic) v: i've heard of it, but never used it a: when i was 12 years old i was in an afterschool program (that introduced basic) the excerpt shows that ana had some experience in programming as a child, whereas vera had none.
this minor background detail might also explain their different approaches to computational modeling and indicates the importance of early exposure to computer programming. this study addressed two research questions. the first asked how prior experience in teaching physics influenced teaching self-efficacy and attitudes towards inquiry-based laboratory practices. the findings indicate that prior experience in teaching is strongly related to teachers' self-efficacy in teaching physics. this finding corroborates bandura's theory that self-efficacy emanates from mastery experiences of performance (bandura 1997). the significantly higher self-efficacy in teaching physics of the experienced teachers is also in line with prior research on the effect of experience on teaching self-efficacy (tschannen-moran and hoy 2007). [fig. 2 caption: appreciation of the activity was similar and positive among both teachers with prior programming experience and those without (left). scatter plot of the appreciation and the self-efficacy of the teachers before the activity (right).] while experience was related to self-efficacy, the overall appreciation of the learning modules was positive among both experienced and novice teachers. this is true for the appreciation of the experimental and computational aspects of the modules, and specifically in relation to the computational modeling activity. table 3 shows that for the two main aspects of the workshop, computational modeling and conducting structured inquiry (items 3 and 6), both groups responded similarly, and positively. the workshop's purpose was to introduce a novel approach for coordinating experimental measurements and theoretical, computational modeling. this approach differs from traditional structured labs that aim at illustrating or reinforcing theoretical concepts.
teachers' decrease in ratings of traditional goals, and the increase in ratings of inquiry practices related to the coordination of experimental results and theory, indicate that the workshop had some influence on attitudes towards the main goals of the lab. teachers' ratings of inquiry lab practices, such as designing an experimental setup, indicate differences between experienced and novice teachers that are marginally significant. the differences may be related to prior acquaintance with inquiry practices: the mastery experience in the workshop might have boosted the novice teachers' belief in the importance of these practices, but not that of the experienced teachers, who were already confident. this is similar to the differences in lab ratings of first-year undergraduates and students in more advanced years: first-year students had lower initial ratings than students in their second year or beyond (wilcox and lewandowski 2018). this difference probably reflects a process of self-selection: students or teachers who persist in their degree or teaching are more likely to have positive attitudes towards the importance of the practices that they acquired (wilcox and lewandowski 2018). the second research question addressed the relation between prior involvement with programming, self-efficacy, and attitudes towards the computational modeling activity. we found that the programming-related self-efficacy of most teachers increased during the workshop, although teachers with no programming experience were still less efficacious than the teachers with programming experience at the end of the workshop. in addition, the appreciation of the computational modeling activity was similar among teachers with or without programming experience, but the groups differed in the variation of their ratings, which was larger among the teachers without prior programming knowledge, as indicated in the error bars of the boxplot in fig. 2.
the transcript of the computational activity offers an explanation for this variation: it illustrates how two teachers without prior computer programming knowledge reacted differently to the computational activity. the conversation between the two teachers hints that the difference in appreciation may be rooted in different dispositions towards the adoption of new technologies: vera expressed resentment towards the computational modeling activity and contrasted it with her satisfaction with excel, a user-friendly interface for data visualization that was also used in the workshop. conversely, ana mentioned that the programming activity reminds her of a positive experience from the past. vera's reaction reflects the difficulty in switching from the "culture" of computer users of programs such as excel to the "culture" of computer programmers, which requires planning and debugging (kolikant and ben-ari 2008). one way to overcome such barriers is by introducing programming to children at a young age, to prevent the perpetuation of a culture focused on merely using ready-made computer artifacts rather than a culture that views the computer as a tool for developing new artifacts (disessa 2001). vera's comments might also indicate that she does not see the value of the computational model. it might be that she could benefit from a framing activity that would explicate the opportunities afforded by the computational modeling activity and its difference from excel. for example, the computational model uses symbols for the variables of force "f" and velocity "v" and allows one to perform calculations with them (see lines 14-15 in fig. 1). this type of representation is not possible in excel, nor is it possible to simulate motion in a dynamic manner. at the beginning of this paper, we referenced the "hidden figures" movie scene to introduce the computational euler method as a legitimate, and sometimes even necessary, substitute for analytical/mathematical modeling.
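the euler method referenced here advances velocity and position in small time steps from the net force. a minimal sketch for free fall with linear drag: the mass and drag coefficient below are invented for illustration and are not taken from the workshop code.

```python
# euler integration of free fall with linear drag: m*dv/dt = m*g - b*v
m, g, b = 0.0005, 9.8, 0.002   # invented: paper-clip-scale mass, drag coefficient
dt, t = 0.001, 0.0             # time step and clock (seconds)
y, v = 0.0, 0.0                # position (downward positive) and velocity

while t < 2.0:
    f = m * g - b * v          # net force, using symbols like the "f" and "v" in the code
    v = v + (f / m) * dt       # euler update of velocity
    y = y + v * dt             # euler update of position
    t = t + dt

terminal_v = m * g / b         # analytical terminal velocity, for comparison
```

after a couple of seconds the computed v settles at the terminal velocity m·g/b, which is the kind of model-data agreement the comparison of the green trace with the red dots makes visible.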
we believe that since computational modeling plays a central and indispensable role in scientific inquiry, this practice needs to be represented in school science, and the only way to do this is by investing in teacher pd. in this study, we examined science teachers' reactions to a pd workshop that introduced inquiry-based modules with a computational modeling component. we found that the workshop contributed to both experienced and novice teachers. specifically, we found that most teachers' attitudes towards the computational modeling were positive, and their self-efficacy beliefs regarding computational modeling increased. the main limitation of our study is that it does not describe the actual implementation of the modules by the teachers in their classrooms. this would be a subject for an additional study. nevertheless, we note that despite the positive reactions at the end of the workshop, only 25% of the teachers implemented the modules in their classrooms in the subsequent year. some of the teachers were unable to implement the modules because they did not teach 9th grade physics the following year. others did not implement the modules because they required a substantial amount of classroom time that the teachers preferred to dedicate to topics mandated by the standard 9th grade curriculum. similarly, farris et al. (2019) wrote that "evident in the teacher's instructional approach was the tension between supporting students' exploration (of the computational model) and the curricular need for the production of canonically correct representations… that are mandated by the curriculum and that they would need in standardized tests." we conclude that in order to encourage more teachers to implement such time-consuming, challenging learning modules, they should be included in national curricula and teacher training programs.
in our case, we found that the "free-fall" module, which is more aligned with the middle-school curriculum, has a better chance of being implemented than the oscillations module. another way that may be useful for streamlining the learning module is by embedding the programming activities in a learning management system such as moodle. in the workshop, teachers had to split their attention between a paper worksheet and the vpython code on the computer. in a revised version of the module that was created to support teachers in distance teaching during the covid-19 shutdown, we embedded the vpython code within an online worksheet in a learning management system. this places the programming section as an integral component of the learning unit, and can encourage students and teachers to use it. yet another way to encourage teachers to implement the modules is by maintaining a professional learning community of the implementing teachers. in a learning community, teachers meet regularly to discuss their successes and challenges in implementation during the school year. this supplements the summer workshop, where teachers experienced most of the learning module as students and did not have opportunities to rehearse it as teachers. at first, we used a model of individual teacher support that included planning sessions and classroom observations with the pd team. this model of support was only partially successful, since it prevented teachers from sharing stories and reflecting on their implementation challenges with their peers. promising results from studies of professional learning communities of teachers suggest that they have a better chance to induce changes in teaching and increase teaching self-efficacy (akerson et al. 2009; lakshmanan et al. 2011). conflict of interest: the authors declare that they have no conflicts of interest.
ethical statement: this research was conducted in compliance with the rules mandated by our institution's internal review board (irb) and received the approval of the chief scientist of the ministry of education in our country. consent statement: all of the data sources were consenting adults. all of the subjects in this study were teachers who responded to the surveys voluntarily and agreed to be video recorded.
references (titles as extracted):
- understanding student computational thinking with computational modeling
- fostering a community of practice through a professional development program to improve elementary teachers' views of nature of science and teaching practice
- first-year physics students' perceptions of the quality of experimental measurements
- self-efficacy: the exercise of control
- no silver bullet for inquiry: making sense of teacher change following an inquiry-based research experience for teachers
- innovative uses of video analysis (the physics teacher)
- computational physics in the introductory calculus-based course
- all other things being equal: children's acquisition of the control of variables strategy
- stable solutions using the euler approximation
- the integration of computer simulation and learning support: an example from the physics domain of collisions
- physical and virtual laboratories in science and engineering education
- using computer simulations for promoting model-based reasoning: epistemological and educational dimensions
- methodology and epistemology of computer simulations and implications for science education
- changing minds: computers, learning, and literacy
- the modelling framework for experimental physics: description, development, and applications
- using introductory labs to engage students in experimental design
- the role of models in physics instruction
- learning to interpret measurement and motion in fourth grade computational modeling
- should students design or interact with models? using the bifocal modelling framework to investigate model construction in high school science
- what makes professional development effective? results from a national sample of teachers
- professional development for technology-enhanced inquiry science
- modeling instruction in mechanics
- introductory physics labs: we can do better
- c2stem: a system for synergistic learning of physics and computational thinking
- reasoning from data: how students collect and interpret data in science investigations
- dual space search during scientific reasoning
- fertile zones of cultural encounter in computer science education
- what is scientific thinking and how does it develop
- the impact of science content and professional learning communities on science teaching efficacy and standards-based instruction
- extending the boundaries of high school physics: talented students develop computer models of physical processes in matter
- the influence of core teaching conceptions on teachers' use of inquiry teaching practices
- k-12 science and mathematics teachers' beliefs about and use of inquiry in the classroom
- a hybrid approach for using programming exercises in introductory physics
- teaching scientific practices: meeting the challenge of change
- evaluating a course for teaching introductory programming with scratch to pre-service kindergarten teachers
- what makes professional development effective? strategies that foster curriculum implementation
- problematizing as a scientific endeavor
- learning to code or coding to learn? a systematic review
- student programming in the introductory physics course: muppet
- scaffolding complex learning: the mechanisms of structuring and problematizing student work
- the rise of the digital polymath: switzerland is crossing the computer science education chasm through mandatory elementary pre-service teacher education
- vpython: 3d interactive scientific graphics for students
- a comparison of programming languages and algebraic notation as expressive languages for physics
- dynaturtle revisited: learning physics through collaborative design of a computer model
- prospective physics teachers' self-efficacy beliefs about teaching and conceptual understandings for the subjects of force and motion
- the differential antecedents of self-efficacy beliefs of novice and experienced teachers
- sources of self-efficacy: four pd formats and their relationship to self-efficacy and implementation of a new teaching strategy
- teacher efficacy: its meaning and measure
- how to help teachers develop inquiry teaching: perspectives from experienced science teachers
- model construction as a learning activity: a design space and review
- evobuild: a quickstart toolkit for programming agent-based models of evolutionary processes
- a summary of research-based assessment of students' beliefs about the nature of experimental physics
- science in the age of computer simulation
- model-based reasoning in the physics laboratory: framework and initial results
publisher's note: springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
key: cord-347906-3ehsg8oi authors: zhang, zizhen; zeb, anwar; hussain, sultan; alzahrani, ebraheem title: dynamics of covid-19 mathematical model with stochastic perturbation date: 2020-08-28 journal: adv differ equ doi: 10.1186/s13662-020-02909-1 sha: doc_id: 347906 cord_uid: 3ehsg8oi acknowledging many effects on humans, which are ignored in deterministic models for covid-19, in this paper we consider a stochastic
mathematical model for covid-19. firstly, the formulation of a stochastic susceptible–infected–recovered model is presented. secondly, we devote our full attention to sufficient conditions for extinction and persistence. thirdly, we examine the threshold of the proposed stochastic covid-19 model when the noise is small or large. finally, we show the numerical simulations graphically using matlab. many researchers have applied the sir model for control of this pandemic, but no one until now has been able to control this virus. making the contact rates very small has the best effect on the further spreading of covid-19, and for this purpose governments have taken actions such as household confinement. for the estimation of the final size of the coronavirus epidemic, batista [3] presented the logistic growth regression model. many researchers have discussed covid-19 with different models of integer and fractional order, see [1-17], because of the many applications of fractional calculus, stochastic modeling and bifurcation analysis [18-26]. for more realistic models, several authors have studied stochastic models by introducing white noise [27-31]. the effects of the environment in the aids model were studied by dalal et al. [27] using the method of parameter perturbation. stochastic models will likely produce different results every time the model is run for the same parameters, since they possess some inherent randomness: unlike deterministic models, the same set of parameter values and initial conditions will lead to an ensemble of different outputs. tornatore et al. [28-30] studied stochastic epidemic models with vaccination. in this work, they proved the existence, uniqueness, and positivity of the solution. a stochastic sis epidemic model containing vaccination is discussed by zhu et al.
[31]. they obtained the conditions for disease extinction and persistence according to the noise and the threshold of the deterministic system. similarly, several authors discussed the same conditions for stochastic models; see [32-39]. to study the effects of the environment on the spreading of covid-19 and make the research more realistic, we first formulate a stochastic mathematical covid-19 model. then sufficient conditions for extinction and persistence are examined. furthermore, the threshold of the proposed stochastic covid-19 model is determined; it plays an important role in mathematical models, as a backbone, when there is small or large noise. finally, we show the numerical simulations graphically with the aid of matlab. the rest of the paper is organized as follows: sect. 2 is concerned with the formulation of the covid-19 model with random perturbation. section 3 is related to the unique positive solution of the proposed model. furthermore, we investigate the exponential stability of the proposed model in sect. 4. the persistence conditions are shown in sect. 5. finally, we conclude with the results and outcomes of the paper in sect. 6. in this section, a covid-19 mathematical model with random perturbation is formulated as system (1), where the descriptions of the parameters and variables are given in table 1. in deterministic form the model (1) is given by system (2); moreover, the solution has a positivity property. for the stability analysis of model (2), we have the reproductive number r 0. if r 0 < 1, then system (2) will be locally stable, and unstable if r 0 ≥ 1. similarly, for = 0, the system (2) will be globally asymptotically stable. here, we first make the following assumption: the underlying filtration satisfies the usual conditions. generally, consider an n-dimensional stochastic differential equation with initial value y(t 0) = y 0 ∈ r d. we define the differential operator l associated with eq.
(6). if the operator l acts on a function v ∈ c 2 (r d × r +; r +), then lv is defined accordingly, and the solution will remain in r 3 +, with probability 1. proof: since the coefficients of the differential equations of system (1) are locally lipschitz continuous for (s(0), i(0), r(0)), there is a unique local solution on [0, τ e), where τ e is the explosion time (see [6]). to demonstrate that the solution is global, it is sufficient to show that τ e = ∞ a.s. suppose that k 0 ≥ 0 is sufficiently large so that s(0), i(0) and r(0) all lie within [1/k 0, k 0]. for each integer k ≥ k 0, define the stopping time τ k, where we set inf ∅ = ∞ throughout the paper. as k → ∞, τ k is clearly increasing. set τ ∞ = lim k→∞ τ k, so that τ ∞ ≤ τ e. if we can show that τ ∞ = ∞ a.s., then τ e = ∞. if this is false, then there is a pair of constants t > 0 and ε ∈ (0, 1), and hence an integer k 1 ≥ k 0, satisfying the corresponding probability bound. by applying the itô formula, we obtain an estimate in terms of lv : r 3 + → r +. by choosing c = (γ + μ)/β, the remainder of the proof follows ji et al. [31]. in this section, we investigate the conditions for extinction of the spread of the coronavirus. here, we first state a useful lemma concerned with this work. proof: performing the integration of system (1), taking logarithms and letting t → ∞, and substituting the value of s(t) from eq. (17), conclusion (16) is proved whenever condition (2) is satisfied. next, according to inequality (19), conclusion (15) is proved whenever condition (1) is satisfied. according to (15) and (16), lim t→∞ i(t) = 0. now, from the third equation of system (1), applying l'hôpital's rule to the previous result and using eq. (4), the proof is complete. this section concerns the persistence of system (1). theorem 5.1: suppose that μ > ρ 2 /2. let (s(t), i(t), r(t)) be any solution of model (1) with initial conditions (s(0), i(0), r(0)) ∈ r 3 +. if r̃ > 1, then the disease is persistent in the mean. proof: we apply the limit, using eq. (17); furthermore, letting t → ∞, the proof is complete.
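the extinction and persistence regimes established above can be explored by direct simulation. the sketch below discretizes a stochastic sir system of this general form (ds = (λ − βsi − μs)dt − ρsi db, di = (βsi − (γ+μ)i)dt + ρsi db, dr = (γi − μr)dt) with a first-order euler–maruyama step; this is a plausible reading of system (1) with invented parameter values, not the authors' matlab code, which uses milstein's higher-order scheme:

```python
import math
import random

random.seed(1)

# invented values for the symbols of table 1 (recruitment, contact,
# natural death, recovery, and noise-intensity rates)
lam, beta, mu, gamma, rho = 0.2, 0.5, 0.1, 0.15, 0.05
dt, steps = 0.01, 10_000
s, i, r = 0.8, 0.2, 0.0

for _ in range(steps):
    db = random.gauss(0.0, math.sqrt(dt))            # brownian increment
    ds = (lam - beta * s * i - mu * s) * dt - rho * s * i * db
    di = (beta * s * i - (gamma + mu) * i) * dt + rho * s * i * db
    dr = (gamma * i - mu * r) * dt
    s, i, r = s + ds, i + di, r + dr

# the noise terms cancel in s + i + r, so the total population obeys
# the deterministic equation n' = lam - mu * n and tends to lam/mu
```

with these invented values the persistence regime applies (small noise, threshold above 1), so the infection settles around an endemic level instead of dying out; shrinking beta or enlarging rho pushes the sample paths toward extinction.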
for the illustration of our obtained results, we use the values of the parameters and variables given in table 2. for the numerical simulation, we use milstein's higher-order method [40]. the results obtained through this method are shown graphically in fig. 1 for both the deterministic and stochastic forms. in this work, the formulation of a stochastic covid-19 mathematical model is presented. sufficient conditions are determined for extinction and persistence. furthermore, we discussed the threshold of the proposed stochastic model when there is small or large noise. finally, we showed numerical simulations graphically with the help of the software matlab. the conclusions obtained are that the spread of covid-19 will be under control if r̃ < 1 and ρ 2 ≤ βμ/λ, which means that the white noise is not large, while a value of r̃ > 1 will lead to the prevailing of covid-19.
references (titles as extracted):
- breaking down of the healthcare system: mathematical modelling for controlling the novel coronavirus (2019-ncov) outbreak in wuhan china
- statistics-based predictions of coronavirus epidemic spreading in mainland china
- estimation of the final size of the coronavirus epidemic by sir model
- mathematical predictions for coronavirus as a global pandemic
- a pneumonia outbreak associated with a new coronavirus of probable bat origin
- pneumonia of unknown aetiology in wuhan, china: potential for international spread via commercial air travel
- outbreak of pneumonia of unknown etiology in wuhan china: the mystery and the miracle
- cross species transmission of the newly identified coronavirus 2019 cov
- early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia
- statistical analysis of forecasting covid-19 for upcoming month in pakistan
- study of transmission dynamics of novel covid-19 by using mathematical model
- dynamic analysis of a mathematical model with health care capacity for covid-19 pandemic
- on a comprehensive model of the novel coronavirus (covid-19) under mittag-leffler derivative
- modeling the dynamics of novel coronavirus (2019-ncov) with fractional derivative
- mathematical model for coronavirus disease 2019 (covid-19) containing isolation class
- qualitative analysis of a mathematical model in the time of covid-19
- computational and theoretical modeling of the transmission dynamics of novel covid-19 under mittag-leffler power law
- dynamical behavior for a stochastic two-species competitive model
- a new method to investigate almost periodic solutions for an nicholson's blowflies model with time-varying delays and a linear harvesting term
- bifurcation of a fractional-order delayed malware propagation model in social networks
- on p-th moment exponential stability for stochastic cellular neural networks with distributed delays
- bifurcation control of a fractional-order delayed competition and cooperation model of two enterprises
- periodic property and asymptotic behavior for a discrete ratio-dependent food-chain system with delay
- blind in a commutative world: simple illustrations with functions and chaotic attractors
- fractional discretization: the african's tortoise walk
- fractal-fractional differentiation and integration: connecting fractal calculus and fractional calculus to predict complex system
- a stochastic model for internal hiv dynamics
- on a stochastic disease model with vaccination
- sivr epidemic model with stochastic perturbation
- stability of a stochastic sir system
- a stochastic sir epidemic model with density dependent birth rate
- the extinction and persistence of the stochastic sis epidemic model with vaccination
- the behavior of an sir epidemic model with stochastic perturbation
- a stochastic differential equation sis epidemic model
- the threshold of a stochastic sis epidemic model with vaccination
- the threshold of a stochastic sirs epidemic model with saturated incidence
- multigroup sir epidemic model with stochastic perturbation
- extinction and stationary distribution of a stochastic sirs epidemic model with non-linear incidence
- stochastic differential equations and applications
- an algorithmic introduction to numerical simulation of stochastic differential equations
acknowledgements: this research was supported by higher education institutions of anhui province (no. kj2020a0002). the authors confirm that the data supporting the findings of this study are available within the article and the works cited therein. the authors declare that there is no conflict of interest regarding the publication of this paper. the authors have contributed equally to preparing this manuscript. all authors read and approved the final manuscript. springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. received: 14 july 2020; accepted: 17 august 2020
key: cord-344252-6g3zzj0o authors: farooq, junaid; bazaz, muhammad abid title: a novel adaptive deep learning model of covid-19 with focus on mortality reduction strategies date: 2020-07-21 journal: chaos solitons fractals doi: 10.1016/j.chaos.2020.110148 sha: doc_id: 344252 cord_uid: 6g3zzj0o we employ deep learning to propose an artificial neural network (ann) based and data stream guided real-time incremental learning algorithm for parameter estimation of a non-intrusive, intelligent, adaptive and online analytical model of covid-19 disease. modeling and simulation of such problems poses the additional challenge of continuously evolving training data in which the model parameters change over time depending upon external factors. our main contribution is that, in a scenario of continuously evolving training data, unlike typical deep learning techniques, this non-intrusive model eliminates the need to retrain or rebuild the model from scratch every time a new training data set is received. after validating the model, we use it to study the impact of different strategies for epidemic control. finally, we propose and simulate a strategy of controlled natural immunization through risk-based population compartmentalization (pc), wherein the population is divided into low risk (lr) and high risk (hr) compartments based on risk factors (like comorbidities and age) and subjected to different disease transmission dynamics by isolating the hr compartment while allowing the lr compartment to develop natural immunity. upon release from the preventive isolation, the hr compartment finds itself surrounded by enough immunized individuals to prevent the spread of infection, and thus most of the deaths occurring in this group are avoided.
finally, we propose and simulate a strategy of controlled natural immunization through risk based population compartmentalization (pc) wherein the population is divided into low risk (lr) and high risk (hr) compartments based on risk factors (like comorbidities and age) and subjected to different disease transmission dynamics by isolating the hr compartment while allowing the lr compartment to develop natural immunity. upon release from the preventive isolation, the hr compartment finds itself surrounded by a sufficient number of immunized individuals to prevent spread of infection and thus most of the deaths occurring in this group are avoided. highlights: an effective strategy to minimize the number of deaths through controlled natural immunization in the absence of mass-level vaccine availability. covid-19 is a highly contagious epidemic disease caused by the novel coronavirus (sars-cov-2) that originated in wuhan, hubei province of china in late december 2019. the world health organization (who) declared covid-19 a pandemic on 11th march 2020 [1]. researchers and policy makers are working round the clock to find solutions and design strategies to control the pandemic and minimize its impact on human health and economy. the transmission of sars-cov-2 in humans is mostly through respiratory droplets (sneezing, coughing and while talking) and through contaminated surfaces [2]. the most significant property of sars-cov-2 is that it can persist on a variety of surfaces from hours to 9 days at room temperature, which makes its transmission more rapid [3]. this virus can cause acute respiratory distress syndrome (ards) or multiple organ dysfunction, which may lead to physiological deterioration and death of an infected individual [4]. mathematical modeling of infectious diseases and epidemics has been employed as an important tool for analysis of disease characteristics and investigation of disease spread ever since the ground breaking work of kermack and mckendrick in 1927 [5].
it plays a useful role in efficient decision making and optimal policy framing. different models have been developed to analyse the transmission dynamics of many infectious diseases like malaria (ronald ross model) [6], cholera (capasso and paveri-fontana model, 1979) [7], gonorrhea (hethcote and yorke model, 1984) [7], ebola [8], h1n1 [9] etc. in this work, we employ deep learning to propose an artificial neural network (ann) based real-time online incremental learning technique to estimate parameters of a data stream guided analytical model of covid-19 to study the transmission dynamics and prevention mechanism for the sars-cov-2 novel coronavirus in order to aid in optimal policy formulation, efficient decision making, forecasting and simulation. modeling and simulation of such problems poses an additional challenge of continuously evolving training data in which the model parameters change over time depending upon external factors. our main contribution is that in a scenario of continuously evolving training data, unlike typical deep learning techniques, this model eliminates the need to retrain or rebuild the model from scratch every time a new training data set is received. using a data science approach, model parameters are intelligently adapted to the new ground realities. to the best of our knowledge, this paper develops for the first time a deep learning model of epidemic diseases with a data science approach in which parameters are intelligently adapted to the new ground realities with fast evolving infection dynamics. the covid-19 data from india has been taken as the case study. the first case of covid-19 in india was reported on 30th january 2020, originating from wuhan, china [10]. as of 13 june 2020, the total number of cases reported in india is 308,993 with 154,330 recoveries and 8,884 deaths [11]. hence the number of active cases is 145,779.
the government of india imposed a countrywide complete lockdown on 24th march 2020 with strict restrictions on the movement of people while allowing only the essential services to operate under the supervision of administration and health officials. the lockdown was renewed thrice on 14 april, 3 may and 18 may 2020 and has been relaxed since 8 june 2020 [12]. india is the second most populous country in the world with a total population of around 1.35 billion. the health care facilities in india are considered poor, with 0.55 hospital beds per thousand people [13]. therefore, the covid-19 pandemic has emerged as a major challenge for the people, health workers and policy makers of the country. using a control theory approach, we analyze the stability of different disease prevention strategies. finally, we propose a strategy of controlled natural immunization of the population by dividing it into low risk and high risk compartments based on various risk factors like age and comorbidities. the two groups are treated separately and subjected to different disease mechanics with the aim to minimize the total number of deaths, given the fact that the probability of death is very high in the high risk group as compared to the low risk group. the two compartments are isolated from each other for a certain period of time. the low risk compartment is allowed to face the infection at maximum speed and develop immunity, owing to its very low death rate, while the high risk compartment is put under preventive isolation till the infection growth curves flatten for the low risk group. upon release from the preventive isolation, the high risk group finds itself surrounded by a sufficient number of immunized individuals to prevent spread of infection and thus most of the deaths occurring in this group are avoided.
we simulate this strategy in the matlab environment to establish its effectiveness in significant reduction of the number of deaths while demonstrating the usefulness of the deep learning based mathematical model. to the best of our knowledge, such an approach to reduce mortality has not been modeled and simulated earlier by the scientific community. sirvd refers to susceptible, infected, recovered, vaccinated and deceased states of individuals in a population going through an epidemic. the pioneer work in development of mathematical models for infectious diseases was carried out by kermack and mckendrick [5] and is known as the susceptible-infectious-recovered (sir) model. as one of the most classical models, it has been used by many researchers to study and analyse many infectious diseases like seasonal flu [14, 15], pandemic flu [16, 17], hiv/aids [18], sars [19, 20] etc. these studies have shown that sir models are reliable for analysis of the infectious disease spread and evaluation of the impact of prevention schemes in different scenarios. the basic sir model is described by the following differential equations: ds/dt = -βsi/n, di/dt = βsi/n - γi, dr/dt = γi, where s, i and r are functions of time representing the number of susceptible, infected and recovered individuals in a population of size n at time t. β is the rate of transmission and γ is the rate of recovery of infected individuals. it is assumed that those recovered develop immunity and do not catch the infection again in the time span of interest. the basic sir model can be modified in various ways to accommodate different scenarios. a modified sir model known as the sird (susceptible-infected-recovered-deceased) model is of our interest here and is based on the following assumptions: (i) this model is fatal, unlike a typical non-lethal sir model, which means that there is a positive probability of an infected person dying, p(death) = δ with δ ≥ 0. (ii) a typical sir model assumes that the recovered group gains full immunity from reinfection.
however, this model accommodates the possibility of a recovered person being reinfected with probability of reinfection, p(reinfection) = σ with σ ≥ 0. (iii) the impact of new births and unrelated deaths is ignored and the total population remains constant, n = constant ∀ t. (iv) the population is distributed randomly over the area. therefore, there are four classes of individuals: susceptible (s), infected (i), recovered (r) and deceased (d), as described by the following equations: ds/dt = -βsi/n + σr, di/dt = βsi/n - γi - δi, dr/dt = γi - σr, dd/dt = δi, where β = rate of infection, σ = rate of susceptibility, γ = rate of recovery and δ = rate of death. since the final cure for the covid-19 pandemic is the successful discovery and optimal administration of the vaccine in the population, we introduce the effect of vaccination with a given rate of vaccination under resource limited settings. this is achieved by adding a new class of individuals called vaccinated (v) in the population. it can be fairly assumed that there is no limit on the total number of vaccines produced as all the available resources for vaccine production are employed to eliminate the epidemic. however, the vaccine production capacity will have some limit based on availability of resources and facilities. therefore, there will be a limited number of vaccines available at a particular point of time. thus, it is assumed that the per capita rate of vaccination satisfies α < α_max, where α_max is a constant. it is assumed that the vaccination imparts long term immunity against the disease in the vaccinated individuals.
based on these assumptions, we propose a final model described by the following first order ordinary differential equations (odes), illustrated by figure 1: ds/dt = -βsi/n + σr - αs, di/dt = βsi/n - γi - δi, dr/dt = γi - σr, dv/dt = αs, dd/dt = δi. it must be noted here that the process of mass action transmission is described by the non-linear term βsi/n, where β = number of contacts per unit time by a person in group i required to transmit the disease to a person in group s, n - 1 ≈ n = total number of possible contacts of a person, s/n = the fraction of possible contacts of a person that are from group s, i = the number of infected persons at time t; therefore βsi/n = number of people transmitted from group s to group i per unit of time. the next task is to learn the model parameters, which can be quite challenging in an epidemic scenario like covid-19 as the model parameters are supposed to change with time. this section proposes an artificial neural network (ann) based adaptive incremental learning technique (annail) for online learning of the sirvd model parameters with the following assumptions: (i) the rate of vaccination α as a function of time (t) is set by the vaccine production capacity, decided by availability of skilled labour, resources and facilities and by vaccination policy as well. the maximum vaccine production capacity α_max has been assumed to be constant, indicating that there is no change in vaccine production infrastructure, technology or facilities. (ii) the rate of infection β as a function of time (t) is the major challenge for parameter learning. it is affected by external factors like degree of social distancing, lockdown etc. in case of a lockdown β decreases exponentially. therefore, in order to take into account both the lockdown and no lockdown scenarios, β has been modelled as β(t) = β_0 for t < t_l and β(t) = β_1 + (β_0 - β_1)e^(-(t-t_l)/τ_β) for t ≥ t_l, where t_l is the time when the lockdown begins. therefore, the learning algorithm has to learn 3 parameters (β_0, β_1, τ_β) to find β.
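a minimal numerical sketch of the sirvd model and the lockdown-dependent β(t) above, written in python rather than the paper's matlab; all parameter values are illustrative assumptions, and the exponential-decay form of β(t) is one plausible reading of the text rather than the paper's exact equation:

```python
import math

def beta_t(t, beta0, beta1, tau, t_l):
    # assumed form: constant beta0 before lockdown at t_l, then exponential
    # decay towards beta1 with time constant tau
    if t < t_l:
        return beta0
    return beta1 + (beta0 - beta1) * math.exp(-(t - t_l) / tau)

def simulate_sirvd(n, days, dt=0.1, beta0=0.25, beta1=0.05, tau=14.0,
                   t_l=55.0, sigma=0.0, gamma=0.07, delta=0.002, alpha=0.0):
    """forward-Euler integration of the sirvd odes; returns final (s, i, r, v, d)."""
    s, i, r, v, d = n - 1.0, 1.0, 0.0, 0.0, 0.0
    t = 0.0
    while t < days:
        b = beta_t(t, beta0, beta1, tau, t_l)
        new_inf = b * s * i / n              # mass-action term beta*s*i/n
        ds = -new_inf + sigma * r - alpha * s
        di = new_inf - (gamma + delta) * i
        dr = gamma * i - sigma * r
        dv = alpha * s
        dd = delta * i
        s, i, r, v, d = (s + ds * dt, i + di * dt, r + dr * dt,
                         v + dv * dt, d + dd * dt)
        t += dt
    return s, i, r, v, d
```

since the five derivatives sum to zero, the total population is conserved up to floating-point error, which is a convenient sanity check on any implementation of the model.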
(iii) the rate of reinfection σ has been assumed to be zero for the covid-19 disease as the human body develops antibodies to prevent re-infections in future against such a virus [21]. (iv) the rate of recovery γ and the rate of death δ are affected by factors like change in health care facilities, possible overcrowding of hospitals, development of new drugs to manage or treat the disease etc. both of these have been assumed to be constant in this paper. for a typical neural network or any other technique of model parameter estimation, the training data is required first to train the model before applying it on future scenarios. however, in case of an epidemic like covid-19, the training data is continuously evolving with time and the model needs to be trained and executed at the same time, as the model parameters may change over time based on different external factors like government policies, social distancing etc., which can be known only from newly arriving data sets. therefore, we propose a technique for the model to learn these parameters from new data sets in an adaptive manner while continuously updating the old models without the need to build the model from scratch every time a new training data set is received. deep learning and other machine learning techniques stand out in solving problems of data based model parameter estimation due to their state-of-the-art results. however, they face the problem of catastrophic forgetting, which reduces their performance as new training data becomes available with time. this is because typical neural networks require the entire dataset to update the model each time a new training data set becomes available, as in an epidemic modelling problem where the training data arrives incrementally with time. to address such issues, different incremental learning algorithms have been suggested in the literature [22, 23, 24].
incremental learning refers to an online learning technique of continuous model adaptation under a scenario of continuously evolving training data. therefore, storage of or access to the previously observed data is not required each time a new data set is received, as in the case of an epidemic like covid-19. in order to adapt the model parameters in light of new data, it is not necessary to use all the previously accumulated data for developing the model from scratch. rather, the learning network modifies the previous hypothesis to adapt to the new data chunk. in this paper, we propose hypothesis generation via an artificial neural network (ann). let d_(j-1) be the data set received between time t_(j-1) and t_j, and h_(j-1) be the hypothesis generated on this data set. the hypothesis h_j for a new data set d_j received between time t_j and t_(j+1) is a function of d_j and h_(j-1) only: h_j = f(d_j, h_(j-1)). the experience gained from this step is stored and integrated to support the future adaptation process. thus the objective here is to integrate the previously learned knowledge into the new raw data set to adapt the model parameters accordingly, and to accumulate this experience over time to increase the model efficiency, accuracy and flexibility. the proposed framework for the above problem is shown in figure 2. the ann is based on a non-linear activation function for successful regression analysis. the hidden layers are represented by the function f_nn. with the continuous data stream, the weight distribution functions are generated to describe the learning capability of the ann, where the decision boundary is adjusted to focus especially on the hard to learn data examples. the algorithm for this framework is given in algorithm 1. its input is the data set d_t = {(x_i, y_i)}, where x_i represents the input vector and y_i represents the output, together with the associated mapping function output of f_nn, the ann defined mapping function. the steps are: (i) estimate the initial distribution function for d_t; (ii) apply hypothesis h_(t-1) to d_t and find the pseudo-error; (iii) compute the pseudo-error υ for each example; (iv) update the example weights accordingly; (v) generate the new hypothesis h_t; (vi) repeat the above procedure for d_(t+1).
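the update rule h_j = f(d_j, h_(j-1)) above can be illustrated with a deliberately simple stand-in: a linear model refined by sgd on each arriving chunk, warm-started from the previous hypothesis. this is a python sketch; the linear model, learning rate and synthetic data are illustrative assumptions, not the paper's ann:

```python
import random

def sgd_update(weights, chunk, lr=0.01, epochs=50):
    """one incremental step h_j = f(d_j, h_{j-1}): refine the previous
    hypothesis on the new chunk only, never revisiting old chunks."""
    w0, w1 = weights
    for _ in range(epochs):
        for x, y in chunk:
            err = (w0 + w1 * x) - y   # prediction error of the current hypothesis
            w0 -= lr * err
            w1 -= lr * err * x
    return (w0, w1)

random.seed(0)
a, b = 1.0, 2.0                        # ground-truth parameters to recover
hypothesis = (0.0, 0.0)                # h_0: the empty initial hypothesis
for j in range(5):                     # five data chunks d_1 .. d_5 arriving in sequence
    chunk = [(x / 10.0, a + b * x / 10.0 + random.gauss(0.0, 0.01))
             for x in range(20)]
    hypothesis = sgd_update(hypothesis, chunk)   # warm-started update
```

after the five chunks the hypothesis sits close to the generating parameters even though no chunk is ever revisited; the paper's annail plays the same role with an mlp hypothesis and pseudo-error weighting in place of plain sgd.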
the final hypothesis is a combination of the hypotheses in t, where t is the set of incrementally developed hypotheses in the learning life. this algorithm is run in a top-down and horizontal signal flow, as shown in figure 2. the adaptive nature of this algorithm is due to the mapping function based on the ann, which estimates the initial distribution function for d_t while providing a quantitative approach to indicate the learning power of the new data set based on the previously trained model. the estimated distribution function is applied to the new data set to find the pseudo-error, υ. thus a hard to learn example will have higher υ in step (iii) of the learning procedure, and will in turn receive higher weight in step (iv). this ensures the adaptive nature of the algorithm. the mapping function connects the past experiences to the new data in an adaptive fashion. there can be many ways to design the mapping function. however, we implemented the nonlinear regression by an ann based approximation of the mapping function owing to its flexibility. any such neural network based function approximation technique can be used. as an illustration, we take the multilayer perceptron (mlp) in this paper. this is shown in figure 3. the input is an n-dimensional vector (for example, number of infections, deaths, recoveries in an epidemic) of example i, and w represents the weights of a layer. backpropagation is used to tune the weights w of the different layers, where the error function is defined as e(k) = (1/2) Σ_i (υ_i - y_i)², with k the training epoch of the backpropagation. the neural network gives the following output: h_f = Σ_(n=1)^n w_(nf) x_n, g_f = f(h_f), υ = Σ_(f=1)^(n_h) w_f g_f, where h_f represents the input to the fth hidden node while g_f represents its output, υ is the input to the final node, n_h is the number of hidden neurons, and n is the total number of inputs. the weights of the ann are updated by applying the above defined backpropagation strategy, as explained below.
backpropagation: the weight adjustment for the hidden-to-output layer is w_f(k+1) = w_f(k) - α(k) ∂e(k)/∂w_f; similarly, the weight adjustment for the input-to-hidden layer is w_(nf)(k+1) = w_(nf)(k) - α(k) ∂e(k)/∂w_(nf), where α(k) describes the learning rate. estimation of the initial distribution function for d_t requires only the feedforward path of the mlp. this model was validated for covid-19 in india, where the first 80% of the data was used for training and the remaining 20% was used for testing, as shown in figure 4. it is clearly evident from the plots shown in the figure that the results given by the model during testing are very close to the actual data. the inputs and outputs in this algorithm were: inputs: number of new infections, deaths and recoveries; rate of vaccination (α). outputs: β_0, β_1, τ_β, δ, γ. these outputs are fed to the analytical sirvd model at every time instant when a new set of input data is received to simulate and forecast different scenarios. in this section, we analyse three possible strategies to combat an epidemic like covid-19: (i) herd immunity, (ii) complete vaccination, (iii) complete lockdown; finally, our proposed (iv) controlled natural immunization through risk based population compartmentalization is discussed in the next section. however, the following definitions are needed beforehand: definition 1. (stability) if the jacobian matrix j for a system of n differential equations has eigenvalues λ_1, λ_2, ..., λ_n for a trivial steady state equilibrium at (0, 0, ..., 0), then the stability of the solution is determined as follows: (i) if re(λ_i) < 0 ∀ i = 1, ..., n, then the system has uniform and asymptotic stability (uas). (ii) if re(λ_i) ≤ 0 ∀ i = 1, ..., n and the algebraic multiplicity equals the geometric multiplicity whenever re(λ_i) = 0 for any i, then the system has uniform stability (us). (iii) if re(λ_i) > 0 for any i, or the algebraic multiplicity is greater than the geometric multiplicity whenever re(λ_i) = 0 for any i, then the system has instability.
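the feedforward and backpropagation equations above can be sketched as a minimal mlp in python (one hidden tanh layer, linear output, squared error). the toy regression target y = x², the layer sizes and the learning rate are illustrative assumptions standing in for the paper's parameter estimation task:

```python
import math
import random

class TinyMLP:
    """one-hidden-layer perceptron trained by backpropagation on squared error."""

    def __init__(self, n_in, n_hidden, seed=1):
        rng = random.Random(seed)
        self.w_ih = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                     for _ in range(n_hidden)]
        self.b_h = [0.0] * n_hidden
        self.w_ho = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b_o = 0.0

    def forward(self, x):
        # h_f: input to the f-th hidden node, g_f = tanh(h_f): its output
        self.g = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w_ih, self.b_h)]
        return sum(w * g for w, g in zip(self.w_ho, self.g)) + self.b_o

    def train_step(self, x, y, lr=0.05):
        out = self.forward(x)
        err = out - y                        # dE/d(out) for E = (out - y)^2 / 2
        for f in range(len(self.w_ho)):
            grad_h = err * self.w_ho[f] * (1.0 - self.g[f] ** 2)  # through tanh
            self.w_ho[f] -= lr * err * self.g[f]       # hidden-to-output update
            self.b_h[f] -= lr * grad_h
            for j in range(len(x)):
                self.w_ih[f][j] -= lr * grad_h * x[j]  # input-to-hidden update
        self.b_o -= lr * err

# toy regression target as a stand-in for the paper's estimation task
data = [([k / 10.0], (k / 10.0) ** 2) for k in range(11)]
net = TinyMLP(n_in=1, n_hidden=5)
for _ in range(3000):
    for x, y in data:
        net.train_step(x, y)
```

after training, the network's predictions on the sample points track the target closely, mirroring the weight-update rules stated in the text.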
since the impact of new births and unrelated deaths has been fairly ignored for the purpose of this study and it has been assumed that the total population n stays constant throughout the epidemic, the system of odes (8-12) describing the model satisfies the following two conditions: s + i + r + v + d = n and ds/dt + di/dt + dr/dt + dv/dt + dd/dt = 0 (conditions (25, 26)). this shows that a state of equilibrium always exists in the system. definition 2. (dfe) a disease free equilibrium (dfe) is defined as a state of equilibrium according to (25, 26) in which the numbers of infected and recovered individuals are equal to zero, such that there are no further deaths (i → d), infections (s → i) or reinfections (r → i): i = 0 and r = 0, which, from the system of odes (8-12), gives ds/dt = di/dt = dr/dt = dv/dt = dd/dt = 0. we take the jacobian matrix of the above system to evaluate its stability. however, the fifth differential equation (representing d) is uncoupled from the first four differential equations and it can be derived from the first four equations using (25, 26); therefore we consider the jacobian for the first four equations only, denoted (32). given below is the investigation of dfe stability for different epidemic control strategies. herd immunity is the idea that a virus cannot spread easily after enough people develop immunity against it. this reduces the chances of the virus transmitting from person to person and infecting those who haven't been infected yet [25]. such immunity can be induced artificially by a vaccine; however, it can also be developed naturally by the infection itself, as the immune system of the body develops antibodies against the virus which prevent reinfection in future. this is based on the fact that the human body produces a non-specific innate response to a viral infection initially, using neutrophils, macrophages, and dendritic cells. however, this is followed by a more specific adaptive response in the form of development of proteins called immunoglobulins, which act as antibodies specifically binding to the virus.
this is coupled with the formation of t-cells, which generate cellular immunity by identifying and eliminating the cells that are infected with the virus. generally, sufficient presence of such antibodies in collaboration with cellular immunity prevents reinfection after recovery. although not every recovered patient may develop complete immunity, that is the case for most of them [21]. in the case of covid-19, although research is still going on to reach a conclusive opinion, promising studies suggest that nearly all the recovered patients develop such antiviral immunoglobulin-g (igg) antibodies and are immune to reinfection [26]. this lays the basis for the mass serological testing for covid-19 being practised by many governments across the world, in which blood samples of people are tested for the presence of these antibodies indicating present or past covid-19 infection. further, the data on reinfection, even if rare, is not available. therefore, the idea of herd immunity is to let the infectious disease take its natural course of action and let the population naturally develop immunity against the disease after most of the population gets infected. the dfe after herd immunity has the following form: (s, i, r, v, d) = (0, 0, n - d_0, 0, d_0), where d_0 is the total number of deaths at the time of dfe and thus r = n - d_0. since there is no vaccination, α = 0 and v = 0. further, the rate of reinfection σ is zero as well, meaning recovered people cannot catch the infection again. using these values in (32) and equating the characteristic equation of the resulting jacobian to zero, the eigenvalues of this system are λ_1 = -(γ + δ) and λ_2 = λ_3 = λ_4 = 0. thus, all the eigenvalues have real parts less than or equal to zero: re(λ_i) ≤ 0 ∀ i and d_0 ∈ [0, n]. hence, according to definition 1, the dfe for herd immunity possesses uniform stability (us). in this case, there are no chances of retriggering of the disease after the dfe has been reached.
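the eigenvalue claim above can be checked numerically. the sketch below (python, illustrative parameter values) builds a finite-difference jacobian of the reduced (s, i, r, v) system at the herd-immunity dfe; with σ = α = 0 the matrix comes out lower triangular, so its eigenvalues are simply the diagonal entries (0, -(γ + δ), 0, 0):

```python
def rhs(state, beta, sigma, gamma, delta, alpha, n):
    """right-hand side of the reduced (s, i, r, v) system."""
    s, i, r, v = state
    return [-beta * s * i / n + sigma * r - alpha * s,
            beta * s * i / n - (gamma + delta) * i,
            gamma * i - sigma * r,
            alpha * s]

def jacobian(state, params, eps=1e-6):
    """forward-difference jacobian with J[j][k] = d f_j / d x_k."""
    base = rhs(state, *params)
    cols = []
    for k in range(4):
        pert = list(state)
        pert[k] += eps
        f = rhs(pert, *params)
        cols.append([(f[j] - base[j]) / eps for j in range(4)])
    return [[cols[k][j] for k in range(4)] for j in range(4)]

n, d0 = 1.0e6, 1.0e4
params = (0.25, 0.0, 0.07, 0.002, 0.0, n)   # sigma = alpha = 0 after herd immunity
dfe = [0.0, 0.0, n - d0, 0.0]               # (s, i, r, v) at the dfe
J = jacobian(dfe, params)
```

every strictly upper-triangular entry of J vanishes and the diagonal is (0, -(γ + δ), 0, 0), so all eigenvalues have non-positive real part, matching the uniform-stability conclusion in the text.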
however, this approach has been criticised as dangerous, as it will result in a large number of deaths [27]. to develop herd immunity for covid-19 in the population, roughly 70% or more of the population needs to have gone through the infection [28]. the current death rate in india due to covid-19 is around 3%, which means that if this death rate is maintained, nearly 2.1% of the population will die by the time 70% gets infected. this leads to nearly 28 million deaths. the actual number of deaths would be more than this, because with rapid growth of the infectious disease, hospitals would be overwhelmed and the health care infrastructure would not be able to cater to the demands of a high number of patients, leading to an increase in the death rate. the growth of the disease in india, if it is allowed to take its natural course without any intervention and control, is shown in figure 5. the strategy of complete lockdown focuses on minimization of mobility and contact among the population with maximum possible social distancing. therefore, it minimizes β (the rate of infection). the dfe under complete lockdown has the following form: (s, i, r, v, d) = (n - d_0, 0, 0, 0, d_0), where d_0 is the total number of deaths at the time of dfe and thus s = n - d_0. since there is no vaccination, α = 0 and v = 0. using these values in (32) and equating the characteristic equation of the resulting jacobian to zero, the eigenvalues of this system are λ_1 = λ_2 = λ_4 = 0 and λ_3 = β(n - d_0)/n - (γ + δ); since for covid-19 the rate of infection exceeds γ + δ, the real part of one of the eigenvalues is always greater than zero: re(λ_3) > 0 ∀ d_0 ∈ [0, n]. hence, according to definition 1, the dfe under complete lockdown is always unstable. thus, whenever a dfe is reached in this case, the tendency for the disease to be triggered again is always there. even a single case of infection can restart the disease.
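the herd-immunity death estimate above is simple arithmetic on the figures quoted in the text (3% death rate, 70% of the population infected, 1.35 billion people):

```python
population = 1.35e9        # india's population, as stated in the text
infected_fraction = 0.70   # herd-immunity threshold assumed in the text
death_rate = 0.03          # death rate among the infected, as stated

dead_fraction = infected_fraction * death_rate   # share of the whole population
deaths = population * dead_fraction
print(round(deaths / 1e6))   # prints 28 (million), matching the text's estimate
```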
apart from the chances of reemergence of the disease, the time taken by the system to reduce the number of active infections to zero may be large enough to make the total lockdown unsustainable and practically impossible. the impact of lockdown on the disease growth in india is shown in figure 6. it has been assumed that the lockdown ends on day 250 of the disease after it starts on day 55. it can be compared with figure 5, which shows the disease growth in case no preventive measures are taken. these results confirm that the infection plots shoot up as soon as the lockdown is lifted and there is no significant difference in terms of the total number of infections or deaths. the only benefit of lockdown, as can be seen from these plots, is that the peak of infection is delayed, which can be useful for the administration to buy some time to prepare the healthcare infrastructure of the country to brace for the full blown impact of the disease. researchers and health industries worldwide are working around the clock to discover a vaccine against sars-cov-2 [29]. moderna, a pharmaceutical company, had started clinical testing of its mrna-based vaccine (mrna-1273) just 2 months after the complete sequencing of sars-cov-2 was done and published by different research groups [30]. the most promising approach for vaccine development would be to induce our immune system to synthesize neutralizing antibodies against the viral spike protein which block its entry via ace2 receptors [31]. the dfe with vaccination of all living members of the population n has the following form: (s, i, r, v, d) = (0, 0, 0, n - d_0, d_0), where d_0 is the total number of deaths at the time of dfe and thus v = n - d_0, as the rest of the population has been vaccinated and is now immune to the disease.
using these values in (32) and equating the characteristic equation of the resulting jacobian to zero gives the eigenvalues of the system. all the eigenvalues have real parts less than or equal to zero, so the dfe is stable: vaccination of all living members of the population is the way to completely eliminate the disease and its chances of re-triggering as well. however, successful development of a vaccine for sars-cov-2 still has a long way to go. further, a vaccination rate of 10 million per day is highly ambitious for a country like india. therefore, elimination of the current wave of the covid-19 epidemic and minimization of the number of deaths by vaccination is practically impossible, as the successful development and mass administration of a vaccine is expected to take more than a year, at least. however, vaccination may still be necessary to prevent new waves of the disease in future. the strategy of herd immunity discussed in the previous section aims at minimizing the number of s (susceptible) and maximizing the number of r (recovered) people, who are supposed to have developed the immunity. this will minimize the factor βsi/n in the sirvd model odes discussed in section 2. however, this strategy results in the maximum number of deaths. the strategy of complete lockdown aims to minimize the rate of infection (β) by reducing the mobility of people. however, this strategy is not sustainable in the long term, as discussed. the strategy of complete vaccination is an ideal solution to the problem. it aims at maximizing the number of people falling in v (vaccinated group) through artificial immunization. however, as discussed, the successful development of a vaccine and its administration in the population is expected to take long enough for a large number of deaths to occur in the meantime.
in this section we propose a strategy of minimizing the number of deaths by controlled natural immunization through compartmentalization of the population in two groups: low risk and high risk. the high risk group comprises people who have comorbidities or are aged above 50 years and as a result have a high probability of death under covid-19 infection. the aim of this strategy is to minimize the number of deaths caused by the covid-19 disease: the death rate (δ) is very high in the high risk group, whereas it is very low in the low risk group. the most prevalent comorbidity for covid-19 is hypertension, followed by diabetes, with a mean age of around 48.9 years [32]. in india, the percentage of the population above 50 years of age is 19.28%, while the percentage above 60 is nearly 8% [33]. most of the people having comorbidities are expected to fall in the above 50 age group. in this study, the high risk group has been assumed to be 20% of the population. it is proposed that these two population groups be subjected to different disease mechanics. the high risk group is subjected to a preventive quarantine or isolation wherein its members are isolated from the low risk group by placing them in separate rooms or sections in homes with minimum contact with the low risk group. any contact that is necessary should take place with maximum preventive measures like wearing of face masks, sanitization etc. very high risk individuals may be placed in designated care centers where their health needs are met by medical professionals. therefore, for the high risk compartment, β is reduced to a minimum, as in the case of a lockdown. meanwhile, the low risk group is subjected to maximum mobility and contact among its members, effectively increasing β to the maximum possible value. there should be no social distancing or other preventive measures for this group.
as a result, the infection will spread very quickly in the low risk group and its members will transfer from s to r, having gained immunity naturally with a very low death rate. once this is achieved, or in other words once the disease curves are flattened for the low risk group, the high risk group is released from the preventive isolation and allowed to mix with the low risk group. since most of the population (70-80%) is already immune and the values of s and i are low, the chances of the high risk group receiving infection from the low risk group are very low, because the factor βsi/n has already been reduced to a minimum. during pc, β is minimum for the high risk group, while after pc, si is minimum. after pc, the mobility and contact in the population, represented by β, should be kept moderate. since α and σ are zero, the model reduces to: ds/dt = -βsi/n, di/dt = βsi/n - (γ + δ)i, dr/dt = γi, dd/dt = δi. here, β represents the rate of infection, which depends upon the mobility and contact among the population; due to the lockdown, the value estimated from the lockdown period can be considered the lowest possible β in india. we have simulated the pc strategy with different values of β_l ranging from 2 to 5 times the β given above, to account for high mobility and contact during pc, and β_h = 0.2β, to assume that the isolation of high risk individuals is at least 5 times stronger than the social distancing practised by the whole population during the nationwide lockdown. δ and γ are complementary in the sense that an individual either recovers or dies from the infection; the current death and recovery rates are distributed between the low and high risk groups accordingly. integrating the equations of the sirvd model for this case gives the group trajectories in terms of the initial states s(t_0) and i(t_0). this is true for both the low and high risk groups; however, for the high risk group, s_h(t_0) = i_h(t_0) = 0, while for the low risk group the initial states are taken from the epidemic at the start of pc. t_f is the time when pc ends and the two groups are allowed to remix.
after pc, δ and γ are restored to their original values, while β is multiplied by a factor of 2 to signify a higher level of mobility and contact than lockdown, albeit with social distancing. the stability analysis and eigenvalues for this case are the same as those of herd immunity, i.e., it possesses uniform stability. this strategy was simulated using the model proposed in sections 1 and 2. the results of the simulation are shown in figures 9 and 10. in all these figures, the black line represents the number of deaths, blue is the total number of cases, red is the active number of infections, while the green line represents the number of recovered people. in figure 9, the preventive quarantine or compartmentalization of the population is done from day 130 after the start of the pandemic till day 200, while in figure 10 the same is continued till day 300. the rate of infection β is shown as a multiple of the average rate of infection during the nationwide lockdown from 24 march to 08 june 2020, which was the minimum possible. an increase of β is achieved by an increase of contact and mobility among the population; for example, β = 5x means five times the mobility and contact as compared to the days of the nationwide lockdown. if no preventive measures are taken, then as per the current trend in the disease growth we may expect more than 55 million deaths in the country if all the population gets infected. this is shown in figure 8. 55 million deaths is not surprising even if linear growth of the disease is assumed with a 2.8% death rate for a population of 1.35 billion. different hypothetical experiments were simulated and their results are given in table 1. as discussed earlier, β is supposed to be kept on the lower side after the end of pc, and social distancing is advised.
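a toy two-compartment version of the pc strategy above can make the mechanism concrete (python; a deterministic sketch with assumed illustrative rates, not the paper's calibrated matlab simulation): the low risk group burns through the epidemic at high β while the high risk group is isolated, and after remixing the residual force of infection is too small to restart an outbreak in the high risk group.

```python
def run_pc(pc_end, days=600, dt=0.05):
    """total deaths for a two-group sird run; pc_end = 0 means no pc at all."""
    n = 1_000_000.0
    gamma, beta_lock = 0.07, 0.1             # assumed recovery and lockdown-level rates
    delta_lo, delta_hi = 0.001, 0.02         # assumed group-specific death rates
    lo = [0.8 * n - 100.0, 100.0, 0.0, 0.0]  # (s, i, r, d), low risk, seeded
    hi = [0.2 * n, 0.0, 0.0, 0.0]            # high risk starts uninfected
    t = 0.0
    while t < days:
        if t < pc_end:
            lam_lo = 5 * beta_lock * lo[1] / (0.8 * n)  # lr mixes within itself only
            lam_hi = 0.0                                # hr fully isolated
        else:
            lam = 2 * beta_lock * (lo[1] + hi[1]) / n   # moderate mixing after pc
            lam_lo = lam_hi = lam
        for g, dlt, lam_g in ((lo, delta_lo, lam_lo), (hi, delta_hi, lam_hi)):
            s, i, r, d = g
            new_inf = lam_g * s
            g[0] = s - new_inf * dt
            g[1] = i + (new_inf - (gamma + dlt) * i) * dt
            g[2] = r + gamma * i * dt
            g[3] = d + dlt * i * dt
        t += dt
    return lo[3] + hi[3]
```

with these toy numbers, run_pc(300) yields far fewer total deaths than run_pc(0), reproducing the qualitative effect claimed in the text rather than its calibrated figures.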
these results show that the total number of deaths can be reduced from 55 million to 1.3 million if mobility and contact are increased to 5 times those of the lockdown period, pc is ended on day 300 of the pandemic, and, after the end of pc, mobility is reduced to two times. the growth of the disease in such a scenario is shown in figure 11. in this work, we developed an analytical epidemiological model for the covid-19 pandemic in which model parameters are continuously updated to intelligently adapt to new data sets using an ann-based adaptive online incremental learning technique. in a scenario of continuously evolving training data, unlike typical deep learning techniques, the model eliminates the need to retrain or rebuild the model from scratch every time a new training data set is received. the model was validated, and different scenarios were simulated to demonstrate its usefulness and significance. india was taken as a case study; however, this model can be applied to any population in the world and would be a useful tool for policy makers, health officials and researchers in improving decision-making efficiency, policy formulation and forecasting. the simulation work was carried out in the matlab environment. using this model, we simulated preventive measures like lockdown, vaccination and herd immunity to study their impact on the evolution of the covid-19 disease. finally, we proposed an effective method to significantly reduce the number of deaths caused by the pandemic in case a vaccine is not available at the mass level. this technique aims to develop natural immunity in the low risk group of the population by subjecting them to the full-blown impact of the sars-cov-2 virus while subjecting the high risk group to preventive isolation during this time period. once the low risk group develops natural immunity and its disease curves are flattened, the high risk group is released from preventive isolation.
upon release, the high risk group doesn't find enough infected or susceptible people in the environment to catch the infection at a high rate, and in this way the maximum number of deaths is avoided in the high risk group. the impact of this strategy has been simulated, and it has been shown that the number of deaths can be reduced from 55 million to 1.3 million if the population compartmentalization starts tomorrow and ends on day 300 of the pandemic in india. during this period, the mobility and contact in the low risk group have to be five times those of the lockdown period, and upon remixing of the two groups the mobility and contact should be reduced from 5 times to 2 times. the novelty of this paper lies in the use of a real-time online incremental learning technique in epidemic disease modeling. many machine learning techniques have been used in epidemic disease modeling [36]; however, this paper is the first instance of the development of an incremental learning algorithm as a real-time adaptive deep learning technique for parameter estimation of an epidemiological model, thus providing the model with the capability to work online, i.e., unlike typical machine learning techniques, it doesn't require rebuilding or retraining the model from scratch every time a new data set is received but intelligently adapts the model to ever-changing infection dynamics. since the model is non-intrusive, adaptive, intelligent, real-time and online in nature, it can be employed to monitor, forecast and simulate the growth of any infectious disease over a large population without losing accuracy, fidelity or computational performance due to limitations like run-time duration, size of training data, computational complexity, changes in transmission dynamics due to mutations in the virus or bacteria, or changes in prevention mechanisms or government policies.
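the online-updating idea described above can be illustrated with a much simpler stand-in than the paper's ann-based learner: recursive least squares on a log-linear growth model, where each new observation updates o(1) sufficient statistics rather than retraining on the full history. the class name and the exponential growth model are assumptions for illustration only.

```python
import math

class OnlineGrowthRate:
    """recursive least squares on log i(t) = log i0 + r*t: each new daily
    count refreshes running sums, so the growth-rate estimate updates in
    constant time without refitting on the whole history."""

    def __init__(self):
        self.n = self.st = self.sy = self.stt = self.sty = 0.0

    def update(self, t, cases):
        y = math.log(cases)
        self.n += 1.0
        self.st += t
        self.sy += y
        self.stt += t * t
        self.sty += t * y

    def rate(self):
        # ordinary least-squares slope computed from the running sums
        denom = self.n * self.stt - self.st ** 2
        return (self.n * self.sty - self.st * self.sy) / denom
```

feeding the estimator (day, cumulative cases) pairs one at a time recovers the exponential growth rate without ever storing the history.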
even if the epidemic continues for decades in the whole world, the model will keep working efficiently on a daily basis without any decay in performance or run-time environment (rte). further, to the best of our knowledge, population compartmentalization to achieve natural immunity against an infectious disease while significantly reducing mortality has been modeled and simulated for the first time in this paper. these findings could be highly useful to policy makers around the world in reducing the number of deaths in any country in case a vaccine is not readily available and lockdown is not sustainable economically. further, this is a demonstration of the usefulness and efficiency of a deep learning based incremental learning algorithm in model parameter estimation and in the simulation of different epidemic scenarios. the doctoral research funding from the ministry of human resource development, government of india, in favor of the first author is duly acknowledged. the authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
detail/who-director-general-sopening-remarks-at-the-mission-briefing-on-covid
aerosol and surface stability of sars-cov-2 as compared with sars-cov-1
covid-19 ards: clinical features and differences to "usual" pre-covid ards
contributions to the mathematical theory of epidemics
mathematical biology: 1. an introduction
modeling influenza epidemics and pandemics: insights into the future of swine flu (h1n1)
covid-19
covid-19 pandemic lockdown in india
transmission of influenza: implications for control in health care settings
a bayesian mcmc approach to study transmission of influenza: application to household longitudinal data
a 'small-world-like' model for comparing interventions aimed at preventing and controlling influenza pandemics
transmissibility of 1918 pandemic influenza
analysis of recruitment and industrial human resources management for optimal productivity in the presence of the hiv/aids epidemic
murray, transmission dynamics and control of severe acute respiratory syndrome
transmission dynamics of the etiological agent of sars in hong kong: impact of public health intervention
overview of some incremental learning algorithms
end-to-end incremental learning
incremental learning algorithms and applications
"herd immunity": a rough guide
antibody responses to sars-cov-2 in patients with covid-19
early herd immunity against covid-19: a dangerous misconception
estimating the number of infections and the impact of non-pharmaceutical interventions on covid-19 in 11 european countries
covid-19 vaccine tracker
draft landscape of covid-19 candidate vaccines
immune responses in covid-19 and potential vaccines: lessons learned from sars and mers epidemic
key: cord-336644-kgrdul35 authors: shao, nian; zhong, min; yan, yue; pan, hanshuang; cheng, jin; chen, wenbin title: dynamic models for coronavirus disease 2019 and data analysis date: 2020-03-24 journal: math methods appl sci doi: 10.1002/mma.6345 sha: doc_id: 336644 cord_uid: kgrdul35 in this letter, two time
delay dynamic models, a time delay dynamical-novel coronavirus pneumonia (tdd-ncp) model and a fudan-chinese center for disease control and prevention (ccdc) model, are introduced to track the data of coronavirus disease 2019 (covid-19). the tdd-ncp model was developed recently by cheng's group at fudan and the shanghai university of finance and economics (sufe). the tdd-ncp model introduced a time delay process into the differential equations to describe the latent period of the epidemic. the fudan-ccdc model was established when wenbin chen suggested determining the kernel functions in the tdd-ncp model from the public data of ccdc. using the public data on cumulative confirmed cases in different regions in china and different countries, these models clearly illustrate that containment of the epidemic depends highly on early and effective isolation. covid-19 has raised intense attention not only within china but internationally. people are concerned about the spread of the epidemic and its development trend. much mathematical research has focused on modelling the spread and development of covid-19. for instance, considering the epidemic's feature of spreading during the latent period, cheng's group applied a time delay process to describe this typical feature and proposed a novel dynamical system to predict the outbreak and evolution of covid-19. this model was called the time delay dynamical-novel coronavirus pneumonia (tdd-ncp) model. 1, 2 based on this work, chen et al also proposed a time delay dynamic system with an external source to describe the trend of local outbreaks of covid-19. 3 then, considering fractional order derivatives, chen et al proposed a novel time delay dynamic system with fractional order. 4 in their system, the riemann-liouville derivative was added, which can describe the growth process of the confirmed and cured populations.
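the time delay idea described above, confirmations lagging infections by a latent period, can be sketched as a toy discrete recursion. the function below is a hypothetical sketch, much simpler than the published tdd-ncp kernels; r, ell and tau are illustrative parameters.

```python
def delayed_outbreak(i0, r, ell, tau, days):
    """toy discrete time-delay recursion: new infections grow from currently
    unconfirmed cases at rate r, isolation removes a fraction ell from
    transmission, and confirmations lag infections by tau days."""
    infected = [float(i0)]   # cumulative infections
    confirmed = [0.0]        # cumulative confirmations
    for t in range(1, days + 1):
        active = infected[-1] - confirmed[-1]            # infectious, not yet confirmed
        infected.append(infected[-1] + r * (1.0 - ell) * active)
        confirmed.append(infected[t - tau] if t >= tau else 0.0)
    return infected, confirmed
```

the delay means the confirmed-case curve reproduces the infection curve tau days later, which is why early confirmed counts understate the true outbreak size.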
later, based on the chinese center for disease control and prevention's (ccdc's) statistical data, shao et al proposed a series of time delay dynamic systems (called the fudan-ccdc model), 5 and they estimated the reproductive number r_0 of covid-19 in their study 6 based on the wallinga and lipsitch framework. 8 the conclusion was that the estimated r_0 of covid-19 lies in [3.25, 3.4], which is bigger than that of the severe acute respiratory syndrome (sars). we refer our readers to some parallel results. [9] [10] [11] [12] [13] [14] [15] on 23 february, the fudan-ccdc model was used specifically to warn that there could be a rapid outbreak in japan if no effective quarantine measures were carried out immediately. 7 in this paper, we provide a brief introduction to the tdd-ncp model 1, 2 and the fudan-ccdc model. 5 we use the fudan-ccdc model to reconstruct some important parameters (including the growth rate, isolation rate and initial date) and to predict the cumulative number of confirmed cases in some cities in china. in addition, due to the serious concern about a possible severe outbreak in japan, we also analyse the different evolutions of covid-19 in japan under different isolation rates in future days. the future circumstances of covid-19 in singapore will also be predicted. it is worth emphasizing that the data employed in this paper were acquired from wind (like bloomberg) and were provided by the china health commission, but the data can also easily be found on chinese news websites. the code runs on matlab, where some optimization packages are used. the following notation is used in the tdd-ncp model. assumptions: • at time t, the exposed people who may transmit to others number i_0(t) = i(t) − g(t) − j(t), with a spread rate that is a fixed but unknown constant. • whether or not the cumulative confirmed people j(t) are isolated before being diagnosed, they consist, on average, of the population infected one latent period earlier.
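the wallinga and lipsitch framework mentioned above relates the epidemic growth rate to the reproductive number through the generation-interval distribution, r_0 = 1/m(−r), where m is the moment generating function of that distribution. the sketch below assumes a gamma-distributed generation interval, which gives the closed form (1 + r·scale)^shape; the distribution choice and the example numbers are assumptions, not the paper's fitted values.

```python
def r0_from_growth_rate(r, gen_mean, gen_sd):
    """wallinga-lipsitch relation r0 = 1 / m(-r) for a gamma-distributed
    generation interval with the given mean and standard deviation."""
    shape = (gen_mean / gen_sd) ** 2     # gamma shape from mean and sd
    scale = gen_sd ** 2 / gen_mean       # gamma scale from mean and sd
    return (1.0 + r * scale) ** shape    # closed-form 1 / m(-r) for the gamma
```

a zero growth rate maps to r_0 = 1, and faster growth maps monotonically to larger r_0, which is the qualitative relationship the framework formalizes.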
• because of the quarantine strategy of the government, some infected people have been isolated during the latent days; after an average isolation period they are confirmed. a larger value of the isolation rate suggests that the government implemented tougher controls. • on average, it takes a fixed treatment period for diagnosed people to become cured, with a given cure rate, or to die, with the complementary death rate. the model is described by delay differential equations in which the kernels h_i(t, t′), i = 1, 2, 3, are distributions normalized so that ∫_0^t h_i(t, t′) dt′ = 1. each h_i can be chosen as a normal distribution, h_i = c_{i1} exp(−c_{i2}(t − t′)^2), or, for simplicity, as a δ-function, h_i(t, t′) = δ(t − t′), which means every infected individual experiences the same latent period and treatment period. the model can be used to predict the tendency of the covid-19 outbreak. suppose we know the initial conditions {i, j, g, r} at t = t_0, where t_0 is the initial time, together with the numbers of confirmed (j_data) and dead (r_data) cases. setting the morbidity to 0.99, the latent period to 7 days, the average isolation period to 14 days and the treatment period to 12 days, we can identify the spread rate, the isolation rate and the cure rate via two optimization problems. with the reconstructed optimal rates, one can put them into system (1) and solve it numerically. the fudan-ccdc model was established when cheng, one of the authors of the tdd-ncp model, suggested using the time delay model to fit the real data. the fudan-ccdc model can also be written as a discrete system, with each step representing one day, just as we have implemented in the code. we further make some assumptions on the following transition probabilities:
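the "identify the rates by optimization" step can be illustrated with trajectory matching on a toy delayed-growth model: simulate candidate (spread rate, isolation rate) pairs and keep the pair minimizing the squared distance to the confirmed-case data. the forward model, the grids and the least-squares objective below are illustrative assumptions, not the paper's optimization problems.

```python
def simulate_confirmed(beta, ell, i0, tau, days):
    """toy forward model: cumulative confirmations under spread rate beta,
    isolation rate ell, and a tau-day confirmation delay."""
    infected = [float(i0)]
    confirmed = 0.0
    out = []
    for t in range(1, days + 1):
        infected.append(infected[-1] + beta * (1.0 - ell) * (infected[-1] - confirmed))
        confirmed = infected[t - tau] if t >= tau else 0.0
        out.append(confirmed)
    return out

def fit_rates(j_data, i0, tau):
    """trajectory matching by grid search: a simple stand-in for the paper's
    optimization problems over the spread and isolation rates."""
    best = None
    for b in [k / 100.0 for k in range(5, 61, 5)]:        # candidate spread rates
        for l in [k / 100.0 for k in range(0, 91, 10)]:   # candidate isolation rates
            sim = simulate_confirmed(b, l, i0, tau, len(j_data))
            err = sum((s - d) ** 2 for s, d in zip(sim, j_data))
            if best is None or err < best[0]:
                best = (err, b, l)
    return best[1], best[2]
```

note that in this toy model only the product beta·(1 − ell) is identifiable from the trajectory, and the grid order breaks the tie among equivalent optima; this kind of degeneracy is one reason the real models bring in richer data.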
• the transition probability from infection to illness onset, • the transition probability from illness onset to hospitalization, and • the transition probability from infection to hospitalization, which can be calculated via the convolution of the first two. we assume a log-normal distribution for the infection-to-onset time and a weibull distribution for the onset-to-hospitalization time, and the distribution parameters can be estimated from ccdc data by fitting the figures. 16 in addition, we denote by the constant r a growth rate, assumed to be equal to the spread rate in the tdd-ncp model. another important improvement in the fudan-ccdc model is taking into consideration an isolation rate, which is distinct for different time stages in different regions; we assume it switches value at a time t, meaning that the government adopts a different quarantine strategy from time t onward. the growth rate r, the isolation rates before and after the switch, and the switching time t are all reconstructed numerically via an optimization problem. the model can also track the initial date of the epidemic when provided with i(t_0). in the following, we list these important parameters for some cities in china; the diamond princess cruise ship, japan without the cruise, and singapore are also included in table 1. it can be seen from the table that the growth rate r is approximately 0.31. the chinese government adopted a very strong quarantine strategy from 17 january 2020; in fact, it is reported that there have already been 70 548 infected cases and 1875 dead cases. the following figures show the forecast of the tendency of the covid-19 outbreak in some cities in china and in singapore, see figure 1. we also present the simulation results for covid-19 on the diamond princess cruise; everyone on board got off the cruise, and passengers began to disembark in batches on 19 february 2020, see figure 2. finally, we present the estimated results of possible future scenarios for covid-19 in japan (without the cruise) under different choices of the post-intervention isolation rate, see figure 3.
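the convolution step above, infection-to-hospitalization as the convolution of a log-normal infection-to-onset delay with a weibull onset-to-hospitalization delay, can be sketched by direct discretization. the numeric parameters below are illustrative placeholders, not the fitted ccdc estimates.

```python
import math

def lognormal_pdf(x, mu, sigma):
    """density of the infection-to-onset delay (log-normal assumption)."""
    if x <= 0.0:
        return 0.0
    return math.exp(-(math.log(x) - mu) ** 2 / (2.0 * sigma ** 2)) / (x * sigma * math.sqrt(2.0 * math.pi))

def weibull_pdf(x, shape, scale):
    """density of the onset-to-hospitalization delay (weibull assumption)."""
    if x < 0.0:
        return 0.0
    return (shape / scale) * (x / scale) ** (shape - 1.0) * math.exp(-((x / scale) ** shape))

def infection_to_hospitalization(days, mu=1.52, sigma=0.45, wb_shape=1.8, wb_scale=6.0):
    """discretized convolution of the two delay densities, giving the
    infection-to-hospitalization delay distribution over daily bins."""
    f2 = [lognormal_pdf(t + 0.5, mu, sigma) for t in range(days)]      # infection -> onset
    f3 = [weibull_pdf(t + 0.5, wb_shape, wb_scale) for t in range(days)]  # onset -> hospitalization
    f4 = [sum(f2[k] * f3[t - k] for k in range(t + 1)) for t in range(days)]
    total = sum(f4)
    return [p / total for p in f4]   # renormalize over the truncated support
```

the resulting distribution's mean is roughly the sum of the two component means, which is the sanity check one would apply before using it as a delay kernel.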
we conclude from the figure as follows: • with a post-intervention isolation rate of 0.4, a little bigger than the fitted value of 0.2872 but not big enough, the quarantine strategy taken by the japanese government is insufficient and the number of infected people will keep increasing exponentially. • with 0.43, the measures taken by the japanese government are still not sufficient, and the number of infected people will rise at a slower rate. • with 0.45, the stabilization period will come a little earlier, and the cumulative number of infected cases will decrease notably. • with 0.5, which means the quarantine strategy is almost as strong as that in shanghai, the epidemic will soon be under control, and the cumulative number of infected cases will be approximately 4000. in the following research, we will focus on several questions: • parameter identification problems: from the observed data, based on our model, we would like to identify the source terms, which indicate when and how the patients were infected. indeed, when applying our fudan-ccdc model to analyze the tendency of covid-19 in korea, we successfully tracked down a super spreader on 7 february 2020. further analysis of the influence of such super spreaders will be a focus of ours. • stability problems with respect to the growth rate r: when we applied the tdd-ncp and fudan-ccdc models to different regions in china and different countries, a very interesting observation was that, even taking almost the same parameter r, and although the kernels in the two models are different, we obtained similar results. this leads to further consideration of the stability of this parameter. • the observability and controllability theory of the two dynamical systems with respect to the isolation rate: this parameter plays a significant role in our models, and estimating it can help the government decide whether to strengthen the quarantine strategy.
it would be interesting to study the observability and controllability of the two models with respect to the isolation rate; the optimal control problem will be very useful here. • what is the relation between our models and the classical susceptible-exposed-infectious-recovered (seir) models? other versions of our models can also be developed, such as versions with random inputs and random parameters. moreover, the methods here can be generalized to other fields, such as finance, risk management and social networks. we will discuss these topics in the future and also welcome other groups to join us.
a time delay dynamical model for outbreak of 2019-ncov and the parameter identification
modeling and prediction for the trend of outbreak of 2019-ncov based on a time-delay dynamic system (in chinese)
a time delay dynamic system with external source for the local outbreak of 2019-ncov
the reconstruction and prediction algorithm of the fractional tdd for the local outbreak of covid-19
some novel statistical time delay dynamic model by statistics data from ccdc on novel coronavirus pneumonia submitted to
the reproductive number r_0 of covid-19 based on estimate of a statistical time delay dynamical system
covid-19 in japan: what could happen in the future?
how generation intervals shape the relationship between growth rates and reproductive numbers
nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study
novel coronavirus 2019-ncov: early estimation of epidemiological parameters and epidemic predictions
time-varying transmission dynamics of novel coronavirus pneumonia in china
analysis of the epidemic growth of the early 2019-ncov outbreak using internationally confirmed cases. medrxiv
simulating the infected population and spread trend of 2019-ncov under different policy by eir model. available at ssrn 3537083
early epidemiological assessment of the transmission potential and virulence of 2019 novel coronavirus in wuhan city, china. medrxiv
feasibility of controlling 2019-ncov outbreaks by isolation of cases and contacts. medrxiv
early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia
dynamic models for coronavirus disease 2019 and data analysis
the authors declare no potential conflicts of interest.
key: cord-330148-yltc6wpv authors: lessler, justin; azman, andrew s.; grabowski, m. kate; salje, henrik; rodriguez-barraquer, isabel title: trends in the mechanistic and dynamic modeling of infectious diseases date: 2016-07-02 journal: curr epidemiol rep doi: 10.1007/s40471-016-0078-4 sha: doc_id: 330148 cord_uid: yltc6wpv the dynamics of infectious disease epidemics are driven by interactions between individuals with differing disease status (e.g., susceptible, infected, immune). mechanistic models that capture the dynamics of such "dependent happenings" are a fundamental tool of infectious disease epidemiology. recent methodological advances combined with access to new data sources and computational power have resulted in an explosion in the use of dynamic models in the analysis of emerging and established infectious diseases. increasing use of models to inform practical public health decision making has challenged the field to develop new methods to exploit available data and appropriately characterize the uncertainty in the results. here, we discuss recent advances and areas of active research in the mechanistic and dynamic modeling of infectious disease. we highlight how a growing emphasis on data and inference, novel forecasting methods, and increasing access to "big data" are changing the field of infectious disease dynamics.
we showcase the application of these methods in phylodynamic research, which combines mechanistic models with rich sources of molecular data to tie genetic data to population-level disease dynamics. as dynamic and mechanistic modeling methods mature and are increasingly tied to principled statistical approaches, the historic separation between infectious disease dynamics and "traditional" epidemiologic methods is beginning to erode; this presents new opportunities for cross-pollination between fields and novel applications. in 1916, ronald ross coined the term "dependent happenings" to capture the fundamental difference between the study of infectious diseases in populations and that of other health phenomena [1]. because infectious diseases are, for the most part, acquired from the people around us, our own future health status depends on that of our neighbors (e.g., the more people we know who are infected, the more likely we are to become infected ourselves). for acute infectious diseases, the health status of the population often changes quickly over time, with the number of people infectious, susceptible to being infected, and immune to the disease changing substantially over the course of an epidemic. further, membership in each of these groups does not vary arbitrarily over time but is driven by often well-understood biological processes (box 1). for instance, in the simple example of a permanently immunizing infection spread through person-to-person transmission, such as measles, new susceptible individuals enter the population only through birth and immigration; these individuals can then become infected only by contact with existing infectious individuals, who, in turn, will eventually become immune or die and be forever removed from participation in the epidemic process. the epidemic dynamics of infectious diseases are driven by similar mechanistic relationships between the current and future health states of the population.
box 1 illustrates the expected number of infections at some time t for a directly transmitted disease. dynamic and mechanistic models of disease spread, regardless of complexity, capture these relationships in order to improve inference or predict disease dynamics. the study of infectious disease dynamics encompasses the study of any of the shared drivers of the mechanistic processes of disease spread, with an eye towards better understanding disease transmission. as illustrated above, these include: the size of the susceptible population: the number of people available to be infected. the dynamics of susceptibility are not shown here but can themselves be complex, as new susceptibles enter the population through birth, immigration and loss of immunity. for many diseases (e.g., dengue, influenza), susceptibility is not a binary state, and complex models may be needed. the force of infection: the probability that any individual who is susceptible at a given time becomes infected (analogous to the hazard of infection). the size of the susceptible population times the force of infection gives the reproductive number; when this value is above 1, the epidemic will grow, and when it falls below 1, it will recede. the infectious process: the chance of becoming infected on a direct or indirect contact with an infectious individual. here represented as a per-contact probability of infection, this can itself be a complex, multi-faceted process. the contact process: the process by which infectious contacts are made, whether directly or mediated by some vector or the environment, is one of the most complex parts of the infectious process. much modern research focuses on accounting for the role of space and population structure in the contact process.
previous infections: fundamental to the nature of infectious diseases is the number of previous infections; however, these may not lead to current infections as directly as illustrated here if transmission is mediated by a vector or the environment. the natural history of disease: how infectious people are at particular times after their infection determines their contribution to ongoing disease transmission, and fundamentally drives the speed at which epidemics move through the population. other aspects of disease natural history (e.g., the incubation period) may determine our ability to control a disease and its ultimate health impact. over the course of the twentieth century, the main body of epidemiologic research became increasingly reliant on models of statistical association, often with strong assumptions of independence between observations (hereafter referred to as associative models) [2]. however, as a result of the need to deal with dependent happenings, there remained a strong subpopulation within infectious disease epidemiology that used models of an entirely different type. variously referred to as "mathematical," "dynamic," or "mechanistic" models, these models are characterized by having a mechanistic representation of the dynamic epidemic process that determines how the population's state at time t + 1 depends on its state at time t (hereafter referred to as mechanistic models). historically, these models have more often been deterministic and built top-down from first principles rather than based on patterns in any particular dataset. however, as increasing computational power has caused an explosion in the types of models that can be subjected to rigorous statistical analysis, there has been a shift toward more data-driven and statistical approaches and a greater focus on stochasticity and uncertainty.
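the mechanistic state-update idea in box 1 can be made concrete with a minimal discrete sir step, where tomorrow's susceptible, infected and recovered counts are functions of today's; all numbers here are hypothetical.

```python
def simulate_sir(beta, gamma, n, i0, days):
    """daily euler steps of the classic sir model: tomorrow's state is a
    mechanistic function of today's (ross's 'dependent happenings')."""
    s, i, r = float(n - i0), float(i0), 0.0
    traj = []
    for _ in range(days):
        new_inf = beta * s * i / n   # force of infection acting on susceptibles
        new_rec = gamma * i          # infectious individuals removed to immunity
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        traj.append((s, i, r))
    return traj
```

the epidemic turns over when the effective reproductive number β·s/(γ·n) drops below 1, i.e., when s/n falls to γ/β, which is exactly the dependence of future incidence on the current susceptible pool described above.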
this confluence between principled statistical inference and mechanistic processes is paying huge dividends in the quality of the work being produced and the types of questions being answered across disciplines within infectious disease epidemiology. infectious disease models are being given a firmer empirical footing, while the use of generative mechanistic approaches allows us to use models as tools for forecasting, strategic planning, and other activities in ways that would not be possible with models that do not represent the underlying dynamic epidemiologic processes. in this manuscript, we review current research into dynamic and mechanistic models of infectious disease, with a focus on how the confluence of mechanistic approaches, new statistical methods, and novel sources of data related to disease spread is opening up new avenues in infectious disease research and public health. for those interested in further pursuing the topic, we provide a list of key resources in box 2. recent work in infectious disease dynamics has been characterized by an increasing focus on data and principled approaches to inference. traditionally, deterministic models were a dominant tool for studying the theoretical and practical basis of disease transmission in humans and animals. this approach yielded important practical and theoretical results that form the basis of our understanding of disease dynamics [3•, 4] but was limited in approach. deterministic models are usually parameterized through some combination of trajectory matching (i.e., minimizing the distance between observed and simulated data) and specifying parameters based on previous literature.
this approach may be sufficient to describe the expected behavior of an infectious disease in a large population, but an increasing focus on how stochasticity and parameter uncertainty impact public health decision making, combined with the growing availability of computational power, has driven a move toward more statistically principled, data-driven, likelihood-based approaches. illustrative of this evolution is the contrast between early descriptions of the key dynamic properties of hiv transmission and more recent dynamic characterizations of pandemic h1n1 influenza (h1n1pdm), middle east respiratory syndrome (mers-cov), and ebola. in the late 1980s, several papers were published laying out the essential properties of hiv transmission dynamics that would govern the course of the epidemic (at least in the near term) [5-7]. these papers presented deterministic epidemic models that captured the processes driving the epidemic and highlighted the key parameters, such as the speed of progression to aids, that needed to be investigated. uncertainty was largely addressed through scenario-based approaches (e.g., different future epidemic trajectories were presented for different plausible sets of parameters), and, for the most part, different aspects of the transmission dynamics were derived from independent studies, with only the growth rate (i.e., doubling time) estimated from incidence data. while the parameters essential to characterizing epidemic dynamics remain largely unchanged for recently emerging pathogens, the approach to data and estimation is qualitatively different. integrated statistical frameworks built on markov chain monte carlo (mcmc) techniques are used to estimate all, or most, parameters from different datasets and to produce posterior distributions both for parameter estimates and for forecasts of future incidence [8, 9•, 10].
these methods allow innovative use of unconventional data sources, such as disease incidence among travelers [8, 9•], to estimate the population incidence of disease, and molecular data can supplement incidence data, providing independent estimates of the same parameters (see the discussion of phylodynamics below) [8, 9•]. these recent attempts to quickly characterize the properties of emerging diseases are emblematic of an increasing focus on developing statistical methods, grounded in dynamical models, to estimate key epidemic parameters based on diverse data sources. surveillance data are often used to estimate the reproductive number (r_t, the number of secondary infections that a primary infection is expected to cause at any point t in an epidemic), the incubation period, and the serial interval (the expected time between symptom onset in a case and in the people that case infects), as was done in recent outbreaks of mers-cov [9•, 11] and ebola [10]. surveillance data have also been paired with serological data to estimate the force of infection (i.e., the hazard of infection) and the basic reproductive number (r_t when the population is fully susceptible, designated r_0) of several pathogens, including dengue and chikungunya [12-14]. dynamic modeling approaches can also aid in the interpretation of surveillance data. state-space models (e.g., hidden markov models) have been used to pair our mechanistic understanding of disease transmission with a statistical inference framework by linking observed incidence and dynamics with the underlying population disease burden and susceptibility (i.e., the population's state). notably, this approach has been used to estimate global reductions in mortality due to measles in the face of incomplete reporting [15•]. likewise, valle and colleagues used hybrid associative and mechanistic models to account for biases that treatment of detected malaria cases might have on estimates of key values such as the incidence rate [16].
perhaps the biggest limitation when attempting to characterize the parameters driving disease transmission remains data availability. data on disease transmission often come from incomplete surveillance or represent one aspect of a partially observed epidemic process. for example, epidemic curves are usually limited to symptomatic cases. similarly, key events in the transmission process, such as the exact time of infection, are generally not observable and have to be inferred from observed data. methods, such as the use of mcmc-based data augmentation and known transmission processes to infer the possible distribution of transmission trees, have been developed to deal with partially observed data and have been used to reconstruct outbreaks [17], characterize risk factors for transmission [18], and quantify the impact of interventions [19]. a limitation of likelihood-based approaches, such as those mentioned above, is that it is often impossible or impractical to evaluate the data likelihood, particularly for complex models and large datasets. to deal with this challenge, several "likelihood-free" approaches have been developed, including approximate bayesian computation (abc) and sequential monte carlo (smc) [20, 21]. an advantage of these approaches is that they only require the ability to simulate from candidate models (i.e., if data can be simulated, calculation of the likelihood is unnecessary) and therefore can be applied more easily than methods that require iterative evaluation of the likelihood. abc has been used to integrate phylodynamic and epidemic models of influenza and other pathogens [22], and smc methods have been used to parametrize dengue transmission models using data from multi-centric vaccine clinical trials [23•]. despite important advances over the past decades in linking data and transmission models as tools of inference, many challenges remain and are the topic of continued research.
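rejection abc, the simplest of the likelihood-free approaches mentioned above, can be sketched in a few lines: draw parameters from a prior, simulate, and keep draws whose summary statistic lands close to the data, with no likelihood ever evaluated. the sir simulator, the prior bounds and the tolerance below are illustrative assumptions.

```python
import random

def final_epidemic_size(beta, gamma=0.2, n=10_000, i0=10, days=500):
    """deterministic sir final size, used as the simulated summary statistic."""
    s, i = float(n - i0), float(i0)
    for _ in range(days):
        new_inf = beta * s * i / n
        s, i = s - new_inf, i + new_inf - gamma * i
    return n - s - i   # removed individuals = final epidemic size

def abc_rejection(observed, n_draws=2000, eps=100.0, seed=0):
    """rejection abc: sample the transmission rate from a uniform prior and
    accept draws whose simulated final size is within eps of the observation."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        beta = rng.uniform(0.2, 0.8)   # prior on the transmission rate
        if abs(final_epidemic_size(beta) - observed) < eps:
            accepted.append(beta)
    return accepted
```

the accepted draws approximate the posterior: when the observation is generated at a known transmission rate, the accepted sample concentrates around that value.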
inference for complex models and using large datasets remains challenging, in part due to computational burden. mechanistic models offer promise as a way to simultaneously link data from diverse, heterogeneous data sources (as in [24]), but this promise has yet to be realized, though some phylodynamics methods come close (see below). further, rapid inference on emergent epidemics remains a tool only used in high-profile epidemics [9•, 10], and these inferential techniques remain inaccessible to field epidemiologists. scientists and physicians have tried to forecast the course of epidemics since the time of hippocrates. associations between incidence and extrinsic factors such as time of year, climate, and weather can be, and have been, used to forecast infectious diseases [25] [26] [27]. however, mechanistic models that capture the natural history of the disease (e.g., duration of immunity and cross protection) [28], mode of transmission [29], and movement patterns [30, 31] can improve forecasts, particularly when associations with extrinsic drivers of incidence, such as climate, are weak or unknown (e.g., for emerging pathogens). in recent years, forecasts based on models that capture the underlying mechanistic processes of transmission and pathogenesis have become common. uses range from forecasting the peak timing and magnitude of an influenza season [32], to forecasting the spread and spatial extent of emerging pathogens such as zika, ebola, and chikungunya [33] [34] [35]. the mechanistic underpinning of these models allows forecasts to take into account dynamic processes that may otherwise be impossible to capture, including changes in behavior and resource availability in response to an epidemic [36]. approaches adopted from computer science, machine learning, and climate science have enhanced our ability to provide reliable forecasts with quantified uncertainty.
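a minimal sketch of the kind of mechanistic forecast described above: a deterministic sir model, integrated with forward-euler steps, from which the peak timing and magnitude of an outbreak can be read off. all parameters are invented for illustration, not taken from any cited forecast.

```python
def sir_forecast(beta, gamma, s0, i0, days, dt=0.1):
    """deterministic sir model integrated with forward-euler steps;
    returns daily prevalence so peak timing/magnitude can be read off."""
    s, i, r = float(s0), float(i0), 0.0
    n = s0 + i0
    series = []
    steps_per_day = int(round(1 / dt))
    for _ in range(days):
        series.append(i)
        for _ in range(steps_per_day):
            new_inf = beta * s * i / n * dt   # transmission term
            new_rec = gamma * i * dt          # recovery term
            s -= new_inf
            i += new_inf - new_rec
            r += new_rec
    return series

# r0 = beta/gamma = 2 in a population of 10,000 with 10 initial cases
prev = sir_forecast(beta=0.4, gamma=0.2, s0=9990, i0=10, days=120)
peak_day = max(range(len(prev)), key=lambda d: prev[d])
```

because r0 > 1, prevalence rises, peaks once susceptibles are depleted below gamma/beta of the population, and then declines; the peak day and height are the quantities a mechanistic forecast reports.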
particularly important are ensemble approaches [37], which integrate forecasts from multiple imperfect models or different parameterizations of the same model to calculate a distribution of potential courses of the epidemic [38, 39]. ensemble approaches have been used for forecasting influenza in temperate regions [40], where influenza is highly seasonal, and more recently in subtropical areas such as hong kong, where the seasonal pattern is less distinct [32]. similarly, ensemble-based climate models have been incorporated with infectious disease models to forecast climate-related diseases including plague and malaria [41]. these examples use multiple parameterizations of a single model. ensemble approaches can also be used to accommodate uncertainty in model structure by comparing estimates from parameterizations across different models, as in work by smith and colleagues where an ensemble of 14 different individual-based models was used to estimate the impact of a future malaria vaccine [42]. there has been an explosion in the number of forecasts being made to aid public health decision making, including a number of government-sponsored contests to forecast the progression of epidemics of diseases ranging from influenza to chikungunya and dengue [43] [44] [45]. as forecasts become more widely used, care must be given to ensure that the purposes of the model and its uncertainty (both structural and statistical) are well communicated. in recent outbreaks of emerging infectious diseases, like ebola, groups raced to make forecasts of the evolution and spatial spread of the outbreak [33, 34], with some predicting an epidemic size orders of magnitude greater than what was actually observed.
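the core of the ensemble idea discussed above — run several parameterizations and summarise the spread of outcomes rather than a single trajectory — can be sketched with a deliberately simple exponential-growth member model; all growth rates below are hypothetical.

```python
import math
import statistics

def member_forecast(i0, r, day):
    """one ensemble member: a deterministic exponential-growth forecast."""
    return i0 * math.exp(r * day)

def ensemble_forecast(i0, growth_rates, day):
    """run every parameterization and summarise the distribution of
    outcomes as a median and a crude 80% interval."""
    runs = sorted(member_forecast(i0, r, day) for r in growth_rates)
    lo = runs[int(0.1 * (len(runs) - 1))]
    hi = runs[int(0.9 * (len(runs) - 1))]
    return statistics.median(runs), (lo, hi)

# 11 parameterizations with daily growth rates spread over [0.05, 0.15]
rates = [0.05 + 0.01 * k for k in range(11)]
median, (lo, hi) = ensemble_forecast(100, rates, day=14)
```

reporting the interval alongside the median is what distinguishes an ensemble forecast from a point forecast, and is one way to communicate the statistical uncertainty the text warns about.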
while some of these extreme forecasts were made as worst-case planning scenarios, they were interpreted as likely scenarios, raising alarm and casting doubt on the validity of model-based forecasts, thus highlighting the importance of clear communication of a model's purpose and its limitations [33, 46]. the quality of infectious disease forecasts and standards for their interpretation are far from the gold standard of methods and conventions used in meteorology. improvement in both the methods used and their practical use remain critical areas of future research. the advent of 'big data' has opened up new avenues in how we parameterize and understand models of infectious disease spread. big data refers to massive datasets that are too large or complex to be processed using conventional approaches [47]. however, advances in computing increasingly allow their use without large delays in processing time or unrealistic computing capacity requirements. one of the most successful attempts to use big data to understand disease dynamics has been the use of call data records (cdrs) to capture human mobility. for each call that is made or received, mobile phone operators capture the mobile phone tower through which the call is made. by tracking tower locations for a subscriber, we can capture where he or she is moving. in practice, to ensure confidentiality, cdrs are usually averaged over millions of subscribers to provide estimates of flux between different locations in a country. transmission models built upon cdr-based estimates of seasonal patterns of human movement have been used to explain patterns of rubella disease in kenya [48] and dengue in pakistan [31].
in both instances, models built on empirical human movements seen in cdrs outperformed alternative parametric models of population movement based on our theoretical understanding of human travel patterns (e.g., gravity models, where movement is based on community size and distance [49]) and models where movement was not considered. cdr-based models have also been used to understand the dynamics of large-scale outbreaks such as ebola in west africa [50] and challenges in malaria elimination [51]. questions remain as to the generalizability of cdr-based analyses in settings where mobile phone ownership is low [52], and problems capturing flows between countries remain. however, the large-scale penetration of mobile phones, even in resource-poor settings, makes cdrs a hugely valuable data source for informing infectious disease models. another type of big data that has enormous potential for furthering our understanding of disease dynamics is satellite imagery. detailed satellite imagery can provide high-spatial-resolution estimates of key determinants of many infectious disease processes, including environmental factors (e.g., land cover), climatic conditions (e.g., precipitation, temperature), and population density throughout the globe [53, 54]. in infectious disease epidemiology, such datasets have recently been used as the basis for statistical models that produce fine-scale maps of disease incidence, prevalence, and derived transmission parameters (e.g., force of infection, basic reproductive number) for a large number of diseases. early efforts focused on mapping the global distribution of key drivers of malaria transmission [55, 56]. these approaches have since been used to estimate the burden from a wide range of pathogens [57] [58] [59] [60], vectors [61], and host reservoirs [62].
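the gravity models mentioned above have a very simple functional core: flux between two communities grows with their population sizes and decays with distance. a sketch with hypothetical populations follows; in practice the exponents are fitted to observed travel data rather than fixed as here.

```python
def gravity_flux(pop_a, pop_b, distance_km, k=1.0, alpha=1, beta=1, gamma=2):
    """gravity-model trip volume between two communities: flux scales with
    the populations (exponents alpha, beta) and decays with distance via a
    power-law kernel (exponent gamma); k is an overall scaling constant."""
    return k * (pop_a ** alpha) * (pop_b ** beta) / (distance_km ** gamma)

# with unit exponents, doubling either population doubles the flux,
# and with gamma = 2, doubling the distance quarters it
base = gravity_flux(1e5, 5e4, 100.0)
```

the comparison in the text is precisely between this kind of parametric kernel and empirical cdr-derived flux matrices, with the latter winning where phone coverage is good.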
these analyses have allowed disease burden and risk to be estimated in areas with limited surveillance capabilities, expanding our understanding of the global burden of many pathogens. high-resolution geographic data can gain additional power when paired with mechanistic models that capture changes in disease risk, as in recent analyses that accounted for the effect of birth, natural infection, and vaccine disruptions driving increases in measles susceptibility and epidemic risk in the wake of the ebola outbreak [63]. finally, big data are increasingly being used with mechanistic models to more directly estimate disease burden in real time [64]. for example, patterns in the usage of different google search terms have been shown to correlate well with incidence trends for diseases such as influenza [65, 66] and dengue [67]. it is worth noting that big data alone can typically only explain part of trends in incidence, and models that incorporate seasonal dynamics typically outperform models that rely solely on search terms. similar approaches have been used with wikipedia updates [68] and social media sites such as twitter and facebook. electronic medication sales data and electronic medical records have also been proposed as novel data sources for disease trends [69]. these approaches can provide estimates much faster than traditional surveillance systems, where it often takes weeks or months for case data to be aggregated and analyzed. mechanistic models can then be fit to these data to better understand seasonal or spatial parameters. for example, yang et al. used mechanistic models fit to google flu trends data to estimate epidemiological parameters such as the basic reproductive number and the attack rate for 115 cities in the usa over a 10-year period [69, 70].
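checking how well a digital proxy tracks surveillance data usually starts with a simple correlation. the sketch below computes a pearson correlation between an invented weekly incidence series and an invented search-volume series; both series are hypothetical and stand in for the kinds of paired signals discussed above.

```python
def pearson(x, y):
    """pearson correlation coefficient, used here to check how well a
    digital proxy (e.g. weekly search volumes) tracks reported incidence."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# hypothetical weekly flu incidence and a noisy search-volume proxy
incidence = [12, 18, 30, 55, 80, 70, 45, 25, 15, 10]
searches = [15, 22, 33, 60, 85, 72, 50, 30, 14, 12]
rho = pearson(incidence, searches)
```

a high correlation motivates, but does not by itself justify, fitting mechanistic models to the proxy, which is why the text stresses that models with seasonal dynamics outperform search terms alone.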
phylodynamics, the study of how epidemiological, immunological, and evolutionary processes interact to shape pathogen genealogies, is among the newest and fastest-growing areas in infectious disease research [71]. the term phylodynamics was coined in 2004 by grenfell et al., who observed that the structure of pathogen phylogenies reveals important features of epidemic dynamics in populations and within hosts [72]. this relationship provides a theoretical framework for linking molecular data with population-level disease patterns using dynamic models. early methodological work in phylodynamics concentrated on the formal integration of kingman's coalescent and birth-death models from population genetics with standard deterministic epidemic models. the coalescent model provides a framework for estimating the probability of coalescent events (lineages converging at a common ancestor) as we move back in time, given changes in population size [73]. the branching patterns in a phylogenetic tree describe the ancestral history of sequenced pathogens, such that nodes closer to the root of the tree represent historical coalescent events while nodes near the tips represent recent events. the strong relationship between the genetic divergence of pathogens and time allows us to estimate the timing of coalescent events and the rate of growth (or decline) of pathogen populations. these estimates are the critical link between genetic and epidemic models [74]. the formal statistical integration of population genetic and epidemic models allows us to estimate critical epidemiological parameters such as the basic reproductive number directly from pathogen sequence data [75] [76] [77]. for example, magiorkinis et al. used sequence data from viruses collected over a 12-year period in greece to estimate subtype-specific reproductive numbers and generation times for hepatitis c [77].
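the coalescent arithmetic underlying these methods is compact: under kingman's coalescent with a constant effective population size n, the expected waiting time while k lineages remain is 2n/(k(k-1)) generations, and summing over k telescopes to the expected depth of the tree. the sketch below assumes a constant population size, a simplification that epidemic applications relax by letting population size change through time.

```python
def expected_tmrca(n_lineages, pop_size):
    """kingman coalescent with constant effective size N: with k lineages
    the expected time to the next coalescence is 2N / (k(k-1)) generations,
    so summing over k = 2..n gives the expected time to the most recent
    common ancestor, which telescopes to 2N(1 - 1/n)."""
    return sum(2 * pop_size / (k * (k - 1)) for k in range(2, n_lineages + 1))

# larger populations coalesce more slowly, so their genealogies are deeper;
# this is the lever that lets tree shape inform population (epidemic) size
depth_small = expected_tmrca(10, pop_size=1000)
depth_large = expected_tmrca(10, pop_size=2000)
```

the key inferential direction runs the other way: observed coalescence times in a dated phylogeny constrain the population-size trajectory, which epidemic models then interpret as prevalence.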
using data from the athena hiv cohort, which samples ~60% of hiv-infected persons in the netherlands, bezemer et al. used viral sequence data to estimate reproductive numbers for hundreds of circulating transmission chains, showing that large chains persisted within the netherlands for years near the threshold for sustaining an epidemic (r = 1) [77, 78]. other phylodynamic applications have focused on elucidating the spatial dispersal patterns of diseases such as influenza and hiv. in an analysis of nearly 10,000 influenza genomes, bedford et al. showed fundamental differences in the global circulation patterns of h3n2, h1n1, and influenza b viruses and that these were potentially driven by differences in the force of infection and rates of immune escape (i.e., antigenic drift) [79]. likewise, faria et al. used hiv sequence data from central africa to reconstruct the early epidemic dynamics of hiv-1 using phylodynamic methods and showed that kinshasa in the democratic republic of congo likely served as the focal point for global hiv spread [79, 80]. phylodynamics plays an important role in real-time infectious disease surveillance and targeted control [81]. in recent epidemics of mers-cov and ebola, genomic data were used to assess transmission patterns, monitor viral evolution in populations, and inform epidemic control [9•, [82] [83] [84]. analyses of hiv epidemics among us and european men who have sex with men demonstrate that the amalgamation of epidemiologic and genomic data can be used to identify high-risk transmitters and optimal targeted intervention packages [85•, 86]. however, the utility of real-time phylodynamic analysis in many settings remains hindered by inadequate infrastructure, scarce viral sequence data, and limited analytic capacity at local levels.
initial phylodynamic models could only deal with simple epidemic patterns (e.g., exponential growth), and recent methodological work has focused on extending the phylodynamic framework to account for complex nonlinear population dynamics [87, 88]. for instance, rasmussen and colleagues showed how phylodynamic models could be extended to integrate more complex stochastic and structured epidemic models using bayesian mcmc and particle filtering [89•]. others have focused on resolving transmission network structure from phylogenies [90, 91] or integrating data across multiple scales by incorporating information on intra-host pathogen diversity and ecological processes directly into phylodynamic models [92]. however, equally important recent work has shown that phylodynamic inferences can be highly sensitive to sampling and unmeasured factors. simulation studies show that the relationship between phylogenetic trees and the underlying transmission networks is a complex function of the sampling fraction and underlying epidemic dynamics [93, 94] and that failure to account for intra-host viral diversity may bias phylodynamic inference [95]. here, we have focused on areas where we feel that there has been the most innovation in the use of dynamic epidemic models in recent years. this is not to imply that innovation has stopped in other areas where dynamic models play a key role. dynamic models have long been key to our understanding of epidemic theory. innovative models continue to be developed to deal with the challenges posed by pathogen evolution [96], complex immunological interactions [97], and host heterogeneity [98]. there has been increasing emphasis on the use of dynamic models in informing public health policy since the early 2000s, when they played a key role in the response to the foot-and-mouth disease outbreak in the uk [99] and the assessment of the risk from a smallpox-based bioterrorist attack [100, 101].
these uses have extended to endemic disease, such as a 2009 modeling analysis by granich and colleagues [102] that highlighted the potential of 'test-and-treat' strategies for hiv control. recently, dynamic models have played an important role in guiding the response to emerging disease threats, from pandemic influenza [103], to multi-drug resistant organisms [104, 105], to mers-cov [106]. many of the themes discussed throughout this manuscript have had a profound impact on these efforts, as has the need to report results and assumptions in a way accessible to policy makers. mechanistic models also crop up in other areas of epidemiology, often in less obvious ways. nearly all of the key methods of genetic epidemiology are based on a mechanistic understanding of the underlying processes of inheritance, mutation, and selective pressure. social epidemiology at its core is based on the idea that our health depends on the behavior and health of those around us and, hence, has its own approaches to dependent happenings (though the terminology differs). recently, there has been increasing interest in using mechanistic modeling approaches similar to those used for infectious disease to understand health phenomena that are, in part, socially driven, such as obesity [107]. physiological measurements are often founded on mechanistic models of processes within the body (e.g., use of serum creatinine to approximate the glomerular filtration rate, a key measure of kidney function [108]). infectious disease dynamics is, perhaps, unique in epidemiology in the number of researchers that it brings in from non-health-related disciplines, particularly physics, computer science, and ecology. this, combined with the unique aspects of infectious disease systems, has contributed to the use of models that are distinct from 'traditional' epidemiologic methods.
however, the field is being transformed by the same forces that are transforming epidemiology in general: increasing access to technological tools and computational power; an explosion in the availability of data at the molecular, individual, and population levels; and a shift in what the important epidemiologic questions are as we eliminate old health threats and change our environment. increasing emphasis on principled statistical analysis in infectious disease modeling, combined with an increasing need to deal with dynamic phenomena in epidemiologic inference, opens up new opportunities for the cross-pollination of ideas and the erosion of the historical barriers between epidemiologic fields.

- analysis of mers outbreak that incorporated a wide array of different data sources including human mobility, phylogenetic and case data into mechanistic models to allow inference on key transmission parameters
- who ebola response team. ebola virus disease in west africa: the first 9 months of the epidemic and forward projections
- hospital outbreak of middle east respiratory syndrome coronavirus
- reconstruction of 60 years of chikungunya epidemiology in the philippines demonstrates episodic and focal transmission
- estimating dengue transmission intensity from sero-prevalence surveys in multiple countries
- revisiting rayong: shifting seroprofiles of dengue in thailand and their implications for transmission and control
- uses mechanistic models to make key inferences about the global burden of disease in the presence of imperfect data
- improving the modeling of disease data from the government surveillance system: a case study on malaria in the brazilian amazon
- role of social networks in shaping disease transmission during a community outbreak of 2009 h1n1 pandemic influenza
- a bayesian mcmc approach to study transmission of influenza: application to household longitudinal data
- inferring influenza dynamics and control in households
- approximate bayesian computation scheme for parameter inference and model selection in dynamical systems
- sequential monte carlo without likelihoods
- phylodynamic inference and model assessment with approximate bayesian computation: influenza as a case study
- estimation of parameters related to vaccine efficacy and dengue transmission from two large phase iii studies
- measuring the performance of vaccination programs using cross-sectional surveys: a likelihood framework and retrospective analysis
- forecasting malaria incidence from historical morbidity patterns in epidemic-prone areas of ethiopia: a simple seasonal adjustment method performs best
- climate cycles and forecasts of cutaneous leishmaniasis, a nonstationary vector-borne disease
- cholera dynamics and el niño-southern oscillation
- interactions between serotypes of dengue highlight epidemiological impact of cross-immunity
- generalized reproduction numbers and the prediction of patterns in waterborne disease
- socially structured human movement shapes dengue transmission despite the diffusive effect of mosquito dispersal
- impact of human mobility on the emergence of dengue epidemics in pakistan
- forecasting influenza epidemics in hong kong
- estimating the future number of cases in the ebola epidemic: liberia and sierra leone
- assessing the international spreading risk associated with the 2014 west african ebola outbreak
- model-based projections of zika virus infections in childbearing women in the americas
- ebola cases and health system demand in liberia
- a bayesian ensemble approach for epidemiological projections
- testing a multi-malaria-model ensemble against 30 years of data in the kenyan highlands
- malaria early warnings based on seasonal climate forecasts from multi-model ensembles
- real-time influenza forecasts during the 2012-2013 season
- improvement of disease prediction and modeling through the use of meteorological ensembles: human plague in uganda
- ensemble modeling of the likely public health impact of a pre-erythrocytic malaria vaccine
- epidemic prediction initiative
- chikungunya threat inspires new darpa challenge
- noaa's national weather service
- ebola infections fewer than predicted by disease models
- planning for big data
- quantifying seasonal population fluxes driving rubella transmission dynamics using mobile phone data
- the gravity model in transportation analysis: theory and extensions. utrecht: vsp
- commentary: containing the ebola outbreak: the potential and challenge of mobile network data
- integrating rapid risk mapping and mobile phone call record data for strategic malaria elimination planning
- the impact of biases in mobile phone ownership on estimates of human mobility
- the quality control of long-term climatological data using objective data analysis
- high-resolution gridded population datasets for latin america and the caribbean in 2010
- a world malaria map: plasmodium falciparum endemicity in 2007
- the malaria atlas project: developing global maps of malaria risk
- global distribution maps of the leishmaniases. elife
- using global maps to predict the risk of dengue in europe
- mapping the zoonotic niche of marburg virus disease in africa
- remote sensing, land cover changes, and vector-borne diseases: use of high spatial resolution satellite imagery to map the risk of occurrence of cutaneous leishmaniasis in ghardaïa
- the global distribution of the arbovirus vectors aedes aegypti and ae
- mapping the zoonotic niche of ebola virus disease in
- reduced vaccination and the risk of measles and other childhood infections post-ebola
- enhancing disease surveillance with novel data streams: challenges and opportunities
- detecting influenza epidemics using search engine query data
- comparison: flu prescription sales data from a retail pharmacy in the us with google flu trends and us ilinet (cdc) data as flu activity indicator
- prediction of dengue incidence using search query surveillance
- wikipedia usage estimates prevalence of influenza-like illness in the united states in near real-time
- tracking cholera through surveillance of oral rehydration solution sales at pharmacies: insights from urban bangladesh
- accurate estimation of influenza epidemics using google search data via argo
- viral phylodynamics
- unifying the epidemiological and evolutionary dynamics of pathogens
- phylodynamics of infectious disease epidemics
- integrating phylodynamics and epidemiology to estimate transmission diversity in viral epidemics
- viral phylodynamics and the search for an 'effective number of infections'
- estimating the basic reproductive number from viral sequence data
- dispersion of the hiv-1 epidemic in men who have sex with men in the netherlands: a combined mathematical model and phylogenetic analysis
- global circulation patterns of seasonal influenza viruses vary with antigenic drift
- hiv epidemiology. the early spread and epidemic ignition of hiv-1 in human populations
- genomic analysis of emerging pathogens: methods, application and future trends
- genomic surveillance elucidates ebola virus origin and transmission during the 2014 outbreak
- phylodynamic analysis of ebola virus in the
- insights into the early epidemic spread of ebola in sierra leone provided by viral sequence data
- key example showing that clinical and phylogenetic data can be used to identify predominant sources of ongoing viral hiv-1 transmission during early infection in men who have sex with men: a phylodynamic analysis
- inference for nonlinear epidemiological models using genealogies and time series
- complex population dynamics and the coalescent under neutrality
- uses innovative methods to combine mechanistic models within phylodynamics to estimate transmission parameters
- modelling tree shape and structure in viral phylodynamics
- inferring epidemic contact structure from phylogenetic trees
- reconciling phylodynamics with epidemiology: the case of dengue virus in southern vietnam
- contact heterogeneity and phylodynamics: how contact networks shape parasite evolutionary trees
- how the dynamics and structure of sexual contact networks shape pathogen phylogenies
- within-host bacterial diversity hinders accurate reconstruction of transmission networks from genomic distance data
- core groups, antimicrobial resistance and rebound in gonorrhoea in north america
- age profile of immunity to influenza: effect of original antigenic sin
- insights from unifying modern approximations to infections on networks
- dynamics of the 2001 uk foot and mouth epidemic: stochastic dispersal in a heterogeneous landscape
- containing bioterrorist smallpox
- emergency response to a smallpox attack: the case for mass vaccination
- universal voluntary hiv testing with immediate antiretroviral therapy as a strategy for elimination of hiv transmission: a mathematical model
- strategies for containing an emerging influenza pandemic in southeast asia
- improving control of antibiotic-resistant gonorrhea by integrating research agendas across disciplines: key questions arising from mathematical modeling
- modeling epidemics of multidrug-resistant m. tuberculosis of heterogeneous fitness
- estimating potential incidence of mers-cov associated with hajj pilgrims to saudi arabia
- social network analysis and agent-based modeling in social epidemiology
- estimating glomerular filtration rate from serum creatinine and cystatin c

key: cord-340713-v5sdowb7 authors: bird, jordan j.; barnes, chloe m.; premebida, cristiano; ekárt, anikó; faria, diego r. title: country-level pandemic risk and preparedness classification based on covid-19 data: a machine learning approach date: 2020-10-28 journal: plos one doi: 10.1371/journal.pone.0241332 sha: doc_id: 340713 cord_uid: v5sdowb7

in this work we present a three-stage machine learning strategy to country-level risk classification based on countries that are reporting covid-19 information.
a k% binning discretisation (k = 25) is used to create four risk groups of countries based on the risk of transmission (coronavirus cases per million population), risk of mortality (coronavirus deaths per million population), and risk of inability to test (coronavirus tests per million population). the four risk groups produced by k% binning are labelled as 'low', 'medium-low', 'medium-high', and 'high'. coronavirus-related data are then removed, and the attributes for prediction of the three types of risk are given as the geopolitical and demographic data describing each country. thus, the calculation of the class label is based on coronavirus data, but the input attributes are country-level information independent of coronavirus data. the three four-class classification problems are then explored and benchmarked through leave-one-country-out cross validation to find the strongest model, producing a stack of gradient boosting and decision tree algorithms for risk of transmission, a stack of support vector machine and extra trees for risk of mortality, and a gradient boosting algorithm for the risk of inability to test. it is noted that high risk for inability to test is often coupled with low risks for transmission and mortality; therefore the risk of inability to test should be interpreted first, before consideration is given to the predicted transmission and mortality risks. finally, the approach is applied to more recent data from september 2020 and weaker results are noted: the growth of international collaboration reduces the predictive value of country-level attributes, which suggests that similar machine learning approaches are most useful early on, before the situation unfolds. according to the future of humanity institute there is a 2.05% chance that mankind will go extinct by the year 2100, through either a natural or engineered pandemic [1].
if there is one lesson to learn from the ongoing covid-19 coronavirus (sars-cov-2) pandemic, it is that we were not prepared. the virus initially spread rapidly across the globe, mortality began to rise, and countries desperately struggled to test their citizens for the virus once it became known that many infectious carriers of it show no noticeable symptoms [2] [3] [4]. this suggests three main risk factors to be observant of: the initial risk of transmission due to varying factors such as, for example, population density [5] and international travel [6]; the risk of mortality due to ageing populations [7] and underlying health issues [8, 9]; and finally the risk of a country not being able to test citizens aptly and thus producing possibly under-reported measures of the previous two [10]. machine learning has shown success in contributing to research during the covid-19 pandemic. models of health service data trends have been shown to aid in classification of the virus [11, 12], vaccine design [13], estimation of cases, deaths, and recoveries [14, 15], simulating what could have happened if 'lockdown' had not been instituted [16], and also simulating the behaviour of the spread of the disease using prior knowledge from other locations [17]. in this work, we devise a machine learning based strategy to predict three-fold risk at the country level: (i) risk of transmission, (ii) risk of mortality, and (iii) risk of inability to test. through these three quantifiable measures, preparedness and risk can be assessed, providing some quantitative reasoning behind global decisions, should another deadly disease grip our species again.
our main contribution is the exploration of the idea that country-level demographic and geopolitical attributes can aid in the classification of pandemic risk and preparedness in terms of transmission, mortality, and an inability to test (which the previous two depend on, since testing allows for accurate measurement of transmission and mortality). in order to do this, various supervised learning classifiers are explored in order to discern how much useful information these country-level attributes carry for the classification of these three risks. we note that the classification problems are difficult: many powerful techniques achieve unsatisfactory scores on the dataset, scoring only around 10-20% higher than the approximately 25% accuracy of a random guess, showing that learning useful rules from the data is not an easy task. this is not unexpected, since the classes have not been directly derived from the data used to predict them; rather, they have been derived from covid-19 statistics and then given as classes for country-level demographic and geopolitical information. due to this, strategies of linear searching and genetic optimisation are also followed in order to achieve more accurate results. although results are varied, the fact that all final models achieve much higher than 25% accuracy (which would be achieved via a random guess, since we formulate the problem as a 4-class problem) shows that the geopolitical and demographic attributes at the country level do carry predictive ability when it comes to pandemic risk and preparedness. the final models chosen are characterised by high classification accuracy for the risks of transmission, mortality, and inability to test, and are trained with no prior knowledge of the new coronavirus pandemic (other than the class). this may allow for generalisation to classify a nation's risk in the early days of a future pandemic.
the remainder of this work is organised as follows: section 2 details the method followed, with subsection 2.1 describing the machine learning approaches in particular. section 3 presents the results for the risk of transmission (3.1), the risk of mortality (3.2) and the risk of inability to test (3.3). finally, the limitations of the study are described, future work is suggested, and the study is concluded in section 4. data were collected from [18] and formalised on the 12th may 2020 (updated experiments for newer data can be found in section 3.7), with the relative ordering of countries based on the three metrics with regard to population (cases, deaths, and tests per million). the risk classes are low, medium-low, medium-high and high for each type of risk. as defined in other works [19] [20] [21], discretisation of the continuous features into bins is performed by the k% method in which k = 25 (equal frequency binning), resulting in four close-to-equal classes, with the difference being that the highest risk class is only 1.2% larger than the other three classes. future work aims to explore other methods of discretisation, whereas this work initially focuses on the machine learning pipeline on the basis of equal class error weighting.
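a minimal sketch of the equal-frequency (k% = 25) binning described above, using invented cases-per-million figures; real implementations often use library quantile-cut routines and must also decide explicitly how ties are broken.

```python
def k_percent_bins(values, labels=("low", "medium-low", "medium-high", "high")):
    """equal-frequency (k% = 25) discretisation: rank the countries on a
    metric (e.g. cases per million) and split the ranking into four
    near-equal quartile classes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [None] * len(values)
    n, k = len(values), len(labels)
    for rank, idx in enumerate(order):
        classes[idx] = labels[min(rank * k // n, k - 1)]
    return classes

# hypothetical cases-per-million figures for eight countries
cases_per_million = [5, 120, 740, 2100, 60, 15, 900, 3300]
risk = k_percent_bins(cases_per_million)
```

because the split is by rank rather than by value, each class receives (close to) a quarter of the countries regardless of how skewed the underlying metric is, which is what keeps the class error weighting approximately equal.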
covid-19 data are then removed, and the attributes to complement the country-level classes are the following: un region [22], 2020 population estimate [18], median age [18], population density per km2 [18], urban population % [18], urban population total [18], nursing and midwifery personnel per 10,000 (most recently recorded) [23], medical doctors per 10,000 (most recently recorded) [23], tobacco prevalence 2016 [24], obesity prevalence 2016 [25], gross domestic product 2019 [22], land area km2 [22], net migration [22], infant mortality per 1,000 births [22], literacy rate % [22], arable land % [22], crop land % [22], other land % [22], climate classification type [22], birth rate per 1,000 [22], death rate per 1,000 [22], gdp expenditure on agriculture [22], gdp expenditure on industry [22] and gdp expenditure on services [22]. since some countries are not recorded by the world health organisation, figures for nursing, midwifery and medical doctor personnel per 10,000 people for hong kong are collected from an alternative source [26]. missing data, which occurred mostly for tobacco prevalence, were given as '-1', which flags as an attribute that the data have not been collected (and could in itself provide useful information). the classification problem of risk is therefore formulated based on prior knowledge of the pandemic in terms of class only, while the attributes used to classify them are purely country-level information, regardless of the number of cases, deaths and other coronavirus-specific data. thus the problem becomes a pandemic risk and preparedness classification problem based on demographic and geopolitical attributes only. we aim for a generalisable model, which can be applied to the future state of countries should another potential pandemic begin, prior to any meaningful measurements being available. the method is illustrated in fig 1. 
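the '-1' flagging of missing values described above can be sketched as follows; the three-country slice, column names and numbers below are hypothetical, not the study's data:

```python
import numpy as np
import pandas as pd

# a tiny, made-up slice of the kind of country-level table described above
df = pd.DataFrame({
    "median_age": [38.4, 28.1, np.nan],
    "tobacco_prevalence_2016": [np.nan, 24.5, 14.0],
    "pop_density_km2": [233.0, 90.0, 7.0],
}, index=["country_a", "country_b", "country_c"])

# missing values are encoded as -1 so "not collected" is itself a signal
df = df.fillna(-1)
print(df)
```

the classifier then sees -1 as an ordinary attribute value, letting it learn from the absence of a measurement as well as its presence.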
following this, a set of machine learning models is tasked with predicting a country's risk class by learning from all other countries in a process of leave-one-out cross-validation [27], which is performed for all three risk types. finally, the best models for each risk factor are organised into a predictive framework, which produces an output for the three risks. since testing is taken into account, countries that have not reported testing data cannot be considered, but are later classified by the model generalised on those countries that do. a three-fold machine learning approach is proposed following observation of the maps for the three separate risk quarters in figs 2, 3 and 4, which show the discretised inability to test risk, transmission risk, and mortality risk respectively. we note that the countries with seemingly fewer cases have performed far fewer tests, as can be observed in that the growth of cases and testing tend to increase alongside one another. that is, a country with more cases will test more, and as such will have more confirmed cases, since the larger number of tests has identified more cases. the data for the two experiments were accessed on 12th may 2020 and 16th september 2020. trained with the strategy of leave-one-out cross-validation (loo cv), where every country's risk is predicted based on learning from all other countries, a set of supervised classification models is benchmarked. this section details the models focused upon, and the methods used to search for others. the metric reported for the models described in this section is mean classification accuracy, due to the close-to-equal class balance [28] (low, med-low and med-high are equal and high is minimally larger by a factor of 1.2%) and the high variance often observed due to the nature of loo cv [29, 30]. decision trees are tree structures where each internal node represents a condition based on attributes that allows splitting the data, and leaf nodes represent class labels [31]. 
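leave-one-out cross-validation of this kind is available directly in scikit-learn; the sketch below uses random stand-in features and four random classes, so the accuracy itself is meaningless and only the mechanics (each "country" predicted from all the others) are shown:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))      # stand-ins for country-level attributes
y = rng.integers(0, 4, size=40)   # four risk classes, assigned at random

# each of the 40 "countries" is predicted by a model trained on the other 39
scores = cross_val_score(RandomForestClassifier(random_state=0), X, y,
                         cv=LeaveOneOut())
print(len(scores), scores.mean())
```

each fold scores 0 or 1 (one held-out country, right or wrong), which is exactly the binary per-fold behaviour that gives loo cv its high variance.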
a random decision forest (rdf) [32], used in this study, creates multiple random decision trees, where each decision tree votes on the class of the input data object, and the predicted class is the one that receives the majority vote. (fig 3 caption: note that many countries at "low risk" by number of cases are at "high risk" for inability to test. https://doi.org/10.1371/journal.pone.0241332.g003. fig 4 caption: note that many countries at "low risk" by number of deaths are at "high risk" for inability to test. https://doi.org/10.1371/journal.pone.0241332.g004.) splitting of the trees is based on information gain, where ig is the observed difference in information entropy, which is expressed in eq (2); that is, the nodes split data based on reducing the randomness of the object class distribution. k-nearest neighbours (knn) is similar to an rdf in that the prediction is derived by a majority vote. the voters, rather than decision trees, are the data objects within the observations that are closest in terms of n-dimensional euclidean space, where n is the number of attributes. gradient boosting [33] forms an ensemble of weak learners (decision trees) and aims to minimise a loss function via a forward stage-wise additive method. in these classification problems, deviance is minimised. at each stage, four trees (n = classes) are fit on the negative gradient of the multinomial deviance loss function, or cross-entropy loss [34, 35], where, for k classes, i is a binary indicator of whether the prediction that class y is the class of observed data x is correct, and finally p is the probability that the aforementioned data x belongs to the class label y. xgboost [36] differs slightly in that it penalises trees, leaves are shrunk proportionally, and extra randomisation is implemented. naïve bayes is a probabilistic classifier that aims to find the posterior probability for a number of different hypotheses and select the most likely case. 
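the entropy-based information gain that drives the tree splitting described above can be computed directly; since the extracted text lost eq (2), the standard shannon-entropy definition is used here:

```python
import numpy as np

def entropy(labels):
    # shannon entropy (in bits) of a class distribution
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(parent, left, right):
    # reduction in entropy achieved by splitting parent into left/right
    n = len(parent)
    child = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - child

# a 4-class node split cleanly into two pure-pair halves
parent = np.array([0, 0, 1, 1, 2, 2, 3, 3])
ig = information_gain(parent, parent[:4], parent[4:])
print(ig)
```

here the parent node carries 2 bits of entropy and each child 1 bit, so the split gains exactly 1 bit; a tree picks the split with the largest such gain.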
bayes' theorem is given as p(h|d) = p(d|h)p(h)/p(d), where p(h|d) is the posterior probability of hypothesis h given the data d, and p(d|h) is the conditional probability of data d given that the hypothesis h is true. p(h), i.e. the prior, is the probability of hypothesis h being true, and p(d) = ∑_h p(d|h)p(h) is the probability of the data. naïvety in the algorithm is due to the assumption that each probability value is conditionally independent for a given target, calculated as p(d|h) = ∏_{i=1}^{n} p(d_i|h), where n is the number of attributes/features. linear discriminant analysis (lda), based on fisher's linear discriminant [37], is a statistical method that aims to find a linear combination of input features that separates classes of data objects, and then uses those separations for feature selection (opting for the linear combination) or classification (placing prediction objects within a separation). classes k ∈ {1, . . ., K} are assigned priors p_k; with (3) in mind, the maximum-a-posteriori probability is calculated from f_k(x), the density of x conditioned on k, where s_k is the covariance matrix for samples of class k and the class covariance matrices are assumed to be equal. the class discriminant function δ_k(x) is given in terms of the class mean m_k, and finally classification is performed via the discriminant. quadratic discriminant analysis (qda) is an algorithm that uses a quadratic plane to separate classes of data objects. following the example of lda, qda estimates the covariance matrices of each class rather than operating on the assumption that they are the same; qda follows lda with the exception that each class has its own covariance matrix. support vector machines (svm) optimise a high-dimensional hyperplane to best separate a set of data points by class by maximising the margin and minimising the empirical risk, and then predict new data points based on the distance vector measured from the hyperplane [38]. the optimisation of the hyperplane is to achieve the goal of maximising the average margins between the points and the separator. 
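the naïve bayes posterior described above can be computed by hand for a toy two-class, two-attribute case; all priors and likelihoods below are invented for illustration:

```python
import numpy as np

# toy discrete naive bayes: p(h|d) ∝ p(h) * prod_i p(d_i|h)
priors = {"low": 0.25, "high": 0.75}                     # p(h), assumed
likelihoods = {                                          # p(d_i|h), assumed
    "low":  {"dense_urban": 0.2, "old_population": 0.3},
    "high": {"dense_urban": 0.7, "old_population": 0.6},
}
observed = ["dense_urban", "old_population"]             # the data d

# unnormalised scores, then normalise by p(d) = sum over hypotheses
scores = {h: priors[h] * np.prod([likelihoods[h][d] for d in observed])
          for h in priors}
total = sum(scores.values())
posterior = {h: s / total for h, s in scores.items()}
print(max(posterior, key=posterior.get))
```

with these numbers the "high" hypothesis dominates; the conditional-independence assumption is what lets the likelihood factor into a simple product.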
generation of a multi-class svm is performed through sequential minimal optimisation (smo) [39] by breaking down the optimisation into smaller linearly-solvable sub-problems. for multipliers a, reduced constraints are given in terms of the data classes y, where k is the negative of the sum over the remaining terms of the equality constraint. stacked generalisation (stacking) [40] is the process of training a machine learning algorithm to interpret the predictions of an ensemble of algorithms trained upon the dataset, in a process of meta-learning. generally, a stack can represent any kind of ensemble, but the interpretation algorithm is often logistic regression. it has been noted in multiple domains that stacking often outperforms the individual models in the ensemble [41-43]. it was observed during experimentation that the classification problems were difficult, leading to many models achieving relatively bad results, i.e., the results outperformed an approximate 25% chance random guess by only around 10-20% classification accuracy, with many state-of-the-art models predicting the wrong value more than half of the time (< 50%). the solutions explored to solve this are the following: a linear search is performed for random decision forests (rdf) and k-nearest neighbours (knn) from 10, 20, . . ., 1000 decision trees and 1, 2, . . ., 50 neighbours, respectively. random forests are often found to be powerful ml algorithms, and so an in-depth search is performed in order to maximise their ability. this is also followed for knn since it is of low complexity and can thus be quickly benchmarked. a genetic search is also performed via the tree-based pipeline optimization tool (tpot) algorithm detailed in [44], with consideration to the whole scikit-learn toolkit [45]. where not detailed in the previous section, more information is available on the models in [46]. 
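a stack of the kind the genetic search later discovered (svm and extra trees base learners, logistic regression meta-learner) can be assembled in scikit-learn as a sketch; the synthetic four-class data below stands in for the country table:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# synthetic 4-class data standing in for the country-level attributes
X, y = make_classification(n_samples=120, n_features=8, n_informative=5,
                           n_classes=4, random_state=0)

# svm + extra trees base learners, logistic regression as meta-learner
stack = StackingClassifier(
    estimators=[("svm", SVC(probability=True, random_state=0)),
                ("et", ExtraTreesClassifier(random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000))
stack.fit(X, y)
print(stack.score(X, y))
```

the meta-learner sees the base models' out-of-fold predictions rather than the raw attributes, which is why a stack can outperform its individual members.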
tpot is an algorithm that treats each machine learning operator as a genetic programming (gp) primitive; these include modified features, feature combinations, feature selections and dimensionality reductions, as well as learning algorithms and their predictions (for exploration of ensembles). gp trees were chosen since they best represent a machine learning pipeline and are implemented with the deap framework [47], and the best solutions are selected by the multi-objective nsga-ii algorithm [48] by aiming to increase classification accuracy while minimising the number of machine learning operators, as previously described. 5% of offspring produced by the best models cross over with another through a process of one-point crossover, and the remaining offspring randomly mutate at a 33% chance of point, insertion, or shrinkage mutation. thus, the algorithm introduces and tunes ml operators with promising effect and removes operators that cause the results to degrade. finally, the best machine learning pipeline is presented from the search. to conclude, the method described in this section follows a process of manual exploration, linear search, and genetic programming in order to explore the best classification models for these problems in terms of classification accuracy. as previously described, accuracy is chosen as the metric of comparison since the datasets are closely balanced, and the drawback of loo is high variance (large standard deviation due to binary per-fold results) while enabling classification model validation on a small dataset. all of the experiments in this paper were performed using the scikit-learn toolkit [45] implemented in python. the algorithms were executed on an intel core i7 processor (3.7ghz). 
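tpot itself is a heavier dependency; the core loop it automates (mutate a pipeline "genome", keep the fitter variant) can be caricatured in a few lines. this is a deliberately tiny stand-in, not tpot's actual algorithm, and the data are random:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 6))               # random stand-in attributes
y = rng.integers(0, 4, size=60)            # four random risk classes

def fitness(g):
    # cross-validated accuracy is the fitness of a hyperparameter "genome"
    clf = RandomForestClassifier(random_state=0, **g)
    return cross_val_score(clf, X, y, cv=KFold(n_splits=3)).mean()

genome = {"n_estimators": 50, "max_depth": 3}
best = fitness(genome)
for _ in range(5):                         # a few "generations"
    child = dict(genome)                   # mutate a copy of the genome
    child["n_estimators"] = int(max(10, child["n_estimators"] + rng.integers(-20, 21)))
    child["max_depth"] = int(max(1, child["max_depth"] + rng.integers(-1, 2)))
    score = fitness(child)
    if score >= best:                      # keep the fitter candidate
        genome, best = child, score
print(genome, round(best, 3))
```

tpot additionally evolves whole pipeline structures (feature transforms, selectors, ensembles) under nsga-ii, not just two hyperparameters as here.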
due to the large computational complexity when searching a problem space with loo, the algorithm was executed three times with a population size of 10 for 10 generations; if a model scored lower than the manually or linearly explored models then it was discarded, and otherwise presented if it achieved a higher score. this decision was based on the fact that the results attained for the three problems were only 49.67%, 43.79%, and 56.21%, and more robust models were required in order to provide accurate predictions. in this section, the three sets of results are presented. the linear searches for rdf and knn are shown in figs 6 and 7, respectively. the best rdf was a forest of 230 trees which scored 43.8%, and the best knn had a value of k = 30 which scored 36.6%. fig 9 shows the model comparison for risk of mortality. the difficulty of the problem can be seen in the low results achieved, with the exception of two models discovered by the genetic model search algorithm. the second best model, which utilised extra trees via recursive feature elimination, scored 61.97%, and the best model found was a process of stacking svm and extra trees which had a classification ability of 71.24%. country-level pandemic risk and preparedness classification based on covid-19 data fig 10 shows a comparison of other models that were explored. many solutions were quite weak, but achieved higher results in comparison to the other two problems, suggesting that this problem is a slightly less difficult one. the best algorithms, as was the case for the other problems, were also discovered by the genetic search algorithm. unlike the previous two problems, the best models found were singular rather than either an ensemble or a feature elimination pipeline, where extra trees scored 71.21% and gradient boosting scored 77.12%. 
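the linear searches reported above (10, 20, ..., 1000 trees; 1, 2, ..., 50 neighbours) amount to a one-dimensional sweep per model; a scaled-down sketch on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=100, n_features=8, n_informative=5,
                           n_classes=4, random_state=1)

# coarser grids than the paper's (10..1000 trees, 1..50 neighbours)
rdf_scores = {n: cross_val_score(
    RandomForestClassifier(n_estimators=n, random_state=0), X, y, cv=5).mean()
    for n in range(10, 101, 30)}
knn_scores = {k: cross_val_score(
    KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
    for k in range(1, 20, 6)}

# the grid point with the best mean accuracy wins, as in figs 6 and 7
print(max(rdf_scores, key=rdf_scores.get), max(knn_scores, key=knn_scores.get))
```

the paper's actual sweep used loo cv rather than the 5-fold cv used here to keep the sketch fast.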
the original outline of the experiment is given in fig 1, and fig 11 builds upon this by including the best findings from the three benchmarking experiments. the best model for risk of transmission was a stacking algorithm combining gradient boosting and a decision tree for 74.51% accuracy, the best model for risk of mortality was a stacking algorithm combining a support vector machine and extra trees for 71.24% accuracy, and the best model for risk of inability to test was a gradient boosting algorithm for 77.12% accuracy. all of the best models were found by the genetic model search algorithm. as previously discussed, the classification of risks must be interpreted relative to one another. for example, if the maps in figs 2, 3 and 4 are observed, note that countries that do not test much also seemingly, on the surface, report fewer cases and deaths per million. on one hand, this could simply be due to the fact that there are fewer cases and thus fewer tests are required, but on the other hand, it could imply that the fewer tests performed have themselves led to under-reported figures for the other two [49] [50] [51] [52]. with this in mind, it is important to consider the output for risk of inability to test in order to interpret the other two risks. in the case where the risk of inability to test is towards the lower end of the spectrum, the risks for transmission and mortality are more likely to be an accurate representation of the situation. vice versa, where there is a high risk of inability to test, this in itself should be considered the most descriptive risk factor for the country, since there is less prior knowledge to base risks of transmission and mortality upon. table 1 shows the predicted class values for the best models applied to each of the respective risk classification problems. 
please note the discussion of interpretation in section 3.4, where high inability to test is often coupled with lower risks of the prior two, as can be seen in fig 5, for as-yet-unknown reasons; i.e., the figures could either be actually true to the pattern observed, or, on the other hand, very low testing may naturally lead to fewer reported cases and deaths than the actual values. many countries bear similarity to others and so have been generalised; further, outliers such as china may not have accurately predicted labels, since its population is much larger than those observed in the training data, and likewise this may be the case with other geopolitical information within the outlier set. in this section, we perform a preliminary exploration of how useful country-level attributes are in addition to lag-window features (seven days prior, with mean and standard deviation for days 1 − n via a growing lag-window). the process is implemented via a 10-fold temporal validation process (predicting future fold k from growing training data 1 to k − 1). this approach is explored for the forecasting of cases and deaths. appendix a in s1 appendix shows the pearson correlation coefficient of each attribute in relation to the total cases for each day. as can be expected, the most correlative feature is the cases recorded for the previous day. interestingly, mean values of the previous two and three days have more correlation to the total cases on the current day than the previous-day lag value alone. gross domestic product and urban population have a weak but useful correlation for regression of the total cases. as can be expected, the singular pearson's correlation coefficients of the isolated attributes tend to be low, with the exception of the lag windows, due to the nature of increasing growth in infections. the tables within appendices b, c, and d in s1 appendix detail the scores given to the attributes by linear regression, m5p and svr respectively. 
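the lag-window features described above (previous-day lags plus a growing-window mean and standard deviation over all prior days) can be built with pandas; three lags instead of the paper's seven keep the toy series readable:

```python
import pandas as pd

# a hypothetical total-cases curve, not real country data
cases = pd.Series([5, 8, 13, 21, 34, 55, 89, 144, 233, 377])

feats = pd.DataFrame({"cases": cases})
for lag in range(1, 4):                      # seven in the paper; three here
    feats[f"lag_{lag}"] = cases.shift(lag)

# growing lag-window summaries: mean/std of all days strictly before today
feats["win_mean"] = cases.shift(1).expanding().mean()
feats["win_std"] = cases.shift(1).expanding().std()
print(feats.dropna())
```

each row then carries only information available before that day, which is what makes the 10-fold temporal validation (train on folds 1..k−1, predict fold k) leakage-free.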
it is observed that the rankings achieved by the lag window attributes are the same for each algorithm, and the order otherwise is relatively similar. all algorithms then rank the country attribute at the same place, above other features, even though it actually had a negligible correlation of 0.03. another interesting observation is that the m5p algorithm ranks medical doctors per 1,000 population relatively high, second only to country when lag windows are not considered. urban population totals are considered important by all of the algorithms, likely since this is an indication of both spread as well as a rule of thumb for the total number of infected. table 2 shows the results for total case prediction by all of the chosen algorithms. the best algorithm achieved an rmse of 325.66 when considering 41 features chosen by linear regression ranking, which were the 19 time-window attributes and 22 geopolitical or demographic attributes. this provides a decrease in rmse of 17.08 compared to when this algorithm only considers lags of the series, and many instances can be observed in which this metric was reduced by considering the additional attributes explored within this study. the best results achieved by all of the seven algorithms considered at least two of the additional attributes; it is worth noting that the best of the best models is also the model which chose the most of the additional attributes (as well as the best svr, which also chose 41 attributes in total). appendix e in s1 appendix shows the correlation of each singular attribute towards the prediction of deaths. as can be observed, the rankings of the lag windows are the same as those for total confirmed infections described in the previous section. otherwise, rankings are similar and differ only slightly, as does their observed correlation. appendices f, g, and h in s1 appendix detail the scores given to each attribute by the linear regression, m5p and support vector regression algorithms respectively. 
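a voting ensemble of linear regression and svr, scored by rmse as in the benchmark above, can be sketched as follows; the data are random and synthetic, so the rmse value is only illustrative:

```python
import numpy as np
from sklearn.ensemble import VotingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 5))                         # stand-in features
y = X @ np.array([3.0, -2.0, 0.5, 0.0, 1.0]) + rng.normal(scale=0.1, size=80)

# member predictions are averaged, so one wildly wrong member (like the
# degenerate linear regression reported later) drags the whole ensemble
vote = VotingRegressor([("lr", LinearRegression()), ("svr", SVR())]).fit(X, y)
rmse = mean_squared_error(y, vote.predict(X)) ** 0.5
print(round(rmse, 3))
```

the averaging behaviour is exactly why the 2.93e+05 rmse of an over-fed linear regression, mentioned in the deaths-forecasting results, also poisons every ensemble containing it.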
as can be expected, the rankings match those of the highest correlation coefficients. interestingly, only a small change is noted within the attributes for svr, whereas quite a disparity can be seen when observing the scores given by the other two algorithms. table 3 shows the 189 models trained for forecasting total deaths. similarly to the total case predictions, the best model found was a voting ensemble of linear regression and svr. unlike total case predictions, introducing geopolitical and demographic attributes had a negative effect on the result, with the best model taking only the temporal lag window features as input. once 40 attributes were introduced, the linear regression model had an absurdly high rmse of 2.93e+05, which, since average values were taken during voting regression, also affected the ensembles that included it. given the nature of research and peer review, the approach in this work was formalised on the 12th of may 2020 and as such the data is over three months out of date at the time of writing (16th of september, 2020). given this, the experiments devised in this work are re-applied to the new data. it was noted that all manual models failed to generalise with the new data, that is, a range of scores between 24.95% and 28.63% for all models, for all three risk classification problems. this is most likely due to international collaboration towards the three risk factors, and as such, country-level attributes lose classification prediction ability towards the risk factors. with the previous successful experiments in mind, this argues that risk classification would be more useful when performed prior to the situation unfolding, given that country-level information is seemingly more important at this stage when compared to the current post-peak climate. though much weaker results are now observed, this could in fact be viewed as a positive situation, given that country-level data i.e. 
who you are and where you are from no longer impacts risk as it was observed to in the initial experiments performed in may 2020. it has been noted during mid-2020 that organisations such as the united nations and the world health organisation have implemented and released humanitarian packages to lower economically developed countries (ledcs) [53] [54] [55]. it has also been noted that many healthcare professionals returned to their native countries (often also ledcs) in order to aid in tackling the virus [56]. the positive effects of these factors likely contribute towards the reason why country-level information was useful for risk classification earlier in the pandemic, but less so later on, post-peak. given the nature of the data streaming from the ongoing pandemic and the time taken to run model benchmarks, the largest and most obvious limitation to this study is that the models are constantly going out of date by the day, since more up-to-date data is constantly becoming available. it is for this reason that the models should be updated at a later date, and the statistical differences that occur, if any, noted. secondly, though relatively good results were found through a complex process of genetic optimisation, further models could be explored in order to possibly reach better results than the final models in this study. finally, the interpretation that is required, as aforementioned, i.e. that the risk of inability to test is the most important metric and possibly enables the other two for interpretation, suggests that the ternary approach followed could be better optimised through a unified approach: that is, one singular "metric of risk" calculated via the three metrics explored in this work as separate problems. prior to this study, a metric of (c + d)/t was explored (where c, d, and t denote cases, deaths, and tests respectively, all with regards to per million population), but this metric is, at this point, impossible to classify. 
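the trialled unified metric (c + d)/t is a one-liner; the figures below are invented per-million values for three hypothetical countries, showing how a low-testing country (b) dominates the metric even with modest case counts:

```python
import pandas as pd

# hypothetical per-million figures (c = cases, d = deaths, t = tests)
df = pd.DataFrame({"cases": [3200.0, 150.0, 9800.0],
                   "deaths": [95.0, 2.0, 410.0],
                   "tests": [41000.0, 900.0, 250000.0]},
                  index=["a", "b", "c"])

# the unified "metric of risk" trialled before this study: (c + d) / t
df["risk"] = (df["cases"] + df["deaths"]) / df["tests"]
print(df["risk"].round(4))
```

this sensitivity to the testing denominator is one plausible reason the metric proved hard to turn into stable classes.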
the k% method was used to divide the continuous features into four bins where k = 25. other methods of binning such as mdl [57], caim, cacc, and ameva [58] could also be explored and benchmarked in future experiments. to conclude, the main hypothesis that this work has argued in favour of is that geopolitical and demographic attributes at the country-level hold value in terms of classifying risk produced by the covid-19 dataset. this was shown when the four-class distribution, which was close to equal ('high' was 1.2% larger than the other classes), could be classified far above the approximate 25% baseline through loo cv. though this is observably possible from the results presented in this study, it is worth noting that the classification problem proved extremely difficult for many powerful machine learning techniques, which often scored only around 40%, and a genetic search had to be followed in order to devise complex strategies of ensemble and hyperparameter optimisation to achieve better results of 74.51%, 71.24%, and 77.12% for the three problems. future work aims to keep the data up to date to the point at which the pandemic is over, and also to explore other methods of solving the issue of risk and preparedness classification through a more unified approach as well as through stronger machine learning models, if possible. 

references: 
global catastrophic risks survey 
presumed asymptomatic carrier transmission of covid-19 
estimation of the asymptomatic ratio of novel coronavirus infections (covid-19) 
covid-19: identifying and isolating asymptomatic people helped eliminate virus in italian village 
sjödin h. high population densities catalyse the spread of covid-19 
the effect of travel restrictions on the spread of the 2019 novel coronavirus (covid-19) outbreak 
case-fatality rate and characteristics of patients dying in relation to covid-19 in italy 
obesity in patients younger than 60 years is a risk factor for covid-19 hospital admission 
covid-19: risk factors for severe disease and death 
bearing the brunt of covid-19: older people in low and middle income countries 
machine learning using intrinsic genomic signatures for rapid classification of novel pathogens: covid-19 case study 
artificial intelligence distinguishes covid-19 from community acquired pneumonia on chest ct 
covid-19 coronavirus vaccine design using reverse vaccinology and machine learning 
forecasting the novel coronavirus covid-19 
data-based analysis, modelling and forecasting of the covid-19 outbreak 
modified seir and ai prediction of the epidemics trend of covid-19 in china under public health interventions 
early dynamics of transmission and control of covid-19: a mathematical modelling study. the lancet infectious diseases 
current world population 
evaluation of data discretization methods to derive platform independent isoform expression signatures for multi-class tumor subtyping 
data mining discretization methods and performances 
entropy and mdl discretization of continuous variables for bayesian belief networks 
world health organization, et al. global health workforce statistics. geneva: who 
world health organization, et al. who global report on trends in prevalence of tobacco smoking 2000-2025. world health organization 
prevalence of obesity among adults, bmi ≥ 30, crude estimates by country 
ensemble of machine learning algorithms using the stacked generalization approach to estimate the warfarin dose 
evaluation of a tree-based pipeline optimization tool for automating data science 
scikit-learn: machine learning in python 
api design for machine learning software: experiences from the scikit-learn project 
deap: evolutionary algorithms made easy 
a fast and elitist multiobjective genetic algorithm: nsga-ii 
correcting under-reported covid-19 case numbers. medrxiv 
internationally lost covid-19 cases 
level of under-reporting including under-diagnosis before the first peak of covid-19 in various countries: preliminary retrospective results based on wavelets and deterministic modeling 
estimating the fraction of unreported infections in epidemics with a known epicenter: an application to covid-19 
will covid-19 be a litmus test for post-ebola sub-saharan africa? 
the coronavirus knows no borders. tidsskrift for den norske legeforening 
health prevention and response policies against infectious diseases: is the world ready for a novel coronavirus pandemic? proceedings book 
the world health organisation. how is who responding to covid-19? 
azerbaijani doctors return home to help their country face covid-19 
multi-interval discretization of continuous-valued attributes for classification learning 
supervised and unsupervised discretization of continuous features 

the authors would like to show their gratitude to all of the medical professionals working to treat and cure covid-19 across the world, as well as the researchers working vigorously on vaccines and antibody tests for the disease. we would also like to thank all of the key workers for their effort to make life as normal as possible during these difficult times. 
key: cord-330714-hhvap8ts authors: shah, kamal; arfan, muhammad; mahariq, ibrahim; ahmadian, ali; salahshour, soheil; ferrara, massimiliano title: fractal-fractional mathematical model addressing the situation of corona virus in pakistan date: 2020-11-12 journal: results phys doi: 10.1016/j.rinp.2020.103560 sha: doc_id: 330714 cord_uid: hhvap8ts this work considers a fractal-fractional mathematical model of the transmission and control of corona virus (covid-19), in which the total population of an infected area is divided into susceptible, infected and recovered classes. we consider a fractal-fractional order sir type model for the investigation of covid-19. to understand the transmission and control of corona virus much better, first we study the stability of the corresponding deterministic model using the next generation matrix along with the basic reproduction number. after this, we study the qualitative analysis using a "fixed point theory" approach. next, we use the fractional adams-bashforth approach for the investigation of the approximate solution to the considered model. at the end, numerical simulations are given in matlab to demonstrate the validity of the mathematical system of arbitrary order and fractal dimension. our discussion is about covid-19, which started in the chinese city of wuhan and then transmitted throughout the globe very rapidly. the disease is named covid-19 after the outbreak of corona virus in wuhan at the end of 2019. due to this disease, more than 0.616 million individuals died in the initial eight months. the pandemic of a terrible and widely spreading virus of recent times is covid-19, first detected in wuhan (a chinese city) on the 31st of december, 2019 [1, 2]. this outbreak has affected about 13.5 million people all over the globe. 
the discovery of corona virus was made in 1965, when tyrrell and bynoe found and passaged a virus called "b814" [3], which was grown in human embryonic tracheal organ cultures obtained from the respiratory tract of an adult [4]. this kind of virus transmits through the air in social gatherings, from infected people to healthy ones via droplets from coughing or sneezing. it also spreads by placing hands or fingers on areas or surfaces of things touched by infected people, and is then transmitted to healthy people when they touch their nose, mouth or eyes. it affects the respiratory system, and infected people show symptoms of high fever, coughing and breathing problems. the interval between infection and the onset of symptoms ranges from one to fourteen days; an infectious person typically shows symptoms within five to six days. to prevent the spread of this kind of disease, people must wash their hands every 20 minutes, wear masks, and isolate themselves from gatherings in different areas. scientists and politicians are trying to stop the aforesaid infection from transmitting and spreading. one reason for the transmission of this kind of pandemic is the travel of affected persons from one area to another, which infects large communities of people in different areas and spreads the disease. for this, various steps at the national and international level have been taken so far: different countries of the globe have stopped the travel of aeroplanes, trains and buses for a fixed time and have also closed various economic and business activities in cities, applying careful measures to minimise the loss of lives. further, every government of the world tries to minimise gatherings of people and to decrease the number of infected ones in its territory [5]. scholars and analysts are carrying out experiments and analyses to find a cure or vaccine for the aforementioned pandemic, in order to control and stabilise it. 
understanding the transmission of a disease plays a vital role in stabilizing a pandemic in a community, and forming a proper picture of how the disease spreads is equally important for implementing control measures. medical engineering has made people aware of, and pointed out, the importance of the mathematical modeling approach, one of the key formulations for handling and understanding such pandemics. mathematical models have been applied to various infections in the past [6] [7] [8] [9]. such models offer researchers and scholars in the physical and medical sciences many means of learning how to control this type of pandemic or epidemic; they can also be applied to predict the expected number of patients in the coming days under any given control policy, and thus to attain its aims and objectives. basic research of this kind is carried out by scholars and scientists to formulate viral diseases and is applied by policy makers to minimize outbreaks (for details see [10] [11] [12] [13] [14] [15]); the aforementioned diseases have accordingly been analyzed in many journals [1, [16] [17] [18] [19] [20] [21] [22] [23] [24]. mathematical models are generally formulated as ordinary (odes) or partial (pdes) differential equations, together with integral equations of natural (integer) order (ides). since the 1990s, arbitrary-order odes and pdes have been applied to model real problems, with more accurate results. applications of such equations are found in various fields of physics, medicine, engineering, economics, business and the analysis of various diseases. fractional calculus is the wide-ranging calculus of arbitrary-order differentiation and integration. the reason for applying fdes instead of odes and pdes in the formulation of real global problems is their well-known heredity (memory) property, which is not found in integer-order odes and pdes.
in contrast to ides, which are localized, fdes are delocalized and carry the history of the process, which is the reason for their superiority over ides. another factor is that, under many conditions, the coming state of a mathematical formulation is affected not only by the present state but also by the past [4, [25] [26] [27]. these properties allow fdes to model real-world problems having "non-markovian behavior". moreover, integer-order differential equations (ides) cannot describe behavior between any two natural-order numbers. different types of fractal dimensions and arbitrary-order derivatives have been presented in books to overcome this limitation of natural-order derivatives, and such derivatives can be applied to many areas of the physical and natural sciences. one of the most active fields of applied research at present is the analysis of epidemiological formulations of infectious pandemics. further analyses of such mathematical formulations address predictions by simulation, "stability theory", "existence results" and "optimization"; see [28] [29] [30] [31] [32]. owing to the current situation, many modeling analyses have been devoted to the terrible pandemic of "covid-19"; see [5, [33] [34] [35]. at present, the mathematical formulation of the "covid-19" infectious disease is a very active field of research. because of this importance, scientists in [36] analyzed a mathematical model of three compartments, namely the "healthy or susceptible population" s(t), the "infected population" i(t) and the "recovered class" r(t) at time t, where a denotes the rate of newborn and migrated individuals, b the transmission rate from susceptible to infected, c the contact rate of susceptibles with the infected, µ the natural (infection-free) death rate, k the recovery rate and λ the death rate of the infected class due to the aforesaid virus.
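the compartmental structure just described can be sketched in its integer-order special case as a plain ode system; the snippet below is our own illustrative sketch (the function name and all parameter values are placeholders, not the paper's fitted values):

```python
# forward-euler sketch of the integer-order version of the described
# sir-type model; all parameter values are illustrative placeholders.

def simulate_sir(a, b, c, mu, k, lam, s0, i0, r0, dt=0.01, steps=10000):
    """integrate s' = a - b*c*s*i - mu*s,
                 i' = b*c*s*i - (mu + k + lam)*i,
                 r' = k*i - mu*r   with forward euler."""
    s, i, r = s0, i0, r0
    for _ in range(steps):
        ds = a - b * c * s * i - mu * s
        di = b * c * s * i - (mu + k + lam) * i
        dr = k * i - mu * r
        s, i, r = s + dt * ds, i + dt * di, r + dt * dr
    return s, i, r
```

with no initial infection (i0 = 0), the trajectory settles at the disease-free equilibrium s = a/µ discussed later in the text.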
we study the model given in (1.1), extended with an equation for recovered individuals, under a fractal-fractional order derivative with 0 < ω ≤ 1 and 0 < r ≤ 1, as given in (1.2); the transfer diagram for (1.2) is given in figure 1, which shows the interaction among the compartments and the various rates. over the last few decades it has been noted that arbitrary-order differential equations (fdes) and integral equations (fides) can model real-world problems much better than integer-order odes, pdes and ides. the roots of the subject go back to "riemann and liouville" and to "euler and fourier", who gave interesting analytical results in integer-order differential and integral calculus; building on their work, the field of fractal-fractional calculus was introduced and some of the best analyses were carried out later on. non-integer differential and integral calculus is widely used in modeling because hereditary ideas and memory effects cannot be captured by classical integer-order calculus, and non-integer order calculus considerably reduces the error present in integer-order derivatives and anti-derivatives. useful applications of the aforesaid calculus may be seen in [4, [25] [26] [27] [37] [38] [39] [40] [41] [42]. because of these applications, scholars and practitioners have devoted considerable time to studying arbitrary-order calculus. indeed, a non-integer-order derivative is a definite-type anti-derivative, that is, a summation over the entire function or spectrum, which makes it generalized and global; by comparison, the integer-order derivative is a special case of the non-integer-order one. valuable efforts have also been made on the investigation of various mathematical models regarding existence and uniqueness, approximation and optimization; see [43] [44] [45] [46] [47] [48] [49]. it is also notable that arbitrary-order operators of differentiation have been formulated in a large number of ways.
a definite integral has no regular kernel, so different types of "kernel" appear in the different definitions. one formulation that has recently gained much interest is the "abc" non-integer derivative defined by "atangana-baleanu" and "caputo" [50] in 2016. this arbitrary-order derivative replaces the "singular kernel" with a "non-singular kernel" and has therefore been studied extensively [51] [52] [53] [54] [55] [56] [57]. the question now is how to solve such problems. in this regard, plenty of methods available in the literature have been applied to the older definitions of the fractional derivative. for instance, to handle nonlinear problems analytically, the famous decomposition and homotopy methods have been used increasingly (see [9, 58, 59]), while for numerical simulation, runge-kutta methods have usually been used in the treatment of mathematical models. here, for numerical simulation, we will use the fractional ab (adams-bashforth) method. the mentioned method is a simple two-step technique, more powerful than the euler, taylor and rk methods; it is rapidly converging and stable (for details see [60, 61]). definition 2.1. [33, 54, 55, 62] let ℧(t) be a continuous and differentiable mapping on (a, b) with fractal dimension 0 < r ≤ 1; then the fractal-arbitrary order derivative of ℧(t) in abc form, with fractional order 0 < ω ≤ 1 and the power law, is given as stated; in the limiting case one obtains the derivative known as the "caputo-fabrizio differential operator". in this definition, abc(ω) is the "normalization mapping", satisfying abc(0) = abc(1) = 1, and κ_ω is the well-known "mittag-leffler" mapping, a generalization of the exponential mapping [37] [38] [39]. definition 2.2.
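the "mittag-leffler" mapping mentioned in definition 2.1 generalizes the exponential function; a minimal truncated-series sketch (our own illustrative helper, not code from the paper) is:

```python
import math

def mittag_leffler(omega, z, terms=60):
    """truncated series e_omega(z) = sum_k z**k / gamma(omega*k + 1);
    for omega = 1 this reduces to exp(z)."""
    return sum(z ** k / math.gamma(omega * k + 1) for k in range(terms))
```

the two-parameter variants used in abc-type kernels follow the same pattern with gamma(omega*k + beta) in the denominator.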
let ℧(t) be a continuous and differentiable mapping on (a, b) with dimensional order 0 < r ≤ 1; then the fractal-arbitrary order integral of ℧(t) in abc form, with arbitrary order 0 < ω ≤ 1 and the power law, is given accordingly, and the solution of the given problem for 0 < ω, r ≤ 1 is provided by it. note: for proving existence and uniqueness, we work in a "banach space". here we state a theorem on fixed points that will be utilized to prove our later results: if f1 satisfies the contraction condition, and f2 satisfies the conditions of continuity and compactness, then the operator equation f1 w + f2 w = w has one or more solutions. lemma 3.1. the roots (zeros) of (1.2) in the feasible region are bounded. proof: adding all equations of (1.2), we obtain, as t → ∞, n(t) ≤ a/µ, which proves the required result. next we prove some basic results on stability analysis; for this we compute the disease-free equilibrium point and the pandemic equilibrium point of (1.2). as mentioned earlier, there are two equilibria: e0 = (a/µ, 0, 0) is the pandemic-free equilibrium point of (1.2), and the pandemic equilibrium is e* = (s*, i*, r*), with s* = (µ + k + λ)/(bc). theorem 3.1. the basic reproduction number for (1.2) is computed as follows. proof: we derive the reproduction number from the second equation of (1.2), setting x = i, where f is the nonlinear infection term and v the linear term. the next-generation matrix is fv^(-1), and r0 is the largest eigenvalue of this matrix evaluated at the pandemic-free equilibrium point e0 = (a/µ, 0, 0). hence the basic reproduction number is obtained as claimed. theorem 3.2. the pandemic-free (disease-free) equilibrium point of (1.2) is locally asymptotically stable if r0 < 1 and unstable if r0 > 1. proof.
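under the next-generation matrix derivation above, f = bc·s* = bc·a/µ and v = µ + k + λ at e0, so the closed form r0 = abc/(µ(µ + k + λ)) can be coded directly; the helper below is our own illustrative sketch, not the authors' code:

```python
def basic_reproduction_number(a, b, c, mu, k, lam):
    """r0 = a*b*c / (mu*(mu + k + lam)), the spectral radius of the
    next-generation matrix f*v^(-1) at the disease-free equilibrium
    (a/mu, 0, 0) of the sir-type model."""
    return (a * b * c) / (mu * (mu + k + lam))
```

lowering the transmission rate b (or the contact rate c) lowers r0 proportionally, which is the mechanism behind the lockdown scenarios discussed later.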
the jacobian matrix of (1.2) can be written out, and substituting the values of e0 yields the eigenvalues; in particular, λ2 can be written as "λ2 = r0 − 1", so λ2 is non-positive if "r0 < 1". hence all "eigenvalues" are non-positive, and (1.2) is locally asymptotically stable at e0, and unstable otherwise. after simplification, the characteristic equation is obtained; applying the routh-hurwitz criterion, all the principal minors must be positive: a1 > 0, which gives a1 = −µ + abc/(µ + k + λ), that is, a1 = −1 + r0, so a1 > 0 if r0 > 1. in a similar way one can show that the remaining minors must also be positive. from r0 > 1 and the positivity of all minors, local asymptotic and global stability are established for the considered system. it is of great importance to ask whether the dynamical problem we investigate really has a solution. this basic question is answered by fixed point theory, and we address it for our considered problem (1.2) in this part of the paper. since the integral is differentiable, we can write the right-hand sides of model (1.2) as in (4.1); with the help of (4.1), and for t ∈ ð, (4.2) follows, with solution (4.4). we take the growth condition and lipschitz assumption for existence and uniqueness as: (c1) there exist constants l_y, m_y such that the growth condition holds; (c2) there exists a constant l_y > 0 such that for each ℧, ℧̄ ∈ ℧ the lipschitz condition holds. proof: we prove the theorem in two steps. step i: let ℧̄ ∈ a, where a = {℧ ∈ ℧ : ||℧|| ≤ φ, φ > 0} is a closed convex set. then, using the definition of f in (4.7), f obeys the contraction property. step ii: to prove that g is relatively compact, we have to show that g is bounded and equicontinuous. as g is continuous, y is also continuous, and for any ℧ ∈ a, (4.9) shows that g is bounded.
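for the jacobian at e0 the block-triangular structure gives the eigenvalues in closed form, so the stability threshold of theorem 3.2 can be checked numerically; the function below is our own sketch under that structural assumption, not the authors' code:

```python
def dfe_eigenvalues(a, b, c, mu, k, lam):
    """eigenvalues of the jacobian of the sir-type system at the
    disease-free equilibrium (a/mu, 0, 0); by the block-triangular
    structure they are -mu, b*c*a/mu - (mu + k + lam), -mu.
    the middle eigenvalue has the same sign as r0 - 1."""
    return (-mu, b * c * a / mu - (mu + k + lam), -mu)
```

only the middle eigenvalue can change sign, which is why local stability of e0 is equivalent to r0 < 1.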
next, for "equi-continuity", let t1 > t2 ∈ [0, τ]; the right-hand side of (4.9) tends to zero as t2 → t1. since g is continuous and bounded, g is uniformly continuous and bounded. thus, by the arzelá-ascoli theorem, g is relatively compact and hence completely continuous. therefore, by theorem 4.1, equation (4.4) has one or more solutions, and consequently (1.2) has one or more solutions. for uniqueness we give the next result. proof: let the operator t : ℧ → ℧ be defined as stated; for ℧, ℧̄ ∈ ℧, (4.12) shows that t is a contraction, so equation (4.4) has a unique solution, and hence (1.2) has a unique solution. here we also define and state well-known results on the stability analysis of (1.2); we take φ(t) as a perturbation parameter depending on the solution, with φ(0) = 0, such that • |ψ(t)| ≤ ǫ for ǫ > 0; • abc d(t; ω, r)℧(t) = y(t, ℧(t)) + ψ(t). lemma 5.1. the solution of the perturbed problem satisfies the given relation. theorem 5.1. under assumption (c2) and (5.2), the solution of (4.4) is "ulam-hyers" stable, and therefore the analytical results of the system are "ulam-hyers" stable, provided θ < 1, where θ is given in (4.13). proof: let ℧ ∈ ℧ be the solution and ℧̄ ∈ ℧ an approximate solution of (4.4); then ||℧ − ℧̄|| ≤ ω_{ω,r} + θ||℧ − ℧̄||, from which the required stability follows. in this part of the paper we compute numerical solutions of the fractal-arbitrary order model (1.2) with the abc derivative, using the fractal-fractional "adams-bashforth method". approximate solutions are obtained by the aforesaid iterative scheme: we use the fractal-fractional ab technique [38] to provide an approximate way of plotting the solutions of system (1.2).
to derive the approximate technique, we proceed from (4.1), where g1, g2 and g3 are defined in (4.2). applying the fractional-order, fractal-dimension anti-derivative in abc form to the first equation of (4.1), and approximating the function g1 on the interval [t_q, t_{q+1}] by the interpolation polynomial, we obtain the scheme; calculating i_{q−1,ω} and i_{q,ω}, putting t_q = q∆ and substituting the values of (6.3) and (6.4) into (6.2) yields the update formula, and the same numerical scheme follows for the other two compartments i and r. similarly, for the remaining two cases we find that r0 < 1, as in case-a; otherwise, if r0 > 1 (as in [69], where r0 = 5.7), the considered system is unstable and the infection is at its peak. hence our system is stable, and we solved the fractional-order model (1.2) by applying the ab technique given in (6.5). from figure 2 we observe that over the next 12 weeks the susceptible population will decrease at a very high rate, i.e., in a short time. the decrease is rapid at smaller non-integer order and slow at larger fractional order and fractal dimension, predicting that in the beginning the susceptible class moves into the infected class. figure 3 shows that, on the available data, in the next few months the infected cases in "pakistan" will rise to a maximum peak value of 0.8 million if precautionary measures are not applied. the increase is fast at low arbitrary order and fractal dimension, and as the order rises, the growth of the infected class slows down. similarly, figure 4 shows the recovered cases, which may also increase under precautionary measures and isolation; this increase occurs at smaller fractional order and fractal dimension. all three figures show stability and convergence. case-b, when b = 0.0016630: we now take the transmission rate as 0.0016630 and obtain the results through the iterative method, as shown in figures 5 to 7.
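the fractal-fractional ab scheme above reduces, for integer order and dimension, to the classical two-step adams-bashforth method; the snippet below is a minimal integer-order sketch of that special case (our own simplification, not the paper's scheme (6.5)):

```python
def adams_bashforth2(f, y0, t0, dt, steps):
    """classical two-step adams-bashforth:
    y_{n+1} = y_n + dt*(1.5*f_n - 0.5*f_{n-1}),
    bootstrapped with a single euler step."""
    f_prev = f(t0, y0)
    y = y0 + dt * f_prev          # euler bootstrap step
    t = t0 + dt
    for _ in range(steps - 1):
        fn = f(t, y)
        y = y + dt * (1.5 * fn - 0.5 * f_prev)
        f_prev = fn
        t += dt
    return y
```

the fractal-fractional version weights the two interpolation nodes with gamma-function coefficients depending on ω and r, but the two-step predictor structure is the same.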
we observe that, as the susceptible class decays, the infected population also decreases when the transmission rate through social gatherings of people is reduced. as the transmission rate decreases, the peak value drops to 0.6 million. therefore, in the next four or five months, at this transmission rate, the maximum number of infection cases may be nearly 0.6 million. this number is smaller than in the preceding case, which shows the effect of lockdown and of the implementation of precautions in society. the figures of case-b also indicate stability and convergence, which can likewise be verified by plugging the values into the formula for r0, as in case-a. case-c, when b = 0.0016628: repeating the same procedure for b = 0.0016628, the model shows a further decrease in the infected population compared with the previous case, and the peak value is lower and attained in less time. this means that in the future the number of infected covid-19 cases will decrease. thus our numerical solutions provide the prediction that decreasing the transmission rate will decrease the infected cases, and vice versa, all over the country, provided the other precautionary measures described earlier are implemented. the dynamics of the different compartments are shown in figures 8, 9 and 10, respectively. in this work we have investigated an sir fractal-fractional model for the future prediction of covid-19 in pakistan using abc fractal-arbitrary order derivatives. the local and global stability of the considered model has been established through the equilibrium points, the next-generation matrix method and the "routh-hurwitz criterion". the positivity and boundedness of solutions have been shown by nonlinear techniques. some "fixed point results" for the existence of one or more solutions, and "hyers-ulam" stability results, have been provided for system (1.2).
using the "adams-bashforth method", we have provided an approximate solution for the considered model. using real data for "pakistan", we have plotted the solution and its behavior under changes of the transmission parameter for various arbitrary orders and fractal dimensions. decreasing the transmission rate, and implementing precautionary rules and regulations, has the most beneficial effect on controlling or slowing the spread of covid-19. it is also seen that, by minimizing contact with other people, the considered system gives a good strategy for overcoming the terrible infection.
references:
- is the world ready for the coronavirus? editorial, the new york times
- china virus death toll rises to 41, more than 1,300 infected worldwide
- cultivation of viruses from a high proportion of patients with colds
- applications of fractional calculus in physics
- outbreak of pneumonia of unknown etiology in wuhan, china: the mystery and the miracle
- an efficient technique for a time fractional model of lassa hemorrhagic fever spreading in pregnant women
- new approach for the model describing the deathly disease in pregnant women using mittag-leffler function
- a new fractional sirs-si malaria disease model with application of vaccines, anti-malarial drugs, and spraying
- semi-analytical study of pine wilt disease model with convex rate under caputo-fabrizio fractional order derivative
- dynamical behavior of a hepatitis b virus transmission model with vaccination
- modeling the transmission dynamics and control of hepatitis b virus in china
- a mathematical model for simulating the phase-based transmissibility of a novel coronavirus
- a pneumonia outbreak associated with a new coronavirus of probable bat origin
- early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia
- clinical features of patients infected with 2019 novel coronavirus in wuhan, china
- new dynamical behaviour of the coronavirus (covid-19) infection system with nonlocal operator from reservoirs to people
- china virus death toll rises to 41, more than 1,300 infected worldwide
- estimating the serial interval of the novel coronavirus disease (covid-19): a statistical analysis using the public data in hong kong
- preliminary estimation of the basic reproduction number of novel coronavirus (2019-ncov) in china, from 2019 to 2020: a data-driven analysis in the early phase of the outbreak
- pattern of early human-to-human transmission of wuhan
- transmission dynamics of 2019 novel coronavirus
- coronavirus: uk screens direct flights from wuhan after us case
- early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia
- coronavirus cases
- fractional differential equations, mathematics in science and engineering
- theory of fractional dynamic systems
- applications of fractional calculus to dynamic problems of linear and nonlinear hereditary mechanics of solids
- application of the laplace adomian decomposition method and implicit methods for solving burger's equation
- approximate analytical solution of the fractional epidemic model
- an analysis of the academic literature on simulation and modelling in health care
- on a two-dimensional magnetohydrodynamic problem: modelling and analysis
- a caputo power law model predicting the spread of the covid-19 outbreak in pakistan
- modelling the spread of covid-19 with new fractal-fractional operators: can the lockdown save mankind before vaccination?
- isolation and characterization of a bat sars-like coronavirus that uses the ace2 receptor
- genomic characterization of the 2019 novel human-pathogenic coronavirus isolated from patients with acute respiratory disease in wuhan
- stochastic mathematical model for the spread and control of corona virus
- fractional integrals and derivatives (theory and applications)
- a fractional calculus approach to the dynamic optimization of biological reactive systems. part i: fractional models for biological reactions
- an introduction to the fractional calculus and fractional differential equations
- theory and application of fractional differential equations
- on solutions of fractional differential equations
- modelling and analysis of fractal-fractional partial differential equations: application to reaction-diffusion model
- on a nonlinear fractional order model of dengue fever disease under caputo-fabrizio derivative
- solution of the epidemic model by adomian decomposition method
- solution of the epidemic model by homotopy perturbation method
- variational iteration method for solving the epidemic model and the prey and predator problem
- a conceptual model for the coronavirus disease 2019 (covid-19) outbreak in wuhan, china with individual reaction and governmental action
- the approximate noether symmetries and approximate first integrals for the approximate hamiltonian systems
- analysis of the fractional diffusion equations with fractional derivative of non-singular kernel
- fractional operators with exponential kernels and a lyapunov type inequality
- atangana-baleanu fractional framework of reproducing kernel technique in solving fractional population dynamics system
- existence theory and numerical solutions to smoking model under caputo-fabrizio fractional derivative
- modeling the dynamics of novel coronavirus (2019-ncov) with fractional derivative
- analysis of coronavirus disease (covid-19) model using numerical approaches and logistic model
- can transfer function and bode diagram be obtained from sumudu transform
- numerical solution of fractional order smoking model via laplace adomian decomposition method
- numerical analysis of fractional order model of hiv-1 infection of cd4+ t-cells
- a first course in the numerical analysis of differential equations
- numerical methods for ordinary differential equations
- electromagnetic waves described by a fractional derivative of variable and constant order with non-singular kernel
- discrete fractional differences with nonsingular discrete mittag-leffler kernels
- krasnoselskii n-tupled fixed point theorem with applications to fractional nonlinear dynamical system
- well-posedness and fractals via fixed point theory, fixed point theory and applications
- a common fixed point theorem in metric spaces, fixed point theory and applications
- fixed point theory for contractive mappings satisfying maps in metric spaces
- current update in pakistan about covid-19, on 16
- high contagiousness and rapid spread of severe acute respiratory syndrome coronavirus 2
there exist no competing interests regarding this manuscript. key: cord-343701-x5rghsbs authors: zhao, yu-feng; shou, ming-huan; wang, zheng-xin title: prediction of the number of patients infected with covid-19 based on rolling grey verhulst models date: 2020-06-25 journal: int j environ res public health doi: 10.3390/ijerph17124582 sha: doc_id: 343701 cord_uid: x5rghsbs the outbreak of a novel coronavirus (sars-cov-2) has recently caused a large number of residents in china to be infected with a highly contagious pneumonia. despite active control measures taken by the chinese government, the number of infected patients is still increasing day by day. at present, the changing trend of the epidemic is attracting everyone's attention. based on data from 21 january to 20 february 2020, six rolling grey verhulst models were built using 7-, 8- and 9-day data sequences to predict the daily growth trend of the number of patients confirmed with covid-19 infection in china. the results show that these six models consistently predict the s-shaped change characteristics of the cumulative number of confirmed patients, and that the daily growth decreased day by day after 4 february. the predicted results obtained by the different models are very close, with very high prediction accuracy.
in the training stage, the maximum and minimum mean absolute percentage errors (mapes) are 4.74% and 1.80%, respectively; in the testing stage, the maximum and minimum mapes are 4.72% and 1.65%, respectively. this indicates that the predicted results show high robustness. if the number of clinically diagnosed cases in wuhan city, hubei province, china, where covid-19 was first detected, is not counted from 12 february, the cumulative number of confirmed covid-19 cases in china will reach a maximum of 60,364–61,327 during 17–22 march; otherwise, the cumulative number of confirmed cases in china will be 78,817–79,780. an epidemic of the novel coronavirus disease 2019 (covid-19) broke out in wuhan city, hubei province, china, in early 2020, and spread rapidly in china and across the world, causing tens of thousands of people to be infected with the virus. on 21 january, prevention and control of the epidemic began at the national level, and many provinces in china launched the first-level emergency response for epidemic prevention. by 30 january, more than 7000 covid-19 infection cases had been confirmed in china and more than 50 cases had been confirmed in countries and regions outside china, and the world health organization (who) declared the outbreak of the novel coronavirus in china a public health emergency of international concern (pheic). according to the latest data (http://www.nhc.gov.cn/) released by the national health commission of the people's republic of china, it can be seen from figure 1 that the number of confirmed cases increases in an s-shaped trend. although the cumulative number is rising, the number of newly confirmed cases reached an inflection point, peaking on 4 february and then gradually decreasing. the number of newly confirmed cases rose sharply on 27 january due to the increase in the number of institutions with pneumonia-detection capacity.
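the mape figures quoted above follow the usual definition of mean absolute percentage error; a one-line sketch, with hypothetical sample values, is:

```python
def mape(actual, predicted):
    """mean absolute percentage error, in percent:
    100 * mean(|actual - predicted| / |actual|)."""
    return 100.0 * sum(abs(a - p) / abs(a)
                       for a, p in zip(actual, predicted)) / len(actual)
```

a mape below 5%, as reported for the rolling grey verhulst forecasts, is generally taken to indicate highly accurate prediction.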
figure 2 demonstrates that the number of suspected cases rose until 8 february and then decreased, while the number of deaths increased slowly and reached 2239 in total on 20 february. moreover, the number of cured patients increased significantly and reached 18,278 on the same day. on 11 february, the new coronavirus disease was officially named coronavirus disease 2019 (covid-19) by the who, and the international committee on taxonomy of viruses (ictv) named the virus strain severe acute respiratory syndrome coronavirus 2 (sars-cov-2). accurate prediction of the number of patients infected with covid-19 in china is undoubtedly of great significance for implementing prevention and control measures and carrying out economic and social activities. considering the limited sample size since the outbreak of the epidemic, and the fact that classical statistical prediction methods need large samples, the present research used grey system theory, in which a model can be built with only four data points [1, 2]. on this basis, a rolling grey verhulst model and its derived models were established to predict the trend in the number of covid-19 infection cases in china. since the outbreak of the covid-19 epidemic, many scientists around the world have studied the causes and mechanisms of the spread of the virus, along with relevant treatment programs. for example, based on an empirical analysis of a large amount of genomic data on a global scale, chen et al. [3] were the first to try to explain, at the molecular level, the reasons for the rapid mutation, multiple hosts and strong host adaptability of betacoronavirus. xu et al. [4] found genetic evolutionary relationships of the novel coronavirus with the severe acute respiratory syndrome (sars) coronavirus and the middle east respiratory syndrome (mers) coronavirus. furthermore, lu et al. [5] found that this virus has specific nucleic acid sequences that differ from those of the known coronaviruses. zhang et al.
[6] studied and summarized new methods for detecting common respiratory viruses. koo et al. [7] found that a variety of interventions to maintain physical distance are effective in reducing the number of sars-cov-2 cases. bi et al. [8] tracked covid-19-infected persons and their close contacts, showing that detection, isolation and tracing of cases can reduce the spread of the virus and help control the epidemic. archer et al. [9], using patients from 24 countries/regions, including europe and the united states, as data sources, investigated for the first time the impact of sars-cov-2 infection on pulmonary complications and mortality. to predict the number of people infected with sars-cov-2 accurately, scholars have established a series of models, generally divided into three categories: basic infectious disease models from mathematics and medicine [10] [11] [12] [13], economic models based on traditional statistical methods [14] [15] [16] and algorithmic models based on machine learning [17] [18] [19]. however, the above prediction models have some flaws. for example, the assumptions of an infectious disease model are strong (e.g., that there are no super-spreaders), and such models are easily affected by factors such as geography; economic models and intelligent algorithms have strict data requirements, needing a large amount of data for training and testing to obtain relatively accurate prediction results. in response to such new viruses, with low data availability and incomplete knowledge, the above models may not be applicable. for this reason, this research selected an innovative topic as its research object, namely the prediction of the final number of infected cases, which can provide a scientific basis for government policies. grey system theory and grey prediction models have been widely used in many fields, such as economics [20], demand prediction [21, 22] and environmental protection [23, 24], since being proposed.
the traditional gm(1,1) model and the grey verhulst model are the core grey prediction models. bao et al. [25] predicted all factors of disability in middle-aged and elderly people and the probability of specific injuries through the grey gm(1,1) model and found that non-communicable diseases (ncds) are still the main threat to the health of the elderly. in order to accurately predict the development of human echinococcosis in the xinjiang uygur autonomous region, china, zhang et al. [26] made short-term predictions using three models, i.e., a traditional gm(1,1) model, a grey-periodic extensional combinatorial model (pecgm(1,1)) and an fgm(1,1) model optimized by fourier series. meanwhile, based on the transmission mechanism of echinococcosis, they established a prediction model for dynamic epidemics that can effectively predict the future development trend of epidemics. the traditional gm(1,1) model is mainly applicable to sequences with strong exponential laws and can only describe a monotonic change process. the grey verhulst model, by contrast, has strong prediction capability for non-monotonic swinging developmental sequences or saturated s-shaped sequences, thanks to the first-order accumulated generating operation (1-ago) applied to the original data. in recent years, some scholars have explored the optimization of initial conditions [27] and background values [28, 29], the study of model properties [30, 31] and accuracy improvement [32] for the grey verhulst model. regarding application scenarios of the grey verhulst model, some scholars have conducted research based on real socioeconomic systems, which to some extent reflects the effectiveness and superiority of the grey verhulst model in comparison with the traditional model.
in order to improve short-term prediction of traffic speed and travel time, bezuglov and comert [33] utilized the gm(1,1) model, the gm(1,1) model modified by fourier error and the grey verhulst model modified by fourier error. the results demonstrate that the grey verhulst model modified by fourier error can better handle sudden changes in the parameters of a traffic system sequence. wang and li [34] constructed a non-equal-interval grey verhulst model and its derived model and optimized the parameters of the models by using particle swarm optimization. on this basis, they verified the environmental kuznets curve (ekc) of carbon dioxide emission in china by discussing the relationship between carbon dioxide emission and economic growth using a grey model. by building a grey verhulst model, wu et al. [35] predicted comprehensive air quality indexes in the chinese cities of beijing, tianjin and shijiazhuang, and the results show that governmental investment can promote the improvement of air quality to some extent. zhang et al. [36] combined the verhulst model with the bp neural network to gain complementary advantages and improve prediction accuracy and stability. by utilizing the grey verhulst model, wang et al. [37] predicted the state of the iron and steel industry in 2025 based on the relationship between the carbon emission of the industry and economic growth from 2001 to 2016; they also put forward relevant policy implications. in order to increase the predictive ability of the initial model, the model can be modified to better adapt to current needs. by introducing a new non-homogeneous exponential function, zeng et al. [38] constructed a grey n-verhulst model that overcame the defects of parameter dislocation and unreasonable initial value selection in the traditional verhulst model.
in addition, some scholars have built grey models based on a rolling mechanism to explore the hidden useful information of the original data sequence and improve modeling accuracy. akay and atak [39] proposed a grey prediction model based on a rolling mechanism to predict the total and industrial power consumption in turkey; the results demonstrate that the grey model based on a rolling mechanism shows greatly improved prediction accuracy. considering the complex randomness and nonlinearity of short-term traffic flow, xiao et al. [40] proposed a seasonal grey rolling prediction model based on the cycle truncation accumulated generation method and a rolling mechanism. xu et al. [41] established a br-agm(1,1) model based on an adaptive rolling mechanism to predict greenhouse gas emissions in china and discussed the policy significance of model overfitting and the modeling process. şahin [42] used a metabolism grey model, a nonlinear metabolism grey model and optimized versions of the two to predict greenhouse gas emissions in turkey; the results demonstrate that the prediction accuracy of the optimized nonlinear metabolism model based on a rolling mechanism is higher. the remainder of the paper is arranged as follows: section 3 introduces the traditional grey verhulst model, its derived model, the grey verhulst model based on a rolling mechanism and its derived models; section 4 presents the prediction and empirical results for pneumonia infection in china; and conclusions are drawn in section 5. the grey verhulst model is an effective model to describe and predict a process with a saturation state (s-type) under the condition of small samples; it is commonly used in the prediction of population, biological reproduction and product life.
under strict anti-epidemic measures in china, it is assumed that covid-19 cannot spread indefinitely; the cumulative number of confirmed cases will not increase indefinitely and will eventually converge to a saturation value. therefore, the grey verhulst model is suitable for modeling the growth and changes in the number of virus infection cases, especially for predicting the final value and inflection point of the number of confirmed cases. the grey verhulst model selected in this study has been shown to predict nonlinear data changes with small errors in multiple case studies [43]. furthermore, the rolling grey verhulst model established based on a rolling mechanism, and its derived model, can capture the dynamic characteristics of the future development trend of the system. to be clear, the model in this paper is only applicable to data with s-shaped growth, while the mortality rate and the number of deaths largely depend on the influence of a country's medical system, population parameters and various disease characteristics, which do not conform to the assumptions of the model and may lead to large errors. definition 1. let x^(0) = (x^(0)(1), x^(0)(2), ..., x^(0)(n)) be the original non-negative sequence and x^(1) = (x^(1)(1), x^(1)(2), ..., x^(1)(n)), with x^(1)(k) = x^(0)(1) + x^(0)(2) + ... + x^(0)(k), its first-order accumulated generating (1-ago) sequence; then z^(1), with z^(1)(k) = 0.5 (x^(1)(k) + x^(1)(k-1)), k = 2, ..., n, represents the generated mean sequence of consecutive neighbors of x^(1). definition 2. if x^(0), x^(1) and z^(1) are described as in definition 1, the following formula is obtained: x^(0)(k) + a z^(1)(k) = b (z^(1)(k))^r. the above formula is the basic form of the gm(1,1) power model, and dx^(1)/dt + a x^(1) = b (x^(1))^r is the whitening equation of the gm(1,1) power model. with x^(1) and z^(1) as shown in definition 1, the parameter list â = [a, b]^T can be calculated by using the least squares method as â = (B^T B)^(-1) B^T Y, where the k-th row of the matrix B is [-z^(1)(k), (z^(1)(k))^r] and Y = [x^(0)(2), ..., x^(0)(n)]^T. definition 3. in particular, when r = 2, x^(0)(k) + a z^(1)(k) = b (z^(1)(k))^2 is the basic form of the grey verhulst model, and dx^(1)/dt + a x^(1) = b (x^(1))^2 is the whitening equation of the grey verhulst model.
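the least-squares estimation and time response of the grey verhulst model described above can be sketched in a few lines of python. this is an illustrative implementation of the standard formulas, not code from this study, and the sequence x0 in any call is hypothetical:

```python
import math

def fit_grey_verhulst(x0):
    """Least-squares estimate of (a, b) for the grey Verhulst basic form
    x0(k) + a*z1(k) = b*z1(k)^2, where z1 is the mean sequence of the
    first-order accumulated (1-AGO) series x1."""
    n = len(x0)
    # 1-AGO sequence and generated mean sequence of consecutive neighbors
    x1 = [sum(x0[:k + 1]) for k in range(n)]
    z1 = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    # normal equations (B^T B) [a, b]^T = B^T Y, with B rows [-z, z^2]
    # and Y = [x0(2), ..., x0(n)]^T; the 2x2 system is solved in closed form
    s11 = sum(z * z for z in z1)
    s12 = sum(-z ** 3 for z in z1)
    s22 = sum(z ** 4 for z in z1)
    y = x0[1:]
    t1 = sum(-z * v for z, v in zip(z1, y))
    t2 = sum(z * z * v for z, v in zip(z1, y))
    det = s11 * s22 - s12 * s12
    a = (s22 * t1 - s12 * t2) / det
    b = (-s12 * t1 + s11 * t2) / det
    return a, b

def verhulst_predict(x0, a, b, steps=1):
    """Time response x1_hat(k) = a*x1(1) / (b*x1(1) + (a - b*x1(1)) * e^{a(k-1)}),
    reverted to the original series by first-order differencing."""
    x1_1 = x0[0]
    n = len(x0)
    def x1_hat(k):  # k = 1, 2, ...; k = 1 returns the initial value
        return a * x1_1 / (b * x1_1 + (a - b * x1_1) * math.exp(a * (k - 1)))
    return [x1_hat(n + j) - x1_hat(n + j - 1) for j in range(1, steps + 1)]
```

the fitted parameters satisfy the least-squares normal equations exactly, which gives a simple way to verify the solver on any hump-shaped sequence.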
the solution to the whitening equation of the grey verhulst model is x^(1)(t) = a x^(1)(1) / (b x^(1)(1) + (a - b x^(1)(1)) e^(a(t-1))). by substituting the initial value x̂^(1)(1) = x^(1)(1) into the above formula, the corresponding time response formula of the grey verhulst model is obtained as x̂^(1)(k+1) = a x^(1)(1) / (b x^(1)(1) + (a - b x^(1)(1)) e^(ak)). finally, the result is reverted by first-order differencing, x̂^(0)(k+1) = x̂^(1)(k+1) - x̂^(1)(k), to get the predicted value of the original sequence x^(0)(k). in the modeling process of the traditional grey verhulst model, a difference equation is first built and then converted into a differential equation, namely the whitening equation; the whitening time response function is derived through integration, and prediction and simulation are finally conducted. this transformation inevitably introduces an inherent deviation into the grey verhulst model (see the work of wang et al. [30]), so this study obtains a derived model of the grey verhulst model by referring to the traditional derived gm(1,1) model proposed by deng [44] and the method to derive the gm(1,1) power model by wang [45]. this derived model does not need the whitening response formula for prediction, a property that the traditional grey verhulst model does not possess. in order to facilitate the deduction of the derived models of the grey verhulst model, the variable of the traditional gm(1,1) model is denoted y, its development coefficient a, and its grey action quantity b. the derived model, namely the gm(1,1,y^(1)) model, is defined as y^(0)(k) = β - α y^(1)(k-1), where α = a/(1 + 0.5a) and β = b/(1 + 0.5a). according to this derived model, the derived model of the grey verhulst model can be obtained. proof. based on the derived y^(1)-type gm(1,1,y^(1)) model, the following recurrence can be written.
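the gm(1,1,y^(1)) recurrence y^(0)(k) = β - α y^(1)(k-1) can be checked numerically. the sketch below, with hypothetical values of a and b, verifies that the recurrence satisfies the basic form y^(0)(k) + a z^(1)(k) = b (sign and parameter conventions vary across the grey-model literature):

```python
def derived_gm11_step(a, b, y1_prev):
    """Derived GM(1,1) recurrence y0(k) = beta - alpha * y1(k-1),
    with alpha = a/(1 + 0.5a) and beta = b/(1 + 0.5a)."""
    alpha = a / (1.0 + 0.5 * a)
    beta = b / (1.0 + 0.5 * a)
    return beta - alpha * y1_prev
```

substituting y^(1)(k) = y^(1)(k-1) + y^(0)(k) and z^(1)(k) = 0.5 (y^(1)(k) + y^(1)(k-1)) into the basic form recovers the recurrence exactly, which is what the assertion below checks.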
by adding up the above k - 1 formulas, adding y^(0)(1) to both sides, and substituting y^(1)(k) = 1, expressions for the cases k = 2 and k = 3, 4, ..., n are obtained; substituting α = -a/(1 - 0.5a) and β = -b/(1 - 0.5a) into these expressions then yields the derived model of the grey verhulst model, with α = -a/(1 - 0.5a) and β = -b/(1 - 0.5a). the flow chart of the derived grey verhulst model is shown in figure 3. when using the grey verhulst model for modeling, the data before the current moment t = n are adopted. however, as time goes on, the development of any real socioeconomic system is accompanied by the constant arrival of random disturbance factors, which affect the development of the system. a rolling mechanism dynamically updates the initial value of the data sequence and accounts for these disturbance factors, and has been shown to greatly improve prediction accuracy [46]. therefore, this study introduces a rolling mechanism into the grey verhulst model and its derived model in order to reduce the influence of uncertain disturbance factors on the grey system in the future. the modeling process is as follows: the traditional grey verhulst model and its derived model, established on the original data sequence x^(0) = (x^(0)(1), x^(0)(2), ..., x^(0)(n)), are used to predict the next value x^(0)(n + 1). by appending this value to the sequence and removing the earliest data point x^(0)(1), a new sequence is formed; it is taken as the sequence used to build the model, and the above steps are repeated, predicting and appending one value at a time. the models established according to these steps are the rolling grey verhulst model and its derived model. the length of the rolling sequence is denoted l.
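the rolling mechanism itself is model-agnostic and can be sketched as below. the one-step predictor here is a deliberately trivial linear extrapolation standing in for the grey verhulst fit, so the code illustrates only the predict-append-drop loop:

```python
def rolling_forecast(series, window, horizon, one_step):
    """Rolling-mechanism prediction: fit on the latest `window` points,
    predict one step ahead, append the prediction, drop the oldest point,
    and repeat for `horizon` steps. `one_step` maps a window to the next value."""
    buf = list(series[-window:])
    out = []
    for _ in range(horizon):
        nxt = one_step(buf)
        out.append(nxt)
        buf = buf[1:] + [nxt]  # roll: add newest, remove oldest
    return out

def linear_step(w):
    """Placeholder one-step predictor: linear extrapolation of the last two points."""
    return 2 * w[-1] - w[-2]
```

for example, rolling_forecast([1, 2, 3, 4, 5], 3, 3, linear_step) returns [6, 7, 8]; replacing linear_step with a grey verhulst fit-and-predict step gives the rolling grey verhulst model of this section.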
when l is 9, the rolling modeling process of the two models is shown in figure 4. this model makes good use of new information and yields more accurate predictions. in order to compare accuracy and verify the effectiveness and reliability of the models, the absolute percentage error (ape) and mean absolute percentage error (mape) were used [47]: ape(i) = |e(i)| / x^(0)(i) × 100% and mape = (1/n) Σ_{i=1}^{n} |e(i)| / x^(0)(i) × 100%, where e(i) = x^(0)(i) - x̂^(0)(i), and x^(0)(i) and x̂^(0)(i) indicate the actual and predicted values, respectively. the levels of accuracy corresponding to mape are shown in table 1. relevant data on the numbers of confirmed cases, suspected cases, cured patients and deaths were acquired from the latest figures released by the national health commission (http://www.nhc.gov.cn/). using the number of patients infected with covid-19 in china from 20 january to 20 february as the original data, empirical modeling and analysis were performed. firstly, the rolling grey verhulst model and its derived model were established with rolling sequence lengths of 7, 8 and 9. in accordance with the length of the rolling sequence, the original data were divided into a training set and a testing set, and the prediction accuracies of the basic model and the derived model were compared. secondly, for each rolling sequence length, the model with the highest prediction accuracy was selected to predict the final value and inflection point of the number of confirmed cases. in accordance with the rolling mechanism described in section 3.3, the rolling grey verhulst model and its derived model were built; by replacing the earliest data with the latest, prediction and supplementation were carried out successively. although the analytic formulas of the basic model and the derived model differ, their parameter estimation results are identical.
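the ape and mape criteria can be implemented directly from their definitions; this is a generic sketch, not tied to the data of this study:

```python
def ape(actual, predicted):
    """Absolute percentage error, in percent: |x - xhat| / x * 100."""
    return 100.0 * abs(actual - predicted) / actual

def mape(actual, predicted):
    """Mean absolute percentage error over paired sequences, in percent."""
    errs = [ape(x, xh) for x, xh in zip(actual, predicted)]
    return sum(errs) / len(errs)
```

for instance, with actual values [100, 200] and predictions [90, 210], the individual apes are 10% and 5%, so the mape is 7.5%.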
by using the least squares method to estimate the parameters of the two models, the parameter lists â = [a, b]^T are obtained for rolling sequence lengths of 7, 8 and 9; the results are shown in table 2. the parameters of the rolling grey verhulst model and its derived model were obtained for the different rolling sequence lengths, and predicted values were calculated from the recurrence prediction formulas. by establishing the rolling grey verhulst model and its derived model with rolling sequence lengths of 7, 8 and 9, the data from 20 january to 20 february 2020 were predicted, and the training and testing sets were constructed based on the length of the rolling sequence. on the basis of ensuring that the models could accurately simulate the number of confirmed patients in the training set, three models with high prediction accuracy in the testing set were selected to predict the maximum value and inflection point for the different rolling sequence lengths. the results for the training and testing sets are presented in tables 3 and 4. as displayed in table 3, the prediction errors of the six models in the training set are less than 10%, indicating that the six models can accurately simulate the changes in the number of confirmed cases in china. observing the prediction performance of the six models in the testing set, their prediction errors are also all less than 10%, suggesting that the models can accurately predict future changes in the number of confirmed cases. as demonstrated in table 4, in the testing set, the prediction accuracy of the grey verhulst model with a rolling sequence length of 7 is higher than that of its derived model; the grey verhulst model and its derived model have mape values of 3.30% and 3.83%, respectively. when the rolling sequence lengths are 8 and 9, the grey verhulst models show lower prediction accuracy than their derived models.
the mapes of the grey verhulst model and its derived model are 4.72% and 3.13%, respectively, when the rolling sequence length is 8; they are 2.93% and 1.65%, respectively, when the rolling sequence length is 9. therefore, the grey verhulst model with a rolling sequence length of 7 and the derived grey verhulst models with rolling sequence lengths of 8 and 9 were selected to predict the final value and inflection point of the number of confirmed cases in china. as shown by the test results of the models in section 4.2, the grey verhulst model with a rolling sequence length of 7 and the derived grey verhulst models with rolling sequence lengths of 8 and 9 were used to predict the final value and inflection point of the number of confirmed cases in china. in order to combine the rolling mechanism with the derived grey verhulst models for out-of-sample rolling prediction, the data from 12 to 20 february, 13 to 20 february and 14 to 20 february were used to predict the latest data, which served as the numbers of newly confirmed cases in the next period. then, by removing the data of 12, 13 and 14 february, new initial sequences were established to continue the rolling operation until the predicted result for the latest day no longer changed, allowing this result to be taken as the final value. on 13 february, the hubei provincial health commission announced on its official website that, to be consistent with the case diagnosis and classification issued by other provinces in china, the province would release the number of clinically diagnosed cases and include this number in the confirmed cases. since then, due to the appearance of clinically diagnosed cases, the number of confirmed patients increased greatly every day. therefore, this study predicted the changes in the number of confirmed patients both with and without consideration of clinically diagnosed cases.
this research further calculated out-of-sample prediction accuracies for 21 and 22 february to verify the predictive ability of the models. the final predicted values are demonstrated in table 5. it was found that the three models could accurately predict out-of-sample data; the specific data are presented in table 6. as illustrated in figures 5-7, in the case of not considering clinically diagnosed cases, the maximum prediction values of the three models with rolling sequence lengths of 7, 8 and 9 are 60,364, 61,327 and 61,327, respectively. under the condition of considering clinically diagnosed cases, the final prediction values are 78,817, 79,780 and 79,780 on 17, 22 and 22 march, respectively. by analyzing the changes in the number of confirmed patients, it is found that the number of confirmed patients does show an s-shaped trend. moreover, the current number of confirmed patients has approached the final value, and the single-day growth in the number of confirmed patients has decreased. in order to further determine the inflection point, that is, the day with the maximum number of newly confirmed patients, this research plotted the single-day growth rate changes using the three models, as shown in figure 8. as shown in figure 8, the inflection point appeared on 4 february, with 3,892 newly confirmed patients. in addition, the single-day growth declines rapidly afterwards, and the predicted results of the three models are basically the same, which shows that the results are robust. based on a rolling mechanism, the rolling grey verhulst model and its derived models for predicting the number of patients infected with covid-19 in china were constructed by adding the latest data and removing the earliest data. empirical modeling and analysis were conducted using the number of infected cases from 20 january to 20 february.
firstly, in order to ensure the stability of the prediction results, rolling sequence lengths of 7, 8 and 9 were selected for the models. the original data were divided into training and testing sets to compare the prediction accuracies of the basic model and the derived models. secondly, considering the different rolling sequence lengths, the models with high prediction accuracy were selected to predict the final value and inflection point of the number of confirmed patients. the results showed that the rolling grey verhulst model and its derived models could accurately predict the changes in the number of confirmed patients in china. the prediction accuracy of the rolling grey verhulst model with a rolling sequence length of 7 was higher than that of its derived model, while the prediction accuracies of the rolling grey verhulst models with rolling sequence lengths of 8 and 9 were lower than those of the derived models. therefore, this study used the rolling grey verhulst models with high accuracy to predict the final number of confirmed patients and the date of reaching the final number. by predicting the final number of confirmed patients in china using the rolling grey verhulst model, the maximum predicted numbers by the three models with rolling sequence lengths of 7, 8 and 9 were 60,364, 61,327 and 61,327, respectively, when clinically diagnosed cases were not considered.
grey system theory and its application
model comparison of gm(1,1) and dgm(1,1) based on monte-carlo simulation
bioinformatics analysis of the wuhan 2019 human coronavirus genome
evolution of the novel coronavirus from the ongoing wuhan outbreak and modeling of its spike protein for risk of human transmission
outbreak of pneumonia of unknown etiology in wuhan china: the mystery and the miracle
recent advances in the detection of respiratory virus infection in humans
interventions to mitigate early spread of sars-cov-2 in singapore: a modelling study
epidemiology and transmission of covid-19 in 391 cases and 1286 of their close contacts in shenzhen, china: a retrospective cohort study
mortality and pulmonary complications in patients undergoing surgery with perioperative sars-cov-2 infection: an international cohort study
an epidemiological forecast model and software assessing interventions on covid-19 epidemic in china
epidemic analysis of covid-19 in china by dynamical modeling. arxiv
extended sir prediction of the epidemics trend of covid-19 in italy and compared with hunan
modeling and forecasting the early evolution of the covid-19 pandemic in brazil
real-time forecasts of the covid-19 epidemic in china from
day level forecasting for coronavirus disease (covid-19) spread: analysis, modeling and recommendations. arxiv
prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal
prediction for the spread of covid-19 in india and effectiveness of preventive measures
optimization method for forecasting confirmed cases of covid-19 in china
artificial intelligence forecasting of covid-19 in china. arxiv
prediction and decomposition of efficiency differences in chinese provincial community health services
the nls-based nonlinear grey bernoulli model with an application to employee demand prediction of high-tech enterprises in china
forecasting the water demand in chongqing, china using a grey prediction model and recommendations for the sustainable development of urban water consumption
the nls-based nonlinear grey multivariate model for forecasting pollutant emissions in china
forecasting the number of end-of-life vehicles using a hybrid model based on grey model and artificial neural network
forecasting and analyzing the disease burden of aged population in china, based on the 2010 global burden of disease study
time prediction models for echinococcosis based on gray system theory and epidemic dynamics
study of the grey verhulst model based on the weighted least square method
research on the modeling method of background value optimization of grey verhulst model
novel grey verhulst model and its prediction accuracy
unbiased grey verhulst model and its application
the optimization of time response function in non-equidistant verhulst model
an alternative approach to estimating the parameters of a generalised grey verhulst model: an application to steel intensity of use in the uk
short-term freeway traffic parameter prediction: application of grey system theory models
modelling the nonlinear relationship between co2 emissions and economic growth using a pso algorithm-based grey verhulst model
analyzing the air quality of beijing, tianjin, and shijiazhuang using grey verhulst model
application of optimized grey discrete verhulst-bp neural network model in settlement prediction of foundation pit
decomposing the decoupling of co2 emissions and economic growth in china's iron and steel industry
a new-structure grey verhulst model: development and performance comparison
grey prediction with rolling mechanism for electricity demand forecasting of turkey
an improved seasonal rolling grey forecasting model using a cycle truncation accumulated generating operation for traffic flow forecasting
chinese greenhouse gas emissions from energy consumption using a novel grey rolling model
forecasting of turkey's greenhouse gas emissions using linear and nonlinear rolling metabolic grey model based on optimization
the grey generalized verhulst model and its application for forecasting chinese pig price index
the basis of grey theory
models derived from gm(1,1) power model
forecasting u.s. shale gas monthly production using a hybrid arima and metabolic nonlinear grey model
key: cord-347199-slq70aou authors: safta, cosmin; ray, jaideep; sargsyan, khachik title: characterization of partially observed epidemics through bayesian inference: application to covid-19 date: 2020-10-07 journal: comput mech doi: 10.1007/s00466-020-01897-z sha: doc_id: 347199 cord_uid: slq70aou we demonstrate a bayesian method for the "real-time" characterization and forecasting of a partially observed covid-19 epidemic. characterization is the estimation of infection spread parameters using daily counts of symptomatic patients. the method is designed to help guide medical resource allocation in the early epoch of the outbreak. the estimation problem is posed as one of bayesian inference and solved using a markov chain monte carlo technique. the data used in this study were sourced before the arrival of the second wave of infection in july 2020. the proposed modeling approach, when applied at the county level, generally provides accurate forecasts at the regional, state and country level. the epidemiological model detected the flattening of the curve in california, after public health measures were instituted.
the method also detected different disease dynamics when applied to specific regions of new mexico. in this paper, we formulate and describe a data-driven epidemiological model to forecast the short-term evolution of a partially observed epidemic, with the aim of helping estimate and plan the deployment of medical resources and personnel. it also allows us to forecast, over a short period, the stream of patients seeking medical care, and thus estimate the demand for medical resources. it is meant to be used in the early days of the outbreak, when data and information about the pathogen and its interaction with its host population are scarce. the model is simple and makes few demands on our epidemiological knowledge of the pathogen. the method is cast as one of bayesian inference of the latent infection rate (number of people infected per day), conditioned on a time-series of daily counts of new symptomatic cases. developing a forecasting method that is applicable in the early epoch of a partially observed outbreak poses some peculiar difficulties. the evolution of an outbreak depends on the characteristics of the pathogen and its interaction with the patterns of life (i.e., population mixing) of the host population, both of which are ill-defined during the early epoch. these difficulties are further amplified if the pathogen is novel, and its myriad presentations in a human population are not fully known. in such a case, the various stages of the disease (e.g., prodrome, symptomatic etc.), and the residence times in each, are unknown. further, the patterns of life are expected to change over time as the virulence of the pathogen becomes known and medical countermeasures are put in place. in addition, to be useful, the model must provide its forecasts and results in a timely fashion, despite imperfect knowledge about the efficacy of the countermeasures and the degree of adherence of the population to them.
these requirements point towards a simple model that does not require much information or knowledge of the pathogen and its behavior to produce its outputs. in addition, they suggest an inferential approach conditioned on an easily obtained/observed marker of the progression of the outbreak (e.g., the time-series of daily new cases), even though the quality of the observations may leave much to be desired. in keeping with these insights into the peculiarities of forecasting during the early epoch, we pose our method as one of bayesian inference of a parametric model of the latent infection rate (which varies over time). this infection rate curve is convolved with the probability density function (pdf) of the incubation period of the disease to produce an expression for the time-series of newly symptomatic cases, an observable that is widely reported as "daily new cases" by various data sources [2, 5, 6]. a markov chain monte carlo (mcmc) method is used to construct a distribution for the parameters of the infection rate curve, even under imperfect knowledge of the incubation period's pdf. this uncertain infection rate curve, which reflects the lack of data and the imperfections of the epidemiological model, can be used to provide stochastic, short-term forecasts of the outbreak's evolution. the reliance on daily new cases, rather than the time-series of fatalities (which, arguably, has fewer uncertainties in it), is deliberate. fatalities are delayed and thus are not a timely source of information. in addition, in order to model fatalities, the presentation and progress of the disease in an individual must be completely known, a luxury not available for a novel pathogen.
our approach is heavily influenced by a similar effort undertaken in the late 1980s to analyze and forecast the progress of aids in san francisco [12], with its reliance on simplicity and inference, though the formulation of our model is original, as is the statistical method used in the inference. there have been many attempts at modeling the behavior of covid-19, most of which have forecasting as their primary aim. our ignorance of its behavior in the human population is evident in the choice of modeling techniques used for the purpose. time-series methods such as arima [9, 39] and logistic regression for cumulative time-series [28] have been used extensively, as have machine-learning methods using long short-term memory models [16, 17] and autoencoders [18]. these approaches do not require any disease models and focus solely on fitting the data, daily or cumulative, of new cases as reported. ref. [45] contains a comprehensive summary of various machine-learning methods used to "curve-fit" covid-19 data and produce forecasts. approaches that attempt to embed disease dynamical models into their forecasting process have also been explored, usually via compartmental seir models or their extensions. compartmental models represent the progress of the disease in an individual via a set of stages with exponentially distributed residence times, and predict the size of the population in each of the stages. these mechanistic models are fitted to data to infer the means of the exponential distributions, using mcmc [11] and ensemble kalman filters (or modifications) [19, 20, 38]. less common disease modeling techniques, such as agent-based simulations [26], modeling of infection and epidemiological processes as statistical ones [3] and the propagation of epidemics on a social network [43], have also been explored, as have methods that include proxies of population mixing (e.g., using google mobility data [15]).
there is also a group of 20 modeling teams that submit their epidemiological forecasts regarding the covid-19 pandemic to the cdc; details can be found at their website [7]. apart from forecasting and assisting in resource allocation, data-driven methods have also been used to assess whether countermeasures have been successful; e.g., by replacing a time-series of daily new cases with a piecewise linear approximation, the author in ref. [14] showed that the lockdown in india did not have a significant effect on "flattening the curve". we perform a similar analysis later, for the shelter-in-place orders implemented in california in mid-march, 2020. efforts to develop metrics, derived from observed time-series of cases, that could be used to monitor countermeasures' efficacy and trigger decisions [13] also exist. there have also been studies to estimate the unrecorded cases of covid-19 by computing excess cases of influenza-like illness versus previous years' numbers [32]. estimates of resurgence of the disease as nations come out of self-quarantine have also been developed [47]. some modeling and forecasting efforts have played an important role in guiding policy makers when responding to the pandemic. the first covid-19 forecasts, which led to serious considerations of imposing restrictions on the mixing of people in the united kingdom and the usa, were generated by a team from imperial college, london [21]. influential covid-19 forecasts for the usa were generated by a team from the university of washington, seattle [36] and were used to estimate the demand for medical resources [35]. these forecasts have also been compared to actual data, once they became available [34], an assessment that we also perform in this paper.
adaptations of the university of washington model that include mobility data to assess changes in population mixing have also been developed [46], showing enduring interest in using models and data to understand, predict and control the pandemic. figure 1 shows a schematic of the overall workflow developed in this paper. the epidemiological model is formulated in sect. 2, with postulated forms for the infection rate curve and the derivation of the prediction for daily new cases; we also discuss a filtering approach that is applied to the data before using it to infer model parameters. in sect. 3 we describe the "error model" and the statistical approach used to infer the latent infection rate curve, and to account for the uncertainties in the incubation period distribution. results, including push-forward posteriors and posterior predictive checks, are presented in sect. 4 and we conclude in sect. 5. the appendix includes a presentation of the data sources used in this paper. we present here an epidemiological model to characterize and forecast the rate at which people turn symptomatic from disease over time. for the purpose of this work, we assume that once people develop symptoms, they have ready access to medical services and can be diagnosed quickly. from this perspective, these forecasts represent a lower bound on the actual number of people that are infected with covid-19, as the people currently infected, but still incubating, are not accounted for. a fraction of the infected population might also exhibit minor or no symptoms at all and might not seek medical advice; therefore, these cases will not be part of the patient counts released by health officials. the epidemiological model consists of two main components: an infection rate model, presented in sect. 2.1, and an incubation rate model, described in sect. 2.2. these models are combined, through a convolution presented in sect. 2.3, into a forecast of the number of cases that turn symptomatic daily. these forecasts are compared to data presented in sect. 2.4 and the appendix. the infection rate represents the probability of an individual, that eventually will get affected during an epidemic, to be infected at a specific time following the start of the epidemic [30]. we approximate the rate of infection with a gamma distribution with unknown shape parameter k and scale parameter θ. depending on the choice of the pair (k, θ), this distribution can display a sharp increase in the number of people infected followed by a long tail, a dynamic that could lead to significant pressure on medical resources. alternatively, the model can also capture weaker gradients ("flattening the curve"), corresponding to public health efforts to temporarily increase social separation and thus reduce the pressure on available medical resources. the infection rate model is given by f(t; k, θ) = t^(k-1) e^(-t/θ) / (Γ(k) θ^k), where f(t; k, θ) is the pdf of the gamma distribution, with k and θ strictly positive. figure 2 shows example infection rate models for several shape and scale parameter values. the time in this figure is referenced with respect to the start of the epidemic. most of the results presented in this paper employ a lognormal incubation distribution for covid-19 [29]. the nominal and 95% confidence interval values for the mean, μ, and standard deviation, σ, of the natural logarithm of the incubation model are provided in table 1. the pdf, f_LN, and cumulative distribution function (cdf), F_LN, of the lognormal distribution are given by f_LN(t; μ, σ) = exp(-(ln t - μ)^2 / (2σ^2)) / (t σ √(2π)) and F_LN(t; μ, σ) = (1/2) [1 + erf((ln t - μ) / (σ √2))].
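the gamma infection rate pdf above can be evaluated with the standard library only; a minimal sketch, where any parameter values passed in calls are illustrative rather than inferred quantities:

```python
import math

def gamma_infection_rate(t, k, theta):
    """Gamma pdf f(t; k, theta) = t^{k-1} e^{-t/theta} / (Gamma(k) theta^k),
    used here as the latent infection-rate curve (k > 0, theta > 0, t > 0)."""
    if t <= 0.0:
        return 0.0
    return t ** (k - 1.0) * math.exp(-t / theta) / (math.gamma(k) * theta ** k)
```

with k = 1 the curve reduces to an exponential decay e^(-t/θ)/θ, while k > 1 gives the rise-then-long-tail shape described in the text.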
these forecasts are compared to data presented in sect. 2.4 and the appendix. the infection rate represents the probability that an individual who will eventually be affected during the epidemic becomes infected at a specific time following the start of the epidemic [30] . we approximate the rate of infection with a gamma distribution with unknown shape parameter k and scale parameter θ. depending on the choice of the pair (k, θ) this distribution can display a sharp increase in the number of people infected followed by a long tail, a dynamic that could put significant pressure on medical resources. alternatively, the model can also capture weaker gradients ("flattening the curve"), equivalent to public health efforts to temporarily increase social separation and thus reduce the pressure on available medical resources. the infection rate model is given by

f_Γ(t; k, θ) = θ^(−k) t^(k−1) exp(−t/θ) / Γ(k), (1)

where f_Γ(t; k, θ) is the pdf of the gamma distribution, with k and θ strictly positive. figure 2 shows example infection rate models for several shape and scale parameter values; the time in this figure is referenced with respect to the start of the epidemic. most of the results presented in this paper employ a lognormal incubation distribution for covid-19 [29] . the nominal and 95% confidence interval values for the mean, μ, and standard deviation, σ, of the natural logarithm of the incubation model are provided in table 1 . the pdf, f_LN, and cumulative distribution function (cdf), F_LN, of the lognormal distribution are given by

f_LN(t; μ, σ) = 1/(t σ √(2π)) exp(−(log t − μ)²/(2σ²)), F_LN(t; μ, σ) = ½ erfc(−(log t − μ)/(σ √2)). (2)

to ascertain the impact of limited sample size on the uncertainty of μ and σ, we analyze their theoretical distributions and compare with the data in table 1 . let μ̂ and σ̂ be the mean and standard deviation computed from a set of n samples of the natural logarithm of the incubation rate random variable. it follows that (μ̂ − μ)/(σ̂/√n) has a student's t-distribution with n − 1 degrees of freedom. to model the uncertainty in σ̂ we assume that (n − 1) σ̂²/σ² has a χ² distribution with n − 1 degrees of freedom.
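as a concrete illustration, the infection-rate model of eq. (1) and the sampling of the uncertain incubation parameters can be sketched in a few lines of python. the nominal values MU_NOM and SIGMA_NOM below are illustrative placeholders for the entries of table 1, and the function names are ours:

```python
import numpy as np
from scipy import stats

# infection rate model, eq. (1): gamma pdf with shape k and scale theta
def infection_rate(t, k, theta):
    """probability density of becoming infected t days after the epidemic start."""
    return stats.gamma.pdf(t, a=k, scale=theta)

# uncertain incubation model: the mean mu and standard deviation sigma of the
# log-incubation period are drawn via student's t and chi-square variates.
# nominal values are assumed placeholders for those in table 1.
MU_NOM, SIGMA_NOM = 1.62, 0.42   # assumed nominal log-mean / log-sd
N_STAR = 36                      # effective degrees of freedom from the text

def sample_incubation_params(rng):
    # (mu_hat - mu)/(sigma/sqrt(n)) ~ t_n  =>  perturb mu with a t variate
    mu = MU_NOM + SIGMA_NOM / np.sqrt(N_STAR) * rng.standard_t(N_STAR)
    # (n-1) sigma_hat^2 / sigma^2 ~ chi2_{n-1}  =>  invert for sigma
    sigma = SIGMA_NOM * np.sqrt((N_STAR - 1) / rng.chisquare(N_STAR - 1))
    return mu, sigma
```

each call to `sample_incubation_params` yields one member of the family of incubation pdfs shown in fig. 3.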
while the data in [29] is based on n = 181 confirmed cases, we found the corresponding 95% cis for μ and σ computed based on the student's t and chi-square distributions assumed above to be narrower than the ranges provided in table 1 . instead, to construct uncertain models for these statistics, we employed a number of degrees of freedom n* = 36 that provided the closest agreement, in an l2 sense, to the 95% ci in the reference. the left frame in fig. 3 shows the family of pdfs with μ and σ drawn from the student's t and χ² distributions, respectively. the nominal incubation pdf is shown in black in this frame. the impact of the uncertainty in the incubation model parameters is displayed in the right frame of this figure. for example, 7 days after infection, there is a large variability (60-90%) in the fraction of infected people that have completed the incubation phase and started displaying symptoms. this variability decreases at later times, e.g. after 10 days more than 85% of cases have completed the incubation process. in the results section we will compare results obtained with the lognormal incubation model with results based on other probability distributions. again, we turn to [29] , which provides parameter values corresponding to gamma, weibull, and erlang distributions. with these assumptions the number of people infected and with completed incubation period at time t_i can be written as a convolution between the infection rate and the cumulative distribution function of the incubation distribution [12, 42, 44]

n(t_i) = n ∫_{t_0}^{t_i} f_Γ(τ − t_0; k, θ) F_LN(t_i − τ; μ, σ) dτ, (3)

where n is the total number of people that will be infected throughout the epidemic and t_0 is the start time of the epidemic. this formulation assumes independence between the calendar date of the infection and the incubation distribution. using eq. (3), the number of people developing symptoms between times t_{i−1} and t_i is computed as

n_i = n(t_i) − n(t_{i−1}) = n ∫_{t_0}^{t_i} f_Γ(τ − t_0; k, θ) [F_LN(t_i − τ; μ, σ) − F_LN(t_{i−1} − τ; μ, σ)] dτ, (4)

where the difference of cdfs can be approximated using the lognormal pdf as

F_LN(t_i − τ; μ, σ) − F_LN(t_{i−1} − τ; μ, σ) ≈ (t_i − t_{i−1}) f_LN(t_i − τ; μ, σ),

leading to

n_i ≈ n (t_i − t_{i−1}) ∫_{t_0}^{t_i} f_Γ(τ − t_0; k, θ) f_LN(t_i − τ; μ, σ) dτ, (7)

where f_LN is the lognormal pdf.
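a minimal numerical sketch of eqs. (3)-(4), assuming a simple riemann discretization of the convolution integral (function and variable names are ours, not from the reference implementation):

```python
import numpy as np
from scipy import stats

def daily_symptomatic(t_days, N, t0, k, theta, mu, sigma):
    """eq. (3)-(4) sketch: people infected at time tau (gamma infection rate)
    who have completed incubation (lognormal cdf) by each day t, then daily
    differences of the cumulative counts."""
    # fine grid for the convolution integral over tau
    tau = np.linspace(t0, t_days[-1], 4000)
    dtau = tau[1] - tau[0]
    f_gamma = stats.gamma.pdf(tau - t0, a=k, scale=theta)
    cum = []
    for t in t_days:
        # lognorm cdf is zero for non-positive arguments, so tau > t drops out
        F_ln = stats.lognorm.cdf(t - tau, s=sigma, scale=np.exp(mu))
        cum.append(N * np.sum(f_gamma * F_ln) * dtau)
    # daily new symptomatic cases, eq. (4)
    return np.diff(np.array(cum), prepend=0.0)
```

for a long enough horizon the daily counts sum to (approximately) the total number of infected people n, as required by the model.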
the results presented in sect. 4 compute the number of people that turn symptomatic daily. these counts, n_i, computed through eqs. (4) or (7), are compared to data obtained from several sources at the national, state, or regional levels. we present the data sources in the appendix. we found that, for some states or regions, the reported daily counts exhibited a significant amount of noise. this is caused by variation in testing capabilities and sometimes by how data is aggregated from region to region and the time of the day when it is made available. sometimes previously undiagnosed cases are categorized as covid-19 and reported on the current day instead of being allocated to the original date. we employ a set of finite difference filters [25, 41] that preserve low-wavenumber information, i.e. weekly or monthly trends, and reduce high-wavenumber noise, e.g. large day-to-day variability such as all cases for successive days being reported at the end of the time range:

ŷ = (i + d)^(−1) y.

here y is the original data and ŷ is the filtered data. matrix i is the identity matrix and d is a band-diagonal matrix, e.g. tridiagonal for n = 2, i.e. a 2nd-order filter, and pentadiagonal for n = 4, i.e. a 4th-order filter. we have compared 2nd- and 4th-order filters, and did not observe any significant difference between the filtered results. reference [41] provides d matrices for filters up to 12th order. time series of y and ŷ for several regions are presented in the appendix. for the remainder of this paper we will only use filtered data to infer epidemiological parameters. for notational convenience, we will drop the hat and refer to the filtered data as y. note that all the data used in this study predate june 1, 2020 (in fact, most of the studies use data gathered before may 15, 2020), when covid-19 tests were administered primarily to symptomatic patients.
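the smoothing step can be illustrated with a simple 2nd-order implicit filter of the form ŷ = (i + d)^(−1) y, where d below is an illustrative scaled second-difference penalty (strength alpha), a stand-in for, not a reproduction of, the filter matrices of ref. [41] :

```python
import numpy as np

def smooth_counts(y, alpha=4.0):
    """implicit low-pass filter y_hat = (I + alpha*D)^(-1) y with a tridiagonal
    second-difference penalty D. larger alpha damps more of the day-to-day noise.
    illustrative stand-in for the finite-difference filters of ref. [41]."""
    n = len(y)
    D = np.zeros((n, n))
    i = np.arange(n)
    D[i, i] = 2.0
    D[i[:-1], i[:-1] + 1] = -1.0
    D[i[1:], i[1:] - 1] = -1.0
    D[0, 0] = D[-1, -1] = 1.0   # neumann-type ends: constants pass unchanged
    A = np.eye(n) + alpha * D
    return np.linalg.solve(A, np.asarray(y, dtype=float))
```

with the end conditions above the rows and columns of d sum to zero, so the filter preserves the total count exactly while damping high-wavenumber noise.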
thus the results and inferences presented in this paper apply only to the symptomatic cohort who seek medical care, and who thus pose the demand for medical resources. the data is also bereft of any information about the "second wave" of infections that affected southern and western usa in late june, 2020 [8] . given data, y, in the form of time-series of daily counts, as shown in sect. 2.4, and the model predictions n_i for the number of new daily symptomatic counts, presented in sect. 2, we will employ a bayesian framework to calibrate the epidemiological model parameters. the discrepancy between the data and the model is written as

y = n(Θ) + ε,

where y and n are arrays containing the data and model predictions, d is the number of data points, the model parameters are grouped as Θ = {t_0, n, k, θ}, and ε represents the error model and encapsulates, in this context, both errors in the observations as well as errors due to imperfect modeling choices. the observation errors include variations due to testing capabilities as well as errors when tests are interpreted. values for the vector of parameters Θ can be estimated in the form of a multivariate pdf via bayes' theorem

p(Θ | y) ∝ p(y | Θ) p(Θ),

where p(Θ | y) is the posterior distribution we are seeking after observing the data y, p(y | Θ) is the likelihood of observing the data y for a particular value of Θ, and p(Θ) encapsulates any prior information available for the model parameters. bayesian methods are well-suited for dealing with heterogeneous sources of uncertainty, in this case from our modeling assumptions, i.e. model and parametric uncertainties, as well as the communicated daily counts of covid-19 new cases, i.e. experimental errors. in this work we explore both deterministic and stochastic formulations for the incubation model. in the former case the mean and standard deviation of the incubation model are fixed at their nominal values and the model prediction n_i for day t_i is a scalar value that depends on Θ only.
in the latter case, the incubation model is stochastic with the mean and standard deviation of its natural logarithm treated as student's t and χ² random variables, respectively, as discussed in sect. 2.2. let us denote the underlying independent random variables by ξ = {ξ_μ, ξ_σ}. the model prediction n_i(ξ) is now a random variable induced by ξ plugged into eq. (4), and n(ξ) is a random vector. we explore two formulations for the statistical discrepancy between n and y. in the first approach we assume ε has a zero-mean multivariate normal (mvn) distribution. under this assumption the likelihood p(y | Θ) for the deterministic incubation model can be written as

p(y | Θ) = (2π)^(−d/2) |c_n|^(−1/2) exp(−(y − n)ᵀ c_n^(−1) (y − n)/2). (11)

the covariance matrix c_n can in principle be parameterized, e.g. with squared-exponential or matern models, and the corresponding parameters inferred jointly with Θ. however, given the sparsity of the data, we neglect correlations across time and presume a diagonal covariance matrix with diagonal entries computed as

σ_i = σ_a + σ_m y_i. (12)

the additive, σ_a, and multiplicative, σ_m, components will be inferred jointly with the model parameters; here, we infer the logarithms of these parameters to ensure they remain positive. under these assumptions, the mvn likelihood in eq. (11) is written as a product of independent gaussian densities

p(y | Θ) = ∏_{i=1}^{d} (2π σ_i²)^(−1/2) exp(−(y_i − n_i)²/(2σ_i²)), (13)

where σ_i is given by eq. (12) . in sect. 4.3 we will compare results obtained using only the additive part σ_a of eq. (12), i.e. fixing σ_m = 0, with results using both the additive and multiplicative components. the second approach assumes a negative-binomial distribution for the discrepancy between data and model predictions. the negative-binomial distribution is commonly used in epidemiology to model overly dispersed data, e.g. in cases where the variance exceeds the mean [31] . this is observed for most regions explored in this report, in particular for the second half of april and the first half of may.
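the two error models explored in the likelihood, the gaussian model with additive and multiplicative components and the negative-binomial model just introduced, can be sketched as log-likelihood functions; the function and argument names below are ours:

```python
import numpy as np
from scipy import stats

def gaussian_loglike(y, n_model, log_sig_a, log_sig_m):
    """independent gaussian discrepancies with sigma_i = sigma_a + sigma_m * y_i;
    the logarithms of sigma_a and sigma_m are inferred to keep both positive."""
    sig = np.exp(log_sig_a) + np.exp(log_sig_m) * y
    r = y - n_model
    return -0.5 * np.sum(np.log(2.0 * np.pi * sig**2) + (r / sig) ** 2)

def nb_loglike(y, n_model, alpha):
    """negative-binomial discrepancy with dispersion alpha: mean n_i and
    variance n_i + n_i^2 / alpha, suited to overdispersed count data."""
    p = alpha / (alpha + n_model)
    return np.sum(stats.nbinom.logpmf(y, alpha, p))
```

both functions reward model predictions close to the observed counts; the negative-binomial form allows the variance to grow faster than the mean.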
for this modeling choice, the likelihood of observing the data given a choice for the model parameters is given by

p(y | Θ) = ∏_{i=1}^{d} c(y_i + α − 1, y_i) (α/(α + n_i))^α (n_i/(α + n_i))^{y_i}, (14)

where α > 0 is the dispersion parameter, and

c(y_i + α − 1, y_i) = Γ(y_i + α) / (Γ(α) y_i!) (15)

is the (generalized) binomial coefficient. for simulations employing a negative-binomial distribution of discrepancies, the logarithm of the dispersion parameter α (to ensure it remains positive) will be inferred jointly with the other model parameters. for the stochastic incubation model the likelihood reads

p(y | Θ) = π_{n(Θ),ξ}(y), (16)

which we simplify by assuming independence of the discrepancies between different days, arriving at

p(y | Θ) = ∏_{i=1}^{d} π_{n_i(Θ),ξ}(y_i). (17)

unlike in the deterministic incubation model, the likelihood elements for each day, π_{n_i(Θ),ξ}(y_i), are no longer analytically tractable since they now incorporate contributions from ξ, i.e. from the variability of the parameters of the incubation model. one can evaluate the likelihood via kernel density estimation by sampling ξ for each sample of Θ, and combining these samples with samples of the assumed discrepancy ε, in order to arrive at an estimate of π_{n_i(Θ),ξ}(y_i). in fact, by sampling a single value of ξ for each sample of Θ, one achieves an unbiased estimate of the likelihood π_{n_i(Θ),ξ}(y_i), and, given the independent-component assumption, this also leads to an unbiased estimate of the full likelihood π_{n(Θ),ξ}(y). a markov chain monte carlo (mcmc) algorithm is used to sample from the posterior density p(Θ | y). mcmc is a class of techniques that allows sampling from a posterior distribution by constructing a markov chain that has the posterior as its stationary distribution. in particular, we use a delayed-rejection adaptive metropolis (dram) algorithm [23] . we have also explored additional algorithms, including transitional mcmc (tmcmc) [27, 37] as well as ensemble samplers [22] that allow model evaluations to run in parallel and can sample multi-modal posterior distributions.
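the accept-reject mechanics of the sampler can be illustrated with a plain random-walk metropolis algorithm; this is a minimal stand-in for the dram algorithm of ref. [23] , which additionally performs delayed rejection and proposal adaptation:

```python
import numpy as np

def metropolis(logpost, theta0, prop_std, n_samples, rng):
    """random-walk metropolis with a symmetric gaussian proposal; a minimal
    stand-in for the dram sampler of ref. [23]."""
    theta = np.array(theta0, dtype=float)
    lp = logpost(theta)
    chain = np.empty((n_samples, theta.size))
    for i in range(n_samples):
        prop = theta + prop_std * rng.standard_normal(theta.size)
        lp_prop = logpost(prop)
        # symmetric proposal: accept with probability min(1, p(prop|y)/p(theta|y))
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        chain[i] = theta
    return chain
```

for the stochastic incubation model, `logpost` would internally draw one ξ sample per evaluation, yielding the pseudo-marginal variant described in the text.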
as we revised the model implementation, the computational expense was reduced by approximately two orders of magnitude, and all results presented in this report are based on posterior sampling via dram. a key step in mcmc is the accept-reject mechanism of the metropolis-hastings algorithm. each sample Θ_{i+1}, drawn from a proposal q(· | Θ_i), is accepted with probability

α(Θ_i, Θ_{i+1}) = min(1, p(Θ_{i+1} | y) q(Θ_i | Θ_{i+1}) / (p(Θ_i | y) q(Θ_{i+1} | Θ_i))),

where p(Θ_i | y) and p(Θ_{i+1} | y) are the values of the posterior pdf evaluated at the samples Θ_i and Θ_{i+1}, respectively. in this work we employ symmetric proposals, q(Θ_i | Θ_{i+1}) = q(Θ_{i+1} | Θ_i), so the acceptance probability reduces to the ratio of posterior values. this is a straightforward application of mcmc for the deterministic incubation model. for the stochastic incubation model, we employ the unbiased estimate of the approximate likelihood described in the previous section. this is the essence of the pseudo-marginal mcmc algorithm [10] , guaranteeing that the accepted mcmc samples correspond to the posterior distribution. in other words, at each mcmc step we draw a random sample ξ from its distribution, and then we estimate the likelihood in a way similar to the deterministic incubation model, via eqs. (13) or (14). figure 4 shows samples corresponding to a typical mcmc simulation used to sample the posterior distribution of Θ. we used the raftery-lewis diagnostic [40] to determine the number of mcmc samples required for converged statistics corresponding to stationary posterior distributions for Θ. the required number of samples is of the order of 10^5 − 10^6, depending on the geographical region employed in the inference. the resulting effective sample size [24] varies between 8000 and 15,000 samples, depending on the parameter, which is sufficient to estimate joint distributions for the model parameters. figure 5 displays 1d and 2d joint marginal distributions based on the chain samples shown in the previous figure. these results indicate strong dependencies between some of the model parameters, e.g.
between the start of the epidemic t_0 and the shape parameter k of the infection rate model. this was somewhat expected based on the evolution of the daily counts of symptomatic cases and the functional form that couples the infection rate and incubation models. the number of samples in the mcmc simulations is tailored to capture these dependencies. we will employ both pushed-forward distributions and bayesian posterior-predictive distributions [33] to assess the predictive skill of the proposed statistical model of the covid-19 disease spread. the schematic in eq. (18) illustrates the process used to generate push-forward posterior estimates:

p(Θ | y) → {Θ^(1), . . . , Θ^(m)} → {y^(pf,1), . . . , y^(pf,m)} → p_pf(y^(pf) | y). (18)

here, y^(pf) denotes hypothetical data y and p_pf(y^(pf) | y) denotes the push-forward probability density of the hypothetical data y^(pf) conditioned on the observed data y. we start with samples from the posterior distribution p(Θ | y) and push each sample through the model. the pushed-forward posterior does not account for the discrepancy between the data y and the model predictions n, subsumed into the definition of the error models presented in eqs. (13) and (14). the bayesian posterior-predictive distribution, defined in eq. (19), is computed by marginalization of the likelihood over the posterior distribution of the model parameters Θ:

p_pp(y^(pp) | y) = ∫ p(y^(pp) | Θ) p(Θ | y) dΘ. (19)

in practice, we estimate p_pp(y^(pp) | y) through sampling, because analytical estimates are usually not available. the sampling workflow is similar to the one shown in eq. (18) . after the model evaluations, y = n(Θ), are completed, we add random noise consistent with the likelihood model settings presented in sect. 3.1. the resulting samples are used to compute summary statistics for p_pp(y^(pp) | y). the push-forward and posterior-predictive distribution workflows can be used in hindcast mode, to check how well the model follows the data, and for short-term forecasts of the spread dynamics of this disease.
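the two workflows of eqs. (18) and (19) differ only in whether noise from the error model is added to the model evaluations; a sketch, with illustrative function names:

```python
import numpy as np

def push_forward(model, posterior_samples):
    """eq. (18) sketch: evaluate the model at each posterior sample."""
    return np.array([model(th) for th in posterior_samples])

def posterior_predictive(model, posterior_samples, noise, rng):
    """eq. (19) sketch: push-forward plus random noise drawn from the
    assumed error model (passed in as a callable)."""
    ypf = push_forward(model, posterior_samples)
    return ypf + noise(ypf, rng)
```

summary statistics (median, inter-quartile range, 95% bands) are then computed column-wise over the resulting sample arrays.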
in the hindcast regime, the infection rate is convolved with the incubation rate model to generate statistics for y^(pp) (or y^(pf)) that are compared against y, the data used to infer the model parameters. the same functional form can be used to generate statistics for y^(pp) (or y^(pf)) beyond the set of dates for which data was available. we limit these forecasts to 7-10 days, as our infection rate model does not account for changes in social dynamics that can significantly impact the epidemic over a longer time range. the statistical models described above are calibrated using data available at the country, state, and regional levels, and the calibrated model is used to gauge the agreement between the model and the data and to generate short-term forecasts, typically 7-10 days ahead. first, we assess the predictive capabilities of these models for several modeling choices; we then present results exploring the epidemiological dynamics at several geographical scales in sect. 4.5. the push-forward and posterior-predictive figures presented in this section show the data used to calibrate the epidemiological model with filled black circles. the shaded color region illustrates either the pushed-forward posterior or the posterior-predictive distribution, with darker colors near the median and lighter colors near the low and high quantile values. the blue colors correspond to the hindcast dates and red colors correspond to forecasts. the inter-quartile range is marked with green lines and the 95% confidence interval with dashed lines. some of the plots also show data collected at a later time, with open circles, to check the agreement between the forecast and the observed number of cases after the model has been calibrated. we start the analysis with an assessment of the impact of the choice of the family of distributions on the model prediction. the left frame of fig.
6 shows median (with red lines and symbols) and 95% ci with blue/magenta lines for the new daily cases based on lognormal, gamma, weibull, and erlang distributions for the incubation model. the mean and standard deviation of the natural logarithm of the associated lognormal random variable, and the shape and scale parameters for the other distributions are available in appendix table 2 from reference [29] . the results for all four incubation models are visually very close. this observation holds for other simulations at national/state/regional levels (results not shown). the results presented in the remainder of this paper are based on lognormal incubation models. the right frame in fig. 6 presents the corresponding infection rate curve that resulted from the model calibration. this represents a lower bound on the true number of infected people, as our model will not capture the asymptomatic cases or the population that displays minor symptoms and did not seek medical care. next, we analyze the impact of the choice of deterministic vs stochastic incubation models on the model prediction. first we ran our model using the lognormal incubation model with mean and standard deviation fixed at their nominal values in table 1 . we then used the same dataset to calibrate the epidemiological model which employs an incubation rate with uncertain mean and standard deviation as described in sect. 2.2. these results are labeled "deterministic" and "stochastic", respectively, in fig. 7 . this figure shows results based on data corresponding to the united states. the choice of deterministic vs stochastic incubation models produce very similar outputs. the results shown in the right frame of fig. 
3 indicate a relatively wide spread, between 0.64 and 0.95 with a nominal value around 0.8, in the fraction of people that complete the incubation and start exhibiting symptoms 7 days after infection. next, we explore results based on either the ae or the a + me formulation for the statistical discrepancy between the epidemiological model and the data. this choice impacts the construction of the covariance matrix for the gaussian likelihood model in eq. (12) . for ae we only infer σ_a, while for a + me we infer both σ_a and σ_m. the ae results in fig. 8a are based on the same dataset as the a + me results in fig. 8b . both formulations present advantages and disadvantages when attempting to model daily symptomatic cases that span several orders of magnitude. the ae model, in fig. 8a , presents a posterior-predictive range around the peak region that is consistent with the spread in the data. however, the constant σ_i = σ_a over the entire date range results in much wider uncertainties predicted by the model at the onset of the epidemic. the a + me model handles the discrepancy better overall, as the multiplicative error term allows it to adjust the uncertainty bound with the data. nevertheless, this model results in a wider uncertainty band than warranted by the data near the peak region. these results indicate that a time-dependent formulation for the error model could improve the discrepancy between the covid-19 data and the epidemiological model. we briefly explore the difference between the pushed-forward posterior, in fig. 8c , and the posterior-predictive data, in fig. 7b . these results show that uncertainties in the model parameters alone are not sufficient to capture the spread in the data. this observation suggests more work is needed on augmenting the epidemiological model with embedded components that can explain the spread in the data without the need for external error terms.
the negative-binomial distribution is commonly used in epidemiology to model overly dispersed data, e.g. in cases where the variance exceeds the mean [31] . we observe similar trends in some of the covid-19 datasets. figure 9 shows results based on data for alaska. the results based on the two error models are very similar, with the negative-binomial results (on the top row) offering a slightly wider uncertainty band that better covers the data dispersion. nevertheless, results are very similar, as they are for other regions that exhibit a similar number of daily cases, typically less than a few hundred. for regions with a larger number of daily cases, the likelihood evaluation was fraught with errors due to the evaluation of the negative binomial. we therefore shifted our attention to the gaussian formulation, which offers a more robust evaluation for this problem. in this section we examine forecasts based on data aggregated at the country, state, and regional levels, and highlight similarities and differences in the epidemic dynamics resulting from these datasets. the data in fig. 10 illustrates the built-in delay in the disease dynamics due to the incubation process. a stay-at-home order was issued on march 19. given the incubation rate distribution, it takes about 10 days for 90-95% of the people infected to start showing symptoms. after the stay-at-home order was issued, the number of daily cases continued to rise because of infections that occurred before march 19. the data begins to flatten out in the first week of april and the model captures this trend a few days later, april 3-5. the data corresponding to april 9-11 show an increased dispersion. to capture this increased noise, we switched from an ae model to an a + me model, with results shown in fig. 11 . the filtered data for the new mexico regions is shown in fig. 20b . the data for the central region shows a smaller daily count compared to the nw region. the epidemiological model captures the relatively large dispersion in the data for both regions.
for nm-c the first cases are recorded around march 10, and the model suggests the peak was reached around mid-april, while nm-nw starts about a week later, around march 18, but records approximately twice as many daily cases when it reaches its peak in the first half of may. both regions display declining cases as of late may. comparing the californian and new mexican results, it is clear that the degree of scatter in the new mexico data is much larger and adversely affects the inference, the model fit, and the forecast accuracy. the reason for this scatter is unknown, but the daily numbers for new mexico are much smaller than california's and are affected by individual events, e.g., detection of transmission in a nursing home or a community. this is further accentuated by the fact that new mexico is a sparsely populated region where sustained transmission, which results in smooth curves, is largely impossible outside its few urban communities. this section discusses an analysis of the aggregate data from all us states. the posterior-predictive results shown in fig. 14a -d suggest the peak in the number of daily cases was reached around mid-april. nevertheless, the model had to adjust the downward slope as the number of daily cases has been declining at a slower pace compared to the time window that immediately followed the peak. as a result, the prediction for the total number of people, n, that would be infected in the us during this first wave of infections has been steadily increasing, as shown in fig. 14e . we conclude our analysis of the proposed epidemiological model with the available daily symptomatic cases for germany, italy, and spain, in figs. 15, 16 and 17. for germany, the uncertainty range increases while the epidemic is winding down, as the data has a relatively large spread compared to the number of daily cases recorded around mid-may. this reinforces an earlier point about the need to refine the error model with a time-dependent component.
for spain, a brief initial downslope can be observed in early april, also evident in the filtered data presented in fig. 19b . this, however, was followed by an upward shift, resulting in an overly-dispersed dataset and a wide uncertainty band for spain. forecasts based on daily symptomatic cases reported for italy, in fig. 17 , exhibit an upward shift observed around april 10-20, similar to the data for spain above. the subsequent forecasts display narrower uncertainty bands compared to other similar forecasts above, possibly due to the absence of hotspots and/or regular data reporting. figures 10, 11 and 14 show inferences and forecasts obtained using data available until mid-may, 2020. they indicate that the outbreak was dying down, with forecasts of daily new cases trending down. in early june, public health measures to restrict population mixing were curtailed, and by mid-july both california and the us were experiencing an explosive increase in new cases of covid-19 being detected every day, quite at variance with the forecasts in the figures. this was due to a second wave of infections caused by enhanced population mixing. the model in eq. (3) cannot capture the second wave of infections due to its reliance on a unimodal infection curve n f_Γ(τ − t_0; k, θ). this was by design, as the model is meant to be used early in an outbreak, with public health officials focusing on the first wave of infections. however, it can be trivially extended with a second infection curve to yield an augmented equation

n_i = n^[1] ∫_{t_0}^{t_i} f_Γ(τ − t_0; k^[1], θ^[1]) f_LN(t_i − τ; μ, σ) dτ + n^[2] ∫_{t_0}^{t_i} f_Γ(τ − (t_0 + Δt); k^[2], θ^[2]) f_LN(t_i − τ; μ, σ) dτ, (20)

with two sets of parameters for the two infection curves, which are separated in time by Δt > 0. eq. (20) is then fitted to data which is suspected to contain the effects of two waves of infection. this process does double the number of parameters to be estimated from the data.
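a sketch of the two-wave extension of eq. (20), assuming a riemann discretization of the integrals and daily differencing of the cumulative counts (function and variable names are ours):

```python
import numpy as np
from scipy import stats

def two_wave_daily_cases(t_days, waves, mu, sigma, t0=0.0):
    """eq. (20) sketch: superpose infection-rate curves, each a gamma pdf with
    its own (N, k, theta, delay), convolve with the lognormal incubation cdf,
    then difference to get daily counts. 'waves' is a list of tuples
    (N, k, theta, delta_t), with delta_t = 0 for the first wave."""
    tau = np.linspace(t0, t_days[-1], 4000)
    dtau = tau[1] - tau[0]
    cum = np.zeros(len(t_days))
    for N, k, theta, dt in waves:
        # gamma pdf is zero for negative arguments: wave starts at t0 + dt
        f = stats.gamma.pdf(tau - (t0 + dt), a=k, scale=theta)
        for j, t in enumerate(t_days):
            F_ln = stats.lognorm.cdf(t - tau, s=sigma, scale=np.exp(mu))
            cum[j] += N * np.sum(f * F_ln) * dtau
    return np.diff(cum, prepend=0.0)
```

the augmentation is additive, so further waves simply append more tuples to the list; each wave doubles (or triples, etc.) the infection-curve parameters to infer.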
however, the posterior density inferred for the parameters of the first wave (i.e., those with the [1] superscript), using data collected before the arrival of the second wave, can be used to impose informative priors, considerably simplifying the estimation problem. note that the augmentation shown in eq. (20) is very intuitive and can be repeated if multiple infection waves are suspected. a second method that could, in principle, be used to infer multiple waves of infection is compartmental modeling, e.g., sir models or their extensions. these models represent the epidemiological evolution of a patient through a sequence of compartments/states, with the residence time in each compartment modeled as a random variable. one of these compartments, "infectious", can then be used to model the spread of the disease to other individuals. such compartmental models have also been parlayed into ordinary differential equation (ode) models for an entire population, with the population distributed among the various compartments. ode models assume that the residence time in each compartment is exponentially distributed and, by using multiple compartments, can represent incubation and symptomatic periods that are not exponentially distributed; this, however, leads to an explosion of compartments. the spread-of-infection model often involves a time-dependent reproductive number r(t) that can be used to model the effectiveness of epidemic control measures. it denotes the number of individuals to whom a single infected individual will spread the disease, and as public health measures are put in place (or removed), r(t) will decrease or increase. we did not consider sir models, or their extensions, in our study as our model is meant to be used early in an outbreak when data is scarce and incomplete. since our method is data-driven and involves fitting a model, a deterministic (ode) compartmental model with few parameters would be desirable.
the reasons for avoiding ode-based compartmental models are:

- the incubation period of covid-19 is not exponential (it is lognormal) and there is no way of modeling it with a single "infectious" compartment.
- while it is possible to decompose the "infectious" compartment into multiple sub-compartments, doing so would increase the dimensionality of the inverse problem, as we would have to infer the fraction of the infected population in each of the sub-compartments. this is not desirable when data is scarce.
- we did not consider using extensions of sir, i.e., those with more compartments, since they would require us to know the residence time in each compartment. this information is not available with much certainty at the start of an epidemic. this is particularly true for covid-19, where only a fraction of the "infectious" cohort progresses to compartments which exhibit symptoms.
- sir models can infer the existence of a second wave of infections but would require a very flexible parameterization of r(t) that would allow bi- or multimodal behavior. it is unknown what sparsely parameterized functional form would be sufficient for modeling r(t).

this paper illustrates the performance of a method for producing short-term forecasts (with a forecasting horizon of about 7-10 days) of a partially-observed infectious disease outbreak. we have applied the method to the covid-19 pandemic of spring, 2020. the forecasting problem is formulated as a bayesian inverse problem, predicated on an incubation period model. the bayesian inverse problem is solved using markov chain monte carlo and infers parameters of the latent infection-rate curve from an observed time-series of new case counts. the forecast is merely the posterior-predictive simulation using realizations of the infection-rate curve and the incubation period model. the method accommodates multiple, competing incubation period models using a pseudo-marginal metropolis-hastings sampler.
the variability in the incubation rate model has little impact on the forecast uncertainty, which is mostly due to the variability in the observed data and the discrepancy between the latent infection rate model and the spread dynamics at several geographical scales. the uncertainty in the incubation period distribution also has little impact on the inferred latent infection rate curve. the method is applied at the country, provincial and regional/county scales. the bulk of the study used data aggregated at the state and country level for the united states, as well as counties in new mexico and california. we also analyzed data from a few european countries. the wide disparity of daily new cases motivated us to study two formulations for the error model used in the likelihood, though the gaussian error models were found to be acceptable for all cases. the most successful error model included a combination of multiplicative and additive errors. this was because of the wide disparity in the daily case counts experienced over the full duration of the outbreak. the method was found to be sufficiently robust to produce useful forecasts at all three spatial resolutions, though high-variance noise in low-count data (poorly reported/low-count/largely unscathed counties) posed the stiffest challenge in discerning the latent infection rate. the method produces rough-and-ready information required to monitor the efficacy of quarantining efforts. it can be used to predict the potential shift in demand of medical resources due to the model's inferential capabilities to detect changes in disease dynamics through short-term forecasts. it took about 10 days of data (about the 90% quantile of the incubation model distribution) to infer the flattening of the infection rate in california after curbs on population mixing were instituted. 
the method also detected the anomalous dynamics of covid-19 in northwestern new mexico, where the outbreak has displayed a stubborn persistence over time. our approach suffers from two shortcomings. the first is our reliance on the time-series of daily new confirmed cases as the calibration data. as the pandemic has progressed and testing for covid-19 infection has become widespread in the usa, the daily confirmed new cases are no longer predominantly symptomatic cases who might require medical care, and forecasts developed using our method would overpredict the demand for medical resources. however, as stated in sect. 1, our approach, with its emphasis on simplicity and reliance on easily observed data, is meant to be used in the early epoch of the outbreak for medical resource forecasting, and within those pragmatic considerations, has worked well. the approach could perhaps be augmented with a time-series of covid-19 tests administered every day to tease apart the effect of increased testing on the observed data, but that is beyond the scope of the current work. undoubtedly this would result in a more complex model, which would need to be conditioned on more plentiful data, which might not be readily available during the early epoch of an outbreak. the second shortcoming of our approach is that it does not model, detect or infer a second wave of infections caused by an increase in population mixing. this can be accomplished by adding a second infection-rate curve/model to the inference procedure. this doubles the number of parameters to be inferred from the data, but the parameters of the first wave can be tightly constrained using informative priors. this issue is currently being investigated by the authors. this work was funded in part by the laboratory directed research and development (ldrd) program at sandia national laboratories. khachik sargsyan was supported by the u.s. 
department of energy, office of science, office of advanced scientific computing research, scientific discovery through advanced computing (scidac) program through the fastmath institute. sandia national laboratories is a multimission laboratory managed and operated by national technology and engineering solutions of sandia, llc., a wholly owned subsidiary of honeywell international, inc., for the u.s. department of energy's national nuclear security administration under contract de-na-0003525. this paper describes objective technical results and analysis. any subjective views or opinions that might be expressed in the paper do not necessarily represent the views of the u.s. department of energy or the united states government. we have used several sources [1, 2, 4-6] to gather daily counts of symptomatic cases at several times while we performed this work. the illustrations in this section present both the original data, with blue symbols, and the filtered data, with red symbols and lines. figure 18 shows data for all of the us (extracted from [6]) and for 5 selected states (extracted from [2]). the filtering approach, presented in sect. 2.4, preserves the weekly-scale variability observed for some of the datasets in this figure, and removes some of the large day-to-day variability observed, for example, in alaska in fig. 18d. figure 19 shows the data for several countries with a significant number of covid-19 cases as of may 10, 2020. as in the us and some of the us states, a weekly frequency trend can be observed superimposed on the overall epidemiological trend. these trends are observed mostly on the downward slope, e.g. for italy and germany. when the epidemic is taking hold, it is possible that any higher-frequency fluctuation is hidden inside the sharply increasing counts. possible explanations include regional hot-spots flaring up periodically as well as expanded testing capabilities ramping up over time. 
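the excerpt does not reproduce the filter of sect. 2.4; as an illustration of the stated behavior (suppressing large day-to-day variability while preserving weekly-scale structure), a short centered moving-median does the job. window size and edge handling below are assumptions of this sketch.

```python
import numpy as np

def despike(counts, window=3):
    """centered moving-median filter: suppresses isolated day-to-day spikes
    (e.g. a reporting backlog dumped on a single day) while a short window
    leaves weekly-scale oscillations largely intact. window size and edge
    padding are assumptions of this sketch."""
    counts = np.asarray(counts, dtype=float)
    half = window // 2
    padded = np.pad(counts, half, mode="edge")  # repeat end values at the edges
    return np.array([np.median(padded[i:i + window])
                     for i in range(counts.size)])
```

a 3-day window is too short to flatten a 7-day oscillation, so the weekly reporting signature survives while single-day spikes are clipped.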
we have also explored epidemiological models applied at regional scale. the left frame in fig. 20 shows a set of counties in the bay area that were the first to issue the stay-at-home order on march 17, 2020. two groups of counties in new mexico are shown in red and blue in the right frame of fig. 20. these regions displayed different disease dynamics, e.g. a shelter-in-place was first issued in the bay area on march 16 and then extended to the entire state on march 19, while the new daily counts were much larger in nw new mexico than in the central region. the daily counts, shown in fig. 21 for these three regions, were aggregated based on county data provided by [2].

references:
- 2019-20 coronavirus pandemic
- covid-19 data in the united states
- covid-19 coronavirus pandemic
- covid-19 data repository by the center for systems science and engineering
- covid-19 pandemic data/united states medical cases
- reopenings stall as us records nearly 50,000 cases of covid-19 in single day
- modelling the occurrence of the novel pandemic covid-19 outbreak; a box and jenkins approach
- the pseudo-marginal approach for efficient monte carlo computations
- model calibration, nowcasting, and operational prediction of the covid-19 pandemic
- a method for obtaining short-term projections and lower bounds on the size of the aids epidemic
- development and application of pandemic projection measures (ppm) for forecasting the covid-19 outbreak
- hawkes process modeling of covid-19 with mobility leading indicators and spatial covariates
- dynamics and development of the covid-19 epidemics in the us: a compartmental model with deep learning enhancement
- worldwide and regional forecasting of coronavirus (covid-19) spread using a deep learning model
- forecasting covid-19 outbreak progression in italian regions: a model based on neural network training from chinese data
- sequential data assimilation of the stochastic seir epidemic model for regional covid-19 dynamics
- an international assessment of the covid-19 pandemic using ensemble data assimilation
- report 9: impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
- ensemble samplers with affine invariance
- an adaptive metropolis algorithm
- markov chain monte carlo in practice: a roundtable discussion
- several new numerical methods for compressible shear-layer simulations
- covasim: an agent-based model of covid-19 dynamics and interventions
- transitional markov chain monte carlo sampler in uqtk
- predictive accuracy of a hierarchical logistic model of cumulative sars-cov-2 case growth
- the incubation period of coronavirus disease 2019 (covid-19) from publicly reported confirmed cases: estimation and application
- realistic distributions of infectious periods in epidemic models: changing patterns of persistence and dynamics
- maximum likelihood estimation of the negative binomial dispersion parameter for highly overdispersed data, with applications to infectious diseases
- estimating the early outbreak cumulative incidence of covid-19 in the united states: three complementary approaches
- bayesian posterior predictive checks for complex models
- learning as we go: an examination of the statistical accuracy of covid-19 daily death count predictions
- forecasting covid-19 impact on hospital bed-days, icu-days, ventilator-days and deaths by us state in the next 4 months
- forecasting the impact of the first wave of the covid-19 pandemic on hospital demand and deaths for the usa and european economic area countries
- bayesian updating and model class selection for hysteretic structural models using stochastic simulation
- initial simulation of sars-cov-2 spread and intervention effects in the continental us
- an arima model to forecast the spread and the final size of covid-2019 epidemic in italy
- how many iterations in the gibbs sampler?
- using high-order methods on adaptively refined block-structured meshes: derivatives, interpolations, and filters
- deriving a model for influenza epidemics from historical data
- modeling covid-19 on a network: super-spreaders, testing and containment
- real-time characterization of partially observed epidemics using surrogate models
- machine learning model estimating number of covid-19 infection cases over coming 24 days in every province of south korea (xgboost and multioutputregressor)
- projections for first-wave covid-19 deaths across the us using social-distancing measures derived from mobile phones
- projection of covid-19 cases and deaths in the us as individual states re-open

publisher's note: springer nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. the authors acknowledge the helpful feedback that john jakeman has provided on various aspects related to speeding up model evaluations. the authors also acknowledge the support that erin acquesta, thomas catanach, kenny chowdhary, bert debusschere, edgar galvan, gianluca geraci, mohammad khalil, and teresa portone provided in scaling up the short-term forecasts to large datasets.

key: cord-352348-2wtyk3r5 authors: sabroe, ian; dockrell, david h.; vogel, stefanie n.; renshaw, stephen a.; whyte, moira k. b.; dower, steven k. title: identifying and hurdling obstacles to translational research date: 2007 journal: nat rev immunol doi: 10.1038/nri1999 sha: doc_id: 352348 cord_uid: 2wtyk3r5 although there is overwhelming pressure from funding agencies and the general public for scientists to bridge basic and translational studies, the fact remains that there are significant hurdles to overcome in order to achieve this goal. the purpose of this opinion article is to examine the nature of these hurdles and to provide food for thought on the main obstacles that impede this process. 
basic biomedical research has achieved much, including the sequencing of the entire human genome, but it is already under serious attack for its failure to deliver effective therapies in many areas. this opinion article will provide a subjective view of our understanding of translational research, identify obstacles to its successful development, and propose a series of initiatives to improve the effectiveness of translational research strategies. originating from latin, translation means to 'carry across'. in biomedical research, the goal of translational science is to develop a thorough operational understanding of the human organism in health and disease, and then to 'carry across' this knowledge to alleviate disease and suffering and to improve the quality of human existence. to be translational in medicine we must acquire knowledge from the broad arena of molecular and cellular biology and then apply this knowledge to human disease. the quality of our scientific output (perceived as a change in disease incidence and/or the development of a therapy) is largely dependent on the quality of the input data and the methods for their processing and interpretation, although the process of generating effective translational science is not as linear (that is, from molecules to models to humans) as is often thought. failure to ask the appropriate questions of optimized systems leads to the acquisition of knowledge that might be less relevant than anticipated. further corruption of the process comes as a result of limitations in our models, which are often not fully appreciated (or are simply ignored) at the time of the study. additionally, grant agencies must be sufficiently flexible to allow researchers to follow up on novel observations, because many of the most exciting developments arise from unexpected findings. 
determining what research is intrinsically translational, or has been translated effectively, is therefore surprisingly difficult, and in a healthy global biomedical research environment, translation will continue to mean very different things to different groups. few scientists will see a process through from the conception of a hypothesis to the development of a specific medical therapy. as scientists, we are nonetheless required to measure our performance in terms of our 'translational potential' , particularly when it comes to justifying and generating funding and publications. therefore, in the absence of being able to measure contributions to health directly, we often quantify individual success using surrogate markers (such as publications and their impact, prestige, and funding), which depend on the prevailing concepts of what constitutes importance. however, these markers may be flawed for this purpose. a more global assessment of the output of translational science might consider whether the scientific community has improved specific disease outcomes, quality of life or life expectancy. specific examples of global success might not be as common as we would wish, but increased successes in the areas of organ transplantation or the ability to eradicate diseases such as measles virus highlight our ability to be successful. the improved treatment of many cancers through the combination of good science and high-quality clinical trials of new therapies or combinations of therapies has been strikingly impressive and provides many examples of highly effective translational science 1,2 . there remains, however, a lack of available mechanisms by which to relate our individual contributions to the global progress of translational science, and many factors conspire to impede our progress. the translation of basic scientific research faces a myriad of hurdles, both obvious and occult. 
these revolve around our understanding of the nature of the translational process, the integration of the outputs of different technological approaches to disease, the use of models, access to tissues and appropriate materials, and the need for support in increasingly complex areas such as ethics and bioinformatics. in addition, owing to technological advances, well-meaning safeguards of personal privacy have been imposed in relation to carrying out research in humans, and these have, in fact, greatly impeded progress in translational studies. problems with the models. entirely appropriate restrictions on what research can be done in humans have contributed to the status quo in which the mouse has become an indispensable model for translational biology; however, it is often not possible to predict biological responses in humans accurately based solely on results obtained from animal models [3-10]. within immunity, exposure of mice to their own species-specific commensals and pathogenic organisms might contribute to a species-specific immunological phenotype 4,9 that affects their translational relevance. for example, studies of the effects of irak4 (interleukin-1 receptor-associated kinase 4) deficiency have revealed increased susceptibility to a greater range of pathogens in mice than is observed in humans who are deficient for irak4. where good potential translational concepts are generated from models, determining how to move to the human can be challenging, as highlighted by difficulties in calculating tolerogenic doses of insulin in the prevention of diabetes 10. 
in addition, technical limitations, or the dominance of prevailing models, can sometimes limit the scope of in vivo work, as illustrated by airway disease, where mouse models focus predominantly on the important t helper 2 (th2) component of asthma but poorly capture the contribution of other mediators. mouse models are also complicated by our limited understanding of the phenotypes of human diseases 14,15. designing perfect models of diseases that we do not fully understand is a tall order, and chronic diseases that involve life-long interactions between the host and the environment present a particularly difficult challenge 5-8. developing truly chronic models of disease is, however, heavily militated against owing to ethical concerns about prolonged suffering. moreover, the typically short-term nature of research funding, where outputs must be deliverable at a reasonable cost and within the time frame of a project grant, hinders the development of chronic models of disease. it is equally clear that work on a single cell line is often poorly predictive of the behaviour of the whole organism. although the use of primary cells from normal tissues can overcome some of these problems, the phenotype of these cells might be altered by their removal from microenvironmental influences in vivo, and they often show markedly different responses to those of primary cells that are obtained from diseased tissue. therefore, rodent models remain essential and cannot be replaced by in vitro approaches at present, but they are an imperfect translational conduit for both biological and practical reasons (such as the difficulty in establishing chronic disease models). other models such as flies and zebrafish (danio rerio) have many advantages for forward genetics and related studies, but differ from humans substantially more than mammalian models do. 
primates are the most compatible mammalian models for immunological research, and have provided invaluable insights into diseases such as hiv, but their broader use is not feasible because of major ethical concerns, as well as other practical and cost-related issues. even these models can have limitations. recently, differential expression of siglecs (sialic-acid-binding immunoglobulin-like lectins), inhibitory receptors thought to downregulate immune-cell activation, has been noted between apes and humans 16, and such differences in immune responses between species require careful consideration, as highlighted by the development of a 'cytokine storm' when individuals received an agonistic cd28-specific antibody that had not shown the same effects in monkeys 17,18. research is being increasingly hindered by bureaucracy, evident at many levels of research management; nowhere is this more of a problem than in access to human samples and tissues and in the establishment of clinical collaborations. liaison with industry also provides additional challenges 10. furthermore, some of the most exciting areas of current research, such as work with embryonic stem cells, are subject to special ethical concerns. even the simplest research involving humans, or archived specimens from humans, is often encumbered by a multistage, intimidating application process that might dissuade individuals from carrying out a proposed study. overcautious interpretation by ethics committees and regulatory authorities of what is ethical results in restrictive practices that can, for example, prevent the re-screening of dna banks for relevant markers and mutations, or impose over-elaborate protection of clinical phenotype data. although it is absolutely clear that robust systems are needed to prevent the exploitation of patients or clinical data, it is often unclear how the ethical standards that are applied to the review of proposed studies are developed. 
apparently arbitrary standards that might be perceived to be more ethical simply because they are more restrictive seem to be ever on the increase. it is also unethical to excessively hinder research, but this imperative sometimes seems to come second to apparently minor scruples over details such as the methods of recruitment. although the introduction of national standards to ensure timely review of proposals has, in the united kingdom, begun to address these issues, increasingly complex review processes are frustrating and sometimes of questionable ethical value. the area of personal privacy is particularly complex. the majority public view is supportive of the use of some personal data in confidential research 19. however, although many individuals are less concerned that well-regulated data usage is an invasion of privacy than some ethics committees seem to be, due consideration is needed of the views of the minority who are more concerned about their privacy. the future potential for apparently confidential genetic research to result in the identification of individuals through the cross-referencing of dna databases is clearly of concern 20,21. in addition, governments tend to respond to highly public episodes of research offences or clinical misconduct with legislation. this runs the risk of creating additional administrative burdens for the majority of individual scientists who practise ethically and well, without ensuring the prevention of future wilful misdeeds by a rare minority. increasingly, clinical practice is becoming more defensive and less willing to engage in research, in response to a society that is becoming more risk-averse, litigious and blame-centred when adverse events occur. this generates further barriers to research in humans and requires a frank reappraisal of what experimentation is acceptable in humans or on samples that are derived from humans. additional problems arise in setting research priorities. 
without effective public input, we run the risk that research priorities become hostage to changing government priorities, which lack the stable, long-term input necessary for meaningful scientific advancement. although charities and private foundations that are centred on specific diseases provide important contributions, their focus is often too narrow to tackle the wider research agenda. this has led to a culture of increasing financial pressures that are governed by short-term policies, which require rapid outputs, and piecemeal funding, stifling the longer-term, more open-ended research that might produce real translational progress 10. for example, political pressures to study agents of bioterrorism or emerging infectious diseases such as sars (severe acute respiratory syndrome) can be welcomed and are clearly important; however, a more long-term commitment to studying infections of historical global importance is also essential. translational biomedicine must be seen by scientists and the wider public in the context of a broad-based body of science that has the potential to contribute, in the near or long term, to the advancement of human health. increased financial and political pressures on scientists to maximize directly measurable patient benefit from research carried out over short time frames run the risk of destroying a healthy science base and should be actively resisted. a continued and robust engagement of the public is mandatory for scientists, because unless the public values the broad sweep of translational biomedical science, funding will inevitably be diverted away from science or applied only to those areas of research that immediately influence patient care. 
as scientists, we must learn to articulate not only the promise of science, but also the difficulties that are associated with moving an idea along to the product stage, so that unrealistic expectations of perfect medications for every disease in short time frames are not raised. we must also wrestle the debate on the usefulness of in vivo models away from the more extreme ends of the anti-vivisection movement so that a constructive, rather than confrontational, discussion can evolve. equally, we must foster a culture of research in the curricula of medical students, and promote the goals of translational research to basic science undergraduate and graduate students. we must seek to marry the skills of basic scientists with those of clinicians to convey the idea that strong translational research underlies improved health and is inseparable from the provision of good standards of clinical care. improving our use of the models? in many biomedical fields, a series of debates have highlighted deficiencies in our current in vivo models. strain differences 22 and variance in experimental conditions between research groups pose major challenges in comparing the results derived from different studies. a collaborative investment in phenotyping human disease by clinicians and basic scientists is required for developing robust models. there is clearly an opportunity to define the characteristics, behaviour and relevance of model systems, particularly with a view to standardizing optimal strains and protocols. for example, multiple models exist for common diseases such as asthma 23 and rheumatoid arthritis 24 , although harmonizing protocols and strains for particular features of a disease must avoid the loss of sufficient diversity in order to allow the finding of the unexpected. 
this challenge could be met by interest groups or individuals working in a particular area, by large national funding agencies and/or international funding initiatives, or alternatively by specialist societies. such debates might facilitate the comparison of data between laboratories and between species, and might highlight the components of specific diseases that are ripe for the development of new in vivo models and protocols (for example, there remains a great need to more effectively model the role of the innate immune system in acute and chronic asthma), broadening the number of disease processes or phenotypes that are modelled in pathology. although much work will continue to focus on the mouse, for some diseases comparative or independent studies in other species will continue to be important. beyond the scope of this article, there are issues ahead with respect to subject selection for early clinical trials and the development of individualized treatment regimes that are based on pharmacogenetic and individual disease phenotypes 25,26.

figure 1 | developing interactive models. a | in vivo experimentation, perhaps in particular the generation and characterization of knockout mouse strains in experimental models of disease, is often viewed as the gatekeeper between in vitro science and the generation of drug targets that are appropriate for human disease. although central to an effective understanding of immune biology and the role of new candidate drug targets, the predictive value of in vivo experimentation is less than desired, particularly in the context of studies of single knockouts in specific disease models and mouse strains. b | an example of a more holistic network in which multiple lines of evidence allow the refinement of objectives and target relevance in order to increase the chance of successful drug discovery. 
such approaches reflect the approach of many researchers, but (acknowledging that no branch of science is 'easy') the main difficulties associated with undertaking human-based research run the risk of degrading the quality of data that arise from an integrated scientific approach. the tendency to view translational research as a linear process in which mice are the gateway between basic science and humans does rodent models a great disservice. in particular, the classical route of identifying genes in vitro, followed by generating knockout animals in vivo, has, in general, been poorly predictive of the consequences of targeting specific molecules in humans. advances in medical sciences that have emerged in this manner are vastly outnumbered by those that arise by serendipity or through less predictable routes. a more holistic integration of in vivo models with in vitro science and studies in the human is needed when summarizing the translational relevance of a system, in which in vivo models contribute strongly to an iterative strategy, but are not themselves the final arbiters (fig. 1) . the potential to identify medicines for use in human disease by screening less complex biological systems has long been recognized in the pharmaceutical industry, and there is now considerable interest in the use of model organisms such as caenorhabditis elegans and zebrafish in high-throughput screens for new drugs 27, 28 . it is becoming increasingly evident that these systems can be used to identify therapeutic targets in the immune system 29 . in this way, understanding the biology of pathways and gene products is deferred, and the ability of a compound to intervene in complex biological processes is directly tested, as highlighted in recent studies of calcium-channel antagonists in c. elegans 27 . combining the use of mice and humans in effective strategies. 
it seems self-evident that research which aspires to influence pathogenesis in humans needs to be carried out on the human system. at the same time, we must also anticipate the potential for doing harm that might accompany any new approach to treating disease, which militates against a rapid progression to phase i human trials. nonetheless, the extraordinary difficulties that are associated with carrying out human studies in parallel with animal models have facilitated a culture in which such studies rarely occur, and in which the prioritization of research is not driven by the inclusion of such processes. indeed, almost quixotically, work in human tissues and cell lines is sometimes not deemed to be of importance until verified in a mouse with a targeted mutation. bridging this divide requires the scientific community to value more highly the studies that seek to bring mouse and human studies together, and to appreciate that in human studies a lack of phenotype, or a subtle modification of processes, might be as important as models that generate dramatic phenotypes. from simple co-culture models of normal human tissues, and ultimately to the generation of whole organs or representations of whole organs 30 in the laboratory, we are now beginning to produce in vitro human systems that can complement essential work that is currently only achievable in vivo. increasing success with new gene-delivery systems, combined with new technologies such as gene knockdown by rna interference, might allow us to overcome the inability to study humans with targeted gene deficiencies. such models are in their infancy, and cell-culture-based approaches are, in many scientists' views, farther from human biology than techniques that investigate biology in mice. 
it would seem that both are required, and therefore, a major thrust to develop standardized co-culture models that are based on primary human cells from healthy and diseased subjects would provide a complementary scientific base to our strong expertise in the mouse. our commitment to this development is essential, not only to boost the translation of science to the human, but also to ensure that we honour the principles of reduction, refinement and replacement (minimizing the number of animals that are used and finding alternatives wherever feasible) that are central to all animal experimentation. we are faced with many difficulties in generating appropriate human tissue models, including defining differences in similar cell types that are pathologically relevant (for example, comparing the biology of endothelial cells isolated from the umbilical cord with those isolated from different microvascular beds). we also need to determine the optimal representative culture conditions for primary cells. for example, when should epithelial cells be studied at an air-liquid interface, or when should leukocyte-endothelial interactions be studied under flow? importantly, diseased tissues are often modified by inflammation, genetic traits and epigenetic processes that require further consideration when translating from the biology of health to the biology of disease. nevertheless, we have recently shown that simple co-cultures of primary human cells can in some circumstances replicate inflammatory systems that are observed in vivo in mice 31, 32 ; such co-culture models are becoming increasingly common. the future ability to study systems that resemble organs or complete tissues, and verifying the work of simple co-culture experiments in such systems, offers a future we should not only embrace, but also be actively working towards at every opportunity. it is routine in clinical practice to combine drugs with similar or differing modes of action when treating disease. 
there is also increasing appreciation of the roles of cellular and molecular networks in the aetiology of disease 33 , a fact that is implicitly recognized by the need for in vivo models in which complex interactions between tissues and cells can be studied without a need to identify the full panoply of the systems involved (box 1). in the context of an inflammatory disease, we have recently proposed that these networks are best described by the terminology of 'contiguous immunity', whereby, in disease, temporally and spatially contiguous networks comprising multiple processes of innate and adaptive immunity are in continual dialogue and evolution 33 . it is curious, then, that so many studies aim to land a single 'killer blow' on pathological processes, rather than considering the impact of therapeutic combinations. our increasingly deep understanding of the complexity of the inflammatory process often results in the targeting of downstream components for which there might be redundancy or very specific functionality. the targeting of such systems in combination, although complicated to achieve and less satisfying with respect to the identification of single clear targets, is nonetheless likely to have relevance for successful translation. allowing access to tissues and overcoming ethical anxieties. the diversion of research funding into bureaucracy and the delays in productivity that result from meeting statutory requirements for ethical practice contribute to the erosion of research capacity. research cannot do without governance, but it is disappointing that increasing regulation has not been met by increased support for ethical human-based research, and although many processes conspire to make research harder, few exist to make it easier. 
a simplification of the regulations and national-level funding that provides templates for ethical applications, together with training and support for investigators, would do much to make this easier. because many projects involve international collaborations, simplified standard international regulations would also greatly facilitate progress. many research projects have conceptually similar goals (such as the tackling of inflammation in a tissue), and generic pre-approved protocols requiring modest local adaptation would obviate many problems. in the united kingdom, the proposed establishment of a single health research board to coordinate government-funded health research provides an opportunity to develop programmes that examine the effects of government regulation on research and mechanisms for overcoming the issues discussed earlier. research on human tissues could be enhanced by improved access to normal and diseased specimens. the establishment of central and devolved services whose function would be to source human tissues in an ethical manner, characterize the donors and tissues anonymously, and make these resources freely available to investigators in research-active countries would transform biological research (indeed, some progress is being made here, but there is still much to be done). an emphasis on dialogue with scientists and flexibility with respect to supporting science would be required to underpin access to these resources by researchers. where materials are scarce, such systems could be supported by expertise in cell-line immortalization. the development of the lung tissue research consortium by the national heart, lung and blood institute is an exemplar of a clinical data and tissue bank that offers enormous potential for research in lung disease; in addition, the development of the uk biobank shows it is feasible for human resources to be sourced ethically in ways that allow relatively broad use in medical research according to need. 
the proposed tissue banks would promote a positive view of human science working in parallel with mouse biology, which also requires commitment and support from the public to make it work. 'omics'. one area in which large-scale biology is excelling is the use of public-access databases (such as those pertaining to gene expression (genomics) or the production of metabolites (metabolomics)) 34 , but we are missing opportunities to further expand these databases. local and national guidelines define the optimal management of many diseases, but it can be argued that the providers of health care are not given appropriate opportunities to engage with the scientists whose work is developing the future treatments for these diseases. scientific initiatives that are associated with national care plans could drive disease phenotyping and tissue collection for tissue, dna, rna, protein and metabolite databanks with good public access.

box 1 | the difficulty in designing in vitro and in vivo experiments to model human disease has inevitably required and generated simplifications of pathology, often leading to relatively linear models of disease (for example, tissue damage, followed by antigen presentation, the generation of immunological memory and autoantibodies, and a resulting autoimmune disease). in reality, pathology is generated by networks that can exhibit substantial plasticity over the course of a disease. these principles are illustrated in the figure: a disease process (pathology) is represented at the centre of a series of simple conceptual networks, components of which are left intentionally blank to avoid attempting to define specific diseases. each component (or node) within the network might have different roles at different times in a disease, or, if active at more than one tissue site, might even have different roles simultaneously in a disease process. therefore, the depicted connections are plastic over time and in individual microenvironments. in a cellular network, pathology can be considered in the context of networks of cells that are recruited or resident at inflammatory sites, whose communication through cytokines and other molecules regulates inflammation. in a cytokine network, pathology is driven by the interrelated actions of cytokines, again forming a dynamic plastic network. in a process network, process behaviour (for example, angiogenesis, scarring and leukocyte recruitment) will contribute differently to pathology at specific tissue locations and at different times in the disease. as an illustration, wound-healing responses might contribute to the resolution of normal tissue architecture, the development of fibrosis, or the regulation of inflammatory cell recruitment and survival. depending on the nature and duration of the stimulus driving the wound-healing process, and the location in which it is set, multiple resulting pathological phenotypes are feasible. it is the nature of tissues to contain multiple cells and supporting structures that are physically associated, with many more cells that can transit through or become resident. we have proposed that the immunity seen in pathology rarely falls clearly into categories of innate and adaptive immune responses, or t helper 1 (th1) and th2 responses. rather, pathology is generated by a networked interaction that changes over time. the networked relationships of immune pathology might be better described as 'contiguous immunity', in which multiple processes or networks can be operational and in dialogue in the same space (physically contiguous); equally, processes might be linked together in evolving patterns (temporally contiguous). understanding these networks, and, where necessary, developing new models to elucidate and target them, is essential to effective translational biology. ifn, interferon; il, interleukin. 
good clinical practice should focus not only on ensuring that existing best practice is reliably reproduced, but also place equal weight on research and the development of practice through translational research. it is not just the priority of scientists to engage with clinicians: the imperative is equally strong that clinicians should engage more with basic scientists. in both cases it is important that such engagement is facilitated and supported by national policy. another major challenge is to take the large amount of descriptive data generated by an 'omics' approach and both interpret it and reapply it to a translational problem in a hypothesis-driven manner. as we learn to integrate complex data sets with in vitro cell biology and in vivo models, we might begin to generate virtual phenotypes, allowing conclusions to be drawn on the basis of the relatedness of a series of data sets to the questions asked. in essence, all researchers do this when they read published data, but the process is inherently subjective, and formalizing such integrated biology could make it more objective and so better inform future experiments. we are now experiencing an unprecedented blossoming of available technologies, which makes biomedical science extraordinarily exciting. with this comes an ever greater financial burden and increasingly complicated ethical issues. an increasing focus on the need for effective translation from the use of these resources highlights the many obstacles that impede such progress. we have highlighted these difficulties, and have suggested strategies to rejuvenate and maximize our translational potential.

references:
- tyrosine kinases as targets for cancer therapy
- adjuvant chemotherapy for breast cancer - 30 years later
- pro: mice are a good model of human airway disease
- modelling the human immune response: can mice be trusted?
- modeling allergic asthma in mice: pitfalls and opportunities
- con: mice are not a good model of human airway disease
- the mouse trap
- satisfaction (not) guaranteed: re-evaluating the use of animal models of type 1 diabetes
- modelling infectious disease - time to think outside the box?
- lost in translation: barriers to implementing clinical immunotherapeutics for autoimmunity
- distinct mutations in irak-4 confer hyporesponsiveness to lipopolysaccharide and interleukin-1 in a patient with recurrent bacterial infections
- pyogenic bacterial infections in humans with irak-4 deficiency
- severe impairment of interleukin-1 and toll-like receptor signalling in mice lacking irak-4
- phenotypes in asthma: useful guides for therapy, distinct biological processes, or both?
- evidence that severe asthma can be divided pathologically into two inflammatory subtypes with distinct physiologic and clinical characteristics
- loss of siglec expression on t lymphocytes during human evolution
- differences in immune cell 'brakes' may explain chimp-human split on aids
- cytokine storm in a phase 1 trial of the anti-cd28 monoclonal antibody tgn1412
- national survey of british public's views on use of identifiable medical data by the national cancer registry
- confidentiality in genome research
- no longer de-identified
- mouse model of airway remodeling: strain differences
- murine models of asthma
- rodent models of rheumatoid arthritis
- concordance among gene-expression-based predictors for breast cancer
- a genomic strategy to refine prognosis in early-stage non-small-cell lung cancer
- a small-molecule screen in c. elegans yields a new calcium channel antagonist
- in vivo drug discovery in the zebrafish
- a transgenic zebrafish model of neutrophilic inflammation
- capturing complex 3d tissue physiology in vitro
- cooperative molecular and cellular networks regulate toll-like receptor-dependent inflammatory responses
- agonists of toll-like receptors 2 and 4 activate airway smooth muscle via mononuclear leukocytes
- pulmonary perspective: targeting the networks that underpin contiguous immunity in asthma and copd
- a bioinformatician's view of the metabolome

we thank the many scientists whose stimulating conversations have in some measure been represented in these pages. the authors declare no competing financial interests.
ian sabroe's homepage: http://www.shef.ac.uk/medicine/staff/sabroe.html
lung tissue research consortium: http://www.ltrcpublic.com
uk biobank: http://www.ukbiobank.ac.uk

key: cord-330978-f3uednt5 authors: wang, yi; cao, jinde; sun, gui-quan; li, jing title: effect of time delay on pattern dynamics in a spatial epidemic model date: 2014-10-15 journal: physica a doi: 10.1016/j.physa.2014.06.038 sha: doc_id: 330978 cord_uid: f3uednt5

time delay, accounting for a constant incubation period or sojourn times in an infective state, widely exists in most biological systems, such as epidemiological models. however, the effect of time delay on spatial epidemic models is not well understood. in this paper, the spatial pattern of an epidemic model with both a nonlinear incidence rate and time delay is investigated. in particular, we mainly focus on the effect of time delay on the formation of spatial patterns. through mathematical analysis, we obtain the conditions for hopf bifurcation and turing bifurcation, and find the exact turing space in parameter space. furthermore, numerical results show that time delay has a significant effect on pattern formation. 
the simulation results may enrich the findings on patterns and may well capture some key features of epidemic models. the major public health threats posed by infectious diseases have drawn increasing attention in recent years, such as the severe acute respiratory syndrome (sars) outbreak in 2003 [1, 2] and the avian influenza a (h7n9) outbreak in china in 2013 [3, 4]. a great deal of effort has been made toward exploring more realistic mathematical models for the transmission dynamics of infectious diseases. the aim of these models is both to explain the observed epidemiological patterns and to predict the consequences of the introduction of public health interventions to control the spread of diseases (see, e.g., diekmann and heesterbeek [5] for an overview). in the past several decades, various epidemic models have been formulated and studied to reveal mechanisms of disease transmission (see, for example, [6, 7] and the references therein). in the modeling of communicable diseases, the incidence rate, i.e. the rate of new infections, often plays a key role in guaranteeing that the model does indeed give a reasonable qualitative description of the disease dynamics [8, 9]. great interest has been devoted to investigating the nonlinear dynamics, including hopf bifurcation, saddle-node bifurcation, bogdanov-takens bifurcation, the existence of periodic and homoclinic orbits, and so on. for example, yorke and london [10] showed that the incidence rate β(1 − ci)is, with positive c and time-dependent β(t), was in excellent agreement with the simulation results for measles outbreaks. capasso and serio [11] adopted a saturated incidence rate of the form βis/(1 + αi), α > 0, to represent the ''crowded effect'' or ''protection measure'' in modeling the cholera epidemic in bari in 1973. 
to incorporate the effect of behavioral changes for certain communicable diseases, liu and coworkers [12, 13] extended that idea and proposed a nonlinear saturated incidence function βsi^l/(1 + αi^h), where βsi^l represents the infection force of the disease and 1/(1 + αi^h) measures the inhibition effect from the behavioral change of the susceptible individuals when the number of infectious individuals increases, with β, l, α, h > 0. there are numerous results about nonlinear incidence rates in the literature, and we refer the reader to ref. [8] for a general review. apart from the incidence rate, many other factors, such as stochasticity [14], seasonality [15, 16], the distribution of latent and infectious periods [17, 18] and spatial structure [19], can influence epidemic dynamics. in particular, the study of spatial epidemiology has become an exciting and important area of research because space may strongly influence many important epidemiological phenomena, owing to the localized nature of transmission or other forms of interaction [20] [21] [22] [23]. as a result, mathematical models with both time and space are more realistic in describing the process of epidemic spreading. to our knowledge, both discrete patchy models and continuous reaction-diffusion systems are used to study spatial heterogeneity [24]. patchy models are often used to describe directed movement among patches, while reaction-diffusion systems are suitable for random spatial dispersal. for example, epidemic wavefronts can be found in reaction-diffusion models, which correspond to real-world observations such as the spread of the black death in europe from 1347 to 1350 [25, 19]. moreover, time delay is ubiquitous in most biological systems, such as predator-prey models and epidemiological models. we note that in disease transmission models, time delay is an important quantity accounting for many epidemiological mechanisms. 
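the incidence functions reviewed above are simple to state in code. below is a minimal python sketch (function names are ours, not from the cited papers; the numerical comparison at the end is illustrative) of the bilinear rate, the capasso-serio saturated rate, and the general nonlinear form of liu et al.:

```python
# Sketch of the incidence-rate functions discussed above.
# Names and parameter values are illustrative, not from the cited papers.

def bilinear(s, i, beta):
    """Bilinear (mass-action) incidence beta*S*I."""
    return beta * s * i

def saturated(s, i, beta, alpha):
    """Capasso-Serio saturated incidence beta*I*S/(1 + alpha*I)."""
    return beta * i * s / (1.0 + alpha * i)

def liu(s, i, beta, alpha, l, h):
    """General nonlinear incidence beta*S*I**l / (1 + alpha*I**h)."""
    return beta * s * i**l / (1.0 + alpha * i**h)

# For small I the saturated rate is close to the bilinear one, while for
# large I the factor 1/(1 + alpha*I) caps the infection force.
```

setting alpha = 0 and l = 1 in the liu form recovers the bilinear rate, which is a convenient sanity check on any implementation.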
in particular, time delays can be introduced to model a constant incubation period or sojourn times in an infective state. for example, using an sir model with a maturation delay and vertical disease transmission, busenberg et al. [26] obtained some periodic solutions. an seirs epidemic model with exponential demographic structure, disease-related deaths and two delays, corresponding to the latent and immune constant periods respectively, was first analyzed by cooke and van den driessche [27]. they identified a delay-dependent threshold parameter θ determining the local asymptotic stability of the equilibrium states. for a brief review of delay differential equations arising from disease modeling, we refer the reader to van den driessche [28]. in our previous papers [29] [30] [31] [32] [33] [34] [35], we considered the mechanisms of pattern formation in epidemic models induced by spatial (cross-)diffusion or noise. to the best of our knowledge, little attention [36] has been paid to the joint effect of delay and diffusion on pattern dynamics, and the mechanism of delay-induced turing instability is not well understood. our main focus, however, is on how the incorporation of a constant time lag (an incubation period) into the epidemic model alters the aforementioned qualitative results. the paper is organized as follows. in section 2, we introduce a spatial epidemic model with neumann boundary conditions and nonzero initial conditions. in section 3, we summarize the dynamics of the spatial epidemic model without time delay. then, we derive the conditions for turing instability in section 4 through the method of linear stability analysis. in section 5, we illustrate the effect of time delay on the emergence of turing patterns by performing extensive numerical simulations. finally, conclusions and discussion are presented in section 6. 
in this paper, we use a simple s-i epidemic model with nonlinear incidence rate to investigate the effect of time delay on the spatial epidemic model. let s and i be the numbers of susceptible and infected (infectious) individuals at time t. to incorporate saturation or multiple exposures before infection, liu et al. [12, 13] proposed a nonlinear incidence rate βs^p i^q with p > 0, q > 0. this form of nonlinear incidence rate, without a periodic forcing, can produce a much wider range of interesting dynamical phenomena in comparison to the bilinear incidence rate βsi. owing to its simple form, it cannot involve many of the complex biological factors; however, it often sheds insightful light to help us understand some complex processes [30, 34, 35]. in the present paper, let p = 1 and q = 2. then we have the following spatial epidemic model with nonlinear incidence rate

∂s/∂t = a − ds − βsi² + d1 ∇²s,
∂i/∂t = βsi² − (d + µ)i + d2 ∇²i,    (1)

where a is the recruitment rate of the susceptible, d is the natural death rate of the population, and µ is the disease-related death rate of the infected. here, x = (x, y) represents the space, and ∇² = ∂²/∂x² + ∂²/∂y² is the usual laplacian operator in two-dimensional space. the diffusion coefficients of the susceptible and infected individuals are denoted by d1 and d2, respectively. from the biological point of view, we suppose that all the parameters are positive throughout the paper. more details about the model can be found in our previous paper [35]. in general, we are interested in the self-organization of patterns and choose nonzero initial conditions and neumann (zero-flux) boundary conditions

s(x, y, 0) > 0, i(x, y, 0) > 0, (x, y) ∈ ω,
∂s/∂n = ∂i/∂n = 0 on ∂ω,    (2)

where l denotes the size of the system in the x and y directions, and n is the outward unit normal vector of the boundary ∂ω. neumann boundary conditions imply that the boundary of the model domain is simply reflective, and that the domain is isolated or insulated from the external environment [37]. 
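the positive equilibria of the non-spatial part of the model solve a − ds − βsi² = 0 and βsi² − (d + µ)i = 0, which reduces to a quadratic in i. a minimal python sketch (helper names are ours; the test parameters a = d = 1, β = 35, µ = 1 are illustrative choices, since the paper only fixes a = d = 1):

```python
import numpy as np

def endemic_equilibria(a, d, beta, mu):
    """Positive equilibria (S*, I*) of the non-spatial model: I* solves
    beta*(d + mu)*I**2 - a*beta*I + d*(d + mu) = 0, and S* follows from
    beta*S*I = d + mu."""
    roots = np.roots([beta * (d + mu), -a * beta, d * (d + mu)])
    eqs = []
    for I in roots:
        if np.isreal(I) and I.real > 0:
            I = float(I.real)
            eqs.append(((d + mu) / (beta * I), I))
    return eqs

def is_stable(S, I, d, beta, mu):
    """Local asymptotic stability via the Jacobian of the reaction terms
    f = a - d*S - beta*S*I**2 and g = beta*S*I**2 - (d + mu)*I."""
    J = np.array([[-d - beta * I**2, -2.0 * beta * S * I],
                  [beta * I**2, 2.0 * beta * S * I - (d + mu)]])
    return bool(np.all(np.linalg.eigvals(J).real < 0))
```

when the discriminant a²β − 4d(d + µ)² is positive, two coexistence equilibria appear; numerically checking the jacobian eigenvalues recovers the paper's statement that one is a saddle and the other a stable node.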
if the incubation period is assumed to be a constant τ > 0, we obtain a delayed spatial epidemic model with nonlinear incidence rate as follows,

∂s/∂t = a − ds − βsi² + d1 ∇²s,
∂i/∂t = βs(x, t) i²(x, t − τ) − (d + µ)i + d2 ∇²i,    (3)

with the initial conditions

s(x, y, t) > 0, i(x, y, t) > 0 for (x, y) ∈ ω and t ∈ [−τ, 0].    (4)

to give insight into the effect of time delay or diffusion on system (3), it is of significance to investigate the local dynamics of its corresponding non-delay, non-diffusion model. now, we consider the case of spatially homogeneous states, i.e. the non-diffusion model. from a biological point of view, we are interested in the non-negative steady states s ≥ 0, i ≥ 0. we summarize the local dynamics of the model around the equilibrium states; see our previous paper [35] for more details. from ref. [35], we know that the system has three equilibrium states: (i) the disease-free equilibrium (a/d, 0), which corresponds to extinction of the disease; (ii) a coexistence equilibrium of the s and i populations, which is unstable (a saddle, by direct calculation); (iii) a coexistence equilibrium of the s and i populations, which is a stable node. hereafter, we denote by e* = (s*, i*) this locally stable equilibrium. to ensure the positivity of s* and i*, one has the condition a²β > 4d(d + µ)². in the sequel, we only focus on the locally asymptotically stable equilibrium e*. for the sake of establishing our main results in section 4, we give a brief review of diffusion-induced turing instability based on [35] for system (1). linearizing system (1) around the spatially homogeneous equilibrium point (s*, i*) for small space- and time-dependent fluctuations and expanding them in fourier space, we have the characteristic equation

det(λe − j + k² diag(d1, d2)) = 0,    (5)

where e is the identity matrix and the jacobian matrix j at e* is given by

j = [f_s, f_i; g_s, g_i] = [−d − βi*², −2βs*i*; βi*², 2βs*i* − (d + µ)].

here, r = x (or y) or r = (x, y) corresponds to the one- or two-dimensional space, δs* and δi* are the spatiotemporal perturbation amplitudes, and k is the wave number. solving eq. 
(5) yields the characteristic polynomial of the original problem (1),

λ² − tr_k λ + δ_k = 0,    (6)

where tr_k = f_s + g_i − k²(d1 + d2) and δ_k = d1 d2 k⁴ − (d2 f_s + d1 g_i)k² + (f_s g_i − f_i g_s). the roots of (6) yield the dispersion relation λ(k). at the bifurcation point, two equilibrium states of the model intersect and exchange their stability. biologically speaking, this bifurcation corresponds to a smooth transition between equilibrium states. the hopf bifurcation is space independent and breaks the temporal symmetry of a system, giving rise to spatially uniform, time-periodic oscillations, while the turing bifurcation breaks spatial symmetry, leading to the formation of patterns that are stationary in time and oscillatory in space [29, 32]. hopf instability occurs if a pair of complex-conjugate eigenvalues crosses the imaginary axis from the negative to the positive half-plane while the diffusion is absent [29, 32]. from a mathematical point of view, the hopf bifurcation appears when re(λ(k)) = 0 and im(λ(k)) ≠ 0 at k = 0. a positive equilibrium state is said to undergo turing instability if it is stable for the corresponding non-spatial model of (1), but becomes unstable with respect to spatially inhomogeneous perturbations due to diffusion. a general linear analysis [38, 39] shows that the necessary conditions for yielding turing patterns for model (1) are

f_s + g_i < 0,
f_s g_i − f_i g_s > 0,
d2 f_s + d1 g_i > 0,
(d2 f_s + d1 g_i)² > 4 d1 d2 (f_s g_i − f_i g_s).

the first two conditions guarantee that the equilibrium (s*, i*) is stable for the non-diffusion model of (1), and it becomes unstable for model (1) if re(λ_1,2(j_k)) transits from a negative value to a positive one (corresponding to the last two conditions). mathematically speaking, the turing bifurcation occurs when re(λ(k_t)) = 0 and im(λ(k_t)) = 0 at some k_t ≠ 0, and the wavenumber k_t satisfies k_t² = sqrt((f_s g_i − f_i g_s)/(d1 d2)). in this subsection, we concentrate on the stability of system (3). obviously, system (3) has the same equilibria as the corresponding non-diffusion model of system (1). following the approach in refs. [40, 41] and assuming τ to be small, we replace i(x, t − τ) by i(x, t) − τ ∂i(x, t)/∂t. expanding in a taylor series and neglecting the higher-order nonlinearities, eq. (10) becomes a system without explicit delay, where h(s, i) := βsi², and h_i(s*, i*) = ∂h/∂i|(s*, i*). 
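the turing conditions above are straightforward to check numerically by scanning the dispersion relation, i.e. the largest real part of the eigenvalues of j − k² diag(d1, d2) over a grid of wavenumbers. a sketch in python; the value µ = 1.7 used in the test is our illustrative choice (together with the paper's a = d = 1, d1 = 6, d2 = 1), not a value reported in the paper:

```python
import numpy as np

def jacobian(S, I, d, beta, mu):
    """Jacobian of the reaction terms of system (1) at an equilibrium."""
    return np.array([[-d - beta * I**2, -2.0 * beta * S * I],
                     [beta * I**2, 2.0 * beta * S * I - (d + mu)]])

def max_growth_rate(J, d1, d2, ks):
    """Largest Re(lambda) of J - k^2*diag(d1, d2) over sampled wavenumbers."""
    D = np.diag([d1, d2])
    return max(np.linalg.eigvals(J - k**2 * D).real.max() for k in ks)

def turing_unstable(J, d1, d2, ks):
    """Turing instability: stable without diffusion (k = 0), but some
    k > 0 has a positive growth rate."""
    stable_at_zero = np.linalg.eigvals(J).real.max() < 0
    return bool(stable_at_zero and max_growth_rate(J, d1, d2, ks) > 0)
```

with equal diffusion coefficients the eigenvalues are only shifted to the left by k², so no turing instability can occur; unequal diffusion (here d1 > d2) is what opens the instability window.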
noting that h_i(s*, i*) = g_i(s*, i*) + (d + µ), we obtain the modified reaction-diffusion system. the homogeneous steady state e* (i.e. the fixed point) of the dynamical system satisfies f(s*, i*) = 0 and g(s*, i*) = 0. considering small spatiotemporal perturbations δs(x, y, t) and δi(x, y, t) on the homogeneous steady state (s*, i*), we have s(x, y, t) = s* + δs(x, y, t), i(x, y, t) = i* + δi(x, y, t). by expanding the reaction terms around the steady state e* in a taylor series up to first order and rearranging the terms, we finally obtain the linearized system (14). assume that the spatiotemporal perturbations δs(x, y, t) and δi(x, y, t) take the following form

δs(x, y, t) = δs* exp(λt) cos(k_x x) cos(k_y y),
δi(x, y, t) = δi* exp(λt) cos(k_x x) cos(k_y y),    (15)

where λ is the growth rate of the perturbation in time t, and k_x and k_y are the wavenumbers of the solutions. upon inserting (15) into eq. (14), we obtain a matrix equation (16) for the eigenvalues, where k² = k_x² + k_y². from eq. (16) we get the characteristic equation for the eigenvalues of the associated stability matrix,

λ² − c_k² λ + d_k² = 0,    (17)

where c_k² and d_k² denote the trace and determinant of the stability matrix, respectively. our aim here is to establish the threshold or critical value of the delay time for which the system with time delay, which is otherwise stable with respect to homogeneous perturbations, becomes unstable. now we are in a position to investigate the effects of time delay and diffusion on the dynamical system (3), and under what conditions time delay destabilizes the steady state of the system and brings about spatiotemporal instability. we know that the onset of instability requires that at least one of c_k² < 0 and d_k² > 0 is violated. note that g_i = d + µ and τ ≥ 0, so we always have 1 + τ(g_i + d + µ) ≥ 1 > 0. hence, we consider the potential growth of instability in the following two cases: (i) c_k² < 0 is violated; (ii) d_k² > 0 is violated. the conditions for stability of the homogeneous steady state for system (3) with delay are c_k²|k=0 < 0 and d_k²|k=0 > 0. 
these conditions determine the admissible range of values for τ, and two cases arise: (1) if f_s g_i − f_i g_s + (d + µ)f_s = 2(d + µ)(f_s + g_s) ≤ 0, then c_k²|k=0 < 0 is always satisfied, which implies that the homogeneous steady state e* is locally asymptotically stable independent of the delay; (2) if f_s g_i − f_i g_s + (d + µ)f_s = 2(d + µ)(f_s + g_s) > 0, we obtain a finite range of values of τ ensuring stability. in this subsection, we consider the second case, i.e. d_k² > 0 is violated. in other words, the condition of instability is determined by the sign of d_k². since the numerator of d_k² determines the turing critical line, which is found to be independent of τ, the emergence of the instability in this situation is induced only by the diffusion. direct calculation gives the critical value of the bifurcation parameter β on the turing critical line. in this subsection, we consider the first case, in which c_k² < 0 is violated. delay-diffusion-induced instability means that a positive equilibrium is uniformly asymptotically stable in the reaction-diffusion model without a delay effect (e.g., system (1)), while it becomes unstable in the reaction-diffusion model with a delay (e.g., system (3)). in this case, the condition of instability is c_k² > 0. moreover, since the denominator of c_k² is always positive, the condition of instability reduces to the sign of the numerator, which implies a lower bound on τ. there may be a large variety of distinct patterns that can be observed by varying the parameters slightly. to see the bifurcation condition clearly, we let a = d = 1, d1 = 6, and d2 = 1. in fig. 1, we show the turing space in the β-µ plane for model (3). γ1 is the positive-equilibrium existence line (the black line), γ2 is the hopf bifurcation line corresponding to τ = 0 (the green one), and γ3 is the turing bifurcation line (the red one). the turing space, marked t, is bounded by γ2 and γ3. 
for parameters in this domain, above γ2, the positive equilibrium e* of the corresponding non-delay, non-diffusive model is stable; below γ3, the corresponding solution of system (1) is unstable. in other words, turing instability occurs, and therefore turing patterns emerge. the introduction of delay may give us another useful handle for further manipulation of the instability region between the hopf and turing curves. the hopf bifurcation curves (the blue lines) corresponding to τ = 0.01 and 0.05 are also shown in fig. 1. therefore, for a fixed parameter set and delay τ, a transmission rate β beyond a critical threshold β_c implies instability of the homogeneous steady state of the system even if β is below the hopf bifurcation line corresponding to τ = 0 (the green one). we proceed to explore this in the next section. to compare with the analytical predictions from the aforesaid analysis, we perform extensive numerical simulations of the system under the influence of time delay, using eq. (3), by the explicit euler method with the parameters defined above. the continuous problem defined by the reaction-diffusion system in two-dimensional space is solved in a discrete domain with n_x × n_y lattice sites. all our numerical simulations employ the neumann boundary conditions, with a time step size of Δt = 0.001 and a space step size of Δh = 1.0. in the present paper, we set n_x = n_y = 200, and approximate the laplacian describing diffusion by differences over Δh [42]. the simulations are initiated with spatially random perturbations of ~0.01% around the steady state e*. in this paper, we want to know the effect of delay on the distribution of infected individuals. as a result, we only show results of pattern formation for the distribution of i. we run the simulations until they reach a stationary state or exhibit a behavior that does not seem to change its characteristics anymore. 
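the numerical scheme just described (explicit euler, finite-difference laplacian, neumann boundaries, small random perturbation of e*) can be sketched as follows. we use a smaller lattice than the paper's 200 × 200 for brevity, and we implement the incubation delay directly through a history buffer for i(t − τ) rather than the paper's first-order expansion in τ; where exactly the delayed term enters is our reading of the delayed incidence, so treat this as an assumption:

```python
import numpy as np

def laplacian(Z, dh):
    """Five-point Laplacian with Neumann (zero-flux) boundaries,
    implemented by edge-replicating padding."""
    Zp = np.pad(Z, 1, mode="edge")
    return (Zp[:-2, 1:-1] + Zp[2:, 1:-1] + Zp[1:-1, :-2] + Zp[1:-1, 2:]
            - 4.0 * Z) / dh**2

def simulate(a, d, beta, mu, d1, d2, tau, S0, I0,
             n=64, dt=1e-3, dh=1.0, steps=1000, seed=0):
    """Explicit-Euler integration of the delayed S-I reaction-diffusion
    model, with the delayed incidence beta*S(t)*I(t - tau)**2 in the
    infected equation (our direct implementation of the delay)."""
    rng = np.random.default_rng(seed)
    # ~0.01% random perturbation of the homogeneous steady state
    S = S0 * (1.0 + 1e-4 * rng.standard_normal((n, n)))
    I = I0 * (1.0 + 1e-4 * rng.standard_normal((n, n)))
    lag = max(1, int(round(tau / dt)))
    hist = [I.copy() for _ in range(lag)]   # constant initial history
    for _ in range(steps):
        lapS, lapI = laplacian(S, dh), laplacian(I, dh)
        I_del = hist.pop(0)                 # I(t - tau)
        inc = beta * S * I_del**2           # delayed nonlinear incidence
        S, I = (S + dt * (a - d * S - beta * S * I**2 + d1 * lapS),
                I + dt * (inc - (d + mu) * I + d2 * lapI))
        hist.append(I.copy())
    return S, I
```

for the diffusion part, explicit euler is stable here because Δt = 0.001 is well below the usual bound Δh²/(4 max(d1, d2)) ≈ 0.042 for d1 = 6 and Δh = 1.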
in the following part, we choose suitable parameter values for the simulations. for the different values of β located in the turing space (the domain t in fig. 1), different categories of turing patterns for the distribution of i can emerge. in each pattern, the blue (red) represents the low (high) density of the infected i. here we are concerned with the effect of time delay on the pattern formation of i. for the non-delay spatial model (1), i.e. τ = 0 in model (3), turing instability occurs when β ∈ (34.14, 44.07) for the above parameter set. figs. 2-3 present the evolution of the spatial pattern of the infected population at 0, 1000, 50,000 and 100,000 iterations, for β = 35 and 42 in the turing space, respectively. in fig. 2, the regular stripe patterns prevail over the whole domain at last, and the dynamics of the system does not undergo any further changes; in fig. 3, the spotted spatial patterns prevail over the whole domain at last. we use these two cases as baselines, and investigate the effect of delay on the pattern formation. the parameter values of fig. 4 are the same as those in fig. 2, but with time delay τ = 0.05. all of the figures show the evolution of the spatial patterns at 0, 1000, 50,000 and 100,000 iterations, with small random perturbations of the stationary state s* and i* of the spatially homogeneous system. from these figures, we can see that the regular stripe patterns also prevail over the whole domain at last, and the dynamics of the system does not undergo any further changes. however, it should be noted that there is an essential difference between fig. 2 (non-delayed spatial system) and fig. 4 (delayed spatial system), both starting from the same initial conditions. in particular, the large regular stripe patterns in fig. 2 break down into small pieces in fig. 4, and the direction of the stripes in fig. 4 is also changed. the parameter values of fig. 5 are the same as those in fig. 3, but with delay τ = 0.1. different from fig. 
3, in which the spotted patterns eventually prevail over the whole domain, we find that stationary stripe and spot patterns finally coexist in the distribution of the infected population density. moreover, for β in (33.86, 34.14), i.e. outside the non-delayed turing space (34.14, 44.07), we find that stationary stripe and spot patterns emerge mixed in the distribution of the infected population density, and the dynamics of the system does not undergo any further changes. it should be noticed that these patterns are induced only by the delayed spatial system, since no patterns occur for the non-delayed spatial system, i.e. β = 34 ∉ (34.14, 44.07). to conclude, a spatial epidemic model with both a nonlinear incidence rate and time delay is investigated. the numerical results correspond perfectly to our theoretical findings. specifically, there is a range of parameters in the β-µ plane where different spatial patterns can be obtained. on the one hand, for the non-delayed spatial system, different spatial patterns, such as regular stripe and spot patterns, can emerge for a range of parameter values [35]. on the other hand, time delay has a significant impact on the pattern formation. more specifically, there are three aspects. firstly, time delay can change the pattern structure, such as the length and direction of the stripes; see figs. 2 and 4. secondly, time delay can largely postpone the pattern formation of the infected population density i; compare figs. 3 and 5. thirdly, time delay can widen the turing space. in other words, where no pattern occurs for the non-delayed spatial epidemic model, stationary stripe and spot patterns may coexist in the distribution of the infected population density for the delayed one, and the dynamics of the system finally does not undergo any further changes (see fig. 6 and main text for more details).
by the above analysis, we find that the qualitative dynamics of the delayed spatial epidemic model (3) is fundamentally different from that of the non-delayed one (1) when the time delay τ is slightly changed. in the past few years, a great deal of attention has been paid to transitions between different dynamical regimes as a result of perturbation of the system's parameters [43]. for simplicity, we keep the other parameter values fixed and vary only one parameter, such as β or τ in the present paper. in ref. [29], sun et al. studied a spatial s-i model with logistic growth and nonlinear incidence rate βs^p i^q with p + q = 1, and obtained not only a stripe-like pattern but also a spot pattern, or a coexistence of the two. then, in ref. [35], they assumed recruitment of the population at a constant rate and set p = 1 and q = 2. however, little attention has been paid to the time delay accounting for a constant incubation period or sojourn time in the infective state, which may have a significant impact on the pattern formation of the infected population density. hence, in this paper, we explore it by extensive numerical simulations. the methods and results in the present paper may enrich the research on pattern formation in spatial epidemic models and may well explain field observations in some areas. for example, jewell et al. [44] investigated the spatial and temporal dynamics of the foot-and-mouth disease outbreak of 2007 and suggested undetected (occult) infections in the uk. su et al. [45] used a reaction-diffusion system to explore the relationship between malaria fever and parasite replication cycles.
here, we would like to remark that conclusive evidence of spatial patterns with these peculiarities in interacting epidemiological systems is still to be found, while there are an increasing number of indications of patterns in realistic ecosystems, such as vegetation distribution [46], planktonic interaction [47], and a few works on prey-predator type interaction [48]. our spatial epidemic model with time delay may be more realistic in capturing some key features of the complex variation and in explaining the observed spatial structure of most species [49, 50]. however, it should be noted that the method in this paper, being based on taylor series expansions, is particularly suitable for a short time delay τ; when the time delay is much larger, one should investigate the stability matrix [51][52][53] of a reaction-diffusion system and give necessary or sufficient conditions which guarantee that its uniform steady state undergoes a turing bifurcation. usually, one needs to take care of the joint effect of diffusion and time delay [54] to obtain stability conditions, which depend on a transcendental equation associating the characteristic eigenvalue λ with the function e^{λτ}. in addition, the presence of noise [55][56][57][58] or other terms may give rise to a rich variety of dynamical effects [59], including noise-enhanced stability [60], noise-delayed extinction [61, 62] and noise-induced transitions [63]. for example, pattern formation induced by noise in two competing species was analyzed by valenti et al. [55]. nonmonotonic behavior of spatiotemporal pattern formation in noisy population dynamics was found by fiasconaro et al. [56]. a model for epidemic dynamics was analyzed by chichigina et al. [57], using a pulse noise model with memory. the effect of multiplicative noise, always present in population dynamics, in the form of a pulse train with regulated periodicity, on a parametric instability was investigated by chichigina et al. [58].
these and some other related issues are left for further investigation and discussion. isolation and characterization of viruses related to the sars coronavirus from animals in southern china transmission dynamics and control of severe acute respiratory syndrome comparative epidemiology of human infections with avian influenza a h7n9 and h5n1 viruses in china: a population-based study of laboratory-confirmed cases epidemiology of human infections with avian influenza a(h7n9) virus in china infectious diseases of humans the mathematics of infectious diseases recurrent outbreaks of measles, chickenpox and mumps ii a generalization of the kermack-mckendrick deterministic epidemic model influence of nonlinear incidence rates upon the behavior of sirs epidemiological models dynamical behavior of epidemiological models with nonlinear incidence rates the interplay between determinism and stochasticity in childhood diseases a simple model for complex dynamical transitions in epidemics transients and attractors in epidemics endemic models with arbitrarily distributed periods of infection i destabilization of epidemic models with the inclusion of realistic distributions of infectious periods travelling waves and spatial hierarchies in measles epidemics predicting the spatial dynamics of rabies epidemics on heterogeneous landscapes modelling vaccination strategies against foot-and-mouth disease forecast and control of epidemics in a globalized world smallpox transmission and control: spatial dynamics in great britain diffusion and ecological problems: modern perspectives geographic and temporal development of plagues analysis of a model of a vertically transmitted disease analysis of an seirs epidemic model with two delays some epidemiological models with delays pattern formation in a spatial s-i model with non-linear incidence rates influence of infection rate and migration on extinction of disease in spatial epidemics emergence of strange spatial pattern in a spatial 
epidemic model chaos induced by breakup of waves in a spatial epidemic model with nonlinear incidence rate spatial pattern in an epidemic system with cross-diffusion of the susceptible effect of noise on the pattern formation in an epidemic model pattern formation of an epidemic model with diffusion, nonlinear dynam pattern formation of an epidemic model with time delay pattern sensitivity to boundary and initial conditions in reaction-diffusion models turing patterns with pentagonal symmetry mathematical biology ii: spatial models and biomedical applications time-delay-induced instabilities in reaction-diffusion systems control of the hopf-turing transition by time-delayed global feedback in a reaction-diffusion system finite-difference schemes for reaction-diffusion equations modeling predator-prey interactions in matlab chaos in ecology: experimental nonlinear dynamics predicting undetected infections during the 2007 foot-and-mouth disease outbreak periodicity and synchronization in blood-stage malaria infection regular and irregular patterns in semiarid vegetation phase separation explains a new class of selforganized spatial patterns in ecological systems bifurcations and chaos in a predator-prey system with the allee effect wavefronts in time-delayed reaction-diffusion systems. 
theory and comparison to experiment a multi-species epidemic model with spatial dynamics three types of matrix stability turing instabilities in general systems some remarks on matrix stability with application to turing instability interaction of diffusion and delay pattern formation and spatial correlation induced by the noise in two competing species nonmonotonic behaviour of spatiotemporal pattern formation in a noisy lotka-volterra system a simple noise model with memory for biological systems stability in a system subject to noise with regulated periodicity rich dynamics in a predator-prey model with both noise and periodic force noise-enhanced stability of periodically driven metastable states role of the noise on the transient dynamics of an ecosystem of interacting species stochastic resonance and noise delayed extinction in a model of two competing species noise-induced nonequilibrium phase transition

key: cord-320666-cmqj8get authors: walach, h.; hockertz, s. title: what association do political interventions, environmental and health variables have with the number of covid-19 cases and deaths? a linear modeling approach date: 2020-06-22 journal: nan doi: 10.1101/2020.06.18.20135012 sha: doc_id: 320666 cord_uid: cmqj8get background: it is unclear which variables contribute to the variance in covid-19 related deaths and covid-19 cases. method: we modelled the relationship of various predictors (health systems variables, population and population health indicators) together with variables indicating public health measures (school closures, border closures, country lockdown) in 40 european and other countries, using generalized linear models and minimized information criteria to select the best fitting and most parsimonious models. results: we fitted two models with log-linearly linked variables on gamma-distributed outcome variables (cov2 cases and covid-19 related deaths, standardized on population).
population standardized cases were best predicted by number of tests, life expectancy in a country, and border closure (negative predictor, i.e. preventive). population standardized deaths were best predicted by the time the virus had been in the country, life expectancy, smoking (negative predictor, i.e. preventive), and school closures (positive predictor, i.e. accelerating). model fit statistics and model adequacy were good. discussion and interpretation: interestingly, none of the variables that code for the preparedness of the medical system, for health status or for other population parameters were predictive. of the public health variables, only border closure had the potential of preventing cases, and none were predictors for preventing deaths. school closures, likely as a proxy for social distancing, were associated with increased deaths. conclusion: the pandemic seems to run its autonomous course, and only border closure has the potential to prevent cases. none of the measures contributes to preventing deaths. the novel coronavirus sars-cov2 (cov2), which surfaced in china in december 2019 for the first time, created a world-wide pandemic 1, 2 and an associated disease, named covid-19, with respiratory stress, heart problems, kidney failures and immunological problems associated with it [3][4][5][6][7]. countries closed down their borders, their schools, universities and cultural facilities and sometimes even all of their activities. this was due partially to the novelty of the virus and its largely unknown properties, but also because it was soon clear that those infected by the virus could be asymptomatic for up to a week or longer while still being infectious to others, and because high infectivity, virulence and mortality were assumed. the spread of the virus was initially very quick, following a seemingly exponential growth curve, but it abated and the replication numbers went into decline.
currently it is highly debated what contributes to the variance that can be seen both in cov2 cases and in deaths attributed to covid-19. while most people assume that political measures have mitigated the spread of the virus 8, others hold that the process is rather autonomous: the virus recedes after having infected all those in a population susceptible to it, and then the infection abates 9, 10. moreover, most modeling approaches that were used in the early stages of the disease to inform political decision making did not take into account the potential inhomogeneity of a population due to natural or specific immunity of a large part of the population 11, 12. more recent models that take such inhomogeneity parameters into account, informed by novel data, estimated that herd immunity is reached after about 7 to 18% of a population have been infected, because the rest of the population might not be susceptible to the virus 13, 14. since it is largely unclear what variables contribute to the variance in cases and deaths attributable to cov2, we wanted to study this question by building linear models using various predictor variables to study their influence on the outcomes covid-19 cases and deaths in various countries. we collected data on covid-19 cases and deaths as presented by the database of the european center for disease prevention and control on their website by 15th may 2020. we used european and oecd countries, because those data are most relevant to our question and are more validly accessible. we included the following 40 countries. covid-19 cases and deaths were summed for the total period covered by the ecdp database and used as dependent variables (criterion). we standardized cases and deaths on 100.000 inhabitants, taken from the population size in the same database.
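the gap between the classical and the inhomogeneity-adjusted herd immunity thresholds mentioned above can be made concrete with a short sketch. note the assumptions: the classical threshold 1 - 1/r0 is standard, but the heterogeneity correction 1 - (1/r0)^(1/(1+cv²)) for gamma-distributed susceptibility with coefficient of variation cv is taken from the inhomogeneity literature (e.g. gomes et al.), not from this paper, and the parameter values are purely illustrative:

```python
def classical_threshold(r0: float) -> float:
    """herd immunity threshold for a homogeneous population: 1 - 1/r0."""
    return 1.0 - 1.0 / r0

def heterogeneous_threshold(r0: float, cv: float) -> float:
    """illustrative threshold under gamma-distributed susceptibility with
    coefficient of variation cv: 1 - (1/r0)**(1/(1 + cv**2)).
    this formula is an assumption borrowed from the inhomogeneity
    literature, not a result of the present paper."""
    return 1.0 - (1.0 / r0) ** (1.0 / (1.0 + cv**2))

# with r0 = 2.5 the homogeneous threshold is 60%, while strong
# heterogeneity (cv = 3) pushes it below 10%, in the neighbourhood of
# the 7-18% range cited above
homog = classical_threshold(2.5)
heterog = heterogeneous_threshold(2.5, 3.0)
```

for cv = 0 the two formulas coincide, which is a quick consistency check on the correction.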
as predictors we collated data from publicly available sources (see supplementary material for a list and for sources) for population, health, health systems, and environmental indicators between may 15th and 20th 2020. the variables used as predictors are described in a protocol that was published on the server of the open science framework (https://osf.io/x93np/). briefly, we used population indicators (life expectancy, percent single households, city dwelling, age groups, population density), health systems indicators (number of doctors, hospital beds, icu beds, pcr tests), health indicators (percentage of obese, diabetic, smoking and physically inactive persons), air pollution, and finally variables coding for political actions: closure of borders, closure of schools, country lockdown, including the rapidity of implementation since the first case was noted (see table 1). (medrxiv preprint, this version posted june 22, 2020; https://doi.org/10.1101/2020.06.18.20135012; cc-by-nc-nd 4.0 international license; the author/funder has granted medrxiv a license to display the preprint in perpetuity.) we built two separate linear models to predict the influence of variables on population-standardized cov2 cases and covid-19 associated deaths. in order to investigate which variables might be potential predictors, first-order correlations of all relevant variables with the outcome variables were calculated using nonparametric correlations, and their inter-correlation structure was studied. only variables that contributed with an effect size r > .3 or with significant correlations were further considered for modeling. as 40 cases offer enough stability to estimate about 4 parameters reliably 15, we opted for small models to start with and included a further predictor only if it was theoretically meaningful, empirically supported (i.e. a significant predictor) and improved model fit.
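the screening rule described above (retain a predictor if its spearman correlation with the outcome has |rho| > .3 or is significant) can be sketched as follows. the variable names and toy data are hypothetical, not the paper's:

```python
import numpy as np
from scipy.stats import spearmanr

def screen_predictors(data: dict, outcome: str,
                      r_min: float = 0.3, alpha: float = 0.05) -> dict:
    """return predictors whose spearman correlation with the outcome
    satisfies |rho| > r_min or p < alpha (the inclusion rule above)."""
    y = np.asarray(data[outcome], dtype=float)
    keep = {}
    for name, values in data.items():
        if name == outcome:
            continue
        rho, p = spearmanr(values, y)
        if abs(rho) > r_min or p < alpha:
            keep[name] = (rho, p)
    return keep

# hypothetical data for 40 countries: one predictor monotonically related
# to the outcome, one unrelated binary variable
n = 40
life_exp = np.linspace(70.0, 85.0, n)
data = {
    "cases_per_100k": np.exp(0.1 * life_exp),  # monotone in life expectancy
    "life_expectancy": life_exp,
    "noise": np.tile([0.0, 1.0], n // 2),      # unrelated alternating dummy
}
selected = screen_predictors(data, "cases_per_100k")
```

because spearman's rho only uses ranks, the perfectly monotone predictor is selected with rho = 1 while the alternating dummy is screened out.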
we included first all those relevant predictors from the population, health systems, and environmental sets, in separate steps, that correlated significantly with the outcome and were not collinearly related with each other. we explored model fit to find the best subset for each small group of indicators with forced entry of not more than four variables at a time, retaining only significant predictors for the next step. in a final step we included potential predictors from the set of public health indicators to investigate whether there was any improvement in model fit and whether these variables were significant predictors. the rationale of this procedure is: if the public health measures contribute to preventing cases and deaths, then they would have to emerge as potential significant predictors with a negative sign (as they were dummy coded with 1 coding for present and 0 coding for absent). in addition, the model fit of the enlarged model would have to improve. as indicators of improved model fit we used the difference of the akaike information criterion (aic, in its original and corrected version), the difference of the bayes information criterion (bic) and the chi²-goodness-of-fit test statistic divided by degrees of freedom, conjointly, to avoid over- and underfitting. we always used the model that minimized all of them conjointly. to assess model adequacy, plots of predicted versus observed cases, residual distribution plots, and residuals vs. cases were visually analyzed, and residual plots were screened for outliers (residuals vs. chi² statistic). in a sensitivity analysis the model was recalculated without outliers to see whether the model structure, i.e. the variables used as significant predictors, would be the same and goodness of fit improved.
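the information criteria used above have simple closed forms; a small sketch with the standard definitions (n observations, k estimated parameters, log-likelihood ll — this is generic textbook code, not from the paper, which used statistica):

```python
import math

def aic(ll: float, k: int) -> float:
    """akaike information criterion: 2k - 2*ll."""
    return 2 * k - 2 * ll

def aicc(ll: float, k: int, n: int) -> float:
    """small-sample corrected aic: aic + 2k(k+1)/(n - k - 1)."""
    return aic(ll, k) + 2 * k * (k + 1) / (n - k - 1)

def bic(ll: float, k: int, n: int) -> float:
    """bayes information criterion: k*ln(n) - 2*ll."""
    return k * math.log(n) - 2 * ll

# with 40 countries the small-sample correction is non-negligible:
# for k = 4 parameters aicc adds 2*4*5/35, i.e. about 1.14, to the aic
delta = aicc(-100.0, 4, 40) - aic(-100.0, 4)
```

a candidate model is then preferred only if it lowers aic, aicc and bic (and chi²/df) conjointly, which matches the selection rule described above.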
for those sensitivity analyses, aic and bic were only used as a further criterion if the difference was large, as the efficiency of these information criteria changes with the number of cases/degrees of freedom and the number of variables 15, 16. we used statistica version 13.1 for all analyses. the nonparametric correlations (spearman's rho) between the two predefined outcome variables, cases and deaths per 100.000 inhabitants, as well as the case-fatality rate (cfr), for illustration, are reported in table 1. insert table 1 here. of the structural variables describing the health systems, only the number of tests conducted correlated significantly with the number of standardized cases (r = .32) and deaths (r = .46), as well as with the case-fatality rate (r = .35) and with the number of icu beds (r = .39). of the variables describing political actions, only border closure was negatively and significantly related with standardized cases (r = -.43), but only weakly and non-significantly with the number of deaths (r = -.25): cases tended to be higher in countries that did not close the border. neither lockdown nor school closures were significantly and sizably related with the number of cases or the number of deaths. only full closure of schools was slightly, but non-significantly, related with the number of cases (r = -.19), but not with the number of deaths, indicating that cases were higher in countries that had not closed schools. however, as school closure was correlated positively, but non-significantly (r = .30), with cfr, it is necessary to clarify by modeling which covariation might be influential.
the duration or length of border or school closures was only marginally and non-significantly negatively correlated with the number of deaths and cases. the rapidity with which countries reacted, i.e. the time difference between the registration of the first case and the initiation of political reactions, was only slightly correlated with the number of cases, and significantly correlated only with the case-fatality rate (r = .39), i.e. countries that were slower in initiating political actions had a higher case-fatality rate. higher and significant correlations were visible with descriptors of populations and health status. there were more cases in countries that had a higher life expectancy at birth (r = .53), and there were more deaths (r = .43) in such countries. there were more cases (r = .47), as well as more deaths (r = .36), in countries with a higher percentage living in cities. however, neither the population density of a country nor the percentage living in single households emerged as a potential predictor. potentially interesting correlations emerged between the case rate and death rate with the percentage of the population taking lipid-lowering drugs (r = .42) and with the amount of mercury used in the alkaline-chlorine industry (cases: r = .42, deaths: r = .40). but since we were unable to find enough data for all countries of interest, these variables could not be used for modeling. none of the other variables describing the health status of a population (obesity rate, insufficient physical activity, sleep problems, vaccination rate, percent of diabetes patients in a population) emerged as potential predictors. paradoxically, there were more cases (r = -.38), as well as deaths (r = -.33), in countries that had a lower percentage of smokers in the population. as this correlation was even higher for male smokers, likely because smoking is predominantly a male phenomenon, the percentage of male smokers was used for further modeling.
the same paradoxical relationship can be seen with variables that code for air pollution, especially with very small particles (pm2: particulate matter of 2 micron size per m³ of air), where we see significant negative correlations of r = -.52 with cases and r = -.39 with deaths. although the correlation with pm10 was somewhat higher, we used pm2 for modeling, because pm2 and pm10 are highly intercorrelated (r = .75) and because we had more cases with data for pm2, most notably the usa. following our protocol, we constructed a model to account for the covariance structure of the variables. since the outcome variables, cases and deaths standardized on 100.000 inhabitants per country, showed adequate fit to a gamma distribution (see efigure 1), we calculated a generalized linear model with a log link function on gamma-distributed outcome variables: y_i ∼ γ(k_i, s_i), log e(y_i) = x_i^t β, where γ refers to the gamma distribution with shape k_i and scale s_i, i = 1, …, n. we used the gamma distribution as it maximizes entropy. although the (overdispersed) poisson distribution may have been a choice, we opted for the gamma distribution because by modelling standardized cases we are effectively modelling a continuous variable, thereby excluding the poisson distribution (which models discrete events). we also considered log-transforming the outcome variables to approximate a normal distribution, but the fit was not adequate, and sensitivity analyses using linear regression on a log-transformed outcome variable yielded essentially the same results, but with inadequate fit. the models that best predicted standardized cases are presented in table 2. the first model describes the best fitting model for all countries predicting cases. the variables entering the model are life expectancy, number of tests and smoking. parameter estimates are positive for life expectancy and number of tests, and negative for smoking.
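the paper fitted these models in statistica; as a hypothetical re-implementation sketch, a gamma glm with log link can be fitted by iteratively reweighted least squares. for this particular family/link combination the irls weights (dμ/dη)²/v(μ) are constant, because dμ/dη = μ and v(μ) = μ² cancel, so each iteration is a plain least-squares solve on the working response. the data and variable names below are illustrative only, not the paper's country data:

```python
import numpy as np

def fit_gamma_glm_log(X, y, n_iter=100, tol=1e-10):
    """fit a gamma glm with log link, log E(y_i) = x_i^T beta, by irls.
    with log link and gamma variance V(mu) = mu^2 the irls weights equal 1,
    so each update is an ols solve on z = eta + (y - mu) / mu.
    assumes the first column of X is the intercept."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())          # start from an intercept-only fit
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu         # working response
        beta_new = np.linalg.lstsq(X, z, rcond=None)[0]
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

# illustrative: 40 "countries", one predictor, noiseless gamma mean so the
# fit can be checked against the known coefficients
n = 40
x = np.linspace(0.0, 2.0, n)
X = np.column_stack([np.ones(n), x])
true_beta = np.array([0.5, 1.2])
y = np.exp(X @ true_beta)               # exact means, no sampling noise
beta_hat = fit_gamma_glm_log(X, y)
```

with noiseless data the irls iteration converges to the generating coefficients, which is a convenient correctness check before applying the routine to real gamma-distributed outcomes.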
in a second step, variables coding for political decisions (country lockdown, border closure, school closures) were entered. the best fitting model emerged with border closure as a negative predictor, with smoking removed. the model fit statistics show improved model fit over the first model (akaike information criterion, aic: 462.62 vs. 465.69; bayes information criterion: 470.94 vs. 474; chi²/degrees of freedom: 0.43 vs. 0.46). we inspected the chi² vs. prognosis plot to spot outliers. there was only one clear outlier, belgium (efigure 3). removing this outlier improved model fit considerably (aic 379.21; bic 388.19; chi²/degrees of freedom 0.24), with air pollution pm2 added to the model as a negative predictor. insert table 2 here. the model predicting covid-19 related deaths is presented in table 3: here the duration the infection had been in the country is a significant positive predictor, and so is life expectancy. smoking is a negative predictor. when entering the public health variables, only school closures emerged as a significant positive predictor that improved model fit. excluding belgium, the only serious outlier, improved model fit. the same variables remain in the model as significant predictors with nearly the same regression coefficients, including their signs. insert table 3 here. inspection of residuals shows that the linearity assumption is warranted. the model can predict the cases reasonably well. a plot of countries vs. residuals is presented in efigure 4.
the major findings of this modeling study using population data for 40 countries are clear, if surprising: life expectancy emerges as a stable positive predictor both for standardized cases of cov2 infections and for covid-19 related deaths. surprisingly, smoking emerges as a stable negative predictor, i.e. a protective factor. of the public health or political variables, only border closure emerges as a strong negative predictor for cases, but school closure is a strong positive predictor for deaths, i.e. it is associated with more deaths. the parameter for the number of tests conducted in a country emerges as a strongly significant positive predictor. the fact that life expectancy is the most consistent positive predictor (the longer the life expectancy in a country, the more cases and deaths) is easy to understand. the disease most aggressively affects elderly and multimorbid patients. life expectancy is a complex variable, incorporating social and medical progress in a country as well as economic indicators, and hence it denotes the number of the elderly in a population, as well as the intensity of medical care. only the number of cov2 pcr tests, out of all health systems variables, enters the model as a significant positive predictor. other quality indicators of the medical system (number of doctors per 10.000 inhabitants, number of icu or hospital beds) do not enter the model. that the number of tests should be related to the number of cases is evident: the more tests are conducted in a country, the more cases can potentially be registered. the absence of other indicators from the set of medical system variables shows that the development both of infections and of deaths is rather independent of the preparedness of the medical system. although some interesting first-order correlations indicated that health status variables might be interesting to explore, none of them emerged as a predictor, except smoking as a somewhat protective variable.
this might have to do with the fact that smokers may have a hyperactive system to combat airborne noxious agents and hence might have a small advantage against this particular disease 17, 18. a large cohort study has documented a similar counterintuitive effect 19, and an argument could be made that this might have to do with the fact that smokers express fewer ace2 receptors 20, which are the main entry gate of cov2 into the lungs. however, the correlation of smoking with life expectancy is negative (r = -.50 for men), and hence smoking might confer other risks that shorten lives. only in one model, excluding the outlier belgium, does air pollution play a role as a potential negative predictor, i.e. as a potential preventive factor. this is rather counterintuitive. either it could be understood along the same lines: lungs prepared to deal with small noxious agents might be better prepared to fight a virus. or else, airborne viruses might be captured by small airborne particles and might fall to the ground earlier. again, this could be an accidental effect that should not encourage air pollution, as air pollution has detrimental effects elsewhere. the closing of borders is a significant negative predictor, denoting a protective effect, in the model including all countries and predicting the number of cases, but not for predicting the number of deaths. in all models predicting deaths, the duration of the infection is a positive predictor. as deaths develop with a delay of perhaps 3-4 weeks after the first contact with the virus 21-24, this relationship reflects a quite independent temporal dynamic of the infection.
it is interesting to observe that the closure of schools emerges as a strong positive predictor for the number of deaths, i.e. school closures are associated with more deaths. this could be an indicator of strong social distancing rules in a country, which might be counterproductive in preventing deaths, as social distance for very ill, and presumably also very old, patients might enhance anxiety and stress and could then become a nocebo 25, 26. it could also reflect the fact that countries which saw a rising tendency of deaths closed schools as an emergency measure, and hence school closure is an indicator of fear in a country. but considering the prevention of deaths, none of the public health measures studied are associated with the prevention of deaths. border closure might be an exception in that it is associated with a reduction in the number of cases and hence, indirectly, the number of deaths. but in the full model predicting deaths it is not a significant predictor. this seems to contradict new modeling data using time series models 27, 28 that report clear evidence for the effectiveness of non-pharmaceutical interventions. we actually doubt the validity of these findings. the major shortfall of these models is that they ignore the most likely reason why we find the data we find, namely immunity in the population, and neglect the strength of natural immunity (see below). thus, a new reliability study of such models shows that they are crucially dependent on the assumptions, the parameters assumed, and the time point at which they capture data 29. if the wrong assumption about a potential resistance against an infection in a population is made, the results are far off from the true values. rapidity of reaction can be a positive predictor in some models, but reliably leaves the equation as soon as the duration of the infection is taken into account.
this signals, in our view, that the dynamics of the infection develops quite independently of political actions, or rather that political actions mostly come too late. the duration of an infection in a country was only a significant predictor for standardized deaths. a model including this variable to predict standardized cases is not significant and does not improve model fit. one might argue that a perhaps more conventional way of modelling would have been to log-transform the outcome variables and use standard linear regression approaches. we tried this as a sensitivity analysis but saw essentially similar results with a residual distribution that signaled model inadequacy, and hence we doubt that such a model would have helped with understanding the data. as data on the number of tests were not available for china, one might argue that our model is inadequate, as it excludes an important country. while this is true, we fitted models without the number of tests as a predictor, which did not lead to better fitting or more meaningful models. we also used population-standardized tests as an alternative to the raw number of tests, but found that the model fit was much worse. thus, the image that emerges from the data and the attempt to understand their relationship through modeling is that of a largely autonomous development. it affects mainly the elderly. smoking is somewhat protective, and border closure is associated with a lower number of cases. but other measures (closing of schools and lockdown of whole countries) do not contribute to a reduced number of cases or deaths. this may have to do with the fact that the virus travels extremely quickly.
Even the shutdown of Wuhan airport delayed the spread across China by only 2.8 days 30,31, and as Chinese airports remained open, the spread of the virus across the world was guaranteed and could not be stopped by gross measures such as border closures, as these came too late. An Italian seroprevalence study estimated that even at the very beginning of the pandemic in Italy, 2.7% of the population of Milan had already had contact with the virus 32. The examples of Taiwan 33 and Hong Kong 34 show that containment is possible if reactions come very quickly and if close to 100% of cases can be traced. But the presence of as few as 5 cases in a population already increases the likelihood of a pandemic by 50% 35,36. Once infections are in the vulnerable segments of a population, such as hospitals or homes for the elderly, political actions like school closures or country lockdowns do not prevent deaths. If anything, social distancing seems to be harmful. What might be useful, but cannot be seen in our coarse-grained data, are special protective measures geared to protecting these vulnerable populations, such as protective masks for personnel and visitors in hospitals and old people's homes, or the wearing of face masks in places with bad ventilation and close proximity of people. Why, then, have infections subsided and deaths receded since we gathered our data on May 15th, 2020? Most people would say this was due to the public health measures 1, and recent modelling studies seem to support this 8,27,28. However, we have pointed out that the peak of cases had been reached in Wuhan already on January 26th, only 3 days after the city lockdown 37. This was surely too short an interval to be an effect of public health measures, as cases manifest with a delay of at least 5, and more likely more, days. And a careful analysis shows that, if one uses a realistic retrodiction of cases, effects of public health measures cannot be seen 38.
Thus, our modelling supports the view that the public health measures of school closures and country lockdown, with the exception of the closure of borders to reduce cases, were likely ineffective in influencing cases and deaths. If anything, social distancing might even be harmful for seriously ill patients. Very likely, scientists and governments overestimated the danger this virus presented and underestimated the immunological resistance in the population. While there is no doubt that those who fell seriously ill from this infection suffered far more, and were in much greater danger, than comparable patients suffering from flu or other respiratory infections 22, there can also be little doubt that basic immunological insights were neglected from the outset. Both specific 39-41 and non-specific immunity 42-44 seem to have been much greater in the population than initially assumed. This is likely because the difference between CoV2 and other coronaviruses is not as great as initially thought. Thus, a considerable percentage of any population would have been immune through specific cross-immunity against other coronaviruses, quite apart from the fact that non-specific immunity has been almost completely neglected in the discussion. This is the reason why more recent models that account for this fact and introduce inhomogeneity parameters reach the conclusion that it is sufficient if 7%-18% of a population have had contact with CoV2 to reach herd immunity 13, and that further waves are unlikely given this immunity 45. Our data are not foolproof, but they offer first important hints. We were unable to code more countries, owing to restrictions in time, resources and availability of data. This reduces the stability of estimates and, to some degree, also the variance, although for the variables of interest the variance was large enough to estimate stable models.
For the chosen models, the goodness-of-fit tests signal good fit 46, and the relative improvement in AIC and BIC values is obvious, well beyond conventional thresholds for differences between a source model and an improved model 47-49. We opted for completeness of data as far as possible rather than for a large number of countries, as the modelling depends on the completeness of case-wise data. Some interesting and potentially useful predictors we were unable to gather: a more fine-grained resolution of the different social-distancing rules in different countries, or the availability and wearing of face masks by medical personnel and the public, for instance. Every model is wrong 50, but given the data, our models have a comparatively good fit. This can also be seen indirectly: excluding outliers improved the model fit but did not change the predictors or their overall structure. Obviously, in a population-based study we have to rely on the validity of the data provided by other sources, which may be of variable, even doubtful, quality. This limitation has to be borne in mind. An exploratory modelling approach like ours is always open to critique. We decided to build parsimonious models that are theoretically guided and conceptually informed 15, starting with health-system, structural and population indicator variables and entering political public health variables in a last step, then adapting the model to find the best model fit. We followed a predefined, published protocol, which guarded us against aimless fishing, and strove for parsimonious models that could explain the data with a minimum of predictors and good model fit.
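To make the AIC/BIC comparison invoked above concrete: both criteria trade goodness of fit (log-likelihood) against the number of parameters, and the model with the lower value is preferred. The sketch below is illustrative only; the log-likelihoods and parameter counts are hypothetical placeholders, not values from this study.

```python
import math

def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_lik

# Hypothetical log-likelihoods for a source model and an improved model,
# both fitted to the same 40 countries (n = 40).
n = 40
source = {"log_lik": -210.0, "k": 4}
improved = {"log_lik": -198.5, "k": 6}

delta_aic = aic(source["log_lik"], source["k"]) - aic(improved["log_lik"], improved["k"])
delta_bic = bic(source["log_lik"], source["k"], n) - bic(improved["log_lik"], improved["k"], n)
# Here delta_aic is exactly 19: a difference above roughly 10 is
# conventionally treated as strong support for the improved model.
```

Note that BIC penalizes extra parameters more heavily than AIC for n > 7, so an improved model can win on AIC but lose on BIC; reporting both, as the text does, guards against that ambiguity.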
We avoided computer-guided step-down and step-up procedures, as they are inefficient and prone to overfitting 15. Thus, we are quite confident that we did not overlook an important contribution of political actions to an explanatory model: they are not visible in our data, except for those we report. In conclusion: in our data set of 40 countries, only border closure had the potential to prevent cases. The other public health measures were not associated with reduced CoV2 cases or COVID-19-associated deaths. Rather, the pandemic seems to take its own course. Since being elderly is a risk factor for many diseases, and eventually for death, and cannot be changed, political actions in future pandemics would likely need to focus on protecting these members of society first. Apparently, closing schools and locking down countries is not the right method to prevent deaths. Perhaps the most sensible measures against pandemics are high alertness and an early-warning system that initiates rapid actions that can prevent pandemics from developing. We are grateful to Sebastian Sauer for advice on modelling, for counter-checking the validity of our modelling with R routines, and for critically commenting on an earlier draft. We thank the students of the master class "Quantitative Research Methods" of the MSc course "Health Promotion" who gathered the data for this study and participated in discussing and initiating this project. Ethics approval: not applicable for a secondary data analysis. Funding: no external funding and no external influence. Conflicts of interest: none of the authors has a conflict of interest. Data availability: data will be made publicly available after publication, and for peer review and qualified requests beforehand.
[Figure: residuals for the full models per country; the last country is the USA. China is missing because there were no data on tests conducted in China.]
[Figure 4e. Model diagnostic: histogram of raw residuals, predicting cases.]
Data were extracted from the database of the European Centre for Disease Prevention and Control (ECDC), as of 15th May 2020. This yields day-wise cases, deaths and population numbers for each country. The data were parsed (cases and deaths summed, the date of first case registration and population numbers extracted) using the Statistica data-reporting tool, and the countries of interest included. These were all European countries including Switzerland, as well as other countries of the OECD, plus China, Iran, Russia, India, Japan and Brazil, to represent all countries that were at the beginning of the crisis as well as other large countries in the world. We excluded Africa and other South and Middle American and Asian countries for lack of resources and time, and because we were not sure we would be able to find sufficient data. This yielded the 40 countries described in our protocol. Data on border closure for Iran suggested that other countries closed their borders against Iran, but not Iran against other countries; hence this was coded as open. We wanted to include the availability of face masks and their number, but those data were unavailable for most countries, so we excluded this variable. School-closure dates were taken from the UNESCO database, which updated them on a daily basis; we extracted the data on the 15th of May: UNESCO, COVID-19 educational disruption and response, accessed 18.05.2020, https://en.unesco.org/covid19/educationresponse. We extracted data on vaccination rates from https://de.statista.com/statistik/daten/studie/1034782/umfrage/laender-mit-der-hoechstenimpfquote/; this is a representative survey and covers all the different vaccinations.
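The parsing step described above (summing day-wise cases and deaths per country and extracting the date of first case registration) was done with the Statistica data-reporting tool; as a minimal sketch of the same aggregation, with made-up field names and rows rather than real ECDC records, it could look like this:

```python
from datetime import date

# Hypothetical day-wise records in the shape of the ECDC export:
# (country, reporting date, new cases, new deaths)
rows = [
    ("Italy",   date(2020, 1, 31), 2,  0),
    ("Italy",   date(2020, 2, 22), 17, 1),
    ("Germany", date(2020, 1, 28), 4,  0),
    ("Germany", date(2020, 2, 26), 10, 0),
]

summary = {}
for country, day, cases, deaths in rows:
    s = summary.setdefault(country, {"cases": 0, "deaths": 0, "first_case": None})
    s["cases"] += cases      # cumulative cases per country
    s["deaths"] += deaths    # cumulative deaths per country
    # Date of first case registration: earliest day with at least one case.
    if cases > 0 and (s["first_case"] is None or day < s["first_case"]):
        s["first_case"] = day
```

The country-level totals and first-registration dates produced this way are the case-wise variables that the later regression models operate on.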
As this covers only Europe, we searched other sources. For China, India, Iran and Japan, data came from https://ourworldindata.org/grapher/immunization-coverage-against-diphtheria-tetanus-and-pertussis-dtp3-vs-gdp-per-capita; for Brazil, from https://academic.oup.com/jtm/article/25/1/tay100/5127106; for Canada, from https://www.canada.ca/en/services/health/publications/vaccines-immunization/vaccine-uptake-canadian-children-preliminary-results-2017-childhood-national-immunization-coverage-survey.html. Air-quality data for the USA came from https://www.epa.gov/outdoor-air-quality-data/air-quality-statistics-report, CSV-formatted for all counties for 2017; these were imported into a new spreadsheet and the median calculated across the PM2.5-weighted 24-h averages, because there were a few counties with very high values. The reference (doi: https://doi.org/10.1016/j.atmosenv.2019.05.003) did not contain data. Prof. Kümmerer, an expert in environmental toxicology, did not have any information, and neither did the other experts we asked. Hence we dropped this variable (mercury) in the final analysis, as the data from the UN-EN report on mercury in the alkaline industry were too patchy. At each step, new partial data sets were created and added to the master database with the Statistica "merge" command, with variables merged according to country names, and each set was saved under a new name. Since some variables were created as functions of others and indexed by the original variable number, the final database contained all variables, even those that were never used further except for orienting correlations. The outcome variables were log-transformed to check whether they would then be normally distributed, which would allow for a normal linear regression.
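The fallback described next in the text, a generalized linear model with a gamma-distributed outcome and a log link, can be sketched without any statistics package. For a gamma GLM with log link, the variance function is V(mu) = mu^2 and the link derivative is 1/mu, so the IRLS working weights are identically 1 and each iteration is an ordinary least-squares fit to the working response z = eta + (y - mu)/mu. The data below are made up for illustration; this is a sketch of the technique, not the authors' Statistica computation.

```python
import math

def fit_gamma_glm_log_link(x, y, iters=100, tol=1e-12):
    """Fit y ~ exp(b0 + b1*x) for gamma-distributed y via IRLS.

    With a log link and gamma variance V(mu) = mu^2 the IRLS weights
    are 1, so each step reduces to plain OLS on the working response.
    """
    n = len(x)
    b0, b1 = math.log(sum(y) / n), 0.0
    for _ in range(iters):
        eta = [b0 + b1 * xi for xi in x]
        mu = [math.exp(e) for e in eta]
        # Working response of the IRLS step.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        mx, mz = sum(x) / n, sum(z) / n
        sxx = sum((xi - mx) ** 2 for xi in x)
        sxz = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
        new_b1 = sxz / sxx
        new_b0 = mz - new_b1 * mx
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1

# Noiseless toy data generated from exp(0.5 + 0.3*x): the fit should
# recover the coefficients almost exactly.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [math.exp(0.5 + 0.3 * xi) for xi in xs]
b0, b1 = fit_gamma_glm_log_link(xs, ys)
```

A real analysis would also estimate the gamma shape parameter and report standard errors and p-values for the predictors, which statistical packages handle; the point here is only the log-link/gamma mechanics.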
As the result was less than satisfactory, we decided to go for a generalized model and to regress on a gamma-distributed variable, as the variables were clearly gamma-distributed. Date variables were calculated from the starting date to the 15th of May for border closure and school closure, and finally the rapidity of reaction was calculated as a difference: the date when the first measure (either border closure or school closure) was registered, minus the date of first case registration. Distribution analyses of the dependent variable showed that it was gamma-distributed and that a log transformation could not rectify this; hence it was decided to calculate a regression on gamma-distributed variables. First, non-parametric correlations were computed to see which variables correlate with the outcome at all; as defined in the protocol, only variables with r > .3 and/or significantly correlating variables were considered further in the regression models. The regression models used the generalized-linear-models functionality, stipulating a gamma-distributed outcome variable with a log-link function. The parametrization method was over-parametrized, as there was no sigma-restricted coding in our data. All potentially included variables were inspected for their descriptive parameters and to see whether they contribute any variance. If several similar variables were available (e.g. air-pollution variables), we used those with the least missing data, in order not to lose power. All modelling approaches first included health-service, population and health parameters in a model, trying to fit the most parsimonious model with only significant predictors in the equation. This was done by calculating forced-entry models, excluding non-significant predictors and recalculating the model. Highly intercorrelated variables were never used together; instead, separate models were calculated and the model with the best fit was selected. After that, variables representing political actions (country lockdown, school closure) were entered in an additional model and retained as predictors if significant.
References (titles as extracted):
- Association of public health interventions with the epidemiology of the COVID-19 outbreak in Wuhan
- On the origin and continuing evolution of SARS-CoV-2. National Science Review
- SARS-CoV-2 infects T lymphocytes through its spike protein-mediated membrane fusion
- SARS-CoV-2 invades host cells via a novel route: CD147-spike protein
- Functional exhaustion of antiviral lymphocytes in COVID-19 patients
- Baseline characteristics and outcomes of 1591 patients infected with SARS-CoV-2 admitted to ICUs of the Lombardy region
- Epidemiology and transmission of COVID-19 in 391 cases and 1286 of their close contacts in Shenzhen, China: a retrospective cohort study. The Lancet Infectious Diseases
- Inferring change points in the spread of COVID-19 reveals the effectiveness of interventions
- The end of exponential growth: the decline in the spread of the coronavirus
- Von der fehlenden wissenschaftlichen Begründung der Corona-Maßnahmen [On the lacking scientific justification of the corona measures]
- Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand.
London: Imperial College
- Individual variation in susceptibility or exposure to SARS-CoV-2 lowers the herd immunity threshold
- Why herd immunity to COVID-19 is reached much earlier than thought
- Model selection and inference: a practical information-theoretic approach
- Regression and time-series model selection
- Feasibility of controlling COVID-19 outbreaks by isolation of cases and contacts. The Lancet Global Health
- Early dynamics of transmission and control of COVID-19: a mathematical modelling study. The Lancet Infectious Diseases
- Wuhan COVID-19 data: more questions than answers
- Inferring change points in the spread of COVID-19 reveals the effectiveness of interventions. Advance: Social Sciences and Humanities preprint
- Asymptomatic seroconversion of immunoglobulins to SARS-CoV-2 in a pediatric dialysis unit
- Human coronavirus reinfection dynamics: lessons for SARS-CoV-2. medRxiv
- The infection fatality rate of COVID-19 inferred from seroprevalence data. medRxiv
- Epigenetic landscape during coronavirus infection
- SARS coronavirus pathogenesis: host innate immune responses and viral antagonism of interferon. Current Opinion in Virology
- Recognition of virus infection and innate host responses to viral gene therapy vectors
- Second waves, social distancing, and the spread of COVID-19 across America
- Interpreting and understanding logits, probits, and other nonlinear probability models
- Model selection and model averaging in behavioural ecology: the utility of the IT-AIC framework
- AIC and BIC: comparisons of assumptions and performance
- AIC model selection using Akaike weights
- Model selection and model averaging
(Continuation of the air-quality note above: because of the few counties with very high values, the median was taken; otherwise the figure would have been 103 instead of 40.)
Only PM2.5 and PM10 data, including US air-pollution data, are in covid10-mastertabelle10.sta; some countries are missing on that variable.
- Worldwide trends in insufficient physical activity from 2001 to 2016: a pooled analysis of 358 population-based surveys with 1.9 million participants
- World health statistics 2020: monitoring health for the SDGs, sustainable development goals. Geneva: World Health Organization; 2020. Licence: CC BY-NC-SA 3.0 IGO.
Other sources on health and population data (each accessed 15 May):
- Japan und Deutschland im Zahlenvergleich (2): Bevölkerung [Japan and Germany compared in figures (2): population]
- Türkei: mehr Einpersonenhaushalte und kleinere Haushaltsgrößen [Turkey: more single-person households and smaller household sizes]
- IDF MENA members
- south-and-central-america/members.html (International Diabetes Federation, 2020)
- IDF SEA members
- IDF Western Pacific members
- Bevölkerungsdichte nach Ländern [Population density by country]
- Do more people live in urban or rural areas?
- Anteil der Einpersonenhaushalte an allen Privathaushalten in den Ländern der EU im Jahr [Share of single-person households among all private households in EU countries]
- China: Altersstruktur [age structure]
- Großbritannien: Altersstruktur [Great Britain: age structure]
- Japan: Altersstruktur
- Norwegen: Altersstruktur [Norway: age structure]
- Prävalenz von Diabetes bei zwischen 20- und 79-Jährigen in ausgewählten Ländern weltweit im Jahr [Prevalence of diabetes among 20- to 79-year-olds in selected countries worldwide]
- Russland: Altersstruktur [Russia: age structure]
- Türkei: Altersstruktur
- USA: Altersstruktur
- Altersstruktur der ständigen Wohnbevölkerung in der Schweiz von 2009 bis 2019 [Age structure of the permanent resident population of Switzerland, 2009-2019]
- Brasilien: Altersstruktur [Brazil: age structure]
- Europäische Union: Altersstruktur in den Mitgliedsstaaten im Jahr 2019 [European Union: age structure in the member states in 2019]
- Europäische Union: Bevölkerungsdichte in den Mitgliedsstaaten [European Union: population density in the member states]
- Indien: Altersstruktur [India: age structure]
- Iran: Altersstruktur
- Kanada: Altersstruktur [Canada: age structure]
- substantial-investment-needed-to-avert-mental-health-crisis
Prevalence of insomnia was taken from various review sources: Canada, Ohayon & Sagales (2010); incidence of insomnia from the European guideline for the diagnosis and treatment of insomnia; sleep dissatisfaction from Canada; Switzerland, Statista ("Leiden Sie unter Schlafstörungen?" [Do you suffer from sleep disorders?]); and prevalence of sleep problems from van de Straat & Bracke. A new hyper-variable was then constructed in which the mean rank of those variables that were available per country was deposited. Finally, this mean-rank variable was itself ranked to yield the rank order of countries with sleep problems, and this was used for further analysis. Health-services data: we extracted the number of doctors (standardized per 1,000 inhabitants), the number of hospital beds and the number of ICU beds, standardized. Hospital beds (per 1,000 inhabitants): WHO Global Health Observatory data repository; beyond-containment-health-systems-responses-to-covid-19-in-the-oecd-6ab740c0/. The following countries are not contained in this list, and data for them come from the corresponding sources, e.g. China: Phua. Drugs: we attempted to get data on lipid-lowering drug consumption, but as these data were sparse and not systematically comparable, we desisted from further attempts. Mercury: country-wide mercury consumption is not easily available. UN-EN reports give tons of consumption for those countries that use mercury in the chlorine-alkaline industry, but this is only about half of the countries; for the rest, only regional data were available. We contacted Dr. Steenhuisen and Prof. Kümmerer in the hope of getting help; Dr. Steenhuisen did not provide data.

key: cord-331376-l0o1weus
authors: Pogwizd, Steven M.; Bers, Donald M.
title: Rabbit models of heart disease
date: 2009-03-17
journal: Drug Discov Today Dis Models
doi: 10.1016/j.ddmod.2009.02.001
sha: doc_id: 331376 cord_uid: l0o1weus
Human heart disease is a major cause of death and disability. A variety of animal models of cardiac disease have been developed to better understand the etiology, the cellular and molecular mechanisms of cardiac dysfunction, and novel therapeutic strategies. The animal models have included large animals (e.g. pig and dog) and small rodents (e.g. mouse and rat), and the advantages of genetic manipulation in mice have appropriately encouraged the development of novel mouse models of cardiac disease. However, there are major differences between rodent and human hearts that urge caution in extrapolating results from mouse to human. The rabbit is a medium-sized animal that has many cellular and molecular characteristics very much like those of human, and is a practical alternative to larger mammals. Numerous rabbit models of cardiac disease are discussed, including pressure or volume overload, ischemia, rapid pacing, doxorubicin, drug-induced arrhythmias, transgenesis and infection. These models also allow the assessment of therapeutic strategies that may become beneficial in human cardiac disease. The study of human cardiovascular disease has been greatly facilitated by the use of a wide variety of animal models. Small animal models such as rodents, guinea pigs and hamsters offer many advantages (low cost, short gestation time and short time for disease progression). Furthermore, there is a tremendous amount of historical data, accumulated over decades, in rat models of cardiovascular disease. Moreover, the development of genetically modified mice has transformed medical research, allowing investigators to overexpress, knock out or knock in genes of interest to explore the functional consequences of such genetic modulation. Large animal disease models, such as in dog and pig, offer distinct advantages.
Larger hearts provide a considerable amount of tissue (from different chambers and regions of the heart) for biochemical and molecular studies. Hemodynamic assessment and imaging are easier, the heart is large enough for chronic instrumentation or for in vivo cardiac mapping studies, and the large size allows assessment of human-scale interventions ranging from left ventricular assist devices (LVADs) and implantable cardiac defibrillators (ICDs) to cardiac resynchronization therapy (CRT) and cardiac ablative approaches. Moreover, larger animal hearts have physical dimensions more like those of human hearts, and this larger size may be crucial for some arrhythmogenic mechanisms. However, the cost (both purchase cost and per diem charges) can be prohibitive, especially for long-term chronic studies in disease states. Because of its intermediate size, the rabbit offers several potential advantages over other species. Although the rabbit heart is smaller than that of dog or pig, it is large enough for surgical and catheter-based interventions to be done easily, at much lower cost (5-15 times less expensive than in dogs). At the same time, many 'adult human scale' interventions have been or are being scaled down for pediatric use and assessment (e.g. pacemakers and CRT), so rabbit models can be beneficial here as well (see below). Surgical interventions are still easier than microsurgical approaches in rodents. More importantly, rabbit cardiac physiology shares more characteristics with human cardiac physiology than mouse or rat physiology does. Indeed, cellular electrophysiology and Ca2+ transport in rabbit are much more like those in human than is the case for either rat or mouse [1]. This is particularly relevant for studies of heart failure (HF) and arrhythmias, because alterations in ion-channel and Ca2+-transporter function or expression are thought to contribute directly to depressed contractile performance and arrhythmogenesis [2,3].
In particular, mouse and rat ventricular action potentials (APs) have a very short duration and completely lack the prominent AP plateau phase typical of human, rabbit and most mammals larger than rat (Fig. 1d). Furthermore, the main ionic currents underlying repolarization, which greatly influence AP duration (APD), are the same in human and rabbit (the delayed rectifier K+ currents I_Kr and I_Ks). These channels are not present in mouse or rat, where APD is dictated almost entirely by transient outward currents (I_to). I_to is present in human and rabbit ventricle, but plays only a minor role in APD there. Notably, the electrophysiological characteristics and secondary regulation of I_to versus I_Kr and I_Ks are completely different, making this much more than a quantitative difference. In mouse and rat, almost all of the Ca2+ involved in the activation of contraction is released from the sarcoplasmic reticulum (SR), and during relaxation almost all of that Ca2+ is resequestered by the SR via the SR Ca-ATPase (Fig. 1a; much like in skeletal muscle). However, in human and rabbit ventricular myocytes a considerably larger fraction of the activating Ca2+ comes via Ca2+ entry, and the same amount is extruded at each beat by the electrogenic Na/Ca exchange [1]. Under control conditions in rabbit and human this is 24-30% of the Ca2+ involved in excitation-contraction coupling, and in HF the SR and trans-sarcolemmal cycling can be nearly equal (Fig. 1b,c; [4-6]).
[Figure 1 caption: (a,b) [Ca2+]i decline in normal adult rabbit and rat ventricular myocytes, based on quantitative analysis of Ca2+ removal fluxes by the SR Ca-ATPase (SR), Na/Ca exchange (NCX) and the combined slow action of the plasma-membrane Ca-ATPase (sl Ca-ATPase) and mitochondrial uniporter (mito); data from Bassani et al. [105]. (c) In rabbit HF, enhanced NCX function and decreased SR function bring these systems into more equal contribution (based on data in Pogwizd et al. [4]); human nonfailing and failing hearts exhibit a similar balance of fluxes to rabbit [6]. (d) Action potentials recorded in normal adult rabbit and rat ventricular myocytes (data from Bassani et al. [105]), indicating the K+ currents responsible for repolarization at different phases (I_to, I_Ks and I_Kr).]
Because changes in SR Ca2+ content and release, SR Ca-ATPase function and Na/Ca exchange influence both contractile function (systolic and diastolic) and arrhythmogenesis in HF, this major fundamental species difference may be crucial not only for the end-point phenotype, but also for the adaptive/maladaptive mechanisms involved in disease progression. Adult mouse and rat ventricles normally express mainly the fast α-myosin isoform (which allows faster crossbridge cycling and muscle shortening than β-myosin), but during hypertrophy and HF there is isoform switching to β-myosin [7]. Adult human and rabbit myocardia express almost entirely β-myosin, so there is little room for further isoform switching (although the change may still have functional consequences). These various differences are not arguments against other large animal models (e.g. canine and porcine, which resemble human and rabbit in this regard), but they are crucial when extrapolating results from mouse and rat hearts to human disease. In the rest of this review, we discuss specific rabbit models of heart disease (rabbit models of vascular disease, lipid disorders, diabetes mellitus, thyroid disease, obesity or chronic hypoxia are not addressed). Although constrained by space, we attempt to provide an overview of the range of rabbit heart-disease models used, along with some of their methods, advantages, limitations and potential (see Table 1). Arterio-venous (AV) shunt formation can induce volume overload (e.g.
by side-by-side anastomosis of the common carotid artery and the external jugular vein). This leads to cardiac hypertrophy, but there are few data on the effects on LV contractile function [8]. Aortic regurgitation (AR) also causes volume overload, and can be induced in rabbits by aortic valve cusp perforation with a catheter using a transcarotid approach [9]. Severe AR leads to left ventricular hypertrophy that is followed by LV systolic dysfunction and HF over the course of one to two years. Although not all AR rabbits develop HF, this represents an advantage over AR models in other species, such as the dog, which consistently maintain normal systolic function (for review, see [9]). Chronic AR rabbits exhibit myocardial fibrosis preceding the development of HF, in large part because cardiac fibroblasts produce abnormal proportions of non-collagen extracellular matrix, specifically fibronectin, with little change in collagen synthesis [10]. Chronic AV block in the dog (induced by AV nodal ablation) leads to ventricular remodeling characterized by LV hypertrophy and electrical remodeling [11]. A similar model, developed in the rabbit heart, exhibits electrical remodeling with prolonged QT interval, spontaneous torsades de pointes (TdP) in 75% of brady-paced rabbits, reduced I_Ks and I_Kr, and downregulated KvLQT1, minK and HERG [12]. [Table 1 also lists radiation-induced heart disease [87]; www.drugdiscoverytoday.com] [...] [13], although HF is uncommon. Banding of the descending aorta of infant (ten-day-old) rabbits (a model analogous to coarctation of the aorta seen in children) leads to progressive aortic stenosis, and within six to seven weeks the LV mass/volume ratio increases by 30-100% (without change in LV systolic function), but with decreased myocyte contractility [14,15]. Hypertension can induce left ventricular hypertrophy in the rabbit when unilateral nephrectomy is combined with either contralateral renal artery constriction or renal wrapping.
as an example, studies in the one-kidney, one-clip (1k,1c) goldblatt rabbit (removal of the right kidney and partial constriction of the left renal artery) demonstrated severe hypertension, a 75% increase in lv mass, and diastolic dysfunction (lv systolic function was preserved) [16] . mild lv hypertrophy has also been produced in rabbit by a one-kidney, one-wrap method [17] . in other studies, chronic infusion of angiotensin has been used to induce hypertension [18] . rv pressure overload can also induce right heart hypertrophy (and hf). this has been induced in rabbits by several interventions including pulmonary artery constriction in adult [19] or young rabbits [20] and monocrotaline-induced pulmonary hypertension [21] . myocardial infarction (mi) in the rabbit can be induced by coronary artery occlusion, and this is a useful model. there are some differences in coronary anatomy between rabbits and other species. the collateral circulation is not as extensive as it is in the canine heart, and the circumflex supplies most of the lv free wall and is more dominant than the left anterior descending (lad) coronary artery [22, 23] . as such, occlusion of the marginal branch of the circumflex coronary artery is often used for mi induction, and leads to mild to moderate degrees of hf, with lv ejection fractions in the range of 40-50%, although in some studies efs were as low as 27% [23, 24] . mis are characterized by increased lv end-diastolic dimension and increased left atrial size, similar to that found in lad infarcts in other species. there are significant alterations in β-adrenergic receptor (β-ar) signaling including a global reduction in β-ar density, reduced β-ar coupling and increased protein levels and activity of βark1 and g i [25] . there is regional heterogeneity of intracellular ca 2+ handling [23] and impaired synchronization of ca 2+ release events throughout the myocytes [26] .
in addition to chronic occlusion, a rabbit model of myocardial ischemia followed by reperfusion has been used [27] . moreover, other rabbit mi models include intracoronary microsphere embolization [28] (similar to a well-established intracoronary microembolization model in dog [29] ) and direct current-shock-induced cardiac muscle injury [30] . hibernating myocardium models in the rabbit have been developed, for example with coronary artery ligation [31] . moreover, a cellular model of hibernating myocardium in rabbit cardiac myocytes has been developed [32] : co-culturing of adult rabbit cardiomyocytes with cardiac fibroblasts induces hibernation-like dedifferentiation and serves as a valuable tool to study cellular pathways of hibernating myocardium in vitro.
rabbit models of heart failure
hf has been induced in the rabbit heart by several approaches. as mentioned above, volume or pressure overload, as well as mi, occasionally leads to hf. however, several hf rabbit models have been developed using approaches that have been successful in other animal species [33] [34] [35] . doxorubicin (adriamycin), an anthracycline chemotherapeutic agent, is one of the most prescribed anticancer drugs, primarily because of its effectiveness in a wide variety of hematologic malignancies and solid tumors. however, cardiotoxicity is a major clinical problem, and its cumulative toxicity on the myocardium prevents its use at the maximum doses that would be needed for optimal treatment [36] . progressive reduction in ejection fraction is seen during the course of therapy, and life-threatening chf is observed in 10% of patients receiving more than 550 mg/m 2 . this has prompted studies in experimental models. adriamycin has been used experimentally to induce cardiomyopathy in rabbits [37] [38] [39] . this model has been useful in identifying cardioprotective agents that could limit adriamycin toxicity.
it has also been useful as a model of nonischemic cardiomyopathy with severe cardiac dilatation, lv hypertrophy, and decreased lv systolic function that is progressive and irreversible, accompanied by fluid retention and activation of the sympathetic nervous and renin-angiotensin systems [37] . however, there is no evidence of β-adrenergic receptor downregulation as occurs in advanced human hf [40] . cardiotoxicity is due to free radical formation and lipid peroxidation that ultimately alters lysosomes, mitochondria, sr and the sarcolemmal membrane. these changes result in the activation of hydrolytic enzymes, calcium overload and reduced energy production [41] . pathologic changes including cytosolic vacuolization and myofibrillar loss are typical of adriamycin cardiotoxicity in patients, but are different from those observed in other forms of nonischemic hf in humans (e.g. idiopathic dilated cardiomyopathy (idcm), long-standing hypertension or valvular heart disease). other limitations include the variable degree of lv dysfunction produced and undesirable bone marrow and gi toxicity. high-dose catecholamines (isoproterenol or epinephrine), repetitively infused, induce a cardiomyopathy in rabbits characterized by lv dilation, hypertrophy and depressed systolic function, but a high mortality rate [42] . rapid pacing also induces hf. tachycardia-induced hf has been described in patients with long-standing tachyarrhythmias such as atrial fibrillation (af) with rapid ventricular response. as such, rapid pacing (whether ventricular or atrial) has been used to induce hf in several species, most commonly dog, but also pig and rabbit [43] . rapid ventricular pacing in the rabbit induces hf characterized by cardiac enlargement, systolic dysfunction and ventricular myocytes exhibiting contractile dysfunction. the intermediate size of the rabbit allows implantation of standard human pacemakers.
because the intrinsic heart rates of rabbits are in the range of 220-280 beats/min, pacing to rates in the range of 340-400 beats/min is required [44] [45] [46] . hf develops over the course of two to five weeks and is characterized by dyspnea, appetite loss and body weight loss. chronic pacing leads to a progressive, predictable and time-dependent reduction in lv systolic function and changes in geometry [44] , which allows the study of molecular and cellular events during the development of chronic hf. development of hf is associated with reduced responsiveness to inotropic stimulus, decreased myocyte contractility, and neurohumoral activation [44, 47] . there are changes in ca 2+ homeostatic mechanisms consistent with human hf, including decreases in ca 2+ transient amplitude, l-type ca 2+ channel activity and serca expression. however, yao et al. report decreased na/ca exchange current in pacing hf rabbits [45] , in contrast to studies in canine pacing hf showing enhanced ncx activity when [ca 2+ ] i was minimally buffered [48] . as with pacing hf models in other species, lv systolic function returns to normal within approximately one week following cessation of pacing, although the adenylyl cyclase response to agonists takes two weeks, and β-adrenergic receptor density takes four weeks, to return to normal [46] . moreover, the lv dilatation and dysfunction are not associated with myocardial or cellular hypertrophy [44] , so the changes in lv myocardial structure are not similar to those of clinical hf in humans caused by ischemic or hypertensive heart disease. the model has nonetheless provided important information regarding the pathophysiology of the failing heart including electrical remodeling [49] , abnormal intracellular ca 2+ cycling [50] and exercise training in hf [51] .
many of the rabbit heart disease models provide investigators with hypertrophied and/or failing myocardium with which to study alterations such as contractile dysfunction, myocardial energetics and gene expression. however, studies of arrhythmogenesis in the failing heart in these models are limited by the fact that few of these rabbit models are truly arrhythmogenic. in fact, even when one considers other animal models of hf, both large and small, there are few models that exhibit both severe contractile dysfunction and spontaneously occurring and inducible ventricular arrhythmias. combining volume overload (ar) with pressure overload (aortic constriction) several weeks later consistently leads to hf (much more so than either volume or pressure overload alone) [52] . we have demonstrated that hf rabbits exhibit severely depressed lv function with lvh (75% increase in hw/bw), spontaneously occurring vt that initiates by a nonreentrant mechanism such as triggered activity, and a 10% incidence of sudden death [5, 53, 54] . we also showed that hf rabbit myocytes exhibit: contractile dysfunction from decreased sr ca 2+ load that is associated with altered intracellular ca 2+ handling (↑ncx, ↓ryr and preserved serca); preserved β-ar responsiveness with enhanced β 2 -ar responsiveness; decreases in ion currents (i to , i k1 and i ks ); activation (by catecholamines) of a transient inward current i ti that can initiate delayed afterdepolarizations (dads); downregulation and dephosphorylation of the main ventricular gap junctional protein connexin43 (cx43); and increased expression and activation of camkii with increased ryr phosphorylation and sr ca 2+ leak [3, 5, 53, 55, 56] .
many of these findings have been validated by focused studies in failing human hearts, including 3d mapping studies showing focal nonreentrant mechanisms underlying vt in patients with nonischemic hf, ncx upregulation with preserved serca in a large subset of human hf patients, enhanced β 2 -adrenergic responsiveness in failing human cardiac myocytes [53, 57, 58] , and cx43 downregulation and dephosphorylation in human hf (both ischemic and nonischemic) [55] . other studies with this hf model have demonstrated additional insights into arrhythmogenesis [59] , sinus node dysfunction in hf [60] and the effects of acute ischemia superimposed on chronic hf [61] . thus, findings in these models have provided important new insights into human hf. while this arrhythmogenic rabbit model of nonischemic hf is technically challenging and takes months to develop, the similarities of rabbit myocardium to human myocardium and validating studies in humans indicate that this model provides important insights into the contractile dysfunction and arrhythmogenicity of the failing heart. tdp, a polymorphic ventricular tachycardia that occurs in the setting of a prolonged qt interval, can be caused by several pharmaceutical agents (primarily those that inhibit i kr , the rapid component of the delayed rectifier potassium current). tdp had been difficult to model in experimental animals, and mouse and rat, which lack an ap plateau and i kr , are less useful animal models in this arena. carlsson et al. [62] first developed an in vivo model of tdp in the anesthetized α-adrenoreceptor (α-ar)-stimulated rabbit (methoxamine plus the class iii antiarrhythmic clofilium). this model has been used extensively by several groups with minor modification [62] [63] [64] to test for drug-induced tdp, to understand the underlying electrophysiological mechanisms [65] or to develop novel antiarrhythmic approaches for tdp [63] .
drug discovery today: disease models | cardiovascular disease models, vol. 5, no. 3, 2008, www.drugdiscoverytoday.com
it remains unclear as to how α-ar stimulation facilitates tdp induction, but it could involve increases in [ca 2+ ] i , elevation of blood pressure or reflex vagal nerve activation [64] . myocardial failure in the rabbit (induced by coronary artery ligation) has also been used as a model to predict the development of tdp [66] . af can be studied in isolated rabbit atria preparations, and this has provided key insights into the mechanism of af [67, 68] and the role of atrial dilatation [69] . rabbit models of heart disease have been used to study atrial conduction and atrial fibrillation. chronic volume overload from av shunt formation slowed atrial conduction and enhanced the induction of atrial tachycardia (but not af) [70] . however, rabbits with ventricular tachypacing-induced hf exhibit atrial fibrosis and enhanced atrial fibrillation induced by burst pacing [71] . transgenic mice have provided important insights into cellular physiology, but the small size of murine hearts limits easy assessment of function and precludes assessment of human-scale interventions such as pacing, crt and defibrillation. however, in the past decade, transgenic rabbits have been developed, combining the value of transgenic approaches (overexpression or knock-outs of key genes, previously restricted to the mouse) with the larger size, scale and physiology of the rabbit. transgenic rabbit models created to date include: long qt syndrome [72] , hypertrophic cardiomyopathy [73, 74] , gsα overexpression [75] and phospholamban overexpression [76] . in some cases, findings in transgenic rabbits appear different from those in transgenic mice.
for example, transgenic rabbits overexpressing the g protein gsα do not develop cardiomyopathy like their transgenic mouse counterparts [75] ; and phospholamban-overexpressing transgenic rabbits have normal cardiac function and response to β-adrenergic stimulation, unlike transgenic mouse models [76] . these findings might arise from species differences in compensatory changes and/or species differences in cellular physiology (rabbit calcium handling and ion channel physiology are closer to those of human than what is observed in mouse). the expense and time required for the development of transgenic rabbits have been limiting, but with further advances in the field, transgenic rabbits will offer important insights into cardiac physiology. moreover, the addition of induced cardiac disease states (such as pressure overload, mi and hf) will provide novel insights into contractile dysfunction, cardiac remodeling and arrhythmogenesis. infectious diseases affecting the heart have stimulated several experimental models in the rabbit. myocarditis (with hf and dilated cardiomyopathy) has been induced in rabbits infected with viruses such as coronavirus and coxsackie virus [77, 78] , fungi such as cryptococcus [79] , parasites such as toxoplasma [80] and bacteria such as streptococci [81] and diphtheria [82] . chronic chagas disease has been produced in rabbits by the inoculation of a virulent strain of trypanosoma cruzi, and is characterized by biventricular (bi-v) dilatation and hypertrophy, apical aneurysm, interstitial fibrosis and focal myocarditis [83, 84] . bacterial endocarditis has been produced in rabbit with several bacterial pathogens including streptococcus, enterococcus and staphylococcus [85] . although the focus of these studies is predominantly the response to antimicrobial therapy, these models have provided insight into the pathogenesis of both native valve and prosthetic valve endocarditis in humans.
sepsis affects cardiac function, and endotoxin-induced cardiomyopathy has been produced by the infusion of bacterial lipopolysaccharide [86] . radiation also induces acute myocardial lesions, typically a pancarditis with inflammatory exudates, followed by a latent phase and eventually myocardial and pericardial fibrosis [87] . studies in rabbits have also shown that radiation enhances adriamycin cardiotoxicity [88] . having discussed a large number of rabbit models of heart disease, we finish this review with a brief discussion of how these models have been and can be used in the development of novel therapeutic approaches to heart disease. drug therapy with different pharmacologic agents has been tested as hf therapy in a wide range of rabbit heart models, primarily in infarct and combined pressure and volume overload models, but to some degree in pacing-induced and adriamycin-induced hf. cardiac ablative therapy in the rabbit heart is limited by the small size of the heart for in vivo studies using currently utilized ablation catheters. however, the isolated perfused rabbit heart has been useful for characterizing the effects of radiofrequency ablation (of atrium, ventricle and av node) on conduction and the underlying anatomy [89, 90] . crt, also known as bi-v pacing, is beginning to be applied to the rabbit model of mi. recent studies have shown that bi-v pacing attenuated lv dilatation, systolic dysfunction and electrical remodeling [91] . icds have not been used in rabbit models to date. however, vf inducibility and vf threshold have been assessed in the in vivo rabbit heart [92] as well as in the langendorff-perfused rabbit heart [93] . gene therapy approaches have been used in the rabbit models of heart disease. direct intramuscular injection of viral vectors into rabbit lv myocardium is the simplest approach, but is limited by local transgene expression as well as traumatic effects from needle injections.
myocardial gene delivery approaches in rabbits initially involved thoracotomy-based approaches: delivery of adenoviral transgenes either by ex vivo (retrograde) intracoronary delivery to donor hearts of a heterotopic heart transplant model [94] or by lv cavity injection (with the ascending aorta cross-clamped) [95] . however, noninvasive techniques such as percutaneous subselective coronary artery catheterization have resulted in efficient delivery of transgene that is ventricle-specific and targeted to the particular coronary artery that is catheterized [96] . koch and his colleagues used these approaches to demonstrate that adenoviral gene transfer to overexpress the β 2 -adrenergic receptor or βarkct (a peptide inhibitor of β-adrenergic receptor kinase 1 (βark1)) in the rabbit mi model can enhance contractile function and β-adrenergic receptor responsiveness, and delay the onset of hf [95] [96] [97] . other studies in rabbits have explored adenoviral gene transfer of the sodium-calcium exchanger (ncx) [98] , fibroblast growth factor-2 [28] , caspase inhibitor p35 [99] , norepinephrine transporter uptake-1 [100] , extracellular superoxide dismutase [101] and the antiapoptotic factor bcl-2 [24] . cell therapy approaches are being tested in the rabbit heart. examples of recent studies include skeletal myoblast transplantation in the infarcted rabbit heart [102] , and autologous stem cell transplantation in doxorubicin-induced nonischemic hf [103, 104] . there are no ideal models of heart disease, and every model utilized has both advantages and disadvantages. the rabbit has been used to develop models of a wide variety of human diseases including pressure overload, volume overload, combined pressure and volume overload, pacing-induced hf, toxic cardiomyopathy, mi and transgenic rabbit models. transgenic rabbit models of heart disease have been developed for several human diseases and more are on the way.
rabbit models of human heart disease can offer distinct advantages over rodent or large animal models (dogs and pigs) in terms of intermediate size, intermediate cost and similarity to human physiology, and can provide novel insights into human cardiac disease.
references
[1] excitation-contraction coupling and cardiac contractile force
[2] altered cardiac myocyte ca regulation in heart failure
[3] cellular basis of triggered arrhythmias in heart failure
[4] upregulation of na(+)/ca(2+) exchanger expression and function in an arrhythmogenic rabbit model of heart failure
[5] arrhythmogenesis and contractile dysfunction in heart failure: roles of sodium-calcium exchange, inward rectifier potassium current, and residual beta-adrenergic responsiveness
[6] cellular basis of abnormal calcium transients of failing human ventricular myocytes
[7] factors controlling cardiac myosin-isoform shift during hypertrophy and heart failure
[8] differential alteration of cardiotonic effects of emd 57033 and beta-adrenoceptor agonists in volume-overload rabbit ventricular myocytes
[9] heart failure due to chronic experimental aortic regurgitation
[10] myocardial fibrosis in chronic aortic regurgitation: molecular and cellular responses to volume overload
[11] the canine model with chronic, complete atrioventricular block
[12] potassium channel subunit remodeling in rabbits exposed to long-term bradycardia or tachycardia: discrete arrhythmogenic consequences related to differential delayed-rectifier changes
[13] protein kinase c activity and expression in rabbit left ventricular hypertrophy
[14] impaired glucose transporter activity in pressure-overload hypertrophy is an early indicator of progression to failure
[15] inhibition of tumor necrosis factor-alpha improves postischemic recovery of hypertrophied hearts
[16] echocardiography in conscious 1k,1c goldblatt rabbits reveals typical features of human hypertensive ventricular diastolic dysfunction
[17] influence of hypertension with minimal hypertrophy on diastolic function during demand ischemia
[18] renal sympathetic neuroeffector function in renovascular and angiotensin ii-dependent hypertension in rabbits
[19] the mechanical characteristics of hypertrophied rabbit cardiac muscle in the absence of congestive heart failure: the contractile and series elastic elements
[20] mechanical, energetic, and biochemical changes in long-term pressure overload of rabbit heart
[21] the effects of vasoactive intestinal peptide on monocrotaline induced pulmonary hypertensive rabbits following cardiopulmonary bypass: a comparative study with isoproterenol and nitroglycerine
[22] anesthetized rabbit as a model for ischemia- and reperfusion-induced arrhythmias: effects of quinidine and bretylium
[23] non-uniform prolongation of intracellular ca 2+ transients recorded from the epicardial surface of isolated hearts from rabbits with heart failure
[24] viral gene transfer of the antiapoptotic factor bcl-2 protects against chronic postischemic heart failure
[25] molecular beta-adrenergic signaling abnormalities in failing rabbit hearts after infarction
[26] dyssynchronous ca(2+) sparks in myocytes from infarcted hearts
[27] acceleration of the healing process and myocardial regeneration may be important as a mechanism of improvement of cardiac function and remodeling by postinfarction granulocyte colony-stimulating factor treatment
[28] effects of in vivo gene transfer of fibroblast growth factor-2 on cardiac function and collateral vessel formation in the microembolized rabbit heart
[29] a canine model of chronic heart failure produced by multiple sequential coronary microembolizations
[30] systemic and regional effects of vasopressin and angiotensin in acute left ventricular failure
[31] structural remodelling of cardiomyocytes in the border zone of infarcted rabbit heart
[32] adult rabbit cardiomyocytes undergo hibernation-like dedifferentiation when co-cultured with cardiac fibroblasts
[33] experimental models of heart failure
[34] animal models of heart failure
[35] animal models of heart failure: what is new?
[36] anthracyclines and the heart
[37] adriamycin cardiomyopathy in the rabbit: an animal model of low output cardiac failure with activation of vasoconstrictor mechanisms
[38] an experimental model of chronic cardiac failure using adriamycin in the rabbit: central haemodynamics and regional blood flow
[39] arrhythmogenesis in experimental models of heart failure: the role of increased load
[40] ventricular beta-adrenoceptors in adriamycin-induced cardiomyopathy in the rabbit
[41] analyses of the molecular mechanism of adriamycin-induced cardiotoxicity. free radic
[42] hemodynamic changes and neurohumoral regulation during development of congestive heart failure in a model of epinephrine-induced cardiomyopathy in conscious rabbits
[43] tachycardia-induced cardiomyopathy: a review of animal models and clinical studies
[44] myosin heavy chain synthesis is increased in a rabbit model of heart failure
[45] abnormal myocyte ca 2+ homeostasis in rabbits with pacing-induced heart failure
[46] alterations in cardiac adrenergic terminal function and beta-adrenoceptor density in pacing-induced heart failure
[47] left ventricular and myocyte structure and function following chronic ventricular tachycardia in rabbits
[48] enhanced ca(2+)-activated na(+)-ca(2+) exchange activity in canine pacing-induced heart failure
[49] molecular correlates of altered expression of potassium currents in failing rabbit myocardium
[50] cellular and molecular determinants of altered ca 2+ handling in the failing rabbit heart: primary defects in sr ca 2+ uptake and release mechanisms
[51] exercise training normalizes sympathetic outflow by central antioxidant mechanisms in rabbits with pacing-induced chronic heart failure
[52] experimental cardiac hypertrophy in rabbits after aortic stenosis or incompetence or both
[53] arrhythmogenic effects of beta2-adrenergic stimulation in the failing heart are attributable to enhanced sarcoplasmic reticulum ca load
[54] nonreentrant mechanisms underlying spontaneous ventricular arrhythmias in a model of nonischemic heart failure in rabbits
[55] connexin 43 downregulation and dephosphorylation in nonischemic heart failure is associated with enhanced colocalized protein phosphatase type 2a
[56] ca 2+ /calmodulin-dependent protein kinase modulates cardiac ryanodine receptor phosphorylation and sarcoplasmic reticulum ca 2+ leak in heart failure
[57] mechanisms underlying spontaneous and induced ventricular arrhythmias in patients with idiopathic dilated cardiomyopathy
[58] gene expression of the cardiac na(+)-ca 2+ exchanger in end-stage human heart failure
[59] arrhythmogenesis in heart failure
[60] changes in sinus node function in a rabbit model of heart failure with ventricular arrhythmias and sudden death
[61] electrophysiologic and extracellular ionic changes during acute ischemia in failing and normal rabbit myocardium
[62] qtu-prolongation and torsades de pointes induced by putative class iii antiarrhythmic agents in the rabbit: etiology and interventions
[63] systemic administration of calmodulin antagonist w-7 or protein kinase a inhibitor h-8 prevents torsade de pointes in rabbits
[64] importance of vagally mediated bradycardia for the induction of torsade de pointes in an in vivo model
[65] calmodulin inhibitor w-7 unmasks a novel electrocardiographic parameter that predicts initiation of torsade de pointes
[66] use of the rabbit with a failing heart to test for torsadogenicity
[67] circus movement in rabbit atrial muscle as a mechanism of tachycardia. iii. the 'leading circle' concept: a new model of circus movement in cardiac tissue without the involvement of an anatomical obstacle
[68] mechanisms of atrial fibrillation: lessons from animal models
[69] effects of atrial dilatation on refractory period and vulnerability to atrial fibrillation in the isolated langendorff-perfused rabbit heart
[70] mechanism for atrial tachyarrhythmia in chronic volume overload-induced dilated atria
[71] pioglitazone, a peroxisome proliferator-activated receptor-gamma activator, attenuates atrial fibrosis and atrial fibrillation promotion in rabbits with congestive heart failure
[72] mechanisms of cardiac arrhythmias and sudden death in transgenic rabbits with long qt syndrome
[73] a transgenic rabbit model for human hypertrophic cardiomyopathy
[74] transgenic rabbit model for human troponin i-based hypertrophic cardiomyopathy
[75] overexpressed cardiac gs alpha in rabbits
[76] phospholamban overexpression in transgenic rabbits
[77] an experimental model for myocarditis and congestive heart failure after rabbit coronavirus infection
[78] role of mip-2 in coxsackie virus b3 myocarditis
[79] experimental cryptococcal-induced myocarditis
[80] experimental toxoplasmic myocarditis in rabbits
[81] induction of myocarditis in rabbits injected with group a streptococci
[82] new animal model of diphtheritic myocarditis
[83] the evolution of experimental trypanosoma cruzi cardiomyopathy in rabbits: further parasitological, morphological and functional studies
[84] the immunology of experimental chagas' disease. iv. production of lesions in rabbits similar to those of chronic chagas' disease in man
[85] successful single-dose teicoplanin prophylaxis against experimental streptococcal, enterococcal, and staphylococcal aortic valve endocarditis
[86] cardiac and renal effects of levosimendan, arginine vasopressin, and norepinephrine in lipopolysaccharide-treated rabbits
[87] pathogenesis of radiation-induced myocardial fibrosis
[88] adriamycin cardiomyopathy: enhanced cardiac damage in rabbits with combined drug and cardiac irradiation
[89] atrioventricular nodal fast pathway modification: mechanism for lack of ventricular rate slowing in atrial fibrillation
[90] translesion stimulus-excitation delay indicates quality of linear lesions produced by radiofrequency ablation in rabbit hearts
[91] prevention of adverse electrical and mechanical remodeling with biventricular pacing in a rabbit model of myocardial infarction
[92] i(ks) block by hmr 1556 lowers ventricular defibrillation threshold and reverses the repolarization shortening by isoproterenol without rate-dependence in rabbits
[93] high-resolution fluorescent imaging does not reveal a distinct atrioventricular nodal anterior input channel (fast pathway) in the rabbit heart during sinus rhythm
[94] myocardial gene transfer and overexpression of beta2-adrenergic receptors potentiates the functional recovery of unloaded failing hearts
[95] preservation of myocardial beta-adrenergic receptor signaling delays the development of heart failure after myocardial infarction
[96] in vivo ventricular gene delivery of a beta-adrenergic receptor kinase inhibitor to the failing heart reverses cardiac dysfunction
[97] enhancement of cardiac function after adenoviral-mediated in vivo intracoronary beta2-adrenergic receptor gene delivery
[98] functional alterations after cardiac sodium-calcium exchanger overexpression in heart failure
[99] blocking caspase-activated apoptosis improves contractility in failing myocardium
[100] cardiac overexpression of the norepinephrine transporter uptake-1 results in marked improvement of heart failure
[101] gene therapy with extracellular superoxide dismutase protects conscious rabbits against myocardial infarction
[102] the real estate of myoblast cardiac transplantation: negative remodeling is associated with location
[103] effects of autologous stem cell transplantation on ventricular electrophysiology in doxorubicin-induced heart failure
[104] effects of autologous bone marrow stem cell transplantation on beta-adrenoceptor density and electrical activation pattern in a rabbit model of non-ischemic heart failure
[105] action potential duration determines sarcoplasmic reticulum ca 2+ reloading in mammalian ventricular myocytes
this work was supported by nih grants hl73966 and hl46929 (s.m.p.) and hl64724 and hl80101 (d.m.b.).
key: cord-322806-g01wmmbx authors: sturniolo, s.; waites, w.; colbourn, t.; manheim, d.; panovska-griffiths, j. title: testing, tracing and isolation in compartmental models date: 2020-05-19 journal: nan doi: 10.1101/2020.05.14.20101808 sha: doc_id: 322806 cord_uid: g01wmmbx
existing compartmental mathematical modelling methods for epidemics, such as seir models, cannot accurately represent the effects of testing, contact tracing and isolation. this makes them inappropriate for evaluating testing and contact tracing strategies to contain an outbreak. an alternative used in practice is the application of agent- or individual-based models (abm). however, abms are complex, less well understood and much more computationally expensive. this paper presents a new method for accurately including the effects of testing, contact-tracing and isolation (tti) strategies in standard compartmental models. we derive our method using a careful probabilistic argument to show how contact tracing at the individual level is reflected in aggregate on the population level. we show that the resultant seir-tti model accurately approximates the behaviour of a mechanistic agent-based model at far less computational cost.
the computational efficiency is such that it can be easily and cheaply used for exploratory modelling to quantify the required levels of testing and tracing, alone and with other interventions, to assist adaptive planning for managing disease outbreaks. since the beginning of 2020, the world has been in the midst of a covid-19 pandemic, caused by the novel coronavirus sars-cov-2. to slow down the spread, many countries, including the uk, have imposed social distancing mitigation strategies. however, such measures cannot feasibly be imposed over a long period, as they may cause economic collapse. as a consequence, countries need to consider how to ease lockdown measures while controlling sars-cov-2 spread. the world health organisation has recently updated its guidance on this, recommending a six-point strategy that requires firstly assuring that the pandemic spread has been suppressed, followed by detecting, testing, isolating and contact-tracing of infected individuals [1] . mathematical modelling has figured prominently in decision making around control and containment of covid-19 spread, including the imposition of physical distancing measures [2] . it provides a logical framework for understanding the propagation of an infectious disease through a population and allows different interventions to be explored, including testing and contact tracing of infected individuals as possible strategies to ease social distancing restrictions. such models are necessarily simplifications, and an understanding of their assumptions, and of what they do and do not represent, is required to interpret them correctly.
mathematical models have a long history of being used to describe the spread of infectious diseases, from plague outbreaks more than a century ago [3] to the more recent sars [4] and ebola [5, 6] epidemics; from making decisions around different vaccination strategies for influenza [7] to modelling hiv [8]; and from modelling pandemic influenza [9] to currently facilitating real-time policy decision making around the covid-19 epidemic [2, 10-14]. there are several common approaches, each with advantages and disadvantages [4, 15]. compartmental models [4, 16, 17] partition the population into different compartments, such as susceptible, exposed to the virus but not infectious, infectious and removed, and track the movements of individuals between these groups. though the dynamics of real disease outbreaks are fundamentally stochastic [18-20], this level of detail is mainly relevant for early stages or small outbreaks [21]. commonly within compartmental models a mean-field approximation given by ordinary differential equations (odes) is used [4, 22, 23]. the latter approach is particularly attractive because it is computationally efficient and can yield informative results. ode systems can be generalised to explicitly incorporate dependence on the system state at some times in the past, yielding delay-differential equations (ddes) [24-26], the analogue for continuous state of markov processes with finite memory. such formulations require meticulous care to solve accurately [27, 28], and much of what is known about their behaviour consists of asymptotic results [29-32]. branching processes are used [23, 33, 34] where more flexibility is desired in representing the timing of transitions among compartments and, for continuous time, are amenable to stochastic differential equation (sde) treatment.
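as a concrete illustration of the mean-field compartmental approach discussed above, the standard seir equations can be integrated in a few lines. this is a minimal forward-euler sketch under our own choice of step size, not the authors' implementation; the parameter values used below are the ones quoted later in the text (β̂ = 0.033, c = 13, α = 0.2, γ = 1/7).

```python
def seir_step(s, e, i, r, beta, alpha, gamma, dt):
    """one forward-euler step of the mean-field seir odes.
    beta = beta_hat * c is the infection rate, alpha the incubation
    rate and gamma the recovery rate."""
    n = s + e + i + r
    s_to_e = beta * s * i / n   # new exposures
    e_to_i = alpha * e          # exposed becoming infectious
    i_to_r = gamma * i          # recoveries
    return (s - s_to_e * dt,
            e + (s_to_e - e_to_i) * dt,
            i + (e_to_i - i_to_r) * dt,
            r + i_to_r * dt)

def simulate_seir(days, s0, e0, i0, r0, beta, alpha, gamma, dt=0.01):
    """integrate the seir odes forward for the given number of days."""
    state = (s0, e0, i0, r0)
    for _ in range(int(days / dt)):
        state = seir_step(*state, beta, alpha, gamma, dt)
    return state
```

with β = 0.033 × 13 ≈ 0.43 and γ = 1/7 this reproduces the r_0 = 3 growth regime used in the testing examples later on; each step conserves the total population by construction.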
for some choices of distribution, the sde formulation is markovian and can be analysed as a continuous-time markov chain (ctmc) [19, 35]. finally, individual- or agent-based models (ibms/abms) explicitly represent each individual in the population and allow for fine-grained modelling of the characteristics of each one, such as different contact patterns or susceptibilities to the disease [36-40]. they have been [41], and are being [10-12], widely used for planning and epidemic control. while abms allow for maximal flexibility and realism, this comes at a high computational cost, and it can be difficult to extract analytical results that relate the fine-grained behaviour to population-level effects. it is generally feasible to conduct agent-based simulations for populations of tens of thousands, but there are salient features of epidemics, such as the timing and size of peaks of infectious individuals, that depend on population sizes two orders of magnitude larger. an important subset of abms are network or graphical models [42-47], where the structure of the population, that is the possible interactions among its members, is explicitly represented. in addition to the computational cost and analytical difficulties with abms, sufficient data to support their fine-grained realism is rarely available. for many purposes, including the one we are concerned with here, an accurate qualitative understanding of the effect of interventions like testing and contact tracing, cheap, coarse, high-level models are more useful than expensive fine-grained models that rely on vast and often not readily available data. while classic compartmental models can easily be used to simulate some interventions analogous to parameter changes, they cannot readily include the effects of contact tracing of infected individuals unless strong assumptions are made.
this is because modelling contact-tracing is intrinsically reliant on individual behaviour within a network structure. previous work on ebola [6], sars [48] and covid-19 used simple approaches to represent contact tracing in a compartmental model: asserting that a constant fraction of exposed individuals becomes isolated due to contact tracing [10, 14, 49, 50], or reducing transmission by a constant amount, perhaps after a delay [51]. we believe that this kind of approach is insufficient for the purpose of understanding how the rate and timing of testing and contact tracing affect success in containing outbreaks. the purpose of contact tracing is to attempt to isolate infectious, or soon to be infectious, individuals. therefore, contact tracing should result in the isolation of both infectious and exposed individuals, and this is a key assumption that previous work has missed. contact tracing will also inevitably result in the isolation of susceptible and recovered individuals, with the former contributing to a reduced rate of disease propagation. to properly understand this process it is imperative to model the effects of contact tracing with mathematical rigour. in this paper we develop an extension to the classic susceptible-exposed-infectious-removed (seir) model [16, 52, 53], simulated with odes, to include testing, contact-tracing, and isolation (tti) strategies. we call this model seir-tti. this model captures the salient population-level features of the dynamics of testing and tracing at the individual level. due to its relative simplicity, seir-tti is applicable across a spectrum of diseases. with appropriate parametrisation, it can be used anywhere a standard seir model can be used, with the same caveats and limitations.
though we are clearly motivated by the current covid-19 pandemic and wish to understand how interventions like tti can be used to contain it, we do not claim that we are modelling it in particular. our contribution is a mathematical tool and software implementation that can be used for understanding tti, not a model of covid-19. the method that we present is general and can also be applied to other compartmental models, with the standard caveat that with more compartments comes more work to determine the appropriate rates. we validate our seir-tti ode model against a mechanistic agent-based model where testing, tracing and isolation of individuals is explicitly represented, and show that we can achieve good agreement at far less computational cost. we also provide a flexible software package at https://github.com/ptti/ptti with a convenient declarative language for specifying parameters and interventions, and implementations of the seir-tti ode model, the mechanistic agent-based model, a second non-mechanistic rule-based model in the κ-language formalism [54, 55], and several related models such as classic seir. we design a compartmentalised model describing the populations of susceptible (s), exposed (e: infected but not infectious), infectious (i) and removed (r) population cohorts. such models are widely used to describe the spread of various infectious diseases [52]. within the model framework, disease progression is captured by movement of individuals sequentially between compartments, accounting for progression from susceptible individuals (s) being exposed to the virus and becoming infected but not infectious (e), to becoming infectious (i), until they recover (r). a schematic illustrating this model is shown in fig 1. the novelty of our model is that within each compartment we have included subgroups of people diagnosed and undiagnosed with the virus, attributable to reported and unreported diagnoses.
individuals in our model are defined to be diagnosed either through testing or putatively through tracing. diagnosed individuals are then isolated. (fig 1 caption: schematic of an seir model with diagnosis through testing and contact-tracing. seir is a compartmentalised model describing susceptible (s), exposed (e: infected but not infectious), infectious (i) and removed (r) population cohorts. individuals move between these compartments in sequence as they become exposed, infected and infectious during disease progression until recovery. the novelty here is that each compartment comprises diagnosed and undiagnosed individuals, with diagnosis leading to isolation. we assume that diagnosis happens through testing or putatively through tracing. individuals transition between compartments x and y at rates ∆_{x→y}, which we derive in the text.) before introducing contact tracing, we examine the standard seir model with testing. these results, and those in the following section, use the system of differential equations described in detail in the methods. we choose a relatively large initial number of infectious individuals merely for illustrative purposes, as it renders the dynamics clearer: the more aggressive testing regimes would result in immediate containment of a small outbreak, which would be difficult to see, whereas a large outbreak nevertheless takes some time to contain. the parameters have the usual meaning, with values fixed for the purposes of this section: n = 6.7 × 10^7 individuals is the total population; i(0) = 10^5 is the initial number of infected individuals; β̂ = 0.033 infections/contact is the probability of transmission; c = 13 contacts/day is the contact rate; α = 0.2 days^-1 is the incubation rate, the rate of leaving the exposed state and becoming infectious; and γ = 7^-1 days^-1 is the rate of recovery, or of leaving the infectious state. these values result in a basic reproduction number of r_0 = 3.
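the stated parameter values can be checked to give the quoted basic reproduction number, since for a model of this form r_0 = β̂c/γ (a standard seir relation, consistent with the θ_crit expression later in the text):

```python
beta_hat = 0.033  # probability of transmission per contact
c = 13.0          # contacts per day
gamma = 1 / 7     # recovery rate, per day
alpha = 0.2       # incubation rate, per day (does not enter r_0)

# r_0 = beta_hat * c / gamma = 0.033 * 13 * 7 = 3.003, i.e. r_0 ≈ 3 as stated
r_0 = beta_hat * c / gamma
```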
in the simplest case, testing is conducted at random at some rate θ of tests per individual per day; only infectious individuals are tested, and they are immediately isolated. representative trajectories from this system for various values of θ are shown in fig 2: the upper panel shows the time series of total infections, exposed and infectious, and the lower panel shows the effective reproduction number, r(t). (fig 2 caption: the dynamics shown are for a scenario with normal contact, c = 13, and an initial number of infected individuals, i(0) = 100,000. individuals who test positive are isolated for the duration of their illness. the top plot shows the total infections (exposed and infectious individuals) over time for various testing rates, ranging from none, θ = 0, to testing all infectious individuals every two days, θ = 0.55. the bottom plot shows the reproduction number over time for these same scenarios.) we can observe that while testing the entire population every 20 days (θ = 0.05) results in a lower maximum total number of infections, very frequent testing, every 3-4 days (θ = 0.25-0.3), is required to control an outbreak and cross the r(t) = 1 threshold (red horizontal line). even fairly frequent testing, e.g. every five days (θ = 0.2), is only sufficient to reduce peak infections by one order of magnitude, from about 20 million to about two million. in the infrequent testing regimes, θ ∈ [0.05, 0.25], we can also observe that the curve described by r(t) is not a sigmoid but instead first falls to a value above r(t) = 1 before stabilising and then falling again: though testing and isolating does have an effect at those rates, it is not sufficiently frequent to identify all of those who are infectious. it is straightforward to work out the condition under which testing crosses this threshold by analysing the fixed points of the underlying system of differential equations, since the required condition is that there is no change in the number of infectious people as they each infect one other person on average and are then removed. some arithmetic yields θ_crit = β̂c − γ, the red line in fig 3. the above shows that, whilst testing and isolating alone can be sufficient to control an outbreak, it would take a herculean effort on its own: without any form of distancing (c ≈ 13) it is necessary to conduct tests about every 3.5 days. if a sizeable number of infected individuals are asymptomatic, there is no alternative but to test the entire population at this rate. distancing helps here: if the contact rate is cut by half, the required rate is closer to once per fortnight. there is, however, a strategy that avoids regularly sampling the entire population by directing tests to those most likely to be infected: contact tracing, which we consider next.
(it is made available under a cc-by-nd 4.0 international license. the copyright holder for this preprint, which was not certified by peer review, is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. this version posted may 19, 2020. https://doi.org/10.1101/2020.05.14.20101808)
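the critical testing rate θ_crit = β̂c − γ and the testing intervals quoted above can be checked directly. this is a small numeric sketch using the parameter values from the text:

```python
beta_hat = 0.033  # probability of transmission per contact
gamma = 1 / 7     # recovery rate, per day

def critical_testing_interval(c):
    """days between tests needed to cross r(t) = 1,
    from theta_crit = beta_hat * c - gamma."""
    theta_crit = beta_hat * c - gamma
    if theta_crit <= 0:
        return float("inf")  # outbreak already controlled without testing
    return 1 / theta_crit

critical_testing_interval(13.0)  # ≈ 3.5 days at normal contact
critical_testing_interval(6.5)   # ≈ 14 days if the contact rate is halved
```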
the central mathematical result is the expression for the rate at which individuals are isolated due to contact tracing. the notation is explained in detail in the methods section, but the intuition is that, for any compartment x, divided into exclusive unconfined, x_u, and isolated, x_d, sub-compartments, the rate of moving between them is proportional to the probability of having had contact with an infectious individual conditional on being in x_u. the effects of contact tracing are shown in fig 4. the scenario is the same as with testing alone, except that the testing rate is fixed at θ = 7^-1 days^-1 and the tracing rate is fixed at χ = 2^-1 days^-1. the interpretation is that, on average, an infectious individual can expect to be tested in 7 days and contacts can expect to be traced in 2 days. the choice of these values for illustrative purposes is purposeful. recall from the previous section that γ, the recovery rate, is fixed at 7^-1 days^-1. one would expect that testing and isolating individuals on average around the time they have recovered, when it is too late, would be insufficient to contain an outbreak. indeed it is not sufficient, but it does reduce the maximum number of infected individuals somewhat. however, since tracing happens as a consequence of testing, it amplifies its effectiveness. this can be seen in the figure, where even a modest tracing success rate of 30-40% results in a substantial reduction of more than half the peak infections.
the relationship between testing rate and tracing rate can be seen from fig 5. when θ is very small, meaning very little testing, contact tracing has little effect. this is unsurprising, because testing causes tracing. when there is very frequent testing, on the other hand, there is little benefit to contact tracing: when testing happens more frequently on average than an individual can infect another, it is sufficient to control the outbreak on its own. for intermediate values, however, contact tracing amplifies the effectiveness of testing. the above result can be seen from this plot as well: when testing of infectious individuals is expected within a week, a modest 40% success rate at tracing contacts in two days is enough to reduce the reproduction number from 2 to less than 1.5, a substantial benefit. the central result of this paper is not the specific observations about how testing and contact tracing affect the propagation of epidemics, though those are valuable, but a technique to compute these effects efficiently. this technique allows consideration of larger populations than would be possible with agent- or individual-based models, allowing for the exploration of many different scenarios. figs 3 and 5, for example, each contain 25 × 25 = 625 data points, each resulting from a separate simulation. performing these 1250 total simulations takes under a minute on a regular laptop. this would not have been possible with agent- or individual-based models with population sizes in the hundreds of thousands or millions. it could be argued that it is sufficient to capture these dynamics in an agent-based model for modest populations and simply rescale the output for large populations. that approach is not sound for two reasons that are easily seen. first, small outbreaks.
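parameter sweeps of the kind described above are cheap because each grid point is a single ode integration. the sketch below, a simplified seir with random testing and perfect isolation rather than the paper's full seir-tti model, shows the kind of loop involved and how peak infections fall as the testing rate θ rises:

```python
def peak_infected(theta, beta=0.033 * 13, alpha=0.2, gamma=1 / 7,
                  n=6.7e7, i0=1e5, days=300, dt=0.05):
    """peak of e+i for a minimal seir model with random testing at rate
    theta, where tested infectious individuals are perfectly isolated.
    a forward-euler sketch, not the paper's implementation."""
    s, e, i = n - i0, 0.0, i0
    peak = e + i
    for _ in range(int(days / dt)):
        new_e = beta * s * i / n * dt     # new exposures (only i transmits)
        new_i = alpha * e * dt            # exposed becoming infectious
        out_i = (gamma + theta) * i * dt  # recovery plus test-and-isolate
        s -= new_e
        e += new_e - new_i
        i += new_i - out_i
        peak = max(peak, e + i)
    return peak

# a sweep over testing rates is just a loop of cheap integrations
peaks = [peak_infected(th) for th in (0.0, 0.1, 0.2, 0.3)]
```

with these parameters the peak decreases monotonically with θ, and above θ_crit ≈ 0.29 the outbreak never grows past its initial size.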
imagine a hypothetical country of 70 million people with 100 thousand infections. proportionally, that is 14.3 infections in a population of 10 thousand. there is a non-negligible probability that an outbreak of size 14 will die out on its own. this will be accounted for by the abm, but it is not a realistic possibility for an outbreak of 100 thousand. scaling therefore suggests fundamentally different results. second, without intervention, the number of infectious individuals will reach a maximum as the available pool of susceptible individuals becomes depleted. this takes longer in a large population, simply because the pool is larger. if the timing of the peak of an outbreak is a quantity of interest, a scaled abm will give the wrong result. however, our technique requires some approximations, and it is important to understand where and how well these approximations hold. to do this, we compare with an agent-based model as described in the methods, and show that our method agrees well for a large range of physically interesting and realistic parameter values. a comparison of the two systems for reasonable parameter values is shown in fig 6. the figure shows good agreement between the mean trajectory of the abm and the ode approximation. the agreement is particularly precise for the exposed and infectious compartments of both varieties. we can observe a slight over-estimate of the number of unconfined susceptible individuals and a corresponding under-estimate of the unconfined removed ones. these over- and under-estimates are nevertheless acceptably close, with a relative error in the magnitude of the susceptible population of under 10%.
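the small-outbreak argument above can be made quantitative with a linear birth-death approximation of the early epidemic: if each case transmits at rate rγ and recovers at rate γ, an outbreak seeded by n independent cases goes extinct with probability (1/r)^n. this is a standard branching-process result, not a formula from the paper, and r here is the effective reproduction number:

```python
def extinction_probability(r, n):
    """probability that an outbreak started by n independent cases dies
    out on its own, under a linear birth-death approximation of the
    early epidemic (each case transmits at rate r*gamma, recovers at
    rate gamma)."""
    if r <= 1:
        return 1.0  # subcritical or critical outbreaks die out surely
    return (1.0 / r) ** n

# 14 initial cases: die-out is plausible when r is close to 1,
# but an outbreak of 100,000 cases essentially never dies out
extinction_probability(1.2, 14)       # ≈ 0.078
extinction_probability(1.2, 100_000)  # underflows to 0.0
```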
there exist extreme scenarios where the ode performs poorly at reproducing the mean trajectory of the abm system. an example is shown in fig 7. one such scenario is when the testing rate is very low; the figure shows the case θ = 50^-1 days^-1. this circumstance violates the assumption underlying eq 21 that the number of susceptible contacts available for tracing should be much smaller than the total susceptible population. intuitively, this can be understood as the ode approximation holding well when testing and tracing are conducted sufficiently rapidly to perform their required purpose; when they do not, the approximation is poor. even in this extreme scenario, however, where the curve produced by the ode system is several standard deviations distant from the average trajectory of the abm, its shape is still similar and realistic. we consider the problem of determining the effect of testing and contact tracing in a population, p, consisting of a set of indistinguishable individuals among whom a disease propagates. to answer this we adapt the standard susceptible-exposed-infectious-removed (seir) compartmental model [16, 52] to incorporate contact tracing as well as testing and isolation of cohorts of people. our adaptation extends the classic seir not only to include progression through disease stages from exposure, via infection, to recovery, but also to keep track of the changing make-up of the population as the disease progresses. to achieve this we require our model to have two additional features: 1.
to keep track of whether people have been isolated from the rest (either due to testing positive, or having been traced as a contact of someone who tested positive); 2. to keep track of whether people have been in contact with an infectious individual recently enough to be potential targets for tracing. ordinary compartment models like seir are designed to separate individuals into distinct, non-overlapping groups. this is not a problem for the first feature, as people who are isolated and people who are not constitute entirely distinct sets. we can therefore represent unconfined and isolated individuals simply by doubling the number of states, labelling s_u, e_u, i_u and r_u the undiagnosed people who are respectively susceptible, exposed, infectious, or removed, and similarly s_d, e_d, i_d and r_d the ones who have been diagnosed or otherwise distanced from the rest of the population, by means of home isolation, quarantine, hospitalisation and such. however, dealing with contact tracing is harder, as it cannot be achieved with separate compartments. here we take two approaches. first, we describe an agent-based model that simulates contact tracing with an approximation of how it could take place in real life. this agent-based model serves as our reference. then we describe our compartment model fully and, relying on a system of ordinary differential equations (odes), introduce the concept of overlapping compartments. overlapping compartments represent model states that are not mutually exclusive, so that it is possible for an individual to belong to more than one of them, e.g. to be infected and contact-traced, or exposed and tested. we define equations for this model in order to represent the processes that happen in the agent-based model, providing the comparisons seen above in the results section.
an agent-based model of contact tracing. among the possible measures to suppress an epidemic, contact tracing is defined as "an extreme form of targeted control, where the potential next-generation cases are the primary focus" [56]. in other words, contact tracing is the process by which we aim to identify and isolate individuals who have been in contact with an infectious patient in the past, and are thus more likely to have been exposed to the disease, in order to remove them from the pool of possible infectious patients before they develop symptoms. we start by defining our modified seir model in agent-based form. the model features n agents, each characterised by a state symbolising progression through the disease (s, e, i, or r) as well as a single bit characterising whether they are undiagnosed or diagnosed/distanced (u or d). as mentioned above, we label s_u, s_d, e_u, etc. the numbers of individuals in each combination of those states, and s, e, i, r the totals (u and d combined). in addition, we store a contact matrix keeping track of which individuals have been in contact with which infectious members of the population, and an array of all those individuals for whom one past infectious contact has been identified, and who can thus be traced as potentially exposed individuals. we call c_t the total number of such traceable individuals. this contact matrix encapsulates a history of interactions in a way that is realistic but is not possible to represent directly in ode form. it is specifically the functioning of this individual contact matrix that we claim to reproduce at the population level with our ode formulation below.
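the contact matrix and the traceable list described above could be kept in an agent-based simulation with bookkeeping along these lines. this is a hypothetical sketch: the class and method names are ours, not the authors' code.

```python
import random

class ContactLog:
    """minimal bookkeeping for the abm contact structure: record which
    individuals contacted which infectious individuals, so that past
    contacts can be marked traceable when the source is diagnosed."""

    def __init__(self):
        self.contacts = {}      # infectious id -> set of contacted ids
        self.traceable = set()  # ids eligible for contact tracing

    def record(self, infectious_id, other_id):
        self.contacts.setdefault(infectious_id, set()).add(other_id)

    def diagnose(self, infectious_id, eta, rng):
        """mark each past contact traceable with probability eta."""
        for other in self.contacts.pop(infectious_id, set()):
            if rng.random() < eta:
                self.traceable.add(other)
        # a diagnosed individual no longer needs tracing themselves
        self.traceable.discard(infectious_id)

    def recover(self, infectious_id):
        """once the source recovers, its contacts can no longer be traced."""
        self.contacts.pop(infectious_id, None)
```

this mirrors the rule in the text that only contacts of a *still infectious* source are traceable: `recover` discards the row before it can ever be marked.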
we simulate the model using gillespie's algorithm [57], which provides a way to sample exact trajectories produced by such stochastic processes. the possible state transitions that can take place are:
1. contact between a random individual and one belonging to i_u, with rate c·i_u. the contact is stored in the contact matrix. if the individual happens to belong to s_u, then with likelihood β̂ ≤ 1 the contact results in exposure, and the s_u individual becomes e_u;
2. progression of the disease for an e individual into i, with rate αe;
3. recovery from the disease, or removal due to hospitalisation or death, for an i individual into r, with rate γi;
4. diagnosis by regular testing of an i_u individual, with rate θi_u. the individual is moved to i_d; all its past contacts, retrieved from the contact matrix, are marked as traceable with likelihood η ≤ 1. if the individual moved to i_d was itself marked as traceable, it is unmarked (as they are already in isolation and there is no need to trace them any more);
5. release from isolation of an s_d individual, making them s_u, with rate κs_d;
6. release from isolation of an r_d individual, making them r_u, with rate κr_d;
7. contact tracing of a traceable individual, with rate χc_t. the individual is moved from x_u to x_d, where x is whatever state of progression they are in, and they are removed from the list of traceable individuals.
the transitions described above can be intuitively seen as corresponding to the ones that would happen in an idealised real-life version of epidemic spread with testing and contact tracing. the biggest deviation from reality is the perfect mixing of the population implied by the first process. the testing and tracing processes are parametrised by θ, the rate of diagnosis of infectious individuals; η, the likelihood or efficiency with which the tracing process identifies contacts; and χ, the rate at which they are found and isolated.
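gillespie's direct method samples the waiting time to the next event from an exponential distribution with the total rate, then picks which event fires proportionally to its individual rate. the sketch below applies it to just the contact/infection and recovery transitions, a plain sir subset of the full transition list above, with the contact and transmission steps folded into one rate:

```python
import random

def gillespie_sir(s, i, r, beta_hat, c, gamma, seed=0):
    """stochastic sir trajectory via gillespie's direct method.
    events: infection (rate beta_hat*c*s*i/n) and recovery (rate gamma*i).
    returns the end time and the final s and r counts."""
    rng = random.Random(seed)
    n = s + i + r
    t = 0.0
    while i > 0:
        rate_inf = beta_hat * c * s * i / n
        rate_rec = gamma * i
        total = rate_inf + rate_rec
        t += rng.expovariate(total)        # exponential waiting time
        if rng.random() * total < rate_inf:
            s, i = s - 1, i + 1            # infection event
        else:
            i, r = i - 1, r + 1            # recovery event
    return t, s, r

# a small outbreak in a population of 1000, with the paper's rates
t_end, s_final, r_final = gillespie_sir(990, 10, 0, 0.033, 13, 1 / 7, seed=42)
```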
we will describe the meaning and importance of these numbers as we explain how they fit into an ode model description of the same processes. we begin by introducing the ode form of the standard seir model [16, 52]. because of the large number of model compartments, and of exchange terms between them, that will be featured in the full model, we introduce a systematic notation for the rates that link them. we refer to ∆_{x→y} as the rate at which members of the population move from compartment x to compartment y. for example, ∆_{s→e} is the rate at which susceptible members of the population are exposed to the virus. in addition, for convenience when discussing movements that can happen due to multiple phenomena, we may add a superscript, such as ∆^z_{x→y}, to indicate only the part of that rate that can be ascribed to a given process z. with this notation, the differential equations that describe the standard seir model have the following form:
ds_u/dt = −∆_{s_u→e_u}
de_u/dt = ∆_{s_u→e_u} − ∆_{e_u→i_u}
di_u/dt = ∆_{e_u→i_u} − ∆_{i_u→r_u}
dr_u/dt = ∆_{i_u→r_u}
note that all terms involve compartments identified with u subscripts, as these equations all apply to the undiagnosed part of our model. they will then be expanded upon to include the effects of isolation and testing in the next section. the terms in the above differential equations are defined in the usual way as
∆_{s_u→e_u} = β s_u i_u / n,  ∆_{e_u→i_u} = α e_u,  ∆_{i_u→r_u} = γ i_u,
where β = β̂c is the infection rate, α is the disease progression rate and γ is the disease recovery rate. while this formulation treats the populations as continuous analytical functions, in general these equations describe the mean trajectory of what is fundamentally a stochastic system.
this stochastic system can be simulated with gillespie's algorithm and, up to this point, is equivalent in the continuous limit to an agent-based model featuring the same compartments and transition rates. now we add diagnosis to our description. four more compartments, s_d, e_d, i_d and r_d, are created to keep track of population cohorts who have been identified as potentially infected, and who are thus isolated from the rest of the population as a measure to limit the spread of the disease. disease progression is not affected by this process. including isolation does, however, change the infection rate, as unlike the population i_u, the isolated population i_d does not contribute to further infection; hence we do not include an infection term for it here. this is an idealisation: in reality isolation will not be perfect, and we can imagine a reduced 'cross-infection' rate at which some people belonging to s_u are infected by people in i_d. this could happen with medical professionals treating infectious patients or care workers who maintain a quarantine facility. we could even consider infection of people in s_d due to those in i_d, such as a patient in home isolation infecting their family. however, for present purposes, we will work in an ideal situation where isolation is perfect. finally, we need to incorporate mechanisms to move individuals between the u and d branches of the model. for this purpose we define a testing rate, θ, which represents the fraction of people belonging to i_u who, each day, are diagnosed with the disease. we note that this parameter does not refer to any specific testing procedure; it just represents the total of people who are recognised as having the disease.
it can represent, for example, actual testing for a specific pathogen as well as clinical diagnosis. we only focus on the category i_u, as these are the patients who are most likely to realise they are sick and seek medical help. this generic testing process is described by the equation
∆^θ_{i_u→i_d} = θ i_u.
in addition, people will be released from isolation after a finite time without symptoms. for this reason we do not include a mechanism for people in i_d to return to the u branch of the model, as they are likely to be symptomatic or to test positive for the pathogen. instead, we consider that people who have been isolated despite not being infected, or who are still isolated after having recovered, will return to normal conditions at a rate κ:
∆^κ_{s_d→s_u} = κ s_d,  ∆^κ_{r_d→r_u} = κ r_d.
with this model adaptation, a single infected individual can now take two paths: s_u → e_u → i_u → r_u, in which they are exposed to the disease, become infectious, and finally recover without being isolated or diagnosed, as in the normal seir model, or s_u → e_u → i_u → i_d → r_d → r_u, in which, after becoming infectious, they are identified, isolated, removed from the pool of those who can infect other susceptible people, and, after recovering, released from isolation. having these two paths allows attainment of some degree of control of the epidemic; however, it must be noted that, while we have introduced them, the states s_d and e_d are here left unused. this is because at this stage we associate testing with symptomaticity; there is as yet no mechanism other than diagnosis to identify someone who could be infected. this is especially problematic in terms of the impossibility of isolating exposed people: these are individuals with a latent infection who will soon become infectious, and isolating them pre-emptively would contribute a great deal towards suppressing the epidemic. for this reason, we move on to include contact tracing as a means of preventive isolation.
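putting the pieces so far together, the time derivatives of the eight compartments under testing and isolation, tracing not yet included, can be sketched as follows. this mirrors the ∆ terms in the text; the function name and dict layout are our own illustrative choices, and with no tracing the s_d and e_d states remain empty, as noted above.

```python
def seir_ti_rates(state, beta, alpha, gamma, theta, kappa):
    """time derivatives for seir with testing and isolation (no tracing).
    state: dict with keys Su, Eu, Iu, Ru, Sd, Ed, Id, Rd.
    only Iu transmits; isolation is assumed perfect."""
    n = sum(state.values())
    d_se = beta * state["Su"] * state["Iu"] / n  # exposure of Su by Iu
    d_ei = alpha * state["Eu"]                   # Eu -> Iu progression
    d_ir = gamma * state["Iu"]                   # Iu -> Ru recovery
    d_test = theta * state["Iu"]                 # testing: Iu -> Id
    d_ei_d = alpha * state["Ed"]                 # progression in isolation
    d_ir_d = gamma * state["Id"]                 # recovery in isolation
    d_rel_s = kappa * state["Sd"]                # release: Sd -> Su
    d_rel_r = kappa * state["Rd"]                # release: Rd -> Ru
    return {
        "Su": -d_se + d_rel_s,
        "Eu": d_se - d_ei,
        "Iu": d_ei - d_ir - d_test,
        "Ru": d_ir + d_rel_r,
        "Sd": -d_rel_s,
        "Ed": -d_ei_d,
        "Id": d_test + d_ei_d - d_ir_d,
        "Rd": d_ir_d - d_rel_r,
    }
```

every flow appears once with a plus and once with a minus sign, so the derivatives sum to zero and the total population is conserved.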
we have seen previously that it is intuitive how contact tracing can be represented in an agent-based model, in which individuals are simulated and each has a history of contacts with other members of the population. it is not as obvious how to treat contact tracing in a compartment model, where there is no memory of the histories of contacts of specific individuals, but only average quantities. we outline here a probabilistic method for doing this. let us define pr(x) as the probability that an individual belongs to compartment x of the population. for example, pr(s u ) = s u /n is the probability that an individual is susceptible and undiagnosed. in addition, let us define pr(c i ) as the probability that an individual has had contact with an infectious individual in the past, where that infectious individual is still infectious. the latter detail is important because here we consider only "next-generation" tracing; in other words, we only try to trace the direct contacts of those infectious individuals who were found to test positive. this is a conservative assumption. it could be possible to make contact tracing more effective by also tracing one generation further (the contacts of the contacts), but because the process requires exponentially more resources with each generation, with decreasing likelihood of correctly identifying exposed or infectious individuals, we simply opt to neglect that possibility. therefore, in this model the only people who can be traced are those whose most recent infectious contact is still infectious; once they recover, they cannot be identified as infectious any more, and thus it will be impossible to trace their contacts as well.
finally, we define pr(c t ) as the probability that an individual is traced. all these probabilities are functions of time, quantities that evolve along with the model itself. first, by the law of total probability, the probability of being traced is

pr(c t ) = pr(c t |c i ) pr(c i ) + pr(c t |¬c i ) pr(¬c i ),

where pr(c t |c i ) is the conditional probability of being traced given that one has had an infectious contact in the past, and pr(c t |¬c i ) the probability of being traced given that one has not. clearly, pr(¬c i ) = 1 − pr(c i ). if we ignore the possibility of false positives, then pr(c t |¬c i ) = 0; namely, a person can only be traced if they did have an infectious contact in the past. if we then set an 'efficiency' parameter η representing the fraction of contacts that we are indeed able to identify, the probability of being traced at a given time is simply

pr(c t ) = η pr(c i ).

to derive transition rates among compartments, we consider that individuals will be traced proportionally to how quickly the infectious individuals who originally infected them are, themselves, identified. we add a factor χ to account for the speed of the tracing process itself, and we find a global tracing rate. it then follows that, for individuals in a given compartment x, the rate at which they are isolated by contact tracing is proportional to pr(c i |x), where in the last step we made use of bayes' theorem [58]. this is our eq 1, the central mathematical result of this paper. the difficulty is then computing the exact probabilities. these are functions that, in general, vary in time and require a certain degree of information about the past. we need to define useful assumptions and approximations in order to work with these probabilities in a model that inherently lacks any memory of the individual histories of the elements of its population. one simple assumption for exposed and infectious individuals is

pr(c i |e u ) = pr(c i |i u ) = 1, (18)

meaning that we assume that if an individual has been exposed or infected, they must also have had an infectious contact in the recent past.
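the two identities above can be transcribed directly. the following small numerical sketch is our own: the function names are ours, and the tracing-rate expression is our reading of eq 1 as rate = ηθχ · pr(c i |x) · x.

```python
def prob_traced(eta, p_ci, p_ct_given_not_ci=0.0):
    """Law of total probability for being traced:
    Pr(C_T) = Pr(C_T|C_I) Pr(C_I) + Pr(C_T|not C_I) (1 - Pr(C_I)).
    With no false positives and Pr(C_T|C_I) = eta, this collapses to
    Pr(C_T) = eta * Pr(C_I)."""
    return eta * p_ci + p_ct_given_not_ci * (1.0 - p_ci)

def tracing_rate(eta, theta, chi, p_ci_given_x, x):
    """Rate at which members of compartment X are isolated by tracing:
    proportional to the testing rate theta, the tracing speed chi, the
    efficiency eta, and N(C_I|X) = Pr(C_I|X) * X. Bayes' theorem is what
    turns Pr(X|C_I) Pr(C_I) into Pr(C_I|X) Pr(X)."""
    return eta * theta * chi * p_ci_given_x * x
```

for instance, with η = 0.8 and pr(c i ) = 0.1, the probability of being traced is 0.08; the rest of the section is about estimating pr(c i |x) for each compartment.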
this is in fact the reason why contact tracing is an effective use of resources: it skews heavily towards identifying those who have in fact been exposed to the disease. we remark that this assumption does not hold in general in circumstances where it is possible for an individual to become infected indirectly, such as by contact with contaminated surfaces. for present purposes we assume that the likelihood of such events is small compared with the likelihood of being infected through contact with another individual. another limit of this assumption is that we have defined pr(c i ) as the probability of having had an infectious contact who is still infectious. for α ≪ γ, or for some infectious individuals who may take a long time to recover, their original infector might have already recovered in the time it takes for them to be tested. however, here we study a model in which α > γ, and it is reasonable to assume that those infectious individuals who are tested are identified relatively early on in their infection, especially if θ > γ. therefore, we deem the assumption in equation 18 acceptable at least insofar as these two conditions hold and indirect infection is unlikely. estimating pr(c i |s u ) and pr(c i |r u ) is more complicated. one possible approximation is to work as if i u were constant on the time-scales of interest; in that case we would have the approximate expressions of equations 19 and 20, where γ′ is the overall rate at which individuals are removed from the i u state. putting together recovery, regular testing, and contact tracing, we find γ′ = γ + θ(1 + ηχ).
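for concreteness, the combined removal rate γ′ = γ + θ(1 + ηχ) can be evaluated numerically (the values below are illustrative only, not fitted):

```python
def removal_rate_iu(gamma, theta, eta, chi):
    """Overall rate gamma' at which individuals leave I_u:
    recovery (gamma) + direct testing (theta) + tracing triggered by
    each detection (theta * eta * chi)."""
    return gamma + theta * (1.0 + eta * chi)
```

for example, γ = 0.1, θ = 0.2, η = 0.5 and χ = 1.0 give γ′ = 0.4 per day: testing and tracing together quadruple the rate at which individuals leave i u compared with recovery alone.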
the main difference between the two equations is determined by the fact that someone in s u might still be infected, and thus only has a probability 1 − β of remaining susceptible after a contact with an infectious member of the population, whereas for recovered individuals this is not an issue any more. equations 19 and 20 can be used to compute rates of contact tracing by combining them with equation 1. however, here we try to go beyond the crude approximation of constant i u , as it may often reflect reality very poorly. consider, for example, the total number of members of s u who have also had recent infectious contacts, n(c i |s u ) = pr(c i |s u ) s u . we can describe these in first approximation by an integral over past contacts (equation 21), where the f x (t, τ) are the 'survival functions' for the state x; in other words, these are the functions that determine how likely it is that an individual who was in x at time τ is still in the same state at time t. we also use f i , meaning the survival function of the total number of infectious individuals, i = i u + i d , because here we focus on overall infectiousness, not on whether one might have been isolated before recovery. note, however, that only i u individuals participate in contacts. the reason this is an approximation is that we are not excluding the n(c i |s u ) from the pool of s u that can be contacted, and thus there is a risk of double counting. that risk will remain negligible as long as n(c i |s u )/s u is small; therefore, this model will perform better in a regime in which there are few infectious individuals, and thus few contacts. this is in fact the regime in which contact tracing is most likely to be feasible in practice: to control small outbreaks, rather than in the presence of an uncontrolled epidemic. regardless, we show in the results section that even when this approximation does not hold, while it results in oscillatory behaviour early on, it still adequately describes the overall trends and the long-term equilibrium.
equation 21 is equivalent to the integral form of an equation for a compartment model [59]. (medrxiv preprint, doi: 10.1101/2020.05.14.20101808; this version posted may 19, 2020.) it can be written in differential form as equation 22, where the h x = −(1/f x ) df x /dt are the 'hazard functions' for the state x; in particular, h i = γ. given the similarities between these equations and the ones describing the compartment models, it is natural to think of creating a specific compartment for n(c i |s u ). this is in fact what we do. there is, however, an important difference from regular compartments, because this compartment does not include individuals that exclusively belong to it; rather, it overlaps with s u . it is more of a device used for book-keeping purposes, to compute the integral in equation 21 within the confines of the model, than a compartment in the usual sense. we similarly define n(c i |e u ), n(c i |i u ) and n(c i |r u ), which leads, using equation 1, to the contact tracing rates ∆(c t ) of equations 23-26. in addition, we establish transition rates between these n compartments (equations 27-37). there is a lot going on in equations 27-37; most importantly, these new compartments do not conserve the total size of the population. their membership grows as contacts happen and shrinks as time passes. all the key processes can be summed up as follows:
• elements are 'created' for each state proportionally to the rate of contact with individuals belonging to i u , adjusted by 1 − β in the case of s u to account for the likelihood that the contact is infective.
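the structure just described can be made concrete with a schematic sketch. writing $c$ for an overall contact rate and using the hazard functions $h_x$ defined above, the bookkeeping equation for $N(C_I|S_u)$ has the shape (a reconstruction in our own notation, not necessarily identical to the paper's exact eq 22):

$$
\frac{d}{dt} N(C_I|S_u) \;\approx\;
\underbrace{(1-\beta)\, c\, \frac{S_u I_u}{N}}_{\text{source: new traceable contacts}}
\;-\; \underbrace{\big(h_I + h_{S_u \to S_d}\big)\, N(C_I|S_u)}_{\text{sink: infector recovers, or individual is isolated}}
\;-\; \underbrace{\beta\, c\, \frac{I_u}{N}\, N(C_I|S_u)}_{\text{transfer to } N(C_I|E_u)}
$$

with $h_I = \gamma$, since the condition $C_I$ expires when the original infector recovers. the three terms correspond one-to-one to the 'source', 'sink' and 'movement' bullet points listed in the text.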
these terms are 'sources' and can be recognised by having an arrow with nothing on its left in the subscripts;
• elements 'decay' at a rate that amounts to γ (the hazard function for i, which always appears, as it refers to the original infector) plus a rate representing the hazard function for the transition x u → x d . these terms are 'sinks' and can be recognised by having an arrow with nothing on its right in the subscripts;
• elements move between compartments following the usual transitions that control the dynamics of the seir model (infection, progression of the disease, recovery). these terms are analogous to the corresponding ones connecting x u states, and contribute the remainder of the hazard function for each x u to eq 22 and its equivalents.
it must also be noted that, in practice, considering equation 18, it must be that n(c i |e u ) = e u and n(c i |i u ) = i u , which removes the need for two of the four compartments above and simplifies the equations. a few words are necessary on the hazard function for the x u → x d transitions. this is approximated as ηθχ in states s u and r u even though that is not precisely correct; the correct hazard function would be ηθχ n(c i |x u )/x u , but that introduces a risk of instability for small values of x u . we justify this choice by the following reasoning. in a weak testing regime (ηθχ ≪ γ), n(c i |x u )/x u might be high due to a great number of infected individuals, but in principle it should never be greater than 1 (modulo the point above about double counting); therefore, the hazard function is dominated by γ.
conversely, in a strong testing regime, the number of infected individuals, and thus n(c i |x u )/x u , will be very small, and this assumption will at most end up underestimating the effect of contact tracing (by causing a faster decay in n(c i |x u ) than would otherwise happen). the examples shown in the results section illustrate how this affects the simulations; in general, it leads to good predictions for the behaviour of the e u and i u compartments. equations 6-8, 9-10, 11, 23-26 and 27-37 together define our model entirely. the parameters that appear in these equations are summarised for reference in table 1. we implement the above ordinary differential equations and agent-based model in our ptti python package (https://github.com/ptti/ptti) using the compyrtment [60] package, which facilitates the formulation of initial value problems. it is written for python 3 and makes use of the scientific computation libraries numpy and scipy [61, 62] as well as the numba jit compiler [63]. the ptti package provides a declarative language for specifying simulations of models implemented as python objects. it supports setting model parameters and simulation hyper-parameters, as well as interventions that modify parameters at particular times, to conduct piece-wise simulations reflecting changing conditions in a convenient and user-friendly way. we hope that this software formulation will be useful for easy and rapid exploration of the effects of different intervention scenarios for disease outbreak control. our work outlines a method for extending the classic seir model to include testing, contact-tracing and isolation (tti) strategies. we show that our novel seir-tti model can accurately approximate the behaviour of agent-based models at far less computational cost. our adaptation is applicable across compartmental models (e.g. sir, sis, etc.) and across infectious diseases.
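the piece-wise, intervention-driven style of simulation described here can be illustrated without the ptti package itself. the sketch below is our own code, not the ptti api: it runs a plain normalised sir model and switches the transmission rate at segment boundaries, which is the essence of "interventions that modify parameters at particular times".

```python
def sir_step(state, beta, gamma=0.1, dt=0.05):
    """One forward-Euler step of a normalised (N = 1) SIR model."""
    S, I, R = state
    new_inf = beta * S * I
    return (S - dt * new_inf,
            I + dt * (new_inf - gamma * I),
            R + dt * gamma * I)

def simulate_piecewise(state, segments, dt=0.05):
    """segments: list of (duration_in_days, beta). Parameters change at
    segment boundaries, mimicking timed interventions (e.g. a lockdown)."""
    for days, beta in segments:
        for _ in range(int(days / dt)):
            state = sir_step(state, beta, dt=dt)
    return state
```

for example, comparing a run with beta held at 0.4 against one where beta drops to 0.05 after 30 days shows the intervention leaving a larger susceptible fraction at the end, which is the kind of scenario comparison the declarative interface is meant to make convenient.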
we suggest that the seir-tti model can be applied to the covid-19 pandemic to understand the impact of possible tti strategies to control this outbreak. the importance of modelling to support decision making is widely acknowledged, but models are far more useful when they can accurately represent the classes of interventions that are being considered [15]. the approach described in this paper is based on sound mathematical reasoning that assures accurate and efficient modelling of contact tracing and testing across a wide range of relevant parameter values. the ability to accurately model tti strategies across parameter values is vital for controlling disease outbreaks, including the current covid-19 pandemic. effective testing, contact tracing and isolation strategies have been the key measures that prevented the epidemic from spreading in south korea [64], new zealand and germany [65]. our work is novel in that it is, to date and to the best of our knowledge, the first deterministic model to explicitly incorporate contact tracing; until now this has been done only with agent-based models. an important aspect of our approach is that our ode formulation explains the behaviour of the agent-based model. namely, agent-based models are formulated in terms of local interactions among individuals and exhibit emergent behaviour at the population level. for interesting agent-based models, it is usually difficult to obtain any explicit connection between the local interactions and the population-level dynamics except through simulation and inspection of the results. we argue that our work here shows such an explicit connection: we have been able to capture the dynamics that arise at a population level from testing and contact tracing. we show that this is correct by demonstrating good agreement with the population-level dynamics that emerge from the agent-based formulation, where only local interactions are specified.
the seir-tti model here considers disease propagation in the classical well-mixed setting. this is appropriate especially in circumstances where data are sparse: it gives results qualitatively similar to those of fine-grained models, which might provide more quantitatively accurate results if more detailed data were available. in particular, well-mixed models do not include any notion of the network of contacts across which a contagion spreads in the real world. in reality, individuals in a large population are not equally likely to have contact with one another, and it has long been known [42-44, 46, 47, 66-68] that heterogeneity in the underlying population structure can have a strong effect [36, 69-71] on disease propagation. future work will include developing a better understanding of the relationship between network structure and the effectiveness of tracing, and a mathematical characterisation of the classes of solution available for these models. another extension is investigating the extent to which individual decisions about compliance with measures to reduce disease propagation (voluntary distancing, wearing of masks, etc.) affect the success of containment. a game-theoretical approach such as that considered by zhao et al. [72] may produce useful insights into this question. insights gained from these extensions can inform policy design for relaxing onerous restrictions on the population. an important next step in this work is the real-time, policy-driven application of seir-tti. as our next piece of work we plan to explore how the seir-tti model can be combined with economic analysis to guide decisions around the optimal design of a tti strategy that can suppress the covid-19 epidemic in the uk.
this paper gives a primer on how, using mathematical theory, the classic seir model can be extended to incorporate a testing, contact tracing and isolation strategy. the resulting seir-tti model is a key extension of the widely used seir models, and an important step if these are to be useful in policy decision making during outbreaks. the long and successful history of testing, contact tracing and isolation in slowing and stopping the spread of infectious diseases is well known [56], with clear immediate importance for covid-19 control [73]. the design of policies that include a variety of infectious disease control tools, and understanding and applying them in ways that are effective for society at large, is critical. tools and models that allow policymakers to better understand the policies and the dynamics of a disease are therefore essential. if making policy decisions without evidence is flying blind, making decisions without understanding the consequences of the various control measures is flying without flight controls. models like seir-tti can inform policymakers of the role that testing and tracing can play in preventing the spread of disease. combined with economic and policy analysis, this can enable far better decision making both in the immediate future and in the longer term. the next step in our work is indeed this: the application of the seir-tti model combined with economic models to investigate the effect of different tti strategies to control the covid-19 epidemic in the uk.
world health organization.
who director-general's opening remarks at the media briefing on covid-19 -13
impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
containing papers of a mathematical and physical character
controlling infectious disease outbreaks: lessons from mathematical modelling
modeling contact tracing in outbreaks with application to ebola
the contribution of biological, mathematical, clinical, engineering and social sciences to combatting the west african ebola epidemic
assessing optimal target populations for influenza vaccination programmes: an evidence synthesis and modelling study
how should hiv resources be allocated? lessons learnt from applying optima hiv in 23 countries
exploring the role of mass immunisation in influenza pandemic preparedness: a modelling study for the uk context
the efficacy of contact tracing for the containment of the 2019 novel coronavirus (covid-19)
effectiveness of isolation, testing, contact tracing and physical distancing on reducing transmission of sars-cov-2 in different settings
isolation and contact tracing can tip the scale to containment of covid-19 in populations with social distancing
age-dependent effects in the transmission and control of covid-19 epidemics
modelling sars-cov2 spread in london: approaches to lift the lockdown
improving decision support for infectious disease prevention and control: aligning models and other tools with policymakers' needs
infectious diseases of humans: dynamics and control
modeling infectious disease dynamics in the complex landscape of global health
mathematical modeling in epidemiology
an introduction to stochastic epidemic models
stochastic epidemic models: a survey
epidemiology of transmissible diseases after elimination
mathematical modeling in epidemiology
methods and models in mathematical biology: deterministic and stochastic approaches.
lecture notes on mathematical modelling in the life sciences
some epidemiological models with delays
mathematical approaches for emerging and reemerging infectious diseases: an introduction
time delays in epidemic models
the effect of integral conditions in certain equations modelling epidemics and population growth
solution of delay differential equations via a homotopy perturbation method
global stability of an sir epidemic model with time delays
global asymptotic stability of an sir epidemic model with distributed time delay
global behavior of an seirs epidemic model with time delays
global behavior and permanence of sirs epidemic model with time delay
estimation for discrete time branching processes with application to epidemics
mathematical modeling in epidemiology
a primer on stochastic epidemic models: formulation, numerical simulation, and analysis. infectious disease modelling
individual-based perspectives on r0
agent-based simulation tools in computational epidemiology
formalizing the role of agent-based modeling in causal inference and epidemiology
a taxonomy for agent-based models in human infectious disease epidemiology
agent-based modeling in public health: current applications and future directions
individual-based computational modeling of smallpox epidemic control strategies
epidemic dynamics and endemic states in complex networks
when individual behaviour matters: homogeneous and network models in epidemiology
contact network epidemiology: bond percolation applied to infectious disease prediction and control
reasoning about a highly connected world
spatial epidemiology of networked metapopulation: an overview
mathematics of epidemics on networks: from exact to approximate models
modelling strategies for controlling sars outbreaks
modeling the impact of social distancing, testing, contact tracing and household quarantine on second-wave scenarios of the covid-19 epidemic.
institute for biocomputation and physics of complex systems preprint
modelling the covid-19 epidemic and implementation of population-wide interventions in italy
social distancing strategies for curbing the covid-19 epidemic
seasonality and period-doubling bifurcations in an epidemic model
formal molecular biology
the kappa language and tools
contact tracing and disease control
exact stochastic simulation of coupled chemical reactions
an essay towards solving a problem in the doctrine of chances. by the late rev
time-varying and state-dependent recovery rates in epidemiological models
python for scientific computing
python for scientific computing
numba: a llvm-based python jit compiler. llvm '15
transmission potential and severity of covid-19 in south korea
countries test tactics in 'war' against covid-19
comparison of populations whose growth can be described by a branching stochastic process: with special reference to a problem in epidemiology
heterogeneity in disease-transmission modeling
epidemic spreading in real networks: an eigenvalue viewpoint
modeling covid-19 on a network: super-spreaders, testing and containment
the disease-induced herd immunity level for covid-19 is substantially lower than the classical herd immunity level
individual variation in susceptibility or exposure to sars-cov-2 lowers the herd immunity threshold
strategic decision making about travel during disease outbreaks: a game theoretical approach
universal weekly testing as the uk covid-19 lockdown exit strategy

the authors would like to thank greg colbourn, vincent danos, gabriel goh and rafaele vardavas for insightful comments on early drafts of this manuscript. this work used the cirrus uk national tier-2 hpc service at epcc (http://www.cirrus.ac.uk) funded by the university of edinburgh and epsrc (ep/p020267/1). ww was supported by the chief scientist office scotland (cov/edi/20/12).
jpg was supported by the national institute for health research (nihr) applied health research and care north thames at bart's health nhs trust (nihr arc north thames). the funders had no role in study design, data collection, data analysis, data interpretation, or writing of the report. the views expressed in this article are those of the authors and not necessarily those of the nhs, the nihr, or the department of health and social care. ss, ww and jpg came up with the idea of the study. ss, ww and jpg developed the seir-tti model with input from tc and dm. ss and ww coded the model. ww, ss and jpg drafted the paper with input from tc and dm. the final version of the paper was approved by all authors.

key: cord-354627-y07w2f43 authors: pinter, g.; felde, i.; mosavi, a.; ghamisi, p.; gloaguen, r. title: covid-19 pandemic prediction for hungary; a hybrid machine learning approach date: 2020-05-06 journal: nan doi: 10.1101/2020.05.02.20088427 sha: doc_id: 354627 cord_uid: y07w2f43

several epidemiological models are being used around the world to project the number of infected individuals and the mortality rates of the covid-19 outbreak. advancing accurate prediction models is of utmost importance to take proper actions. due to a high level of uncertainty, or even a lack of essential data, the standard epidemiological models have been challenged to deliver higher accuracy for long-term prediction. as an alternative to susceptible-infected-resistant (sir)-based models, this study proposes a hybrid machine learning approach to predict the covid-19 outbreak, and we exemplify its potential using data from hungary. the hybrid machine learning methods of adaptive network-based fuzzy inference system (anfis) and multi-layered perceptron-imperialist competitive algorithm (mlp-ica) are used to predict time series of infected individuals and the mortality rate. the models predict that by late may, the outbreak and the total mortality will drop substantially.
the validation is performed for nine days with promising results, which confirms the model accuracy. it is expected that the model maintains its accuracy as long as no significant interruption occurs. based on the results reported here, and due to the complex nature of the covid-19 outbreak and variation in its behavior from nation to nation, this study suggests machine learning as an effective tool to model the outbreak. this paper provides an initial benchmarking to demonstrate the potential of machine learning for future research. severe acute respiratory syndrome coronavirus 2, also known as sars-cov-2, is reported as the virus strain causing the respiratory disease covid-19 [1]. the world health organization (who) and the global nations confirmed the coronavirus disease to be extremely contagious [2, 3]. the covid-19 pandemic has been widely recognized as a public health emergency of international concern [4]. to estimate the outbreak, identify the peak ahead of time, and predict the mortality rate, epidemiological models have been widely used by officials and the media. outbreak prediction models have proven essential for communicating insights into the likely spread and consequences of covid-19. furthermore, governments and other legislative bodies have used the insights from prediction models to suggest new policies and to assess the effectiveness of enforced policies [5]. the covid-19 pandemic has been reported to spread extremely aggressively [6]. due to the complex nature of the covid-19 outbreak and its irregularity across countries, the standard epidemiological models, i.e., susceptible-infected-resistant (sir)-based models, have been challenged to deliver high performance for individual nations.
furthermore, as the covid-19 outbreak showed significant differences from other recent outbreaks, e.g., ebola, cholera, swine fever, h1n1 influenza, dengue fever, and zika, advanced epidemiological models have emerged to provide higher accuracy [7]. nevertheless, due to the several unknown variables involved in the spread, the complexity of population-wide behavior in various countries, and differences in containment strategies, model uncertainty has been reported to be inevitable [8-10]. consequently, standard epidemiological models face new challenges in delivering reliable results. the strategy of standard sir models is built around the assumption that the infectious disease is transmitted through contacts, considering three classes: susceptible, infected, and recovered [11]. the susceptible-to-infection (class s), infected (class i), and removed (class r) populations build the foundation of epidemiological modeling. note that the definitions of the classes may vary. for instance, r often refers to those who have either had the disease and recovered, developed immunity, been isolated, or passed away; however, in some countries r is susceptible to being infected again, and there is uncertainty in assigning r a value. advancing sir-based models requires several assumptions. it is assumed that class i transmits the infection to class s, where the number of probable transmissions is proportional to the total number of contacts, computed using basic differential equations as follows [12-14]:

\dot{s} = -\beta x s, \qquad \dot{x} = \beta x s, (1)

where x, s, and \beta represent the infected population, the susceptible population, and the daily reproduction rate, respectively. the value of s in the time series produced by the differential equation gradually declines. at the early stage of the outbreak, it is assumed that s ≈ 1, so that the equation for x becomes linear. eventually the class i can be stated as follows:

\dot{x} = \beta x - \gamma x, \qquad x(t) = x(0)\, e^{(\beta - \gamma) t}, (2)

where \gamma regulates the daily removal rate.
furthermore, the individuals excluded from the model are computed as follows:

r(t) = \gamma \int_0^t x(\tau)\, d\tau. (3)

considering the above assumptions, the outbreak modeling with sir is finally computed as follows:

\dot{s} = -\beta x s, \qquad \dot{x} = \beta x s - \gamma x, \qquad \dot{r} = \gamma x.

furthermore, to evaluate the performance of the sir-based models, the median success of the outbreak prediction is used (equation 4). several analytical solutions to the sir models have been provided in the literature [15, 16]. as different nations take different actions toward slowing down the outbreak, the sir-based model must be adapted according to the local assumptions [17]. the inaccuracy of many sir-based models in predicting the outbreak and mortality rate has been evidenced during covid-19 in many nations. the key to the success of an sir-based model lies in choosing the right model for the context and the relevant assumptions. sis, sird, msir, seir, seis, mseir, and mseirs models are among the popular models used for predicting covid-19 outbreaks worldwide. more advanced variations of sir-based models carefully consider vital dynamics and a constant population [16]. for instance, when no long-lasting immunity is assumed, i.e., immunity is not acquired upon recovery from infection, the susceptible-infectious-susceptible (sis) model is suggested [18]. in contrast, the susceptible-infected-recovered-deceased (sird) model is used when immunity is assumed [19]. in the case of covid-19, different nations took different approaches in this regard. seir models have been reported to be among the most popular tools for predicting the outbreak. by considering the significant incubation period of an infected person, seir models are reported to provide relatively more accurate predictions. in the cases of the varicella and zika outbreaks, seir models showed increased accuracy [20, 21]. seir models assume that the incubation period is a random variable and that, similarly to the sir model, there is a disease-free equilibrium [22, 23].
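the sird variant mentioned above can be sketched as a direct extension of the sir system, adding a death rate μ for infected individuals. this is a minimal illustration with placeholder parameters, not fitted to any country's data:

```python
def sird(beta=0.25, gamma=0.08, mu=0.01, N=10_000_000, I0=100,
         days=200, dt=0.1):
    """Forward-Euler SIRD: like SIR, but infected individuals also die at
    per-day rate mu, accumulating in a deceased compartment D."""
    S, I, R, D = N - I0, float(I0), 0.0, 0.0
    for _ in range(int(days / dt)):
        inf = beta * S * I / N            # new infections this instant
        dS, dI = -inf, inf - (gamma + mu) * I
        dR, dD = gamma * I, mu * I        # recoveries and deaths
        S += dt * dS; I += dt * dI; R += dt * dR; D += dt * dD
    return S, I, R, D
```

one simple estimate of the total mortality rate discussed later, n(deaths)/n(infecteds), is then d / (i + r + d) at the end of the run, since i + r + d counts everyone who was ever infected.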
it should be noted, however, that seir models cannot fit well when the contact network is non-stationary through time [24] . social mixing, a key source of non-stationarity, determines the basic reproduction number r0, the expected number of secondary infections generated by a single infected individual. the value of r0 for covid-19 was estimated to be about 4, which greatly accelerated the pandemic [1] . the lockdown measures aimed at reducing the r0 value down to 1. nevertheless, seir models are reported to be difficult to fit in the case of covid-19 due to the non-stationarity of mixing, caused by nudging intervention measures. therefore, to develop accurate sir-based models, in-depth information about social movement and the quality of lockdown measures would be essential. another drawback of sir-based models is the short lead time: as the lead time increases, the accuracy of the model declines. for instance, for the covid-19 outbreak in italy, the accuracy of the model reduces from 1 for the first five days to 0.86 for day 6 [17] . overall, sir-based models are accurate only if, first, the state of social interactions is stable and, second, class r can be computed precisely. to better estimate class r, several data sources can be integrated with sir-based models, e.g., social media and call data records (cdr), although a high degree of uncertainty and complexity still remains [25] [26] [27] [28] [29] [30] [31] [32] . considering the above uncertainties involved in the advancement of sir-based models, their generalization ability is yet to be improved to achieve scalable models with high performance [33] . due to the complexity and large-scale nature of the problem of developing epidemiological models, machine learning has recently gained attention for building outbreak prediction models. ml has already shown promising results in advancing higher generalization ability and greater prediction reliability for longer lead times [34] [35] [36] [37] [38] .
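the threshold role of r0 mentioned above (growth when r0 > 1, containment at r0 ≤ 1) follows directly from the sign of di/dt at the start of an outbreak. a minimal sketch, with r0 = beta/gamma and illustrative values:

```python
# sign of dI/dt at t = 0 decides initial growth: with r0 = beta/gamma,
# the infected class grows only when r0 * s0 > 1 (values illustrative)
def grows_initially(beta, gamma, s0=0.99, i0=0.01):
    di_dt = beta * s0 * i0 - gamma * i0
    return di_dt > 0
```

for instance, beta = 0.4 and gamma = 0.1 (r0 = 4, the estimate quoted above) gives initial growth, while beta = gamma (r0 = 1) does not, which is exactly the target of the lockdown measures.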
machine learning has already been recognized as a computing technique with great potential in outbreak prediction. applications of ml in outbreak prediction include several algorithms, e.g., random forest for swine fever [54, 55] , survival prediction [56] , and icu demand prediction [57] . furthermore, non-peer-reviewed sources suggest numerous potential uses of machine learning to fight covid-19. among the noted applications are improving existing prediction models, identifying vulnerable groups, early diagnosis, advancing drug delivery, evaluating the probability of the next pandemic, advancing integrated systems for spatio-temporal prediction, evaluating the risk of infection, advancing reliable biomedical knowledge graphs, and data mining of social networks. as stated in our former paper, machine learning can also be used for data preprocessing. improving the quality of the data can particularly improve the quality of an sir-based model. for instance, the number of cases reported by worldometer is not precisely the number of exposed cases (e in the seir model), and the number of infectious people (i in seir) cannot be easily determined, as many people who might be infectious may not turn up for testing. likewise, the numbers of people admitted to hospital and deceased will not fully support r, as most covid-19 positive cases recover without entering hospital. given these data problems, it is extremely difficult to fit seir models satisfactorily. considering such challenges, for future research the ability of machine learning to estimate the missing information on the number of exposed (e) or infected (i) individuals can be evaluated. along with the prediction of the outbreak, prediction of the total mortality rate (n(deaths)/n(infected)) is also essential to accurately estimate the number of potential patients in critical condition and the required beds in intensive care units.
(this preprint was not certified by peer review; it is made available under a cc-by 4.0 international license, and the author/funder has granted medrxiv a license to display the preprint in perpetuity. this version was posted may 6, 2020.) although the research is in the very early stage, the trend in outbreak prediction with machine learning can be classified in two directions: first, improvement of sir-based models, e.g., [53, 58] , and second, time-series prediction [59, 60] . consequently, the state-of-the-art machine learning methods for outbreak modeling suggest two major research gaps for machine learning to address: improvement of sir-based models, and advancement of outbreak time-series prediction. considering the drawbacks of sir-based models, machine learning should be able to contribute. this paper contributes to the advancement of time-series modeling and prediction of covid-19. although ml has long been established as a standard tool for modeling natural disasters and weather forecasting [61] [62] [63] [64] [65] , its application in modeling outbreaks is still in the early stages, and more sophisticated ml methods are yet to be explored. a recent paper by ardabili et al. [50] explored the potential of mlp and anfis in time-series prediction of covid-19 in several countries. the contribution of the present paper is to improve the quality of prediction by proposing a hybrid machine learning method and comparing the results with anfis. in the present paper the time series of the total mortality is also included. the rest of this paper is organized as follows. section two describes the methods and materials. the results are given in section three. section four presents conclusions. the dataset consists of the statistical reports of covid-19 cases and the mortality rate of hungary, available at: https://www.worldometers.info/coronavirus/country/hungary/.
figures 1 and 2 present the total and daily reports of covid-19 statistics, respectively, from 4 march to 19 april. in the present study, modeling is performed by machine learning methods. training is the basis of these methods, as it is for many artificial intelligence (ai) methods [66, 67] . according to some psychologists, humans and other living things interact with their surroundings by trial and error and thereby achieve the best performance toward reaching a goal. based on this theory, and using the ability of computers to repeat a set of instructions, computer programs can be made to interact with an environment by updating values and optimizing functions, according to the results of that interaction, to solve a problem or achieve a specific goal. the way values and parameters are updated over successive repetitions is called a training algorithm [68] [69] [70] . one family of such methods is neural networks (nn), in which software programs modeled on the connections between neurons in the human brain are designed to solve various problems. tasks such as classification, clustering, or function approximation are performed by nns using appropriate learning methods [71, 72] . training the algorithm is the initial and most important step in developing a model [73, 74] . developing a predictive ai model requires a dataset categorized into two sections, i.e., input(s) (the independent variable(s)) and output(s) (the dependent variable(s)) [75] . in the present study, time-series data have been considered as the independent variables for the prediction of covid-19 cases and the mortality rate (the dependent variables).
the time-series dataset was prepared based on two scenarios, as described in table 1. the first scenario uses as its four inputs the cases or mortality rate on the last four consecutive odd days for the prediction of xt, the next day's cases or mortality rate; the second scenario uses as its four inputs the cases or mortality rate on the last four consecutive even days for the prediction of xt. individual machine learning models have known limitations in this setting [76, 77] . for this reason, hybrid methods have been developed [76, 77] . hybrid methods contain a predictor and one or more optimizers [76, 77] . the present study develops a hybrid mlp-ica method as a robust hybrid algorithm for predicting covid-19 cases and the mortality rate in hungary. the ica is a method in the field of evolutionary computation that seeks to find the optimal answer to various optimization problems. through mathematical modeling, this algorithm provides a socio-politically inspired evolutionary algorithm for solving mathematical optimization problems [78] . like all algorithms in this category, the ica starts from an initial set of possible answers, known in the ica as countries. the ica gradually improves these initial responses (countries) and ultimately provides the appropriate answer to the optimization problem [78, 79] . the algorithm is based on the mechanisms of assimilation, imperialist competition, and revolution. by imitating the process of social, economic, and political development of countries and by mathematically modeling parts of this process, it provides operators, in the form of a regular algorithm, that can help solve complex optimization problems [78, 80] .
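the two sampling scenarios above can be sketched as lagged-window construction over the raw series. the exact lag offsets of table 1 are not reproduced here, so the odd/even lag lists below are an assumption based on the description:

```python
# build (inputs, target) samples from a univariate series using a fixed
# set of day lags; the target x_t is the next day's cases or mortality
def make_samples(series, lags):
    samples = []
    start = max(lags)
    for t in range(start, len(series)):
        inputs = [series[t - lag] for lag in lags]
        samples.append((inputs, series[t]))
    return samples

ODD_LAGS = [7, 5, 3, 1]   # scenario 1: last four odd days back (assumed)
EVEN_LAGS = [8, 6, 4, 2]  # scenario 2: last four even days back (assumed)
```

each resulting pair of four lagged observations and one target value is one training sample for the mlp or anfis predictor.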
in fact, this algorithm views candidate answers to the optimization problem as countries and tries to gradually improve them during a repetitive process, eventually reaching the optimal answer [78, 80] . in an nvar-dimensional optimization problem, a country is an array of length nvar × 1, defined (in standard ica notation) as country = [p1, p2, ..., pnvar]. to start the algorithm, an initial population of ncountry countries is created. the nimp best members of the population (the countries with the lowest cost-function values) are selected as imperialists. the remaining ncol countries form colonies, each belonging to an empire. to divide the initial colonies among the imperialists, each imperialist is given a number of colonies proportional to its power [78, 80] . figure 3 symbolically shows how the colonies are divided among the imperialist powers. figure 3. the initial empires generation [78] . integrating this model with a neural network means the network error is defined as the cost function; as the weights and biases change, the output of the network improves and the resulting error decreases [81] . an adaptive neuro-fuzzy inference system (anfis) is a type of artificial neural network based on the takagi-sugeno fuzzy system [82, 83] . this method was developed in the early 1990s. since the system integrates neural networks with concepts of fuzzy logic, it can take advantage of the capabilities of both methods and can represent nonlinear functions [82, 83] . figure 4 presents the architecture of the developed anfis model.
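the initial-empires step described above can be sketched as follows. this is a standard ica construction under assumed details; in particular the power normalization and the rounding of colony counts may differ from the paper's exact implementation.

```python
import random

# sketch of the initial-empires step of the imperialist competitive
# algorithm (ICA): the n_imp lowest-cost countries become imperialists,
# and the remaining countries are assigned as colonies in proportion to
# each imperialist's normalized power (lower cost -> higher power)
def initial_empires(countries, cost, n_imp):
    ranked = sorted(countries, key=cost)
    imperialists, colonies = ranked[:n_imp], ranked[n_imp:]
    costs = [cost(imp) for imp in imperialists]
    shifted = [max(costs) - c for c in costs]   # normalized power base
    total = sum(shifted) or 1.0                 # guard: all costs equal
    powers = [s / total for s in shifted]
    counts = [round(p * len(colonies)) for p in powers]
    counts[0] += len(colonies) - sum(counts)    # absorb rounding drift
    random.shuffle(colonies)
    empires, start = [], 0
    for imp, k in zip(imperialists, counts):
        empires.append((imp, colonies[start:start + k]))
        start += k
    return empires
```

each (imperialist, colonies) pair is one empire; the assimilation, revolution, and imperialist-competition operators then act on these empires in later iterations.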
evaluations were conducted using the determination coefficient, root mean square error, and mean absolute percentage error. these criteria compare the target and output values and compute a score as an index of the performance and accuracy of the developed methods [93, 94] . table 2 presents the evaluation criteria equations, e.g., rmse = sqrt( (1/n) Σ (xi − yi)^2 ) (8) where n is the number of data points and x and y are, respectively, the predicted (output) and desired (target) values. the performance of the proposed algorithm is evaluated using both training and validation data. the training data are used to train the algorithm and define the best set of parameters to be used in anfis and mlp-ica. after that, the best setup for each algorithm is used to predict outbreaks on the validation samples. it is worth mentioning that, due to the lack of adequate sample data, the training set is used to select the model with higher performance, in order to avoid overfitting. the training step for anfis was performed by employing three membership-function (mf) types, as described in tables 3 and 4. according to table 4, it can be claimed that the gaussian mf provided the lowest error and highest accuracy compared with the other mf types for the prediction of the mortality rate. it can also be claimed that, for the selected mf type, scenario 2 provides higher performance than scenario 1 for the prediction of the mortality rate.
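the three evaluation criteria can be sketched as plain functions. these are the textbook definitions; the exact forms in the paper's table 2 are assumed to match them.

```python
import math

# standard evaluation criteria for predicted (output) vs. desired
# (target) values; mape assumes no target value is zero
def rmse(target, output):
    n = len(target)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(output, target)) / n)

def mape(target, output):
    return 100.0 / len(target) * sum(
        abs((y - x) / y) for x, y in zip(output, target))

def r_squared(target, output):
    mean_t = sum(target) / len(target)
    ss_res = sum((y - x) ** 2 for x, y in zip(output, target))
    ss_tot = sum((y - mean_t) ** 2 for y in target)
    return 1.0 - ss_res / ss_tot
```

lower rmse and mape and a determination coefficient closer to 1 indicate a better fit between the model output and the reported data.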
figure 3 presents the plot diagram for the selected models according to table 3. according to figure 6, mlp-ica under scenario 2 provides a lower deviation from the target values, followed by mlp-ica under scenario 1, than anfis under either scenario. figures 8 and 9 present the total cases and the total mortality rate, respectively, and figures 10 and 11 present the daily predictions from 20 april to 30 july. each figure has two sections, including the reported values and the predicted values. table 7 presents the validation of the mlp-ica and anfis models for the period of 20-28 april. the proposed mlp-ica model presented promising rmse and determination-coefficient values for the prediction of both the outbreak and the total mortality. this approach outperforms commonly used prediction tools in the case of hungary.
more work is required to determine whether this technique is adequate in all cases and for different population types and sizes. nonetheless, the learning approach can overcome imperfect input data. incomplete catalogs can occur because infected persons are asymptomatic, not tested, or not listed in databases. tests in closed environments, such as large aircraft carriers in france and the us, have shown that up to 60% of the infected personnel were asymptomatic. of course, military personnel are not representative of large and mixed populations; nonetheless, this shows that false negatives can be abundant. in emerging countries, access to the laboratory equipment required for testing is extremely limited, which will introduce a bias in the counting. finally, it is unclear whether all cases are registered. in the uk, for example, it took public pressure for the government to make the casualties in retirement hospices known, and there is still doubt in the community as to whether china produced complete data, for political reasons. at the same time, national governments and local administrations implemented containment measures such as confinement and social distancing. these actions have a huge impact on transmissions and thus on cases and casualties. access to modern medical facilities is also a parameter that mainly affects the number of casualties. all these aspects affect traditional estimation procedures, whereas learning algorithms may be able to adapt, especially if multiple datasets are available for a given region. not only can our approach outperform the commonly used sir models, but it requires fewer input data to estimate the trends. while we provide successful results for hungarian data, we need to further test these novel approaches on other databases. nonetheless, the presented results are promising and should encourage the community to adopt these new tools rapidly. although sir-based models have been widely used for modeling the covid-19 outbreak, they carry some degree of uncertainty.
several advancements are emerging to improve the quality of sir-based models for the covid-19 outbreak. as an alternative to the sir-based models, this study proposed machine learning as a new trend in advancing outbreak models. the machine learning approach makes no assumptions about the pandemic or the spread of the infection; instead, it predicts the time series of infected cases as well as total mortality cases. in this study, the hybrid machine learning model mlp-ica and anfis were used to predict the covid-19 outbreak in hungary. the models predict that by late may the outbreak and the total mortality will drop substantially. based on the results reported here, and due to the complex nature of the covid-19 outbreak and the variation in its behavior from nation to nation, this study suggests machine learning as an effective tool for modeling the outbreak. two scenarios were proposed: scenario 1 sampled the odd days and scenario 2 used the even days for training the data. the two machine learning models, anfis and mlp-ica, were trained under both scenarios. a detailed investigation was also carried out to explore the most suitable number of neurons. furthermore, the performance of the proposed algorithm was evaluated using both training and validation data: the training data were used to train the algorithm and define the best set of parameters for anfis and mlp-ica, after which the best setup for each algorithm was used to predict outbreaks on the validation samples. the validation was performed over nine days with promising results, which confirms the model's accuracy. in this study, due to the lack of adequate sample data, the training set was used to choose and evaluate the model with higher performance, in order to avoid overfitting. in future research, as covid-19 progresses in time and more sample data become available, further testing and validation can be used to better evaluate the models.
both models showed promising results in terms of predicting the time series without the assumptions that epidemiological models require. both machine learning models, as alternatives to epidemiological models, showed potential in predicting the covid-19 outbreak as well as estimating total mortality; yet mlp-ica outperformed anfis by delivering accurate results on the validation samples. considering the availability of only a small amount of training data, further investigation is essential to explore the true capability of the proposed hybrid model. it is expected that the model will maintain its accuracy as long as no major interruption occurs; for instance, if other outbreaks were to initiate in other cities, or if the prevention regime changed, the model would naturally not maintain its accuracy. for future studies, advancing deep learning and deep reinforcement learning models is strongly encouraged, along with comparative studies of various ml models for individual countries. (medrxiv preprint, https://doi.org/10.1101/2020.05.02.20088427)
references:
the species severe acute respiratory syndrome-related coronavirus: classifying 2019-ncov and naming it sars-cov-2
a familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster
novel coronavirus (2019-ncov): situation report
coronavirus disease 2019 (covid-19): situation report
covid-19 and italy: what next?
lancet
predicting the impacts of epidemic outbreaks on global supply chains: a simulation-based analysis on the coronavirus outbreak (covid-19/sars-cov-2) case
the forecasting of dynamical ross river virus outbreaks
inter-outbreak stability reflects the size of the susceptible pool and forecasts magnitudes of seasonal epidemics
on the predictability of infectious disease outbreaks
real-time forecasting of hand-foot-and-mouth disease outbreaks using the integrating compartment model and assimilation filtering
supervised forecasting of the range expansion of novel non-indigenous organisms: alien pest organisms and the 2009 h1n1 flu pandemic
testing predictability of disease outbreaks with a simple model of pathogen biogeography
short-term forecasting of bark beetle outbreaks on two economically important conifer tree species
real-time predictions of the 2018-2019 ebola virus disease outbreak in the democratic republic of the congo using hawkes point process models
a note on the derivation of epidemic final sizes
mathematical models of sir disease spread with combined non-sexual and sexual transmission routes (infectious disease modelling)
effective containment explains sub-exponential growth in confirmed cases of recent covid-19 outbreak in mainland china
the effectiveness of fallowing strategies in disease control in salmon aquaculture assessed with an sis model
preventive veterinary medicine
mathematical modelling of the transmission dynamics of ebola virus
research about the optimal strategies for prevention and control of varicella outbreak in a school in a central city of china: based on an seir dynamic model
calibration of a seir-sei epidemic model to describe the zika virus outbreak in brazil
fitting the seir model of seasonal influenza outbreak to the incidence data for russian cities
transmission dynamics of zika fever: a seir based model
real-time prediction of influenza outbreaks in belgium
forecasted size of measles outbreaks associated with vaccination exemptions for schoolchildren
simple framework for real-time forecast in a data-limited situation: the zika virus (zikv) outbreaks in brazil from 2015 to 2016 as an example (parasites & vectors)
predicting social response to infectious disease outbreaks from internet-based news streams
effective network size predicted from simulations of pathogen outbreaks through social networks provides a novel measure of structure-standardized group size
google trends predicts present and future plague cases during the plague outbreak in madagascar: infodemiological study
prediction of dengue outbreaks based on disease surveillance, meteorological and socio-economic data
forecasting respiratory infectious outbreaks using ed-based syndromic surveillance for febrile ed visits in a metropolitan city
superensemble forecast of respiratory syncytial virus outbreaks at national, regional, and state levels in the united states
the norovirus epidemiologic triad: predictors of severe outcomes in us norovirus outbreaks
consensus and conflict among ecological forecasts of zika virus outbreaks in the united states
seasonal difference in temporal transferability of
modified seir and ai prediction of the epidemics trend of covid-19 in china under public health interventions
(covid-19) classification using ct images by machine learning methods
covid-19: automatic detection from x-ray images
utilizing transfer learning with convolutional neural networks
prediction of survival for severe covid-19 patients with three clinical features: development of a machine learning-based prognostic model with clinical data in wuhan
critical care utilization for the covid-19 outbreak in lombardy, italy: early experience and forecast during an emergency response
regression model based covid-19 outbreak predictions in india
a machine learning methodology for real-time forecasting of the 2019-2020 covid-19 outbreak using internet searches, news alerts, and estimates from mechanistic models
covid-19 outbreak prediction with machine learning
flood prediction using machine learning models: literature review
forecasting shear stress parameters in rectangular channels using new soft computing methods
hybrid model of morphometric analysis and statistical correlation for hydrological units prioritization
complete statistical analysis to weather forecasting
advances in machine learning modeling reviewing hybrid and ensemble methods
hybrid machine learning model of extreme learning machine radial basis function for breast cancer detection and diagnosis; a multilayer fuzzy expert system
list of deep learning models (engineering for sustainable future)
deep learning and machine learning in hydrological processes climate change and earth systems: a systematic review
prediction of combine harvester performance using hybrid machine learning modeling and response surface methodology (engineering for sustainable future, lecture notes in networks and systems, springer nature switzerland, 2019)
ardabili, s.; mosavi, a.; varkonyi-koczy, a. systematic review of deep learning and machine learning models in biofuels research
engineering for sustainable future
performance analysis of combine harvester using hybrid model of artificial neural networks particle swarm optimization
comparative analysis of ann-ica and ann-gwo for crop yield prediction
prediction of combine harvester performance using hybrid machine learning modeling and response surface methodology
state of the art survey of deep learning and machine learning models for smart cities and urban sustainability
imperialist competitive algorithm: an algorithm for optimization inspired by imperialistic competition
an imperialist competitive algorithm with memory for distributed unrelated parallel machines scheduling
imperialist competitive algorithm for minimum bit error rate beamforming
evolving artificial neural network and imperialist competitive algorithm for prediction of oil flow rate of the reservoir
adaptive neuro-fuzzy inference system for prediction of water level in reservoir
an expert system approach based on principal component analysis and adaptive neuro-fuzzy inference system to diagnosis of diabetes disease
prediction of the strength and elasticity modulus of gypsum using multiple regression, ann, and anfis models
online genetic-anfis temperature control for
modeling the effects of ultrasound power and reactor dimension on the biodiesel production yield: comparison of prediction abilities between response surface methodology (rsm) and adaptive neuro-fuzzy inference system (anfis)
application of anfis-based subtractive clustering algorithm in soil cation exchange capacity estimation using soil and remotely sensed data
modeling and analysis of significant process parameters of fdm 3d printer using anfis
application of anfis to predict crop yield based on different energy inputs
water quality prediction model utilizing integrated wavelet-anfis model with cross-validation
estimation of wind speed profile using adaptive neuro-fuzzy inference system (anfis)
faizollahzadeh ardabili; mahmoudi, a.; mesri gundoshmian, t.
modeling and simulation controlling system of hvac using fuzzy and predictive (radial basis function, rbf) controllers
modelling temperature variation of mushroom growing hall using artificial neural networks (engineering for sustainable future)
key: cord-340827-vx37vlkf authors: jackson, matthew o.; yariv, leeat title: chapter 14 diffusion, strategic interaction, and social structure date: 2011-12-31 journal: handbook of social economics doi: 10.1016/b978-0-444-53187-2.00014-0 sha: doc_id: 340827 cord_uid: vx37vlkf abstract: we provide an overview and synthesis of the literature on how social networks influence behaviors, with a focus on diffusion. we discuss some highlights from the empirical literature on the impact of networks on behaviors and diffusion. we also discuss some of the more prominent models of network interactions, including recent advances regarding interdependent behaviors, modeled via games on networks. jel classification codes: d85, c72, l14, z13. how we act, as well as how we are acted upon, are to a large extent influenced by our relatives, friends, and acquaintances. this is true of which profession we decide to pursue, whether or not we adopt a new technology, and whether or not we catch the flu. in this chapter we provide an overview of research that examines how social structure impacts economic decision making and the diffusion of innovations, behaviors, and information. we begin with a brief overview of some of the stylized facts on the role of social structure in diffusion in different realms. this is a rich area of study that includes a vast set of case studies suggesting some important regularities. with that empirical perspective, we then discuss insights from the epidemiology and random-graph literatures that help shed light on the spread of infections throughout a society.
contagion of this form can be thought of as a basic but important form of social interaction, where the social structure largely determines patterns of diffusion. this literature presents a rich understanding of questions such as: "how densely connected does a society have to be in order for an infection to reach a nontrivial fraction of its members?", "how does this depend on the infectiousness of the disease?", "how does it depend on the particulars of the social network in place?", "who is most likely to become infected?", and "how widespread is an infection likely to be?", among others. the results apply beyond infectious diseases, touching upon issues ranging from the spread of information to the proliferation of ideas. while such epidemiological models provide a useful look at some types of diffusion, there are many economically relevant applications in which a different modeling approach is needed, and, in particular, where the interaction between individuals requires a game-theoretic analysis. in fact, though disease and the transmission of certain ideas and bits of information can be modeled through mechanical or purely probabilistic diffusion processes, there are other important situations where individuals take decisions and care about how their social neighbors or peers behave. this applies to decisions of which products to buy, which technology to adopt, whether or not to become educated, whether to learn a language, how to vote, and so forth. such interactions involve equilibrium considerations and often have multiple potential outcomes.
for example, an agent might care about the proportion of neighbors adopting a given action, or might require some threshold of stimulus before becoming convinced to take an action, or might want to take an action that is different from that of his or her neighbors (e.g., free-riding on their information gathering if they do gather information, but gathering information him or herself if neighbors do not). here we provide an overview of how the recent literature has modeled such interactions, and how it has been able to meld social structure with predictions of behavior. there is a large body of work that identifies the effects of social interactions on a wide range of applications spanning fields: epidemiology, marketing, labor markets, political science, and agriculture are only a few. while some of the empirical tools for the analysis of social interaction effects have been described in block, blume, durlauf, and ioannides (chapter 18, this volume), and many of their implementations for research on housing decisions, labor markets, addictions, and more, have been discussed in ioannides (chapter 25, this volume), epple and romano (chapter 20, this volume), topa (chapter 22, this volume), fafchamps (chapter 24, this volume), jackson (chapter 12, this volume), and munshi (chapter 23, this volume), we now describe empirical work that ties directly to the models that are discussed in the current chapter. in particular, we discuss several examples of studies that illustrate how social structure impacts outcomes and behaviors. the relevant studies are broadly divided into two classes. first, there are cross-sectional studies that concentrate on a snapshot of time and look for correlations between social interaction patterns and observable behaviors. this class relates to the analysis below of strategic games played by a network of agents.
while it can be very useful in identifying correlations, it is important to keep in mind that identifying causation is complicated without fortuitous exogenous variation or structural underpinnings. second, there are longitudinal studies that take advantage of the inherent dynamics of diffusion. such studies have generated a number of interesting observations and are more suggestive of some of the insights the theoretical literature on diffusion has generated. nonetheless, these sorts of studies also face challenges in identifying causation because of potential unobserved factors that may contemporaneously influence linked individuals. the empirical work on these topics is immense and we provide here only a narrow look at the work that is representative of the type of studies that have been pursued and relate to the focus of this chapter. studies that are based on observations at one point in time most often compare the frequency of a certain behavior or outcome across individuals who are connected as opposed to ones who are not. for example, glaeser, sacerdote, and scheinkman (1996) showed that the structure of social interactions can help explain the cross-city variance in crime rates in the u.s.; bearman, moody, and stovel (2004) examined the network of romantic connections in high school, and its link to phenomena such as the spread of sexually transmitted diseases (see the next subsection for a discussion of the spread of epidemics). such studies provide important evidence for the correlation of behaviors with characteristics of individuals' connections. in the case of diseases, they provide some direct evidence for diffusion patterns. with regards to labor markets, there is a rich set of studies showing the importance of social connections for diffusing information about job openings, dating back to rees (1966) and rees and schultz (1970).
influential studies by granovetter (1973, 1985, 1995) show that even casual or infrequent acquaintances (weak ties) can play a role in diffusing information. those studies were based on interviews that directly ask subjects how they obtained information about their current jobs. other studies, based on outcomes, such as topa (2001), conley and topa (2002), and bayer, ross, and topa (2008), identify local correlations in employment status within neighborhoods in chicago, and consider neighborhoods that go beyond the geographic but also include proximity in other socioeconomic dimensions, examining the extent to which local interactions are important for employment outcomes. bandiera, barankay, and rasul (2008) create a bridge between network formation (namely, the creation of friendships amongst fruit pickers) and the effectiveness of different labor contracts. the extensive literature on networks in labor markets 1 documents the important role of social connections in transmitting information about jobs, and also differentiates between different types of social contacts and shows that even weak ties can be important in relaying information. there is further (and earlier) research that examines the different roles of individuals in diffusion. important work by katz and lazarsfeld (1955) (building on earlier studies of lazarsfeld, berelson, and gaudet (1944), merton (1948), and others), identifies the role of "opinion leaders" in the formation of various beliefs and opinions. individuals are heterogeneous (at least in behaviors), and some specialize in becoming well informed on certain subjects, and then information and opinions diffuse to other less informed individuals via conversations with these opinion leaders. lazarsfeld, berelson, and gaudet (1944) study voting decisions in an ohio town during the 1940 u.s. presidential campaign, and document the presence and significance of such opinion leaders.
katz and lazarsfeld (1955) interviewed women in decatur, illinois, and asked about a number of things such as their views on household goods, fashion, movies, and local public affairs. when women showed a change in opinion in follow-up interviews, katz and lazarsfeld traced influences that led to the change in opinion, again finding evidence for the presence of opinion leaders. diffusion of new products is understandably a topic of much research. rogers (1995) discusses numerous studies illustrating the impacts of social interactions on the diffusion of new products, and suggests various factors that impact which products succeed and which products fail. for example, related to the idea of opinion leaders, feick and price (1987) surveyed 1531 households and provided evidence that consumers recognize and make use of particular individuals in their social network termed "market mavens," those who have a high propensity to provide marketplace and shopping information. whether or not products reach such mavens can influence the success of a product, independently of the product's quality. tucker (2008) uses micro-data on the adoption and use of a new video-messaging technology in an investment bank consisting of 2118 employees. tucker notes the effects of the underlying network in that employees follow the actions of those who either have formal power, or informal influence (which is, to some extent, endogenous to a social network). in the political context, there are several studies focusing on the social sources of information electors choose, as well as on the selective mis-perception of social information they are exposed to. a prime example of such a collection of studies is huckfeldt and sprague (1995), who concentrated on the social structure in south bend, indiana, during the 1984 elections. they illustrated the likelihood of socially connected individuals to hold similar political affiliations. 
in fact, the phenomenon of individuals connecting to individuals who are similar to them is observed across a wide array of attributes and is termed by sociologists homophily (for overviews see mcpherson, smith-lovin, and cook, 2001, and jackson, 2007, as well as the discussion of homophily in jackson, chapter 12 in this volume). while cross-sectional studies are tremendously interesting in that they suggest dimensions on which social interactions may have an impact, they face many empirical challenges. most notably, correlations between behaviors and outcomes of individuals and their peers may be driven by common unobservables and therefore be spurious. given the strong homophily patterns in many social interactions, individuals who associate with each other often have common unobserved traits, which could lead them to similar behaviors. this makes it difficult to draw (causal) conclusions from empirical analysis of the social impact on diffusion of behaviors based on cross-sectional data. 2 given some of the challenges with causal inference based on pure observation, laboratory experiments and field experiments are quite useful in eliciting the effects of real-world networks on fully controlled strategic interactions, and are being increasingly utilized. as an example, leider, mobius, rosenblat, and do (2009) elicited the friendship network among undergraduates at a u.s. college and illustrated how altruism varies as a function of social proximity. in a similar setup, goeree, mcconnell, mitchell, tromp, and yariv (2010) elicited the friendship network in an all-girls school in pasadena, ca, together with girls' characteristics, and later ran dictator games with recipients who varied in social distance. they identified a "1/d law of giving," in that the percentage given to a friend was inversely related to her social distance in the network.
3 various field experiments, such as those by duflo and saez (2003) , karlan, mobius, rosenblat, and szeidl (2009), dupas (2010) , beaman and magruder (2010) , and feigenberg, field, and pande (2010) , also provide some control over the process, while working with real-world network structures to examine network influences on behavior. 4 another approach that can be taken to infer causal relationships is via structural modeling. as an example, one can examine the implications of a particular diffusion model for the patterns of adoption that should be observed. one can then infer characteristics of the process by fitting the process parameters to best match the observed outcomes in terms of behavior. for instance, banerjee, chandrasekhar, duflo, and jackson (2010) use such an approach in a study of the diffusion of microfinance participation in rural indian villages. using a model of diffusion that incorporates both information and peer effects, they then fit the model to infer the relative importance of information diffusion versus peer influences in accounting for differences in microfinance participation rates across villages. of course, in such an approach one is only as confident in the causal inference as one is confident that the model is capturing the essential underpinnings of the diffusion process. the types of conclusions that have been reached from these cross sectional studies can be roughly summarized as follows. first, in a wide variety of settings, associated individuals tend to have correlated actions and opinions. this does not necessarily embody diffusion or causation, but as discussed in the longitudinal section below, there is significant evidence of social influence in diffusion patterns as well. second, individuals tend to associate with others who are similar to themselves, in terms of beliefs and opinions. this has an impact on the structure of social interactions, and can affect diffusion. 
it also represents an empirical quandary of the extent to which social structure influences opinions and behavior as opposed to the reverse (that can partly be sorted out with careful analysis of longitudinal data). third, individuals fill different roles in a society, with some acting as "opinion leaders," and being key conduits of information and potential catalysts for diffusion. longitudinal data can be especially important in diffusion studies, as they provide information on how opinions and behaviors move through a society over time. they also help sort out issues of causation as well as supply specific information about the extent to which behaviors and opinions are adopted dynamically, and by whom. such data can be especially important in going beyond the documentation of correlation between social connections and behaviors, and illustrating that social links are truly the conduits for information and diffusion if one is careful to track what is observed by whom at what point in time, and can measure the resulting changes in behavior. for example, conley and udry (2008) show that pineapple growers in ghana tend to follow those farmers who succeed in changing their levels of use of fertilizers. through careful examination of local ties, and the timing of different actions, they trace the influence of the outcome of one farmer's crop on subsequent behavior of other farmers. more generally, diffusion of new technologies is extremely important when looking at transitions in agriculture. seminal studies by ryan and gross (1943) and griliches (1957) examined the effects of social connections on the adoption of a new behavior, specifically the adoption of hybrid corn in the u.s. looking at aggregate adoption rates in different states, these authors illustrated that the diffusion of hybrid corn followed an s-shaped curve over time: starting out slowly, accelerating, and then ultimately decelerating.
5 foster and rosenzweig (1995) collected household-level panel data from a representative sample of rural indian households having to do with the adoption and profitability of high-yielding seed varieties (associated with the green revolution). they identified significant learning-by-doing, where some of the learning was through neighbors' experience. in fact, the observation that adoption rates of new technologies, products, or behaviors exhibit s-shaped curves can be traced to very early studies, such as tarde (1903) , who discussed the importance of imitation in adoption. such patterns are found across many applications (see mahajan and peterson (1985) and rogers (1995) ). understanding diffusion is particularly important for epidemiology and medicine for several reasons. for one, it is important to understand how different types of diseases spread in a population. in addition, it is crucial to examine how new treatments get adopted. colizza, barrat, barthelemy, and vespignani (2006, 2007) tracked the spread of severe acute respiratory syndrome (sars) across the world combining census data with data on almost all air transit during the years 2002-2003. they illustrated the importance of structures of long-range transit networks for the spread of an epidemic. coleman, katz, and menzel (1966) is one of the first studies to document the role of social networks in diffusion processes. the study looked at the adoption of a new drug (tetracycline) by doctors and highlighted two observations. first, as with hybrid corn, adoption rates followed an s-shape curve over time. second, adoption rates depended on the density of social interactions. doctors with more contacts (measured according to the trust placed in them by other doctors) adopted at higher rates and earlier in time. 6 diffusion can occur in many different arenas of human behavior. for example christakis and fowler (2007) document influences of social contacts on obesity levels. 
they studied the social network of 12,067 individuals in the u.s., assessed repeatedly from 1971 to 2003 as part of the framingham heart study. concentrating on body-mass index, christakis and fowler found that a person's chances of becoming obese increased by 57% if he or she had a friend who became obese, by 40% if he or she had a sibling who became obese, and by 37% if he or she had a spouse who became obese in a previous period. the study controls for various selection effects, and takes advantage of the direction of friendship nominations to help sort out causation. for example, christakis and fowler find a significantly higher increase of an individual's body-mass index in reaction to the obesity of someone that the individual named as a friend compared to someone who had named the individual as a friend. this is one method of sorting out causation, since if unobserved influences that were common to the agents were at work, then the direction of who mentioned the other as a friend would not matter, whereas direction would matter if it indicated which individuals react to which others. based on this analysis, christakis and fowler conclude that obesity spreads very much like an epidemic, with the underlying social structure appearing to play an important role. it is worth emphasizing that even with longitudinal studies, one still has to be cautious in drawing causal inferences. the problem of homophily still looms, as linked individuals tend to have common characteristics and so may be influenced by common unobserved factors, for example, both being exposed to some external stimulus (such as advertising) at the same time. this then makes it appear as if one agent's behavior closely followed another's, even when it may simply be due to both having experienced a common external event that prompted their behaviors.
aral, muchnik, and sundararajan (2009) provide an idea of how large this effect can be, by carefully tracking individual characteristics and then using propensity scores (likelihoods of having neighbors with certain behaviors) to illustrate the extent to which one can over-estimate diffusion effects by not accounting for common backgrounds of connected individuals. homophily not only suggests that linked individuals might be exposed to common influences, it also makes it hard to disentangle which of the following two processes is at the root of observed similarities in behavior between connected agents. it could be that similar behavior in fact comes from a process of selection (assortative pairing), in which similarity precedes association. alternatively, it could be a consequence of a process of socialization, in which association leads to similarity. in that respect, tracking connections and behaviors over time is particularly useful. kandel (1978) concentrated on adolescent friendship pairs and examined the levels of homophily on four attributes (frequency of current marijuana use, level of educational aspirations, political orientation, and participation in minor delinquency) at various stages of friendship formation and dissolution. she noted that observed homophily in friendship dyads resulted from a significant combination of both types of processes, so that individuals emulated their friends, but also tended to drop friendships with those more different from themselves and add new friendships to those more similar to themselves. 7 in summary, let us mention a few of the important conclusions obtained from studies of diffusion. first, not only are behaviors across socially connected individuals correlated, but individuals do influence each other. 
while this may sound straightforward, it takes careful control to ensure that it is not unobserved correlated traits or influences that lead to similar actions by connected individuals, as well as an analysis of similarities between friends that can lead to correlations in their preferences and the things that influence them. second, in various settings, more socially connected individuals adopt new behaviors and products earlier and at higher rates. third, diffusion exhibits specific patterns over time, and specifically there are many settings where an "s"-shaped pattern emerges, with adoption starting slowly, then accelerating, and eventually asymptoting. fourth, many diffusion processes are affected by the specifics of the patterns of interaction. we now turn to discussing various models of diffusion. as should be clear from our description of the empirical work on diffusion and behavior, models can help greatly in clarifying the tensions at play. given the issues associated with the endogeneity of social relationships, and the substantial homophily that may lead to correlated behaviors among social neighbors, it is critical to have models that help predict how behavior should evolve and how it interacts with the social structure in place. we start with some of the early models that do not account for the underlying network architecture per se. these models incorporate the empirical observations regarding social influence through the particular dynamics assumed, or preferences posited, and generate predictions matching the aggregate empirical observations regarding diffusion over time of products, diseases, or behavior, for example the so-called s-shaped adoption curves. after describing these models, we return to explicitly capturing the role of social networks. one of the earliest and still widely used models of diffusion is the bass (1969) model.
this is a parsimonious model, which can be thought of as a "macro" model: it makes predictions about aggregate behavior in terms of the percentage of potential adopters of a product or behavior who will have adopted by a given time. the current rate of change of adoption depends on the current level and two critical parameters. these two parameters are linked to the rate at which people innovate or adopt on their own, and the rate at which they imitate or adopt because others have, thereby putting into (theoretical) force the empirical observation regarding peers' influence. if we let g(t) be the percentage of agents who have adopted by time t, and m be the fraction of agents in the population who are potential adopters, a discrete time version of the bass model is characterized by the difference equation g(t) = g(t−1) + p(m − g(t−1)) + q(m − g(t−1))g(t−1)/m, where p is a rate of innovation and q is a rate of imitation. to glean some intuition, note that the expression p(m − g(t−1)) represents the fraction of people who have not yet adopted and might potentially do so, times the rate of spontaneous adoption. in the expression q(m − g(t−1))g(t−1)/m, the rate of imitation is multiplied by two factors. the first factor, (m − g(t−1)), is the fraction of people who have not yet adopted and may still do so. the second factor, g(t−1)/m, is the relative fraction of potential adopters who are around to imitate. if we set m equal to 1, and look at a continuous time version of the above difference equation, we get dg/dt = (p + qg(t))(1 − g(t)), where dg/dt is the rate of change of g. solving this when p > 0 and setting the initial set of adopters at 0, g(0) = 0, leads to the following expression: g(t) = (1 − e^(−(p+q)t)) / (1 + (q/p)e^(−(p+q)t)). this is a fairly flexible formula that works well at fitting time series data of innovations. by estimating p and q from existing data, one can also make forecasts of future diffusion.
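the discrete-time bass dynamics are easy to simulate directly. the following python sketch (the parameter values p = 0.01 and q = 0.4 are illustrative choices, not taken from any calibration in the chapter) checks numerically that a small innovation rate combined with a larger imitation rate produces an s-shaped cumulative adoption curve.

```python
# discrete-time bass model with m = 1:
# g(t) = g(t-1) + p*(1 - g(t-1)) + q*(1 - g(t-1))*g(t-1)
def bass_path(p, q, m=1.0, steps=60):
    g = [0.0]  # g(0) = 0: no adopters initially
    for _ in range(steps):
        prev = g[-1]
        g.append(prev + p * (m - prev) + q * (m - prev) * prev / m)
    return g

path = bass_path(p=0.01, q=0.4)  # illustrative innovation/imitation rates
deltas = [b - a for a, b in zip(path, path[1:])]
peak = deltas.index(max(deltas))
assert 0 < peak < len(deltas) - 1  # per-period adoption rises, then falls
assert abs(path[-1] - 1.0) < 0.05  # nearly everyone eventually adopts
```

the interior peak of the per-period increments is exactly the acceleration-then-deceleration pattern that makes g(t) s-shaped.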
it has been used extensively in marketing and for the general analysis of diffusion (e.g., rogers (1995)), and has spawned many extensions and variations. 8 if q is large enough, 9 then there is a sufficient imitation/social effect, which means that the rate of adoption accelerates after it begins, and so g(t) is s-shaped (see figure 1), matching one of the main insights of the longitudinal empirical studies on diffusion discussed above. the bass model provides a clear intuition for why adoption curves would be s-shaped. indeed, when the adoption process begins, imitation plays a minor role (relative to innovation) since not many agents have adopted yet, and so the volume of adopters grows slowly. as the number of adopters increases, the process starts to accelerate as now innovators are joined by imitators. the process eventually starts to slow down, in part simply because there are fewer agents left to adopt (the term 1 − g(t) in the difference equation eventually becomes small). thus, we see a process that starts out slowly, then accelerates, and then eventually slows and asymptotes. the bass model described above is mechanical in that adopters and imitators are randomly determined; they do not choose actions strategically. the empirical observation that individuals influence each other through social contact can be derived through agents' preferences, rather than through some exogenously specified dynamics. diffusion in a strategic context was first studied without a specific structure for interactions. broadly speaking, there were two approaches taken in this early literature. in the first, all agents are connected to one another (that is, they form a complete network). effectively, this corresponds to a standard multi-agent game in which payoffs to each player depend on the entire profile of actions played in the population. the second approach has been to look at interactions in which agents are matched to partners in a random fashion. diffusion on complete networks.
granovetter (1978) considered a model in which n agents are all connected to one another and each agent chooses one of two actions: 0 or 1. associated with each agent i is a number n_i. this is a threshold such that if at least n_i other agents take action 1 then i prefers action 1 to action 0, and if fewer than n_i other agents take action 1 then agent i prefers to take action 0. the game exhibits what are known as strategic complementarities. for instance, suppose that the utility of agent i faced with a profile of actions (x_1, . . ., x_n) ∈ {0, 1}^n is described by u_i(x_1, . . ., x_n) = x_i ((Σ_{j≠i} x_j)/(n − 1) − c_i), where c_i is randomly drawn from a distribution f over [0,1]. c_i can be thought of as a cost that agent i experiences upon choosing action 1 (e.g., a one-time switching cost from one technology to the other, or potential time costs of joining a political revolt, etc.). the utility of agent i is normalized to 0 when choosing the action 0. when choosing the action 1, agent i experiences a benefit proportional to the fraction of other agents choosing the action 1 and a cost of c_i. granovetter considered a dynamic model in which at each stage agents best respond to the previous period's distribution of actions. if in period t there was a fraction x_t of agents choosing the action 1, then in period t + 1 an agent i chooses action 1 if and only if his or her cost is lower than (n x_t − x_i^t)/(n − 1), the fraction of other agents taking action 1 in the last period. for a large population, this dynamic is approximated by x_{t+1} = f(x_t), and a fixed point x = f(x) then corresponds to an (approximate) equilibrium of a large population. the shape of the distribution f determines which equilibria are tipping points: equilibria such that only a slight addition to the fraction of agents choosing the action 1 shifts the population, under the best response dynamics, to the next higher equilibrium level of adoption (we return to a discussion of tipping and stable points when we consider a more general model of strategic interactions on networks below).
note that while in the bass model the diffusion path was determined by g(t), the fraction of adopters as a function of time, here it is easier to work with f(x), corresponding to the fraction of adopters as a function of the previous period's fraction x. although granovetter (1978) does not examine conditions under which the time series will exhibit attributes like the s-shape that we discussed above, by using techniques from jackson and yariv (2007) we can derive such results, as we now discuss. keeping track of time in discrete periods (a continuous time analog is straightforward), the level of change of adoption in the society is given by d(x_t) = f(x_t) − x_t. thus, to derive an s-shape, we need this quantity to initially be increasing, and then eventually to decrease. assuming differentiability of f, this corresponds to the derivative of d(x_t) being positive up to some x and then negative. the derivative of f(x) − x is f'(x) − 1, and having an s-shape corresponds to f' being greater than 1 up to some point and then less than 1 beyond that point. for instance, if f is concave with an initial slope greater than 1 and an eventual slope less than 1, this is satisfied. note that the s-shape of adoption over time does not translate into an s-shape of f, but rather a sort of concavity. 10 the idea is that we initially need a rapid level of change, which corresponds to an initially high slope of f, and then a slowing down, which corresponds to a lower slope of f. fashions and random matching. a different approach than that of the bass model is taken by pesendorfer (1995), who considers a model in which individuals are randomly matched and new fashions serve as signaling instruments for the creation of matches. he identifies particular matching technologies that generate fashion cycles. pesendorfer describes the spread of a new fashion as well as its decay over time. in pesendorfer's model, the price of the design falls as it spreads across the population.
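granovetter's best-response dynamics amount to iterating x_{t+1} = f(x_t) from a small seed. the sketch below uses the concave cdf f(x) = sqrt(x) and the seed x_0 = 0.01, both purely illustrative assumptions: f has initial slope greater than 1 and eventual slope less than 1, so per the discussion above the adoption path should accelerate and then slow.

```python
import math

# iterate the best-response map x_{t+1} = F(x_t); fixed points x = F(x)
# are the (approximate) large-population equilibria.
def adoption_path(F, x0, steps=25):
    path = [x0]
    for _ in range(steps):
        path.append(F(path[-1]))
    return path

F = math.sqrt                     # concave cdf on [0,1], initial slope > 1
path = adoption_path(F, x0=0.01)  # small initial fraction of adopters

deltas = [b - a for a, b in zip(path, path[1:])]
peak = deltas.index(max(deltas))
assert 0 < peak < len(deltas) - 1   # change accelerates, then slows: s-shape
assert abs(path[-1] - 1.0) < 1e-3   # converges to the stable fixed point x = 1
```

here x = 0 and x = 1 are the fixed points of f(x) = sqrt(x); any positive seed tips the dynamics toward the stable equilibrium x = 1.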
once sufficiently many consumers own the design, it is profitable to create a new design and thereby render the old design obsolete. in particular, demand for any new innovation eventually levels off as in the above two models. information cascades and learning. another influence on collective behavior derives from social learning. this can happen without any direct complementarities in actions, but due to information flow about the potential payoffs from different behaviors. if people discuss which products are worth buying, or which technologies are worth adopting, books worth reading, and so forth, even without any complementarities in behaviors, one can end up with cascades in behavior, as people infer information from others' behaviors and can (rationally) imitate them. as effects along these lines are discussed at some length in jackson (chapter 12, this volume) and goyal (chapter 15, this volume), we will not detail them here. we only stress that pure information transmission can lead to diffusion of behaviors. we now turn to models that explicitly incorporate social structure in examining diffusion patterns. we start with models that stem mostly from the epidemiology literature and account for the underlying social network, but are mechanical in terms of the way that disease spreads from one individual to another (much like the bass model described above). we then proceed to models in which players make choices that depend on their neighbors' actions as embedded in a social network; for instance, only adopting an action if a certain proportion of neighbors adopt as well (as in granovetter's setup), or possibly not adopting an action if enough neighbors do so. many models of diffusion and strategic interaction on networks have the following common elements. there is a finite set of agents n = {1, . . ., n}. agents are connected by a (possibly directed) network g ∈ {0, 1}^{n×n}. we let n_i(g) = {j : g_ij = 1} be the neighbors of i.
the degree of a node i is the number of her neighbors, d_i = |n_i(g)|. when links are determined through some random process, it is often useful to summarize the process by the resulting distribution of degrees p, where p(d) denotes the probability a random individual has a degree of d. 11, 12 each agent i ∈ n takes an action x_i. in order to unify and simplify the description of various models, we focus on binary actions, so that x_i ∈ {0, 1}. actions can be metaphors for becoming "infected" or not, buying a new product or not, choosing one of two activities, and so forth. some basic insights about the extent to which behavior or an infection can spread in a society can be derived from random graph theory. random graph theory provides a tractable base for understanding characteristics important for diffusion, such as the structure and size of the components of a network, maximally connected subnetworks. 13 before presenting some results, let us talk through some of the ideas in the context of what is known as the reed-frost model. 14 consider, for example, the spread of a disease. initially, some individuals in the society are infected through mutations of a germ or other exogenous sources. consequently, some of these individuals' neighbors are infected through contact, while others are not. this depends on how virulent the disease is, among other things. in this application, it makes sense (at least as a starting point) to assume that becoming infected or avoiding infection is not a choice; 11 such a description is not complete, in that it does not specify the potential correlations between degrees of different individuals on the network. see galeotti, goyal, jackson, vega-redondo, and yariv (2010) for more details. 12 in principle, one would want to calibrate degree distributions with actual data.
The literature on network formation (see Bloch and Dutta, Chapter 16, this volume, and Jackson, Chapter 12, this volume) suggests some insights on plausible degree distributions P(d). [Footnote 13] Formally, these are the subnetworks induced by maximal sets C ⊆ N of nodes such that any two distinct nodes in C are path-connected within C. That is, for any i, j ∈ C, there exist i_1, . . ., i_k ∈ C such that g_{ii_1} = g_{i_1 i_2} = . . . = g_{i_{k−1} i_k} = g_{i_k j} = 1. [Footnote 14] See Jackson (2008) for a more detailed discussion of this and related models.

That is, contagion here is nonstrategic. In the simplest model, there is a probability π ≥ 0 that a given individual is immune (e.g., through vaccination or natural defenses). If an individual is not immune, it is assumed that he or she is sure to catch the disease if one of his or her neighbors ends up with the disease. In this case, in order to estimate the volume of those ultimately infected, we proceed in two steps, depicted in Figure 2. First, we delete the fraction π of nodes that will never be infected (these correspond to the dotted nodes in the figure). Then, we note that the components of the remaining network that contain the originally infected individuals comprise the full extent of the infection. In particular, if we can characterize what the components of the network look like after removing some portion of the nodes, we have an idea of the extent of the infection. In Figure 2, we start with one large connected component (encircled by a dotted line) and two small connected components. After removing the immune agents, there is still a large connected component (though smaller than before), and four small components. Thus, estimating the extent of infection in the society reduces to estimating the component structure of the network.
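The two-step procedure just described (delete the immune nodes, then trace the component containing an initially infected node) can be sketched in a short simulation. The graph model, parameter values, and function names below are illustrative assumptions, not taken from the text.

```python
import random
from collections import deque

def er_graph(n, p, rng):
    """Adjacency sets of a G(n, p) random graph."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def infected_set(adj, seed, immune):
    """Nodes reached from `seed` after deleting immune nodes:
    the component of the seed in the remaining network."""
    if seed in immune:
        return set()
    reached, frontier = {seed}, deque([seed])
    while frontier:
        i = frontier.popleft()
        for j in adj[i]:
            if j not in immune and j not in reached:
                reached.add(j)
                frontier.append(j)
    return reached

rng = random.Random(0)
n, p, pi = 300, 4 / 300, 0.1            # average degree ~4, 10% immune (arbitrary)
adj = er_graph(n, p, rng)
immune = {i for i in range(n) if rng.random() < pi}
ext = len(infected_set(adj, seed=0, immune=immune))
print(ext)  # extent of the infection starting from node 0
```

The component search never visits immune nodes, mirroring the deletion step in Figure 2.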
A starting point for the formal analysis of this sort of model uses the canonical random network model, where links are formed independently, each with an identical probability p > 0 of being present. This is sometimes referred to as a "Poisson random network," as its degree distribution is approximated by a Poisson distribution when p is not excessively large; it has various other aliases, such as an "Erdős–Rényi random graph," a "Bernoulli random graph," or a "G(n, p)" random graph (see Jackson, Chapter 12 in this volume, for more background). Ultimately, the analysis boils down to considering a network on (1−π)n nodes with an independent link probability of p, and then measuring the size of the component containing a randomly chosen initially infected node. Clearly, with a fixed set of nodes and a probability p that lies strictly between 0 and 1, every conceivable network on the given set of nodes could arise. Thus, in order to say something specific about the properties of the networks that are "most likely" to arise, one generally works with large n, where reasoning based on laws of large numbers can be employed. For example, if we let n grow, we can ask for which p's (now dependent on n) a nonvanishing fraction of nodes will become infected with a probability bounded away from 0. So, let us consider a sequence of societies indexed by n and corresponding probabilities of links p(n). Erdős and Rényi (1959, 1960) proved a series of results that characterize some basic properties of such random graphs. In particular:[15]

• The threshold for the existence of a "giant component," a component that contains a nontrivial fraction of the population, is 1/n, corresponding to an average degree of 1. That is, if p(n)/(1/n) tends to infinity, then the probability of having a giant component tends to 1, while if p(n)/(1/n) tends to 0, then the probability of having a giant component tends to 0.
• The threshold for the network to be connected (so that every two nodes have a path between them) is log(n)/n, corresponding to an average degree proportional to log(n).

The logic for the first threshold is easy to explain, though the proof is rather involved. To heuristically derive the threshold for the emergence of a giant component, consider following a link out of a given node, and ask whether one would expect to find a link from that node to yet another node. If the expected degree is much smaller than 1, then following the few (if any) links from any given node is likely to lead to dead ends. In contrast, when the expected degree is much higher than 1, then from any given node one expects to be able to reach more nodes, and then even more nodes, and so forth, so the component should expand outward. Note that adjusting for the fraction π of immune nodes does not affect the above thresholds, as they apply as limiting results, although the factor will be important for any fixed n. Between these two thresholds, there is only one giant component, so the next largest component is of a size that is a vanishing fraction of the giant component. This is intuitively clear: having two large components would require many links within each component but no links between the two components, which is an unlikely event. In that sense, the image that emerges from Figure 2 of one large connected component is reasonably typical for a range of parameter values. These results tell us that in a random network, if the average degree is quite low (smaller than 1), then any initial infection is likely to die out. In contrast, if the average degree is quite high (larger than log(n)), then any initial infection is likely to spread to all of the susceptible individuals, i.e., a fraction 1 − π of the population.
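As a rough numerical illustration of the giant-component threshold (not a substitute for the limiting results), one can compare the largest component of a G(n, p) graph with average degree below and above 1. All parameter choices and function names below are arbitrary assumptions for illustration.

```python
import random

def largest_component_fraction(n, p, rng):
    """Fraction of nodes in the largest component of a G(n, p) draw."""
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    seen, best = [False] * n, 0
    for s in range(n):
        if seen[s]:
            continue
        seen[s], comp, stack = True, 1, [s]
        while stack:           # depth-first traversal of one component
            i = stack.pop()
            for j in adj[i]:
                if not seen[j]:
                    seen[j] = True
                    comp += 1
                    stack.append(j)
        best = max(best, comp)
    return best / n

rng = random.Random(1)
n = 1000
sub = largest_component_fraction(n, 0.2 / n, rng)  # average degree 0.2: subcritical
sup = largest_component_fraction(n, 3.0 / n, rng)  # average degree 3: giant component
print(sub, sup)
```

Below the threshold the largest component is a vanishing fraction of the population; above it, a nontrivial fraction is reached, in line with the Erdős–Rényi results.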
In the intermediate range, there is a probability that the infection will die out and also a probability that it will infect a nontrivial, but limited, portion of the susceptible population. There, it can be shown that for such random networks and large n, the fraction of nodes in the giant component of susceptible nodes is roughly approximated by the nonzero q that solves q = 1 − e^{−cq}, where c = (1−π)np(n) is the average degree among susceptible nodes. Here, q is an approximation of the probability of the infection spreading to a nontrivial fraction of nodes, and also of the percentage of susceptible nodes that would be infected.[16] This provides a rough idea of the type of results that can be derived from random graph theory. There is much more that is known, as one can work with other models of random graphs (other than ones where each link has an identical probability) and richer models of probabilistic infection between nodes, as well as derive more information about the potential distribution of infected individuals. It should also be emphasized that while the discussion here is in terms of "infection," the applications clearly extend to many of the other contexts we have been mentioning, such as the transmission of ideas and information. A fuller treatment of behaviors, where individual decisions depend in more complicated ways on neighbors' decisions, appears in Section 4.3.

The above analysis of diffusion presumes that once infected, a node eventually infects all of its susceptible neighbors. This misses important aspects of many applications. In terms of diseases, infected nodes can either recover and stop transmitting a disease, or die and completely disappear from the network. Transmission will also generally be probabilistic, depending on the type of interaction and its extent.[17] Similarly, if we think of behaviors, it might be that the likelihood that a node is still actively transmitting a bit of information to its neighbors decreases over time.
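Under the standard Poisson approximation, the nonzero root of q = 1 − e^{−cq} (with c the average degree among susceptible nodes) can be found by simple fixed-point iteration; the starting point and iteration count below are arbitrary choices.

```python
import math

def giant_component_fraction(c, iters=200):
    """Iterate q <- 1 - exp(-c*q) from q = 1.
    q: approximate fraction of susceptible nodes in the giant component.
    c: average degree among susceptible nodes."""
    q = 1.0  # start above the nonzero root so iteration descends onto it
    for _ in range(iters):
        q = 1.0 - math.exp(-c * q)
    return q

# Below the threshold c = 1 the only root is 0 and the iteration collapses;
# above it, the iteration settles on the nontrivial root.
print(giant_component_fraction(0.5))  # ~0.0
print(giant_component_fraction(3.0))  # ~0.94
```

Because 1 − e^{−cq} is increasing and concave in q, the iteration converges monotonically to the largest fixed point, which is 0 exactly when c ≤ 1.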
Ultimately, we will discuss models that allow for rather general strategic impact of peer behavior (a generalization of the approach taken by Granovetter). But first we discuss some aspects of the epidemiology literature that take steps in that direction by considering two alternative models that keep track of the state of nodes and are more explicitly dynamic. [Footnote 16] Again, see Chapter 4 in Jackson (2008) for more details. [Footnote 17] Probabilistic transmission is easily handled in the above model by simply adjusting the link probability to reflect the fact that some links might not transmit the disease.

The common terminology for the possible states a node can be in is: susceptible, where a node is not currently infected or transmitting a disease but can catch it; infected, where a node has a disease and can transmit it to its neighbors; and removed (or recovered), where a node has been infected but is no longer able to transmit the disease and cannot be re-infected. The first of the leading models is the "SIR" model (dating to Kermack and McKendrick, 1927), where nodes are initially susceptible but can catch the disease from infected neighbors. Once infected, a node continues to infect neighbors until it is randomly removed from the system. This fits well the biology of some childhood diseases, such as chickenpox, where one can only be infected once. The other model is the "SIS" model (see Bailey, 1975), where once infected, nodes can randomly recover, but are then susceptible again. This corresponds well to an assortment of bacterial infections, viruses, and flus, where one transitions back and forth between health and illness. The analysis of the SIR model is a variant of the component-size analysis discussed above. The idea is that there is a random chance that an "infected" node infects a given "susceptible" neighbor before becoming "removed."
Roughly, one examines component structures in which, instead of removing nodes randomly, one removes links randomly from the network. This results in variations on the above sorts of calculations, with adjusted thresholds for infection depending on how quickly infected nodes infect their neighbors relative to how quickly they are removed. The SIS model, in contrast, involves a different sort of analysis. The canonical version of that model is best viewed as one with a random matching process rather than a fixed social network. In particular, suppose that a node i in each period has interactions with d_i other individuals from the population. Recall our notation P(d) for the proportion of the population that has degree d (so d interactions per period). The matches are determined randomly, in such a way that if i is matched with j, then the probability that j has degree d > 0 is given by P̃(d) = P(d)d / ⟨d⟩, where ⟨·⟩ represents the expectation with respect to P.[18] This reflects the fact that an agent is proportionally more likely to be matched with other individuals who have many connections. To justify this formally, one needs an infinite population; indeed, with any finite population of agents with heterogeneous degrees, the emergent networks will generally exhibit some correlation between neighbors' degrees.[19] Individuals who have high degrees will have more interactions per period and will generally be more likely to be infected at any given time. An important calculation then pertains to the chance that a given meeting will be with an infected individual. [Footnote 18] We consider only individuals who have degree d > 0, as others do not participate in the society. [Footnote 19] See the appendix of Currarini, Jackson, and Pin (2009) for some details along this line.
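The size-biased neighbor distribution P̃(d) = P(d)d/⟨d⟩ is straightforward to compute. The two-point degree distribution below is a made-up example chosen only to make the size bias visible.

```python
def neighbor_degree_dist(p):
    """Size-biased distribution P~(d) = P(d)*d / <d>:
    the degree distribution of a randomly chosen neighbor."""
    mean_d = sum(d * pd for d, pd in p.items())
    return {d: pd * d / mean_d for d, pd in p.items()}

p = {1: 0.5, 3: 0.5}           # half the agents have degree 1, half degree 3
pt = neighbor_degree_dist(p)   # <d> = 2
print(pt)                      # {1: 0.25, 3: 0.75}
```

A random neighbor is three times as likely to have degree 3 as degree 1, even though the two degrees are equally common in the population, which is exactly the "friendship paradox" logic behind P̃.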
If the infection rate among degree-d individuals is r(d), the probability that any given meeting is with an infected individual is y, where y = Σ_d P̃(d) r(d). The chance of meeting an infected individual in a given encounter thus differs from the average infection rate in the population, which is just r̄ = Σ_d P(d) r(d), because y is weighted by the rate at which individuals meet each other. A standard version of contagion that is commonly analyzed is one in which the probability of an agent of degree d becoming infected is νyd, where ν ∈ (0, 1) is a rate of transmission of infection in a given period, assumed small enough that this probability is less than one. If ν is very small, this approximates the probability of getting infected under d interactions, each independently being with an infected individual with probability y, and each such contact independently transmitting the infection with probability ν. The last part of the model is that in any given period, an infected individual recovers and becomes susceptible with probability δ ∈ (0, 1). If such a system operates on a finite population, then eventually all agents will become susceptible at once, which would end the infection. If there is a small probability of a new mutation and infection in any given period, the system will be ergodic and will always have some probability of future infection. To get a feeling for the long-run outcomes in large societies, the literature has examined a steady state (i.e., a situation in which the system essentially remains constant) of a process that is idealized as operating on an infinite (continuous) population. Formally, a steady state is defined by having r(d) be constant over time for each d. Working with an approximation at the limit (a "mean-field" approximation, which in this case can be justified with a continuum of agents, though with quite a bit of technical detail), a steady-state condition can be derived: (1 − r(d))νyd = r(d)δ for each d.
(1 − r(d))νyd is the rate at which susceptible agents of degree d become infected, and r(d)δ is the rate at which infected individuals of degree d recover. Letting λ = ν/δ, it follows that r(d) = λyd / (λyd + 1). Solving (5) and (8) simultaneously leads to a characterization of the steady-state y: y = Σ_d P̃(d) λyd / (λyd + 1). (9) This system always has a solution, and therefore a steady state, with y = 0, in which there is no infection. It can also have other solutions in which y is positive (but always below 1 if λ is finite). Unless P takes very specific forms, it can be difficult to find steady states y > 0 analytically. Special cases have been analyzed, such as that of a power distribution, where P(d) = 2d^{−3} (e.g., see Pastor-Satorras and Vespignani (2000, 2001)). In that case, there is always a positive steady-state infection rate. More generally, López-Pintado (2008) addresses the question of when there will be a positive steady-state infection rate. To get some intuition for her results, let h(y) = Σ_d P̃(d) λyd / (λyd + 1), so that the equation y = h(y) corresponds to steady states of the system. We can now extend the analysis of Granovetter's (1978) model described above with this richer model, in which h(y) accounts for network attributes. While the fixed-point equation identifying Granovetter's stable points allowed for rather arbitrary diffusion patterns (depending on the cost distribution F), the function h has additional structure that we can explore. In particular, suppose we examine the infection rate that would result if we start at a rate of y and then run the system on an infinite population for one period. Noting that h(0) = 0, it is clear that 0 is always a fixed point and thus a steady state. Since h(1) < 1, and h is increasing and strictly concave in y (as seen by examining its first and second derivatives), there can be at most one fixed point besides 0.
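A sketch of the mean-field steady-state computation, iterating y ← h(y) with h as defined above; the degree distribution and the values of λ are illustrative assumptions, not from the text.

```python
def h(y, pt, lam):
    """h(y) = sum_d P~(d) * lam*y*d / (lam*y*d + 1)."""
    return sum(ptd * lam * y * d / (lam * y * d + 1) for d, ptd in pt.items())

def steady_state(pt, lam, iters=500):
    """Iterate y <- h(y) from y = 1; h is increasing and concave,
    so this converges to the largest steady state."""
    y = 1.0
    for _ in range(iters):
        y = h(y, pt, lam)
    return y

pt = {1: 0.25, 3: 0.75}  # an illustrative neighbor degree distribution P~
print(steady_state(pt, lam=0.2))  # h'(0) = 0.5 < 1: infection dies out
print(steady_state(pt, lam=2.0))  # h'(0) = 5 > 1: positive endemic level
```

Here h'(0) = λ Σ_d P̃(d)d = 0.2·2.5 = 0.5 in the first case and 2·2.5 = 5 in the second, so the two runs sit on opposite sides of the tipping condition h'(0) > 1 discussed next.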
For there to be another fixed point (steady state) above y = 0, it must be that h′(0) is above 1; otherwise, given the strict concavity, we would have h(y) < y for all positive y. Moreover, in cases where h′(0) > 1, a small perturbation away from a 0 infection rate will lead to increased infection: in the terminology introduced above, 0 would be a tipping point. Since h′(0) = λ⟨d²⟩/⟨d⟩, higher infection rates λ lead to the possibility of positive infection, as do degree distributions with high variances (relative to the mean). The idea behind having a high variance is that there will be some "hub" nodes with high degree, who can foster contagion. Going back to our empirical insights, this analysis fits the observations that highly linked individuals are more likely to get infected and experience speedier diffusion. Whether the aggregate behavior exhibits the S-shape that is common in many real-world diffusion processes will depend on the particulars of h, much in the same way that the S-shape in Granovetter's model depends on the shape of the distribution of costs F in that model. Here, things are slightly complicated, since h is a function of y, which is the probability of infection of a neighbor, and not the overall probability of infection in the population. Thus, one needs to further translate how various y's over time translate into population fractions that are infected. Beyond the extant empirical studies, this analysis provides some intuitions about what is needed for an infection to be possible. It does not, however, provide an idea of how extensive the infection spread will be and how that depends on network structure. While this does not boil down to as simple a comparison as (12), there is still much that can be deduced using (9), as shown by Jackson and Rogers (2007). While one cannot always directly solve (9), notice that λyd² / (⟨d⟩(λyd + 1)) is an increasing and convex function of d.
Therefore, the right-hand side of the above equality can be ordered when comparing different degree distributions in the sense of stochastic dominance (we return to these sorts of comparisons in some of the models discussed below). The interesting conclusion regarding steady-state infection rates is that they depend on network structure in ways that are very different at low levels of the infection rate λ than at high levels.

While the above models provide some ideas about how social structure impacts diffusion, they are limited to settings where, roughly speaking, the probability that a given individual adopts a behavior is simply proportional to the infection rate of neighbors. Especially in situations in which opinions or technologies are adopted, purchasing decisions are made, and so on, an individual's decision can depend in much more complicated ways on the behavior of his or her neighbors. Such interaction naturally calls on game theory as a tool for modeling these richer interactions. We start with static models of interactions on networks that allow for a rather general impact of peers' actions on one's own optimal choices. The first model to explicitly examine games played on a network is the model of "graphical games," introduced by Kearns, Littman, and Singh (2001) and analyzed by Kakade, Kearns, Langford, and Ortiz (2003), among others. The underlying premise of the graphical games model is that agents' payoffs depend on their own actions and the actions of their direct neighbors, as determined by the network of connections.[20] Formally, the payoff structure underlying a graphical game is as follows. The payoff to each player i when the profile of actions is x = (x_1, . . ., x_n) is u_i(x_i, x_{N_i(g)}), where x_{N_i(g)} is the profile of actions taken by the neighbors of i in the network g. Most of the empirical applications discussed earlier entailed agents responding to neighbors' actions in roughly one of two ways.
In some contexts, such as those pertaining to the adoption of a new product or a new agricultural grain, decisions to join the workforce, or to join a criminal network, agents conceivably gain more from a particular action the greater the number of peers who choose a similar action; that is, payoffs exhibit strategic complementarities. In other contexts, such as experimentation with a new drug or contribution to a public good, when an agent's neighbors choose a particular action, the relative payoff the agent gains from choosing a similar action decreases, and there is strategic substitutability. The graphical games environment allows for the analysis of both types of setups, as the following example (taken from Galeotti, Goyal, Jackson, Vega-Redondo, and Yariv (2010)) illustrates.

Example 1 (payoffs depend on the sum of actions). Player i's payoff function when he or she chooses x_i and her k neighbors choose the profile (x_1, . . ., x_k) is f(x_i + λ Σ_{j=1}^k x_j) − c(x_i), (13) where f(·) is nondecreasing and c(·) is a "cost" function associated with own effort (more general than, but much in the spirit of, (2)). The parameter λ ∈ ℝ determines the nature of the externality across players' actions. The shape and sign of λf determine the effects of neighbors' action choices on one's own optimal choice. In particular, the example yields strict strategic substitutes (complements) if, assuming differentiability, λf″ is negative (positive). Several papers analyze graphical games for particular choices of f and λ. To mention a few examples, the case where f is concave, λ = 1, and c(·) is increasing and linear corresponds to information sharing as a local public good, studied by Bramoullé and Kranton (2007), where actions are strategic substitutes. In contrast, if λ = 1 but f is convex (with c″ > f″ > 0), we obtain a model with strategic complements, as proposed by Goyal and Moraga-González (2001) to study collaboration among local monopolies.
In fact, the formulation in (13) is general enough to accommodate numerous further examples in the literature, such as human capital investment (Calvó-Armengol and Jackson (2009)), crime and other networks (Ballester, Calvó-Armengol, and Zenou (2006)), some coordination problems (Ellison (1993)), and the onset of social unrest (Chwe (2000)). The computer science literature (e.g., the literature following Kearns, Littman, and Singh (2001), analyzed by Kakade, Kearns, Langford, and Ortiz (2003)) has focused predominantly on the question of when an efficient (polynomial-time) algorithm can be provided to compute Nash equilibria of graphical games. It has not had much to say about the properties of equilibria, which are important when thinking about applying such models to analyze diffusion in the presence of strategic interaction. In contrast, the economics literature has concentrated on characterizing equilibrium outcomes for particular applications, and on deriving general comparative statics with respect to agents' positions in a network and with respect to the network architecture itself. The information players hold regarding the underlying network (namely, whether they are fully informed of the entire set of connections in the population, or only of connections in some local neighborhood) ends up playing a crucial role in the scope of predictions generated by network game models. Importantly, graphical games are ones in which agents have complete information regarding the network in place. Consequently, such models suffer from inherent multiplicity problems, as clearly illustrated in the following example, based on a variation of (13) similar to a model analyzed by Bramoullé and Kranton (2007).

Example 2 (multiplicity under complete information). Suppose that in (13) we set λ = 1, choose x_i ∈ {0, 1}, and take f(y) = 1 if y ≥ 1 and f(y) = 0 otherwise, together with c(x_i) = c x_i, where 0 < c < 1.
This game, often labeled the best-shot public goods game, may be viewed as a game of local public-good provision. Each agent would choose the action 1 (say, experimenting with a new grain, or buying a product that can be shared with one's friends) if he or she were alone (or if no one else experimented), but would prefer that one of his or her neighbors incur the cost c that the action 1 entails (when experimentation is observed publicly). Effectively, an agent just needs at least one agent in his or her neighborhood to take action 1 to enjoy its full benefits, but prefers that it be someone else, given that the action is costly and there is no additional benefit beyond one person taking the action. Note that, since c < 1, in any (pure-strategy) Nash equilibrium, for any player i with k neighbors, it must be the case that some agent in the neighborhood chooses action 1; that is, if the chosen profile is (x_1, . . ., x_k), then x_i + Σ_{j=1}^k x_j ≥ 1. In fact, there is a very rich set of equilibria in this game. To see this, consider a star network and note that there exist two equilibria: one in which the center chooses 0 and the spokes choose 1, and a second in which the spokes choose 0 while the center chooses 1. Figure 3 illustrates these two equilibria. In the first, depicted in the left panel of the figure, the center earns more than the spoke players, while in the second equilibrium (in the right panel) it is the other way round. Even in the simplest network structures equilibrium multiplicity may arise, and the relation between network architecture, equilibrium actions, and systematic patterns can be difficult to discover.
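The two star equilibria can be checked mechanically. The payoff encoding below (benefit 1 if the agent or some neighbor acts, cost c for acting), with c = 0.5 and a three-spoke star, is a minimal rendering of the best-shot game with arbitrary illustrative numbers.

```python
def is_nash_best_shot(adj, x, c=0.5):
    """Check a pure-strategy Nash equilibrium of the best-shot game:
    payoff = 1 if self or some neighbor plays 1, minus c if self plays 1."""
    def pay(i, xi):
        covered = xi or any(x[j] for j in adj[i])
        return (1 if covered else 0) - c * xi
    # Nash: no player gains by flipping their binary action
    return all(pay(i, x[i]) >= pay(i, 1 - x[i]) for i in adj)

# Star with center 0 and three spokes
adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
center_free_rides = {0: 0, 1: 1, 2: 1, 3: 1}
spokes_free_ride  = {0: 1, 1: 0, 2: 0, 3: 0}
everyone_acts     = {0: 1, 1: 1, 2: 1, 3: 1}
print(is_nash_best_shot(adj, center_free_rides))  # True
print(is_nash_best_shot(adj, spokes_free_ride))   # True
print(is_nash_best_shot(adj, everyone_acts))      # False: each would free-ride
```

Both asymmetric profiles survive the deviation check, while the profile where everyone acts fails it, illustrating the multiplicity described in the text.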
While the complete information regarding the structure of the social network imposed in graphical game models may be very sensible when the relevant network of agents is small, in large groups of agents (such as a country's electorate, the entire set of corn growers in the 1950s, sites on the World Wide Web, or academic economists) it is often the case that individuals have only noisy perceptions of their network's architecture. As the discussion above stressed, complete information poses many challenges because of the widespread equilibrium multiplicity that accompanies it. In contrast, in another benchmark, where agents know how many neighbors they will have but not who those neighbors will be, the equilibrium correspondence is much easier to deal with. Moreover, this benchmark is an idealized model of settings in which agents make choices, such as learning a language or adopting a technology, that they will use over a long time. In such contexts, agents have some idea of how many interactions they are likely to have in the future, but not exactly with whom. A network game is a modification of a graphical game in which agents can have private and incomplete information regarding the realized social network in place. We describe here the setup analyzed by Galeotti, Goyal, Jackson, Vega-Redondo, and Yariv (2010) and Jackson and Yariv (2005, 2007), restricting attention to binary-action games.[21] Uncertainty is operationalized by assuming that the network is determined according to some random process yielding our distribution over agents' degrees, P(d), which is common knowledge. Each player i has d_i interactions, but does not know how many interactions each neighbor has. Thus, each player knows something about his or her local neighborhood (the number of direct neighbors), but only the distribution of links in the remaining population. Consider now the following utility specification, a generalization of (2).
Agent i has a cost of choosing 1, denoted c_i. Costs are randomly and independently distributed across the society, according to a distribution F_c. Normalize the utility from the action 0 to 0, and let the benefit to agent i from action 1 be denoted v(d_i, x), where d_i is i's degree and she expects each of her neighbors independently to choose the action 1 with probability x. Agent i's added payoff from adopting behavior 1 over sticking with the action 0 is then v(d_i, x) − c_i. This captures how the number of neighbors that i has, as well as their propensity to choose the action 1, affects the benefits from adopting 1. In particular, i prefers to choose the action 1 if v(d_i, x) ≥ c_i. (14) This is a simple cost-benefit analysis generalizing Granovetter's (1978) setup, in that benefits can now depend on one's own degree (so that the underlying network is accounted for). Let F(d, x) ≡ F_c(v(d, x)). In words, F(d, x) is the probability that a random agent of degree d chooses the action 1 when anticipating that each neighbor will choose 1 with an independent probability x. Note that v(d, x) can encompass all sorts of social interactions. In particular, it allows for a simple generalization of Granovetter's (1978) model to situations in which agents' payoffs depend on the expected number of neighbors adopting, dx. Existence of symmetric Bayesian equilibria follows standard arguments. In cases where v is nondecreasing in x for each d, it is a direct consequence of Tarski's fixed-point theorem; in fact, in this case there exists an equilibrium in pure strategies. In other cases, provided v is continuous in x for each d, a fixed point can still be found by appealing to standard theorems (e.g., Kakutani's) and admitting mixed strategies.[22]

Homogeneous costs. Suppose first that all individuals experience the same cost c > 0 of choosing the action 1 (much as in Example 2 above).
In that case, as long as v(d, x) is monotonic in d (nonincreasing or nondecreasing), equilibria are characterized by a threshold. Indeed, suppose v(d, x) is increasing in d. Then any equilibrium is characterized by a threshold d* such that all agents of degree d < d* choose the action 0 and all agents of degree d > d* choose the action 1 (and agents of degree d* may mix between the actions). In particular, notice that the type of multiplicity that appeared in Example 2 no longer occurs (provided degree distributions are not trivial). It is now possible to derive comparative statics of equilibrium behavior and outcomes using stochastic dominance arguments on the network itself. For ease of exposition, we illustrate this in the case of nonatomic costs (see Galeotti, Goyal, Jackson, Vega-Redondo, and Yariv (2010) for the general analysis).

Heterogeneous costs. Consider the case in which F_c is a continuous function with no atoms. In this case, a simple equation suffices to characterize equilibria. Let x be the probability that a randomly chosen neighbor chooses the action 1. Then F(d, x) is the probability that a random (best-responding) neighbor of degree d chooses the action 1. We can now proceed in a way reminiscent of the analysis of the SIS model. Recall that P̃(d) denotes the probability that a random neighbor is of degree d (see equation (4)). It must be that x = Σ_d P̃(d) F(d, x). (15) Again, a fixed-point equation captures much of what occurs in the game. In fact, equation (15) characterizes equilibria in the sense that any symmetric[23] equilibrium results in an x that satisfies the equation, and any x that satisfies the equation corresponds to an equilibrium where type (d_i, c_i) chooses 1 if and only if inequality (14) holds. Given that equilibria can be described by their corresponding x, we often refer to some value of x as being an "equilibrium." Consider a symmetric equilibrium and a corresponding probability x that a random neighbor chooses action 1.
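Equation (15) can be solved by iteration once F(d, x) is specified. The choices below, v(d, x) = dx with costs uniform on [0, 1] (so that F(d, x) = min(dx, 1)), are hypothetical, made only to produce a concrete F; the degree distribution is likewise illustrative.

```python
def F(d, x):
    """Adoption probability of a degree-d agent when each neighbor adopts
    with probability x: F_c(v(d, x)) with v(d, x) = d*x and costs uniform
    on [0, 1] (illustrative choices, not from the text)."""
    return min(d * x, 1.0)

def equilibrium(pt, x0, iters=200):
    """Iterate x <- sum_d P~(d) F(d, x), i.e., equation (15)."""
    x = x0
    for _ in range(iters):
        x = sum(ptd * F(d, x) for d, ptd in pt.items())
    return x

pt = {1: 0.25, 3: 0.75}  # neighbor degree distribution P~ (illustrative)
print(equilibrium(pt, 0.0))   # x = 0 is always a rest point here
print(equilibrium(pt, 0.01))  # any positive start escapes toward x = 1
print(equilibrium(pt, 0.9))
```

In this specification the slope of the fixed-point map at 0 exceeds 1, so x = 0 is an unstable equilibrium (a tipping point) and the iteration from any positive start converges to the stable equilibrium x = 1.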
If the payoff function v is increasing in degree d, then the expected payoff of an agent with degree d + 1 is at least that of an agent with degree d, and agents with higher degrees choose 1 with weakly higher probabilities. Indeed, an agent of degree d + 1 can imitate the decisions of an agent of degree d and gain at least as high a payoff. [Footnote 22] In such a case, the best-response correspondence (allowing mixed strategies) for any (d_i, c_i) as a function of x is upper hemicontinuous and convex-valued. Taking expectations with respect to d_i and c_i, we also have a set of population best responses as a function of x that is upper hemicontinuous and convex-valued. [Footnote 23] Symmetry indicates that agents with the same degree and costs follow similar actions. Thus, if v is increasing (or, in much the same way, decreasing) in d for each x, then any symmetric equilibrium entails agents with higher degrees choosing action 1 with weakly higher (lower) probability. Furthermore, agents of higher degree have higher (lower) expected payoffs. Much as in the analysis of the epidemiological models, the multiplicity of equilibria is determined by the properties of F, which in turn correspond to properties of P̃ and F_c. For instance:

• If F(d, 0) > 0 for some d in the support of P, and F is concave in x for each d, then there exists at most one fixed point; and
• If F(d, 0) = 0 for all d and F is strictly concave or strictly convex in x for each d, then there are at most two equilibria: one at 0, and possibly an additional one, depending on the slope of the fixed-point map at x = 0.[24]

In general, as long as the graph of φ(x) ≡ Σ_d P̃(d)F(d, x) crosses the 45-degree line only once, there is a unique equilibrium (see Figure 4 below).[25] The set of equilibria generated in such network games is divided into stable and unstable ones (the latter of which we have already termed tipping points in Section 3.2). The simple characterization given by (15) allows for a variety of comparative statics on fundamentals pertaining to either type of equilibrium.
in what follows, we show how these comparative statics tie directly to a simple strategic diffusion process. (footnote 24: as before, the slope needs to be greater than 1 for there to be an additional equilibrium in the case of strict concavity, while the case of strict convexity depends on the various values of f(d, 1) across d. footnote 25: morris and shin (2003, 2005) consider uncertainty on payoffs rather than on an underlying network. in coordination games, they identify a class of payoff shocks that lead to a unique equilibrium. heterogeneity in degrees combined with uncertainty plays a similar role in restricting the set of equilibria. in a sense, the analysis described here is a generalization in that it allows studying the impact of changes in a variety of fundamentals on the set of stable and unstable equilibria, regardless of multiplicity, in a rather rich environment. moreover, the equilibrium structure can be tied to the network of underlying social interactions.) [figure 4: the effects of shifting φ(x) pointwise.] indeed, it turns out there is a very useful technical link between the static and dynamic analysis of strategic interactions on networks. an early contribution to the study of the diffusion of strategic behavior allowing for general network architectures was by morris (2000). morris (2000) considered coordination games played on networks. his analysis pertained to identifying social structures conducive to contagion, where a small fraction of the population choosing one action leads to that action spreading across the entire population. the main insight from morris (2000) is that maximal contagion occurs when the society has certain sorts of cohesion properties, where there are no groups (among those not initially infected) that are too inward looking in terms of their connections.
in order to identify the full set of stable equilibria using the above formalization, consider a diffusion process governed by best responses in discrete time (following jackson and yariv (2005, 2007)). at time t = 0, a fraction x_0 of the population is exogenously and randomly assigned the action 1, and the rest of the population is assigned the action 0. at each time t > 0, each agent, including the agents assigned to action 1 at the outset, best responds to the distribution of agents choosing the action 1 in period t − 1, accounting for the number of neighbors they have and presuming that their neighbors will be a random draw from the population. let x_t^d denote the fraction of those agents with degree d who have adopted behavior 1 at time t, and let x_t denote the link-weighted fraction of agents who have adopted the behavior at time t. that is, using the distribution of neighbors' degrees p̃(d), and as deduced before from equation (14), at each date t, x_t^d = f(d, x_{t−1}), and therefore x_t = Σ_d p̃(d) f(d, x_{t−1}) = φ(x_{t−1}). as we have discussed, any rest point of the system corresponds to a static (bayesian) equilibrium of the system. (footnote 26: one can find predecessors with regards to specific architectures, usually lattices or complete mixings, such as conway's (1970) "game of life," and various agent-based models that followed such as the "voter model" (e.g., see clifford and sudbury (1973) and holley and liggett (1975)), as well as models of stochastic stability (e.g., kandori, mailath, and rob (1993), young (1993), ellison (1993)).) if payoffs exhibit complementarities, then convergence of behavior from any starting point is monotone, either upwards or downwards. in particular, once an agent switches behaviors, the agent will not want to switch back at a later date (footnote 27). thus, although these best responses are myopic, any eventual changes in behavior are equivalently forward-looking. figure 4 depicts a mapping φ governing the dynamics.
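the best-response dynamics x_t = Σ_d p̃(d) f(d, x_{t−1}) can be simulated directly. a minimal self-contained sketch, assuming an illustrative convex response f(d, x) = min(1, 0.5·d·x²) chosen so that a tipping point exists (none of these primitives come from the chapter):

```python
# sketch of the discrete-time best-response dynamics
# x_t = sum_d ptilde(d) f(d, x_{t-1}) = phi(x_{t-1}).
# the convex response f(d, x) = min(1, 0.5*d*x^2) is an illustrative
# assumption chosen so that a tipping point exists.

def phi(x, ptilde):
    return sum(p * min(1.0, 0.5 * d * x * x) for d, p in ptilde.items())

def diffuse(x0, ptilde, steps=200):
    # with complementarities the path is monotone: each iterate moves
    # toward a rest point and never reverses direction
    xs = [x0]
    for _ in range(steps):
        xs.append(phi(xs[-1], ptilde))
    return xs

ptilde = {2: 0.5, 4: 0.5}
low = diffuse(0.2, ptilde)     # seeded below the tipping point
high = diffuse(0.8, ptilde)    # seeded above it
print(round(low[-1], 4), round(high[-1], 4))
```

with these primitives the rest points are 0, 2/3 (a tipping point), and 1 (stable): seeds below 2/3 decay monotonically to 0, while seeds above it climb monotonically toward 1, matching the monotone-convergence property noted above.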
equilibria, and rest points of the diffusion process, correspond to intersections of φ with the 45-degree line. the figure allows an immediate distinction between the two classes of equilibria that we discussed informally up to now. formally, an equilibrium x is stable if there exists ε₀ > 0 such that φ(x − ε) > x − ε and φ(x + ε) < x + ε for all ε₀ > ε > 0. an equilibrium x is unstable, or a tipping point, if there exists ε₀ > 0 such that φ(x − ε) < x − ε and φ(x + ε) > x + ε for all ε₀ > ε > 0. in the figure, the equilibrium to the left is a tipping point, while the equilibrium to the right is stable. the composition of the equilibrium set hinges on the shape of the function φ. furthermore, note that a pointwise shift of φ (as in the figure, to a new function φ̂) shifts tipping points to the left and stable points to the right, loosely speaking (as sufficient shifts may eliminate some equilibria altogether), making adoption more likely. this simple insight allows for a variety of comparative statics. for instance, consider an increase in the cost of adoption, manifested as a first order stochastic dominance (fosd) shift of the cost distribution from F_c to F̂_c. it follows immediately that φ̂(x) = Σ_d p̃(d) F̂_c(v(d, x)) ≤ Σ_d p̃(d) F_c(v(d, x)) = φ(x), and the increase in costs corresponds to an increase of the tipping points and a decrease of the stable equilibria (one by one). intuitively, increasing the barrier to choosing the action 1 means that a higher fraction of existing adopters is necessary to get the action 1 to spread even more. this formulation also allows for an analysis, regarding the social network itself, using stochastic dominance arguments (following jackson and rogers (2007) and jackson and yariv (2005, 2007)). for instance, consider an increase in the expected degree of each random neighbor that an agent has. that is, suppose p̃′ fosd p̃ and, for illustration, assume that f(d, x) is nondecreasing in d for all x.
then, by the definition of fosd, φ′(x) = Σ_d p̃′(d) f(d, x) ≥ Σ_d p̃(d) f(d, x) = φ(x), and, under p̃′, tipping points are lower and stable equilibria are higher. (footnote 27: if actions are strategic substitutes, convergence may not be guaranteed for all starting points. however, whenever convergence is achieved, the rest point is an equilibrium, and the analysis can therefore be useful for such games as well.) similar analysis allows for comparative statics regarding the distribution of links, by simply looking at mean preserving spreads (mps) of the underlying degree distribution. going back to the dynamic path of adoption, we can generalize the insights that we derived regarding the granovetter (1978) model. namely, whether adoption paths track an s-shaped curve now depends on the shape of φ, and thereby on the shape of both the cost distribution F_c and agents' utilities. there is now a substantial and growing body of research studying the impacts of interactions that occur on a network of connections. this work builds on the empirical observations of peer influence and generates a rich set of individual and aggregate predictions. insights that have been shown consistently in real-world data pertain to the higher propensities of contagion (of a disease, an action, or a behavior) among more highly connected individuals, the role of "opinion leaders" in diffusion, as well as an aggregate s-shape of many diffusion curves. the theoretical analyses open the door to many other results, e.g., those regarding comparative statics across networks, payoffs, and cost distributions (when different actions vary in costs). future experimental and field data will hopefully complement these theoretical insights. a shortcoming of some of the theoretical analyses described in this chapter is that the foundation for modeling the underlying network is rooted in simple forms of random graphs in which there is little heterogeneity among nodes other than their connectivity.
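the fosd comparative static just derived can be checked numerically. a sketch under an assumed s-shaped response f(d, x) = d·x²/(d·x² + 0.3), which is nondecreasing in d; this functional form is an illustrative choice, not the chapter's:

```python
# numeric illustration of the fosd comparative static: shifting the neighbor
# degree distribution toward higher degrees (ptilde' fosd ptilde) lowers the
# tipping point and raises the stable equilibrium.
# f(d, x) = d*x^2 / (d*x^2 + 0.3) is an illustrative s-shaped response.

def f(d, x):
    return d * x * x / (d * x * x + 0.3)

def phi(x, ptilde):
    return sum(p * f(d, x) for d, p in ptilde.items())

def interior_equilibria(ptilde, grid=10000):
    # positive crossings of phi with the 45-degree line: the first crossing
    # (from below) is the tipping point, the second is the stable equilibrium
    roots, prev = [], None
    for i in range(1, grid + 1):
        x = i / grid
        cur = phi(x, ptilde) - x
        if prev is not None and prev * cur < 0:
            roots.append(x)
        prev = cur
    return roots

p_lo = {2: 0.50, 4: 0.50}
p_hi = {2: 0.25, 4: 0.75}      # fosd-dominates p_lo
tip_lo, stable_lo = interior_equilibria(p_lo)
tip_hi, stable_hi = interior_equilibria(p_hi)
print(tip_hi < tip_lo, stable_hi > stable_lo)
```

shifting weight from degree 2 to degree 4 is a fosd shift of the neighbor degree distribution, and the scan confirms that the tipping point falls while the stable equilibrium rises, as the inequality above predicts.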
this misses a central observation from the empirical literature that illustrates again and again the presence of homophily, people's tendency to associate with other individuals who are similar to themselves. moreover, there are empirical studies that are suggestive of how homophily might impact diffusion, providing for increased local connectivity but decreased diffusion on a more global scale (see rogers (1995) for some discussion). beyond the implications that homophily has for the connectivity structure of the network, it also has implications for the propensity of individuals to be affected by neighbors' behavior: for instance, people who are more likely to, say, be immune may be more likely to be connected to one another, and, similarly, people who are more likely to be susceptible to infection may be more likely to be connected to one another (footnote 29). furthermore, background factors linked to homophily can also affect the payoffs individuals receive when making decisions in their social network. enriching the interaction structure in that direction is crucial for deriving more accurate diffusion predictions. this is an active area of current study (e.g., see bramoullé and rogers (2010), currarini, jackson, and pin (2006), and peski (2008)). ultimately, the formation of a network and the strategic interactions that occur amongst individuals is a two-way street. developing richer models of the endogenous formation of networks, together with endogenous interactions on those networks, is an interesting direction for future work, both empirical and theoretical (footnote 30).
(footnote 29: the mechanism through which this occurs can be rooted in background characteristics such as wealth, or more fundamental personal attributes such as risk aversion. risk-averse individuals may connect to one another and be more prone to protect themselves against diseases by, e.g., getting immunized. footnote 30: there are also some models that study co-evolving social relationships and play in games with neighbors.)
key: cord-355102-jcyq8qve authors: avila, eduardo; kahmann, alessandro; alho, clarice; dorn, marcio title: hemogram data as a tool for decision-making in covid-19 management: applications to resource scarcity scenarios date: 2020-06-29 journal: peerj doi: 10.7717/peerj.9482 sha: doc_id: 355102 cord_uid: jcyq8qve background: the covid-19 pandemic has challenged emergency response systems worldwide, with widespread reports of essential services breakdown and collapse of health care structure. a critical element involves essential workforce management, since current protocols recommend release from duty for symptomatic individuals, including essential personnel. testing capacity is also problematic in several countries, where diagnosis demand outnumbers available local testing capacity. purpose: this work describes a machine learning model derived from hemogram exam data performed on symptomatic patients and how it can be used to predict qrt-pcr test results. methods: hemogram exam data from 510 symptomatic patients (73 positives and 437 negatives) were used to model and predict qrt-pcr results through naïve-bayes algorithms. different scarcity scenarios were simulated, including symptomatic essential workforce management and absence of diagnostic tests. adjustments to assumed prior probabilities allow fine-tuning of the model, according to the actual prediction context.
results: the proposed models can predict covid-19 qrt-pcr results in symptomatic individuals with high accuracy, sensitivity and specificity, yielding 100% sensitivity and 22.6% specificity with a prior of 0.9999; 76.7% for both sensitivity and specificity with a prior of 0.2933; and 0% sensitivity and 100% specificity with a prior of 0.001. regarding the background scarcity context, resource allocation can be significantly improved when model-based patient selection is observed, compared to random choice. conclusions: machine learning models can be derived from widely available, quick, and inexpensive exam data in order to predict the qrt-pcr results used in covid-19 diagnosis. these models can be used to assist strategic decision-making in resource scarcity scenarios, including personnel shortage, lack of medical resources, and testing insufficiency. since its first detection and description (huang et al., 2020), the expansion of covid-19 has brought worldwide concerns to governmental agents, public and private institutions, and health care specialists. declared a pandemic, this disease has deeply impacted many aspects of life in affected communities. the relative lack of knowledge about the disease's particularities has led to significant efforts devoted to alleviating its effects (lipsitch, swerdlow & finelli, 2020). alternatives to mitigate the disease spread include social distancing (anderson et al., 2020). such a course of action has shown some success in limiting contagion rates (tu et al., 2020). however, isolation policies manifest drawbacks such as economic impact, with significant effects on macroeconomic indicators and unemployment rates (nicola et al., 2020). to address this, governments worldwide have proposed guidelines to manage the essential workforce, considered pivotal for maintaining strategic services and providing an appropriate response to the pandemic's expansion (black et al., 2020).
widespread reports of threats to critical national infrastructure have been presented, with significant impact associated with medical attention (kandel et al., 2020). significant pressure is being faced by emergency response workers, with some countries on the brink of collapse of their national health systems (tanne et al., 2020). the main concern associated with covid-19 is the lack of extensive testing capacity. the shortage of diagnostic material and other medical supplies poses a major restraining factor in pandemic control (ranney, griffeth & jha, 2020). the most common covid-19 symptoms are similar to those of other viral infectious diseases, making prompt clinical diagnosis impractical (adhikari et al., 2020). official guidelines emphasize the use of quantitative real-time pcr (qrt-pcr) assays for detection of viral rna as the primary reference standard in diagnosis (tahamtan & ardebili, 2020). in many countries, test results are often unavailable for at least a week, forcing physicians and health care providers to take strategic decisions regarding patient care without quality information. previous reports have described alterations in laboratory findings in covid-19 patients. hematological effects include leukopenia, lymphocytopenia and thrombocytopenia, while biochemical results show variation in alanine and aspartate aminotransferase, creatine kinase and d-dimer levels, among other parameters (guan et al., 2020; huang et al., 2020). some efforts have been applied to evaluate clinical and epidemiological aspects of this disease using computational methods, such as diagnosis, prognosis, symptom severity, mortality, and response to different treatments. a useful review of some of these methods is presented by wynants et al. (2020). the main objective of this article is to provide insights to healthcare decision-makers facing scarcity situations, such as a shortage of testing capacity or limitations in the essential workforce.
a useful method of doing so is using hemogram test results. this clinical exam is widely available, inexpensive, and fast, applying automation to maximize throughput. to do so, we have analyzed hemogram data from brazilian symptomatic patients with available test results for covid-19. we propose a framework using a naïve-bayes model for machine learning, where test conditions can be adjusted to respond to actual lack-of-resources problems. finally, four distinct scarcity scenario examples are presented, including handling of the essential workforce and shortage of testing and treatment resources. a total of 5,644 patients admitted to the emergency department of hospital israelita albert einstein (hiae, são paulo, brazil) presenting covid-19-like symptoms were tested via qrt-pcr. a total of 599 patients (10.61%) presented positive results for covid-19. the full dataset contains patients' anonymized id, age, qrt-pcr results, data on clinical evolution, and a total of 105 clinical tests. not all data was available for all patients; therefore, the amount of missing information is significant, with most available parameters informed for a small fraction of subjects. all variables were normalized (i.e., mean = 0 and variance = 1) to maintain anonymity and remove scale effects. no missing-data imputation was performed during model generation, to avoid bias. considering the significant amount of missing data, only 510 patients presented values for all 15 parameters evaluated in hemogram results, comprising the following cell counts or hematological measures: hematocrit, hemoglobin, platelets, mean platelet volume, red blood cells, lymphocytes, leukocytes, basophils, eosinophils, monocytes, neutrophils, mean corpuscular volume (mcv), mean corpuscular hemoglobin (mch), mean corpuscular hemoglobin concentration (mchc), and red blood cell distribution width (rdw). data for the above parameters were used in model construction, along with qrt-pcr covid-19 test results.
no baseline characteristics can be detailed for the evaluated cohort, since additional patient description is not accessible. even the limited individual information provided (such as subjects' ages) was normalized within the dataset, and therefore cannot be inferred from the available data. the full dataset is available at https://www.kaggle.com/einsteindata4u/covid19 and can also be found in the supplemental materials. machine learning (ml) is a field of study in computer science and statistics dedicated to the execution of computational tasks through algorithms that do not require explicit instructions but instead rely on learning patterns from data samples to automate inferences (mitchell, 1997). these algorithms can infer input-output relationships without explicitly assuming a pre-determined model (geron, 2017; hastie, tibshirani & friedman, 2009). there are two learning paradigms: supervised and unsupervised. supervised learning is a process in which predictive models are constructed through a set of observations, each of them associated with a known outcome (label). by contrast, in unsupervised learning one does not have access to labels; it can be viewed as the task of "spontaneously" finding patterns and structures in the input data. our objective with this study is to predict in advance the results of the qrt-pcr test with a supervised machine learning model using data from hemogram tests performed on symptomatic patients. the main process can be divided into four steps: (1) pre-processing of the data, (2) selection of an appropriate classification algorithm, (3) model development and validation, that is, the process of using the selected characteristics to separate the two groups of subjects (positive for covid-19 vs. negative for covid-19 in the qrt-pcr test), and (4) testing the generated model with additional data.
the steps are detailed as follows. data pre-processing: samples presenting a missing value in any of the 15 evaluated features were removed, in order to avoid introducing bias into the model. a total of 510 patients (73 positives for covid-19 and 437 negatives) presented complete data and were considered for the model construction. in this work, we use a gaussian naïve bayes (nb) classifier, which is a probabilistic machine learning model used for classification tasks. the main reasons for choosing this classifier are its low computational cost, its ability to handle missing data, and the fact that it presented better classification performance than the other ml techniques evaluated for this particular dataset (data not shown). in medicine, the first computer-aided attempts at decision support were based mainly on the bayes theorem, in order to aggregate data with physicians' previous knowledge (martin, apostolakos & roazen, 1960). the naïve bayes (nb) method combines the previous probability of an event (also called prior probability, or simply prior) with additional evidence (e.g., a set of clinical data from a patient) to calculate a combined, conditional probability that includes the prior probability given the extra information. the result is the posterior probability of an outcome, or simply the posterior. this classifier is called "naïve" because it assumes that each exam result (variable) is independent of the others. since this assumption is not realistic in medicine, the model parameters should not be interpreted directly (schurink et al., 2005). despite this drawback, it can outperform more robust alternatives in classification tasks, and since it reflects the uncertainty involved in the diagnosis, bayesian approaches are more suitable than deterministic techniques (gorry & barnett, 1968; hastie, tibshirani & friedman, 2009). a classifier is an estimator with a predict method that takes an input array (the test set) and makes predictions for each sample in it.
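the study fits its gaussian nb classifier with scikit-learn; the mechanics of how a prior combines with gaussian per-feature likelihoods can be sketched with the standard library alone. the toy feature vectors below are invented for illustration and are not the hiae hemogram data:

```python
import math

# minimal gaussian naive bayes with an adjustable prior for the positive
# class, mirroring how the study tunes the prior. toy data, not the hiae set.

def gaussian_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def fit_class(samples):
    # per-feature mean and standard deviation for one class
    params = []
    for j in range(len(samples[0])):
        col = [s[j] for s in samples]
        mu = sum(col) / len(col)
        var = sum((c - mu) ** 2 for c in col) / len(col)
        params.append((mu, math.sqrt(var) if var > 0 else 1e-9))
    return params

def posterior_positive(x, theta_pos, theta_neg, prior_pos):
    # naive assumption: features are conditionally independent given the class
    num = prior_pos
    den = 1.0 - prior_pos
    for xi, (mp, sp), (mn, sn) in zip(x, theta_pos, theta_neg):
        num *= gaussian_pdf(xi, mp, sp)
        den *= gaussian_pdf(xi, mn, sn)
    return num / (num + den)

pos = [[-1.0, -1.0], [-0.5, -0.5], [0.0, 0.0]]   # toy "qrt-pcr positive" samples
neg = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]       # toy "qrt-pcr negative" samples
theta_pos, theta_neg = fit_class(pos), fit_class(neg)

x = [0.0, 0.0]   # a sample whose evidence favors neither class
for prior in (0.9999, 0.5, 0.001):
    print(prior, round(posterior_positive(x, theta_pos, theta_neg, prior), 4))
```

because the toy sample sits exactly between the two class means, the likelihood ratio is 1 and the posterior simply returns the prior; this is precisely the lever the authors tune to trade sensitivity against specificity.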
in supervised learning estimators (our case), this method returns the predicted labels or values computed from the estimated model (in this work, positive or negative for covid-19). cross-validation is a model evaluation method that allows one to reliably evaluate an estimator on a given dataset. it consists of iteratively fitting the estimator on a fraction of the data, called the training set, and testing it on the left-out unseen data, called the test set. several strategies exist to partition the data. in this work, we used the leave-one-out (loo) cross-validation model, as in chang et al. (2003), since this method is appropriate for small sample-size datasets. the data was split n times (n = number of samples). the method was trained on all the data except for one point, and a prediction was made for that point. the proposed approach was implemented in python v.3 (https://www.python.org) using scikit-learn v.0.22.2 (pedregosa et al., 2011) as a backend. in order to evaluate the adequacy and generalization power of the proposed model, as well as its tolerance to samples containing missing data (i.e., at least one variable with no informed values), an additional set of 92 samples (10 positives for covid-19 and 82 negatives) was obtained from the patient database. those samples were not initially employed in model delineation, considering that they present a single missing value among the 15 employed hemogram parameters. this "incomplete dataset", comprising 92 samples with a single missing value per sample, was then submitted to the previously generated model, in order to evaluate classification performance and the model's ability to handle missing values. for data description, the probability density functions (pdf) of all 15 hemogram parameters were estimated from the original sample by a kernel density estimator.
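the leave-one-out scheme just described is easy to sketch without scikit-learn (the paper itself uses the scikit-learn backend). a hypothetical nearest-centroid classifier stands in for the nb model here, and the toy data are invented:

```python
# leave-one-out cross-validation: train on n-1 samples, test on the held-out
# one, repeat n times. the nearest-centroid stand-in classifier and the toy
# data are illustrative assumptions, not the paper's model or data.

def centroid(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def predict(x, data):
    # label of the nearest class centroid (squared euclidean distance)
    by_label = {}
    for feats, label in data:
        by_label.setdefault(label, []).append(feats)
    best, best_d = None, None
    for label, rows in by_label.items():
        c = centroid(rows)
        d = sum((a - b) ** 2 for a, b in zip(x, c))
        if best_d is None or d < best_d:
            best, best_d = label, d
    return best

def loo_accuracy(data):
    hits = 0
    for i, (feats, label) in enumerate(data):
        train = data[:i] + data[i + 1:]   # leave sample i out
        hits += (predict(feats, train) == label)
    return hits / len(data)

data = ([([v, v + 0.1], 1) for v in (-1.0, -0.8, -1.2, -0.9)] +
        [([v, v - 0.1], 0) for v in (1.0, 0.8, 1.2, 0.9)])
print(loo_accuracy(data))
```

each of the n fits sees n − 1 samples, so loo makes the most of a small dataset, at the cost of n model fits, which is the trade-off that motivates its use for the 510-sample cohort.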
some hemogram parameters present notable differences between the distributions of positive and negative results, mainly regarding their modal value (distribution peak) and variance (distribution width). differences are summarized in table 1. regarding basophil, eosinophil, leukocyte and platelet counts, the qrt-pcr positive group distribution shows lower modal value and lower variance. on the other hand, the monocyte count displays the opposite behavior, since a higher modal value and variance are observed for the qrt-pcr positive group. the remaining nine hemogram parameters did not show a notable difference between negative and positive groups. all variables contribute to the classification model, and despite the fact that classification can be performed without the complete set of parameters (i.e., including missing data), the most successful prediction is achieved when complete hemogram information is used as input. pdf analysis results are presented in fig. 1. a nb classifier based on training-set hemogram data was developed. under the model, the complete range of prior probabilities (from 0.0001 to 0.9999, by 0.0001 increments) was scrutinized, and the posterior probability of each class was computed for different prior conditions. a posterior probability value of 0.5 was defined as the threshold for classification into the positive or negative predicted group. the resulting model showed good predictive power for the qrt-pcr test result based on hemogram data. figure 2 shows the accuracy, sensitivity, f1 score, and specificity curves derived from the model for different prior probabilities of each class (positive or negative for covid-19). reported prior probabilities refer to the positive covid-19 condition. when setting the prior probability to the maximum defined value (0.9999), the nb classifier correctly diagnosed all pcr-positive cases. on the other hand, such a configuration improperly predicted 77.3% of negative pcr results as positive.
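the curves in figure 2 come from sweeping the assumed prior and recording the resulting metrics. since the decision rule "posterior > 0.5" is equivalent to "prior odds × likelihood ratio > 1", the sweep can be sketched with simulated per-sample likelihood ratios; these simulated ratios are illustrative stand-ins, not values derived from the real hemogram data:

```python
import math
import random

# sketch of the prior-probability sweep: for each assumed prior, classify by
# posterior > 0.5 (i.e., prior-odds * likelihood-ratio > 1) and record
# sensitivity and specificity. likelihood ratios are simulated, not real.

random.seed(7)
# p(evidence | positive) / p(evidence | negative) per sample: positives tend
# to have lr > 1, negatives lr < 1, with overlap between the groups
pos_lr = [math.exp(random.gauss(1.0, 1.0)) for _ in range(73)]
neg_lr = [math.exp(random.gauss(-1.0, 1.0)) for _ in range(437)]

def metrics(prior):
    odds = prior / (1.0 - prior)
    tp = sum(1 for lr in pos_lr if odds * lr > 1.0)    # predicted positive
    tn = sum(1 for lr in neg_lr if odds * lr <= 1.0)   # predicted negative
    return tp / len(pos_lr), tn / len(neg_lr)          # sensitivity, specificity

for prior in (0.9999, 0.5, 0.0001):
    sens, spec = metrics(prior)
    print(prior, round(sens, 3), round(spec, 3))
```

the qualitative shape matches the reported curves: an extreme positive prior pushes sensitivity to 100% while specificity collapses, an extreme negative prior does the reverse, and intermediate priors trade one metric against the other.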
at the lowest possible prior probability setting, the model does not classify a single observation as positive. this result can be explained by the unbalanced number of observations in each class, which tends to over-classify samples as the class with more observations, that is, negative results. such characteristics can also be noticed in the general accuracy, since smaller values for the prior used in the classifier tend to diagnose all observations as belonging to the dominant class (negative), consequently raising the total of correctly classified samples. the break-even point is met when the prior probability is set to 0.2933. under this condition, all metrics are approximately 76.6%. regarding the model sensitivity, the rate of positive samples correctly classified is over 85% within the 0.999 to 0.5276 range, decreasing slightly as the prior probability of a positive result is diminished within this range. when the prior is set to values under 0.1, the number of positively predicted samples decreases rapidly, yielding lower sensitivity. specificity, in turn, grows roughly linearly as tested priors decrease. ultimately, the accuracy profile is similar to that of specificity, due to the dominance of negative patients. figure 3 shows additional results for the classification performance of the proposed method. figure 3a displays the roc (receiver operating characteristics) curve for the method. the obtained area under the curve (auroc) is 0.84, suggesting excellent prediction performance for the model. figures 3b-3d present prediction results for the baseline model, using a dummy classifier for the most frequent class (negative qrt-pcr, fig. 3b), and for stratified (fig. 3c) and uniform (fig. 3d) evaluations. as mentioned above, the choice of prior probability has critical relevance in the proposed model's use.
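the auroc reported for figure 3a can be computed without plotting the roc curve at all, via its rank interpretation (the mann-whitney statistic): it equals the probability that a randomly drawn positive sample receives a higher score than a randomly drawn negative one. the toy posterior scores below are invented; they happen to yield 0.84, the same magnitude as reported:

```python
# auroc via the mann-whitney rank statistic: the fraction of
# positive/negative pairs in which the positive sample scores higher,
# counting ties as half. toy scores, not the study's posteriors.

def auroc(pos_scores, neg_scores):
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

pos_scores = [0.9, 0.8, 0.75, 0.6, 0.3]   # toy posteriors of true positives
neg_scores = [0.7, 0.5, 0.4, 0.2, 0.1]    # toy posteriors of true negatives
print(auroc(pos_scores, neg_scores))      # → 0.84
```

this pairwise form is O(n·m) but makes the probabilistic meaning of the area explicit; a sorted-rank implementation achieves the same result in O((n+m) log(n+m)).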
it is clear that, when extreme values of the positive-class prior probability are applied (close to 0 or 1), specific classes (positive or negative qrt-pcr test result predictions) are favored, increasing the model's ability to detect them correctly. as an example, when the prior probability of a positive result is set to 0.9999, an increase in the misclassification of negative-class results is observed. at the same time, it is possible to properly identify samples where hemogram evidence strongly indicates a negative result, according to the model. this is based on the fact that the evidence used in the model construction (in the present case, hemogram data) must strongly support the reduction of the posterior probability of disease to values under 0.5, therefore leading to a negative result. this logic can be applied to fine-tune the prior probability used in the model, in order to improve the correct classification of the positive or negative prediction groups. examples of how to use this feature are provided in the ''discussion'' section. samples including a single missing value (n = 92, including 10 qrt-pcr positives) were used to test the missing-data tolerance of the proposed model. figure 4 presents results obtained from applying the model to this incomplete dataset (lacking information for one variable per sample). laboratory findings can provide vital information for pandemic surveillance and management. hemogram data have been previously proposed as useful parameters in the diagnosis and management of viral pandemics (shimoni, glick & froom, 2013). in the present work, an analysis of hemogram data from symptomatic patients suspected of covid-19 infection was performed. a machine learning model based on the naïve bayes method is proposed in order to predict actual qrt-pcr results for such patients.
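the missing-value tolerance tested above follows from the naive factorization: the class-conditional likelihood is a product over features, so one common way to handle an unmeasured feature is simply to omit its factor. the sketch below illustrates that principle with invented per-feature parameters (note this is a sketch of the idea, not a claim about how the scikit-learn backend treats missing values):

```python
import math

# why naive bayes can tolerate missing values: the likelihood factorizes
# over features, so an unmeasured feature contributes no factor.
# parameters and the sample below are invented for illustration.

def gpdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def posterior_positive(x, pos_params, neg_params, prior_pos):
    num, den = prior_pos, 1.0 - prior_pos
    for xi, (mp, sp), (mn, sn) in zip(x, pos_params, neg_params):
        if xi is None:          # unmeasured feature: skip its factor
            continue
        num *= gpdf(xi, mp, sp)
        den *= gpdf(xi, mn, sn)
    return num / (num + den)

# illustrative per-feature (mean, std) for each class
pos_params = [(-1.0, 0.5), (0.8, 0.4)]
neg_params = [(0.5, 0.5), (-0.4, 0.4)]

complete = [-0.9, 0.7]
incomplete = [-0.9, None]      # second feature missing
print(round(posterior_positive(complete, pos_params, neg_params, 0.3), 3),
      round(posterior_positive(incomplete, pos_params, neg_params, 0.3), 3))
```

with these parameters the remaining evidence still points the same way for the incomplete sample, just less strongly: dropping a feature removes information but does not invalidate the posterior computation.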
the presented model can be applied in different situations, aiming to assist medical practitioners and management staff in key decisions regarding this pandemic, especially in conditions of limited access to medical resources (brown, ravallion & van de walle, 2020). figure 5 summarizes model construction and application. predictions are not intended to be used as a diagnostic method, since the technique was designed to anticipate qrt-pcr results only. as such, it is highly dependent on factors affecting qrt-pcr efficiency, and its prediction capability is bounded by the sensitivity, accuracy, and specificity of the original laboratory exam (sethuraman, jeremiah & ryo, 2020). descriptive analysis of the hemogram clinical findings shows differences in blood cell counts and other hematological parameters between covid-19 positive and negative patients. differences are conspicuous for three measures (leukocytes, monocytes and platelets) and more subtle for two additional ones (basophils and eosinophils). it is possible that differences are also present across the complete data spectrum, even though they are not clearly visualized with the pdf data. these results are in accordance with previous reports of changes in laboratory findings in covid-19 infected patients, where conditions such as leukopenia, lymphocytopenia and thrombopenia were reported (ding et al., 2020). it is important to highlight that this data analysis is not sufficient to characterize clinical hematological alterations in the evaluated patients (when compared to demographic hematologic reference data), since the data were normalized over the evaluated sample set only. however, even within this particular quota of the population (individuals presenting covid-19-like symptoms), differences were found between individuals with negative and positive qrt-pcr covid test results.
the proposed nb-ml model can be helpful in accessing different levels of information from hemogram results, by inferring non-evident patterns and parameter relationships from these data. also, our simulations suggest that the nb model has at least some degree of tolerance to missing data values, which can be advantageous compared to other ml techniques. bayesian techniques are based on the choice of a prior probability of an event (in the present case, a positive result for the qrt-pcr test). the method then weighs the actual evidence (hemogram data) to produce a posterior probability of the outcome (prediction of a positive result). by changing the selected prior probability, we can derive an uncertainty analysis of the model and understand the distribution of its predictions. this uncertainty can then be used to adapt the classifier to a particular ongoing context. this option allows the evaluation of different decision-making scenarios concerning diverse aspects of pandemic management. during a crisis, measures should be taken that seek to maximize benefits and achieve a fair resource allocation (emanuel et al., 2020). to illustrate the model's flexibility and how it can be used to help in this matter, a general framework of application is proposed, followed by a simulation of four scenarios where resource scarcity is assumed. the proposed nb model can be applied in two distinct situations. when clinical data are available for a particular patient, it is highly recommended that the medical staff determine the prior probability on a case-by-case basis. when no clinical or medical data are available, or when decisions regarding resource management involving multiple symptomatic patients are necessary, the model can be used on multiple individuals simultaneously, aiming to identify those with higher probabilities of presenting positive qrt-pcr results. individual risk management and personal evaluation are essential for the covid-19 response (gasmi et al., 2020).
in the first situation, individuals presenting covid-19 symptoms are medically evaluated where no covid-19 test is available for appropriate diagnostic confirmation. medical practitioners can determine a probability of disease based on anamnesis, symptoms, clinical exams, laboratory findings and other available data. this probability of infection, as determined by the physician or medical team, can be taken as the prior probability. using hemogram data as input, and informing this prior probability of covid-19 based on the medical findings, the model will produce a posterior probability, which can be higher or lower than the original, depending on the hemogram alterations caused by the virus infection. it is important that hemogram data not be included in the original medical assessment and prior determination, in order to avoid bias and reduce model overfit. in the second situation, the model can be used where decisions are necessary for resource management involving multiple individuals. a target group (positive or negative qrt-pcr result prediction) should first be defined. the model can then be applied to multiple individuals simultaneously, with the prior probability carefully adjusted so that a specific number of individuals are predicted to belong to the target group, according to the desired outcome. this method increases the correct selection of candidates belonging to the target group, when compared to random selection. individual results can be ranked based on the posterior probability of a positive or negative result, and the results stratified according to convenience, as a way to evaluate a particular scenario of interest. when additional clinical data are available, or become available later, patients selected during bulk evaluation should be reassessed individually as proposed in the general framework, in order to reduce misclassifications. examples of the proposed model's use in some specific scarcity scenarios are presented in table 2.
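the bulk-use procedure described above (choose a target group, tune the prior so that a desired number of individuals are predicted in it, then rank by posterior) can be sketched as follows; `tune_prior` and the log-likelihood ratios are hypothetical stand-ins for the fitted model.

```python
import numpy as np

rng = np.random.default_rng(1)
# assumed per-individual log-likelihood ratios (hemogram evidence); in practice
# these would come from the fitted nb model, one value per individual
llr = rng.normal(0.0, 2.0, size=200)

def predicted_positives(prior_pos):
    log_odds = np.log(prior_pos / (1.0 - prior_pos)) + llr
    return log_odds > 0.0                    # posterior > 0.5

def tune_prior(target_count, grid=np.linspace(0.001, 0.999, 999)):
    """smallest prior on the grid whose predicted-positive count reaches target."""
    for p in grid:
        if predicted_positives(p).sum() >= target_count:
            return p
    return grid[-1]

p = tune_prior(target_count=30)
ranked = np.argsort(llr)[::-1]               # individuals ranked by evidence
print("prior:", round(p, 3), "predicted positives:", predicted_positives(p).sum(),
      "strongest candidate index:", ranked[0])
```

because the prior enters the posterior log-odds additively and identically for every individual, raising it only ever enlarges the predicted target group, which is what makes this one-dimensional tuning well defined.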
as can be seen, the model sensitivity can be adjusted by selecting the prior probability employed, according to the desired outcome or group of interest. prior selection should be decided carefully, based on the current context or proposed situation, and must consider the classification group for which higher accuracy is intended. high accuracy in qrt-pcr result prediction is achieved based on hemogram information only. further analysis performed on the original data (not shown) suggests that additional clinical results can improve prediction efficiency. this conclusion is in accordance with previous findings suggesting that biochemical and immunological abnormalities, in addition to hematologic alterations, can be caused by covid-19 disease (henry et al., 2020). in this context, the relevance of the data employed to generate ml models is emphasized. the use of large and comprehensive datasets, containing as much information as possible regarding clinical and laboratory findings, symptoms, disease evolution, and other relevant aspects, is crucial in devising useful and adequate models. the development of nationwide or regional databases based on local data is essential, in order to capture epidemiological idiosyncrasies associated with such populations (terpos et al., 2020). also, natural differences in hemogram results from distinct demographic groups (as seen in reference values according to age, sex, or other physiological factors) can add noise to the model; this noise can be reduced when large databases are employed in model construction, and results can be devised for each demographic stratum. despite having high overall accuracy, the performance metrics obtained with the proposed model show unequal ability to predict positive and negative results. this situation is caused by a significant imbalance in the number of samples belonging to each of these qrt-pcr result groups in the original data.
the use of balanced data in machine learning model design is important to assure high prediction quality (krawczyk, 2016). the option of maintaining the original data in model construction was adopted, since it better represents the actual covid-19 prevalence among symptomatic patients, and therefore seems to represent a more realistic situation. additional simulations applying a balanced model (data not shown), using positive-group oversampling (to compensate for its insufficiency in the original data), have produced alternative models with superior predictive power. results for the alternative balanced model are presented in the supplemental material (fig. s1). therefore, additional positive samples will be added to the data and used in future model versions. as a perspective, the collection of hemogram results from asymptomatic patients (in addition to symptomatic individuals) could be used to evaluate the utility of this approach for the detection of asymptomatic infections, in order to provide alternatives in diagnostics, especially in a context of testing deficiency. a web-based application was developed by the authors, in which hemogram data can be introduced for a single individual, along with the prior probability of infection, based on the data used to generate the present model. the online tool is available at http://sbcb.inf.ufrgs.br/covid. future implementations will allow the upload of multiple patients simultaneously, and the construction or testing of user data-derived models. this service will allow easy access to, and practical application of, the proposed model. machine learning models based on hemogram data can be employed in covid-19 pandemic management, in order to assist strategic medical decisions in different scarcity scenarios. the proposed naïve bayes model has the flexibility to be applied in a large variety of possible critical conditions, and can be adjusted to improve classification accuracy for a particular target group.
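the positive-group oversampling mentioned above can be sketched with simple random duplication of minority samples; the authors' exact balancing procedure is not specified here, so this is only an assumed implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

def oversample(X, y, minority=1):
    """duplicate random minority samples until both classes have equal counts."""
    idx = np.flatnonzero(y == minority)
    need = (y != minority).sum() - idx.size          # copies needed to balance
    extra = rng.choice(idx, size=need, replace=True)
    keep = np.concatenate([np.arange(len(y)), extra])
    return X[keep], y[keep]

X = rng.normal(size=(100, 3))
y = np.array([0] * 90 + [1] * 10)
Xb, yb = oversample(X, y)
print((yb == 0).sum(), (yb == 1).sum())   # 90 90
```

more refined schemes (e.g. synthetic interpolation of minority samples) exist, but plain duplication already removes the prior-like pull of the majority class during training.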
even though the method proposed in this work is not suitable to be used as a diagnostic technique, it can be employed to provide additional, useful information regarding data-driven resource allocation in shortage conditions.

references:
- epidemiology, causes, clinical manifestation and diagnosis, prevention and control of coronavirus disease (covid-19) during the early outbreak period: a scoping review
- how will country-based mitigation measures influence the course of the covid-19 epidemic?
- covid-19: the case for health-care worker screening to prevent hospital transmission
- can the world's poor protect themselves from the new coronavirus? working paper 27200
- gene expression profiling for the prediction of therapeutic response to docetaxel in patients with breast cancer
- dynamic profile and clinical implications of hematological parameters in hospitalized patients with coronavirus disease 2019. epub ahead of print
- fair allocation of scarce medical resources in the time of covid-19
- blood and blood product use during covid-19 infection
- individual risk management strategy and potential therapeutic options for the covid-19 pandemic
- experience with a model of sequential diagnosis
- hands-on machine learning with scikit-learn and tensorflow: concepts, tools, and techniques to build intelligent systems
- clinical characteristics of coronavirus disease 2019 in china
- the elements of statistical learning: data mining, inference and prediction
- laboratory abnormalities in children with mild and severe coronavirus disease 2019 (covid-19): a pooled analysis and review
- clinical features of patients infected with 2019 novel coronavirus in
- health security capacities in the context of covid-19 outbreak: an analysis of international health regulations annual report data from 182 countries
- learning from imbalanced data: open challenges and future directions
- the critical role of laboratory medicine during coronavirus disease 2019 (covid-19) and other viral outbreaks
- defining the epidemiology of covid-19: studies needed
- clinical versus actuarial prediction in the differential diagnosis of jaundice. a study of the relative accuracy of predictions made by physicians and by a statistically derived formula in differentiating parenchymal and obstructive jaundice
- machine learning. first edition
- the socio-economic implications of the coronavirus and covid-19 pandemic: a review
- scikit-learn: machine learning in python
- critical supply shortages: the need for ventilators and personal protective equipment during the covid-19 pandemic
- computer-assisted decision support for the diagnosis and treatment of infectious diseases in intensive care units
- interpreting diagnostic tests for sars-cov-2
- clinical utility for the full blood count in identifying patients with pandemic influenza a (h1n1)
- real-time rt-pcr in covid-19 detection: issues affecting the results
- covid-19: how doctors and healthcare systems are tackling coronavirus worldwide
- hematological findings and complications of covid-19
- the epidemiological and clinical features of covid-19 and lessons from this global infectious public health event
- prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal

the authors received no funding for this work. the authors declare that they have no competing interests. eduardo avila conceived and designed the experiments, performed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft. alessandro kahmann conceived and designed the experiments, performed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft. clarice alho analyzed the data, authored or reviewed drafts of the paper, and approved the final draft.
marcio dorn conceived and designed the experiments, performed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft. the following information was supplied regarding data availability: the dataset used in this study was provided by a third party (albert einstein hospital, são paulo, brazil). it is also available as a supplemental file and at kaggle: https://www.kaggle.com/einsteindata4u/covid19. supplemental information for this article can be found online at http://dx.doi.org/10.7717/peerj.9482#supplemental-information.

key: cord-333919-nrd9ajj2 authors: albi, g.; pareschi, l.; zanella, m. title: relaxing lockdown measures in epidemic outbreaks using selective socio-economic containment with uncertainty date: 2020-05-16 journal: nan doi: 10.1101/2020.05.12.20099721 sha: doc_id: 333919 cord_uid: nrd9ajj2 after an initial phase characterized by the introduction of timely and drastic containment measures aimed at stopping the epidemic contagion from sars-cov2, many governments are preparing to relax such measures in the face of a severe economic crisis caused by lockdowns. assessing the impact of such openings in relation to the risk of a resumption of the spread of the disease is an extremely difficult problem due to the many unknowns concerning the actual number of people infected, the actual reproduction number and infection fatality rate of the disease. in this work, starting from a compartmental model with a social structure, we derive models with multiple feedback controls depending on the social activities that allow to assess the impact of a selective relaxation of the containment measures in the presence of uncertain data. specific contact patterns in the home, work, school and other locations for all countries considered have been used.
results from different scenarios in some of the major countries where the epidemic is ongoing, including germany, france, italy, spain, the united kingdom and the united states, are presented and discussed. "phase two" is the key word after the most critical moment of the coronavirus emergency. the end of the pandemic will not immediately correspond to the disappearance of sars-cov2. this is why an intermediate phase is being carefully considered, in which some activities can be resumed immediately while the reintegration of workers is regulated, for example through indicators measuring the impact of work activities on potential infections, through increased prevention measures, or through so-called immunity passports. several question marks over convalescence times and the real extent of the contagion also raise fears of a second wave. it is essential to build scenarios that will help us understand how the situation might evolve in the future. while in some countries, like italy and france, the debate is open (and not without controversy) between those who would like to restart as soon as possible and those who, as a precautionary measure, would prefer to prolong the covid-19 lockdown, other countries, like germany and sweden, are preparing to restart (and in some cases the reduction of activity has never been total). the overall objective of this second phase is to limit the major damage to a country's economy caused by the severe lockdown measures, while avoiding a restart of the epidemic peak. among the many controversial aspects are, for example, the reopening of schools, sport activities and other social activities at different levels, which, while having less economic impact, have a very high social cost. indeed, it is clear that it is difficult for the population to sustain an excessively long period of lockdown.
it is therefore of primary importance to analyze the possibility of relaxing the control measures put in place by many countries in order to make them more sustainable on the socio-economic front, while keeping the reproductive rate of the epidemic under control and without incurring health risks [23, 16, 19]. the problem is clearly very challenging: traditional epidemiological models based on the assumption of homogeneous population mixing are inadequate, since the whole social and economic structure of the country is involved [29, 26, 11, 25, 21, 27]. on the other hand, interventions involving the whole population allow the use of mathematical descriptions in analogy with classical statistical physics, drawing on the statistical characteristics of a very large system of interacting individuals [1, 2, 3, 8, 18]. a further problem that cannot be ignored is the uncertainty present in the official data provided by the different countries in relation to the number of infected people. the heterogeneity of the procedures used to carry out disease positivity tests, the delays in recording and reporting the results, and the large percentage of asymptomatic patients (in varying percentages depending on the studies and the countries, but estimated by the who at an average of around 80% of cases) make the construction of predictive scenarios subject to high uncertainty [28, 33, 44]. as a consequence, the actual number of infected and recovered people is typically underestimated, causing fatal delays in the implementation of public health policies facing the propagation of epidemic fronts. in this research, we try to contribute to these problems starting from a description of the spread of the epidemic based on a compartmental model with social structure in the presence of uncertain data.
this model makes it possible not only to take into account the specific nature of the different activities involved, through appropriate interaction functions derived from experimental interaction matrices [6, 35, 37, 24], but also to systematically include the uncertainty present in the data [9, 10, 13, 28, 33, 40]. the latter property is achieved by increasing the dimensionality of the problem, adding the possible sources of uncertainty from the very beginning of the modelling. hence, we extrapolate statistics by looking at so-called quantities of interest, i.e. statistical quantities that can be obtained from the solution and that give some global information with respect to the input parameters. several techniques can be adopted for the approximation of the quantities of interest. here, following [4], we adopt stochastic galerkin methods, which allow the problem to be reduced to a set of deterministic equations for the numerical evaluation of the solution in the presence of uncertainties [43, 36, 15]. the main assumption made in this study is that the control measures adopted by the different countries cannot be described by the standard compartmental model, but must necessarily be seen as external actions carried out by policy makers in order to reduce the epidemic peak. most current research in this direction has focused on control procedures aimed at optimizing the use of vaccinations and medical treatments [5, 7, 12, 14, 30], and only recently has the problem been tackled from the perspective of non-pharmaceutical interventions [4, 31, 34, 22, 20]. for this purpose we derive new models based on multiple feedback controls that act selectively on each specific contact function and therefore on each social activity. based on the data in [37], this allows us to analyze the impact of containment measures in a differentiated way on family, work, school, and other activities.
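as a rough illustration of propagating input uncertainty to a quantity of interest, the sketch below samples a single uncertain parameter at gauss-legendre nodes and averages the resulting infected curves by quadrature. note this is non-intrusive stochastic collocation, shown only because it is compact; the paper uses intrusive stochastic galerkin, and all numerical values are illustrative.

```python
import numpy as np
from scipy.integrate import solve_ivp

def sir_rhs(t, x, beta, gamma):
    s, i = x
    return [-beta * s * i, beta * s * i - gamma * i]

gamma, t_grid = 0.05, np.linspace(0.0, 120.0, 121)
nodes, weights = np.polynomial.legendre.leggauss(8)   # z uniform on [-1, 1]
weights = weights / 2.0                               # normalize to the uniform pdf

curves = []
for z in nodes:
    beta = 0.25 * (1.0 + 0.3 * z)                     # uncertain contact rate
    sol = solve_ivp(sir_rhs, (0.0, 120.0), [0.999, 0.001],
                    t_eval=t_grid, args=(beta, gamma), rtol=1e-8)
    curves.append(sol.y[1])

expected_infected = weights @ np.array(curves)        # quadrature over z
print(f"expected peak fraction infected: {expected_infected.max():.3f}")
```

the same node trajectories can also be combined into higher moments (e.g. variance bands around the expected curve), which is the kind of quantity-of-interest output discussed above.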
in our approach, the classical epidemiological parameters that define the reproduction rate of the infectious disease are therefore estimated only in the regime prior to lockdown, and define an estimate of the reproductive rate in the absence of control. at this stage the estimation mainly serves to calibrate the model parameters, and its variability is then accounted for in the intrinsic uncertainty of these values. on the contrary, the control action is estimated in the first lockdown phase using the data available to date. on the modelling front, phase two is therefore characterized by a third temporal region, following the first lockdown period, in which social characteristics become essential to quantify the impact of possible decisions of the various governments. this makes it possible to carry out a systematic analysis for different countries and to observe the different behaviour of the control action, in line with the dynamics observed and the measures taken by the different governments. however, a realistic comparison between countries is an extremely difficult problem that would require a complex phase of renormalization of the data according to the different recording and acquisition methods used. in an attempt to provide comparative results altered as little as possible by assumptions that cannot be justified, we decided to adopt the same criteria for each country; therefore the scenarios presented, although based on realistic values, do not claim the status of real quantitative predictions. we present different simulation scenarios for various countries where the epidemic wave is underway, including germany, france, italy, spain, the united kingdom and the united states, showing the effect of relaxing the lockdown measures in a selective way on the various social activities.
the simulations suggest that premature lifting of these interventions will likely lead to transmissibility exceeding one again, resulting in a second wave of infection. on the other hand, a progressive loosening strategy in subsequent phases, as advocated by some governments, shows that, if properly implemented, it may be capable of keeping the epidemic under control while restarting various productive activities. the starting model in our discussion is a sir-type compartmental model with a social structure and uncertain inputs. the presence of a social structure is in fact essential in deriving control techniques that remain sustainable for the population over a protracted period, as in the case of the recent covid-19 epidemic. in addition, we include the effects of uncertain data on the dynamics, such as the initial conditions on the number of infected people or the interaction and recovery rates. the heterogeneity of the social structure, which impacts the diffusion of the infective disease, is characterized by a ∈ λ ⊂ r+ representing the age of the individual [25, 26]. we assume that the rapid spread of the disease and the low mortality rate allow us to ignore changes in the social structure, such as the aging process, births and deaths. furthermore, we introduce the random vector z = (z 1 , . . . , z dz ) ∈ r^dz, whose components are assumed to be independent real-valued random variables taking into account the various possible sources of uncertainty in the model. we assume to know the probability density p(z) : r^dz → r+ characterizing the distribution of z. we denote by s(z, a, t), i(z, a, t) and r(z, a, t) the densities at time t ≥ 0 of susceptible, infectious and recovered individuals, respectively, in relation to their age a and the source of uncertainty z. the density of individuals of a given age a and the total population number n are deterministic quantities conserved in time, i.e.
s(z, a, t) + i(z, a, t) + r(z, a, t) = f(a). hence, the quantities s(z, t) = (1/n) ∫λ s(z, a, t) da, i(z, t) = (1/n) ∫λ i(z, a, t) da and r(z, t) = (1/n) ∫λ r(z, a, t) da denote the uncertain fractions of the population that are susceptible, infectious and recovered, respectively. in a situation where changes in the social features act on a slower scale with respect to the spread of the disease, the socially structured compartmental model with uncertainties follows the dynamics

∂t s(z, a, t) = −s(z, a, t) Σj ∫λ βj(z, a, a*) i(z, a*, t) da*,
∂t i(z, a, t) = s(z, a, t) Σj ∫λ βj(z, a, a*) i(z, a*, t) da* − γ(z, a) i(z, a, t),   (2)
∂t r(z, a, t) = γ(z, a) i(z, a, t),

with initial conditions i(z, a, 0) = i 0 (z, a), s(z, a, 0) = s 0 (z, a) and r(z, a, 0) = r 0 (z, a). in (2) the functions βj(z, a, a*) ≥ 0 represent transmission rates among individuals with different ages, related to a specific activity characterized by the set a, such as home, work, school, etc., and γ(z, a) ≥ 0 is the recovery rate, which may be age dependent. in the following we assume the quantities βj(z, a, a*) proportional to the contact rates in the various activities. introducing the usual normalization scaling, we observe that the quantities s(t), i(t) and r(t) satisfy an sir-type dynamic (3), where the fraction of recovered is obtained from r(z, t) = 1 − s(z, t) − i(z, t). we refer to [26, 25, 21, 27] for analytical results concerning models (2) and (3) in a deterministic setting. in order to characterize the action of a policy maker introducing a control over the system based on selective containment measures in relation to a specific social activity, we consider the following optimal control setting: minimize over u ∈ u the functional j(u) in (4), where u = (u 1 , . . .
, u l ) is a vector of controls acting locally on the interaction between individuals of ages a and a*, the function ν j (a, t) > 0 is a selective penalization term, and r[i(·, t)] is a suitable linear operator taking into account the presence of the uncertainties z. examples of such operators of interest in epidemic modelling are the expectation with respect to the uncertainties, as in (5), or reliance on deterministic data which underestimate the number of infected, as in (6), where z 0 is a given value chosen so that the corresponding solution matches the deterministic estimate available from the data. note that here we consider less restrictive conditions on the space of admissible controls than in [4]. the above minimization is subject to the dynamics (7), with initial conditions i(z, a, 0) = i 0 (z, a), s(z, a, 0) = s 0 (z, a) and r(z, a, 0) = r 0 (z, a). solving the above optimization problem, however, is generally quite complicated and computationally demanding in the presence of uncertainties, as it involves solving simultaneously the forward problem (4)-(7) and the backward problem derived from the optimality conditions [4]. furthermore, the assumption that the policy maker follows an optimal strategy over a long time horizon seems rather unrealistic in the case of a rapidly spreading disease such as the covid-19 epidemic. in this section we therefore consider short-time-horizon strategies, which permit the derivation of suitable feedback controlled models. these strategies are suboptimal with respect to the original problem (4)-(7), but they have proved to be very successful in several social modeling problems [1, 2, 3, 18].
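a schematic discretization of the structured dynamics (2) above: age is collapsed to a few classes, the transmission functions β j become activity-specific contact matrices, and the odes are stepped explicitly. the matrices and rates below are invented placeholders, not the contact data of [37].

```python
import numpy as np

# three age classes; one invented contact matrix per activity (placeholders)
betas = {
    "home":   0.04 * np.array([[2.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 2.0]]),
    "work":   0.04 * np.array([[0.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 0.0]]),
    "school": 0.04 * np.array([[4.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]),
}
gamma = 0.1 * np.ones(3)

s = np.full(3, 0.33); i = np.full(3, 1e-3); r = np.zeros(3)
dt = 0.1
for _ in range(int(100 / dt)):
    force = sum(b @ i for b in betas.values())   # infection pressure, summed over activities
    # explicit euler step of the structured sir system
    s, i, r = s - dt * s * force, i + dt * (s * force - gamma * i), r + dt * gamma * i

print("final recovered fraction by age class:", np.round(r, 3))
```

selective containment acts in this picture by scaling individual entries of `betas` (e.g. zeroing the "school" matrix), which is exactly the per-activity leverage the controlled model formalizes.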
to this aim, we consider a short time horizon of length h > 0 and formulate a time-discretized optimal control problem through the functional j h (u) in (8), restricted to the interval [t, t + h] and subject to the discrete dynamics (9). recalling that the macroscopic information on the infected is i(z, t), using (8) we can compute the variation of the functional with respect to each control, where we assumed ∂r[i(·, t + h)]/∂u j = r[∂i(·, t + h)/∂u j ], to obtain the nonlinear identities (10). the above assumption on r[·] is clearly satisfied by (5) and (6), where in the case of (6) we used the notation r[s(·, a, t)i(·, a*, t)] = s(z 0 , a, t)i(z 0 , a*, t). introducing the scaling ν j (a, t) = hκ j (a, a*, t), we can apply the instantaneous strategies (10) to the discrete system (9) and pass to the limit for h → 0. the resulting feedback controlled model is given by (11). finally, we provide sufficient conditions for the admissibility of the feedback control in terms of the penalization term. in fact, the dynamics must preserve the monotonicity of the susceptible population number s(z, a, t) for each age class and for each single control action over time. by imposing the non-negativity of the total reproduction rate in (11), and assuming the β j (z, a, a*) bounded, we obtain inequalities that have to be satisfied ∀ a ∈ λ and for every t ≥ 0. in the case of a time-independent penalization term κ j = κ j (a), we obtain admissibility conditions in which r[·] is defined by (5) or (6) and ī(z) is the maximum reached by the total density of infectious.

in this section we present the results of several simulations of the constrained compartmental model with uncertain data. details of the stochastic galerkin method used to deal efficiently with uncertain data may be found in [4, 36].
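a scalar caricature of the instantaneous-control closure above: for a homogeneous sir model, minimizing the infected at t + h plus a quadratic penalty (κ/2)u² over a vanishing horizon yields the feedback u = βsi/κ, which damps transmission as β(1 − u). the structured, uncertain, per-activity version is the one derived in the text; this single-β sketch only illustrates the mechanism, and the cap on u and all parameter values are assumptions.

```python
beta, gamma, kappa = 0.3, 0.1, 0.05
dt, steps = 0.05, int(200 / 0.05)

def run(controlled):
    s, i = 0.999, 0.001
    peak = i
    for _ in range(steps):
        # instantaneous feedback u = beta*s*i/kappa, capped for admissibility
        u = min(beta * s * i / kappa, 0.9) if controlled else 0.0
        new_inf = beta * (1.0 - u) * s * i
        s, i = s - dt * new_inf, i + dt * (new_inf - gamma * i)
        peak = max(peak, i)
    return peak

print(f"uncontrolled peak: {run(False):.3f}, controlled peak: {run(True):.3f}")
```

note how the control is strongest exactly when incidence βsi is largest, which is the qualitative behaviour of the fitted penalization terms discussed below: the effective pressure tracks the epidemic curve rather than following a precomputed schedule.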
the data concerning the actual number of infected, recovered and deaths in the various countries have been taken from the johns hopkins university github repository [17], and for the specific case of italy from the github repository of the italian civil protection department [38]. the social interaction functions β j have been reconstructed from the dataset of age- and location-specific contact matrices related to home, work, school and other activities in [37]. finally, the demographic characteristics of the population for the various countries have been taken from the united nations world population prospects 1. other sources of data which have been used include the coronavirus disease (covid-2019) situation reports of the who 2 and the statistics and research on the coronavirus pandemic (covid-19) from owd 3. estimating epidemiological parameters is a very difficult problem that can be addressed with different approaches [9, 13, 40]. in the case of covid-19, due to the limited number of data and their great heterogeneity, it is an even bigger problem, which can easily lead to wrong results. here, we restrict ourselves to identifying the deterministic parameters of the model through a suitable fitting procedure, considering the possible uncertainties due to such estimation as part of the subsequent uncertainty quantification process. more precisely, we have adopted the following two-level approach to estimating the parameters. in the phase preceding the lockdown we estimated the epidemic parameters, and hence the model reproduction number r 0 , in an uncontrolled regime. this estimate was then kept in the subsequent lockdown phase, where we estimated the value of the control penalty parameter as a function of time. both calibration steps were analyzed under the assumption of homogeneous mixing. therefore, we solved two separate constrained optimization problems.
First, we estimated β > 0 and γ > 0 by solving, in the uncontrolled time interval t ∈ [t_0, t_u], a least-squares problem based on minimizing the relative L2 norm of the difference between the reported numbers of infected Î(t) and recovered R̂(t) and the theoretical evolution of the unconstrained model, I(t) and R(t). Depending on whether the penalization is placed on the curve of the infected or of the recovered, the model adapts better to the respective curve, thus yielding a range for the reproduction number R_0. We observed that, in general, fitting the curve of infectious yields a larger estimated reproduction number than fitting the curve of recovered. On the other hand, the lack of reliable information concerning the recovered in the early stages of the disease suggests adapting the model mainly to the curve of infectious, and introducing the uncertainty in the reproduction number by using this estimated value as an upper bound. Due to the heterogeneity of the data between the different countries, in order to have comparable results with reproduction numbers R_0 = β/γ ∈ (4.5, 9.5), we constrained the value of β ∈ (0, 0.6) and the value of γ ∈ (0.04, 0.06). In Table 1 we report the values obtained by averaging the optimization results obtained with penalization factors of 0.01 and 10⁻⁶, respectively, on the recovered. Next, we estimate the penalization κ = κ(t) > 0 in time by solving, in the controlled time interval t ∈ [t_u, t_c] and for a sequence of time steps t_i of size h, the corresponding least-squares problems over [t_i − k_l h, t_i + k_r h], with k_l, k_r ≥ 1 integers, where for the evolution we consider the values β_e and γ_e estimated in the first optimization step using the curve of infectious. The second fitting procedure has been applied up to the last available data, with daily time stepping (h = 1) and a window of seven days (k_l = 3, k_r = 4) for regularization along one week of available data.
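The first calibration step can be illustrated with a toy version of the procedure: a coarse grid search, standing in for the constrained optimizer used in the text, that minimizes the relative L2 misfit between a synthetic "reported" infected curve and the unconstrained SIR model. The numbers are illustrative; only the γ range (0.04, 0.06) mirrors the constraint quoted above:

```python
import numpy as np

def sir_curve(beta, gamma, s0, i0, n_days, dt=0.05):
    """Forward-Euler SIR trajectory sampled once per day; returns I(t)."""
    s, i = s0, i0
    steps_per_day = round(1 / dt)
    out = []
    for _ in range(n_days):
        for _ in range(steps_per_day):
            new = beta * s * i
            s -= dt * new
            i += dt * (new - gamma * i)
        out.append(i)
    return np.array(out)

def fit_beta_gamma(i_obs, s0, i0, betas, gammas):
    """Grid search minimising the relative L2 misfit between observed and
    simulated infected curves (a stand-in for the constrained optimizer)."""
    best = None
    for b in betas:
        for g in gammas:
            i_sim = sir_curve(b, g, s0, i0, len(i_obs))
            err = np.linalg.norm(i_sim - i_obs) / np.linalg.norm(i_obs)
            if best is None or err < best[0]:
                best = (err, b, g)
    return best  # (misfit, beta_e, gamma_e)

# synthetic "reported" data generated from known parameters
true_b, true_g = 0.30, 0.05
data = sir_curve(true_b, true_g, 0.999, 0.001, 30)
err, b_hat, g_hat = fit_beta_gamma(data, 0.999, 0.001,
                                   np.linspace(0.2, 0.4, 21),
                                   np.linspace(0.04, 0.06, 21))
```

Since the synthetic data come from the model itself, the search recovers the generating parameters exactly; on real reported data the residual misfit is what feeds the subsequent uncertainty quantification.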
For consistency, we performed the same optimization process used to estimate β and γ, namely using two different penalization factors and then averaging the results. These optimization problems have been solved by testing different optimization methods in combination with adaptive solvers for the system of ODEs. The results reported have been obtained using the MATLAB function fmincon in combination with ode45. The corresponding time-dependent values of the controls, as well as the results of the model fitting against the actual trends of infectious, are reported in Figure 1. The trends have been computed using a weighted least-squares fit with the model function k(t) = ae^{bt}(1 − e^{ct}). For some countries, like France, Spain and Italy, after an initial adjustment phase the penalty term converges towards a peak and has just started to decrease. This is consistent with a situation in which the data concerning the number of reported infectious need a certain period of time before being affected by the lockdown policy, and can also be considered an indicator of an unstable situation, where reducing control may lead to a potential restart of the infectious curve. The penalty terms for the US and the UK clearly indicate that the pandemic is still in its growing phase and the situation is far from a controlled equilibrium state. The only exception is Germany, where the dynamic corresponds to a significant decrease in the penalization term, as a result of a timely implementation of social distancing measures.
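The sliding seven-day windows used in the second calibration step (h = 1, k_l = 3, k_r = 4) can be generated as follows; clipping at the data boundaries is our assumption for how the first and last few days are handled:

```python
def fitting_windows(times, h=1, k_l=3, k_r=4):
    """For each daily step t_i return the window [t_i - k_l*h, t_i + k_r*h]
    clipped to the available data range, mirroring the regularising
    seven-day window described in the text."""
    t0, t_end = times[0], times[-1]
    return [(max(t0, t - k_l * h), min(t_end, t + k_r * h)) for t in times]

windows = fitting_windows(list(range(20)))
```

Each interior window spans seven days around t_i, so consecutive estimates of κ(t) share most of their data, which is what regularizes the daily fit.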
Next, we focus on the influence of uncertain quantities on the controlled system with homogeneous mixing. According to recent results on the diffusion of COVID-19 in many countries, the number of infected, and therefore of recovered, is largely underestimated in the official reports; see e.g. [28, 33]. One possible way to account for this is a renormalization of the reported data based on the estimated infection fatality rate (IFR) of COVID-19. Although estimating the true IFR is generally hazardous while an epidemic is underway, some studies have estimated an overall IFR around 1.3% with an age-dependent credible interval [41, 39]. In the sequel we consider a range spanning 0.9%-2.0%. On the contrary, the case fatality rate (CFR) may vary strongly from country to country, according to differences in the number of people tested, demographics, and the health care system. One way to gain insight into the uncertainty of the data is to use the estimated IFR ranges as normalization factors for the currently reported total cases I_tot. This is done by computing an estimated number of total confirmed cases as Î_tot = 100 × D_r/IFR, where D_r is the total number of confirmed deaths. The resulting ratios Î_tot/I_tot for the various countries are summarized in Figure 2 and are directly proportional to the CFR of each country. We are aware that the estimate obtained is certainly coarse; nevertheless, it allows one to get an idea of the disagreement between the observed and expected data in the various countries, and therefore to define a common scenario across countries. In order to gain insight into the global impact of uncertain parameters, we consider a two-dimensional uncertainty z = (z_1, z_2) with independent components, such that I(z, 0) = I_0(1 + μz_1), R(z, 0) = R_0(1 + μz_1) (14)
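The renormalization Î_tot = 100 × D_r/IFR is a one-line computation; the death count used below is purely illustrative, while the 0.9%-2.0% IFR range is the one quoted above:

```python
def estimated_total_cases(deaths, ifr_percent):
    """Renormalise reported totals by an assumed infection fatality rate:
    I_tot_hat = 100 * D_r / IFR, with the IFR given in percent."""
    return 100.0 * deaths / ifr_percent

# hypothetical country with 30,000 confirmed deaths:
low = estimated_total_cases(30_000, 2.0)   # pessimistic IFR -> fewer true cases
high = estimated_total_cases(30_000, 0.9)  # optimistic IFR -> more true cases
```

Dividing `low` and `high` by the reported I_tot then gives the country-specific underestimation factors summarized in Figure 2.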
and β(z) = β_e + α_β z_2, γ(z) = γ_e + α_γ z_2 (15), where z_1, z_2 are distributed as symmetric Beta functions on [0, 1], I_0 and R_0 are the initial numbers of reported cases and recovered, and β_e, γ_e are the fitted values given in Table 1. In the following we consider μ = 2(c − 1), common to all countries, such that E[I(z, 0)] = cI(0) and E[R(z, 0)] = cR(0), where c = 8.56 is the average value from Figure 2. From a computational viewpoint we adopted the method developed in [4], based on a stochastic Galerkin approach. The feedback-controlled model has been computed using an estimate of the total number of susceptible and infected reported; namely, we have the control term (16), where S_r(t) and I_r(t) are the model solutions obtained from the reported data, and thus I_r(t) represents a lower bound for the uncertain solution I(z, t). In Figure 3 we report the results concerning the evolution of estimated current infectious cases from the beginning of the pandemic in the reference countries, using z_1 ∼ B(10, 10) and α_β = α_γ = 0. In the inset figures the evolution of total cases is reported. The expected number of infectious is plotted as a blue continuous line. Furthermore, to highlight the country-dependent underestimation of cases, we also report with dash-dotted lines the expected evolutions obtained when the uncertain parameter c > 0 varies from country to country, according to the numbers on top of the red bars in Figure 2. In Figure 4 we report the evolution of the reproduction number R_0 for the considered countries under the uncertainties in (15), obtained with α_β = −3, α_γ = 5 and z_2 ∼ B(2, 2).
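The construction E[I(z, 0)] = cI(0), with z_1 ∼ B(10, 10) on [0, 1] and μ = 2(c − 1), can be checked by Monte Carlo sampling. This is only a sanity check of the scaling; the text uses a stochastic Galerkin expansion rather than sampling:

```python
import numpy as np

rng = np.random.default_rng(0)

def uncertain_initial_infected(i0, c, n=200_000):
    """Draw I(z,0) = i0*(1 + mu*z1) with z1 ~ Beta(10,10) on [0,1] and
    mu = 2*(c - 1); since E[z1] = 1/2 this gives E[I(z,0)] = c*i0."""
    mu = 2.0 * (c - 1.0)
    z1 = rng.beta(10, 10, size=n)
    return i0 * (1.0 + mu * z1)

# with c = 8.56 (the average from Figure 2) and a nominal i0 of 1000:
samples = uncertain_initial_infected(1000.0, 8.56)
```

The sample mean lands on c·i0 = 8560, and every realization stays at or above the reported value i0, consistent with I_r(t) being a lower bound for I(z, t).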
It has been reported, in fact, that deterministic methods based on compartmental models overestimate the effective reproduction number [32]. The reproduction number is estimated from the controlled transmission dynamics, where u(t) is the control defined in (16) and t̄ is the country-dependent lockdown time. The estimated reproduction number relative to the data is reported with x-marked symbols and represents an upper bound for R_0(z_2, t). The first day on which both the 50% confidence interval and the expected value fall below 1 is highlighted with a shaded green region. We can observe that the model estimates that for most countries the reproduction number R_0 fell below the threshold of 1 in the first days of April. On the other hand, in the UK and the US the same condition was reached between the end of April and the beginning of May. In realistic terms these dates should be considered overestimates, as they are essentially based on observations, without taking into account the delay in the reported data. We now analyze the effects of the inclusion of age dependence and social interactions in the above scenario. The number of contacts per person generally shows considerable variability depending on age, occupation and country, in relation to the social habits of the population. However, some universal features can be extracted, which emerge as functions of specific social activities. More precisely, we consider the social interaction functions corresponding to the contact matrices in [37] for the various countries. As a result, we have four interaction functions characterized by A = {F, E, P, O}, where we identify family and home contacts with β_F, education and school contacts with β_E, professional and work contacts with β_P, and other contacts with β_O. These functions have been reconstructed over the age interval Λ = [0, a_max], a_max = 100, using linear interpolation. We report in Figure 5, as an example, the total social interaction functions for the various countries.
The functions share a similar structure, but with different scalings according to the country-specific features identified in [37].

Figure 3: Evolution of current and total cases for each country with uncertain initial data as in (14), based on the average uncertainty between countries. The 95% and 50% confidence levels are represented as shaded and darker shaded areas, respectively. The dash-dotted lines denote the expected trends with a country-dependent uncertainty from Figure 2.

Figure 4: Evolution of the estimated reproduction number R_0 and its confidence bands for uncertain data as in (15). The 95% and 50% confidence levels are represented as shaded and darker shaded areas, respectively. The green zones denote the interval between the first day the 50% confidence band and the expected value fall below 1.

Figure 5: The total contact interaction function β = β_F + β_E + β_P + β_O, taking into account the contact rates of people of different ages. Family and home contacts are characterized by β_F, education and school contacts by β_E, professional and work contacts by β_P, and other contacts by β_O.
In order to match the age-structured model with the homogeneous mixing model, the social interaction functions were normalized using the previously estimated parameters β_e and γ_e, in accordance with (17). We considered both a uniform recovery rate and an age-related recovery rate [42], taken as a decreasing function of age, with r = 5 and C ∈ ℝ such that (17) holds. Clearly, this choice involves a certain degree of arbitrariness, since there are not yet sufficient studies on the subject; nevertheless, as we will see in the simulations, it is able to reproduce more realistic scenarios in terms of the age distribution of the infected, without significantly altering the behaviour of the total number of infected. In a similar spirit, to match the single control applied in the extrapolation of the penalization term κ(t) to age-dependent penalization factors κ_j(a, t), we redistribute their values through weight factors w_j(t) ≥ 0, denoting the relative amount of control on a specific activity. In the lockdown period, in accordance with other studies [37], we assume w_E = 1.5, w_H = 0.2, w_P = 0.5, w_O = 0.6; namely, the largest effort of the control is due to the school closure, which as a consequence implies more interactions at home. Work and other activities are equally impacted by the lockdown. In particular, these initial lockdown choices make it possible to have a good correspondence between the infectivity curves expected in the age-dependent case and in the homogeneous mixing case. Therefore, these values have been set homogeneously for each country, and correspond to the situation in the first lockdown period. We will discuss possible changes to these choices, following a relaxation of the lockdown, in the different scenarios presented below. We divided the computational time frame into two zones and used a different model in each zone, in accordance with the policies adopted by the various countries.
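Assuming the redistribution takes the proportional form κ_j(t) = w_j(t)·κ(t), an assumption on our part, since only the weights are stated explicitly, the lockdown weights quoted above act on a fitted κ(t) as follows:

```python
# lockdown weights from the text: school carries the largest control effort,
# home the smallest (school closure increases interactions at home)
WEIGHTS = {"school": 1.5, "home": 0.2, "work": 0.5, "other": 0.6}

def activity_penalisations(kappa_t, weights=WEIGHTS):
    """Redistribute the single fitted penalisation kappa(t) over the four
    activity channels, assuming kappa_j(t) = w_j * kappa(t)."""
    return {activity: w * kappa_t for activity, w in weights.items()}

per_activity = activity_penalisations(2.0)
```

Scenario changes (e.g. relaxing work by some percentage) then amount to rescaling individual weights while leaving the fitted κ(t) untouched.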
The first time interval defines the period without any form of containment; the second, the lockdown period. In the first zone we adopted the uncontrolled model with homogeneous mixing for the estimation of the epidemiological parameters. Then, in the second zone, we compute the evolution of the feedback-controlled age-dependent model (11), with interaction and recovery rates matching (on average) through (17), and with the estimated control penalization κ(t). The initial values for the age distributions of the susceptible have been taken from the specific demographic distribution of each country. It is more difficult to obtain the same information for the infected, since the reported data are rather heterogeneous across countries and the initial number of individuals is very small (we selected a time frame where the reported number of infectious is larger than 200). Therefore, we tested the available data against a uniform distribution. As there were no particular differences in the results, we decided to adopt a uniform initial distribution of the infected for all countries. In Figure 6 we report the age distribution of infected computed for each country at the end of the lockdown period, using an age-dependent recovery and a constant recovery. The differences in the resulting age distributions are evident. In subsequent simulations, to avoid an unrealistic peak of infection among young people, we decided to adopt an age-dependent recovery [42]. In the first scenario we analyze the effects on each country of the same relaxation of the lockdown measures at two different times. The first date is country-specific, according to currently available information; the second is June 1st for all countries. For all countries we assumed a reduction of individual controls on the different activities of 20% on family activities, 35% on work activities and 30% on other activities, without changing the control over the school. The behaviors of the
Figure 6: Age distribution of infected using constant and age-dependent recovery rates as in (18) at the end of the lockdown period in different countries.

curves of infected people, together with the relative 95% confidence bands, are reported in Figure 7. The results clearly show the substantial differences between the countries, with a situation in the UK and US where it seems clearly premature to relax the lockdown measures. On the contrary, Germany and, to some extent, Spain are in the most favorable situation to ease the lockdown without risking a new start of the infection. In all cases, however, it is clear that a further increase in the number of infected people should be expected. In order to highlight the differences in the behaviour of the infection according to the choices related to specific activities, such as school and work, we have considered the effects of a specific lockdown relaxation in these directions. Precisely, for each country we have identified a range for such a loosening, which gives an indication of the maximum allowed opening of the activities before a strong restart of the infection. We assumed a relaxation of the school lockdown, with a mild resumption of family, work and other activity interactions of 5% for each 10% release of the school. The results are reported in Figure 8. Next, we perform a similar relaxation process oriented towards productive activities, with a reduction of control on such activities at various percentages. Here we assumed no impact on school activities, and a mild impact on family and other activities, with a loosening of 5% for each 10% release of work. The results are given in Figure 9.
In both cases, the results show that the selected countries can be divided into three groups: Germany and Spain, in a stable downward phase of the epidemic curve; France and Italy, in a still transitional phase with greater risks in reopening; and the UK and US, in the full growth phase of the epidemic curve, in which any relaxation of lockdown measures leads to a strong restart of the epidemic. One of the major problems in the application of very strong containment strategies is the difficulty of maintaining them over a long period, both for the economic impact and for the social impact on the population. The results of the previous scenarios have shown that it may be possible for some countries, like Germany, Spain, France and Italy, to relax the lockdown measures, albeit with some risk of an increase in the contagion curve. On the contrary, in all scenarios considered, the situation in the UK and the US suggests that any loosening of containment measures should be postponed. In the latter scenario, we consider a strategy based

Figure 7: Scenario 1: effect of releasing containment measures in various countries at two different times. In all countries after lockdown we assumed a reduction of individual controls on the different activities of 20% on family activities, 35% on work activities and 30% on other activities, keeping the lockdown over the school.
Figure 10: Scenario 3: relaxing lockdown measures in a progressive way in two subsequent phases, while keeping the epidemic peak under control. In the second phase only productive activities are restarted, together with a partial resumption of home interactions and other activities. In a third phase, school activities are also partially reopened (see Table 2).

on a two-stage opening of the containment measures with a progressive approach. This possibility is analysed for the four countries where it might be possible to partially reopen productive activities without restarting the contagion curve. For each country we have selected a progressive lockdown relaxation focused mainly on the opening of productive activities in the second phase, with a partial reopening of school activities in the third phase. The reductions of the controls are now country-specific, and the values are reported in Table 2. In Figure 10 we plot the resulting behavior of the expected number of current infectious. The simulations show that for all these countries it is possible to relax the containment measures in a progressive way while keeping the infection curve under control. However, the timing and intensity of the relaxation choices play a fundamental role in the process. The approach of a second phase of the pandemic, into a new normality, is full of uncertainties from the point of view of social and economic planning. It is clear that this period cannot be marked by a complete reopening, by a return to the old normality. For some time to come it will be necessary to respect rules of conduct and hygiene standards to which we have not been accustomed.
There are many issues to be addressed, in particular how to gradually reopen the various social and economic activities without creating a new wave of infected, and therefore of deaths. In order to analyze possible future scenarios, it is essential to have models capable of describing the impact of the epidemic according to the specific social characteristics of the country and the containment actions implemented. In this work, aware of the complexity of the problem, we have tried to provide a suitable modeling context to describe possible future scenarios in this direction.

Table 2: Scenario 3: progressive relaxation of lockdown measures for the different countries, as specific control reduction percentages (one column per country, in the order of the original table). Results are reported in Figure 10.

    home     30%-60%   10%-25%   10%-15%   20%-25%
    school    0%-60%    0%-30%    0%-20%    0%-40%
    work     70%-80%   35%-45%   40%-50%   60%-70%
    other    30%-60%   10%-25%   10%-20%   20%-45%

More precisely, with the aid of a SIR model with specific feedback controls on social interactions, capable of describing the selective action of a government in opening certain activities such as home, work, school and other activities, we can simulate their future impact with respect to the current epidemic trend. In particular, in an effort to take into account the high uncertainty in the data, the model has been formalized in the presence of uncertain input parameters that allow us to explore hypothetical scenarios with appropriate confidence bands. The simulation parameters have been obtained using data coming from several countries with "comparable situations" in terms of epidemic progress, such as Italy, France, Germany, Spain, the United Kingdom and the United States.
The model is capable of accurately describing the reported data, thanks to the introduction of the time-dependent control action, and can therefore provide potentially useful indications, thanks to the dependence of the interactions within the population on the social context. The results, in accordance with the observations, show situations with different levels of sensitivity to a hypothetical reopening of certain activities. The scenarios presented, designed so that the various realities can be compared, are largely hypothetical situations, but they highlight very well the impact of the different social activities, and show how some countries, such as the United Kingdom and the United States, are still in an epidemic situation that suggests maintaining the current lockdown measures before moving to a second phase. On the contrary, the simulations show how Germany, before the other countries, and secondly Spain, France and Italy, can aim at a gradual reopening of social and economic activities, keeping the epidemic curve under control, provided the activities are resumed in a progressive way and within an appropriate time frame. The use of a SIR model with social structure, modified through appropriate feedback controls, allows us to obtain simulations in agreement with the current epidemic scenarios in different countries, including Germany, France, Italy, Spain, the United Kingdom and the United States. The inclusion of uncertainty about the actual number of infected people makes it possible to analyze the effects of the potential reopening of productive and social activities at different times. A multi-modelling approach aligned with the current epidemiological and demographic data, which includes experimental social interaction matrices for the different countries, permits the contextualization of possible future scenarios. Further studies are being conducted on geographical dependence through spatial variables.
This would make it possible to characterize control measures on a local rather than a global basis.

References (titles, in the order of the bibliography; numbering lost):
- Selective model-predictive control for flocking systems
- Uncertainty quantification in control problems for flocking models
- Kinetic description of optimal control problems and applications to opinion consensus
- Control with uncertain data of socially structured compartmental epidemic models
- Optimal control of a SIR epidemic model with general incidence function and time delays
- The French connection: the first large population-based contact survey in France relevant for the spread of infectious diseases
- Time-optimal control strategies in SIR epidemic models
- (Un)conditional consensus emergence under perturbed and decentralized feedback controls
- Parameter estimation and uncertainty quantification for an epidemic model
- Towards uncertainty quantification and inference in the stochastic SIR epidemic model
- Epidemiological models with age structure, proportionate mixing, and cross-immunity
- Optimal control for pandemic influenza: the role of limited antiviral treatment and isolation
- Fitting dynamic models to epidemic outbreaks with quantified uncertainty: a primer for parameter uncertainty, identifiability, and forecasts
- Optimizing vaccination strategies in an age structured SIR model
- Uncertainty quantification for kinetic models in socioeconomic and life sciences
- Wealth distribution under the spread of infectious diseases
- An interactive web-based dashboard to track COVID-19 in real time. The Lancet Infectious Diseases
- Kinetic models for optimal control of wealth inequalities
- Restarting the economy while saving lives under COVID-19
- Estimating the number of infections and the impact of non-pharmaceutical interventions on COVID-19 in 11 European countries
- Threshold behaviour of a SIR epidemic model with age structure and immigration
- A feedback SIR (fSIR) model highlights advantages and limitations of infection-based social distancing
- Spread and dynamics of the COVID-19 epidemic in Italy: effects of emergency containment measures
- Mixing in age-structured population models of infectious diseases
- Modeling heterogeneous mixing in infectious disease dynamics
- The mathematics of infectious diseases
- Analytical and numerical results for the age-structured S-I-S epidemic model with mixed inter-intracohort transmission
- Correcting under-reported COVID-19 case numbers: estimating the true scale of the pandemic
- A contribution to the mathematical theory of epidemics
- Modeling optimal age-specific vaccination strategies against pandemic influenza
- An optimal control theory approach to non-pharmaceutical interventions
- The reproductive number of COVID-19 is higher compared to SARS coronavirus
- Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship
- Optimal, near-optimal, and robust epidemic control
- Social contacts and mixing patterns relevant to the spread of infectious diseases
- An introduction to uncertainty quantification for kinetic equations and related problems
- Projecting social contact matrices in 152 countries using contact surveys and demographic data
- GitHub: COVID-19 Italia - monitoraggio situazione
- An empirical estimate of the infection fatality rate of COVID-19 from the first Italian outbreak
- Epidemic models with uncertainty in the reproduction number
- Estimating the infection and case fatality ratio for coronavirus disease (COVID-19) using age-adjusted data from the outbreak on the Diamond Princess cruise ship
- Age-dependent risks of incidence and mortality of COVID-19 in Hubei province and other parts of China
- Estimation of the reproductive number of novel coronavirus (COVID-19) and the probable outbreak size on the Diamond Princess cruise ship: a data-driven analysis

Acknowledgments. This work has been written within the activities of the GNFM and GNCS groups of INdAM (National Institute of High Mathematics). G. Albi and L. Pareschi acknowledge the support of the MIUR-PRIN Project 2017, No. 2017KKJP4X, "Innovative numerical methods for evolutionary partial differential equations and applications". M. Zanella was partially supported by the MIUR "Dipartimenti di Eccellenza" program (2018-2022), Department of Mathematics "F. Casorati", University of Pavia.

key: cord-319933-yp9ofhi8 authors: Ruiz, Sara I.; Zumbrun, Elizabeth E.; Nalca, Aysegul title: Chapter 38: Animal Models of Human Viral Diseases date: 2013-12-31 journal: Animal Models for the Study of Human Disease doi: 10.1016/b978-0-12-415894-8.00038-5 sha: doc_id: 319933 cord_uid: cord-319933-yp9ofhi8

Abstract: As the threat of exposure to emerging and reemerging viruses within a naive population increases, it is vital that the basic mechanisms of pathogenesis and immune response be thoroughly investigated. By using animal models in this endeavor, the response to viruses can be studied in a more natural context to identify novel drug targets, and to assess the efficacy and safety of new products. This is especially true in the advent of the Food and Drug Administration's Animal Rule. Although no one animal model is able to recapitulate all aspects of human disease, understanding the current limitations allows for a more targeted experimental design. Important facets to be considered before an animal study are the route of challenge, species of animal, biomarkers of disease, and a humane endpoint.
This chapter covers the current animal models for medically important human viruses, and demonstrates where gaps in knowledge exist.

Well-developed animal models are necessary to understand disease progression, pathogenesis, and immunologic responses in humans. Furthermore, to test vaccines and medical countermeasures, well-developed animal models are essential for preclinical studies. Ideally, an animal model of human viral infection should mimic the host-pathogen interactions and the disease progression seen in the natural disease. A good animal model of viral infection should allow many parameters of infection to be assayed, including clinical signs, growth of virus, clinicopathological parameters, cellular and humoral immune responses, and virus-host interactions. Furthermore, viral replication should be accompanied by measurable clinical manifestations, and pathology should resemble that of human cases, such that a better understanding of the disease process in humans is attained. There is often more than one animal model that closely represents human disease for a given pathogen. Small animal models are typically used for first-line screening, and for initially testing the efficacy of vaccines or therapeutics. In contrast, nonhuman primate (NHP) models are often used for the pivotal preclinical studies. This approach is also used for basic pathogenesis studies, with most experiments performed in small animal models when possible, and NHPs used only to fill in remaining gaps in knowledge. The advantages of using mice to develop animal models are low cost, low genetic variability in inbred strains, and abundant molecular biological and immunological reagents. Specific pathogen-free (SPF), transgenic, and knockout mice are also available. A major pitfall of mouse models is that the pathogenesis, and the protection afforded by vaccines and therapeutics, cannot always be extrapolated.
additionally, blood volumes for sampling are limited in small animals, and viruses often need to be adapted through serial passage in the species to induce a productive infection. the ferret's airways are anatomically and histologically similar to those of humans, and their size enables larger or more frequent blood samples to be collected, making them an ideal model for certain respiratory pathogens. ferrets are outbred, with no standardized breeds or strains; thus, greater numbers are required in studies to achieve statistical significance and overcome the resulting variable responses. additionally, spf and transgenic animals are not available, and molecular biological reagents are lacking. other caveats making ferret models more difficult to work with are their requirement for more space than mice (rabbit-style cages), and the development of aggressive behavior with repeated procedures. nhps are genetically the closest species to humans; thus, disease progression and host-pathogen responses to viral infections are often the most similar to those of humans. however, ethical concerns of experimentation on nhps, along with the high cost and lack of spf nhps, raise barriers for such studies. nhp studies should be carefully designed to ensure that the fewest animals are used, and the studies should address the most critical questions regarding disease pathogenesis, host-pathogen responses, and protective efficacy of vaccines and therapeutics. well-designed experiments should carefully evaluate the choice of animal, including the strain, sex, and age. furthermore, the route of exposure and the dose should be as close as possible to the route of exposure and dose of human disease. the endpoint for these studies is also an important criterion. depending on the desired outcome, the model system should emulate the host responses in humans when infected with the same pathogen.
in summary, small animal models are helpful for the initial screening of vaccines and therapeutics, and are also often beneficial in obtaining a basic understanding of the disease. nhp models should be used for a more detailed characterization of pathogenesis and for pivotal preclinical testing studies. ultimately, an ideal animal model may not be available. in this case, a combination of different well-characterized animal models should be considered to understand the disease progression and to test medical countermeasures against the disease. in this chapter, we will be reviewing the animal models for representative members of numerous virus families causing human diseases. we will focus on the viruses in each family that are of the greatest concern for public health worldwide. poliovirus (pv) is an enterovirus in the picornavirus family and causes poliomyelitis. 1 humans are the only natural host for the virus, but a number of nhp species are also susceptible. all three serotypes of pv cause paralytic disease, but it is relatively rare, with only 1-2% of infected individuals ultimately developing paralysis. humans typically acquire and transmit the virus by the oral-fecal route, although transmission by aerosol droplets may also be possible. 2 the virus replicates in the oropharyngeal and intestinal mucosa, made possible by the resistance of pv to stomach acids. 3 cd155 expression in peyer's patches and m cells suggests that these cell types may be important during initiation of infection. 4 replication at extraneural sites precedes invasion into the central nervous system (cns), when it occurs. two effective vaccines, the salk killed polio vaccine delivered by the intramuscular route and the sabin live attenuated polio vaccine delivered orally, have been used very successfully to eliminate the disease from most parts of the world.
5 the world health organization has led a long and hard-fought global polio eradication campaign, with much success, but full eradication has not yet been achieved. since 2003, between 1000 and 2000 cases of pv infection have been reported worldwide each year. 6 thus, animal models are also needed to test new vaccine approaches that could be used toward eradication of polio in the areas where it still persists. additionally, recent work with pv animal models has been driven by urgency to gain an understanding of pv pathogenesis before the eradication effort is complete and work with this virus ceases. animal models for the study of pv consist of nhp models and mouse models. mice are susceptible to certain adapted pv strains: p2/lansing, p1/lsb, and a variant of p3/leon. mice infected intracerebrally with p2/lansing develop disease with some clinical and histopathological features resembling those of humans. 7 wild-type mice are not susceptible to wild-type pv; however, the discovery of the pv receptor (cd155) in 1989 led to the use of hcd155 transgenic mice as a model of pv infection. 8 these mice are not susceptible to pv by the oral route and must be exposed intranasally or by intramuscular inoculation to induce paralytic disease. 9 interestingly, hcd155 mice that have a disruption in the interferon (ifn)-a/b receptor gene are susceptible to oral infection. 10 this finding has given rise to speculation that an intact ifn-a/b response may be responsible for limiting infection in the majority of individuals exposed to pv. thus, mouse models have proven to be very useful in gaining a better understanding of pv disease and pathogenesis. rhesus macaques are not susceptible to pv by the oral route, but they have been used extensively to study vaccine formulations for safety and immunogenicity, for monitoring neurovirulence of the live attenuated sabin vaccine, and in the past for typing pv strains.
11 bonnet monkeys are also susceptible to oral inoculation of pv, which results in gastrointestinal shedding of virus for several weeks, with paralysis occurring in only a small proportion of animals. consistent paralytic disease can be induced in bonnet monkeys (macaca radiata) through inoculation of pv into the right ulnar nerve (at the elbow), resulting in limb paralysis that resembles human paralytic poliomyelitis both clinically and pathologically. 12 as such, bonnet monkeys can be used to study pv distribution and pathology and the induction of paralytic poliomyelitis or provocation paralysis. 13 hepatitis a virus, a public health problem worldwide, causes jaundice. the incubation period lasts from 15 to 45 days, with an average of 28 days. transmission between humans occurs by the oral-fecal route, person-to-person contact, or ingestion of contaminated food and water. 14 hepatitis a virus causes an acute and self-limited infection of the liver, with a spectrum of signs and symptoms ranging from subclinical disease to jaundice, fulminant hepatitis, and in some cases death. 15, 16 the disease can be divided into four clinical phases: (1) incubation period, during which the patient is asymptomatic but virus replicates and is possibly transmitted to others; (2) prodromal period, which might last from a few days to a week, with patients generally experiencing anorexia, fever (<103 f), fatigue, malaise, myalgia, nausea, and vomiting; (3) icteric phase, in which increased bilirubin causes characteristic dark brownish colored urine, a sign followed by pale stool and yellowish discoloration of the mucous membranes, conjunctiva, sclera, and skin; most patients develop an enlarged liver, and approximately 5-15% of patients have splenomegaly; and (4) convalescent period, with resolution of the disease and recovery of the patient. rarely, during the icteric phase, extensive necrosis of the liver occurs.
these patients show a sudden increase in body temperature, marked abdominal pain, vomiting, jaundice, and the development of hepatic encephalopathy associated with coma and seizures, all signs of fulminant hepatitis. death occurs in 70-90% of patients with fulminant hepatitis. 16 experiments showed that hepatitis a causes disease only in humans, chimpanzees, several species of south american marmosets, stump-tailed monkeys, and owl monkeys via the oral or intravenous (iv) routes. [17] [18] [19] [20] it is known that cynomolgus macaques are infected with hepatitis a virus in the wild. 21 amado et al. used cynomolgus macaques (macaca fascicularis) for experimental hepatitis a infections. 17 the animals did not exhibit clinical signs of disease, but viral shedding was observed in saliva and stool as early as 6 h postinoculation (pi) and 7 days pi, respectively. although mild-to-moderate hepatic pathology was observed in all macaques, seroconversion and mildly increased alanine aminotransferase (alt), an enzyme associated with liver function, were observed in some of them. because this study had a very small group of animals (four macaques), the data should not be considered conclusive, and more studies are needed to better define the cynomolgus macaque model. although hepatitis a virus is transmitted by the oral-fecal route, studies in chimpanzees and tamarins showed that the iv route was much more infectious than the oral route. there was no correlation between dose and development of clinical disease for either species or experimental route, and, similar to cynomolgus macaques, none of these species showed clinical signs of disease. 20 inoculation of common marmosets (callithrix jacchus) with hepatitis a virus did not produce clinical signs of disease, as seen in other nhp models. 22, 23 liver enzyme levels increased on day 14 pi, and monkeys had measurable antihepatitis a antibodies by day 32 pi.
an experimental study with cell culture-adapted hepatitis a virus in guinea pigs challenged by oral or intraperitoneal routes did not result in clinical disease, an increase in liver enzymes, or seroconversion. 24 viral load was detected in stool between days 14 and 52 and in serum between days 21 and 49. liver pathology showed mild hepatitis. furthermore, histopathology indicated that virus replicated in extrahepatic tissues such as the spleen, regional lymph nodes, and intestinal tract. in summary, none of the animal models of hepatitis a infection is suitable for studying pathogenesis of the virus, because all clinical and most of the laboratory parameters remain within the normal range or are only slightly increased after infection. one possibility is to test the safety of vaccines against hepatitis a virus in those models with demonstrable viral shedding. noroviruses, of which norwalk is the prototypic member, are responsible for up to 85% of reported food-borne gastroenteritis cases. in developing countries, this virus is responsible for approximately 200,000 deaths annually. 25 a typical disease course is self-limiting, but there have been incidences of necrotizing enterocolitis and seizures in infants. 26, 27 symptoms of infection include diarrhea, vomiting, nausea, abdominal cramping, dehydration, and fever. incubation normally lasts 1-3 days, with symptoms enduring for 2-3 days. 28 the duration of viral shedding is indicative of immune status within an individual, with the elderly and young having a prolonged state of shedding. 29 transmission occurs predominantly through the oral-fecal route, with contaminated food and water being the major vector. 30 a major hindrance to basic research into this pathogen is the lack of a cell culture system. therefore, animal models are used not only to determine the efficacy of novel drugs and vaccines but also for understanding the pathogenesis of the virus.
therapeutic intervention consists of rehydration therapy and antiemetic medication. 31 no vaccine is available, and development of one is expected to be challenging, given that immunity is short lived after infection. 32 nhps including marmosets, cotton-top tamarins, and rhesus macaques infected with norwalk virus can be monitored for the extent of viral shedding; however, no clinical disease is observed in these models. disease progression and severity are measured exclusively by assay of viral shedding. 33 it was determined that more virus was needed to establish an infection when challenging by the oral route than by the iv route. chimpanzees were exposed to a clinical isolate of norwalk virus by the iv route. although none of the animals developed disease symptoms, viral shedding within the feces was observed within 2-5 days postinfection and lasted anywhere from 17 days to 6 weeks. viremia never occurred, and no histopathological changes were detected. the amount and duration of viral shedding were in line with what is observed upon human infection. 34 a recently identified calicivirus of rhesus origin, named tulane virus, was used as a surrogate model of infection. rhesus macaques exposed to tulane virus intragastrically developed diarrhea and fever 2 days postinfection. viral shedding continued for 8 days. the immune system produced antibodies that dropped in concentration within 38 days postinfection, mirroring the short-lived immunity documented in humans. the intestine developed moderate blunting of the villi, as seen in human disease. 35 a murine norovirus has been identified and is closely related to human norwalk virus. however, clinically, the viruses present a different disease: the murine norovirus does not induce diarrhea or vomiting and can establish a persistent infection, in contrast to human disease. [36] [37] [38] porcine enteric caliciviruses can induce diarrheal disease in young pigs and an asymptomatic infection in adults.
39 gnotobiotic pigs can successfully be infected orally with a passaged clinical norovirus isolate. diarrheal disease developed in 74% of the animals, and 44% shed virus in their stool. no major histopathological changes or viral persistence was noted. 40 calves are naturally infected with bovine noroviruses, and experimentally challenging calves with an oral inoculation of a bovine isolate resulted in diarrheal disease. the link between equine cases of eastern equine encephalitis and the human disease was confirmed in 1938 by observing 30 cases of fatal encephalitis in children living in the same area as the equine cases. during this outbreak, eastern equine encephalitis virus (eeev) was isolated from the cns of these children as well as from pigeons and pheasants. 44 eeev primarily affects areas near salt marshes and can cause localized outbreaks of disease in the summer. the enzootic cycles are maintained in moist environments such as coastal areas and shaded marshy salt swamps in north america (na), and moist forests in central america and south america (sa). 45 birds are the primary reservoir, and the virus is transmitted via mosquitoes. furthermore, forest-dwelling rodents, bats, and marsupials frequently become infected and may provide an additional reservoir in central america and sa. despite known natural hosts, the transmission cycles in these animals are not well characterized. 44 reptiles and amphibians have also been reported to become infected by eeev. eeev pathogenesis and disease have been studied in several laboratory animals. as natural hosts, birds do not generally develop encephalitis, except pheasants and emus, in which eeev causes encephalitis with 50-70% mortality. 46 young chickens show signs of extensive myocarditis in early experimental infection, and heart failure rather than encephalitis is the cause of death. 47 besides the heart, other organs such as the pancreas and kidney show multifocal necrosis. additionally, lymphocytopenia has been observed in the thymus and spleen in birds.
45 eeev causes neuronal damage in newborn mice, and the disease progresses rapidly, resulting in death. 48 similarly, eeev produces fatal encephalitis in older mice when administered via the intracerebral route, whereas inoculation via the subcutaneous route causes a pantropic infection eventually resulting in encephalitis. 49, 50 guinea pigs and hamsters have also been used as animal models for eeev studies. 51, 52 guinea pigs developed neurological involvement with decreased activity, tremors, circling behavior, and coma. neuronal necrosis was observed and resulted in brain lesions in these animals. 52 subcutaneous inoculation of eeev produced lethal biphasic disease in hamsters, with severe lesions of nerve cells. the early visceral phase with viremia was followed by neuroinvasion, encephalitis, and death. in addition, parenchymal necroses were observed in the liver and lymphoid organs. 51 intradermal, intramuscular, or iv inoculation of eeev in nhps causes disease but does not always result in symptoms of the nervous system. intracerebral infection with eeev results in nervous system disease and fatality in monkeys. 53 the differences in these models indicate that the initial viremia and the secondary nervous system infection do not overlap in monkeys when they are infected by the peripheral route. 54 intranasal and intralingual inoculations of eeev also cause nervous system symptoms in monkeys, but less drastic than those caused by intracerebral injections. 54 the aerosol route of infection also progresses to uniformly lethal disease in cynomolgus macaques. 55 in this model, fever was followed by elevated white blood cells and liver enzymes. neurological signs subsequently developed, and nhps became moribund and were euthanized between days 5 and 9 postexposure. meningoencephalomyelitis was the main pathology observed in the brains of these animals.
56 similar clinical signs and pathology were observed when common marmosets were infected with eeev by the intranasal route. 57 both aerosol and intranasal nhp models had similar disease progression and pathology as those seen in human disease. a common marmoset model was used for comparison studies of sa and na strains of eeev. 57 previous studies indicated that the sa strain is less virulent than na strain for humans. common marmosets were infected intranasally with either the na or sa strain of eeev. na strain-infected animals showed signs of anorexia and neurological involvement and were euthanized 4-5 days after the challenge. although sa strain-infected animals developed viremia, they remained healthy and survived the challenge. epizootics of viral encephalitis in horses were previously described in argentina. more than 25,000 horses died from western equine encephalitis virus (weev) in the central plains of the united states in 1912. 58 weev was first isolated from the brains of horses during the outbreak in the san joaquin valley of california in 1930. although it was suspected, the first diagnosis of weev as a cause of human encephalitis occurred in 1938, when the virus was recovered from the brain of a child with fatal encephalitis. 44 in horses, the signs of disease are fever, loss of coordination, drowsiness, and anorexia, leading to prostration, coma, and death in about 40% of affected animals. 59 weev also infects other species of birds and often causes fatal disease in sparrows. weev infection occurs throughout western na and sporadically in sa as it circulates between its mosquito vector and wild birds. 44 chickens and other domestic birds, pheasants, rodents, rabbits, ungulates, tortoises, and snakes are natural reservoirs of weev. 60,61 weev has caused epidemics of encephalitis in humans, horses, and emus, but the fatality rate is lower than that for eeev. 
62 predominantly young children and those older than 50 years demonstrate the clinical symptoms of the disease. 63 severe disease, seizures, fatal encephalitis, and significant sequelae are more likely to occur in infants and young children. 64, 65 typically, the disease progresses asymptomatically, with seroprevalence in humans being fairly common in endemic areas. species used to develop animal models for weev are mice, hamsters, guinea pigs, and ponies. studies with ponies resulted in viremia in 100% of the animals 1-5 days pi. fever was observed in 7 of 11 animals, and six exhibited signs of encephalitis. 44 after subcutaneous inoculation with weev, suckling mice started to show signs of disease by 24 h and died within 48 h. 66 in suckling mice, the heart was the only organ in which pathologic changes were observed. conversely, adult mice exhibited signs of lethargy and ruffled fur on days 4-5 postinfection. mice were severely ill by day 8 and appeared hunched and dehydrated. death occurred between days 7 and 14, and both brain and mesodermal tissues such as the heart, lungs, liver, and kidney were involved. 66, 67 intracerebral and intranasal routes of infection resulted in a fatal disease that was highly dependent on dose, while intradermal and subcutaneous inoculations caused only 50% fatality in mice regardless of the amount of virus. 49 studies demonstrated that although the length of the incubation period and the disease duration varied, weev infection resulted in mortality in hamsters by all routes of inoculation. progressive lack of coordination, shivering, rapid and noisy breathing, corneal opacity, and conjunctival discharge resulting in closing of the eyelids were indicative of disease in all cases. 68 cns involvement was evident with intracerebral, intraperitoneal, and intradermal inoculations. 68 weev is highly infectious to guinea pigs.
69 intraperitoneal inoculation of weev is fatal in guinea pigs regardless of virus inoculum, with the animals exhibiting signs of illness on days 3-4, followed by death on days 5-9 (nalca, unpublished results). very limited studies have been performed with nhps. the intranasal route of infection causes severe, lethal encephalitis in rhesus macaques. 54 reed et al. exposed cynomolgus macaques to low and high doses of aerosolized weev. the animals subsequently developed fever, increased white blood counts, and cns involvement, demonstrating that the cynomolgus macaque model would be useful for testing of vaccines and therapeutics against weev. 70 venezuelan equine encephalitis virus (veev) is maintained in nature in a cycle between small rodents and mosquitoes. 45 the spread of epizootic strains of the virus to equines leads to high viremia followed by a lethal encephalitis, and tangential spread to humans. veev can easily be spread by the aerosol route, making it a considerable danger for laboratory exposure. in humans, veev infection causes a sudden onset of malaise, fever, chills, headache, and sore throat. 45, 71, 72 symptoms persist for 4-6 days, followed by a 2- to 3-week period of generalized weakness. encephalitis occurs in a small percentage of adults (<0.5%); however, the rate in children may be as high as 4%. neurologic symptoms range from nuchal rigidity, ataxia, and convulsions to the more severe cases exhibiting coma and paralysis. the overall mortality rate in humans is <1%. 45 laboratory animals such as mice, guinea pigs, and nhps exhibit different pathologic responses when infected with veev. the lymphatic system is a general target in all infected animals, with cns involvement variable between different animal species. the disease caused by veev progresses very rapidly without showing signs of cns disease in guinea pigs and hamsters. mortality is typically observed within 2-4 days after infection, and fatality is not dose dependent.
44 veev infection lasts longer in mice, which develop signs of nervous system disease in 5-6 days and die 1-2 days later. the lethal dose in mice changes depending on the age of the mice and the route of exposure. 56 in contrast to guinea pigs and hamsters, the time of death in mice is dose dependent. subcutaneous/dermal infection in the mouse model results in encephalitic disease very similar to that seen in horses and humans. 73 virus begins to replicate in the draining lymph nodes at 4 h pi. eventually, virus enters the brain, primarily via the olfactory system. furthermore, aerosol exposure of mice to veev can result in massive infection of the olfactory neuroepithelium, olfactory nerves, and olfactory bulbs and viral spread to the brain, resulting in necrotizing panencephalitis. 74, 75 aerosol and dermal inoculation routes cause neurological pathology in mice much faster than other routes of exposure do. the clinical signs of disease in mice infected by aerosol are ruffled fur, lethargy, and hunching, progressing to death. 56, 74, 75 intranasal challenge of c3h/hen mice with high-dose veev caused high morbidity and mortality. 76 viral titers in brain peaked on day 4 postchallenge and stayed high until animals died on days 9-10 postchallenge. a protein cytokine array done on the brains of infected mice showed elevated levels of interleukin (il)-1a, il-1b, il-6, il-12, monocyte chemoattractant protein-1 (mcp-1), ifn-g, mip-1a, and rantes (regulated on activation, normal t cell expressed and secreted). this model was used successfully to test antivirals against veev. 77 veev infection causes a typical biphasic febrile response in nhps. initial fever was observed at 12-72 h after infection and lasted <12 h. secondary fever generally began on day 5 and lasted 3-4 days. 78 veev-infected nhps exhibited mild symptoms such as anorexia, irritability, diarrhea, and tremors.
leucopenia was common in animals exhibiting fever. 79 consistent with the leucopenia, microscopic changes in lymphatic tissues, such as early destruction of lymphocytes in lymph nodes and spleen, a mild lymphocytic infiltrate in the hepatic triads, and focal myocardial necrosis with lymphocytic infiltration, have been observed in monkeys infected with veev. surprisingly, characteristic lesions of the cns were observed histopathologically in monkeys in spite of the lack of any clinical signs of infection. 78 the primary lesions were lymphocytic perivascular cuffing and glial proliferation, generally observed at day 6 postinfection during the secondary febrile episode. cynomolgus macaques develop similar clinical signs, including fever, viremia, lymphopenia, and encephalitis, upon aerosol exposure to veev. 80 chikungunya virus is a member of the genus alphavirus, specifically the semliki forest complex, and has been responsible for a multitude of epidemics, mainly within africa and southeast asia. 45 the virus is transmitted by aedes mosquitoes. given the widespread endemicity of aedes mosquitoes, chikungunya virus has the potential to spread to previously unaffected areas. this is typified by the emergence of disease for the first time in 2005 in the islands of the southwest indian ocean, including the french la reunion island, and the appearance in central italy in 2007. 81, 82 the incubation period after a mosquito bite is 2-5 days, followed by a self-limiting acute phase that lasts 3-4 days. symptoms during this period include fever, arthralgia, myalgia, and rash. headache, weakness, nausea, vomiting, and polyarthralgia have all been reported. 83 individuals typically develop a stooped posture due to the pain. for approximately 12% of infected individuals, joint pain can last months after resolution of primary disease, and has the possibility to relapse.
underlying health conditions, including diabetes, alcoholism, or renal disease, increase the risk of developing a severe form of disease that includes hepatitis or encephalopathy. children between the ages of 3 and 18 years have an increased risk of developing neurological manifestations. 84 there is no effective vaccine or antiviral. wild-type c57bl/6 adult mice are not permissive to chikungunya virus infection by intradermal inoculation. however, it was demonstrated that neonatal mice were susceptible, and severity was dependent upon age at infection. six-day-old mice developed paralysis by day 6, and all died by day 12, whereas 50% of nine-day-old mice were able to recover from infection. by 12 days of age, mice were no longer permissive to disease. infected mice developed loss of balance, hind limb dragging, and skin lesions. neonatal mice were also used as a model for neurological complications. 85, 86 an adult mouse model has been developed by injection of the ventral side of the footpad of c57bl/6j mice. viremia lasted 4-5 days, accompanied by foot swelling and noted inflammation of the musculoskeletal tissue. 87, 88 adult ifn-a/b receptor knockout mice also developed mild disease, with symptoms including muscle weakness and lethargy, symptoms that mirrored human infection. all adult mice died within 3 days. this model was useful in identifying the viral cellular tropism for fibroblasts. 85 imprinting control region (icr) cd1 mice can also be used as a disease model. neonatal mice subcutaneously inoculated with a passaged clinical isolate of chikungunya virus developed lethargy, loss of balance, and difficulty in walking. mortality was low: 17% and 8% for newborn cd1 and icr mice, respectively. the remaining mice fully recovered within 6 weeks after infection. 86 a drawback of both the ifn-a/b receptor knockout and cd1 mice is that the disease is not a result of immunopathogenesis as occurs in human cases, given that the mice are immunocompromised.
89 long-tailed macaques challenged with a clinical isolate of the virus developed a clinical disease similar to that of humans. initially, the monkeys developed high viremia with fever and rash. after this period, viremia resolved, and virus could be detected in lymphoid, liver, meninges, joint, and muscle tissue. the last stage mimicked the chronic phase, in which virus could be detected up to two months after infection, although no arthralgia was noted. 90 dengue virus is transmitted via the mosquito vectors aedes aegypti and aedes albopictus. 91 given the endemicity of the vectors, it is estimated that half of the world's population is at risk for exposure to dengue virus. this results in approximately 50 million cases of dengue each year, with the burden of disease in the tropical and subtropical regions of latin america, south asia, and southeast asia. 92 it is estimated that there are 20,000 deaths each year caused by dengue hemorrhagic fever (dhf). 93 there are four serotypes of dengue virus, numbered 1-4, which are capable of causing a wide spectrum of disease that ranges from asymptomatic to severe with the development of dhf. 94 incubation can range from 3 to 14 days, with the average being 4-7 days. the virus targets dendritic cells and macrophages after a mosquito bite. 95 typical infection results in classic dengue fever (df), which is self-limiting and has flu-like symptoms in conjunction with retroorbital pain, headache, skin rash, and bone and muscle pain. dhf can follow, with vascular leak syndrome and low platelet count, resulting in hemorrhage. in the most extreme cases, dengue shock syndrome (dss) develops, characterized by hypotension, shock, and circulatory failure. 94 thrombocytopenia is a hallmark clinical sign of infection and aids in differential diagnosis. 96 severe disease has a higher propensity to occur upon secondary infection with a different dengue virus serotype. 97 this is hypothesized to occur due to antibody-dependent enhancement (ade).
there is no approved vaccine or drug, and hospitalized patients receive supportive care, including fluid replacement. in developing an animal model, it is important to note that mosquitoes typically deposit 10^4-10^6 pfu, which is therefore the optimal range to be used during challenge. a comprehensive review of the literature regarding animal models of dengue infection was recently published by zompi et al. 98 several laboratory mouse strains, including a/j, balb/c, and c57bl/6, are permissive to dengue infection. however, the resulting disease has little resemblance to human clinical signs, and death results from paralysis. [99] [100] [101] a higher dose of an adapted dengue virus strain induced dhf symptoms in both balb/c and c57bl/6 mice. 102, 103 this model can also yield asymptomatic infections. ag129 mice infected with a mouse-adapted (ma) strain of dengue virus 2 developed vascular leak syndrome similar to the severe disease seen in humans. 104 passive transfer of monoclonal dengue antibodies within mice leads to ade. during the course of infection, viremia was increased, and animals died due to vascular leak syndrome. 105 another ma strain injected into balb/c mice caused liver damage, hemorrhagic manifestations, and vascular permeability. 103 intracranial injection of suckling mice with dengue virus leads to death and has been used to test the efficacy of therapeutics. 106 scid mice engrafted with human tumor cells develop paralysis upon infection, and are thus not useful for pathogenesis studies. 107, 108 df symptoms developed after infection in nod/scid/il2rgko mice engrafted with cd34+ human progenitor cells. 109 rag-hu mice developed fever, but no other symptoms, upon infection with a passaged clinical isolate and a laboratory-adapted strain of dengue virus 2. 110 a passaged clinical isolate of dengue virus type 3 was recently used to create a model in immunocompetent adult mice.
intraperitoneal injection in c57bl/6j and balb/c mice caused lethality by day 6-7 postinfection in a dose-dependent manner. the first indication of infection was weight loss beginning on day 4, followed by thrombocytopenia. a drop in systolic blood pressure, along with increases in the liver enzymes aspartate aminotransferase (ast) and alt, was also observed. viremia was established by day 5. this model mimicked the characteristic symptoms observed in human dhf/dss cases. 111 a novel model was developed that used infected mosquitoes as the route of transmission to hu-nsg mice. female mosquitoes were intrathoracically inoculated with a clinical isolate of dengue virus type 2. infected mosquitoes then fed upon the mouse footpad to allow for transmission of the virus via the natural route. the amount of virus detected within the mouse was directly proportional to the number of mosquitoes it was exposed to, with four to five being optimal. detectable viral rna was in line with what is observed during human infection. severe thrombocytopenia developed on day 14. this model is intriguing given that disease was enhanced with mosquito delivery of the virus in comparison to injection of the virus. 112 nhp models have used subcutaneous inoculation in an attempt to induce disease. although the animals are permissive to viral replication, it is to a lower degree than that observed in human infection. 113 the immunosuppressive drug cyclophosphamide enhances infection in rhesus macaques by allowing the virus to invade monocytes. 114 throughout these preliminary studies, no clinical disease was detected. to circumvent this, a higher dose of dengue virus was used in an iv challenge of rhesus macaques. hemorrhagic manifestations appeared by day 3 and resulted in petechiae, hematomas, and coagulopathy; however, no other symptoms developed. 115 further development would allow this model to be used for testing of novel therapeutics and vaccines.
although primates do not develop disease upon infection with dengue, their immune system does produce antibodies similar to those observed during the course of human infection. this has been advantageous in studying ade. sequential infection led to a cross-reactive antibody response, which has been demonstrated in both humans and mice. 116 this phenotype can also be seen upon passive transfer of a monoclonal antibody to dengue and subsequent infection with the virus. rhesus macaques exposed in this manner developed viremia that was 3- to 100-fold higher than was previously reported; however, no clinical signs were apparent. 117 the lack of inducible dhf or dss symptoms hinders further examination of pathogenesis within this model. japanese encephalitis virus (jev) is a leading cause of childhood viral encephalitis in southern and eastern asia and is a problem among military personnel and travelers to these regions. it was first isolated from the brain of a patient who died from encephalitis in japan in 1935. 118 culex mosquitoes, which breed in rice fields, transmit the virus from birds or mammals (mostly domestic pigs) to humans. the disease symptoms range from a mild febrile illness to acute meningomyeloencephalitis. after an asymptomatic incubation period of 1-2 weeks, patients show signs of fever, headache, stupor, and generalized motor seizures, especially in children. the virus causes encephalitis by invading and destroying the cortical neurons. the fatality rate ranges from 10% to 50%, and most survivors have neurological and psychiatric sequelae. 119, 120 jev causes fatality in infant mice by all routes of inoculation. differences in pathogenesis and outcome are seen when the virus is given by intraperitoneal inoculation. 121 these differences depend on the amount of virus and the specific viral strains used. biphasic viral multiplication after peripheral inoculation is observed in mouse tissues.
primary virus replication occurs in the peripheral tissues and the secondary replication phase in the brain. 122 hamsters are another small animal species used as an animal model for jev. fatality was observed in hamsters inoculated intracerebrally or intranasally, while peripheral inoculation caused asymptomatic viremia. studies with rabbits and guinea pigs showed that all routes of inoculation of jev produce asymptomatic infection. 123 serial sampling studies with 12-day-old wistar rats inoculated intracerebrally with jev indicated that jev causes the overproduction of free radicals by neurons and apoptosis of neuronal cells. 124 a follow-up study in 2010 by the same group showed that although the cytokines tumor necrosis factor (tnf)-α, ifn-γ, il-4, il-6, and il-10 and the chemokine mcp-1 increased gradually and peaked on day 10 postinfection with jev in rats, the levels eventually declined, and there was no correlation between the levels of cytokines and chemokines and neuronal damage. 125 intracerebral inoculation of jev causes severe histopathological changes in the brain hemispheres of rhesus monkeys. symptoms such as weakness, tremors, and convulsions began to appear on days 6-10, with indicative signs of encephalomyelitis occurring on days 8-12 postinfection for most of the animals, followed by death. 126 although intranasal inoculation of jev results in fatality in both rhesus and cynomolgus monkeys, peripheral inoculation causes asymptomatic viremia in these species. 123, 127 west nile virus (wnv) was first isolated from the blood of a woman in the west nile district of uganda in 1937. 128 after the initial isolation of wnv, the virus was subsequently isolated from patients, birds, and mosquitoes in egypt in the early 1950s 129, 130 and was shown to cause encephalitis in humans and horses.
wnv is recognized as the most widespread of the flaviviruses, with a geographical distribution that includes africa, the middle east, western asia, europe, and australia. 131 the virus first reached the western hemisphere in the summer of 1999, during an outbreak involving humans, horses, and birds in the new york city metropolitan area. 132, 133 since 1999, the range of areas affected by wnv has quickly extended. older people and children are most susceptible to wnv disease. wnv generally causes asymptomatic disease or a mild undifferentiated fever (west nile fever), which can last from 3 to 6 days. 134 the mortality rate after neuroinvasive disease ranges from 4% to 11%, 131, [135] [136] [137] with the most severe complications commonly seen in the elderly. hepatitis, myocarditis, and pancreatitis are unusual, severe, nonneurologic manifestations of wnv infection. although many early laboratory studies of wn encephalitis were performed in nhps, mouse, rat, hamster, horse, pig, dog, and cat models have also been used to study the disease. [138] [139] [140] [141] [142] [143] [144] inoculation of wnv into nhps intracerebrally resulted in the development of either encephalitis, febrile disease, or an asymptomatic infection, depending on the virus strain and dose. viral persistence is observed in these animals regardless of the outcome of infection (i.e., asymptomatic, fever, encephalitis). 141 thus, viral persistence is regarded as a typical result of nhp infection with various wnv strains. after both intracerebral and subcutaneous inoculation, the virus localizes predominantly in the brain and may also be found in the kidneys, spleen, and lymph nodes. wnv infection does not result in clinical disease in nhps, although the animals show a low level of viremia. 145, 146 wnv has also been extensively studied in small animals.
all classical laboratory mouse strains are susceptible to lethal infection by the intracerebral and intraperitoneal routes, resulting in encephalitis and 100% mortality. intradermal pathogenesis studies indicated that langerhans dendritic cells are the initial viral replication sites in the skin. 147, 148 the infected langerhans cells then migrate to lymph nodes, and the virus enters the blood through the lymphatic and thoracic ducts and disseminates to peripheral tissues for secondary viral replication. the virus eventually travels to the cns and causes pathology that is similar to human cases. [149] [150] [151] [152] tesh et al. developed a model for wn encephalitis using the golden hamster, mesocricetus auratus. hamsters appeared normal during the first 5 days, became lethargic at approximately day 6, and developed neurologic symptoms by days 7-10. 143 many of the severely affected animals died 7-14 days after infection. viremia was detected in the hamsters within 24 h after infection and persisted for 5-6 days. although there were no substantial changes in internal organs, progressive pathologic differences were seen in the brain and spinal cord of infected animals. furthermore, similar to the above-mentioned monkey experiments by pogodina et al., persistent wnv infection was found in the brains of hamsters. the etiologic agent of severe acute respiratory syndrome (sars), sars-coronavirus (sars-cov), emerged in 2002 and spread throughout 32 countries in a period of 6 months, infecting >8000 people and causing nearly 800 deaths. 153, 154 the main mechanism of transmission of sars-cov is through droplet spread, but it is also viable in dry form on surfaces for up to 6 days and can be detected in stool, suggesting other modes of transmission are also possible. 155 although other members of the family usually cause mild illness, sars-cov infection has a 10% case fatality rate, with the majority of cases in people over the age of 15 years.
156, 157 after an incubation period of 2-10 days, clinical signs of sars include general malaise, fever, chills, diarrhea, dyspnea, and cough. 158 in some sars cases, pneumonia may develop and progress to acute respiratory distress syndrome (ards). fever usually dissipates within 2 weeks and coincides with the induction of high levels of neutralizing antibodies. 159 in humans, sars-cov replication destroys the respiratory epithelium, and a great deal of the pathogenesis is due to the subsequent immune responses. 160 infiltrates persisting within the lung and diffuse alveolar damage (dad) are common sequelae of sars-cov infection. virus can be isolated from secretions of the upper airways during early, but not later, stages of infection as well as from other tissues. 161 sars-cov can replicate in many species, including dogs, cats, pigs, mice, rats, ferrets, foxes, and nhps. 162 chinese palm civets, raccoon dogs, and bats are possible natural hosts. no model captures all aspects of human clinical disease (pyrexia and respiratory signs), mortality (~10%), viral replication, and pathology. 163 in general, the sars-cov disease course in the model species is much milder and of shorter duration than in humans. viral replication in the various animal models may occur without clinical illness and/or histopathologic changes. the best-characterized models use mice, hamsters, ferrets, and nhps (table 38.1). mouse models of sars-cov typically are inoculated by the intranasal route under light anesthesia. young, 6- to 8-week-old balb/c mice exposed to sars-cov have viral replication detected in the lungs and nasal turbinates, with a peak on day 2 and clearance by day 5 postexposure. there is also viral replication within the small intestines of young balb/c mice. however, young mice have no clinical signs, aside from reduced weight gain, and have little to no inflammation within the lungs (pneumonitis).
intranasal sars-cov infection of c57bl/6 (b6) mice also yields reduced weight gain and viral replication in the lungs, with a peak on day 3 and clearance by day 9. 171 in contrast, in aged (13- to 14-month-old) balb/c mice, interstitial pneumonitis, alveolar damage, and death also occur, resembling the age-dependent virulence observed in humans. 129s mice and b6 mice show outcomes to sars-cov infection similar to those observed for balb/c mice but have lower titers and less prolonged disease. one problem is that it is more difficult to obtain large numbers of mice older than 1 year. a number of immunocompromised knockout mouse models of intranasal sars-cov infection have also been developed. 129svev mice infected with sars-cov by the intranasal route develop bronchiolitis, with peribronchiolar inflammatory infiltrates, and interstitial inflammation in adjacent alveolar septae. 172 viral replication and disease in these mice resolve by day 14 postexposure. beige, cd1-/-, and rag1-/- mice infected with sars-cov have similar outcomes to infected balb/c mice with regard to viral replication, timing of viral clearance, and a lack of clinical signs. signal transducer and activator of transcription-1 (stat1) ko mice infected intranasally with sars-cov have severe disease, with weight loss, pneumonitis, interstitial pneumonia, and some deaths. the stat1 ko mouse model is therefore useful for studies of pathogenicity, pathology, and evaluation of vaccines. syrian golden hamsters (strain lvg) are also susceptible to intranasal exposure to sars-cov. after the administration of 10^3 tcid50 (50% tissue culture infective dose), along with a period of transient viremia, sars-cov replicates in nasal turbinates and lungs, resulting in pneumonitis. there are no obvious signs of disease, but exercise wheels can be used to monitor a decrease in nighttime activity.
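tcid50 endpoints of the kind quoted above are conventionally computed from serial-dilution well counts by reed-muench interpolation. a minimal sketch of that calculation, using hypothetical well counts rather than data from any study cited here:

```python
def reed_muench_log10_tcid50(exponents, infected, total):
    """endpoint-dilution titre by the reed-muench method.

    exponents: log10 dilutions, most concentrated first (e.g. [-3, -4, -5, -6])
    infected:  wells showing cytopathic effect at each dilution
    total:     wells inoculated at each dilution
    returns log10(tcid50) per inoculation volume.
    """
    uninfected = [t - i for i, t in zip(infected, total)]
    n = len(exponents)
    # infected wells accumulate toward the concentrated end of the series,
    # uninfected wells accumulate toward the dilute end
    cum_inf = [sum(infected[i:]) for i in range(n)]
    cum_uninf = [sum(uninfected[: i + 1]) for i in range(n)]
    pct = [100.0 * ci / (ci + cu) for ci, cu in zip(cum_inf, cum_uninf)]
    # locate the pair of dilutions bracketing the 50% endpoint
    for i in range(n - 1):
        if pct[i] >= 50 > pct[i + 1]:
            pd = (pct[i] - 50) / (pct[i] - pct[i + 1])  # proportionate distance
            step = exponents[i] - exponents[i + 1]      # log10 dilution step (>0)
            return -(exponents[i] - pd * step)
    raise ValueError("50% endpoint not bracketed by the dilution series")

# hypothetical series: 8/8, 6/8, 2/8, 0/8 wells infected at 10^-3..10^-6
titre = reed_muench_log10_tcid50([-3, -4, -5, -6], [8, 6, 2, 0], [8, 8, 8, 8])
# gives log10(tcid50) = 4.5 for this example
```

the same interpolation over cumulative animal deaths, rather than infected wells, yields in vivo ld50 endpoints such as the challenge doses quoted for the filovirus models later in the chapter.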
some mortality has been observed, but it was not dose dependent and could have more to do with genetic differences between animals because the strain is not inbred. 163 damage is not observed in the liver or spleen despite detection of virus within these tissues. several studies have shown that intratracheal inoculation of sars-cov in anesthetized ferrets (mustela furo) results in lethargy, fever, sneezing, and nasal discharge. 167 clinical disease has been observed in several studies. sars-cov is detected in pharyngeal swabs, trachea, and tracheobronchial lymph nodes, and at high titers within the lungs. mortality has been observed around day 4 postexposure, as well as mild alveolar damage in 5-10% of the lungs, occasionally accompanied by severe pathology within the lungs. 173 with fever, overt respiratory signs, lung damage, and some mortality, the ferret intratracheal model of sars-cov infection is perhaps most similar to human sars, albeit with a shorter time course. sars-cov infection of nhps by intranasal or intratracheal routes generally results in a very mild infection, which resolves quickly. sars-cov infection of old world monkeys, such as rhesus macaques, cynomolgus macaques (cynos), and african green monkeys (agms), has been studied with variable results, possibly due to the outbred nature of the groups studied or previous exposure to related pathogens. clinical illness and viral loads have not been consistent; however, replication within the lungs and dad are features of the infections for each of the primate species. some cynos have no illness, but others have rash, lethargy, and respiratory signs and pathology. 170 rhesus macaques have little to no disease and only have mild findings upon histopathological analysis. agms infected with sars-cov have no overt clinical signs, but dad and pneumonitis have been documented. viral replication has been detected for up to 10 days in the lungs of agms; however, the infection resolves and does not progress to fatal ards.
farmed chinese masked palm civets, sold in open markets in china, were thought to be involved in the sars-cov outbreak. intratracheal and intranasal inoculation of civets with sars-cov results in lethargy, decreased aggressiveness, fever, diarrhea, and conjunctivitis. 174 leucopenia, pneumonitis, and alveolar septal enlargement, with lesions similar to those observed in ferrets and nhps, have also been observed in laboratory-infected civets. common marmosets have also been shown to be susceptible to sars-cov infection. 175 vaccines developed for related animal covs in chickens, cattle, dogs, cats, and swine have used live-attenuated, killed, dna, and viral-vectored vaccine strategies. 176 an important issue to highlight from work on these vaccines is that cov vaccines, such as those developed for cats, may induce a more severe disease. 177 consistent with this, immune mice had th2-type immunopathology upon sars-cov challenge. 178 severe hepatitis in vaccinated ferrets, with antibody enhancement in the liver, has been reported. 179 additionally, rechallenge of agms showed limited viral replication but significant lung inflammation, including alveolitis and interstitial pneumonia, which persisted for long periods of time after viral clearance. 180 mouse and nhp models with increased virulence may be developed by adapting the virus by repeated passage within the species of interest. ma sars-cov and human ace2 transgenic mice are available. 181 all mammals experimentally or naturally exposed to rabies virus have been found to be susceptible. this highly neurotropic virus is a member of the lyssavirus genus and is transmitted to humans from the bite of an infected animal. 182 the virus is able to replicate within the muscle cells at the site of the bite and then travel to the cns. once reaching the cns by retrograde axonal transport, the virus replicates within neurons, creating inflammation and necrosis. the virus subsequently spreads throughout the body via peripheral nerves.
183 a typical incubation period is 30-90 days and is highly dependent upon the location of the bite. proximity to the brain is a major factor for the onset of symptoms. the prodromal stage lasts from 2 to 10 days and is when the virus initially invades the cns. flu-like symptoms are the norm, in conjunction with pain and inflammation at the site of the bite. subsequently, there are two forms of disease that can develop. in 80% of cases, an individual develops the encephalitic, or furious, form. this form is marked by hyperexcitability, autonomic dysfunction, and hydrophobia. the paralytic, or dumb, form is characterized by ascending paralysis. ultimately, both forms result in death days after the onset of symptoms. once the symptoms develop, there is no proven effective therapy. in the developing world, death is caused by the lack of access to medical care, including postexposure prophylaxis. in north america, fatal cases result from late diagnosis. 184 syrian hamsters have been challenged with rabies virus intracerebrally, intraperitoneally, intradermally, and intranasally. all animals died as a result of the exposure, although intracerebral and intranasal inoculation led only to the furious form, depicted by extreme irritability, spasms, excessive salivation, and cries. the virus used had been isolated from an infected dog brain and passaged in swiss albino mice. animals inoculated by intracerebral injection develop disease within 4-6 days, whereas all other routes of entry develop disease within 6-12 days. 185 this model has been used to study and test novel vaccine candidates. 186 mice have been extensively studied as an animal model for rabies. it was shown that swiss albino mice intracerebrally injected with a virus isolated from a dog developed only the paralytic form of disease 6 days after the initial challenge. balb/c mice are universally susceptible to intracerebral injection of rabies virus within 9 days.
disease symptoms include paralysis, cachexia, and bristling, appearing 1-3 days before death. 187 a more natural route of infection, via peripheral injection into the masseter muscles, was tested on icr mice. these mice developed neurological signs including limb paralysis, and all died within 6-12 days. 188 icr mice have been instrumental in analyzing novel vaccines and correlates of protection. 189 this line of mice was also used to assess the value of ketamine treatment to induce coma during rabies infection. 190 another mouse line used is the p75 neurotrophin receptor-deficient mouse. this mouse developed a fatal encephalitis when inoculated intracerebrally with the challenge virus standard. 191 bax-deficient mice have also been used to determine the role of apoptotic cell death in the brain during the course of infection. 192 a viral isolate from silver-haired bats can also be used in the mouse model. this strain is advantageous given that it is responsible for the majority of deaths in north america. 193 the early death phenomenon is typified by a decreased time to death in a subset of individuals and animals that have been vaccinated and subsequently exposed to rabies. 194 this trend has been demonstrated experimentally in swiss outbred mice and primates. 195, 196 cynomolgus and rhesus macaques were both infected with passaged rabies virus to create an nhp model. a high titer of virus was needed to induce disease, but exposure was found not to be universally fatal. the animals that survived beyond 4 weeks within the experiment did not develop clinical disease or succumb to infection. primates that did develop disease refused food and had progressively less activity until death. this lasted from 24 h up to 4 days, with all symptomatic animals dying within 2 weeks. 196 bats have been experimentally challenged with rabies.
vampire bats, desmodus rotundus, intramuscularly injected with a bat viral isolate displayed clinical signs, including paralysis, in half of the study animals. of those that did develop disease, the duration was 2 days, and the incubation period ranged from 7 to 30 days. regardless of disease manifestation, 89% of challenged animals died. 197 skunks can be challenged intramuscularly or intranasally with either the challenge virus strain or a skunk viral isolate. interestingly, the challenge virus strain more readily produced the paralytic form, whereas the street form of rabies developed into the furious form. however, the challenge strain virus resulted in a shorter incubation period of 7-8 days in comparison to the 12-14 days seen with the street virus. 198 filoviridae consists of two well-established genera, ebola virus and marburg virus (marv), and a newly discovered group, cuevavirus 199 (table 38.2). 200 two other ebola viruses are known, taï forest (tafv; previously named côte d'ivoire, ciebov) and reston (restv), which have not caused major outbreaks or lethal disease in humans. the disease in humans is characterized by aberrant innate immunity and a number of clinical symptoms such as fever, nausea, vomiting, arthralgia/myalgia, headaches, sore throat, diarrhea, abdominal pain, anorexia, and numerous others. 201 approximately 10% of patients develop petechiae, and a greater percentage, depending on the specific strain, may develop bleeding from various sites (gums, puncture sites, stools, etc.). 199 natural transmission in an epidemic is thought to be through direct contact or needle sticks in hospital settings. however, much of the research interest in filoviruses primarily stems from biodefense needs, particularly from aerosol biothreats.
as such, intramuscular, intraperitoneal, and aerosol models have been developed in mice, hamsters, guinea pigs, and nhps for the study of pathogenesis and correlates of immunity and for testing countermeasures. 202 because filoviruses have such high lethality rates in humans, scientists have looked for models that are uniformly lethal to stringently test the efficacy of candidate vaccines and therapeutics. immunocompetent mice have not been successfully infected with wild-type filoviruses due to the control of the infection by the murine type 1 ifn response. 203 however, wild-type inbred mice are susceptible to filovirus that has been ma by serial passage. 204 balb/c mice, which are the strain of choice for intraperitoneal inoculation of ma-ebov, are not susceptible by the aerosol route. 205 for aerosol infection of immunocompetent mice, a panel of bxd (balb/c × dba) recombinant inbred strains was screened, and one strain, bxd34, was shown to be particularly susceptible to airborne ma-ebov, with 100% lethality at low or high doses (~100 or 1000 pfu). these mice developed weight loss of >15% and succumbed to infection between days 7 and 8 postexposure. the aerosol infection model uses a whole-body exposure chamber to expose mice aged 6-8 weeks to ma-ebov aerosols with a mass median aerodynamic diameter (mmad) of approximately 1.6 µm and a geometric standard deviation of approximately 2.0 for 10 min. another approach uses immunodeficient mouse strains such as scid, stat1 ko, ifn receptor ko, or perforin ko with a wild-type ebov inoculum by the intraperitoneal or aerosol routes. 206 mice are typically monitored for clinical disease "scores" based on activity and appearance, weight loss, and moribund condition (survival). coagulopathy, a hallmark of filovirus infection in humans, has been observed, with bleeding in a subset of animals and failure of blood samples to coagulate late in infection.
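the aerosol parameters above (mmad of ~1.6 µm, gsd of ~2.0) define a lognormal particle-size distribution, from which the mass fraction in any diameter band follows from the standard lognormal cumulative distribution. a minimal illustrative sketch of that calculation (not code from the studies cited):

```python
import math

def mass_fraction_below(d_um, mmad_um=1.6, gsd=2.0):
    """cumulative mass fraction of a lognormal aerosol below diameter d_um.

    for a distribution with mass median aerodynamic diameter (mmad) and
    geometric standard deviation (gsd), the cumulative mass fraction is
    Phi(ln(d / mmad) / ln(gsd)), where Phi is the standard normal cdf.
    """
    z = math.log(d_um / mmad_um) / math.log(gsd)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# by definition, half of the aerosol mass lies below the mmad
half = mass_fraction_below(1.6)
# roughly 95% of the mass lies below 5 um for these example parameters,
# i.e. within the conventionally respirable size range
respirable = mass_fraction_below(5.0)
```

this is why the mmad and gsd are reported together: the two numbers fully specify how much of the delivered dose is small enough to reach the deep lung.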
liver, kidney, spleen, and lung tissue taken from moribund mice have pathology characteristic of filovirus disease in nhps. although most mouse studies have used ma-ebov or ebov, an intraperitoneal ma-marv model is also available. 207 ma-marv and ma-ebov models are particularly useful for screening novel antiviral compounds. 208 hamsters are frequently used to study cardiovascular disease and coagulation disorders, and thus serve as the basis for numerous viral hemorrhagic fever models. 209 an intraperitoneal ma-ebov infection model has been developed in syrian hamsters. 210 this model, which has been used to test a vesicular stomatitis virus-vectored vaccine approach, uses male 5- to 6-week-old syrian hamsters that are infected with 100 ld50 of ma-ebov. virus is present in tissues and blood collected on day 4, and all animals succumbed to the disease by day 6. detailed accounts of this model have been presented at international scientific meetings by ebihara, feldmann, et al. but had not been reported in a scientific journal at the time of writing this chapter. 211 guinea pig models of filovirus infection have been developed for the intraperitoneal and aerosol routes using guinea pig-adapted ebov (gp-ebov) and marv (gp-marv). 212, 213 guinea pig models of filovirus infection are quite useful in that the animals develop fever, which can be monitored at frequent (hourly) intervals by telemetry. additionally, the animals are large enough for regular blood sampling, in which measurable coagulation defects are observed as the infection progresses. hartley guinea pigs exposed to aerosolized gp-marv or gp-ebov become moribund at times comparable to those of nhps, generally succumbing to the infection between 7 and 12 days postexposure. by aerosol exposure, gp-ebov is uniformly lethal at both high and low doses (100 or 1000 pfu target doses), but lethality drops with low (<1000 pfu) presented doses of airborne gp-marv, and more protracted disease is seen in some animals.
213 weight loss of between 15% and 25% is a common finding in guinea pigs exposed to gp-ebov or gp-marv. fever, which becomes apparent by day 5, occurs more rapidly in gp-ebov-exposed guinea pigs than with gp-marv exposure. lymphocytes and neutrophils increase during the earlier part of the disease, and platelet levels steadily drop as the disease progresses. increases in coagulation time can be seen as early as day 6 postexposure. blood chemistries (i.e., alt, ast, alkaline phosphatase (alkp), and blood urea nitrogen) are also altered late in the disease course, indicating problems with liver and kidney function. nhp models of filovirus infection are the preferred models for more advanced disease characterization and testing of countermeasures because they most closely mimic the disease and immune correlates seen in humans. 214 old world primates have been primarily used for the development of intraperitoneal, intramuscular, and aerosol models of filovirus infection. uniformly lethal filovirus models have been developed for most of the virus strains in cynomolgus macaques, rhesus macaques, and, to a lesser degree, in agms and marmosets. [215] [216] [217] [218] [219] low-passage human isolates that have not been passaged in animals have been sought for the development of nhp models to satisfy the food and drug administration (fda) animal rule. prominent features of the infections are onset of fever by day 5 postexposure, alteration in liver function enzymes (alt, ast, and alkp), decrease in platelets, and increased coagulation times. clinical disease parameters may have a slightly delayed onset in aerosol models. petechial rash is a common sign of filovirus disease and may be more frequently observed in cynomolgus macaques than in other nhp species. dyspnea late in infection is a prominent feature of disease after aerosol exposure. a number of pronounced pathology findings include multifocal necrosis and fibrin lesions, particularly within the liver and the spleen.
lymphocytolysis and lymphoid depletion are also observed. multilead, surgically implanted telemetry devices are useful for the continuous collection of temperature, blood pressure, heart rate, and activity levels. for example, blood pressure drops as animals become moribund, and heart rate variability (standard deviation of the heart rate) is altered late in infection. the most recently developed telemetry devices can aid in plethysmography to measure respiratory minute volume for accurate delivery of presented doses during aerosol exposure. hendra and nipah viruses are unusual within the family paramyxoviridae given that they can infect a large range of mammalian hosts. both viruses are grouped under the genus henipavirus. the natural reservoirs of the viruses are fruit bats of the genus pteropus. hendra and nipah have the ability to cause severe disease in humans with the potential for a high case fatality rate. 220 outbreaks caused by nipah virus have been recognized in malaysia, singapore, bangladesh, and india, while hendra outbreaks have yet to be reported outside of australia. 221, 222 hendra was the first member of the genus to be identified and was initially associated with an acute respiratory disease in horses. all human cases have been linked to transmission through close contact with an infected horse. there have been no confirmed cases of direct transmission from human to human or bat to human. nipah has the distinction of being transmissible by humans, although the exact route is unknown. 223 the virus is susceptible to ph, temperature, and desiccation, and thus close contact is hypothesized to be needed for successful transmission. 224 both viruses have a tropism for the neurological and respiratory tracts. the hendra virus incubation period is 7-17 days and is marked by a flu-like illness. symptoms at this initial stage include myalgia, headache, lethargy, sore throat, and vomiting.
225 disease progression can continue to pneumonitis or encephalitic manifestations, with the person succumbing to multiorgan failure. 226 nipah virus has an incubation period of 4 days to 2 weeks. 227 much like hendra, the first signs of disease are nondescript. severe neurological symptoms subsequently develop, including encephalitis and seizures, which can progress to coma within 24-48 h. 228 survivors of infection typically make a full recovery; however, 22% suffer permanent sequelae, including persistent convulsions. 229 at this time, there is no approved vaccine or antiviral, and treatment is purely supportive. animal models are being used not only to test novel vaccines and therapeutics but also to deduce the early events of disease, because observed human cases are all at terminal stages. the best small animal representative is the syrian golden hamster due to its high susceptibility to both henipaviruses. clinical signs upon infection recapitulate the disease course in humans, including acute encephalitis and respiratory distress. challenged animals died within 4-17 days postinfection. the progression of disease and timeline are highly dependent on dose and route of infection. intranasal inoculation leads to imbalance, limb paralysis, lethargy, and breathing difficulties, whereas intraperitoneal inoculation resulted in tremors and paralysis within 24 h before death. virus was detected in lung, brain, spleen, kidney, heart, spinal cord, and urine, with the brain being the most affected organ. this model has been used for vaccination and passive protection studies. [230] [231] [232] the guinea pig model has not been widely used due to the lack of respiratory disease upon challenge. 233, 234 inoculation with hendra virus via the subcutaneous route leads to a generalized vascular disease with 20% mortality.
clinical signs were apparent 7-16 days postinfection, with death occurring within 2 days of cns involvement. higher inocula have been associated with the development of encephalitic lesions. intradermal and intranasal inoculations do not lead to disease, although the animals are able to seroconvert upon challenge. the inoculum source does not affect clinical progression. guinea pigs challenged with nipah virus develop disease only upon intraperitoneal injection, which results in weight loss and transient fever for 5-7 days. virus was shed through urine and found to be present in the brain, spleen, lymph nodes, ovary, uterus, and urinary bladder. 235 ferrets display the same clinical disease as seen in the hamster model and in human cases. 236, 237 upon inoculation by the oronasal route, ferrets develop severe pulmonary and neurological disease within 6-9 days, including fever, coughing, and dyspnea. lesions do develop in the ferrets' brains, but to a lesser degree than seen in humans. cats have also been used as an animal model for henipaviruses. disease symptoms are not dependent upon the route of infection. the incubation period is 4-8 days and leads to respiratory and neurological symptoms. 238, 239 this model has proven useful in a vaccine challenge setting. squirrel monkeys and agms are representative of the nhp models. in squirrel monkeys, nipah virus is introduced by either the intranasal or iv route and subsequently leads to clinical signs similar to those in humans, although intranasal challenge results in milder disease. upon challenge, only 50% of animals develop disease manifestations, including anorexia, dyspnea, and acute respiratory syndrome. neurological involvement is characterized by uncoordinated motor skills, loss of consciousness, and coma. viral rna can be detected in the lung, brain, liver, kidney, spleen, and lymph nodes, but only upon iv challenge. 240 agms have been found to be a very consistent model of both viruses.
intratracheal inoculation of the viruses results in 100% mortality, with death occurring within 8.5 and 9-12 days postinfection for hendra and nipah, respectively. the animals develop severe respiratory and neurological disease with generalized vasculitis. 241, 242 a reservoir species of the viruses, the gray-headed fruit bat, has been experimentally challenged. consistent with their status as the host organism for henipaviruses, the bats do not develop clinical disease. however, hendra virus can be detected in the kidneys, heart, spleen, and fetal tissue, and nipah virus can be located in urine. 243 pigs have been investigated as a model, as they develop respiratory disease upon infection with both nipah and hendra. [244] [245] [246] oral inoculation does not produce clinical disease, but subcutaneous injection represents a successful route of infection. live virus can be isolated from the oropharynx as early as 4 days postinfection. nipah can also be transmitted between pigs. nipah induced neurological symptoms in 20% of the pigs, even though virus was present in all neurological tissues regardless of symptoms. 247 within the pig model, nipah seemed to have a greater tropism for the respiratory tract, while hendra favored the neurological system. horses are also able to develop a severe respiratory tract infection accompanied by fever and general weakness upon exposure to nipah and hendra. oronasal inoculation led to systemic disease, with viral rna detected in nasal swabs within 2 days. 248, 249 animals died within 4 days postexposure and were found to have interstitial pneumonia with necrosis of alveoli. 250, 251 virus could be detected in all major organ systems. mice, rats, rabbits, chickens, and dogs have been tested but found to be nonpermissive to infection. 232, 252 suckling balb/c mice do, however, succumb to infection if the virus is inoculated intracranially.
253 embryonated chicken eggs have been inoculated with nipah virus, leading to a universally fatal outcome within 4-5 days postinoculation. 254 respiratory syncytial virus is responsible for lower respiratory tract infections in 33 million children under the age of 5 years, which in turn result in three million hospitalizations and approximately 200,000 deaths. 255 within the united states, hospital costs alone amount to >600 million dollars. 256 outbreaks are common in the winter. 257 the virus is transmitted by large respiratory droplets; it replicates initially within the nasopharynx and spreads further to the lower respiratory tract. the incubation period is 2-8 days. respiratory syncytial virus is highly virulent, leading to very few asymptomatic infections. 258 disease manifestations are highly dependent upon the age of the individual. primary infections in neonates produce nonspecific symptoms, including overall failure to thrive, apnea, and feeding difficulties. infants present with a mild upper respiratory tract disease that can develop into bronchiolitis and bronchopneumonia. contracting the virus at this age results in an increased chance of developing childhood asthma. 259 young children develop recurrent wheezing, whereas adults exacerbate previous respiratory conditions. 260 common clinical symptoms are runny nose, sneezing, and coughing accompanied by fever. mortality rates in hospitalized children are 1-3%, with the greatest burden of disease seen in 3-4-month-olds. 261 there is no vaccine available, and ribavirin usage is not recommended for routine treatment. 262 animal models were developed in the hope of formulating an effective and safe vaccine, unlike the formalin-inactivated respiratory syncytial virus vaccine (fi-rsv). this vaccine induced severe respiratory illness in infants who received the vaccine and were subsequently infected with live virus.
263 mice can be used to model disease, although a very high intranasal inoculum is needed to achieve clinical symptoms. 264, 265 strain choice is crucial to reproducing a physiologically relevant response. 266 age does not affect primary disease manifestations. 267 however, it does play a role in later sequelae, with increased airway hyperreactivity. 268 primary infection produces increased breathing effort with airway obstruction. 264, 269 virus was detected as early as day 3 and reached maximum titer at day 6 postinfection. clinical illness is defined in the mouse by weight loss and ruffled fur, as opposed to the runny nose, sneezing, and coughing seen in humans. cotton rats are useful as a small animal disease model. the virus is able to replicate to high titers within the lungs and can be detected in both the upper and lower airways after intranasal inoculation. 270, 271 it has been reported that viral replication is 50- to 1000-fold greater in the rat model than in the mouse model. 272 the rats develop mild-to-moderate bronchiolitis or pneumonia. 273 although age does not seem to factor into clinical outcome, it has been reported that older rats tend to take longer to achieve viral clearance. viral loads peak by the fifth day, dropping below the levels of detection by day 8. the histopathology of the lungs appears similar to that in humans after infection. 274 this model has limited usage in modeling the human immune response to infection, as challenge with the virus creates a th2 response, whereas humans tend to skew toward th1. [275] [276] [277] fi-rsv disease was recapitulated upon challenge with live virus in animals vaccinated twice with fi-rsv. chinchillas have been challenged experimentally via intranasal inoculation. the nasopharynx and eustachian tube were permissive to viral replication. the animals displayed an acute respiratory tract infection. this model is thought to be useful in studying mucosal immunity during infection.
278 chimpanzees are permissive to replication and develop clinical symptoms of respiratory syncytial virus infection, including rhinorrhea, sneezing, and coughing. adult squirrel monkeys, newborn rhesus macaques, and infant cebus monkeys were also challenged but did not exhibit any disease symptoms or high levels of viral replication. 279 bonnet monkeys were also tested and found to develop an inflammatory response by day 7, with viral rna detected in both bronchial and alveolar cells. 280 the chimpanzee model has proven useful for vaccine studies. 281, 282 sheep have also been challenged experimentally, since they develop respiratory disease when exposed to ovine respiratory syncytial virus. 283 lambs were also found to be susceptible to human respiratory syncytial virus infection. 284, 285 when inoculated intratracheally, the lambs developed an upper respiratory tract infection with cough after 6 days. some lambs went on to develop lower respiratory disease, including bronchiolitis. the pneumonia resolved within 14 days. during the course of disease, viral replication peaked at 6 days and then rapidly declined. studying respiratory disease in sheep is beneficial given the structural features of the respiratory tract shared between sheep and humans. 286, 287 the influenza viruses consist of three types, influenza a, b, and c, based on antigenic differences. influenza a is further classified by subtype; 16 ha and 9 na subtypes are known. seasonal influenza is the most common infection and usually causes a self-limited febrile illness with upper respiratory symptoms and malaise that resolves within 10 days. 288 the rate of infection is estimated at 10% in the general population and can result in billions of dollars of loss annually from medical costs and reduced workforce productivity. approximately 40,000 people in the united states die each year from seasonal influenza. 289 thus, vaccines and therapeutics play a critical role in controlling infection, and development using animal models is ongoing.
290 influenza virus replicates in the upper and lower airways, peaking at approximately 48 h postexposure. infection can be more severe in infants and children under the age of 2 years, people over the age of 65 years, or immunocompromised individuals, in whom viral pneumonitis or pneumonia can develop, or bacterial superinfection can result in pneumonia or sepsis. 291 pneumonia from secondary bacterial infection, such as streptococcus pneumoniae, streptococcus pyogenes, and neisseria meningitidis, and more rarely staphylococcus aureus, is more common than viral pneumonia from the influenza itself, accounting for approximately 27% of all influenza-associated fatalities. 292 death, often due to ards, can occur as early as 2 days after the onset of symptoms. lung histopathology in severe cases may include dad, alveolar edema and damage, hemorrhage, fibrosis, and inflammation. 288 the h5n1 avian strain of influenza has a lethality rate of approximately 60% (of known cases), likely because the virus preferentially binds to cells of the lower respiratory tract, and thus its potential for global spread is a major concern. 293 the most frequently used animal models of influenza infection include mice, ferrets, and nhps. a very thorough guide to working with mouse, guinea pig, ferret, and cynomolgus models was published by kroeze et al. 294 the lethality rate can vary with the virus strain used (with or without adaptation), dose, route of inoculation, age, and genetic background of the animal. the various animal models can capture the differing diseases caused by influenza: benign, severe, superinfection and sepsis, severe with ards, and neurologic manifestations. 290 models can use seasonal or avian strains, and models have been developed to study transmission, important for understanding the potential of more lethal strains such as h5n1 to spread among humans.
mouse models of influenza infection are very predictive for antiviral activity and tissue tropism in humans and are useful in testing and evaluating vaccines. 295 inoculation is by the intranasal route, using approximately 60 µl of inoculum in each nare of anesthetized mice. exposure may also be to small-particle aerosols containing influenza with an mmad of <5 µm. most inbred strains are susceptible, with particularly frequent use of balb/c followed by c57bl/6j mice. males and females have equivalent disease, but influenza is generally more infectious in younger 2- to 4-week-old (8-10 g) mice. mice are of somewhat limited use in characterizing the immune response to influenza. mice lack the mxa gene, which is an important part of the human innate immune response to influenza infection. the mouse homolog of mxa, mx1, is defective in most inbred mouse strains. 296 weight loss or reduced weight gain, decreased activity, huddling, ruffled fur, and increased respiration are the most common clinical signs. for more virulent strains, mice may require euthanasia as early as 48 h postexposure, but most mortality occurs from 5 to 12 days postexposure, accompanied by decreases in rectal temperature. 297 pulse oximeter readings and measurement of blood gases and oxygen saturation are also used to determine the impact of influenza infection on respiratory function. 298 virus can be isolated from bronchial lavage fluids throughout the infection and from tissues after euthanasia. for influenza strains with mild-to-moderate pathogenicity, disease is nonlethal and virus replication is detected within the lungs, but usually not in other organs. increases in serum alpha-1-acid glycoprotein and lung weight are also frequently present. however, mice infected with influenza do not develop fever, dyspnea, nasal exudates, sneezing, or coughing. mice can be experimentally infected with influenza a or b, but the virus generally requires adaptation to produce clinical signs.
mice express the receptors for influenza attachment in the respiratory tract; however, the distribution varies, and saα2,3 predominates over saα2,6, which is why h1, h2, and h3 subtypes usually need to be adapted to mice whereas h5n1, h2, h6, and h7 viruses do not require adaptation. 299 to adapt a strain, mice are infected intratracheally or intranasally, virus isolated from the lungs is reinoculated into mice, and the process is repeated a number of times. once adapted, influenza strains can produce severe disease, systemic spread, and neurotropism. however, h5n1 and the 1918 pandemic influenza virus can cause lethal infection without adaptation. 300 h5n1 infection of mice results in viremia and viral replication in multiple organ systems, severe lung pathology, fulminant diffuse interstitial pneumonia, pulmonary edema, high levels of proinflammatory cytokines, and marked lymphopenia. 301 as in humans, the virulence of h5n1 is attributable to damage caused by an overactive host immune response. additionally, mice infected with the 1918 h1n1 influenza develop severe lung pathology and oxygen saturation levels that decrease with increasing pneumonia. 302 in superinfection models, a sublethal dose of influenza is given to mice, followed 7 days later by intranasal inoculation of a sublethal dose of a bacterial strain such as s. pneumoniae or s. pyogenes. 303 morbidity, characterized by inflammation in the lungs but not bacteremia, begins a couple of days after superinfection and may continue for up to 2 weeks. at least one transmission model has also been developed in mice. with h2n2 influenza, transmission rates of up to 60% among cage mates can be achieved after infection by the aerosol route and cocaging after 24 h.
304 domestic ferrets (mustela putorius furo) are frequently the animal species of choice for influenza studies because their susceptibility, clinical signs, peak virus shedding, kinetics of transmission, local expression of cytokine mrnas, and pathology resemble those of humans. [305] [306] [307] ferrets also have airway morphology, respiratory cell types, and a distribution of influenza receptors (saα2,6 and saα2,3) within the airways similar to those of humans. 308 influenza was first isolated from ferrets infected intranasally with throat washes from humans harboring the infection, and ferret models have since been used to test the efficacy of vaccines and therapeutic treatments. 309 when performing influenza studies in ferrets, animals should be serologically negative for circulating influenza viruses. infected animals should be placed in a separate room from uninfected animals; if animals must be placed in the same room, uninfected ferrets should be handled before infected ferrets. anesthetized ferrets are experimentally exposed to influenza by intranasal inoculation of 0.25-0.5 ml containing approximately 10^4-10^6 egg id50, applied dropwise to each nostril. influenza types a and b naturally infect ferrets, resulting in an acute illness that usually lasts 3-5 days for mildly to moderately virulent strains. 310 ferrets are more susceptible to influenza a than to influenza b strains and are also susceptible to avian influenza h5n1 strains without adaptation. 311 the virulence and degree of pneumonitis caused by different influenza subtypes and strains vary from mild to severe and generally mirror those seen in humans. nonadapted h1n1, h2n2, and h3n2 have mild-to-moderate virulence in ferrets. strains of low virulence replicate predominantly in the nasal turbinates.
clinical signs and other disease indicators are similar to those of humans, with a mild respiratory disease, sneezing, nasal secretions containing virus, fever, weight loss, high viral titers, inflammatory infiltrate in the airways, bronchitis, and pneumonia. 312 replication in both the upper and lower airways is associated with more severe disease and greater mortality. additionally, increased expression of proinflammatory mediators and reduced expression of antiinflammatory mediators in the lower respiratory tract of ferrets correlate with severe disease and lethal outcome. h5n1-infected ferrets develop severe lethargy, a greater ifn response, transient lymphopenia, and replication in the respiratory tract, brain, and other organs. 313 old and new world primates are susceptible to influenza infection and have an advantage over ferret and mouse models, which are deficient for h5n1 vaccine studies because of a lack of correlation with hemagglutination inhibition. 314 of the old world primates, cynomolgus macaques (m. fascicularis) are most frequently used for studies of vaccines and antiviral drug therapies. 315, 316 h5n1 and 1918 h1n1 infections of cynos are very similar to those in humans. 317 cynos develop fever and ards upon intranasal inoculation of h5n1, with necrotizing bronchial interstitial pneumonia. 318 nhps are challenged by multiple routes (ocular, nasal, and tracheal) simultaneously with 1 × 10^6 pfu per site. virus antigen is primarily localized to the tonsils and pulmonary tissues. infection of cynos with h5n1 results in fever, lethargy, nasal discharge, anorexia, weight loss, virus shedding in nasal and tracheal washes, pathologic and histopathologic changes, and alveolar and bronchial inflammation. the 1918 h1n1 caused a very high mortality rate due to an aberrant immune response and ards, with >50% lethality (in humans the lethality was only 1-3%).
ards and mortality also occur with the more pathogenic strains, but nhps show reduced susceptibility to less virulent strains such as h3n2. 299 influenza-infected rhesus macaques represent a mild disease model of pathogenesis for vaccine and therapeutic efficacy studies. 319 other nhp models include influenza infection of pigtailed macaques as a mild disease model and infection of new world primates such as squirrel and cebus monkeys. 320 rats (f344 and sd) inoculated with rat-adapted h3n2 developed inflammatory infiltrates and cytokines in bronchoalveolar lavage fluids but had no lethality and few histopathological changes. 321 additionally, an influenza transmission model has been developed in guinea pigs as an alternative to ferrets. 322 cotton rats (sigmodon hispidus) have been used to test vaccines and therapeutics in a limited number of studies. 323, 324 cotton rats have an advantage over mice in that their immune system is similar to that of humans (including the presence of the mx gene) and influenza viruses do not have to be adapted. 325, 326 nasal and pulmonary tissues of cotton rats become infected, with upregulated cytokines and lung viral load peaking at 24 h postexposure. virus was cleared from the lung by day 3 and from the nares by day 6, but animals had bronchial and alveolar damage and pneumonia for up to 3 weeks. there is also an s. aureus superinfection model in cotton rats. 327 coinfection resulted in bacteremia, high bacterial load in the lungs, peribronchiolitis, pneumonitis, alveolitis, hypothermia, and higher mortality. domestic pig influenza models have been developed for vaccine studies for swine flu. pigs are susceptible in nature as natural or intermediate hosts but are not readily susceptible to h5n1. 328, 329 although pigs infected with influenza may have fever, anorexia, and respiratory signs such as dyspnea and cough, mortality is rare.
330 size and space requirements make pigs difficult to work with, although the development of minipig (ellegaard gottingen) models may provide an easier-to-use alternative. in livestock infected with rift valley fever virus (rvfv), after the incubation period the animals exhibit signs of fever, hepatitis, and abortion, which is a hallmark diagnostic sign known among farmers. 331 mosquito vectors, unpasteurized milk, aerosols of infected animals' body fluids, or direct contact with infected animals are the important routes of transmission to humans. 332, 333 after an incubation period of 2-6 days, rvfv causes a wide range of signs and symptoms in humans, ranging from asymptomatic infection to severe disease with hepatitis, vision loss, encephalitis, and hemorrhagic fever. [334] [335] [336] depending on the severity of the disease when symptoms start, 10-20% of hospitalized patients may die 3-6 days or 12-17 days after disease onset. 334 hepatic failure, renal failure or disseminated intravascular coagulation (dic), and encephalitis are demonstrated in patients during postmortem examination. mice are one of the animal species most susceptible to rvfv infection. subcutaneous or intraperitoneal routes of infection cause acute hepatitis and lethal encephalitis at a late stage of the disease in mice. 337, 338 mice start to exhibit signs of decreased activity and ruffled fur by day 2-3 postexposure. immediately after these signs are observed, they become lethargic and generally die 3-6 days postexposure. ocular disease or the hemorrhagic form of the disease has not been observed in mouse models so far. 334 increased viremia and tissue tropism were reported in mice, 338 with increased liver enzymes and lymphopenia observed in sick mice. rats and gerbils are also susceptible to rvfv infection. rats' susceptibility depends on the rat strain used for the challenge model. an age dependence in the susceptibility of rats was also noted.
although the wistar-furth and brown norway strains and young rats are highly susceptible to rvfv infection, the fischer 344, buffalo, and lewis strains, and old rats, demonstrated resistance to infection. 339, 340 similar pathologic changes, such as liver damage and encephalopathy, were observed in both rats and mice. there was no liver involvement in the gerbil model, and animals died from severe encephalitis. the mortality rate was dependent on the strain used and the dose given to the gerbils. 341 similar to the rat model, the susceptibility of gerbils was also dependent on age. so far, studies have shown that rvfv does not cause uniform lethality in an nhp model. intraperitoneal, intranasal, iv, and aerosol routes have been used to develop the nhp model. rhesus macaques, cynomolgus macaques, african monkeys, and south american monkeys are some of the nhp species used for this effort. 342 monkeys showed a variety of signs ranging from febrile disease to hemorrhagic disease and mortality. transient viremia, increased coagulation parameters (pt, aptt), and decreased platelets were some other signs observed in nhps. animals that succumbed to disease showed pathogenesis very similar to that seen in humans, such as pathological changes in the liver and hemorrhagic disease. there was no ocular involvement in this model. recently, smith et al. compared iv, intranasal, and subcutaneous routes of infection in common marmosets and rhesus macaques. 343 marmosets were more susceptible to rvfv infection than rhesus macaques, with marked viremia, acute hepatitis, and late onset of encephalitis. increased liver enzymes were observed in both species. necropsy results showed enlarged livers in marmosets exposed by the iv or subcutaneous routes. although there were no gross lesions in the brains of marmosets, histopathology showed encephalitis in the brains of intranasally challenged marmosets.
crimean-congo hemorrhagic fever virus (cchfv) generally circulates unnoticed in nature in an enzootic tick-vertebrate-tick cycle and, similar to other zoonotic agents, seems to produce little or no disease in its natural hosts but causes severe disease in humans. cchfv is transmitted to humans by ixodid ticks, by direct contact with sick animals/humans, or by the body fluids of animals/humans. 344 incubation, prehemorrhagic, hemorrhagic, and convalescence are the four phases of the disease seen in humans. the incubation period lasts 1-9 days. during the prehemorrhagic phase, patients show signs of nonspecific flu-like disease for approximately a week. the hemorrhagic period results in circulatory shock and dic in some patients. 345, 346 over the years, several attempts have been made to establish an animal model for cchf in adult mice, guinea pigs, hamsters, rats, rabbits, sheep, nhps, etc. [347] [348] [349] [350] until recently, the only animal that manifested disease was the newborn mouse. intraperitoneal infection of infant mice with cchfv caused fatality around day 8 postinfection. 351 pathogenesis studies showed that virus replication was first detected in the liver, with subsequent spread to the blood (serum). virus was detected very late during the disease course in other tissues, including the heart (day 6) and the brain (day 7). recent studies using knockout adult mice have succeeded in developing a lethal small animal model of cchfv infection. 352, 353 bente et al. infected stat1 knockout mice by the intraperitoneal route. in this model, after signs of fever, leucopenia, thrombocytopenia, viremia, elevated liver enzymes, and proinflammatory cytokines, mice became moribund and succumbed to disease 3-5 days postexposure. the second model was developed using ifn-alpha/beta (ifn-a/b) receptor knockout mice. 353 similar observations were made in this model as in the stat1 knockout mouse model.
the animals were moribund and died 2-4 days after exposure, with high viral loads in the liver and spleen. other laboratory animals, including nhps, show little or no sign of infection or disease when infected with cchfv. 348 butenko et al. used agms (cercopithecus aethiops) for experimental cchfv infections. except for one monkey with a fever on day 4 postinfection, the animals did not exhibit signs of disease. antibodies to the virus were detected in three out of five monkeys, including the one with fever. in 1975, fagbami et al. infected two patas monkeys (erythrocebus patas) and one guinea baboon (papio papio) with cchfv. 347 although all three animals had low-level viremia between days 1 and 5 after inoculation, only the baboon serum had neutralizing antibody activity on day 137 postinfection. similar results were obtained when horses and donkeys were used for experimental cchfv infections. donkeys develop a low-level viremia, 354 and horses developed little or no viremia but high levels of virus-neutralizing antibodies, which remained stable for at least 3 months. these studies suggest that horses may be useful in the laboratory for obtaining serum for diagnostic and possibly therapeutic purposes. 355 shepherd et al. infected 11 species of small african wild mammals, as well as laboratory rabbits, guinea pigs, and syrian hamsters, with cchfv. 349 although scrub hares (lepus saxatilis), cape ground squirrels (xerus inauris), red veld rats (aethomys chrysophilus), white-tailed rats (mystromys pumilio), and guinea pigs had viremia, south african hedgehogs (atelerix frontalis), highveld gerbils (tatera brantsii), namaqua gerbils (desmodillus auricularis), two species of multimammate mouse (mastomys natalensis and m. coucha), and syrian hamsters were negative. all species, regardless of viremia levels, developed antibody responses against cchfv.
iv and intracranially infected animals showed an earlier onset of viremia than those infected by the subcutaneous or intraperitoneal routes. the genus hantavirus is unique within the family bunyaviridae in that it is transmitted not by an arthropod vector but by rodents. 356 rodents of the family muridae are the primary reservoir for hantaviruses. infected host animals develop a persistent infection that is typically asymptomatic. transmission is achieved by the inhalation of infected rodent saliva, feces, and urine. 357 human infections can normally be traced to a rural setting, with activities such as farming, land development, hunting, and camping as possible sites of transmission. rodent control is the primary route of prevention. 358 the viruses have a tropism for endothelial cells within the microvasculature of the lungs. 359 there are two distinct clinical diseases that infection can yield: hemorrhagic fever with renal syndrome (hfrs), due to infection with old world hantaviruses, or hantavirus pulmonary syndrome (hps), caused by new world hantaviruses. 360 hfrs is mainly seen outside of the americas and is associated with the hantaviruses dobrava-belgrade (also known as dobrava), hantaan, puumala, and seoul. 358 incubation lasts two to three weeks, and the disease presents as flu-like in the initial stages, which can further develop into hemorrhagic manifestations and ultimately renal failure. thrombocytopenia subsequently develops, which can progress to shock in approximately 15% of patients. the overall mortality rate is 7%. infection with the dobrava and hantaan viruses is typically linked to the development of severe disease. hps was first diagnosed in 1993 within the southwestern united states, when healthy young adults became suddenly ill, progressing to severe respiratory distress and shock. the etiological agent responsible for this outbreak was identified as sin nombre virus. 361 this virus is still the leading cause of hps within north america.
hps due to other hantaviruses has been reported in argentina, bolivia, brazil, canada, chile, french guiana, panama, paraguay, and uruguay. 362, 363 the first report of hps in maine was recently documented. 361 andes virus was first identified in outbreaks in chile and argentina. this hantavirus is distinct in that it can be transmitted between humans. 364 the fulminant disease is more lethal than that observed for hfrs, with a mortality rate of 40%. there are four phases of disease: prodromal, pulmonary, cardiac depression, and hematologic manifestation. 365 symptoms typically appear 14-17 days after exposure. 366 unlike hfrs, renal failure is not a major contributing factor to the disease. there is a short prodromal phase that gives way to cardiopulmonary involvement accompanied by cough and gastrointestinal symptoms. it is at this point that individuals are typically admitted to the hospital. pulmonary function deteriorates within 48 h of cardiopulmonary involvement. interstitial edema and air-space disease normally follow. in fatal cases, cardiogenic shock has been noted. 367 vaccine development has been hampered by the vast diversity of hantaviruses and the limited number of outbreaks. 368 syrian golden hamsters are the most widely used small animal model for hantavirus infection. hamsters inoculated intramuscularly with a passaged andes virus strain died within 11 days postinfection. clinical signs did not appear until 24 h before death, at which point the hamsters were moribund and in respiratory distress. mortality was dose dependent, with high inocula leading to a shorter incubation before death. during the same study, hamsters were inoculated with a passaged sin nombre isolate. no hamsters developed any symptoms during the course of observation, although an antibody response to the virus, which was not dose dependent, was detected via an enzyme-linked immunosorbent assay.
hamsters infected with andes virus were found to have significant histopathological changes to their lung, liver, and spleen. all had an interstitial pneumonia with intraalveolar edema. infectious virus could be recovered from these organs. viremia began on day 8 and lasted up to 12 days postinfection. infection of hamsters with andes virus yielded a clinical disease progression similar to that seen in human hps, including rapid progression to death, fluid in the pleural cavity, and significant histopathological changes to the lungs and spleen. a major deviation in the hamster model is the detection of infectious virus within the liver. 369 lethal disease can be induced in newborn mice but does not recapitulate the clinical symptoms observed in human disease. 370 exposure of adult mice to hantaan virus leads to a fatal disease dependent upon viral strain and route of infection. the disease progression is marked by neurological or pulmonary manifestations that do not mirror human disease. 371, 372 knockout mice lacking ifn-α/β were found to be highly susceptible to hantaan virus infection. 373 in a study looking at a panel of laboratory strains of mice, c57bl/6 mice were found to be most susceptible to a passaged hantaan viral strain injected intraperitoneally. animals progressed to neurological manifestations, including paralysis and convulsions, and succumbed to infection within 24-36 h postinfection. clinical disease was markedly different from that observed in human cases. 372 nhps have been challenged with new world hantaviruses; however, no clinical signs were reported. 374, 375 cynomolgus monkeys challenged with a clinical isolate of puumala virus developed a mild disease. 376, 377 challenge of cynomolgus macaques with andes virus by both iv and aerosol exposure led to no signs of disease. all animals did display a drop in total lymphocytes within 5 days postinfection. viremia developed in 4 of 6 aerosol-exposed monkeys and in 8 of 11 iv-injected monkeys.
infectious virus could not be isolated from any of the animals. the family arenaviridae is composed of two serogroups: the old world arenaviruses, including lassa fever virus and lymphocytic choriomeningitis virus, and the new world viruses, including pichinde virus and junin virus. all these viruses share common clinical manifestations. 378 lassa fever virus is endemic in parts of west africa, and outbreaks are typically seen in the dry season between january and april. 379 this virus is responsible for 100,000-500,000 infections per year, leading to approximately 5,000 deaths. 380 outbreaks have been reported in guinea, sierra leone, liberia, nigeria, and the central african republic. however, cases have sprung up in germany, the netherlands, the united kingdom, and the united states due to transmission to travelers on commercial airlines. 381 transmission of this virus typically occurs via rodents, in particular the multimammate rat, mastomys species complex. 379 humans become infected by inhaling the aerosolized virus or eating contaminated food. human-to-human transmission by direct contact with infected secretions or needle-stick injuries has also been noted. the majority of infections are asymptomatic; however, severe disease can occur in 20% of individuals. the incubation period is from 5 to 21 days, and the initial onset is characterized by flulike illness. this is followed by diarrheal disease that can progress to hemorrhagic symptoms including encephalopathy, encephalitis, and meningitis. a third of patients develop deafness in the early phase of disease, which is permanent for a third of those affected. the overall fatality is about 1%; however, among those admitted to the hospital, it is between 15% and 25%. there is no approved vaccine, and besides supportive measures, ribavirin is effective only if started within 7 days. 382, 383 the primary animal model used to study lassa fever is the rhesus macaque.
384 aerosolized infection with lymphocytic choriomeningitis virus has been a useful model for lassa fever. both rhesus and cynomolgus monkeys exposed to the virus developed disease, but rhesus more closely mirrored the disease course and histopathology observed in human infection. 385 iv or intragastric inoculation of the virus led to severe dehydration, erythematous skin, submucosal edema, necrotic foci in the buccal cavity, and respiratory distress. the liver was severely affected by the virus, as depicted by measuring the liver enzymes ast and alt. 386 disease was dose dependent, with iv, intramuscular, and subcutaneous inoculation requiring the least amount of virus to induce disease. aerosol infection and ingestion of contaminated food could also be used, and these mimic a more natural route of infection. 387 within this model, the nhp becomes viremic after 4-6 days. clinical manifestations were present by day 7, and death typically occurred within 10-14 days. 388, 389 intramuscular injection of lassa virus into cynomolgus monkeys also produced a neurological disease due to lesions within the cns. 390 this pathogenicity is seen in select cases of human lassa fever. 391, 392 a marmoset model has recently been defined using a subcutaneous injection of lassa fever virus. virus was initially detected by day 8, and viremia was achieved by day 14. liver enzymes were elevated, and an enlarged liver was noted upon autopsy. there was a gradual reduction in platelets, and interstitial pneumonitis was diagnosed in a minority of animals. the physiological signs were the same as those seen in fatal human cases. 393 mice develop a fatal neurological disorder upon intracerebral inoculation with lassa, although the outcome of infection is completely dependent upon the major histocompatibility complex (mhc) background and age of the animal, along with the route of inoculation. 394 guinea pig inbred strain 13 was found to be highly susceptible to lassa virus infection.
the outbred hartley strain was less susceptible, and thus, strain 13 has been the preferred model given its assured lethality. the clinical manifestations mirror those seen in humans and rhesus. 395 infection with pichinde virus that has been passaged in guinea pigs has also been used. disease signs include fever, weight loss, vascular collapse, and eventual death. 396, 397 the guinea pig is an excellent model given that it not only develops a disease pattern similar to that of humans but also shows similar viral distribution, histopathology, and immune response. 398, 399 infection of hamsters with a cotton rat isolate of pirital virus produces disease similar to that characterized in humans and in the nhp and guinea pig models. the virus was injected intraperitoneally, resulting in the animals becoming lethargic and anorexic within 6-7 days. virus was first detected at 3 days and reached maximum titers within 5 days. neurological symptoms began to appear at the same time, and all the animals died by day 9. pneumonitis, pulmonary hemorrhage, and edema were also present. 400 these results were recapitulated with a nonadapted pichinde virus. [401] [402] [403] globally, diarrheal disease is the leading cause of death, with rotavirus being one of the main etiological agents responsible. according to the world health organization, rotavirus alone is responsible for a third of all hospitalizations related to diarrhea and 500,000-600,000 deaths per year. 404 the virus is very stable due to its three-layer capsid, which allows it to be transmitted via the fecal-oral route, depositing itself in the small intestine. rotavirus is highly contagious, and as few as 10 virus particles are needed to cause symptomatic disease. 405 the host determinant with the greatest influence on clinical outcome is age. neonates typically are asymptomatic, which is suggested to be due to the existence of maternal antibodies.
hence, the most susceptible age group is 3 months to 2 years, coinciding with a drop in these protective antibodies. 406 within this age range, children will develop noninflammatory diarrhea. virus replicates in the intestinal villus enterocytes, resulting in their destruction and malabsorption of needed electrolytes and nutrients. symptoms of disease include watery, nonbloody diarrhea with vomiting, fever, and potentially dehydration that lasts up to a week. 407 there is a short episode of viremia during the course of infection. 408 mice can be used as both an infection and a disease model depending upon age at challenge. mice <14 days old develop disease, whereas older mice are able to clear the infection before the onset of symptoms. 409 this precludes the study of active vaccination against disease in the infection model. in the adult mouse model, the course of the infection is monitored via viral shedding within the stool. 410 infant mice, specifically balb/c, receiving an oral inoculation of a clinical strain of virus developed diarrhea within 24 h postinfection, and 95% of those exposed developed symptoms within 72 h postinfection. symptoms lasted from 2 to 4 days with no mortality. viral shedding was at its peak at 24 h and lasted up to 5 days. there were histopathological changes within the small intestine, localized to the villi, that were reversible. 407 within the adult mouse model, oral inoculation of a mouse rotavirus strain showed viral shedding by 3 days, lasting up until 6 days postinfection. 410 these mouse models have been used to study correlates of protection and therapeutic efficacy, including gastro-gard®. 409, 411, 412 rats can also be used as disease models depending upon the strain of rat. 413, 414 suckling fischer 344 rats were exposed orally to a simian strain of rotavirus. the rats were susceptible to diarrheal disease until they were 8 days old, with age determining the length of viral shedding.
415 rats have mainly been used to study the correct formulation for oral rehydration. these rodents are large enough to perform in situ intestinal perfusions. within these studies, 8-day-old rats were infected with a rat strain of rotavirus by orogastric intubation. within 24 h postinfection, the rats developed diarrhea, at which point the small intestine was perfused to compare differing solutions of oral rehydration. 416 gnotobiotic pigs are also used given that they can be infected with both porcine and human strains. 417 they are susceptible to developing clinical disease from human strains up to 6 weeks of age. they allow for the analysis of the primary immune response to the virus given that they do not receive transplacental maternal antibodies and are immune competent at birth. 418 another advantage of this model is that the gastrointestinal physiology and mucosal immune system closely resemble those of humans. 419 this model has been useful in studying correlates of protection. gnotobiotic and colostrum-deprived calves have also been used as an experimental model of rotavirus infection. they are able to develop diarrhea and shed live virus. 420 gnotobiotic lambs can also develop clinical disease upon oral inoculation with clinical strains. 421 infant baboons, agms, and rhesus macaques have all proven to be infection models, with severity measured by viral shedding. 422, 423 retroviridae human immunodeficiency virus type 1 the lentiviruses are a subfamily of retroviridae, which includes human immunodeficiency virus (hiv), a virus that infects 0.6% of the world's population. a greater proportion of infections and deaths occur in sub-saharan africa. worldwide, there are approximately 1.8 million deaths per year, with >260,000 being children. transmission of hiv occurs by exposure to infectious body fluids. there are two species, hiv-1 and hiv-2, with hiv-2 having lower infectivity and virulence (confined mostly to west africa).
the vast majority of cases worldwide are hiv-1. 424 hiv targets t-helper cells (cd4+), macrophages, and dendritic cells. 425 acute infection occurs 2-4 weeks after exposure, with flu-like symptoms and viremia, followed by chronic infection. symptoms in the acute phase may include fever, body aches, nausea, vomiting, headache, lymphadenopathy, pharyngitis, rash, and sores in the mouth or esophagus. cd8+ t-cells are activated and kill hiv-infected cells; antibody production follows, leading to seroconversion. acquired immune deficiency syndrome (aids) develops when cd4+ t-cells decline to <200 cells per microliter; thus, cell-mediated immunity becomes impaired, and the person is more susceptible to opportunistic infections and certain cancers. humanized mice, created by engrafting human cells and tissues into scid mice, have been critical for the development of mouse models for the study of hiv infection. a number of different humanized mouse models allow for the study of hiv infection in the context of intact and functional human innate and adaptive immune responses. 426 the scid-hu hiv infection model has proven to be useful, particularly in screening antivirals and therapeutics. 427 a number of different humanized mouse models have been developed for the study of hiv, including rag1−/− γc−/−, rag2−/− γc−/−, nod/scid γc−/− (hnog), nod/scid γc−/− (hnsg), nod/scid blt, and nod/scid γc−/− (hnsg) blt. cd34+ human stem cells derived from umbilical cord blood or fetal liver are used for humanization. 428 hiv-1 infection by intraperitoneal injection can be successful with as little as 5% peripheral blood engraftment. 429 vaginal and rectal transmission models have been developed in blt scid-hu mice, in which mice harbor human bone marrow, liver, and thymus tissue. hiv-1 viremia occurs within approximately 7 days pi. 430 in many of these models, the spleen, lymph nodes, and thymus tissues are highly positive for virus, similar to humans.
431 importantly, depletion of human t-cells can be observed in blood and lymphoid tissues of hiv-infected humanized mice, and at least some mechanisms of pathogenesis that occur in hiv-infected humans also occur in the hiv-infected humanized mouse models. 432 the advantage of these models is that these mice are susceptible to hiv infection, and thus, the impact of drugs on the intended viral targets can be tested. one caveat is that although mice have a "common mucosal immune system," humans do not, due to the differences in the distribution of addressins. 433 thus, murine mucosal immune responses to hiv do not reflect those of humans. there are a number of important nhp models for human hiv infection. simian immunodeficiency virus (siv) infection of macaques is widely considered to be the best platform for modeling hiv infection of humans. importantly, nhps have pharmacokinetics, metabolism, mucosal t-cell homing receptors, and vascular addressins similar to those of humans. thus, although the correlates of protection against hiv are still not completely known, immune responses to hiv infection and vaccination are likely comparable. these models mimic infection through the use of contaminated needles (iv), sexual transmission (vaginal or rectal), and maternal transmission in utero or through breast milk. [434] [435] [436] there are also macaque models to study the emergence and clinical implications of hiv drug resistance. 437 these models most routinely use rhesus macaques (macaca mulatta), cynomolgus macaques (macaca fascicularis), and pigtailed macaques (macaca nemestrina). animals of all ages are used, depending on the needs of the study. for instance, the use of newborn macaques may be more practical for evaluating the effect of prolonged drug therapy on disease progression; however, adult nhps are more frequently used. studies are performed in bsl-2 animal laboratories, and nhps must be free of simian type-d retrovirus and siv seronegative.
siv infection of pigtailed macaques is a useful model for hiv peripheral nervous system pathology, wherein an axotomy is performed and regeneration of axons is studied. 438 challenges may be through a single high dose. iv infection of rhesus macaques with 100 tcid50 of the highly pathogenic siv/deltab670 induces aids in most macaques within 5-17 months (mean of 11 months). 439 peak viremia occurs around week 4. aids in such models is often defined as cd4+ t-cells that have dropped to <50% of the baseline values. alternatively, repeated low-dose challenges are often used, depending on the requirements of the model. 440, 441 because nhps infected with hiv do not develop an infection with a clinical disease course similar to that in humans, siv or siv/hiv-1 laboratory-engineered chimeric viruses (simian-human immunodeficiency virus, or shiv) are used as surrogates. nhps infected with pathogenic siv may develop clinical disease, which progresses to aids, and are thus useful pathogenesis models. a disadvantage is that siv is not identical to hiv-1 and is more closely related to hiv-2. however, the polymerase region of siv is 60% homologous to that of hiv-1, and it is susceptible to many reverse transcriptase (rt) and protease inhibitors. siv is generally not susceptible to nonnucleoside inhibitors; thus, hiv-1 rt is usually put into siv for such studies. 442 sivmac239 is similar to hiv in the polymerase region and is therefore susceptible to nucleoside rt or integrase inhibitors. 443 nhps infected with sivmac239 have an asymptomatic period and disease progression resembling aids in humans, characterized by weight loss/wasting and cd4+ t-cell depletion. additionally, sivmac239 uses the ccr5 chemokine receptor as a coreceptor, similar to hiv, which is important for drugs that target entry. 444 nhps infected with shiv strains may not develop aids, but these models are useful in testing vaccine efficacy.
for example, rt-shivs and env-shivs are useful for the testing and evaluation of drugs that may target the envelope or rt, respectively. 442 one disadvantage of the highly virulent env-shiv (shiv-89.6p) is that it uses the cxcr4 coreceptor. of note, env-shivs that do use the ccr5 coreceptor are less virulent; viremia develops and then resolves without further disease progression. 445 simian-tropic (st) hiv-1 contains the vif gene from siv. infection of pigtailed macaques with this virus results in viremia, which can be detected for three months, followed by clearance. 446 a number of routes are used for siv or shiv infection of nhps, with iv inoculation being the most common route. mucosal routes include vaginal, rectal, and intracolonic. mucosal routes require a higher one-time dose than does the iv route for infection. for the vaginal route, female macaques are treated with depo-provera (a progestin) one month before infection to synchronize the menstrual cycle, thin the epithelial lining of the vagina, and increase the susceptibility to infection by atraumatic vaginal instillation. 447 upon vaginal instillation of 500 tcid50 of shiv-162p3, peak viremia was seen around 12 days postexposure, with >10^7 copies per milliliter, dropping thereafter to a constant level of 10^4 rna copies per milliliter at 60 days and beyond. in another example, in an investigation of the effect of vaccine plus vaginal microbicide on preventing infection, rhesus macaques were vaginally infected with a high dose of sivmac251. 448 an example of an intrarectal model used juvenile (2-year-old) pigtailed macaques, challenged intrarectally with 10^4 tcid50 of sivmne027 to study the pathogenesis related to the virulence factor vpx. 449 here, viremia peaked at approximately 10 days, with >10^8 copies per milliliter. viral rna was expressed in the cells of the mesenteric lymph nodes.
the male genital tract is seen as a viral sanctuary, with persistently high levels of hiv shedding even with antiretroviral therapy. to better understand the effect of highly active antiretroviral therapy on virus and t-cells in the male genital tract, adult (3- to 4-year-old) male cynomolgus macaques were intravenously inoculated with 50 aid50 of sivmac251, and the male genital tract tissues were tested after euthanasia by pcr, ihc, and in situ hybridization. 450 pediatric models have been developed in infant rhesus macaques through infection with siv, allowing for the study of the impact of developmental and immunological differences on the disease course. 451 importantly, mother-to-infant transmission models have also been developed. 452 pregnant female pigtailed macaques were infected during the second trimester with 100 mid50 of shiv-sf162p3 by the iv route. four of nine infants were infected: one in utero and three either intrapartum or immediately postpartum through nursing. this model is useful for the study of factors involved in transmission and the underlying immunology. nhps infected with siv or shiv are routinely evaluated for weight loss, activity level, stool consistency, appetite, virus levels in blood, and t-cell populations. cytokine and chemokine levels, antibody responses, and cytotoxic t-lymphocyte responses may also be evaluated. the ultimate goal of an hiv vaccine is sterilizing immunity (preventing infection). however, a more realistic result may be to reduce the severity of infection and permanently prevent progression. strategies have included live attenuated, nonreplicating, and subunit vaccines. these have variable efficacy in nhps due to the genetics of the host (mhc and tripartite motif (trim) alleles), differences between challenge strains, and challenge routes. 453 nhp models have led to the development of antiviral treatments that are effective at reducing viral load and indeed transmission of hiv among humans.
one preferred variation on the models for testing the long-term clinical consequences of antiviral treatment is to use newborn macaques and treat from birth onward, in some cases for more than a decade. 454 unfortunately, however, successes in nhp studies do not always translate to success in humans, as seen with the recent step study that used an adenovirus-based vaccine approach. 455 vaccinated humans were not protected and may have even been more susceptible to hiv; viremia was not reduced, and the infections were not attenuated as hoped. with regard to challenge route, the iv route is more difficult to protect against than mucosal routes and is used as a "worst-case scenario." however, efficacy at one mucosal route is usually comparable to that at other mucosal routes. human and animal papillomaviruses cause benign epithelial proliferations (warts) and malignant tumors of the various tissues that they infect. 456 there are >100 human papillomaviruses (hpvs), with different strains causing warts on the skin, oropharynx, nasopharynx, larynx, and anogenital tissues. approximately a third of these are transmitted sexually. of these, virulent subtypes such as hpv-16, hpv-18, hpv-31, hpv-33, and hpv-45 place individuals at high risk for cervical and other cancers. major challenges in the study of these viruses are that papillomaviruses generally do not infect any species outside of their natural hosts and can cause a very large spectrum of severity. thus, no animal models have been identified that are susceptible to hpv. however, a number of useful surrogate models exist that use animal papillomaviruses in their natural host, or a very closely related species. 457, 458 these models have facilitated the recent development of useful and highly effective prophylactic hpv vaccines.
459 wild cottontail rabbits (sylvilagus floridanus) are the natural host for cottontail rabbit papillomavirus (crpv), but this virus also infects domestic rabbits (oryctolagus cuniculus), a very closely related species. 460 in this model, outcomes can range from cutaneous squamous cell carcinomas on one end of the spectrum to spontaneous regression on the other. lesions resulting from crpv in domestic rabbits do not typically contain infectious virus. canine oral papillomavirus (copv) causes florid warty lesions in the mucosa of the oral cavity within 4-8 weeks postexposure in experimental settings. 461 the mucosotropic nature of these viruses and the resulting oropharyngeal papillomas, which are morphologically similar to human vaginal papillomas caused by hpv-6 and hpv-11, make this a useful model. 462 these lesions typically regress spontaneously 4-8 weeks after appearing; this model is therefore useful in understanding the interplay between the host immune defense and viral pathogenesis. male and female beagles, aged 10 weeks to 2 years, with no history of copv, are typically used for these studies. infection is achieved by the application of a 10 ml droplet of virus extract to multiple 0.5 cm² scarified areas within the mucosa of the upper lip of anesthetized beagles. 463 bovine papillomavirus (bpv) has a wider host range than do most papillomaviruses, infecting the fibroblasts of numerous ungulates. 458 bpv-4 infection of cattle feeding on bracken fern, which is carcinogenic, can result in lesions of the oral and esophageal mucosa that lack detectable viral dna. bpv infections in cattle can result in a range of diseases such as skin warts, cancer of the upper gastrointestinal tract and urinary bladder, and papillomatosis of the penis, teats, and udder.
finally, a sexually transmitted papillomavirus of rhesus and cynomolgus macaques, rhesus papillomavirus, is very similar to hpv-16 and is associated with the development of cervical cancer. 464 mice cannot be used to study disease caused by papillomaviruses unless they are engrafted with relevant tissue, but they are often used to look at the immunogenicity of vaccines. 465, 466 herpesviridae please see chapter 25. monkeypox virus (mpxv) causes disease in both animals and humans. human monkeypox, which is clinically almost identical to ordinary smallpox, occurs mostly in the rainforests of central and western africa. the virus is maintained in nature in rodent reservoirs, including squirrels. 467, 468 mpxv was discovered during a pox-like disease outbreak among laboratory java macaques in denmark in 1958. no human cases were observed during this outbreak. the first human case was not recognized as a distinct disease until 1970 in zaire (the present drc), with the continued occurrence of a smallpox-like illness despite eradication efforts of smallpox in this area. during the global eradication campaign, extensive vaccination in central africa decreased the incidence of human monkeypox, but the absence of immunity in the generation born since that time and increased dependence on bush meat have resulted in renewed emergence of the disease. in the summer of 2003, a well-known outbreak in the midwest was the first occurrence of monkeypox disease in the united states and the western hemisphere. among 72 reported cases, 37 human cases were laboratory confirmed during the outbreak. 469, 470 it was determined that native prairie dogs (cynomys sp.) housed with rodents imported from ghana in west africa were the primary source of the outbreak. the virus is mainly transmitted to humans while handling infected animals or by direct contact with the infected animal's body fluids or lesions. person-to-person spread occurs by large respiratory droplets or direct contact.
471 most of the clinical features of human monkeypox are very similar to those of ordinary smallpox. 472 after a 7- to 21-day incubation period, the disease begins with fever, malaise, headache, sore throat, and cough. the main sign of the disease that distinguishes monkeypox from smallpox is swollen lymph nodes (lymphadenitis), which is observed in most patients before the development of rash. 471, 473 a typical maculopapular rash follows the prodromal period, which generally lasts 1-3 days. the average size of the skin lesions is 0.5-1 cm, and the progression of lesions follows the order macules, papules, vesicles, pustules, umbilication, then scab and desquamation, and typically lasts 2-4 weeks. the fatality rate is 10% among the unvaccinated population, and death generally occurs during the second week of the disease. 469, 471 mpxv is highly pathogenic for a variety of laboratory animals, and so far, many animal models have been developed by using different species and different routes of exposure (table 38.3). because variola virus is unavailable for developing animal models, and because mpxv produces similar disease manifestations in humans, mpxv is one of the poxviruses used very heavily to develop a number of small animal models via different routes of exposure. wild-derived inbred mice, stat1-deficient c57bl/6 mice, prairie dogs, african dormice, and ground squirrels are highly susceptible to mpxv by different exposure routes. [474] [475] [476] [477] [478] [479] [480] [481] cast/eij mice, one of the 38 inbred mouse strains tested for susceptibility to mpxv, showed weight loss and dose-dependent mortality after intranasal exposure to mpxv. studies with the intraperitoneal route of challenge indicated a higher susceptibility to mpxv, with an almost 50-fold lower ld50 when compared to the intranasal route. 474 scid-balb/c mice were also susceptible to the intraperitoneal challenge route, and the disease resulted in mortality by day 9 postinfection.
476 similarly, c57bl/6 stat1−/− mice were infected intranasally with mpxv, and the infection resulted in weight loss and mortality 10 days postexposure. the mouse models mentioned here are very promising for the screening of therapeutics against poxviruses, but testing in additional models will be required for advanced development. high doses of mpxv by the intraperitoneal or intranasal route caused 100% mortality in ground squirrels at 6 days postexposure and 8 days postexposure, respectively. 480 the disease progressed very quickly, and most of the animals were lethargic and moribund by day 5 postexposure without any pox lesions or respiratory changes. a comparison study of the usa mpxv strain and a central african strain of mpxv in ground squirrels by the subcutaneous route resulted in systemic disease and mortality in 6-11 days postexposure. the disease resembles hemorrhagic smallpox, with nose bleeds, impaired coagulation parameters, and hemorrhage in the lungs of the animals. because in the us outbreak the virus was transmitted by infected prairie dogs, this animal model has recently been studied much further than other small animal models and used to test therapeutics and vaccines. 475, 481, 491, 492 studies using intranasal, intraperitoneal, and intradermal routes of exposure showed that mpxv was highly infectious for prairie dogs. using the west african mpxv strain, the intraperitoneal route caused more severe disease and 100% mortality compared with challenge by the intranasal route. anorexia and lethargy were common signs of the disease for both exposure routes. in contrast to the intraperitoneal route, the intranasal route of exposure caused severe pulmonary edema and necrosis of the lungs in prairie dogs, while splenic necrosis and hepatic lesions were observed in intraperitoneally infected animals. 481 recent studies by hutson et al.
used intranasal and intradermal infections with west african and congo basin strains and showed that both strains and routes caused smallpox-like disease with longer incubation periods and generalized pox lesions. 475 therefore, this model can be used for testing therapeutics and vaccines against poxviruses. the african dormouse is susceptible to mpxv by the footpad injection route or the intranasal route. 478 the dormice exhibited decreased activity, hunched posture, dehydration, conjunctivitis, and weight loss. viral doses of 200 and 2000 pfu produced 100% mortality with a mean time to death of 8 days. upper gastrointestinal hemorrhage, hepatomegaly, lymphadenopathy, and hemorrhage in the lungs were observed during necropsy. with hemorrhage in several organs, this model resembles hemorrhagic smallpox. considering the limited availability of ground squirrels and african dormice, the lack of reagents for these species, and the resemblance to hemorrhagic smallpox disease, these models are not very attractive for further characterization and vaccine and countermeasure testing studies. nhps were exposed to mpxv by several different routes to develop animal models for mpxv. 482, 483, 485, 489, 490 during our studies using an aerosol route of exposure, we observed that macaques had mild anorexia, depression, fever, and lymphadenopathy on day 6 postexposure. 482 complete blood counts and clinical chemistries showed abnormalities similar to those of human monkeypox cases, with leukocytosis and thrombocytopenia. 493 whole-blood and throat swab viral loads peaked around day 10 and, in survivors, gradually decreased until day 28 postexposure. because doses of 4 × 10^4, 1 × 10^5, or 1 × 10^6 pfu resulted in lethality for 70% of the animals, whereas a dose of 4 × 10^5 pfu resulted in 85% lethality, survival was not dose dependent. the main pitfall of this model was the lack of pox lesions. with high doses, animals succumbed to disease before they could develop pox lesions.
with the low challenge dose, pox lesions were observed, but they were few in comparison to the iv model. mpxv causes dose-dependent disease in nhps when given by the iv route. 490 studies showed that iv challenge with 1 × 10^7 pfu results in systemic disease with fever, lymphadenopathy, maculopapular rash, and mortality. an intratracheal infection model deposits virus into the trachea, delivering it directly to the airways and bypassing the upper respiratory system, without regard to particle size or the physiological deposition that occurs during inhalation. fibrinonecrotic bronchopneumonia was described in animals that received 10^7 pfu of mpxv intratracheally. 489 although a similar challenge dose of intratracheal mpxv produced viremia in nhps comparable to that seen with the aerosol route of infection, the timing of the first peak was delayed by 5 days in intratracheally exposed macaques compared to aerosol infection, and the amount of virus detected by qpcr was approximately 100-fold lower. this suggests that local replication is more prominent after aerosol delivery than after intratracheal delivery. an intrabronchial route of exposure resulted in pneumonia in nhps. 490 delayed onset of clinical signs and viremia was observed during disease progression. in this model, as in the aerosol and intratracheal infection models, the number of pox lesions was much lower than in the iv infection model. a major downside of the iv, intratracheal, and intrabronchial models is that the initial infection of respiratory tissue and the incubation and prodromal phases are circumvented by direct inoculation of virus into the blood stream or into the lung.
this is an important limitation when these models are used to test vaccines and treatments whose efficacy may depend on protecting the respiratory mucosa and targeting the subsequent early stages of infection, which are not represented in these challenge models. although the aerosol model reproduces the natural route of transmission for human variola virus (varv) infections and a secondary route for human mpxv infections, the lack of pox lesions is its main drawback. therefore, when this model is used to test medical countermeasures, the endpoints and the biomarkers used to initiate treatment should be chosen carefully. hepatitis b is one of the most common infections worldwide, with >400 million people chronically infected and 316,000 cases per year of liver cancer due to infection. 494 the virus can naturally infect both humans and chimpanzees. 495 hepatitis b is transmitted parenterally or perinatally from infected mothers. it can also be transmitted by sexual contact, iv drug use, blood transfusion, and acupuncture. 496 the age at which one is infected dictates the risk of developing chronic disease. 497 acute infection during adulthood is self-limiting and results in flu-like symptoms that can progress to hepatocellular involvement, as observed with the development of jaundice. the clinical symptoms last for a few weeks before resolving. 498 after this acute phase, lifetime immunity is achieved. 499 of those infected, <5% will develop the chronic form of disease. chronicity is the most serious outcome of disease, as it can result in cirrhosis or liver cancer. hepatocellular carcinoma is 100 times more likely to develop in a chronically infected individual than in a noncarrier. 500 the viral determinant for cellular transformation has yet to be identified, although studies involving the woodchuck hepadnavirus suggest that the x protein may be responsible.
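to put the two figures quoted above on a common scale, a back-of-envelope calculation is possible. this python sketch makes a simplifying assumption that is ours, not the source's: that all 316,000 annual liver cancer cases arise among the chronically infected population, taken here as exactly 400 million (the text says >400 million, so this is an upper bound on the rate):

```python
# Illustrative arithmetic from the figures quoted in the text.
# Assumption (ours): all annual liver cancer cases occur among chronic
# carriers, and the carrier population is taken as exactly 400 million.
chronic_carriers = 400_000_000       # ">400 million people chronically infected"
annual_liver_cancer_cases = 316_000  # "316,000 cases per year of liver cancer"

rate_per_100k_carriers = annual_liver_cancer_cases / chronic_carriers * 100_000
print(round(rate_per_100k_carriers, 1))  # 79.0 cases per 100,000 carriers per year
```

under these assumptions the implied incidence is roughly 79 liver cancer cases per 100,000 carriers per year, which helps make concrete why chronicity, not the self-limiting acute phase, is the clinically serious outcome.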
501 many individuals are asymptomatic until complications related to chronic carriage emerge. chimpanzees have a unique strain that circulates within the population. 502, 503 it was found that 3-6% of all wild-caught animals from africa are positive for hepatitis b antigen. 504 natural and experimental challenge with the virus follows the same course as human disease; however, this is only an acute model of disease. 505 to date, the use of chimpanzees provides the only reliable method to ensure that plasma vaccines are free from infectious particles. 506 this animal model has been used to study new therapeutics and vaccines. chimpanzees are especially suited to these studies given that their immune response to infection directly mirrors that of humans. 507 other nhps that have been evaluated are gibbons, orangutans, and rhesus monkeys. although these animals can be infected with hepatitis b, none develop hepatic lesions or liver damage, as assessed by monitoring of liver enzymes. 508 mice are not permissive to infection, and thus numerous transgenic and humanized lines that express hepatitis b proteins have been created to facilitate their use as animal models. these include both immunocompetent and immunosuppressed hosts. the caveat to all of these mouse lines is that they reproduce only the acute form of disease. 495 recently, the entire genome of hepatitis b was transferred to an immunocompetent mouse line via adenovirus, providing a model for persistent infection. 509 hepatitis b can also be studied using surrogate viruses, the naturally occurring mammalian hepadnaviruses. 510 the woodchuck hepatitis virus was found to induce hepatocellular carcinoma. 511 within a population, 65-75% of all neonatal woodchucks are susceptible to chronic infection.
512 a major difference between the two hepatitis viruses is the rate at which they induce cancer: almost all chronic carriers among woodchucks develop hepatocellular carcinoma within 3 years of the initial infection, whereas human infection takes much longer. 513 the acute infection strongly resembles the course of disease in humans: there is a self-limiting acute phase resulting in a transient viremia, with the potential for chronic carriage. 514 challenge with virus in neonates leads to a chronic infection, while adults develop only the acute phase of disease. 515 a species closely related to the woodchuck is marmota himalayana. this animal is also susceptible to the woodchuck hepadnavirus upon iv injection and was found to develop an acute hepatitis with a productive infection. 516 hepatitis d depends on hepatitis b to undergo replication and successful infection in its human host. 517 there are two possible modes of infection between the viruses: coinfection, in which a person is simultaneously infected with both, or superinfection, in which a chronic carrier of hepatitis b is subsequently infected with hepatitis d. 518 coinfection leads to a disease similar to that seen with hepatitis b alone; superinfection, however, can result in chronic hepatitis d infection and severe liver damage. 519 both coinfection and superinfection can be demonstrated in the chimpanzee and the woodchuck by inoculation with human hepatitis d. 520 a recently published report demonstrated the use of a humanized chimeric mouse to study the interactions between the two viruses and for drug testing. 521 the ideal animal model for a human viral disease should closely recapitulate the spectrum of clinical symptoms and pathogenesis observed during the course of human infection. whenever feasible, the model should use the same virus and strain that infects humans.
it is also preferable that the virus be a low-passage clinical isolate; thus, animal passage or adaptation should be avoided if susceptible model species can be identified. ideally, the experimental route of infection would mirror that which occurs in natural disease. to understand the interplay and contribution of the immune system during infection, an immunocompetent animal should be used. the above characteristics cannot always be satisfied, however, and often the virus must be adapted, knockout mice must be used, and/or the disease is not perfectly mimicked in the animal model. well-characterized animal models are critical for licensure to satisfy the food and drug administration's (fda's) "animal rule." this rule applies to situations in which vaccines and therapeutics cannot safely or ethically be tested in humans; thus, licensure will come only after preclinical tests are performed in animal models. many fields in virology are moving toward standardized models that can be used across institutions to test vaccines and therapeutics. a current example of such an effort is within the filovirus community, where animal models, euthanasia criteria, assays, and virus strains are in the process of being standardized. the hope is that these efforts will enable results of efficacy tests on medical countermeasures to be compared across institutions. this chapter has summarized the best models available for each of the viruses described.

fields' virology pathogenesis of poliomyelitis: reappraisal in the light of new data one hundred years of poliovirus pathogenesis expression of the poliovirus receptor in intestinal epithelial cells is not sufficient to permit poliovirus replication in the mouse gut present status of attenuated live-virus poliomyelitis vaccine world health organization. polio global eradication initiative annual report pathogenesis of human poliovirus infection in mice. ii.
age-dependency of paralysis the poliovirus receptor protein is produced both as membrane-bound and secreted forms poliovirus pathogenesis in a new poliovirus receptor transgenic mouse model: age-dependent paralysis and a mucosal route of infection establishment of a poliovirus oral infection system in human poliovirus receptor-expressing transgenic mice that are deficient in alpha/beta interferon receptor macaque models of human infectious disease ulnar nerve inoculation of poliovirus in bonnet monkey: a new primate model to investigate neurovirulence experimental poliomyelitis in bonnet monkey. clinical features, virology and pathology hepatitis a in day-care centers. a community-wide assessment persistence of hepatitis a virus in fulminant hepatitis and after liver transplantation acute liver failure: redefining the syndromes experimental hepatitis a virus (hav) infection in cynomolgus monkeys (macaca fascicularis): evidence of active extrahepatic site of hav replication experimental hepatitis a virus (hav) infection in callithrix jacchus: early detection of hav antigen and viral fate animal models of hepatitis a and e relative infectivity of hepatitis a virus by the oral and intravenous routes in 2 species of nonhuman primates wild malaysian cynomolgus monkeys are exposed to hepatitis a virus histopathological and immunohistochemical studies of hepatitis a virus infection in marmoset callithrix jacchus intragastric infection induced in marmosets (callithrix jacchus) by a brazilian hepatitis a virus (haf-203) experimental hepatitis a virus infection in guinea pigs systematic literature review of role of noroviruses in sporadic gastroenteritis norovirus infection as a cause of diarrhea-associated benign infantile seizures outbreak of necrotizing enterocolitis caused by norovirus in a neonatal intensive care unit progress in understanding norovirus epidemiology natural history of human calicivirus infection: a prospective cohort study foodborne viruses: an emerging
problem pediatric norovirus diarrhea in nicaragua clinical immunity in acute gastroenteritis caused by norwalk agent experimental norovirus infections in nonhuman primates chimpanzees as an animal model for human norovirus infection and vaccine development experimental inoculation of juvenile rhesus macaques with primate enteric caliciviruses molecular characterization of three novel murine noroviruses persistent infection with and serologic cross-reactivity of three novel murine noroviruses gastrointestinal norovirus infection associated with exacerbation of inflammatory bowel disease porcine enteric caliciviruses: genetic and antigenic relatedness to human caliciviruses, diagnosis and epidemiology pathogenesis of a genogroup ii human norovirus in gnotobiotic pigs infection of calves with bovine norovirus giii.1 strain jena virus: an experimental model to study the pathogenesis of norovirus infection an epizootic of equine encephalomyelitis that occurred in massachusetts in 1831 eastern equine encephalomyelitis virus: epidemiology and evolution of mosquito transmission vaccines and animal models for arboviral encephalitides fields virology studies on avian encephalomyelitis. ii.
flock survey for embryo susceptibility to the virus the occurrence in nature of "equine encephalomyelitis" in the ring-necked pheasant eastern equine encephalitis virus infection: electron microscopic studies of mouse central nervous system a comparative study of the pathogenesis of western equine and eastern equine encephalomyelitis viral infections in mice by intracerebral and subcutaneous inoculations influence of age on susceptibility and on immune response of mice to eastern equine encephalomyelitis virus the hamster as an animal model for eastern equine encephalitis and its use in studies of virus entrance into the brain pathogenesis of aerosolized eastern equine encephalitis virus infection in guinea pigs eastern equine encephalitis. distribution of central nervous system lesions in man and rhesus monkey encephalomyelitis in monkeys severe encephalitis in cynomolgus macaques exposed to aerosolized eastern equine encephalitis virus pathology of animal models of alphavirus encephalitis common marmosets (callithrix jacchus) as a nonhuman primate model to assess the virulence of eastern equine encephalitis virus strains arbovirus investigations in argentina historical aspects and description of study sites western encephalitis in illinois horses and ponies medically important arboviruses of the united states and canada the ecology of western equine encephalomyelitis virus in the central valley of california an epizootic attributable to western equine encephalitis virus infection in emus in texas epidemiologic observations on acute infectious encephalitis in california, with special reference to the 1952 outbreak neurologic, intellectual, and psychologic sequelae following western encephalitis.
a follow-up study of 35 cases st. louis encephalitis; preliminary report of a clinical follow-up study in california pathological changes in brain and other target organs of infant and weanling mice after infection with nonneuroadapted western equine encephalitis virus necrotizing myocarditis in mice infected with western equine encephalitis virus: clinical, electrocardiographic, and histopathologic correlations the pathogenesis of western equine encephalitis virus (w.e.e.) in adult hamsters with special reference to the long and short term effects on the c.n.s. of the attenuated clone 15 variant viruses of the bunya- and togaviridae families: potential as bioterrorism agents and means of control aerosol exposure to western equine encephalitis virus causes fever and encephalitis in cynomolgus macaques venezuelan equine encephalitis recovery of venezuelan equine encephalomyelitis virus in panama. a fatal case in man role of dendritic cell targeting in venezuelan equine encephalitis virus pathogenesis mechanism of neuroinvasion of venezuelan equine encephalitis virus in the mouse comparative neurovirulence and tissue tropism of wild-type and attenuated strains of venezuelan equine encephalitis virus administered by aerosol in c3h/hen and balb/c mice c3h/hen mouse model for the evaluation of antiviral agents for the treatment of venezuelan equine encephalitis virus infection treatment of venezuelan equine encephalitis virus infection with (-)-carbodine studies on the virus of venezuelan equine encephalomyelitis.
i modification by cortisone of the response of the central nervous system of macaca mulatta experimental studies of rhesus monkeys infected with epizootic and enzootic subtypes of venezuelan equine encephalitis virus aerosol infection of cynomolgus macaques with enzootic strains of venezuelan equine encephalitis viruses chikungunya outbreaks: the globalization of vectorborne diseases infection with chikungunya virus in italy: an outbreak in a temperate region changing patterns of chikungunya virus: re-emergence of a zoonotic arbovirus chikungunya and the nervous system: what we do and do not know a mouse model for chikungunya: young age and inefficient type-i interferon signaling are risk factors for severe disease an animal model for studying the pathogenesis of chikungunya virus infection chikungunya virus arthritis in adult wild-type mice a mouse model of chikungunya virus-induced musculoskeletal inflammatory disease: evidence of arthritis, tenosynovitis, myositis, and persistence mouse models for chikungunya virus: deciphering immune mechanisms responsible for disease and pathology chikungunya disease in nonhuman primates involves long-term viral persistence in macrophages aedes albopictus in the united states: ten-year presence and public health implications epidemic dengue/dengue hemorrhagic fever as a public health, social and economic problem in the 21st century dengue: an update dengue haemorrhagic fever: diagnosis, treatment, prevention, and control tropism of dengue virus in mice and humans defined by viral nonstructural protein 3-specific immunostaining clinical and laboratory features that differentiate dengue from other febrile illnesses in an endemic area: puerto rico risk factors in dengue shock syndrome animal models of dengue virus infection manifestation of thrombocytopenia in dengue-2-virus-infected mice liver injury and viremia in mice infected with dengue-2 virus early activation of natural killer and b cells in response to primary
dengue virus infection in a/j mice induction of tetravalent protective immunity against four dengue serotypes by the tandem domain iii of the envelope protein essential role of platelet-activating factor receptor in the pathogenesis of dengue virus infection murine model for dengue virus-induced lethal disease with increased vascular permeability lethal antibody enhancement of dengue disease in mice is prevented by fc modification inhibitory potential of neem (azadirachta indica juss) leaves on dengue virus type-2 replication genetic basis of attenuation of dengue virus type 4 small plaque mutants with restricted replication in suckling mice and in scid mice transplanted with human liver cells study of dengue virus infection in scid mice engrafted with human k562 cells dengue virus tropism in humanized mice recapitulates human dengue fever dengue virus infection and immune response in humanized rag2(-/-) gamma(c)(-/-) (rag-hu) mice a model of denv-3 infection that recapitulates severe disease and highlights the importance of ifn-gamma in host resistance to infection mosquito bite delivery of dengue virus enhances immunogenicity and pathogenesis in humanized mice studies on the pathogenesis of dengue infection in monkeys. 3.
sequential distribution of virus in primary and heterologous infections studies on dengue 2 virus infection in cyclophosphamide-treated rhesus monkeys dengue virus-induced hemorrhage in a nonhuman primate model an in-depth analysis of original antigenic sin in dengue virus infection monoclonal antibody-mediated enhancement of dengue virus infection in vitro and in vivo and strategies for prevention the arboviruses: epidemiology and ecology screening of protective antigens of japanese encephalitis virus by dna immunization: a comparative study with conventional viral vaccines immunogenicity, genetic stability, and protective efficacy of a recombinant, chimeric yellow fever-japanese encephalitis virus (chimerivax-je) as a live, attenuated vaccine candidate against japanese encephalitis studies on host factors in inapparent infection with japanese b encephalitis: influence of age, nutrition and luminal induced sleep on the course of infection in mice relation of the peripheral multiplication of japanese b encephalitis virus to the pathogenesis of the infection in mice field's virology free radical generation by neurons in a rat model of japanese encephalitis sequential changes in serum cytokines and chemokines in a rat model of japanese encephalitis experimental infections of monkeys with langat virus. i. comparison of viremia following peripheral inoculation of langat and japanese encephalitis viruses intranasal infection of monkeys with japanese encephalitis virus: clinical response and treatment with a nuclease-resistant derivative of poly (i).poly (c) a neurotropic virus isolated from the blood of a native uganda isolation from human sera in egypt of a virus apparently identical to west nile virus isolation of west nile virus from culex mosquitoes the arboviruses: epidemiology and ecology centers for disease control & prevention. 
outbreak of west nile-like viral encephalitis: new york origin of the west nile virus responsible for an outbreak of encephalitis in the northeastern united states clinical virology the west nile virus outbreak of 1999 in new york: the flushing hospital experience west nile fever: a reemerging mosquito-borne viral disease in europe west nile viral encephalitis experimental infection of cats and dogs with west nile virus experimental infection of horses with west nile virus pathogenesis of west nile virus encephalitis in mice and rats. 1. influence of age and species on mortality and infection study on west nile virus persistence in monkeys experimental infection of pigs with west nile virus persistent west nile virus infection in the golden hamster: studies on its mechanism and possible implications for other flavivirus infections experimental encephalitis following peripheral inoculation of west nile virus in mice of different ages immunogenicity and protective efficacy of a recombinant subunit west nile virus vaccine in rhesus monkeys molecularly engineered live-attenuated chimeric west nile/dengue virus vaccines protect rhesus monkeys from west nile virus tissue tropism and neuroinvasion of west nile virus do not differ for two mouse strains with different survival rates phenotypic changes in langerhans' cells after infection with arboviruses: a role in the immune response to epidermally acquired viral infection? interleukin-1beta but not tumor necrosis factor is involved in west nile virus-induced langerhans cell migration from the skin in c57bl/6 mice profound and prolonged lymphocytopenia with west nile encephalitis innate and adaptive immune responses determine protection against disseminated infection by west nile encephalitis virus spinal cord neuropathology in human west nile virus infection animal models for sars world health organization.
the world health report 2003: shaping the future stability and inactivation of sars coronavirus the severe acute respiratory syndrome clinical manifestations, laboratory findings, and treatment outcomes of sars patients identification of a novel coronavirus in patients with severe acute respiratory syndrome profiles of antibody responses against severe acute respiratory syndrome coronavirus recombinant proteins and their potential use as diagnostic markers the immunobiology of sars viral shedding patterns of coronavirus in patients with probable severe acute respiratory syndrome sars vaccines: where are we? animal models and vaccines for sars-cov infection replication of sars coronavirus administered into the respiratory tract of african green, rhesus and cynomolgus monkeys aged balb/c mice as a model for increased severity of severe acute respiratory syndrome in elderly humans severe acute respiratory syndrome coronavirus spike protein expressed by attenuated vaccinia virus protectively immunizes mice severe acute respiratory syndrome (sars): a year in review koch's postulates fulfilled for sars virus pegylated interferon-alpha protects type 1 pneumocytes against sars coronavirus infection in macaques macaque model for severe acute respiratory syndrome mechanisms of host defense following severe acute respiratory syndrome coronavirus (sars-cov) pulmonary infection of mice resolution of primary severe acute respiratory syndrome-associated coronavirus infection requires stat1 virology: sars virus infection of cats and ferrets civets are equally susceptible to experimental infection by two different severe acute respiratory syndrome coronavirus isolates pneumonitis and multi-organ system disease in common marmosets (callithrix jacchus) infected with the severe acute respiratory syndrome-associated coronavirus severe acute respiratory syndrome vaccine development: experiences of vaccination against avian infectious bronchitis coronavirus immunopathogenesis of
coronavirus infections: implications for sars immunization with sars coronavirus vaccines leads to pulmonary immunopathology on challenge with the sars virus immunization with modified vaccinia virus ankara-based recombinant vaccine against severe acute respiratory syndrome is associated with enhanced hepatitis in ferrets primary severe acute respiratory syndrome coronavirus infection limits replication but not lung inflammation upon homologous rechallenge a mouse-adapted sars-coronavirus causes disease and mortality in balb/c mice rabies re-examined rabies and other lyssavirus diseases overview, prevention, and treatment of rabies studies of rabies street virus in the syrian hamster live attenuated rabies virus co-infected with street rabies virus protects animals against rabies spread and pathogenic characteristics of a g-deficient rabies virus recombinant: an in vitro and in vivo study biological basis of rabies virus neurovirulence in mice: comparative pathogenesis study using the immunoperoxidase technique intracerebral vaccination suppresses the spread of rabies virus in the mouse brain human rabies therapy: lessons learned from experimental studies in mouse models experimental rabies virus infection of p.75 neurotrophin receptor-deficient mice apoptosis in experimental rabies in bax-deficient mice emerging pattern of rabies deaths and increased viral infectivity effective protection of monkeys against death from street virus by post-exposure administration of tissue-culture rabies vaccine a model in mice for the pathogenesis and treatment of rabies a model in mice for the study of the early death phenomenon after vaccination and challenge with rabies virus experimental rabies infection and oral vaccination in vampire bats (desmodus rotundus the distribution of challenge virus standard rabies virus versus skunk street rabies virus in the brains of experimentally infected rabid skunks a compendium of 40 years of epidemiological, clinical, and laboratory 
studies proportion of deaths and clinical features in bundibugyo ebola virus infection clinical aspects of marburg hemorrhagic fever correlates of immunity to filovirus infection the role of the type i interferon response in the resistance of mice to filovirus infection a mouse model for evaluation of prophylaxis and therapy of ebola hemorrhagic fever development of a murine model for aerosolized filovirus infection using a panel of bxd recombinant inbred mice. viruses lethality and pathogenesis of airborne infection with filoviruses in a129 alpha/beta -/- interferon receptor-deficient mice development and characterization of a mouse model for marburg hemorrhagic fever identification of an antioxidant small-molecule with broad-spectrum antiviral activity animal models of highly pathogenic rna viral infections: hemorrhagic fever viruses protective efficacy of a bivalent recombinant vesicular stomatitis virus vaccine in the syrian hamster model of lethal ebola virus infection pathogenesis of filoviruses in small animal models program abstr 5th int symp filoviruses pathogenesis of experimental ebola virus infection in guinea pigs characterization of disease and pathogenesis following airborne exposure of guinea pigs to filoviruses manuscripts in preparation postexposure antibody prophylaxis protects nonhuman primates from filovirus disease aerosol exposure to the angola strain of marburg virus causes lethal viral hemorrhagic fever in cynomolgus macaques a small nonhuman primate model for filovirus-induced disease pathology of experimental ebola virus infection in african green monkeys.
involvement of fibroblastic reticular cells pathogenesis of marburg hemorrhagic fever in cynomolgus macaques a characterization of aerosolized sudan ebolavirus infection in african green monkeys, cynomolgus macaques, and rhesus macaques recent progress in henipavirus research: molecular biology, genetic diversity, animal models transmission of human infection with nipah virus recurrent zoonotic transmission of nipah virus into humans nipah virus outbreak with person-to-person transmission in a district of bangladesh henipavirus susceptibility to environmental variables hendra virus infection in a veterinarian human hendra virus encephalitis associated with equine outbreak clinical features of nipah virus encephalitis among pig farmers in malaysia the emergence of nipah virus, a highly pathogenic paramyxovirus nipah virus encephalitis acute hendra virus infection: analysis of the pathogenesis and passive antibody protection in the hamster model clinical outcome of henipavirus infection in hamsters is determined by the route and dose of infection a golden hamster model for human acute nipah virus infection histopathologic and immunohistochemical characterization of nipah virus infection in the guinea pig a guinea-pig model of hendra virus encephalitis the lesions of experimental equine morbillivirus disease in cats and guinea pigs a neutralizing human monoclonal antibody protects against lethal disease in a new ferret model of acute nipah virus infection a recombinant hendra virus g glycoproteinbased subunit vaccine protects ferrets from lethal hendra virus challenge vertical transmission and fetal replication of nipah virus in an experimentally infected cat susceptibility of cats to equine morbillivirus experimental infection of squirrel monkeys with nipah virus development of an acute and highly pathogenic nonhuman primate model of nipah virus infection a novel model of lethal hendra virus infection in african green monkeys and the effectiveness of ribavirin 
treatment experimental nipah virus infection in pteropid bats (pteropus poliocephalus) bacterial infections in pigs experimentally infected with nipah virus experimental inoculation study indicates swine as a potential host for hendra virus experimental nipah virus infection in pigs and cats invasion of the central nervous system in a porcine host by nipah virus experimental infection of horses with hendra virus/australia/horse/2008/redlands transmission studies of hendra virus (equine morbillivirus) in fruit bats, horses and cats a novel morbillivirus pneumonia of horses and its transmission to humans a morbillivirus that caused fatal disease in horses and humans equine morbillivirus pneumonia: susceptibility of laboratory animals to the virus feline model of acute nipah virus infection and protection with a soluble glycoprotein-based subunit vaccine distribution of viral antigens and development of lesions in chicken embryos inoculated with nipah virus global burden of acute lower respiratory infections due to respiratory syncytial virus in young children: a systematic review and meta-analysis economic impact of respiratory syncytial virus-related illness in the us: an analysis of national databases the relationship of meteorological conditions to the epidemic activity of respiratory syncytial virus viral and host factors in human respiratory syncytial virus pathogenesis evidence of a causal role of winter virus infection during infancy in early childhood asthma respiratory syncytial virus infection in elderly and high-risk adults respiratory syncytial virus american academy of pediatrics subcommittee on diagnosis & management of bronchiolitis.
diagnosis and management of bronchiolitis respiratory syncytial virus disease in infants despite prior administration of antigenic inactivated vaccine respiratory syncytial virus induces pneumonia, cytokine response, airway obstruction, and chronic inflammatory infiltrates associated with long-term airway hyperresponsiveness in mice genetic susceptibility to respiratory syncytial virus infection in inbred mice differential pathogenesis of respiratory syncytial virus clinical isolates in balb/c mice primary respiratory syncytial virus infection in mice the use of a neonatal mouse model to study respiratory syncytial virus infections respiratory syncytial virus affects pulmonary function in balb/c mice the cotton rat model of respiratory viral infections diversifying animal models: the use of hispid cotton rats (sigmodon hispidus) in infectious diseases the antiviral activity of sp-303, a natural polyphenolic polymer, against respiratory syncytial and parainfluenza type 3 viruses in cotton rats pulmonary lesions in primary respiratory syncytial virus infection, reinfection, and vaccine-enhanced disease in the cotton rat bovine respiratory syncytial virus protects cotton rats against human respiratory syncytial virus infection age at first viral infection determines the pattern of t cell-mediated disease during reinfection in adulthood the enhancement or prevention of airway hyperresponsiveness during reinfection with respiratory syncytial virus is critically dependent on the age at first infection and il-13 production [comparative study research support immunomodulation with il-4r alpha antisense oligonucleotide prevents respiratory syncytial virusmediated pulmonary disease chinchilla and murine models of upper respiratory tract infections with respiratory syncytial virus experimental respiratory syncytial virus infection of four species of primates respiratory syncytial virus infects the bonnet monkey serum neutralizing antibody titers of seropositive chimpanzees 
immunized with vaccines coformulated with natural fusion and attachment proteins of respiratory syncytial virus reduced clearance of respiratory syncytial virus infection in a preterm lamb model human respiratory syncytial virus a2 strain replicates and induces innate immune responses by respiratory epithelia of neonatal lambs respiratory syncytial virus is associated with an inflammatory response in lungs and architectural remodeling of lung-draining lymph nodes of newborn lambs structure as revealed by airway dissection. a comparison of mammalian lungs biomedical applications of sheep models: from asthma to vaccines the pathology of influenza virus infections mortality due to influenza in the united statesdan annualized regression approach using multiple-cause mortality data animal models for the study of influenza pathogenesis and therapy serious morbidity and mortality associated with influenza epidemics centers for disease control & prevention. bacterial coinfections in lung tissue specimens from fatal cases of 2009 pandemic influenza a (h1n1)dunited states human and avian influenza viruses target different cell types in cultures of human airway epithelium animal models head-to-head comparison of four nonadjuvanted inactivated cell culture-derived influenza vaccines: effect of composition, spatial organization and immunization route on the immunogenicity in a murine challenge model [comparative study interferon-induced mx protein: a mediator of cellular resistance to influenza virus in vitro and in vivo assay systems for study of influenza virus inhibitors utilization of pulse oximetry for the study of the inhibitory effects of antiviral agents on influenza virus in mice the contribution of animal models to the understanding of the host range and virulence of influenza a viruses biological heterogeneity, including systemic replication in mice, of h5n1 influenza a virus isolates from humans in hong kong distinct pathogenesis of hong kong-origin h5n1 viruses in 
mice compared to that of other highly pathogenic h5 avian influenza viruses effect of oral gavage treatment with znal42 and other metallo-ion formulations on influenza a h5n1 and h1n1 virus infections in mice inactivated and live, attenuated influenza vaccines protect mice against influenza: streptococcus pyogenes super-infections [comparative study research support the use of an animal model to study transmission of influenza virus infection strong local and systemic protective immunity induced in the ferret model by an intranasal virosome-formulated influenza subunit vaccine local innate immune responses and influenza virus transmission and virulence in ferrets regional t-and b-cell responses in influenza-infected ferrets human and avian influenza viruses target different cells in the lower respiratory tract of humans and other mammals live, attenuated influenza virus (laiv) vehicles are strong inducers of immunity toward influenza b virus [comparative study the ferret: an animal model to study influenza virus pathogenesis of avian influenza a (h5n1) viruses in ferrets severe seasonal influenza in ferrets correlates with reduced interferon and increased il-6 induction neuropathology of h5n1 virus infection in ferrets evaluation of three strains of influenza a virus in humans and in owl, cebus, and squirrel monkeys a computationally optimized hemagglutinin virus-like particle vaccine elicits broadly reactive antibodies that protect nonhuman primates from h5n1 infection evaluation of intravenous zanamivir against experimental influenza a (h5n1) virus infection in cynomolgus macaques aberrant innate immune response in lethal infection of macaques with the 1918 influenza virus pathology of human influenza a (h5n1) virus infection in cynomolgus macaques (macaca fascicularis integrated molecular signature of disease: analysis of influenza virus-infected macaques through functional genomics and proteomics integration of clinical data, pathology, and cdna microarrays in 
influenza virus-infected pigtailed macaques (macaca nemestrina kinetic profile of influenza virus infection in three rat strains the guinea pig as a transmission model for human influenza viruses influenza-induced tachypnea is prevented in immune cotton rats, but cannot be treated with an anti-inflammatory steroid or a neuraminidase inhibitor the antiviral potential of interferon-induced cotton rat mx proteins against orthomyxovirus (influenza), rhabdovirus, and bunyavirus the cotton rat as a model to study influenza pathogenesis and immunity the cotton rat provides a useful small-animal model for the study of influenza virus pathogenesis co-infection of the cotton rat (sigmodon hispidus) with staphylococcus aureus and influenza a virus results in synergistic disease pathogenicity of a highly pathogenic avian influenza virus, a/chicken/yamaguchi/7/04 (h5n1) in different species of birds and mammals domestic pigs have low susceptibility to h5n1 highly pathogenic avian influenza viruses [comparative study research support animal models in influenza vaccine testing rift valley fever: an uninvited zoonosis in the arabian peninsula prevalence of anti-rift-valley-fever igm antibody in abattoir workers in the nile delta during the 1993 outbreak in egypt rift valley fever the occurrence of human cases in johannesburg the pathogenesis of rift valley fever epidemic rift valley fever in egypt: observations of the spectrum of human illness crc handbook series in zoonoses rift valley fever virus in mice. i. 
general features of the infection the pathogenesis of rift valley fever virus in the mouse model the susceptibility of rats to rift valley fever in relation to age inbred rat strains mimic the disparate human response to rift valley fever virus infection the gerbil, meriones unguiculatus, a model for rift valley fever viral encephalitis experimental rift valley fever in rhesus macaques development of a novel nonhuman primate model for rift valley fever crimean-congo hemorrhagic fever: a global perspective crimean-congo hemorrhagic fever the clinical pathology of crimean-congo hemorrhagic fever experimental congo virus (ib-an7620) infection in primates crimean congo hemorrhagic fever: a global perspective viremia and antibody response of small african and laboratory animals to crimean-congo hemorrhagic fever virus infection a comparative study of the crimean hemorrhagic faverdcongo group of viruses [comparative study ribavirin efficacy in an in vivo model of crimean-congo hemorrhagic fever virus (cchf) infection pathogenesis and immune response of crimean-congo hemorrhagic fever virus in a stat-1 knockout mouse model crimean-congo hemorrhagic fever virus infection is lethal for adult type i interferon receptorknockout mice possibility of extracting hyperimmune gammaglobulin against chf from donkey blood sera study of susceptibility to crimean hemorrhagic fever (chf) virus in european and long-eared hedgehogs. tezisy konf vop med virus field's virology epidemiological studies of hemorrhagic fever with renal syndrome: analysis of risk factors and mode of transmission a short review hantavirus pulmonary syndrome. pathogenesis of an emerging infectious disease field's virology centers for disease control & prevention. 
outbreak of acute illness-southwestern united states genetic diversity, distribution, and serological features of hantavirus infection in five countries in south america first reported cases of hantavirus pulmonary syndrome in canada [case reports an unusual hantavirus outbreak in southern argentina: person-to-person transmission? hantavirus pulmonary syndrome study group for patagonia hantavirus pulmonary syndrome: the new american hemorrhagic fever the incubation period of hantavirus pulmonary syndrome cardiopulmonary manifestations of hantavirus pulmonary syndrome hantavirus pulmonary syndrome a lethal disease model for hantavirus pulmonary syndrome pathogenesis of hantaan virus infection in suckling mice: clinical, virologic, and serologic observations infection of hantaan virus strain aa57 leading to pulmonary disease in laboratory mice hantaan virus infection causes an acute neurological disease that is fatal in adult laboratory mice functional role of type i and type ii interferons in antiviral defense andes virus dna vaccine elicits a broadly cross-reactive neutralizing antibody response in nonhuman primates andes virus infection of cynomolgus macaques wild-type puumala hantavirus infection induces cytokines, c-reactive protein, creatinine, and nitric oxide in cynomolgus macaques pathology of puumala hantavirus infection in macaques viral haemorrhagic fevers caused by lassa, ebola, and marburg viruses new opportunities for field research on the pathogenesis and treatment of lassa fever imported lassa fever lassa fever. 
effective therapy with ribavirin lassa virus hepatitis: a study of fatal lassa fever in humans lassa virus infection of rhesus monkeys: pathogenesis and treatment with ribavirin experimental inhalation infection of monkeys of the macacus cynomolgus and macacus rhesus species with the virus of lymphocytic choriomeningitis (we) arenavirus-mediated liver pathology: acute lymphocytic choriomeningitis virus infection of rhesus macaques is characterized by high-level interleukin-6 expression and hepatocyte proliferation experimental studies of arenaviral hemorrhagic fevers lcmv-mediated hepatitis in rhesus macaques: we but not arm strain activates hepatocytes and induces liver regeneration mucosal arenavirus infection of primates can protect them from lethal hemorrhagic fever pathogenesis of lassa fever in cynomolgus macaques lassa fever encephalopathy: clinical and laboratory findings lassa fever encephalopathy: lassa virus in cerebrospinal fluid but not in serum lassa virus infection in experimentally infected marmosets: liver pathology and immunophenotypic alterations in target tissues virus taxonomy, viiith report of the international committee on taxonomy of viruses pathogenesis of lassa virus infection in guinea pigs the effect of an arenavirus infection on liver morphology and function cardiovascular and pulmonary responses to pichinde virus infection in strain 13 guinea pigs pathogenesis of pichinde virus infection in strain 13 guinea pigs: an immunocytochemical, virologic, and clinical chemistry study pichinde virus infection in strain 13 guinea pigs reduces intestinal protein reflection coefficient with compensation clinical laboratory, virologic, and pathologic changes in hamsters experimentally infected with pirital virus (arenaviridae): a rodent model of lassa fever variation between strains of hamsters in the lethality of pichinde virus infections interferon alfacon-1 protects hamsters from lethal pichinde virus infection treatment of lethal pichinde virus 
infections in weanling lvg/lak hamsters with ribavirin, ribamidine, selenazofurin, and ampligen [comparative study research support global mortality associated with rotavirus disease among children in minimal infective dose of rotavirus rotavirus immunoglobulin levels among indian mothers of two socio-economic groups and occurrence of rotavirus infections among their infants up to six months analyses of clinical, pathological and virological features of human rotavirus strain, yo induced gastroenteritis in infant balb/c mice epizootic diarrhea of infant mice: identification of the etiologic agent immunity to rotavirus infection in mice development of an adult mouse model for studies on protection against rotavirus a gastrointestinal rotavirus infection mouse model for immune modulation studies protection of the villus epithelial cells of the small intestine from rotavirus infection does not require immunoglobulin a rotavirus viremia and extraintestinal viral infection in the neonatal rat model [comparative study research support characterization of clinical and immune response in a rotavirus diarrhea model in suckling lewis rats development of a heterologous model in germfree suckling rats for studies of rotavirus diarrhea studies of oral rehydration solutions in animal models induction of mucosal immune responses and protection against enteric viruses: rotavirus infection of gnotobiotic pigs as a model developmental immunity in the piglet swine in biomedical research neonatal calf diarrhea induced by rotavirus characterisation of the primary local and systemic immune response in gnotobiotic lambs against rotavirus infection experimental infection of non-human primates with a human rotavirus isolate development of a rotavirus-shedding model in rhesus macaques, using a homologous wild-type rotavirus of a new p genotype reflections on 30 years of aids hivs and their replication the utility of the new generation of humanized mice to study hiv-1 infection: 
transmission, prevention, pathogenesis, and treatment antiretroviral pre-exposure prophylaxis prevents vaginal transmission of hiv-1 in humanized blt mice hematopoietic stem cell-engrafted nod/ scid/il2rgamma null mice develop human lymphoid systems and induce long-lasting hiv-1 infection with specific humoral immune responses hiv-1 infection and cd4 t cell depletion in the humanized rag2ã�/ã� gamma cã�/ã� (rag-hu) mouse model hiv-1 infection and pathogenesis in a novel humanized mouse model induction of robust cellular and humoral virusspecific adaptive immune responses in human immunodeficiency virus-infected humanized blt mice an aptamer-sirna chimera suppresses hiv-1 viral loads and protects from helper cd4(ã¾) t cell decline in humanized mice mucosal immunity and vaccines low-dose rectal inoculation of rhesus macaques by sivsme660 or sivmac251 recapitulates human mucosal infection by hiv-1 propagation and dissemination of infection after vaginal transmission of simian immunodeficiency virus limited dissemination of pathogenic siv after vaginal challenge of rhesus monkeys immunized with a live virulence and reduced fitness of simian immunodeficiency virus with the m184v mutation in reverse transcriptase siv-induced impairment of neurovascular repair: a potential role for vegf therapeutic dna vaccine induces broad t cell responses in the gut and sustained protection from viral rebound and aids in siv-infected rhesus macaques a nonfucosylated variant of the anti-hiv-1 monoclonal antibody b12 has enhanced fcgammariiiamediated antiviral activity in vitro but does not improve protection against mucosal shiv challenge in macaques a trivalent recombinant ad5 gag/pol/nef vaccine fails to protect rhesus macaques from infection or control virus replication after a limiting-dose heterologous siv challenge animal model for the therapy of acquired immunodeficiency syndrome with reverse transcriptase inhibitors susceptibility of hiv-2, siv and shiv to various anti-hiv-1 
compounds: implications for treatment and postexposure prophylaxis use of a small molecule ccr5 inhibitor in macaques to treat simian immunodeficiency virus infection or prevent simian-human immunodeficiency virus infection shiv-1157i and passaged progeny viruses encoding r5 hiv-1 clade c env cause aids in rhesus monkeys update on animal models for hiv research limited or no protection by weakly or nonneutralizing antibodies against vaginal shiv challenge of macaques compared with a strongly neutralizing antibody macaque studies of vaccine and microbicide combinations for preventing hiv-1 sexual transmission vpx is critical for sivmne infection of pigtail macaques impact of short-term haart initiated during the chronic stage or shortly post-exposure on siv infection of male genital organs the rhesus macaque pediatric siv infection modeld a valuable tool in understanding infant hiv-1 pathogenesis and for designing pediatric hiv-1 prevention strategies perinatal transmission of shiv-sf162p3 in macaca nemestrina immune and genetic correlates of vaccine protection against mucosal infection by siv in monkeys chronic administration of tenofovir to rhesus macaques from infancy through adulthood and pregnancy: summary of pharmacokinetics and biological and virological effects efficacy assessment of a cell-mediated immunity hiv-1 vaccine (the step study): a double-blind, randomised, placebo-controlled, test-of-concept trial human papillomavirus in cervical cancer human papillomavirus research: do we still need animal models? 
animal models of papillomavirus pathogenesis evidence of human papillomavirus vaccine effectiveness in reducing genital warts: an analysis of california public family planning administrative claims data the rabbit viral skin papillomas and carcinomas: a model for the immunogenetics of hpv-associated carcinogenesis protection of beagle dogs from mucosal challenge with canine oral papillomavirus by immunization with recombinant adenoviruses expressing codon-optimized early genes naturally occurring, nonregressing canine oral papillomavirus infection: host immunity, virus characterization, and experimental infection regression of canine oral papillomas is associated with infiltration of cd4ã¾ and cd8ã¾ lymphocytes characterization and experimental transmission of an oncogenic papillomavirus in female macaques a multimeric l2 vaccine for prevention of animal papillomavirus infections preclinical development of highly effective and safe dna vaccines directed against hpv 16 e6 and e7 us doctors investigate more than 50 possible cases of monkeypox isolation of monkeypox virus from wild squirrel infected in nature reemergence of monkeypox: prevalence, diagnostics, and countermeasures human monkeypox infection: a family cluster in the midwestern united states human monkeypox and other poxvirus infections of man the confirmation and maintenance of smallpox eradication human monkeypox identification of wild-derived inbred mouse strains highly susceptible to monkeypox virus infection for use as small animal models a prairie dog animal model of systemic orthopoxvirus disease using west african and congo basin strains of monkeypox virus comparison of monkeypox viruses pathogenesis in mice by in vivo imaging comparative pathology of north american and central african strains of monkeypox virus in a ground squirrel model of the disease experimental infection of an african dormouse (graphiurus kelleni) with monkeypox virus a mouse model of lethal infection for evaluating 
prophylactics and therapeutics against monkeypox virus experimental infection of ground squirrels (spermophilus tridecemlineatus) with monkeypox virus experimental infection of prairie dogs with monkeypox virus experimental infection of cynomolgus macaques (macaca fascicularis) with aerosolized monkeypox virus the pathology of experimental aerosolized monkeypox virus infection in cynomolgus monkeys (macaca fascicularis) immunogenicity of a highly attenuated mva smallpox vaccine and protection against monkeypox smallpox vaccine does not protect macaques with aids from a lethal monkeypox virus challenge smallpox vaccine-induced antibodies are necessary and sufficient for protection against monkeypox virus virulence and pathophysiology of the congo basin and west african strains of monkeypox virus in non-human primates a novel respiratory model of infection with monkeypox virus in cynomolgus macaques antiviral treatment is more effective than smallpox vaccination upon lethal monkeypox virus infection comparative analysis of monkeypox virus infection of cynomolgus macaques by the intravenous or intrabronchial inoculation route establishment of the black-tailed prairie dog (cynomys ludovicianus) as a novel animal model for comparing smallpox vaccines administered preexposure in both high-and low-dose monkeypox virus challenges effective antiviral treatment of systemic orthopoxvirus disease: st-246 treatment of prairie dogs infected with monkeypox virus clinical characteristics of human monkeypox, and risk factors for severe disease hepatitis b virus infection cell culture and animal models of viral hepatitis. part i: hepatitis b risks of chronicity following acute hepatitis b virus infection: a review hepatitis b virus infectiondnatural history and clinical consequences clinical aspects of hepatitis b virus infection hepatitis b virus. 
the major etiology of hepatocellular carcinoma trans-activation of viral enhancers by the hepatitis b virus x protein identification of hepatitis b virus indigenous to chimpanzees detection of hepatitis b virus infection in wild-born chimpanzees (pan troglodytes verus): phylogenetic relationships with human and other primate genotypes antibody to hepatitis-associated antigen. frequency and pattern of response as detected by radioimmunoprecipitation hepatitis and blood transfusion perspectives on hepatitis b studies with chimpanzees hla a2 restricted cytotoxic t lymphocyte responses to multiple hepatitis b surface antigen epitopes during hepatitis b virus infection primates in the study of hepatitis viruses transfer of hbv genomes using low doses of adenovirus vectors leads to persistent infection in immune competent mice asymmetric replication of duck hepatitis b virus dna in liver cells: free minus-strand dna a virus similar to human hepatitis b virus associated with hepatitis and hepatoma in woodchucks effects of age and viral determinants on chronicity as an outcome of experimental woodchuck hepatitis virus infection hepadnavirusinduced liver cancer in woodchucks animal models of hepadnavirus-associated hepatocellular carcinoma hepatitis b viruses and hepatocellular carcinoma hepatitis b virus replication in primary macaque hepatocytes: crossing the species barrier toward a new small primate model animal models of hepatitis delta virus infection and disease experimental hepatitis delta virus infection in the chimpanzee expression of the hepatitis delta virus large and small antigens in transgenic mice experimental hepatitis delta virus infection in the animal model humanized chimeric upa mouse model for the study of hepatitis b and d virus interactions and preclinical drug evaluation key: cord-346136-sqc09x9c authors: hamilton, kyra; smith, stephanie r.; keech, jacob j.; moyers, susette a.; hagger, martin s. 
title: application of the health action process approach to social distancing behavior during covid‐19 date: 2020-10-02 journal: appl psychol health well being doi: 10.1111/aphw.12231 sha: doc_id: 346136 cord_uid: sqc09x9c background: this study examined the social cognition determinants of social distancing behavior during the covid‐19 pandemic in samples from australia and the us, guided by the health action process approach (hapa). methods: participants (australia: n = 495, 50.1% women; us: n = 701, 48.9% women) completed measures of the hapa social cognition constructs at an initial time‐point (t1) and, one week later (t2), self‐reported their social distancing behavior. results: single‐indicator structural equation models that excluded and included past behavior exhibited adequate fit with the data. intention and action control were significant predictors of social distancing behavior in both samples, and intention predicted action and coping planning in the us sample. self‐efficacy and action control were significant predictors of intention in both samples, with attitudes predicting intention in the australia sample and risk perceptions predicting intention in the us sample. significant indirect effects of social cognition constructs through intentions were observed. inclusion of past behavior attenuated model effects. multigroup analysis revealed no differences in model fit across samples, suggesting that observed variations in the parameter estimates were relatively trivial. conclusion: results indicate that social distancing is a function of motivational and volitional processes. this knowledge can be used to inform messaging regarding social distancing during covid‐19 and in future pandemics.

the covid-19 pandemic has had unprecedented global effects on mortality, way of life, national economies, and physical and mental health not previously experienced in modern times.
it has presented governments, healthcare services, and education facilities with wide-scale and complex logistical challenges on how to manage the rapid spread of the disease and minimise the projected human and economic costs. given that, to date, there is no vaccine to protect against covid-19, non-pharmacological intervention is the only currently available means to reduce the spread of sars-cov-2, the virus that causes covid-19, and "flatten the curve" of infection rates. in response, national and statewide governmental measures aimed at minimising transmission of the virus, including "stay at home" orders, closure of businesses and places of congregation, and travel restrictions, have had a substantive impact on mortality rates (worldometer, 2020). as rates of infection dissipate in some countries, particularly in countries like australia that have relatively low rates of daily infections, governments are now beginning to ease restrictions. however, preventive behaviors aimed at reducing infection rates remain highly pertinent given concerns over the potential for infection rates to rise again and fears of a "second wave". furthermore, some countries that are easing lockdown measures, such as some states in the us, still have high localised rates of infection, highlighting the imperative of ongoing performance of preventive behaviors to manage infection transmission. based on world health organization (who) recommendations (world health organization, 2020) and previous research on behaviors known to reduce virus transmission (jefferson et al., 2011; rabie & curtis, 2006; smith et al., 2015), two key sets of covid-19-related behaviors that may apply to the population as a whole have been proposed. the first set is "personal protective behaviors" that are aimed at the individual in order to protect themselves or others (e.g. washing hands frequently, practicing respiratory hygiene).
the second set involves behaviors aimed at ensuring physical distance between people (e.g. social distancing, stay at home orders). despite knowledge of these key behaviors in the prevention of virus transmission (e.g. islam et al., 2020), there is a relative dearth of information on the determinants and mechanisms of action that underpin these preventive behaviors and how to strengthen individuals' capacity to adopt them. in the absence of direct evidence, knowledge to inform practice guidelines that governments and organisations can use to mobilise individuals into performing covid-19 preventive behaviors has been gleaned from applying general principles from behavioral science and the models of behavior that underpin them (lunn et al., 2020; michie et al., 2020; british psychological society, 2020; west et al., 2020), as well as from findings of previous empirical investigations in the psychological literature on similar health and risk behaviors (e.g. face mask use, handwashing, distancing; chu et al., 2020; reyes fernández et al., 2016; zhang, chung, et al., 2019; zhang et al., 2020). although this approach is potentially useful in structuring thinking and recommendations in urgent times, there is a pressing need for direct evidence that identifies the key determinants of these covid-19 preventive behaviors and the processes involved. this knowledge can then be used to inform the development of effective interventions to promote uptake of, and adherence to, these behaviors. this is especially important given that individuals' beliefs may affect their adoption of non-pharmacological measures to prevent virus transmission (teasdale et al., 2014). prominent among social cognition theories are dual-phase models, which aim to provide a comprehensive theoretical account of health behavior uptake and participation, and the processes involved (hagger, cameron, hamilton, hankonen, & lintunen, 2020; hagger, smith, keech, moyers, & hamilton, 2020).
one such dual-phase theory that has been frequently applied to predict multiple health behaviors is the health action process approach (hapa; schwarzer, 2008; schwarzer & hamilton, 2020). the hapa is an integrated model that combines features of stage, continuum, and dual-phase social cognition models. a key feature of the model is the distinction made between motivational (where an individual is in a deliberative mindset while setting a goal/forming an intention) and volitional (where an individual is in an implementation mindset while pursuing their goal) phases involved in behavioral enactment. in the motivational phase, intentions are conceptualised as the most important determinant of behavior. intentions are proposed to be a function of three sets of belief-based constructs: outcome expectancies (beliefs that the target behavior will lead to outcomes that have utility for the individual, conceptually identical to an individual's attitudes toward the behavior), self-efficacy (beliefs in personal capacity to successfully perform the target behavior and overcome challenges and barriers to its performance), and risk perceptions (beliefs in the severity of a health condition that may arise from not performing the target behavior and personal vulnerability toward it). in the volitional phase of the hapa, planning and action control strategies are important self-regulatory strategies that determine subsequent enactment of the target behavior (schwarzer, 2008; schwarzer & hamilton, 2020). two forms of planning are proposed: action planning, a task-facilitating strategy that relates to how individuals prepare themselves in performing a behavior, and coping planning, a distraction-inhibiting strategy that relates to how individuals prepare themselves in avoiding foreseen barriers and obstacles that may arise when performing a specific behavior, and potentially competing behaviors that may derail the behavior.
in addition, action control, a self-regulatory strategy for promoting behavioral maintenance through the monitoring and evaluation of a behavior against a desired behavioral standard, is also an important direct determinant of behavior (hamilton et al., 2018; reyes fernández et al., 2016). behavioral intention operates as a "bridge" between the motivational and volitional phases, while planning serves to link intentions with behavior. previous research has provided support for the hapa constructs in predicting health preventive behaviors, with prominent roles for outcome expectancies, forms of self-efficacy, planning, and action control, with risk perceptions only relevant in certain contexts (see schwarzer & hamilton, 2020). furthermore, the model has been used as a basis for effective behavior change interventions aimed at promoting increased participation in health-related behaviors (schwarzer & hamilton, 2020). given that social distancing is a key evidence-based behavior that will minimise transmission of sars-cov-2 if performed consistently at the population level, the aim of the present study was to apply the hapa to identify the social cognition and self-regulatory determinants of this preventive behavior in samples of adults from two countries, australia and the us. these two countries provided an opportunity to examine the determinants of social distancing because they experienced rapid increases in covid-19 cases at relatively similar times during the pandemic and introduced public health advice as well as "lockdown" measures and "shelter-in-place" orders to minimise transmission, including social distancing. specifically, the current research aimed to identify potentially modifiable determinants that are reliably related to social distancing intentions and behavior, which may form targets of behavioral interventions to reduce infection rates for covid-19 and, going forward, other communicable diseases transmitted through person-to-person contact.
the value of applying the hapa is that it provides information on phase-relevant constructs in determining this important behavior. proposed predictions among model constructs are summarised diagrammatically in figures 1 and 2 . figure 1 presents the hapa predictions excluding effects of past social distancing behavior. intention to perform social distancing was expected to be predicted by attitude (as a proxy for outcome expectancies), self-efficacy, and risk perceptions, and social distancing behavior was expected to be predicted by self-efficacy, intentions, action planning, coping planning, and action control. intention was proposed to mediate effects of attitude, self-efficacy, and risk perceptions on behavior. in addition, intention was expected to predict action planning and coping planning such that the planning constructs mediate the intention-behavior relationship. action control was proposed to predict behavior directly. although it is strictly a self-regulation technique aimed at facilitating better behavioral enactment, as proposed by the original hapa (e.g. schwarzer, 2008) , individuals who are effective at action control (i.e. self-monitoring) may also more likely form strong intentions. action control implies not only the recall of behavior but also the recall of intentions. self-monitoring of the concurrent behavior, therefore, may make the individual aware of their intention as well as their behavior, focusing on possible discrepancies between the two. it is plausible, then, that action control can be specified as a predictor of both intention and behavior. the coexistence of intention and action control within the same dataset allows this key question to be tested; which of the two factors may be more proximal to the behavioral outcome? action control might not be a time-specific variable, and individuals may self-monitor their behaviors at any point in time (see zhou et al., 2015) , even before goal setting. 
actions can be monitored before forming intentions, while doing so, or afterwards. thus, examining the indirect (via intention) and direct effects of action control on behavior is intuitively meaningful, although not supported by the original hapa, and was tested in the present study. figure 2 outlines the inclusion of past behavior in the model to test its sufficiency. although model effects were expected to hold with the inclusion of past behavior, past behavior was expected to attenuate the size of the proposed effects, consistent with previous studies (brown et al., 2020; hagger et al., 2018). this was expected to be the case in the current study due to the relatively brief one-week follow-up. the attenuating effect of past behavior was proposed to capture prior decision making and the effects of other unmeasured constructs on behavior. a sample of australian (n = 495, 50.1% women) and us (n = 701, 48.9% women) residents were recruited via an online research panel company. to be eligible for inclusion, participants needed to be aged 18 years or older and not subject to formal quarantine for covid-19. in addition to the inclusion criteria, participants were screened on the demographic characteristics of age, gender, and geographical region, and quotas were imposed to ensure that the sample comprised similar proportions of these characteristics to the national population of each country. sample characteristics are presented in table s1. data were collected in april and may 2020, during which time residents throughout australia and all states in the us were subject to "stay at home" orders to reduce transmission of the coronavirus.
the study adopted a prospective correlational design with self-report measures of hapa constructs (attitudes, self-efficacy, risk perceptions, intentions, action planning, coping planning, and action control) and past engagement in social distancing behavior completed at an initial time-point (t1) in a survey administered using the qualtrics online survey tool. participants were informed that they were participating in a survey on their social distancing behavior and were provided with an information sheet outlining study requirements. they were also provided with a consent form to which they had to affirm before proceeding with the survey, and with instructions on how to complete the study measures. in addition, they were provided with a definition of the target behavior: "the following survey will ask about your beliefs and attitudes about 'social distancing'. what do we mean by social distancing? social distancing (also known as 'physical distancing') is deliberately increasing the physical space between people to avoid spreading illness. the world health organization and other world leading health authorities suggest that you should maintain at least a 1-2 m (3-6 feet) distance from other people to lessen the chances of getting infected with covid-19. when answering the questions in this survey, think about your social distancing behavior (i.e. maintaining at least a 1-2 m (3-6 feet) distance from other people)." one week later (t2), participants were contacted a second time by the panel company and asked to self-report their social distancing behavior over the previous week using the same behavioral measure administered at t1. participants received a fixed sum of money for their participation based on expected completion time, consistent with the panel company's published rates.
approval for study procedures was granted prior to data collection from the griffith university human research ethics committee. study measures were carried out on multi-item psychometric instruments developed using published guidelines and adapted for use with the target behavior in the current study (schwarzer, 2008) . participants provided their responses on scales with 7-point response options. complete study measures are provided in table s2 . social cognition constructs. measures of attitudes, self-efficacy, risk perceptions, intentions, action planning, coping planning, and action control from the hapa were developed according to guidelines (schwarzer, 2007) . attitude was measured using three semantic differential items in response to a common stem: "my maintaining social distancing in the next week would be...", followed by a series of bi-polar adjectives (e.g. (1) worthless -(7) valuable). self-efficacy was measured using four items (e.g. "i am confident that i could maintain social distancing", scored (1) strongly disagree to (7) strongly agree). risk perception was measured using two items (e.g. "it would be risky for me to not maintain social distancing", scored (1) strongly disagree to (7) strongly agree). intention was measured using three items (e.g. "i intend to maintain social distancing", scored (1) strongly disagree to (7) strongly agree). action planning was measured using four items. participants were required to respond to the stem: "in the next week, i have made a plan regarding...", followed by the four items of the scale (e.g. ". . .when to maintain social distancing") on likert scales ranging from strongly disagree (1) to strongly agree (7). coping planning was measured using four items. participants were required to respond to the stem: "to keep my intention to maintain social distancing in the next week in difficult situations, i have made a plan...", followed by the four items of the scale (e.g. ". . 
.what to do if something interferes with my goal of maintaining social distancing") on likert scales ranging from strongly disagree (1) to strongly agree (7). action control was measured using three items (e.g. "i have consistently monitored when, how often, and how to maintain social distancing"), scored (1) strongly disagree to (7) strongly agree. past behavior and behavior. participants self-reported their participation in the target behavior, maintaining social distancing in relation to others to minimise transmission of the coronavirus that causes covid-19. the measure comprised two items prompting participants to report their frequency of social distancing behavior in the previous week: "in the past week, how often did you maintain social distancing?", scored (1) never to (7) always, and "in the past week, i maintained social distancing", scored (1) false to (7) true. demographic variables. participants self-reported their age in years, gender, employment status (currently unemployed/full-time caregiver, currently full-time employed, part-time employed, on leave without pay/furloughed), marital status (married, widowed, separated/divorced, never married, in a de facto relationship), annual household income stratified by 11 income levels based on australian and us national averages, and highest level of formal education (completed junior/lower/primary school, completed senior/high/secondary school, post-school vocational qualification/diploma, further education diploma, undergraduate university degree, postgraduate university degree). binary income (low income versus middle/high income), highest education level (completed school education only versus completed post-school education), and ethnicity (white/caucasian versus non-white) variables were computed for use in subsequent analyses.
hypothesised relations among hapa constructs in the proposed model were tested in the australia and us samples separately using single-indicator structural equation models implemented in the lavaan package in r (r core team, 2020; rosseel, 2012). we opted for single-indicator models over a full latent variable structural equation model due to the complexity of the model and the large number of parameters. the single-indicator approach utilises scale reliabilities to provide an estimate of the measurement error of each variable in the model. specifically, each variable in the model was indicated by its averaged composite, with the error variance fixed at a value based on the reliability estimate using the formula: (1 − reliability) × scale variance. simulation studies have demonstrated that parameter estimates and model fit of single-indicator models compare very favorably with those of full latent variable structural equation models, particularly when sample sizes are small (savalei, 2019). we freed parameters between the single-indicator latent variables according to our proposed model. two models were estimated, one excluding effects of past social distancing behavior (model 1, figure 1) and one which controlled for past behavior (model 2, figure 2) by freeing parameter estimates from past behavior on each construct in the model. we also controlled for effects of the following demographic variables in each model by freeing paths from each variable to all other model variables: gender, age, ethnicity, income, and education level. our cut-off for low versus medium-to-high income was based on national income data for citizens on low incomes in the us (for a family of four, the low-income threshold is us$25,465 per year; semega et al., 2020) and australia (for a family of four, the low-income average is $562 per week; aihw, 2020); participants reporting incomes of $400-$599 per week ($20,800-$31,199 per year) or below were classified as low income.
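the error-variance computation described above can be sketched as follows; a minimal python illustration, where the reliability and scale-variance values are invented for the example (the study's actual estimates appear in table s4):

```python
# Sketch of the single-indicator error-variance computation: each construct is
# represented by its averaged composite, with the indicator's error variance
# fixed at (1 - reliability) * scale variance. values below are illustrative.

def fixed_error_variance(reliability: float, scale_variance: float) -> float:
    """Error variance to fix for a single-indicator latent variable."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be in [0, 1]")
    return (1.0 - reliability) * scale_variance

# hypothetical composite with omega = .90 and observed variance 1.44
print(round(fixed_error_variance(0.90, 1.44), 3))  # 0.144
```

a perfectly reliable scale (reliability = 1) yields an error variance of zero, so the composite is treated as an error-free indicator of its latent variable.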
missing data were handled using the full information maximum likelihood (fiml) method. the fiml approach is a preferred approach to handling missing data as simulation studies indicate that it leads to unbiased parameter estimates in structural equation modeling (enders & bandalos, 2001; wothke, 1998) . model comparisons across the australia and us samples were conducted using multigroup analyses. an initial configural multisample model for the model excluding past behavior was estimated (model 3), which provided evidence for the tenability of the model in accounting for the data across both samples. this was followed by a restricted model in which the parameter estimates representing proposed relations among the hapa constructs and behavior were constrained to equality across the two samples (model 4). the fit of the constrained model did not differ significantly from the configural model across the two samples, which provided evidence that model parameters did not differ substantially. this was established using a formal likelihood ratio test of the goodness-of-fit chi-square for the configural and constrained models (byrne et al., 1989) . we also examined differences in the cfi; differences of less than .01 between values for the configural and constrained models have also been proposed as indicative of invariance of parameters (cheung & rensvold, 2002) . the configural (model 5) and constrained (model 6) multisample analyses were repeated for the model including past behavior. models were implemented using the maximum likelihood estimator with bootstrapped standard errors with 1,000 bootstrap replications. 
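the nested-model comparisons described above can be sketched as follows; a minimal illustration combining the chi-square difference computation with the cheung & rensvold (2002) rule that a cfi drop of less than .01 indicates invariant parameters. all fit values below are invented for the example, not the study's statistics:

```python
# Hedged sketch of the multigroup invariance comparison: the constrained
# model's fit is compared with the configural model's via the chi-square
# difference (likelihood ratio) statistic and the delta-CFI criterion.

def chi_square_difference(chisq_constrained, df_constrained,
                          chisq_configural, df_configural):
    """Likelihood-ratio statistic and degrees of freedom for nested models."""
    return (chisq_constrained - chisq_configural,
            df_constrained - df_configural)

def cfi_invariant(cfi_configural, cfi_constrained, threshold=0.01):
    """True when constraining parameters worsens the CFI by less than .01."""
    return (cfi_configural - cfi_constrained) < threshold

# hypothetical fit statistics for the configural and constrained models
d_chisq, d_df = chi_square_difference(512.4, 250, 498.1, 240)
print(d_df, cfi_invariant(0.962, 0.958))  # 10 True
```

the chi-square difference is referred to a chi-square distribution with d_df degrees of freedom for the formal test; the delta-cfi rule serves as the practical-significance check alongside it.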
goodness of fit of the models with the data was evaluated using multiple criteria comparing the proposed model with the baseline model, including the goodness-of-fit chi-square (χ²), the comparative fit index (cfi), the standardised root mean square residual (srmr), and the root mean square error of approximation (rmsea) with its 90% confidence interval (90% ci). since the chi-square value is often statistically significant in complex models and has been shown to lead to the rejection of adequate models, we focused on the incremental fit indices. specifically, values for the cfi should exceed 0.95, values for the srmr should be less than or equal to 0.08, and values for the rmsea should be below 0.05 with a narrow 90% confidence interval (hu & bentler, 1999). data files, analysis scripts, and output are available online: https://osf.io/mrzex/. attrition across the two data collection occasions resulted in final sample sizes of 365 (m age = 49.78, sd = 16.89; 50.1% women; attrition rate = 26.27%) and 440 (m age = 51.77, sd = 16.26; 46.6% women; attrition rate = 37.23%) participants retained at follow-up in the australia and us samples, respectively. there were no missing data for the social cognition and behavior variables as participants could not advance through the survey without providing a response. there were a few instances of missing data for the demographic variables, ranging from 0.5 per cent to 8.8 per cent in the australia sample and 0.9 per cent to 6.4 per cent in the us sample, as participants could opt not to respond to these items because they represented personal data. missing data are reported in table s3. sample characteristics at follow-up are presented in table s4, and comparisons on study variables between those retained in the study at follow-up and those lost to attrition are presented in table s3. attrition analyses in the australia sample revealed that participants lost to attrition were younger and more likely to be non-white.
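the hu & bentler (1999) cutoffs listed above can be expressed as a simple screening rule; a sketch with illustrative index values rather than the study's actual fit statistics:

```python
# Minimal sketch of the fit-index screening described above, using the
# cutoffs cited in the text: CFI > .95, SRMR <= .08, RMSEA < .05.

def adequate_fit(cfi: float, srmr: float, rmsea: float) -> bool:
    """True when all three incremental/absolute fit criteria are met."""
    return cfi > 0.95 and srmr <= 0.08 and rmsea < 0.05

print(adequate_fit(cfi=0.97, srmr=0.04, rmsea=0.03))  # True
print(adequate_fit(cfi=0.92, srmr=0.04, rmsea=0.03))  # False (CFI too low)
```

in practice the indices are weighed jointly with the rmsea confidence interval rather than applied as a strict pass/fail gate, but the rule captures the thresholds the text reports.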
however, there were no differences in the proportions of gender, income, and education level. a manova with the social cognition constructs and past behavior as dependent variables and attrition status (lost to attrition vs. included at follow-up) revealed no differences (wilks' lambda = 0.973, f(8) = 1.60, p = .115, partial η² = 0.026). attrition analyses in the us sample also indicated that participants lost to attrition were younger, and more likely to be men, non-white, lower educated, and on low incomes, than those remaining in the study at follow-up. the manova testing for differences on social cognition and past behavior variables among participants lost to attrition and those included at follow-up revealed statistically significant differences (wilks' lambda = 0.957, f(8) = 3.90, p < .001, partial η² = 0.043). follow-up tests revealed that mean values for past behavior, attitudes, intentions, and self-monitoring with respect to social distancing were significantly lower among participants lost to attrition compared to those retained at follow-up. however, effect sizes for these differences were small (ds < 0.23). (the total effect is computed as the sum of the indirect effects of the independent variable on the dependent variable through all model variables, plus the direct effect.) descriptive statistics for study variables are presented in table s4. participants reported high levels of intention (australia sample, m = 6.54, sd = 0.66; us sample, m = 6.39, sd = 0.85) and behavior (australia sample, m = 6.10, sd = 0.67; us sample, m = 6.40, sd = 0.97) with respect to social distancing. internal consistency of the social cognition constructs was estimated using revelle's (2018) omega, and internal consistency of the behavior variables and risk perception was estimated using the spearman-brown formula, as these measures comprised two items each. results are presented in table s4.
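the spearman-brown estimate used for the two-item scales works by stepping up the inter-item correlation; a minimal sketch, with the correlation value invented for illustration (the study's reliability estimates are in table s4):

```python
# Sketch of the Spearman-Brown (step-up) reliability for a two-item scale:
# reliability = 2r / (1 + r), where r is the correlation between the items.
# the correlation below is hypothetical, not a value from the study.

def spearman_brown_two_items(r: float) -> float:
    """Reliability of a two-item composite from its inter-item correlation."""
    return (2.0 * r) / (1.0 + r)

print(round(spearman_brown_two_items(0.6), 3))  # 0.75
```

this is the standard choice for two-item measures because coefficient alpha and omega are poorly defined or unstable with only two indicators.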
all constructs in both samples exhibited acceptable internal consistency, and these data were used to estimate measurement error in subsequent single-indicator structural equation models. scale variances, descriptive statistics, and computed error variance terms used in the structural equation models are also presented in table s4. correlations among the model constructs, behavior, and socio-demographic variables are presented in table s5. the single-indicator structural equation models that excluded (model 1) and included (model 2) past behavior exhibited adequate model fit with the data for both the australia and us samples (see table s6). standardised parameter estimates and distribution statistics for each model in the australia and us samples are presented in tables 1 and 2, respectively. focusing first on the models excluding past behavior, intention and action control were statistically significant predictors of social distancing behavior in both samples, with no significant effects for self-efficacy, action planning, and coping planning. there were also no significant effects of intention on action planning or coping planning in the australia sample, while intention predicted both planning constructs in the us sample. self-efficacy and action control were significant predictors of intention in both samples, with attitudes predicting intention in the australia sample only and risk perceptions predicting intention in the us sample, although the effect in the australia sample fell short of statistical significance by a trivial margin (p = .077). there were significant indirect effects of self-efficacy on behavior mediated by intention in both samples, and significant indirect effects of risk perceptions and action control on behavior mediated by intention in the us sample only. intention and action control had significant total effects on behavior in both samples, with a further total effect of self-efficacy in the us sample.
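the indirect and total effects reported above follow the standard path-analytic decomposition: an indirect effect is the product of its component paths, and the total effect adds the direct effect. a sketch with hypothetical standardised coefficients, not the study's estimates (which appear in tables 1 and 2):

```python
# Sketch of the effect decomposition: indirect effect = product of the paths
# along a mediated route (e.g. self-efficacy -> intention -> behavior);
# total effect = direct effect + sum of indirect effects.
# all coefficients below are invented for illustration.

def indirect_effect(*paths: float) -> float:
    """Product of the path coefficients along one mediated route."""
    out = 1.0
    for p in paths:
        out *= p
    return out

# hypothetical paths: self-efficacy -> intention = .40,
# intention -> behavior = .35, direct self-efficacy -> behavior = .05
ind = indirect_effect(0.40, 0.35)
total = 0.05 + ind
print(round(ind, 3), round(total, 3))  # 0.14 0.19
```

in the reported analyses these products and sums were estimated in lavaan with bootstrapped standard errors, so each indirect and total effect carries its own confidence interval.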
inclusion of past behavior led to an attenuation of model effects, consistent with previous research (brown et al., 2020; hagger et al., 2018). notably, effects of all hapa constructs on behavior were reduced to a trivial size and were not statistically significant (full parameter estimates for the models in the australia and us samples are provided in tables s7 and s8, respectively). effects of constructs on intentions remained with the same pattern as those in the model excluding past behavior for both samples, albeit with smaller effect sizes. the only exception was the action control-intention effect, which was reduced to a trivial size and non-significance in the us sample. past behavior predicted all model constructs with medium-to-large effect sizes in both samples. comparisons of model fit across the australia and us samples revealed adequate fit of the configural models excluding (model 3) and including (model 5) past behavior, lending support for the tenability of the proposed pattern of model effects across the samples (table s6). constraining regression coefficients to be invariant for the models excluding (model 4) and including (model 6) past behavior resulted in no significant change in model fit according to the goodness-of-fit chi-square and the cfi, with differences in the cfi across models less than .01 (table s6). these findings suggest that any observed differences in the parameter estimates of the models across the australia and us samples were relatively trivial, consistent with the highly similar pattern of effects in each sample and only minor sample-specific variation. the empirical literature has highlighted the imperative of non-pharmacological interventions in reducing the transmission of communicable viruses and preventing infection (jefferson et al., 2011; rabie & curtis, 2006; smith et al., 2015).
in the context of the covid-19 pandemic, participation in behaviors that prevent virus transmission is essential given the absence of a vaccine or clinically proven pharmacological therapy. sustained, population-level participation in such behaviors is important not only to reduce infections in the current pandemic phase, but also in the phases of easing restrictions, to avoid a potential "second wave" of infections. there is a pressing need for evidence of potentially modifiable determinants of covid-19 preventive behaviors, such as social distancing, on which to base interventions promoting population-level participation in these behaviors. (the social distancing behavior and past behavior variables were associated with large skewness and kurtosis values. to check whether this affected findings, we re-estimated our structural equation models using a square root transformation of these variables. the reanalysis revealed virtually identical coefficients and the exact pattern of effects found for the analysis using the untransformed behavior variables. analysis scripts and output for this auxiliary analysis are available online: https://osf.io/mrzex/?view_only=3ae43e6fa81c48c6880e65d068f5435b) the current study aimed to address this need by identifying the theory-based social cognition determinants of social distancing behavior, and the processes involved, in samples from australia and the us. the study adopted a correlational prospective survey design guided by the hapa. consistent with hapa predictions, intention and action control were identified as significant direct predictors of social distancing behavior in both samples, while intention predicted action planning and coping planning in the us sample. further, self-efficacy and action control were identified as significant predictors of intention in both samples. attitudes and risk perceptions were additional predictors in the australia and us samples, respectively.
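the square-root robustness check mentioned above can be sketched as follows. the ratings below are invented to mimic a ceiling-skewed 1-7 behavior measure, and the reflect-then-transform step is one common way to apply a square-root transformation to left-skewed data (the study reports the transformation but not the exact procedure):

```python
# Sketch of a skewness check before and after a square-root transformation.
# data values are illustrative only, not taken from the study.
import math

def skewness(xs):
    """Population skewness: mean cubed standardised deviation."""
    n = len(xs)
    m = sum(xs) / n
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / n)
    return sum(((x - m) / s) ** 3 for x in xs) / n

scores = [7, 7, 7, 7, 6, 6, 6, 5, 3, 1]            # ceiling-skewed 1-7 ratings
reflected = [max(scores) + 1 - x for x in scores]  # reflect so the tail is right
transformed = [math.sqrt(x) for x in reflected]

# the transformation pulls in the long tail, shrinking skew in magnitude
print(abs(skewness(transformed)) < abs(skewness(scores)))  # True
```

as the footnote notes, re-estimating the models on the transformed variables left the coefficients and the pattern of effects essentially unchanged, suggesting the skew did not drive the findings.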
significant indirect effects were also observed; self-efficacy predicted behavior mediated by intention in both samples, and risk perceptions and action control were found to predict behavior mediated by intentions in the us sample only. despite these limited differences, it should be noted that comparisons of the models across the australia and us samples suggested that observed differences in parameter estimates across the samples were relatively trivial. findings are consistent with the auxiliary assumption promulgated in the hapa, and social cognition theories more generally, that the effects of the belief-based constructs reflect generalised processes that have a consistent pattern of effects across contexts, populations, and behaviors. in sum, the current findings indicate that individuals' social distancing behavior is a function of both motivational and volitional processes, and this provides formative data on potential targets for behavioral interventions aimed at promoting participation in this preventive behavior. results of this study provide qualified support for the application of the hapa, with its focus on constructs that represent dual phases of action. findings demonstrate a prominent role for self-efficacy as the key determinant of intentions, and intentions as the key determinant of behavior across both samples. these findings are in line with applications of the hapa in multiple health behavioral contexts , as well as research on social cognition constructs more broadly (hamilton, van dongen, & hagger, 2020; mceachan et al., 2011) . confidence in engaging in health behaviors and capacity to overcome setbacks and barriers have been consistently linked with future behavioral performance (warner & french, 2020) . 
the pervasive effect of intention on behavior is also aligned with a substantive literature on social cognition theories demonstrating intentions as the pre-eminent determinant of behavior (hamilton, van dongen, et al., 2020; mceachan et al., 2011) . overall, these effects suggest that social distancing behavior should be conceptualised as a reasoned action. however, the current study also demonstrated a prominent role for constructs representing volitional processes in the enactment of behavior. in particular, action control, a construct reflecting individuals' application of key self-regulatory skills to enact behavior, was a consistent predictor of both intentions and behavior across the samples. individuals possessing these skills are not only more likely to form intentions to perform social distancing behaviors, but are also more likely to engage in the behavior through, for example, an automatic process. specifically, the direct effect not mediated by intentions suggests that individuals with good action control might be more effective in structuring their environment or forming habits that promote enactment of social distancing without the need for extensive deliberation or weighing up of options. over time, these individuals are likely to form habits, that is, performance of behaviors that are activated through cues and contexts independent of the goals and intentions that originally gave rise to them (aarts et al., 1998; hagger, 2019; verplanken & orbell, 2003; wood, 2017) . research has suggested that individuals possessing these skills are effective in controlling their actions more broadly, but also that such skills can be acquired or learned (gardner, 2015; gardner et al., 2020) , which provides a potential avenue for intervention: training people to be more effective in regulating their own actions. interestingly, current research shows that risk perceptions have small effects on intentions and subsequent behavior. 
risk perceptions had small but significant effects in the us sample, and a small effect that fell short of statistical significance in the australia sample. this pattern of effects is consistent with applications of the hapa and other social cognition models, such as protection motivation theory, which have found relatively modest or null effects of risk perceptions on intentions and behavior (zhang, chung, et al., 2019). in the context of covid-19 prevention and social distancing behavior, it is common knowledge that the infection will not have serious consequences for the majority of the population, and is likely only to be serious for those with underlying conditions or impaired immunity, or for the elderly. as a consequence, perceived risk may not be a major influence on decisions to act. instead, it seems that self-efficacy and action control are more pervasive and consistent determinants of behavior, and these may be more pertinent targets for intervention. action and coping planning were expected to mediate intention-behavior effects in the current model, such that planning is an important part of the process of intention enactment for social distancing. however, findings indicated that neither form of planning mediated intention effects on behavior, contrary to hapa predictions. these findings are not, however, unique: previous research has demonstrated considerable variability in the role of planning in intention enactment, and effect sizes are often small (rhodes et al., 2020). taken together, it seems that volitional processes such as action control are far more pervasive in promoting social distancing intentions and behavior. introduction of past behavior in the current model had a marked influence on the size of model effects, rendering the effects of almost all model constructs on intentions and behavior trivial and not statistically significant.
one interpretation of these findings is that the current model is not sufficient in accounting for social distancing over time. however, it was not unexpected that past behavior would have pervasive effects on subsequent behavior over such short range prediction and, given the high stability of social distancing behavior, it is unsurprising that it accounts for model effects over time. it must also be stressed that past behavior alone is not a construct and does not, therefore, offer any information other than on the stability of social distancing behavior (ouellette & wood, 1998) . some have proposed that past behavior is indicative of habitual influences on behavior, but research examining habit as a construct suggests that it is more than performing a behavior frequently, and that the quality of the behavioral experience, such as experiencing it as automatic or without explicit thought, better characterises habitual processes (aarts et al., 1998; hagger, 2019; verplanken & orbell, 2003) . nevertheless, the residual effect of past behavior may provide some indication of unmeasured constructs on subsequent behavior, particularly those that bypass effects of intentions and are more likely rooted in non-conscious processes that lead to behavior, such as implicit attitudes or motives. research applying social cognition models like the hapa provides useful guidance for the development of future behavioral interventions aimed at promoting social distancing behaviors. although participants' intentions toward, and actual participation in, social distancing behavior were relatively high, scores and variability estimates suggested that some participants were reporting lapses in their social distancing behavior. such lapses present considerable risks to coronavirus transmission, particularly in areas of high prevalence where the likelihood of contact with infected persons is substantially elevated. 
our research provides some indication of the constructs that should be targeted for change and also the types of behavior change techniques that make up the content of interventions (hagger, cameron, et al, 2020; hagger, smith, et al., 2020; kok et al., 2016) . based on current findings, strategies to promote self-efficacy should be foremost in potential targets of interventions to promote intentions and behavior. interventions that have manipulated mastery experience (i.e. practicing a behavior) and vicarious experience (i.e. observing a model performing the behavior) have been shown to be successful in strengthening self-efficacy, as have interventions that provide feedback on past or others' performance (warner & french, 2020) . tailoring of these strategies could also be considered and targeted at uptake of the behavior for those that have not already adopted the behavior (e.g. demonstration of appropriate social distance when in line to purchase goods) or at maintenance of the behavior (e.g. developing a rule of thumb on keeping an appropriate social distance every time when in line to purchase goods). action control was another key determinant of intentions and behavior. this suggests that it is important that individuals acquire monitoring and self-regulatory strategies with respect to their social distancing behavior. for example, action control involves consistent monitoring as to whether an individual follows through on their intentions for the target behavior (schwarzer & hamilton, 2020) . monitoring helps identify discrepancies in behavior (e.g. not being at an appropriate social distance when in line to purchase goods), and noting a discrepancy can trigger taking additional action to ensure goals are achieved (e.g. adjusting the distance) or for disengaging from the goal (e.g. abandoning the goods and leaving the shop) (webb & de bruin, 2020) . in order to promote better action control, interventions may prompt self-monitoring (e.g. 
through self-observation of social distancing behavior) or monitoring by others (e.g. a shop attendant prompts an individual to increase their social distance). given that constructs such as attitudes and risk perceptions were not strong, consistent determinants of social distancing behavior, strategies targeting change in these constructs may not be at the forefront of behavioral interventions to promote social distancing. however, context-specific interventions that target change in attitudes for individuals in australia, and in risk perceptions, particularly for individuals in the us, may assist in promoting stronger intentions. strategies aimed at promoting attitude change and increased risk perceptions usually involve information provision (e.g. providing information about health consequences, highlighting the pros over the cons of social distancing) and communication-persuasion (e.g. using credible sources to deliver messages, using framing/reframing methods) about the importance of maintaining social distancing (hamilton & johnson, 2020). however, reviews suggest that such strategies relate more to short-term change than to sustained, longer-term impact on behavior (jepson et al., 2010). another approach could be the use of fear appeals, which seek to arouse negative emotional reactions in order to promote self-protective motivation and action (kok et al., 2016). however, caution is needed when using fear appeals to attempt to change behavior: excessively heightened fear may be counter-productive in motivating individuals to engage in preventive behaviors (kok et al., 2018; lin, 2020), and may even backfire by prompting responses aimed at mitigating the fear rather than the threat, such as avoidance or denial, neither of which manages the risk itself (hagger et al., 2017; leventhal et al., 1998).
there is evidence that messages that highlight risk but also provide coping information to increase self-efficacy (kok et al., 2018), and that use positive prosocial language (heffner et al., 2020), may be effective because they are more readily accepted and prevent defensive and avoidant reactions. however, current evidence suggests that interventions targeting change in attitudes and risk perception are unlikely, on their own, to be enough to promote social distancing. the present research has a number of strengths, including its focus on social distancing, a key preventive behavior aimed at reducing transmission of sars-cov-2 to prevent covid-19 infections; adoption of a fit-for-purpose theoretical model, the hapa, that provides a set of a priori predictions on the motivational and volitional determinants of the target behavior; recruitment of samples from two countries, australia and the us, with key demographic characteristics that closely match those of the respective populations; and the use of a prospective study design and structural equation modelling techniques. a number of limitations to the current data should also be noted. the substantive attrition at follow-up in both samples is an important limitation. non-trivial attrition could result in selection bias. for example, participants who are more motivated or engaged may be overrepresented in the sample. in the current study, participants were provided with multiple reminders to complete measures at follow-up, but more intensive recruitment and incentivisation of non-responders may have further reduced attrition rates. it should also be noted that participant drop-out affected the demographic profile of the samples, particularly among underrepresented groups. this is particularly relevant to the current context given data indicating that covid-19 infection and mortality rates are higher in underrepresented minority and socioeconomic groups (cdc, 2020).
a potential solution, which should be considered in future research, would be to oversample underrepresented groups in which attrition rates are likely to be high. furthermore, our recruitment strategy was focused on producing samples with characteristics that corresponded with those of the national population on gender and state. however, the samples were not stratified by other salient demographic or socioeconomic variables, so the current samples cannot be characterised as representative of the australian or us populations. taking these biases into account, the current findings should not be considered directly generalisable to the broader population. in addition, the current study adopted a prospective design, which provided a basis for the temporal ordering of constructs in the model. however, the correlational design of the current study means that inferences of causality are based on theory rather than the data. furthermore, the current design did not permit modeling of the stability of, or change in, model constructs over time. the latter represents an important caveat when utilising the current data as a basis for intervention. future research should aim to adopt cross-lagged panel designs that model change in constructs over time, and utilise intervention or experimental designs that target change in model constructs and observe their effects on behavior. also, the study was conducted over a one-week period. although this is a relatively brief follow-up period, it was considered appropriate given the high speed of virus transmission and the need for prompt adoption of social distancing in the population to prevent widespread infection.
the current results, however, do not confirm the extent to which model constructs predict social distancing over a longer period, and long-term follow-up would be necessary to support the application of the hapa in accounting for maintenance of social distancing; this is especially important as lockdown restrictions ease and populations seek to prevent a "second wave" of infection. the present study also relied exclusively on self-report measures, which may introduce additional error variance through recall bias and socially desirable responding. future studies may consider verification of behavioral data with non-self-report data, such as gps mapping of mobile phones or direct observation of rates of social distancing behavior in particular contexts (e.g. workplaces, grocery stores). it might also be useful for future studies to investigate the role of social factors, as suggested in the hapa, on social distancing behavior. this is particularly important given the considerable potential for "social" influences to affect individuals' behavior in minimising person-to-person contact with others outside the individual's immediate household. precedent for these effects comes from previous research which has found that pressure from important others and moral obligation toward others predict adherence to covid-19 preventive behaviors, including social distancing (hagger, cameron, et al., 2020; hagger, smith, et al., 2020; lin et al., 2020). finally, this research was conducted during a period when it is likely that participants were already engaging in social distancing and, thus, already had substantive experience with the behavior, as indicated by the high scale mean scores for past behavior (m = 6.5 on a 7-point scale) in both samples.
this likely explains the substantive effect of past behavior in attenuating model effects, and points to the need for longitudinal designs, or methods such as ecological momentary assessment, that capture moment-by-moment changes in behavior over time. given the urgent need for populations to adopt covid-19 preventive behaviors, such as social distancing, the present study applied the hapa to predict key motivational and volitional determinants of social distancing behavior in samples from two different countries, australia and the us. overall, the current findings provide qualified support for some of the core proposed effects among the motivational and volitional factors in the model, as well as their effects on individuals' social distancing behavior. the current study fills a knowledge gap in the literature on the social psychological processes that guide social distancing behavior in the unprecedented context of a pandemic and suggests that the motivational and volitional constructs of self-efficacy, intention, and action control, in particular, may have utility in explaining this important covid-19 preventive behavior. despite the correlational design, the current findings suggest multiple potential routes to behavioral performance that can serve as a basis for the development of interventions and enable further testing of the effects of the techniques on both behavior change and the targeted theory constructs.
additional supporting information may be found online in the supporting information section at the end of the article.
predicting behavior from actions in the past: repeated decision making or a matter of habit
australia's children. canberra: australian institute of health and welfare
the mediating role of reasoned-action and automatic processes from past-to-future behavior across three health behaviors
testing for the equivalence of factor covariance and means structures: the issue of partial measurement invariance
covid-19 in racial and ethnic minority groups
evaluating goodness-of-fit indexes for testing measurement invariance
physical distancing, face masks, and eye protection to prevent person-to-person transmission of sars-cov-2 and covid-19: a systematic review and meta-analysis
the relative performance of full information maximum likelihood estimation for missing data in structural equation models
a review and analysis of the use of "habit" in understanding, predicting and influencing health-related behaviour
habit interventions
habit and physical activity: theoretical advances, practical implications, and agenda for future research
the reasoned action approach applied to health behavior: role of past behavior and tests of some key moderators using meta-analytic structural equation modeling
handbook of behavior change
the common sense model of self-regulation: meta-analysis and test of a process model
predicting social distancing behavior during the covid-19 pandemic: an integrated social cognition model
parental supervision for their children's toothbrushing: mediating effects of planning, self-efficacy, and action control
attitude and persuasive communication interventions
an extended theory of planned behavior for parent-for-child health behaviors: a meta-analysis
emotional responses to prosocial messages increase willingness to self-isolate during the covid-19 pandemic
cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives
physical distancing interventions and incidence of coronavirus disease 2019: natural experiment in 149 countries
physical interventions to interrupt or reduce the spread of respiratory viruses
the effectiveness of interventions to change six health behaviours: a review of reviews
a taxonomy of behaviour change methods: an intervention mapping approach
ignoring theory and misinterpreting evidence: the false belief in fear appeals
self-regulation, health, and behavior: a perceptual-cognitive approach
social reaction toward the 2019 novel coronavirus (covid-19)
using an integrated social cognition model to predict covid-19 preventive behaviors
motivating social distancing during the covid-19 pandemic: an online experiment
prospective prediction of health-related behaviours with the theory of planned behaviour: a meta-analysis
reducing sars-cov-2 transmission in the uk: a behavioural science approach to identifying options for increasing adherence to social distancing and shielding vulnerable people
habit and intention in everyday life: the multiple processes by which past behavior predicts future behavior
r: a language and environment for statistical computing
handwashing and risk of respiratory infections: a quantitative systematic review
psych: procedures for psychological
social-cognitive antecedents of hand washing: action control bridges the planning-behaviour gap
planning and implementation intention interventions
lavaan: an r package for structural equation modeling
a comparison of several approaches for controlling measurement error in small samples
the health action process approach (hapa): assessment tools
modeling health behavior change: how to predict and modify the adoption and maintenance of health behaviors
changing behaviour using the health action process approach
income and poverty in the united states
use of non-pharmaceutical interventions to reduce the transmission of influenza in adults: a systematic review
public perceptions of non-pharmaceutical interventions for reducing transmission of respiratory infection: systematic review and synthesis of qualitative studies
behavioural science and disease prevention: psychological guidance
reflections on past behavior: a self-report index of habit strength
confidence and self-efficacy interventions
monitoring interventions
applying principles of behaviour change to reduce sars-cov-2 transmission
habit in personality and social psychology
coronavirus disease (covid-19) advice for the public
worldometer covid-19 coronavirus pandemic
modeling longitudinal and multiple group data: practical issues, applied approaches and specific examples
health beliefs of wearing facemasks for influenza a/h1n1 prevention: a qualitative investigation of hong kong older adults
predicting hand washing and sleep hygiene behaviors among college students: test of an integrated social-cognition model
a meta-analysis of the health action process approach
the role of action control and action planning on fruit and vegetable consumption
martin s. hagger's contribution was supported by a finland distinguished professor (fidipro) award (dnro 1801/31/2105) from business finland. data files and analysis scripts are available online from the open science framework project for this study: https://osf.io/mrzex/?view_only=3ae43e6fa81c48c6880e65d068f5435b
key: cord-335689-8a704p38 authors: martin, gerardo; yanez-arenas, carlos; chen, carla; plowright, raina k.; webb, rebecca j.; skerratt, lee f. title: climate change could increase the geographic extent of hendra virus spillover risk date: 2018-03-19 journal: ecohealth doi: 10.1007/s10393-018-1322-9 sha: doc_id: 335689 cord_uid: 335689-8a704p38
disease risk mapping is important for predicting and mitigating impacts of bat-borne viruses, including hendra virus (paramyxoviridae:henipavirus), which can spill over to domestic animals and thence to humans. we produced two models to estimate areas at potential risk of hev spillover explained by the climatic suitability for its flying fox reservoir hosts, pteropus alecto and p. conspicillatus.
we included additional climatic variables that might affect spillover risk through other biological processes (such as bat or horse behaviour, plant phenology and bat foraging habitat). models were fit with a poisson point process model and a log-gaussian cox process. in response to climate change, risk expanded southwards due to an expansion of p. alecto suitable habitat, which increased the number of horses at risk by 175–260% (110,000–165,000). in the northern limits of the current distribution, spillover risk was highly uncertain because of model extrapolation to novel climatic conditions. the extent of areas at risk of spillover from p. conspicillatus was predicted to shrink. due to a likely expansion of p. alecto into these areas, it could replace p. conspicillatus as the main hev reservoir. we recommend: (1) monitoring hev in bats, (2) enhancing hev prevention in horses in areas predicted to be at risk, and (3) investigating and developing mitigation strategies for areas that could experience reservoir host replacements. electronic supplementary material: the online version of this article (10.1007/s10393-018-1322-9) contains supplementary material, which is available to authorized users. emerging zoonotic diseases account for close to 13% of known human pathogens (taylor et al. 2001; woolhouse and gowtage-sequeria 2005). these diseases, along with other emerging pathogens that affect crops and domestic animals, can have extensive socio-economic consequences (jones et al. 2008), especially when they adapt to and transmit among their new hosts (taylor et al. 2001). four diseases that have spilled over from bats to humans and have resulted in epidemics are ebola and marburg viruses in africa (leroy et al. 2005) and severe acute respiratory syndrome (sars) coronavirus and nipah virus in east asia (chua 2003; he et al. 2004; leroy et al. 2005; li et al. 2005; wang and eaton 2007).
some of these outbreaks have had long-term devastating consequences, from the loss of thousands of human lives to the collapse of already imperilled public health systems that prevent, control and treat other diseases (chang et al. 2004; plucisnki et al. 2015). hendra virus (hev, paramyxoviridae:henipavirus) is another bat-borne virus that spills over into domestic animals, in its case horses, and then people, with high case fatality rates of 50-75% (halpin et al. 2011; smith et al. 2014; edson et al. 2015; martin et al. 2016). it was discovered in 1994 in a brisbane suburb in queensland, australia, with two of the four australian mainland flying fox species, pteropus alecto and p. conspicillatus, as its major reservoir hosts (halpin et al. 2011; smith et al. 2014; edson et al. 2015; martin et al. 2016), although antibodies against hev are commonly found in p. scapulatus and p. poliocephalus (young et al. 1996; plowright et al. 2008). hev is closely related to nipah virus, which is also a henipavirus from pteropodid bats. nipah virus occurs in east asia and spills over to pigs and humans, where it has been able to cause epidemic disease outbreaks with case fatality rates close to 41% in humans (chong et al. 2002; chua 2003). in bangladesh, spillover occurs directly to humans with even higher case fatality rates (luby et al. 2009). the proven ability of henipaviruses to cause epidemic outbreaks with high case fatality rates makes spillover mitigation highly necessary. mitigation and prevention of the impacts of disease spillover depend on our understanding of the transmission process and our ability to predict it. mechanistic models of infectious diseases have proven useful frameworks to make informed decisions towards controlling and mitigating the impacts of epidemics (wickwire 1977).
these methods require high-quality longitudinal data rarely available for pathogens that originate in wild animals (woodroffe 1999). the poor understanding of hev dynamics in bats limits our ability to directly predict hev levels in those populations. nevertheless, predictions can be made with alternative methods to mechanistic models at lower spatial and temporal resolutions. these methods are based on readily available data and can be used to model the response of the system of interest. one approach to identify areas at risk from emerging infectious diseases is to model the ecological niche of the causal agent and its reservoir host with spatially explicit climatic data, and to use the model to predict their geographic distribution (escobar and craft 2016). the process of niche modelling consists of relating the climatic conditions of locations where organisms are able to breed and persist to the prevailing climatic conditions of areas where the species could occur. the relationships between a species' presence and climate are usually established with statistical models that ultimately represent a measure of environmental suitability. the spatial representation of environmental suitability helps visualise the model's estimates in the form of maps. assuming that the niches of the organisms being modelled do not undergo a climatic niche shift, models can be used to predict future distributions under climate change scenarios (pearman et al. 2008). for instance, using these methods many diseases have been predicted to impact wider areas with climate change, expanding or shifting from tropical to subtropical areas (lafferty 2009). therefore, identifying areas at risk and anticipating the potential impacts of climate change on hev spillover is critical to adequately allocate resources and mitigate risk. ecological niche modelling has been applied with varying degrees of success to investigate the distribution of the zoonotic niches of bat-borne viruses.
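the niche-modelling idea described above — contrasting climate at presence locations with climate at background locations to obtain an environmental suitability score — can be sketched as follows. this is a minimal illustrative stand-in (a hand-rolled logistic presence/background classifier on synthetic data), not the maxent models used in this study; all variable names and values are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up climate covariates (say, mean temperature and annual rainfall)
# at species presence sites and at random background sites.
presence = rng.normal(loc=[25.0, 1200.0], scale=[1.5, 150.0], size=(200, 2))
background = rng.uniform(low=[5.0, 200.0], high=[35.0, 2500.0], size=(2000, 2))

X = np.vstack([presence, background])
y = np.concatenate([np.ones(len(presence)), np.zeros(len(background))])

# Standardise, then fit a presence/background logistic model by
# gradient ascent on the log-likelihood.
mu, sd = X.mean(axis=0), X.std(axis=0)
Xs = np.column_stack([np.ones(len(X)), (X - mu) / sd])
w = np.zeros(Xs.shape[1])
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-Xs @ w))
    w += 0.1 * Xs.T @ (y - p) / len(y)

def suitability(temp, rain):
    """Relative environmental suitability on a 0-1 scale."""
    z = np.array([1.0, (temp - mu[0]) / sd[0], (rain - mu[1]) / sd[1]]) @ w
    return float(1.0 / (1.0 + np.exp(-z)))

print(round(suitability(25.0, 1200.0), 2), round(suitability(8.0, 300.0), 2))
```

in a real workflow the fitted surface would be projected over a climate raster to produce the suitability maps described in the text.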
for instance, when peterson et al. (2004) initially predicted areas at risk of marburg disease spillover in africa, they left out wide areas that were later shown to be at risk by updated models with improved methods and data (pigott et al. 2015). previous ecological niche modelling studies of henipavirus spillover systems have focused on answering ecological and epidemiological questions (hahn et al. 2014); identifying reservoir hosts (martin et al. 2016); identifying new populations at risk (walsh 2015); or generating broad predictions of risk (daszak et al. 2013). while their contribution towards improving our understanding has been valuable, none have focused on forecasting areas at risk of spillover in time, which is essential to anticipate the effects of climate change and inform public health measures (braks et al. 2013). in order to be able to predict the consequences of climate change, models must rely on climatic data that can be projected into the future, such as those resulting from global circulation models (hijmans et al. 2005; beaumont et al. 2008; wiens et al. 2009). empirical evidence suggests that hev spillover is related to climate by several different mechanisms acting at different temporal and spatial scales. from broad to fine: the spatial and temporal abundance patterns of hev reservoir hosts, flying foxes, are related to climatic suitability (martin et al. 2016); the spatial dynamics of bats are largely governed by food resources that are dependent on climate (hudson et al. 2010; giles et al. 2016); the levels of hev shedding may be linked to low food productivity and availability after severe weather events (plowright et al. 2008; mcfarlane et al. 2011; páez et al. 2017; peel et al. 2017); and lastly hev survival in microclimates, which might facilitate indirect transmission, is also dependent on climate (martin et al., 2017).
in this study we present two models that estimate the areas at risk of hendra virus spillover to horses under current and future climatic conditions. the models represent the climatic requirements for the presence of hev's reservoir hosts and the climatic conditions that have facilitated hev's transmission to horses. we used current and predicted future climatic conditions to project the statistical models and predict areas that could be at risk now and by year 2050, according to two representative greenhouse gas concentration pathways. ecological niche models often use presence-only data, resulting in the extensive use of algorithmic modelling (elith et al. 2011). a well-known disadvantage of these methods is their potential for complexity. recent efforts have made the application of better-understood techniques like generalised linear models possible for presence-only data, in the form of poisson point process models (renner and warton 2013; renner et al. 2015). taking advantage of these statistically transparent frameworks, we modelled the risk of hev spillover as a poisson point process, including a log-gaussian cox process to model spatial autocorrelation, in a bayesian hierarchical inference framework. this method allowed us to use the entire hev spillover database (55 events between 1994 and 2015) without thinning to control spatial autocorrelation (boria et al. 2014). the models represent the relationship between the number of spillover events per unit area, the climatic suitability for reservoir host species, and the climatic niche (temperature and precipitation) over which spillover events have occurred to date.
we took the following steps to build these models: (1) assigned presence points to the most likely reservoir host species present at spillover locations, (2) computed the optimal size of spatial units and determined appropriate explanatory climatic variables, (3) selected the model structure (linear and quadratic terms and interactions, with aic and cross-validation), (4) selected priors for the bayesian model, (5) fitted the bayesian model, (6) cross-validated, and (7) transferred models to climate change scenarios (fig. 1).
figure 1. workflow to construct the models.
hev spills over to horses from two of the four australian flying fox species (edson et al. 2015; martin et al. 2016). we treated these species as two separate reservoir host systems (edson et al. 2015; martin et al. 2018) that were geographically limited to the areas colonisable by p. alecto and p. conspicillatus (martin et al. 2016). the colonisable areas comprise the climatic regions of australia (obtained from the bureau of meteorology, http://www.bom.gov.au) that contain at least one presence record of each reservoir host. to assign spillover events to either system (p. alecto, north and south, and p. conspicillatus, strictly north), we extracted the maxent model's relative probability of presence of the two flying fox species at the location of spillover (martin et al. 2016). the location points were assigned to the reservoir host system whose species had the higher relative probability of presence at that location. to these data, from the remaining location points that did not have a greater occurrence probability for either species, we added the points with the top 5% of occurrence probability (two spillover events) for that species at spillover locations. with this final step, we allowed for ambiguity about the most likely reservoir species that acted as the source of infection in horses. point process models relate the number of points (intensity) of presence locations to spatial units.
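a minimal sketch of the assignment rule just described: each spillover event goes to the reservoir species with the higher relative probability of presence, and ambiguity is then allowed by also attributing locations in the top 5% of the other species' probabilities. the probability values here are synthetic stand-ins for the maxent outputs.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic MaxEnt-style relative probabilities of presence for the two
# reservoir species at the 55 spillover locations.
p_alecto = rng.uniform(0.0, 1.0, 55)
p_conspicillatus = rng.uniform(0.0, 0.6, 55)

# Step 1: assign each event to the species with the higher relative
# probability of presence at that location.
to_alecto = p_alecto >= p_conspicillatus
to_consp = ~to_alecto

# Step 2: allow ambiguity -- also attribute to P. conspicillatus the
# locations in the top 5% of its presence probabilities.
to_consp = to_consp | (p_conspicillatus >= np.quantile(p_conspicillatus, 0.95))

print(int(to_alecto.sum()), int(to_consp.sum()))
```

note that after step 2 a location can belong to both systems, which is how the two additional ambiguous events enter the second system.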
before a model was fitted, we needed to determine the optimal size of the spatial units in which spillover intensity would be regressed against a set of explanatory variables. to estimate the optimal computational resolution, we used a minimum contrast method. briefly, this method compares a nonparametric estimate of the spatial field with the theoretical model that describes the spatial process of the data (davies and hazelton 2013). the spatial resolution that minimises the error between the two is optimal. the expected intensity of spillover cases per spatial unit was explained by the climatic suitability for the two key reservoir hosts, previously modelled with maxent (phillips et al. 2006) by martin et al. (2016), and a subset of the worldclim bioclimatic variables (bio1-19) (hijmans et al. 2005). to select the explanatory bioclimatic variables least likely to negatively affect model transference to climate change conditions for each spillover system, we performed a niche views procedure. this consists of an analysis of the correlations between pairs of explanatory variables and of the location of the presence points within the bivariate clouds of data points [bio1-18 and p. alecto or p. conspicillatus models (owens et al. 2013)]. the variables least likely to affect model transference to climate change conditions are those where the presence points lie close to the centre, or relatively far from the margins, of the range of values within the bivariate cloud of data (owens et al. 2013). this may have increased model complexity by excluding variables with more explanatory power. parameters for the poisson point process model and spatial autocorrelation with a log-gaussian cox process were sampled with a metropolis-adjusted langevin algorithm implemented in the 'lgcp' package for r 3.2.3 (r core team 2016). model selection was performed a priori with a poisson regression.
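the "niche views" screening can be approximated by checking how central the presence points sit within each bivariate background cloud: variable pairs where presences lie near the margins are riskier to transfer to novel climates. this sketch scores centrality with a mean mahalanobis distance on synthetic data; it is an illustrative approximation, not the actual procedure of owens et al. (2013).

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)

# Synthetic background climate (columns stand in for bioclim variables)
# and the same variables extracted at presence points (kept central here).
background = rng.normal(size=(5000, 4))
presence = background[rng.choice(5000, 60, replace=False)] * 0.4

def centrality(pres2d, bg2d):
    """Mean Mahalanobis distance of presence points from the centre of a
    bivariate background cloud (smaller = safer to transfer)."""
    mu = bg2d.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(bg2d, rowvar=False))
    d = pres2d - mu
    return float(np.sqrt(np.einsum("ij,jk,ik->i", d, cov_inv, d)).mean())

scores = {(i, j): centrality(presence[:, [i, j]], background[:, [i, j]])
          for i, j in combinations(range(4), 2)}
for pair in sorted(scores, key=scores.get):
    print(pair, round(scores[pair], 2))   # most central (safest) pairs first
```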
the response variable for this part of the process was the number of points per spatial area unit (e.g. pixels). we began model selection with four model structures for the poisson regression of the p. alecto system: interactions between all explanatory variables; linear, quadratic and cubic terms; and linear and quadratic terms with a series of interactions between linear, squared and cubed terms. each model was then subject to automated step-wise variable deletion until we obtained, and kept, the model structure with the lowest aic. we then tested the performance of this model structure on independent data with the partial roc test [described in more detail below (peterson et al. 2008)]. if the lowest-aic model performed better than random according to the test, we used its formula in the final poisson point process. for the p. conspicillatus system, we only selected one out of three models, given the small number of spillover cases potentially arising from this species: all linear terms, and linear with quadratic and cubed probability of p. conspicillatus presence (from martin et al. 2016). to find the appropriate spatial covariance function, we ran short chains of 5000 iterations of the mcmc sampler, with 500 burn-in iterations and a thinning rate of 15, with the exponential and spiked exponential covariance functions. the chains were run with a range of priors and numbers of neighbouring cells to compute the covariance function. when the mcmc chains appeared to be well mixed, with low autocorrelation and no rejected samples during the run phase, we ran the full mcmc chains. the final size of the full-length chains was chosen based upon the behaviour of the h parameter. for runs with h ≈ 0.3, we used 5 million iterations, but in cases where h decreased, the number of iterations was increased up to 20 million. parameter priors that resulted in h < 0.1 at the end of the 5000-iteration chain were rejected.
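the a priori model-selection step — fitting candidate poisson regressions of gridded spillover counts with a log(horse density) offset and keeping the lowest-aic structure — can be sketched as below. the data are simulated and the glm is hand-rolled via newton scoring; the actual analysis used automated step-wise deletion in r.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated gridded data: spillover counts per cell, one climatic
# covariate, and a log(horse density) offset (all values made up).
n = 400
climate = rng.normal(size=n)
offset = np.log(rng.uniform(1.0, 50.0, n))   # log horses per cell
counts = rng.poisson(np.exp(-4.0 + 0.8 * climate + offset))

def fit_poisson(X, y, off, iters=50):
    """Poisson GLM with log link and offset, fit by Newton scoring;
    returns (coefficients, AIC up to an additive constant in y)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        mu = np.exp(X @ beta + off)
        beta += np.linalg.solve(X.T @ (mu[:, None] * X), X.T @ (y - mu))
    mu = np.exp(X @ beta + off)
    loglik = np.sum(y * np.log(mu) - mu)     # log(y!) constant dropped
    return beta, 2 * X.shape[1] - 2 * loglik

one = np.ones(n)
candidates = {
    "intercept only": np.column_stack([one]),
    "linear": np.column_stack([one, climate]),
    "quadratic": np.column_stack([one, climate, climate**2]),
}
aics = {name: fit_poisson(X, counts, offset)[1] for name, X in candidates.items()}
print(min(aics, key=aics.get))   # structure kept for the point process model
```

the aic comparison is valid across candidates because the dropped log(y!) term is identical for all of them.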
the burn-in phase of the full-length chains comprised 10% of the number of iterations, and the thinning rate was set to keep 1% of the posterior samples (taylor et al. 2015). to have a model that represents the underlying risk regardless of the underlying population at risk, we corrected for the effect of horse population on spillover intensity per unit area. this was done by using a horse population density model as a poisson offset. in the absence of more data regarding horse densities than the 2007 horse census (moloney 2011), a critical assumption in the approach is that the horse population of 2007 is still correlated with, and broadly representative of, the horse population during the period when hev spillover events have occurred, from 1994 to 2016, at the spatial scale of the model. further justification for this approach is that we aimed to represent the underlying risk for spillover regardless of the density of the spillover host. this is partly because we do not have a reliable model for future horse densities. hence the poisson component of the model, with population offset and spatial covariance, results in the following bayesian model.
poisson component:
x(s) | y(s) ~ poisson( c_a · k(s) · exp( z(s)·b + y(s) ) )
spatial covariance function:
r(u) = σ² exp( −u / φ )    (1)
parameter priors: gaussian priors on b and on the log-transformed covariance parameters σ and φ,
where b is the vector of effects of environmental covariates z(s); y(s) is the gaussian random field with bivariate (s1, s2) covariance function; c_a is the spatial grid cell size; k(s) is the horse population density; x(s) is the point intensity data; and σ and φ are the parameters of the exponential covariance function r(u) in eq. (1) (taylor et al. 2013). the horse population density model was built with the horse population census of 2007 (moloney 2011). this horse density model was created by introducing uniformly distributed noise to the geographical coordinates of the horse properties, equivalent to 50% of the cell width of the environmental data. the number of horses per grid cell after noising the coordinates was rasterised, and the process was repeated 100 times.
when iterations were completed, the 100 raster layers were averaged to create the final horse density model (fig. 2). this method allowed us to account for the effect of properties spanning more than one grid cell but whose centroid lay within a single pixel. once we obtained the converged mcmc chains, we used the posterior estimates of the environmental covariates to project the point intensity of spillover per unit area in geographic and environmental space. for the final spillover risk model, we calculated the probability that the predicted intensity exceeds the lower 20th percentile of the median intensity estimated for the locations with spillover events. this threshold was chosen because the database contains location uncertainty in nearly 20% of the spillover locations. before we simulated future scenarios of spillover, we compared the data used for model fitting with the data from climate change predictions in an extrapolation analysis [exdet (mesgaran et al. 2014)]. briefly, extrapolation analyses consist of finding extended variable ranges (type 1 extrapolation) and different correlation structures (type 2 extrapolation) that might affect the behaviour of the statistical model. these analyses are performed with the raster data used for model transference and highlight geographical areas where the model might misbehave. we used the extrapolation analysis results to remove all areas where models faced extrapolation due to novel climatic conditions. we used the 16 climate change scenarios under two different greenhouse gas representative concentration pathways [rcp 4.5 and 8.5 (hijmans et al. 2005)]. this approach has been suggested to represent degrees of confidence in the potential outcomes of climate change, given the variability between global circulation models, rcps and downscaling methods (beaumont et al. 2008).
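the noising-and-averaging construction of the horse density layer (jitter property coordinates by ±50% of a cell width, rasterise head counts, repeat 100 times, average) can be sketched as follows; the census coordinates, head counts and grid size are made-up stand-ins for the 2007 census data.

```python
import numpy as np

rng = np.random.default_rng(4)

# Made-up horse-property census: property coordinates and head counts
# on a 10 x 10 grid whose cells have unit width.
props_xy = rng.uniform(0.0, 10.0, size=(300, 2))
props_n = rng.integers(1, 40, size=300)

cell = 1.0                  # width of one environmental-data grid cell
reps = 100
density = np.zeros((10, 10))

for _ in range(reps):
    # Jitter coordinates by uniform noise of +/- 50% of the cell width,
    # then rasterise the head counts onto the grid.
    xy = props_xy + rng.uniform(-0.5 * cell, 0.5 * cell, size=props_xy.shape)
    xy = np.clip(xy, 0.0, 10.0 - 1e-9)
    layer = np.zeros((10, 10))
    np.add.at(layer, (xy[:, 1].astype(int), xy[:, 0].astype(int)), props_n)
    density += layer

density /= reps             # average of the 100 noised rasters
print(density.sum())
```

averaging the jittered rasters spreads each property's horses over neighbouring cells while conserving the census total, which is why a property whose land spans several cells no longer piles entirely onto the pixel containing its centroid.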
concentration pathways are a series of alternative trajectories that greenhouse gas concentrations might follow depending on ecological, technological, socioeconomic, political and demographic factors, resulting in different degrees of climate change. the number accompanying the acronym rcp indicates the severity of greenhouse gas concentration (van vuuren et al. 2011). to generate the climate change consensus maps, we began by setting a threshold of 0.2 for all exceeding probabilities, to coincide with the threshold used to calculate the probabilities and test the models (see below). areas predicted absent with this threshold under current climatic conditions were set to -1 (presence = 1, absence = -1). then, 1 was added to each of the thresholded climate change projections, so that areas predicted absent = 1 and areas predicted present = 2. after adding up all the thresholded predictions, we subtracted the probability of model extrapolation to force all extrapolated areas back to 1 (or to between 1 and 2 if not all climate change scenarios resulted in extrapolation for that area). finally, we multiplied by the values for the current risk scenario based on current climatic conditions, for which areas predicted absent under the 0.2 threshold were set to -1 and presence to 1 (absence = -1, presence = 1). this multiplication made all areas predicted present under future climatic conditions but absent under today's conditions negative; all areas predicted present both now and in the future received positive values between 1 and 2; and areas becoming unsuitable after climate change received values close to 1 (presence today (1) × absence future (1) = 1). future absence in the areas represented by these calculations was caused by areas remaining in the same absent state as currently, by areas becoming unsuitable, or by being unable to predict anything due to extrapolation. to help readers identify areas where predictions are dubious due to extrapolation, we provide a map of extrapolation conditions.

figure 2. location of spillover events overlaid on the horse density model (log10 scale). the symbol for spillover events represents the reservoir host species to which we attributed the spillover events. spillover localities were thinned to improve visualisation.

the models' predictive performance was measured with the partial roc (receiver operator characteristic) with a 20% omission rate (peterson et al. 2008) for the p. alecto system, and with a jackknife test (pearson et al. 2006) for the p. conspicillatus system. the partial roc test is a modification of the traditional roc area under the curve (auc), which measures a model's ability to discriminate zeroes from ones; an auc score of 1 means the model is capable of perfect discrimination (no false positives or negatives). the partial roc, by contrast, is based on the spatial performance of the model projection, contrasting the per cent of area predicted, which is used to generate a random predictor, with the proportion of presences predicted. the resulting score is the quotient of the model's auc and the random predictor's auc; consequently, the maximum partial roc score is two (peterson et al. 2008). to run this test with independent data, we partitioned the p. alecto system data into four sets and cross-validated their predictions with models fit in one chain of 5 m iterations, a burn-in of 500 k and thinning of 4.5 k. we allowed the partial roc test a 20% omission of the test points to represent the 20% uncertainty in spillover location according to the biosecurity queensland dataset. while desirable, a higher number of partitions was not possible because of the high computational intensity of these analyses. for the p. conspicillatus system, we fitted 8 models, each omitting one of the spillover presence points.
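a simplified partial roc, loosely following the description above (an auc ratio restricted to sensitivities compatible with the 20% omission rate), might look like the sketch below; the suitability values are simulated stand-ins, not model output:

```python
import numpy as np

def _trapz(y, x):
    # trapezoidal integration (avoids numpy version differences around np.trapz)
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x) / 2.0))

def partial_roc_ratio(background, presences, omission=0.2):
    """auc ratio restricted to sensitivity >= 1 - omission,
    loosely following the partial roc of peterson et al. (2008)."""
    # descending thresholds, so predicted area and sensitivity grow monotonically
    thr = np.unique(np.concatenate([background, presences]))[::-1]
    area = np.array([(background >= t).mean() for t in thr])  # fraction of area predicted present
    sens = np.array([(presences >= t).mean() for t in thr])   # fraction of presences predicted
    keep = sens >= 1.0 - omission        # region allowed by the stated omission rate
    x, y = area[keep], sens[keep]
    model_auc = _trapz(y, x)
    random_auc = _trapz(x, x)            # random predictor: sensitivity equals area predicted
    return model_auc / random_auc

rng = np.random.default_rng(3)
background = rng.uniform(0.0, 1.0, 5000)  # suitability over all grid cells (simulated)
presences = rng.beta(4.0, 2.0, 60)        # test presences skewed towards high suitability
ratio = partial_roc_ratio(background, presences)
```

a ratio of 1 corresponds to the random predictor and 2 is the theoretical maximum, matching the score range described above.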
then we calculated the minimum thresholds and their corresponding per cent of area predicted for the jackknife test (pearson et al. 2006). the predicted spatial patterns of hev spillover risk under present climatic conditions for both reservoir host systems, p. alecto and p. conspicillatus, are consistent with the distribution of spillover events since 1994. current risk, as explained by p. alecto, comprises most of the east coast of australia, from northern queensland to central new south wales, and overlaps with parts of the distribution of p. conspicillatus. additional areas, spanning farther north than the northernmost known spillover event, were also predicted to be at risk (fig. 3). risk driven by p. conspicillatus included novel areas predicted to be at risk of spillover in the northernmost part of australia (fig. 4). in the p. alecto hev spillover system, when we projected the models to future climate change scenarios in 2050, all 16 scenarios agreed that there could be an expansion of risk towards the south and slightly farther inland (red areas in the left panels of fig. 5). at current horse population levels (estimated during the 2007-2011 census), the areas predicted to experience significantly greater risk contain up to 112,914 horses under climate change scenario rcp45 (greenhouse gas representative concentration pathway 45) and up to 164,391 horses under rcp85. these numbers represent a 175-260% increase in the horse population at risk. the majority of horses at increased risk occur along 390-425 km of coastline south of the southernmost known spillover event (kempsey; figs. 2, 3). expansion of risk under both rcps reaches the hunter valley (west of sydney), which has a high density of horses, and the climate change scenario agreement in this area is very high.
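the consensus-map arithmetic described in the methods (threshold at 0.2, offset future scenario layers by 1, subtract the extrapolation probability, then sign by the current risk layer) can be sketched as below; one reading of the text is assumed, namely that the scenario layers are combined by averaging, and all rasters here are random stand-ins for model output:

```python
import numpy as np

rng = np.random.default_rng(2)

shape = (50, 50)
n_scen = 16
thr = 0.2   # threshold on exceeding probabilities

# hypothetical probability rasters standing in for model predictions
current_p = rng.uniform(0, 1, shape)
future_p = rng.uniform(0, 1, (n_scen,) + shape)
extrap_p = rng.uniform(0, 1, shape)   # probability of model extrapolation per pixel

# current conditions: presence = 1, absence = -1
current = np.where(current_p >= thr, 1.0, -1.0)

# each future scenario: threshold to 0/1, then add 1 (absent = 1, present = 2)
future = np.where(future_p >= thr, 1.0, 0.0) + 1.0

# combine scenarios, pull extrapolated areas back towards 1,
# then sign the result by the current risk layer
consensus = (future.mean(axis=0) - extrap_p) * current

# interpretation: negative values = novel future risk in areas absent today;
# positive values between 1 and 2 = at risk now and in the future;
# values near 1 in currently present areas = risk lost under climate change
```

the averaging interpretation reproduces the value ranges described in the text (consensus magnitudes between 1 and 2, extrapolated areas forced back towards 1).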
despite the increased geographical extent of spillover risk, model projections predicted a lower average probability of exceeding the specified intensity threshold for spillover to occur (the 20th quantile of the median intensity at spillover locations) compared with current conditions (figs. 5, 6). in northern areas, there is consensus that there could be greater risk levels in novel areas in the p. alecto spillover system. however, most of these areas could face novel climatic conditions that would result in model extrapolation, which increases uncertainty (top right corner of the right panels in fig. 5). we could not identify any areas that would become completely unsuitable for spillover (marked with green). the only areas with 100% agreement on becoming completely free of risk are most likely overpredictions, because they lie very far from the known distribution of spillover and of p. alecto. the major difference between the greenhouse gas representative concentration pathways, based on the p. alecto hev spillover system, is a greater inland expansion of areas at risk under rcp 85 than under rcp 45; that is, in the more severe climate change scenario, risk could increase farther inland. with respect to p. conspicillatus, models predicted both expansion and contraction, but with low agreement between climate change scenarios (left panels of fig. 6). the northernmost areas at risk were predicted to shrink under both concentration pathways, although these areas were affected by extrapolation; other small areas not affected by extrapolation were predicted to become unsuitable under rcp 85. both scenarios, rcp 85 and rcp 45, predict lower probabilities of exceeding the intensity threshold compared with the current climate (right panels of fig. 6). the areas that experienced no change for p. conspicillatus were small, and compared with p. alecto they experienced either no change or expansion (darkest blue, figs. 5, 6). this indicates that p.
conspicillatus habitat is likely to shrink and become more suitable for p. alecto, which raises conservation concerns for p. conspicillatus that might also bear on hev epidemiology. the model selection procedure resulted in retention of the variables and interactions in tables 1 (p. alecto) and 2 (p. conspicillatus). for the p. alecto system, we could successfully implement the best model according to the aic of the poisson regression. for the p. conspicillatus system, however, all attempts to include linear and quadratic terms or their interactions resulted either in singularity of the covariance matrix or in very long mcmc chains. as a result, we sought a balance between aic and the convergence properties of the mcmc chain, which resulted in a simple model with linear terms for the climatic components and a cubic term for p. conspicillatus climatic suitability. estimates of the regression coefficients (b) and the spatial component (u, r) are listed in tables 1 and 2. both models converged with 10 m iterations, a burn-in of 1 m and a thinning rate of 9 k. both models performed better than random according to their respective performance metrics. the p. alecto hev spillover system had an area under the curve ratio of 1.47, significantly different from 1 (p = 0.04; 1 represents the random prediction threshold), with an omission rate of 20%, consistent with the threshold used to calculate the exceeding probabilities. the p. conspicillatus spillover system, tested with the jackknife test, also performed better than random, with a prediction rate of 0.75 (p = 2 × 10^-6). given the 20% omission threshold we were targeting and the small number of presence points in the p. conspicillatus data set, the prediction rate of 0.75 is acceptable, because in a binomial process with the same sample size it is not significantly different from 0.8 (p = 0.99). the models tested during the model structure selection procedure also performed better than random.
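the claim that a prediction rate of 0.75 is not significantly different from 0.8 at this sample size can be checked with an exact two-sided binomial test (doubling the smaller tail; n = 8 leave-one-out models, as described above):

```python
from math import comb

def binom_two_sided(successes, n, p):
    """exact two-sided binomial p-value (doubled smaller tail, capped at 1)."""
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    lower = sum(pmf[: successes + 1])   # P(X <= successes)
    upper = sum(pmf[successes:])        # P(X >= successes)
    return min(1.0, 2.0 * min(lower, upper))

# prediction rate 0.75 = 6 of 8 leave-one-out models, reference rate 0.8
pval = binom_two_sided(6, 8, 0.8)   # ≈ 0.99, consistent with the reported value
```

with only 8 trials, the binomial distribution is too coarse to distinguish 0.75 from 0.8, which is exactly the point made in the text.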
the auc ratios of the partitions were 1.84 (sd = 0.069, p = 0) and 1.85 (sd = 0.053, p = 0) with the same 20% omission rate. all future climate scenarios result in novel conditions for the models, especially in northern areas. extrapolation mostly affected the projections of the p. alecto hev spillover system and occurred in the novel areas of northern queensland (top right corner of the right-side panels in fig. 5). all the climatic variables used in the model caused type 1 novelty in these areas; type 1 novelty means covariate values beyond the range used in model training. additional areas of extrapolation occurred in southern locations along the coast. because extrapolation in these areas was caused by the model of the p. alecto distribution, which can only take values between 0 and 1, extrapolation artefacts are unlikely. we did not perform an extrapolation analysis for the p. alecto distribution models themselves; however, we compared the predictions of maxent models fit with and without clamping and extrapolation and did not notice any differences. extrapolation affecting the p. conspicillatus system was mostly present in areas accessible to the species but outside the areas estimated as suitable for p. conspicillatus. therefore, extrapolation is unlikely to affect predictions except at the northernmost locations in the study area (fig. 6). however, some climate change scenarios predict the occurrence of type 1 and type 2 extrapolation, likely caused by p. conspicillatus. the probability of extrapolation for this system is shown in the top right corner of the right-side panels in fig. 5. under climate change, suitability for hev spillover could expand southwards. in addition to a southward expansion, some scenarios predicted inland expansion in the p. alecto hev spillover system. however, while the total area at risk of spillover was predicted to increase, the average probability of spillover in these areas could slightly decrease, especially in the p. conspicillatus system.
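type 1 novelty, i.e. covariate values outside the range used in model training, can be flagged per pixel as in the sketch below; the training values and projection rasters are simulated stand-ins:

```python
import numpy as np

rng = np.random.default_rng(4)

n_vars, shape = 4, (40, 40)
train = rng.normal(0.0, 1.0, (n_vars, 2000))          # covariate values used in model fitting
future = rng.normal(0.5, 1.5, (n_vars,) + shape)      # projection covariate rasters (wider spread)

# per-variable training range
lo = train.min(axis=1)[:, None, None]
hi = train.max(axis=1)[:, None, None]

# type 1 extrapolation: any covariate outside its training range at that pixel
type1 = ((future < lo) | (future > hi)).any(axis=0)
```

a full exdet analysis would additionally flag type 2 novelty (changed correlation structure between covariates); this sketch covers only the range check.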
there was high uncertainty about future risk in areas north of the current distribution of spillover. in areas currently inhabited by p. conspicillatus, p. alecto was predicted to remain stable or expand. in areas where both p. alecto and p. conspicillatus were predicted to co-occur, the average probability of exceeding the intensity threshold was predicted to decrease for both species. p. alecto's expansion indicates that additional mitigation efforts should be allocated where risk has been predicted to increase (marked red in the consensus maps in figs. 5, 6). in addition, the expansion of p. alecto into p. conspicillatus territory suggests that p. alecto may replace it, or become the more predominant hev host, in those areas. the currently forecast area at risk of hev spillover to horses is wider than the area containing the detected hev spillover events. based on both p. alecto and p. conspicillatus, areas farther north than previously recognised were predicted to be at risk. the absence of spillover detection in these areas is probably due to the very low density of horses (fig. 2) and the relative lack of disease surveillance. while the effect of horse density on the risk of spillover seems negligible (mcfarlane et al. 2011) or negative, depending on the spatial scale (martin et al. 2018), the presence of horses is a precondition for spillover. previous niche modelling studies of henipavirus hosts predicted broad areas at risk in australia (daszak et al. 2013). our results differ from these predictions because we used the actual spillover events to fit the model and because we narrowed the number of reservoir hosts from four to two. before 2014 it was unclear which bat species were most relevant for hev epidemiology. recent findings have provided epidemiological (edson et al. 2015) and ecological (martin et al. 2016) support for p. alecto and p. conspicillatus as the main hev reservoir hosts. hence, we have predicted smaller areas at risk.
poisson point processes have infrequently been used to model the spatial pattern of spillover of bat-borne viruses. walsh (2015) modelled the spatial pattern of nipah virus spillover to humans as a point process in response to the human footprint, the presence of bat reservoirs and environmental factors (vegetation). one key difference from our study is that we focused on modelling hev spillover driven by environmental factors in order to be able to project the models into climate change scenarios. this enabled us to explore the potential consequences of climate change for hev spillover. with that in mind, we used reservoir host density (statistically equivalent to the human footprint variable in walsh (2015)) as an offset term to specifically model the isolated effect of climate. we took this approach because of the lack of data to predict future horse density and distribution, which precludes its inclusion as a hev spillover predictor. previous studies of the zoonotic niche of bat-borne viruses, including marburg and ebola viruses (peterson et al. 2004; pigott et al. 2014, 2015) and nipah virus (peterson 2013), have used machine learning methods. interpretation of those models and the risk management implications of their predictions were thus limited to visual analysis of the geographic patterns and associated climatic factors. the transparent nature of, and control over, model selection that poisson point processes confer results in a better understanding of the likely biological meaning of statistical associations (renner et al. 2015; taylor et al. 2015).

left panels show areas of expansion and contraction and the level of agreement between climate change scenarios; right panels show the average predicted probability of exceeding spillover intensity among climate change scenarios (main panels); top right corners show the probability of model extrapolation.
however, definitive interpretation depends on an understanding of the underlying biological mechanisms (walsh 2015). although the final models are complex, owing to a lack of understanding of the interaction of flying foxes and horses, the statistical associations in the models of the p. alecto system are similar to those found in martin et al. (2018). first, most of the variable interactions kept in the model involve rainfall (bio12) and its seasonality (bio15). these two interact with maxent.p.alecto, indicating that interactions between rainfall, its variability and the probable presence of this bat species are important for spillover. such effects could be due to the climatic differences between areas used for foraging and for establishing a roost (tidemann et al. 1999; vardon et al. 2001). in fact, high suitability for p. alecto is not enough to explain spillover, because maxent.p.alecto alone had a negative effect that is reversed when it interacts with rainfall levels (table 1). we can infer from these associations that rainfall levels and their variability with respect to seasonal extremes could be broad-scale correlates of hev spillover risk (páez et al. 2017; martin et al. 2018). in the p. conspicillatus system, the small number of spillover events (nine) limits the number of variables and interactions that could be included in the model. in the final model structure, only maxent.p.conspicillatus and bio2 (mean diurnal temperature range) had significant effects (table 2). given the cubic exponent on the positive effect of maxent.p.conspicillatus, we infer that transmission from this species to horses occurs in areas where climatic suitability for p. conspicillatus is very high.
there was complete agreement among climate change scenarios that there could be a southward increase in suitability for spillover caused by the response of p. alecto to climate change.

table 1. posterior estimates for the p. alecto model: parameter, point estimate, and interval bounds.

log(r)                         3.817497 × 10^-2    -1.568415           7.081970 × 10^-1
log(u)                         1.588467            2.679587 × 10^-1    3.093003
b intercept                    3.922981            -3.376057 × 10^1    3.852942 × 10^1
b bio5                         -9.316293 × 10^-2   -1.890911 × 10^-1   1.705975 × 10^-2
b bio9                         -8.620517 × 10^-3   -5.012380 × 10^-2   3.087590 × 10^-2
b bio12                        -1.710871 × 10^-2   -3.934777 × 10^-2   3.480018 × 10^-3
b bio15                        1.840011 × 10^-1    9.811339 × 10^-2    2.795649 × 10^-1
b maxent.p.alecto              -1.438959 × 10^2    -2.504255 × 10^2    -3.564127 × 10^1
b i(maxent.p.alecto^2)         5.913680 × 10^1     -6.198367           1.252079 × 10^2
b bio5:maxent.p.alecto         2.524097 × 10^-1    4.738640 × 10^-2    4.704084 × 10^-1
b bio12:maxent.p.alecto        9.334252 × 10^-2    2.171861 × 10^-2    1.726844 × 10^-1
b bio12:bio15                  -1.433364 × 10^-4   -7.637361 × 10^-5   -2.197990 × 10^-4
b bio12:i(maxent.p.alecto^2)   -7.449967 × 10^-2   -1.408666 × 10^-1   -1.353857 × 10^-2

b's represent the regression coefficients on the exponential scale. parameters r and u are the mean and variance of the spiked exponential covariance function.

however, the already observed southward expansion of p. alecto is faster than predicted by changing climate (roberts et al. 2012), suggesting that other, non-climatic factors like urbanisation (plowright et al. 2011; tait et al. 2014) also affect the presence and density of the bat species. to date, the southernmost recorded spillover events lie within the limits of the current potential distribution of spillover (blue areas in the left panels of fig. 5). this shows that even though p. alecto is capable of occupying areas beyond its optimal climatic niche, spillover and spillover risk occur within the areas with the highest climatic suitability in most cases, most likely due to higher potential densities of p. alecto [climatic suitability is correlated with bat density and spillover risk (martin et al. 2016)].
hence, as climatic suitability for p. alecto continues to increase southwards, the potential for higher population densities, and with it hev spillover risk, could also increase southwards. the predicted expansions under climate change with high agreement might be related to the higher temperatures expected at higher altitudes and lower latitudes (lafferty 2009), particularly in australia (williams et al. 2003). this is consistent with predictions of tropical diseases expanding or shifting into subtropical areas (lafferty 2009). the lower agreement on the inland expansion indicates that the effect of altitude is less clear among climate change scenarios; indeed, some scenarios indicate that there could be a contraction towards the coast. consequently, to adequately assess whether there will be expansion or contraction to and from the coast, flying fox monitoring programs are required. in the model projections, we identified overpredictions (figs. 3, 5). these could be due to the inclusion of areas that are not usually available to p. alecto (soberón and . accessible areas are usually defined by physical barriers; however, in the absence of such evident barriers for pteropodid bats in australia, we assumed that climate could act as a barrier through its effects on bat physiology. while this assumption could be valid, the choice of climatic regions clearly did not eliminate inaccessible areas that could be suitable according to at least some climatic factors. alternatively, the most relevant climatic factors restricting the distribution of bats and hev spillover might have been discarded in the search for variables less likely to impair the model's transference to climate change scenarios (owens et al. 2013). the ultimate implications of the southward, and probable inland, expansion are a greater number of horses at spillover risk.
depending on the representative concentration pathway (rcp), and based on the 2007 horse census, there could be roughly 112,900 (rcp45) to 164,400 (rcp85) more horses at risk (a 175-260% increase). because there is considerable uncertainty around the potential outcomes of climate change for disease occurrence in new areas, more research is needed, first to verify predictions and then to better manage the consequences (braks et al. 2013). furthermore, the ultimate spillover risk scenario by 2050 will also depend on horse densities and socio-economic processes, and on how these processes interact with climate change. one potential area of ecological and epidemiological research is therefore the role of novel ecological interactions between flying foxes and other organisms, such as food sources, that could experience distributional shifts and impacts as a result of human activities. we need to understand whether these novel interactions and processes affect the dynamics of bat populations, hev, and spillover risk (williams et al. 2003; sala et al. 2009). consequently, we emphasise the need to undertake regular risk assessments to quantify hev exposure in horse populations and to consider the potential consequences of a larger horse population at risk. in light of the potentially larger horse population at risk, it is clear that direct intervention in the hev spillover system, such as extending vaccine coverage, is necessary to mitigate risk in response to climate change, regardless of the uncertainties involved (beaumont et al. 2008). a more holistic approach, however, would include reduction of greenhouse gas emissions. such a management strategy would positively affect all levels of organisation of the hev spillover system studied here and help prevent other predicted consequences of climate change. for instance, the australian tropics are predicted to experience large biodiversity losses (williams et al.
2003) and grasslands in southern australia could experience increased variability in productivity, which could affect the cattle industry and potentially the horse industry (cullen et al. 2009). our models predicted that spillover frequency could decrease in response to climate change with respect to p. conspicillatus. however, the p. alecto hev spillover system was predicted to persist or increase within the areas currently suitable for p. conspicillatus. therefore, these areas of tropical north queensland could experience a replacement of reservoir host species that may result in different epidemiological processes and benefit from different mitigation strategies. we recommend, therefore, that one area of research be the development of specific management strategies for the different flying fox species relevant to hendra virus spillover. such management strategies would anticipate and better manage flying fox species replacements and changes in the epidemiology of hev spillover. the predicted shrinking of the distribution of p. conspicillatus could also affect the dynamics of many ecosystem processes, because flying foxes are important pollinators and seed dispersers. the absence of such ecosystem services could result in further biodiversity loss; indeed, such losses occur even before bats become extinct (mcconkey and drake 2006). serious additional conservation issues may therefore arise from a p. conspicillatus decline that could affect hev epidemiology. predicting hev spillover with the methods we used carries considerable uncertainty.
sources of uncertainty may be related to: (1) the type of presence-only data used, which limits the number of analytical methods available and hampers the identification of limiting factors; (2) the effects of climate on hev spillover, which act at several levels of ecological organisation and are not well understood; for instance, temperature, humidity and ground vegetation might also limit the available pathways for hev transmission to horses (martin et al. 2017), and temperature can regulate the flowering status of native plants (hudson et al. 2010), the main source of food for flying foxes; (3) flying fox species distributions, which do not depend entirely on climate (tidemann et al. 1999; vardon et al. 2001) but are greatly affected by native plant phenology (giles et al. 2016) and by an apparently innate preference for fragmented and urbanised landscapes (tait et al. 2014); (4) the predicted distributions of flying foxes in response to climate change, which do not account for other organisms' shifting distributions that affect bats' distributions and might give rise to novel and unpredictable interactions and effects (eby et al. 1999; eby and law 2008; giles et al. 2016); and (5) the possibility that climate change could also affect horse behaviour and susceptibility to disease; for instance, horses have limited thermal tolerance, and exceeding their comfort levels can alter their behaviour (castanheira et al. 2010) and increase the frequency of interaction with tree-shaded areas (jørgensen and bøe 2007), which is where hev is usually excreted. all of these issues warrant further research to increase understanding of hev epidemiology and bat virus spillover in general. the strength of our approach lies in its generality. possible improvements to make our models more specific might involve: (1) including a model of bat distribution that better accounts for the effect of urbanisation (tait et al.
2014); (2) including other biological interactions that are crucial for bat species (giles et al. 2016) and that can be transferred to climate change scenarios; and (3) establishing direct links between climatic factors and the levels of hev infection in bats. such an analysis would likely result in smaller areas and populations predicted to be at risk. spillover of diseases from wild to domestic animals and humans comprises several levels of ecological organisation: the first level is the distribution of the reservoir host, followed by the distribution of the causal agent within the reservoir host. by including the additional layer of spillover host distribution as an offset during model fitting, we have modelled the direct effect of climate (taylor et al. 2015) on the biological processes that affect the reservoir and the causal agent and result in hev spillover. therefore, the models represent the underlying risk to any spillover hosts present in the areas predicted to be at risk, given the presence of the reservoir host and the effects of climate on the hev spillover system. the 20% omission threshold indicates that at least 80% of spillover cases could occur within these areas. the precise location and timing of spillover cases will depend on processes that occur at finer scales, such as the fraction of susceptible horses (e.g. unvaccinated) that are effectively exposed (martin et al. 2015, 2017). consequently, the models should only be used to improve understanding of spillover risk, identify areas in which to allocate mitigation resources, and inform research activities. our results suggest that spillover events could increase farther south, and inland, with climate change. the current potential distribution of hev spillover spans farther north, but the absence of reported events there might be due to very low horse density and less disease surveillance.
these potential expansions and additional areas of risk should be assessed in the first instance by monitoring flying fox populations. in northern queensland, the probable replacement of p. conspicillatus by p. alecto suggests that mitigation strategies for hev spillover risk may have to be adapted to cope with this interaction and its uncertain effects.

references
why is the choice of future climate scenarios for species distribution modelling important?
spatial filtering to reduce sampling bias can improve the performance of ecological niche models
climate change and public health policy: translating the science
multivariate analysis for characteristics of heat tolerance in horses in brazil
the impact of the sars epidemic on the utilization of medical services: sars and the fear of sars
nipah encephalitis outbreak in malaysia, clinical features in patients from seremban
nipah virus outbreak in malaysia
climate change effects on pasture systems in south-eastern australia
interdisciplinary approaches to understanding disease emergence: the past, present, and future drivers of nipah virus emergence
assessing minimum contrast parameter estimation for spatial and spatiotemporal log-gaussian cox processes
spatial and spatio-temporal log-gaussian cox processes: extending the geostatistical paradigm
ranking the feeding habitats of grey-headed flying foxes for conservation management
the distribution, abundance and vulnerability to population reduction of a nomadic nectarivore, the grey-headed flying-fox pteropus poliocephalus in new south wales, during a period of resource concentration
routes of hendra virus excretion in naturally-infected flying-foxes: implications for viral transmission and spillover risk
a statistical explanation of maxent for ecologists
advances and limitations of disease biogeography using ecological niche modeling
hendra virus infection dynamics in australian fruit bats
models of eucalypt phenology predict bat population flux (ecology and evolution, 1-16)
roosting behaviour and habitat selection of pteropus giganteus reveal potential links to nipah virus epidemiology
pteropid bats are confirmed as the reservoir hosts of henipaviruses: a comprehensive experimental study of virus transmission
molecular evolution of the sars coronavirus during the course of the sars epidemic in china
very high resolution interpolated climate surfaces for global land areas
climatic influences on the flowering phenology of four eucalypts: a gamlss approach
global trends in emerging infectious diseases
a note on the effect of daily exercise and paddock size on the behaviour of domestic horses (equus caballus)
the ecology of climate change and infectious diseases
fruit bats as reservoirs of ebola virus
bats are natural reservoirs of sars-like coronaviruses
transmission of human infection with nipah virus
climatic suitability influences species specific abundance patterns of australian flying foxes and risk of hendra virus spillover
hendra virus survival does not explain spillover patterns and implicates relatively direct transmission routes from flying foxes to horses
microclimates might limit indirect spillover of the bat borne zoonotic hendra virus
hendra virus spillover is a bimodal system driven by climatic factors
flying foxes cease to function as seed dispersers long before they become extinct
investigation of the climatic and environmental context of hendra virus spillover events 1994-2010
here be dragons: a tool for quantifying novelty due to covariate range and correlation change when projecting species distribution models
overview of the epidemiology of equine influenza in the australian outbreak
constraints on interpretation of ecological niche models by limited environmental ranges on calibration areas
conditions affecting the timing and magnitude of hendra virus shedding across pteropodid bat populations in australia
niche dynamics in space and time
predicting species distributions from small numbers of occurrence records: a test case using cryptic geckos in madagascar
letter to the editor: hendra virus spillover risk in horses: heightened vigilance and precautions being urged this winter
peterson at (2006) ecologic niche modeling and spatial patterns of disease transmission
mapping risk of nipah virus transmission across asia and across bangladesh
ecologic and geographic distribution of filovirus disease
geographic potential for outbreaks of marburg hemorrhagic fever
rethinking receiver operating characteristic analysis applications in ecological niche modeling
maximum entropy modeling of species geographic distributions
mapping the zoonotic niche of ebola virus disease in africa
mapping the zoonotic niche of marburg virus disease in africa
ecological dynamics of emerging bat virus spillover
reproduction and nutritional stress are risk factors for hendra virus infection in little red flying foxes (pteropus scapulatus)
urban habituation, ecological connectivity and epidemic dampening: the emergence of hendra virus from flying foxes (pteropus spp.)
pathways to zoonotic spillover
effect of the ebola-virus-disease epidemic on malaria case management in guinea, 2014: a cross-sectional survey of health facilities
equivalence of maxent and poisson point process models for species distribution modeling in ecology
latitudinal range shifts in australian flying-foxes: a re-evaluation
global biodiversity scenarios for the year 2100
flying-fox species density - a spatial risk factor for hendra virus infection in horses in eastern australia
interpretation of models of fundamental ecological niches and species distributional areas
are flying-foxes coming to town? urbanisation of the spectacled flying-fox
lgcp: an r package for inference with spatial and spatio-temporal log-gaussian cox processes
bayesian inference and data augmentation schemes for spatial, spatiotemporal and multivariate log-gaussian cox processes in r
risk factors for human disease emergence
dry season camps of flying-foxes (pteropus spp.) in kakadu world heritage area, north australia
seasonal habitat use by flying-foxes, pteropus alecto and p. scapulatus (megachiroptera), in monsoonal australia
mapping the risk of nipah virus spillover into human populations in south and southeast asia
bats, civets and the emergence of sars
mathematical models for the control of pests and infectious disease: a survey
niches, models, and climate change: assessing the assumptions and uncertainties
climate change in australian tropical rainforests: an impending environmental catastrophe
managing disease threats to wild animals
host range and emerging and reemerging pathogens
serologic evidence for the presence in pteropus bats of a paramyxovirus related to equine morbillivirus
to calculate r(0) from virus load data, the death rate of productively infected cells is required. this can be readily estimated from treatment data collected during the chronic phase, but is difficult to determine from acute infection data. here, we propose two new models that can reliably estimate the average life span of infected cells from acute-phase data, and apply both methods to experimental data from humanized mice infected with hiv-1. methods: both new models, called the reduced quasi-steady state (rqs) model and the piece-wise targets (pwt) model, are derived by simplification of a standard model for the acute-phase dynamics of target cells, viruses and infected cells. by having only a limited number of parameters, both models allow us to reliably estimate the death rate of productively infected cells. simulated datasets with plausible parameter values are generated with the standard model to compare the performance of the new models with that of the major previous model (i.e., the simple exponential model). finally, we fit models to time course data from hiv-1 infected humanized mice to estimate several important parameters characterizing their acute infection. results and conclusions: the new models provided much better estimates than the previous model because they more precisely capture the de novo infection process. both models describe the acute phase of hiv-1 infected humanized mice reasonably well, and we estimated an average death rate of infected cells of 0.61 and 0.61 per day, an average exponential growth rate of 0.69 and 0.76 per day, and an average basic reproduction number of 2.30 and 2.38 in the rqs model and the pwt model, respectively. these estimates are fairly close to those obtained in humans. in most viral infections, the initial exponential growth phase is followed by a second exponential phase known as contraction.
in hosts exposed to viruses such as influenza and coronaviruses (the causative agents of severe acute respiratory syndrome), the viral load continuously declines during the contraction phase [1, 2]. in contrast, in chronic viral infections, such as human immunodeficiency virus (hiv) and hepatitis c virus (hcv) infections, contraction slows down such that the viral load approaches a steady state, called the virological set point [3, 4]. in both infection types, the expansion and contraction of the viral load have been modeled as single exponential functions, with parameters determined by linear regression of the log transformed data [3, 5-7]. this simple approach is reasonable as long as the conditions, e.g., the availability of target cells or the immune response, hardly change within each phase. using this approach, the initial growth rate, death rate of the infected cells, and the basic reproduction number (i.e., r0) have been estimated, which has improved our understanding of particular virus infections, and has guided medical treatment [7-9]. for example, once the basic reproduction number is estimated, the critical inhibition, 1 − 1/r0, induced by vaccines, or by antiviral drugs, to prevent primary virus infection can be calculated [6]. knowledge of the death rate of infected cells is crucial for properly understanding viral dynamics because the average lifetime is required for calculating the basic reproduction number. in chronic viral infections such as hiv and hcv, the death rate is estimated from large perturbations of the set point viral load data instigated by potent anti-viral therapy [4, 10-13]. shortly after effective treatment, the decay rate of viral load approaches the death rate of productively infected cells.
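the critical inhibition is a one-line computation once r0 is known; a minimal sketch (the example value r0 = 2.5 is illustrative only):

```python
def critical_inhibition(r0):
    """Fraction of new infections that must be blocked (e.g., by
    vaccination or antiviral drugs) to prevent a primary outbreak: 1 - 1/R0."""
    if r0 <= 1.0:
        return 0.0  # the infection cannot take off anyway
    return 1.0 - 1.0 / r0

print(critical_inhibition(2.5))  # → 0.6
```

i.e., for r0 = 2.5 roughly 60% of transmission events must be prevented.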
however, estimating the death rate of infected cells during the acute phase remains a challenging task, and, as a consequence, it is difficult to accurately estimate the basic reproduction number from viral load data during the early stages of viral infection. in addition to calculation of the basic reproduction number, the death rate per se is useful for evaluating the efficacy of vaccine-induced cellular immune responses during the acute phase of virus infection [14-16]. therefore, an improved method for estimating the death rate of infected cells during this phase is urgently required. in this study, we first generated simulated datasets with biologically plausible parameter values, using a population dynamics model of the virus population. the time evolution of target cell densities and viral load were modeled during the acute phase. the datasets describing acute infection were subsequently analyzed by two novel mathematical models to evaluate whether the new models could accurately estimate the known parameters. our proposed models properly described the artificial datasets and delivered better estimates of the parameters and derived indices than conventional models (i.e., simple exponential models). our methods proved especially effective for calculating the death rate of infected cells. we then applied our models to time course data from a human hematopoietic stem cell-transplanted humanized mouse model infected with hiv type-1 (hiv-1) [17-20], to quantify the infection dynamics during the acute phase. to our knowledge, this is the first report quantifying the dynamics of acute hiv-1 infection in humanized mice. finally, we discuss how our approach may be combined with animal experiments. like previous simple exponential models [3, 5-7], our approach is quite general and can be used in several infection models.
the standard model for viral infection consists of three differential equations for target cells, t(t), infected cells, i(t), and viral particles, v(t) [7-9]. since during acute infection the normal production and loss of target cells is much smaller than the loss due to viral infection and/or its side effects [14-16], the standard model can be reduced to t′(t) = −βt(t)v(t) (1), i′(t) = βt(t)v(t) − δi(t) (2), and v′(t) = pi(t) − cv(t) (3), where the parameters β, δ, p and c represent the conventional rate constants for viral infection of target cells, the death rate of infected cells, the virus production rate per infected cell, and the clearance rate of virus particles, respectively. the initial expansion of viral load in this model is well approximated by v(t) ≈ v(0) exp(g0 t) [3, 5-7], with an exponential growth rate, g0 (the malthusian parameter), given by the positive root of the characteristic equation (g0 + δ)(g0 + c) = pβt(0). this model can be simplified further by a quasi-steady state (qss) approximation for the viral particles [7, 21-23]. typical estimated half-lives of viruses (1/c) such as hiv, hcv and hepatitis b virus are of the order of minutes (or hours), whereas those of infected cells (1/δ) in vivo are of the order of days [4, 5, 7-13]. since the clearance rate of viral particles, c, is typically much larger than the death rate, δ, of the infected cells, we can make a qss assumption, v′(t) = 0, and replace eq. (3) by v(t) = pi(t)/c. because we fit viral loads, v(t), rather than the number of infected cells, i(t), we also substitute i(t) = cv(t)/p into eq. (2) to obtain v′(t) = rt(t)v(t) − δv(t) (4), where r = pβ/c is the viral replication rate per target cell, and δ is the death rate of infected cells. eqs. (1) and (4) together form our first model, which we here call the "reduced quasi-steady state" (rqs) model. the rqs model lumps the parameters of the reduced standard model of eqs. (1-3) into five parameters, i.e., β, r, δ, t(0), and v(0). because there is no production of target cells the infection will ultimately be cleared.
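the rqs dynamics (eqs. (1) and (4)) can be sketched numerically with a simple forward-euler loop; this is only an illustration, not the fitting procedure used in the paper, and the initial conditions t(0) = 10^6 cells per ml and v(0) = 1 are assumptions chosen to be consistent with the "typical" hiv-1 parameters and r0 ≈ 2.48 quoted later:

```python
def simulate_rqs(beta=1e-8, p=4000.0, c=23.0, delta=0.7,
                 t0_cells=1e6, v0=1.0, days=21.0, dt=1e-3):
    """Forward-Euler integration of the RQS model:
       T'(t) = -beta*T*V         (eq. 1)
       V'(t) = r*T*V - delta*V   (eq. 4), with r = p*beta/c.
    Returns final target cell density, final viral load, and peak viral load."""
    r = p * beta / c
    T, V = t0_cells, v0
    peak_v = V
    for _ in range(int(days / dt)):
        dT = -beta * T * V
        dV = r * T * V - delta * V
        T += dT * dt
        V += dV * dt
        peak_v = max(peak_v, V)
    return T, V, peak_v

T_end, V_end, peak_v = simulate_rqs()
```

with these values g0 = rt(0) − δ ≈ 1.04 per day and r0 ≈ 2.48, so the viral load peaks before day 21 and the target cells are only partially depleted, as described for the rqs model.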
the five parameters together define several "observables". first, the basic reproduction number is r0 = rt(0)/δ. second, the initial exponential growth rate is g0 ≈ rt(0) − δ (when δ and g0 ≪ c, one can directly calculate the same g0 from the characteristic equation). third, the final level of target cells is given by the epidemiological "final size equation" [24, 25], as the solution of ln f = r0(f − 1), where f = t(∞)/t(0) is the fraction of surviving target cells. in the standard model, and its simplifications outlined above, the dynamics of the target cells are coupled to the density of viral particles because target cells disappear by infection. since target cell densities in the peripheral blood (pb) may also depend on other factors, like inflammation, activation and redistribution, we next write a model where the target cell dynamics are decoupled from the viral dynamics. since, during the acute phase of several virus infections, such as hiv, siv and shiv, the decrease in number of target cells in pb is preceded by an initial flat phase [14-20, 26-28], we propose a phenomenological model for the target cells consisting of an initial flat phase, and a second phase of exponential loss (figure 1). this basically implies that we assume that target cells only become a limiting factor when their density starts to decline. thus, the dynamics of target cells is described as follows: t(t) = t(0) for t ≤ t* (5), and t(t) = t(0) exp(−δ_t(t − t*)) for t > t* (6), where the parameter δ_t represents the daily rate of target cell loss following the initial flat phase, and t* corresponds to the time at which the target cell densities begin to decrease. eqs. (4-6) define our second model, which we here call the "piece-wise targets" (pwt) model. the pwt model has six parameters, i.e., r, δ, t(0), v(0), δ_t, and t*, and because it shares eq. (4) with the rqs model, it has the same definitions for the replication rate, r = pβ/c, the malthusian parameter, g0 = rt(0) − δ, and the reproduction number r0 = rt(0)/δ.
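the final size equation ln f = r0(f − 1) has no closed-form solution for f, but the surviving fraction is easy to obtain by bisection; a minimal sketch:

```python
import math

def surviving_fraction(r0, tol=1e-10):
    """Solve the final size equation ln(f) = r0*(f - 1) for the nontrivial
    root f in (0, 1) by bisection; f is the fraction of target cells that
    survive the acute infection."""
    if r0 <= 1.0:
        return 1.0  # no outbreak, no depletion
    lo, hi = 1e-12, 1.0 - 1e-9  # g(f) = ln f - r0*(f-1): g(lo) < 0, g(hi) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.log(mid) - r0 * (mid - 1.0) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

for the "typical" parameters used later (r0 = 2.48) this gives f ≈ 0.11, i.e., roughly 11% of the target cells survive.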
in contrast to the partial depletion, f, in the rqs model, the target cells will ultimately be completely depleted, i.e., t(∞) → 0, in the pwt model. thanks to the decoupled dynamics of the target cells the pwt model can be solved analytically: v(t) = v(0) exp((rt(0) − δ)t) for t ≤ t* (7), and v(t) = v(t*) exp((rt(0)/δ_t)(1 − exp(−δ_t(t − t*))) − δ(t − t*)) for t > t* (8), where the replication phase (eq. (7)) is identical to the initial phase of the standard model (see the remark above). thus, at the price of one additional parameter, we can generalize the depletion of target cells to mechanisms (e.g., inflammation, activation and redistribution of target cells) other than infection only, and have a model with very similar parameters characterizing the acute viral infection. as a control method, we additionally consider a classical method that has been widely adopted in the earlier literature [3, 5-7]. previously, acute infection data have been quantified using piece-wise linear regression of the log-transformed viral loads before and after the peak in the viral load [3, 5-7]. the ascending and descending slopes of the log viral load roughly correspond to the exponential growth rate, g0, and the death rate, δ, of infected cells, respectively [3, 5-7]. since g0 = rt(0) − δ and r0 = rt(0)/δ, these two slopes suffice to estimate the basic reproduction number, r0 ≈ 1 + g0/δ, and knowing the initial target cell density, t(0), the viral replication rate per target cell can be estimated from the r0, i.e., r = δr0/t(0) [3, 5-7]. here we call this classical model the "piece-wise regression" (pwr) model. it has been realized before that the down-slope will only reflect the death rate of infected cells, δ, if there is hardly any residual infection of target cells during the contraction phase, i.e., if target cells are markedly depleted [3, 5-7].
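the classical pwr estimates amount to two log-linear regressions, one on each side of the viral load peak; a minimal sketch, where the synthetic data are illustrative (exact exponentials with g0 = 1.0 and δ = 0.7 per day):

```python
def pwr_estimates(times, log_v, t0_cells=None):
    """Piece-wise regression: fit the log viral load before and after its
    peak; the up-slope estimates g0, the negated down-slope estimates
    delta, and R0 is approximated as 1 + g0/delta."""
    def slope(ts, ys):
        n = len(ts)
        mt, my = sum(ts) / n, sum(ys) / n
        num = sum((t - mt) * (y - my) for t, y in zip(ts, ys))
        return num / sum((t - mt) ** 2 for t in ts)
    k = max(range(len(log_v)), key=lambda i: log_v[i])  # index of the peak
    g0 = slope(times[:k + 1], log_v[:k + 1])
    delta = -slope(times[k:], log_v[k:])
    est = {"g0": g0, "delta": delta, "r0": 1.0 + g0 / delta}
    if t0_cells is not None:
        est["r"] = delta * est["r0"] / t0_cells  # r = delta*R0/T(0)
    return est

# illustrative data: exact exponential growth at 1.0/day up to day 14,
# then exact exponential decay at 0.7/day
ts = [0, 3, 7, 14, 17, 19, 21]
lv = [1.0 * t if t <= 14 else 14.0 - 0.7 * (t - 14) for t in ts]
est = pwr_estimates(ts, lv)
```

on these noise-free data the sketch recovers g0 = 1.0, δ = 0.7 and r0 ≈ 2.43; on data where target cells are not depleted after the peak, the down-slope underestimates δ, which is exactly the weakness discussed below.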
to estimate the accuracy of the parameters estimated by our two novel models, we created simulated time course data of target cell densities and viral load during the acute phase of viral infection (lasting approximately 21 days [14-20, 26-28]) assuming biologically plausible parameter values. the artificial datasets were generated with the reduced standard population dynamics model of viral infection, i.e., eqs. (1-3), in which the target cell dynamics are coupled to the viral dynamics by the infection term. we added stochastic variation to the "data" generated by this model by adding "observational" noise and/or by varying the parameter values. the log transformed data were perturbed by adding a normally distributed noise variable with zero mean and standard deviation σ (see results). the datasets describing acute infection were subsequently analyzed by the two novel rqs and pwt models, and by the previous pwr model. the dynamics of hiv-1 infection during acute infection were quantified in a human hematopoietic stem cell-transplanted humanized mouse model (nog-hcd34 mice) [17-20]. five humanized mice were infected with the ccr5-tropic hiv-1 (strain ad8) [29], and 100 μl of peripheral blood (pb) was routinely collected under anesthesia through the retro-orbital venous plexus at 0, 3, 7, 14, and 21 days post-infection, as previously described [17-20]. the amount of viral rna in 50 μl of plasma was quantified by rt-pcr (bio medical laboratories, inc.). to estimate target cell densities, the number of memory cd4+ t cells was measured by hematometry and flow cytometry, as previously described [17-20].
briefly, the number of human leukocytes in 10 μl of peripheral blood (pb) was measured in a celltac α mek-6450 hematology analyzer (nihon kohden, co.), and the percentage of memory cd4+ t cells in human cd45+ leukocytes (i.e., cd45+cd3+cd4+cd45ra− cells) was quantified in a facscanto ii (bd biosciences) flow cytometer. in the flow cytometry analyses, apc-conjugated anti-cd4 antibody (rpa-4; biolegend), apc-cy7-conjugated anti-cd3 antibody (hit3a; biolegend), and pe-conjugated anti-cd45 antibody (hi30; biolegend) were used. all protocols involving human subjects were reviewed and approved by the kyoto university institutional review board. informed written consent from the human subjects was obtained in this study. in the methods section we formulate two novel mathematical models describing the target cell densities and the viral load during acute infection. we created artificial data with target cell densities and virus loads during acute infection using the reduced standard model for viral infection (i.e., eqs. (1-3)). the data were generated for one ml of pb with "typical" values of the parameters for hiv-1, i.e., an infection rate β = 10^−8 per cell per day, a virus production rate p = 4000 particles per infected cell per day, a death rate of infected cells δ = 0.7 per day, and a clearance rate of c = 23 per day [7-11]. we study whether our simplified models can describe the in silico data, and whether their (lumped) parameters are identifiable. the major biological observables of this model are the initial viral growth rate, g0, the viral replication rate per target cell, r = pβ/c, the death rate of infected cells, δ, and the basic reproduction number, r0 = rt(0)/δ. first, we created in silico data with "observational" noise by adding proportional random variation to each data point. specifically, we drew random values from a gaussian distribution with zero mean and a standard deviation σ = 0.2, and added these values to the log transformed data.
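the observational-noise step can be sketched as zero-mean gaussian deviates with σ = 0.2 added on the log10 scale (the seed and the clean trajectory below are illustrative assumptions):

```python
import math
import random

def add_log_noise(values, sigma=0.2, seed=0):
    """Perturb measurements with proportional noise by adding
    N(0, sigma^2) deviates on the log10 scale."""
    rng = random.Random(seed)
    return [10.0 ** (math.log10(v) + rng.gauss(0.0, sigma)) for v in values]

clean = [10.0 ** k for k in range(1, 6)]  # illustrative clean viral loads
noisy = add_log_noise(clean)
```

σ = 0.2 on the log10 scale corresponds to multiplicative errors of roughly 10^0.2 ≈ 1.6, i.e., the roughly 60% measurement error mentioned next.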
this seemed natural as we are also fitting the log transformed data, and on a log scale this corresponds to a measurement error of about 60%. the generated datasets were fitted to the numerical solutions of the rqs model (eqs. (1) and (4)), the analytical solution of the pwt model (eqs. (5-8)), and to the previous pwr model. the sum of squared residuals was minimized using the findminimum package of mathematica 9.0, fitting the target cell and viral load data simultaneously. the typical behavior of the models using these best-fit parameter estimates is depicted in figure 2, together with the simulated data. other standard deviations of the parameters yielded similar results (results not shown). the two novel models reasonably describe the acute phase of viral infection. note that target cell densities are partially depleted in the rqs model, and will ultimately approach "0" in the pwt model. second, we created different "patients" by randomly drawing the parameter values from normal distributions centered at their typical values. thus, the infection rate, death rate of infected cells and virus production rate were assumed to be normally distributed as β ~ n(μ_β, σ_β²), δ ~ n(μ_δ, σ_δ²), and p ~ n(μ_p, σ_p²), with μ_β = 10^−8, μ_δ = 0.7, and μ_p = 4000, respectively. the standard deviations were set as σ_β = 10^−9, σ_δ = 0.3, and σ_p = 400. in this way, we obtained a distribution of the basic reproduction number centered around the true value (r0 = 2.48). we then randomly sampled one parameter set of β, δ, p from the distributions, and produced 200 different artificial datasets as explained above. analyzing each dataset with the same three models (rqs, pwt and pwr), we calculated 95% confidence intervals (ci) for g0, r, δ and r0, and investigated whether the 95% ci successfully contained the true values of g0, r, δ and r0 used to create the data set.
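the "different patients" step can be sketched by drawing (β, δ, p) from the stated normal distributions; the redraw-until-positive truncation, the seed, and t(0) = 10^6 are assumptions of this illustration:

```python
import random

def sample_patient(rng, t0_cells=1e6, c=23.0):
    """Draw one (beta, delta, p) parameter set from the stated normal
    distributions (redrawing until positive) and return it together with
    the implied true observables r = p*beta/c, g0 and R0."""
    def positive_gauss(mu, sd):
        while True:
            x = rng.gauss(mu, sd)
            if x > 0.0:
                return x
    beta = positive_gauss(1e-8, 1e-9)
    delta = positive_gauss(0.7, 0.3)
    p = positive_gauss(4000.0, 400.0)
    r = p * beta / c
    return {"beta": beta, "delta": delta, "p": p, "r": r,
            "g0": r * t0_cells - delta, "r0": r * t0_cells / delta}

rng = random.Random(1)
patients = [sample_patient(rng) for _ in range(200)]
r0s = sorted(pt["r0"] for pt in patients)
median_r0 = r0s[len(r0s) // 2]
```

the median of the implied r0 values sits close to the true value of 2.48, while occasional small draws of δ can inflate the plain mean.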
this procedure was repeated for all 200 datasets, and table 1 provides the frequency of datasets for which the 95% ci successfully contained the true values of g0, r, δ and r0, i.e., the coverage probability. although the initial growth rate was well estimated by the previous pwr model, the novel rqs and pwt models estimated the viral replication rate, death rate of infected cells, and basic reproduction number much more accurately than the pwr model. thus, the novel models can more accurately extract information from acute-phase viral infection data. at the beginning of the infection all models are identical as they all predict exponential growth of the virus load. the models differ around the peak because the previous pwr model assumes an exponential contraction after the peak, whereas the new models allow the peak to be formed by the loss of target cells (i.e., by the βt(t)v(t) term in the rqs model, and by the exponential loss of target cells in the pwt model). in both models this loss of target cells continues during the contraction phase. however, if target cells were depleted rapidly such that there would be hardly any infection of target cells during the contraction phase, this difference among the models would vanish, and the contraction phase of the new models would also be dominated by the death rate of infected cells. the level to which the target cells become depleted in the reduced standard model of viral infection (eqs. (1-3)) can be computed with the epidemiological "final size equation" [24, 25] (see the methods section). using the same equation, petravic et al. [16] show that this final size of the target cell level provides a good description of the nadir, t_min, of the target cell density during an acute infection. defining f = t_min/t(0) as the fraction of surviving target cells, we use the final size equation to compute different values of the infection rate β to vary the nadir of the target cells over the interval f ∈ [0.001, 0.200].
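inverting the final size relation gives the β needed for a prescribed nadir: with r0 = pβt(0)/(cδ), ln f = r0(f − 1) yields β = cδ ln f / (p t(0) (f − 1)); a minimal sketch (t(0) = 10^6 per ml is an assumption consistent with r0 ≈ 2.48 at β = 10^−8):

```python
import math

def beta_for_nadir(f, p=4000.0, c=23.0, delta=0.7, t0_cells=1e6):
    """Infection rate beta that makes the surviving fraction of target
    cells equal f, via the final size relation ln f = R0*(f - 1)."""
    r0 = math.log(f) / (f - 1.0)
    return r0 * c * delta / (p * t0_cells)

betas = [beta_for_nadir(f) for f in (0.001, 0.01, 0.05, 0.1, 0.2)]
```

deeper nadirs (smaller f) require larger β, and f = 0.11 recovers the "typical" β ≈ 10^−8 used above.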
doing so we again created different cases, each with a different level of target cell depletion. for simplicity this was done in the absence of noise (which does not affect these results). using the same approach as explained above we fit the "data" generated by these cases with the three models (figure 3). as discussed previously [3, 5-7], the pwr model (orange symbols) fails to correctly estimate the death rate of infected cells in the presence of continued de novo infections, i.e., when many target cells survive; see figure 3a (in which the dashed line denotes the correct death rate at δ = 0.7). conversely, the two novel models (rqs: blue symbols and pwt: green symbols) accurately estimate the death rate even if many target cells survive. note that it is not surprising that the rqs model provides better parameter estimates than the pwt model, because the artificial data were generated by the reduced standard model (eqs. (1-3)), from which the rqs model was derived by a reasonable quasi-steady state assumption. the parameters are accurately estimated by all methods when target cells are severely depleted (i.e., when f approaches 0), reconfirming that the previous pwr model can accurately quantify the death rate of infected cells when the target cells are severely depleted. this, for example, occurs in cxcr4-tropic simian-human immunodeficiency virus (shiv) infections (which deplete naïve and memory cd4 t cells during the acute phase) [14, 16, 26-28]. in cases where target cells are not as strongly depleted, the new models do much better than the pwr model in estimating the death rate (figure 3a), the viral replication rate (figure 3b) and the basic reproduction number (figure 3c). all models perform similarly on estimating the exponential growth rate (figure 3d), because they all assume (implicitly or explicitly) that the number of target cells remains constant during the earliest phases of viral infections.
having established that the novel models outperform the old one, we fitted them to the 21-day time courses of viral loads and target cells observed in five virus-infected humanized mice. from the parameters estimated with the individual mouse datasets (see table 2 for the rqs model and table 3 for the pwt model), we obtained similar estimates of g0, ranging from 0.43 to 1.07 and from 0.48 to 1.06 per day in the rqs and pwt models, respectively. these estimates are not in disagreement with the initial growth rate of hiv-1 in human patients, which has been estimated to be 1.01 ± 0.37 per day [6]. we estimate a death rate of hiv-1 infected cells in humanized mice as δ ranging from 0.30 to 0.76 and from 0.38 to 0.76 per day with the rqs model and the pwt model, respectively (see tables 2 and 3). again, this result is in concordance with estimates of the death rate of infected cells in treated hiv-1 infected patients [10, 11]. we then determined the basic reproduction number r0 of hiv-1 in humanized mice from the individual estimates of t(0), r and δ, obtaining r0 ranging from 1.58 to 3.88 and from 1.77 to 3.26 in the rqs and pwt models, respectively (see tables 2 and 3). (figure 3 caption: as figure 2; only the infection rate β was altered to adjust the target cell depletion. when the other parameters were altered, similar results were obtained (data not shown). the blue, green and orange symbols plot the indices estimated by the rqs, pwt and pwr models, respectively. the black dotted lines depict the true parameter values.) since the mean of r0 corresponds to a predicted target cell nadir of f = 0.12 in the rqs model, the simple exponential model is not expected to perform equally well on these data (see figure 3). the estimated parameter values of each individual mouse are given in tables 2 and 3.
using the best-fit parameter estimates, the behavior of the rqs and pwt models is depicted together with the individual data in figures 4a and b, respectively, which confirms that both models reasonably describe the acute phase of hiv-1 infection in humanized mice. we here propose two novel models to quantify the most important parameters characterizing acute viral infections. both models are major improvements over the previous simple exponential model [3, 5-7] because that model has difficulties estimating the death rate of infected cells when target cells are not depleted after the viral load peak (see figure 3). the novel models use the observed target cell densities when estimating the parameter values, and by using simulated data we have demonstrated that the new models typically outperform the previous model. applying the new models to data obtained in humanized mice shows that the rates at which the virus expands, and at which infected cells die, resemble those measured in humans. the efficacy of vaccines eliciting cytotoxic immune responses [14-16] could be quantified by our new approach by comparing the estimated death rates of infected cells. our novel approach overcomes the difficulty the previous pwr model had with estimating the death rate of infected cells in situations where target cells are not severely depleted. additionally, the choice between the two different models that we propose here can be made based on the final densities of the target cells. if the number of target cells reaches a nadir during acute infection, the rqs model seems most appropriate. if the target cells continue to decrease, the pwt model should be better. thus, whenever one has sufficient time course data from an acute infection, the new models should allow one to estimate the death rate of infected cells, and hence r0, with reasonable accuracy.
indeed, in animal experiments using rhesus macaques [14-16, 26-28], ferrets [30], and mice [17-20, 31-34], both target cell densities and viral loads have been measured during the acute phase of viral infection. for example, the number of uninfected (and infected) cells in the lungs or respiratory tracts of macaques, ferrets and mice that were experimentally infected with influenza could be measured [30, 31, 34]. using the cxcr4-tropic shiv/macaque model, both target cell densities (naïve and memory cd4 t cells) and viral loads from pb have been measured [14, 16, 26-28]. the target cells of simian immunodeficiency virus (siv), or ccr5-tropic shiv infection (memory cd4 t cells expressing ccr5), have been measured from gastrointestinal mucosa samples [15]. thus, there are several infection models that can be analyzed with our new models. in this paper, we developed novel mathematical approaches to estimating parameters from acute viral infection data. we demonstrated that the new models outperform the previous model using simulated data. we quantified the dynamics of acute-phase hiv-1 infections by measuring their time course data in a humanized mouse model. interestingly, we find that the rates at which the virus expands, and at which infected cells die, are similar to those in humans.
time lines of infection and disease in human influenza: a review of volunteer challenge studies clinical progression and viral load in a community outbreak of coronavirus-associated sars pneumonia: a prospective study viral dynamics of acute hiv-1 infection perelson as: hepatitis c viral dynamics in vivo and the antiviral efficacy of interferon-alpha therapy viral dynamics of primary viremia and antiretroviral therapy in simian immunodeficiency virus infection estimation of the initial viral growth rate and basic reproductive number during acute hiv-1 infection virus dynamics modelling viral and immune system dynamics hiv-1 dynamics in vivo: implications for therapy decay characteristics of hiv-1-infected compartments during combination therapy a novel antiviral intervention results in more accurate assessment of human immunodeficiency virus type 1 replication dynamics and t-cell decay in vivo hepatitis c viral kinetics in the era of direct acting antiviral agents and il28b quantification of viral infection dynamics in animal experiments perelson as: influence of peak viral load on the extent of cd4+ t-cell depletion in simian hiv infection estimating the infectivity of ccr5-tropic simian immunodeficiency virus siv(mac251) in the gut estimating the impact of vaccination on acute simian-human immunodeficiency virus/simian immunodeficiency virus infections selective infection of cd4+ effector memory t lymphocytes leads to preferential depletion of memory t lymphocytes in r5 hiv-1-infected humanized nod/scid/il-2rgammanull mice remarkable lethal g-to-a mutations in vif-proficient hiv-1 provirus by individual apobec3 proteins in humanized mice vpu augments the initial burst phase of hiv-1 propagation and downregulates bst2 and cd4 in humanized mice hiv-1 vpr accelerates viral replication during acute infection by exploitation of proliferating cd4+ t cells in vivo some basic properties of immune selection mathematics in population biology effect of synaptic transmission 
on viral fitness in hiv infection the kermack-mckendrick epidemic threshold theorem mathematical epidemiology of infectious diseases: model building, analysis and interpretation macrophage are the principal reservoir and sustain high virus loads in rhesus macaques after the depletion of cd4+ t cells by a highly pathogenic simian immunodeficiency virus/hiv type 1 chimera (shiv): implications for hiv-1 infections of humans early control of highly pathogenic simian immunodeficiency virus/human immunodeficiency virus chimeric virus infections in rhesus monkeys usually results in long-lasting asymptomatic clinical outcomes characterization of less pathogenic infectious molecular clones derived from acute-pathogenic shiv-89.6p stock virus construction and characterization of a stable full-length macrophage-tropic hiv type 1 molecular clone that directs the production of high titers of progeny virions experimental adaptation of an influenza h5 ha confers respiratory droplet transmission to a reassortant h5 ha/h1n1 virus in ferrets in vitro and in vivo characterization of new swine-origin h1n1 influenza viruses hepatitis c virus replication in mice with chimeric human livers quantifying the early immune response and adaptive immune response kinetics in mice infected with influenza a virus the lipid mediator protectin d1 inhibits influenza virus replication and improves severe influenza this research is partly supported by the kyushu university fund (to h.i.); grants-in-aid for young scientists b23790500 (to k. s.) and b25800092 (to s.i.)
key: cord-337897-hkvll3xh authors: yang, zheng rong title: peptide bioinformatics: peptide classification using peptide machines date: 2009 journal: artificial neural networks doi: 10.1007/978-1-60327-101-1_9 sha: doc_id: 337897 cord_uid: hkvll3xh peptides scanned from whole protein sequences are the core information for many peptide bioinformatics research subjects, such as functional site prediction, protein structure identification, and protein function recognition. in these applications, we normally need to assign a peptide to one of the given categories using a computer model; they are therefore referred to as peptide classification applications. among various machine learning approaches, including neural networks, peptide machines have demonstrated excellent performance compared with conventional machine learning approaches in many applications. this chapter discusses the basic concepts of peptide classification, commonly used feature extraction methods, three peptide machines, and some important issues in peptide classification. proteins are the major components for various cell activities, including gene transcription, cell proliferation, and cell differentiation. to implement these activities, proteins function only if they interact with each other. enzyme catalytic activity, acetylation, methylation, glycosylation, and phosphorylation are typical protein functions realized through binding. studying how to recognize functional sites is then a fundamental topic in bioinformatics research.
as proteins function only when they are bound together, the binding sites (functional sites) and their surrounding residues in substrates are the basic components for functional recognition. studying consecutive residues around a functional site within a short region of a protein sequence for functional recognition is then a task of peptide classification that aims to find a proper model mapping these residues to functional status. a peptide classification model can be built using a properly selected learning algorithm, with principles similar to those used in pattern classification systems. the earlier work was to investigate a set of experimentally determined (synthesized) functional peptides to find some conserved amino acids, referred to as motifs. in protease cleavage site prediction, we commonly use peptides with a fixed length. in some cases, we may deal with peptides with variable lengths. for instance, we may have peptides whose lengths vary from one residue to a couple of hundred residues when we try to identify disorder regions (segments) in proteins [5]. figure 9.2 shows two cases, where the shaded regions indicate some disorder segments. the curves represent the probability of disorder for each residue. the dashed lines show 50% of the probability and indicate the boundary between order and disorder. it can be seen that the disorder segments have variable lengths. the importance of accurately identifying disorder segments is related to accurate protein structure analysis. if a sequence contains any disorder segments, x-ray crystallography may fail to give a structure for the sequence. it is then critically important to remove such disordered segments from a sequence before the sequence is submitted for x-ray crystallography experiments. the study of protein secondary structures also falls into the same category as disorder segment identification, where the length of the peptides that constitute secondary structure elements varies [6].
because the basic components in peptides are amino acids, which are nonnumerical, we need a proper feature extraction method to convert peptides to numerical vectors. the second section of this chapter discusses some commonly used feature extraction methods. the third section introduces three peptide machines for peptide classification. the fourth section discusses some practical issues for peptide classification. the final section concludes peptide classification and gives some future research directions. currently, three types of feature extraction methods are commonly used for converting amino acids to numerical vectors: orthogonal vector, frequency estimation, and bio-basis function methods. the orthogonal vector method encodes each amino acid using a 20-bit binary vector with 1 bit assigned a unity and the rest zeros [8]. denoting by s a peptide, a numerical vector generated using the orthogonal vector method is x ∈ b^(20×|s|), where b = {0, 1} and |s| is the length of s. the introduction of this method greatly eased the difficulty of peptide classification in the early days of applying machine learning algorithms like neural networks to peptide classification tasks. however, the method significantly expands the input variables, as each amino acid is represented by 20 inputs [9, 10]. for an application with peptides ten residues long, 200 input variables are needed. figure 9.3 shows such an application using the orthogonal vector method to convert amino acids to numerical inputs. it can be seen that the data significance, expressed as the ratio of the number of data points over the number of model parameters, can be very low. meanwhile, the method may not be able to properly code the biological content in peptides.
the distance (dissimilarity) between any two binary vectors encoded from two different amino acids is always a constant (2 if using the hamming distance, or the square root of 2 if using the euclidean distance), while the similarity (mutation or substitution probability) between any two amino acids varies [11-13]. the other limitation of this method is that it is unable to handle peptides of variable lengths. for instance, it is hard to imagine that this method could be used to implement a model for predicting disorder segments in proteins. (figure 9.3 caption: an example of using the orthogonal vector method for converting a peptide to a numerical vector for feedforward neural networks, where each residue is expanded to 20 inputs. adapted from [14].) frequency estimation is another commonly used feature extraction method in peptide classification. when using single amino acid frequency values as features, we have the conversion s ↦ x ∈ ℝ^20, meaning that a converted vector always has 20 dimensions. however, it has been widely accepted that neighboring residues interact. in this case, di-peptide frequency values may be used as features, leading to the conversion s ↦ x ∈ ℝ^420 (the 20 single plus the 400 di-peptide frequencies). if tri-peptide frequency values are also used, we then have the conversion s ↦ x ∈ ℝ^8420 (adding the 8,000 tri-peptide frequencies). this method has been used for peptide classification with peptides of variable lengths, for instance, secondary structure prediction [15] and disorder segment prediction [5]. one very important issue with this method is the difficulty of handling the large dimensionality: for any application, dealing with a data set with 8,420 features is no easy task. the third method, called the bio-basis function, was developed in 2001 [9, 10]. the introduction of the bio-basis function was based on the understanding that nongapped pairwise homology alignment scores using a mutation matrix are able to quantify peptide similarity statistically.
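the cumulative single/di/tri-peptide frequency encoding described above can be sketched as follows (a minimal illustration; real systems would compute the counts more efficiently):

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def kmer_frequencies(peptide, k):
    """relative frequencies of all 20**k possible k-mers in the peptide."""
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    n = len(peptide) - k + 1
    for i in range(n):
        counts[peptide[i:i + k]] += 1
    return [counts[m] / n for m in kmers]

def frequency_features(peptide, max_k=2):
    """cumulative features: 20 single (k=1), plus 400 di-peptide (k=2),
    plus 8,000 tri-peptide (k=3) frequencies -> 20, 420, or 8,420 values,
    independent of the peptide's length."""
    feats = []
    for k in range(1, max_k + 1):
        feats.extend(kmer_frequencies(peptide, k))
    return feats
```

because the feature count is fixed by the alphabet rather than the sequence, the same encoder works for peptides of any length, which is why this method suits variable-length tasks such as disorder segment prediction.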
figure 9.4 shows two contours where one functional peptide (sknypivq) and one nonfunctional peptide (sdgngmna) are selected as indicator peptides. we use the term indicator peptides because the use of some indicator peptides transforms a training peptide space to an indicator or feature space for modeling. we then calculate the nongapped pairwise homology alignment scores using the dayhoff mutation matrix [12] to obtain the similarities between all the functional/positive peptides and these two indicator peptides, as well as the similarities between all the nonfunctional/negative peptides and these two indicator peptides. it can be seen that positive peptides show larger similarities with the positive indicator peptide (sknypivq) in the left panel, while negative peptides show larger similarities with the negative indicator peptide (sdgngmna) in the right panel. we denote by s a query peptide and by r_i the i th indicator peptide, each having d residues; the bio-basis function is defined as k(s, r_i) = exp(h(s, r_i) / h(r_i, r_i)), where h(s, r_i) = Σ_j m(s_j, r_ij) is the nongapped pairwise homology alignment score, and s_j and r_ij are the j th residues in the query and indicator peptides, respectively. the value of m(s_j, r_ij) can be found from a mutation matrix. figure 9.5 shows the dayhoff mutation matrix [12]. from eq. 1, we can see that k(s, r_i) ≤ k(r_i, r_i), where the equality occurs if and only if s = r_i. the basic principle of the bio-basis function is normalizing nongapped pairwise homology alignment scores. as shown in fig. 9.6, a query peptide (iprs) is aligned with two indicator peptides (kprt and ykae) to produce two nongapped pairwise homology alignment scores, a (Σ_1 = 24 + 56 + 56 + 36 = 172) and b (Σ_2 = 28 + 28 + 24 + 32 = 112), respectively. because a > b, it is believed that the query peptide is more likely to have the same functional status (functional or nonfunctional) as the first indicator peptide.
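the normalized-alignment form of the bio-basis function can be sketched as below. the substitution scores here are hypothetical stand-ins for dayhoff entries over a small residue alphabet, and the published function may carry an extra scaling parameter; this is an illustration of the normalization idea, not the exact published kernel:

```python
import math

# toy substitution scores: hypothetical stand-ins for dayhoff matrix
# entries over a small alphabet (diagonal entries dominate, as in a
# real mutation matrix).
SELF_SCORE = {"I": 5, "P": 6, "R": 6, "S": 2, "K": 5, "T": 3}

def score(a, b):
    """m(a, b): substitution score between two residues (toy values)."""
    return SELF_SCORE[a] if a == b else 1

def alignment_score(s, r):
    """h(s, r): nongapped pairwise homology alignment score."""
    return sum(score(a, b) for a, b in zip(s, r))

def bio_basis(s, r):
    """k(s, r) = exp(h(s, r) / h(r, r)): similarity of a query peptide s
    to an indicator peptide r, normalized by the indicator's self-score."""
    return math.exp(alignment_score(s, r) / alignment_score(r, r))
```

because the score is normalized by the indicator's self-alignment, an exact match always attains the maximum value, and any mismatch lowers the similarity in proportion to the mutation scores.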
after the conversion using the bio-basis function, each peptide s is represented by a numerical vector x ∈ ℝ^m, or a point in an m-dimensional feature space, where m is the number of indicator peptides. note that m = 2 in fig. 9.6. the bio-basis function method has been successfully applied to various peptide classification tasks, for instance, the prediction of trypsin cleavage sites [9], the prediction of hiv cleavage sites [10], the prediction of hepatitis c virus protease cleavage sites [16], the prediction of disorder segments in proteins [7, 17], the prediction of protein phosphorylation sites [18, 19], the prediction of the o-linkage sites in glycoproteins [20], the prediction of signal peptides [21], the prediction of factor xa protease cleavage sites [22], the analysis of mutation patterns of hiv-1 drug resistance [23], the prediction of caspase cleavage sites [24], the prediction of sars-cov protease cleavage sites [25], and t-cell epitope prediction [26]. (figure 9.5 caption: dayhoff matrix. each entry is a mutation probability from one amino acid (in rows) to another amino acid (in columns). reproduced from [14] with permission.) as one class of machine learning approaches, various vector machines are playing important roles in machine learning research and applications. three vector machines, the support vector machine [27, 28], the relevance vector machine [29], and the orthogonal vector machine [30], have already drawn attention in peptide bioinformatics. each studies pattern classification with a specific focus. the support vector machine (svm) searches for data points located on or near the boundaries for maximizing the margin between two classes of data points; these found data points are referred to as the support vectors. the relevance vector machine (rvm), on the other hand, searches for data points as the representative or prototypic data points, referred to as the relevance vectors.
the orthogonal vector machine (ovm) searches for the orthogonal bases on which the data points become mutually independent; the found data points are referred to as the orthogonal vectors. all these machines need numerical inputs. we then see how to embed the bio-basis function into these vector machines to derive peptide machines. the support peptide machine (spm) aims to find a mapping function between an input peptide s and the class membership (functional status). the model is defined as y = f(s, w), where w is the parameter vector, f the mapping function, and y the output corresponding to the desired class membership t ∈ {-1, 1}. note that -1 and 1 represent nonfunctional and functional status, respectively. (figure 9.6 caption: the bio-basis function. as iprs is more similar to kprt than ykae, its similarity with kprt is larger than that with ykae; see the right figure. reproduced from [14] with permission.) with other classification algorithms like neural networks, the distance (error) between y and t is minimized to optimize w. this can lead to a biased hyperplane for discrimination. in fig. 9.7, there are two classes of peptides, a and b. the four open circles of class a and the four filled circles of class b are distributed in balance. with this set of peptides, the true hyperplane separating the two classes of peptides, represented as circles, can be found as in fig. 9.7(a). suppose a shaded large circle belonging to class b as a noise peptide is included, as seen in fig. 9.7(b); the hyperplane (the broken thick line) is biased because the error (distance) between the nine circles and the hyperplane has to be minimized. suppose a shaded circle belonging to class a as a noise peptide is included, as seen in fig. 9.7(c); the hyperplane (the broken thick line) is also biased. with these biased hyperplanes, the novel peptides denoted by the triangles could be misclassified.
in searching for the best hyperplane, the spm finds the set of peptides that are the most difficult training data points to classify. these peptides are referred to as support peptides. in constructing an spm classifier, the support peptides are closest to the hyperplane within the slab formed by two boundaries (fig. 9.7d) and are located on the boundaries of the margin between two classes of peptides. the advantage of using an spm is that the hyperplane is searched for by maximizing this margin. because of this, the spm classifier is robust; hence, it has better generalization performance than neural networks. in fig. 9.7(d), two open circles on the upper boundary and two filled circles on the lower boundary are selected as support peptides. the use of these four circles can form the boundaries of the maximum margin between the two classes of peptides. (figure 9.7 caption: a, a hyperplane formed using conventional classification algorithms for peptides with a balanced distribution. b and c, hyperplanes formed using conventional classification algorithms for peptides without a balanced distribution. d, the hyperplane formed using the spm for peptides without a balanced distribution. the open circles represent class a, the filled circles class b, and the shaded circle class a or b. the thick lines represent the correct hyperplane for discrimination and the broken thick lines the biased hyperplanes. the thin lines represent the margin boundaries. γ indicates the distance between the hyperplane and the boundary formed by the support peptides; the margin is 2γ. reproduced from [14] with permission.) the trained spm classifier is a linear combination of the similarities between an input peptide and the found support peptides. the similarity between an input peptide and the support peptides is quantified by the bio-basis function.
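this weighted combination of bio-basis similarities to the support peptides can be sketched as follows. the support peptides, labels, weights, and scoring function below are hypothetical stand-ins for a model trained with an spm algorithm, not actual trained values:

```python
import math

def bio_basis(s, r, score):
    """k(s, r) = exp(h(s, r) / h(r, r)), with h the nongapped alignment
    score under the supplied residue scoring function."""
    h = lambda x, y: sum(score(a, b) for a, b in zip(x, y))
    return math.exp(h(s, r) / h(r, r))

def spm_predict(s, support_peptides, labels, weights, score):
    """spm-style decision: y = sign(sum_i w_i * t_i * k(s, r_i)) over the
    support peptides r_i with class labels t_i and learned weights w_i."""
    total = sum(w * t * bio_basis(s, r, score)
                for r, t, w in zip(support_peptides, labels, weights))
    return 1 if total >= 0 else -1

# hypothetical "trained" model: the two indicator peptides from the
# chapter's example, unit weights, and a toy match/mismatch scoring
toy_score = lambda a, b: 4 if a == b else 1
support = ["SKNYPIVQ", "SDGNGMNA"]
labels = [1, -1]
weights = [1.0, 1.0]
```

in a real spm, the weights would be determined by margin maximization during training; here they are fixed by hand purely to show the shape of the decision rule.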
the decision is made using the following equation: y = sign{Σ_i w_i t_i k(s, r_i)}, where t_i is the class label of the i th support peptide and w_i the positive parameter acting as a weight for the i th support peptide. the weights are determined by an spm algorithm [31]. the relevance peptide machine (rpm) was proposed based on automatic relevance determination (ard) theory [32]. in the spm, the found support peptides are normally located near the boundary for discrimination between two classes of peptides. however, the relevance peptides found by the rpm are prototypic peptides. this is a unique property of the rpm, as the prototypic peptides found by a learning process are useful for exploring the underlying biology. suppose the model output is defined as a linear combination of similarities, y = k^t w, where k is a similarity vector and w = (w_1, w_2, ..., w_n)^t; the model likelihood function is defined accordingly. using an ard prior in the computation can prevent overfitting of the coefficients by assuming each weight follows a gaussian prior with its own precision, where α = (α_1, α_2, ..., α_n)^t. the bayesian learning method gives a gaussian posterior for the coefficients, whose parameters (the covariance matrix σ, the mean vector u, and the hyperparameters for the weights) can be approximated, and the marginal likelihood can be obtained by integrating out the coefficients. in learning, α can be estimated in an iterative way, and a formal learning process of the rpm is then conducted. the following condition is checked during learning: α_i > θ, where θ is a threshold. if the condition is satisfied, the corresponding expansion coefficient is zeroed. the learning continues until either the change in the expansion coefficients is small or the maximum learning cycle is reached [33]. figure 9.8 shows how the rpm selects prototypic peptides as the relevance peptides, compared with the spm, which selects boundary peptides as the support peptides.
first, we define a linear model using the bio-basis function as y = kw (eq. 16), where k is defined in eq. 10. the orthogonal least squares algorithm is used as a forward selection procedure by the orthogonal peptide machine (opm). at each step, the incremental information content of the system is maximized. we can rewrite the design matrix k as a collection of column vectors (k_1, k_2, ..., k_m), where k_i represents the vector of similarities between all the training peptides and the i th indicator peptide. the opm involves the transformation of the indicator peptides (r_i) to the orthogonal peptides (o_i) to reduce possible information redundancy. the feature matrix k is decomposed into an orthogonal matrix and an upper triangular matrix, k = ou, where the triangular matrix u has ones on the diagonal and the orthogonal matrix satisfies o^t o = h, with h diagonal with elements h_ii. as the space spanned by the set of orthogonal peptides is the same space spanned by the set of indicator peptides, eq. 16 can be rewritten as y = oa, with a = uw. we can define an error model t = oa + e; supposing e ~ n(0, 1), the pseudoinverse method can be used to estimate a, and because h is diagonal, each element of a is a_i = o_i^t t / h_ii. the quantities a and w satisfy the triangular system uw = a. the gram-schmidt or the modified gram-schmidt methods are commonly used to find the orthogonal peptides and then to estimate w [30, 34]. hiv/aids is one of the most lethal and transmissible diseases, with a high mortality rate worldwide. the most effective prevention of hiv infection is to use a vaccine to block virus infection [34]. however, an hiv vaccine is difficult to develop because of the expense and complexity of advancing new candidate vaccines. although some efficient models and integrated hiv vaccine research enterprises have been proposed [36], there is little hope that an hiv vaccine will be developed before 2009 [37]. hiv is a type of retrovirus. it can enter uninfected cells to replicate itself.
inhibiting viral maturation and viral reverse transcription are then the major methods so far for treating hiv-infected patients. two groups of inhibitors have since been developed. one group aims to stop protease cleavage activities and is referred to as the protease inhibitor. the other aims to stop reverse transcriptase cleavage activities and is referred to as the reverse transcriptase inhibitor. however, hiv often develops resistance to the drugs applied. drug-resistant mutants of hiv-1 protease limit the long-term effectiveness of current antiviral therapy [38]. the emergence of drug resistance remains one of the most critical issues in treating hiv-1-infected patients [39]. the genetic reason for drug resistance is the high mutation rate along with a very high replication rate of the virus. these two factors work together, leading to the evolution of drug-resistant variants and consequently resulting in therapy failure. the resistance to hiv-1 protease inhibitors can be analyzed at a molecular level with a genotypic or a phenotypic method [39, 40]. the genotypic method is used to scan a viral genome for mutation patterns indicating potential resistance. as an alternative, the viral activity can be measured in cell culture assays. yet another alternative, phenotypic testing, is based on clinical observations, that is, directly measuring viral replication in the presence of increasing drug concentrations. genotypic assays have been widely used as tools for determining hiv-1 drug resistance and for guiding treatment. the use of such tools is based on the principle that the complex impact of amino acid substitutions in hiv reverse transcriptase or protease on the phenotypic susceptibility or clinical response to the 18 available antiretroviral agents is observable [41]. genotypic assays are used to analyze mutations associated with drug resistance or reduced drug susceptibility.
however, this method is problematic, because various mutations and mutational patterns may lead to drug resistance [ 39 ] , and therefore the method occasionally fails to predict the effects of multiple mutations [ 42 ] . in addition, genotypic assays provide only indirect evidence of drug resistance [ 43 ] . although hiv-1 genotyping is widely accepted for monitoring antiretroviral therapy, how to interpret the mutation pattern associated with drug resistance to make accurate predictions of susceptibility to each antiretroviral drug is still challenging [ 44 ] . phenotypic assays directly measure drug resistance [ 43 ] , where drug resistance can be experimentally evaluated by measuring the ratio of free drug bound to hiv-1 protease molecules. however, this procedure is generally expensive and time consuming [ 39 , 42 ] . to deal with the difficulties encountered in genotypic assay analysis and phenotypic evaluation, a combination of inhibitor flexible docking and molecular dynamics simulations was used to calculate the protein-inhibitor binding energy. from this, an inhibitory constant is calculated for prediction [ 39 ] . later, some statistical models were established for predicting drug resistance. for instance, a statistical model was proposed to analyze the viral and immunologic dynamics of hiv infection, taking into account drug-resistant mutants and therapy outcomes [ 45 ] . the factors causing the increase in cd4 cell count and the decrease in viral load from baseline after six months of haart (highly active antiretroviral therapy) were analyzed. note that cd4 stands for cluster of differentiation 4, which is a molecule expressed on the surface of t helper cells. in addition to the baseline viral load and cd4 cell count, which are known to affect response to therapy, the baseline cd8 cell count and resistance characteristics of detectable strains are shown to improve prediction accuracy. 
note that cd8 stands for cluster of differentiation 8, which is a membrane glycoprotein found primarily on the surface of cytotoxic t cells. a logistic regression model was built for the prediction of the odds of achieving virologic suppression after 24 weeks of haart in 656 antiretroviral-naive patients starting haart according to their week 4 viral load [46]. a regression model was built on a set of 650 matched genotype-phenotype pairs for the prediction of phenotypic drug resistance from genotypes [47]. as it was observed that the range of resistance factors varies considerably among drugs, two simple scoring functions were derived from different sets of predicted phenotypes. the scoring functions were then used for discrimination analysis [48]. in addition, machine learning algorithms were used. decision trees, as a method for mimicking human intelligence, were implemented to predict drug resistance [43]. a fuzzy prediction model was built based on a clinical data set of 231 patients failing highly active antiretroviral therapy (haart) and starting salvage therapy, with baseline resistance genotyping and virological outcomes after three and six months [49]. in the model, a set of rules predicting genotypic resistance was initially derived from an expert and implemented using a fuzzy logic approach. the model was trained using the virological outcomes data set, that is, stanford's hiv drug-resistance database (stanford hivdb). expert systems were also used for this purpose [41]. neural networks, as a type of powerful nonlinear modeler, were utilized to predict hiv drug resistance for two protease inhibitors, indinavir and saquinavir [50]. other than using sequence information for prediction, a structure-based approach was proposed to predict drug resistance [42]. models of wt complexes were first produced from crystal structures.
mutant complexes were then built by amino acid substitutions in the wt complexes, with subsequent energy minimization of the ligand (a small molecule binding to a protein or receptor) and the pr (protease receptor) binding site residues. a computational model was then built based on the calculated energies. the data set studied was obtained from an online relational database, the hiv rt and protease database (http://hivdb.stanford.edu) [51]. the susceptibility of the data to five protease inhibitors was determined: saquinavir (sqv), indinavir (idv), ritonavir (rtv), nelfinavir (nfv), and amprenavir (apv). we obtained 255 genotype-phenotype pairs for each of the protease inhibitors, except for apv, for which we obtained 112 pairs. phenotypic resistance testing measures in vitro viral replication of the wild type and the viral sample in increasing drug concentrations [52]. the resistance factor, defined as the ratio of the 50% inhibitory concentration (ic50) of the respective viral sample to the ic50 of the nonresistant reference strain, reports the level of resistance as the fold change in susceptibility to the drug compared with a fully susceptible reference strain. genotypic resistance testing, in contrast, is done by scanning the viral genome for resistance-associated mutations. the major and minor mutations observed in a resistant hiv protease can be seen in figure 1 in [53]. hiv protease is an aspartyl protease with 99 residues. in this study, we considered major and minor mutation sites, obtaining a window size of 20 residues for each drug. we divided the sample set into two classes by attaching to each peptide a label 1 or -1 for resistant (functional) or susceptible (nonfunctional), respectively. this division depended on whether the resistance factor of a sample exceeded the cutoff value. based on previously published data sets and studies, we assumed the cutoff value for the resistance factor of each sample with respect to each protease inhibitor to be 3.5.
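the labeling rule just described (resistance factor as an ic50 ratio, with a cutoff of 3.5) can be sketched as:

```python
RESISTANCE_CUTOFF = 3.5  # fold-change cutoff assumed in the study

def resistance_factor(ic50_sample, ic50_reference):
    """fold change in susceptibility: ic50 of the viral sample over the
    ic50 of the nonresistant reference strain."""
    return ic50_sample / ic50_reference

def label_sample(ic50_sample, ic50_reference, cutoff=RESISTANCE_CUTOFF):
    """1 = resistant (functional), -1 = susceptible (nonfunctional);
    a sample is labeled resistant only if its resistance factor
    strictly exceeds the cutoff."""
    return 1 if resistance_factor(ic50_sample, ic50_reference) > cutoff else -1
```

the strict inequality reflects the text's wording that the resistance factor must exceed the cutoff; a borderline sample sitting exactly at the cutoff is labeled susceptible.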
we used the matthews correlation coefficient [54], mcc = (tp × tn - fp × fn) / sqrt((tp + fp)(tp + fn)(tn + fp)(tn + fn)), where tp, tn, fp, and fn denote the numbers of true positives, true negatives, false positives, and false negatives, respectively. the larger the matthews correlation coefficient, the better the model fits the data. if the value of the matthews correlation coefficient is 1, it represents a complete correlation. if the value is 0, the prediction is completely random. if the value is negative, the prediction is on the opposite side of the target. the evaluation is carried out based on the simulation of fivefold cross-validation for each drug and each algorithm. figure 9.9 shows the comparison between the vector machines and the peptide machines. it can be seen that the peptide machines outperformed the vector machines. we also constructed neural network models for this task; their performance is far worse than that of the vector machines (data not shown). in building a computer model for a real application, there are two important issues for model validation in addition to model parameter estimation. the first is how to evaluate a model. the second is how to select a model based on the evaluation. there are always some hyperparameters to determine when using neural networks or machine learning algorithms. for instance, the number of hidden neurons in feedforward neural networks is such a hyperparameter, and determining the optimal number of hidden neurons is not an easy task. using the training data for the evaluation will not deliver a meaningful result, as an optimization process can easily overfit the data, leading to poor generalization capability. the generalization capability should measure how well a built model works with novel data. based on this requirement, we then need to consider how to use the available data for proper model validation. there are three commonly used methods for this purpose: cross-validation, resampling, and jackknife. all these methods use the same principle, that the validation data must not be involved in any process of model parameter estimation. this means that the available data must be divided into two parts.
one part is for model parameter estimation, which is commonly referred to as training. the other is for model evaluation (validation) and selection. the difference between these three methods is the strategy used for the division of a given data set. with the resampling method, we normally randomly sample a certain percentage of the data for training and use the rest for validation. such a process is commonly repeated many times. suppose there are n data points and we repeat the sampling process m times. there will be m validation models, each of which has different training and validation data with some possible overlap. note that all these validation models use the same hyperparameters; for instance, they all use h hidden neurons. the parameters of the i th validation model are estimated using the i th training data set with k_i < n data points. the i th validation model is validated on the i th validation data set with n - k_i data points. because we use different training data sets each time, it is expected that the parameters of the i th validation model will differ from those of the j th validation model. the validation performance of the i th validation model is certainly different from that of the j th validation model as well. we denote by ω_i the evaluated performance of the i th validation model. the evaluation statistic of the model with the designed hyperparameters can then follow the mean m and the variance s^2 of ω_1, ..., ω_m. to determine the proper values for the hyperparameters so as to select a proper model, we can vary the values assigned to the hyperparameters. if we have g hyperparameters for selection, the selection takes place in a g-dimensional space, where each grid point is a combination of hyperparameters. suppose we need to determine only the number of hidden neurons; we then have a series of values of m and s^2, and the best model can be selected accordingly (eq. 30). it should be noted that, with the resampling method, some data points may be used multiple times in training or validation.
in cross-validation, we normally randomly divide a data set into m folds. each fold contains distinct data points: if we denote by w_i the set of data points in the i th fold, we have w_i ∩ w_j = ∅ for i ≠ j, meaning that two folds have no elements in common. each time, we select one fold as the validation set, and the remaining m - 1 folds are used as the training set for model parameter estimation. such a process is repeated m times, until each fold has been used for validation once. this means that there are m validation models. again, all these validation models use the same hyperparameters. the i th validation model is trained using all the folds except the i th fold and validated on the i th fold. the parameters of the i th validation model also differ from those of the j th validation model, and the validation performance of different validation models will vary. note that each data point is validated only once. when the data size is not too large, one commonly prefers the jackknife (often called leave-one-out cross-validation) method. in using the jackknife method, we normally pick one data point for validation and use the remaining data points for training. such a process is repeated n times until every data point has been used for validation. this means that there are n validation models for n data points. obviously, all these validation models use the same hyperparameters. the i th validation model is trained using all the data points except the i th data point and validated on the i th data point. equation 31 can be used for evaluation. the best model can be selected through the use of eq. 30. model validation can help us evaluate models and select models with a proper setting of hyperparameters. however, such evaluation values cannot be regarded as the final model evaluation to be delivered to users.
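the fold-splitting logic for cross-validation (and its leave-one-out limit, the jackknife) can be sketched as follows; a real pipeline would additionally fit and score a model on each split:

```python
import random

def kfold_indices(n, m, seed=0):
    """randomly split indices 0..n-1 into m disjoint folds w_1..w_m."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::m] for i in range(m)]

def cross_validation_splits(n, m, seed=0):
    """yield (train, validation) index pairs: fold i validates once while
    the remaining m - 1 folds train the i-th validation model.
    with m == n this degenerates to the jackknife (leave-one-out)."""
    folds = kfold_indices(n, m, seed)
    for i in range(m):
        train = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train, folds[i]
```

note how this differs from resampling: here every data point is validated exactly once, whereas repeated random resampling may use a point several times (or never) for validation.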
the performance evaluated on validation data may not reflect the true performance, because the validation data has been used to tune the hyperparameters. as a consequence, the performance evaluated on validation data commonly overestimates the true model accuracy. based on this understanding, a proper peptide classification methodology must contain a blind test stage. this means that, after model evaluation and model selection, we need to test the "best" model on another data set, one that has never been used for estimating model parameters, tuning model hyperparameters, or selecting models [ 7 ] . a very important issue must be discussed here, especially for peptide classification. many peptide classification tasks deal with functional site prediction. within each protein sequence, there may be a few functional sites, such as protease cleavage sites, protein interaction sites, or protein posttranslational modification sites. based on the substrate size, we can extract a peptide around each functional site; such a peptide is commonly regarded as a functional peptide. to conduct proper peptide classification, we also need a set of nonfunctional peptides, each of which has no functional site at the desired residue(s). we normally use a sliding window with a fixed length to scan a protein sequence from the n-terminal to the c-terminal, one residue at a time, to generate nonfunctional peptides. a commonly used method is to combine both functional and nonfunctional peptides to produce a data set. suppose we use the cross-validation method; the data set is then randomly divided into m folds, and m validation models are built for model evaluation and model selection. now a question arises. when we use such models for testing on novel whole protein sequences, the performance is commonly not as expected. this means that the model is somehow overfitted.
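the sliding-window scan described above can be sketched as follows. this is a hypothetical illustration: the sequence, window width, site position, and function name are ours; the procedure (fixed-length window moved one residue at a time, windows centred on an annotated site becoming functional peptides, all others nonfunctional) follows the text.

```python
def sliding_peptides(sequence, width, functional_sites):
    """scan a protein sequence with a fixed-length window, one residue at
    a time, from the n-terminal to the c-terminal. a window whose central
    residue is an annotated functional site yields a functional peptide;
    every other window yields a nonfunctional peptide.
    `functional_sites` holds 0-based positions of central residues."""
    half = width // 2
    functional, nonfunctional = [], []
    for start in range(len(sequence) - width + 1):
        peptide = sequence[start:start + width]
        centre = start + half
        if centre in functional_sites:
            functional.append(peptide)
        else:
            nonfunctional.append(peptide)
    return functional, nonfunctional

# hypothetical 20-residue sequence with one annotated site at position 9
pos, neg = sliding_peptides("MKTAYIAKQRQISFVKSHFS", 7, {9})
```

combining `pos` and `neg` into one data set and randomly splitting it into folds reproduces exactly the naive scheme whose pitfalls the next paragraph analyses.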
if model parameter estimation follows a proper procedure, the most probable cause of the problem is the data used for training and validation. we mentioned previously that a validation data set must be kept independent of the training data set. when we examine the method described above, we can clearly see two interesting facts. first, a training peptide and a validation peptide may come from the same protein. if a protein has conserved amino acids, the validation peptides generated this way may not be independent of the training peptides. second, a training peptide and a validation peptide may share many identical residues if they are extracted from neighboring sliding windows. based on this analysis, we proposed the use of protein-oriented validation [ 24 ] . suppose we have n proteins; we may divide these n proteins into m folds. we then generate validation peptides from one fold and training peptides from the remaining folds. the validation models are constructed using the training peptides scanned from the sequences of the training proteins and verified on the validation peptides scanned from the sequences of the validation proteins. another important issue when using machine learning approaches for peptide classification is whether a model remains correct forever. the answer is no, as a peptide data set collected at a certain time can be far from complete. based on this understanding, models built on an incomplete set of peptides may not generalize well indefinitely. when new experimentally determined peptides have been collected, the existing models must be updated so that they can continue to perform classification tasks well. in this section, we investigate an interesting peptide classification topic, in which we compare a model built on published peptides with a model built on both published peptides and newly submitted sequences in ncbi (www.ncbi.nih.gov).
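protein-oriented validation can be sketched as below (again an illustration with names of our choosing): the folds are made of whole proteins, and peptides are only scanned afterwards, so no validation peptide can originate from a training protein.

```python
def protein_oriented_folds(proteins, m):
    """divide whole proteins (not peptides) into m folds. training and
    validation peptides are then scanned from disjoint protein sets, which
    removes the dependence between training and validation peptides that
    arises when one protein contributes peptides to both."""
    folds = [proteins[i::m] for i in range(m)]
    for i in range(m):
        valid_proteins = folds[i]
        train_proteins = [p for k, fold in enumerate(folds) if k != i
                          for p in fold]
        yield train_proteins, valid_proteins

proteins = [f"prot{i}" for i in range(8)]   # hypothetical protein identifiers
splits = list(protein_oriented_folds(proteins, 4))
for train_p, valid_p in splits:
    # the protein sets never overlap, by construction
    assert not set(train_p) & set(valid_p)
```

the "protein-oriented leave-one-out" run described later in this section is the special case where m equals the number of proteins.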
we found there is a significant difference between the two. the case studied in this section concerns hepatitis c virus (hcv), a member of the flaviviridae family [ 55 ] and the major agent of parenterally transmitted non-a/non-b hepatitis [ 56 , 57 ] . the nomenclature of schechter and berger [ 1 ] is applied to designate the cleavage sites on the peptides, p6-p5-p4-p3-p2-p1-p1'-p2'-p3'-p4', the asymmetric scissile bond lying between p1 and p1'. two resources are available for modeling. first, research articles have published experimentally determined peptides, both cleaved and noncleaved. second, databanks like ncbi have collected many new submissions, each containing a whole protein sequence with a number of experimentally determined cleavage sites. accordingly, there are two strategies for modeling. if we believe that the published peptides are able to represent the full cleavage specificity of the protease, we can select indicator peptides from these published peptides for modeling; we refer to a model constructed this way as a type-i model. in fact, viruses mutate very rapidly, so the published peptides may not represent the cleavage specificity in the recent submissions to ncbi. we can then select additional indicator peptides, which carry more viral mutation information, from the new sequences downloaded from ncbi to build a more informative model, referred to as a type-ii model. from published papers (data not shown), we collected 215 experimentally determined peptides, referred to as published peptides in this study. among them, 168 are cleaved and 47 are noncleaved. twenty-five whole protein sequences were downloaded from ncbi: aaa65789, aaa72945, aaa79971, aab27127, baa01583, baa02756, baa03581, baa03905, baa08372, baa09073, baa09075, baa88057, bab08107, caa03854, caa43793, cab46677, gnwvtc, jc5620, np_043570, np_671491, p26663, p26664, p27958, pc2219, and s40770.
within these 25 whole protein sequences (data not shown) are 123 cleavage sites; cleavage sites annotated as potential, by similarity, or probable were removed. first, we built type-i models using the leave-one-out method. each model is built on 214 published peptides and tested on the 1 remaining published peptide. the models are then evaluated in terms of the mean accuracy, calculated over the 215 leave-one-out tests. the best model is selected and tested on the 25 new sequences downloaded from ncbi, which are regarded as the independent testing data. the simulation shows that the noncleaved, cleaved, and total accuracies are all 77%. shown in fig. 9.10 are the prediction results on five whole protein sequences, where the horizontal axis indicates the residues in the whole protein sequence and the vertical axis the probability of a positive (cleavage). if the probability at a residue is larger than 0.5, the corresponding residue is regarded as the cleavage site p1. the simulation shows that there were too many false positives: the average false-positive fraction was 27%, i.e., 722 misclassified noncleavage sites. second, we built type-ii models. each model is built and validated using all 215 published peptides plus the peptides scanned from 24 of the newly downloaded sequences. in this run, the protein-oriented leave-one-out method is used: 1 of the 24 new sequences is withheld for validating a model built using the 215 published peptides and the peptides scanned from the remaining 23 new sequences. this is repeated 24 times and the performance is estimated from the predictions on the 24 new sequences. the best model is selected and tested on the peptides from the remaining new sequence, which are regarded as the independent testing peptides. the noncleaved, cleaved, and total accuracies are 99%, 83%, and 99%, respectively. the prediction accuracy is thus greatly improved compared with the type-i models.
shown in fig. 9.11 is a comparison between the type-i and the type-ii models. it shows that the type-ii models greatly outperformed the type-i models in terms of performance on noncleaved peptides. more importantly, the standard deviation of the type-ii models is much smaller, demonstrating high robustness.
[fig. 9.11 caption: a comparison between the type-i and type-ii models. the type-ii models performed much better than the type-i models in terms of the increase of specificity (true-negative fraction); the false-positive fraction (false alarm) has therefore been significantly reduced. tnf and tpf stand for true-negative fraction (specificity) and true-positive fraction (sensitivity). reproduced from [ 58 ] with permission.]
shown in fig. 9.12 are the prediction results on five protein sequences using the type-ii models, where the horizontal axes indicate the residues in the whole protein sequences and the vertical axes the probability of a positive (cleavage).
[fig. 9.12 caption: the probabilities of cleavage sites among the residues for five whole protein sequences for the type-ii models. the numbers above the percentages are the ncbi accession numbers, and the percentages are the false-positive and true-positive fractions. the type-ii models demonstrate very small false-positive fractions compared with the type-i models, and the probability traces of positives are very clean. reproduced from [ 58 ] with permission.]
the simulation shows fewer false positives than the type-i models. the reason is that many of the 25 newly downloaded sequences were published after the reports containing the published peptides, so the published peptides may not contain complete information for all these 25 new sequences.
this chapter introduced the state-of-the-art machines for peptide classification. through the discussion, we can see the difference between the peptide machines and other machine learning approaches. each peptide machine combines the feature extraction process and the model construction process to improve efficiency. because peptide machines use the novel bio-basis function, they have biologically well-coded inputs for building models; this is the reason the peptide machines outperformed the other machine learning approaches. the chapter also discussed some issues related to peptide classification, such as model evaluation, protein-oriented validation, and model lifetime. each of these issues is important for building correct, accurate, and robust peptide classification models. for instance, the blind test can easily be missed by new bioinformatics researchers, protein-oriented validation has not yet received enough attention in bioinformatics, and there is little discussion of model lifetime. nevertheless, the importance of these issues has been evidenced in this chapter. we should note that there is no a priori knowledge about which peptide machine should be used. research into the link between these three peptide machines may provide important insights for building better machines for peptide classification. the core component of peptide machines is a mutation matrix. our earlier work shows that model performance varies with the mutation matrix used [ 10 ] . it is then interesting to see how we can devise a proper learning method to optimize the mutation probabilities between amino acids during model construction.
on the active site of proteases
mapping the active site of papain; specific peptide inhibitors of papain
sequence and structure based prediction of eukaryotic protein phosphorylation sites
the carbohydrates of glycoproteins
machine learning algorithms for protein functional site recognition
intrinsic protein disorder in complete genomes
matching protein beta-sheet partners by feedforward and recurrent neural networks
ronn: use of the bio-basis function neural network technique for the detection of natively disordered regions in proteins
predicting the secondary structure of globular proteins using neural network models
characterising proteolytic cleavage site activity using bio-basis function neural networks
a novel neural network method in mining molecular sequence data
basic local alignment search tool
a model of evolutionary change in proteins. matrices for detecting distant relationships
a structural basis for sequence comparisons-an evaluation of scoring methodologies
application of support vector machines to biology
linear optimization of predictors for secondary structure: application to transbilayer segments of membrane proteins
reduced bio-basis function neural networks for protease cleavage site prediction
predict disordered proteins using bio-basis function neural networks
reduced bio basis function neural network for identification of protein phosphorylation sites: comparison with pattern recognition algorithms
predicting the phosphorylation sites using hidden markov models and machine learning methods
bio-basis function neural networks for the prediction of the o-linkage sites in glyco-proteins
predict signal peptides using bio-basis function neural networks
a bio-basis function neural network for protein peptide cleavage activity characterisation
bio-kernel self-organizing map for hiv drug resistance classification
prediction of caspase cleavage sites using bayesian bio-basis function neural networks
mining sars-cov protease cleavage data using decision trees, a novel method for decisive template searching
predict t-cell epitopes using bio-support vector machines
vapnik v (1995) the nature of statistical learning theory
the kernel trick for distances
sparse bayesian learning and the relevance vector machine
orthogonal least squares learning algorithm for radial basis function networks
bio-support vector machines for computational proteomics
a practical bayesian framework for backpropagation networks
orthogonal kernel machine in prediction of functional sites in proteins
relevance peptide machine for hiv-1 drug resistance prediction
how antibodies block hiv infection: paths to an aids vaccine
the need for a global hiv vaccine enterprise
hiv vaccine still out of our grasp
molecular mechanics analysis of drug-resistant mutants of hiv protease
prediction of hiv-1 protease inhibitor resistance using a protein-inhibitor flexible docking approach
correlation between rules-based interpretation and virtual phenotype interpretation of hiv-1 genotypes for predicting drug resistance in hiv-infected individuals
variable prediction of antiretroviral treatment outcome by different systems for interpreting genotypic human immunodeficiency virus type 1 drug resistance
structure-based phenotyping predicts hiv-1 protease inhibitor resistance
diversity and complexity of hiv-1 drug resistance: a bioinformatics approach to predicting phenotype from genotype
comparison of nine resistance interpretation systems for hiv-1 genotyping
the role of resistance characteristics of viral strains in the prediction of the response to antiretroviral therapy in hiv infection
use of viral load measured after 4 weeks of highly active antiretroviral therapy to predict virologic outcome at 24 weeks for hiv-1-positive individuals
geno2pheno: estimating phenotypic drug resistance from hiv-1 genotypes
characterizing the relationship between hiv-1 genotype and phenotype: prediction-based classification
construction, training and clinical validation of an interpretation system for genotypic hiv-1 drug resistance based on fuzzy rules revised by virological outcomes
predicting hiv drug resistance with neural networks
human immunodeficiency virus reverse transcriptase and protease sequence database
rapid, phenotypic hiv-1 drug sensitivity assay for protease and reverse transcriptase inhibitors
analysis of the protease sequences of hiv-1 infected individuals after indinavir monotherapy
comparison of the predicted and observed secondary structure of t4 phage lysozyme
classification and nomenclature of virus: fifth report of the international committee on taxonomy of viruses
isolation of a cdna clone derived from a blood-borne non-a non-b viral hepatitis genome
an assay for circulating antibodies to a major etiologic virus of human non-a non-b hepatitis
predicting hepatitis c virus protease cleavage sites using generalised linear indicator regression models
key: cord-344417-1seb8b09 authors: wang, yuhang; wang, li; yang, yanjie; lian, tao title: semseq4fd: integrating global semantic relationship and local sequential order to enhance text representation for fake news detection date: 2020-10-03 journal: expert syst appl doi: 10.1016/j.eswa.2020.114090 sha: doc_id: 344417 cord_uid: 1seb8b09
the wide spread of fake news has caused huge losses to both governments and the public. many existing works on fake news detection utilized spreading information like propagators' profiles and the propagation structure. however, such methods face the difficulty of data collection and cannot detect fake news at the early stage. an alternative approach is to detect fake news solely based on its content. early content-based methods rely on manually designed linguistic features. such shallow features are domain-dependent, and cannot easily be generalized to cross-domain data. recently, many natural language processing tasks resort to deep learning methods to learn word, sentence, and document representations.
in this paper, we propose a novel graph-based neural network model named semseq4fd for early fake news detection based on enhanced text representations. in semseq4fd, we model the global pair-wise semantic relations between sentences as a complete graph, and learn the global sentence representations via a graph convolutional network with a self-attention mechanism. considering the importance of local context in conveying the sentence meaning, we employ a 1d convolutional network to learn the local sentence representations. the two representations are combined to form the enhanced sentence representations. then a lstm-based network is used to model the sequence of enhanced sentence representations, yielding the final document representation for fake news detection. experiments conducted on four real-world datasets in english and chinese, including cross-source and cross-domain datasets, demonstrate that our model can outperform the state-of-the-art methods. with the fast advances of internet techniques, online news platforms have become a popular means for people to obtain and share information. however, due to the convenience and low cost of information dissemination, a large amount of fake news has emerged, causing many detrimental effects on politics, economy and social security. for instance, the fake news that barack obama was injured in an explosion wiped out 130 billion dollars in stock value (rapoza, 26 february 2017) . after the outbreak of the covid-19 epidemic, various fake news has been spreading over social media (kouzy, abi jaoude, kraitem, el alam, karam, adib, zarka, traboulsi, akl & baddour, 2020) , causing panic and misunderstanding. some people deliberately imitate real news to create fake news, and for ordinary people it is hard to distinguish fake news from the real ones. therefore, it is vital to develop automatic fake news detection methods.
existing works can be classified into three types: propagation structure-based (liu, yu, wu & wang, 2018; , user information-based , and news content-based methods (zhou, jain, phoha & zafarani, 2019) . propagation structure-based methods extract news transmission features from the social network, which capture how people retweet or reply to the news during its dissemination. user information-based methods focus on the users who participate in the news life cycle, including the people who publish, forward, and comment on the news. a collection of features from user profiles has been utilized, such as description, gender, followers, friends, location and verified type, which is known to be essential information for obtaining credibility features (yang, shu, wang, gu, wu & liu, 2019) . nevertheless, the approaches based on propagation structure and user information are limited by missing data, noisy data, and the difficulty of data collection: researchers must follow the route of fake news transmission and continuously capture the related behaviors. in contrast, using news content is a more straightforward and convenient way of detecting fake news, especially in the early stage, and avoids the need to collect propagators' information and the propagation structure. in this paper, we seek to effectively judge the veracity of news solely based on its content. most content-based methods detect fake news by extracting linguistic features. pérez-rosas, kleinberg, lefevre & mihalcea (2018) employed vocabulary, syntax, and semantic features to distinguish fake news. zhou et al. (2019) proposed a theory-driven early detection model, which extracts features like bow and pos tags at the vocabulary and discourse levels. however, these approaches rely only on feature engineering; they lack high-level representations and do not exploit the content structure.
hence, we adopt the text representation learning approach. to obtain enhanced text representations for fake news detection, we especially take into account the content structure-both global semantic relationship and local sequential order among sentences in a news document. on the one hand, modeling global semantic relations among sentences in the entire document is helpful for fully understanding the news (vaibhav, mandyam & hovy, 2019) . key sentences located far off in the document may have close semantic relations and these sentences convey the main idea together. on the other hand, local sequential order between consecutive sentences also makes a difference. it may have certain logic, such as causal, contrastive, and adversative relations. switching the order may result in different meanings. besides, the global sequential order is also important for expressing the information of entire document. based on the above motivations, we build an end-to-end model named semseq4fd for early fake news detection based on enhanced text representations. we first utilize the lstm network to encode individual sentences in a news document through word vectors constituting them. then, the graph convolutional network with self-attention is exploited to capture global semantic relations among far-off sentences in a news document and stress the importance of different sentences through the attention mechanism. and the 1d convolutional neural network is adopted to model local sequential order between consecutive sentences. the representations learned from these networks are fused to form the enhanced sentence representation. finally, we feed the enhanced sentence representations into the lstm-based network sequentially, and obtain the informative document representation by max-pooling, which is further used for fake news detection. we focus on cross-source and cross-domain datasets to verify the effectiveness of our model. 
we rely only on text content information to achieve early, accurate classification of fake news. the main contributions of our work can be summarized as follows: • we propose a content-based fake news detection model, named semseq4fd, that takes into account both global semantic relationship and local sequential order jointly. we concatenate the features learned by each network to form the enhanced sentence representation. • we introduce the global sequential order and utilize a lstm-based network to generate the document representation from the enhanced sentence representations. • we conduct extensive experiments on four real-world datasets in english and chinese, including cross-source and cross-domain datasets, demonstrating that our model can outperform the state-of-the-art methods. traditional models have poor adaptability in cross-domain tasks. they often evaluate models with test sets that share the same source and domain knowledge as the training set. classifiers trained in this way are data-dependent and easily affected by noise. different media sources have different content generators and focus on diverse topics or fields. in this work, we utilize the structural features to learn generalizable representations of each news article. we show that our proposed model can detect fake news on cross-source and cross-domain datasets. the rest of this paper is organized as follows. in section 2, we briefly review related work on the fake news detection task. section 3 describes the details of our proposed model. in section 4 we conduct a series of experiments on four real-world datasets in english and chinese, containing cross-source and cross-domain datasets, to evaluate the effectiveness of the proposed semseq4fd model. finally, we conclude this paper and shed light on our future work.
most of the previous work used information other than news content to identify fake news, such as reply news or comments associated with the news article (wu, yang & zhu, 2015; ma, gao, mitra, kwon, jansen, wong & cha, 2016; , context information (ma, gao, wei, lu & wong, 2015) , time patterns, sources (shin, jian, driscoll & bar, 2018) , user profiles rath, gao, ma & srivastava, 2017; , and a combination of these features (ruchansky, seo & liu, 2017; vosoughi, mohsenvand & roy, 2017) etc. despite the success of the aforementioned works, the additional data they need undoubtedly increases the difficulty and workload of the fake news detection task. content-based approaches can judge fake news directly without auxiliary information, which is more conducive to discovering fake news early. typical content-based approaches are of two types: linguistic features-based methods and structure features-based methods. linguistic features-based methods generally detect fake news at the word, sentence and document levels, and can be roughly divided into machine learning and deep learning methods. horne & adali (2017) proposed a svm-based method with 3 broad categories of features: stylistic, complexity, and psychological; this study aimed at identifying the stylistic differences in content between fake and real news. similarly, pérez-rosas et al. (2018) extracted handcrafted features from news and built combined feature sets to train a linear svm model. recently, ozbay & alatas (2020) presented a two-step method for fake news detection, which first conducts text mining and then applies twenty-three supervised artificial intelligence classification methods to the news datasets. besides, metaheuristic algorithms can also be used for the fake news detection problem: they are general-purpose solution search methods able to find near-optimal solutions at acceptable cost.
ozbay & alatas (2019) utilized metaheuristic algorithms (such as the grey wolf optimizer and the salp swarm algorithm) as search methods for fake news detection and obtained promising results. the above methods can indeed solve the fake news detection problem to a certain extent, but they require manual pre-processing and hand-designed feature extraction, which are cumbersome and labor-intensive. to improve efficiency and detect fake news automatically, wang (2017) developed a deep learning-based method that exploits cnn and bilstm to detect fake news at the word level. volkova, shaffer, jang & hodas (2017) evaluated news credibility by fusing linguistic cues with news word embeddings via cnn and lstm. these methods only focus on local word features. some researchers have studied fake news recognition at the sentence and document levels based on deep learning. yu, liu, wu, wang & tan (2017) proposed a cnn model to learn high-level features for each group of posts via paragraph embeddings. ahn & jeong (2019) utilized the pre-trained bert model to judge fake news at the sentence level. compared with word-level algorithms, methods based on the sentence and document levels achieve better detection performance. however, there are still several limitations. first, these methods cannot handle the long-distance dependencies of documents: semantically related sentences may not be close to each other in the document, and the existence of long-distance dependency structures makes it difficult for a model to capture the global semantic information of the entire document. second, these algorithms have poor generalizability and are prone to overfitting. fake news spreads through various sources on the network, and articles from different sources and domains have distinct language features and clues. models must classify news without relying on this source-specific information.
the algorithms mentioned above usually cannot adapt to unseen news from diverse sources and domains due to the overfitting problem. structure-based fake news detection methods include tree structures and graph structures. recently, uppal, sachdeva & sharma (2020) created a discourse dependency tree structure to implement automatic deception detection. nevertheless, the graph structure has a stronger ability to express information than other structures. since graph convolutional neural networks (gcn) (kipf & welling, 2017) apply deep neural networks to graph-structured data and perform well, some studies use gcn to formulate fake news detection as a node classification task (wei, xu & mao, 2019) or an entire-graph classification task (vaibhav et al., 2019) . there are two ways to construct the graph, depending on the object being modeled: one uses external information such as the propagation structure and user information, and the other simply uses the text content. most existing studies construct the graph from external data. wei et al. (2019) established a hierarchical multi-task learning framework, which uses the forwarding structure of tweets as the composition relationship among texts and utilizes the stances of tweets to help identify fake news via a graph convolutional neural network. in addition, hu, ding, qi, wang & liao (2019) exploited user profile information as multi-relational links between news items to build graph structures and proposed a multi-depth gcn model that detects fake news by aggregating multi-hop neighbor information. few studies have considered the sentence relations within a news document. vaibhav et al. (2019) proposed a graph neural network model for fake news detection, which models the semantic relations among all pairs of sentences in a piece of news. however, they ignore the sentence order within a document, which makes a difference in comprehending the meaning of each sentence and conveying the main thought of the news.
we consider both global semantic relations between far-off sentences and local sequential order between consecutive sentences for fake news detection based on its content. consider a news document consisting of n sentences S = {s_i}_{i=1}^{n}, where each sentence s_i is a sequence of m_i words, i.e., s_i = {w_1, w_2, …, w_{m_i}}. we define fake news detection as a binary classification problem: each news document is associated with a label y ∈ {0, 1} indicating whether it is fake or not. thus, the fake news detection task can be seen as learning a function f : S ⟶ y. we seek to perform fake news detection based solely on the news content. (1) sentence encoding: each sentence in the news document is encoded using a lstm network, whose inputs are the sequence of word vectors constituting the sentence, yielding the primary sentence representation. (2) sentence representation: the sentence representations are enhanced by taking into account both global semantic relations between far-off sentences and local sequential order between consecutive sentences. the graph convolutional network with self-attention is exploited to capture global semantic relations among far-off sentences, and the 1d convolutional neural network is used to model local sequential order within the local context. the representations learned from the two networks are concatenated to form the enhanced sentence representation. (3) document representation: the document representation is obtained by feeding the enhanced sentence representations into a lstm network one by one and max-pooling the hidden states; the result is finally fed into a softmax classification layer. we use the lstm encoder to map a variable-length sequence of word embeddings into a fixed-length sentence embedding, which takes the word order into account. specifically, consider a sentence s composed of m words. we first obtain word vectors x_1, …, x_m for the words in the sentence, which are fed into the lstm network one by one from x_1 to x_m.
the hidden state at the last step, h_m ∈ R^d, can be treated as the sentence vector, which is given by

s_i = h_m.    (1)

a sequence of fixed-length sentence representations s_1, s_2, …, s_n is generated in this way. in the following, we refer to these primary sentence representations as the primary feature matrix of sentences H^(0) ∈ R^(n×d), where d is the dimension of each sentence vector. in this section, we use two neural networks to enhance the sentence representations. first, we utilize a graph convolutional network with self-attention to learn sentence representations that incorporate global semantic relations among any pair of sentences in the news document. second, we model the local sequential order between consecutive sentences with a 1d convolutional neural network. we define the semantic relevance graph as G = (V, E), where V is the set of graph nodes and E is the set of graph edges. each node in V is represented by a sentence vector. we transform the edge set E into an adjacency matrix A ∈ R^(n×n) of the complete (fully connected) graph, which consists of all ones with zeros on the diagonal. this structure addresses long-distance dependencies between sentences, because every sentence can establish contact with every other sentence. we use gcn (kipf & welling, 2017) on the semantic relevance graph G to model the semantic relations and enhance the sentence representations. with the feature matrix H^(0) ∈ R^(n×d) composed of all sentence feature vectors and the adjacency matrix A ∈ R^(n×n), the graph convolution operation is given by:

H^(l+1) = ReLU(Â H^(l) W^(l)),    (2)

where Â is the normalized adjacency matrix and W^(l) ∈ R^(d×d') is the weight matrix shared across all sentences, with d' the dimensionality of the output node embeddings. this operation learns representations by aggregating the information of neighboring sentences. the output of the highest layer, H^(L) ∈ R^(n×d'), gives the enhanced sentence representations at the L-th layer.
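to make the aggregation step concrete, the following is a minimal pure-python sketch of one graph-convolution layer over the fully connected sentence graph. the symmetric normalization with self-loops and the relu activation follow the cited kipf & welling (2017) formulation; the names `gcn_layer`, `H` and `W` are illustrative, not the paper's code.

```python
import math

def gcn_layer(H, W):
    """One GCN layer on a complete sentence graph: H' = ReLU(A_hat @ H @ W),
    where A_hat is the symmetrically normalized adjacency with self-loops."""
    n = len(H)
    A = [[1.0] * n for _ in range(n)]            # complete graph plus self-loops
    deg = [sum(row) for row in A]                # node degrees (all equal to n here)
    A_hat = [[A[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
             for i in range(n)]
    d_in, d_out = len(W), len(W[0])
    out = []
    for i in range(n):
        # aggregate neighbor features, then apply the shared linear map and ReLU
        agg = [sum(A_hat[i][j] * H[j][t] for j in range(n)) for t in range(d_in)]
        out.append([max(0.0, sum(agg[t] * W[t][c] for t in range(d_in)))
                    for c in range(d_out)])
    return out
```

because the graph is complete, every normalized weight is 1/n, so each output row mixes information from all sentences before the shared linear transform, which is exactly why long-distance sentence dependencies are captured.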
not all sentences contribute equally to the comprehension of fake news; each sentence has a different importance. some sentences only play a transitional role, while others contain the core content. to allow the semseq4fd model to focus on the important sentences of an article, we leverage the self-attention mechanism (vaswani, shazeer, parmar, uszkoreit, jones, gomez, kaiser & polosukhin, 2017) to learn weights measuring the importance of each sentence. formally, the scaled dot-product attention is adopted:

Attention(Q, K, V) = softmax(QKᵀ / √d') V,    (3)

where the attention weights capture the importance of each sentence to the entire document. following the self-attention of (vaswani et al., 2017), we set Q = K = V = H^(L), the output of the graph convolutional network. √d' is a scaling factor that prevents the dot-product values from becoming too large, where d' is the dimension of the sentence representation. the output of the sa-gcn module, G ∈ R^(n×d'), gives the enhanced sentence representations, which fuse information from the other sentences in the document in a weighted manner. we use a sentence-level 1d convolutional neural network (kim, 2014) to capture the local order between consecutive sentences. the input is the primary sentence representation H^(0) generated by the first module. we define a filter W ∈ R^(k×d), where k is the number of sentences in a window, and use d' convolution filters in total. by moving a sliding window of size k from the first to the last sentence and applying all filters to each window, we obtain a feature map C ∈ R^(n×d'), which enhances the representation of each sentence by incorporating the sentences before and after it when k = 3. padding is applied to avoid data loss during convolution; we set it to 1 so that the first and last sentences can also participate in the convolution.
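the two enhancement operations can be sketched in a few lines of pure python. this is a simplified illustration, assuming single-channel dot products and list-of-lists matrices; `self_attention` implements equation (3) with Q = K = V, and `conv1d_over_sentences` slides a k-sentence window with padding so the sequence length is preserved.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(H):
    """Scaled dot-product self-attention with Q = K = V = H (eq. (3))."""
    d = len(H[0])
    out = []
    for q in H:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in H]
        w = softmax(scores)                       # importance of each sentence
        out.append([sum(w[j] * H[j][t] for j in range(len(H)))
                    for t in range(d)])
    return out

def conv1d_over_sentences(H, filters, pad=1):
    """Slide a window of k sentences over the sequence; each filter yields one
    output channel per position. Padding keeps the first/last sentences in play."""
    k = len(filters[0])
    d = len(H[0])
    zero = [0.0] * d
    Hp = [zero] * pad + H + [zero] * pad
    out = []
    for i in range(len(H)):
        window = Hp[i:i + k]
        out.append([sum(f[j][t] * window[j][t]
                        for j in range(k) for t in range(d))
                    for f in filters])
    return out
```

with k = 3 and pad = 1, position i sees sentences i−1, i and i+1, which is precisely the "local context" the text describes.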
we leverage a concatenation operation to integrate the two types of sentence-level representations generated by the above networks, which contain the global semantic relations and the local sequential order respectively. concatenating the attention output G and the convolutional feature map C yields the joint representation U ∈ R^(n×2d') as the enhanced sentence representation:

U = G ⊕ C,

where ⊕ is the concatenation operation. the pseudocode for learning the enhanced sentence representation is described in algorithm 1:

algorithm 1: learning the enhanced sentence representation
  /* graph convolution */
  1 foreach node v ∈ V do
  2     find the neighbor node set N(v)
  3     foreach node u ∈ N(v) do
  4         update the embedding of v by aggregating u via equation (2)
  5 obtain the feature matrix H^(1) = [h_1; h_2; …; h_n] learned by the graph convolutional network
  /* self-attention mechanism */
  6 compute the self-attention value of H^(1) by equation (3), yielding the feature matrix G of sentence representations

after the sentence representation module, we obtain an enhanced representation for each sentence. in this section, we use these enhanced sentence-level representations to generate the document-level representation for fake news classification. since an lstm network is able to learn contextualized representations, it can fully capture the global sequential order of the sentences in the document, so we use an lstm to obtain the document representation. moreover, because the lstm has a forget gate f_t, an input gate i_t and an output gate o_t, it can retain historical information and prevent the loss of valuable information during learning. suppose the network has n time steps. at the t-th time step, the lstm updates its unit state as follows, where W_f, W_i, W_c, W_o ∈ R^(d'×3d') are weight matrices, b_f, b_i, b_c, b_o ∈ R^(d') are bias vectors, σ is the sigmoid function, ⊙ is element-wise multiplication, h_{t−1} ∈ R^(d') is the output of time step t−1 and h_t ∈ R^(d') is the output of the current step. in the whole process, the forget gate first selectively filters out the information stored in the previous state.
second, the input gate decides which new information to update. then the unit state c_t is updated by forgetting historical information and adding the new information c̃_t, replacing the old state value with the new one. finally, the output gate determines how much information is output: the current state output h_t is obtained by filtering c_t through o_t (these three steps correspond to equations (5), (6) and (7), respectively). we combine the hidden vectors produced at each time step and apply a max-pooling layer along each dimension:

d = max-pooling([h_1, h_2, …, h_n]),

where d represents the final representation of the document, which is then fed into a softmax layer to obtain the predicted probability ŷ. we minimize a loss function given by the cross-entropy criterion. to test the effectiveness of our method, we investigate the following research questions: rq1 can our model improve fake news classification performance both cross-source and cross-domain, on english and chinese datasets? rq2 does the way we encode local sequential order information contribute to classification performance? rq3 what is the effect of the lstm, which models global sequential order when learning document-level representations, on fake news detection performance? rq4 can the model show stable prediction ability on small samples? rq5 how does our model perform on the net reclassification improvement metric? we perform fake news detection experiments on four real-world datasets, including two english datasets, lun and sln, and two chinese datasets, weibo and rced. each dataset is a collection of news from news publishers or social media.
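the gate updates and the dimension-wise max-pooling can be sketched as follows. this is a scalar toy version, assuming per-gate weight pairs (w_h, w_x) and biases that are hypothetical illustrative parameters, not the trained ones.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, p):
    """One scalar LSTM step: forget, input and output gates plus cell update."""
    f = sigmoid(p["wf_h"] * h_prev + p["wf_x"] * x + p["bf"])   # forget gate
    i = sigmoid(p["wi_h"] * h_prev + p["wi_x"] * x + p["bi"])   # input gate
    c_tilde = math.tanh(p["wc_h"] * h_prev + p["wc_x"] * x + p["bc"])
    c = f * c_prev + i * c_tilde                                 # new cell state
    o = sigmoid(p["wo_h"] * h_prev + p["wo_x"] * x + p["bo"])   # output gate
    h = o * math.tanh(c)                                         # current output
    return h, c

def max_pool(hidden_states):
    """Dimension-wise max over hidden vectors h_1..h_n: the document vector."""
    return [max(h[t] for h in hidden_states)
            for t in range(len(hidden_states[0]))]
```

saturating the forget gate (f ≈ 1) while zeroing the candidate keeps the old cell state intact, which is the "retain historical information" behavior described above.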
we employ these datasets for the following reasons: (1) news content in the two languages differs greatly in characteristics, expression and rhetorical mode due to differences in social, cultural and cognitive systems, so we need to test the effectiveness of our model on datasets in different languages. (2) fake news detection methods are generally limited to the domain or source they were trained on, and it is difficult for them to generalize to test data whose publication sources and knowledge domains exhibit completely different lexical features. we hope to verify the generalization ability and effectiveness of our model with datasets spanning multiple publication sources and knowledge domains. (3) we want to avert or reduce the data dependence of our model as much as possible, so that it has both theoretical and practical value. the details of the datasets are as follows. our weibo dataset is available from the "internet fake news detection during the epidemic" competition held by the ccf task force on big data 5. this dataset contains 3 kinds of news across 8 domains: health, economy, technology, entertainment, society, military, politics and education. because our goal is to evaluate whether news is true or false, we keep only the real and fake news for the binary classification task. after data cleaning, the dataset consists of 7300 news articles in all, with 3466 labeled real and 3834 labeled fake. table 2 shows the statistics of the weibo dataset. we hope that our model, trained on data from several known domains, can effectively predict fake news from knowledge domains the training set has not seen. to validate this cross-domain adaptability, we design cross-domain experiments on the weibo dataset and refer to it as a cross-domain dataset.
the amount of data varies greatly across fields in the weibo dataset. to demonstrate the cross-domain nature of our model while keeping the amount of data balanced, we use the data-rich and easily accessible domains as training data (health, economy, technology, entertainment, society) and the others as test data (military, politics and education). to verify the effectiveness of our model on both cross-domain and in-domain datasets, we also employ an in-domain dataset named rced 6. to obtain a larger dataset, we merged the two datasets compiled by (song, yang, chen, tu, liu & sun, 2019) and (ma et al., 2016) into our rced dataset. we filter out news that contains few sentences, keeping only documents with more than two sentences. this dataset contains fake and non-fake data from sina weibo and captures both the original messages and all their replies; we use only the original news content. the detailed statistics are shown in table 3. unlike the weibo dataset, rced is not cross-domain; we use it to validate the performance of the models on in-domain short stories. following (vaibhav et al., 2019), in all four datasets we first hold out 10% of the data for testing, then randomly divide the rest into training and validation subsets with proportions of 80% and 20%. for the english sln and lun-test datasets, words are separated by white space and sentences by the english full stop; for the chinese weibo and rced datasets, we segment words with the jieba library and sentences by the chinese full stop. we select instances with more than 2 sentences; shorter news is filtered out. we also report the average and maximum number of sentences per news document for each dataset in table 4.
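the splitting protocol just described (10% held-out test, then an 80/20 train/validation split of the remainder) can be sketched as below; `split_dataset` and the fixed seed are illustrative assumptions rather than the authors' code.

```python
import random

def split_dataset(examples, seed=0):
    """Hold out 10% for test, then split the remainder 80/20 into train/val."""
    rng = random.Random(seed)
    idx = list(range(len(examples)))
    rng.shuffle(idx)
    n_test = round(0.10 * len(examples))
    rest = idx[n_test:]
    n_val = round(0.20 * len(rest))
    test = [examples[i] for i in idx[:n_test]]
    val = [examples[i] for i in rest[:n_val]]
    train = [examples[i] for i in rest[n_val:]]
    return train, val, test
```

note that the validation fraction is 20% of the remaining 90%, i.e., 18% of the full dataset, matching the protocol described above.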
we will show the effectiveness of the model regardless of news length. we compared our proposed model semseq4fd to 7 state-of-the-art baseline models, which can be divided into three categories: machine learning models, non-graph deep learning models and graph-based deep learning models. the machine learning baselines are: • svm (scholkopf & smola, 2001): a support vector machine (svm) with a linear kernel is used to classify fake news based on linguistic features extracted from the news content. the experimental procedure is as follows: we first collect the ngram vocabulary features of each document after pre-processing steps such as stop-word removal and tokenization; we then employ the efficient term frequency-inverse document frequency (tf-idf) representation to obtain term-frequency values over these ngram features (zhang, zhao & lecun, 2015); finally, each document is represented as a vector of tf-idf values and fed into the svm for classification. • logistic regression (kleinbaum, dietz, gail, klein & klein, 2002): a logistic regression model is used to model the article content and detect fake news. we vectorize the document with the same tf-idf method as above, then classify it with a stable and effective logistic regression. hyperparameters: for both svm and logistic regression, the code is written with scikit-learn and the extracted ngram features are unigrams, bigrams and trigrams; the maximum number of features is 500; other values are the scikit-learn defaults. the non-graph deep learning baselines are: • cnn (kim, 2014): a convolutional neural network is used to solve the fake news detection problem.
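a stripped-down sketch of the tf-idf featurization used by these baselines is shown below. it covers unigrams only and uses one simple idf variant, whereas the actual experiments use scikit-learn's implementation with up to trigrams and a 500-feature cap; `tfidf_vectors` is a hypothetical helper name.

```python
import math
from collections import Counter

def tfidf_vectors(tokenized_docs):
    """Map each tokenized document to a vector of tf * idf values."""
    n_docs = len(tokenized_docs)
    df = Counter()
    for doc in tokenized_docs:
        df.update(set(doc))                      # document frequency per term
    vocab = sorted(df)
    idf = {w: math.log(n_docs / df[w]) + 1.0 for w in vocab}
    vectors = []
    for doc in tokenized_docs:
        tf = Counter(doc)                        # raw term frequency
        vectors.append([tf[w] * idf[w] for w in vocab])
    return vocab, vectors
```

terms that occur in every document get idf = 1 and contribute only their raw counts, while rarer terms are up-weighted, which is the word-frequency signal the results section credits for the strong svm/logistic-regression performance.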
we use a 1-d convolution layer with a filter of size 3 over the word embeddings of the whole document, followed by a max-pooling layer and a fully connected layer. • bert (devlin, chang, lee & toutanova, 2019): we use pre-trained google bert models to vectorize each sentence in the document and classify the document with an lstm network over these sentence representations. since each language has its own pre-trained bert model, for the english datasets we use the bert-base-uncased pre-trained model to represent sentences; for the chinese datasets, we first segment words with the jieba tool and then vectorize sentences with the bert-base-chinese pre-trained model before feeding them into the lstm network. hyperparameters: for cnn, the filter size is 3, the word-embedding dimension is 100 and the number of output channels is 100. for bert, the experiments run on a machine with 32 gb of memory; the maximum number of sentences in a document is 50, the dimension of the pre-trained sentence embeddings is 768 and the batch size is 2. the graph-based deep learning baselines are: • gcn (vaibhav et al., 2019): a graph convolutional network (kipf & welling, 2017) applied to fake news detection. we first encode each sentence with an lstm encoder, then apply the fully connected graph convolutional network proposed by (vaibhav et al., 2019) to learn high-level sentence representations. these are converted into a document representation through a max-pooling layer, which is fed into a fully connected layer for prediction. • gat (vaibhav et al., 2019): a graph attention network (veličković, cucurull, casanova, romero, liò & bengio, 2018) applied to fake news detection; the graph attention network operates on the fully connected graph before the max-pooling layer.
• gat2h (vaibhav et al., 2019): a graph attention network with two attention heads (gat2h). we apply gat2h to the same fully connected graph, concatenate the heads' output representations horizontally and feed them into the max-pooling layer. hyperparameters: for gat and gat2h, the slope of leakyrelu is 0.2; by default, the other parameters of gat, gat2h and gcn take the same values as in our semseq4fd method, detailed in section 4.3.1. the experimental environment is: intel i7 2.20 ghz processor, 8 gb memory, gtx-1050 ti gpu. all code is implemented in python 3.6.4; the machine-learning baselines use scikit-learn 0.22.1, while the deep-learning baselines and our model use pytorch 1.1.0. for a fair comparison, all results on all datasets are averaged over several trials. we use the adam optimizer; the learning rate starts at 0.001 and is halved if the validation accuracy does not increase for 3 epochs. the parameters are updated by stochastic gradient descent. table 5 lists the hyperparameters of our proposed model, described as follows: • max sent len: the maximum length of each sentence. • max sents in a doc: the maximum number of sentences in a document. • emb dimension: the dimension of the word embeddings. • hidden dimension: the dimension of the primary sentence embedding. • node emb dimension: the dimension of the enhanced sentence representation. • output dimension: the dimension of the document representation. • k: the filter size of the 1-d cnn in the semseq4fd model. • dropout: the dropout rate, set to prevent the network from overfitting. • batch size: the size of each batch.
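the learning-rate schedule just described (start at 0.001, halve after 3 epochs without validation-accuracy improvement, in the spirit of pytorch's ReduceLROnPlateau) can be sketched as:

```python
def plateau_lr(accuracies, lr0=0.001, patience=3, factor=0.5):
    """Return the learning rate used at each epoch: halve it whenever the
    validation accuracy has not improved for `patience` consecutive epochs."""
    lr, best, bad = lr0, float("-inf"), 0
    lrs = []
    for acc in accuracies:
        if acc > best:
            best, bad = acc, 0         # improvement: reset the patience counter
        else:
            bad += 1
            if bad >= patience:
                lr *= factor           # plateau reached: reduce the rate
                bad = 0
        lrs.append(lr)
    return lrs
```

this is a sketch of the stated policy, not the authors' training loop; the `patience`/`factor` names mirror the pytorch scheduler's parameters.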
• max epochs: the maximum number of training epochs. this paper uses five evaluation metrics to compare the performance of the proposed model: four general metrics (accuracy, precision, recall and f1 score) and one special metric, nri. the ultimate goal of our model is to detect whether a news document is fake or real. for this binary classification problem, the news documents in the test sets can be divided into four groups according to their ground-truth label and the label predicted by a model: • tp (true positive): the number of fake news documents correctly predicted as fake. • fp (false positive): the number of real news documents wrongly predicted as fake. • tn (true negative): the number of real news documents correctly predicted as real. • fn (false negative): the number of fake news documents wrongly predicted as real. with these counts, the accuracy metric describes the proportion of news documents predicted correctly:

accuracy = (tp + tn) / (tp + fp + tn + fn).

the precision metric usually describes the proportion of documents predicted as fake that are actually fake, i.e., tp / (tp + fp). however, this focuses only on the fake documents rather than the whole dataset. for a fair evaluation, we follow previous work (vaibhav et al., 2019) and use macro-averaging: we first compute the precision for the fake documents (p_fake) and the real documents (p_real) separately and then average them as our precision score:

precision = (p_fake + p_real) / 2.

similarly, the recall is the macro-average of the per-class recalls, and the f1 score combines precision and recall:

recall = (r_fake + r_real) / 2,   f1 = 2 · precision · recall / (precision + recall).

special metric: we introduce to the fake news detection task a novel performance evaluation metric named nri, the abbreviation of "net reclassification improvement".
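the four general metrics follow directly from the confusion counts. the sketch below computes them; note that combining the macro precision and macro recall into a single f1, rather than averaging per-class f1 scores, is one common convention and an assumption about the exact formula used here.

```python
def macro_scores(tp, fp, tn, fn):
    """Accuracy plus macro-averaged precision, recall and F1 for the
    fake (positive) and real (negative) classes."""
    p_fake = tp / (tp + fp)            # precision on the fake class
    r_fake = tp / (tp + fn)            # recall on the fake class
    p_real = tn / (tn + fn)            # precision on the real class
    r_real = tn / (tn + fp)            # recall on the real class
    precision = (p_fake + p_real) / 2
    recall = (r_fake + r_real) / 2
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return accuracy, precision, recall, f1
```

macro-averaging weights both classes equally regardless of their sizes, which is why it is fairer than the fake-class-only precision when the label distribution is skewed.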
nri is commonly used in the healthcare domain to quantify how well a new model reclassifies subjects compared to an old model. details of the nri metric and its experimental results are given in section 4.7. to answer rq1, we compare our model against several related models covering machine learning and deep learning methods on the four datasets, as exhibited in table 6 and table 7. from the experimental results we draw the following observations: (1) the results on the four fake news datasets demonstrate that our proposed method is significantly better than all comparison models and detects fake news effectively with high accuracy. we also underline the best result for each metric among the baseline models in both tables. on the english datasets, semseq4fd boosts the f1 score by around 2.315% and 1.373% on average on sln and lun-test; on the chinese datasets, our model obtains improvements of 1.553% and 1.382% on weibo and rced. (2) the models based on graph neural networks, which use graph structure information, clearly improve performance, which illustrates the effectiveness of the graph convolution operation. hence, it is essential to consider the graph structure among article sentences when learning enhanced representations. in addition, svm and logistic regression with tf-idf outperform the cnn model on most datasets, which indicates that word-frequency features can help detect fake news. (3) our model works well in both english and chinese. • for the english datasets, we use lun-test as our cross-source dataset and sln as our cross-domain dataset. both are considered out-of-domain test sets in (vaibhav et al., 2019); in our view, however, lun-test comes from publishers with different writing styles, so we refer to it as a cross-source dataset.
experiments show that our model has good generalization ability on the cross-source and cross-domain english datasets. • for the chinese datasets, to verify that our model can detect fake news across domains, we use weibo as a cross-domain dataset and rced as an in-domain dataset. our model outperforms all baseline models on both, which shows that it has cross-domain characteristics and can accurately predict cross-domain data. this is very helpful in practice, because we can train on news from social, entertainment and other data-rich fields to predict news in fields where large datasets are hard to obtain. we also report results on the validation set split from the weibo training set, on which cnn, gcn and our model achieve accuracies of 0.9749, 0.9764 and 0.9829, respectively. this indicates that, within the same domain, models attain a deceptively high accuracy on the validation set, while on test sets from a different domain they perform poorly. (4) table 4 compares document lengths by the number of sentences per document, and table 6 and table 7 show the performance of our model for different news lengths. clearly, news length strongly affects the classification results and limits the model's ability to enhance structured representations (e.g., compared with the english datasets with long documents, the chinese datasets with short documents show smaller improvements). besides, our model falls slightly below the bert model on just one metric (precision) on each chinese dataset. although the bert baseline works well on several nlp tasks, in our experiments it seems less suited to fake news detection with cross-domain and cross-source datasets.
moreover, bert takes a long time to train and demands high computational power and device memory, so we still believe our model has advantages. we also compare the f1 standard deviation of our model and the graph-based baselines on sln and lun-test; table 8 shows that our model is more stable than these baselines. to further illustrate the effectiveness of encoding local sequential order through the text-cnn and global sequential order through the lstm network when learning document-level representations, we design ablation experiments exploring the two components of semseq4fd. more concretely, we investigate their effects through two variants: • w/o cnn: a variant of semseq4fd that removes the text-cnn and does not consider the local contextual information between consecutive sentences; the enhanced sentence representations are learned by the sa-gcn module alone and fed into the lstm network without the concatenation operation. • w/o lstm: a variant of semseq4fd that excludes the lstm network from the document representation module and does not explicitly model the global sequential order; instead, the document representation is obtained by applying the max-pooling layer directly to the enhanced sentence representations learned by the sentence representation module. the f1 scores on all datasets are shown in figure 2. from these column charts we draw the following conclusions: (1) our proposed model achieves excellent performance on each dataset and outperforms both variants, which indicates that simultaneously considering local sequential order and global sequential information is indispensable for the model.
(2) compared with w/o cnn, w/o lstm declines further on all datasets, which implies that the lstm network plays a vital role in fake news detection: sequentially combining the enhanced sentence representations through the lstm to learn the global sequential order is crucial for our model. in real applications there is not always a large number of samples. to answer rq4, we design experiments exploring the model's adaptability to small samples on sln and lun-test: we train on fractions of the data from 20% to 80% and compare the graph-based baselines with semseq4fd. figure 3 illustrates the variation of the f1 metric from 20% to 80% of the training data on the two datasets. comparing semseq4fd with the common graph-based algorithms gcn, gat and gat2h, as the fraction of training data increases our model shows good stability in predictive ability and outperforms the baselines, whereas the other methods fluctuate up and down. even when the amount of data is small, our model maintains a high f1 value. this shows that the semseq4fd model can capture vital information and fully exploit limited data for the early detection of spreading fake news in realistic settings. the previous experiments directly compare misclassification rates such as accuracy. however, some instances misclassified by a baseline model are corrected by our model, and the general metrics mentioned above (accuracy, precision, etc.) only compare differences between the ground-truth and predicted labels on the overall test set; they are limited in explaining how well one model correctly reclassifies instances compared to another.
thus, to give an intuitive illustration of our model, we further analyze the results on all four datasets using the nri metric (leening, vedder, witteman, pencina & steyerberg, 2014). in the test sets, all instances with a positive label belong to the positive group and all instances with a negative label to the negative group. to explain nri, we introduce the confusion matrix shown in table 9 (the confusion matrix of the nri index), which involves the following quantities: • n_P and n_N: the numbers of instances in the positive and negative groups. • P+: the number of positive instances wrongly classified by the old model but correctly reclassified by the new model. • N+: the number of negative instances wrongly classified by the old model but correctly reclassified by the new model. • P−: the number of positive instances correctly classified by the old model but wrongly reclassified as negative by the new model. • N−: the number of negative instances correctly classified by the old model but wrongly reclassified as positive by the new model. with these quantities, the nri between the old and new models is defined as follows. for the positive group with n_P instances,

nri_P = (P+ − P−) / n_P;

similarly, for the negative group with n_N instances,

nri_N = (N+ − N−) / n_N;

and the additive nri is their sum:

nri = nri_P + nri_N.

the nri metric allows a more comprehensive comparison of the predictive ability of two models than the general metrics. it ranges between −2 and 2: if nri is greater than 0, the new model improves the predictive ability; otherwise, the predictive ability has declined. as an example, take the semseq4fd model and the gat model on the weibo dataset, where we divide the news documents into fake news and real news.
the real news group contains 230 real news documents and the fake news group contains 285 fake news documents. as shown in table 10 (the confusion matrix of the nri index on the weibo dataset), we obtain a confusion matrix for the weibo dataset analogous to the one above. in the fake news group, our model semseq4fd correctly reclassifies 90 fake news documents that had been wrongly classified as real by the gat model, while wrongly reclassifying only 2 fake news documents that the gat model had classified correctly. in the real news group, our model correctly reclassifies 1 real news document that the gat model misclassified, and misclassifies as fake 14 real news documents that the gat model had classified correctly. thus the P+, P−, N+ and N− values of semseq4fd (new model) relative to gat (old model) are 90, 2, 1 and 14, respectively, and the nri of semseq4fd relative to gat is (90 − 2)/285 + (1 − 14)/230 ≈ 0.25; for gat, the nri relative to semseq4fd is −0.25. it is generally accepted that the costs of classification errors differ across categories, and in fake news classification, wrongly classifying a fake case as real is clearly more dangerous. although our semseq4fd model is slightly weaker than the gat model at detecting real news, it works well at detecting fake news and is more useful in practical work than the gat model. we calculate the nri values between each pair of graph-based baseline models, exhibited in figures 4 to 7: a warm color indicates that the model on the y axis classifies fake news better than the model on the x axis, and a cold color indicates that it classifies fake news worse.
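the additive nri can be computed directly from the reclassification counts; plugging in the worked example above (90, 2, 1, 14 with group sizes 285 and 230) reproduces the ≈ 0.25 value.

```python
def additive_nri(p_up, p_down, n_up, n_down, n_pos, n_neg):
    """Additive NRI = (P+ - P-)/n_pos + (N+ - N-)/n_neg, ranging from -2 to 2.
    Positive values mean the new model reclassifies better than the old one."""
    return (p_up - p_down) / n_pos + (n_up - n_down) / n_neg
```

swapping the roles of the two models negates every count difference, which is why the nri of gat relative to semseq4fd is exactly the opposite value.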
we annotate each cell with the nri value in float format (figure 7: nri values between each pair of graph-based models on the rced dataset). compared with the baseline models, our model is better at predicting fake news and can detect fake news more easily and accurately from mixed fake and real news. in this paper, we present an end-to-end model for early fake news detection based solely on content, which learns enhanced text representations from the word level to the sentence level and then to the document level. in particular, the model considers the global semantic relations, the local sequential order and the global sequential order among the sentences of a news document. given a news article, we first construct a complete graph structure to learn the global semantic relations via a graph convolutional network with a self-attention mechanism; we also employ a 1d convolutional network to capture the local sequential order. an lstm network is then utilized to express the global sequential order over the enhanced representations, followed by a classifier that distinguishes fake from real news. the experimental results on four datasets in english and chinese, including cross-source and cross-domain datasets, show that the proposed model outperforms other state-of-the-art methods and demonstrate its stability and superiority. we found that text structure is significant for predicting fake news. in the future, we will develop our research along three lines: (1) more text structure information can be mined and utilized; we can extend our model to discover and make full use of these complex text structures. (2) for text modeling methods, we still need more detailed and in-depth research to improve the representations. (3) generally, complete news has both its text content and
non-textual features like images or videos. multi-modal information also plays a vital role in the fake news detection task, and we will utilize this type of information to address the problem of multi-view fake news recognition.
this work was supported by the national natural science foundation of china (no: 61872260) and the gf innovative research program.
key: cord-004584-bcw90f5b title: abstracts: 8th ebsa european biophysics congress, august 23rd–27th 2011, budapest, hungary date: 2011-08-06 journal: eur biophys j doi: 10.1007/s00249-011-0734-z sha: doc_id: 4584 cord_uid: bcw90f5b nan o-003 structure determination of dynamic macromolecular complexes by single particle cryo-em holger stark max-planck-institute for biophysical chemistry, goettingen, germany macromolecular complexes are at the heart of central regulatory processes of the cell including translation, transcription, splicing, rna processing, silencing, cell cycle regulation and repair of genes. detailed understanding of such processes at a molecular level requires structural insights into large macromolecular assemblies consisting of many components such as proteins, rna and dna. single-particle electron cryomicroscopy is a powerful method for three-dimensional structure determination of macromolecular assemblies involved in these essential cellular processes. it is very often the only available technique to determine the 3d structure because of the challenges in purification of complexes in the amounts and quality required for x-ray crystallographic studies. in recent years it was shown in a number of publications that it is possible to obtain near-atomic resolution structures of large and rigid macromolecules such as icosahedral viruses. due to a number of methodological advances there are now also great prospects for high-resolution single-particle cryo-em studies of large and dynamic macromolecules. successful high-resolution structure determination of dynamic complexes requires new biochemical purification strategies and protocols as well as state-of-the-art electron microscopes and high-performance computing. in the future cryo-em will thus be able to provide structures at near-atomic resolution and information about the dynamic behavior of macromolecules simultaneously.
detection and rapid manipulation of phosphoinositides with engineered molecular tools tamas balla section on molecular signal transduction, program for developmental neuroscience, nichd, nih, bethesda, md 20892, usa polyphosphoinositides (ppis) are ubiquitous lipid regulators of a variety of cellular processes serving as docking sites and conformational switches for a large number of signaling proteins. the localization and dynamic changes in ppis in live cells have been followed with the use of protein domain gfp chimeras. in this presentation we will show experimental systems that allow rapid manipulation of the levels of ppis in specific membrane compartments. we are also actively pursuing strategies that will allow us to map the distribution and possible functional diversity of the phosphatidylinositol (ptdins) pools within intact cells since they are the precursors of ppis. we will show our most recent progress in addressing this question: the use of a ptdins specific plc enzyme isolated from listeria monocytogenes together with a highly sensitive diacylglycerol sensor to determine the distribution and also to alter the level of ptdins in living cells. these studies reveal that a significant metabolically highly active ptdins pool exists associated with tiny mobile structures within the cytoplasm in addition to the known er and pm ptdins pools. we will show our most recent data on the consequences of ptdins depletion within the various ptdins pools on ppi production and on the morphology and functions of various organelles. the functionality of proteins is known to be intimately related to the motion of their constituents on the atomic/molecular level. the study of microscopic motion in complex matter is often reduced to the observation of some average mean square atomic displacement, a first, very partial characterization of the dynamics. 
the marked crossover in the temperature dependence of such quantities in hydrated proteins around 200 k, the so-called ''dynamic transition'', was originally observed a quarter of a century ago. the origin, nature and key characteristics of the atomic motions behind this remarkable evolution of the mean square displacement in proteins have remained controversial over the past decades. recent analyses of mössbauer, dielectric relaxation and neutron scattering spectroscopic data provide unambiguous evidence that this phenomenon is caused by the temperature dependence of a relaxation process spread over several orders of magnitude in the time domain, similar to the β-relaxation process observed in glasses. a review and critical analysis of the available data highlights the inherent ambiguities of commonly used data-fitting approaches, and emerging evidence from model-independent observations tends to exclude some of the proposed mechanisms. microbial rhodopsins: light-gated ion channels and pumps as optogenetic tools in neuro- and cell biology e. bamberg, c. bamann, r.e. dempski, k. feldbauer, s. kleinlogel, u. terpitz, p. wood department of biophysical chemistry, max-planck-institute of biophysics, frankfurt, germany microbial rhodopsins are widely used these days as optogenetic tools in neuro- and cell biology. we were able to show that rhodopsins from the unicellular alga chlamydomonas reinhardtii with the 7-transmembrane-helix motif act as light-gated ion channels, which we named channelrhodopsins (chr1, chr2). together with the light-driven cl− pump halorhodopsin, chr2 is used for the non-invasive manipulation of excitable cells and living animals by light with high temporal resolution and, more importantly, with extremely high spatial resolution. the functional and structural description of this new class of ion channels is given (electrophysiology, noise analysis, flash photolysis and 2d crystallography).
new tools with increased spatial resolution and extremely enhanced light sensitivity in neurons are presented. a perspective for basic neurobiology and for medical applications is given. the cellular response to extracellular signals consists of the induction of specific gene expression patterns and the re-organization in space and time of stereo-specific macromolecular interactions that endow the cell with its specific morphology. we develop quantitative experimental and computational approaches to derive and conceptualize physical principles that underlie these dynamics of signal processing and cellular organization. we have an experimental emphasis on functional microscopic imaging approaches at multiple resolutions to study the localization and dynamics of protein reactions/interactions, maintaining the inherent spatial organization of the cell. we have a strong recursion between computation of molecular dynamics in realistic cell geometries as sampled by microscopy, and experiments that reveal the dynamic properties of networks in living cells. we investigate the cellular topography of activities that transmit signals from receptors at the cell surface. here we ask how spatial partitioning of intracellular signalling activities is achieved by the causality structure of the signalling network, and how this partitioning affects the signal response. this entails the experimental elucidation of connections between reactions and the determination of enzyme kinetic parameters in living cells. o-008 molecular photovoltaics mimic photosynthesis michael grätzel laboratory of photonics and interfaces, institute of chemical science and engineering, station 6, ecole polytechnique fédérale, ch-1015 lausanne, switzerland e-mail: michael.graetzel@epfl.ch the field of photovoltaic cells has been dominated so far by solid-state p-n junction devices made e.g. of crystalline or amorphous silicon, profiting from the experience and material availability of the semiconductor industry.
however, there is an increasing awareness of the possible advantages of devices referred to as ''bulk'' junctions due to their interconnected three-dimensional structure. their embodiment departs completely from the conventional flat p-n junction solid-state cells, replacing them by interpenetrating networks. this lecture focuses on dye-sensitized mesoscopic solar cells (dscs), which have been developed in our laboratory. imitating natural photosynthesis, this cell is the only photovoltaic device that uses a molecular chromophore to generate electric charges from sunlight and that accomplishes the separation of the optical absorption from the charge separation and carrier transport processes. it does so by associating the molecular dye with a film constituted of tiny particles of the white pigment titanium dioxide. the dsc has made phenomenal progress, present conversion efficiencies being over 12 percent for single-junction and 16 percent for tandem cells, rendering the dsc a credible alternative to conventional p-n junction devices. single-molecule imaging and tracking techniques that are applicable to living cells are revolutionizing our understanding of the plasma membrane dynamics, structure, and signal transduction functions. the plasma membrane is considered the quasi-2d non-ideal fluid that is associated with the actin-based membrane-skeleton meshwork, and its functions are likely made possible by the mechanisms based on such a unique dynamic structure, which i call membrane mechanisms. my group is largely responsible for advancing high-speed single-molecule tracking, and based on the observations made by this approach, i propose a hierarchical architecture of three-tiered meso-scale (2-300 nm) domains as fundamental organizing principles of the plasma membrane. the three tiers i propose are the following.
[tier 1] 30-200 nm compartments made by partitioning the entire plasma membrane by the membrane-associated actin-based meshwork (membrane skeleton: fences) and its associated transmembrane proteins (pickets). since the entire plasma membrane is partitioned by these structures, and the membrane skeleton provides important platforms for the molecular interactions and pools, membrane compartments are the most basic tier for the plasma membrane organization. [tier 2] meta-stable 1-5 nm raft domains that can be turned into stable ~10-20-nm domains (receptor-cluster rafts), based on ligand-induced homo-dimers of glycosylphosphatidylinositol (gpi)-anchored receptors (coupling with [tier 3]) and facilitated by raft-lipid interactions. [tier 3] protein complexes of various sizes (3-10 nm) and lifetimes. i will also talk about how domains of tiers 2 and 3 are coupled to the membrane partitioning (tier 1). the concept of the three-tiered domain architecture of the plasma membrane and the cooperative interactions of different tiers provides a good perspective for understanding the mechanisms for signal transduction and many other functions of the plasma membrane. introduction: in the present study we investigate the effects of electromagnetic fields (emf) on the binding of norfloxacin (nrf) to human serum albumin (hsa) by fluorescence, three-dimensional fluorescence and uv-visible spectroscopic approaches. hsa is the most abundant protein in human blood plasma, which works as a carrier that transports different materials in the body. nrf is used to treat a variety of bacterial infections; it works by stopping bacterial growth. methods: hsa, nrf and potassium phosphate buffer were purchased from sigma. a fluorescence spectrofluorometer, a uv-vis spectrophotometer, three-dimensional fluorescence and a home-built emf generator were used. results: results obtained from this study indicated that nrf has a strong ability to quench hsa fluorescence at 280 nm.
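the quenching analysis used in this abstract (a stern-volmer constant k_sv, and the hill-type double-logarithmic relation log((f0 − f)/f) = log k + n·log[q] for the binding constant and number of binding sites) can be sketched in a few lines of stdlib python. the data below are synthetic (n = 1, k = 10⁵ m⁻¹) and the helper name `hill_fit` is ours, not from the abstract; this only illustrates the fitting step.

```python
import math

def hill_fit(concentrations, f0, intensities):
    """Least-squares fit of log((F0-F)/F) vs log[Q]; returns (n, K).

    Slope of the double-log plot gives the number of binding sites n,
    the intercept gives log10 of the binding constant K.
    """
    xs = [math.log10(q) for q in concentrations]
    ys = [math.log10((f0 - f) / f) for f in intensities]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, 10 ** intercept  # n, K

# synthetic 1:1 quenching data: F = F0 / (1 + K[Q]), so (F0-F)/F = K[Q]
f0, k_true = 100.0, 1.0e5
qs = [1e-6, 2e-6, 3e-6, 4e-6]              # quencher concentrations (M)
fs = [f0 / (1.0 + k_true * q) for q in qs]
n, k = hill_fit(qs, f0, fs)
print(round(n, 2), f"{k:.2e}")
```

with measured hsa-nrf intensities in place of the synthetic ones, the same fit yields the binding parameters quoted in the abstract.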
in addition, there was a slight blue shift, which suggested that the microenvironment of the protein became more hydrophobic after addition of nrf. moreover, synchronous fluorescence demonstrated that the hydrophobicity of the microenvironment around tyrosine (tyr) showed a trivial increase. these, and the results for hsa-nrf in the presence of a 1 khz emf, illustrate the same conclusions inferred from the quenching and blue shift. however, there was a significant decrease in the k_sv of nrf with hsa in the presence of emf exposure. moreover, the binding parameters, including the number of binding sites and the binding constant, were calculated from the hill equation. conclusion: it was shown that nrf could induce conformational changes in hsa both in the absence and presence of emf, with no significant difference. yet, the affinity decreases significantly in the presence of emf. the clinical implications are discussed in detail. characterization of the biochemical properties and biological function of the formin homology domains of drosophila we characterised the properties of drosophila melanogaster daam-fh2 and daam-fh1-fh2 fragments and their interactions with actin and profilin by using various biophysical methods and in vivo experiments. the results show that while the daam-fh2 fragment does not have any conspicuous effect on actin assembly in vivo, in cells expressing the daam-fh1-fh2 fragment a profilin-dependent increase in the formation of actin structures is observed. the trachea-specific expression of daam-fh1-fh2 also induces phenotypic effects leading to the collapse of the tracheal tube and lethality in the larval stages. in vitro, both daam fragments catalyze actin nucleation but severely decrease both the elongation and depolymerisation rate of the filaments. profilin acts as a molecular switch in daam function.
daam-fh1-fh2, remaining bound to barbed ends, drives processive assembly of profilin-actin, while daam-fh2 forms an abortive complex with barbed ends that does not support profilin-actin assembly. both daam fragments also bind to the sides of the actin filaments and induce actin bundling. these observations show that the drosophila melanogaster daam formin represents an extreme class of barbed end regulators gated by profilin. electron spin echo studies of free chain-labelled stearic acids interacting with β-lactoglobulin rita guzzi, luigi sportelli, rosa bartucci dipartimento di fisica, università della calabria, 87036 rende (cs), italy β-lactoglobulin (blg) non-covalently binds fatty acids within its central calyx, a cavity in the barrel formed by the strands βa-βh. we present results of pulsed electron paramagnetic resonance (epr) spectroscopy on the interaction of blg with stearic acids spin-labelled at selected positions, n, along the acyl chain (n-sasl, n = 5,7,10,12,16). d2o electron spin echo envelope modulation (eseem) fourier transform spectra indicate that all segments of the bound chains in the protein binding site are accessible to the solvent. the extent of water penetration decreases progressively on moving from the first segments toward the terminal methyl end of the chain. about 50% of the nitroxides in the upper part of the chain (n = 5,7) are h-bonded by a single water molecule and this fraction reduces to 30% at the chain terminus (n = 12,16). a lower fraction of the nitroxides are h-bonded by two water molecules, and it decreases from about 15% to a vanishingly small value on going down the chain. echo-detected ed-epr spectra reveal subnanosecond librational motion of small amplitude for both 5- and 16-sasl in the protein cavity. the temperature dependence of the librations is more marked for 16-sasl and it arises mainly from an increase in librational amplitude with increasing temperature.
fusion peptides (fp) pertaining to the spike glycoprotein from severe acute respiratory syndrome (sars) coronavirus are essential for the fusion between viral and host cellular membranes. here we report a biophysical characterization of the interaction of two putative fps with model membranes. fluorescence and dsc experiments showed that both peptides bind more strongly to anionic than to zwitterionic lipid membranes. esr spectra showed that toac-sars ifp rotational dynamics is modulated by lipid composition and ph as compared to the spectrum of this peptide in solution. however, stearic acid spin labels reported no changes in the dynamic structure of zwitterionic micelles, whereas the whole chain of anionic surfactants was perturbed by the peptides. finally, cd data revealed a predominant β-strand structure for sars fp and an α-helix for sars ifp in the presence of micelles, in contrast to their disordered structures in buffer. overall the results point out that electrostatic and hydrophobic interactions are both important to the energetic behavior of the peptide-membrane interaction. these findings might provide a useful rationale for the elucidation of one of the steps involved in the fusion process, and thus help in understanding the more general mode of action of fps at a molecular level. interaction of filamentous actin and ezrin within surface-modified cylindrical nanopores daniela behn 1,2 and claudia steinem 1 1 institute for organic and biomolecular chemistry, university of göttingen, tammannstraße 2, 37077 göttingen, germany, 2 ezrin is a member of the ezrin-radixin-moesin (erm) protein family that acts as a dynamic linker between the plasma membrane and the actin cytoskeleton and is hence involved in membrane organization, determination of shape and surface structures and other cellular processes.
the protein is highly enriched in microvilli of polarized epithelial cells, where it binds filamentous actin (f-actin) with its c-terminal domain, while the n-terminal domain is connected to the plasma membrane via specific binding to l-α-phosphatidylinositol-4,5-bisphosphate (pip2). nanoporous anodic aluminum oxide (aao) films provide dimensions similar to microvilli and are thus a versatile template to investigate the interaction of ezrin with f-actin within spatially confined areas. owing to their optical transparency, functionalized aaos can be used to measure the binding process of ezrin to a pip2-containing solid-supported membrane by means of time-resolved optical waveguide spectroscopy (ows). confocal laser scanning microscopy (clsm) will elucidate whether f-actin binding to ezrin takes place within or atop the nanopores. furthermore, elasticity mapping of f-actin filaments by means of atomic force microscopy will allow determination of binding forces and the lateral tension of the actin cytoskeleton. in vitro application of porphyrin photosensitisers on mcf7, hela and g361 tumour cell lines binder s., kolarova h., bajgar r., tomankova k., daskova a. department of medical biophysics, faculty of medicine of palacky university, olomouc, czech republic tumour treatment presents a challenge to all scientists and clinicians. contemporary methods like radiotherapy, chemotherapy or surgery have many undesirable side effects. photodynamic therapy (pdt) seems to be one of the alternatives which can be helpful in malignant cell therapy. pdt is not only limited to cancer treatment but is also used as an alternative for cardiovascular, skin and eye disease treatment. pdt employs photosensitive agents which need to be activated by light which is not harmful to a patient. the activated photosensitive agent provokes the formation of reactive oxygen species leading to cell damage or death.
the phototoxicity of the two porphyrin photosensitizers (tmpyp, zntpps4·12h2o) on the malignant cell lines (g361, hela, mcf7) irradiated with a dose of 1 j cm⁻² was evaluated by ros production assay, mtt assay and comet assay. our results indicate a higher efficiency of tmpyp over zntpps4·12h2o. as for the photodynamic effectiveness of the used photosensitizers on the chosen cell lines, we found that the hela cell line is the most sensitive to phototoxic damage induced by tmpyp. p-018 nmr analysis of the respiratory syncytial virus m2-1 protein structure and of its interaction with some of its targets c. sizun 1 the respiratory syncytial virus (rsv) is a major cause of acute respiratory tract infections (bronchiolitis, pneumonia) in humans and a leading cause of viral death in infants and immunocompromised patients. the rsv genome consists of a single non-segmented negative-strand rna whose transcription and replication are ensured by a specific rna-dependent rna polymerase complex formed by the large (l) polymerase subunit and several cofactors. this complex has no cellular counterpart and represents an ideal target for antiviral drugs. among the cofactors, m2-1 acts as an antitermination factor and increases the polymerase processivity. its central domain has been shown, in vitro, to bind the phosphoprotein p and genomic rna in a competitive manner. here we report the nmr structure of this central domain and its interaction with p and rna fragments. m2-1 shares structural similarity with vp30, a transcription factor of ebola virus. the binding surfaces for rna and p are distinct but overlapping. rna binds to a basic cluster located next to residues found to be critical for transcription both in vitro and in vivo by mutational analysis. we speculate that m2-1 might be recruited by p to the transcription complex, where interaction with rna takes place, stabilized by additional elements.
force spectroscopy at the membrane-cytoskeleton interface: interactions between ezrin and filamentous actin julia a. braunger 1,2, ingo mey 1 and claudia steinem 1 1 institute for organic and biomolecular chemistry, georg-august-university of göttingen, tammannstraße 2, 37077 göttingen, germany, 2 ggnb doctoral program: imprs ezrin, a member of the erm (ezrin/radixin/moesin) protein family, provides a regulated linkage between the plasma membrane and the actin cytoskeleton. it contributes to the organization of structurally and functionally distinct cortical domains participating in adhesion, motility and fundamental developmental processes. ezrin is negatively regulated by an intramolecular interaction of the terminal domains that masks the f-actin binding site. a known pathway for activation involves the interaction of ezrin with phosphatidylinositol 4,5-bisphosphate (pip2) in the membrane, followed by phosphorylation of the threonine 567 residue in the c-terminal domain. to date, it is unclear to what extent both regulatory inputs contribute to the activation. we developed an in vitro system that facilitates the specific analysis of the interaction forces between ezrin and f-actin by means of afm-based force spectroscopy. applying ezrin wild type and the pseudophosphorylated mutant protein ezrin t567d, respectively, permits monitoring of the individual influence of phosphorylation on the f-actin-ezrin interaction. thus, a thorough characterization of the acting forces at the ezrin-actin interface will elucidate the activation mechanism of ezrin. to make this delivery system even more efficient, we have constructed a nano-carrier by coating ldl with polyethylene glycol (peg). the hydrophilicity of peg should reduce the interaction of ldl with other serum proteins and consequently decrease the redistribution of loaded drug from ldl to the (lipo)proteins. dynamic light scattering was used for determination of the hydrodynamic radius of ldl-peg particles.
cd spectroscopy measurements did not reveal structural changes of apolipoprotein b-100 (the ligand for ldl receptors on the cell surface) after conjugation of peg with ldl. the interaction of ldl-peg complexes with hypericin (hyp), a natural photosensitizer, was studied by fluorescence spectroscopy. we have demonstrated accumulation of a higher number of hyp molecules in ldl-peg than in ldl particles. however, the kinetics of hyp redistribution from the hyp/ldl-peg complex to free ldl has parameters similar to those for the kinetics of hyp transfer between non-modified ldl molecules. we suggest that hyp molecules are mostly localized in the vicinity of the surface of the ldl-peg particles and are prone to redistribution to other serum proteins. grant support: lpp-0072-07, vega-0164-09. modification of the head-group of aminophospholipids by glycation and subsequent lipid oxidation affects membrane structure, causing cell death. 1 these processes are involved in the pathogenesis of aging 2 and diabetes. 3 non-enzymatic glycation forms in the first step a schiff base (sb), which rearranges to a more stable ketoamine (amadori product), which leads to the formation of a heterogeneous group of compounds (ages). although several studies have been focused on the identification of aminophospholipid glycation products, 4 less attention has been paid to the kinetic mechanism of the reaction. for that reason, in the present work, we compare the kinetic reactivity of the polar head-groups of phosphatidylethanolamine (pe) and phosphatidylserine (ps), the two target phospholipid components of mammalian cell membranes. the reaction of the pe and ps head-groups with glycating compounds (glucose and arabinose) was studied under physiological conditions by using nmr spectroscopy. the obtained formation rate constants for sb are lower than those determined for the sb of the peptide ac-phe-lys with the same carbohydrates. 5 this suggests that the phosphate group may delay the glycation process.
moreover, the ps head-group has a carboxylic group in its structure, which affects the stability of the sb. we developed ultrasensitive, elisa-like nano-immunoassays suitable for proteomics/interactomics studies in low sample volumes. we exploit the approach of dna microarray technologies applied to proteomics [1], in combination with atomic force microscopy (afm), to generate functional protein nanoarrays: semisynthetic dna-protein conjugates are immobilized by bioaffinity within a nanoarray of complementary ssdna oligomers produced by afm nanografting (ng). a nanoarray of different antibodies or synthetic molecular binders can be generated in a single operation, once the dna nanoarray is produced. moreover, ng allows adjusting the packing density of immobilized biomolecules to achieve optimum bio-recognition. afm-based immunoassays with these nanoarrays were shown to achieve a detection limit of hundreds of femtomolar, in few-nanoliter volumes, with very high selectivity and specificity [2]. to determine the hybridization efficiency of our devices, we ran a combined experimental-computational study that provides quantitative relations for recovering the surface probe density from the mechanical response (afm compressibility measurements) of the sample. nucleoside analogues used as anticancer drugs can be rapidly degraded within treated cells, constituting a major obstacle to their therapeutic efficiency. among the enzymes responsible for this degradation, the cytosolic 5'-nucleotidase ii (cn-ii) catalyses the hydrolysis of some nucleoside monophosphates. in order to improve the efficacy of anticancer drugs and to define the precise role of cn-ii, new original inhibitors have been developed against cn-ii. virtual screening of chemical libraries on the crystal structure has allowed us to identify very promising candidates that turned out to be competitive inhibitors of cn-ii.
one molecule was included in the anticancer treatment of tumoral cell lines in order to evaluate its potential benefit, and could ultimately induce sensitization to certain anticancer drugs. we also explore other inhibitors targeting the allosteric sites of this enzyme, using a strategy that takes into account the dynamics of cn-ii. the chemical structures of the newly identified allosteric inhibitors as well as the atomic interactions with enzyme residues will be presented. the final goal of this study is to find molecules that can freeze the enzyme in a conformation in which its dynamics, and therefore its function, is severely limited. native mass spectrometry to decipher interactions between biomolecules sarah cianferani laboratoire de spectrométrie de masse bio-organique, université de strasbourg, iphc, 25 rue becquerel 67087 strasbourg, france. cnrs, umr7178, 67037 strasbourg, france mass spectrometry is generally understood as ''molecular mass spectrometry'', with multiple applications in biology (protein identification using proteomic approaches, recombinant protein and monoclonal antibody characterization). an original and unexpected application of mass spectrometry emerged some twenty years ago: the detection and characterization of intact biological noncovalent complexes. with recent instrumental improvements, this approach, called native ms, is now fully integrated in structural biology programs as a complementary technique to more classical biophysical approaches (nmr, crystallography, calorimetry, spr, fluorescence, etc.). native ms provides high-content information for the characterization of multiprotein complexes, including the determination of binding stoichiometries or oligomerization states, site-specificities and relative affinities.
recent developments of ion mobility/mass spectrometry instruments (im-ms) provide an additional level for ms-based structural characterization of biomolecular assemblies, allowing size and shape information to be obtained through collisional cross section measurements. these different aspects of native ms for structural characterization of biomolecular assemblies will be illustrated through several examples, ranging from multiprotein complexes to protein/nucleic acid assemblies. complex coacervation is a process which may result from electrostatic interaction between charged polysaccharides. it depends essentially on ph, ionic strength and biopolymer properties such as ratio, concentration and charge density. in this work, the structural properties of a colloidal system of oppositely charged biopolymers, chitosan and gum arabic, were studied by atomic force microscopy (afm). according to the afm micrographs, some of the complexes show a tendency to agglomerate, depending on the molar ratio of the macromolecules and their relative molecular weights. the micrographs also show that the formation of irregular aggregates by both polymers was due to the presence of non-charged polar monomers in the chitosan molecule. at higher gum-arabic/chitosan ratios and biopolymer concentrations, coacervates appear as a core-shell micellar structure composed of a hydrophobic core (charge-neutralized segments) stabilized by the excess component (positive zeta potential) and non-charged segments of gum arabic. interaction of human serum albumin with rutin theoretical and experimental approaches ícaro p caruso 1 human serum albumin (hsa) is the principal extracellular protein, with a high concentration in blood plasma, and a carrier for many drugs to different molecular targets. flavonoids are a large class of naturally occurring polyphenols widely distributed in plants. rutin (quercetin-3-rutinoside) is the glycoside formed from the flavonoid quercetin and the disaccharide rutinose.
Like other flavonoids, rutin displays anti-inflammatory and anti-oxidant properties. The interaction between HSA and rutin was investigated by fluorescence spectroscopy, ab initio and molecular modeling calculations. Fluorescence titration was performed by keeping the HSA concentration (4 μM) constant and stoichiometrically varying the rutin concentration (1-4 μM). The emission spectra were obtained in the range of 305 to 500 nm, with the excitation wavelength at 295 nm. The obtained fluorescence data were corrected for background fluorescence and for inner filter effects. The Stern-Volmer quenching constant values were 3.722 × 10⁵ and 1.868 × 10⁵ M⁻¹ at 298 and 303 K, respectively. From the modified Stern-Volmer association constants, 2.285 × 10⁵ (at 298 K) and 2.081 × 10⁵ M⁻¹ (at 303 K), the thermodynamic parameters ΔH = −14.048 kJ mol⁻¹, ΔG(298 K) = −30.557 kJ mol⁻¹, ΔG(303 K) = −30.834 kJ mol⁻¹ and ΔS = 55.4 J mol⁻¹ K⁻¹ were calculated. The fluorescence quenching method was also used to study the binding equilibria, determining the numbers of binding sites, 1.085 and 1.028, and binding constants, 1.094 × 10⁶ M⁻¹ and 0.255 × 10⁶ M⁻¹, at 298 and 303 K, respectively. The efficient quenching of the Trp214 fluorescence by rutin indicates that the binding site for the flavonoid is situated within subdomain IIA of HSA. The distance r = 2.397 nm between the donor (HSA) and the acceptor (rutin) was obtained according to fluorescence resonance energy transfer (FRET). Wavelength shifts in synchronous fluorescence spectra showed that the conformation of HSA molecules is changed in the presence of rutin. The structure of rutin used in the molecular modeling calculation was obtained with the Gaussian 98 program. The geometry optimization of rutin was performed in its ground state using the ab initio DFT/B3LYP functional with the 6-31G(d,p) basis set.
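The thermodynamic parameters quoted here follow directly from the two modified Stern-Volmer association constants via ΔG = −RT ln K and a two-point van 't Hoff analysis; a quick check in Python, using only the constants reported in the abstract (the computed ΔH comes out negative, i.e. the binding is exothermic, consistent with the association constant decreasing with temperature):

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1

# Modified Stern-Volmer association constants from the abstract
T1, K1 = 298.0, 2.285e5   # M^-1
T2, K2 = 303.0, 2.081e5   # M^-1

# Gibbs free energy of binding at each temperature
dG1 = -R * T1 * math.log(K1)                              # J/mol, ~ -30.6 kJ/mol
dG2 = -R * T2 * math.log(K2)                              # J/mol, ~ -30.8 kJ/mol

# Two-point van 't Hoff estimate of the binding enthalpy
dH = -R * (math.log(K2) - math.log(K1)) / (1/T2 - 1/T1)   # J/mol, ~ -14.0 kJ/mol

# Entropy from dG = dH - T*dS
dS = (dH - dG1) / T1                                      # J mol^-1 K^-1, ~ +55 J/(mol K)
```

The recovered values reproduce the reported magnitudes, so the positive ΔS together with a modest ΔH points to a binding driven by both enthalpy and entropy.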
The molecular electrostatic potential (MEP) was calculated to provide the molecular charge distribution of rutin. The gap energy between the HOMO and LUMO of the rutin molecule was about 4.22 eV, which indicates that rutin can be classified as a reactive molecule. In the molecular modeling calculation the interaction between HSA and rutin was investigated using the AutoDock program package. The three-dimensional coordinates of human serum albumin were obtained from the Protein Data Bank (PDB entry 1AO6), and those of rutin from the output of the DFT geometry optimization. The best energy-ranked result shows that rutin is localized in the proximity of the single tryptophan residue (Trp214) of HSA, in agreement with the fluorescence quenching data analysis. The effect of toxofilin on the structure of monomeric actin. Lívia Czimbalek, Veronika Kollár, Roland Kardos, Gábor Hild, University of Pécs, Faculty of Medicine, Department of Biophysics, Pécs, Hungary. Actin is one of the main components of the intracellular cytoskeleton. It plays an essential role in cell motility, intracellular transport processes and cytokinesis as well. Toxoplasma gondii is an intracellular parasite which can utilise the actin cytoskeleton of the host cells for its own purposes. One of the expressed proteins of T. gondii is the 27 kDa toxofilin. This protein is a monomeric actin-binding protein involved in host invasion. In our work we studied the effect of the actin-binding site of toxofilin (residues 69-196) on G-actin. We determined the affinity of toxofilin for the actin monomer. The fluorescence of the actin-bound ε-ATP was quenched with acrylamide in the presence or absence of toxofilin. In the presence of toxofilin the accessibility of the bound ε-ATP decreased, which indicates that the nucleotide binding cleft is shifted to a more closed conformational state.
The results of the completed experiments can help us to understand in more detail what kind of cytoskeletal changes can be caused in the host cell during invasion by intracellular parasites. T7 bacteriophage, as a surrogate of non-enveloped viruses, was selected as a test system. Both TMPCP and BMPCP and their peptide conjugates proved to be efficient photosensitizers of virus inactivation. The binding of porphyrin to phage DNA was not a prerequisite of phage photosensitization; moreover, photoinactivation was more efficiently induced by free than by DNA-bound porphyrin. The mechanism of photoreaction (type I versus type II) and the correlation between DNA binding, singlet oxygen production and virus inactivation capacity were also analyzed. DNA binding reduced the virus inactivation, due to the reduced absorbance and singlet oxygen production of the bound photosensitizer, and altered the mechanism of photoinactivation. As optical melting studies of the T7 nucleoprotein revealed, photoreactions of porphyrin derivatives affected the structural integrity of DNA and also of viral proteins, even if the porphyrin did not bind to the nucleoprotein or was selectively bound to DNA. Anesthesia is a medical milestone (Friedman & Friedland, Medicine's 10 Greatest Discoveries, 2000), and local anesthetics (LA) are the most important compounds used to control pain in surgical procedures. However, systemic toxicity is still a limitation for LA agents, as is low solubility, as for tetracaine (TTC). Approaches to improve LA effects include the formation of macrocyclic systems, such as with cyclodextrins (CD). We have studied complexes formed between TTC and β-CD or hydroxypropyl-β-CD (HP-β-CD) through NMR and other techniques (UV-vis, fluorescence, DSC and X-ray diffraction). At pH 7.4 a 1:1 stoichiometry of complexation was detected for both complexes, with association constants of 777 M⁻¹ and 2243 M⁻¹ for TTC:β-CD and TTC:HP-β-CD, respectively.
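Association constants like these translate directly into standard binding free energies via ΔG = −RT ln Ka; a small sketch, assuming T = 298 K since the abstract does not state the measurement temperature:

```python
import math

R = 8.314      # gas constant, J mol^-1 K^-1
T = 298.0      # K; assumed room temperature (not stated in the abstract)

def binding_dG_kJ(Ka_per_M, T=T):
    """Standard free energy of association, dG = -RT ln(Ka), in kJ/mol."""
    return -R * T * math.log(Ka_per_M) / 1000.0

# Association constants reported for the two tetracaine:cyclodextrin complexes
dG_bcd = binding_dG_kJ(777.0)     # TTC:beta-CD,    ~ -16.5 kJ/mol
dG_hpbcd = binding_dG_kJ(2243.0)  # TTC:HP-beta-CD, ~ -19.1 kJ/mol
```

The roughly 2.6 kJ/mol more favorable ΔG for the HP-β-CD complex mirrors its threefold larger association constant.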
The nuclear Overhauser NMR data disclosed through-space proximities between hydrogens Hh and Hi at the aromatic ring of TTC and hydrogens from the inner cavity of the cyclodextrins, allowing us to propose the topology of the TTC:CD interaction. Complex formation did not curb TTC association with model (liposomes) and biological membranes, since the total analgesic effect (infraorbital nerve blockade in rats) induced by 15 mM TTC increased 36% upon complexation. Supported by FAPESP (# 06/121-9, 03838-1), Brazil. P-033 ITC as a general thermodynamic and kinetic tool to study biomolecule interactions. Philippe Dumas, Dominique Burnouf, Eric Ennifar, Sondes Guedich, Guillaume Bec, Guillaume Hoffmann. Isothermal titration calorimetry (ITC) is a powerful technique for thermodynamic investigations that is little used to obtain kinetic information. We have shown that, in fact, the shape of the titration curves obtained after each ligand injection is strictly governed by the kinetics of interaction of the two partners. A simple analysis allowed us to explain several facts (e.g. the variation of the time needed to return to equilibrium during a titration experiment). All simplifications were further released to obtain a very realistic simulation of an ITC experiment. The method was first validated with the binding of the nevirapine inhibitor to HIV-1 reverse transcriptase, by comparison with results obtained by Biacore™. Importantly, for more complex systems, the new method yields results that cannot be obtained in another way. For example, with the E. coli transcription-regulating thiamine pyrophosphate riboswitch, we could resolve kinetically and thermodynamically the two important successive steps: (1) the binding of the TPP ligand and (2) the subsequent RNA folding.
Our results show that initial TPP binding is controlled thermodynamically by the TPP concentration, whereas the overall transcription regulation resulting from RNA folding is kinetically controlled. GFPs have a tendency to dimerize at high concentration. We have characterized for the first time the self-association properties of CFP (cyan fluorescent protein), the fluorescent protein most used as a FRET donor. We found that the fluorescence quenching observed at high expression levels in the cell cytoplasm and the fluorescence depolarization measured at high concentration in vitro are insensitive to the A206K mutation, shown to dissociate other GFP dimers. Both phenomena are satisfactorily accounted for by a model of non-specific homo-FRET between CFP monomers due to molecular proximity. Modeling the expected contributions to fluorescence depolarization of rotational diffusion, homo-FRET within a hypothetical dimer and proximity homo-FRET shows that CFP has a homo-affinity at least 30 times lower than GFP. This difference is due to an intrinsic mutation of CFP (N146I), originally introduced to increase its brightness, that by chance also disrupts the dimers. Biomolecular recognition typically proceeds in an aqueous environment, where hydration shells are a constitutive part of the interacting species. The coupling of hydration shell structure to conformation is particularly pronounced for DNA, with its large surface-to-volume ratio. Conformational substates of the phosphodiester backbone in B-DNA contribute to DNA flexibility and are strongly dependent on hydration. We have studied by rapid-scan FTIR spectroscopy the isothermal BI-BII transition on its intrinsic time scale of seconds. Correlation analysis of IR absorption changes induced by an incremental growth of the DNA hydration shell identifies water populations W1 (PO₂⁻-bound) and W2 (non-PO₂⁻-bound) exhibiting weaker and stronger H-bonds, respectively, than those in bulk water.
The BII substate is stabilized by W2. The water H-bond imbalance of 3-4 kJ mol⁻¹ is equalized at little enthalpic cost upon formation of a contiguous water network (at 12-14 H₂O molecules per DNA phosphate) of reduced ν(OH) band width. In this state, hydration water cooperatively stabilizes the BI conformer via the entropically favored replacement of W2-DNA interactions by additional W2-water contacts, rather than by binding to BI-specific hydration sites. Such water rearrangements contribute to the recognition of DNA by indolicidin, an antimicrobial 13-mer peptide from bovine neutrophils which, despite little intrinsic structure, preferentially binds to the BI conformer in a water-mediated induced fit. In combination with CD-spectral titrations, the data indicate that in the absence of a bulk aqueous phase, as in molecularly crowded environments, water relocation within the DNA hydration shell allows for entropic contributions similar to those assigned to water upon DNA-ligand recognition in solution. Segmental-labeling expression of SH3 domains of the CD2AP protein to study the interaction with their ligand. I.F. Herranz-Trillo, J.L. Ortega-Roldan, N.A.J. van. Transient and low-affinity interactions within the cell can be enhanced by the combination of more than one domain. Up to now most of the effort has been put into the study of the regulation of the affinity and specificity of binding to isolated single domains, but little is known about the effect of the presence of a second or third domain. Multiple examples of proteins containing tandem domains exist in the genome, like the CIN85/CMS family of adaptor proteins. In this family all three N-terminal SH3 domains are involved in a wide variety of different interactions; they share higher similarity among themselves than with any other SH3 domains, suggesting overlapping binding specificities.
CD2-associated protein (CD2AP) is an adaptor protein belonging to this family; its N-terminus consists of three SH3 domains, and the interaction of each one of them with its target(s) might be ultimately modulated by the presence of its next-door neighbor. In this work we present the expression and purification of the tandem CD2AP-SH3A/SH3B produced by segmental labeling techniques that allow us to express the domains with different isotopic labels, improving the NMR signal and facilitating the study of the interaction with the natural ligand in the presence of the next-door-neighbor domain. There are plenty of molecules that exert their effects at the cell membrane. The evaluation of these interactions, frequently quantified by the Nernst lipid/water partition constant (Kp), helps to elucidate the molecular basis of these processes. We present here a recently derived and tested method to determine Kp for single-solute partition using ζ-potential measurements. The concept was then extended to the interaction of supramolecular complexes with model membranes. A simultaneous double partition with an aqueous equilibrium is considered in this partition model. The results were validated by dynamic light scattering (DLS), ζ-potential, fluorescence spectroscopy and laser confocal microscopy experiments. We evaluated the interaction of supramolecular complexes (peptides derived from dengue virus proteins with oligonucleotides) with LUVs to study our biophysical models. Dengue virus (DV) infects 50-100 million people every year and may cause viral hemorrhagic fever. No effective treatment is available and several aspects of its cellular infection mechanism remain unclear. Characterizing the interactions of these complexes with biomembranes helps to elucidate some steps of the DV life cycle.
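In the partition formalism behind Kp, the membrane-bound fraction of a solute follows from Kp and the lipid concentration; a minimal sketch of the common mole-ratio form, where the lipid molar volume γ ≈ 0.8 L/mol and the example numbers are illustrative assumptions, not data from the abstract:

```python
def bound_fraction(Kp, lipid_conc_M, gamma_L_per_mol=0.8):
    """Fraction of solute partitioned into the lipid phase.

    Kp: dimensionless lipid/water partition constant (concentration ratio)
    lipid_conc_M: total lipid concentration (mol/L)
    gamma_L_per_mol: lipid molar volume (L/mol); ~0.8 is an assumed,
    typical phospholipid value
    """
    x = Kp * gamma_L_per_mol * lipid_conc_M
    return x / (1.0 + x)

# Example: Kp = 1000 with 1 mM lipid -> a bit under half the solute is membrane-bound
f = bound_fraction(1000.0, 1e-3)
```

Because the bound fraction saturates hyperbolically with lipid, titrating lipid (here via ζ-potential changes) and fitting this curve is what yields Kp.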
The aggregation of amphotericin B in lipid membranes induced by K⁺ and Na⁺ ions: a Langmuir monolayer study. Marta Arczewska, Mariusz Gagoś, Department of Biophysics, University of Life Sciences in Lublin, Poland. The polyene antibiotic amphotericin B (AmB) is currently the drug of choice in the treatment of fungal infections, despite its undesirable side effects. According to the general conviction, the biological action of the drug is based on the formation of transmembrane channels which affect physiological ion transport, especially of K⁺ ions. This work reports the results of a Langmuir monolayer study of the effect of K⁺ and Na⁺ ions on the molecular organization of AmB in a model lipid membrane. Two-component monolayers containing AmB and phospholipid (DPPC) were investigated by recording surface pressure-area isotherms of monolayers spread on aqueous buffers containing physiological concentrations of K⁺ and Na⁺ ions. The strength of the AmB-DPPC interactions and the stability of the mixed monolayers were examined on the basis of surface pressure measurements, the compressional modulus and the excess free energy of mixing. The obtained results proved a higher affinity of AmB towards lipids in the presence of K⁺ than of Na⁺ ions. The most stable mixed monolayers were formed with 1:1 and 2:1 stoichiometry in the presence of K⁺ and Na⁺ ions, respectively. This research was financed by the Ministry of Education and Science of Poland within research project N N401 015035. Microcalorimetric study of antibiotic amphotericin B complexes with Na⁺, K⁺ and Cu²⁺ ions. Arkadiusz Matwijczuk, Grzegorz Czernel, Mariusz Gagoś, Department of Biophysics, University of Life Sciences in Lublin, Poland. Amphotericin B (AmB), a metabolite of Streptomyces nodosus, is one of the main polyene antibiotics applied in the treatment of deep-seated mycotic infections.
We present a microcalorimetric (DSC) study of the molecular organization of amphotericin B in lipid membranes induced by Na⁺, K⁺ and Cu²⁺ ions. The analysis of DSC curves indicates the influence of Na⁺ and K⁺ ions on the main phase transition of pure DPPC lipid. For molar fractions of 3, 5, 10 and 15 mol% AmB in DPPC we observed a thermal shift towards higher temperatures with respect to the pure lipid, both in the presence of Na⁺ and of K⁺ ions. This result may be connected with changes in the dynamic properties of the model membrane system. In the case of AmB-Cu²⁺ complexes in aqueous solution at two pH values, 2.0 and 10.6, the DSC measurements showed an endothermic heat effect. This phase transition was related to the dissociation process of the AmB-Cu²⁺ complexes. The formation of AmB-Cu²⁺ complexes is accompanied by changes in the molecular organization of AmB, especially disaggregation. All these observed effects might be significant from a medical point of view. This research was financed by the Ministry of Education and Science of Poland within research project N N401 015035. Membrane proteins and peptides act in an environment rich in other proteins or peptides. The aim of our study was to understand how such molecular crowding and the resulting intermolecular interactions can influence the behavior of membrane proteins, using various antimicrobial peptides and membrane proteins as examples. In the case of antimicrobial peptides we have previously described a change in their alignment in the membrane at a characteristic threshold concentration. To understand whether this change is due to unspecific crowding or to specific peptide-peptide interactions, we tested whether this re-alignment depends on the presence of additional peptides. In most cases we found a similar re-orientation behavior irrespective of the added peptide type, indicating unspecific crowding.
When pairing PGLa and magainin-2, however, we observed a distinctly different sequence of PGLa re-orientation in the membrane, indicating a specific interaction between these two peptides, which correlates well with their known synergistic activity. A rather different effect of crowding was observed for the larger channel protein MscL, which was found to form clusters of functionally active proteins in the membrane. We propose that this clustering is caused by lipid-mediated protein-protein interactions. Water, hydrophobic interaction, and protein stability. J. Raul Grigera and C. Gaston Ferrara, Instituto de Física de Líquidos y Sistemas Biológicos (IFLYSIB), CONICET-UNLP, La Plata, Argentina. Although there are several forces maintaining protein structure, it is well known that the hydrophobic interaction is the dominant force of protein folding. We can therefore infer that any factor that alters the hydrophobic interaction will affect protein stability. We have studied by computer simulation a model system consisting of a solution of Lennard-Jones particles in water (SPC/E model) at different pressures and temperatures, and analyzed the solubility, i.e. the aggregation properties, of such a system. From the obtained data we are able to build up the phase surface and determine the critical point. The computational results were compared with experimental data on binary mixtures of non-polar substances in water and on protein denaturation, finding good agreement on the critical point. Since the behavior of our model system can only be due to hydrophobic effects, the agreement with the denaturation of proteins allows us to conclude that the dominant factor determining temperature and pressure denaturation of proteins is the hydrophobic interaction. The temperatures and pressures at which denaturation, as well as the disaggregation of simple non-polar particles, starts agree with what one would expect based on the crossover line of the low- to high-density structural water transition.
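For reference, the solute model named here is the standard 12-6 Lennard-Jones pair potential; a minimal implementation (the ε and σ defaults are generic illustrative values, not the parameters used in the study):

```python
def lennard_jones(r, epsilon=0.65, sigma=0.34):
    """12-6 Lennard-Jones pair potential, U(r) = 4*eps*((s/r)^12 - (s/r)^6).

    r, sigma in nm; epsilon in kJ/mol. The defaults are roughly argon/methane-like
    and purely illustrative. The minimum U = -epsilon lies at r = 2^(1/6)*sigma.
    """
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)
```

The purely repulsive-plus-dispersive form is what makes such particles a clean probe of hydrophobic aggregation: any clustering in SPC/E water beyond this pair interaction is solvent-induced.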
The functional reconstitution of a mitochondrial membrane protein into a lipid bilayer was studied using a quartz crystal microbalance. The 6xHis-tagged protein was immobilised via specific binding to a Cu²⁺-terminated sensor surface, with a change in frequency indicating approximately 75% coverage of the sensor surface by the protein. A lipid bilayer was reconstituted around the protein layer, with a final change in frequency consistent with the remaining area being filled by lipid. Incubation with a specific ligand for the protein resulted in a significant change in frequency compared to the interaction with the surface or lipid alone. The change is greater than expected for the mass of the ligand, indicating a possible conformational change of the protein, such as the opening of a channel and increased water content of the layer. Electrical impedance measurements on the same system have provided additional evidence of protein-lipid bilayer formation, and it is intended that this system will be studied with neutron reflectometry to characterise potential ligand-induced channel formation. Valuable functional and structural information about this membrane protein was obtained by using surface-sensitive techniques to study the protein in a biomimetic lipid bilayer. Visualizing and quantifying HIV-host interactions with fluorescence microscopy. Jelle Hendrix 1,*, Zeger Debyser 2, Johan Hofkens 3 and Yves Engelborghs 1. 1 Laboratory for Biomolecular Dynamics, University of Leuven, Belgium; 2 Laboratory for Molecular Virology and Gene Therapy, University of Leuven, Belgium; 3 Laboratory for Photochemistry and Spectroscopy (*present address), University of Leuven, Belgium. Protein-chromatin interactions are classically studied with in vitro assays that only provide a static picture of chromatin binding. Fluorescence correlation spectroscopy (FCS) is a non-invasive technique that can be used for the same purpose.
Being applicable inside living cells, it provides dynamic real-time information on chromatin interactions. The transcriptional co-activator LEDGF/p75 has well-characterized protein- and chromatin-interacting regions. We studied LEDGF/p75 in vitro and inside living cells with FCS and other techniques (luminescent proximity assay, spot/half-nucleus fluorescence recovery after photobleaching, continuous photobleaching). Protein-protein interactions in living cells can be monitored with fluorescence cross-correlation spectroscopy (FCCS) using fluorescent proteins as genetic labels. Advantages over Förster resonance energy transfer (FRET) are the independence from intermolecular distance and the knowledge of absolute protein concentrations. We characterized FCCS with fluorescent proteins in vitro and then studied the intracellular complex of LEDGF/p75 and the HIV-1 integrase (IN) enzyme both with FRET and FCCS. The nucleus and its compartment the nucleolus are a seat of enormous biosynthetic activity in human cancer cells. Nucleolar proteins, e.g. B23 or C23, play an important role in the regulation of cell division and proliferation. One of the strategies to interrupt malignant cell proliferation is to affect, e.g. by drug treatment, the net of intracellular protein interactions so as to bring the cell onto a path to apoptosis. The cytostatic agent actinomycin D initiates apoptosis in human cancer cells, as well as in normal peripheral blood lymphocytes. At the same time, translocation of B23 and C23 into the nucleoplasm is observed in the treated cells. Therefore the interaction between nucleolar and apoptotic proteins comes into question. Co-immunoprecipitation, fluorescence microscopy and yeast two-hybrid analysis are used to answer it. In co-immunoprecipitation experiments, the tumor suppressor p53 emerged as a promising candidate for the interaction.
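For context, the FCS observable in such experiments is the autocorrelation of intensity fluctuations, which for free 3D diffusion through a Gaussian focus has a standard closed form; a sketch (the structure-parameter default is a typical, assumed value):

```python
def fcs_autocorrelation(tau, N, tau_D, s=5.0):
    """Standard 3D free-diffusion FCS autocorrelation for a Gaussian focal volume.

    tau: lag time (s)
    N: mean number of fluorescent molecules in the focal volume (G(0) = 1/N)
    tau_D: diffusion time through the focus (s)
    s: structure parameter (axial/lateral focal radius ratio); 5 is a common value
    """
    return (1.0 / N) * 1.0 / (1.0 + tau / tau_D) / (1.0 + tau / (s * s * tau_D)) ** 0.5

# G(0) gives the absolute concentration; the decay time gives the mobility,
# which is how chromatin binding slows a protein like LEDGF/p75 in the nucleus.
g0 = fcs_autocorrelation(0.0, N=10.0, tau_D=1e-3)
```

Chromatin-bound fractions show up as an extra, slower component in this decay, which is what makes FCS a dynamic readout where in vitro assays are static.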
TTR amyloidoses involve deposits mostly constituted by variants of transthyretin (TTR), a homotetrameric plasma protein implicated in the transport of thyroxine and retinol [1]. Nowadays, the only effective therapy for TTR amyloidosis is liver transplantation. New therapeutic strategies are being developed taking advantage of our current understanding of the molecular mechanisms of amyloid formation by TTR [2]. A significant effort has been devoted to the search for and rational design of compounds that might decrease TTR tetramer dissociation, for example through ligand binding at the thyroxine binding sites of TTR [3,4]. Here, we use isothermal titration calorimetry (ITC) to characterize the thermodynamic binding signature of potential TTR tetramer stabilizers previously predicted by computer-assisted methods [3]. ITC allows the measurement of the magnitude of the binding affinity, but also affords the characterization of the thermodynamic binding profile of a protein-ligand interaction. High-affinity/specificity TTR ligands, enthalpically and entropically optimized, may provide effective leads for the development of new and more effective drug candidates against TTR amyloidosis. We have established a set of vectors to promote easy cloning of ECFP and EYFP fusions with any protein of interest. We exploit these fluorescent fusion proteins to study protein-protein interactions through the fluorescence lifetime of ECFP. A decrease in ECFP lifetime reveals FRET between ECFP and EYFP, and hence the interaction between the proteins in question. The GroEL-GroES chaperonin complex is required for the proper folding of Escherichia coli proteins. Bacteriophage T4 and its distant relative coliphage RB49 encode co-chaperone proteins (respectively gp31 and CocO) that can replace GroES in the chaperonin complex. Gp31 is also required for the folding of the major capsid protein of the phage. PRD1 is a large membrane-containing bacteriophage infecting Gram-negative bacteria such as E.
coli and Salmonella enterica. It has a 15 kb long linear dsDNA genome and its capsid has icosahedral symmetry. The GroEL-GroES chaperonin complex is needed in the assembly of PRD1. We have found evidence that the PRD1 protein P33 can work in a similar way to other viral co-chaperones and substitute for GroES in the chaperonin complex. Fluorescence lifetime studies between the proteins GroEL and P33 reveal an interaction that backs up this theory. Structural modification of model membranes by fibrillar lysozyme as revealed by a fluorescence study. A.P. Kastorna, V.N. Karazin Kharkiv National University, 4 Svobody Sq., Kharkiv, 61077, Ukraine. Recent experimental findings suggest that protein aggregation, leading to the formation and deposition of amyloids, plays a central role in neurodegenerative diseases, type II diabetes, systemic amyloidosis, etc. In the present study we focused our efforts on investigating the influence of fibrillar lysozyme on the structural state of model lipid membranes composed of phosphatidylcholine and its mixtures with cardiolipin (10 mol%) and cholesterol (30 mol%). To this end, two fluorescent probes with different bilayer locations were employed: 1,6-diphenyl-1,3,5-hexatriene (DPH), distributing in the membrane hydrocarbon core, and 6-lauroyl-2-dimethylaminonaphthalene (Laurdan), locating at the lipid-water interface. Changes in membrane viscosity under the influence of amyloid lysozyme were characterized by the fluorescence anisotropy of DPH. This fluorescence parameter was not markedly affected by the fibrillar protein in any type of model membrane. Changes in the emission spectra of Laurdan were analysed via the generalized polarization (GP) value. It was found that the addition of amyloid lysozyme resulted in an increase of the GP value. Our data suggest that lysozyme fibrils cause a reduction of bilayer polarity and an increase of lipid packing density.
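The Laurdan generalized polarization used above has a standard definition built from two emission bands; a one-line sketch (the roughly 440 nm and 490 nm band positions are the commonly used ones, assumed here since the abstract does not give them):

```python
def laurdan_gp(I_440, I_490):
    """Laurdan generalized polarization: GP = (I440 - I490) / (I440 + I490).

    I_440, I_490: emission intensities near 440 nm (gel-like, ordered lipid)
    and near 490 nm (fluid-like, water-relaxed) environments. GP ranges from
    -1 (fully fluid/polar) to +1 (fully ordered/apolar).
    """
    return (I_440 - I_490) / (I_440 + I_490)
```

The reported GP increase upon adding fibrillar lysozyme therefore reads directly as a shift toward the ordered, less hydrated interface the authors describe.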
Isothermal titration calorimetry (ITC) is the gold standard for the quantitative characterisation of protein-ligand and protein-protein interactions. However, reliable determination of the dissociation constant (Kd) is typically limited to the range 100 μM > Kd > 1 nM. Nevertheless, interactions characterised by a higher or lower Kd can be assessed indirectly, provided that a suitable competitive ligand is available whose Kd falls within the directly accessible window. Unfortunately, the established competitive ITC assay requires that the high-affinity ligand be soluble at high concentrations in aqueous buffer containing only minimal amounts of organic solvent. This poses serious problems when studying protein binding of small-molecule ligands taken from compound libraries dissolved in organic solvents, as is usually the case during screening or drug development. Here we introduce a new ITC competition assay that overcomes this limitation, thus allowing for a precise thermodynamic description of high- and low-affinity protein-ligand interactions involving poorly water-soluble compounds. We discuss the theoretical background of the approach and demonstrate some practical applications using examples of both high- and low-affinity protein-ligand interactions. Interaction of myoglobin with oxidized polystyrene surfaces studied using the rotating particles probe. M. Kemper 1,2, D. Spridon 1, L.J. van IJzendoorn 1, M.W.J. Prins 1,3. 1 Eindhoven University of Technology, Department of Applied Physics, Eindhoven, The Netherlands; 2 Dutch Polymer Institute, Eindhoven, The Netherlands; 3 Philips Research, Eindhoven, The Netherlands. The interaction of proteins with polymer surfaces is of profound importance for the sensitivity of biosensors. Polymer surfaces are often treated in order to tune their chemical and physical properties, for example by oxidation processes.
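Competition (displacement) ITC of the kind extended here rests on a simple relation: a weak competitor B at concentration [B] shifts the apparent Kd of the tight ligand A into the directly measurable window. A sketch of that standard relation, with purely illustrative numbers:

```python
def apparent_Kd(Kd_A, Kd_B, B_conc):
    """Apparent dissociation constant of tight ligand A measured in the
    presence of a competing weak ligand B:

        Kd_app = Kd_A * (1 + [B] / Kd_B)

    All quantities in the same concentration unit (here molar). Fitting the
    displacement titration yields Kd_app, from which Kd_A is recovered.
    """
    return Kd_A * (1.0 + B_conc / Kd_B)

# Illustrative numbers: a 0.1 nM binder measured against 100 uM of a 10 uM competitor
Kd_app = apparent_Kd(Kd_A=1e-10, Kd_B=1e-5, B_conc=1e-4)
# The 0.1 nM affinity is weakened ~11-fold, into the ITC-accessible nanomolar range.
```

The same relation read in reverse handles the low-affinity case: a tight, well-behaved reporter ligand is displaced by the weak compound of interest.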
To get a better understanding of the association of proteins with treated polymer surfaces, we use the rotating particles probe (X.J.A. Janssen et al., Colloids and Surfaces A, vol. 373, p. 88, 2011). In this novel technique, protein-coated magnetic particles are in contact with a substrate and the binding is recorded for all individual particles using a rotating magnetic field. We investigate the interaction of myoglobin-coated magnetic particles with spin-coated polystyrene surfaces that have been oxidized by a UV/ozone treatment. The surfaces have been characterized by XPS, AFM and water contact angle measurements. We will demonstrate a clear influence of polystyrene oxidation on the binding fractions of the myoglobin-coated particles. We interpret the results in terms of DLVO theory: electrostatic as well as electrodynamic properties of the surfaces will be influenced by the oxidation. Twinfilins interact with monomeric and/or filamentous actins. Twinfilin is a 37-40 kDa protein composed of two ADF-homology domains connected by a short linker. In our work we studied the effects of mouse twinfilin-1 (Twf1) on monomeric actin. We determined the affinity of Twf1 for the ATP-actin monomer by fluorescence anisotropy measurement (Kd = 0.015 μM). The fluorescence of the actin-bound ε-ATP was quenched with acrylamide in the presence or absence of Twf1. In the presence of twinfilin the accessibility of the bound ε-ATP decreased, which indicates that the nucleotide binding cleft is shifted to a more closed conformational state. It was confirmed with stopped-flow experiments that the kinetics of nucleotide exchange of actin decreased in the presence of Twf1. We determined the thermodynamic properties of Twf1 and investigated the effect of twinfilin on the stability of the actin monomer by differential scanning calorimetry. Twf1 stabilized the structure of G-actin. Our results can help us to understand the regulation of the actin cytoskeleton in more detail.
Magnetic NPs have attracted attention due to their potential for contrast enhancement in magnetic resonance imaging and for targeted drug delivery, e.g. tumor magnetic hyperthermia therapy. The potential nephrotoxicity of a single i.v. administration of Fe₃O₄ NPs was studied in female Wistar rats administered i.v. either placebo (10% v/v rat serum in 0.9% NaCl), a suspension of TiO₂ NPs (positive control, bimodal 84/213 nm distribution), or Fe₃O₄ NPs (bimodal 31/122 nm distribution) at doses of 0.1, 1.0 or 10.0 mg/kg. Rats were sacrificed 24 h and 7, 14 and 28 days after NP injection (n = 9-10/group). Administration of NPs did not alter kidney size significantly; the renal function of NP-administered rats, as monitored by plasma creatinine and urea concentrations, creatinine clearance and protein excretion rate, did not differ significantly at any interval from that of rats administered placebo. One week after administration, a significant rise in plasma Ca and its urinary and fractional excretion was observed in rats administered 10 mg Fe₃O₄/kg. Plasma Mg levels rose in this group 1 and 2 weeks after administration. No significant changes in the expression of TNF-α, TGF-β and collagen IV genes in the renal cortex were revealed. No obvious nephrotoxic effects were observed in rats after a single i.v. dose of Fe₃O₄ NPs. The study was supported by EU EC FP7: NANOTEST (Development of methodology for alternative testing strategies for the assessment of the toxicological profile of nanoparticles used in medical diagnostics), grant no. 201335. Biomimetic supramolecular assemblies for studying membrane interactions in vitro and in vivo. S. Kolusheva, R. Jelinek, Ben-Gurion University, Beer-Sheva, Israel. We designed a novel biomimetic sensor composed of a conjugated polydiacetylene (PDA) matrix embedded within lipid vesicles. The system is capable of detecting various compounds occurring within lipid membranes through rapid colorimetric as well as fluorescent transitions.
the colorimetric response of the sensor is correlated to the extent of compound-membrane binding and permeation and quantifies binding sensitivity to lipid composition. we describe a new disease diagnostic approach, denoted ''reactomics'', based upon reactions between blood sera and an array of vesicles comprising different lipids and polydiacetylene (pda), a chromatic polymer. we show that reactions between sera and such a lipid/pda vesicle array produce chromatic patterns which depend both upon the sera composition and the specific lipid constituents within the vesicles. through attachment of chromatic polydiacetylene (pda) nanopatches onto the plasma membrane, real-time visualization of surface processes in living cells is possible. the ras protein is mutated in 30% of human tumors. ras acts as a switch, transmitting a growth signal in an active gtp-bound form and turning the signal off in an inactive gdp-bound form. the switch-off is accomplished by gtp hydrolysis, which is catalyzed by ras and can be further accelerated by gtpase activating proteins (gaps). mutations which prevent hydrolysis cause severe diseases including cancer. we investigate the reaction of the ras-gap protein-protein complex by time-resolved ftir spectroscopy [1]. detailed information on the mechanism and the thermodynamics of the reaction was revealed [2]: first, the catalytic arginine finger of gap has to move into the gtp binding pocket; then cleavage of gtp is fast, and h2po4 hydrogen-bonded in an eclipsed conformation to the β-phosphate of gdp is formed. further, we performed for the first time atr-ftir spectroscopy of ras in its native environment, a lipid membrane [3]. in this setup we are able to do difference spectroscopy of the immobilized protein. interactions with other proteins can be determined in a similar way as in spr experiments, but with the additional information from the infrared spectra.
galectins are a family of animal lectins that specifically bind β-galactosides and have gained much attention due to their involvement in several biological processes such as inflammation, cell adhesion and metastasis. in such processes, several issues are still not clear, including the mechanisms of interaction with different carbohydrates. galectin-4 (gal-4) is a tandem-repeat type galectin that contains two carbohydrate recognition domains (crd-i and crd-ii) connected by a linking peptide. in this study, we performed spectroscopic studies of the carbohydrate-recognition domains from human gal-4. our goals are two-fold: (1) to monitor conformational changes in each domain upon its binding to specific ligands and then to correlate the observed changes with structural differences between the crds and (2) to investigate the interaction between the crds and lipid model membranes. to achieve these objectives we used a combined approach of spectroscopic techniques involving circular dichroism and electron spin resonance. overall, the results obtained so far show that crd-i and crd-ii have distinct behaviors in terms of carbohydrate recognition and membrane binding. this may be due to specific differences in their structures and certainly suggests a non-equivalent role in protein function. hemoglobin influence on lipid bilayer structure as revealed by fluorescence probe study. o.k. kutsenko, g.p. gorbenko, v.m. trusova, v.n. karazin kharkov national university, kharkov, ukraine. hemoglobin (hb) is a red blood cell protein responsible for oxygen transport. its affinity for lipid bilayers is of interest for gaining insight into the protein's biological function as well as for applied aspects such as the development of blood substitutes or biosensors. the hb influence on lipid bilayer structure was investigated using the fluorescent probes pyrene and prodan.
model membranes were prepared of phosphatidylcholine (pc) and its mixtures with phosphatidylglycerol (pg) and cholesterol (chol). hb penetration into the membrane interior is accompanied by an increase of the relative intensity of pyrene vibronic bands and a decrease of the prodan generalized polarization value, suggesting an enhancement of bilayer polarity. this implies that hb incorporation into the membrane interior decreases the packing density of lipid molecules, promoting water penetration into the membrane core. the condensing effect of chol on the lipid bilayer prevents protein embedment into the bilayer, thus decreasing membrane hydration changes as compared to pc bilayers. in the presence of the anionic lipid pg the hb-induced increase of bilayer polarity was found to be most pronounced, pointing to the modulatory role of membrane composition in the bilayer-modifying propensity of hb. we present optimized sialic acid-based mimics binding in the low nanomolar range. molecular interactions were determined with surface plasmon resonance (spr), characterizing the affinity and the kinetics of binding. furthermore, isothermal titration calorimetry (itc) was applied to dissect the standard free energy of binding (Δg°) into the standard enthalpy of binding (Δh°) and the standard entropy of binding (Δs°). in order to cross cell membranes, most of these medicines have to be administered to patients as nucleoside pro-drugs and not directly as triphosphorylated forms. because of the poor phosphorylation of the nucleoside analogues used in therapy, it is important to understand and to optimize their metabolism. our aim is to understand how compounds of l chirality divert 3-phosphoglycerate kinase (pgk) from its normal glycolytic function to be converted into the triphosphate forms. in order to elucidate the pgk mechanism and substrate specificity, we have measured the kinetics of the different steps of the enzymatic pathways by rapid mixing techniques and studied the influence of the nature of the nucleotide substrate thereon.
we first performed an extensive study with d- and l-adp (see poster by p. lallemand). we are now extending the studies to other nucleotide diphosphates (some of them used in therapies). changes in the nature of the nucleobase or deletion of a hydroxyl group of the sugar affect the efficiency of phosphorylation by pgk, either by dramatically decreasing their affinity or by altering the phospho-transfer step itself. structural explanations are given based on docking data. probing drug/lipid interactions at the molecular level represents an important challenge in pharmaceutical research, drug discovery and membrane biophysics. previous studies showed that the enrofloxacin metalloantibiotic has potential as an antimicrobial agent candidate, since it exhibits an antimicrobial effect comparable to that of free enrofloxacin but a different translocation route. these differences in uptake mechanism can be paramount in counteracting bacterial resistance. in view of the role of lipids in bacterial drug uptake, the interaction of these compounds with different e. coli model membranes was studied by fluorescence spectroscopy. the partition coefficients determined showed that lipid/antibiotic interactions were sensitive to liposome composition and that the metalloantibiotic had a higher partition than free enrofloxacin. these results corroborate the different mechanism of entry proposed and can be rationalized on the basis that an electrostatic interaction between the positively charged metalloantibiotic species, present at physiological ph, and the negatively charged lipid head groups clearly promotes the lipid/antimicrobial association. oligomerization and fibril assembly of amyloid β peptide. amyloid β peptide (aβ) forms a large amount of extracellular deposits in the brain of alzheimer's disease (ad) patients and it is believed that this peptide is related to the pathogenesis of that disease.
the most abundant monomeric form of physiological aβ (~90%) is constituted by 40 amino acids and is benign, but by an unknown mechanism this endogenous material becomes aggregated and neurotoxic. increasing evidence suggests that membrane interaction plays an important role in aβ neurotoxicity. in this work the interactions of aβ(1-40) with ctac (cationic), sds (anionic), pfoa (anionic with fluorine atoms) and og (nonionic) amphiphiles in monomeric and micellar forms were studied. the results demonstrated that aβ(1-40) forms fibrils with different morphologies in the presence of micelles. in addition, the presence of micelles accelerates the formation of fibrils and decreases the lifetime of oligomers. we present here the exploitation of the powerful approach of coupling surface plasmon resonance imaging and mass spectrometry for protein fishing in biological fluids such as human plasma at the same sensitivity. on the one hand, multiplex-format spri analysis allows direct visualization and thermodynamic analysis of molecular avidity, and is advantageously used for ligand-fishing of captured bio-molecules on multiple immobilized receptors on a spri-biochip surface. on the other hand, maldi mass spectrometry is a powerful tool for identification and characterization of molecules captured on a specific surface. therefore, the combination of spri and ms into one concerted procedure, using a unique dedicated surface, is of great interest for functional and structural analysis at the low femtomole level of bound molecules. to reach these goals, particular surface engineering has been engaged to maintain a high level of antibody grafting and reduce non-specific adsorption. thus, various chemistries have been tested and validated with biological fluids such as plasma, keeping in mind the capacity for in situ investigation by ms.
finally, the signal-to-noise ratio was magnified, leading to the characterization of the protein lag3, a potential marker of breast cancer, in human plasma. atenolol incorporation into pnipa nanoparticles investigated by isothermal titration calorimetry. mihaela mic, ioan turcu, izabell craciunescu, rodica turcu, national institute for r&d of isotopic and molecular technologies, 400293 cluj-napoca, romania, e-mail: mihaela.mic@itim-cj.ro. poly(n-isopropylacrylamide) (pnipa) is a thermo-sensitive hydrogel undergoing a volume phase transition at about 32 °c, close to body temperature. this volume phase transition is envisaged as a key property for drug binding and release. the purpose of our research is the thermodynamic characterization of the binding of atenolol by pnipa polymeric nanoparticles. the thermodynamic parameters which characterize the binding process are obtained using isothermal titration calorimetry (itc) as the main investigation technique. when polymeric nanoparticles bind drug molecules, heat is either generated or absorbed, depending on the amount of bound molecules and also on the exothermic or endothermic character of the binding process. the heat measurement allows the determination of binding constants, reaction stoichiometry and the thermodynamic profile of the interaction. the itc technique has been used to investigate the binding properties of nanoparticles which shrink from the swollen to the collapsed state. the capacity of such nanogels to bind atenolol molecules is directly related to the differences between the binding properties in the swollen and in the collapsed state, respectively. aggregation study of x-(alkyldimethylammonium)alkylaldonamide bromides. p. misiak 1 , b. różycka-roszak 1 , e. woźniak 1 , r. skrzela 1 , k.a.
wilk 2 , 1 department of physics and biophysics, wrocław university of environmental and life sciences, wrocław, poland, 2 department of chemistry, wrocław university of technology, wrocław, poland. sugar-based surfactants are of considerable research interest because they have improved surface and performance properties, reduced environmental impact, and potential pharmaceutical and biomedical applications. x-(alkyldimethylammonium)alkylaldonamide bromides (c n gab) with different chain lengths (n = 10, 12, 14, 16), belonging to cationic sugar-based surfactants, were newly synthesised. the aggregation processes of c n gabs were studied by means of isothermal titration calorimetry (itc), the electric conductance method and molecular modelling methods. the critical micelle concentrations (cmc), the degree of micelle ionization (β), the enthalpies (Δhm) and the entropies (Δsm) of micellization as well as the contributions of the headgroups to the gibbs free energies (Δgm0(hy)) were calculated. the obtained values were compared with those for dodecyldimethylethylammonium bromide and literature data for analogous glucocationic surfactants. the latter compounds differ from c n gab surfactants by substitution of the sugar chain by a gluco ring. molecular modelling methods were used to relate the molecular properties of the compounds to their experimentally studied properties in solution. this work was supported by grant n n305 361739. every year over 50 million people are infected with dengue virus (denv), transmitted by a mosquito (aedes aegypti). this enveloped virus, a member of the flaviviridae family, has four distinct serotypes. it has a single-stranded positive rna molecule with a single open reading frame that encodes a single polyprotein, which, after appropriate processing by viral and host proteases, gives rise to three structural proteins (c, prm and e) and seven non-structural proteins (ns1, ns2a, ns2b, ns3, ns4a, ns4b and ns5) [1].
the surface of the immature virion is composed of e and prm heterodimers that are arranged as trimers protruding from the membrane [2]. the virus is thought to enter the host cell via receptor-mediated endocytosis, although the specific dengue receptors, if any, have not been described. once inside the cell, the acidified environment inside the endocytic vesicle triggers an irreversible trimerization of the envelope (e) protein, inducing the release of the nucleocapsid (composed of viral rna and multiple copies of the c protein) to the cytoplasm, thus starting the infection process, where the polyprotein is translated and processed, originating all viral proteins. considering the structural proteins c and e, these are essential for the viral infection; specifically, protein c is thought to be involved in the viral assembly and specific encapsidation of the genome, and protein e (a class ii fusion protein) plays a major role in the fusion process. as recently described by some studies [3], protein c is composed of four α-helices connected by four short loops and has a highly hydrophobic region forming a concave groove that could interact with lipid membranes, and a region with an increased concentration of positive charges, possibly interacting with the viral rna. as for protein e, it is composed of three β-stranded domains. it is proposed that the fusion loop is located in domain ii of this protein and the putative receptor binding sites, considered essential for viral entry, are supposedly located in domain iii. in this work, we describe the identification of the membrane-active regions of both these proteins, considering both theoretical studies, hydrophobic moments, hydrophobicity and interfaciality values, as well as experimental ones, namely fluorescence spectroscopy, where a fluorescent probe is encapsulated in model membrane systems, and differential scanning calorimetry [4].
we have found one region in protein c and four regions in protein e with membranotropic activity. this is the first work describing experimentally the putative membrane-interacting zones of both these proteins. this work was funded by grant bfu2008-02617-bm from ministerio de ciencia y tecnologia, spain, granted to jose villalaín. investigation of membrane-membrane interaction mediated by coiled coil lipopeptides. gesa pähler, andreas janshoff, georg-august-university, tammannstrasse 6, 37077 göttingen, germany, e-mail: gpaehle@gwdg.de. specific cellular membrane interaction and fusion are crucial processes in vivo, which in eukaryotic cells are mediated by snare proteins. the definite mechanism behind these processes is still poorly understood, but the coiled coil formation of a snare core complex consisting of four α-helices seems to generate a fusogenic driving force. this offers the possibility to design a straightforward experimental setup to mimic the complex protein-mediated membrane-membrane interaction by using mere protein fragments or peptides attached to artificial lipid bilayers which self-assemble into a coiled coil structure. in our approach, two artificial three-heptad-repeat coiled coil forming peptides were synthesized and attached to maleimide-functionalized membranes via an in situ coupling reaction. thus, secondary structure changes, kinetic characteristics and binding energetics were monitored during coiled coil formation with real-time ellipsometry, ir and cd spectroscopy. the lipopeptide-mediated membrane-membrane interaction itself is investigated by colloidal probe spectroscopy and tirfm. these techniques and the setup of our model system allow screening the energetic and structural properties of variable coiled coil forming peptides, i.e. linker-modified or biologically inspired sequences.
enzymatic reactions in nanostructured surfaces: unzipping and cutting the double helix. pietro parisse 1 , matteo castronovo 2 , bojana lucic 3 , alessandro vindigni 3 , giacinto scoles 2 and loredana casalis 1 , 1 sincrotrone trieste s.c.p.a., trieste, italy, 2 temple university, philadelphia, usa, 3 protein-dna interactions are vital for living organisms. from viruses to plants and humans, the interactions of these two different classes of biopolymers control processes as important and diverse as the expression of genes and the replication, rearrangement, and repair of dna itself. to understand these processes at the molecular level, and to follow changes in cellular pathways due to different kinds of perturbations and/or diseases, the identification and quantification of proteins and their complex network of interactions is necessary. we have exploited the high spatial resolution given by atomic force microscopy to generate dna arrays of variable density by means of nanografting. on such nanostructures, we investigate the mechanism of different enzymatic reactions (from restriction enzymes to helicases). by registering with high precision the height variation due to the action of the enzyme on the engineered dna sequences (in the case of restriction enzymes), or by taking advantage of the different mechanical properties of single- and double-stranded dna (in the case of helicases, where for the first time kinetic data were obtained on the human helicase recq1), we were able to monitor the activity and/or the action mechanisms of these two important classes of enzymes. in this study an attempt has been made to investigate the structure, dynamics and stability of cyclic peptide nanotubes (cpnts) formed by the self-assembly of cyclic peptides (cps), using classical molecular dynamics (md) simulation and semiempirical quantum chemistry calculations employing pm6.
the structure and energetics of monomer and various oligomeric cpnts have been investigated by considering the cyclo-[(d-ala-l-ala)4] peptide as the model cp. various geometrical parameters extracted from the md simulation reveal that the terminal residues are loosely hydrogen-bonded to the inner subunits regardless of the degree of oligomerization. the hydrogen bonds present in the inner core region are stronger than those of the terminal residues. as the degree of oligomerization increases, the stability of the tube increases due to the hydrogen bonding and stacking interactions between the subunits. the results show that the binding free energy increases with the extent of oligomerization and reaches saturation beyond cpnt5. in addition, hydrophobic and electrostatic interactions play crucial roles in the formation of cpnts. analysis of both the structure and the energetics of formation of cpnts unveils that the self-assembly of dimer, trimer and tetramer cpnts are the essential steps in the growth of cpnts. monolayers on a langmuir trough constitute a great biomimetic model to characterize protein-protein or protein-lipid interactions, where the physical state of the interfacial layer is completely controlled. we present here three studies performed on monolayers, with a wide panel of experimental (optical, spectroscopic, rheological) techniques. i) the surface properties and conformation of nephila clavipes recombinant spider silk proteins (masp1 and masp2) were studied at the air-water interface: we show that the mechanism of assembly of the two proteins is different, although both proteins share the same sequence pattern and a similar hydrophobicity. they both exhibit a certain propensity to form β-sheets that may be important for the efficiency of the natural spinning process. ii) the dystrophin molecular organization and its anchoring in a lipidic environment depend on the rod fragment used and on the lipid nature.
moreover, the interaction is guided by the lateral surface pressure. this lipid packing variation is essential to understand the role of dystrophin during the compression-extension cycle of the muscle membrane. iii) we show that the non-additive behavior of mixtures of food globular proteins leads to enhanced foaming properties or to self-assembled objects. nucleolar-targeting peptides (nrtps) were designed by structural dissection of crotamine, a toxin from the venom of a south-american rattlesnake. at µm concentrations, nrtps penetrate different cell types and exhibit exquisite nucleolar localization. the aim of this work was to pursue the study of the molecular mechanism of nrtp translocation, as well as to determine the ability of nrtps to deliver large molecules into cells. for the translocation experiments, rhodamine b-labeled nrtps were used and tested with giant multilamellar vesicles. confocal microscopy results show that there is efficient translocation across model membranes. high levels of intracellular peptide were also seen in different cell lines and pbmc soon after incubation with nrtp. furthermore, a conjugate of nrtp (nrtp6c) bound to β-galactosidase was prepared by chemical synthesis and tested in hela cells. this conjugate maintains enzymatic activity and is stable at 4 °c for several days. the work done so far with this new family of cell-penetrating peptides revealed strong interaction and translocation with lipid model systems. moreover, successful cellular delivery of β-galactosidase was observed and quantified. interaction of zinc phthalocyanine with ionic and non-ionic surfactants: uv-vis absorption and fluorescence spectroscopy for application in photodynamic therapy. m. p. romero, s. r. w.
louro, physics department, pontifícia universidade católica do rio de janeiro (puc-rio), brazil. among the second-generation photosensitizers (ps) developed for the treatment of neoplastic diseases by photodynamic therapy (pdt), metallo-phthalocyanines (mpc) have been proposed as an alternative to the ps currently used in clinical application. unsubstituted mpc are not soluble in physiological solvents and their in vivo administration relies upon their incorporation into carriers or their chemical conversion into water-soluble dyes by the attachment of selected substituents. in this work, uv-vis absorption and fluorescence spectroscopy were used to study the ability of different micelles to disperse zinc phthalocyanine (znpc). the following surfactants were tested: sds, ctab, hps, tween 80, tween 20, and pluronic f127. znpc has low solubility in virtually all solvents, but dmf and dmso are observed to dissolve znpc in concentrations of the order of 0.9 and 0.2 mmol/l, respectively. stock solutions of znpc in dmf and dmso were prepared. micelles of the different surfactants containing znpc were prepared by dissolving in aqueous medium (milli-q water or phosphate buffer, ph 7.4) small amounts of the stock solutions previously mixed with each surfactant. the amounts of each surfactant were calculated to give an average ratio of one znpc molecule per micelle in the final solution. the absorption and fluorescence spectra of znpc in the micellar systems were obtained, and were observed to change in time. immediately after dissolution the spectra are characteristic of monomeric znpc, suggesting the formation of znpc-containing nanoemulsions with the mixture of znpc-organic solvent in the hydrophobic region of the micelle. since dmso and dmf are miscible with water, the solvent diffuses out of the micelle and znpc stays inside the micelle in a monomeric or aggregated form. the different surfactants lead to different time evolution of znpc aggregation.
aggregation lifetimes vary from one hour, in the case of pluronic f127, to more than twelve hours, in the case of ctab and hps. it was observed that the ionic surfactants were more efficient than the non-ionic ones for monomeric delivery of znpc. work partially supported by cnpq, inami and faperj. nucleobase-containing peptides are an important class of molecules comprising both artificial (synthetic nucleopeptides) and natural (peptidyl nucleosides and willardiine-containing peptides) compounds, characterized in many cases by interesting biological properties [1,2]. in this work, we report a spectroscopic study on the properties of a chiral nucleobase-bearing peptide obtained by chemical synthesis starting from commercial sources. the findings of this research strongly encourage further efforts in the field of the use of nucleopeptides as supramolecular assembling systems and open the way to novel drug delivery approaches based on nucleobase recognition. conformational plasticity. their structure depends tremendously on their local environment and confinement, and may accommodate several unrelated conformations, which are a strong challenge for the traditional characterizations of structure, supramolecular assembly and biorecognition phenomena. atomic force microscopy (afm) has been successfully exploited both for highly controllable nanolithography of biomolecules and for biorecognition studies, such as oriented prion protein-antibody interaction (sanavio et al., acs nano (2010) 4(11): 6607; bano et al., nano lett (2009) 9(7): 2614-8). here, we report different strategies for selective, oriented confinement of alpha-synuclein at the nanoscale for sensitive and accurate direct detection, via precise topographic measurements on ultraflat surfaces, of biomolecular interactions in confined assemblies. a new class of cell-penetrating peptides (cpps) was generated by splicing the (1-9) and (38-42) segments of crotamine, a toxin from crotalus durissus terrificus venom [1].
as they localize preferentially in the nucleolus, these novel cpps were named nucleolar-targeting peptides (nrtps). the extent of nrtp partition into zwitterionic (popc; popc:cholesterol 67:33) and anionic (popg; popc:popg 70:30) lipid vesicles was studied following the intrinsic tyr or trp fluorescence of the peptides. the partition curves into popc zwitterionic vesicles were characterized by downward slopes and higher partition coefficients (kp ~ 10^4-10^5). for pure popg, an upward curve and a smaller partition coefficient point to a different type of membrane-peptide interaction. popc:popg membranes present characteristics of both types of interaction. similar conclusions were reached from red-edge excitation shift and quenching experiments. leakage assays ruled out lipid vesicle disruption by crotamine or nrtps. further studies on the nrtp cellular translocation mechanism and large-molecule delivery are currently in progress. dystrophin is essential to skeletal muscle function and confers resistance to the sarcolemma by interacting with the cytoskeleton and the membrane. we characterized the behaviour of dys r11-15, five spectrin-like repeats from the central domain of human dystrophin, in the presence of liposomes and monolayers as membrane models. the interaction of dys r11-15 with suvs depends on the lipid nature, anionic or zwitterionic, and on the lipid packing when comparing luvs to suvs. the lateral pressure of lipid monolayers modifies the protein organization and leads dys r11-15 to form a regular network, as revealed by afm. trypsin proteolysis assays show that the protein conformation is modified following its binding to monolayers and suvs. label-free quantification by nano-lc/ms-ms allowed identifying the helical amino acid sequences in repeats 12 and 13 that are involved in the interaction with anionic suvs.
results indicate that dys r11-15 constitutes a part of dystrophin that interacts with anionic as well as zwitterionic lipids and adapts its interaction and organization depending on lipid packing and lipid nature. we provide here strong experimental evidence for a physiological role of the central domain of dystrophin in sarcolemma scaffolding through modulation of lipid-protein interactions. matrix metalloproteinases (mmps) degrade extracellular matrix proteins. overexpression of the mmps has been associated with a variety of diseases ranging from periodontal disease and arthritis to tumor invasion and metastasis. the majority of the more powerful synthetic inhibitors of mmps incorporate a hydroxamate group, but exhibit low selectivity and are toxic. in a recent modeling study, astaxanthin (ast), a carotenoid with potent antioxidant properties, has been shown to be a potential inhibitor of mmp-13 function by occupying a binding site near the active center of the enzyme (bikádi et al. 2006). in our ongoing project, we investigate the binding of ast to the catalytic domain of mmps using biochemical methods and ultimately crystallization to validate the proposed action of ast. along these lines, the catalytic domain of mmp-13 (cdmmp-13) was expressed in e. coli bl21(de3) codon-plus and refolded using a novel, effective refolding method. our results reveal that ast has a potent inhibitory effect on cdmmp-13 activity; however, determination of ic50 or ki is difficult due to fast oxidation and structural instability of ast. ongoing work aims at optimizing the inhibition conditions and improving the refolding yield to allow analyzing the structure and function of ast-bound mmp-13 in more detail. hyaluronic acid (hyaluronan, ha) is a linear polysaccharide with a molar mass in the range of 10^5 to 10^7 da and is built from alternating units of glucuronic acid and n-acetylglucosamine.
synthesized in the cellular plasma membrane, it is a network-forming and space-filling component in the extracellular matrix of animal tissues. here, we create hyaluronic acid films atop a porous alumina substrate, where they act as a barrier for macromolecular transport depending on their length and geometry. the geometry of the hyaluronic acid switches between a fully stretched and a mushroom-like state and depends on the concentration of hyaluronic acid. to bind hyaluronic acid selectively atop the nanoporous anodic aluminum oxide (aao), the aao is orthogonally functionalized by silane chemistry. by means of time-resolved optical waveguide spectroscopy (ows), the transport of macromolecules, e.g. avidin, across the hyaluronic acid barriers can be recorded as a function of the pore diameter and the hyaluronic acid concentration in a time-resolved and label-free manner. confocal laser scanning microscopy (clsm) provides an alternative method to investigate the orthogonal functionalization of the pores and to elucidate whether a molecule can cross the barrier at the pore entrance. we functionalized gold surfaces with a hydroxy-terminated self-assembled thiol monolayer exposing an adjustable fraction of biotin moieties [1]. by in situ acetylation or fluorination, the surface properties could be fine-tuned to different protein immobilization scenarios. using streptavidin as a linker protein, immobilization of human abcc3 [2] in liposomal and planar bilayer systems was possible. abcc3-containing proteoliposomes doped with a biotinylated anchor lipid were successfully tethered to our streptavidin-coated surfaces. biotinylation of the extracellular glycosylation of abcc3 allowed direct immobilization in an inside-up orientation and subsequent assembly of a lipid bilayer. outside-up orientation was achieved by exploiting the c-terminal histidine tag of recombinant abcc3 for immobilization via ni2+ and biotin-nitrilotriacetate.
All systems were thoroughly characterized by quartz crystal microbalance, atomic force microscopy and surface plasmon resonance techniques, with a view to monitoring ABCC3-mediated substrate transport in real time. Because of its role in the apoptotic pathway, conformational transitions of cytochrome c (cyt c) have gained interest. In native cyt c, Met80 and His18 residues serve as heme axial ligands. Cyt c interaction with the membrane causes disruption of the iron-Met80 bond. This allows the binding of other endogenous ligands, forming alternative low-spin species 1,2, or induces peroxidase activity through the formation of a five-coordinated high-spin iron species. Acquisition of this peroxidase activity by cyt c has been shown to be a key stage before its release from the mitochondria 3. In order to study these non-native low-spin species by checking the possible amino acids able to bind the human cyt c heme, different mutants have been designed and produced: H26Q, H33N, and the double mutant H26Q/H33N. SDS. In countries where seafood is an integral part of the diet, fish are among the most common food allergen sources. The major fish allergen parvalbumins are abundant in the white muscle of many fish species. Parvalbumin belongs to the family of EF-hand proteins and has a globular shape containing six helical parts. High pressure is known to unfold proteins. We performed high-pressure FTIR experiments to explore the p-T phase diagram of cod parvalbumin, Gad m 1, and to test the possibility of its inactivation by high-pressure treatment. The infrared spectrum of parvalbumin is characteristic of the helical conformation, in agreement with the crystal structure. A marked transition in the structure of the parvalbumin was observed with a central point of 0.5 GPa (at room temperature). The amide I position shifts to a wavenumber between the helical and the unfolded positions. We assign this change to a native-to-molten-globule transition.
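A pressure-induced transition of this kind is conventionally analysed with a two-state model, in which the unfolded fraction follows f_u(P) = 1/(1 + exp(ΔV·(P − P½)/RT)). A minimal sketch: only the 0.5 GPa midpoint is taken from the abstract; the unfolding volume change ΔV is an assumed placeholder, not a reported value.

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def unfolded_fraction(p_gpa, p_half_gpa=0.5, dv_ml_mol=-60.0, t_k=298.0):
    """Two-state pressure unfolding: f_u = 1 / (1 + exp(dV*(P - P1/2) / RT)).

    dv_ml_mol is an ASSUMED unfolding volume change in mL/mol (hypothetical);
    a negative dV favours the unfolded state at high pressure.
    Pressures are in GPa; note 1 mL*GPa = 1000 J, hence the unit conversion.
    """
    dv_j_per_gpa = dv_ml_mol * 1000.0          # (mL/mol)*GPa -> J/mol
    dg = dv_j_per_gpa * (p_gpa - p_half_gpa)   # unfolding free energy at P
    return 1.0 / (1.0 + math.exp(dg / (R * t_k)))

# at the midpoint pressure the unfolded fraction is 0.5 by construction
print(unfolded_fraction(0.5))        # 0.5
print(unfolded_fraction(0.8) > 0.9)  # True: well above the transition
```

Fitting this expression to the pressure-dependent amide I shift would yield both the midpoint and ΔV; the sketch only illustrates the functional form.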
It was reversible, as seen from the infrared spectra. It has been proven in the past that reflectometric interference spectroscopy (RIfS) is a powerful measurement system for the detection of protein-protein interactions 1. We present here the development of a reflectometric sensor which allows for the detection of membrane-protein interactions in the micromolar regime. In this study we employ two different instrumental assemblies. The first enables direct detection and quantification of the interaction of membrane proteins with solid-supported lipid bilayers. In the second, the original instrument is combined with an upright fluorescence microscope. The advantages of this installation are direct optical control of the experiment as well as a smaller sensing area. The set-up allows for the detection of interactions on lipid patches of just several micrometers in diameter. The aim of this work is an experimental system that enables the measurement of transport processes through lipid membranes. We attempt to achieve this by covering a closed porous substrate with a lipid membrane. The first steps towards this goal were taken by spanning membranes over anodized aluminum oxide substrates. Initiation of actin polymerization in cells requires nucleation factors. A pointed-end-binding protein of F-actin, leiomodin2, acts as a strong filament nucleator in muscle cells. The dynamical, structural and kinetic properties of a protein can provide important information for understanding the intramolecular events underlying its function. We are interested in how leiomodin2 regulates actin polymerization. Our aim is to determine the dissociation constant of the actin-leiomodin2 complex, and to study a possible side-binding effect of leiomodin2. The cardiac leiomodin2 of Rattus norvegicus is a 50 kDa protein, which contains a 17 kDa N-terminal region, an 18 kDa leucine-rich repeat (LRR) and a 15 kDa C-terminal region.
The N-term and LRR regions together are tropomodulin homologues. We expressed the wild type, the N-term+LRR, the LRR+C-term and the C-term protein fragments using a pTYB1 vector that contains an ampicillin resistance gene. The expression of the proteins was carried out with the TWIN-CN (NE BioLabs) kit, which is a chitin-intein self-cleavage and purification system. The nucleation activity of leiomodin and the polymerization speed of actin in the presence of tropomyosin and leiomodin were studied using a pyrene-actin polymerization assay. We measured the stoichiometric, conformational and kinetic properties of the leiomodin-actin complexes with co-sedimentation assays, fluorescence spectroscopic and rapid-kinetic methods. The results showed that the rate of actin polymerization depended on the leiomodin2 concentration. The nucleator activity of leiomodin2 was ionic-strength dependent. The data also confirmed that leiomodin2 is a side-binding and pointed-end-binding protein of F-actin. The binding of leiomodin2 to the sides of the actin filaments was slower than to the pointed end of F-actin. The structure of F-actin was changed by the side-bound leiomodin2. These observations will contribute to a better understanding of the development and function of thin filaments in cardiac and other muscle tissues. Leukemias are among the most common malignancies worldwide. There is a substantial need for new chemotherapeutic drugs effective against these diseases. Doxorubicin (DOX), used for treatment of leukemias and solid tumors, is poorly efficacious when administered systemically at conventional doses. Therefore, in our study, to overcome these limitations, we used transferrin (TRF) as a drug carrier. We compared the effect of DOX and the doxorubicin-transferrin conjugate (DOX-TRF) on human leukemic lymphoblasts (CCRF-CEM). The in vitro growth-inhibition test, the XTT assay, indicated that DOX-TRF was more cytotoxic to leukemia cells than DOX alone.
In our research we also evaluated alterations of the mitochondrial transmembrane potential (ΔΨm) and the production of reactive oxygen species (ROS). We monitored ΔΨm using the dye JC-1 (5,5',6,6'-tetrachloro-1,1',3,3'-tetraethylbenzimidazolylcarbocyanine). The level of ROS was studied using the fluorescent probe DCFH2-DA (2',7'-dichlorodihydrofluorescein diacetate). The results demonstrate that DOX-TRF induced a decrease of mitochondrial membrane potential and significantly higher production of ROS compared with DOX-treated cells. Moreover, all these results seem to be correlated with DNA fragmentation, analyzed by DNA ladder. The tested processes were partially inhibited by the antioxidant N-acetylcysteine (NAC). The changes induced by the DOX-TRF conjugate and the free drug suggest different mechanisms of action of DOX alone and conjugated with transferrin. Time-resolved detection of protein-protein interaction. Masahide Terazima, Kyoto University, Kyoto 606-8502, Japan. Revealing the molecular mechanism of a protein reaction has been a central issue in biophysics. For that purpose, a variety of time-resolved spectroscopic methods have been developed. However, most of them can monitor only dynamics associated with an optical transition, and it has been very difficult to trace processes without an optical transition. We used the pulsed-laser-induced transient grating (TG) method to study spectrally silent reactions of various proteins in the time domain. Here we will show studies on PixD. PixD is a 17 kDa short protein which consists of the BLUF domain and additional short helices, and is involved in phototactic movement. The photochemical reaction studied by absorption spectroscopy revealed that this protein exhibits the typical photochemistry of the BLUF proteins. The red-shifted intermediate is generated within 100 ps. The spectrum does not change after this initial reaction, and returns to the dark state with a time constant of 12 s at room temperature.
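The dark-state recovery quoted above (τ = 12 s) is first-order kinetics; a minimal sketch of the corresponding single-exponential decay, where the time constant is the only number taken from the abstract:

```python
import math

TAU_S = 12.0  # dark-state recovery time constant from the abstract, in seconds

def intermediate_fraction(t_s, tau_s=TAU_S):
    """Fraction of molecules still in the red-shifted intermediate at time t
    after the flash, assuming first-order decay back to the dark state."""
    return math.exp(-t_s / tau_s)

# after one time constant ~37% remain; after 5*tau recovery is >99% complete
print(round(intermediate_fraction(12.0), 3))  # 0.368
print(intermediate_fraction(60.0) < 0.01)     # True
```

The same exponential form underlies the recovery of the TG signal, which is how such spectrally silent time constants are typically extracted.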
We studied the reaction of this protein by our method and found that the protein-protein interaction changes drastically during the reaction. The details and the biological meaning will be presented. Human ileal bile acid-binding protein (I-BABP) plays a key role in the enterohepatic circulation of bile salts. Previously we have shown that the protein has two binding sites and that tri- and dihydroxy bile salts bind with strong and moderate positive cooperativity, respectively. Positive cooperativity is thought to be related to a slow conformational change in the protein. Our current study is directed at the structural and dynamic aspects of molecular recognition in human I-BABP using NMR spectroscopy and other biophysical techniques. As a first step in the investigation, 15N relaxation NMR experiments have been employed to characterize the backbone motion in the apo and holo protein on a wide range of timescales. Our results show a moderately decreased ps-ns flexibility in the ligated protein, with the most significant ordering near the portal region. In addition, the measurements indicate a slow µs-ms fluctuation at four distinct segments in the apo protein, a motion not observed in the doubly-ligated form at room temperature. Our studies support the hypothesis of an allosteric mechanism of binding cooperativity in human I-BABP. To shed more light on the molecular details of the binding mechanism, a site-directed mutagenesis study is in progress. Cationic porphyrin-peptide conjugates were recently shown to enhance the delivery of the peptide moiety to the close vicinity of nucleic acids, but their interaction with DNA has not yet been studied. We synthesized two cationic porphyrin-peptide conjugates: tetrapeptides were linked to the tri-cationic meso-tri(4-N-methylpyridyl)-mono-(4-carboxyphenyl)porphyrin and the bi-cationic meso-5,10-bis(4-N-methylpyridyl)-15,20-di-(4-carboxyphenyl)porphyrin.
DNA binding of the porphyrins and their peptide conjugates was investigated with comprehensive spectroscopic methods. Evidence provided by the decomposition of absorption spectra, fluorescence decay components, fluorescence energy transfer and CD signals reveals that peptide conjugates of di- and tri-cationic porphyrins bind to DNA by two distinct binding modes, which can be identified as intercalation and external binding. The peptide moiety does not oppose the interaction between the DNA and the cationic porphyrins. We compared the effect of complexation on the structural stability of DNA and of nucleoprotein complexes: HeLa nucleosomes and T7 phage. UV and CD melting studies revealed that porphyrin binding increases the melting temperature of DNA and destabilizes the DNA-protein interaction in the nucleosomes, but not in the T7 phage. The wide nucleotide specificity of 3-phosphoglycerate kinase (PGK) allows its contribution to the effective phosphorylation (activation) of nucleotide-based pro-drugs against HIV. Here the structural basis of the nucleotide-PGK interaction is characterised in comparison to other kinases, namely pyruvate kinase (PK) and creatine kinase (CK), by enzyme kinetic and structural modelling studies. The results evidenced a preference for purine- over pyrimidine-base-containing nucleotides for PGK rather than for PK or CK. This is due to the exceptional ability of PGK to form hydrophobic contacts with the nucleotide rings, which assures the appropriate positioning of the connected phosphate chain for catalysis. The unnatural L-configurations of the nucleotides (both purine and pyrimidine) are better accepted by PGK than by either PK or CK. Further, for the L-forms, the absence of the ribose OH-groups is better tolerated by PGK for nucleotides with a purine rather than a pyrimidine base.
On the other hand, positioning the phosphate chain of both purines and pyrimidines with the L-configuration is even more important for PGK, as deduced from kinetic studies with various nucleotide-site mutants. These characteristics of the kinase-nucleotide interactions can provide a guideline in drug design. The role of the ATPase enzyme types in muscle contraction. G. Vincze-Tiszay 1, J. Vincze 1, E. Balogh 2. 1 HHEIF, Budapest, Hungary; 2 Nové Zámky, Slovakia. The myofibril assuring muscle contraction gains the energy for the sliding mechanism, and the efficiency of this process is decisively determined by the velocity of recombination of the ATP molecule. Here the Na+/K+-ATPase and Mg2+-ATPase enzymes play a particular part. Chemical reactions taking place in the living organism are catalyzed by enzymes, including the recombination of ADP to ATP. This transport process can be modelled from the energetic point of view on the basis of the general transport theorem, through a formula of the form grad a · dx/dt. From the point of view of muscle contraction it is of interest whether, depending on the type of motion, the duration is very short (a few seconds) or we are dealing with a long-lasting process. In the first case one can compare the decomposition of ATP to an avalanche effect. Its efficiency is determined by the migration and linkage velocity of the ions. Conclusion: the efficiency of muscle contraction is determined by the quantities of the two enzymes (Na+/K+-ATPase and Mg2+-ATPase) relative to each other. [1]. Experiments with deletion mutants have shown that the amino-terminal domain contains a beta sheet with an ordered array of acidic residues, which mediates the attachment to basic calcium phosphates [2, 3].
The inhibition is based on the formation of nanometer-sized, spherical mineral-fetuin-A colloids, denoted calciprotein particles (CPPs) [2, 4]. The initially formed CPPs show hydrodynamic radii in the range of 50 nm and are only transiently stable. After a distinct lag time, they undergo a morphological change towards larger prolate ellipsoids with hydrodynamic radii of 100-120 nm [5]. In this context, we studied the role of fetuin-A in the formation and ripening of CPPs. On the one hand, dynamic light scattering (DLS) was used to study the influence of temperature, fetuin-A concentration and mineral-ion supersaturation on the kinetics of CPP ripening [6]. On the other hand, the protein fetuin-A was investigated by means of small-angle X-ray scattering (SAXS) and fluorescence correlation spectroscopy (FCS). Degradation of the mRNA cap (mRNA 5' end) by the DcpS (decapping scavenger) enzyme is an important process in the regulation of gene expression, but little is known about its mechanism. The biological role of DcpS and its potential therapeutic applications, e.g. as a novel therapeutic target for spinal muscular atrophy, make it an interesting object for biophysical investigations. The ability of DcpS to act on various short capped mRNAs will be presented. We have examined the substrate specificity and binding affinity of the enzyme in a quantitative manner, employing experimental physics' resources such as atomic force microscopy (AFM) and fluorescence spectroscopy for enzyme kinetics, and the time-synchronized titration method (TST 2007). In this study we extended the application of MQD-IHC to investigate potential biomarkers associated with prostate cancer (PCa) invasiveness and lethal progression.
Objectives: to establish an MQD-IHC protocol using QD light-emitting nanoparticles 1) to detect the expression/activation of critical cell-signaling proteins at the single-cell level; 2) to image the plasticity and lethal progression of human PCa with specific emphasis on EMT and c-Met signaling; and 3) to examine the utility of MQD-IHC in clinical PCa specimens to determine invasion ability and predict metastatic capability. Results: we analyzed the co-expression and activation of osteomimicry-associated biomarkers, β2-microglobulin (β2-M), phosphorylated cyclic AMP responsive element binding protein (pCREB) and androgen receptor (AR), in 2,100 cells from 14 localized PCa tissue areas (Gleason 3 and 4) of 10 patients with known metastatic status. The overall median percentage of triple-positive β2-M+/pCREB+/AR+ cells was 51.5%. The median triple-positive fraction for the samples with metastatic potential was 61%, compared with those without metastatic potential (median = 0%); p = 0.01 by a Wilcoxon rank-sum test. The results were confirmed in 11 PCa bone metastatic specimens. We also investigated c-Met signaling in a castration-resistant human PCa model, or CRPC xenografts, and in clinical PCa specimens, and found that the downstream signal components, including pAkt and Mcl-1, were activated. Conclusion: to validate our findings, additional clinical specimens with confirmed survival data will be analyzed, and the cell-signaling-network-based MQD-IHC will be automated by the Vectra image analysis system in a high-throughput manner, with the hope of predicting the lethal progression of PCa prior to clinical manifestation of distant metastases. Protein-ligand binding is an important field of biopharmaceutical research. There are many techniques for quantitative determination of ligand binding. The combination of isothermal titration calorimetry (ITC) and thermal shift assay provides a robust estimate of the binding constant.
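Measured binding constants are related to the binding free energy by ΔG = −RT ln Kb = RT ln Kd, the relation underlying both ITC and thermal-shift analyses. A minimal sketch of this conversion; the 10 nM example affinity is hypothetical, chosen purely for illustration:

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def binding_free_energy(kd_molar, t_k=298.15):
    """Gibbs free energy of binding (J/mol) from the dissociation constant:
    dG = -RT * ln(Kb) = RT * ln(Kd), since Kb = 1/Kd (standard state 1 M)."""
    return R * t_k * math.log(kd_molar)

# a hypothetical 10 nM binder at 25 C: dG of about -45.7 kJ/mol
dg = binding_free_energy(10e-9)
print(round(dg / 1000.0, 1))  # -45.7
```

With ΔG from the binding constant and ΔH from the calorimetric titration, the entropic term −TΔS follows by difference, which is how the intrinsic thermodynamic profile discussed below is assembled.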
Many binding reactions are coupled to the absorption or release of protons by the protein or the ligand, to conformational changes of the protein, and to other processes. To correlate the structural features of binding with the energetics of binding, one needs to carry out a detailed thermodynamic study of the binding reaction and determine its dependence on pH, buffer and temperature. Here we present a detailed thermodynamic description of radicicol binding to the human heat shock protein Hsp90 and determine the proton-linkage contributions to the observed binding thermodynamics. We calculated the pKa of the group responsible for the proton linkage, the protonation enthalpy of this group, and the intrinsic thermodynamic parameters for radicicol binding. The intrinsic enthalpy of radicicol binding to Hsp90 is one of the largest enthalpies observed for any protein-ligand binding. The structural features responsible for such a large binding enthalpy and a very favorable intrinsic binding Gibbs free energy are discussed. Neuronal systems and modelling. O-111 Optogenetic electrophysiology. Walther Akemann, Amelie Perron, Hiroki Mutoh, and Thomas Knöpfel, Laboratory for Neuronal Circuit Dynamics, RIKEN Brain Science Institute, Japan. The combination of optical imaging methods with targeted expression of protein-based fluorescent probes enables the functional analysis of selected cell populations within intact neuronal circuitries. We previously demonstrated optogenetic monitoring of electrical activity in isolated cells, brain slices and living animals using voltage-sensitive fluorescent proteins (VSFPs), generated by fusing fluorescent proteins with a membrane-integrated voltage-sensor domain. However, several properties of these voltage reporters remained suboptimal, limiting the spatiotemporal resolution of VSFP-based voltage imaging.
A major limitation of VSFPs had been a reduced signal-to-noise ratio arising from intracellular aggregation and poor membrane targeting upon long-term expression in vivo. To address this limitation, we generated a series of enhanced genetically-encoded sensors for membrane voltage (named VSFP-Butterflies) based on a novel molecular design that combines the advantageous features of VSFP2s and VSFP3s with molecular trafficking strategies. The new sensors exhibit faster response kinetics at subthreshold membrane potentials and enhanced localization to neuronal plasma membranes after long-term expression in vivo, enabling the optical recording of action potentials from individual neurons in single sweeps. VSFP-Butterflies provide optical readouts of population activity, such as sensory-evoked responses and neocortical slow-wave oscillations, with signal amplitudes exceeding 1% ΔR/R0 in anesthetized mice. VSFP-Butterflies will empower optogenetic electrophysiology by enabling new types of experiments bridging cellular and systems neuroscience and illuminating the function of neural circuits across multiple scales. Opsin molecules are a burgeoning new tool for temporally precise neuronal stimulation or inhibition. Opsin properties are commonly characterized in cell culture or acute brain-slice preparations using whole-cell patch-clamp techniques, where the neuronal membrane voltage is fixed at the resting potential. However, in vivo, where neurons are firing action potentials, opsins are exposed to large fluctuations in membrane voltage and transmembrane ionic concentrations, which can influence opsin function. In the case of implanted light-delivery devices, stimulation light power varies as a function of brain tissue volume. We therefore investigated the stability of opsin properties across a variety of in vivo-like stimulation conditions.
We find that the off-kinetics of excitatory opsins vary significantly with holding membrane potential: channelrhodopsin-2 (ChR2) slows with depolarisation, whereas ChIEF (a ChR1/ChR2 hybrid), in contrast, accelerates. New ChR2 point-mutation variants demonstrate stability across all membrane potentials. We additionally explore responses to initial and subsequent light pulses and find that ChIEF has the unique property of accelerating kinetics after the first light stimulation. Inhibitory opsins vary in their sensitivity to light in a manner that correlates with their off-kinetics. Slower opsins, such as Mac (Leptosphaeria maculans), have higher sensitivity at low light power densities, saturating early relative to fast inhibitory opsins such as Arch (archaerhodopsin) and NpHR (halorhodopsin). We discuss the relative merits of stability versus versatility of opsins under variable stimulation conditions. It has been previously shown that overexpression of NDM29 ncRNA in a SKNBE2-derived neuroblastoma (NB) cell line leads to cell differentiation, with a decrease of malignant potential. Here we use the patch-recording technique to characterize the ion-channel apparatus of NB cells expressing NDM29 at its basal level (mock cells) or at 5.4-fold higher levels (S1 cells). The two cell lines shared very similar pools of functional K channels, but S1 cells displayed larger TTX-sensitive Na currents and were able to generate action potentials, while mock cells were not. In addition, while mock cells barely express functional GABA receptors, in the majority of S1 cells rapid application of GABA elicited a current with an EC50 = 11.4 µM; this current was antagonized by bicuculline (10 µM) and potentiated by zaleplon (EC50 = 35 nM). In mock cells, real-time PCR evidenced a high level of the GABA-A α3 subunit, while in S1 cells significant expression of α1 and α4 was detected, whereas α3 mRNA was downregulated by 70%, confirming the development of functional GABA-A receptors.
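Concentration-response data of the kind behind these EC50 values are conventionally described by the Hill equation. A minimal sketch: the 11.4 µM EC50 is the value reported for S1 cells, while the Hill coefficient is an assumed placeholder, not a number from the abstract.

```python
def hill_response(conc_um, ec50_um=11.4, hill_n=1.0, top=1.0):
    """Fractional agonist-evoked response by the Hill equation:
    r = top * c^n / (c^n + EC50^n).

    ec50_um = 11.4 uM is the GABA EC50 reported for S1 cells; hill_n = 1.0
    is an ASSUMED coefficient for illustration only."""
    cn = conc_um ** hill_n
    return top * cn / (cn + ec50_um ** hill_n)

# at c = EC50 the response is half-maximal by definition
print(hill_response(11.4))         # 0.5
print(hill_response(100.0) > 0.85) # True: approaching saturation
```

In practice the EC50 and Hill coefficient are obtained together by fitting this expression to the measured current amplitudes over a concentration series.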
In the same cell lines, the presence of specific markers and the secretion of specific cytokines confirmed that NDM29 expression leads to a differentiation process toward a neuron-like, rather than glial-like, phenotype. It was therefore planned to reconstitute a model of brain tumors in rats by orthotopic implantation of xenogenic transformed human cells. Iron is an important element used for the catalysis of chemical reactions and for physiological cell functions. The reason for iron deposition is still unknown. Under the conditions prevailing in the human brain, the formation of an amorphous or minutely crystalline phase is expected. We used light, scanning (SEM) and transmission electron microscopy (TEM), energy-dispersive microanalysis, electron diffraction and electron paramagnetic resonance (EPR) to investigate iron deposits in the globus pallidus of the human brain. SEM revealed iron-rich particles with Na, Si, P, S, Cl, Ca and Cu around glial cells. TEM revealed bumpy, solid particles of platy and sometimes rounded shape with sizes of 2 µm to 6 µm. These were identified as hematite. EPR measurements showed the presence of Fe(III) and Cu(II), but a small amount of Fe(II) cannot be excluded. We consider a low-temperature process of hematite formation in the human globus pallidus in an aqueous environment influenced by organic and inorganic factors. Chemical processes leading to nanoparticle formation can be associated with neurodegenerative disorders such as Alzheimer's or Parkinson's disease. Over the past 50 years, our understanding of the basic biophysical mechanisms governing the spatio-temporal dynamics of neuronal membrane potentials and synaptic efficacies has significantly expanded and improved.
Much research has focussed on how ionic currents contribute to the generation and propagation of action potentials, how subthreshold signals propagate along dendritic trees, how the active properties of dendrites shape the integration of incoming signals in a neuron, and how pre- and postsynaptic activities, and potential heterosynaptic effects, determine the way synaptic efficacies change on the short and long term. Yet, despite these advances, there have been no systematic efforts to relate the basic dynamical repertoire of neurons to the computational challenges neural circuits face, and in particular to explain systematically how the biophysical properties of neurons are adapted to process information efficiently under the constraints of noise and uncertainty in the nervous system. As an initial step in this direction, I will show how various biophysical properties of neurons, in particular short-term synaptic plasticity and dendritic non-linearities, can be seen as adaptations to resolve an important bottleneck in neuronal information processing: the loss of information entailed by the conversion of analogue membrane potentials to digital spike trains. The optogenetic toolbox has greatly expanded since the first demonstration of genetically-targeted optical manipulation of neural activity. In addition to the cation channel channelrhodopsin-2 (ChR2), the panel of excitatory opsins now includes an array of ChR2 variants with mutations in critical residues, in addition to other, related cation channels and channel hybrids. The inhibitory opsin panel has similarly expanded beyond the first-described halorhodopsin (NpHR), a chloride pump, to include trafficking-enhanced versions of NpHR as well as the proton pumps Mac and Arch. While the expansion of available opsins offers researchers an increasingly powerful and diverse selection of tools, it has also made it increasingly difficult to select the optimal tool for a given experiment.
One cannot extract a comparison of opsins from the current literature, since studies differ across multiple variables known to contribute to opsin performance (e.g. expression method, light power density, stimulation protocols, etc.). Here, we provide the first empirical comparison of both excitatory and inhibitory opsins under standardized conditions. Furthermore, we identify the set of parameters that describe the properties of an opsin in a way that is maximally relevant for biological application. O-120 Subcellular compartment-specific distribution of voltage-gated ion channels. Zoltan Nusser, Institute of Experimental Medicine, Hungarian Academy of Sciences, Budapest, Hungary. Voltage-gated Na+ (Nav) channels are essential for generating the output signal of nerve cells, the action potential (AP). In most nerve cells, APs are initiated in the axon initial segment (AIS). In vitro electrophysiological and imaging studies have demonstrated that dendritic Nav channels support active backpropagation of APs into the dendrites, but the subunit composition of these channels remained elusive. Here, I will present evidence for the somato-dendritic location of Nav channels in hippocampal pyramidal cells (PCs). Using a highly sensitive electron-microscopic immunogold localization technique, we revealed the presence of the Nav1.6 subunit in PC proximal and distal dendrites, where its density is 40-fold lower than that found in AISs. A gradual decrease in Nav1.6 density along the proximo-distal axis of the dendritic tree was also detected. We have also investigated the subcellular distribution of the Kv4.2 voltage-gated K+ channel subunit and found a somato-dendritic localization. In contrast to that of Nav1.6 channels, the density of Kv4.2 first increases and then decreases as a function of distance from the somata of PCs. Such subcellular compartment-specific distribution of voltage-gated ion channels increases the computational power of nerve cells.
Keywords: memory, extracellular matrix, random walk. We first present a biological model of memory based, on the one hand, on the mechanical oscillations of axons during the action potential and, on the other hand, on the changes in extracellular matrix composition when a mechanical strain is applied to it. Due to these changes, the stiffness of the extracellular matrix along the most excited neurons will increase close to these neurons, due to the growth of astrocytes around them and to the elastoplastic behavior of collagen. This creates preferential paths linked to a memory effect. In a second part, we present a physical model based on a random walk of the action potential on the array composed of dendrites and axons. This last model shows that repetition of the same event leads to long-term memory of this event, and that paradoxical sleep leads to the linking of different events put into memory. Myelinated nerve fibres were studied with fluorescence microscopy and laser interference microscopy. Ca2+ redistribution during prolonged stimulation, changes in morphology and rearrangement of cytoplasmic structures were compared in normal conditions and after membrane modification by lysolecithin and methyl-β-cyclodextrin. Lysolecithin is a detergent known to provoke demyelination, and methyl-β-cyclodextrin extracts cholesterol from membranes. Cholesterol extraction could lead to disruption of membrane caveolae-like microdomains or ''rafts'' and solubilisation of different proteins connected to them. Our data suggest that methyl-β-cyclodextrin and lysolecithin lead to different changes in morphology and distribution of cytoplasmic structures. The effect was different for different regions of the nerve (node of Ranvier, paranodal and internodal regions). The agents also altered the kinetics of the Ca2+ response to stimulation in myelinated fibres.
Extracellular carbonic anhydrase contributes to the regulation of Ca2+ homeostasis and salivation in the submandibular salivary gland. Nataliya Fedirko, Olga Kopach, Nana Voitenko, Lviv National University, Human and Animal Physiology, Lviv, Ukraine. The maintenance of pH in the oral cavity is important for oral health, since even a minor drop in pH can result in dental caries and damage to the teeth. The submandibular salivary gland (SMG) is the main source of fluid- and electrolyte-enriched saliva and is therefore core to oral pH homeostasis. SMG secretion is activated by acetylcholine (ACh) in a [Ca2+]i-dependent manner and accompanied by acidic shifts of oral pH. The pH shifts could be due to changes in buffering capacity, which is regulated by carbonic anhydrase (CA). Despite the expression of different subtypes of CA in the SMG, the role of CA in the regulation of SMG function is still unclear. We found that CA inhibition by benzolamide (BZ) decreased fluid secretion in vivo and extracellular Na+ concentration in situ. The latter confirms the ability of CA to modify both primary and final saliva secretion. We also found a correlation between secretion and Ca2+ homeostasis, since BZ induced a decrease of: In striated muscle, Ca2+ release from the sarcoplasmic reticulum (SR) occurs when ryanodine receptors (RyRs) open either spontaneously or upon stimulation from dihydropyridine receptors, which are located in the adjacent transverse-tubular membrane and change their conformation when the cell is depolarized. Recent observations demonstrated that muscles from animal models of PtdInsP phosphatase deficiency suffer from altered Ca2+ homeostasis and excitation-contraction coupling, raising the possibility that PtdInsPs could modulate voltage-activated SR Ca2+ release in mammalian muscle. The openings of a single RyR or a cluster of RyRs can be detected as Ca2+ release events in images recorded from fibres loaded with fluorescent Ca2+ indicators.
to elucidate the effects of ptdinsp-s on ca2+ release events, images were recorded from skeletal muscle fibers enzymatically isolated from the m. flexor digitorum brevis of mice utilizing a super-fast scanning technique. a wavelet-based detection method was used to automatically identify the events on the images. three different ptdinsp-s (ptdins3p, ptdins5p, and ptdins(3,5)p) were tested. all these ptdinsp compounds decreased the frequency of spontaneous ca2+ release events. supported by the hungarian national science fund (otka 75604), tét. calcium sparks elicited by 1 mmol/l caffeine and by a depolarization to -60 mv were recorded at high time resolution on both x-y (30 frames/s) and line-scan images (65 lines/ms) on intact skeletal muscle fibers of the frog. while a typical spark appeared in one frame only, 17.3 and 26.0% of spark positions overlapped on consecutive frames following caffeine treatment or depolarization, respectively. while both caffeine and depolarization increased the frequency of sparks, as estimated from x-y images, the morphology of sparks was different under the two conditions. both the amplitude (in Δf/f0; 0.49 ± 0.025 vs. 0.29 ± 0.001; n = 22426 vs. 23714; mean ± sem, p < 0.05) and the full width at half maximum (in µm; parallel with the fiber axis: 2.33 ± 0.002 vs. 2.21 ± 0.005; perpendicular to the fiber axis: 2.07 ± 0.003 vs. 1.88 ± 0.004) of sparks were significantly greater after caffeine treatment than on depolarized cells. these observations were confirmed on sparks identified in line-scan images. in addition, x-t images were used to analyze the time course of these events. calcium sparks had a significantly slower rising phase under both conditions as compared to the control. on the other hand, while the rate of rise of signal mass was decreased after depolarization, it increased in the presence of caffeine.
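the wavelet-based event detection mentioned above can be sketched generically as a ricker-wavelet convolution followed by a thresholded local-maximum search. this is an illustration of the idea, not the authors' actual pipeline; the wavelet choice, width and threshold are all assumptions:

```python
import math

def ricker(width):
    """Ricker ('Mexican hat') wavelet sampled at integer offsets."""
    a = float(width)
    norm = 2.0 / (math.sqrt(3.0 * a) * math.pi ** 0.25)
    return [norm * (1.0 - (t / a) ** 2) * math.exp(-t * t / (2.0 * a * a))
            for t in range(-4 * width, 4 * width + 1)]

def detect_events(signal, width=3, thresh=2.0):
    """Return indices of candidate release events: positions where the
    wavelet response is a local maximum above `thresh`."""
    w = ricker(width)
    half = len(w) // 2
    n = len(signal)
    resp = []
    for i in range(n):
        s = 0.0
        for j, wv in enumerate(w):
            k = i + j - half
            if 0 <= k < n:
                s += signal[k] * wv
        resp.append(s)
    return [i for i in range(1, n - 1)
            if resp[i] > thresh and resp[i] >= resp[i - 1] and resp[i] > resp[i + 1]]
```

the band-pass nature of the wavelet suppresses both the flat baseline and slow drifts, so only spark-like transients of roughly the chosen width survive the threshold.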
prolonged depolarisation of skeletal muscle cells induces entry of extracellular calcium into muscle cells, an event referred to as excitation-coupled calcium entry. skeletal muscle excitation-coupled calcium entry relies on the interaction between the 1,4-dihydropyridine receptor on the sarcolemma and the ryanodine receptor on the sarcoplasmic reticulum membrane. in this study we exploited tirf microscopy to monitor, with high spatial resolution, excitation-coupled calcium entry (ecce) in primary cultures of human skeletal muscle cells harbouring mutations in the ryr1 gene linked to malignant hyperthermia and central core disease. we found that excitation-coupled calcium entry is strongly enhanced in cells from patients with central core disease compared to individuals with malignant hyperthermia and controls. in addition, excitation-coupled calcium entry induces generation of reactive nitrogen species and causes the nuclear translocation of nfatc1. the activation of nfatc1-dependent genes is consistent with an increase of il-6 secretion from primary cultures of human myotubes from ccd patients and with the fibre type 1 predominance in skeletal muscle of ccd patients. membrane lipids, microdomains & signalling p-132 ftir and calorimetric investigation of the effects of trehalose and multivalent cations on lipid structure. sawsan abu sharkh, jana oertel, and karim fahmy, division of biophysics, institute of radiochemistry, helmholtz-zentrum dresden-rossendorf, germany. e-mail: s.sharkh@hzdr.de. the structure of membrane lipids is of fundamental importance for the integrity of cell and organelle membranes in living organisms. membrane lipids are typically hydrated and their headgroup charges counter-balanced by solvated ions. consequently, water loss can induce severe structural changes in lipid packing (lyotropic transitions) and can lead to damage of lipid membranes even after rehydration.
this can be one out of several factors that affect the viability of organisms undergoing desiccation. many organisms, however, are resistant to even extreme water loss. some of them synthesize trehalose, which has been shown to be associated with survival of desiccation in phylogenetically diverse organisms (yeast, nematodes, brine shrimp, insect larvae, resurrection plants, and others). here we have studied hydration-sensitive transitions in model lipids to determine the effect of trehalose and electrostatics on lipid order. hydration pulse-induced time-resolved fourier-transform infrared (ftir) difference spectroscopy was used to address hydration-dependent lipid structure as a function of trehalose. in combination with differential scanning calorimetry and studies of langmuir-blodgett films, we arrive at a structurally and energetically consistent picture of how trehalose can affect lipidic phase behaviour and support a native lipid structure under water loss. experiments were performed on model lipids with different headgroups and on native lipids from desiccation-tolerant organisms. controlled self-assembly and membrane organization of lipophilic nucleosides and nucleic acids: perspectives for applications. martin loew 1, paula pescador 3, matthias schade 1, julian appelfeller 1, jürgen liebscher 2, oliver seitz 2, daniel huster 3, andreas herrmann 1, and anna arbuzova 1. 1 humboldt universität zu berlin, institute of biology/biophysics, berlin, germany; 2 humboldt universität zu berlin, institute of chemistry, berlin, germany; 3 universität leipzig, department of medical physics and biophysics, leipzig, germany. lipophilic conjugates of nucleosides and nucleic acids such as dna, rna, and peptide nucleic acid (pna), combining the assembly properties of amphiphiles with the specific molecular recognition properties of nucleic acids, allow numerous applications in medicine and biotechnology.
we recently observed self-assembly of microtubes, stable cylindrical structures with outer diameters of 300 nm and 2-3 µm and a length of 20-40 µm, from a cholesterol-modified nucleoside and phospholipids. the morphology and properties of these microtubes and their functionalization with lipophilic dna will be characterized. we also observed that lipophilic nucleic acids, pna and dna differing in their lipophilic moieties, partition into different lipid domains in model and biological membranes, as visualized by hybridization with the respective complementary fluorescently-labeled dna strands. upon heating, the domains vanished and both lipophilic nucleic acid structures intermixed with each other. reformation of the lipid domains by cooling led again to separation of the membrane-anchored nucleic acids. by linking specific functions to complementary strands, this approach offers a reversible tool for triggering interactions among the structures and for the arrangement of reactions and signaling cascades on biomimetic surfaces. conformation-dependent trafficking of p-glycoprotein with rafts. zsuzsanna gutayné tóth, orsolya bársony, katalin goda, gábor szabó and zsolt bacsó, university of debrecen, mhsc, department of biophysics and cell biology, debrecen, hungary. p-glycoprotein (pgp), an abc-transporter playing a prominent role in multidrug resistance, demonstrates conformation-dependent endocytosis on the surface of 3t3-mdr1 cells. these cell surface transporters have a subpopulation recognizable by the conformation-sensitive antibody uic2, which is about one-third of the total; in contrast to the rest, which persists long on the cell surface, this subpopulation performs fast internalization via rafts. we have identified that the rapid internalization is followed by quick exocytosis, in which the other subpopulation is not or only slightly involved. the exocytosis presents a cholesterol-depletion-dependent intensification, in contrast to the internalization, which is inhibited by cyclodextrin treatment.
this continuous recycling, examined by total internal reflection fluorescence (tirf) microscopy, increases the amount of the raft-associated subpopulation of pgps in the plasma membrane, and it might have a role in restoring the cholesterol content of the membrane after cholesterol depletion. our presentation will summarize the related endocytotic, exocytotic and recycling processes and how our data fit into current notions regarding cholesterol and sphingomyelin trafficking. membrane nanodomains based on phase-segregating lipid mixtures have emerged as a key organizing principle of the plasma membrane. they have been shown to play important roles in signal transduction and membrane trafficking. we have developed lipid-like probes carrying multivalent nitrilotriacetic acid (tris-nta) head groups for selective targeting of his-tagged proteins into liquid-ordered or liquid-disordered phases. in giant unilamellar vesicles, strong partitioning of tris-nta lipids into different lipid phases was observed. for a saturated tris-nta lipid, at least a 10-fold preference for the liquid-ordered phase was found. in contrast, an unsaturated nta lipid shows a comparable preference for the liquid-disordered phase. partitioning into submicroscopic membrane domains formed in solid-supported membranes was confirmed by superresolution imaging. single-molecule tracking of his-tagged proteins tethered to solid-supported membranes revealed clear differences in the diffusion behavior of the different nta-lipids. by using bsa as a carrier, multivalent nta lipids were efficiently incorporated into the plasma membrane of live cells. based on this approach, we established versatile methods for probing and manipulating the spatiotemporal dynamics of membrane nanodomains in live cells. il-9 is a multifunctional cytokine with pleiotropic effects on t cells.
the il-9r consists of the cytokine-specific α-subunit and the γc-chain shared with other cytokines, including il-2 and il-15, important regulators of t cells. we have previously shown the preassembly of the heterotrimeric il-2 and il-15 receptors, as well as their participation in common superclusters with mhc glycoproteins in lipid rafts of human t lymphoma cells. the integrity of lipid rafts was shown to be important in il-2 signaling. we hypothesize that other members of the γc cytokine receptor family, such as the il-9r complex, may also fulfill their tasks in a similar environment, maybe in the same superclusters. co-localization of il-9r with lipid rafts as well as with the il-2r/mhc superclusters was determined by clsm. molecular-scale interactions of il-9rα with il-2r and mhc molecules were determined by microscopic and flow cytometric fret experiments. the role of lipid rafts in il-9r signaling was assessed by following the effect of cholesterol depletion on il-9-induced stat phosphorylation. our results suggest the possibility that preassembly of the receptor complexes in common membrane microdomains with mhc glycoproteins may be a general property of γc cytokines in t cells. to unravel the molecular processes leading to fas clustering in lipid rafts, a 21-mer peptide corresponding to the single transmembrane domain of the death receptor was reconstituted into model membranes that display liquid-disordered/liquid-ordered phase coexistence, i.e. mimicking cell plasma membranes. using the intrinsic fluorescence of the peptide's two tryptophan residues (trp176 and trp189), the membrane lateral organization, conformation and dynamics of fas were studied by steady-state and time-resolved fluorescence techniques. our results show that fas has a preferential localization to liquid-disordered membrane regions, and that it undergoes a conformational change from a bilayer-inserted state in liquid-disordered membranes to an interfacial state in liquid-ordered membranes.
this is a result of the strong hydrophobic mismatch between the (hydrophobic) peptide length and the hydrophobic thickness of liquid-ordered membranes. in addition, we show that ceramide, a sphingolipid intimately involved in fas oligomerization and apoptosis triggering, does not affect fas membrane organization. overall, our results highlight ceramide's role as an enhancer of fas oligomerization, and unravel the protective function of the fas transmembrane domain against non-ligand-induced fas apoptosis. organization and dynamics of membrane-bound bovine α-lactalbumin: a fluorescence approach. arunima chaudhuri and amitabha chattopadhyay, centre for cellular and molecular biology, hyderabad 500 007, india. e-mail: amit@ccmb.res.in. many soluble proteins are known to interact with membranes in partially disordered states, and the mechanism and relevance of such interactions in cellular processes are beginning to be understood. interestingly, apo-bovine α-lactalbumin (bla), a soluble protein, specifically interacts with negatively charged membranes, and the membrane-bound protein exhibits a molten globule conformation. we have used the wavelength-selective fluorescence approach to monitor the molten globule conformation of bla upon binding to negatively charged membranes as compared to zwitterionic membranes. tryptophans in bla exhibit a differential red edge excitation shift (rees) upon binding to negatively charged and zwitterionic membranes, implying differential rates of solvent relaxation around the tryptophan residues. our results utilizing fluorescence anisotropy, lifetime and depth analysis of the tryptophans by the parallax approach further support the differential organization and dynamics of the membrane-bound bla forms. in addition, dipole potential measurements and dye leakage assays are being used in our ongoing experiments to explore the mechanism of bla binding to membranes.
these results assume significance in the light of the antimicrobial and tumoricidal functions of α-lactalbumin. role of long-range effective protein-protein forces in the formation and stability of membrane protein nano-domains. nicolas destainville, laboratoire de physique théorique, université paul sabatier toulouse 3 - cnrs, toulouse, france. we discuss a realistic scenario accounting for the existence of sub-micrometric protein domains in plasma membranes. we propose that proteins embedded in bio-membranes can spontaneously self-organize into stable small clusters, due to the competition between short-range attractive and intermediate-range repulsive forces between proteins, specific to these systems. in addition, membrane domains are supposedly specialized, in order to perform a determined biological task, in the sense that they gather one or a few protein species out of the hundreds of different ones that a cell membrane may contain. by analyzing the balance between mixing entropy and protein affinities, we propose that protein sorting into distinct domains, leading to domain specialization, can be explained without appealing to preexisting lipidic micro-phase separations, as in the lipid raft scenario. we show that the proposed scenario is compatible with known physical interactions between membrane proteins, even if thousands of different species coexist. lipid rafts are cholesterol- and sphingolipid-enriched functional microdomains present in biomembranes. rafts have been operationally defined as membrane fractions that are detergent-insoluble at low temperature. here we have characterized drms from erythrocytes treated with the nonionic detergents brij58 and brij98, at 4°c and 37°c, and compared them to drms obtained with triton x-100 (tx100). we have also investigated the effect of cholesterol depletion on drm formation. brij drms were enriched in cholesterol, as were tx100 drms.
hptlc analysis showed a very similar distribution of phosphatidylcholine (pc), phosphatidylethanolamine (pe) and sphingomyelin (sm) in brij drms to that found in ghost membranes. sm-enriched drms were obtained only with tx100, while the pe content was decreased in tx100 drms in comparison to brij drms. immunoblot assays revealed that raft markers (flotillin-2 and stomatin) were present in all drms. contrary to tx100 drms, analysis of electron paramagnetic resonance spectra (with the 5-doxyl stearate spin label) revealed that brij drms are not in the liquid-ordered state, evincing the differential extraction of membrane lipids promoted by these detergents. supported by fapesp/cnpq (brazil). several biological membrane mimics have been built to investigate the topology of molecules in membranes. among them, "bicelles", i.e. mixtures of long-chain and short-chain saturated phospholipids hydrated up to 98%, became very popular because they orient spontaneously in magnetic fields. disk-shaped systems of 40-80 nm diameter and 4-5 nm thickness have been measured by electron microscopy and solid-state nmr and can be oriented by magnetic fields with the disc-plane normal perpendicular to the field. we have recently been developing lipids that contain in one of their chains a biphenyl group (tbbpc), affording an orientation parallel to the magnetic field in the absence of lanthanides. a large number of hydrophobic molecules including membrane proteins have been successfully embedded, and static nmr afforded the orientation of protein helices in membranes; mas nmr provided the 3d structure of peptides in bicelles. biphenyl bicelles keep their macroscopic orientation for days outside the field, thus allowing combined nmr and x-ray experiments. tbbpc also allows construction of µm-sized vesicles showing a remarkable oblate deformation in magnetic fields (anisotropy of 3-10) and opens the way to applications in structural biology or drug delivery under mri.
imaging membrane heterogeneities and domains by super-resolution sted nanoscopy. christian eggeling, veronika mueller, alf honigmann, stefan w. hell, department of nanobiophotonics, max planck institute for biophysical chemistry, am fassberg 11, 37077 göttingen, germany. cholesterol-assisted lipid and protein interactions, such as the integration into lipid nanodomains, are considered to play a functional part in a whole range of membrane-associated processes, but their direct and non-invasive observation in living cells is impeded by the resolution limit of >200 nm of a conventional far-field optical microscope. we report the detection of membrane heterogeneities in nanosized areas in the plasma membrane of living cells using the superior spatial resolution of stimulated emission depletion (sted) far-field nanoscopy. by combining a (tunable) resolution of down to 30 nm with tools such as fluorescence correlation spectroscopy (fcs), we obtain new details of molecular membrane dynamics. sphingolipids and some proteins are transiently (~10 ms) trapped on the nanoscale in cholesterol-mediated molecular complexes, while others diffuse freely or show a kind of hopping diffusion. the results are compared to sted experiments on model membranes, which highlight potential influences of the fluorescent tag. the novel observations shed new light on the role of lipid-protein interactions and nanodomains for membrane bioactivity. ca2+-controlled all-or-none like recruitment of synaptotagmin-1 c2ab to membranes. sune m.
christensen, nicky ehrlich, dimitrios stamou, bio-nanotechnology laboratory, department of neuroscience and pharmacology, nano-science center, lundbeck foundation center biomembranes in nanomedicine, university of copenhagen, 2100 copenhagen, denmark. e-mail: ehrlich@nano.ku.dk & stamou@nano.ku.dk. synaptotagmin-1 (syt) is the major ca2+ sensor that triggers the fast, synchronous fusion of synaptic vesicles with the presynaptic membrane upon ca2+-mediated membrane recruitment of the cytosolic c2ab domain. the ca2+-dependent recruitment of syt's c2ab domain to membranes has so far been investigated by ensemble assays. here we revisited binding of wild-type c2ab and different c2ab mutants of syt to lipid membranes using a recently developed single-vesicle assay. the hallmark of the single-vesicle approach is that it provides unique information on heterogeneous properties that would otherwise be hidden due to ensemble averaging. we found that c2ab does not bind to all vesicles in a homogeneous manner, but in an all-or-none like fashion to a fraction of the vesicles. the fraction of vesicles with bound c2ab is regulated by the amount of negatively charged lipids in the membrane as well as by [ca2+]. this ca2+-controlled all-or-none like recruitment of syt to membranes provides a possible explanation for the strongly heterogeneous behavior of the in vitro model system for neuronal membrane fusion. furthermore, heterogeneity in release probability among synaptic vesicles is a critical property in determining the output of a neuronal signaling event. new insights into the transport mechanism of ciprofloxacin revealed by fluorescence spectroscopy. mariana ferreira, sílvia c. lopes, paula gameiro, requimte, faculdade de ciências da universidade do porto, porto, portugal. keywords: fluoroquinolones; liposomes; proteoliposomes; ompf; ciprofloxacin; fluorescence; anisotropy.
fluoroquinolones are antibiotics that have a broad spectrum of action against gram-negative and some gram-positive bacteria. the interaction between these species and liposomes has been used as a reference in understanding their diffusion through the phospholipid bilayer and can be quantified by the determination of partition coefficients between a hydrophobic phase (liposomes) and an aqueous solution. it is also known that some porins of the bacterial membranes are involved in the transport mechanism of many fluoroquinolones. ompf, a well-characterized membrane protein of the outer membrane of gram-negative bacteria, assumes the conformation of a homo-trimer, whose monomers have two tryptophan residues (one located at the interface of the monomers and the other at the lipid/protein interface). thus, we studied the interaction of ciprofloxacin, a second-generation fluoroquinolone, with unilamellar liposomal vesicles and ompf proteoliposomes of pope/popg, pope/popg/cardiolipin and e. coli total extract. partition coefficients (kp values) and the association with ompf proteoliposomes were determined by steady-state fluorescence spectroscopy under physiological conditions (t = 37°c; ph 7.4). the membrane mimetic systems used were characterized by dls and fluorescence anisotropy. our motivation is whether there exist differences in the pattern-forming capabilities of two adhesion molecules with different roles: cd44, mediating "dynamic" adhesion in cell rolling, and icam-1, mediating "static" adhesion during the formation of the immune synapse. homo- and hetero-associations of cd44, icam-1 and mhc-i are investigated at the nm- and µm-distance levels on ls174t colon carcinoma cells in two different conditions of lymphocyte homing: (1) with ifn-γ and tnf-α, both lymphokines up-regulating the expression level of mhc-i and icam-1 and down-regulating that of cd44; (2) crosslinking of cd44 and icam-1, representing receptor engagement.
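the partition coefficients mentioned above are conventionally obtained by fitting the standard hyperbolic partition model i([l]) = (i_w + i_l·kp·γ·[l]) / (1 + kp·γ·[l]) to intensity-versus-lipid data. a minimal fitting sketch, where the function names, the lipid molar volume γ and the log-spaced search grid are my own assumptions, not the authors' procedure:

```python
def partition_model(L, kp, i_w, i_l, gamma=0.763):
    """Fluorescence intensity vs. lipid concentration [L] (in M) for a solute
    partitioning between water and bilayer; i_w/i_l are the limiting
    intensities in water/lipid and gamma the lipid molar volume in M^-1
    (illustrative value)."""
    x = kp * gamma * L
    return (i_w + i_l * x) / (1.0 + x)

def fit_kp(Ls, Is, i_w, i_l, gamma=0.763):
    """Brute-force log-grid least squares for Kp between 1 and 1e6."""
    best_kp, best_sse = None, float("inf")
    for k in range(601):
        kp = 10 ** (k / 100.0)  # log-spaced candidate values
        sse = sum((partition_model(L, kp, i_w, i_l, gamma) - I) ** 2
                  for L, I in zip(Ls, Is))
        if sse < best_sse:
            best_kp, best_sse = kp, sse
    return best_kp
```

a coarse grid like this is enough to localize kp to a few percent; in practice one would refine with a nonlinear least-squares routine.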
the observations are explained by assuming the existence of a kinase cascade-level crosstalk between the cd44 and icam-1 molecules, which manifests in characteristic complementary changes in the properties of cell surface receptor patterns. for the characterisation of cluster morphology, new colocalization approaches were developed: (i) "number of first neighbours" distribution curves, (ii) "acceptor photobleaching fret-fluorescence intensity fluctuation product" correlation diagrams, (iii) "random gradient-kernel smoothing assisted decay" of pearson correlations, and (iv) the k-function formalism. analyzing janus kinases in living cells with single-color fcs using confocal and tir illumination. thomas weidemann 1, hetvi gandhi 1, remigiusz worch 1,3, robert weinmeister 1, christian bökel 2, and petra schwille 1. 1 biophysics research group, technische universität dresden, germany; 2 crtd, center for regenerative therapies dresden, technische universität dresden, germany; 3 institute of physics, polish academy of science, warsaw, poland. cytokine receptors of the hematopoietic superfamily transduce their signal through non-covalently bound janus kinases. there are only 4 such kinases in humans (jak1, 2, 3 and tyk2), which associate with 46 different cytokine receptor chains. here we study the dynamics of gfp-tagged jak1 and jak3 in epithelial cells with fluorescence correlation spectroscopy (fcs). jak1 and jak3 behave differently in various aspects: in the absence of receptors, jak1 still binds the membrane, whereas jak3 diffuses homogeneously in the cytoplasm. we used fcs under total internal reflection illumination (tir-fcs) and determined the membrane binding affinity of jak1 to be 60 ± 36 nM. the association of jak3 with the common gamma chain (γc) is very tight, as shown by fluorescence recovery after photobleaching (frap).
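for fcs measurements like those above, two model-free quantities are often read straight off the autocorrelation curve: the mean particle number n = 1/g(0) and a diffusion time estimated as the half-decay lag. a minimal sketch under the 2-d membrane-diffusion model g(τ) = (1/n)·(1 + τ/τ_d)⁻¹; a real analysis would fit the full model, and the function name is an assumption:

```python
def fcs_estimates(taus, G):
    """Model-free estimates from an FCS autocorrelation curve:
    N = 1/G(0) (mean number of molecules in the detection area) and
    tau_D = first lag where G has fallen to half its zero-lag value.
    Assumes `taus` starts at 0 and is sorted."""
    g0 = G[0]
    n = 1.0 / g0
    half = g0 / 2.0
    for t, g in zip(taus, G):
        if g <= half:
            return n, t  # half-decay lag ~ diffusion time for 2-D diffusion
    return n, None  # curve never decayed to half within the lag range
```

for the pure 2-d diffusion model the half-decay lag equals τ_d exactly, which makes this a useful sanity check before any nonlinear fit.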
molecular brightness analysis of single-point fcs shows that jak1 diffuses as a monomer in the rather small cytoplasmic pool, whereas jak3 diffuses as dimers, which undergo a defined oligomerization. the degree of oligomerization decays at higher concentrations, indicating that some unknown, saturable scaffold is involved. characterizing the binding and mobility schemes of the janus kinases may be important to further elucidate their specific and redundant effects in signal transduction. plasma membrane (pm)-enriched fractions obtained through subcellular fractionation protocols are commonly used in studies investigating the ability of a compound to bind to a receptor. however, the presence of mitochondrial membranes (mi) in the pm-enriched fraction may compromise several experimental results, because mi may also contain the binding proteins of interest. aiming to analyze the subcellular fractionation quality of a standard sucrose-density-based protocol, we investigated (a) the na+/k+-atpase (pm marker) and succinate dehydrogenase (sd; mi marker) activities, and (b) the immunocontent of the adenine nucleotide translocator (ant; mi membrane marker) in both pm- and mi-enriched fractions. since several binding protocols may require a long incubation period, we verified the quality of both fractions after 24 hours of incubation in an adequate buffer. our results show that pm- and mi-enriched fractions exhibit contamination with mi or pm, respectively. we did not observe any effect of incubation on na+/k+-atpase activity and ant content in either fraction. surprisingly, sd activity was preserved in the pm- but not in the mi-enriched fraction after incubation. these data suggest the need for more careful use of pm-enriched fraction preparations in studies involving pm protein characterization. human neutrophil peptide 1 (hnp1) is a human cationic defensin that presents antimicrobial activity against various bacteria (both gram-positive and gram-negative), fungi and viruses.
hnp1 is stored in the cytoplasmic azurophilic granules of neutrophils and epithelial cells. in order to elucidate the mode of action of this antimicrobial peptide (amp), studies based on its lipid selectivity were carried out. large unilamellar vesicles (luv) with different lipid compositions were used as biomembrane model systems (mammalian, fungal and bacterial models). changes in the intrinsic fluorescence of the tryptophan residues present in hnp1 upon membrane binding/insertion were followed, showing that hnp1 has quite distinct preferences for mammalian and fungal membrane model systems. hnp1 showed low interaction with glucosylceramide-rich membranes, but high sterol selectivity: it has a high partition into ergosterol-containing membranes (as in fungal membranes) and low interaction with cholesterol-containing membranes (as in mammalian cells). these results reveal that lipid selectivity is the first step after interaction with the membrane. further insights into the hnp1 membrane interaction process were given by fluorescence quenching measurements using acrylamide, 5-doxylstearic acid (5ns) or 16-doxylstearic acid (16ns). nanoparticles (np) are currently used in many industrial or research applications (paints, cosmetics, drug delivery materials…). recent papers clearly demonstrate their activity with biological membranes (nanoscale holes, membrane thinning, disruption). different studies of the np-membrane interaction suggest that certain parameters are particularly important, such as the np size, their surface properties or their aggregation state. the composition of biological membranes being particularly complex, supported lipid bilayers (slb) composed of a restricted number of lipids are usually used as simplified membrane models. moreover, these two-dimensional systems are convenient for surface analysis techniques, such as atomic force microscopy (afm), giving information on the morphology of the slb and its mechanical properties.
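acrylamide-quenching data like those mentioned above are conventionally analyzed with the stern-volmer relation f0/f = 1 + ksv·[q]; a minimal linear-fit sketch, where the function name and the choice to fix the intercept at 1 are my own assumptions (the abstract does not state the authors' analysis):

```python
def stern_volmer_ksv(q, f):
    """Linear Stern-Volmer fit: F0/F = 1 + Ksv*[Q].

    q: quencher concentrations in M (first entry must be 0, giving F0);
    f: measured fluorescence intensities. Returns Ksv in M^-1, via least
    squares with the intercept fixed at 1 as the model requires.
    """
    f0 = f[0]
    ys = [f0 / fi for fi in f]  # F0/F values
    num = sum(x * (y - 1.0) for x, y in zip(q, ys))
    den = sum(x * x for x in q)
    return num / den
```

a larger ksv means the fluorophore is more accessible to the aqueous quencher, which is how such data report on the depth of tryptophan burial in the membrane.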
in this work, we study the behaviour of slbs made of lipids representative of the membrane fluid phase (popc) or of the raft phase (sphingomyelin). these slbs are deposited on planar surfaces (mica or glass) previously covered with silica beads (10 or 100 nm in diameter) in order to mimic the np-membrane interaction. we will present our first results obtained by afm and fluorescence microscopy. it is well known that eukaryotic nuclei are a site of active lipid metabolism. investigations have demonstrated the existence of numerous enzymes in nuclei which modulate the changes of nuclear lipids during different cellular processes. although the nuclear membrane is accepted as the main place of lipid localization, nearly 10% of nuclear lipids are found in the chromatin fraction. the ability of chromatin phospholipids to regulate dna replication and transcription has already been demonstrated. chromatin phospholipids seem to play an important role in cell proliferation and differentiation as well as in apoptosis. it also seems possible that chromatin phospholipids may participate in the realization of cisplatin's antitumor effects. the 24-hour in vivo effect of cisplatin on rat liver chromatin phospholipids was investigated. the phospholipids of rat liver chromatin were fractionated by the micro-tlc technique. the quantitative estimation of fractionated phospholipids was carried out with the computer program fujifilm science lab 2001 image gauge v 4.0. the alteration of total phospholipid content as well as the quantitative changes among the individual phospholipid fractions in rat liver chromatin after in vivo action of cisplatin were established. the total content of chromatin phospholipids was significantly decreased after cisplatin action. four of the five individual phospholipid fractions were markedly changed after the drug action.
two choline-containing phospholipids, phosphatidylcholine and sphingomyelin, exhibited divergent sensitivity to this drug: an increase in sphingomyelin content was accompanied by a quantitative decrease of phosphatidylcholine. the quantity of cardiolipin was markedly increased, while the amount of phosphatidylinositol was decreased after the cisplatin treatment. the phosphatidylethanolamine content remained unchanged after the drug action. it seems that the high sensitivity of chromatin phospholipids to cisplatin action may play an important role in the antitumor effects of this drug. membrane lipids and drug resistance in staphylococci. r. d. harvey, institute of pharmaceutical science, king's college london, 150 stamford street, london se1 9nh, uk. staphylococci express numerous resistance mechanisms against common antimicrobials, including peptide components of the innate immune system which have been trumpeted as likely candidates to replace our increasingly ineffective antibiotics. the membrane phospholipid lysyl-phosphatidylglycerol (l-pg) appears to play a key role in staphylococcal drug resistance, since its absence in mutant bacteria renders them susceptible to a range of cationic antimicrobials. the current assumption about the role l-pg plays in drug resistance is that of facilitating charge neutralisation of the plasma membrane, leading to loss of affinity towards cationic moieties. we have investigated this phenomenon using a range of model membrane systems composed of both synthetic lipids and reconstituted natural lipid extracts, using such techniques as stopped-flow fluorescence, circular dichroism and neutron scattering. our conclusions indicate that the initial assumptions about the role of l-pg in drug resistance are over-simplistic and certainly do not tell the whole story of the physical and biological properties of this fascinating moderator of membrane behaviour.
our findings show that l-pg does not inhibit antimicrobial drug action by charge dampening, hinting at a different protective mechanism. modulation of α-toxin binding by membrane composition m. schwiering, a. brack, c. beck, h. decker, n. hellmann institute for molecular biophysics, university of mainz, mainz, germany although the α-toxin from s. aureus was the first pore-forming toxin identified, its mode of interaction with membranes is still not fully understood. the toxin forms heptameric pores on cellular and artificial membranes. the present hypothesis is that the initial binding to the membrane occurs with low affinity, and that efficient oligomerisation, relying on clusters of binding sites, is the reason for the overall high affinity of the binding process. in order to separate the effects of an increasing concentration of binding sites from this topological effect, we investigated oligomer formation based on pyrene fluorescence for a series of lipid compositions, where the fraction of toxin-binding lipids (egg phosphatidylcholine (epc) or egg sphingomyelin (esm)) was varied while their concentration remained constant. the results indicate that an increased local density of toxin binding sites occurring due to phase separation facilitates oligomer formation. furthermore, the change in local environment (number of neighboring cholesterol molecules) upon domain formation also enhances oligomer formation. we thank the dfg (sfb 490) for financial support, and s. bhakdi and a. valeva for production of the toxin and helpful discussions. we explored quercetin effects on lipid bilayers containing cholesterol using a spectrofluorimetric approach. we used the fluorescent probe laurdan, which is able to detect changes in membrane phase properties. when incorporated in lipid bilayers, laurdan emits from two different excited states: a non-relaxed one when the bilayer packing is tight and a relaxed state when the bilayer packing is loose. 
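the two laurdan emission states described above are commonly condensed into a single order parameter, the generalized polarization (gp), computed from the intensities of the two bands. the following is a minimal illustrative sketch, not the authors' analysis code; the band centers (440/490 nm), the gaussian band shape, the width and the amplitudes are assumptions chosen only to show the sign convention.

```python
import math

def gaussian(lam, amp, center, sigma):
    """single gaussian emission band evaluated at wavelength lam (nm)."""
    return amp * math.exp(-0.5 * ((lam - center) / sigma) ** 2)

def laurdan_spectrum(lam, a440, a490, sigma=25.0):
    """model spectrum: sum of a non-relaxed (440 nm) and a relaxed (490 nm) band."""
    return gaussian(lam, a440, 440.0, sigma) + gaussian(lam, a490, 490.0, sigma)

def generalized_polarization(i440, i490):
    """laurdan gp = (i440 - i490) / (i440 + i490); positive when the bilayer
    packing is tight (440 nm band dominates), negative when it is loose."""
    return (i440 - i490) / (i440 + i490)

# tightly packed bilayer: the non-relaxed 440 nm band dominates -> gp > 0
gp_gel = generalized_polarization(laurdan_spectrum(440.0, 1.0, 0.2),
                                  laurdan_spectrum(490.0, 1.0, 0.2))
# loosely packed bilayer: the relaxed 490 nm band dominates -> gp < 0
gp_fluid = generalized_polarization(laurdan_spectrum(440.0, 0.2, 1.0),
                                    laurdan_spectrum(490.0, 0.2, 1.0))
```

decomposing a measured spectrum into the same two bands, as in the abstract above, yields the relative weight of each population; the gp then tracks the phase transition as the weights swap.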
this behavior is seen in recorded spectra as a shift of the maximum emission fluorescence from 440 nm at temperatures below the lipid phase transition to 490 nm at temperatures above it. emission spectra of laurdan were analyzed as a sum of two gaussian bands centered on the two emission wavelengths, allowing a good evaluation of the relative presence of each population. our results show that both laurdan emission states are present with different shares over a wide temperature range for dmpc liposomes with cholesterol. quercetin leads to a decrease in the phase transition temperature of the liposomes, acting at the same time as a quencher of laurdan fluorescence. this paper is supported by the sectorial operational programme human resources development. caveolins are essential membrane proteins found in caveolae. the caveolin scaffolding domain of caveolin-1 includes a short sequence containing a crac motif (v 94 tkywfyr 101 ) at its c-terminal end. to investigate the role of this motif in the caveolin-membrane interaction at the atomic level, we performed a detailed structural and dynamics characterization of a cav-1(v94-l102) nonapeptide encompassing this motif and including the first residue of the cav-1 hydrophobic domain (l102), in dodecylphosphocholine (dpc) micelles and in dmpc/dhpc bicelles, as membrane mimics. nmr data revealed that this peptide folded as an amphipathic helix located in the polar head group region. the two tyrosine side-chains, flanked by arginine and lysine residues, are situated on one face of this helix, whereas the phenylalanine and tryptophan side-chains are located on the opposite face (le lan c. et al., 2010, eur. biophys. j., 39, 307-325). to investigate the interactions between the crac motif and the lipids, we performed molecular dynamics simulations in two different environments: a dpc micelle and a popc bilayer. 
the results obtained are in good agreement with the nmr data, and the comparison between both systems provided insight into the orientation of the crac motif at the membrane interface and into its interactions with lipids. this work was partially supported by the strategic grant posdru/21/1. our study suggests that the conformation and positioning of hydroxyl groups significantly affect the thermotropic properties of sphingolipids and their interaction with sterols. the polymorphism of a new series of bolaamphiphile molecules based on n-(12-betainylamino-dodecane)-octyl β-d-glucofuranosiduronamide chloride is investigated. the length of the main bridging chain is varied in order to modify the hydrophilic/lipophilic balance. the other chemical modification was to introduce a diacetylenic unit in the middle of the bridging chain to study the influence of π-π stacking on the supramolecular organisation of these molecules. dry bolaamphiphiles self-organize into supramolecular structures such as a lamellar crystalline structure, a lamellar fluid structure and a lamellar gel structure. the thermal dependence of these structures, as well as the phase transitions, is followed by small-angle and wide-angle x-ray scattering. once the thermal cycle is accomplished, the system remains in the kinetically stabilized undercooled high-temperature phase at a temperature of 20°c. subsequently, the time dependence of the relaxation to the thermodynamically stable phase is followed, and very slow relaxation on the order of hours or days is observed. the study of the polymorphism and stability of the various phases of this new series of bolaamphiphiles, which is of interest for potential applications in the health, cosmetics or food industries, is undertaken in this work. alkylphospholipids have shown promising results in several clinical studies, and among them perifosine (opp) is a promising candidate for breast cancer therapy. 
the antitumor effect was much better in estrogen receptor negative (er-) than in estrogen receptor positive (er+) tumors in vivo. it is believed that apls do not target dna, but insert in the plasma membrane and ultimately lead to cell death. liposomes made of opp and different amounts of cholesterol (ch) showed diminished hemolytic activity as compared to micellar opp, but in most cases the cytotoxic activity was lower. in order to find the optimal liposomal composition and to better understand the difference in the response of er+ and er- cells, the interaction of opp liposomes with er+ and er- cells was studied. for liposomes with a high amount of ch, both cell types showed slow release of the liposome-entrapped spin probe into the cytoplasm. liposomes with a low amount of ch interact better with cells, but the release is faster for er- than for er+ cells at 37°c. experiments with nitroxide-labeled opp (sl-opp) liposomes suggest that the exchange of sl-opp between liposomes and cellular membranes is fast. however, translocation of sl-opp across the plasma membrane is slow, but seems to be faster for opp-resistant er+ cells than for er- cells at 37°c. estimation of a membrane pore size based on the law of conservation of mass krystian kubica 1 , artur wrona 1 and olga hrydziuszko 2 1 institute of biomedical engineering and instrumentation, wroclaw university of technology, wroclaw, poland, 2 centre for systems biology, university of birmingham, birmingham, uk the size of biomembrane pores determines which solutes or active compounds may enter the cell. here, using a mathematical model of a lipid bilayer and the law of conservation of mass, we calculate the radius of a membrane pore created by rearranging the lipid molecules (the pore wall was formed out of the lipid heads taken from the membrane regions situated directly above and below the pore, prior to its formation). 
assuming a constant number of lipid molecules per bilayer (with or without the pore) and based on literature data (a 60% decrease in the area per chain for a fluid-to-gel transition and a matching change of one chain volume not exceeding 4%), we have shown that the pore radius can measure up to 4.7 nm (for a 7 nm thick lipid bilayer) without the lipid molecules undergoing a phase transition. a further assumption of the area per chain being modified as a consequence of the lipids' conformational changes has resulted in an increase of the calculated radius up to 7.1 nm. finally, a comparison of the pore volume with the corresponding volume of the lipid bilayer has led to the conclusion that for the system under consideration the membrane pore can only be created with the lipids undergoing fluid-to-gel conformational changes. the key signaling pathway involves tyrosine phosphorylation of signal transducers and activators of transcription (stat1 and stat2) by receptor-associated janus kinases. we aim to unveil the very early events of signal activation, including ligand-induced receptor assembly and the recruitment of the cytoplasmic effector proteins stat1 and stat2 in living cells. to this end, we have explored the spatiotemporal dynamics of stat recruitment at the membrane on a single-molecule level. highly transient interaction of stats with membrane-proximal sites was detected by tirf microscopy, allowing for localizing and tracking individual molecules beyond the diffraction limit. thus, we obtained a pattern of the spatio-temporal recruitment of stat molecules to the plasma membrane revealing distinct submicroscopic structures and hotspots of stat interaction with overlapping recruitment sites for stat1 and stat2. strikingly, these stat binding sites were independent of receptor localization and expression level. simultaneous superresolution imaging of the cytoskeleton revealed the organization of stat recruitment sites within the cortical actin skeleton. 
characterization of molecular dynamics on living cell membranes at the nanoscale level is fundamental to unravel the mechanisms of membrane organization and compartmentalization. we have recently demonstrated the feasibility of fluorescence correlation spectroscopy (fcs) based on the nanometric illumination of near-field scanning optical microscopy (nsom) probes on intact living cells [1] . nsom-fcs was applied to study the diffusion of fluorescent lipid analogs on living cho cells. the experiments allowed us to reveal details of the diffusion that are hidden by larger illumination areas and are associated with nanoscale membrane compartmentalization. the technique also offers the unique advantage of straightforward implementation of multiple-color excitation, opening the possibility to study nanoscale molecular cross-correlation. furthermore, the evanescent axial illumination of the nsom probe allows us to extend diffusion studies to the membrane-proximal cytosolic region. as such, nsom-fcs represents a novel powerful tool to characterize the details of many biological processes in which molecular diffusion plays a relevant role. the growing interest in supported lipid bilayers (slbs) on conductive substrates, such as gold, is due to the possibility of designing lipid-based biosensor interfaces with electrochemical transduction. due to the hydrophobicity of gold, it is still a challenge to deposit planar and continuous bilayers without previous surface modification. most studies on gold concern single-phase slbs without cholesterol or gangliosides, two vital components of biomembranes. in this work, the experimental conditions suitable for the formation of complex slbs with phase separation directly on gold are explored. the mixtures dopc/dppc/cholesterol (4:4:2) with 0 or 10 mol % of ganglioside gm1, which should yield lipid raft-like domains according to reported phase diagrams, were studied. 
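the nsom-fcs measurements above rest on fitting an autocorrelation model to the fluorescence fluctuations; for free two-dimensional diffusion the standard single-component model and the scaling of the dwell time with spot size can be sketched as follows. this is an illustrative sketch, not the authors' analysis; the diffusion coefficient and spot radii are assumed order-of-magnitude values, not measured ones.

```python
def g_2d(tau, n, tau_d):
    """autocorrelation for free 2d diffusion: g(tau) = (1/n) / (1 + tau/tau_d),
    where n is the mean number of molecules in the observation area."""
    return (1.0 / n) / (1.0 + tau / tau_d)

def diffusion_time(w, d_coef):
    """mean dwell time tau_d = w**2 / (4*d_coef) for an observation area of radius w;
    shrinking the spot shortens tau_d quadratically."""
    return w ** 2 / (4.0 * d_coef)

D = 1.0e-12                                # m^2/s, order of magnitude for a lipid analog
tau_confocal = diffusion_time(250e-9, D)   # diffraction-limited illumination
tau_nsom = diffusion_time(100e-9, D)       # sub-wavelength nsom aperture
```

the quadratic dependence of tau_d on the spot radius is precisely why sub-diffraction illumination resolves diffusion details (e.g. nanoscale compartmentalization) that larger areas average away.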
slbs with lipid rafts were successfully formed onto bare au (111), although surface modification with an 11-mercapto-undecanoic acid sam stabilized the slbs due to its charge and hydrophilicity. the different experimental conditions tested had an impact on the nano/microdomain organization observed by atomic force microscopy in buffer solution. surface characterization through the combined use of ellipsometry, cyclic voltammetry and afm allowed us to optimize the conditions for the formation of more planar and compact slbs. it is widely accepted that the conversion of the soluble, nontoxic amyloid β-protein (aβ) monomer to aggregated toxic aβ rich in β-sheet structures is central to the development of alzheimer's disease. however, the mechanism of the abnormal aggregation of aβ in vivo is not well understood. we have proposed that ganglioside clusters in lipid rafts mediate the formation of amyloid fibrils by aβ, the toxicity and physicochemical properties of which are different from those of aβ amyloids formed in solution [1, 2] . in this presentation, we report a detailed mechanism by which aβ-(1-40) fibrillizes in raft-like lipid bilayers composed of gm1/cholesterol/sphingomyelin. at lower concentrations, aβ formed an α-helix-rich structure, which was cooperatively converted to a β-sheet-rich structure above a threshold concentration. the structure was further changed to a seed-prone β structure at higher concentrations. the seed recruited aβ in solution to form amyloid fibrils. hepatitis c virus (hcv) has a great impact on public health, affecting more than 170 million people worldwide, since it is the cause of liver-related diseases such as chronic hepatitis, cirrhosis and hepatocarcinoma. hcv enters the host cell through fusion of the viral and cellular membranes, replicates its genome in a membrane-associated replication complex, and its morphogenesis has been suggested to take place in the endoplasmic reticulum (er) or modified er membranes. 
the variability of the hcv proteins gives the virus the ability to escape the host immune surveillance system and notably hampers the development of an efficient vaccine. hcv has a single-stranded genome which encodes a polyprotein, cleaved by a combination of cellular and viral proteases to produce the mature structural proteins (core, e1, e2, and p7) and the non-structural ones (ns2, ns3, ns4a, ns4b, ns5a and ns5b), the latter being associated with membranes originating from the er in the emerging virus. the ns4b protein, a fundamental player in the hcv replicative process and the least characterized hcv protein, is a highly hydrophobic protein associated with er membranes. it has recently been shown that its c-terminus is palmitoylated and that the n-terminal region has potent polymerization activity. the expression of ns4b induces the formation of the so-called membranous web, which has been postulated to be the hcv rna replication complex. thus, a function of ns4b might be to induce a specific membrane alteration that serves as a scaffold for the formation of the hcv replication complex, and it therefore has a critical role in the hcv cycle. due to the highly hydrophobic nature of ns4b, a detailed structure determination of this protein is very difficult. the ns4b protein is an integral membrane protein with four or more transmembrane domains. the c-terminal region of ns4b is constituted by two α-helices, h1 (approximately from amino acid 1912 to 1924) and h2 (approximately from amino acid 1940 to 1960), which have been studied as potential targets for inhibiting hcv replication. previous studies from our group, based on the effect of ns4b peptide libraries on model membrane integrity, have allowed us to propose the location of different segments in this protein that would be implicated in lipid-protein interaction. additionally, the h1 region could be an essential constituent in the interaction between the protein and the membrane. 
in this study we show that peptides derived from the c-terminal domain of the ns4b protein of hcv are able to bind with high affinity to biomembranes, significantly destabilizing them and affecting their biophysical properties. there were also differences in the interaction of the peptide depending on the lipid composition of the membranes studied. we have also applied fluorescence spectroscopy, infrared spectroscopy and differential scanning calorimetry, which have given us a detailed biophysical picture of the interaction of the peptide with model biomembranes. this work was supported by grant bfu2008-02617-bmc (ministerio de ciencia y tecnología, spain) to j.v. a semi-quantitative theory describing the adhesion kinetics between soft objects such as living cells or vesicles was developed. the nucleation-like mechanism has been described in the framework of a non-equilibrium fokker-planck approach accounting for adhesion patch growth and dissolution (a. raudino, m. pannuzzo, j. chem. phys. 132, 045103 (2010)). a well-known puzzling effect is the dramatic enhancement of the adhesion/fusion rate of lipid membranes by water-soluble polymers that do not specifically interact with the membrane surface. we extend the previous approach by molecular dynamics simulations in the framework of a coarse-grained picture of the system (lipid+polymer+ions embedded in an explicit water medium) in order to test and support our previous analytical results. simulations show that the osmotic pressure due to the polymer exclusion from the inter-membrane spacing is partially balanced by an electrostatic pressure. however, we also evidenced an interesting coupling between osmotic forces and electrostatic effects. indeed, when charged membranes are considered, polymers of low dielectric permittivity are partially excluded from the inter-membrane space because of the increased local salt concentration. 
the increased salt concentration also means a larger density of divalent ions, which form a bridge at the contact region (stronger adhesion). the overall effect is a smaller membrane repulsion. this effect disappears when neutral membranes are considered. the model could explain the fusion kinetics between lipid vesicles, provided the short-range adhesion transition is the rate-limiting step of the whole fusion process. the role of ceramide acyl chain length and unsaturation on membrane structure sandra n. ceramide fatty acid composition selectively regulates distinct cell processes by a yet unknown mechanism. however, evidence suggests that biophysical processes are important in the activation of signalling pathways. indeed, ceramide strongly affects membrane order, induces gel/fluid phase separation and forms highly ordered gel domains. the impact of the ceramide n-acyl chain on the biophysical properties of a fluid membrane was studied in popc membranes mixed with distinct ceramides. our results show that: i) saturated ceramide has a stronger impact on the fluid membrane, increasing its order and promoting gel/fluid phase separation, while the unsaturated counterparts have a lower (c24:1 ceramide) or no (c18:1 ceramide) ability to form gel domains at physiological temperature; ii) differences between distinct saturated species are smaller and are related mainly to domain morphology; and iii) very long chain ceramide induces the formation of tubular structures probably associated with interdigitation. these results suggest that the generation of different ceramide species in cell membranes has a distinct biophysical impact, with acyl chain saturation dictating membrane lateral organization, and chain asymmetry governing interdigitation and membrane morphology. the extra-high content of cholesterol (chol) in fiber-cell membranes in the eye lens leads to chol saturation and the formation of cholesterol bilayer domains (cbds). 
it is hypothesized that high enrichment in cholesterol helps to maintain lens transparency and protects against cataractogenesis. in model studies, the cbd is formed in a phospholipid bilayer when the cholesterol content exceeds the cholesterol solubility threshold; thus, the cbd is surrounded by a phospholipid bilayer saturated with cholesterol. in the present study, we carried out molecular dynamics (md) simulations of two bilayers: a palmitoyloleoylphosphatidylcholine (popc) bilayer (reference) and a 1:1 popc-chol bilayer, to investigate the smoothing effect of a saturating amount of cholesterol on the bilayer. to our knowledge, this effect has not been studied so far, so this study provides new results. our results indicate that saturation with cholesterol significantly narrows the distribution of vertical positions of the center-of-mass of the popc molecules and the popc atoms in the bilayer and smooths the bilayer surface. we hypothesize that this smoothing effect decreases light scattering and helps to maintain lens transparency. the phospholipid content of staphylococcus aureus membranes displays a high degree of variability (1-3). the major phospholipids found in s. aureus are phosphatidylglycerol (pg), cardiolipin (cl) and lysylphosphatidylglycerol (lpg) (1), the concentrations of which are environment-dependent and fluctuate on exposure to high concentrations of positively charged moieties (4). upregulation of lpg has a suspected role in neutralisation of the plasma membrane in response to cationic threats. studies have been conducted to probe biomimetic models of this theory; however, our focus is to look at atomic details of membrane extracts in the presence of magainin f5w. s. aureus 476 lipid extracts from cells grown at ph 5.5 and 7.0 were studied by neutron diffraction with and without peptide at two contrasts. d-spacings were assessed by vogt area fitting and bragg's law. the bilayer separation at low ph was ~1-2 å less than at ph 7.0. 
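the d-spacings quoted here follow from bragg's law, n·λ = 2d·sin(θ), applied to the lamellar diffraction peaks. a minimal sketch of the calculation; the wavelength and repeat distance below are illustrative numbers, not the experimental values from this study.

```python
import math

def bragg_d_spacing(wavelength, theta, order=1):
    """lamellar repeat distance from bragg's law: n*lambda = 2*d*sin(theta),
    with theta the bragg angle in radians."""
    return order * wavelength / (2.0 * math.sin(theta))

# hypothetical numbers: 4.8 angstrom neutrons, first-order lamellar peak
lam = 4.8                               # angstrom
theta = math.asin(lam / (2.0 * 50.0))   # angle at which a 50 angstrom repeat diffracts
d = bragg_d_spacing(lam, theta)         # recovers the 50 angstrom repeat
```

the few-ångström shifts in d reported above thus correspond directly to small shifts of the measured peak angle.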
with peptide, the bilayer separations of the ph 5.5 and 7.0 extracts were reduced by ~2 å and ~4 å, respectively. the reduced pg content of the low-ph extracts is suggested to reduce the d-spacing; however, the presence of peptide further reduces it, possibly by an anion neutralisation effect. abnormal d-spacing at increased humidity may be due to the breakdown of lpg. activation of neutrophils releasing hocl and apoptosis of vein endothelial cells are events documented to occur in the course of atherosclerosis. as lipid chlorohydrins, which are the key products of the reaction between hocl and unsaturated fatty acid residues, were found in atherosclerotic plaques, we decided to check their biological activity in the context of their ability to act as mediators of hocl-induced oxidative stress and apoptosis in a culture of immortalized human umbilical vein endothelial cells (huvec-st). the concentration of reactive oxygen species was found to be elevated after 1 h of cell incubation with phosphatidylcholine chlorohydrins. this effect was at least partially caused by the leakage of superoxide anion from mitochondria and was followed by depletion of gsh and total thiols. a significant decrease of the antioxidant capacity of cell extracts was also observed. the intracellular redox imbalance was accompanied by an increase of the ratio between the phosphorylated and dephosphorylated forms of p38 map kinase. after longer incubation a significant number of apoptotic cells appeared. summing up, phosphatidylcholine chlorohydrins may be regarded as signaling molecules, able to initiate signalling pathways by induction of oxidative stress. giant unilamellar vesicles (guvs) are a valuable tool in the study of the lateral distribution of biological membrane components. guv dimensions are comparable to typical cell plasma membranes, and lipid phase separation can be observed through fluorescence microscopy. 
guv studies frequently require immobilization of the vesicles, and several methods are available for that purpose. one of the most common methodologies for vesicle immobilization is the use of avidin/streptavidin-coated surfaces and biotin-labeled lipids at very low concentration in the vesicles. here, we analyze the effect of using this methodology on lipid domain distribution for different lipid compositions. we show that, as a result of the non-homogeneous distribution of biotin-labeled lipids between liquid disordered, liquid ordered and gel phases, the distribution of lipid domains inside guvs can be dramatically affected. monitoring membrane permeability: development of a sicm approach christoph saßen 1,2 and claudia steinem 1 1 institute for organic and biomolecular chemistry, university of göttingen, tammannstraße 2, 37077 göttingen, germany, 2 ggnb doctoral program: imprs, physics of biological and complex systems scanning ion conductance microscopy (sicm) utilises a nanopipette containing an electrode as a probe for surface investigations with a resolution of 1/3 of the inner pipette diameter. experiments are conducted under physiological conditions, in situ and without mechanical contact between probe and sample. hence, sicm serves as a well-suited technique for the investigation of soft objects such as cells or artificial lipid membranes. using pore-suspending membranes (psms) as a model system, interactions of melittin, as an example of cell-penetrating peptides (cpps), with lipid membranes are investigated by means of sicm. formation of a range of solvent-free psms from lipid vesicles has been achieved, as confirmed by means of fluorescence microscopy and sicm. application of melittin results in rupturing of the lipid bilayer. putative insights gained from this assay are critical concentrations of membrane-permeabilising cpps and answers to mechanistic questions, e.g. whether cpps merely translocate or form pores within the lipid bilayer. 
positioning of the z-ring in escherichia coli prior to cell division is regulated by intracellular pole-to-pole oscillation and membrane binding of min proteins, allowing assembly of ftsz filaments only at the center plane of the cell. in order to investigate the influence of membrane geometry on the dynamic membrane-binding behavior of min proteins, we combined concepts of synthetic biology and microfabrication technology. glass slides were patterned by a gold coating with microscopic windows of different geometries, and supported lipid bilayers (slbs) were formed on these microstructures. on slbs, min proteins organize into parallel waves. confinement of the artificial membranes determined the direction of propagation. min waves could be guided along curved membrane stripes, in circles and even along slalom geometries. in elongated membrane structures, the protein waves always propagate along the longest axis. coupling of protein waves across spatially separated membrane patches was observed, dependent on gap size and the viscosity of the aqueous medium above the bilayer. this indicates the existence of an inhomogeneous and dynamic protein gradient above the membrane. minimal systems for membrane-associated cellular processes petra schwille biotechnology center biotec, technical university of dresden, germany the drive to identify minimal biological systems, particularly of subcellular structures or modules, has in the past years been very successful, and crucial in vitro experiments with reduced complexity can nowadays be performed, e.g., on reconstituted cytoskeleton and membrane systems. in this overview talk, i will first discuss the virtues of minimal membrane systems, such as guvs and supported membranes, in quantitatively understanding protein-lipid interactions, in particular lipid domain formation and its relevance to protein function. 
membrane transformations, such as vesicle fusion and fission, but also vesicle splitting, can be reconstituted in these simple subsystems, due to the inherent physical properties of self-assembled lipids, and it is a compelling question how simple a protein machinery may be that is still able to regulate these transformations. as an exciting example of the power of minimal systems, i show how the interplay between a membrane and only two antagonistic proteins from the bacterial cell division machinery can result in the emergence of protein self-organization and pattern formation, and discuss the possibility of reconstituting a minimal divisome. quantitative microscopic analysis reveals cell-confluence-regulated divergence of pdgfr-initiated signaling pathways with logically streamlined cellular outputs árpád szöőr, lászló ujlaky-nagy, jános szöllősi, györgy vereb university of debrecen, department of biophysics and cell biology, debrecen, hungary platelet-derived growth factor receptors (pdgfr) play an important role in the proliferation and survival of tumor cells. pdgf-bb stimulation caused a redistribution of pdgf receptors towards gm1-rich domains, which was more prominent in confluent monolayers. pdgf-bb stimulation significantly increased the relative receptor phosphorylation of the ras / mapk pathway-specific tyr716 residues and the pi3-kinase / akt pathway-specific tyr751 residues in nonconfluent cultures. tyr771 residues, which serve as adaptors for ras-gap, which inactivates the mapk pathway, and tyr1021 residues, feeding into the plc-gamma / camk-pkc pathway, were the docking sites significantly hyperphosphorylated following ligand stimulation in confluent cells. we found that p-akt-facilitated cell survival and p-mapk-dependent proliferation are more activated in dispersed cells, while phospholipase c-gamma-mediated calcium release and pkc-dependent rhoa activation are the prominent output features the pdgf stimulus achieves in confluent cultures. 
these observations suggest that the same stimulus is able to promote distinctly relevant signaling outputs, namely, cell division and survival in sparse cultures, and inhibition of proliferation joined with promotion of migration in confluent monolayers that appear contact-inhibited. a thermodynamic approach to phase coexistence in ternary cholesterol-phospholipid mixtures jean wolff, carlos m. marques and fabrice thalmann institut charles sadron, université de strasbourg, cnrs upr 22, 23 rue du loess, strasbourg cedex, f-67037, france e-mail: thalmann@ics-cnrs.unistra.fr we present a simple and predictive model for describing the phase stability of ternary cholesterol-phospholipid mixtures. assuming that competition between the liquid and gel organizations of the phospholipids is the main driving force behind lipid segregation, we derive a phenomenological gibbs free energy of mixing based on the calorimetric properties of the lipids' main transition. gibbs phase diagrams are numerically obtained that reproduce the most important experimental features of dppc-dopc-chol membranes, such as regions of triple coexistence and liquid-ordered-liquid-disordered segregation. based on this approach, we present a scenario for the evolution of the phase diagram with temperature. results for other phospholipid species, such as popc or psm, will also be presented. interleukin-2 and -15 receptors play a central role in the activation, survival and death of t lymphocytes. they form supramolecular clusters with mhc i and ii glycoproteins in t cells. in damaged or inflamed tissues the extracellular k+ concentration increases, which can depolarize the membrane. the common signaling beta and gamma chains of il-2/15r are phosphorylated upon cytokine binding and acquire a permanent dipole moment, thus their conformation, interactions, mobility and activity may be sensitive to the membrane potential. we induced depolarization of ft 7.10 t lymphoma cells by increasing the extracellular 
k+ level or by blocking kv1.3 voltage-gated k+ channels with margatoxin. fcs measurements showed that the lateral mobility of fab-labeled il-2/15r and mhc i and ii decreased upon depolarization, while that of gpi-linked cd48 did not change. the fret efficiency measured between some elements of the il-receptor/mhc cluster increased, which may reflect an increase in cluster size. il-2-induced receptor activity, as monitored by measuring stat5 phosphorylation, increased upon depolarization, whereas il-15-induced phosphorylation did not change. our results may reveal a novel regulatory mechanism of receptor function by the membrane potential. cytokines play an important role in t cell activation and immunological memory, whereas mhcs are known for their role in antigen presentation. we applied rnai to silence the expression of mhc i in order to study its possible role in receptor assembly and function. fret data indicated that the association of il-2r and il-15r with mhc i, as well as that between il-2r and il-15r, weakened. fcs indicated an increase in receptor mobility, also suggesting partial disassembly of the clusters. mhc i gene silencing led to a remarkable increase of il-2/il-15-induced phosphorylation of stat5 transcription factors. in search of the molecular background of this inhibition of signaling by mhc i, we checked il-2 binding and the formation of the receptor complex (il-2rα-il-2rβ association), but we did not find a difference as compared to the control. our results suggest that mhc i plays an organizing role in maintaining supramolecular receptor clusters and inhibits il-2r signaling, revealing a nonclassical new function of mhc i beyond its classical role in antigen presentation. interleukin-4 (il-4) is an important cytokine involved in adaptive immunity. il-4 binds with high affinity to the single-pass transmembrane receptor il-4rα. 
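the fret readouts used in the receptor-clustering studies above report on nanometre-scale proximity through the förster relation, in which efficiency falls off with the sixth power of the donor-acceptor distance. a minimal sketch; the förster radius and distances below are illustrative assumptions, not values from these experiments.

```python
def fret_efficiency(r, r0):
    """förster transfer efficiency for a donor-acceptor pair at distance r:
    e = r0**6 / (r0**6 + r**6), with r0 the förster radius (50% transfer)."""
    return r0 ** 6 / (r0 ** 6 + r ** 6)

R0 = 5.0  # nm, a typical förster radius; illustrative, not the pairs used above
e_close = fret_efficiency(4.0, R0)   # tightly clustered donor and acceptor
e_far = fret_efficiency(6.0, R0)     # looser cluster -> markedly lower efficiency
```

the steep sixth-power dependence is why a modest tightening of a receptor cluster registers as a measurable increase in fret efficiency.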
the occupied complex, il-4/il-4ra, then engages either il-2rc or il-13ra1, to form an activated type i or type ii receptor, respectively. this formation of heterodimers is believed to trigger cross-activation of intracellular janus kinases. here we follow a fluorescently labeled ligand through various stages of receptor activation in hek293t cells: using fluorescence correlation spectroscopy (fcs), we see that the receptor chains diffuse as monomers within the plasma membrane. dual-color fccs provides direct evidence for ligand-induced co-diffusion of occupied il-4ra and il-13ra1. in contrast, type i complexes containing il-2rc could not be observed. however, ectopic expression of gfp-tagged il-2rc/jak3 induced stable fluorescent speckles in or close to the plasma membrane. we identified these structures as early sorting endosomes by colocalization of markers like eea1 and rab gtpases. the il-4ra chain is continuously trafficking into these compartments. these observations suggest that the formation of a type i il-4r heterodimer may require internalization and that early endosomes serve as a platform for il-4 signaling. among membrane-associated proteins, the ras family of lipid-anchored g proteins plays a key role in a large range of physiological processes and, more importantly, is deregulated in a large variety of cancers. in this context, plasma membrane heterogeneity appears as a central concept, since it ultimately tunes the specification and regulation of ras-dependent signaling processes. therefore, to investigate the dynamic and complex lateral organization of the membrane in living cells, we have developed an original approach based on molecular diffusion measurements performed by fluorescence correlation spectroscopy at different spatial scales (spot-variable fcs, svfcs) (1). we have shown in a variety of cell types that lipid-based nanodomains are instrumental for cell membrane compartmentalization.
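the diffusion-time readout that fcs and svfcs rely on can be sketched with the usual 2d autocorrelation model for membrane diffusion; the following is a minimal illustration on synthetic, noise-free data (the beam waist and the parameter values are assumed for illustration, not those of the studies above):

```python
import numpy as np
from scipy.optimize import curve_fit

def g_2d(tau, n, tau_d):
    """2d fcs autocorrelation for free membrane diffusion: G(tau) = (1/N) / (1 + tau/tau_d)."""
    return (1.0 / n) / (1.0 + tau / tau_d)

# synthetic "measured" curve with illustrative parameters
tau = np.logspace(-5, 0, 200)            # lag times, s
g_meas = g_2d(tau, n=8.0, tau_d=2e-3)    # 8 molecules in focus, tau_d = 2 ms

popt, _ = curve_fit(g_2d, tau, g_meas, p0=(1.0, 1e-3))
n_fit, tau_d_fit = popt

# diffusion coefficient from the diffusion time and beam waist w: D = w^2 / (4 * tau_d)
w = 0.25e-6                              # 250 nm waist (assumed)
d_coeff = w**2 / (4.0 * tau_d_fit)       # m^2/s
```

in svfcs the same fit is repeated at several waists w; the intercept of tau_d versus w^2 is what diagnoses nanodomain confinement versus free diffusion.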
we have also observed that these nanodomains are critically involved in the activation of signaling pathways and are essential for physiological responses (2-3). more recently, we extended the application of svfcs to characterize the dynamics of the k-ras protein at the plasma membrane. as a major result, we demonstrated that the rate of k-ras association/dissociation from the membrane is fast but varies as a function of the activation state of the molecule as well as of specific intracellular protein interactions. we have thus demonstrated that a helical lid sub-domain in the sbd is essential for monomeric as binding, but not for the anti-aggregation activity of the chaperone, suggesting that hsp70 is able to interact with pre-fibrillar oligomeric species formed during as aggregation and that, therefore, the mechanism of binding for these species is different from that of the monomeric protein. aggregation of the acylphosphatase from sulfolobus solfataricus (sso acp) into amyloid-like protofibrils is induced by the establishment of an intermolecular interaction between an 11-residue unfolded segment at the n-terminus and the globular unit of another molecule. we have used data from hydrogen/deuterium exchange experiments, intermolecular paramagnetic relaxation enhancements and isothermal titration calorimetry measurements on an aggregation-resistant sso acp variant lacking the 11-residue n-terminus to characterize the initial steps of the aggregation reaction. under solution conditions that favour aggregation of the wild-type protein, the truncated protein was found to interact with a peptide corresponding to the n-terminal residues of the full-length protein. this interaction involves the fourth strand of the main b-sheet structure of the protein and the loop following this region, and induces a slight decrease in protein flexibility.
we suggest that the amyloidogenic state populated by sso acp prior to aggregation does not present local unfolding but is characterized by increased dynamics throughout the sequence that allow the protein to establish new interactions, leading to the aggregation reaction. amyloid-like aggregates alter the membrane mobility of gm1 gangliosides martino calamai and francesco pavone university of florence, lens - european laboratory for non-linear spectroscopy, sesto fiorentino, florence, italy neuronal dysfunction in neurodegenerative pathologies such as alzheimer's disease is currently attributed to the interaction of amyloid aggregates with the plasma membrane. amongst the variety of toxic mechanisms proposed, one involves the binding of amyloid species to gm1 gangliosides. gm1 takes part in the formation of membrane rafts, and exerts antineurotoxic, neuroprotective, and neurorestorative effects on various central neurotransmitter systems. in this study, we investigated the effects of amyloid-like aggregates formed by the highly amyloidogenic structural motif of the yeast prion sup35 (sup35nm) on the mobility of gm1 on the plasma membrane of living cells. preformed sup35nm aggregates were incubated with cells and gm1 molecules were subsequently labeled with biotinylated ctx-b and streptavidin quantum dots (qds). single qds bound to gm1 were then tracked. the mobility of gm1 was found to decrease dramatically in the presence of sup35nm aggregates, switching from brownian to mainly confined motion. the considerable interference of amyloid-like aggregates with the lateral diffusion of gm1 might imply a consequent loss of function of gm1, thus helping to explain the toxic mechanism ascribed to this particular interaction. insights into the early stages of fibrillogenesis of insulin using mass spectrometry harriet l. insulin is a vital hormone in metabolic processes as it regulates glucose levels in the body.
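the switch from brownian to confined motion reported for the tracked qds is usually read off the time-averaged mean-squared displacement (msd): linear growth indicates free diffusion, a plateau indicates confinement. a minimal sketch on a synthetic trajectory (the diffusion coefficient, frame time and confinement size are illustrative assumptions, not values from the study):

```python
import numpy as np

def msd(track, max_lag):
    """time-averaged mean-squared displacement of a 2d trajectory of shape (n_steps, 2)."""
    return np.array([np.mean(np.sum((track[lag:] - track[:-lag])**2, axis=1))
                     for lag in range(1, max_lag + 1)])

rng = np.random.default_rng(0)
dt, d = 0.05, 0.01                       # frame time (s), diffusion coeff (um^2/s); illustrative
steps = rng.normal(0.0, np.sqrt(2 * d * dt), size=(5000, 2))
free = np.cumsum(steps, axis=0)          # brownian random walk
confined = np.clip(free, -0.1, 0.1)      # crude 0.2 um square confinement

msd_free = msd(free, 50)                 # grows roughly as 4*d*t
msd_conf = msd(confined, 50)             # saturates near the domain size
```

fitting msd(t) = 4*d*t at short lags, or msd against a confined-diffusion model, is the standard way such single-particle tracks are classified.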
insulin is stored in the beta cells of the pancreas as a hexamer; however, its biologically active form is the monomer. the formation of fibrillar aggregates of insulin rarely occurs in the body; however, localised amyloidosis at the site of injection for diabetes patients and aggregation of pharmaceutical insulin stocks present problems. in the current study, oligomers formed early in the process of fibril assembly in vitro are observed by mass spectrometry (ms). ms is the only technique which allows these early species to be characterised, as it can identify different oligomeric orders by mass-to-charge ratio and show protein abundance and aggregation propensity. ion mobility ms is used to examine rotationally averaged collision cross sections of oligomers in the aggregating solution. a wide array of oligomers is observed and the stability of specific species is noted. the presence of multiple conformations for the highly charged oligomers is particularly noted and their assignment confirmed using fourier transform ion cyclotron resonance ms and collision-induced dissociation. molecular modelling has been used to further explore the conformational space the oligomers inhabit. amyloid fibrils consisting of different proteins have been recognized as an accompanying feature of several neurodegenerative diseases. many proteins without known connection to any disease have been found to form amyloid fibrils in vitro, leading to the suggestion that the ability to form fibrils is an inherent property of the polypeptide chain. the observed common character of protein amyloid formation enables us to seek further clues to the fibrillation mechanism by studying generalized sequenceless polypeptide models, e.g. polylysine. we have studied conformational transitions of polylysine with different chain lengths at various ph, ionic strength and temperature by means of a novel approach, the viscometric method.
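oligomeric orders are read out of native ms spectra by matching observed peaks against predicted m/z values for an n-mer in charge state z. a hedged sketch of that bookkeeping (the insulin monomer mass below is an approximate average value assumed for illustration, and the function names are ours):

```python
M_MONO = 5808.0   # approximate average mass of an insulin monomer, da (assumed)
M_H = 1.00728     # proton mass, da

def mz(n, z):
    """m/z of an n-mer carrying z protons."""
    return (n * M_MONO + z * M_H) / z

def assign(observed, n_max=12, z_max=20, tol=2.0):
    """return the (n, z) pair whose predicted m/z is closest to the observed peak,
    or None if nothing on the grid falls within the tolerance (in da/charge)."""
    best = min(((n, z) for n in range(1, n_max + 1) for z in range(1, z_max + 1)),
               key=lambda nz: abs(mz(*nz) - observed))
    return best if abs(mz(*best) - observed) <= tol else None
```

note that (n, z) and (2n, 2z) are degenerate in m/z; resolving such ambiguities is one reason ion mobility and high-resolution ms are brought in, exactly as the abstract describes.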
this polypeptide undergoes an alpha-helix to beta-sheet transition and forms amyloid fibrils under particular conditions. the temperature-induced alpha-helix to beta-sheet transition occurs in the ph interval from 10 to 11.8 and is slightly shifted to lower ph with increasing chain length. we have found a narrow ph interval in which the thermal transition is fully reversible, suggesting the high sensitivity of the polypeptide conformation to subtle changes in the charge on its side chains. this work was supported within the projects vega 0079, 0038 and 0155, cex sas nanofluid and by esf project no. 26220120033. understanding the mechanisms of the conversion from the native state of a protein to the amyloidal state represents a fundamental step in improving the purification, storage and delivery of protein-based drugs, and it is also of great relevance for developing strategies to prevent in vivo protein aggregation. amyloid fibrils have a structural arrangement of cross b-sheets, but they can also experience different packing into three-dimensional superstructures, i.e., polymorphism. it is well known that, among others, both the geometric confinement of the molecules and shear forces can affect the final morphology of the aggregates. importantly, due to the complexity and crowding of the cellular environment, such parameters also play a crucial role in in vivo processes. we present an experimental approach to study in vitro amyloid aggregation in a controlled and uniform shear force field and within microscale environments. in particular we focus on the effect of these two parameters on the formation of spherical aggregates, known as spherulites. using microchannels with cross-sections from 5 to 1000 µm x 12 µm and flow rates in the range of hundreds of µl/min, the number and diameter of spherulites within the channels have been characterized using crossed-polarizer optical microscopy. inhibition of insulin amyloid fibrillization by albumin magnetic fluid k.
insulin amyloid aggregation causes serious problems for patients with insulin-dependent diabetes undergoing long-term treatment by injection, in the production and storage of this drug, and in the application of insulin pumps. recent studies indicate that protein amyloid aggregation causes cell impairment and death; hence, preventing amyloid aggregation is beneficial. we have investigated the ability of albumin magnetic fluid (amf) to inhibit insulin amyloid aggregation by spectroscopic and microscopic techniques. albumin magnetic fluid consists of magnetic fe3o4 nanoparticles sterically stabilized by sodium oleate and functionalized with bovine serum albumin (bsa) at various bsa/fe3o4 weight ratios. we have found a positive correlation between the inhibiting activity of amf and the nanoparticle diameter and zeta potential. the ability of amf to inhibit the formation of amyloid fibrils exhibits a concentration dependence, with ic50 values comparable to the insulin concentration. the observed features make amf of potential interest as an agent for addressing the problems associated with insulin amyloid aggregation. (this work was supported within the projects vega 0079, 0155 and 0077, cex sas nanofluid, apvv-0171-10, sk-ro-0012-10 and esf project 26220220005). amyloid formation of peptides causes diseases like alzheimer's and parkinson's disease. however, the conditions for the onset of the neurotoxic beta-sheet formation are poorly understood. we focus on aggregation triggers and their interplay: interactions with hydrophobic-hydrophilic interfaces, orientation of peptides in 2d, metal ion complexation and lipid layers. the tailor-made model peptides exhibit defined secondary structure propensities and metal ion binding sites. the interactions of the peptide with the air-water interface and with metal ions are studied using surface-sensitive methods connected to film balance measurements.
x-ray diffraction, x-ray reflection, infrared reflection-absorption spectroscopy and total reflection x-ray fluorescence were applied to reveal the layer structure, peptide conformations and metal ion binding at the interface. we found that amyloid formation in 2d is dominated by the hydrophobic-hydrophilic interface and is not comparable to the bulk behaviour. the interface can enhance or inhibit beta-sheet formation. the effect of metal ion complexation depends on the arrangement of the binding sites in the peptide and the preferred metal complexation geometry. the two triggers, interface and metal ion complexation, can oppose each other. effect of apoe isoform and lipidation status on proteolytic clearance of the amyloid-beta peptide hubin, e. alzheimer's disease (ad) is the most common type of dementia in the elderly. the most important genetic risk factor identified for ad is the isoform, e2, e3 or e4, of apolipoprotein e (apoe), a lipid-carrying protein. one hallmark of ad is the accumulation of amyloid-beta peptide (ab) in the brain, which is thought to result from an imbalance between the production of ab and its clearance. previous studies report an important role for apoe in ab degradation. we sought to determine the effect of apoe isoform and lipidation status on the degradation of soluble ab by proteinases such as insulin-degrading enzyme and neprilysin. in this study an in vitro ab clearance assay, based on the competition between ab and a fluorogenic peptide substrate, is developed to quantify ab degradation. to elucidate the proteolytic clearance mechanism, the fragments resulting from cleavage are identified by mass spectrometry and further analyzed to identify the stretch of the ab sequence that interacts with the different apoe isoforms. the results suggest that apoe influences the rate of ab degradation. the aggregation of proteins into fibrillar nanostructures is a general form of behaviour encountered for a range of different polypeptide chains.
the formation of these structures is associated with pathological processes in the context of alzheimer's and parkinson's diseases, but is also involved in biologically beneficial roles, which include functional coatings and catalytic scaffolds. this talk focuses on recent work directed at understanding the kinetics of this process through the development and application of experimental biophysical methods and their combination with kinetic theories for linear growth phenomena. lbs contain not only as but also other proteins, including 14-3-3 proteins. 14-3-3 proteins exist mainly as dimers and their exact functions remain unclear. however, recent work has shown the association of 14-3-3(eta) with as in lbs. herein we show how 14-3-3(eta) can modulate the in vitro aggregation behavior of as, rerouting it toward the formation of stable non-fibrillar aggregates. we also show that the resulting populations of fibrillar and pre-fibrillar aggregates exhibit a modified toxicity in vivo with respect to the unperturbed aggregates. interestingly, 14-3-3(eta) does not show any binding affinity for monomeric as, nor for the mature fibrillar aggregates. we provide evidence that it acts on the oligomeric species which form during the amyloidogenic aggregation process. since 14-3-3(eta) can influence the toxicity of amyloidogenesis without perturbing the functional as monomers, we are convinced that, once fully understood, its mode of action could represent a promising model to mimic with synthetic drugs and peptides. what makes an amyloid toxic: morphology, structure or interaction with membranes? more than 40 human diseases are related to amyloids. in order to understand why some amyloids become toxic to their host and others do not, we first developed a genetic approach in the yeast saccharomyces cerevisiae. we have chosen the amyloid/prion protein het-s prion-forming domain (218-289) from the fungus podospora anserina, which is not toxic in yeast.
some toxic amyloid mutants were generated by random mutagenesis. in vitro, the most toxic mutant, called ''m8'', displays very peculiar nanofibers, which polymerize mainly into antiparallel amyloid b-sheets, whereas the non-toxic wt exhibits parallel polymerization. we further established the dynamics of assembly of the m8 toxic amyloid, in comparison to the wt non-toxic amyloid, and showed the presence of specific oligomeric intermediates also organized in antiparallel b-sheet structures. a more global structure/toxicity study on more than 40 mutants clearly identified an antiparallel b-sheet signature for all the toxic mutants. therefore size, intermediates and antiparallel structures may account for amyloid toxicity in yeast, but we still wonder what their cellular targets are. recently, we established the first evidence that toxic mutants may specifically bind in vitro to lipids, particularly negatively charged ones. interconnected mechanisms in abeta (1-40) we present an experimental study on the fibril formation of the ab(1-40) peptide at ph 7.4. the kinetics of this process is characterized by the occurrence of multiple transient species that give rise to final aggregates whose morphology and molecular structure are strongly affected by the growth conditions. to observe in detail the aggregation pathway as a function of solution conditions, we have used different experimental techniques such as light scattering, thioflavin t fluorescence, circular dichroism and two-photon fluorescence microscopy. this approach gives information on the time evolution of conformational changes at the molecular level, on aggregate/fibril growth and on their morphologies. the selected experimental conditions allowed us to highlight the existence of at least three different aggregation mechanisms acting in competition.
a first assembly stage, which involves conformational conversion of native peptides, leads to the formation of small ordered oligomers representing an activated conformation from which fibril growth proceeds. this process constitutes the rate-limiting step for two distinct fibril nucleation mechanisms that probably involve spatially heterogeneous processes. the formation of amyloid fibrils of amylin 10-29 was studied by means of molecular dynamics (md) and energy partitioning on three peptide b-sheet stack systems with the same amino acid composition: wild-type amylin 10-29 (amyl 10-29), reverse amylin 10-29 (rev-amyl 10-29) and a scrambled version (scr-amyl 10-29). the results show that for amylin 10-29 peptides, the amino acid composition determines the propensity of a peptide to form amyloid fibrils independently of the sequence. the sequence of amino acids defines the shape and the strength of the amyloid protofibril, which is consistent with the atomic force microscopy (afm) data [1]. the md simulations show that the 6x6 rev-amyl 10-29 stack has looser self-assembly than the 6x6 amyl 10-29 stack, consistent with the results of fourier transform infrared spectroscopy (ftir) measurements for the peptides studied [1]. the md results also show that the 6x6 amyl 10-29 stack could have a turn, which is consistent with the ftir data [1]. data on ab aggregation kinetics have been characterized by a large spread between experiments on identical samples that contain so many molecules that stochastic behaviour is difficult to explain unless caused by uncontrolled amounts of impurities or interfaces. we have therefore spent considerable effort to eliminate sources of inhomogeneity, and have reached a level of reproducibility between identical samples, and between experiments on separate occasions, such that we can now collect data that can lead to mechanistic insights into the aggregation process per se, and into the mechanism of action of inhibitors.
data on ab42 aggregation will be shown that give insight into the influence of physical parameters such as peptide concentration, shear and ionic strength, as well as the effects of inhibitory proteins, model membranes and sequence variations. monte carlo simulations of amyloid formation from model peptides corroborate the findings from experiments and underscore that the very high level of predictability and reproducibility comes from multiple parallel processes. negatively charged membranes were reported to catalyze ''amyloid-like'' fiber formation by non-amyloidogenic proteins [1]. our study aims to elucidate the factors that govern the formation of these amyloid-like fibers. lysozyme was selected as a model non-amyloidogenic protein and was fluorescently labeled with alexa fluor 488 (a488-lz). first, the partitioning of a488-lz into phosphatidylserine-containing liposomes was characterized quantitatively using fluorescence correlation spectroscopy (fcs), in order to calculate the protein coverage of the liposomes. secondly, the interaction between a488-lz and negatively charged lipid membranes was studied using both steady-state and time-resolved fluorescence techniques. this interaction was found to switch from peripheral binding to the anionic headgroups, at high lipid/protein molar ratio (l/p), to partial insertion of the protein into the hydrophobic core of the membrane, at low l/p. finally, the lipid-protein supramolecular complexes formed at low l/p were characterized by fluorescence lifetime imaging microscopy (flim). the mean lifetime of a488-lz in these supramolecular structures is much lower than the values obtained for the free and bound a488-lz at high l/p. the fiber characterization will be complemented by fcs studies. the conversion of normal prpc to its pathological isoform prpsc is a key event in prion diseases and is proposed to occur at the cell surface or, more probably, in acidic late endosomes.
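the mechanistic analyses such kinetic data feed into rest on nucleation-elongation rate laws. a minimal oosawa-type sketch, integrated with a simple euler scheme (the rate constants and concentrations are illustrative assumptions, not fitted values from these experiments, and the model omits secondary nucleation):

```python
import numpy as np

def nucleation_elongation(m_tot=1e-5, k_n=0.125, k_p=1e6, n_c=2,
                          t_end=4000.0, dt=1.0):
    """euler integration of a minimal nucleation-elongation model:
       dP/dt = k_n * m**n_c      primary nucleation of new fibril ends
       dM/dt = 2 * k_p * m * P   elongation at both ends, with m = m_tot - M
    returns times and the fibrillar mass concentration M(t)."""
    n_steps = int(t_end / dt)
    t = np.arange(n_steps) * dt
    mass = np.zeros(n_steps)
    p = 0.0
    for i in range(1, n_steps):
        m = m_tot - mass[i - 1]          # free monomer concentration
        p += k_n * m**n_c * dt           # nucleation
        mass[i] = mass[i - 1] + 2.0 * k_p * m * p * dt   # elongation
    return t, mass

t, mass = nucleation_elongation()
# mass rises sigmoidally from 0 toward m_tot as monomer is consumed
```

fitting the half-time of such curves as a function of initial monomer concentration is the standard route from reproducible kinetic data to mechanism, e.g. distinguishing primary from secondary nucleation.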
a convergence of evidence strongly suggests that the early events leading to the structural conversion of prp are related to more or less stable soluble oligomers, which could mediate neurotoxicity. as with other amyloidogenic proteins, membrane-bound monomers undergo a series of lipid-dependent conformational changes, leading to the formation of oligomers of varying toxicity rich in b-sheet structures (annular pores, amyloid fibrils). here, we have used a combination of biophysical techniques (dynamic and static light scattering, fluorescence techniques, and quartz crystal microbalance) to elucidate the interaction of native monomeric prp and of purified b-rich oligomeric prp with model lipid membranes. under well-established conditions, three b-sheet-rich oligomers, found to form in parallel, were generated from partial unfolding of the monomer in solution. upon single mutation and/or truncation of the full-length prp, the polymerization pathway is strongly affected, revealing the high conformational diversity of prp. in our previous work, we identified the minimal region of the prp protein leading to the same polymerization pattern as the full-length prp. soluble 12-subunit and 36-subunit oligomers were obtained, depending on the single mutation or truncation, and purified. we compare their structural properties (ftir, cd) when associated with anionic lipid bilayers and study their propensities to permeabilize the membrane. fluorescence kinetics suggest different mechanisms of membrane perturbation for the monomer and the prp oligomers. deciphering this complex network of lipid interactions and the conformational diversity of the prp protein will help in understanding how amyloidogenic proteins induce neurotoxicity.
the traditional view of the lipid bilayer as a ''sea'' of lipids in which proteins float freely is becoming inadequate to describe the increasingly large number of complex phenomena known to take place in biological membranes. membrane-assisted protein-protein interactions, formation of lipid clusters, protein-induced variation of the membrane shape, abnormal membrane permeabilities and conformational transitions of membrane-embedded proteins are only a few examples of the variegated ensemble of events whose tightly regulated cross-talk is essential for cell structure and function. experimental work on the above-mentioned problems is very difficult and sometimes not feasible, especially when the studied systems have fast dynamics. due to the large size of the systems usually involved in this multifaceted framework, a detailed molecular description of these phenomena is beyond the possibilities of conventional approaches. amyloid aggregation, a generic behavior of proteins, is related to incurable human pathologies - amyloid-related diseases, associated with the formation of amyloid deposits in the body. all types of amyloid aggregates possess a rich b-sheet structural motif. recent data confirm the toxic effect of aggregates on cells; the reduction of amyloid aggregates therefore plays an important role in the prevention as well as the therapy of amyloidosis. we have investigated the effect of phytoalexin derivatives on the amyloid fibrillization of two proteins, human insulin and chicken egg lysozyme, by tht and ans fluorescence assays. we have found that the amyloid aggregation of both studied proteins was significantly inhibited by the phytoalexin derivatives cyclobrassinin and benzocamalexin. for the most effective phytoalexins the estimated ic50 values were in the low micromolar range. the observed inhibiting activity was confirmed by transmission electron microscopy.
our data suggest the potential therapeutic use of the most effective phytoalexins in the reduction of amyloid aggregation. (this work was supported within the projects vega 0079, 0155, cex sas nanofluid, apvv-0171-10, sk-ro-0012-10 and esf project 26220220005). the amyloid pore hypothesis suggests that interactions of oligomeric alpha-synuclein (as) with membranes play an important role in parkinson's disease. oligomers are thought to permeabilize membranes and interfere with ca2+ pathways. permeabilization by as requires the presence of negatively charged phospholipids. whether as can bind and permeabilize membranes with physiologically relevant lipid compositions has not been extensively explored. here we report on the binding of as to giant unilamellar vesicles (guvs) with physiologically relevant lipid compositions. comparing different protocols of oligomer preparation, leakage assays on both large unilamellar vesicles (calcein release) and guvs (hpts efflux assay) show that as is not able to permeabilize these membranes. the presence of cholesterol has a stabilizing effect on these membrane systems. in agreement with these findings, we do not observe concentration-dependent as toxicity using in vivo mts assays. however, in the calcein release assay, different as preparations show differences in kinetics and in the as concentrations that cause 100% leakage. these results motivate us to critically reassess the amyloid pore hypothesis, and suggest that membrane permeabilization may be attributable only to a very specific as species. alpha-synuclein oligomers impair membrane integrity - a mechanistic view martin t. stöckl, mireille m. a. e. claessens, vinod subramaniam nanobiophysics, mesa+ institute for nanotechnology, university of twente, enschede, the netherlands one of the most prevalent neurodegenerative diseases is parkinson's disease (pd), which is accompanied by the loss of dopaminergic neurons.
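the ''100% leakage'' figures quoted for calcein-release assays come from a standard normalization of the dye fluorescence between the pre-addition baseline and the signal after full lysis with detergent. a small sketch of that bookkeeping (the function and variable names are ours, not from the study):

```python
def leakage_percent(f, f0, fmax):
    """percent dye release in a calcein efflux assay, normalized between the
    baseline fluorescence f0 (intact vesicles, self-quenched dye) and the
    fluorescence fmax after full lysis (e.g. with triton x-100)."""
    if fmax <= f0:
        raise ValueError("fmax must exceed the baseline f0")
    return 100.0 * (f - f0) / (fmax - f0)

# e.g. a reading halfway between baseline and full lysis is 50% leakage
halfway = leakage_percent(550.0, 100.0, 1000.0)
```

comparing the concentration at which this quantity reaches 100% across oligomer preparations is exactly the comparison the abstract describes.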
although the mechanisms leading to the death of these cells are still unclear, the protein alpha-synuclein (as) is one of the pivotal factors. previous studies indicate that especially the oligomeric forms of as show a detrimental effect on membrane integrity. as an intact membrane is crucial to many cellular processes, the impairment of membrane integrity is a likely pathway for neuronal death. we use different phospholipid bilayer model systems to investigate the mechanisms underlying this process. atomic force microscopy in combination with suspended asymmetric phospholipid bilayers, which closely mimic the plasma membrane, allows the identification of the binding sites, the measurement of the penetration depths of the as oligomers into the phospholipid bilayer, and the detection of membrane thinning or the creation of membrane defects. using an approach based on phospholipid vesicles, we were able to observe for the first time that as oligomers cause enhanced lipid flip-flop, suggesting that the loss of lipid asymmetry is a novel mechanism which may contribute to or trigger neuronal death in pd. amyloid protein-membrane interactions and structure-toxicity relationship study h.p. ta m8 (toxic) has a much higher and more specific effect on negatively charged phospholipids (dops, dopi and dopg) than wt (non-toxic). therefore the insertion of the protein into phospholipid monolayers, which occurred similarly for both wt and m8, is not a key factor in these effects (h.p. ta et al., langmuir, in press). we are now using unilamellar vesicles as a membrane model to investigate the amyloid protein (toxic and non-toxic) - phospholipid interactions. the results confirmed the highly specific and strong effects of m8 on negatively charged membranes. in this project, we study the chemical, physical and biological properties of fibrillar networks.
the formation and mechanics of the networks are investigated by combining droplet-based microfluidics with optical microscopy and small-angle x-ray scattering (saxs). the chosen system, fibrin network formation, plays an important role in blood coagulation processes. crosslinking of fibrinogen induced by an enzymatic reaction with thrombin leads to 3d fibrin network formation. the fibrillar networks are formed within picoliter droplets of aqueous solutions in a continuous oil phase. droplets containing fibrinogen and thrombin can be produced in different sizes and stored for fibrin network formation. the formation of the fibrillar networks is imaged by fluorescence microscopy. to analyze the elastic properties of the networks, the droplets flow through a microchannel device of alternating width in order to squeeze and stretch the networks. additionally, saxs experiments will give structural information about the molecular dimensions of the networks. the amyloid beta peptide (ab), implicated in alzheimer's disease (ad), is released from the amyloid precursor protein (app) by secretase-induced cleavage. this process results in the release of a range of ab peptides varying in length. the brains of ad patients often contain longer ab peptides while the total concentration of ab is unaffected. longer peptides are more hydrophobic, with far-reaching consequences for their toxicity and aggregation. as ab is necessary for normal neuronal function, research into ad therapeutic development currently explores the possibility of modulating gamma-secretase activity to produce shorter ab peptides. whether such an approach effectively ameliorates the toxic effect of ab has not yet been explored. to answer this question, we studied the impact of heterogeneity in ab pools in an in vitro biophysical and in cellulo context, using microelectrode arrays to assay the synaptic activity of primary neurons.
we show that various lengths of the ab peptide and mixtures thereof aggregate with distinct kinetics and differentially affect the synaptotoxic and cytotoxic response. we also show that small amounts of the less abundant peptides ab38 and ab43 induce aggregation and toxicity of ab40, while the behavior of ab42 is unaffected. one of the most important irreversible oxidative modifications of proteins is carbonylation, a process that introduces carbonyl groups through reactions with reactive oxygen species. importantly, carbonylation increases with the age of cells and is associated with the formation of intracellular protein aggregates and the pathogenesis of age-related disorders such as neurodegenerative diseases and cancer. however, it is still largely unclear how carbonylation affects protein structure, dynamics and aggregability at the atomic level. here, we use classical molecular dynamics simulations to study the structure and dynamics of the carbonylated headpiece domain of villin, a key actin-organizing protein. we perform an exhaustive set of molecular dynamics simulations of the native villin headpiece together with every single combination containing carbonylated versions of its seven lysine, arginine and proline residues, the quantitatively most important carbonylable amino acids. surprisingly, our results suggest that high levels of carbonylation, far above those associated with cell death in vivo, may be required to destabilize and unfold protein structure through the disruption of specific stabilizing elements, such as salt bridges or proline kinks, or by tampering with the hydrophobic effect.
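the alchemical free-energy comparison underlying such mutation-equivalence arguments is commonly performed by thermodynamic integration; the standard working identity (a general relation, not a result specific to this study) is:

```latex
\Delta G \;=\; \int_0^1 \left\langle \frac{\partial H(\lambda)}{\partial \lambda} \right\rangle_{\lambda} \, d\lambda ,
```

where $H(\lambda)$ interpolates between the hamiltonians of the unmodified ($\lambda=0$) and modified ($\lambda=1$) residue, and $\langle\cdot\rangle_{\lambda}$ denotes an equilibrium ensemble average simulated at fixed $\lambda$; in practice the integral is evaluated numerically over a set of intermediate $\lambda$ windows.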
on the other hand, by using thermodynamic integration and molecular hydrophobicity potential approaches, we quantitatively show that carbonylation of hydrophilic lysine and arginine residues is equivalent to introducing hydrophobic, charge-neutral mutations in their place, and, by comparison with experimental results, demonstrate that this by itself significantly increases the intrinsic aggregation propensity of both structured, native proteins and their unfolded states. finally, our results provide a foundation for a novel experimental strategy to study the effects of carbonylation on protein structure, dynamics and aggregability using site-directed mutagenesis. septins are an evolutionarily conserved family of gtp-binding proteins involved in important cellular processes, such as cytokinesis and exocytosis, and have been implicated in neurological diseases, such as alzheimer's and parkinson's diseases. the focus of this study was two septins of schistosoma mansoni (the causative agent of schistosomiasis in south america), named smsept5 and smsept10, which were produced in a recombinant system. our objective was to verify whether these septins from a simpler organism display similar characteristics to human septins. analysis of protein structure by circular dichroism showed that both recombinant smseptins produced were folded. the gtpase activity assay showed that smsept5 was able to hydrolyze gtp, whereas smsept10 was not. aggregation studies for amyloid fibril detection by right angle light scattering and thioflavin t fluorescence assay were performed. both proteins showed a temperature-dependent increase in light scattering and in fluorescence emission of the tht probe. this indicates that s. mansoni septins tend to aggregate into amyloid-like fibers at high temperatures, with thresholds of 30°c for smsept5 and 37°c for smsept10. these results are in accordance with those previously reported for human septins. 
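the temperature-dependent tht fluorescence increase described above is typically quantified by fitting the aggregation trace to an empirical sigmoid and extracting a half-time. a minimal sketch, where the sigmoid parameters (baseline, amplitude, t50, lag sharpness) are hypothetical illustration values, not data from the abstract:

```python
import math

def tht_sigmoid(t, y0, a, t50, tau):
    """Empirical sigmoid for a ThT aggregation trace:
    baseline y0, amplitude a, half-time t50, transition sharpness tau."""
    return y0 + a / (1.0 + math.exp(-(t - t50) / tau))

def half_time(times, signal):
    """Estimate t50 as the first time the min-max normalized signal
    crosses 0.5 (linear interpolation between neighbouring points)."""
    lo, hi = min(signal), max(signal)
    norm = [(s - lo) / (hi - lo) for s in signal]
    for i in range(1, len(norm)):
        if norm[i - 1] < 0.5 <= norm[i]:
            f = (0.5 - norm[i - 1]) / (norm[i] - norm[i - 1])
            return times[i - 1] + f * (times[i] - times[i - 1])
    return None

# synthetic trace with an assumed t50 of 120 min and tau of 15 min
times = list(range(0, 301, 5))
signal = [tht_sigmoid(t, 0.05, 1.0, 120.0, 15.0) for t in times]
print(round(half_time(times, signal), 1))  # recovers ~120.0
```

the same crossing-based estimate can be applied directly to a measured, normalized tht trace without any curve fitting.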
in our work we investigated the response to standard chemotherapy of blood lymphocytes of patients suffering from melanoma. dna single and double strand breaks were determined using the comet assay; intracellular levels of marker proteins were detected using immunocytochemistry. ultimately this set of parameters allows us to characterize two mechanisms of dna repair (base excision repair, ber, and mismatch repair, mmr) which, together with apoptosis proneness, underlie the response of tumor cells to chemotherapy. cell death caused by o6-methylguanine adducts is promoted by the mmr system through the induction of unrepaired double strand breaks in dna. there was a linear correlation between the level of dsdna breaks in lymphocytes after the first cycle of chemotherapy and the mmr efficiency in them. the level of double strand breaks in dna after the first cycle of chemotherapy is predictive of clinical outcome. in contrast, damage at the level of ssdna (ap sites and single strand breaks), and the ber mechanism associated with it, is not a good prognostic factor for chemotherapy. a high level of double strand breaks in dna in blood lymphocytes of melanoma patients 48 hours after the first cycle of chemotherapy appears to be a marker of a good prognosis. self-assembly and stability of g-quadruplex: counterions, pressure and temperature effects e. baldassarri jr., p. mariani, f. spinozzi, m. g. ortore saifet dept. & cnism, marche polytechnic university, ancona, italy the important role of g-quadruplex in biological systems is based on two main features: composition and stability of telomeres, and activity of telomerase. the g-quadruplex structures are formed by supramolecular organization of basic units called g-quartets, planar rings constituted by four guanosines linked by hoogsteen hydrogen bonds. g-quadruplex requires the presence of monovalent cations playing a key role in stabilizing these structures, since they give rise to the coordination bonds needed for the stacking of more tetrads. 
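the linear correlation reported earlier between dsdna-break levels after the first chemotherapy cycle and mmr efficiency can be checked with a plain pearson coefficient. a minimal sketch; the per-patient numbers below are hypothetical, for illustration only:

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# hypothetical per-patient values: comet-assay dsDNA-break level
# 48 h after the first chemotherapy cycle vs. measured MMR efficiency (%)
breaks = [2.1, 3.4, 4.0, 5.2, 6.1, 7.3, 8.0]
mmr = [20, 31, 38, 49, 58, 70, 77]
print(round(pearson_r(breaks, mmr), 3))  # close to 1 for a linear relation
```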
we performed x-ray diffraction experiments at different pressures (ranging from 1 to 2000 bar), and small angle x-ray scattering (saxs) changing the temperature (between 20 and 60°c). retinoic acid receptor (rar) is a member of the nuclear receptor superfamily. this ligand-inducible transcription factor binds to dna as a heterodimer with the retinoid x receptor (rxr) in the nucleus. the nucleus is a dynamic compartment and live-cell imaging techniques make it possible to investigate transcription factor action in real time. we studied the diffusion of egfp-rar by fluorescence correlation spectroscopy (fcs) in order to uncover the molecular interactions determining receptor mobility. in the absence of ligand we identified two distinct species with different mobilities. the fast component has a diffusion coefficient of d1 = 1.8-6 µm²/s corresponding to small oligomeric forms, whereas the slow component with d2 = 0.06-0.12 µm²/s corresponds to interactions of rar with the chromatin or other large structures. the rar ligand binding domain fragment also has a slow component, probably as a result of indirect dna binding via rxr, with lower affinity than the intact rar:rxr complex. importantly, rar-agonist treatment shifts the equilibrium towards the slow population of the wild type receptor, but without significantly changing the mobility of either the fast or the slow population. by using a series of mutant forms of the receptor with altered dna- or coregulator-binding capacity we found that the slow component is probably related to chromatin binding, and that coregulator exchange, specifically the binding of the coactivator complex, is the main determinant contributing to the redistribution of rar during ligand activation. formation of the inactive nucleus with a high level of dna compaction in sperm cells is accompanied by a substitution of linker histones h1 by a number of other proteins. 
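the two diffusion coefficients quoted in the fcs study above come from fitting the autocorrelation curve to a two-component 3d diffusion model and converting each diffusion time via d = w_xy² / (4 τ_d). a minimal sketch, in which the beam waist (0.25 µm), the structure parameter s and the example diffusion time are all assumed illustration values, not parameters from the abstract:

```python
def g_two_component(tau, n, f_fast, tau_fast, tau_slow, s=5.0):
    """Two-component 3D diffusion autocorrelation for FCS:
    G(tau) = (1/N) * sum_i f_i * (1 + tau/tau_i)^-1 * (1 + tau/(s^2 tau_i))^-0.5
    with N molecules in the focus and structure parameter s = w_z / w_xy."""
    g = 0.0
    for f, td in ((f_fast, tau_fast), (1.0 - f_fast, tau_slow)):
        g += f / ((1.0 + tau / td) * (1.0 + tau / (s * s * td)) ** 0.5)
    return g / n

def diff_coeff(tau_d_s, w_xy_um=0.25):
    """Convert a fitted diffusion time (s) into D (µm²/s): D = w_xy² / (4 τ_D)."""
    return w_xy_um ** 2 / (4.0 * tau_d_s)

tau_d = 0.00868  # s, hypothetical fast-component diffusion time
print(round(diff_coeff(tau_d), 2))  # ~1.8 µm²/s, the fast-component scale above
```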
among them are sperm-specific histones (ssh), which are characterized by an elongated arginine-rich polypeptide chain compared to the somatic h1. the secondary and tertiary structure of the ssh and their interactions with dna were studied using spectroscopic and thermodynamic approaches. the histones were isolated from the sperm of marine invertebrates and from rat thymus. all studied ssh demonstrate no considerable compaction of dna in solutions of low ionic strength. however, at physiological conditions, ssh h1 from s. intermedius and a. japonica compact dna more intensively than other ssh. the somatic h1 from rat thymus revealed a minimal ability to compact the dna. we suggest that the ssh h1 are able to interact with dna not only in the major groove but also in the minor groove of the double helix, inducing considerable structural changes in dna and facilitating the formation of the supercompact sperm chromatin. the authors are grateful for the financial support from the russian foundation for basic research (grants 10-04-00092 and 09-08-01119) and from the administration of saint-petersburg. ionizing radiation causes modification and destruction of nitrogenous bases in the dna molecule. there are also local breakages of hydrogen bonds, both in the lesion sites mentioned above and in other sites of the macromolecule. to reveal the extent of some of these damages we applied cd and uv absorption spectroscopy. radiation-induced changes in dna structure influence its uv absorption spectrum in different ways: partial denaturation causes a hyperchromic effect, while destruction of the bases results in a hypochromic shift. at the same time both of them result in the same change in the dna cd spectra: a decrease in intensity. we attempted to distinguish the described damages in dna structure and studied the influence of the dna ionic surroundings on the radiation effect. 
it is shown that the radiation efficiency of base destruction and partial denaturation increases with decreasing concentration of nacl in the irradiated solution. udu (ugly duckling) was first identified from a zebrafish mutant and shown to play an essential role during erythroid development; however, its roles in other cellular processes remain largely unexplored. facs analysis showed that the loss of udu function resulted in defective cell cycle progression, and the comet assay indicated the presence of increased dna damage in udu mutants. we further showed that the extensive p53-dependent apoptosis found in udu mutants is a consequence of activation of the atm-chk2 pathway. udu appears not to be required for dna repair, because both wild-type and udu embryos similarly respond to and recover from uv treatment. yeast two-hybrid and coimmunoprecipitation data demonstrated that the pah-l repeats and sant-l domain of udu interact with mcm3 and mcm4. furthermore, udu was colocalized with brdu and heterochromatin during dna replication, suggesting a role in maintaining genome integrity. recently, we started to work on the second zebrafish homolog, udu2, and its mammalian counterpart, gon4l. preliminary data showed that udu2 and gon4l mrna injection can rescue zebrafish udu mutant phenotypes. furthermore, the pah-l and sant-l domains of udu2 and gon4l can bind to mcm3 and mcm4, and they are localized in the nucleus. these data suggest that udu2 and gon4l are functionally equivalent to zebrafish udu. the molecular mechanism leading to udu phenotypes is currently under investigation. chromatin condensation: general polyelectrolyte association and histone-tail specific folding nikolay korolev, nikolay berezhnoy, abdollah allahverdi, renliang yang, chuan-fa liu, james p. tam, lars nordenskiöld school of biological sciences, nanyang technological university, 60 nanyang drive, 637551, singapore the major component of chromatin, dna, is a densely charged polyanion. 
electrostatic interactions between dna and dna-packaging proteins contribute decisively to the formation of its elementary unit, the nucleosome, and are also important in chromatin folding into higher-order structures. we investigate condensation of dna and chromatin and find that the electrostatics and polyelectrolyte character of dna play a dominant role in both dna and chromatin condensation. from comprehensive experimental studies using novel oligocationic ligands, we suggest a simple universal equation describing dna condensation as a function of oligocation, dna and monovalent salt concentrations and including the ligand-dna binding constant. we found that a similar dependence was also observed in the condensation of nucleosome arrays. next, we studied how general electrostatic and specific structural alterations caused by lysine acetylations in the histone tails influence the formation of the 30-nm chromatin fibre and intermolecular nucleosome array association. for the first time, a structural model is suggested which explains the critical dependence of chromatin fibre folding on acetylation of the single lysine at position 16 of histone h4. the exceptional importance of h4 lys16 acetylation in general and gene-specific transcriptional activation has been known for many years, but no structural basis for this effect had yet been proposed. detection of specific dna sequences is central to modern molecular diagnostics. ultrasensitive raman measurements of nucleic acids are possible through exploiting the effect of surface-enhanced raman scattering (sers). in this work, the sers spectra of genomic dnas from leaves of different apple trees grown in the field have been analyzed [1]. a detailed comparative analysis of sers signatures of genomic dnas is given. 
sers wavenumbers (cm -1 ) are reported here for all types of vibrations of plant genomic dnas, including bands assigned to localized vibrations of the purine and pyrimidine residues, localized vibrations of the deoxyribose-phosphate moiety, etc. proposed band assignments are given. a strong dependence of the sers spectra on dna concentration and on time has been observed. in biochemical fields, nucleic acids might be used to explore the interaction between dna and small molecules, which is important in connection with probing the accurate local structure of dna and with understanding natural dna-mediated biological mechanisms [2]. the ph-dependent structure of dna studied by fourier transform infrared spectroscopy. the region of the infrared spectrum studied covered the wavenumber range from 4000 cm -1 to 700 cm -1 . ir spectra show that in the ph 8.0-9.0 interval the carbonyl (c=o) band at 1689 cm -1 (assigned to guanine) is reduced in intensity and slightly shifted to lower frequencies. at ph 10.0 the band at 1712 cm -1 , due to unbound c2=o of thymine, significantly decreases in intensity and shifts to lower frequencies, indicating the transition of this group to a bound form, supposedly by means of excess polarized hydroxyl ions. together with this, in the basic region a new intense absorption band has been observed in the 1500-1300 cm -1 frequency interval, corresponding to the o-h group in-plane bending vibration (1410-1310 cm -1 ). as for acidic conditions, it was observed that at an extreme ph value (~2.8) the carbonyl absorption region shifts to higher frequencies and the absorption intensity significantly increases, indicating the release of c=o groups from h-bonding between base pairs. moreover, band intensities at 1418 cm -1 and 1022 cm -1 , corresponding to out-of-plane deformation of nh 2 groups, increased due to rupture of connections between the dna strands. 
during the last decade it was found that in many cases the specific structural organization of multi-molecular protein and dna-protein complexes determines their functioning in living cells. although these functioning structures are usually unique, it is often possible to identify their common structural elements. one of the interesting examples of such universal elements is the hmgb domain: a structurally conserved functional domain of the non-histone proteins hmgb1/2, also identified in many nuclear proteins. using afm, thermodynamic approaches, circular dichroism and molecular absorption spectroscopy in the far-uv and mid-ir regions, we have studied the structural organization of the complexes between dna and different proteins, including hmgb1, hmgb-domain recombinant proteins and linker histone h1. we have demonstrated that interaction with dna leads to an increase in both the a-helicity of the proteins and the thermal stability of dna. also, this interaction may result in the formation of highly ordered supramolecular complexes facilitated by hmgb domains. the c-terminal sequence of hmgb1/2 regulates the affinity of the proteins to dna and can be ''inactivated'' by interaction with histone h1. based on the data obtained, a model of the interaction of multi-domain hmgb proteins with dna is suggested. darmstadt, germany, 4 lmu biozentrum; munich, germany *these authors contributed equally to this work chromatin in living cells displays considerable mobility on a local scale. this movement is consistent with a constrained diffusion model, in that individual loci execute multiple, random jumps. to investigate the connection between local chromatin diffusion (lcd) and changes in nuclear organization, we established a stable hela cell line expressing gfp-pcna. this protein, a core component of the replication machinery, serves as a cell-cycle marker and allows us to visualize sites of ongoing dna synthesis within the nucleus. 
to monitor lcd, we labeled discrete genomic loci through incorporation of cy3-dutp. this experimental system, in conjunction with particle tracking analysis, has enabled us to quantitatively measure chromatin mobility throughout the cell cycle. our results demonstrate that lcd is significantly decreased in s-phase. to explore the connection between dna replication and reduced chromatin movement, we undertook a more detailed examination of lcd in s-phase nuclei, correlating chromatin mobility with sites of replication. our results demonstrate that labeled chromatin in close proximity to gfp-pcna foci exhibits significantly decreased mobility. we therefore conclude that the presence of active replication forks constrains the movement of adjacent chromatin. single-molecule studies of dna replication antoine m. van oijen zernike institute for advanced materials, groningen university, nijenborgh 4, 9747 ag groningen, the netherlands e-mail: a.m.van.oijen@rug.nl advances in optical imaging and molecular manipulation techniques have made it possible to observe individual enzymes and record molecular movies that provide new insight into their dynamics and reaction mechanisms. in a biological context, most of these enzymes function in concert with other enzymes in multi-protein complexes, so an important future direction will be the utilization of single-molecule techniques to unravel the orchestration of large macromolecular assemblies. we are applying a single-molecule approach to study dna replication. i will present recent results of single-molecule studies of replication in bacterial and eukaryotic systems. by combining the stretching of individual dna molecules with the fluorescence observation of individual proteins, we visualize the dynamic interaction of replication factors with the fork. in the bacteriophage t7 replication system, we show that dna polymerases dynamically associate with and dissociate from the fork during replication. 
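particle tracking analyses of the kind used above for chromatin mobility usually reduce each trajectory to a time-averaged mean squared displacement (msd), whose lag dependence distinguishes free from constrained diffusion. a minimal 2d sketch, assuming a uniform frame interval:

```python
def msd(track, max_lag=None):
    """Time-averaged mean squared displacement of a single 2D track.
    track: list of (x, y) positions sampled at uniform time steps."""
    n = len(track)
    max_lag = max_lag or n - 1
    out = []
    for lag in range(1, max_lag + 1):
        d = [(track[i + lag][0] - track[i][0]) ** 2 +
             (track[i + lag][1] - track[i][1]) ** 2
             for i in range(n - lag)]
        out.append(sum(d) / len(d))
    return out

# free 2D diffusion gives MSD(lag) = 4*D*lag*dt, growing linearly with lag;
# a confined locus instead plateaus near the squared confinement radius.
```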
further, i will present new data from single-molecule replication studies in x. laevis oocyte extracts. we have developed a novel imaging scheme that permits single-molecule fluorescence experiments at concentrations of labeled protein that were hitherto inaccessible. using this method, we visualize, in real time, origin firing and fork movement. force-extension diagrams of reference models of naked dna (freely jointed chain, wormlike chain) as well as extension-rotation diagrams of naked dna have been successfully recovered. of note, plectonemic structures are most efficiently simulated thanks to ode's collision detection code. new insights into nucleosome and chromatin fiber structure and dynamics will be presented. the study of the pkm101 plasmid effect on the repair of dna j. vincze 1, i. francia 2, g. vincze-tiszay 1, 1 hheif, budapest, hungary, 2 univ. debrecen, debrecen, hungary in our experiments we studied the effect of the pkm101 plasmid on the repair of single strand breaks in dna induced by 60co gamma irradiation in e. coli k12 ab1157 (wild type) and its different rec mutant cells. in the control, the pkm101 resistance factor decreases radiation sensitivity, which, we suppose, is achieved with the help of a dna conformational change. from the well-known effects of radiation biology it can be supposed that, through the effect of pkm101, the fraction of radiation-sensitive dna volumes decreases as the new conformation appears. the pkm101 r-factor in rec mutants shows effects in two ways in the case of gamma irradiation. one is the ''chemical'' connection between the r-factor and dna, while the other relates to positive and negative ''induced'' radiation resistance from the local-type effect of the connection of an r-factor and a rec mutant, and the two radiation-resistance effects add algebraically. 
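the freely jointed and wormlike chain references mentioned above have closed-form force-extension relations; for the wormlike chain the standard marko-siggia interpolation formula is commonly used. a sketch, with kT expressed in pN·nm at room temperature and dna's canonical ~50 nm persistence length used only as an example value:

```python
def wlc_force(x_rel, lp_nm, kT_pN_nm=4.11):
    """Marko-Siggia interpolation for the wormlike chain:
    F = (kT/lp) * [ 1/(4(1 - x/L)^2) - 1/4 + x/L ],
    with x_rel = x/L the relative extension and lp the persistence length (nm).
    Returns the stretching force in pN."""
    return (kT_pN_nm / lp_nm) * (0.25 / (1.0 - x_rel) ** 2 - 0.25 + x_rel)

# at half extension, dsDNA (lp ~ 50 nm) is pulled by only ~0.1 pN;
# the force diverges as the extension approaches the contour length.
print(round(wlc_force(0.5, 50.0), 3))
```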
as a result, from a biological point of view, we have to categorize radiation resistance and the connected repair processes into two different classes, according to whether the change is in the chemical or in the induced radiation-resistance effect. recent studies have indicated that two trimethylated peptides (k6, k7), derived from the parental hybrid peptide ca(1-7)m(2-9), strongly interact with a bacterial membrane model (a mixture of zwitterionic and negatively charged lipids), but not with a membrane model of mammalian erythrocytes (zwitterionic lipids) [1]. a reduction of the cytotoxic effect and an improvement of the therapeutic index have also been reported for the derivatives when compared with the parental ca(1-7)m(2-9) [2]. in this work, with the aim of providing insight into the interaction phenomena of the indicated peptides with zwitterionic and negatively charged membrane models, a systematic molecular dynamics study was carried out. fully hydrated bilayers of dmpc:dmpg (3:1) and pope:popg (3:1) were studied in the presence of each peptide, and the results analyzed in terms of peptide structure and membrane composition. lipid-water and lipid-lipid interactions at the membrane/water interface play an important role in maintaining the bilayer structure; however, this region is not easily accessible to experimental studies. we performed molecular dynamics simulations of two bilayers composed of two different types of lipids: (1) dioleoylphosphatidylcholine (dopc); (2) the galactolipid monogalactosyldiacylglycerol (mgdg). to investigate the properties of the membrane/water interface region, we performed an analysis of lipid-lipid interactions: direct, via charge pairs (dopc) and hydrogen bonds (mgdg), as well as indirect, via water bridges. we also examined water-lipid interactions. 
the existence of well-defined entities (lipids) linked by different types of interactions (hydrogen bonds, charge pairs, water bridges) makes the membrane/water interface region suitable for a graph-theoretical description. we applied a network analysis approach for a comparative analysis of the simulated systems. we note a marked difference between the organization as well as the dynamics of the interfacial regions of the two bilayers. l-nucleoside analogues form an important class of antiviral and anticancer drug candidates. to be pharmacologically active, they need to be phosphorylated in multiple steps by cellular kinases. human phosphoglycerate kinase (hpgk) was shown to exhibit low specificity for nucleotide diphosphate analogues, and its catalytic efficiency in phosphorylation was also affected. to elucidate the effect of ligand chirality on dynamics and catalytic efficiency, molecular dynamics simulations were performed on four different nucleotides (d-/l-adp and d-/l-cdp) in complex with hpgk and 1,3-bisphospho-d-glycerate (bpg). the simulation results confirm high affinity for the natural substrate (d-adp), while l-adp shows only moderate affinity for hpgk. the observed short residence time of both cdp enantiomers at the active site suggests very weak binding affinity, which may explain the poor catalytic efficiency shown for hpgk with d-/l-cdp. analysis of the simulations unravels important dynamic conditions for efficient phosphorylation, replacing the single requirement of tight binding. using the van der waals density functional based on the semilocal exchange functional pw86 together with a long-range component of the correlation energy [1], implemented in the siesta program code, we have calculated the band structure of double stranded dna. the unit cell was built by taking together 10 gc (or at) homogeneous base pairs, and we have considered translational symmetry as the periodic boundary condition. 
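a graph-theoretical description like the one above can start from nothing more than the list of detected lipid-lipid links in a snapshot; connected components of that graph then give the interfacial clusters. a minimal sketch with a toy edge list (the lipid indices are invented, not from the simulations):

```python
from collections import defaultdict

def components(edges):
    """Connected components of an undirected interaction graph.
    edges: iterable of (lipid_i, lipid_j) pairs, one per detected
    hydrogen bond, charge pair, or water bridge in a snapshot."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, comps = set(), []
    for node in adj:
        if node in seen:
            continue
        stack, comp = [node], set()
        while stack:  # iterative depth-first search
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        comps.append(comp)
    return comps

# toy snapshot: lipids 0-3 form one bonded cluster, lipids 4-5 another
print([sorted(c) for c in components([(0, 1), (1, 2), (2, 3), (4, 5)])])
```

tracking the component-size distribution frame by frame is one simple way to compare the organization and dynamics of the two bilayers.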
the results obtained are compared with oligomer calculations taking up to seven base pairs. the band structure obtained with this van der waals density functional is also compared with results obtained with other exchange-correlation functionals, as well as with the band structure obtained by the hartree-fock crystal-orbital method taking into account the helical symmetry of double stranded dna. the role of different parts of dna (base pairs, sugar-phosphate backbone, na ions) is also presented. transmembrane (tm) proteins comprise some 15% to 30% of the proteome but, owing to technical difficulties, relatively few of these structures have been determined experimentally. computational modeling techniques can be used to provide the essential structural data needed to shed light on structure-function relationships in tm proteins. low-resolution electron-density maps, obtained from cryo-electron microscopy (cryo-em) or preliminary x-ray diffraction studies, can be used to restrict the search in conformational space. at the right resolution, the locations of tm helices can be roughly determined even when the amino acids are not visible. when these data are combined with physicochemical characteristics of amino acids (such as their hydrophobicity) and with evolutionary conservation analysis of the protein family, the location of the amino acids can be modeled. the model structure may provide molecular interpretations of the effects of mutations. moreover, it can be used to suggest molecular mechanisms and to design new mutations to examine them. the overall approach will be demonstrated using two human proteins: copper transporter 1 (ctr1), which is the main copper transporter in the human cell, and the 18 kda translocator protein (tspo) of the outer mitochondrial membrane. model structures of these proteins and their functional implications in health and disease will be discussed. 
calcium channels play a crucial role in many physiological functions and their selectivity mechanism is still an unresolved question and a subject of debate. a physical model of selective ''ion binding'' in the l-type calcium channel is constructed, and the consequences of the model are compared with experimental data. this reduced model treats only the ions and the carboxylate oxygens of the eeee locus explicitly and restricts interactions to hard-core repulsion and ion-ion and ion-dielectric electrostatic forces. according to the charge/space competition mechanism, the charge of the structural ions attracts cations into the filter, while excluded-volume effects tend to keep them out. this is a competition between energy and entropy, where the balance of these terms minimizes the free energy and determines selectivity. experimental conditions involving binary mixtures of alkali and/or alkaline earth metal ions are computed. the model pore rejects alkali metal ions in the presence of biological concentrations of ca2+ and predicts the blockade of alkali metal ion currents by micromolar ca2+. conductance patterns observed in varied mixtures containing na+ and li+, or ba2+ and ca2+, are predicted. ca2+ is substantially more potent in blocking na+ current than ba2+. in apparent contrast to experiments using buffered ca2+ solutions, the predicted potency of ca2+ in blocking alkali metal ion currents depends on the species and concentration of the alkali metal ion, as is expected if these ions compete with ca2+ for the pore. these experiments depend on the problematic estimation of ca2+ activity in solutions buffered for ca2+ and ph in a varying background of bulk salt. equilibrium binding affinity (expressed as the occupancy of the selectivity filter by various ions) is computed by equilibrium grand canonical monte carlo (gcmc) simulations. the conductivity of the channel is estimated from the equilibrium concentration profiles using the integrated nernst-planck equation. 
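in the linear-response limit, the integrated nernst-planck estimate mentioned above reduces the channel to a 1d resistance proportional to ∫ dz / (d(z) c(z) a(z)) along the pore axis. a minimal numerical sketch using the trapezoidal rule; the profiles passed in would come from the gcmc equilibrium simulations, and the toy values in the test are illustrative only:

```python
def np_resistance(z, c, d, a):
    """Relative channel resistance from the 1D integrated Nernst-Planck
    equation (linear-response limit): R ∝ ∫ dz / (D(z) c(z) A(z)).
    z: axial grid points; c: equilibrium concentration profile;
    d: local diffusion coefficient; a: local cross-sectional area.
    Only relative values matter here, so units cancel in comparisons."""
    integrand = [1.0 / (di * ci * ai) for ci, di, ai in zip(c, d, a)]
    # trapezoidal rule over a possibly non-uniform axial grid
    r = 0.0
    for i in range(1, len(z)):
        r += 0.5 * (integrand[i] + integrand[i - 1]) * (z[i] - z[i - 1])
    return r
```

because the integrand weights regions of low concentration most heavily, an ion depleted from the selectivity filter automatically carries little current, which is how the equilibrium profiles translate into the conductance patterns discussed in the abstract.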
our simulations show that the selectivity of l-type calcium channels can arise from an interplay of electrostatic and hard-core repulsion forces among ions and a few crucial channel atoms. the reduced system selects for the cation that delivers the largest charge in the smallest ion volume. we have also performed dynamic monte carlo (dmc) simulations for a model ca channel to simulate current directly, and we present our results for the dynamical selectivity (expressed as the flux carried by various ions). we show that the binding affinity of ca2+ versus na+ is always larger than the dynamical selectivity, because ca2+ ions are tightly bound to the binding site of the selectivity filter of the channel and, at the same time, their mobility and drift velocity are smaller in this region. carotenoids are used in light-harvesting complexes with the twofold aim of extending the spectral range of the antenna and avoiding radiation damage. the effect of the polarity and conformation of the environment is supposed to be responsible for the tuning of the electronic, optical and vibrational properties of the peridinin carotenoid both in solution and in the protein matrix. we investigate the vibrational properties of peridinin in different solvents by means of vibrational spectroscopies and qm/mm molecular dynamics simulations [1]. the shift of vibrational fingerprints in the 1500-2000 cm -1 frequency region, due to solvent polarity and proticity, is studied in three cases: cyclohexane (apolar/aprotic), deuterated acetonitrile (polar/aprotic) and methanol (polar/protic). the frequencies and vibrational modes of the carbonyl, the allene, and the polyene chain were identified using effective normal mode analysis [2] and compared with the present and previous experimental data [3]. on the basis of our calculations and experiments in different solvents, we propose a classification of the four peridinins of the high-salt pcp form. 
the controlled self-assembly of functional molecular species on well-defined surfaces is a promising approach toward the design of nanoscale architectures. by using this methodology, regular low-dimensional systems such as supramolecular clusters, chains, or nanoporous arrays can be fabricated. small biological molecules such as amino acids represent an important class of building blocks that are of interest for molecular architectonics on surfaces because they inherently qualify for molecular recognition and self-assembly [1]. the interaction between amino acids and solid surfaces is decisive for the development of bioanalytical devices or biocompatible materials, as well as for a fundamental understanding of protein-surface bonding. we investigate the adsorption mechanism of cysteine on au(111) surfaces by means of dft [2]. our main concern is to describe the molecule-metal bonding mechanism. therefore we present a complex study, including the full determination of the density of states for the free and adsorbed molecule and the determination of the molecule-surface bonding energy. the method of crystal orbital overlap populations is used in order to determine the contribution of specific atomic orbitals to the molecule-metal bond. it is now widely accepted that myoglobin (mb) is not simply an o 2 storage/delivery system but, depending on oxygen concentration, it exerts other fundamental physiological roles. recent studies revealed a widespread expression and, in particular, an over-expression in response to hypoxia, in various non-muscle tissues, including tumor cells. in humans five different mb isoforms are present. the two most expressed (>90%) differ only at the 54th position, k54 (mb-i) and e54 (mb-ii) respectively. 
since high-altitude natives from tibet are characterized by a higher mb concentration and locomotion efficiency, together with the observation that the mb overexpression is totally attributable to mb-ii, the idea that the latter might be one of the responses to high-altitude evolutionary adaptation, i.e. a hypoxic environment, has started to emerge. however, this is not yet supported by any structure/function investigation. we performed hundred-nanosecond md simulations on human mbs to investigate the structure and dynamics of both the protein and surface water, and important differences have been observed. protein kinases play key roles in cell signaling and constitute crucial therapeutic targets. in normal cells, upon ligand binding, the tyrosine kinase receptor kit undergoes extensive structural rearrangement leading to receptor dimerization and activation. this process is initiated by the departure of the juxtamembrane region (jmr) from the active site, allowing activation loop (a-loop) deployment. the deregulation of kit activity is associated with various forms of cancer provoked by abnormalities in signal transduction pathways. mutations v560g (jmr) and d816h/v (a-loop) have been reported as oncogenic and/or drug-resistant. to contribute further to the understanding of kit activation/deactivation mechanisms, we applied a multi-approach protocol combining molecular dynamics (md), normal mode analysis (nma) and pocket detection. disturbing structural effects, both local (a-loop) and long-range (jmr), were evidenced for kit d816h/v in the inactive state. nma showed that the jmr is able to depart from its position more easily in the mutants than in the wild type. pocket analysis revealed that this detachment is sufficient to open an access to the atp binding site. our results provide a plausible conception of mutant dimerization and a way to explore putative allosteric binding sites. 
transmembrane association of leukocyte integrin heterodimer might be mediated by a polar interaction choon-peng chng 1 and suet-mien tan 2 1 biophysics group, a*star institute of high performance computing, and 2 school of biological sciences, nanyang technological university, republic of singapore. the lateral association of transmembrane (tm) helices is important to the folding of membrane proteins as well as a means for signaling across the cell membrane. for integrin, a hetero-dimeric protein important for cell adhesion and migration, the association of its α- and β-subunits' tm helices plays a key role in mediating bi-directional mechanical signaling across the membrane. we found evidence from experiment and simulation for a polar interaction (hydrogen bond) across the leukocyte integrin αlβ2 tm that is absent in the better-studied platelet integrin αiibβ3 [1]. our coarse-grained molecular dynamics simulations of tm helix-helix self-assembly showed more native-like packing achieved by αlβ2 within the simulation timescale as compared to its 'loss-of-function' β2 t686g mutant or αiibβ3 [2]. association free energy profiles also showed a deeper minimum at a smaller helix-helix separation for αlβ2, suggestive of tighter packing. the likely conservation of this polar interaction across the β2 integrin family further reinforces its importance to the proper functioning of leukocyte integrins. active extrusion of drugs through efflux pumps constitutes one of the main mechanisms of multidrug resistance in cells. in recent years, large efforts have been devoted to the biochemical and structural characterization of rnd efflux pumps in gram-negative bacteria, in particular the acrab-tolc system of e. coli. specific attention has been addressed to the active part of the efflux system, constituted by the acrb unit. despite the availability of several data, crucial questions concerning its functioning are still open. 
the understanding of the structure-dynamics-function relationship of mexb, the analogous transporter in p. aeruginosa, encounters even more difficulties because of the lack of structural data on the transporter in complex with substrates. to shed some light on the activity of mexb, we performed computational studies on mexb interacting with two compounds, meropenem and imipenem, the first known to be a good substrate and the second a modest one. several techniques were used in the present work, ranging from flexible docking [1] to standard and targeted molecular dynamics (md) simulations. starting from the published crystal structure [2] we identified the most probable poses of the two compounds in both the original experimental and the md-equilibrated structures. we used information from the acrb binding pocket in order to find relevant binding sites of the two compounds in the analogous binding pocket of mexb. meropenem frequently lies with appropriate orientation in a pocket similar to the one identified for doxorubicin in acrb [3], while the occurrence of imipenem poses in the same pocket is very scarce. additionally, when present in the pocket, imipenem is located in a position that renders its extrusion toward the oprm docking domain very unlikely during the simulation of the functional peristalsis. the analysis of the trajectories has provided a complete inventory of the transporter and antibiotic hot spots, which is key information in terms of screening and design of antibiotics and inhibitors. clathrins are three-legged proteins with the intriguing ability to self-assemble into a wide variety of polyhedral cages. the nucleation and growth of a clathrin lattice against the cytosolic face of a cell membrane enables the endocytosis of membrane proteins and various external molecules, by wrapping the membrane around the cargo to produce a coated transport vesicle within the cell. 
clathrins can also self-assemble, in slightly acidic solutions devoid of auxiliary proteins, into empty cages. our simulations of this process, using a highly coarse-grained model, indicate that the key to self-assembly is neither clathrin's characteristic puckered triskelion shape, nor the alignment of four legs along all cage edges, but an asymmetric distribution of interaction sites around the leg's circumference. based on the critical assembly concentration, the binding strength in these cages is estimated at 25 to 40 k_bt per clathrin. the simulations also answer the long-standing conundrum of how flat patches of purely hexagonal clathrin lattices, which in cryo-electron microscopy are frequently seen to decorate cell membranes, can produce highly curved cages containing twelve pentagonal faces interspersed between hexagonal faces. we present experimental evidence supporting this pathway. in eukaryotic cells, the exchange of macromolecules between the cytoplasm and the nucleus is mediated by specialized transport factors. by binding to these transporters, cargo molecules, which are otherwise excluded from entering the nucleus, can traverse the nuclear pore complex efficiently. most of the proteins mediating nuclear import and export exhibit a characteristic α-solenoid fold, which provides them with an unusual intrinsic flexibility. crm1 is an essential nuclear export receptor, which recognizes a very broad range of export cargoes. crm1-dependent nuclear export is ran-gtpase-driven, and recognition of rangtp and cargo is highly cooperative. however, recent crystal structures show that the binding sites for export cargos and rangtp are located at distant parts of crm1 [1-3]. we have used a combined approach of all-atom molecular dynamics simulations and small-angle x-ray scattering to study rangtp and cargo binding to crm1. 
we have found that the allosteric effect in crm1-dependent nuclear export arises from a combination of subtle structural rearrangements and changes in the dynamic properties of crm1. light-induced phototactic responses in the green alga chlamydomonas reinhardtii are mediated via microbial-type rhodopsins, termed channelrhodopsin-1 (chr1) and channelrhodopsin-2 (chr2) [1], which carry the chromophore retinal covalently linked to lysine via a schiff base and were shown to be directly light-gated ion channels [2]. the n-terminal putative seven-transmembrane region of chr2 was shown to be responsible for the generation of photocurrents and exhibits sequence similarity to the well understood proton pump bacteriorhodopsin (br) and the sensory rhodopsin anabaena sensory rhodopsin (asr) [3]. as for the majority of membrane proteins, there is no 3d-structural data for chr2 available yet. here we present homology models of chr2 using two high-resolution x-ray template structures of br (1qhj) [4] and asr (1xio) [5], respectively, in order to get structural and functional insights into chr2. with both homology models we performed molecular dynamics (md) simulations in a native membrane/solvent environment using gromacs 4.0.3 [6]. comparison of energetic and structural results revealed obvious advantages of the br-based homology model of chr2. here we show that the br-based homology model is a reliable model of chr2, exhibiting structural features already found experimentally [7]. our br-based homology model of chr2 allows predictions of putative crucial residues within chr2. so we proposed several mutations within the chr2 sequence, which have already been accomplished. electrophysiologic and spectroscopic studies of these mutations are underway in order to confirm the functional relevance of these residues and to contribute to an optimized usage of chr2 as a powerful tool in optogenetics. neuroglobin is a recently discovered globin protein predominantly expressed in brain. 
its biological function is still elusive. despite the fact that neuroglobin shares very little sequence homology with the well-known globins such as myoglobin and hemoglobin, they all have a characteristic globin fold with a heme molecule bound in the distal pocket. the structural investigations and co binding kinetics revealed the existence of cavities and tunnels within the protein matrix, where small ligands can be stored even for hundreds of microseconds [1]. in human neuroglobin one internal disulfide bond is possible, whose existence is proved to have a significant effect on ligand affinity [2]. in this study the effects of temperature, ph, distal histidine mutation and the presence of the disulfide bond on co rebinding to neuroglobin are investigated by flash photolysis experiments. in parallel, molecular dynamics simulations are performed in corresponding conditions in order to investigate structural changes of neuroglobin and especially changes in the distribution of internal tunnels and cavities able to bind diatomic ligands in response to the different physical conditions listed above. the thrombospondin family, being extracellular proteins, is known to be implicated in various physiological processes such as wound healing, inflammation, angiogenesis and neoplasia. the signature domain of thrombospondins shows high sequence identities and thus allows us to transfer results obtained by studying this complex calcium-rich part of the proteins from one member of the family to the other. the domain is known to play a key role in hereditary diseases such as psach or med. in this part of thrombospondins lies a binding site for integrins, important for cell attachment. it is further known that the lectin-like globe binds to cd-47, a feature known to be important in cancer research. 
as the theoretical unit we are trying to resolve these problems by numerical means and are constantly challenged by the size (thrombospondin can be a huge trimeric protein, as one strand can measure 430 kda) and the large variety of subdomains found in these proteins. we are thus facing a multiscale problem which can range from solving a specific ion binding site by means of quantum mechanics to large-scale abstraction by continuum mechanics. in our talk we will show you our newest results that we obtained by simulating the calcium-rich c-terminal domain, which is known to be conserved across the entire family, and give you an outlook into the future of our research. the process of swift heavy ion energy deposition while penetrating a solid or scattering on its surface can result in a strong and nonequilibrium excitation of matter. the extremely localized character of this excitation, meanwhile, can make possible both selective changes in the chemistry of matter [1] and its surface nanomodifications [2]. since possible applications have been found in bio- and it-technologies (cancer curing and nanostructuring respectively) in the last decade, the heavy ion bombardment technique has attracted a lot of scientific interest [3, 4]. the processes of fast energy deposition into the solid and its further dissipation, however, are essentially perturbed by the highly excited and nonequilibrium state of both the lattice and electron systems. at such conditions, therefore, the precision in the treatment of the processes of electron thermalization, fast electron heat conduction, and phase transformation of the overheated solid becomes crucial. having several physical models to handle the mentioned processes, it is nevertheless difficult to describe all of them within the scale of a single computational approach. our work is aimed at the elaboration of an atomistic-continuum model [5] of heavy ion bombardment of solids. 
in particular, the model will be applied to study the formation of nanohillocks in the experiments on swift heavy ion xe+ scattering on the srtio3 surface [2]. aquaporins are protein channels located across the cell membrane with the role of conducting water or other small sugar alcohol molecules (aquaglyceroporins). the presence of the human aquaporin 5 (hsaqp5) in cells proximal to air-interacting surfaces (eyes, lacrimal glands, salivary glands, lungs, stomach etc.) suggests its potentially important role in ''wetting'' these surfaces. the high-resolution x-ray structure of the hsaqp5 tetramer (pdb code 3d9s) exhibits two important features: (i) lack of the four-fold symmetry common in most of the aquaporins, and (ii) occlusion of the central pore by a phosphatidylserine lipid tail. in this study we investigate the importance of these two features on the transport properties of the human aqp5 by means of molecular dynamics simulations. we found that the asymmetry in the tetramer leads to a distribution of monomeric channel structures characterized by different free energy landscapes felt by the water molecules passing through the channel. furthermore, the structures' distribution is influenced both by the presence/absence of the lipid tail in the central pore, and by the lipid composition of the bilayer that solvates the hsaqp5 tetramer. elucidating the modular structure of the protein g c2 fragment and human igg fc domain binding site using computer simulations hiqmet kamberaj, faculty of technical sciences, international balkan university, skopje, r. of macedonia. protein-protein recognition plays an important role in most biological processes. although the structures of many protein-protein complexes have been solved in molecular detail, general rules describing affinity and selectivity of protein-protein interactions break down when applied to a larger set of protein-protein complexes with extremely diverse nature of the interfaces. 
in this work, we will analyze the non-linear clustering of the residues at the interface between proteins. the boundaries between clusters are defined by clustering the mutual information of the protein-protein interface. we will show that mutations in one module do not affect residues located in a neighboring module by studying the structural and energetic consequences of the mutation. on the contrary, within their own module, we will show that the mutations cause complex energetic and structural consequences. in this study, this is shown for the interaction between the protein g c2 fragment and the human igg fc domain by combining molecular dynamics simulations, mutual information theory, and the computational alanine scanning technique. the modular architecture of binding sites, which resembles human engineering design, greatly simplifies the design of new protein interactions and provides a feasible view of how these interactions evolved. the results test our understanding of the dominant contributions to the free energy of protein-protein interactions, can guide experiments aimed at the design of protein interaction inhibitors, and provide a stepping-stone to important applications such as interface redesign. membrane proteins can form large multimeric assemblies in native membranes that are implicated in a wide range of biological processes, from signal transduction to organelle structure. hydrophobic mismatch and membrane curvature are involved in long-range forces largely contributing to such segregation. however, the existing assembly specificity is thought to be coded in the atomic details of protein surface and topology. these are best described in high-resolution structures and atomistic molecular dynamics simulations. in order to explore more systematically such forces and energetics arising at intermediate time scales and resolution, we use coarse-grained molecular dynamics simulations applied to 20 membrane systems spanning 5 to 15 µs. 
as a first glimpse we study how proteins induce lipid perturbations using a previously developed conformational entropy estimator. we show that in the model membrane where hydrophobic mismatch is present, lipid perturbations extend up to ~40 å from the protein surface. however, significant variations in perturbation profiles are seen. parameters such as protein shape, surface topology, and amino acid physicochemical properties are studied to discover the parameters governing such perturbations. crossing energy barriers with self-guided langevin dynamics gerhard könig, xiongwu wu, bernard brooks, national institutes of health, national heart, lung and blood institute, laboratory of computational biology, rockville, md, usa. even with modern computer power, the applicability of molecular dynamics simulations is restricted by their ability to sample the conformational space of biomolecules. often high energy barriers cause normal molecular dynamics simulations to stay trapped in local energy minima, leading to biased results. to address this problem, self-guided langevin dynamics (sgld) was developed. it enhances conformational transitions by accelerating the slow systematic motions in the system. this is achieved by calculating the local average of velocities and adding a guiding force along the direction of systematic motions. thus, the efficiency of sgld is governed by three factors: (a) the friction constant involved in the langevin dynamics, (b) the local averaging time, and (c) the guiding factor that determines the guiding force. however, the guiding force also causes deviations from the original ensemble that have to be corrected by reweighting the data, thus decreasing the efficiency. here, we explore the three-dimensional parameter space of sgld for several benchmark systems with particularly rough energy surfaces. 
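the guiding-force idea described above can be illustrated in one dimension: a langevin integrator in which an exponential moving average of the velocity is fed back as an extra force scaled by a guiding factor. all parameters and the double-well potential below are illustrative assumptions, not the published sgld implementation:

```python
import numpy as np

def sgld_1d(nsteps=20000, dt=0.01, gamma=1.0, kT=1.0,
            t_avg=1.0, guide=0.2, seed=0):
    """Self-guided Langevin dynamics (sketch) on a double well
    U(x) = (x^2 - 1)^2 with unit mass. The guiding force is proportional
    to a running time average of the velocity, which amplifies slow,
    systematic motions and so promotes barrier crossing."""
    rng = np.random.default_rng(seed)
    x, v, v_avg = -1.0, 0.0, 0.0       # start in the left well
    alpha = dt / t_avg                 # EMA weight: local averaging time t_avg
    traj = np.empty(nsteps)
    for i in range(nsteps):
        force = -4.0 * x * (x * x - 1.0)                 # -dU/dx
        noise = np.sqrt(2.0 * gamma * kT / dt) * rng.standard_normal()
        f_guide = guide * gamma * v_avg                  # guiding force
        v += dt * (force - gamma * v + f_guide + noise)  # Euler-Maruyama step
        x += dt * v
        v_avg = (1.0 - alpha) * v_avg + alpha * v        # update local average
        traj[i] = x
    return traj

traj = sgld_1d()
crossings = np.sum(np.diff(np.sign(traj)) != 0)  # barrier-crossing events
```

in the full method the three knobs named in the abstract appear here as `gamma` (friction constant), `t_avg` (local averaging time) and `guide` (guiding factor); the ensemble bias introduced by `f_guide` is what the reweighting step mentioned above must correct.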
based on our data, we supply guidelines for the optimal selection of sgld parameters, to allow the extension of our method to other biological problems of interest. propagation of d816v/h mutation effects across the kit receptor e. laine, i. c. de beauchêne, c. auclair and l. tchertanov, lbpa, cnrs - ens de cachan, france. receptor tyrosine kinases (rtks) regulate critical biological processes. constitutive activation of rtks provokes cancers, inflammatory diseases and neuronal disorders. biological data evidenced that oncogenic mutations of the rtk kit, located either in the juxtamembrane region (jmr) or in the activation loop (a-loop) - as is the case of d816v/h - displace the equilibrium of conformational states toward the active form. we present a multi-approach study that combines molecular dynamics (md), normal modes (nm) and pocket detection to characterize and compare the impact of d816v/h on the structure, dynamics and thermodynamics of kit. we have evidenced a local structural destabilization of the a-loop induced by the mutation and a long-range effect on the structure and position of the jmr. we have further correlated these observations with experimental data and deciphered some details about the activation mechanisms of the mutants, involving leverage of the jmr negative regulation and release of an access to the catalytic site. through the identification of ''local dynamic domains'' and the recording of interactions within the protein, we propose a model of the mutational effects propagation, which highlights the importance of both structural distortion and local conformational fluctuations. investigation of biologically active azolidinones and related heterocycles refers to one of the most successful scientific projects of dh lnmu. it is based on three strategic vectors: organic synthesis, pharmacological research, and rational design of ''drug-like'' molecules (including in silico approaches). 
while applying this research strategy we succeeded in gaining a number of interesting results that make it possible to extend the field, especially in the scope of ''drug-like'' molecule design; notably, it has focused on the search for new anticancer agents. anticancer activity screening was carried out for more than 1000 compounds (us nci dtp protocol) based on an obtained directed library of over 5000 new compounds; among them, 167 compounds showed a high activity level. for the purpose of optimization and rational design of highly active molecules with optimal ''drug-like'' characteristics, and for discovering possible mechanisms of action, sar, comparative analysis, molecular docking and qsa(p)r were carried out. the obtained results allowed us to outline the main directions of possible anticancer activity mechanisms, which probably are apoptosis-related. nowadays the investigation of cellular and molecular aspects of the anticancer effects is in progress. regulation of (bio)chemical reactions by mechanical force has been proposed to be fundamental to cellular functions [1, 2, 3]. atomic force microscopy experiments have identified the effect of mechanical force on the reactivity of thiol/disulfide exchange, a biologically important reaction [4]. in order to understand the influence of the force at an atomistic level, we have performed hybrid quantum mechanics/molecular mechanics (qm/mm) transition path sampling simulations of the reaction under external forces. the results of the simulations supported the experimental findings and demonstrated that the location of the transition state on the free energy surface was shifted due to the force [5]. in contrast to our findings, however, a recent experimental study suggests only a weak coupling between the mechanical force and the reaction rate [6]. in this study, the reactants were covalently linked to a stilbene molecule. in this system a force can be applied by photo-isomerization from the relaxed trans to the strained cis configuration. 
a drawback of this system is that one cannot easily determine the forces acting on the reaction coordinate. therefore, we have developed a force distribution analysis method for quantum mechanical molecular dynamics simulations. the results of the analysis show how the isomerization of stilbene alters the forces acting on the reacting atoms. the force distribution is essential for understanding how chemistry is controlled by external forces. conformational space modelling (csm) is a promising method for membrane protein structure determination. it is based on the concept of the side-chain conformational space (sccs), which is formed by the allowed side-chain conformations of a given residue. each sccs can be calculated from a 3d structure or measured via epr-sdsl experiments. for structure determination a set of singly spin-labelled mutants is needed. the final structure is obtained by altering an initial (possibly random) 3d structure until the best fit between the calculated and measured sccs for the whole set is found. such optimization is computationally intensive; therefore csm includes several empirical approximations. one of them describes the effect of the lipid tails on the sccs. the implementation is not trivial, as lipids diffuse in the plane of the membrane and the lipid tails behave differently at different membrane depths. to unravel this relationship, adaptive biasing force md simulations were used. an alanine peptide helix was made in silico, spin-labelled at the middle and inserted perpendicularly into a dmpc membrane. the free energy of the spin-label orientation at various membrane depths was calculated. a 3d free energy surface describing the membrane ''depth'' effect was obtained. it is known that β-cyclodextrins (βcds) are able to modify the cholesterol content of cell or model membranes. however, the molecular mechanism of this process is still not resolved. 
using molecular dynamics simulations, we have been able to study the βcd-mediated cholesterol extraction from cholesterol monolayers and lipid-cholesterol monolayer mixtures. we have investigated many conditions that would affect this process (e.g. lipid-cholesterol ratio, lipid chain unsaturation level). our results can be summarized as follows: i) dimerization of βcds, ii) binding of the dimers to the membrane surface assuming either a tilted (parallel to the membrane normal) or untilted (90° with respect to the normal of the membrane) configuration, iii) the latter configuration is suitable to extract cholesterol at a reasonable computational time scale (100-200 ns); however, this process may be affected by unfavorable energy barriers (from 10 to 80 kj/mol), iv) desorption of the complex brings cholesterol into solution, v) the βcd-cholesterol complex is able to transfer cholesterol to other membranes. with a clearer understanding of the basic molecular mechanism of the βcd-mediated process of cholesterol extraction, we can begin to rationalize the design of more efficient βcds for numerous applications. the mechanism of the complex formation of biopolymers with ligands, including the solvent molecules, is a topical problem of modern biophysical and biological science. polypeptides form a secondary structure and mimic the motifs of the protein architecture. the study of complexation between polypeptides and solvent molecules leads to a deeper understanding of the basic interaction of proteins with the environment at the atomic level. besides, polypeptides are promising for the development of applications which encompass some of the following desirable features: anti-fouling, biocompatibility, biodegradability, and specific biomolecular sensitivity. on this account polypeptides have a great significance for a variety of modern applications ranging from nanoscale medical devices up to food technology and others. 
we compare the results of calculations of complexes between helical polypeptides (polyglutamic acid in neutral form and poly-γ-benzyl-l-glutamate) and water molecules at the dft pbe level with the results of an ftir-spectroscopic study of films of wetted polypeptides. vibrational spectroscopy is one of the most useful experimental tools to study non-covalently bonded complexes, and calculated spectra in comparison with experimental data are a reliable test for the reality of the simulated complexes. platelet aggregation at the site of vascular injury is vital to prevent bleeding. excessive platelet function, however, may lead to thrombus formation after surgery. therefore, an accurate measure and control of platelet aggregation is important. in vitro platelet aggregometry monitors aggregate formation in platelet-rich plasma triggered by agonists such as adp, epinephrine or collagen. the fraction of aggregated platelets is plotted versus time, and platelet function is assessed by analyzing the plot's morphology. we propose new measures of platelet function based on a compartmental kinetic model of platelet aggregation induced by adp. our model includes three compartments: single, aggregated and deaggregated platelets. it is simpler than earlier models and agrees with experimental data. the model parameters were determined by non-linear least squares fitting of published data. we associated the kinetic parameters with the activity of the adp receptors p2y1 and p2y12. to this end, we studied published data obtained in the presence and in the absence of specific antagonists of these receptors. comparison of kinetic parameters of healthy subjects with those of patients with myeloproliferative disorder (mpd) shows that the function of p2y12 is significantly reduced in mpd. coarse-grained modeling of drug-membrane interactions manuel n. melo, durba sengupta, siewert-jan marrink, groningen biomolecular sciences and biotechnology institute, university of groningen, groningen, the netherlands. the martini coarse-grained (cg) force field was used to simulate the actions of the antimicrobial peptide alamethicin and of the anti-tumoral drug doxorubicin. both drugs were shown to interact strongly with a fluid phospholipid bilayer, and to aggregate there, in agreement with experimental results. because doxorubicin may establish intermolecular h-bonding, and thus lower its dipole moment, the cg representation of a doxorubicin dimer was adjusted. this less polar dimer was then tested for translocation and/or pore formation. contrary to results of atomistic simulations, alamethicin aggregates did not spontaneously open pores. they did so, however, when the size of the water beads was decreased. several small independent pores could then be observed. the magnitude of the permeability of these pores is analyzed and compared to experimental values. the occurrence of multiple small pores could indicate that the different conductance levels experimentally observed for alamethicin might simply result from the association of different numbers of these small pores. polarization refers to the asymmetric changes in cellular organization in response to external or internal signals. neuronal polarization begins with the growth of a single neurite shortly after cell division, followed by the growth of a second neurite at the opposite pole. this early bipolar shape is critical for brain function, as it defines the axis of migration and consequently the proper three-dimensional organization and nerve circuitry. however, it is not known if a direct relationship exists between the formation of the second, opposite neurite and the mechanisms involved in the formation of the first. we tackled this issue through mathematical modeling, based on membrane traffic (exocytosis-endocytosis) and lateral diffusion. 
with this approach, we demonstrated that a single pole of molecular asymmetry is sufficient to induce a second one at the opposite side, upon induction of growth from the first pole. our work gives mathematical proof that the occurrence of a single asymmetry in a round cell is sufficient to warrant morphological bipolarity. trypsin is one of the best characterized serine proteases. the enzyme acylation process is required for substrate degradation. there is a lot of information about how this process proceeds. however, in order to obtain a more detailed description of the catalytic triad mechanism, a qm/mm approach was used. we used the hybrid qm/mm potential implemented in the amber11 package. in the qm calculations a dft hamiltonian was used. we developed an approach based on adaptively biased md in order to obtain the free energy surface of the conformational space defined by the reaction coordinates. this approach presents some characteristics of steered md and umbrella sampling procedures. our results offer information about the lowest energy trajectory, the barrier profile of the reaction, and the geometry of the transition state. this method also provides further insight into the role of specific residues in the reaction. substituting asp102, a member of the catalytic triad, for ala, we were able to detect an increase of the barrier profile. this was due to the loss of the interaction of the carbonyl group of asp102 with nδ of his57, which makes nε of this residue a worse proton acceptor. these results show our approach to be a valuable method in the study of enzymatic mechanisms. the intracellular media comprise a great variety of macromolecular species that are not individually concentrated, but being present in the same compartment they exclude each other's volume and produce crowding. crowding has a profound impact on protein structure and determines conformational transitions and macromolecular association. 
we investigated macromolecular association in a 50% w/w bovine serum albumin (bsa) solution by time-domain terahertz (thz) spectroscopy and molecular modeling. molecular crowding was simulated by including two bsa molecules in the same water box. we generated ~300 such dimeric models, computed their thz spectra by normal modes analysis and compared them with the experimental data. the best bsa dimer model was selected based on the agreement with the experiment in the lowest-frequency domain of up to 1 thz. symmetry constraints improve the accuracy of ion channel models: application to two-pore-domain channels adina l. ion channels are important drug targets. the structural information required for structure-based drug design is often supplied by homology models (hm). making accurate hm is challenging because few templates are available and these often have substantial structural differences. besides, in molecular dynamics (md) simulations channels deviate from ideal symmetry and accumulate thermal defects, which complicate hm refinement using md. we evaluate the ability of symmetry-constrained md simulations to improve hm accuracy, using an approach conceptually similar to the casp competition: build hms of channels with known structure and evaluate the efficiency of various symmetry-constrained md methods in improving hm accuracy (measured as deviation from the x-ray structure). results indicate that unrestrained md does not improve accuracy, instantaneous symmetrization improves accuracy but not stability during subsequent md, while gradually imposing symmetry constraints improves both accuracy (by 5-50%) and stability. moreover, accuracy and stability are strongly correlated, making stability a reliable criterion in predicting hm accuracy. we further used this method to refine hms of trek channels. we also propose a gating mechanism for mechanosensitive channels that was further experimentally confirmed. 
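gradual symmetrization of the kind described above can be sketched with numpy (an illustrative c4 construction, not the authors' implementation): each subunit is mapped into a common frame by the inverse c4 rotation about the pore axis, averaged, mapped back out, and then harmonically restrained toward this symmetrized target with a force constant that is ramped up during the run:

```python
import numpy as np

def rot_z(angle):
    """Rotation matrix about the z axis (taken here as the pore axis)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def c4_symmetrize(subunits):
    """subunits: array (4, n_atoms, 3). Map each subunit into the frame of
    subunit 0 with the inverse C4 rotation, average, and map back out."""
    angles = [k * np.pi / 2.0 for k in range(4)]
    aligned = np.stack([subunits[k] @ rot_z(-a).T
                        for k, a in enumerate(angles)])
    mean = aligned.mean(axis=0)
    return np.stack([mean @ rot_z(a).T for a in angles])

def restraint_force(subunits, target, k_now):
    """Harmonic restraint toward the symmetrized target; in a gradual scheme
    k_now is ramped from 0 to its final value over the simulation."""
    return -k_now * (subunits - target)

# demo: perturb an ideal C4 tetramer and recover a more symmetric geometry
rng = np.random.default_rng(1)
base = rng.standard_normal((10, 3)) + np.array([5.0, 0.0, 0.0])  # off-axis subunit
ideal = np.stack([base @ rot_z(k * np.pi / 2.0).T for k in range(4)])
noisy = ideal + 0.1 * rng.standard_normal(ideal.shape)           # thermal defects
symmetrized = c4_symmetrize(noisy)
```

averaging over the four rotated copies cancels the random, asymmetry-breaking part of the deviation, so restraining toward `symmetrized` with a slowly growing `k_now` pulls the model back to an ideally symmetric geometry without abruptly freezing its internal motions, which is consistent with the finding above that gradual constraints outperform instantaneous symmetrization.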
nucleotide modifications and trna anticodon-mrna codon interactions on the ribosome olof allnér and lennart nilsson, karolinska institutet, stockholm, sweden. molecular dynamics simulations of the trna anticodon and mrna codon have been used to study the effect of the common trna modifications cmo5u34 and m6a37. in trna-val these modifications allow all four nucleotides to be successfully read at the wobble position in a codon. previous data suggest entropic effects are mainly responsible for the extended reading capabilities, but the detailed mechanisms have remained unknown. the aim of this paper is to elucidate the details of these mechanisms on an atomic level and quantify their effects. we have applied extensive free energy perturbation coupled with umbrella sampling, entropy calculations of trna (free and bound to the ribosome) and thorough structural analysis of the ribosomal decoding center.

human neuroserpin (hns) is a serine protease inhibitor (serpin) of tissue-type plasminogen activator (tpa). the conformational flexibility and the metastable state of this protein underlie misfolding and dysfunctional mutations causing a class of rare genetic diseases which share the same molecular basis. the conformational transition of the native form, triggered upon cleavage at the reactive center loop (rcl), releases a complex of the cleaved form bound to the inactivated target protease. without rcl cleavage, a stable inactive latent form can be obtained by intra/intermolecular loop insertion leading to polymerization. this work concerns the study of the three above-mentioned forms of hns by md simulations to investigate the relation between their conformational stability and flexibility. the starting native and cleaved configurations are based on the x-ray structure, while the latent form is modelled here.
the results of the simulation reveal a striking conformational stability along with the intrinsic flexibility of selected regions of the fold. the analysis of the essential collective modes of the native hns shows that the initial opening of the b-sheet a coincides with several changes in the local pattern of salt bridges and of hydrogen bonds.

regulation of ubiquitin-conjugating enzymes: a common mechanism based on a pattern of hydrophobic and acidic residues. enzyme temperature adaptation generally involves a modulation of intramolecular interactions, affecting protein dynamics, stability and activity [1] [2] . in this contribution, we discuss studies of different classes of extremophilic enzymes, focusing on cold-adapted variants, as well as their mesophilic-like mutants, performed by all-atom molecular dynamics simulations with particular attention to structural communication among residues within the three-dimensional architecture [3] [4] . common adaptation strategies turned out to be based on improved local flexibility in the proximity of the functional sites, a decrease in interconnected electrostatic interactions, and modulation of correlated motions and networks of communicating residues. specific differences related to the diverse protein folds can also be detected.

bneurexins and neuroligins are cell adhesion molecules and play an important role in synapse junction formation, maturation and signal transduction between neurons. mutations in genes encoding these proteins occur in patients with cognitive diseases such as autism disorders, asperger syndrome and mental retardation [1] . it has been found that the bneurexin-neuroligin complex also has an important role in angiogenesis [2] . herein we present the molecular foundations of bneurexin-neuroligin interactions obtained from all-atom molecular dynamics simulations of bneurexin, neuroligin and their complex (3b3q) [3] .
50 ns md trajectories (charmm force field) were analyzed and the roles of ca2+ and n-acetyl-d-glucosamine posttranslational modifications in the intermolecular interactions were scrutinized.

advances in hardware and software have enabled increasingly long atomistic molecular dynamics simulations of biomolecules, allowing the exploration of processes occurring on timescales of hundreds of microseconds to a few milliseconds. increasing the length of simulations beyond the microsecond time scale has exposed a number of limitations in the accuracy of commonly employed force fields. such limitations become more severe as the size of the systems investigated and the length of the simulations increase. here i will describe the force field problems that we have encountered in our studies, how we identified and addressed them, and what we have learned in the process about the biophysics of the systems we are investigating. while the quest for a "perfect" force field is not over (and may never be), our work has greatly improved the accuracy and range of applicability of simple physics-based force fields, to the point that reliable predictions can now be obtained from millisecond-timescale simulations of biomolecules.

local anesthetics (la) are pain-relief drugs, widely used in medicine and dentistry. the relatively short duration of analgesia still restricts their clinical use for the treatment of chronic pain. nowadays, intensive research is focused on anesthetics entrapped in liposomes to enhance their activity and pharmacokinetic properties [1] . in this work, we investigated the encapsulation of prilocaine (plc), an aminoamide local anesthetic, into a small unilamellar liposome. in line with our previous work [2] , we have carried out molecular dynamics (md) simulations using a coarse-grained model up to the microsecond time scale. in this way, we compare the effects of the concentration of la at physiological ph.
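the balance between the neutral and protonated species of an aminoamide anesthetic at a given ph follows the henderson-hasselbalch relation; a small sketch (illustrative only — the pka of 7.9 used in the test is an assumed, commonly cited value for prilocaine, not taken from the abstract):

```python
def neutral_fraction(ph, pka):
    """fraction of an amine base in the neutral (membrane-permeant) form:
    1 / (1 + 10**(pka - ph)) from the henderson-hasselbalch equation."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))

# at physiological ph a weak base with pka near 7.9 is mostly protonated,
# so both species coexist and partition differently into the vesicle
```

at ph 7.4 this gives roughly a quarter of the molecules in the neutral form, consistent with the coexistence of membrane-inserted neutral and surface-bound protonated species discussed in the abstract.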
we were able to capture important features of the plc-vesicle interactions. the behavior of plc at physiological ph is essentially a combination of the high- and low-ph behaviors: we found that all neutral plc molecules rapidly diffuse into the hydrophobic region of the vesicle, adopting an asymmetric bimodal density distribution. protonated plc molecules (pplc) initially placed in water were instead found only on the external monolayer, with a high rate of exchange with the water phase and no access to the inner part of the liposome, in a concentration-dependent way.

we focus on applications of molecular and mesoscale simulation methodologies to the cellular transport process of endocytosis, i.e., active transport mechanisms characterized by vesicle nucleation and budding of the cell membrane orchestrated by protein-interaction networks, and functionalized carrier adhesion to cell surfaces. we discuss theoretical and computational methodologies for quantitatively describing how cell-membrane topologies are actively mediated and manipulated by intracellular protein assemblies. we also discuss methods for computing absolute binding free energies for carrier adhesion. we present rigorous validation of our models by comparing to a diverse range of experiments.

the importance of delta-opioid receptors as a target of a large number of drugs is well recognized, but the molecular details of the interaction and action of the compounds are largely unknown. in an effort to shed some light on this important issue we performed an extensive computational study of the interaction of two compounds, clozapine and desmethylclozapine, with a delta-opioid receptor. according to experiments, the lack of a single methyl group in desmethylclozapine with respect to clozapine makes the former more active than the latter, providing a system well suited for a comparative study.
we investigated stable configurations of the two drugs inside the receptor by simulating their escape routes with metadynamics, an algorithm that allows the simulation of events that are otherwise out of range for standard molecular dynamics simulations. our results point out that the action of the compound is related to the spatial distribution of the affinity sites it visits during its residence; desmethylclozapine interacts with a larger set of residues than clozapine does. however, large conformational changes of the receptor were not observed in the presence of either compound. thus, a more dynamical picture of ligand-receptor affinity is proposed on the basis of the results obtained, involving the competition among different stable states as well as the interaction with the solvent. such information might be useful to provide hints and insights that can be exploited in more structure-and-dynamics-oriented drug design.

the coupling between the mechanical properties of enzymes and their biological activity is a well-established feature that has been the object of numerous experimental and theoretical works. in particular, recent experiments show that enzymatic function can be modulated anisotropically by mechanical stress. we study such phenomena using a method for investigating local flexibility on the residue scale, which combines a reduced protein representation with brownian dynamics simulations. we performed calculations on the enzyme guanylate kinase in order to study its mechanical response when submitted to anisotropic deformations. the resulting modifications of the protein's rigidity profile can be related to the changes in substrate binding affinities that were observed experimentally. further analysis of the principal components of motion of the trajectories shows how the application of a mechanical constraint on the protein can disrupt its dynamics, thus leading to a decrease of the enzyme's catalytic rate.
eventually, a systematic probe of the protein surface led to the prediction of potential hotspots where the application of an external constraint would produce a large functional response.

hiv-1 protease autocatalyses its own release from the gag and gagpol precursor polyproteins into mature functional proteins. as it is functional in the dimeric form, whilst initially only a single monomer is embedded within each gagpol chain, the question arises as to what cuts the cutter. two individual monomers in different gagpol chains are known to come together to form an embedded-dimer precursor protease. mature-like protease activity is concomitant with n-terminal intramolecular cleavage of this transient embedded-dimer precursor, but how this crucial maturation-initiating step is physically achieved has remained unknown. here, we show via 400 all-atom explicit-solvent molecular dynamics simulation runs of 400 ns each that the n-terminal of an immature-like protease, with the n-terminal initially unbound as in the gagpol polyprotein, can self-associate to the active site and therefore be cleaved under conditions of thermodynamic equilibrium, identifying possible binding pathways at atomic resolution, in agreement with previous indirect experimental evidence [1] . the binding pathway predominantly makes use of the open conformation of the beta-hairpin flaps characterised by us previously [2] , and the n-terminal binds across the entire active site in good agreement with crystal structures of a cleavage-site peptide-bound protease. the n-terminus serves two roles: firstly in the maturation of the protease itself, by self-associating to the active site, and then as a stabilizing component of the dimer interface in the mature protease. targeting the former mechanism could be the focus of a novel therapeutic strategy involving immature protease inhibition.
knotted proteins are the object of an increasing number of experimental and theoretical studies, because of their ability to fold reversibly into the same topologically entangled conformation. the topological constraint significantly affects their folding landscape, thus providing new insights into, and challenges for, funnel folding theory [1] . recently developed experimental methods to trap and detect knots have suggested that denatured ensembles of knotted proteins may be knotted [2] . we present numerical simulations of the early stage of folding of knotted proteins belonging to the protein families mtase (methyltransferase) and sotcase (succinyl-ornithine transcarbamylase), and of their unknotted homologues [3] . our results show that native interactions are not sufficient to generate the knot in the denatured configurations. however, when non-native interactions are included we observe formation of knots only in the chains whose native state is knotted. in addition, we find that the knots are formed through a universal mechanism. such a knot formation mechanism correctly predicts the fraction of knotted proteins found in nature and can be used to make qualitative predictions on their folding times.

cell adhesion is important for shape and motility and also for numerous signaling processes. adhesion is based on non-covalent interactions between transmembrane proteins and the extracellular matrix. cells are able to create two-dimensional assemblies of integrins, so-called focal adhesions, which they use to stick to the substrate and collect information about the environmental properties. the goal of this work is a deeper understanding of the formation and the stability of these adhesion clusters. bond cluster formation and disintegration is dynamically modeled with the aid of monte carlo simulations. in the model, a membrane is attached to a flat surface via a variable number of adhesion bonds.
the spatial configuration of these adhesion points, subjected to an inhomogeneous stress field, maps into a distribution of local membrane/surface distances. we introduce a model which explicitly accounts for the membrane elasticity and demonstrate that such models are able to explain the spontaneous formation of adhesion bond clusters.

structure-based models are successful at combining the essence of the energy landscape theory of protein folding [1] with an easy and efficient implementation. recently their scope has expanded beyond single protein structures, and they have been used profitably to study large conformational transitions [2] [3] . still, when dealing with a conformational transition between two well-defined structures, an unbiased and realistic description of the local backbone and sidechain interactions is necessary. the proposed model merges a precise description of these interactions with a structure-based long-range potential that takes into account different conformers. we present the results of the activation of the catalytic domain of human c-src tyrosine kinase, for which we reconstructed the transition free energy and described the activation loop flexibility. the excellent performance of the model in terms of speed, and the satisfactory accuracy of the description of the system and its flexibility, are promising for a more systematic study of the activation mechanisms of the tyrosine kinase family.

we introduce a previously undescribed technique for modelling the kinetics of stochastic chemical systems. we apply richardson extrapolation, a sequence acceleration method for ordinary differential equations, to a fixed-step tau-leaping algorithm, to produce an extrapolated tau-leaping method which has weak order of accuracy two. we prove this mathematically for the case of linear propensity functions. we use four numerical examples, two linear and two nonlinear, to show the higher accuracy of our technique in practice.
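for the linear-propensity case the construction can be sketched as follows (a minimal illustration, not the authors' code; the decay reaction, rates and sample sizes below are arbitrary choices): a weak-order-one fixed-step tau-leap is run at step sizes tau and tau/2, and the two means are combined so that the leading o(tau) error term cancels.

```python
import numpy as np

rng = np.random.default_rng(0)

def tau_leap_mean(x0, rate, t_end, tau, n_paths):
    """fixed-step tau-leaping for the linear decay reaction x -> 0 with
    propensity a(x) = rate * x; returns the sample mean of x(t_end)."""
    x = np.full(n_paths, x0, dtype=np.int64)
    for _ in range(int(round(t_end / tau))):
        # number of firings in one leap is poisson with mean a(x) * tau
        k = rng.poisson(rate * x * tau)
        x = np.maximum(x - k, 0)
    return x.mean()

def extrapolated_tau_leap_mean(x0, rate, t_end, tau, n_paths):
    """richardson extrapolation: 2 * mean(tau/2) - mean(tau) cancels the
    leading o(tau) weak error, giving weak order two for linear propensities."""
    coarse = tau_leap_mean(x0, rate, t_end, tau, n_paths)
    fine = tau_leap_mean(x0, rate, t_end, tau / 2, n_paths)
    return 2.0 * fine - coarse
```

for x0 = 1000, unit rate and t_end = 1 the exact mean is 1000/e, about 367.9; with tau = 0.1 the plain tau-leap mean is biased toward (0.9)**10 * 1000, about 348.7, while the extrapolated estimate lands within sampling error of the exact value.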
we illustrate this by using plots of absolute error for a fixed-step tau-leap and the extrapolated tau-leap. in all cases, the errors for our method are lower than for a fixed-step tau-leap; in most cases they exhibit second-order accuracy.

the major tripartite efflux pump acrab-tolc is responsible for the intrinsic and acquired multidrug resistance in escherichia coli. at the heart of the extrusion machinery there is the homotrimeric transporter acrb, which is in charge of the selective binding of structurally and chemically different substrates and of energy transduction. the effects of conformational changes, which have been proposed as the key feature of the extrusion of drugs, are investigated at the molecular level using different computational methods such as targeted molecular dynamics. simulations, including almost half a million atoms, have been used to assess several hypotheses concerning the structure-dynamics-function relationship of the acrb protein. the results indicate that, upon induction of conformational changes, the substrate detaches from the binding pocket and approaches the gate to the central funnel. in addition, we provide evidence for the proposed peristaltic transport involving a zipper-like closure of the binding pocket, responsible for the displacement of the drug. using these atomistic simulations the role of specific amino acids during the transitions can be identified, providing an interpretation of site-directed mutagenesis experiments. additionally, we discuss a possible role of water molecules in the extrusion process.

virus inhibitory peptide (virip), a 20 amino acid peptide, binds to the fusion peptide (fp) of human immunodeficiency virus type 1 (hiv-1) gp41 and blocks viral entry. molecular dynamics (md) and molecular mechanics/poisson-boltzmann surface area (mm/pbsa) free energy calculations were executed to explore the binding interaction between several virip derivatives and gp41 fp.
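the mm/pbsa estimate mentioned above combines a trajectory-averaged gas-phase molecular mechanics energy with polar (pb) and nonpolar (sa) solvation terms; a minimal sketch of the standard decomposition (the dictionary layout and all numbers in the test are hypothetical placeholders, not data from the abstract):

```python
def mmpbsa_delta_g(complex_avg, receptor_avg, ligand_avg, t_delta_s=0.0):
    """standard mm/pbsa estimate of the binding free energy:
    dg ~ <e_mm + g_pb + g_sa>_complex - <...>_receptor - <...>_ligand - t*ds.
    each argument is a dict of trajectory-averaged terms (e.g. kcal/mol)."""
    def g(avg):
        return avg["e_mm"] + avg["g_pb"] + avg["g_sa"]
    return g(complex_avg) - g(receptor_avg) - g(ligand_avg) - t_delta_s
```

including the configurational entropy term -t*ds, as the abstract describes, simply adds a further (typically unfavourable) contribution to the raw mm/pbsa sum.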
a promising correlation between antiviral activity and simulated binding free energy was established thanks to restriction of the flexibility of the peptides, inclusion of configurational entropy calculations and the use of multiple internal dielectric constants for the mm/pbsa calculations depending on the amino acid sequence. based on these results, a virtual screening experiment was carried out to design enhanced virip analogues. a selection of peptides was tested for inhibitory activity and several improved virip derivatives were identified. these results demonstrate that computational modelling strategies using an improved mm/pbsa methodology can be used for the simulation of peptide complexes. as such, we were able to obtain enhanced hiv-1 entry inhibitor peptides via an mm/pbsa-based virtual screening approach.

an essential step during the hiv life cycle is the integration of the viral cdna into the human genome. hiv-1 integrase mediates integration in a tight complex with the cellular cofactor ledgf/p75 [1] . disruption of the interaction interferes with hiv replication and therefore provides an interesting new drug target for antiretroviral therapy [2, 3] . here we present the structure-based discovery and optimization of a series of small-molecule inhibitors that bind to hiv-1 integrase and block the interaction with ledgf/p75. the work flow was set up according to a funnel principle, in which a series of virtual screening tools were applied in such a way as to discard at each step molecules unlikely to be active against the desired target (including 2d filtering, pharmacophore modelling and molecular docking). the activity and selectivity of the selected molecules were confirmed in an alphascreen-based assay, which measures protein-protein interaction in vitro, and furthermore by in vivo experiments. active compounds proceeded towards crystallographic soaking into the receptor protein crystals.
these crystal structures not only validated the binding mode and activity of the hit compounds, but were furthermore used as a platform for structure-based drug design, which resulted in the rational discovery of new hit compounds and optimized lead compounds. in vitro and in vivo experiments validated the mechanism of action of these compounds and show that they are a novel class of antiretroviral compounds with in vivo inhibitory activity, acting by targeting the interaction between ledgf/p75 and hiv-1 integrase. cross-resistance profiling indicates that these compounds are active against current resistant viral strains [4] . currently the most potent inhibitors show an in vivo ic50 of 55 nm. these compounds are promising candidates for future pharmaceutical optimization, to be used in the clinic as new antiretroviral agents. crystallography was used to validate the binding mode of the discovered inhibitors, and insights into the ligand-protein complex allowed for rational design of optimized inhibitors.

where ligand-induced structural changes are small, thermal fluctuations can play a dominant role in determining allosteric signalling. in thermodynamic terms, the entropy change for subsequent binding is influenced by global vibrational modes being either damped or activated by an initial binding event. one advantage of such a mechanism is the possibility of long-range allosteric signalling: changes to slow internal motion can be harnessed to provide signalling across long distances. this paper considers homotropic allostery in homodimeric proteins, and presents results from a theoretical approach designed to understand the mechanisms responsible for both cooperativity and anticooperativity. theoretical results are presented for the binding of camp to the catabolite activator protein (cap) [1] , where it is shown that the coupling strength within a dimer is of key importance in determining the nature of the allosteric response.
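the role of coupling strength can be seen in a toy grand-canonical model of two identical, coupled binding sites (an illustrative sketch only, not the paper's theory; the reduced-unit energies are assumptions):

```python
import math

def two_site_stats(mu, eps, j, beta=1.0):
    """two identical coupled binding sites with chemical potential mu,
    site binding energy eps, and occupied-occupied coupling j
    (j < 0 cooperative, j > 0 anticooperative).
    returns (mean occupancy, probability that both sites are bound)."""
    w1 = math.exp(beta * (mu - eps))             # one site bound
    w2 = math.exp(beta * (2 * (mu - eps) - j))   # both sites bound, coupled
    z = 1.0 + 2.0 * w1 + w2                      # partition function
    return (2.0 * w1 + 2.0 * w2) / z, w2 / z
```

with j = 0 the double-occupancy probability factorizes into the product of independent single-site occupancies; negative j enhances it (cooperativity) and positive j suppresses it (anticooperativity), which is the qualitative distinction the theoretical approach above addresses.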
results from theory are presented alongside both atomistic simulations and simple coarse-grained models, designed to show how fluctuations can play a key role in allosteric signalling in homodimeric proteins.

reversibly switchable fluorescent proteins (rsfps) can be switched between a fluorescent (on) and a nonfluorescent (off) state, which is accompanied at the molecular level by a cis-trans isomerization of the chromophore [1, 2] . this unique property has already provided new aspects to various microscopy techniques, such as high-resolution microscopy, fcs and monochromatic multicolor microscopy [3-5] . despite their established potential, rsfps still have a major drawback: the wavelength for fluorescence excitation is always one of the two switching wavelengths. the imaging process thus inevitably results in the switching of a small fraction of the rsfps, which might hinder or complicate some experiments. we developed a new reversibly switchable fluorescent protein which eliminates the problem of the coupling between switching and fluorescence excitation. this fluorescent protein follows an unusual and currently unknown mechanism of switching between a fluorescent and a nonfluorescent state. it is brightly fluorescent and exhibits an excellent signal-to-noise ratio.

in parallel studies [2] , qd-based ligands (egf, mabs) were targeted to egfr in gliomas. cell cultures, animal models and ex vivo biopsies of human high-grade as well as low-grade gliomas showed high probe specificity. the aim is to define more precisely the tumor boundaries at the time of resection. we used the programmable array microscope (pam), designed for sensitive, high-speed optical sectioning, particularly of living cells. the pam is based on structured illumination and conjugate detection using a digital micromirror device (dmd) [3] located in a primary image plane.
the unique feature is the rapid, (re)programmable adjustment of local excitation intensity; dynamic, on-the-fly optimization is thus achieved, e.g. multipoint frap [4] , light-exposure minimization and object tracking [5] , or super-resolution strategies. the features and operation of the 3rd-generation pam will be presented.

contraction of muscle cells, motility of microorganisms, neuronal activity, and other fast cellular processes require microscopic imaging of a three-dimensional (3d) volume with video-rate scanning. we present 3d video-rate investigations of structural dynamics in biological samples with a multicontrast third- and second-harmonic generation as well as fluorescence microscope. the multidepth scanning is achieved by two combined laser beams with staggered femtosecond pulses. each of the beams is equipped with a pair of deformable mirrors for dynamic wavefront manipulation, enabling multidepth refocusing with simultaneous correction of optical aberrations. combined, lateral scanning at more than 250 frames per second with fast refocusing enables 3d video-rate imaging of dynamically moving structures. in addition, the combination of the two laser beams is accomplished at two perpendicular polarizations, enabling live imaging of sample anisotropy, which is important for structural studies, particularly with second-harmonic generation microscopy. investigations of beating chick embryo hearts with the 3d video-rate scanning microscope revealed multidirectional cardiomyocyte contraction dynamics in myocardial tissue. the intricate synchronization of contractions between different layers of myocytes in the tissue will be presented. video-rate 3d microscopy opens new possibilities for imaging fast biological processes in living organisms.

confocal fluorescence microscopy is an invaluable tool to study biological systems at the cellular level, thanks also to the continual synthesis of new specific fluorescent probes.
multiprobe labelling enables complex system characterization. however, only the recent employment of narrowband tunable filters overcomes the problems due to the use of broadband ones. the possibility of acquiring the emission spectra in a spatially resolved way extends simple image intensity studies into characterization of the complex probe-environment relationship, through the sensitivity of fluorescence spectra to differences in the local molecular environment. consequently, fluorescence microspectroscopy (fms) is able to provide spectral information in a well-defined spatial region, allowing the researcher to simultaneously obtain spatial and spectroscopic information. our instrument has been specially built to study live cells and their interaction with nanomaterials, drug carriers and modified cell environments. other key features are reduced bleaching and a white-light source that does not limit the use to specific probes. graphical tools, such as colour-coded images, have also been introduced to provide explicit and straightforward visual information.

high-speed fpga-based multi-tau correlation for single-photon avalanche diode arrays, jan buchholz, jan krieger.

we demonstrate the use of fret imaging (förster resonance energy transfer) as an assay to directly monitor the dynamics of cross-bridge conformational changes in single fibres of skeletal muscle. we measured nm-distances of several fret pairs located at strategic positions to sense myosin head conformational changes: we focused our attention on the essential light chain, elc (specifically labelling a modified elc and exchanging it with the natural elc of the fibre), and we investigated its interaction with the sh1 helix, with the nucleotide binding pocket and with actin. we characterized fret in single rigor muscle fibres, determining distances in agreement with those from the crystallographic data.
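distance determination in such measurements rests on the förster relation, e = 1 / (1 + (r/r0)**6); a small sketch (illustrative only, with an arbitrary förster radius in the test):

```python
def fret_efficiency(r, r0):
    """förster resonance energy transfer efficiency for a donor-acceptor
    pair at distance r with förster radius r0 (same length units)."""
    return 1.0 / (1.0 + (r / r0) ** 6)

def fret_distance(e, r0):
    """invert the förster relation to recover the donor-acceptor
    distance from a measured transfer efficiency (0 < e < 1)."""
    return r0 * (1.0 / e - 1.0) ** (1.0 / 6.0)
```

the sixth-power dependence is what makes fret sensitive over just a few nanometres around r0, the property exploited in the experiments above.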
the results demonstrate the viability of the approach in sensing different fret efficiencies over a few nanometres, an essential requirement to follow the expected small fret variations in contracting muscle fibres. we are now performing dynamic experiments on rigor and active fibres by applying small stretch/release cycles to alter the interaction distances (estimated time resolution of nearly 20 ms/frame). in this configuration, it will be possible to measure functional changes, shedding light on the myosin head dynamics during contraction.

focal stimulation of cultured neurons is crucial since it mimics physiological release of molecules. indeed, the nervous system finely tunes the activity of each synapse by regulating the secretion of molecules spatially and temporally. currently used techniques have some drawbacks, such as poor spatial resolution or low flexibility. we propose a novel approach based on optical tweezers (ot) [1] to overcome these limitations. ot allow easy manipulation, with sub-micrometre precision, of silica beads, which can be functionalized with any protein. for a proof-of-principle study we coated 1.5 µm beads with brain-derived neurotrophic factor (bdnf) or bovine serum albumin (bsa) as control. we showed that a single bead was able to activate the bdnf receptor trkb, inducing its phosphorylation. moreover, bdnf beads but not control beads were able to induce c-fos translocation into the nucleus [2] , indicating that the whole pathway was activated. finally, we positioned the vectors in proximity to the growth cones of cultured hippocampal neurons [3] . control beads did not affect the normal development of these structures, while bdnf beads significantly did. these findings support the use of ot technology for long-term, localized stimulation of specific subcellular neuronal compartments.

a key role is played by its photoactivity, due to singlet oxygen production, which has a very short lifetime (ns-µs, depending on the hyp environment).
hyp sub-cellular localization depends on its concentration in the medium, the incubation time and the delivery system used. variations in the activity of protein kinase c (anti-apoptotic pkcα and pro-apoptotic pkcδ), in correlation with the activity of the bcl-2 protein, cytochrome c release from mitochondria and the decrease of the mitochondrial membrane potential after photodynamic action, were monitored. the study was performed for two different delivery modes of hyp to u-87 mg glioma cells: hyp alone (membrane diffusion) vs. hyp loaded in low-density lipoprotein (ldl) (endocytosis). confocal fluorescence microscopy, flow cytometry and specific fluorescence labeling were used as the main experimental techniques. our results show that hyp photoaction strongly affects the apoptotic response of the cells and that the dynamics of this action significantly depends on the delivery system used. a correlation analysis between the monitored parameters (see above) determined for both delivery systems is presented and critically discussed.

surface contamination by bacteria is a natural and spontaneous process occurring in any natural environment on biotic (mucosa, tissues…) and abiotic surfaces (medical equipment, food surfaces…). whatever the bacterial nature (gram-positive or -negative), the environmental fluid (air, water, blood…) and the receptor surface (implants, medical equipment, food surfaces…), the surface contamination initiated by the first adherent bacteria can evolve into a three-dimensional structure named a biofilm (a cohesive bacterial assembly held together by a self-produced extracellular organic matrix). the mechanisms by which these biofilms offer a protective environment to viral particles or hypertolerance to antimicrobial action are not yet elucidated.
to reach a better understanding of biofilm reactivity, we report for the first time successful applications of a correlative time-resolved optical microscopy approach, combining time-lapse (tl), frap, fcs and flim, for real-time analysis of molecular mobility and reaction inside biofilms. by means of non-biological or biological (virus, biocides and antibiotics) reactive compounds, significant advances in understanding the roles of the extracellular matrix and bacterial physiological properties were obtained, an important step towards improving pathogenic biofilm inactivation.

here we present a feasibility study to develop two-photon microscopy (2pm) into a standard diagnostic tool for noninvasive skin cancer diagnosis. the goal is to define experimental parameters that maximize the image quality of optical biopsies in human skin while avoiding tissue damage. possible diagnostic indicators will be compared for healthy tissue, benign, and malignant melanocytic lesions. we report on preliminary results of a study on 2pm imaging of ex-vivo biopsy samples, where autofluorescence intensity and contrast between lesion and surrounding tissues were optimised by varying excitation wavelength, detection band, and dispersion pre-compensation. moreover, we determined modulation functions for laser power and detector gain to compensate losses in deep tissue imaging. as the main process of photo-damage, thermo-mechanical modifications were quantified and damage threshold powers were determined. in order to image structural changes in ordered tissue like collagen fibres, second-harmonic generation signals were recorded and optimised.

in-vivo two-photon imaging of the honeybee antennal lobe. we adapted a two-photon microscope for in-vivo imaging of the honeybee olfactory system, focusing on its primary centres, the antennal lobes. the setup allowed us to obtain both 3d-tomographic measurements of the antennal lobe morphology and time-resolved in-vivo calcium imaging of its neuronal activity.
the morphological data were used to precisely measure the glomerular volume in both sides of the brain, investigating the question of volumetric lateralization. functional calcium imaging allowed recording the characteristic glomerular response maps to external odour stimuli applied to the bees' antennae. compared to previous neural imaging experiments in the honeybee, this work enhanced spatial and temporal resolution and penetration depth, and it minimized photo-damage. the final goal of this study is the extension of the existing functional atlases of the antennal lobe to 3d and into the temporal dimension by investigating time-resolved activity patterns. the use of voltage-sensitive fluorescent dyes (vsd) for noninvasive measurement of the action potential (ap) in blood-perfused hearts has been hindered by low interrogation depth and the high absorption and auto-fluorescence of cardiac tissue. these limitations are diminished by the new near-infrared (nir) vsd di-4-anbdqbs. here we assessed the toxicity and photo-toxicity of these dyes in guinea pig and human cardiac muscle slabs. application of the nir vsd showed no effect on cardiac muscle contraction force or relaxation. optical action potentials closely tracked the kinetics of microelectrode-recorded aps in both field- and electrode-stimulated preparations. for phototoxicity assessment, cardiac slabs preloaded with dye (50 μM) were exposed to prolonged laser radiation of various powers. microelectrode ap recordings show that exposure of dye-loaded tissue to prolonged laser radiation (10 min; 2 mW/mm²) had no statistically significant effect on apd50 or conduction velocity, indicating no or weak photo-toxicity of the nir vsd. in contrast, exposure of tissue preloaded with phototoxic dyes (mitotracker deep-red) to 5 min of laser radiation caused a significant reduction in apd50 (by 13%) and conduction velocity (30%). thus, due to their low photo-toxicity, nir vsd are well suited for in vivo cardiac imaging. 
streptomycetes are filamentous gram-positive soil bacteria well known for their complex morphological development and secondary metabolite production. during their life cycle spores germinate to form a network of hyphae, which later develops into aerial mycelium when cross-walls are generated and spores are formed. we have examined and compared the last stage of the differentiation process in a wild-type s. coelicolor (m145) and its ΔcabB mutant lacking a calmodulin-like calcium binding protein. the strains were grown on four kinds of media: smms, smms with 10 % saccharose, r5 and r5 with reduced calcium, in order to study the effect of environment and osmotic stress on the sporulation of the two strains and to assess the function of the cabB protein. pictures of the cultures were taken at 48 hours and after 7 days using phase contrast, atomic force and confocal laser scanning microscopes, and the sizes of spores were measured. our results showed that the ΔcabB mutant made smaller spores, and its differentiation and stress response were slower. from this we conclude that the aberrant protein slows metabolism and signal transduction and affects sporulation, septation and aerial mycelium formation, indicating that cabB has a significant role in normal development. the mobility and reaction parameters of molecules inside living cells can be conveniently measured using fluorescent probes. typically fluorescence correlation spectroscopy (fcs) based on confocal microscopy is used for such measurements. this implies high time-resolution, but only for a single spot at a time. in order to achieve high time-resolution at multiple spots, we built a single plane illumination microscope (spim) equipped with high-speed image acquisition devices and high-NA detection optics. this allows us to do parallel fcs measurements in a thin plane (width ~2-3 μm) of the sample. 
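the quantity at the heart of any fcs measurement, parallel or single-spot, is the normalized intensity autocorrelation g(τ) = ⟨δi(t)·δi(t+τ)⟩/⟨i⟩². a minimal software sketch of that computation on a single pixel trace (illustrative only; function name and toy trace are our own, not the system's implementation):

```python
import numpy as np

def fcs_autocorrelation(intensity, max_lag):
    """normalized intensity autocorrelation
    g(tau) = <dI(t) dI(t+tau)> / <I>^2 for lags 1..max_lag."""
    i = np.asarray(intensity, dtype=float)
    mean = i.mean()
    d = i - mean                      # fluctuations about the mean
    n = len(i)
    return np.array([
        np.dot(d[:n - lag], d[lag:]) / ((n - lag) * mean ** 2)
        for lag in range(1, max_lag + 1)
    ])

# toy pixel trace: a slowly wandering signal is positively correlated
# at short lags, unlike pure shot noise
rng = np.random.default_rng(0)
trace = 100.0 + np.cumsum(rng.normal(0.0, 0.1, 10000))
g = fcs_autocorrelation(trace, 50)
```

in an fcs analysis, g(τ) would then be fitted with a diffusion model to extract concentrations and diffusion times; running this per pixel over a camera frame is what the parallel measurement amounts to.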
our setup is equipped with a fast emccd camera (full frame time resolution 2000 μs) and a 32×32 pixel array of spads. the spad array has a full frame time resolution of 3-10 μs, which is even fast enough to resolve the typical motion time-scale of small molecules (like egfp) inside living cells. the performance of the system is characterized by diffusion measurements of water-soluble fluorescent beads, as well as fcs measurements in living cells. our data acquisition system uses programmable hardware for some tasks and is fast enough to allow real-time correlation of 1024 pixels, as well as saving the complete dataset for later evaluation. electron cryo-microscopy (cryo-em) covers a larger size range than any other technique in structural biology, from atomic resolution structures of membrane proteins, to large noncrystalline single molecules, entire organelles or even cells. electron crystallography of two-dimensional (2d) crystals makes it possible to examine membrane proteins in the quasi-native environment of a lipid bilayer at high to moderately high resolution (1.8-8 Å). recently, we have used electron crystallography to investigate functionally important conformational changes in membrane transport proteins such as the sodium/proton antiporters nhaa and nhap, or the structure of channelrhodopsin. ''single particle'' cryo-em is well suited to study the structure of large macromolecular assemblies in the 3.3 to 20 Å resolution range. a recent example is our 19 Å map of a mitochondrial respiratory chain supercomplex consisting of one copy of complex i, two copies of complex iii, and one of complex iv. the fit of the x-ray structures to our map indicates short pathways for efficient electron shuttling between complexes i and iii by ubiquinol, and between complexes iii and iv by cytochrome c. 
electron cryo-tomography can visualize large protein complexes in their cellular context at 30-50 Å resolution, and thus bridges the gap between protein crystallography and light microscopy. cryo-et is particularly suitable for studying biological membranes and large membrane protein complexes in situ. we found that long rows of atp synthase dimers along the ridges of inner membrane cristae are a ubiquitous feature of mitochondria from all 6 species we investigated (2 mammals, 3 fungi, 1 plant). the proton pumps of the respiratory chain appear to be confined to the flat membrane regions on either side of the ridges. this highly conserved pattern suggests a fundamental role of the mitochondrial cristae as proton traps for efficient atp synthesis. single-particle analysis: advanced fluorescence imaging, including subdiffraction microscopy, relies on fluorophores with controllable emission properties. chief among these fluorophores are the on/off photoswitchable and green-to-red photoconvertible fluorescent proteins. irisfp was recently reported as the first fluorescent protein to combine irreversible photoconversion from a green-emitting to a red-emitting state with reversible on/off photoswitching in both the green and red states. the introduction of this protein resulted in new applications such as super-resolution pulse-chase imaging, but the properties of irisfp are far from optimal from a spectroscopic point of view and its tetrameric organization complicates its use as a fusion tag. we have demonstrated how four-state optical highlighting can be rationally introduced into photoconvertible fluorescent proteins by developing and characterizing a new set of such enhanced optical highlighters derived from meosfp and dendra2. one of these, which we called nijifp, was identified as a promising new multi-photoactivatable fluorescent protein with optical properties that make it ideal for advanced fluorescence-based imaging applications. 
the introduction of the concept of optical markers to medicine and biology has tremendously changed the status of these two important disciplines. this was mainly due to strong development in imaging techniques, which now allow us to investigate both static and dynamic properties of living cells, their components and their interactions with external factors. currently used molecular markers, including organic dyes, fluorescent proteins or chelates containing lanthanide ions, have several significant limitations. one alternative to molecular markers are inorganic quantum dots (e.g. cdse, cds), which are now commonly used in many academic studies. however, even if they are much better from a physicochemical point of view, from the application point of view they are at this moment of limited use, mainly because of their high risk of toxicity. one solution combining the advantages of both concepts is to make nontoxic inorganic nanocrystals doped with lanthanide ions. in this work, we will present optical results obtained for nayf4 nanocrystals doped with different lanthanide ions. the aim of this work was to design and synthesize these markers, to understand the physical processes responsible for their emission and to optimize these processes to their physical limits. intravital microscopy has fostered a wealth of publications regarding the behavior of cells in different tissues and physiological conditions. however, few papers describe how motility parameters can be used to understand whether an interaction is occurring, and, on balance, the distinction between interacting and non-interacting cells is performed visually on the image time stack. here we describe a multi-parameter approach that allows one to discern among different cell behaviors on an objective ground, and we demonstrate its effectiveness by evaluating the mutual fate of natural killer (nk) and dendritic (dc) cells at the draining lymph nodes in inflammatory and stationary conditions. 
the method is time saving and allows a wide-scale characterization of the lymphocyte tracks and the building of statistics of cell-cell interaction durations. this has allowed the development of a numerical model of the nk-dc interaction, based on a molecular-stochastic dynamic approach, whose output can be directly compared to the data. hemozoin is formed during malaria infection of red blood cells: the malaria parasite cleaves hemoglobin, leaving free heme, which is toxic to the parasite. the free heme is then bio-crystalized to form hemozoin, which allows the parasite to remain viable. the hemozoin released during the breakdown of the red blood cells is small and can be difficult to resolve spatially. since it contains an abundance of heme protein, which has a strong absorbance at 532 nm, it can be readily detected and tracked by using resonant raman scattering spectroscopy. here we use slit-scanning confocal raman imaging to detect the hemozoin and resolve it against the background molecules. inside a red blood cell, hemoglobin is the strongest background signal since it also contains large amounts of heme. nevertheless, the discrimination is possible, and the time-resolved observation of hemozoin is an important tool for understanding the effects of malaria, since hemozoin can trigger the immune response and cause inflammation in tissue. muscle performance at the molecular level is determined by the elementary displacement (working stroke) produced by the motor protein myosin ii, and its dependence on load. we developed a laser trap assay (the optical leash) capable of applying controlled loads to a single myosin head before the working stroke is initiated and of probing the actin-myosin interaction on the microsecond time scale. we found that the working stroke size depends both on the load and on the detachment pathway followed by myosin. in a first pathway, myosin detaches very rapidly from actin (<1 ms) without producing any movement. 
in a second pathway, myosin steps and remains bound to actin for a time inversely proportional to the atp concentration; the working stroke remains constant (~5 nm) as the load is increased, until it suddenly vanishes as the isometric force is reached (5.7 ± 0.6 pN). a third dissociation pathway becomes more populated as the force is increased, due to premature unbinding of myosin from actin, resulting in a working stroke that decreases with load. taken together, these results give new insight into the molecular mechanism of the load dependence of the myosin working stroke, which is a primary determinant of skeletal muscle performance and efficiency. previously we have deleted either or both of these terminal helices genetically. surprisingly, all mutants rotated in the correct direction, showing that the shaft portion is dispensable. here we inquire whether the rest of the γ rotor, the globular protrusion that accounts for ~70 % of the γ residues, is also dispensable. keeping the n- and c-terminal helices that constitute the shaft, we replaced the middle ~200 residues with a short helix-turn-helix motif borrowed from a different protein. the protrusion-less mutant thus made retained a high atpase activity and rotated fast in the correct direction. this may not be unexpected because, in crystal structures, most of the removed residues do not contact the α3β3 ring. combined with the previous results, however, the present results indicate that none of the γ residues are needed for rotation. the rotary mechanism of a molecular engine, the vacuolar proton-atpase, working in a biomembrane (csilla ferencz, pál petrovszki, zoltán kóta): the rotary mechanism of the vacuolar proton-atpase (v-atpase) couples atp hydrolysis and trans-membrane proton translocation. we tested the effect of an oscillating electric (ac) field on v-atpase activity in yeast vacuoles. 
the ac technique has several advantages over direct observations: it can be applied to native membranes, there are no labels or attachments involved, and the target protein is in its natural environment. this is the first experiment of its kind on v-atpase, and we obtained strikingly different results from previous studies on other proteins: both low- and high-frequency ac fields reduce atpase activity over a wide frequency range. a sharp resonance is seen at 88.3 hz, where the atpase activity reaches or exceeds the control (no ac) level. we think that the resonance happens at the frequency of the 60-degree rotor steps, meaning that the rotation rate of the rotor is around 15 hz under the given conditions. synchronisation of individual atpases by slow or matching, but not fast, ac is likely via a hold-and-release mechanism. we can explain the above observations by assuming that the ac field interacts with the proton movements, and by considering the estimated geometry of the hydrophilic proton channels and the proton binding sites on the rotor. the ttss constitutes a continuous protein transport channel of constant length through the bacterial envelope [3] . the needle of the type three secretion system is made of a single small protein (protomer). we analyzed the assembly and the structure of the ttss needle using different biophysical methods including fourier transform infrared spectroscopy, nmr spectroscopy and x-ray crystallography. we show that the ttss needle protomer refolds spontaneously to extend the needle from the distal end. the protomer refolds from an α-helical into a β-strand conformation to form the ttss needle [4] . regulated secretion of virulence factors requires the presence of an additional protein at the ttss needle tip. x-ray crystal structure analysis of the tip complex revealed major conformational changes in both the needle and the tip proteins during assembly of the s. typhimurium ttss. 
our structural analysis provides the first detailed insight into both the open state of the ttss needle tip and the conformational changes occurring at the pathogen-host interface [5] . the membrane-bound component f_o of the atp synthase works as a rotary motor and plays the central role of driving the f_1 component to transform chemiosmotic energy into atp synthesis. we have conducted molecular dynamics simulations of the membrane-bound f_o sector with an explicit lipid bilayer, in which the particular interest was to observe the onset of helix motion in the c-ring upon the change of the protonation state of asp61 of the c subunit, which is the essential element of boyer's binding-change mechanism. to investigate the influence of the transmembrane potential and ph gradient, i.e., the proton motive force, on the structure and dynamics of the a-c complex, different electric fields were applied along the membrane normal. correlation map analysis indicated that the correlated motions of residues in the interface of the a-c complex were significantly reduced by external electric fields. the deuterium order parameter (s_cd) profile calculated by averaging over all the lipids in the f_o-bound bilayer was not very different from that of the pure bilayer system, which agrees with recent ²h solid-state nmr experiments. however, by delineating the lipid properties according to their vicinity to f_o, we found that the s_cd profiles of different lipid shells are prominently different. lipids close to f_o formed a more ordered structure. similarly, the lateral diffusion of lipids on the membrane surface also followed a shell-dependent behavior. the lipids in the proximity of f_o exhibited very significantly reduced diffusional motion. the numerical value of s_cd was anti-correlated with that of the diffusion coefficient, i.e., the more ordered lipid structures led to slower lipid diffusion. 
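for reference, the deuterium order parameter discussed above is s_cd = ⟨(3cos²θ − 1)/2⟩, with θ the angle between a c-h (c-d) bond vector and the membrane normal. a minimal sketch of that average (our own illustration, not the simulation code of the study):

```python
import numpy as np

def order_parameter_scd(ch_vectors, normal=(0.0, 0.0, 1.0)):
    """deuterium order parameter s_cd = <(3 cos^2 theta - 1)/2>, where
    theta is the angle between each c-h bond vector and the membrane normal."""
    v = np.asarray(ch_vectors, dtype=float)
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    cos_theta = (v @ n) / np.linalg.norm(v, axis=1)
    return float(np.mean((3.0 * cos_theta**2 - 1.0) / 2.0))

# limiting cases: bonds along the normal give 1, in-plane bonds give -0.5
s_aligned = order_parameter_scd([[0, 0, 1]] * 10)   # → 1.0
s_planar = order_parameter_scd([[1, 0, 0], [0, 1, 0]] * 5)   # → -0.5
```

in practice the average runs over equivalent c-h bonds of all lipids in a shell and over simulation frames, which is how shell-resolved s_cd profiles like those described above are obtained.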
our findings will not only help to elucidate the dynamics of f_o depending on the protonation states and electric fields, but may also shed some light on the interactions between the motor f_o and its surrounding lipids under physiological conditions, which could help to rationalize its extraordinary energy conversion efficiency. this work was published [1] in march 2010, and was selected as one of the two featured articles of that issue. the research project is directed toward the construction of a synthetic bio-inorganic machine that consists of a single actin filament interacting with a linear array of myosin ii motors regularly disposed on a nano-structured device. the motor array is intended to simulate the unique properties of the ensemble of motor proteins in the half-sarcomere of the muscle by providing the conditions for developing steady force and shortening through cyclic interactions with the actin filament. the mechanical outputs, in the range of 0.5-200 pN force and 1-10,000 nm shortening, will be measured and controlled. the bacterial flagellar motor is a membrane-embedded molecular machine that rotates filaments, providing a propulsive force for bacteria to swim. the molecular mechanism of torque (turning force) generation is being investigated through the study of the properties and three-dimensional structure of the motor's stator unit. we are taking both top-down and bottom-up approaches, combining data from molecular genetics studies, cross-linking, x-ray protein crystallography and molecular dynamics simulations. we have recently determined the first crystal structure of the protein domain that anchors the proton-motive-force-generating mechanism of the flagellar motor to the cell wall, and formulated a model of how the stator attaches to peptidoglycan. 
the work presented at the meeting will inform the audience about our latest work establishing the relationship between the structure, dynamics and function of a key component of the bacterial flagellar motor, the motility protein b (motb). this work will be put in the perspective of the mechanism of rotation, stator assembly, anchoring to peptidoglycan and interaction with the rotor, and discussed in the light of the elementary events composing the cycle of electrochemical-to-mechanical energy conversion that drives flagellar rotation. members of the conserved kinesin-5 family fulfill essential roles in mitotic spindle morphogenesis and dynamics and were thought to be slow, processive microtubule (mt) plus-end-directed motors. the mechanisms that regulate kinesin-5 function are still not well understood. we have examined the in vitro and in vivo functions of the saccharomyces cerevisiae kinesin-5 cin8 using single-molecule motility assays and single-molecule fluorescence microscopy, and found that cin8 motility is exceptional in the kinesin-5 family. in vitro, individual cin8 motors could be switched by ionic conditions from rapid (up to 50 μm/min) and processive minus-end, to bidirectional, to slow plus-end motion. deletion of the uniquely large insert of 99 amino acids in loop 8 of cin8 induced a bias towards minus-end motility and strongly affected the directional switching of cin8 both in vivo and in vitro. we further found that deletion of the functionally overlapping kinesin-5 kip1 and of the spindle-organizing protein ase1 affected cin8 velocity and processivity, but directionality was not affected. the entirely unexpected finding of switching of cin8 directionality in vivo and in vitro demonstrates that the ''gear box'' of kinesins is much more complex and versatile than thought. many biological motor molecules move within cells using step sizes predictable from their structures. 
myosin-vi, however, has much larger and more broadly distributed step sizes than those predicted from its short lever arms. we explain the discrepancy by monitoring qdots and gold nano-particles attached to the myosin-vi motor domains using high-sensitivity nano-imaging. the large step sizes were attributed to an extended and relatively rigid lever arm; their variability to two step sizes, one large (72 nm) and one small (44 nm). these results suggest there exist two tilt-angles during myosin-vi stepping, which correspond to the pre- and post-powerstroke states and regulate the leading head. the large steps are consistent with the previously reported hand-over-hand mechanism, while the small steps follow an inchworm-like mechanism and increase in frequency with adp. switching between these two mechanisms in a strain-sensitive, adp-dependent manner allows myosin-vi to fulfill its multiple cellular tasks including vesicle transport and membrane anchoring. http://www.fbs.osaka-u.ac.jp/labs/yanagida/, http://www.qbic.riken.jp/. ferritin deposits iron in an iron oxyhydroxide core surrounded by a protein shell. the iron core structure may vary in different ferritins in both normal and pathological cases. to study iron core variations, mössbauer spectroscopy with high velocity resolution was applied for a comparative analysis of normal and leukemia chicken liver and spleen tissues, human liver ferritin and the commercial pharmaceutical products imferon, maltofer® and ferrum lek as ferritin models. mössbauer spectra of these samples, measured with high velocity resolution at room temperature, were fitted using two models: a homogeneous iron core (one quadrupole doublet) and a heterogeneous iron core (several quadrupole doublets). the results of both fits demonstrated small variations of the mössbauer hyperfine parameters related to structural variations of the iron cores. 
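both fitting models just mentioned are built from the same component: a quadrupole doublet, i.e. a pair of lorentzian absorption lines split symmetrically about the isomer shift. an illustrative sketch with made-up parameters (not the published fit values), showing how the line positions encode the splitting:

```python
import numpy as np

def doublet(v, shift, splitting, width, depth):
    """transmission spectrum of one quadrupole doublet: two lorentzian
    absorption lines of equal depth centred at shift +/- splitting/2."""
    def lorentz(centre):
        return depth * (width / 2) ** 2 / ((v - centre) ** 2 + (width / 2) ** 2)
    return 1.0 - lorentz(shift - splitting / 2) - lorentz(shift + splitting / 2)

v = np.linspace(-4, 4, 8001)                 # velocity axis, mm/s
spec = doublet(v, 0.35, 0.70, 0.30, 0.15)    # illustrative hyperfine parameters

# the two absorption minima mark the line positions; their distance
# recovers the quadrupole splitting (slightly reduced by line overlap)
interior = (spec[1:-1] < spec[:-2]) & (spec[1:-1] < spec[2:])
lines = v[1:-1][interior]
splitting_est = lines.max() - lines.min()
```

a heterogeneous-core fit would simply sum several such doublets with different shifts and splittings and optimise all parameters against the measured spectrum.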
these small structural variations may result from different degrees of crystallinity, the mode of iron packing, nonequivalent iron positions, etc. the small differences obtained for normal and leukemia tissues may be useful for distinguishing ferritins from normal and pathological cases. this work was supported in part by the russian foundation for basic research (grant # 09-02-00055-a). invasion of epithelial cells by salmonella enterica is mediated by bacterial ''effector'' proteins that are delivered into the host cell by a type iii secretion system (ttss). the collaborative action of these translocated effectors modulates a variety of cellular processes leading to bacterial uptake into mammalian cells. type iii effectors require the presence in the bacterial cytosol of specific tts chaperones. effectors are known to interact with their chaperone via a chaperone binding domain (cbd) situated at their n-terminus. this work focuses on sopb, an effector with phosphoinositide phosphatase activity, and particularly its interaction with the specific chaperone sige, using biochemical, biophysical and structural approaches. we have co-expressed sopb with its specific chaperone sige and purified the complex, determined the limits of the cbd and purified the sopb-cbd/sige complex. the structure of sige has been solved previously, but no crystals could be obtained for structure determination of either complex. we used saxs experiments combined with biophysical approaches to analyse the interaction between sopb and its chaperone, as well as the quaternary structure of the complex, which will be described in this presentation. guanylate monophosphate kinase (gmpk) is a cytosolic enzyme involved in nucleotide metabolic pathways. one of the physiological roles of gmpks is the reversible phosphoryl group transfer from atp to gmp (its specific ligand), yielding adp and gdp. the gmpk from haemophilus influenzae is a small protein, with 208 amino acids in its primary structure. 
in order to determine the secondary structure changes of this enzyme, as well as some physical characteristics of its complexes with the gmp and atp ligands, circular dichroism (cd) and atr-ftir studies were performed. the enzyme and its ligands were dissolved in tris-hcl buffer, at ph 7.4 and 25 °c. from the cd spectra, the content of the secondary structure elements of gmpk and of gmpk/gmp and gmpk/atp (with and without mg2+) was determined. the major secondary structure elements of gmpk from haemophilus influenzae were α-helix (~37 %) and β-sheet (~36 %). atr-ftir experiments show that the amide i and amide ii bands of gmpk are typical for a protein with high α-helix content. from the second derivative spectra, the content of the secondary structure elements was estimated. these data were in agreement with those obtained by cd. assembly of the mature human immunodeficiency virus type 1 capsid involves the oligomerization of the capsid protein, ca. the c-terminal domain of ca, ctd, participates both in the formation of ca hexamers and in the joining of hexamers through homodimerization. intact ca and the isolated ctd are able to homodimerize in solution with similar affinity (dissociation constant on the order of 10 μM); ctd homodimerization involves mainly an α-helical region. in this work, we show that peptides derived from the dimerization helix (which keep the residues energetically important for dimerization and have higher helical propensities than the wild-type sequence) are able to self-associate with affinities similar to that of the whole ctd. moreover, the peptides have a higher helicity than the wild-type sequence, although it is not as high as theoretically predicted. interestingly, the peptides bind to ctd, but for all peptides but one, binding does not occur at the dimerization interface of ctd (helix 9). rather, binding occurs at the last helical region of ctd, even for the wild-type peptide, as shown by hsqc-nmr. 
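a dissociation constant on the order of 10 μM fixes the monomer-dimer balance through kd = [m]²/[m₂]. a short, self-contained sketch of the resulting free-monomer fraction (our own illustration, not taken from the abstract):

```python
import math

def monomer_fraction(total, kd):
    """free-monomer fraction for the equilibrium 2m <=> m2 with
    kd = [m]^2 / [m2]; total is in monomer units (same units as kd).
    obtained by solving 2[m]^2 + kd*[m] - kd*total = 0 for the
    physical (positive) root."""
    m = (-kd + math.sqrt(kd * kd + 8.0 * kd * total)) / 4.0
    return m / total

# at a total concentration equal to kd (10 uM), exactly half the
# chains are free monomers: [m] = 5 uM, [m2] = 2.5 uM
f = monomer_fraction(10e-6, 10e-6)   # → 0.5
```

well below kd the protein is essentially all monomeric, well above kd mostly dimeric, which is why affinities of this magnitude can be measured over an accessible concentration range.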
as a consequence, all peptides but one are unable to inhibit capsid assembly of the whole ca in vitro. the peptide whose binding occurs at the ctd dimerization helix has a val→arg mutation at position 191, which has been involved in dimer-dimer contacts. these findings suggest that even keeping the energetically important residues for ctd dimerization within a more highly populated helical structure is not enough to hamper dimerization of ctd. putp is an integral membrane protein located in the cytoplasmic membrane of escherichia coli, responsible for the coupled transport of na+ and proline. it belongs to the family of sodium solute symporters (sss). structural data for putp are not available, but secondary structure predictions together with biochemical and biophysical analyses suggest a 13-transmembrane motif. from a recent homology model based on the x-ray structure of the related na+/galactose symporter vsglt, previously published electron paramagnetic resonance (epr) studies, and recent crystallographic and epr studies on leut, the cognate bacterial homolog of a neurotransmitter:na+ symporter, it has been proposed that helices viii and ix, as well as the interconnecting ''loop 9'' region, determine the accessibility of the periplasmic cavities which bind sodium and proline. we performed site-directed spin labeling of ''loop 9'' in combination with epr spectroscopy to investigate the structural features of this region and possible conformational changes induced by sodium and proline. analyses of spin label mobility and polarity, as well as accessibility to paramagnetic quenchers, allow us to refine this region in the present homology model. furthermore, our data suggest conformational changes in this region upon substrate binding, including an overall motion of a helical segment. fatty acid-binding proteins (fabp) are a family of low molecular weight proteins that share structural homology and the ability to bind fatty acids. 
the common structural feature is a β-barrel of 10 antiparallel β-strands forming a large inner cavity that accommodates nonpolar ligands, capped by a portal region comprising two short α-helices. b-fabp exhibits high affinity for docosahexaenoic acid (dha) and oleic acid (oa). it is also postulated that b-fabp may interact with nuclear receptors of the ppar family. in the present work, we used molecular biology and spectroscopic techniques to correlate structure, dynamics and function. site-directed mutagenesis was used to produce 5 mutants of b-fabp with a nitroxide spin probe (mtsl) selectively attached to residues located at the portal region. esr spectra of the labeled b-fabp mutants were sensitive to the location of the mutation and were able to monitor interactions in three cases. shsp are ubiquitous proteins involved in cellular resistance to various stresses (oxidative, heat, osmotic…). they are able to prevent the aggregation of non-native proteins by forming large soluble complexes with them, preventing their nonspecific and insoluble aggregation. as a consequence of this molecular chaperone function, they can regulate many processes (resistance to chemotherapy, modulation of cellular adhesion and invasion, the inflammatory response in skin), and the modulation of their expression has been found to be a molecular marker in cancers, spermatogenesis, and cartilage degeneration. furthermore, they are involved in several pathologies: myopathies, neuropathies, cancers, cataracts. among the 10 human members (hspb1-10), this study focused on hspb1 (hsp27, involved in some cancers), hspb4 (lens specific), hspb5 (lens, muscle, heart, lung) and hspb5-r120g (responsible for a desmin-related myopathy and a cataract). 
as shsp form large, soluble (but, in mammals, polydisperse) hetero-oligomers, molecular biology, biochemistry, biophysics and bioinformatics were successfully combined to compare the functional and dysfunctional assemblies, in order to understand the critical differences between shsp members depending upon their tissue and cellular localization. ionizing radiation is a type of radiation that carries enough energy to displace electrons and break chemical bonds. it can promote the removal of at least one electron from an atom or molecule, creating radical species, namely reactive oxygen species (ros) [1, 2] . these are often associated with damage at the cellular level, such as dna mutations, cell cycle modifications and, in animal cells, cancer. to overcome this problem, organisms developed different protection/repair mechanisms that enable them to survive these threats. dna glycosylases are enzymes that are part of the base excision repair (ber) system, mainly responsible for dna repair. they can recognize a dna lesion and, in some cases, are able to remove the mutated base. here we propose to study one of those enzymes, endonuclease iii, which contains a [4fe-4s] cluster [3, 4] . samples were exposed to different doses of uv-c radiation and the effects were studied by electrophoretic and spectroscopic methods. na,k-atpase is an integral protein present in the plasma membrane of animal cells, and consists of two main subunits: α and β. cholesterol is an essential constituent of animal cell membranes. in order to study the interaction between na,k-atpase and cholesterol, we have used the dsc technique and a proteoliposome system composed of the enzyme and dppc:dppe, with different mol percentages of cholesterol. the heat capacity profile of purified na,k-atpase exhibits three transitions, with 31, 189 and 60 kcal/mol at 55, 62 and 69 °c. 
Multiple components in the unfolding transition could be attributed either to different steps in the pathway or to independent unfolding of different domains. The denaturation of Na,K-ATPase is an irreversible process. For the proteoliposome we also observed three peaks, with 180, 217 and 41 kcal/mol at 54, 64 and 72 °C. This increase in ΔH indicates that the lipids stabilize the protein. When cholesterol was added (10 to 40 mol%), the first transition shifted to a lower temperature, around 35 °C. These results confirm that cholesterol influences the packing and fluidity of the lipid bilayer, and that changes in the lipid microenvironment alter the thermostability as well as the activity of Na,K-ATPase. Financial support: FAPESP.

We have undertaken to study the structure and function of peroxisomal multifunctional enzyme type 2 (MFE-2) from different organisms. MFE-2 is a key enzyme in the breakdown of long- and branched-chain fatty acids in peroxisomes. It contains two enzymes within the same polypeptide and comprises differing numbers of domains depending on the species. The crystal structure and enzyme kinetics of Drosophila melanogaster MFE-2 have revealed the domain assembly and raised the question of whether a substrate-channeling mechanism exists. Small-angle X-ray scattering studies have further resolved the assembly of domains in human MFE-2. Mutations in the MFE-2-coding gene in humans may cause D-bifunctional protein deficiency, a metabolic disease characterized by accumulation of fatty acyl-CoA intermediates due to inactive or residually active MFE-2 protein. We have also studied the structure, stability and dynamics of such mutant proteins, both experimentally and in silico. The latest results of all these studies will be presented.

FtsZ is a protein that plays a key role in bacterial division, forming a protein ring directly related to the constriction of the membrane. This process has been observed to occur without the help of molecular motors.
Nonetheless, the details of the self-assembly and subsequent force generation of the septal ring are still obscure. AFM observation allows the behaviour of FtsZ solutions on a substrate to be studied with unprecedented resolution, permitting the identification of individual protein filaments. The resulting structures can be compared with Monte Carlo models on a 2D lattice that account for the essential interactions between monomers: a strong longitudinal bond that allows limited flexibility (i.e., curvature of the filaments) and a weaker lateral interaction. The work we present follows this approach, focusing on the latest experiments with FtsZ mutants. By using these mutants it is possible to choose the specific region of the monomer that anchors to the substrate, generating new structures that provide insight into monomer-monomer interactions. In this way we explore the anisotropy of the lateral bond in FtsZ, a factor that has not been taken into account before but may prove important for FtsZ behaviour in vivo.

Modeling protein structures and their complexes with limited experimental data
Dominik Gront, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland

Conventional methods for protein structure determination require collecting huge amounts of high-quality experimental data. In many cases the data (possibly fragmentary and/or ambiguous) cannot on their own discriminate between alternative conformations, and a unique structure cannot be determined. Small-angle X-ray scattering is an example of such a "weak" experiment: the spectrum encodes only a few independent degrees of freedom, providing a global description of the molecular geometry in a very condensed form.
In this contribution we used both local information obtained from NMR measurements and the global description of a macromolecule given by a SAXS profile, combined with a knowledge-based biomolecular force field, to determine the tertiary and quaternary structure of model protein systems. The SAXS curve, as well as various kinds of local NMR data such as isotropic chemical shifts and their tensors, J-couplings, RDC, backbone NOE and REDOR from solid-state NMR, are parsed with the "experimental" module of the BioShell toolkit and used by the Rosetta modeling suite to generate plausible conformations. The results obtained show that the new protocol is capable of delivering very accurate models.

NOEnet: use of NOE networks for NMR resonance assignment of proteins with known 3D structure
Dirk Stratmann, Carine van Heijenoort and Eric Guittet

Structural genomics programs today yield an increasing number of protein structures, obtained by X-ray diffraction, whose functions remain to be elucidated. NMR plays a crucial role here through its ability to readily identify binding sites in complexes or to map dynamic features onto the structure. An important limiting step in NMR is the often tedious assignment of the spectra. For proteins whose 3D structures are already known, matching experimental and back-calculated data allows a straightforward assignment of the NMR spectra. We developed NOEnet, a structure-based assignment approach. It is based on a complete search algorithm that is robust against assignment errors, even for sparse input data. It allows functional studies, such as modeling of protein complexes or protein dynamics studies, for proteins as large as 28 kDa. Almost any type of additional restraint can be used as a filter to speed up the procedure or to restrict the assignment ensemble.
Since NOEnet is mainly based on NMR data (NOEs) orthogonal to those used in triple-resonance experiments (J-couplings), its combination with even a low number of ambiguous J-coupling-based sequential connectivities yields a high-precision assignment ensemble.

We observed that T. thermophilus isopropylmalate dehydrogenase (IPMDH) has higher rigidity and lower enzyme activity at room temperature than its mesophilic counterpart from E. coli, while the two enzymes have nearly identical flexibilities under their respective physiological working conditions. This suggests that evolutionary adaptation tends to maintain optimum activity by adjusting towards a "corresponding state" of conformational flexibility. To reveal the nature of the conformational flexibility change related to enzymatic activity, we designed a series of mutations involving non-conserved prolines specific to the thermophilic IPMDH. Proline-to-glycine mutations substantially increased conformational flexibility and decreased conformational stability. The mutant enzyme variants did not show enhanced catalytic activity, but the non-Arrhenius temperature dependence of enzyme activity seen in the wild type was abolished. This phenomenon underlines the fact that the delicate balance between flexibility, stability and activity required for the environmental adaptation of enzymes can easily be disrupted by mutations even distant from the active site, providing further evidence that the optimization of proper functional motions is also a selective force in the evolution of enzymes.

The kinetoplastids Trypanosoma brucei, T. cruzi and Leishmania major are responsible for great morbidity and mortality in developing countries. The all-α-helical dimeric dUTPases from these organisms represent promising drug targets owing to their essential nature and their markedly different structural and biochemical properties compared with the trimeric human enzyme.
To aid the development of dUTPase inhibitors, we have been structurally characterizing the enzymes from these species. Here we present the structure of the T. brucei enzyme in open and closed conformations, completing the view of the enzymes from the kinetoplastids. Furthermore, we sought to probe the reaction mechanism of this family of enzymes: a mechanism has been proposed on the basis of previous structural work but has not received further verification. The proposed scheme is similar to that of the trimeric enzyme but differs in detail. Using tryptophan fluorescence quenching in the presence of the transition-state mimic AlF3, we have been able to identify the likely transition state of the reaction. The crystal structure of the T. brucei enzyme in complex with this transition-state analogue confirms the nature of the nucleophilic attack, clearly showing how it differs from the trimeric enzymes.

The structure of the factor H-C3d complex explains regulation of the immune complement alternative pathway

Circular dichroism (CD) spectroscopy is a widely used technique for studying the secondary structure (SS) of proteins. Numerous algorithms have been developed for estimating the SS composition from CD spectra. Although these methods give more or less accurate estimates for proteins rich in α-helical structure, they often fail to provide acceptable results for mixed or β-rich proteins. The problem arises from the diversity of β-structures, which is thought to be an intrinsic limitation of the technique. The worst predictions are obtained for proteins with unusual β-structures and for amyloid fibrils. Our aim was to develop a new algorithm for more accurate estimation of SS contents across a broader range of protein folds, with special interest in amyloid fibrils. Using synchrotron radiation CD (SRCD), we were able to collect high-quality spectra of amyloid fibrils with good S/N ratios down to 175 nm.
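The estimation step that such algorithms share can be posed as a constrained linear inverse problem: the measured spectrum is approximated by a non-negative combination of reference basis spectra, one per SS class. A minimal sketch of that generic idea (with invented Gaussian basis shapes, not the authors' algorithm or any real reference set):

```python
# Toy CD secondary-structure estimation: spectrum ~ B @ f with f >= 0.
# Basis shapes and the "unknown" composition are invented for illustration.
import numpy as np
from scipy.optimize import nnls

wl = np.linspace(175, 260, 86)            # wavelength grid, nm
helix = -np.exp(-((wl - 222) / 8) ** 2) - np.exp(-((wl - 208) / 6) ** 2)
sheet = -0.5 * np.exp(-((wl - 217) / 9) ** 2) + 0.8 * np.exp(-((wl - 195) / 7) ** 2)
coil = -0.9 * np.exp(-((wl - 198) / 8) ** 2)
B = np.column_stack([helix, sheet, coil])  # columns = reference spectra

true_f = np.array([0.6, 0.3, 0.1])         # "unknown" SS composition
spectrum = B @ true_f                      # synthetic measured spectrum

f, _ = nnls(B, spectrum)                   # non-negative least squares
f /= f.sum()                               # report fractions summing to 1
print(np.round(f, 2))                      # recovers the input composition
```

Real reference sets differ mainly in which spectra form the columns of B; extending them with β-rich and amyloid spectra, as described above, is what widens the range of folds the inversion can resolve.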
The novel reference dataset, with spectra that differ significantly from those in existing reference sets, extends the information content available for SS determination. Our algorithm takes into account the diverse twist of β-sheets, which has a great influence on the spectral features. For the first time, parallel and antiparallel β-structure can be reliably distinguished using CD spectroscopy.

Monitoring the assembly of the membrane protein insertase
Alexej Kedrov (1), Marko Sustarsic (2), Arnold J.M. Driessen (1)
(1) Groningen Biomolecular Sciences and Bioengineering Institute, University of Groningen, The Netherlands; (2) University of Oxford, UK

The molecular forces that govern membrane protein integration and folding remain a major question in molecular biology and biophysics. Each nascent polypeptide chain must acquire its unique three-dimensional folded state within a complex environment formed by the anisotropic lipid membrane and the membrane-water interface. The SecYEG translocase and members of the recently described YidC/Oxa1/Alb3 chaperone family are recognized as primary players in membrane protein genesis. These proteins, so-called insertases, serve as membrane-embedded molecular pores into which the newly synthesized protein is loaded prior to its release into the bilayer. Here we apply fluorescence correlation spectroscopy to monitor the assembly of insertase:ribosome:nascent polypeptide chain complexes in solution and reconstituted into nanodiscs and model membranes. The results provide insights into the molecular mechanisms and dynamics of insertase function.

Conformational changes during GTPase-activity-induced self-assembly of human guanylate binding protein 1 revealed by EPR spectroscopy

Correct assembly and regulation of multi-component molecular machines is essential for many biological activities.
The type III secretion system (T3SS) is a complex molecular machine that is a key virulence determinant for important Gram-negative pathogens, including Shigella, Yersinia and Salmonella species [1, 2]. The T3SS consists of multiple copies of ~25 different proteins (totalling ~7 MDa), spans both bacterial membranes and drives the insertion of a contiguous pore into the host-cell membrane. Virulence factors are secreted through this apparatus directly into the host cell. In all T3SS, various levels of regulation occur, with switching between secretion-off and secretion-on states overlaid on control of which substrates are secreted. Genes involved in a variety of these switches have been identified, but the molecular mechanisms underlying their functions are poorly understood. We are studying the T3SS of Shigella flexneri, the causative agent of dysentery, and will present the structure of the so-called "gatekeeper protein" MxiA.

Diacylglycerol acyltransferase 1 (DGAT1) is an integral protein of the endoplasmic reticulum membrane that plays an essential role in triacylglyceride synthesis. In cattle, this enzyme is associated with regulation of the fat content of milk and meat. In this study, synthetic peptides corresponding to the two DGAT1 binding sites (SIT1 and SIT2) were designed, purified and employed to investigate the interaction of the enzyme with substrates and membrane models. Different binding specificities were noted in the interaction with phospholipid vesicles and micelles: SIT1 bound more strongly to nonpolar membrane models, while SIT2 was electrostatically attracted to negative phospholipid surfaces. The binding of both peptides was accompanied by significant conformational changes (such as an unordered-to-helix transition) in circular dichroism spectra and a 20 nm blue shift in fluorescence emission. The binding of the SIT1 and SIT2 peptides to negative liposomes gave dissociation constants (Kd) of 170 and 0.44 µM, respectively, and a leakage activity 24-fold higher for SIT2.
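To put the roughly 400-fold difference in Kd into perspective, a simple one-site binding model with lipid in excess gives the fraction of peptide bound as f = [L] / (Kd + [L]). A short sketch; the accessible lipid concentration below is purely illustrative, not a value from the study:

```python
# Fraction of peptide bound under a simple one-site binding model,
# f = [L] / (Kd + [L]), assuming lipid in large excess over peptide.
def fraction_bound(lipid_uM, kd_uM):
    return lipid_uM / (kd_uM + lipid_uM)

kd_sit1, kd_sit2 = 170.0, 0.44   # uM, dissociation constants from the abstract
lipid = 10.0                     # uM accessible lipid (illustrative value)

f1 = fraction_bound(lipid, kd_sit1)   # SIT1: mostly free at this lipid level
f2 = fraction_bound(lipid, kd_sit2)   # SIT2: almost fully bound
print(round(f1, 3), round(f2, 3))
```

At 10 µM accessible lipid the model leaves SIT1 largely unbound while SIT2 is nearly saturated, consistent with the electrostatically driven, high-affinity binding described for SIT2.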
This difference in specificity is related to the features of the putative substrates (acyl-CoAs and diacylglycerol) and can be attributed to the distinct role of each DGAT1 binding site during lipid synthesis. Supported by FAPESP.

FtsZ, the bacterial homologue of tubulin, assembles into polymers in the bacterial division ring. The interfaces between monomers include a GTP molecule, although the relationship between polymerization and GTPase activity is still controversial. A set of short FtsZ polymers was modelled and the formation of active GTPase structures was monitored using molecular dynamics. Only the interfaces nearest the polymer ends exhibited a geometry adequate for GTP hydrolysis. Conversion of a mid-polymer interface into a close-to-end interface resulted in its spontaneous rearrangement from an inactive to an active conformation.

Fluorescent proteins (FPs) have become extremely valuable tools in the life sciences. Owing to the latest advances in light microscopy, there is a steady need for FPs with improved spectral properties. mIrisFP is a monomeric FP that can be switched reversibly between a bright green fluorescent state and a dark state by illumination with light of specific wavelengths [1]. Structurally, this photoswitching is based on a cis-trans isomerization of the chromophore. Upon illumination with violet light, mIrisFP can be irreversibly photoconverted from the green-emitting to a red-emitting form. The red form can again be switched reversibly between a fluorescent and a dark state. To elucidate the mechanistic details of the photoinduced reactions, we generated mIrisGFP1. This variant can still undergo reversible photoswitching but lacks the ability to photoconvert to the red state, so that the photoinduced transitions of the green form can be studied without 'artifacts' due to green-to-red photoconversion. Using UV/visible spectroscopy, we have characterized the on- and off-switching processes in great detail.
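Reversible photoswitching of this kind is often summarized by a two-state kinetic scheme in which illumination drives both on- and off-switching. A minimal sketch with invented rate constants (not measured values for mIrisGFP1):

```python
# Two-state photoswitching under constant illumination:
#   d[on]/dt = k_on * (1 - on) - k_off * on
# which relaxes exponentially to on_ss = k_on / (k_on + k_off)
# with observed rate k_on + k_off. Rates here are illustrative.
import math

k_on, k_off = 0.8, 0.2           # 1/s, effective photoswitching rates
on_ss = k_on / (k_on + k_off)    # steady-state on-fraction = 0.8

def on_fraction(t, on0=1.0):
    """Analytic solution of the two-state rate equation."""
    return on_ss + (on0 - on_ss) * math.exp(-(k_on + k_off) * t)

# Starting all-on, the population relaxes toward the 0.8 steady state.
print(round(on_fraction(0.0), 3), round(on_fraction(10.0), 3))
```

Fitting such exponentials to switching curves at different wavelengths and intensities is one standard way to extract effective rate constants from UV/visible data; the intertwined pathways described below make the real kinetics richer than this two-state picture.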
Several light-activated reaction pathways have been identified. They are highly intertwined, so that the net effect achieved with light of a particular wavelength depends on the relative probabilities of photoinducing the various processes.

Phosducin (Pd) is a Gtβγ-binding protein that is highly expressed in photoreceptors. Pd is phosphorylated in the dark-adapted retina and is dephosphorylated in response to light. Dephosphorylated Pd binds the Gt protein βγ-heterodimer with high affinity and inhibits its interaction with Gtα or other effectors, whereas phosphorylated Pd does not. Pd therefore down-regulates the light response in photoreceptors. Phosphorylation of Pd at S54 and S73 leads to binding of the 14-3-3 protein. The 14-3-3 proteins function as scaffolds modulating the activity of their binding partners, and their role in Pd regulation is still unclear. Binding of the 14-3-3 protein may serve to sequester phosphorylated Pd from Gtβγ, or to decrease the rate of Pd dephosphorylation and degradation. We performed several biophysical studies of the 14-3-3:Pd complex. Analytical ultracentrifugation was used to determine the stoichiometry and dissociation constant of the complex. Conformational changes of Pd induced both by the phosphorylation itself and by 14-3-3 binding were studied using time-resolved fluorescence spectroscopy techniques.

Mössbauer spectroscopy with high velocity resolution was used for a comparative study of various oxyhemoglobins, analysing the electronic structure of the heme iron and the protein structure-function relationship. Samples of pig, rabbit and normal human oxyhemoglobins, and oxyhemoglobins from patients with chronic myeloleukemia and multiple myeloma, were measured at 90 K using a Mössbauer spectrometric system with high velocity resolution.
The Mössbauer spectra were fitted using two models: a single quadrupole doublet (a model of equivalent iron electronic structure in the α- and β-subunits of hemoglobin) and a superposition of two quadrupole doublets (a model of non-equivalent iron electronic structure in the α- and β-subunits). In both models, small variations of the Mössbauer hyperfine parameters (quadrupole splitting and isomer shift) were observed for normal human, rabbit and pig oxyhemoglobins and were related to differences in heme iron stereochemistry and oxygen affinity. Small variations of the hyperfine parameters for the oxyhemoglobins from patients were related to possible variations in heme iron stereochemistry and function.

The different types of silk produced by orb-weaving spiders display various mechanical properties to fulfil diverse functions. For example, the dragline silk produced by the major ampullate glands exhibits high toughness arising from a good trade-off between stiffness and extensibility. On the other hand, the flagelliform silk of the capture spiral of the web is highly elastic, owing to the presence of proline and glycine residues. These properties are dictated entirely by the structural organization of the fiber (crystallinity, degree of molecular orientation, secondary structure, microstructure), which in turn results from the protein primary structure and the spinning mechanism. Although the spinning process of dragline silk is beginning to be understood, the molecular events occurring in the secretory glands and leading to the formation of the other silk fibers are unknown, mainly owing to a lack of information regarding their initial and final structures. Taking advantage of the efficiency of Raman spectromicroscopy for investigating micrometer-sized biological samples, we have determined the conformation of the proteins in the complete set of glands of the orb-weaving spider Nephila clavipes, as well as in the fibers spun from these glands.
The structure of the RGS domain of RGS3 was solved at 2.3 Å resolution. The stoichiometry of the 14-3-3ζ/RGS3 protein complex was elucidated using analytical ultracentrifugation. To map the interaction between 14-3-3ζ and RGS3, we performed a wide range of biophysical measurements: H/D exchange and cross-linking experiments coupled to mass spectrometry, time-resolved FRET (Förster resonance energy transfer) experiments, time-resolved tryptophan fluorescence spectroscopy and SAXS (small-angle X-ray scattering) measurements. Based on all these results, we built a 3D model of the 14-3-3ζ/RGS3 complex. Our model reveals new details of the architecture of complexes formed by 14-3-3 proteins. To date, all known structures of 14-3-3 protein complexes have suggested that the ligand docks in the central channel of the 14-3-3 protein. Our results indicate instead that the RGS domain of RGS3 is located outside the central channel of 14-3-3ζ, interacting with less-conserved residues of 14-3-3ζ.

The receptor for advanced glycation end-products (RAGE) is a multiligand cell-surface receptor involved in various human diseases. The major alternative splice product of RAGE comprises its extracellular region, which occurs as a soluble protein (sRAGE). Although the structures of the sRAGE domains were available, their assembly into the functional full-length protein remained unknown. Here we employed synchrotron small-angle X-ray scattering to characterize the solution structure of human sRAGE. The protein revealed concentration-dependent oligomerization behaviour, which was also mediated by the presence of Ca2+ ions. Rigid-body models of monomeric and dimeric sRAGE were obtained from scattering data recorded under different solvent conditions. The monomer displays a J-like shape, while the dimer is formed through association of the two N-terminal domains and has an elongated structure.
The results provide insight into the assembly of (i) the sRAGE:RAGE heterodimer, which is responsible for blockage of receptor signalling, and (ii) the RAGE homodimer, which is necessary for signal transduction, paving the way for the design of therapeutic strategies for a large number of different pathologies.

ClpB is a hexameric AAA+ ATPase that extracts unfolded polypeptides from aggregates by threading them through its central pore. The contribution of the coiled-coil M domains is fundamental to the functional mechanism of this chaperone, yet their location within the protein structure is contradictory among previous structural models. We present a cryo-electron microscopy structural analysis of ClpB from E. coli in several nucleotide states. The study reveals a novel architecture for ClpB and shows that the M domains form an internal scaffold located in the central chamber of the ClpB hexamer. This inner structure transmits local signals arising from ATP binding and hydrolysis by the AAA+ domains. Surprisingly, the coiled-coil M domains are seen to bend significantly around a hinge region that separates two structural motifs. Our results present a new framework for understanding ClpB-mediated protein disaggregation.

Streptomyces clavuligerus isoenzymes involved in clavulanic acid biosynthesis: a structural approach

Clavulanic acid (CA) is a potent β-lactamase inhibitor produced by Streptomyces clavuligerus. N2-(2-carboxyethyl)arginine synthase (CEAS) and proclavaminate amidinohydrolase (PAH) catalyze the initial steps in the biosynthesis of CA. Recently, the CEAS1 and PAH1 genes (paralogues of CEAS and PAH) were linked to CA biosynthesis, but their products have not yet been studied. Here we present an initial structural analysis of CEAS1 and PAH1 using biophysical techniques. The PAH1 and CEAS1 genes were isolated from the genomic DNA of S. clavuligerus and overexpressed in E. coli.
The recombinant proteins were purified by affinity chromatography and analyzed by size-exclusion chromatography, non-denaturing PAGE, dynamic light scattering, far-UV circular dichroism (CD) and fluorescence spectroscopy. Our results showed that PAH1 and CEAS1 were obtained as a hexamer and a dimer, respectively. Both proteins showed α/β folding and were stable up to 35 °C. Above this temperature protein unfolding was observed, but complete unfolding did not occur even at 100 °C. Moreover, CEAS1 and PAH1 proved stable over a wide pH range (pH 5.5-9.5). We are currently working on improving the CEAS1 crystals, a promising step towards elucidation of the CEAS1 structure. Supported by FAPESP.

• Synchrotron radiation circular dichroism
• Mass spectrometry following VUV photoionisation
• Fluorescence imaging with lifetime and spectral measurements

Here we present the SRCD experiment. A high photon flux of 10^10 photons/s, improved detector performance and user-orientated software developments have proven to guarantee successful data collection, considerably increasing the information content obtained. Exploration of the charge-transfer region of the peptide bonds is adding specifically new insights. Sample volumes as low as 2 µL per spectrum, together with convenient sample-chamber handling, allow economic and efficient data collection. A typical spectral acquisition from 280 to 170 nm lasts 9 min for three scans with a 1 nm step size. Prior to higher-resolution techniques, SRCD spectra can answer questions about the folding states of macromolecules, including DNA, RNA and sugar macromolecules, as well as their complexes with proteins, especially membrane proteins.

Sporulation in Bacillus subtilis begins with an asymmetric cell division producing a smaller cell called the forespore, which initially lies side by side with the larger mother cell.
In a phagocytosis-like event, the mother cell engulfs the forespore so that the latter is internalised as a cell within a cell. Engulfment involves the migration of the mother-cell membrane around the forespore until the leading edges of this engulfing membrane meet and fuse. This releases the forespore, now surrounded by a double membrane, into the mother-cell cytoplasm. Membrane migration during engulfment is facilitated by the interacting proteins SpoIIQ and SpoIIIAH, which are membrane-associated and expressed in the forespore and the mother cell, respectively [1]. They interact in the intercellular space and function initially as a molecular zipper; later they participate in a more elaborate complex in which SpoIIQ and SpoIIIAH are integral components of an intercellular channel. This channel is a topic of much current interest, having initially been proposed as a conduit for the passage from the mother cell to the forespore of a specific, but putative, regulator of the RNA polymerase sigma factor σG [2], and later as a gap-junction-like feeding tube [3] through which the mother cell supplies molecules for the biosynthetic needs of the forespore. Here we present data on the structure and interactions of SpoIIQ and SpoIIIAH obtained from biophysical methods and protein crystallography. These data lead to a plausible model for the intercellular channel.

The glycine receptor (GlyR) is a chloride-permeable ligand-gated ion channel that can mediate synaptic inhibition. Because of a possible involvement in the pathophysiology of temporal lobe epilepsy, the different properties of GlyRs containing the alpha3L and alpha3K subunit isoforms are currently being investigated. Previous characterizations of homomeric receptors consisting of these isoforms have shown differences in their electrophysiological properties and in their membrane distribution as observed by diffraction-limited fluorescence microscopy.
We studied these isoforms, expressed separately in transfected HEK 293T cells, using single-molecule tracking (SMT) in living cells and direct stochastic optical reconstruction microscopy (dSTORM) on fixed cells. For both techniques the GlyRs were stained using a primary antibody directly labeled with Alexa Fluor dyes. The dSTORM experiments support the observation that alpha3L GlyRs are clustered, while alpha3K GlyRs are spread more uniformly. Analysis of the short-range diffusion coefficients obtained by SMT reveals heterogeneous motion for both isoforms: the K-isoform has a higher fraction of fast diffusion, whereas the L-isoform is more associated with slow diffusion and appears to undergo hindered diffusion.

Since nanoparticles are suitable for tumor therapy owing to their passive targeting to cancer cells via the enhanced permeability and retention effect [1], it is important to understand the mechanisms of their delivery into living cancer cells. To this end, we have developed a modular spectral imaging system based on a white-light spinning-disk confocal fluorescence microscope and a narrow tunable emission filter. First, the interaction of polymer nanoparticles and cells labeled with spectrally overlapping probes was examined. The use of fluorescence microspectroscopy (FMS) allowed co-localization, which showed that the size of polymer nanoparticles strongly influences their transfer across the cell plasma membrane. Next, the delivery of liposomes (composed of the cancerostatic alkylphospholipid (OPP) and cholesterol) labeled with an environment-sensitive fluorescent probe was monitored. We were able to detect a very small shift in the emission spectra of cholesterol-poor OPP liposomes inside and outside the cells, which would not have been possible without FMS. This shift implies that the delivery of these liposomes into cancer cells is based on fusion with the cell membrane [2].
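Short-range diffusion coefficients such as those from the SMT analysis above are commonly estimated from the initial slope of the mean-squared displacement, MSD(t) = 4Dt for free diffusion in two dimensions. A sketch on a synthetic random walk (diffusion coefficient and frame time are invented, not GlyR values):

```python
# Estimate D from the one-lag mean-squared displacement of a simulated
# 2D Brownian trajectory: MSD(dt) = 4 * D * dt for free diffusion.
import numpy as np

rng = np.random.default_rng(0)
D, dt, n = 0.1, 0.01, 200_000           # um^2/s, s per frame, steps (invented)
steps = rng.normal(0.0, np.sqrt(2 * D * dt), size=(n, 2))
track = np.cumsum(steps, axis=0)        # simulated trajectory positions

# one-lag MSD estimate -> D_hat = MSD(dt) / (4 * dt)
msd1 = np.mean(np.sum(np.diff(track, axis=0) ** 2, axis=1))
D_hat = msd1 / (4 * dt)
print(round(float(D_hat), 3))           # close to the input D = 0.1
```

For hindered or heterogeneous motion, as reported for the L-isoform, the MSD grows sublinearly at longer lags, which is why short-range (small-lag) estimates are used to classify fast and slow populations.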
High-resolution optical imaging techniques now make the detection of nanofeatures in bio- and soft matter accessible with non-ionizing visible radiation. However, high-resolution imaging depends critically on the fluorescent probes used to report on the nano-environment. Building on our long-standing interest in the development of fluorescent probes, we set out to design and engineer new fluorescent systems for nanoscale imaging and sensing of biological specimens and soft matter. These fluorophores report on fast, subtle changes of their nanoscale environment in the excited state and are meant to fulfil the following requirements: (a) optical responses (intensity, wavelength shift, lifetime, anisotropy) predictably related to environmental polarity, viscosity or macromolecular structure; (b) high brightness, allowing single-molecule detection; (c) easy conjugation to biomolecules or macromolecules of interest. Notably, we aim to combine these properties with the capability of nanoscopy imaging based on stimulated emission depletion or stochastic optical reconstruction microscopy. In this lecture the main features and applications of the engineered probes will be reviewed, and future developments in this exciting field will be discussed.

Foamy virus (FV) is an atypical retrovirus that shares similarities with HIV and hepatitis B viruses. Despite numerous biochemical studies, its entry pathway, namely membrane fusion versus endocytosis, remains unclear. To tackle this issue, dual-color fluorescent viruses were engineered with a GFP-labeled capsid and an mCherry-labeled envelope. Using high-resolution 3D imaging and 3D single-virus tracing, we followed the entry of the fluorescent viruses into living cells with a precision of 30 nm in the plane and 40 nm along the optical axis. To distinguish between the two possible pathways, we developed a novel colocalization analysis method that determines the moment along every single trace where the colors separate, i.e.
the fusion event. Combining this dynamic colocalization information with the instantaneous velocity of the particle and its position within the reconstructed 3D cell shape allows us to determine whether the separation of capsid and envelope happens at the cell membrane or in endosomes. We then compared two types of FV and demonstrated, consistently with previous pH-dependency studies, that the prototype FV can enter the cell by both endocytosis and membrane fusion, whereas the simian FV was only observed to fuse after endocytosis.

Phosphatidylinositol 4,5-bisphosphate (PI(4,5)P2) is a minor component of the plasma membrane known to be a critical agent in the regulation of synaptic transmission. Clustering of PI(4,5)P2 in synaptic active zones is important for synaptic transmission. However, PI(4,5)P2 does not spontaneously segregate in fluid lipid membranes, and another mechanism must be responsible for the lateral segregation of this lipid in active zones. Clustering of PI(4,5)P2 is expected to be associated with lipid-protein interactions and possibly with partitioning into lipid rafts in the plasma membrane. Here we analyze the influence of protein palmitoylation on the formation of PI(4,5)P2 clusters and on synaptic protein-PI(4,5)P2 interactions by means of Förster resonance energy transfer measurements using fluorescence lifetime imaging (FRET-FLIM) and FRET confocal microscopy.

During sporulation, an entire chromosome is transferred into the forespore. The process starts with the formation of an asymmetrically located division septum, which creates two unequally sized compartments: a large mother cell and a smaller forespore. The septum traps about 30% of the chromosome destined for the forespore; the remainder (~3 Mbp) is then translocated from the mother cell into the forespore by an active mechanism involving the SpoIIIE DNA translocase.
the mechanisms of translocation, particularly the control of the directionality, still remain unknown and various models have been proposed so far. since each model predicts a very different distribution of spoiiie proteins at the sporulation septum, we used palm microscopy (photoactivated localization microscopy) to investigate protein localization in live sporulating bacteria. using this technique, we showed that spoiiie proteins form a single tight focus at the septum with a characteristic size of around 30 nm. more surprisingly, the focus is usually localized in the mother cell compartment and the mean distance between the spoiiie focus and the septum is 35 nm. our data suggest that during the translocation process, spoiiie proteins form stable complexes only on the mother cell side, thus allowing control of the chromosome translocation from the mother cell to the forespore. morphogenetic gradients determine cell identity in a concentration-dependent manner and do so in a way that is both incredibly precise and remarkably robust. in order to understand how they achieve this feat, one needs to establish the sequence of molecular mechanisms starting with morphogen gradient formation and leading to the expression of downstream target genes. in fruit flies, the transcription factor bicoid (bcd) is a crucial morphogen that forms an exponential concentration gradient along the embryo ap axis and turns on cascades of target genes in distinct anterior domains. we measured bcd-egfp mobility in live d. melanogaster embryos using fluorescence correlation spectroscopy and fluorescence recovery after photobleaching. we found that bcd-egfp molecules had a diffusion coefficient on the order of ≈7 μm²/s during nuclear cycles 12-14, both in the cytoplasm and in nuclei. this value is large enough to explain the stable establishment of the bcd gradient simply by diffusion.
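as a rough plausibility check (not part of the abstract): a morphogen produced at one pole, diffusing with coefficient d and degraded with lifetime τ, forms a steady-state exponential gradient with decay length λ = sqrt(d·τ). taking the measured d ≈ 7 μm²/s and a hypothetical protein lifetime of ~25 min (an assumption, not a reported value), λ lands at the ~100 μm scale of the bcd gradient:

```python
import math

def gradient_length(D_um2_per_s: float, lifetime_s: float) -> float:
    """Decay length of a steady-state diffusion-degradation gradient:
    lambda = sqrt(D * tau)."""
    return math.sqrt(D_um2_per_s * lifetime_s)

# Measured Bcd-EGFP diffusion coefficient (~7 um^2/s, from the abstract)
# and a hypothetical lifetime of ~25 min (assumption).
lam = gradient_length(7.0, 25 * 60)
print(f"decay length ~ {lam:.0f} um")
```

with these numbers λ ≈ 100 μm, consistent with diffusion being fast enough to establish the gradient.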
on the other hand, in the context of the extremely precise orchestration of the transcription of the hunchback bcd target gene, it is too slow to explain how a precise reading of bicoid concentration could be achieved at each interphase without the existence of a memorization process. single molecule studies of key processes during the initiation of the innate and adaptive immune response. the two pillars of the vertebrate immune system are the innate and adaptive immune responses, which confer resistance to pathogens and play a role in numerous diseases. here we exploit single molecule fluorescence imaging on live cells to study the key molecular processes that underpin these responses. the first project looks at the changes in the organisation of toll-like receptor 4 (tlr4) on the cell surface of macrophages upon activation via lipopolysaccharide (lps), as it is currently not known whether a higher level of tlr4 organisation is required for the signalling process. macrophages natively express tlr4 at a low level, which allows oligomerisation to be analysed in live cells by dynamic single molecule colocalisation (dysco) using data obtained by total internal reflection fluorescence (tirf) microscopy. the experiments of the second project aim at determining the critical initial events in t-cell triggering by labelling key proteins like the tcr receptor and cd45 on the surface of live t-cells and following how their spatial distribution changes following the binding of the t-cell to a surface. this enables us to distinguish between the different models of t-cell triggering, which are based on aggregation, segregation or a conformational change of the tcr. the study of cells using scanning force microscopy. the motility of unicellular parasites in mammals seems very interesting, yet very complex. in a world where inertia cannot be used for propulsion, in a world at low reynolds numbers, most of our everyday strategies of self-propulsion do not work.
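the low-reynolds-number regime can be checked with a back-of-the-envelope estimate (the values below are illustrative assumptions, not from the abstract): a swimmer of length ~20 μm moving at ~20 μm/s in water gives re ≈ 4×10⁻⁴, so viscous forces dominate completely and inertial coasting is impossible.

```python
def reynolds(density_kg_m3: float, speed_m_s: float, length_m: float,
             viscosity_pa_s: float) -> float:
    """Reynolds number Re = rho * v * L / eta."""
    return density_kg_m3 * speed_m_s * length_m / viscosity_pa_s

# Illustrative values for a micron-scale swimmer in water (assumptions):
# body length ~20 um, swimming speed ~20 um/s.
re = reynolds(1000.0, 20e-6, 20e-6, 1e-3)
print(f"Re = {re:.1e}")  # far below 1: inertia is negligible
```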
one class of parasites that knows its way around, the flagellate trypanosome, manages not only to survive in the blood stream, which is a lot faster than its own propulsion velocity and where the trypanosome is constantly attacked by its host's immune response, but also to penetrate the blood-brain barrier, which actually should be too tight to enter. even though trypanosomes have been known for more than 100 years, their motility behaviour is not completely elucidated yet. now, using high-speed dark-field microscopy in combination with optical tweezers in microfluidic devices and analyzing the recorded data, new light has been shed on the motility of these parasites. astonishing results show that trypanosomes are very well adapted to their host's environment; they can even exploit red blood cells for their self-propulsion and use the bloodstream itself to drag antibodies bound to their surface to their cell mouth, where the antibodies are endocytosed and digested. the first part of the presentation will discuss nanoparticle (quantum dot, qd) biosensors and nanoactuators that exploit novel and unusual fret phenomena in the induction/detection of protein aggregation [1] , reversible on-off qd photoswitching [2] , and ph sensing [3] . the second part of the presentation will feature the application of an integrated chemical-biological fret approach for the in situ (in/on living cells) detection of conformational changes in the ectodomain of a receptor tyrosine kinase (the receptor for the growth factor egf) induced by ligand binding [4] . the measurements were conducted with a two-photon scanning microscope equipped with tcspc detection; novel methods for lifetime analysis and interpretation were employed to confirm the concerted domain rearrangements predicted from x-ray crystallography. the study of protein-protein interactions in vivo is often hindered by the limited acquisition speed of typical instrumentation used, for instance, for lifetime imaging microscopy.
fluorescence anisotropy is altered by the occurrence of förster resonance energy transfer (fret), and anisotropy imaging was shown to be comparatively fast and simple to implement. here, we present the adaptation of a spinning disc confocal microscope for fluorescence anisotropy imaging that allowed us to achieve in vivo imaging at high spatial and temporal resolution. we demonstrate the capabilities of this system and in-house developed analysis software by imaging living caenorhabditis elegans expressing constitutive dimeric and monomeric proteins that are tagged with gfp. measuring intracellular viscosity: from molecular rotors to photodynamic therapy of cancer. marina k. kuimova, department of chemistry, imperial college london, exhibition road, sw7 2az, uk. viscosity is one of the main factors which influence diffusion in condensed media and is crucial for cell function. however, mapping viscosity on a single-cell scale is a challenge. we have imaged viscosity inside live cells using fluorescent probes, called molecular rotors, in which the speed of rotation about a sterically hindered bond is viscosity-dependent [1] [2] [3] . this approach enabled us to demonstrate that the viscosity distribution in a cell is highly heterogeneous and that the local microviscosity in hydrophobic cell domains can be up to 100× higher than that of water. we demonstrated that the intracellular viscosity increases dramatically during light-activated cancer treatment, called photodynamic therapy (pdt) [2] . we also demonstrated that the ability of a fluorophore to induce apoptosis in cells during pdt [4] , or to act as a benign molecular rotor, measuring viscosity, can be controlled by carefully selecting the excitation wavelength in a viscous medium [5] . in the field of biophysics and nanomedicine, the cellular reaction and the kinetics of gene expression after transfection of live cells with plasmid dna or gene-silencing sirna are of great interest.
in a previous study on the transfection kinetics of non-viral gene transfer [1] we realised that the development of single-cell arrays would be a great step towards easy-to-analyse, high-throughput transfection studies. the regular arrangement of single cells would overcome the limitations in image analysis that arise from whole populations of cells. in addition, the analysis of expression kinetics at the single-cell level can help to identify the cell-to-cell variability within a cell population. in order to develop suitable single-cell arrays, we are currently adjusting the different parameters of such a microenvironment (e.g. size, shape, surface functionalisation) in order to end up with a defined environment for single-cell transfection studies. furthermore, we try to find the optimal uptake pathway for each of the different applications. the neurodegenerative disorder alzheimer's disease (ad) causes cognitive impairment such as loss of episodic memory, with ultimately fatal consequences. accumulation and aggregation of two proteins in the brain -amyloid beta and tau -are a characteristic feature. these soluble proteins aggregate during the course of the disease and assemble into amyloid-like filaments. recently it was found that the toxicity of soluble amyloid beta oligomers must also be taken into account for the pathogenesis of cognitive failure in ad. if oligomers are the predominant toxic species, it would be pertinent to determine how they disrupt and impair neuronal function. the prion protein (prp) receptor has been proposed to mediate amyloid beta binding to neuronal cells. we have characterised the interaction of amyloid beta and the prp receptor expressed on hippocampal and neuroblastoma cells at the single-molecule level. we do not detect any colocalisation of either the 40- or 42-amino-acid variants with the prp receptor.
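single-molecule colocalisation analyses of the kind described above typically count, for each localisation in one channel, whether a localisation in the other channel lies within a distance threshold set by the localisation precision. a minimal sketch (the coordinates, function name and threshold are hypothetical, not the authors' pipeline):

```python
import math

def coloc_fraction(spots_a, spots_b, threshold_nm: float) -> float:
    """Fraction of channel-A spots with at least one channel-B spot
    within the distance threshold."""
    if not spots_a:
        return 0.0
    hits = 0
    for ax, ay in spots_a:
        if any(math.hypot(ax - bx, ay - by) <= threshold_nm
               for bx, by in spots_b):
            hits += 1
    return hits / len(spots_a)

# Hypothetical 2D localisations in nm; threshold ~2x localisation precision.
a = [(0, 0), (500, 500), (1000, 0)]
b = [(30, 10), (980, 40)]
print(coloc_fraction(a, b, threshold_nm=60.0))  # 2 of 3 A-spots colocalise
```

a negative control (e.g. one channel mirrored) gives the chance-colocalisation baseline to subtract.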
bacterial biofilms are of the utmost importance in the study of environmental bioremediation and the design of materials for medical applications. the mechanisms that govern cell adhesion must be analysed from the physics point of view in order to obtain quantitative descriptors. the genus rhodococcus is widely spread in natural environments. the species are metabolically diverse and thus they can degrade a wide range of pollutants. due to their high hydrophobicity, these cells are very resistant to harsh conditions, are able to degrade hydrophobic substances (e.g. oil) and attach to high-contact-angle surfaces. in the present study, the hydrophobicity of several strains of rhodococcus is measured and mapped using chemical force microscopy (cfm). cfm relies on the functionalisation of scanning force microscopy (sfm) tips with hydrophobic or hydrophilic groups. in cfm, the microscope is operated in the force-volume mode, which combines adhesion data with topographic images. the careful control of the tip chemistry permits the study of interactions between the functional groups on the tip and the bacterial surface, thus allowing the assessment of hydrophobicity. in order to perform a cfm study, the cells need to be firmly anchored to a substrate under physiological conditions (i.e. under a nutrient medium or a saline buffer). to this end, several adhesive surfaces have been tested in order to find the one that gives the best results. optical microscopy is arguably the most important technique for the study of living systems because it allows 3d imaging of cells and tissues under physiological, minimally invasive conditions. conventional far-field microscopy is diffraction-limited; only structures larger than ≈200 nm can be resolved, which is insufficient for many applications.
recently, techniques featuring image resolutions down to ≈20 nm have been introduced, such as localization microscopy (palm, storm) and reversible saturable optical fluorescence transition microscopy (resolft, sted). these methods are well suited for live-cell imaging and narrow the resolution gap between light and electron microscopy significantly. we have used palm imaging to study the formation and disassembly of focal adhesions of live hela cells in a high-resolution pulse-chase experiment using monomeric irisfp [1] . mirisfp is a photoactivatable fluorescent protein that combines irreversible photoconversion from a green- to a red-emitting form with reversible photoswitching between a fluorescent and a nonfluorescent state in both forms. in our experiments a subpopulation of mirisfp molecules is photoconverted to the red form by irradiating a specified region of the cell with a pulse of violet light. migration of tagged proteins out of the conversion region can be studied by subsequently localizing the proteins in other regions of the cell by palm imaging, now using the photoswitching capability of the red species. real-time image reconstruction developed in our lab [2] allowed instant control of imaging parameters. live cell imaging of cancer cells is often used for in-vitro studies in connection with photodynamic diagnosis and therapy (pdd and pdt). especially in the presence of a photosensitizer, this live cell imaging can only be performed over a relatively short duration (at most 1 hour). this restriction comes from the light-induced cell damage (photodamage) that results from rapid fluorescence photobleaching of the photosensitizer. while these studies reveal exciting results, it takes several hours to discover the detailed effects of the photosensitizer on cell damage. to our knowledge, however, there is no general guideline for modification of the excitation light dose to achieve that.
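the cumulative excitation dose in such pulsed time-lapse experiments is simple arithmetic: intensity × pulse duration × number of frames. a sketch (a hypothetical helper, using plausible values: 6.34 mw/cm² with 13 ms pulses, one frame every 2 min over five hours):

```python
def dose_mj_per_cm2(intensity_mw_cm2: float, pulse_ms: float,
                    n_frames: int = 1) -> float:
    """Cumulative excitation dose: I * t_pulse * n (mW/cm^2 * s = mJ/cm^2)."""
    return intensity_mw_cm2 * (pulse_ms / 1000.0) * n_frames

# 5 h of imaging at one frame every 2 min -> 150 frames.
n = 5 * 60 // 2
total = dose_mj_per_cm2(6.34, 13, n)
print(f"{total:.1f} mJ/cm^2 total over 5 h")
```

keeping this total low while preserving image quality is exactly the trade-off the guideline question above is about.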
in this paper, the relation between excitation light doses, photobleaching of the photosensitizer (pvp-hypericin) and cell vitality is investigated using human lung epithelial carcinoma cells (a549). the strategy of this paper is to reduce the excitation light dose by using a low-power pulsed blue led such that the structures remain visible in time-lapse images. fluorescence signals and image quality are improved by labelling the cells with an additional non-toxic marker called carboxyfluorescein diacetate succinimidyl ester (cfse). in total we collected 2700 time-lapse images (time interval 2 min) of dual-marked a549 cells under three different light intensities (1.59, 6.34 and 14.27 mw/cm²) and a variety of pulse lengths (0.127, 1.29, 13, 54.5 and 131 ms) over five hours. we have found that there is a nonlinear relationship between the amount of excitation light dose and cell vitality. cells are healthy, i.e. they commence and complete mitosis, when exposed to low light intensities and brief pulses of light. light intensities higher than 6.34 mw/cm² together with pulse durations longer than 13 ms often cause cell vesiculation, blebbing and apoptosis. in all other cases, however, we found no cell death. in the future, this striking nonlinearity will be studied in more detail. progressive advances in scanning ion conductance microscopy (sicm) [1] have enabled us to convert an ordinary scanning probe microscope (spm) into a versatile multifunctional technique. as an imaging tool, ion conductance microscopy is capable of delivering the highest topographical resolution on living cell membranes among current microscopy techniques [2] . also, it can visualize surfaces whose complexity makes them impossible to image by other spms [3] .
ion conductance microscopy combined with a battery of powerful methods such as fluorescence resonance energy transfer (fret) [4] , patch-clamp, force mapping, localized drug delivery, nano-deposition and nano-sensing is unique among current imaging techniques. the rich combination of ion conductance imaging with other imaging techniques such as laser confocal and electrochemical [5] will facilitate the study of living cells and tissues at nanoscale. ) and coleoptiles of wheat (triticum aestivum l.) seedlings, which were growing in light and dark conditions, were used to determine fluorescence of whole cells. fluorescence emission spectrum was monitored by fluorescent microscopy using the spectrometer usb 4000. fluorescence intensity f490, f680, f710 and f740 was determined and data was statistically analyzed in annova. we observed that bgf, rf and frf intensity increased in the first leaves with the age of the seedlings. in the coleoptiles was observed great bgf intensity increase with the age of the seedlings. in the coleoptiles decreased rf intensity of the 144 and 196 hours old seedlings, and bgf intensity decreased of 196 hours old seedlings. it was found that emission spectrum and fluorescence intensity changes are induced by the lack of light and salt (nacl) stress. analysis of fluorescence spectrum can quickly and accurately indicate the outset of light and salt stress in plants. there are analogical changes in fluorescence emission spectrum of plant cells in senescence and stress conditions. it was assumed that environmental stress and senescence have common mechanisms in plants. this changes can be monitored by fluorescent microscopy. 
triple-colour super-resolution imaging in living cells. markus staufenbiel, stephan wilmes, domenik lisse, friedrich roder, oliver beutel, christian richter and jacob piehler, universität osnabrück, fachbereich biologie, barbarastraße 11, 49076, germany, markus.staufenbiel@biologie.uniosnabrück.de. super-resolution fluorescence imaging techniques based on single molecule localisation have opened tremendous insight into the sub-micrometre organisation of the cell. live cell imaging techniques such as fluorescence photoactivation localization microscopy (fpalm) are currently limited to dual-colour detection due to the restricted availability of red-fluorescent photoswitchable proteins. we employed photoswitching of the oxazine dye atto655 under reducing conditions for super-resolution imaging in the cytoplasm of living cells. for efficient and specific covalent labelling of target proteins, we have made use of the halotag system. atto655 was coupled to the halotag ligand (htl) and fast reaction of htl-atto655 with the halotag enzyme was confirmed in vitro by solid phase binding assays. efficient labelling of the membrane cytoskeleton using lifeact fused to the halotag was observed and super-resolution imaging was readily achieved. based on this approach, we managed to follow the nanoscale dynamics of the actin cytoskeleton as well as clathrin-coated pits using clathrin light chain fused to the halotag. we combined this technique with fpalm for triple-colour super-resolution imaging of the spatial distribution of membrane receptors in the context of the membrane skeleton. the erbb family of receptor tyrosine kinases consists of four transmembrane proteins that transduce signals across the membrane to control cell fate. growth factor binding results in homo- and hetero-interactions between these receptors at the membrane. erbb receptors are implicated in many cancers, making them a target for therapeutic drugs.
to date, studies of erbb interactions have been limited to individual family members or specific pairs, giving an incomplete picture of the highly complex behaviour controlling positive and negative feedback loops and signalling outcomes. to investigate erbb receptor interactions, we have developed tirf-based single molecule fluorescence microscopes capable of simultaneously imaging three, and soon five, fluorescent probes in live cells. we have also developed a catalogue of extrinsic fluorescent probes for 1:1 labelling of both endogenous and transfected erbb family members in mammalian cells, plus a bayesian approach to the analysis of single molecule data. this allows us to track active and inactive erbb family members at the basal surface of a model breast cancer cell line that expresses physiological levels of all four receptors. we present here an initial characterisation of the entire erbb family together in the cell membrane. the human genome contains more than 800 g protein-coupled receptors (gpcrs); overall, 3-4% of the mammalian genome encodes these molecules. processes controlled by gpcrs include neurotransmission, cellular metabolism, secretion, and immune responses. however, it is the stoichiometry of these receptors that is the most controversial. the starting point for understanding gpcr function was the idea that these receptors are monomeric. on the other hand, many recent studies favour the concept that gpcrs form dimers and are not capable of signalling as independent monomers. recent single molecule studies try to solve this dilemma by suggesting that gpcrs form transient dimers with a lifetime of ≈100 ms. however, questions remain about the physiological relevance of the preparations necessary for these studies, since they have not been performed on endogenous receptors. here, we directly image individual endogenous receptors using an equimolar mixture of two-colour fluorescent fab fragments.
we can then determine the receptors' stoichiometry by quantifying their dynamic single molecule colocalisation (dysco) recorded by total internal reflection fluorescence (tirf) microscopy. we have recently investigated the domain dynamics of pgk (1) . structural analysis by small angle neutron scattering revealed that the structure of the holoprotein in solution is more compact than the crystal structure, but would not allow the functionally important phosphoryl transfer between the substrates if the protein were static. brownian large-scale domain fluctuations on a timescale of 50 ns were revealed by neutron spin echo spectroscopy. the observed dynamics show that the protein has the flexibility to allow fluctuations and displacements that seem to enable function. many physiological and pathological processes involve insertion and translocation of soluble proteins into and across biological membranes. however, the molecular mechanisms of protein membrane insertion and translocation remain poorly understood. here, we describe the ph-dependent membrane insertion of the diphtheria toxin t domain in lipid bilayers by specular neutron reflectometry and solid-state nmr spectroscopy. we gained unprecedented structural resolution using contrast-variation techniques that allow us to propose a sequential model of the membrane-insertion process at angstrom resolution along the perpendicular axis of the membrane. at ph 6, the native tertiary structure of the t domain unfolds, allowing its binding to the membrane. the membrane-bound state is characterized by a localization of the c-terminal hydrophobic helices within the outer third of the cis fatty acyl-chain region, and these helices are oriented predominantly parallel to the plane of the membrane. in contrast, the amphiphilic n-terminal helices remain in the buffer, above the polar headgroups, due to repulsive electrostatic interactions.
at ph 4, the repulsive interactions vanish; the n-terminal helices penetrate the headgroup region and are oriented parallel to the plane of the membrane. the c-terminal helices penetrate deeper into the bilayer and occupy about two thirds of the acyl-chain region. these helices do not adopt a transmembrane orientation. interestingly, the t domain induces disorder in the surrounding phospholipids and creates a continuum of water molecules spanning the membrane. we propose that this local destabilization permeabilizes the lipid bilayer and facilitates the translocation of the catalytic domain across the membrane. the limited stability in vitro of mps motivates the search for new surfactants (1) (2) (3) (4) . fss with a polymeric hydrophilic head proved to be mild towards mps (1) . new fss were designed with chemically defined polar heads for structural applications. the lac derivative was efficient in keeping several mps water soluble and active, but formed elongated rods (2) . the glu family was synthesized and characterized by sans and auc, and studied for its biochemical interest. the formation of rods is related to the low volumetric ratio between the polar head and the hydrophobic tail. the surfactant bearing two glucose moieties is the most promising one, leading to both homogeneous and stable complexes for both br and the b 6 f. it was also shown to be of particular interest for the structural investigation of membrane proteins using sans (3) . by combining elastic and quasi-elastic neutron scattering data, and by applying theory originally developed to investigate dynamics in glassy polymers, we have shown that in lyophilised apoferritin above t ≈ 100 k the dynamic response observed in the pico- to nano-second time regime is driven by ch 3 dynamics alone, where the methyl species exhibit a distribution of activation energies. our results suggest that over the temporal and spatial range studied the main apoferritin peptide chain remains rigid.
interestingly, similar results are reported for other smaller, more flexible lyophilised bio-materials. we believe this work elucidates fundamental aspects of the dynamic landscape in apoferritin, which will aid the development of complex molecular dynamics simulations of super-molecules. a detailed appreciation of the relationships between dynamics and biological function will require analysis based on such models that realize the full complexity of macromolecular material. biological systems must often be stored for extended periods of time. this is done by lyophilisation in the presence of lyoprotectants, such as sugars, which results in stable products at ambient conditions [1] . in an effort to understand the mechanism of preservation and stabilization, the interactions between sugars and liposome vesicles, which serve as a simple membrane model, have been studied extensively. amongst the common sugars, trehalose has superior preservative effects [1] and accumulates to high concentrations in many anhydrobiotic organisms. despite many experimental and numerical studies, three mechanisms have been proposed: vitrification [2] , preferential exclusion [3] and water replacement [4] . to gain more insight into the stabilization mechanism we have recently investigated the effect of trehalose on the bending elasticity of fully hydrated unilamellar vesicles of 1,2-dipalmitoyl-phosphatidylcholine (dppc) in d 2 o at temperatures below and above the lipid melting transition (tm) using neutron spin-echo. the data were analyzed using the zilman-granek theory. at all temperatures measured, trehalose stiffens the bilayer, suggesting strong interactions between trehalose and the lipid. trehalose appears to broaden the melting transition but does not change the tm. this agrees with observations using differential scanning calorimetry.
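in the zilman-granek picture used above, the neutron spin-echo intermediate scattering function of a fluctuating membrane relaxes as a stretched exponential with exponent 2/3, s(q,t)/s(q,0) = exp[−(γ_q·t)^(2/3)], and the fitted relaxation rate γ_q gives access to the bending rigidity. a sketch of such a fit on noiseless synthetic data (the rate value is hypothetical, not a measured one):

```python
import numpy as np
from scipy.optimize import curve_fit

def zilman_granek(t, gamma):
    """Stretched-exponential membrane relaxation with exponent 2/3."""
    return np.exp(-(gamma * t) ** (2.0 / 3.0))

# Synthetic NSE decay with gamma = 0.05 /ns (hypothetical), t = 0.1-100 ns.
t = np.linspace(0.1, 100.0, 50)
data = zilman_granek(t, 0.05)

gamma_fit, _ = curve_fit(zilman_granek, t, data, p0=[0.01])
print(f"fitted gamma = {gamma_fit[0]:.3f} /ns")  # recovers the input rate
```

a stiffer bilayer (larger bending modulus) shows up as a smaller γ_q at a given q.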
influence of macromolecular crowding on protein stability. stéphane longeville, clémence le coeur, laboratoire léon brillouin, gif-sur-yvette, france. the cell interior is a complex environment filled with a variety of objects of different shapes and sizes. macromolecules are present at a total concentration of up to several hundred grams per litre, and the overall occupied volume fraction can reach φ ≈ 0.3-0.4. in such a crowded environment, protein-protein interactions play a fundamental role. the crowding environment can affect physical, chemical, and biological properties of biological macromolecules [1, 2] . traditionally, protein folding is studied in vitro at very low protein concentration. under such conditions, small globular single-chain proteins can unfold and refold quite rapidly, depending mainly on the nature of the solvent. such processes have been very intensively studied, since folding of proteins into their native structure is the mechanism that transforms a polypeptide into its biologically active structure. protein misfolding is involved in a very large number of diseases [4] (e.g. alzheimer, parkinson, and creutzfeldt-jakob diseases, type ii diabetes, …). theoretically, the problem was studied by the introduction of the concept of excluded volume [5] . in recent papers [6, 7] , minton uses statistical thermodynamic models to address the question. he predicted that inert cosolutes stabilize the native state of proteins against unfolding mainly by destabilizing the unfolded state, and that the dimensions of the unfolded state decrease with increasing cosolute concentration in a measurable way.
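the volume fraction quoted above follows from simple arithmetic: φ = c·v̄, where c is the macromolecule concentration and v̄ the partial specific volume, taken here as ≈0.73 ml/g, a typical textbook value for proteins (an assumption, not stated in the abstract):

```python
def volume_fraction(conc_g_per_l: float, vbar_ml_per_g: float = 0.73) -> float:
    """Occupied volume fraction phi = c * vbar (g/L * mL/g = mL/L, /1000)."""
    return conc_g_per_l * vbar_ml_per_g / 1000.0

# Several hundred g/L of macromolecules, as in the cell interior:
for c in (400, 500):
    print(f"{c} g/L -> phi = {volume_fraction(c):.2f}")
# consistent with the phi ~ 0.3-0.4 crowding regime quoted above
```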
small angle neutron scattering (sans) is a technique of choice for such a study because, by using appropriate mixtures of light and heavy water, it is possible to match the scattering length density of the solvent to that of the cosolute and thus to measure the conformation of a molecule at low concentration in the presence of a high concentration of another one. we will present a complete experimental study of the mechanism that leads to protein stabilization by macromolecular crowding [7, 8] . coupled dynamics of protein and hydration water studied by inelastic neutron scattering and molecular dynamics simulation. hiroshi nakagawa 1,2 , mikio kataoka 1,3 , 1 japan atomic energy agency, tokai, japan, 2 juelich centre for neutron science, forschungszentrum juelich gmbh, 3 nara institute of science and technology, ikoma, japan. proteins work in an aqueous environment at ambient temperature. it is widely accepted that proteins are flexible and mobile. this flexibility and mobility, that is, protein dynamics, are essential for protein functions. neutron incoherent scattering is one of the most powerful techniques to observe protein dynamics quantitatively. here i will talk about the dynamics of a protein and its hydration water. the structure of a soluble protein thermally fluctuates in the solvated environment of a living cell. understanding the effects of hydration water on protein dynamics is essential to determine the molecular basis of life. however, the precise relationship between hydration water and protein dynamics is still unknown because hydration water is ubiquitously configured on the protein surface. we found that the hydration-level dependence of the onset of the protein dynamical transition is correlated with the hydration water network. hydration water dynamics change above a threshold hydration level, and water dynamics control protein dynamics.
these findings lead to the conclusion that hydration water network formation is an essential property that activates the anharmonic motions of a protein, which are responsible for protein function. thermal motions and stability of hemoglobin of three endotherms (platypus -ornithorhynchus anatinus, domestic chicken -gallus gallus domesticus and human -homo sapiens) and an ectotherm (saltwater crocodile -crocodylus porosus) were investigated using neutron scattering and circular dichroism. the results revealed a direct correlation between the dynamic parameters, melting temperatures, and body temperatures. on the one hand, a certain flexibility of the protein is mandatory for biological function and activity. on the other hand, intramolecular forces must be strong enough to stabilize the structure of the protein and to prevent unfolding. our study presents experimental evidence which supports the hypothesis that the specific amino acid composition of hb has a significant influence on the thermal fluctuations of the protein. the amino acid sequence of hb seems to have evolved to permit an optimal flexibility of the protein at body temperature. macromolecular resilience was found to increase with body and melting temperatures, thus regulating hb dynamics. the mean square displacement of a trapped bead in a viscoelastic medium follows msd(t) = (2k_B T/k) [1 − e_a(−(t/τ)^a)], where k is the trap spring constant, a is the subdiffusion exponent and e_a is the mittag-leffler function. the parameters obtained by fitting this equation to the experimental msds are summarized in table 1 . at short lag times we have not found any difference between the two cell types, contrary to the previous results obtained by afm [2] . for both cell lines the subdiffusion exponent a was found close to the value predicted by the theory of semiflexible polymers. but the crossover frequency was found to be smaller for the cancerous cells for all datasets. it corresponds to the passage to the confined regime at longer times. we attribute this to the bigger impact of molecular motors.
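the mittag-leffler function referenced above can be evaluated numerically from its power series e_a(z) = Σ_k z^k / Γ(a·k + 1), which is adequate for the moderate arguments that occur in msd fits. a sketch (trap and timescale parameters are hypothetical); for a = 1 the function reduces to the ordinary exponential, a convenient sanity check:

```python
import math

def mittag_leffler(alpha: float, z: float, terms: int = 120) -> float:
    """One-parameter Mittag-Leffler function E_alpha(z) via its power
    series (adequate for moderate |z|)."""
    return sum(z ** k / math.gamma(alpha * k + 1) for k in range(terms))

def msd(t: float, alpha: float, tau: float, kBT_over_k: float) -> float:
    """MSD of a bead in a harmonic trap in a subdiffusive medium:
    msd(t) = (2 kBT / k) * [1 - E_alpha(-(t/tau)^alpha)]."""
    return 2 * kBT_over_k * (1 - mittag_leffler(alpha, -((t / tau) ** alpha)))

# Sanity check: alpha = 1 gives E_1(-x) = exp(-x).
assert abs(mittag_leffler(1.0, -1.0) - math.exp(-1.0)) < 1e-9
print(msd(1.0, 0.75, 1.0, 0.5))  # plateaus at 2*kBT/k for t >> tau
```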
we study the spatiotemporal evolution of fibrous protein networks present in the intracellular and extracellular matrices. here, we focus on in vitro actin network dynamics and evolution. in order to study the hierarchical self-assembly of network formation in a confined environment while perturbing the system minimally with external stimuli, we introduce a new microfluidic design. the microfluidic setup consists of a controlling channel to which microchambers of different shapes and sizes are connected through narrow channels. this design results in mainly convective transport in the controlling channel and diffusive transport into the microchambers. rhodamine-labeled actin monomers diffuse into the chamber. after polymerization is induced, they form a confined entangled network. cross-linking proteins can then be added to increase the network complexity. moreover, we can generate gradients of reactants across the microchambers. vasodilator-stimulated phosphoprotein (vasp) is a crucial regulator of actin dynamics. it is important in cellular processes such as axon guidance and migration, promoting the assembly of filopodia and lamellipodia. vasp's multiple-domain structure increases the range of interactions it has with actin monomers, filaments, and other proteins, and it displays multiple binding modes both in vitro and in vivo, including barbed-end elongation and filament bundling. however, it is not fully understood how vasp affects the structural and mechanical properties of actin networks. we characterize vasp-mediated bundling of actin networks in a simplified in vitro system using confocal microscopy and quantify mechanical properties with rheology measurements. we show that the network properties differ from those of other actin-bundling proteins and reflect vasp's multiple-domain structure, displaying a complex bundling phase space that depends upon solution conditions.
we observe the formation of large bundle aggregates accompanied by a reduction in network elasticity at high protein ratios. in addition, we change vasp's actin-binding mode and eliminate bundling by introducing free actin monomers. finally, we show preliminary results from a biomimetic system that extends the range of actin-vasp interactions. cell migration or proliferation? the go-or-grow hypothesis in cancer cell cultures tamás garay, éva juhász, józsef tímár, balázs hegedűs 2nd department of pathology, semmelweis university, budapest, hungary background: cancer-related death has grown constantly over the past decades. the mortality of solid tumors is mostly due to the metastatic potential of tumor cells, which requires a fine adjustment between cell migration and cell proliferation. as the metabolic processes in the cell provide a limited amount of available energy (i.e. atp), various biological processes like cell motility or dna synthesis compete for the atp available. the go-or-grow hypothesis postulates that tumor cells show either high migration or high proliferation potential. in our study we investigated, on a large series of tumor cell lines, whether this assumption holds for malignant cells. materials and methods: twenty tumor cell lines derived from malignant mesothelioma (mesodermal origin) and malignant melanoma (neuroectodermal origin) were subjected to three-day-long time-lapse videomicroscopic recordings. cell motility and proliferation were characterized by the probability of cell division within 24 hours and the 24-hour migration distance of the cells. results: we found a wide range in both the cell migratory activity and the proliferation capacity in our series. the 24-hour migration distance ranged from 40 to 300 microns and from 10 to 130 microns in mesothelioma and melanoma cells, respectively.
the lowest 24-hour cell division probability was found to be 0.21 in both the melanoma and mesothelioma series, while the highest proliferation activity reached 1.1 and 1.4 in melanoma and mesothelioma, respectively. interestingly, in the melanoma cell lines we found a significant positive correlation (r=0.6909; p=0.0186) between cell proliferation and cell migration. in contrast, our mesothelioma cell lines displayed no correlation between these two cellular processes. conclusions: in summary, our findings demonstrate that the investigated tumor cells do not defer cell proliferation for cell migration. it is important to note that tumor cells derived from various organ systems may differ in terms of the regulation of cell migration and cell proliferation. furthermore, our observation is in line with the general observation of pathologists that highly proliferative tumors often display significant invasion of the surrounding normal tissue. many cell types are sensitive to mechanical signals. one striking example is the modulation of cell proliferation, morphology, motility, and protein expression in response to substrate stiffness. changing the elastic moduli of substrates alters the formation of focal adhesions, the formation of actin filament bundles, and the stability of intermediate filaments. the stiffness range over which different primary cell types respond varies widely and generally reflects the elastic modulus of the tissue from which the cells were isolated. mechanosensing also depends on the type of adhesion receptor by which the cell binds, and therefore on the molecular composition of the specific extracellular matrix.
the viscoelastic properties of different extracellular matrices and cytoskeletal elements also influence the response of cells to mechanical signals, and the unusual non-linear elasticity of many biopolymer gels, characterized by strain-stiffening, leads to novel mechanisms by which cells alter their stiffness through the engagement of molecular motors that produce internal stresses. the molecular mechanisms by which cells detect substrate stiffness are largely uncharacterized, but simultaneous control of substrate stiffness and adhesive patterns suggests that stiffness sensing occurs on a length scale much larger than single molecular linkages and that the time needed for mechanosensing is on the order of a few seconds. to explore the potential role of cytoskeletal components in cardiomyocyte adaptation to extreme conditions, we carried out a comparative study of the expression of the cytoskeletal sarcomeric protein titin in the myocardium of ground squirrels during hibernation and of gerbils after spaceflight. we revealed a two-fold increase in the content of the long n2ba titin isoform relative to the short n2b titin isoform in different heart chambers of hibernating ground squirrels. the prevalence of the long titin isoform is known to determine the larger extensibility of heart muscle, which promotes, according to the frank-starling law, the increase in the force of heart contractility needed for pumping more viscous blood during torpor and adapts the myocardium to greater mechanical loads during awakening. moreover, titin mrna levels showed seasonal downregulation, in which all hibernating stages differed significantly from the summer active level. it is possible that the decline of mrna and protein synthesis during hibernation may be regarded as an accommodation minimizing energetic expenditure. we did not reveal differences in titin mrna levels between control gerbils and gerbils after spaceflight.
however, we also observed a two-fold growth in the amount of the n2ba titin isoform in the left ventricle of gerbils after spaceflight, which is likely directed at restoring the heart contractility reduced at zero gravity. these results suggest that the increase in the content of the long n2ba titin isoform may serve as a universal adaptive mechanism for regulating heart function in response to extreme conditions. nuclear migration is a general term for a non-random movement of the nucleus toward specific sites in the cell. this phenomenon has been described throughout the eukaryotes, from yeast to mammals. the process is, however, still poorly understood in mammalian cells. by using microcontact printing we are able to regulate the geometry and spreading of cultured cells. adhesive micropatterns of fibronectin provide an attachment surface for the cells, whereas the passivation of the surface by pll-peg prevents protein, and thus cell, adhesion. live cell imaging by time-lapse microscopy has shown that under these conditions cells gain a bipolar shape and, more interestingly, the nuclei of the cells show auto-reversed motion. our research tries to understand the molecular cues and mechanisms behind the observed cellular and nuclear movement. we have already shown that the cytoskeleton plays an important role in this phenomenon, but the exact players and the detailed mechanism remain to be clarified. in order to identify the most important components and their relationships, drug treatments and sirna experiments have been applied. although our research focuses mainly on the motility of the nucleus, it may also help to give a better understanding of the general theme of cell migration. cell motility involves a number of strategies that cells use to move in their environments in order to seek nutrients, escape danger and fulfil morphogenetic roles.
when these processes become uncontrolled, pathological behaviours, like cancer or the metastasis of cancerous cells, can occur. here we present a new method for the contextual quantification of cellular motility, membrane fluidity and intracellular redox state, using the ratiometric, redox-sensitive protein rxyfp and the ratiometric fluidity-sensitive probe laurdan. we provide evidence that dynamic redox and fluidity changes are correlated with signaling processes involved in cellular motility. these findings may pave the way to novel approaches for the pharmacological control of cell invasiveness and metastasis. manipulation of cellular mechanics anna pietuch, andreas janshoff georg-august university, tammanstraße 6, 37077 göttingen, germany, e-mail: anna.pietuch@chemie.unigoettingen.de rheological properties of cells, determined by the underlying cytoskeleton (cortex), are key features in cellular processes like cell migration, cell division, and cell morphology. today it is possible to investigate local cellular elastic properties under almost physiological conditions using the afm. by recording force-indentation curves at local areas of the cell surface and applying contact-mechanics models, the young's modulus is obtained, comprising information about the elastic properties of cells. the administration of cytoskeleton-modifying substances into the cell is achieved by microinjection. we are also investigating morphological changes and rearrangements of the cytoskeleton in time-resolved impedance measurements. electric cell-substrate impedance sensing is a label-free and minimally invasive technique which allows monitoring morphological changes of cells in real time. the readout of the impedance is sensitive to changes in cell-substrate contacts as well as the density of cell-cell contacts, yielding important information about the integrity of the cell layer and changes in the properties of the cell membrane.
we are studying the cellular response to modification of the cytoskeleton, e.g. by introducing proteins which directly affect the organization of the actin structure, like ezrin. mechanical characterization of actin gels by a magnetic colloids technique thomas pujol, olivia du roure, julien heuvingh pmmh, espci-cnrs-upmc-p7, paris, france the actin polymer is central in cell biology: it is a major component of the cytoskeleton and it plays a fundamental role in motility, division and mechanotransduction. its polymerization just beneath the cell membrane generates the forces responsible for cell movement. actin filaments form a network whose architecture is defined by the nature of the binding proteins and depends on the location in the cell. for example, in the structure which leads cell migration, the lamellipodium, the gel is branched due to its interaction with the arp2/3 protein complex. determining the mechanical properties of such actin networks is of crucial interest to understand how forces are generated and transmitted in living cells. we grow a branched actin network from the surface of colloids using the arp2/3 machinery. the particles are superparamagnetic and attract each other via dipole-dipole interactions to form chains. by increasing the magnetic field we apply an increasing force to the gel and we optically measure the resulting deformation. from those measurements we deduce a young's modulus from a large amount of data. we are characterizing different networks by varying the concentrations of the capping and branching proteins, and we show how the mechanics can be regulated by the different proteins. microtubules (mts) are central to the organization of the eukaryotic intracellular space and are involved in the control of cell morphology. in fission yeast cells, mts transport polarity factors to the poles where growth is located, thus ensuring the establishment and maintenance of the characteristic spherocylindrical shape.
for this purpose, mt polymerization dynamics is tightly regulated. using automated image analysis software, we investigated the spatial dependence of mt dynamics in interphase fission yeast cells. we showed that compressive forces generated by mts growing against the cell pole locally reduce mt growth velocities and enhance catastrophe frequencies. in addition, our systematic and quantitative analysis (in combination with genetic modifications) provides a tool to study the role of +tips (plus-end tracking proteins) such as mal3 and tip1 in the spatial regulation of mt dynamics. we further use this system to decipher how linear transport by mts interferes with the feedback circuitry that assures the correct spatial distribution of tea1, the main polarity factor in fission yeast cells. the dynamics of the cytoskeleton are largely driven by cytoskeletal motor proteins. complex cellular functions, such as mitosis, need a high degree of control of these motors. the versatility and sophistication of biological nanomachines still challenges our understanding. kinesin-5 motors fulfill essential roles in mitotic spindle morphogenesis and dynamics and were thought to be slow, processive microtubule (mt) plus-end directed motors. here we have examined in vitro and in vivo functions of the saccharomyces cerevisiae kinesin-5 cin8 using single-molecule motility assays and single-molecule fluorescence microscopy. in vivo, the majority of cin8 motors moved slowly towards mt plus-ends, but we also observed occasional minus-end directed motility episodes. in vitro, individual cin8 motors could be switched by ionic conditions from rapid (up to 50 µm/min) and processive minus-end motion, to bidirectional, to slow plus-end motion. deletion of the uniquely large insert in loop 8 of cin8 induced a bias towards minus-end motility and strongly affected the directional switching of cin8 both in vivo and in vitro.
the entirely unexpected in vivo and in vitro switching of cin8 directionality and speed demonstrates that kinesins are much more complex than previously thought. these results will force us to rethink molecular models of motor function and will move the regulation of motors into the limelight as pivotal for understanding cytoskeleton-based machineries. morphological and dynamical changes during tgf-β-induced epithelial-to-mesenchymal transition david schneider 1 , marco tarantola 2 , and andreas janshoff 1 1 institute of physical chemistry, georg-august-university, göttingen, germany, 2 max planck institute for dynamics and self-organization, laboratory for fluid dynamics, pattern formation and nanobiocomplexity (lfpn), goettingen, germany the epithelial-to-mesenchymal transition (emt) is a program of cellular development associated with a loss of cell-cell contacts, decreased cell adhesion and substantial morphological changes. besides its importance for numerous developmental processes like embryogenesis, emt has also been held responsible for the development and progression of tumors and the formation of metastases. the influence of transforming growth factor β1 (tgf-β1)-induced emt on the structure, migration, cytoskeletal dynamics and long-term correlations of the mammalian epithelial cell lines nmumg, a549 and mda-mb231 was investigated by time-resolved impedance analysis and atomic force microscopy (afm) force-indentation measurements. the three cell lines display important differences in cellular morphology, mirrored in changes of their elastic response (young's modulus), as well as in their dynamics upon tgf-β1 treatment. impedance-based measurements of micromotility reveal a complex dynamic response to tgf-β1 exposure, which leads to a transient increase in fluctuation amplitude and long-term correlation.
additionally, the investigation of cellular elasticity via afm depicts the different cytoskeletal alterations depending on the metastatic potential of the cell type used. physics of cellular mechanosensitivity studied with biomimetic actin-filled liposomes björn stuhrmann, feng-ching tsai, guido mul, gijsje koenderink fom institute amolf, amsterdam, the netherlands biological cells actively probe the mechanical properties of their tissue environment by exerting contractile forces on it, and use this information to decide whether to grow, migrate, or proliferate. the physical basis for cell contractility is the actin cytoskeleton, which transmits motor-generated stresses to mechanosensitive adhesion sites that anchor the cell to the tissue. the origins of mechanosensing are far from understood due to the complex interplay of mechanical effects and biochemical signaling that occurs far from equilibrium. we use a quantitative biophysical approach based on biomimetic constructs to elucidate the physical principles that underlie active mechanosensing in biological cells. we have built realistic in vitro models of contractile cells by encapsulating cross-linked actin networks together with myosin motors in cell-sized membranous containers (liposomes). our method has several advantages over prior methods, including high liposome yield, compatibility with physiological buffers, and chemical control over protein/lipid coupling. i will show contour fluctuation spectra of the constructs and first data on the mechanical response obtained by laser-tweezers microrheology. our work will yield novel insights into the stress generation and stiffness sensing of cells. setting up a system to reconstitute cytoskeleton-based protein delivery and patterning in vitro núria taberner, liedewij laan, marileen dogterom fom institute amolf, amsterdam, the netherlands keywords: microtubules, fission yeast, cell polarity, protein patterning, plus-end binding proteins.
many different cell types, from mobile fibroblasts [1] to fission yeast cells [2] , display non-homogeneous protein patterns on their cell cortex. in fission yeast, the cell-end marker protein tea1, which among others is responsible for recruiting the actin-dependent cell-growth machinery, is specifically located at the growing cell ends. tea1 travels at the tips of growing microtubules and is delivered to the cell ends [2] . we aim to reconstitute in vitro a minimal microtubule plus-end tracking system that leads to cortical protein patterning in functionalized microfabricated chambers. our model will allow us to perturb microtubule-based transport and diffusion independently and to evaluate the resulting protein patterns. the tropomyosins (tm) are dimeric actin-binding proteins that form longitudinal polymers along the actin filament groove. there is a great variety of isoforms, but the division of labour between the individual tms and their significance is poorly understood. as in most cell types, several isoforms are also present in neurons, and their spatio-temporal localisation is differentially regulated. the neuron-specific brain 3 isoform (tmbr-3) can be found in the axon of mature cells. we aimed to clone, express and characterise this protein in terms of its effects on the kinetic parameters of the actin filament. using a pet28a construct we purified native, tag-free protein, and examined whether it influences the rate of actin polymerisation or the stability of the filaments in the presence of either gelsolin or latrunculin-a, two depolymerising agents. in cosedimentation experiments the affinity of tmbr-3 for actin was approximately 3 µM, about six times that of skeletal muscle tropomyosin. the net rate of actin polymerisation was reduced by 17% in the presence of tmbr-3. the depolymerisation induced by gelsolin or latrunculin-a was inhibited in a concentration-dependent manner.
tmbr-3 seems to stabilise actin filaments against disassembly without a significant effect on net polymerisation. cell mobility and metastatic spreading: a study on human neoplastic cells using optical tweezers f. tavano the primary causes of death in cancer patients are local invasion and metastasis, but their mechanisms are not yet completely understood. metastatization is accompanied by alterations of the cytoskeleton and membrane structure leading to changes in their biomechanical properties [1] . in this study we analyzed, by means of optical tweezers (ot), the mechanical properties of two different breast adenocarcinoma cell lines corresponding to different metastatic potentials. ot were used to grab the plasma membrane with a 1.5 µm silica bead and form a plasma membrane tether. we measured the force exerted by the cell membrane on the bead and drew the force-elongation curves. fitting the data with the kelvin body model [2] , we found the values of the viscoelastic parameters influencing the pulling of the membrane tethers. the first cell line analyzed, mcf-7, associated with a low metastatic potential, showed a tether stiffness of 153 pN/µm on average. the second cell line, mda-mb 231, poorly differentiated and with a high metastatic potential, had a tether stiffness of 36 pN/µm on average, a roughly four-fold lower value. these results seem to confirm the hypothesis that metastasis-prone cells are softer than less aggressive cancer cells, and support the use of ot for these measurements for its sub-pN force resolution and because cells are manipulated without damage. tubulin polymerization promoting protein (tppp/p25) is a brain-specific protein that primarily targets the microtubule network, modulating its dynamics and stability. tppp/p25 is a disordered protein with extended unstructured segments localized at the n- and c-termini straddling a flexible region.
tppp/p25 is primarily expressed in oligodendrocytes, where its multifunctional features, such as its tubulin polymerization promoting and microtubule bundling activities, are crucial for the development of projections in the course of oligodendrocyte differentiation, enabling the ensheathment of axons with a myelin sheath that is indispensable for the normal function of the central nervous system. the microtubule network, a major constituent of the cytoskeleton, displays multiple physiological and pathological functions in eukaryotic cells. the distinct functions of the microtubular structures are attained by static and dynamic associations of macromolecules and ligands as well as by post-translational modifications. tppp/p25 is actively involved in the regulation of microtubule dynamics not exclusively by its bundling activity, but also by its tubulin acetylation-promoting activity. atypical histone deacetylases, such as nad-dependent sirt2 and histone deacetylase-6, function outside of the nucleus and control the acetylation level of cytosolic proteins, such as tubulin. acetylation-driven regulation of the microtubule network during cellular differentiation is an ambiguous issue. tppp/p25 has recently been identified as an interacting partner and inhibitor of these deacetylases, and their interaction decreased the growth velocity of the microtubule plus ends and the motility of the cells. we have established cell models for the quantification of the acetylation degree of the microtubule network in correlation with its dynamics and stability, as well as in relation to aggresome formation, which mimics pathological inclusion formation. the intracellular level of tppp/p25 is controlled at the post-transcriptional level by micrornas and at the protein level by the proteasome machinery.
under pathological circumstances this disordered protein displays an additional moonlighting function that is independent of its association with the microtubule system or the deacetylases; it enters aberrant protein-protein interactions with α-synuclein, forming toxic aggregates within neuronal and glial cells and leading to the formation of the inclusions characteristic of parkinson's disease and multiple system atrophy, respectively. the cell membrane separates the intracellular from the extracellular environment while intimately interacting with the cytoskeleton in numerous cellular functions, including cell division and motility. cell shape changes are in large part mediated by the contractile actomyosin network forming the cortex underneath the cell membrane. to uncover molecular mechanisms of cell shape control based on actin-membrane interactions, we built a novel biomimetic model system: a cell-sized liposome encapsulating an actively contracting actin-myosin network. our fabrication method is inspired by a recent report of liposome preparation by swelling of lipid layers in agarose hydrogel films 1 . we extensively characterize important liposomal properties, finding diameters between 7 and 20 µm, unilamellarity, and excellent and uniform encapsulation efficiency. we further demonstrate chemical control of actin network anchoring to the membrane. the resulting liposomes allow quantitative tests of physical models of cell shape generation and mechanics. in the cohesive structure of the cytoskeleton, functionally distinct actin arrays orchestrate fundamental cell functions in a spatiotemporally controlled manner. emerging evidence emphasizes that protein isoforms are essential for the functional polymorphism of the actin cytoskeleton. the generation of diverse actin networks is catalyzed by different nucleation factors, like formins and the arp2/3 complex. these actin arrays also exhibit qualitative and quantitative differences in the associated tropomyosin (tm) isoforms.
how the molecular composition and the function of actin networks are coupled is not completely understood. we investigated the effects of different tm isoforms (skeletal muscle, cytoskeletal tm5nm1 and tmbr3) on the activity of the mdia1 formin and the arp2/3 complex using fluorescence spectroscopic approaches. the results show that the studied tm isoforms have different effects on mdia1- and arp2/3 complex-mediated actin assembly. the activity of the arp2/3 complex is inhibited by sktm and tm5nm1, while tmbr3 does not have any effect. all three tm isoforms inhibited the activity of mdia1. these results contribute to the understanding of the mechanisms by which tropomyosin isoforms regulate the functional diversity of the actin cytoskeleton. chronic thromboembolic pulmonary hypertension (cteph) is a dual pulmonary vascular disorder, which combines major vascular remodelling with small-vessel arteriopathy. the presence of fibroblasts in the clot occluding the pulmonary arteries, and its composition, create a microenvironment with an increased collagen level, which might affect local endothelial function. in this study, human pulmonary artery endothelial cells (hpaec) were exposed to collagen type i to address the effect of the thrombotic microenvironment on the vessel-wall-forming cells. the hpaecs, cultured under standard conditions, were treated with 1, 10 and 100 µg/ml of collagen type i for 6 h and 24 h. the changes in the endothelial cell barrier function were investigated by performing permeability and migration tests as well as ve-cadherin staining. collagen type i treatment led to a decrease in the ve-cadherin signal in hpaec. the loosening of cell-cell contacts was demonstrated by a significant increase in permeability after 6 h of collagen treatment at different concentrations. besides the loosening of the cell-cell contacts, hpaec migration was also dose-dependently retarded by collagen application over time.
our data show that a collagen-rich microenvironment leads to a disruption of the junctional proteins in hpaecs, indicating a possible environmentally induced alteration in the function of endothelial cells in the clots of cteph patients. the implementation of miniaturisation and high-throughput screening has quickened the pace of protein structure determination. however, for most proteins the process still requires milligram quantities of protein with purity > 95%. these amounts are required as a result of unavoidable losses during purification and for the extensive screening of crystallisation space. for integral membrane proteins (imps), one of the initial steps in the structure determination procedure is still a major bottleneck: the over-expression of the target protein in the milligram quantity range. with a view to developing guidelines for the over-expression of human imps, a systematic approach using the three most common laboratory expression systems (e. coli, s. cerevisiae, sf9 insect cells) was implemented. initial expression levels were determined by either partial purification using ni2+-nta (e. coli), green fluorescent protein (gfp) fluorescence using a c-terminally gfp-tagged protein (s. cerevisiae) or flag-tagged partial purification (sf9 cells). the results show that e. coli is suitable for the over-expression of human imps in the required quantity range; however, protein size and complexity are important factors. the yeast system is fast and affordable but, for the group of human imps tested, the expression levels were borderline. finally, for the insect cell system, the timelines are longer and it is comparatively costly to run; however, it can produce relatively large quantities of human imps. the cu+-atpase copb of enterococcus hirae is a bacterial p-type atpase involved in resistance to high levels of environmental copper by expelling excess copper.
the membrane protein copb was purified from an over-expressing strain and solubilized in dodecyl-maltoside. by uv circular dichroism the secondary structure is predicted to contain 40-50% α-helices and 10-15% β-sheets, in agreement with estimates based on homology with the ca2+-atpase serca1. we present cd-spectroscopic data on the thermal unfolding of the protein to address the influence of the binding of the atp analog atpγs and the fluorescent analog mant-atp on protein stability. such analogs are used to mimic functional states of the atpase but undergo different interactions with the binding site that are not well characterized. we propose a competition-based assay for nucleotide binding using cd spectroscopy to deduce the occupancy of the nucleotide-binding site by non-fluorescent nucleotides. alternatively, the change of intrinsic fluorescence of mant-atp upon binding to the atpase is exploited in these assays. finally, we show how the simultaneous measurement of protein cd and nucleotide fluorescence in thermal denaturation experiments may help to determine the stability of several functional conformational states of copb. showing the steady-state distribution of electric potential, ionic concentrations are obtained efficiently. channel current, a summation of drift and diffusive currents, can be further computed from the flux of ionic concentrations. the influence of the finite size effect will also be addressed. effect of cholesterol and cytoskeleton on kv10.1 membrane distribution jiménez-garduño am 1,2 , pardo la 2 , ortega a 1 , stühmer w 2 1 unam, mexico city, mexico, 2 mpi-em, göttingen, germany the potassium channel kv10.1 is expressed nearly exclusively in the central nervous system. besides its function as an ion channel, kv10.1 has also been associated with non-canonical signaling functions. various membrane proteins associated with cholesterol-sphingolipid-enriched microdomains are involved in signaling pathways.
In this work we studied the membrane distribution of Kv10.1 in highly purified brain-tissue plasma membranes as a function of cholesterol content versus cytoskeletal proteins. The results show that one fraction of Kv10.1 associates with cholesterol-rich domains, or detergent-resistant membranes (DRM), and another fraction with non-DRM domains. The Kv10.1 fraction inserted in DRM is dependent on cholesterol as well as on cytoskeletal proteins. Depletion of cholesterol leads to a doubling of Kv10.1 current density. We suggest that Kv10.1 coexists in two different populations: one where the transmembrane domain fits cholesterol-enriched membranes and another able to fit into a less packed lipid bilayer. The importance of this distribution for signaling processes needs to be further investigated. We use the reduced model of an axisymmetric water-filled channel whose protein wall has a single charged site. The channel length, radius and fixed charge are selected to match experimental data for gramicidin A. The ion current, occupancy and escape rate are simulated by the 1D self-consistent BD technique, taking account of the electrostatic ion-ion interaction. The bath with non-zero ion concentration on one side of the channel is modelled via the Smoluchowski arrival rate. It is shown that: (a) the occupancy saturates with Michaelis-Menten kinetics; (b) the escape rate starts from the Kramers value at small concentrations and then increases with concentration due to the electrostatic amplification of charge fluctuations. The resulting dynamics of the current can be described by a modified reaction-rate theory accounting for ionic escape over the fluctuating barrier [1]. Many membrane-protein functions are amenable to biophysical and biochemical investigation only after the protein of interest has been reconstituted from a detergent-solubilised state into artificial lipid bilayers.
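The Michaelis-Menten-type saturation of occupancy described in point (a) can be sketched numerically; the maximum occupancy and half-saturation concentration below are hypothetical illustration values, not parameters from the Brownian-dynamics simulations.

```python
def mm_occupancy(conc, occ_max=1.0, k_half=100.0):
    """Michaelis-Menten-type saturation of channel occupancy with
    bath ion concentration (conc and k_half in the same units)."""
    return occ_max * conc / (k_half + conc)

# occupancy rises almost linearly at low concentration
# and saturates towards occ_max at high concentration
low = mm_occupancy(1.0)      # far below k_half
half = mm_occupancy(100.0)   # at k_half: exactly occ_max / 2
high = mm_occupancy(1e5)     # far above k_half: close to occ_max
```

At the half-saturation concentration the occupancy is exactly half of its maximum, which is the defining property of this kinetic form.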
Unfortunately, functional reconstitution has remained one of the main bottlenecks in the handling of numerous membrane proteins. In particular, gauging the success of reconstitution experiments has thus far been limited to trial-and-error approaches. To address this problem, we have established high-sensitivity isothermal titration calorimetry (ITC) as a powerful method for monitoring the reconstitution of membrane proteins into liposomes. ITC has previously been employed for characterising liposome solubilisation and reconstitution in the absence of protein. Here we show that ITC is also excellently suited for tracking the complex process of membrane-protein reconstitution in a non-invasive and fully automated manner. The approach is exemplified for the prokaryotic potassium channel KcsA, which we first purified in detergent micelles and then reconstituted into stable proteoliposomes at very high protein densities. Electrophysiological experiments performed in planar lipid membranes confirmed that KcsA regained its functional activity upon ITC-guided reconstitution. Gating currents of the low-voltage-activated T-type calcium channel family. Mária Karmažínová, Ľubica Lacinová, Institute of Molecular Physiology and Genetics, SAV, Bratislava, Slovak Republic. T-type calcium channels are distinguished by a relatively low voltage threshold for activation and a steep voltage dependence of activation and inactivation kinetics just above the activation threshold. The kinetics and voltage dependence of the macroscopic inward calcium current through CaV3 channels have been described in detail. In contrast, very little information is available on the gating currents of these channels. Therefore we compared gating currents measured from all three channels: CaV3.1, CaV3.2 and CaV3.3. The voltage dependencies of charge movement differ dramatically from those of the macroscopic current. First, their slope factors are several-fold bigger than the slope factors of macroscopic current activation.
Second, the activation mid-point for CaV3.3 on-gating is shifted to more positive membrane potentials by about 20 mV compared to CaV3.1 and CaV3.2 channels, whose activation mid-points are similar. The same is true for the off-gating voltage dependences. The kinetics of both on- and off-gating are remarkably faster for CaV3.1 and CaV3.2 channels compared to CaV3.3 channels. Further, more charge is moved per unit of macroscopic current amplitude in CaV3.3 channels compared to CaV3.1 and CaV3.2 channels. Supported by APVV-0212-10 and VEGA 2/0195/10. The local anaesthetic lidocaine (LID) is generally believed to reach its binding site in the intracellular vestibule of the voltage-gated sodium channel via the cell membrane. QX222 (QX) is a permanently charged, quaternary amine analogue of LID that can access this binding site via a hydrophilic route across the channel protein. The mutation I1575E of the adult rat muscle-type sodium channel (rNaV1.4) opens such a hydrophilic pathway. When bound to the internal vestibule, LID stabilizes both fast and slow inactivated states. We wondered whether QX, once bound to the internal vestibule, exerts a similar modulatory action on inactivated states as LID. The construct I1575E was expressed in tsA201 cells and studied by means of the patch-clamp technique. When applied from the extracellular side, 500 µM QX stabilized the slow but not the fast inactivated state in I1575E. When applied internally, QX entered the channel, but stabilization of inactivated states could not be observed. These results suggest that the binding site for use-dependent block is in the inner vestibule of the channel, that fast inactivation is modulated only by the hydrophobic form of LID, and that the binding site for modulation of slow inactivation by QX is only accessible from the extracellular side of the channel.
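The contrast between charge-movement and current-activation voltage dependences described above (several-fold larger slope factors, and an ~+20 mV shifted mid-point for CaV3.3 on-gating) can be illustrated with Boltzmann activation curves. The mid-points and slope factors below are hypothetical values chosen only to mimic the reported qualitative behaviour, not fitted experimental parameters.

```python
import math

def boltzmann(v, v_half, k):
    """Fraction activated at membrane potential v (mV), with
    activation mid-point v_half (mV) and slope factor k (mV)."""
    return 1.0 / (1.0 + math.exp(-(v - v_half) / k))

# hypothetical parameters (mV): macroscopic current is steep (small k),
# charge movement is shallower (several-fold larger k),
# and a CaV3.3-like ON-gating curve is shifted by +20 mV
def i_act(v):   return boltzmann(v, -45.0, 5.0)    # macroscopic current
def q_cav31(v): return boltzmann(v, -45.0, 20.0)   # gating charge, CaV3.1/3.2-like
def q_cav33(v): return boltzmann(v, -25.0, 20.0)   # gating charge, CaV3.3-like
```

A larger slope factor k flattens the curve, so at a voltage just above the mid-point the gating-charge curve lags well behind the macroscopic-current curve, as reported for the CaV3 family.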
We observed that lipophilicity (quantified by the logarithm of the calculated water-octanol partition coefficient, logP) is important in determining both KR and KI, but had a greater effect on KI. Distribution coefficients (logD) discriminated better between KR and KI than partition coefficients (logP). The ratio of positively charged to neutral forms (quantified by the acidic dissociation constant, pKa) was a significant determinant of resting affinity: predominantly charged compounds tended to be more potent against resting channels, while neutral compounds tended to be more state-dependent. Aromaticity was more important for inactivated-state affinity. The acidification of intracellular compartments is critical for a wide range of cellular processes. A recent candidate for pH regulation within early endosomes and the TGN is the highly conserved intracellular Na+/H+ exchanger isoform 7 (NHE7), whose mutation leads to neurological syndromes in human patients. However, due to its intracellular localization, the biochemical features of NHE7 are still poorly characterized and its biological function remains elusive. We have developed somatic cell genetic techniques that enable the selection of variant cells able to resist H+ killing through plasma membrane expression of H+ extruders. This enabled us to obtain stable cell lines with forced plasma membrane expression of NHE7. We used them to measure its functional and pharmacological parameters with high accuracy, using fast transport kinetics. To summarize, this exchanger displays unique features within the NHE family, especially with respect to its affinity for its substrates (lithium, sodium and protons) and for its guanidine-derived inhibitors. Taken together with our results on the subcellular localization of native NHE7, these unique biochemical features provide new insights into the biological function and pathological implications of this intracellular Na+/H+ exchanger.
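The charged/neutral ratio and the logP-logD distinction invoked above both follow from the Henderson-Hasselbalch equation; the pKa and logP values in the example are hypothetical, chosen only to illustrate how the ionization of a basic drug shifts at physiological pH.

```python
import math

def charged_neutral_ratio(pka, ph=7.4):
    """[BH+]/[B] ratio for a basic amine: log10(ratio) = pKa - pH
    (Henderson-Hasselbalch)."""
    return 10.0 ** (pka - ph)

def log_d(log_p, pka, ph=7.4):
    """Distribution coefficient of a base:
    logD = logP - log10(1 + 10**(pKa - pH))."""
    return log_p - math.log10(1.0 + 10.0 ** (pka - ph))

# hypothetical base with pKa 8.4: ~10:1 charged:neutral at pH 7.4,
# so its logD sits about one unit below its logP
ratio = charged_neutral_ratio(8.4)
```

Because logD folds the ionization state into the lipophilicity measure, it is plausible that it discriminates resting from inactivated-state affinity better than logP alone, as the abstract reports.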
Analysis of the collective behaviour of ion channels. J. Miśkiewicz, Z. Trela, S. Przestalski, Wrocław University of Environmental and Life Sciences, Physics and Biophysics Department, ul. Norwida 25, 50-375 Wrocław, Poland. A novel approach to the analysis of ion current recordings is proposed. The main goal of the standard patch-clamp technique is to measure single-channel activity (although the whole-cell configuration is also used in various studies). In the presented study, ion channel time-series recordings in which several (up to four) ion channels were present are analysed and the collective behaviour of the ion channels is investigated. The ion current time series are converted into dwell-time series and the channel activity is analysed. The hypothesis of collective ion channel behaviour is verified, and the influence of organolead compounds (Met3PbCl) on collective ion channel activity is measured. The analysis is performed on the SV cation channels of the vacuolar membrane of Beta vulgaris. The aim of our computational study was to examine the possible binding site of primaquine (PQ) using a combined homology protein modeling, automated docking and experimental approach. The target models of the wild-type and mutant-type voltage-dependent sodium channel in rat skeletal muscle (rNaV1.4) were based on previous work by Tikhonov and Zhorov. Docking was carried out on the P-loop of the structural model of the rNaV1.4 channel, in the open-state configuration, to identify those amino acid residues important for primaquine binding. The three-dimensional models of the P-loop segment of wild types and mutant types (W402C, W756C, W1239C and W1531A at the outer tryptophan-rich lip, as well as D400C, E755C, K1237C and A1529C of the DEKA motif) helped us to identify residues playing a key role in aminoquinoline binding.
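The conversion of an ion-current time series into a dwell-time series can be sketched with a simple threshold idealization. This is an illustrative reconstruction, not the authors' analysis code; the synthetic trace, threshold and sampling interval are invented for the example.

```python
def idealize(current, threshold, dt):
    """Threshold idealization of a single-channel current record:
    returns a list of (state, dwell_time) pairs, state 1 = conducting."""
    states = [1 if c > threshold else 0 for c in current]
    dwells, start = [], 0
    for i in range(1, len(states)):
        if states[i] != states[start]:
            # a run ended: record its state and duration
            dwells.append((states[start], (i - start) * dt))
            start = i
    dwells.append((states[start], (len(states) - start) * dt))
    return dwells

# synthetic two-level trace sampled every 0.1 ms
trace = [0, 0, 5, 5, 5, 0, 5, 0, 0, 0]
dwells = idealize(trace, threshold=2.5, dt=0.1)
open_time = sum(t for s, t in dwells if s == 1)
p_open = open_time / sum(t for _, t in dwells)  # fraction of time open
```

From such dwell-time series, quantities like the open probability (computed above) and the number and duration of openings can be compared across channels to test for collective behaviour.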
In good agreement with experimental results, a 1000-fold inhibition loss was observed; tryptophan 756 is crucial for the reversible blocking effects of PQ. Accordingly, W756C abolished the blocking effect of primaquine in voltage-clamp assays. Hydrogen bond formation accelerates channel opening of the bacterial mechanosensitive channel MscL. Yasuyuki Sawada and Masahiro Sokabe, Department of Physiology, Nagoya University Graduate School of Medicine, Nagoya, Japan. The bacterial mechanosensitive channel MscL is constituted of a homopentamer of a subunit with two transmembrane inner and outer α-helices (TM1, TM2), and its 3D structure in the closed state has been resolved. The major issue for MscL is to understand the gating mechanism driven by tension in the membrane. To address this question, we performed molecular dynamics (MD) simulations of the opening of MscL embedded in the lipid bilayer. In the closed state of MscL, neighboring TM1 inner helices cross each other near the cytoplasmic side, and Leu19 and Val23 in the constricted part form a stable hydrophobic environment called the gate. Upon membrane stretch, Phe78 in the TM2 outer helices was dragged by lipids, leading to an opening of MscL. Thus Phe78 was concluded to be the major tension sensor. During opening, the TM1 inner helices were also dragged and tilted, accompanied by the outward sliding of the crossings. This led to a slight expansion of the gate associated with an exposure of backbone oxygen atoms to the inner surface of the gate. This allows water penetration into the gate and formation of hydrogen bonds between water and the exposed oxygens, which in turn weakened the hydrophobic interaction at the crossings, causing a further opening of the gate and water permeation. The mitochondrial BKCa channel, mitoBKCa, has been proposed to be cardioprotective and to be formed by proteins of ~50 to ~125 kDa. Thus, we investigated the molecular characteristics of this channel in isolated mitochondria from murine heart.
Labeling of adult mouse cardiomyocytes with plasmalemmal BKCa antibodies, MitoTracker, and wheat germ agglutinin yielded remarkable mitochondrial but not plasma membrane localization. Nanoscale fluorescence microscopy (stimulated emission depletion) revealed 7 to 15 BKCa clusters of ~40-50 nm per mitochondrion. Further, Western blot analysis of purified mitochondria showed the presence of a full-length ~125 kDa protein. Systematic RT-PCR exon scanning of isolated cardiomyocyte mRNAs was consistent with a full-length ~125 kDa alpha-subunit protein and revealed the expression of three splice inserts. Insertless BKCa robustly localized to the plasma membrane of CHO cells, but when a C-terminal splice insert was present BKCa was readily targeted to the mitochondria (the protein proximity index was ~1.0, indicating 100% colocalization). Hence, cardiac mitoBKCa is composed of full-length BKCa protein, but with splice inserts that may facilitate its targeting to mitochondria. Supported by NIH and AHA. The patch-clamp technique was used to examine the effect of trimethyl-lead and trimethyl-tin on SV channel activity in red beet (Beta vulgaris L.) taproot vacuoles. It was found that the addition of either of the investigated compounds to the bath solution inhibits SV currents in a concentration-dependent manner. When single-channel properties were analyzed, only little channel activity could be recorded in the presence of 100 µM of organometal. The compounds investigated significantly decreased (by about one order of magnitude) the open probability of single channels. The recordings of single-channel activity obtained in the presence and in the absence of organometal showed that the compounds only slightly (by ca. 10%) decreased the unitary conductance of single channels. It was also found that organometal significantly diminished the number of SV channel openings, whereas it did not change the opening times of the channels.
Taken together, these results suggest that the organometal binding site is located outside the channel's selectivity filter and that the inhibitory effect of both investigated compounds on SV channel activity probably results from an organometal-induced disorder in the compatibility between membrane lipids and membrane proteins. The research was financed by the Polish Ministry of Science and Higher Education, grant no. NN305 336434. Electrophysiological investigation of the hVDAC1 ion channel in pore-spanning membranes. Conrad Weichbrodt, Claudia Steinem, Georg-August-Universität Göttingen, IOBC, Tammannstr. 2, D-37077 Göttingen, Germany; e-mail: cweichb@gwdg.de. The human voltage-dependent anion channel 1 (hVDAC1) plays an important role in cell life and apoptosis, since it is the main porin of the outer mitochondrial membrane. As hVDAC1 is believed to play a pivotal role in apoptosis-related diseases such as stroke, Alzheimer's, Parkinson's and cancer, the alterations of its electrophysiological properties under different conditions are of great value. To perform these investigations, refolded hVDAC1 is reconstituted in artificial membranes which typically consist of DPhPC with a cholesterol fraction of 10-30%. They are prepared via the Müller-Rudin technique on a functionalized porous alumina substrate containing pores with a diameter of 60 nm. The quality of these so-called nano-black-lipid membranes (nano-BLMs) is verified via electrochemical impedance spectroscopy (EIS), hVDAC1 is reconstituted, and single-channel recordings are made. Membranes are also prepared by spreading proteoliposomes on hydrophobized porous silicon nitride with pores of 1 µm diameter. Information about altered gating characteristics and related conductivities is gained by application of holding potentials of up to ±100 mV and evaluation of the resulting currents. The hVDAC1 was a kind gift of Prof. C. Griesinger, MPIBPC, Göttingen. Pharmacological inhibition of cardiac hERG K+
channels is associated with an increased risk of arrhythmias. Many drugs bind directly to the channel, thereby blocking ion conduction. Ala-scanning mutagenesis identified residues important for drug block. Two aromatic residues, Y652 and F656, were found to be crucial for block by most compounds. Surprisingly, some cavity-blocking drugs are only weakly affected by the mutation Y652A. In this study we provide a structural interpretation for this observation. MD simulations of the Y652A mutant suggest side-chain rearrangements of F656, located one helical turn below Y652. Loss of π-π stacking induces reorientation of F656 from a cavity-facing to a cavity-lining conformation, thereby substantially altering the shape of the binding site. Docking studies reveal that, due to their rigid shape and compactness, Y652-insensitive drugs can still favorably interact with the reoriented F656 aryl groups, while molecules with more extended geometries cannot. The ankyrin transient receptor potential channel TRPA1 is a transmembrane protein that plays a key role in the transduction of noxious chemical and thermal stimuli in nociceptors. In addition to chemical activation, TRPA1 can be activated by highly depolarizing voltages, but the molecular basis of this regulation is unclear. The transmembrane part of the tetrameric TRPA1 is structurally related to the voltage-gated K+ channels, in which the conserved charged residues within the fourth transmembrane region (S4) constitute part of a voltage sensor. Compared to these channels, the voltage dependence of TRPA1 is very weak (apparent number of gating charges ~0.4, versus 12 in K+ channels) and its putative voltage-sensing domain most likely lies outside S4, because TRPA1 completely lacks positively charged residues in this region. In the present study we used homology modelling and molecular dynamics to create models of the transmembrane part and the proximal cytoplasmic C terminus of TRPA1.
In combination with electrophysiological data obtained from whole-cell patch-clamp measurements, we were able to point out several positively charged residues whose mutation strongly alters the voltage sensitivity of the TRPA1 channel. These may be candidates for an as yet unrecognized voltage sensor. Photosynthesis P-511. Action of double stress on photosystem 2. Aliyeva Samira A., Gasanov Ralphreed A., Institute of Botany, Azerbaijan National Academy of Sciences, Baku, Azerbaijan. The simultaneous effect of photoinhibitory illumination and the toxic action of heavy metal ions (Cd2+ and Co2+) on the activity of PS2 in vitro, measured by millisecond delayed fluorescence (ms-DF) of chlorophyll a, was studied. Upon action on chloroplasts of Cd2+ alone (10⁻³ M), the fast component of ms-DF, which originates via radiative recombination of the reaction center with the CaMn4 cluster or YZ on the donor side of PS2, is inhibited more strongly than under the action of Co2+ alone. The steady-state level under Cd2+ treatment remains stable, while under Co2+ action it is increased. The simultaneous action of Cd2+ and photoinhibitory illumination (4000 µmol photons m⁻² s⁻¹) showed that the fast component of ms-DF was inhibited faster with time than in the case of the action of Co2+ and excess light. The result indicates that the damage sites of Cd2+ and Co2+ action are the donor and acceptor sides of PS2, respectively. We assume that the binding site of Cd2+ is YZ or the CaMn4 cluster, one of the recombination partners with P680+ on the donor side of PS2. Thereby, the action of Cd ions on the donor side of PS2 leads mainly to the development of a mechanism of donor-side photoinhibition. Field instrument for determination of the photosynthetic capacity of intact photosynthetic bacteria. E. Asztalos 1, Z. Gingl 2 and P.
Maróti 1; 1 Department of Medical Physics and Informatics, University of Szeged, Hungary; 2 Department of Experimental Physics, University of Szeged, Hungary. A combined pump-and-probe fluorometer and spectrophotometer with high-power laser diodes has been constructed to measure fast induction and relaxation of the bacteriochlorophyll fluorescence yield and light-induced absorption changes in intact cells of photosynthetic bacteria. The construction is an upgraded version of our previous setup [1] with better time resolution (5 µs). The compact design of the mechanics, optics, electronics and data processing makes the device easy to use as an outdoor instrument or to integrate into larger measuring systems. The versatility and excellent performance of the apparatus will be demonstrated in different fields: (1) the organization and redox state of the photosynthetic apparatus of whole cells under different growth conditions, deduced from fluorescence characteristics including the lag phase, the amplitude and the rise time of the variable fluorescence; (2) electron transfer in the reaction center, the cytochrome bc1 complex and in between, obtained from the relaxation of the fluorescence; and (3) re-reduction kinetics of the oxidized primary donor of the reaction center, and energization and relaxation of the intracytoplasmic membrane, tracked by absorption changes at 798 and 525 nm, respectively. Previous work has established that the iron-stress-induced protein A (IsiA), synthesized by cyanobacteria under stress conditions, has at least two functions: light harvesting [1] and photoprotection [2]. Under prolonged iron starvation IsiA becomes the main chlorophyll-binding protein in the cell and occurs without a photosystem association. These IsiA aggregates have a strong ability to dissipate light energy, and there is evidence of carotenoid participation in the quenching mechanism via downhill energy transfer from chlorophyll to the S1 state of a carotenoid [3].
In the present work we have measured the temperature dependence of the fluorescence of carotenoid-depleted mutants (echinenone and/or zeaxanthin) and IsiA monomers in order to investigate the role of carotenoids and aggregation in the quenching process. Pigment analysis confirms the absence of the carotenoid mutated in its biosynthesis, but shows that it is mainly replaced. The monomers lack two carotenoids: echinenone and one of the two β-carotenes found previously in IsiA aggregates. Temperature-dependent fluorescence shows that the quenching properties are affected in the monomers and in the mutants lacking zeaxanthin. Soon-to-be-exhausted oil resources and global climate change have stimulated research aimed at the production of alternative fuels, ideally driven by solar energy. Production of solar fuels needs to involve the splitting of water into protons, energized electrons and dioxygen. In photosynthetic organisms, solar-energy conversion and catalysis of water splitting (or water oxidation) proceed in an impressive cofactor-protein complex denoted photosystem II (PSII). The heart of biological water oxidation is a protein-bound manganese-calcium complex working at technically unmatched efficiency. In an attempt to learn from nature, the natural paragon is intensely studied using advanced biophysical methods. Structural studies by X-ray spectroscopy with synchrotron radiation play a prominent role in this endeavor. Time-resolved methods provide insights into the formation of intermediate states of the reaction cycle. An overview is presented focusing on (i) the efficiency of solar energy usage in PSII, (ii) the interrelation between electron transfer and proton relocations, and (iii) the mechanism of water oxidation. As an outlook, new results on water oxidation by biomimetic manganese and cobalt oxides, which may become a key element in future solar-fuel systems, are presented.
The peridinin-chlorophyll protein (PCP) is a light-harvesting complex (LHC) that works as an antenna in the photosynthetic process of dinoflagellates. The protein contains both chlorophyll and carotenoid molecules, the latter being responsible for extending the spectral range of captured light to regions where chlorophylls are transparent. PCP crystal structures [1] reveal that each chlorophyll is surrounded by 3 or 4 molecules of the carotenoid peridinin, located in non-equivalent positions. The different protein environments of the sites might be responsible for a spectral shift of the pigments, with the functional role of extending the absorption spectrum of the complex and enhancing its light-harvesting capabilities. High-resolution X-ray diffraction data on a reconstructed PCP, the refolded peridinin-chlorophyll a-protein (RFPCP) [2], and on the less common high-salt PCP (HSPCP) opened the way to the mechanistic understanding of peridinin spectral tuning, peridinin-chlorophyll energy transfer and the photoprotective mechanism [3]. The two PCP forms differ in various features: spectral properties, molar mass, amino acid sequence and, above all, pigment stoichiometry, the peridinin:chlorophyll ratio being 4:1 for RFPCP and 3:1 for HSPCP [4]. In the present work we perform classical molecular dynamics simulations of RFPCP and HSPCP in explicit water solution. We analyse the structure and dynamics of the proteins and of their pigments to characterize the different peridinin sites in both PCP forms in terms of quantities that can affect the chromophore spectra, such as distortion, fluctuations and the nature of the protein environment. The comparison between the data suggests correspondences between the pigments of the two forms. Quantum and mixed quantum/classical molecular dynamics simulations are also in progress to investigate the effect of the protein environment on the electronic and optical properties of the PCP pigments.
peculiar applications, such as optoelectronics, biosensors and photovoltaics. Among the existing carrier matrices, conductive metal oxides (e.g. indium tin oxide, ITO), carbon nanotubes, graphenes and silicon (Si) are the most frequently used materials because of their unique characteristics, such as good conductivity, good optical properties and excellent adhesion to substrates. In our work we combined purified photosynthetic reaction center protein (RC) and porous silicon (PSi), investigating the morphology and optoelectronic properties of the bio-nanocomposite material. FTIR spectroscopy, scanning electron microscopy and energy-dispersive X-ray spectroscopy indicated the binding of the protein to the PSi. Specular reflectance spectra showed a red shift in the characteristic interference peak of the PSi microcavity, which saturated at higher concentrations of the protein. The binding was more effective if the functionalization was done with the Si-specific oligopeptide compared to the classical covalent binding via aminopropyl-triethoxysilane (APTES). Excitation by single saturating flashes indicated that the RC still exhibited photochemical turnover after binding. The role of reactive oxygen species (ROS) in plant stress, both as a damaging agent and as a potential signal molecule, is often assessed in experiments using photosensitized elicitor dyes. For these studies, it is essential to know how efficiently these chemicals generate ROS and whether they are specific ROS sources, as well as their cellular localization and additional side effects. The present study addresses these issues using a variety of dyes known and traditionally applied as singlet oxygen (¹O₂) sources. Rose bengal (RB), methylene violet (MVi), methylene blue (MB), neutral red (NR) and indigo carmine (IC) were studied as putative ROS sources in tobacco leaves. The ROS products of the photosensitized dyes were measured in vitro using spin-trapping EPR spectroscopy.
Dye concentrations and irradiation conditions leading to equal absorbed excitation quanta were determined spectrophotometrically. In vivo studies were carried out using tobacco leaves infiltrated with aqueous solutions of the putative ¹O₂ sources. Cellular localizations were identified on the basis of the dyes' fluorescence. RB, NR and MVi reached the mesophyll cells and were used to study the effects of these dyes on photosynthesis. Photochemical yields and quenching processes were compared before and after photosensitization of the elicitor dyes inside the leaf samples. Chlorophyll-chlorophyll charge transfer quenching is the main mechanism of non-photochemical quenching in higher plants. Alfred R. Holzwarth, Max-Planck-Institut für Bioanorganische Chemie, Stiftstraße 34-36, D-45470 Mülheim a.d. Ruhr, Germany. Non-photochemical quenching (NPQ) in plants protects against photochemical destruction of the photosynthetic apparatus under excess light conditions. While one location of the NPQ process has been shown to be centered on the major light-harvesting complex II (LHCII) (q1 type, or qE quenching), an additional quenching center responsible for qI-type quenching (identical to the q2 center) has been suggested to be located on the minor light-harvesting complexes upon accumulation of zeaxanthin (Zx), in particular on CP24 and CP29. We have performed femtosecond transient absorption and time-resolved fluorescence measurements of NPQ quenching in intact leaves of higher plants, on the isolated minor (non-aggregated) light-harvesting complex CP29 reconstituted with violaxanthin (Vx) or Zx, and on the isolated major LHCII complex in the aggregated state. In all of these situations we find the formation of Chl-Chl charge transfer (CT) states to be the dominant quenching mechanism.
The yield of formation of carotenoid cation states and/or the carotenoid S1 state is either extremely low or absent, thus excluding their involvement in NPQ quenching as a major quenching mechanism. Single-molecule spectroscopy (SMS) is a powerful technique that allows the investigation of the fluorescence properties of single fluorescing systems. This technique enabled us to investigate the dynamics of the fluorescence intensity and spectral profiles of single, isolated light-harvesting complexes (LHCs) on timescales of milliseconds to seconds, during continuous laser illumination. We were able to observe how each complex can rapidly switch between different emission states [1, 2] and to characterise the intensity and the spectral dynamics of major and minor antenna complexes from plants in two different environments, mimicking the light-harvesting and the light-dissipating state, respectively. The results will be discussed with respect to the current models for non-photochemical quenching (NPQ) mechanisms [3, 4, 5], a vital photoprotection mechanism during which the LHCs of plants switch between a state of very efficient light utilisation and one in which excess absorbed excitation energy is harmlessly dissipated as heat. Phaeodactylum tricornutum is one of the most utilized model organisms in diatom photosynthesis research, mainly due to the availability of its genome sequence (1). Its photosynthetic antennae are the fucoxanthin chlorophyll a/c binding proteins (FCPs), which share a high degree of homology with the LHCs of higher plants and green algae (2). For a detailed investigation of the antenna system of P. tricornutum, a transgenic strain expressing recombinant His-tagged FcpA protein was created, which simplified the purification of a specific stable trimeric FCP complex consisting of FcpA and FcpE proteins. Excitation energy coupling between fucoxanthin and chlorophyll a was intact, and the existence of a chlorophyll a/fucoxanthin excitonic dimer was demonstrated (3).
We investigated in detail the existence of specific antenna systems for PSI and PSII in P. tricornutum, as in the case of higher plants. Our studies indicated that at least the main light-harvesting proteins FcpA and FcpE are most probably shared as a common antenna by both PSI and PSII. ...harvesting complex II (LHCIIb) from spinach or in native thylakoid membranes by picosecond time-resolved fluorescence. The domain size was estimated by monitoring the efficiency of added exogenous singlet excitation quenchers: phenyl-p-benzoquinone (PPQ) and dinitrobenzene (DNB). The fluorescence decay kinetics of the systems under study were recorded without quenchers and with quenchers added over a range of concentrations. Stern-Volmer constants, K′SV and KSV (for aggregates/membranes and detergent-solubilized complexes, respectively), were determined from the concentration dependence of the ratio of the mean fluorescence lifetimes without/with quencher (τ0, τ). The ratio K′SV/KSV · τ0/τ′0 was suggested as a measure of the functional domain size. Values in the range of 15-30 were found for LHCII macroaggregates and 12-24 for native thylakoid membranes, corresponding to domain sizes of 500-1000 chlorophylls. Although substantial, the determined functional domain size is still orders of magnitude smaller than the number of physically connected pigment-protein complexes; thus our results imply that a physical antenna size beyond these numbers has little or no effect on improving the light-harvesting efficiency. The interaction between photosynthetic reaction center proteins (RCs) purified from the purple bacterium Rhodobacter sphaeroides R-26 and functionalized and non-functionalized (single-walled (SWNT) and multi-walled (MWNT)) carbon nanotubes (CNTs) has been investigated. Both structural (AFM, TEM and SEM microscopy) and functional (flash photolysis and conductivity) techniques showed that RCs can be bound effectively to different CNTs.
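The Stern-Volmer analysis described above (τ0/τ = 1 + KSV·[Q]) can be sketched as a least-squares fit through the origin; the quencher concentrations, unquenched lifetime and KSV value below are synthetic illustration data, not the measured spinach values.

```python
def stern_volmer_constant(q_conc, tau0, tau):
    """Least-squares K_SV from tau0/tau = 1 + K_SV*[Q], using mean
    fluorescence lifetimes without/with quencher; fit through the origin."""
    pairs = [(q, tau0 / t - 1.0) for q, t in zip(q_conc, tau)]
    num = sum(q * y for q, y in pairs)
    den = sum(q * q for q, _ in pairs)
    return num / den

# synthetic data generated with K_SV = 20 (arbitrary concentration units)
q = [0.0, 0.05, 0.1, 0.2]
tau0 = 2.0                                   # ns, no quencher
tau = [tau0 / (1.0 + 20.0 * qi) for qi in q]
k_sv = stern_volmer_constant(q, tau0, tau)   # recovers the input K_SV
```

Comparing such constants for aggregated versus detergent-solubilized complexes (K′SV versus KSV) then yields the functional domain-size measure quoted in the abstract.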
Both physical sorption and binding through –NH2 or –COOH groups gave similar results. However, it appeared that with physical sorption some sections of the CNTs were covered by multiple layers of RCs. After binding, the RCs kept their photochemical activity for a long time (at least three months, even in dried form), and there is a redox interaction between the CNT and the RCs. The attachment of RCs to CNTs results in an accumulation of positive and negative charges followed by a slow reorganization of the protein structure after excitation. In the absence of CNTs, the secondary quinone activity decays quickly as a function of time after drying the RC onto a glass surface. The special electronic properties of the SWNT/protein complexes open the possibility for several applications, e.g. in microelectronics, analytics, or energy conversion and storage. The decay of the high-fluorescence state generated by actinic illumination of different durations was measured in whole cells of various strains and mutants of photosynthetic purple bacteria. Although a similar method is used in higher plants, its application to photosynthetic bacteria is novel and highly challenging. The available data are restricted, and usually only the re-oxidation of the reduced primary quinone (QA⁻ → QA) is invoked to explain the decay kinetics. Here, we analyse the complexity of the kinetics over a very broad time range (from 5 µs to 5 s) and show that the dark relaxation of the bacteriochlorophyll fluorescence reflects the overlap of several processes attributed to the intra- and interprotein electron transfer processes of the reaction center (RC) and the cytochrome bc1 complex of the bacterium. On the shorter (< 1 ms) time scale, the dominating effect is the re-reduction of the oxidized primary donor (P⁺ → P), which is followed by the re-oxidation of the acceptor complex of the RC by the cytochrome bc1 complex.
As the lifetimes and amplitudes of the components depend on the physiological state of the photosynthetic apparatus, the relaxation of the fluorescence can be used to monitor the photosynthetic capacity of photosynthetic bacteria in vivo. Circular dichroism (CD) spectroscopy is an indispensable tool to probe molecular architecture. At the molecular level, chirality results in intrinsic CD; pigment–pigment interactions in protein complexes give rise to excitonic CD; whereas "psi-type" CD originates from large, densely packed chiral aggregates. It has been well established that anisotropic CD (ACD), measured on samples with defined orientation, carries specific information on the architecture of molecules. However, ACD can easily be distorted by linear dichroism of the sample or by instrumental imperfections, which might be the reason why it is rarely studied in photosynthesis research. Here we present ACD spectra of isolated intact and washed, unstacked thylakoid membranes, photosystem II membranes (BBY), and tightly stacked lamellar macroaggregates of the main light-harvesting complex II (LHCII). We show that the ACD spectra of face- and edge-aligned stacked thylakoid membranes and LHCII lamellae exhibit profound differences in their psi-type CD bands. Marked differences are also seen in the excitonic CD of BBY and washed thylakoid membranes. Thus ACD provides an additional dimension to these structural studies. Light-induced conformational changes of quinone-depleted photosynthetic reaction centers (RCs) purified from the carotenoid-less Rhodobacter sphaeroides R-26 were investigated by transient absorption (TA) and transient grating (TG) methods. Surprisingly, the decay of the TA signal measured at 860 nm was divided into 15 ns and 40 µs components. The latter coincides with the lifetime of the TG signal, which was assigned earlier [Nagy et al. (2008) FEBS Lett. 582, 3657–3662] to spectrally silent conformational changes. The nature of the 40 µs phase was investigated further.
Although the probability of chlorophyll triplet formation under our measuring conditions was small, the possible contribution of triplet states was also studied. The presence of carotenoid in wild-type RCs eliminated the 40 µs component, indicating the role of the carotenoid in energy transfer within the RCs. There was no significant effect of molecular oxygen on the TA. This fact may be explained if the chlorophyll triplets inside the protein have reduced accessibility to molecular oxygen. A differential effect of osmotic potential and viscosity on the conformational changes accompanying the primary charge separation was measured, using Ficoll, glucose and glycerol, by comparing the TA and TG signals. Variable chlorophyll fluorescence: in part a yield change due to light-induced conformational change. Gert Schansker 1, Szilvia Z. Tóth 1, László Kovács 1, Alfred R. Holzwarth 2 and Győző Garab 1. 1 Institute of Plant Biology, Biological Research Center, Hungarian Academy of Sciences, Szeged, Hungary; 2 Max-Planck-Institut für Bioanorganische Chemie, Mülheim an der Ruhr, Germany. On a dark-to-light transition the chlorophyll fluorescence rises from a minimum intensity (F0) to a maximum intensity (Fm). Conventionally, this rise is interpreted to arise from the reduction of the primary quinone acceptor, QA, of photosystem II, although this cannot explain all presently available observations. In untreated leaves, at room temperature, the fluorescence rise follows the reduction of the electron transport chain (ETC). Once induced, ~30–40% of the variable fluorescence intensity relaxes within 100 ms in darkness and can be re-induced within 3 ms as long as the ETC remains reduced. Analyzing the fluorescence relaxation kinetics, ±DCMU, ~30% of the amplitude cannot be explained by QA⁻ re-oxidation.
Special properties of this phase were determined on DCMU-inhibited samples: at cryogenic temperatures (below −40 °C), where the QA⁻/S2 recombination is blocked, it still relaxes, and it exhibits a strong temperature dependence with an apparent Ea ≈ 12 kJ/mol, whereas the reduction of QA is nearly temperature insensitive. A fluorescence yield change, driven by a light-induced conformational change in the reaction center complex, can explain all these observations. Tuning function in bacterial light-harvesting complexes. Katia Duquesne 1, Edward O'Reilly 2, Cecile Blanchard 1, Alexandra Olaya-Castro 2 and James N. Sturgis 1. 1 LISM, CNRS and Aix-Marseille University, Marseille, France; 2 Department of Physics, University College London, UK. Purple photosynthetic bacteria are able to synthesize a variety of different light-harvesting complexes, sometimes referred to as LH2, LH3 and LH4. Here we have investigated the structural origins of these different forms and the manner in which the sequence tunes the absorption spectrum of the light-harvesting system. We then consider the functional consequences of this tuning for the organization of the light-harvesting system and for the ecology of the organisms. Specifically, by spectroscopic techniques, in particular circular dichroism and resonance Raman spectroscopy, we have been able to obtain information on the organization of the bacteriochlorophyll binding sites in the unusual LH4 of Roseobacter denitrificans. This provides a picture of how different peripheral light-harvesting complexes are able to modulate the absorption spectrum. The structure and organization of this complex is then put in the context of the recently published variability of the light-harvesting complexes, in particular the observation of their ability to form mixed complexes containing different polypeptides.
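The strength of the temperature dependence reported above for the DCMU-insensitive fluorescence phase (apparent Ea ≈ 12 kJ/mol) can be illustrated with the Arrhenius relation k = A·exp(−Ea/RT). This is a back-of-envelope sketch, not the authors' analysis; the two temperatures are chosen only to span the cryogenic and room-temperature regimes mentioned in the abstract.

```python
import math

R = 8.314    # gas constant, J mol^-1 K^-1
Ea = 12e3    # apparent activation energy from the abstract, J mol^-1

def arrhenius_ratio(T1, T2, Ea=Ea):
    """Ratio k(T2)/k(T1) for an Arrhenius-activated process
    (the pre-exponential factor A cancels in the ratio)."""
    return math.exp(-Ea / R * (1.0 / T2 - 1.0 / T1))

# Speed-up of the relaxation from -40 degC (233 K) to 20 degC (293 K):
speedup = arrhenius_ratio(233.0, 293.0)   # roughly 3.6-fold
```

A barrier of this magnitude thus changes the rate only a few-fold over a 60 K span, consistent with a soft, conformational rather than photochemical origin of the phase.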
We examine quantitatively the possible reasons for maintaining such variability by considering the transport properties of membranes containing either pure or mixed complexes, and show that mixed complexes can permit light harvesting to continue during adaptation. We then consider the different constraints that may be behind this type of adaptation in different bacteria and the conditions under which different types of antenna system might be optimal. Finally, we integrate this into the evolutionary context of adaptation to variable light intensity and the ecological niches where such organisms are found. Interaction between photosynthetic reaction centers and ITO. T. Szabo 1, G. Bencsik 2, G. Kozak 1, Cs. Visy 2, Z. Gingl 3, K. Hernadi 4, K. Nagy 5, Gy. Varo 5 and L. Nagy 1. Departments of 1 Medical Physics and Informatics, 2 Physical Chemistry and Materials Science, 3 Technical Informatics and 4 Applied and Environmental Chemistry, University of Szeged, Hungary; 5 Institute of Biophysics, HAS Biological Research Center, Szeged, Hungary. Photosynthetic reaction center proteins (RCs) purified from the purple bacterium Rhodobacter sphaeroides were deposited on the surface of indium tin oxide (ITO), a transparent conductive oxide, and the photochemical/photophysical properties of the composite were investigated. The kinetics of the light-induced absorption change indicated that the RC was still active in the composite and that there was an interaction between the protein cofactors and the ITO. The electrochromic response of the bacteriopheophytin absorption at 771 nm showed an increased electric-field perturbation around this chromophore on the surface of ITO compared to that measured in solution. This absorption change is associated with the charge-compensating relaxation events inside the protein. A similar lifetime, but smaller magnitude, of this absorption change was measured on the surface of borosilicate glass.
The light-induced change in the conductivity of the composite as a function of concentration showed typical sigmoid saturation characteristics, unlike when chlorophyll, which is photochemically inactive in this configuration, was layered on the ITO. In this latter case the light-induced change in conductivity was inversely proportional to the chlorophyll concentration, owing to thermal dissipation of the excitation energy. The supramolecular organization of photosystem II in vivo studied by circular dichroism spectroscopy. Tünde Tóth 1,2, Herbert van Amerongen 2,3, Győző Garab 1, László Kovács 1. 1 Institute of Plant Biology, Biological Research Center, Hungarian Academy of Sciences, Hungary; 2 Laboratory of Biophysics, Wageningen University, Wageningen, The Netherlands; 3 MicroSpectroscopy Centre, Wageningen University, Wageningen, The Netherlands. The light reactions of photosynthesis in higher plants take place in granal chloroplast thylakoid membranes, which contain chirally organized macrodomains composed of photosystem II (PSII) supercomplexes associated with light-harvesting antenna complexes (LHCIIs). The physiological relevance of this hierarchic organization, which often manifests itself in semicrystalline assemblies, has not been elucidated, but the diversity of the supramolecular structures and their reorganizations under different conditions indicate its regulatory role. The present work focuses on the structural and functional roles of different components of LHCII–PSII supercomplexes. We used various growth conditions influencing the protein composition, and different Arabidopsis mutants (koCP24, koCP26, koPsbW, koPsbX, dgd1) with altered organization of the membranes, and measured their circular dichroism (CD) spectra as well as their chlorophyll fluorescence kinetics to characterize the chiral macro-organization of the chromophores and the functional parameters of the membranes, respectively.
We show that the formation of chiral macrodomains requires the presence of supercomplexes. Our data also reveal specific functions of some of the protein or lipid components in the light-adaptation processes of plants. Excitation energy transfer and non-photochemical quenching in photosynthesis. Rienk van Grondelle, Department of Physics, VU University, De Boelelaan 1081, 1081 HV Amsterdam, The Netherlands. The success of photosynthesis relies on two ultrafast processes: excitation energy transfer in the light-harvesting antenna followed by charge separation in the reaction center. LHCII, the peripheral light-harvesting complex of photosystem II, plays a major role. At the same time, the same light-harvesting system can be 'switched' into a quenching state, which effectively protects the reaction center of photosystem II from over-excitation and photodamage. In this talk I will demonstrate how LHCII collects and transfers excitation energy. Using single-molecule spectroscopy we have discovered how LHCII can switch between this light-harvesting state, a quenched state and a red-shifted state. We show that the switching properties between the light-harvesting state and the quenched state depend strongly on the environmental conditions, the quenched state being favoured under 'NPQ-like' conditions. It is argued that this is the mechanism of non-photochemical quenching in plants. Photobiology in the soil: arrested chlorophyll biosynthesis in pea epicotyl sections. Beáta Vitányi, Katalin Solymosi, Annamária Kósa, Béla Böddi. Eötvös University, Institute of Biology, Department of Plant Anatomy, Pázmány P. s. 1/C, H-1117 Budapest, Hungary. The key regulatory step of chlorophyll (Chl) biosynthesis is the NADPH:protochlorophyllide oxidoreductase (POR)-catalyzed reduction of protochlorophyllide (Pchlide), which is light-activated in angiosperms. This process is usually described on artificially dark-grown plants.
In this work, we studied epicotyl segments developed under the soil surface, which were dissected from pea plants grown under natural light conditions. Using 77 K fluorescence spectroscopy, pigment analyses, electron microscopy and fluorescence microscopy, we found that upper segments showed transitional developmental stages, i.e. Chl appeared in addition to Pchl(ide) and etio-chloroplasts were typical. In regions below 5 cm depth, however, the characteristics of the segments were similar to those of plants germinated artificially in complete darkness, i.e. only Pchl(ide) and etioplasts were present. The results of this work prove that these latter symptoms may occur in shaded tissues of fully developed, photosynthetically active plants grown under natural conditions. In this overview talk it will be shown how atomistic computations can complement experimental measurements in our quest to understand biological electron and proton transfer reactions. First, the molecular simulation methods for calculating important electron transfer parameters, such as the reorganization free energy, electronic coupling matrix elements and reduction potentials, will be explained. Then three applications will be discussed where such computations help interpret experimental data at a molecular level. The first example concerns electron tunneling between heme a and heme a3 in the proton pump cytochrome c oxidase. This reaction is very fast, occurring on the nanosecond time scale, and it is unclear whether this is due to an unusually low reorganization free energy or to high electronic coupling. Carrying out large-scale all-atom molecular dynamics simulations of the oxidase embedded in a membrane, we do not find evidence for unusually small values of the reorganization energy as proposed previously, implying that the nanosecond tunneling rate between heme a and a3 is supported by very efficient electronic coupling.
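The interplay of reorganization free energy and electronic coupling discussed above can be made concrete with the standard non-adiabatic Marcus rate expression. The λ, ΔG° and coupling values below are generic illustrative numbers, not those of the cytochrome c oxidase study; the sketch only shows how, at fixed λ, a tenfold increase in coupling buys two orders of magnitude in rate.

```python
import math

HBAR = 6.582e-16   # reduced Planck constant, eV s
KT = 0.025852      # k_B * T at 300 K, eV

def marcus_rate(H_ab, lam, dG):
    """Non-adiabatic Marcus electron-transfer rate (s^-1);
    H_ab, lam (reorganization energy) and dG in eV."""
    prefactor = (2.0 * math.pi / HBAR) * H_ab**2 \
        / math.sqrt(4.0 * math.pi * lam * KT)
    return prefactor * math.exp(-(dG + lam) ** 2 / (4.0 * lam * KT))

lam, dG = 0.7, -0.05                     # illustrative values, eV
k_weak = marcus_rate(1e-4, lam, dG)      # weak coupling: ~microsecond ET
k_strong = marcus_rate(1e-3, lam, dG)    # 10x coupling: 100x faster
```

Since the rate scales with |H_ab|² while λ enters both the prefactor and the activation term, a nanosecond tunneling rate at ordinary λ indeed points to efficient electronic coupling rather than to an anomalously small reorganization energy.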
The second example concerns electron transport in a deca-heme 'wire'-like protein, used by certain anaerobic bacteria to transport electrons from the inside of the cell to extracellular substrates. The crystal structure of such a protein has recently been solved for the first time. However, it is unclear whether, and in which direction, the wire structure supports electron transport. Here we present results of heme reduction potential calculations that help us reveal the possible electron flow in this protein. In a third example we explain how quantum mechanical/molecular mechanical (QM/MM) methods recently helped us understand why the catalase from H. pylori is prone to undergo an undesired protein radical migration reaction during catalysis. Proton pumping activity of purple and brown membranes regenerated with retinal analogues. K. Bryl 1, K. Yoshihara 2. 1 University of Warmia and Mazury, Department of Physics and Biophysics, Olsztyn, Poland; 2 Suntory Institute for Bioorganic Research, Wakayamadai, Osaka 618, Japan. The retinal protein bacteriorhodopsin (bR) acts as a light-driven proton pump in the purple membrane (PM) of Halobacterium salinarium (H.s.). The aim of these studies was to clarify whether the specific crystalline structure of the protein and protein–substrate interactions are significant for H⁺ transfer into the aqueous bulk phase. Two membrane systems were prepared: purple membranes (bR arranged in a two-dimensional hexagonal lattice) and brown membranes (bR not arranged in a crystalline lattice) were regenerated with 14-fluororetinals. Light-induced proton release and re-uptake, as well as surface potential changes inherent in the reaction cycles of the regenerated systems, were measured. Signals of optical pH indicators residing in the aqueous bulk phase were compared with signals of a pH indicator covalently linked to the extracellular surface of the proteins and with surface potential changes detected by a potentiometric probe.
The activation energies of proton transfer have been calculated. The experimental results and thermodynamic parameters (activation energies) suggest different mechanisms of proton transfer into the aqueous bulk phase in these two systems. The implications for models of localized–delocalized energy coupling by proton gradients will be discussed. Iron regulation is a vital process in organisms, and in most of them it is accomplished through metal solubilisation and storage by ferroxidase enzymes of the ferritin family, which have the ability to sequester, oxidize and mineralize ferrous ions using oxygen or hydrogen peroxide as substrate. DNA-binding proteins from starved cells (Dps) belong to this ferritin family. Dps belongs to the sub-type designated as miniferritins and, besides its iron storage and release capability, is responsible for hydrogen peroxide resistance, showing the ability to form stable complexes with DNA. The preferred co-substrate of this enzyme is H2O2, although the reaction can also occur in the presence of oxygen, at a lower rate [1, 2]. In this work, the electrochemical behaviour of the recombinant Dps from Pseudomonas nautica was assessed as a function of metal content in an anaerobic environment with H2O2 as co-substrate. The electrochemical results obtained, together with spectroscopic studies, allowed us to infer new hypotheses on the Dps iron uptake mechanism. Electron transfer was studied for myoglobin electrostatically immobilized on Au-deposited mixed self-assembled monolayers (SAMs) of the composition –S–(CH2)11–COOH/–S–(CH2)11–OH. Our approach allows for soft switching of the haem group charge state and accurate probing of the accompanying reorganizational dynamics of conformational (quasi-diffusional) and quantum (e.g. proton-related) modes.
The electron transfer rate constants were determined with H2O or D2O as solvent, under variable temperature (288–328 K) or pressure (5–150 MPa) conditions, revealing an overall reorganization free energy of 0.5 ± 0.1 eV, an activation volume of −3.1 ± 0.1 cm³ mol⁻¹ and an inverse solvent kinetic isotope effect of 0.7 ± 0.1 (25 °C). On the grounds of an extended charge-transfer theory, we propose a specific proton-coupled ET mechanism additionally coupled to the slow conformational dynamics of the protein matrix, accompanied by translocation(s) of haem-adjacent water molecule(s). Proton gradients across pore-spanning membranes: towards on-chip energy conversion. Daniel Frese, Claudia Steinem. Institute for Organic and Biomolecular Chemistry, Georg-August-University Goettingen, Germany. In cell organelles, chemiosmotic potentials resulting from proton gradients across membranes are widely used to fix chemical energy in the form of ATP. The high efficiency of this protein-mediated energy conversion raises interest in artificial proton-gradient setups. To investigate proton transport across artificial membranes, we prepared pore-spanning membranes (PSMs) on porous silicon substrates via the painting technique. This allowed us to trap aqueous content of well-defined composition and volume inside the substrate's microcavities. Nigericin, an ionophore that acts as an H⁺/K⁺ antiporter, and bacteriorhodopsin, a transmembrane protein well known to be a light-driven proton pump, were reconstituted into the pre-formed PSMs to achieve proton transport from one aqueous compartment to the other. Changes in proton concentration inside the pores were monitored by means of confocal laser scanning microscopy (CLSM). To this end, the pores were filled with pyranine, a pH-sensitive fluorescent dye, and variations in intensity were measured to analyze proton translocation.
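The chemiosmotic potential built up in such pore-spanning-membrane experiments combines an electrical term (Δψ) and a chemical term (ΔpH). A minimal sketch of the textbook proton-motive-force relation follows; the numbers are illustrative, not measurements from this work, and the sign convention (ΔpH = pH_in − pH_out) is an assumption stated in the comments.

```python
import math

R = 8.314      # gas constant, J mol^-1 K^-1
F = 96485.0    # Faraday constant, C mol^-1

def proton_motive_force(delta_psi, delta_pH, T=298.15):
    """Proton motive force in volts:
    pmf = delta_psi - 2.303*R*T/F * delta_pH,
    with delta_pH = pH_in - pH_out (assumed convention)."""
    return delta_psi - 2.303 * R * T / F * delta_pH

# One pH unit of acidification across the membrane, no electrical term:
pmf = proton_motive_force(0.0, 1.0)   # about -0.059 V at 25 degC
```

The familiar rule of thumb that one pH unit is worth roughly 59 mV at room temperature drops out directly.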
We were able to show that both nigericin and bacteriorhodopsin are capable of building up a proton gradient across PSMs, and we plan to co-reconstitute ATP synthases for on-chip energy conversion through the formation of ATP. Application of Gibbs free energy profiles to sequential biochemical reactions. Péter Farkas, Tamás Kiss and Eugene Hamori. Department of Biological Physics, Eötvös Loránd Tudományegyetem, Budapest, Hungary, and Department of Biochemistry, Tulane University, New Orleans, LA, USA. A full understanding of the energetic details of complex metabolic reaction sequences requires a step-by-step analysis of the Gibbs free energy (G) changes of the "parasystem" (i.e., the collection of atoms comprising all the molecules participating in a given reaction) as it gradually changes from its initial reactants state to its final products state along the reaction pathway. Knowing the respective equilibrium constants of each of the participating reaction steps and also the actual in vivo concentrations of the metabolites involved, a free-energy profile can be constructed that reveals important information about the progress of the reaction as driven by thermodynamic forces. This approach will be illustrated on some biochemical reactions, including the glycolytic/gluconeogenic pathways. Furthermore, the often misleading textbook representation of enzymatic catalysis will be re-examined and explained in thermodynamic terms using the free-energy profiles of both the non-catalyzed and the enzyme-catalyzed reactions. Redox-active proteins can be diversely functionalized at metal-deposited self-assembled monolayers (SAMs) of widely variable composition and thickness.
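The profile construction described above for sequential reactions reduces to summing per-step in-vivo free energy changes, ΔG = ΔG° + RT ln Q with ΔG° = −RT ln K. A minimal sketch with invented equilibrium constants and mass-action ratios (not the pathway data of the talk):

```python
import math

R = 8.314e-3   # gas constant, kJ mol^-1 K^-1
T = 310.0      # physiological temperature, K

def step_dG(K_eq, Q):
    """In-vivo Gibbs free energy change of one reaction step (kJ/mol):
    dG = dG0 + RT ln Q, with dG0 = -RT ln K_eq, i.e. RT ln(Q / K_eq)."""
    return R * T * (math.log(Q) - math.log(K_eq))

# Hypothetical two-step pathway: (equilibrium constant, in-vivo
# mass-action ratio Q) for each step; the numbers are illustrative only.
steps = [(300.0, 0.1), (0.05, 0.001)]

profile = [0.0]                       # cumulative free-energy profile
for K_eq, Q in steps:
    profile.append(profile[-1] + step_dG(K_eq, Q))
```

Both steps are exergonic in vivo (Q well below K in each case) even though the second has an unfavourable K, illustrating the point that in-vivo concentrations, not equilibrium constants alone, determine the thermodynamic driving force along the pathway.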
The voltammetric methodology, in combination with advanced data-processing procedures, allows for comprehensive kinetic data within a congruent series of nano-devices and the subsequent calculation of key physical parameters, such as the medium reorganization energy of ET, the donor–acceptor electronic coupling, the effective relaxation time (related to the fluctuational dynamics of the protein and its environment), etc. In our studies the "model" redox protein, cytochrome c (CytC), was either freely diffusing to the SAM terminal groups (mode 1), attached to SAMs through electrostatic interaction (mode 2), or specifically "wired" (mode 3). Another redox-active protein, azurin, was confined at the terminal SAM groups through hydrophobic interaction (mode 4). Diverse experimental strategies, including variation of the SAM thickness, solution viscosity, temperature and hydrostatic pressure, allowed for a stringent demonstration of full adiabatic and nonadiabatic control (thinner and thicker SAMs, respectively) and of the intermediate regime, in good agreement with the major theoretical predictions. Proton transfers in a light-driven proton pump. J. K. Lanyi, Dept. of Physiology & Biophysics, University of California, Irvine, USA. Illumination of bacteriorhodopsin causes isomerization of all-trans retinal to 13-cis,15-anti, and a cyclic reaction ensues in which the protein and the chromophore undergo conformational changes with an overall ten-millisecond turnover time, and a proton is transported from one membrane side to the other. With crystal structures of six trapped intermediate states and plausible structural models for the remaining two intermediates, structures are now available for the initial bacteriorhodopsin state and all intermediates.
They reveal the molecular events that underlie the light-induced transport: proton transfer from the retinal Schiff base to Asp85, proton release to the extracellular membrane surface, a switch event that allows reprotonation of the Schiff base from the cytoplasmic side, side-chain and main-chain motions initiated in the cytoplasmic region, formation of a single-file chain of hydrogen-bonded water molecules that conducts the proton of Asp96 to the Schiff base, and reprotonation of Asp96 from the cytoplasmic surface. The observed changes can be summarized as a detailed atomic-level movie in which gradual relaxation of the distorted retinal causes a cascade of displacements of water and protein atoms that results in vectorial proton transfers to and from the Schiff base. Electron transfer (ET) processes are fundamental in photosynthesis, respiration and enzyme catalysis. The relative importance of superexchange and sequential mechanisms in biological ET is still a matter of debate. The identification of any "stepping stones" necessary for electron hopping is a key point in the understanding of long-range ET. Hence, the study of a single event in the sequence of reactions occurring in these phenomena is a fundamental but formidable task. Muon spin relaxation (µSR) has been shown to be sensitive to charge transport on a molecular length scale. The muon is a very sensitive probe of electron transport, as any changes in the electronic density sampled by the muon can change its spin polarization, which can easily be measured. In this context, a very useful tool is the detection of the so-called avoided level crossing (ALC) resonances [1]. The enhancement in the loss of polarization of the muon's spin at these resonances dramatically increases the sensitivity. We show that a laser-pump/µSR-probe technique can measure ET processes at particular, and most importantly known, sites within the amino-acid chain, and can therefore track the time evolution of the electron over the molecule.
Keywords: photosynthesis, reaction center, electron transfer, proton transfer, Fourier transform infrared, L210DN, isotopic labeling, band assignment, histidine, mechanism. In photosynthesis, the central step in transforming light energy into chemical energy is the coupling of light-induced electron transfer to proton uptake. In the photosynthetic reaction center (RC) of Rhodobacter sphaeroides, fast formation of the charge-separated state P⁺QA⁻ is followed by a slower electron transfer from the primary quinone QA to the secondary quinone QB and the uptake of a proton from the cytoplasm by QB. Previous Fourier transform infrared (FTIR) measurements on RCs suggested an intermediate X in the QA⁻QB → QAQB⁻ transition. Mutation of the amino acid AspL210 to Asn (L210DN mutant) slows down proton uptake and the oxidation of QA⁻. Using time-resolved FTIR spectroscopy we characterized this RC mutant and proposed specific IR bands that belong to the intermediate X. To study the role of the iron–histidine complex located between QA and QB, we performed fast-scan FTIR experiments on the L210DN mutant labelled with isotopically labelled histidine. We assigned IR bands of the intermediate X between 1120 cm⁻¹ and 1080 cm⁻¹ to histidine vibrations. These bands show the protonation of a histidine, most likely HisL190, during the QA⁻QB → QAQB⁻ transition. Based on these results we propose a new mechanism for the coupling of electron and proton transfer in photosynthesis. Complex I of respiratory chains is an energy-transducing enzyme present in most bacteria and in all mitochondria. It is the least understood complex of the aerobic respiratory chain, even though the crystallographic structures of the bacterial and mitochondrial complexes have recently been determined [1, 2]. This complex catalyses the oxidation of NADH and the reduction of quinone, coupled to cation translocation across the membrane.
Rhodothermus marinus complex I, our main model system, is a NADH:menaquinone oxidoreductase and has been extensively characterized. We have made an exhaustive study in order to identify all the subunits present in the complex [3]. The nature of the coupling charge of R. marinus complex I was investigated using inside-out membrane vesicles, which were active with respect to NADH oxidation and capable of creating and maintaining an NADH-driven membrane potential (Δψ), positive inside. It was observed that this bacterial complex I is capable of both H⁺ and Na⁺ transport, although in opposite directions. The coupling ion of the system was shown to be H⁺, transported to the periplasm, thereby contributing to the establishment of the electrochemical potential difference, while Na⁺ is translocated to the cytoplasm [4]. The sodium-ion extrusion from the membrane vesicles was due to the activity of complex I, since it was sensitive to its inhibitor rotenone and was still observed when the complex I segment of the respiratory chain was isolated by the simultaneous presence of cyanide and external quinones. Additional studies have shown that, although neither the catalytic reaction nor the establishment of the ΔpH requires the presence of Na⁺, the presence of this ion increases the proton transport. Combining all these results, a model for the coupling mechanism of complex I was proposed, suggesting the presence of two different energy-coupling sites: one that works only as a proton pump (Na⁺-independent), and another functioning as a Na⁺/H⁺ antiporter (Na⁺-dependent) [4]. This model was reinforced by further studies performed in the presence of the Na⁺/H⁺ antiporter inhibitor 5-(N-ethyl-N-isopropyl)-amiloride (EIPA) [5]. Deeper insight into the coupling mechanism of this enzyme was provided by studying the influence of sodium ions on energy transduction by complexes I from Escherichia coli and Paracoccus denitrificans. It was observed that the Na⁺
/H⁺ antiporter activity is not exclusive to R. marinus complex I, since the E. coli enzyme is also capable of such transport, but it is not a general property, given that the P. denitrificans enzyme did not perform sodium translocation [6]. Since the R. marinus and E. coli enzymes reduce menaquinone while P. denitrificans complex I reduces ubiquinone, it is suggested that the Na⁺/H⁺ antiporter activity may be correlated with the type of quinone used as substrate. Under anaerobic conditions some bacteria can use nitrate instead of oxygen in a process called denitrification. During denitrification, the reduction of NO to N2O is catalyzed by a membrane-bound enzyme, nitric oxide reductase (NOR). This enzyme represents an important step in the evolution of the respiratory system: NOR belongs to the superfamily of O2-reducing heme-copper oxidases and is assumed to be the evolutionary ancestor of cytochrome c oxidase. The understanding of NOR functioning has been limited by the lack of structural information, but recently the first structures (of the cNOR type from Ps. aeruginosa and the qNOR type from G. stearothermophilus) were solved [1, 2]. We will present results of the first computational studies of NOR (both cNOR and qNOR types) [2, 3]. The studies include: (i) large-scale all-atom MD simulations of the proteins in their natural environment (i.e. embedded in membrane and solvent), performed to describe water dynamics inside the protein and to identify potential proton transfer pathways, and (ii) free-energy calculations by the empirical valence bond (EVB) method [4] for the explicit proton translocations along the pathways established by MD. Among the important findings are new proton pathways, which were not predicted from the X-ray structure and could be identified only by means of computer simulations. The simulations also reveal that, despite the high structural similarity between cNOR and qNOR, these enzymes utilize strikingly different proton uptake mechanisms.
Our results provide insights into the functional conversion between NO and O2 reductases, and into the evolution of proton transfer mechanisms and of respiratory enzymes in general. The genome of the bacterium Geobacter sulfurreducens (GS) encodes 111 c-type cytochromes (1). Genetic studies using cytochrome-deficient GS strains and proteomic studies identified cytochromes that were produced under specific growth conditions (2)(3)(4). A putative outer-membrane cytochrome, OmcF, is crucial for Fe3+ and U6+ reduction and also for microbial electricity production (4). OmcF is a monoheme c-type cytochrome with sequence similarity to the soluble cytochromes c6 of photosynthetic algae and cyanobacteria (4). The structure of oxidized OmcF was determined (5), constituting the first example of a cytochrome c6-like structure from a non-photosynthetic organism. The structural features of OmcF hinted at a function different from that of cytochromes c6 of photosynthetic organisms, providing an excellent example of how structurally related proteins are specifically designed by nature to perform different physiological functions. In order to elucidate the structural-functional mechanism of OmcF, isotopically labeled protein (15N and 13C) was produced and its structure in the reduced form determined by NMR. Single point mutations at key residues were introduced by site-directed mutagenesis, and their impact on the structural and functional properties of OmcF will be presented. In the early 90s, the search for the source of nitrogen monoxide (NO) production in mammals led to the discovery of three major isoforms of NO-synthases (NOS): the neuronal NOS (nNOS), the inducible NOS (iNOS), and the endothelial NOS (eNOS) (1). 20 years later, based on genomic analyses, numerous NOS-like proteins have been identified in the genomes of other organisms, in particular of several bacteria (Bacillus anthracis, Staphylococcus aureus… (2)).
In spite of superimposable 3D structures and the ability to catalyse NO production, these enzymes carry out different (if not opposite) physiological activities, including cGMP signalling, cytotoxic activities, anti-oxidant defence, metabolism… Moreover, NOSs have become increasingly associated with oxidative-stress-related pathologies ranging from neurodegenerative disorders to cardiovascular and inflammatory diseases, diabetes, and cancers (3). This apparent paradox seems related to the belief that the strong similarity of sequence and structure of NO-synthases must lead to a unique and identical functioning (NO production) for all isoforms. This is blatant for bacterial NOS-like proteins, which lack the essential components required for NO biosynthesis but are still considered genuine NO synthases. This view may remain an obstacle to understanding the actual biological roles of NOSs and could prevent the design of efficient NOS-targeted therapeutic strategies. To elucidate this "NOS paradox", our group has initiated a multidisciplinary approach that aims to relate the wide diversity of NOS biological activities to variations in the catalytic mechanism of NOSs, to modifications of their regulation patterns, and to adaptations to their physiological environment. In this context we have been investigating the mechanism of Bacillus subtilis NOS-like proteins, with a special focus on the features that are specific to the NOS mechanism: (i) electron and proton transfers and the role of the substrate and the pterin cofactor; (ii) oxygen activation and the role of the proximal ligand; (iii) the molecular mechanism itself and the variations in the nature of the reaction intermediates.
To that end we have been using a combination of radiolytic techniques (cryoreduction with 60Co γ-irradiation, pulse radiolysis with the ELYSE electron accelerator), state-of-the-art spectroscopies (EPR, ATR-FTIR, resonance Raman, and picosecond UV-visible absorption spectroscopy), organic synthesis (synthesized substrate and cofactor analogues), and biochemistry and molecular biology (site-directed mutagenesis). We will present our results on the coupling of electron and proton transfers and on the tuning of the proximal "push effect", and we will discuss the conditions that favour, for each NOS isoform, NO production versus other reactive nitrogen and oxygen species. Photosynthetic iron oxidation (PIO) is an ancient form of photosynthesis with relevant consequences in the shaping of the planet. This form of metabolism may have been involved in the deposition of the geological structures known as banded iron formations, which hold key information regarding the co-evolution of photosynthesis and Earth. Rhodopseudomonas palustris TIE-1 and Rhodobacter ferrooxidans SW2 both use ferrous iron as an electron donor to support photosynthetic growth (i.e. photoferrotrophy). The SW2 foxEYZ operon can stimulate light-dependent iron oxidation by other bacteria. It encodes, respectively, a two-heme cytochrome, a pyrroloquinoline quinone protein, and an inner-membrane transporter. In TIE-1, the pioABC operon is required for photoferrotrophy. It encodes, respectively, a ten-heme cytochrome, an outer-membrane beta-barrel, and a high-potential iron-sulfur protein (HiPIP). Here we present the functional and structural characterization of proteins involved in PIO. This molecular characterization is essential for understanding this mode of bioenergetic metabolism, and may one day aid the development of biotechnological applications like microbial fuel cells and bioremediation.
Alongside the classical cytochrome respiratory pathway, Phycomyces blakesleeanus possesses an alternative, cyanide-resistant respiration (CRR) facilitated by the alternative oxidase (AOX). In order to study the role of oxygen in the regulation of CRR, the effects of cyanide on the respiration of 24-h-old mycelia in aerated (control), hypoxic and anoxic conditions were measured. Mycelium was incubated under these conditions for 1.5 h, 3 h and 5 h. After 1.5 h, AOX activity was increased only in specimens incubated under anoxic conditions (13.6%). After 3 h, the increase in AOX activity was significant in both hypoxic and anoxic specimens (18.9% and 18.8%, respectively), with an even greater increase after 5 h: 20.7% for hypoxic and 23.3% for anoxic specimens. Mycelia treated for 5 h were then oxygenated for 10 minutes. This induced a decrease in AOX activity of 17% in anoxic and as much as 23.9% in hypoxic mycelia. AOX is recognized as one of the mechanisms for maintaining low levels of reduced ubiquinone that can function under conditions in which the cytochrome chain is disabled, such as anoxia. This is in concordance with the results obtained on P. blakesleeanus, where AOX levels rise under hypoxic and anoxic conditions and decrease close to the control level shortly after the introduction of oxygen into the system. Influence of Escherichia coli F0F1-ATPase on hydrogenase activity during glycerol fermentation. K. Trchounian 1,2, G. Sawers 2, A. Trchounian 1. 1 Department of Biophysics, Yerevan State University, 0025 Yerevan, Armenia; 2 Institute for Microbiology, Martin Luther University Halle-Wittenberg, 06120 Halle, Germany. E. coli encodes four hydrogenases (Hyd); only three of these, Hyd-1, Hyd-2 and Hyd-3, have been well characterized. Hyd-2 has recently been shown to reversibly evolve hydrogen during glycerol fermentation at pH 7.5 [1]. Proton reduction was inhibited by N,N'-dicyclohexylcarbodiimide, suggesting a link with the proton-translocating F0F1-ATPase. Indeed, at pH 7.5 in an E. coli mutant (DK8) lacking F0F1, overall Hyd activity was reduced to approximately 50% of the wild-type activity; Hyd-2, but not Hyd-1, was detected in an in-gel activity assay.
F0F1 is therefore suggested to be required for Hyd-1 activity. At pH 6.5 in glycerol medium, Hyd activity in DK8 was ~10% of the wild-type activity, and Hyd-1 and Hyd-2 exhibited only weak activity. This indicated a significant F0F1 contribution to Hyd activity as pH decreased. Furthermore, at pH 5.5 Hyd activity was negligible and only a very weak activity band corresponding to Hyd-2 could be observed. These results suggest that the F0F1-ATPase is essential for hydrogenase activity during glycerol fermentation at pH 5.5. Taken together, the results suggest an interdependence between Hyd-1, Hyd-2 and F0F1-ATPase activity. Ion channels and transporters control many facets of cancer cell biology 1, and blocking their activity impairs tumor cell growth in vitro and in vivo. This new paradigm has opened new opportunities for pharmaceutical research in oncology 1,2. We have contributed to this field by showing that Kv11.1 (hERG1) channels are aberrantly expressed in several human cancers, where they control different aspects of neoplastic cell biology such as proliferation and apoptosis, invasiveness and angiogenesis, the latter through the regulation of VEGF secretion (reviewed in 1). The hERG1-dependent effects were shown in vitro and, more recently, in vivo. In preclinical models of both leukemia 1 and colorectal cancer 3, hERG1 overexpression confers a higher malignancy to neoplastic cells. Moreover, hERG1 blockers have therapeutic potential, since preclinical tests showed that treatment with specific hERG1 blockers overcame chemoresistance in acute leukemias 4 as well as reduced GI cancer growth, angiogenesis and metastatic spread 3. The overall message emerging from our data is that the hERG1 protein represents a novel biomarker and drug target in oncology.
Up to now, hERG1 has been considered an "antitarget" because of the cardiac side effects that many (but not all) hERG1 blockers produce, which result in lengthening of the electrocardiographic QT interval. We report here recent studies on known and newly developed hERG1 blockers that exhibit no cardiotoxicity and are more specific for the hERG1 channels expressed in cancer cells. We reported previously that increasing the cholesterol content of the cell membrane (in vitro) modified the biophysical parameters of the gating of Kv1.3 K+ ion channels in human T lymphocytes. In the present study we aimed to determine the effect of hypercholesterolemia on the biophysical parameters of Kv1.3 gating and on the proliferation of T cells. T lymphocytes were isolated from the peripheral blood of patients with a cholesterol level considered normal (<5.2 mmol/L, control group) and patients with hypercholesterolemia (HC). Whole-cell K+ currents were measured in patch-clamped T cells, and the kinetic (activation and inactivation kinetics) and equilibrium (voltage dependence of steady-state activation) parameters of Kv1.3 gating were determined. Lymphocyte proliferation was measured using CFSE staining with and without anti-CD3 and anti-CD28 stimulation. Our results indicate that the biophysical parameters of Kv1.3 gating are similar in the control group and in the HC samples. The CFSE-based assay showed that hypercholesterolemic T cells had a higher spontaneous activation rate than the control group. However, T cells from high-cholesterol patients challenged with anti-CD3 and anti-CD28 exhibited a lower proliferation rate than control cells. Generalized epilepsy with febrile seizures plus (GEFS+, OMIM 604233) is a childhood genetic epilepsy syndrome associated with mutations in the ancillary β-subunit of neuronal voltage-gated sodium channels (NaChs).
The β1-subunit is non-covalently associated with NaCh α-subunits, serving as a modulator of channel activity, a regulator of channel cell-surface expression, and a cell adhesion molecule. The first and best characterized GEFS+ mutation is C121W. This mutation changes a conserved cysteine residue into a tryptophan, disrupting a putative disulphide bridge that normally maintains an extracellular immunoglobulin-like fold. In this study, we investigated the presence of this putative disulphide bond using 2D diagonal SDS-PAGE, in which the proteins were separated in the first dimension in the absence of a reducing agent and in the second dimension in its presence. This method allows visualization of the protein above the diagonal, experimentally confirming that the disulphide bond is intramolecular. Duchenne muscular dystrophy (DMD) is associated with severe cardiac complications. Recent research suggests that impaired voltage-gated ion channels in dystrophic cardiomyocytes accompany the cardiac pathology. It is, however, unknown whether ion channel defects are primary effects of dystrophic gene mutations or secondary effects of the developing cardiomyopathy. Here, we studied Na and Ca channel impairments in dystrophic neonatal cardiomyocytes, derived from DMD mouse models, prior to cardiomyopathy development. Dystrophin deficiency reduced Na current density. In addition, further utrophin deficiency altered Na channel gating. Moreover, Ca channel inactivation was also reduced, suggesting that ion channel abnormalities are universal primary effects of dystrophic gene mutations. To assess developmental changes, we also studied Na channel impairments in dystrophic adult cardiomyocytes and found a stronger Na current reduction than in neonatal ones. The described Na channel impairments slowed the action potential upstroke in adult cardiomyocytes, and only in dystrophic adult mice was the QRS interval of the ECG prolonged.
Ion channel impairments thus precede pathology development in the dystrophic heart and may be considered cardiomyopathy triggers. Supported by the Austrian FWF (P19352). It has been over 15 years since the sensory neuron-specific sodium channel NaV1.8 was identified. Since then NaV1.8 has been shown to play a crucial role in pain pathways, and it has become a prominent drug target for novel pain killers. In contrast to myelinated neurons, the mechanisms that target voltage-gated sodium channels to unmyelinated C-fibre axons are largely unknown. We investigated the localisation of NaV1.8 in unmyelinated primary sensory neurons. NaV1.8 was found to be clustered in lipid rafts on unmyelinated axons. When the lipid rafts were disrupted, a remarkable reduction was seen in both the conduction velocity and the number of cells responsive to mechanical stimuli applied to the unmyelinated axons. Using a compartment culture system, we also found that disruption of rafts in the middle region of the sensory axons caused a significant reduction in the responsiveness of the neurons to chemical stimuli applied to nerve endings, due to failure of action potential propagation through the axons. These data suggest that clustering of NaV1.8 in lipid rafts of unmyelinated fibres is a key factor for the functional properties of the channel, which may be due to a change in the voltage threshold. Disruption of NaV1.8 clusters and modification of lipid rafts in primary sensory neurons may therefore be a useful new approach to control the excitability of nociceptive neurons. Ion currents are crucially important for the activation of T lymphocytes. Our aim was to investigate how the blockage of various ion channels, in isolation or in combination, affects the mitogen-dependent activation and proliferation of T cells. We activated human peripheral blood lymphocytes using monoclonal antibodies against the TCR-CD3 complex and CD28.
We applied specific channel blockers inhibiting the major ion channels of the T cell: either Kv1.3 (TEA or anuroctoxin), IKCa1 (TRAM-34), or the CRAC channel (2-APB), alone or in combination. Five days after the stimulus we measured the change in cell size and cellular granulation with flow cytometry, along with the proportion of dividing cells, using a CFSE (carboxyfluorescein succinimidyl ester) dilution assay. Our measurements indicated that the ion channel blockers suppressed the proportion of dividing cells in a dose-dependent manner. Increasing the strength of the stimulation reduced the potency of the blockers to inhibit cell proliferation, and eventually the blockers became ineffective in decreasing lymphocyte proliferation. The greatest inhibition was obtained using the combination of blockers, which indicates synergy in the regulatory pathways of the various ion channels. Recently, the sodium-dependent phosphate transporter NaPi2b was identified as a potential marker for breast, thyroid and ovarian cancer. In vivo, NaPi2b is involved in the maintenance of phosphate homeostasis, and mutations or aberrant expression of its gene (SLC34A2) are associated with several diseases, including cancer. However, data about NaPi2b mRNA expression in different types of cancer and the corresponding normal tissues are controversial and limited. We investigated the SLC34A2 gene expression level in normal ovarian tissues and in different histomorphological types of ovarian tumors using real-time PCR analysis. It was found that the SLC34A2 gene was highly expressed in well-differentiated endometrioid and papillary serous tumors, but was not expressed in poorly differentiated tumors, benign tumors or most normal tissues. The mRNA expression of SLC34A2 in serous and endometrioid ovarian tumors correlated closely with the protein expression detected in these tumor samples by Western blot analysis and immunohistochemistry in our previous investigation.
Upregulation of SLC34A2 gene expression in well-differentiated tumors may reflect cell differentiation processes during ovarian carcinogenesis and could serve as a potential marker for ovarian cancer diagnosis and prognosis. In the present contribution, a procedure for the characterization of molecular motion based on the evaluation of the mean square displacement (MSD) through the self-distribution function (SDF) is presented. It is shown how the MSD, an important observable for the characterization of dynamical properties, can be decomposed into different partial contributions associated with system dynamical processes within specific spatial scales. It is shown how the SDF procedure allows us to evaluate both the total MSD and the partial MSDs through total and partial SDFs. As a result, the total MSD is the weighted sum of the partial MSDs, in which the weights are obtained by fitting the measured elastic incoherent neutron scattering (EINS) intensity. We apply the SDF procedure to data collected with the IN13, IN10 and IN4 spectrometers (Institut Laue-Langevin) on aqueous mixtures of two homologous disaccharides (sucrose and trehalose) and on dry and hydrated (H2O and D2O) lysozyme, with and without disaccharides. The nature of the dynamical transition is highlighted, and it is shown that it occurs when the system relaxation time becomes shorter than the time scale set by the instrumental energy resolution. Finally, the effect of bioprotectants on protein dynamics and on the amplitude of vibrations in lysozyme is presented. We evaluated quercetin (Q) for genotoxicity in MCF-7 breast cancer cells in the presence or absence of doxorubicin (DOX), docetaxel (DTX) and paclitaxel (PTX), anticancer drugs commonly used in the chemotherapy of different solid tumors. DNA damage was determined by the comet assay. After treatment with the investigated compounds, the cells were washed and cultured in fresh medium for 0, 24, 48, 72 and 96 hours. We found that Q by itself caused significant DNA damage.
Moreover, the flavonol enhanced the genotoxic effect of the anticancer drugs. The highest amount of DNA in the comet tail was observed 48 h after treatment with the combination of Q and DOX. Similar changes were found in cells incubated with combinations of Q and the taxanes PTX or DTX; however, the DNA damage in this case was considerably lower than that caused by the combination of Q and DOX. Our results confirm the anticancer and genotoxic activity of quercetin, making it a promising candidate for use as a modulator of the cytotoxicity and anticancer activity of anthracycline and taxane chemotherapeutics. Although the health effects of low-frequency, low-intensity electromagnetic fields (LFI-EMFs) are controversial, increasing evidence suggests that LFI-EMFs are capable of initiating various healing processes. Many (bio)physical ideas have been suggested to explain the influence of LFI-EMFs on living systems, but the main effect of LFI-EMFs on cell functions remains vague. However, some effects of LFI-EMFs may be explained by redox and membrane processes. During disease, cells not only display altered biochemical processes but also produce altered, complex non-linear bioelectromagnetic patterns. Thus, it is reasonable to use the non-linear bioelectric and bioelectromagnetic signals from the cells of the body for potential therapeutic applications, which may be more effective than artificial LFI-EMF signals. Our novel EMOST (electromagnetic own-signal treatment) method is based on the utilization of the patients' non-linear bioelectric and bioelectromagnetic signals, without any electromagnetic wave modulation or inversion of the recorded output signals of the subjects. Here, we report some restorative results after EMOST application. We also suggest that the possible effects of EMOST may be achieved via redox-related processes.
Background, or leak, potassium conductances are a major determinant of resting potential and input resistance, two key components of cell excitability. These currents are not passive but finely tuned and adapted to cell-specific functions. The K2P channels producing these currents are tightly regulated by a variety of chemical and physical stimuli, including temperature, membrane stretch, free fatty acids, pH, Ca2+, neurotransmitters and hormones, as well as protein partners. These different stimuli converge on gating mechanisms that show remarkable conservation between intracellular K2P channels (TWIK1 channels) and K2P channels located at the plasma membrane (TREK1/2 channels). Living at the edge, volume 2: the conduction system of interfacial forces into the alveolar type II cell. In previous investigations, using new microscopic approaches, we found that the presence of an air-liquid interface (ALI) leads to a paradoxical situation: it is a potential threat that may cause cell injury, but also an important stimulus: AT II cells respond promptly and show sustained Ca2+ signals that activate exocytosis. Exocytosed surfactant, in turn, clearly prolonged the time to irreversible cell damage, and may be an adaptive and evolutionary defense mechanism against the harmful nature of surface forces. Recently we published that AT II cells sense the ALI, but how this stimulus is conducted and converted within the cell is still obscure. Currently we are searching for potential calcium sources, and it seems that the cells signal Ca2+ by extracellular Ca2+ entry, probably through mechanosensitive channels. Specially designed gene chips allowed whole-genome profiling of ALI-exposed single cells. These cells react with rapid changes at the transcriptional level: the cellular pathways involved include, e.g., defense response and lipid metabolism, and we identified genes associated with several lung diseases and injuries.
In summary, interfacial forces are strong; they act on the cells and trigger cellular events that are closely related to classical concepts in mechanotransduction, and it is very plausible that these forces play a crucial role in lung surfactant homeostasis. A calorimetric method and instrumentation were developed and applied to the investigation of aqueous solutions of proteins. Thermal effects were analyzed with a heat-flux-type DSC cell of our own construction, designed for the temperature range from the boiling point of water down to 120 K. The achieved heat flow rate (HF) sensitivity of the instrument is better than 50 nW, tested using 1 µl of a 150 mM aqueous solution of NaCl. From the integral value of HF, the total enthalpy change ΔHtotal, the enthalpies of transitions were separated from the heat capacities. Using this method, several types of proteins (BSA, ERD10, UBQ, α- and β-casein, and wild-type and A53T α-synuclein) were investigated in that temperature range. The results are shown in detail as an illustrative example. Potential applications are outlined, which include (i) the distinction between the solvent-accessible surfaces of globular and intrinsically disordered proteins, (ii) the distinction between protein mutants, and (iii) the identification of monomeric and polymeric protein states. The method provides a possibility to study the polymerization process (amyloid formation) and to investigate in situ its causes and circumstances. Morphometry due to self-gravity in living organisms is an integral part of understanding self-gravitation biology. Computational studies of biological systems, especially of the ab initio embryo as a self-gravitating mass floating in amniotic fluid, as if maintaining an (extrinsic) anti-gravitation mechanism, would be an interesting way to explore the biomechanics of intrinsic gravity.
Since the work is of an exploratory nature, both in identifying the appropriate developmental stage of the embryological morula and in the available computational logic, we contemplated initiating the functional study from a few reported ultrasound-based works on small animals such as mice and grasshoppers, including the compact human embryo, treated as a mathematical structure that may allow the formal definition of concepts such as convergence, mapping and continuity. As many results of finite-dimensional functional analysis in topological vector spaces are available, we initiated our simulations and studies concentrating on locally compact Banach spaces. 3 Institute for X-ray Physics, University of Göttingen, Germany. We report on hard X-ray phase-contrast imaging of black lipid membranes (BLMs) freely suspended over a micromachined aperture in an aqueous solution. This new approach to membrane structure analysis allows the investigation of biomolecular and organic substances in aqueous environments by parallel- and divergent-beam propagation imaging, using partially coherent multi-keV X-ray radiation. The width of the thinning film is significantly smaller than the detector pixel size, but can be resolved from quantitative analysis of the intensity fringes in the Fresnel diffraction regime, down to its native thickness of about 5 nm. To our knowledge this is the first time that such small features of a very weak phase object have been visualized by direct X-ray imaging techniques. We have put forward a simplified but extendable model that enables the theoretical description of image formation and the characterization of the membrane thickness and its decrease during the thinning process from a bulk to a bimolecular film. On the basis of these recent experiments, future investigations will be performed to study membrane interactions, as known for example from synaptic fusion, with high spatial resolution.
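The fringe-based thickness analysis described in the X-ray imaging abstract above rests on standard Fresnel propagation of a weak phase object. As an illustration only (none of the numbers below come from the abstract: the wavelength, propagation distance, film width and phase shift are all assumed), a minimal numpy sketch of how a thin film modelled as a weak phase perturbation produces intensity fringes after free-space propagation:

```python
import numpy as np

# Assumed, purely illustrative parameters (not from the abstract):
wavelength = 1e-10        # 1 Angstrom hard x-rays
z = 0.01                  # 1 cm propagation distance
n_pix, dx = 4096, 10e-9   # 1D grid with 10 nm sampling
x = (np.arange(n_pix) - n_pix // 2) * dx

# Weak phase object: a thin film modelled as a narrow Gaussian phase dip.
delta_phi = 1e-3
phase = -delta_phi * np.exp(-0.5 * (x / 5e-9) ** 2)
u0 = np.exp(1j * phase)   # unit-amplitude wavefield at the object plane

# Fresnel propagation via the angular-spectrum transfer function
# H(f) = exp(-i * pi * lambda * z * f^2).
f = np.fft.fftfreq(n_pix, d=dx)
H = np.exp(-1j * np.pi * wavelength * z * f ** 2)
u_z = np.fft.ifft(np.fft.fft(u0) * H)

# At the detector plane the pure phase object shows up as intensity fringes.
intensity = np.abs(u_z) ** 2
contrast = intensity.max() - intensity.min()
```

Because the propagator is a pure phase factor, the mean intensity is conserved, while the initially invisible phase dip acquires measurable fringe contrast; in the experiment, fitting such fringes is what recovers the sub-pixel film thickness.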
shown that the acid incorporates mainly into the exterior part of the erythrocyte membrane, inducing the creation of echinocytes. This suggests that it interacts predominantly with the outer part of the lipid layer of erythrocytes and liposomes. It was also shown that CGA decreases the packing order of the hydrophobic part of the membranes, without changing the fluorescence anisotropy of the hydrophobic part. One of the unique features of single-molecule absorption/emission is its anisotropy, due to the well-defined transition dipoles of both processes, which allows the determination of the molecule's 3D orientation. Several techniques have therefore been proposed to determine the full 3D orientation of dipole emitters at the single-molecule level. We recently demonstrated a technique that combines emission distribution and polarization detection [1, 2, 3]. As the method is an intensity distribution technique and based in principle on single-photon detection, the 3D orientation determination can be extended to fluorescence correlation spectroscopy (FCS) as well as to dynamic anisotropy measurements. This allows the determination of the 3D orientation dynamics of single molecules down to nanosecond timescales. The 3D orientation is particularly interesting in non-isotropic environments. A lipid membrane is such a non-isotropic environment, of enormous importance in biological systems. We therefore use giant unilamellar vesicles (GUVs) labeled with dyes such as DiO as a model system. Owing to the defined curvature of such vesicles, all possible dipole orientations can be achieved. This allows us to demonstrate the capabilities of our method on different timescales and to quantify the error in the determination of 3D orientation dynamics in lipid membranes. The aim of the present studies was to determine the changes that occur in a biological membrane and a model lipid membrane as a result of interaction with strawberry leaf extract.
Numerous studies conducted all over the world have documented a beneficial effect of polyphenolic compounds on the human organism. However, the mechanism of the interaction at the molecular and cellular level is not yet known. In the work presented, the effect of strawberry leaf extract on erythrocyte and black lipid membranes has been investigated. The applied methods (spectroscopic, fluorimetric and electric) allowed determination of the hemolytic and antioxidant activity, the packing order in the erythrocyte membrane, and the electric capacitance of BLMs. The results obtained indicate that the extract is efficient in protecting membrane lipids against oxidation, does not induce hemolysis, increases osmotic resistance, and decreases the packing order in the hydrophilic region of the erythrocyte membrane. Moreover, it increases the stability and lifetime of flat lipid membranes without altering their specific capacitance. Supported lipid bilayers are an abundant research platform for understanding the behavior of cell membranes, as they provide additional mechanical stability and enable characterization techniques not otherwise accessible. In computer simulations, however, these systems have so far been studied only rarely. We present systematic studies, on different length scales, of the changes that a support inflicts on a phospholipid bilayer, using molecular modeling. We characterize the density and pressure profiles as well as the density imbalance induced by the support. It turns out that the changes in the pressure profile are strong enough that protein function should be impacted, pointing to a previously neglected mechanism of transmembrane protein malfunction in supported bilayers. We determine the diffusion coefficients and characterize the influence of the corrugation of the support. We also measure the free energy of transfer of phospholipids between leaflets using the coarse-grained MARTINI model.
It turns out that at equilibrium the density in the proximal leaflet is about 2-3% higher. These results are in agreement with data obtained by very-large-scale modeling using a water-free model, in which flip-flop can be observed directly. We are additionally characterizing the intermediate states, which determine the barrier height and therefore the rate of translocation. We also study the influence of surface roughness and curvature on this behavior. Simulations in atomistic detail are performed for selected systems in order to confirm the findings. The inverse BAR (I-BAR) domain is part of the superfamily of membrane-deforming Bin-Amphiphysin-Rvs (BAR) proteins, which induce either positive or negative membrane curvature both in vitro and in cells. The generation of membrane curvature by these membrane-deforming proteins often works together with actin dynamics. The I-BAR domain shares its function between actin bundling and membrane binding, but the molecular mechanisms responsible for these functions remain obscure. The aim of our project is to investigate the detailed membrane-binding properties of the I-BAR domain of IRSp53 and its relation to the actin cytoskeleton. In vitro FRET experiments and fluorescence quenching studies were carried out between the I-BAR domain and liposomes made up of different lipid compositions. We found that the I-BAR domain preferentially binds negatively charged lipids, although it can also bind uncharged lipids. The fluorescence quenching studies indicated that the accessibility of the I-BAR surface was higher toward the negatively charged lipids than toward the uncharged ones. A TNS fluorescence assay indicated that the I-BAR domain binds to the surface of the micelles rather than penetrating into their core. Lipid bilayers present a well-known order-disorder chain transition at ambient temperatures.
this transition may become anomalous if the lipid head-group undergoes ionic dissociation at low ionic strength, as detected by several experimental techniques: between the gel and the liquid phases an intermediate phase appears as a shoulder in the specific heat, a dip in turbidity or a maximum in conductivity. we propose a statistical model which allows ionic dissociation of the polar group on the membrane surface and thus introduces competition between the hydrophobic interaction of hydrocarbon chains, which favours the gel phase at low temperatures, and the electrostatic interaction of charged head-groups, which favours the fluid phase at higher temperatures. the model presents an intermediate fluid phase with higher dissociation and charge ordering on the membrane surface, beyond a sharp gel-fluid transition. the model also reproduces an increasing temperature of the main transition upon addition of salt, as well as the shrinking of the anomalous region as chain length increases. the model's thermodynamic behavior is compared to results for pgs, phospholipids with a glycerol head-group. well-programmed membrane fusion systems, operating in a weakly acidic environment, have attracted attention in the fields of biochemistry, biophysics, and pharmacy because such acidic conditions are generally found in endosomal membranes or tumor tissues. we have reported a selective liposomal membrane fusion system targeting a sugar-like cyclic cis-diol structure on the target liposome. this system consists of a lipidated phenylboronic acid derivative as membrane-bound fusogen and phosphatidylinositol as target. here we report the preparation of a boronic acid / ph-responsive polypeptide conjugate as a novel membrane fusion device and the development of a target-selective liposomal membrane fusion system with endosomal ph-responsiveness.
during the course of lipid-mixing, inner-leaflet lipid-mixing, and contents-mixing assays to characterize membrane fusion behavior, we clearly observed a liposomal membrane fusion phenomenon when the ph of the experimental system was changed from 7.4 (physiological) to 5.0 (endosomal). our highly effective methods, which include a target-selective liposomal membrane fusion, can be useful in areas of nanomedicine such as hybridoma technology and liposome-based drug or gene delivery. complete and reversible chemical denaturation of an α-helical membrane protein. jana broecker, sebastian fiedler, sandro keller; molecular biophysics, university of kaiserslautern, erwin-schrödinger-str. 13, 67663 kaiserslautern, germany. the question of how an unordered polypeptide chain assumes its native, biologically active conformation is one of the greatest challenges in molecular biophysics and cell biology. this is particularly true for membrane proteins. chemical denaturants such as urea have been used successfully for in vitro un- and refolding studies of soluble proteins and β-barrel membrane proteins. in stark contrast with these two protein classes, in vitro unfolding of α-helical membrane proteins by urea is often irreversible, and alternative denaturation assays using the harsh detergent sodium dodecyl sulphate suffer from the lack of a common reference state. here we present the complete and reversible chemical denaturation of the bacterial α-helical membrane protein mistic out of different micellar environments by urea. we applied multidimensional spectroscopy and techniques typically used in β-barrel membrane protein unfolding. mistic unfolds reversibly following a two-state equilibrium that exhibits the same unfolded reference state.
this allows for a direct comparison of the folding energetics in different membrane-mimetic systems and contributes to our understanding of how α-helical membrane proteins fold as compared with both β-barrel membrane proteins and water-soluble proteins. in recent years, buckwheat has been of great interest in the world markets of healthy food, due to its high energy value and its content of unsaturated fatty acids, mineral constituents and vitamins. its seeds contain flavonoids, which are natural, efficient antioxidants. the aim of the present studies was to investigate the effect of buckwheat extracts on the properties of the biological membrane, which is the main site of the interaction between the substances buckwheat contains and the organism. the research was conducted on red blood cells and their isolated membranes, using spectrophotometric, microscopic and fluorimetric methods. from the results obtained it follows that the compounds contained in buckwheat extracts increase the osmotic resistance of erythrocytes, making them less sensitive to the medium's osmotic pressure, induce changes in cell shape, producing an increased number of echinocytes, and decrease the packing order of the polar heads of membrane lipids. it can thus be inferred that the compounds contained in the extracts penetrate the hydrophobic region of the erythrocyte membrane and alter its properties. due to its small size, symmetric structure, amphipathicity, proteolytic stability and testable mode of activity, the gs backbone is a convenient model system to examine the structure-activity relationship of individual amino acid substitutions. we have previously reported the structure analysis of two gs analogues in which either the val or leu residues on the hydrophobic face of the molecule were substituted by the aliphatic 19f-labeled amino acid 4f-phg.
using 19f-ssnmr in oriented lipid bilayers, we observed a re-alignment of the peptide that is compatible with the formation of a putative pore [top. curr. chem. 273:139]. here, we present novel analogues of gs with different 19f-prolines in the β-turn region, and with cf3-bpg in place of leu. based on these 19f-ssnmr results, and supported by cd, dsc and activity tests, we could demonstrate that all analogues are structurally intact and antimicrobially active. we observe, however, differences in the re-alignment propensity when comparing these gs analogues in dlpc and dmpc bilayers. these differences can be rationalized in terms of the molecular shape being changed upon incorporation of unnatural amino acids at various sites of the molecule. beta-propiolactone (bpl) is an inactivating reagent commonly used to produce viral vaccine preparations (whole virions or split virions). although bpl has been reported to inactivate nucleic acids, its mechanism of action on proteins and the outcome on viral infection remain ill-defined. in this work, the h3n2/victoria/210/2009 influenza virus strain was submitted to various bpl inactivation conditions (from 2 µm to 1 mm). cell infection ability was progressively reduced and entirely abolished at 1 mm bpl. to clarify the bpl effect, we focused on the membrane fusion steps of infection, using kinetic fluorescence measurements of molecule leakage from liposomes and lipid fret assays combined with cryo-electron microscopy. membrane fusion measured at ph 5 on gm3 liposomes was reduced in a dose-dependent manner. interestingly, the fusion activity was partially restored using the proton ionophore monensin, as confirmed by cryo-em images. in addition, a decrease of molecule leakage irrespective of bpl concentration was measured, suggesting that the hemagglutinin affinity for gm3 was slightly modified even at low bpl concentration.
altogether, these results strongly suggest that bpl treatment impairs m2 protein activity, likely by preventing proton transport, and shed new light on the mechanism of action of bpl. cellular membranes have a heterogeneous lipid composition, potentially forming nano-domains or membrane rafts, believed to be platforms of altered fluidity involved in protein sorting and trafficking 1 . an alternative mechanism potentially leading to protein sorting has recently been proposed, suggesting that the curvature of membranes can also actively regulate protein localization 2 . recently we showed that a variety of protein anchoring motifs are membrane-curvature sensors and thus concentrate in regions of high membrane curvature 3 . furthermore, the curvature sensing ability of the anchoring motifs persisted independently of their structural characteristics. this leads us to speculate that curvature sensing might be an inherent property of any curved membrane and, as a consequence, that the lipid composition of the bilayer could regulate this recruitment by membrane curvature. thus there might be an intimate, yet unrecognized, link between the way raft-like membrane domains and membrane curvature promote the localization of membrane-anchored proteins. we examined how changing the lipid composition of liposomes influenced the recruitment by membrane curvature of a model amphiphilic protein-anchoring motif. employing our single-liposome curvature assay, we tested lipid mixtures with different ratios of dopc, sphingomyelin and cholesterol, giving rise to liposome populations of different phase states. we found an amplified recruitment by membrane curvature for all raft-like lo phase-state mixtures when compared to the ld phase-state counterparts. based on these findings we suggest a synergistic effect when combining a raft-like lipid phase state and high membrane curvature, resulting in a highly potent mechanism for selective localization of membrane-anchored proteins.
keywords: non-lamellar lipid structure, phase transition, minerval, 2-hydroxylated fatty acid. minerval (2-hydroxyoleic acid), a potent antitumoral drug, is known to modulate the lipid membrane structure by decreasing the lamellar-to-non-lamellar phase transition temperature (th). a series of 2-hydroxy fatty acid derivatives, varying in acyl chain length and degree of unsaturation, have been analyzed in terms of their ability to stabilize the inverted hexagonal (hii) phase in palmitoyl-oleoyl-phosphatidylethanolamine membranes. differential scanning calorimetry and 31p-nuclear magnetic resonance showed that mono- and polyunsaturated, but not saturated, 2-hydroxylated fatty acid molecules were able to decrease th. lipid vesicles mimicking the lipid composition of a cell membrane were solubilized at 4°c in the presence of triton x-100. the results demonstrated that the amount of detergent-resistant membranes, which are related to liquid ordered (lo) structures, decreased in the presence of 2-hydroxylated fatty acids. the so-called lipid membrane therapy focuses on the reversion of cell dysfunction through the modulation of the membrane structure, thus altering the activity of membrane-associated proteins. the ability to modify the biophysical properties of a lipid membrane makes the studied 2-hydroxylated fatty acid molecules prospective candidates for use in lipid membrane therapy. however, a significant downward divergence occurs above 4 mol % of probe content, which might indicate deviations from ideal mixing in the fluid phase. results for β-py-c10-hpc in mixtures of popc with 10 and 20 mol % pops were indistinguishable from those obtained with pure popc vesicles; however, excimer formation in pure pops bilayers appears to be appreciably higher. we also compared the excimer formation findings with quenching of the same probes by low concentrations of doxyl quencher groups attached to an acyl phospholipid chain at the same depth as the pyrenyl group.
the results are also scrutinized by the same two-dimensional kinetic formalism, and good correlation was also found. derek marsh, max-planck-institut für biophysikalische chemie, 37070 göttingen, germany. the amassing of comprehensive data on the lipid composition of biological membranes by lipidomics initiatives provides a potent challenge to the membrane biophysicist interested in lipid structure. this resolves itself essentially into two aspects. the first systematises the dependence of membrane biophysical parameters on lipid molecular structure. lipid volumes, membrane dimensions, chain-melting temperatures and enthalpies, non-lamellar phase formation and structure, critical micelle concentrations and thermodynamics of membrane formation, membrane-membrane interactions and lipid transfer are amongst the properties of central biophysical interest. the relevant structural parameters are lipid chain length, degree of unsaturation, chain branching and headgroup configuration. the second, more complex and less well developed, aspect concerns the lipid-lipid interactions that determine the membrane properties of lipid mixtures. in part, these can be obtained from binary phase diagrams, and from the more limited number of ternary phase diagrams, notably with cholesterol, that are available. extrapolation to higher-order mixtures lies in the future. i shall attempt to summarise some of the progress in these directions. the immediate aim is a second edition of my handbook of lipid bilayers, which, in addition to a vastly expanded database, will include interpretative features and will be available in the early part of next year. lateral diffusion dynamics in phosphatidylcholine/cholesterol bilayers has mostly been accessed by means of epr, nmr and fcs spectroscopic techniques.
reliable steady-state fluorescence quenching analysis of diffusion-controlled processes has been hampered by the lack of a self-consistent kinetic formalism for the two-dimensional (2d) counterpart of the classical stern-volmer analysis for three-dimensional (3d) solvents. we studied the excimer formation of phospholipid-labeled pyrenyl probes (proportion of 4 mol %) in mixed popc/cholesterol mlv liposomes by combined steady-state and time-resolved fluorescence. the findings are in very good agreement with the theoretical predictions of the kinetic formalism specific for fluorescence quenching processes occurring in two dimensions. the small hydrophobic head group, the closely packed acyl chains, and the capability of interfacial hydrogen bonding have been suggested to govern the characteristic membrane behavior of long-chain saturated ceramides: the self-segregation, and the formation of hexagonal phases and highly ordered gel phases. while it has been shown that structural alterations of the ceramide acyl chains induce position-dependent effects on their behavior, we wanted to study the effect of interfacial properties, including hydrogen bonding, on ceramide membrane properties. the h-bond donor functions of 2nh and 3oh in the sphingosine backbone of palmitoylceramide were disrupted either separately or simultaneously by replacing the hydrogen with a methyl group. when the lateral phase behavior of mixed bilayers containing cholesterol/sphingomyelin-rich domains was studied in the presence of the ceramide analogues, the 3o-methylated ceramide appeared to form a thermally stable, sterol-excluding gel phase with sphingomyelin, whereas the 2o-methylated ceramide failed in both thermal stabilization and sterol displacement. the doubly methylated analogue was the poorest ceramide mimic. together with the possible steric effects induced by the methylations, the lack of the 2nh h-bond donor function impaired ceramide membrane behavior to a greater degree than the lack of the 3oh h-bond donor function.
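as a point of reference for the two-dimensional formalism discussed above, the classical three-dimensional stern-volmer analysis relates the unquenched-to-quenched intensity ratio to quencher concentration as i0/i = 1 + ksv[q]. a minimal sketch, in which the ksv value and the quencher concentrations are illustrative assumptions, not data from these abstracts:

```python
# Classical 3D Stern-Volmer quenching analysis: I0/I = 1 + Ksv*[Q].
# Ksv and the concentrations below are hypothetical illustrations.

def stern_volmer_ratio(q_conc, ksv):
    """Expected I0/I intensity ratio at quencher concentration q_conc (M)."""
    return 1.0 + ksv * q_conc

def fit_ksv(q_concs, ratios):
    """Least-squares estimate of Ksv from (I0/I - 1) vs [Q], line through the origin."""
    num = sum(q * (r - 1.0) for q, r in zip(q_concs, ratios))
    den = sum(q * q for q in q_concs)
    return num / den

if __name__ == "__main__":
    ksv_true = 120.0  # M^-1, hypothetical
    q = [0.0, 0.001, 0.002, 0.005, 0.010]
    ratios = [stern_volmer_ratio(c, ksv_true) for c in q]
    print(fit_ksv(q, ratios))  # recovers ~120.0
```

in two dimensions the linear relation no longer holds exactly, which is precisely the gap the self-consistent 2d formalism above addresses.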
sugar-based surfactants are made from renewable resources using ''green chemistry'' methods, are easily biodegradable and are used in washing agents, cosmetics, and drug carriers. besides, there are attempts to use them as non-viral vectors in gene therapy. we studied the influence of new ω-(alkyldimethylammonium)alkylaldonamide bromides (cngab) with different chain lengths (n = 10, 12, 14, 16) on the thermotropic phase behavior of dppc and dppc/chol bilayers by means of differential scanning calorimetry. the surfactants were added either to the water phase or directly to the lipid phase (a mixed film was formed). we analyzed the changes in the temperatures, enthalpies and shapes of the main phase transitions as a function of concentration. molecular modeling methods were also used. cytotoxicity of the cngabs was determined in the l929 and a549 cell lines. for the cytotoxicity test, the cells were seeded in 96-well plates: 1 ml of 2×10^6 cells/ml in eagle's or dulbecco's culture medium with 2% calf serum, penicillin and streptomycin was deposited into each well. the cells were treated with various doses of surfactants and incubated. the minimal concentration which was toxic to approximately 50% of cells was taken as tccd50. this work was supported by grant n n305 361739. membrane fusion is ubiquitous in life, requiring the remodeling of two phospholipid bilayers. as supported by many experimental results and theoretical analyses, the merging of membranes seems to proceed via similar sequential intermediates: contacting membranes form a stalk between the proximal leaflets, which expands radially into a hemifusion diaphragm (hd) and subsequently opens to a fusion pore. direct experimental verification of the hd is difficult due to its transient nature. using confocal fluorescence microscopy we have investigated the fusion of giant unilamellar vesicles (guvs) containing fluorescent membrane protein anchors and fluorescent lipid analogues in the presence of divalent cations.
time-resolved imaging revealed that fusion was preceded by displacement of peptides and lipid analogues from the guv-guv contact region, which was several µm in size. a detailed analysis showed that this structure is consistent with the formation of an hd. a quantitative model of the hemifusion equilibrium and of the kinetics of the growing hd was developed. bilayer tension could be shown to drive hd expansion, and interleaflet tension was found to act as a counterforce, because the outer leaflets are compressed upon hd growth. the model and its predictions fit nicely with the observations above. concentration effects of trehalose on the equivalent polarity of fluid popc bilayers. c. nobre 2, d. arrais 1, j. martins 1,2; 1 ibb-cbme, faro, portugal, 2 dcbb-fct, universidade do algarve, faro, portugal. trehalose is an important disaccharide, formed by two units of glucose linked by an α-1,1 glycosidic bond. it is capable of replacing water molecules in the hydration shell of the phospholipid headgroups in cases of extreme dehydration, by establishing hydrogen bonds with their -co and -po groups, thereby preserving the membrane structure. the polarity gradient is a significant feature of lipid bilayers and is influenced by the amount of water within this medium. it is therefore important to understand the effects of different concentrations of trehalose in simple model membranes. using the pyrene empirical polarity scale, we monitored changes in the polarity values when varying the trehalose concentration in the bounding aqueous phase. for lower concentrations (up to 0.25 m), we observed a decrease in polarity compared with popc bilayers in pure water. for higher trehalose concentrations (above 0.5 m), the polarity values are indistinguishable from those of popc in water. using the freeze-and-thaw technique we obtained the same results, except for the lower trehalose concentrations. general anesthetics are indispensable tools of daily surgery.
yet, their molecular mode of action remains elusive. while one school favors specific (direct) interactions with proteins of the central nervous system, another school adheres to a nonspecific modulation of biophysical membrane properties. one of the strongest arguments against lipid theories is the absence of stereo-specific effects in model membranes, as opposed to their detection by electrophysiological measurements on ion channels. we have combined x-ray scattering and molecular dynamics simulations on palmitoyl-oleoyl-phosphatidylcholine bilayers with fluorescence microscopy on live cells to study the effects of the stereoisomers of ketamine on membrane properties. we find significant effects of both enantiomers on the distribution of lateral pressures at clinically relevant concentrations, being more pronounced for s-(+)-ketamine. we further calculated the effect of the lateral pressure profile changes on the opening probability of an ion channel using crystallographic information. the observed channel inhibition compares remarkably well with clinically observed effects of the enantiomers. we thus provide the first evidence for a stereo-specific, but indirect, effect of general anesthetics on ion channels. dependence of gramicidin a channel lifetime on membrane structure obtained from x-ray scattering measurements. horia i. petrache, department of physics, indiana university purdue university indianapolis, in 46202, usa. the activity of ion channels, in particular the lifetime of their conducting (open) state, depends on the physical properties of lipid bilayers [1, 2], which in turn depend on lipid headgroup and acyl chain composition. in order to investigate this dependence, we have performed measurements of gramicidin a (ga) channel lifetimes in three different lipid series.
in each series, the lipid headgroups were phosphatidylcholine (pc), phosphatidylethanolamine (pe), and phosphatidylserine (ps), while the acyl chains consisted of symmetric monounsaturated di(18:1), mixed (16:0)(18:1), and methylated di(16:0-4me). in order to minimize the effect of headgroup electrostatics, measurements were performed in 1 m kcl. we show how ga lifetimes depend on headgroup and acyl chain composition and on structural parameters determined by x-ray scattering. for the lipids considered, ga lifetimes cover a range from 0.7 seconds in the dope lipid to 18 seconds in dphps. in this range, we find a gaussian dependence of ga lifetime on bilayer thickness, consistent with hydrophobic matching models. we discuss different aspects of channel-lipid interactions and to what extent measurements of ga lifetime in binary mixtures are consistent with measurements in pure lipid systems. the aim of the studies was to determine the effect of chlorogenic acid (cga), which is the main constituent of plant extracts, on the properties of model membranes. its effect on the temperature of the main phase transition of various lipids, with and without the presence of cholesterol, was studied using the differential scanning calorimetry (dsc) method and the fluorimetric method. in particular, the degree of packing order of the hydrophilic phase of liposomes was determined using the laurdan and prodan probes, and the fluorescence anisotropy of the hydrophobic phase with the dph and tma-dph probes. the effect of chlorogenic acid on the structure and capacity of black lipid membranes (blms), formed of egg lecithin and of lipids extracted from erythrocytes, was also studied. the results obtained indicate that cga lowers the main phase transition temperature slightly, without changing the fluorescence anisotropy in the hydrophobic part of the bilayer, and causes a decrease in the packing order of the hydrophilic phase.
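the gaussian dependence of ga channel lifetime on bilayer thickness reported above can be sketched numerically. all model parameters below (the peak lifetime, the matching thickness and the width) are hypothetical illustrations; the abstract only reports the 0.7 s to 18 s lifetime range across the lipid series:

```python
import math

# Hedged sketch of a Gaussian hydrophobic-matching model for gramicidin A
# channel lifetime vs bilayer hydrophobic thickness. tau_max, d_match and
# sigma are hypothetical values chosen for illustration only.

def ga_lifetime(d, tau_max=18.0, d_match=22.0, sigma=2.0):
    """Gaussian lifetime model (seconds): channels live longest where the
    bilayer thickness d (angstrom) best matches the channel's hydrophobic
    length d_match."""
    return tau_max * math.exp(-((d - d_match) ** 2) / (2.0 * sigma ** 2))

if __name__ == "__main__":
    for d in (20.0, 22.0, 26.0):
        print(d, round(ga_lifetime(d), 3))
```

the lifetime is maximal at the assumed matching thickness and falls off symmetrically for thinner or thicker bilayers, which is the qualitative content of the hydrophobic matching picture.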
by monitoring the capacity during blm formation we have found that the presence of chlorogenic acid accelerates the process of lipid self-organization into a bilayer, and increases the stability and lifetime of the blms. however, there was no effect of cga on the specific capacity of the membranes, and thus on the thickness of the liposome membrane hydrophobic layer. this work was sponsored by the ministry of science and education, scientific projects no. n n312 422340 and n n304 173840. many examples have recently been found where biological processes in the lipid bilayer are affected by changes in the physicochemical properties of the membrane, e.g. the local curvature, the membrane tension and, certainly, the membrane structure. it has been shown that the activity of polyene antibiotics is strongly correlated with the phase diagram in a membrane composed of a mixture of popc and ergosterol or cholesterol (j membrane biol, 237:41-49, (2010)). it is known that polyene action is quite sensitive to the type of sterol in the membrane, which enables its medical use, mainly as antifungals. it has been proposed that this selectivity of the drug for fungi is related to structure modulation by the sterols (see for example, biophys. j., 85, 2323, (2003)), and therefore the correlation found could be due to structural differences between popc/ergosterol and popc/cholesterol along the corresponding phase diagrams. to investigate this, molecular dynamics simulations of the above mixtures along their phase diagrams were performed. it was found that there are indeed marked differences in structure along the phase diagrams, but for the sterol-sterol distribution function. an analysis of the behavior of this observable and its implications for polyene action is discussed. acyl transfer from lipids without enzyme catalysis: a new paradigm for membrane protein ageing?
john sanderson, catherine pridmore, jackie mosely, paul yeo; durham university, department of chemistry, durham, uk. membrane proteins are recycled in cellulo with half-lives ranging from minutes to days. in other systems, such as enveloped viruses, proteins may equally remain membrane-bound for periods of days. it is therefore of interest to examine the behaviour of proteins in model membranes over extended periods in order to determine the long-term stability of the mixed systems, both in kinetic terms (attainment of equilibrium states) and in chemical terms (reactivity). the reactivity of proteins towards membranes has been examined using the peptide melittin as a model for membrane proteins. acyl transfer from phospholipids to the peptide was found to occur over a period of several days, in the absence of any enzyme catalysis. transfer was detectable after 2 days and reached 50% conversion in 8 days. using tandem mass spectrometry approaches, the sites of melittin modification were localised. these sites included the side chain of lysine, opening the possibility that this residue may be modified in any membrane protein where it has an appropriate disposition. these observations challenge preconceptions concerning the membrane as an inert medium and highlight potential new mechanisms for membrane protein ageing. interaction of poly(l-arginine) with negatively charged bilayers studied by ft-ir spectroscopy. christian schwieger, alfred blume; martin-luther-university halle-wittenberg, institute of chemistry, von-danckelmann-platz 3, 06108 halle/saale, germany; e-mail: christian.schwieger@chemie.uni-halle.de. oligoarginine residues attached to macromolecules are known to facilitate transport through lipid membranes. since the mechanism of this transport is still unclear, the effect is often called ''arginine magic''. we studied the interaction of poly(l-arginine) (pla) of different molecular weights with negatively charged lipid bilayers.
we have shown by calorimetric and monolayer techniques that the interaction is due to a combination of electrostatic and hydrophobic forces. now we present an ft-ir spectroscopic study to reveal the effect of pla binding on membrane organisation and peptide conformation. we will show that pla binding reduces the miscibility of negatively charged (pg or pa) and zwitterionic (pc) lipids within the bilayer. from the shift of the c=o stretching vibration we deduce that arginine side chains penetrate into the hydrophobic/hydrophilic interface and replace hydration water molecules. the binding reduces the rotational freedom of the lipid molecules, as could be shown by an analysis of the ch2-stretching vibrations. pla binds in a β-sheet conformation to pg or pa gel-phase membranes, whereas its structure in bulk is random coil. the shift of the guanidyl vibration frequencies shows that hydrogen bonds also contribute to the pla-lipid interactions. neutron scattering studies of a model membrane as a function of hydration and temperature. federica sebastiani 1,2, alessandra filabozzi 1 and giovanna fragneto 2; 1 dipartimento di fisica, università degli studi di roma ''tor vergata'', roma, italy, 2 institut laue-langevin, grenoble, france. cell membranes carry out highly specialised functions in living materials. the composition of bacterial membranes is essential to understand the mechanism of action of antimicrobial peptides. in order to understand the role of the various components contributing to the overall behaviour, we have reproduced the membrane of bacillus subtilis and carried out neutron diffraction studies on d16 (small momentum-transfer diffractometer) and d17 (reflectometer used as a diffractometer) at the ill. an ordered and homogeneous sample has been obtained by using the widely studied dmpc. the measured d-spacing of dmpc as a function of relative humidity (rh) is related to the physical and chemical conditions affecting the sample.
consequently, the reliability of the previously upgraded humidity chamber has been established, and the most suitable preparation technique has been determined. in order to investigate the roles of the components within the bacillus subtilis membrane, three phospholipid samples were prepared (with pope, popg and cardiolipin). neutron diffraction measurements, performed at controlled rh and temperature, suggested the presence of interesting phase transitions or the coexistence of phases. the rupture of membrane vesicles near solid surfaces. annamária takáts-nyeste, imre derényi; department of biological physics, eötvös university, h-1117 budapest, pazmany p. stny. 1/a, hungary. the behavior of lipid membranes near solid surfaces has great significance both in medicine and in technology. in spite of the widespread use and study of such membrane phenomena, their theoretical analysis is rather scarce. our main goal here is to understand the process during which membrane vesicles first adhere to solid surfaces, then rupture (or go through a series of transient ruptures) due to the mechanical tension induced by the adhesion, and finally spread along the surface, forming a supported lipid bilayer. in our theoretical description we simultaneously consider the dynamics of spontaneous pore opening and closing; volume loss via leakage through the pores; and the advancement of the adhesion front. all these processes are assumed to follow overdamped dynamics and are coupled to each other through the membrane tension. our numerical simulations reveal that the rupture process consists of three well-distinguishable phases: a fast initial volume loss; followed by a slow volume loss; ending with a final burst and surface spreading. the second phase can be skipped if either the first phase advances far enough or the third phase sets in early enough. the smaller the vesicle, the further the first phase can advance.
the third phase can start earlier if the surface is smooth enough, the adhesion energy is large enough, or the line tension is small enough. when the second phase is not skipped, the time needed for the rupture process can be very long, with a large variance. in the realistic range of the material properties (line tension, bending rigidity) the process is qualitatively always the same, so the most decisive parameter remains the size of the vesicle: the smaller the vesicle, the faster and more easily it ruptures. we chose four plant-derived polyphenols (flavonoids and stilbenes) of documented biological activity to study their influence on lipid domain number, area, shape, and border length. we found that resveratrol elevated the number of domains per vesicle, decreased their area and markedly increased the total length of the domain border without affecting the domains' circular shape. surprisingly, no such effect was observed for piceatannol, which differs from resveratrol by one hydroxyl group only. neither genistein nor 8-prenylnaringenin changed the morphology of lipid domains significantly. the possible mechanism of the resveratrol-induced effect on lipid domain morphology could be its selective accumulation in the interfacial regions between liquid ordered and liquid disordered domains. putative cholesterol recognition amino acid consensus (crac) motif in hiv coreceptors cxcr4 and ccr5. mikhail a. zhukovsky, albrecht ott; biological experimental physics department, saarland university, 66041 saarbruecken, germany. we identified a cholesterol recognition amino acid consensus (crac) motif in transmembrane domain 5 (tmd5) of two g protein-coupled receptors (gpcrs), the human chemokine receptors cxcr4 and ccr5, coreceptors of human immunodeficiency virus (hiv).
we suggest that residues belonging to this crac motif are involved in cholesterol binding to cxcr4 and ccr5, which is responsible for the cholesterol requirement for cxcr4 and ccr5 conformation and function and for the role that cell cholesterol plays in the cell entry of cxcr4-using and ccr5-using hiv strains. the putative crac sequences involve residues v214/l216-y219-k225 in cxcr4 and l208/v209/v211-y214-k219 in ccr5. in cxcr4 the crac motif is highly conserved across chordata species, whereas in ccr5 the crac motif is less conserved. the t1 and t2 curves quantitatively describe the interfacial landscape around the protein molecules and can be used to distinguish between the globular and idp states. the behavior of the t1 and t2 data showed that two reorientation types are present in every protein solution below 0°c, irrespective of the nature of the protein or the solvent composition. the local field fluctuation and bpp models were applied; both failed for the buffered protein solutions and for the idps dissolved in water. a main cause of the failure is the changing h in the analyzed temperature range. this is the case for the solutions of idps and for buffered solutions of both protein types. another cause can be active relaxation channels other than dipolar ones when ions of quadrupolar nuclei are present. ligand-induced disorder-to-order transition plays a key role in the biological functions of many proteins that contain intrinsically disordered regions. this trait is exhibited by rtx (repeat-in-toxin) motifs found in more than 250 virulence factors secreted by gram-negative pathogenic bacteria. we investigated several cyaa rtx polypeptides of different lengths, ranging from 111 to 706 residues.
we showed that the rtx proteins exhibit the hallmarks of intrinsically disordered proteins in the absence of calcium: they adopt premolten globule conformations and exhibit a strong time-averaged apparent hydration, due in part to the internal electrostatic repulsions between negatively charged residues, as revealed by the high mean net charge. calcium binding triggers a strong reduction of the mean net charge, dehydration and compaction, and folding and stabilization of the secondary and tertiary structures of the rtx proteins. we propose that the intrinsically disordered character of the rtx proteins may facilitate the uptake and secretion of virulence factors through the bacterial secretion machinery. these results support the hypothesis that the folding reaction is achieved upon protein secretion and, in the case of proteins containing rtx motifs, could be finely regulated by the calcium gradient across the bacterial cell wall. occupational exposure to heavy metals has been recognized as a risk factor for parkinson's disease via metal-triggered deposition of alpha-synuclein (as) [1, 2]. in the present work, al3+-induced conformational change and instant oligomerization of as have been studied using fret and fcs as the main techniques. donor and acceptor were labeled in the c-terminus at positions a107c and a140c. the average lifetime of the donor in the presence of the acceptor increases with increasing al3+ concentration, indicating that as adopts a more extended conformation upon al3+ binding. the intrinsic tyr fluorescence rises sharply within the mixing dead time, reflecting an enhanced hydrophobicity of the tyr environment and a fast conformational change of as. al3+ also induces an immediate oligomerization of as, as monitored by fcs. the diffusion coefficient of as changes from 85 ± 5 µm²/s in the monomer state to 23 ± 5 µm²/s in the oligomer state. the oligomerization is presumed to be induced by ligand bridging by trivalent al ions.
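the reported diffusion coefficients can be converted into approximate hydrodynamic radii through the stokes-einstein relation; the sketch below is illustrative only and assumes room temperature and water-like viscosity, values not given in the abstract:

```python
import math

KB = 1.380649e-23  # Boltzmann constant, J/K

def hydrodynamic_radius(d_m2_per_s, temp_k=298.0, viscosity_pa_s=8.9e-4):
    """Stokes-Einstein: R_h = kB*T / (6*pi*eta*D)."""
    return KB * temp_k / (6.0 * math.pi * viscosity_pa_s * d_m2_per_s)

# diffusion coefficients reported in the abstract, converted to m^2/s
d_monomer = 85e-12   # 85 um^2/s
d_oligomer = 23e-12  # 23 um^2/s

print(f"monomer  R_h ~ {hydrodynamic_radius(d_monomer) * 1e9:.1f} nm")
print(f"oligomer R_h ~ {hydrodynamic_radius(d_oligomer) * 1e9:.1f} nm")
```

under these assumptions, the roughly 3.7-fold drop in the diffusion coefficient corresponds to a roughly 3.7-fold larger apparent hydrodynamic radius for the oligomer.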
nearly 2% of human genes encode protein kinases (pks), enzymes involved in cellular signaling and several other vital biochemical functions, which transfer phosphate groups from atp to specific target molecules, modifying their activity [1]. deregulated pks have been linked to numerous diseases including cancer and diabetes, making them attractive targets for drug design [2]. conformational transitions play a central role in regulating phosphorylation activity. pks adopt an on state that is maximally active and one or more inactive states that show minimal activity [3]. the similarity of the relatively rigid and largely conserved atp binding site makes the design of selective inhibitors binding to the active state very difficult. indeed, some of the best cancer therapies available are based on inhibitors, such as imatinib, that bind to inactive states peculiar to a small subset of pks (abl, c-kit and pdgfr in the case of imatinib). thus, understanding the atomic details of the active-to-inactive transitions in kinases is of great importance. here we study a particular active-to-inactive transition of c-src, a fundamental proto-oncogene involved in cancer and metastasis, using multi-microsecond fully solvated molecular dynamics simulations, metadynamics and ptmetad calculations [4, 5]. the results, validated by mutagenesis, x-ray crystallography and binding kinetics, are suggestive of a functional role for the conformational transition. moreover, we were able to single out the most important residues affecting the conformational transition and to show that even a very conservative amino-acid substitution can have a dramatic effect on the conformational free energy landscape. the time scales of protein folding events range over many orders of magnitude. in order to understand the complex folding mechanisms, peptides with well-defined secondary structure are often used as model systems, as they may be regarded as the smallest folding units of proteins.
the formation of secondary structure elements occurs on the nanosecond to low microsecond time scale. thus, stopped-flow techniques are too slow, whereas pulsed laser techniques are capable of triggering folding processes in nanoseconds and of analyzing faster folding events. we study ns-to-µs peptide dynamics by temperature-jump infrared spectroscopy. after initiation of a nanosecond temperature jump, the spectral response is monitored at single wavelengths in the amide i region, reflecting the dynamics of the peptide backbone, and relaxation rates are obtained. the helix-to-coil relaxation of polyglutamic acid is a multi-step process and requires more complex models than two-state kinetics. however, there are kinetic steps that are well described by single-exponential behavior and a two-state model. we demonstrate how equilibrium and time-resolved infrared spectroscopic data can be combined to deduce folding rates. unfolding and refolding studies using chemical denaturants have contributed tremendously to our understanding of the thermodynamics and kinetics of protein folding and stability. however, a major limitation of this approach lies in the large uncertainty inherent in the extrapolation of the free energy of unfolding in the absence of denaturant from free energy values measured at finite denaturant concentrations. here we show that this limitation can be overcome by combining multiple spectroscopic signals (fluorescence, circular dichroism, and absorbance) recorded in a quasi-simultaneous and fully automated way at different wavelengths. we have optimised the number of wavelength values used, the integration time per data point, the increment in the denaturant concentration, and the weighting scheme applied for global data fitting. compared with the traditional approach based on the use of a single or a few wavelengths, we could thus improve the precision of the free energy value by an order of magnitude.
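the denaturant extrapolation that limits the traditional approach is a linear fit, ΔG(c) = ΔG(h2o) − m·c, extrapolated to zero denaturant; a minimal sketch with made-up free-energy values (not the authors' data) is:

```python
# free energies of unfolding (kJ/mol) measured at finite denaturant
# concentrations (M); the values below are illustrative placeholders
conc = [1.0, 2.0, 3.0, 4.0, 5.0]
dg   = [14.8, 11.1, 7.2, 3.4, -0.5]

# ordinary least-squares line dG(c) = dG_water - m * c
n = len(conc)
cbar = sum(conc) / n
gbar = sum(dg) / n
slope = sum((c - cbar) * (g - gbar) for c, g in zip(conc, dg)) / \
        sum((c - cbar) ** 2 for c in conc)
dg_water = gbar - slope * cbar  # intercept = extrapolated dG at 0 M denaturant
m_value = -slope                # denaturant m-value

print(f"dG(H2O) ~ {dg_water:.1f} kJ/mol, m ~ {m_value:.2f} kJ/mol/M")
```

the large lever arm of this extrapolation is exactly why small scatter in the measured ΔG values propagates into a large uncertainty in the intercept, which the multi-wavelength global fit described above is designed to reduce.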
we exemplify and validate this novel approach using representative, well-studied globular proteins and explain how it can be exploited to quantify subtle changes in membrane-protein stability which have thus far remained elusive. the rates of protein conformational changes are usually limited not only by external but also by internal friction; however, the origin and significance of this latter phenomenon are poorly understood. it is often found experimentally that a linear fit to the reciprocal of the reaction rate as a function of the viscosity of the external medium has a non-zero intercept [1]. the physical basis of pressure unfolding is still largely unknown. we report here a specific study of cavity contributions to the volume difference between unfolded and folded states (ΔVu), using four single-point mutants of staphylococcus nuclease (snase). each mutation is localised at a strategic position in the protein structure and was designed to change a large buried hydrophobic side chain into alanine, thus opening tunable cavities in the snase 3d structure. measuring hsqc peak intensities up to 2500 bar monitored the equilibrium high-pressure unfolding and led us to precise estimates of ΔVu for more than two-thirds of the 143 residues of each mutant. sofast-hmqc experiments [2] were also performed to measure folding and unfolding rates from 200 bar pressure jumps. high-pressure fluorescence experiments were performed on six additional alanine mutants to complement the nmr study, allowing a more complete exploration of the local pressure sensitivity along the protein 3d structure. all these highly reliable measurements shed light on the real significance of the thermodynamic parameter ΔVu, and bring an unprecedentedly complex and heterogeneous picture, at residue level, of the apparent two-state folding process of snase.
determination of the factors contributing to the magnitude of the volume change between unfolded and folded states (ΔVu) is a long-standing question in the high-pressure field. we provide here new experimental and computational data using two well-characterized model proteins: the notch ankyrin repeat domain (nank) and staphylococcal nuclease (snase). the repetitive nature of the nank protein was used to study the influence of protein size on ΔVu in a systematic way with a set of deletion mutants. high-pressure fluorescence data provided new evidence that neither peptide bond hydration nor differential side-chain hydration can be considered a major contributor to the measured ΔVu value. additional molecular dynamics (md) simulations rather suggested that the heterogeneous distribution of void volume in the folded-state structures could explain the ΔVu variations among the nank deletion mutants. the specific issue of the void volume contribution to ΔVu values was studied using 10 cavity mutants of snase, allowing a large structural mapping of the alanine mutations on this globular protein. the combination of x-ray crystallography, high-pressure fluorescence, high-pressure nmr and md simulations provided a first clear determination of the void volume contribution to the ΔVu values. these results also bring an unprecedentedly complex and heterogeneous picture, at residue level, of the apparent two-state folding process of snase. we expressed an ig domain (i27) and a 170-residue-long fragment of the pevk domain in order to investigate the effect of temperature and pressure on their conformation. ftir spectroscopy is a useful method for investigating the secondary structure of proteins. we analyzed the amide i band to obtain information on protein structure. fluorescence labeling was also used in some experiments. to generate high pressures, a diamond anvil cell was employed.
the ftir and fluorescence spectra of the protein fragments were recorded across pressure and temperature ranges of 0-1 gpa and 0-100°c, respectively. moderate changes were observed in the conformation of the pevk fragments in the explored range of the t-p plane, suggesting that the domain is a highly flexible random coil across the entire studied t-p range. by contrast, the i27 domain showed a quite stable secondary structure. intrinsically disordered proteins participate in important regulatory functions in the cell, including regulation of transcription, translation, the cell cycle, and numerous signal transduction events. disordered proteins often undergo coupled folding and binding transitions upon interaction with their cellular targets. the lack of stable globular structure can confer numerous functional advantages, including, paradoxically, both binding promiscuity and high specificity in target interactions. nmr is unique in being able to provide detailed insights into the intrinsic conformational preferences and dynamics of unfolded and partly folded proteins, and into the mechanism of coupled folding and binding. the function of intrinsically disordered protein domains in transcriptional regulation and signaling will be described, with particular reference to the general transcriptional coactivators cbp and p300, the tumor suppressor p53, and the adenovirus e1a oncoprotein. the globular domains of cbp/p300 are targets for coupled folding and binding of disordered transactivation motifs of numerous transcription factors and viral oncogenes, which compete for binding to limiting amounts of cbp/p300. many intrinsically disordered proteins contain multipartite interaction motifs that perform an essential function in the integration of complex signaling networks. the role of multipartite binding motifs and post-translational modifications in the regulation of p53-mediated signaling pathways will be discussed.
the early vascular network is one of the simplest functioning organs in the embryo. its formation involves only one cell type, and it can be readily observed and manipulated in avian embryos or in vitro explants. the early vascular network of warm-blooded vertebrates self-organizes through the collective motility of cell streams, or multicellular ''sprouts''. the elongation of these future vascular network segments depends on a continuous supply of cells moving along the sprout towards its tip. to understand the observed self-organization process, we investigate computational models containing interactions between adherent, polarized and self-propelled cells. by comparing the simulations with data from in vivo or simplistic in vitro experiments, we explore the role of active migration, leader cells, invasion of the ecm, and cell guidance by the micromechanical properties of adjacent cell surfaces. boron neutron capture therapy (bnct) is a promising method for treating glioblastoma multiforme, a highly fatal brain tumor. it is a binary modality, in which two components are used simultaneously: thermal neutrons and boron-10. the biophysics of bnct is very complicated, primarily due to the complexity of the elemental composition of the brain. moreover, numerous components contribute to the overall radiation dose both to normal brain and to tumor. simple algebraic summation cannot be applied to these dose components, since each component must first be weighted by its relative biological effectiveness (rbe) value. unfortunately, there is no worldwide agreement on these rbe values. thermal neutrons were formerly employed for bnct, but they failed to prove therapeutic efficacy. later, epithermal neutrons were suggested, on the assumption that they would be sufficiently thermalized while being transported through brain tissue. however, debate arose regarding the optimum source neutron energy for treating brain tumors located at different depths in the brain.
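the rbe weighting described above amounts to a weighted sum over dose components; the sketch below uses hypothetical component doses and weighting factors, since, as the abstract notes, no agreed rbe values exist:

```python
# illustrative RBE-weighted dose summation for BNCT; the component doses
# and weighting factors below are hypothetical placeholders, not values
# taken from the abstract
components = {
    # name: (physical dose in Gy, assumed weighting factor)
    "boron capture (10B)": (2.0, 3.8),
    "nitrogen capture":    (0.5, 3.2),
    "fast neutrons":       (0.3, 3.2),
    "gamma":               (0.8, 1.0),
}

# plain sum of physical doses vs. the biologically weighted sum
physical = sum(dose for dose, _ in components.values())
weighted = sum(dose * rbe for dose, rbe in components.values())
print(f"physical dose: {physical:.1f} Gy")
print(f"weighted dose: {weighted:.1f} Gy(w)")
```

the gap between the two totals illustrates why a simple algebraic summation of the physical doses, without the per-component weighting, misrepresents the biological dose.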
insufficient knowledge regarding the rbe values of the different bnct dose components was a major obstacle. a new concept was adopted for estimating the optimum source neutron energy appropriate for different circumstances of bnct. four postulates on the optimum source neutron energy were worked out, almost entirely independent of the rbe values of the different dose components, and four corresponding conditions on the optimum source neutron energy were deduced. an energy escalation study was carried out investigating 65 different source neutron energies between 0.01 ev and 13.2 mev. the mcnp4b monte carlo neutron transport code was utilized to study the behavior of these neutrons in the brain, and the four deduced conditions were applied to the results. a source neutron energy range from a few electron volts (ev) to about 30 kev was estimated to be optimum for bnct of brain tumors located at different depths in the brain. simulation of mutation induction by inhaled radon progenies in the bronchial epithelium. balázs g. madas, árpád farkas and imre balásházy, hungarian academy of sciences, kfki atomic energy research institute, konkoly-thege miklós út 29-33., budapest, h-1121, hungary. radon is considered the second most important cause of lung cancer after smoking. understanding the mechanisms leading from radon exposure to cancer formation is of crucial importance. this study focuses on the description of mutation induction by radon progenies in the bronchial epithelium. a computational fluid and particle dynamics approach was applied to determine the radio-aerosol deposition distribution in the central airways. a numerical replica of a small fragment of the bronchial epithelium was prepared based on experimental data. microdosimetric computations were performed to quantify the cellular radiation burdens at the very sites of deposition accumulation. a mutagenesis model was applied, supposing that radiation induces dna damage and enhances the cell turnover rate.
the results show that both considered mutagenic effects of densely ionising radiation contribute significantly to mutation induction, and that the mutation rate depends non-linearly on the exposure rate. furthermore, simulations suggest that the local maintenance capacity of the bronchial epithelium can be exhausted by chronic exposure to radon progenies with activity concentrations characteristic of some uranium mines. the present work demonstrates possible applications of numerical modelling in radon-related carcinogenesis studies. the neural crest is a group of cells found in all vertebrate embryos. it forms in the neural folds at the border of the neural plate and gives rise to a huge variety of cells, tissues and organs. one of the astonishing characteristics of neural crest cells is that they are able to migrate very long distances in the embryo. the neural crest has been called the ''explorer of the embryo'', as it is one of the embryonic cell types that migrate most during development, eventually colonizing almost every tissue. in this talk i will discuss our recent findings about neural crest migration. we have shown that neural crest cells, classically described as mesenchymal cells, migrate in large clusters. cytokinesis relies on tight regulation of the mechanical properties of the cell cortex, a thin acto-myosin network lying under the plasma membrane. although most studies of cytokinetic mechanics focus on force generation at the equatorial acto-myosin ring, a contractile cortex remains at the poles of dividing cells throughout cytokinesis. whether polar forces influence cytokinetic cell shape is poorly understood. combining cell biology and biophysics, we demonstrate that the polar cortex makes cytokinesis inherently unstable and that any imbalance in contractile forces between the poles compromises furrow positioning.
we show that limited asymmetric polar contractions occur during normal cytokinesis, and that perturbing the polar cortex leads to cell shape oscillations and division failure. a theoretical model based on a competition between cortex turnover and contraction dynamics accurately accounts for the oscillations. we further propose that blebs, membrane protrusions that commonly form at the poles of dividing cells, stabilise the position of the cleavage furrow by acting as valves releasing cortical contractility. taken together, our findings show that the physical properties of the entire cell are integrated into a fine-tuned mechanical system ensuring successful cytokinesis. collective motion of individual cells marks the onset of the transition to multicellularity in many microorganisms. this transition is often mediated by intercellular communication signals between cells. here we show, in contrast, that the transition from single-cell to collective motion in an ensemble of gliding bacterial cells can be understood as a dynamical self-assembly process of self-propelled rods. experiments were carried out with a mutant of the bacterium myxococcus xanthus moving by means of the a-motility system only and without undergoing reversals. the collective motion phase is confined to a monolayer and is characterized by the organization of cells into larger moving clusters. a transition to collective motion is detected in the experiments by image analysis, which reveals a qualitative change of the cluster-size distribution at a critical cell packing fraction of around 17%. this transition is characterized by a scale-free power-law cluster-size distribution with exponent 0.88. we provide a theoretical model for cluster formation of self-propelled rods that reproduces the experimental findings for the cluster-size distribution.
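a cluster-size power law can be characterized, for exponents above 1, by the standard maximum-likelihood estimator; the sketch below applies it to synthetic data with an assumed exponent (not the myxococcus measurements, whose quoted exponent of 0.88 would additionally require a size cutoff to normalize the distribution):

```python
import math
import random

def mle_powerlaw_exponent(sizes, xmin=1.0):
    """Continuous MLE (valid for alpha > 1): alpha = 1 + n / sum(ln(s/xmin))."""
    data = [s for s in sizes if s >= xmin]
    return 1.0 + len(data) / sum(math.log(s / xmin) for s in data)

# synthetic cluster sizes drawn from p(s) ~ s^-alpha, s >= 1, by inverse transform
random.seed(0)
alpha_true = 2.5  # hypothetical exponent, chosen so the plain MLE applies
samples = [(1.0 - random.random()) ** (-1.0 / (alpha_true - 1.0))
           for _ in range(20000)]

alpha_hat = mle_powerlaw_exponent(samples)
print(f"estimated exponent: {alpha_hat:.2f}")
```

the estimator avoids the bias of fitting a straight line to a log-log histogram, which is a common pitfall when quantifying scale-free cluster statistics.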
our findings suggest that the interplay of the self-propulsion of the bacteria and the volume exclusion effects of the rod-shaped cell bodies is sufficient to explain the onset of collective motion and the related changes in the cluster statistics. despite much speculation on the existence of structurally distinct oligomeric species associated with the conversion of certain monomeric proteins into amyloid fibrils, it has not previously been possible to observe them directly or to relate them to any key mechanistic steps involved in the interconversion process. we have developed a novel application of single-molecule intermolecular fret to investigate in unprecedented detail the aggregation and disaggregation of alpha-synuclein, the protein whose pathogenic deposition as intracellular lewy bodies is a characteristic feature of parkinson's disease. our study reveals that a range of oligomers of different size and structure are formed, even at physiologically relevant concentrations. interestingly, the resistance to degradation of the aggregated state of alpha-synuclein is well established. we focused on the structure-dynamics interplay and showed how the fractal-like properties of proteins lead to such anomalous dynamics. we used diffusion, a method sensitive to the structural features of the protein fold and to them alone, in order to probe protein structure. conducting a large-scale study of diffusion on over 500 pdb structures, we found it to be anomalous, an indication of a fractal-like structure. taking advantage of known and newly derived relations between vibrational dynamics and diffusion, we demonstrated the equivalence of our findings to the existence of structurally originated anomalies in the vibrational dynamics of proteins. more specifically, the time-dependent vibrational mean square displacement (msd) of an amino acid is predicted to be subdiffusive.
the thermal variance in the instantaneous distance between amino acids is shown to grow as a power law of the equilibrium distance, and the autocorrelation function in time of the instantaneous distance between amino acids is shown to decay anomalously. our analysis offers a practical tool that may aid in the identification of amino acid pairs involved in large conformational changes. more recently, we studied the effect of the hydrodynamic interaction between amino acids using a zimm-type model. we computed the time-dependent msd of an amino acid and the time-dependent autocorrelation function of the distance between two amino acids, and showed that these dynamic quantities evolve anomalously, similarly to the rouse-type behavior, yet with modified dynamic exponents. we also studied the dynamic structure factor s(k,t) of proteins at large wavenumbers k, krg >> 1, with rg the gyration radius, which are sensitive to the protein's internal dynamics. we showed that the decay of s(k,t) is dominated by the spatially averaged msd of an amino acid; as a result, s(k,t) effectively decays as a stretched exponential. we compared our theory with recent neutron spin-echo studies of myoglobin and hemoglobin for the rouse and zimm models of hydrodynamic friction. in addition, i will mention two other projects currently underway: (i) a new elastic network model that accounts for the tensorial aspects of protein elasticity and combines stretch-compress springs with bond-bending energies; (ii) the unfolding of a protein under the exertion of a large pulling force. allosteric regulation of enzymatic activity is crucial for controlling a multitude of fundamental cellular processes, yet the molecular-level details underlying regulation often remain poorly understood. here we employed single-molecule activity studies to dissect the mechanistic origin of enzymatic activity regulation.
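a stretched-exponential decay of the kind described for s(k,t) can be written as s(t) = exp[-(t/τ)^β], and β can be recovered by linearizing ln(−ln s) against ln t; the values of τ and β below are illustrative placeholders, not fitted spin-echo data:

```python
import math

# illustrative stretched-exponential decay S(t) = exp(-(t/tau)**beta);
# tau and beta are assumed values, not from the abstract
tau, beta = 5.0, 0.6
times = [0.5 * i for i in range(1, 41)]
s = [math.exp(-((t / tau) ** beta)) for t in times]

# linearize: ln(-ln S) = beta*ln(t) - beta*ln(tau), then take the
# least-squares slope to recover the stretching exponent
x = [math.log(t) for t in times]
y = [math.log(-math.log(v)) for v in s]
n = len(x)
xb, yb = sum(x) / n, sum(y) / n
beta_hat = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y)) / \
           sum((xi - xb) ** 2 for xi in x)
print(f"recovered stretching exponent beta ~ {beta_hat:.2f}")
```

a fitted β well below 1 is the signature that distinguishes this anomalous relaxation from simple single-exponential decay.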
as a model system we employed a lipase and measured its activity as a function of accessibility to surface-tethered liposomes [1], which are known regulators of its activity. our results surprisingly revealed that the lipase oscillates between two states of different activity. we accurately quantified, for the first time, both the interconversion rates between the activity states and the inherent activity of these states. based on these, we calculated the energetic landscape of the entire reaction pathway and identified that regulatory interactions redistributed the probability of residing in pre-existing enzymatic activity states but did not alter the activity of these states. our findings provide the missing link between conformational and activity substates and represent the first direct validation of the textbook hypothesis of conformational selection for the regulation of enzymatic activity. to identify the potential targets of cgmp in arabidopsis plants, we adopted a proteomic approach to isolate possible cgmp-binding proteins. purification of soluble cgmp-binding proteins was performed using a cgmp-agarose-based affinity chromatography procedure. the eluted proteins were then analyzed by sds-page, which revealed ten bands. we focused the subsequent analysis on low-molecular-weight peptides of 15, 16 and 18 kda, which bound cgmp more intensively. after 2d-ief-page of the proteins isolated by cgmp-agarose affinity chromatography, the eight most abundant protein spots in the low-molecular-weight area were visualized. these spots of interest were excised from the gel and in-gel digested by trypsin. the tryptic peptides were then analyzed by maldi-tof mass spectrometry and identified as isoforms of nucleoside diphosphate kinase (ndpk) from arabidopsis. thus, our data suggest that ndpk is a potential target of cgmp signaling in arabidopsis.
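interconversion rates between two activity states are commonly estimated from single-molecule recordings as the reciprocal of the mean dwell time in each state; the sketch below uses synthetic exponential dwell times and assumed rates, not the lipase data:

```python
import random

# synthetic dwell times for a two-state system; the rates are hypothetical
random.seed(1)
k_hi_to_lo, k_lo_to_hi = 2.0, 0.5  # assumed "true" rates, s^-1
dwell_hi = [random.expovariate(k_hi_to_lo) for _ in range(5000)]
dwell_lo = [random.expovariate(k_lo_to_hi) for _ in range(5000)]

# for exponential dwell-time distributions, rate = 1 / mean dwell time
k12 = 1.0 / (sum(dwell_hi) / len(dwell_hi))
k21 = 1.0 / (sum(dwell_lo) / len(dwell_lo))
occupancy_hi = k21 / (k12 + k21)  # equilibrium fraction in the high state
print(f"k(high->low) ~ {k12:.2f} /s, k(low->high) ~ {k21:.2f} /s")
print(f"equilibrium occupancy of high-activity state ~ {occupancy_hi:.2f}")
```

the equilibrium occupancy computed from the rates is the quantity that a conformational-selection regulator would shift, without changing the inherent activity of either state, mirroring the conclusion of the abstract.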
dual-color fluorescence-burst analysis (dcfba) was applied to measure the quaternary structure and high-affinity binding of the bacterial motor protein seca to the protein-conducting channel secyeg reconstituted into lipid vesicles. dcfba is an equilibrium technique that enables the direct observation and quantification of protein-protein interactions at the single-molecule level. seca binds to secyeg as a dimer, with a nucleotide- and preprotein-dependent dissociation constant. one of the seca protomers binds secyeg in a salt-resistant manner, while binding of the second protomer is salt-sensitive. since protein translocation is salt-sensitive, we conclude that the dimeric state of seca is required for protein translocation. a structural model for the dimeric assembly of seca while bound to secyeg is proposed, based on the crystal structures of the thermotoga maritima seca-secyeg complex and the escherichia coli seca dimer.
• dcfba is a fluorescence-based single-molecule technique that allows assessment of the stoichiometry of ligands bound to membrane receptors
• dimeric seca binds asymmetrically to the protein-conducting membrane channel secyeg
• monomeric seca binds secyeg, but dimeric seca is required for protein translocation
• protein translocation depends on receptor cycling of the dimeric seca
if the dna charge is sufficiently neutralized by counter-ions, electrostatic interactions between helical charge patterns can cause attraction [1]. helix-specific interactions also cause tilt, in one direction, between two dna fragments [1]. in braids and supercoils, this impetus to tilt breaks positive-negative supercoil symmetry. we show that these effects may cause spontaneous braiding of two molecules, lowering the dna pairing energy [2]. the pairing is more energetically favourable for homologues (same base-pair text) than for non-homologous pairs. this might explain the pairing between homologues only that has been observed in nacl solution [3].
we also construct a simple model for a closed-loop supercoil, including chiral electrostatic interactions. very interesting effects arise for sufficient charge neutralization and groove localization of counter-ions: (i) positive supercoils are more energetically favourable than negative ones; (ii) there is a transition between loosely and tightly wound supercoils as one moves from negative to positive values of the supercoiling density; (iii) in positive supercoils the chiral interaction underwinds dna. von willebrand factor (vwf) is a large multimeric protein that is crucial for the force-sensing cascade triggering primary hemostasis. it mediates binding of activated thrombocytes to injured epithelial tissue and serves as a transporter for coagulation factor viii. while it has been shown that the hemostatic activity of vwf is affected by shear stress [1], the exact impact that shear forces have on the inflammatory cascade remains unclear. it is assumed that hydrodynamic forces lead to partial unfolding of vwf, which in consequence exposes more binding sites. in order to observe shear-induced changes of the protein's functionality, we measure conformational changes of vwf under flow with fluorescence correlation spectroscopy (fcs). we aim to measure the degree of uncoiling of vwf multimers under various buffer conditions, e.g. in the presence of colloids, vesicles or platelets. as only large multimers show significant hemostatic activity, we intend to monitor the molecular weight distribution of vwf. shifts in this distribution indicate various pathological conditions, making our multimer analysis a fast diagnostic tool for vwf-related diseases. this will serve as a basis for studies of vwf binding to collagen, fviii, gpib, vesicles and membrane-coated nanoparticles under shear flow. data on the mechanical properties of medically important proteins located in neural junctions are very limited.
contactins (cntn) and paranodin, proteins located in the extracellular part of the nodes of ranvier, are important for proper brain wiring. here we study a new series of fniii modules from human cntn-1 and cntn-4 using single-molecule afm force spectroscopy and advanced all-atom steered molecular dynamics (smd) computer simulations. mutations in cntns are responsible for numerous brain disorders, including autism and pathological development of odor maps. perhaps the mechanical properties of individual mutated fniii protein modules are compromised; we address this problem here. a comparison of our afm force spectra with those of reference proteins will be presented [1, 2], and a molecular-level interpretation of fniii nanomechanics, based on our smd data, will be given. we believe that these data should help in understanding the role of cntn in the regulation of sodium ion channels in both normal and autistic subjects. supported in part by the polish ministry of education and science, grant no. n202 262038, the computational center task in gdansk and a license for accelrys software. recent achievements in rational dna-motor engineering demonstrate the possibility of designing nano-motors and nano-robots capable of performing externally controlled or programmed tasks. a major obstacle in developing such a complex molecular machine is the difficulty of characterizing the intermediates, the final products and their activity. typically, non-in-situ gel and afm methods and in-situ bulk fluorescence methods are used. i will present two dna-motors recently developed and studied using in-situ single-molecule fluorescence resonance energy transfer (smfret), alternating laser excitation (alex) and total internal reflection fluorescence spectroscopy (tirf), and will demonstrate that these methods can improve the way we design, construct, measure and understand highly complex dna-based machines.
a motor made of a bipedal dna walker, which walks on a dna track embedded on a dna origami and is capable of long walking distances while maintaining structural stability, will be presented. the motor is non-autonomous; it receives ss-dna fuel/anti-fuel commands from outside (as in shin & pierce, jacs, 2004). the motor's assembly stages and single-motor walking steps are monitored using smfret. the second motor is based on a published bipedal autonomous dna-motor (seeman, science 2009). it is characterized by coordinated activity between the different motor domains, leading to processive, linear and synchronized movement along a directionally polar track. to prove that the motor indeed walks, the authors chemically froze the motor at each step and used a complicated radioactive gel assay. i will demonstrate that using a single-molecule approach we are able to directly and in-situ measure a single motor's movements in a few simple experimental steps, and to measure its structural dynamics and kinetics.

translation by a single eukaryotic ribosome. using single-molecule total internal reflection fluorescence microscopy, we observed translation of a short messenger rna (mrna) strand by single eukaryotic ribosomes. the ribosome-mrna complexes are fixed to a microscope coverslip through the mrna, and the mrnas are located through fluorescently labelled oligonucleotides hybridized to them downstream of the start codon. because of the ribosome's helicase activity, the double strand formed by the oligonucleotide and the mrna is opened while the ribosome translates this region of the mrna. thus, the loss of the fluorescence signal allows us to measure the distribution of translation speeds of single ribosomes. careful attention was given to photobleaching in the data analysis. this experiment opens the door to the study of eukaryotic translation at the single-molecule level.

erythrocyte hyperaggregation, a cardiovascular risk factor, has been associated with high plasma concentrations of fibrinogen.
using atomic force microscopy (afm)-based force spectroscopy measurements, we have recently identified the erythrocyte membrane receptor for fibrinogen, an integrin with a β3 or β3-like subunit [1] . after this, we extended the study to the influence of erythrocyte aging on fibrinogen binding [2] . force spectroscopy measurements showed that upon erythrocyte aging the binding to fibrinogen decreases: the frequency of binding events drops (from 18.6% to 4.6%) but their force does not. this observation is reinforced by zeta-potential and fluorescence spectroscopy measurements. knowing that younger erythrocytes bind more to fibrinogen, we can presume that this population is the main contributor to the cardiovascular diseases associated with increased fibrinogen blood content, which disturbs the blood flow. our data also show that sialic acid residues on the erythrocyte membrane contribute to the interaction with fibrinogen, possibly by facilitating the binding to its receptor.

antimicrobial peptides are usually polycationic and amphiphilic, with high affinity for bacterial membranes. in order to characterize their therapeutic potential it is crucial to identify which properties of the peptide and lipids are important for target selectivity, and to examine the peptide structure and its association with lipid bilayers. in this work, first experiments have been carried out on a promising peptide called sb056, which might represent the basis for developing a novel class of antibiotics. with the goal of enhancing the activity of a new semi-synthetic sequence, two identical peptides (wkkirvrlsa) were assembled via a lysine linker, which also carries an octanoyl lipid anchor. a highly active compound was obtained, but its structure and mode of action remain unexplored. this dendrimeric peptide and its linear deca-peptide counterpart are being studied in parallel to highlight the relevant properties of, and differences between, the dendrimeric structure and the linear sequence.
monolayer intercalation is investigated with microtensiometry, and fluorescence spectroscopy is applied to study the thermodynamics and kinetics of the binding process. circular dichroism, nmr and md simulations are employed with the aim of elucidating the 3d structure in the membrane-bound state.

the capability of proteins to build structures via self-organization has fascinated biophysicists for decades. with the advent of single-molecule methods, namely fluorescence correlation spectroscopy (fcs) and fluorescence resonance energy transfer (fret), the process of complex formation is becoming accessible to direct observation. coronaviruses (cov) are enveloped positive-stranded rna viruses. for sars-cov, it was shown that coronaviruses encode an rna-dependent rna polymerase (rdrp) built from non-structural protein 7 (nsp7) and non-structural protein 8 (nsp8). this hexadecameric nsp7-nsp8 complex is a hollow, cylinder-like structure assembled from eight copies of nsp8 and held together by eight nsp7 molecules [1, 2] . we aim to understand the assembly process and conformational changes of the complex for the related feline coronavirus. first results indicate that nsp8 alone forms a dimer, in which interchain fret is more efficient than intrachain fret. for the complex, the results indicate that nsp7 and nsp8 form a heterodimer, which is different from sars-cov. our experiments highlight the potential of single-molecule fret for the study of protein complex formation.

diffracted x-ray tracking (dxt) is a powerful technique for detecting subtle dynamic motions of a target protein at the single-molecule level. in dxt, the dynamics of a single protein can be monitored through the trajectory of the laue spot from a nanocrystal attached to the protein of interest. in this study, dxt was applied to the group ii chaperonin, a protein machine that captures an unfolded protein and refolds it to the correct conformation in an atp-dependent manner.
a mutant group ii chaperonin from thermococcus strain ks-1, with a cys residue at the tip of the helical protrusion, was immobilized on a gold substrate surface and labeled with a gold nanocrystal. we monitored diffracted spots from the nanocrystal to follow the dynamic motion of the chaperonin, and found that the torsional motion of the chaperonin in the presence of atp was 10 times larger than in its absence. a uv-light-triggered dxt study using caged atp revealed that the chaperonin twists counterclockwise (viewed from the top) when it closes its chamber, and that the angular velocity from the open to the closed state is 10% faster than that from the closed to the open state.

peptides or proteins may convert (under some conditions) from their soluble forms into highly ordered fibrillar aggregates. in vivo, such transitions can lead to neurodegenerative disorders such as alzheimer's disease. alzheimer's disease is characterised by the extracellular deposition of the abeta peptide in amyloid plaques, and the intracellular formation of neurofibrillary tangle (nft) deposits within neurons, the latter correlating well with disease severity. the major constituents of nft deposits are paired helical filaments (phfs) composed of a microtubule-associated protein known as tau. studying the process by which tau forms these large aggregates may be an essential step in understanding the molecular basis of alzheimer's disease and other tauopathies. we have applied a two-colour single-molecule fluorescence technique and single-molecule intermolecular fret measurements to study the soluble oligomers of tau which are formed during the aggregation and disaggregation of phfs.

the neuronal protein alpha-synuclein is considered to play a critical role in the onset and progression of parkinson's disease. fibrillar aggregates of alpha-synuclein are the main constituents of the lewy bodies that are found in the brains of parkinson patients.
however, there is growing evidence suggesting that oligomeric aggregates are significantly more toxic to cells than fibrillar aggregates. very little is known about the structure and composition of these oligomeric aggregates. we present results using single-molecule photobleaching approaches to determine the number of monomeric subunits constituting the oligomers. our results show that the oligomers have a narrow size distribution, consisting of approximately 13-20 monomers per oligomer. fluorescence correlation spectroscopy data confirm the narrow size distribution and additionally indicate a very loose packing of the oligomers. in combination with bulk fluorescence spectroscopy results for tryptophan-containing mutants of alpha-synuclein, we present a structural model for the alpha-synuclein oligomer.

gold colloids are widely used for in vitro and in vivo imaging. compared to traditional optical tags, sers-coded nanoparticles show a narrow emission bandwidth with structured spectra typical of the molecule used, a wider excitation bandwidth, higher emission intensity, better photo-stability and lower toxicity. this is why, in cancer therapy, besides being considered good tools for the delivery of anti-tumor drugs, aunps can also serve as optical tags, allowing np localization to be analysed by laser scanning microscopy and the process of drug release inside the cells to be followed by raman. in our work we used 10 nm diameter aunps loaded with rhodamine 6g, a molecule with high raman and fluorescence efficiency and with a chemical structure similar to doxorubicin, the antitumoral drug used in our system. the data showed that the aunps are internalized by cells and that sers can be performed. 10 nm and 60 nm diameter aunps loaded with doxorubicin were incubated at different time points with the a549 cell line (human adenocarcinomic alveolar basal epithelial cells). only the 60 nm aunps showed the intense raman emission typical of the doxorubicin phonon transitions.
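the photobleaching analysis above infers oligomer stoichiometry by counting discrete downward intensity steps as individual fluorophores bleach. as a minimal sketch of such step counting, not the authors' actual pipeline (the function name, smoothing window and step threshold are illustrative assumptions), one can median-filter the trace and count large downward jumps:

```python
def count_bleach_steps(trace, min_step=0.5, window=3):
    """count downward photobleaching steps in a fluorescence trace.

    a moving-median filter suppresses shot noise, then any drop larger
    than `min_step` between consecutive filtered points is counted as
    one bleaching event (= one fluorophore, hence one subunit).
    """
    half = window // 2
    smooth = []
    for i in range(len(trace)):
        lo, hi = max(0, i - half), min(len(trace), i + half + 1)
        smooth.append(sorted(trace[lo:hi])[(hi - lo) // 2])
    # count large downward jumps between consecutive filtered samples
    return sum(1 for a, b in zip(smooth, smooth[1:]) if a - b >= min_step)

# toy staircase trace with three unit-height bleaching steps:
# count_bleach_steps([3.0]*5 + [2.0]*5 + [1.0]*5 + [0.0]*5) -> 3
```

real traces need noise-adaptive thresholds and corrections for pre-bleached or dark fluorophores, which is one reason the measured stoichiometries form a distribution (here approximately 13-20 monomers) rather than a single count.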
in recent years biomedical applications of diamond nanoparticles have become of significant interest, which raises questions about their biocompatibility and their mechanisms of interaction with cells. the aim of this study was to compare the effect of non-modified diamond nanoparticles (dnps) and dnps modified by the fenton reaction on human endothelial cells. dnps (<10 nm particle size, sigma) were modified by the fenton reaction, introducing surface -oh groups. immortalized human endothelial cells (huvec-st) were incubated with 2-100 µg/ml dnps in optimem medium. diamond nanoparticles modified by the fenton reaction had a smaller hydrodynamic diameter, estimated by dynamic light scattering, and a smaller surface (zeta) potential, measured using laser-doppler electrophoresis. they were also more cytotoxic, as evaluated by the mtt reduction assay. dnps augmented the generation of reactive oxygen species in the cells, estimated by oxidation of 2',7'-dichlorofluorescin, the effect being higher for the fenton-modified dnps after 48-h incubation. cellular production of nitric oxide, estimated with daf-fm, was also affected by dnps; after 72 h, fenton-modified dnps, in contrast to non-modified diamond, decreased no production. diamond nanoparticles also affected the cellular level of glutathione and the activities of the main antioxidant enzymes (superoxide dismutase, catalase, glutathione peroxidase, glutathione reductase and glutathione s-transferase).

we aim to investigate how photoreactions of proteins can be controlled by means of intense thz radiation tuned in resonance to specific vibrational modes, in analogy to coherent-control experiments conducted with fs nir laser pulses [1] . for this we will combine a time-resolved ir difference spectroscopic setup with uniquely intense, tunable, narrow-bandwidth thz radiation (3-280 µm) at the ps beamline of the thz free electron laser felbe.
these experiments will be performed on bacteriorhodopsin (br), which is the sole protein of the purple membrane of the archaebacterium halobacterium salinarum [2] . upon illumination, the chromophore retinal isomerizes around the c13-c14 double bond [3] and br pumps a proton from the cytoplasmic to the extracellular side. this proton gradient is used by the bacterium to drive photosynthetic atp production under low oxygen tension [4] . in our experiment, the photoreaction is initiated by a visible laser pulse as in standard experiments, but then the sample will be irradiated by a thz pulse from the free electron laser tuned into resonance with low-energy vibrational modes, which is expected to influence the photoreaction [1] . such vibrational control will be monitored by time-resolved ftir spectroscopy using the step-scan technique [5] .

liposomes are increasingly studied as nanoscale drug delivery systems and biomembrane models. however, the exact structure, dynamics and mechanical behavior of liposomes are little known. atomic force microscopy (afm) is a powerful tool for characterizing nanoscale morphology and enables the mechanical manipulation of submicron-sized vesicles. a drawback of afm, however, is that liposomes may flatten and rupture on substrates to form patches or supported planar bilayers (spbs). our aim was to obtain a better understanding of the factors affecting liposomes on substrates and to find experimental conditions at which liposomes preserve their structural integrity. in the presence of divalent cations, dppc liposomes formed spbs on mica. vesicles that sedimented subsequently preserved their integrity and showed stronger attachment to the spb. in addition to cross-bridging lipid head groups, divalent cations influence the surface charge of liposomes, thereby modulating liposome-substrate and liposome-liposome interfacial interactions.
preserved vesicles stabilized by divalent cations may provide a unique experimental system for studying membrane-protein interactions.

the influence of e.g. ph and ionic strength on various chromatographic bead-biomass combinations is investigated. to analyze the force curves, possible elastic contributions, e.g. from deforming cellular membranes, have to be decoupled from the interaction forces. then, bead-biomass interactions will be modeled using (extended) dlvo theory and the resulting data can also be compared to real-life eba processes. the project aims at a better understanding of the interaction forces in chromatography and might help to improve the process quality of eba.

long-term non-invasive in vivo monitoring of the survival, migration, homing and fate of transplanted cells is of key importance for the success of cell therapy and regenerative medicine. tools for in vivo magnetic resonance (mr) imaging of labeled cells are therefore being developed. we have prepared superparamagnetic iron oxide nanoparticles by the coprecipitation of fe(ii) and fe(iii) salts and oxidation. to stabilize the particles and to facilitate their internalization by the cells, the nanoparticles were coated with several novel low- and high-molecular-weight compounds including d-mannose, poly(l-lysine), poly(n,n-dimethylacrylamide) and a dopamine-hyaluronate conjugate. the surface-modified magnetic nanoparticles were thoroughly characterized by a range of physico-chemical methods, which proved the presence of the coating on the particles. the particles were then investigated in stem cell experiments in terms of real-time cell proliferation analysis, viability, labeling efficiency and differentiation. the iron oxide concentration of the labeled cells was assessed using mr relaxometry. the advantages and disadvantages of particular iron oxide coatings will be discussed and the optimal coating suggested. excellent contrast was achieved by labeling the cells with dopamine-hyaluronate-coated nanoparticles.
support of the as cr (no. kan401220801) is acknowledged.

in recent decades, interest in the fabrication of innovative biosensors with improved sensitivity and reliability for medical-diagnostics applications has risen constantly. among the different techniques, microfluidic systems are playing a major role. in order to detect extremely low concentrations of biomolecules (pm and fm), attention should be paid to the controlled, selective functionalization of micro- and nano-channels. in this work we propose a new approach to functionalizing gold patches inside fluidic channels. we start from self-assembled monolayers (sams) of thiolated molecules on a gold electrode deposited inside the channel. then, using an electrochemical approach [1, 2] , we remove molecules from the sam at selected locations by applying a negative voltage to the electrode. the newly exposed gold surface can be re-functionalized using a thiolated biomolecule (i.e. an antibody) capable of binding specific proteins flowing inside the channel. the cycle can be applied to other electrodes in the microfluidic system, creating a multiplexing device which, as we will show, can differentially measure ionic current flows in different channels.

optically actuated micromanipulation and micro-probing of biological samples are increasingly important methods in today's laboratory. microbeads as probes are the most commonly used tools in this field, although the only manipulative motion they allow is translation. we present polymerized 3d microstructures which can also be used for optical micromanipulation with more degrees of freedom than microbeads. two-photon polymerization (tpp), based on focusing a fs laser beam into appropriate photopolymers, is a powerful method for building structures of arbitrary complexity with submicrometer resolution.
the presented tools have the advantage of being capable of twisting and rotational manipulative motion, and of spatially separating the position of biological manipulation from that of optical trapping. different manipulative interfaces, the positioning stability and the surface activation of the manipulators will be discussed.

the potential application of multiplexed quantum dot labeling (mqdl) in clinical detection, prognosis and monitoring of therapeutic response has attracted great interest from bioengineers, pathologists and cancer biologists. mqdl is superior to conventional organic-dye staining in its narrow emission bandwidths, wide signal dynamic range, high detection sensitivity and low noise-to-signal ratio. however, the majority of mqdl applications have been limited to the identification of specific cell types or cancer subtypes and to improvements in labeling methodology. in this study, we focused on the simultaneous detection and analysis of 5 proteins in the c-met activation pathway, i.e. rankl, vegf, nrpln-1, p-c-met and mcl-1, which are known to be associated with human prostate cancer progression and metastasis. two experimental systems were analyzed: 1) fixed xenograft tissues from an established ltl313 castration-resistant human prostate cancer (crpc) model; and 2) clinical prostate tissue specimens from localized cancer and bone metastasis. in the presentation we will report our experience with 1) the mqdl protocol optimization for the sequential reactions of the individual primary antibody, the biotinylated secondary antibody and the streptavidin-coated qd conjugate with nuclear dapi staining; and 2) the multiplexed image capture, image unmixing and subsequent per-cell quantification. for future multi-specimen analyses and validation, we will introduce a high-throughput vectra image analysis system.

carbon nanotubes (cnts) [1] are already quite popular in many scientific and technological disciplines [2] .
in recent years they have been targeted for biotechnological and medical applications. in this work we have investigated the nanostructural self-assembly of biological lipid molecules in the presence of cnts. the advantage of using highly aligned cnts [3] for this purpose is the possibility of studying the interactions of lipid molecules on the macromolecular surface as well as in the confinement of the aligned cnts. we have observed various lyotropic nanostructures that are found for the corresponding lipids in the bulk under dry and hydrated conditions [4] . the nanostructural studies were mainly performed using small- and wide-angle x-ray scattering techniques. this work is crucial for designing nano-micro-fluidic architectures and supported model membranes where both functionalization of cnts and nanostructural assembly of lipids could be employed simultaneously.

an alternative to the commonly used measuring tools in protein research and medical diagnostics.

thiazolidone derivatives are novel synthetic compounds possessing various biological activities. we selected three such compounds, les-3120, -3166 and -3372, which passed national cancer institute in vitro tests. annexin v/pi and dapi staining, dna electrophoresis in agarose gel, and western-blot analysis using specific antibodies against 30 cellular proteins involved in apoptosis were applied to study the molecular mechanisms of tumor cell death induced by these compounds. it was found that the molecular targets of thiazolidones in target cells strongly depend on the structure of their side groups: les-3120, containing an isatine fragment, activated caspase-8, which is involved in receptor-mediated apoptosis; les-3166, possessing a benzthiazol residue, induced mitochondrial apoptosis mediated by caspase-9; and les-3372, which has a unique chlorine atom in its side chain, also led to mitochondrial apoptosis, mediated by aif (apoptosis-inducing factor).
to increase the anticancer potential of these molecules, an in silico study was performed and the most active groups of les-3120 and les-3372 were combined into one molecule. in vitro studies showed that this hybrid molecule, called les-3661, possessed a 10-fold higher anticancer potential (ic50 = 1 µM) compared with the initial compounds.

we report on the synthesis and characterization of vaterite microcontainers for controlled drug release. moreover, we present experiments on possible release strategies for encapsulated substances, via recrystallization, ph control or desorption. vaterite spherical particles were fabricated with controllable average sizes from 400±12 nm to 10±1 µm. we considered two ways of functionalizing the containers: encapsulation of the substances during the vaterite synthesis, or their adsorption onto the prepared particles. in model experiments, vaterite containers encapsulating rhodamine 6g were imaged by two-photon microscopy, showing dye release into the aqueous medium due to recrystallization to calcite within 3 days. in ethanol, by contrast, only small amounts of the encapsulated markers were released by diffusion after one week. the release mechanisms can be further controlled by covering the microcontainers with additional polymer layers to increase the diffusion and recrystallization times. a change of the ph from neutral to acidic conditions leads to the destruction of the vaterite matrix, followed by a quick release of the encapsulated materials. these flexible control mechanisms make this system an interesting candidate for pharmaceutical applications.

magnetic nanoparticles (nps) in combination with therapeutic molecules represent one of the most promising methods for targeted drug delivery. one of the major current limitations of magnetic drug targeting is achieving an efficient concentration of magnetic carrier-drug complexes at the targeted sites, owing to the poor mobility of nanoparticles in tissue structures.
interstitial delivery is hindered by the microscopic extracellular matrix, which represents a major barrier to nanoparticle mobility. in order to achieve efficient magnetic drug targeting it is crucial to know the particle mobility in a given in vivo environment, as well as to apply a magnetic field with an appropriate gradient to drag the magnetic nps. we used gel magnetophoresis to measure the mobilities of different magnetic nps (co-ferrite, γ-fe2o3) in agarose gel. numerical modeling using the fem was used to determine appropriate magnet configurations that generate a sufficient magnetic field gradient. further, we used the numerical modeling to evaluate the magnetic force on the nps for different geometries. we found that crucial factors determining the final mobility in tissue are the formation of larger nanoparticle aggregates under physiological conditions and the interaction of the nanoparticles with the surrounding matrix.

defining the forces required to gate mechanosensitive channels in mammalian sensory neurons. kate poole and gary lewin, department of neuroscience, max delbrueck center for molecular medicine, robert-roessle str. 10, 13125 berlin-buch, germany. our sense of touch and mechanical pain is based on mechano-electrical transduction (met) at the terminal endings of subsets of dorsal root ganglion (drg) neurons innervating the skin. to quantify the stimulus strengths required to gate mechanosensitive channels in these subsets of neurons, we developed an approach using microstructured surfaces. the drg neurons are grown on laminin-coated pdms pillar arrays, mechanical stimuli are applied by deflecting individual pili, and the deflection is monitored using light microscopy. as the pili behave as light guides, the center of each pilus can be determined from a fit of the intensity values, allowing detection of movements of a few nanometers. the response to such stimuli is monitored using whole-cell patch clamp.
pili deflections of 10 nm can gate the rapidly adapting current in mechanoreceptor cells, while deflections above 150 nm are required for gating of slowly adapting currents in nociceptors. smaller stimuli are required to generate currents via pili deflection (10 nm) than via neurite indentation (70-100 nm), suggesting that gating occurs at the cell-substrate interface. we have also characterized the met currents present in n2a cells, which we show are modulated by the substrate to which the cells are attached.

enhanced stimulation of toll-like receptor 9 via immunostimulatory nanoparticles. jan rother, anna pietuch, andreas janshoff, georg-august university göttingen, institute of physical chemistry, tammannstr. 6, d-37077 göttingen, germany; e-mail: jrother@gwdg.de. among the toll-like receptor family (tlrs), tlr9 has been the subject of intensive research because of its predominant localization in the lysosomes of immune cells and its ligand, rendering it a potential candidate for immunotherapy of autoimmune diseases and cancer. additionally, use as an adjuvant in vaccination is envisaged, using synthetic cpg-oligodeoxynucleotides (cpg-odns). although immunostimulatory cpg-odns have already shown promising results in animal experiments and clinical trials, several groups have found that tlr9 is also expressed by tumor cells. first experiments show that activation of the tlr9 displayed by cancerous cells leads to a decreased apoptosis rate and to proliferation, posing an unpredictable threat to tumor patients exposed to cpg-odns. therefore, detailed knowledge about the impact of cpg-odns on cancer cells is indispensable for their safe use in pharmaceutics. herein, we describe a sophisticated way to address tlr9 in cancer cells, using cpg-odn-functionalized 'superparamagnetic' mno and γ-fe2o3 nanoparticles (nps) to stimulate tlr9 in a549 cells. analysis of impedimetric measurements revealed a cytotoxic effect of the mno nps.
cells treated with immunostimulatory fe2o3 nps showed increased micromotility as well as a higher long-term correlation of the impedance signal.

sers offers advantages for biomedical diagnostics such as high sensitivity, single-molecule studies and easy sample preparation. furthermore, sers allows non-invasive studies of the conformations of molecules without destruction of living cells, i.e. in vivo [1] . this work presents a sers study of cytosolic hemoglobin (hb c ) using silver nanoparticles (agnps). hb c was isolated from the cytoplasm of rat erythrocytes and diluted. agnps were prepared by a development of the leopold and lendl method [2] . three types of colloids were prepared at various temperatures (25, 40 and 60°c). the resulting agnps were characterized by uv-vis and ftir spectroscopy, dls and tem. reduction of the ag ions leads to the formation of predominantly spherical agnps, but also of small quantities of silver nanorods and faceted and aggregated agnps, with a surface plasmon resonance band in the range of 413-445 nm. for agnps synthesized at 25°c, for example, a bimodal size distribution was observed (mean sizes of about 7 and 45 nm, respectively). sers measurements were optimized for each type of agnps. it was demonstrated that the agnps gave strong raman enhancement from hb c and that the sers spectra of the different colloid types differ from each other.

nano-zno is characterized by unique properties, low toxicity and high biocompatibility, unlike many other nanomaterials. for this reason zno nanoparticles have great potential for applications in biosystems, for example biolabeling, biosensing and delivery systems, which can be used in genetics, pathology, criminology, food safety and many other fields.
these bioapplications require surface modifications, which can be made to the nanostructures to better suit their integration with biological systems, leading to such interesting properties as enhanced aqueous solubility and bio-recognition in biological systems. for the synthesis of zno nanoparticles in aqueous solution we used 11-mercaptoundecanoic acid (mua) as a stabilizing agent. coating the nanoparticles with mua allows their solubilization in water and binding through the carboxyl groups present in its structure. we determined the optimal ph for the solubility of mua-modified nano-zno and for its ability to interact with positive charges. we studied the optical properties of pure and surface-modified nanoparticles and of their conjugates with cytochrome c, and also the effect of ph on the interaction between nano-mua and horse cytochrome c.

the permeation of water-soluble molecules across cell membranes is controlled by channel-forming proteins, and in particular the channel surface determines the selectivity. an adequate method to study the properties of these channels is electrophysiology; in particular, analyzing the ion-current fluctuations in the presence of permeating solutes provides information on possible interactions with the channel surface. as the binding of antibiotic molecules in the channels of interest is significantly weaker than that of preferentially diffusing nutrients in substrate-specific pores, the resolution of conductance measurements has to be significantly increased to be able to resolve the events in all cases. owing to the limited time resolution, fast permeation events are not visible. here we demonstrate that miniaturization of the lipid bilayer, varying the temperature or changing the solvent may enhance the resolution. although electrophysiology is considered a single-molecule technique, it does not provide atomic resolution.
molecular details of solute permeation can be revealed by combining electrophysiology and all-atom computer modeling.

novel functionalized nanocomposites (nc) were designed and synthesized on the basis of polymeric surface-active oligoelectrolytes. the developed technology permits control of: 1) the quality and quantity of the structural blocks of the nc, and size unimodality; 2) branching at specific sites in the nc polymer chain; 3) provision of the nc with reactive chemical groups; and 4) covalent conjugation of specific bio-targeting molecules. the incorporated bioactive elements were: a) specific anticancer drugs, antibiotics and alkaloids; b) dna and sirna; c) immunoglobulins and lectins; d) lipids and amino acids; and e) polyethylene glycol. fluorescent, luminescent, superparamagnetic or x-ray-detectable compounds were also incorporated in the nc to make them detectable and measurable. biocompatible nc possessing low toxicity towards mammalian cells in vitro and in vivo (mice) were created. they were effective in the delivery of: 1) drugs (doxorubicin and antibiotics) for chemotherapy in vitro and in vivo; 2) dna for transfection of mammalian, yeast and bacterial cells; and 3) protein antigens for animal immunization and specific lectins for targeting apoptotic cells. these and other applications of the developed nc and nanobiotechnologies are considered. this work was supported by stcu grants #1930, #4140, #4953.

in the anti-cancer drug delivery domain, nanotechnologies are a promising tool, providing good tissue distribution and low toxicity. drug delivery vehicles relying on solid nanoparticles have been proposed, among which the diamond nanoparticle (size < 20 nm) is a very promising candidate [1] . we have investigated the delivery of sirna by nanodiamonds (nds) into cells in culture, in the context of the treatment of a rare childhood bone cancer (ewing sarcoma) by such a gene therapy.
sirna was bound to the nds after coating them with cationic polymers, so that the interaction is strong enough to pass the cell membrane without loss of the drug yet does not prevent its subsequent release. the cellular studies showed a specific inhibition of gene expression at the mrna and protein levels by the nd-vectorized sirna. we also used the fluorescence of color centers created in the nanodiamonds [2] to monitor the release of fluorescently labelled sirna in the intracellular medium. this technique brings quantitative insight into the efficiency of the sirna in stopping cell proliferation. given the success of the cell model, we recently started drug delivery into tumors xenografted onto nude mice. silica nanoparticles are stable aqueous suspensions of condensed siloxane nanocomposites with an average diameter between 10 and 100 nm. particles containing organic functional groups on their surface are called organically modified silica nanoparticles (ormosil). due to the various chemical and physical properties of the surface groups, ormosil nanoparticles may have an enormous variety of biological applications, such as in vivo bioimaging, non-viral gene delivery or targeted drug delivery. our aim was to synthesize both void and fluorescent-dye-doped amino-functionalized ormosil nanoparticles through the microemulsion method and to use them for gene delivery. the obtained nanoparticles have been characterized by transmission electron microscopy and dynamic light scattering. furthermore, the nanoparticles have been investigated to assess their transfection efficiency and the possible toxicity caused by the surfactants used in the synthesis. the transfection efficiency was tested on various cell cultures. our further aim is the in vivo transfection of salivary glands using ormosil nanoparticles. our work has shown that the nanomedicine approach, with nanoparticles acting as a dna-delivery tool, is a promising direction for targeted gene therapy.
in vivo amperometric cells for the detection of fast diffusing, physiologically important small molecules. lívia nagy, bernadett bareith, tünde angyal, erika pintér, géza nagy; university of pécs, pécs, hungary. h2s is a naturally occurring gas that is toxic in high concentration. it also exists in different tissues of living animals, sometimes in concentrations as high as 20 μM. it is generally accepted that h2s has important roles in modulating different physiologically important biochemical processes, similarly to other fast diffusing molecules like no, co and h2o2. to investigate the physiological effects of these species, their local concentration in the studied biological media needs to be known. this means that methods are needed for measuring the instantaneous concentration with high spatial resolution in living tissues without major invasion. electrometric micro- and ultramicro-sensors often find application in the experimental life sciences for measuring local ion concentrations or for following neurotransmitter species in in vivo measurements. in our work, efforts are being made to improve the applicability of selective electrometric sensors in life science experiments. as a result of this work, an improved h2s measuring cell as well as an improved electrode and method were developed for the measurement of electroactive small molecules like no or h2o2. in the poster to be presented, the structure, working principles and performance of the different sensors mentioned will be described. bacteriorhodopsin (br) is the only protein in the purple membrane of the halophilic organism halobacterium salinarium. it is a light-driven proton pump converting light into a transmembrane proton gradient through isomerization of the covalently bound retinal chromophore. its stability, as well as its photoactivity in dried films, has made br an attractive material for biomolecular devices.
such studies, however, have used br within the membrane, on relatively large surfaces. here, conducting-probe atomic force microscopy (c-afm) analysis was performed after isolating the protein from its native membrane environment while keeping its basic trimeric structure, and demonstrated that the molecular conductance of br can be reversibly photoswitched with predictable wavelength sensitivity. intimate and robust coupling to gold electrodes was achieved by using a strategically engineered cysteine mutant located on the intracellular side of the protein which, combined with 75% delipidation, generated protein trimers homogeneously oriented on the surface. c-afm proximal probe analysis showed a reproducible three-fold drop of the br mean resistance over ~5 cycles of interspersed illuminations at the same gold-br-gold junction when λ > 495 nm, while no shift was observed at other wavelengths. capture of circulating tumor cells with highly efficient nanostructured silicon substrates with integrated chaotic micromixers. shutao wang. this core technology shows significantly improved sensitivity in detecting rare ctcs from whole blood, and thus provides an alternative for monitoring cancer progression. by assembling a capture-agent-coated nanostructured substrate with a microfluidic chaotic mixer, this integrated microchip can be applied to isolate ctcs from whole blood with superb efficiency. ultimately, the application of this approach will open up opportunities for the early detection of cancer metastasis and for the isolation of rare populations of cells that cannot feasibly be isolated using existing technologies. this technology has helped to find a needle in a haystack and will open up opportunities for single-cell genomic and epigenetic sequencing and gene expression profiling. results from the further development of this technology will assist physicians in following up patients and in rigorously testing the concept of personalized oncology with individualized therapy.
this novel technology has recently been reviewed and highlighted by nature medicine. the growing crisis in organ transplantation and the aging population have driven a search for new and alternative therapies using advanced bioengineering methods. the formation of organized and functional tissues is a very complex task: the cellular environment requires suitable physiological conditions that, at present, can be achieved and maintained using properly designed bioreactors reproducing all the specific functions and bioactive factors that ensure the viability/regeneration of cells cultured in an appropriate scaffold. the creation of a biomimetic environment requires the use of biomaterials such as membranes with specific physico-chemical, morphological and transport properties, chosen on the basis of the targeted tissue or organ. tailor-made membranes (organic, functionalized with specific biomolecules, in hollow-fibre configuration), designed and operated according to well-defined engineering criteria, are able to sustain specific biotransformations, to provide adequate transport of oxygen, nutrients and catabolites throughout the cellular compartment, and to supply appropriate biomechanical stimuli to the developing tissue. in this talk the author will show the development of membrane engineered constructs, focusing on liver and neuronal systems. the role of membrane surface and transport properties in providing instructive signals that guide cell proliferation and differentiation will be discussed. membrane bioreactors, which through fluid-dynamic modulation may simulate the complex in vivo physiological environment, ensuring adequate mass transfer of nutrients and metabolites as well as molecular and mechanical regulatory signals, will be presented.
here we present a novel but simple system for cell-based assays enabling simultaneous testing of multiple samples on the same tissue without cross-contamination between neighbouring assays, as well as sequential or repeated assays at the same tissue location. the principle of this method lies in the spatially controlled diffusion of test compounds through a porous matrix to the target cells. a simple microfabrication technology was used to define areas where diffusion is allowed or inhibited. we performed proof-of-principle experiments on madin-darby canine kidney (mdck) epithelial cells using hoechst nuclear staining and the calcein-am cell viability assay. the fluorescent staining superimposed properly on the membrane pattern with a dose-dependent response, indicating that both compounds diffused specifically and selectively to the target cells. mdck cells similarly treated with cytochalasin b showed their actin network rapidly altered, demonstrating the suitability of this system for drug screening applications. such a well-less cell-based screening system, enabling the testing of multiple compounds on the same tissue and requiring very small volumes of test samples, appears attractive for studying the potential combined effects of different biochemicals applied separately or sequentially. it is generally believed that all-optical data processing is the most promising direction for achieving serious improvements in both the capacity and speed of internet data traffic. one of the bottlenecks of state-of-the-art photonic integration technology is finding suitable nonlinear optical (nlo) materials to serve as cladding media in waveguide-based integrated optical circuits performing light-controlled active functions. recently, the unique chromoprotein bacteriorhodopsin (br) has been proposed as an active, programmable nlo material in all-optical integrated circuits.
in integrated optical applications of br, its light-induced refractive index change is utilized. in this paper we exploit the refractive index changes of a dried br film accompanying the ultrafast transitions to the intermediates i and k, which allow even sub-ps switching, leading beyond tbit/s communication rates. in the experiments, direct pulses of a femtosecond laser system at 800 nm were used along with synchronized ultrafast laser pulses at 530 nm. we believe that the results may form the basis for the future realization of a protein-based integrated optical device, and represent the first steps towards a conceptual paradigm change in optical communication technologies. in recent years, such autoantibodies have attracted increasing attention from researchers as potential cancer biomarkers. since the sera of cancer patients typically contain a unique set of antibodies that reflect the tumor-associated antigens expressed in a particular malignant tissue, diagnosing and predicting the outcome of a disease such as breast cancer based on serum autoantibody profiling is an attractive concept. to create a representative panel of antigens for detecting the breast cancer autoantibody profile, we selected 18 breast cancer associated antigens. these antigens were identified by screening tumor cdna libraries with autologous sera using the serex (serological investigation of recombinantly expressed clones) approach. all antigens were cloned, expressed and purified in bacteria and tested with sera of breast cancer patients and healthy donors in a large-scale allogeneic screening using elisa. the utility of the selected tumor-associated antigens for detecting the autoantibody profile in different types of breast cancer was evaluated. printing (controlled by architectural software) is carried out according to a design template, consistent with the geometry and composition of the desired organ module. structure formation occurs through the post-printing fusion of the discrete bio-ink units.
when the bio-ink units contain more than one cell type, fusion is accompanied by sorting of the cells into the physiologically relevant pattern. thus structure formation takes place through self-assembly processes akin to those utilized in early embryonic morphogenesis. we demonstrate the technology by detailing the construction of vascular and nerve grafts. spherical and cylindrical bio-ink units have been employed to build fully biological linear and branching vascular tubular conduits and multiluminal nerve grafts. upon perfusion in a bioreactor, the constructs achieved desirable biomechanical and biochemical properties that allowed implantation into animal models. our results show that the printing of conveniently prepared cellular units is feasible and may represent a promising tissue and organ engineering technology. femtosecond lasers have become important tools for non-contact microprocessing of biological specimens. due to the short pulse length and the intensity-dependent nature of the multiphoton ionization process, fs-laser pulses affect only a small volume of a treated cell, providing a high degree of spatial localization. we employed an fs-laser to address topical bioengineering and biomedical problems, namely cell fusion and embryo biopsy. a tightly focused laser beam (cr:f seed oscillator and a regenerative amplifier, 620 nm, 100 fs, 10 hz) was used for the fusion of blastomeres of two-cell mouse embryos and for polar body (pb) biopsy. to fuse blastomeres, the contact border of the cells was perforated by a single laser pulse. the fusion process usually completed within ~60 min. to perform a non-contact laser-based pb biopsy, we initially drilled an opening in the zona pellucida with a set of laser pulses and then extracted the pb out of the zygote by means of optical tweezers (cw laser, 1064 nm). the energy of the laser pulses was carefully optimized to prevent cell damage and to increase the fusion and biopsy rates.
the proposed techniques demonstrate high efficiency and selectivity and show great potential for using fs lasers as a microsurgical tool. new insights into mechanisms of electric-field-mediated gene delivery. maša kandušer and mojca pavlin, university of ljubljana, faculty of electrical engineering, si-1000 ljubljana, slovenia. gene electrotransfer is widely used for the transfer of genetic material into biological cells by local application of electric pulses and is currently the most promising non-viral delivery method for gene therapy of a series of diseases as well as for dna vaccination. the current description of the process defines several steps: electropermeabilization, dna-membrane interaction, translocation, trafficking to the nucleus and into the nucleus. but the mechanisms of electrotransfer are still not fully understood. we present the results of a systematic in vitro analysis, using pegfp, of all the steps involved in electrotransfection: electropermeabilization, analysis of different pulsing protocols, theoretical analysis of plasmid mobility, and visualization of the processes of dna-membrane interaction. we demonstrate that in order to translate in vitro results to the tissue level, sub-optimal plasmid concentrations have to be used. furthermore, so far the mode of dna entry into the cytoplasm has only been speculated upon. our results suggest that it is crucial that the membrane is first electropermeabilized; a sufficient electrophoretic force is then crucial for insertion of the dna into the destabilized lipid bilayer, followed by dna translocation into the cytoplasm via a slow process. the efficiency of electrotransfer also depends on the state of the cell culture: cells in the dividing phase are easier to electrotransfect. gentamicin interaction with the b16f10 cell membrane studied by dielectrophoresis. dielectrophoresis (dep) is the translational motion of polarizable particles due to an electric field gradient.
positive dep and negative dep correspond to particle movement toward or away from the region of high field intensity, respectively. our study reveals some of the cell membrane modifications induced by gentamicin (gt), as they are reflected in the crossover frequency f_co of b16f10 murine cells incubated with gt at different concentrations and for different durations. f_co is the ac frequency at which cells turn from positive to negative dep. gentamicin is a positively charged aminoglycoside antibiotic with concentration-dependent killing action; it is widely used because of its low cost and reliable bactericidal activity. its drawbacks consist in high toxicity for renal and hearing cells; the molecular mechanisms of this toxicity are still unclear. for low external medium conductivities (≈ 0.0012 s/m), the f_co of control and gt-treated cells was found to range from 3 to 10 khz. f_co shifts to higher frequencies with increasing gt concentration and incubation time. the dielectrophoretic behavior of the cells is discussed using the single-shell cell model. the extracellular matrix (ecm) is a major obstacle to the successful delivery of genes. chitosan is a versatile and biocompatible polysaccharide derived from chitin and is a promising gene carrier. chitosan-dna interactions, and hence dna polyplexation and release, can be controlled through the chitosan deacetylation degree, molecular weight and functionalization of the chitosan cationic groups. grafting poly(ethylene glycol) (peg) to gene delivery vectors increases the circulation time of gene delivery systems in blood vessels and reduces the polyplex charge. the diffusion and unpacking of pegylated and non-pegylated chitosan-dna polyplexes through artificial ecms based on collagen and collagen-hyaluronic acid (ha) gels were compared using fluorescence correlation spectroscopy, confocal microscopy and colocalization analysis. non-pegylated polyplexes were immobilized in the gels whereas pegylated polyplexes were diffusing.
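the crossover frequency f_co described in the dielectrophoresis abstract above can be illustrated numerically. the sketch below uses a homogeneous lossy sphere instead of the single-shell cell model actually employed in that work, and the effective cell permittivity and conductivity are illustrative assumptions, not measured values; only the medium conductivity (≈0.0012 s/m) is taken from the abstract.

```python
import numpy as np

EPS0 = 8.854e-12  # vacuum permittivity, F/m

def re_clausius_mossotti(f, eps_p, sig_p, eps_m, sig_m):
    """Real part of the Clausius-Mossotti factor of a homogeneous lossy
    sphere (relative permittivity eps_p, conductivity sig_p [S/m]) in a
    medium (eps_m, sig_m) at frequency f [Hz]."""
    w = 2 * np.pi * f
    ep = eps_p * EPS0 - 1j * sig_p / w   # complex permittivity, particle
    em = eps_m * EPS0 - 1j * sig_m / w   # complex permittivity, medium
    return ((ep - em) / (ep + 2 * em)).real

def crossover_frequency(eps_p, sig_p, eps_m, sig_m):
    """Analytical DEP crossover frequency where Re[K] changes sign
    (exists only when numerator and denominator have the same sign)."""
    num = -(sig_p - sig_m) * (sig_p + 2 * sig_m)
    den = (eps_p - eps_m) * (eps_p + 2 * eps_m)
    return np.sqrt(num / den) / (2 * np.pi * EPS0)

# illustrative values: medium conductivity from the abstract (0.0012 S/m);
# the membrane-dominated effective cell properties are assumptions
f_co = crossover_frequency(2000.0, 2e-4, 78.0, 1.2e-3)
```

with these assumed parameters the crossover lands in the 10 khz range, the same order as the 3-10 khz values reported above; a faithful reproduction would replace the homogeneous sphere with the single-shell model.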
the smaller charge of pegylated polyplexes seems to decrease interactions between the polyplexes and ecm components. furthermore, ha might also screen collagen fiber-pegylated polyplex interactions. pegylated polyplexes also showed a higher degree of unpacking in the gels, probably due to a looser compaction of dna by pegylated chitosan compared to non-pegylated chitosan. the fabrication of vesicles, closed membranes made of an amphiphile bilayer, has great potential for encapsulation and controlled release in the chemical, food or biomedical industries, but also, from a more fundamental point of view, for the design of biomimetic objects. methods based on lipid film hydration [1], inverse emulsion techniques [2] and, more recently, microfluidic techniques such as the double emulsion [3] or jetting [4] methods are limited either by a low yield, low reproducibility, poor control of the size, or by the presence of remaining solvent or defects. we propose a fast and robust method [5], easy to implement: continuous droplet interface crossing encapsulation (cdice), which allows the production of defect-free vesicles at high yield with control of size and content. the vesicles have a controlled bilayer composition with a size polydispersity lower than 11%. we have shown that solutions as diverse as actin, cells, micrometric colloids, proteins and high ionic strength solutions can easily be encapsulated using this process under appropriate conditions. by adjusting the parameters of our set-up, we are able to produce vesicles in the range 4-100 μm in diameter, stable for weeks. we believe this method opens new perspectives for the design of biomimetic systems and even artificial tissues. the extremely variable d3 domain of flagellin subunits, comprising residues 190-283, protrudes at the outer surface of flagellar filaments. the d3 domain has no significant role in the construction of the filament structure.
thus, replacement of d3 may offer a promising approach for the insertion of heterologous proteins or domains without disturbing the self-assembly of flagellin subunits. our work aims at the construction of flagellin-based fusion proteins which preserve the polymerization ability of flagellin and maintain the functional properties of the fusion partner as well. in this work a fusion construct of flagellin and the superfolder mutant of green fluorescent protein (gfp) was created. the obtained gfp variant was highly fluorescent and capable of forming filamentous assemblies. our results imply that other proteins (enzymes, binding domains etc.) can also be endowed with polymerization ability in a similar way. this approach opens up the way for the construction of multifunctional filamentous nanostructures. generation 5 polyamidoamine (pamam) dendrimers have been shown to be highly efficient non-viral carriers in gene delivery. however, their toxicity limits their applications. in this study, to improve their characteristics as gene delivery carriers, the g5 pamam dendrimer was modified with an anti-tag72 nanobody through a heterobifunctional peg, then complexed with t-bid-coding pdna, yielding pamam-peg-anti-tag72 nanobody/pdna nanoparticles (nps). nuclear magnetic resonance (nmr) spectroscopy, zeta sizing and gel retardation assay results provided evidence that the nanovector was successfully constructed. the transfection efficiency of the vector/pdna complexes was evaluated in vitro. real-time pcr results also demonstrated that the anti-tag72 nanobody modified nps are more efficient in expressing the t-bid killer gene in a colon cancer cell line than the unmodified nps. in conclusion, pamam-peg-anti-tag72 nanobody showed great potential for application in designing tumour-targeting gene delivery systems.
3 dept of chemistry, faculty of science, national university of singapore, singapore; 4 dept of biochemistry, yong loo lin school of medicine, national university of singapore, singapore; 5 division of bioengineering, faculty of engineering, national university of singapore, singapore. macromolecular crowding (mmc) is a biophysical tool which has been used extensively to enhance chemical reactions and biological processes by means of the excluded volume effect (eve). the in vivo stem cell microenvironment contains macromolecules which are crucial for stem cell self-renewal and cell fate determination. in order to mimic this physiological microenvironment, crowders are included in the cell culture medium. we have observed that the ex vivo differentiation of human mesenchymal stem cells (hmscs) into the adipogenic lineage is significantly amplified when a crowder mixture comprising ficoll 70 and ficoll 400 is added to the culture medium. stem cell differentiation is modulated by soluble chemical substances as well as by interactions between cells and the extracellular matrix (ecm), and both of these external influences may be affected by mmc. measurements we have performed by fluorescence correlation spectroscopy (fcs) show that ficoll additives cause anomalous subdiffusion within a crowder concentration range of 0 to 300 mg/ml. the diffusion of fluorophore-labelled molecules in artificial lipid bilayers and in the membranes of living cells is not changed by the crowders, suggesting that these crowders do not directly alter membrane properties and cell surface signalling. however, we have data suggesting that crowders increase actin polymerization reaction rates in vitro. we have also observed that crowders are taken up by stem cells and that they localize to specific compartments. based upon our observations, we hypothesize that crowders can influence stem cell differentiation by influencing molecular kinetics.
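the anomalous subdiffusion reported in the fcs measurements above is commonly quantified by an exponent α < 1 in msd(t) = 4·d·t^α. the following sketch, built on purely synthetic data (the α and d values are illustrative assumptions, not the measured ones), shows how the exponent can be recovered from a log-log fit:

```python
import numpy as np

def fit_anomalous_exponent(t, msd):
    """Fit msd(t) = 4*d*t**alpha on a log-log scale and return (alpha, d);
    alpha < 1 indicates subdiffusion, alpha = 1 normal diffusion."""
    slope, intercept = np.polyfit(np.log(t), np.log(msd), 1)
    return slope, np.exp(intercept) / 4.0

# synthetic subdiffusive msd curve (alpha and d are illustrative only)
t = np.logspace(-3, 1, 50)        # lag times, s
msd = 4 * 0.5 * t ** 0.75         # um^2, with d = 0.5 and alpha = 0.75
alpha, d = fit_anomalous_exponent(t, msd)
```

on noisy experimental curves the same fit would typically be restricted to the lag-time window where the power law holds.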
lignocellulose-based composites are becoming extremely important and promising sustainable and renewable natural materials. fibre modifications enhancing their existing properties can be introduced to broaden the application areas. in response to the shortcomings of traditional chemical and physical methods, enzymes and chemo-enzymatic methods have emerged as eco-friendly catalysts that work under mild conditions and enable tailoring of the material surface properties through substrate specificity and regioselectivity. recently, the binding of different functional molecules to lignin-rich fibres using an oxidative enzyme (e.g. laccase) has been reported, leading to their functionalisation through free-radical reactions. laccase action was inspected by electron paramagnetic resonance (epr) spectroscopy. the consumption of substrates was investigated and their polymerization traced. stable radical intermediates were detected with epr when substrate molecules were in contact with active enzymes. secondly, the oxidation of mediators like nitroxides was determined via epr spectroscopy of stable water-soluble nitroxide radicals. finally, the generation of short-lived radicals as well as their reduction was measured via epr spin trapping using dmpo as a sensitive water-soluble spin trap. mammalian ovary hormone stimulation (ohs) is known to be an essential stage of reproductive biotechnology as well as of human infertility treatment. the basic aim of ohs is to obtain a stock of valuable oocytes and early embryos for subsequent use in reproductive technology, experimental work etc. however, it is known that ohs itself affects the character of ovulation and oocyte quality, which in turn affects the development of the embryos and even has long-term consequences. a broad set of cell parameters and appropriate methods for the investigation of gamete/embryo quality are therefore very important.
the aim of this study is the determination of the specific electric conductivity of mouse oocytes and early embryos obtained after ohs, in comparison with those obtained in the natural animal sex cycle. using electroporation techniques, the dependence of the specific electric conductivity of mouse oocytes, zygotes, and 2-cell and 8-cell embryos on the external electric field intensity has been studied. it is shown that the whole pool of oocytes obtained as a result of ohs consists of two groups of oocytes that do not differ from each other morphologically, but differ in their electric parameters and resistance to electric breakdown. at the zygote stage, the division of the embryos into two groups is preserved, but is less pronounced. at the 2-cell and 8-cell stages, the division of the embryos into two groups by their electric conductivity disappears, but a certain scattering of the parameters due to individual embryo peculiarities is observed. the obtained data show that ohs may lead to latent changes of the oocyte state which in turn affect embryo quality. many microbes synthesize and accumulate granules of polyhydroxyalkanoates (pha; biodegradable storage materials alternative to traditional plastics), which help them survive under stress. in particular, the plant-growth-promoting rhizobacterium azospirillum brasilense, which is under investigation worldwide owing to its agricultural and biotechnological significance, can produce poly-3-hydroxybutyrate (phb) [1]. in our work, phb synthesis in a. brasilense cells was studied under various stresses using diffuse reflectance ftir spectroscopy. phb in cells was determined from the band intensity ratio of the polyester ν(c=o) band at ~1740 cm-1 to that of the cell proteins (amide ii band at ~1550 cm-1), showing a. brasilense to be able to produce phb at up to over 60% of the cells' dry weight. stresses induced phb accumulation, enhancing ir absorption in the phb-specific regions.
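the phb quantification above rests on a simple band-intensity ratio (ester ν(c=o) at ~1740 cm-1 versus amide ii at ~1550 cm-1). a minimal sketch of such a ratio calculation on a synthetic spectrum follows; the gaussian band shapes and amplitudes are illustrative assumptions, and a real analysis would operate on baseline-corrected diffuse reflectance spectra:

```python
import numpy as np

def band_ratio(wavenumber, absorbance, band=1740.0, ref=1550.0, window=20.0):
    """Ratio of the maximum absorbance within +/- window cm-1 of the PHB
    ester band (~1740 cm-1) to that of the amide II band (~1550 cm-1)."""
    def peak(center):
        mask = np.abs(wavenumber - center) <= window
        return absorbance[mask].max()
    return peak(band) / peak(ref)

# synthetic two-band spectrum standing in for a DRIFT measurement
wn = np.linspace(1400.0, 1900.0, 1001)
spec = (0.9 * np.exp(-((wn - 1740.0) / 15.0) ** 2)     # PHB nu(C=O)
        + 0.3 * np.exp(-((wn - 1550.0) / 15.0) ** 2))  # protein amide II
ratio = band_ratio(wn, spec)                           # ~3.0 here
```

the ratio normalizes the polyester signal to total cell protein, which is what allows phb content to be tracked across growth stages without an absolute calibration.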
analysis of a few structure-sensitive phb vibration bands revealed changes in the degree of intracellular phb crystallinity (related to its enzymatic digestion rate) at different stages of bacterial growth, reflecting a novel trait of bacterial adaptability to increasing stress, which is of great importance to agricultural biotechnology. the aim of this work is to furnish enzymes with polymerization ability by creating fusion constructs with the polymerizable protein flagellin, the main component of bacterial flagellar filaments. the d3 domain of flagellin, exposed on the surface of flagellar filaments, is formed by the hypervariable central portion of the polypeptide chain. d3 is not essential for filament formation. the concept in this project is to replace the d3 domain with suitable monomeric enzymes without adversely affecting polymerization ability, and to assemble these chimeric flagellins into tubular nanostructures. to test the feasibility of this approach, xylanase a (xyna) from b. subtilis was chosen as a model enzyme for insertion. with the help of genetic engineering, a fusion construct was created in which the d3 domain was replaced by xyna. the flic(xyna) chimera exhibited catalytic activity as well as polymerization ability. these results demonstrate that polymerization ability can be introduced into various proteins, and that building blocks for the rationally designed assembly of filamentous nanostructures can be created (table 1). the support of the hungarian national office for research and technology and the hungarian scientific research fund (otka) (grants ck77819, nk77978, nanoflag) is acknowledged.
on motor and electrical oscillations in the urinary tract: computer evaluation. daniele martin, viktor foltin, erich gornik, rumen stainov, tanya zlateva; icsd e.v., postfach (pob) 340316, d-80100 münchen. method and parameters: motor patterns (guinea-pig) were characterized by the frequency (f) and amplitude (a, % of initial length; isotonic and intracellular recordings) of spontaneous phasic (spc) and tonic (stc) contractions, together with electrical spikes (s), bursts (b) and burst plateaus (bp) (neu et al., biophys. j., jan 2008). stretch (3-80 mn) and k+/ca2+ influence induced specific changes in the motor and electrical parameters; a dedicated computer programme reflects the biophysical parameters exactly. conclusion: according to earlier and recent results, mechano-sensitive ca2+-activated k+ channels participate in the electrical oscillations of detrusor/ureteral myocytes.
introduction: globalization needs new organizational models for biophysics as well. reports on the necessity of international institutes for biophysics (iib) at international universities (as proposed by the british nobel laureate b. russell) are given. conception: proposals for ebsa discussion: 1. enlargement of the executive committee by a) honorary and past presidents (1-3 permanent, for moral support, and 1-3 fixed-term) and b) an interdisciplinary commission of scientists from biology, medicine, physics, etc. (feps/iups, iuphar, iupab, etc.). 2. inclusion of interdisciplinary topics in ebsa/iupab congress programmes, 3. and likewise in biophysical journals. 4. organization of common interdisciplinary sessions not only at biophysical but also at other congresses. 5. co-operation between ebsa/iupab and international interdisciplinary organisations (waas, icsd/ias, european academies) for the creation of iib through a network of national institutes: shared personnel, the possibility of whole-life work, etc. conclusion: realization of proposals 1-5
could increase the scientific/political authority of ebsa/iupab, leading to a model for the renewal of scientific organizations

collective migration of neural crest cells: a balance between repulsion and attraction
roberto mayor, university college london

goodilin 2,3, olga v

single-molecule cut-and-paste surface assembly (smcp) has also been used to build up a biotin scaffold that streptavidin utilizing specific molecular interactions, for example between dna-binding proteins and dna or antibodies and antigens, this technique is capable of providing a scaffold for the controlled self-assembly of functional complexes. furthermore, this allows for the introduction of smcp into protein science. we aim to employ dna-binding zinc-finger variants and gfp-binding nanobodies as shuttle-tags fused to the proteins of interest. thus a fully expressible system that can be used for the step-wise assembly of individual building blocks to form

single-molecule cut-and-paste surface assembly; optically monitoring the mechanical assembly of single molecules; nanoparticle self-assembly on a dna-scaffold written by single-molecule cut-and-paste; torsional motion analysis of group ii chaperonin using diffracted x-ray tracking

nanomechanical manipulation of mason-pfizer monkey retroviral rna fragment with optical tweezers
melinda simon 1, zsolt mártonfalvi 1, pasquale bianco 1, beáta vértessy 2, miklós kellermayer 1

micro-viscosimeter generated and manipulated by light
andrás buzás 1, lászló oroszi 1, lóránd kelemen 1, pál ormos 1, temesvári krt

proc. natl. acad. sci.
usa 105

neural signal recordings with a novel multisite silicon probe
gergely márton 1, anita pongrácz 1, lászló grand 2,3, éva vázsonyi 1; péter pázmány catholic university, faculty of information technology, h-1083, 50/a práter st

multiscale pattern fabrication for life-science applications
francesco valle 1, beatrice chelli 1, michele bianchi 1, eva bystrenova 1, marianna barbalinardo 3, arian shehu 1, tobias cramer 1, mauro murgia 1, giulia foschi 1

miroslava kuricova 1, jana tulinska 1, aurelia liskova 1, eva neubauerova 1, maria dusinska 2,1, ladislava wsolova

acceleration of neuronal precursor differentiation induced by substrate nanotopography
gianluca grenci 1, jelena ban 2, elisabetta ruaro 4, massimo tormen 1, marco lazzarino 1,3 and vincent torre 2

light-induced structural changes are reported near the primary electron donor of bacterial reaction centers (brc) dispersed in detergent micelles and in liposomes from lipids with different fatty acid chain lengths. in this study we present evidence for the correlation between the light-induced increase of the local dielectric constant, determined by the analysis of the electrochromic absorption changes, and the lifetime of the charge-separated state at physiologically relevant temperatures. the increase of the local dielectric constant induced a significant decrease of the oxidation potential of the primary electron donor and a slow proton release, which appears to be the rate-limiting step in the overall process. systematic selection of the head-group charges of detergents and lipids, as well as the thickness of the fatty acid chains of the liposome-forming lipids, can increase the lifetime of the charge-separated state by up to 5 orders of magnitude. such extensions of the lifetime of the charge-separated state were reported earlier only at cryogenic temperatures and can provide new opportunities to utilize the brc in energy storage.
ontogenesis of photosynthetic bacteria tracked by absorption and fluorescence kinetics
m. kis, e. asztalos, p. maróti, department of medical physics and informatics, university of szeged, hungary

the development of the photosynthetic membrane of rhodobacter sphaeroides was studied by absorption spectroscopy and fast induction of bacteriochlorophyll fluorescence in different phases of growth, under various growing conditions (oxygen content, light intensity, etc.) and in synchronous cell populations. the results are: 1) the newly synthesized components of the membranes were embedded immediately into the proteinous scaffold independently of the age of the cell (no "transient" membranes were observed). 2) under aerobic conditions the pigments were bleached, and under anaerobic conditions the pigment systems showed greening. the relative variable fluorescence (f_v/f_max) showed small age-dependent (but not cell-cycle-related) changes. the fluorescence induction kinetics was a sensitive marker of aerobiosis: the f_v/f_max ratio dropped from 0.7 to 0.4 and the photochemical rate constant from 5x10^4 s^-1 to 3x10^4 s^-1 with an apparent halftime of about 4-5 hours after a change from an anaerobic to an aerobic atmosphere. 3) the electrogenic signal (absorption change at 525 nm) reflected the energetization of the membrane, which showed cell-cycle-dependent changes that included periodic production and arrangement of protein-lipid components of the membrane synchronized to cell division.

interfacial water in β-casein molecular surfaces: wide-line nmr, relaxation and dsc characterization
t. verebélyi 1, m. bokor 1, p. kamasa 1, p. tompa 2, k. tompa 1; 1 research institute for solid state physics and optics, hungarian academy of sciences, pob 49, 1525 budapest, hungary; 2 institute of enzymology, biological research center, hungarian academy of sciences, pob
7, 1518 budapest, hungary

wide-line proton nmr fid, echoes, and spin-lattice and spin-spin relaxation times were measured at 82.55 mhz in the -70°c to +40°c temperature range in lyophilized β-casein and in aqueous and buffered solutions, and the dsc method was also applied. the motivation for the selection of β-casein is the uncertainty of its structural order/disorder. naturally, the nmr and thermal characteristics were also evaluated. the melting of hydration water could be detected well below 0°c, and the quantity of mobile water molecules (hydration) was measured. the hydration vs. melting temperature curve informed us about the bonding character between the protein surfaces and water molecules. the generally used local field fluctuation model and the bpp theory were applied in the interpretation, and the limits of the models were established.

the torsional properties of dna play an important role in cellular processes such as transcription, replication, and repair. to access these properties, a number of single-molecule techniques such as magnetic tweezers have been developed to apply torque to dna and coil it. i will briefly refer to investigations of dna-protein interactions using these techniques, and describe what has been learnt. i will then focus on the development of novel magnetic techniques that go beyond standard magnetic tweezers, such as the magnetic torque tweezers 1 and the freely-orbiting magnetic tweezers 2. these approaches allow one to quantify conjugate variables such as twist and torque. for example, the magnetic torque tweezers rely on high-resolution tracking of the position and rotation angle of magnetic particles in a low-stiffness angular clamp. we demonstrate the experimental implementation of this technique and the resolution of the angular tracking. subsequently, we employ this technique to measure the torsional stiffness c of both dsdna molecules and reca heteroduplex filaments.
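as an editorial aside, the torsional stiffness c measured in such angular clamps follows from equipartition: the variance of the tether's twist angle obeys <δθ²> = k_b·t·l / c, so c can be estimated from the measured angle trace. the sketch below simulates this inversion; the numerical values (tether length, "true" stiffness) are illustrative assumptions, not data from the abstract.

```python
import numpy as np

# Equipartition-based estimate of torsional stiffness C from angular
# fluctuations of a DNA tether in a soft angular clamp:
#   var(theta) = kB*T*L / C   =>   C = kB*T*L / var(theta)
# All numbers below are illustrative (order-of-magnitude for dsDNA).

KB_T = 4.11e-21          # thermal energy at ~298 K, in joules
L = 3.4e-6               # tether contour length (~10 kbp dsDNA), in meters
C_true = 4.1e-28         # J*m, i.e. ~100 kB*T*nm, typical for dsDNA

rng = np.random.default_rng(0)
# simulate an equilibrium angle trace with the variance implied by C_true
theta = rng.normal(0.0, np.sqrt(KB_T * L / C_true), size=200_000)

C_est = KB_T * L / np.var(theta)
print(f"estimated C = {C_est:.2e} J*m")
```

with enough frames the estimate converges to within a few percent of the underlying stiffness, which is why high-resolution angular tracking is the key experimental requirement.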
lastly, i will describe novel applications of the optical torque wrench 3,4. the optical torque wrench is a laser trapping technique developed at cornell capable of applying and directly measuring torque on microscopic birefringent particles via spin momentum transfer. we have focused on the angular dynamics of the trapped birefringent particle 4, demonstrating its excitability in the vicinity of a critical point. this links the optical torque wrench to non-linear dynamical systems such as neuronal and cardiovascular tissues, non-linear optics and chemical reactions, which all display an excitable binary ('all-or-none') response to input perturbations. based on this dynamical feature, we devise a conceptually novel sensing technique capable of detecting single perturbation events with high signal-to-noise ratio and continuously adjustable sensitivity.

for the first time we report a comparative approach based on surface-enhanced raman spectroscopy (sers) and raman spectroscopy to study different types of haemoglobin molecules in living erythrocytes. in erythrocytes there are two fractions of haemoglobin: cytosolic (hb_c) and membrane-bound (hb_m). the concentration of hb_m is less than 0.5% and therefore it is impossible to study hb_m with traditional optical techniques. modifications of the cellular membrane can affect the conformation of hb_m; therefore, it can be used as a sensitive marker of pathologies. firstly, we investigated the enhancement of the sers signal of hb_m depending on ag nanoparticle size. we found that the intensity of the sers spectra of hb_m and the enhancement factor increase with decreasing ag nanoparticle size. secondly, we investigated the dependence of haemoporphyrin conformation in both hb_c and hb_m on ph. we observed different sensitivity of hb_c and hb_m to ph and found that conformational movements of haemoporphyrin (vibrations of pyrrole rings and side radicals) in hb_m are sterically hampered compared with hb_c.
our observations are evidence of the benefit of applying surface-enhanced raman spectroscopy to investigate the properties of hb_m in erythrocytes, and they provide new information about the conformational changes and functional properties of hb_m.

rna nanotechnology is an emerging field with high potential for nanomedicine applications. however, the prediction of rna three-dimensional nanostructure assembly is still a challenging task that requires a thorough understanding of the rules that govern molecular folding on a rough energy landscape. in this work, we present a comprehensive analysis of the free energy landscape of the human mitochondrial trna(lys), which possesses two different folded states in addition to the unfolded one. we have quantitatively analyzed the degree of rna tertiary structure stabilization, firstly, for different types of cations, 1 and, secondly, for several naturally occurring nucleotide modifications in the structural core of the trna(lys). 2,3 thus, notable variations in the rna binding specificity were observed for the divalent ions mg2+, ca2+ and mn2+, which can be attributed to their sizes and coordination properties to specific ligands. furthermore, we observed that the presence of the m2g10 modification together with the principal stabilizing m1a9 modification facilitates rna folding into the biologically functional cloverleaf shape to a larger extent than the sum of the individual contributions of these modifications.

in order to elucidate the mechanism of the recognition, we used the diffracted x-ray tracking method (dxt), which monitors real-time movements of individual proteins in solution at the single-molecule level. we found that peptides move distinctly from i-a(k), and that the rotational motions of peptides correlate with type b t cell activation.
in the case of diabetogenic i-a(g7), immediately after peptide exchange all the peptides move markedly, but the motion ceases within a week, and then a new ordered motion appears; the rotational motion of peptides correlates with t cell activation, which is analogous to the peptide in i-a(k). the rotational motion of peptides may create transient conformations of peptide/mhc that are recognized by a population of t cells. dxt measurements of peptide/mhc complexes correlated well with other biological phenomena too. our finding is the first observation that fluctuations at the level of brownian motion affect the functions of proteins.

mason-pfizer monkey virus (mpmv) is an excellent model for the analysis of retrovirus assembly and maturation. however, neither the structure of the viral rna nor its modulation by capsid-protein binding is exactly known. to explore the structure of the mpmv genome, here we manipulated individual molecules of its packaging signal sequence with optical tweezers. the 207-base-long segment of mpmv rna corresponding to the packaging signal, extended on each side with 1200-base-long indifferent gene segments for use as molecular handles, was cloned into a pet22b vector. rna was synthesized in an in vitro transcription system. rna/dna handles were obtained by hybridization in a pcr with complementary dna initiated with primers labeled with either digoxigenin or biotin. the complex was manipulated in repetitive stretch and relaxation cycles across a force range of 0-70 pn. during stretch, transitions occurred which increased the rna chain length and likely correspond to unfolding. the length gain associated with the unfolding steps was distributed across three main peaks at ~13, 20 and 32 nm, corresponding to ~35, 57 and 85 bases, respectively. often reverse transitions were observed during mechanical relaxation, indicating that refolding against force proceeds in a quasi-equilibrium process.
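the step-size-to-bases conversion quoted above can be reproduced, roughly, by assuming an effective extension per released nucleotide of single-stranded rna near the unfolding force. the sketch below uses ~0.37 nm/nt; this constant is an editorial assumption chosen to be of the right order for ssrna under ~15 pn tension, and the results agree only approximately with the abstract's numbers.

```python
# Rough conversion of mechanical unfolding step sizes (nm) to the number of
# nucleotides released, assuming a fixed effective extension per nucleotide
# of ssRNA under tension. L_EFF_NM_PER_NT is an illustrative assumption.

L_EFF_NM_PER_NT = 0.37   # nm of extension gained per released nucleotide

def bases_released(delta_L_nm, l_eff=L_EFF_NM_PER_NT):
    """Estimate nucleotides released by an unfolding step of delta_L_nm nm."""
    return round(delta_L_nm / l_eff)

for dL in (13, 20, 32):
    print(dL, "nm ->", bases_released(dL), "nt")
```

in practice the conversion is force dependent and is usually done with a worm-like-chain model of the unfolded strand rather than a single constant.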
structural investigation of gpcr transmembrane signaling by use of nanobodies
jan steyaert 1,2; 1 structural biology brussels, vrije universiteit brussel, pleinlaan 2, 1050 brussel, belgium; 2 department of structural biology, vib, pleinlaan 2, 1050 brussel, belgium

in 1993, scientists at the vrije universiteit brussel discovered the occurrence of bona fide antibodies devoid of light chains in camelidae. the small and rigid recombinant antigen-binding fragments (15 kda) of these heavy-chain-only antibodies, known as vhhs or nanobodies, proved to be unique research tools in structural biology. by rigidifying flexible regions and obscuring aggregative surfaces, nanobody complexes warrant conformationally uniform samples that are key to protein structure determination by x-ray crystallography:
• nanobodies bind cryptic epitopes and lock proteins in unique native conformations
• nbs increase the stability of soluble proteins and solubilized membrane proteins
• nbs reduce the conformational complexity of soluble proteins and solubilized membrane proteins
• nbs increase the polar surface, enabling the growth of diffracting crystals
• nbs allow affinity-trapping of active protein
i will focus my talk on the use of nbs for the structural investigation of gpcr transmembrane signaling, to illustrate the power of the nanobody platform for generating diffraction-quality crystals of the most challenging targets, including gpcrs and their complexes with downstream signaling partners.

dynamics of the type i interferon receptor assembly in the plasma membrane
stephan wilmes, sara löchte, oliver beutel, changjiang you, christian paolo richter and jacob piehler; university of osnabrück, division of biophysics, barbarastrasse 11, 49076 osnabrück, germany

type i interferons (ifn) are key cytokines in the innate immune response and play a critical role in host defense. all ifns bind to a shared cell surface receptor comprised of two subunits, ifnar1 and ifnar2.
detailed structure-function analysis of ifns has established that the ifn-receptor interaction dynamics plays a critical role for signalling specificities. here we have explored the dynamics of receptor diffusion and ifn assembly in living cells. by using highly specific orthogonal posttranslational labelling approaches combined with tirf microscopy, we probed the spatio-temporal dynamics of receptor diffusion and interaction in the plasma membrane of live cells at the single-molecule level. for this purpose, we employed posttranslational labelling with photostable organic fluorophores. this allowed us to map the diffusion and lateral distribution of ifnar1 and ifnar2 with very high spatial and temporal resolution by using single-particle tracking (spt) and single-molecule localization imaging. observed events of "transient confinement" and co-localization with the membrane-proximal actin meshwork suggest partitioning of ifnar1/2 into specialized microcompartments. this will be investigated in terms of its influence on receptor assembly and recruitment of cytoplasmic effector proteins.

cytoplasmic dynein moves through uncoordinated action of the aaa+ ring domains
ahmet yildiz, department of physics, and department of molecular and cell biology, university of california, berkeley, ca 94720 usa

cytoplasmic dynein is a homodimeric aaa+ motor that moves processively toward the microtubule minus end. the mechanism by which the two catalytic head domains interact and move relative to each other remains unresolved. by tracking the positions of both heads at nanometer resolution, we found that the heads remain widely separated and move independently along the microtubule, a mechanism different from that of kinesin and myosin. the direction and size of steps vary as a function of interhead separation. dynein processivity is maintained with only one active head, which drags its inactive partner head forward.
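the core quantity extracted in such single-particle tracking studies is a lateral diffusion coefficient, obtained from the mean-squared displacement of trajectories (msd(t) = 4·d·t for free 2-d diffusion). the sketch below runs this analysis on a simulated random walk; the frame rate and "true" d are illustrative values of the right order for membrane receptors, not numbers from the abstract.

```python
import numpy as np

# Toy single-particle-tracking analysis: recover a 2-D diffusion coefficient
# from the mean-squared displacement at lag 1 frame, MSD = 4*D*dt.
# D_true and dt are illustrative (membrane-receptor order of magnitude).

rng = np.random.default_rng(1)
D_true = 0.1             # um^2/s
dt = 0.032               # s per frame (~31 Hz camera)
n_steps = 50_000

# Brownian steps: each coordinate has variance 2*D*dt per frame
steps = rng.normal(0.0, np.sqrt(2 * D_true * dt), size=(n_steps, 2))
traj = np.cumsum(steps, axis=0)

disp = traj[1:] - traj[:-1]                  # frame-to-frame displacements
msd1 = np.mean(np.sum(disp**2, axis=1))      # MSD at lag 1
D_est = msd1 / (4 * dt)
print(f"D estimate: {D_est:.3f} um^2/s")
```

real spt analyses fit several msd lags and must separate free diffusion from the transient confinement the abstract describes, but the lag-1 estimator above is the usual starting point.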
these results challenge established views of motor processivity and show that dynein is a unique motor that moves without strictly coordinating the mechanochemical cycles of its two heads.

o-715 self-controlled monofunctionalization of quantum dots and their applications in studying protein-protein interactions in live cells
changjiang you, stephan wilmes, sara loechte, oliver beutel, domenik lisse, christian paolo richter, and jacob piehler; universität osnabrück, fachbereich biologie, barbarastrasse 11, 49076 osnabrück, germany

individual proteins labeled with semiconductor nanocrystals (quantum dots, qd) greatly facilitate studying protein-protein interactions with ultrahigh spatial and temporal resolution. multiplex single-molecule tracking and imaging require monovalent quantum dots (mvqd) capable of orthogonally labeling proteins with high yield. for this purpose, we prepared monovalent qd-trisnta by a chemical conjugation method. our results indicated that monovalent qd-trisnta was obtained in high yield by restricting the coupling by means of electrostatic repulsion. monovalent functionalization of the qd-trisnta was confirmed by assays in vitro and in vivo. two-color qd tracking of the interferon receptors ifnar1 and ifnar2 based on mvqd-trisnta was realized in live cells [1]. to broaden the multiplex toolbox of mvqds, we extended the electrostatic-repulsion-induced self-control concept for mono-functionalizing quantum dots with different affinity moieties. as a first instance, we used a negatively charged biotin peptide to produce qds with biotin mono-functionalization. we confirmed that our approach is a general method to render qds monovalent by single-molecule assays based on stepwise photobleaching. these mvqds facilitate obtaining spatiotemporal information on the organization of ifnars in live cells.
by orthogonally labeling u5a cells stably expressing ifnar2 at low level with biotin mvqd and mvqd-trisnta-ifn, we verified colocalization and colocomotion of individual ifn and ifnar2 at the minute scale. combined with super-resolution imaging of the ifnar cytosolic effector stat2, we observed dynamic coming-and-going contact between the microcompartments of ifnar2 and stat2.

a micron-sized viscometer was fabricated using the couette-type geometry that is capable of measuring the complex viscosity of fluids. the viscometer was produced by two-photon polymerization of su-8 photopolymer using a femtosecond laser system, a high-na objective and a piezo translator stage. the viscometer was manipulated by holographic optical tweezers and operated in the 0.005-1 hz frequency range. a video analysis algorithm was used to evaluate our measurements. we tested the viscometer with water-glycerol solutions.

one of the main reasons for the lack of reliability in protein analysis for disease diagnostics or monitoring is a lack of test sensitivity. this is because, for many tests to be reliable, they need to be performed on a homogeneous, and therefore very small, sample. current in-vitro techniques fail to accurately identify small differences in protein content, function and interactions starting from samples constituted of few or even single cells. a nanotechnology approach may overcome the current limits in low-abundance protein detection. we aim at designing a microwell device for the trapping (in a native environment) and parallel characterization of rare cells (e.g. adult stem cells). such a versatile device, based on soft and nanolithography, will promote cell adhesion and viability on differently functionalized biocompatible materials, allowing for the morphological characterization of the cells at the single-cell level.
in parallel, by facing our microwell device with a protein nanoarray produced via atomic force microscopy nanolithography, we can run proteomic studies at the single-/few-cell level. moreover, we foresee the possibility of delivering different stimuli to each cell, correlating the changes in chemistry/morphology with the protein profile at the single-cell level.

using an electrophysiological assay, the activity of nhaa was tested in a wide ph range from ph 5.0 to 9.5. forward and reverse transport directions were investigated at zero membrane potential using preparations with inside-out and right-side-out oriented transporters, with na+ or h+ gradients as the driving force. under symmetrical ph conditions with a na+ gradient for activation, both the wt and the ph-shifted g338s variant exhibit highly symmetrical transport activity with bell-shaped ph dependencies, but the optimal ph was shifted 1.8 ph units to the acidic range in the variant. in both strains the ph dependence was associated with a systematic increase of the k_m for na+ at acidic ph. under symmetrical na+ concentration with a ph gradient for nhaa activation, an unexpected novel characteristic of the antiporter was revealed: rather than being down-regulated, it remained active even at ph as low as 5. these data allowed us to advance a transport mechanism based on competing na+ and h+ binding to a common transport site, and to develop a kinetic model quantitatively explaining the experimental results. in support of these results, both alkaline ph and na+ induce the conformational change of nhaa associated with cation translocation, as demonstrated here by trypsin digestion. furthermore, na+ translocation was found to be associated with the displacement of a negative charge. in conclusion, the electrophysiological assay allowed us to reveal the mechanism of nhaa antiport and sheds new light on the concept of nhaa ph regulation.

swimming motility is widespread among bacteria.
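a minimal way to see how competition of na+ and h+ for a common site produces the reported rise of the apparent k_m at acidic ph is a simple competitive-inhibition rate law. the sketch below is an editorial illustration, not the authors' quantitative model; vmax, k_m and the proton affinity pk_h are arbitrary illustrative values.

```python
# Minimal kinetic sketch: antiport rate with H+ competing with Na+ for a
# common transport site (plain competitive inhibition). All parameter
# values are illustrative, not fitted to NhaA data.

def turnover(na_mM, pH, vmax=1.0, km_na_mM=10.0, pK_H=6.5):
    """Transport rate; H+ raises the apparent Km for Na+ at acidic pH."""
    h_M = 10.0 ** (-pH)
    K_H = 10.0 ** (-pK_H)
    km_app = km_na_mM * (1.0 + h_M / K_H)   # apparent Km grows as pH drops
    return vmax * na_mM / (km_app + na_mM)

# At fixed [Na+], acidification suppresses the rate via the apparent Km:
for pH in (5.0, 6.5, 8.0):
    print(pH, round(turnover(na_mM=10.0, pH=pH), 3))
```

this reproduces the qualitative signature in the abstract (systematic increase of k_m for na+ at acidic ph) while leaving the quantitative details to the authors' full kinetic model.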
however, in confined or structured habitats bacteria often come into contact with solid surfaces, which has an effect on their swimming characteristics. we used microfabrication technology to quantitatively study the interaction of swimming cells with solid boundaries. we tracked bacteria near surfaces with various engineered topologies, including flat and curved shapes. we were able to study several surface-related phenomena such as hydrodynamic trapping and correlated motion. we think that our results may help to understand how physical effects play a role in surface-related biological processes involving bacteria, such as biofilm formation.

cell labeling efficiency of oppositely charged magnetic iron oxide nanoparticles: a comparative study
raimo hartmann 1, christoph schweiger 2, feng zhang 1, wolfgang j. parak 1, thomas kissel 2,#, pilar rivera_gil 1,#; 1 biophotonics, institute of physics, philipps university of marburg; 2 pharmaceutical technology, institute of pharmacy, philipps university of marburg; e-mail: kissel@staff.uni-marburg.de; pilar.riveragil@physik.uni-marburg.de

the interaction of nanomaterials with cells is a key factor when considering their translation into clinical applications. in particular, effective accumulation of nanoparticles inside certain tissues is beneficial for a great number of applications. predominantly the size, shape and surface charge of nanoparticles influence their cellular internalization and distribution. to investigate this, two series of maghemite (γ-fe2o3) nanoparticles were synthesized either via aqueous coprecipitation or via thermal decomposition of organometallic precursor molecules. the size and spherical shape of both nanoparticle types were kept constant, whereas the charge was changed by modifying the surface of the nanoparticles with polymers of opposite charge, namely poly(ethylene imine) (pei) and a polymaleic anhydride derivative (pma).
the positively and negatively charged γ-fe2o3 nanoparticles were characterized with respect to size, zeta potential, colloidal stability and magnetic properties. furthermore, the uptake rate and localization of both formulations in a549 carcinoma cells after fluorescent labeling of the carriers, as well as the resulting alteration in mr relaxation times, were evaluated.

membrane proteins are the target of more than 50% of all drugs and are encoded by about 30% of the human genome. electrophysiological techniques, like patch-clamp, have unravelled many functional aspects of membrane proteins but usually suffer from poor structural sensitivity. we have developed surface-enhanced infrared difference absorption spectroscopy (seidas) 1,2 to probe potential-induced structural changes of a protein at the level of a monolayer. a novel concept is introduced to incorporate membrane proteins into solid-supported lipid bilayers in an oriented manner via the affinity of the his-tag to the ni-nta-terminated gold surface 3. full functionality of surface-tethered cytochrome c oxidase is demonstrated by cyclic voltammetry after binding of the natural electron donor cytochrome c. the general applicability of the methodological approach is shown by tethering photosystem ii to the gold surface 4. in conjunction with hydrogenase, the basis is set towards a biomimetic system for h2 production. recently, we succeeded in recording ir difference spectra of a monolayer of sensory rhodopsin ii under voltage-clamp conditions 5. this approach opens an avenue towards mechanistic studies of voltage-gated ion channels with unprecedented structural and temporal sensitivity. initial vibrational studies on the novel light-gated channelrhodopsin-2 will be presented 6.
probing biomass-chromatographic bead interactions by afm force spectroscopy
gesa helms, marcelo fernández-lahore, rami reddy vennapusa, and jürgen fritz; school of engineering and science, jacobs university bremen, 28759 bremen, germany; e-mail: g.helms@jacobs-university.de

in expanded bed adsorption (eba), bioproducts are purified from an unclarified fermentation broth by their adsorption on chromatographic beads in a fluidized bed. the unspecific deposition of biomass onto the adsorbent matrix can severely affect process performance, leading to poor system hydrodynamics, which in turn decreases the success of this unit operation. to quantify the bead-biomass interactions, different chromatographic beads are attached to afm cantilevers, and force spectroscopy experiments are performed with these colloidal probes on model surfaces and cells in solution. the experiments are conducted under varying conditions to study

uncovering physiological processes at the cellular level is essential in order to study complex brain mechanisms. using multisite signal recording techniques in the extracellular space, functional connectivity between different brain areas can be revealed. a novel microfabrication process flow, based on the combination of wet chemical etching methods, was developed, which yields highly reproducible and mechanically robust silicon-based multielectrode devices. the fabricated shaft of the probe is 280 µm wide and 80 µm thick, has rounded edges, and ends in a yacht-bow-like sharp tip. its unique shape provides decreased invasivity. the sensor contains 24 platinum recording sites at precisely defined locations. murine in vivo experiments showed that the probes could easily penetrate the meninges. high-quality signals, providing local field potentials and multi- and single-unit activities, were recorded. the interfaces between the tissue and the platinum contacts were further improved by electrochemical etching and carbon nanotube coating of the metal sites.
the integrated optical mach-zehnder interferometer is a highly sensitive device, considered a powerful lab-on-a-chip tool for the specific detection of various chemical and biochemical reactions. despite its advantages, there is no commercially available biosensor based on this technique. the main reason is the inherent instability of the device due to slight changes in environmental parameters. in this paper we offer a solution to this problem that enables the optimal adjustment of the working point of the sensor prior to the measurement. the key feature is a control unit made of a thin film of the light-sensitive chromoprotein bacteriorhodopsin deposited on the reference arm of the interferometer. after showing the transfer characteristics of such a device, we demonstrate its applicability to the sensing of specific protein-protein interactions. we expect our method to become a rapid and cost-efficient

the combination of unconventional fabrication technology and biomaterials allows both the realization of state-of-the-art devices with highly controlled lateral features and performance, and the study of the main properties of the biomolecules themselves, by operating at a scale comparable with the one crucial for their activity. soft lithography and microfluidic devices offer a tool-box both to study biomolecules under highly confined environments [1] and to fabricate in an easy way topographic features with locally controlled mechanical and chemical surface properties, thus leading to a finer control of the interplay of mechanics and chemistry. i will present an application of this technology to the control of cell fate, which is becoming a key issue in regenerative medicine in the perspective of generating novel artificial tissues.
patterns of extracellular matrix (ecm) proteins have been fabricated, by a modified lithographically controlled wetting (lcw) process, on the highly antifouling surface of teflon-af to guide the adhesion, growth and differentiation of neural cells (sh-sy5y, 1321n1, ne-4c), achieving extremely accurate guidance [2]. local surface topography is also known to influence cell fate [3]; thus, integrating this parameter into substrate fabrication could increase the complexity of the signals supplied to the cells. in this perspective we have developed a novel fabrication technique, named lithographically controlled etching (lce), allowing, in one step, the engraving and functionalization of the substrate surface over different length scales and with different functionalities. i will conclude by showing how we have been developing ultrathin-film organic field-effect transistors (ofets) as label-free biological transducers and sensors of biological systems. ofets are low-dimensional devices where ordered conjugated molecules act as the charge transport material. unconventional patterning techniques and microfluidics have been adapted to proteins and nucleic acids to dose the molecules on the ofet channel with high control of the concentration. in another set of experiments, we have also been addressing the signalling from neural cells and networks grown on pentacene ultra-thin-film transistors [3, 4].

advances in nanotechnology are beginning to exert a significant impact in medicine. the increasing use of nanomaterials in the treatment of diseases has raised concerns about their potential risks to human health. in our study, the effect of poly(lactic-co-glycolic acid) (plga) and titanium dioxide (tio2) nanoparticles (nps) on the function of b- and t-lymphocytes was investigated in vitro. human blood cultures were treated with plga and tio2 nps at concentrations of 0.12, 3 and 75 µg/cm2 for 72 h. the lymphocyte transformation assay was used to assess the effect of nps on lymphocyte function.
lymphocytes were stimulated with mitogens: concanavalin a, phytohaemagglutinin (t-cell response) and pokeweed mitogen (b-cell response). our findings indicate an immunomodulatory effect of plga nps. the proliferative response of t- and b-lymphocytes exposed in vitro to the highest dose of plga for 72 h was suppressed significantly (p < 0.01, p < 0.05). on the other hand, we observed a stimulatory effect of exposure to the middle dose of plga nps on b-lymphocyte proliferation (p < 0.05). no alteration was found in the proliferation of lymphocytes treated in vitro with tio 2 nps for 72 h. in conclusion, proliferation of lymphocytes in vitro might be one of the relevant tests for evaluation of nps immunotoxicity. embryonic stem (es) cell differentiation into specific cell lineages is still a major challenge in regenerative medicine. differentiation is usually achieved by using biochemical factors (bf) whose concentrations and side effects are not completely understood. therefore, we produced patterns in polydimethylsiloxane (pdms) consisting of groove and pillar arrays of sub-micrometric lateral resolution as substrates for cell cultures. we analyzed the effect of different nanostructures on differentiation of es-derived neuronal precursors into the neuronal lineage without adding biochemical factors. neuronal precursors adhere on pdms more effectively than on glass coverslips, but the elastomeric material itself doesn't enhance neuronal differentiation. nano-pillars increase both precursor differentiation and survival with respect to grooves. we demonstrated that neuronal yield was enhanced by increasing pillar height from 35 to 400 nm. on higher pillars, neuronal differentiation reaches ≈80% 96 hours after plating, and the largest differentiation enhancement of pillars over flat pdms was observed during the first 6 hours of culture. we conclude that pdms nanopillars accelerate and increase neuronal differentiation. key: cord-328181-b2o05j3j authors: nunez-corrales, s.; jakobsson, e. 
title: the epidemiology workbench: a tool for communities to strategize in response to covid-19 and other infectious diseases date: 2020-07-25 journal: nan doi: 10.1101/2020.07.22.20159798 sha: doc_id: 328181 cord_uid: b2o05j3j covid-19 poses a dramatic challenge to health, community life, and the economy of communities across the world. while the properties of the virus are similar from place to place, the impact has been dramatically different from place to place, due to such factors as population density, mobility, age distribution, etc. thus, optimum testing and social distancing strategies may also be different from place to place. the epidemiology workbench provides access to an agent-based model in which demographic, geographic, and public health information about a community, together with a social distancing and testing strategy, may be input, and a range of possible outcomes computed, to inform local authorities on coping strategies. the model is adaptable to other infectious diseases, and to other strains of coronavirus. the tool is illustrated by scenarios for the cities of urbana and champaign, illinois, the home of the university of illinois at urbana-champaign. our calculations suggest that massive testing is the most effective strategy to combat the likely increase in local cases due to mass ingress of a student population carrying a higher viral load than that currently present in the community. mathematical models of infectious disease epidemiological dynamics can provide valuable assistance to public health officials and health care providers in assessing the likely seriousness of an epidemic or its potential to grow into a pandemic, and later in allocating resources to counter the spread of the disease [105]. 
simulations that trace either prior or projected time courses make use of various mathematical and computational techniques, including (non-exhaustively) differential equation models, statistical regression and curve fitting, network propagation dynamics, and direct representation of human actors and their actions by means of agent-based models. the sir model in particular, along with its various adaptations [100], has remained successful, at least in part, due to its universality, as evidenced by its empirical adequacy across multiple epidemics and by its formal robustness when connecting microscale host-pathogen related events and macroscale disease observables [28, 10]. stochastic versions of the sir model show that adding noise to the system changes the predicted onset of an epidemic [113], the stability of its endemic equilibrium [123], the value of its effective reproduction number [57] or its duration [72] when contrasted against the deterministic one. this is significant not only at the theoretical level, when studying the stability and asymptotic representativeness of deterministic vs stochastic sir models under various noise regimes, but at the policy making level, where computational epidemiology may form the basis of informed decisions under policy, budgetary and other types of constraints [46]. moreover, the sir model has been extended spatially to account for the diffusion-like properties associated with geographic patterns observed during epidemics. while the qualitative (and some quantitative) properties of the traditional and the spatial sir models remain largely shared, spatial versions appear to be numerically susceptible to how they capture spatial interactions [101]. as with any other diffusion process in some space, we are usually interested in the ability of a disease to cover larger shares of the population as time marches on. 
detailed analysis of deterministic and stochastic sir models with spatial components [16] indicates the existence of solutions corresponding to traveling waves that propagate the disease among point-like processes. it has also been shown that spatial sir models can account for the effects of long-distance travel by replacing diffusion operators containing local processes with appropriate integro-differential ones that capture non-local dispersal processes [117]. agent-based models (abms) constitute a family of models where sets of active entities (i.e. agents) interact collectively by following prescribed individual rules intended to portray the emergent dynamics of a real social system [20]. in computational epidemiology, abms have been used and comparatively evaluated against sir models of various kinds. analysis of the behavior of both classes of models suggests that for many purposes the two classes will give qualitatively the same result, but that agent-based models have an advantage in ease of accounting for heterogeneity in subpopulations where that is significant [5, 96]. furthermore, the sir differential equations model is derivable as the asymptotic limit of an sir abm through a diffusion approximation [17]. a notable coordinated effort to develop agent-based models of a flu pandemic was the models of infectious disease agents study funded by the national institutes of health [51]. more recently, the attention of the world was dramatically drawn to the need for public health interventions in the case of covid-19 by a simulation model projecting 2.2 million deaths from covid-19 in the united states, and 510,000 in the u.k., in the absence of such interventions [41]. since then, models continue to be refined as more data are analyzed [3]. it is important to note that there are enormous local geographic variations in the incidence of, and deaths from, covid-19 [23]. 
these local variations imply a need for local models, to enable local authorities to construct appropriate strategies of social distancing and testing for mitigation of the effects of covid-19. the work described in this paper is designed to meet this need. wicked policy problems are characterized by 1) complexity of elements to be accounted for and their relationship to each other, 2) uncertainty in relationship to the description of the problem and the consequences of actions, and 3) divergence within the affected community of values and interests [55]. by all three measures of wickedness (complexity, uncertainty, and divergence), covid-19 is a highly wicked problem and will continue to be at least until there is an effective and universally available vaccine. dimensions of complexity in covid-19 emerge from all of the multiple ways in which people interact with each other in such a way as to breathe the same air, and from the consequent trade-offs. these trade-offs involve public health, economics, every aspect of community life, and levels of emotional stress in individuals: a multi-level hierarchy of perspectives involving psychology, sociology, economics, health care, and politics. correspondingly, we expect covid-19 modeling efforts to include a growing number of these concerns while remaining actionable and scientifically useful. we expect the complexity of such models to grow, but to do so in a manner that remains intellectually transparent [38] about what is stated in them. to this extent, models become critical components within the top-level decision support system necessary to regain situational control during the current pandemic. dimensions of uncertainty abound. as noted above, there is enormous geographic variation in the documented impact of the disease, and variation even in the apparent fundamental parameters of the virus (transmissibility, latency, and virulence) for reasons that are not yet understood. 
contributors to the uncertainty may be genetic variation in human populations [26], genetic variation in the virus as it continues to evolve [107], variation in childhood vaccine regimens from one nation to another [82], variations in weather patterns [61], and variations in testing rates and disease reporting accuracy and criteria [60]. also, there is an element of pure chance: whether or not a particular community was "seeded" with infectious individuals, and how many. dimensions of divergence are in some ways clear, and in some ways complicated. the clearest divergence is between the imperative to save lives by social distancing and the costs of social distancing to the community, both economic costs [15, 110] and less tangible costs due to how wealth moves across the global economy [79]. early on in this crisis most of the world appears to have made the choice that we would throw our economies into depression [12] and restrict many community activities we value [40, 97] in order to save the lives of the probably less than 1% of the population who would die from infection should no social distancing constraints be imposed. at the time of writing, this choice is constantly being reconsidered, or at least recalibrated [50]. more, and more open, discussions are being held on increasing economic activity even at the acknowledged cost of more infections and even deaths. one is reminded of a famous comedy routine in which comedian jack benny, whose comic persona was that of a notorious cheapskate, is held at gunpoint by a robber who demands "your money or your life!" this is followed by a long silence, a repeated demand, and a response by benny, "i'm thinking!". it seems that covid-19 has the whole world thinking about the trade-offs between economics (and other aspects of community life) and lives. with respect to divergence, covid-19 seems as wicked as possible. the economic dimensions of community life can be measured in dollars. 
the many other dimensions have no common units of measurement, so their relative value is literally incalculable. and yet we are forced to decide about what to value. another way to look at a wicked policy problem is as one where the space of potential solution alternatives contains far more social dilemmas than solutions. we may thus define a social dilemma as a situation where multiple agents have (explicit or implicit) stakes in the resolution of a problem, stakes are tied to multiple value systems (and not just shared, "objective" technical considerations, for example), and a proposed solution contains value contradictions that get translated into unacceptable potential losses were that solution to materialize; at the same time, the social dilemma also carries consequences if a solution fails to appear. conversely, a perfect solution to a wicked problem is a point of total satisfaction of constraints at all levels of representation of the problem. this is, of course, an idealization; in practice some constraints must be eliminated, relaxed or ignored to find a collective solution. in summary, a social problem is wicked if the density of true solutions is low in the space of all solution alternatives, and the search for them can be described as unstructured or even counter-incentivized at best. thus, the final element of covid-19 as a truly wicked problem is that, although it is insoluble, we must make our best effort to solve it. the consequences of not trying to solve this intractable problem, of simply guessing at answers guided only by intuition, are far worse than the consequences of being guided by imperfect models. 
3 building a multi-objective model for covid-19: the agent-based route
based on the discussion above, our current research efforts have focused on the development of an integrated simulation model capable of a) accurately reflecting known dynamics of the current pandemic and the qualitative results of other models, b) simulating data-driven stochastic heterogeneity across agent populations to more realistically reflect the variability of underlying human populations when the model is applied, c) integrating economic considerations in association with observable features of the pandemic, d) allowing detailed simulation of known public policy measures at different times, intensities and dates, and e) providing a simple interface for non-expert users to configure and interpret. in relation to the latter point (e), we envision assisting the decision-making process in two steps: first by providing some metaphor or visual proxy for users to construct intuitions by running specific scenarios, and then by translating some of these intuitions into fully-fledged computing and analysis tasks. succinctly, we wish to facilitate decision-making processes that are both robust and adaptive [66] while helping decision makers avoid falling into the "illusion of control", the false belief in a causal relation between computing consequences with a model and immediately improving one's decision-making abilities [63]. at the same time, we remain painfully aware of the intrinsic difficulties posed by imperfect data, imperfect implementation of public policy measures, and uncertain timelines for when and for how long to apply measures under unknown timelines for the availability of vaccines [104]. 
we expect our model to be beneficial when a) decision makers are fully aware of the underlying simplifications we have made, b) model outcomes are contrasted and adjusted with incoming data during an unfolding situation, and c) experts assisting decision makers carefully determine and document how data produced by these simulations are analyzed and translated into tentative recommendations [70]. to this end, we have focused our efforts on providing modeling tools for population centers with 100,000 inhabitants or less. our choice is motivated by the geographic distribution of cities and towns across the us [45] and by the apparent inverse correlation between population size and rurality. this is significant since the push for urbanization seems to have driven rural cities and towns to more precarious health systems than their urban counterparts [95]: one can expect covid-19 propagation to be slower due to lower population densities, but the impact to be at least similar or stronger due to age distribution and availability of health care facilities [8], with a special emphasis on the availability of icus [58]. in our model, agents interact and traverse a discrete 2d torus composed of connected lattice points that represent geographic locations. agent actions and decisions are governed by random variables with suitable distributions. a single execution (i.e. a scenario run) of a parametrization of the model corresponds to a possible world, while a simulation (i.e. a scenario) comprises an ensemble of multiple executions with the same parametrization, where outcomes correspond to distributions of agent states and observable quantities must be computed through averages. the choice of geometry presupposes that agents move across a common landscape at all times, and that no agents enter or leave it. 
this simplification of the geographical landscape, while in general unrealistic, is not uncommon [93, 9] and provides two advantages: a) it naturally matches intuitions behind interactive particle systems driven by mass action principles [75], such as in the sir model, and b) when translated into code, no boundary checks need to be performed by the agents. lattice sites are connected, from the perspective of an agent, by a moore neighborhood in an effort to reduce the effect of discretization artifacts [64]. we note here that our model presupposes a homogeneous population density as a means of ensuring representativeness of processes within the geographical domain. although accounting for variable population density areas in the same scenario is possible, our approach simplifies implementation aspects and prevents artifacts for model outcomes that may be strongly density dependent [121] in both epidemiological and economic aspects. our model does not explicitly contain a rich representation of locations where agents are drawn to and act as temporal sinks. instead, we chose to model agent tropism through randomized agent dwell times. dwell time (τ dw) refers here to the minimum amount of time an agent susceptible to covid-19 contagion needs to spend in a given location to acquire the virus. (this preprint, which was not certified by peer review, is made available under a cc-by-nd 4.0 international license; the copyright holder is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity.) based on existing estimates [36], we have chosen τ dw = 15 minutes, which corresponds to one time step t s, s ≥ 0. the average total dwell time t dw = k dw · τ dw for an agent corresponds to the (integer) average number of steps an agent will dwell on a single location. since our model assumes a day as the usual reporting unit in decision-making activities, all parameters stated in days are internally rescaled to τ dw units (i.e. 1 day = 96 simulation steps). 
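the torus geometry, moore-neighborhood moves and day-to-step rescaling described above can be sketched as follows; the function names are illustrative, not taken from the authors' code.

```python
import random

STEPS_PER_DAY = 96  # 1 day = 96 steps of tau_dw = 15 minutes, as stated above

# moore neighborhood: the 8 lattice cells surrounding a site
MOORE = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]

def moore_step(x, y, width, height, rng=random):
    """One random-walk jump to a Moore neighbor; the modulo wrap-around
    makes the lattice a torus, so agents never need boundary checks."""
    dx, dy = rng.choice(MOORE)
    return (x + dx) % width, (y + dy) % height

def days_to_steps(days):
    """Rescale a parameter expressed in days to simulation steps."""
    return int(round(days * STEPS_PER_DAY))
```

because positions wrap modulo the lattice size, no agent ever enters or leaves the space, matching the closed-population assumption above.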
agent dwell times are set at creation time using a random deviate from poisson(k; λ = k dw). to configure a scenario, a collection of demographic and disease parameters must be specified. age structure appears to be strongly associated with differences in covid-19 fatality rates [19, 32, 37, 98]. the model requires estimates both of the population distribution and of the observed fatality proportions p sex f and p age f per sex (i.e. male and female) and age (i.e. every ten-year interval), respectively. co-morbidities are introduced in a similar manner by means of the age and sex structure of the population, as a collection of positive multipliers k sex f, k age f per relevant condition, together with aggregate clinical data about their prevalence per age and sex. the total number of agents n at the onset of the simulation remains constant at all times, except when an influx of new agents is simulated. in that case, the number of new agents entering the population corresponds to a proportion p new of the existing ones. in addition, one must specify when the agents will be introduced, t s = t new, and how long it takes them to enter the space, τ new. in this manner our model can account for seasonal population increases driven by, for instance, agricultural production or the start of semesters in university towns. after time t s = t new + τ new, the simulation will contain approximately n + p new · n agents. to account for population density ρ pop, the model specifies the width w and height h of the lattice that will contain the agents. we presuppose that agents move across the lattice one jump at a time, once their dwell time is exhausted. the size of the lattice should also be adjusted based on the mobility and transportation patterns of individuals within the enclosed region of interest, that is, excluding realistic density fluctuations due to commuters that spend most of their time outside of the simulated region. 
the effect of such an adjustment is equal to re-scaling the population density by a mobility pre-factor k mobl; after these calculations have been performed, the model should ensure that this holds for t new = ∞. intuitively, the effect of greater mobility is equivalent to increased population density, or correspondingly, to traversing a smaller space. while our model does not include the effect of road networks or vehicle use, a carefully constructed average for k mobl can provide a sufficiently adequate approximation. disease-wise, the model comprises six critical parameters. first, the initial proportion of agents p iexp that are exposed to the disease; the model assumes that their introduction occurs at the onset of the disease incubation period. then, the incubation period τ incb and the recovery time τ recv are inputs corresponding to average observed or estimated values in clinical patients [69]. in the simulation, each agent is initialized with individual incubation and recovery times drawn from poisson(k; λ = τ incb) and poisson(k; λ = τ recv), respectively. our reasoning behind this choice rests on the fact that the time at which symptoms manifest across patients depends on a common organismal response to the pathogen, dependent on the activation of known (and yet unknown) molecular pathways [2, 103, 111], and at the same time on the intra-population variations found across individuals due to their specific genetic make-up and context. thus, we treat symptom onset as a homogeneous poisson process for simplicity, even though the described coupling exists. 
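the per-agent draws from poisson(k; λ) used above can be sketched with knuth's sampler; the Agent class and its field names are illustrative assumptions, not the authors' code.

```python
import math
import random

def poisson(lam, rng=random):
    """Knuth's algorithm for k ~ Poisson(lam); adequate for the moderate
    means used here (dwell, incubation and recovery times)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

class Agent:
    """Minimal agent holding the individual random times described above.
    Parameter names follow the paper's symbols: k_dw, tau_incb, tau_recv."""
    def __init__(self, k_dw, tau_incb, tau_recv, rng=random):
        self.dwell_steps = poisson(k_dw, rng)     # steps spent per location
        self.incubation = poisson(tau_incb, rng)  # days until symptom onset
        self.recovery = poisson(tau_recv, rng)    # days until recovery
        self.state = "susceptible"
```

drawing each agent's times independently gives exactly the intra-population variability the text motivates, while the shared λ values encode the common organismal response.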
an improvement to our current model would entail, if the reasoning above holds, computing observables using a general poisson point process by assuming that the radon-nikodym density exists [29]. fourth, the proportion of individuals who remain asymptomatic, p asym, is accounted for and utilized when stochastically deciding the fate of exposed agents. the role of asymptomatic patients remains heavily investigated [11, 85] and appears to play a crucial role in disease mitigation for covid-19 [18, 33, 43, 122]. from literature data, the proportion of asymptomatic patients appears to vary greatly across countries and demographics (e.g. [4, 6]), although the matter is far from settled. in this sense, our model is intended to be applied by using data starting at the most local level if possible, and only moving to larger geographical instances when data cannot be obtained by means of statistically robust antibody testing. fifth, the proportion of severe cases p sev is significant for the model due to its relation to hospital capacity. the definition of severity used here is that reported in [118]; we suggest that similar guidelines for estimating the proportion of severe cases be followed. another proxy for severity may comprise the number of non-icu and icu admissions, and their ratio [86]; in general, the need for hospitalization implies that clinical evaluation of a patient raises enough concerns as to consider the possibility of transitioning from the non-icu stage to the icu stage of care [94]. to account for saturation of health care services, the model makes use of the proportion of beds proportional to population density, p beds, and we assume that only severe cases are hospitalized. if a severe patient cannot be hospitalized due to saturation, then its probability of fatality rises by a factor to be determined empirically; for reference, our model presupposes a four-fold increase. at present, our model does not provide estimates of icu occupancy. 
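the bed-saturation adjustment just described (a presupposed four-fold increase in fatality probability when no bed is available) can be sketched as follows; the function name and argument layout are assumptions, not the authors' interface.

```python
def fatality_prob(p_f, severe_hospitalized, beds_available, saturation_factor=4.0):
    """Fatality probability for a severe case: multiplied by an empirical
    saturation factor when hospital beds are exhausted, capped at 1."""
    if severe_hospitalized >= beds_available:
        return min(1.0, saturation_factor * p_f)
    return p_f
```

keeping the saturation factor as a parameter reflects the text's caveat that the true multiplier remains to be determined empirically.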
finally, the model utilizes the probability of contagion p cont per agent per interaction. this quantity can be obtained from field data, or from other (more coarse-grained) epidemiological models at the onset from estimates of the basic reproduction number r 0, by observing that r 0 = n s,i · p cont · d, where n s,i is the average number of contacts between susceptible (s) and infectious (i) agents, and d is the duration of infectiousness of the disease. recalling the sir model, we observe that γ = d −1, and our view of r 0 is that of a preliminary estimate for initial calibration at the onset of the period of interest. for reporting purposes, we favor the effective reproductive number r(t), and consequently provide information about the observable n s,i (t) such that r(t) = n s,i (t) · p cont · d. since one agent represents multiple individuals in the region of interest, it becomes necessary to compensate for this renormalization process. for such purpose, we provide a scaling factor k r associated with the representativeness of a model with scale 1:r; we found empirically γ ≈ 1.58489, such that rescaling the probability of contagion to p′ cont = p cont / k r leads to a consistent r(t). in addition, r(t) ∝ ρ pop (eq. 1), which implies that k mobl must also modulate r(t). at the model level, observables are interrogated across the agent population, agent step actions are scheduled, and the step number is updated. at each step, agents move one space across the 2d torus after exhausting their dwell time per location. all agents possess an internal state σ that stores information relevant to the disease, its economic aspects and various control structures. their motion is driven by a random walk within their moore neighborhood, unless their state has been set to isolation. 
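assuming the standard relation r 0 = n s,i · p cont · d implied by the surrounding definitions, the initial calibration of p cont could be sketched as below; since the extraction lost the exact formula for the representativeness factor k r, it is taken here as an input rather than assumed.

```python
def p_cont_from_r0(r0, avg_contacts, duration_days):
    """Initial per-contact contagion probability from r0 = n_si * p_cont * d."""
    return r0 / (avg_contacts * duration_days)

def effective_r(n_si_t, p_cont, duration_days):
    """Effective reproduction number r(t) = n_si(t) * p_cont * d, with
    n_si(t) the observed susceptible-infectious contact rate at time t."""
    return n_si_t * p_cont * duration_days

def rescale_p_cont(p_cont, k_r):
    """Compensate for one agent representing many individuals (scale 1:r);
    the functional form of k_r is not reproduced here."""
    return p_cont / k_r
```

for example, r0 = 2.5 with 10 susceptible-infectious contacts per day and d = 5 days of infectiousness yields p cont = 0.05, and feeding that back through effective_r recovers r(t) = 2.5 at onset.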
isolation means not moving across the space regardless of dwell times. more than one agent can inhabit one lattice site, which forms the basis for determining when a susceptible agent becomes exposed and the infectious cycle starts. prior to performing stage-dependent computations from the disease perspective, agents compute the consequences of policy measures and adjust the various elements of their internal states relevant to the epidemiological and economic actions to follow. our model departs from the usual compartments of the sir model and extends them in order to account for a more fine-grained variety of significant infection stages. agent disease states are as follows. susceptible: all agents (except those marked as initially exposed) start in the susceptible population. when susceptible agents share the same lattice site, they may come in contact with other exposed, symptomatic (i.e. due to voluntary or involuntary breaking of quarantine, with probability 1 − p isoeff) or undetected asymptomatic agents. if at least one agent is infectious, the agent changes its state σ to exposed as dictated by bernoulli(σ, p cont). unless quarantined due to a policy measure, susceptible agents move freely across the lattice. exposed: in the absence of any policy measures impacting exposed agents, these continue to explore lattice sites until their incubation period given by poisson(x, λ = τ incb) is exhausted. at that time, agents become asymptomatic as dictated by bernoulli(σ, p asym), or symptomatic detected otherwise. asymptomatic: asymptomatic agents continue moving through the space and remain infectious until the recovery period given by poisson(x, λ = τ recv) is exhausted. at that point, the agent enters the population of those recovered. 
symptomatic: when an agent becomes symptomatic, it is immediately marked as detected and quarantined. regardless of the stringency of testing policies, the definition of a confirmed case depends at minimum on being both symptomatic and positively identified via some form of testing (i.e. qualitative or quantitative rt-pcr, or serological [91]). symptomatic agents follow two possible trajectories. in the first one, agents convalesce without becoming severe until their recovery time is exhausted and recuperate. in the second one, agents become severe as dictated by bernoulli(σ, p sev). in terms of the impact of saturation of health care services, research is needed to determine the lethality of patients outside hospitals and other facilities. however, using existing ethical guidelines that provide heuristics of fair resource allocation for beds and ventilator equipment [39] as a proxy, we estimate that lethality increases (conservatively) by at least a factor of 5. asymptomatic detected: asymptomatic agents detected by some widespread systematic testing strategy are immediately quarantined in place, and wait until their recovery time is exhausted. at that point, they join the population of recovered agents. 
our model does not consider the probability of re-infection, but this may need to be included in the future [92] . at the moment of writing, this aspect of covid-19 remains speculative and uncertain for human patients [34, 90, 115] despite encouraging evidence obtained from experiments using rhesus macaques [14] . deceased agent that count as fatalities do not interact with other agents or undergo any further epidemiological significant events until the end of the simulation. in order to increase model realism, we consider the effect of inbound infectious cases of two sorts: people that live within the community but have to travel and work outside of it in a steady stream daily that maintains the overall population density stable, and people that move seasonally within the community, potentially bringing in more cases that have a different viral load. this last case describes, for example, the opening of university campuses for instruction where several people relocate to adjacent towns. for the first situation, the model includes the probability of susceptible people becoming infected with a daily rate that determines the infectious stage depending on bernoulli(x, r ibnd ). in the second type of infectious cases, at time t mass a proportion p mass of the current population will enter the simulation space during a time period τ mass with a probability of being in the exposed state of p mass . covid-19 has forced a frantic search for public policy measure combinations capable of containing viral transmission, and ideally quelling its progression altogether [13, 42, 47, 56, 67, 71, 89, 99, 106, 115, 116, 120] . 
all measures reviewed and emerging across the literature roughly belong to four main classes: 1) those that aim to reduce, at any instant, the density of individuals at locations with potentially high concentrations of people, by means of imposed self-isolation of non-essential workers and cancellation of activities involving massive numbers of people; 2) those that reduce the likelihood of viral exposure for those qualified as essential workers, for whom close social interactions are inescapable; 3) those intended to detect and isolate positive virus carriers through application of molecular or serological testing; and 4) those that seek to reconstruct interaction histories in which a positive infectious patient may have had an active role in unknowingly spreading the disease. it has become increasingly clear that these measures are essential yet hard to sustain for long periods of time. on the one hand, various degrees of negative psychological impact have been recently reported [31], in particular impacts that decrease adherence to public policy measures [59], which are expected to naturally arise after periods of prolonged confinement under a collective crisis; world war ii critics of air raid shelter policies constitute a significant precedent [80]. on the other hand, mounting concerns about lasting economic impacts [76, 83] materialize the wickedness of the covid-19 pandemic and the cost of measures to address it [22], concerns which include the social protection of workers [44], the labor market [30], patterns of work [65], gender equality [7], nation-state economics [74], and monetary policy [21], to name a few. 
Motivated by the latter, our work attempts to model the individual variability expected when these types of measures are implemented, their various impacts in terms of collective disease and economic observables, and the potential outcomes of combining them in various ways. We note that the societal and economic impacts of COVID-19 differ from those of other pandemics due to the tight coupling of global events and the effect of near-instantaneous digital communication: we are changing the pandemic while living it. While our model does not provide mechanisms to state the associated control problem in cybernetics terms, emerging literature (e.g. [35, 48, 87, 119]) suggests that such an approach may be possible, and even essential, to provide solutions that account for the complexities involved in politically and socially driven environments.

Self-isolation in our model is captured by establishing a period in which a proportion p_isol of agents in mobile states (i.e. susceptible, exposed, asymptomatic) remains at a fixed location for a well-defined period of time. Self-isolation starts at time t_isol and extends for a period τ_isol, after which motion across space is restored. Agents isolate with effectiveness p_isoeff, representing the probability that, when in contact with another infective agent, the final probability of contagion becomes p'_cont = p_isoeff · p_cont. Social distancing is modeled as a distance-dependent adjustment constant δ(ℓ) that adjusts the probability of contagion depending on the linear distance ℓ assumed between agents within the same cell, such that p'_cont = δ(ℓ) · p_cont. Based on recent experiments on the effect of air turbulence on droplet dispersion [24], we assume a decreasing sigmoid profile after 1.5 meters. Hence δ(ℓ) takes a logistic (sigmoid) shape, where k is a constant that adjusts its decrease rate; for the purposes of the COVID-19 model, k = 10.0. The value of ℓ can also be adjusted to account for the effect of other interventions that decrease the contagion probability per contact by decreasing the effective viral load, such as the use of various types of face masks [62].

Similar to self-isolation, testing in the simulation operates by distributing a target percentage of the population over a given period. Susceptible and asymptomatic agents can be tested; in our simulation, we do not re-test those who recover. A symptomatic case is treated immediately as tested, representing the case where a patient reaches a health provider and a test is applied to determine the correspondence between symptoms and the disease. Testing proceeds in the following manner: once time t_test is reached, symptomatic and asymptomatic agents are selected with a probability p_test proportional to the period τ_test. This testing process simulates massive testing policies without any statistical design or underlying population structure. We simulate forms of automated contact tracing by means of a set of known prior contacts, assuming a delay of two days for contact follow-up once an infected patient has been discovered. Once an agent is marked as positive, all of its contacts are evaluated and classified as susceptible (negative), symptomatic, or asymptomatic detected. Contact tracing utilizes the same start time and period as testing. Along with the epidemiological model, we have included economic factors tied to disease stages. After some investigation of statistical theories in economics [27], we decided to implement a simple model where value creation, in terms of exchanges between money and products or services [84], is computed in connection with the progression of the disease.
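The distance-dependent adjustment δ(ℓ) introduced above is not given in closed form in this excerpt; the following is a minimal sketch assuming a logistic profile centered at 1.5 m with the stated rate k = 10, consistent with the described decreasing sigmoid (the exact functional form used by the workbench may differ).

```python
import math

def delta(distance_m, k=10.0, midpoint=1.5):
    """Distance adjustment for the contagion probability: close to 1
    well below the midpoint, decaying sigmoidally beyond 1.5 m."""
    return 1.0 / (1.0 + math.exp(k * (distance_m - midpoint)))

p_cont = 0.004  # probability per 15-minute contact (calibrated value)
for d in (0.5, 1.5, 1.89, 3.0):
    print(d, delta(d) * p_cont)  # adjusted contagion probability
```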
Our model is an oversimplification of economic systems; as such, its goal is to provide a numerical intuition about the immediate effects on the accumulation of capital by individuals and the public sector during the period of interest. Thus the economic model makes no assumptions about wealth distribution, wealth inequality, or other societal factors, and only aims at portraying the impact of an epidemic on transactional capital gains or losses at the private and public levels. Regarding public value [81], we observe that the complex web of actions across individuals and institutions makes the construction of a detailed model expensive in connection with infectious disease dynamics; the growing literature on the subject is indicative of this [25, 53, 78, 112]. To the best of our current experience, proper treatment of the current situation would require a different class of model based on fractional operators coupling epidemiological [1] and econometric aspects [73, 109] capable of accounting for short- and long-term memory macroeconomic effects [108] (medRxiv preprint: https://doi.org/10.1101/2020.07.22.20159798). Our take on the matter can be stated through the following somewhat intuitive principles:

1. Disease stages that allow interactions should cause non-linear positive externalities in terms of collectively amplified public value: the more frequent the interaction exchanges, the higher the materialization of public value, as a function of non-linear effects of reaching throughput efficiencies that both maximize economies of scale and translate into deep capital accumulation and redistribution across the public sector.

2. Disease stages that forbid interactions but do not put individual lives at risk should cause linear negative externalities, associated with increased unemployment ripple effects and the inability to reach throughput thresholds capable of amplifying value creation.

3. Disease stages that both forbid interactions and put individual lives at risk should cause non-linear negative externalities, due to increased unemployment ripple effects, the inability to reach throughput thresholds capable of amplifying value creation, and the saturation of alternatives under an increasingly severe public health crisis.

4. In addition, we estimate the cost of performing one test as part of the public cost. The purpose of this observable is to provide an account of testing as a public measure versus other actions whose cost is harder to account for.

To materialize the above principles, our approach is inspired by the input-output matrices used to account for combined economy-ecosystem calculations [52]. The equivalent of the matrix M in our model expresses economic outputs per disease stage. Viewed as a matrix computation, the components of the input vector u correspond to the proportions of the agent population per infectious stage, while the output vector v is of the form v = (v_priv, v_pub). Depending on the disease stage, vector components are computed by aggregating the value of interactions or individually. By a suitable homotopic transformation T [88], we approximate non-linear effects of market dynamics; the final value thus becomes T[v] = (v_priv^α_priv, v_pub^α_pub). The matrix components m_ij and the exponents α_priv and α_pub are model inputs. During its execution, our model captures various observables at every time step. All values are aggregates and no per-agent information is stored; in the future, storing it may be of value if the model is extended with network information.
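The per-stage input-output computation can be sketched as follows. The matrix values, the stage ordering, the exponents, and the sign handling for negative outputs are illustrative assumptions of ours; only the structure v = M u and the transformation T[v] = (v_priv^α_priv, v_pub^α_pub) come from the text.

```python
# Hypothetical output matrix M: rows = (private, public) value per unit
# population, columns = stages (susceptible, exposed, asymptomatic,
# symptomatic, severe, recovered). Values are illustrative only.
M = [
    [1.0, 1.0, 1.0, 0.2, -0.5, 1.0],   # private value per stage
    [0.8, 0.8, 0.8, -0.3, -1.0, 0.8],  # public value per stage
]

def economic_output(u, M, alpha_priv=1.1, alpha_pub=1.2):
    """Compute v = M u, then apply the homotopic transformation
    T[v] = (v_priv**alpha_priv, v_pub**alpha_pub); the sign-preserving
    power for negative values is our own assumption."""
    v_priv, v_pub = (sum(m * x for m, x in zip(row, u)) for row in M)
    sgn = lambda x: (x > 0) - (x < 0)
    return (sgn(v_priv) * abs(v_priv) ** alpha_priv,
            sgn(v_pub) * abs(v_pub) ** alpha_pub)

u = [0.70, 0.10, 0.08, 0.07, 0.02, 0.03]  # stage proportions (sum to 1)
print(economic_output(u, M))
```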
After a simulation has completed, the following classes of observables are recorded in CSV format:

- simulation: step number, population size
- disease (fractions): susceptible, exposed, asymptomatic, symptomatic quarantined, asymptomatic quarantined, severe, recovered, deceased
- measures: self-isolated, tested, traced
- epidemiology: effective reproductive number
- economics: cumulative private value, cumulative public value

The Epidemiology Workbench is implemented in Python (v3.6) using the Mesa agent-based simulation library [77]. In batch processing mode, our model receives a JSON file with all the parameters described above and a number indicating the number of cores to use during the simulation. Once a sanity check is performed, the parameters are used in conjunction with the multiprocessing batch-running features provided by Mesa. We also provide a parameterizable web-based dashboard to explore individual runs; its objective is to let decision-making users progressively gain intuition about each parameter, not to provide operational information. Our code is openly accessible through GitHub. Observed cases can provide information leading to estimates, or values provided by trusted sources at the most local level possible can be used. Incubation time is likely to be blurred by multiple confounding variables, captured in our model as a Poisson distribution. Despite its stochasticity, the model supposes that large population sizes group tightly towards a mean value.
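The Poisson treatment of incubation time and its concentration around the mean for large samples can be illustrated with a small sketch; the mean of 5 days and the sampling routine (Knuth's method) are our own hypothetical choices, not the workbench's actual parametrization.

```python
import math
import random

def poisson(lam, rng):
    """Knuth's method: sample a Poisson(lam) variate by multiplying
    uniforms until the product drops below exp(-lam)."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

rng = random.Random(11)
# Hypothetical mean incubation of 5 days; a large sample's mean
# concentrates tightly around lam, as the model assumes.
draws = [poisson(5.0, rng) for _ in range(10000)]
print(sum(draws) / len(draws))
```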
This assumption may need to be revised retrospectively later. For the proportion of asymptomatic patients, contact tracing and structured antibody testing can provide specific local information. As a general calibration protocol, we suggest the following steps:

1. Adjust the grid size based on fixed population density until R_0 matches the best known value for the area in the first days of the model (steps = 96) over more than 30 runs. Population density in the model loosely includes average mobility patterns, and cell sizes reflect the distance traversed every 15 minutes. The probability of contagion may also be used to calibrate, although this is not recommended: it may imply unknown population conditions, and it should be used only to test hypotheses about individual variations that manifest in the ability of COVID-19 to spread.

2. Execute the model to the point where the number of symptomatic agents corresponds to one representative agent. For example, with an ABM of 1,000 agents and a population of 100k individuals, the critical infected agent-to-population ratio is 1:100. Use this point in time, rounded to the nearest integer day, as the point of departure for policy measures. In practice, this implies executing the model in excess of as many days as the longest known incubation period.

3. If policy measures have been introduced, use the date above as the reference point for their introduction.

To develop scenarios, we strongly recommend starting from the most recently calibrated model that includes policies, as well as using a model without measures as a basis for counterfactual arguments. The cities of Urbana (pop. 42,214) and Champaign (pop. 88,909) comprise a population of approximately 132,000 inhabitants. Figure 1 describes the current percent distribution per age group; the distribution per sex is 50.5% males and 49.5% females.
However, the population across both cities undergoes a seasonal decrease in mid-May, due to the end of the spring semester, and an increase in early August, with the start of the fall semester at the University of Illinois at Urbana-Champaign. Of the more than 50,000 students enrolled on campus in May, we estimate that around 30,000 leave during the summer and return for the fall semester; effectively, the summer population amounts to around 100,000 inhabitants. We used COVID-19 age- and sex-dependent mortality values as reported by the CDC until June 10. COVID-19 data were obtained from the Champaign-Urbana Public Health District (CU-PHD), and mortality data from the CDC. Sex-dependent mortality was established at 61.8% for males and 38.2% for females. The first local case in the community was reported on March 8, and state-wide shelter-in-place measures were applied on March 21; mandatory mask usage was established later, on May 1. By April 21, cumulative cases had reached 0.1% of the population, and by July 8 they had increased to 1%. The local R(t) value in the peak period between April 21 and May 17 reached an estimated value of 1.2; after these measures, it remained around 0.91. We used a probability of contagion per interaction of 0.004 every 15 minutes if there is at least one infected person at the same location as the susceptible one. This value, although computed from data, appears to reflect compliance with various sanitation practices among the population. Based on the fluctuation in local data, we have estimated an inbound probability of 0.0002 new cases due to members of the community becoming exposed elsewhere.
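As a back-of-the-envelope check of the calibrated per-contact probability, one can compound 0.004 per 15-minute step over a full day of continuous co-location with an infected agent. The assumption of uninterrupted exposure for all 96 steps is an extreme, purely illustrative one; it is not a scenario the workbench itself computes.

```python
p_contact = 0.004   # calibrated probability per 15-minute interaction
steps_per_day = 96  # 24 h / 15 min

# Probability of at least one transmission over a full day of
# uninterrupted co-location (complement of no transmission at any step)
p_day = 1 - (1 - p_contact) ** steps_per_day
print(round(p_day, 3))  # → 0.319
```

Even under this worst-case exposure pattern, the daily risk stays near one third, which helps explain why the calibrated per-contact value is compatible with a sub-exponential local outbreak.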
CU-PHD performs strict contact tracing across all cases; hence we assume contact tracing remains active across all simulations. Google mobility data were used to estimate the average effective shelter-in-place value at 45% of the population, with an efficacy of self-isolation of 0.8. Severity was similarly estimated at 5% based on local case information. A starting value of R(t) = 1.2 corresponded to a grid of 190 × 225 cells, 1,000 agents, and k_mob ≈ 0.4781; this last value appears to be consistent with the regular use of public transport in the area and locally observed shelter-at-home patterns. At the start of the simulation, the agent-to-inhabitant ratio is approximately 1:100; hence the calibration starts with one exposed agent. The difference between R(t) in April and July corresponds to ℓ = 1.89. Intuitively, this implies that the effect of wearing a mask may roughly translate into increasing the distance 0.39 meters beyond what the WHO recommends for proper social distancing. However, this cannot and should not be interpreted as license to relax mask usage in any manner. With regard to asymptomatic patients, we have established a conservative value of 35% based on prior studies [49]. A sequence of events was assumed toward the start of classes this fall. Our goal in simulating the impact of mass ingress was to determine, based on various measures, the viability of preserving public health under variations of measures a week after a significant portion of the population has been tested. To determine the latter, we obtained data for R(t) and for the symptomatic, asymptomatic, severe, and deceased fractions of the population. In addition, we collected information about economic impact using our input-output matrix model. A total of six simulations explore the following parameter settings:

1. shelter-in-place continued/removed on day 117;
2. massive testing applied to 25%, 50%, and 75% of the population in the enlarged community.

We also computed a counterfactual case, corresponding to no massive testing and lifting of shelter-at-home orders, to compare against a backdrop without measures. Model calibration results (Figure 2) correspond to the parametrization publicly available at the GitHub project repository (see: https://bit.ly/2bhcuv3). We computed a total of seven scenarios (shelter-at-home times testing levels, plus the counterfactual), each one with an ensemble of 30 independent runs. Execution of these scenarios was performed on Amazon EC2 infrastructure, using a c5a.8xlarge non-dedicated instance with 36 processors and 64 GB RAM and an Ubuntu 18.04 x86 image as the operating system. Calibration CPU time with an ensemble of similar size for the first 40 days is, on average, 16.3 ± 0.4 minutes, and the average execution time of a complete scenario is 65.2 ± 1.3 minutes. Preliminary profiling indicates that random number generation using the SciPy library [114] explains most of the execution time; no attempts were made to further speed up our code by compiling it with Cython [102] or Numba [68]. Our goal in these simulations was not to reproduce exactly the case curves observed in the community, but to obtain a picture that remains quantitatively and qualitatively rigorous. Confidence intervals are calculated at 95% when present. We provide scripts to automate the setup of the Amazon EC2 instance with model installation, the execution of all scenarios, and their visualization, for reproducibility purposes.
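The 95% confidence intervals over a 30-run ensemble can be computed with a normal approximation of the ensemble mean; the sketch below uses hypothetical per-run values, since the actual observables come from the simulations. Whether the workbench uses this normal approximation or another interval estimator is our assumption.

```python
import math
import random
import statistics

def mean_ci95(samples):
    """Normal-approximation 95% confidence interval for the mean
    of an ensemble of independent runs."""
    m = statistics.mean(samples)
    se = statistics.stdev(samples) / math.sqrt(len(samples))
    return m - 1.96 * se, m + 1.96 * se

rng = random.Random(0)
# Hypothetical ensemble: peak active-case fractions from 30 runs
ensemble = [0.05 + 0.01 * rng.random() for _ in range(30)]
lo, hi = mean_ci95(ensemble)
print(lo, hi)
```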
The following convention is applied to all figures: dark red corresponds to 25% testing, teal to 50%, and dark blue to 75%; the counterfactual case is colored purple. A dotted line corresponds to the start of mass ingress, a dot-dash line to the end of mass ingress and the start of massive testing, and a dashed line to the end of massive testing. All figures related to disease stages start at day 80. Model results indicate that outbound exposed individuals, coupled with local fluctuations, appear to drive the behavior of this pandemic in the cities of Urbana and Champaign. The combination of contact tracing and public health management by CU-PHD, compliance with health and sanitary measures, and rapid implementation of shelter-in-place measures has prevented the pandemic from escalating in the region. Considering both cities as a closed system, adequate health management appears to be ultimately responsible for the small number of severe cases and hospitalizations in the region. Masks, based on the information obtained from our simulation, appear to have a significant effect on the value of R(t), according to Fig. 2. Our simulation of the reopening of the local University of Illinois campus with a higher viral load suggests that an increase in cases should be observable independent of the testing regime or relaxation measures; the future impact of this increase, however, is not. Figure 3 indicates that lifting all shelter-at-home restrictions and performing testing has a significant impact regardless of the testing level, while preserving shelter-at-home measures along with any testing level can sustain R(t) slightly below pre-mask-order values. We note that R(t) decreases in all cases, which can be explained by contact-tracing-based testing. This suggests that even hybrid education modes (in presence plus online) constitute significantly better alternatives to full campus reopening. Testing intensity matters.
In terms of the outcome after day 137, testing intensity determines the final value of active cases during the two weeks after testing. Note that even when the testing instant itself has passed, capturing a larger number of positive cases (particularly asymptomatic ones) drastically reduces the infectious population (Figure ??). Even when shelter-at-home measures have been lifted, testing reduces the fraction of the population classified as active cases to a level slightly above its value prior to mass ingress (25%), roughly equal to its prior value (50%), or below it (75%). Lifting shelter-at-home measures has a significant impact on the magnitude of both the peak of active cases and the value two weeks after testing occurred. Plans to test the student population once per week at scale, although not simulated here, appear to be a most effective solution to further tame the curve. The main mechanism through which massive testing acts, according to our simulations, is the removal of infectious individuals from the population, in particular those who are asymptomatic. In general, the estimated proportion of asymptomatic patients is a significant driver of contagion in our model. When the population increases, many more individual contacts are possible within the same geographical area, and the lag induced by the incubation time translates into observing the impact of testing at least a week later. Figure 5 compares the asymptomatic fraction of the population across scenarios against a baseline simulation without massive testing or any degree of sheltering.
As in the previous case, shelter-in-place measures have an effect on the growth rate of the asymptomatic population, but even a testing intensity of 25% appears to lower it significantly compared to the counterfactual case. At a case severity of 5%, more hospitalizations may be expected six days after day 130 (mass ingress), particularly if shelter-at-home orders have been lifted (Figure 6); the impact of testing here cannot be fully distinguished, due to the overlap of confidence intervals in our simulations. Our analysis of economic impact focuses on two per-capita averages: cumulative private value (without any egress) and cumulative public value. While this part of our proposed model is experimental and requires further analysis, we proceed to state the current results. First, we studied the effect of the pandemic only on the individual accumulation of private value (Figure 7). Mass ingress appears to temporarily renormalize the distribution, and removing shelter-at-home measures predictably increases the final value at day 153, but only by approximately five units. All testing levels appear to form a relatively tight bundle, which can be interpreted either as an artifact of the model being simplistic, or as an indication that the impact of testing levels on cumulative private value is limited in the context of a pandemic under control (as in the cities of Urbana and Champaign). Even though these differences are small, they do exist: testing at low levels (i.e. 25% and 50%) reduces the number of isolated people compared to testing at a broader scale (i.e. 75%). However, when an epidemic process remains under control, public health benefits appear to largely overshadow individual losses, contingent on the validity of this approach.
In the case of per-capita cumulative public value, the model predicts negative outcomes even for an epidemic under control (Figure ). This appears to align with the public expenditure needed to mitigate the economic and societal lockdown necessary to stop the spread of the disease, as well as with the negative externalities of fewer public transactions taking place on systems that were designed to support a certain minimum load to remain profitable: public finances tend to be, in general, inelastic for that reason. When shelter-at-home measures are lifted, a natural ordering of solutions arises, from worst case (our counterfactual, purple) to best case (75% testing, blue): strong testing reduces the long-term impact of active, asymptomatic, and severe cases. In this situation, however, the effect of mass ingress appears to be to re-bundle the behavior as R(t) increases again. If shelter-at-home is preserved, an interesting situation arises: doing nothing appears to be a good economic solution. We believe the main reason behind this result is that, for the case modeled here, the social and economic cost of the lockdown is higher because the situation is under control; the material gains computed by the model are rather small, and preserving public health over economics is a better long-term strategy. Despite this, strong massive testing still provides the next best solution from the point of view of economics, and the best strategy from the public health perspective; this is evidenced by the diverging curves in Fig. 8(b).
We speculate, based on our current simulation outcomes, that the ordering of the economic cost profile of a pandemic during its exponential phase should be similar to that of Fig. 8(a), but with the divergence observed in (b). We have reported here the construction of an agent-based workbench, using the Mesa modeling framework, capable of capturing epidemic processes alongside public policy measures. The model is fully stochastic, entailing the computation of observables of different kinds. While computationally expensive, its formulation makes it easy to obtain quantities that appear to be useful in the process of combating an epidemic. We applied our workbench to understanding the possible epidemiological profile of two cities, Urbana and Champaign, in the context of the reopening of the local University of Illinois campus next fall. Our simulations indicate that at least 50% testing of the local population is needed to sustain the pressure of mass ingress of individuals with a higher viral load compared to the local one. More generally, contemporary management of an epidemic demands changing the mode of interaction across as much of the population as possible, from interactions requiring physical proximity to those that do not. Although digital technologies provide mechanisms to preserve safe spatial distancing, temporal distancing can also be used intelligently to reduce the probability of contagion. In terms of economics, public health measures must be privileged over financial concerns, since the panorama appears similarly bleak during the early phases of an epidemic, and strict measures possibly provide the best solution during the exponential phase.
Our model has the following key limitations. First, only one measure per type can be specified at the moment, instead of a sequence of dates paired with values corresponding to the measured (or expected) effect of measures of the same type. In the example above, we used an approximation of shelter-in-place for the entire simulation period (April 21 to September 12), even though the state of Illinois ordered Phase 4 re-openings on June 26. Another critical element missing from our model is preferential shelter-in-place per age group. Even though mortality appears to impact the elderly more strongly, local mortality is low; this may be due to higher compliance of that population with shelter-at-home and other sanitary measures, including wearing masks in public. Variable viral loads per disease stage [54] are also missing from our model; these are harder to calibrate, due to the biosafety and time elements involved in quantitative RT-PCR. Nevertheless, we foresee situations where this may be possible and pertinent. Finally, our model assumes agents have an effectively infinite memory of whom they have had contact with. Contact tracing has a stringent limit when performed manually, which can be expanded greatly by means of various information technologies. Hence our model is not realistic in this sense, since it does not distinguish between these two cases: once an epidemic has reached a certain critical mass, active cases will be underestimated. Computationally, the Epidemiology Workbench is limited by the lack of true concurrency in Mesa, which impacts the scaling properties of our simulation. Distributing the agents across multiple compute nodes in Python requires architectural changes in Mesa beyond the scope of our work. At present, efforts are under way to develop an Elixir-based ABM platform capable of addressing this limitation, as part of the SPEC collaboration in the computational social science community.
Another critical bottleneck is random number generation, for which various strategies may be applied, including the use of (approximately) irrational numbers as coefficients of Fourier series. A final aspect underscored by our research, particularly in the context of COVID-19, is the need for anticipatory mechanisms driving public policy measures. In this sense, simulation methods such as the one presented here are inconvenient when introducing external events, given the execution cost of recomputing complex models, and appear to make sense only at the early stages of an epidemic process: once the disease spread has reached its exponential phase, the need moves from prediction to probabilistically qualified estimations of short-term measure effectiveness. This suggests a different class of stochastic methods that not only predict expected trends but also recommend measures based on their effectiveness, similar to those used in high-speed trading of financial derivatives and futures. The complexity of the socio-technical character of the global economy and society demands these more powerful methods to successfully address the wicked character of a pandemic, in particular this one. As of now, our efforts concentrate on seeking viable ways to package and deploy the Epidemiology Workbench across various cyberinfrastructure resources, in order to make it available to other small cities (including training resources) and, particularly, to communities with a strong presence of underrepresented minorities whose public health planning resources are heavily constrained.
cc-by-nd 4.0 international license it is made available under a is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. (which was not certified by peer review) the copyright holder for this preprint this version posted july 25, 2020. . https://doi.org/10.1101/2020.07.22.20159798 doi: medrxiv preprint . cc-by-nd 4.0 international license it is made available under a is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. (which was not certified by peer review) the copyright holder for this preprint this version posted july 25, 2020. . https://doi.org/10.1101/2020.07.22.20159798 doi: medrxiv preprint on a comprehensive model of the novel coronavirus (covid-19) under mittag-leffler derivative the possible immunological pathways for the variable immunopathogenesis of covid-19 infections among healthy adults, elderly and children special report: the simulations driving the world's response to covid-19 investigating the impact of asymptomatic carriers on covid-19 transmission. medrxiv comparing large-scale computational approaches to epidemic modeling: agent-based versus structured metapopulation models asymptomatic coronavirus infection: mers-cov and sars-cov-2 (covid-19) the impact of covid-19 on gender equality rural america and coronavirus epidemic: challenges and solutions landscape epidemiology modeling using an agent-based model and a geographic information system elementary proof of convergence to the mean-field model for the sir process presumed asymptomatic carrier transmission of covid-19 covid-induced economic uncertainty covid-19: emerging protective measures lack of reinfection in rhesus macaques infected with sars-cov-2. biorxiv sectoral effects of social distancing. 
the authors wish to thank the following individuals for their contributions and support:

key: cord-347952-k95wrory authors: prieto, diana m; das, tapas k; savachkin, alex a; uribe, andres; izurieta, ricardo; malavade, sharad title: a systematic review to identify areas of enhancements of pandemic simulation models for operational use at provincial and local levels date: 2012-03-30 journal: bmc public health doi: 10.1186/1471-2458-12-251 sha: doc_id: 347952 cord_uid: k95wrory background: in recent years, computer simulation models have supported development of pandemic influenza preparedness policies. however, u.s. policymakers have raised several concerns about the practical use of these models. in this review paper, we examine the extent to which the current literature already addresses these concerns and identify means of enhancing the current models for higher operational use. methods: we surveyed pubmed and other sources for published research literature on simulation models for influenza pandemic preparedness. we identified 23 models published between 1990 and 2010 that consider single-region (e.g., country, province, city) outbreaks and multi-pronged mitigation strategies. we developed a plan for examination of the literature based on the concerns raised by the policymakers.
results: while examining the concerns about the adequacy and validity of data, we found that though the epidemiological data supporting the models appears to be adequate, it should be validated through as many updates as possible during an outbreak. demographical data must improve its interfaces for access, retrieval, and translation into model parameters. regarding the concern about credibility and validity of modeling assumptions, we found that the models often simplify reality to reduce computational burden. such simplifications may be permissible if they do not interfere with the performance assessment of the mitigation strategies. we also agreed with the concern that social behavior is inadequately represented in pandemic influenza models. our review showed that the models consider only a few social-behavioral aspects including contact rates, withdrawal from work or school due to symptoms appearance or to care for sick relatives, and compliance to social distancing, vaccination, and antiviral prophylaxis. the concern about the degree of accessibility of the models is palpable, since we found three models that are currently accessible by the public while other models are seeking public accessibility. policymakers would prefer models scalable to any population size that can be downloadable and operable in personal computers. but scaling models to larger populations would often require computational needs that cannot be handled with personal computers and laptops. as a limitation, we state that some existing models could not be included in our review due to their limited available documentation discussing the choice of relevant parameter values. 
conclusions: to adequately address the concerns of the policymakers, we need continuing model enhancements in critical areas including: updating of epidemiological data during a pandemic, smooth handling of large demographical databases, incorporation of a broader spectrum of social-behavioral aspects, updating information for contact patterns, adaptation of recent methodologies for collecting human mobility data, and improvement of computational efficiency and accessibility.
the ability of computer simulation models to "better frame problems and opportunities, integrate data sources, quantify the impact of specific events or outcomes, and improve multi-stakeholder decision making" has motivated their use in public health preparedness (php) [1]. in 2006, one such initiative was the creation of the preparedness modeling unit by the centers for disease control and prevention (cdc) in the u.s. the purpose of this unit is to coordinate, develop, and promote "problem-appropriate and data-centric" computer models that substantiate php decision making [2]. of the existing computer simulation models addressing php, those focused on disease spread and mitigation of pandemic influenza (pi) have been recognized by public health officials as useful decision support tools for preparedness planning [1]. in recent years, computer simulation models were used by the centers for disease control and prevention (cdc), the department of health and human services (hhs), and other federal agencies to formulate the "u.s. community containment guidance for pandemic influenza" [3]. although the potential of the existing pi models is well acknowledged, it is perceived that the models are not yet usable by state and local public health practitioners for operational decision making [1,4-6].
to identify the challenges associated with the practical implementation of the pi models, the national network of public health institutes, at the request of cdc, conducted a national survey of the practitioners [1]. the challenges identified by the survey are summarized in table 1. we divided the challenges (labeled a1 through a10 in table 1) into two categories: those (a1 through a5) that are related to model design and implementation and can potentially be addressed by adaptation of the existing models and their supporting databases, and those (a6 through a10) that are related to resource and policy issues and can only be addressed by changing public health resource management approaches and enforcing new policies. although it is important to address the challenges a6 through a10, we consider this a prerogative of the public health administrators. hence, the challenges a6 to a10 will not be discussed in this paper. the challenges a1 through a5 reflect the perspectives of the public health officials, the end users of the pi models, on the practical usability of the existing pi models and databases in supporting decision making. addressing these challenges would require a broad set of enhancements to the existing pi models and associated databases, which have not been fully attempted in the literature.

in this paper, we conduct a review of the pi mitigation models available in the published research literature with the objective of answering the question: "how to enhance the pandemic simulation models and the associated databases for operational use at provincial and local levels?" we believe that our review accomplishes its objective in two steps. first, it exposes the differences between the perspectives of the public health practitioners and the developers of models and databases on the required model capabilities. second, it derives recommendations for enhancing the practical usability of the pi models and the associated databases.
in this section, we describe each of the design and implementation challenges of the existing pi models (a1-a5) and present our methods to examine the challenges in the research literature. in addition, we present our paper screening and parameter selection criteria.

design and implementation challenges of pandemic models and databases

validity of data support (a1)

public health policy makers advocate that the model parameters be derived from up-to-date demographical and epidemiological data during an outbreak [1]. in this paper we examine some of the key aspects of data support, such as data availability, data access, data retrieval, and data translation. to ensure data availability, a process must be in place for collection and archival of both demographical and epidemiological data during an outbreak. the data must be temporally consistent, i.e., it must represent the actual state of the outbreak. in the united states and a few other countries, availability of temporally consistent demographical data is currently supported by governmental databases including the decennial census and the national household travel survey [7-10]. to ensure temporal consistency of epidemiological data, the institute of medicine (iom) has recommended enhancing the data collection protocols to support real-time decision making [4]. the frequency of data updating may vary based on the decision objective of the model (e.g., outbreak detection, outbreak surveillance, and initiation and scope of interventions). as noted by fay-wolfe, the timeliness of a decision is as important as its correctness [11], and there should be a balance between the cost of data updating and the marginal benefits of the model-driven decisions. archival of data must allow expedited access for model developers and users. in addition, mechanisms should be available for manual or automatic retrieval of data and its translation into model parameter values in a timely manner.
in our review of the existing pi models at provincial and local levels, we examined the validity of data that was used in supporting major model parameters. the major model parameters include: the reproduction number, defined as the number of secondary infections that arise from a typical primary case [12]; the proportion of the population who become infected, also called infection attack rate [13]; the disease natural history within an individual; and fractions of symptomatic and asymptomatic individuals. the first row of table 2 summarizes our approach to examine data validity. for each reviewed pi model, and for each of the major model parameters, we examined the source and the age of data used (a1a, a1b), the type of interface used for data access and retrieval (a1c), and the technique used for translating data into the parameter values (a1d).

credibility and validity of model assumptions (a2)

public health practitioners have emphasized the need for models with credible and valid assumptions [1]. credibility and validity of model assumptions generally refer to how closely the assumptions represent reality. however, for modeling purposes, assumptions are often made to balance data needs, analytical tractability, and computational feasibility of the models with their ability to support timely and correct decisions [5]. making strong assumptions may produce results that are timely but with limited or no decision support value. on the other hand, relaxing the simplifying assumptions to the point of analytical intractability or computational infeasibility may seriously compromise the fundamental purpose of the models. every model is comprised of multitudes of assumptions pertaining to contact dynamics, transmission and infection processes, spatial and temporal considerations, demographics, mobility mode(s), and stochasticity of parameters. credibility and validity of these assumptions largely depend on how well they support the decision objectives of the models.
for example, if a model objective is to test a household isolation strategy (allowing sick individuals to be isolated at home, in a separate room), the model assumptions must allow tracking of all the individuals within the household (primary caregivers and others) so that the contact among the household members can be assigned and possible spread of infection within the household can be assessed. this idea is further discussed in the results section through an analysis of some of the model assumptions regarding contact probability and frequency of new infection updates that were made in two of the commonly referenced pi models in the pandemic literature [14,15].

it has been observed in [1] that the existing pi models fall short of capturing relevant aspects of human behavior. this observation naturally evokes the following questions. what are the relevant behavioral aspects that must be considered in pi models? are there scientific evidences that establish the relative importance of these aspects? what temporal consistency is required for data support of the aspects of human behavior? the third row of table 2 summarizes our plan to examine how the existing models capture human behavior. for each reviewed pi model, we first identify the behavioral aspects that were considered, and then for each aspect we examine the source and the age of data used, the type of interface used for data access and retrieval, and the technique used for translating data into model parameter values (a1 a-d). we also attempt to answer the questions raised above, with a particular focus on determining what enhancements can be done to better represent human behavior in pi models.

public health practitioners have indicated the need for openly available models and population-specific data that can be downloaded and synthesized using personal computers [1].
while the ability to access the models is essential for end users, executing the pi models on personal computers, in most cases, may not be feasible due to the computational complexities of the models. some of the existing models feature highly granular description of disease spread dynamics and mitigation via consideration of scenarios involving millions of individuals and refined time scales. while such details might increase credibility and validity of the models, this can also result in a substantial computational burden, sometimes beyond the capabilities of personal computers. there are several factors which contribute to the computational burden of the pi models, the primary of which is the population size. a higher population size of the affected region requires larger datasets to be accessed, retrieved, and downloaded to populate the models. other critical issues that add to the computational burden are: a data interface with limited bandwidth, the frequency of updating of data during pandemic progress, pre-processing (filtering and quality assurance) requirements for raw data, and the need for data translation into parameter values using methods like maximum likelihood estimation and other arithmetic conversions. the choice of the pi model itself can also have a significant influence on the computational burden. for example, differential equation (de) models divide population members into compartments, where in each compartment every member makes the same number of contacts (homogeneous mixing) and a contact can be any member in the compartment (perfect mixing). in contrast, agent-based (ab) models track each individual of the population, where an individual contacts only the members in his/her relationship network (e.g., neighbors, co-workers, household members, etc.) [16]. the refined traceability of individual members offered by ab models increases the usage of computational resources.
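to make the de/ab contrast concrete, the homogeneous, perfect-mixing assumption of a compartmental de model can be sketched in a few lines. this is a generic illustrative sir model, not any of the reviewed models; the transmission rate beta, recovery rate gamma, population size, and seed size are all assumed values chosen only for the example (giving r0 = beta/gamma = 3).

```python
def simulate_sir(beta=0.3, gamma=0.1, n=1_000_000, i0=10, days=200, dt=0.1):
    """fixed-step euler integration of a homogeneously mixed sir model."""
    s, i, r = float(n - i0), float(i0), 0.0
    peak_prevalence = i
    for _ in range(int(days / dt)):
        new_infections = beta * s * i / n * dt  # perfect mixing: any member may be contacted
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        peak_prevalence = max(peak_prevalence, i)
    return r / n, peak_prevalence               # (infection attack rate, peak infectious)

attack_rate, peak = simulate_sir()              # r0 = beta/gamma = 3.0
```

with r0 = 3 the final attack rate approaches the classical final-size value (roughly 94% of the population), and the whole million-person run costs only a few thousand scalar updates; an ab model of the same population would instead store and update a state for each individual and their contact network at every time step, which is the computational gap described above.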
further increases in the computational needs are brought on by the need for running multiple replicates of the models and generating reliable output summaries. as summarized in the last row of table 2, we examine which models have been made available to the general public and whether they are offered as open or closed source code. we also check for the documentation of model implementation as well as for the existence of user support, if any. in addition, we look for the ways that researchers have attempted to address the computational feasibility of their models, including data access, retrieval and translation, model execution, and generation of model outputs.

table 2 plan for examination of the design and implementation challenges of the existing pi models (excerpt): validity of data support (a1): for each pi model and for each of the major model parameters (e.g., reproduction number, illness attack rate) examine: a1a. data source for parameter values (actual, simulated, assumed); a1b. age of data; a1c. type of interface for data access and retrieval (manual, automatic); a1d. technique to translate raw data into model parameter values (e.g., arithmetic conversion, bayesian estimation). credibility and validity of model assumptions (a2): ...

the initial set of articles for our review was selected following the prisma reporting methodology, as applicable. we used the pubmed search engine with the keyword string "influenza" and "pandemic" and "model" in the english language. a total of 640 papers were found which were published between 1990 and 2010. we filtered those using the following selection criteria (also depicted in figure 1).
-articles that evaluate one or more strategies in each of the mitigation categories: social distancing, vaccination, and antiviral application. we limited the paper (by excluding models that do not consider all three categories) to contain the scope of this review, as we examined a large body of related papers from which our selected articles drew their parameters (see additional tables).
-articles with single-region simulation models. we defined single-region for the purpose of this review as either a country or any part thereof. models presenting disease spread approaches without mention of any regional boundary were included, as these approaches can directly support decision makers at provincial and local levels. there exists a significant and important body of literature that is dedicated to global pandemic influenza modeling that aims at quantifying global disease spread [17-20], assessing the impact of global vaccine distribution and immunization strategies [18-20], and assessing the impact of recommended or self-initiated mobility behaviors on the global disease spread [21,22]. as these overarching aims of the global models do not directly impact operational decisions of provincial and local policy makers during an evolving pandemic, we have not included them in our final selection of articles.
-articles that include data sources for most model parameter values and, when possible, specify the methods for parameter estimation. we included this criterion in order to evaluate models with respect to the challenge of "validity of data support." see table 2 where we outline our evaluation plan. clearly, models not satisfying this criterion would not support our review objectives.
using the above filtering criteria, an additional snowball search was implemented outside pubmed, which yielded 5 additional eligible papers [14,23-26], bringing the total number of papers reviewed to twenty-three. we grouped the twenty-three selected articles in eleven different clusters based on their model (see table 3).
the clusters are named either by the name used in the literature or by the first author name(s). for example, all three papers in the imperial-pitt cluster use the model introduced initially by ferguson et al. [27]. in each cluster, to review the criteria for the design and implementation challenge (a1), we selected the article with the largest and most detailed testbed (marked in bold in table 3). as stated earlier, credibility and validity of model assumptions (a2) were examined via the two most commonly cited models in the pandemic literature [14,15]. the challenges a3-a5 were examined separately for each of the selected articles. out of the ten model clusters presented in table 3, eight are agent-based simulation models, while the rest are differential equation models. also, while most of the models use purely epidemiological measures (e.g., infection attack rates and reproduction numbers) to assess the effectiveness of mitigation strategies, only a few use economic measures [26,35,39]. in our review, we examined epidemiological, demographical, and social-behavioral parameters of the pandemic models. we did not examine the parameters of the mitigation strategies as a separate category since those are functions of the epidemiological, demographical, and social-behavioral parameters. for example, the risk groups for vaccine and antiviral (which are mitigation parameters) are functions of epidemiological parameters such as susceptibility to infection and susceptibility to death, respectively. another example is compliance to non-pharmaceutical interventions, a mitigation strategy parameter, which can be achieved by altering the social-behavioral parameters of the model.

in this section, we present the results of our review of the models that evaluate at least one strategy from each mitigation category (social distancing, vaccination, and antiviral application). we also identify areas of enhancements of the simulation-based pi models for operational use.
our discussion on validity of data support includes both epidemiological and demographic data. additional file 1: table s1 summarizes the most common epidemiological parameters used in the selected models along with their data sources, interface for data access and retrieval, and techniques used in translating raw data into parameter values. additional file 1: table s2 presents information similar to above for demographic parameters. the most commonly used epidemiological parameters are reproduction number (r), illness attack rate (iar), disease natural history parameters, and fraction of asymptomatic infected cases.

figure 1 selection criteria for pi models for systematic review: initial set of articles filtered from pubmed using keyword search (n = 640); exclusion of articles that do not examine pandemic influenza spread under a comprehensive set of mitigation strategies (n = 612), remaining articles (n = 28); exclusion of articles that examine global pandemic spread (n = 6), remaining articles (n = 22); exclusion of articles that do not provide comprehensive support for data collection and parameterization methods (n = 4), remaining articles (n = 18); inclusion of articles that meet the above criteria but are obtained using snowball search outside pubmed (n = 5); articles reviewed (n = 23).

in the models that we have examined, estimates of reproduction numbers have been obtained by fitting case/mortality time series data from the past pandemics into models using differential equations [44], cumulative exponential growth equations [7], and bayesian likelihood expressions [27]. iars have been estimated primarily using household sampling studies [33], epidemic surveys [29,45], and case time series reported for 2009 h1n1 [36,46].
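to illustrate the exponential-growth route to estimating r mentioned above, one common textbook approach (not necessarily the exact procedure used in [7]) fits a log-linear model to early incidence to obtain the epidemic growth rate, then converts it to a reproduction number; the relation r0 ≈ 1 + growth rate × mean generation time holds under an exponentially distributed generation interval. the case series and generation time below are synthetic, illustrative values.

```python
import math

def loglinear_growth_rate(incidence):
    """least-squares slope of log(cases) versus day: the early epidemic growth rate."""
    days = range(len(incidence))
    logs = [math.log(c) for c in incidence]
    n = len(incidence)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    den = sum((x - mean_x) ** 2 for x in days)
    return num / den

# synthetic early-outbreak case counts growing at ~20% per day (illustrative only)
cases = [round(10 * math.exp(0.2 * t)) for t in range(14)]
growth_rate = loglinear_growth_rate(cases)

generation_time = 3.0                       # assumed mean generation time (days)
R0_est = 1 + growth_rate * generation_time  # exponential generation-interval assumption
```

for the synthetic series above the fitted growth rate recovers roughly the 0.2/day used to generate it; with real surveillance data the same fit would need the reporting-delay and under-detection corrections discussed later in this review.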
the parameters of the disease natural history, which are modeled using either a continuous or phase-partitioned time scale (see additional file 1: table s1 ), have been estimated from household random sampling data [27, 33, 47] , viral shedding profiles from experimental control studies [23, 43, 48, 49] , and case time series reported for 2009 h1n1 [36, 46] . bayesian likelihood estimation methods were used in translating 2009 case time series data [27, 46] . fraction of asymptomatic infected cases has been estimated using data sources and translation techniques similar to the ones used for natural history. recent phylogenetic studies on the 2009 h1n1 virus help to identify which of the above epidemiological parameters need real-time re-assessment. these studies suggest that the migratory patterns of the virus, rather than the intrinsic genomic features, are responsible for the second pandemic wave in 2009 [50, 51] . since r and iar are affected not only by the genomic features but also by the migratory patterns of the virus, a close monitoring of these parameters throughout the pandemic spread is essential. real-time monitoring of parameters describing disease natural history and fraction of asymptomatic cases is generally not necessary since they are mostly dependent on the intrinsic genomic features of the virus. these parameters can be estimated when a viral evolution is confirmed through laboratory surveillance. estimation methods may include surveys (e.g., household surveys of members of index cases [52, 53] ) and laboratory experiments that inoculate pandemic strains into human volunteers [54] . current pandemic research literature shows the existence of estimation methodologies for iar and r that can be readily used provided that raw data is available [46] . there exist several estimators for r (wallinga et al. [55, 56] , fraser [57] , white and pagano [58] , bettencourt et al. [59] , and cauchemez et al. [60] ). 
these estimators have been derived from different underlying infection transmission models (e.g., differential equations, time-since-infection, and directed network models). with different underlying transmission models, the estimators consider data from different perspectives, thereby yielding different values for r at a given time t. for example, fraser [57] proposes an instantaneous r that observes how past case incidence data (e.g., at time points t-1, t-2, t-3) contribute to the present incidence at time t. in contrast, wallinga et al. [55, 56] and cauchemez et al. [60] propose estimators that observe how the future incidences (e.g., at t + 1, t + 2, t + 3) are contributed to by a case at time t. white and pagano [58] consider an estimator that can be called a running estimate of the instantaneous reproduction number. further extensions of the above methods have been developed to accommodate more realistic assumptions. bettencourt et al. extended their r estimator to account for multiple introductions from a reservoir [59]. the wallinga estimator was extended by cowling [61] to allow for reporting delays and repeated importations, and by glass [62] to allow for heterogeneities among age groups (e.g., adults and children). the fraser estimator was extended by nishiura [63] to allow estimation of the reproduction number for a specific age class given infection by another age class. the above methods for real-time estimation of r are difficult to implement in the initial and evolving stages of a pandemic, given the present status of the surveillance systems. at provincial and local levels, surveillance systems are passive, as they mostly collect data from infected cases who are seeking healthcare [64]. with passive surveillance, only a fraction of symptomatic cases are detected, with a probable time delay from the onset of symptoms.
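a minimal sketch of the fraser-style instantaneous estimator described above, assuming a discretized generation-interval distribution w is available; the toy incidence series and interval are illustrative only, not drawn from the cited papers:

```python
def instantaneous_R(incidence, w):
    """fraser-style instantaneous R: R_t = I_t / sum_{s>=1} w[s] * I_{t-s},
    where w is a discretized generation-interval distribution (w[0] unused)."""
    out = []
    for t in range(len(incidence)):
        denom = sum(w[s] * incidence[t - s]
                    for s in range(1, min(t, len(w) - 1) + 1))
        out.append(incidence[t] / denom if denom > 0 else None)
    return out

# toy incidence doubling each step; all transmission one step after infection
w = [0.0, 1.0]
inc = [1, 2, 4, 8, 16]
R_series = instantaneous_R(inc, w)  # [None, 2.0, 2.0, 2.0, 2.0]
```

the wallinga-teunis family instead attributes each case forward to its likely infectors, so it needs incidence beyond time t; this is why the instantaneous form is the one usually considered for real-time use.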
once the symptomatic cases seek healthcare and are reported to the surveillance system, the healthcare providers selectively submit specimens to the public health laboratories (phl) for confirmatory testing. during the h1n1 pandemic in 2009, in regions with high incidence rates, the daily testing capacities of the phl were far exceeded by the number of specimens received. in these phl, the existing first-come-first-served testing policy and the manual methods for receiving and processing the specimens further delayed the publication of confirmed cases. the time series of the laboratory-confirmed cases likely have been inflated by the increased specimen submission resulting from the behavioral response (fear) of both the susceptible population and the healthcare providers after the pandemic declaration [65]. similarly, time series of the confirmed cases likely have been deflated at the later stages of the pandemic, as federal agencies advocated refraining from specimen submission [66]. the present status of the surveillance systems calls for the models to account for: the underreporting rates, the delay between onset of symptoms and infection reporting, and the fear factor. in addition, we believe that it is necessary to develop, and analyze the cost of, strategies to implement active surveillance and reduce the delays in the confirmatory testing of the specimens. in our opinion, the above enhancement can be achieved by developing methods for statistical sampling and testing of specimens in the phl. in addition, new scheduling protocols will have to be developed for testing the specimens, given the limited laboratory testing resources, in order to better assess the epidemiological parameters of an outbreak. with better sampling and scheduling schemes at the phl, alterations in the specimen submission policies during a pandemic (as experienced in the u.s. during the 2009 outbreak) may not be necessary.
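the corrections for underreporting and reporting delay called for above can be illustrated with a deliberately crude sketch: it assumes a known reporting fraction and a known delay distribution (both of which would in practice have to be estimated), and it is a simple back-projection, not a method used by any of the reviewed models:

```python
def adjust_reported(reported, reporting_fraction, delay_pmf):
    """crudely correct a reported-case series: scale counts up by the assumed
    reporting fraction, then redistribute them back in time according to a
    reporting-delay distribution (back-projection, not a full deconvolution)."""
    onsets = [0.0] * len(reported)
    for t, c in enumerate(reported):
        scaled = c / reporting_fraction
        for d, p in enumerate(delay_pmf):
            if t - d >= 0:
                onsets[t - d] += scaled * p
    return onsets

# 1 in 4 cases reported; reporting lags symptom onset by exactly two days
onsets = adjust_reported([0, 0, 5, 10, 20], reporting_fraction=0.25,
                         delay_pmf=[0.0, 0.0, 1.0])  # [20.0, 40.0, 80.0, 0.0, 0.0]
```

a fear-driven surge in specimen submission would show up here as a time-varying reporting fraction, which is exactly the quantity that active surveillance and statistical specimen sampling would help pin down.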
the above enhancements would also support a better real-time assessment of the iar, which is also derived from case incidence data. our review of the selected pi models indicates that currently all of the tasks relating to access and retrieval of epidemiological data are being done manually. techniques for translation of data into model parameter values range from relatively simple arithmetic conversions to more time-consuming methods of fitting mathematical and statistical models (see additional file 1: table s1). there exist recent mechanisms to estimate incidence curves in real-time using web-based questionnaires from symptomatic volunteers [67], google and yahoo search queries [68, 69], and twitter messages [70]; these mechanisms have supported influenza preparedness in several european countries and the u.s. [67, 69]. if real-time incidence estimates are to be translated into pi model parameters, complex translation techniques might delay execution of the model. we believe that model developers should consider building (semi)automatic interfaces for epidemiological data access and retrieval, and develop translation algorithms that balance run time and accuracy. additional file 1: table s2 shows the most common demographic parameters used in the selected models. the parameters are population size/density, distribution of household size, peer-group size, age, commuting travel, long-distance travel, and importation of infected cases to the modeled region. estimation of these parameters has traditionally relied on comprehensive public databases, including the u.s. census, landscan, the italian institute of statistics, the census of canada, hong kong survey data, uk national statistics, the national household travel survey, the uk department of transport, the u.s. national center for education statistics, the italian ministry of university and research, and the uk department for education and skills.
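a sketch of what such a (semi)automatic interface might look like, with the retrieval step and the translation step kept separate so that either can be swapped out; the data format, field names, figures, and reporting fraction below are entirely hypothetical:

```python
import csv
import io

# hypothetical raw surveillance export (names and figures are illustrative)
RAW = """date,cases
2009-06-01,12
2009-06-02,18
2009-06-03,27
"""

def retrieve(source_text):
    """retrieval step: parse a raw export into records."""
    return list(csv.DictReader(io.StringIO(source_text)))

def translate(records, reporting_fraction=0.5):
    """translation step: turn raw records into model-ready parameter values."""
    incidence = [int(r["cases"]) / reporting_fraction for r in records]
    return {"incidence": incidence,
            "daily_growth": incidence[-1] / incidence[-2]}

params = translate(retrieve(RAW))  # daily_growth == 1.5
```

in an operational setting, `retrieve` would pull from a surveillance feed rather than a string, and `translate` is where the run-time/accuracy trade-off discussed above would be made (a fast arithmetic conversion versus a slower model fit).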
readers are referred to additional file 1: table s2 for a complete list of databases and their web addresses. our literature review shows that access and retrieval of these data are currently handled through manual procedures. hence, there is an opportunity for developing tools to accomplish (semi)automatic data access, retrieval, and translation into model parameters whenever a new outbreak begins. it is worth noting that access to demographic information is currently limited in many countries, and therefore obtaining demographic parameters in real-time would only be possible where the information holders (census agencies and governmental institutions) openly share the data. the data sources supporting parameters for importation of infected cases reach beyond the modeled region, requiring the regional models to couple with global importation models. this coupling is essential, since the possibility of new infection arrivals may accelerate the occurrence of the pandemic peak [17]. this information on peak occurrence could significantly influence the timing of interventions. some of the single-region models consider a closed community with the infusion of a small set of infected cases at the beginning [24, 26, 34]. single-region models also consider a pseudo-global coupling through a constant introduction of cases per unit time [15, 29]. other single-region models adopt a more detailed approach, where, for each time unit, the number of imported infections is estimated as the product of the new arrivals to the region and the probability of an import being infected. this infection probability is estimated through a global disease spread compartmental model [14, 30]. the latter approach is similar to the one used by merler [17] for seeding infections worldwide, and is operationally viable due to its computational simplicity. for a more comprehensive approach to case importation and global modeling of disease spread, see [71].
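the detailed importation approach described above reduces, per time unit, to a simple product; a sketch with made-up numbers (in the cited models the infection probability would come from a global compartmental model rather than being computed directly):

```python
def travel_infection_probability(infected_abroad, population_abroad):
    """probability that a traveler from the source region is infected,
    as would be supplied by a global compartmental model."""
    return infected_abroad / population_abroad

def expected_imports(arrivals_per_day, infection_probability):
    """expected imported infections per time unit: arrivals x P(infected)."""
    return arrivals_per_day * infection_probability

p = travel_infection_probability(infected_abroad=50_000,
                                 population_abroad=10_000_000)
imports_per_day = expected_imports(arrivals_per_day=2_000,
                                   infection_probability=p)  # 10.0
```

the pseudo-global coupling mentioned earlier is the degenerate case where this product is replaced by a constant number of introductions per time unit.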
recall that our objective here is to discuss how the credibility and validity of assumptions should be viewed in light of their impact on the usability of models for public health decision making. we examine the assumptions regarding contact probability and the frequency of new infection updates (e.g., daily, quarterly, hourly) in two models: the imperial-pitt [14] and the uw-lanl [15] models. the choice of these models was driven by their similarities (in region, mixing groups, and the infection transmission processes), and by the fact that these models were cross-validated by halloran [28] and were used for developing the cdc and hhs "community containment guidance for pandemic influenza" [3]. we first examine the assumptions that influence contact probabilities within different mixing groups (see table 4). for households, the imperial-pitt model assumes a constant contact probability, while the uw-lanl model assumes that the probability varies with age (e.g., kid to kid, kid to adult). the assumption of contact probability varying with age matches reality better than assuming it to be constant [72]. however, for households with smaller living areas the variations may not be significant. also, neither of the papers aimed at examining strategies (e.g., isolation of sick children within a house) that depended on age-based contact probability. hence, we believe that the assumptions can be considered credible and valid. for workplaces and schools, the assumption of 75% of contacts within the group and 25% of contacts outside the group, as made in the imperial-pitt model, appears closer to reality than the assumption of constant probability in the uw-lanl model [72]. for community places, the imperial-pitt model considered proximity as a factor influencing the contact probability, which was required for implementing the strategy of providing antiviral prophylaxis to individuals within a ring of certain radius around each detected case.
we also examined the assumptions regarding the frequency of infection updates. the frequency of update dictates how often the infection status of the contacted individuals is evaluated. in reality, infection transmission may occur (or not) whenever there is a contact event between a susceptible and an infected subject. the imperial-pitt and the uw-lanl models do not evaluate infection status after each contact event, since this would require consideration of refined daily schedules to determine the times of the contact events. instead, the models evaluate infection status every six hours [14] or at the end of the day [15] by aggregating the contact events. while such simplified assumptions do not allow the determination of the exact time of infection for each susceptible, they offer a significant computational reduction. moreover, in a real-life situation, it would be nearly impossible to determine the exact time of each infection, and hence practical mitigation (or surveillance) strategies should not rely on it. the above analysis reveals how the nature of mitigation strategies drives the modeling assumptions and the computational burden. we therefore believe that policymakers and modelers should work collaboratively in developing modeling assumptions that adequately support the mitigation strategy needs. furthermore, the issue of credibility and validity of the model assumptions should be viewed from the perspectives of the decision needs and the balance between analytical tractability and computational complexity. for example, it is unlikely that any mitigation strategy would have an element that depends on minute-by-minute changes in the disease status. hence, it might be unnecessary to consider a time scale of the order of a minute for a model and thus increase both computational and data needs. contact rate is the most common social-behavioral aspect considered by the models that we have examined.
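the end-of-day aggregation described above can be sketched by collapsing a day's contact events into a single escape probability; the contact probabilities below are arbitrary, and this is a generic illustration of the technique rather than code from either model:

```python
def daily_infection_probability(contact_probs):
    """end-of-day update: a susceptible escapes infection only by escaping
    every contact, so p_infect = 1 - prod(1 - p_i) over the day's contacts."""
    escape = 1.0
    for p in contact_probs:
        escape *= 1.0 - p
    return 1.0 - escape

# three contacts with infectious individuals, aggregated into one update
p_day = daily_infection_probability([0.1, 0.2, 0.5])  # 0.64
```

the loss of information is exactly the one noted above: the aggregate says whether infection occurred during the day, not at which contact event.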
in these models, except for eichner et al. [43], the values of the contact rates were assumed, due to the unavailability of reliable data describing the mobility and connectivity of modern human networks [24, 37, 38]. however, it is now possible to find "fresh" estimates of the types, frequency, and duration of human contacts, either from a recent survey at the continental level [72] or from a model that derives synthetic contact information at the country level [73]. in addition, recent advances in data collection through bluetooth-enabled mobile telephones [74] and radio frequency identification (rfid) devices [75] allow better extraction of proximity patterns and social relationships. availability of these data creates further opportunity to explore methods of access, retrieval, and translation into model parameters. issues of data confidentiality, cost of the sensing devices, and low compliance with the activation of sensing applications might prevent the bluetooth and rfid technologies from being effectively used in evolving pandemic outbreaks. another possibility is the use of aggregated and anonymous network bandwidth consumption data (from network service providers) to extrapolate population distribution in different areas at different points in time [76, 77]. other social-behavioral parameters considered by the reviewed models include reactive withdrawal from work or school due to the appearance of symptoms [27], work absenteeism to care for sick relatives or children at home due to school closure [27, 36, 38, 43], and compliance with social distancing, vaccination, and antiviral prophylaxis [28, 37]. once again, due to the lack of data support, the values of most of these parameters were assumed, and their sensitivities were studied to assess the best and worst case scenarios. existing surveys collected during the 2009 h1n1 outbreak can be useful in quantifying the above parameters [78, 79].
recent literature has explored many additional social-behavioral aspects that were not considered in the models we reviewed. there are surveys that quantify the levels of support for school closure, follow-up on sick students by the teachers [78], healthcare-seeking behavior [80], perceived severity, perceived susceptibility, fear, general compliance intentions, compliance with wearing face masks, role of information, wishful thinking, fatalistic thinking, intentions to fly away, stocking, staying indoors, avoiding social contact, avoiding health care professionals, keeping children at home, staying at home, and going to work despite being advised to stay at home [81]. there are also models that assess the effect of self-initiated avoidance of a place with disease prevalence [21], and of voluntary vaccination and free-riding (not vaccinating but relying on the rest of the population to keep coverage high) [82]. other recognized behaviors include refusal to vaccinate due to religious beliefs and not vaccinating due to lack of awareness [82]. we believe that there is a need for further studies to establish the relative influence of all of the above-mentioned social-behavioral factors on operational models for pandemic spread and mitigation. subsequently, the influential factors need to be analyzed to determine how relevant information about those factors should be collected (e.g., in real-time or through surveys before an outbreak), accessed, retrieved, and translated into the final model parameter values. it is important to mention very recent efforts in improving models for the assessment of relevant social-behavioral components, including commuting and long-distance travel behavior [20, 83, 84], and authority-recommended decline of travel to/from affected regions [22].
for operational modeling, it would be helpful to adapt the approaches used by these models in translating massive data sets (e.g., bank notes, mobile phone user trajectories, air and commuting travel networks) into model parameter values. in addition, available new methodologies to model social behavior that adapts to evolving disease dynamics [85] should be incorporated into the operational models. with regards to accessibility and scalability of the selected models, we first attempted to determine which of the simulation models were made available to the general public, either as open or closed source code. we also checked for available documentation for model implementation and user support, if any. most importantly, we looked into how the researchers attempted to achieve the computational feasibility of their models (see additional file 1: table s3). three of the models that make their source code accessible to the general public are influsim [43], ciofi [30], and flute [23]. influsim is a closed-source differential equation-based model with a graphical user interface (gui), which allows the evaluation of a variety of mitigation strategies, including school closure, place closure, antiviral application to infected cases, and isolation. ciofi is an open-source model that is coupled with a differential equation model to allow for a more realistic importation of cases to a region. flute is an open-source model, which is an updated version of the uw-lanl [34] agent-based model. the source code for flute is also available as a parallelized version that supports simulation of large populations on multiple processors. among these three software packages, influsim has a gui, whereas ciofi and flute are provided as c/c++ code. influsim's gui seems to be more user-friendly for healthcare policymakers.
flute and ciofi, on the other hand, offer more options for mitigation strategies, but require knowledge of the c/c++ programming language and of the communication protocols for parallelization. other c++ models are planning to become, or are already, publicly accessible, according to the models of infectious disease agent study (midas) survey [86]. we note that policy makers would greatly benefit if software like flute or ciofi were made available through a cyber-enabled computing infrastructure, such as teragrid [87]. this would provide the policy makers access to the program through a web-based gui without having to cope with the issues of software parallelization and equipment availability. moreover, the policy makers would not require the skills of programming, modeling, and data integration. the need for replicates for accurate assessment of the model output measures and the run time per replicate are major scalability issues for pandemic simulation models. large-scale simulations of the u.s. population reported running times of up to 6 h per replicate, depending on the number of parallel threads used [23] (see additional file 1: table s3 for further details). it would then take a run time of one week to execute 28 replicates of only one pandemic scenario. note that most of the modeling approaches have reported between 100 and 1000 replicates per scenario [24, 36-41], with the exception of [14, 26, 28, 30], which implemented between 5 and 50 replicates. clearly, it would take about one month to run 100 replicates for a single scenario involving the entire u.s. population. while it may not be necessary to simulate the entire population of a country to address mitigation-related questions, the issue of the computational burden is daunting nonetheless.
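one way to trim the replicate count is a sequential stopping rule: add replicates one at a time until the confidence interval on the output measure is tight enough. a sketch, in which a cheap random draw stands in for an expensive pandemic replicate; the precision target and the normal approximation are illustrative choices, not taken from any reviewed model:

```python
import math
import random

def run_until_precise(simulate, rel_halfwidth=0.05, z=1.96,
                      min_reps=10, max_reps=1000):
    """add replicates one at a time until the normal-approximation
    confidence-interval half-width drops below a relative target."""
    samples = []
    while len(samples) < max_reps:
        samples.append(simulate())
        n = len(samples)
        if n < min_reps:
            continue
        mean = sum(samples) / n
        var = sum((x - mean) ** 2 for x in samples) / (n - 1)
        if z * math.sqrt(var / n) <= rel_halfwidth * abs(mean):
            break
    return mean, n

random.seed(1)
# stand-in for one expensive pandemic replicate: a noisy attack-rate estimate
mean_iar, reps = run_until_precise(lambda: random.gauss(0.30, 0.02))
```

with low-variance outputs such a rule stops after a handful of replicates, which is exactly the saving that matters when each replicate costs hours of computation.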
we therefore believe that the modeling community should actively seek to develop innovative methodologies to reduce the computational requirements associated with obtaining reliable outputs. minimization of running time has recently been addressed through high-performance computing techniques and parallelization by some of the midas models (e.g., epifast) and other research groups (e.g., dicon, gsam), as reported in [86]. minimization of replicates can be achieved by running the replicates one at a time, until the confidence intervals for the output variables become acceptable [14, 26]. in addition to the need to minimize running time and number of replicates, it is also necessary to develop innovative methodologies to minimize the setup time of operational models. these methodologies should enable the user to automatically select the level of modeling detail according to the population to mimic (see a discussion of this framework in the context of human mobility [20]), and allow the automatic calibration of the model parameters. there exist several simulation models of pandemic influenza that can be used at the provincial and local levels and were not treated as part of the evaluated models in this article. their exclusion is due to their limited documentation discussing the choice of demographic, social-behavioral, or epidemiological parameter values. we mention and discuss their relevant features in this manuscript, whenever applicable. for information about the additional models, the reader is referred to [86, 88, 89]. there also exists a body of literature evaluating fewer than three types of mitigation strategies, which was not considered as part of the review, as we discussed in the methods section.
this literature is valuable in providing insights about reproduction patterns [90, 91], the effect of cross-immunity [92], antiviral resistance [93], vaccine dosage [94, 95], social distancing [96], and public health interventions in previous pandemics [97, 98]. though the literature on pandemic models is rich and contains analysis and results that are valuable for public health preparedness, policy makers have raised several questions regarding the practical use of these models. the questions are as follows. is the data support adequate and valid? how credible and valid are the model assumptions? is human behavior represented appropriately in these models? how accessible and scalable are these models? this review paper attempts to determine to what extent the current literature addresses the above questions at provincial and local levels, and what the areas of possible enhancement are. the findings with regards to the areas of enhancement are summarized below. enhance the following: availability of real-time epidemiological data; access and retrieval of demographical and epidemiological data; translation of data into model parameter values. we analyzed the most common epidemiological and demographical parameters used in pandemic models, and discussed the need for adequate updating of these parameters during an outbreak. as regards the epidemiological parameters, we have noted the need to obtain prompt and reliable estimates for the iar and r, which we believe can be obtained by enhancing protocols for expedited and representative specimen collection and testing. during a pandemic, the estimates for iar and r should also be obtained as often as possible to update simulation models. for the disease natural history and the fraction of asymptomatic cases, estimation should occur every time viral evolution is confirmed by the public health laboratories.
for periodic updating of the simulation models, there is a need to develop interfaces for (semi)automatic data access and retrieval. algorithms for translating data into model parameters should not delay model execution and decision making. demographic data are generally available, but most of the models that we examined are not capable of performing (semi)automatic access, retrieval, and translation of demographic data into model parameter values. examine validity of modeling assumptions from the point of view of the decisions that are supported by the model. by referring to two of the most commonly cited pandemic preparedness models [15, 27], we discussed how simplifying model assumptions are made to reduce computational burden, as long as the assumptions do not interfere with the performance evaluation of the mitigation strategies. some mitigation strategies require more realistic model assumptions (e.g., location-based antiviral prophylaxis would require models that track geographic coordinates of individuals, so that those within a radius of an infected individual can be identified), whereas other mitigation strategies might be well supported by coarser models (e.g., antiviral prophylaxis for household members would only require models that track household membership). therefore, whenever validity of the modeling assumptions is examined, the criteria chosen for the examination should depend on the decisions supported by the model. incorporate the following: a broader spectrum of social-behavioral aspects; updated information for contact patterns; new methodologies for collection of human mobility data. some of the social-behavioral factors that have been considered in the examined models are social distancing and vaccination compliance, natural withdrawal from work when symptoms appear, and work absenteeism to care for sick family members.
although some of the examined models attempt to capture social-behavioral issues, they appear to lack adequate consideration of many other factors (e.g., voluntary vaccination, voluntary avoidance of travel to affected regions). hence, there is a need for research studies or expert opinion analysis to identify which social-behavioral factors are significant for disease spread. it is also essential to determine how the social-behavioral data should be collected (in real-time or through surveys), archived for easy access, retrieved, and translated into model parameters. in addition, operational models for pandemic spread and mitigation should reflect the state of the art in data for the contact parameters and integrate recent methodologies for collection of human mobility data. enhance computational efficiency of the solution algorithms. our review indicates that some of the models have reached a reasonable running time of up to 6 h per replicate for a large region, such as the entire usa [14, 23]. however, operational models also need to be set up and replicated in real-time, and methodologies addressing these two issues are needed. we have also discussed the question of whether public health decision makers should be burdened with the task of downloading and running models on local computers (laptops). this task can be far more complex than how it is perceived by the public health decision makers. we believe that models should be housed in a cyber computing environment with an easy user interface for the decision makers. additional file 1: table s1 epidemiological parameters in models for pandemic influenza preparedness. the excel sheet "additional file 1: table s1" shows the epidemiological parameters most commonly used in the models for pandemic influenza, the parameter data sources, and the means for access, retrieval, and translation. additional file 1: table s2 demographic parameters in models for pandemic influenza preparedness.
the excel sheet "additional file 1: table s2 " shows the demographic parameters most commonly used in the models for pandemic influenza, the parameter data sources, and the means for access, retrieval and translation. additional file 1: table s3 accessibility and scalability features investigated in the models. the excel sheet "additional file 3" shows the different models examined, together with their type of public access, number and running time per replicate, and techniques to manage computational burden. use of computer modeling for emergency preparedness functions by local and state health official: a needs assessment cdc's new preparedness modeling initiative: beyond (and before) crisis response interim pre-pandemic planning guidance: community strategy for pandemic influenza mitigation in the united states modeling community containment for pandemic influenza: a letter report m bd: recommendations for modeling disaster responses in public health and medicine: a position paper of the society for medical decision making. 
med decision making yale new haven center for emergency preparedness and disaster responsem, and us northern command: study to determine the requirements for an operational epidemiological modeling process in support of decision making during disaster medical and public health response operations national household travel survey (nths) viii censimento generale della popolazione e delle abitazioni real-time database and information systems: research advance how generation intervals shape the relationship between growth rates and reproduction numbers concepts of transmission and dynamics strategies for mitigating an influenza pandemic mitigation strategies for pandemic influenza in the united states heterogeneity and network structure in the dynamics of diffusion: comparing agent-based and differential equation models the role of population heterogeneity and human mobility in the spread of pandemic influenza potential for a global dynamic of influenza a (h1n1) the global transmission and control of influenza multiscale mobility networks and the spatial spreading of infectious diseases modeling human mobility responses to the large-scale spreading of infectious diseases human mobility networks, travel restrictions, and the global spread of 2009 h1n1 pandemic flute, a publicly available stochastic influenza epidemic simulation model targeted social distancing design for pandemic influenza a large scale simulation model for assessment of societal risk and development of dynamic mitigation strategies a predictive decision aid methodology for dynamic mitigation of influenza pandemics: special issue on optimization in disaster relief strategies for containing an emerging influenza pandemic in southeast asia modeling targeted layered containment of an influenza pandemic in the united states reducing the impact of the next influenza pandemic using household-based public health interventions scalia tomba img: mitigation measures for pandemic influenza in italy: an individual 
the authors wish to thank doctor lillian stark, virology administrator of the bureau of laboratories in tampa, florida, for providing valuable information on the problems faced by the laboratory during the h1n1 pandemics. the authors also wish to thank the reviewers of this manuscript for providing valuable suggestions and reference material. we appreciate the support of dayna martinez, a doctoral student at usf, in providing some literature information on social-behavioral aspects of pandemic influenza.
authors' contributions: dp conducted the systematic review and analysis of the models. td and as guided dp and au in designing the conceptual framework for the review. all three jointly wrote the manuscript. ri and sm provided public health expert opinion on the conceptual framework and also reviewed the manuscript. all authors read and approved the final manuscript. the authors declare that they have no competing interests.

key: cord-339374-2hxnez28 authors: de kort, hanne; baguette, michel; lenoir, jonathan; stevens, virginie m. title: toward reliable habitat suitability and accessibility models in an era of multiple environmental stressors date: 2020-09-22 journal: ecol evol doi: 10.1002/ece3.6753 sha: doc_id: 339374 cord_uid: 2hxnez28

global biodiversity declines, largely driven by climate and land-use changes, urge the development of transparent guidelines for effective conservation strategies. species distribution modeling (sdm) is a widely used approach for predicting potential shifts in species distributions, which can in turn support ecological conservation where environmental change is expected to impact population and community dynamics. improvements in sdm accuracy through incorporating intra- and interspecific processes have boosted the sdm field forward, but simultaneously urge harmonizing the vast array of sdm approaches into an overarching, widely adoptable, and scientifically justified sdm framework. in this review, we first discuss how climate warming and land-use change interact to govern population dynamics and species' distributions, depending on species' dispersal and evolutionary abilities. we particularly emphasize that both land-use and climate change can reduce the accessibility to suitable habitat for many species, rendering the ability of species to colonize new habitat and to exchange genetic variation a crucial yet poorly implemented component of sdm.
we then unite existing methodological sdm practices that aim to increase model accuracy through accounting for multiple global change stressors, dispersal, or evolution, while shifting our focus to model feasibility. we finally propose a roadmap harmonizing model accuracy and feasibility, applicable to both common and rare species, particularly those with poor dispersal abilities. this roadmap (a) paves the way for an overarching sdm framework allowing comparison and synthesis of different sdm studies and (b) could advance sdm to a level that allows systematic integration of sdm outcomes into effective conservation plans.

biodiversity is under threat across the globe, and its preservation requires transparent and effective guidelines for policy, conservation practitioners, and educators based on realistic assessments of biodiversity-environment relations (elith & leathwick, 2009; kok et al., 2017; pereira et al., 2010; titeux, henle, mihoub, & brotons, 2016). species distribution modeling (sdm) has been a popular toolbox for studying relationships between environmental change stressors and spatial shifts in species' suitable habitat and for predicting the potential distribution of single species, communities, and ecosystems through user-defined environmental change scenarios (alvarado-serrano & knowles, 2014; peterson, 2003; weber, stevens, diniz-filho, & grelle, 2017). the overall sdm framework is not just an interesting tool for identifying areas of local conservation concern or areas not yet occupied but potentially suitable; it has the potential to contribute substantially to the global protection of biodiversity and ecosystem services threatened by multiple environmental stressors, including land-use change and habitat fragmentation, climate change, invasive alien species, pollution, and overexploitation (franklin, 2013; kok et al., 2017; wiens, stralberg, jongsomjit, howell, & snyder, 2009).
ignoring the joint effects of multiple environmental stressors can be highly misleading; they have been found to give rise to ecological outcomes unpredicted by single environmental stressors (bellard, leclerc, & courchamp, 2015; fournier, barbet-massin, rome, & courchamp, 2017; guo, lenoir, & bonebrake, 2018; marshall et al., 2017; peterson & nakazawa, 2007; segan, murray, & watson, 2016; visconti et al., 2016) and can trigger evolutionary responses that differ from expectations assumed by single stressor evolution (kelly, debiasse, villela, roberts, & cecola, 2016; mcclanahan, graham, & darling, 2014). accounting for multiple environmental changes and potential evolutionary responses can greatly improve model accuracy (bellard et al., 2015; titeux et al., 2016), providing sdm outcomes that better represent the potential distribution and ultimately the occupied distributional area of the species or community under study (figure 1). indeed, part of the suitable habitat of the potential distribution is often not reachable due to dispersal limitation and spatially variable habitat connectivity (e.g., physical barriers), sizing the potential distribution down to the accessible habitat (barve et al., 2011; peterson, papeş, & soberón, 2015; pulliam, 2000). where dispersal of the focal species strongly depends on the distribution and dispersal of interacting species, the biotic context is an additional determinant of the occupied distributional area of the focal species (figure 1). finally, decreased connectivity among suitable habitat patches may promote evolution toward reduced dispersal (cote et al., 2017; graae et al., 2018), further reducing the size of the occupied distributional area (figure 1). in this review, we pinpoint how joint environmental changes drive population, species, and community dynamics in comparison with single stressors, focusing on climate and land-use change as two of the most prominent threats to biodiversity.
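the layered filtering described above (suitable habitat, then accessible habitat, then biotic context) can be sketched as boolean habitat layers whose cell-wise intersection yields the predicted occupied area; the 3x3 grids below are invented toy data, not the paper's model.

```python
# illustrative sketch (invented data): the occupied distributional area
# as the intersection of suitability, accessibility, and biotic context,
# following the layered logic described for figure 1.

def occupied_area(suitable, accessible, host_present):
    """combine per-cell 0/1 layers into the predicted occupied area."""
    return [
        [s and a and h for s, a, h in zip(srow, arow, hrow)]
        for srow, arow, hrow in zip(suitable, accessible, host_present)
    ]

# toy 3x3 landscape: 1 = condition met, 0 = not met
suitable     = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]
accessible   = [[1, 0, 0], [1, 1, 0], [0, 0, 1]]   # e.g. within dispersal reach
host_present = [[1, 1, 1], [0, 1, 1], [1, 1, 1]]   # biotic context (host plant)

occ = occupied_area(suitable, accessible, host_present)
# only cells passing all three filters remain occupied
```

dropping any one layer (e.g. accessibility) visibly inflates the predicted area, which is exactly the overestimation risk the review warns about.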
predominantly driven by the ongoing biodiversity crisis, a universal urge for large-scale land-use restoration, and the existence of user-friendly implementation tools, scientists increasingly study the impacts of both climate- and land-use change on biodiversity redistribution using sdm (araújo et al., 2019; harrison & gassner, 2020; milanesi, rocca, & robinson, 2020; titeux et al., 2016). this ongoing sdm boost opens the door for comparing and synthesizing published sdm studies for answering taxon-wide and large-scale research questions, including the role of species' traits and evolutionary potential in driving general species distribution shifts in response to land-use change. the lack of an overarching, streamlined, and widely adoptable sdm framework containing all tools required for model preparation, parameterization, and selection, however, likely keeps sdm from reaching its full potential (araújo et al., 2019).

figure 1: schematic summary of the elements contributing to species distribution modeling (sdm) reliability and utility (a), the eco-evolutionary processes underlying distribution dynamics and sdm performance (b), and a hypothetical representation of the predicted occupied distributional area of a species (phengaris arion) that relies on the presence of host plants (c). colors represent the accessible and suitable habitat (purple), habitat unsuitable due to the predicted absence of host plant (gray), habitat with host plant but inaccessible (dark purple), inaccessible habitat without host plant (dark red), and environmentally unsuitable habitat (white background). patches occupied by larger butterflies (representing better dispersers) are predicted to be accessible due to dispersal evolution (after de kort, prunier, et al., 2018).
moreover, a standardized and science-based sdm toolbox would pave the way for practitioners to more systematically use sdm for conservation decision making while being aware of the potential pitfalls (box 1).

box 1: the main outcomes (predicted range maps) of the whole sdm framework are frequently misleading and sometimes misused (see carlson, chipperfield, benito, telford, & o'hara, 2020) due to several main sources of uncertainty that the user, especially the beginner user, needs to have in mind when performing sdm. first, the correlative nature of sdm inherently prevents this toolbox from building on causal relationships between environmental input variables and the occurrence of a species. this causality limitation is unavoidable because sdm inevitably requires predictors able to reflect or capture presumed causal mechanisms from natural history knowledge. the lack of information about the causal link between the predictor variables and the response variable, usually a binary variable representing species presence-absence or presence-background data, is clearly the main limitation that the user needs to constantly keep in mind when running sdm. for instance, every single sdm method, and especially the most advanced ones such as machine learning methods, will almost always be able to find a statistical link between spatially structured predictors and the response variable used to measure species distribution. even when predictors are known to be completely unrelated to the species distribution, such as the caricatural but very eloquent use of paintings to predict species distributions (fourcade, besnard, & secondi, 2018), sdm will allow the user to draw nice-looking maps with high prediction accuracy according to the metrics used by state-of-the-art sdm methods. this suggests another important limitation hampering the use of sdm techniques, which is the misuse of, or overconfidence in, metrics used to measure the performance of sdm outcomes (lobo, jiménez-valverde, & real, 2008).
indeed, typical sdm model performance metrics such as the area under the receiver operating characteristic curve (auc), the kappa statistic, and the true skill statistic (tss) have been heavily criticized, as they tend to overpredict model performance under the influence of sample prevalence, consequently compromising model validation accuracy and model comparability (e.g., leroy et al., 2018; morán-ordóñez, lahoz-monfort, elith, & wintle, 2017). hence, the reason why these metrics are misleading measures of the performance of predictive distribution models partly relates to one last limitation to have in mind when running sdm, which is the data quality of the response variable. unless reliable and ground-truth distribution data are available, such model validation parameters are extremely questionable. in situ validation of the presence or abundance data at locations for a range of predicted probability estimates therefore is highly recommended for an unbiased perspective of model performance. the final but equally important issue regarding data quality revolves around absence data. all the metrics used to measure the performance of sdm outcomes somehow rely on absence information, whether obtained from field observations or from a random selection strategy of background data, also known as pseudo-absences. clearly, collection of field-validated true absence data across the species range is highly encouraged over pseudo-absence selection strategies to improve sdm reliability (leroy et al., 2018; lobo, 2016). the sampling strategy used to select pseudo-absences, such as considerations on the spatial extent to be used, is known to highly influence sdm outcomes (barbet-massin, jiguet, albert, & thuiller, 2012; lobo, jiménez-valverde, & hortal, 2010; van der wal, shoo, graham, & williams, 2009).
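the prevalence problem raised above can be made concrete with a small sketch (all data invented): raw accuracy rewards a no-skill model when presences are rare, while the tss, defined as sensitivity + specificity - 1, does not.

```python
# illustrative sketch (invented data): why prevalence makes raw accuracy
# misleading for sdm validation, and how the true skill statistic (tss)
# exposes a no-skill model.

def confusion(obs, pred):
    """counts of true/false presences and absences."""
    tp = sum(o == 1 and p == 1 for o, p in zip(obs, pred))
    fn = sum(o == 1 and p == 0 for o, p in zip(obs, pred))
    tn = sum(o == 0 and p == 0 for o, p in zip(obs, pred))
    fp = sum(o == 0 and p == 1 for o, p in zip(obs, pred))
    return tp, fn, tn, fp

def tss(obs, pred):
    """true skill statistic = sensitivity + specificity - 1."""
    tp, fn, tn, fp = confusion(obs, pred)
    return tp / (tp + fn) + tn / (tn + fp) - 1

obs = [1, 1] + [0] * 18            # low prevalence: 2 presences in 20 sites
pred = [0] * 20                    # a model predicting "absent" everywhere
accuracy = sum(o == p for o, p in zip(obs, pred)) / len(obs)
# accuracy is 0.9 despite zero skill; tss for the same predictions is 0.0
```

auc suffers from a related, threshold-independent version of the same prevalence sensitivity, which is why the review recommends ground-truthed validation data rather than relying on these summary scores.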
even absence data recorded during field surveys are tricky to use and should be carefully handled depending on the aim of the undertaken sdm exercise (hattab et al., 2017): mapping the occupied distributional area or modeling the potential distribution. if the user aims at mapping the occupied distributional area, which is more or less a spatial interpolation of the suitable and accessible locations for the focal species, then all absence data recorded during field surveys are useful for model calibration and validation: whether it represents true environmental absences or dispersal-limited absences reflecting sites out of dispersal reach but potentially suitable regarding the abiotic and biotic conditions. in such cases, absence data bearing dispersal limitation information are very important and should be incorporated in the sdm framework together with predictor variables reflecting dispersal constraints, such as a species-specific dispersal kernel, so as to improve the mapping of the occupied distributional area (lobo et al., 2010; meentemeyer, anacker, mark, & rizzo, 2008; václavík & meentemeyer, 2009). however, if the user aims at modeling the potential distribution, which implies spatial extrapolations beyond the actual occupied distributional area, then it is recommended to exclude dispersal-limited absences from both model calibration and validation (hattab et al., 2017). indeed, using dispersal-limited absences to validate the potential distribution predicted from sdm will inevitably and mathematically lead to low auc or tss values, which results in models misclassified as having poor predictive performance. hence, one needs to carefully think about the meaning of the available absence data and how to use it for sdm calibration and validation steps. modeling the potential distribution or mapping the occupied distributional area are two very different sdm exercises requiring a different thinking on the use of absence data (hattab et al., 2017).
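the distinction between absence types can be sketched as a pseudo-absence sampler (the function, the dispersal-reach threshold, and all coordinates are invented for illustration): when the aim is the potential distribution, candidate background points lying beyond dispersal reach of any presence, i.e. dispersal-limited absences, are excluded.

```python
# hedged sketch of a pseudo-absence selection strategy: draw random
# background points within the study extent and, when modeling the
# potential distribution, drop candidates that merely lie beyond
# dispersal reach of every known presence.
import random

def pseudo_absences(presences, extent, n, dispersal_reach=None, seed=0):
    rng = random.Random(seed)
    (xmin, xmax), (ymin, ymax) = extent
    out = []
    while len(out) < n:
        pt = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        if pt in presences:
            continue
        if dispersal_reach is not None:
            # distance to the nearest presence; beyond reach means a
            # dispersal-limited candidate, excluded for potential-
            # distribution modeling
            nearest = min(((pt[0] - px) ** 2 + (pt[1] - py) ** 2) ** 0.5
                          for px, py in presences)
            if nearest > dispersal_reach:
                continue
        out.append(pt)
    return out

pres = [(2.0, 2.0), (8.0, 8.0)]
bg = pseudo_absences(pres, extent=((0, 10), (0, 10)), n=50, dispersal_reach=4.0)
```

setting `dispersal_reach=None` recovers the plain background-sampling case appropriate when mapping the occupied distributional area, where dispersal-limited absences carry useful information.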
strikingly, however, while land use has begun its debut into sdm, scientists do not generally quantify or map the amount of habitat that will likely be colonized by the species under study. to meet this concern, we specifically elaborate on the impacts of reduced habitat connectivity, a major side effect of various environmental changes, on sdm outcomes. we also investigate how dispersal is limited both by the effects of the landscape structure and configuration, and by changes in biotic interactions within communities. we then discuss why evolutionary processes should be incorporated to increase the reliability of sdm (bush et al., 2016), and finally propose an sdm roadmap that incorporates multiple environmental stressors, dispersal, and evolution into a feasible, more reliable, and streamlined modeling framework. we stress that even for data-insufficient systems, it is possible to incorporate the processes described above. for rare and poor-disperser species in particular, realistic scenarios that take into account the various drivers of population and range dynamics can have crucial conservation implications. an sdm roadmap could also motivate nonexperts in the modeling field and young scientists to apply sdm to their target species, thereby facilitating the implementation of sdm in applied sciences and conservation practices. we stress, however, that expert evaluation of any sdm study is vital to a correct interpretation of sdm outcomes (see box 1 for potential pitfalls). the majority of sdm studies still relies on a single environmental stressor (or group of related stressors), with bioclimatic stressors (e.g., warmer and dryer conditions) representing, by far, the most popular environmental predictor used to forecast species' distribution changes (titeux et al., 2016).
these studies observed drastic reductions in the modeled climatic suitability of currently occupied habitat for macroinvertebrates (up to 65%, domisch et al., 2013; parmesan et al., 1999), vertebrates (up to 80%, warren, wright, seifert, & shaffer, 2014), and plants (up to 90%, aguirre-gutiérrez, van treuren, hoekstra, & van hintum, 2017; kane et al., 2017). the effective impact of climate change on biodiversity goes beyond direct climate-occurrence relations, also involving disruption of habitat connectivity and of species interactions within communities (bertrand et al., 2016; garcia, cabeza, rahbek, & araujo, 2014; walther et al., 2002). first, through reducing the amount of suitable habitat, climate change increases isolation between remaining patches, consequently inhibiting gene flow across the landscape and impairing population dynamics (graae et al., 2018; inoue & berg, 2017; razgour et al., 2018). reduced gene flow increases local inbreeding risk and extinction rates, and reduces the exchange of adaptive variation (razgour et al., 2018; slatkin, 1987). second, climate change renders habitat more prone to alien species' invasions (bellard et al., 2013; hulme, 2017) and can alter community composition and ecosystem processes (carroll et al., 2015; garcía molinos et al., 2015; pearson et al., 2013; perring et al., 2016; sheldon, yang, & tewksbury, 2011; sunday, bates, & dulvy, 2012). global change may therefore not only alter species' distributions through direct abiotic environment-occurrence interactions but also indirectly through shaping the biotic context (e.g., prey, competitors, and pollinators) (carroll et al., 2015; gonzález-varo et al., 2013; warren & bradford, 2014; wisz et al., 2013).
over half of the terrestrial, ice-free surface is transformed through ongoing land-use change and habitat loss and fragmentation, considerably adding to the impacts of climate change on terrestrial biodiversity (aguiar et al., 2016; hansen et al., 2013; newbold et al., 2015; reino, beja, araújo, dray, & segurado, 2013; titeux et al., 2016; vitousek, 1994, but see warren et al., 1999 for antagonistic effects). assuming unchanged (static) land use in sdm thus renders future projections questionable (ay, guillemot, martin-stpaul, doyen, & leadley, 2017; fournier et al., 2017; perring et al., 2016; titeux et al., 2016), yet despite the continuous recognition that land-use change constitutes a major threat to global biodiversity (perring et al., 2016; titeux et al., 2016), the translation of this awareness into sdm keeps lagging behind. given the extent and magnitude of climate and land-use change, their combined impact on natural ecosystems is expected to be complex (guo et al., 2018), urging for a transparent framework allowing the identification of areas susceptible to combined global change threats. co-occurrence of climate and land-use change has been shown to have interactive and often synergistic effects on biodiversity and species redistribution (jetz, wilcove, & dobson, 2007; marshall et al., 2017; pereira et al., 2010; visconti et al., 2016; zwiener et al., 2017). first, land-use change reinforces climate warming when it is associated with livestock breeding and deforestation, which considerably boost greenhouse gas emissions (naudts et al., 2016; reisinger & clark, 2017). second, land-use change increases the amount of suitable habitat edges that are sensitive to climate change due to the absence of a protective microclimate (brook, sodhi, & bradshaw, 2008; lembrechts, nijs, & lenoir, 2018; suggitt et al., 2020; vanneste et al., 2020).
typical examples are the increased risk of forest fires due to increased wind exposure of forest remnant edges and inward desiccation of natural grasslands due to conversion of surrounding habitat into intensively managed agricultural land (e.g., alencar, brando, asner, & putz, 2015; tuff, tuff, & davies, 2016). third, deforestation can accelerate the rate of upslope range shifts in tropical regions whereas the opposite has been demonstrated for higher latitudes (guo et al., 2018), indicating the confounding effects of land-use change on climate-driven range shifts. fourth, reduced gene flow across the remaining suitable landscape jeopardizes the exchange of adaptive genetic variation that could otherwise allow evolutionary adaptation to a changing climate. inhibited gene flow also compromises genetic diversity and fitness within remaining populations, increasing their vulnerability to environmental stressors (frankham, ballou, briscoe, & ballou, 2002; markert et al., 2010; schrieber & lachmuth, 2017). a global study assessing the interactive impacts of climate and habitat loss on vertebrate diversity predicted that nearly half of the ecoregions worldwide, mainly including tropical forest, savannahs, and wetlands, will be impacted by a synergy between habitat loss and climate change during the 21st century (segan et al., 2016). species distribution modeling studies increasingly include variables representing both climate and land-use change in their models and have found varying support for land-use change impacts on predicted species distributions, depending on the thematic resolution of the land-use variables and the species under study (brodie, 2016; marshall et al., 2017; martin, van dyck, dendoncker, & titeux, 2013; scheller & mladenoff, 2005; zamora-gutierrez, pearson, green, & jones, 2018).
with the availability of a 250-m resolution regional land-use map and iucn distribution maps for 286 mammal species, brodie (2016) was able to study the interactive effects of climate change and land-use change (through oil palm plantations) on mammal diversity in south-east asia. mammal species were found to be robust to changes in climate alone, but responded dramatically to the combined effects of climate and land-use change, with median habitat suitability losses of 47.5% (low carbon emission scenario) up to 67.7% (high emission). in a similar vein, sdm and conservation prioritization testing based on 2,255 woody plant species of the brazilian atlantic forest showed that contrasting climate change scenarios did not shape conservation prioritizations, while management strategies aiming to reduce habitat fragmentation were found to be indispensable for the long-term conservation of atlantic forest diversity (zwiener et al., 2017).

the potential distribution as predicted by sdm unconstrained by dispersal limitations is fundamentally different from the occupied distributional area, that is, the distribution actually occupied by the species, requiring the integration of species' dispersal abilities and dispersal barriers into sdm. in other words, the occupied distributional area of a given species (also referred to as the realized distribution) can be seen as a spatial interpolation of the suitable and accessible locations occupied by the focal species at a given moment in time: an instantaneous map of the real spatial occupation of the focal species. by contrast, the potential distribution can be interpreted as a spatial extrapolation, albeit not an environmental extrapolation beyond the species' environmental niche, of where the species could find suitable environmental conditions to occur if it would be able to reach that location, that is, unlimited by its own dispersal abilities or by dispersal barriers.
many studies reporting poleward or upward range shifts in the occupied distributional area also show that actual expansion rates of the studied organisms lag behind the displacement of their climatic envelopes (i.e., the potential distribution) (e.g., bertrand et al., 2011; bertrand et al., 2016), most likely due to dispersal and establishment lags at the leading edge (alexander et al., 2018). two nonexclusive factors can explain dispersal and establishment lags: (a) the displacements of individuals are slowed down by low habitat connectivity, and (b) individuals are struggling to settle viable populations in new habitat due to their dependence on biotic interactions. a basic understanding of these processes is required to assist with the implementation of dispersal-informed sdm (alexander et al., 2018). habitat connectivity describes how dispersal of individuals across the landscape is facilitated or impeded by landscape structure and configuration (taylor, fahrig, henein, & merriam, 1993). dispersal, that is, movements potentially leading to gene flow among populations (ronce, 2007), is thus key for species to track suitable habitat shifts (berg, julliard, & baguette, 2010). the study of dispersal in ecology and evolution has been a swiftly evolving field of investigation for almost two decades (bowler & benton, 2005), generating findings that are crucial for understanding the role of dispersal in sdm. changes in landscape structure and configuration entail high dispersal costs and hence strongly affect the fitness of dispersing individuals (bonte et al., 2012). accordingly, theoretical models predict that dispersal will be most generally counterselected if its costs increase along habitat fragmentation gradients and exceed its expected benefits (cote et al., 2017; duputie & massol, 2013; heino & hanski, 2001; mathias, kisdi, & olivieri, 2001; travis & dytham, 1999).
empirical studies confirm that habitat fragmentation can decrease dispersal propensity (the probability that an individual leaves a habitat patch) and increase dispersal efficiency by reducing dispersal costs either through a reduced search time and/or through the selection of safer dispersal routes (baguette, blanchet, legrand, stevens, & turlure, 2013; baguette & van dyck, 2007). however, under particular conditions of landscape configuration and habitat suitability, theory also predicts the emergence of dispersal polymorphism within populations, in which high dispersal phenotypes with a generalist strategy coexist with low dispersal specialist phenotypes (mathias et al., 2001). taken together, habitat connectivity can have various ecological and evolutionary consequences, and fully ignoring this important component of the accessibility to suitable habitat may therefore strongly interfere with sdm outcomes. to increase our understanding of the mismatch between the potential distribution (as modeled through sdm unconstrained by dispersal limitations or habitat connectivity) and the effectively occupied distributional area, the integration of dispersal and habitat connectivity metrics into sdm studies was put forward as a priority a decade ago (araújo & guisan, 2006; hirzel & le lay, 2008). although a growing number of studies discussed or implemented the effects of dispersal in sdm, the proportion of sdm studies implementing dispersal has remained steady for years (bateman, murphy, reside, mokany, & vanderwal, 2013; holloway & miller, 2017). one of the most striking examples of the spatial lapse between potential distribution and the occupied distributional area was provided by an sdm study on bornean orangutans (struebig et al., 2015). the authors predicted a loss of ca. 74% (within current range) and 84% (outside current range) of the potential orangutan distribution by the 2080s due to both climate change and direct habitat loss.
however, given the sedentary lifestyle of the females, it is unlikely that the species would shift its distribution toward all suitable habitat (not all suitable habitat is accessible, even within the current range) with the predicted pace of environmental change. hence, conservation corridors or assisted translocation is required to merge suitable and accessible distributions and to ensure long-term persistence of this endangered species (struebig et al., 2015). such extreme examples of jeopardized dispersal clearly show the urgency of a paradigm shift in conservation biology distinguishing suitable from accessible. in the same vein, albert, rayfield, dumitru, and gonzalez (2017) evidenced that accounting for connectivity in spatial prioritization of protected areas for 14 focal vertebrate species strongly modified conservation priorities. how climate change interacts with landscape features to affect dispersal is another key question, and the fine-scale consequences of the interplay of climate and land-use change on the spatial distribution of suitable habitat may explain dispersal lagging behind climate change (lembrechts et al., 2018). mestre, risk, mira, beja, and pita (2017) accordingly showed that range shift predictions consecutive to climate alterations were overly optimistic when using sdm disregarding habitat connectivity. in a very interesting study, fordham et al. (2017) showed that models incorporating both habitat connectivity and climate suitability provided better predictions of the range shifts observed between 1970 and 2017 for 20 british bird species than do models based on climate suitability changes alone. such integrated modeling scenarios could be greatly simplified by translating climate changes directly into habitat connectivity changes after assessing how changes in climatic conditions modify both habitat suitability and resistance to individual movements (see also inoue & berg, 2017; razgour et al., 2017).
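the idea of expressing environmental change as resistance to individual movements can be sketched with a least-cost path over a resistance grid (grid values and the 4-neighbour move rule are invented toy choices); dedicated connectivity tools implement far richer versions of this.

```python
# illustrative least-cost connectivity sketch: climate or land-use
# change is expressed as a per-cell movement resistance, and the
# accessibility between two patches is scored by the cheapest path.
import heapq

def least_cost(resistance, start, goal):
    """dijkstra over a grid; entering a cell costs its resistance."""
    rows, cols = len(resistance), len(resistance[0])
    dist = {start: 0}
    pq = [(0, start)]
    while pq:
        d, (r, c) = heapq.heappop(pq)
        if (r, c) == goal:
            return d
        if d > dist.get((r, c), float("inf")):
            continue  # stale queue entry
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + resistance[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    heapq.heappush(pq, (nd, (nr, nc)))
    return float("inf")

# a low-resistance corridor (1s) through a hostile matrix (9s)
grid = [[1, 9, 9],
        [1, 1, 9],
        [9, 1, 1]]
cost = least_cost(grid, (0, 0), (2, 2))
```

re-deriving the resistance grid from projected climate and land-use layers, and recomputing such costs, is one simple way to let a single model drive both suitability and connectivity as the review suggests.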
the end point of this procedure is a single model that incorporates the effects of climate change as one of the drivers of changes in habitat suitability and connectivity. accordingly, this parsimonious approach is biologically more relevant than the production of competing models that consider the effects of either climate change or land-use change on the accessibility of suitable habitat, when these two factors are clearly not independent (alexander et al., 2018). in general, the effective dispersal of specialists depends on the dispersal abilities of their least dispersive interactor (see also supporting information s1). for instance, the expansion rate of specialist butterflies to accessible and suitable habitat patches under contrasted scenarios of habitat connectivity has been shown to be severely curbed by the dispersal rates of their only host plants (de kort, prunier, et al., 2018; schweiger, settele, kudrna, klotz, & kühn, 2008). conversely, the dispersal lag between a weakly mobile parasitoid and its highly dispersive butterfly host provides the latter with enemy-free accessible and suitable habitat patches in highly fragmented habitats (bergerot et al., 2010). the vast majority of sdm studies implicitly consider the relationship between an organism's ability to survive and the environmental conditions as fixed in time and space. however, the probability for a species to overcome environmental changes increases, by definition, with its adaptive potential. the ecological niche of a given species can thus evolve (visser, 2008; wasof et al., 2013), and local adaptation can be seen as an alternative or complementary solution to dispersal under changing environmental conditions (graae et al., 2018). correspondingly, phenological shifts have been reported for a variety of organisms, ranging from plants (e.g., franks, sim, & weis, 2007) through birds (e.g., charmantier & gienapp, 2014) to arthropods (e.g., van asch, salis, holleman, van lith, & visser, 2013).
other traits, including color morphs (karell, ahola, karstinen, valkama, & brommer, 2011), dispersal ability (travis et al., 2013) and the thermal niche (rolland et al., 2018), have been shown to evolve under climate change. the key elements to phenotypic evolution, and thus to the evolution of the ecological niche under climate change, are local additive genetic variance (underlying interindividual variation), life history (affecting both the pace of natural evolution and the feedbacks among traits), and the interplay between dispersal and landscape structure (balancing genetic mixing and drift). evolutionary potential may thus differ considerably within and between species. using a dynamic eco-evolutionary model coupled to correlative niche projections, cotto et al. (2017) showed that evolutionary adaptation is unlikely to prevent the predicted range contraction of long-lived perennial alpine plant species under predicted climate change. in contrast, evolutionary rescue was reported for insects with short generation times and fast growth (kearney, porter, williams, ritchie, & hoffmann, 2009). similarly, using a hybrid sdm approach that incorporates dispersal and evolutionary dynamics into correlative sdm (see supporting information s3), bush et al. (2016) showed that evolutionary adaptation reduces projected range losses by up to 33% for 17 species of drosophila. although evolution will probably not be adequate to ensure population persistence for most species under the current pace of climate change, slowing down the pace of climate change is expected to promote evolutionary rescue (cotto et al., 2017). evolution of phenotypic traits, including dispersal evolution (see supporting information s2), thus plays a major role in shaping and conserving the potential distribution of many species. environmental predictor maps (figure 2.1) are often freely downloadable and can be merged with occurrence points using a basic gis application.
global climate data are widely available for researchers to model past, current, and future climate projections (e.g., worldclim and chelsa), rendering climate niche modeling an attractive approach for global change research relative to land-use modeling. yet regional and global land-use maps have also become accessible (e.g., globcover, modis2005, corine) and can be converted from a vector to an sdm-friendly raster format using any geographical information system (gis rasterizing) to model land use and habitat connectivity. moreover, while the short-term nature of land-use maps and land-use change may complicate land-use change sdm, simple land-use change scenarios could be tested in a sensitivity analysis for a given near-future climate change scenario. where land cover maps generally lack thematic resolution beyond the dominant land cover types (e.g., "grasslands" and "urban area"), there are several ways to achieve more ecologically relevant land-use mapping. information on the protection level of areas across the globe (e.g., through the protected area network) (kremen et al., 2008) can be merged with a land cover map to indicate the level of management of specific land cover types (rodrigues et al., 2004). grasslands and forests are, for example, more likely to be intensively managed or exploited when unprotected. alternatively, human population density maps and road maps can be used to fine-tune the intensity of land disturbance (e.g., newbold et al., 2015), while a forest cover map can be integrated to model high-resolution forest change for species strongly depending on the absence or presence of forests (e.g., hansen et al., 2013). finally, the inclusion of microclimatic variables such as topography not only increases the spatial resolution and accuracy of sdm predictions, but has also been demonstrated to play an underappreciated role in shaping the trajectories of species evolution and redistribution (de kort et al., 2020; suggitt et al., 2020).
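merging occurrence points with gridded predictor maps boils down to indexing the raster cell under each coordinate. a minimal sketch of that lookup, where the grid layout, cell size, and origin convention are illustrative assumptions rather than any particular gis library's api:

```python
# Minimal sketch: extract predictor values at occurrence coordinates from a
# gridded map, mimicking the "merge occurrences with predictor rasters" step.
# Grid layout, cell size, and origin are illustrative assumptions.

def sample_raster(grid, origin_x, origin_y, cell_size, points):
    """Return the grid value under each (x, y) occurrence point.

    `grid` is a list of rows ordered from the top (maximum y) downward,
    as in most GIS raster exports.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    values = []
    for x, y in points:
        col = int((x - origin_x) // cell_size)
        row = int((origin_y - y) // cell_size)  # y decreases down the rows
        if 0 <= row < n_rows and 0 <= col < n_cols:
            values.append(grid[row][col])
        else:
            values.append(None)  # point falls outside the raster extent
    return values

# toy 3x3 temperature raster with 1-degree cells, top-left corner at (0, 3)
temperature = [[10, 11, 12],
               [13, 14, 15],
               [16, 17, 18]]
print(sample_raster(temperature, 0.0, 3.0, 1.0, [(0.5, 2.5), (2.5, 0.5)]))
```

in practice this step is delegated to a gis or raster library, which additionally handles projections and no-data cells; the indexing logic, however, is the same.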
(see also box 1). when soil water availability, for example, has been suggested to be deterministic for the presence of the species under study, a wetness index map or equivalent should evidently be incorporated during modeling. although the technical aspects underlying model evaluation lie beyond the scope of this review (see, e.g., aiello-lammens, boria, radosavljevic, vilela, & anderson, 2015; muscarella, galante, soley-guardia, & boria, 2014; phillips & dudík, 2008; radosavljevic & anderson, 2014, for technical sdm considerations), we highlight the importance of model tuning and parameterization as a key criterion for realistic modeling (box 1). the predictive performance of sdms is usually evaluated through cross-validation, using training data to fit models and a set of testing data that is spatially independent of the training data to evaluate these models (hijmans, 2012). this model evaluation approach is, however, highly sensitive to spatial autocorrelation of environmental variables between training and testing data, and only tests the set of known occurrence data. these drawbacks artificially inflate cross-validated model statistics (hijmans, 2012; morán-ordóñez et al., 2017) and stress the importance of in situ validation (see below). 
[figure 2. circular roadmap presenting the necessary steps toward science-based species distribution modeling (sdm) that is feasible for most species and implementers, with the emphasis on poor dispersers as a particularly vulnerable and valuable group of species. note that step 4 has been given priority over step 5 because of its direct conservation implications. step 7 represents the completion of the suitable and accessible species distribution as predicted by the sdm framework. see figure 1 for the color scheme of the hypothetical distribution. we refer to the roadmap steps in the main text using bold references (e.g., figure 2.2 refers to step 2 of the roadmap), where specific directions are provided for the respective steps. please carefully read box 1 to understand the importance of the "further considerations" presented in the blue rectangle of figure 2.] 
we recommend sdm users to consistently model at least three scenarios: climate only, land use only, and climate and land use together, in addition to a full model that incorporates potential additional predictors such as soil moisture and biotic interactions (figure 2.3). such a standardized sdm framework allows (a) addressing model complexity issues (see brun et al., 2020; gregr, palacios, thompson, & chan, 2019) and (b) comparing the contribution of major environmental change stressors to predicted shifts in species distributions between independent sdm studies. although more efforts are needed for true cross-species and cross-study comparison of sdm outcomes (e.g., the development of an overarching, decision-tree-based sdm environment that systematically informs users about parameter and predictor choice, and more detailed land-use scenarios), we emphasize the promise that standardizing sdm holds (a) for uncovering overall effects of dispersal, evolution, life history, and anthropogenic stressors on the projected distribution of many species and communities, and (b) for framing each study into a much larger and significant context. integrating evolution into sdm without assumptions on mutation and demographic rates (supporting information s3), which can be particularly challenging for rare species, can be achieved if the evolutionary potential of populations across the study area is quantified (gotelli & stanton-geddes, 2015; ikeda et al., 2017). quantitative genetic screening of phenotypic traits has long been thought to provide the most accurate information on the evolutionary potential of traits and life-history syndromes.
such quantitative genetic research is now, however, questioned due to drawbacks related to limited statistical power, high time consumption, unrealistic assumptions on the genetic architecture underlying adaptive traits, and poor representation of natural conditions (hoffmann, sgrò, & kristensen, 2017; wood, yates, & fraser, 2016). more recently, the use of genomic markers representing neutral and/or adaptive genetic variation has been proposed and tested for modeling local adaptation and adaptive potential in sdm (fitzpatrick & keller, 2015; ikeda et al., 2017; marcer, méndez-vigo, alonso-blanco, & picó, 2016). as a consequence of local adaptation, many populations develop a local genetic signature shaped by historical and current environmental conditions (watanabe & monaghan, 2017). variation in population responses to environmental change is thus expected, and has been shown, to affect the ability of sdms that do not incorporate genetic structure to predict future species' distributions (ikeda et al., 2017; marcer et al., 2016). in addition to the implementation of neutral genetic structure, explicitly incorporating genetic variation underlying adaptive traits can be achieved through ecology-informed genome screening (e.g., de kort & honnay, 2017). as phenotypic traits respond to environmental change through shifts in the underlying genes, associations between genetic and environmental variation are assumed to result from local adaptation (manel, schwartz, luikart, & taberlet, 2003; storfer et al., 2007). even without a sequenced reference genome, landscape genomic analysis of genetic markers, allowing the identification of genetic patterns associated with environmental variation (e.g., climate or land use), is a feasible strategy for any species (frichot, schoville, bouchard, & françois, 2013; manel & holderegger, 2013; rellstab, gugerli, eckert, hancock, & holderegger, 2015; sork et al., 2016).
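the landscape genomic screening described above can be caricatured as a per-locus association scan: allele frequencies are correlated with an environmental gradient, and strongly associated loci are flagged as candidate adaptive markers. this sketch uses a plain correlation and an arbitrary threshold; real methods (e.g., the latent factor mixed models cited above) additionally correct for neutral population structure, so all values here are illustrative:

```python
# Simplified genotype-environment association scan: correlate per-population
# allele frequencies at each locus with an environmental gradient and flag
# strongly associated loci as candidate adaptive markers.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def candidate_loci(freqs_by_locus, environment, threshold=0.9):
    """Return indices of loci whose allele frequency correlates with the
    environmental variable beyond |r| >= threshold (threshold is arbitrary)."""
    return [i for i, freqs in enumerate(freqs_by_locus)
            if abs(pearson(freqs, environment)) >= threshold]

temperature = [1.0, 2.0, 3.0, 4.0]    # gradient across four populations
loci = [[0.1, 0.3, 0.6, 0.9],         # clinal: tracks temperature
        [0.5, 0.4, 0.5, 0.4]]         # no clear cline
print(candidate_loci(loci, temperature))
```

without a structure correction, such a scan confounds demographic history with selection, which is exactly why the dedicated landscape genomic tools exist.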
high variation at genetic variants associated with temperature or habitat fragmentation may therefore indicate a high potential to adapt to future climate and land-use change. while the integration of this adaptive potential into sdm is under development (e.g., peterson, doak, & morris, 2019), simple correlations between neutral and adaptive genetic diversity on the one hand, and sdm habitat suitability estimates on the other, indicate the extent to which adaptive evolution may affect sdm projections. high adaptive potential at the rear edge, for example, may prevent local extinctions despite a projected northward shift with climate change (erichsen et al., 2018; exposito-alonso et al., 2018). low genetic diversity at suitable locations, on the other hand, may be indicative of imminent local extinction and poor connectivity, and/or of relatively recent colonization after a period of reduced habitat suitability (e.g., gutiérrez-rodríguez, barbosa, & martínez-solano, 2017; de kort, baguette, et al., 2018; de kort, prunier, et al., 2018). conservation actions focusing on expanding or connecting sdm-based suitable patches holding populations of low genetic diversity may consequently increase options for dispersal and evolution. we therefore recommend conservation prioritization based on a combination of sdm and genetic marker assessment (figures 2.2 and 2.4).
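the recommended combination of sdm output and genetic marker assessment can be sketched as a simple prioritization rule: among patches predicted suitable, flag those whose populations show low genetic diversity as candidates for connectivity restoration. patch names, values, and both thresholds below are hypothetical:

```python
# Sketch of combined sdm + genetic-marker conservation prioritization:
# suitable patches with genetically depauperate populations are flagged
# for expansion or connection. Thresholds are illustrative.

def priority_patches(patches, min_suitability=0.6, max_diversity=0.3):
    """`patches` maps patch name -> (sdm suitability, genetic diversity,
    e.g. expected heterozygosity); returns suitable-but-depauperate patches."""
    return sorted(name for name, (suit, div) in patches.items()
                  if suit >= min_suitability and div <= max_diversity)

patches = {"bog_A":    (0.9, 0.10),   # suitable, genetically depauperate
           "bog_B":    (0.8, 0.45),   # suitable, diverse
           "meadow_C": (0.2, 0.05)}   # unsuitable anyway
print(priority_patches(patches))
```

the same two layers can of course feed a continuous ranking rather than hard thresholds; the point is only that neither layer alone identifies "bog_A" as the priority.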
although the collection of genetic material for genetic marker analysis may be particularly challenging for rare species, noninvasive sampling such as fecal, hair, and eggshell sampling may overcome this issue. depending on the scale of the study, land-use change scenarios (figure 2.5) could rely on local managers that are aware of ongoing land-use developments, on regional storylines related to demands of arable production, livestock numbers, urbanization, and/or international socioeconomic parameters (e.g., de kort, prunier, et al., 2018; dullinger et al., 2020; la sorte et al., 2017; marshall et al., 2018), or on global land-use scenarios as outlined by the intergovernmental panel on climate change (ipcc). land-use variables can subsequently be manipulated (e.g., partial conversion of extensive grasslands into forest, or of forest into built-up area) using basic gis applications to generate sdm scenarios integrating both climate change and habitat fragmentation (e.g., lehsten et al., 2015; marshall et al., 2018; martin et al., 2013). climate change scenarios have been outlined by the ipcc and are freely available at chelsa-climate.org and worldclim.org. at this point, all data are available for modeling, using dismo or another sdm framework (figure 2.5). models are first trained and tested using the occurrence points and (cropped) environmental maps, finally providing a habitat suitability map that reflects the present distribution. the model results are then extrapolated to predict future distributions under the provided scenarios. these projections do not account for dispersal and evolution, and require fine-tuning and contextualization based on species' life-history traits (through partial dispersal modeling, figure 2.6) and genetic markers (through discussion of the adaptive potential, figure 2.4). dispersal capacity, in turn, can be approximated from species' life-history traits (stevens et al., 2013).
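the manipulation of land-use variables mentioned above (e.g., conversion of extensive grasslands into forest) amounts to reclassifying a categorical land-cover grid. a minimal sketch, where class codes and the conversion rule are illustrative assumptions and, for simplicity, the whole class is converted rather than a partial fraction:

```python
# Sketch of building a simple land-use change scenario by reclassifying a
# categorical land-cover grid, in the spirit of a grassland-to-forest
# conversion done with basic GIS tools. Class codes are illustrative.

GRASSLAND, FOREST, URBAN = 1, 2, 3

def apply_scenario(grid, conversions):
    """Return a new grid with cell classes replaced according to the
    `conversions` mapping, e.g. {GRASSLAND: FOREST}; other cells unchanged."""
    return [[conversions.get(cell, cell) for cell in row] for row in grid]

land_cover = [[GRASSLAND, GRASSLAND, URBAN],
              [FOREST,    GRASSLAND, FOREST]]
afforestation = apply_scenario(land_cover, {GRASSLAND: FOREST})
print(afforestation)
```

feeding the original and the reclassified grid to the same fitted model then yields the paired present-day and scenario projections.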
these life-history traits can therefore be considered to define a partial dispersal sdm, approximating the occupied distributional area to a more accurate extent than an sdm assuming no dispersal (bateman et al., 2013). alternatively, it is possible to create a species-specific dispersal kernel and use it as a predictor variable capturing the impact of dispersal limitations on the occupied distributional area (hattab et al., 2017; meentemeyer et al., 2008; václavík & meentemeyer, 2009) (see box 1). even relatively simple partial dispersal models, where the potential distribution has been clipped down to the accessible distribution based on estimates of maximum dispersal distances, have been shown to improve distribution projections under environmental change (de kort, prunier, et al., 2018; fitzpatrick, gove, sanders, & dunn, 2008; meier et al., 2012; midgley et al., 2006). the ease of implementation and the limited number of assumptions related to demographic rates make this type of partial-dispersal sdm the preferred option for many species (bateman et al., 2013). the most reliable and informative partial dispersal sdms are expected for species with poor dispersal capacities, because poor dispersers (a) are often of high conservation concern, (b) facilitate the integration of dispersal into sdm through the assumption of limited dispersal between suitable patches (figure 2.6), and (c) provide conservative estimates of patch accessibility for associated species. although we do not specifically recommend focusing on poor dispersers, we do believe that this important target group should receive particular attention in future sdm studies aiming to develop dispersal- and evolution-informed conservation strategies. although model accuracy improves considerably in partial dispersal sdms, they still do not fully reflect real conditions.
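the clipping of the potential distribution to an accessible distribution based on a maximum dispersal distance can be sketched on a toy grid; the cells, occurrences, and distance limit below are all illustrative:

```python
# Minimal partial-dispersal sketch: keep only the suitable grid cells lying
# within a species' maximum dispersal distance of any occupied cell, turning
# the potential distribution into an accessible one.

def accessible(suitable, occupied, max_dist, cell_size=1.0):
    """Return the suitable (row, col) cells within `max_dist` of an
    occupied cell (euclidean distance between cell centres)."""
    out = set()
    for (r, c) in suitable:
        for (orow, ocol) in occupied:
            d = ((r - orow) ** 2 + (c - ocol) ** 2) ** 0.5 * cell_size
            if d <= max_dist:
                out.add((r, c))
                break
    return out

suitable_cells = {(0, 0), (0, 1), (0, 5), (4, 4)}
occupied_cells = {(0, 0)}
print(sorted(accessible(suitable_cells, occupied_cells, max_dist=2.0)))
```

iterating this clipping per projection time step, with newly colonized cells added to the occupied set, gives a crude time-explicit variant of the same idea.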
a more mechanistic approach, for example through hybrid models integrating both correlative and mechanistic principles, could further increase model reliability (see supporting information s3). in silico model parameterization and validation should be complemented with in situ model evaluation in unsampled regions, by extracting a set of suitable and unsuitable habitat coordinates from the model output and empirically evaluating occurrence in the field (araujo, pearson, thuiller, & erhard, 2005; de kort, prunier, et al., 2018) (figure 2.8). among the rare examples of studies using in situ sdm validation, williams et al. (2009) were able to find 24 new localities (out of 36 checkpoint sites) shared among four rare plant species across the rattlesnake creek terrane in california. the area under the curve (auc) of the receiver operating characteristic (hanley & mcneil, 1982), a commonly used model validation statistic retrieved from the cross-validation approach, ranged between 0.94 and 0.98, commonly interpreted as nearly perfect predictive performance (but see box 1 for pitfalls related to validation metrics such as the auc statistic). two important conclusions can be drawn from this study. first, there can be considerable inconsistency between in silico (simulated) and in situ (real) model validation, which may reflect (a) the drawbacks of in silico model validation methods and (b) the (in)accessibility of suitable habitat patches. second, sdm studies can be highly suitable for conservation purposes, provided that (a) validation is performed both in silico, using independent calibration and testing data, and in situ; (b) relevant environmental maps and scenarios are generated; and (c) dispersal and evolution are implemented.
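the auc statistic discussed above has a simple probabilistic reading (hanley & mcneil, 1982): it equals the probability that a randomly chosen presence receives a higher model score than a randomly chosen absence. a minimal sketch of that computation on hypothetical scores:

```python
# Minimal AUC computation via its Mann-Whitney interpretation: the fraction
# of presence/absence score pairs in which the presence outscores the
# absence, with ties counted as one half.

def auc(presence_scores, absence_scores):
    """Probability that a presence outscores an absence; ties count 0.5."""
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))

print(auc([0.9, 0.8, 0.4], [0.3, 0.5, 0.2]))  # high but imperfect discrimination
```

the pairwise reading also makes the pitfall concrete: auc measures ranking only, so a model can score near 1 while systematically over- or under-predicting suitability, and with presence-only pseudo-absences the statistic is not comparable across species or study extents.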
we finally recommend that projected shifts in distributions be followed up in situ at regular time intervals (figure 2.7) (see, e.g., areias guerreiro, mira, & barbosa, 2016; barbet-massin, rome, villemant, & courchamp, 2018; west, kumar, brown, stohlgren, & bromberg, 2016). this has two major advantages: it allows testing the power of sdm approaches to predict distribution shifts, and assessing the impact of conservation actions on projected shifts in sdm. we thank the referees and editor for handling this manuscript. all data used were accessed through referenced literature. https://orcid.org/0000-0003-2516-0134 
references 
delimiting the geographical background in species distribution modelling 
land use change emission scenarios: anticipating a forest transition process in the brazilian amazon 
crop wild relatives range shifts and conservation in europe under climate change 
spthin: an r package for spatial thinning of species occurrence records for use in ecological niche models 
applying network theory to prioritize multispecies habitat networks that are robust to climate and land-use change 
landscape fragmentation, severe drought, and the new amazon forest fire regime 
lags in the response of mountain plant communities to climate change 
ecological niche models in phylogeographic studies: applications, advances and precautions 
five (or so) challenges for species distribution modelling 
standards for distribution models in biodiversity assessments 
validation of species-climate impact models under climate change 
how well can models predict changes in species distributions? a 13-year-old otter model revisited 
the economics of land use reveals a selection bias in tree species distribution models 
individual dispersal, landscape connectivity and ecological networks 
landscape connectivity and animal behavior: functional grain as a key determinant for dispersal 
selecting pseudo-absences for species distribution models: how, where and how many?
can species distribution models really predict the expansion of invasive species? 
anthropogenic climate change drives shift and shuffle in north atlantic phytoplankton communities 
the crucial role of the accessible area in ecological niche modeling and species distribution modeling 
appropriateness of full-, partial- and no-dispersal scenarios in climate change impact modelling 
reorganization of north atlantic marine copepod biodiversity and climate 
combined impacts of global changes on biodiversity across the usa 
will climate change promote future invasions? 
changes in plant community composition lag behind climate warming in lowland forests 
ecological constraints increase the climatic debt in forests 
metacommunity dynamics: decline of functional relationship along a habitat fragmentation gradient 
advancing ecological understandings through technological transformations in noninvasive genetics 
costs of dispersal 
causes and consequences of animal dispersal strategies: relating individual behaviour to spatial dynamics 
synergistic effects of climate change and agricultural land use on mammals 
trade-offs in covariate selection for species distribution models: a methodological comparison 
synergies among extinction drivers under global change 
model complexity affects species distribution projections under climate change 
splot - a new tool for global vegetation analyses 
global trait-environment relationships of plant communities 
incorporating evolutionary adaptation in species distribution modelling reduces projected vulnerability to climate change 
combining trade data and niche modelling improves predictions of the origin and distribution of non-native european populations of a globally invasive species 
species distribution models are inappropriate for covid-19 
hydrologically driven ecosystem processes determine the distribution and persistence of ecosystem-specialist predators under climate change 
climate change and timing of avian breeding and migration: evolutionary versus plastic changes 
rapid range shifts of species associated with high levels of climate warming 
mismatch between marine plankton range movements and the velocity of climate change 
will plant movements keep up with climate change? 
evolution of dispersal strategies and dispersal syndromes in fragmented landscapes 
a dynamic eco-evolutionary model predicts slow response of alpine plants to climate warming 
assessing evolutionary potential in tree species through ecology-informed genome screening 
evolutionary biology: self/nonself evolution, species and complex traits evolution 
pre-adaptation to climate change through topography-driven phenotypic plasticity 
genetic costructure in a meta-community under threat of habitat fragmentation 
interacting grassland species under threat of multiple global change drivers 
differences in the climatic debts of birds and butterflies at a continental scale 
spatial autocorrelation analysis and ecological niche modeling allows inference of range dynamics driving the population genetic structure of a neotropical savanna tree 
modelling distribution in european stream macroinvertebrates under future climates 
a socio-ecological model for predicting impacts of land-use and climate change on regional plant diversity in the austrian alps 
an empiricist's guide to theoretical predictions on the evolution of dispersal 
species distribution models: ecological explanation and prediction across space and time 
hyrcanian forests - stable rear-edge populations harbouring high genetic diversity of fraxinus excelsior, a common european tree species 
genomic basis and evolutionary potential for extreme drought adaptation in arabidopsis thaliana 
climate change, plant migration, and range collapse in a global biodiversity hotspot: the banksia (proteaceae) of western australia 
ecological genomics meets community-level modelling of biodiversity: mapping the genomic landscape of current and future environmental adaptation 
how complex should models be? comparing correlative and mechanistic range dynamics models 
paintings predict the distribution of species, or the challenge of selecting environmental predictors and evaluation statistics 
predicting species distribution combining multi-scale drivers 
introduction to conservation genetics 
species distribution models in conservation biogeography: developments and challenges 
rapid evolution of flowering time by an annual plant in response to a climate fluctuation 
testing for associations between loci and environmental gradients using latent factor mixed models 
multiple dimensions of climate change and their implications for biodiversity 
climate velocity and the future global redistribution of marine biodiversity 
combined effects of global change pressures on animal-mediated pollination 
climate change, genetic markers and species distribution modelling 
stay or go - how topographic complexity influences alpine plant population and community responses to climate change 
why less complexity produces better forecasts: an independent data evaluation of kelp habitat models 
modelling of species distributions, range dynamics and communities under imperfect detection: advances, challenges and opportunities 
sesam - a new framework integrating macroecological and species distribution models for predicting spatio-temporal patterns of species assemblages 
is my species distribution model fit for purpose? matching data and models to applications 
land-use change interacts with climate to determine elevational species redistribution 
habitat distribution models: are mutualist distributions good predictors of their associates? 
integrative inference of population history in the ibero-maghrebian endemic pleurodeles waltl (salamandridae) 
the meaning and use of the area under a receiver operating characteristic (roc) curve 
high-resolution global maps of 21st-century forest cover change 
host plant availability potentially limits butterfly distributions under cold environmental conditions 
agricultural lands key to mitigation and adaptation 
a unified framework to model the potential and realized distributions of invasive species within the invaded range 
evolution of migration rate in a spatially realistic metapopulation model 
cross-validation of species distribution models: removing spatial sorting bias and calibration with a null model 
habitat suitability modelling and niche theory 
how biotic interactions may alter future predictions of species distributions: future threats to the persistence of the arctic fox in fennoscandia 
revisiting adaptive potential, population size, and conservation 
a quantitative synthesis of the movement concepts used within species distribution modelling 
climate change and biological invasions: evidence, expectations, and response options 
genetically informed ecological niche models improve climate change predictions 
predicting the effects of climate change on population connectivity and genetic diversity of an imperiled freshwater mussel, cumberlandia monodonta (bivalvia: margaritiferidae), in riverine systems 
projected impacts of climate and land-use change on the global diversity of birds 
pantheria: a species-level database of life history, ecology, and geography of extant and recently extinct mammals 
using regional climate projections to guide grassland community restoration in the face of climate change 
climate change drives microevolution in a wild bird 
integrating biophysical models and evolutionary theory to predict climatic impacts on species' ranges: the dengue mosquito aedes aegypti in australia 
rapid shifts in plant distribution with recent climate change 
adaptation to climate change: trade-offs among responses to multiple stressors in an intertidal crustacean 
biodiversity and ecosystem services require ipbes to take novel approach to scenarios 
aligning conservation priorities across taxa in madagascar with high-resolution planning tools 
global change and the distributional dynamics of migratory bird populations wintering in central america 
disentangling the effects of land-use change, climate and co2 on projected future european habitat types 
incorporating microclimate into species distribution models 
latitudinal and elevational range shifts under contemporary climate change 
without quality presence-absence data, discrimination metrics such as tss can be misleading measures of model performance 
how disturbance, competition, and dispersal interact to prevent tree range boundaries from keeping pace with climate change 
the use of occurrence data to predict the effects of climate change on insects. current opinion in insect science 
the uncertain nature of absences and their importance in species distribution modelling 
auc: a misleading measure of the performance of predictive distribution models 
species' traits as predictors of range shifts under contemporary climate change: a review and meta-analysis 
ten years of landscape genetics 
landscape genetics: combining landscape ecology and population genetics 
tackling intraspecific genetic structure in distribution models better reflects species geographical range 
population genetic diversity and fitness in multiple environments 
the interplay of climate and land use change affects the distribution of eu bumblebees 
testing instead of assuming the importance of land use change scenarios to model species distributions under climate change 
divergent evolution of dispersal in a heterogeneous landscape 
coral reefs in a crystal ball: predicting the future from the vulnerability of corals and reef fishes to multiple stressors 
early detection of emerging forest disease using dispersal estimation and ecological niche modeling 
climate, competition and connectivity affect future migration and ranges of european trees 
integrating occurrence data and expert maps for improved species range predictions 
a metapopulation approach to predict species range shifts under different climate change and landscape connectivity scenarios 
migration rate limitations on climate change-induced range shifts in cape proteaceae 
integrating dynamic environmental predictors and species occurrences: toward true dynamic species distribution models 
evaluating 318 continental-scale species distribution models over a 60-year prediction horizon: what factors influence the reliability of predictions? 
enmeval: an r package for conducting spatially independent evaluations and estimating optimal model complexity for maxent ecological niche models 
europe's forest management did not mitigate climate warming 
global effects of land use on local terrestrial biodiversity 
poleward shifts in geographical ranges of butterfly species associated with regional warming 
shifts in arctic vegetation and associated feedbacks under climate change 
accounting for preferential sampling in species distribution models 
scenarios for global biodiversity in the 21st century 
global environmental change effects on ecosystems: the importance of land-use legacies 
transferability and model evaluation in ecological niche modeling: a comparison of garp and maxent 
predicting the geography of species' invasions via ecological niche modeling 
environmental data sets matter in ecological niche modelling: an example with solenopsis invicta and solenopsis richteri 
mechanistic and correlative models of ecological niches 
incorporating local adaptation into forecasts of species' distribution and abundance under climate change 
the fate of páramo plant assemblages in the sky islands of the northern andes 
modeling of species distributions with maxent: new extensions and a comprehensive evaluation 
on the relationship between niche and distribution 
making better maxent models of species distributions: complexity, overfitting and evaluation 
an integrated framework to identify wildlife populations under threat from climate change 
does local habitat fragmentation affect large-scale distributions? the case of a specialist grassland bird 
how much do direct livestock emissions actually contribute to global warming? 
a practical guide to environmental association analysis in landscape genomics 
effectiveness of the global protected area network in representing species diversity 
the impact of endothermy on the climatic niche evolution and the distribution of vertebrate diversity 
how does it feel to be like a rolling stone? ten questions about dispersal evolution 
a spatially interactive simulation of climate change, harvesting, wind, and tree species migration and projected changes to forest composition and biomass in northern wisconsin 
dispersal will limit ability of mammals to track climate change in the western hemisphere 
the genetic paradox of invasions revisited: the potential role of inbreeding × environment interactions in invasion success 
climate change can cause spatial mismatch of trophically interacting species 
a global assessment of current and future biodiversity vulnerability to habitat loss-climate change interactions 
climate change and community disassembly: impacts of warming on tropical and temperate montane community structure 
gene flow and the geographic structure of natural populations 
landscape genomic analysis of candidate genes for climate adaptation in a california endemic oak, quercus lobata 
putting the 'landscape' in landscape genetics 
anticipated climate and landcover changes reveal refuge areas for borneo's orangutans 
extinction risk from climate change is reduced by microclimatic buffering 
thermal tolerance and the global redistribution of animals 
dispersal syndromes and the use of life-histories to predict dispersal 
predicting species' maximum dispersal distances from simple plant traits 
connectivity is a vital element of landscape structure 
can dispersal investment explain why tall plant species achieve longer dispersal distances than short plant species 
climate change distracts us from other threats to biodiversity 
dispersal and species' responses to climate change 
habitat persistence, habitat availability and the evolution of dispersal 
a framework for integrating thermal biology into fragmentation research 
invasive species distribution modeling (isdm): are absence data and dispersal constraints needed to predict actual distributions? ecological modelling 
evolutionary response of the egg hatching date of a herbivorous insect under climate change 
selecting pseudo-absence data for presence-only distribution modeling: how far should you stray from what you know? ecological modelling 
contrasting microclimates among hedgerows and woodlands across temperate 
projecting global biodiversity indicators under future development scenarios 
keeping up with a warming world; assessing the rate of adaptation to climate change 
beyond global warming: ecology and global change 
ecological responses to recent climate change 
evaluating presence-only species distribution models with discrimination accuracy is uninformative for many applications 
incorporating model complexity and spatial sampling bias into ecological niche models of climate change risks faced by 90 california vertebrate species of concern 
rapid responses of british butterflies to opposing forces of climate and habitat change 
mutualism fails when climate response differs between interacting species 
comparative tests of the species-genetic diversity correlation at neutral and nonneutral loci in four species of stream insect 
is there a correlation between abundance and environmental suitability derived from ecological niche modelling?
key: cord-352543-8il0dh58 authors: kuzdeuov, a.; baimukashev, d.; karabay, a.; ibragimov, b.; mirzakhmetov, a.; nurpeiissov, m.; lewis, m.; varol, h. a. title: a network-based stochastic epidemic simulator: controlling covid-19 with region-specific policies date: 2020-05-06 journal: nan doi: 10.1101/2020.05.02.20089136 sha: doc_id: 352543 cord_uid: 8il0dh58 in this work, we present an open-source stochastic epidemic simulator, calibrated with extant epidemic experience of covid-19. our simulator incorporates information ranging from population demographics and mobility data to health care resource capacity, by region, with interactive controls of system variables to allow dynamic and interactive modeling of events. the simulator can be generalized to model the propagation of any disease, in any territory, but for this experiment was customized to model the spread of covid-19 in the republic of kazakhstan, and to estimate outcomes of policy options to inform deliberations on governmental interdiction policies.
coronavirus disease 2019 (covid-19) has emerged as a global crisis that threatens to overwhelm public healthcare systems and disrupt the social and economic welfare of every country. the daily lives of billions of people have been impacted by the unprecedented social controls imposed by governments to inhibit the spread of the novel coronavirus [1]-[3]. covid-19 has spread rapidly amongst a globally susceptible population, with rates of propagation and mortality greater than the averages typically associated with influenza [4]. initial projections of the global impact [5] indicate that it will likely become the most severe pandemic event in over a century, dating back to the influenza epidemic of 1918. the situation is exacerbated by several unfortunate observations: the population has no prior exposure, carriers are highly contagious in pre-symptomatic states [6], there is no vaccine yet available, widespread testing for the virus began quite late due to a lack of testing kits, reagents, and facilities, and there appears to be broad variation in risk profiles based on population demographics and prior health history [7]. in this scenario, epidemiological models can be used to project the future course of the disease, and to estimate the impact of non-pharmaceutical interventions (npis) and related control measures that might be used to slow the contagion, and thereby provide time to enhance health care resources and develop effective immunological defenses such as new vaccines. we have developed and implemented a network-based stochastic epidemic simulator (leveraging our prior work [8]) which models cities and regions as nodes in a graph, with the edges between nodes representing transit links of roads, railways, and air travel routes to model the mobility of inhabitants amongst cities.
the simulator includes population demographics along with health care system capacity, in particular, the intensive care unit (icu) availability, which serves as a negative impact multiplier when the number of icu beds is exceeded. in each node, the simulator runs a compartmental susceptible-exposed-infectious-recovered (seir) model, such that individuals can cycle through the four stages based on state transition probabilities. these probabilities are based on parameters such as the susceptible-to-exposed transition constant and the mortality rate, which can be influenced by age, gender, genetic profile, and health status. the simulator can be used to estimate the extent and duration of an epidemic over time, and model the potential impact of npi measures deployed to suppress or mitigate the spread of the virus. the supported npi control measures include what are commonly described as social (or physical) distancing, such as limiting travel or quarantining a region, on a localized basis and by transit link. the susceptible-infected-recovered (sir) epidemic model was one of the first models proposed for epidemiological simulation purposes, and has been utilized widely due to its simplicity and effectiveness [9]. in the sir model, a society consists of three compartments. the first compartment, susceptible (s), contains individuals who are vulnerable and not yet infected. the second compartment, infected (i), is formed from susceptible individuals who become infected; in this state, they are capable of shedding the virus and spreading the disease. the last compartment, recovered (r), consists of previously infected individuals who have overcome the disease. the recovered individuals are presumed to have acquired some level of immunity to the disease, such that they have a lower probability of reinfection compared to susceptible individuals.
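the three-compartment dynamics described above can be sketched with a minimal deterministic sir integration. the transmission rate beta, recovery rate gamma, and initial conditions below are illustrative assumptions, not parameters taken from the paper.

```python
# a minimal deterministic sir sketch (forward-euler integration);
# beta, gamma, and the initial conditions are illustrative values,
# not calibrated parameters from the paper.

def simulate_sir(s0, i0, r0_pop, beta, gamma, days, dt=0.1):
    """integrate ds/dt = -beta*s*i/n, di/dt = beta*s*i/n - gamma*i,
    dr/dt = gamma*i and return the final (s, i, r) compartments."""
    s, i, r = float(s0), float(i0), float(r0_pop)
    n = s + i + r
    steps = int(days / dt)
    for _ in range(steps):
        new_inf = beta * s * i / n * dt   # susceptible -> infected
        new_rec = gamma * i * dt          # infected -> recovered
        s -= new_inf
        i += new_inf - new_rec
        r += new_rec
    return s, i, r

s, i, r = simulate_sir(s0=9990, i0=10, r0_pop=0, beta=0.3, gamma=0.1, days=160)
```

with beta/gamma = 3, most of the population eventually moves to the recovered compartment, while the total population is conserved throughout the run.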
the sir model, however, has several limitations based on simplifying assumptions that do not correlate well with actual viral propagation. for instance, the sir model assumes that each individual has an equal probability of spreading the disease, and that infected individuals themselves become infectious immediately, when, in practice, there is often a latency period in between. the model also presumes that each region has a fixed-size population, thus not accounting for population mobility. the experience with the sir model has motivated many enhancements and variations, with new compartmental states, such as seir, seirs, sirs, sei, seis, si, and sis [10]. many of these models incorporate the additional compartment, exposed (e). in one variation, individuals are infected in this exposed state but not yet infectious during the latent period; these individuals become infectious when the latent period ends. in general, compartments and the related state transition probabilities are selected based on the characteristics of the specific disease and the purpose of the model. the basic reproduction rate r_0, denoting the average number of individuals infected by an infectious individual, is used as a parameter in many epidemiology models [10]. it can be interpreted as the ability of the infection to invade and persist in a new population. in deterministic models, infection spreads in a sustained manner among a susceptible population if r_0 > 1 [11]. the propagation of infectious diseases can be simulated using both deterministic and stochastic models [12]. a deterministic epidemic model is formulated as a system of differential equations, while a stochastic epidemic model can be implemented using stochastic differential equations or discrete-time markov processes [13]. the deterministic model yields the same solution each time if the initial conditions and parameter values of the model are not changed.
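the r_0 > 1 threshold mentioned above can be illustrated numerically: using the standard sir relation r_0 = beta / gamma, the infected compartment grows in the early epidemic phase only when r_0 exceeds one. the rates below are illustrative assumptions.

```python
# illustrative check of the r0 > 1 threshold: with the standard sir
# relation r0 = beta / gamma, early growth of the infected compartment
# occurs only when r0 exceeds one. values here are assumptions for
# demonstration, not calibrated parameters from the paper.

def early_growth(beta, gamma, i0=1.0, days=10, dt=0.01):
    """return the ratio i(days) / i(0) in the early (s ~ n) regime."""
    i = i0
    for _ in range(int(days / dt)):
        # while nearly everyone is susceptible, di/dt ~ (beta - gamma) * i
        i += (beta - gamma) * i * dt
    return i / i0

subcritical = early_growth(beta=0.08, gamma=0.1)   # r0 = 0.8, decays
supercritical = early_growth(beta=0.3, gamma=0.1)  # r0 = 3.0, grows
```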
the distinction of a stochastic model is that it contains at least one probabilistic element, i.e. the output of the model varies within some range even if the initial conditions and parameter values of the model remain constant. both approaches have their advantages and limitations [14], [15]. the main strength of the deterministic model is that it clearly shows how the initial conditions and parameters impact the model behavior, but at the expense of a more realistic and nuanced simulation. a deterministic model ignores uncertainties, which can be considered a major limitation, due to the fact that epidemic propagation is inherently a stochastic phenomenon. in contrast, stochastic models take uncertainties into account by introducing probabilities; however, with the natural outcome that no two simulations produce exactly the same results, stochastic models are regarded as more challenging to manage and to use for demonstrating causal links. variations of deterministic and stochastic models, in the context of both fixed-population and networked dynamics, have been studied in the literature [16]. a stochastic isir epidemic model with immunization strategies was presented to simulate the epidemic spread on social contact networks [17]. the seir model with time delay on scale-free networks was described in [18] to investigate the reproduction number and transmission dynamics of the disease. a model that analyses a population with two diseases, infectious and noninfectious, was introduced in [19]. the model assumes that the noninfectious disease itself is not dangerous but is capable of attacking the immune system, so its combination with an infectious disease might be fatal. in our work, the main focus is to develop a network-based simulator that closely models the population and regional mobility dynamics. therefore, we employed the seir stochastic model to simulate the spread of the covid-19 virus in a closed population.
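the deterministic/stochastic distinction can be sketched as follows: a deterministic update always returns the expected value, while a stochastic (binomial) update samples each transition individually, so repeated runs vary around that expectation. the probability and population size below are illustrative assumptions.

```python
import random

# contrast of a deterministic update with a stochastic (binomial) one
# for the same susceptible -> infected transition; the probability and
# sizes are illustrative assumptions.

def deterministic_new_infections(n_s, p):
    """deterministic update: always the expected value n_s * p."""
    return n_s * p

def stochastic_new_infections(n_s, p, rng):
    """stochastic update: each susceptible independently becomes
    infected with probability p, so the outcome varies run to run."""
    return sum(1 for _ in range(n_s) if rng.random() < p)

rng = random.Random(42)
p, n_s = 0.02, 5000
det = deterministic_new_infections(n_s, p)  # identical every run
runs = [stochastic_new_infections(n_s, p, rng) for _ in range(200)]
```

the stochastic runs scatter around the deterministic expectation, which is the behavior described in the text: no two simulations produce exactly the same results.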
then, we extended the model to investigate the dynamics of the spread of the disease between different regions. the detailed methodology is provided in section iii. in order to make epidemic simulators more realistic, it is important to populate them with authentic initial conditions, and calibrate the system variables against real data. prior examples include the seir model that was implemented using the anylogic program to examine the spread of the h1n1 influenza virus between different regions of korea based on highway and domestic flight traffic data [20]. in [21], an seir model was introduced that takes into account the number of workers in a region, their geographic locations, and their age groups. an agent-based system that uses social interactions and individual mobility patterns was employed to study the h1n1 outbreak in mexico [22]. for our purposes, to accurately model the situation of the republic of kazakhstan as our case study, we incorporated transit data from highway travel, domestic flights, and railway connections amongst the regions of the country. in addition, we used the actual population demographics and hospital capacities of each region, provided under a collaboration agreement with the government. one of the most important stages in the development of epidemiological models is the calibration of system variables. calibration is the tuning of model parameters against available real epidemic data. this process is necessary to refine the model's ability to accurately project future scenarios in a reasonable and demonstrable manner, which can then be compared with actual events in real time to either affirm or undermine the model's assumptions. with sufficient data, particle and ensemble filters can be utilized to further improve model accuracy [23], [24]. in our case, covid-19 is at an early stage of spreading in kazakhstan. the first official case was announced on 13 march 2020 [25].
therefore, we calibrated our model empirically, first with reference to prior examples, and then with the limited available case data for the country. another important feature of epidemiological simulation is the visualization of the outcomes. visualizations help users better understand the propagation of the virus and the impact of policy decisions. there are many publicly available software tools for epidemic simulation with graphical interface components [26], [27]. many countries have developed their own covid-19 epidemic models to simulate different scenarios [28]-[32]. the main advantage of our epidemic simulator is that the source code is publicly available, thus researchers can use or adapt it to their specific problem and save valuable time. also, we provide a visualization tool such that users without programming skills can easily run the simulator and readily comprehend the results. (this preprint, which was not certified by peer review, is made available under a cc-by-nc-nd 4.0 international license; the copyright holder is the author/funder, who has granted medrxiv a license to display the preprint in perpetuity. this version was posted may 6, 2020.)

fig. 1: the statechart of the seir epidemic node simulator.

the graphical user interface allows the user to set the initial conditions and model parameters, to dynamically adjust parameters during model execution, to incrementally plot the simulation results in real time, to save the intermediate and final results, and to upload a saved model and continue the simulation. in addition, we have prepared a series of video tutorials 1 providing background context for the project, describing the architecture of the model, the implementation and use of the simulator, and the outcomes from the assessment of the republic of kazakhstan scenarios.
in our simulator, the seir model for a single node consists of four superstates (susceptible (s_s), exposed (e_s), infected (i_s) and recovered (r_s)), and transitions amongst states occur in accordance with the statechart shown in fig. 1. a description of the parameters is given in table i. the susceptible superstate (s_s) consists of two states: susceptible (s) and vaccinated (v). the daily vaccination rate vr dictates the transition between these states. the ratio of immunized people after vaccination, vir, describes the portion of individuals going from the vaccinated state (v) to vaccination immunized (vi) or back to susceptible (s); however, this path is not yet activated, since there is no known vaccine for covid-19. in order to represent the delay dynamics of epidemics, the exposure (t_exp), infection (t_inf), and vaccination immunization (t_vac) periods are taken into account. the expected value (ev) of the transition from the susceptible state (s) to the exposed state (e) is governed by the parameter β_exp as

ev(s→e) = β_exp · ∆t · n_s · λ / n,

where ∆t is the simulation step size, n_s is the total number of individuals in the susceptible state (s), n is the total population, and λ is a weighted population estimated from n_exp and n_inf, the total numbers of individuals in the substates of the exposed (e) and infected (i) superstates, where iso is the isolated state, si is the severe infected state, and q is the quarantined state. these totals, along with n_vac, the total number of individuals in the substates of the vaccinated state, are calculated by summing the corresponding substate populations. the exposed superstate (e_s) includes the exposed (e) and quarantined (q) states, and the transition between these states is governed by the daily quarantine rate qr.
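a minimal sketch of the expected susceptible-to-exposed transition, assuming the form ev = β_exp · ∆t · n_s · λ / n as a reading of the variable definitions in the text; all numeric values below are illustrative, not the calibrated parameters.

```python
# expected number of susceptible -> exposed transitions in one step,
# assuming ev = beta_exp * dt * n_s * lam / n (a plausible reading of
# the transition equation; values here are illustrative).

def expected_s_to_e(beta_exp, dt, n_s, lam, n):
    """expected new exposures during a simulation step of size dt."""
    return beta_exp * dt * n_s * lam / n

# e.g. a node of 10,000 people, 9,000 still susceptible, a weighted
# contagious population lam = 500, and an hourly step dt = 1/24
ev = expected_s_to_e(beta_exp=0.2, dt=1 / 24, n_s=9_000, lam=500, n=10_000)
```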
the transition rates between the exposed superstate (e_s) and the infected superstate (i_s) are equal to one, since after the end of the incubation period all individuals transfer from the exposed (e) and quarantined (q) states to the infected (i) and isolated (iso) states, respectively. the main difference from our previous model, presented in [8], is the introduction of a third state, severe infected (si), within the infected superstate (i_s). individuals move from the infected (i) and isolated (iso) states to severe infected (si) according to the severe infection rate sir. the natural death rate is neglected in the infected (i) and isolated (iso) states, and the transition to the dead state (d) occurs only through the severe infected state (si). this is done to model the high number of asymptomatic or mild cases which carry the disease. the natural birth and death rate factors are removed due to their negligible impact over the relatively short duration of the simulation. also, individuals move from the infected (i) and isolated (iso) states to the recovery immunized (ri) state according to the recovery immunization rate γ_rim, and the rest make a transition to the susceptible state (s). the mortality rate used in the model is divided into γ_mor1 and γ_mor2 depending on the hospital capacity: γ_mor1 is applied when the number of people in the severe infected state (si) is below the hospital capacity; otherwise the model uses the γ_mor2 mortality rate, which is greater than γ_mor1. finally, individuals move from the severe infected state (si) to the recovery immunized (ri) and susceptible (s) states.
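the capacity-dependent mortality described above can be sketched as a simple rate selector; the rates and the capacity below are illustrative assumptions, not the paper's calibrated values.

```python
# sketch of capacity-dependent mortality: gamma_mor1 applies while
# severe infections fit within hospital capacity, and the larger
# gamma_mor2 once capacity is exceeded. rates and capacity here are
# illustrative assumptions.

def mortality_rate(n_severe, hospital_capacity, gamma_mor1, gamma_mor2):
    """select the per-step mortality rate based on hospital load."""
    return gamma_mor1 if n_severe <= hospital_capacity else gamma_mor2

low = mortality_rate(n_severe=80, hospital_capacity=100,
                     gamma_mor1=0.02, gamma_mor2=0.08)
high = mortality_rate(n_severe=150, hospital_capacity=100,
                      gamma_mor1=0.02, gamma_mor2=0.08)
```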
if the hospital capacity is not exceeded, the expected transition values out of the severe infected state are calculated using the mortality rate γ_mor1; otherwise, the higher rate γ_mor2 is used. in order to launch a run of the simulator, the values of the transition parameters and the number of individuals in each state must be initialized, after which the stochastic solver proceeds through the iterations of time units, generating the corresponding population states based on those parameters. a network model in the epidemiological context is the representation of a country as a graph of interconnected nodes, where an individual node represents an administrative unit of the country, such as a city or region, taking into account its respective population demographics and health care capacity. this is a convenient representation, since measures during epidemics are usually taken for specific administrative units. the transportation connections between the regions can also be modeled as the edges of the network. each node in the network runs concurrently as a separate instance of the single-node seir model described in sec. iii-a. the internal parameters of each seir model can be changed independently. the network model thus allows for viral propagation, with corresponding state transitions, within and amongst the nodes representing the cities and the regions. for a network with m nodes, the total transition matrix t_total ∈ z^(m×m) can be created by collecting and combining the open-source data on the daily domestic air (t_air), rail (t_rail), and highway (t_highway) travel between different regions as t_total = t_air + t_rail + t_highway. to simulate the network transition between nodes, at each sampling time ∆T of the network simulator, a randomly sampled population is transferred from one node to another according to the transition matrix. it is important to note that the sampling time ∆T used for the network differs from the node sampling time ∆t (∆T > ∆t).
the expected value of the transition from state k of node i to the corresponding state k of node j is given by

ev_k(i→j) = t_ij · s_k,i / n_i,

where t_ij is the population transfer from node i to node j, n_i is the population of node i, and s_k,i is the number of individuals in state k of node i. some of the states, such as quarantined and severe infected, are excluded from the transfer between the nodes, i.e. the states k ∈ {q, si, iso, d}. our simulator allows disconnection of a particular node by turning it off in the list of air, highway and railway transitions. also, the traffic ratio tr and the leakage ratio lr facilitate the fine tuning of the transportation between the nodes. specifically, by lowering the traffic ratio, the number of individuals transferring between the nodes can be decreased. this can be used to simulate policies which throttle down the transportation between regions. the leakage ratio allows minor transitions even though a node is disconnected from the network. this is done to take into account individuals that cross quarantined borders illegally, and also the transportation between regions for essential services such as food delivery. the simulator was implemented in the python programming language. for faster operation, we used the multiprocessing module of python to parallelize the execution of the single-node simulations across all regions. a graphical user interface (gui) for the simulator (see fig. 3) was developed using the interactive visualization library bokeh [33]. this open-source bsd-licensed library utilizes web browsers to visualize data, which simplifies online deployment of the simulator and also enables platform independence. the source code 2 of our simulator was uploaded to github under the bsd license.
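a minimal sketch of the inter-node transfer logic, assuming the expected-value form ev_k = t_ij · s_k,i / n_i together with the traffic and leakage ratios described above; the transit counts and rates below are illustrative assumptions, not the paper's data.

```python
# expected number of individuals in state k moving from node i to
# node j in one network step, scaled by the traffic ratio tr, or by
# the small leakage ratio lr when the node is disconnected. all
# numeric values are illustrative assumptions.

def expected_transfer(t_ij, s_k_i, n_i, tr=1.0, connected=True, lr=0.01):
    """expected transfer of state-k individuals from node i to node j."""
    ratio = tr if connected else lr
    return ratio * t_ij * s_k_i / n_i

# total daily transit is the sum of air, rail and highway counts
t_air, t_rail, t_highway = 120, 300, 580
t_total = t_air + t_rail + t_highway

moved = expected_transfer(t_total, s_k_i=200_000, n_i=1_000_000)
leaked = expected_transfer(t_total, s_k_i=200_000, n_i=1_000_000,
                           connected=False)
```

lowering tr throttles inter-regional travel, while lr keeps a small residual flow across a disconnected (quarantined) node, as described in the text.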
figure 3 illustrates the components of the simulator interface. the number of individuals in each state, for a selected region and for the whole country, is shown on the top side (1) in real time. also, the starting date of the simulation and the current date are provided here. the epidemic dynamics versus time are visualized in the upper left plot (2). on the right side of this plot, the epidemic heat map (3) is displayed. the gui allows the user to save simulation results and model parameters to a csv file (6). the saved model parameters can be reloaded (7, 8) to continue the simulation from a prior point. it is also possible to reset the parameters to default values (4). simulation parameters can be selected for each node (9) and for the whole network (10). the transition matrix can be adjusted using checkboxes (11), and slider bars adjust the traffic and leakage ratios. the resulting transition matrix is shown at the bottom part of the interface (12). in order to test the capability of our simulator and fine tune its parameters, we first focused on the simulation of the lombardy region in italy. we chose lombardy since it is the epicenter of the covid-19 outbreak in italy, its epidemic timeline is well-established, and the italian government has shared comprehensive daily epidemic data since 24 february 2020 [34], thus facilitating the calibration of model assumptions against actual events and data. the population of lombardy is just over 10 million people; therefore, we used this number to initialize the number of susceptible individuals in our model. we started the simulation on 1 january 2020 based on the results presented in [35]. we set the number of exposed individuals to ten on this date, as this figure yielded accurate model state populations going forward. in order to tune other parameters of the model, we used the reported data for the lombardy region for the period between 24 february and 27 april 2020.
also, we took into account the timeline of events and government policy decisions that would impact the simulation (see table ii). the model parameters are summarized in table iii. the parameter β_exp was tuned according to the events described in table ii. the model was simulated for 180 days with the sampling time ∆t = 1/24 days. the results of the simulation are shown in fig. 4. it can be seen from the figure that the model fits the reported number of deaths for the given period of time closely. also, we can observe that the epidemic reaches a peak around 26 march 2020, when the total number of infected individuals is 67,850. if we look at the reported data, new daily confirmed cases increased significantly starting from 19 march 2020 (2,171 cases) and remained high until 28 march 2020 (2,117 cases). after this period, the new daily cases started to gradually decrease. thus, we can assume that the model estimated the peak of the spread of covid-19 accurately. also, according to the simulation, we forecast that the epidemic might continue until the end of june and that the number of total deaths might exceed 17,000 in lombardy. as noted, we tested the model's assumptions against prior epidemic experience, and then customized the model to simulate the spread of covid-19 in the republic of kazakhstan. we performed model calibration and validation by replicating the observed covid-19 numbers in the country. epidemics usually start in hubs with the import of the disease from another country, and propagate to other regions via transportation links. the first confirmed
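the empirical tuning of β_exp against reported deaths can be sketched as a least-squares grid search; the toy growth model, the synthetic "reported" series, and the candidate values below are all illustrative assumptions, not the paper's procedure or data.

```python
# calibration sketch: choose beta by minimising the squared error
# between simulated and reported cumulative deaths. the toy model,
# the synthetic "reported" series, and the candidate betas are all
# illustrative assumptions.

def simulate_deaths(beta, days, i0=10.0, mu=0.01):
    """toy exponential-growth model: deaths accumulate as mu * i(t)."""
    i, deaths, series = i0, 0.0, []
    for _ in range(days):
        deaths += mu * i
        i *= 1.0 + beta
        series.append(deaths)
    return series

# pretend this series is the reported data we calibrate against
reported = simulate_deaths(0.15, days=20)

def calibrate(candidates, reported):
    """return the candidate beta with the smallest squared error."""
    def sse(beta):
        sim = simulate_deaths(beta, len(reported))
        return sum((a - b) ** 2 for a, b in zip(sim, reported))
    return min(candidates, key=sse)

best = calibrate([0.05, 0.10, 0.15, 0.20, 0.25], reported)
```

in practice the paper tunes β_exp piecewise over the policy timeline of table ii rather than as a single constant; the grid search here only illustrates the fitting idea.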
instances of covid-19 patients in kazakhstan were detected in the largest cities, nur-sultan and almaty, and then the disease spread to other regions [25]. these two cities serve as transportation hubs, as highlighted in fig. 2. table iv lists all nodes with the respective abbreviations, population numbers and hospital capacities for the republic of kazakhstan. the transition matrix between nodes, representing the domestic movements in the country, has been estimated based on government records and available statistical data (see table v), and captures the end destination numbers, not accounting for intermediate transit points. for instance, if one travels from the capital to another region via a third transfer city, only the departure and arrival nodes have been considered in the transition matrix. the timeline of the governmental policies related to covid-19, introduced in table vi, has been used to dynamically adjust the model parameters. since the first confirmed-positive cases are dated to 13 march 2020 in the cities of nur-sultan and almaty, we started the simulation from 1 march 2020 with 10 exposed individuals in each of these two cities. the simulation parameters are summarized in table vii. the node and network sampling times were set as ∆t = 1/24 and ∆T = 1/2 days. the simulation results are shown in fig. 5. it can be seen that, on 27 april 2020, the actual number of covid-19 deaths is 25, while the predicted number of deaths is 26. this suggests that the model was calibrated accurately and can be utilized to simulate future scenarios. as for the almaty city result, the simulation concludes 12 dead and 1,188 infected by 27 april, whereas the reported statistics are 8 and 876, respectively.
the discrepancy might be due to the stochastic nature of the simulator and the lack of testing available in the city, which might result in a lower number of officially registered cases [25]. 2) covid-19 control strategies: after establishing the initial conditions for kazakhstan, and calibrating the state transitions, the next step was to model outcome scenarios for four variations of policy intervention: 1) complete quarantine, 2) going back to normal life, 3) "new normal", and 4) augmenting the "new normal" strategy with the introduction of technological measures to detect infection zones and facilitate individual exposure detection and contact tracing. the new normal strategy introduces measures such as an increase in hospital capacity, unhindered supply of all necessary drugs, effective social distancing of vulnerable people, an increase in testing rates, and persistent observation of hygienic precautions and moderate social distancing by the general population. the introduction of enhanced technological monitoring methods, such as identification of infection zones, and automated notifications to individuals whenever they have visited an infection zone or come into physical proximity with an infected individual, would increase the ability to detect and trace infections; in this scenario we used much higher quarantine rates due to the presumed greater ability to control the interaction of individuals in the exposed and susceptible categories. fig. 6 shows the number of infected and dead versus time for these four strategies, simulated from 27 april. table viii summarizes the key parameters used in the simulations.
table v reports the total transition matrix between the nodes, i.e., the sum of the domestic air, highway and rail transition matrices. [the numeric entries of table v are not reproduced here; the table lists, for each ordered pair of the seventeen nodes (aao, aak, aqm, aqt, aty, wkz, zha, man, nst, pav, qar, qos, qyz, ekz, smk, nkz, tur), the estimated number of travelers.] the results illustrate the effectiveness of the complete quarantine approach, and the impact of technological measures, compared to the new normal strategy. although quarantine is demonstrably the most effective, it is not sustainable long-term due to its extremely negative social and economic impact. the second strategy, back to normal lifestyle, is also impractical, since the simulation shows an exponential second wave of epidemic growth in the country. from these considerations, the optimal policy would be a hybrid approach that balances the effectiveness of quarantine with measures to minimize the β_exp value; the model could be used to establish this balance on an ongoing, node-by-node basis. 3) intermittent quarantine strategy: the policies implemented in most countries, including kazakhstan, are predominantly non-pharmaceutical suppression strategies designed to abruptly interrupt viral propagation (temporarily reducing r_0), and thereby "flatten the curve" to minimize the impact on the healthcare infrastructure until pharmaceutical or immunological solutions become available. the suppression strategies consist of strict quarantine measures such as the lockdown of specific regions, banning of social gatherings, closure of schools and universities (with a subsequent shift to fully on-line delivery methods), national adoption of work-from-home policies, and the temporary closure of non-essential production. to achieve the aforementioned balance, we considered an intermittent quarantine regime, effectively alternating between suppression and controlled mitigation strategies, with β_exp values of 0.19 and 0.05 for the quarantine-off and quarantine-on periods respectively (see fig. 7). the simulation analysis shows that for longer intermittent-quarantine periods the peak number of infected is much higher but the total number of infected (the area under the curve) is lower; the analysis is conducted as a multi-objective optimization problem as a function of time and medical preparedness. 4) effects of transportation limitations: our simulator incorporates domestic population mobility based on the transition matrix. the inclusion of mobility makes the model more realistic, allows more nuanced modeling of suppression-policy effects, and thus makes the simulation results more precise. to illustrate the effect of transportation limitations on the infection spread, we simulated the validation scenario with four different traffic ratios (0, 0.33, 0.66 and 1) starting from day 57. the peak number of infected increases with the traffic ratio, as shown in fig. 8. a traffic ratio of zero corresponds to an ongoing complete quarantine, which might not be feasible for both practical and economic reasons.
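the alternating transmission parameter used in the intermittent quarantine experiments can be expressed as a simple piecewise-constant schedule; the 0.19 and 0.05 values are those quoted in the text for the quarantine-off and quarantine-on periods, while the period lengths and the function name are illustrative assumptions.

```python
def beta_exp(day, off_days, on_days, beta_off=0.19, beta_on=0.05):
    """piecewise-constant transmission parameter for an intermittent
    quarantine: `off_days` of relaxed measures followed by `on_days`
    of strict quarantine, repeating.  0.19 and 0.05 are the off/on
    values quoted in the text; the period lengths are policy choices."""
    phase = day % (off_days + on_days)
    return beta_off if phase < off_days else beta_on

# e.g. a two-weeks-off / two-weeks-on schedule:
schedule = [beta_exp(d, 14, 14) for d in range(56)]
```

longer periods simply stretch the phases; the multi-objective trade-off in the text comes from feeding such schedules into the simulator and comparing peak height against total area under the infection curve.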
the network-of-nodes design of the stochastic epidemic simulator allows granular adjustment of the model's parameters at the level of individual regions (nodes), and the interactive controls allow policy interventions to be tested at discrete time intervals, unique to each region. the inclusion of mobility transitions between regions also allows for more precise simulation of events. the calibration of the model presents a conundrum: if we fit the parameters of the model to the reported data on confirmed cases, and set the state transition rates based on those reports, then the simulator outcomes will depend on the relative accuracy of the reported data, and factors such as physical limits on the number of tests that can be conducted could skew the outcomes. in the case of covid-19, it is likely that the number of infected individuals, and the number of deaths attributed to it, are higher than the number of laboratory-confirmed cases, given the relatively uneven rates of testing across different populations, wide variations in the percentage of positive tests, variations in mortality rates, and the concurrent excess mortality observed in many of the reporting zones. perhaps the single greatest factor is that most countries only test if a patient meets screening criteria, with explicit symptoms; but exposed patients can remain asymptomatic for approximately five days, a non-trivial percentage of those infected may not seek medical care due to light symptoms, and the infection may not be recognized or registered as covid-19 due to the sensitivity and specificity of the rapid tests available on the market [36]. we have run many scenarios, and, often, the projections of the simulator ran higher than the reported figures. this was an initial concern, until updated reporting inclusive of testing rates, positive rates, mortality and excess mortality was taken into account, which suggested that the preliminary reports were simply incomplete.
thus, if we run the simulator tightly, calibrating and updating based only on the reported confirmed-case data, then we risk essentially modeling only those with symptoms severe enough to require medical care; by doing so the model will capture only a portion of the impact, and skew the population sizes of the susceptible and recovered states. the significance of this outcome grows over time, as it determines whether and when a population reaches herd-immunity levels, and so too the duration and extent of npi policies. to overcome this limitation it is necessary to establish a more rigorous testing regimen, in accordance with standard testing protocols, to accurately estimate the size of each population state. in this case the advantage of the node-based approach lies in the opportunity to adjust policies at the regional level, and thereby reduce the potentially unnecessary economic and social disruption of a country-wide lockdown, and strategically smooth the pressure on hospitals in different regions. we have designed and implemented an open-source network-based stochastic epidemic simulator that models the movement of a disease through the seir states of a populace. the simulator incorporates interactive controls of state transition probabilities and modification of environmental factors such as health-care capacity and mobility amongst regions. the interactive controls enable the dynamic introduction of events and the testing of policy options, allowing the evaluation of alternative scenarios and the projection of their potential impact over time, at both the national and regional levels.
the simulator was calibrated with recent data on covid-19 in the lombardy region of italy, which were used to fine-tune model assumptions regarding viral propagation and mortality rates in both normal and health-care over-capacity situations. the simulator was then configured to model the covid-19 outbreak in kazakhstan, using empirical observations and authoritative data on initial conditions provided by government sources, as part of a collaboration established with the express objective of informing policy deliberations. based on this republic of kazakhstan configuration, we successfully modeled the spread of the disease, and the impact of various stages of government policy decisions, such as closing schools, selectively quarantining regions, and the imposition of "lockdown" conditions (also known as "shelter in place"). the simulator was then used to evaluate a range of scenarios with the goal of minimizing strain on the health-care system while reducing negative social and economic impact, by seeking a balance between containment of disease spread and the imposition of less stringent social controls on a localized basis. on one extreme, the simulator indicates that a lack of mitigation will yield unmitigated disaster, with overwhelmed health-care systems and high rates of mortality. on the other extreme, persistent suppression (via quarantines and strict social distancing) will curtail disease propagation and reduce mortality, but also cause economic harm and instigate other forms of social distress. the projections of the simulator suggest that, going forward, it will be necessary to maintain some level of social controls, guided by a comprehensive testing regimen, so as to constrain propagation while minimizing social and economic impact, until more widespread immunity amongst the population (so-called "herd" immunity) is gradually achieved, or a vaccine is introduced.
- the effect of travel restrictions on the spread of the 2019 novel coronavirus (covid-19) outbreak
- covid-19 in italy: momentous decisions and many uncertainties
- response to covid-19 in taiwan: big data analytics, new technology, and proactive testing
- estimates of the reproduction number for seasonal, pandemic, and zoonotic influenza: a systematic review of the literature
- impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand
- quantifying sars-cov-2 transmission suggests epidemic control with digital contact tracing
- clinical course and risk factors for mortality of adult inpatients with covid-19 in wuhan, china: a retrospective cohort study
- moses: a matlab-based open-source stochastic epidemic simulator
- a contribution to the mathematical theory of epidemics
- the mathematics of infectious diseases
- stochastic epidemic models: a survey
- mathematical epidemiology: past, present, and future
- comparison of deterministic and stochastic sis and sir models in discrete time
- stochastic versus deterministic approaches
- nine challenges for deterministic epidemic models
- analysis and control of epidemics: a survey of spreading processes on complex networks
- modeling epidemics spreading on social contact networks
- spreading dynamics of an seir model with delay on scale-free networks
- modeling and control of an epidemic disease under possible complication
- epidemic simulation of h1n1 influenza virus using gis in south korea
- planning for infectious disease outbreaks: a geographic disease spread, clinic location, and resource allocation simulation
- an agent-based model of epidemic spread using human mobility and social network information
- particle filtering in a seirv simulation model of h1n1 influenza
- comparison of filtering methods for the modeling and retrospective forecasting of influenza epidemics
- official situation with coronavirus in kazakhstan
- publicly available software tools for decision-makers during an emergent epidemic-systematic evaluation of utility and usability
- the gleamviz computational tool, a publicly available software to explore realistic epidemic spreading scenarios at the global scale
- mathematical modelling of covid-19 transmission and mitigation strategies in the population of ontario, canada
- generic probabilistic modelling and non-homogeneity issues for the uk epidemic of covid-19
- modelling transmission and control of the covid-19 pandemic in australia
- the sars-cov-2 epidemic outbreak: a review of plausible scenarios of containment and mitigation for mexico
- data analysis and modeling of the evolution of covid-19 in brazil
- the bokeh visualization library
- dati covid-19 italia
- spread of sars-cov-2 in the icelandic population

key: cord-346309-hveuq2x9 authors: reis, ben y; kohane, isaac s; mandl, kenneth d title: an epidemiological network model for disease outbreak detection date: 2007-06-26 journal: plos med doi: 10.1371/journal.pmed.0040210 sha: doc_id: 346309 cord_uid: hveuq2x9
background: advanced disease-surveillance systems have been deployed worldwide to provide early detection of infectious disease outbreaks and bioterrorist attacks. new methods that improve the overall detection capabilities of these systems can have a broad practical impact. furthermore, most current generation surveillance systems are vulnerable to dramatic and unpredictable shifts in the health-care data that they monitor. these shifts can occur during major public events, such as the olympics, as a result of population surges and public closures. shifts can also occur during epidemics and pandemics as a result of quarantines, the worried-well flooding emergency departments or, conversely, the public staying away from hospitals for fear of nosocomial infection.
most surveillance systems are not robust to such shifts in health-care utilization, either because they do not adjust baselines and alert-thresholds to new utilization levels, or because the utilization shifts themselves may trigger an alarm. as a result, public-health crises and major public events threaten to undermine health-surveillance systems at the very times they are needed most. methods and findings: to address this challenge, we introduce a class of epidemiological network models that monitor the relationships among different health-care data streams instead of monitoring the data streams themselves. by extracting the extra information present in the relationships between the data streams, these models have the potential to improve the detection capabilities of a system. furthermore, the models' relational nature has the potential to increase a system's robustness to unpredictable baseline shifts. we implemented these models and evaluated their effectiveness using historical emergency department data from five hospitals in a single metropolitan area, recorded over a period of 4.5 y by the automated epidemiological geotemporal integrated surveillance real-time public health–surveillance system, developed by the children's hospital informatics program at the harvard-mit division of health sciences and technology on behalf of the massachusetts department of public health. we performed experiments with semi-synthetic outbreaks of different magnitudes and simulated baseline shifts of different types and magnitudes. the results show that the network models provide better detection of localized outbreaks, and greater robustness to unpredictable shifts than a reference time-series modeling approach. 
conclusions: the integrated network models of epidemiological data streams and their interrelationships have the potential to improve current surveillance efforts, providing better localized outbreak detection under normal circumstances, as well as more robust performance in the face of shifts in health-care utilization during epidemics and major public events. abbreviations: aegis, automated epidemiological geotemporal integrated surveillance; cusum, cumulative sum; ewma, exponential weighted moving average; sars, severe acute respiratory syndrome. * to whom correspondence should be addressed. e-mail: ben_reis@harvard.edu
understanding and monitoring large-scale disease patterns is critical for planning and directing public-health responses during pandemics [1] [2] [3] [4] [5]. in order to address the growing threats of global infectious disease pandemics such as influenza [6], severe acute respiratory syndrome (sars) [7], and bioterrorism [8], advanced disease-surveillance systems have been deployed worldwide to monitor epidemiological data such as hospital visits [9, 10], pharmaceutical orders [11], and laboratory tests [12]. improving the overall detection capabilities of these systems can have a wide practical impact. furthermore, it would be beneficial to reduce the vulnerability of many of these systems to shifts in health-care utilization that can occur during public-health emergencies such as epidemics and pandemics [13] [14] [15] or during major public events [16].
we need to be prepared for the shifts in health-care utilization that often accompany major public events, such as the olympics, caused by population surges or closures of certain areas to the public [16]. first, we need to be prepared for drops in health-care utilization under emergency conditions, including epidemics and pandemics where the public may stay away from hospitals for fear of being infected, as 66.7% reported doing during the sars epidemic in hong kong [13]. similarly, a detailed study of the greater toronto area found major drops in numerous types of health-care utilization during the sars epidemic, including emergency department visits, physician visits, inpatient and outpatient procedures, and outpatient diagnostic tests [14]. second, the "worried well" (those wrongly suspecting that they have been infected) may flood hospitals, not only stressing clinical resources but also dramatically shifting the baseline from its historical pattern, potentially obscuring a real signal [15]. third, public-health interventions such as closures, quarantines, and travel restrictions can cause major changes in health-care utilization patterns. such shifts threaten to undermine disease-surveillance systems at the very times they are needed most. during major public events, the risks and potential costs of bioterrorist attacks and other public-health emergencies increase. during epidemics, as health resources are already stretched, it is important to maintain disease outbreak-surveillance capabilities and situational awareness [4, 5]. at present, many disease-surveillance systems rely either on comparing current counts with historical time-series models, or on identifying sudden increases in utilization (e.g., cumulative sum [cusum] or exponential weighted moving average [ewma]) [9, 10].
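a minimal sketch of the kind of change-point statistic alluded to here, assuming a one-sided cusum with illustrative slack (k) and threshold (h) values; it also illustrates the vulnerability discussed in the text, since a broad surge in raw counts raises the statistic just as an outbreak would.

```python
def cusum_alarms(counts, mean, k=1.0, h=5.0):
    """one-sided cusum: accumulate positive deviations from the expected
    mean (less a slack k) and alarm when the statistic exceeds h.  a
    broad utilization surge raises `counts` directly, so the surge
    itself can trip the alarm; that is the weakness discussed above."""
    s, alarms = 0.0, []
    for c in counts:
        s = max(0.0, s + (c - mean - k))
        alarms.append(s > h)
    return alarms
```

for example, a flat series at the expected mean never alarms, while a sustained jump of a few counts per day trips the statistic within a couple of days, whether the jump is an outbreak or a utilization shift.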
these approaches are not robust to major shifts in health-care utilization: systems based on historical time-series models of health-care counts do not adjust their baselines and alert-thresholds to the new unknown utilization levels, while systems based on identifying sudden increases in utilization may be falsely triggered by the utilization shifts themselves. in order to both improve overall detection performance and reduce vulnerability to baseline shifts, we introduce a general class of epidemiological network models that explicitly capture the relationships among epidemiological data streams. in this approach, the surveillance task is transformed from one of monitoring health-care data streams, to one of monitoring the relationships among these data streams: an epidemiological network begins with historical time-series models of the ratios between each possible pair of data streams being monitored. (as described in discussion, it may be desirable to model only a selected subset of these ratios.) these ratios do not remain at a constant value; rather, we assume that these ratios vary in a predictable way according to seasonal and other patterns that can be modeled. the ratios predicted by these historical models are compared with the ratios observed in the actual data in order to determine whether an aberration has occurred. the complete approach is described in detail below. these network models have two primary benefits. first, they take advantage of the extra information present in the relationships between the monitored data streams in order to increase overall detection performance. second, their relational nature makes them more robust to the unpredictable shifts described above, as illustrated by the following scenario. the olympics bring a large influx of people into a metropolitan area for 2 wk and cause a broad surge in overall health-care utilization. in the midst of this surge, a localized infectious disease outbreak takes place. 
the surge in overall utilization falsely triggers the alarms of standard biosurveillance models and thus masks the actual outbreak. on the other hand, since the surge affects multiple data streams similarly, the relationships between the various data streams are not affected as much by the surge. since the network model monitors these relationships, it is able to ignore the surge and thus detect the outbreak. our assumption is that broad utilization shifts would affect multiple data streams in a similar way, and would thus not significantly affect the ratios among these data streams. in order to validate this assumption, we need to study the stability of the ratios around real-world surges. this assessment is difficult to do, since for most planned events, such as the olympics, additional temporary health-care facilities are set up at the site of the event in order to deal with the expected surge. this preparation reduces or eliminates the surge that is recorded by the permanent health-care system, and therefore makes it hard to find data that describe surges. however, some modest shifts do appear in the health-care utilization data, and they are informative. we obtained data on the 2000 sydney summer olympics directly from the centre for epidemiology and research, new south wales department of health, new south wales emergency department data collection. the data show a 5% surge in visits during the olympics. while the magnitude of this shift is far less dramatic than those expected in a disaster, the sydney olympics nonetheless provide an opportunity to measure the stability of the ratios under surge conditions. despite the surge, the relative rates of major syndromic groups remained very stable between the same periods in 1999 and 2000. injury visits accounted for 21.50% of overall visits in 1999, compared with an almost identical 21.53% in 2000. gastroenteritis visits accounted for 5.84% in 1999, compared with 5.75% in 2000. 
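the intuition that a broad, uniform surge cancels out of the pairwise ratios can be checked with a toy example; the counts below are hypothetical, not the sydney data.

```python
# hypothetical daily syndromic counts at one hospital (not the sydney data)
respiratory = [120, 130, 125]
gastro = [40, 44, 42]

baseline_ratios = [r / g for r, g in zip(respiratory, gastro)]

# a broad 30% surge that multiplies both streams equally...
surge = 1.3
surged_ratios = [(r * surge) / (g * surge)
                 for r, g in zip(respiratory, gastro)]

# ...leaves every pairwise ratio unchanged
assert all(abs(a - b) < 1e-12
           for a, b in zip(baseline_ratios, surged_ratios))
```

a localized outbreak, by contrast, raises one stream without raising its context streams, so the affected ratios move while the surge-driven ones do not.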
as shown in table 1, the resulting ratios among the different syndromic groups remained stable. although we would have liked to examine the stability of ratios in the face of a larger surge, we were not able to find a larger surge for which multi-year health-care utilization data were available. it is important to note that while the above data about a planned event are informative, surveillance systems need to be prepared for the much larger surges that would likely accompany unplanned events, such as pandemics, natural disasters, or other unexpected events that cause large shifts in utilization. initial motivation for this work came from the authors' experience advising the hellenic center for infectious diseases control in advance of the 2004 summer olympics in athens [17], where there was concern that a population surge caused by the influx of a large number of tourists would significantly alter health-care utilization patterns relative to the baseline levels recorded during the previous summer. the epidemiological network model was then formalized in the context of the us centers for disease control and prevention's nationwide biosense health-surveillance system [18], for which the authors are researching improved surveillance methods for integration of inputs from multiple health-care data streams. biosense collects and analyzes health-care utilization data, which have been made anonymous, from a number of national data sources, including the department of defense and the veterans administration, and is now procuring local emergency department data sources from around the united states. in order to evaluate the practical utility of this approach for surveillance, we constructed epidemiological network models based on real-world historical health-care data and compared their outbreak-detection performance to that of standard historical models.
the models were evaluated using semi-synthetic data streams (real background data with injected outbreaks), both under normal conditions and in the presence of different types of baseline shifts. the proposed epidemiological network model is compared with a previously described reference time-series model [19]. both models are used to detect simulated outbreaks introduced into actual historical daily counts for respiratory-related visits, gastrointestinal-related visits, and total visits at five emergency departments in the same metropolitan area. the data cover a period of 1,619 d, or roughly 4.5 y. the first 1,214 d are used to train the models, while the final 405 d are used to test their performance. the data are collected by the automated epidemiological geotemporal integrated surveillance (aegis) real-time public health-surveillance system, developed by the children's hospital informatics program at the harvard-mit division of health sciences and technology on behalf of the massachusetts department of public health. aegis fully automates the monitoring of emergency departments across massachusetts. the system receives automatic updates from the various health-care facilities and performs outbreak detection, alerting, and visualization functions for public-health personnel and clinicians. the aegis system incorporates both temporal and geospatial approaches for outbreak detection. the goal of an epidemiological network model is to model the historical relationships among health-care data streams and to interpret newly observed data in the context of these modeled relationships. in the training phase, we construct time-series models of the ratios between all possible pairs of health-care utilization data streams. these models capture the weekly, seasonal, and long-term variations in these ratios. in the testing phase, the actual observed ratios are compared with the ratios predicted by the historical models.
we begin with n health-care data streams, s_i, each describing daily counts of a particular syndrome category at a particular hospital. for this study, we use three syndromic categories (respiratory, gastrointestinal, and total visits) at five hospitals, for a total of n = 15 data streams. all possible pair-wise ratios are calculated among these n data streams, for a total of n^2 - n = 210 ratios, r_ij: for each day, t, we calculate the ratio of the daily counts for stream s_i to the daily counts for stream s_j,

r_ij(t) = s_i(t) / s_j(t)

for each ratio, the numerator s_i is called the target data stream, and the denominator s_j is called the context data stream, since the target data stream is said to be interpreted in the context of the context data stream, as described below. a sample epidemiological network consisting of 30 nodes and 210 edges is shown in figure 1. the nodes in the network represent the data streams: each of the n data streams appears twice, once as a context data stream and another time as a target data stream. edges represent ratios between data streams: namely, the target data stream divided by the context data stream. to train the network, a time-series model, r̂_ij, is fitted for each ratio, r_ij, over the training period using established time-series methods [19]. the data are first smoothed with a 7-d exponential filter (ewma with coefficient 0.5) to reduce the effects of noise [20]. the linear trend is calculated and subtracted out, then the overall mean is calculated and subtracted out, then the day-of-week means (seven values) are calculated and subtracted out, and finally the day-of-year means (365 values) are calculated. in order to generate predictions from this model, these four components are summed, using the appropriate values for day of the week, day of the year, and trend. the difference between each actual ratio, r_ij, and its corresponding modeled prediction, r̂_ij, is the error, e_ij.
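a simplified sketch of the fitting procedure just described, assuming numpy; for brevity the day-of-year component is omitted (it is computed exactly like the day-of-week means, with 365 values instead of seven), and the function names are our own.

```python
import numpy as np

def ewma_smooth(series, alpha=0.5):
    """7-d exponential filter (ewma, coefficient 0.5) applied before fitting."""
    out, s = [], float(series[0])
    for x in series:
        s = alpha * float(x) + (1 - alpha) * s
        out.append(s)
    return np.array(out)

def fit_ratio_model(ratio, dow):
    """fit the additive components described in the text: linear trend,
    overall mean, and day-of-week means.  `ratio` is the daily ratio
    series, `dow` the day-of-week index (0-6) for each day."""
    t = np.arange(len(ratio))
    slope = np.polyfit(t, ratio, 1)[0]          # linear trend
    detrended = ratio - slope * t
    overall = detrended.mean()                  # overall mean
    residual = detrended - overall
    dow_means = np.array([residual[dow == d].mean() for d in range(7)])
    return slope, overall, dow_means

def predict_ratio(model, t, dow):
    """sum the fitted components for day t falling on day-of-week `dow`."""
    slope, overall, dow_means = model
    return slope * t + overall + dow_means[dow]
```

fitting this to a series with a known weekly pattern and then predicting one day ahead recovers the pattern, which is all the error term e_ij then measures departures from.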
during network operation, the goal of the network is to determine the extent to which the observed ratios among the data streams differ from the ratios predicted by the historical models. observed ratios, r_ij, are calculated from the observed data, and are compared with the expected ratios to yield the observed errors, e_ij:

e_ij(t) = r_ij(t) - r̂_ij(t)    (3)

in order to interpret the magnitudes of these deviations from the expected values, the observed errors are compared with the historical errors from the training phase. a nonparametric approach is used to rank the current error against the historical errors. this rank is divided by the maximum rank (1 + the number of training days), resulting in a value between 0 and 1, which is the individual aberration score, w_ij. conceptually, each of the individual aberration scores, w_ij, represents the interpretation of the activity of the target data stream, s_i, from the perspective of the activity at the context data stream, s_j: if the observed ratio between these two data streams is exactly as predicted by the historical model, e_ij is equal to 0 and w_ij takes a moderate value. if the target data stream is higher than expected, e_ij is positive and w_ij is a higher value, closer to 1. if it is lower than expected, e_ij is negative and w_ij is a lower value, closer to 0. high aberration scores, w_ij, are represented by thicker edges in the network visualization, as shown in figure 1. some ratios are more unpredictable than others, i.e., they have a greater amount of variability that is not accounted for by the historical model, and thus a greater modeling error. the nonparametric approach to evaluating aberrations adjusts for this variability by interpreting a given aberration in the context of all previous aberrations for that particular ratio during the training period.
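the rank-based scoring step can be sketched as follows (the function name and the tie-handling convention are our own assumptions): rank the observed error against the training-period errors and divide by the maximum possible rank, 1 + the number of training days.

```python
def aberration_score(observed_error, training_errors):
    """nonparametric aberration score w_ij: rank the observed error
    against the training-period errors, then divide by the maximum
    possible rank (1 + number of training days).  errors strictly
    below the observation count toward the rank; ties are broken
    downward, an illustrative convention."""
    rank = sum(1 for e in training_errors if e < observed_error) + 1
    return rank / (1 + len(training_errors))
```

an error at the historical median scores near 0.5, an error above every training error scores near 1, and one below them all scores near 0, matching the interpretation in the text.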
it is important to note that each individual aberration score, w_ij, can be affected by the activities of both its target and context data streams. for example, it would be unclear from a single high w_ij score whether the target data stream is unexpectedly high or the context data stream is unexpectedly low. in order to obtain an integrated consensus view of a particular target data stream, s_i, an integrated consensus score, c_i, is created by averaging together all the aberration scores that have s_i as the target data stream (i.e., in the numerator of the ratio). this integrated score represents the collective interpretation of the activity at the target node, from the perspective of all the other nodes:

c_i = (1 / (n - 1)) Σ_{j ≠ i} w_ij

an alarm is generated whenever c_i is greater than a threshold value c_thresh. as described below, this threshold value is chosen to achieve a desired specificity. the nonparametric nature of the individual aberration scores addresses the potential issue of outliers that would normally arise when taking an average. it is also important to note that while the integrated consensus score helps to reduce the effects of fluctuations in individual context data streams, it is still possible for an extreme drop in one context data stream to trigger a false alarm in a target data stream. this is particularly true in networks having few context data streams; in the case of only one context data stream, a substantial decrease in the count of the context data stream will trigger a false alarm in the target data stream.

[figure 1 caption: each data stream appears twice in the network. the context nodes on the left are used for interpreting the activity of the target nodes on the right. each edge represents the ratio of the target node divided by the context node, with a thicker edge indicating that the ratio is higher than expected. doi:10.1371/journal.pmed.0040210.g001]
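the consensus step, averaging row i of the score matrix and comparing against a threshold, can be sketched as follows (a minimal illustration; the matrix layout is our assumption):

```python
import numpy as np

def consensus_score(w, i):
    """average all individual aberration scores w_ij that have stream i
    as the target (row i of the score matrix, excluding the diagonal),
    giving the integrated consensus score c_i."""
    n = w.shape[0]
    others = [j for j in range(n) if j != i]
    return w[i, others].mean()

def alarms(w, c_thresh):
    """return the indices of all target streams whose consensus score
    exceeds the threshold c_thresh."""
    n = w.shape[0]
    return [i for i in range(n) if consensus_score(w, i) > c_thresh]
```

because each w_ij is already a rank in (0, 1], a single outlying context stream shifts the average only modestly, which is the outlier-robustness property noted in the text.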
for comparison, we also implement a reference time-series surveillance approach that models each health-care data stream directly, instead of modeling the relationships between data streams as above. this model uses the same time-series modeling methods described above and previously [19]. first, the daily counts are smoothed with a 7-d exponential filter. the linear trend is calculated and subtracted out, then the overall mean is calculated and subtracted out, and then the mean for each day of the week (seven values) is calculated and subtracted out. finally, the mean for each day of the year (365 values) is calculated and subtracted out. to generate a prediction, these four components are added together, taking the appropriate values for the particular day of the week and day of the year. the difference between the observed daily counts and the counts predicted by the model is the aberration score for that data stream. an alarm is generated whenever this aberration score is greater than a threshold value, chosen to achieve a desired level of specificity, as described below. by employing identical time-series methods for modeling the relationships between the streams in the network approach and modeling the actual data streams themselves in the reference approach, we are able to perform a controlled comparison between the two approaches. following established methods [19-21], we use semisynthetic localized outbreaks to evaluate the disease-monitoring capabilities of the network. the injected outbreaks used here follow a 7-d lognormal temporal distribution (figure 2), representing the epidemiological distribution of incubation times resulting from a single-source common-vehicle infection, as described by sartwell [22].
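an injected outbreak of this shape can be sketched as below; the text specifies only a 7-d lognormal curve, so the mu and sigma values here are illustrative assumptions, not the paper's parameters:

```python
import numpy as np

def lognormal_outbreak(total_cases, duration=7, sigma=0.5):
    """spread `total_cases` extra visits over `duration` days following
    a discretized lognormal curve, mimicking the sartwell-style
    incubation distribution used for the semisynthetic outbreaks."""
    days = np.arange(1, duration + 1)
    # lognormal density at each day, then normalized to sum to 1;
    # mu is chosen (assumption) so the curve peaks early in the window
    mu = np.log(duration / 3.0)
    pdf = np.exp(-(np.log(days) - mu) ** 2 / (2 * sigma ** 2)) / days
    weights = pdf / pdf.sum()
    return total_cases * weights
```

the returned daily increments would be added to the target data stream (and, as described below, to the corresponding total-visits stream) on consecutive test days.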
when injecting outbreaks into either respiratory- or gastrointestinal-related data streams, the same number of visits is also added to the appropriate total-visits data stream for that hospital in order to maintain consistency. multiple simulation experiments are performed, varying the number of data streams used in the network, the target data stream, s_i, into which the outbreaks are introduced, and the magnitude of the outbreaks. while many additional outbreak types are possible, the simulated outbreaks used here serve as a paradigmatic set of benchmark stimuli for gauging the relative outbreak-detection performance of the different surveillance approaches. we constructed epidemiological networks from respiratory, gastrointestinal, and total daily visit data from five hospitals in a single metropolitan area, for a total of 15 data streams, s_i (n = 15). in training the network, we modeled all possible pair-wise ratios between the 15 data streams, for a total of 210 ratios. for comparison, we implemented the reference time-series surveillance model described above, which uses the same time-series methods but models the 15 data streams directly instead of modeling the epidemiological relationships. semisynthetic simulated outbreaks were used to evaluate the aberration-detection capabilities of the network, as described above. we simulated outbreaks across a range of magnitudes occurring at any one of the 15 data streams. for the first set of experiments, 486,000 tests were performed: 15 target data streams × 405 d of the testing period × 40 outbreak sizes (with a peak magnitude increase ranging from 2.5% to 100.0%) × two models (network versus reference). for the purposes of systematic comparison between the reference and network models, we allowed for the addition of fractional cases in the simulations. we compared the detection sensitivities of the reference and network models by fixing specificity at a benchmark 95% and measuring the sensitivity of each model.
in order to measure sensitivity at a desired specificity, we increased the alarm threshold incrementally from 0 to the maximum value until the desired specificity was reached, then measured the sensitivity at that threshold. sensitivity is defined in terms of outbreak-days: the proportion of days during which outbreaks were occurring on which an alarm was generated. at 95% specificity, the network approach significantly outperformed the reference approach in detecting respiratory and gastrointestinal outbreaks, yielding 4.9% ± 1.9% and 6.0% ± 2.0% absolute increases in sensitivity, respectively (representing 19.1% and 34.1% relative improvements in sensitivity, respectively), for outbreaks characterized by a 37.5% increase on the peak day of the outbreak (table 2). we found this ordering of sensitivities to be consistent over the range of outbreak sizes. for outbreaks introduced into the total-visit signals, the reference model achieved 2.1% ± 2% better absolute sensitivity than the network model (2.9% difference in relative sensitivity). this result is likely because the total-visit signals are much larger in absolute terms, and therefore the signal-to-noise ratio is higher (table 3), making it easier for the reference model to detect the outbreaks. the "total outbreak" experiments were run for reasons of comprehensiveness, but it should be noted that there is no clear epidemiological correlate to an outbreak that affects all syndrome groups, other than a population surge, which the network models are designed to ignore as described in the discussion section. also, an increase in total visits without an increase in respiratory or gastrointestinal visits may correspond to an outbreak in yet another syndrome category. table 2 also shows results for the same experiments at three other practical specificity levels, and an average over all four specificity levels.
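the threshold-sweep evaluation described at the start of this passage can be sketched as follows (a minimal version of the procedure; the function name and data layout are our assumptions):

```python
import numpy as np

def sensitivity_at_specificity(scores, is_outbreak_day, target_spec=0.95):
    """sweep the alarm threshold upward until the desired specificity is
    reached on non-outbreak days, then report the sensitivity (fraction
    of outbreak-days on which an alarm fires) at that threshold."""
    scores = np.asarray(scores, dtype=float)
    is_outbreak_day = np.asarray(is_outbreak_day, dtype=bool)
    neg = scores[~is_outbreak_day]   # normal days
    pos = scores[is_outbreak_day]    # outbreak-days
    for thresh in np.sort(np.unique(scores)):
        specificity = np.mean(neg <= thresh)  # no alarm on normal days
        if specificity >= target_spec:
            return float(np.mean(pos > thresh))
    return 0.0
```

fixing specificity at the same benchmark for both models is what makes the sensitivity comparison between the network and reference approaches controlled.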
in all cases, the network approach performs better for respiratory and gastrointestinal outbreaks and the reference model performs better for total-visit outbreaks. by visually inspecting the response of the network model to the outbreaks, it can be seen that while the individual aberration scores exhibited fairly noisy behavior throughout the testing period (figure 3), the integrated consensus scores consolidated the information from the individual aberration scores, reconstructing the simulated outbreaks presented to the system (figure 4). next, we studied the effects of different network compositions on detection performance, constructing networks of different sizes and constituent data streams (figure 5). for each target data stream, we created 77 different homogeneous context networks, i.e., networks containing the target data stream plus between one and five additional data streams of a single syndromic category. in total, 1,155 networks were created and analyzed (15 target data streams × 77 networks). we then introduced into the target data stream of each network simulated outbreaks characterized by a 37.5% increase in daily visit counts over the background counts on the peak day of the outbreak, and calculated the sensitivity obtained from all the networks having particular size and membership characteristics, at a fixed benchmark specificity of 95%. in total, 467,775 tests were performed (1,155 networks × 405 d). we found that detection performance generally increased with network size (figure 6). furthermore, regardless of which data stream contained the outbreaks, total-visit data streams provided the best context for detection. this is consistent with the greater statistical stability of the total-visits data streams, which on average had far smaller variability (table 3).
total data streams were also the easiest target data streams in which to detect outbreaks, followed by respiratory data streams, and then by gastrointestinal data streams. this result is likely because the number of injected cases is a constant proportion of stream size; for a constant number of injected cases, total data streams would likely be the hardest target data streams for detection. next, we systematically compared the performance advantage gained from five key context groups. for a respiratory target signal, the five groups were as follows: (1) total visits at the same hospital; (2) total visits at all other hospitals; (3) gastrointestinal visits at the same hospital; (4) gastrointestinal visits at all other hospitals; and (5) respiratory visits at all other hospitals. if the target signal comprised gastrointestinal or total visits, the five context groups above were changed accordingly, as detailed in figures 7-9. given the possibility of either including or excluding each of these five groups, there were 31 (2^5 - 1) possible networks for each target signal. the results of the above analysis are shown for respiratory (figure 7), gastrointestinal (figure 8), and total-visit target signals (figure 9). each row represents a different network construction. rows are ranked by the average sensitivity achieved over the five possible target signals for that table. the following general trends are apparent. total visits at all the other hospitals were the most helpful context group overall. given a context of all the streams from the same hospital, it is beneficial to add total visits from the other hospitals, as well as the same syndrome group from the other hospitals. beginning with a context of total visits from the same hospital, there is a slight additional advantage in including a different syndrome group from the same hospital.
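the 2^5 - 1 = 31 network constructions compared here are simply every non-empty subset of the five context groups; a short sketch (the group labels are ours):

```python
from itertools import combinations

def context_networks(groups):
    """enumerate every non-empty subset of the context groups, giving
    the 2^k - 1 network constructions compared in the text (31 when
    there are five groups)."""
    nets = []
    for k in range(1, len(groups) + 1):
        nets.extend(combinations(groups, k))
    return nets
```

each subset defines one row of the ranking tables in figures 7-9; ranking the rows by average sensitivity is then a matter of scoring each subset with the evaluation procedure above.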
in order to gauge the performance of the network and reference models in the face of baseline shifts in health-care utilization, we performed a further set of simulation experiments in which, in addition to the simulated outbreaks of peak magnitude 37.5%, we introduced various types and magnitudes of baseline shifts for a period of 200 d in the middle of the 405-d testing period. we compared the performance of the reference time-series model, the complete network model, and a network model containing only total-visit nodes. for respiratory and gastrointestinal outbreaks, we also compared the performance of a two-node network containing only the target data stream and the total-visit data stream from the same hospital. we began by simulating the effects of a large population surge, such as might be seen during a large public event, by introducing a uniform increase across all data streams for 200 d in the middle of the testing period. we found that the detection performance of the reference model degraded rapidly with increasing baseline shifts, while the performance of the various network models remained stable (figure 10). we next simulated the effects of a frightened public staying away from hospitals during an epidemic by introducing uniform drops across all data streams for 200 d. here too, we found that the detection performance of the reference model degraded rapidly with increasing baseline shifts, while the performance of the various network models remained robust (figure 11). we then simulated the effects of the "worried-well" on a surveillance system by introducing targeted increases in only one syndromic category, respiratory or gastrointestinal (figure 12). we compared the performance of the reference model, a full-network model, the two-node networks described above, and a homogeneous network model containing only data streams of the same syndromic category as the target data stream.
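the shift-injection step for the surge and frightened-public experiments can be sketched as a uniform multiplicative change applied to every stream over a 200-d window; the array layout (streams × days) is our assumption:

```python
import numpy as np

def apply_baseline_shift(streams, shift_frac, start, length=200):
    """apply a uniform baseline shift to every data stream for `length`
    days starting at day `start`: shift_frac = +0.25 models a 25%
    population surge, shift_frac = -0.25 a frightened public."""
    shifted = np.array(streams, dtype=float, copy=True)
    shifted[:, start:start + length] *= (1.0 + shift_frac)
    return shifted
```

a targeted shift (the worried-well experiment) would apply the same multiplier to only the rows belonging to one syndromic category rather than to all streams.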
the performance of the full and homogeneous networks was superior to that of the reference model. the homogeneous networks, consisting solely of respiratory or gastrointestinal data streams, proved robust to the targeted shifts and achieved consistent detection performance even in the face of large shifts. this result is consistent with all the ratios in these networks being affected equally by the targeted baseline shifts. the performance of the full network degraded slightly in the face of larger shifts, while the performance of the two-node network degraded more severely; this is because the two-node network did not include relationships unaffected by the shifts that could help stabilize performance. it should be noted that this same phenomenon, an increase in one syndromic category across multiple locations, may also be indicative of a widespread outbreak, as discussed further below. in this paper, we describe an epidemiological network model that monitors the relationships between health-care utilization data streams for the purpose of detecting disease outbreaks. results from simulation experiments show that these models deliver improved outbreak-detection performance under normal conditions compared with a standard reference time-series model. furthermore, the network models are far more robust than the reference model to the unpredictable baseline shifts that may occur around epidemics or large public events. the results also show that epidemiological relationships are inherently valuable for surveillance: the activity at one hospital can be better understood by examining it in relation to the activity at other hospitals. in a previous paper [20], we showed the benefits of interpreting epidemiological data in its temporal context, namely, the epidemiological activity on surrounding days [23].
in the present study, we show that it is also beneficial to examine epidemiological data in its network context, i.e., the activity of related epidemiological data streams. based on the results obtained, it is clear that different types of networks are useful for detecting different types of signals. we present eight classes of signals, their possible interpretations, and the approaches that would be able to detect them. the first four classes of signals involve increases in one or more data streams. (1) a rise in one syndrome group at a single location may correspond to a localized outbreak or simply a data irregularity. such a signal could be detected by all network models as well as the reference model. (2) a rise in all syndrome groups at a single location probably corresponds to a geographical shift in utilization (e.g., a quarantine elsewhere), as an outbreak would not be expected to cause an increase in all syndrome groups. such a signal would be detected by network models that include multiple locations, and by the reference model. (3) a rise in one syndrome group across all locations may correspond to a widespread outbreak or may similarly result from visits by the "worried-well." such a signal would be detected by network models that include multiple syndrome groups, and by the reference model. (4) a rise in all syndrome groups in all locations probably corresponds to a population surge, as an outbreak would not be expected to cause an increase in all syndrome groups. this signal would be ignored by all network models, but would be detected by the reference model. the next four classes of signals involve decreases in one or more data streams. all of these signals are unlikely to be indicative of an outbreak, but are important for maintaining situational awareness in certain critical situations. as mentioned above, a significant decrease in a context data stream has the potential to trigger a false alarm in the target data stream, especially in networks with few context nodes. this is particularly true in two-node networks, where there is only one context data stream. (5) a fall in one syndrome group at a single location does not have an obvious interpretation. all models will ignore such a signal, since they are set to alarm on increases only. (6) a fall in all syndrome groups at a single location could represent a geographical shift in utilization (e.g., a local quarantine). all models will ignore such a signal. the baselines of all models will be affected, except for network models that include only nodes from single locations. (7) a fall in one syndrome group at all locations may represent a frightened public. all models will ignore such a signal.

[figure 10 caption: simulation of a population surge during a large public event. to simulate a population surge during a large public event, all data streams are increased by a uniform amount (x-axis) for 200 d in the middle of the testing period. full networks, total-visit networks, two-node networks (target data stream and total visits at the same hospital), and reference models are compared. average results are shown for each target data stream type. error bars are standard errors. doi:10.1371/journal.pmed.0040210.g010]

[figure 11 caption: simulation of a frightened public staying away from hospitals during a pandemic. to simulate a frightened public staying away from hospitals during a pandemic, all data streams are dropped by a uniform amount (x-axis) for 200 d in the middle of the testing period. full networks, total-visit networks, two-node networks (target data stream and total visits at the same hospital), and reference models are compared. average results are shown for each target data stream type. error bars are standard errors. doi:10.1371/journal.pmed.0040210.g011]
the baselines of all models will be affected, except for network models that include only nodes from single syndromic groups. (8) a fall in all data types at all locations may represent a regional population decrease or a frightened public staying away from hospitals out of concern for nosocomial infection (e.g., during an influenza pandemic). all models will ignore such a signal; the baseline of only the reference model will be affected. from this overview, it is clear that the network models are more robust than the reference model, with fewer false alarms (in scenarios 2 and 4) and less vulnerability to irregularities in baselines (in scenarios 6-8). based on the results obtained, when constructing epidemiological networks for monitoring a particular epidemiological data stream, we recommend prioritizing the inclusion of total visits from all other hospitals, followed by total visits from the same hospital, followed by data streams of the same syndrome group from other hospitals and streams of different syndrome groups from the same hospital, followed by data streams of different syndrome groups from different hospitals. we further recommend that, in addition to full-network models, homogeneous network models (e.g., only respiratory nodes from multiple hospitals) be maintained for greater stability in the face of major targeted shifts in health-care utilization. the two-node networks described above are similar in certain ways to the "rate"-based approach used by a small number of surveillance systems today [24-27]. instead of monitoring daily counts directly, these systems monitor daily counts as a proportion of the total counts. for example, the respiratory-related visits at a certain hospital could be tracked as a percentage of the total number of visits to that hospital, or alternatively, as a percentage of the total number of respiratory visits in the region.
these "rate"-based approaches have been proposed where absolute daily counts are too unstable for modeling [24], or where population-at-risk numbers are not available for use in spatiotemporal scan statistics [25]. the approach presented here is fundamentally different in that it explicitly models and tracks all possible inter-data-stream relationships, not just those between a particular data stream and its corresponding total-visits data stream. furthermore, the present approach is motivated by the desire to increase robustness in the face of large shifts in health-care utilization that may occur during epidemics or major public events. as such, this study includes a systematic study of the models' responses to different magnitudes of both broad and targeted baseline shifts. the two-node networks described above are an example of this general class of "rate"-based models. while the two-node approach works well under normal conditions, it is not as robust to targeted shifts in health-care utilization as larger network models. the results therefore show that there is value in modeling all, or a selected combination, of the relationships among health-care data streams, not just the relationship between a data stream and its corresponding total-visits data stream. modeling all these relationships involves an order-n expansion of the number of models maintained internally by the system: n^2 - n models are used to monitor n data streams. the additional information inherent in this larger space is extracted to improve detection performance, after which the individual model outputs are collapsed back to form the n integrated outputs of the system. since the number of models grows quadratically with the number of data streams, n, the method can become computationally intensive for large numbers of streams. in such a case, the number of models could be reduced by, for example, constructing only networks that include nodes from different syndrome groups but from the same hospital, or alternatively, including all context nodes from the same hospital and only total-visit nodes from other hospitals. this work is different from other recent epidemiological research that has described simulated contact networks of individual people moving about in a regional environment and transmitting infectious diseases from one person to another. those simulations model the rate of spread of an infection under various conditions and interventions and help prepare for emergency scenarios by evaluating different health policies. we, on the other hand, studied relational networks of hospitals monitoring health-care utilization in a regional environment, for the purpose of detecting localized outbreaks in a timely fashion and maintaining situational awareness under various conditions. our work is also focused on generating an integrated network view of an entire health-care environment. limitations of this study include the use of simulated infectious disease outbreaks and baseline shifts. we use a realistic outbreak shape and baseline shift pattern, and perform simulation experiments varying the magnitudes of both. while other outbreak shapes and baseline shift patterns are possible, this approach allows us to create a paradigmatic set of conditions for evaluating the relative outbreak-detection performance of the various approaches [21].

[figure 12 caption: simulation of the effects of the worried-well flooding hospitals during a pandemic. to simulate the effects of the worried-well flooding hospitals during a pandemic, a targeted rise is introduced in only one type of data stream. full networks, respiratory- or gastrointestinal-only networks, two-node networks, and reference models are compared. error bars are standard errors. doi:10.1371/journal.pmed.0040210.g012]
another possible limitation is that even though our findings are based on data across multiple disease categories (syndromes), multiple hospitals, and multiple years, relationships between epidemiological data streams may be different in other data environments. also, our methods are focused on temporal modeling, and therefore do not have an explicit geospatial representation of patient location, even though grouping the data by hospital does preserve a certain degree of geospatial information. the specific temporal modeling approach used requires a solid base of historical data for the training set. however, this modeling approach is not integral to the network strategy, and one could build an operational network using other temporal modeling approaches. furthermore, as advanced disease-surveillance systems grow to monitor an increasing number of data streams, the risk of information overload increases. to address this problem, attempts to integrate information from multiple data streams have largely focused on detecting the multiple effects of a single outbreak across many data streams [28-31]. the approach described here is fundamentally different in that it focuses on detecting outbreaks in one data stream by monitoring fluctuations in its relationships to the other data streams, although it can also be used for detecting outbreaks that affect multiple data streams. we recommend using the network approaches described here alongside current approaches to realize the complementary benefits of both. these findings suggest areas for future investigation. there are inherent time lags among epidemiological data streams: for example, pediatric data have been found to lead adult data in respiratory visits [32]. while the approach described here may implicitly model these relative time lags, future approaches could include explicit modeling of relative temporal relationships among data streams.
it is also possible to develop this method further to track outbreaks spanning multiple hospitals and syndrome groups, and to study the effects of different network approaches on timeliness of detection. also, while we show the utility of the network approach for monitoring disease patterns on a regional basis, networks constructed from national or global data may help reveal important trends at wider scales. editors' summary. background: the main task of public-health officials is to promote health in communities around the world. to do this, they need to monitor human health continually, so that any outbreaks (epidemics) of infectious diseases (particularly global epidemics or pandemics) or any bioterrorist attacks can be detected and dealt with quickly. in recent years, advanced disease-surveillance systems have been introduced that analyze data on hospital visits, purchases of drugs, and the use of laboratory tests to look for tell-tale signs of disease outbreaks. these surveillance systems work by comparing current data on the use of health-care resources with historical data or by identifying sudden increases in the use of these resources. so, for example, more doctors asking for tests for salmonella than in the past might presage an outbreak of food poisoning, and a sudden rise in people buying over-the-counter flu remedies might indicate the start of an influenza pandemic. why was this study done? existing disease-surveillance systems don't always detect disease outbreaks, particularly in situations where there are shifts in the baseline patterns of health-care use. for example, during an epidemic, people might stay away from hospitals because of the fear of becoming infected, whereas after a suspected bioterrorist attack with an infectious agent, hospitals might be flooded with "worried well" (healthy people who think they have been exposed to the agent).
baseline shifts like these might prevent the detection of increased illness caused by the epidemic or the bioterrorist attack. localized population surges associated with major public events (for example, the olympics) are also likely to reduce the ability of existing surveillance systems to detect infectious disease outbreaks. in this study, the researchers developed a new class of surveillance systems called "epidemiological network models." these systems aim to improve the detection of disease outbreaks by monitoring fluctuations in the relationships between information detailing the use of various health-care resources over time (data streams). what did the researchers do and find? the researchers used data collected over a 3-y period from five boston hospitals on visits for respiratory (breathing) problems and for gastrointestinal (stomach and gut) problems, and on total visits (15 data streams in total), to construct a network model that included all the possible pair-wise comparisons between the data streams. they tested this model by comparing its ability to detect simulated disease outbreaks implanted into data collected over an additional year with that of a reference model based on individual data streams. the network approach, they report, was better at detecting localized outbreaks of respiratory and gastrointestinal disease than the reference approach. to investigate how well the network model dealt with baseline shifts in the use of health-care resources, the researchers then added in a large population surge. the detection performance of the reference model decreased in this test, but the performance of the complete network model and of models that included relationships between only some of the data streams remained stable. finally, the researchers tested what would happen in a situation where there were large numbers of "worried well." again, the network models detected disease outbreaks consistently better than the reference model.
what do these findings mean? these findings suggest that epidemiological network systems that monitor the relationships between health-care resource-utilization data streams might detect disease outbreaks better than current systems under normal conditions and might be less affected by unpredictable shifts in the baseline data. however, because the tests of the new class of surveillance system reported here used simulated infectious disease outbreaks and baseline shifts, the network models may behave differently in real-life situations or if built using data from other hospitals. nevertheless, these findings strongly suggest that public-health officials, provided they have sufficient computer power at their disposal, might improve their ability to detect disease outbreaks by using epidemiological network systems alongside their current disease-surveillance systems. additional information. please access these web sites via the online version of this summary at http://dx.doi.org/10.1371/journal.pmed.0040210.
wikipedia pages on public health (note that wikipedia is a free online encyclopedia that anyone can edit, and is available in several languages); a brief description from the world health organization of public-health surveillance (in english, french, spanish, russian, arabic, and chinese); a detailed report from the us centers for disease control and prevention called "framework for evaluating public health surveillance systems for the early detection of outbreaks"; the international society for disease surveillance web site.

references:
- pandemic influenza at the source
- strategies for containing an emerging influenza pandemic in southeast asia
- public health vaccination policies for containing an anthrax outbreak
- world health organization writing group (2006) nonpharmaceutical interventions for pandemic influenza, national and community measures
- transmissibility of 1918 pandemic influenza
- syndromic surveillance for influenza-like illness in ambulatory care network
- sars surveillance during emergency public health response
- planning for smallpox outbreaks
- systematic review: surveillance systems for early detection of bioterrorism-related diseases
- implementing syndromic surveillance: a practical guide informed by the early experience
- national retail data monitor for public health surveillance
- using laboratory-based surveillance data for prevention: an algorithm for detecting salmonella outbreaks
- sars-related perceptions in hong kong
- utilization of ontario's health system during the 2003 sars outbreak. toronto: institute for clinical and evaluative sciences
- pandemic influenza preparedness and mitigation in refugee and displaced populations. who guidelines for humanitarian agencies
- medical care delivery at the 1996 olympic games
- algorithm for statistical detection of peaks - syndromic surveillance system for the athens
- biosense: implementation of a national early event detection and situational awareness system
- time series modeling for syndromic surveillance
- using temporal context to improve biosurveillance
- measuring outbreak-detection performance by using controlled feature set simulations
- the distribution of incubation periods of infectious disease
- harvard team suggests route to better bioterror alerts
- can syndromic surveillance data detect local outbreaks of communicable disease? a model using a historical cryptosporidiosis outbreak
- a space-time permutation scan statistic for disease outbreak detection
- syndromic surveillance in public health practice: the new york city emergency department system
- monitoring over-the-counter pharmacy sales for early outbreak detection in new york city
- algorithms for rapid outbreak detection: a research synthesis
- integrating syndromic surveillance data across multiple locations: effects on outbreak detection performance
- public health monitoring tools for multiple data streams
- bivariate method for spatio-temporal syndromic surveillance
- identifying pediatric age groups for influenza vaccination using a real-time regional surveillance system

the authors thank john brownstein of harvard medical school for helpful comments on the manuscript. author contributions: byr, kdm, and isk wrote the paper and analyzed and interpreted the data. byr and kdm designed the study, byr performed experiments, kdm and byr collected data, and isk suggested particular methods to be used in the data analysis.

key: cord-350001-pd2bnqbp authors: liu, l.; vikram, s.; lao, j.; ben, x.; d'amour, a.; o'banion, s.; sandler, m.; saurous, r. a.; hoffman, m. d.
title: estimating the changing infection rate of covid-19 using bayesian models of mobility date: 2020-08-07 journal: nan doi: 10.1101/2020.08.06.20169664 sha: doc_id: 350001 cord_uid: pd2bnqbp in order to prepare for and control the continued spread of the covid-19 pandemic while minimizing its economic impact, the world needs to be able to estimate and predict covid-19's spread. unfortunately, we cannot directly observe the prevalence or growth rate of covid-19; these must be inferred using some kind of model. we propose a hierarchical bayesian extension to the classic susceptible-exposed-infected-removed (seir) compartmental model that adds compartments to account for isolation and death and allows the infection rate to vary as a function of both mobility data collected from mobile phones and a latent time-varying factor that accounts for changes in behavior not captured by mobility data. since confirmed-case data is unreliable, we infer the model's parameters conditioned on deaths data. we replace the exponential-waiting-time assumption of classic compartmental models with erlang distributions, which allows for a more realistic model of the long lag between exposure and death. the mobility data gives us a leading indicator that can quickly detect changes in the pandemic's local growth rate and forecast changes in death rates weeks ahead of time. this is an analysis of observational data, so any causal interpretations of the model's inferences should be treated as suggestive at best; nonetheless, the model's inferred relationships between different kinds of trips and the infection rate do suggest some possible hypotheses about what kinds of activities might contribute most to covid-19's spread. in response to the coronavirus pandemic of 2020, countries around the world instituted nonpharmaceutical interventions in an attempt to slow the spread of the disease.
these interventions included such measures as social distancing, mandatory wearing of masks, and shutting down nonessential businesses and services. understanding the impact of these measures on the spread of the disease is critical to informing decisions and designing interventions, but due to the delay between infection and obtaining test results, it can be weeks before the effect of such an intervention can be seen in case counts and death counts [1]. ideally, policymakers would have a current estimate (a "nowcast") showing how the growth rate of the disease is responding to various interventions and other factors, as well as a forecast that quickly adapts to changing conditions and generates "what-if" estimates of how changes in behavior might affect disease spread. unfortunately, case and death counts are lagging indicators; a new infection may take weeks to lead to a confirmed case and/or death. to draw inferences earlier, we need more-responsive correlates of infection rates. as part of a wider effort to understand the impact of the non-pharmaceutical interventions, many institutions such as the new york times [2], google [3], and apple [4] have released data about mobility trends. this aggregated mobility information measures movement trends at various levels of granularity (county, state), and across different mobility categories (trips to the grocery store, to work, etc.). such aggregated mobility data lacks individual-level detail, and may not be a representative sample of the broader population's behavior, but it may still hold useful signals for understanding the current rate of spread of coronavirus. given our understanding of how coronavirus spreads, we expect that more movement in society results in more infections. we thus hope the mobility trends correlate with the unobserved infection rate of covid-19.
the breakdown of overall mobility into categories based on destination type or distance also provides an opportunity to gain intuitions about what types of trips most spread the disease (although these intuitions should be treated as hypotheses needing further validation, since the data are observational). with this available mobility data, the next step is building models that can use it to make reliable predictions. one approach would be to incorporate a mobility signal into a discriminative predictive model. although it is more difficult to incorporate assumptions and prior knowledge into a discriminative model, the results from [5] indicate that the correlation between mobility and infection rate is strong enough to make good forecasts. on the other hand, compartmental models [e.g., 1, 6, 7] assume a flexible, causal story for the spread of a disease and can also incorporate mobility data as a covariate for predicting the time-varying infection rate of a disease. in this paper, we explore the use of mobility data in compartmental models and find that:
• mobility data can be used to forecast the spread of the disease.
• mobility data is a promising low-lag control signal that can be used to infer the invisible spread of covid-19 and assist policy makers.
• incorporating mobility data into compartmental models suggests that some types of trips influence the infection rate more than others.
compartmental models for epidemiology partition a population into "compartments" depending on which stage of a disease's life cycle a member is in. a simple model is the susceptible-infectious-recovered, or sir model [8], which partitions a population into "susceptible", the people who are not yet infected, "infectious", the people who have the disease and are actively spreading it to others, and "recovered", the people who are no longer infected with the disease, often including both those who successfully recover and those who die from the disease.
in practice, we may want to include an "exposed" compartment, for individuals who have contracted the disease but are not yet infectious, resulting in an seir model. an seir model assumes an initial state (i.e. some initial numbers (s_0, e_0, i_0, r_0) and a population size n = s_0 + e_0 + i_0 + r_0) and a set of parameters that govern a simulation that evolves this initial state over time. the parameters include β, the infection rate, α, the incubation rate, and γ, the recovery rate. the evolution of the state over time is prescribed by a set of differential equations: ds/dt = −βis/n; de/dt = βis/n − αe; di/dt = αe − γi; dr/dt = γi. when using a compartmental model, we are often interested in obtaining the values of each of the compartments at a set of input times. a compartmental model defines a continuous-time process, and it is conventional to choose the unit of time to be days. thus, integrating the equations to obtain the values of each compartment at a set of discrete times τ = {0, 1, 2, ..., t} corresponds to obtaining the values at the beginning of each day for t days moving forward from the initial state. we focus on these daily values that are the output of integrating the compartmental model equations; we define "simulating" a compartmental model as the process of obtaining these values via numerical methods like euler or runge-kutta integration. formally, we assume a compartmental model is a series of differential equations associated with a set of simulation parameters θ. besides the parameters that govern the differential equations, we also include the initial state s in its parameters. for an sir model we have θ = (β, γ, s); the seir model adds an incubation-rate parameter α. the simulation function m_sir(θ) or m_seir(θ) outputs the values of each compartment at the beginning of a set of days τ = {0, 1, ..., t}. for example, m_seir(θ) = (s_{0:t}, e_{0:t}, i_{0:t}, r_{0:t}).
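the seir equations above can be integrated with a simple euler scheme, returning the state at the beginning of each day as described. a minimal numpy sketch; the function name and parameter values are illustrative, not the paper's code:

```python
import numpy as np

def simulate_seir(beta, alpha, gamma, s0, e0, i0, r0, t_max, dt=1.0):
    """Euler integration of the SEIR equations: beta = infection rate,
    alpha = incubation rate, gamma = recovery rate. Returns the state
    (s, e, i, r) at the beginning of each day 0..t_max."""
    n = s0 + e0 + i0 + r0
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    steps_per_day = int(round(1.0 / dt))
    traj = [(s, e, i, r)]
    for _ in range(t_max):
        for _ in range(steps_per_day):
            new_exposed = beta * i * s / n * dt     # ds/dt = -beta*i*s/n
            new_infectious = alpha * e * dt         # flow e -> i
            new_removed = gamma * i * dt            # flow i -> r
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_removed
            r += new_removed
        traj.append((s, e, i, r))
    return np.array(traj)
```

because every term moves mass between compartments, the total population is conserved exactly at each step, which is a useful sanity check on any implementation.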
compartmental models provide a convenient, mechanistic formulation for how a disease spreads in a population. however, most often we don't know the parameters of the model beforehand, but we do have some data that can provide a learning signal to fit the parameters. one such signal is the daily number of new cases of a disease, which can be predicted by a compartmental model as the change in i + r between each day. formally, assume we observe a time-series of daily case counts c_{1:t}. from a set of simulation parameters θ, we obtain i_{0:t}, r_{0:t} from m(θ), the total number of infected and recovered on each day. we can compute the predicted case counts ĉ_{1:t} = (i + r)_{1:t} − (i + r)_{0:t−1}, and compute a loss l(c_{1:t}, ĉ_{1:t}) (e.g. mean-squared error). we can then optimize the parameters with respect to this loss with an algorithm like gradient descent. case count data is readily available for covid-19 but is problematic for a few reasons. first, many (perhaps most) people who are infected are not tested; many mild cases are not tested either because individuals do not know that they are infected or because they have sufficiently mild cases that they either cannot or do not feel the need to get tested. testing availability and policies are inconsistent across different regions, so the amount of underreporting is presumably also inconsistent across different datasets. furthermore, demographic differences introduce selection bias: if patients with severe cases are more likely to get tested, then regions with a higher fraction of severe cases (e.g., because they have an older population) will have less underreporting. second, there can be a significant delay between when tests are administered and when they show up in case counts; this delay is not consistently reported.
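the fitting signal just described (daily new cases as the day-over-day change in i + r, compared to observed counts under a mean-squared-error loss) can be sketched in a few lines; the function names are illustrative, not from the paper:

```python
import numpy as np

def predicted_case_counts(i, r):
    """Daily new cases implied by a simulated trajectory:
    c_hat[t] = (i + r)[t] - (i + r)[t-1], for t = 1..T."""
    cum = np.asarray(i) + np.asarray(r)
    return cum[1:] - cum[:-1]

def mse_loss(observed, predicted):
    """Mean-squared error between observed and predicted daily counts."""
    return float(np.mean((np.asarray(observed) - np.asarray(predicted)) ** 2))
```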
finally, there will be false positives and false negatives, and these error rates are also likely to vary by region, since training is needed to get a good sample for polymerase chain reaction (pcr) tests. death counts are more reliable than case counts and should also correlate with the infection rate of the disease, albeit with a larger lag than case counts. we expect there to be a smaller underreporting factor for deaths because it is more unlikely that a death from covid-19 goes unnoticed than an asymptomatic case. while there will still be false positives and negatives in the data due to misdiagnosed and misreported cases, we expect the death counts to be less noisy and biased than the case counts. unfortunately, using death counts as a learning signal comes with a significant challenge: time lag. with covid-19, it may take on the order of 2 to 3 weeks from infection until death, meaning any uptick in infections will not be visible in the death counts for 2 to 3 weeks [7, 9]. this introduces a modeling challenge, where a compartmental model that models the recovery period incorrectly will likely learn incorrect values for the other parameters in order to compensate for the incorrect delay. furthermore, compartmental models do not usually directly simulate deaths in a population but rather the sum of deaths and proper recoveries. observing deaths requires introducing a new parameter into the model, ω, the infection-fatality-ratio (ifr), in order to convert "recoveries" into the subset that die (and whose deaths are reported as due to covid-19). several studies have aimed to identify values for the parameters of covid-19's behavioral dynamics, such as the lengths of the incubation and recovery periods [9]. even then, it is unlikely that we can identify a single value for each of these parameters given the differences in methodology and assumptions between studies.
however, given the surrounding literature, it is possible to construct a distribution over possible parameters that reflects the uncertainty that comes from both a lack of consensus and an inherently noisy process. a strong prior distribution over parameters also helps with identifiability issues, as adding more parameters to an sir model often results in nonidentifiability. bayesian modeling is a flexible framework for incorporating these prior assumptions and capturing the resulting predictive uncertainty. a bayesian compartmental model proceeds by putting prior distributions over the parameters and initial state of a compartmental model. these priors are opportunities to incorporate external evidence from studies, like observed lengths of incubation and recovery periods. the model assumes that the observed data are sampled with some measurement noise from the output of the simulation given the parameters. inference about the model parameters proceeds by applying bayes's rule. we define a generative process for an observed death count time series d_{1:t}, where θ are the parameters of a compartmental model. we define additional parameters φ = (r, ω), where r is the shape parameter for a negative binomial likelihood, and ω is the infection-fatality-ratio (ifr). in this generative process, we simulate a compartmental model forward and compute its predicted daily deaths, which is then used as the mean of a noisy observation process. the distributions of interest are the posterior distribution over parameters p(θ, φ | d_{1:t}) and the posterior predictive distribution p(d* | d_{1:t}) = ∫ p(d* | θ, φ) p(θ, φ | d_{1:t}) dθ dφ. the former is useful for exploratory data analysis and provides uncertainty estimates over the parameters in the model, and the latter is the distribution needed for forecasting and nowcasting.
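the negative binomial observation model over daily deaths can be sketched as follows. the mean/shape parameterization used here is one common choice and an assumption for illustration (as is the function name); the paper does not spell out its exact parameterization in this excerpt:

```python
import math

def nb_log_likelihood(deaths, means, shape):
    """Log-likelihood of observed daily death counts under a negative
    binomial with per-day mean `means` (the simulated deaths, i.e. the
    recovery outflow scaled by the IFR omega) and shape parameter r.
    Uses p = r / (r + mu), so the mean of the distribution is mu."""
    total = 0.0
    for d, mu in zip(deaths, means):
        total += (math.lgamma(d + shape) - math.lgamma(shape)
                  - math.lgamma(d + 1)
                  + shape * math.log(shape / (shape + mu))
                  + d * math.log(mu / (shape + mu)))
    return total
```

a likelihood like this rewards parameter settings whose simulated death curve tracks the observed one, which is exactly the learning signal the inference procedure uses.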
unfortunately, we cannot analytically compute these distributions, so it is common to use methods that approximate the posterior and predictive distributions, like sequential monte carlo (smc) and markov chain monte carlo (mcmc). estimates of the recovery period for covid-19 are around 15-20 days, and we observe this in the lag between observed infections and recoveries. in the basic seir model, everyone in the i category is infectious at a constant rate for the entire duration of that recovery period. but in reality, it is unlikely that people remain equally infectious during the entirety of the recovery period, due to isolation at home and in hospitals. to better model this drop in infectiousness, we add a "q" compartment to the model that contains "isolated" or "quarantined" individuals. isolated/quarantined individuals behave like infectious ones in that they contribute to the flow from susceptible to exposed, but at a fraction of the rate. the new parameter governing the rate of flow between i and q we call the "isolation rate" η, and the fraction governing the reduction in infection rate we call the "quarantine reduction" q. we call this an seiqr model (visualized in figure 1); its dynamics extend the seir equations with a flow from i to q at rate η and an infection term in which the q compartment contributes at the reduced rate qβ. compartmental models implicitly assume that individuals spend an amount of time in each compartment that is exponentially distributed with that compartment's rate parameter. an exponential distribution will correctly model an average waiting time, but the overall shape of the waiting-time distribution may completely differ from the observed distribution and will put mass on small waiting times. this can be problematic if, say, we know that a disease often takes at least 3 days to recover from. a commonly used remedy for the exponential-waiting-time assumption inherent to sir and seir dynamics is to expand each compartment into several subcompartments and modify the compartment's rate parameter to match the original's expected value.
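the subcompartment expansion implies erlang-distributed waiting times: passing through k exponential stages, each with rate k·γ, keeps the mean at 1/γ for any k while removing mass from very short waits. a small simulation sketch (illustrative, not the paper's code):

```python
import random

def sample_waiting_time(rate, k=1, rng=random):
    """Total time through k exponential subcompartments, each with rate
    k*rate: an Erlang(k, k*rate) draw whose mean is 1/rate for any k.
    k=1 recovers the plain exponential waiting time."""
    return sum(rng.expovariate(k * rate) for _ in range(k))
```

with, say, rate = 0.1 (a 10-day mean), k = 4 makes waits of under 2 days far rarer than the exponential model allows, which is the point of the remedy.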
a new hyperparameter of our seiqr model is thus the number of subcompartments for each of the compartments. in conventional compartmental modeling, the infection rate parameter β is time-invariant, freely parameterized, and directly learned from data. this is a strong assumption about how a disease might spread; for example, we might expect that the true number of infections on a given day is noisy and that there are likely strong day-of-week effects that come from commuting and varying travel patterns on weekends. (the smoothing effect of the lag between infection and death would make such day-of-week effects invisible in daily death counts.) furthermore, with covid-19, we hope that the resultant lockdowns cause a drop in the infection rate, thereby slowing the spread of the disease. thus, for modeling covid-19, we'd like a compartmental model with a time-varying infection rate that captures any drop resulting from non-pharmaceutical interventions. mobility data measures the relative change in how many trips are taken daily relative to pre-lockdown behavior. each datapoint in mobility data is a k-dimensional time-series m_{0:t}; each individual series tracks mobility trends (the higher the values, the more trips are being taken, and vice versa). mobility is further aggregated at the county level, so we have a k-dimensional time-series for each of c counties, {m^c_{0:t}}_{c=1}^c. to model infection rate using mobility data, we define a generalized linear model (glm) that predicts infection rate. we use a noncentered hierarchical model over glm weights, where we independently sample county-specific base coefficients {w_c}_{c=1}^c. we then sample a normally-distributed k-dimensional vector of feature-specific coefficient adjustments f and sample a base infection rate b. we finally compute a county-specific infection rate β_c(t) = softplus((w_c + f)^t m^c_t + b).
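the county-specific rate regression is just a linear function of the mobility features pushed through a softplus to keep it positive. a numpy sketch; the names are illustrative, and the numerically stable softplus form is an implementation choice, not from the paper:

```python
import numpy as np

def softplus(x):
    """log(1 + exp(x)), written in a numerically stable form."""
    return np.log1p(np.exp(-np.abs(x))) + np.maximum(x, 0)

def county_infection_rate(w_c, f, m_c_t, b):
    """beta_c(t) = softplus((w_c + f)^T m_c(t) + b).
    w_c: county-specific base coefficients, f: shared feature-specific
    adjustments, m_c_t: the k mobility features for county c on day t,
    b: base infection rate."""
    return softplus(np.dot(w_c + f, m_c_t) + b)
```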
the county coefficients w_c model the differences in how mobility affects infections between different counties, and the feature coefficients f model the infection patterns of different types of mobility. mobility is a potential predictor of infection rate because it correlates with the quantity lockdowns are designed to lower: infections coming from human-to-human interaction. some trips may have more human-to-human interactions than others, but in aggregate we expect the number of total interactions (and thereby infections) to drop if there are fewer total trips. mobility data can be collected and processed relatively quickly and therefore offers the potential to become a signal for forecasting and nowcasting. however, mobility data does not capture all aspects of how the disease spreads. the second modification is incorporating a changepoint to model a broader set of latent factors that influence the infection rate. for example, mobility data does not reflect the change in infection rate due to changes in human behavior, such as the widespread wearing of masks and maintaining social distance. these effects are harder to measure but are still relevant to predicting a time-varying infection rate; a changepoint, or a function of time that captures a drop in infection rate, is a convenient means of lumping together all unknown latent factors into a parameterized time-varying infection rate. we define a changepoint to be a point in time at which there is a monotonic drop in infection rate, modeled as a negative sigmoid. we parameterize it with three parameters: κ, the time at which the change happens, ρ, a number between 0 and 1 determining the amplitude of the sigmoid, and ν, the slope of the sigmoid that dictates the rate of the change. we thus have the parameterized function π(t) = ρ + (1 − ρ)σ(−ν(t − κ)), where σ is the sigmoid function.
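the changepoint function π(t) is simple to write down; a sketch with illustrative parameter values (in the model, κ, ρ, and ν are fit per county):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def changepoint(t, kappa, rho, nu):
    """pi(t) = rho + (1 - rho) * sigmoid(-nu * (t - kappa)):
    approximately 1 well before the changepoint at time kappa, decaying
    to the floor rho after it, with nu controlling how fast the drop is."""
    return rho + (1.0 - rho) * sigmoid(-nu * (t - kappa))
```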
unlike mobility data, a changepoint cannot model complex changes in infection rate; it can, however, capture factors, like differences in human behavior, that are not reflected in mobility data. by itself, a changepoint can model the flattening of the death-count curves we observe in covid-19 data (i.e. β(t) = π(t)), but a changepoint is not useful for forecasting and nowcasting because we can only infer that a changepoint has happened after observing the case or death counts weeks later. thus, we desire a hybrid model that can leverage the strengths of both approaches to modeling a time-varying infection rate. when modeling multiple counties, we model county-specific changepoints and merge the changepoint with the mobility model by multiplying it by the output of the mobility glm, i.e. β_c(t) = softplus((w_c + f)^t m^c_t + b) π_c(t). related work. [7] proposed one of the first bayesian sir models of covid-19 based on observed death counts. rather than treat death as a compartment, they convolve the new-infection curve in a simple sir-with-observed-changepoint model with a time-to-death distribution. [10] develop a bayesian semi-mechanistic (rather than compartmental) model linking google mobility data and a latent time-varying factor to a time-varying reproductive number r_t, which is combined with a latent ifr and time-to-death distribution to model observed deaths. [11] develop a changepoint model similar to ours, informed by lockdown dates rather than mobility data. [12] strongly argue for the importance of relaxing the exponential-waiting-time assumption in compartmental models; see also [13, 14]. concurrently with this work, [15] proposed a very similar bayesian compartmental model regressing from apple mobility data to a time-varying infection rate β(t). the most salient difference is that we consider a latent changepoint to account for changes in behavior that are not captured by mobility data.
we find that this factor is important for making accurate long-term forecasts, although it makes relatively little difference over the 8-day forecast window they consider. datasets. we use unprocessed covid-19 case and death count numbers from the new york times [2], which produces daily reports of new infections and deaths at both state and county level. for mobility, we use the community mobility reports published by google [3] in our evaluation of bayesian compartmental models. google's community mobility datasets are created with aggregated, anonymized sets of data from users who have turned on the location history setting, which is off by default [16]. no personally identifiable information, like an individual's location, contacts or movement, is made available at any point. trends are aggregated to the county level (including washington, dc and independent cities that are not otherwise included in county boundaries), and available daily from february 16th through may 21st, 2020. the google dataset also breaks down mobility trends into several categories. we consider two usages of the mobility data: (1) using each category as a feature (except for residential) and (2) aggregating the categories into a single overall mobility feature. we normalize the dataset so that 0 represents "normal" mobility and deviations represent positive and negative relative changes in mobility (e.g. a value of 0.3 represents a 30% increase in mobility). figure 2 shows the daily mobility changes in new york city for each category in google's community mobility reports, where "overall" represents the aggregated category. in new york city, we observe a dramatic decrease in many of the mobility categories and in overall mobility after the official non-pharmaceutical intervention occurred in late march. since mid-april, the number of park visits in new york city has increased, while visits in the other categories have remained relatively low.
the hyperparameters of the model are the number of subcompartments in the erlang-based models, for which we chose 2 subcompartments for each of the e, i and q compartments. the priors of the various model parameters are defined in table a.2. priors were chosen to match estimates found in [9], which aggregates parameter estimates from relevant literature. the compartmental model parameters (except for infection rate) are shared across counties, and the rest of the parameters in the model are county-specific, with the exception of f, which is mobility-feature specific. numerical integration. we simulate compartmental dynamics with an euler discretization of the differential equations with a step size of 1. despite such a large step size, we found the dynamics to be stable with reasonable parameter settings. however, with extreme values for parameters, we found that the dynamics become unstable and numerical issues arise. we address this by constraining certain parameter values in the prior distribution to avoid extreme values. we also find no significant performance improvement with a lower step size. we use an adaptive sequential monte carlo (smc) method for bayesian inference. we utilize the algorithm from [17], which adaptively constructs a series of annealed distributions that interpolate between the prior over latent variables and the joint distribution over latent variables and data. we then initialize a set of particles that are iteratively updated with a hamiltonian monte carlo (hmc) kernel to be representative of each of the annealed distributions [18, 19]. in practice, we map parameters into an unconstrained space and modify the prior to take into account the volume difference resulting from the change of variables, and we execute smc in this unconstrained space. we run 5 chains in parallel with 1000 particles each, collecting a total of 5000 samples from the posterior distribution.
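the annealed interpolation used by such smc samplers can be sketched with a geometric path between prior and posterior; this particular path and the function names are assumptions for illustration (the paper's temperature schedule is chosen adaptively by the algorithm of [17]):

```python
def annealed_log_density(log_prior, log_likelihood, temperature):
    """Geometric annealing path: at temperature 0 this is the prior,
    at temperature 1 the full (unnormalized) posterior, and
    intermediate temperatures interpolate between the two."""
    def log_density(x):
        return log_prior(x) + temperature * log_likelihood(x)
    return log_density
```

particles initialized from the prior are then moved through this sequence of densities (here, by an hmc kernel) so that they end up representative of the posterior.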
inference is implemented using tensorflow probability [20, 21] and was executed on a gpu, where training a single model took around 20 minutes. we focus on the period starting on 3/7/2020 and extending to 5/21/2020. we fit various versions of the models described above to the aforementioned mobility trends and ground-truth death counts, and evaluate their ability to forecast, nowcast, and offer insights into how mobility influences covid-19's infection rate. to evaluate the extensions proposed in the paper, we consider a baseline seiqr model (time-invariant infection rate), a changepoint seiqr (cseiqr), an seiqr model with an infection rate varying as a function of mobility data from both multiple categories and an aggregated category (mseiqr[multi] and mseiqr[overall]), and a model that combines both multiple-category mobility-based regression and a changepoint (cmseiqr). to evaluate the overall effectiveness of modeling covid-19 using the proposed models, we construct a covid-19 forecasting task on the 25 most populous us counties (treating the five boroughs of new york city as one county). in three variants of this task, we choose a date (4/22/20) and fit a model to the data up to this date. we compute the predictive log-likelihood for the next 7, 14, and 28 days worth of data, using mobility data from the held-out period when forecasting. to measure the quality of each model's forecasts, we report the average daily marginal log-likelihood numbers for the held-out data, which we estimate by taking the log-mean-exp of the log-likelihoods for every sample, and averaging across days. this measures how well on average the model can predict the number of deaths on a randomly chosen day in the future. comparing mobility datasets. we first evaluate whether having access to more fine-grained information about mobility trends is valuable for forecasting. to accomplish this, we examine the held-out log-likelihood of mseiqr[multi/overall] models in table b.4 and table b.5.
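the held-out metric described here (log-mean-exp over posterior samples for each day, then an average across days) can be sketched as follows; the function name is illustrative:

```python
import math

def daily_marginal_log_likelihood(per_sample_log_liks):
    """Average daily marginal log-likelihood as described in the text.
    `per_sample_log_liks` is a list of days, each a list of per-posterior-
    sample log-likelihoods; log-mean-exp is computed stably per day by
    subtracting the per-day maximum before exponentiating."""
    daily = []
    for day in per_sample_log_liks:
        m = max(day)
        daily.append(m + math.log(sum(math.exp(v - m) for v in day) / len(day)))
    return sum(daily) / len(daily)
```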
we find that the mseiqr[multi] model consistently achieves a higher average log-likelihood than the mseiqr[overall] mobility model in all forecast windows. investigating further, it appears that the overall-mobility-trained model "overfits" to the slow increase in mobility we observe towards the end of the forecast period and predicts upticks in deaths where the data continues to trend downwards. this problem is reduced in the multiple-category model; we thus hypothesize that there are certain types of mobility that better correlate with infection rate, and aggregating over different mobility types adds noise to these signals. for example, figure b.3 shows that, despite high uncertainty, changes in grocery_and_pharmacy are a stronger signal of changes in infection rate than, say, transit_stations; this observed association does not imply that interventions targeting grocery stores and pharmacies will reduce covid-19 spread more than closing subways, but it does suggest that these relative changes should not be weighted equally when trying to estimate its spread. comparing model variants. we report held-out log-likelihood numbers for all model variants in subsection b.2. we observe that all mobility models tend to latch onto mobility upticks in late april and may. mseiqr[multi/overall] models were trained without these upticks and can only explain infection rate in terms of mobility, so subsequently we see a forecasted uptick in deaths for those models. on the other hand, the changepoint model is often correct but every so often misidentifies the drop in the infection rate and makes a blatantly wrong forecast (see figure 4). nowcasts from the mobility-based models tend to be sharper, but only in cmseiqr models do we see the nowcast distributions consistently capture the hindcast. the cmseiqr model tends to forecast somewhere in between, predicting a decay in death counts but with a slight uptick towards the end of the forecast.
however, it has a much wider interval due to the inclusion of the changepoint, which mitigates the effect of mobility towards the end of the forecast period. we visualize the rest of the forecasts in subsection c.1. our bayesian models are able to nowcast the reproductive number r_t for covid-19 from the posterior and recent mobility data, using the equation r_t = β(t) · (1/η + q/γ), but it is hard to evaluate the accuracy of the r_t estimation due to lack of ground truth data. instead we conduct a held-out-data experiment to validate the consistency and sharpness of a model's nowcasting results. we pick a date t, fit a model trained up to t, and compute the distribution over r_t. this model has not seen the deaths that result from infections happening on dates near t, so we expect its estimate of r_t to have a wide interval; this is our "nowcast" estimate. we compare our nowcast distribution to a "hindcast" distribution, a distribution over r_t from a model trained on data up to t + 21. in figure 5 and figure b.4, we plot a kernel-density estimate of r_t from a range of models. we expect this hindcast distribution to have a tighter estimate of r_t since it has access to more data, but we also hope that it lies within the nowcast distribution. we find that cmseiqr tends to have more conservative but less incorrect nowcasts. after including more data, the cseiqr and mseiqr[multi] models collapse to a sharp estimate, but in an area with low mass in the nowcast, whereas the cmseiqr model collapses to a less-sharp distribution that is closer to the nowcast. this indicates that the r_t estimates coming from cseiqr and mseiqr[multi] models can be confidently incorrect. we find that mobility is a promising signal for nowcasting and forecasting the spread of covid-19, but it is important to understand its limitations.
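the nowcast quantity r_t = β(t)(1/η + q/γ) follows directly from the seiqr dynamics: an infectious person transmits at rate β(t) for an average of 1/η days before isolation, then at the reduced rate qβ(t) for an average of 1/γ days while isolated. a one-line sketch, with illustrative parameter values:

```python
def reproductive_number(beta_t, eta, gamma, q):
    """R_t = beta(t) * (1/eta + q/gamma) for the SEIQR model:
    full-rate transmission for an average 1/eta days (until isolation),
    plus q-reduced transmission for an average 1/gamma days (until
    recovery or death)."""
    return beta_t * (1.0 / eta + q / gamma)
```

for example, with β(t) = 0.3, η = 0.5, γ = 0.1, and q = 0.2, this gives r_t = 0.3 · (2 + 2) = 1.2.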
notably, our experiments suggest that mobility signals alone cannot explain all variation in infection rates; for example, increases in mobility in may do not appear to have caused massive increases in infections. however, we can alleviate these limitations by assuming the existence of, and marginalizing over, other latent factors. it goes without saying that the impact of covid-19 has been massive and overwhelmingly negative. our hope is that this paper contributes usefully to the discussion about how best to predict, monitor, and control the spread of covid-19 using the limited data available. unfortunately, there is always the danger that work in this space may be misinterpreted or overinterpreted, especially by non-experts. journalists, policymakers, and politicians may not have the technical training to critically evaluate these models, but they also do not have the luxury of not having an opinion; in the real world, decisions must be made by agents with bounded rationality using imperfect data. this problem of overinterpretation is particularly salient to this work: it is easy to forget that a model's error bars are only as reliable as its assumptions. for example, we found that changepoint-only and mobility-only models tend to be overconfident when making long-term forecasts, whereas a more flexible changepoint+mobility model is able to consider a wider variety of scenarios. this should give pause to anyone relying on forecasts from simplistic models for decision-making: "simpler" does not necessarily imply "fewer assumptions" or "more reliable". all of this is cause for caution, but not inaction. even imperfect models are still useful tools for rigorously and honestly integrating our prior beliefs and the limited evidence we can collect, as long as we remember that they are imperfect. we report our choice of prior distributions in table a.
table 4: average held-out log-likelihood numbers for the mseiqr[multi] and seiqr models. table 5: average held-out log-likelihood numbers for mseiqr. c.2: additional r_t forecasts. key: cord-006229-7yoilsho authors: nan title: abstracts of the 82(nd) annual meeting of the german society for experimental and clinical pharmacology and toxicology (dgpt) and the 18(th) annual meeting of the network clinical pharmacology germany (vklipha) in cooperation with the arbeitsgemeinschaft für angewandte humanpharmakologie e.v.
(agah) date: 2016-02-06 journal: naunyn schmiedebergs arch pharmacol doi: 10.1007/s00210-016-1213-y sha: doc_id: 6229 cord_uid: 7yoilsho in vitro systems and mechanistic investigations i. the demand for alternative test systems which closely mirror the in vivo situation is one of the main challenges in modern toxicity testing. the major goal is the development of in vitro systems that partly display the complexity of an organism and thus may mimic in vivo conditions. despite great efforts in the past, no adequate in vitro systems are available yet. on the other hand, cell cultures from almost every organ are easily accessible and therefore may help to roughly assess the toxic potential of substances at target structures. nonetheless, the complex interactions which take place in vivo cannot be addressed in single cell cultures. in the liver, hepatocytes comprise 80% of the total liver volume, while non-parenchymal cells (endothelial cells, stellate cells and kupffer cells, that is, liver-resident macrophages) contribute only 6.5% of the volume but 40% of the total cell number (kmiec 2001). it has been increasingly recognized that in the liver neighboring non-parenchymal cells release molecules which contribute to the inflammatory damage and even aggravate it (adams et al. 2010). in our project a human in vitro co-culture system was established by combining a hepatic and a monocytic cell line, the latter of which can be differentiated to a macrophage-like phenotype. in this system the hepatotoxicity of substances has been analyzed, and the results were compared to single cultures and to published data from in vivo studies. using ketoconazole, an antifungal, as a known hepatotoxic substance, inflammatory markers were studied and proved to be significantly upregulated only in co-culture. conversely, cultures of hepatic cells only did not display this increase in inflammatory markers.
at the same time, a negative control substance, caffeine, failed to show any hepatotoxic potential in the co-culture system. our results demonstrate that this novel in vitro co-culture model represents a promising tool to evaluate the hepatotoxic potential of substances. in drug research, it might help to reduce animal testing, as drugs with a high dili potential can be dropped early in the development phase. question: raman spectroscopy (rs) is a highly sensitive analytical method for marker-free and non-invasive identification and characterization of cells. here, we present rs as a novel tool for gentle yet precise cell analysis in three independent experiments, focusing on monitoring cellular reactions upon treatment. we could provide evidence that rs is a suitable tool to monitor cell differentiation, analyze cell modification and study cell apoptosis after drug application. methods: in a first experiment, mesenchymal stem cells (mscs) were treated with erythropoietin (epo) for certain time points and subsequently fixed with paraformaldehyde (pfa) for raman analysis. in addition, skbr3 breast cancer cells were exposed to the anti-cancer drug herceptin (20 µg/ml). cells were then fixed in pfa for rs. in a last experiment, molm-13 cells were separately cultivated in microwells and treated with thymidine for different time points prior to raman analysis. results: raman spectroscopy was able to monitor differentiation of epo-treated mscs and found that around 35% of treated cells showed fibroblast-like raman profiles. in the case of herceptin-treated skbr3 cells, rs found internal changes of the cells' metabolism as a reaction to drug application. analyzing the most prominent differences in raman spectra revealed discrimination of cells to be mainly due to changes in amide i, lipid and protein content. in the last experiment, rs was able to follow apoptosis of molm-13 cells after thymidine application and discriminate early from late apoptotic states.
discussion: rs is a photonic tool for gentle yet highly specific cell analysis, which allows monitoring of single-cell reactions after drug treatment. thereby, rs provides information about changes within the entire metabolome on a single-cell level. raman spectra are as characteristic as a "fingerprint". rs works label-free and non-invasively and thus does not impair cell viability. this allows new insights to be gained in pharmacological development and toxicological surveys. acknowledgement: this project received funding from the eu 7th health programme, grant agreement no. 279288. deutsches zentrum für herzinsuffizienz, würzburg, germany. extracellular signal-regulated kinases 1 and 2 (erk1/2) are essential for the regulation of cell growth and cell survival, and their kinase activity is up-regulated, for example, in different types of cancer and in pathological cardiac hypertrophy. while inhibition of erk1/2 activity by kinase inhibitors prevents tumor growth, it can also lead to exacerbated cardiomyocyte death and impaired heart function. interestingly, we have previously identified an erk autophosphorylation at threonine 188 as a prerequisite for nuclear erk1/2 signaling and erk-mediated cardiac hypertrophy. here, we investigated an alternative strategy to interfere with erk1/2 signaling: since activation of erk1/2 triggers erk dimerization, a prerequisite for erk t188 autophosphorylation, we chose the erk dimer interface as a possible target to selectively interfere with erk t188 phosphorylation. first, we investigated the impact of monomeric erk2 on cardiac function. to address this issue, we generated mice with cardiac overexpression of monomeric erk2∆174-177 and performed transverse aortic constriction (tac) to induce cardiac hypertrophy.
compared to wild-type mice, erk2∆174-177 overexpression attenuated tac-induced cardiac hypertrophy, interstitial fibrosis and mrna expression levels of collagen and brain natriuretic peptide (bnp), while cardiomyocyte survival and cardiac function were largely preserved. because of the positive effects of monomeric erk∆174-177 in the heart, we designed a peptide to interfere with endogenous erk dimerization. cross-linking and co-immunoprecipitation experiments showed that the peptide binds to erk2 and prevents its dimerization. moreover, the peptide effectively inhibited erk t188 phosphorylation and nuclear translocation of yfp-tagged wild-type erk2 after phenylephrine stimulation. further, adenoviral or adeno-associated virus serotype 9 (aav9)-induced overexpression of the peptide in neonatal rat cardiomyocytes (nrcm) or mouse hearts resulted in a significantly reduced hypertrophic response to phenylephrine or tac. background: gi-proteins have been proposed to be cardioprotective. it is a matter of debate whether this depends on the particular gi isoform and/or only on the particular conditions (e.g. cardiac "stress"). in our study we investigated the effects of a gαi2 knockout on cardiac function and survival in a murine heart-failure model of cardiac β1-adrenoceptor overexpression. methods and results: β1-adrenoceptor-overexpressing mice lacking gαi2 (β1-tg/gαi2-/-) were compared to wild-type (c57bl/6) mice and littermates either overexpressing cardiac β1-adrenoceptors (β1-tg) or lacking gαi2 (gαi2-/-). at 300 days of age, mortality of mice only lacking gαi2 was higher compared to wild-type or β1-tg mice, but similar to β1-tg/gαi2-/- mice. beyond 300 days, mortality of β1-tg/gαi2-/- mice was enhanced compared to all other genotypes (mean survival time: 363±21 days). echocardiography revealed similar cardiac function of wild-type, β1-tg and gαi2-/- mice, but significant impairment for β1-tg/gαi2-/- mice (e.g.
ejection fraction 14±2% versus 40±4% in wild-type mice). a significantly increased ventricle-to-body-weight ratio (0.71±0.06% versus 0.48±0.02% in wild types), left-ventricular size (length 0.82±0.04 cm versus 0.66±0.03 cm in wild types) and anp and bnp expression (mrna: 2819% and 495% of wild type, respectively) clearly indicated hypertrophy. gαi3 was significantly upregulated in gαi2 knockouts (protein compared to wild-type mice: 340±90% in gαi2-/- and 394±80% in β1-tg/gαi2-/-, respectively). radioligand binding experiments confirmed cardiac overexpression of β1-adrenoceptors in β1-tg mice. of note, overexpression levels differed depending on the particular wild-type background. on an fvb/n background we found the overexpression level to be more than 2-fold higher (bmax: 1425±68 fmol/mg) than on the otherwise used c57bl/6 background. accordingly, fvb/n-based β1-tg mice showed significantly impaired cardiac function at an age of 300 days, while c57bl/6-based β1-tg mice did not. conclusions: gαi2 deficiency combined with cardiac β1-adrenoceptor overexpression strongly impaired survival and cardiac function. on a c57bl/6 background, β1-adrenoceptor overexpression alone had not induced cardiac hypertrophy or dysfunction at day 300, while there was overt cardiomyopathy in mice additionally lacking gαi2. we propose an enhanced effect of the increased β1-adrenergic drive due to the lack of protection via gαi2. the observed gαi3 upregulation was not sufficient to compensate for gαi2 deficiency, suggesting an isoform-specific and/or a concentration-dependent mechanism. the role of gαi3 is currently being addressed in a subsequent study using β1-tg and gαi3-deficient mice. heart failure is accompanied by morphological and functional alterations (e.g. hypertrophy, decreased contractility) which are summarized by the term "cardiac remodeling".
while the β-adrenergic signaling pathway is essential for short-term modulation of cardiac performance, its chronic stimulation by elevated plasma catecholamines and the subsequent activation of camp-dependent signal transduction pathways is regarded as a fundamental factor in the pathogenesis of cardiac remodeling. however, the mechanisms mediating the transition from physiological short-term modulation to detrimental remodeling under long-term β-adrenergic stimulation are not yet understood in detail. in this context, icer, an isoform of the camp-dependent transcription factor crem (camp responsive element modulator), acts as an early response gene strongly induced by β-adrenergic stimulation via camp responsive elements (cre) in its promoter. contrary to its cre-mediated induction, icer is itself a strong inhibitor of cre-mediated transcription. here we study the role of icer induction in catecholamine-induced cardiac remodeling in a time-dependent manner by the use of icer-deficient mice (iko) and wild-type (wt) controls, which were treated with isoproterenol (iso; 10 mg/kg per day) for 6 and 24 hours and 7 days. overall, 7 days of iso stimulation resulted in an elevation of cardiomyocyte length in iko (in µm: 7d iso 156±1) vs. wt cardiomyocytes (7d iso 143±2). at this time point a 29% decrease in cardiac output and a 16% decrease in the maximal rate of rise of left ventricular pressure (dp/dtmax) in iko vs. wt animals was detectable. the maximum increase of icer mrna in wt cardiomyocytes already occurred after 6 h (75-fold) and declined after 24 h (29-fold) to a 2.5-fold increase after 7 days, while icer mrna was not detectable in iko mice. this raised the hypothesis that the early induction of icer modulates transcriptional processes after β-adrenergic stimulation that are involved in cardiac remodeling of the heart. profiling of mrna expression levels between iko vs.
wt cardiomyocytes at the different time points revealed: 55 regulated genes (up-regulated: 45%) in untreated animals; 103 altered genes (up 35%) after 6 h; 1437 changed genes (up 97%) after 24 h; and 131 altered genes (up 19%) after 7 days of iso treatment. in summary, the absence of icer induction in myocytes resulted in an increase in cardiomyocyte length and a decrease in heart performance after 7 days of β-adrenergic stimulation. this is preceded by upregulated mrna levels of several hundred genes at 24 h, which goes along with the induction of the transcriptional inhibitor icer within a few hours of β-adrenergic stimulation. this suggests a protective role of icer, inhibiting the progression of cardiac remodeling after β-adrenergic stimulation in an early responsive manner. (supported by the dfg) the performance of the adult heart is tightly regulated by g protein-coupled receptors. adrenergic and angiotensin receptors efficiently control heart rate and contractility. muscarinic receptors, on the other hand, serve as master regulators of the conduction system, which is often lost upon myocardial infarction. this function of muscarinic receptors has been well described in the adult or late embryonic heart. here we provide evidence that muscarinic receptors are crucial to constrain pacemaker cell identity. we applied subtype-specific inhibitors of muscarinic receptors to zebrafish embryos at different stages. we observed that both early cardiac function and specification are specifically regulated by muscarinic m3 receptors, while m2 receptors appear to exert a heart-specific function only at later stages. continuous m3 blockade results in zebrafish with greatly altered cardiac morphology, particularly of the conduction system. furthermore, embryos with m3 inhibition display impaired ventricular function, most likely due to an av block, and substantial arrhythmia in the atrium.
importantly, to observe these phenotypes it was sufficient to block m3 receptors during stages of cardiac differentiation, which is long before a heart tube has formed. we corroborated our findings regarding these morphological changes using marker gene analysis. furthermore, we obtained evidence for m3 receptors preventing a transcriptional program towards the induction of pacemaker cells at the expense of av canal cells. importantly, this is not only true during heart development. a pacemaker program is also induced in adult hearts upon m3 inhibition. taken together, we postulate that muscarinic m3 receptors confine a pacemaker lineage during early steps of heart development as well as in the adult heart. our data suggests m3 receptors as potential new therapeutic targets for the regeneration of hearts with an injured sinoatrial node. systemic inhibition of mir-21 has proved effective against fibrosis of the myocardium and in other organs. mir-21 has been reported to exert detrimental effects in cardiac fibroblasts and protective roles in cardiac myocytes and other myocardial cell types. a better definition of the cell types that contribute to the beneficial effects of inhibiting mir-21 in vivo may aid the development of strategies with enhanced therapeutic efficacy. thus far, no approach to selectively manipulate micrornas in the non-myocyte population of cardiac cells in vivo has been available. in this study, we developed an icre-encoding mml virus for application in mir-21 fl/fl mice. delivery of this vector to neonates achieved targeted genetic ablation of mir-21 in non-myocyte cardiac cells. immunohistochemistry and flow cytometry confirmed that mmlv was highly selective and effective for cardiac fibroblasts and endothelial cells. in parallel, an aav9-icre vector allowed for specific and almost complete deletion of mir-21 in cardiac myocytes. 
when tested in a model of chronic left ventricular pressure overload, mmlv-icre-mediated deletion of mir-21 in cardiac fibroblasts and endothelial cells significantly reduced cardiac fibrosis and hypertrophy and improved cardiac function. the benefit of this cell-type-specific inhibition exceeded that observed upon global genetic deletion of the mir-21 gene in mice. aav9-mediated deletion of mir-21, albeit lowering cardiac hypertrophy, had no effect on fibrosis or cardiac function. taken together, neonatal delivery of engineered icre-encoding viruses enabled, for the first time, differential gene targeting in non-myocyte and myocyte cells in the myocardium. non-myocyte deletion of mir-21 demonstrated that mir-21 exerts its cardiac profibrotic activity directly in cardiac fibroblasts and in endothelial cells. this novel finding should encourage tailoring of anti-mir-21 therapy towards cellular tropism. chronic inflammatory diseases, such as psoriasis or rheumatoid arthritis, are characterized by constant leukocyte infiltration and ongoing angiogenesis in the inflamed tissue. as current anti-inflammatory pharmacotherapy is not always satisfactory, there is a great demand for the discovery of new drug leads as well as novel drug targets. the synthetic carbazole alkaloid derivative c81 acts as a multikinase inhibitor. results of a thermal shift assay revealed that c81 shows by far the highest binding affinity to the bmp-2-inducible kinase (bmp2k/bike). bmp2k represents an as yet largely uncharacterized protein, which is not regulated by bmp-2 in endothelial cells. therefore, we aimed to analyze (i) the pharmacological potential of c81 and (ii) the role of bmp2k in angiogenic and inflammatory processes in the vascular endothelium. initial experiments show that only high concentrations of c81 affected the viability of human umbilical vein endothelial cells (huvecs).
both c81 and the knock-down of bmp2k (rnai) reduced the migratory capacity of a human microvascular endothelial cell line (hmec-1). the proliferation of hmec-1 was also reduced by c81 treatment (ic50: 7 µm). a tube formation assay on matrigel demonstrated that c81 significantly impaired the formation of capillary-like structures in a dose-dependent manner. interestingly, the analysis (western blot) of signaling molecules in huvecs that play a crucial role in cell proliferation (e.g. erk, akt) revealed that these pathways are influenced neither by c81 treatment nor by bmp2k gene silencing. with regard to inflammatory processes, c81 treatment or bmp2k silencing of huvecs decreased the adhesion of thp-1 cells, a monocytic cell line, onto the activated endothelial cells. as the interaction of leukocytes is mainly mediated by cell adhesion molecules (cams), the effect of c81 or bmp2k silencing on their expression was analyzed (flow cytometry, qpcr). while the expression of cams was strongly decreased after c81 treatment, the knock-down of bmp2k did not markedly affect their expression. furthermore, both approaches did not lead to a reduction of tnf-induced iκbα degradation (western blot) or p65 translocation into the nucleus (microscopy). our study provides first insights into the anti-inflammatory and anti-angiogenic potential of the carbazole alkaloid derivative c81 in vitro. the precise role of bmp2k in angiogenic and inflammatory endothelial processes, as well as the pathways involved during bmp2k silencing and c81 treatment, will be further elucidated. moreover, since the inhibition of bmp2k seems not to be responsible for all actions of c81, we will investigate the role of other kinases affected by the compound in these processes. the chemokine receptor cxcr4 is a multifunctional receptor which is activated by its natural ligand c-x-c motif chemokine 12 (cxcl12).
cxcr4 seems to be part of the lipopolysaccharide-sensing complex, suggesting that an intervention with cxcr4 agonists or antagonists could result in reduced tlr4 signaling. however, the role of cxcr4 and the influence of different cxcr4 ligands in acute as well as chronic inflammatory diseases are still contradictory. therefore, we aimed to characterize the systemic effects of cxcr4 activation in severe systemic inflammation and to evaluate its impact on endotoxin-induced organ damage by applying a sublethal lps dose (5 mg/body weight) in mice. the plasma-stable cxcl12 analog ctce-0214d was synthesized and administered subcutaneously shortly before lps treatment to ensure a delayed release and thereby a prolonged effect of the drug. 24 hours following lps administration, mice were sacrificed and blood was obtained for tnf-alpha, ifn-gamma and blood glucose evaluation. additionally, histopathological changes and oxidative stress in the liver and spleen were assessed, and liver biotransformation capacity was determined. finally, cxcr4, cxcl12 and tlr4 expression patterns in liver, spleen and thymus tissue, as well as the presence of different markers for oxidative stress and apoptosis, were evaluated by means of immunohistochemistry. ctce-0214d improved the health status and distinctly reduced the lps-mediated effects on tnf-alpha, ifn-gamma and blood glucose levels by approximately 35%, 50% and 70%, respectively. it attenuated oxidative stress in liver and spleen tissue and unambiguously enhanced liver biotransformation capacity. ctce-0214d diminished the lps-induced expression of cxcr4, cxcl12, tlr4, nf-κb, cleaved caspase-3 and gp91phox, whereas heme oxygenase 1 expression and activity were induced above average. furthermore, tunel staining revealed anti-apoptotic effects of ctce-0214d in all organs. cxcr4 is undoubtedly involved in inflammation.
its activation was accompanied by anti-inflammatory, anti-oxidative and cytoprotective effects, as ctce-0214d attenuated tlr4 signaling, induced heme oxygenase 1 activity and mitigated apoptosis. thus, the administration of cxcl12 analogs seems to be a promising treatment option to control acute systemic inflammation, especially when accompanied by hepatic dysfunction and an excessive production of free radicals. the neurodegenerative disease friedreich ataxia (frda) is caused by a gaa triplet repeat expansion in the first intron of the frataxin gene, which results in a reduction of the corresponding mitochondrial protein. despite several cellular and animal models, the exact function of frataxin is still a matter of debate, but the role of frataxin in iron-sulfur cluster biosynthesis is generally accepted. however, we still do not know which primary metabolic events are caused by a frataxin deficit, and until now there has been no therapeutic option available. we developed a new cellular model for frda by using the cre/loxp recombination system in mouse embryonic fibroblasts (mef). c57bl/6j mouse strains with a loxp-flanked exon 4 of the frataxin gene and a tamoxifen-inducible cre-recombinase (creert2) were crossed and several mef cell lines isolated. after selection by genotype and growth manner, the fx-mef 2-1 (fxn-/-) and fx-mef 2-8 (fxn+/-) cell lines were finally chosen. the generation of the homozygous or heterozygous frataxin knockout was successfully verified on rna and protein level. long maintenance of the frataxin-depleted fibroblasts revealed a strong growth inhibition, consistent with earlier observations in other cell systems. therefore, we established a pattern of treatment over 12 days, with medium and substance changes at day 4 and 8, which allows us to obtain a fully functional knockout and overcome the growth inhibition problem.
endpoint measurements of known metabolic phenomena from mammalian and non-mammalian models were studied at day 12 in our novel cell system. the induced total disruption of frataxin leads to clearly reduced aconitase activity, cell division and oxygen consumption, as well as an increase in ros production. in the heterozygous knockout with residual frataxin activity, no such changes were observed. in addition, our pattern of treatment enables us to monitor the full and partial frataxin knockout over the course of time, to detect early and late metabolic events after frataxin disruption. therefore we analysed the mentioned parameters (with additional atp and iron content) in parallel at days 3, 5, 7 and 10 and could identify an initial event followed by secondary consequences, and parameters which seem to play only a minor role in frda pathogenesis. on the contrary, a partial deficit of frataxin did not result in any differences over time, suggesting that cellular alterations occur only below a critical frataxin threshold. in conclusion, our newly established mammalian cellular frda model mimics typical metabolic consequences of the human disease and seems to be well suited for frda research. the model shows for the first time six different metabolic events over the course of time in parallel and reveals insights into primary and secondary events of frda pathogenesis. these observations can be used to better understand the function of frataxin and can help to develop new therapeutic strategies to address the consequences of frataxin deficiency. moreover, the transfer of this cell model into 96-well plates offers the possibility of high-throughput screening of potential therapeutic substances. the disease diphtheria is caused by diphtheria toxin (dt), which belongs to the group of single-chain ab-type bacterial protein toxins.
receptor binding of the b-domain on the target cell surface is followed by receptor-mediated endocytosis and internalization into early endosomal vesicles. endosomal acidification triggers membrane insertion and pore formation of the transmembrane (t) domain, together with translocation of the (partially) unfolded catalytic (c) domain into the cytosol. there, dta catalyzes adp-ribosylation of elongation factor 2, which leads to disruption of protein synthesis and finally causes cell death [1]. in hela cells, these events are related to cell rounding, which serves as a specific endpoint to monitor the uptake of dta into the cytosol of the host cell. as for other adp-ribosylating toxins such as c. botulinum c2 toxin, c. perfringens iota toxin and c. difficile cdt, we demonstrated that several host cell factors are involved in the translocation step of the catalytic domain of native dt across the endosomal membrane [2,3]. in detail, we confirmed the involvement of the host cell chaperone hsp90 and the thioredoxin reductase (trxr), the latter presumably responsible for the reduction of the interchain disulfide bond between the dta and dtb moieties [4,5,6]. furthermore, we identified another group of protein folding helpers, the family of peptidyl-prolyl cis/trans isomerases (ppiases), including cyclophilin a (cypa), cyp40 and fk506-binding protein (fkbp)51, as required cytosolic factors for dta translocation. to characterize the role of the protein folding helpers in more detail, we investigated their interaction with purified dta in vitro by performing dot blot analyses with immobilized recombinant host cell factors, co-precipitation of cellular factors with dta from hela lysate, and isothermal titration calorimetry with purified proteins, thereby determining the thermodynamic parameters of the individual binding events. in these experiments, we detected binding of dta to hsp90, cypa, cyp40, fkbp51 and fkbp52.
the data increase the knowledge of the molecular mechanisms underlying dt uptake and especially dta translocation, which can be used medically to develop novel therapeutic strategies against the disease diphtheria. [1] murphy (2011) toxins 3, 294-308. [2] barth (2011) naunyn-schmied arch pharmacol 383, 237-245. [3] kaiser et al. (2012) cell. microbiol. 14, 1193-1205. [4] dmochewitz et al. (2011) cell. microbiol. 13, 359-373. [5] ratts et al. (2003) j. cell biol. 160, 1139-1150. [6] schnell et al. (2015). novel afflictions such as clostridium (c.) difficile-associated diseases (cdad) are on the increase and challenging to treat. cdad occurs most frequently in hospitalized patients after prolonged treatment with antibiotics. cdad includes, among others, diarrhea and the severe form of pseudomembranous colitis. not only the treatment of the infection but also the neutralization of the toxins has high clinical significance. c. difficile secretes the exotoxins a (tcda) and b (tcdb), which glycosylate and thereby inactivate rho-gtpases in mammalian cells. tcda and tcdb are considered the causative agents of cdad. in the last few years, more and more hypervirulent strains of c. difficile have been described. in these hypervirulent strains, the adp-ribosyltransferase cdt was found as a third toxin in addition to tcda and tcdb. given the lack of agents effective against antibiotic-resistant bacterial strains and bacterial exotoxins, the development of novel pharmacological strategies is needed. the antimicrobial activity of naturally occurring substances has been known for a long time. one important mechanism of the innate immune system is the production of natural peptides with antibiotic properties. in recent years, it was shown that human antimicrobial peptides, as an important part of the innate immune system, play a crucial role not only in the inactivation of bacteria but also in the inhibition of bacterial toxins (1).
prompted by these results, we found that only human α-defensin-1 (hnp-1), but not human β-defensin-1 (hbd-1), both important effectors of the innate immune system, protected cultured epithelial cells from intoxication with tcda and cdt when applied to the cells prior to the toxins. moreover, α-defensin-1 also prevented the cytotoxic effects of all three c. difficile toxins tcda, tcdb and cdt combined in the medium. the combined investigation of all three toxins might be even more suitable to mimic the situation after an infection with a hypervirulent c. difficile strain. the inhibition of the toxins was monitored via the cell rounding caused by each of the toxins. currently, the molecular mechanisms underlying the inhibitory effects are still unknown and will be investigated in different cell lines. in conclusion, our results demonstrate that hnp-1 abolishes the cytotoxicity of the c. difficile toxins and may serve as a novel drug to treat c. difficile infections that lead to cdad. neurodegenerative diseases like parkinson's disease (pd) are accompanied by altered gene expression levels in the brain. recent studies support a role of regulatory noncoding rnas, such as micrornas (mirnas), which silence a specific set of mrnas at the post-transcriptional level. upon aberrant expression, they are likely involved in the pathophysiology of specific neuronal loss. manipulation of neuronal gene expression is pivotal for understanding the function of proteins and for the development of new therapeutic strategies. rna interference (rnai) strategies can be employed through the administration of small interfering rnas (sirnas), which mediate the specific knockdown of a selected target gene. however, the main challenge is the delivery of these rnas into the neurons of interest. in this pilot study, we present a method for delivering sirnas in polymeric nanoparticles based on low molecular weight polyethylenimines (peis).
their intracerebroventricular (icv) injection leads to in vivo silencing of neuronal gene expression in the brain of mice overexpressing α-synuclein (thy1-asyn mice). in a first step, pei-complexed sirna tagged with a fluorescence dye was injected to track its localization and distribution after icv administration. five days later, fluorescent cells were visible throughout the brain, with the highest fluorescence intensity around the ventricles. fluorescence was also observed in large cells of the lumbar spinal cord. moreover, preliminary results demonstrate a 42.6% knockdown (p<0.05, student's t-test, n=6) of human α-synuclein (snca) in the target structure striatum upon a single icv injection of pei-complexed specific sirna compared to the control injection group (n=9). hence, our first results support the usability and efficacy of pei nanoparticle-mediated delivery of short rnas, namely sirnas, for rapidly and efficiently reducing the expression of a neuronal target gene of interest in the brain in vivo. this may allow the development of gene therapy strategies for the treatment of neurodegenerative diseases. ba is a propenylic alkenylbenzene found in several plants, e.g. acorus calamus. ba-containing plant materials are used to flavor foods and are active ingredients in traditional plant medicines. thus, human exposure results primarily from the intake of bitters and teas, as well as from calamus-containing medicines and plant food supplements. although many (positive) pharmacological properties/effects of asarone isomers are described in the literature, ba was found to be carcinogenic in rodents (liver, duodenum) when given daily or as a single dose. early experiments indicated that ba is not activated via hydroxylation and sulfonation, as is the case for known hepatocarcinogenic allylic alkenylbenzenes such as estragole or methyleugenol.
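the group comparison reported above for the α-synuclein knockdown (student's t-test, unequal group sizes, n=6 sirna vs. n=9 control) uses a pooled-variance two-sample t statistic, which can be sketched with the standard library alone; the measurement values below are synthetic placeholders, not the study's data:

```python
import math
from statistics import mean, variance

def student_t(sample_a, sample_b):
    """two-sample student's t statistic with pooled variance
    (classical equal-variance form; df = n_a + n_b - 2)."""
    na, nb = len(sample_a), len(sample_b)
    sp2 = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / (na + nb - 2)
    return (mean(sample_a) - mean(sample_b)) / math.sqrt(sp2 * (1 / na + 1 / nb))

# synthetic placeholders: residual target mrna as % of control
control = [101.0, 98.5, 103.2, 96.8, 100.4, 99.1, 102.3, 97.6, 100.9]  # n = 9
sirna = [55.1, 60.3, 58.7, 54.9, 57.2, 59.4]                           # n = 6

t = student_t(sirna, control)
T_CRIT_DF13 = 2.160  # two-tailed critical value, alpha = 0.05, df = 13
print(f"t = {t:.2f}, significant at p < 0.05: {abs(t) > T_CRIT_DF13}")
```

with these invented numbers the statistic lies far beyond the df = 13 critical value; scipy.stats.ttest_ind would additionally return an exact p-value.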
because the mechanism of metabolic activation of ba is not known, we investigated the metabolism of ba in liver microsomes and by human cytochrome p450 (cyp) enzymes, the mutagenicity of ba and its metabolites in the ames fluctuation assay, and dna adduct formation in primary rat hepatocytes. we found that side-chain epoxidation (leading to diols and a ketone) was by far the dominant metabolic route of ba in liver microsomes and human cyp enzymes. ba was mutagenic in the ames test (+s9 mix), as was the synthesized ba epoxide (-s9 mix). furthermore, we were able to synthesize and characterize a ba epoxide-derived dna adduct with deoxyguanosine. this dna adduct was formed in a concentration-dependent manner in rat hepatocytes incubated with ba. our results strongly indicate that ba is genotoxic, with the side-chain epoxide being its ultimate carcinogen. morbid obesity is an independent risk factor for cardiovascular disease, type 2 diabetes mellitus and certain types of cancer. bariatric surgery, with the roux-en-y gastric bypass (rygb) being the gold standard, has become the therapeutic option of choice, as sustained weight loss and improvement of associated morbidity are achieved in the majority of patients. there is, however, a lack of evidence focusing on bariatric surgery-induced sustained weight loss and its possible impact on cancer risk. we investigated the association between obesity, oxidative stress and genomic damage after roux-en-y gastric bypass surgery (rygb) or caloric restriction-induced weight loss in the obese zucker rat. obese male zucker fa/fa rats were divided into three groups: sham surgery (sham), rygb and caloric restriction (cr), and were compared with lean controls (lean; zucker fa/+ rats). shams showed impaired glucose tolerance and elevated plasma insulin levels, which were less severe in rygb and cr.
oxidative stress was elevated in kidney, liver and colon tissue of sham and reduced again after weight loss induced by either rygb or cr. urine-derived oxidation products of lipids, dna and rna increased in shams and decreased after weight loss (rygb and cr). dna double strand breaks were more frequent in shams than in the weight loss groups or lean. dna damage in zucker fa/fa rats correlated with their basal plasma insulin values. obese rats showed elevated oxidative stress and genomic damage in comparison to lean rats. after body weight loss, achieved by either rygb or caloric restriction alone, oxidative stress levels and genomic damage were decreased. this may indicate a reduction of the elevated cancer risk in obesity. mice were treated with the noc-related compound azoxymethane (aom) followed by the administration of dextran sodium sulfate to trigger crc. tumors were quantified by non-invasive mini-endoscopy, which revealed a non-linear increase in crc formation in wt and aag -/- mice. in contrast, a linear dose-dependent increase in tumor frequency was observed in mgmt -/- and mgmt -/-/aag -/- mice. the data were corroborated by hockey stick modeling, which yielded similar carcinogenic threshold doses for wt and aag -/- mice. o6-meg levels and depletion of mgmt activity correlated well with the observed dose-response in crc formation. aom dose-dependently induced double strand breaks (dsbs) in colon crypts, including in lgr5-positive colon stem cells, which coincided with atr-chk1-p53 signaling. intriguingly, mgmt-deficient mice displayed significantly enhanced levels of γ-h2ax, suggesting the usefulness of γ-h2ax as an early genotoxicity marker in the colorectum. this study demonstrates for the first time a non-linear dose-response for alkylation-induced carcinogenesis and reveals dna repair by mgmt, but not aag, as a key node in determining a carcinogenic threshold at low alkylation dose levels [2].
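the hockey stick modeling used above to estimate a carcinogenic threshold dose fits a response that is flat below a breakpoint and rises linearly above it. a minimal stdlib sketch, assuming a simple grid search over the breakpoint and ordinary least squares for the remaining two parameters (the dose-response data below are invented for illustration):

```python
def hockey_stick_fit(doses, responses, n_grid=200):
    """fit response = b for dose <= tau, and b + m*(dose - tau) above tau,
    by grid search over the breakpoint tau with least squares for b and m."""
    lo, hi = min(doses), max(doses)
    best = None  # (sse, tau, b, m)
    for i in range(n_grid + 1):
        tau = lo + (hi - lo) * i / n_grid
        x = [max(d - tau, 0.0) for d in doses]  # hinge regressor
        n = len(doses)
        sx, sy = sum(x), sum(responses)
        sxx = sum(v * v for v in x)
        sxy = sum(v * r for v, r in zip(x, responses))
        denom = n * sxx - sx * sx
        if denom == 0:  # all doses at or below tau: slope undetermined
            continue
        m = (n * sxy - sx * sy) / denom
        b = (sy - m * sx) / n
        sse = sum((b + m * v - r) ** 2 for v, r in zip(x, responses))
        if best is None or sse < best[0]:
            best = (sse, tau, b, m)
    return best[1], best[2], best[3]

# invented dose-response: flat baseline ~2 up to dose 4, slope ~3 beyond
doses = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
resp = [2.1, 1.9, 2.0, 2.05, 2.0, 5.1, 8.0, 10.9, 14.1, 17.0, 20.2]
tau, b, m = hockey_stick_fit(doses, resp)
print(f"estimated threshold dose ~ {tau:.2f}, baseline {b:.2f}, slope {m:.2f}")
```

scipy.optimize.curve_fit with the same hinge model would do the job in one call; the grid search merely keeps the sketch dependency-free.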
obesity is characterized as a state in which the excessive accumulation of fat in adipocytes leads to local inflammation and hypoxia, both contributing to severe obesity-associated co-morbidities such as cardiovascular disease and type 2 diabetes mellitus. local inflammation is mediated by macrophages, stromal vascular cells, preadipocytes and adipocytes, as well as by a number of proinflammatory cytokines and chemokines (1). in particular, the chemokines monocyte chemoattractant protein (mcp-1, ccl2), interleukin-8 (il-8/cxcl8) and stromal cell-derived factor 1 (sdf-1a/cxcl12), secreted by stromal vascular cells, preadipocytes and adipocytes, exert paracrine effects by recruiting neutrophils, monocytes/macrophages, and t- and b-cells. interestingly, deficiency in cxcl14 was shown to attenuate obesity in mice (2). the cc chemokine ccl2 is known to stimulate ccr2 receptors, and the cxc chemokines cxcl8 and cxcl12 to activate cxcr1 and cxcr2, or cxcr4 and cxcr7, respectively. the receptor(s) activated by cxcl14 is currently unknown, although a role of cxcl14 in modulating cxcr4 signaling has been proposed. we initiated our studies to determine the presence and functional significance of chemotaxin receptors in human adipocytes and their precursor cells. to this end, the mrna expression pattern of cc chemokine receptors, cxc chemokine receptors, the formylated peptide receptor fpr1 and the related receptor fprl1, and cc and cxc chemokines was analyzed during in vitro adipose differentiation of human simpson-golabi-behmel-syndrome (sgbs) preadipocytes, and under conditions mimicking an inflammatory response. in particular, we focused on the expression pattern of human ccr2 receptors, since previous reports indicated a role in adipogenic differentiation. however, our comprehensive analysis using different sources of adipocytes and their precursors indicated that ccr2 receptors were absent (3).
yet, the analysis revealed appreciable levels of mrna encoding ccl2, cxcl12 and cxcl14, as well as ccr1, cxcr2, cxcr4, cxcr7, fpr1 and fprl1. of interest, cxcr4 and cxcr7 mrna were found to be up-regulated under the proinflammatory conditions. to analyze the responses of adipocytes and their precursors to chemokine receptor agonists, we used chemokine-mcherry fusion proteins purified from baculovirus-infected insect cells, e.g. ccl2, cxcl12 and cxcl14. while sgbs preadipocytes and adipocytes did not accumulate ccl2-mcherry upon stimulation, they showed a small accumulation of cxcl12-mcherry and a strong accumulation of cxcl14-mcherry in the endosomal compartment. similar results were obtained in murine 3t3-l1 preadipocytes. using mass spectrometry analysis, we set out to identify the putative cxcl14-binding receptor protein(s) in murine 3t3-l1 preadipocytes. (1) makki, k. et al. (2013). in personalized medicine, tumors are screened for several mutations in oncogenes or tumor suppressors. however, the cellular protein content does not depend exclusively on the dna. we identified new rac variants generated at the mrna level in androgen-independent prostate cancer cells. all variants represent active forms of the gtpase. they are capable of suppressing rhoa-induced apoptosis and, additionally, mediate the expression of genes which are under the control of the androgen receptor. importantly, expression of the rac variants is sufficient to support tumor growth in mice. we prove the existence of the variants and verify their clinical appearance and relevance in tissue samples of a prostate cancer patient. dna analysis, however, revealed the wild-type sequence of rac. therefore, routine analysis of patient tumor tissue would miss the detection of active rac, which precludes the success of therapy.
the existence of active rac variants in prostate cancer tissue that promote resistance towards androgen deprivation suggests rac inhibition as an effective add-on therapeutic strategy against prostate cancer. the bacterial effector protein exotoxin y (exoy) of pseudomonas aeruginosa is delivered into host cells via the bacterial type iii secretion system. once inside the host cell, the nucleotidyl cyclase activity of exoy is activated by a yet unknown cofactor and thus has a profound effect on the concentrations of cyclic nucleotides: in addition to production of cyclic amp (camp) and cyclic gmp (cgmp), there is a massive synthesis of cyclic 3',5'-ump (cump) and, to some extent, of the corresponding cytidylyl analogue ccmp 1,2. currently, the role of cump and ccmp during the pathogenesis of p. aeruginosa infection remains unknown 3. one of our hypotheses is that these cyclic nucleotides fulfil a role as first messengers, e.g. in the communication between individual bacteria or bacterial populations during the establishment of acute or chronic infections. to test this hypothesis, the intra- and extrabacterial concentrations of cyclic nucleotides were measured via hplc-ms/ms at different time points in liquid cultures of p. aeruginosa, either in a complete (lb medium) or a starving medium (vogel-bonner medium). additionally, we tested whether supplementation of the media with extrinsic cump or ccmp had an effect on these measured concentrations. the influence of extrabacterial cyclic nucleotides on bacterial metabolism and homeostasis was evaluated with a microarray of bacterial total cdna extracted at different time points of p. aeruginosa liquid culture with or without extrinsic cump/ccmp. furthermore, we investigated a potential function of the cyclic nucleotides in biofilm formation. cyclic ump and cyclic cmp have differential roles in bacterial metabolism and communication. for example, whereas ccmp is synthesized by p.
aeruginosa when the bacteria are in a nutrient-rich environment, we could not detect bacterial cump under any tested condition. in our biofilm formation assays, only ccmp had a biofilm-promoting effect, and only at very high concentrations. the currently ongoing analysis of gene expression data in the presence or absence of cump may reveal a role of this cyclic nucleotide as a first messenger, too. in further studies we will elucidate the signal transduction processes underlying the observed cump/ccmp effects, for example by identifying cump- and ccmp-binding proteins and their coupling mechanisms to intracellular signalling cascades. synapses are complex computational platforms that transmit information encoded in action potentials but also transform their functionality through synaptic plasticity. g protein-coupled receptors (gpcrs) play a major role in modulating the strength of synapses via the second messenger camp 1. however, the spatio-temporal dynamics of the mode of action of camp underlying synaptic plasticity are still controversial. the aim of this study was to investigate the dynamics of camp signaling at the drosophila neuromuscular junction, where octopamine binding to its receptors has been shown to cause camp-dependent synaptic plasticity 2. for this purpose, we generated a transgenic drosophila line expressing the camp sensor epac1-camps 3 in motor neurons. this allowed us to directly follow the octopamine-induced camp signals in real time by fluorescence resonance energy transfer (fret) in different compartments of the motor neuron (i.e. cell body, axon, boutons). we found that octopamine induces a steep camp gradient from the synaptic bouton (high camp) to the cell body (low camp), which was due to higher pde activity in the cell body. high octopamine concentrations evoked a response also in the soma. notably, these signals were independent and isolated from each other.
moreover, application of octopamine by iontophoresis to single synaptic boutons induced bouton-confined camp signals. these data reveal that a motor neuron can possess multiple and largely independent camp signaling compartments, and provide a new basis to explain how camp could control neurotransmission at the level of a single synapse. 1 kandel, e.r., dudai, y. & mayford, m.r. the molecular and systems biology of memory. cell 157, 163-186 (2014). 2 koon, a.c., et al. autoregulatory and paracrine control of synaptic and behavioral plasticity by octopaminergic signaling. nat neurosci. 14(2), 190-9 (2010). 3 nikolaev, v.o., bünemann, m., hein, l., hannawacker, a., lohse, m.j. novel single chain camp sensors for receptor-induced signal propagation. j. biol. chem. 279, 37215-37218 (2004). cardiovascular pharmacology 039: hyaluronic acid deposition determines engineered heart muscle characteristics and can be pharmacologically targeted to enhance function. s. schlick. background: engineered human myocardium (ehm) can be generated from psc-derived cardiomyocytes (cms) and primary fibroblasts suspended in a collagen i hydrogel (70%:30%:0.4 mg/ml). ehm development encompasses an early consolidation phase followed by functional maturation. the presence of fibroblasts is essential for consolidation into a force-generating ehm. here we assessed the hypothesis that fibroblasts of different origin support ehm formation differentially as a function of hyaluronic acid deposition. methods and results: oscillatory rheology (1% strain, 1 hz) on cell-free and cell-containing collagen i hydrogels directly after casting revealed enhanced consolidation in the presence of human foreskin fibroblasts (ffbs) compared to primary adult cardiac fibroblasts (cfbs); change in storage modulus over time (pa/min): collagen 0.03; collagen + cms 0.03; collagen + cms + cfbs 0.09; collagen + cms + ffbs 0.2. we next generated ehm with cms and ffbs or cfbs.
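the consolidation readout quoted above, the change in storage modulus over time in pa/min, is the slope of a straight-line fit to the rheology time course. a small stdlib sketch with made-up numbers (loosely mimicking the ~0.2 pa/min ffb condition, not actual instrument data):

```python
def slope_pa_per_min(times_min, g_prime_pa):
    """ordinary least-squares slope of storage modulus G' vs. time (Pa/min)."""
    n = len(times_min)
    st, sg = sum(times_min), sum(g_prime_pa)
    stt = sum(t * t for t in times_min)
    stg = sum(t * g for t, g in zip(times_min, g_prime_pa))
    return (n * stg - st * sg) / (n * stt - st * st)

# made-up 60 min time course at one sample per 10 min
times = list(range(0, 61, 10))
g_prime = [10 + 0.2 * t for t in times]  # idealized, noise-free G' in Pa
rate = slope_pa_per_min(times, g_prime)
print(f"consolidation rate = {rate:.3f} Pa/min")  # -> 0.200 (within float error)
```

on real, noisy rheometer traces the same least-squares slope is what the quoted pa/min figures summarize.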
after 4 weeks of culture under serum-free conditions, we assessed ehm function by contraction measurements. ffb-ehms developed a significantly (p<0.01) higher force of contraction (foc) per cross sectional area (csa) than cfb-ehms (maximal foc/csa in mn/mm²: 1.8±0.1, n=29 vs. 0.3±0.1, n=20). the cross sectional area of the tissues was greatly increased (p<0.01) in cfb-ehms (csa in mm²: 1.6±0.1, n=20 vs. 0.6±0.03, n=30), and the non-myocyte content was higher in cfb-ehms (5.6±0.7, n=9 vs. 3.1±0.4, n=15; ×10⁵ cells/ml). histological analysis revealed that cardiomyocytes were only poorly matured in cfb-ehms compared to ffb-ehms. extending the ehm functional data, principal component analysis of rna-seq data revealed distinct expression patterns for ffbs and cfbs, in which the hyaluronic acid synthase 2 (has2) enzyme was significantly (p<0.01) upregulated. based on these findings, we pharmacologically interfered with has2-mediated hyaluronic acid (ha) deposition by treating cfb-ehms with hyaluronidase during all 4 weeks of culture. interestingly, ecm manipulation with low concentrations of enzyme significantly (p<0.01) reduced csa (csa in mm²: control 1.8±0.4, n=8; hyaluronidase at concentrations from 0.15 u to 150 u: 0.9±0.2, n=4) with a concurrent, statistically significant (p<0.01) increase in contractile function and improved cardiomyocyte morphology on a histological level (maximal foc/csa in mn/mm²: control 0.2±0.05, n=8; hyaluronidase at concentrations from 0.15 u to 150 u: 0.4±0.08, n=4). summary and conclusions: our data suggest that ehm consolidation is influenced differentially by fibroblasts of different tissue origin, with ffb-ehm being functionally superior to cfb-ehm. cfb-ehm could be rescued by hyaluronidase treatment leading to reduced ha deposition. the latter demonstrates that extracellular matrix composition is centrally involved in ehm development. angiogenesis is the process of formation of new blood vessels from pre-existing ones.
vascular endothelial growth factor (vegf) is the most studied regulator of this process. by binding to its type 2 receptor (vegfr2), it has been shown to activate a variety of signaling pathways leading to enhanced angiogenesis. camp, on the other hand, is a versatile second messenger which regulates various endothelial functions, including barrier function. it directly activates protein kinase a (pka) or the exchange protein directly activated by camp (epac), which is a guanine nucleotide exchange factor (gef) for the small monomeric gtpase rap. as human umbilical vein endothelial cells (huvec) express both camp effectors (epac1 and pka), we investigated the role of camp signaling using a spheroid-based sprouting assay as an in vitro model of angiogenesis. interestingly, activation of β-adrenergic receptors with 5 µm isoproterenol significantly increased the cumulative sprout length. similarly, selective activation of epac with 30 µm of the camp analog 8-pcpt-2'-o-camp (007) significantly increased the basal and the vegf-induced cumulative sprout length. in accordance, sirna-mediated depletion of epac1 in huvec decreased basal and vegf-induced sprouting. surprisingly, 10 µm forskolin increased basal and vegf-induced cumulative sprout length more strongly than 007, indicating an additional role of pka. in accordance, 1 µm myristoylated pki, a membrane-permeable specific pka inhibitor, significantly attenuated the forskolin-induced increase in sprouting. in all conditions tested, 50 ng/ml vegf consistently showed an additive effect of the same extent on cumulative sprout length. therefore, our data indicate that the vegf pathway acts independently of the camp pathway in the regulation of sprouting angiogenesis. the β-adrenergic receptor-mediated activation of camp signaling in huvec induces angiogenic sprouting by activation of epac1 and pka.
introduction: hypertension is a major risk factor for the development of chronic heart and kidney disease. mineralocorticoid receptor (mr) antagonists are a cornerstone in the therapy of heart failure, and there is first evidence for a beneficial effect on the kidney as well. inflammation plays an important role in hypertensive organ injury. thus, this study was designed to evaluate and directly compare the effect of mr deletion in endothelial cells on blood pressure and cardiac vs. renal injury in a mouse model of deoxycorticosterone acetate-induced hypertension. methods and results: mice lacking the mineralocorticoid receptor in endothelial cells (mr cdh5cre) were created using the cre/loxp system. mr cdh5cre and cre-negative littermates (mr wildtype) underwent unilateral nephrectomy and received 1% nacl in the drinking water for 6 weeks. the mineralocorticoid deoxycorticosterone acetate (doca, 2.5 mg/d) was delivered by subcutaneous pellets. untreated mice served as controls (ctrl). ambulatory blood pressure was determined by implantable telemetry in awake mice. doca/salt treatment increased mean blood pressure in mr wildtype (141.7 ± 8.0 vs. ctrl 97.8 ± 3.2 mmhg, p<0.001) and mr cdh5cre (150.0 ± 3.0 vs. ctrl 104.1 ± 4.7 mmhg, p<0.001) mice without differences between genotypes. cardiac hypertrophy after doca/salt treatment was ameliorated in mr cdh5cre mice (ventricle weight 162.4 ± 4.2 mg vs. mr wildtype 189.0 ± 10.9 mg, p<0.05). doca/salt significantly increased cardiac fibrosis and the expression of fibrotic marker genes in mr wildtype but not in mr cdh5cre mice. this was accompanied by an increased expression of the vascular cellular adhesion molecule (vcam1) in mr wildtype cardiac endothelial cells. renal function was not altered by mr deletion in endothelial cells at baseline. doca/salt treatment led to marked interstitial fibrosis in the kidneys of mr wildtype (sirius red fibrosis score: 2.4 ± 0.2 vs. ctrl 0.1 ± 0.1, p<0.001) and mr cdh5cre (2.3 ± 0.2 vs.
ctrl 0.1 ± 0.1, p<0.001) mice. mrna expression of the fibrosis marker gene col3a1 (mr wildtype 4.3 ± 0.9-fold; mr cdh5cre 3.9 ± 1.1-fold vs. ctrl) was similarly increased. periodic acid-schiff staining revealed glomerular injury in both genotypes. this was associated with a marked rise in the urinary albumin/creatinine ratio (mr wildtype 20.4 ± 5.7-fold; mr cdh5cre 34.3 ± 12.5-fold vs. ctrl). in the kidney, vcam1 mrna expression and the number of macrophages were increased by doca/salt treatment independently of endothelial mr deletion. conclusion: mr deletion from endothelial cells ameliorated doca/salt-induced cardiac but not renal inflammation and remodeling, independently of blood pressure. these findings suggest different mechanisms for the beneficial effect of mr antagonists in hypertensive heart vs. kidney disease. platelets are relevant cells implicated in the morbidity and mortality provoked by cardiovascular thrombosis. even with current antiplatelet therapy there is still a substantial incidence of arterial thrombosis. therefore, a better understanding of the mechanisms involved in platelet activation and aggregation is required to develop improved antiplatelet therapies. the increase in the intracellular ca2+ concentration due to ca2+ entry from the extracellular space is critical for platelet activation and aggregation. ca2+ entry follows activation of plasma membrane receptors, including gq-coupled receptors for adp, thromboxane a2 (txa2) or thrombin, as well as the collagen receptor glycoprotein vi (gpvi). the cellular signalling pathways downstream of these receptors involve activation of phospholipase c and second messengers that are known to mediate activation of trpc channels. trpc proteins form receptor-operated cation channels, but their regulation and permeability differ depending on the cell type.
it has been proposed that trpc proteins might contribute to platelet function as constituents of agonist-activated ca2+ entry channels; however, the experimental approaches used so far and the lack of specific agonists or antagonists have not allowed determination of the individual contribution of trpc proteins to agonist-induced ca2+ entry in platelets, aggregation and thrombus formation. we detected the expression of trpc3 and trpc6 in human and mouse platelets. we found that these proteins together are essential components of a coincidence-detection system in cellular ca2+ signalling. this coincidence detection, triggered by simultaneous stimulation of both thrombin and collagen receptors, is required for phosphatidylserine exposure in human and murine platelets, indicating a role of trpc3/c6 proteins in procoagulant activity. in addition, we detected the expression of trpc1 transcripts in mouse platelets. therefore, we tested trpc1/c3/c6-deficient mice in an in vivo model of arterial thrombosis, where they showed reduced thrombus formation. regardless of the protective effect of trpc1/c3/c6 inactivation observed in the thrombosis model, no differences were detected in tail bleeding. to evaluate the relevance of these trpc proteins in platelet aggregation, we measured in vitro aggregation of platelets from trpc1/c3/c6-deficient mice and observed that aggregation was reduced after adp (3 µm) or txa2-analogue (1 µm) stimulation, but not after collagen stimulation (10 µg/ml). we are currently analyzing the in vitro aggregation and the agonist-evoked ca2+ response in platelets from different trpc compound- and single-deficient mouse lines to understand the mechanisms behind this phenotype with regard to its complementarity to current antiplatelet therapy. background: thrombin signaling initiates inflammatory events directly and through activation of platelets.
endogenous and pharmacologic inhibitors of thrombin are therefore of relevance during atheroprogression and for therapeutic intervention. the small leucine-rich proteoglycan biglycan (bgn) is such an endogenous thrombin inhibitor, acting through activation of heparin cofactor ii (hcii). here, the effect of genetic deletion of bgn on thrombin activity, inflammation and atherosclerosis was addressed. methods and results: bgn concentrations were elevated in the plasma of patients with acute coronary syndrome. in apoe -/- mice, bgn was detected in the plasma as well as in the glycocalyx of capillaries. additionally, bgn expression occurred in the subendothelial matrix of arterioles as well as in atherosclerotic plaques. in line with a role of bgn in balancing thrombin activity, apoe -/-/bgn -/0 mice exhibited higher activity of circulating thrombin and increased numbers of activated platelets than did apoe -/- mice. furthermore, higher concentrations of circulating cytokines in apoe -/-/bgn -/0 mice suggested a pro-inflammatory phenotype. likewise, immunohistochemistry and facs analysis of the aorta demonstrated increased macrophage content in atherosclerotic lesions of these mice. in addition, apoe -/-/bgn -/0 mice exhibited a higher aortic plaque burden and larger atherosclerotic lesions at the aortic root. of note, apoe -/-/bgn -/0 mice showed progressive dilatation of the aortic arch corresponding to a decrease in collagen fibril density, suggestive of outward remodelling in the absence of bgn. no differences were evident with respect to the lipid content of the aortic root plaques or circulating plasma lipids. treatment with the thrombin inhibitor argatroban reversed platelet activation and aortic macrophage accumulation in apoe -/-/bgn -/0 mice.
conclusions: the present results strongly suggest a protective role of bgn during the progression of atherosclerosis by inhibiting thrombin activity and platelet activation, and ultimately macrophage-mediated plaque inflammation. the exposure to environmental or human-made xenobiotics, including drugs, induces the hepato-intestinal transcription of metabolizing enzymes and transporters. the time-span of induction is thought not to exceed xenobiotic exposure, in order to minimize disturbances of endobiotic metabolism. in contrast, we find cross-generational transmission of the induction of the phase i enzyme cyp2b10 (1200-fold in females, 700-fold in males) in 6-day-old offspring of adult female mice exposed one week prior to mating to tcpobop (3 mg/kg i.p.), the model ligand of the xenosensing nuclear receptor car. such cross-generational effects of xenobiotics are of great clinical interest, as they could have profound consequences for the health status of the offspring, including interference with drug therapies. the multigenerational transmission of tcpobop-driven induction could be mediated by pre-uterine/pre-conceptional epigenetic changes of oocytes. alternatively, it could be brought about by direct intrauterine/post-conceptional contact with tcpobop released from long-term depots. to discriminate between these mechanisms, we conducted embryo transfer experiments in which either donor mothers or foster mothers were injected with tcpobop (3 mg/kg) prior to mating. the analysis of hepatic cyp2b10 expression in 6-day-old offspring is clearly consistent with a post-conceptional onset of tcpobop effects. thus, offspring of solvent-injected donor mothers transferred to tcpobop-exposed foster mothers display a 700-fold induction, while offspring from the reciprocal experiment show no changes.
cesarean sections on day e18.5 followed by cross-fostering proved transmission to be mediated predominantly via lactation (f1 hepatic cyp2b10 induction 3000-fold) and only to a minor part via intra-uterine exposure (300-fold). this mechanism is consistent with the absence of induction transmission via the male germline. to analyze whether tcpobop leads to functional consequences in drug metabolism of the f0 and f1 generations, we conducted in vivo zoxazolamine paralysis assays, taken as a functional test of cyp2b10 catalytic activity. in both tcpobop-pretreated f0 mice and their f1 descendants, the induction reduced the duration of paralysis evoked by zoxazolamine by >50%. the characterization of cross-generational tcpobop-mediated effects on other processes controlled by car, such as energy and bone metabolism, is in progress. first tests indicate a transmission of anabolic effects on bone, as evidenced by the induction of serum osteocalcin expression by 45% in 12-week-old offspring. in summary, the car-mediated cyp2b10 induction by tcpobop is transmitted to the offspring mainly via lactation, resulting in lasting phenotypic consequences in drug and bone metabolism. the effects of similarly lipophilic drugs and anthropogenic environmental pollutants are currently being investigated. such compounds could affect offspring despite discontinuation of intake or exposure well ahead of pregnancy. age-related cognitive decline can eventually lead to dementia, the most common mental illness in elderly people and an immense challenge for patients, their families and caregivers. cholinesterase inhibitors constitute the most commonly used antidementia prescription medication. the standardized ginkgo biloba leaf extract egb 761® is approved for treating age-associated cognitive impairment and has been shown to improve the quality of life of patients suffering from mild dementia.
a clinical trial with 96 alzheimer's disease patients indicated that the combined treatment with donepezil and egb 761® had fewer side effects than donepezil alone (yancheva et al., 2009). in an animal model of cognitive aging, we compared combined treatment with donepezil plus egb 761® against either monotherapy and vehicle. specifically, we compared the effect of chronic treatment (15 days of pretreatment) with donepezil (1.5 mg/kg p.o.), egb 761® (100 mg/kg p.o.), the combination of the two drugs, or vehicle in 18-20-month-old male ofa rats. learning and memory performance were assessed by morris water maze testing; motor behavior was assessed in an open field paradigm. in addition to chronic treatment, the substances were administered orally 30 minutes before testing. compared to the first day and to the control group, only the combination group showed a significant reduction in latency to reach the hidden platform on the second day of testing. moreover, from the second day of testing onwards, the donepezil, the egb 761® and the combination group required less time to reach the hidden platform compared to the first day. the control group did not reach the same latency reduction until day three. there were no effects on motor behavior. these results suggest a superiority of the combined treatment of donepezil with egb 761® compared to monotherapy. literature: yancheva, s., ihl, r., nikolova, g., panayotov, p., schlaefke, s., & hoerr, r. (2009). ginkgo biloba extract egb 761®, donepezil or both combined in the treatment of alzheimer's disease with neuropsychiatric features: a randomised, double-blind, exploratory trial. aging ment health, 13(2), 183-190. institut für vegetative physiologie und pathophysiologie, göttingen, germany. values are well below the plasma concentrations (3-28 µm; just et al., expert opin emerging drugs 20: 161-164, 2015) observed in patients treated with dantrolene.
although not yet proven directly, oat3 may be involved in the renal secretion of dantrolene and 5-oh dantrolene by mediating the first step, i.e. the uptake across the basolateral membrane of proximal tubule cells. the second step, the release of these compounds into the urine across the luminal membrane, is possibly mediated by mrp4. since oat3 was also detected in the cytoplasmic membrane of skeletal muscle cells (takeda et al., eur. j. pharmacol. 483: 133-138, 2004), dantrolene may reach its target, the intracellular ryanodine receptor ryr1, by influx through oat3; there it inhibits calcium efflux through ryr1, thereby preventing severe muscle contraction and malignant hyperthermia. this assumption, however, awaits a direct demonstration of dantrolene transport by oat3. in addition, we identified, besides dantrolene and 5-oh dantrolene, several further fda-approved drugs, such as tyrphostin ag 1478, ceefourin 1, glafenine, nalidixic acid, and prazosin, as inhibitors of es uptake by oat3. controls 1 comprised subjects with other defects and controls 2 subjects with genetic disorders. all drugs used in the first trimester were identified from the database and were cross-referenced against previously compiled lists of drugs with reactive intermediates and drugs with faa. drugs with reactive intermediates, with systemic absorption and with a daily dose ≥50 mg were considered os-inducing drugs. when there was an association between os-inducing drugs and a group of birth defects, we further investigated two different faa exposure categories: concurrent exposure to both os-inducing drugs and faa drugs (os+/faa+) and exposure to os-inducing drugs only (os+/faa-). when the number of subjects allowed (at least five cases/controls were exposed), we examined the role of folic acid. odds ratios (ors) with 95% confidence intervals were adjusted for maternal smoking and alcohol use in the first trimester in controls 1 and additionally adjusted for maternal age in controls 2.
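the adjusted ors below come from models with confounder adjustment, but a crude odds ratio with a wald 95% confidence interval can be computed directly from the reported exposure counts (65/464 exposed cases vs. 512/6033 exposed controls 1). a minimal illustrative sketch, not the authors' actual analysis:

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Wald 95% confidence interval.
    a/b: exposed/unexposed cases, c/d: exposed/unexposed controls."""
    or_ = (a * d) / (b * c)
    # standard error of log(OR) from the four cell counts
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(or_)
    lo = math.exp(log_or - z * se)
    hi = math.exp(log_or + z * se)
    return or_, lo, hi

# exposure counts for nervous system defects vs. controls 1:
# 65 of 464 cases exposed, 512 of 6033 controls exposed
or_, lo, hi = odds_ratio_ci(65, 464 - 65, 512, 6033 - 512)
print(f"crude OR {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

the crude value (~1.76) is close to the adjusted or of 1.71 (1.29-2.26) reported in the results, as expected when confounding is modest.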
results: a total of nine groups of birth defects were investigated. only nervous system defects were associated with os-inducing drugs. exposure rates were 65/464 (14.0%) for cases, 512/6033 (8.5%) for controls 1 and 130/1564 (8.3%) for controls 2, and adjusted ors (95% cis) were 1.71 (1.29-2.26) and 1.77 (1.27-2.46), respectively. this association was unchanged when we examined os+/faa+ and os+/faa- separately. the os+/faa+ category, however, had slightly higher or values than os+/faa- (2.41 vs. 1.61 for controls 1, and 2.55 vs. 1.67 for controls 2). because of the low number of exposed subjects, we could only examine folic acid in relation to os+/faa-. using os-/faa-/folic+ as reference, we found the highest risk with os+/faa-/folic- and a lesser magnitude with os+/faa-/folic+ (ors being 2 and 1.6 times the reference, respectively, for both controls). conclusion: our study suggests an increased risk of having a child with nervous system defects in mothers who were exposed to os-inducing drugs during pregnancy, and a potential risk reduction with folic acid. background: inhibition of rho-gtpases with statins, as well as specific inhibition of the small gtpase rac1, protects non-transformed cells from topoisomerase-ii (top2)-poison-induced cleavable complex formation and the dna double-strand breaks derived therefrom. this effect rests at least partially on the rac1-mediated regulation of topoisomerase ii activity. however, the link between rac1 and top2 poisoning is only poorly understood. furthermore, it is unclear whether mitochondrial or nuclear type ii topoisomerases are the most relevant target for top2-poison-induced cytotoxicity. here, we investigated the relevance of rac1-regulated actin cytoskeleton integrity as well as mitochondrial integrity for top2-poison-induced dna damage responses and cytotoxicity under conditions of rac1 inhibition.
methods: since endothelial cells are the first barrier for any kind of systemically administered chemical and cardiomyocytes are particularly sensitive to anthracyclines, endothelial cells (h5v) as well as cardiomyocytes (h9c2) were chosen as in vitro model systems for top2 poisoning. the cells were pre-treated with rac1 inhibitors, statins or actin cytoskeleton disruptors and were subsequently treated with the topoisomerase ii poisons doxorubicin or etoposide. to compare the levels of induced dna damage, γh2ax foci quantification as well as the comet assay were employed. actin disruption was visualized by phalloidin-fitc staining. to be able to detect relevant changes in mitochondrial mass or integrity, high doses of top2 poisons had to be used in both cell lines. changes in mitochondrial homeostasis and integrity were detected by the jc-1 assay, the mitotracker assay and an atp assay. additionally, pcr- and gel-electrophoresis-based methods were used for detecting mitochondrial dna damage. selected components of the dna damage response machinery as well as factors of mitochondrial homeostasis were detected by western blot. results: disruption of the integrity of the actin cytoskeleton attenuated the dna damage response to a similar extent as seen upon rac1 inhibition, pointing to a role of actin filaments in the dna damage response after genotoxic insults. the actin cytoskeleton seems to participate in genotoxin-induced dna damage, repair, or the dna damage response, as reflected by reduced numbers of nuclear γh2ax foci and by the comet assay after treatment with doxorubicin. this was not related to nuclear import or export of doxorubicin. disturbance of mitochondrial homeostasis or integrity was only detectable at high doses of topoisomerase ii poisons. this was largely unaffected by pre-treatment with statins or rac1 inhibitor. the top2-poison-induced rise in mitochondrial mass was slightly enhanced by the rac1 inhibitor and statins.
interestingly, inhibition of rac1 counteracted doxorubicin-induced phosphorylation of the amp-kinase in endothelial cells but not in cardiomyocytes. conclusion: mitochondrial toxicity seems to play only a minor role in top2-poison-induced cytotoxicity in h9c2 and h5v cells. the data point to a role of rac1-regulated filamentous (nuclear?) actin in dna repair and/or the dna damage response after treatment with top2 poisons. poly(adp-ribose) polymerase 1 (parp1) and the recq helicase werner syndrome protein (wrn) are important caretakers of the genome. they physically interact with each other and are both localized in the nucleus, in particular in the nucleoli. both participate in various overlapping mechanisms of dna metabolism, in particular the genotoxic stress response and dna repair [1]. previously, we and others have shown in biochemical studies that enzymatic functions of wrn are regulated by parp1 as well as by non-covalent poly(adp-ribose)-wrn interaction [2-4]. furthermore, pharmacological parp inhibition as well as genetic parp1 ablation in hela cells alters the recruitment kinetics of wrn to sites of laser-induced dna damage [5]. here we report a novel role for parp1 and poly(adp-ribosyl)ation in the regulation of wrn's subnuclear spatial distribution upon induction of oxidative stress. we could verify previous reports that wrn is transiently released from nucleoli upon induction of oxidative stress, camptothecin (cpt) treatment, and laser-induced dna damage in a time-dependent manner. while cpt-induced translocation appears to be a parp-independent process, our results reveal that upon h2o2-induced oxidative stress, parp1 is essential for the translocation of wrn from the nucleoli to the nucleoplasm. parp1 activity only partially contributes to wrn release from nucleoli, underlining the importance of a direct wrn-parp1 interaction for subnuclear wrn redistribution.
furthermore, we identified a novel par-binding motif within the wrn sequence that is located in its rqc domain, which also harbors the binding site for parp1 and is necessary for wrn's nucleolar localization under non-stress conditions. currently, we are testing corresponding wrn mutants to analyze whether this region is responsible for the parp1-dependent release of wrn from nucleoli to sites of dna damage. in conclusion, we provide novel insight into the role of parp1 in wrn's spatio-temporal regulation in the nucleus during the oxidative stress response. host-cell reactivation (hcr) is an assay used to determine the dna repair capacity of cells. in its canonical layout, the test utilised a virus or a plasmid with a marker gene inactivated by uv damage [1, 2]. among the infected or transfected host cell types, only those with a functional dna repair pathway would reactivate the damaged dna, thus providing a rationale for the identification of dna repair genes in mutant screens. an obvious advantage of hcr is that repair can be measured in cells that have not been exposed to a damaging agent. however, because of the multiple variables of damage generation, transfection and interpretation of results, the assay has been hard to harmonise and develop into a widely accepted quantitative dna repair assay. over the last years, my team has developed and validated several major improvements of the mammalian hcr assay. exploiting sequence-specific nicking endonucleases and customised design of the reporter vectors, we proposed an innovative and very efficient technique for the incorporation of synthetic oligonucleotides, containing single structurally defined dna base and backbone modifications, into desired gene elements [3]. this ad hoc approach allows examination of repair in a strand-specific manner and at single-nucleotide resolution. we efficiently applied hcr in its new layout to the measurement of nucleotide excision repair of various dna adducts.
moreover, we demonstrated that the enhanced hcr assay can differentiate between the transcription-coupled (tc-ner) and global genome (gg-ner) subpathways of ner [4]. we further obtained significant new insights into the lesion-specific mechanisms of base excision repair of several endogenously occurring aberrant dna bases [5-7] and plan to adapt the assay to the detection of mismatch repair and translesion dna synthesis. in addition to the applications in the dna repair field, the enhanced hcr assay provides a tool for investigation of the dynamics and transcriptional impact of the regulatory dna bases 5-methylcytosine and 5-hydroxymethylcytosine as well as their derivatives (5-formylcytosine and 5-carboxylcytosine). a coordinated and faithful dna damage response is of central importance for maintaining genomic integrity and cell survival. transcriptional activation of dna repair genes is an important regulatory mechanism contributing to the adaptation of cells to genotoxic stress and protection against genotoxin-mediated cell death. here we show that exposure to a low dose of benzo(a)pyrene diol epoxide (bpde), the active metabolite of benzo(a)pyrene (b[a]p), which represents the most important carcinogen formed by incomplete combustion during food preparation and smoking, causes upregulation of several dna repair genes. combined induction of the nucleotide excision repair (ner) genes ddb2, xpc, xpf and xpg enhanced repair activity and protected cells against a subsequent bpde exposure. furthermore, induction of the translesion polymerase polh was also involved in protection against bpde-induced apoptosis, but led to an enhanced mutation frequency in the surviving cells. activation of these dna repair pathways was also observed upon exposure to b[a]p and, in vivo, in buccal cells of male individuals upon smoking, indicating that this mechanism may be involved in the formation of smoking-related cancers.
altogether, we could show that low-dose bpde exposure activates a complex network of transcriptional alterations, leading to protection against cell death at the cost of an increased mutation frequency, highlighting the danger of occasional smoking. poly(adp-ribosyl)ation (parylation) is an essential posttranslational modification with the biopolymer poly(adp-ribose) (par). the reaction is catalyzed by poly(adp-ribose) polymerases (parps) and plays key roles in cellular physiology and the stress response by regulating physico-chemical properties of target proteins. of the 17 members of the human parp gene family, at least four have been shown to exhibit par-forming capacity. upon dna damage, parp1 is catalytically activated and is thought to contribute the bulk of cellular par formation. parp inhibitors are currently being tested in clinical cancer treatment, in combination therapy or as monotherapeutic agents inducing synthetic lethality (mangerich and bürkle 2011, mangerich and bürkle 2015). here we generated a genetic knockout of parp1 in one of the most widely used human cell systems, hela cells, via talen-mediated gene targeting and characterized the resulting hela parp1 ko cells with regard to parylation metabolism and the genotoxic stress response. furthermore, by reconstituting hela parp1 ko cells with a series of artificial and natural parp1 variants, we analyzed structure-function relationships of parp1 in a cellular environment without interference from endogenously expressed wt-parp1. we confirmed that the parp1 e988k mutant exhibits mono-adp-ribosylation activity and extended previous reports by demonstrating that the parp1 l713f mutant is constitutively active in a cellular environment, leading to high cellular par levels even in unchallenged cells.
additionally, both mutants exhibited significantly altered recruitment and dissociation kinetics at sites of laser-induced dna damage, which can partially be attributed to non-covalent parp1-par interaction via at least one specific par-binding motif located in zinc finger 2 of parp1. expression of both artificial mutants led to distinct cellular consequences caused by the altered cellular biochemistry. while the expression of parp1 l713f itself triggered apoptosis, parp1 e988k expression led to a strong g2 arrest during the cell cycle and sensitized cells to camptothecin treatment. interestingly, pharmacological parp inhibition with abt-888 mitigated the effects of the e988k mutant, suggesting distinct functions of mono-adp-ribosylation. finally, by reconstituting parp1 ko cells with a natural cancer-associated parp1 snp variant (v762a), as well as a newly identified parp1 mutant present in a patient with pediatric colorectal carcinoma (f304l-v762a), we demonstrate that these variants exhibit altered biochemical and cellular properties, potentially supporting carcinogenesis. together, this study establishes a novel model to study parp1-dependent parylation during the genotoxic stress response and reveals new insights into the structure-function relationships of artificial as well as natural parp1 variants in a cellular environment, with implications for parp research in general. mangerich, a. and a. bürkle (2011). "how to kill tumor cells with inhibitors of poly(adp-ribosyl)ation." int j cancer 128(2): 251-265. mangerich, a. and a. bürkle (2015). multitasking roles for poly(adp-ribosyl)ation in aging and longevity. parp inhibitors for cancer therapy, springer: 125-179. leukemic cells frequently overexpress the transcription factor wilms tumor 1 (wt1), and the persistence of wt1 expression after chemotherapy indicates remaining leukemic stem cells.
hydroxyurea induces replicative stress through its ability to inhibit ribonucleotide reductase, an enzyme that catalyzes the synthesis of dntps from ntps. we demonstrate that the expression level of wt1 determines the extent of dna damage and apoptosis in a panel of leukemic cells treated with hydroxyurea. accordingly, inhibiting apoptosis through chemical inhibition of caspases or by overexpression of mitochondrial anti-apoptotic bcl proteins prevents the hydroxyurea-induced depletion of wt1 and cell death. in addition, we show that rna interference-mediated elimination of wt1 sensitizes leukemic cells to the pro-apoptotic and dna-damaging effects of hydroxyurea. furthermore, such a loss of wt1 suppresses hydroxyurea-induced erythroid differentiation. pharmacological approaches that diminish wt1 also sensitize cells to hydroxyurea. these include the tyrosine kinase inhibitor (tki) imatinib and epigenetic modifiers belonging to the histone deacetylase inhibitor (hdaci) group. thus, inhibition of wt1 is therapeutically exploitable for a targeted approach against leukemic cells undergoing replicative stress. our novel findings reveal that wt1 is a biological target of hydroxyurea, and they suggest that wt1 has a previously unrecognized ability to prevent dna damage when replication forks halt and eventually collapse. fret (fluorescence resonance energy transfer)-based cell assays were developed to directly monitor receptor activation and the receptor-stimulated camp response. mutant β1ar were generated by insertion of cyan and yellow fluorescent proteins (cfp and yfp) into the third intracellular loop and the c-terminus, respectively (bornholz et al., cardiovasc res 97:472, 2013), and stably transfected into hek293 cells (hekβ1-fret). to monitor the camp response, the epac1-based fret sensor of camp was stably transfected alone (hekwt-e1) and together with a moderate level of native β1ar into hek293 cells (hekβ1-e1; nikolaev et al., jacc 50:423, 2007).
fret activity was measured with recently developed fluorescence detectors (12 channels) equipped with fast semiconductor technology, avoiding any movable optical and mechanical parts, using 438 nm for excitation and the 483/540 nm emission ratio. cells were cultivated in 96-format 12-well strips, incubated in physiological hepes-buffered salt solution and treated with βar agonists of different selectivity and affinity to determine their βar-subtype preference. catecholamines tested in hekβ1-fret cells exhibited ec50 values (−log m) which matched kd values (−log m) known from native heart receptor membranes (isoprenaline, iso: 6.9±0.1; adrenaline: 5.7±0.1; noradrenaline: 6.2±0.1). βar expression levels were controlled by radioligand binding with [3h]-(−)-cgp 12,177, resulting in densities of ~4×10⁶ and ~1.4×10⁴ receptors/cell in hekβ1-fret and hekβ1-e1 cells, respectively, whereas in hekwt-e1 cells only ~1000 β2ar were found. surprisingly, the low level of βar in hekwt-e1 cells allowed the measurement of the action of β2-sympathomimetics (β2sym), e.g. fenoterol, thereby amplifying receptor binding (pkd ~6.7) to an effective regulation of fret activity in the presence of 0.06 mm ibmx (pec50 ~8.3±0.1), nearly matching the ~100-fold amplification of iso (pkd ~6.9; pec50 ~8.9). in order to determine β1ar-mediated side effects of β2sym, hekβ1-e1 cells, characterized by a 14-fold higher level of β1ar over β2ar, were assayed for fret activity. fenoterol maximally inhibited fret activity with a pec50 ~8.4, whereas the high-affinity β2sym salmeterol acted as a partial agonist (~75% of iso maximum, pec50 ~8.6), both compounds being rather insensitive to the highly effective β1ar blockade with 1 µm cgp 20,712a. for that reason hekβ1-fret cells were used, characterizing fenoterol as a partial agonist (~25%) whereas salmeterol activated less than 10% of the maximum receptor activation by iso.
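the potencies above are given as pec50 values, i.e. −log10 of the half-maximal effective concentration in mol/l. as a reading aid, a minimal sketch of the underlying hill (logistic) concentration-response relation; the isoprenaline pec50 of ~8.9 is taken from the abstract, while the hill slope of 1 is an assumption for illustration:

```python
def hill_response(conc_m, pec50, n_h=1.0):
    """Fractional response of a Hill concentration-response curve.
    conc_m: agonist concentration in mol/l; pec50 = -log10(EC50)."""
    ec50 = 10.0 ** (-pec50)
    return conc_m ** n_h / (conc_m ** n_h + ec50 ** n_h)

# isoprenaline in hekβ1-e1 cells, pec50 ~8.9 (from the abstract)
for c in (1e-10, 10 ** -8.9, 1e-7):
    print(f"{c:.2e} M -> fractional response {hill_response(c, 8.9):.2f}")
```

at the ec50 (10^−8.9 m ≈ 1.3 nm) the response is exactly half-maximal, which is why pec50 values can be compared directly as affinities on a log scale.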
thus, it has to be concluded that the low level of effectively coupled wt-β2ar present in hekβ1-e1 cells precludes the exact determination of β1ar-mediated side effects of β2sym, and that cfp/yfp-labelled receptors have to be used for the determination of the subtype-specific intrinsic activity of an agonist. g protein-coupled receptors account for about one third of all drug targets. their regulation, from desensitization to internalization and alternative signal transduction, is largely dependent on phosphorylation of intracellular serine and threonine residues of the activated receptor. even though the β1-adrenoceptor is of tremendous importance in a number of diseases, its phosphorylation remains poorly understood. we addressed this question in a qualitative and quantitative way. by using radioactive phosphorylation assays and mass spectrometry, we were able to elucidate the phosphorylation pattern of the human β1-adrenoceptor in vitro. we identified ten previously unknown phosphorylation sites in the third intracellular loop and the receptor's c-terminus. labeling hek293 cells with stable heavy isotopes (silac) led to the discovery of a stimulation-dependent regulation of several of these phosphorylation sites. furthermore, mutagenesis studies in stably transfected hek293 cells revealed the impact of phosphorylation on arrestin binding and internalization of the receptor. fluorescence resonance energy transfer experiments with β1-adrenoceptor variants carrying point mutations of putative phosphorylation sites identified two c-terminal phosphosites that determine arrestin recruitment. our current goal is to further investigate the functional implications of these newly identified phosphorylation sites on downstream signal transduction, with an emphasis on the map kinase pathway. a moderate increase in arrestin affinity to the β2-adrenergic receptor is sufficient to induce arrestin internalization.
the homologous desensitization of g-protein-coupled receptors is a two-step process. initially, g-protein-coupled receptor kinases phosphorylate agonist-occupied receptors, which are subsequently bound by arrestins. in many cases, the resulting receptor-arrestin complex is then internalized via clathrin-coated pits. depending on the identity of the receptor and the ligand, the complex between receptor and arrestin may exist only in the proximity of the plasma membrane or internalize into the cell interior. we constructed mutants of the β2-adrenergic receptor carrying three additional serine residues in various positions at the c-terminal tail. one of these mutants, which carried the serine residues in close proximity to the endogenous grk phosphorylation sites (β2ar-sss), showed increased isoprenaline-stimulated phosphorylation and differences in arrestin-3 affinity and trafficking. the affinity of arrestin-3 to the receptor was measured by fluorescence resonance energy transfer (fret) between the receptor and arrestin-3 and by two-color fluorescence recovery after photobleaching (frap). in the fret assay, arrestin-3 dissociation from the β2ar-sss receptor upon agonist washout was prolonged approximately two-fold compared to the wild-type receptor. frap was performed with an n-terminally tagged receptor immobilized with an antibody against the n-terminal tag, either in solution or on a micropatterned surface. in these assays, the recovery of arrestin-3 into the bleached region was prolonged between two- and four-fold for the β2ar-sss receptor compared to the wild-type. even though this two- to four-fold increase in affinity seemed rather modest, it resulted in the trafficking of receptor-arrestin complexes to the early endosome, whereas the wild-type receptor interacted only transiently with arrestin at the plasma membrane. furthermore, the increased affinity of arrestin led to more efficient internalization of the β2ar-sss compared to the wild-type receptor.
however, recycling to the plasma membrane after agonist washout was very similar for both receptors. we conclude that even a modest change in affinity between a g-protein-coupled receptor and arrestin can lead to substantial alterations in arrestin trafficking, which in turn may have effects on cellular signaling. although recent structural research allows a better understanding of gpcr structure, crucial aspects of the selectivity mechanism of receptor-g protein subtype coupling remain unresolved. based on the hypothesis that the affinity of the ternary complex (agonist/gpcr/g protein) in the nucleotide-free state determines the selectivity of gpcr-g protein coupling, we set out to measure the gpcr-g protein interaction in membranes of single cells. in order to quantify the affinity of the gα subunit towards gpcrs in single cells, we determined the lifetime of the receptor-g protein complex in living cells upon agonist withdrawal under conditions of gtp depletion. therefore, we utilized förster resonance energy transfer (fret)-based assays to study interactions between fluorescent muscarinic receptors and heterotrimeric g proteins in single permeabilized hek293t cells transfected with the appropriate cdnas. here we focused on muscarinic m1, m2, and m3 receptors and characterized the kinetics of agonist-induced binding of go/i and gq/11 proteins to muscarinic receptors and their subsequent dissociation in the absence of nucleotides. as a measure of affinity we calculated the rate constant of g protein dissociation from the receptor after agonist withdrawal. the dissociation kinetics of the go protein from m3- and m1-achrs was found to be 10-fold faster in comparison to gq. similarly, we observed a 15-fold right shift of the concentration-response curves of go protein binding to m3-achr in comparison to gq.
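the dissociation rate constant koff used above as the affinity measure characterizes the lifetime of the receptor-g protein complex (τ = 1/koff). assuming a mono-exponential decay of the fret signal after agonist withdrawal, koff can be recovered from as few as two time points; the numbers below are illustrative, not taken from the abstract:

```python
import math

def koff_from_two_points(t1, f1, t2, f2):
    """Rate constant of a mono-exponential decay f(t) = f0 * exp(-koff * t),
    estimated from two (time, signal) measurements."""
    return math.log(f1 / f2) / (t2 - t1)

# illustrative FRET amplitudes (arbitrary units) after agonist washout:
# signal drops from 1.00 to 0.37 over 10 s
koff = koff_from_two_points(0.0, 1.00, 10.0, 0.37)
print(f"koff ≈ {koff:.4f} 1/s, complex lifetime τ ≈ {1 / koff:.1f} s")
```

a g protein that dissociates 10-fold faster, as reported here for go versus gq, thus has a 10-fold shorter complex lifetime, i.e. a lower affinity of the ternary complex.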
in order to ensure that the affinity of the ternary complex correlates with the efficiency of g protein activation, we performed experiments on g protein activity in intact cells expressing non-fluorescent m3-achr using a fret-based assay. our results showed that gq activation required a 10-fold lower agonist concentration compared to go activation, suggesting that indeed the stability of the ternary complex in the absence of nucleotides determines the selectivity of gpcr-g protein coupling. we further explored the subtype selectivity of m2-achr for gi family members by comparing the dissociation kinetics of gi1, gi2, gi3, and go proteins from m2-achr under nucleotide-depleted conditions. the koff values of gi1 and gi3 were found to be two-fold higher in comparison to gi2 and go proteins, indicating the higher affinity of the latter to m2-achr. our fret-based assay to study receptor-g protein interactions in membranes of single cells has proven to be a fast and reliable method to quantify the affinity of the ternary complex. the g protein subtype-dependent differences in affinity towards activated receptors correlate with the g protein coupling efficiency of the receptor. despite their tremendous pharmacological relevance and potential for the development of new drugs, our understanding of g protein-coupled receptor (gpcr) architecture and signaling mechanisms is still limited. major reasons for this are the low abundance and poor biophysical properties of gpcrs, which make them one of the most challenging classes of proteins for structural and biophysical studies. among the superfamily of gpcrs, the class b receptors, comprising 15 receptors, are structurally least understood, because to date it has not been possible to obtain a crystal structure of this receptor class.
to overcome these limitations, we have developed a method for improving functional expression and simultaneous thermostabilization of gpcrs by directed evolution, which is based on expression of receptors in saccharomyces cerevisiae and subsequent selection of highly expressing variants by flow cytometry with fluorescent ligands. by this strategy, key residues within a receptor sequence that are responsible for improved biophysical properties can be rapidly identified without greatly affecting the pharmacological features of the receptor. we have now applied this method to the human parathyroid hormone 1 receptor, a member of class b of the gpcrs, which is a major regulator of calcium homeostasis in the body and a key target for the treatment of osteoporosis. from two rounds of directed evolution in yeast we obtained several mutants of the parathyroid hormone 1 receptor that exhibit strongly improved expression levels and that remain stable after solubilization in detergents. these receptor variants are ideal candidates for subsequent structural and biophysical analysis. opioid drugs exert nearly all of their clinically relevant actions through stimulation of μ-opioid receptors (mors). the molecular biology of endogenous opioid peptides and their cognate receptors has been studied extensively in vitro. for mor, signaling efficiency is tightly regulated and ultimately limited by the coordinated phosphorylation of intracellular serine and threonine residues. morphine induces a selective phosphorylation of serine 375 that is predominantly catalyzed by g protein-coupled receptor kinase 5. as a consequence, the selective morphine-induced s375 phosphorylation does not lead to robust beta-arrestin mobilization and receptor internalization.
by contrast, high-efficacy opioid agonists such as fentanyl or etonitazene not only induce phosphorylation of s375 but also drive higher-order phosphorylation on the flanking residues threonine 370, threonine 376, and threonine 379 in a hierarchical phosphorylation cascade that specifically requires the grk2 and grk3 isoforms. as a consequence, multisite phosphorylation induced by potent agonists promotes both beta-arrestin mobilization and robust receptor internalization. however, little is known about agonist-selective phosphorylation patterns in vivo after acute and chronic drug administration. to learn more about mor regulation in vivo we have generated a new μ-opioid receptor knock-in mouse with an n-terminal ha-tag. using these mice, we were able to study the in vivo phosphorylation of an endogenous g protein-coupled receptor using both mass spectrometry and phosphosite-specific antibodies. we were also able to address the question of which of the many putative mor splice variants detected at the mrna level are indeed expressed as functional receptors in mouse brain. ion channels. 061: hcn4 in thalamic relay neurons is necessary for oscillatory activity in the thalamocortical system. institut für physiologie i, westfälische wilhelms-universität, münster, germany. hcn channels underlie the ih current and are involved, among other functions, in the genesis of epilepsy. the significance of the hcn1 and hcn2 isoforms for brain function and epilepsy has been demonstrated; however, the role of hcn4, the third major neuronal hcn subunit, is not known. here we show an unexpected role of hcn4 in controlling oscillations in the thalamocortical network. hcn4 is predominantly expressed in several thalamic relay nuclei, but not in the thalamic reticular nucleus or the cerebral cortex. hcn4-deficient thalamocortical relay neurons showed a massive reduction of ih and strongly reduced intrinsic burst firing. evoked thalamic oscillations in a slice preparation were completely abolished.
In vivo, brain-specific HCN4 null mutants were protected against induced spike-and-wave discharges (SWD), the hallmark of absence seizures. Our findings indicate that HCN4 is necessary for rhythmic intrathalamic oscillations and that the channel constitutes an important component of SWD generation.

Ludwig-Maximilians-Universität, Walther-Straub-Institut für Pharmakologie und Toxikologie, München, Germany. TRPC4 and TRPC5 channels are members of the classical transient receptor potential (TRPC) family whose activation mechanism downstream of phospholipase C (PLC) has largely remained elusive until now. While TRPC3/6/7 channels are directly activated by diacylglycerol (DAG), TRPC4 and TRPC5 channels are commonly regarded as DAG-insensitive. In contrast to TRPC3/6/7 channels, they contain a C-terminal PDZ-binding motif allowing for binding of Na+/H+ exchanger regulatory factor (NHERF) 1 and 2. Interestingly, performing electrophysiological measurements, co-immunoprecipitations and intermolecular dynamic FRET experiments, we found that dissociation of NHERF proteins from the C-terminus of TRPC5 confers DAG sensitivity on TRPC5 channels. TRPC5 channels were DAG-sensitive under the following experimental conditions: inhibition of protein kinase C, amino acid exchange in the C-terminal PDZ-binding motif, PIP2 depletion with and without involvement of PLC, over-expression of G protein-coupled receptors, down-regulation of endogenous NHERF1 and 2 proteins, and over-expression of an NHERF1 mutant incapable of TRPC5 binding. These findings strongly argue for NHERF proteins as molecular determinants of channel activation. Interestingly, PIP2 depletion itself caused slight TRPC5 current increases, while during PIP2 depletion the membrane-permeable DAG analogue OAG evoked even higher TRPC5 currents, suggesting that PIP2 depletion induces an active and DAG-sensitive channel conformation.
Receptor-mediated PIP2 depletion also resulted in dissociation of NHERF1 and 2 from the C-terminus of TRPC5, thereby eliciting a DAG-sensitive TRPC5 channel conformation. Thus, our findings suggest that the DAG sensitivity of TRPC5 is the result of an activation cascade starting with PIP2 depletion and subsequent dynamic dissociation of NHERF1 and 2 from the C-terminus of TRPC5. Altogether, DAG sensitivity is a unifying functional hallmark of all TRPC channels.

The melastatin-related transient receptor potential channel TRPM3 is a heat-activated nonselective cation channel expressed in sensory neurons of dorsal root ganglia. Since TRPM3-deficient mice show impaired inflammatory thermal hyperalgesia, pharmacological inhibition of TRPM3 may exert antinociceptive properties. Fluorometric Ca2+ assays and a compound library containing approved drugs were used to identify TRPM3 inhibitors and to characterize their potency and selectivity. Biophysical properties of the block were assessed using electrophysiological patch-clamp methods. Microfluorometry in fura-2-loaded single cells was applied to monitor [Ca2+]i signals in isolated dorsal root ganglion (DRG) neurons. Analgesic effects were assessed by applying pregnenolone sulfate (PregS)-induced chemical pain and heat stimuli in mice. In the screening approach using stably transfected HEK-TRPM3 cells we identified the nonsteroidal anti-inflammatory drug (NSAID) diclofenac, the tetracyclic antidepressant maprotiline and the anticonvulsant primidone as highly efficient TRPM3 inhibitors. The compounds exhibited half-maximal inhibitory concentrations of 0.6-6 µM. The selectivity profiles of maprotiline and primidone for TRPM3 were promising, with no inhibitory effects on TRPM2, TRPM8, TRPA1, TRPV1, TRPC5, TRPC6 and P2X7 receptor channels. Primidone inhibited PregS-induced [Ca2+]i signals in rat DRG neurons, indicating a block of native TRPM3 channels.
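As an aside, the reported half-maximal inhibitory concentrations (0.6-6 µM) can be related to an expected fractional block using a standard concentration-inhibition (Hill) relation. The sketch below is illustrative only: the Hill coefficient of 1 is an assumption, not a value reported in the abstract.

```python
# Illustrative only: fractional block from a standard Hill-type
# concentration-inhibition curve. The IC50 range (0.6-6 µM) is taken from
# the abstract; the Hill coefficient h = 1 is an assumption.

def fractional_inhibition(conc_um: float, ic50_um: float, h: float = 1.0) -> float:
    """Fraction of channel current blocked at inhibitor concentration conc_um (µM)."""
    return conc_um**h / (conc_um**h + ic50_um**h)

# By definition of IC50, a compound at its own IC50 blocks half the current:
print(fractional_inhibition(0.6, 0.6))  # 0.5
# At ten times its IC50 a compound blocks ~91% of the current:
print(round(fractional_inhibition(10.0, 1.0), 2))
```

Such a relation is routinely used to interpolate between the screening concentrations; it does not replace the patch-clamp characterization described in the abstract.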
Consistently, primidone attenuated nocifensive responses of mice to paw-injected PregS. Furthermore, intraplantar primidone reduced nociception in healthy and hyperalgesic CFA-inflamed paws in the hot-plate test. The finding that an approved drug can inhibit TRPM3 at concentrations that may be therapeutically relevant, and can thereby act as an analgesic, provides a method to study TRPM3-related effects by acutely challenging the channel's function. Pharmacological interference with TRPM3 using an approved drug or optimized successor compounds may pave the way to a better understanding of the physiological functions of TRPM3 in humans and may represent a novel concept for analgesic treatment.

Excitotoxicity, calcium deregulation, mitochondrial dysfunction and neuroinflammation contribute to progressive cell death in many neurodegenerative diseases. Therefore, proteins that prevent deregulation of these pathways are considered drug targets. Potential therapeutic approaches may benefit from modulation of small-conductance calcium-activated potassium (SK) channels, since recent data support the hypothesis that SK channel activity promotes neuronal survival against cellular stress via a dual mechanism of action: (i) by controlling neuronal excitability and (ii) by preventing mitochondrial dysfunction and inflammation. Our previous studies showed that activation of SK channels in neurons exerted protective effects through inhibition of NMDAR-mediated excitotoxicity. Further, we recently revealed that, in a model of glutamate oxytosis, activation of SK channels attenuated mitochondrial fission, prevented the release of pro-apoptotic mitochondrial proteins, and reduced cell death. However, little is known about the function of SK channels in cell metabolism and neuroinflammatory processes in non-neuronal cells, such as microglial cells.
In this study, we addressed the question of whether SK channel activation affected primary mouse microglia activation upon LPS and α-synuclein challenge. We found that activation of SK channels significantly reduced activation of microglia in a concentration-dependent manner, as detected by real-time xCELLigence cell impedance measurements. Further data on cytokine (TNF-alpha and IL-6) analysis revealed that activation of SK channels attenuated α-synuclein-induced cytokine release. Inhibition of glycolysis prevented microglial activation and cytokine release. Although SK channel activation slightly reduced ATP levels, it attenuated α-synuclein-induced NO release. Furthermore, glycolytic products and AMPK signaling were evaluated. Overall, our findings show that activation of SK channels attenuates microglial cell activation. Thus, SK channels are promising therapeutic targets for neurodegenerative disorders in which neuroinflammation and cell metabolic deregulation are associated with progression of the disease.

A mutant of the residue pair E103/K308 can be crosslinked efficiently in both states, the closed and the open state of the P2X2R. Interestingly, oxidative crosslinking of cysteine substitution mutants of each individual residue pair significantly reduced the ATP-induced current amplitudes. Charge-reversal or charge-swapping mutagenesis and cysteine modification by charged MTS reagents indicated the electrostatic nature of the pairwise interactions in these four residue pairs. Furthermore, preliminary data from triple, tetra and penta mutant cycle analyses indicated energetic coupling between the residue pairs E84/R290, E103/K308, E167/R290 and E167/K308, pointing to a cooperative interaction in a larger salt-bridge network. Together with the markedly reduced current amplitudes following disulfide crosslinking, our data suggest that the salt-bridge network serves to stabilize the closed-state conformation of the P2X2R.
Comparison of the closed-state and open-state models of the rat P2X2R showed that ATP promotes a marked rearrangement of the side chains of residues R290 and K308 to enable strong ionic coordination of the γ-phosphate oxygen of ATP. In summary, our data are in line with the concept that the electrostatic interaction of R290 and K308 with ATP competitively releases E84, E103 and E167 from their strong electrostatic coupling and thus initiates a destabilization of the closed state, which favors channel opening (Fig. 1).

In a similarity search using sequence motifs conserved amongst various members of the TRP protein family, we identified three non-annotated putative membrane proteins that we initially termed TMEM1, TMEM2 and TMEM4. Expression analysis using the NanoString nCounter system, northern blotting and RT-PCR showed that murine Tmem2 is expressed in various tissues including heart, brain, lung, endothelium, colon, cardiac myocytes, cardiac fibroblasts, embryonic fibroblasts, mast cells and pancreatic acinar cells. Hydropathy analysis predicts that TMEM2 proteins exhibit 6 to 10 plasma-membrane-spanning domains, but fluorescently labeled TMEM2 fusion constructs expressed in mouse embryonic fibroblasts revealed a vesicular subcellular localization pattern. In contrast to the prediction by the PSORT II algorithm, TMEM2-EYFP could not be identified in the plasma membrane of fibroblasts, cardiac myocytes, mast cells or pancreatic acinar cells, but showed a significant colocalization with markers and fusion constructs specific for acidic compartments including lysosomes. In Tmem2-/- mice, a marked elevation of amylase and lipase plasma levels was observed. We found that constitutive, but not stimulated, amylase secretion from Tmem2-deficient acinar cells is elevated, indicating a cell-autonomous defect. Calcium (Ca2+) is an important signaling molecule regulating stimulated as well as constitutive secretion from pancreatic acinar cells.
Microfluorimetric measurements using fura-2 or indo-1 indicate higher resting Ca2+ concentrations in Tmem2-/- pancreatic acinar cells, correlating with elevated basal enzyme secretion. In Tmem2-YFP knock-add-on mice we identified TMEM2 in organelles of the apical acinar cell pole and a partial colocalization with LAMP2 proteins. Furthermore, largely increased elevations in cytoplasmic Ca2+ concentration were observed upon osmotic lysis of lysosomes triggered by Gly-Phe β-naphthylamide (GPN) or by NH4Cl application. The role of TMEM2 in Ca2+ release was evaluated by stimulation with a low concentration of cholecystokinin (2 pM) in the absence of extracellular Ca2+, using both microfluorimetric recording of cytosolic Ca2+ transients and electrophysiological recordings of Ca2+-activated chloride currents. These measurements revealed a higher frequency of intracellular Ca2+ oscillations and a larger area under the curve of Ca2+-activated chloride currents upon CCK-8 stimulation, indicating that Tmem2 inactivation leads to an enhanced globalization of CCK-8-evoked Ca2+ release from intracellular organelles. Taken together, our study identifies TMEM2 as a novel regulator of Ca2+ release from intracellular organelles, including endo-lysosomes, and as a critical determinant of constitutive protein secretion in pancreatic acinar cells.

While generally highlighting toxicology as a translational science that requires academic anchoring, Gundert-Remy and co-workers [1] have called for efforts to improve the relevance of in vitro methodologies in predicting in vivo effects. Against this background, the German Society of Toxicology working group on alternative approaches to animal testing proposes specific quality criteria (QC) for in vitro methods and for research work using in vitro methods. These QC may serve to evaluate in vitro methods that are developed or applied in-house or that are described in work plans, peer-reviewed articles, etc.
For the time being, the QC focus on in vitro cell or tissue culture methods that address human health endpoints in the context of substance-related regulatory toxicity testing. Nevertheless, these QC are also generally applicable to in vitro research conducted for other toxicological purposes. Relevant work from, e.g., the Organisation for Economic Co-operation and Development has been taken into account in specifying the QC, which cover the following aspects:
- The 3Rs impact of an in vitro method in replacing, reducing (and refining) a specific animal test for a specific toxicological endpoint. This aspect also includes scientific hurdles that, in the past, had impeded the successful development of in vitro methods for the given toxicological endpoint.
- Scientific relevance and reliability, i.e. which fundamental requirements an in vitro method should meet to ensure that its results are relevant and reliable.
- Practicability and applicability, i.e. what the expected expenditure for the in vitro method is, and whether relevant authorities and industrial sectors have been involved in the development of the in vitro method.
QC related to the scientific relevance of research work using a specific in vitro method provide a tool to justify, e.g., the suitability of the selected test system and in vitro endpoint(s) for the given purpose; the selection of test substances, positive and negative controls; the setting and control of test concentrations; and the definition of acceptance criteria to determine the relevance of test results. The proposed QC may serve as a framework to assess the relevance of in vitro methods and in vitro research work. Thereby, they aim at improving the in vitro predictivity of in vivo toxicological effects, which in turn contributes to reducing and replacing the need for animal testing. [1] Gundert-Remy, U. et al. (2015). Toxicology: a discipline in need of academic anchoring - the point of view of the German Society of Toxicology.
Arch Toxicol 89: 1881-93.

During the past decades, considerable progress has been made in implementing 3R approaches in routine safety assessment. In spite of these achievements, animal tests still need to be conducted if legal requirements prescribe in vivo tests or if no reliable, accepted alternative method exists. Efforts to foster 3R approaches and make them 'ready for use' focus on three levels: development and validation of scientific methods and strategies, regulatory acceptance of acknowledged approaches, and global harmonization of standards. Strong cooperation between toxicological experts from scientific bodies, national and international authorities and industry is needed to advance on all three levels for the benefit of animal welfare. 3R approaches that have already been implemented in routine safety assessment of consumer goods do not focus solely on the replacement of animal testing by accepted alternative methods. In cases where reliable alternative approaches are not yet available, reduction in animal numbers and refinement of testing procedures can be achieved on a case-by-case basis. A tailor-made, tiered testing strategy is usually pursued that draws on knowledge of specific characteristics of the test item and makes use of all available data, including details on exposure and results obtained with structural homologues. Hurdles to applying alternative approaches can occur even for established methods. As legislations give different priority to alternative approaches, it remains challenging to fulfil conflicting legal requirements in different regions of the world, or even to address horizontal legislations of the same region. Furthermore, successfully validated and legally implemented alternative approaches might not always provide the safety assessor with meaningful test results.
With gaining experience, limitations of test systems can become evident that affect, for example, the applicability domain of the method, as has been the case for some in vitro as well as in vivo methods. In these cases, the new information needs to be shared not only among safety assessors, but also with method developers and regulators to facilitate refinement of scientific approaches and/or amendments of regulations.

Leibniz-Institut für Arbeitsforschung (IfADo), Vistox, Dortmund, Germany. Two-photon microscopy facilitates imaging of biological processes in vivo. Establishing this recent technique in mouse liver allowed us to record in real time the sequence of events during acetaminophen (APAP)-induced liver damage. Although APAP toxicity is intensively studied and described, in vivo imaging revealed hitherto unknown scenarios of cell death. The hepatocytes close to the central vein of a liver lobule went into cell death within hours, as commonly described, due to the toxic metabolite NAPQI. Surprisingly, we observed a distinct mode of cell killing at the outer border of the dead-cell area, which is accompanied by bile acid decompartmentalization. There, dilatation of bile canaliculi was observed within an hour after APAP administration. Subsequently, bile-acid-containing invaginations arose from the apical side of the hepatocytes into the cytosol. These invaginations ballooned until the bile leaked into the hepatocyte volume; subsequently, the plasma membrane of the affected hepatocytes lost its integrity, leading to cell death. This mechanism emerged in an environment where moderate NAPQI levels meet the high intracellular bile salt concentrations of the midzonal region. In conclusion, establishing in vivo imaging in mouse liver enabled us to identify new cellular mechanisms that cannot be discovered by conventional methods.
Universitätsklinikum Düsseldorf, Institut für Toxikologie, Düsseldorf, Germany. Introduction: Lung inflammation and fibrosis are considered major toxicities after thoracic cancer radiotherapy. Up to now, effective pharmacological interventions for normal tissue protection are largely missing. HMG-CoA reductase inhibitors (statins), which are used in the clinic for lipid-lowering purposes, are reported to have multiple inhibitory effects on genotoxic stress responses. For this reason, we aim to investigate the usefulness of statins to protect normal lung cells in vitro and lung tissue in vivo from damage provoked by ionizing radiation (IR). Methods: In line with clinically relevant anticancer radiation regimens, we used fractionated irradiation schemes (4 × 4 Gy) for both the in vitro and the in vivo experiments. We analyzed the effect of lovastatin on IR-induced DNA damage formation and repair, the DNA damage response (DDR) and cell death in non-proliferating human lung fibroblasts, epithelial cells and endothelial cells. Furthermore, we established an irradiation device that allows selective irradiation of the right lung of mice and investigated the influence of lovastatin on lung damage following fractionated and selective irradiation of the lung in vivo (BALB/c mice). Results: Compared to lung fibroblasts and epithelial cells, endothelial cells exhibited the highest radiosensitivity and underwent IR-induced apoptosis, which was partly prevented by lovastatin. By contrast, fibroblasts and epithelial cells did not undergo apoptosis upon irradiation. Lovastatin did not affect initial DNA damage formation in any of these cells. In all three lung cell types, lovastatin enhanced the repair of DNA double-strand breaks, as analyzed 24 h after the last irradiation by γH2AX nuclear foci formation. Depending on the cell type, lovastatin affected various components of the DDR machinery in vitro.
In vivo, lovastatin prevented the IR-mediated increase in breathing frequency as determined two and four weeks after fractionated irradiation. Moreover, statin treatment attenuated the level of residual DNA damage and IR-induced apoptosis as analyzed four weeks after irradiation. These results were mimicked when EHT1864, a small-molecule inhibitor of the small Rho-GTPase Rac1, was applied in vivo, pointing to an involvement of Rac1 in the statin-mediated radioprotective effects. Conclusion: Bearing in mind that statins are well tolerated in humans, we suggest the application of statins as a promising pharmacological strategy for the prevention of irradiation-induced damage of the lung.

Targeted genome engineering by CRISPR/Cas9 is an evolving tool for generating specific knockout cell lines. Co-expression of the CRISPR/Cas9 components allows for efficient DNA cleavage and the introduction of so-called indel (insertion/deletion) mutations that lead either to misfolded, non-functional proteins or to a complete knockout. We exploited this tool to generate a BID (BH3-interacting domain death agonist) knockout cell line in neuronal HT-22 cells. BID has been shown to be involved in regulated cell death pathways such as oxytosis, where its activation mediates mitochondrial demise, subsequent release of apoptosis-inducing factor (AIF), and cell death. In the cell death model of oxytosis, the cystine/glutamate antiporter (system xc-) is inhibited by high extracellular glutamate concentrations. The ensuing events, such as increasing lipid peroxidation and ROS production, resemble major characteristics of another emerging cell death pathway, called ferroptosis. In this study we generated a BID CRISPR/Cas9 knockout cell line to elucidate the role of BID as a potential link between oxytosis and ferroptosis in the HT-22 cell line.
In order to investigate the potential mechanistic overlap at the level of mitochondrial death pathways, we induced oxytosis with glutamate or ferroptosis with erastin in wild-type cells and analyzed the respective effects of the well-established inhibitors ferrostatin-1 and the BID inhibitor BI-6c9 on cell death and mitochondrial parameters. These results were then compared to the effects of glutamate or erastin in CRISPR/Cas9 BID-knockout cells. BI-6c9 inhibited glutamate-induced morphological changes of HT-22 cells and also prevented cell death as assessed by the MTT assay and annexin V/PI staining. Similar results were observed with ferrostatin-1 in the model of erastin-induced ferroptosis. Subsequent FACS analysis of lipid peroxidation by BODIPY staining demonstrated that BI-6c9 abolishes lipid peroxide formation in the erastin model, and ferrostatin-1 in the model of oxytosis. FACS analysis was further employed for the detection of mitochondrial ROS formation. MitoSOX staining revealed a significantly decreased production of mitochondrial ROS by BI-6c9 and ferrostatin-1 in the respective model systems. Investigation of the CRISPR/Cas9 BID-knockout HT-22 cell line revealed that BID knockout prevented cell death, lipid peroxidation and mitochondrial toxicity in both model systems of cell death, oxytosis and ferroptosis. In conclusion, the present study establishes BID as a pivotal molecular link between the previously separated cell death pathways oxytosis and ferroptosis at the level of mitochondria.

Parkinson's disease is a common neurodegenerative movement disorder characterized by midbrain dopaminergic neuronal loss in the substantia nigra that has been linked to alpha-synuclein toxicity. However, the molecular mechanisms underlying alpha-synuclein-mediated toxicity in human dopaminergic neuronal loss are not well defined.
The goal of this study was to investigate the deleterious effects of alpha-synuclein, in particular mitochondrial toxicity, in human dopaminergic cells. Therefore, we generated neuron-specific adeno-associated virus type 2 (AAV2) vectors expressing cytosolic as well as mitochondrially targeted alpha-synuclein, with EGFP-expressing viruses used as respective controls. Overexpression of both the cytosolic and the mitochondrial variant of alpha-synuclein severely disrupted the dendritic network, induced loss of cellular ATP, enhanced mitochondrial ROS production, and was associated with activation of caspases and dopaminergic cell death in a time-dependent manner. In addition, real-time analysis of mitochondrial bioenergetics using the Seahorse Bioscience system following AAV infection revealed complete impairment of mitochondrial respiration capacity in the dopaminergic neurons. Our results suggest that mitochondrially targeted expression of alpha-synuclein is more toxic than the cytosolic form. In addition, ultrastructural analysis of mitochondrial morphology by transmission electron microscopy showed numerous deformed cristae in cells expressing cytosolic alpha-synuclein, and a complete loss of cristae structure and massively swollen mitochondria following expression of mitochondrially targeted alpha-synuclein in the human dopaminergic neurons. In addition, we found that inhibition of caspases by the broad-spectrum caspase inhibitor QVD significantly ameliorated alpha-synuclein-induced dopaminergic neuronal death. Interestingly, inhibition of caspases preserved neuronal network integrity, ATP levels and mitochondrial respiration capacity in both paradigms of cytosolic and mitochondrial alpha-synuclein overexpression.
Overall, our findings show that cytosolic as well as mitochondrially targeted expression of alpha-synuclein is detrimental to human dopaminergic neurons, while inhibition of caspases ameliorates alpha-synuclein toxicity at the level of mitochondria. Thus, caspase inhibitors offer promising therapeutic potential to prevent dopaminergic neuronal death in Parkinson's syndromes associated with alpha-synuclein toxicity.

Degradation of, and adverse effects caused by, tattoo and permanent make-up pigments upon sunlight exposure and laser removal have been occasionally reported in recent decades. Until now, only the ban of certain azo pigments has been addressed in national legislation. The regulation was based on a number of studies showing the cleavage of azo bonds by ultraviolet light and laser irradiation, leading to the formation of carcinogenic aromatic amines. As a result, German tattoo ink manufacturers in particular switched to the use of more light-fast polycyclic pigments, assuming these would be safer for this kind of application compared to azo pigments. To assess the potential risks of polycyclic pigments in terms of decomposition in the skin, we compared the photochemical cleavage of the widely used azo pigment Orange 13 and the polycyclic pigment copper phthalocyanine blue. The main decomposition products were qualitatively and quantitatively analyzed after Q-switched laser irradiation of 1 mg/ml aqueous suspensions and of tattooed pig skin. Irradiated specimens were extracted with ethyl acetate and analyzed by gas chromatography coupled to mass spectrometric detection (GC/MS) using liquid-injection and headspace sampling techniques. We were able to confirm the cleavage of Pigment Orange 13 at the azo and other weak bonds in our experimental set-up (Fig. 1a). Amongst other substances, the carcinogens aniline (max. conc. 1.01 ± 0.12 µg/ml) and 3,3'-dichlorobenzidine (max. conc. 0.88 ± 0.18 µg/ml) were formed.
Despite the lack of such weak bonds, the highly stable porphyrin-like structure of copper phthalocyanine blue is likewise decomposed upon laser irradiation (Fig. 1b). Here, 1,2-benzenedicarbonitrile (max. conc. 1.11 ± 0.12 µg/ml) was found to be the main decomposition product in all experimental setups. Concentrations of cleavage products were generally higher in aqueous suspensions than in pig skin extracts for both pigments. Additionally, the highly toxic gas hydrogen cyanide (max. conc. 35.8 ± 4.32 µg/ml) and the human carcinogen benzene (max. conc. 0.19 ± 0.06 µg/ml) were formed from both pigments, depending on the laser wavelengths used. Cyanide levels of ≥50 µg/ml evolving upon ruby laser irradiation of >1.5 mg/ml aqueous suspensions of phthalocyanine blue were proven to significantly reduce cell viability in human skin cells in vitro. Reference: [1] Schreiver, I., Hutzler, C., Laux, P., Berlien, H. P. & Luch, A. Formation of highly toxic hydrogen cyanide upon ruby laser irradiation of the tattoo pigment phthalocyanine blue. Sci Rep 5, 12915 (2015).

Understanding the interactions between nanoscale objects and living cells is of great importance for risk assessment, owing to the rising use of nanomaterials in food-related products. Several studies show that silver nanoparticles can reach the intestinal epithelia in nanoform in a human in vitro digestion model. Nevertheless, only sparse data concerning the direct quantification of cellular uptake of silver nanoparticles are available. Therefore, this study focused on a systematic quantitative comparison of the cellular uptake of differently coated silver nanoparticles of comparable size. Intracellular uptake was determined quantitatively via a Transwell system with subsequent elemental analysis (AAS) and ion beam microscopy (IBM). Silver nanoparticles were coated with poly(acrylic acid) and polyvinylpyrrolidone and characterized extensively by TEM, DLS, SAXS, Zetasizer and NanoSight.
AgPure, a widely used reference nanoparticle coated with Tween 20 and Tagat TO V, was also included for comparison. Different intestinal cell models were applied to approach the complex in vivo situation: besides the widely used Caco-2 model, we also investigated particle uptake in a model that accounts for the enterocyte-covering mucus layer, as well as in a model specialized in particle uptake, the so-called M-cell model. Our findings suggest that silver uptake is clearly a particle-related and not an ion-related effect. The internalization of silver nanoparticles was enhanced in uptake-specialized M-cells, although no enhanced transport through the cells was observed. Furthermore, the mucus did not provide a substantial additional barrier to nanoparticle internalization. Rutherford backscattering spectrometry (RBS) via IBM allowed distinguishing between adsorbed and internalized material, and the results were in accordance with the Transwell data. Additionally, IBM investigations via particle-induced X-ray emission (PIXE) showed intracellular association of silver with sulfur. The quantification of silver nanoparticle internalization revealed a clearly particle-specific and coating-related uptake. Furthermore, a high amount of silver nanoparticles is taken up in cell models of higher complexity. Thus, an underestimation of particle effects in vitro might be prevented by considering cell models with greater proximity to the in vivo situation.

Analyzing iron oxide nanoparticles for drug delivery: innovative investigation tools for nanotoxicology. Nanoparticles offer promising new possibilities for medical applications, including therapy and diagnosis of various diseases. In particular, nanoparticle systems with magnetic cores provide a broad application spectrum as contrast agents, magnetic transporters, or heat carriers in hyperthermia treatment.
For bench-to-bedside translation of superparamagnetic iron oxide nanoparticles (SPIONs) for medical applications, safety issues have to be clarified. To this end, reliable standards must be established on the basis of comprehensively validated physicochemical and biological characterization methods. SPIONs consisting of maghemite and magnetite are usually brown or black in color. Owing to these optical properties, SPIONs and other metal oxide nanoparticles are prone to interfere with classical toxicological assays that rely on optical detection of colorimetric, fluorescence or luminescence signals. Nanoparticle concentration and cellular uptake are further influencing factors. Consequently, for reliable analysis of nanoparticle-mediated effects, alternative robust and interference-free readouts have to be established. Based on long-standing experience working with SPIONs, we suggest a combination of complementary methods to analyze nanoparticle-mediated effects: multiparameter analyses in flow cytometry deliver statistically relevant data and link the uptake of nanoparticles (side scatter increase) with cellular effects in a high-content style. Combining non-invasive, label-free impedance measurements (xCELLigence system) with real-time (fluorescence) microscopy enables us to monitor cellular proliferation and morphology over several days without interference from the nanoparticles. Additional experiments in multicellular tumor spheroids provide information about tissue infiltration and thus more closely resemble the in vivo situation. Using these complementary methods, several drug-loaded SPION systems intended for medical applications have been successfully characterized. In sum, nanotoxicology is a complex and interdisciplinary challenge, in which physicochemical parameters as well as the in vitro and in vivo behavior of nanoparticles have to be considered.
To address these basic requirements, we are working on a stringent, standardized characterization pipeline for iron oxide nanoparticles synthesized for medical applications. Reference: Lyer S, Tietze R, Unterweger H, Zaloga J, Singh R, Matuszak J, Poettler M, Friedrich RP, Duerr S, Cicha I, Janko C, Alexiou C. Nanomedical innovation: the SEON concept for an improved cancer therapy with magnetic nanoparticles. Nanomedicine (Lond). 2015; 10(21).

Acrylamide (AA) is an α,β-unsaturated compound, which is categorized as probably carcinogenic to humans [1, 2]. AA is known to arise in foods by heat treatment in the course of the Maillard reaction between reducing sugars and amino acids at processing temperatures > 120 °C [3]. Dietary AA exposure has mainly been estimated on the basis of dietary recall, assessing consumption of foods with known AA contents. The use of human biomarkers of AA exposure, primarily haemoglobin adducts of AA and of its genotoxic metabolite glycidamide (GA) in red blood cells, as well as mercapturic acids excreted in the urine, is a promising alternative. Such biomarkers are to be validated by exact measurement of AA uptake in duplicates of food as consumed (duplicate diet studies) [4]. We here present results of a nine-day human intervention study with 14 healthy male volunteers. AA contents were determined in duplicates of servings as consumed, and the kinetics of AA-associated mercapturic acids (AAMA and GAMA) were monitored in total urine [5]. The study design included washout periods with an AA-minimized diet (21–41 ng/kg bw), a low-AA-intake day (0.6–0.8 µg/kg bw) as well as a high-AA-intake day (1.3–1.8 µg/kg bw). After a three-day washout period, an AAMA baseline level of 93 ± 31 nmol/d was determined.
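The dose-to-excretion bookkeeping behind such mercapturic-acid data is simple molar arithmetic: the ingested AA mass is converted to nmol via the molar mass of acrylamide (≈71.08 g/mol) and compared with the urinary AAMA output. A minimal sketch; the 75 kg body weight is an assumption for illustration and is not stated in the abstract:

```python
AA_MOLAR_MASS = 71.08  # g/mol for acrylamide (C3H5NO)

def ingested_aa_nmol(dose_ug_per_kg, body_weight_kg):
    """Convert an AA dose in µg/kg bw into nmol for a given body weight."""
    dose_ug = dose_ug_per_kg * body_weight_kg
    return dose_ug * 1000.0 / AA_MOLAR_MASS  # µg -> ng, then ng / (ng/nmol)

def excreted_fraction(aama_nmol_per_day, dose_ug_per_kg, body_weight_kg=75.0):
    """Fraction of the ingested AA dose recovered as urinary AAMA."""
    return aama_nmol_per_day / ingested_aa_nmol(dose_ug_per_kg, body_weight_kg)

# For a high-intake day (1.3-1.8 µg/kg bw) and 404 nmol/d AAMA, this gives
# a recovered fraction of roughly 0.21-0.29 for a 75 kg subject.
```

Plugging in the low end of the reported dose range reproduces the "about 30% within 24 h" figure quoted below.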
Low AA intake led to an AAMA excretion within 24 h of 225 ± 37 nmol/d, and high intake to 404 ± 78 nmol/d, corresponding to an AAMA excretion rate of about 30% of the ingested AA dose within 24 h, whereas the AAMA output within 72 h corresponded to 58% of the respective AA intake. The AAMA baseline after the three-day washout corresponds to a net exposure level of 0.2–0.3 µg AA/kg bw/d. Whether this represents a true baseline level is to be clarified in a follow-up study. In summary, this study provides important quantitative information on the kinetics of urinary short-term exposure biomarkers, validated by analytically verified dietary AA intake at present-day food contamination levels. [1] Deutsche Forschungsgemeinschaft (DFG), MAK- und BAT-Werte-Liste 2013, doi: 10.1002. [2] IARC, IARC Monographs on the Evaluation of Carcinogenic Risks to Humans 1994, 60. [3] Tareke et al., J. Agric. Food Chem. 2002, 50, 4998–5006. [4] EFSA Panel on Contaminants in the Food Chain (CONTAM), EFSA Journal 2015; 13(6):4104 [321 pp.]. [5] Ruenz et al., Arch. Toxicol. 2015, doi: 10.1007/s00204-015-1494.

Heinrich-Heine-Universität, Institut für Toxikologie, Düsseldorf, Germany. Objective: Flavonoids are known to modulate distinct signaling pathways, thereby causing different physiological effects. The effects of the flavonoids baicalein and myricetin, as well as of several methylated derivatives, were analyzed in the nematode Caenorhabditis elegans and in HCT116 colon carcinoma cells to gain insights into the molecular mechanisms modulated by these compounds. Methods: radical-scavenging activity (TEAC, DCF), stress resistance (SYTOX, sodium arsenite), modulation of signaling pathways (Nrf2/SKN-1, DAF-16), life span. Results: Baicalein enhances the resistance of C. elegans against lethal thermal and sodium arsenite stress and dose-dependently prolongs the life span of the nematode (median life span: +57%).
Using RNA interference, we were able to show that the induction of longevity and the enhanced stress resistance were dependent on SKN-1 (homologous to mammalian Nrf2), but not on DAF-16 (homologous to mammalian FOXO), another pivotal transcription factor. Negletein was the only methylated derivative that was able to enhance the life span of the nematode. In HCT116 cells, baicalein activates Nrf2; the methylated derivatives oroxylin A and negletein showed a comparable redox-active potential in these cells, but only negletein was able to activate Nrf2. The dietary flavonoid myricetin as well as the methylated derivatives laricitrin, syringetin and myricetin trimethyl ether strongly enhanced the life span of C. elegans and decreased oxidative stress (DCF) and the accumulation of lipofuscin. In contrast to myricetin, the methylated compounds strongly enhanced the resistance against thermal stress. Furthermore, treatment with the derivatives induced a much stronger nuclear localization of the DAF-16 transcription factor. Conclusion: Baicalein increases stress resistance and life span in C. elegans via SKN-1 but not DAF-16. Experiments with methylated baicalein derivatives suggest that the redox-active potential has a minor impact on Nrf2/SKN-1 activation, since only distinct derivatives activate this pathway. In the case of myricetin, methylation increases the stress resistance conferred by the flavonoid. Methylation thus seems to enhance the biofunctionality of the flavonoids. Our results may be useful for understanding the molecular mechanisms of flavonoids and methylated derivatives used as food supplements or pharmacological extracts.

The loss of progesterone during menopause is linked to common sleep complaints of the affected women. Consequently, a previous study of our laboratory demonstrated sleep-promoting effects of oral progesterone replacement in postmenopausal women [1].
The oral administration of progesterone, however, is compromised by individual differences in bioavailability and metabolism of the steroid. We therefore investigated the sleep-EEG effects of intranasal application of progesterone in 12 healthy postmenopausal women (50–70 yrs). In a randomized double-blind protocol, each subject received four treatments: two doses of intranasal progesterone (4.5 mg MPP22; 9.0 mg MPP22), 10 mg of zolpidem, and placebo. Each of the four conditions consisted of two experimental nights (adaptation + examination) separated by at least one week. During each examination night, the sleep EEG was recorded from 23:00 to 07:00. Simultaneously, blood was collected every 20 min between 22:00 and 07:00 by long catheter for later analysis of the hormones growth hormone (GH), cortisol, melatonin and progesterone. The conventional sleep EEG was statistically evaluated by multivariate analyses of variance (MANOVAs) with repeated-measures designs after removal of two outliers, which showed a low sleep efficiency index (SEI) after 4.5 and 9.0 mg MPP22. Univariate F-tests in the MANOVAs pointed to the following results (significant p-values at α = 0.05): SEI was higher after zolpidem than after the other three treatments. After 9.0 mg MPP22, SEI was elevated significantly in comparison to placebo. Subjects spent more time in non-REM sleep and less time in intermittent wakefulness after 9.0 mg MPP22 and after zolpidem than after placebo. Total sleep time was elevated and wake after sleep onset (WASO) was reduced after 9.0 mg MPP22 and after zolpidem. After all active treatments with MPP22 and zolpidem, the time spent in sleep stage 2 was higher than after placebo. The amount of slow-wave sleep was higher after zolpidem than after placebo. In addition, the higher dose of MPP22 resulted in an increase of spindle and β frequencies combined with a decrease of δ oscillations during non-REM sleep.
In comparison, administration of zolpidem resulted in a strong increase of δ, spindle and high-β frequencies as well as a strong decrease in θ and α frequencies. Nocturnal progesterone levels increased after 9.0 mg MPP22. No other changes of hormone secretion were found. Our study shows sleep-promoting effects of 9.0 mg MPP22. As expected, the sleep-promoting effect of zolpidem was confirmed. The spectral signature of intranasal progesterone partly resembled the well-known sleep-EEG alterations induced by GABA-active compounds. Progesterone levels were elevated after 9.0 mg MPP22. No other endocrine effects were observed.

Introduction: Anticholinergic drugs or drugs with anticholinergic side effects are commonly used for the treatment of various diseases in the elderly population. Elderly patients are particularly vulnerable to anticholinergic-related cognitive effects. Moreover, there is a relationship between anticholinergic exposure and cognitive impairment. However, there is currently a lack of data on the anticholinergic burden in geriatric patients in Germany. It was therefore the aim of this study to evaluate the anticholinergic burden in a large representative cohort of geriatric patients. Materials and methods: In this retrospective cohort study, (co-)prescriptions of anticholinergic drugs as well as anti-dementia drugs were evaluated using the discharge medication of geriatric patients between January 2013 and June 2015 from the Geriatrics in Bavaria database (GiB-DAT). Anticholinergic drugs were classified according to the Anticholinergic Cognitive Burden (ACB) scale in three groups (definite anticholinergics with a score of 2 or 3, and possible anticholinergics with a score of 1). The ACB scale was modified by omitting trospium and by adding the three drugs biperiden, metixen and maprotiline, which are used in Germany, with a score of 3. A patient's individual total score of 3 or higher is considered to be clinically relevant.
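The ACB classification just described reduces to a per-patient sum over a score table, with a relevance threshold of 3. A minimal sketch; the score table below is a small illustrative fragment, not the full published scale used in the study:

```python
# Illustrative fragment of an ACB-style score table (hypothetical subset;
# the study used the full published ACB scale, modified as described above).
ACB_SCORES = {
    "amitriptyline": 3,  # definite anticholinergic (score 2 or 3)
    "quetiapine": 3,
    "carbamazepine": 2,
    "biperiden": 3,      # one of the drugs added with a score of 3
    "ranitidine": 1,     # possible anticholinergic (score 1)
}

def acb_total(discharge_medication):
    """Sum of ACB scores over a patient's discharge medication list;
    drugs not on the scale contribute 0."""
    return sum(ACB_SCORES.get(drug.lower(), 0) for drug in discharge_medication)

def clinically_relevant(discharge_medication):
    """A total score of 3 or higher is considered clinically relevant."""
    return acb_total(discharge_medication) >= 3
```

Note that the threshold can be reached either by one definite anticholinergic (score 3) or by accumulating several lower-scoring drugs.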
In total, 130,186 geriatric patients (median age 82 years, 66.3% female, median number of drugs 9) were evaluated. Of these, 41,456 (31.8%) patients took at least one drug with anticholinergic properties. Two or more anticholinergic drugs were co-prescribed in 10,941 patients (26.4% of the patients taking anticholinergic drugs). 11,241 patients (27.1% of the patients taking anticholinergic drugs) had a score of 3 or higher. The most common drug combinations involving two definite anticholinergic drugs were amantadine/quetiapine (58), amitriptyline/quetiapine (41) and amitriptyline/carbamazepine (35). 2,885 (7.0%) patients received anticholinergic drugs in combination with anti-dementia drugs. Conclusions: One third of the patients in a large geriatric population were prescribed at least one anticholinergic drug. One quarter of these received a co-prescription of anticholinergic drugs. Caution is advised when prescribing anticholinergic drugs to elderly patients, especially those with dementia.

The antiglaucoma agents brimonidine and timolol are novel substrates of the organic cation transporters OCT2 and MATE1 expressed in the human eye. C. Neul. Purpose: Glaucoma is a leading cause of visual loss in the world population. Lowering intraocular pressure by topical administration of antiglaucoma agents is still the mainstay of glaucoma treatment.1,2 Although many effective drugs exist, the major challenge is their efficient intraocular delivery.3 The involvement of membrane drug transporters in the intraocular delivery of the widely prescribed antiglaucoma prostanoid latanoprost has been described.4 However, it is currently unknown whether the cationic drugs brimonidine and timolol, which are also commonly used antiglaucoma agents, are similarly transported by drug transporters, and whether these transporters are expressed in the human eye.
Brimonidine is an α2-adrenergic agonist, which inhibits the activity of the adenylate cyclase, subsequently leading to a reduced production of aqueous humor. Timolol is a β-adrenergic receptor antagonist, which blocks β-receptors on the ciliary epithelium, also resulting in a reduced aqueous humor production. The aim of the present study was to determine whether brimonidine and timolol are substrates of the organic cation drug transporters OCT1 (encoded by SLC22A1), OCT2 (SLC22A2), OCT3 (SLC22A3) and MATE1 (SLC47A1). A further aim was to investigate whether these transporters are localized in different human eye substructures. Experimental design: Transport of brimonidine and timolol was studied using the mammalian cell line HEK293 stably expressing the organic cation transporters OCT1, OCT2, OCT3 or MATE1.5 The intracellular accumulation of brimonidine and timolol was analyzed by mass spectrometry. Immunohistochemistry and immunofluorescence experiments were performed to study the localization of these transporters in different substructures from glaucomatous and non-glaucomatous human eyes. Results: Uptake experiments revealed that brimonidine is transported by OCT2 and MATE1 in a time- and concentration-dependent manner, but not by OCT1 or OCT3. Timolol is only transported by MATE1, but not by the OCTs. As shown by immunolocalization studies, the OCT2 and MATE1 transporter proteins were expressed in all anterior eye substructures of non-glaucomatous and glaucomatous eyes, i.e. the cornea, the conjunctiva and the ciliary body. Conclusion: Our data demonstrate that OCT2 and MATE1 may play a role in the ocular disposition of the antiglaucoma drugs brimonidine and timolol and may contribute to interindividual variability of drug concentrations and effects. References: 1. Zhang et al., Nat Rev Drug Discov. 2012 Jun 15; 11(7):541-59. 2. Lavik et al., Eye (Lond). 2011 May; 25(5):578-86. 3. Gaudana et al., Pharm Res. 2009 May; 26(5):1197-216. 4.
Kraft et al., Invest Ophthalmol Vis Sci. 2010; 51(5):2504-11. 5. Nies et al., PLoS One. 2011; 6(7):e22163. Supported by the Robert Bosch Foundation, Stuttgart, Germany.

Immature platelet count or immature platelet fraction as optimal predictor of antiplatelet response to thienopyridine therapy. C. Stratz, T. Nuehrenberg. Background: Previous data suggest that reticulated platelets impact significantly on the antiplatelet response to thienopyridines. It is unknown which of the parameters describing reticulated platelets is the optimal predictor of the antiplatelet response to thienopyridine therapy. Methods: This study is a prespecified subanalysis of the ExcelsiorLOAD trial that randomized elective patients undergoing coronary stenting to loading with clopidogrel 600 mg, prasugrel 30 mg or prasugrel 60 mg (n=300). ADP-induced platelet reactivity was assessed by impedance aggregometry before loading (= intrinsic platelet reactivity) and on day 1 after loading. Multiple parameters of reticulated platelets were assessed by an automated whole-blood flow cytometer: immature platelet fraction (IPF, the proportion of reticulated platelets of the whole platelet pool), highly immature platelet fraction (HIPF), and absolute immature platelet count (IPC). Results: Each parameter of reticulated platelets correlated significantly with ADP-induced platelet reactivity: IPF (rs=0.18; p=0.002), HIPF (rs=0.18; p=0.002), IPC (rs=0.26; p<0.001). In a multivariable model including all three parameters, only IPC remained a significant predictor of platelet reactivity (p<0.001). After adjustment for known predictors of on-clopidogrel platelet reactivity, including cytochrome P450 2C19 polymorphisms (*2 and *17), age, body mass index, diabetes, smoking and intrinsic platelet reactivity, IPC was the strongest predictor of on-treatment platelet reactivity (partial η²=0.045; p<0.001), followed by intrinsic platelet reactivity (partial η²=0.034; p<0.002).
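The rs values above are Spearman rank correlations, i.e. Pearson correlation computed on ranks (with ties sharing their average rank). A self-contained sketch of the statistic itself, using the standard library only; the toy data in the tests is illustrative, not the trial's:

```python
def ranks(xs):
    """Average ranks (1-based), with tied values sharing their mean rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    out = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1  # extend the block of tied values
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb)

def spearman(a, b):
    """Spearman's rank correlation r_s."""
    return pearson(ranks(a), ranks(b))
```

In practice one would use `scipy.stats.spearmanr`, which also returns the p-value; the sketch shows only the coefficient.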
These findings prevailed when analyzing the subgroups of patients on clopidogrel or on prasugrel. Conclusion: The immature platelet count is the strongest platelet-count-derived predictor of the antiplatelet response to thienopyridine treatment. Given its easy availability, together with its even stronger association with on-treatment platelet reactivity when compared to known predictors including the CYP2C19*2 polymorphism, the immature platelet count might become the preferable predictor of the antiplatelet response to thienopyridine treatment.

Cutaneous squamous cell carcinoma (cSCC) is the second most common human cancer, with continuously rising incidences worldwide. Primarily caused by cumulative UVB exposure, cSCC accounts for considerable costs for health care systems and poses a deadly risk especially to organ transplant recipients [1]. Current chemotherapy needs to be improved, because even the topical treatment of cSCC's carcinoma in situ has limited efficacy and painful adverse effects [2]. However, animal-based approaches in preclinical development contribute to the frequent failure of investigational new drugs in clinical trials [3]. Herein, we characterized a human cell-based cSCC model; normal reconstructed human skin (RHS) served as control. Whereas RHS exhibited low proliferation, the co-culture with cSCC cells increased the Ki-67 index 23-fold in the cSCC model (p≤0.001). While the presence of claudin-4 and occludin was distinctly reduced, zonula occludens protein-1 was more widespread, and claudin-1 was heterogeneously distributed within the cSCC model compared with RHS. This is in accordance with the in vivo situation [4] and likely contributes to the impaired barrier function of the cSCC model, as demonstrated by a 2.6-fold increased caffeine permeation. Finally, the effects of ingenol mebutate in the cSCC model and in RHS closely mimic the anti-tumor effect and the adverse reactions seen in patients [2], both linked to the drug's inherent cytotoxicity.
In conclusion, the thorough characterization of disease models fosters both advanced preclinical drug development and improved cSCC treatment.

Funded by the German government, the Berlin-Brandenburg research platform BB3R with integrated graduate education was launched in 2014. The aim of this research platform, along with the associated graduate school, is to close substantial knowledge gaps in the fields of the 3Rs and to find alternatives to animal experimentation within the next years. A panel of 3R experts has been set up to provide advice and assistance and to raise awareness in society for 3R-related issues. Research in BB3R investigates physiological functions on different levels to establish alternative methods for preclinical drug development and basic research. The principal investigators aim at facilitating research collaborations and sustainable research activities in the region Berlin-Brandenburg and abroad. An integrated BB3R graduate program has been developed to offer structured training to graduate students in a specific, mandatory course program on the 3Rs, including modules on ethics and legislation. Currently, 14 PhD students are being qualified for management positions in professional areas related to the 3Rs, and three junior research groups are now ready to expand their regional research activities nationwide. Furthermore, the concept of a novel lecture series for master students and undergraduates has been designed and awarded the Animal Welfare Research Award for Berlin-Brandenburg in teaching and education. The state government of Berlin supports the research platform BB3R and will be funding an additional professorship at the FU Berlin to further promote research on alternative testing. Finally, co-operations with national and international partners are being built to facilitate the project-based exchange of scientists and joint research.
Currently, the identification and evaluation of skin sensitizers is mainly restricted to animal testing using the guinea pig maximization test, the Buehler test or the murine local lymph node assay. Recently, an adverse outcome pathway of skin sensitization has been released by the OECD, identifying the key events leading to allergic contact dermatitis. In vitro tests address these key events, and two assays have now been adopted in regulation (OECD 442C and 442D). The use of the current in chemico and in vitro models is, however, limited, since they do not reflect dermal penetration, complete biotransformation and cell cross-talk in an organotypic environment. In this study, we aimed to overcome these limitations by establishing reconstructed skin tissues containing Langerhans cells (LCs). In vitro generated immature monocyte-derived cells (MoLCs) or MUTZ-3-derived cells (MUTZ-LCs) cultivated with keratinocytes on a dermal compartment with fibroblasts formed a stratified epidermis after 14 days, as indicated by the expression of epidermal differentiation markers. MoLCs or MUTZ-LCs were mainly localized in the suprabasal layers of the epidermis and distributed homogeneously, in accordance with native human skin. Topical application of the extreme contact sensitizer 2,4-dinitrochlorobenzene (DNCB) induced IL-6 and IL-8 secretion in skin models with LC-like cells, whereas no change was observed in control RHS lacking immune cells. Increased gene expression of CD83 and PD-L1 in the dermal compartment indicated LC maturation. We confirmed the enhanced mobility from epidermal to dermal compartments for MUTZ-LCs and MoLCs in the presence of DNCB. In summary, we successfully integrated immature and functional LC-like cells into reconstructed human skin. This fosters the development of animal-free test systems for advanced and potentially individualized hazard assessment of skin sensitization.

Computational methods for prediction of the in vitro activity of new chemical structures.
Background: With a constant increase in the number of new chemicals synthesized every year, it becomes highly important to employ the most reliable and fast in silico screening methods to predict their safety and activity profiles. In recent years, in silico prediction methods have received great attention as alternatives to animal experiments for the evaluation of various toxicological endpoints, complementing the theme of replace, reduce and refine (3Rs). Various computational approaches have been proposed for the prediction of the toxicity of chemicals, ranging from quantitative structure–activity relationship modeling to molecular-similarity-based methods and machine-learning methods. Within the "Toxicology in the 21st Century" screening initiative, a crowdsourced platform was established for the development and validation of computational models to predict the interference of chemical compounds with nuclear and stress receptor pathways, based on a training set containing more than 10,000 compounds tested in high-throughput screening assays. Methods: Here we present the results of various molecular-similarity-based and machine-learning-based methods on an independent evaluation set containing 647 compounds. Further, we compare the performance of these methods when applied individually and together. In retrospect, we also discuss the reasons behind the superior performance of an ensemble approach, which combines a molecular similarity approach and a naïve Bayes machine-learning algorithm, in achieving the best prediction rates in comparison with the individual methods, explaining their intrinsic limitations. Results and conclusions: Our results suggest that, although the prediction methods were optimized individually for each modeled target, an ensemble of similarity and machine-learning approaches provides promising performance, indicating its broad applicability in toxicity prediction.
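The similarity component of such an ensemble typically scores a query compound by its maximum Tanimoto similarity to the known actives and blends that with a classifier probability. A minimal sketch; the equal weighting is an illustrative assumption, since the abstract does not specify how the two scores were combined:

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0

def similarity_score(query_fp, active_fps):
    """Max Tanimoto similarity of the query to any known active compound."""
    return max(tanimoto(query_fp, fp) for fp in active_fps)

def ensemble_score(query_fp, active_fps, ml_probability, weight=0.5):
    """Blend the similarity score with a machine-learning probability
    (a naive Bayes model in the abstract); equal weights are an
    assumption here, not taken from the source."""
    return weight * similarity_score(query_fp, active_fps) \
        + (1.0 - weight) * ml_probability
```

The blending makes the two components compensate for each other's intrinsic limitations: similarity fails on actives far from the training set, while the classifier can generalize across scaffolds.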
Charité – Universitätsmedizin Berlin, Structural Bioinformatics Group, Berlin, Germany. A multitude of drug candidates (approx. 20%) fail in late drug development due to toxicity and adverse effects [1]. Immunotoxicity can be divided into four types of immune-mediated adverse effects: immunosuppression, immunostimulation, hypersensitivity and autoimmunity. Current safety evaluations of drug candidates with respect to immunotoxicity comprise in vivo and in vitro assays. Here, we present an in silico approach for predicting immunotoxic substances, trained on immunosuppressive and non-toxic compounds, using the Laplacian-modified naïve Bayesian model as a supervised machine-learning method. To this end, we examined the relations between about 51,000 compounds and 115 cancer cell lines from the National Cancer Institute's (NCI) NCI-60 growth inhibition data [2], with focus on five immune cell lines (RPMI-8226, CCRF-CEM, …). Different fingerprints encoding the chemical structures have been evaluated for their predictive power (e.g. extended-connectivity fingerprints, substructure fingerprints) and showed good prediction rates in a retrospective analysis.

Acting in phase II metabolism, sulfotransferases (SULTs) transform endo- and exogenous molecules such as drugs into more hydrophilic entities that are easily excreted from the human body [1]. Although serving detoxification, SULT-mediated transformation of molecules has also been associated with the formation of chemically reactive metabolites that could promote adverse reactions [2]. The development of a computer-based model that allows the prediction of molecules susceptible to metabolism would improve drug development and drug safety [3], and encourage the reduction of in vivo testing according to the principles of the 3Rs (replacement, reduction and refinement). In our study, we developed an in silico model to predict the activity of the human SULT subtype 1E1 in phase II metabolism.
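Both abstracts above lean on fingerprint-based machine-learning classifiers. The Laplacian-corrected naïve Bayes commonly used in chemoinformatics assigns each fingerprint feature f a weight log[(A_f + 1) / ((T_f + 1/P) · P)], where A_f and T_f count the active and the total compounds containing f and P is the overall active fraction; a compound's score is the sum over its present features. A sketch under that formulation (the exact variant used in the abstracts is not specified, so treat this as one common reading):

```python
import math

def train_laplacian_nb(samples):
    """samples: list of (fingerprint_set, is_active) pairs.
    Returns per-feature log weights (Laplacian-corrected estimates)."""
    n = len(samples)
    n_active = sum(1 for _, active in samples if active)
    p = n_active / n  # overall fraction of actives
    counts = {}       # feature -> [total_count, active_count]
    for fp, active in samples:
        for f in fp:
            c = counts.setdefault(f, [0, 0])
            c[0] += 1
            if active:
                c[1] += 1
    return {f: math.log((a + 1) / ((t + 1 / p) * p))
            for f, (t, a) in counts.items()}

def nb_score(weights, fp):
    """Additive score; features unseen in training contribute 0."""
    return sum(weights.get(f, 0.0) for f in fp)
```

The Laplacian correction pulls the estimate for rarely seen features toward the background rate P, so a feature observed only a handful of times cannot dominate the score.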
Structure-based molecular modelling of SULT activity is challenging due to the broad and overlapping substrate spectra of the SULT subtypes. This low substrate specificity can be attributed to the high degree of conformational flexibility of the enzyme, particularly in the active site. Therefore, molecular dynamics simulations were performed to address enzyme flexibility and the broad substrate spectrum of SULT (Figure 1). Based on a collection of selected enzyme conformations from the molecular dynamics simulations and a dataset of active SULT1E1 ligands, ensemble docking was utilized to generate ligand–protein complexes that served as a basis for 3D pharmacophore development. Eight specific 3D pharmacophores were created that allow the identification of SULT1E1 substrates and inhibitors. For further refinement of the computer-based prediction, machine-learning models and post-filtering steps were generated that allow classification of the predicted hits. The final prediction model was used to screen DrugBank (a database comprising over 6,500 experimental and approved drugs). A major part of the predicted hits could be confirmed from the literature. From the remaining hits, a representative selection of molecules was experimentally tested for SULT1E1 inhibition or biotransformation. The experimental results were in agreement with our computer-based models and revealed previously unknown biotransformation by, or inhibition of, SULT1E1 for compounds listed in DrugBank.

Introduction: Vascularization of the dermal equivalent of full-thickness skin constructs by endothelial cells is highly desirable, because such constructs closely mimic the architecture of real skin. Unfortunately, the realization of a capillary network in skin constructs is still difficult. In our study of full-thickness skin constructs, following the methodologies of Küchler et al. (2011), there were alterations in the epidermal differentiation after endothelialization of the dermal equivalent.
The aim of this study was to characterize these changes on a morphological level. Material and methods: Non-endothelialized constructs (keratinocytes, fibroblasts) were prepared according to Küchler et al. (2011). To obtain endothelialized constructs, the dermal equivalent of the non-endothelialized constructs was enriched with endothelial cells. After two weeks of in vitro culture, the skin constructs were processed for quantitative as well as qualitative assessment by light and electron microscopy. Results: Both types of skin construct developed all strata of a stratified, soft-cornified epidermis, i.e. stratum basale, spinosum, granulosum and corneum, although the two constructs displayed differences in every stratum: significantly more mitoses occurred in the epithelial germ layers of the endothelialized constructs (p=0.013). In addition, significantly more keratohyalin granules were counted within their stratum granulosum (p=0.010). 50% of the spinous and granular cells were irregular in shape, and these cells were separated by wide intercellular spaces. The typical epidermal lamellar bodies appeared in the endothelialized constructs more often than in the non-endothelialized ones. At the stratum granulosum–stratum corneum interface, no cohesion between the strata was present.

Background and novelty: During the last decade, organ-on-a-chip technologies have received increasing attention in the scientific community. The idea of combining different tissue types in a physiological-like system creates completely new options for how substances can be characterized without the use of animal experiments. Animal models were used for the investigation of skin sensitization as a standard method until animal testing for cosmetic substances was banned in the EU in 2013.
By combining skin models with an immunological counterpart, new data will be presented to assess whether the multi-organ-chip adds value to the current need for alternative methods for skin sensitization and immunotoxicity testing. Experimental approach: The multi-organ-chip platform is designed to combine different human cell and tissue types, such as 3D spheroids, barrier models or biopsies, in one microfluidic system. A peristaltic on-chip micropump enables circulation of medium, allowing for a constant perfusion between the compartments. First experiments were performed using a dendritic-cell-only approach in the 2-organ-chip. In subsequent co-cultivation experiments, ex vivo human epidermis and dendritic cells were each cultivated in one culture compartment, connected by the microfluidic channels. For analysis, we measured the typical activation marker CD86 and the vitality of the dendritic cells by flow cytometry. Functionally different sensitizers were selected to investigate their effects in our model. Finally, more complex 3D matrices including different immunological cell types, emulating in vivo-like reactions as in the human lymph node, were cultivated in the 2-organ-chip. Results and discussion: Our data show a strong influence of pump pressure and pumping frequency on the activation of dendritic cells. Hence, we established an adequate setup by cultivating the dendritic cells in cell culture inserts, preventing cell activation due to shear stress. Compared to existing sensitization assays, the main advantage of the perfused 2-organ-chip sensitization assay is the presence of an epidermis equivalent, partially integrating important parameters such as metabolism and skin barrier function. We compared our data with reference CD86 values from the PBMDC (peripheral blood monocyte-derived dendritic cell) skin sensitization assay.
For identical substances, we observed differences in dendritic cell activation between the PBMDC assay and the perfused 2-organ-chip assay. Here we present the first-time cultivation of primary-derived immune cells on our microfluidic system, which is a promising enhancement for integrating immunological reactions into further multi-organ combinations.

Due to growing social and political pressure, the interest in alternatives to animal testing has constantly increased during the past 10 years, stimulating the development and validation of new in vitro test systems, including reconstructed skin models. In addition to toxicological studies and permeability assays, skin models are of high interest for fundamental research to elucidate basic physiological and pathophysiological processes in human skin [1, 2]. As of today, most in vitro skin models are grown from primary keratinocytes and fibroblasts that were either isolated from excised human skin or from juvenile foreskin following circumcision. In this project, we aimed at the generation of in vitro skin models using hair follicle-derived cells. Therefore, different methods to optimize cell isolation and expansion of outer root sheath (ORS) cells from human hair follicles were systematically investigated. The best procedure for the isolation of ORS cells was direct cell outgrowth on a cell culture insert co-cultured with a feeder layer of postmitotic human dermal fibroblasts. Following outgrowth, the cells were either further cultivated with feeder cells in a specific serum-enriched cell culture medium to obtain hair follicle-derived keratinocytes, or in the same culture medium without feeder cells to obtain fibroblasts.
afterwards, the generation of hair follicle-derived fibroblasts and keratinocytes was verified via the fibroblast-specific markers vimentin and desmin and the keratinocyte marker cytokeratin (ck) 14, clearly showing that vimentin and desmin are expressed in hair follicle-derived fibroblasts and in dermal fibroblasts. as expected, these cells were negative for ck14, which was abundantly expressed in hair follicle-derived keratinocytes. moreover, the expression of collagen type i, iv, tgf-beta, alpha-sma and il-1 alpha in hair follicle-derived fibroblasts and dermal fibroblasts showed no significant differences. ultimately, hair follicle-derived keratinocytes and fibroblasts were used to grow full-thickness skin models, which were subsequently characterized with regard to epidermal differentiation, skin permeability and skin surface ph. again, no significant differences compared with skin models grown from skin-derived cells were detected, showing the potential of hair follicle-derived cells for generating in vitro skin models. [1] vávrová, k., henkes, d., strüver, k., sochorová, m., školová, b. et al. filaggrin deficiency leads to impaired lipid profile and altered acidification pathways in a 3d skin construct. the journal of investigative dermatology 134, 746-753 (2014). bundesinstitut für risikobewertung, experimentelle toxikologie und zebet, berlin, germany. background: the eu directive 2010/63 has been drawn up with the aim of ultimately replacing animal testing. wherever animal experimentation is necessary, the 3-r-principle of russell and burch (replace, reduce, refine) has to be observed. the primary goal of the 3-r-principle is to replace animal testing with alternative methods. if no alternative method can be applied, the total number of animals is supposed to be reduced. consequently, some animals are used multiple times in the course of an experiment. 
for example, in imaging studies, rodents are exposed to anesthesia several times in order to monitor the progress of a disease. however, the directive states that "the benefit of reusing animals should be balanced against any adverse effects on their welfare, taking into account the lifetime experience of the individual animal". objective: we are investigating whether multiple exposures to anesthesia cause more stress than a single exposure. methods: the most common mouse strain, c57bl/6j, and the anesthetics isoflurane and the combination ketamine/xylazine are used. with regard to recent studies, the animals are anesthetized six times for 45 minutes over a period of three weeks. all observed parameters are compared between controls, animals with a single anesthesia and animals with repeated anesthesia. the interval between anesthetic administrations is three to four days. while the animals are under anesthesia, their vital parameters are continuously monitored, and afterwards their general condition is examined. the grimace scale is scored 30 and 150 minutes after anesthesia. besides pain, the grimace scale can also assess anxiety, stress and malaise. the display of so-called luxury behaviors such as nest building and burrowing serves as an indicator of wellbeing. furthermore, activity, food and water intake are monitored for 24 hours. a behavioral test battery including the free exploratory paradigm, open field, balance beam and rota-rod test is performed one, seven and ten days after the last anesthesia. motor coordination and balance are assessed by the balance beam and rota-rod. the open field is a test to investigate anxiety-related and exploratory behavior; the free exploratory paradigm estimates trait anxiety. moreover, corticosterone metabolites are measured in feces and fur in order to provide evidence of cumulative stress. results: the first results of our study will be presented at the 82nd meeting of the dgpt. 
conclusion: we are confident that the results of our study will contribute to the assessment of the severity level caused by multiple exposures to anesthesia and thereby benefit the welfare of laboratory rodents. bb3r is funded by the bmbf. in 1959 the 3r-principle was defined by the british scientists william russell and rex burch in their book 'the principles of humane experimental technique'. the 3 rs refer to replace, reduce and refine. they set the goals to use alternative non-animal methods (replace), to reduce the number of laboratory animals (reduce), and to minimize the distress of laboratory animals and improve their welfare (refine). the implementation of the 3r-concept is the overall goal of scientific animal welfare. article 4 of the 'directive 2010/63/eu on the protection of animals used for scientific purposes' states that research on refinement is as important as research on replacement and reduction [1]. according to the current statistics on laboratory animals, the mouse is the most commonly used animal in experiments, accounting for approximately 70 % [2, 3]. effective pain treatment is crucial not only for ethical and legal reasons but also to achieve high-quality results [4]. pain in mice is only obvious if it is severe or acute; slight or moderate pain is difficult to identify. the determination of pain levels and effective dosages of analgesics is therefore challenging. the most commonly used analgesics for postsurgical pain treatment in mice are opioids. however, the recommendations for their use show vast dosage ranges. for example, the recommended dose of buprenorphine ranges from 0.05 to 2.5 mg/kg bodyweight [5-7], depending on the pain model used. additionally, putative pharmacogenetic strain differences have to be considered. for example, analgesic treatment with morphine shows mouse strain-specific differences in pain sensitivity [8]. 
the goal of the project is to refine the recommendations for effective dosage of opioid analgesics in laboratory mouse inbred strains by incorporating strain-specific differences. regarding the phenotype, we want to identify a putative inbred-strain-dependent pain threshold and measure drug levels in plasma and brain. additionally, the metabolic capacity and mrna expression levels will be investigated. genotype identification is based on a database analysis used for correlation with phenotypical parameters. [1] rl 2010/63/eu. [2] bmel (2014). anzahl der für versuche und andere wissenschaftliche zwecke verwendeten wirbeltiere. [3] eu commission (2013). seventh report on the statistics on the number of animals used for experimental and other scientific purposes in the member states of the european union. [4] carbone l (2011). pain in laboratory animals: the ethical and regulatory imperatives. plos one 6, e21578. [5] gv-solas, a.f.a.d. (2013). empfehlung zur schmerztherapie bei versuchstieren. [6] carpenter, j.w., t.y. mashima, and d.j. rupiper (2001). exotic animal formulary, 2nd edition, w.b. saunders co., phila. [7] flecknell, p. (1996). laboratory animal anaesthesia, 2nd edition, academic press, san diego, ca. introduction: göttingen minipigs™ are frequently used large animal models in orofacial research, for example dental implant surgery. requests from experimental surgeons for detailed anatomical information cannot be answered because there are no existing data, especially not age-related data. because of unavailable data and the wrong choice of animal age, surgical interventions fail or lead to enormous post-operative suffering. therefore the aim of this study is the acquisition of detailed anatomical data of the mandibula and other organs and structures without sacrificing pigs for this purpose. 
animals, materials and methods: ct scans from a 64-slice scanner were collected from 18 female minipigs: 6 animals aged 12 months (group 1, n=6) and 12 animals (group 2, n=12) examined at the ages of 17 and 21 months. these minipigs were involved in experiments approved by the regional office for health and social affairs, berlin. image analysis was performed using vitrea advanced® (vital images). more than 50 parameters concerning the teeth, the mandibular body, frame and canal, the coronoid process and the mandibular condyle were defined and measured. for now, we focused on the development of the mandibular canal and the distance between the dorsal border of the mandibular canal and the alveolar ridge at the most posterior mental foramen, parameters immensely important for planning interventions when testing new dental implants. results: the computed tomography measurements showed variations of several parameters between the left and right ramus mandibulae and within the different age groups. the volume of the canalis mandibulae increases over time. animals of the same age show significant differences in volume, with a range of up to 65%: the largest volume was 13.4 ml and the smallest 4.7 ml. the distance between the dorsal border of the mandibular canal and the alveolar ridge decreases between 12 and 17 months of age. comparing 17- and 21-month-old minipigs, no significant difference in distance could be observed. from the age of 17 months, the position of the mandibular canal in relation to the alveolar ridge remains constant. conclusion: the decrease of the distance between the mandibular canal and the alveolar ridge between 12 and 17 months of age indicates ongoing anatomical changes of this parameter until the age of 17 months. therefore animals included in long-term studies after orofacial experiments, such as implant surgery of the mandibula, should be older than 17 months. 
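the ~65% range quoted above can be reproduced directly from the two extreme canal volumes reported (13.4 ml and 4.7 ml); a quick check of this arithmetic, expressed as the spread relative to the largest volume:

```python
# spread of canalis mandibulae volumes, as reported in the results above
largest_ml = 13.4   # largest canal volume (ml)
smallest_ml = 4.7   # smallest canal volume (ml)

# range expressed as a fraction of the largest volume
relative_range = (largest_ml - smallest_ml) / largest_ml
print(f"range: {relative_range:.0%}")  # prints "range: 65%"
```

this confirms that "a range of up to 65%" refers to the difference between the extreme volumes normalized to the largest one.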
because of the described individual differences, the authors strongly suggest supporting the planning of orofacial interventions by ct imaging or other radiographic techniques. background: laboratory housing conditions are standardized to a high level. under these conditions, the occurrence of stereotypic behaviour (sb) can be observed. stereotypies are commonly understood as deviations from normal behaviour that are repetitive, invariant and without any obvious function or aim for the animal [1]. worldwide it is estimated that over 85 million animals perform sb, with the highest prevalence in laboratory animals and the agricultural sector. fvb/n is a typical inbred mouse strain that shows different types of stereotyped movements. it is known that environmental enrichment decreases the incidence of sb, yet stereotypies still occur [2]. since an animal's behaviour highly influences its metabolism and immune system, differences in handling, caring and keeping can lead to varying results, even with an identical experimental setup [3]. aim of the study: to observe different life stages of fvb/n mice and immediately detect the development of sb. observations are linked to various behavioural tests and the characterisation of the metabolic and immunological phenotype. the results will lead to a better understanding of the mechanisms driving the development of sb and clarify its implications for animal welfare, and to what extent the performance of stereotypies even reflects emotional suffering. the strain-related behaviour and sb are recorded with computer-assisted programmes. observational periods already begin with the parental generation. as possible indicators for later-developing sb, data on reproductive success and maternal care are collected, as well as data from different behavioural tests. the animals are characterized by a specific protocol for detecting the metabolic and immunological phenotype. finally, histological and molecular biological analyses follow. 
outlook: it is of paramount importance to take good care of the welfare of laboratory animals (3r: refinement). however, knowledge about the ethological particularities of animals and the motivational basis of animals performing sb is not sufficient to generally avoid stereotypies. therefore the character of sb has to be analysed more intensively in order to understand the needs of laboratory animals and to develop recommendations for optimizing breeding and keeping, as well as for the assessment of possible distress in animals performing sb. objective: thermoregulation is a vital function in both humans and animals, with the serotonin (5-ht) system, in particular the 5-ht1a receptor, playing a major role. activating 5-ht1a receptors with the 5-ht1a receptor agonist 8-hydroxy-2-(dipropylamino)tetralin (8-oh-dpat) leads to reduced body temperature. while there is consensus that hypothermia is induced by the stimulation of postsynaptic 5-ht1a receptors in rats and humans, the regulatory mechanisms in mice are less clear. in our group, while phenotyping a transgenic mouse line permanently overexpressing the 5-ht1a receptor in serotonergic projection areas, bert et al. (2008, pmid: 18396339) revealed an exaggerated 8-oh-dpat-provoked hypothermic response. thus, the objective of the present study was to substantiate the contribution of postsynaptic 5-ht1a receptors to thermoregulation, more precisely to the hypothermic effect of 8-oh-dpat, in mice. methods: we used a radio telemetry technique to monitor the basal body temperature and the hypothermic effect of different doses of 8-oh-dpat (0.1-4 mg/kg i.p.) in male transgenic mice in comparison to nmri wild-type males. additionally, we investigated whether reduction of serotonergic activity by pretreatment with the 5-ht synthesis inhibitor parachlorophenylalanine (pcpa; 100 mg/kg i.p. 
on four consecutive days) would alter the effects of 8-oh-dpat on body temperature in transgenic mice postsynaptically overexpressing the 5-ht1a receptor. results: 5-ht1a-overexpressing mice showed lower basal body temperature than wild types (transgenic mice: 36.0 °c; nmri wild-type mice: 37.4 °c). in both genotypes, systemic administration of 8-oh-dpat dose-dependently decreased body temperature, with the effect being significantly more pronounced in mutant mice (-2.8 °c compared to -1.5 °c in nmri wild types). dose-response curves of 8-oh-dpat revealed an ed50 of 0.4 mg/kg in transgenic and 0.57 mg/kg in nmri wild-type mice. pcpa pretreatment did not alter the hypothermic response to 8-oh-dpat in mice. the dose-response curves indicate a higher potency of 8-oh-dpat in transgenic mice. the exaggerated hypothermic response to 8-oh-dpat in mutant mice implies that postsynaptic 5-ht1a receptors could be involved in thermoregulatory function in mice. this assumption is further supported by the fact that 8-oh-dpat-evoked thermal responses were not influenced by pretreatment with pcpa, most notably in transgenic mice overexpressing 5-ht1a receptors postsynaptically. genetic variation within g protein-coupled receptors compromises the therapeutic application of drugs targeting these receptors. one of the most intensely studied variations is p.arg389gly in the human beta1-adrenoceptor (adrb1). the adrb1 carrying arginine at position 389 in helix 8 in the proximal carboxy terminus is more frequent among caucasians and is hyperfunctional. yet, the molecular basis for the differences between the beta1-adrenoceptor variants arg389-adrb1 and gly389-adrb1 is poorly understood. despite its hyperfunctionality, we found the arg389 variant of the adrb1 to be hyperphosphorylated upon continuous stimulation with norepinephrine when compared to the gly389 variant. 
using adrb1 sensors to monitor activation kinetics by fluorescence resonance energy transfer (fret), the arg389-adrb1 showed faster activation and arrestin recruitment than the gly389 variant. both depended on phosphorylation of the receptor, as shown by knockdown of g protein-coupled receptor kinases and by fret experiments using phosphorylation-deficient adrb1 mutants. point mutation of single serines and threonines in the carboxy terminus of the adrb1 finally revealed a variant-specific phosphorylation pattern that determines arrestin recruitment. taken together, these findings suggest that differences in receptor phosphorylation determine the differences in activation speed, efficacy and arrestin recruitment of adrb1 variants. opioid drugs are the most potent analgesics used in the clinic; however, by activating the μ-opioid receptor (mor) they also produce several adverse side effects, including constipation, antinociceptive tolerance and physical dependence. there is substantial evidence suggesting that g protein-coupled receptor kinases (grks) and β-arrestins play key roles in regulating mor signaling and responsiveness. following phosphorylation by grks, β-arrestins bind to phosphorylated mors, which prevents further interactions between the receptor and g proteins even in the continued presence of agonist, resulting in diminished g protein-mediated signaling. we have previously shown that agonist-induced phosphorylation of mor occurs at a conserved 10-residue sequence, 370-trehpstant-379, in the carboxyl-terminal cytoplasmic tail. morphine induces a selective phosphorylation of serine 375 (s375) in the middle of this sequence that is predominantly catalyzed by g protein-coupled receptor kinase 5 (grk5). 
by contrast, high-efficacy opioids not only induce phosphorylation of s375 but also drive higher-order phosphorylation of the flanking residues threonine 370 (t370), threonine 376 (t376) and threonine 379 (t379) in a hierarchical phosphorylation cascade that specifically requires the grk2/3 isoforms. to investigate this mechanism further, we have developed novel β-galactosidase complementation assays to monitor agonist-dependent recruitment of grk2 and grk3 to activated mors. using this assay, we were able to show that activation of mor by high-efficacy agonists such as damgo results in a robust translocation of grk2/3. in contrast, activation by low-efficacy agonists such as morphine results in a much less pronounced recruitment of grk2/3 isoforms. interestingly, damgo-induced β-arrestin recruitment was strongly inhibited by sirna knockdown of grk2 or grk3. conversely, morphine-induced β-arrestin recruitment was strongly enhanced by overexpression of grk2 or grk3. mutation of s375 to alanine strongly inhibited both grk and β-arrestin recruitment. however, mutation of all 11 carboxyl-terminal serine and threonine residues of mor was required to completely abolish its interaction with arrestins and grks, resulting in a complete loss of mor internalization and desensitization. heterotrimeric g proteins are located at the inner leaflet of the plasma membrane and are a major primary transducer of cell signaling initiated by g protein-coupled receptors (gpcrs). based on sequence similarity, heterotrimeric g proteins can be subdivided into four main classes, i.e. gi/o, gs, gq/11 and g12/13, which interact with distinct cellular effectors to shape the final cellular response [1]. identification of new selective and cell-permeable g protein inhibitors is of great interest, as these may be beneficial in complex pathologies that involve signaling of multiple gpcrs [2]. 
mechanistically, small-molecule g protein inhibitors may, for instance, block nucleotide exchange by inhibiting gdp release (i.e. act as guanine nucleotide dissociation inhibitors, gdis) or allow gdp release but block gtp entry by stabilizing the g protein in an empty-pocket conformation [3]. here, we present a new approach to classify g protein inhibitors with regard to their mechanism of action in radioligand binding experiments. in particular, we investigated the influence of the specific gαq/11/14 inhibitor fr900359 [4] on agonist-radioantagonist binding experiments performed with membranes of cho cells stably expressing the muscarinic m1 acetylcholine receptor (cho-m1). agonist-radioantagonist competition under these conditions is biphasic because agonists bind with higher affinity in the ternary complex consisting of agonist, receptor and nucleotide-free g protein than to a g protein-free receptor [1, 5]. we show that fr900359 can be classified as a gdi, as it significantly reduced high-affinity binding of carbachol in cho-m1 membranes. in contrast, bim-46187, a pan-g protein inhibitor with a cell-type-dependent preference for gq, did not influence high-affinity agonist binding and thus stabilized gq in an empty-pocket conformation [3]. interestingly, inhibition of high-affinity agonist binding by fr900359 was incomplete in agonist-radioantagonist displacement studies and also when a radioagonist was applied. to fully prevent high-affinity agonist binding by fr900359, combined uncoupling of both gi and gs proteins from m1 was required by pre-treatment with pertussis toxin and cholera toxin, respectively. these data demonstrate that not only gq, but also gi and gs, contribute to the high-affinity fraction of m1 receptors. taken together, our findings show that radioligand binding experiments are an attractive approach to classify new g protein inhibitors according to their mechanism of action. 
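the biphasic agonist-radioantagonist competition described above is conventionally modelled as the sum of two one-site displacement curves: a high-affinity fraction of receptors coupled to nucleotide-free g protein plus a low-affinity uncoupled fraction. a minimal sketch of such a two-site model, with hypothetical ic50 values and site fractions that are illustrative only and not taken from the abstract:

```python
def two_site_displacement(log_conc, frac_high, log_ic50_high, log_ic50_low):
    """Fraction of specific radioligand binding remaining at a given
    log10 agonist concentration, assuming two independent binding sites
    with unit Hill slopes (hypothetical model parameters)."""
    high = frac_high / (1.0 + 10.0 ** (log_conc - log_ic50_high))
    low = (1.0 - frac_high) / (1.0 + 10.0 ** (log_conc - log_ic50_low))
    return high + low

# hypothetical example: 40% high-affinity sites (IC50 = 10 nM),
# 60% low-affinity sites (IC50 = 10 µM)
for log_c in range(-10, -3):
    remaining = two_site_displacement(log_c, 0.4, -8.0, -5.0)
    print(f"agonist 10^{log_c} M: {remaining:.2f} of binding remains")
```

in this picture, a gdi such as fr900359 would shrink the high-affinity fraction (frac_high) toward zero, collapsing the biphasic curve into a monophasic one, which is the read-out used above to classify inhibitor mechanisms.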
universität, würzburg, germany. g protein-coupled receptors (gpcrs) are cell surface receptors which, upon a conformational change in the receptor protein induced by an extracellular stimulus, can transduce the signal onto intracellular adaptor proteins such as heterotrimeric g proteins [1]. gpcr-induced cell signaling can be rather complex, as several gpcrs may activate multiple different adaptor proteins and can additionally be activated via distinct binding sites, i.e. the orthosteric transmitter binding site and other "allosteric" binding sites [2]. in the present work, we wanted to investigate the influence of an allosteric binding site on receptor activation of muscarinic acetylcholine receptors (machrs). to this end, we employed the orthosteric full agonists acetylcholine and iperoxo as well as several dualsteric compounds consisting of iperoxo linked to an allosteric phthalimide (phth) or naphthalimide (naph) moiety through alkyl chains of different lengths or through a diamide linker (fri). binding of the allosteric part to the receptor protein may restrict the conformational flexibility of the receptor protein and thus interfere with receptor activation [2]. therefore, application of different linker lengths may control the signaling outcome. here, we used the human m1 machr, which preferentially activates g proteins of the gq/11 type but can also promiscuously stimulate gs proteins. gq/11- and gs-dependent signaling pathways were analyzed using cho cells stably transfected with the human m1 machr in ip1 and camp accumulation assays, respectively. in comparison to the orthosteric building block iperoxo, all dualsteric compounds under investigation showed a decrease in potency for both gq-mediated and gs-mediated signaling. our findings show that the bulkier allosteric naph residue impaired both signaling pathways to a greater extent than the smaller phth substituent. 
particularly, the compound iper-6-naph completely lost intrinsic activity for both gq/11 and gs activation at the m1 machr. moreover, gs-mediated pathway activation is more sensitive to spatial restriction in the allosteric vestibule than gq signaling. interestingly, longer linker lengths led to improved signaling for both pathways (gq and gs) in both hybrid series. iper-7-phth seems to be an exception, as it had a higher intrinsic efficacy for gs-dependent signaling than the other phth hybrids with longer linker chains. strikingly, only iper-fri-phth, which corresponds to iper-8-phth in linker length but is able to engage in increased hydrogen bonding with the receptor protein, acted as a full agonist at the m1 machr for both signaling pathways under investigation. taken together, these data strongly suggest that, in comparison to gq/11-mediated signaling, activation of the gs protein by the m1 machr is more sensitive to spatial restriction in the allosteric vestibule. thus, it appears to be possible to control signaling of the m1 machr by allosteric constraint of the receptor's conformational flexibility. for more than three decades, 3-(1h-imidazol-4-yl)propylguanidine (sk&f-91,486, 1) [1] has been known as the prototypic pharmacophore of highly potent histamine h2-receptor (h2r) agonists of the guanidine class of compounds, including, e.g., impromidine and arpromidine [2]. in order to gain more insight into the structure-activity relationships of alkylated analogues of sk&f-91,486, we characterized 78 newly synthesized derivatives, including several bivalent compounds (e.g., 2), using different pharmacological in vitro methods [3]. the potential h2r agonists were subjected to a broad screening procedure utilizing radioligand binding assays with membranes of sf9 cells [4] (hh1,2,3,4r). compounds were also functionally characterized in the [35s]gtpγs assay (hh2r, sf9 cell membranes) [5]. selectivity vs. 
hh1,3,4r was determined for selected derivatives using the same technique. organ bath studies (gph1r (ileum), gph2r (right atrium)) yielded functional data in a more physiological environment. the major part of the new sk&f-91,486 analogues displayed partial or full agonism via hh2r and gph2r, respectively. the most potent analogue, the bivalent thiazole-type bisguanidine 2, was a partial agonist (emax = 88%) and 250 times as potent as histamine at the gph2r. attempts to antagonize the positive chronotropic effect of (partial) agonists, by preincubation with cimetidine or by adding a cimetidine bolus at the end of the concentration-response curve, were successful and furnished pa2 values for the antagonist (5.87-6.38) which are in accordance with literature data. however, in the functional in vitro assay on the gph2r, the positive chronotropic response evoked by sk&f-91,486 was surprisingly resistant to antagonism by cimetidine and other typical h2r antagonists (ranitidine, famotidine), although the compound has so far been unanimously classified by others as a weak partial h2r agonist. this behaviour is unique within the large series of sk&f-91,486 analogues studied so far under similar conditions. probably the positive chronotropic effect of the lead compound is, at least in part, the result of a second molecular interaction which has been overlooked so far. [1] parsons, m.e. et al.; agents actions 1975, 5, 464. [2] buschauer, a.; j. med. chem. 1989, 32, 1963-1970. [3] pockes, s.; dissertation, univ. regensburg 2015. [4] seifert, r.; j. pharmacol. exp. ther. 2003, 305(3), 1104-1115. the nociceptin/orphanin fq (n/ofq) peptide (nop) receptor is the fourth, most recently discovered and least characterized member of the opioid receptor family (alongside mor, kor and dor). the nop receptor is widely distributed and modulates several physiological processes via its endogenous ligand nociceptin. 
the nop receptor is a potential target for the development of ligands with therapeutic use in several pathophysiological states such as chronic and neuropathic pain. consequently, there is increasing interest in understanding the molecular regulation of the nop receptor. recently, we generated two phosphosite-specific antibodies directed against the carboxyl-terminal residues serine 351 (s351) and threonine 362/serine 363 (t362/s363), which enabled us to selectively detect either the s351- or the t362/s363-phosphorylated forms of the receptor. our results show that nociceptin, mcoppb, sch221510 and ro64-6198 induce a stable phosphorylation at s351 and t362/s363, followed by a profound internalization of the receptor. the nociceptin-induced s351 and t362/s363 phosphorylation can be blocked by selective nop receptor antagonists such as j113397 or sb612,111. nnc63-0532, buprenorphine and norbuprenorphine failed to induce phosphorylation at these sites. in the presence of nociceptin, s351 phosphorylation occurred at a faster rate than phosphorylation at t362/s363, indicating that s351 is the primary site of agonist-dependent phosphorylation. activation of pkc by phorbol 12-myristate 13-acetate facilitated receptor phosphorylation only at s351 but not at t362/s363, indicating that s351 can also undergo heterologous phosphorylation. using nop-gfp knock-in mice, we detected nop receptors in brain, spinal cord and dorsal root ganglia (drg). we were also able to demonstrate a dose-dependent nop receptor phosphorylation at t362/s363 in mouse brain in vivo using western blot and mass spectrometry. in contrast, mcoppb and sch221510 failed to induce phosphorylation in vivo. together, these data provide new insights into the molecular regulation of the nop receptor in vitro and in vivo. 
several findings indicate that inflammatory diseases are initiated or maintained by an imbalance of receptor-biased signaling, the latter referring to the ability of distinct ligands to endow individual receptors with qualitatively different g-protein- and/or β-arrestin-dependent signaling (1). chemokines and their receptors regulate a wide array of leukocyte functions, including chemotaxis, adhesion and transendothelial migration, and thus play important roles in regulating inflammation (2). in man, two cc chemokine receptors, ccr2a and ccr2b, are present that differ only in their carboxyl-terminal portions, the latter known to interact with multi-protein complexes made up of heterotrimeric g proteins (pertussis toxin-sensitive and -insensitive) and non-g-protein components, including β-arrestin (2). interested in differential signaling of ccr2a and ccr2b, we comparatively analyzed ligand-induced g-protein-regulated signaling pathways (e.g. activation of phospholipase c isoenzymes and rho gtpase-induced serum response factor [srf] activity) and β-arrestin-regulated pathways (e.g. internalization of receptors and phosphorylation of erk1/2) in the presence of ccl2, ccl8, ccl7 and ccl13. all these chemokines have been shown to interact with human ccr2 receptors. in addition, the structural requirements of the carboxyl-terminal portions involved in determining specificity in g-protein-dependent signaling were addressed by using ccr2 receptor mutants. the comparative analysis revealed that differences in ligand-induced activation of g-protein-dependent (pertussis toxin-sensitive versus pertussis toxin-insensitive) and/or β-arrestin-dependent signaling exist. for example, activation of ccr2b receptors by ccl2 induced both rho-dependent srf activation and receptor internalization, while ccl8 stimulation resulted in srf activation but little if any receptor internalization. in contrast, ccr2a-expressing cells showed ccl2-dependent srf activation but no receptor/ligand internalization. 
analysis of the structural requirements of ccr2 receptors for coupling to g proteins revealed that arginine 313 within the putative 'eighth helix' of the carboxyl-terminal portions of ccr2a and ccr2b is not involved in gαi-mediated induction of erk1/2 and plays a minor role in ccr2b receptor internalization, but is specifically required for the ccr2a/ and ccr2b/gαq-mediated activation of srf. serotonin 5-ht2c receptors (5-ht2cr) functionally engage gq proteins and are expressed in the central nervous system (cns). 5-ht2crs significantly regulate emotion, feeding, reward and cognition and thus might serve as promising targets for drugs against psychiatric disorders or obesity. due to the technical difficulties in isolating cells from the cns and the hitherto lack of suitable cell lines endogenously expressing 5-ht2cr, our knowledge about this receptor subtype is rather limited. the recently established hypothalamic mhypoa-2/10 cells show some characteristics of appetite-regulating hypothalamic neurons of the paraventricular nucleus, where 5-ht2cr expression has been detected in vivo. thus, we tested mhypoa-2/10 cells for putative 5-ht2cr expression by performing single-cell calcium imaging. we observed that serotonin and the specific 5-ht2cr agonist way161,503 induced robust calcium transients, which were strongly inhibited by two unrelated 5-ht2cr-selective antagonists (sb206,553, rs102,221). serotonin and way161,503 also activated a camp response element-dependent reporter gene construct and induced significant phosphorylation of extracellular signal-regulated kinases 1/2 in an sb206,553- and rs102,221-sensitive manner, providing further evidence for functional 5-ht2cr expression in mhypoa-2/10 cells. the intrinsic activity of way161,503 ranged between 0.3 and 0.5 compared to serotonin in all assays, defining way161,503 as a partial 5-ht2cr agonist. 
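the partial-agonist classification above rests on the ratio of the maximal response of the test compound to that of the full agonist serotonin; a minimal sketch of that calculation, with hypothetical emax readouts that are illustrative only and not taken from the abstract:

```python
def intrinsic_activity(emax_test, emax_reference):
    """Intrinsic activity: maximal response of a test agonist
    normalized to that of a reference full agonist."""
    return emax_test / emax_reference

# hypothetical assay readouts (arbitrary units), not from the abstract
serotonin_emax = 100.0
way161503_emax = 40.0

alpha = intrinsic_activity(way161503_emax, serotonin_emax)
print(alpha)  # 0.4, within the 0.3-0.5 range that defines partial agonism here
```

an intrinsic activity of 1 would correspond to a full agonist; values well below 1, as measured here for way161,503, indicate partial agonism.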
in conclusion, we provide convincing data that hypothalamic mhypoa-2/10 cells endogenously express 5-ht 2c r and thus represent the first cell line to analyse 5-ht 2c r pharmacology, signaling and regulation in its natural environment. optical and electrophysiological methods allow detection and characterization of g i/o -protein coupled receptors. j. straub, walther-straub-institut für pharmakologie und toxikologie, münchen, germany. g-protein mediated signaling pathways are essential components of basic cellular functions. of note, g-protein coupled receptors (gpcrs) constitute one of the major drug targets in modern medicine. however, despite their clinical importance, fundamental properties of these receptors remain unknown. in particular, regulation of the major second messenger camp by g s - and g i/o -protein coupled receptors is of special interest. the classical biochemical method to detect receptor-mediated camp level changes uses pre-labeling with 3 h-adenine and calculation of the conversion rate to 3 h-camp. although this multi-cellular method is highly sensitive and reproducible, it lacks time-resolved and spatial assessment of camp formation in single living cells. to measure g s -protein-induced increases of intracellular camp levels in single living cells in a time-resolved manner, the fret-based camp sensor epac is commonly used. however, it was unknown whether this sensor might be suitable to detect g i/o -protein mediated decreases of intracellular camp levels as well. in this study, we show that fret-based camp sensors can be deployed to reliably monitor g i/o -protein mediated camp level decreases. fret experiments with adrenergic α 2a or µ opioid receptors in combination with different fret-based camp sensors showed a notable reduction of intracellular camp levels upon receptor activation which could be significantly reduced by selective receptor antagonists.
of note, hek293 cells had to be pre-incubated with forskolin at a submaximal concentration to increase basal camp levels. our findings suggest that fret-based epac sensors are suitable to detect g i/o -protein activation, similar to electrophysiological whole-cell measurements in cells coexpressing g i/o -protein coupled receptors and trpc5 or heteromeric kir3.1/kir3.2 or kir3.1/kir3.4 channels. here, agonist stimulation caused current increases with characteristic current-voltage relationships. altogether, our findings indicate that both optical and electrophysiological approaches allow time-resolved detection and characterization of g i/o -protein coupled receptor activation in single living cells. histamine can exert positive inotropic and chronotropic effects in humans via histamine h 2 -receptors. we have generated and partially characterized transgenic mice (tg) which overexpress the human histamine h 2 -receptor specifically in cardiomyocytes via the α-myosin heavy chain promoter. in these mice, but not in wild type mice (wt), histamine increased heart rate and ejection fraction (ef) measured in vivo by echocardiography under isoflurane anesthesia. to investigate some aspects of the signaling pathway in these mice, we crossed the tg mice with pp2a mice, leading to double transgenic mice (h 2 xpp2a = dt). pp2a mice (j biol chem 2004; 279:40827) overexpress the catalytic subunit of protein phosphatase 2a (pp2a) in cardiac myocytes and develop a cardiac hypertrophy. at an age of about 240 days we noted reduced ef in pp2a (33.1 ± 3.5 %, n=16) compared to wt (54.4 ± 4.5 %, n=10) and tg (59.8 ± 2.9 %, n=10). interestingly, in dt the ef (43.9 ± 4.9 %, n=11) was higher than in pp2a at similar heart rates. e´/a´ was elevated in dt compared to wt. relative heart weights were unchanged between these groups.
in summary, we demonstrated that pp2a is involved in h 2 -receptor signaling and we tentatively conclude that the h 2 -receptor is able to ameliorate systolic but not diastolic cardiac function of pp2a mice. serotonin (5-ht) can exert positive inotropic and chronotropic effects in humans via 5-ht 4 -receptors. we have generated transgenic mice (tg) which overexpress the human 5-ht 4 -receptor selectively in cardiomyocytes via the α-myosin heavy chain promoter. in these mice, but not in wild type mice (wt), serotonin induced increases in heart rate and ejection fraction. we treated the mice with 30 µg lps (lipopolysaccharide, i.p.; a standard model of sepsis) per g body weight or isotonic sodium chloride solution (as solvent control). echocardiography under isoflurane anesthesia was performed before and 3, 7 and 24 hours after lps treatment. lps led to a time-dependent deterioration of cardiac function in both tg and wt. the deterioration included systolic function (left ventricular ejection fraction = ef) as well as diastolic function (height of a and e waves through the mitral valve: e/a). for instance, seven hours after lps, ef amounted to 58.8 ± 20.7 % in wt and to 61.… % in tg (p<0.05 vs. pre-lps value). however, 24 hours after lps, diastolic function, measured as e/a, amounted to 1.86 ± 0.43 in tg (p<0.05, tg vs. wt). moreover, after 24 hours a less pronounced decline in body temperature (probably due to superficial abdominal hyperemia) occurred in tg versus wt. in contrast, while all flow parameters declined after 3 and 7 hours of lps, they were not different between wt and tg. for instance, maximum flow (in mm/s) through the ascending aorta declined from 1158.3 ± 212.2 to 782.5 ± 154.3 in wt and from 1208.4 ± 235.1 to 798.8 ± 170.1 in tg (tg vs. wt, p>0.05, n=10-12; after 7 hours).
we tentatively conclude: 5-ht 4 -receptor overexpression protects the heart against sepsis, putatively by interference with the intracellular biochemical pathway of lps in cardiomyocytes. histamine can exert positive inotropic and chronotropic effects in humans via histamine h 2 -receptors. we have generated transgenic mice (tg) which overexpress the human h 2 -receptor specifically in cardiomyocytes via the α-myosin heavy chain promoter. in tg, but not in wild type mice (wt), histamine (ec 50 = 34 nm) or amthamine (ec 50 = 10 nm), a more selective and potent h 2 -receptor agonist, induced positive inotropic effects (pie) and positive chronotropic effects (pce) in isolated left and right atrial preparations, respectively. in order to investigate the signal transduction for the pie in atrium, contractile studies using partially depolarized preparations were performed. therefore, left atrial preparations were equilibrated in the organ bath to 44 mm potassium chloride. thereafter, histamine (100 µm) induced a pie (31.1 ± 10.4 % of control, n=7) in tg but not in wt preparations, whereas isoprenaline (10 µm) increased force in both wt and tg. the pie of histamine in potassium-treated tg atrium could be blocked by cimetidine. compound 48/80, a releasing agent of endogenous histamine, increased force of contraction in tg left atria to a higher extent than in wt. furthermore, we tested whether analgesics known to release histamine were inotropically active in tg. however, morphine (10 µm) was unable to affect contractility in wt or tg, whereas ketamine and fentanyl increased left atrial contractility in both tg and wt. in summary, we demonstrated an involvement of the l-type calcium channel current in the h 2 -receptor mediated pie in tg atria. we failed to release inotropically active histamine using classical analgesics, arguing that a direct effect also in the human heart is unlikely to occur.
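ec 50 values like the 34 nm and 10 nm reported above are typically obtained by fitting a hill (logistic) concentration-response equation to the measured responses. a minimal, hypothetical sketch of such a fit (the data points are simulated, not the atrial measurements):

```python
# hypothetical sketch: estimating an ec50 with the hill equation,
# as used for histamine/amthamine concentration-response data above.
# the data points are simulated, not measured values.
import numpy as np
from scipy.optimize import curve_fit

def hill(conc, emax, ec50, n):
    """fractional response at agonist concentration `conc` (nm)."""
    return emax * conc**n / (ec50**n + conc**n)

conc = np.array([1, 3, 10, 30, 100, 300, 1000], dtype=float)  # nm
resp = hill(conc, 1.0, 34.0, 1.0)  # simulate histamine-like data, ec50 = 34 nm

popt, _ = curve_fit(hill, conc, resp, p0=[1.0, 10.0, 1.0])
emax, ec50, n = popt
print(f"ec50 = {ec50:.1f} nm, hill slope = {n:.2f}")
```

with noisy real data one would additionally report confidence intervals for the fitted parameters.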
the initial step in the homologous desensitization of g-protein-coupled receptors is their phosphorylation by one of the g-protein-coupled receptor kinases (grks). we demonstrate here the measurement by fret of the interaction of grk2, a ubiquitously expressed grk, with the µ-opioid receptor (µor), and its dependence on agonist efficacy. hek293t cells transfected with yfp-tagged µor and mturquoise-tagged grk2 as well as non-fluorescent gα i1 , gβ and gγ subunits showed a robust increase in fret upon superfusion with 10 µm [d-ala 2 , n-mephe 4 , gly-ol]-enkephalin (damgo), which was reversible upon agonist washout. the partial agonist morphine (30 µm) also caused a fret increase, but the amplitude of the fret signal was reduced to approximately 15-20% of that of the corresponding damgo signal. grk2 binds g-protein βγ (gβγ) subunits, and therefore we aimed to find out how cotransfection of grk2 affected the interaction of gβγ with the µor. however, we could not measure any damgo-induced fret changes between yfp-tagged µor and mturquoise-tagged gβ in the presence of non-fluorescent gα i1 and gγ. this was unexpected because we had previously successfully determined interactions between gβγ and the α 2a -adrenergic receptor or the m 3 muscarinic acetylcholine receptor. this lack of fret was not due to an inability of the tagged gβ to interact with the µor, as we could measure damgo-induced fret changes between yfp-tagged gα i1 and gβγ (gβ tagged with mturquoise) in the presence of non-fluorescent µor. moreover, when we attempted to establish an effect of grk2 on the interaction between the µor and gβγ, we could now pick up fret between the µor and gβγ. comparison of the on- and off-kinetics of the µor-grk2 interaction with that of the µor-gβγ interaction in the presence of grk2 revealed similar time constants both for the on- and off-kinetics (grk2: k on 0.16 s ).
this suggests that, in the absence of grk2, the orientation of the two fluorophores on the µor and gβγ may be unfavorable or the interaction may be too short-lived to produce an appreciable fret signal. in the presence of grk2, however, gβγ changes its position relative to the µor in a way that allows the interaction of the grk2-gβγ complex with the µor to be detected by fret. similarly, we measured fret between gβγ and the α 2a -adrenergic receptor or the m 3 muscarinic acetylcholine receptor in the absence and presence of grk2 and compared the kinetics with the kinetics of grk2 binding and unbinding to these receptors. in both cases, we found that grk2, and gβγ in the presence of grk2, associate and dissociate from these receptors with comparable kinetics. our results suggest that ligand efficacy for µors is already apparent on the level of receptor-grk interaction. institute of pharmacology, university medical center göttingen, ag lutz, göttingen, germany. introduction: the monomeric gtpase rhoa is dysregulated in heart disease and in vivo models provide evidence of rhoa signaling being involved in the progression of cardiac fibrosis and hypertrophy. how rhoa is regulated within this context on a cellular level is not defined. objective: the goal of this study was to analyze rhoa activation in adult cardiomyocytes (amcm) from normal and diseased mouse hearts in response to g protein-coupled receptor activation. this project also aimed at providing new insight into the dependence of rhoa localization and activation on the signaling-organizing caveolae in neonatal as well as adult cardiomyocytes and engineered heart muscles. methods: cardiomyocytes from sham-operated c57bl/6 mice, from mice subjected to transverse aortic constriction (tac) or from neonatal rats were either directly fixed after isolation or cultured in the presence or absence of methyl-β-cyclodextrin (mβcd). for analyses of rhoa activation, cells were treated with endothelin-1 (et-1) for 90 sec.
cells were prepared for immunofluorescence analysis or lysed for immunoblotting. imaging was performed using confocal microscopy. effects of mβcd were further studied in 3d engineered heart muscles (ehm) made from total neonatal rat cardiac cells. the contractile function of ehm was assessed in the organ bath and cells were studied in sections by immunofluorescence analysis. results: in amcm from sham mice, active rhoa mainly localizes at the sarcolemma and is augmented in response to et-1 treatment. in tac-amcm, the basal level of active rhoa is increased and, surprisingly, et-1 had no further effect. immunoblot analysis demonstrated that in tac-amcm rhoa expression was per se higher and the major caveolae protein caveolin-3 was reduced. to test the influence of caveolae on rhoa activation and expression, we treated nrcm with mβcd and found the expression of rhoa as well as of the rhoa target genes ccn1 and ccn2 to be moderately up-regulated after 24 h. in addition, an intensified longitudinal alignment of sarcomeric actin fibers was detectable, which could also be seen in mβcd-treated ehm. however, mβcd had no effect on ehm twitch tension but increased the resting tension compared to control. we further treated amcm with mβcd and found rhoa expression to be increased and its activity less sensitive to et-1 treatment. finally, we could show that the perinuclear localization of cav3 and rhoa was strongly reduced after mβcd treatment. whereas g-protein coupled receptors (gpcrs) have long been believed to signal through cyclic amp only at the cell surface, our group has previously shown that gpcrs not only signal at the cell surface but can also continue doing so once internalized together with their ligands, leading to persistent camp production (1). this phenomenon, which we originally described for the thyroid stimulating hormone receptor (tshr) in thyroid cells, has been observed also for other gpcrs (2) (3) (4).
however, the intracellular compartment responsible for such persistent signaling was insufficiently characterized. the aim of this study was to follow by live-cell imaging the internalization and trafficking of tshr, tsh and effector proteins in thyroid cells. mouse primary thyroid cells were transfected with fluorescent-protein tagged tshr, g-proteins, nanobody specific for active g-proteins and/or subcellular markers by electroporation, stimulated with fluorescently labeled tsh and visualized using highly inclined thin illumination (hilo) microscopy. our results suggest that tsh is internalized in complex with its receptor and they traffic retrogradely via the trans golgi network (tgn). while we could not find any evidence of internalized tsh/tshr complexes activating g-proteins in early endosomes, we show that tsh/tshr complexes meet the intracellular pool of gαs in the tgn and activate it, as visualized in real-time by a nanobody specific for active gαs. upon acute brefeldin a-induced golgi collapse, the retrograde trafficking of tsh/tshr via tgn is hindered. bulk tsh stimulations in primary mouse thyroid cells isolated from transgenic mice expressing the camp sensor, epac1-camps, also show a significantly reduced camp production in the presence of brefeldin-a. these data provide evidence that internalized tsh/tshr complexes meet and activate g-proteins at the tgn, which might serve as the main platform of persistent camp signaling after receptor internalization. objective: sphingosine 1-phosphate (s1p) is generated by sphingosine kinase (sk)-1 and -2 and acts mainly as an extracellular ligand at five specific g protein-coupled receptors, denoted s1p 1-5 . 
after activation, s1p receptors regulate important processes in the progression of renal diseases, such as mesangial cell migration. methods and results: here we demonstrate that dexamethasone treatment lowered s1p 1 mrna and protein expression levels in rat mesangial cells, measured by taqman® and western blot analyses. this effect was abolished in the presence of the glucocorticoid receptor antagonist ru-486. in addition, in vivo studies showed that dexamethasone downregulated s1p 1 expression in glomeruli isolated from c57bl/6 mice treated with dexamethasone (10 mg/kg body weight). functionally, we identified s1p 1 as a key player mediating s1p-induced mesangial cell migration. using boyden chamber assays, we could show that dexamethasone treatment significantly lowered s1p-induced migration of mesangial cells. this effect was again reversed in the presence of ru-486. conclusion: we suggest that dexamethasone inhibits s1p-induced mesangial cell migration via downregulation of s1p 1 . overall, these results demonstrate that dexamethasone has functionally important effects on sphingolipid metabolism and action in renal mesangial cells (koch et al., biol. chem. 2015; 396: 803-12). the g protein-coupled receptor mrgd is a receptor for angiotensin-(1-7) involving g alphas , camp, and protein kinase a. rationale: angiotensin (ang)-(1-7) has cardiovascular protective effects and is the opponent of the often detrimental ang ii within the renin-angiotensin system. although it is well-accepted that the g-protein coupled receptor mas is a receptor for the heptapeptide, the lack of knowledge about the initial signalling molecules stimulated by ang-(1-7) has prevented final verification of ligand/receptor interaction as well as the identification of further hypothesized receptors for the heptapeptide. objective: the study aimed to identify a second messenger stimulated by ang-(1-7), allowing confirmation as well as discovery of the heptapeptide's receptors.
we identified camp as the second messenger for ang-(1-7). the heptapeptide elevates camp concentration in primary cells such as endothelial or mesangial cells. using camp as readout in receptor-transfected hek293 cells, we provided final pharmacological proof for mas to be a receptor for ang-(1-7). more importantly, we identified the g-protein coupled receptor mrgd as a second receptor for ang-(1-7). consequently, the heptapeptide failed to increase camp concentration in primary mesangial cells with genetic deficiency in both mas and mrgd. furthermore, we excluded the ang ii type 2 receptor at2 as a receptor for the heptapeptide, but discovered that the at2 blocker pd123319 can also block the mas and mrgd receptors. conclusions: our results lead to an expansion and partial revision of the renin-angiotensin system, by identifying a second receptor for the protective ang-(1-7) but excluding the at2 receptor, and by prompting a revisit of publications that inferred at2 function from the use of pd123319 as a specific at2 blocker. members of the g protein-coupled receptor (gpcr) superfamily are integral membrane proteins that are activated by extracellular ligands and induce cell signaling via g proteins and other adaptor proteins. rhodopsin, the prototypical gpcr that mediates vision, is activated by photons that isomerize its covalent ligand. spectroscopic analyses of the cognate agonist retinal allow a detailed description of rhodopsin dynamics at submillisecond resolution. using rhodopsin as a model, it has been demonstrated that receptor activation, i.e. a switch from the fully inactive to the fully active state, occurs within 1 ms 1 . activation kinetics of other receptors have been studied mainly using fluorescence resonance energy transfer (fret), which allows kinetic studies with high resolution in living cells 2 . in contrast to rhodopsin, activation time constants of several gpcrs, determined using agonist superfusion, range between 30-80 ms 3 .
however, it is unknown if all gpcrs with diffusible ligands are really activated on a longer timescale or if ligand diffusion to the binding site is rate limiting. in this study, we intend to overcome ligand diffusion by using photodestruction of caged ligands. we monitor activation-related conformational changes of homodimeric metabotropic glutamate receptor 1 (mglur1) sensors by fret after uncaging of an inert glutamate derivative. 4-methoxy-7-nitroindolinyl-l-glutamate (mni-glutamate) is a caged derivative of glutamate that does not activate mglur1s. upon pre-incubation of hek-tsa cells expressing both cfp- and yfp-tagged mglur1 protomers with mni-glutamate, a short uv laser pulse releases active l-glutamate close to the receptor binding site. we demonstrate very rapid mglur1 activation kinetics, and this allows us to study the process of signal transduction of this homodimeric gpcr. opioids are still the mainstay of modern pain treatment. most of the clinically established substances primarily exert their effects via the µ-opioid receptor (mor). however, many side effects such as tolerance, constipation and respiratory depression limit their therapeutic use. the efficacy of mor agonists in the treatment of chronic pain is unsatisfactory. in general, analgesic effects can be mediated by all four members of the opioid receptor family. the nociceptin receptor (nop) is the latest member of the opioid receptor family. there is a rapidly growing interest in the development of novel nop and combined mor/nop agonists. the aim of this development is to obtain novel therapeutic agents with improved analgesic characteristics and fewer classical mor-mediated side effects. here we used buprenorphine as a clinically established opioid which exerts its effect on multiple opioid receptor subtypes. recently, nalfurafine, a potent kappa-opioid receptor (kor) agonist, was approved by japanese authorities for the treatment of uremic pruritus.
even though kor agonists are known to mediate dysphoria and hallucinations, this has not been reported for nalfurafine. rudolf-virchow-zentrum für experimentelle biomedizin, würzburg, germany. g protein-coupled receptors (gpcrs) belong to a superfamily of cell surface signaling proteins that mediate many physiological responses to hormones and neurotransmitters. they represent the prime targets for therapeutic drugs in healthcare. however, due to the limited knowledge about the pharmacology of the majority of gpcrs, only a few of them are employed as therapeutic targets. in our lab, the activation kinetics of the α 2a -adrenergic receptor, among other receptors, have been extensively studied in single-cell assays (1) (2). the activation kinetics of the labeled α 2a -adrenergic receptor were monitored by förster resonance energy transfer (fret). the goal of our study is to design a sensor to monitor receptor activation kinetics in high-throughput screening assays. as a proof of concept, we used the α 2a -adrenergic receptor as a prototype. to optimize the fret efficiency, we exchanged the previous acceptor (yfp) with the halotag technology (3). additionally, we used halotag in combination with nanoluc (4) to explore the possibility of using bret as a high-throughput approach to monitor receptor activation kinetics. the fret-based sensor α 2a -halo/cfp showed an increase in fret upon application of the full endogenous agonist norepinephrine, with an ec50 value in accordance with the previously published data. this suggests the functionality of the fret-based α 2a -halo/cfp sensor. similar results were obtained with the α 2a -adrenergic receptor bret-based sensor. in contrast to the full agonist norepinephrine, the inverse agonist yohimbine decreased the ratio in both the fret- and the bret-based α 2a -adrenergic receptor sensors.
this demonstrates the sensitivity of the methods to discriminate between agonist- (increased ratio) and inverse agonist-induced (decreased ratio) receptor kinetics. our results show the feasibility of using halotag to monitor receptor activation via fret in a single-cell format, and that halotag, in combination with nanoluc, can be used to monitor receptor activation in a high-throughput format. , c. hoffmann the homologous visual arrestins) that has recently been solved by x-ray crystallography. here we investigated both the interaction with gpcrs and β-arrestin conformational changes in real time and in living cells with a series of fret-based β-arrestin2 biosensors. upon stimulation, β2-adrenergic receptors bound β-arrestin2 with a time constant τ = 1.3 ± 0.17 s, indicating that β-arrestin2 binding rapidly terminates their g-protein signaling. we observed a subsequent receptor-mediated conformational change in β-arrestin2 with τ = 2.2 ± 0.22 s. stimulation of β2-adrenergic vs. m2 muscarinic or ffa4 receptors resulted in different patterns of conformational changes in the various β-arrestin2 sensors and also in downstream kinase signaling, revealing receptor specificity in β-arrestin2 activation. upon agonist removal, first the interaction (delay = 1.9 ± 0.51 s) and only then the active state of β-arrestin2 (delay = 4.2 ± 0.85 s) were reversed. accordingly, β-arrestin localization at the cell membrane lasted much longer than the direct interaction with β2-adrenergic receptors. our data indicate a rapid, receptor-specific, two-step binding and activation process between gpcrs and β-arrestins; they further suggest that β-arrestins remain active following dissociation from receptors, allowing them to remain at the cell surface and presumably signal independently. thus, gpcrs trigger a rapid, receptor-specific activation/deactivation cycle of β-arrestins, which permits their active signaling.
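time constants such as the τ = 1.3 s reported above for β-arrestin2 recruitment are commonly extracted by fitting a mono-exponential function to the fret trace. a hedged sketch with a simulated trace (not the authors' data or analysis code):

```python
# illustrative sketch: extracting a time constant tau from a fret trace
# by mono-exponential fitting. the trace below is simulated, not measured.
import numpy as np
from scipy.optimize import curve_fit

def mono_exp(t, amp, tau, offset):
    """mono-exponential rise with time constant tau (seconds)."""
    return offset + amp * (1.0 - np.exp(-t / tau))

t = np.linspace(0.0, 10.0, 200)          # s
trace = mono_exp(t, 0.05, 1.3, 1.0)      # simulated fret ratio, tau = 1.3 s

popt, _ = curve_fit(mono_exp, t, trace, p0=[0.1, 1.0, 1.0])
amp, tau, offset = popt
print(f"tau = {tau:.2f} s")
```

the same fit applied to the washout phase (an exponential decay) would yield the off-kinetics.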
pathogenic clostridium difficile produces two large glucosyltransferases, tcda and tcdb, which are the main pathogenicity factors. the cytotoxin tcdb is about 1,000-fold more potent than tcda. tcda and tcdb are a/b-structure toxins exhibiting an enzymatically active (a) domain and a binding/translocation domain (b) to deliver the active glucosyltransferase domain into the cytosol of host cells. besides its glucosyltransferase activity, by which substrate proteins mainly of the family of rho gtpases are inhibited, tcdb has additional cytotoxic effects that are independent of rho glucosylation. to investigate the mechanism by which tcdb induces early cell death, we applied chimeras of tcdb from different toxinotypes in which different glucosyltransferase domains were combined with different translocation domains. to this end we cloned tcdb from strain vpi10463 (historical strain), strain 1470 (serotype f, variant tcdbf), strain r20291 (hypervirulent strain, ribotype 027), and strain r9385 (hypervirulent strain with tcdbf characteristics). we were able to investigate the impact of the glucosyltransferase domains with different substrate specificity when translocated into the host cell by identical translocation domains. furthermore, we tested different translocation domains to deliver the same glucosyltransferase domain into host cells. we found that the glucosyltransferase domain of tcdbf (strain 1470) is less cytotoxic with respect to early cell death mediated by reactive oxygen species than that from reference strain vpi10463. in addition, the translocation domain also showed significant impact on cytotoxicity, probably by faster intracellular delivery of the glucosyltransferase domain (gtd). by using glucosyltransferase-deficient mutants in which the highly conserved dxd motif was changed to nxn, we were able to show that glucosylation of rho gtpases counteracts the cytotoxic effect, since the mutants were more cytotoxic than wild-type toxins.
in conclusion, the cytotoxicity of tcdb mainly depends on the translocation efficiency into the host cell and on the kinetics of glucosylation of their substrate gtpases. thus, sensitivity of target cells towards the cytotoxic effect also depends on receptor abundance and the intracellular status of rho gtpases, whereas the cytopathic effect, i.e. cell rounding, is predominantly determined by the substrate specificity. introduction: p63rhogef mediates the g protein-coupled receptor (gpcr)-dependent activation of rhoa in different cells. however, its role in cardiac fibroblasts (cf) is not defined yet. thus, we studied its localization and function in cf in 2d and 3d culture experiments. methods: neonatal rat cardiac fibroblasts (nrcf) and adult ventricular fibroblasts (amcf) from wild type mice and p63rhogef-knockout mice were adenovirally transduced for 48 to 72 h with recombinant adenoviruses or directly used. for 2d studies the cells were treated with angiotensin ii (ang ii). the location of the involved signaling components, rhoa activation and down-stream effects were studied by confocal microscopy and biochemical analyses. in addition, cf were used to prepare cf-containing engineered connective (ect) or muscle (ehm) tissues. results: we could show that p63rhogef localizes at the plasma membrane, adjacent to the golgi apparatus and at the base of primary cilia. in accordance, p63rhogef regulates the ang ii-induced expression and secretion of connective tissue growth factor (ctgf) in nrcf, involving the serum response factor. in ect, p63rhogef increases the stiffness of these tissues, and in ehm containing cf expressing gain- and loss-of-function p63rhogef variants it influences contractility. interestingly, the increase in ect stiffness was independent of p63rhogef's regulatory function of ctgf, as overexpression of ctgf in cf had no impact on ect properties, arguing for a more general role of p63rhogef in auto- and paracrine signaling.
moreover, our data on amcf with a genetic deletion of p63rhogef imply that p63rhogef is not only a transducer of gpcr-dependent rhoa activation, as its loss led to an increase in rhoa expression accompanied by an increase in rhoa-dependent gene expression, suggesting a role of p63rhogef in the feedback regulation of this signaling cascade. conclusion: in summary, our data show that p63rhogef regulates auto- and paracrine signaling in cardiac fibroblasts. atrophic rhinitis is characterized by a drastic destruction of nasal turbinate bones in different animals. it leads to shortening and twisting of the snout and growth retardation in young pigs. this bone degradation is induced by pasteurella multocida toxin (pmt), a toxin produced by p. multocida serogroups a and d. this destructive effect indicates an interaction of pmt with bone cells like osteoclasts and osteoblasts. we demonstrated that pmt stimulates the differentiation of osteoclasts and inhibits the differentiation of osteoblasts in a gq-dependent mechanism. the underlying molecular mechanism of the toxin is the deamidation of an essential glutamine residue in the α-subunits of heterotrimeric g proteins, which results in the constitutive activation of the g protein. until now, only the function and the pmt-dependent effects on osteoblasts and osteoclasts have been studied in detail, but there is also a third important cell type in bone, the osteocytes. osteocytes are discussed as regulators of bone turnover, interacting with osteoclasts and osteoblasts, e.g. via secretion of several osteoclastogenic and osteoblastogenic cytokines. therefore, we studied the effects of pmt on the function of osteocytes in more detail. we utilized an osteocyte-like cell line and primary osteocytes isolated from tibiae and femurs of mice. the susceptibility of primary osteocytes and the osteocyte-like cell line towards pmt was demonstrated by detection of toxin-induced deamidation of g proteins.
we also observed a pmt-induced secretion of different cytokines, like rankl, il-6 and tnf-α, which are known to induce osteoclastogenesis or inhibit osteoblastogenesis. furthermore, we studied the underlying signal transduction pathways and other pmt-induced effects on osteocytes, like morphological changes. in summary, we show that pmt acts on osteocytes by stimulating heterotrimeric g proteins. this might have an impact on overall bone metabolism due to modulation of osteoblast and osteoclast activity. pasteurella multocida is an opportunistic pathogen often residing in the nasal pharyngeal space of animals. one virulence factor of p. multocida serogroups a and d is the protein toxin pmt (p. multocida toxin), which is the causative agent of atrophic rhinitis, characterized by degradation of nasal turbinate bones in pigs and other animals. on the molecular level, pmt activates distinct members (g q/11 , g 12/13 and g i ) of heterotrimeric g proteins, leading to a modulation of bone metabolism: the toxin stimulates osteoclastogenesis but blocks osteoblastogenesis, which results in bone loss. this mechanism of action of pmt might be exploited to counteract the human disease fibrodysplasia ossificans progressiva (fop), a rare and highly disabling disorder of extensive heterotopic bone growth. the underlying cause of fop is a point mutation in the activation domain of acvr1 (r206h), a bmp (bone morphogenetic protein) type 1 receptor. this mutation leads to increased bmp signaling and heterotopic osteoblastogenesis. here, we report that c2c12 cells, a mouse myoblast cell line often used as a fop model, are susceptible to pmt intoxication. pmt induces deamidation of g proteins in these cells. furthermore, pmt very efficiently inhibits bmp4-induced osteoblast differentiation in c2c12 cells. this has been shown by measuring alkaline phosphatase expression, which is an early marker of osteoblast differentiation.
additionally, the impact of pmt on acvr1 r206h-induced osteoblastogenesis will be investigated and the involved cellular signaling pathways will be characterized in detail. the data indicate that activation of heterotrimeric g proteins might be a rationale for pharmacological therapy of fop.

p63rhogef is an activator of the monomeric gtpase rhoa and was shown to be expressed in the heart. in cardiac fibroblasts and smooth muscle cells, p63rhogef regulates rhoa in response to angiotensin ii and controls the actin cytoskeleton as well as protein expression and secretion. its role in cardiomyocytes, however, has not been elucidated so far. cardiomyocytes were isolated from neonatal rat hearts (nrcm), wild-type mouse hearts (wt-amcm) and homozygous p63rhogef knockout mouse hearts (ko-amcm). the cells were either directly fixed or adenovirally transduced for 48 h in culture. for activation of gq/11 signaling, the cells were treated with endothelin-1 (et-1), angiotensin ii (ang ii) or phenylephrine (pe) for 90 s. rhoa activation was assessed by affinity binding assays or with a specific active-rhoa antibody. other proteins were detected by immunoblot or immunofluorescence analysis with subsequent confocal imaging. in nrcm, p63rhogef is involved in the regulation of et-1-induced rhoa activity and thus increases the expression and secretion of the rhoa target gene ctgf. in accordance, p63rhogef was found to be localized at the sarcolemma as well as in intracellular membrane compartments. the strongest co-localization was detected with the kdel receptor 3 (kdelr3), which resides in the endoplasmic reticulum membrane. next, we analyzed rhoa activation in wt- and ko-amcm and could show that loss of p63rhogef led to an increase in basal rhoa activity and an uncoupling from the gpcrs.
interestingly, in ko-amcm the expression of caveolin-3, the major component of caveolae, in which several gpcrs are clustered, was reduced, and a shift in its localization from transverse to longitudinal membrane tubules was found, arguing for a role of p63rhogef in intracellular protein transport. in accordance, golgi apparatus particles, which were demonstrated to play a role in caveolae formation, were reduced in size in ko-amcm. to further address the role of p63rhogef in the transport of membrane proteins, we overexpressed p63rhogef in wt-amcm and could show that this led to an increase in the expression of kdelr3 and its co-localization with p63rhogef in the perinuclear region and at the sarcolemma. no sarcolemmal localization of kdelr3 was found in control-transduced cells. further, p63rhogef was localized adjacent to golgi apparatus particles, which were similarly reduced in size as detected in ko-amcm. finally, we expressed the dominant-negative construct p63dn and detected similar changes with respect to kdelr3 localization and golgi structure, suggesting that this regulatory function of p63rhogef is not dependent on its activity. conclusion: p63rhogef mediates the activation of rhoa from gpcrs coupled to gq/11 proteins. moreover, it has a function in intracellular transport and distribution of membrane proteins that is independent of its activity.

universität bonn, pharmacology and toxicology section, bonn, germany

g protein signaling is a means allowing cells to quickly respond and adapt to environmental changes. four major g protein classes (gs, gi/o, gq/11, g12/13) exist in mammals, and these must suffice to convey signals from about 800 g protein-coupled receptors to the cell interior. as such, g proteins receive, interpret, and finally route the gpcr signals to diverse sets of downstream target proteins and thereby permit cells to respond to their ever-changing environment.
understanding the contribution of individual g protein classes or even isoforms to complex signaling networks in living cells requires the capacity to activate or inactivate proteins with great precision and selectivity. one approach towards inactivation of g protein function is chemical inhibition. however, the "true specificity" of chemical inhibitors for their associated targets may often be debated. in this study we posit that fr900359, a cyclic depsipeptide isolated from the leaves of ardisia crenata, may clearly be designated as "truly specific" for inhibition of gq signaling. using a broad set of complementary methods based on label-free holistic cell sensing, classical endpoint assays, and bioluminescence resonance energy transfer-based g protein biosensors, we assign exceptional selectivity to fr for inhibition of gq/11/14 over all other mammalian isoforms ("on-target effects"). in holistic label-free recordings using hek293 cells that lack functional gq/11 alleles due to crispr-cas9 genome editing, bona fide gq stimuli were undetectable. reintroduction of gq into the knockout background was required and sufficient to fully restore both agonist responses and their inhibition by fr. moreover, fr was completely ineffective in cells lacking gq/11 in phenotypic assays that examine basic cellular functions such as cell growth, viability, morphology and expression of housekeeping genes ("off-target effects"). from these results we conclude that fr is of outstanding value as a molecular probe to unravel the contribution of gq signaling to complex biological processes in vitro, ex vivo and in vivo. just as pertussis toxin is applied world-wide by numerous laboratories to diagnose signaling of gi/o proteins, we anticipate fr to stand out at least equally for investigations into the biological relevance of gq.

binary actin adp-ribosylating toxins like c. perfringens iota toxin and c.
difficile transferase cdt cause depolymerisation of the cortical actin cytoskeleton, induce the formation of microtubule-based cell membrane protrusions and redirect rab-dependent intracellular traffic (schwan et al. 2009). here, we employed the model of toxin-induced protrusions to study the formation of cilia. we found that toxin-induced microtubule-based protrusion formation at the cell membrane depends on recruitment of septins, which are highly conserved, small gtp-binding proteins. similar to toxin-caused protrusions, septins are also recruited to the site of ciliogenesis. inhibition of septins by shrna-based knockdown inhibits ciliogenesis as well as toxin-induced protrusion formation. septins are suggested to be involved in exocytotic processes, which are important for ciliogenesis and also for toxin-induced protrusion formation. accordingly, translocation of septins is accompanied by a recruitment of rab proteins and proteins of the exocytotic machinery. the data indicate that septins function as a scaffold at the base of cellular processes like cilia and toxin-induced protrusions, organizing the cross-talk between the actin cytoskeleton and microtubules to regulate the vesicle trafficking and exocytotic machinery.

hypervirulent clostridium difficile strains are associated with increased morbidity and mortality. these strains produce the actin-adp-ribosylating clostridium difficile toxin cdt. cdt depolymerizes the actin cytoskeleton, causes formation of microtubule-based protrusions and increases pathogen adherence. septins are essential for cdt-induced protrusion formation. sept2, 6, and 7 accumulate at predetermined protrusion sites and form collar-like structures at the base of protrusions. the septins are a prerequisite for protrusion formation: the septin inhibitor forchlorfenuron or knock-down of septins inhibits protrusion formation. septins colocalize with active cdc42 and its effector borg proteins, which act as upstream regulators of septin polymerization.
microtubules interact with septin structures. precipitation and surface plasmon resonance studies revealed high-affinity binding of septins to the microtubule plus-end tracking protein eb1, thereby guiding incoming microtubules. the data indicate that cdt hijacks conserved regulatory principles involved in microtubule-membrane interaction, depending on septins, cdc42, borgs and restructuring of the actin cytoskeleton.

the zebrafish danio rerio has become an important vertebrate model organism for a wide range of scientific questions [1]. current studies are mainly focused on development, genetics and disease, for which the zebrafish is particularly well suited due to its small size, rapid development, short generation time, optical transparency of embryos and larvae as well as conservation of functional domains [1]. hitherto, nothing is known about the composition and endogenous levels of different cnmps in various developmental stages and organs of danio rerio. therefore, we used the zebrafish in our study as a vertebrate model to systematically characterize the temporal- and organ-specific occurrence of all cnmps, including cump, in vivo. cyclic nucleotides were quantified by high performance liquid chromatography quadrupole tandem mass spectrometry. we observed specific cnmp patterns in developmental stages and different organs from adult zebrafish, which supports the hypothesis of a distinct cnmp signaling code [2]. camp, cgmp and cump were present in tissue samples of both developmental stages (embryos at 24 hours post fertilization, larvae at 5 days post fertilization) and within all harvested organs. remarkably, these three cnmps were the only ones detected in the brain. the camp concentration of entrails as well as the camp and cgmp concentrations in the brain were similar to those previously described in mouse tissues [3]. ccmp was detected throughout development and was present in all organs except the brain.
the identity of ccmp and cump in the zebrafish was confirmed by high performance liquid chromatography quadrupole time-of-flight mass spectrometry (hplc-ms/tof). thus, we unequivocally show for the first time that cump occurs in vertebrates. furthermore, we detected cimp in several developmental stages of the zebrafish, and observed the highest concentrations in testes and heart, but we were unable to unequivocally identify cimp via hplc-ms/tof. in the zebrafish, sac is evolutionarily not conserved and absent, since a search in the ncbi gene database revealed no entry for sac (also referred to as ac10). therefore, sac can be excluded as a cump and ccmp generator in this system, and sgc remains as the only bona fide cump and ccmp generator in the zebrafish. to test this hypothesis, the effects of no donors, sgc stimulators and sgc activators on cump levels in zebrafish should be examined in future studies. recently, induction of apoptosis in mouse lymphoma cells by ccmp-am has been described [4]. thus, it would be interesting to examine the effect of ccmp-am on zebrafish embryos in future studies as well. [1] seifert, r.: ccmp and cump: emerging second messengers, trends in biochemical sciences 40, 8-15 (2015).

ccmp, 3',5'-cump and 3',5'-cimp were ineffective. to further characterize the action of 3',5'-cgmp on hut-78 cells, we determined apoptosis (propidium iodide/annexin v staining) and proliferation (carboxyfluorescein succinimidyl ester staining). 3',5'-cgmp significantly increased apoptosis (ec50 = 75 µm) and inhibited proliferation (ec50 = 63 µm) of okt3-activated hut-78 cells. interestingly, 2',3'-cgmp also exhibited comparable effects on apoptosis and proliferation, with ec50 values of 92 µm and 75 µm, respectively, while 2',3'-camp, 2',3'-ccmp and 2',3'-cump were ineffective. this indicates that the pro-apoptotic and antiproliferative action of cgmp does not depend on the position of the phosphodiester bond.
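ec50 values like those quoted above are obtained by fitting a dose-response curve to the measured readout; the standard model is the four-parameter hill equation. a minimal sketch of such a fit, using synthetic, noise-free data whose parameters (bottom, top, slope) are hypothetical placeholders chosen only to echo the reported ec50 of ~75 µm; none of the numbers below come from the study:

```python
import numpy as np

def hill(conc, bottom, top, ec50, n):
    """four-parameter logistic (hill) dose-response curve."""
    return bottom + (top - bottom) * conc**n / (ec50**n + conc**n)

# synthetic apoptosis readout (% annexin v-positive cells); every parameter
# is a hypothetical placeholder mimicking the reported ec50 of ~75 µm
conc = np.array([1.0, 3.0, 10.0, 30.0, 100.0, 300.0, 1000.0])  # µm cgmp
data = hill(conc, 5.0, 60.0, 75.0, 1.5)                        # noise-free

# brute-force least-squares fit of ec50 and hill slope (bottom and top are
# held fixed at their true values to keep the sketch short)
ec50_grid = np.linspace(10.0, 200.0, 191)   # 1 µm steps
n_grid = np.linspace(0.5, 3.0, 26)          # 0.1 steps
sse, ec50_fit, n_fit = min(
    (np.sum((hill(conc, 5.0, 60.0, e, h) - data) ** 2), e, h)
    for e in ec50_grid for h in n_grid
)
print(f"fitted ec50 = {ec50_fit:.0f} µm, hill slope = {n_fit:.1f}")
```

with real, noisy measurements one would fit all four parameters with a nonlinear least-squares routine and report a confidence interval for the ec50 rather than a point estimate.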
we also tested 3',5'-cgmp degradation products under the same experimental conditions and found that both 5'-gmp and guanosine increased apoptosis and inhibited proliferation with ec50 values between 50 and 100 µm. by contrast, adenosine did not influence cell growth and viability, suggesting that adenosine receptors are not involved in the observed effects. our results suggest that the guanosine moiety is responsible for the pro-apoptotic and antiproliferative effects of 3',5'-cgmp, 2',3'-cgmp and 5'-gmp. it has been reported earlier that guanosine is toxic to jurkat cells, another t cell lymphoma cell line [1]. 3',5'-cgmp may be hydrolyzed by an ecto-phosphodiesterase on the cell surface of hut-78 cells (e.g. enpp1), yielding 5'-gmp, which could be further degraded to guanosine by the 5'-ecto-nucleotidase cd73. a similar pathway may lead from 2',3'-cgmp to guanosine.

a previous analysis of phosphodiesterases (pdes) revealed that the dual-specific pde isoforms 3a and 3b as well as the cgmp-selective pde9a also degrade the emerging second messenger cump [1, 2]. we analyzed the enzyme kinetics of pde3b-mediated cump hydrolysis using recombinant gst-tagged pde3b and a highly sensitive and specific hplc-ms/ms method. our data show that pde3b is a low-affinity enzyme for cump, with a km value for cump of >100 µm. the pde3-selective competitive inhibitor milrinone inhibited pde3b-mediated cump degradation, suggesting that cump binds to the catalytic center. pde3b is highly expressed in adipose tissue [3, 4]. thus, we differentiated murine 3t3-l1-mbx fibroblasts into adipocytes and analyzed differentiation-dependent alterations of pde3b expression and basal cnmp concentrations. in both differentiated and undifferentiated 3t3-l1 cells, cump and ccmp were detected in addition to the established second messengers camp and cgmp. differentiation to adipocytes reduced camp and cgmp by 66% and 60%, respectively, while ccmp was reduced by 78% and cump even by 85%.
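the kinetic picture described above, a low-affinity michaelis-menten enzyme plus a competitive inhibitor, has a simple signature: the inhibitor raises the apparent km without changing vmax. a brief numerical sketch; vmax, the milrinone ki and the milrinone concentration are invented placeholders, and only the km is loosely tied to the reported >100 µm bound:

```python
import numpy as np

def mm_competitive(s, vmax, km, i=0.0, ki=1.0):
    """michaelis-menten rate with a competitive inhibitor:
    v = vmax * s / (km * (1 + i/ki) + s)"""
    return vmax * s / (km * (1.0 + i / ki) + s)

# hypothetical values for illustration only: km placed above the reported
# >100 µm bound; vmax and the milrinone ki/concentration are invented
km, vmax, ki = 150.0, 10.0, 0.3                    # µm, nmol/min/mg, µm
i_milrinone = 3.0                                  # µm, invented
s = np.array([10.0, 50.0, 150.0, 500.0, 1500.0])   # µm cump

v_ctrl = mm_competitive(s, vmax, km)                       # no inhibitor
v_inh = mm_competitive(s, vmax, km, i=i_milrinone, ki=ki)  # + milrinone

# hallmark of competitive inhibition: apparent km rises, vmax is unchanged
km_app = km * (1.0 + i_milrinone / ki)
print(f"apparent km with inhibitor: {km_app:.0f} µm")
print("rate reduced at every substrate level:", bool(np.all(v_inh < v_ctrl)))
```

because the inhibition is surmountable by substrate, the rate difference vanishes at saturating cump; experimentally this is why the lineweaver-burk lines of a competitive inhibitor intersect on the 1/v axis.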
these findings suggest that cump plays a distinct role in adipocyte differentiation. the cump-hydrolyzing pde3b was upregulated ~1000-fold at the mrna level after adipocyte differentiation, which may contribute to the observed reduction of basal cump concentrations. we are currently investigating the potential biological role of cump in differentiation and lipolysis experiments, analyzing the effects of the membrane-permeant cump acetoxymethyl ester (cump-am). in future experiments, we will also analyze the enzyme kinetics of pde9a-mediated cump hydrolysis. pde9a is the first example of a cgmp-"specific" cump-hydrolyzing pde.

background: cgmp and camp are cyclic nucleotide messengers relevant to many physiological and pathophysiological conditions. live-cell imaging with fret-based biosensors is a powerful method to study the spatiotemporal dynamics of cgmp and camp under close-to-native conditions. however, with the existing biosensors it is difficult to resolve potential membrane-associated cgmp microdomains and to monitor cgmp and camp signals in parallel in the same cell. we have generated novel versions of the "green" cfp/yfp-based cytosolic cgmp biosensor cgi500. they comprise a "green" membrane-targeted version (mcgi500) and a "red" variant (red cgi500) that contains the fluorophores t-sapphire and dimer2. methods: the sensors were expressed and characterised in primary vascular smooth muscle cells (vsmcs). intracellular cgmp was elevated in intact vsmcs by application of a nitric oxide donor or natriuretic peptides, and each sensor's sensitivity to stimulation and its signal-to-noise ratio were determined. to test each sensor's sensitivity and specificity for cgmp versus camp, sensor-expressing cells were permeabilised with β-escin and exposed to defined concentrations of cyclic nucleotides. results: the original cgi500 sensor showed a good signal-to-noise ratio, an ec50 value of ≈1.1 µm for cgmp, and a high selectivity for cgmp over camp (>100-fold).
flincg3, a non-fret-based cgmp sensor, showed properties similar to those of cgi500. the new membrane-targeted mcgi500 and the new red cgi500 displayed ec50 values of ≈0.4 µm cgmp and a high selectivity for cgmp over camp (>300-fold). in vsmcs, the red cgi500 showed a better signal-to-noise ratio than the previously described "red" cgmp sensor, red cges-de5. the "green" fret-based camp sensor epac1-camps showed a signal-to-noise ratio comparable to that of cgi500, an ec50 value of ≈3 µm for camp and a selectivity for camp over cgmp of ≈6-fold. finally, imaging of cells expressing both epac1-camps and the red cgi500 demonstrated the feasibility of combined visualisation of camp and cgmp signals in the same cell. the new cgmp biosensors should be useful for a broad spectrum of applications requiring real-time monitoring of cgmp signals. for example, mcgi500 would be useful to investigate membrane-associated cgmp compartments and red cgi500 to study the crosstalk between cgmp and camp signalling in living cells and tissues.

the cgmp system is a major regulator of blood pressure. cgmp-dependent protein kinases (cgks), located in the smooth muscle layer of vessels, enable vessels to dilate and therefore cause a decrease in blood pressure (bp). by contrast, the renin-angiotensin-aldosterone system (raas) acts as an opponent and causes an increase in bp; furthermore, it influences the fluid-electrolyte balance. renin, which is secreted from renin-producing cells located in the juxtaglomerular apparatus (jga), is the key regulatory enzyme in this system. pharmacological inhibition of the raas, e.g. via ace inhibitors or at1-receptor antagonists, is a powerful tool to treat hypertension, but chronically challenges this endocrine system, resulting in an enhancement of renin expression.
this is caused by an increased number of renin-expressing cells (the so-called renin recruitment), which derive from a reversible metaplastic retransformation of extraglomerular and smooth muscle cells of afferent arterioles. in addition to regulation of renin function via camp/pka, it has been shown that enos-derived no supports this recruitment via activation of sgc and subsequent generation of cgmp [1]. whether this causes an activation of cgks is not known. these enzymes exist in 3 different isoforms: cgkiα, cgkiβ and cgkii. in contrast to the β-isoform, cgkiα (as well as cgkii) is highly expressed in the jga [2], [3]. therefore, we analyzed whether cgkiα also plays a role in renin synthesis, secretion or recruitment. to characterize the function of cgkiα in jga cells, we generated renin-cell-specific cgkiα-knockout mice (ren cre-cgki fl/fl) and stimulated renin recruitment via administration of a low-salt diet (0.02% na+) and enalapril (10 mg/kg/d) for 3 weeks. we analyzed blood pressure, mrna and renal protein content of renin and cgkiα, plasma renin activity and renin recruitment. furthermore, we activated the cgmp system in these mice using bay 41-8543, an sgc stimulator, and re-analyzed the above-mentioned parameters. our results indicate that cgkiα could be an additional system supporting renin recruitment but is not a crucial prerequisite. in contrast, the basal renin concentration and activity appear to be downregulated in ren cre-cgki fl/fl mice; thus, cgki could be an important regulator of renin synthesis.

universität regensburg, pharmakologie und toxikologie, regensburg, germany

jaw1/lrmp (lymphoid-restricted membrane protein) is a type 2 membrane protein localised to the cytoplasmic face of the endoplasmic reticulum. it encodes a 539 amino acid protein with a highly conserved coiled-coil domain in the middle third of the protein and a cooh-terminal transmembrane domain [1], [2].
jaw1 and irag share limited homology throughout the length of the protein; the coiled-coil domain and the putative transmembrane anchor at the c-terminus of jaw1 and irag share the highest homology [3], [4]. the coiled-coil domains of irag and jaw1 are important for the interaction with ip3rs. as already known, irag forms a trimeric complex with cgkiβ and ip3r1 and is phosphorylated by cgkiβ [5]. hence, we examined whether jaw1 is a new target protein of cgkiβ. the recognition site at which a substrate can be phosphorylated by cgki is composed of the following amino acids: (k/r)(k/r)x(s/t). the amino acid sequence of jaw1 contains several such possible phosphorylation sites. our in vitro studies with jaw1 and cgki showed that jaw1 is phosphorylated in a cgmp-dependent manner by cgkiβ. in contrast, jaw1 was not phosphorylated by cgkiα. furthermore, no stable interaction between jaw1 and cgkiβ was detected. to examine the importance of jaw1 in vivo, we generated a conditional knockout mouse. mating with a cmv-cre mouse resulted in a ubiquitous deletion of jaw1. mrna analysis and western blot analysis confirmed the deletion. the expression pattern revealed high expression in the thymus and weaker expression in the lung, spleen, colon, pancreas and the tongue. as already published by shindo et al., jaw1 was found in sweet, bitter, and umami taste receptor-expressing cells of mouse circumvallate, foliate, and fungiform papillae. we confirmed these results by x-gal staining and mrna analysis. therefore, we decided to analyse whether jaw1 influences taste perception. two-bottle preference tests did not reveal significant differences between wildtype and knockout mice, indicating that taste perception is not altered by jaw1. hence, the function of jaw1 in taste receptor-expressing cells has to be further examined in future studies.
the cyclic purine nucleotides adenosine 3',5'-cyclic monophosphate (camp) and guanosine 3',5'-cyclic monophosphate (cgmp) are well-characterized second messengers. both are generated by nucleotidyl cyclases and degraded by phosphodiesterases. several binding partners of camp and cgmp have already been identified and functionally analyzed, e.g. the camp-dependent protein kinase (pka) and the cgmp-dependent protein kinase (pkg) as well as the exchange proteins activated by camp 1 and 2, hyperpolarization-activated cyclic nucleotide-gated channels and phosphodiesterases. recent data indicate that the cyclic pyrimidine nucleotides cytidine 3',5'-cyclic monophosphate (ccmp) and uridine 3',5'-cyclic monophosphate (cump) also fulfill the criteria of second messengers [1, 2]. the interaction of ccmp with the regulatory subunits of pka (pkariα and pkariiα) has already been shown by using ccmp-agarose [3]. additional ccmp- and cump-binding proteins such as calnexin (a chaperone), myomegalin (a phosphodiesterase-interacting protein) and akap9 (an a-kinase anchoring protein) were identified by mass spectrometry analysis. to verify the interaction of ccmp and cump with these potential target proteins, ccmp and cump linked to biotin were used as another approach. the biotin constructs exhibit lower steric interference than the ccmp- and cump-agarose matrices, which were previously used to confirm the binding of pkariα to ccmp and cump [4]. flag-tagged calnexin, flag-tagged myomegalin and myc-tagged yotiao (the smallest splice variant of akap9) were examined as potential ccmp and cump target proteins. hek293 cells were transiently transfected with the cdna of the respective proteins. the lysates of the protein-overexpressing cells were then incubated with ccmp- and cump-biotin matrices, and bound proteins were purified using strep-tactin® beads (iba). afterwards, the interaction of ccmp and cump with the potential binding partners was analyzed by western blotting.
a pkariα antibody was used as a positive control. analogous experiments were also performed using ccmp- and cump-agaroses. once the interaction between the cyclic pyrimidine nucleotides and the potential binding partners has been confirmed, deletion mutants will be cloned to localize the ccmp- and cump-binding regions of the target proteins in further studies.

axonal branching is essential for the correct formation of neuronal circuits and enables the simultaneous transmission of information throughout the body. in mice, the bifurcation of axons of sensory neurons at the dorsal root entry zone of the developing spinal cord depends on a cgmp signaling cascade that includes c-type natriuretic peptide (cnp), natriuretic peptide receptor 2 (npr2, also termed gc-b), and cgmp-dependent protein kinase iα. this study investigated whether a disturbance of cgmp signaling, induced by manipulation of cgmp breakdown or cnp scavenging, affects axon bifurcation of murine embryonic dorsal root ganglion (drg) neurons. rt-pcr screens, in situ hybridization, and fret-based cgmp imaging in living neurons revealed phosphodiesterase 2a (pde2a) as the major enzyme for degradation of cnp-induced cgmp in embryonic drg neurons. interestingly, cgmp measurements and dii labeling of pde2a knockout embryos indicated that a strongly elevated concentration of cgmp does not impair sensory axon bifurcation of drg neurons in vivo. the natriuretic peptide receptor 3 (npr3) was found to be expressed in the roof and floor plate of the spinal cord as well as in the dorsal roots of e12.5 embryos. because npr3 binds natriuretic peptides but does not generate cgmp, it is thought to act as a natriuretic peptide clearance receptor. by scavenging cnp, npr3 could lower the activity of the cnp-npr2-cgmp signaling cascade in drg neurons. in the absence of npr3, the majority of sensory axons showed normal bifurcation, but ≈13% of the axons turned only in the rostral or caudal direction.
this study shows (1) that pde2a is important for the degradation of cgmp in embryonic drg neurons, (2) that the bifurcation of sensory axons in the spinal cord can tolerate high levels of intracellular cgmp in the absence of pde2a, and (3)

in the central nervous system, no-dependent cgmp signalling is associated with many different developmental processes and brain functions, and plays an important role in memory consolidation and cognition. to analyse cgmp signals in primary cells, a knock-in mouse was generated which stably and ubiquitously expresses a fret-based cgmp indicator (cgi500). cultured cortical and hippocampal neurons were found to respond to exogenously applied no (gsno). in these cell types, endogenous no is mainly generated by the neuronal no synthase (nnos) isoform, which requires a rise in intracellular calcium for activation of the no/cgmp signalling cascade. here, we show that ampa-type ionotropic glutamate receptors were capable of inducing a cgmp response in cultured cortical and hippocampal neurons. surprisingly, the ampar-induced cgmp signals were independent of nmdar activation, as inhibition of nmdars with the nmdar antagonist d-apv (d-2-amino-5-phosphonopentanoic acid) did not block the ampar-induced cgmp response. however, cgmp accumulation depends on no synthase activation, as the nos inhibitor l-nna (ng-nitro-l-arginine) completely abolished cgmp accumulation. whether ampar-induced nos activation depends on calcium influx via calcium-permeable ampars, vgccs (voltage-gated calcium channels) or calcium release from intracellular stores will be investigated further in detail.

cyclic adenosine monophosphate (camp) is an important and ubiquitous cellular second messenger. a dogma in signaling is that camp is distributed homogeneously in the cell and that its concentration changes equally upon stimulation. in contrast, a large body of evidence suggests the existence of concentration gradients (so-called microdomains) of camp.
in this regard, phosphodiesterases (pdes), the only enzymes which can degrade camp, have been suspected to be responsible for maintaining those gradients. however, how pdes establish camp gradients is entirely unknown. here, we measure local camp levels in hek cells and cytosolic fractions thereof using the camp fret sensor epac1-camps fused to a phosphodiesterase (pde4a1). we demonstrate the existence of low camp concentrations in close proximity to pdes and show that this gradient is maintained by pde hydrolytic activity. further, we establish that camp gradients cannot be maintained solely on the basis of pde activity, as the camp turnover is very slow. we provide evidence that pdes are structurally organized in yet unspecified 'microstructures' in which camp diffusion must be considerably slowed down. taken together, we suggest that camp gradients are established by pde hydrolytic activity in cellular regions with slow diffusion of camp. our study sheds light on the organization and maintenance of signaling compartments in cells.

the influence of pde hydrolytic activity on camp gradients

universität würzburg, pharmakologie und toxikologie, würzburg, germany

phosphodiesterases (pdes) are a family of enzymes that degrade cyclic amp (camp) and cyclic gmp (cgmp) to their respective monophosphates. although several pdes have been shown to play an important role in a wide variety of physiological and pathological processes, their complexity and function in cell signalling is only beginning to be understood. it is especially astonishing that eukaryotes express more than 100 different pde isoforms while their single function is to degrade camp, cgmp or both. in recent years, a large body of evidence has suggested that pdes (especially isoforms of the pde families 3, 4 and 5) are key players in establishing signalling compartments.
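the intuition behind such compartments follows from the steady-state reaction-diffusion balance d·c'' = k·c, whose solutions decay around a camp sink over the characteristic length λ = sqrt(d/k). a back-of-the-envelope sketch; the diffusion coefficient and the effective degradation rate below are assumed, literature-range values, not numbers from the abstracts:

```python
import math

def decay_length_nm(d_um2_per_s, k_per_s):
    """characteristic decay length of a camp sink, lambda = sqrt(d/k),
    from the steady-state reaction-diffusion balance d*c'' = k*c."""
    return math.sqrt(d_um2_per_s / k_per_s) * 1e3  # µm -> nm

# assumed, literature-range numbers (not from the abstracts): free
# cytosolic camp diffusion on the order of 300 µm²/s; an effective
# degradation rate of 10 /s; and diffusion slowed 10,000-fold inside
# a hypothetical 'microstructure'
fast = decay_length_nm(300.0, 10.0)
slow = decay_length_nm(0.03, 10.0)

print(f"free diffusion:   lambda ≈ {fast:.0f} nm")
print(f"slowed diffusion: lambda ≈ {slow:.0f} nm")
```

with free cytosolic diffusion the sink extends over microns, i.e. hydrolysis alone cannot carve out a nanodomain; only when diffusion is slowed by orders of magnitude does λ shrink to the tens-of-nanometres scale, in line with the conclusion that gradients require both pde activity and restricted camp diffusion.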
these so-called microdomains are as yet unspecified regions in cells where the concentrations of camp and cgmp are higher or lower than in the bulk cytosol. in a companion abstract (bock & lohse) we provide evidence that camp nanodomains, i.e. regions of low camp concentration, exist in cells in the direct vicinity of pde4. however, the mechanisms by which pdes establish and maintain camp gradients are largely unknown. here, we study whether establishing camp gradients is a general role of pdes and, moreover, whether the size of camp nanodomains mainly depends on pde hydrolytic activity. by fusing an ultra-fast pde2a3 (vmax = 120 µmol/min/mg) to a camp fret sensor (epac1-camps), we monitor camp concentrations in the direct vicinity of pde2a3. in comparison to pde4, we show that pde2a3, albeit displaying a high camp turnover, establishes only a small camp gradient both in cytosol preparations of transfected hek cells and in living cells. interestingly, this gradient can be increased by deleting the n-terminal regulatory domains while maintaining fast camp turnover. biochemical mapping of the camp gradient gives an estimate of the size of the nanodomains. taken together, our data suggest that establishing camp gradients does not exclusively depend on pde hydrolytic activity.

arglabin is a plant-derived sesquiterpene lactone used for cancer therapy in kazakhstan and russia. the signaling pathways targeted by arglabin are poorly understood. we have isolated arglabin by high performance liquid chromatography from a methanolic extract of artemisia glabella, a plant endemic to kazakhstan. mass spectrometric analyses confirmed the chemical structure and the purity of the isolated arglabin. in j774 macrophages, arglabin strongly induced accumulation of lc3 type ii protein in the absence of inflammasome activators, and also in cells activated with lps and cholesterol crystals.
in addition, arglabin induced clustering of lc3-ii at autophagosomal membranes, as evidenced by its punctate pattern in confocal microscopic images of arglabin-treated macrophages, which is a characteristic sign of autophagy. since autophagy activation leads to increased degradation of nlrp3 and pro-interleukin (il)-1β, we further analyzed whether arglabin inhibits the nlrp3 inflammasome. arglabin reduced the expression of nlrp3 and pro-il-1β, inhibited the activation of caspase-1, and reduced the release of mature il-1β by lps- and cholesterol-crystal-stimulated macrophages, consistent with inhibition of the nlrp3 inflammasome. intraperitoneal injection of arglabin into female apoe2.ki mice fed a high-fat diet resulted in significantly decreased plasma levels of proinflammatory il-1β. moreover, arglabin markedly reduced mean lesion areas in the sinus and whole aorta in mice. thus, arglabin may represent a new promising drug to treat diseases associated with inflammasome activation, e.g. atherosclerosis. this work was supported in part by a grant from the nouvelle société francophone d'athérosclérose (nsfa).

recently, we found that activation of the proteinase-activated receptor (par) 2 stimulates renin release in the isolated perfused kidney model. therefore, in the current experiments we determined the response of plasma renin concentration (prc) to acute intraperitoneal administration of the par2-activating peptide sligrl (100 µg/kg), hydralazine (2 mg/kg), isoproterenol (10 mg/kg), losartan (3 mg/kg) and furosemide (40 mg/kg) in conscious wild-type (wt) and par2-deficient mice. prc was measured in plasma obtained by tail vein puncture. renal renin expression was determined by quantitative rt-pcr. renal protein expression was measured by immunohistochemistry. on a control diet (0.6% nacl), plasma renin concentration (in ng angiotensin i per ml per hour) was significantly lower in par2-deficient mice than in wild-type mice (190 ± 35 versus 380 ± 91).
renin mrna expression was 50 ± 9% of wt. renin-expressing cells were located at the juxtaglomerular position, and renal renin protein expression was lower in par2-deficient mice. as measured by the tail-cuff method, systolic blood pressure was not different between par2 (-/-) and wt mice. administration of sligrl increased renin secretion about 6-fold (p<0.01). acute stimulation of renin release by furosemide, isoproterenol, losartan and hydralazine caused significant increases in plasma renin concentration in both par2 (-/-) and wt mice. the absolute changes (delta prc) were similar (3780 ± 770, 2230 ± 400, 2780 ± 70, 1390 ± 460 in wt, and 2320 ± 270, 2530 ± 250, 3010 ± 210, 1230 ± 470 in par2 (-/-)). in conclusion, chronic absence of par2 reduces basal renin expression and renin release. however, par2 deficiency does not alter renin release in response to typical stimuli for renin secretion. therefore, par2 does not appear to be a mandatory and specific requirement for acute regulatory responsiveness. increased myofilament ca2+ sensitivity could be the underlying cause of diastolic dysfunction. we evaluated acute effects of epigallocatechin-3-gallate (egcg), which has been shown to decrease myofilament ca2+ sensitivity, on cardiac myocyte contractility and the force-ca2+ relationship of skinned cardiac muscle strips in an hcm mouse model with left ventricular hypertrophy and both systolic and diastolic dysfunction. methods: the hcm mouse model used in this study carries a point mutation in the cardiac myosin-binding protein c gene at the homozygous state (mybpc3-targeted knock-in; ki). we isolated ventricular myocytes from adult ki and wt mice and analyzed sarcomere shortening and ca2+ transients at 37 °c under 1 hz pacing using the ionoptix system in the absence or presence of egcg (1.8 µm). furthermore, force-ca2+ relationships of skinned cardiac muscle strips of ki and wt mice were obtained ± egcg (30 µm).
results: at baseline and in the absence of fura-2, ki cardiomyocytes displayed higher sarcomere shortening. type-1 serine/threonine protein phosphatase (pp1) comprises a family of enzymes that dephosphorylate cardiac regulatory proteins, thereby modulating ca2+ handling and contractility. all pp1 heterodimers possess a catalytic subunit, which is selectively inhibited by inhibitor 2 (i-2). our group has shown that heart-directed overexpression of a truncated, constitutively active form of i-2 resulted in improved basal ca2+ handling and contractility. in contrast, chronic pressure overload by transverse aortic constriction exacerbated the progression of cardiac remodeling and heart failure in transgenic mice. in the present study, we tested whether overexpression of a full-length form of i-2, regulated by gsk3-dependent phosphorylation at thr72, resulted in comparable functional alterations using a model of induced heart failure. for this purpose, transgenic (tg) and wild-type (wt) mice were subjected to chronic application of isoprenaline (iso, 30 mg/kg/d) via osmotic minipumps. iso-stimulated mice were compared to mice treated with 0.9% nacl (n=5). after one week of iso administration, cardiac hypertrophy was comparable in tg and wt. ca2+ transients were measured in isolated, indo-1-loaded myocytes. the peak amplitude of [ca]i was reduced by 39% in tg nacl compared to wt nacl (p<0.05), whereas chronic iso application was associated with comparable effects in tg and wt. [ca]i decay kinetics were comparable in nacl-treated groups but hastened by 25% in tg iso compared to wt iso (p<0.05). consistently, sr ca2+ load was diminished by 28% in tg nacl compared to wt nacl (p<0.05). chronic iso stimulation led to an unchanged sr ca2+ content in tg and wt myocytes. biochemical analyses revealed that chronic β-adrenergic stimulation was accompanied by a more than 4-fold higher phospholamban phosphorylation at ser16 in tg (p<0.05).
thus, these findings suggest that overexpression of i-2 is able to reduce the progression of heart failure through an improvement of myocyte ca2+ handling. the ligand-activated farnesoid x receptor (fxr) is a nuclear receptor highly expressed in gastrointestinal and metabolic tissues, such as the duodenum, jejunum, ileum, colon and liver, but also present in lower amounts, for instance, in macrophages. the endogenous agonists of this receptor are bile acids, with the primary bile acid chenodeoxycholic acid as the most active one. activation of fxr regulates the transcription of target genes relevant in bile acid homeostasis, glucose and lipid metabolism, liver protection, inflammation, and cancerogenesis. agonists of fxr have been discussed as possible therapeutic options for the treatment of obesity and the metabolic syndrome. atherosclerosis is the main pathology underlying cardiovascular diseases and often occurs side-by-side with the metabolic syndrome. cholesterol deposition and the formation of cholesterol-loaded foam cells from macrophages lead to the formation of atherosclerotic plaques. this can be prevented by stimulation of cholesterol efflux from macrophages. based on leoligin, a lignan-type secondary plant metabolite naturally occurring in leontopodium alpinum cass., 168 derivatives were synthesized and subjected to an fxr pharmacophore-based in silico screening. testing of 56 virtual hits in a luciferase-based fxr transactivation assay yielded one compound with promising activity on fxr. moreover, the heterodimer partner of fxr, rxrα, was not activated by this leoligin derivative in a luciferase-based rxrα assay. in addition, this compound was able to increase cholesterol efflux in thp-1 macrophages without affecting cell viability. western blot experiments revealed an increase in atp-binding cassette transporter a1 (abca1) expression in human thp-1 macrophages treated with this leoligin derivative.
the transporters abcg1 and scavenger receptor class b type i (sr-bi), which also play a key role in macrophage cholesterol efflux, will be investigated next. moreover, the effect of the leoligin derivative on activation of the liver x receptor, the nuclear receptor responsible for upregulation of these transporters, remains to be studied. based on these data, further characterization of the molecular mechanism underlying the described effects will provide valuable insights into a possible crosstalk between macrophage cholesterol efflux and fxr activation. background: the rho-associated kinases rock1 and rock2 are serine/threonine kinases that are downstream targets of the small gtpases rhoa, rhob, and rhoc. rock1 and rock2 are known to play a pivotal role in the pathogenesis of myocardial fibrosis. however, their specific function in cardiac fibroblasts (cf) remains unclear. remodelling of the diseased heart results in the transition of fibroblasts to a myofibroblast phenotype, exemplified by increased proliferation, migration rate and synthesis of extracellular matrix (ecm) proteins. therefore, we sought to investigate whether rock protein signalling intermediates have an impact on cellular characteristics, intracellular protein expression and mechanical properties in cf and engineered tissues. methods: neonatal cardiac fibroblasts were isolated from wild-type rats, and downregulation of rock1 and rock2 by 75% was achieved by lentiviral transduction or transfection. wild-type fibroblasts were treated with 10 µm fasudil or 3 µm h1152p for general rock inhibition and 3 µm slx-2119 for inhibition of rock2. protein expression and modification were determined by immunoblot analysis, gene expression by qpcr analysis, cf morphology and the localisation of cytoskeletal proteins by immunofluorescence analysis, cell proliferation by automated nuclei counting, cell migration on a planar surface by live-cell imaging, and rigidity of engineered tissues by rheological measurement.
results: our results show that both rock1 and rock2 influence cf morphology, gene expression, proliferation and migration. knockdown and inhibition of rocks were associated with changes in cf morphology accompanied by a disorganization of higher-order actin structures, including stress fibers and geodesic domes. moreover, knockdown of rock1 and rock2 in cf increased adhesion velocity, whereas proliferation was attenuated. interestingly, downregulation of rock2, but not of rock1, led to significantly decreased migration velocity and distance, suggesting an isolated principal role for rock2 in cardiac fibroblast migratory behavior. analysis of a three-dimensional engineered tissue model composed of cardiac fibroblasts (engineered connective tissue, ect) suggested that rocks are involved in the regulation and turnover of the extracellular matrix (ecm) and thus influence the viscoelastic properties of engineered tissues. destructive tensile strength measurements in ect treated with rock inhibitors showed that rigidity was significantly reduced compared to control tissues. rna sequencing of ect treated with the rock inhibitor h1152p and qpcr analysis of cf with downregulation of rock1 and rock2 showed that both rocks are involved in the regulation of ecm proteins, such as collagens 4a2, 6a, and 8a1, biglycan, decorin, elastin and its respective degrading enzyme mmp12. conclusion: this study demonstrates that rock signalling controls myofibroblast characteristics of cf via remodelling of the cytoskeleton and the ecm. background: regulation and fine-tuning of gene transcription in cardiomyocytes (cms) is a centerpiece of cardiac development, function, and disease. in order to obtain authentic data, cell type-specific analyses are indispensable. recently, high-purity isolation protocols for cm nuclei were established [bergmann, exp cell res, 2011] and employed for detailed genetic and epigenetic studies on cardiac gene transcription [gilsbach, nat commun, 2014].
however, corresponding protein analyses, which bridge from transcriptional control to cm function, are still lacking. therefore, we aimed to map the landscape of nuclear protein expression in newborn and adult mice in order to complement and extend our epigenetic studies. methods and results: cardiac nuclei were isolated from homogenized frozen hearts of adult and p1 mice by sucrose gradient centrifugation. magnetic-assisted cell sorting (macs) with pcm-1 as a nucleus-specific marker was used to enrich cm nuclei to >95%. proteins were extracted from nuclear lysate with 2% sds. quantitative protein data were obtained from silac-based liquid chromatography-tandem mass spectrometry (lc-ms/ms) experiments with an ltq orbitrap xl mass spectrometer after in-gel digestion with trypsin. nuclear protein extracts from 3 murine cell lines served as the silac (lys8/arg10)-labeled internal standard. finally, protein data were correlated with corresponding mrna data obtained by rna sequencing. we identified 1041 proteins, 688 of which are annotated to the nucleus. 21%/23% (adult/p1) of nuclear proteins are annotated as dna-binding. 1.5%/1.4% belong to transcription factor complexes; 7.3%/7.5% are able to bind transcription factors. 7.9%/8.3% have chromatin modification functions; 4.3%/4.4% modify histones. nuclear-enriched go terms include mrna processing and transport, transcription, nucleosome assembly and protein degradation. 71% of proteins are shared between p1 and adult nuclei. 142 proteins are exclusive to p1, 34 are found only in adult. 93 proteins are enriched (1.3-fold increase in abundance) in p1 hearts, 89 proteins in adult hearts. proteins related to heart development, gene silencing, and dna replication are more prominent in or exclusive to newborn mice. adult nuclei strongly express proteins related to regulation of actin fibers and cm function, proteins involved in protein degradation, and chaperones.
although high mrna expression increases the chance of protein identification, a significant correlation between mrna and protein levels could not be observed on a genome-wide scale. conclusions: we present a comprehensive and specific protein landscape of newborn and adult cm nuclei. young cm nuclei appear as a developing tissue, show the capacity for proliferation, and indicate ongoing alterations in gene expression. adult cm nuclei prominently display a focus on regulation of contractile fibers and cm function, as well as chaperones and proteasomal proteins, indicative of their demanding workload. background: fibrosis is a hallmark of many myocardial pathologies and contributes to distorted organ architecture and function. recent studies have identified premature senescence as a regulatory mechanism of tissue fibrosis. however, its relevance in the heart remains to be established. objective: to investigate the role of premature senescence in myocardial fibrosis. methods: murine models of cardiac disease and human heart biopsies were analyzed for characteristics of premature senescence and fibrosis. results: the senescence markers p21cip1/waf1, senescence-associated β-galactosidase (sa-β-gal) and p16ink4a were increased 2-, 8- and 20-fold (n=5-7; p<0.01), respectively, in perivascular fibrotic areas after transverse aortic constriction (tac) compared to sham-treated controls. similar results were observed in cardiomyocyte-specific β1-adrenoceptor transgenic mice and human heart biopsies. senescent cells were positive for vimentin (92 ± 0.9%), platelet-derived growth factor receptor α (94 ± 0.9%) and α-smooth muscle actin (71 ± 2.3%), identifying myofibroblasts as the predominant cell population undergoing premature senescence in the heart. conclusion: our data provide first evidence for an essential role of premature senescence of myofibroblasts in myocardial fibrosis.
it is tempting to speculate that pharmacological modulation of premature senescence might provide a novel therapeutic target for anti-fibrotic therapies in the heart. introduction: the endocannabinoid system is increasingly studied in cardiac research due to its role in fibrosis, inflammation and cell fate modulation. deregulation of this system has been implicated in myocardial infarction (mi) and consequent heart failure development. a recent study suggests that cannabinoid receptor 1 (cb1) inhibition improves cardiac function and reduces adverse remodeling after cardiac stress, but the exact underlying molecular mechanisms of these beneficial effects are still unknown. micrornas (mirnas, mirs) provide a complex layer of post-transcriptional regulation modulating key biological processes such as tissue remodelling in heart failure. the aim of the present study was to explore microrna pathways in the chronic effect of cb1 receptor inhibition after angiotensin-induced fibrosis and left ventricular remodeling. methods and results: adverse cardiac remodelling was induced in mice by chronic administration of angiotensin ii (angii, 1.5 mg/kg/day) with osmotic minipumps for 14 days. treatment with a cb1 antagonist or vehicle was performed every second day during the angii administration period. hemodynamic parameters were measured by echocardiography and cardiac pressure-volume catheter, and tissue samples were taken for molecular and histological analysis. after two weeks of angii infusion, left ventricular dysfunction was prevented by cb1 antagonist treatment, as shown by significant improvements in the myocardial performance index and end-diastolic pressure values. at the tissue level, the anti-fibrotic effects of cb1 antagonist treatment were confirmed histologically and by expression analysis of pro-fibrotic genes. these beneficial effects were also observed in cb1 ko mice and in an aging mouse model.
the particular role of tissue fibroblasts in angii-induced cardiac fibrosis was explored further. primary cardiac fibroblasts (cf) from each experimental group were isolated and analysed by next-generation deep rna sequencing to identify differentially regulated micrornas. the microrna-181a/b family was downregulated by in vivo angii delivery and, conversely, upregulated after cb1 antagonist treatment, and foxb1 (a direct target of the mir-181a family) was differentially regulated, suggesting a possible mechanism of action for the benefits of cb1 receptor inhibition. conclusion: we found that in angii-induced cardiac remodelling, lv function is preserved by chronic cb1 antagonist treatment and that cardiac fibrosis is reduced with concomitant downregulation of fibrogenic genes. also, the cf-enriched mir-181a/b family seems to be sensitive to cb1 antagonist treatment, thereby affecting cardiac fibrosis. the current study employs a novel concept regarding chronic cb1 inhibitor treatment and may provide important details and novel targets for anti-fibrotic approaches in heart failure. background: low homoarginine (harg) was recently identified as an emerging biomarker for stroke, myocardial infarction, and heart failure in clinical and epidemiological studies. harg competes with arginine as a substrate for nitric oxide (no) synthase and weakly inhibits arginase. both mechanisms might lead to increased no formation in vivo. the aim of this study was to investigate whether harg affects the development of atherosclerosis as a potential underlying mechanism of cardiovascular diseases. methods: harg-deficient agat-knockout (agat -/-) and wild-type (wt) mice were crossed with apolipoprotein e (apoe)-deficient mice and fed a high fat diet (hfd) for three months to induce atherosclerosis. harg plasma concentrations were determined using mass spectrometry.
en face preparation of aortae followed by oil red o staining of atherosclerotic plaques and quantitative evaluation of plaque areas was performed for female mice. endothelial function of male mice was tested with acetylcholine (ach) and nitroglycerin (ntg) after contraction with prostaglandin f2α. background: the direct oral thrombin inhibitor dabigatran etexilate (dabigatran) is used for the prevention and treatment of venous thromboembolism. obese patients as well as patients with type 2 diabetes mellitus (t2dm) have an increased risk of thrombotic disease and show enhanced thrombin generation. besides its role in blood coagulation, thrombin is known to be involved in many pro-inflammatory processes. in obesity, adipose tissue (at) inflammation plays a crucial role in the development of insulin resistance and t2dm and contributes to atherosclerosis development. the aim of the present study was to analyse the effects of dabigatran on at inflammation in a mouse model of diet-induced obesity in the context of accelerated atherosclerosis. methods: 10-week-old female low-density lipoprotein receptor-deficient (ldlr -/-) mice were fed a high-fat diet containing 5 mg/g dabigatran or respective placebo for 20 weeks. results: analysis of visceral at revealed a significant increase in adipocyte size in dabigatran-treated mice, although body weight, fat mass, glucose tolerance, and insulin resistance were unchanged between groups. this effect seemed to be directly mediated by thrombin, as treatment with another thrombin inhibitor (argatroban) also resulted in the development of adipocyte hypertrophy. accordingly, in vitro studies in 3t3-l1 cells revealed an inhibitory effect of thrombin on lipid accumulation in adipocytes. the amount of pro-inflammatory cd11c-positive macrophages (atms) in visceral adipose tissue was significantly reduced, and the secretion of pro-inflammatory il-6 from visceral at was significantly lower in dabigatran-treated animals.
in vitro studies using 3t3-l1 cells and primary bone marrow-derived macrophages revealed that the changes in macrophage polarization were not directly mediated by thrombin, but indirectly by a change in the secretion profile of adipocytes. a similar reduction in pro-inflammatory macrophages as detected in at could also be observed in the aortic wall of dabigatran-treated mice. conclusions: the direct thrombin inhibitor dabigatran inhibits at inflammation and the accumulation of pro-inflammatory macrophages in visceral at as well as in the aortic wall of ldlr -/- mice. these anti-inflammatory effects might contribute to the known atheroprotective effects of dabigatran. background: sarco/endoplasmic reticulum ca2+-atpase (serca2a) and its inhibitor phospholamban (pln) are critical determinants of cardiomyocyte calcium cycling and hence cardiac contractility. pln exists in an equilibrium between mono- and pentamers. while monomeric pln has been implicated in direct serca2a inhibition, a functional role for the pentamers remains ambiguous. recently it has been shown that pln pentamers modulate pka-dependent phosphorylation of pln monomers in vitro [1]. using transgenic mouse models, we now investigated the effects of pln pentamers on pln phosphorylation, myocyte ca2+ cycling and contractility in cardiac myocytes. methods: phosphorylation patterns of pln were analyzed by western blot using phospho-specific antibodies as well as phosphate affinity sds-page. to assess phosphorylation at baseline, pln knockout (pln-ko) mice expressing either wild-type pln (tgpln) or the solely monomeric pln afa mutant (tgafa) transgene were deeply anesthetized, whereas pln phosphorylation by pka was induced using the beta-adrenergic agonist isoproterenol. the consequences for myocyte ca2+ kinetics were measured in isolated, fura-2-loaded and electrically paced (0.5 hz) cardiomyocytes as the time to 50% decay of the ca2+ signal (t50%).
the time to 50% baseline of sarcomere length (t50% baseline) characterized the speed of myocyte relaxation. results: under basal conditions, we found stronger phosphorylation of pln pentamers than monomers, pointing to pentamers as the preferred pka target. pln afa monomers showed 3.3-fold stronger phosphorylation signals when pentamers were absent (p<0.0001). consistent with the higher basal phosphorylation of pln afa monomers, measurements of calcium kinetics revealed a faster decay of calcium signals in tgafa compared with tgpln cardiomyocytes (t50% [ms]: 92±8 and 122±8, respectively, p<0.05). notably, t50% of pln-ko myocytes was 79±5 ms (p = not significant versus tgafa), indicating that the strong basal phosphorylation of monomers leads to near-complete inactivation of pln in tgafa. upon stimulation of pka, pln monomer phosphorylation and calcium kinetics of tgpln and tgafa mouse myocytes were indistinguishable, because monomer phosphorylation and the speed of cytosolic ca2+ clearance strongly increased only in tgpln. acceleration of sarcomere relaxation upon pka stimulation was also more pronounced in tgpln than in tgafa and pln-ko myocytes (change of t50% baseline [ms]: 32±8 in tgpln versus 7±4 in tgafa and 5±7 in pln-ko, p<0.05). even high-dose isoproterenol induced phosphorylation of only about half of all protomers of pln pentamers, suggesting a high capacity of pentamers to attenuate monomer phosphorylation by acting as a phosphate scavenger. conclusions: our data demonstrate that pln pentamers reduce basal phosphorylation of pln monomers in myocytes. nevertheless, pentamers allow strong phosphorylation of monomers during beta-adrenergic stimulation, thereby extending the range within which pln can modify diastolic ca2+ kinetics and myocyte relaxation. therapeutic inhibition of micrornas is a promising field in cardiovascular research. vector-based overexpression of an inhibitor construct (e.g.
microrna sponge) is one approach to achieve sustained inhibition with potential applicability in humans. yet the strength of expression achieved by currently available gene therapy vectors (e.g. aavs) in humans remains a limiting factor; therefore, inhibitory constructs with increased potency would improve this approach and bring it closer to therapeutic application. micrornas are believed to discriminate between potential binding sites based on additional factors provided by the endogenous untranslated regions at the 3' end of mrnas (3' utrs) and the proteins bound to them. the aim of this project was to investigate whether selected endogenous 3' utrs can likewise increase the potency of microrna inhibitory constructs. to this end, several known targets of a cardiac microrna were selected and their relative potencies of microrna inhibition were compared. to accurately assess changes in the activity of the respective micrornas, we constructed dual-fluorescent reporter plasmids and established an automated fluorescence microscopy acquisition and analysis pipeline. among several tested utr constructs we found one which strongly increased the inhibitory potency of the microrna binding site in primary neonatal rat cardiac myocytes (nrcms). furthermore, a similar effect was obtained when the binding site was exchanged for that of a different microrna and analyzed in the nih-3t3 fibroblast cell line. we therefore conclude that endogenous utr contexts can indeed be successfully applied to increase the potency of vector-based microrna inhibitors. the g-protein-coupled protease-activated receptor-2 (par2) regulates inflammatory responses including monocyte migration and cytokine release. par2 is activated by the coagulation factor xa or by the tissue factor (tf)/factor viia complex. the immunomodulatory lipid sphingosine-1-phosphate (s1p) is released from activated platelets and interlinks blood coagulation and inflammation.
this study investigates the impact of s1p on the expression of par2, tf and the anticoagulant protein thrombomodulin (tm) in human monocytes and after pma-induced differentiation into macrophage-like cells. the monocytic thp1 and u937 cell lines were used as human monocyte models. primary monocytes were isolated from healthy volunteers using a magnetic bead-based monocyte isolation kit. expression of par2, tf and tm was measured by quantitative real-time pcr and western blotting. differentiation of monocytes into macrophage-like cells was induced by incubation with 50 ng/ml pma (phorbol 12-myristate 13-acetate) over 72 h. calibrated automated thrombin (cat) generation was determined in platelet-rich plasma from healthy volunteers. in thp1 and u937 cells, s1p induced a significant time- (1 to 24 h) and concentration-dependent (0.1 to 10 µm) upregulation of par2 mrna and total protein expression. par2 total protein was maximally upregulated (about 1.5-fold, n=7) with 1 µm s1p after 16 h of incubation. comparable effects were seen in human primary monocytes. in comparison, tf mrna and protein were only marginally elevated in non-differentiated thp1 monocytes, and tm was not regulated by s1p. after differentiation of cultured monocytic cells with pma into adhesive macrophage-like cells, incubation with s1p resulted in a significant time- (1 to 24 h) and concentration-dependent (0.1 to 10 µm) upregulation of tf expression within 3 to 6 h of incubation. conversely, par2 total protein expression was reduced by about 50% after 24 h of s1p incubation. the expression of tm was again not affected. the generation of thrombin in platelet-rich plasma was determined using pma-differentiated thp1 cells as the tf source. time-dependent incubation with s1p (1 µm) in differentiated monocytes shortened the time to onset (lag time) of thrombin generation in plasma from 8.0±1.6 to 6.5±1.4 min and elevated the total thrombin generation capacity from 814±130 to 1286±129 nm.
peak thrombin formation was elevated from 66±31 to 130±30 nm/min (control versus s1p for 6 h, mean±sd, n=5). these data suggest that s1p induces enhanced expression of par2 in undifferentiated human monocytes, while tf and tm are not regulated. in differentiated monocytes/macrophages, s1p upregulates tf expression but attenuates par2 levels. since par2 is involved in regulating cell migration, s1p may stimulate a phenotypic switch from a migratory to a procoagulant phenotype during differentiation of monocytes into macrophages. the pro-inflammatory cytokine interleukin-6 (il-6) plays an important role in vascular inflammation. coagulation factors such as activated factor x (fxa) may regulate local inflammatory responses of the vessel wall. in this study we investigated whether fxa regulates il-6 expression and secretion in human vascular smooth muscle cells (smc) as well as in failed thrombosed vein grafts. we also analysed its possible prothrombotic impact on monocytes. il-6 mrna expression was determined in primary human saphenous vein smc by taqman® real-time pcr. secretion into the cell culture media was measured by elisa. tissue factor (tf) expression in monocytic thp-1 cells was determined by western blot. immunostainings for il-6 and the smc marker smoothelin were performed on paraffin-embedded tissue sections from failed thrombosed vein grafts and control veins. incubation of cultured human venous smc with fxa (30 nm) induced a time-dependent (3-16 h) increase in il-6 mrna expression. maximum expression, a 7.2±3.3-fold increase, was observed at 6 h (mean±sd, n=5, p<0.05). incubation with an inhibitor of p38 map kinase (sb203580, 10 µm) or pi3k (ly294002, 10 µm) significantly attenuated fxa-induced il-6 mrna expression (n=5). inhibition of p42/44 mapk, rho kinase or nf-κb had no significant effect.
stimulation with fxa for 24 h resulted in markedly increased il-6 secretion into the smc culture media, from 0.6±0.2 to 1.1±0.3 ng/ml (p<0.05, n=4). stimulation of thp-1 cells with il-6 (1 ng/ml) induced a time-dependent (6-24 h) increase of up to 2.1±0.6-fold (p<0.05, n=5) in tf protein expression. immunostainings of tissue samples from failed vein grafts revealed enhanced il-6 expression in smc-rich regions of the vessel walls compared to non-thrombosed control veins, suggesting elevated il-6 regulation in thrombosed vein grafts in vivo. in conclusion, fxa induced il-6 expression and secretion in venous smc, which may be regulated via p38 and pi3k signaling. il-6 enhanced tf expression in thp-1 monocytes and was found in smc-rich regions of failed thrombosed vein grafts. fxa-stimulated il-6 release may be involved in regulating local pro-thrombotic processes during vascular inflammation and possibly vein graft failure. rationale: the transcription factors camp-response element binding protein (creb) and camp-responsive element modulator (crem) bind to camp response elements (cres) and mediate camp-dependent gene regulation. suppression of cre-mediated transcription is linked to atrial remodeling in genetic mouse models. inhibition of creb target genes is associated with atrial fibrillation (af) susceptibility in patients. creb and crem affect histone acetylation by recruiting the creb-binding protein (cbp/p300). the histone acetyltransferase (hat) activity of cbp facilitates gene transcription by loosening chromatin structure. histone deacetylases (hdacs) catalyze the inverse reaction: histone deacetylation with consecutive gene silencing. mice with heart-directed expression of the human cardiac isoform crem-ib∆c-x (tg) show atrial dilatation and morphological and physiological alterations in atria preceding spontaneous-onset af. the hdac inhibitor (hdaci) valproic acid (vpa) reduced atrial weight and af incidence in tg mice.
here we tested the hypothesis that vpa attenuates the structural remodeling in tg atria by reversing transgene-induced changes in atrial gene regulation. methods and results: tg and wt mice were treated from week 5 to 16 with vpa (0.71% in drinking water, ad libitum) or vehicle (veh). atrial ultrastructure was studied by electron microscopy (em) (weeks 7 and 16). veh-treated tg atria showed a progressive disorganisation of sarcomeres (sm) with fewer mitochondria and more collagen fibers between cardiomyocytes compared to veh-treated wt atria. the fraction of sm structure in veh-treated tg atria was significantly reduced compared to veh-treated wt atria (week 7: tg-veh 33±2%, wt-veh 47±1%; week 16: tg-veh 19±2%, wt-veh 44±1%, p<0.05). vpa led to a more organized ultrastructure and reversed, at least partially, the degradation of the sm in the tg atria (tg-vpa at week 7: 40±2%, tg-vpa at week 16: 34±1%, p<0.05 vs. tg-veh). the structure of wt atria was not affected by vpa. we further analyzed the protein abundance profiles in all groups (wt-veh, wt-vpa, tg-veh, tg-vpa) using lc-ms/ms. between veh-treated genotypes (tg-veh vs. wt-veh, p<0.05), 998 proteins were significantly changed, while 854 proteins were differentially abundant between vpa-treated groups (tg-vpa vs. wt-vpa, p<0.05). 109 proteins were regulated by vpa in wt atria (wt-vpa vs. wt-veh, p<0.05), whereas vpa affected 525 proteins in tg atria (tg-vpa vs. tg-veh, p<0.05), of which 43 proteins were common. 295 prominently changed proteins between veh-treated tg and wt atria were significantly regulated by vpa in tg atria in the opposite direction. a functional pathway analysis showed that pathways activated in tg atria, such as cardiac fibrosis and mitochondrial dysfunction, were inhibited by vpa treatment. conclusion: similar to human af, crem-tg mice present atrial dilatation, ultrastructural changes, impaired conduction and spontaneous af.
while vpa had little to no effect in wt mice, valproate improved the tg phenotype by interfering with pathways involved in structural remodeling. this supports the idea that hdac inhibition by vpa antagonizes effects of crem expression in atria.

in isolated mouse cardiac preparations, histamine is ineffective regarding inotropic or chronotropic effects, presumably because of a lack of receptor protein expression. on the other hand, histamine can exert positive inotropic and chronotropic effects in humans via cardiac histamine h2-receptors. hence, we have generated transgenic mice (tg) which overexpress the human h2-receptor specifically in cardiomyocytes. in isolated left and right atrial preparations of these mice, we investigated histamine metabolism on a functional level. preparations of wild-type mice (wt) served as controls. histamine induced positive inotropic effects (pie) and positive chronotropic effects (pce) in left and right atria of tg mice, respectively, but not in wt. interestingly, the inhibitor of histamine oxidation, aminoguanidine (1 mm), shifted the concentration-response curves for the pie of histamine from ec50 = 110 nm to 37 nm (p<0.05). furthermore, the nonspecific inhibitor of monoamine oxidase, tranylcypromine (10 µm), shifted the pie of histamine from ec50 = 70 nm to 38 nm and increased the efficacy of histamine for the pie (p<0.05). these data indicate that exogenously applied histamine is subject to degradation in the mouse heart by two different pathways, namely via diamine oxidase and monoamine oxidase. drugs that inhibit these enzymes could conceivably alter cardiac function also in the human heart.

protein phosphorylation by kinases and dephosphorylation by protein phosphatases has a crucial function in cell signal cascades.
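the ec50 shifts in the histamine abstract above can be read through the standard hill concentration-response model; this is a generic sketch, not the authors' fitting procedure, and the hill coefficient of 1 is an assumption:

```python
def hill_response(conc_nm, ec50_nm, n_hill=1.0, emax=1.0):
    """Fractional response of a standard Hill concentration-response curve.
    n_hill = 1 is assumed; the abstract does not report a Hill coefficient."""
    return emax * conc_nm**n_hill / (ec50_nm**n_hill + conc_nm**n_hill)

# EC50 values from the histamine abstract: 110 nM without and 37 nM with
# the diamine oxidase inhibitor aminoguanidine. A leftward EC50 shift means
# a larger response at the same histamine concentration.
for ec50 in (110.0, 37.0):
    r = hill_response(50.0, ec50)
    print(f"ec50 {ec50:>5.0f} nm -> fractional response at 50 nm: {r:.2f}")
```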
it has been shown that cardiomyocyte-specific overexpression of the serine/threonine protein phosphatases pp1, pp2a, pp2b (calcineurin) and pp5 in mice leads to cardiac hypertrophy and alters cardiac function. to examine the function of another important protein phosphatase in the heart, we established a mouse model overexpressing protein phosphatase 2cβ (pp2cβ) under control of the α-myosin heavy chain promoter. cardiac overexpression was demonstrated by western blotting. like other serine/threonine phosphatases, pp2cβ can lead to cardiac hypertrophy: in transgenic mice (tg), relative ventricular weight was increased (4.24 ± 0.14 mg/g) compared to wild-type (wt) littermates (3.78 ± 0.20 mg/g; p<0.05), whereas weights of right and left atria were unchanged. therefore, relative heart weight was increased in tg (4.78 ± 0.17 mg/g) vs. wt (4.13 ± 0.22 mg/g; n=8-12; 10-11 months of age; p<0.05). left ventricular function, measured in vivo by echocardiography under isoflurane anesthesia, was diminished in tg compared to wt (ejection fraction: 58.33 ± 3.16 % (tg) versus 76.94 ± 0.92 % (wt); n=12-16; 8-11 months; p<0.05; fractional shortening: 30.84 ± 2.02 % (tg) versus 44.94 ± 0.87 % (wt); n=12-16; 8-11 months; p<0.05). the left ventricle was dilated (systolic diameter: 2.78 ± 0.21 mm (tg) versus 1.84 ± 0.06 mm (wt); n=12-16; 8-11 months; p<0.05; diastolic diameter: 3.97 ± 0.19 mm (tg) versus 3.33 ± 0.07 mm (wt); n=12-16; 8-11 months; p<0.05). in contrast, atrial function, measured as the response to β-adrenergic stimulation in isolated left and right atrial preparations, was unchanged in tg vs. wt. in summary, our results indicate that pp2cβ overexpression can lead to ventricular dysfunction and hypertrophy. the underlying signal transduction pathways need to be elucidated.

the insulin-like growth factor binding protein 5 (igfbp5) - a potential developmental gene - is regulated upon cardiac stress. m.
wölfer. background: cardiac remodeling is a complex biological adaptation process of the failing heart accompanied by a re-activation of embryonic gene expression, whose pathophysiological relevance is so far unclear. we and others showed that insulin-like growth factor binding protein 5 (igfbp5) is expressed in the early pre-cardiac region of mouse embryos and that its up-regulation impairs cardiac progenitor differentiation. igfbp5 functions as an extracellular growth factor binding protein for igf and also has igf-independent activities. the role of this factor in the context of cardiac remodeling is still unknown. the aim of this study was to investigate the relevance of igfbp5 in cardiogenesis and cardiac remodeling and its role as a potential target for ameliorating stress-induced cardiac remodeling. methods and results: we investigated the expression of igfbp5 in murine cardiac tissue at different developmental stages by qpcr normalized to tpt1 (tumor protein, translationally-controlled 1). this analysis showed temporal changes of cardiac igfbp5 expression from developing to postnatal hearts: high expression was detected at early heart stages, which decreased during cardiac development and became low in the postnatal heart. the analysis of igfbp5 expression in different heart cells showed very low igfbp5 expression in adult cardiomyocytes in contrast to high expression in undifferentiated sca-1 positive cells. in a mouse model with cardiac-specific wnt/β-catenin activation, which led to cardiac dysfunction, igfbp5 was found up-regulated (p<0.05). furthermore, we found increased igfbp5 expression after pressure-induced cardiac hypertrophy using mice with transverse aortic constriction (tac) (p<0.01).
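the qpcr quantification above (igfbp5 normalized to tpt1) is conventionally evaluated with the 2^-ΔΔct method; the ct values below are invented for illustration, not measurements from the study:

```python
def fold_change_ddct(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    """Relative expression by the standard 2^-ddCt method: the target Ct is
    normalized to a reference gene (here tpt1), then to a calibrator sample."""
    dct_sample = ct_target - ct_ref
    dct_cal = ct_target_cal - ct_ref_cal
    return 2.0 ** -(dct_sample - dct_cal)

# Invented Ct values: igfbp5 amplifies 3 cycles earlier (relative to tpt1)
# in an embryonic sample than in an adult calibrator -> 8-fold higher.
print(fold_change_ddct(ct_target=22.0, ct_ref=18.0,
                       ct_target_cal=25.0, ct_ref_cal=18.0))  # -> 8.0
```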
in line with these data, an in vitro model of human heart muscle hypertrophy using engineered heart muscle (ehm) showed an up-regulation of igfbp5 upon adrenergic activation via norepinephrine stimulation, accompanied by a functional deterioration in comparison to untreated controls (p<0.05). these findings were further supported by rna-sequencing analysis of human aortic stenosis patient samples, in which igfbp5 expression was increased in patients with compensatory hypertrophy and to a higher extent in patients with heart failure, in comparison to non-failing heart samples. interestingly, the expression of igfbp5 in angiotensin 2- or norepinephrine-stimulated neonatal murine cardiomyocytes, as well as in hearts of mice treated with angiotensin 2, showed the opposite result, namely a reduction in its expression (p<0.01 and p<0.05, respectively). summary and conclusion: our results show active igfbp5 transcription in the early developing heart but low expression in the postnatal heart. a re-activation of expression was found in the process of pathological heart remodeling in mouse and human, in vivo as well as in vitro, indicating the participation of igfbp5 in a conserved manner. we hypothesize that igfbp5 may participate in the developmental gene program that becomes activated again in the diseased adult heart. the functional role and regulation of igfbp5 are under investigation.

we have recently shown that perivascular adipose tissue (pvat) plays a crucial role in obesity-induced vascular dysfunction. in pvat-free aortas isolated from male c57bl/6j mice fed a high-fat diet (hfd) for 22 weeks, the endothelium-dependent nitric oxide (no)-mediated vasodilator response to acetylcholine remained normal. in contrast, a clear reduction in the vasodilator response to acetylcholine was observed in aortas from obese mice when pvat was left in place.
these results suggest that the reason for vascular dysfunction in diet-induced obese mice is pvat dysfunction rather than endothelial dysfunction. treatment of hfd mice during the last 4 weeks with crataegus extract ws® 1442 (150 mg/kg/day) completely normalized vascular function in pvat-containing aortas. ws® 1442 changed the expression of endothelial no synthase (enos) neither in pvat nor in aorta. phosphorylation at serine 1177 is the most important positive regulatory modification of enos activity. hfd-induced obesity was associated with a reduction in enos phosphorylation at serine 1177 in pvat, but not in aorta. ws® 1442 treatment significantly improved enos serine 1177 phosphorylation selectively in pvat but had no effect in aorta. a major upstream kinase for enos serine 1177 phosphorylation is akt. the activity of this kinase was inhibited in the pvat of hfd mice, which was largely reversed by ws® 1442 treatment. in addition, ws® 1442 treatment enhanced the mrna expression of the nad-dependent deacetylase sirtuin-1 (sirt1), also known as a longevity gene. the activity of sirt1 depends, among others, on the intracellular content of its cofactor nad. ws® 1442 treatment led to an upregulation of nicotinamide phosphoribosyltransferase (nampt), a rate-limiting enzyme in the salvage pathway of nad biosynthesis. one of the non-histone substrates of sirt1 is enos. deacetylation of enos at lysine residues 497 and 507 by sirt1 enhances the activity of the enos enzyme. currently, we are studying the effect of ws® 1442 on nad synthesis, sirt1 activity, and enos (de)acetylation. in conclusion, crataegus extract ws® 1442 reverses obesity-induced vascular dysfunction by improving pvat function. the molecular mechanisms may involve enos phosphorylation at serine 1177 and upregulation of sirt1.

the raf kinase inhibitor protein (rkip) inhibits g-protein-coupled receptor kinase 2 (grk2) and the raf-erk1/2 pathway. these two functions of rkip could counteract each other.
while grk2 inhibition is cardio-protective, inhibition of the pro-survival erk1/2 axis promotes signs of heart failure in patients and experimental models. in view of this ambivalent nature, the function of rkip in vivo is not clear. furthermore, rkip could have a pathophysiological role because heart specimens from patients with heart failure showed rkip up-regulation (ref. 1). to investigate the impact of cardiac rkip upregulation in vivo, we generated transgenic mice with myocardium-specific expression of the human rkip gene (pebp1) under control of the alpha-mhc promoter. two different rkip-transgenic lines with 2.7-fold and 3.4-fold increased cardiac rkip protein level were generated (ref. 2, and jax id number 911819). we investigated the cardiac phenotype and found that tg-rkip mice developed cardiac hypertrophy with a significantly increased heart weight to body weight ratio and a decreased left ventricular ejection fraction relative to non-transgenic fvb controls, as early as 10 weeks of age. histology analysis revealed progressive atrial and ventricular enlargement of tg-rkip hearts. ecg abnormalities, a lower maximum rate of left ventricular pressure rise, and a strongly decreased left ventricular ejection fraction of 32.9±2.3 % (n=6; ±s.d.) were documented at an age of 8 months. down-regulation of the transgenic rkip by lentiviral transduction of an rkip-targeting mirna retarded the cardiac phenotype of tg-rkip mice. thus, dual-specific inhibition of the grk2 and raf-erk1/2 axis by the human rkip gene (pebp1) triggers signs of heart failure in vivo, and the documented upregulation of the cardiac rkip in heart failure patients could aggravate disease pathogenesis. 
these findings are in contrast to rodent rkip (pebp1), which does not seem to inhibit the raf-erk1/2 axis in vivo but instead confers grk2 inhibition.

heterodimerization between the at1 receptor (at1r) for the vasopressor peptide angiotensin ii and the b2 receptor (b2r) for the vasodepressor peptide bradykinin enhances angiotensin ii-stimulated signalling in cells. in addition, at1r-b2r heterodimerization has a major pathophysiological role and contributes to the angiotensin ii hypersensitivity of women with preeclampsia. to analyse the vascular function of the at1r-b2r heterodimer in vivo, we generated a transgenic model of at1r-b2r heterodimerization (tg-b2r+) by transgenic expression of the b2r gene (bdkrb2) in the b2r-deficient tg-b2r-/- strain. fluorescence resonance energy transfer (fret) imaging was applied to analyse the interaction between different g-protein-coupled receptors in the aorta of transgenic mice. we report here that fret imaging detected a close interaction between the aortic at1r and b2r at a distance of less than 9 nm in tg-b2r+ mice, whereas the at1r-b2r heterodimer was absent in tg-b2r-/- mice. in contrast, fret was not detectable between the endothelin eta receptor (etar) and the b2r in the aorta of tg-b2r+ mice, although immunofluorescence and immunohistology confirmed the aortic (co-)localization of both etar and b2r. the efficient at1r-b2r heterodimerization in tg-b2r+ mice was accompanied by an enhanced angiotensin ii at1r-stimulated vasopressor response relative to that of tg-b2r-/- mice, which lack the at1r-b2r heterodimer. as a control, the endothelin-1-stimulated vasopressor response mediated by the etar, which did not dimerize with b2r, was not significantly different between tg-b2r+ and tg-b2r-/- mice. together these findings provide strong evidence that at1r-b2r heterodimerization occurs in vivo and enhances the angiotensin ii at1r-stimulated vasopressor response.
dysfunction of the cardiac energy substrate metabolism is a characteristic feature of late-stage heart failure. the dysfunctional cardiac substrate metabolism contributes to insufficient energy generation and has limited treatment options. in search of a treatment approach, we investigated whether inhibition of g-protein-coupled receptor kinase 2 (grk2) could confer cardioprotection by targeting the dysfunctional cardiac substrate use. the impaired substrate metabolism of late-stage heart failure was reproduced in a transgenic model with myocardium-specific expression of fatty acid synthase (fasn), the major palmitate-synthesizing enzyme. experiments with a seahorse xf24 extracellular flux analyzer revealed that, in an adult-like lipogenic milieu, fasn-transgenic cardiomyocytes reproduced the overall depressed substrate use of late-stage heart failure, with a switch from fatty acid to predominant glucose utilization. the impaired substrate use was largely retarded by co-expression of a small peptide inhibitor of grk2, grkinh. the grkinh-mediated protection against cardiometabolic remodelling required an intact raf-erk1/2 axis and involved the erk1/2-dependent inactivation of the heart failure-promoting peroxisome proliferator-activated receptor gamma (pparg) by phosphorylation of serine-273. as a consequence of erk-dependent phosphorylation of pparg on serine-273, the expression of heart failure-related pparg targets such as fatty acid synthase, resistin and adiponectin was decreased. the importance of pparg serine-273 phosphorylation was further shown in transgenic mice with myocardium-specific expression of the phosphorylation-deficient pparg serine-273a mutant, which was resistant to the cardioprotective activity of grkinh. taken together, our experiments show that grk2 inhibition could target cardiometabolic remodelling by inhibition of the heart failure-promoting transcription factor pparg.
the effect of sodium valproate on the action potential of atrial myocytes of crem-ib∆c-x transgenic mice. introduction: in the mouse, cardiomyocyte-directed over-expression of the transcription factor crem (camp response element modulator) causes an atrial phenotype characterized by hypertrophy, reduced contractility and increased duration of the monophasic action potential (map). moreover, this animal model (crem-ib∆c-x) showed spontaneous atrial fibrillation (af) episodes as early as 5 weeks of age in homozygous mice and 10-12 weeks of age in heterozygous mice (phenotype delayed towards the adult stage). previous studies in heterozygous mice targeted hdac2 inhibition by sodium valproate (vpa, an anticonvulsant drug acting also as an inhibitor of hdac class i>ii). vpa treatment significantly delayed the development of atrial hypertrophy and the incidence of af episodes, without affecting cellular hypertrophy. our aim was to investigate the effect of chronic vpa treatment on the electrical activity of atrial myocytes isolated from crem-ib∆c-x and wild-type (wt) littermate mice. methods and results: atrial myocytes were isolated from 12-week-old wild-type mice (wt) and heterozygous crem-ib∆c-x transgenic mice with enlarged atria (tg), treated for 7 weeks with vpa (0.4 mm in the drinking water) vs. water (vehicle control). action potentials (ap) were measured at room temperature using the patch-clamp technique. atrial myocytes of water-treated tg mice showed an ap amplitude significantly reduced by 7 mv compared to water-treated wt and, in line with previous results for the map, the tg cells depolarized with a slower slope of 90.2±8 v/s (tg: n=33 cells) vs. 109.8±5.1 v/s (wt: n=30, p=0.05). moreover, aps of atrial myocytes isolated from water-treated tg mice had longer duration (apd) at 50% (tg: 11.6±1.4 ms, n=35 vs. wt: 6.9±0.5 ms, n=30, p<0.01), at 70% (in ms: 21.2±2.5 vs. 13.6±1, p<0.05) and at 90% (in ms: 46.8±4.5 vs. 34±2.5, p<0.05) repolarization.
vpa treatment reduced the ap amplitude in wt mice by 6 mv (n=22, p<0.05) vs. water-treated wt, without altering the slope of depolarization or the apd. in vpa-treated tg mice the apd was reduced (50%: 7.3±0.8 ms, 70%: 13.8±1.5 ms, 90%: 30.9±3.2 ms, n=21, p<0.05 vs. untreated tg), the amplitude was increased by 5 mv (n.s.) and the slope of depolarization was increased by 21% (p=0.16, n.s.). membrane capacitance evaluation, as an estimate of atrial myocyte size, showed that in untreated tg mice the cells were larger than in wt (tg: 104±7.2 pf, n=27 vs. wt: 55±5.4 pf, n=30, p<0.001), in line with the occurrence of cellular hypertrophy in tg atria. chronic vpa treatment did not change the cell size in either genotype (wt-vpa: 64.8±7 pf, n=17, n.s. vs. wt; tg-vpa: 88±7 pf, n=14, p<0.05 vs. wt-vpa; p<0.01 vs. wt; n.s. vs. tg). conclusions: in hypertrophied atrial myocytes of crem-ib∆c-x mice, aps were characterized by smaller amplitude, slower onset of depolarization and increased duration compared to wt cells. despite having no effect on atrial myocyte size, vpa treatment reduced the duration and showed a tendency to increase the amplitude and the slope of depolarization of the action potential in tg mice towards values similar to wt. these data suggest that chronic treatment with vpa partially restored the electrical activity of atrial myocytes and may reverse the electrical remodeling via hdac2 inhibition. (supported by the dfg)

icer, smicer and crem-ib∆c-x, isoforms of the transcription factor crem (camp response modulator), are inducible by β-adrenergic stimulation and code for similar or even identical proteins. thus, these isoforms are able to repress expression of respective target genes in response to camp and might play a role in arrhythmogenic remodeling during the development of chronic heart diseases. here we test this hypothesis in a mouse model with transgenic expression of crem-ib∆c-x (tg).
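apd at 50/70/90% repolarization, as reported in the two preceding paragraphs, is the time from the ap upstroke until the membrane potential has repolarized by that fraction of the amplitude; a minimal sketch on a synthetic exponential repolarization (not real patch-clamp data):

```python
import math

def apd(times_ms, v_mv, fraction):
    """Time from the AP peak until the voltage has repolarized by `fraction`
    of the amplitude (rest to peak), with linear interpolation between samples."""
    v_rest, v_peak = v_mv[0], max(v_mv)
    i_peak = v_mv.index(v_peak)
    v_thr = v_peak - fraction * (v_peak - v_rest)
    for i in range(i_peak, len(v_mv) - 1):
        if v_mv[i] >= v_thr >= v_mv[i + 1]:
            # interpolate for sub-sample precision at the threshold crossing
            t = times_ms[i] + (v_mv[i] - v_thr) / (v_mv[i] - v_mv[i + 1]) * (
                times_ms[i + 1] - times_ms[i])
            return t - times_ms[i_peak]
    return None  # trace never repolarized to the threshold

# Synthetic AP: rest -80 mV, peak +40 mV, exponential repolarization (tau 20 ms).
ts = [k / 10.0 for k in range(1000)]           # 0.1 ms sampling, 100 ms trace
vs = [-80.0 if t < 1.0 else -80.0 + 120.0 * math.exp(-(t - 1.0) / 20.0)
      for t in ts]
print(round(apd(ts, vs, 0.9), 1))  # APD90 of this synthetic trace (~46 ms)
```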
these mice develop not only spontaneous-onset atrial fibrillation but also arrhythmogenic alterations in the ventricle. patch-clamp experiments revealed an increased na+/ca2+ exchanger current (incx) and a decreased transient outward current (ito) in tg ventricular cardiomyocytes (vcms) vs. wild-type controls (ctl). these alterations were associated with an increased arrhythmogenicity in tg vcms. action potentials were prolonged in tg vcms vs. ctls, leading to an increased proportion of vcms displaying early afterdepolarizations. ca2+ imaging revealed that the transduction rate of spontaneous sub-threshold ca2+ waves into supra-threshold transient-like ca2+ events, which is mediated by the ncx, was increased in tg vcms. at the same time, the serca-mediated ca2+ transport rate (rserca) was enhanced in tg vcms, potentially limiting ca2+ extrusion by the ncx. underlining the in vivo relevance of our findings, ventricular extrasystoles (ves) were augmented in ecgs of tg mice (ves/mouse during 10^-6 m isoproterenol challenge, tg: 3.65*, ctl: 0.4; n=20/condition). the increase in incx and rserca and the decrease in ito went along with an increase of ncx1 and serca2a and a decrease of kchip2 protein levels. however, the respective mrna levels (slc8a1, atp2a2 and kcnip2) were unaltered between groups, pointing to a post-transcriptional regulation of these genes. in an mrna-sequencing approach we identified the downregulation of precursor mirnas, inter alia for mir-369 (fold change in tg: 0.04*) and mir-1 (fold change in tg: 0.13*) (n=10/condition). atp2a2 is a predicted target of mir-369, and mir-1 has recently been shown to regulate ncx1 and ito-related potassium channel subunits. (*p<0.05 vs. ctl) our results demonstrate that transgenic expression of crem-ib∆c-x in mouse vcms leads to distinct arrhythmogenic alterations.
they further indicate that the repression of micrornas by short crem repressor isoforms may lead to the upregulation of genes in the context of an arrhythmogenic remodeling. since crem repressors are inducible by chronic β-adrenergic stimulation, our results suggest that the inhibition of cre-dependent transcription contributes to the formation of an arrhythmogenic substrate in chronic heart disease.

chronic overstimulation of cardiac β-adrenergic receptors (β-ar) is a major trigger for the development and maintenance of cardiac hypertrophy and heart failure. although the camp-activated protein kinase a (pka) is known as a prominent downstream effector of β-ar signaling, its functional contribution to pathological cardiac remodeling is neither well understood nor directly studied so far. to address this issue we used mice carrying a point mutation in the regulatory pka subunit riα (pkariαb), which prevents binding of camp and consequently diminishes kinase activity. this dominant-negative mutation was controlled by a tamoxifen (tam)-inducible αmhc promoter-driven cre transgene, which allows selective expression in the ventricular myocardium. the inducible and tissue-specific gene expression was analyzed and confirmed by pcr, rt-pcr and immunohistochemistry. furthermore, diminished phosphorylation of several pka targets verified impaired pka activity in tam-treated double-transgenic animals. the hypertrophic response in ventricular pka mutants was studied in genetic, pharmacological and surgical mouse models of heart disease. genetically induced heart failure was observed following tam treatment in mice expressing an inducible myocardial-specific cre transgene. this deleterious cardiac phenotype develops independently of the presence of the floxed transgene. 8 days after tam treatment, controls displayed an elevated heart weight to body weight ratio (hbr) and heart weight to tibia length ratio (htr).
the hbr shifted from 6.2 mg/g in untreated control animals to 10.9 mg/g in tam-injected mice. in contrast, pka-inhibited mutants displayed a minor increase in hbr to 7.7 mg/g. for the pharmacological induction of cardiac hypertrophy we implanted osmotic minipumps delivering a combination of isoproterenol and phenylephrine. control animals showed a significantly increased hbr (8.4 mg/g) compared to saline-treated animals (6.8 mg/g) and pka mutants (7.1 mg/g). paradoxically, all pka-inhibited animals displayed a consistent elevation in important hypertrophic markers like anp. surgical constriction of the aortic arch (transverse aortic constriction, tac) led to a pressure-induced hypertrophic response (hbr: 6.5 vs. 7.9 mg/g) followed by a pronounced elevation in several hypertrophic factors such as anp, the myh6/7 ratio and myocyte size. in contrast, pka mutants displayed an irregular progression of cardiac hypertrophy, represented by two groups with either an unchanged (6.9 mg/g) or a strongly elevated hbr (13.5 mg/g). however, additional hypertrophic factors including anp, the myh6/7 ratio and myocyte size were significantly increased in both groups. to our knowledge this is the first report which directly studies the role of ventricular pka activity in cardiac hypertrophy in a genetically altered mouse model. our results suggest that in an early stage of cardiac remodeling pka inhibition alleviates cardiac weight gain but provokes a detrimental shift during further progression, which suggests a protective role of ventricular pka activity in cardiac disease.

acetyl-coa carboxylase (acc) catalyzes the first step in the biosynthesis of fatty acids in bacterial and eukaryotic cells, i.e. the conversion (carboxylation) of acetyl-coa into malonyl-coa. acc-generated malonyl-coa functions as a substrate for de novo lipogenesis and acts as an inhibitor of mitochondrial β-oxidation of fatty acids.
because of its role in lipid metabolism, this enzyme has become an interesting target in drug discovery in the field of metabolic diseases and cancer. despite this interest in acc, no attention has as yet been given to the role of acc in endothelial cells. we aimed to investigate the role of acc in two functional key aspects of angiogenesis: endothelial cell proliferation and migration. we used the acc inhibitor soraphen a, a polyketidic natural compound isolated from the myxobacterium sorangium cellulosum, as well as an rnai-based approach to inhibit the function of acc. primary human umbilical vein endothelial cells (huvecs) were used as the in vitro model. first, we analyzed the action of soraphen a on cell viability. the compound neither lowered the metabolic activity of huvecs up to a concentration of 100 µm after 24 and 48 h (ctb assay) nor increased the apoptosis rate after 24, 48, or 72 h up to 100 µm. measuring adenosine triphosphate (atp) levels revealed that 30 µm soraphen a did not alter the atp levels in huvecs after 24 h of treatment. in contrast, a 48 h treatment significantly lowered the atp levels by 12 %. gene silencing of acc1 in huvecs likewise attenuated the atp levels by 11 %. mitochondrial membrane potential (mmp) assays showed decreased mmp levels (10 %) in soraphen a-treated cells after 24 h. interestingly, the compound inhibited the proliferation of endothelial cells with an ic50 value of 34 µm. cell cycle analysis showed that soraphen a decreases the number of cells in the g0/g1 phase by 26 % and increases the number of cells in the g2/m phase by 50 %. the compound also inhibited the activation of akt (western blot analysis). in a wound healing/scratch assay, 30 µm soraphen a lowered the migration of endothelial cells by 65 %. gene silencing of acc1 in huvecs strongly decreased endothelial migration, whereas a knockdown of acc2 had no influence.
furthermore, boyden chamber assays revealed that soraphen a can also lower chemotactic migration by 34 %. since actin rearrangement is necessary for migratory processes, we analyzed the f-actin cytoskeleton (microscopy) and found that soraphen a decreases the number of filopodia by 60 % but did not influence stress fiber formation. surprisingly, soraphen a-treated cells did not exhibit significant alterations in their capacity to form tube-like structures on matrigel. in summary, we gathered first hints that inhibiting acc has an immense impact on the proliferation and migration of primary endothelial cells. the mechanistic basis of this phenomenon will be investigated in future studies by analyzing the lipid profile and the transcriptome of endothelial cells. acknowledgement: this work was supported by the german research foundation (dfg, for 1406, fu 691/9-2).

introduction: statins are among the best-examined drugs, with excellent efficacy and safety profiles. lowered low-density lipoprotein (ldl) cholesterol goals, new indications for treatment and new knowledge about their pleiotropic effects have promoted a considerable increase in statin use. but as statin use becomes more widespread, awareness of their adverse effects as well as the recognition of statin intolerance problems increases. statin intolerance - understood as the inability to tolerate a statin dose required to reduce individual cardiovascular risk sufficiently - is a significant problem in the treatment of dyslipidemia and can result from different statin-related side effects. muscle-related adverse events, elevation of liver enzymes, cognitive problems and new-onset diabetes mellitus have all been described, especially at higher doses. although muscle symptoms are the most commonly observed side effects, excluding other adverse events might underestimate the number of patients with true statin intolerance. these patients represent a target population for the newest lipid-lowering drug category, i.e.
the proprotein convertase subtilisin/kexin type 9 (pcsk9) inhibitors. this work aimed to give an overview of published definitions derived from clinical studies, associations and major drug regulatory agencies. we discuss overlaps, differences and limitations of the current definitions. methods: the literature-based search included pubmed and uptodate publications in english and german until october 2015. we performed hand searches of the retrieved references and compiled an overview. results: a definition of statin intolerance from the european medicines agency (ema) or the us food and drug administration (fda) is not available. in clinical studies, different definitions are chosen and the results are not comparable. likewise, different associations, such as the american heart association (aha), the european atherosclerosis society (eas), the canadian working group and the national lipid association (nla), have not been able to agree on one common definition. statin intolerance definitions included different types of muscle symptoms, the integration of ck levels and minimum requirements for statin doses. there are currently no validated questionnaires or specific laboratory parameters available. in addition, the term 'myopathy' is often considered a synonym for statin intolerance. overall, only a few major studies have been conducted with statin-intolerant patients so far, using inconsistent definitions. discussion and conclusion: there is an unmet need for a robust and clear definition of statin intolerance, as overemphasizing it might hinder appropriate clinical use of this important drug class. thus, further work is required to develop a consensus definition of statin intolerance, or a more focused definition regarding statin-associated muscle symptoms only. subsequently, these definitions could be implemented in patient care and their relevance analyzed and tested in future studies.
background: the development of cardiac hypertrophy is characterized by reactivation of genes involved in cardiac development. wnt/β-catenin signaling is essential for embryonic cardiac development and is known to be dysregulated in pathological heart remodeling. our previous work suggested a cardiac-specific protein complex regulating wnt/β-catenin/tcf transcription in the adult heart. we aim to identify and characterize this complex in order to find potentially interesting targets for pharmacological therapy preventing maladaptive cardiac remodeling and the onset of heart failure. results: we previously demonstrated that krüppel-like factor 15 (klf15) is a β-catenin interaction partner and a cardiac-specific nuclear inhibitor of wnt/β-catenin-dependent transcription. because klf15 and β-catenin are ubiquitously expressed, we postulate the existence of cardiac-specific co-factors responsible for the cardiac specificity of this complex. using a yeast two-hybrid screen, we identified the basic leucine zipper and w2 domain containing protein 2 (bzw2), a phylogenetically conserved protein, as a β-catenin and klf15 interaction partner. in vitro overexpression experiments and co-immunoprecipitation validated these interactions, which were also confirmed by mass spectrometry. in the developing mouse embryo, bzw2 mrna expression is detectable in the heart, neuronal tissue, somites, limbs and branchial arches, as shown by whole-mount in situ hybridization. in adulthood, expression of bzw2 is confined to the heart, predominantly in cardiomyocytes and cardiac progenitor cells compared to cardiac fibroblasts (*p<0.05, cm n=4, cfb n=4, cpc n=2), and to skeletal muscle. bzw2 was localized both in the cytosol and in the nucleus. mutation analysis showed the importance of the n-terminus of bzw2, containing a putative bzip dna interaction domain, for the nuclear localization of the protein.
bzw2 protein expression was significantly increased upon cardiac wnt/β-catenin signaling activation in vivo in two mouse models (klf15 knockout (ko) mice, **p<0.01, n=3, and a cardiac-specific β-catenin-stabilized mouse model, **p<0.01, n=6). a mouse model with constitutive bzw2 loss of function (bzw2 ko) showed cardiac-specific upregulation of β-catenin at the rna level (***p<0.001, p<0.05, ctrl n=4, bzw2 ko n=5) and at the protein level (**p<0.05, ctrl n=7, bzw2 ko n=9). echocardiography in eight-week-old bzw2 ko mice showed increased left ventricular wall thickness, indicating a hypertrophic phenotype at baseline. we also observed increased levels of bzw2 expression in angiotensin ii-treated mice as a model of cardiac hypertrophy (*p<0.05, ctrl n=4, angii n=3) as well as in human samples derived from patients with dilated cardiomyopathy and ischemic cardiomyopathy (*p<0.05, ctrl n=3, dcm n=11, icm n=10). conclusion: these data demonstrate that bzw2 is associated with components of the canonical wnt cascade and suggest its relevance in the constitutive regulation of wnt/β-catenin components specifically in the heart. this study further contributes to elucidating the tuning of the wnt-off/-on states, aiming to establish a proof-of-concept model for wnt modulation as a therapeutic strategy in hypertrophy-induced heart failure. objective: sphingosine-1-phosphate (s1p) is involved in the regulation of cell growth, survival, migration and adhesion. it is formed by sphingosine kinases and degraded by phosphatases and s1p lyase [1]. mice that lack s1p lyase are characterized by the accumulation of s1p and sphingosine in their cells and tissues, and by lymphopenia, generalized inflammation, multiple organ damage, and a strongly reduced life span [2] [3] [4].
on the other hand, embryonic fibroblasts from s1p lyase-deficient mice (sgpl1−/− mefs) are resistant to chemotherapy-induced apoptosis [5], in part due to an upregulation of multidrug transporters of the atp-binding cassette (abc) transporter family [6]. interestingly, s1p lyase-deficient mice have elevated plasma levels of cholesterol and triglycerides, while suffering from strongly reduced body fat [7]. the aim of the present study was to analyze the link between s1p lyase deficiency and altered cholesterol homeostasis using sgpl1

background: cardiac gene expression changes during cardiac development and under pathophysiological conditions. these alterations in gene expression are regulated by several processes, and the exact regulation of gene expression is essential for the proper development and function of the heart. crucial steps in transcription regulation are rna polymerase ii (pol ii) recruitment and changes in pol ii activity. pol ii activity is tightly linked with phosphorylation at serine-2 (p-ser2) of the carboxy-terminal domain of pol ii. thus, the aim of the present study was to identify cardiomyocyte-specific genome-wide pol ii and p-ser2-pol ii enrichments to gain insight into pol ii activity and recruitment in development and disease. methods and results: to gain insight into rna polymerase ii dynamics, genome-wide maps of rna polymerase ii occupancy were generated by chromatin immunoprecipitation in cardiomyocyte nuclei purified from normal neonatal and adult mouse hearts. in addition, cardiomyocyte nuclei were obtained from adult hearts after 4 weeks of pressure overload induced by transverse aortic constriction (tac). cardiomyocyte nuclei were isolated by magnetic beads with an anti-pcm1 antibody. nuclei were used for pol ii chromatin immunoprecipitation followed by deep sequencing (chip-seq). to test if pol ii marks correlate with nuclear mrna expression in cardiomyocyte nuclei, all coding genes were ranked according to their expression level.
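The ranking-and-thresholding step described here (ordering coding genes by nuclear expression and calling genes above a cutoff "expressed") can be sketched in a few lines. This is an illustrative sketch with invented fpkm values; only the 0.062 fpkm cutoff is taken from the abstract, the gene list and numbers are hypothetical:

```python
def rank_by_expression(fpkm):
    """Rank genes by fpkm, highest first (1-based ranks)."""
    ordered = sorted(fpkm, key=fpkm.get, reverse=True)
    return {gene: rank for rank, gene in enumerate(ordered, start=1)}

def expressed_genes(fpkm, cutoff=0.062):
    """Genes counted as expressed in cardiomyocyte nuclei (fpkm > cutoff)."""
    return {gene for gene, value in fpkm.items() if value > cutoff}

# invented fpkm values; tnni1 and bgn are the control loci named in the text
toy = {"myh6": 3500.0, "acta1": 1200.0, "tnni1": 0.03, "bgn": 0.0}
print(rank_by_expression(toy)["myh6"])   # 1 (highest-expressed gene)
print(sorted(expressed_genes(toy)))      # ['acta1', 'myh6']
```

With the cutoff applied, tnni1 and bgn fall below threshold, mirroring how the study separates expressed from silent loci before comparing Pol II enrichment.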
genes expressed in cardiomyocyte nuclei (> 0.062 fpkm, gene expression rank < 12,653) showed high pol ii enrichment at promoters as well as in genic regions. many of the gene promoters showed high levels of pol ii accumulation at the transcription start site, as compared to genic regions, which has been associated with pol ii pausing. in contrast, p-ser2-pol ii showed enrichment downstream of the transcription start site. the genomic region of troponin i type 1 (tnni1), which is expressed in cardiac muscle only during development but not in adult cardiomyocytes, was enriched for pol ii in neonatal cardiomyocytes. at the tnni1 gene, pol ii was absent in adult or pressure-overloaded cardiomyocytes. in contrast, pol ii enrichment at the alpha actin 1 (acta1) locus was only present in pressure-overloaded cardiomyocytes. these data are consistent with cellular rna-seq data showing an induction of acta1 after tac. furthermore, no pol ii enrichment could be detected in the genomic region of biglycan (bgn), a matrix proteoglycan that is not expressed in cardiomyocytes, confirming the high purity of the cardiomyocyte chromatin. conclusions: this study provides, for the first time, cardiomyocyte-specific landscapes of rna polymerase ii occupancy in heart development and disease.

cardiac myocyte maintenance dna methyltransferase 1 is essential for embryonic heart development but is dispensable for cardiac function and remodeling postnatally. t. nührenberg. background: recent studies have identified dynamic changes in dna methylation in cardiac myocytes during development, postnatal maturation and in disease. however, the enzymes involved in shaping the cardiac myocyte dna methylome are only partially known. here, we explored the role of the maintenance dna methyltransferase dnmt1 in cardiac development and in remodeling after chronic left ventricular pressure overload. methods: in mice, deletion of the dnmt1 gene was accomplished by use of two different cre recombinases.
crosses of homozygous dnmt1 fl/fl mice with heterozygous dnmt1 fl/+ mice expressing a cre recombinase under control of the atrial myosin light chain gene promoter (myl7-cre) resulted in embryonic deletion of dnmt1 (ko). embryos without myl7-cre served as controls (ctl). embryos were dissected and genotyped at e11.5, e12.5, e13.5 and e14.5. rna-seq and pyrosequencing of genomic dna were performed on e11.5 hearts, histology on e12.5 hearts and electron microscopic imaging on e13.5 hearts. for deletion of dnmt1 in adult mice, homozygous dnmt1 fl/fl mice expressing an inducible cre recombinase (myh6-mcm) were given tamoxifen i.p. over 4 days. homozygous dnmt1 fl/fl mice not carrying myh6-mcm, as well as myh6-mcm-carrying mice without dnmt1 fl alleles, were also injected with tamoxifen and served as controls. cardiac phenotyping including histology, echocardiography and qpcr was carried out without (sham) or with left ventricular pressure overload induced by transverse aortic constriction (tac). results: myl7-cre-mediated loss of dnmt1 resulted in progressive embryonic lethality, with no living ko embryos after e14.5. ko embryos displayed loss of cardiomyocyte gene expression patterns, decreased promoter cpg methylation of aberrantly expressed genes and ultrastructural features of widespread cardiac myocyte cell death. in contrast, tamoxifen-induced ablation of dnmt1 in adult mice did not affect survival of ko mice. cardiac phenotyping of adult mice revealed no significant differences between ko and ctl mice under sham and tac conditions. conclusion: the dna methyltransferase dnmt1 in embryonic cardiac myocytes is essential for proper heart development. in adult cardiomyocytes, dnmt1 is dispensable for normal cardiac function and for adaptation to chronic cardiac pressure overload. background: recent studies showed that mice with general deletion of the oxidoreductase tet3, which is involved in dna demethylation, are embryonically lethal, with the underlying cause remaining unknown.
this prompted us to investigate whether embryonic lethality is caused by cardiomyocyte-specific loss of tet3. methods: female mice homozygous for tet3 flox and male mice heterozygous for tet3 flox and heterozygous for myl7-cre were mated, and offspring were genotyped after weaning. mice homozygous for tet3 flox and heterozygous for myl7-cre (ko) or homozygous for tet3 flox without myl7-cre (controls) were sacrificed at 12 weeks of age and ventricles were harvested. mrna of the ventricles was isolated and expression of cardiomyocyte-specific genes was evaluated by quantitative real-time pcr. cardiomyocyte-specific genomic dna from ko and control mice was obtained from facs-sorted cardiomyocyte nuclei and bisulfite-converted for analysis of dna methylation by pyrosequencing. results: ko mice showed embryonic lethality of nearly 50%. ko mice that were born developed without phenotypic abnormalities (normal heart weight/tibia length ratio) and displayed compensatory upregulation of tet1 and tet2. cardiomyocyte genomic dna of ko mice showed significantly higher methylation levels in the body of the atp2a2 gene and at the binding site of the transcription factor gata4, but not near the promoter and the binding site of the transcription factor tbx5. higher methylation levels were not accompanied by changes in atp2a2 expression. however, both myh7 and nppb were upregulated in ko mice compared to control mice. conclusions: our findings suggest that tet3 is involved in dna demethylation in cardiomyocytes. loss of tet3 expression resulted in embryonic lethality. compensatory upregulation of the tet1 and tet2 isoenzymes may contribute to the incomplete penetrance of this phenotype. further studies are ongoing to investigate the functional relevance of tet3 in cardiomyocytes. to identify novel proteins secreted by the myocardium, we previously conducted a genetic screen, which led to the identification of protease inhibitor 16 (pi16).
a recent gwas analysis showed an association of a genetic variant in the pi16 genomic locus (rs1405069) with chemerin plasma levels. here we tested the hypothesis that pi16 determines chemerin plasma levels through regulation of chemerin processing. we generated mice deficient for pi16, which did not display an overt phenotype under basal conditions. plasma levels of chemerin were significantly lower in pi16-deficient animals compared to littermate controls. to investigate whether pi16 and chemerin interact, we performed co-immunoprecipitation experiments. indeed, we found pi16 to co-precipitate with chemerin from both murine plasma and cell culture supernatants. as chemerin is proteolytically processed and activated, we next asked whether the presence of pi16 would affect the processing of pro-chemerin to its processed forms in native tissue. western blot analysis of cardiac and adipose tissue lysates, detecting both the unprocessed precursor and the processed forms of chemerin, showed a significant shift towards the processed forms upon genetic deletion of pi16. when we assayed the activity of the chemerin-cleaving protease cathepsin k, we found recombinant pi16 to potently inhibit cathepsin k activity. taken together, we propose that pi16 acts as a regulator of chemerin processing. the transient receptor potential canonical 6 (trpc6) channel is a second messenger-gated cation channel, which mediates depolarization and ca2+ entry. it is known to be activated by diacylglycerol derivatives (dag; 1-oleoyl-2-acetyl-sn-glycerol, oag) [1] in a pkc-independent manner and plays important roles in lung and kidney physiology. gain-of-function mutations in the trpc6 gene can cause focal segmental glomerulosclerosis (fsgs), a kidney dysfunction leading to end-stage renal disease [2]. thus, the discovery of potent inhibitors of trpc6 may help to develop new therapeutic strategies. urban et al.
discovered that larixol, a natural product with a labdane skeleton found in the oleoresin of the european larch (larix decidua), blocks the oag-dependent activation of trpc6 [3]. larixyl acetate, another component of the resin, showed an even higher potency in trpc6 inhibition (ic50 = 0.26 µm) and a 12-fold selectivity over trpc3. these findings led to the idea that further modifications of the larixol lead structure may reveal even more potent inhibitors. furthermore, changes in selectivity and efficacy of such compounds may also provide deeper insight into the structural elements relevant for channel binding. as larixyl carbamate was assumed to exhibit a higher metabolic stability than larixyl acetate, this compound was already investigated in previous studies. it showed a potent and subtype-selective inhibition of trpc6. hence, the development of further carbamates was a priority objective. as an alternative to the use of different isocyanates for the introduction of a carbamate function at the c1 position of the molecule, we found an elegant route via formation of an active ester with carbonyldiimidazole. this precursor allowed the design of several isosteric compounds such as larixyl methylcarbamate, larixyl hydrazide and larixyl methylcarbonate, which were all able to block trpc6 with similarly low ic50 values. the introduction of bulkier side chains appeared to diminish the bioactivity of the compounds; the stereochemistry at the c1 position, however, seems to play no important role in the inhibition of trpc6 currents. larixyl methylcarbamate led to trpc6 inhibition with an ic50 value of 0.15 ± 0.06 µm. compared to larixyl carbamate and larixyl methylcarbonate, which are also very potent blockers of trpc6, this compound bears the benefit of high subtype selectivity over trpc3. even with concentrations up to 50 µm of larixyl methylcarbamate, no complete inhibition of the ca2+ influx via trpc3 channels could be achieved.
this fact distinguishes this larixol derivative as a very promising compound for further studies of trpc6 in health and disease. poisoning by organophosphorus compounds (opc), including pesticides and highly toxic nerve agents, is based on irreversible inhibition of acetylcholinesterase (ache), resulting in acetylcholine accumulation. the subsequent overstimulation of nicotinic and muscarinic receptors finally leads to respiratory arrest due to paralysis of the respiratory muscles. therapy focuses on competitive antagonism at muscarinic acetylcholine receptors and reactivation of inhibited ache by bisquaternary pyridinium oximes. nicotinic malfunction is thereby not directly addressed. for that reason, an alternative strategy appears rational: using nicotinic acetylcholine receptor (nachr)-active substances to counteract the effects of accumulated acetylcholine and thus restore nachr function. different bispyridinium non-oxime compounds (bps) have been demonstrated to serve as target structures for the identification of new positive allosteric modulators of nicotinic receptors. unlike nicotinic agonists, positive allosteric modulators can reinforce endogenous cholinergic neurotransmission despite acetylcholine accumulation in the synaptic cleft. to this end, the present electrophysiological in vitro study investigated the effect of twelve diversely substituted bps on the human α7 nachr using whole-cell patch clamping under voltage-clamp conditions (-70 mv), performed with planar electrodes in an automatic system (nanion technologies gmbh, munich). cholinergic currents of hα7 nachrs expressed in stably transfected cho cells were activated by the agonist nicotine. measurements of the effect of various bp concentrations in the presence of nicotine were performed to establish concentration-response relations.
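Concentration-response relations like the ones established here are conventionally summarized by fitting a Hill equation to the normalized responses. The sketch below uses synthetic data and a crude log-spaced grid search in place of proper nonlinear least squares; the 1.1 µm "true" EC50 is only an assumed value for the simulation, not a result from the study:

```python
def hill(conc, top, ec50, n):
    """Fractional response at a given concentration (Hill equation)."""
    return top * conc**n / (ec50**n + conc**n)

def fit_ec50(concs, responses, top=1.0, n=1.0):
    """Grid-search the EC50 over 0.01-log10 steps, minimizing squared error.
    A crude stand-in for nonlinear least-squares fitting."""
    best, best_err = None, float("inf")
    for k in range(-900, 301):           # 1e-9 .. 1e3 µM
        ec50 = 10 ** (k / 100)
        err = sum((hill(c, top, ec50, n) - r) ** 2
                  for c, r in zip(concs, responses))
        if err < best_err:
            best, best_err = ec50, err
    return best

# synthetic, noise-free data generated from an assumed EC50 of 1.1 µM
concs = [0.01, 0.1, 0.3, 1, 3, 10, 100]
responses = [hill(c, 1.0, 1.1, 1.0) for c in concs]
print(round(fit_ec50(concs, responses), 2))  # recovers ~1.1
```

On real recordings, the maximal response (top) and Hill coefficient (n) would be fitted rather than fixed.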
cholinergic inward currents were generated by human α7 nachrs in response to low nicotine concentrations. at high concentrations of the drug, the currents decayed, reflecting both desensitization of the receptors and, presumably, block of the open channel by high agonist concentrations. four out of twelve bps co-applied with nicotine showed a concentration-dependent enhancement of peak agonist-evoked currents; the 4-tert-butyl-substituted bp, most pronouncedly, also demonstrated a marked elongation of the evoked response. this suggests a positive allosteric effect of these compounds on the nicotinic receptor. however, at high bp concentrations in the presence of agonist, responses decayed significantly, presumably resulting from an open-channel block induced by the bps. hence, further compounds have to be synthesized to identify promising candidates for improvement of effective therapy against nerve agent poisoning. the transient receptor potential (trp) channels are a family of tetrameric non-selective cation channels, which are involved in a variety of physiological and pathological processes (1). among the 28 mammalian trp channels, the canonical channel 5 (trpc5) is a ca2+-permeable ion channel, which is predominantly expressed in the brain (2). many aspects of trpc5 function are still elusive, although behavioral experiments with trpc5-knockout mice suggest a role in the innate fear response (3) and some studies indicate a trpc5-mediated downregulation of neurite outgrowth in nerve cells (4, 5). to elucidate trpc5 function at a cellular level, selective and potent compounds are required to acutely control channel activity. despite extensive research, trpc5 modulators often lack selectivity or exhibit toxicity, limiting their applicability in vivo (6, 7). thus, there is still a need for identifying novel and efficient trpc5 modulators.
we therefore screened a compound library (chembionet) and identified a benzothiadiazine derivative (btd) as a novel, potent, and selective trpc5 activator. hek293 cells heterologously expressing trpc5 upon tetracycline induction (hek-trpc5) show a btd-induced concentration-dependent activation in both ca2+ assays (ec50 = 0.71 µm) and electrophysiological whole-cell patch-clamp recordings (ec50 = 1.1 µm). btd elicits currents with an n-shaped i/v curve, typical for trpc5. the resulting activation is long-lasting, reversible and sensitive to clemizole, a recently established trpc5 inhibitor (8). mtt assays revealed that incubating hek-trpc5 cells for 24 h with btd concentrations above 1 µm results in a concentration-dependent decrease in viability and cell proliferation, indicating a ca2+-mediated cytotoxic effect as a consequence of sustained channel activity. non-induced control cells remain unaffected by btd at concentrations up to 10 µm. ca2+ assays showed no influence of btd on the closely related trpc4 channels, or on trpc3/6/7, at concentrations up to 10 µm. the same applied to the more distantly related trpv and trpm channels. besides a homotetrameric organization, trpc5 subunits can also assemble into heteromeric channel complexes with their closest relatives trpc1 and trpc4 (9). trpc1/5 and trpc4/5 heteromers can also be activated by btd, as evident from their typical i/v curves in patch-clamp experiments, suggesting a high selectivity of btd for channel complexes bearing at least one trpc5 subunit. transient receptor potential canonical channels 3/6 and 7 are controlled by membrane lipids and highly expressed in neuronal and cardiac tissues. the involvement of these channels in the development and (patho)physiology of these tissues is well documented, while our understanding of structure-function relations in these channel proteins, specifically in terms of the lipid-sensing machinery, is still incomplete.
using a homology model of trpc3, based on the recently available structural information on trpv1, we performed structure-guided mutagenesis and identified a single residue in transmembrane domain 6 (g652), which is conserved within the canonical family of trp channels. single point mutations at position 652 in trpc3 largely eliminated lipid sensitivity. trpc3-g652a expressed in hek293 cells was found resistant not only to activation via the phospholipase c pathway but also to direct administration of diacylglycerols. in contrast, a synthetic agonist of trpc3/6/7 channels (gsk1702934a) activated wild-type trpc3 and trpc6 channels as well as the respective lipid-insensitive mutants (trpc3-g652a, trpc6-g709a). interestingly, the synthetic activator was found to generate substantially enhanced trpc conductances in cells expressing the lipid-insensitive mutants as compared to the wild-type proteins. closer inspection of the sensitivity of the wt and mutant proteins to various gsk derivatives argues against a contribution of g652 to gsk recognition by trpc3. our results demonstrate the existence of two different mechanisms of trpc3/6 activation, presumably involving distinct gating movements in the channel complex. we suggest that lipid gating of trpc3/6 involves a hinge point and/or requires a certain level of flexibility within transmembrane segment s6 provided by g652. lipids and synthetic activators of trpc3/6 may be capable of initiating markedly different structural rearrangements in these channels. objective: organophosphorus compounds (opcs), i.e. nerve agents or pesticides, are highly toxic due to their strong inhibition potency against acetylcholinesterase (ache). inhibited ache results in accumulation of acetylcholine in the synaptic cleft, thus provoking desensitization of the nicotinic acetylcholine receptor (nachr) in the postsynaptic membrane. as the therapeutic efficacy of oximes is limited, e.g.
in poisoning by soman or tabun, the direct targeting of nachrs may be an alternative therapeutic approach. studies with the non-oxime bispyridinium compound (bp) mb327 (1,1'-(propane-1,3-diyl) bis(4-tert-butylpyridinium) di(iodide)) demonstrated a therapeutic effect against soman in vitro and in vivo. consequently, studying the affinity of bps at muscle-type nachrs and their functional effects is a topic of interest. to identify potential candidates, homologous series of substituted and non-substituted analogues (linker c1-c10) of mb327 were investigated using binding and functional assays. experimental procedures: crude membranes from frozen electric organ of torpedo californica were purified by sucrose-gradient density centrifugation and used in both affinity and functional assays. in competition radioligand binding assays, the influence on [³h]epibatidine binding sites of the torpedo muscle-type nachr was determined. functional assessments were carried out with a bilayer method to investigate the effect on the cholinergic signal induced by 100 µm carbamoylcholine. results: bispyridinium compounds bearing unsubstituted pyridinium rings and long alkyl linkers (> c7) inhibited the binding of [³h]epibatidine and decreased the cholinergic signal of 100 µm carbamoylcholine in the functional assay. mb327 and several bispyridinium structure analogues (mainly c2-c4 linkers) exhibited no regular displacement curves at [³h]epibatidine binding sites and enhanced the carbamoylcholine-induced signal. the results demonstrate that the described affinity and functional screening methods detected some structure-activity relationships (sar). depending on linker length and substitution pattern, the investigated bispyridinium compounds seemed to interact as positive allosteric modulators. further research is necessary to verify this hypothesis.
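A "regular" displacement curve in a competition radioligand binding assay follows a one-site competition model, with specific binding falling from ~100% to ~0% around the IC50; compounds showing no such curve cannot be described by this function. A minimal sketch, where the 1 µm IC50 is an arbitrary illustrative value, not one reported in the abstract:

```python
def one_site_competition(log_conc, top=100.0, bottom=0.0, log_ic50=-6.0):
    """Percent specific radioligand binding remaining at a given
    log10 molar competitor concentration (one-site competition model)."""
    return bottom + (top - bottom) / (1.0 + 10 ** (log_conc - log_ic50))

# a regular displacement curve drops sigmoidally around the ic50
for log_c in (-9, -7, -6, -5, -3):
    print(log_c, round(one_site_competition(log_c), 1))

# at the ic50 itself, exactly half of the specific binding remains
print(one_site_competition(-6.0))  # 50.0
```

Fitting measured displacement data to this function (and judging goodness of fit) is what distinguishes competitive displacers from the irregular profiles reported for mb327 and its short-linker analogues.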
non-oxime bispyridinium compounds with an effect on soman-blocked respiratory muscle function have no effect on normal muscle function

the life-threatening toxicity of organophosphorus compounds (op), like nerve agents or pesticides, lies in the inhibition of acetylcholinesterase (ache), which causes a cholinergic crisis. the accumulated acetylcholine in neuromuscular synapses results in the desensitization of nicotinic acetylcholine receptors (nachr) and paralysis of respiratory muscles. the 4-tert-butyl-substituted bispyridinium compound mb327 showed therapeutic efficacy in soman- and tabun-poisoned guinea pigs in vivo. partial restoration of neuromuscular transmission by bispyridinium compounds (bp), e.g. mb327 or mb420, could also be observed in soman-paralysed respiratory muscles in vitro and was partly attributed to an interaction of the bps with nachrs. however, it is unknown whether these bps might affect normal respiratory muscle function in the absence of cholinergic crisis. this study therefore investigated the effect of bps on physiological rat diaphragm muscle function. force generation of rat diaphragm hemispheres was determined after incubation with increasing bp concentrations (1-300 µm) and compared to sham treatment. the diaphragm hemispheres were stimulated every ten minutes by an indirect electrical field (20, 50, 100 hz). muscle force was analyzed as the time-force integral and is expressed as a percentage of the individual control values measured at the outset of the experiment. muscle force declined over the course of the experiment. the application of the bispyridinium compounds mb327 (1,1′-(propane-1,3-diyl) bis(4-tert-butylpyridinium) di(iodide)) and mb420 (1,1′-(propane-1,3-diyl) bis(2-ethylpyridinium) di(iodide)) in the tested concentration range (1-300 µm) did not change muscle force production compared to the sham-treated muscle. this was equally true for low (20 hz) and high (50 and 100 hz) stimulation frequencies.
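The time-force integral readout used above can be obtained from a sampled force trace by trapezoidal integration and then normalized to the control value recorded at the outset. A minimal sketch with made-up sample values (times, forces and the identical "treated" trace are all hypothetical):

```python
def time_force_integral(times_ms, force_mn):
    """Trapezoidal integral of a sampled force trace (mN*ms)."""
    return sum((t1 - t0) * (f0 + f1) / 2.0
               for t0, t1, f0, f1 in zip(times_ms, times_ms[1:],
                                         force_mn, force_mn[1:]))

def percent_of_control(value, control):
    """Express a time-force integral as % of the initial control value."""
    return 100.0 * value / control

# hypothetical tetanic contraction sampled every 10 ms
t = [0, 10, 20, 30, 40, 50]
control_trace = [0, 8, 10, 10, 9, 2]
bp_trace = [0, 8, 10, 10, 9, 2]   # bp-treated trace, here identical to control
control_integral = time_force_integral(t, control_trace)  # 380.0 mN*ms
print(percent_of_control(time_force_integral(t, bp_trace),
                         control_integral))               # 100.0
```

A bp-treated trace identical to the control yields 100% of control, the null result reported for mb327 and mb420 in this study.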
this study showed that bispyridinium compounds which can partially reverse the soman-induced neuromuscular block in rat diaphragms have no effect on respiratory muscle function in the absence of the op-induced neuromuscular block. these results suggest that the bps tested in this study interacted only with desensitized nachrs and did not affect physiological neuromuscular transmission. this needs to be investigated with further promising bp compounds.

interaction of recombinant pain-relevant atp- and proton-gated ion channels in an expression system; potentiation of the p2x3 receptor-induced current by the opening of asic3 channels. g. stephan, p. illes, universität leipzig, rudolf-boehm-institut für pharmakologie und toxikologie, leipzig, germany. the p2x3 receptor (r) is a ligand-gated cationic channel, which is activated by extracellular atp. the acid-sensing ion channel 3 (asic3) belongs to the enac/degenerin family and is gated by extracellular protons. despite their different amino acid sequences, both ion channels share the same overall structure and pore architecture, i.e. both consist of three identical subunits. moreover, they are both located in partially overlapping subpopulations of dorsal root ganglion neurons and are implicated in acidic pain signaling. consequently, their physical interaction in the cell membrane, or even the formation of heteromeric receptor channels from p2x3 and asic3 subunits, has to be taken into consideration. we transfected rat (r)p2x3r and rasic3 constructs individually or together, in a ratio of 1:1, into cho cells. we further used the whole-cell patch-clamp technique to analyze the current responses elicited either by the application of α,β-methylene-atp (α,β-meatp) or by a decrease in the extracellular ph value. the functionality of the individually transfected p2x3r and asic3 constructs was verified by recording concentration-response curves for the agonists α,β-meatp and protons, respectively.
after co-transfection of both ion channels, a ph shift from 7.4 to 6.7 caused a rapidly desensitizing current response and a subsequent strong potentiation of the α,β-meatp-induced current. an even larger potentiation was achieved after a decrease of the ph value to 6.5. the opening of asic3 channels failed to facilitate the p2x3r current when 2-guanidine-4-methylquinazoline was used to stimulate a non-proton ligand sensor of asic3. then, we substituted ca2+ in the extracellular medium with ba2+ or decreased the intrapipette concentration of egta, to modify the free intracellular ca2+ concentration. in cells individually transfected with the receptor channels, external ba2+ increased the effect of α,β-meatp but decreased the effect of protons. lowering intrapipette egta modified p2x3r- and asic3-specific currents in a similar manner as external ba2+. in cells co-transfected with p2x3r/asic3, the ionic manipulations mentioned above abolished the potentiation of the α,β-meatp currents by asic3 activation. taken together, our results suggest that p2x3r and asic3 interact with each other, since the activation of asic3 had a marked impact on the p2x3r-specific current response. further experiments are required to clarify the mechanism of this interaction, although it has been shown that extra- and intracellular ca2+ and the proton sensor of asic3 appear to critically participate in this process.

university of duisburg-essen, institute for anatomy, essen, germany. background: the sperm acrosome reaction is an all-or-none secretion process, mainly following the conserved principles of calcium-regulated exocytosis in neurons and neurosecretory cells. however, the relationship between the formation of hundreds of fusion pores and the required mobilization of calcium from the lysosome-related acrosomal vesicle has only been partially defined.
hence, the second messenger nicotinic acid adenine dinucleotide phosphate (naadp), known to promote efflux of calcium from lysosome-like acidic compartments, was analyzed for its ability to trigger the acrosome reaction in mouse sperm. in addition, the expression of two-pore channel (tpc) proteins, which are primarily localized in lysosome-related acidic organelles and which represent potential molecular targets of naadp, was examined in mammalian spermatozoa. methodology/principal findings: our results show that treatment of spermatozoa with naadp resulted in a loss of the acrosomal vesicle that showed the typical properties described for tpcs: (i) the registered responses were not detectable for the chemical analogue nadp, and (ii) were blocked by the naadp antagonist trans-ned-19. in addition, (iii) two narrow bell-shaped dose-response curves were found, with maxima either in the nanomolar or the low micromolar naadp concentration range. immunogold electron microscopy with a tpc1-specific antibody revealed co-localization with naadp binding at the acrosomal region. moreover, when loss of the acrosomal vesicle was quantified in tpc1-null sperm upon application of different naadp concentrations, responsiveness to low micromolar naadp concentrations was completely abolished. conclusions/significance: our finding that two convergent naadp-dependent pathways are operative in driving acrosomal exocytosis, and that zona pellucida-induced acrosomal exocytosis is prevented by trans-ned-19, supports the concept that both naadp-gated cascades match local naadp concentrations with the efflux of acrosomal calcium, thereby ensuring reliable and complete fusion of the large acrosomal vesicle.
since the acrosome reaction shares the same basic sequence of events typical for the conserved process of calcium-regulated exocytosis, such as tethering, docking, priming and final vesicle fusion, the sperm model system may also be useful to comparatively examine whether the same convergence of naadp-dependent pathways is also operative in cellular systems with many secretory vesicles.

walther-straub-institut, münchen, germany. trpv4 channels are members of the vanilloid family of trp proteins. the channel is nearly ubiquitously expressed and can be found in brain, kidney, skin, heart and blood vessels as well as in the lung. pulmonary expression of trpv4 has been identified in endothelial cells (1), epithelial cells (2) and arterial smooth muscle cells (3). most interestingly, the channel is known to be involved in the development of several lung diseases such as cough, asthma and pulmonary edema formation, due to its activation by heat, changes in osmolarity and shear stress (reviewed in 4). it is a matter of debate, however, whether trpv4 activation in pulmonary endothelial as well as epithelial cells induces disruption of the barrier and an increased fluid leak into the alveolus, as described for trpc6 (5), which is also expressed in both cell types. to analyze the potential role of trpv4 in ischemia-reperfusion injury (iri)-induced pulmonary edema formation, we utilized the isolated perfused mouse lung model. much to our surprise, we detected significantly increased edema formation after 90 minutes of ischemia in trpv4-deficient mice in comparison to wild-type (wt) mice. this effect was observed by continuous weight measurements as well as wet-to-dry ratio gain and was not dependent on the initial perfusion rate. most interestingly, edema formation in trpv4/trpc6 double-deficient lungs was indistinguishable from wt lungs, indicating antagonizing effects of both channels, because trpc6 deficiency protected lungs from iri-induced edema (5).
moreover, we identified reduced expression levels of aquaporin 5 in trpv4-deficient lung lysates compared to wt lungs. these findings raise the intriguing possibility that trpv4 might be involved in the regulation of aquaporin expression in lung endothelial cells.

endosomes and lysosomes are cell organelles involved in transport, breakdown and secretion of proteins, lipids, and other macromolecules. endolysosomal dysfunction can cause storage disorders such as mucolipidoses, sphingolipidoses, or neuronal ceroid lipofuscinoses, but is also implicated in the development of metabolic and neurodegenerative diseases, retinal and pigmentation disorders, trace metal dishomeostasis, infectious diseases, and cancer. endolysosomal ion channels and transporters are highly critical for the tight regulation of the multiple endolysosomal fusion and fission processes, including endo- and exocytotic events, as well as the regulation of proton and other ionic concentrations in the lumen of endolysosomal vesicles. methods to patch-clamp endolysosomal organelles are continuously improving. yet until now it has not been possible to selectively enlarge endosomal or lysosomal organelles with pharmacological tools for patch-clamp experimentation. we show here, by using a combination of two small molecules, that we can selectively enlarge early endosomes to a degree sufficient for patch-clamp experimentation. the ability to more selectively patch-clamp intracellular organelles will substantially improve the functional investigation of endolysosomal ion channels under physiological and pathophysiological conditions.

the transient receptor potential (trp) channels are a superfamily of non-selective ion channels involved in a variety of physiological processes and in the pathogenesis of many disorders.
in the kidney, trp channels have been implicated in diabetic nephropathy, focal segmental glomerulosclerosis, polycystic kidney disease, hypomagnesemia with secondary hypercalcemia and idiopathic hypercalciuria. the melastatin-like trp channel subfamily 3 (trpm3) has been shown to be expressed in human kidney 1, 2 . using newly developed anti-trpm3 antibodies, we are able to visualize trpm3 protein in epithelial cells of the proximal tubule as well as collecting ducts in mouse kidneys. therefore, we compared renal function of male, five-month-old mice lacking trpm3 (trpm3

the atp-gated p2x7 receptor (p2x7r) is a non-selective cation channel widely expressed in epithelia, endothelia, and cells of hematopoietic origin. it plays a central role in cytokine release, and studies in p2x7 -/- animals indicate its involvement in inflammatory and neurodegenerative diseases. in addition, accumulating data suggest a functional role in neurons and its involvement in neurotransmitter release in the brain. however, despite its importance as a drug target, its precise localization and its molecular and physiological functions remain poorly understood. in particular, the location and function of p2x7r in neurons remain a matter of ongoing debate. to clarify the cellular and subcellular distribution of the p2x7r and to investigate its physiological and pathophysiological role in the brain, we generated bac transgenic mouse models in which murine polymorphic variants of egfp-tagged p2x7r are overexpressed under the control of the endogenous p2x7 promoter. the egfp-tagged p2x7rs are efficiently overexpressed in the plasma membrane and can be directly visualized by green fluorescence or indirectly by anti-egfp antibodies. the obtained mouse lines show different expression levels but identical expression patterns, with predominant expression in the cerebellum, hippocampus, and thalamus.
using cell type-specific markers, p2x7-egfp was identified in almost all microglia and in subpopulations of oligodendrocytes and astrocytes in the brain. in the spinal cord, numerous astrocytes in the white matter showed egfp immunoreactivity. so far, no egfp immunostaining was found on map2- and neun-positive cells, indicating that under non-pathological conditions the p2x7 receptor is not expressed in neurons of the cns. interestingly, a higher expression level of cd68 protein was observed in the p2x7r-overexpressing mice. these results suggest that overexpression of p2x7rs alone is sufficient to induce microglia activation, even under non-pathological conditions. since cd68 primarily localizes to lysosomes and endosomes, this further supports a role of the p2x7r in the regulation of phagocytosis. to validate these data, conditional knockout mice are being generated. the current status of the project will be presented.

the two-pore channels (tpcs) - tpc1 and tpc2 - are located in membranes of intracellular organelles of the endo-lysosomal system. the tpc protein monomer contains two homologous domains with six transmembrane α-helices each. a functional tpc probably consists of a dimer of two tpc proteins, resembling an ion channel architecture with the typical four-domain organization of voltage-gated na+ or ca2+ channels or of trp channels. due to their biophysical properties, tpc1 and tpc2 are assumed to be involved in the efflux of ca2+ from intracellular organelles and thereby contribute to fusion/fission processes of endosomes and lysosomes. thus, tpcs are supposed to be important regulators of vesicle trafficking, sorting and degradation/recycling processes. recently, it was shown that virus entry and replication of certain strains of filoviridae - such as ebola - depend on functional tpcs and that either block or genetic inactivation of tpcs reduces virus infectivity.
a large family of bacterial protein toxins elicit their effects by modification of intracellular target proteins of host cells. these toxins are taken up by receptor-mediated endocytosis and follow different endosomal routes to reach their final cytosolic destination. they principally use two different intracellular routes: the first group uses an entry route via early or late endosomes (short-trip toxins); the second group takes a retrograde route via endosomes, the golgi network and the endoplasmic reticulum (long-trip toxins) to get access to the cytosol. translocation of short-trip toxins - such as diphtheria toxin (dt), pasteurella multocida toxin (pmt) and bacillus anthracis lethal factor (pa/lf) - from early and late endosomes into the cytosol is driven by ongoing acidification. long-trip toxins - including cholera toxin (ct) - are retrogradely transported after endocytosis via the golgi apparatus to the endoplasmic reticulum (er). within the er, a specific peptide motif allows the translocation into the cytosol. due to the role of tpc1 in vesicle fusion and fission processes, we investigated a potential impact of tpc1 on the uptake of bacterial toxins. first, we determined the precise localization of tpc1 in intracellular compartments. to deduce its role in trafficking processes, we performed co-localization and correlation studies with a whole set of established markers such as rab gtpases and pips. second, we intoxicated wild-type and tpc1-deficient cell lines with different bacterial protein toxins such as cholera (ct), diphtheria (dt) or pasteurella multocida (pmt) toxin. using cell viability and other intoxication assays, we investigated the consequences of tpc1 deletion on bacterial toxin uptake, translocation and cytotoxicity.
universität des saarlandes, institut für experimentelle und klinische pharmakologie und toxikologie, homburg/saar, germany

trpm3 ion channels are considered to be involved in hormone release from pancreatic islets and the pituitary gland 1,2 . the trpm3 gene encodes a number of different splice variants that differ in their permeation properties and their activity in response to agonists 3, 4 . variations include the presence or absence of five stretches of 10 to 25 amino acid residues within the aminoterminus, long or short pore loops and long or truncated carboxytermini 5 . furthermore, three different aminotermini of human and mouse proteins have been described 5 , but presumably these differences are caused by the activity of different promoters. screening of a mouse pituitary gland cdna library identified 12 variants that differ in exons 8, 13, 15, 17 and 20. however, only variants bearing the short, ca2+-permeable pore loop were detected. 3' rapid amplification of cdna ends (3' race) revealed that trpm3 proteins of the pituitary gland carry truncated c-termini exclusively. 5' race identified five independent regions of transcription initiation within the trpm3 gene, implying the presence of five independent promoters. the different transcripts encode four trpm3 amino termini α, β, γ and δ of 1-155 amino acid residues, including those described in humans (β, γ). however, the activity of these variants after stimulation with pregnenolone sulfate varied largely, just as their frequency in the pituitary, with shortened γ-variants being most abundant (~76 %).

two-pore channels (tpcs) are a small family of ion channels found throughout the endolysosomal system of eukaryotic cells. phylogenetically, tpcs belong to the voltage-gated ion channel superfamily, sharing common traits with cav/nav and trp channels. tpcs show a duplicated architecture with two homologous transmembrane domains.
each domain is built up of six membrane-spanning alpha helices linked by short loops. it is very likely that tpcs form dimers, maintaining the four-fold symmetry found in other members of the voltage-gated ion channel superfamily. due to their localization in the endolysosomal system, tpcs are not accessible to conventional patch clamping. to investigate the electrophysiological properties of tpcs, we use black lipid bilayer measurements. purified channels are integrated into an artificial phospholipid bilayer that separates two chambers, enabling us to apply different buffers. upon activation by naadp, ions flow through the channel and can thereby pass the diffusion barrier. the movement of charged molecules through tpcs results in currents which can be amplified and recorded. the controlled environment of the lipid bilayer setup allows testing of different ions as well as putative activating and inhibitory substances. one of the major drawbacks of conventional lipid bilayer setups is the long preparation time between measurements. stability of the phospholipid bilayer can pose another issue. often several membranes have to be established before a measurement can be performed, making it very time consuming to achieve adequate experiment numbers. new multichannel systems resolve this issue, supporting fast formation of bilayers while allowing measurement of up to 16 different bilayers at a time. here we utilize the "orbit" multichannel system with a meca16 chip by ionera to measure tpc1 channel activity. we will present data generated by the "orbit" system and a conventional bilayer setup using different channel constructs and charge carriers.

cardiac action potentials are generated and propagated through the coordinated activity of multiple ion channels, including voltage-gated sodium channels (nav1) and potassium channels (like kv1, kv4).
the voltage-gated na+ channel nav1.5 initiates the cardiac action potential (ap), is essential for rapid depolarization, and is also known to control the ap duration in cardiomyocytes. the voltage-gated k+ channel kv4.3 is responsible for the early repolarization of action potentials in the human heart. similar to many membrane proteins, nav1.5 and kv4.3 have been found to be regulated by several interacting proteins. the transmembrane β subunit dipeptidyl aminopeptidase-like protein (dpp) 10 is known to interact with the kv4.3 channel complex, modulating kinetics and voltage dependence. the overexpression of dpp10 in ventricular cardiomyocytes of rats revealed a strong reduction of ap amplitude and significant slowing of ap upstroke velocity and ap duration, which could not be explained by the effects on cardiac kv4 channels. to study the potential influence of dpp10 on nav1.5 channels, we performed whole-cell patch-clamp analysis of transiently transfected cho-k1 cells expressing scn5a alone or with dpp10. surprisingly, we observed significant effects of dpp10 on nav1.5 channel voltage dependence and kinetics. thus, the co-expression of dpp10 significantly shifted the half-maximal voltage of steady-state activation and steady-state inactivation to more positive potentials compared to nav1.5 channels alone. in addition, we analysed the effects of dpp10 on the kinetics of nav1.5 currents. while time to current peak was not affected in cells co-expressing nav1.5 and dpp10 compared to nav1.5 alone, dpp10 slightly accelerated the inactivation. in addition, the time course of recovery from inactivation was clearly accelerated in cells expressing both nav1.5 and dpp10 compared to nav1.5 alone. in summary, we provide the first evidence that dpp10 not only interacts with kv4 channels but also influences nav1.5 channels, modulating the depolarization as well as the early repolarization phase of the cardiac ap.
therefore, it becomes likely that these ion channels are part of large, multi-protein complexes, and that the pore-forming subunits kv4.3 and nav1.5 behave very differently depending on the expression of their associated proteins like dpp10.

cardiac fibroblasts (cf) comprise the most abundant cell type of the mammalian heart, and it is known that they contribute to maladaptive cardiac remodeling processes. in response to pressure or volume overload, ischemia-reperfusion injury or myocardial infarction, cardiac fibroblasts proliferate and transdifferentiate into myofibroblasts, which produce collagen and pro-hypertrophic cytokines influencing cardiomyocyte function and size. it was shown that β-adrenergic stimulation of cfs with isoproterenol leads to angiotensin ii (at-ii) production and autocrine stimulation of these cells (jaffre et al., 2009). activation of phospholipase c triggered by at-ii leads to formation of inositol trisphosphate (ip3) and subsequent release of calcium from intracellular stores as well as calcium entry across the cell membrane. the focus of our research is the identification of the plasmalemmal channel proteins such as trpc channels mediating this calcium entry, and whether these calcium entry pathways in cfs contribute to pathological remodeling. to date the precise role of calcium entry for these pathological processes is largely unknown. trpc channels are candidates for the analysis of calcium homeostasis in cf. recently, we showed that trpc1/c4-deficient mice are protected from maladaptive cardiac remodeling after neurohumoral stimulation or pressure overload, respectively, which can be explained by a significant reduction of a background ca2+ entry (bgce) pathway in cardiomyocytes; this bgce is enhanced by stimulation with agonists such as isoproterenol or angiotensin ii, and it critically depends on trpc1 and trpc4 (camacho londoño et al., 2015).
nevertheless, the role of trpc1/c4 in calcium homeostasis in cfs has not been analyzed so far. we established an in vitro model that allows the analysis of calcium release and entry triggered by several (patho)physiological agonists in cultured primary adult cfs from mice. cfs were isolated using langendorff perfusion and were cultured for a maximum of 6 days. our results show that there is no difference in 100 nm at-ii-induced ca2+ release or ca2+ entry in trpc1/c4-deficient cfs compared to wt. to evaluate whether the lack of trpc1/c4 can be compensated by other trpc channels, we currently analyze cfs from trpc hepta-ko mice lacking all seven trpc channel proteins concerning at-ii-induced ca2+ release and ca2+ entry, and we will also analyze the influence of other agonists on cfs which are known to evoke a longer-lasting rise in the [ca2+]i, like isoproterenol, 5-ht, thrombin and endothelin-1.

cardiovascular and metabolic diseases are currently the primary cause of morbidity and mortality in the western world and are spreading to the rest of the world following globalization. adipose tissue, in particular perivascular adipose tissue (pvat), is recognized as an important player in the development of these diseases. the release of relaxing factor(s) from the pvat has been a matter of interesting and highly spirited debates about its nature, the channels that govern its activities and its role in vascular dysfunction. data from our laboratory indicate that adipose-derived relaxing factor (adrf) is an important player; however, the potential channels necessary for its downstream activities are still under study. our recent research primarily focuses on kv7.1 channels, which are known to be expressed in vascular smooth muscle cells. kv7.1 voltage-gated potassium channels are expressed in vascular smooth muscle cells (vsmc) of diverse arteries, including mesenteric arteries.
based on pharmacological evidence using r-l3 (kv7.1 opener), hmr1556 and chromanol 293b (kv7.1 inhibitors), these channels have been suggested to be involved in the regulation of vascular tone. however, the specificity of these drugs in vivo is uncertain. we used kcnq1-/- mice to determine whether kv7.1 plays a role in the regulation of arterial tone. we found that r-l3 produces similar concentration-dependent relaxations (ec50 ~1.4 µm) of wild-type (kcnq1+/+) and kcnq1-/- arteries pre-contracted with either phenylephrine or 60 mm kcl. this relaxation was not affected by 10 µm chromanol 293b, 10 µm hmr1556 or 30 µm xe991 (pan-kv7 blocker). the anti-contractile effects of pvat were normal in kcnq1-/- arteries. chromanol 293b and hmr1556 did not affect the anti-contractile effects of perivascular adipose tissue (pvat). isolated vsmcs from kcnq1-/- mice exhibited normal peak kv currents. the kv7.2-5 opener retigabine caused similar relaxations in kcnq1-/- and wild-type vessels. we conclude that kv7.1 channels are apparently not involved in the control of arterial tone by alpha-1 adrenergic vasoconstrictors and pvat. in addition, r-l3 is an inappropriate pharmacological tool for studying the function of native vascular kv7.1 channels in mice.

introduction: according to international guidelines [1] [2] [3] [4] [5] [6] [7] , both human and animal skin in vitro models have been used and validated to predict percutaneous penetration in humans. excellent correlations have been found for the domestic pig as a surrogate for human skin [8] [9] . material and methods: tissue: the skin samples were obtained from female pigs of german landrace (50 kg weight, approx. 4 months). the process was approved under german welfare law. after narcotization and euthanization the animals were shaved with an electric shaver, washed and dried. microbiological investigation: swabs from 10 different skin areas (fig. 1) were taken before the next preparation step.
skin areas were cut, stretched and subcutaneous fatty tissue was carefully removed. the skin was harvested at a thickness of 1000 µm by dermatome. after dermatomization, samples were taken for histological examination. skin disks of 30 mm were punched out from the frozen skin stripes and stored at -20°c. hplc and skin absorption: waters corporation hplc containing 2767 sample manager, 2545 binary gradient pump, 2998 pda detector (optional: 3100 electrospray mass spectrometer); column: nucleodur® 100-5 c18 ec, 50 mm x 4.0 mm id. hanson microette™ vision® diffusion test system (hanson vision® autoplus™ autosampler/autofill™ collector, 6-cell drive system with vertical diffusion cell "standard"). the permeation experiment was performed over a period of 48 h at 32°c. the dosage compartments of each cell were filled with approximately 300 µl of the caffeine solution (10 mg/ml). samples of 1 ml were taken after 0, 12, 24, 36 and 48 hours from the receptor medium of each cell. aliquots from each vdc were analyzed by hplc in duplicate. microbiology: staphylococcus spp. were found explicitly on the pig skin surface. these are facultatively anaerobic, gram-positive bacteria that physiologically colonize the skin, oropharynx and the gastrointestinal tract. histology and skin thickness: he staining and mechanical skin thickness determinations confirmed an intact dermatomizing process of the skin (fig. 2). thickness was in the order of magnitude between 897 and 1230 µm, and intra-variations were less than 10 %. caffeine skin absorption: the permeability coefficients and lag phases recorded are in the same order of magnitude as in previous work [10], demonstrating intact barrier properties of the membranes after the 3-month storage process. until today, intra-assay variations are greater than or equal to interregional and inter-animal variations. discussion: the well-characterized dps1000 provides a ready-to-use research tool for local and systemic skin investigations.
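the permeability coefficient reported in such vertical-diffusion-cell experiments is the steady-state flux divided by the donor concentration (kp = jss / cdonor); a minimal sketch with made-up illustrative cumulative-permeation data, not the measured values from this study:

```python
# illustrative steady-state sampling data (not measured values):
# cumulative caffeine permeated per unit area (µg/cm²) at each time (h)
times_h = [12, 24, 36, 48]
q_cum_ug_cm2 = [30.0, 75.0, 120.0, 165.0]

# steady-state flux j_ss = slope of q(t), by ordinary least squares
n = len(times_h)
t_mean = sum(times_h) / n
q_mean = sum(q_cum_ug_cm2) / n
j_ss = sum((t - t_mean) * (q - q_mean)
           for t, q in zip(times_h, q_cum_ug_cm2)) \
    / sum((t - t_mean) ** 2 for t in times_h)   # µg/(cm²·h)

# donor solution: 10 mg/ml caffeine = 10 000 µg/cm³
c_donor = 10_000.0
kp = j_ss / c_donor                             # permeability coefficient, cm/h
```

with these illustrative numbers jss is 3.75 µg/(cm²·h) and kp is 3.75e-4 cm/h; in practice only the linear (steady-state) part of the q(t) curve is used for the slope.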
ongoing experiments will cover skin structure analysis by raman spectroscopy, biophotonics and the storage impact on skin barrier function.

respirable biopersistent granular dusts (gbs) should fulfill the criteria of i.) a negligible solubility in physiological lung fluid and ii.) no specific surface chemistry-related toxicity at volumetric non-overload conditions in the lungs. in 2012, the mak commission derived a new threshold value of 0.3 mg/m3 for gbs with a density of 1, recognizing that at this concentration chronic inflammation and an increase of the lung cancer risk will not occur. the objectives of the project were i.) to determine an experimental dissolution value for 'low soluble' gbs using six candidate dusts; ii.) to measure the inflammatory response in lung lavage fluid and to decide on the criterion 'inert dust'; and iii.) to investigate whether nanoscaled dusts could possibly fulfill the criteria to be included in the gbs class. six micro- and nanoscaled dusts (one of them a well-characterised inert tio2 dust (microscaled; rutile modification)) were compared, analysing the solubility in the lung fluid (day 3, 28 and 90) and the lung toxicity after intratracheal instillation in rats (day 3 and 28): tio2 (rutile, micro), tio2 (anatase, nano), eu2o3 (micro-nano mixed), baso4 (micro), zro2 (micro) and amorphous sio2 (nano). two doses of 0.5 and 1.5 µl per rat were administered to wistar rats; these volume doses resulted in a non-overload and a moderate overload of the lungs, respectively. the differential cell count showed only slight inflammatory cell levels after treatment with tio2 (rutile) and baso4 (pmn < 5% after 3 days in the low-dose group; < 15% in the high-dose group; full recovery after 28 days). in contrast, the tio2 (anatase) showed a stronger response (pmn > 30% after 3 and 28 days). the rare earth eu2o3 (micro-nano) dust showed the strongest effect (approx.
40% pmn after 3 and 28 days) including a red-coloured lung lavage fluid. µ-zro2 and amorphous sio2 showed a strong acute response after 3 days with, however, mostly complete recovery after 28 days. the low-solubility criterion was met by the following dusts: tio2 (both) and zro2. similar volumetric lung burdens were deposited in a parallel validation experiment (14-day subacute inhalations). overall, the physiological inhalation route confirmed the results obtained in the instillation study, thus suggesting the applicability of the latter as a tool for the identification of gbs dusts. however, for nanoscaled dusts an individual toxicological characterization seems to be adequate.

polycyclic aromatic hydrocarbons (pah) represent a complex mixture of compounds and occur in considerable amounts as contaminants in the environment and food. some pah have been demonstrated to be carcinogenic and mutagenic. benzo[a]pyrene (bp), the best-known and most studied member of the potent carcinogenic pah, is classified by iarc as a group 1 carcinogen and is present in a wide variety of food items. however, other non-carcinogenic pah such as pyrene (pyr) and fluoranthene (fa) are also found in substantial amounts in the diet and are strongly suspected of causing interactive effects. reporter gene assays were used for analyzing interactive effects on the nuclear receptors aryl hydrocarbon receptor (ahr) and constitutive androstane receptor (car) of a ternary mixture including bp, pyr and fa in the relative proportions occurring in food. the observed activations were verified at the gene expression level in the human hepatoma cell line heparg. beside the well-characterized ligand bp, 25 µm pyr and 20 µm fa also activated the ahr, even though to a much lesser extent. no significantly higher activation over the level of bp alone was reached when testing different pah mixtures.
however, in heparg cells the analysis of cyp1a1 gene expression as a model target gene of ahr showed synergistic effects after pah co-exposure. in addition, the activation of human car was analyzed. pyr and fa each proved to be strong agonists, whereas bp was less potent. for the ternary pah mixture with bp, a strong decrease of the induction was observed. this inhibiting effect was verified at the mrna level using the model car target gene cyp2b6 in heparg cells. in conclusion, realistic mixtures containing non-carcinogenic pah can modulate the effects of carcinogenic pah. such effects warrant investigation in more detail to enhance our knowledge of the interaction of pah mixtures at the molecular level.

cytochrome p450 enzymes and transporters are important for the turnover of pharmaceutical compounds. their expression levels and activity influence bioavailability and convey drug-drug interactions. moreover, transporters mediate barrier maintenance of several organs such as the blood-brain barrier and the placenta barrier. overexpression of export transporters in tumors can lead to multiple drug resistance. however, membrane-associated proteins are difficult to quantify by conventional bioanalytical methods such as sandwich immunoassays because of their hydrophobicity. antibody-based analysis of cytochrome p450 enzymes and transporters is challenging due to their sequence homology. therefore, we developed a test system for protein quantification which combines the sensitivity of immunoprecipitation and the specificity of mass spectrometry: this method is especially convenient for hydrophobic proteins because denatured samples are analyzed at the peptide level. one peptide from each protein, which can be assigned unambiguously, is identified via tandem ms and quantified by means of an isotope-labeled reference. prior to ms read-out, the peptides are enriched by antibodies which recognize a very short c-terminal epitope.
these epitopes are selected in such a way that they are common in peptides derived from target proteins and therefore allow the analysis of protein groups with few antibodies. the major advantage of this method is that whole cell or tissue lysates - without preparation of microsomal fractions - can be used for quantification by lc-ms. also, samples from different model organisms can be analyzed with the same assay, which enhances the comparability of experiments.

physiologically based toxicokinetic modeling (pbtk) is an in silico tool to predict compound kinetics based on test-substance-related properties and physiological parameters of the organism. pbtk is a key element of inverse dosimetry to relate effect concentrations in vitro to external, e.g. oral, doses. in our investigations, we use 8-compartment models for the rat including adrenals and testes or ovaries. test-substance-specific properties taken for pbtk modeling are molecular weight and logp o/w as well as ivis-based tissue-specific partition coefficients, hepatic clearance, intestinal permeability and plasma protein binding. berkeley madonna software was applied to solve the resulting differential equations. here we present the above-described model for the 3 test substances bisphenol a (bpa), fenarimol (fen) and ketoconazole (keto). using the lowest effect concentrations (loecs) of bpa, fen and keto from 1) in vitro yeast-based assays with the human estrogen and androgen receptor combined with a reporter gene and 2) an assay addressing interference with steroidogenesis, model calculations were made to relate in vitro concentrations to oral doses in the rat. model calculations, based on in vitro loecs of 10 µm (bpa), 3 µm (fen) and 0.01 µm (keto), for concentrations in target organs resulted in estimated oral loels of 16, 4 and 0.04 mg/kg.
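the compartmental odes behind such a pbtk model can be sketched in miniature: the one-compartment oral-absorption model below (hypothetical ka, ke and vd values, simple forward-euler integration instead of the berkeley madonna solver, and only one compartment instead of eight) merely illustrates the mechanics of turning an oral dose into a predicted concentration:

```python
def peak_plasma_conc(dose_mg, ka=1.0, ke=0.2, vd_l=10.0,
                     dt=0.001, t_end_h=24.0):
    """forward-euler integration of a one-compartment oral model:
       da_gut/dt    = -ka * a_gut
       da_plasma/dt =  ka * a_gut - ke * a_plasma
    returns the peak plasma concentration in mg/l.
    all parameter values are hypothetical, for illustration only."""
    a_gut, a_plasma = dose_mg, 0.0
    c_max, t = 0.0, 0.0
    while t < t_end_h:
        # simultaneous update from the old state (tuple assignment)
        a_gut, a_plasma = (a_gut - ka * a_gut * dt,
                           a_plasma + (ka * a_gut - ke * a_plasma) * dt)
        c_max = max(c_max, a_plasma / vd_l)
        t += dt
    return c_max
```

inverse dosimetry then runs this mapping in reverse: the oral dose is scaled until the predicted tissue or plasma concentration matches the in vitro loec.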
when calculations were made for plasma levels, oral loels were estimated to be 608, 77 and 16 mg/kg for bpa, fen and keto, respectively. when compared to existing in vivo data with endocrine-related loels of 375 mg/kg bw/day for bpa (1), 50 mg/kg/day for fen (2) and 6 mg/kg/day for keto (3), it can be concluded that, for the exemplary test substances addressed, ivis-related risk assessment approaches based on target tissues seem overpredictive, whereas plasma-related loels were closer to the in vivo situation.

to study the activation of the nuclear receptor pxr, a gal4/uas-based pxr transactivation assay was used. the pxr-mediated induction of cyp3a4 promoter activity was investigated using a pxr-dependent cyp3a4 reporter gene assay. cyp3a4 induction was analyzed at the mrna and protein levels in hepg2 cells using qpcr and western blot. to cover the most frequently occurring pa structures (retronecine, heliotridine and otonecine type as well as monoester, open-chain diester and cyclic diester), the four pa senecionine, heliotrine, echimidine and senkirkine were selected as representative pa for initial analyses. of the four investigated pa, only echimidine activated pxr. accordingly, pxr-mediated induction of cyp3a4 promoter activity could only be detected for echimidine. cyp3a4 induction by echimidine was verified at the mrna and protein level in hepg2 wild-type and hepg2 pxr-overexpressing cells.

heinrich-heine universität düsseldorf, institut für toxikologie, düsseldorf, germany

introduction: in higher concentrations, the blood-pressure-regulating hormone angiotensin ii (ang ii) leads to vasoconstriction, hypertension and oxidative stress by activation of the renin-angiotensin system (ras). here we investigate whether nadph oxidases are responsible for ras-mediated oxidative stress in kidney and heart.
nadph oxidases (7 isoforms are known: nox1-5, duox1 and 2) are membrane-bound enzymes that produce reactive oxygen species (ros), for example during the immune response and in cell signaling. material and methods: to clarify the role of nadph oxidases, wild-type mice and nox1-, nox2- and nox4-deficient mice were equipped with osmotic minipumps delivering ang ii at a concentration of 600 ng/kg during 28 days to induce high blood pressure. kidney and heart were investigated for steady-state ros levels and dna damage (dna single- and double-strand breaks). results: in wild-type mice, ang ii leads to hypertension, a decline in renal function, formation of ros in kidney and heart and to dna single- and double-strand breaks in the kidney. all nox-knockout mice exhibited ang ii-mediated hypertension and albuminuria. the lack of nox2 and nox4 could neither protect from the formation of oxidative stress in the kidney nor from dna double-strand breaks in the kidney. initial findings from the nox1-knockout mouse do not show an increase in dna double-strand breaks in the kidney. discussion: contrary to published results for nox1-knockout mice, we observed a constant rise in blood pressure over the treatment period compared to the control. this can possibly be due to different ang ii doses. in nox4-knockout mice we observed increased oxidative stress and increased renal dna damage already in untreated control animals, which is in line with reports suggesting a protective effect of nox4. conclusion: separate elimination of 3 nadph oxidase isoforms did not allow identification of the enzyme responsible for ang ii-induced oxidative stress. a possible explanation is that oxidative stress is caused by more than one nox isoform, or that other enzymes like xanthine oxidase or nitric oxide synthase play a major part in the formation of ros.
of mice (rats, cats, dogs, monkey) and men - how to measure kidney biomarkers across species

introduction and objectives: there is a need for reliable biomarker assays to detect organ toxicity induced by drug candidates. in the last 50 years about 60 drugs were withdrawn from the market due to liver and/or kidney damage. for the detection of drug-induced organ injury, safety-tox studies in rodents, dogs, non-human primates and humans are mandatory in the drug development process. currently, several protein biomarkers for kidney, liver and cardiovascular organ toxicity are being clinically validated by international consortia like the safer and faster evidence-based translation (safe-t) consortium or the predictive safety testing consortium (pstc). we are developing mass-spectrometry-based immunoassays suitable for the detection of these markers in animal models to support these efforts. method: urinary proteins are proteolytically digested to peptides using trypsin. subsequently, synthetic isotope-labelled peptide standards are spiked in at known concentrations. we employ multi-specific antibodies (txp-antibodies) targeting c-terminal amino acid motifs for the enrichment of peptides derived from the protein biomarkers. finally, the protein biomarkers are quantified using nanolc-parallel reaction monitoring-ms. the use of our group-specific txp-antibodies allows protein analysis of samples from different species using the same antibody. results and discussion: we established an ms-based immunoassay platform for the analysis of drug-induced kidney injury (diki) protein biomarkers in urine across 5 species: human, cynomolgus, mouse, rat and dog. we analyzed the potential diki biomarkers aquaporin 2, podocin, synaptopodin, retinol-binding protein 4, clusterin and osteopontin in urine samples from toxicity studies in cynomolgus monkeys, rodents and humans.
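the quantification step described above relies on stable-isotope dilution: the endogenous ("light") peptide is quantified from the ratio of its chromatographic peak area to that of the spiked heavy-labelled standard of known concentration. a minimal sketch of that calculation, with hypothetical peak areas and spike levels (not values from this study):

```python
# stable-isotope-dilution quantification sketch: a heavy-labelled peptide
# standard is spiked at a known concentration, and the endogenous (light)
# peptide concentration follows from the light/heavy peak-area ratio.
# all numbers are hypothetical illustration values, not study data.

def quantify_light_peptide(light_area, heavy_area, spike_conc_fmol_ul):
    """Endogenous peptide concentration from the light/heavy area ratio."""
    if heavy_area <= 0:
        raise ValueError("heavy standard peak not detected")
    return (light_area / heavy_area) * spike_conc_fmol_ul

# hypothetical PRM peak areas for one biomarker-derived peptide
conc = quantify_light_peptide(light_area=2.4e6, heavy_area=1.2e6,
                              spike_conc_fmol_ul=50.0)
print(conc)  # -> 100.0 (fmol/µl)
```

the group-specific txp-antibody only performs the enrichment; the quantification itself is purely this ratio against the co-eluting heavy standard, which cancels matrix and recovery effects shared by the light and heavy forms.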
conclusion: the application of group-specific txp-antibodies and mass spectrometry allows the quantification of biomarkers in urine of all relevant model organisms. the results strongly support the validation of translational drug-induced organ injury protein biomarkers.

although effective anticancer therapeutic regimens are available, they are accompanied by severe adverse effects on normal tissue, for instance chemotherapy-induced peripheral neuropathy (cipn) caused by platinum compounds. the pathophysiology of this clinically highly relevant side-effect is still unknown, and neither prophylaxis nor specific treatment is available. therefore, further research elucidating the underlying molecular mechanisms by which platinating anti-tumor drugs lead to cipn is required as a basis for future development of preventive or therapeutic strategies. in general, platinum compounds lead to cell death mainly via dna damage induction (mostly intrastrand crosslinks) and through interference with the redox homeostasis of cells. here, we introduce and suggest the well-known nematode model organism c. elegans for elucidating mechanisms of neurotoxicity triggered by platinating agents. so far, we determined doses for cis- and oxaliplatin which have only moderate effects on development, reproduction and body movement (muscular read-out). however, these doses are sufficient to trigger apoptosis in c. elegans and to induce a considerable amount of 1,2-intrastrand crosslinks in dna (measured by south-western blotting). even more importantly, they lead to strong neurotoxicity in a functional read-out (pharyngeal pumping). with regard to redox homeostasis, we determined the oxidative stress resistance, showing that, e.g., cisplatin sensitizes c. elegans to reactive oxygen species (ros), which could be prevented if worms were co- or pretreated with n-acetylcysteine. furthermore, we determined the level of ros in living c.
elegans after treatment with platinating agents and also in combination with protective compounds. using the advantages of c. elegans as a genetic model system, we will further clarify the relevance of different defense mechanisms, including dna repair (nucleotide excision repair, base excision repair), detoxification systems (antioxidative stress factors, metallothioneins) as well as drug transporters and signaling proteins. this will be achieved by using rna interference approaches that allow targeting either the whole animal or specific tissues (i.e. neurons) only. first results of this approach will be presented. finally, we aim to use this setup to identify neuroprotective compounds that prevent chemotherapy-induced peripheral neuropathy caused by platinating anti-tumor drugs.

clostridium botulinum c3 exoenzyme (c3) exclusively adp-ribosylates rhoa, b and c, leading to reorganization of the actin cytoskeleton and morphological changes. in addition to the enzyme-based inhibition of rho-gtpases, c3 promotes, in an enzyme-independent manner, axonal and dendritic growth in neurons. as c3 lacks the canonical binding and translocation domains of bacterial protein toxins, cell entry is currently not well understood. based on overlay assays and mass spec analyses, the intermediate filament vimentin was identified as the putative membrane receptor for c3. knockdown of vimentin by sirna and application of the selective vimentin disruptor acrylamide led to a significantly delayed uptake of c3. moreover, addition of extracellular vimentin to cells induced an enhanced uptake of c3. proof-of-principle experiments in astrocytes and neurons from vimentin knockout mice showed c3-induced morphological changes (astrocyte stellation and axon growth) to a reduced extent and a significantly delayed uptake of c3 compared to wild type cells. as vimentin knockout did not completely inhibit c3 uptake into cells, an additional uptake mechanism or an additional receptor for c3 is likely.
nevertheless, our data reveal that c3 employs a specific endocytosis mechanism involving the intermediate filament vimentin to gain access to host cells.

the primary target organ of organic hg species-mediated toxicity is the central nervous system (cns). humans are exposed to organic hg mainly in the form of methylmercury (mehg) via the consumption of contaminated fish and other seafood products. in terrestrial food sources hg is mostly found as inorganic hg. thiomersal is a further organic hg compound, which is used as a preservative in medical preparations. exposure to organic hg promotes primarily neurological effects. understanding the transfer mechanisms with regard to the cns is an important precondition for an evaluation of hg species-induced neurotoxicity. thus, primary porcine in vitro models of the blood-brain barrier and the blood-cerebrospinal fluid (csf) barrier were used to investigate effects of mehgcl, thiomersal and hgcl2 on the barriers as well as transfer properties into and out of the cns in vitro. the results show significant transfer differences between the various incubated species as well as between the different barrier systems. whereas the blood-brain barrier seems to account for the transfer of organic hg species from the blood side to the brain side, these species are transferred in the contrary direction by the blood-csf barrier. inorganic hgcl2 was not transferred across either brain barrier towards the brain side but was able to leave the brain side across the blood-brain barrier. additionally, cytotoxic effects of the hg species by themselves as well as of the combination of organic and inorganic hg species have been investigated in human astrocytes and human differentiated neurons. differentiated neurons were much more sensitive towards all hg species. organic species exerted stronger cytotoxic effects in both cell types as compared to hgcl2.
interestingly, a coincubation of organic and inorganic hg species led to an increased cytotoxicity in the astrocytes. this combined cytotoxic effect is currently being investigated in differentiated neurons. the species-specific differences with respect to both effects on and transfer across the blood-brain and the blood-csf barrier in vitro, as well as toxic effects in brain target cells, clearly emphasize the necessity for comparative analyses.

introduction: the neural crest is a multipotent stem cell population that arises at the neural plate border during early fetal development. neural crest cells (nccs) migrate to target sites in the periphery, where they differentiate into multiple cell types, including melanocytes, cranial bones and peripheral neurons. failure of ncc migration can lead to severe disorders, such as hirschsprung's disease. aim: to test whether toxicants interfere with human ncc migration, a high-throughput migration assay was established. this test system was used to screen an 80-compound library of potential developmental toxicants. methods: nccs were derived from human embryonic stem cells. the cells were allowed to migrate for 24 h before toxicants were added to the cells. migration and viability of the cells were then measured after another 24 h by high-content image analysis and a custom-developed software package. results: the screening library was assembled by the us national toxicology program (ntp) and consisted of different substance classes, e.g. organophosphates, organochlorines, drug-like compounds, pesticides and polycyclic aromatic hydrocarbons (pahs). out of the tested potential developmental toxicants, 26 compounds reduced ncc migration at non-cytotoxic concentrations. hit-confirmation testing confirmed 23 of the compounds as concentration-dependent inhibitors of ncc migration. among the potential developmental toxicants identified here, there were several organophosphates (e.g.
chlorpyrifos) and drug-like compounds as well as polybrominated diphenyl ethers (pbdes) and organochlorine pesticides (e.g. ddt and dieldrin), while none of the tested pahs inhibited ncc migration. the negative controls in the screening library, like acetylsalicylic acid, acetaminophen and saccharin, proved to be non-toxic. conclusion/outlook: the newly established test system allows screening of potential developmental toxicants in a high-throughput manner for interference with human ncc migration. confirmation in other types of migration assays is ongoing, and selected compounds from amongst the screen hits are undergoing mechanistic evaluation.

oxidative stress is regarded as a major trigger for neuronal dysfunction and death in the ageing brain and in multiple neurodegenerative disorders. how oxidative stress mediates neuronal death, and whether the associated mechanisms are accessible for therapeutic intervention strategies, has not been clarified. increasing evidence suggests, however, that oxidative stress triggers molecular mechanisms of regulated necrosis that involve the activation of receptor interacting protein 1 (rip1) independently of death receptor activation. here, we show that erastin-induced ferroptosis, which involves inhibition of the glutamate-cystine antiporter (xc-), glutathione depletion and lethal formation of reactive oxygen species (ros) [1], triggers mechanisms of regulated necrosis independent of tnfα signaling. in hippocampal ht-22 cells, erastin promotes activation of rip1 and subsequent rip1-rip3 necrosome formation, which is regarded as a hallmark of regulated necrosis [2].
in fact, silencing of rip1 by sirna or its inhibition by the rip1 inhibitor necrostatin-1 prevents ferroptosis-induced cell death, whereas the ferroptosis inhibitor ferrostatin-1 fails to protect cells against tnfα-induced classical necroptosis, a form of programmed cell death that is mediated by receptor interacting protein-1 (rip1) and rip3 kinases downstream of death receptor activation (e.g. tumor necrosis factor receptor tnfr) [2,3]. recently, a genome-wide sirna screen linked cylindromatosis (cyld) to rip1/rip3-dependent necroptosis [4], and also in the present paradigm of ferroptosis, cyld depletion promotes neuronal survival and decreases rip1-rip3 complex formation, suggesting a role of cyld in intrinsic pathways of regulated necrosis triggered by oxidative stress.

the ns5a inhibitor daclatasvir is used in combination with other antivirals, such as the polymerase inhibitor sofosbuvir, for treatment of chronic infection with the hepatitis c virus. daclatasvir is embryotoxic and teratogenic in rats and rabbits at exposures at or above the clinical exposure. in contrast, no teratogenic effects were observed in rat and rabbit developmental toxicity studies with ledipasvir, another ns5a inhibitor. we studied these compounds in the embryonic stem cell test (est) alone and in combination with sofosbuvir. the ns5a inhibitors were obtained from selleckchem; the main metabolite of sofosbuvir, psi-6206, was from medchem express. murine embryonic stem cells (es-d3) were obtained from atcc. they were kept in iscove's modified dulbecco's medium (imdm). substances were dissolved in dmso at a final dmso concentration of 0.1% in the culture medium. a cytotoxicity assay as well as a differentiation assay were performed. after 10 days in culture the cells were evaluated. cytotoxicity was measured by an mtt test. differentiation into contracting myocardial cells was determined using direct phase contrast microscopy.
the substances were tested at concentrations between 0.1 and 30 mg/l, which broadly covers the therapeutically relevant concentrations reached in patients. at concentrations of 10 mg daclatasvir/l medium and higher, the substance inhibited differentiation of the cells. we observed contracting myocytes in 23, 22 and 2 wells out of 24 wells in total at concentrations of 1, 3 and 10 mg/l, respectively. at 30 mg/l no differentiation was observed. effects on cell viability were observed at 30 mg/l. unexpectedly, we found a higher potency with ledipasvir. at the low, therapeutically relevant concentration of 1 mg/l, this ns5a inhibitor showed a clear impact on differentiation, with 6 out of 24 wells affected, and no differentiation at higher concentrations. addition of sofosbuvir or its main metabolite psi-6206 at concentrations up to 30 mg/l had no influence on the concentration-effect curves established for daclatasvir or ledipasvir. this is the first indication of an embryotoxic potential of ledipasvir. the reason for the difference from the results of the routinely performed animal experiments is unknown. possibly, metabolic activity in the maternal organism is responsible for this discrepancy.

dimoxystrobin is a european-registered pesticidal active ingredient. biologically, it acts as an inhibitor of the fungal respiratory chain. for the purpose of european registration, a full set of toxicological studies has been conducted with dimoxystrobin, including reproduction toxicity studies (according to the most recent oecd tg 416) and developmental toxicity studies (oecd tg 414) in rats and rabbits. dimoxystrobin interferes with iron transport in rats and mice. this leads to lower serum iron levels and anemia in rats after repeated exposure. this holds true for treated dams and offspring in reproduction toxicity studies.
furthermore, offspring effects seen at the high dose of the 2-generation toxicity study were a hypochromic microcytic anemia, impaired body weight development, which only developed postnatally, and reversible cardiomegalies in some 21-day-old pups. for all effects clear noaels were determined. in the 2-generation toxicity study no dose adjustment during pregnancy and lactation was performed, which resulted in considerably higher food and compound intakes in dams and offspring during these life stages. as a result, it seemed that pups were more severely affected by body weight effects compared to the parental generation. by performing a life-stage-specific comparison of body weights and substance intakes, as well as benchmark dose (bmd) calculations for these parameters, it could be demonstrated that the points of departure (pods) and the loaels for direct dimoxystrobin-related effects were comparable for offspring and parents. the heart effects (cardiomegaly), which were reversible, occurred only after direct dimoxystrobin exposure and are considered to be secondary to the detected offspring anemia. both effects (lower body weights and offspring cardiomegalies) only occur postnatally and are not the consequence of in-utero exposure, as no respective effects were seen at higher doses in rat prenatal toxicity studies. two new mechanistic studies (a 1-generation toxicity study and a 3-week study in young and adult rats, additionally investigating serum iron levels and anemia) confirmed that pups and young rats were not more sensitive than adult animals to developing anemia or decreased serum iron levels. in 2006, dimoxystrobin was classified with r63 (possible risk of harm to the unborn child) by the ecb, which was the european authority responsible for classification and labeling before echa in helsinki was formed. the r63 (which has been translated into the ghs classification repr.
2, h361d) was based on the offspring body weight and heart effects seen in the 2-generation toxicity study. based on a comprehensive re-evaluation of existing and new data on dimoxystrobin, the conclusion can be drawn that a classification for reproduction toxicity is scientifically not justified and should be reconsidered.

perfluorooctanesulfonic acid (pfos) and perfluorooctanoic acid (pfoa) are perfluorinated substances (pfas) which are used for the fabrication of surfaces with water- and dirt-repellent properties. due to their reprotoxic properties and their persistence in the environment, the use of pfos was restricted in 2009 and a restriction program for pfoa was initiated in 2013. therefore, industry is switching to pfoa and pfos substitutes, which are predominantly pfas with a shorter carbon chain length, or structure-related compounds. in contrast to pfoa and pfos, only few toxicological data are available for their substitutes. the aim of this study was to examine endocrine effects of the substitutes perfluorohexanesulfonic acid (pfhxs), perfluorobutanesulfonic acid (pfbs), perfluorohexanoic acid (pfhxa), perfluorobutanoic acid (pfba) and 2,3,3,3-tetrafluoro-2-(heptafluoropropoxy)propionic acid (genx) in comparison to pfoa and pfos. a hek-293t cell-based dual-luciferase reporter gene assay was used to investigate the potential of these compounds to affect the activity of the human estrogen receptors herα and herβ. the reporter gene assay revealed no activation of herα or herβ by the pfas tested in this study. to investigate the potential inhibition of herα and herβ by pfas, a coincubation with the estrogen receptor agonist 17β-estradiol was performed. none of the tested pfas inhibited herα or herβ activity. however, in the case of herβ an enhancement of 17β-estradiol-stimulated activity was observed.
thus, pfas do not directly activate or inhibit the human estrogen receptors but have an impact on herβ activity, as they amplify the activation mediated by 17β-estradiol. further studies will be conducted to examine this synergistic effect in more detail.

the xeer-reporter cell line: a novel dual-color luciferase reporter assay for simultaneous detection of estrogen and arylhydrocarbon receptor activation (p. tarnow)

consumers are exposed to a multitude of anthropogenic and natural substances capable of activating or inhibiting ligand-activated transcription factors. this in turn can lead to adverse health effects, particularly for substances acting on signalling pathways that are subject to regulatory crosstalk, such as xenoestrogens and polycyclic aromatic hydrocarbons (pahs). xenoestrogens are known to activate human estrogen receptors (ers), whereas pahs or dioxins act on the arylhydrocarbon receptor (ahr). importantly, both receptor signalling pathways are interconnected by a complex crosstalk on multiple levels, ranging from direct protein-protein interactions to competition for common co-factors. however, although this crosstalk has long been known, we still lack a deeper understanding of its molecular mechanisms and physiological implications. one reason for this is a lack of tools to visualise and investigate receptor interaction in vivo. based on the breast cancer cell line t47d, we thus developed a dual-colour reporter assay which allows time-resolved simultaneous monitoring of the activation of er and ahr in living cells. the assay uses two beetle luciferases emitting luminescence in the red (slr) and the green (eluc) spectrum, respectively. while eluc is expressed under the control of a 6-fold repeated xenobiotic response element (xre), slr is subject to transcription regulation by a 6-fold repeated estrogen response element (ere).
both constructs were stably transfected into t47d human breast cancer cells, which endogenously express erα and ahr and are thus ideally suited for monitoring interactions with both receptors. the respective "xeer" cell line has been successfully subjected to proof-of-principle studies using prototypical er and ahr ligands as well as various phytochemicals and xenobiotics. besides e2 and tcdd, the ligands included various pahs, polychlorinated biphenyls, alpha- and beta-naphthoflavone, cosmetic ingredients (butylparaben, benzophenone-2 and 4-mbc), bisphenol a, genistein, resveratrol, diindolylmethane as well as pharmacological antagonists of both receptors.

asian women consuming soy-rich food throughout life possess lower levels of 17beta-estradiol (e2) in plasma (pl) than western women, whose diet is characterized by less soy consumption during early life and possible intake of soy-based dietary supplements during adulthood. however, the impact of these soy exposure scenarios on estrogen (biotrans)formation and the consequences thereof in female mammary glands (mg) has not been investigated yet. thus, female august copenhagen irish rats were fed either an isoflavone (if)-depleted diet (idd, western exposure scenario without if supplement) or an if-rich diet (ird, asian exposure scenario) until the end of the study at postnatal day (pnd) 81. furthermore, rats fed idd until pnd 74 were fed ird for 7 days (idd+ird, western dietary exposure scenario with if supplement). estrous cycle stage was determined histologically. levels of transcripts were determined by qpcr, and e2 and estrone (e1) in pl and mg were quantified by gc-ms/ms. statistical analyses of estrogens were performed by kruskal-wallis and unpaired wilcoxon tests, and of transcript levels by linear regression models considering the explanatory variables tissue levels of e2 and diet (idd vs ird and idd vs idd+ird). e2 levels in pl and mg did not coincide with those predicted by estrous cycle stage.
furthermore, median levels of e1 and ratios of e2/e1 in mg and the ratio of e2 levels in pl/mg were not affected by diet. in contrast, diet tended to affect e2 concentrations in pl (p=0.1211) due to an increase in the ird group (p=0.0056), whereas e2 levels in the idd+ird group only tended to be elevated (p=0.0788). in mg, ird and idd+ird increased e2 levels only weakly (p=0.0788 each). likewise, besides significant changes in transcript levels of cyp1a1 and 1a2 in the idd+ird group (and, not significantly, also in the ird group), putatively decreasing oxidation of e2 to catechols, no changes in transcript levels putatively affecting e2 levels were observed. moreover, no decrease in levels of transcripts indicative of cellular (oxidative) stress (gclc, tp53, mt1a) was observed in the idd+ird group. e2 mg levels were significantly associated with an increase in transcript levels of areg and pgr, indicating activation of the estrogen receptor (er). in contrast, ird was associated with a significant, and idd+ird with a not significant, decrease in pgr transcript levels. e2 levels, but not diet, were significantly associated with gata3 transcript levels, indicating tissue differentiation. furthermore, levels of transcripts involved in intercellular communication (egfr, wnt4) were significantly decreased by idd+ird and not significantly by ird, and differed from those affected by e2 (increase in gdf15, hgf, igf1r, wnt5a). bmf, a marker transcript for apoptosis, was increased by ird, but not affected by e2, and even decreased (not significantly) by idd+ird. taken together, despite an increase in e2 levels in pl, less er activation was observed after dietary exposure to if. whereas e2 and transcript levels of enzymes involved in e2 (biotrans)formation, as well as er activation and cellular communication, were affected similarly but to a different extent in both asian and western if exposure scenarios, differences in apoptosis were observed between the ird and idd+ird groups.
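the statistical workflow described in this abstract (kruskal-wallis across the diet groups, unpaired wilcoxon, i.e. mann-whitney u, for pairwise comparisons, and linear models for transcript levels with tissue e2 and diet as explanatory variables) can be sketched as follows. all values below are synthetic illustration data, not the study's measurements, and group sizes are assumed for the example:

```python
# synthetic illustration of the statistical workflow described above:
# kruskal-wallis across the three diet groups, unpaired wilcoxon
# (mann-whitney u) for a pairwise comparison, and a linear model for a
# transcript level with tissue e2 and dummy-coded diet as explanatory
# variables. none of the numbers are study data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
e2_idd = rng.normal(10.0, 2.0, 12)      # plasma e2, idd group (a.u.)
e2_ird = rng.normal(13.0, 2.0, 12)      # ird group, shifted upwards
e2_idd_ird = rng.normal(11.5, 2.0, 12)  # idd+ird group

# global comparison across the three diet groups
h_stat, p_global = stats.kruskal(e2_idd, e2_ird, e2_idd_ird)

# pairwise unpaired wilcoxon (mann-whitney u) test, idd vs ird
u_stat, p_pair = stats.mannwhitneyu(e2_idd, e2_ird, alternative="two-sided")

# linear model: transcript ~ intercept + e2 + diet dummies
e2_all = np.concatenate([e2_idd, e2_ird, e2_idd_ird])
diet_ird = np.repeat([0.0, 1.0, 0.0], 12)
diet_idd_ird = np.repeat([0.0, 0.0, 1.0], 12)
transcript = 0.3 * e2_all + rng.normal(0.0, 0.5, 36)  # true e2 slope 0.3
X = np.column_stack([np.ones_like(e2_all), e2_all, diet_ird, diet_idd_ird])
coef, *_ = np.linalg.lstsq(X, transcript, rcond=None)
print(p_global, p_pair, coef[1])  # fitted e2 slope should be near 0.3
```

dummy-coding the diet contrasts (idd as reference) mirrors the reported comparisons idd vs ird and idd vs idd+ird within one regression.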
supported by dfg le 1329/10-1.

august copenhagen irish (aci) rats with 17β-estradiol (e2)-releasing implants are an accepted model to study the etiology of breast cancer, but neither e2 (biotrans)formation in mammary gland tissues (mg) during tumorigenesis, nor the impact of isoflavones (if), shown to affect tumorigenesis in aci rats, has been investigated yet. therefore, at postnatal days (pnd) 75 and 175, placebo (-e2) or silastic implants containing 4 mg e2 were implanted in female aci rats exposed to either an if-depleted diet (idd) or an if-rich diet (ird) from conception until the end of the study at pnd 285. palpable mg tumors (pt) and 1-2 mg per animal without pt were characterized histologically and categorized into normal (-e2 group, n=12), hyperplasia, and non-pt and pt with and without solid tumors (+e2 group, n=32). e2, estrone (e1), their hydroxylation products and methylation (meo-) products thereof, as well as conjugates of e1 and e2, were analyzed in plasma and mg by gc- and uhplc-ms/ms, respectively. levels of 49 transcripts involved in (biotrans)formation of e2 and estrogen receptor (er) activation were determined by taqman®-pcr. without exogenous e2, plasma e2 as well as e1 and (borderline) e2 levels in mg were higher in ird. plasma e2 as well as e1 and e2 levels in mg were lower in the -e2 group than in the +e2 group. e2 levels as well as e2/e1 and e2 mg/plasma ratios were elevated in pt, accompanied by a significant increase in transcript levels indicative of estrogen receptor activation (areg, pgr) and proliferation (mki67). ird increased the e2/e1 ratio in pt and, although it did not affect er activation (areg, pgr), increased differentiation (gata3) in normal and hyperplastic tissues and tended to decrease proliferation in hyperplastic tissues (ccnd1). levels of e1 and 2-meo-e1 were highest in hyperplastic tissues, accompanied by an increase in transcript levels of hsd17b2 (conversion of e2 to e1) and cyp1a1.
transcript levels of gstm1 and gstm2 were decreased in the whole +e2 group, and of gstt1 and gstt3 in hyperplastic tissues, possibly decreasing inactivation of electrophilic metabolites. accordingly, maximum transcript levels of tp53 and mt1a, indicating cellular (oxidative) stress, were observed in hyperplastic tissues. ird affected neither levels of 2-meo-e1 nor cellular stress markers (gclc, mt1a, tp53). of note, neither 4-meo-e1, nor e1 catechols, nor e2 catechols, nor methylation products of the latter were observed in any sample. furthermore, no conjugates of e1 or e2 were detected in plasma and mammary gland tissues. thus, changes in transcript levels of conjugating enzymes induced by tumorigenesis and by ird were not related to detectable conjugate levels of e1 or e2. taken together, whereas hyperplastic tissues were characterized by maximum oxidative metabolism of e1 and cellular (oxidative) stress, pt exhibited the highest e2 levels and er activation. ird increased differentiation and decreased proliferation in normal and hyperplastic tissues but increased the e2/e1 ratio in pt. supported by dfg le-1329/10-1.

the level of 17beta-estradiol (e2) in human breast tissue is considered to affect breast cancer initiation, promotion and progression. although putatively beneficial and adverse effects of soy isoflavones (if) on the human mammary gland, in particular in western women, have been discussed extensively, the influence of if levels on estrogen formation in human mammary gland tissue has not been investigated yet. thus, glandular tissues were dissected from 37 mammoplasty specimens obtained from women (aged 18-66 years) not taking estrogen-active drugs. 14 of these women had been exposed to if by their usual diet or by intake of a soy-based dietary supplement for 7 days prior to mammoplasty. information on soy consumption and lifestyle was collected by questionnaire, and tissues were characterized histologically.
genistein, daidzein, their conjugates (n=12) and bacterial metabolites (n=7), as well as the estrogens estrone (e1)-sulfate, e1, e2 and 2-methoxy-e1, were determined by uhplc- and gc-ms/ms, respectively, and transcript levels of 19 enzymes involved in e2 (biotrans)formation were quantified by taqman®-pcr in glandular tissues. isoflavonoids were categorized into the if parameters aglycones (agl) and conjugates (con) of either genistein, daidzein or the sum of both, and were further statistically analyzed by spearman's rank correlation analysis. a positive correlation of the e2/e1 ratio with agl(+con) was observed in glandular tissues (r=0.49, p=0.002), accompanied by a significant negative correlation of e1 levels with agl (r=-0.35, p=0.032), possibly due to reduced expression of 17beta-hydroxysteroid dehydrogenase 2 (conversion of e2 to e1), as indicated by a weak negative correlation of transcript levels of 17beta-hydroxysteroid dehydrogenase 2 with agl+con (r=-0.25, p=0.080). further statistical analysis taking into account multiple variables using linear regression models will provide more insights into the variables affecting the e1/e2 ratio. taken together, the estrogen profile in human glandular breast tissue seems to be affected by if levels. supported by dfg le-1329/10-1.

allergic contact dermatitis (acd) is a widespread disease often caused by substances in consumables. the eu prohibits the testing of cosmetic ingredients in vivo. this necessitates the development of reliable in vitro testing strategies. activation of dendritic cells (dcs) represents a key step during sensitization, as they are essential for selection and priming of allergen-specific effector t cells. in an integrated omics approach we aimed to further elucidate the molecular mechanisms of dc activation using quantitative metabolomics and proteomics.
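the spearman rank correlation analysis used in the preceding isoflavone/breast-tissue abstract (e.g. e2/e1 ratio vs aglycone(+conjugate) levels) can be sketched with scipy as follows. the paired values are synthetic illustration data, not the study's measurements:

```python
# spearman rank correlation sketch, mirroring the analysis of e2/e1 ratio
# vs isoflavone aglycone(+conjugate) levels in the abstract above.
# the paired values below are synthetic illustration data.
from scipy import stats

agl_con = [0.2, 0.5, 0.9, 1.4, 2.1, 2.8, 3.5, 4.0]        # if levels (a.u.)
e2_e1 = [0.30, 0.28, 0.35, 0.40, 0.38, 0.45, 0.50, 0.48]  # e2/e1 ratios

rho, p = stats.spearmanr(agl_con, e2_e1)
print(round(rho, 2), round(p, 4))  # a strongly positive rank correlation
```

because spearman's coefficient operates on ranks rather than raw values, it is robust to the skewed, non-normal concentration distributions typical of such tissue measurements, which is presumably why it was chosen over pearson correlation here.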
monocytic thp-1 cells were used as a model system and treated with the sensitizer 2,4-dinitrochlorobenzene (dncb; 5, 10 and 20 µm) and the irritant sodium dodecyl sulfate (sds; 100 µm). samples were taken after 4, 8 and 24 hours. thp-1 activation was analyzed by measuring the established activation markers cd86 and cd54 after 24 hours. a targeted lc-ms/ms approach was used to analyze 188 metabolites, including amino acids and lipids. protein levels were quantified by nano-lc-maldi-ms/ms after stable isotope labeling by amino acids in cell culture (silac). data sets were examined by multivariate analyses for identification of biomarker candidates. regulated metabolites and proteins were subjected to pathway analysis. the data presented might contribute to the further development of suitable in vitro testing methods for chemical-mediated sensitization.

drug-induced liver injury (dili) is one of the most frequent causes of acute liver injury and a main cause of drug withdrawals. currently, no reliable models are available to test the dili potential of new compounds. kupffer cells (kc) play an important role in hepatic cell stress, mediated through chemokines and release of endogenous proteins. kc activation by damaged or stressed hepatocytes can lead to activation of the nf-κb signaling pathway, transmitted by reactive oxygen intermediates (roi). we have recently established a liver model composed of primary human hepatocytes (phh) and kc which enables investigation of immune reactions after induction of hepatocyte stress (kegel et al., 2015). the aim of the present study was the kinetic investigation of hepatic cell stress induction and macrophage activation after treatment with subtoxic concentrations of hepatotoxic drugs. primary human hepatocytes (phh) and kc were isolated from resected human liver tissue using a two-step collagenase perfusion technique. initial kc activation was characterized by the dcf assay and immunofluorescence staining.
phh were incubated with different concentrations of acetaminophen (apap) and diclofenac (dic) for different time intervals. cell stress was evaluated by measurement of oxidative stress (dcf assay) and viability (xtt assay). in order to simulate macrophage activation following hepatocyte damage, kc and macrophages derived from the monocytic cell line thp-1 were incubated with supernatants of phh treated with hepatotoxic compounds. kc and thp-1-macrophage activation were investigated by measuring intracellular formation of roi using the dcf assay and cell activity using the xtt assay. the characterization of kc activation revealed a donor- and disease-dependent kc activation resulting in kc differentiation into pro- and anti-inflammatory macrophages. therefore, kc were substituted by macrophages derived from thp-1 cells. evaluation of hepatic cell stress showed the strongest effect on thp-1-macrophages when phh were incubated with apap or dic for 4 h. treatment of kc and thp-1-macrophages with supernatants of phh challenged for 4 h with hepatotoxic compounds indicates that thp-1-derived macrophages react similarly to kc when treated with phh supernatants in terms of cell activity and roi production. in conclusion, thp-1-derived macrophages might be a suitable alternative to kc concerning macrophage activation. the evaluated kinetic window of 4 h, covering hepatic stress induction and immune reaction, allows these measurements to be performed. since the use of cerium dioxide nanoparticles is known to be beneficial, e.g. in terms of reducing fuel consumption when added to diesel fuel, it has become a frequently used nanomaterial. to compensate for the concurrent lack of information on its toxicology, a 90-day nose-only inhalation study was initiated.
by comparing the results to a combined chronic inhalation toxicity and carcinogenicity study using the same test items and experimental conditions (basf, ludwigshafen, germany), early indicators for genotoxic and carcinogenic effects should be determined. rats were exposed to 0, 0.1, 0.3, 1 and 3 mg/m³ ceo 2 as well as 50 mg/m³ baso 4 nanoparticles (6 h/day, 5 days/week, 13 weeks). animal dissections were conducted at five time points (exposure day 1 and 28; recovery day 1, 28 and 90) aiming for endpoints mandatory according to oecd guideline 413. additionally, gene expression analyses in isolated pneumocytes type ii were performed using pathway arrays for inflammation, oxidative stress, genotoxicity, apoptosis and lung cancer. these results are intended to identify marker genes displaying modulated expression in response to nanoparticle exposure. investigations on ceo 2 and baso 4 retention in the lung are also included in this project. in bronchoalveolar lavage fluid (balf) a time- and dose-dependent increase of inflammatory cells was detected up to the end of exposure. the amount of inflammatory cells decreased during post-exposure; however, in the high-dose group a persistent inflammation up to 90 days was detected by balf and histopathology examination. our current results suggest effects of ceo 2 nanoparticles on the respiratory system. their relevance in the context of long-term effects such as tumor development needs to be estimated considering all investigations included in this study. the inhalt-90 project is funded by the german federal ministry of education and research (bmbf) -03x0149a. core or coating material? what dictates the uptake and translocation of nanoparticles in vitro? nanoparticles play an increasingly important role in consumer-related products. understanding the interactions between nanoscaled objects and living cells is therefore of great importance for risk assessment.
in this context, it is generally accepted that nanoparticle size and shape are crucial parameters regarding the potential of nanoparticles to penetrate cell membranes and epithelial barriers. current research in this field additionally focuses on the particle coating material. in order to distinguish between core- and coating-related effects in nanoparticle uptake and translocation behavior, this study investigated two nanoparticles equal in size, coating and charge but different in core material. silver and iron oxide were chosen as core materials to ensure similar nanoparticle characteristics after particle synthesis. nanoparticles were coated with poly(acrylic acid) (pas) and extensively characterized by tem (transmission electron microscopy), saxs (small-angle x-ray scattering), zetasizer tm and nanosight tm . for uptake and transport studies the widely used human intestinal caco-2 model in a transwell tm system with subsequent elemental analysis (aas) was used. for evaluation and particle visualization, transmission electron microscopy (tem) and ion beam microscopy (ibm) were conducted. although similar in size, charge and coating material, the behavior of the particles in caco-2 cells was quite different. the internalized amount was comparable, but pas-coated iron oxide nanoparticles were additionally transported through the cells. by contrast, pas-coated silver nanoparticles remained in the cells. our findings suggest that the coating material influenced only the uptake of the nanoparticles, whereas the translocation was determined by the core material. in summary, a core-dependent effect on nanoparticle translocation was revealed. both the uptake and transport of nanoparticles in and through cells should be considered when discussing nanoparticle fate and safety.
nanotechnology is having a great impact not only on basic research but also on many sectors of industry, opening the market for numerous new applications ranging from electronics to the health care system. besides their great innovative potential, the large variety of existing synthetic nanomaterials used in the last decade represents a major challenge for scientists and regulators in terms of measuring and assessing the potential hazard caused by the materials or the products themselves. equally, consumers often miss reliable and easy-to-understand information on nanomaterials and nanotechnology and do not know where to get such information. therefore, the international dana 2.0 expert team brings together its expertise and knowledge from different research areas dealing with all aspects of nanosafety research in order to create and provide an easy-to-understand, up-to-date and quality-approved nanomaterials knowledge base on www.nanopartikel.info. this information platform covers the 25 market-relevant nanomaterials, focusing on their effects on the safety of humans and the environment. in order to manage and assess the rapidly increasing number of publications related to nanosafety issues, the dana 2.0 project developed a customised methodology, the «literature criteria checklist», which includes mandatory and desirable assessment criteria covering physico-chemical characterisation, sample preparation and (biological) testing parameters. this checklist facilitates the discrimination between high- and low-quality publications, and all positively evaluated literature is then fed into the dana knowledge base. accounting for the need to harmonise experimental practices, the dana team also developed a template for standard operating procedures (sop) to support careful scientific practice. validated protocols generated within the german bmbf-funded nanosafety research projects are presented together with results from the swiss ccmx project v.i.g.o. and are available for download.
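the «literature criteria checklist» described above separates mandatory from desirable assessment criteria; the exact scoring rule is not given in the text, so the following is a purely hypothetical sketch of such a two-tier screen (function name, threshold and decision rule are all assumptions, not the dana methodology):

```python
def evaluate_publication(mandatory_met, desirable_met, desirable_total, threshold=0.5):
    # hypothetical two-tier screen: every mandatory criterion must be met,
    # plus at least `threshold` of the desirable criteria; the actual dana
    # checklist rule is not specified in the abstract
    if not all(mandatory_met):
        return "rejected"
    if desirable_total and desirable_met / desirable_total < threshold:
        return "rejected"
    return "accepted"
```

under this sketch, a publication failing any mandatory criterion (e.g. missing physico-chemical characterisation) is excluded regardless of its other merits.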
another unique feature of the dana knowledge base is the integrated application-based database that provides a unique link between nanomaterials in real applications (e.g. environmental remediation or medical products) and their potential impacts/toxicological effect(s), which can be easily accessed by the interested visitor. additionally, dana 2.0 provides a list of faqs, a link platform with contact data for other information portals and the opportunity to directly pose questions to our experts via e-mail. dana 2.0 is also present on twitter, follow us @nano_info. background: particulate matter from combustion processes enhances cardio-vascular diseases and increases associated mortality rates. around 13% of total pm10 emissions are emitted by wood burners (uba 2006). how wood combustion aerosols (particles and gases) can affect human lung cells, and how such cellular responses depend on the usage of different wood types and burners, is widely unknown. methods: in an exposure chamber imitating the human respiratory tract, human alveolar cells (a549) were exposed at an air-liquid interface (ali) to gases and particles of wood combustion aerosols. log wood of beech, birch and spruce was burnt in a conventional oven and compared to the combustion of wood pellets in a modern pellet burner. the combustion aerosols were diluted 1:40 and directly delivered to the exposure chamber. after 4 h exposure the lung cells were lysed and rna was isolated. in an array-based transcription analysis of the whole genome, the effects of the aerosol exposures on lung cells were assessed. in parallel, physical and chemical parameters of the combustion aerosols were analyzed. results: the combustion aerosol of wood pellets contained less organic substances than the log wood aerosols, but was higher in its zinc content. genome-wide, we found a higher number of regulated genes with combustion of pellets compared to combustion of log wood.
the gas phase alone (filtered aerosol) showed gene regulatory activity comparable to the particle-containing total aerosol. aerosol from log wood burning mainly induced genes of xenobiotic metabolism and cellular signaling. pellet aerosols additionally regulated apoptosis and dna repair processes. conclusions: modern pellet burners reach better combustion efficiencies than conventional log wood ovens, but their emissions seem to stress human lung cells more strongly. one reason might be the higher zinc content of wood pellet aerosols. multiwalled carbon nanotubes (mwcnts) may pose a risk similar to asbestos in causing cancer, notably mesothelioma, which is a malignant tumor originating from mesothelial cells. to identify molecular cues leading to mesothelioma development, we performed genome-wide transcriptome analysis using microarrays in primary human peritoneal mesothelial lp9 cells treated with two different tumor-inducing tailor-made mwcnts (rat model; rittinghausen et al. 2014 part fibre toxicol 11:59), or amosite asbestos at 3 µg/cm2 for 24 h. specifically, we determined how the transcriptomic changes of the highly tumorigenic mwcnt a would differ from another tumor-inducing mwcnt of lesser potency (mwcnt d), long amosite asbestos as positive control, and milled mwcnt a as material control. initial analysis using bioinformatic tools revealed 3788 significantly differentially regulated genes for mwcnt a, 1680 for mwcnt d, 145 for amosite, and 4 for milled mwcnt. further analyses with ingenuity pathway analysis comparing the two different mwcnt types and amosite found common as well as exclusive biomarkers. interestingly, we identified many differentially regulated genes implicated in cellular senescence, a growth arrest in response to different stressors including dna damage, disrupted chromatin, and strong mitogenic signals. paradoxically, cellular senescence can represent both tumor suppression and tumor promotion mechanisms.
more importantly, we found differential expression of genes associated with the senescence-associated secretory phenotype (sasp), such as inflammatory cytokines, chemokines, proteases, and growth factors, which were many-fold up- and down-regulated in mwcnt a compared to mwcnt d and amosite. the mechanisms leading to mesothelioma induction by mwcnts are far from clear, but the key information emerging from the present transcriptomic data, together with our previously identified senescence markers, indicates that cellular senescence has a likely role. nanotechnology offers great advantages for the food industry despite its partly unknown risks, whose elucidation is the main target of nanotoxicology. due to variability in terms of size, material, shape, surface texture and several endogenous influences, the toxicity of the most frequently used and ingested nanomaterials is difficult to estimate. therefore, the aim of this study was the in vitro investigation of toxicological endpoints such as cell viability, dna integrity and the induction of apoptotic processes in human colon carcinoma cells (ht29). for this purpose, ht29 cells were exposed for 24 hours to metal nanoparticles (gold, silver) and metal oxide nanoparticles (copper oxide, titanium dioxide, zinc oxide) in concentrations of 2-10 µg/ml. first, the cellular uptake of the nanoparticles was determined by means of icp-ms. the influence on cell viability was demonstrated by trypan blue staining and the mtt assay. the alkaline comet assay gave information about possible dna damage, and the use of the repair enzyme formamidopyrimidine dna glycosylase (fpg) additionally allowed the detection of oxidized bases. the induction of programmed cell death was examined using the annexin v-fitc assay. the icp-ms data showed a maximum particle content of 1.39 pg per cell for the used concentration range.
the metal oxide nanoparticles resulted in a significant reduction of cell viability, with a decrease of up to 40% after copper oxide and zinc oxide treatment. among the metal particles, only for silver was a reduced cell metabolism of about 50% detectable by the mtt assay. low genotoxic effects could be determined for silver nanoparticles (tail intensity about 12%; control about 6%), while for titanium dioxide the amount of oxidized bases was additionally increased (tail intensity about 20%; control about 6%) at concentrations above 8 µg/ml. induction of apoptosis was determined for silver particles (up to 24% early apoptotic and 20% late apoptotic cells) as well as for titanium dioxide and zinc oxide (10% early apoptotic cells each), whereby the most significant increase in late apoptotic cells was detected for zinc oxide (up to 90%). the results obtained in our studies indicate a clear particle-dependent influence on cell viability and apoptosis-triggering processes, depending on the material used or the concentration deployed, while only minor changes of dna integrity were detected. the evolution in the field of nanotechnology has led to a variety of novel materials at the nanoscale. among them are different carbon materials like buckyballs, graphene nanoplates and carbon nanotubes (cnts). cnts are hollow carbon fibres with either one (sw) or multiple sidewalls (mw). mwcnts usually show a diameter of up to 100 nm and can be several micrometres long. because of their nanoscale diameter, cnt uptake can take place directly through the plasma membrane of cells by the so-called nanoneedle effect [1]. additionally, cnts, like most nanomaterials, show a high surface-to-volume ratio and, because of their microscale length, a potentially high loading capacity. these properties make cnts interesting for potential use as drug delivery carriers (ddcs).
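viability figures such as the "decrease of up to 40%" above come from normalizing an assay readout (e.g. mtt or wst absorbance) to the untreated control; a minimal sketch of that normalization (the od values in the test are hypothetical, not data from the study):

```python
def viability_percent(treated_od, control_od, blank_od=0.0):
    # normalize a colorimetric readout (e.g. mtt/wst absorbance) to the
    # untreated control after subtracting the cell-free blank
    return 100.0 * (treated_od - blank_od) / (control_od - blank_od)
```

a "reduction of 40%" then corresponds to a viability_percent of about 60 relative to control.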
mwcnts, produced via chemical vapour deposition with a diameter of 45 nm and 16 µm length, were used in three different forms: unmodified, acid oxidized (ox_cnt) and ground. cytotoxicity testing was performed in human umbilical vein endothelial cells (huvec). the cells were seeded in 48-well plates and exposed to doses of 1, 5, 10 and 25 µg/cm² growth area of the respective cnt type for 24 h. the wst-8 assay was applied for testing cell viability and the ldh cytotoxicity assay to identify potential damage to the plasma membrane and to calculate overall cytotoxicity. the results show that an increased oxidation time for the ox_cnts, in a h 2 so 4 /hno 3 mixture, leads to decreased cytotoxicity in huvec compared to unmodified mwcnts. during the oxidation, reactive oxygen groups are formed on the cnt surface [2]. these groups lead to a reduced hydrophobicity of the cnt surface, which could be responsible for the decline in cytotoxicity. future investigations will include the toxicological analysis of mwcnts functionalized with polyethylene glycol (cnt-peg). the hydrophilic polymer peg will be covalently bound to the cnt surface and is expected to further reduce the cytotoxic effect. for these investigations, different analytical methods will be used: among others, cell cycle analysis, the brdu assay, pathway arrays and qrt-pcr will be applied to investigate gene expression, and cytokine levels will be measured. these methods will involve a co-culture model of huvec and human umbilical vein smooth muscle cells (huvsmcs) for a better approximation of the cellular in vivo situation. additionally, the peg-modified mwcnts will be tested for their loading capacity and efficacy with the anticancer drug doxorubicin for a potential use as an intravenous drug delivery carrier in vivo. although aluminium is one of the most common elements in the biosphere, up to now little is known about its impact on human health.
aluminium and its chemical derivatives are highly abundant in food, food contact materials and consumer products, which leads to exposure via the gastrointestinal tract (gi tract), the lung and via skin contact. recently, aluminium has been hypothesized to be associated with cancer and neurodegenerative disorders. lately, due to increasing attention to this topic, limit values for food additives have been tightened by the eu commission. the cellular effects of aluminium, and especially of aluminium-containing nanomaterials, are in the focus of our research activities, for example in the international solnanotox project. we established an in vitro simulation system of the gi tract, where nanomaterials undergo the different physiological, chemical and protein-biochemical conditions of saliva, gastric juice and the intestine. the artificially digested nanomaterials, as well as soluble aluminium chloride as ionic control substance, were subjected to several analytical and biochemical methods to characterize their change of appearance and their cytotoxic effects on intestinal cellular models. we observed the fate of the nanomaterials during typical ph values of saliva, gastric and intestinal juice with dynamic light scattering measurements and icp-ms in single particle mode. after observable disappearance at ph 2, the particles recovered in the simulated intestinal fluid. the simulation of the gi tract, mainly the change of ph settings, may lead to a certain chemical activation of aluminium that can increase bioavailability in the intestine after oral uptake of aluminium-containing food products. in vitro assays like ctb, mtt and cellular impedance measurements showed that there were no acute cytotoxic effects measurable over a period of up to 48 h after incubation, comparable to undigested particles. in contrast, high amounts of digested aluminium ions showed additional effects on cell viability compared to non-digested aluminium ions.
although the toxicological potential of al ions for healthy tissue appears to be low, an increased hazardous potential cannot be ruled out for pre-damaged tissue and can be relevant for special consumer groups with, for example, chronic intestinal inflammation or dietary habits combined with high exposure to al-containing food products. in the eu, there is a strong need for solutions to substitute halogenated flame retardant (hfr) additives employed in the fabrication of fr thermoplastic and thermoset materials. these materials are used in diverse commercial products, applications and markets, such as electrical/electronic devices, low-voltage wires or household appliances. the phoenix project, funded by the european union 7 th framework program (grant agreement no. 310187), therefore investigates e.g. several tailor-made, nano-layered hybrid particles as alternatives to hfr additives. considering "safer-by-design" strategies, potential fr nanomaterials (nm) were physico-chemically characterized (e.g. particle size distribution, zeta potential) and screened early in development for their (geno)toxic and pro-inflammatory potential, to timely reject nm with a high health hazard. as inhalation is the most important exposure route for nm, lung-relevant cells were used as in vitro screening models. to better enable detection and differentiation of the biologic effects of the most promising nm, screening was started with primary rat lung alveolar macrophages (am; cells of first contact for inhaled nm) at a high concentration of 50 µg/cm 2 (24 h of incubation), using membrane damage (lactate dehydrogenase release assay), direct dna damage (alkaline comet assay), and il-8 liberation (elisa) as primary endpoints, and quartz dq12 and al 2 o 3 as particulate positive and negative controls, respectively. in this screening system, biologically inert nm could be differentiated from more active ones.
thereby, mg(oh) 2 nanoplatelets (mg1; mean lateral size: 1.5-2 µm; mean thickness: 15-20 nm) represented the least, and pristine few-layer graphene nanoplatelets (gr1; mean lateral size: 2 µm; mean thickness: 3 nm; graphene layers: 8 ± 0.5) the most biologically active nm. clear concentration dependencies were detected for gr1 in follow-up experiments. mg1 and gr1 were further tested in other lung-relevant cell types, i.e. mrc-5 primary human lung fibroblasts and a549 lung adenocarcinoma epithelial cells. interestingly, mrc-5 cells were less sensitive towards biological effects of gr1 compared to am, whereas a549 cells showed nearly no effect, keeping in mind that lung epithelial cells are the target cells of lung tumor development. to test the hypothesis that the observed cell-specific differences in sensitivity might in part be based on cellular uptake, cells were exposed for 24 h to 12.5 or 25 µg/cm 2 of gr1 on chamber slides. slides were finally stained with dapi and analyzed by dark field microscopy. cells indeed demonstrated differences in uptake capacity and also showed unique patterns of cellular localization of gr1, i.e. tight perinuclear agglomeration in am, a more scattered cytoplasmic distribution in mrc-5 cells and limited uptake in a549 cells. additionally, the biological activity of the diverse nm seemed also to correlate with cellular uptake, as determined by light and dark field microscopy in am and mrc-5 cells. in conclusion, an am-based screening system was able to differentiate the biological activity of diverse nm, with morphology, physico-chemical characteristics, and related cellular uptake most likely being key for nm- and cell type-specific hazard. lung carcinogenicity and putative systemic effects of low-dose life-time inhalation exposure to biopersistent nanoparticles were examined in a chronic inhalation study performed according to oecd test guideline no. 453 with several protocol extensions.
female rats (100/group) were exposed to cerium dioxide (nm-212; 0.1, 0.3, 1 and 3 mg/m³) for two years; a control group was exposed to clean air. after one year of exposure, 42 µg/lung was found in animals exposed to 0.1 mg/m³ and 2.6 mg/lung in animals exposed to 3 mg/m³. histological examination revealed several adverse and non-adverse effects in the lung. the non-adverse effects comprised accumulation of particle-laden macrophages in alveolar/interstitial areas and in the balt, particle-laden syncytial giant cells in the balt and bronchiolo-alveolar hyperplasia (alveolar bronchiolization). the adverse effects included (mixed) alveolar/interstitial inflammatory cell infiltration, alveolar/interstitial granulomatous inflammation, interstitial fibrosis and alveolar lipoproteinosis. the incidence and severity of the effects were concentration-related. alveolar lipoproteinosis was not observed at the low concentrations of 0.1 and 0.3 mg/m³ ceo 2 . neither pre-neoplastic nor neoplastic changes were observed after 12 months of exposure. a no observed adverse effect concentration could not be established in this study. the comprehensive histopathological examinations of lungs and other tissues will be finalized in 2017. this project is part of the eu project nanoreg. moreover, the german federal ministry for the environment, nature conservation, building and nuclear safety, the german federal institute for occupational safety and health and the german federal environment agency funded this project. lung carcinogenicity and putative systemic effects of low-dose life-time inhalation exposure to biopersistent nanoparticles were examined in a chronic inhalation study performed according to oecd test guideline no. 453 with several protocol extensions. female rats (100/group) were exposed to cerium dioxide (nm-212; 0.1, 0.3, 1 and 3 mg/m³) and barium sulfate (nm-220; 50 mg/m³) for two years; a control group was exposed to clean air.
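lung burdens such as the 42 µg/lung above are commonly compared with the theoretically inhaled dose estimated from aerosol concentration, ventilation and exposure time; a sketch of this standard inhalation dosimetry estimate (the minute volume and deposition fraction below are illustrative assumptions for a rat, not values from the study):

```python
def deposited_dose_ug(conc_mg_m3, minute_volume_l_min, hours, deposition_fraction):
    # inhaled air volume in m³ over the exposure period
    inhaled_m3 = minute_volume_l_min * 60 * hours / 1000.0
    # deposited mass in µg = concentration (mg/m³) × volume (m³) × fraction × 1000
    return conc_mg_m3 * inhaled_m3 * deposition_fraction * 1000.0

# illustrative only: ~0.2 l/min minute volume, 10% alveolar deposition,
# one 6-hour exposure day at 1 mg/m³
daily_dose = deposited_dose_ug(1.0, 0.2, 6, 0.1)
```

summing such daily estimates over the exposure schedule, minus clearance, is what makes the measured burdens interpretable.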
lung burdens and burdens in extrapulmonary tissues were measured at various time points. the two-year exposure period was successfully terminated and 50 animals per dose group were examined for organ burden and histopathology. the remaining animals are currently kept exposure-free for a maximum of 6 additional months. up to two years of exposure to both nanoparticles did not lead to body weight reduction compared to control animals. the mortality rates were in an acceptable range. macroscopically evident tumors were not detected after two years. the ceo 2 lung burdens were maximally 3.5 mg/g lung tissue at the highest exposure concentration of 3 mg/m³. in comparison, the highest ceo 2 burdens in organs remote from the site of exposure were found in liver and spleen, with maximally roughly 1 x 10 -3 g/g tissue. in brain, maximum ceo 2 levels were 7 x 10 -6 mg/g tissue. baso 4 lung burdens were comparatively low (1 mg/g) within the first 13 weeks of exposure and steeply increased to 6 mg/g lung tissue after one year. the comprehensive histopathological examinations of lungs and other tissues will be finalized in 2017. the european centre for ecotoxicology and toxicology of chemicals (ecetoc) 'nano task force' proposes a decision-making framework for the grouping and testing of nanomaterials (df4nano) that consists of 3 tiers to assign nanomaterials to 4 main groups, to perform sub-grouping within the main groups and to determine and refine specific information needs. the df4nanogrouping covers all relevant aspects of a nanomaterial's life cycle and biological pathways, i.e. intrinsic material and system-dependent properties, biopersistence, uptake and biodistribution, cellular and apical toxic effects. use (including manufacture), release and route of exposure are applied as 'qualifiers' within the df4nano to determine if, e.g., nanomaterials cannot be released from a product matrix, which may justify the waiving of testing.
the four main groups encompass (1) soluble nanomaterials, (2) biopersistent high aspect ratio nanomaterials, (3) passive nanomaterials, and (4) active nanomaterials. the df4nano aims to group nanomaterials by their specific mode of action that results in an apical toxic effect. this is eventually directed by a nanomaterial's intrinsic properties. however, since the exact correlation of intrinsic material properties and apical toxic effects is not yet established, the df4nano uses the 'functionality' of nanomaterials for grouping rather than relying on intrinsic material properties alone. such functionalities include system-dependent material properties (such as dissolution rate in biologically relevant media), bio-physical interactions, in vitro effects and release and exposure. the df4nano is a hazard and risk assessment tool that applies modern toxicology and contributes to the sustainable development of nano-technological products. it ensures that no studies are performed that do not provide crucial data and therefore saves animals and resources. the grouping decisions of df4nano for 24 nanomaterials were validated against grouping by results of existing in vivo data and demonstrated 23 concordant grouping decisions. serum concentrations in cell culture medium influence the effect of cerium oxide nanoparticles on human lung a549 cells: cytotoxicity, proinflammation, cellular uptake, and particle properties k. burchardt the wide use of cerium oxide (ceo 2 ) nanoparticles (np), e.g. as fuel additive and for industrial and biomedical applications, has evoked intense studies to understand the effects those particles might have on living organisms. contradictory results of ceo 2 nanoparticle toxicity have been published, as they depend on many variables like shape, size, diluent and others.
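the four df4nanogrouping main groups listed above can be pictured as a simple decision cascade; the following is a deliberately simplified sketch — the actual framework applies three tiers with quantitative criteria and qualifiers, and the boolean inputs and their ordering here are assumptions:

```python
def assign_main_group(dissolves_in_biological_media, high_aspect_ratio, shows_biological_activity):
    # simplified cascade over the four df4nanogrouping main groups;
    # the real framework uses tiered, quantitative criteria, not plain booleans
    if dissolves_in_biological_media:
        return 1  # soluble nanomaterials
    if high_aspect_ratio:
        return 2  # biopersistent high aspect ratio nanomaterials
    if shows_biological_activity:
        return 4  # active nanomaterials
    return 3  # passive nanomaterials
```

in this sketch, solubility is checked first because a dissolving material is assessed like its ionic/molecular form, making the remaining nano-specific criteria moot.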
in our work we used an in vitro model of ceo 2 nanoparticles and lung carcinoma cells to investigate the role of the serum content in the cell culture medium on cellular toxicity, particle uptake, proinflammation and particle characteristics. proinflammatory ceo 2 np with an average diameter of 15-30 nm and different concentrations were diluted in cell culture medium with different fetal calf serum (fcs) concentrations (0.01, 0.1, 1 and 10%) and were used to expose human lung adenocarcinoma cells (a549) for up to 24 hours. at 100 µg/ml, ceo 2 np showed little to no toxic effect on growth-arrested a549 cells at fcs concentrations of 1% or below, but cell viability was decreased to about 80% in proliferating cells in cultures with 10% fcs. the proinflammatory effect of ceo 2 np was investigated through measurement of il-8 mrna expression after 3 hours and cellular il-8 secretion after 24 hours. the qrt-pcr showed that the expression of il-8 mrna in cells treated with 100 µg/ml ceo 2 np in 10% fcs medium was three times higher than in cells treated with lower fcs concentrations. this finding correlated with the cellular il-8 secretion, which showed a stronger increase in cells treated at 10% fcs. differences in the cellular uptake of ceo 2 np were determined by fluorescence-activated cell sorting (facs) after 2 and 24 hours of exposure. after 2 hours, the cells treated with ceo 2 np in 10% fcs medium showed a lower mean granularity (∆gmean), as a measure of cellular particle uptake, than those with less fcs in the medium. after 24 hours all probes showed about the same granularity. to examine the effect of the fcs concentrations on ceo 2 np characteristics in cell cultures, we used dynamic light scattering (dls) and phase contrast microscopy (pcm). dls measurements revealed an increasing hydrodynamic diameter of the particles with decreasing fcs concentrations (about 1030 nm (10% fcs) to 4090 nm (0.01% fcs)), which correlated with increasing particle agglomeration shown by pcm.
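relative il-8 mrna expression by qrt-pcr, as above, is commonly quantified with the 2^(−ΔΔct) method of livak and schmittgen, assuming a reference gene is measured alongside the target; a minimal sketch with hypothetical ct values (the abstract does not name the reference gene used):

```python
def fold_change(ct_target_treated, ct_ref_treated, ct_target_control, ct_ref_control):
    # livak 2^(-ΔΔct): normalize the target ct to a reference gene within
    # each condition, then compare treated vs. control
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_treated - delta_control)
```

a "three times higher" il-8 expression as reported above corresponds to a fold_change of about 3 under this calculation.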
our results show that the fcs concentration in cell culture medium has a direct or indirect impact on the cytotoxicity, the proinflammatory effect, the facs parameter for cellular particle uptake as well as on particle properties, which should be taken into account when designing, performing and interpreting in vitro experiments to investigate the toxicity of nanoparticles. toxic and inflammatory effects of shape-engineered titanium dioxide nanoparticles in nr8383 rat alveolar macrophages. knowledge about the contrasting toxicity of nanoparticles (np) of different chemical composition has steadily increased over the past decade. however, the available literature often reveals considerable differences in effects within a specific type of nanomaterial. these contrasts have been attributed to different handling and testing protocols as well as to sample-specific differences in physico-chemical properties of np that could affect their mode of interaction with cells. within the nanometrology project setnanometro, the highly controlled generation and characterisation of a large set of shape-engineered tio 2 np allows us to investigate the potential role of subtle shape and surface structure changes on np toxicity. as inhalation represents the most relevant uptake route of np, the nr8383 rat alveolar macrophage cell line was selected for in vitro toxicological testing. since oxidative stress and inflammation are considered key biological pathways in nanotoxicity, we evaluated the expression of the oxidative stress marker genes heme oxygenase-1 (ho-1) and g-glutamylcysteine synthetase (g-gcs) as well as the pro-inflammatory genes interleukin (il)-1β, il-6, il-18 and inducible nitric oxide synthase (inos) by qrt-pcr. protein levels of il-1β and tumour necrosis factor-α (tnf-α) were measured by elisa. cytotoxicity testing of the tio 2 np by wst-1 assay overall revealed only minimal toxicity in comparison to sio 2 np, which were used as reference material.
ho-1 and γ-gcs mrna analyses indicated that specific tio2 np triggered a moderate induction of oxidative stress. il-6 was only induced after sio2 treatment, whereas il-18 was not affected by any of the tested np. in contrast, various tio2 np caused a significant induction of il-1β mrna expression. however, no significant induction of il-1β and tnf-α protein secretion was observed for any of the tio2 np. the results obtained from these and ongoing investigations will be linked to the physico-chemical database being developed for all tio2 np within the setnanometro project, with the overall aim to build and model nano-structure activity relationships (nsar) for this widely applied type of nanomaterial. acknowledgements: the setnanometro project is supported by the eu-fp7 programme. specific types of tio2 particles were obtained from solaronix (switzerland), evonik and cristal. potential of silver and silver nanoparticles to reduce n-acetyltransferase 1 (nat1) activity. j. lichter, b. blömeke, trier university, department of environmental toxicology, trier, germany. humans are exposed to various kinds of engineered nanoparticles including silver, which is frequently used in consumer and biomedical products due to its bactericidal properties. despite their widespread usage, knowledge about influences on cellular functions is still incomplete. n-acetyltransferase 1 (nat1), an enzyme which is ubiquitously expressed in human tissues, catalyzes the transfer of an acetyl group to its substrates and, although its endogenous function is not clear yet, it is well known to be involved in the n-acetylation of arylamines. in addition, nat1 enzyme activity is known to be modulated by non-substrates including metals and certain nanoparticles; however, the influence of silver on nat1 has not been analyzed yet. 
to address whether human nat1 is a target of silver nanoparticles and released ions at the protein level, purified nat1 was exposed to silver ions (agno3) and silver nanoparticles (ag10-cooh, average size 10 nm, carboxyl functionalized), and nat1 enzyme activity was analyzed via the n-acetylation of the nat1 substrate para-aminobenzoic acid (paba). to this end, purified nat1 (1 ng/µl) was co-exposed for 20 min to paba (1 mm) and agno3 or ag10-cooh (0.01, 0.1, 1, 10 and 100 µg/ml each), resulting in a nat1:silver ratio of 1:0.01-100 (w/w). both agno3 and ag10-cooh inhibited the n-acetylation of paba in a concentration-dependent manner. using equal amounts of silver and nat1 (w/w, 1 µg/ml), enzyme activity was reduced by about 98±0.2% (agno3) and 82±0.6% (ag10-cooh). the lowest concentration analyzed (0.01 µg/ml) reduced nat1 activity by about 24±5% (agno3) and 17±5% (ag10-cooh). fifty percent activity reduction was caused by 0.11 ± 0.01 µg/ml of agno3 and 0.21 ± 0.04 µg/ml of ag10-cooh, which is 10-fold lower compared to the published ic50 values for other metal oxide nanoparticles (3-15 µg/ml). these data indicate that both chemical silver species are able to modulate paba acetylation. further studies will be performed to clarify whether silver ions and/or silver nanoparticles could affect the specific n-acetylation of arylamines in human cells. colloidal silver has been used in medicine for centuries and nanosilver is present in many consumer-related products. however, despite intense research in the past few years, the potential of nanosilver to induce effects different from ionic silver in vivo and in vitro is still under debate. in this study, we compared proteomic effects of nanosilver (agpure™) and ionic silver (silver acetate) in the kidney of male rats after repeated oral delivery in a 28-day rat toxicity study. 
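half-maximal inhibitory concentrations like those reported for agno3 and ag10-cooh above can in principle be estimated from a concentration–inhibition series by log-linear interpolation between the two points bracketing 50% activity; a minimal sketch, using hypothetical remaining-activity data rather than the abstract's raw measurements:

```python
import math

def ic50(points):
    """Estimate the concentration giving 50% remaining activity by
    log-linear interpolation between bracketing (conc, % activity) points."""
    pts = sorted(points)  # ascending concentration
    for (c1, a1), (c2, a2) in zip(pts, pts[1:]):
        if a1 >= 50 >= a2:  # activity falls through 50% in this interval
            frac = (a1 - 50) / (a1 - a2)
            logc = math.log10(c1) + frac * (math.log10(c2) - math.log10(c1))
            return 10 ** logc
    return None  # 50% never crossed in the tested range

# hypothetical remaining-activity data (% of control) at µg/ml concentrations
est = ic50([(0.01, 83.0), (0.1, 52.0), (1.0, 2.0)])
print(round(est, 2))  # 0.11
```

fitting a full four-parameter logistic curve would be the more rigorous choice; the interpolation above is only the simplest estimate.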
in order to avoid overt signs of toxicity, silver was dosed moderately in amounts of 60 and 6 mg/kg body weight for nanosilver and corresponding amounts of silver acetate (9.3 and 0.93 mg/kg). accordingly, no pathological effects, including in results from clinical chemistry and hematology, were reported. kidney tissue protein crude extract was separated by 2-d gel electrophoresis and differentially expressed spots were identified by maldi-ms. 374 unique proteins, showing a log2 ratio of ≤ -0.3 for downregulation and ≥ 0.3 for upregulation, were identified in all treatment groups. protein lists were analyzed with ingenuity pathway analysis (ipa). when comparing effects of particulate and ionic treatments, similar alterations were indicated for canonical pathways associated with glycolysis, gluconeogenesis and the tricarboxylic acid cycle. regarding inflammatory responses, stronger effects were derived for ionic treatments. for both types of silver exposure, changes of protein expression were linked to changes of fatty acid metabolism and nrf2-mediated stress. mitochondrial dysfunction was highlighted for both nanosilver treatments only, as was activation of the insulin receptor. in the top-scored network of the higher-dose nanosilver treatment, the upregulated 14-3-3 protein zeta (ywhaz) displays a central position. ywhaz, an important regulator of cell cycle and apoptosis, interacts with the insulin receptor and is well known to be involved in many types of cancer. overall, both forms of silver treatment revealed similar patterns of affected cellular and molecular functions in rat kidney, supporting common and overlapping mechanisms of particulate compared to ionic silver. because of the widespread application of nanomaterials and the fact that for some nanomaterials effects on different organisms were shown, nanomaterials are still in the focus of interest. moreover, the fate of nps is only partially assessed over the lifecycle of products containing nanomaterials. 
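the log2-ratio thresholds used in the proteomics comparison above (≥ 0.3 for upregulation, ≤ -0.3 for downregulation) amount to a simple three-way classifier of spot intensities; a minimal sketch with illustrative intensity values (not the study's data):

```python
import math

UP, DOWN = 0.3, -0.3  # log2-ratio thresholds from the proteomics screen

def classify(treated: float, control: float) -> str:
    """Classify a protein spot by its log2 expression ratio treated/control."""
    ratio = math.log2(treated / control)
    if ratio >= UP:
        return "upregulated"
    if ratio <= DOWN:
        return "downregulated"
    return "unchanged"

print(classify(150, 100))  # log2(1.5) ≈ 0.58  -> upregulated
print(classify(95, 100))   # log2(0.95) ≈ -0.07 -> unchanged
```

note that a log2 ratio of ±0.3 corresponds to only about a 1.23-fold change, i.e. a fairly permissive cutoff.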
while general toxicological properties of nps are well described in diverse in vitro and in vivo experiments, the distribution of these particles during the whole and complex process of waste incineration shows large knowledge gaps. in the "nanoemission" project, the entire route from the residual material via incineration and filtering of the exhaust gas up to a possible release into the environment is considered, together with the toxicological evaluation of effects on humans and the environment. in these experiments the influence of thermal waste treatment on the toxicological profile of nanoparticles contained in the waste will be described. after a complete characterization of the two types of baso4-nps from two different manufacturers by scanning electron microscopy/energy dispersive x-ray analysis (sem/edx), measurement of the specific surface area (bet) and dynamic light scattering (dls), we investigated the impact of the pure baso4-nps on primary cells (normal human bronchial epithelial cells (nhbec) and peripheral lung cells (plc)). both materials show statistically significant cytotoxic effects in the resazurin assay (viability decreased below 40% for 1 mg/ml after 72 h). in general the effects of both nps were largely similar. additionally, the effects of the baso4-nps were compared in more detail. uptake of the baso4 was quantified by icp-ms after 24 h and 72 h, as was the release of ba ions into the cell culture medium after centrifugal separation. incubation of plc and nhbec with 0.1 mg/ml baso4 over 24 h and 72 h leads to ~200 µg baso4 per 10^6 cells. the uptake is dose dependent but not time dependent. the impact on the secretion of inflammatory cytokines was determined by bead-based multiplex elisa flow cytometry. tnf-α, il-8 and il-33 could be detected in nhbec and plc after np incubation. the possible induction of apoptosis was measured by flow cytometry as well. 
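the uptake figure above (~200 µg baso4 per 10^6 cells) is an icp-ms measurement normalized to cell number; a minimal sketch of that normalization, with hypothetical per-well values rather than the study's data:

```python
def uptake_per_million_cells(total_ug: float, cell_count: float) -> float:
    """Normalize a measured particle mass (µg) to µg per 10^6 cells."""
    return total_ug / (cell_count / 1e6)

# hypothetical well: 400 µg BaSO4 measured by ICP-MS in 2 x 10^6 cells
print(uptake_per_million_cells(400.0, 2e6))  # 200.0 µg per 10^6 cells
```

comparing this normalized value across doses and time points is what supports the "dose dependent but not time dependent" conclusion above.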
first investigations showed no induction of apoptosis for either material. the impact of both nps on the intracellular glutathione level was measured by hplc and showed a decrease of gsh after 72 h. summing up, baso4-nps showed toxic effects in primary human lung cell cultures after 72 h at concentrations under 1 mg/ml. c3 exoenzyme from c. botulinum is an adp-ribosyltransferase that selectively inactivates rhoa, b and c by coupling an adp-ribose moiety. rho-gtpases represent a molecular switch integrating different receptor signalling to downstream transcriptional cascades that regulate various cellular processes, such as regulation of the actin cytoskeleton, cell proliferation and apoptosis. previous studies with the murine hippocampal cell line ht22 revealed a c3-mediated inhibition of cell proliferation and a prevention of serum-starved cells from apoptosis (rohrbeck et al., 2012). former results of studies on map kinase signalling indicated c3-induced modulations of downstream signalling modules. therefore, ht22 cells treated with 500 nm c3 for 48 h were used to screen the activity of 48 different transcription factors by luciferase reporter assays. five transcription factors, namely sp1, atf2, e2f-1, cbf and stat6, were identified as significantly regulated in their activity. for validation of the identified transcription factors, studies on the protein level of certain target genes were performed. western blot analyses exhibited an enhanced abundance of the sp1 target genes p21 and cox-2 as well as a rise in phosphorylation of c-jun. in contrast, the level of apoptosis-inducing gadd153, a target gene of atf2, was decreased. our results suggest that c3 is able to modulate the activity of transcription factors whose target genes are involved in the regulation of cell proliferation and apoptosis. via covalent binding of chemical compounds to dna, adducts can be formed. 
as a consequence, mutations may occur, which represent stages of chemical mutagenesis and carcinogenesis. the isolation of leucocytes represents an essential field of work within dna-adductomics of blood cells, in which adducts serve as markers of exposure for biological monitoring. peripheral mononuclear blood cells (pbmc) comprising monocytes and lymphocytes can be separated from other blood cells like granulocytes, erythrocytes (and dead cells) by isopycnic density gradient centrifugation of buffy coats (bcs). bcs are blood concentrates rich in leucocytes and thrombocytes, prepared from whole blood samples through removal of plasma and erythrocytes. for the experiments only bcs from healthy donors were used. while erythrocytes contain no dna, lymphocytes, which constitute a subcategory of leucocytes, carry genetic information. a variety of commercially available fluids with a density of 1.077 g/cm3, based on different polymers, exists. these materials form the gradient required for the centrifugation. furthermore, there are several methods available, which differ in the ratio of blood vs. fluid, duration of the different work-up steps, centrifugation parameters and others. the aim of this work was to compare the effectiveness of the different fluids and optimize the workflow. for isolation of pbmc the fluid was put in a tube and then covered by a thin layer of blood. after centrifugation, five layers were obtained (figure 1). the interphase was removed carefully and washed, and the cells were counted. the yield is expressed as the percentage of isolated vs. total leucocytes in the bc, which was based on the leucocyte count of the whole blood sample. as separation fluids, ficoll® paque plus (ge healthcare), histopaque® (sigma aldrich) and lsm® (ge healthcare) were tested. no significant differences between the different fluids were observed. although dilution is often recommended in the literature, it was found that dilution had no effect on the yield. 
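the yield definition above (isolated vs. total leucocytes in the bc, as a percentage) and its summary across buffy coats can be sketched as follows, with hypothetical cell counts rather than the study's data:

```python
from statistics import mean, stdev

def yield_percent(isolated: float, total: float) -> float:
    """PBMC yield as % of isolated vs. total leucocytes in the buffy coat."""
    return 100.0 * isolated / total

# hypothetical (isolated, total) leucocyte counts (x 10^6) for three buffy coats
counts = [(250, 500), (180, 600), (320, 640)]
yields = [yield_percent(i, t) for i, t in counts]
print(yields)                                   # [50.0, 30.0, 50.0]
print(round(mean(yields), 1), round(stdev(yields), 1))
```

the mean ± sd printed at the end mirrors how the overall yield variation between buffy coats is reported.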
similarly, using different fluids (rpmi or pbs) for the washing steps did not reveal any differences. increasing the centrifugation speed from 400 to 500 × g resulted in higher yields. in general, large variations in yield (46.8% ± 17.1%) arose among the various bcs. this result is due to differing parameters such as age, storage, and leucocyte and erythrocyte counts of the different bcs used. therefore, the test conditions were optimized using the same batch of bc. the results show that the low-priced separation fluids are comparable in performance to the more expensive ones. by direct lamination with bc (without dilution) the wastage could be minimized, the yield increased and thus the isolation made more efficient. the franz cell is a well-established model in many fields of skin research. we adapted the diffusion cell system to additives and contaminants of consumer products which are designated for skin contact. we aimed to simulate real exposure as realistically as possible. to meet requirements like longevity, haptic properties and production costs, different polymers are used as the raw material of choice and modified by a variable number of additives in the majority of commodities. besides additives with a defined function such as plasticizers, stabilizers, colorants and vulcanization accelerators, contaminants as well as decomposition products or so-called non-intentionally added substances (nias) can be part of the material, among them potentially harmful substances. in a first step, polymers of consumer products like flashlights or tools were characterized concerning additive composition. possible breakdown products were identified by means of gc-ms/ms or pyrolysis-gc-ms. we focused on analytes of toxicological relevance including antioxidants such as n-phenyl-2-naphthylamine (neozon d), which is suspected of causing cancer, and on degradation products like cresol and its derivatives (e.g., mesitol or 2-tert-butyl-4-methylphenol). 
subsequently, analytes of interest were brought into direct skin contact using porcine, human and artificial skin models in the franz cell chamber assay. the analytes were either followed up layer by layer using tape stripping or examined utilizing cryosections. for visualization purposes, analytical evaluation was complemented by imaging techniques like h&e staining or atr-ftir (attenuated total reflectance fourier transform infrared) microscopy. the latter was used with intrinsic markers for tissue-specific distribution. our project provides evidence for the potential of polymer components to overcome the natural epidermal barrier and, in part, to enter the viable layers of the epidermis. during skin contact with consumer products several substances migrate out of the matrix and penetrate the skin, among them substances which are hazardous to health, such as neozon d. depending on its dose, the blister agent sulfur mustard (sm) may lead to painful erythema, blistering with complicated wound healing, pulmonary edema, pulmonary bleeding and temporary blindness. sm is listed as a schedule 1 chemical in the chemical weapons convention and thus its production, stockpiling and use are prohibited. sm still represents a serious threat for civilians and military forces, especially in asymmetric, terroristic and accidental scenarios. after exposure, the highly reactive molecule alkylates nucleophilic sites in endogenous biomacromolecules, forming the characteristic and stable hydroxyethylthioethyl (hete) residue. hence, bioanalytical methods targeting these adducts in forensic post-exposure verification analysis are of high interest. herein, we present an optimized, accurate and comparably simple method to detect adducts of sm and human serum albumin (hsa) alkylated at its cysteine-34 residue. 
since albumin extraction from human plasma is a time-consuming and expensive step of the established procedure, an alternative method for direct proteolysis of human plasma was developed. plasma samples were cleaved directly with pronase, resulting in the alkylated dipeptide hete-cysteine-proline (hete-cp), which is detected by micro liquid chromatography-electrospray ionization high-resolution tandem mass spectrometry (µlc-esi hr ms/ms). in order to optimize reproducibility and yield of proteolysis, kinetics were investigated for different kinds of plasma (edta, citrate and heparin) as well as serum. two different mass spectrometers, a triple quadrupole system (4000 qtrap) and a hybrid quadrupole time-of-flight instrument (tt5600+), were compared. the latter proved to be the more selective and sensitive system. the method was successfully applied to in vitro and in vivo samples of real cases of sm poisoning. organophosphorus compounds (op), which were originally intended to be used as pesticides to increase agricultural yields at the onset of the 20th century, still represent a considerable threat to human health. by irreversible inhibition of acetylcholinesterase, op lead to a cholinergic crisis due to an uncontrolled increase of acetylcholine in the synaptic cleft. finally, sudden death by respiratory failure may result if medical countermeasures are lacking. therefore, exceptionally toxic compounds were designed and synthesized as chemical warfare agents (cwa), among which v-type nerve agents, i.e. vx, chinese vx and russian vx, are among the most toxic man-made substances. recent events like the first gulf war, the terrorist attacks in tokyo and the conflict in syria underline the need for ongoing and strict surveillance of cwa prohibition by the chemical weapons convention. unambiguous evidence of such substances (verification analysis) plays an important role with great political and legal impact. 
a variety of such bioanalytical methods have been established at the bundeswehr institute of pharmacology and toxicology in munich, which is responsible for medical chemical defence in germany. exhibiting quite short half-lives in vivo, nerve agents can hardly be detected days or even weeks after exposure. accordingly, there is a great need for additional long-term biomarkers like specific protein adducts. consequently, the current work focuses on the examination of adducts between nerve agents and human serum albumin (hsa), as its high abundance and stability in vivo provide relative ease of sampling. after incubation of hsa with v-type nerve agents in vitro, the protein was subjected to proteolysis. subsequently, the resulting peptides were separated using microbore high-performance liquid chromatography (µlc) and detected on-line by modern high-resolution tandem mass spectrometry (hr ms/ms). this allowed unambiguous identification of already known phosphylated tyrosines as well as novel adducts between cysteine-proline dipeptides and the thiol-containing leaving group of v-type nerve agents. simultaneous detection of both biomarkers was realized by a new method, which was applicable even at the very low toxicologically relevant concentrations of v-type agents. therefore, this method represents a valuable and novel supplement to existing methods for verification. fatty acid esters of glycidol (glycidyl esters) are processing contaminants formed as byproducts of industrial deodorizing of plant fats or during other heating processes. following oral intake, glycidyl esters are mainly cleaved to release the reactive glycidol in the gastrointestinal tract. according to the national toxicology program (ntp), glycidol is carcinogenic, genotoxic and teratogenic in rodents. it is classified as probably carcinogenic to humans (iarc group 2a). 
the exposure assessment of the oral intake of glycidyl esters in humans is difficult because the current data set for glycidyl ester contents in food is incomplete. we developed a method for the determination of the internal exposure to glycidol by mass spectrometric quantification of a hemoglobin adduct reflecting the total glycidol burden over approximately three months. a modified edman degradation was adapted for the cleavage of the valine residues from the n-termini of hemoglobin by fluorescein isothiocyanate (fitc) (von stedingk et al. (2011) chem res toxicol 24, 1957), resulting in the formation of dihydroxypropyl-valine-fluorescein thiohydantoin (dhp-val-fth). the target analyte is purified with mixed-mode anion-exchange solid-phase extraction and analyzed by lc-ms/ms. a major advantage of the technique is its applicability to whole blood samples, which renders the time-consuming isolation of erythrocytes unnecessary. we synthesized dhp-d7-val-fth as an internal standard for the quantification of the glycidol adduct by lc-ms/ms multiple reaction monitoring. a limit of detection of 5 fmol per injection (5 pmol adduct/g hemoglobin) was achieved. the application of this method will possibly allow future monitoring of the internal exposure to glycidol in human studies. glycidol and 3-monochloropropane-1,2-diol (3-mcpd) are carcinogenic food contaminants, which are present in heat-processed oils and fats mainly in the form of fatty acid esters. the risk assessment concerning human consumption of these substances is complicated for various reasons. for example, the data on the occurrence in foodstuffs is incomplete. also, the amounts of the proximate carcinogens released from ester hydrolysis in the gastrointestinal tract in humans are not known. monitoring of the internal exposure would be an alternative strategy to support the assessment of possible health risks related to the intake of glycidol and 3-mcpd and their fatty acid esters. 
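the two forms of the detection limit quoted in the hemoglobin-adduct abstract above (5 fmol per injection, 5 pmol adduct/g hemoglobin) are linked by the amount of hemoglobin worked up per injection, since fmol/mg equals pmol/g; a minimal sketch in which the 1 mg hemoglobin per injection is an assumed figure, not stated in the abstract:

```python
def lod_pmol_per_g(lod_fmol_per_injection: float, mg_hb_per_injection: float) -> float:
    """Convert an on-column LOD (fmol/injection) to pmol adduct per g hemoglobin.
    fmol/mg == pmol/g, so only the hemoglobin amount per injection matters."""
    return lod_fmol_per_injection / mg_hb_per_injection

# 5 fmol on column with an assumed 1 mg hemoglobin per injection
print(lod_pmol_per_g(5.0, 1.0))  # 5.0 pmol adduct per g hemoglobin
```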
for short-term monitoring of the internal exposure, urinary metabolites are suitable biomarkers. we study the potential use of two different substances as descriptors of the oral intake of glycidol and 3-mcpd. the metabolite 2,3-dihydroxypropyl mercapturic acid (dhpma) is generated following glutathione conjugation of both compounds. the second target analyte is 3-mcpd itself, which may also be formed from glycidol in the reaction with hydrochloric acid in the stomach. a method for the quantification of urinary 3-mcpd by gc-ms is currently being developed. an lc-ms/ms multiple reaction monitoring technique was devised for the quantification of dhpma in urine samples with the isotope-labeled reference compound 13c2-dhpma. the limit of quantification is 10 µg dhpma/l. related to creatinine, the analyte was found in a relatively narrow concentration range in urine samples from humans. the average concentration in urine samples (n = 45) of one male volunteer collected over ten days was 154 ± 21 µg dhpma/g creatinine. a meal of a highly contaminated, commercially available frying fat (containing 1.1 mg of glycidol equivalents) did not lead to a visible increase of the urinary concentrations. the considerable background levels of dhpma in urine of humans and also in urine samples of other mammals support the hypothesis that dhpma may also be formed from an endogenous c3-metabolite, as already reported by eckert et al. furfuryl alcohol is a common food contaminant formed by acid- and heat-induced dehydration from pentoses. it induced renal tubule neoplasms in male b6c3f1 mice and nasal neoplasms in male f344/n rats in a study of the national toxicology program (ntp). the neoplastic effects may originate from sulfotransferase (sult)-catalyzed conversion of furfuryl alcohol into the dna-reactive and mutagenic 2-sulfoxymethylfuran. the incomplete data set of furfuryl alcohol contents in food does not allow estimating the human exposure. 
thus, we sought a method for the determination of the internal exposure. recently, the dna adduct of furfuryl alcohol, n2-[(furan-2-yl)methyl]-2′-deoxyguanosine, was detected in specimens of human lung tissue. however, human biomonitoring of dna adducts has various disadvantages. for example, dna adducts are removed by various repair systems, and human dna samples are usually not accessible in sufficient quantities. we decided to develop a biomarker for the internal exposure to furfuryl alcohol using blood proteins as dosimetric targets. bladder cancer (bc) is a smoking- and occupation-related disease showing a substantial genetic component. though the prognosis is generally good, a major problem is the frequent relapses affecting about half of the patients. n-acetyltransferase 2 (nat2) is well known to modulate bc risk in persons heavily exposed to carcinogenic aromatic amines. we aim to investigate the impact of nat2 genotypes, in particular the ultra-slow genotype, on relapse-free time after first diagnosis in 772 bladder cancer cases. we used follow-ups of three case-control studies from lutherstadt wittenberg (n=207), dortmund (n=167) and neuss (n=398). nat2 was genotyped using seven characteristic polymorphisms (rs1801279, rs1041983, rs1801280, rs1799929, rs1799930, rs1208, rs1799931). haplotypes were reconstructed using phase v.2.1.1. we compared slow to rapid acetylators. additionally, we differentiated between the most frequent slow nat2*5b/*5b and *5b/other slow haplotypes as well as between the ultra-slow *6a/*6a and *6a/other slow haplotypes compared to rapid acetylators. chi-square tests were used to check the frequency of relapses in ultra-slow, slow and rapid acetylators. genotype differences in relapse-free time up to 5 yr after first diagnosis of bc were analysed using cox proportional hazards models adjusted for age, gender, smoking habits, invasiveness and study group. 
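the genotype-by-relapse comparisons described above rest on 2×2 tables, for which an odds ratio falls out directly; a minimal sketch with hypothetical counts rather than the study's data:

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for a 2x2 table:
    a = exposed group with relapse,   b = exposed group without relapse,
    c = reference group with relapse, d = reference group without relapse."""
    return (a * d) / (b * c)

# hypothetical counts: ultra-slow vs. rapid acetylators (illustrative only)
or_ = odds_ratio(30, 19, 46, 54)
print(round(or_, 2))  # 1.85
```

confidence intervals and the time-to-event hazard ratios reported in the results would come from a chi-square test and a cox model, respectively, rather than from this raw-odds computation.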
a total of 371 (49%) patients showed a relapse within the first 5 yr after bc diagnosis. slow acetylators show a higher frequency of relapses than rapid acetylators (51% vs. 46%, p=0.154). this frequency is even higher in ultra-slow acetylators (61%, or=1.81, p=0.019) but not in slow *5b/*5b genotypes (49%, p=0.609). ultra-slow acetylators had a significantly shorter relapse-free time within 5 yr after bc diagnosis than rapid acetylators (median 0.66 vs. 0.94 yr, hr=1.57, p=0.009). this trend was less pronounced in all slow acetylators combined (0.78 yr, hr=1.19, p=0.127) and in the subgroup of nat2*5b/*5b genotypes (0.79 yr, hr=1.20, p=0.255). the effect of ultra-slow nat2 is even more pronounced in smokers (hr=1.78, p=0.003) but absent in nonsmokers (hr=0.89, p=0.781). ultra-slow nat2 seems to be associated with a higher recurrence risk and a shorter relapse-free time, especially in smokers. slow nat2 in general seems to have less impact on recurrence. aalborg university, department of biotechnology, chemistry and environmental engineering, aalborg, denmark. xenoestrogens with the potential for endocrine disruption like bisphenol a (bpa) may bind to the estrogen receptors (ers) and modulate the expression of er target genes, mimicking the natural ligand 17β-estradiol (e2). the potential for endocrine adversity is still predominantly assessed in vivo, as existing in vitro tests have only limited value for an exposure-based risk assessment. thus, the development of reliable bioassays for the detection of endocrine disruptors is one of the paramount challenges faced by modern toxicology. a targeted metabolomics approach in mcf-7 cells treated with e2 or bpa revealed potential biomarkers for the estrogenic potency of the studied compounds. among them were several phosphatidylcholines and amino acids. we further addressed proline levels, which were found to be strongly increased. 
investigations of proline levels over time showed a clear proliferation-correlated concentration dependency after both e2 and bpa stimulation. furthermore, sirna knockdown experiments suggested an influence of the oncogenic transcription factor myc and a dependency of the estrogen-mediated proline increase on erα activation. our study demonstrates metabolomics as a powerful tool for biomarker identification and hypothesis generation. the results could be used further to develop bioassays for the detection of endocrine-disruptive chemicals. children are considered to be more sensitive to most chemicals than the general population due to a variety of factors, including dynamic growth and developmental processes as well as physiological, metabolic and behavioral differences [1]. however, only few data are available on the magnitude of preschool children's exposure to most chemicals present in many consumer products. several of these chemicals are linked to endocrine-disrupting effects in animal studies and are also suspected to have adverse effects, e.g. on the development and function of the reproductive organs as well as on neurological and behavioral development in humans. among the chemicals that have been a major focus of discussion in the last years are phthalates, dinch, parabens, bisphenol a and triclosan, due to their suspected health effects. therefore, we aimed to investigate exposure levels to metabolites of different phthalates and parabens as well as to bisphenol a and triclosan in urine samples collected from preschool children in german day-care centers in north rhine-westphalia (lupe iii; 2011/12). urine specimens from children aged 20 to 80 months from 23 different day-care centers were analyzed. in total, 253 preschool children were recruited, with a mean age of 54 months. 
our study results show that nearly all children (>95%) of the study population had urine concentrations equal to or above the limit of quantification for the five most common phthalate metabolites (mnbp, mibp, 5oh-mehp, 5oxo-mehp, 7oxo-minp), for bisphenol a and for methylparaben. triclosan was detected in 16% of the study population. in general, the median urinary concentrations of the above-mentioned phthalate metabolites were about 5-50 µg/l in spot urine samples. the highest amount among the phthalate metabolites was observed for mibp, with maximal values of about 1000 µg/l. median urinary concentrations for methylparaben and bisphenol a were about 48 µg/l and 2 µg/l, respectively. the maximum methylparaben, bisphenol a and triclosan levels found were 1770 µg/l, 72.4 µg/l and 55.6 µg/l, respectively. in conclusion, our study shows a widespread exposure of young children to various phthalates, parabens and bisphenol a in north rhine-westphalia, germany. a follow-up human biomonitoring study (2014/2015) has finished recruitment and is in the process of data analysis. polycyclic aromatic hydrocarbons (pahs) represent a large group of organic compounds that are common environmental contaminants. they are formed by incomplete combustion of organic matter such as coal or crude oil and are often known to be carcinogenic, mutagenic and teratogenic. the acute toxicity of pahs is rather low, but because of their stability and lipophilic character these compounds can accumulate in the human body and cause severe chronic effects. additionally, pahs may enter the food chain when meat or fish is preserved by exposure to smoke. in the european union, maximum levels of 2 µg/kg for benzo[a]pyrene and 12 µg/kg for the sum of benzo[a]pyrene, benz[a]anthracene, benzo[b]fluoranthene and chrysene are set for the meat of smoked fish and smoked fishery products. smoked fish is often handmade in small fishery stores in schleswig-holstein, where self-caught fish is prepared in smoke houses. 
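the eu maximum levels for smoked fishery products cited above translate into a simple two-condition compliance check per sample; a minimal sketch (the function and variable names are illustrative):

```python
# EU maximum levels for smoked fishery products, as cited in the abstract
BAP_LIMIT = 2.0    # µg/kg benzo[a]pyrene alone
PAH4_LIMIT = 12.0  # µg/kg sum of BaP, BaA, BbF and chrysene

def compliant(bap: float, baa: float, bbf: float, chry: float) -> bool:
    """Check one sample (all values in µg/kg) against both EU limits."""
    return bap <= BAP_LIMIT and (bap + baa + bbf + chry) <= PAH4_LIMIT

print(compliant(1.5, 3.0, 4.0, 3.0))  # sum 11.5, BaP 1.5 -> True
print(compliant(2.5, 1.0, 1.0, 1.0))  # BaP over its own limit -> False
```

note that a sample can fail on either condition alone: a low benzo[a]pyrene value does not help if the four-compound sum exceeds 12 µg/kg.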
this technique implies the danger that pahs accumulate in smoked fishery products above the allowed maximum levels. here, we report our findings of pahs in smoked fishery products bought in local convenience and fishery stores in schleswig-holstein and give a brief overview of current contaminant levels. hplc with fluorescence detection was used to determine the quality and quantity of several toxic pahs in locally made smoked fishery products. pahs may constitute risks for human health when humans are exposed to hazardous levels, and therefore it is important to have knowledge about the given contaminant levels. colorectal cancer (crc) is one of the most frequent cancers worldwide and is tightly linked to dietary habits. epidemiological studies provided evidence that the intake of red meat is associated with an increased risk of developing crc [1]. red meat contains high amounts of heme iron, which is thought to play a causal role in tumor formation. the underlying molecular mechanism, however, remains elusive and may involve increased cell proliferation and dna damage induction by heme iron. in this study, we set out to analyze the genotoxic and cytotoxic effects of heme iron in human colonic epithelial cell lines. we used hemin (fe(iii)) as a commercially available heme source, which was compared to inorganic iron chloride (fecl3). first, the time-dependent internalization of hemin and fecl3 into hct116 cells was determined using icp-ms/ms analysis. treatment of cells with inorganic iron resulted in a maximum of intracellular iron content after 1 h at all doses tested, while hemin, particularly at high doses, caused an iron accumulation up to 24 h. hemin catalyzed the formation of reactive oxygen species (ros) in a dose-dependent manner in caco-2 and hct116 cells, as shown by flow cytometry. consistent with this finding, hemin dose-dependently induced the oxidative dna lesion 8-oxoguanine (8-oxog), as revealed by slot blot analysis and the fpg-modified alkaline comet assay. 
using a pharmacological inhibitor of mutt homologue 1 (mth1), which protects the nucleotide pool by hydrolysis of 8-oxogtp, 8-oxog dna adduct levels in hemin-treated cells were further enhanced. in contrast, inorganic iron hardly affected the cellular ros level and only slightly increased oxidative dna damage. subsequently, a time- and dose-dependent activation of the dna damage response (ddr) by hemin was shown in hct116 and caco-2 cells using western blot analysis, which was followed by a reduction in cell viability at high doses after 72 h. finally, the cytotoxic effects of hemin and inorganic iron were tested using an ex vivo model of intestinal crypt organoids. preliminary results indicate that hemin is highly cytotoxic in organoids, whereas inorganic iron does not impair their viability. taken together, this study demonstrated that hemin induces oxidative stress and dna damage, resulting in the activation of the ddr and subsequent cytotoxicity. in contrast, inorganic iron displayed only modest effects. further in vivo studies using dna repair-deficient and -proficient animals will shed light on the contribution of specific dna lesions to heme-associated colorectal carcinogenesis. red and processed meat consumption is known to be a crucial risk factor in the development of colon cancer, which is one of the most common cancers worldwide. a diet rich in red and processed meat increases n-nitrosation within the colon, leading to an increase in endogenously formed nitroso compounds. these compounds are able to alkylate the dna of gastrointestinal tract cells, resulting in the formation of dna adducts such as o6-methylguanine (o6-meg) and o6-carboxymethylguanine (o6-cmg). while o6-meg is repaired by mgmt, the potential of this enzyme to repair o6-cmg in cells is not well characterized. additionally, adenomas containing a k-ras gc-at transition mutation have lower mgmt levels than adenomas without this mutation.
therefore, the aim of this study is to determine the role of mgmt in protecting colorectal cells from genotoxicity by repairing o6-cmg adducts. for this purpose, an mgmt-deficient non-transformed human colon epithelial cell line was established by stable transfection via rna interference to inhibit the expression and thereby the activity of mgmt. the transfected cell line was analyzed for complete mgmt gene silencing by activity and expression analyses and by cell characterization. results confirmed that there was neither expression nor activity of mgmt in the transfected cells, and the cell characterization data showed that mgmt deficiency did not lead to differences in growth behavior or cell morphology, nor to malignant cell transformation. cytotoxicity experiments performed in the transfected and non-transfected cell lines by treatment with dna-alkylating agents suggest that the mgmt-deficient cell line is more sensitive to dna-alkylating agents than the non-transfected cell line. these results were also supported by dna damage analysis via comet assay. asarones are secondary plant constituents occurring mainly in acorus calamus l. and guatteria gaumeri. the essential oil of the rhizome of acorus calamus l. is used for flavoring of food and alcoholic beverages. the concentration of β-asarone (ba) in these oils varies between 0.3% and 95%. the bark of guatteria gaumeri, which is rich in α-asarone (aa), is used as a cholesterol-lowering drug and to treat gallstones in traditional mexican medicine. both aa and ba are carcinogenic in rodents and mutagenic in the ames test after metabolic activation and are thus classified as genotoxic carcinogens [1, 2]. previously, the major metabolites in microsomal incubations with aa and ba were identified and synthesized in our laboratory. side-chain epoxides, the corresponding diols, side-chain alcohols and aldehydes were identified as the major metabolites of aa and ba.
[3] the investigation of the mutagenic potency in the ames fluctuation test showed positive results in the salmonella strain ta 100 for aa and ba with metabolic activation and for aa- and ba-epoxide without metabolic activation. since epoxides are known to be reactive towards nucleophiles, we set up the hypothesis of dna adduct formation by the epoxides to explain the mutagenic effect. these adducts are currently being isolated, characterized by nmr spectroscopy and used to quantify adducts in cells. at the same time, primary rat hepatocytes were incubated with non-cytotoxic concentrations of aa and ba for 24 h. after harvest and lysis of the cells, the dna was isolated by chloroform/phenol extraction and enzymatically hydrolyzed. [4] the residual hydrolysates will be used to identify the dna adducts formed in cells and to determine the adduct formation rate. preliminary experiments indicate that both aa and ba also form dna adducts in intact cells in a concentration-dependent manner. [1] göggelmann, schimmer (1983). mutagenicity testing of α-asarone and commercial calamus drugs with salmonella typhimurium. mutation research 121, 191-194. [2] wiseman et al. fatty acid esters of 3-chloro-1,2-propanediol (3-mcpd) and of 2-chloro-1,3-propanediol (2-mcpd) are food process contaminants that are formed, e.g., during refinement of vegetable oils. after ingestion, the esters are hydrolyzed in the gut, thereby releasing free 3-mcpd and 2-mcpd. 3-mcpd has been identified by the international agency for research on cancer (iarc) as possibly carcinogenic to humans (class 2b) and is therefore in the focus of food safety authorities. to elucidate the toxicological properties of these compounds at the molecular level (mode of action), a proteomic study was conducted in which 10 mg/kg b.w./day of either 3-mcpd or 2-mcpd was orally administered to rats over a period of 28 days. total protein was extracted from different tissues of the animals and separated via two-dimensional gel electrophoresis (2de).
among others, the redox sensor protein dj-1 was identified as deregulated in liver, kidney, testis, and heart of rats treated with either 3-mcpd or 2-mcpd. up to six different isoforms of dj-1 were identified by 2de western blot analysis, all of them having the same molecular weight but different pi values, indicating protein modifications of low molecular weight but with a strong impact on the protein charge. treatment of the animals with either 3-mcpd or 2-mcpd predominantly led to a shift in abundance between two dj-1 isoforms in the rat tissues. this effect was more pronounced with 3-mcpd than with 2-mcpd. mass spectrometric analysis of these two isoforms identified the oxidation of a conserved cysteine residue (cys106) of dj-1 to a cysteine sulfonic acid as the protein modification induced by treatment of the rats with either 3-mcpd or 2-mcpd. dj-1 is discussed as participating in a number of biological processes such as proteolysis, protein folding, or redox regulation. oxidation of cys106 was shown to be crucial for the activity of dj-1, and the irreversible oxidation of cys106 to a cysteine sulfonic acid, as observed in the present study, was shown to result in a loss of protein function and has been correlated with diseases related to oxidative stress such as parkinson's disease. thus, the potential impact of 3-mcpd and 2-mcpd on cellular oxidative stress and on associated neurodegenerative diseases has to be considered in the ongoing risk assessment of these food contaminants. pyrrolizidine alkaloids (pa) are secondary plant compounds and widespread in the plant kingdom. humans can therefore be regularly exposed via direct or indirect contamination of food such as herbal teas, honey, wheat or salad. 1,2-unsaturated pa are known for their potentially harmful effects. an acute intoxication may cause veno-occlusive disease leading to hepatomegaly, ascites and liver hardening, resulting in high mortality.
on the other hand, chronic pa intoxications due to regular consumption of small amounts of pa are characterized by hepatic necrosis, fibrosis and cirrhosis. an initial whole-genome transcriptome study in hepatocytes revealed that molecular pathways related to cancer development, cell cycle regulation and cell death are regulated by the four structurally different pa echimidine, heliotrine, senecionine and senkirkine after short-term exposure (24 h). additionally, lipid and bile acid metabolism was affected. however, the uptake of pa via food is more likely to be continuous than singular. therefore, the aim of this study was to investigate the molecular effects of long-term exposure (14 d) in comparison with short-term exposure (24 h) in the hepatoma cell line heparg with the four structurally different pa. in this context we analyzed selected cellular parameters such as cytotoxicity and morphology. in contrast to short-term exposure, structure- and concentration-dependent cytotoxicity was found for the long-term exposure (sn>he>em/ski). furthermore, obvious morphological changes such as destructuration and perforation of the heparg cell monolayer were seen after 14 d of incubation. based on these findings, a possible induction of apoptosis or necrosis by pa was examined. the effects of long-term exposure to pa on gene expression were investigated for a set of genes found to be regulated in the short-term whole-genome transcriptome study. the identified regulation of gene expression was confirmed for both exposure durations, with the strongest regulation for cyp7a1 (down-regulation), a gene involved in cholesterol metabolism. therefore, the effects of pa on various parameters related to cholesterol metabolism were analyzed, showing pa effects on cholesterol levels and bile acid secretion. short-term exposure to pa did not affect cell viability. however, repeated doses of pa resulted in severe effects on hepatic cells concerning viability and morphology.
at the mrna level, short- and long-term incubation with pa seem to affect a wide range of signaling pathways. in conclusion, we show for the first time that heparg cells can serve as an in vitro model for hepatotoxicity following chronic intake of pa. shiga toxin-producing e. coli (stec) strains cause a diversity of enteric symptoms in humans, ranging from mild diarrhea to severe diseases such as the hemolytic uremic syndrome (hus). hus is a life-threatening disease with microvascular endothelial damage in the gastrointestinal tract and the kidneys, which often leads to haemolytic anemia, thrombocytopenia and acute renal failure. shiga toxin plays a major role in the pathogenesis of hus, but the subtilase cytotoxin (subab), which was identified in stec strains during an hus outbreak in 1998, might contribute as an additional potent enterotoxin. like shiga toxin, subab is an ab5 protein toxin. the pentameric binding/transport subunit (subb) delivers the enzymatically active subunit (suba), a protease, into the endoplasmic reticulum (er) of human target cells. in the er, suba cleaves the chaperone bip/grp78, which results in cell stress and caspase 3/7-dependent cell death. recently, we discovered that higher concentrations of suba (10 mg/ml) cause cytotoxic effects in human epithelial cells (hela, caco-2) in the absence of subb. the cytotoxic effects were investigated in hela cells in more detail. here, suba binds in a concentration-dependent manner to the cell surface and induces dramatic morphological changes as well as caspase-dependent cell death [1]. in contrast to hela and caco-2 cells, cho fibroblasts did not respond to suba. currently, further cell types including macrophages are being tested for suba effects, and the molecular mechanisms underlying the cytotoxic effects caused by suba and the cell type selectivity are being investigated.
although there are strong cytotoxic effects caused by suba on some human epithelial cells in vitro, the situation in vivo and the pathogenic relevance of the newly observed suba effect are not known. thermal treatment of fat-containing foodstuff in the presence of salt leads to the formation of 3-mcpd and its esters. the high amounts of 3-mcpd esters detected in food raised toxicological concern. recent studies revealed that food may also contain significant amounts of the structurally related 2-mcpd and its fatty acid esters. toxicological studies indicate genotoxicity in vitro and a carcinogenic potential of 3-mcpd, pointing to kidney and testes as the main target organs. 3-mcpd esters were shown to cause similar, but milder effects. few unpublished data exist for 2-mcpd, showing similar organ toxicity compared to 3-mcpd and identifying the heart as an additional target organ. no such data exist for 2-mcpd diesters. in consequence, further toxicological data were required in order to complete the risk assessment for 3-mcpd, 2-mcpd and their esters. hence, an oral 28-day study was performed in male rats in order to fill data gaps and provide essential information for risk assessment. a proteomic analysis was included in order to compare the molecular effects induced by 3-mcpd, 2-mcpd and its dipalmitic ester in rat liver, kidney, testes and heart. in order to avoid overt toxicity, moderate doses of 3-mcpd and 2-mcpd (10 mg/kg body weight) and of 2-mcpd dipalmitate (13.3 and 53 mg/kg body weight) were applied. accordingly, no pathologic effects were reported. here, we present the proteomic results obtained after analysis of heart tissue. after separation of the heart tissue protein crude extract by 2-d gel electrophoresis, differentially expressed spots were identified by maldi-ms. a total of 270 unique proteins deregulated at a log2 ratio of ≤ -0.5 for downregulation and ≥ 0.5 for upregulation at p ≤ 0.05 were identified.
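the deregulation criterion used above (|log2 ratio| ≥ 0.5 at p ≤ 0.05) amounts to a simple filter over spot intensities. a minimal python sketch follows; the protein names echo those discussed in the abstract, but all intensity and p values are hypothetical, not data from the study.

```python
import math

def deregulated(spots, log2_cut=0.5, p_cut=0.05):
    """Return spots with |log2(treated/control)| >= log2_cut and p <= p_cut.

    `spots` maps a protein name to (treated_mean, control_mean, p_value).
    A log2 ratio of +/-0.5 corresponds to a ~1.41-fold change.
    """
    hits = {}
    for name, (treated, control, p) in spots.items():
        ratio = math.log2(treated / control)
        if abs(ratio) >= log2_cut and p <= p_cut:
            hits[name] = round(ratio, 2)
    return hits

# illustrative spot intensities (arbitrary units), not values from the study
example = {
    "UCRI": (2000.0, 1000.0, 0.01),  # log2 ratio = 1.0  -> upregulated
    "THIL": (700.0, 1000.0, 0.02),   # log2 ratio ~ -0.51 -> downregulated
    "TPM2": (1100.0, 1000.0, 0.04),  # log2 ratio ~ 0.14 -> below cutoff
    "DJ-1": (600.0, 1000.0, 0.20),   # p value above threshold
}
print(deregulated(example))  # {'UCRI': 1.0, 'THIL': -0.51}
```

the cutoffs are keyword arguments, so a stricter fold-change criterion can be tested without changing the data.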
comparing the deregulated spots induced by the different treatments revealed that a higher number of spots was deregulated by 2-mcpd than by 3-mcpd. dipalmitate treatment caused even more deregulation than 2-mcpd. upregulated cytochrome b-c1 complex subunit rieske (ucri) and downregulated acetyl-coa acetyltransferase (thil) were among the top deregulated proteins after 2-mcpd and 2-mcpd dipalmitate treatment. the pronounced upregulation of the respiratory chain protein ucri, which was not deregulated after 3-mcpd treatment, indicates an effect on energy metabolism. downregulation of thil, involved in ketone body metabolism, was only weak after 3-mcpd treatment. protein dj-1 (park7), a multifunctional redox-sensitive chaperone and protease protecting the cell against oxidative stress, was significantly downregulated after treatment with 3-mcpd and with the higher dose of the 2-mcpd diester. tropomyosin beta chain (tpm2) was commonly upregulated after 3-mcpd and 2-mcpd treatments, possibly indicating a tgf-beta induction of actin stress fibers. for rat heart, the data show that similarities but also some significant differences exist between the proteomic changes induced by 2-mcpd and 2-mcpd dipalmitate and those induced by 3-mcpd, indicating different mechanisms of toxicity for these structural analogues. oxidation products (oxy) of cholesterol (chol) such as 7α-hydroxy-chol (7α-ho-chol), 7β-hydroxy-chol (7β-ho-chol), 7-keto-chol (7-o-chol), 5,6α-epoxy-chol (α-epoxy-chol) and 5,6β-epoxy-chol (β-epoxy-chol) are formed by autoxidation of chol and are discussed as biomarkers for inflammation and oxidative stress in human tissues, to be used in the identification of risk factors for disease. however, oxy-chols are also present in processed foodstuff, where 7β-ho-chol (milk) and 7-o-chol (fish, meat, and egg) tend to represent the main oxychols, whereas epoxy-chols are generally minor constituents. thus, levels of oxychols in tissues may result from both endogenous formation and dietary exposure.
since quantitative profiles of oxy-chols have not yet been determined in human adipose tissues, levels of oxychols and chol were quantified using gc-ms/ms (internal standards: deuterated oxychols) and gc-fid (internal standard: 5α-cholestan-3β-ol), respectively, in breast adipose tissues of 48 healthy women undergoing mammoplasty. furthermore, tissue levels of 25 fatty acids in adipose tissues were determined by gc-fid (internal standard: undecanoic acid) to assess relative levels of pentadecanoic acid, docosahexaenoic acid and elaidic acid, indicative of consumption of dairy products, fish, and processed fats, respectively. all oxychols were detected in all breast adipose tissues. the most abundant oxychol was β-epoxy-chol (median: 0.36 nmol/g tissue, range: 0.06-1.55 nmol/g), followed by α-epoxy-chol > 7-o-chol > 7α-ho-chol > 7β-ho-chol (0.009 nmol/g, range: 0.002-0.11 nmol/g). tissue levels of chol (2.8 µmol/g, range: 1.88-3.87 µmol/g) did not correlate (spearman's rank analysis) with those of the epoxy-chols and even correlated negatively with those of 7α-ho-, 7β-ho-, and 7-o-chol (r = -0.29, p = 0.024-0.044). median oxychol/chol ratios ranged from 0.0119 (5,6β-epoxy-chol) to 0.0003 (7β-ho-chol). 7-o-chol and the 7-ho-chols correlated strongly with each other (r = 0.81-0.91). oxy-chol levels did not correlate with the relative amounts of pentadecanoic acid and docosahexaenoic acid, whereas total oxychol, 7β-ho-chol, and β-epoxy-chol levels correlated with the relative amounts of elaidic acid (r = 0.30, 0.30, and 0.31, respectively; p = 0.037, 0.034, 0.028, respectively). no correlations of oxychol levels or of individual or total oxychol/chol ratios with age or bmi were observed. taken together, oxychol profiles in breast adipose tissues were different from those usually present in food but could be affected by dietary habits. classification and labelling of hazardous substances and hazardous consumer products (1) has proven to be a very efficient tool for risk communication.
consumer products, such as glue, varnish, or washing and cleansing products, need to be classified and labelled if they contain dangerous ingredients that render the mixture hazardous. personal care products, however, need not be classified and labelled even if they contain dangerous substances above the thresholds for classification. they are excluded in the clp regulation. what would happen without this exception? when i applied the criteria for classification and labelling to a selection of cosmetic product formulas in a conservative approach, most products would have to be classified and labelled, mainly due to hazardous effects on the eye and the skin (2). the benefits of personal care products can go along with unwanted properties such as hazards for human health, and consumers should be informed about them (3). risk communication for everyday products like personal care products should be clear and easily and quickly understandable. according to the cosmetics regulation (4), the ingredients must be listed on the containers. it must be questioned whether the listing of the ingredients without hazard pictograms on the products can be considered a clear, easily and quickly understandable risk communication instrument (3). the results show that it is urgent to inform consumers better about the potential dangers of personal care products, because cosmetics need to be applied with even more care than any other consumer product. it is strongly recommended to delete the exception provision for personal care products in the clp regulation. the infochemical effect describes the phenomenon that anthropogenic substances can interfere with chemical communication and influence organisms so that they perceive their chemical environment differently (1,2). infochemicals play a role in life history, habitat search, food-related aspects and survival, which shows that disturbed communication (infodisruption) could affect population vulnerability at various decisive points (3).
the classical ecotoxicological standard tests do not allow observation of the infochemical effect. systematic analyses are needed to find out more about the relevance of this new chapter in ecotoxicology for natural ecosystems. the first crucial step is to select suitable test substances. repellents (substances used to keep away target organisms, e.g. invertebrates such as midges or fleas, via olfaction) enter surface waters mainly indirectly via wastewater discharges from sewage treatment plants, or directly by being washed off from the skin and clothes of bathers. there are various indications that repellents which are not toxic for aquatic animals could induce effects like the organismic downstream drift of non-target species (downstream dislocation of, e.g., crustacean and insect larvae in streams). in a literature study, three repellents were identified as suitable test compounds for a proof of concept of the infochemical effect: deet (cas 134-62-3), icaridine (cas 119515-38-7) and ebaap (cas 52304-36-6) (4). these substances are investigated in a subsequent experimental part of the project, in new test designs intended to detect and quantify the infochemical effect. persistent, bioaccumulative and toxic (pbt) substances and very persistent and very bioaccumulative (vpvb) substances are compounds with hazardous properties. their non-biodegradability (persistence) and high accumulation in organisms (bioaccumulation) may elicit long-term adverse effects in the environment. persistent and bioaccumulative substances concentrate in the environmental compartments (water, sediment, soil, air) and can be distributed in food chains. ecotoxicological effects are strengthened by bioaccumulation and often appear in remote areas like marine and polar regions. in the framework of pbt assessment, contrary to risk assessment, the substances are evaluated regardless of their emission into the environment.
an evaluation of medicinal active ingredients under assessment in the german federal environment agency (uba) identified less than 10% as potential pbt candidates. due to data gaps, a definite pbt classification is in many cases not possible. the poster presents the results of an extensive literature survey on the global occurrence of pharmaceuticals in the environment. data on measured environmental occurrences from more than 1,000 international publications have been transferred to a database with more than 120,000 entries. according to the database, pharmaceuticals have been found in the environment of 71 countries worldwide, in all five un regions. most published data are for the compartments surface water and sewage effluent; less information is available for groundwater, manure, soil, and other environmental matrices. more than 600 active pharmaceutical substances (or their metabolites and transformation products) have been detected in the environment. most findings have been published for industrialized countries. monitoring campaigns are increasingly being conducted in developing and emerging countries. these have revealed the global scale of the occurrence of pharmaceuticals in the environment. for example, diclofenac, a non-steroidal anti-inflammatory drug, has been detected in the aquatic environment in 50 countries worldwide. a number of globally marketed pharmaceuticals have been found in both developing and industrialized countries. the available ecotoxicological information indicates that certain pharmaceuticals pose a risk to the environment at the measured concentrations. options for cooperative action to address this risk are also presented. the aim of the research presented was to support the discussion of the proposed emerging policy issue pharmaceuticals in the environment under the strategic approach to international chemicals management (saicm), which is a global initiative of the united nations environment programme (unep).
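the country counts reported above (e.g. pharmaceuticals found in 71 countries, diclofenac in 50) are set-based aggregations over occurrence records. a minimal sketch of that kind of aggregation follows; the records are hypothetical and do not come from the uba database.

```python
from collections import defaultdict

def countries_per_substance(records):
    """Count distinct countries with at least one reported detection
    per substance.

    `records` is an iterable of (substance, country, matrix) tuples,
    mimicking entries of an occurrence database.
    """
    seen = defaultdict(set)
    for substance, country, _matrix in records:
        seen[substance].add(country)  # sets deduplicate countries
    return {substance: len(countries) for substance, countries in seen.items()}

# illustrative records, not entries from the database described above
records = [
    ("diclofenac", "DE", "surface water"),
    ("diclofenac", "DE", "sewage effluent"),  # same country, counted once
    ("diclofenac", "IN", "surface water"),
    ("carbamazepine", "US", "groundwater"),
]
print(countries_per_substance(records))
# {'diclofenac': 2, 'carbamazepine': 1}
```

using a set per substance means repeated detections within one country do not inflate the count, which is what a "detected in n countries" statistic requires.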
the european chemicals legislation reach aims to protect man and the environment from substances of very high concern (svhc). chemicals with (very) persistent, (very) bioaccumulative and toxic properties (pbt and vpvb compounds), substances that are carcinogenic, mutagenic and toxic to reproduction (cmr compounds), as well as chemicals of comparable concern like endocrine disruptors (eds) may be subject to authorization. the identification of eds is as yet limited, because specific experimental assessments are not required under reach. evidence is currently based on a combination of few experiments, expert judgement and structural analogy with known eds. structural alerts for the identification of potential eds: predictions of properties and effects from chemical structures are based on similarities with either active or inactive substances. structural alerts for the identification of potential estrogen/androgen-ed activities are relevant parts of the structures of compounds that are known to interact with estrogen and androgen receptors as either agonists or antagonists. in addition to the backbones of the chemical structures (pharmacophores) for steric fit to the receptors, modulating factors may be small substructures for local interactions with receptor binding sites and physicochemical properties related to the strength of binding to the receptor. comparison of evidence from in silico, in vitro and in vivo assays for potential eds: identifications of potential eds based on findings from mammalian long-term reproduction studies, fish life-cycle tests, receptor assays, and chemical alerts were compared and the differences analysed. agreement is limited because in vivo, in vitro and in silico methods address different aspects of potential effects on endocrine systems regarding toxicological targets, modes of action and functional similarity of chemical structures.
a combination of toxicological and chemical assays can provide comprehensive and complementary information to support the evidence-based identification of potential eds among the chemicals released into the environment. application of structural alerts for the identification of potential eds to the einecs inventory: more than 33,000 discrete organic einecs compounds are within the applicability domain of the structural alerts for potential estrogen/androgen-ed activities. among them, 3,585 chemicals (ca. 11%) are indicated as potential candidates for endocrine effects based on structural alerts. due to the possibility that these chemicals may interact with estrogen/androgen receptors, they should be subject to further investigations regarding their potential for endocrine effects, eventually leading to regulatory actions. within the imi (innovative medicines initiative) project "intelligence-led assessment of pharmaceuticals in the environment" (ipie; http://i-pie.org/), more intelligent environmental testing and tools for the prioritisation of legacy compounds shall be developed. regarding the evaluation of fish toxicity, screening approaches in fish embryos are pursued. while the standard fish embryo toxicity (fet) test is restricted to lethal parameters, which are mainly relevant as substitutes for acute effects, additional sublethal endpoints may provide expanded applicability of the fet assay for chronic effect assessment in fish. in this respect, the analysis of the metabolome could provide additional insights into the biochemical processes elicited by pharmaceutical compounds and could potentially support the extrapolation from fish embryo toxicity to chronic fish toxicity as displayed in the standard early life stage (els) test.
in the context of ipie, a pilot study was performed with the aim to quantify and comparatively assess changes in the metabolic signatures of fish and fish embryos induced by the reference compound amikacin, an aminoglycoside antibiotic, in order to identify metabolic patterns applicable as biomarker profiles that can be linked to apical endpoints in terms of an integrated approach. therefore, toxic effects in fathead minnow embryos and els fish were investigated following 4 and 32 days of exposure, examining conventional endpoints and additionally using a combined direct-injection and lc-ms/ms assay for metabolite identification and quantification. metabolic endpoints were found to be at least as sensitive as standard apical endpoints such as growth and mortality as detected in the longer-term fish study. furthermore, multivariate data analysis (pca-x, opls-da) revealed substance-induced metabolic perturbations specific for fish and fish embryos, respectively. beyond that, the statistical approach of shared and unique structure (sus) identified some metabolites from the classes of amino acids, biogenic amines and lipids as similarly changed in both developmental stages in relation to amikacin treatment, representing shared biomarker candidates. in this first pilot study, the integrated metabolomics approach yielded insights into the molecular consequences of amikacin exposure and provided indications for biomarkers of common effects in fish embryos and fish. due to the different exposure levels in the fet (1-100 mg/l) and els study (0.0003-0.03 mg/l), more definitive conclusions regarding the predictivity of metabolic responses in fish embryos for apical endpoints in chronic fish tests are not yet possible. further studies are ongoing with a range of pharmaceutical compounds of different chemical classes, which will provide more substantial information on the applicability of this technology for the prediction of long-term effects in fish.
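the multivariate analysis mentioned above starts from projecting samples-by-metabolites intensity matrices onto principal components. a minimal pca sketch via singular value decomposition is shown below; the 4 x 3 matrix is hypothetical (two control-like and two treated-like samples), not amikacin data from the study.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project samples onto the leading principal components.

    X is a (samples x metabolites) intensity matrix; columns are
    mean-centered before the SVD, as is standard for PCA.
    """
    Xc = X - X.mean(axis=0)
    # SVD of the centered matrix: principal axes are the rows of Vt
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

# hypothetical matrix: 4 samples (2 control-like, 2 treated-like) x 3 metabolites
X = np.array([
    [1.0, 2.0, 0.5],
    [1.1, 2.1, 0.4],
    [3.0, 0.5, 2.0],
    [3.1, 0.4, 2.1],
])
scores = pca_scores(X)
print(scores.shape)  # (4, 2)
```

on such a two-cluster matrix, the first component separates the groups: the first two samples score on one side of pc1 and the last two on the other, which is the kind of separation a scores plot in a pca-x analysis visualizes.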
the human cationic amino acid transporter hcat-1 (slc7a1) represents the major uptake route for arginine and other cationic amino acids (such as the essential amino acid lysine) in most mammalian cells. it thus provides these amino acids for protein synthesis as well as for essential metabolic pathways. in endothelial cells, special attention has been given to the role of hcat-1 in supplying arginine to nitric oxide synthase. in spite of its wide distribution, hcat-1 expression is highly regulated on both the transcriptional and the post-transcriptional level. however, the genetic elements involved in its transcriptional regulation are largely unknown. here we studied the expressional regulation of hcat-1 in human ea.hy926 endothelial cells. starvation of these cells of cationic amino acids led to a pronounced induction of both hcat-1 mrna and protein. the mrna induction was almost completely abolished by the transcription elongation inhibitor drb (5,6-dichloro-1-β-d-ribofuranosylbenzimidazole), suggesting the involvement of transcriptional regulation. we thus aimed at identifying the promoter elements in the hcat-1 gene responsible for this regulation. to our surprise, and in contrast to data published by others [1], the chromosomal region up to 5 kb upstream of the first hcat-1 exon exhibited no promoter activity in either endothelial cells or dld-1 colon carcinoma cells, which exhibit a very strong endogenous hcat-1 expression. we could, however, identify a promoter element within the first intron of the hcat-1 gene. the transcriptional activity of this element increased upon amino acid starvation in a similar way as endogenous hcat-1 expression. our results indicate a starvation-induced transcriptional regulation of hcat-1 through newly identified promoter regions distinct from those published previously. the transport of a multitude of drug molecules into the cell is mediated by multispecific organic cation transporters (octs), belonging to the solute carrier group (slc).
one of the families within this group of membrane transport proteins is the slc47 family, consisting of the two multidrug and toxin extrusion transporters mate-1 (slc47a1) and mate2-k (slc47a2). while mate-1 is highly expressed in several different tissues like kidney, liver, skeletal muscle, adrenal glands, testes and heart, mate2-k occurs exclusively in the apical membrane of proximal tubular epithelial cells within the kidney. both transport proteins translocate organic cations in exchange for protons into as well as out of the cell. to define the affinity of both transporters for the anti-diabetic drug metformin and to investigate their interactions with 26 different anti-neoplastic agents, comparative transport experiments with the model substrate 1-methyl-4-phenylpyridinium (mpp) were carried out. for this purpose, stably transfected hek293 cells expressing mate-1 or mate2-k transport proteins, generated by portacelltec biosciences gmbh, and vector-transfected hek293 cells were used. the interaction analyses were carried out by determining the uptake of [14c]-metformin and its inhibition by the anti-neoplastic agents. university of basel, basel, switzerland. drug transporters play a pivotal role in pharmacokinetics by modulating the cellular entry or efflux of compounds. one transporter facilitating the transport of bile acids, steroid hormones, and statins is the organic anion transporting polypeptide (oatp) 2b1, which is highly expressed in placenta, liver, and small intestine. especially its activity in small intestine and liver is assumed to be the basis for specific drug-drug interactions. understanding the mechanisms involved in pharmacokinetics is a prerequisite in drug development. to test whether there are species differences in transport activity, we compared the expression and function of the human and rat orthologues.
first, we determined the transport activity for the known oatp2b1 substrate estrone 3-sulfate (e1s) and observed a significantly lower km for mdck-hoatp2b1 compared to mdck-roatp2b1 (… .00 ± 10.23 µm, *p<0.05, student's t-test), whereas there was no difference in vmax (mdck-hoatp2b1 119.4 ± 11.3 fmol/min/µg protein; mdck-roatp2b1 102.3 ± 12.8 fmol/min/µg protein). the human oatp2b1 exhibits multiple binding sites for its substrates, which may explain specific drug-drug interactions [1]. to identify whether rat oatp2b1 shares this characteristic, the cellular accumulation of 50 µm e1s (low-affinity site) or 0.005 µm e1s (high-affinity site) was determined in the presence of specific inhibitors/substrates of oatp2b1. as observed for atorvastatin, a known inhibitor of both affinity sites, the rat transporter failed to exhibit the low-affinity site. in detail, while atorvastatin reduced the accumulation of e1s in mdck-hoatp2b1 cells (110.0 ± 14.6 fmol/min/µg protein vs. 60.7 ± 7.5 fmol/min/µg protein, *p<0.05, student's t-test), there was no inhibition of e1s accumulation in mdck-roatp2b1 cells (53.2 ± 13.2 fmol/min/µg protein vs. 49.0 ± 17.4 fmol/min/µg protein). additionally, we observed a different membrane localization of the two transporters as assessed by immunofluorescent staining, showing an apical localization for rat oatp2b1 while human oatp2b1 is localized at the basolateral pole of the cellular model. absolute quantification of mrna (copy number per 1000 ng of total rna) in different tissues of rat revealed a high expression of oatp2b1 in liver (159.6 ± 45.5), a moderate expression in small intestine (14.9 ± 4.0) and colon (57.0 ± 12.3), and a low expression level in kidney (2.3 ± 0.8). the latter is in accordance with previous findings on oatp2b1 abundance in rat kidney as quantified by absolute proteomics techniques [2]. our data demonstrate species differences in localization and activity of this drug transporter.
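the km and vmax values above come from fitting uptake rates at graded substrate concentrations to the michaelis-menten equation. a minimal sketch of such a fit is shown below; the substrate concentrations and rates are hypothetical values generated from the reported vmax for mdck-hoatp2b1 and an assumed km of 20 µm, not the measured data.

```python
import numpy as np
from scipy.optimize import curve_fit

def michaelis_menten(s, vmax, km):
    """Michaelis-Menten uptake rate: v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)

# Hypothetical E1S concentrations (µM) and uptake rates (fmol/min/µg protein),
# generated from the reported Vmax (119.4) and an assumed Km of 20 µM.
s = np.array([1, 5, 10, 25, 50, 100, 250], dtype=float)
v = michaelis_menten(s, vmax=119.4, km=20.0)

# Nonlinear least-squares fit to recover the kinetic parameters.
(vmax_fit, km_fit), _ = curve_fit(michaelis_menten, s, v, p0=(100.0, 10.0))
print(f"Vmax = {vmax_fit:.1f} fmol/min/µg protein, Km = {km_fit:.1f} µM")
```

in practice the measured rates would carry noise, so the fitted parameters are reported with standard errors derived from the covariance matrix that `curve_fit` also returns.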
further studies are warranted to prove whether this knowledge will help in future drug development and to clarify which molecular cause is responsible for the observed data.

background: intestinal drug absorption depends on various factors including aqueous volume, site-specific ph and intestinal motility. moreover, the expression of efflux and uptake transporters varies depending on the anatomical localization, making the gut a complex barrier for drug transfer into the body. in a recent study, site-specific protein and mrna expression levels of 10 drug transporters were determined along the entire length of the human gut. interestingly, discrepancies between mrna expression and protein levels were observed. moreover, there were quantitative differences between small intestine and colon. as a consequence, the question arose whether this observation could be explained by small non-coding rnas acting as highly tissue-specific post-transcriptional regulators of gene expression. hence, in our current study, we aimed to investigate the impact of mirnas on site-specific transporter expression along the human intestine. methods: total rna was isolated from biopsies obtained from six disease-free organ donors. tissue samples were acquired from the duodenum, the upper and lower jejunum, the upper and lower ileum, and the transversal or descending colon. the expression of 754 mirnas was measured using rt-pcr-based low-density arrays. the expression of all detected mirnas was correlated with transporter protein data recently determined by lc-ms/ms-based targeted proteomics. mirnas and transporter genes showing an inverse pearson's correlation between mirna and protein expression underwent an in-silico search (microcosm targets v.5, targetscan 7.0) for putative mirna/mrna interactions. to investigate those interactions in more detail, reporter gene assays and site-directed mutagenesis were conducted. results: out of 754 mirnas, 248 were detected in all tissue types.
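the inverse-correlation filter used in this screen can be sketched as follows; the paired expression values below are hypothetical illustrations, not the donor data, and the r/p thresholds are assumptions chosen to resemble the reported example (pept1/hsa-mir-27a, r = -0.498, p = 0.002).

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical paired values across six intestinal segments:
# a miRNA (RT-PCR, arbitrary units) vs. a transporter protein (LC-MS/MS).
mirna   = np.array([2.1, 3.4, 4.0, 4.8, 5.5, 6.2])
protein = np.array([9.0, 7.6, 7.1, 5.9, 5.2, 4.1])

r, p = pearsonr(mirna, protein)

# Keep only pairs with a significant inverse correlation; the cutoffs
# here (r < -0.4, p < 0.05) are assumed for illustration.
is_candidate = (r < -0.4) and (p < 0.05)
print(f"r = {r:.3f}, p = {p:.4f}, candidate: {is_candidate}")
```

candidate pairs surviving this filter would then go into the in-silico target search and reporter gene assays described in the methods.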
out of ten transporters, five showed significant inverse correlations with 10 putatively targeting mirnas (e.g. pept1 and hsa-mir-27a, r = -0.498, p = 0.002). reporter gene assays indicated interactions of mir-193a/b with the pept1 3'-utr (p = 0.0025 and p = 0.0012) as well as of mir-27a with the abcb1 3'-utr (p = 0.0006). the site-specific abundance of intestinal drug transporters is significantly affected by microrna-mediated post-transcriptional regulation under physiological conditions, as exemplified for pept1 and abcb1.

background: the human uptake transporter nact [gene symbol slc13a5; also known as mindy, the human orthologue of the drosophila indy ("i'm not dead yet") transporter] is expressed in liver and brain. nact is important for energy metabolism and brain development and mediates the uptake of tricarboxylic acid (tca) intermediates such as citrate and succinate. reduced expression of this transporter, as studied in knock-out mice, mimics aspects of dietary restriction, promotes longevity and protects against insulin resistance and adiposity. furthermore, mutations in the human slc13a5 gene are associated with epileptic encephalopathy with seizure onset in the first days of life, possibly due to the reduced uptake of tca intermediates into neurons. to gain more insight into the role of nact in drug transport and into structure-function relationships, we studied the inhibition of nact-mediated citrate transport by various drugs and analyzed the effect of known mutations in the slc13a5 gene on nact-mediated transport. methods: drug inhibition studies were performed using hek cells stably expressing human nact, with citrate as the prototypic substrate. twenty-four drugs were tested as potential inhibitors of nact-mediated uptake. the effects of eight mutations, three of them (nact-p.g219r, -p.t227m and -p.l488p) associated with epileptic encephalopathy, were analyzed using a transient transfection approach.
furthermore, the first computation-based structural model of the nact transporter was established, and the impact of all mutations on substrate transport was modeled. results: inhibition studies demonstrated that only a few drugs (three out of 24 tested) inhibited nact-mediated citrate transport at the tested drug concentration of 100 µm. of these, benzbromarone showed the strongest inhibition, with an ic50 value of 83.2 µm. furthermore, citrate transport was also slightly inhibited by pioglitazone and rosiglitazone. citrate transport of the mutated proteins nact-p.g219r, -p.g219e, -p.t227m, -p.l420p and -p.l488p was totally abolished, and the effect of these mutations could be explained on the basis of the structural model. conclusion: inhibition studies demonstrated that simultaneously administered drugs can inhibit nact-mediated uptake of the prototypic substrate citrate. nact-mediated uptake is abolished by mutations in the slc13a5 gene associated with epileptic encephalopathy. the effect of these mutations can be explained on the basis of the first structural model of this uptake transporter.

the atp-binding cassette subfamily b member 1 (abcb1) is a drug efflux pump responsible for the classic multi-drug resistance phenotype in cancer cells. increased activity of abcb1 in cancer cells contributes to protection against a wide range of chemotherapeutic agents. this dramatically decreases therapeutic options and the chance of patient survival. knowledge of the underlying mechanisms of abcb1 deregulation is a critical step towards the reversal of this unfavorable condition. of note, phosphatidylinositol-4,5-bisphosphate (pip2) is a known regulator of abcb1. the protein "myristoylated alanine-rich c-kinase substrate" (marcks) is known for its ability to bind and sequester the phospholipid pip2. in various forms of colorectal cancer, the deregulation of the marcks protein goes hand in hand with an increase in malignancy and therapeutic resistance.
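an ic50 value such as the 83.2 µm reported above for benzbromarone is typically obtained by fitting a concentration-inhibition (hill) curve to uptake measured at several inhibitor concentrations. the sketch below uses hypothetical concentration points generated around the reported ic50 with an assumed hill slope of 1, not the measured data.

```python
import numpy as np
from scipy.optimize import curve_fit

def hill(c, ic50, hill_n):
    """Fraction of control uptake remaining at inhibitor concentration c."""
    return 1.0 / (1.0 + (c / ic50) ** hill_n)

# Hypothetical benzbromarone concentrations (µM) and remaining citrate uptake,
# generated from the reported IC50 of 83.2 µM with an assumed Hill slope of 1.
conc = np.array([1, 10, 30, 100, 300, 1000], dtype=float)
frac = hill(conc, 83.2, 1.0)

# Fit the concentration-inhibition curve to recover IC50 and slope.
(ic50_fit, n_fit), _ = curve_fit(hill, conc, frac, p0=(50.0, 1.0))
print(f"IC50 = {ic50_fit:.1f} µM, Hill slope = {n_fit:.2f}")
```

with real data the fraction of control uptake would be computed as (uptake with inhibitor) / (uptake without inhibitor), both background-corrected against vector-transfected cells.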
in this study, we characterized the enigmatic marcks as a modulator of abcb1 activity. we focused on a subgroup of colon cancers in which marcks resides in a hyperphosphorylated state, for which the established cell line ht-29 served as a model. we employed various in-vitro methods for the measurement of abcb1 activity, in combination with imaging experiments, assays for cellular viability and classical methods of molecular biology. we combined these approaches with pharmacological inhibition of the marcks phosphorylation state or rnai-mediated depletion of marcks. with these interventions we could modulate endogenous abcb1 activity and re-sensitize our cell model to chemotherapeutic agents such as 5-fluorouracil, which are known substrates of abcb1. taken together, our findings suggest a new way in which a cancer cell can attain a state of therapeutic resistance. the exploitation of marcks as a modulator of abcb1 might be a new approach to target resistant tumors without interfering with the natural function of abcb1 in non-malignant tissue.

background: in several large clinical studies, low blood concentrations of homoarginine were identified as an independent risk marker for stroke, cardiovascular events and mortality. experimental data suggest an active role of homoarginine deficiency in disease development. interference with l-arginine-dependent pathways and signaling has been implicated as a possible mechanism. it was the aim of the present study to identify transport mechanisms involved in the cellular uptake and tissue distribution of homoarginine, which are poorly understood so far. the experiments focused on cationic amino acid transporters (cats) and possible interactions with known cat substrates. methods: uptake assays were performed using [3h]-labeled homoarginine as substrate and human embryonic kidney (hek293) cells stably overexpressing human cat1 [gene: slc7a1 (solute carrier family 7)], cat2a (slc7a2a) or cat2b (slc7a2b).
cells transfected with an empty vector were used as controls. the unlabeled known cat substrates l-arginine and asymmetric dimethylarginine (adma) were investigated as inhibitors. results: compared to the uptake into control cells, uptake of homoarginine was significantly higher in hek cells overexpressing cat1 (7-fold), cat2a (1.6-fold) or cat2b (2.2-fold) after 2.5 min using 100 µm substrate (each p<0.001). apparent km values for cellular net uptake of homoarginine were 174.6 µm for cat1, 3640 µm for cat2a and 522.8 µm for cat2b. homoarginine uptake by the three cats could be inhibited by addition of l-arginine and adma. conclusion: the protective biomarker homoarginine is a substrate of the human cationic amino acid transporters cat1, cat2a and cat2b. compared to other cat substrates, the cat1- and cat2b-mediated homoarginine transport is characterized by relatively high affinity. the uptake of homoarginine by all three cats can be inhibited by l-arginine and adma. taken together, these findings make cat-mediated transport of homoarginine a possible target for interactions and for pharmacological interventions aimed at homoarginine homeostasis. this project is supported by the doktor robert pfleger-stiftung.

background and aim: organic cation transporter 1 (oct1, alternative name slc22a1) is a polyspecific membrane transporter which has been suggested to play a role in the absorption, distribution and elimination of drugs and toxins. besides endogenous compounds like thiamine (vitamin b1), known oct1 substrates are toxins like mpp+ as well as drugs like metformin, o-desmethyltramadol, ranitidine, and sumatriptan. the tissue distribution of oct1 has been shown to have strong inter-species differences. in rodents, oct1 is strongly expressed in both the sinusoidal membrane of hepatocytes and the basolateral membrane of kidney epithelial cells. human oct1 is strongly expressed on the sinusoidal membrane of hepatocytes, but not in the kidney.
furthermore, substrate-specific differences have been observed between the human and rodent oct1 orthologs. the aim of this study was to characterize the mechanisms causing substrate specificity between rodent and human oct1 orthologs. methods: overexpression of oct1 orthologs of mouse, rat and human was performed by targeted chromosomal integration in t-rex™ 293 cells. the cells were characterized in detail for their ability to transport the model substrates tea+, mpp+ and asp+, and the drugs ranitidine, sumatriptan and fenoterol. results: mouse and rat oct1 orthologs showed similar transport activity for all the model substrates and drugs tested. however, significant differences were observed when the rodent orthologs were compared to human oct1. human oct1 showed a 5-fold higher vmax for the uptake of asp+ and a 2-fold higher vmax for sumatriptan in comparison to the rodent oct1 orthologs. conversely, human oct1 showed a 4-fold lower vmax for the uptake of fenoterol compared to the rodent oct1s. there was no difference between rodent and human oct1 in the uptake of ranitidine. these differences were further characterized in detail using chimeric mouse-human oct1 constructs and by comparing the effects of key point mutations in mouse and human oct1 orthologs. conclusions: these data demonstrate strong differences in substrate specificity between rodent and human oct1 orthologs. therefore, oct1 pharmacokinetic data obtained in mouse or rat models cannot be directly extrapolated to humans. furthermore, comparative functional analyses of orthologs may help reveal the mechanisms underlying the polyspecificity of oct1.

ranitidine is a histamine h2-receptor antagonist which is commonly used without prescription for the treatment of pyrosis and gastric ulcers. approximately 30% of the systemic clearance of ranitidine is via hepatic metabolism. ranitidine is known to be a substrate of the hepatic organic cation transporter oct1 [1].
oct1 is expressed on the sinusoidal membrane of human hepatocytes and is highly genetically variable. sixteen major oct1 alleles are known [2]. thereof, 12 alleles confer partial or complete loss of oct1 activity. the observed loss of activity was highly substrate-specific and should be analyzed substrate by substrate. in this study we analyzed the effects of genetic polymorphisms in oct1 on the uptake of ranitidine. we used hek293 cells stably transfected to overexpress wild-type or variant oct1 isoforms. the variant oct1 alleles oct1*1a (met408val), oct1*1c (phe160leu), oct1*1d (pro341leu/met408val), oct1*2 (met420del), oct1*3 (arg61cys), oct1*4 (gly401ser), oct1*5 (gly465arg/met420del), oct1*6 (cys88arg/met420del), oct1*7 (ser14phe), oct1*8 (arg488met), oct1*9 (pro117leu), oct1*10 (ser189leu), oct1*11 (ile449thr), oct1*12 (ser29leu) and oct1*13 (thr245met) were analyzed. we characterized ranitidine uptake and determined km and vmax for the different polymorphic oct1 isoforms. wild-type oct1 showed a time- and concentration-dependent uptake of ranitidine with a km of 62.91 ± 4.32 µm and a vmax of 1125.41 ± 86.12 pmol/min/mg protein. variants oct1*5, oct1*6, oct1*12 and oct1*13 completely lacked ranitidine transport activity. variants oct1*2, oct1*3, oct1*4 and oct1*10 showed vmax values decreased by 64, 77, 91 and 50%, respectively. the oct1*8 variant showed an increase of vmax by 25%. there were no significant changes in ranitidine uptake by variants oct1*1a, oct1*1c, oct1*1d, oct1*7, oct1*9 and oct1*11 compared to the wild type. there were no significant differences in the km values between the wild type and the variants. in conclusion, we confirmed ranitidine as an oct1 substrate and demonstrated that genetic polymorphisms in oct1 can strongly affect ranitidine uptake. the effects of oct1 polymorphisms on ranitidine pharmacokinetics in humans remain to be analyzed.
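the percent changes in vmax reported for the oct1 variants are simple relative differences against the wild-type value. the sketch below illustrates the calculation; the variant vmax values are back-calculated from the stated percentages for illustration, not independently measured numbers.

```python
# Wild-type Vmax for ranitidine uptake, as reported in the abstract.
VMAX_WT = 1125.41  # pmol/min/mg protein

# Hypothetical variant Vmax values, back-calculated from the reported
# percent decreases (or the 25 % increase for OCT1*8).
variant_vmax = {
    "OCT1*2":  VMAX_WT * 0.36,   # 64 % decrease
    "OCT1*3":  VMAX_WT * 0.23,   # 77 % decrease
    "OCT1*4":  VMAX_WT * 0.09,   # 91 % decrease
    "OCT1*10": VMAX_WT * 0.50,   # 50 % decrease
    "OCT1*8":  VMAX_WT * 1.25,   # 25 % increase
}

def percent_change(vmax, wt=VMAX_WT):
    """Relative change in Vmax versus wild type, in percent."""
    return round(100.0 * (vmax - wt) / wt)

changes = {name: percent_change(v) for name, v in variant_vmax.items()}
print(changes)
```

negative values correspond to reduced transport capacity; by this convention oct1*2 comes out at -64 and oct1*8 at +25, matching the percentages stated in the text.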
otto-von-guericke university, institute for pharmacology and toxicology, magdeburg, germany

serotonergic hallucinogens (shgs), such as lysergic acid diethylamide (lsd), induce profound alterations of human consciousness, which are thought to be mediated by activation of 5-ht2a receptors. with repeated application, the mind-altering effects of most shgs are rapidly undermined by tolerance. the only exception seems to be dimethyltryptamine (dmt), whose mind-altering effects, for reasons unknown, do not decrease even with repeated application. assuming that dmt differs from other shgs in its capacity to regulate 5-ht2a receptors, we here compare lsd and dmt with regard to processes of 5-ht2a downregulation. in heat-exposed rats, lsd and dmt induce a marked increase in body core temperature (hyperthermia) accompanied by "defensive hypersalivation". both effects are mimicked by the 5-ht2-selective agonist dimethoxybromoamphetamine (dob) and can be blocked by the selective 5-ht2a antagonists ketanserin and mdl100907. after repeated application, the temperature-regulatory effects of lsd are subject to tolerance, whereas those of dmt are not. tolerance to lsd (as measured by dob-induced [35s]gtpγs binding) is paralleled by a desensitisation of frontocortical 5-ht2 receptors; for dmt, there is no such decrease. applying techniques of immunocytochemistry, transphosphatidylation, and quantitative real-time pcr to (ha-5-ht2a-transfected) hek293 cells and (endogenously 5-ht2a-expressing) c6 glioma cells, respectively, we furthermore demonstrate that dmt, in contrast to lsd, fails to internalise 5-ht2a receptors, fails to activate phospholipase d (which is needed for 5-ht2a internalisation), and fails to inhibit the synthesis of 5-ht2a receptors.
given that dmt, unlike lsd, turns out to be inactive in all processes of 5-ht2a downregulation investigated, our data suggest that the differential tolerance development noted for dmt and lsd might indeed be accounted for by differential regulation of 5-ht2a receptors. lsd and dmt have both recently regained scientific attention as potential therapeutics in the treatment of depression and/or anxiety disorders. providing mechanistic insights into their action is thus of timely clinical relevance.

increased gaba release in human neocortex at high intracellular sodium and low extracellular calcium - an anti-seizure mechanism? at reduced [ca2+]e, this reduction might induce an anti-seizure mechanism by augmenting gat-mediated gaba release, a mechanism absent in rats.

aging is complex on the systems as well as on the molecular level. the process of aging is characterized by a progressive loss of physiological functions and an accumulation of cellular damage. one hallmark of aging is impaired protein homeostasis. an imbalance in the quality control of both de novo protein synthesis and protein degradation is therefore likely to contribute to the phenotype of aging. we investigated protein turnover rates with the state-of-the-art techniques funcat (dieterich et al., 2010) and sunset (schmidt et al., 2009) in aging neuronal cell cultures. using these techniques, we show a prominent decrease in protein synthesis and degradation that progressed gradually in aging neuronal cells cultured up to div 60. in order to rejuvenate protein turnover in aged neuronal cells, we applied the selective eukaryotic elongation factor-2 kinase inhibitor a 484954 and the polyamine spermidine, and observed protein translation utilizing funcat and sunset. whereas both a 484954 and spermidine had no effect on de novo protein synthesis in juvenile neurons (div 20), both substances increased de novo protein synthesis to a juvenile level in aged neuronal cultures (div 60).
this effect is seen in neuronal somata and dendritic spines. the molecular function of spermidine as an "anti-aging agent" is not yet defined. thus, additional pharmacological interventions are used for further examination of specific molecular spermidine targets. in conclusion, the described experimental setup is used to investigate impaired protein homeostasis as one hallmark of aging. agents with a presumed "anti-aging" effect can be tested for a potential rejuvenating effect on the level of protein homeostasis.

a screening approach to test tolerability of multitargeted drug combinations for antiepileptogenesis in mice

a large variety of brain insults can induce the development of symptomatic epilepsies, particularly temporal lobe epilepsy. in the latent period after the initial insult, multiple molecular, structural, and functional changes proceed in the brain and finally lead to spontaneous recurrent seizures. prevention of these developments, called antiepileptogenesis, in patients at risk is a major unmet clinical need. several drugs have undergone clinical trials for epilepsy prevention, but none of the drugs tested was effective. similarly, most previous preclinical attempts to develop antiepileptogenic strategies have failed. in the majority of studies, drugs were given as monotherapy. however, epilepsy is a complex network phenomenon, so it is unlikely that a single drug can halt epileptogenesis. recently, multitargeted approaches ("network pharmacology") were proposed to interfere with epileptogenesis. developing novel combinations of clinically used drugs with diverse mechanisms that are potentially relevant for antiepileptogenesis is a strategy which would allow a relatively rapid translation into the clinic. we developed an algorithm for testing such drug combinations in a screening approach, modelled after the drug development phases in humans.
the tolerability of four repeatedly administered drug combinations was evaluated by a behavioral test battery: a, levetiracetam and phenobarbital; b, valproate, losartan, and memantine; c, levetiracetam and topiramate; and d, levetiracetam, parecoxib, and anakinra. as in clinical trials, tolerability was evaluated separately before starting efficacy experiments, to identify any adverse effects of the combinations that might critically limit their successful use in preclinical studies on antiepileptogenesis and the translation of these preclinical findings to the clinic. based on previous studies, we expected that tolerability would be lower in epileptic mice than in nonepileptic mice. therefore, nonepileptic mice were used as a first step, followed by epileptic mice and mice during the latent period shortly after status epilepticus. except for combination b, all drug cocktails were relatively well tolerated. in contrast to our expectations, no significant differences were determined between nonepileptic and post-status epilepticus animals, except for combination c. as a next step, the rationally chosen drug combinations will be evaluated for antiepileptogenic activity in mouse and rat models of symptomatic epilepsy.

major depression is one of the most common mental disorders worldwide, with serious social and economic consequences. there are many different hypotheses concerning the pathophysiology of this disease. the complex brain serotonin system, and particularly the serotonin 1a receptors (5-ht1ar), apparently play a pivotal role in the development of depression. the involvement of altered, 5-ht1ar-mediated signalling in adult neurogenesis is also discussed. however, in this context the effects of pre- and postsynaptically located 5-ht1ars have not yet been clarified. mice with a permanent overexpression of postsynaptic 5-ht1ars (oe mice) represent a unique tool to elucidate the effects of postsynaptic 5-ht1ars on adult neurogenesis and depressive-like behaviour.
previous studies demonstrated an increased proliferation and survival of newborn cells in the adult dentate gyrus of female oe mice in comparison to controls. in the present study, we investigate the proliferation and survival of adult-born cells after chronic treatment (15 days) with the selective 5-ht1ar agonist 8-oh-dpat (0.5 mg/kg/day) in young adult oe and wt mice. on the last three days of treatment, newly generated cells of oe and wt mice are labelled by injections of bromodeoxyuridine (brdu; 50 mg/kg/day). mice are sacrificed either one day (proliferation) or 21 days (survival) after the last injection. we hypothesise that the data we will present will confirm our previous results, with possibly more pronounced proneurogenic effects and differences in male mice. further immunohistochemical studies post-exercise and behavioural analyses are in progress to identify the relation between chronic postsynaptic 5-ht1ar stimulation, depressive-like behaviour and hippocampus-dependent learning.

dystonia is a common movement disorder characterized by intermittent and prolonged muscle contractions resulting in involuntary movements and/or abnormal postures. the lack of knowledge of the pathophysiology of dystonia hampers the development of effective therapeutics. although benzodiazepines can improve dystonic symptoms, tolerance and side effects limit their use. there is evidence for striatal dysfunction in human dystonia. gabaergic striatal interneurons (in) are important for the regulation of striatal signaling. in the dt sz mutant hamster, a model of paroxysmal dystonia, immunoreactive in were reduced at the age of maximum severity of dystonia (33 days), but not after spontaneous remission (age 90 days). as indicated by unaltered homeobox protein nkx2.1 (cell density, mrna), the age-dependent deficit seems not to be related to disturbed migration, but rather to retarded maturation of in in mutant hamsters.
here we further determined the maturation of striatal gabaergic neurons in the dt sz hamster compared to healthy controls. kcc2 and cavii mrna, used as markers for the gaba switch, were unchanged in 18- and 33-day-old mutant hamsters, indicating that there is no general delay in gabaergic maturation. as retarded maturation seems to be specific for in, we used another marker for gabaergic maturation: the expression of specific gabaa receptor (gabaar) subunits (mature striatal in express the alpha1 subunit). by stereological determination, we found a 46% decrease in alpha1 subunit-expressing neurons. a lower immunoreactive intensity was restricted to the somata of dorsomedial striatal in (32%) of dt sz hamsters, indicating both a reduced density and a delayed maturation. these findings prompted us to examine the effects of the alpha1 gabaar-preferring compound zolpidem in comparison with the benzodiazepine clonazepam. zolpidem (2.0 and 10.0 mg/kg i.p.) exerted only moderate antidystonic effects compared to the benzodiazepine clonazepam (0.5 and 1.0 mg/kg i.p.) in the dt sz hamster. examinations of alpha2 gabaar-preferring compounds are ongoing. in summary, our studies indicate that there is no general defect in striatal gabaergic maturation in the dt sz mutant, but a specific alteration of striatal gabaergic interneurons which express alpha1 gabaar subunits. the changes in alpha1 gabaar subunit expression and the differences in the antidystonic efficacy of zolpidem and clonazepam indicate that further investigations of the role of gabaar subunits could lead to new therapeutic approaches for the treatment of dystonia.

2005). therefore, the hypothesis of the present study was that pregnenolone attenuates the inhibition of synaptic transmission elicited by cannabinoids. methods: 250 µm-thick slices containing the cerebellum and the nucleus accumbens were prepared from the brains of mice and rats.
spontaneous and electrically evoked gabaergic inhibitory postsynaptic currents (sipscs and eipscs) and evoked glutamatergic excitatory postsynaptic currents (eepscs) were analyzed in superfused brain slices with patch-clamp electrophysiological techniques. results: a) the synthetic cannabinoids jwh-018 (5 x 10^-6 m) and jwh-210 (5 x 10^-6 m) inhibited the spontaneous gabaergic synaptic input (sipscs) to purkinje cells in mouse cerebellar slices. the inhibition by jwh-018 was not affected by pregnenolone (10^-7 m); the inhibition by jwh-210 was only marginally attenuated. b) depolarization of the purkinje cells induced suppression of the gabaergic input to purkinje cells (dsi); pregnenolone (10^-7 m) did not affect this endocannabinoid-mediated form of synaptic suppression. c) in rat nucleus accumbens slices, gabaergic and glutamatergic synaptic input to medium spiny neurons was activated by electrical stimulation of axons. ∆9-tetrahydrocannabinol (2 x 10^-5 m) suppressed gabaergic and glutamatergic synaptic transmission in the nucleus accumbens. these suppressive effects of ∆9-tetrahydrocannabinol were not changed by pregnenolone (10^-7 m). d) finally, we tested whether we can observe neurosteroid-mediated effects in our brain slice preparations. tetrahydro-deoxycorticosterone (thdoc, 10^-6 m) markedly prolonged the decay time constant (τ) of spontaneous gabaergic postsynaptic currents (sipscs), similarly as in previous experiments (ej cooper et al., j physiol 521: 437-449, 1999). the results show that inhibition of gabaergic and glutamatergic synaptic transmission by synthetic, endogenous and phytocannabinoids is not changed by pregnenolone. therefore, it is unlikely that interference with cannabinoid-induced inhibition of synaptic transmission is the mechanism by which pregnenolone attenuates the behavioural and somatic effects of ∆9-tetrahydrocannabinol in vivo.

the hypothalamus is one of the key players in the regulation of energy homeostasis.
cold stress leads to an activation of neurons in the paraventricular hypothalamic nucleus (pvn) and increases thermogenesis. thyrotropin-releasing-hormone (trh) neurons have an important function in this effect. however, it is poorly understood exactly which role the trh neurons play and how they are connected to other regions of the brain. we transduced neurons in the pvn of mice with a recombinant adeno-associated virus which contains an activating "designer receptors exclusively activated by designer drugs" (dreadd) system under the control of a shortened trh promotor. two weeks after transduction, the animals were injected with clozapine-n-oxide (cno). to analyse the physiological function of these neurons, we performed indirect calorimetry, measured rectal temperature and thermogenesis in the brown adipose tissue (bat), and analysed drinking and feeding behaviour as well as home cage activity. after stimulation, we measured the expression of genes in bat as well as the plasma levels of pituitary hormones. propranolol and the specific β3-antagonist sr59230a were used to analyse the relevance of the sympathetic system. to further characterise the transduced neurons and their projections, we used immunohistochemical methods. after stimulation with cno, energy expenditure and body temperature were increased. these effects were mostly driven by an activation of the brown adipose tissue (bat). in dreadd-transduced trh-receptor 1 (trh-r1) knockout mice, these effects were abolished. in parallel, the plasma levels of tsh, the ucp1 mrna level in the bat, the home cage activity as well as food and water intake were increased. after treatment with propranolol and sr59230a, the effects on thermogenesis were reduced, but home cage activity was not affected. sr59230a treatment normalised food intake and in parallel increased plasma leptin concentrations after cno stimulation.
transduced neurons project into the raphe nucleus, the medial part of the thalamus and the spinal cord. with our experiments we could provide strong evidence for a sympathetic connection between the transduced neurons in the pvn and the bat, and for the involvement of trh neurons in these effects. therefore, this system is a suitable tool to investigate the metabolic relevance of trh neurons in detail.

background & objective: during obesity development, tissue factor signalling contributes, independently of coagulation, to inflammatory and metabolic dysfunction of adipose tissue. adipogenesis involves proliferation and differentiation of preadipocytes, apoptosis and hypertrophic growth of differentiated adipocytes, angiogenesis and extracellular matrix reorganisation. the coagulant protease thrombin promotes similar processes in various cell types through activation of the protease-activated receptors par-1, par-3 and par-4. in human adipose tissue, par-1 is found in vascular stromal cells, and par-4 in preadipocytes and differentiated adipocytes. thrombin stimulates mitogenic kinase signalling and induces inflammatory cytokine and angiogenic growth factor secretion in adipocytes. we have examined the contribution of thrombin receptor activation to adipogenesis in 3t3l1 cells. results: differentiation of 3t3l1 preadipocytes with insulin, dexamethasone and isobutylmethylxanthine increases leptin and pparg gene expression and the accumulation of triglycerides and oil red o-stained lipids. par-4 is time-dependently upregulated in maturing cells, while par-1 expression is detectable but not altered. in preadipocytes, thrombin (3 u/ml) activates the mitogenic kinase erk1/2, promotes cell proliferation and induces gene expression of the maturation markers leptin and pparg and of the inflammatory marker tumor necrosis factor alpha (tnfa).
repeated stimulation of differentiating adipocytes with thrombin suppresses induction of leptin and pparg and attenuates lipid accumulation, while expression levels of the proliferation marker ki67 and the inflammatory cytokine interleukin (il)-6 are increased compared to differentiated control cells. similar proliferative and anti-adipogenic effects are seen with the selective par4-activating peptide (gypgkf, 100 µm) and cathepsin g, a proteolytic par-4 activator released from neutrophils and mast cells. repeated exposure of maturing 3t3l1 cells to conditioned medium from degranulating mouse peritoneal mast cells (mccm) augments lipid accumulation and il-6 expression. pretreatment of mccm with a par-4 inhibitor further drives lipid accumulation, whereas the induction of il-6 is suppressed. conclusion: par-4 activation by thrombin or inflammatory cell-derived cathepsin g appears to suppress adipogenesis, possibly by maintaining proliferative capacity and preventing the growth arrest essential for initiating maturation. increased par-4 expression in maturing adipocytes may instead support inflammatory changes, thereby promoting the onset of insulin resistance.

the chemokine receptor cxcr4 antagonist amd3100 exerts deleterious effects in endotoxemia in vivo (s. seemann, a. lupp): the chemokine receptor cxcr4 is a multifunctional receptor which is activated by its natural ligand c-x-c motif chemokine 12 (cxcl12). although a blockade of the cxcr4/cxcl12 axis revealed beneficial outcomes in chronic inflammatory diseases, its importance in acute inflammatory diseases remains contradictory and not well characterized. as cxcr4 seems to be part of the lipopolysaccharide-sensing complex, cxcr4 agonists or antagonists may have a positive impact on tlr4 signaling. additionally, cxcr4 is involved in the production of pro-inflammatory cytokines, suggesting the receptor to be a promising target in terms of mitigating the cytokine storm.
therefore, we aimed to investigate the impact of a cxcr4 blockade on endotoxemia by applying a sublethal lps dose (5 mg/kg body weight) in mice. the selective cxcr4 inhibitor amd3100 was administered intraperitoneally shortly after lps treatment to ensure an immediate effect after endotoxemia onset. 24 hours after lps administration, the clinical severity score, the body temperature and the body weight of the animals were determined. afterwards, the mice were sacrificed and serum tnf alpha as well as ifn gamma levels were measured. furthermore, oxidative stress in brain, liver, lung and kidney tissue was assessed. in addition, the biotransformation capacity of the liver was evaluated and, finally, the expression of gp91 phox as well as of heme oxygenase 1 in the spleen and liver was determined by means of immunohistochemistry. the mice of the amd3100 plus lps treatment group displayed a significantly impaired general condition, a reduced body temperature and a decreased body weight in comparison to the control and to the lps-treated animals, respectively. tnf alpha levels were significantly increased, by more than 200% or 35% when compared to the control or to the lps group, respectively, whereas ifn gamma levels were elevated by about 11% in comparison to mice which had received lps only. in all investigated organs, but especially in the liver and in the kidney, co-administration of amd3100 and lps caused massive oxidative stress. furthermore, the protein contents and the activities of several cyp enzymes in the liver were significantly reduced. immunohistochemistry revealed gp91 phox expression to be above average, whereas heme oxygenase 1 expression was distinctly decreased. our results indicate that a blockade of cxcr4 in endotoxemia is disadvantageous and even worsens the disease. co-administration of amd3100 and lps impaired the health status of the animals, caused massive oxidative stress and diminished the biotransformation capacity.
thus, handling acute systemic inflammation with a cxcr4 antagonist cannot be recommended; instead, activation of cxcr4 may be an attractive treatment option.

toll-like receptors (tlrs) recognize microbial pathogens and trigger inflammatory immune responses to control infections. in acne vulgaris, activation of tlr2 by propionibacterium acnes contributes to inflammation. although glucocorticoids have immunosuppressive and anti-inflammatory effects, acne can be provoked by systemic or topical treatment. enhanced tlr2 expression by glucocorticoids has been reported in undifferentiated keratinocytes; however, human skin cells of different epidermal and dermal layers have not been investigated. in this study, the modulation of tlr expression by dexamethasone was assessed in monolayer cultures of primary human keratinocytes and dermal fibroblasts, as well as in the immortalized keratinocyte cell line hacat. constitutive tlr2, tlr1 and tlr6 mrna and protein expression was confirmed in basal keratinocytes, calcium-induced differentiated keratinocytes, hacat cells and fibroblasts by qpcr and western blotting. dexamethasone induced tlr2 expression in a time- and concentration-dependent manner and reduced tlr1/6 expression in keratinocytes but not in hacat cells or fibroblasts. stimulation with dexamethasone in the presence of the pro-inflammatory cytokines tnfα or il-1β further increased tlr2 mrna levels. gene expression of mapk phosphatase-1 (mkp-1) was also upregulated by dexamethasone. glucocorticoid-induced tlr2 expression was negatively regulated by p38 mapk signalling under inflammatory conditions through induction of mkp-1, which functions to deactivate mapks. as expected, dexamethasone inhibited the immune responses linked to tlr2 signalling, as demonstrated by reduced il-8, il-1β, mmp-1 and mcp-1 levels.
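relative mrna inductions of the kind quantified by qpcr above are commonly computed with the 2^-ΔΔct method (livak & schmittgen); a minimal sketch, with hypothetical ct values that are not data from this study:

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """relative expression by the 2^-ΔΔct method:
    normalise the target gene to a reference (housekeeping) gene in
    each condition, then compare treated vs. control."""
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    return 2 ** -(d_ct_treated - d_ct_control)

# hypothetical ct values for a target gene (e.g. tlr2) vs. a housekeeping gene
print(fold_change_ddct(22.0, 18.0, 25.0, 18.0))  # 8.0 (8-fold induction)
```

note that the method assumes roughly 100% amplification efficiency for both target and reference gene; efficiency-corrected variants exist.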
however, the expression of traf6, a critical cytosolic regulator of tlr- and tnf family-mediated signalling, was further upregulated by the tlr2 agonist hklm (heat-killed listeria monocytogenes) in dexamethasone-pretreated basal keratinocytes. conclusively, our results provide novel insights into the molecular mechanisms of glucocorticoid-mediated tlr expression and function in human skin cells.

psoriasis is a cutaneous chronic inflammatory disease characterized by increased amounts of il-1 cytokines and t helper 17 (th17)-related cytokines in lesional psoriatic skin. treatment with beta-adrenoceptor antagonists is associated with induction or aggravation of psoriasis; however, the underlying mechanism is poorly understood. previously, we could demonstrate a pivotal role for langerhans cells and dermal dendritic cells in antimalarial-provoked psoriasis by maintaining a potent th17 activity. in the present study, we investigated the effect of propranolol on human monocyte-derived langerhans-like cells (molc) and dendritic cells (modc) under inflammatory conditions. in the presence of il-1β, propranolol induced the th17-priming cytokines il-23 and il-6 in a concentration-dependent manner. the increased cytokine release was not mediated by camp, suggesting gpcr-independent pathways. in contrast, il-36γ and lps failed to increase il-23 release in molc and modc in the presence of propranolol but further induced secretion of il-1β. autophagy has been linked with the secretion of il-1 family cytokines that are upregulated in chronic inflammatory disorders such as psoriasis. propranolol upregulated the expression levels of the autophagy marker p62 and lc3-i to lc3-ii conversion, induced accumulation of lc3-positive vesicles, as well as expression of the il-1 signalling downstream adapter molecule traf6, indicating a late-stage block in autophagy.
in summary, our results suggest a prominent role of cutaneous dendritic cell subtypes in psoriasis-like skin inflammation mediated by propranolol and possibly other beta blockers.

langerhans cells (lcs) represent a highly specialized subset of epidermal dendritic cells (dcs) whose function in balancing skin immunity is not yet fully understood. in the present study, we investigated in vitro generated langerhans-like cells obtained from the human acute myeloid leukaemia cell line mutz-3 (mutz-lcs) to study tlr- and cytokine-dependent activation of epidermal dcs. mutz-lcs revealed high tlr2 expression and responded robustly to tlr2 engagement, confirmed by increased cd83 and cd86 expression, upregulated il-6, il-12p40 and il-23p19 mrna levels and il-8 release. tlr2 activation reduced ccr6 and elevated ccr7 mrna expression and induced migration of mutz-lcs towards ccl21. similar results were obtained by stimulation with the pro-inflammatory cytokines tnf-α and il-1β, whereas ligands of tlr3 and tlr4 failed to induce a fully mature phenotype. despite limited cytokine gene expression and production for tlr2-activated mutz-lcs, co-culture with naive cd4+ t cells led to significantly increased ifn-γ and il-22 levels, indicating th1 differentiation independent of il-12. tlr2-mediated effects were blocked by the putative tlr2/1 antagonist cu-cpt22; however, no selectivity for either tlr2/1 or tlr2/6 was observed. computer-aided docking studies confirmed non-selective binding of the tlr2 antagonist. taken together, our results indicate a critical role for tlr2 signalling in mutz-lcs considering the leukemic origin of the generated langerhans-like cells.

the stagnation in the development of new antibiotics during the last decades and the concomitant high increase of resistant bacteria emphasize the urgent need for new therapeutic options.
antimicrobial peptides are promising agents for the treatment of bacterial infections, and recent studies indicate that pep19-2.5, a synthetic antimicrobial and lps-neutralizing peptide (salp), efficiently neutralizes pathogenicity factors of gram-negative and gram-positive bacteria and protects against sepsis. in the present study, we investigated the potential of pep19-2.5 and the structurally related compound pep19-4lf for therapeutic application in bacterial skin infections. primary human keratinocytes responded to tlr2 (fsl-1) but not tlr4 (lps) activation with increased il-8 production, as determined by elisa. western blot analysis showed that both salps inhibited fsl-1-induced phosphorylation of nf-κb p65 and p38 mapk. furthermore, the peptides significantly reduced il-8 release and gene expression of il-1β, ccl2 (mcp-1) and hbd-2, as assessed by qpcr. no cytotoxicity (mtt test) was observed at salp concentrations below 10 µg/ml. in lps-stimulated monocyte-derived dendritic cells, the peptides blocked il-6 secretion, downregulated expression of the maturation markers cd83 and cd86, as analysed by flow cytometry, and inhibited ccr7-dependent migration capacity. similarly, monocyte-derived langerhans-like cells activated with lps and pro-inflammatory cytokines showed reduced il-6 levels and cd83/cd86 expression in the presence of salps. in addition to acute inflammation, bacterial infections often result in impaired wound healing. since re-epithelialization is a critical step in wound repair, we tested whether pep19-2.5 affects keratinocyte migration using the scratch wound assay. the peptide markedly promoted cell migration and accelerated artificial wound closure at concentrations as low as 1 ng/ml and was equipotent to tgf-β. conclusively, our data suggest these salps as a novel therapeutic option for the treatment of patients with acute and chronic skin infections.
recently, we and others have shown that the transcription factor nuclear factor erythroid 2-related factor 2 (nrf2), a major regulator of the cellular antioxidant defence system, is activated by mechanical ventilation. during ventilator-induced lung injury, nrf2 exerts a protective role by interaction with the stretch-induced growth factor amphiregulin. in the current study, we aimed to investigate the role of nrf2 in acid-induced lung injury, a model for aspiration-induced ards. methods: nrf2-deficient (nrf2-/-) mice and wild-type (wt) littermates were tracheotomised and ventilated for 30 min (vt = 16 ml/kg, f = 90/min, peep = 2 cmh2o, fio2 = 0.3) before 50 µl hydrochloric acid (hcl) with ph 1.5 or ph 1.3 was instilled intratracheally; controls received nacl. mice were then ventilated for a further 6 h under monitoring of lung mechanics and vital parameters. blood gases as well as proinflammatory mediators, neutrophil recruitment and microvascular permeability were examined to assess lung injury. results: instillation of hcl at ph 1.5 induced mild lung injury, indicated by hypoxemia (po2/fio2 ~300 mmhg) and continuously increasing lung tissue elastance (stiffness), from which nrf2-/- mice were protected. pulmonary inflammation, characterized by liberation of cytokines and chemokines and by oedema formation, was attenuated in nrf2-/- mice. in contrast, hcl at ph 1.3 caused more severe lung injury (po2/fio2 ~200 mmhg) with a steeper incline in elastance and more severe inflammation in both wt and nrf2-/- mice. conclusion: we conclude that the presence of nrf2 augments mild acid-induced lung injury but plays no role in more severe injury. these discrepant results will be elucidated in future investigations.

uniklinik rwth aachen, institut für pharmakologie und toxikologie, aachen, germany; fakultät für maschinenwesen der rwth aachen, werkzeugmaschinenlabor, aachen, germany. rationale: reproducibility is key to science.
in recent times, the reproducibility of biomedical research has been increasingly questioned. this reproducibility crisis also affects complex animal experiments, which, if not reproducible, might also be regarded as unethical and lose public acceptance. part of the problem is frequently that the provided documentation is not sufficient for reproduction. therefore, in this study we analyzed the potential of conventional quality management tools, used as standard in machine production, as an approach to improve the documentation and ascertain the quality of complex animal experiments. methods: quality management tools were transferred to an experimental animal set-up, the mouse intensive care unit (micu), which we use for lung injury studies. the tools included visualization of the experimental set-up, transfer of the experimental procedures to an event-driven process chain (epc) and statistical process control (spc) of all crucial pulmonary and cardiovascular parameters. data from ventilator- and acid-induced lung injury studies acquired in the micu were analyzed retrospectively. results: schematic visualization of the micu resulted in a chart comprising medical components, hardware, software and generated data types. the customized epc included all important activities and the resulting events for preparation of the mouse and the workplace, the actual animal ventilation experiment and sample-taking. in addition, checklists were provided for these activities and events to ensure standardization of every work step. lung impedance and cardiac functions from ventilator- and acid-induced lung injury models were analyzed by spc and correlated with events in the epc. the spc proved to be suitable to identify outliers, predict processes and thereby validate the lung injury models. conclusions: conventional quality management tools were successfully adapted to analyze the quality of lung injury experiments in the micu.
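the outlier-detection step of spc described above can be sketched with simple shewhart individuals-chart limits (mean ± 3 sd of an in-control baseline); the elastance values below are hypothetical and not data from the micu studies:

```python
import statistics

def control_limits(baseline, k=3.0):
    """shewhart-style limits: mean ± k standard deviations of
    in-control baseline measurements."""
    mean = statistics.mean(baseline)
    sd = statistics.stdev(baseline)
    return mean - k * sd, mean + k * sd

def outliers(series, lcl, ucl):
    """indices of measurements falling outside the control limits."""
    return [i for i, x in enumerate(series) if not lcl <= x <= ucl]

# hypothetical lung-elastance readings: a stable baseline, then a monitored run
baseline = [25.1, 24.8, 25.3, 25.0, 24.9, 25.2, 25.1, 24.7]
lcl, ucl = control_limits(baseline)
print(outliers([25.0, 25.2, 27.9, 24.9], lcl, ucl))  # [2] — the 27.9 reading
```

full spc would add run rules (e.g. western electric rules) on top of these limits; the single ±3 sd rule shown here is the simplest variant.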
we suggest that this new approach is suitable to standardize animal testing procedures and increase the reproducibility of animal studies.

background: a dysfunctional endothelial l-arginine-nitric oxide (no) pathway is a key pathomechanism of idiopathic pulmonary arterial hypertension (ipah) that can be provoked by hypoxia in cell culture models [1-4]. the small peptide apelin is involved in the maintenance of pulmonary vascular homeostasis and angiogenesis, although its precise mechanism of action is still unclear [5]. asymmetric dimethylarginine (adma) is known to be an endogenous inhibitor of endothelial no synthase and is associated with several cardiovascular diseases [6]. adma is degraded by the dimethylarginine dimethylaminohydrolase 1 and 2 (ddah) enzymes [7]. objective: to determine the effect of apelin on the l-arginine/no pathway in human pulmonary microvascular endothelial cells (hpmecs). methods: hpmecs were cultured under normoxic and ph-related hypoxic conditions and treated with apelin. the expression of regulators of the l-arginine/no pathway was analysed using real-time pcr. the effect of apelin on the phosphoinositide-3 kinase (pi3k)/akt signalling pathway was determined using immunoassays and specific inhibitors. apelin and adma concentrations were measured in cell culture supernatants using an enzyme-linked immunosorbent assay and a liquid chromatography-tandem mass spectrometry assay. results: treatment with apelin resulted in a reduced expression of the apelin receptor (aplnr) on hpmecs, suggesting a negative feedback mechanism. apelin directly influenced the l-arginine/no pathway by increasing the expression of the ddah1 and ddah2 enzymes. thus, the concentration of adma was decreased in hpmec supernatants following treatment with apelin. the effect of apelin could be abrogated by modulation of the pi3k/akt pathway.
conclusion: apelin modulates the l-arginine/no pathway and mediates enhanced degradation of adma via an upregulated expression of the ddah1 and ddah2 enzymes. the pi3k/akt pathway might play a decisive role in regulating the effect of apelin. an apelin receptor agonist could be a novel and promising therapeutic option for ipah treatment.

background and purpose: there is presently no proven pharmacological therapy for the acute respiratory distress syndrome (ards). recently, we and others discovered that the heptapeptide angiotensin (ang)-(1-7) shows significant beneficial effects in preclinical models of acute lung injury (ali). here, we aimed to identify the best time window for ang-(1-7) administration to protect rats from oleic acid (oa)-induced ali. experimental approach: the effects of intravenously infused ang-(1-7) were examined over four different time windows before or after induction of ali in male sprague-dawley rats. hemodynamic effects were continuously monitored, and loss of barrier function, inflammation and lung peptidase activities were measured as experimental endpoints. key results: ang-(1-7) infusion provided the best protection from experimental ali when administered by continuous infusion starting 30 min after oa infusion until the end of the experiment (30-240 min). both pretreatment (-60 to 0 min before oa) and short-term therapy (30-90 min after oa) also had beneficial effects, although less pronounced than the effects achieved with the optimal therapy window. starting infusion of ang-(1-7) 90 min after oa (late-term infusion) achieved no protective effects on barrier function or hemodynamic alterations, but still reduced myeloperoxidase and angiotensin-converting enzyme activities. conclusions and implications:
our findings indicate that early initiation of therapy after ali and continuous drug delivery are most beneficial for optimal therapeutic efficiency of ang-(1-7) treatment in experimental ali, and presumably also in clinical ards.

airway epithelium functions as a physicochemical barrier against dust, air pollutants and pathogens and plays a critical role in physiological and pathological processes including modulation of the inflammatory response, innate immunity and airway remodeling, such as in human asthma, copd and equine recurrent airway obstruction (rao). models of the airway epithelium are so far missing for the horse; thus, we established long-term equine bronchial epithelial cell cultures using the rock inhibitor y-27632, and cell growth and differentiation were characterized. bronchial epithelial cells (ebec) from adult horses were cultured in the presence and absence of 10 µm y-27632 under conventional and air-liquid-interface (ali) culture conditions. cell proliferation and differentiation were analyzed. formation of a functional epithelial barrier was investigated by transepithelial electrical resistance (teer) measurement and immunocytochemical staining of the tight-junction protein zonula occludens-1 (zo-1). under conventional culture, y-27632 induced a higher growth rate of primary ebec and increased the passage number up to 5 passages with retained epithelial cell behavior. in the presence of y-27632, ebec under ali showed higher teer values. expression of zo-1 correlated with the increase in teer, but in y-27632-treated ebec tight-junction formation was more rapid, indicating accelerated differentiation; in addition, h/e staining and scanning electron microscopy showed higher amounts of cilia, microvilli and pas-positive cells.
in conclusion, the data suggest that the rock inhibitor y-27632 facilitates long-term culture of equine bronchial epithelial cells, which can be used to study airway disease mechanisms and to identify pharmacological targets.

leishmaniasis is a neglected disease of tropical and subtropical regions with millions of people at risk of infection, with severe consequences including death. current antileishmanial drugs exhibit serious side effects, and the development of resistance is rising. this disease is caused by protozoal organisms of the genus leishmania. in their insect vector they exist in the promastigote form, while in the mammalian host they survive as amastigotes inside the phagolysosomes of macrophages. this makes a specific pharmacotherapy complicated. due to the success of artemisinin in malaria therapy, it was of interest whether endoperoxides are also useful to treat leishmaniasis. in a previous study we demonstrated that ascaridole, an endoperoxide from chenopodium ambrosioides, can cure cutaneous leishmaniasis in a mouse model and exhibited ic50 values for viability in the low micromolar range [1]. even though some basic ideas about the mechanism of activation of these endoperoxides exist from chemical model systems, in biological systems including leishmania parasites this activation step has never been demonstrated. therefore, we set up experiments to identify primary drug intermediates formed from ascaridole by activation in leishmania tarentolae promastigotes using electron spin resonance spectroscopy in combination with spin trapping methods. ascaridole was activated in a cell-free system by fe2+. the radicals were trapped by 2-methyl-2-nitrosopropane (mnp). the resulting esr spectra consisted of a triplet of doublets. spectral simulations revealed coupling parameters of a_n = 16.8 g and a_h = 1.8 g. these coupling constants are compatible with isopropyl radicals as primary intermediates.
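the multiplet patterns reported here follow from the standard hyperfine rule that n equivalent nuclei of spin i split an esr line into 2·n·i + 1 components, with the multiplicities of independent groups multiplying; a minimal sketch (the nuclear assignments below are the usual ones for nitroso/nitroxide spin adducts, assumed here for illustration):

```python
from fractions import Fraction

def esr_lines(nuclei):
    """number of hyperfine lines for groups of equivalent nuclei.
    nuclei: list of (n_equivalent, nuclear_spin) tuples; each group
    contributes a factor of 2*n*i + 1 to the total multiplicity."""
    total = 1
    for n, spin in nuclei:
        total *= int(2 * n * spin + 1)
    return total

# mnp adduct: one 14n (i = 1) gives a triplet, one h (i = 1/2) doubles it
print(esr_lines([(1, Fraction(1)), (1, Fraction(1, 2))]))  # 6: triplet of doublets
```

the same count of 2·n·i + 1 factors (one 14n, one β-h) reproduces the six-line pattern of a dmpo carbon-centered radical adduct.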
in the cellular system, consisting of leishmania tarentolae promastigotes, the less cytotoxic 5,5-dimethyl-1-pyrroline-n-oxide (dmpo) was used for spin trapping instead of mnp. without addition of fe2+, a six-line esr signal was observed. spectral simulations of the dmpo spin adduct revealed coupling constants of a_n = 16.1 g and a_h = 24.6 g. according to previously published data [2] from other spin trapping experiments, this corresponds to the formation of carbon-centered radicals from ascaridole by leishmania parasites. additional experiments using iron chelators and antioxidants as well as a comparison with the endoperoxide artemisinin were performed. in summary, this study demonstrated for the first time the activation of the endoperoxide ascaridole by a protozoal organism to its active intermediate, as a prerequisite to understanding its mechanism of action. [1] l. monzote, j. pastor, r. scull, and l. gille. antileishmanial activity of essential oil from chenopodium ambrosioides and its main components against experimental cutaneous leishmaniasis in balb/c mice. phytomedicine 21:1048-1052, 2014.

nitric oxide (no), produced by the inducible nitric oxide synthase (inos), has many functions in physiological and pathophysiological pathways. after induction of inos expression by cytokines and other agents, the enzyme produces high amounts of no in a ca2+-independent way. this high no production can have beneficial microbicidal, antiparasitic, antiviral and antitumoral effects. in contrast, aberrant inos induction may have detrimental consequences and seems to be part of many diseases such as asthma, arthritis, multiple sclerosis, colitis, psoriasis, neurodegenerative diseases, tumor development, transplant rejection or septic shock.
analysis of the human inos-mrna structure revealed the existence of an upstream open reading frame (µorf) and a putative internal ribosome entry site (ires) in the 5' untranslated region (5'utr) in front of the start codon of the inos coding sequence (cds). to analyze the function of the µorf and the putative ires we cloned different egfp and luciferase reporter constructs and transfected them into the human colon carcinoma cell line dld1. using a plasmid construct with the µorf fused to the egfp cds, we could show that the µorf can be translated. however, compared to the positive control plasmid less egfp was produced, which can be explained by the weak kozak sequence of the µorf. blocking cap-dependent mrna translation by cloning a stem-loop structure in front of the inos 5'utr within a luciferase reporter plasmid led to a remarkable loss of luciferase production. thus, the expression of inos seems to be cap-dependent. furthermore, transfection experiments in dld1 cells using constructs coding for a bicistronic renilla-firefly luciferase mrna showed that there is no ires in front of the inos cds. taken together, inos expression seems to be cap-dependent and without influence of an ires, while the µorf is translatable. therefore we speculate that inos expression is only possible due to a leaky scanning mechanism depending on the weak kozak sequence of the µorf.

objectives: vascular oxidative stress is considered a pathophysiologic factor promoting cardiovascular diseases such as coronary artery disease, heart failure, diabetes and hypertension. there are several sources of superoxide in vascular smooth muscle and endothelial cells, but whether an impairment of the catalytic function of enos, and thus generation of oxidative stress, is involved in blood pressure (bp) regulation and/or the development of hypertensive disease states is unknown.
methods: we generated a mutant enos in which one of the two essential cysteines required for coordination of the central zn ion, correct dimer formation and normal activity is replaced by alanine (c101a-enos). normal enos (enos-tg) or the novel dimer-destabilized c101a-enos described previously (antioxid redox signal 2015;23(9):711-23) was introduced into c57bl/6 mice in an endothelial-specific manner. mice were monitored for enos expression and localization, aortic relaxation, systolic blood pressure, levels of superoxide and several post-translational modifications indicating activity and/or increased vascular oxidative stress. some groups of mice underwent voluntary exercise training for 4 weeks or treatment with the sod mimetic tempol. results: c101a-enos-tg showed significantly increased superoxide generation, protein- and enos-tyrosine nitration, enos s-glutathionylation, enos ser1176/79 phosphorylation and amp-activated protein kinase (ampkα) phosphorylation at thr172 in aorta, skeletal muscle, left ventricular myocardium and lung as compared to enos-tg and wild-type (wt) controls. the localization of c101a-enos-tg was restricted to the endothelium, as evidenced by immunohistochemical staining for enos and the endothelial-specific marker cd31. exercise training increased phosphorylation of enos at ser1176/79 and of ampkα at thr172 in wt but not in c101a-enos-tg. aortic endothelium-dependent and endothelium-independent relaxations were similar in all strains. in striking contrast, c101a-enos-tg displayed normal blood pressure despite higher levels of enos, while enos-tg showed significant hypotension. tempol completely reversed these protein modifications and significantly reduced bp in c101a-enos-tg but not in wt controls.
conclusions: by means of a novel transgenic mouse model we demonstrated that vascular oxidative stress generated by endothelial-specific expression of a dimer-destabilized variant of enos selectively prevents the bp-reducing activity of vascular enos, while having no effect on aortic endothelium-dependent relaxation. these data suggest that oxidative stress in the microvascular endothelium may play a role in the development of essential hypertension.

the herbal medicinal product myrrhinil-intest® consists of myrrh, chamomile flower dry extract and coffee charcoal. clinical data demonstrate the effectiveness of this herbal preparation for inflammatory intestinal disorders. to further investigate the anti-inflammatory potential of the single components as part of a multi-target principle, an ethanolic (my) and an aqueous (mya) myrrh extract, an ethanolic chamomile flower extract (ka) and an aqueous coffee charcoal extract (cc) were examined in an in vitro tnbs inflammation model using rat small intestinal preparations. the effect of the plant extracts on tnbs-induced inflammatory damage was characterised based on tnfα gene expression analysis, isometric contraction measurement and histological analysis. furthermore, tnfα release from lps-stimulated thp-1 cells was determined. budesonide was used as positive control. additionally, microarray gene expression analysis was performed in lps/ifnγ-stimulated native human macrophages to determine potential underlying mechanisms. the tnbs-induced overexpression of tnfα-mrna was reduced after ka (0.1 mg/ml) and mya (1 mg/ml) treatment down to 24% and 16%, respectively; the tnbs-induced loss of contractility and reduction of mucosal layer thickness were inhibited after ka (3 mg/ml) treatment by 26% and 25%, respectively, and after mya (0.1-1 mg/ml) treatment by 17% and 44%, respectively.
lps-induced tnfα release from thp-1 cells was inhibited concentration-dependently by my (ic50 = 60.65 μg/ml; 97% inhibition), ka (ic50 = 439 μg/ml; 71% inhibition) and cc (ic50 = 1886 μg/ml; 44% inhibition). furthermore, ka (200 µg/ml) and cc (500 µg/ml) inhibited the lps/ifnγ-induced expression of genes associated with chemokine signalling up to 100-fold (for cxcl13). the presented study provides further evidence for anti-inflammatory properties of the herbal components which contribute to the reported clinical effectiveness.

introduction: the purine nucleoside adenosine, which is involved in a variety of physiological functions, regulates immune and inflammatory responses and acts as a modulator of gut functions. although it is present at low concentrations in the extracellular space, stressful conditions such as inflammation can markedly increase its extracellular level up to the micromolar range. by activation of different receptor subtypes, adenosine is able to induce anti-inflammatory or pro-inflammatory effects. aim: the current study examined the contribution of adenosine a2a receptors (a2ar) and adenosine a2b receptors (a2br) to the regulation of contractility in untreated and inflamed rat colon preparations, using a specific a2ar agonist (cgs 21680) and an a2br antagonist (psb-1115). furthermore, it focused on interactions of the multi-herbal drug stw 5 with the a2ar as a possible mechanism of the protective effect of stw 5 in gastrointestinal disorders. methods: inflammation was induced by intraluminal instillation of 2,4,6-trinitrobenzene sulfonic acid (tnbs). contractions were measured isometrically in an organ bath set-up. gene expression was determined using rt-pcr. radioligand binding assays (competition experiments) were carried out with rat brain homogenates. morphological changes were estimated after van gieson staining. results: all four adenosine receptor subtypes were expressed in untreated colon preparations.
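ic50 values such as those reported above for my, ka and cc are obtained by fitting a hill-type concentration-response model to the inhibition data; a minimal sketch of the model itself (the curve-fitting step is omitted, and the hill slope of 1 is an assumption, not a value from this study):

```python
def percent_inhibition(conc, ic50, hill=1.0):
    """simple hill (logistic) concentration-response model, returning
    percent of maximal inhibition at a given concentration."""
    return 100.0 * conc**hill / (ic50**hill + conc**hill)

# by construction, the model returns half-maximal inhibition at the ic50
print(percent_inhibition(60.65, 60.65))  # 50.0
```

a full four-parameter fit would additionally estimate the bottom and top plateaus of the curve from the measured data.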
activation of a1, a2a and a3 receptors with specific agonists reduced the acetylcholine (ach, 10 µm)-induced contractions, while activation of a2br enhanced them. after incubation with tnbs, morphological damage in the colonic mucosa and muscle walls was detectable, followed by reduced ach contractions. the tnbs-mediated decrease of ach contractions as well as the morphological damage was partially normalized by co-incubation of tnbs with cgs 21680 (10 µm) or with psb 1115 (100 µm). the same effects, with smaller intensity, were found for stw 5 (512 µg/ml) in female but not in male colon preparations. these results are in accordance with ligand binding studies indicating that stw 5 interacts with the a2ar. conclusion: anti-inflammatory mechanisms and cell-protective actions of stw 5 are partly due to the interaction with adenosine receptors. the results give a clear-cut correlation with symptom improvements in clinical trials and thereby highlight the relevance of stw 5 as a therapeutic approach in ibs (allescher 2006).

therefore, a multi-target approach is a promising therapeutic strategy, as exemplified by stw 5 (ottillinger et al. 2013). stw 5 (iberogast®) is a fixed combination of nine plant extracts with iberis amara (stw 6) as one of its components. it is successfully used for treatment of functional dyspepsia and irritable bowel syndrome (ibs). to allow an overview of the targets addressed by stw 5 and the role of its components in relation to the different forms and causes of functional gi diseases, an evaluation of the data which have been gained from more than 150 pharmacological tests is needed. all data from studies including stw 5 alone, or stw 5 and its components, were retrieved and sorted according to types of study models (human and animal systems, animal disease models, gi preparations, cell cultures, in vitro systems) and respective etiologic mechanisms related to fgds, and then visualized in the form of 2d histograms (lorkowski et al. 2015).
Results: More than 150 pharmacological tests indicated anti-oxidative activity, electrophysiological effects, ulcer protection, anti-inflammatory actions, pro-kinetic and spasmolytic effects, as well as reflux and acid reduction. Moreover, the analysis indicated that the components of STW 5 contribute differently to the overall effect of STW 5. Altogether, the evaluation of the data shows that STW 5 is active against multiple etiologic factors involved in FGDs, especially functional dyspepsia and irritable bowel syndrome, and indicates to which extent the herbal extract components of the combination are relevant for the different mechanisms of action and their translation to clinical efficacy. Conclusion: Multi-step clustering allows the transformation of complex data sets. It makes the allocation of specific actions to the different components of STW 5 manageable, thereby also supporting its clinical use in patients with different symptoms. Introduction: STW 5 II has recently been developed in an effort to reduce the number of active extracts in the mother multi-component herbal preparation, STW 5 (Iberogast®, Steigerwald Arzneimittelwerk GmbH, Darmstadt, Germany), without affecting the overall therapeutic efficacy. STW 5 consists of a mixture of 9 standardized extracts: bitter candytuft (Iberis amara), lemon balm (Melissa officinalis), chamomile (Matricaria recutita), caraway fruit (Carum carvi), peppermint leaf (Mentha piperita), liquorice root (Glycyrrhiza glabra), angelica root (Angelica archangelica), milk thistle (Silybum marianum) and celandine herb (Chelidonium majus), whereas STW 5 II lacks the last 3 components. STW 5 was shown to be clinically effective in treating functional dyspepsia (1) and irritable bowel syndrome (2), and was shown experimentally to guard against the development of radiation-induced intestinal mucositis (3) and to be useful in the management of ulcerative colitis (4).
The present study was initiated to determine whether STW 5 II, with its reduced number of component extracts, would be equally effective in the latter condition. Colitis was induced in Wistar rats by administering 5% dextran sodium sulfate (DSS) in drinking water for 1 week, after which lesions were observed in the colon, evidenced by histological examination as well as colon shortening and a reduction of the colon mass index. This was associated with a rise in myeloperoxidase and a fall in reduced glutathione, glutathione peroxidase, and superoxide dismutase in colon homogenates, as well as a rise in TNFα in serum. Oral administration of STW 5 in doses of 2 and 5 ml/kg, or STW 5 II in a dose of 2 ml/kg, for 1 week before and continued during DSS feeding tended to normalize all these changes in a fashion comparable to sulfasalazine, used as a reference drug in a dose of 300 mg/kg. Conclusions: The modified preparation STW 5 II thus proved to be as effective as STW 5, reflecting its potential usefulness in ulcerative colitis, possibly by virtue of its anti-inflammatory and anti-oxidant properties. (1) Schmulson MJ (2008). The emetic pathways include the action of the neurotransmitters dopamine, serotonin and substance P in the emetic centers localized in the brainstem, area postrema and vagal nerve afferents. Previous in vivo studies in beagle dogs revealed that the plant alkaloid lycorine potentially induces nausea and emesis. Although antagonists of the tachykinin receptor 1 (maropitant) and serotonin receptor 3 (ondansetron) prevented lycorine-mediated emesis, the molecular mechanisms of nausea and vomiting still remain unknown.
To study the mechanism of action of the emetic agents, we analyzed the effect of lycorine (direct activation of NK1) and channel opening (activation of 5-HT3) on intracellular calcium homeostasis (using fluorometric Ca2+ analysis) and on cell proliferation rates in cell lines endogenously expressing NK1 and 5-HT3 receptors, as well as in CHO and HEK cells stably expressing the receptors. Neither endogenously NK1- or 5-HT3-expressing cells nor receptor-overexpressing cells showed calcium flux or calcium mobilization after stimulation with lycorine. Furthermore, we are measuring receptor number and subtypes using radioligand binding studies. It is planned, moreover, to obtain fluorescently labeled constructs of the NK1 receptor to gain insights into the involvement of receptor internalization, which might mediate emesis. By characterizing these molecular principles of the NK1 and 5-HT3 receptors, we attempt to obtain more information for predicting drug-induced side effects such as nausea and emesis. The intestinal epithelium is completely renewed every 4-5 days. This process is driven by stem cells, which reside within specialized niches in the intestinal crypts and give rise to several differentiated cell types, including enterocytes, Paneth, enteroendocrine, goblet and tuft cells. However, the molecular mechanisms that establish and maintain differentiated cell numbers and proportions remain largely unknown. Here, we systematically analyzed the intestinal expression of semaphorins and plexins, which constitute a ligand-receptor system that plays central roles in cell-cell communication in various biological contexts. We identified Plexin-B2 and its semaphorin ligands to be highly expressed in intestinal epithelial cells. Genetic inactivation of Plexin-B2 in intestinal organoids strongly reduced the number of enteroendocrine cells.
Our data suggest that semaphorin-Plexin-B2 signaling promotes differentiation of intestinal epithelial cells towards the enteroendocrine lineage. The gastric epithelium contains several types of differentiated cells, including foveolar cells that produce mucus, parietal cells that secrete gastric acid and intrinsic factor, chief cells that synthesize pepsinogen and gastric lipase, and enteroendocrine cells that release different hormones. These differentiated cell types all originate from multipotent stem cells, yet little is known about how this differentiation process is regulated at the molecular level. The GAP protein RASAL1 controls the activity of small GTPases of the Ras family, and its expression levels have been shown to correlate inversely with the progression of stomach cancers. However, functional studies on the physiological role of RASAL1 in the gastric epithelium are lacking. Here, we established and characterized a mouse line with inactivation of the Rasal1 gene. We observed that these mice showed increased numbers of enteroendocrine cells in the gastric mucosa. Conditional inactivation of Rasal1 in enteroendocrine cells, using a mouse line in which Cre expression is driven by the Atoh1 promoter, further corroborated that RASAL1 expression in enteroendocrine cells determines enteroendocrine cell numbers. These findings identify RASAL1 as a regulator of gastric epithelial cell differentiation. The present study investigates the impact of two FAAH inhibitors (arachidonoyl serotonin [AA-5HT], URB597) on A549 lung cancer cell metastasis and invasion. LC-MS analyses revealed increased levels of FAAH substrates (AEA, 2-AG, OEA, PEA) in cells incubated with either FAAH inhibitor. In athymic nude mice, FAAH inhibitors were shown to elicit a dose-dependent antimetastatic action.
In vitro, a concentration-dependent anti-invasive action of either FAAH inhibitor was demonstrated, accompanied by upregulation of tissue inhibitor of matrix metalloproteinases-1 (TIMP-1). Using siRNA approaches, a causal link between the TIMP-1-upregulating and the anti-invasive action of FAAH inhibitors was confirmed. Moreover, knockdown of FAAH by siRNA was shown to confer decreased cancer cell invasiveness and increased TIMP-1 expression. Inhibitor experiments point toward a decisive role of CB2 and transient receptor potential vanilloid 1 in conferring the anti-invasive effects of FAAH inhibitors and FAAH siRNA. Finally, antimetastatic and anti-invasive effects were confirmed for all FAAH substrates. Collectively, the present study provides first-time proof of a pronounced antimetastatic action of the FAAH inhibitors AA-5HT and URB597. An upregulation of TIMP-1 was identified as the mechanism underlying their anti-invasive properties. Regenerative activity in tissues of mesenchymal origin depends on the migratory potential of mesenchymal stem cells (MSCs). The present study focused on inhibitors of the enzyme fatty acid amide hydrolase (FAAH), which catalyzes the degradation of endocannabinoids (anandamide, 2-arachidonoylglycerol) and endocannabinoid-like substances (N-oleoylethanolamine, N-palmitoylethanolamine). In Boyden chamber assays, the FAAH inhibitors URB597 and arachidonoyl serotonin (AA-5HT) were found to increase the migration of human adipose-derived MSCs. LC-MS analyses revealed increased levels of all four aforementioned FAAH substrates in MSCs incubated with either FAAH inhibitor. Following addition to MSCs, all FAAH substrates mimicked the promigratory action of the FAAH inhibitors. The promigratory effects of FAAH inhibitors and substrates were causally linked to activation of p42/44 mitogen-activated protein kinase (MAPK), as well as to cytosol-to-nucleus translocation of the transcription factor peroxisome proliferator-activated receptor α (PPARα).
Whereas PPARα activation by FAAH inhibitors and substrates was reversed upon inhibition of p42/44 MAPK activation, a blockade of PPARα left p42/44 MAPK phosphorylation unaltered. Collectively, these data demonstrate that FAAH inhibitors and substrates cause p42/44 MAPK phosphorylation, which subsequently activates PPARα to confer increased migration of MSCs. This novel pathway may be involved in regenerative effects of endocannabinoids, whose degradation could be a target of pharmacological intervention by FAAH inhibitors. Background: The hematopoietic disorder chronic myeloid leukemia (CML) is one of the most extensively studied neoplasms. It is caused by a translocation between chromosomes 9 and 22 leading to the formation of the Philadelphia chromosome and the BCR-ABL fusion gene. First-line targeted therapy is still the tyrosine kinase inhibitor imatinib (IM), which has led to tremendous success in treatment. However, therapeutic resistance is increasingly observed, caused either by BCR-ABL-dependent mechanisms (e.g. BCR-ABL amplification/overexpression, point mutations) or BCR-ABL-independent mechanisms. The latter might be linked to alterations in drug transporter expression or, particularly, microRNA expression levels. In our previous study, we analyzed the changes of microRNA expression profiles during the development of IM resistance in the leukemic cell line K562. An inverse correlation between miR-212 expression and protein levels of the efflux transporter ATP-binding cassette transporter G2 (ABCG2) was observed in cells resistant to different IM concentrations, pointing to a relation of miR-212 to IM resistance. Hence, we investigate in current studies how the influence of miR-212 on IM sensitivity could be explained.
Methods: We transfected K562 cells, both sensitive treatment-naïve cells and cells resistant to various IM concentrations, either with the miR-mimic pre-miR-212 or with inhibitory anti-miR-212, challenged them with IM, and analyzed the effects on cell viability, activation of apoptosis and cell death using WST-1 and Caspase-Glo 9 assays and cell counting. In addition, we analyzed changes in ABCG2 expression using flow cytometry and qRT-PCR, and investigated alterations in IM efflux using HPLC and the Hoechst efflux assay. Results: Under IM treatment, sensitive K562 cells showed an effect of miR-212 inhibition using anti-miR-212. This led to a significant promotion of cell survival, apparent at the level of respiratory chain function (p<0.01) and cell membrane integrity, and to reduced caspase-9 activity (p<0.05). Furthermore, these miRNA effects are dose-dependent, as confirmed in concentration-series experiments. Regarding transport and ABCG2 expression, we found that 2 µM IM-resistant K562 cells do not express higher amounts of ABCG2, but showed higher transport rates of IM or the ABCG2 substrate Hoechst 33342. Conclusions: Overall, these experiments indicate that miR-212 not only affects ABCG2 expression, but also influences cell sensitivity to IM in a more direct manner. Further analyses will now be performed to reveal the underlying mechanism of how cell sensitivity to IM is altered and whether these effects occur due to a direct regulation of ABCG2. In summary, these findings could be relevant in CML therapy, helping to overcome IM resistance through a better understanding of miRNA and drug transporter alterations in CML. Acknowledgments: We would like to thank all the authors for their contribution to this project. This work was funded by the University Hospital Schleswig-Holstein. Oxidized silicon nanoparticles and iron oxide nanoparticles for radiation therapy. S. Klein. Radiation therapy, often combined with surgery and/or chemotherapy, is applied to more than 50% of patients at some point of their treatment.
The cytotoxic effects of ionizing radiation arise from its ability to produce DNA double-strand breaks through the formation of free radicals within cells. However, the curative potential of radiotherapy is often limited by the intrinsic radioresistance of cancer cells and by normal tissue toxicity. To overcome this resistance and enhance the effectiveness of ionizing radiation, radiosensitizers are used in combination with radiotherapy. In our studies we used amino-functionalized, oxidized silicon nanoparticles (SiNPs), superparamagnetic iron oxide nanoparticles (SPIONs) and iron-doped silicon nanoparticles (Fe(1%)-SiNPs) to increase the formation of reactive oxygen species (ROS) in cells. Cancer and tissue cells loaded with the various nanoparticles were irradiated with a single dose of 1-3 Gy using a 120 kV X-ray tube. After irradiation, the formation of the different ROS species, including superoxide, hydroxyl radicals and singlet oxygen, was investigated. SiNPs with sizes around 1 nm can easily cross the cell and nuclear membranes. The positively charged amino-functionalized SiNPs stick in all membranes, including those of the mitochondria. Irradiation of the mitochondria may cause depolarization of the mitochondrial membrane, which enables the release of cytochrome c and, simultaneously, an inhibition of the respiratory chain, which leads to an increased generation of superoxide. Amino-functionalized SiNPs, being embedded in the outer mitochondrial membrane, evidently enhance the depolarizing effect of the X-ray radiation on the mitochondria and therefore increase the concentration of superoxide [1]. Oxidized SiNPs with larger sizes accumulate in the cytoplasm and generate mainly singlet oxygen after irradiation. SPIONs enter the cells via endocytosis, whereby uncoated SPIONs remain in the vesicles and citrate-coated SPIONs accumulate in the cytoplasm. Cells loaded with citrate-coated SPIONs show no higher ROS concentration than media-cultured cells.
After irradiation, however, ROS formation increased drastically. This enhancing effect is explained by the impact of X-rays on the surface of the SPIONs, which destroys surface structures. The freed SPION surface contains more easily accessible iron ions. These ions can participate in Fenton and Haber-Weiss chemistry and thus catalyze hydroxyl radical formation [2]. SiNPs doped with 1 to 5% iron increase the formation of hydroxyl radicals as well as the generation of singlet oxygen after irradiation. Chronic pain in response to tissue damage (inflammatory pain) or nerve injury (neuropathic pain) is a major clinical health problem, affecting up to 30% of adults worldwide. Currently available treatments are only partially effective and are accompanied by therapy-limiting side effects. Thus, it is important to elucidate the molecular mechanisms of pain signaling in detail to obtain new insights into potential future therapies. Recent data indicate that hydrogen sulfide (H2S) contributes to the processing of chronic pain; however, both pro- and antinociceptive effects have been described so far. Moreover, the sources of H2S production in the nociceptive system are only poorly understood. Here we investigated the expression of the H2S-releasing enzyme cystathionine γ-lyase (CSE) in the nociceptive system and characterized its role in chronic pain signaling using CSE-deficient mice. Paw inflammation and peripheral nerve injury led to upregulation of CSE expression in dorsal root ganglia. However, conditional knockout mice lacking CSE in sensory neurons, as well as global CSE knockout mice, demonstrated normal pain behaviors in inflammatory and neuropathic pain models as compared to WT littermates. Thus, our results suggest that CSE is not critically involved in chronic pain signaling in mice and that sources other than CSE mediate the pain-relevant effects of H2S.
This work was supported by the Deutsche Forschungsgemeinschaft (SFB815-A14) and in part by the LOEWE-Schwerpunkt "Anwendungsorientierte Arzneimittelforschung". Heinrich-Heine-Universität, Institut für Toxikologie, Düsseldorf, Germany; Heinrich-Heine-Universität, Urologie, Düsseldorf, Germany. Background: Cisplatin (CisPt) is frequently used in the therapy of advanced-stage urothelial cell carcinoma (UCC). Yet, inherent and acquired drug resistance limits the clinical use of CisPt. Here, we comparatively investigated the response of epithelial-like (RT-112) and mesenchymal-like (J-82) UC cells following CisPt treatment. Methods: Upon selection with equitoxic doses of CisPt for months, we obtained CisPt-resistant variants (RT-112R, J-82R). Cell viability was measured using the Alamar Blue assay. Cell cycle distribution was analysed by flow cytometry. Immunocytochemistry was used to quantify the number of nuclear γH2AX and 53BP1 foci representing DNA double-strand breaks (DSBs), while Western blot was used to unravel the role of the DNA damage response (DDR) in acquired CisPt resistance. qRT-PCR was performed to analyse the mRNA expression of genes associated with CisPt resistance. J-82 and J-82R cells were treated with different concentrations of lovastatin and selected DDR inhibitors to elucidate their influence on cell viability. Results: Untreated RT-112 cells showed an approximately 2-3-fold higher resistance to CisPt than J-82 cells. Both cell lines differed in the expression pattern of genes associated with CisPt resistance. RT-112R and J-82R revealed a 2-3-fold increased CisPt resistance as compared to the parental cells. During the selection procedure, we observed that acquired CisPt resistance is accompanied by morphological alterations that resemble epithelial-mesenchymal transition (EMT). Cell cycle analysis of RT-112R cells disclosed reduced apoptosis and an enhanced G2/M arrest following CisPt exposure as compared to RT-112 wild-type cells.
By contrast, induction of cell death was similar in J-82 and J-82R cells. Notably, J-82R cells showed a reduced formation of CisPt-induced DSBs. Correspondingly, the related DDR was diminished in J-82R as compared to the parental cells. This was not found when the DDR was comparatively analysed between RT-112R and RT-112 cells. Data obtained from qRT-PCR analysis indicate that different mechanisms contribute to the acquired drug resistance of J-82R and RT-112R. Unexpectedly, J-82R and RT-112R shared the upregulation of XAF-1. Treatment of J-82R cells with statins and protein kinase inhibitors revealed an enhanced sensitivity to pharmacological inhibition of Chk-1 and, moreover, re-sensitization to CisPt by the Chk-1 inhibitor. Based on these data, we suggest that the mechanisms of acquired CisPt resistance of epithelial and mesenchymal UC cell lines are different, with apoptosis-related mechanisms appearing more relevant for epithelial-like RT-112 cells and DDR-related mechanisms dominating CisPt susceptibility in mesenchymal-like J-82 cells. Furthermore, our findings indicate that Chk-1 might be an appropriate target to address acquired CisPt resistance in UCC. In many patients, gastric cancer treatment with conventional cytostatic agents shows only a limited clinical response. Novel therapeutics, which inhibit RTK signaling by targeting c-Met or HER family receptors, have demonstrated some efficacy; however, primary resistance of gastric cancer cells against these inhibitors is still a major problem. In the present study we investigated the mechanism of heregulin (HRG)-promoted survival of gastric cancer cells after treatment with c-Met inhibitors or siRNA-mediated downregulation of c-Met. We found that HRG treatment of gastric cancer cells with a c-Met amplification partially rescued the cells from the antiproliferative effects of pharmacological c-Met inhibition or siRNA-mediated downregulation of c-Met.
Moreover, c-Met inhibition or downregulation led to an induction of HER3 expression at the mRNA and protein level, whereas other HER family receptors were unaffected. Downregulation of HER3 impaired the HRG-mediated rescue of cell survival upon c-Met inhibition. In other tumor entities, the chromatin organizer special AT-rich sequence-binding protein 1 (SATB1) has been described as a regulator of HER family receptor expression involved in adaptive responses of tumor cells. Thus, we investigated the contribution of SATB1 to the upregulation of HER3 after c-Met inhibition. Of note, c-Met inhibitors as well as c-Met-specific siRNAs markedly induced SATB1 expression in gastric cancer cells, and the downregulation of SATB1 by siRNAs completely prevented the induction of HER3 upon c-Met inhibition. In contrast, HER1 and HER2 expression levels were not affected by SATB1-specific siRNAs. The function of SATB1 as a transcriptional regulator is controlled by its phosphorylation status, which in turn is modulated by PKC activity. Thus, we also tested the effect of PKC inhibitors on HER3 expression after c-Met inhibition. Interestingly, the upregulation of HER3 in gastric cancer cells was significantly reduced by PKC inhibitors. To summarize, SATB1 and PKC are critically involved in the regulation of HER3 expression in gastric cancer cells after treatment with c-Met inhibitors, and the oncogene HER3 plays a crucial role in tumor cell survival in this context. Thus, inhibition of PKC or SATB1 may help to overcome resistance against c-Met inhibition in this tumor entity. In the rising field of nanomedicine, the development of new approaches to the diagnosis and treatment of cancer is a challenging task. Typically, a nanocarrier is synthesized and linked to functional compounds displaying either diagnostic or therapeutic effects in cancer models. Recently, nanomaterials combining both diagnostic and therapeutic properties, so-called 'theranostics', have become of primary interest.
Here we used a human serum albumin-polyethylene glycol (PEG) copolymer (HSA) as a theranostic platform for the molecular integration of the chemotherapeutic drug doxorubicin (Dox) and the magnetic resonance imaging (MRI) contrast agent gadolinium (Gd), yielding Gd-HSA-Dox nanoparticles. Besides in vitro testing, which demonstrated the cytotoxic efficacy of Gd-HSA-Dox, we used the chorioallantoic membrane (CAM) of fertilized chick eggs as a preclinical xenotransplantation model. The CAM assay, which in legal terms does not represent an animal experiment, allows testing of compounds in an in vivo setting. This model is particularly helpful to narrow the gap between in vitro and in vivo applications in rodents, because it can help to reduce the number of elaborate experiments with (typically nude) mice, and it reduces or even avoids exposure of those animals to adverse effects and distress. Treatment-resistant MDA-MB-231 breast cancer cells stably transfected with luciferase were xenotransplanted onto the chorioallantoic membrane. After formation of solid breast cancer xenografts, Gd-HSA-Dox was injected intravenously and its antiproliferative effect was evaluated by IVIS imaging of luciferase activity and by immunohistochemical analysis of the tumor xenografts for the Ki-67 proliferation antigen. In comparison to conventional Dox, Gd-HSA-Dox showed increased antiproliferative efficacy and reduced general toxicity in the CAM assay. On the basis of these findings, a rodent model was established in which MDA-MB-231 breast cancer cells were orthotopically xenotransplanted into the mammary fat pads of female NMRI nu/nu mice. In this model, we further investigated the biocompatibility as well as the diagnostic and therapeutic properties of the engineered nanomaterial. After repeated administration of Gd-HSA-Dox into the tail vein of the animals, the biocompatibility of Gd-HSA-Dox was confirmed by uncompromised liver, kidney and hematopoietic parameters.
To warrant diagnostic properties, accumulation of the nanomaterial in tumor tissue is indispensable. By small-animal MRI of Gd, the kinetics of intravenously applied Gd-HSA-Dox in tumor tissue were monitored. An enhancement of the engineered nanomaterial in tumor tissue was detected for up to 47 h after injection, indicating successful enrichment of Gd-HSA-Dox within the tumor tissue, which can be ascribed to the enhanced permeability and retention (EPR) effect observed in the microenvironment of many solid tumor tissues. We are currently investigating the antitumor efficacy of Gd-HSA-Dox in this mouse model, and preliminary data seem to indicate a dose-dependent anticancer effect. Supported by the VolkswagenStiftung. Tubulin-binding agents are among the most important antitumoral drugs. Owing to their side effects and the development of resistance, the discovery of new agents is still of importance. Recently, pretubulysin (PT), a naturally occurring precursor of the myxobacterial compound tubulysin, was identified as a novel tubulin-binding compound. In the DFG research group FOR 1406, PT was characterized as an antitumoral, antiangiogenic and vascular-disrupting compound. Moreover, PT was also found to inhibit the formation of metastases in vivo. The aim of the present study was to gain first insights into the mechanisms underlying this anti-metastatic effect by investigating the influence of PT on the interaction of endothelial and tumor cells in vitro. PT treatment of primary human endothelial cells (HUVECs) strongly increased the adhesion of breast cancer cells (MDA-MB-231) to HUVECs, but limited their transmigration through the endothelium (transwell assay). Based on these data, the gene expression of presumably involved adhesion molecules was determined by qRT-PCR: ICAM-1, VCAM-1, E-selectin, N-cadherin, and galectin-3. Moreover, the chemokine system CXCL12/CXCR4 was analyzed. It could be demonstrated that the mRNA level of endothelial N-cadherin is upregulated by PT.
While the total protein expression of N-cadherin was enhanced in PT-treated HUVECs, its surface expression was not largely influenced by PT (Western blot, flow cytometry). In line with this, blocking endothelial N-cadherin with a neutralizing antibody revealed that this protein is not involved in PT-evoked tumor cell adhesion. Interestingly, PT strongly augmented the mRNA and protein expression of CXCL12 in HUVECs (qRT-PCR, Western blot), whereas its endothelial secretion was not affected by PT (ELISA). An autocrine action of CXCL12 could be excluded, since blocking the CXCL12 receptor CXCR4 on endothelial cells with plerixafor did not influence cancer cell adhesion. By microscopic analyses, we observed that PT treatment causes transient gaps in the HUVEC monolayer, where tumor cells preferentially adhere. Since β1-integrins on the tumor cells could mediate interactions between cancer cells and extracellular matrix proteins in the gaps (e.g. collagen), their influence in cell adhesion and transmigration assays was examined. Both the PT-evoked increase in cell adhesion and the decrease in transmigration were completely abolished when β1-integrins were blocked on MDA-MB-231 cells by a neutralizing antibody. These results indicate that the anti-metastatic action of pretubulysin might be based on the trapping of tumor cells on the endothelium. Whether this effect is also relevant in vivo will be analyzed in future studies using intravital microscopy. This work was supported by the German Research Foundation (DFG, FOR 1406, FU 691/9-2). Introduction: Tyrosine kinase inhibitors (TKIs) for the treatment of non-small cell lung cancer (NSCLC) patients harboring activating mutations in the epidermal growth factor receptor have shown prominent success. Nevertheless, patients treated with TKIs eventually acquire resistance and relapse (1). Based on an evolutionary cancer model (2), weekly high-dose pulsed TKI regimens were proposed to delay resistance.
Using data from NSCLC-bearing mice treated with erlotinib at different dosing regimens, we developed a semi-mechanistic pharmacokinetic/pharmacodynamic model for erlotinib effects on tumor killing and resistance development. Methods: Data were available from experiments in xenograft mice bearing NSCLC tumors (PC9 and HCC827 cell lines; both erlotinib-sensitive) (3). Plasma concentrations from two single-dose groups, 30 mg/kg and 200 mg/kg, were used for pharmacokinetic modeling. The relative tumor volume change in mice randomized to five dosing regimens (15 mg/kg daily, 30 mg/kg daily, 200 mg/kg every 2 days, 200 mg/kg every 4 days, or vehicle) was the pharmacodynamic endpoint. A tumor growth inhibition model was developed by testing linear, exponential and logistic models to account for the tumor growth kinetics, and by fitting an Emax model to explain the effect of exposure on killing the sensitive tumor cells and on resistance development. The analysis was performed using NONMEM 7.3. Results: Absorption was dose-dependent, and a precipitate compartment accounted for dissolution-limited absorption at the 200 mg/kg dose. A one-compartment model with first-order elimination kinetics described distribution and elimination. To describe tumor volume changes, a tumor was assumed to be a mixture of sensitive and resistant cells (represented by distinct compartments and ordinary differential equations). Exponential kinetics best described natural growth (doubling times: 13 and 52 days for sensitive and fully resistant cells, respectively). A tumor was found to transit through a less sensitive phase before acquiring full resistance. An Emax model (less than linear) best described the effect on the sensitive cells (EC50 = 0.53 μM for both cell lines) and on the partially sensitive transit phase (EC50 = 1.24 μM and 3.00 μM for the HCC827 and PC9 cell lines, respectively), arguing for adequate trough erlotinib concentrations for optimal effects.
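The sensitive/transit/resistant compartment structure described above can be illustrated with a minimal ODE sketch. The doubling times (13 and 52 days) and the EC50 values (HCC827: 0.53 μM sensitive, 1.24 μM transit) are taken from the abstract; the maximal kill rate `EMAX`, the transit rate `KT`, and the constant drug concentration are assumptions for illustration only. The actual analysis was a population PK/PD model fitted in NONMEM 7.3, not this simplified simulation.

```python
import math

# Growth rates from the reported doubling times (13 d sensitive, 52 d resistant)
KG_S = math.log(2) / 13.0   # 1/day
KG_R = math.log(2) / 52.0   # 1/day

# Assumed (not reported): maximal kill rate and sensitive->transit->resistant rate
EMAX, KT = 0.4, 0.05        # 1/day
EC50_S, EC50_T = 0.53, 1.24  # μM, HCC827 values from the abstract

def kill(conc, ec50):
    """Emax drug-effect term: less-than-linear in concentration."""
    return EMAX * conc / (ec50 + conc)

def simulate(conc, days=30, dt=0.01):
    """Euler integration of sensitive (s) -> transit (t) -> resistant (r)
    tumor fractions under a constant erlotinib concentration `conc` (μM).
    Returns the relative total tumor burden (initial burden = 1)."""
    s, t, r = 1.0, 0.0, 0.0
    for _ in range(int(days / dt)):
        ds = KG_S * s - kill(conc, EC50_S) * s - KT * s   # killed + transiting
        dt_ = KT * s - kill(conc, EC50_T) * t - KT * t    # partially sensitive
        dr = KT * t + KG_R * r                            # fully resistant
        s, t, r = s + ds * dt, t + dt_ * dt, r + dr * dt
    return s + t + r
```

With these assumed parameters, `simulate(0.0)` (vehicle) grows while a sustained concentration well above both EC50 values shrinks the tumor, reproducing qualitatively why adequate trough concentrations matter.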
Conclusions & future perspectives: An exposure-driven tumor growth inhibition model accounting for the kinetics of resistance development was developed. The model emphasizes the need to establish adequate trough erlotinib concentrations to delay disease progression. Extracts of the stem bark of Ficus platyphylla (FP) have been used in traditional Nigerian medicine to treat psychoses, depression, epilepsy, pain and inflammation. Previous studies have revealed the analgesic and anti-inflammatory effects of FP in different assays, including acetic acid-induced writhing, formalin-induced nociception, and albumin-induced oedema. In this study, we assessed the effects of the standardised extract of FP on the hot plate nociceptive threshold and on the vocalisation threshold in response to electrical stimulation of the tail root, in order to confirm its acclaimed analgesic properties. We also investigated the molecular mechanisms underlying these effects, with a focus on opiate receptor binding and the key enzymes of eicosanoid biosynthesis, namely cyclooxygenase (COX) and 5-lipoxygenase (5-LO). FP (i) increased the hot plate nociceptive threshold and the vocalisation threshold; the increase in hot plate nociceptive threshold was detectable over a period of 30 min, whereas the increase in vocalisation threshold persisted over a period of 90 min. (ii) FP showed an affinity for µ opiate receptors but not for δ or κ opiate receptors, and (iii) FP inhibited the activities of COX-2 and 5-LO but not of COX-1. We provide evidence supporting the use of FP in Nigerian folk medicine for the treatment of different types of pain, and have identified opioid and non-opioid targets. It is interesting to note that the dual inhibition of COX-2 and 5-LO appears favourable in terms of both efficacy and side-effect profile.
despite the fact that the enormous economic burden and individual suffering caused by gastrointestinal infections persist in developing and newly industrialized countries, healthcare systems in first world countries have underestimated their significance for a long time. the alarming prevalence of multidrug-resistant gram-negative bacteria, combined with the high epidemic potential of gastrointestinal pathogens, demonstrates the urgent need for new antibiotics and anti-infectives worldwide. 2.5 million deaths per year are caused by acute diarrheal infections. the most common causative agents of acute diarrheal infections are, amongst others, yersinia enterocolitica, campylobacter jejuni, salmonella spp., shigella spp., escherichia coli, vibrio cholerae, and clostridium difficile. the established antibiotic-based treatment is mostly ineffective or may even have adverse side effects and result in prolonged shedding. either way, antibiotic treatment also eradicates at least part of the intestinal microbiome, and thereby disrupts colonization resistance, fosters overgrowth of pathogens and prolongs shedding times. therefore, the development of future drugs should be focused on highly specific anti-infectives, which enable a direct pathogen-specific treatment. one very promising strategy is the inhibition of the biogenesis of outer membrane virulence factors. because many decisive virulence-associated outer membrane proteins (omps) of gram-negative enteropathogens are substrates of the periplasmic chaperone sura exclusively, we developed a new assay format to determine sura in vitro chaperone activity. previous publications by behrens et al., 2001 and buchner et al., 1998 documented an assay to determine sura in vitro chaperone activity with extremely limited sensitivity and a high minimal detectable concentration, which was not suitable for high-throughput screening (hts). we have now developed a luciferase-based screening assay.
this highly sensitive and robust test system has been validated extensively and now gives reliable output with an appreciable z-factor of > 0.6. in cooperation with the hzi braunschweig (germany) and the hzi saarbrücken (germany), we were able to screen over 7000 purified compounds and over 500 extracts of myxobacteria. during the ongoing screening period, the assay generated four validated primary actives, which corresponds to a positive hit rate of 0.05%. additionally, we developed an elaborate follow-up strategy to validate positive hits, which includes a well-established mouse infection model. we look forward to expanding our screening efforts and would like to use this abstract to invite all scientists who are interested in testing compound/natural extract libraries for activity against the target structure sura. the potential atypical antipsychotic and dopamine d2 receptor partial agonist 2-bromoterguride antagonizes phencyclidine- and apomorphine-induced prepulse inhibition and novel object recognition deficits in rats. e. tarland. objectives: schizophrenia is a disabling mental disorder affecting more than 21 million people worldwide. available medical therapies are effective in the treatment of psychosis and other positive symptoms, but come with considerable side effects and often fail to ameliorate the cognitive deficits and negative symptoms of the disorder. the dopamine d2 receptor partial agonist 2-bromoterguride (2-bt) has recently been shown to exhibit antipsychotic effects in rats without causing adverse side effects common to antipsychotic drugs [1]. to determine its atypical character in vivo, the ability of 2-bt to antagonize the disruptive effects of phencyclidine (pcp) and apomorphine on sensorimotor gating was determined in the prepulse inhibition paradigm. the effect of 2-bt on cognitive deficits was assessed in the novel object recognition (nor) test after object recognition memory deficits were induced by pcp treatment.
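the z-factor quoted for the sura screening assay above (strictly, the z'-factor when computed from positive and negative controls) is a standard hts quality metric and can be computed as follows. this is an illustrative sketch, not the authors' analysis code, and the control readouts below are made up for demonstration.

```python
import statistics

def z_prime(pos_controls, neg_controls):
    """z'-factor of an assay: 1 - 3*(sd_pos + sd_neg) / |mean_pos - mean_neg|.
    values above ~0.5 indicate an excellent, screening-ready assay."""
    mu_p, mu_n = statistics.mean(pos_controls), statistics.mean(neg_controls)
    sd_p, sd_n = statistics.stdev(pos_controls), statistics.stdev(neg_controls)
    return 1.0 - 3.0 * (sd_p + sd_n) / abs(mu_p - mu_n)

# well-separated, low-noise controls give a z'-factor close to 1:
z = z_prime([100, 101, 99, 100], [10, 11, 9, 10])
```

a z'-factor > 0.6, as reported for the luciferase assay, therefore means the control signal windows are separated by well over three standard deviations on each side.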
method: 10-week-old male sprague-dawley rats were injected with 2-bt (0.1 or 0.3 mg/kg; i.p.) followed by pcp (1.5 mg/kg; s.c.) or apomorphine (0.5 mg/kg; s.c.). prepulse inhibition was measured in two sound-proof startle chambers. the attenuating effect of 2-bt (0.1 or 0.3 mg/kg; i.p.) on visual learning and memory deficits following subchronic administration of pcp (5.0 mg/kg; i.p. twice daily for 7 days) was assessed in the nor task, consisting of a 3-min acquisition trial and a 3-min retention trial separated by a 1-h inter-trial interval. clozapine (5.0 mg/kg; i.p.) or haloperidol (0.1 mg/kg; i.p.) were used as positive controls. results: the dopamine d2 receptor partial agonist 2-bt (0.3 mg/kg) and the typical antipsychotic haloperidol successfully antagonized apomorphine-induced ppi deficits. interestingly, 2-bt also ameliorated the pcp-induced ppi deficits to the same extent as the atypical antipsychotic clozapine. preliminary data from the nor test indicate that 2-bt reduces subchronic pcp-induced cognitive deficits in novel object recognition analogously to clozapine. the disrupting effects of pcp on ppi are mediated by non-competitive antagonism at nmda sites, indirectly influencing a series of neurotransmitter systems. our results indicate that 2-bt mediates actions at multiple neurotransmitter receptors, as it successfully ameliorated both the pcp- and apomorphine-induced ppi disruptions in rats, showing an atypical antipsychotic character. furthermore, our preliminary results support the potential atypical antipsychotic effect of 2-bt, as it restored performance in the nor test, a test with good predictive validity. given the previously shown properties and antipsychotic-like effects of 2-bromoterguride [1], this substance may be a promising candidate for the treatment of schizophrenic patients. ongoing experiments are investigating the potency of 2-bt to improve social deficits following a subchronic pcp regime in rats.
background and objectives: cannabinoid-1 receptor signaling increases the rewarding effects of food intake and promotes the growth of adipocytes, whereas cb2 possibly opposes these pro-obesity effects by silencing the activated immune cells that are key drivers of the metabolic syndrome. pro- and anti-orexigenic cannabimimetic signaling may become unbalanced with age because of alterations of the immune and endocannabinoid systems. methods: to specifically address the role of cb2 in age-associated obesity, we analyzed metabolic, cardiovascular, immune and neuronal functions in 1.2-1.8-year-old cb2-/- and control mice fed a standard diet, and assessed the effects of the cb2 agonist hu308 during high-fat diet (hfd) in 12-16-week-old mice. results: the cb2-/- mice were obese, with hypertrophy of visceral fat, immune cell polarization towards pro-inflammatory sub-populations in fat and liver, and hypertension, as well as increased mortality despite normal blood glucose. they also developed stronger paw inflammation and a premature loss of transient receptor potential responsiveness in primary sensory neurons, a phenomenon typical for small fiber disease. the cb2 agonist hu308 prevented hfd-evoked hypertension, reduced hfd-evoked polarization of adipose tissue macrophages towards the m1-like pro-inflammatory type and reduced hfd-evoked nociceptive hypersensitivity, but had no effect on weight gain. conclusion: cb2 agonists may fortify cb2-mediated anti-obesity signaling without the risk of the anti-cb1-mediated depression that caused the failure of rimonabant. leishmaniasis is a neglected tropical disease caused by leishmania, eukaryotic protozoal organisms which infect humans and other mammals. this disease is transmitted by sandflies of the genus phlebotomus. due to global warming, the endemic region of these vectors is expanding further northwards and threatens southern european countries as well.
the treatment of leishmaniasis is difficult due to the toxicity of, and resistance development against, current drugs. the so far unexplored inhibition of mitochondrial functions in leishmania by natural products or even food ingredients seems to be an interesting alternative. two food ingredients, resveratrol (res) and xanthohumol (xan), have been widely studied in mammalian cells, but little is known about their actions on protozoal parasites. therefore, we compared the influence of res and xan on the function of leishmanial and mammalian mitochondria. anti-leishmanial activities of the xenobiotics were assessed in cell cultures of leishmania tarentolae promastigotes (ltp) and leishmania amazonensis amastigotes (laa), and compared to peritoneal macrophages from mouse (pmm), using viability assays. furthermore, mechanistic studies regarding mitochondrial functions were conducted in ltp, mitochondrial fractions isolated from ltp, and bovine heart submitochondrial particles, using oxygen consumption measurements, assays of individual mitochondrial complex activities, and measurements of membrane potential and superoxide radical formation by photometry, fluorimetry and electron spin resonance spectroscopy. in ltp, xan inhibited viability more effectively than res (ic50: xan 23 µm, res 161 µm). likewise, xan and res demonstrated anti-leishmanial activity in laa (ic50: xan 7 µm, res 14 µm) while having less influence on the viability of pmm (ic50: xan 68 µm, res > 438 µm). in contrast to res, xan strongly inhibited oxygen consumption in leishmania. further studies demonstrated that this is based on the inhibition of the mitochondrial electron transfer complex ii/iii by xan, which was less pronounced with res. however, xan also demonstrated inhibitory activity on mammalian mitochondrial complex iii. in addition, xan caused no decrease of the membrane potential in leishmanial mitochondria, while res resulted in mitochondrial uncoupling. neither xan nor res increased mitochondrial superoxide release in ltp.
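the selectivity implied by the ic50 values above can be made explicit as a simple selectivity index (host-cell ic50 divided by parasite ic50). this is an illustrative sketch of that arithmetic: the ic50 values are taken from the abstract (laa vs. pmm), but the helper function and its name are ours, not from the study.

```python
def selectivity_index(ic50_host_um: float, ic50_parasite_um: float) -> float:
    """ratio of host-cell to parasite ic50 (both in µm);
    higher values mean the compound is more selective for the parasite."""
    return ic50_host_um / ic50_parasite_um

si_xan = selectivity_index(68.0, 7.0)    # xanthohumol: pmm 68 µm / laa 7 µm
si_res = selectivity_index(438.0, 14.0)  # resveratrol: host ic50 reported as
                                         # > 438 µm, so this is only a lower bound
```

by this measure both compounds are roughly an order of magnitude more toxic to the amastigotes than to the macrophages, supporting the "selective anti-leishmanial activity" conclusion drawn in the following paragraph.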
these data show that res, a major polyphenol from red wine, and xan, an ingredient of hop-containing beer, may have selective anti-leishmanial activity. tryptophan hydroxylase (tph) is the rate-limiting enzyme in serotonin (5-ht) biosynthesis. its two isoforms are exclusively expressed in the periphery (tph1) or the raphe nuclei of the brainstem (tph2), and the respective 5-ht populations are distinctly separated by the blood-brain barrier, offering the possibility to pharmacologically modulate central and peripheral functions in an independent manner. peripheral 5-ht is mainly produced by tph1-expressing enterochromaffin cells of the gut, taken up into platelets and transported in the blood stream. upon platelet activation, 5-ht is rapidly released and locally induces multiple effects, such as vasoconstriction, cell proliferation or fibrosis, and is furthermore involved in the regulation of, e.g., vascular tone, gut motility, primary hemostasis, insulin secretion and the t-cell-mediated immune response. following the classical early drug development pathway, we developed a fluorescence-based tph activity assay and performed a high-throughput screening of about 37000 small chemical compounds. we discovered a novel class of tph inhibitors, which was thoroughly validated in a variety of in vitro assay setups. combining medicinal chemistry and x-ray crystallography, we further aimed to develop these inhibitors into preclinical drug candidates. to date we have been able to generate and patent a series of novel tph inhibitors with optimized affinity and an in vitro ic50 in the low nanomolar range. this novel class of tph inhibitors could potentially be used to treat a variety of disorders with aberrant peripheral 5-ht signaling, such as gastrointestinal disorders (e.g. irritable bowel syndrome, crohn's disease, various forms of diarrhea), cardiac valve diseases, pulmonary hypertension, chronic respiratory diseases and some neuroendocrine (carcinoid) tumors.
primary hepatocellular carcinoma (hcc) is the most frequent type of liver cancer. therapeutic options are rare. besides sorafenib, a tyrosine kinase inhibitor which is only used in end-stage liver cancer, surgical intervention is the only successful clinical treatment option. hence, there is an urgent need to develop new therapeutic strategies and to identify new drugs for the therapy of hcc. hcc often arises in fibrotic or cirrhotic liver, which is accompanied by a change in the extracellular matrix (ecm) composition. in addition, hepatoma cells were shown to express different integrins, which mediate interactions between the ecm and intracellular cell signaling, compared to hepatocytes. snake venoms have gained increased attention, as it was shown that some of their enzymes and peptides act directly on tumor cells and their multicellular arrangement, or indirectly by influencing the stromal environment of the tumor. the aim of the present study was to investigate the effect of snake venoms on liver cancer related cell lines as well as their specific action on the ecm-integrin axis. the effects of the snake venoms of vipera palestinae (vp), calloselasma rhodostoma (cr) and echis sochureki (es) on a cellular level (mtt, ldh release), on cell-cell connections (caco2 permeability assay) and on cell-matrix interactions (adherence test) were investigated. cell-matrix interactions were tested with an adhesion assay using collagen i (c-i), collagen iv (c-iv), fibronectin (fn) and laminin (lm) as ecm compounds. in our in vitro models we used hepg2 as an hcc tumor cell line and the fibroblast cell line fi301 as a stroma simulation. additionally, caco2 cells, a colon carcinoma cell line representing colorectal liver metastasis, were used. the toxicity of the snake venoms on liver cancer related cell lines was determined in the range of 0.01-100 µg/ml and plotted as dose-response curves, from which the noaels were calculated: vp, 0.5 µg/ml; cr, 1 µg/ml; es, 5 µg/ml.
performance of the caco2 transwell permeability assay revealed no influence of the tested venom concentrations on the integrity of the cellular arrangement. investigations of integrin inhibition revealed that the venom of vp reduced adherence on lm-coated plates and the venom of es reduced adherence on lm- and fn-coated plates compared to untreated cells. no effect of the venom of cr on adherence to any matrix was observed. co-incubation of the snake venoms of vp and es (below or near noael concentrations) with 5-fluorouracil (5fu), which is used as a chemotherapeutic agent, caused a reduction of its ic50 values. the results indicate that components of vp and es inhibit the formation of cell-matrix interactions, possibly acting as disintegrins. the co-incubation experiments demonstrated a synergistic effect of 5fu and snake venoms. further experiments should enable the isolation of therapeutically active venom compounds, the identification of disintegrins, and clarification of their role in synergistic mechanisms in liver cancer therapy. modulation of the blood-brain barrier with peptidomimetics to improve drug delivery. s. dithmer. after decades of research, the blood-brain barrier (bbb) still remains a major obstacle to successful delivery to the brain for the vast majority of drugs. the main component forming the bbb is the brain microvascular endothelium. paracellular permeation is limited by tight junctions (tjs), a multiprotein complex composed of the claudin family members claudin-1, -3, -5 and -12. claudin-5 is known to be the key tj protein tightening the bbb and has therefore been selected as the target to modulate the bbb. for this reason, drug enhancer peptides (peptidomimetics) were designed to transiently modulate claudin-5 and thereby permeabilize the bbb. by combining biochemical protein/peptide interaction and tissue culture methods, we identified, validated and optimized peptide sequences modulating claudin-5-containing barriers.
the claudin-5-targeting peptides decreased the transcellular electrical resistance and increased the permeability of mdck-ii cell monolayers stably expressing yfp-claudin-5 and of immortalized brain endothelial cells (bend.3). the peptides decreased the amount of claudin-5 and zo-1 at cell-cell contacts and changed the cell morphology from spindle-shaped to more round-shaped. all tested peptides showed no signs of toxicity in cell cultures or in vivo (intravenous injection). permeability measurements in mice proved enhanced permeation of na-fluorescein (376 da) through the bbb, which was confirmed by magnetic resonance imaging of contrast agents (gd-dtpa, 547 da). in summary, we identified new peptides with the potential to enhance cerebral delivery of small molecules through the bbb. treatment of cerebral diseases is limited by the capability of pharmacologically active agents to penetrate the blood-brain barrier (bbb). this paracellularly tight diffusion barrier is formed by brain capillary endothelial cells. the paraendothelial cleft is sealed by tight junctions (tjs), a multiprotein complex. cerebral tjs predominantly consist of claudin-5 (cldn5), which tightens the bbb for molecules < 800 da. consequently, cldn5 is a potential target for transient and size-specific modulation of the bbb to improve cns penetration of pharmaceutically active agents. in a high-throughput screening using a cldn5 assay, barrier opener 1 (bo1) was identified as a cldn5 modulator. initially, a significant removal of cldn5 from the plasma membrane was shown by confocal microscopy using epithelial and endothelial cell lines. measurement of transcellular electrical resistance and of paracellular permeability using lucifer yellow (mw 521 da) demonstrated the effect of bo1. concentration-dependent treatment (50-150 µm) of cell monolayers with bo1 reduced the tightness of the tjs for between a few hours and 24 h.
applying 2-hydroxypropyl-ß-cyclodextrin as a solubilizer, the opening activity of bo1 became detectable in mice. due to the short stability (< 2 h) of bo1 in blood plasma, repeated administration (1.5 mg/kg i.v.) was required to induce significantly increased permeability of the bbb for na-fluorescein (mw 376 da). the small molecule bo1 is a promising new approach for transient opening of the bbb in vivo. further modification of the stability and solubility of bo1 is necessary to optimize its applicability. the complex of tight junction (tj) proteins is located between opposing epithelial or endothelial cells. tjs restrict the paracellular permeation of ions and other solutes. tricellulin (tric) tightens tricellular tjs (ttjs) and regulates bicellular tj (btj) proteins like claudins and occludin (occl). current data suggest an important role of ttjs at the blood-brain barrier (bbb). a main pharmacological problem is the modulation of the bbb to improve drug delivery to the cns. therefore, tricsi has been developed as a peptide derived from tric to open tissue barriers specifically and transiently. initially, a recombinant protein was generated based on a sequence of an extracellular loop of tric, tagged with maltose binding protein. the fusion protein caused down-regulation of tric, internalization of both tric and occl (confocal laser scanning microscopy), and a significant decrease in the transcellular electrical resistance (ter) of a human epithelial colorectal adenocarcinoma cell line. subsequently, studies with the synthetic peptide tricsi indicated its capacity for cell barrier opening after about 16 h of incubation at concentrations varying from 100 to 150 µm, affecting the membrane localization of tric and occl.
barrier opening was proven by decreasing ter and an increasing permeability coefficient of lucifer yellow (457 da) and fitc-dextran (10 kda); the localization of tric elongated from ttjs towards btjs, and cldn1 was weakened at btjs. physicochemical properties of tricsi examined by circular dichroism spectroscopy suggested a ß-strand structure and no helical propensity. taken together, a tric-derived peptide has been identified that increases the paracellular permeability of tissue barriers and redistributes the cellular localization of tj proteins. tricsi is a novel, promising tool to overcome cerebral barriers with the potential to improve drug delivery to the cns. further experiments are needed to better understand the role of tric in tissue barriers as well as to clarify the mode of action of tricsi. introduction: lung transplantation has become an established treatment option for a variety of end-stage lung diseases, but the long-term survival is often disappointing. the leading cause of death is generally chronic rejection, which is characterized by inflammation and fibrous obliteration of the small airways, progressively leading to a reduction of the airflow. the mouse heterotopic tracheal transplantation model is widely used as an experimental model to study the development of obliterative airway disease. despite its widespread application, the heterotopic transplantation model does have a number of limitations, for example the lack of airflow. the present study provides a description of the orthotopic tracheal transplantation mouse model, which shares more similarities with the transplant situation in humans, and provides an analysis of airway obliteration via micro-ct and histological evaluation. methods: a seven-ring donor trachea from balb/c mice was implanted into recipient c57bl/6 mice. c57bl/6 mice without transplantation were used as normal controls. grafts from donor c57bl/6 mice to recipient c57bl/6 mice served as the isograft group.
42 days after transplantation, mice were scanned using an in vivo small animal µct (skyscan 1176). tracheal tissue was harvested and fixed in formalin, embedded in paraffin, cut and stained with hematoxylin and eosin (h&e) as well as sirius red/fast green. results and conclusions: histologic evaluation showed luminal narrowing with subepithelial inflammatory cell infiltrates and fibrosis, as well as partially damaged and flattened epithelium. the aerated volume of the allogeneic grafts, analyzed by micro-ct, was significantly reduced compared to the isogenic control grafts and normal controls. non-invasive imaging via micro-ct may offer an option for longitudinal monitoring of the progression of obliterative airway disease as well as of the response to treatment. c. elegans is a well-established model organism to study the aging process as well as the effects of various substances in vivo. its lifespan is regulated by multiple signaling pathways (e.g. insulin or mtor signaling), which are well conserved up to humans. the insulin/igf-1 pathway was the first pathway shown to affect ageing in animals. mutations that decrease the activity of daf-2 (igf1r) lead to a significant increase of lifespan accompanied by a decrease of age pigment accumulation in c. elegans. the relevant effector of the insulin/igf-1 pathway is the transcription factor daf-16 (hfoxo3a). inhibition of hmg-coa reductase (an enzyme of the mevalonate pathway) by statins, which are frequently used as cholesterol-lowering agents in the clinic, has been shown to attenuate protein prenylation and glycosylation. notably, prenylated, membrane-bound small gtp-binding proteins are important for the regulation of the aforementioned age-related signaling pathways like the insulin/igf-1 pathway. recently, a cohort study showed that a decreased mortality rate in humans between ages 78-90 correlates with statin treatment, but is independent of total cholesterol levels. as c.
elegans harbors the mevalonate pathway but lacks the branch leading to cholesterol synthesis, it is a well-suited model to study cholesterol-independent effects of statins on aging-associated phenotypes and the underlying molecular mechanisms. here, we show that exposure of c. elegans to statins substantially decelerated the accumulation of age pigments. while the level of age pigments roughly doubled in control animals, there was only a slight increase in the lovastatin group. the use of atorvastatin gave comparable results, indicating a more general effect of the inhibition of hmg-coa reductase. the retarded accumulation of age pigments could be partly phenocopied using an inhibitor of the small gtpase rac1 or using rnai against hmg-coa reductase. a reduced level of age pigments is prognostic for an elevated mean lifespan (about 20%) in c. elegans. a post-reproductive treatment with lovastatin, mimicking the use of statins in patients of advanced age, increased the mean lifespan in c. elegans even further. in addition, we could show a mild reduction of fertility and a developmental delay, as well as a marked increase in acute thermal stress resistance, mediated by lovastatin. besides the reduced accumulation of age pigments and the increased lifespan, these are phenotypes which are usually observed under daf-16 overactivity. consequently, we found an increased nuclear localization of daf-16 in the presence of lovastatin, and lovastatin completely failed to reduce age pigments in a daf-16-ko mutant background. rt-qpcr brought jnk-1, a known activator of daf-16, into play as a possible effector induced by statins; this is currently under investigation. in summary, statin exposure induces a longevity phenotype in c. elegans, which might be daf-16 dependent. this finding indicates that a product of the mevalonate pathway might influence the insulin/igf-1 pathway and particularly the transcription factor daf-16.
the high-fat diet (hfd)-fed, streptozotocin (stz)-treated rat model is one of the experimentally induced animal models of diabetes. this model is often used to evaluate the antidiabetic activity of several agents. according to srinivasan et al. (2005), prolonged exposure to a high-fat diet leads to insulin resistance, and the development of diabetes occurs only in insulin-resistant hfd-fed rats following low-dose stz, because the hfd-fed rats are already mildly hyperglycemic due to insulin resistance (1). in the hfd/stz model, the rats are fed a high-fat diet for 2-4 weeks or for a relatively long time (≥ 3 months) in order to simulate insulin resistance and/or glucose intolerance. after induction of diabetes with multiple or single low doses of stz (30-35 mg/kg), some of the diabetic rats receive treatment (2). in this way, the impact of treatment can be determined by comparing the differences between groups. despite the lack of methodological information concerning the feeding time in some studies, all rats should be allowed to continue to feed on their respective diets until the end of the study. but what would happen if the hfd were switched to a normal pellet diet (npd) in these diabetic rats? in our experience, feeding npd for 4 weeks significantly decreased fasting blood glucose (fbg) in diabetic rats compared to hfd-fed diabetic rats (234.40 ± 42.71 mg/dl vs. 464.00 ± 23.88 mg/dl, p < 0.05). although diet regulation could not restore normal blood glucose, such a decrease was unexpected. in addition, the body weights of the npd-fed diabetic rats were significantly lower than the body weights of the hfd-fed diabetic rats (249.00 ± 6.00 g vs. 288.00 ± 4.41 g, p < 0.05). there was no significant difference in body weight between nondiabetic control rats and diabetic rats fed npd for 4 weeks. further details can be found in table 1. diet regulation and weight loss may prevent, control and reverse diabetes.
however, at later stages of the disease, it is difficult to improve blood glucose control without medication, because the disease progresses from insulin resistance to insulin deficiency (3). according to some diabetes researchers, the amount of residual functional beta-cell mass is an important issue, and another important question is whether the hfd/stz rat mimics an early or a late stage of type 2 diabetes (4). these preliminary findings suggest the possibility that the hfd/stz rat model may simulate the characteristics of the early stage more than the final stage of type 2 diabetes, and that hyperglycemia in the experimental model can be partially reversed with diet regulation. references: 1. srinivasan, k., viswanad, b., asrat, l., kaul, c. l., ramarao, p. (2005). combination of high-fat diet-fed and low-dose streptozotocin-treated rat: a model for type 2 diabetes and pharmacological screening. pharmacol res 52 (4): 313-320. 2. oztürk z, gurpinar t, vural k, boyacıoglu s, korkmaz m, var a. (2015). effects of selenium on endothelial dysfunction and metabolic profile in low dose streptozotocin induced diabetic rats fed a high fat diet. biotech histochem 90 (7): 506-515. 3. franz, m. j. (2007). the dilemma of weight loss in diabetes. diabetes spectr 20 (3). animal models are pivotal for studies of pathogenesis and treatment of movement disorders. dystonia, characterized by sustained or intermittent muscle contractions causing twisting movements/postures, is regarded as a basal ganglia disorder. the pathophysiology is, however, poorly understood. in mouse models of dyt1 dystonia, which is caused by a gag deletion in tor1a, the gene encoding the protein torsin a, ex vivo electrophysiological studies have shown an abnormal d2 receptor-mediated release of acetylcholine from striatal interneurons. in these models, which do not exhibit a dystonic phenotype, the functional relevance of the increased d2 receptor-mediated acetylcholine release has not yet been examined.
the aim of the present study was to (1) generate more powerful tests to detect behavioural alterations in the dyt1 knock-in mouse and (2) examine the behavioral effects of the d2 receptor agonist quinpirole. for this purpose, a sequence of cognitive, motor and sensorimotor tests was performed in this mouse model. only the adhesive removal test, which explores sensorimotor connectivity, revealed significant impairments in the dyt1 knock-in mice compared to controls. to induce a more characteristic and stronger phenotype, the "rotating beam test" was developed. this motor test measures motor coordination and balance. interestingly, dyt1 knock-in mice showed significant motor deficits in the rotating beam test. based on these results, the acute effects of quinpirole (0.25-1 mg/kg i.p.) were tested in dyt1 knock-in and wildtype mice. subsequent to the injections, mice were tested in the open field, the rotating beam test and the adhesive removal test, respectively. in the open field test, dyt1 knock-in mice showed increased thigmotaxis at a dose of 0.5 mg/kg quinpirole. in the rotating beam test, both groups showed a dose-dependently reduced performance. in the adhesive removal test, quinpirole improved the reaction time in dyt1 mice independently of dosage, while no effects were observed in the wildtype littermates. however, in the vehicle follow-up (post-drug control), this effect remained consistent in the dyt1 model, suggesting a habituation effect. in conclusion, we devised a new test, i.e. the rotating beam test, which improves the detection of mild motor impairments in dyt1 knock-in mice. furthermore, the adhesive removal test revealed sensorimotor dysfunctions in this animal model. these results represent an important step for our ongoing optogenetic examinations of the role of abnormal neuronal plasticity in dyt1 dystonia and for pharmacological studies.
the first data on the effects of quinpirole do not indicate a critical role of d2-dysregulated acetylcholine release, but this has to be clarified by ongoing local striatal injections of quinpirole and by pharmacological manipulations of the cholinergic system. renal fibrosis is characterized by decreased nitric oxide (no) bioavailability and pronounced transforming growth factor β (tgfβ) signalling with subsequent excessive extracellular matrix (ecm) deposition. here, the effects of the soluble guanylate cyclase (sgc) stimulator bay 41-8543 after unilateral ureter obstruction (uuo) have been studied. kidney fibrosis was induced by uuo in wild type (wt) and cgki knock-out (cgki ko) mice. starting one day after uuo, the sgc stimulator bay 41-8543 was injected i.p. (4 mg/kg daily) for seven days. biomarkers indicating remodelling processes in the kidney were analysed via mrna and protein expression. bay 41-8543 administration influenced the activity of the ecm-degrading matrix metalloproteases (mmp2 and mmp9) and their inhibitor timp-1, the expression pattern of extracellular matrix proteins (e.g. collagen and fibronectin) and of profibrotic mediators (e.g. connective tissue growth factor (ctgf) and plasminogen activator inhibitor-1 (pai-1)), and the secretion of cytokines, e.g. il-6. thereby, bay 41-8543 increases the cgmp pool, among others via modulation of endothelial no synthase (enos) expression. agents which enhance no and cyclic guanosine monophosphate (cgmp) ameliorate the progression of fibrotic tissue. however, the molecular mechanism by which cgmp, via cgki, affects the development of kidney fibrosis has not been fully elucidated. accordingly, the present study investigates the functional role of sgc stimulation in regulating the fibrotic process, the signalling pathway and the underlying mechanisms involved.
we hypothesize that the antifibrotic potential of bay 41-8543 might be related to the increased cgmp pool and the inhibition of the mapk and smad signalling pathways. the elucidation of this signalling allows the development of new therapeutic options. infection of mice with listeria monocytogenes (lm) results in a strong t-cell response that is critical for an efficient defense. here, we demonstrate that the adapter protein sly1 (sh3-domain protein expressed in lymphocytes 1) is essential for the generation of a fully functional t-cell response. the lack of sly1 leads to reduced survival rates of infected mice. the increased susceptibility of sly1 ko mice was caused by reduced proliferation of differentiated t cells. ex vivo analyses of isolated sly1 ko t cells displayed a dysregulation of forkhead box protein o1 (foxo1) shuttling after tcr signaling, which resulted in an increased expression of cell-cycle-inhibiting genes and, therefore, reduced expansion of the t-cell population. foxo1 shuttles to the cytoplasm after phosphorylation in a protein complex including 14-3-3 proteins. interestingly, we observed a similar regulation for the adapter protein sly1, where tcr stimulation results in sly1 phosphorylation and sly1 export to the cytoplasm. moreover, immunoprecipitation analyses revealed a binding of sly1 to 14-3-3 proteins. altogether, this study describes sly1 as an immunoregulatory protein which is involved in the generation of adaptive immune responses during lm infection, and provides a model of how sly1 regulates t-cell proliferation (schäll et al., eur j immunol 2015). the catalytical isoforms p110γ and p110δ of phosphatidylinositol-4,5-bisphosphate 3-kinase γ (pi3kγ) and pi3kδ play an important role in the pathogenesis of asthma. two key elements in allergic asthma are increased eosinophil and ige levels. 
whereas dual pharmacological inhibition of the catalytical subunits p110γ and p110δ reduces asthma-associated eosinophilic lung infiltration and ameliorates disease symptoms, it has been shown that dual genetic deficiency in pi3kγ and pi3kδ in p110γ ko/δ d910a mice increases serum ige and basal eosinophil counts in mucosal tissues and blood. this suggests that long-term inhibition of p110γ and p110δ might exacerbate asthma. here we analysed p110γ/δ−/− mice and determined ige and eosinophil counts in the basal state and the immune response to ovalbumin (ova)-induced allergic asthma. we found that serum concentrations of ige and il-5 and eosinophil numbers in blood, spleen and bone marrow were significantly increased in p110γ/δ−/− mice in comparison to single knock-out (ko) and wildtype (wt) mice. nevertheless, p110γ/δ−/− mice were protected against ova-induced infiltration of eosinophils, neutrophils, b cells and t cells into the lung tissue and the bronchoalveolar space. moreover, p110γ/δ−/− mice, but not single ko mice, showed a reduced bronchial hyperresponsiveness as measured with the isolated and perfused lung. we conclude that although the dual deficiency of p110γ and p110δ causes eosinophilia and ige hyperproduction, p110γ/δ−/− mice are not prone to develop ova-induced allergic asthma. an increase of plasma extravasation induced by activation of constitutively expressed endothelial bradykinin type 2 receptors (b2) has been shown to contribute to the development of angioedema occurring as a sometimes life-threatening side effect of angiotensin-converting enzyme inhibitors such as enalapril (new engl j med 2015;372:418-425). these drugs inhibit the degradation of bradykinin and increase its vascular steady-state concentration. hence, it is reasonable to assume that bradykinin may destabilise the endothelial barrier, i.e. may increase physiologic extravasation. 
while the commonly used miles assay provides a useful and relatively easy tool to study the effect of permeabilizing mediators in vivo, it does not distinguish between intravascular and interstitial evans blue dye. likewise, extravasation can only be quantified at one particular time point per animal, usually 20-30 min. furthermore, evaluation of physiologic extravasation is not possible. in contrast, non-invasive two-photon laser microscopy may allow separating the intravascular from the interstitial compartment and thereby investigations of changes of the physiologic endothelial barrier induced by drugs or transgenes. therefore, we have evaluated this methodology for its suitability to study endothelial permeability in mice in vivo. to establish this, we used two fluorescent dyes of different molecular weight. a 200,000 da dextran equipped with a green fluorescent chromophore, which cannot leave the vascular lumen, was injected intravenously to visualize small dermal blood vessels of the mouse ear located approximately 200 µm below the surface. after stabilization of the green fluorescent signal, a 10,000 da dextran equipped with a red fluorescent chromophore, which easily traverses the endothelial barrier, was applied by intravenous injection. the red fluorescence permeates into the interstitium during physiologic extravasation and accumulates in the interstitial space. this process can be followed by measuring the decrease of intravascular red fluorescence over various time periods. using this methodology we have studied whether endothelial-specific overexpression of b2 changes physiologic endothelial permeability. this newly developed transgenic mouse line (b2 tg) was established using a plasmid consisting of the pbluescript ii sk+ vector, the tie-2 promotor, the human b2 cdna, the sv40 poly-a signal and a tie-2 intron fragment. 
we observed that b2 tg mice showed a significantly stronger extravasation than their transgene-negative littermates, as evidenced by the more rapid extravasation of the 10,000 da dextran at each time point (fig. 1). we conclude that two-photon laser microscopy is suitable to study endothelial permeability non-invasively in vivo and that this methodology allows studying the effects of drugs and transgenes on the endothelial barrier under non-inflammatory conditions. furthermore, our results suggest that endothelial-specific overexpression of b2 increases physiologic extravasation. non-allergic angioedema, such as angioedema induced by angiotensin converting enzyme inhibitors (acei), develops as a consequence of increased activation of bradykinin receptor type 2 (b2). using a plasmid consisting of the pbluescript ii sk+ vector, the tie-2 promotor, the human b2 cdna, the sv40 poly-a signal and a tie-2 intron fragment, a transgenic mouse line harbouring an endothelial-specific overexpression of b2 was generated and backcrossed to c57bl/6 more than 10 times (b2 tg). analysis of lung mrna using primers specific for the human or the mouse b2 cdna revealed a 12.5-fold stronger expression of human b2 in b2 tg (n=6), while the expression of murine b2 mrna was unchanged and similar to transgene-negative littermates (b2 n). we have evaluated the specificity of several antibodies directed against b2 and found that a rabbit monoclonal anti-b2 antibody appears to be reliable, i.e. there was just a faint staining in lung tissue of b2−/− mice. however, this antibody primarily stains rodent b2 and has only little cross-reactivity to human b2. hence, we were not able to detect a significant increase of b2 protein in tissues of b2 tg. previous experiments have shown that bradykinin induced concentration-dependent constrictions of aortic rings with a maximal effect at 1 µm of bradykinin. 
the contraction due to bradykinin was completely inhibited by icatibant or diclofenac, indicating that it is mediated by endothelial b2 activation and dependent on cyclooxygenase activity. in striking contrast to their transgene-negative littermates b2 n, we found a significant icatibant-sensitive aortic dilation in b2 tg following preincubation with diclofenac, which indicates functional overexpression of b2 in conductance vessels of b2 tg. to evaluate whether this applies to dermal microvessels, we used the miles assay to quantify dermal extravasation of the albumin-bound dye evans blue following intradermal injection of 30 µl of either vehicle, bradykinin, labradimil or histamine (control). increasing concentrations of bradykinin caused a significant increase of extravasation, reaching 4.41±0.11 fold at 18.9 nmol bradykinin in c57bl/6 (n=6 each, p<0.0001 vs. vehicle). a similar increase was found in b2 n (4.45±0.25 fold, n=7, p<0.0001 vs. vehicle), while there was a stronger response in b2 tg (5.50±0.16 fold, n=7, p<0.0001 vs. vehicle), which was significantly different from b2 n (p<0.01) and c57bl/6 (p<0.01). in another set of experiments the specific and ace-resistant b2 agonist labradimil (1.89 nmol) was used instead of bradykinin. labradimil increased extravasation by 3.736±0.121 fold in c57bl/6 (n=6 each, p<0.0001 vs. vehicle), by 4.51±0.11 fold in b2 n (n=7 each, p<0.0001 vs. vehicle) and by 4.88±0.21 fold in b2 tg (n=6, p<0.0001 vs. vehicle), which was significantly different from c57bl/6 (p<0.01) but not from b2 n (p>0.05). the effects of bradykinin and labradimil were largely blocked by 10 nmol icatibant (i.v.) in c57bl/6, b2 n and b2 tg mice (p<0.0001) and hence mediated by activation of b2. these data suggest that the overexpression of b2 in b2 tg is functionally active in endothelial cells of large conductance and small dermal vessels. therefore, b2 tg represents a new animal model suitable for cardiovascular and non-allergic angioedema research. 
pharmacokinetic-pharmacodynamic modeling of irreversible effects: the rituximab example. f. keller 1, 1 universitätsklinikum, innere 1, nephrologie, ulm, germany. background: for pharmacokinetic-pharmacodynamic modeling, usually the sigmoid emax model as described by the hill equation is used. however, treatment regimens exist where the effect is only exerted as long as the drug concentration increases, whereas decreasing concentrations no longer produce an effect. examples are pulse anti-cancer therapies such as originally proposed in the devita protocols. methods: here, the new model for irreversible drug action is derived from the condition that the time-dependent change of the concentration must be larger than the time-dependent growth of the number of target cells (tumor or bacteria). the irreversible effect can be assumed if no further growth of the target cells occurs: dE/dt = +dC/dt − dN/dt, with dN/dt = 0. a solution of the above differential equation can be obtained by use of the integral exponential function iec, based on the euler-mascheroni constant (gamma = 0.5772 …). this model of an irreversible effect was applied to the example of rituximab, where the initial effect on cd19+ and cd20+ b-cells persists completely for 6 months. to obtain a numerical solution, the following parameters need to be determined: the target concentration ctarget = 100 mcg/ml and the infusion time t = 2 hours. results: it can be shown that a plausible result for rituximab can be estimated only under the condition that a short distribution half-life of t1/2 = 2 hours is assumed (not shorter than the infusion time t). with the terminal half-life of 460 hours no plausible solution is obtained. under these conditions two observations are made: there is a negative effect both initially, for low concentrations, and after cessation of the infusion when concentrations decrease (in reality this means no effect in both cases). the irreversible effect is proportional to the target concentration. 
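as a rough illustration, the irreversible-effect model above (the effect accumulates only while the concentration rises, dE/dt = dC/dt for dC/dt > 0) can be integrated numerically. the one-compartment infusion profile below is our own simplifying assumption, not the authors' implementation; only the parameters (ctarget = 100 mcg/ml, infusion time t = 2 h, distribution half-life t1/2 = 2 h) are taken from the abstract:

```python
import math

def infusion_concentration(t, c_target=100.0, t_inf=2.0, t_half=2.0):
    """Illustrative one-compartment profile: the concentration rises
    during a 2 h infusion, normalized so that it reaches c_target at
    the end of the infusion, then decays mono-exponentially with the
    assumed distribution half-life t1/2 = 2 h."""
    k = math.log(2) / t_half
    if t <= t_inf:
        return c_target * (1 - math.exp(-k * t)) / (1 - math.exp(-k * t_inf))
    return c_target * math.exp(-k * (t - t_inf))

def irreversible_effect(t_end=24.0, dt=0.01):
    """Euler-style accumulation of the irreversible effect:
    dE/dt = dC/dt while the concentration rises, dE/dt = 0 otherwise,
    so the effect persists after the infusion stops."""
    t, e, c_prev = 0.0, 0.0, infusion_concentration(0.0)
    while t < t_end:
        t += dt
        c = infusion_concentration(t)
        dc = c - c_prev
        if dc > 0:      # only a rising concentration drives the effect
            e += dc     # a falling concentration adds nothing
        c_prev = c
    return e
```

with these assumptions the accumulated effect equals the target concentration (about 100), consistent with the abstract's observation that the irreversible effect is proportional to ctarget.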
the shorter the half-life relative to the infusion time (t1/2 < t), the stronger the effect (figure). occasionally, specific questions arise concerning ruminant xenobiotic metabolism: (1) are the observed metabolites ruminant-specific and formed directly in the rumen? (2) are ruminants able to cleave plant-specific metabolites like glycosides to the respective aglycon? in the past, new additional in vivo goat metabolism studies with at least one animal were performed. the aim of the project was to elucidate an alternative in vitro method to replace the existing in vivo method in order to robustly address specific questions on xenobiotic metabolism in ruminants for registration of ppp. fresh sheep rumen fluid was incubated in vitro for >7 days using rumen simulation technology (rusitec). the conservation of the physiological conditions was verified by measurement of ph (~ph 6.6) and redox potential (~−300 mv). the composition of the rumen microflora and its viability (bacteria, protozoa and fungi) were monitored by microscopy, incubation on agar plates and several viability tests (e.g. glycosidase test, nh3 and short fatty acids). all the tests showed that the rusitec is a successful tool to maintain sheep rumen fluid for at least 7 days in vitro. the metabolic behavior and performance of the rumen fluid was tested by, e.g., incubating 14c-triazole derivative metabolites (tdm) like triazole-alanine (ta), triazole-acetic acid (taa) and triazole-lactic acid (tla), which are usually formed in plants after application of triazole-containing fungicides. it was shown that ta was cleaved within 72 h to 1,2,4-triazole, while taa and tla were stable under these conditions. these data are in good accordance with available in vivo data in cows. moreover, glycosides (12c-polydatin, octyl-14c-β-d-glucopyranoside) were cleaved completely within 1 hour. all these data show that the rumen fluid maintained its metabolic performance using rusitec. 
basf identified the rusitec method, which is usually used in other areas (e.g. the investigation of methane production in vitro), as suitable and adapted it for the investigation of ruminant xenobiotic metabolism. it was shown that rusitec is a robust method to analyze rumen xenobiotic metabolism and can therefore substitute for in vivo animal studies on ruminant metabolism beyond oecd 503, contributing significantly to animal welfare (3r: replacement). results: by using the training set, physicochemical (e.g. lipophilicity) and pharmacokinetic characteristics of mtx (e.g. v max for active tubular secretion) were slightly adjusted. using the gfr formula of morris et al. (1982) and including an empiric correction factor, the mrd for the training set was 2.49, whereas the bias was 2.80 µmol/l. by applying the developed pbpk model to the test set, the respective values were mrd=3.92 and bias=1.43 µmol/l. for the covariates "at least one potentially interacting co-medication" and "trimethoprim/sulfamethoxazole", a significant impact on the prediction quality was found. conclusions: using the developed pbpk model, a good prediction of the pharmacokinetics of hd mtx in severely ill children was found. by including additional factors influencing the prediction of mtx characteristics (e.g. co-medications), an improved prediction of mtx-sl might be reached. in prospective clinical trials, these more complex models should be evaluated and might be helpful to predict hd mtx pharmacokinetics and reduce unwanted side effects. hypericin is a natural polycyclic quinone found in hypericum perforatum. although hypericin reportedly has numerous pharmacological activities, only a limited number of studies have been performed on the absorption and transport characteristics of this compound, presumably because hypericin is a highly lipophilic compound that is poorly soluble in physiological medium. 
recently we have shown that quercitrin and isoquercitrin, but not hyperoside, quercetin or rutin, increased the uptake of hypericin in caco-2 cells. the major aim of this study was to get a detailed understanding of the exposure and fate of hypericin in the caco-2 cell system under different experimental conditions. the permeation characteristics of hypericin (5 µm) in the absence or presence of hypericum extract (145, 62.5 µg/ml) were studied in the absorptive direction. following application of hypericin to the apical side of the monolayer, only negligible amounts of the compound were found in the basolateral compartment. the amount of hypericin in the basolateral compartment increased concentration-dependently in the presence of the extract (from 0 to 7.5 %). the majority of hypericin was found after cell extraction (44% in the absence and 76% in the presence of the extract). the recovery was in the range of 90 %, with significant amounts of hypericin found after cell extraction. fluorescence microscopy and imaging analysis revealed that hypericin mainly accumulates in the cell membrane. the precise mechanism through which hypericin might overcome the hydrophobic barrier of cell membranes remains to be elucidated. however, our experiments demonstrated that the permeation characteristics of hypericin improved significantly in the presence of the extract. background: the combination of gamithromycin (gam), a novel drug with the big advantage of a once-weekly administration, and rifampicin (rif) is used in the treatment of lower airway disease in foals. both are effective in the therapy of infections with rhodococcus equi, a gram-positive coccobacillus, which is known to survive and reproduce within alveolar macrophages. macrolides are combined with rif to prevent the resistance that develops with single-agent therapy. both drug classes reach high concentrations in the lung, penetrate into phagocytes and kill intracellular pathogens. 
methods: a controlled, single- and multiple-dose study with four periods was conducted in 10 healthy foals (5 ♂ and 5 ♀, age: 42-63 days, body weight: 100-177 kg), which were treated once with rif alone (10 mg/kg s.i.d., p.o., a), followed by the administration of gam (6 mg/kg once weekly, i.v., b) for 3 weeks. study period 3 ("rif-gam acute", c) included the administration of gam and rif for 7 days, with an administration interruption after the first rif dose for blood sampling. for the last study period ("rif-gam chronic", d), both gam and rif were coadministered for 2 weeks. all periods were completed with blood sampling for pharmacokinetic analysis for 48 (…). rif also accumulates in the lung, but to a much lower degree than gam (elf/c 24 h: 1.2 ± 0.5; balc/c 24 h: 2.01 ± 1.24). conclusion: the pharmacokinetic data of the present study provide surprising results. in previous studies, coadministration of clarithromycin and rif showed a dramatically decreased plasma exposure of the macrolide, whereas the balc/c 24 h ratio was unaffected (peters et al. 2011). in contrast, the systemic exposure of gam increased significantly in the case of the combined therapy, and the balc/c 24 h ratio was nearly halved. both macrolides have in common that they are intensively accumulated in the lung (elf << balc). at the moment, further research is required (e.g. in vitro studies) for a better understanding of these very interesting in vivo data. 1 1 institut für klinische pharmakologie, göttingen, germany. background and aim: desvenlafaxine is a selective serotonin and norepinephrine reuptake inhibitor, which is approved in the usa (but not in europe) for the treatment of major depressive disorder. desvenlafaxine is the major active metabolite of the antidepressant venlafaxine. desvenlafaxine is produced by o-desmethylation via cyp2d6. direct administration of desvenlafaxine should bypass the variability in venlafaxine pharmacokinetics caused by the highly polymorphic cyp2d6. 
however, desvenlafaxine is less lipophilic than venlafaxine and may require carrier-mediated transport to penetrate cell membranes. based on our in vitro data, desvenlafaxine is a substrate of the hepatic organic cation transporter oct1, and common genetic polymorphisms abolished desvenlafaxine cellular uptake. about 9% of caucasians are compound heterozygous carriers of loss-of-function oct1 polymorphisms. therefore, oct1 polymorphisms may cause substantial inter-individual variability in the hepatic uptake and plasma concentrations of desvenlafaxine. in this study we evaluated the influence of genetically determined loss of oct1 function on the pharmacokinetics and pharmacodynamics of desvenlafaxine. the primary aim was the dependence of desvenlafaxine plasma concentrations (represented by the auc as primary endpoint) on the number of active oct1 alleles. methods: 50 mg desvenlafaxine (pristiq®) was orally administered to 41 healthy subjects preselected according to their oct1 genotypes. the oct1*1 allele was regarded as fully active; the oct1*2 to *6 alleles were regarded as loss of function. plasma concentrations of desvenlafaxine and its main metabolite didesmethylvenlafaxine were quantified in plasma sampled up to 60 hours after administration using lc-ms/ms. pupillographic measurements were performed as possible surrogate markers for desvenlafaxine pharmacodynamics. results: out of the 41 subjects, 14 carried two active, 13 one active, and 14 zero active oct1 alleles. age, height and weight were 26.9 ± 6.4 years (mean ± standard deviation), 1.75 ± 0.11 m and 70.9 ± 12.1 kg, with no significant differences among the oct1 genotypes. there were strong variations in the pharmacokinetics of desvenlafaxine and its metabolite didesmethylvenlafaxine. the auc 0-infinity of desvenlafaxine varied between 52.8 and 282.2 mg*min/l and the auc 0-infinity of didesmethylvenlafaxine between 3.5 and 30.7 mg*min/l. 
however, neither desvenlafaxine nor didesmethylvenlafaxine pharmacokinetics differed significantly among the three oct1 genotypes. concerning the pharmacodynamics of desvenlafaxine, pupil diameters at maximal constriction after a standardized light exposure were on average 14% greater around the time of t max than before administration. in line with the pharmacokinetic results, there were no significant differences in maximal constriction or other pupillographic parameters among the oct1 genotype groups. conclusions: our results suggest that the oct1 genotype does not affect the pharmacokinetics of desvenlafaxine, and therefore no dose adjustment with respect to oct1 genotype needs to be considered. other factors like renal transporters or polymorphic glucuronidation may explain the great variability in desvenlafaxine pharmacokinetics. background: cardiovascular disorders and medication are highly prevalent in the elderly (1). due to age-related changes in the body, the elderly are particularly vulnerable to side effects and adverse drug reactions. some psychotropic drugs are linked with reports of cardiac side effects. additionally, some cardiac drugs may also cause psychiatric symptoms. of these, angiotensin converting enzyme inhibitors, beta blockers, methyldopa and calcium channel antagonists can induce or exacerbate symptoms of depression (2). the aim of this study was to provide information on the concomitant use of cardiovascular drugs among elderly patients who took psychotropic medication. methods: we conducted a single-center, retrospective study between september 2013 and december 2013 using the medical records of elderly patients (≥65 years of age) admitted to the basin sitesi polyclinic, izmir ataturk research hospital, turkey. demographic characteristics of patients, diagnoses and prescription drugs were evaluated, and spss 16.0 statistical software was used for data analysis. number, percent, mean and standard deviation were used as descriptive statistics. 
results: a total of 541 elderly patients with psychiatric disorders were identified. one in four patients receiving psychotropic medication took at least one cardiovascular agent concomitantly (n=135). the median age was 72 (min: 65, max: 98), and 84 patients were female (62.2%). according to the medical records of these 135 patients, the most commonly used drugs were escitalopram, sertraline, mirtazapine, quetiapine, mianserin and risperidone. the proportion of concomitant use of cardiovascular drugs was higher among the patients who took more than one psychotropic drug (69.6% vs. 30.4%) compared to patients taking psychotropic monotherapy. a higher percentage of women used diuretics (65.4% vs. 33.3%) and angiotensin receptor blockers (36.9% vs. 23.5%) concomitantly with psychotropic drugs when compared to men. the proportion of men using angiotensin-converting enzyme inhibitors and lipid-modifying agents was higher than that of women (58.8% vs. 38% and 68.6% vs. 20.2%, respectively). conclusions: the world's population is ageing rapidly. according to the world health organization, over 20% of the elderly suffer from a psychiatric or neurological disorder. our data showed that the use of cardiovascular drugs among elderly patients with psychiatric disorders was extensive. the effects and interactions of these drugs should be discussed and carefully evaluated before starting treatment in the elderly. further studies focusing on drug use in the elderly will increase the success of geriatric pharmacotherapy. since the adoption of the ich e14 guideline [1], the thorough qt (tqt) study has become a standard element of clinical drug development. however, with the iq/csrc study [2], the ability to detect qtc prolongations of about 10 ms in a phase i setting has been demonstrated. as a consequence, regulatory agencies have begun to grant waivers for a tqt study based on negative qt findings obtained from first-in-man studies. 
a concentration-response model is the key tool that gives sufficient power to an analysis based on data from single or multiple ascending dose studies. this power has been investigated in subsampling studies that simulate situations comparable to those encountered in first-in-man studies [3, 4]. other topics to be addressed are the assurance of sufficient quality of the ecgs obtained, in particular at doses that cause adverse reactions, and a replacement for the active control that is part of a tqt study. if a model-based statistical analysis is used for confirmatory inference, it must be specified in advance. this pre-specification includes tests to ascertain that the model assumptions are met, and alternative methods to be used in case they are violated. in particular, linearity and the absence of a hysteresis, i.e. a delay between the drug concentration and the observed qt effect, need to be tested. this is an area of active research. in this contribution, i will share current experience from a statistical perspective, based both on real data and on simulated studies. i will also discuss critical points in the design of first-in-man studies that are intended to be used to obtain a waiver for a tqt study. human platelets express the g-protein-coupled angiotensin receptor-like-1 (apj) receptor. apj is activated by apelin, which is produced as pre-apelin and cleaved into several bioactive peptides such as apelin-12, -13 and -17. apelin and apj are expressed in a variety of tissues such as the heart, the vessel wall, several tumor types, and platelets. to date, there is no description or suggested function of the apelin/apj system in platelets. here, we investigate apj expression and function in human platelets. apelin and apj expression were determined in platelet-rich plasma from healthy donors by immunofluorescence, western blotting and flow cytometry. 
in a pilot study, apelin and platelet apj expression were analyzed in 23 patients with nstemi, 4 stemi patients and 14 controls. here, platelet aggregation was analyzed by light transmission aggregometry (lta), platelet cd62 and apj expression by flow cytometry, and circulating plasma apelin by elisa. in resting human platelets, apj receptor expression was observed predominantly in the outer cell membrane, as determined by immunofluorescence staining and flow cytometry. activation with a selective thrombin receptor-activating peptide (ap1) resulted in decreased apj protein levels, determined by western blotting in platelet lysates, compared to untreated controls. preincubation of platelets with different apelin isoforms for 30 sec to 5 min reduced platelet aggregation in lta studies by up to 20 % for apelin-17. this effect was inhibited by preincubation of the platelets with the enos inhibitor l-name (300 µm), suggesting the involvement of a no-dependent mechanism. in patients with myocardial infarction, the expression of platelet apj was significantly reduced compared to the control group (56.84 ± 9.28 % in mi versus 100 ± 19.35 % in controls; p = 0.029). this reduction in apj expression on platelets was accompanied by decreased plasma levels of apelin-17 in patients with mi (14.95 ± 0.6 pg/ml versus 16.98 ± 0.6 pg/ml; p = 0.035). interestingly, the decreased apj expression on platelets in mi patients correlated significantly and inversely with the troponin t plasma levels (r = −0.46; p = 0.03). this may suggest an association of apj expression with lower plasma levels of troponin t and possibly tissue damage. in conclusion, our study shows for the first time the expression of apj and a possible function in human platelets. apj may act as an endogenous inhibitor of platelet aggregation in response to certain apelin isoforms, predominantly apelin-17. upon platelet activation, apj is internalized and surface expression is reduced by about 50 %. 
in mi patients, plasma levels of apelin-17 and platelet apj expression were reduced. this correlated inversely with troponin t levels. reduced circulating apelin-17 levels and platelet apj expression may be associated with, or partly account for, platelet hyperactivity in mi patients. anticholinergic drug use, m1 receptor affinity and dementia risk: a pharmacoepidemiological analysis using claims data. f. thome. background: dementia is characterized by cumulative cognitive decline and progressive inability to live independently. the lack of suitable therapies for halting the progression of this disease underlines the importance of the detection of risk factors. anticholinergic drugs have been shown to enhance cognitive decline in the elderly. the classification of anticholinergic drugs according to their anticholinergic burden, however, is inconsistent. since cholinergic transmission in the brain is mainly mediated by the m1 muscarinic acetylcholine receptor, we classified anticholinergic drugs from anticholinergic risk lists according to their affinity for the m1 receptor subtype and calculated the risk for the onset of incident dementia. methods: our analyses are based on claims data of the public health insurance fund aok. the data include information on inpatient and outpatient diagnoses and treatment from 2004 to 2011. inclusion criteria comprised the initial absence of dementia and an age of 75 years or older in 2004. anticholinergic drugs were taken from three anticholinergic risk lists. the pdsp database and a literature search were used to define ki values for the substances. hazard ratios were calculated using time-dependent cox regression including covariates like age, gender, and several comorbidities. results and conclusion: anticholinergic drug exposure increases the risk for dementia. we found that anticholinergics with small ki values confer a higher risk than those with greater ki values. furthermore, conventional risk factors for dementia (e.g. 
age, depression, stroke) could be confirmed. in conclusion, the intake of anticholinergic drugs increases the risk for incident dementia in the elderly. taking the m1 receptor affinity into account may contribute to determining the anticholinergic load with a view to the risk for incident dementia. safety signal detection in a large german statutory health insurance database: first results of a feasibility assessment. f. andersohn 1,2, s. schmiedl 3,4, k. janhsen 3,5, p. thuermann 3,4, j. walker. background: during the last years, approaches to routinely screen health care databases based on electronic medical records or claims to identify drug safety signals have been proposed. to evaluate the performance of such methods, reference sets of index drugs have been compiled, consisting of (1) drugs with a known association to a certain adverse event (=positive controls) and (2) drugs without any evidence of causing this adverse event (=negative controls). the best possible signal detection method would identify a safety signal for 100% of the positive control drugs, and for none of the negative controls. ryan et al. (2013) developed drug reference sets for four adverse events of interest (acute myocardial infarction=ami, acute kidney injury=aki, acute liver injury=ali, and upper gastrointestinal bleeding=ugib) and have shown the feasibility of using these reference sets in us health care databases. whether the use of these us-specific drug lists for the evaluation of signal detection methods is also feasible within german health care databases is unknown. aims: to evaluate whether the drug reference sets developed by ryan et al. (drug saf 2013;36 suppl 1:s33-47) could be used for testing signal detection methods in a large german statutory health insurance database. methods: the data source was the health risk institute (hri) database, an anonymized healthcare database with longitudinal health insurance data from approximately six million germans. 
new users (initiators) of index drugs from 2010 to 2013 were identified and followed up for one year from their first prescription. exposed person-time for each index drug was assessed to estimate for which of these drugs an increase in risk of 50% (relative risk 1.5) compared to the background incidence of the respective adverse event (ami, aki, ali or ugib) could be identified with 80% statistical power. results: of a total of 182 index drugs in the reference sets of ryan et al., 142 (78.0%) were also available on the german drug market and were used by at least one insurant in the hri database during the study period. a total of 5,485,722 index drug initiators were included in the analysis. for a total of 16 index drugs, a relative risk of 1.5 could be detected with 80% power. the numbers of index drugs for each of the outcomes of interest were: ami (3 positive controls; 6 negative controls); aki (3 positive controls; 1 negative control); ugib (2 positive controls; 1 negative control). as the background incidence of ali was low, no positive or negative control with sufficient power was identified for this outcome. conclusions: using the set of reference drugs proposed by ryan et al., the number of drug-event pairs with 80% power to detect a relative risk of 1.5 was low, despite the size of the database used. this may be attributable to differences in drug exposure between germany and the us. hence, an adaptation of the drug list to the german drug market and consumption data might be relevant for future evaluations of signal detection methods using german databases. correlation of sativex™ doses to steady state concentrations of 11-nor-9-carboxy-∆9-thc. administration of the oromucosal spray sativex™ represents a therapy option for treatment of spasticity in patients with multiple sclerosis. sativex™ is an extract containing equal amounts of the cannabis-derived cannabinoids ∆9-tetrahydrocannabinol (thc) and cannabidiol (cbd).
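the 80%-power feasibility check described in the signal-detection abstract above can be sketched with a normal approximation to poisson event counts; the function below is an illustrative simplification (all rates and person-time figures in the usage are hypothetical, not values from the hri analysis):

```python
from math import sqrt
from statistics import NormalDist

def power_rr(bg_rate: float, person_years: float,
             rr: float = 1.5, alpha: float = 0.05) -> float:
    """approximate power to detect a relative risk `rr` for a rare adverse
    event, given a known background rate (events per person-year) and the
    exposed person-time, using a normal approximation to poisson counts."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    mu0 = bg_rate * person_years      # expected events under the null
    mu1 = rr * mu0                    # expected events under rr
    crit = mu0 + z * sqrt(mu0)        # upper critical event count
    return 1 - NormalDist(mu1, sqrt(mu1)).cdf(crit)
```

for a hypothetical background rate of 5 events per 1000 person-years, 100,000 exposed person-years give near-certain detection of rr = 1.5, while 1000 person-years give little power, mirroring why only 16 of 142 index drugs reached 80% power.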
in cases of cannabis abuse, a long elimination half-life of some thc metabolites is known. therefore, in patients receiving sativex™, the long elimination half-life of these metabolites should allow drug monitoring under steady state conditions. because immunologically based methods for thc determination are very common in clinical chemistry, such monitoring could be performed simply, even in patients under sativex™ therapy. in a preliminary observational study, 11-nor-9-carboxy-∆9-thc (thc-cooh) concentrations were measured with a commercial immunoassay in urine samples of 16 patients with multiple sclerosis receiving sativex™. in addition, thc-cooh, thc, cbd as well as the hydroxy metabolites of thc and cbd were measured by gc/ms in urine and blood samples. using this analytical technique, only excessive dosing (as compared to the declaration by the patient) can be detected. as a result of this approach, thc-cooh concentrations determined by the immunoassay were found not to correlate with the daily applied amount of sativex™ as indicated by the patients (spearman rank order test: p > 0.05). two patients stated that they had not taken sativex™ on the day the samples were taken, and one patient reported additional cannabis abuse. in three patients the immunological thc-cooh determination was negative or nearly negative. interpretation of the data is hampered by the fact that an incorrect declaration of sativex™ applications by the patients cannot be excluded. introduction: learning analytics seeks to enhance the learning process through systematic measurement and analysis of learning-related data to provide informative feedback for students and lecturers. however, which parameters have the best predictive power for academic performance remains to be elucidated. objective: to analyze the potential of different learning analytics parameters to predict exam performance in undergraduate medical education in pharmacology.
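the spearman rank order test used above for the sativex™ dose-concentration comparison is based on the rank correlation coefficient; a minimal pure-python sketch (midranks for ties, coefficient only, no p-value computation):

```python
def _ranks(xs):
    """1-based midranks: tied values share the average of their ranks."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """spearman's rho = pearson correlation of the rank vectors."""
    rx, ry = _ranks(x), _ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

because only ranks enter the computation, any monotone dose-concentration relationship would yield rho close to 1; the reported p > 0.05 means no such monotone association was detectable.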
methods and results: php (hypertext preprocessor) as server-side scripting language was used to develop a learning analytics platform linked to a mysql database for storage and analysis of data (www.tumanalytics.de). the database consisted of 440 lecturer-authored multiple choice questions that were made available to a cohort of undergraduate medical students enrolled in a pharmacology course (winter term 2014/15) at technische universität münchen (tum). the course consisted of a 28-day teaching period, followed by a 12-day self-study period and a final written exam. students' assessment data from tumanalytics were collected during the self-study period and correlated with the individual exam results in a pseudonymized manner. a total of 224 out of 393 (57%) students participated in the study. the coefficient of multiple correlation (r) was calculated for different parameters in relation to exam results as a measure of predictive power. of the different parameters investigated, the total score and the score of the first attempt in tumanalytics had the highest positive correlation with exam performance (fig. 1). no sex-specific differences were observed. summary and conclusion: in this study we systematically investigated the potential of different learning analytics parameters to predict learning outcome and exam performance. total score and score of the first attempt were identified as the parameters with the highest predictive power. in conclusion, our study underscores the potential of learning analytics as a valuable feedback source in undergraduate medical education in pharmacology. in educational settings, tests (e.g., written or oral exams) are usually considered devices of assessment. however, a recent and intriguing line of evidence from basic cognitive psychological research suggests that tests may not only help to assess what students know, but may also help to improve the learning and long-term retention of information.
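for a single predictor, the coefficient of multiple correlation used in the learning-analytics study above reduces to the absolute value of the pearson correlation between one practice-score parameter and the exam result; a minimal sketch (any scores passed in are illustrative, not study data):

```python
def pearson_r(x, y):
    """pearson correlation coefficient between two equal-length sequences;
    for a single predictor, abs(r) equals the multiple correlation R."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5
```

r squared then gives the share of exam-score variance explained by the practice parameter, which is the "predictive power" compared across parameters in the abstract.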
the goal of the present study was to apply such test-enhanced learning to pharmacological teaching. after the last lecture of a pharmacology class (n=194 3rd-year medical students, n=213 4th-year medical students: basic or clinical pharmacology, respectively), one week prior to the final exam, students were given the opportunity to voluntarily participate in an online exam. because pilot work from previous semesters had revealed relatively low levels of participation in such formative exams, students were offered bonus points for (successful) participation as an incentive. the online exam consisted of 60 items (i.e., selected pieces of information from the lectures and seminars) and was provided on the e-learning platform ilias. twenty of the 60 items were presented as statements for restudy, 20 items were tested using single-choice questions, and 20 items were tested using short-answer questions. the students were randomly assigned, in thirds, to different sets of questions. the summative final written exam for each group consisted of 30 single-choice questions, 15 of which had not been used before (as a standard of comparison). the remaining 15 questions of the final exam were taken from the previous online exam, but were slightly reworded to avoid ceiling effects. of these 15 reworded questions, five corresponded to restudy items, five to single-choice items, and five to short-answer items from the online exam. the main points of interest were (i) whether the re-processing (rewording and asking for transfer of knowledge) of information in the online exam affected participants' performance on the final exam, and (ii) whether any effect depended on the specific type of re-processing (restudy vs. single-choice test vs. short-answer test).
if previous findings from basic cognitive psychological research on testenhanced learning can be generalized to more applied settings and educationally more relevant materials such as pharmacological information, students' performance in the final exam should be better for questions corresponding to previously tested items than for questions corresponding to previously restudied items. moreover, if more difficult tests lead to more test-enhanced learning than less difficult tests, as is suggested by recent findings from cognitive psychological research, performance for questions corresponding to (supposedly more difficult) short-answer items should even exceed that for questions corresponding to (supposedly less difficult) single-choice items. the present findings bear direct implications for educational practice. safe and rational prescribing is one of future physicians' key skills [1] . in order to address the persisting prescribing deficiencies [2] , we set out to develop a learning tool for pharmacotherapies of the most important diseases worldwide. the format, scope, information architecture, and functionalities of the app were identified through assessment of existing apps, literature analysis, app simulation-based student surveys, and expert advice. a fully functional offline app format for smartphones was selected based on the trends in using digital technologies for educational purposes [3] and on the unreliable internet and power availability in many learning settings. a relational database based on semantic relationships was chosen to minimize information redundancy and to enable the retrieval of drug-related information in the context of mechanisms, contra-and indications, adverse drug reactions, interactions, and common prescribing situations. the usability was optimized using a simulation of the app evaluated by medical students from germany and tanzania, and by experts. 
a list of 67 indications was assembled beginning with disease burden data for the seven who/world bank regions. each disease accounts for at least 0.3% of life years lost due to premature death or lived with ill-health or disability (dalys) in at least one region. the list was further complemented according to expert recommendations. therapeutic recommendations are based on current guidelines, considering cheaper treatment alternatives provided in the who list of essential medicines. a novel dual-scale classification system lists drug mechanisms according to the affected physiological process and to the resulting therapeutic effect [4] . contraindications, adverse drug reactions, and interactions were compiled using drug monographs of the european medicines agency, the us food and drug administration, and health canada. unexpectedly, we found significant differences among these sources with respect to adverse drug reactions. this necessitated ongoing verification by surveying general practitioners and specialists in internal medicine. during the dgpt meeting we will present the results of testing of the cardiovascular section comprising 8 indications, 28 drugs representing 25 mechanisms, and up to 120 adverse drug reactions. the european certified pharmacologist (eucp) programme was launched in july 2014 by the federation of european pharmacological societies with the intention to identify experts in the field of pharmacology whose competency profile, in addition to their personal specialised scientific expertise, covers expert knowledge in all major fields of the discipline. seventeen ephar member societies have declared their active participation in the eucp programme so far (austria, croatia, czech republic, finland, france, germany, greece, hungary, italy, the netherlands, norway, poland, portugal, serbia, slovenia, spain, turkey).
eacpt, the european association of clinical pharmacology and therapeutics, has also recently decided to participate in the eucp programme. national programmes must meet all requirements of the eucp guidelines, including a clear catalogue of criteria with respect to knowledge, practical awareness and skills, as well as general rules, including rules for the final assessment of candidates. such programmes may be based on existing diplomas or training schemes, or may consist of a set of rules for how applicants may submit credentials for their expertise with respect to the eucp criteria. so far, three ephar member societies have submitted a national eucp programme: austria, italy and the netherlands. the programmes differ in structure and reflect the flexibility of the eucp programme with respect to the respective national conditions. while the italian programme is based on a catalogue of criteria against which applicants have to certify and document their expertise, and the dutch programme is based on a structured phd training course, the eucp scheme submitted by the austrian pharmacological society aphar is based on the legally regulated diploma of medical specialist (facharzt) in pharmacology and toxicology (aphar also plans to submit separate regulations for specialists in clinical pharmacology and for non-medically qualified pharmacologists). the aphar eucp scheme was approved by the eucp committee in november 2015 and its regulations are available on the eucp website (www.eucp-certification.org). the differently structured programmes of the italian and dutch pharmacological societies will also be available at this website once approved by the eucp committee, and may thus serve as 'case studies' for other ephar and eacpt member societies wishing to take part in the eucp programme. introduction: the outstanding importance of pharmacovigilance (pv) for the safe use of medicines has increasingly been recognised during recent years.
the multidisciplinary character of pv requires know-how in topics as different as pharmacology, clinical medicine, pharmacoepidemiology, information technology, pharmaceutical manufacturing, legal aspects, public health policies, and medical traditions in different regions of the world. in this complex situation there is a growing need for pv capacity building, in particular by professional training through high-quality pv courses with different focuses and different levels of detail. against this background, the world health organization (who) and the international society of pharmacovigilance (isop) have co-operated to create a curriculum for teaching pv which can be used for a wide range of audiences and in very different settings and situations. the purpose was to provide an inventory and systematically structured overview of pv, including recent developments in topics like pharmacogenomics, consumer reporting of adrs, risk management and who-led international projects, and to propose a range of tasks for practical training. we made use of several relevant existing packages of pv topics and concepts of pv teaching from national and international institutions. we also drew from extensive printed material as well as comprehensive reviews, textbooks and guidelines developed by international organisations, which are often available online. results: the curriculum includes a main component consisting of a major part for theoretical lecture-based training and a minor component with suggestions for hands-on exercises. the theoretical part has a three-level hierarchical and modular structure with evenly divided tiers. there are 15 chapters, each of which is divided into four sections and each section into four to six sub-sections. the practical part consists of twelve sets of three or four proposals for practical tasks which are related to the theoretical lectures. since its launch in 2014, it has successfully been used in several international courses.
currently a pilot project is under way to explore its use for 'crowd sourcing': it is placed on the isop homepage with a programme allowing institutions or persons experienced in pv teaching to upload any relevant presentations they may have, with a link to related chapters, sections or subsections. these presentations will be offered for downloading by interested users. the curriculum provides comprehensive coverage of almost all areas of pv. its structure and content allow almost any kind of focusing on specific issues and going into depth, while maintaining the overall context. it offers opportunities for tailoring courses specifically to the needs of different audiences and can be applied to various forms of training, such as broad, comprehensive and intensive courses, short overviews, or focused treatments of specific narrow topics. according to the reach regulation (ec) no. 1907/2006, chemicals produced, marketed or used within the european union have to undergo a registration process, wherein the registrants have to provide information on hazards and potential risks presented by the substances. however, the standard information requirements defined in annexes vii to x of the regulation may be waived or adapted by the registrants if adequate documentation and justification according to criteria specified in annexes vii to xi are provided. to evaluate the data availability in registration dossiers of high-tonnage substances (above 1000 tpa) and their compliance with the reach regulation, the federal institute for risk assessment (bfr) in cooperation with the federal environment agency (uba) developed a systematic web-based evaluation scheme. in total, 1932 dossiers were checked for selected human health and environmental endpoints such as repeated dose toxicity, genetic toxicity and ecotoxicity. a remarkably high proportion of the evaluated dossiers, 43% to 82% depending on the endpoint, included waivings of or adaptations to the standard information requirements.
therefore, those dossiers were not concluded, but categorised as 'complex' (springer et al., 2015). the use of waiving and adaptations for 'complex' endpoints was examined in a follow-up project. herein, it was evaluated whether the given justifications were in accordance with the criteria set out in the respective reach annexes. the results will show the frequency and pattern of waiving/adaptation approaches for the human health as well as the environmental endpoints. besides this general overview, specific problems regarding the application of the reach regulation were identified, and their significance with regard to remaining data gaps will be discussed. the german commission for the investigation of health hazards of chemical compounds in the work area has re-evaluated dimethylformamide (dmf) and classified it in carcinogen category 4. this category is for chemicals with carcinogenic potential for which a non-genotoxic mode of action is of prime importance and genotoxic effects play no or at most a minor part, provided the mak and bat values are observed. under these conditions no contribution to human cancer risk is expected. dmf was identified as a substance of very high concern by the european commission. the amount of dmf manufactured and/or imported into the eu is in the range of 10,000-100,000 t/y. n,n-dimethylformamide is a hepatotoxin in humans and rats. the carcinogenicity studies in both mouse and rat were conducted with test material of an acceptable purity and physical form. the critical study involved administration of dmf via inhalation, which is relevant to human exposure. there is conclusive evidence that dmf induces significant increases of hepatocellular carcinomas in rats after exposure to 800 ml/m³ and in mice in response to 200 ml/m³ and higher. several in vitro and in vivo studies have indicated that dmf is not genotoxic.
the results of the long-term studies reveal that the tumors develop in the liver only after chronic toxic inflammatory and degenerative changes have developed in this organ. the commission concluded that the tumors are a result of chronic liver damage occurring at high exposure concentrations. the available evidence therefore suggests that there is a threshold dose for the carcinogenic effects of dmf. accordingly, dmf was classified in carcinogen category 4 with a mak value of 5 ml/m³, an exposure concentration which does not induce liver toxicity and, as a consequence, is not associated with an increased cancer risk. today, a large majority of people are constantly exposed to electromagnetic radiation. many studies have been performed to investigate whether this type of radiation has a potential to affect biological systems at low intensity levels. even though no complete consensus has been reached so far on this issue, most of the investigations do not indicate a harmful potential of this radiation. two questions remain open until today, i.e. long-term effects and specific effects on children. it has been demonstrated that, in comparison to adults, children absorb far higher doses of mobile phone radiation in the skull, particularly in the bone marrow, where hematopoiesis takes place. these absorptions occasionally exceed the recommended safety limits. the aim of this study was to elucidate whether cells of the hematopoietic system can be affected by different forms of mobile phone radiation. as biological systems, two cell types were investigated: hl-60 cells as an established cell line, and human hematopoietic stem cells. the radiation was modulated according to the two major technologies, gsm (900 mhz) and umts (1950 mhz). additionally, lte (2535 mhz) modulation was applied because this technology is already used worldwide but has not been studied sufficiently.
cells were exposed for a short and a long period and with different intensities ranging from 0 to 4 w/kg. studied endpoints included oxidative stress, differentiation, dna repair, cell cycle, dna damage, histone acetylation, and apoptosis. appropriate negative and positive controls were included and three independent replicate experiments were performed. exposure to radiofrequency radiation did not induce any alterations of cell functions, measured as oxidative stress and cell cycle. cell death in the form of apoptosis was not observed. primary dna damage was not induced and dna repair capacity for nucleotide excision repair was not changed. epigenetic effects (measured as histone acetylation) were not observed. finally, differentiation was not affected. the effect of treatment with various chemicals as positive controls was different in the two cell types. all in all, mobile phone radiation did not induce effects on human hematopoietic cells. in 2014 who published guidance on evaluating and expressing uncertainties in human health hazard characterisation (hc). in this approach, the outcome of hc is expressed as an interval or distribution rather than a "traditional" deterministic point estimate, such as a reference dose (rfd), thereby communicating potential uncertainties more clearly. risk management protection goals, such as the acceptable magnitude of effect (m) and incidence (i) in the population, are made explicit quantitatively along with the confidence with which they are achieved, e.g. by an rfd. specifically, the goal of this approach to hc is to estimate the "hd m i" , i.e. the "true" human dose associated with m and i (e.g. body weight decreased by ≥ 10% (m) in 5% (i) of the population). 
if uncertainties are expressed by providing estimates of the hd m i as confidence intervals or uncertainty distributions, both the "degree of uncertainty" (ratio of upper and lower limit of the interval/relevant distribution segment) and the "coverage" (statistical confidence) associated with a given rfd value can be characterised. alternatively, one may start from a chosen coverage and calculate the associated "probabilistic rfd". uncertainty in each hc aspect, e.g. inter-/intraspecies or time extrapolation, can be characterised by a "generic default uncertainty distribution" which has been derived from historical data, but may be replaced by a substance-or effect-specific distribution (analogous to a chemical-specific adjustment factor in the "traditional" deterministic approach), where available. the uncertainty distributions for the individual hc aspects can then be combined into an overall uncertainty interval/distribution in (1) a simple non-probabilistic way, (2) a more refined "approximate probabilistic analysis" (aproba, a free spreadsheet tool for easy implementation also by non-statisticians is available from the who website), or (3) a fully probabilistic monte carlo analysis. the hc aspects contributing most to overall uncertainty are also identified and may be prioritised for refinement in a next assessment tier. the who approach uses a tiered strategy which may start with evaluating the uncertainties in the outcome of a "traditional" deterministic hc. in this way it represents a unified framework integrating deterministic and probabilistic hc methodologies. moreover, the concept can be easily combined with exposure uncertainty assessment. risk managers may use the additional information in better weighing potential health effects against other interests. when they consider the overall uncertainty larger than desirable in view of the problem formulation, they may decide to ask for a more refined (higher tier) assessment. 
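the combination of per-aspect uncertainty distributions into an overall interval, as described above for the who approach, can be sketched as a small monte carlo: each hazard-characterisation aspect contributes a lognormal factor dividing the point of departure, and the "degree of uncertainty" is the ratio of the upper and lower interval limits. this is an illustrative simplification in the spirit of the approach, not the aproba tool itself, and all factors in the usage are hypothetical:

```python
import random
from math import exp, log

def hd_interval(pod: float, aspects, n: int = 50_000, seed: int = 1):
    """monte carlo combination of lognormal uncertainty distributions.
    `aspects` is a list of (median_factor, gsd) pairs, each dividing the
    point of departure `pod` (e.g. inter- and intraspecies extrapolation).
    returns (p05, p50, p95, degree_of_uncertainty = p95/p05)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        hd = pod
        for median_f, gsd in aspects:
            # lognormal factor with the given median and geometric sd
            hd /= median_f * exp(rng.gauss(0.0, log(gsd)))
        draws.append(hd)
    draws.sort()
    p05, p50, p95 = (draws[int(q * (n - 1))] for q in (0.05, 0.50, 0.95))
    return p05, p50, p95, p95 / p05
```

with a hypothetical pod of 100 mg/kg and two aspects (median factors 10 and 3, each with a geometric sd of 2), the median estimate sits near 100/30 while the 90% interval spans roughly a factor of 25, illustrating how "degree of uncertainty" and "coverage" are read off the combined distribution.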
if all possibilities for refinement are exhausted, the new approach can also aid in the selection of new data which might need to be generated. due to constant improvement in analytical methods, an increasing number of substances are found in drinking water. the joint project toxbox aimed at the development of a reliable test battery allowing for a rapid evaluation of single substances in water. eleven partners, nine from the research and two from the business sector, formed the project consortium. the attention was focused on genotoxic, neurotoxic and endocrine effects, which are considered to be of most concern to the consumer. by the end of the project, a set of guidelines will be published that describes the analytical methods in detail. the project was based on a theoretical concept, called the "health-related indicator value" (hriv), which was developed by the german federal environment agency (uba) for the assessment of substances with incomplete toxicological data. depending on the type of effect, a hriv between 0.1 and 3.0 µg/l was derived for the substance to be evaluated. over the years, an increasing number of substances, as well as an increasing number of detections, has been observed in drinking water. this called for the creation of an evaluation scheme that offers rapid and at the same time reliable evaluations of chemicals for which no data are available. the concept is in accordance with tox21, which envisages the trustworthy evaluation of relevant endpoints by two or three in vitro assays. in the context of toxbox this was provided for the endpoints genotoxicity, neurotoxicity and endocrine effects. in all cases a hierarchic strategy is applied that enables a first assessment via relatively simple test assays; only when these tests give a hint towards an effect are more elaborate techniques applied for a final assessment. the ames test and micronucleus assay, in combination with the umu test, will form the panel for genotoxicity testing.
neurotoxicity will be assessed by comparing necrotic and apoptotic effects, as well as the development of reactive oxygen species, in human neuronal cells with human liver cells. additionally, neuron-specific assays like the neurite outgrowth test are performed. this is complemented by measuring acetylcholinesterase activity and the development of the lateral line organ in zebrafish (danio rerio). the test battery for endocrine effects consists of hormone-specific reporter gene assays in addition to the h295r assay. when necessary, a reproduction assay in the mud snail potamopyrgus antipodarum is carried out. during the project some 60 substances were evaluated. this allowed for the development of a reliable test strategy. currently the guidelines for performing the required tests are in preparation. metabolomics has gained increasing interest over the last years, with numerous possible applications ranging from strain optimization for industrial production over drug discovery to improved toxicity testing. however, regulatory acceptance of this promising approach has not yet been reached, mostly because standardization and evaluation of reproducibility are still lacking. the metamap®tox database has been developed by basf over the last ten years and contains toxicity and metabolome profiles of more than 750 different compounds. to ensure maximum reliability, data were gained from plasma samples of highly standardized 4-week rat studies. animal maintenance and treatment, sampling and work-up of plasma, measurement of the metabolome, as well as data interpretation and storage were standardized, including thorough documentation, compliance with sops and safe data storage. data from more than 80 control groups, each with 10 males and 10 females, were analyzed to assess variability. an in-depth analysis showed a high stability and robustness of the metabolome over a period of ten years.
after artificially splitting the control groups of 10 animals into groups of five and comparing the numbers of statistically significantly regulated (false positive) metabolites, the peak of the distribution curve was located to the left of the exact (gaussian) center but tailed off to the right more than expected under the normal distribution. from this analysis we were able to calculate density distributions (relative ratio and standard deviation) for the control values of each metabolite, which can serve as a historical control displaying the range of changes that can be expected as normal. during the course of our project we have used more than ten exact repeats to show the reproducibility and reliability of the metabolome analysis (kamp et al., 2012) . comparing these exact repeats at different levels of statistical significance, we noted that at a level of statistical significance of approximately p = 0.1, the best balance between matches (metabolites regulated in the same direction) and mismatches (metabolites regulated in opposite directions) was obtained. the high quality standards applied, as well as the examination of control data, increase the robustness of this approach, going hand in hand with improved data quality. this significantly facilitates decision making based on the gained data. due to these improvements a new level of transparency is reached, which might allow the inclusion of metabolome data in a regulatory environment. hydroxycitric acid (hca) is a fruit acid naturally occurring in fruits of the tropical plant garcinia cambogia. a number of dietary supplements intended for weight loss contain hca, which is added in the form of g. cambogia extracts. the composition of these extracts is often not clearly specified.
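the control-group splitting analysis described above (groups of 10 animals split into fives, counting falsely "regulated" metabolites) can be mimicked with a small null simulation; an approximate two-sample z-test stands in for whatever test statistic was actually used in the metamap®tox analysis, so the numbers are illustrative only:

```python
import random
from statistics import NormalDist, mean, stdev

def false_positive_rate(n_metabolites=200, n_sims=50, alpha=0.10, seed=7):
    """split a 10-animal control group into two groups of five and count
    metabolites flagged as 'regulated' by an approximate two-sample z-test.
    all data are drawn from the same distribution (null), so every flag is
    a false positive; the return value estimates the false-positive rate."""
    rng = random.Random(seed)
    zcrit = NormalDist().inv_cdf(1 - alpha / 2)
    flags = total = 0
    for _ in range(n_sims):
        for _ in range(n_metabolites):
            a = [rng.gauss(0, 1) for _ in range(5)]
            b = [rng.gauss(0, 1) for _ in range(5)]
            se = (stdev(a) ** 2 / 5 + stdev(b) ** 2 / 5) ** 0.5
            flags += abs(mean(a) - mean(b)) / se > zcrit
            total += 1
    return flags / total
```

at a nominal p = 0.1 the observed rate lies somewhat above 10% because the normal approximation understates the tails for n = 5 per group, which is one reason historical control distributions per metabolite are more informative than raw significance counts.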
health concerns about the safety of hca-containing supplements have been raised, based on results from animal studies which observed toxic effects on the testis and on spermatogenesis after administration of preparations containing high hca doses. in the current risk assessment, the possible health risks associated with consumption of hca-containing dietary supplements (hca doses of approximately 1000 to 3000 mg per day) were evaluated based on relevant animal and human studies, with the focus on testicular toxicity as a critical endpoint. in several published animal studies, repeated (short-term or subchronic) ingestion of certain hca preparations (g. cambogia extract or ca2+-hca salt) induced testicular atrophy (i.e. atrophy of seminiferous tubules, degenerative changes of sertoli cells at histological examination) and impaired spermatogenesis (i.e. decreased sperm counts) in male rats at high doses (noael and loael of 389 and 778 mg hca/kg body weight per day, respectively). animal studies with other hca preparations (ca2+/k+-hca salt) found no such effects at the highest hca doses tested (noael: 610.8 mg hca/kg body weight per day). human intervention studies which addressed the safety of hca in healthy test persons reported no substance-specific adverse effects after ingestion of hca doses of up to 3000 mg per day over periods of up to 12 weeks. however, the question of possible adverse effects of hca on the human testes was not adequately addressed in studies with human volunteers. in a single clinical study with 24 male test subjects, no significant changes in endocrinologically relevant parameters such as serum inhibin b or fsh were observed after consumption of 3000 mg of hca per day for 12 weeks. however, no investigations of direct parameters that might inform on potential effects on spermatogenesis, such as sperm quality and sperm count, were conducted in this study.
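the animal-to-human comparison underlying the hca assessment above can be expressed as a margin of exposure, i.e. the ratio of the animal noael to the estimated human dose per kg body weight; a minimal sketch assuming a default 70 kg body weight (an assumption, not a figure from the assessment):

```python
def margin_of_exposure(noael_mg_kg: float, daily_dose_mg: float,
                       body_weight_kg: float = 70.0) -> float:
    """ratio of the animal noael (mg/kg bw per day) to the estimated
    human dose per kg body weight for a given daily intake in mg."""
    return noael_mg_kg / (daily_dose_mg / body_weight_kg)
```

with the noael of 389 mg hca/kg bw per day and the maximum supplement dose of 3000 mg per day, the margin is roughly 9, well below the factor of 100 conventionally used to cover inter- and intraspecies differences, which is consistent with the concern expressed in the assessment.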
considering the serious adverse effects on the testes observed in several animal studies, as well as the lack of adequate human data on the safety of long-term use of hca preparations, it is concluded that knowledge gaps and substantial uncertainties exist regarding the human health safety of the high amounts of hca found in commercially available food supplements, particularly with regard to the male reproductive system. a critical look at the passing-bablok regression b. mayer 1 1 universität ulm, institut für epidemiologie und medizinische biometrie, ulm, germany background: the passing-bablok (pb) regression is a commonly used approach to prove the equality of different analytical methods when studying quantitative laboratory data. it is based on the assumption that the measurements of two methods are linearly related. if one method is then regressed onto the other and the respective confidence intervals of the intercept and the slope include 0 and 1, respectively, this is taken as proof of method equality. however, this conclusion is problematic with respect to an essential principle of statistical hypothesis testing. methods: in this talk the general idea behind the pb regression is discussed critically. although the method makes use of confidence intervals, a decision is made, which is why it is important to discuss how the results of a statistical hypothesis test have to be interpreted. moreover, alternative statistical approaches for investigating agreement in biometrical practice are pointed out by means of a practical example, and their advantages and limitations are addressed. results: all approaches applied to a sample data set led to the same conclusions. demonstrating method equality, though, necessitates an a priori definition of an appropriate equivalence margin. conclusion: the pb regression may give useful advice when comparing two measurement methods for equality.
however, its results are statistically inconclusive, since the pb method does not follow the principle of equivalence testing. alternative measures of agreement should be applied instead to ensure results that are statistically defensible and can serve as proof. insulin is an important parameter both in toxicology (toxicity to the endocrine pancreas) and in pharmacology (models of diabetes and metabolic syndrome). currently available elisa and ria methodologies for insulin often require up to 100 µl of plasma or serum for a single measurement. in order to meet the general trend to include more relevant parameters in animal studies, and the restrictions imposed by animal welfare requirements on the volume of interim blood draws, we explored the rat/mouse insulin singleplex assay of meso scale discovery (msd) as an alternative assay consuming only 10 µl of serum or plasma, or less, for a single measurement. the assay is a sandwich immunoassay: insulin in the sample binds to the capture antibody immobilized on the working electrode surface at the bottom of each well, and recruitment of the labeled detection antibody (anti-insulin labeled with an electrochemiluminescent compound, the msd sulfo-tag™ label) by the bound analyte completes the sandwich. voltage applied to the plate electrodes then causes the label bound to the electrode surface to emit light, the intensity of which is a quantitative measure of insulin. the msd insulin assay was characterized by a robust calibration and only small variations within repeated measurements. the assay presented a broad dynamic range, and differences in insulin levels between normal rats and rats suffering from metabolic syndrome could readily be demonstrated. furthermore, the high sensitivity may even allow the use of smaller sample volumes. these features render this assay an attractive alternative for the measurement of insulin. the lack of corresponding quality control samples for internal quality control may be considered a relative drawback.
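to illustrate the mechanics behind the pb regression discussed above, the following is a minimal sketch of the slope/intercept estimation (shifted median of all pairwise slopes); the nonparametric confidence intervals on which the pb decision rule actually rests, and the offset correction for negative slopes, are omitted, and the data are invented:

```python
from itertools import combinations
from statistics import median

def passing_bablok(x, y):
    """simplified passing-bablok estimate: slope = median of all
    pairwise slopes, intercept = median residual. the confidence
    intervals of the full procedure are not computed here."""
    slopes = []
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        if x1 != x2:  # pairs with identical x contribute no slope
            slopes.append((y2 - y1) / (x2 - x1))
    b = median(slopes)
    a = median(yi - b * xi for xi, yi in zip(x, y))
    return a, b

# two methods measuring the same samples with small discrepancies:
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
a, b = passing_bablok(x, y)
```

with these data the estimated slope is close to 1 and the intercept close to 0, which the pb decision rule would read as method equality; the abstract's point is that this reading inverts the logic of equivalence testing.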
however, the cross-reactivity of the assay with human insulin provides the opportunity to use qcs designed for human assays and possibly to participate in ring trials for human insulin for external quality control if needed. surfactants are main constituents of different consumer products, e.g. detergents or cosmetic cleansing products. since surfactants show an intrinsic skin irritation potential, dilutions are used in the final products to avoid adverse effects such as irritant contact dermatitis from product use. in addition, mixtures of different surfactants are typically formulated, as it is a long-standing experience that such mixtures exhibit a much lower acute irritation potential than expected from the mere summation of their individual irritation potentials, an effect coined surfactant antagonism. only few studies have been performed to gain a more fundamental understanding of the effect, and its mechanistic basis remains unclear. however, a thorough understanding of the surfactant antagonism is not only of value for the formulation of products that are considered 'mild to the skin'. it is also important for the classification of products according to the clp regulation in cases where data on the mixtures are missing, because summation of the ingredients' irritating effects usually results in over-classification as a skin irritant. owing to the progress in the development of alternatives to animal testing, different in vitro methods have become available to determine the skin-irritating properties of substances. methods such as oecd tg 431 and 439 especially aim at deriving a classification for skin irritation/corrosion effects according to the clp regulation. however, even though these methods have become the preferred test methods for skin irritation testing, to our knowledge the hitherto isolated investigations of the surfactant antagonism were only performed either by human patch test studies or by non-standard in vitro assays.
in this study, the irritation potential of binary mixtures of sodium dodecyl sulfate (sds), linear alkylbenzene sulfonate (las), cocamidopropyl betaine (cabp) and alkylpolyglucoside (apg), compared to the single compounds, was investigated using open source reconstructed epidermis (os-rep) models. combinations of sds or las with cabp and apg, respectively, resulted in a clear decrease of the irritation potential compared to the irritation exerted by the single surfactants, even though the total surfactant concentration was higher in the mixtures. in addition, the effect of surfactant antagonism was also observed in a mixture of cabp and apg. the reduced irritation potential of the mixed surfactants came along with both reduced skin penetration of fluorescein and reduced release of ldh. since no surfactant antagonism is observed in monolayer cultures of keratinocytes exposed to mixtures of surfactants, it is assumed that keratinocytes in the viable parts of the reconstructed epidermis are promptly damaged by the surfactants once the model's barrier is destroyed. hence, surfactant antagonism appears to be primarily driven by the mixtures' lower ability to damage the skin model's barrier. the micronucleus (mn) test is a reliable method for the detection of cytogenetic damage in proliferating cells. in recent years, substantial progress has been made on automated, and thus faster and more objective, scoring of mn test samples, i.e., methods based on flow cytometry. the aim of the present study was to use the adherently growing human bladder cancer cell line rt4 to carry out a comparison between traditional (fluorescence microscopy) and automated (flow cytometry) mn scoring. for this purpose, different substances known to serve as positive controls were used.
rt4 cells were either continuously incubated for 72 h (approximately 1.5 cell cycle durations) with methyl methanesulfonate (mms; 50-200 µm), benzo[a]pyrene (b[a]p; 0.1-1 µm), vincristine (3-20 nm) or colcemid (10-20 nm), or cells were irradiated with x-rays (0.5-2 sv) and then cultured for 72 h. for standard mn scoring, cells were harvested, subjected to hypotonic treatment, fixed with methanol/acetic acid, placed on glass slides, stained with acridine orange and observed by fluorescence microscopy. for the flow cytometric method, harvested cells were stained in two sequential steps. intact cells were subjected to ethidium monoazide bromide followed by photoactivation (75 w, 20 min) to label dead or dying cells. then, cells were lysed and stained with sytox green for pan-dna labelling and analyzed on a flow cytometer. both chemically- and radiation-induced treatments led to a dose-dependent induction of mn when evaluated by fluorescence microscopy. when the flow cytometry-based method was applied, however, clearly positive results including a dose-dependent induction of mn were obtained for only 3 out of the 5 treatments (vincristine, colcemid and x-rays), whereas treatment with mms and b[a]p led to only minor increases in relative mn frequencies (≤2-fold), even at the maximum concentrations. in summary, flow cytometry-based mn scoring has been successfully applied in rt4 cells. however, our initial results suggest that flow cytometry-based mn scoring is less sensitive than microscopic scoring when rt4 cells are used. so far, only few adherently growing cell lines have been applied to flow cytometry-based mn scoring. further substances (positive and negative controls) and possibly other adherent cell lines need to be tested to expand our knowledge of the effectiveness of automated mn scoring in vitro compared to traditional approaches.
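the ≤2-fold criterion mentioned above can be made explicit as a fold-increase calculation over the concurrent control; the counts below are invented for illustration only:

```python
def mn_fold_increase(treated_mn, control_mn):
    """fold increase of the micronucleus frequency over the
    concurrent (untreated) control."""
    return treated_mn / control_mn

def is_clearly_positive(fold, threshold=2.0):
    """a more than 2-fold induction is treated here as a clearly
    positive call; the threshold is an assumption for illustration."""
    return fold > threshold

# hypothetical mn counts per 1000 cells:
flow_fold = mn_fold_increase(18, 10)   # 1.8-fold, below the cut-off
micro_fold = mn_fold_increase(45, 10)  # 4.5-fold, clearly positive
```

a readout like flow_fold would correspond to the minor increases seen for mms and b[a]p by flow cytometry, while micro_fold would correspond to a clear microscopic induction.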
background: the platinating agent cisplatin is commonly used in the therapy of various types of solid tumors, especially urogenital cancers. its anticancer efficacy largely depends on the formation of bivalent dna intrastrand crosslinks, which impair dna replication and transcription. these crosslinks stimulate mechanisms of the dna damage response (ddr), thereby triggering checkpoint activation, gene expression and cell death. the clinically most relevant adverse effect associated with cisplatin treatment is nephrotoxicity, which mainly results from damaged tubular cells. here, we analyzed the influence of the hmg-coa-reductase inhibitor lovastatin on the cisplatin-induced geno- and cytotoxicity in the rat renal proximal tubular epithelial cell line nrk-52e. methods: cell viability was determined using the alamar blue assay as well as by electrical impedance measurements via the icelligence system. alterations in cell cycle progression were assayed by flow cytometric analysis. the formation of pt-(gpg) intrastrand crosslinks was determined via southwestern blot. the amount of dna double-strand breaks (dsbs) was quantified by measuring the level of s139-phosphorylated h2ax (γh2ax) via immunocytochemistry as well as by western blot. additionally, neutral and alkaline comet assays were performed to determine the amounts of dna single- and double-strand breaks. mechanisms of the ddr were analyzed by western blot as well as by quantitative real-time pcr. results: the data show that pretreatment of nrk-52e cells with a subtoxic dose of lovastatin reduced the cytotoxicity evoked by high doses of cisplatin by protecting against cisplatin-stimulated apoptotic cell death. moreover, lovastatin had extensive inhibitory effects on the cisplatin-induced ddr, as reflected by the levels of p-atm, p-p53, p-chk1, p-chk2 and p-kap1. furthermore, activation of mitogen-activated protein kinases (mapks) was also reduced.
the lovastatin-mediated mitigation of the cisplatin-induced ddr was independent of the initial formation of dsbs as well as of pt-(gpg) intrastrand crosslinks. lovastatin protects nrk-52e cells from cisplatin-induced cytotoxicity by interfering with proapoptotic mechanisms of the ddr independently of initial dna damage formation. with respect to the clinic, the data indicate that lovastatin might be useful to mitigate cisplatin-induced nephrotoxicity. the influence of the oxidant tert-butylhydroquinone (tbhq) on endothelial cell migration in wrn-deficient cells k. laarmann 1 , g. fritz 1 1 institut für toxikologie, düsseldorf, germany introduction: wrn is a dna helicase and possesses a 3´-5´ exonuclease and atpase activity as well as a single-strand annealing activity. it is involved in dna repair by interacting with proteins of base excision repair (ber) and nucleotide excision repair (ner). defects of wrn are marked by genome instability, which, in turn, is caused by defects in dna damage repair. patients with a mutation in the wrn gene show premature aging and early mortality. the latter is mainly caused by arteriosclerosis. furthermore, wrn participates in the regulation of genotoxic stress responses stimulated by reactive oxygen species (ros) and alkylating agents. the aim of this study was (i) to investigate whether endothelial cell migration and adhesion were affected by subtoxic (ic 10 ) and moderately toxic (ic 40 ) concentrations of the oxidant tert-butylhydroquinone (tbhq) and (ii) whether wrn influences migration and adhesion in the presence or absence of tbhq. methods: endothelial-like ea.hy926 cells were treated with different concentrations of the redox-cycling and thus ros-producing oxidant tbhq. viability was measured by the alamar blue assay. the ic 10 and ic 40 were determined after 24 h of permanent treatment. to investigate the influence of wrn on endothelial cell migration and adhesion, a wrn knock-down was performed in ea.hy926 cells using rna interference.
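how ic 10 and ic 40 values can be read off a viability dose-response is sketched below by linear interpolation between measured points; the dose-response values are invented so that the result matches the 100 µm and 160 µm reported for ea.hy926 cells, and a real determination would typically use a fitted curve:

```python
def ic_value(concs, viability_pct, target_pct):
    """concentration at which viability falls to target_pct, found by
    linear interpolation on a monotonically decreasing dose-response."""
    points = list(zip(concs, viability_pct))
    for (c1, v1), (c2, v2) in zip(points, points[1:]):
        if v1 >= target_pct >= v2:
            frac = (v1 - target_pct) / (v1 - v2)  # position between points
            return c1 + frac * (c2 - c1)
    raise ValueError("target viability not bracketed by the data")

# hypothetical alamar blue readout for tbhq: concentration (µm) vs. % viability
concs = [0, 50, 100, 150, 200]
viab = [100, 95, 90, 63, 48]
ic10 = ic_value(concs, viab, 90)  # 90 % viability remaining -> ic10
ic40 = ic_value(concs, viab, 60)  # 60 % viability remaining -> ic40
```

the ic 10 is the concentration reducing viability by 10 %, the ic 40 by 40 %; with these invented data the interpolation returns 100 µm and 160 µm, respectively.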
to measure migration, a confluent cell monolayer was scratched using a pipette tip 24 h after the start of permanent tbhq treatment. pictures were taken at 0 h, 4 h and 8 h after performing the scratch, and the non-closed area was measured. in a second part, adhesion of the calcein-labeled colon adenocarcinoma cell line ht-29 to the ea.hy926 monolayer was investigated. wrn-deficient or non-deficient cells were treated with 100 µm or 160 µm tbhq or with tnfα. results: for ea.hy926 cells, 100 µm and 160 µm tbhq were determined as the ic 10 and ic 40 , respectively. in the migration assay, ea.hy926 cells showed 75% gap closure after 8 h, whereas wrn-deficient cells showed a closure of only 35%. the gap was closed by 65% and 55% after treatment with 100 µm and 160 µm tbhq, respectively. in wrn-deficient cells no remarkable effect on migration was observed after 100 µm tbhq treatment, whereas treatment with 160 µm tbhq showed a slight decrease in migration of about 10% compared to untreated wrn-deficient cells. no effect on adhesion was observed after tnfα treatment. after 160 µm tbhq treatment a slight increase in adhesion was detected in ea.hy926 cells. the influence of the moderate tbhq concentration on adhesion was reduced in the absence of wrn. conclusion: wrn influences endothelial cell migration. in contrast to wild-type ea.hy926 cells, no significant effect of tbhq on migration was observed in wrn-deficient cells. furthermore, the moderately toxic concentration of tbhq slightly increased ht-29 adhesion to ea.hy926 cells, which was not found in wrn-deficient cells. outlook: in forthcoming studies we will analyse the effect of alkylating agents on migration and adhesion. data will be presented and discussed. the aim of the present work was to compare the sensitivity of leukemia cell lines (hl60, jurkat and tk6) and hematopoietic stem cells with regard to the response to genotoxic agents. chromosomal damage was analyzed by evaluation of the micronucleus frequency.
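the gap-closure percentages quoted above follow directly from the measured non-closed areas; a minimal sketch with invented area values:

```python
def gap_closure_pct(area_0h, area_t):
    """percent of the initial scratch area that has closed by time t."""
    return 100.0 * (area_0h - area_t) / area_0h

# invented areas (arbitrary units): a wound shrinking from 1.0 to 0.25
# after 8 h corresponds to the 75 % closure reported for untreated
# ea.hy926 cells; 1.0 to 0.65 corresponds to the 35 % closure of the
# wrn knock-down.
closure_wt = gap_closure_pct(1.0, 0.25)  # untreated wild type
closure_kd = gap_closure_pct(1.0, 0.65)  # wrn knock-down
```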
furthermore, changes in the proliferation index and the frequencies of apoptotic and mitotic cells were assessed. several cytostatic drugs with different mechanisms of action were used as genotoxic agents. doxorubicin was used as an intercalator, radical producer and topoisomerase ii inhibitor. in addition, the effects of vinblastine, a mitosis-inhibiting drug, and of methyl methanesulfonate, which forms dna adducts and stalls replication forks, were analyzed. in general, a difference in sensitivity between the different substances was observed. with regard to the formation of micronuclei after treatment with doxorubicin, jurkat and tk6 cells showed similar increasing trends, whereas hl60 cells showed a much higher increase in micronucleus frequency. a clear decrease in proliferation and in the frequency of mitotic cells was observed at the highest concentration investigated (100 nm doxorubicin), and only a slight increase in the number of apoptotic cells could be shown. the biggest differences in the formation of micronuclei were detected after treatment with vinblastine. hl60 cells showed only a slight increase in micronuclei, but the effect on jurkat cells was stronger. the highest micronucleus frequency after vinblastine treatment was detectable for the tk6 cells. the results for the highest investigated concentrations (31.6 nm and 100 nm vinblastine) showed a significant reduction of the proliferation index. this effect is reflected by the increasing numbers of apoptotic cells in all cell lines. the results for methyl methanesulfonate demonstrated only a small increase in micronucleus formation for the jurkat cells, but higher values for the tk6 cells. in contrast, the hl60 cells did not show a concentration-dependent effect with methyl methanesulfonate. these results are complemented by preliminary findings in hematopoietic stem cells at selected compound concentrations.
the different results between the leukemia cell lines and the stem cells might originate from the different p53 status of hl60 (null), jurkat (multiple mutations), tk6 (wild type) and hematopoietic stem cells (wild type). this difference might also cause differences in cell cycle control or repair mechanisms, and needs further investigation. hyperinsulinemia is thought to enhance cancer risk. a possible mechanism is the induction of oxidative stress and dna damage by insulin. here, the effect of a combination of metformin with insulin was investigated in vitro and in vivo. the rationale for this was the reported antioxidative properties of metformin and the aim to gain further insights into the mechanisms responsible for protecting the genome from insulin-mediated oxidative stress and damage. the comet assay, the micronucleus frequency test and a mammalian gene mutation assay were used to evaluate the dna damage produced by insulin alone or in combination with metformin. for the analysis of antioxidant activity, oxidative stress and mitochondrial disturbances, the cell-free frap assay, the superoxide-sensitive dye dihydroethidium and the mitochondrial membrane potential-sensitive dye jc-1 were applied. accumulation of p53 and pakt was analysed. as an in vivo model, hyperinsulinemic zucker diabetic fatty rats, additionally exposed to insulin during a hyperinsulinemic euglycemic clamp, were treated with metformin. in the rat kidney samples, dhe staining, p53 and pakt analysis, and quantification of the oxidized dna base 8-oxodg were performed. metformin did not show intrinsic antioxidant activity in the cell-free assay, but protected cultured cells from insulin-mediated oxidative stress, dna damage and mutation. treatment of the rats with metformin protected their kidneys from oxidative stress and genomic damage induced by hyperinsulinemia. metformin may protect patients from genomic damage induced by elevated insulin levels.
this may support efforts to reduce the elevated cancer risk that is associated with hyperinsulinemia. the human skin is the primary barrier against environmental and chemical impacts. as such it shields us against a plethora of xenobiotics such as potentially carcinogenic polycyclic aromatic hydrocarbons (pahs). at the same time it is the second most densely populated organ, harbouring more than 1000 bacterial species and population densities of up to 10⁷ cfu per cm². yet little is known about this microbiome's potential to metabolise and toxify pahs such as benzo[a]pyrene (b[a]p). previous work at the bfr showed that degradation of b[a]p and other pahs is a universal feature of the skin's microbiome (sowada et al., 2014). the corresponding metabolites only partly overlap with those known from eukaryotic metabolism and possess cytotoxic as well as genotoxic properties. excretion of these metabolites leads to exposure times of 20-30 hours or longer for full and partial metabolisers, respectively. while in vitro studies show the corresponding substances to exert their effects synergistically, an assessment of their potential impact on human carcinogenesis is pending. one obvious mode of action would be direct genotoxicity. however, another option is interference with uv-damage repair. ultraviolet radiation (uvr) from sunlight is regarded as the main causative factor for the induction of skin cancer. it induces two of the most abundant mutagenic and cytotoxic dna lesions, namely cyclobutane pyrimidine dimers (cpds) and 6-4 photoproducts (6-4pps). these lesions are repaired primarily by nucleotide excision repair (ner), a system that is also responsible for the removal of pah-derived dna adducts. we therefore wanted to know whether and to what extent bacterial b[a]p metabolites have the capacity to interfere with ner, potentially contributing to uv-induced dna damage.
to investigate this, selected genotoxic metabolites were examined for their potential to affect the dna repair capacity of skin cells (hacat). following treatment with uva/b and bacterial b[a]p metabolites, the skin's repair capacity was assessed using a modified comet assay. ionizing radiation (ir) is a well-established model to induce dna double-strand breaks (dna dsbs), but it also generates a broad range of other dna lesions including dna single-strand breaks as well as oxidative dna base modifications. furthermore, ir is able to modify membrane components and triggers the activation of the epidermal growth factor receptor. a more specific dsb inducer is the cytolethal distending toxin (cdt), which is produced by a variety of gram-negative bacteria and harbours an intrinsic dnase-like endonuclease activity [1]. dsbs are potent cytotoxic lesions and promote genomic instability, e.g. by the formation of chromosomal aberrations. a cellular mechanism to prevent genomic instability and maintain cell homeostasis could be autophagy. this process is highly regulated, involving the lysosomal degradation of damaged organelles and proteins. here, we studied autophagy induction following dsb generation in human colorectal cancer cells as well as in primary human colonic epithelial cells (hcec) and analyzed the regulatory mechanisms. first, the autophagy-specific marker lc3b was shown to increase in a dose- and time-dependent manner after treatment with both cdt and ir, as assessed by confocal immunofluorescence microscopy and western blot analysis in hct116 cells. similar results were obtained in sw48 and hcec cells via western blot. these findings are in agreement with the enhanced formation of autophagosomes and the dose-dependent decrease of the autophagy substrate p62, as observed by flow cytometry and western blot analysis in hct116, sw48 and hcec cells. cdt- and ir-induced autophagy rates in hct116 cells increased over time, correlating well with the dsb induction.
importantly, a dnase i-defective mutant of cdt neither caused dsbs nor induced autophagy. additionally, a time-dependent accumulation of the lysosome-associated membrane protein 1 (lamp-1) was observed by confocal immunofluorescence microscopy. dsb-induced autophagy was blocked by chemical inhibitors. next, we showed that both ir and, to a lesser degree, cdt induce the phosphorylation of akt at ser473. pharmacological inhibition of akt in hct116 cells enhanced the cdt- and ir-induced autophagy, as shown by the accumulation of lc3b and lamp-1 after 48 h and increased autophagosome formation. upregulation of dsb-induced autophagy by akt inhibition resulted in decreased cytotoxicity after 72 h and significantly lower apoptosis/necrosis rates after 48 h, as determined by the mts cell viability assay and annexin-v/pi staining. ongoing studies will evaluate the impact of other dna damage response pathways and the potential protective role of autophagy against genomic instability. mustard agents are potent dna alkylating agents. among them, the bifunctional agent sulfur mustard (sm) was used as a chemical warfare agent due to its vesicant properties. although the use of sm in warfare has been banned in most countries of the world, its use in terrorist attacks or asymmetrical conflicts, such as the syrian civil war, still represents a realistic and significant threat. on the other hand, nitrogen mustards in particular, such as cyclophosphamide or melphalan, have been used as chemotherapeutic agents due to their cytostatic properties. thus, mustard-induced dna damage, in particular dna crosslinks, can trigger complex pathological states, as observed in sulfur mustard-exposed victims, but on the other hand also leads to the chemotherapeutic effects of clinically used nitrogen mustards.
mass spectrometric monitoring and quantitation of mustard-induced dna adducts can help to unambiguously identify and verify sm-exposed victims and to monitor the efficiency, as well as potential side effects, of mustard-based chemotherapy. up to now, the verification of mustard-induced nucleic acid damage has mainly been based on immunohistochemical methods, which have several drawbacks such as limited specificity and sensitivity and a low dynamic range of quantitation. with this project, we aim to develop an (hplc/uplc)-ms/ms-based platform for the quantitation of the most common mustard-induced dna adducts, including bis(n7-guanine-ethyl) sulfide dna crosslinks. to date, we have established methods for the quantitation of several common dna adducts induced by the mono-functional sulfur mustard derivative 2-chloroethyl ethyl sulfide ("half mustard", cees). for this purpose, purification protocols, chromatographic conditions and mass spectrometric settings were developed to detect n7-ethylthioethyl-2´-deoxyguanosine (n7-ete-dg) and n3-ethylthioethyl-2´-deoxyadenosine (n3-ete-da) and their thermal hydrolysis products n7-ethylthioethyl-guanine (n7-ete-gua) and n3-ethylthioethyl-adenine (n3-ete-ade), respectively, and the sensitivity was compared to that of immunohistochemical methods. in addition, non-radioactive isotope-labelled standards are being synthesized, which will be spiked into samples to account for technical variability during sample work-up and to improve ms-based quantitation. this procedure requires minimal cellular material and therefore should be transferable to the quantitation of dna adducts in human blood samples. this will allow monitoring of dna adducts as biomarkers of exposure in potential sm-exposed victims as well as in mustard-based chemotherapy. this method also sets a basis for investigating specific mustard-induced dna repair mechanisms and their cellular consequences. the γh2ax assay vs.
comet assay for genotoxicity testing universitätsmedizin mainz, institut für toxikologie, mainz, germany dna damage leads to activation of the cellular dna damage response (ddr). this signalling network results in the activation of various dna repair proteins and chromatin structure modulators. a frequent manifestation of the ddr is the phosphorylation of histone h2ax (γh2ax), which can be visualised as γh2ax foci by immunocytochemistry. in the present study, we tried to assess whether γh2ax is a reliable biomarker for detecting the cellular response to dna damage. we selected 14 well-characterised genotoxic compounds and compared them with 10 non-genotoxic chemicals in the well-characterised cho cell system. we measured γh2ax quantitatively by manual and automatic scoring of γh2ax foci, and by flow cytometric counting of γh2ax-positive cells. the cytotoxicity dose-response was determined by the mtt cell proliferation/viability assay. we show that a) all genotoxic agents were able to induce γh2ax dose-dependently in the cytotoxic range, whereas no induction was observed after treatment with non-genotoxicants; b) manual and automated scoring of γh2ax foci gave similar results, with the automated scoring being faster and more reproducible; c) data obtained by foci counting and facs analysis of γh2ax-positive cells showed a significant correlation. furthermore, we compared the dna damage induced by 4 selected genotoxins at several time points using the alkaline and neutral comet assays. significant correlation with the alkaline and neutral comet assays was observed for some but not all genotoxins and, predominantly, at earlier time points. we suggest that comet assays mainly detect primary dna damage, whereas the γh2ax assay detects a specific response to dna damage which can persist longer. the γh2ax foci and flow cytometry assays allow a rapid and reliable determination of genetic damage in mammalian cells and can be used as additional genotoxicity assays.
available in vitro methods for investigating the genotoxic potential of drugs fall short in throughput, specificity and mode-of-action information. a set of mechanistic biomarkers for clastogenic, aneugenic or apoptotic effects may help to overcome these limitations. thus, a staining assay amenable to flow cytometric analysis is being developed by litron laboratories, rochester, ny, supported by international collaborators. the experimental design of this assay consists of 3 stages. the objective of this work is the evaluation of this assay in the laboratories of bayer pharma ag. the biomarkers covered by the assay are associated with dna damage response pathways that have potential for class discrimination (clastogen/aneugen/cytotoxicant) of in vitro genotoxicants: dna double-strand breaks (γh2ax), nuclear division (phospho-h3, dna content) and apoptosis (cleaved parp). based on the pilot work at litron laboratories, tk6 cells were introduced into the genetic toxicology laboratories of bayer pharma ag. cells were exposed for 4 and 24 h, in triplicate on a 96-microwell plate, to one reference clastogen (etoposide, eto), one aneugen (vinblastine, vb) and one cytotoxicant (carbonyl cyanide 3-chlorophenylhydrazone, cccp). after staining, the samples were analyzed with the bd accuri c6 flow cytometer (bd biosciences, heidelberg, germany). the reference substances yielded the responses expected from the pilot study at litron laboratories: vb showed distinct increases of phospho-h3 events at 4 and 24 h and polyploidy at the 24 h time point. eto induced a clear increase of γh2ax with a simultaneous reduction of phospho-h3 at 4 and 24 h. finally, cccp caused a reduction of phospho-h3 events, increased cleaved parp events and did not influence γh2ax. moreover, benchmarking experiments under pilot work conditions were performed with high-content imaging analysis. we compared the γh2ax and phospho-h3 pilot study results as well as cleaved parp with caspase 3/7.
in addition, the tunel assay (click-it ® tunel alexa fluor, thermofisher) was executed to benchmark cleaved parp. the benchmarking results support the selected biomarkers of the multiplexed assay. in stage 2, additional reference compounds (three aneugens/clastogens/cytotoxicants) were investigated. so far, the chosen biomarkers of the dna damage response appear useful for class discrimination and provide additional information to existing genotoxicity tests. cell-cell contacts are involved in keeping a physiological balance between proliferation, differentiation and apoptosis. far less is known about the role of cell-cell contacts in regulating necrosis, for instance in response to oxidative stress. previous findings of our group demonstrated that, in contrast to semi-confluent proliferating cultures, confluent murine fibroblasts (nih3t3, mef) and human keratinocytes (hacat) are protected against necrosis induced by tert-butyl hydroperoxide (t-booh). comparison of confluent cells (g0/g1 = ~70 %) and semi-confluent cultures, similarly arrested in the g0/g1 phase by serum starvation or the mek inhibitor u0126, ascertained that the resistance against t-booh is mediated by cell-cell contacts and not by cell cycle arrest. we further revealed that confluent cultures are protected against t-booh-induced dna double-strand breaks, as assessed by the neutral comet assay, and against mitochondrial damage, detected by flow cytometric analysis of dioc6 staining. to better understand the protective role of cell-cell contacts in ros-mediated necrosis, we started characterizing the signaling cascade induced by t-booh in semi-confluent proliferating cultures. in accordance with the observed formation of dna double-strand breaks in response to t-booh, we detected phosphorylation of the checkpoint kinase chk2. however, inhibition of atm, the kinase responsible for chk2 activation, did not influence t-booh-induced cell death.
interestingly, first experiments hinted at the participation of rip1, since the chemical rip1 kinase inhibitor necrostatin-1 (nec-1) blocked cell death by on average up to 50 %, which is described as a specific marker for regulated necrosis. in line with this observation, t-booh-induced cell death could not be blocked by the pan-caspase inhibitor z-vad-fmk, strongly indicating that caspase activity is not required. moreover, parp-1 and p53 are probably not involved. deeper analyses provided evidence that nec-1 blocked neither the formation of dna double strand breaks nor mitochondrial damage, indicating that the kinase blocked by nec-1, possibly rip1, acts downstream of dna double strand breaks and/or mitochondrial damage. finally, we identified a crucial role of ca2+ signaling for t-booh-mediated toxicity. as the calcium chelator bapta-am was able to completely block not only cell death, but also mitochondrial damage and dna double-strand break formation, there is a strong need for further investigations of the possible interplay between regulated necrosis and calcium, regulated by cell-cell contacts under oxidative stress. the work was supported by the hoffmann-klose-stiftung, the promotionsförderung rheinland-pfalz, the johannes gutenberg-university and the university medical center of the johannes gutenberg-university.

the mammalian target of rapamycin (mtor) forms two multiprotein complexes (mtorc1 and mtorc2) and influences cell growth, proliferation, survival and metabolism. constitutively activated mtor was found to be deregulated in several cancer types, which makes it an interesting target for therapeutic cancer strategies. rapamycin is able to inhibit mtor and its downstream targets and is currently studied for its anticancer properties in clinical trials. despite previous evidence, there are studies that show an adverse effect in cancer treatment causing tumour growth, raising the question of the effectiveness of the drug in cancer treatment.
therefore, we examined the transformational potential of rapamycin in a balb/c cell transformation assay (cta) as well as markers of proliferation and protein synthesis. the balb/c 3t3 cell transformation assay mimics different stages of in vivo carcinogenicity (initiation, promotion, post-promotion phase) and is a promising alternative to rodent bioassays. balb/c fibroblasts are treated for 3 days with the tumour initiator mca (3-methylcholanthrene) followed by 13 days with the promoter tpa (12-o-tetradecanoylphorbol-13-acetate). upon treatment with these chemicals, cells are transformed and form morphologically aberrant foci, which can be visualized after six weeks by giemsa staining. it is possible to apply additional substances during the whole assay or in several phases of transformation and evaluate the colony formation. furthermore, our improved protocol allows additional western blot or immunofluorescence analysis. the influence of different concentrations of rapamycin on cell proliferation was investigated by cell counting (living and dead) to choose a suitable concentration for the cta. balb/c ctas performed with 10 nm rapamycin showed, contrary to expectations, an increase in cell transformation. by administration of rapamycin only in the promotion phase we could detect an increase in colony formation, whereas a treatment with rapamycin in the post-promotion phase, with already established foci, seemed to reveal its therapeutic properties. to better understand the role of mtor in our cell transformation system we used another mtor inhibitor called osi-027. surprisingly, an incubation with 3 µm osi-027 led to a decrease in colony formation. we are now able to investigate the underlying mechanism with western blot and immunofluorescence analysis and can compare regulations of downstream targets like the marker of protein synthesis p-s6.
our investigations revealed different cell transformation outcomes by comparing the two known mtor inhibitors rapamycin and osi-027, which need to be further evaluated. in the ongoing project we want to detect differences between rapamycin and osi-027 by protein analysis and identify key proteins which are involved in this opposed colony formation of the balb/c cells. these results can be helpful to better understand mtor inhibition in matters of tumour therapy.

introduction: over the past 50 years, the biguanide compound metformin has been widely prescribed as an insulin sensitizer in type 2 diabetes mellitus. interestingly, recent meta-analyses of epidemiological studies have shown that metformin might be involved in risk reduction of carcinogenesis. in vitro studies have described amp-activated protein kinase (ampk)-dependent actions of metformin, mediated by inhibition of respiratory chain complex i, as well as ampk-independent actions. however, the detailed molecular mechanisms by which metformin affects cell proliferation and carcinogenesis have not been well identified up to now. method: to evaluate the protective potential of metformin, balb/c 3t3 cell transformation assays were performed. this valid toxicological method is an alternative to in vivo carcinogenic testing and mimics the different stages of cell transformation during carcinogenesis. in detail, mouse fibroblasts are treated with metformin and/or the tumour initiator 3-methylcholanthrene (mca) and the tumour promoter 12-o-tetradecanoylphorbol-13-acetate (tpa). in the first experiment, several metformin concentrations (0.1-1 mm) were applied to determine an effective metformin concentration. next, metformin treatment during the different phases of carcinogenesis (initiation, promotion, post-promotion phase) was performed to determine the most effective phase for an intervention, i.e. chemopreventive or chemotherapeutic properties of metformin.
additionally, the effect of metformin on the energy metabolism of the cells was analysed using various methods like immunoblotting and oxygen measurement by clark electrode. results/discussion: analysis of different metformin concentrations revealed a concentration-dependent effect of metformin. in detail, decreased colony forming potential of balb/c cells was most prominent using 1 mm metformin. this effect was not caused by growth inhibition of metformin itself, since 1 mm metformin showed no growth inhibitory properties in a preliminary cell growth experiment. interestingly, the two-phase cell transformation assay showed that the metformin effect is more pronounced in the post-promotion phase than in the initiation and promotion phase, pointing to a chemotherapeutic potential. investigating several energy metabolism parameters, the results indicate that metformin may affect cell respiration as well as energy-dependent mechanistic markers like ampk. the presented results support the idea of a chemotherapeutic rather than a chemopreventive potential of metformin at 1 mm. the initial analysis of energy metabolism markers discovered interesting starting points for further investigations.

johannes gutenberg university, institute of toxicology, 55131 mainz, germany

nvp, widely used e.g. as a monomer for polyvinylpyrrolidones (pvp) with applications in food technology or cosmetics, is a known hepatocarcinogen in rats after inhalative exposure to 5, 10, and 20 ppm for 2 years. nvp has been tested in a battery of genotoxicity assays (e.g. ames, hprt, mouse lymphoma, uds, chromosome aberration, cell transformation, micronucleus test (mnt) in mouse bone marrow [1]) that all yielded negative results. however, nvp induces cell proliferation in liver (loaec: 0.5 ppm) after whole-body exposure to vapor [2].
to confirm the absence of genotoxicity in the context of a potentially non-genotoxic mode of action, a five-day whole-body inhalation study with nvp vapor at concentrations of 0, 5, 10, 20 ppm was conducted in wistar rats (six animals per sex and group, ethyl methanesulfonate 200 mg/kg bw p.o. as positive control). genotoxicity was investigated by the mnt in bone marrow and the comet assay (± fpg) in liver and lung. further investigated endpoints, related to a possible non-genotoxic mode of action in liver, were: enzyme induction (erod, prod, brod), oxidative stress (gsh, gssg and non-protein sulfhydryl group levels), and peroxisome proliferation (cyp4a, cyanide-insensitive palmitoyl-coa-oxidase). at carcinogenic inhalative doses, the results of this study confirmed the absence of genotoxicity in lung, liver and bone marrow, as neither the tail intensity in the comet assay nor the number of micronuclei in the mnt was increased compared to the controls. however, the non-genotoxic parameters (cyp enzyme activity, glutathione levels, cyanide-insensitive palmitoyl-coa-oxidase) were also not affected by nvp treatment. as potential metabolic activation cannot be excluded and may essentially contribute to the understanding of the carcinogenic mechanism, in vitro investigations in rat liver systems (subcellular fractions, hepatocytes, precision-cut liver slices (pcls)) were additionally performed. up to now, 2-pyrrolidone is the only identified in vitro metabolite. as these results cannot mimic the in vivo situation, where two unidentified ring- and vinyl-moiety containing metabolites have been described [3], detailed investigations on metabolism may be a future perspective to approach the overall understanding of the carcinogenic mechanism of nvp.

introduction: in ischemic conditions such as wound healing and myocardial infarction, new vessels are generated by vasculogenesis and angiogenesis.
these processes are stimulated by the signalling peptide vascular endothelial growth factor, which has therefore been proposed as a promising compound for the treatment of ischemic conditions. however, results of respective clinical studies have not been fully convincing yet. here, we investigated principles underlying the self-organization of newly formed vessels into functionally adequate microvascular networks indispensable for proper tissue substrate supply. intravital microscopy of the chick chorioallantoic membrane (cam), a non-animal model as defined by the us national institutes of health's office for protection from research risks, was used to study peripheral expansion of existing arteriolar and venular trees by recruiting segments of the dense polygonal capillary mesh. we call this process "emerging angiogenesis". methods: white leghorn chicken eggs were placed in incubators on embryonic day 0 (e0) at 37.5°c and 82% humidity. on e3, the eggs were cracked open and transferred into petri dishes. on e10, cam microcirculation was recorded using time-lapse intravital videomicroscopy at discrete time points for up to 48 hours. to improve the visibility of the capillary mesh, video recordings were processed offline by generating coefficient-of-variation images of pixel grey values over time. changes of network topology during the observation time were investigated. results: in the cam, a sequence of specific events leading to extension of existing vessel trees was observed: in a capillary mesh region near terminal branches of existing vessel trees, a homogeneous flow distribution shifts to an inhomogeneous one: preferred flow pathways through the mesh evolve, carrying most of the blood. over time, these flow pathways exhibit diameter increase, straighten and connect the mesh to arteriolar and venular trees.
in contrast, less perfused parallel mesh flow pathways and transversal mesh segments exhibit a progressive decrease of flow and diameter, resulting in vessel regression. as a result, hierarchical vessel tree structures are extended into the mesh region. while newly generated tree extensions are located above the mesh at the beginning, they sink to a lower level at later stages until they are finally covered by a reconstituted mesh network. the cam ex ovo model is well suited for studying emerging angiogenesis. vessel tree extension occurs via parallel processes of vessel maturation and capillary mesh segment regression. at later stages, newly formed vessel tree branches sink and the capillary mesh is reconstituted above. in the next step of our project, we will implement these phenomena in a computer simulation and use theoretical modeling to further investigate and better understand principles underlying microvascular network maturation. this will allow us to derive effective therapeutic strategies which could be tested in the cam model.

chemicals are able to induce cancer in a wide range of organs. therefore, it is very important to investigate the toxic properties of chemical substances, especially their carcinogenic potential. in this context the number of animal experiments will drastically increase in the future. in order to avoid the use of expensive and time-consuming animal experiments for long-term carcinogenic studies, the development of an in vitro system to test the carcinogenic potential of a high number of chemicals in a highly reproducible manner within a short period of time is imperative. by combining the well-established balb/c cell transformation method with the soft agar colony formation assay, we developed a high-throughput in vitro system to identify effects of chemicals on cell transformation for the first time.
balb/c mouse fibroblasts are treated with 3-methylcholanthrene as a tumour initiator and 12-o-tetradecanoylphorbol-13-acetate as promoter for several days, whereby foci of transformed cells develop. after the promotion phase of the common balb/c cell transformation assay, cells are transferred into soft agar to further monitor the anchorage-independent growth of transformed cells only. the established soft agar transformation assay reproduces the foci growth of previous experiments and is performed in 96-well plate format. hence, we can analyse the carcinogenic potential of several chemical substances in parallel and are also searching for alternative endpoint analyses, e.g. the use of fluorescent cells stably expressing irfp instead of the former time-consuming microscopic assessment. the new technique presented here is a high-throughput and low-priced alternative for the evaluation of the carcinogenic potential of chemical substances in a short period of time without animal testing.

the effort to develop new or to refine established in vitro test systems is rising due to animal welfare, scientific and/or regulatory reasons (e.g. the animal testing ban concerning the risk assessment of cosmetic product ingredients in march 2013). this progress, among others, leads to an increased performance of cell-based assays. the majority of model cell lines are routinely cultured using medium supplemented with fetal bovine serum (fbs) in amounts between 5-10%. the application of serum substitutes will provide a reduction of the number of animals needed, which corresponds to the guiding principles of the three r's (3r) described by russell and burch in 1959. in addition, chemically defined serum substitutes have the potential to reduce the inter-experimental variability of test conditions caused by the inherent differences in chemical composition across fbs batches [1], resulting in a refinement of in vitro testing.
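growth comparability between fbs-supplemented and serum-free cultures is usually judged via the population doubling time during exponential growth. a minimal sketch of the standard calculation (the cell counts below are hypothetical, not data from the study):

```python
import math

def doubling_time_h(n0, n1, elapsed_h):
    """Population doubling time from two counts during exponential growth:
    t_d = t * ln(2) / ln(N1 / N0)."""
    return elapsed_h * math.log(2) / math.log(n1 / n0)

# hypothetical counts: 1e5 cells growing to 8e5 cells over 40.5 h
print(round(doubling_time_h(1e5, 8e5, 40.5), 1))  # -> 13.5
```

in practice the doubling time would be fitted over several count points rather than just two, which also yields the reported standard deviations.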
in this study, human tk6 cells were gradually adapted to serum-free conditions, where they show comparable growth gradients in the exponential phase. for cells under serum-free conditions a mean doubling time of 19.3 (± 1.7) h was observed, while fbs-supplemented cells showed a doubling time of 13.5 (± 0.8) h.

several non-animal test methods addressing key events in the sensitization process have passed formal validation, and oecd (draft) test guidelines are available. one of these methods is the direct peptide reactivity assay (dpra), assessing the ability of a chemical to bind to proteins to form a complete antigen (oecd tg 442c). the test is used to obtain a yes/no answer on whether the substance has a protein-binding potential. for a complete risk assessment, however, an estimation of a chemical's potency is also needed. in this study we examined if an assessment of potency could be achieved by 1) determining reactivity class cut-offs based on published data on 199 substances for the dpra performed according to oecd 442c to predict un ghs sensitizer classes, 2) a variant of the dpra assessing reaction kinetics (time and concentration) for 12 substances or 3) an extended protocol testing several test substance concentrations for 50 reference substances and estimating the concentration of a test substance that is needed to cause a peptide depletion of 6.38% (ec6.38%). results of the first approach indicated that cut-offs to differentiate the un ghs sensitizer classes 1a and 1b could indeed be defined. secondly, evaluating the reaction-time-based assay, in which several time points between 5 min and 24 hours were assessed, it was found that not all reactions followed ideal kinetics. hence, further investigations are needed to eventually derive a reaction-time-based prediction model. the results of the third approach (the standard protocol of the dpra was amended by testing three concentrations, i.e.
1, 10, and 100 mm) indicated that potency classes could be assigned using the ec6.38% value to assess potency. in summary, using quantitative information derived from the dpra, in particular the ec6.38% value, may support the assessment of the skin sensitizing potency.

identification of pre- and pro-haptens with non-animal test methods for skin sensitization

since pro-haptens may be metabolically activated in the skin, information on xenobiotic metabolizing enzyme (xme) activities in cell lines used for testing of sensitization in vitro is of special interest. metabolic activity of e.g. n-acetyltransferase 1 (nat1) and esterase in the keratinocyte (keratinosens™ and lusens) and dendritic cell-like cells (u937 and thp-1) was previously demonstrated. aldehyde dehydrogenase (aldh) activities were found in keratinosens™ and lusens cells. activities of the investigated cytochrome p450-dependent alkylresorufin o-dealkylases, flavin-containing monooxygenase, alcohol dehydrogenase as well as udp-glucuronosyl transferase were below detection in all investigated cell lines. a set of 27 putative pre- and pro-haptens (no obvious structural alert for peptide reactivity but positive in vivo) was routinely tested using the above-mentioned cell lines as well as in the direct peptide reactivity assay (dpra). 18 of the compounds were unexpectedly positive in the dpra and were further analyzed by lc/ms techniques to clarify the reaction mechanism leading to true positive results in this assay. oxidation products like dipeptide formations or the oxidation of the peptide-based sulfhydryl group led to positive results for benzo[a]pyrene or 5-amino-2-methylphenol, respectively. in contrast, covalent peptide adducts were identified for 12 putative pre-haptens, indicating the dpra to be suitable for compounds requiring abiotic oxidation to become activated. for some dpra negatives, the keratinocyte- and dendritic cell-based assays provided true positive results.
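the ec6.38% used in the extended dpra protocol above (the concentration producing 6.38% peptide depletion) could be estimated from the three tested concentrations by log-linear interpolation. a sketch under that assumption, with hypothetical depletion values rather than the study's measurements:

```python
import math

def ec_for_depletion(concs_mm, depletions_pct, target=6.38):
    """Concentration giving the target % peptide depletion, by log-linear
    interpolation between adjacent tested concentrations (assumes depletion
    increases monotonically with concentration)."""
    pts = list(zip(concs_mm, depletions_pct))
    for (c1, d1), (c2, d2) in zip(pts, pts[1:]):
        if d1 <= target <= d2:
            frac = (target - d1) / (d2 - d1)
            return 10 ** (math.log10(c1) + frac * (math.log10(c2) - math.log10(c1)))
    return None  # target depletion outside the tested range

# hypothetical depletions measured at 1, 10 and 100 mM
print(ec_for_depletion([1.0, 10.0, 100.0], [2.0, 6.38, 40.0]))  # -> 10.0
```

a nonlinear concentration-response fit would be more robust than piecewise interpolation, but the interpolation illustrates how a single potency-related concentration is read off multi-concentration depletion data.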
a combination of dpra, keratinosens™ and h-clat within a '2 out of 3' prediction model provided a high sensitivity of 81% for the set of pre-/pro-haptens. the sensitivity of this combination of non-animal test methods in the '2 out of 3' prediction model in a set of 95 direct haptens was comparable (sensitivity = 87%) when compared to the llna.

skin sensitization testing is mandatory for all substances produced or marketed in volumes larger than 1 tonne per year under the european reach legislation. with reach supporting in vivo testing only "as a last resort" and the marketing ban for finished cosmetic products with ingredients tested in animals, attention has been given to developing integrated testing strategies combining in vitro, in silico and in chemico methods. key challenges are which tests to select and how to combine non-animal methods into testing strategies. this study suggests a bayesian value of information (voi) approach for developing non-animal testing strategies, which considers information gains from testing, but also expected payoffs from adopting regulatory decisions on the use of a substance, and testing costs. the 'value' of testing is defined as the expected social net benefit from decision-making on the use of chemicals with additional, but uncertain, information from testing. the voi is calculated for a set of individual non-animal methods including the dpra, the oecd qsar toolbox, the are-nrf2 luciferase method covered by keratinosens and lusens, and the h-clat; seven battery combinations of these methods; and 86 two-test and 360 three-test sequential strategies consisting of non-animal methods. their voi is compared to the voi of the local lymph node assay (llna) as the animal test. we find that battery and sequential combinations of non-animal methods reveal a higher voi than the llna. in particular, for small prior beliefs (i.e. a chemical is, prior to testing, assumed to be a non-sensitiser), a battery of dpra + lusens reveals the highest voi.
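the voi logic described above (expected payoff of the best regulatory decision after seeing an uncertain test result, minus testing cost, minus the best expected payoff without testing) can be illustrated for a single binary test via a standard pre-posterior analysis. the sensitivities, payoffs and cost below are hypothetical placeholders, not the study's calibration:

```python
def voi(prior, sens, spec, payoff, cost):
    """Value of information of one binary test (pre-posterior analysis).
    payoff[(action, state)]: action in {'restrict','approve'},
    state in {'sens','non'} (sensitizer / non-sensitizer)."""
    def best_ev(p_sens):
        return max(
            p_sens * payoff[('restrict', 'sens')] + (1 - p_sens) * payoff[('restrict', 'non')],
            p_sens * payoff[('approve', 'sens')] + (1 - p_sens) * payoff[('approve', 'non')],
        )
    ev_no_test = best_ev(prior)
    ev_test = 0.0
    # two outcomes: positive (rate sens / false-positive rate 1-spec), negative
    for p_out_sens, p_out_non in ((sens, 1 - spec), (1 - sens, spec)):
        p_outcome = prior * p_out_sens + (1 - prior) * p_out_non
        if p_outcome > 0:
            posterior = prior * p_out_sens / p_outcome  # Bayes' rule
            ev_test += p_outcome * best_ev(posterior)
    return ev_test - cost - ev_no_test

# hypothetical social net benefits and a perfect, free test
payoff = {('restrict', 'sens'): 0.0, ('restrict', 'non'): -10.0,
          ('approve', 'sens'): -100.0, ('approve', 'non'): 10.0}
print(voi(0.25, 1.0, 1.0, payoff, 0.0))  # -> 15.0
```

extending this to batteries and sequential strategies means treating the joint outcome of several tests as the "signal" in the same pre-posterior calculation, which is how the 86 two-test and 360 three-test strategies can be ranked.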
if there are strong beliefs that a chemical is a sensitizer, a sequential combination of the battery dpra + lusens, followed by keratinosens + h-clat at the second stage and by the oecd qsar toolbox at the third stage, performs best. for given specifications of expected payoffs, the voi of the non-animal strategy significantly outperformed the voi of the llna for the entire range of prior beliefs. this underlines the strong economic potential of non-animal methods for skin sensitization assessment.

a chemical series to predict the proarrhythmic potential of drugs with low solubility for which no reliable purkinje fiber results could be obtained. these validation results showed that this cardiosafety in silico model can successfully be applied in r&d to predict the proarrhythmic potential of drug candidates within the model's applicability domain (ad).

introduction: the use of p-phenylenediamine (ppd) and derivatives (tab. 1) in oxidative consumer hair dye products is considered as key in hair dye allergic contact dermatitis [1] [2] [3] [4]. recently, 2-methoxymethyl-ppd (me+) has been shown to have significantly reduced sensitizing properties [5, 6]. since overcoming the skin barrier is a prerequisite for sensitization, numerous in vitro and in vivo studies on skin penetration of ppd and derivatives have been performed. the aim of the present study is the in silico prediction of the penetration of ppds, because such computations may help in understanding the processes involved in sensitization. for the first time, the software dskin [7] is challenged to simulate this class of compounds. in silico results are retrospectively compared to previously published experimental data and may assist in future tailoring of in vitro experiments. material and methods: the permeabilities, lag times and the time-dependent accumulated amounts of ppds were computed using dskin. input parameters for the latter were a concentration of 1 mg/ml (1%), finite dosing and 30 min in-use incubation periods.
molecular structures were optimized ab initio, and the condensed fukui functions (ff) were estimated from mulliken population analyses [8] and electrostatic potentials using gamess [9]. results: initial results agree with experimental results using ppd in white petrolatum, demonstrating the applicability of dskin to ppds. the four ppds exhibit only small differences in permeabilities in silico (tab. 1). toluene-2,5-diamine shows a higher accumulated mass due to increased lipophilicity (fig. 1). in general, the ff were very similar for all ppds and indicated that the n atoms would be the preferred targets for radical and electrophilic attack. discussion and outlook: in silico methods may be used to model the permeation of ppds despite their low molecular weight and low lipophilicity. the low amounts of ppds under in-use conditions result from the oxidative conditions. computed me+ permeation was not different from that of the other ppds; therefore, other properties account for the reduced sensitization potential. the very similar ff values hint at similar reaction pathways. furthermore, ppd and its derivatives are prone to n-acetylation in living skin, resulting in metabolites exhibiting higher molecular weight and greater lipophilicity than the parent compounds. the effects of n-acetylation and reactions of ppd and its derivatives with histidine and cysteine residues are the subject of upcoming computations.

dermal absorption is an important factor in regulatory science regarding the registration of chemicals, agrochemicals and cosmetics. the issue has gained importance since it has been realized that the skin is not completely impenetrable for chemical substances [1]. the different ways to assess dermal absorption range from qsar models to complex in vivo studies including a complete toxicokinetic examination.
the choice of method depends on the question that has to be answered, as different systems give different results: absorption as % of applied dose in in vivo studies, or permeability coefficient and lag time in infinite-dose in vitro studies [1]. ideally, both types of data would be available. since the oecd adopted a guideline for assessing dermal penetration in vitro in 2004, the number of in vitro studies has been rising continuously. depending on the chosen method, results may vary in reliability and in acceptance by regulatory authorities. the skinab database contains data on dermal absorption for about 600 substances, which were found through the echemportal [3] and extended with data from the edetox database [4]. for selected substances with a broad spectrum of data available, further analysis has now been started. 165 chemicals have been investigated in a comparable test system; of these, 79 were shown to have a low dermal absorption of less than 1% and 51 compounds showed a high absorption rate of more than 50%. for the assessment of dermal exposure, either the absorbed dose in percent or the flux can be measured. data analysis showed that only for 15 substances both are available: flux data from in vitro studies and absorption data from in vivo studies. these data could be used to clarify which parameter would be most useful for the assessment of dermal exposure. seven substances in the dataset were conspicuous for their range of absorption rates in different studies: less than 1% to more than 50%. an in-depth analysis revealed the complex influence that different exposure parameters have on the results of dermal absorption studies. for some chemicals, the influence of exposure time on increasing absorption values could be clearly demonstrated. besides other factors such as the chosen vehicle and the (non-)occlusion of the site of exposure, especially the choice of species introduced high variability; this holds even for the most common laboratory animals.
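the two parameters compared above relate through fick's first law: at steady state the flux is j = kp · c, and integrating that flux over exposure time and area gives an absorbed amount expressible as % of the applied dose. a sketch under the simplest assumptions (constant flux, no lag time, no donor depletion), with hypothetical numbers rather than values from the skinab database:

```python
def percent_absorbed(kp_cm_per_h, conc_mg_per_ml, area_cm2, time_h, dose_mg):
    """Cumulative dermal absorption as % of applied dose, assuming a constant
    steady-state Fick flux J = Kp * C (with C in mg/mL == mg/cm^3, J comes
    out in mg/(cm^2*h)); neglects lag time and donor depletion."""
    flux = kp_cm_per_h * conc_mg_per_ml
    return 100.0 * flux * area_cm2 * time_h / dose_mg

# hypothetical: Kp = 1e-3 cm/h, 10 mg/mL donor, 1 cm^2, 24 h, 10 mg applied
print(round(percent_absorbed(1e-3, 10.0, 1.0, 24.0, 10.0), 3))  # -> 2.4
```

the sketch also shows why the two measures diverge across studies: % absorbed scales with exposure time and inversely with applied dose, while kp does not, which is consistent with the exposure-time effect noted above.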
this is in line with a review published by jung in 2015 [5], which comes to the conclusion that hairless species, for instance, are usually not a good model to predict dermal absorption in humans.

[1] who (2006) ehc 235 dermal absorption
[2] scholz et al (2014) naunyn-schmiedeberg's arch pharmacol 3 (suppl 1):s86
[3] www.echemportal.org
[4] http://edetox.ncl.ac.uk
[5] jung et al (2015)

in-silico methods have evolved into indispensable tools in various areas of the life sciences. several stages in drug development, including hit identification and lead optimization, for instance, highly benefit from an accurate estimation of the binding free energies associated with biological host-guest systems. as a consequence, the need for laboratory experiments, including in-vivo experiments and animal testing, is considerably reduced. another field profiting from free energy calculations is human as well as ecotoxicology. in the development and risk assessment of new chemicals, transformation products arising from biotic or abiotic degradation of the parent substance have usually been neglected. for a few years now, however, the risk assessment of new chemicals has often included transformation products that may cause more harm than the parent substance itself. such studies, too, are mostly carried out on the basis of in-vitro and in-vivo tests. moreover, many metabolites can be detected but neither enriched nor synthesized in amounts sufficient for toxicological evaluations. at this stage, computational methods come into play. using classical molecular dynamics simulations in combination with an empirical linear prediction model, we have investigated several metabolites of the drugs sulfamethoxazole and carbamazepine and prioritized them according to their estimated binding affinities to potential biological target proteins.
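an empirical linear prediction model of the kind mentioned above can be illustrated by the linear interaction energy (lie) ansatz, which estimates a binding free energy from md-averaged differences in the ligand's van der waals and electrostatic interaction energies between the bound and free states. the coefficients and energy values below are common literature defaults and hypothetical inputs, not the authors' fitted model:

```python
def lie_dg(mean_d_vdw, mean_d_elec, alpha=0.18, beta=0.5, gamma=0.0):
    """Linear interaction energy (LIE) estimate of binding free energy:
    dG ≈ alpha*<dE_vdw> + beta*<dE_elec> + gamma (kcal/mol), where the
    <dE> terms are MD ensemble averages (bound minus free ligand).
    alpha/beta/gamma here are illustrative defaults, not fitted values."""
    return alpha * mean_d_vdw + beta * mean_d_elec + gamma

# hypothetical MD averages for two transformation products
candidates = {"tp1": (-20.0, -4.0), "tp2": (-12.0, -1.0)}
ranked = sorted(candidates, key=lambda k: lie_dg(*candidates[k]))
print(ranked)  # most favourable (lowest dG) first -> ['tp1', 'tp2']
```

ranking by estimated dG is exactly the prioritization step described in the abstract: metabolites with the most favourable predicted affinities are flagged for closer toxicological evaluation.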
consequently, a couple of metabolites were identified that bind to one or more human cytochrome p450 variants and to the bacterial enzyme dihydropteroate synthase, respectively, which are known to be sensitive to the two drugs. the investigations were carried out in the framework of the bb3r project funded by the german government through the bmbf.

istituto superiore di sanità, environment and primary prevention dept., rome, italy

introduction: in vitro methods have been increasingly used to characterize pharmacological and toxicological properties of substances. to address the problem of nominal versus actual concentrations, in vitro biokinetic studies were recently undertaken (truisi et al., toxicol lett 233:172-86, 2015). we use those data as input into a physiologically based human kinetic model (pbhkm) to model the in vivo doses leading to the in vitro measured concentrations. methods: a pbhkm was used to simulate the concentration-time profile of ibuprofen in the hepatic vein after oral administration. the details of the model and the physiological parameters used have been described elsewhere (abraham et al., arch toxicol 79:63-73, 2004). we modelled the concentration-time profile exploring the dose which would lead to concentrations at 1 hour and at 24 hours as similar as possible to the concentrations measured in the supernatant of freshly prepared human cell cultures after dosing the culture with ibuprofen. we parametrized the pbhkm with the parameters estimated from the in vitro kinetic studies (clearance between 3 and 15 µm³/sec (truisi et al., 2015)) and an absorption of 100% with an absorption rate of 1/h (cristofoletti and dressman, j pharm sci 103:3263-75, 2014). results: the data of the in vitro study with 100 µm ibuprofen could be modelled well.
when assuming a clearance of 15 µm³/sec, the dose of 1480 mg resulted in a 1-hour concentration of 66.5 µm in the hepatic vein of the pbhkm, equal to 133.0 nmol/well (volume of the well = 2 ml) in the in vitro study, in which the measured concentration was 138.75 nmol/well. the concentration at 24 hours of 8.7 µm (equal to 17.4 nmol/well) corresponded to the in vitro concentration (16.5 nmol/well). the modelling approach was less successful with in vitro dosing of 1000 µm. the 10-fold higher dose of 14800 mg led to nearly double the concentration at 1 hour compared to that measured in vitro. with a dose of 8500 mg an approximation was feasible, resulting in 396.7 µm in the hepatic vein at 1 hour, which is equal to 793.4 nmol/well, whereas the measured concentration in vitro was 782.85 nmol/well. even with a clearance value as low as the 2.5th percentile (3 µm³/sec), the concentration at 24 hours was modelled to be lower than the in vitro measured value (in vivo model: 98.7 µm, which corresponds to 197.4 nmol/well; measured in vitro concentration: 979.2 nmol/well). discussion: this is the first attempt to use kinetic data obtained in vitro to feed into a pbhkm for reverse dosimetry, finding the dose which corresponds in vivo to the in vitro situation. in the case presented here, the in vitro dose assumed to be low (100 µm) corresponds to a dose of 1480 mg (note: the highest approved daily dose is 2,400 mg). for the high in vitro dose, modelling was successful only for the concentration 1 hour after dosing, at a dose of 8,500 mg. conclusions: in vitro kinetic parameters, such as clearance, can successfully be used for parametrizing a pbhkm. it is of utmost importance for the relevance of in vitro findings to assure that the concentrations used in vitro can be obtained with relevant in vivo doses. in this case, the in vitro concentrations were within (low dose) and 3.5-fold above (high dose) the in vivo relevant therapeutic concentration range.
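the µm-to-nmol/well conversion used throughout the comparison above is direct: µmol/L multiplied by the well volume in ml gives nmol, since the factor 10⁻³ L/mL cancels the factor 10³ nmol/µmol. a one-line sketch reproducing two of the reported values:

```python
def um_to_nmol_per_well(conc_um, well_volume_ml=2.0):
    """nmol/well = (µmol/L) * (mL): the 10^-3 L/mL factor cancels the
    10^3 nmol/µmol factor. Default volume is the 2 mL well used above."""
    return conc_um * well_volume_ml

print(um_to_nmol_per_well(66.5))  # -> 133.0 (the reported 1-hour value)
print(um_to_nmol_per_well(8.7))   # -> 17.4 (the reported 24-hour value)
```

keeping the conversion explicit makes the reverse-dosimetry comparison transparent: the pbhkm outputs hepatic-vein concentrations in µm, while the in vitro measurements are reported per well.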
Introduction: A variety of drug residues have been detected in sewage plant run-offs, rivers and lakes, but also in groundwater and tap water samples. Studies have yet to identify a risk for human health from these contaminants, but adverse health effects have been reported for various species, including fish and birds. It has recently been suggested that for a comprehensive risk assessment toxicologists should also consider transformation products (TPs) of such water contaminants, which may arise from abiotic and biotic (metabolic) reactions. With aciclovir (ACV), a well-known antiviral drug, as the parent drug, we tried an in silico approach to identify TPs that might be of interest due to mutagenic or carcinogenic toxicophores. Methods: From a literature and database search we picked 12 ACV-TPs. Predicted acute toxicities and mutagenic/carcinogenic properties for these TPs were derived from an expert-system analysis using the Lazar portal (http://lazar.in-silico.ch/) as front end. Results: Two of the identified ACV-TPs could not be handled by Lazar because of insufficient training data in one out of eight queried categories. The highest score (4 positive out of 8 possible genotoxicity categories) was assigned to 9 of the 12 TPs, including ACV itself. This is a rather low score when compared to other water-borne drug residues, e.g. carbamazepine. COFA, an imidazole derivative of ACV seen in advanced oxidation processes, had shown antiproliferative effects in several ecotoxicologic screening assays, e.g. [1], but was unremarkable in our tests. Additionally, a computer-based simulation of the respective TPs interacting with human CYP isozymes did not support concerns that these TPs may pose a risk for human health. Conclusions: Our in silico analyses of 12 ACV-TPs did not provide evidence for any adverse health effects in the micromolar concentration range.
Further studies are needed to clarify whether the biological activity of some ACV-TPs in ecotoxicological assays may eventually affect as-yet unidentified biological targets in the human body.

Sulfur mustard (SM) is a chemical warfare agent which was first used in World War I, but has found use in several conflicts afterwards. Although SM is prohibited by the Geneva Protocol, terroristic attacks cannot be ruled out. Latest news give rise to concern that IS may be in possession of SM and is willing to deploy it. Even 200 years after the initial synthesis of SM, its mode of action is not fully unraveled. Thus, no antidote exists. However, chemosensing ion channels have been shown to be activated by highly toxic chemicals and might represent a specific therapeutic target. Previous studies have shown that the SM surrogate CEES (a mono-functional alkylating agent) is able to activate transient receptor potential ankyrin 1 (TRPA1) channels that are known to affect MAPK cell signaling. MAPK pathways, especially pERK1/2, are known to increase protein biosynthesis through activation of transcription factors binding to the serum response element (SRE). It is unknown whether alkylating agents also have an impact on MAPK signaling mediated through TRPA1 activation. Our results demonstrate that allyl isothiocyanate (AITC) resulted in phosphorylation of the MAPK pERK1/2 and increased protein biosynthesis of SRE-regulated genes in HEK293 cells overexpressing hTRPA1. CEES increased pERK1/2 levels already after 2.5 min, which could be prevented by the TRPA1 blocker AP18. Activation of target genes through pERK1/2 signaling was also evident, but less pronounced compared to AITC. Our results demonstrate that alkylating agents have an impact on cell signaling through TRPA1 channel activation. Thus, TRPA1 might represent a promising target for counteracting SM toxicity.
Sulfur mustard (SM) is a chemical warfare agent that provokes severe inflammation and blistering upon exposure of the skin, accompanied by disturbed wound healing. The potential use of SM in terroristic assaults has amplified the interest in understanding the underlying cellular and molecular pathomechanisms in order to improve therapeutic intervention. Autophagy is a highly conserved catabolic pathway in eukaryotes that ensures the degradation and recycling of cellular components through the lysosomal machinery. Autophagy is important for cell survival in physiological and pathological stress situations. Emerging knowledge indicates that imbalanced regulation of autophagy disturbs basal cell functions including proliferation, differentiation and migration, thus contributing to the pathophysiology of various diseases. After penetration into skin cells, SM alkylates and thereby modifies nucleic acids and proteins, thus forming aggregates of dysfunctional proteins destined for autophagic disposal. In our studies, we analyzed the influence of SM on protein expression (Western blotting) of autophagy-related (ATG) genes as well as on proliferation (WST-1) of primary normal human keratinocytes (NHEK) and primary normal human dermal fibroblasts (NHDF). Preliminary results demonstrate that SM strongly dysregulates the biosynthesis of ATG proteins, which may contribute to the diminished cell migration and proliferation under these conditions. Our findings suggest that SM affects autophagy in correlation with an impairment of physiological functions in keratinocytes and fibroblasts that are essentially required for normal tissue regeneration. Thus, application of pharmacological modulators of autophagy might be useful in the treatment of the delayed wound healing in skin upon exposure to SM.

Exposure of the respiratory tract to airborne particles is a major risk to human health.
Due to the ubiquitous application of these particles in the fields of pharmacy and industry and in daily life, there is a strong necessity to investigate the toxic properties and the underlying pathomechanisms of these inhalable substances. In addition, the EU chemicals regulation (REACH) requires not only that all substances placed on the market undergo a toxicological characterization, including the identification of potential toxic inhalation hazards, but also that animal testing be undertaken only as a last resort ("3Rs" principle), and it promotes the development of alternative methods. Thus, the development, establishment and validation of alternative in vitro-based test systems for the assessment of pulmonary toxicity are in the focus of current research. Until now, most of the available in vitro cell culture models have been limited to some extent, as the exposure is either done under submerged conditions, not resembling the exposure conditions in vivo, or a homogeneous particle distribution is not guaranteed. The CULTEX® Radial Flow System (RFS) is a specially designed modular in vitro exposure system that overcomes these limitations. It enables the homogeneous exposure of human lung epithelial cells at the air-liquid interface (ALI), thereby mimicking the physiological conditions of the alveoli. However, further optimizations are needed to enhance the CULTEX® methodology. The aim of this study was, first, the optimization of the test methodology in general (i.e. with a focus on clean-air controls of the human lung epithelial cell line A549), and second, the improvement of cultivation conditions.
Parameters such as handling of the CULTEX® device (proper closing and opening of the CULTEX® RFS module; improved washing conditions and media supply), treatment of the incubator controls, adjustment of clean-air pressure and flow rates, and integration of two additional filters were sequentially adjusted in order to enhance the methodical setup. Our results show that the test parameters for clean-air exposure of the A549 cells were successfully optimized, resulting in more accurate and robust data. Cultivation conditions were improved by changing from closed-wall to open-wall cell culture inserts. The open-wall inserts turned out to be more suitable for exposure experiments as they provided a better medium supply and preserved humidity. Consequently, the change of the cell culture inserts was identified as the decisive factor for the improvement of cell morphology. Hence, we have successfully optimized the CULTEX® RFS methodology for clean-air exposure of A549 cells.

Human primary hepatocytes represent the gold standard in in vitro liver research. Due to their low availability and high costs, alternative liver cell models with comparable morphological and biochemical characteristics have come into focus. The human hepatocarcinoma cell line HepG2 is often used as a model for liver toxicity studies. However, under two-dimensional (2D) cultivation conditions the expression of xenobiotic-metabolizing enzymes and typical liver markers is very low. Cultivation for 21 days in a three-dimensional (3D) Matrigel culture system has been reported to strongly increase the metabolic competency of HepG2 cells. In our present study we extended previous studies and compared HepG2 cell cultivation in three different 3D culture systems: collagen, Matrigel and the Alvetex culture system.
Cell morphology, albumin secretion, cytochrome P450 monooxygenase (CYP) enzyme activities, as well as expression of xenobiotic-metabolizing and liver-specific enzymes were analyzed after 3, 7, 14, and 21 days of cultivation. Our results show that the previously reported increase in metabolic competency of HepG2 cells is not primarily the result of 3D culture but a consequence of the duration of cultivation. HepG2 cells grown for 21 days in 2D monolayer exhibit biochemical characteristics, CYP activities and gene expression patterns comparable to all 3D culture systems used in our study. However, CYP activities did not reach the level of HepaRG cells. In conclusion, the increase in metabolic competence of the hepatocarcinoma cell line HepG2 is not due to 3D cultivation but rather a result of prolonged cultivation time.

In vitro assessment of the neurotoxic potential of arsenolipids. Arsenolipids are organic, lipid-soluble arsenic compounds which occur mainly in marine organisms. Major human exposure routes are fatty fish, including herring, and fish oil-based food supplements. About 55 different arsenolipids have been identified so far. Arsenic-containing hydrocarbons (AsHC) and arsenic-containing fatty acids (AsFA) represent two subgroups of the arsenolipids [1]. Our in vitro studies have demonstrated high cellular bioavailability and a high cytotoxic potential of AsHCs in human liver and bladder cells [2], whereas AsFAs were less toxic [3]. A substantial transfer across an intestinal barrier model (Caco-2) indicated that AsHCs are highly intestinally available. In comparison, AsFAs showed lower intestinal bioavailability and underwent presystemic metabolism [4]. Moreover, in Drosophila melanogaster AsHCs exerted late developmental toxicity and accumulated in the fruit fly's brain. These results suggest that AsHCs might pass the blood-brain barrier due to their amphiphilic structure [5].
In order to assess the neurotoxic potential, we are currently investigating the toxicity of several arsenolipids in differentiated human neurons (LUHMES). After 48 h of incubation with AsHCs or AsFAs, cell number (Hoechst) as well as cellular dehydrogenase activity (resazurin) were measured, with the latter endpoint turning out to be more sensitive. AsHCs showed substantial cytotoxic effects (IC50 ≈ 7-12.5 µM) in a concentration range comparable to that of arsenite (IC50 ≈ 7.5 µM), whereas AsFAs were less cytotoxic (IC50 > 100 µM). After incubation with AsHCs, the cellular arsenic concentrations increased 10-20-fold as compared to incubation with arsenite. Further studies indicated that one possible toxic mode of action of arsenolipids could be a disruption of the cellular energy level. Therefore, the mitochondrial membrane potential was investigated after incubation of differentiated neurons with the arsenic compounds. Whereas arsenite did not exert an impact, AsHCs reduced the mitochondrial membrane potential significantly. This might be due to interactions of the amphiphilic AsHCs with mitochondrial membranes. Currently we are investigating the impact of the arsenolipids on neurite outgrowth as a developmental toxicity endpoint.

Standard treatment of poisoning by organophosphorus compounds (OP; e.g. nerve agents and pesticides) consists of co-administration of atropine and an oxime-based reactivator of inhibited cholinesterases. Due to the lack of efficacy of clinically used oximes against human acetylcholinesterase (AChE) inhibited by various OPs (e.g. soman), research has started focusing on new therapeutic approaches. Several research groups have conducted in silico screenings [1, 2] in order to identify new non-oxime reactivators, presenting amodiaquine as a promising candidate for paraoxon-inhibited hAChE.
For decades, antimalarial drugs like amodiaquine and chloroquine have been closely investigated regarding their side effects, thereby revealing interactions with cholinesterases, which could pose a new potential therapeutic benefit for inhibited cholinesterases. Therefore, in this study interactions between antimalarial agents and cholinesterases in the presence or absence of OPs were examined spectrophotometrically by a modified Ellman assay. Reversible inhibition of cholinesterases was observed with both antimalarial agents. Amodiaquine had a higher inhibitory potency for hAChE than for human butyrylcholinesterase (hBChE), as confirmed by IC50 values of 0.67 ± 0.02 µM for hAChE and 81.28 ± 0.04 µM for hBChE. IC50 values for chloroquine were 28.37 ± 0.02 µM for hAChE and 55.62 ± 0.02 µM for hBChE, thus representing a weaker inhibition of hAChE than amodiaquine. Furthermore, reactivation of paraoxon- (PXE), sarin- (GB), cyclosarin- (GF), and VX-inhibited hAChE and hBChE by amodiaquine and chloroquine was determined. After 60 minutes, only paraoxon-inhibited hAChE (50%) and cyclosarin-inhibited hBChE (10%) were reactivated by 500 µM chloroquine. In contrast, 10 µM amodiaquine reactivated hAChE inhibited by all tested OPs after 60 minutes, in the following order: PXE > VX > GF > GB. With hBChE, the highest reactivation was achieved with 100 µM amodiaquine, in the following order: VX > GB > GF > PXE. Due to the high reversible inhibitory potency of amodiaquine, an increased concentration does not result in a higher reactivation of OP-inhibited hAChE. In summary, our results show that amodiaquine is a reactivator of OP-inhibited cholinesterases. In the future, non-oxime reactivators that are structurally related to amodiaquine should be further investigated. [1] Bhattacharjee, A.K., Marek, E., Le, H.T., Gordon, R.K., Eur. J. Med. Chem., 2012, 49, 229-238.
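The reported IC50 values imply a strong selectivity of amodiaquine for hAChE over hBChE. A minimal sketch, assuming a simple one-site binding model (fractional inhibition f = [I]/([I] + IC50)); the model itself is our illustrative assumption and is not stated in the abstract, only the IC50 values are taken from it:

```python
# Fractional inhibition of a cholinesterase at inhibitor concentration [I],
# under the simple one-site model f = [I] / ([I] + IC50).
# The model is an illustrative assumption; IC50 values are from the abstract.

IC50_UM = {
    ("amodiaquine", "hAChE"): 0.67,
    ("amodiaquine", "hBChE"): 81.28,
    ("chloroquine", "hAChE"): 28.37,
    ("chloroquine", "hBChE"): 55.62,
}

def fractional_inhibition(conc_um: float, ic50_um: float) -> float:
    """Fraction of enzyme activity inhibited at a given inhibitor concentration (µM)."""
    return conc_um / (conc_um + ic50_um)

# At its own IC50 a drug inhibits 50% by construction:
print(fractional_inhibition(0.67, IC50_UM[("amodiaquine", "hAChE")]))  # 0.5
# Selectivity of amodiaquine for hAChE over hBChE:
print(IC50_UM[("amodiaquine", "hBChE")] / IC50_UM[("amodiaquine", "hAChE")])  # ≈ 121-fold
```

Under this assumed model, the roughly 120-fold IC50 ratio illustrates why amodiaquine inhibits hAChE far more readily than hBChE at therapeutic-range concentrations.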
[2] Katz, F.S., Pecic, S., Tran, T.H., Trakht, I., Schneider, L., Zhu, Z., Ton-That, L., Luzac, M., Zlatanic, V., Damera, S., MacDonald, J., Landry, D.W., Tong, L., Stojanovic, M.N., ChemBioChem, 2015, 16, 2205-2215.

Bundesinstitut für Risikobewertung, Lebensmittelsicherheit, Berlin, Germany. Development of mammary gland tumors is connected to a deregulation of breast epithelial cell differentiation, a complex process which cannot be reproduced in vitro under standard cell culture conditions. However, cultivation of cells in a tissue-like environment in an in vitro three-dimensional (3D) model can mimic the general architecture, function and differentiation of the mammary gland. In this project, a 3D model was used consisting of the permanent breast epithelial cell lines MCF10A (ER−, estrogen receptor negative) and MCF12A (ER+, estrogen receptor positive) grown in Matrigel™, which mimics the complex extracellular matrix in vivo. The 3D culture of MCF10A and MCF12A cells in Matrigel™ results in the formation of growth-arrested, polarized spheroids with a lumen (acini-like organoids). In order to perform a semi-quantitative estimation of the influence of substances on the differentiation of the breast cells for the identification of non-genotoxic carcinogens, a scoring method was developed. This scoring method provides information about substance-induced morphological changes of the spheroids during differentiation based on the following parameters: size of the spheroids, formation of the lumen, and degree of polarization. Furthermore, the model allows distinguishing between ER-dependent (MCF12A) and ER-independent (MCF10A and MCF12A) effects. The 3D in vitro model is a useful tool for toxicologists to study substance effects on differentiation processes. The system will be used to examine the potential of e.g.
food contaminants such as phthalates or perfluorinated substances (PFAS) to disrupt the differentiation process of breast epithelial cells and will therefore serve as a valuable in vitro tool to assess their carcinogenic potential.

Inflammatory episodes occur erratically throughout life and are likely to play a critical role in altering an individual's susceptibility to idiosyncratic drug-induced liver injury (IDILI), a particularly severe form of drug-induced liver injury (DILI). In concordance with the inflammatory stress hypothesis, modest inflammatory stress can lower the threshold for hepatotoxicity and make an individual susceptible to developing liver injury during exposure to therapeutic doses of a drug. In order to evaluate the role of immune cells and their secreted factors during drug therapy, we established an in vitro test battery consisting of two cell culture systems in the presence or absence of pro-inflammatory factors (LPS, TNFα): (a) a monoculture of human hepatoma (HepG2) cells and (b) co-culture systems of human monocytic or macrophage-like (THP-1) cells and HepG2 cells. With these different test settings we aimed to identify whether the introduction of inflammatory immune cells and/or pro-inflammatory factors could increase the sensitivity of liver cells towards IDILI compounds. Three reference substance pairs were tested, namely troglitazone-rosiglitazone, trovafloxacin-levofloxacin, and diclofenac-acetylsalicylic acid, each of them composed of a compound that is known to induce IDILI and a partner compound of the same substance class that does not. First, all compounds were tested for cytotoxicity towards the single-cell systems using the WST assay. Co-culture experiments with HepG2 and THP-1 monocytes or macrophages, as well as co-exposure experiments with LPS or TNFα, were then done at about 20% cytotoxicity of the respective substance in the most sensitive cell type.
Subsequently, the results were compared to the experiments in the HepG2 monoculture. We observed that every IDILI compound showed a significant increase in cytotoxicity in at least one exposure combination, while this effect was not observed with the corresponding non-DILI partner compound. In conclusion, a combination of different culture systems and co-exposures with pro-inflammatory factors is needed for a valid differentiation between non-DILI and IDILI compounds. This test battery could provide a useful tool for the prediction of inflammation-associated idiosyncratic drug-induced hepatotoxicity. Furthermore, our results support the inflammatory stress hypothesis and point to an involvement of pro-inflammatory factors in the development of IDILI.

Extensive animal models of carcinogenicity ensure the safe usage of chemicals, but for elucidating fundamental molecular mechanisms of carcinogenicity these methods are expensive, time-consuming and, above all, too complex. In contrast, most in vitro methods are rather simple and detect only selected endpoints, like DNA damage, mutations or changes in proliferation. The BALB/c cell transformation assay is a validated toxicological method to identify potential tumour initiators and promoters. First, BALB/c mouse fibroblasts form a monolayer culture and become contact-inhibited after reaching confluence. Upon treatment with a tumour initiator (3-methylcholanthrene) and promoter (12-O-tetradecanoylphorbol-13-acetate), transformed cells do not stop proliferating and grow as morphologically aberrant foci over the monolayer of normal cells. After fixation with methanol at day 42, morphologically aberrant foci can be visualized with Giemsa staining. Because the BALB/c assay mimics different stages of the malignant cell transformation process (initiation, promotion and post-promotion phase) and, with colony formation, detects a late endpoint of carcinogenicity, we improved this method for mechanistic cancer research.
Using the example of the insulin-signalling pathway, we can show that (1) several substances have a different impact on the transformation process, (2) it is possible to identify for each substance the phase with the greatest effectiveness, and (3) we can detect additional endpoints to elucidate the mechanistic mode of action. Therefore we used several compounds (linsitinib, metformin, rapamycin, …) to manipulate the insulin-signalling pathway at different levels (INSR, AMPK, mTOR, …) and analysed a number of characteristic endpoints of carcinogenesis. Changes at the protein and signalling level (Western blot, immunofluorescence, flow cytometry) or parameters of energy metabolism (oxygen consumption, glucose or ATP measurement) are measurable and enable new insights into the process of cancer origin. Summing up, the BALB/c 3T3 assay proves to be a cheap and short-term alternative to rodent bioassays. Although this method does not mimic the whole in vivo neoplastic process, it can be used to provide essential information regarding key proteins and their signalling during the different stages of transformation.

Is there hope to correctly classify severe ocular irritant agrochemical formulations using in vitro methods? A proof of concept using the isolated chicken eye test, two modified BCOP protocols and an EpiOcular™ ET50 protocol. While some in vitro methods addressing ocular irritancy have gained regulatory acceptance, to date the Draize rabbit eye test (OECD TG 405) is the only worldwide regulatory accepted test for the determination of the full range of eye irritation potential. Furthermore, although several in vitro methods for severe eye irritation have gained regulatory acceptance, agrochemical formulations are neither explicitly included in nor excluded from the applicability domain for predicting severe ocular irritant formulations. Systematic analyses are only available for e.g.
the hen's egg test-chorioallantoic membrane (HET-CAM) and bovine corneal opacity and permeability (BCOP, OECD TG 437) assays, both showing that the protocols used do not provide sufficient sensitivity to reliably predict severely ocular irritating formulations. The purpose of this study was to evaluate whether the regulatory accepted isolated chicken eye (ICE, OECD TG 438) test including corneal histopathology (as suggested for evaluation of the depth of injury), as well as two modified BCOP protocols and/or an ET50 (exposure time reducing the viability of treated tissue to 50%) protocol using the reconstructed cornea model EpiOcular™, are useful for predicting severe ocular irritant agrochemical formulations. A proof of concept comprising the testing of ten to twelve agrochemical formulations with available in vivo data in each assay was conducted. In summary, based on the ICE evaluation described in OECD TG 438, one of the five severe ocular irritant formulations (UN GHS Cat 1) was predicted correctly. Using the two modified BCOP protocol versions, the result for one of the four tested UN GHS Cat 1 formulations was just above the UN GHS Cat 1 classification border for one of the modified protocols. Lastly, and most promisingly, the EpiOcular™ ET50 predicted four of five tested UN GHS Cat 1 formulations correctly, with the fifth being close to the classification border. Additional agrochemical formulations will be tested to further evaluate the EpiOcular™ ET50 protocol for identifying severe ocular irritant agrochemical formulations.

Drug-induced pancreatic toxicity comprises effects on the exocrine and/or the endocrine pancreas, both of which can have serious clinical implications, e.g. acute pancreatitis or diabetes mellitus. Adverse effects on the pancreas are occasionally observed during drug discovery and development and often prohibit further development.
Hence, there is a need for reliable in vitro models to identify the pancreas-toxic potential of drug candidates early on. Permanent cell lines and primary cells have many shortcomings, e.g. loss of cell-to-cell and cell-to-matrix relationships or changes in cell physiology due to the isolation procedure. Pancreas tissue slices are a potential alternative, circumventing most of these limitations. Their preparation is rather elaborate, which may explain their rare use. So far, pancreas tissue slices have predominantly been used to address physiological or pharmacological questions, although they might also serve as a valuable in vitro model for toxicological applications. Therefore, this work aimed to establish and characterize rat pancreas tissue slices as an in vitro model for studying drug-induced pancreatic toxicity. Results will be compared to the responses of the permanent endocrine (INS-1E) and exocrine (AR42J) pancreatic cell lines to evaluate a potential added value. Rat pancreas tissue slices were prepared by a protocol adapted from Marciniak et al. (Nat Protoc, 2014, 9(12): 2809). Briefly, the pancreas was infused and embedded with agarose. Tissue sections of approx. 200 µm were prepared using a vibratome and maintained in cell culture medium for up to 6 days. Cell viability was determined by daily measurement of lactate dehydrogenase (LDH) in medium supernatants and by microscopic evaluation following fixation in 10% formalin and H&E staining. Functional integrity of acinar and beta cells was assessed by cell-type-specific secretory responses (i.e. insulin, amylase, lipase) to physiological stimuli. Moreover, the effects of the pancreas toxins streptozotocin (STZ), alloxan (ALL), and the cholecystokinin (CCK) analogue cerulein on the viability and functional integrity of tissue slices were compared to the respective responses of the cell lines.
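The daily LDH readout described above is conventionally converted to a cytotoxicity percentage by normalising each measurement between a spontaneous-release control and a maximal-lysis control. This is the standard LDH-assay calculation, not a formula specified in the abstract, and the example readings are illustrative:

```python
def ldh_cytotoxicity_percent(sample: float, spontaneous: float, maximal: float) -> float:
    """Standard LDH-release normalisation:
    0% = spontaneous release control, 100% = maximal (full-lysis) control.
    All three arguments are raw absorbance/activity readings."""
    return 100.0 * (sample - spontaneous) / (maximal - spontaneous)

# Illustrative readings (not data from the study): a sample halfway between
# the spontaneous-release and full-lysis controls gives 50% cytotoxicity.
print(round(ldh_cytotoxicity_percent(sample=0.45, spontaneous=0.10, maximal=0.80), 1))  # 50.0
```

Tracking this percentage per day of slice culture would quantify the viability decline over the 6-day cultivation period reported in the results.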
We were able to establish an optimized isolation and cultivation procedure for rat pancreas tissue slices by applying minor modifications to the original protocol. Cell viability declined over the cultivation period. Stimulation of the cell lines with glucose or cerulein increased secretion of insulin (INS-1E cells) or amylase/lipase (AR42J cells), respectively. The pancreas slices responded to both stimuli, demonstrating functional integrity of endocrine and exocrine cells. Treatment of INS-1E islet cells with the beta-cell toxicants ALL or STZ only slightly affected islet cell viability, whereas treatment of AR42J acinar cells with cerulein at supraphysiological concentrations had no effect. This set of experiments is currently being completed by investigating the effects of ALL, STZ and cerulein on the viability of acinar and islet cells in pancreas slices. Our preliminary data demonstrate the feasibility of preparing and cultivating rat pancreas tissue slices over a period of 6 days while maintaining functional integrity to some extent.

Coculture of human monocytes with the keratinocyte cell line HaCaT in serum-containing medium leads to higher sensitivity to weak contact allergens: an improvement for the loose-fit coculture-based sensitization assay (LCSA). A. Sonnenburg et al. The loose-fit coculture-based sensitization assay (LCSA) has proved reliable for the in vitro detection of contact sensitizers in the past. However, the use of primary human keratinocytes has some disadvantages. To facilitate high-throughput screening of chemicals, we replaced the primary keratinocytes of the original assay setup (setup A) by the human keratinocyte cell line HaCaT. These cells were cocultured with monocyte-derived dendritic cells in serum-free medium (setup B) or fetal calf serum (FCS)-containing medium (setup C). Upregulation of the dendritic cell maturation marker CD86, assessed by flow cytometry, served as the endpoint.
We tested four substances known as sensitizers and four non-sensitizers in both new setups as well as in the original setup with primary cells. Three out of four sensitizers (2,4-dinitrochlorobenzene, 2-mercaptobenzothiazole, and coumarin) and three out of four non-sensitizers (glycerol, monochlorobenzene, and salicylic acid) were correctly assessed under all culture conditions. The weak sensitizing potency of resorcinol was only detected by setup C with FCS-supplemented medium. A false-positive reaction to caprylic (octanoic) acid in all three setups confirms earlier results from our laboratory that some fatty acids are able to induce CD86 on dendritic cells in vitro. Culture in FCS-supplemented medium led to the generation of dendritic cells showing a more pronounced upregulation of CD86 after application of substances with rather high sensitization potency, compared to dendritic cells formed under serum-free conditions. Therefore, we characterized dendritic cells from setups B and C by flow cytometric measurement of additional dendritic cell surface markers. Dendritic cells from the original setup A had been characterized extensively before (Schreiner et al., Toxicology 2008; 249:146-152). Dendritic cells generated in FCS-supplemented medium were CD1a+/CD1c+, whereas dendritic cells from serum-free culture conditions were CD1a−/CD1c−, regardless of whether they were cocultured with primary human keratinocytes or HaCaT. Populations with CD1a+/CD1c+ dendritic cells in coculture seem to show a higher sensitivity to weak sensitizers, which proved beneficial for the identification of resorcinol. In conclusion, modification of the LCSA protocol led to an increased sensitivity of the assay.

Due to ethical and social reasons, in vitro assays are being developed to replace animal tests for addressing e.g. toxicological questions.
For the induction of skin sensitization by chemicals, resulting in tolerance or allergic contact dermatitis after repeated exposure, prerequisites are the induction of inflammatory responses in keratinocytes supporting the maturation of dendritic cells (DC), which is needed for the T cell response. Although related in vitro assays consisting of one single cell type have good hazard prediction capacities, they have limitations in predicting sensitization potency. One drawback could be the lack of communication between keratinocytes and DC. With respect to the activation of keratinocytes and maturation of DC, intercellular communication between these two cell types may include the release of danger molecules such as cytokines, damage-associated molecules such as ATP, and metabolized chemicals. Besides this, microRNAs (miRNAs), among them those that can regulate DC activation or maturation, can be differentially expressed upon stimulation but can also be transferred between cells. For skin sensitizers, we have already reported that cross-talk between HaCaT keratinocytes and THP-1 cells, as a model for DC, enhanced CYP1 enzyme activity in HaCaT cells exposed to benzo[a]pyrene (B[a]P) and eugenol, which belong to a subgroup of chemicals (prohaptens) whose sensitizing potential depends on prior metabolic activation, e.g. via cytochrome P450 (CYP) enzymes. Furthermore, coculture clearly increased the upregulation of the cell surface molecule CD86 on THP-1 cells after incubation with these prohaptens and also with several other skin sensitizers. In this study we further elucidate the cross-talk between THP-1 cells and HaCaT cells by analyzing the impact of HaCaT cells on the expression of miRNAs in THP-1 cells by microarray technology. We identified 6 differentially expressed miRNAs in cocultured THP-1 cells compared to monocultured THP-1 cells, irrespective of the treatment (medium, 0.2% DMSO as solvent control, B[a]P).
In the presence of DMSO and B[a]P (after 48 h), 8 additional miRNAs were differentially expressed. Up to now it is not clear whether the cross-talk between HaCaT and THP-1 cells comprises the exchange of miRNA between the cocultured cells, or whether it influences the expression of these miRNAs in THP-1 cells, or both. Given that one miRNA has several gene targets, these results illustrate that the cross-talk between THP-1 and HaCaT cells also impacts the miRNome.

Walther-Straub-Institut der LMU München, Munich, Germany. Transient receptor potential (TRP) proteins represent a large superfamily of non-selective cation channels sensing toxic stimuli in the human body. TRPA1 expresses a high number of amino-terminal ankyrin repeats and is the only member of the TRPA family. Channel monomers form homotetramers in the plasma membrane with six transmembrane segments (TM) and a pore-forming loop between TM5 and TM6. TRPA1 has been extensively described in sensory nerve endings as an important cellular detector of toxic stimuli and as an oxygen sensor (reviewed in [1]). Although two recent reports identified TRPA1 in pulmonary epithelial and endothelial cells [2, 3], its expression in non-neuronal tissues is still a matter of debate. After isolation and identification of different murine lung cells, we were able to identify murine TRPA1 protein in primary endothelial cells, type II pneumocytes (ATII) and fibroblasts by using specific antibodies in Western blot analysis, but not in cells from TRPA1-deficient mice. ATII cells were identified by specific cell markers such as surfactant protein C and were further differentiated to ATI cells, characterized by their specific expression of podoplanin. Quantitative TRP expression patterns will now be evaluated by quantitative reverse transcription (RT)-PCR as well as by NanoString® technology in different lung cells. To characterize TRPA1 at the cellular level, we cultured a HEK293 cell line stably expressing TRPA1 [4].
allyl isothiocyanate (aitc), a specific activator, as well as hypoxia and hyperoxia were able to induce ca2+ influx in this cell line, which was blocked by the specific inhibitor a-967079. in the future, we will utilize the isolated perfused lung model (5) to quantify toxin-induced edema formation in ex vivo lungs from wt and trpa1-deficient mice after exposure to potential toxic inhalation hazards (tih, see 6) to challenge the hypothesis of trpa1 as an important toxin sensor in the lung. by this strategy we hope to understand trpa1 function in lung cells and to evaluate trpa1 proteins as potential pharmacological targets for a specific therapeutic intervention during toxin-induced edema formation. metabolism by the intestinal microbiota is likely to contribute substantially to the plasma metabolite profile of the mammalian host organism, and this requires adequate identification of the effects of the microbiome on endogenous plasma metabolite patterns. the current investigations provide insights into the mammalian-microbiome co-metabolism of endogenous metabolites. antibiotics have a profound effect on the micro-organism composition of the microbiome and hence on the mammalian-microbiome co-metabolism. the consequences, however, for the functionality of the microbiome (defined as the production of metabolites absorbed by the host), and which of these changes are related to the microbiome, are not well understood. to identify plasma metabolites related to microbiome changes due to antibiotic treatment, we have employed a metabolomics approach. to this purpose, broad-spectrum antibiotics belonging to the classes of aminoglycosides (streptomycin, neomycin, gentamicin), fluoroquinolones (moxifloxacin, levofloxacin) and tetracyclines (doxycycline, tetracycline) were administered orally for 28 days to male rats, including blood sampling for metabolic profiling after 7, 14 and 28 days. fluoroquinolones and tetracyclines can be absorbed from the gut whereas aminoglycosides cannot. 
to distinguish between metabolite changes caused by systemic toxicity of the antibiotics and microbiome-related changes, the metabolites identified in the metabolome pattern were compared to a list of metabolites known to be produced by the gastro-intestinal micro-organisms. besides changes mainly concerning amino acids and carbohydrates, hippuric acid and indole-3-acetic acid were identified as key metabolites affected by antibiotic treatment. for each class the following gut metabolites were found to be unique: indole-3-propionic acid for aminoglycosides, taurine for fluoroquinolones, and 3-indoxyl sulfate, uracil and allantoin for tetracyclines. for each class of antibiotics, specific and selective metabolome patterns could be established. the results suggest that plasma-based metabolic profiling (metabolomics) could be a suitable tool to investigate the effect of antibiotics on the functionality of the microbiome and to obtain insight into the mammalian-microbiome co-metabolism of endogenous metabolites. drug-induced liver injury (dili) is still a major reason for termination of clinical trials and thus is an important concern in drug development. identification and prediction of dili in the clinic and in preclinical safety testing still rely on the classical clinical chemistry panel and histopathology, with known limitations in sensitivity and specificity. in recent years, bile acids (bas) have been studied as potential biomarkers to better characterize drug-induced liver injury, with promising results (ellinger-ziegelbauer et al., 2011; luo et al., 2014; yamazaki et al., 2013). to evaluate whether targeted bile acid profiling via lc-ms/ms in plasma and liver tissue can improve assessment of liver injury, methapyrilene (mpy), a known hepatotoxin, or the corresponding vehicle, was administered daily to male wistar rats at a low (30 mg/kg) and a high (80 mg/kg) dose. 
rats were sacrificed following 3, 7, or 14 consecutive daily doses, or after a 10-day recovery following 14 consecutive administrations of mpy or vehicle. in addition to bile acids, which were determined both in plasma and tissue, conventional preclinical safety endpoints (histopathology and clinical chemistry) were assessed and gene expression profiling was performed in liver to obtain mechanistic information about potential changes in the regulation of bile acid levels. conventional findings included periportal necrosis, inflammation and biliary hyperplasia, and increased liver enzyme activity and bilirubin levels during the treatment phase. the bile acid pattern showed increased levels of conjugated and unconjugated bile acids in low-dose and high-dose groups compared to the controls after administration of methapyrilene. furthermore, although liver enzyme activity and bilirubin levels in serum were decreased again in the recovery groups, suggesting recovering liver injury, bile acid concentrations remained elevated with no signs of recovery. analysis of transcriptomics data revealed decreased levels of mrna encoding α-methylacyl-coa racemase (amacr), a gene involved in bile acid synthesis, 4 and 15 days after dosing. expression of membrane transport systems for bile acids such as the sodium/taurocholate co-transporting polypeptide (ntcp) and organic anion transporting polypeptide 1 (oatp1) was down-regulated as well, indicating that the increased bile acid concentrations in plasma and tissue could be attributable to reduced uptake by the hepatocyte. in summary, the data suggest that targeted bile acid profiling could serve as a potential biomarker to enhance assessment of drug-induced liver injury. photorhabdus asymbiotica is an entomopathogen and emerging human pathogen causing soft tissue infections in humans. photorhabdus asymbiotica produces the bacterial protein toxin patox, which is cytotoxic for various cell lines and kills insect larvae. 
previous studies have established that patox harbors two enzymatically active domains, a glycosyltransferase and a deamidase domain. the glycosyltransferase domain inactivates host gtpases of the rho family by glcnacylation of a tyrosine residue in the effector binding loop, which results in the disassembly of the actin cytoskeleton. the deamidase domain deamidates a crucial glutamine residue in heterotrimeric gαi and gαq/11 proteins, which renders the g proteins constitutively active. sequence and structural homology analyses of patox revealed a third domain (patoxp) resembling peptidases of the c58 protease family. patoxp contains the conserved catalytic triad (c/h/d) of papain-like cysteine proteases and shares sequence similarity with effectors from yersinia pestis (yersinia outer protein yopt) and pseudomonas syringae (avirulence protein avrpphb). transient expression of patoxp in hela cells induces cell rounding and indicates a cytotoxic potential of patoxp. incubation of patoxp with linearized bovine serum albumin (bsa) results in cleavage products of bsa, suggesting proteolytic activity of patoxp. mutation of the catalytic cysteine in patoxp prevents cleavage of bsa and blocks cytotoxicity. we were not able to observe autocatalytic cleavage of patox constructs under various conditions. the intracellular activity of the protease domain is most likely involved in the pathogenicity of patox. vitamin d metabolism - involved in triazole fungicide toxicity? a. lehmann. background: in a 28-day rat feeding study with the azole fungicides cyproconazole (c), epoxiconazole (e), propiconazole (p), tebuconazole (t), prochloraz (pz) as well as the combinations c+e and c+e+pz, a reduction of vitamin d (vitd) receptor mrna levels was reproducibly observed in adrenals for c, e and p. 
transcription of various enzymes related to vitd homeostasis (including cyp2r1, gc, cyp3a, ugt1a) in liver was also affected, while initial indications for modulation of renal cyp24a1 and renal and hepatic cyp27b1 could not be confirmed. a possible induction of parathormone (pth) was noted for the high dose of c, but statistical significance could not be shown. we have now performed supporting analyses of serum vitd levels, measured additional transcript levels and will provide a framework for the interpretation of the findings. methods: male wistar rats (n=5 for single substances, n=10 for combinations) were treated for 28 days at dose levels based on noaels from 90-day subchronic feeding studies, ranging from noael/100 to noael×10. quantitative rt-pcr analyses were performed on organ samples obtained at sacrifice. serum vitd levels were determined using the total (25-oh) vitamin d elisa (drg instruments gmbh, marburg, germany). results: the elisa established for diagnostic analysis of human serum and plasma samples could be applied to rat serum. vitd levels in control animals (n=30) were 64.7±8.3 ng/ml (min/max: 48/78 ng/ml), i.e. in the range of values reported previously for rats. for the high dose of c (1000 ppm in food, n=5), there was a statistically non-significant reduction of vitd levels to 71.3±17.6% of the concurrent control (n=5). however, for 4 of 5 animals of this group, measured vitd levels were below the range observed in pooled controls (n=30). a corresponding follow-up is ongoing. qrt-pcr analysis of adrenal tissue showed deregulation of apoptosis-related genes (p21 for c, e and pz; cdk1 and gadd45a for e; cdkn1c for c), which is in agreement with an involvement of vitd in the autocrine/paracrine regulation of cell proliferation. conclusion: a reduction of circulating vitd levels would be plausible as a result of induction of hepatic cyp3a1/2 and ugt1a. 
however, this could not be confirmed by elisa as a general mechanism for all azole fungicides under investigation. only for rats fed 1000 ppm cyproconazole were there indications of a moderate reduction of 25-oh vitamin d, which would correlate with the previously reported moderate increase in serum pth for this group. hansen's disease during pregnancy and lactation: two babies born to a mother using antileprosy drugs. z. ozturk. hansen's disease, also known as leprosy, during pregnancy has rarely been reported in europe and the united states. early diagnosis is important, and medication can decrease the risk of those living with leprosy patients acquiring the disease. this report presents a case of multidrug antileprosy therapy during pregnancy and lactation. a 26-year-old multiparous woman with a known case of multibacillary leprosy presented with an unplanned pregnancy. her pregnancy was discovered in the 9th week, and she had been taking multidrug therapy (dapsone 100 mg/day, rifampicin 600 mg/month, clofazimine 50 mg/day and clofazimine 300 mg/month) for the past 8 months. the diagnosis of leprosy had been established during her previous pregnancy. the patient was informed about the risks of the drugs used in pregnancy. the treatment was continued unchanged during pregnancy. a detailed fetal ultrasonography was offered to scan the development of the fetus at about 20 weeks. in the 8th, 22nd and 28th weeks of pregnancy, prenatal sonographic examinations revealed normal fetal growth and amniotic fluid volume. at 28 weeks of gestation, she was diagnosed with gestational diabetes. the diabetes did not cause any symptoms during pregnancy, and it was controlled with a reduced-calorie diet within a week. the patient delivered a healthy baby girl by vaginal birth in the 39th week of gestation without perinatal complications. the baby was also healthy (apgar 8-9, 3300 g, 51 cm), and her growth and development were normal during a 6-month follow-up period. 
the patient decided to breastfeed while taking medication. she had previous experience with the use of anti-leprosy drugs while breastfeeding; her other child was 15 months old and healthy. as in the first child, skin discoloration was observed in the newborn due to clofazimine during lactation. after 3 months, she stopped breastfeeding, and the infant's skin changes were reversed. for pregnant women and practitioners, treatment of leprosy in pregnancy can be complicated. physical and neurological damage may be irreversible even if the disease is cured. multidrug therapy consisting of dapsone, rifampicin and clofazimine is highly effective for people with leprosy and is considered safe, both for the mother and the child. antileprosy drugs are excreted into human milk, but there are no reports of adverse effects except for skin discoloration of the infant due to clofazimine. therefore, multidrug therapy for leprosy patients should be continued unchanged during pregnancy and lactation. methods: individuals included in the analysis were participants of the berlin initiative study (bis). bis is a population-based prospective cohort study initiated in 2009 in berlin, germany, to evaluate kidney function in people ≥ 70 years. medication was assessed through personal interviews and coded using the anatomical therapeutic chemical classification system. for estimation of the glomerular filtration rate (egfr) we used the ckd-epi creatinine equation. predictor analysis was conducted via logistic regression. results: figure 1 illustrates the percentage of drug use for the three noacs and phenprocoumon, the most common vitamin k antagonist in germany, over the course of four years. table 1 shows the characteristics of patients for each oral anticoagulant group during the four-year follow-up visit (from january 2014 until april 2015). the probability of dabigatran use rose with increasing age (+12%), and the probability of phenprocoumon use rose in case of egfr < 60 ml/min/1.73m² (+54%) or male sex (+82%). 
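the ckd-epi creatinine equation used above for egfr estimation can be sketched as follows. this is an illustrative implementation of the published 2009 ckd-epi creatinine formula, not the study's actual analysis code; the function name is hypothetical and the race coefficient of the original equation is omitted for brevity.

```python
import math

def egfr_ckd_epi_cr(scr_mg_dl: float, age_years: float, female: bool) -> float:
    """ckd-epi 2009 creatinine equation (race coefficient omitted).

    scr_mg_dl: serum creatinine in mg/dl; returns egfr in ml/min/1.73 m^2.
    """
    kappa = 0.7 if female else 0.9          # sex-specific creatinine threshold
    alpha = -0.329 if female else -0.411    # sex-specific exponent below kappa
    ratio = scr_mg_dl / kappa
    egfr = (141.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.209
            * 0.993 ** age_years)
    if female:
        egfr *= 1.018
    return egfr

# classification against the 60 ml/min/1.73 m^2 cutoff used in the study,
# for a hypothetical 75-year-old male participant with creatinine 1.4 mg/dl:
reduced_kidney_function = egfr_ckd_epi_cr(1.4, 75, female=False) < 60
```

the min/max split applies one exponent below the sex-specific creatinine threshold kappa and another above it, which is what distinguishes ckd-epi from the older mdrd formula.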
discussion: our data show that noac use has also increased in the elderly over the past years. characteristics such as age, sex or kidney function had an impact on the choice of oral anticoagulation. objective: orthostatic hypotension (oh) is an important factor in determining cardiovascular mortality, especially at older age. several factors have been discussed as influencing oh; arterial stiffness, medication and frailty have been demonstrated to be modifying factors of oh. the aim of this study was to assess the prevalence of and influencing factors on oh in nursing home residents (nhr) in germany. methods: systolic (sbp) and diastolic (dbp) blood pressure as well as pulse pressure (pp) and pulse wave velocity (pwv) as markers of arterial stiffness were measured in nhr aged ≥ 65 years in 12 nursing homes in berlin, germany. measurements were first performed in the sitting position and then repeated after standing up. oh was defined as a sbp decrease of > 20 mmhg and/or a dbp decrease of > 10 mmhg within 3 min after standing up. hypertension was defined as the presence of a diagnosis of arterial hypertension, the prescription of at least one antihypertensive drug, or mean sbp values > 139 mmhg and/or mean dbp > 89 mmhg. information about antihypertensive medication was obtained from interviews and medical records. frailty was determined by geriatric assessments, e.g. the "timed up and go test" (tug) or the barthel scale. results: oh testing could be performed with 96 nhr (mean age = 84.5 ± 7.3 years). in total, 15 subjects (15.6%) had oh. the mean change in sbp from sitting to standing was 19.2 ± 15 mmhg (range +8.5 to -52.5 mmhg) in patients with oh and 1.5 ± 10.9 mmhg (range +44.5 to -17 mmhg) in patients without oh. mean sbp was significantly higher (143.6 ± 17.1 mmhg) in people with oh than in those without (131.5 ± 20.1 mmhg). all of the nhr with oh were hypertensive, compared to 89% of the nhr without oh. 
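the two case definitions used in this study translate directly into simple decision rules. a minimal sketch, assuming the 3-minute measurement window is already satisfied by the inputs; the function names are illustrative, not from the study:

```python
def has_orthostatic_hypotension(sbp_sit: float, dbp_sit: float,
                                sbp_stand: float, dbp_stand: float) -> bool:
    """oh per the study definition: sbp drop > 20 mmhg and/or
    dbp drop > 10 mmhg within 3 min after standing up."""
    return (sbp_sit - sbp_stand) > 20 or (dbp_sit - dbp_stand) > 10

def is_hypertensive(diagnosed: bool, n_antihypertensives: int,
                    mean_sbp: float, mean_dbp: float) -> bool:
    """hypertension per the study definition: documented diagnosis,
    >= 1 prescribed antihypertensive drug, or mean sbp > 139 mmhg
    and/or mean dbp > 89 mmhg."""
    return (diagnosed or n_antihypertensives >= 1
            or mean_sbp > 139 or mean_dbp > 89)
```

note that the oh rule is a logical "or", so a sufficient drop in either pressure alone triggers the classification.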
sex, mean age, pwv and pp were not significantly different between individuals with or without oh (p>0.05). medication data were available for 89 patients. all individuals with oh and 60 nhr without oh (80%) received antihypertensive medication. more than 2 different antihypertensive drugs were taken by 11 patients with oh (78.5%) and by 43 patients without oh (57.3%). the intake of beta-blockers had no impact on oh development. geriatric assessments did not differ significantly between the oh group and the non-oh group. more than 75% of patients in both groups reached 80 points as maximum on the barthel scale, defining a need for assistance, and tug analyses demonstrated that around 50% of patients with oh as well as patients without oh needed more than 19 sec, indicating motor slowing. conclusion: we found a relatively low prevalence of oh in our very old patient cohort, and overall bp control was good. similar to earlier publications, mean sbp was significantly higher in nhr with oh. none of the other investigated factors were associated with the occurrence of oh. the small cohort size might have limited the detection of cardiovascular, epidemiological or geriatric associations. in addition, important confounding factors such as the inability of some nhr to stand and the lack of standardized frailty assessments must be addressed. impact of reticulated platelets on the initial antiplatelet response to thienopyridine loading in patients undergoing elective coronary intervention. c. stratz, t. nuehrenberg. are known to be involved in cell metabolism pathways and therefore ccrcc is supposed to be a metabolic disease. in order to facilitate a better understanding of cancer metabolism and to support tumor classification on the metabolite level, we have developed a novel analytical approach for comprehensive metabolomic profiling of small molecules and lipids in kidney tissue. 
the method was established and validated based on porcine tissue and, as proof of concept, applied to a small cohort of human normal and ccrcc tissue samples for molecular tissue differentiation. methods: five fresh-frozen ccrcc samples and corresponding normal tissue were used for cancer-specific metabolomic profiling and were derived from patients who underwent partial or radical nephrectomy. metabolites and lipids were recovered from tissue samples by a two-step extraction protocol. tissue homogenization and extraction of polar metabolites was performed in methanol/water (aqueous extract) by a bead-beating approach. lipids were recovered by consecutive extraction of the pellet with methanol/methyl tert-butyl ether (organic extract). metabolites in aqueous extracts were separated by hydrophilic interaction liquid chromatography, whereas compounds in organic extracts were separated by reversed-phase chromatography prior to high-resolution mass spectrometry. results: the reproducibility of tissue extraction and metabolite analysis was assessed by the analysis of multiple individually prepared porcine kidney samples. more than 1000 metabolic features including amino acids, nucleotides, small organic acids, phospholipids, sphingolipids, glycerolipids and fatty acids could be reproducibly (cv ≤ 30%) analyzed with the novel non-targeted metabolomics approach. the validated protocol was applied for metabolomic profiling of kidney tissue derived from ccrcc patients. based on unsupervised multivariate statistics, a clear differentiation between cancerous and normal tissue could be observed for the small-metabolite profile as well as for the lipid profile. a first subset of differentially regulated metabolites responsible for tissue differentiation could be tentatively identified. conclusion: metabolomic profiling of kidney tissue extracts enables differentiation between ccrcc and normal kidney tissue samples based on the lipid and small-molecule metabolomic profiles. 
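the reproducibility criterion above (cv ≤ 30% across replicate preparations) is a standard feature filter in non-targeted metabolomics. a minimal sketch of such a filter, with hypothetical feature names; the actual processing pipeline of the study is not described in the abstract:

```python
from statistics import mean, stdev

def reproducible_features(replicates: dict[str, list[float]],
                          cv_cutoff: float = 30.0) -> list[str]:
    """keep features whose coefficient of variation
    (cv = sd / mean * 100) across replicate extractions
    is at or below the cutoff (30% in the abstract)."""
    keep = []
    for feature, values in replicates.items():
        cv = stdev(values) / mean(values) * 100.0
        if cv <= cv_cutoff:
            keep.append(feature)
    return keep

# two hypothetical features measured in three replicate porcine extracts
qc = {"alanine": [100.0, 110.0, 90.0],   # cv = 10% -> retained
      "noisy_feature": [10.0, 30.0, 50.0]}  # cv ~ 67% -> discarded
```

the cv-based filter removes features whose variability is dominated by the extraction and measurement process rather than by biology, before any multivariate statistics are applied.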
further studies on larger and independent sample groups are necessary to confirm and validate our preliminary findings. in summary, the presented approach provides a first basis for comprehensive metabolomics studies in human kidney tissue and thus offers great potential for the metabolic characterization of ccrcc, with important prognostic and therapeutic implications in the future. introduction: clomiphene (clom) citrate, a mixture of the trans- and cis-isomers (60:40), is the first-line therapy for the treatment of infertility caused by the polycystic ovary syndrome. the treatment schedule includes dose escalation from 50 mg/d clom citrate up to 150 mg/d in case of non-ovulation. however, therapy outcome is variable and approximately 10-30% of patients do not benefit from clom treatment. the pro-drug clom is bioactivated via 4-hydroxylation of trans-clom by the highly polymorphic cytochrome p450 (cyp) 2d6, leading to the major active metabolite trans-4-hydroxyclomiphene (trans-4-oh-clom) [1]. recently, we identified the less active trans-3-oh-clom, which is also formed by cyp2d6. besides the formation of the active metabolites, their plasma concentrations are influenced by their clearance, e.g. via glucuronidation and sulfation. here we investigated the glucuronidation and sulfation of both hydroxy metabolites. methods: the isoforms of udp-glucuronosyltransferase (ugt) and sulfotransferase (sult) responsible for conjugation of oh-clom were identified using commercially available supersomes. glucuronidation and sulfation kinetics were determined in pooled human liver microsomes. conjugated clom metabolites were quantified in plasma and urine samples obtained from healthy female volunteers who received a single dose of 100 mg clom citrate. results: incubations with human liver microsomes revealed an almost 60-fold higher glucuronidation rate for trans-3-oh-clom, which is exclusively catalyzed by ugt2b7, compared to the more potent trans-4-oh-clom. 
for the latter, a pattern of multiple ugts was identified. in contrast, the intrinsic clearance of trans-4-oh-clom to its sulfate is 16-fold higher compared to trans-3-oh-clom. for both metabolites a participation of sult1a1 and sult1e1 was identified. these results are in line with previous studies, which identified the same sults [2] and ugts [3] as responsible for the conjugation of the structurally related trans-4-hydroxytamoxifen. in addition, in vivo data from plasma and urine samples confirmed the reverse regioselective glucuronidation and sulfation of trans-3-oh-clom and trans-4-oh-clom. overall, concentrations of clom glucuronides were significantly higher than those of the sulfates. the highest concentrations in plasma and urine samples were measured for trans-clom-3-o-glucuronide. conclusion: our results suggest a new metabolic route via trans-3-oh-clom, which appears to be a potential inactivation pathway of clom. institut für pharmakologie und toxikologie der bundeswehr, münchen, germany. for decades the biological effect of sm has been investigated. it is well known how sm interacts with and destroys cells. unfortunately, it is still unknown if and how a cell can become resistant against sm. in the experiments described here we investigated a new approach, adapting cells to the presence of sm. over a time period of nearly three years the cells were cultivated in the presence of sm at increasing concentrations. before starting, the initial sm sensitivity was investigated. at the beginning, cells were cultivated with a concentration of 0.07 µm sm (ic10). today the cells are able to tolerate a concentration of 7.2 µm sm (ic90), which corresponds to a concentration at which 90% of the original cells would have died. to determine cellular characteristics, the resistant cells were compared with wild-type cells. 
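the ic10 and ic90 values used in the adaptation protocol above can be related to each other through a concentration-response model. as an illustrative sketch only (the abstract does not state how the ic values were fitted), assuming a standard hill-type inhibition curve with the midpoint ic50 and slope parameter h:

```python
def ic_x(ic50: float, hill: float, x: float) -> float:
    """concentration producing x% effect for a hill-type curve
    effect(c) = c**h / (c**h + ic50**h), solved for c:
    ic_x = ic50 * (x / (100 - x)) ** (1 / h). illustrative only."""
    if not 0 < x < 100:
        raise ValueError("x must be strictly between 0 and 100")
    return ic50 * (x / (100.0 - x)) ** (1.0 / hill)
```

with a unit midpoint and a slope of 1, the ic90 sits 9-fold above the ic50 and the ic10 9-fold below it; steeper curves (larger h) compress this spread, which is why the roughly 100-fold gap between the reported 0.07 µm (ic10) and 7.2 µm (ic90) concentrations is consistent with a fairly shallow response curve.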
the following cell characteristics were investigated: proliferation, apoptosis, clonogenicity, size of nuclei and cytoplasm, cell-cell contacts, dna adduct formation, secretome, screening of mirna expression, next-generation sequencing, vital observation and scratch assay, nad(p)+/nad(p)h, h2o2, glutathione, ca2+ influx, mdr channels, resistance to other alkylating agents and the reversibility of the resistance. the resistant cells demonstrate smaller nuclei and cytoplasm, fewer dna adducts, a higher clonogenicity as well as proliferation, and less apoptosis. the secretome analysis showed an up-regulation of the anti-apoptotically acting cytokines timp and ang and the pro-proliferatively acting cytokines timp and pdgf-aa. in contrast, immunologically active cytokines were down-regulated. concerning cell-cell contacts, no differences were seen. in the mirna screening, 49 significantly up-regulated and 20 significantly down-regulated mirnas were observed. noteworthy was the regulation of various members of 11 different families. during vital observation and in a scratch assay the resistant cells were shown to have disadvantages. the observed resistance was not unique for sm but extended to other alkylating agents and cytostatic drugs. when analyzing reversibility, the cells stayed resistant over more than 35 weeks. in conclusion, many of the aspects investigated in this study have an influence on sm resistance, suggesting that a combination of various effects is involved in switching on resistance; most likely, many aspects work together. the present results are an important step in the characterization of the sm-resistant cell line, and further studies may be able to use these directly as a starting point for target identification in antidote or prophylactic agent discovery. the arylhydrocarbon receptor (ahr) is localized in a cytosolic complex that contains several co-chaperones and associated factors. 
the protein is shifted into the nucleus in response to endogenous and xenobiotic ligands. however, transient nuclear transport also occurs in the absence of any ligands, while the predominant cytoplasmic compartmentalization is maintained by parallel export. we have analyzed the interplay between this basal nucleo-cytoplasmic shuttling and ligand-induced transport in hepg2 cells, using a yfp-tagged fusion protein that is capable of responding to ligands and of triggering the induction of cyp1a1 expression. basal import was assessed in cells that had been treated with leptomycin b (lmb), an inhibitor of crm1-mediated nuclear export. interestingly, the apparent ahr import rate in lmb-treated cells was comparable with nuclear import as triggered by xenobiotic (β-naphthoflavone) or endogenous (kynurenine) ligands. this observation was confirmed for endogenous ahr in hepg2 cells, since both ligands and lmb showed comparable effects on nuclear compartmentalization. however, the basal nuclear import rate in lmb-treated cells was strongly increased by ahr ligands. ligand-induced nuclear transport was therefore confirmed as an import step in receptor activation. interestingly, lmb also accelerated nuclear import of ahr after pretreatment of cells with ahr ligands. these data suggest that nuclear export of the ahr is maintained in the presence of ligands. receptor activation might therefore comprise several rounds of shuttling, thereby involving both accelerated import and continued export of the ahr protein fraction that has not already undergone interactions with arnt or dna. we suggest that nuclear export provides an additional kinetic control of ahr activation and function. mitochondrial toxicology: rescuing mitochondria in wilson disease avoids acute liver failure. h. 
zischka, institut für molekulare toxikologie und pharmakologie, ag zischka, neuherberg, germany. in wilson disease (wd), functional loss mutations in the hepatocyte atp7b gene cause dramatic copper overload leading to acute liver failure, posing an unmet therapeutic issue. we find that the pathology of severe wd cases is mirrored in lpp(-/-) rats carrying a functional loss atp7b mutation. this is especially apparent in the hepatocyte mitochondrial compartment. a progressive copper deposition increasingly harms the life-sustaining mitochondrial membrane integrity. thus, depleting this devastating mitochondrial copper burden is a core requirement for a treatment strategy against acute liver failure in this wd animal model. preparation for the master degree program in toxicology started in 2006 as a cooperation of charité universitätsmedizin berlin with the university of potsdam and other institutions of the region. the first enrollment of students took place in 2008. the program was accredited in 2011 by the central evaluation and accreditation agency. it offers a modern curriculum encompassing a wide variety of scientific aspects with an interdisciplinary character. this training program in toxicology is organized in modules and ends with the degree "master of science" (m.sc.). the goal of this program in toxicology is to teach the basis of the interactions between substances at toxic concentrations and living organisms, as well as the molecular mechanisms of the adverse effects of chemicals. the understanding of the mechanism of a toxic action is an important prerequisite for the scientifically based evaluation of a hazard associated with a substance. furthermore, only with knowledge of the mechanism of action and a deduction of structure-activity relationships is it possible to predict toxic effects of new substances. 
this knowledge should enable students to perform a risk evaluation of chemicals or to predict the adverse effects of chemicals, with the aim that human beings and the environment can be protected from harmful consequences of chemical exposure. the program allocates 30 places per year to an average of 60 applicants. most applicants have basic training in the fields of biology, chemistry, pharmacy, veterinary medicine and nutritional sciences. about 75% of the students are female. the majority of them have a bachelor's degree before starting the master program; other degrees are the diploma and the state examination as pharmacists or physicians. ninety percent of the students pass the final examination, consisting of the master's thesis and disputation, at the end of the four semesters. afterwards, most of the graduates aim to obtain a phd degree. the program is well established in the education of toxicologists in germany. respiratory injury due to chlorine developed from consumer products: still an issue in germany. u. stedtler, m. hermanns-clausen, uniklinikum freiburg, vergiftungs-informations-zentrale, freiburg, germany. objective: in the last decades, strong efforts have been made to improve product safety, especially for products intended for domestic use. hypochlorite-containing cleaners may develop chlorine gas when acidified, e.g. by adding an acidic sanitary cleaner. usually these cleaners contain sodium hydroxide or other strong alkalis to avoid this reaction. we analysed reports to our poisons center concerning inhalation exposure to chlorine developed from hypochlorite-containing mixtures. method: retrospective search in the case database of the poisons center. human inhalative exposures to chlorine released from mixing hypochlorite as well as human inhalative exposures to hypochlorite alone were analysed. frequency and symptoms were compared. 
results: from 2010 to 2015, in total 85 cases of human exposure to chlorine developed from mixtures of hypochlorite and acids (0.8 of 1000 cases) were registered. in 55 cases the exposure was due to mixtures of products intended for domestic use. 94% of the exposed patients reported symptoms. only in two cases were the symptoms not considered to be caused by the inhalation accident. the most frequent symptoms reported were (percent of symptomatic patients): cough (45%), dyspnoea (33%), irritated upper airway (26%), abdominal discomfort (pain, nausea, vomiting) (21%), thoracic pain (20%), irritated eyes (11%), dizziness (8%), and bronchospasm (6%). further symptoms were malaise, headache, irritated nose, sweating, muscle pain, and others. in 12 patients (14%) the symptoms were graded as moderately severe. the main symptoms in this group were dyspnoea (83%), cough, and irritated airway. one third of these patients experienced bronchial obstruction. all symptomatic patients developed symptoms while exposed or shortly after exposure. there were no severe or fatal cases (in particular no lung edema) and all symptoms were expected to resolve completely. because hypochlorite-containing products spontaneously release "chlorine-like" smelling gases, we additionally analysed inhalation exposures to hypochlorite solutions alone in the same period. there were 42 patients in the same period exposed to hypochlorite evaporation alone. 36 of them (86%) had symptoms, of which in 30 cases these were considered to be caused or possibly caused by the hypochlorite. the most frequent symptoms were irritated upper airway (33%), nausea or vomiting (30%), cough (23%), and irritated eyes (20%). dyspnoea was less frequent than in the mixture group (10%). all symptoms were considered mild. there was no bronchospasm or thoracic discomfort. conclusion: respiratory injuries by chlorine from hypochlorite-containing solutions still occur despite clear warnings on the label. 
the majority of cases were due to products for domestic use. symptoms develop shortly after exposure. the γh2ax assay for genotoxic and nongenotoxic agents: comparison of h2ax phosphorylation with cell death response; perturbation of mitosis through inhibition of histone acetyltransferases: the key to ochratoxin a toxicity and carcinogenicity?; regulation of chromatin by histone modifications; transcriptomic alterations induced by ochratoxin a in rat and human renal proximal tubular in vitro models and comparison to a rat in vivo model; in vitro gene expression data supporting a dna non-reactive genotoxic mechanism for ochratoxin a; fragment ion patchwork quantification for measuring site-specific acetylation degrees; combinatorial patterns of histone acetylations and methylations in the human genome; inroads to predict in vivo toxicology - an introduction to the etox project; value of shared preclinical safety studies - the etox database. acknowledgements: support of the bfr through grant 1322-530 is gratefully acknowledged. acknowledgement: supported by the robert bosch foundation, stuttgart, germany. [1] mürdter t, et al. hum mol genet, 2011, 21:1145-54. [2] nishiyama t. et al., biochemical pharmacology, 2002, 63:1817-1830. [3] sun d. et al., drug metabolism and disposition, 2007, 35:2006. background: infections are a major problem in patients with burn diseases (bd). due to severe injuries of their total body surface area (tbsa), burn patients have altered pharmacokinetic characteristics. therefore, insufficient plasma concentrations may be achieved when standard dosing schedules are applied for antibiotics such as piperacillin. for time-dependent antibiotics, the duration for which the drug concentration exceeds the minimal inhibitory concentration (mic) is crucial for their antibacterial effects. pseudomonas spp. is the main problematic pathogen for bd patients.
the aim of the present study was to monitor the plasma concentrations of piperacillin during piperacillin/tazobactam treatment in bd patients. patients from intensive care units (icu) served as controls. methods: 10 bd patients (5/5 m/f, 43.4±5.3y, tbsa 40.9±5.9%) and 5 patients (74.4±3.9y) from the icu were included in this observational study. blood samples were taken within the 3rd interval of the 8h dosing period of piperacillin/tazobactam (4/0.5g over 0.5h) at 1, 4 and 7.5h after the end of infusion. total and free piperacillin concentrations were determined in plasma using hplc-uv after deproteinisation with acetonitrile and by ultrafiltration, respectively. pharmacokinetic parameters and dosing simulations were calculated with tdmx (www.tdmx.eu). free plasma concentrations of piperacillin exceeding at least 1xmic, but preferably 4xmic, over the whole dosing interval were considered sufficient for antibiotic efficacy (mic 16 mg/l for pseudomonas spp., www.eucast.org). results: the pharmacokinetic parameters of total piperacillin, calculated for each bd or icu patient using the concentrations at 1, 4, and 7.5h, were as follows: cmax 69.6±7.9 vs. 116.3±10.4 mg/l, p<0.05; half-life 1.8±0.3 vs. 2.3±0.3h, p>0.05; clearance 12.6±1.9 vs. 6.5±0.4 l/h, p<0.05; volume of distribution 27.8±2.0 vs. 20.9±1.3l, p<0.05. free concentrations (which were included in the tdmx calculations) were 87±2 vs. 81±2% (p<0.05) of total concentrations. the duration per day during which concentrations exceeded 1xmic (15.6±1.9 vs. 22.0±1.1h, p<0.05) or 4xmic (5.4±1.3 vs. 9.4±0.7h, p<0.05) was lower in bd than in icu patients. moreover, tdmx simulations predicted that the duration per day above 4xmic could be extended to 16.2±1.5h if the piperacillin dose were increased to 4x8 g/d and the infusion duration to 3h. the pharmacokinetic parameters have, however, to be determined in a pilot study with bd patients to confirm the predicted values.
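as a rough illustration of the time-above-mic arithmetic behind these figures (not the tdmx model itself, which fits the full infusion profile), one can assume simple monoexponential decay from the post-infusion peak; the input values below are the reported bd-patient means:

```python
import math

def time_above_mic(cmax_total, free_fraction, t_half, mic, tau):
    """Hours per dosing interval during which the free concentration
    stays above the MIC, assuming monoexponential decay from the peak."""
    k = math.log(2) / t_half                 # elimination rate constant (1/h)
    c0 = cmax_total * free_fraction          # free peak concentration (mg/l)
    if c0 <= mic:
        return 0.0
    return min(math.log(c0 / mic) / k, tau)  # cap at the dosing interval

# reported bd-patient means: cmax 69.6 mg/l total, 87% free,
# half-life 1.8 h, mic 16 mg/l for pseudomonas spp., 8-h interval
t_interval = time_above_mic(69.6, 0.87, 1.8, 16.0, 8.0)
t_per_day = t_interval * (24.0 / 8.0)
```

this peak-decay sketch ignores the infusion phase and therefore underestimates the coverage obtained from the full tdmx fit; it only illustrates how half-life, free fraction and mic interact.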
conclusions: standard dosage regimens for piperacillin/tazobactam may result in suboptimal plasma concentrations of piperacillin in bd patients as well as in icu patients. drug monitoring and tdmx simulation of kinetic parameters may easily help to improve piperacillin treatment in bd patients. background: high-dose methotrexate (hd mtx), defined as >1000 mg mtx/m2 body surface area (bsa), has been used in children to treat a variety of malignant diseases since the 1950s. clinicians observe relevant rates of severe unwanted side effects. identifying patients at an increased risk of toxicity due to altered mtx pharmacokinetics is urgently needed. we aim to develop and evaluate a physiology-based pharmacokinetic (pbpk) model for hd mtx in children using pk-sim® (bayer technology services gmbh, leverkusen, germany), with special emphasis on relevant covariates. methods: in this non-interventional observational study, children receiving hd mtx intravenously at two major german pediatric oncology departments during the years 2004-2009 were included if at least one mtx serum level (mtx-sl) was determined during clinical routine. 29 patients aged 2-18 years (male = 19, female = 10) with the following diagnoses were included: acute lymphoblastic leukemia, non-hodgkin lymphoma, burkitt lymphoma, brain stem glioma and glioblastoma multiforme. in total, 103 mtx treatment cycles corresponding to 300 mtx-sl were used in this study. patients were randomized into two sets (training set and test set). based on literature data, mtx pbpk models were developed and slightly adapted, taking into account the mean relative deviation (mrd) and bias of predicted versus observed mtx-sl of the training set. the pbpk model with the lowest mrd and bias was chosen and finally evaluated using the test set.
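the two selection metrics can be written down directly; the definitions below follow one common convention in pbpk model evaluation (mrd as 10 to the root-mean-square log10 error, bias as the geometric mean fold error), and the concentration values are hypothetical:

```python
import math

def mrd(pred, obs):
    """Mean relative deviation: 10**rms(log10(pred/obs)).
    1.0 means perfect agreement; larger values, larger average deviation."""
    sq = [(math.log10(p / o)) ** 2 for p, o in zip(pred, obs)]
    return 10 ** math.sqrt(sum(sq) / len(sq))

def bias(pred, obs):
    """Geometric mean fold error: >1 indicates average over-prediction,
    <1 average under-prediction."""
    logs = [math.log10(p / o) for p, o in zip(pred, obs)]
    return 10 ** (sum(logs) / len(logs))

observed = [10.0, 5.0, 1.0]    # hypothetical mtx serum levels
predicted = [12.0, 4.0, 1.1]   # hypothetical pbpk predictions
```

comparing candidate models by both metrics together guards against a model that looks unbiased only because over- and under-predictions cancel out.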
the impact of the covariates urine ph <6.5, trimethoprim/sulfamethoxazole, proton-pump inhibitors, non-steroidal anti-inflammatory drugs and β-lactam antibiotics on the prediction quality was assessed using the mann-whitney u test. ochratoxin a (ota) is a widespread food contaminant and one of the most potent renal carcinogens [1]. recent data by our group demonstrate that ota inhibits histone acetyltransferases (hats), thereby causing a global reduction of lysine acetylation of histones and non-histone proteins [2]. based on these findings and the importance of specific histone acetylation marks in regulating gene transcription [3], we speculated that repression of gene expression, the predominant transcriptional response to ota [4, 5], may be linked to loss of histone acetylation. in this study we therefore used a novel mass spectrometry approach, which is based on chemical acetylation of unmodified lysine residues of histones using 13c-labeled acetic anhydride and subsequent calculation of the degree of acetylation from the measured intensities of heavy and light acetylated isotopologues [6], to identify and quantify site-specific alterations in histone acetylation in human kidney epithelial (hk-2) cells treated with ota. our results demonstrate ota-mediated loss of acetylation at almost all important lysine residues of histones h2a, h2b, h3 and h4. we further selected acetylation at histone h3 lysine 9 (h3k9), a well-known euchromatic hallmark that is elevated at promoter regions of transcriptionally active genes [7] and which was reduced from ~3% in controls to <0.1% in response to ota, to establish a link between loss of h3k9 acetylation and expression of genes consistently shown to be down-regulated in response to ota [4, 5].
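the quantification step can be illustrated with a toy calculation: after chemical acetylation with the 13c-labeled anhydride, each lysine carries either a light (endogenous) or a heavy (chemically added) acetyl group, so the degree of acetylation follows directly from the two isotopologue intensities. the intensity values below are invented to mirror the reported h3k9 shift:

```python
def acetylation_degree(light_intensity, heavy_intensity):
    """Fraction of a lysine site that was endogenously acetylated:
    light (12C) acetyl marks pre-existing acetylation, heavy (13C) acetyl
    marks residues that were unmodified before chemical labeling."""
    return light_intensity / (light_intensity + heavy_intensity)

control = acetylation_degree(3.0, 97.0)     # ~3% acetylated (control-like)
treated = acetylation_degree(0.08, 99.92)   # <0.1% (ota-treated-like)
```

because each site is fully labeled one way or the other after the chemical step, the ratio is internally normalized and needs no external standard curve.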
using chromatin immunoprecipitation followed by quantitative real-time pcr (chip-qpcr), we observed ota-mediated loss of h3k9 acetylation at the promoter regions of the selected genes (% of controls: amigo2: 45%, clasp2: 60%, ctnnd1: 54%). overall, these data provide first evidence for a mechanistic link between h3k9 hypoacetylation, as a consequence of ota-mediated inhibition of hats, and repression of gene expression by ota. a new paradigm to assess the proarrhythmic potential of drugs is proposed by the cipa (comprehensive in vitro pro-arrhythmia assay) initiative, combining a suite of a priori in vitro assays (the 7 ion channels most important for cardiac activity) coupled to in silico reconstructions of the cellular cardiac action potential (ap). the etox consortium has developed a multiscale in silico simulation model based on o'hara/rudy incorporating the principles of this new paradigm. the core model simulates the effects of drugs on a virtual cardiac tissue composed of different types of cardiomyocytes. the input to this model, the blockade of a set of 3 ion channels (ikr/herg, iks, ical), can be obtained experimentally or predicted using advanced 3d-qsar models. the system predicts the % change of the qt interval at different drug concentrations in order to facilitate risk assessment. this in silico model was validated using purkinje fiber assay results (input: ap prolongation and arrhythmogenic risk assessed by the occurrence of early after-depolarisations) from 500 in-house drug candidates. the validation showed that predictivity is highly dependent on the model's applicability domain (ad): for some chemical series the proarrhythmic potential could not be identified; for others, however, most of the positive drugs were correctly predicted, with sensitivities up to 80-90% (average prediction accuracy was 70%). retraining of this model with additional internal data should help to improve the model's ad and predictivity.
it is important to note that ap prolongation was correctly predicted for many proarrhythmic drugs with only weak (>30 µm) in vitro herg inhibition. furthermore, the model showed high additional benefit for read-across. bayer pharma ag, investigational toxicology, berlin, germany. etox [1, 2] started in 2010 and is a public-private partnership project within the european innovative medicines initiative (imi) [3]. the etox project is building a toxicology database relevant to pharmaceutical development and elaborating innovative strategies and software tools. the overall goal is to better predict the toxicological profiles of new chemical entities in early stages of the drug development pipeline, based on existing in vivo study results contributed by the participating efpia* companies in the consortium. the etox database is a relational database with a specifically designed schema to store complex and comprehensive preclinical safety data such as the study design, toxicokinetics, adme data, clinical chemistry, hematology, gross necropsy, histopathological findings and general toxic effects. in addition, relevant data from public sources have been included in the database. the primary focus for data collection is systemic toxicity (up to 4 weeks) repeated-dose studies, mostly in rodents. overall, more than 7000 study reports for approximately 1400 investigated compounds have been collected. in order to optimize the usage and mapping of data from different sources, the development of common ontologies was a key task within the project. this time-consuming step was necessary to make high-quality read-across analysis possible and valuable. therefore the ontobrowser [4] tool was developed to curate and harmonize the verbatim terms into the standardized terms used within the etox database. until now, more than 13 million verbatim terms have been curated.
in addition to the toxicology database, a web-based user interface called etoxsys was developed to allow the retrieval of toxicity information as well as the prediction of toxic endpoints for chemical compounds. owing to its complex search capabilities, the database can be queried for structural similarity, similar target classes and specific toxicological endpoints. approximately 150 prediction models based on public data are available, and the first models based on in vivo data are in development. the etox database therefore represents a valuable tool for the early animal-free assessment of drug candidates [5]. * european federation of pharmaceutical industries and associations. cell lines. background: consumers are constantly exposed to chemical mixtures, e.g. to multiple residues of different pesticides via the diet. this raises questions concerning potential cumulative effects, especially for substances causing toxicity by a common mode of action. since substances are tested for regulatory purposes on an individual basis at generally high dose levels, there is only limited data available on potential mixture effects, especially in the low-dose range. with more than 400 active substances approved for use in pesticides and over 100000 chemicals registered under reach, there are more possible combinations than one could test with classical animal experiments. the development of in vitro tools for the assessment of mixture effects is consequently of tremendous importance. methods: as a first step in the development of such in vitro tools, we used a group of fungicides, (tri-)azoles, as model substances in a set of different cell lines from known target tissues, mainly liver (human: hepg2, heparg; rat: h4iie) and adrenal gland (human: h295r). concentrations were derived from measured tissue concentrations in vivo to ensure that the concentrations of the (tri-)azoles used reflect realistic effect levels.
the cell lines were exposed to the triazoles cyproconazole and epoxiconazole as well as to the azole prochloraz, as individual substances and in binary or ternary combinations of these substances, at three dose levels and for three different time periods. the effects of the substances were subsequently analysed by transcriptomics and metabolomics. a support vector machine will be utilized to integrate the data from the different sources to gain a complete picture of the affected adverse outcome pathways and mechanistic information about the applied fungicides. first results indicate combination effects of the substances also at the omics level, depending on the specific endpoint and the concentration used. some of these are comparable to effects found with similar methods in a standard toxicity test, a 28-day feeding study in the rat, thus raising hope for the development of in vitro methods suitable to detect combination effects. background: plant protection and biocide products are chemical mixtures, which contain one or more active substances as well as several co-formulants (e.g. solvents, wetting agents, thickeners or preservatives). nevertheless, to this day extensive toxicological testing is performed only with the individual active substances, while the plant protection products are only evaluated for acute toxicity, i.e., a single-dose-group experiment with rats is performed, as well as testing for skin and eye irritation. current pesticide regulation foresees testing of potentially harmful mixture effects, but only when adequate methods are available, making the development of such methods a high priority. several published studies, both in vitro and in vivo, have shown enhanced toxic effects of plant protection products compared to the individual active substances.
methods: here we present effects of plant protection products as a whole, compared to the individual active substances or co-formulants, in a set of human cell lines of hepatic and renal origin (hepg2, heparg, hek293). cytotoxicity has been analysed by wst-1 and nru assays, as well as the gene expression of several marker genes involved in xenobiotic metabolism. additionally, reporter gene assays have been conducted for nuclear receptors such as ahr and car. results: while some active substances showed lower toxicity compared to the respective products, this cannot be confirmed as a general rule for all endpoints for all of the analysed fungicide or herbicide products containing active substances such as epoxiconazole, cyproconazole, azoxystrobin or glyphosate. chemical compounds may induce skin sensitization in humans, resulting in tolerance or allergic contact dermatitis after repeated exposure. mechanistically, the activation of dendritic cells is one of the prerequisites for the induction of skin sensitization. a subgroup of sensitizing chemicals, prohaptens, need metabolic activation, e.g. via cytochrome p450 (cyp) enzymes. thus, xenobiotic metabolism may crucially impact a chemical's potential for the induction of skin sensitization through activation, but also deactivation, of reactive molecules via conjugation, which determines the concentration and the chemical species available for protein haptenation and cell activation. we established a coculture model consisting of hacat keratinocytes and thp-1 cells as surrogate dendritic cells for the detection of sensitizing chemicals and found enhanced cyp1 enzyme activity in hacat cells exposed to benzo[a]pyrene (b[a]p) and eugenol, as well as clearly increased expression of the cell surface molecule cd86 on thp-1 cells after incubation with these prohaptens (hennen et al., 2011). here, we studied the impact of intercellular cross talk on activation and conjugation capacities in more detail.
treatment of thp-1 with b[a]p and eugenol in coculture with hacat cells augmented cyp1a1 and/or cyp1b1 mrna levels, while this was not found for thp-1 monoculture. augmentation of cyp1a1 mrna required the continuous presence of hacat cells. in coculture, levels of 3-oh-b[a]p as an exemplary cyp-dependent metabolite were increased compared to single cultures. in contrast, total glutathione contents as well as n-acetyltransferase 1 enzyme activities in both cell types were not modulated in coculture; furthermore, the capacity for sulfation/glucuronidation of 3-oh-b[a]p was maintained in coculture. additionally, the decrease of the total glutathione content in thp-1 cells by 2,4-dinitrochlorobenzene (dncb) was much less pronounced when cells were exposed in coculture with hacat cells, showing that hacat cells provide additional targets for cysteine-reactive chemicals such as dncb, diminishing the total amount of chemical available for thp-1 cells. overall, the results indicate that the cross talk between keratinocytes and antigen-presenting cells enhances their capacities for metabolic activation of chemicals, while hacat cells also provide supplementary capacities for phase ii reactions. references: hennen j et al. cross talk between keratinocytes and dendritic cells: impact on the prediction of sensitization. toxicol sci 2011; 123:501-510. toxicology - toxic pathway analysis/aop 395. background: reticulated platelets are associated with an impaired antiplatelet response to thienopyridine treatment. this interaction might be caused by intrinsic properties of reticulated platelets or by a decreased drug exposure due to high platelet turnover, reflected by reticulated platelets as a surrogate. we investigated the impact of reticulated platelets on the antiplatelet response to thienopyridines and whether this effect is linked to platelet turnover. methods: this study randomized elective patients to loading with clopidogrel 600mg or prasugrel 60mg (n=200).
adp-induced platelet reactivity was assessed by impedance aggregometry 30 to 120 minutes and on day 1 after loading, but before intake of the next dose of thienopyridines. the immature platelet count (ipc) was assessed as a marker of reticulated platelets by whole blood flow cytometry. results: platelet reactivity increased with rising tertiles of ipc (figure). this effect was more pronounced in patients on clopidogrel as compared to patients on prasugrel. overall, ipc correlated significantly with on-treatment platelet reactivity at 120min (r=0.21; p<0.001). this correlation did not change over time, indicating an effect independent of platelet turnover (comparison of correlations 120min/day 1: p=0.57 for clopidogrel, p=0.76 for prasugrel). conclusion: a high immature platelet count is associated with an impaired response to thienopyridine loading. this effect is independent of platelet turnover, indicating a relation to intrinsic properties of reticulated platelets. introduction: one of the biggest drawbacks of protein-based therapeutics with intracellular targets is their inability to enter the cytosol. targeted toxins are known to be used in drug delivery. the aim of the study was to target the epidermal growth factor (egf) receptor overexpressed on pancreatic carcinoma using a novel, well-defined targeted toxin consisting of egf fused to the toxic plant ribosome-inactivating protein dianthin, with a glycosidic triterpenoid (so1861) as efficacy enhancer. methods: the enzymatic activity of dianthin-egf was verified by an adenine release assay. the kinetics of cytotoxicity were evaluated in pancreatic adenocarcinoma bxpc-3 and miapaca-2 cells in comparison to the non-target cell line nih3t3 with an impedance-based real-time cell analyzer (xcelligence), and final cytotoxicity analyses were performed with conventional end-point mtt assays. the acute toxicity of dianthin-egf was studied in male balb/c mice.
a xenograft solid tumor model was developed in male nude mice by injecting bxpc-3 cells subcutaneously into the dorsal region. dianthin-egf was administered in the vicinity of the tumor and so1861 by subcutaneous injection at the neck. after the tumor had reached a diameter of 2 to 3 mm, a total of 6 treatments were given. tumor volumes and body weight shifts were monitored twice weekly to determine the potency of dianthin-egf when given alone and in combination with so1861, in comparison to placebo. immunohistochemical detection of the egf receptor was performed according to the manufacturer's instructions (dako, glostrup, denmark, k1492). complete blood count analysis was done by labor 28 gmbh, berlin. results: the adenine release mediated by dianthin-egf was 47.8 pmol adenine/pmol toxin/h. the in vitro efficacy of the targeted toxin was demonstrated by an ic50 value of approximately 1 nm for the egf receptor-expressing miapaca-2 and bxpc-3 cells, as compared to 100 nm for non-target nih3t3 cells. real-time measurement of cytotoxicity showed a dose-dependent decrease in cell viability from 10 pm to 1 µm. toxicity studies in balb/c mice revealed 0.4 µg/mouse to be non-toxic and the maximum tolerated dose (mtd), whereas 40 µg caused moribundity accompanied by white ocular discharge. efficacy studies were performed for a period of 28 days. under combination therapy, the average tumor volume, measured by a digital vernier caliper, was 80% less than for placebo, whereas under single therapy with dianthin-egf alone the tumor volume increased further, although it remained 50% less than for placebo. immunohistochemistry slides showed egf receptor expression in all untreated xenograft tumors, further confirming egf receptor overexpression in the target bxpc-3 cell line. an enlarged spleen was only observed in untreated xenografts.
no significant changes in various blood parameters (rbc counts, wbc counts, hgb, hct, mcv, mch and mchc) were observed on hematological analysis, except for the platelet (plt) counts, in comparison to healthy male nude mice. conclusion: combination therapy with so1861 proves to be a promising approach for the targeted delivery of toxins, rather than single therapy administering the targeted toxin alone. the strategy is specific for egf receptor-overexpressing tumors such as pancreatic cancer. introduction: moringa oleifera (mo) is a popular herbal supplement used for the treatment and management of diverse diseases in sub-saharan africa. its intake among individuals infected with hiv/aids has increased recently due to its purported immune-boosting properties. limited information, however, is available regarding its potential to cause interactions with commonly prescribed medications that are substrates of cyp3a4 and p-glycoprotein. methods: the methanol extract and four fractions of mo were tested on recombinant cyp3a4 at different concentrations, with and without nadph, to determine the ic50 shift. the crude methanol extract of mo was incubated with testosterone (tst) and cryopreserved hepatocytes to evaluate its influence on the clearance of tst. the effect of mo on the efflux transporter p-glycoprotein was investigated by incubating the methanol extract with mdr1-mdckii cells. virtual screening was conducted to predict the physicochemical properties, bioavailability and interaction potential of phytochemical compounds unique to mo, using a combination of molinspiration version 2014.11 and admetsar. results: fractions (f1-f3) showed an ic50 shift ≥5 when comparing pre-incubation with and without nadph. mo showed a moderate interaction (auci/auc = 2.46) with tst in cryopreserved hepatocytes. also, mo mildly inhibited the transport of digoxin (ic50 = 35.45 µg/ml) across mdr1-mdckii cells.
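the nadph-dependent ic50 shift reported above reduces to a simple ratio; the ≥5-fold threshold mirrors the criterion in the text, while the ic50 values themselves are hypothetical:

```python
def ic50_shift(ic50_without_nadph, ic50_with_nadph):
    """Fold decrease of the ic50 after pre-incubation with nadph; a large
    shift suggests time-dependent (metabolism-dependent) cyp inhibition."""
    return ic50_without_nadph / ic50_with_nadph

def flags_tdi(shift, threshold=5.0):
    # threshold mirrors the >=5-fold criterion applied to fractions f1-f3
    return shift >= threshold

shift = ic50_shift(50.0, 8.0)  # hypothetical ic50 values (µg/ml)
```

the rationale is that only inhibitors requiring metabolic activation become more potent after the enzyme has been allowed to turn over in the presence of nadph.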
niaziminin showed 85.57% bioavailability via the human intestinal membrane, with a 61% chance of inhibiting cyp3a4. β-sitostenone showed strong p-gp inhibition (83.27%) with 100% absorption via the intestine. conclusions: mo has the potential to inhibit the metabolism or excretion of other medications that are eliminated by cyp3a4 or p-glycoprotein, respectively, if adequate amounts of the active constituents such as niaziminin and β-sitostenone enter the circulation. background: herb-induced liver injury (hili) has attracted attention in the past years due to an increasing number of publications reporting cases of hepatotoxicity associated with the use of phytotherapeutics. here, we present data on hili from the berlin case-control surveillance study fakos. methods: fakos was initiated in 2000 to study serious toxicity of drugs, including hepatotoxicity. potential cases of liver injury were ascertained in more than 180 departments of all 51 berlin hospitals from october 2002 until december 2011. through a standardised face-to-face interview and review of medical charts, information on all previous intakes of drugs or herbals, on co-morbidities, and on demographic data was ascertained. inclusion criteria were an elevation of alanine aminotransferase or aspartate aminotransferase threefold above the upper limit of normal, or an elevation of total bilirubin higher than 2 mg/dl. excluded were patients with underlying liver disease (e.g., alcoholic fatty liver disease). drug or herbal aetiology was assessed based on the updated council for international organizations of medical sciences (cioms) scale. results: of all 198 cases of hepatotoxicity included in the fakos study, herbs were involved in ten cases (5.1%). the demographic, clinical, and laboratory characteristics of these ten cases are illustrated in table 1. among the six patients with available liver biopsy results, five showed signs of necrosis, either disseminated or predominantly near the central vein.
portal inflammation was more common than lobular inflammation, and the infiltrates contained mostly lymphocytes, neutrophil or eosinophil granulocytes. herbal aetiology was judged twice as probable (ayurvedic herb in patient 1, pelargonium sidoides in patient 6), and eight times as possible (valeriana in patients 3, 4, 8, 9, 10, mentha piperita in patient 5, hypericum perforatum in patient 2, eucalyptus globulus in patient 7). in nine cases other non-herbal drugs were also suspected as potentially hepatotoxic (exception: patient 6). seven cases occurred in the ambulatory setting and required hospitalisation; three cases occurred during the hospital stay. discussion: this case series provides further information on laboratory and clinical aspects of hili. it corroborates the known risks for valeriana and ayurveda treatment, and suggests that further herbals rarely or never associated with liver injury before, such as pelargonium sidoides, hypericum perforatum or mentha piperita, could also have a hepatotoxic potential. clinical routine often requires evaluating the cause of a newly occurring adverse event. if this event is regarded as iatrogenic, further information on the association between the drugs in the current medication list and the adverse event is needed. this information should ideally reflect the true risk and allow ranking of the drugs according to this risk, to identify which drug to discontinue first. we discuss the summary of product characteristics (spc), the sider side effect resource and openvigil 2 as possible sources of this information. spcs are increasingly becoming a legal safeguard for pharmaceutical companies and contain misleading information that is not based on evidence (ref. 1). since it relies on the spcs, sider inherits these shortcomings and flags warnings that result from confounding factors (ref. 1, fig. 1). furthermore, if any rates are given, they are not easily comparable since they stem from different studies.
pharmacovigilance data are biased by the very nature of the data and the collection method. however, once confounders are eliminated, pharmacovigilance offers better information on how to rank the drugs than spcs/sider. we present decision-guiding information obtained from sider and from openvigil 2 for one of our patients (fig. 2 & 3) and discuss how this information was used to modify the therapy. institut für naturheilkunde und klinische pharmakologie, universität ulm, ulm, germany. background: differences (polymorphisms) in target genes or in genes encoding drug transport proteins or drug-metabolizing enzymes may be responsible, among other factors, for the observed variation in patients' response to medications. pharmacogenetics aims at the identification of patients at a higher, genetically determined, risk of adverse drug effects or ineffective medication, in order to modify the dosage or switch to an alternative therapy. there is, however, a lack of awareness of pharmacogenetics-based clinical practice guidelines. methods: a systematic literature review was conducted, focusing on published guidelines on genotype-based (germ-line genetic variants) dosage modification or selection of drugs. we searched the medline and pharmacogenomics knowledgebase (pharmgkb) databases. prescribing information was also screened for pharmacogenetic guidance. results: the systematic review revealed recommendations for 61 drugs (table) that enable the translation of genetic test results into actionable prescribing decisions. for 20% of these drugs, the respective german drug labels recommend or even require pharmacogenetic testing (table, 3rd column). although pharmacogenetic testing is recommended, the prescribing information does not always provide guidance on how to adjust the drug dosage based on the pharmacogenetic test result. compared with the german or european drug labels, the fda drug labels provide more detailed information on pharmacogenetic dose modifications.
conclusions: academic working groups have a front-runner role in the development of prescribing recommendations based on genetic markers. to date, drug labels rarely contain detailed guidelines on how available genetic test results should be used to adjust the drug dosage. because pharmacogenetics has a growing role during drug development and pre-prescription genotyping will become more widespread, it is expected that specific pharmacogenetic guidance for the treating physicians will become increasingly important. bisphenol a (bpa) is a high-production-volume compound mainly used as a monomer to make polymers for various applications, including food-contact applications. people are exposed to low levels of bpa because very small amounts of bpa may migrate from the food packaging into foods or beverages. however, other potential sources of exposure, such as dermal contact, have also been identified (efsa, 2015). a substance evaluation process (corap) was initiated for bpa by the european chemicals agency (echa). as part of the safety evaluation of bpa, a study was required by echa to assess the absorption and metabolism of bpa following dermal exposure of human skin. an in vitro study with human skin was requested according to oecd tg 428, under consideration of the scientific committee on consumer safety (sccs) criteria for the in vitro assessment of dermal absorption. to investigate potential dermal bpa metabolism, fresh human skin was used. abdominal skin was obtained fresh from surgery from 4 different donors. split-thickness human skin membranes were mounted into flow-through diffusion cells (n=4 per dose and donor), and the receptor fluid was pumped underneath the skin at a constant flow rate. the skin surface temperature was maintained at 32°c throughout the experiment, and electrical-resistance barrier integrity testing was performed at the start (0 h) and end of the experiment (24 h).
four test preparations at final bpa concentrations of 2.4, 12, 60, and 300 mg/l were investigated. the highest concentration was chosen based on the maximum solubility of bpa in water and the lowest concentration was chosen based upon the specific activity of the radiolabelled [14c]-bpa that could be used for mass balance. percutaneous absorption was assessed by collecting receptor fluid (tissue culture medium (dmem), containing ethanol (ca 1%, v/v), uridine 5'-diphosphoglucuronic acid (udpga, 2 mm) and 3'-phosphoadenosine-5'-phosphosulfate (paps, 40 µm)) at multiple time points throughout the experiment. at termination the skin was removed from the cells and the stratum corneum was removed with 20 successive tape strips. the exposed epidermis was separated from the dermis using a scalpel. metabolism was investigated for the highest concentration (300 mg bpa/l) only, using hplc with in-line radiodetection and confirmed against bpa-glucuronide (bpa-g) and bpa-sulfate (bpa-s) standards for comparison. no metabolism was observed in any of the epidermis samples; however, some metabolism was observed in dermis and receptor fluid samples. metabolites were identified with retention times consistent with bpa-g and bpa-s, and also some more polar components. the mean total absorbed dose (receptor fluid + receptor chamber wash + receptor rinse) was between 1.7 and 3.6% of the applied dose and the mean dermal delivery (epidermis + dermis + total absorbed dose) was between 16 and 20% of the applied dose, with the majority of the radioactivity associated with epidermis samples compared to dermis and receptor fluid samples. a linear dose-response relationship was observed over the whole concentration range. anastrozole is a well-known non-steroidal aromatase-inhibiting drug approved for the second-line treatment of breast cancer after surgery and for treating postmenopausal women. 
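the mass-balance bookkeeping above (total absorbed dose and dermal delivery expressed as percentages of the applied dose) can be sketched as follows. the compartment amounts and function names are hypothetical illustrations of the study's definitions, not data from the study itself.

```python
def percent_of_applied(amount, applied):
    """Express a recovered amount as a percentage of the applied dose."""
    return 100.0 * amount / applied

def total_absorbed(receptor_fluid, chamber_wash, receptor_rinse):
    # total absorbed dose = receptor fluid + receptor chamber wash + receptor rinse
    return receptor_fluid + chamber_wash + receptor_rinse

def dermal_delivery(epidermis, dermis, absorbed):
    # dermal delivery = epidermis + dermis + total absorbed dose
    return epidermis + dermis + absorbed

# hypothetical recovered amounts, in the same units as the applied dose (e.g. µg)
applied = 100.0
absorbed = total_absorbed(receptor_fluid=1.9, chamber_wash=0.4, receptor_rinse=0.2)
delivery = dermal_delivery(epidermis=12.0, dermis=3.5, absorbed=absorbed)
print(round(percent_of_applied(absorbed, applied), 2))  # 2.5  -> within the reported 1.7-3.6% range
print(round(percent_of_applied(delivery, applied), 2))  # 18.0 -> within the reported 16-20% range
```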
treatment with the only available dosage form, anastrozole film-coated tablets for oral administration, is frequently associated with concentration-dependent unwanted side effects like hot flashes, fatigue, joint pain, joint stiffness, vaginal dryness, hair loss, skin rash, nausea, diarrhea and headache. in order to minimize the local gastrointestinal as well as systemic side effects, a system for transdermal anastrozole delivery has recently been developed. in this study, we describe the first experimental in vivo application of a transdermal therapeutic system (tts) to beagle dogs and, as a necessary prerequisite for the analysis of the time course of anastrozole release and uptake, a simple, sensitive and accurate lc-ms method for quantifying anastrozole in plasma. the detection of fragment ions at m/z 225 and 237 instead of the molecule ions (m/z 294 and 306), generated by the elevated collision energy, together with the use of a deuterated internal standard resulted in increased relative abundances and improved signal-to-noise ratios. the lower limit of quantification and the limit of detection were 1.4 ng/ml and 0.5 ng/ml, respectively. the developed method was successfully applied in a pharmacokinetic study of anastrozole plasma levels in beagle dogs, measuring percutaneous drug absorption from an experimental, newly designed glycerol-based patch/tts. a distinct time course was observed, with an initial linear increase over 24 hours and a plateau thereafter. this offers promising strategies for the transdermal application of anastrozole with improved pharmacokinetics. background: the monocarboxylate transporter 4 (mct4), encoded by the slc16a3 gene, mediates h+-coupled transport of lactate across the plasma membrane. for cells with high glycolytic activity, lactate export is of major importance for the maintenance of the glycolytic metabolism and for the prevention of intracellular acidification. 
in glycolytic tumor cells, the acidic extracellular environment resulting from export of lactate and h+ furthermore promotes anti-apoptotic effects and metastasis. clear cell renal cell carcinoma (ccrcc) is the most common subtype of renal cell carcinoma (rcc) and is characterized by a metabolic shift towards enhanced aerobic glycolysis and, hence, increased lactate production. mct4 and its epigenetic regulation by slc16a3 promoter methylation have previously been identified as a prognostic marker for ccrcc outcome and as a target for ccrcc treatment. since metastatic ccrcc is associated with poor overall survival and represents a major challenge for treatment, mct4/slc16a3 might represent a promising prognostic marker and a target for therapeutic intervention also in metastatic disease. methods: mct4 protein expression was analysed in 130 paraffin-embedded tissue samples of distant metastases derived from different organs by immunohistochemical staining of tissue microarrays. protein expression was evaluated semi-quantitatively using tissue studio v.3.6 (definiens ag). dna methylation in the slc16a3 promoter, specifically at the previously identified cpg site with prognostic potential in primary ccrcc, was analysed in 82 paraffin-embedded metastasis samples by maldi tof-ms. mct4 protein expression data and dna methylation at the specific cpg site in the slc16a3 promoter were correlated with clinicopathological parameters and outcome data. results: distant metastases of primary ccrcc showed high mct4 protein expression irrespective of the affected organ. the most frequently affected organs like lung or bone, with approximately 28% and 14% in our cohort, respectively, showed similar expression levels as less frequent metastatic sites such as thyroid gland or spleen. accordingly, dna methylation at the identified cpg site in the slc16a3 promoter was low in metastatic tissue at all investigated organ sites. 
an association of a low promoter dna methylation level at the previously identified prognostic cpg site in metastases with poor tumor-specific survival of the patients was observed. conclusion: from these results we hypothesize that dna methylation at specific cpg sites in the 5'-regulatory region of mct4 may serve as a predictor of patient outcome and as a potential novel target for therapeutic intervention not only in primary but also in metastatic disease. tamoxifen is used to treat pre- and postmenopausal women with estrogen-receptor (er) positive breast cancer. as a prodrug, tamoxifen undergoes extensive hepatic metabolism resulting in a complex mixture of metabolites with estrogenic and antiestrogenic effects. while endoxifen and (z)-4-hydroxytamoxifen are the most potent antiestrogenic metabolites, bisphenol and both isomers, (e) and (z), of metabolite e are the most potent compounds with estrogenic properties at the er. the mixed antagonist/agonist pharmacodynamic effects of the selective estrogen receptor modulator tamoxifen at the er have been mainly attributed to tissue-specific action of er coregulators, yet little is known about agonistic metabolites contributing to its estrogenic actions. the aim of the present study was to clarify whether there is a genetic component for interindividual differences in the formation and clinical effect of agonistic tamoxifen metabolites. a genome-wide association study (gwas) was conducted on steady-state agonist plasma levels in 390 postmenopausal breast cancer patients of european origin who were treated with 20 mg/day of tamoxifen for at least 6 months. plasma concentrations of the estrogenic metabolites bisphenol, (e), and (z) metabolite e were quantified using a recently established lc-ms/ms method [1]. 
promising snps for an association between genotype and either plasma metabolite concentration or clinical outcome were confirmed for their relevance in an independent cohort of 313 premenopausal breast cancer patients, mainly of european descent [2, 3]. twelve snps close to or above genome-wide significance (p < 5e-08) were found to be associated with allele-dependent variable (e) or (z) metabolite e plasma levels, while no genomic hit was found for the tamoxifen metabolite bisphenol. here, positive intergenic or genic regions mapped to chromosomes 1, 2 and 16 for (e) metabolite e and to chromosomes 15 and 18 for (z) metabolite e. upon genotyping of the validation cohort, two genetic loci with minor allele frequencies < 5% were confirmed as putative candidates: rs662106 was associated with a 21-39% variant allele-dependent increase of the (e) and (z) metabolite e isomers (p < 0.05), and rs3731872, mapping to a gene encoding zinc finger protein znf124, was associated with an increased risk of recurrence or death (hr for carriers 2.6, 95% ci: 1.3-3.4; p < 0.005). these findings suggest the existence of genetic loci that may contribute to the formation and clinical effect of estrogenic tamoxifen metabolites and therefore could explain therapeutic failure of tamoxifen and/or the occurrence of adverse events during treatment. introduction: metabolomic monitoring of endogenous biomarkers is of increasing importance for the assessment of drug safety and efficacy during clinical drug development. myrcludex b, a novel lipopeptide-based entry inhibitor for the therapy of hepatitis b and d, exerts its function through inhibition of the hepatic bile acid transporter na+-taurocholate cotransporting polypeptide (ntcp). in order to assess a myrcludex b-induced metabolomic response in humans, lc/ms-based monitoring of endogenous metabolites was performed in blood and urine samples from healthy individuals before and during treatment with myrcludex b. 
methods: plasma and urine samples were collected from healthy volunteers participating in clinical phase i trials to evaluate safety, tolerability, and pharmacokinetics of single doses of the ntcp inhibitor myrcludex b. using quadrupole time-of-flight mass spectrometry coupled to reversed-phase chromatography (lc-qtof-ms), a set of 15 known ntcp substrates (bile acids) was quantified by targeted metabolomics. protein precipitation was performed in the presence of deuterium-labeled internal standards (istds), which allowed absolute bile acid (ba) quantification in low amounts of plasma. ba profiling in urine was performed after dilution with methanol/water (1:1) in the presence of istds. both methods were validated according to fda guidance and applied to monitor the effect of myrcludex b treatment on human bile acid homeostasis. results: dynamic quantification in plasma and urine was achieved in the range from 7.8 nm to 10000 nm depending on the ba species analyzed. intraday- and interday accuracy and precision were within the 15% tolerance range for all analytes in all matrices. matrix effects were between 39-104% (plasma) and 31-95% (urine); apparent recoveries in plasma were above 97%. basal plasma ba levels (mean ± sd) in fasting healthy subjects were 667 ± 574 nm (unconjugated bas), 935 ± 629 nm (glycine-conjugated bas) and 104 ± 62 nm (taurine-conjugated bas). urinary ba levels were 193 ± 225 nmol/g creatinine (unconjugated bas), 89 ± 29 nmol/g creatinine (glycine-conjugated bas) and 6 ± 2 nmol/g creatinine (taurine-conjugated bas). myrcludex-induced ntcp inhibition resulted in significantly elevated amounts of conjugated ba species, demonstrating a spillover of ntcp substrates into the systemic circulation. furthermore, higher urinary ba levels were observed during treatment, indicating accelerated elimination of excessive bas from the body. 
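the 15% tolerance criterion used above for accuracy and precision can be illustrated with a small helper. the qc replicate values below are invented for illustration and the function names are ours, not part of any validation software; accuracy is checked as relative error of the replicate mean versus the nominal concentration, precision as the coefficient of variation.

```python
def within_tolerance(measured, nominal, tolerance_pct=15.0):
    """Accuracy check: relative error of measured vs. nominal within ±tolerance_pct."""
    return abs(measured - nominal) / nominal * 100.0 <= tolerance_pct

def precision_pct(replicates):
    """Precision as the coefficient of variation (%) of replicate measurements."""
    n = len(replicates)
    mean = sum(replicates) / n
    var = sum((x - mean) ** 2 for x in replicates) / (n - 1)  # sample variance
    return (var ** 0.5) / mean * 100.0

# hypothetical QC replicates for a 500 nM bile-acid standard
qc = [478.0, 512.0, 495.0, 505.0, 489.0]
mean_qc = sum(qc) / len(qc)
print(within_tolerance(mean_qc, 500.0))  # True -> accuracy within the 15% tolerance
print(precision_pct(qc) < 15.0)          # True -> precision within the 15% tolerance
```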
conclusion: lc/ms-based monitoring of endogenous biomarkers has been successfully established and applied to study the effect of myrcludex b treatment on human ba metabolism. the results obtained by our assay demonstrate that myrcludex-induced ntcp inhibition drastically affects human ba homeostasis. this observation provides valuable insights into the drug's mode of action and will be indispensable for the assessment of side effects and dose-finding processes during future clinical trials. further studies are required to assess a possible role of ba modification (e.g. sulfation) in the process of ba detoxification during myrcludex treatment. key: cord-140624-lphr5prl authors: grundel, sara; heyder, stefan; hotz, thomas; ritschel, tobias k. s.; sauerteig, philipp; worthmann, karl title: how much testing and social distancing is required to control covid-19? some insight based on an age-differentiated compartmental model date: 2020-11-02 journal: nan doi: nan sha: doc_id: 140624 cord_uid: lphr5prl in this paper, we provide insights on how much testing and social distancing is required to control covid-19. to this end, we develop a compartmental model that accounts for key aspects of the disease: 1) incubation time, 2) age-dependent symptom severity, and 3) testing and hospitalization delays; the model's parameters are chosen based on medical evidence and, for concreteness, adapted to the german situation. then, optimal mass-testing and age-dependent social-distancing policies are determined by solving optimal control problems both in open loop and within a model predictive control framework. we aim to minimize testing and/or social distancing until herd immunity sets in, under a constraint on the number of available intensive care units. we find that an early and short lockdown is inevitable but can be slowly relaxed over the following months. 
the severe acute respiratory syndrome coronavirus 2 (sars-cov-2) is a strain of coronavirus which causes the respiratory illness coronavirus disease 2019 (covid-19). on march 11th, 2020, the world health organization (who) declared the outbreak of sars-cov-2 a pandemic [41]. due to the novelty of the virus, there was (and, at the time of submitting this manuscript, still is) significant uncertainty concerning the severity and mortality of covid-19. furthermore, as of october 2020, no vaccine has completed the trials necessary for approving widespread use [24]. therefore, many countries are enforcing nonpharmaceutical countermeasures [19, 20], e.g. 1) social distancing, 2) increased public hygiene, 3) travel restrictions, 4) self-isolation (quarantine), and 5) population-wide mass testing for sars-cov-2 infection. however, enforcing these countermeasures for long periods of time can have severe economic and social consequences, both at the national and the global scale [28]. therefore, there is a need for identifying economic strategies for simultaneously relaxing the countermeasures and containing the pandemic. model-based decision support systems can be used for exactly this purpose. they use predictive models to assist decision makers in identifying and evaluating candidate strategies (e.g. [1]). in particular, given a dynamical model of the spread of sars-cov-2, economically optimal (open-loop) mitigation strategies can be identified by solving optimal control problems (ocps) over several months or even years. a key advantage of this approach is that it can directly account for constraints, e.g. related to the capacity of public healthcare systems. however, given the uncertainty surrounding sars-cov-2 and covid-19, it is advisable to implement the optimal mitigation strategies in closed loop, i.e. to repeatedly update the strategies when new data becomes available. 
this is referred to as model predictive control (mpc) [32] and is an established method for advanced process control [31] . the predictive capabilities of the underlying model are crucial for the efficacy of the resulting mitigation strategy, and a common challenge is to identify suitable model parameters. epidemics are often modelled using deterministic compartmental models [18] , e.g. the classical sir model, where individuals are either susceptible, infectious, or removed, or the seir model which, additionally, takes the incubation time into account. optimal control of compartmental models was already an active research topic before the sars-cov-2 pandemic (see [35] for a review). in particular, optimal control of sir models has been considered, e.g. for arbitrary social interaction models [2] and to identify time-optimal mitigation strategies [3, 16] . optimal control of more complex models has also been considered. for instance, fischer et al. [12] consider optimal control of a model with two species, bussell et al. [6] demonstrate the importance of closed-loop mitigation strategies (i.e. of incorporating feedback), and watkins et al. [39] consider mpc of stochastic compartmental models. in [11] the authors determine control strategies to maintain hard infection caps in a disease-vector model based on the theory of barriers. this approach, however, exploits the low dimensionality of the model. application of these techniques to complex compartmental models, therefore, requires model order reduction. in response to the sars-cov-2 pandemic, many researchers have presented optimal control strategies, for instance based on pontryagin's maximum principle (e.g. [21, 30, 42] ). these strategies typically involve 1) extended sir or seir models, 2) nonpharmaceutical countermeasures (often social distancing), and 3) minimization of the number of infected as well as the economic cost of the countermeasures (and often other quantities as well, e.g. the number of deaths). 
furthermore, they rarely satisfy hard constraints, for instance related to health care or testing capacities. in the following, we highlight some of the key developments in decision support for sars-cov-2 mitigation based on optimal control. gondim and machado [14] use a model with three age groups to compute optimal quarantine strategies (for susceptible individuals) which minimize the number of infected and the cost of quarantining. bonnans and gianatti [4] compute social distancing strategies based on a model with a continuous age structure. here, the strategies minimize a combination of 1) the number of deaths, 2) the peak number of hospitalized, and 3) the cost of social distancing. similarly, richard et al. [33] present optimal social distancing strategies based on a model with a continuous age and infection duration structure, which minimize the number of deaths and the cost of social distancing. morato et al. [25] compute on-off (also called bang-bang) social distancing strategies which minimize 1) the number of symptomatic infectious people and 2) the duration of the social distancing policies, subject to constraints on intensive care unit (icu) occupancy. they use extended sir models. carli et al. [7] use mpc to compute social distancing and travel restriction strategies for an extended multi-region sir model, minimizing the cost of the countermeasures and preventing an overload on the hospitals. köhler et al. [22] use mpc to minimize the number of fatalities caused by covid-19, subject to constraints on the economic cost of social distancing. they take a modified sidarthe model [13] as basis and use interval arithmetic in the mpc to propagate model uncertainties. finally, tsay et al. [37] use mpc to minimize the cost of social distancing and testing, subject to an upper bound on the peak number of infectious people who have been tested positive. they use the unscented kalman filter to estimate the noisy state variables of an extended seir model. 
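the receding-horizon idea behind the mpc approaches surveyed above can be illustrated on the plain sir model: at each decision instant, simulate forward over a finite horizon and select the mildest contact-rate reduction whose predicted infectious share respects a hard cap, apply it for one week, then re-plan. this is only a sketch of the principle, not the seitphr optimal control problem of this paper; all numbers (β, η, the cap, the discrete distancing levels, the horizon) are assumptions for illustration.

```python
def simulate_sir(s, i, beta, eta, days, dt=0.1):
    """Forward-Euler SIR integration; returns final s, i and the peak infectious share."""
    peak = i
    for _ in range(int(days / dt)):
        new_inf = beta * s * i
        s, i = s - dt * new_inf, i + dt * (new_inf - eta * i)
        peak = max(peak, i)
    return s, i, peak

def mpc_step(s, i, beta0, eta, cap, horizon=30):
    """Pick the mildest contact-rate scaling u whose predicted peak respects the cap."""
    for u in [1.0, 0.8, 0.6, 0.4, 0.2]:
        if simulate_sir(s, i, u * beta0, eta, horizon)[2] <= cap:
            return u
    return 0.2  # fall back to the strictest level

beta0, eta, cap = 0.4, 1 / 6, 0.05   # assumed rates and hard cap on infectious share
s, i = 0.99, 0.01
policy = []
for week in range(20):               # re-plan weekly over roughly five months
    u = mpc_step(s, i, beta0, eta, cap)
    policy.append(u)
    s, i, _ = simulate_sir(s, i, u * beta0, eta, days=7)
print(policy)
```

with these assumed numbers the closed loop settles on a strict early reduction that is relaxed only slowly as susceptibles deplete, qualitatively matching the paper's "early lockdown, slow relaxation" finding.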
in this work, we address some of the key questions that decision makers involved in the mitigation of the sars-cov-2 pandemic are facing: 1) is mass testing alone sufficient to avoid overloading of icus? 2) if not, how much social distancing is then required? 3) how much can social distancing measures be reduced by targeting specific age groups? 4) how do strategies obtained by short- and long-term planning differ? 5) what are the benefits of increasing the daily testing capacity or the icu capacity? here, the limited icu capacity is considered as an example of constraints imposed by the health care system or political considerations. of course, different constraints such as limited personnel for contact tracing could be incorporated as well. we address the above questions by proposing a novel compartmental model and using optimal control as well as mpc to compute open- and closed-loop social distancing and testing strategies. the model contains three age groups, and it accounts for several of the key challenging characteristics of covid-19, i.e. 1) the incubation time, 2) different levels of symptom severity depending on age, 3) the delay of testing results (and the subsequent self-isolation), and 4) the delay of hospitalization. furthermore, we choose values of the epidemiological model parameters based on the current state of knowledge in order to ensure that our numerical results match reality. for concreteness, we use the covid-19 outbreak in germany to determine parameters depending on demographics and the health care system. however, we expect our conclusions to carry over to outbreaks in other developed countries as well. the remainder of this paper is structured as follows. in section 2, we describe the novel compartmental model of the sars-cov-2 outbreak in germany, and in section 3, we motivate our choice of model parameters. 
in section 4, we demonstrate that optimal control can be used as a decision support tool based on the proposed model, and we conclude the paper in section 5. in this section, we propose a dynamical model tailored to covid-19. the aim is to be able to evaluate the effect of population-wide mass testing (in combination with quarantine) and social distancing measures on the development of the pandemic. to this end, we extend the well-known sir model. we start with an illustration of the connection between 1) infectious disease models based on randomly acting individual agents and 2) their approximation by ordinary differential equation compartmental models. this exposition will highlight the interpretation and conversion of parameters when moving from a random to a deterministic model. for simplicity, we consider the classical sir model in this subsection. however, the connection, especially the interpretation of parameters, is similar for more complex models such as the one described in section 2.2. consider a population of n pop individuals or agents each being either susceptible, infectious or removed. at time t ∈ [0, ∞) denote the (random) set of susceptibles by s t , the set of infectious by i t and the set of removed by r t . time is modeled continuously and measured in days. we assume a homogeneous population with contacts between agents a and b following a poisson process with intensity λ which does not depend on the agents considered. infections occur randomly upon contact with a fixed probability α if one of the agents is susceptible and the other infectious. thus, potentially infectious contacts also follow a poisson process with respective intensity αλ. similarly, we model other events, in this simple model only recoveries, to occur according to a poisson process. 
this implies that the time an agent spends in the infectious compartment is exponentially distributed with rate η, say, which we also assume to be the same for each agent (see [27] for models where these quantities follow other distributions). we denote by s(t) = e[|s t |]/n pop , i(t) = e[|i t |]/n pop and r(t) = e[|r t |]/n pop the expected shares of the population which are susceptible, infectious and removed, respectively. since for large n pop the change of |s t |/n pop over a short time interval can, due to the law of large numbers, be well approximated by its expectation, s(t) will provide a sufficient approximation of |s t |/n pop over the finite time horizon considered for a country the size of germany. by the same argument, i(t) and r(t) approximate |i t |/n pop and |r t |/n pop , respectively, sufficiently well. if a is susceptible, he will transition to the infectious compartment upon having an infectious contact. at a fixed time t with a ∈ s t , there are two possible sources of infection for a: either some b ∈ i t which is already infectious or some c ∈ s t which will become infectious himself at some later time. to determine the probability that b infects a in the time frame (t, t + ∆t], we analyze two competing events: the first is an infectious contact between a and b, and the second is b's recovery from the infectious state. both events happen independently of one another with exponentially distributed times of occurrence, the first with rate αλ and the second with rate η. thus the first time of occurrence of one of these is again exponentially distributed with rate αλ + η, and the probability that the first occurrence is an infectious contact is αλ/(αλ + η). in total, for c ∈ s t to infect a in (t, t + ∆t], c has to become infectious himself before he in turn can infect a. this happens only with probability o((∆t) 2 ) and can, thus, be neglected in the following calculations. 
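the competing-clocks argument can be checked numerically: with two independent exponential clocks (infectious contact at rate αλ, recovery at rate η), the first event occurs after an exponential time with rate αλ + η, and the contact wins with probability αλ/(αλ + η). the rate values below are arbitrary illustrations, not parameters from the paper.

```python
import random

random.seed(1)
alpha_lambda, eta = 0.3, 0.15   # illustrative rates (per day)
n = 200_000
contact_first = 0
first_times = 0.0
for _ in range(n):
    t_contact = random.expovariate(alpha_lambda)  # time to an infectious contact
    t_recover = random.expovariate(eta)           # time to b's recovery
    if t_contact < t_recover:
        contact_first += 1
    first_times += min(t_contact, t_recover)

print(contact_first / n)   # ~ alpha_lambda / (alpha_lambda + eta) = 2/3
print(first_times / n)     # ~ 1 / (alpha_lambda + eta) ~= 2.22 days
```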
in total, a is moved out of the susceptible compartment at rate αλ|i t |; approximating |s t |/n pop and |i t |/n pop by s(t) and i(t) and using the law of total expectation yields ṡ(t) = −n pop αλ s(t)i(t). as we assume the time from infection to removal to be exponentially distributed with rate η, a similar but more straight-forward calculation reveals ṙ(t) = η i(t), where η −1 is the mean stay of a single agent in the infectious compartment. we now set β = n pop αλ, which can be interpreted in this model as the daily amount of (potentially) infectious contacts a single agent has. since s(t) + i(t) + r(t) = 1 for all t, we obtain the following system of odes: ṡ(t) = −β s(t)i(t), i̇(t) = β s(t)i(t) − η i(t), ṙ(t) = η i(t). (1) to determine suitable parameter values for β and η in this simplistic model, we reiterate that these are best thought of in the probabilistic setting. for the coefficients of the linear terms on the right-hand side, the interpretation is straight-forward: it is the rate of the exponential distribution underlying the time until an agent leaves the compartment. its inverse is the mean stay in this compartment. for coefficients of interaction (product) terms, here β, the interpretation is the rate at which an agent in the first compartment causes other agents to leave the second compartment. in our setting, this is the daily amount of infections one infectious agent causes, which can readily be seen from the definition of β. see section 3 for a more detailed discussion of the parameter values we use in our model. as mentioned above, these interpretations for the parameters carry over in a straight-forward manner to more sophisticated models such as the one considered in the following. the sir model provides a good starting point to study the dynamics of pandemics. however, due to its simple structure it is not suited to model the covid-19 pandemic adequately. in particular, it does not include hospitalization, age-specific disease progressions and interventions. therefore, we extend the sir model in three ways. 1. we introduce eight additional compartments. 
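the sir dynamics described above can be integrated numerically with a minimal forward-euler scheme. the values β = 2.5/6 and η = 1/6 mirror r 0 = 2.5 and the 6-day mean infectious period discussed in section 3; the initial infectious share is our assumption for illustration.

```python
def sir(beta, eta, s0, i0, days, dt=0.01):
    """Forward-Euler integration of the SIR system:
    ds/dt = -beta*s*i, di/dt = beta*s*i - eta*i, dr/dt = eta*i."""
    s, i, r = s0, i0, 1.0 - s0 - i0
    for _ in range(int(days / dt)):
        new_inf = beta * s * i   # new infections per day (share of population)
        rec = eta * i            # removals per day
        s, i, r = s - dt * new_inf, i + dt * (new_inf - rec), r + dt * rec
    return s, i, r

# r0 = beta/eta = 2.5, mean infectious period eta**-1 = 6 days
beta, eta = 2.5 / 6, 1 / 6
s, i, r = sir(beta, eta, s0=1 - 1e-4, i0=1e-4, days=365)
print(round(s + i + r, 9))   # 1.0  -> the shares are conserved
print(s < 1 / 2.5)           # True -> s ends below the herd-immunity threshold 1/r0
```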
in detail, we take into account that people can be infected, but not yet infectious. we call them exposed (or latent) and denote the compartment by e, see also [18]. moreover, we split the infectious compartment into three depending on how the course of the infection will be. we distinguish between severe cases i s that are going to need intensive care, i.e. they will move to h icu at some point in time; mild cases i m that are going to visit a physician and hence be quarantined, i.e. removed; and asymptomatic cases i a that might recover without being detected. furthermore, we incorporate the possibility of being tested but not yet detected by introducing the compartments t s (severe) and t o (other). we assume that patients with severe cases will visit a physician at some point before being sent to an icu. to this end, we introduce p as a pre-icu compartment which comprises isolated patients at home or on a regular hospital ward. moreover, we split the compartment of removed people into known and unknown cases, r = r u + r k . 2. each compartment is further divided into n g groups, n g ∈ n, depending on the age of a subject in order to study how these groups affect each other. 3. social distancing and hygiene measures affect the contact rate as well as the transmission probability. therefore, β can be used as a time-dependent control input β(t). the resulting seitphr model is the system of odes (2), where the subscript i ∈ {1, 2, . . . , n g } denotes the age group in ascending order. we enforce that the relative group sizes sum to one, i.e. ∑ i n i = 1, where n i denotes the relative size of age group i. we assume a mean incubation time γ −1 independent of both the course of infection and the age of the patient. however, depending on the age, there are different probabilities π s i , π m i , and π a i for the three courses of infection, where π s i + π m i + π a i = 1 for all i. similar to the sir model (1), the parameters denoted by η correspond to people being removed from the system, i.e. 
η s and η m denote the rates for those who visit a physician and, therefore, are put into quarantine immediately, while η a represents unreported recovery. we denote the total number of susceptibles and unreported cases in age group i by u i . the control θ i describes the rate of those being tested per day, where tests are distributed uniformly at random among all individuals in u i . in addition, symptomatic cases who visit physicians are assumed to be tested as well. therefore, the total number of tests at time t ≥ 0 is given by the randomly distributed tests θ i (t)u i (t), summed over all age groups, plus the tests of symptomatic cases. note that testing does not affect the state of non-infectious subjects. parameters τ s and τ o denote the rates from being tested positive to being detected, and hence being put into quarantine. furthermore, ρ is the rate from pre-hospital quarantine to hospitalization and σ from hospitalization to being reportedly removed, i.e. σ incorporates both the mortality and recovery rate of hospitalized patients. the basic structure of the seitphr model (2) is depicted in figure 1. figure 1: flow of the seitphr model for one age group. the controls are indicated with dashed red edges. unreported compartments are highlighted by the left red triangle, while tested and detected compartments are highlighted by the right blue trapezium. for a concise notation we stack the state vectors into x = (x 1 , . . . , x ng ) and the controls into u = (β, θ), where β = (β ij ) ng i,j=1 with β ij : r ≥0 → r ≥0 and θ = (θ 1 , . . . , θ ng ). similarly, we denote π ∈ r 3ng , τ ∈ r 2 , and η ∈ r 3 . thus, we write system (2) compactly as ẋ(t) = f(x(t), u(t), p), where p = (π, η, τ, ρ, σ, γ) ∈ r 3ng+8 collects all parameters. furthermore, we introduce the initial condition x(0) = x 0 . before we present our choices for the parameters of model (2), let us reiterate that some of the parameters of our model depend on age. we indicate this dependence by an appropriate index, which we drop if the parameter is constant across age groups. 
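the daily testing budget just described can be sketched as follows: the randomly allocated tests θ i u i per age group plus the tests of symptomatic cases. the numbers are illustrative, and the flat `symptomatic_tested` term is our simplification of the paper's age-resolved expression.

```python
def total_tests(theta, u, symptomatic_tested):
    """Daily tests: random mass tests theta_i * u_i per age group, plus tests
    administered to symptomatic cases visiting physicians (simplified term)."""
    random_tests = sum(th * ui for th, ui in zip(theta, u))
    return random_tests + symptomatic_tested

# illustrative numbers for 3 age groups; u_i = susceptibles + unreported cases
theta = [0.00, 0.01, 0.02]   # daily testing rates per group (control input)
u = [0.14, 0.57, 0.27]       # shares of the population currently in u_i
print(round(total_tests(theta, u, symptomatic_tested=0.001), 4))  # 0.0121
```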
for example, π s i is the age-dependent probability of having a severe course of disease, while we assume η a , the rate with which asymptomatic cases recover, to be age-independent. n i : we use data on the population size of germany at the end of 2019 from the genesis-online database of the destatis [36]. the first age group consists of individuals aged younger than 15 years, the second of those older than 15 but younger than 60 years, and the last comprises all individuals older than 60 years. these groupings result in the proportions n 1 = 0.14, n 2 = 0.58 and n 3 = 0.28. the rate at which an infected agent in compartment i j infects susceptibles in compartment s i depends on the contact structure of the population as well as the probability that a contact between a susceptible and an infectious agent leads to a transmission of the disease. we base our contact process on data from the polymod study on daily contacts in several european countries [26]. from this data we calculate a contact matrix c = (c ij ) whose (i, j)-th entry is the mean amount of contacts an individual in age group i has with age group j; here we only consider those contacts labeled as physical, since those are more likely to lead to viral transmission. let us denote by β 0 ij the rate at which a single infectious agent from age group j infects susceptible agents from age group i if no countermeasures, such as social distancing, are in place. we model β 0 ij to be proportional to c ij and let α be the corresponding proportionality constant. if a single infectious agent is introduced, without interventions such as testing, quarantine and social distancing measures in place, into the otherwise completely susceptible, i.e. virgin, population, the mean amount of secondary cases he causes is the basic reproduction number r 0 . there is a wide variety of estimates for r 0 in the literature [29], with most estimates in the interval [2, 3.5]. 
we choose a value of r 0 = 2.5, as early, higher estimates might be biased upwards due to imported and undetected cases. fixing η a −1 = 6 (see the discussion on η a below) we calculate α = 5.79% and in turn β 0 ij from (4). γ: the rate at which latent cases become infectious is the inverse of the mean incubation time. this parameter is modeled age-independently and chosen to be 0.19, which corresponds to a mean incubation time of 5.2 days [23]. π s i , π m i , π a i : these parameters denote the proportion of individuals in age group i that have a severe, mild or asymptomatic course of disease. for germany, the robert koch institut (rki) has published data on severity of disease progression for 12,178 cases by age groups [34]. for our purposes we define a severe case to be a case that will eventually be admitted to intensive care, a mild case being one developing influenza-like symptoms, pneumonia or being admitted to hospital for other reasons. all other cases we classified as asymptomatic. thus we obtain π i = (π s i , π m i , π a i ), the proportion of severe, mild and asymptomatic cases in age group i, respectively, as (π 1 , π 2 , π 3 ) = 1/100 · (…). observe that the oldest age group is at highest risk with 3.02% of infected individuals admitted to icu. also the proportion of severe cases in the youngest age group is higher than in the middle age group. this might be explained by the fact that cases in the youngest age group are detected less frequently due to them being tested less, leading to overreporting of severe cases. η s , η m , η a : these are the rates at which infectious individuals are removed from the infection process if no mass-testing is implemented, i.e. if θ i = 0. for individuals with severe or mild course of disease this happens when they develop symptoms leading to self-isolation, quarantine prescribed by a physician, or to direct hospitalization. one characteristic of covid-19 is that even presymptomatic cases transmit sars-cov-2 [17].
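relation (4) between α, the contact matrix and r 0 is not reproduced in this excerpt. under the common assumption that r 0 equals the mean infectious duration times the dominant eigenvalue of the transmission matrix (β 0 ij ) = (α c ij ), α can be calibrated as follows (the contact matrix entries below are made up for illustration; the paper derives them from polymod data):

```python
import numpy as np

# hypothetical 3x3 physical-contact matrix (mean daily contacts per person)
C = np.array([[5.0, 2.0, 1.0],
              [2.0, 4.0, 1.5],
              [1.0, 1.5, 2.0]])

R0 = 2.5        # target basic reproduction number
eta_inv = 6.0   # assumed mean infectious period in days (eta_a^-1 = 6)

# assumed relation: R0 = eta_inv * spectral_radius(alpha * C)
rho = max(abs(np.linalg.eigvals(C)))
alpha = R0 / (eta_inv * rho)
beta0 = alpha * C   # per-pair transmission rates without countermeasures
```

the resulting α depends on the contact matrix, so the 5.79% reported in the text is not reproduced by these made-up numbers; the point is only the calibration mechanism.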
we assume the time from being infectious to symptom onset to be 2 days, after which we add 2 more days which it takes before the infected individual visits a physician. thus we choose η s = 0.25. for mild progressions we assume the same mean duration from being infectious to symptomatic, though in this case individuals self-isolate, visit a physician or receive a positive test result after a mean waiting time of again 2 more days; consequently, we also set η m = 0.25. for asymptomatic cases in i a i the only way to be removed from the infection process is by recovery from the infection. in [40] positive virus samples were found in patients' throats for up to 8 days after symptom onset. assuming a lower viral load for asymptomatic cases with only 4 days of potential infectiousness and adding the 2 days of presymptomatic transmission, we choose η a = 0.17, corresponding to a mean time of 6 days to recovery for asymptomatic cases. τ o , τ s : as we assume the testing related to the controls θ i to be of a random nature, tested individuals are not yet removed from the infection process. instead we assume positive test results to become available after a mean delay of 2 days. however, severe cases may visit a physician and thus go into immediate quarantine before receiving their test result. the latter transition occurs with rate η s , and hence the faster transition occurs with rate τ s = 1/2 + η s = 0.75. non-severe cases that are tested, t o i , are removed if they recover naturally (with rate η a ), receive a positive test result, or visit a physician. that leads to τ o = η m + η a + 1/2 = 0.92 for each age group. ρ: this parameter is the rate at which severe cases move from being in the pre-icu state to the intensive care unit. this includes time spent in quarantine at home as well as time spent in the hospital in normal care while being isolated. in [10] the median time from symptom onset to being in intensive care for 50 patients was 9 days.
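the competing-rates bookkeeping above (rates of independent exponential exit channels simply add) can be checked directly:

```python
# rates per day, as chosen in the text
eta_s = 0.25   # severe: 2 days to symptoms + 2 days to physician visit
eta_m = 0.25   # mild: same mean delay as severe
eta_a = 0.17   # asymptomatic: ~6 days to recovery
test_result_rate = 1 / 2  # positive result available after a mean delay of 2 days

# tested severe cases leave via test result OR physician visit
tau_s = test_result_rate + eta_s
# tested non-severe cases leave via recovery, physician visit OR test result
tau_o = eta_m + eta_a + test_result_rate
```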
as the median of an exponential distribution exp(ζ) is log(2)/ζ, we choose a mean stay of ρ −1 = 9/log(2) − 2 ≈ 10.98 days, accounting for the mean two days from symptom onset to the transition into the pre-icu compartment. σ: the inverse σ −1 is the mean time spent in intensive care until discharge or death. according to [10] patients with acute respiratory distress syndrome (ards) spent a median amount of 13 days in intensive care and patients without ards spent a median amount of 2 days in intensive care. of the 50 patients considered in this study, 24 were afflicted with ards. converting again between median and mean for the assumed exponential distribution yields the mean time σ −1 . h icu max : the divi-intensivregister offers daily information on the number of free intensive care hospital beds in germany. on 20 october 2020 they reported a capacity of 8,872 free beds with 879 actively treated covid-19 patients [9]. we therefore round the maximal number of icu beds available for covid-19 patients to 10,000. t max : from late august until the beginning of october the rki conducted between 1 and 1.2 million weekly sars-cov-2 tests in germany. this motivates our upper bound of t max = 1,200,000/7 daily tests. x 0 : we initialize our model at time t = 0 with entries of x 0 set to 0 except for those related to the susceptible, latent and infectious compartments. our choice of initial values is informed by the number of active cases reported by the rki in late march, assuming the proportion of underreporting to be 50%. we hence set the total number of infectious agents at t = 0 to 524 and the number of latent agents to 1672, distributed among the age groups according to n i . as we explain in remark 1, our model is robust against misspecification of the initial values. figure 2 demonstrates the simulation capabilities of our model. here, the course of the pandemic is visualized if no countermeasures are implemented, i.e. no social distancing (β(t) = β 0 ) and no mass-testing (θ(t) = 0).
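the median-to-mean conversion for an exponential waiting time (median = ln(2)/ζ, mean = 1/ζ, hence mean = median/ln 2) reproduces the stated value of ρ −1:

```python
import math

median_onset_to_icu = 9.0     # days, median from symptom onset to icu in [10]
presymptomatic_delay = 2.0    # days already elapsed before the pre-icu state

# exponential distribution: mean = median / ln(2)
mean_onset_to_icu = median_onset_to_icu / math.log(2)
rho_inv = mean_onset_to_icu - presymptomatic_delay   # ~10.98 days
```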
as expected, the pandemic evolves too fast to satisfy any reasonable cap on the number of required icus. in particular, the number h icu of required icus exceeds 100,000, whereas we noted above that in germany only about 10,000 icus are available to treat covid-19 patients. therefore, countermeasures are indispensable to avoid an overload of the hospitals. note that if we distinguish different age groups the pandemic evolves faster, but fewer icus are required, as the pandemic spreads mostly in the less vulnerable, younger age groups. similar observations, viz. herd immunity being achieved faster in heterogeneous populations in comparison to homogeneous ones, have already been made by [5]. in this section, we provide information on how to keep the epidemic manageable. to this end, we formulate suitable optimal control problems (ocps) and solve them numerically. since we take neither vaccines nor re-infections into account, we consider the epidemic to be over once herd immunity is achieved, i.e. a state where the introduction of new infectious agents does not lead to an outbreak. therefore, our main goal is to reach herd immunity with as little social distancing as possible while maintaining strict limits on the icu occupancy to avoid a breakdown of the health care system. we call a control u = (β, θ) of the system (3) feasible if the constraint on the icu occupancy holds for all age groups i ∈ {1, . . . , n g } and is satisfied for all t. a natural stopping point for simulations is when the share of susceptibles has decreased enough to ensure herd immunity even when all countermeasures are lifted completely. the time-dependent effective reproduction number r(t), that is, the mean number of secondary cases a primary case will cause at time t, can be used to determine whether herd immunity has been reached: this will be the case if r(t) is less than 1. if there is only one infected compartment, as in a simple sir model, the latter condition is equivalent to i̇(t) < 0.
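for a homogeneous sir model the condition r(t) = r 0 · s(t)/n < 1 yields the classical herd-immunity threshold; with r 0 = 2.5 this means immunity once 60% of the population is no longer susceptible (heterogeneous populations, as noted above, reach it earlier):

```python
R0 = 2.5
herd_threshold = 1 - 1 / R0   # fraction that must be immune: 0.6 for R0 = 2.5

def r_eff(susceptible_fraction, r0=R0):
    # homogeneous SIR: r(t) = r0 * s(t) / n
    return r0 * susceptible_fraction
# exactly at the threshold, r_eff(1 - herd_threshold) equals 1
```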
if there is more than one infected compartment, as in model (2), [8, 38] have suggested to compute r ngm (t), based on the so-called next-generation matrix, as a proxy for the effective reproduction number. then, r ngm (t) exhibits the same threshold property as r(t), that is, r ngm (t) < 1 implies herd immunity. thus we use r ngm (t) to check whether our simulations have reached herd immunity. a time horizon of two years (104 weeks) sufficed for all our simulations. this section is structured as follows. first, we verify the existence of a feasible pure testing strategy, i.e. one without enforcing social distancing. note that due to delays in testing, the existence of a solution is not trivial and depends on the initial value. next, we establish an upper bound on the maximal number of tests per day and investigate to what extent social distancing is required in order to ensure feasibility. throughout our simulations, we assume the length of one control interval to be one week. this reflects the practical constraint that the government cannot change policies arbitrarily often but, more realistically, on a weekly basis. throughout our simulations we use matlab's built-in sequential quadratic programming (sqp) tool to solve the ocps. here, our goal is to maintain a hard cap on the number of required icus with as few tests as possible and without enforcing social distancing, i.e. β ≡ β 0 . to this end, we solve the ocp (6). the objective function penalizes the total number of tests over the entire time horizon [t 0 , t f ] with t f > t 0 ≥ 0. the equality constraint (6c) captures the system dynamics while the one-sided box constraints (6e) ensure that the testing rates cannot be negative. figure 3 depicts the optimal controls as well as the total number of tests and the number of detected cases per day, while figure 4 shows the impact on the evolution of the epidemic.
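the next-generation-matrix construction referenced above is only cited, not spelled out, in this excerpt. a common form of it, sketched here for an age-structured sir model (not the full seitphr model), takes the spectral radius of k ij (t) = s i (t) β ij / (removal rate):

```python
import numpy as np

def r_ngm(beta, s_frac, removal_rate):
    """effective reproduction number via a next-generation matrix.

    beta: (n, n) transmission rates, s_frac: current susceptible fractions,
    removal_rate: rate of leaving the infectious state (equal across groups here).
    """
    K = (s_frac[:, None] * beta) / removal_rate
    return max(abs(np.linalg.eigvals(K)))
```

with a fully susceptible population this recovers r 0 , and it scales down as susceptibles are depleted, which is exactly the threshold behaviour (r ngm (t) < 1) used to detect herd immunity.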
here, we computed the effective reproduction number r ngm (t) at each time step, demonstrating that we reached herd immunity. we observe that there exists a testing strategy that ensures feasibility, which was not obvious from the outset because of the assumed delays. in particular, the bound (6b) is active once it is reached, i.e. h icu ≡ h icu max , and becomes inactive when the number of susceptible people falls below a certain threshold and r ngm (t) < 1, indicating the onset of herd immunity. the optimal solution of (6) satisfies the so-called turnpike property [15]. typically, turnpikes indicate the optimal operating state of a system. these are steady states at which the running costs are minimized. in our example, since we do not penalize the number of required icus, the best strategy is to stay at the upper bound while saving tests. once the objective function value reaches zero, the system eventually leaves this state. in particular, regardless of the initial value, the system is steered towards this optimal operating point. as a consequence, a rough estimation suffices as initial guess for our simulations. a rigorous analysis of these turnpikes, however, is left for future research. however, these results are only of theoretical interest, since this optimal testing strategy would be prohibitively expensive and might not even be implementable at all. for instance, regarding figure 3, one observes that the mean testing rate reaches about 0.5, which corresponds to being tested every two days on average. moreover, the total number of tests per day required for this approach is more than 12,000,000 on average (over 65 weeks), compared to t max ≈ 170,000 daily tests which are currently conducted in germany. note that, even with this enormous testing effort, the number of detected cases, t s + t o , is rather small since the number of infectious individuals is small compared to the total population.
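the infeasibility of pure mass-testing follows from simple arithmetic (the population figure below is an approximation for germany, not taken from the paper):

```python
population = 83_000_000       # germany, approximate
mean_testing_rate = 0.5       # per person per day, i.e. tested every 2 days on average

tests_at_optimal_rate = population * mean_testing_rate  # tens of millions per day
t_max = 1_200_000 / 7                                   # weekly capacity spread over days
demand_vs_capacity = 12_000_000 / t_max                 # reported optimum vs. the bound
```

the optimal strategy thus demands roughly 70 times the realistic daily capacity t max.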
in conclusion, mass-testing alone currently does not suffice to maintain hospitalization caps in reality. these arguments support the government's decision to introduce additional measures like social distancing and hygiene concepts. however, cheap rapid test kits might change the situation favorably, as they could be made widely available and self-administered while giving immediate test results. in the following subsections, we enforce t max as an upper bound on the number of daily tests. under this additional constraint we then determine the minimal amount of social distancing required to reach herd immunity. the success of such measures depends on the acceptance and thus compliance by the general population. in a first step, we determine an optimal social distancing strategy by penalizing the deviation of β from β 0 equally over all age groups. this might increase acceptance in the general population due to the (perceived) fairness of such measures: everyone is treated equally and contacts are reduced by the same proportion for everyone. in reality such strategies may be hard to conceive, as different measures affect the age groups differently, e.g. closing schools and nurseries affects those in the lowest age group the most while leaving the oldest age group unaffected. nevertheless, a mixture of many different non-pharmaceutical measures may be able to achieve such a reduction in contacts. we introduce a time-varying factor δ = δ(t) describing the amount of social distancing that is implemented. moreover, we choose to penalize the ℓ 2 deviation of this control input from δ = 1 in the objective function in order to smooth the optimal control. for instance, penalizing the ℓ 1 deviation yields bang-bang controls, i.e. the optimal solution jumps back and forth between the two extremal options: no contact restrictions and lockdown (simulations not shown).
therefore, we determine an optimal homogeneous social-distancing policy by solving the ocp (7). note that we allow the tests to be distributed among the age groups by not fixing θ i but enforcing (7e) and (7g). furthermore, we introduce a regularization term with weight κ = 10 −5 . the choice of κ is based on simulations. in contrast to (6), we always find a feasible solution of (7) if the epidemic has not yet evolved too far. more precisely, by choosing δ = 0, which corresponds to a complete lockdown, we are (theoretically) able to stop the spread. therefore, if the initial number of people with a severe course of infection is sufficiently low, the upper bound on the number of icus will not be violated. a highly fluctuating social distancing strategy may lead to low acceptance in the general population, because people have to adapt to new rules every few weeks. thus, before we solve (7), let us have a look at what happens if we consider a constant value of δ over time, i.e. a social distancing strategy without fluctuations. figure 5 (left) shows that fewer contacts result in a longer time for the epidemic to abate on the one hand, but a lower number of total infections within the considered time horizon on the other hand. moreover, figure 5 (middle) visualizes that quite strict social distancing is needed in order to meet the icu capacities. the maximal feasible value of δ is 0.487, i.e. contacts need to be more than halved over three years. furthermore, once we lift the restrictions, see figure 6, there might be another outbreak. in particular, the stronger the restrictions were in the beginning, the stronger the second outbreak will be. therefore, it is essential to establish herd immunity before lifting all restrictions, and to adapt the policy over time. a visualization of the optimal solution of (7) can be found in figures 7 and 8. as mentioned above, the bound on h icu is not violated.
since the weight κ is chosen sufficiently small, the upper bound on the total number of tests per day is active as long as the upper bound on δ is not. however, note that not all age groups are tested equally. more precisely, only the middle-aged group is tested at all. the reason is that this group is the largest (n 2 > n 1 + n 3 ) and has the highest contact rates (cf. (5)), and therefore contributes more to the spread of the epidemic than the other groups. furthermore, we observe that the social distancing policy has to be quite strict in the beginning. in particular, min t δ(t) ≈ 0.3, which corresponds to a reduction of average contacts per person by 70%. however, this can be qualitatively compared to the measures taken in germany starting in mid march 2020, when contacts were reduced by school and restaurant closures as well as other contact restrictions. in conclusion, social distancing is an effective tool to keep the epidemic manageable. comparing the results of (7) to the simulations with constant δ, we see that a (partial) lockdown appears inevitable. however, our simulations suggest letting the epidemic evolve for a few weeks, then enforcing a contact reduction down to approximately 30% for 2-4 weeks before slowly lifting the restrictions over the next 12 months until herd immunity is achieved. the constraint that contacts are reduced by the same proportion for each age group is restrictive, and it is plausible that more efficient solutions exist when contact reductions are distributed differently across age groups. one reason to consider such a strategy is that it may be more efficient at stopping the spread of the epidemic; as mentioned above, the middle-age group is the driver of the epidemic while the oldest age group consists of the most vulnerable individuals. in any case, such an age-differentiated social distancing strategy needs to be accepted by the whole population to be successful.
hence, we improve the social distancing policy computed above by allowing it to depend on age. given the solution (θ*, δ*) of (7), we solve the ocp (8). here, we use δ* to define β min ij = min t δ*(t) β nom ij , i.e. the lower bound on β in (8) is the worst case of (7). therefore, no one is treated worse than when applying homogeneous social distancing. note that (θ*, β) with β = δ* β 0 is feasible for problem (8). as in (7), we penalize testing as soon as β(t) = β 0 holds. results for (8) can be found in figures 9 and 10, where β̄ describes the average number of contacts per person and day in a heterogeneous population. here, we used κ = 10 −5 . the corresponding value for β 0 is β̄ 0 = 0.4167. this allows comparing the solution β ij (t) with β̄ 0 δ*(t) obtained from (7). figure 9: optimal age-dependent social-distancing strategy for three age groups over two years. figure 10: evolution of the compartments associated with controls depicted in figure 9. similar to the solution of (7), the upper bound on testing is active most of the time, while essentially only the middle-aged group is tested. the social distancing measures are less restrictive than for (7), which makes compliance with the measures more likely. however, the measures could be perceived as unfair, since the contacts of the oldest age group are restricted most. moreover, the contacts of the middle-age group are least restricted. therefore, the working class would be allowed to go to work, which benefits the economy. in conclusion, social distancing is crucial to avoid an overload of the hospitals. in addition, testing middle-aged people helps to reduce the required amount of social distancing. furthermore, all presented strategies support a lockdown a few weeks into the epidemic, which is followed by lifting the restrictions step by step until herd immunity sets in.
age-differentiated social distancing might be hard to argue for, but it helps to end the epidemic several months earlier and, therefore, supports the economy. the control strategies derived in the previous subsections provide rough guidelines for how the epidemic can be controlled. however, from a decision maker's perspective, it will be hard to argue for policies taking effect in the far future. in particular, there are many uncertainties that might affect the performance of the control strategy over the time span of two years, and hence the control strategy needs to be adjusted over time. therefore, decisions need to be revised constantly, adapting to the changing conditions during the epidemic's progress. model predictive control (mpc) provides a state-of-the-art methodology to tackle such problems. figure 11: optimal control for solving (8) in closed loop for varying prediction horizon length. for the sake of readability, we depict average values of θ and sums of t tot over the age groups. figure 12: evolution of the epidemic based on the controls depicted in figure 11. for the sake of readability, only the sum over the age groups is visualized. the basic idea of mpc is to consecutively solve a series of ocps over a smaller horizon of k control intervals rather than solving a single ocp over the whole horizon. more precisely, only the first part of the optimal control derived by solving such an auxiliary ocp is implemented. then, the time window is shifted, and the procedure is repeated based on updated measurements. for a detailed introduction to mpc we refer to [32]. here, we tackle (8) via mpc; the earlier problems can be treated analogously. the mpc scheme for (8) is summarized in algorithm 1. input: prediction horizon length k, length of control interval ∆t. set time t = t 0 . repeat: 1. obtain the current state x̄ = x(t). 2. solve the auxiliary ocp over the horizon [t, t + k∆t] with initial value x̄. 3. implement the first part of the resulting optimal control on [t, t + ∆t] and set t = t + ∆t. results based on varying prediction horizon lengths can be found in figures 11 and 12.
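the receding-horizon idea behind algorithm 1 can be illustrated on a toy scalar system (the dynamics, cost and grid search below are made up for illustration; the paper uses an sqp solver on the epidemic ocp (8)):

```python
def mpc(x0, horizon_k, n_steps, dt=1.0):
    """toy MPC loop: steer the unstable system x' = 0.1*x - u towards 0."""
    x, applied = x0, []
    candidates = [0.0, 0.25, 0.5, 0.75, 1.0]
    for _ in range(n_steps):
        best_u, best_cost = 0.0, float("inf")
        for u in candidates:                    # solve the auxiliary OCP (grid search here)
            xp, cost = x, 0.0
            for _ in range(horizon_k):          # predict over k control intervals
                xp = xp + dt * (0.1 * xp - u)
                cost += xp**2 + 0.1 * u**2      # penalize state and control effort
            if cost < best_cost:
                best_u, best_cost = u, cost
        applied.append(best_u)                  # implement only the first control
        x = x + dt * (0.1 * x - best_u)         # shift the window, re-measure the state
    return x, applied
```

the loop mirrors steps 1-3 of the scheme: measure, optimize over a short horizon, apply the first control, repeat.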
the basic structure of both the optimal control and the associated states is comparable to the open-loop solution presented in the previous subsection. therefore, we stopped the simulations after one and a half years. the length of the prediction horizon mainly affects the optimal social distancing policy. in particular, the larger the prediction horizon, the less social distancing is needed in total. more precisely, for larger k, we implement a slightly stricter lockdown but can start it later and relax it earlier. furthermore, the larger k gets, the closer the optimal solution is to the open-loop solution. in particular, the mpc solutions qualitatively resemble the open-loop solution: after an early lockdown, social distancing is slowly lifted. for k = 3, the icu capacity reaches its upper limit earlier due to the laissez-faire policy in the beginning. however, this constraint also becomes inactive earlier. for even shorter prediction horizons recursive feasibility cannot be guaranteed, i.e. the icu cap might be violated (simulations not shown). so far, we assumed both the upper bound on the number of tests per day and the one on the number of icus to be fixed at our chosen values. in practice, these conditions may change: free icu capacity might exhibit seasonal patterns, and the number of possible tests per day depends on infrastructure and available personnel. in addition, varying the upper bounds is useful to illustrate the benefits of increased testing and higher icu capacities. in this subsection, we investigate the impact of these parameters on the optimal social distancing policy numerically. first, we study the effect t max has on social distancing by solving (7) via mpc, see figure 13 (left). as pointed out in the previous subsection, the prediction horizon length affects the start and end time of the measures as well as their peak (simulations not shown).
in addition, increasing t max by some factor t max fac ≥ 1 shifts the whole δ curve upwards, i.e., as expected, the more tests are available, the less social distancing is required. furthermore, figure 13 (left) visualizes the impact of t max on the objective function value of (7). figure 13: impact of t max on social-distancing costs (left) and of h icu max on both social-distancing costs (middle) and testing (right). in the last two subfigures the currently available number of icus in germany is highlighted by a vertical dashed line. the dashed horizontal lines in the right-most figure indicate the total testing capacities over the entire simulation horizon; the factor of modification of t max is denoted by t max fac . second, we investigate the impact of the number of icus on the optimal solution of (8). results can be found in figures 13 and 14. for the simulations in figure 13 (middle and right) we used mpc with prediction horizon k = 12 weeks. figure 13 (middle) clearly shows that the number of available icus directly affects the cost function value. while for a small value of h icu max every additional icu contributes, for large values a saturation seems to take place. in particular, doubling the current number of available icus does help, but the benefit becomes negligible when increasing it further. these phenomena are almost unaffected by doubling or halving t max . however, when there are not enough icus, then the upper bound on t tot is always active, see figure 13 (right), where t tot is at its maximum value all the time. moreover, an increase in the number of icus clearly leads to a reduction in the social distancing measures, as can be seen in figure 14. figure 14: impact of the available number h icu max of icus and the prediction horizon k on the average social distancing. the dotted cyan line refers to the number of currently available icus in germany. the vertical dotted black line marks the end of social distancing measures for that setting.
in summary, increasing test capacities and/or icu capacities helps to reduce measures like social distancing. however, the impact of the number of available icus appears to be much stronger. nonetheless, the qualitative shape of the solutions over time is not affected by varying these constraints. in this paper, we demonstrated how mitigation of the covid-19 epidemic can be achieved by a combination of age-stratified testing and social distancing measures while avoiding a breakdown of the health care system. we showed that in our compartmental model mass testing alone is insufficient to achieve this goal, as it would require unrealistic testing capacities. as a remedy, we designed optimal social distancing strategies with a focus on applicability and acceptance in the general population, i.e. strategies with slowly changing contact reductions. the resulting social distancing measures imitate the measures actually taken in germany, but are lifted at a much slower pace. age-differentiated contact reductions may improve upon these results, as they yield qualitatively similar social distancing strategies and prioritize relaxing restrictions for the work-force and children. to model the process of policy making more realistically, we used mpc, which allows adaptation to deviations from the envisioned course of the epidemic by solving the optimal control problem repeatedly. our analysis reveals that longer prediction horizons allow for faster lifting of restrictions, although long-term predictions may be infeasible in practice. additionally, we showed that the number of available intensive care units is a key factor influencing the required amount of social distancing. we believe that our model with the chosen parameters reflects reality sufficiently well to provide qualitatively valid insight into how testing and social distancing can control the spread of sars-cov-2.
we learned that mass testing alone is, assuming realistic testing capacities, not sufficient to avoid a breakdown of the health care system in germany. to prevent this, one has to implement strict contact reductions early on, which, ideally, should then be eased slowly. if one allows these reductions to vary by age, one is able to relax restrictions for the (working) middle-age group, at the cost of reducing contacts of the more vulnerable older population. while short-term planning of measures is unable to control the exponential growth of cases, medium-term planning produces strategies that, qualitatively, do not differ from optimal ones while being flexible enough to adapt to new circumstances. finally, as expected, the number of available intensive care units dictates how fast herd immunity can be reached and how much total social distancing is necessary. however, we caution the reader against interpreting these results in a quantitative way, as our model has not been devised to produce precise predictions. similarly, we want to stress that we do not provide concrete policies to implement, as the impact of particular countermeasures on β is not easily quantified. concerning other influences on the epidemic's evolution, note that we have not yet considered vaccinations or re-infections, both of which could be included in our model without difficulties if parameters are available to model them. as our model is based on odes, interaction effects such as contact tracing cannot be included. agent-based (stochastic) models are able to handle these critical effects and could be seen as a natural extension of our (deterministic) compartmental model. solving the resulting stochastic optimal control problems would then require more sophisticated techniques, however.
a model for covid-19 with isolation
optimal control of deterministic epidemics
time-optimal control strategies in sir epidemic models
optimal control techniques based on infection age for the study of the covid-19 epidemic
a mathematical model reveals the influence of population heterogeneity on herd immunity to sars-cov-2
applying optimal control theory to complex epidemiological models to inform real-world disease management
model predictive control to mitigate the covid-19 outbreak in a multi-region scenario
on the definition and the computation of the basic reproduction ratio r0 in models for infectious-diseases in heterogeneous populations
charakteristik von 50 hospitalisierten covid-19-patienten mit und ohne ards
maintaining hard infection caps in epidemics via the theory of barriers. e-print
optimal vaccination and control strategies against dengue
modelling the covid-19 epidemic and implementation of population-wide interventions in italy
optimal quarantine strategies for the covid-19 pandemic in a population with a discrete age structure
approximation properties of receding horizon optimal control
optimal control of epidemics with limited resources
temporal dynamics in viral shedding and transmissibility of covid-19
the mathematics of infectious diseases
policy responses to covid-19
world economic outlook: a long and difficult ascent
beyond just flattening the curve: optimal control of epidemics with purely non-pharmaceutical interventions
the incubation period of coronavirus disease 2019 (covid-19) from publicly reported confirmed cases: estimation and application
evolution of the covid-19 vaccine development landscape
an optimal predictive control strategy for covid-19 (sars-cov-2) social distancing policies in brazil
social contacts and mixing patterns relevant to the spread of infectious diseases
stability of sis spreading processes in networks with non-markovian transmission and recovery
the territorial impact of covid-19: managing the crisis across levels of government
a systematic review of covid-19 epidemiology based on current evidence
optimal control of the covid-19 pandemic with non-pharmaceutical interventions
a survey of industrial model predictive control technology
model predictive control: theory, computation, and design
age-structured non-pharmaceutical interventions for optimal control of covid-19 epidemic. e-print, medrxiv
vorläufige bewertung der krankheitsschwere von covid-19 in deutschland basierend auf übermittelten fällen gemäß infektionsschutzgesetz
bevölkerung: deutschland, stichtag, altersjahre
modeling, state estimation, and optimal control for the us covid-19 outbreak
reproduction numbers and subthreshold endemic equilibria for compartmental models of disease transmission
robust economic model predictive control of continuous-time epidemic processes
clinical presentation and virological assessment of hospitalized cases of coronavirus disease
world health organization. coronavirus disease 2019 (covid-19) situation report 51
non pharmaceutical interventions for optimal control of covid-19
we thank kurt chudej (university of bayreuth) for insights on modelling pandemics and manuel schaller (tu ilmenau) for fruitful discussions on optimal control and the turnpike property.
key: cord-143847-vtwn5mmd authors: ryffel, théo; pointcheval, david; bach, francis title: ariann: low-interaction privacy-preserving deep learning via function secret sharing date: 2020-06-08 journal: nan doi: nan sha: doc_id: 143847 cord_uid: vtwn5mmd we propose ariann, a low-interaction framework to perform private training and inference of standard deep neural networks on sensitive data.
this framework implements semi-honest 2-party computation and leverages function secret sharing, a recent cryptographic protocol that only uses lightweight primitives to achieve an efficient online phase with a single message of the size of the inputs, for operations like comparison and multiplication which are building blocks of neural networks. built on top of pytorch, it offers a wide range of functions including relu, maxpool and batchnorm, and allows the use of models such as alexnet or resnet18. we report experimental results for inference and training over distant servers. last, we propose an extension to support n-party private federated learning. the massive improvements of cryptography techniques for secure computation over sensitive data [15, 13, 28] have spurred the development of the field of privacy-preserving machine learning [45, 1]. privacy-preserving techniques have become practical for concrete use cases, thus encouraging public authorities to use them to protect citizens' data, for example in covid-19 apps [27, 17, 38, 39]. however, tools are lacking to provide end-to-end solutions for institutions that have little expertise in cryptography while facing critical data privacy challenges. a striking example is hospitals, which handle large amounts of data while having relatively constrained technical teams. secure multiparty computation (smpc) is a promising technique that can efficiently be integrated into machine learning workflows to ensure data and model privacy, while allowing multiple parties or institutions to participate in a joint project. in particular, smpc provides intrinsic shared governance: because data are shared, none of the parties can decide alone to reconstruct them. this is particularly suited for collaborations between institutions willing to share ownership of a trained model. use case. the main use case driving our work is the collaboration between healthcare institutions such as hospitals or clinical research laboratories.
such collaboration involves a model owner and possibly several data owners like hospitals. as the model can be a sensitive asset (in terms of intellectual property, strategic asset or regulatory and privacy issues), standard federated learning [29, 7] that does not protect against model theft or model retro-engineering [24, 18] is not suitable. to data centers, but are likely to remain online for long periods of time. last, parties are honest-but-curious [20, chapter 7.2.2] and care about their reputation. hence, they have little incentive to deviate from the original protocol, but they will use any information available in their own interest. contributions. by leveraging function secret sharing (fss) [9, 10], we propose the first low-interaction framework for private deep learning which drastically reduces communication to a single round for basic machine learning operations, and achieves the first private evaluation benchmark on resnet18. • we build on existing work on function secret sharing to design compact and efficient algorithms for comparison and multiplication, which are building blocks of neural networks. they are highly modular and can be assembled to build complex workflows. • we show how these blocks can be used in machine learning to implement operations for secure evaluation and training of arbitrary models on private data, including maxpool and batchnorm. we achieve single-round communication for comparison, convolutional or linear layers. • last, we provide an implementation and demonstrate its practicality both in lan (local area network) and wan settings by running secure training and inference on cifar-10 and tiny imagenet with models such as alexnet [31] and resnet18 [22]. related work. related work in privacy-preserving machine learning encompasses smpc and homomorphic encryption (he) techniques. he only needs a single round of interaction but does not support non-linearities efficiently.
for example, ngraph-he [5] and its extensions [4] build on the seal library [44] and provide a framework for secure evaluation that greatly improves on the seminal cryptonets work [19], but they resort to polynomials (like the square) for activation functions. smpc frameworks usually provide faster implementations using lightweight cryptography. minionn and deepsecure [34, 41] use optimized garbled circuits [50] that allow very few communication rounds, but they do not support training and alter the neural network structure to speed up execution. other frameworks such as sharemind [6], secureml [36], securenn [47] or more recently falcon [48] rely on additive secret sharing and allow secure model evaluation and training. they use simpler and more efficient primitives, but require a large number of rounds of communication, such as 11 in [47] or 5 + log_2(l) in [48] (typically 10 with l = 32) for relu. aby [16], chameleon [40] and more recently aby3 [35] mix garbled circuits, additive or binary secret sharing based on what is most efficient for the operations considered. however, conversion between those can be expensive, and they do not support training, except aby3. last, works like gazelle [26] combine he and smpc to make the most of both, but conversion can also be costly. works on trusted execution environments are left out of the scope of this article as they require access to dedicated hardware [25]. data owners which cannot afford these secure enclaves might be reluctant to use a cloud service and to send their data. notations. all values are encoded on n bits and live in z_{2^n}. note that for a perfect comparison, y + α should not wrap around and become negative. because y is in practice small compared to the n-bit encoding amplitude, the failure rate is less than one comparison in a million, as detailed in appendix c.1. security model.
we consider security against honest-but-curious adversaries, i.e., parties following the protocol but trying to infer as much information as possible about others' input or function share. this is a standard security model in many smpc frameworks [6, 3, 40, 47] and is aligned with our main use case: parties that would not follow the protocol would face major backlash for their reputation if they got caught. the security of our protocols relies on the indistinguishability of the function shares, which informally means that the shares received by each party are computationally indistinguishable from random strings. a formal definition of security is given in [10]. as for malicious adversaries, i.e., parties who would not follow the protocol: since all the data available to them are random, they cannot get any information about the inputs of the other parties, including the parameters of the evaluated functions, unless the parties reconstruct some shared values. the later and the fewer values are reconstructed, the better. as mentioned by [11], our protocols could be extended to guarantee security with abort against malicious adversaries using mac authentication [15], which means that the protocol would abort if parties deviated from it. our algorithms for private equality and comparison are built on top of the work of [10], so the security assumptions are the same as in that article. however, our protocols achieve higher efficiency by specializing in the operations needed for neural network evaluation or training. we start by describing private equality, which is slightly simpler and gives useful hints about how comparison works. the equality test consists of comparing a public input x to a private value α. evaluating the input using the function keys can be viewed as walking a binary tree of depth n, where n is the number of bits of the input (typically 32). among all the possible paths, the path from the root down to α is called the special path.
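as a concrete illustration of this setting (a minimal sketch with our own helper names, not the paper's code), values live in z_{2^n}, a secret y is additively shared between the two parties, and a comparison input is only ever revealed in masked form x = y + α mod 2^n:

```python
import random

N = 32                       # bit width: all values live in z_{2^32}
MOD = 1 << N

def share(y):
    """split y into two additive shares modulo 2^N."""
    r = random.randrange(MOD)
    return r, (y - r) % MOD

def reconstruct(y0, y1):
    return (y0 + y1) % MOD

y = 123456                   # a small secret, as fixed-precision values are
y0, y1 = share(y)
assert reconstruct(y0, y1) == y

# for a comparison, y is masked by a random alpha and only x is revealed
alpha = random.randrange(MOD)
x = (y + alpha) % MOD        # public value; leaks nothing about y alone
# the fss comparison then works on x; if y + alpha wraps past 2^N the
# result can be wrong, which is why keeping y small makes failures rare
assert (x - alpha) % MOD == y
```

neither share nor the masked value reveals y on its own; only reconstruction does, which is exactly what the protocols below avoid until the very end.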
figure 1 illustrates this tree and provides a compact representation which is used by our protocol, where we do not detail branches for which all leaves are 0. evaluation goes as follows: two evaluators are each given a function key which includes a distinct initial random state (s, t) ∈ {0, 1}^λ × {0, 1}. each evaluator starts from the root, at each step i goes down one node in the tree and updates its state depending on the bit x[i], using a common correction word cw^(i) ∈ {0, 1}^{2(λ+1)} from the function key. at the end of the computation, each evaluator outputs t. as long as x[i] = α[i], the evaluators stay on the special path, and because the input x is public and common, they both follow the same path. if a bit x[i] ≠ α[i] is met, they leave the special path and should output 0; else, they stay on it all the way down, which means that x = α and they should output 1. the main idea is that while they are on the special path, the evaluators should have states (s_0, t_0) and (s_1, t_1) respectively, such that s_0 and s_1 are i.i.d. and t_0 ⊕ t_1 = 1. when they leave it, the correction word should act so that s_0 = s_1, still indistinguishable from random, and t_0 = t_1, which ensures t_0 ⊕ t_1 = 0. each evaluator outputs its t_j and the result is given by t_0 ⊕ t_1. the formal description of the protocol is given below and is composed of two parts: first, in algorithm 1, the keygen algorithm consists of a preprocessing step to generate the function keys, and then, in algorithm 2, eval is run by two evaluators to perform the equality test. it takes as input the private share held by each evaluator and the function key that they have received. they use g : {0, 1}^λ → {0, 1}^{2(λ+1)}, a pseudorandom generator, where the output set is {0, 1}^{λ+1} × {0, 1}^{λ+1}, and operations modulo 2^n implicitly convert back and forth between n-bit strings and integers.
intuitively, the correction words cw^(i) are built from the expected state of each evaluator on the special path, i.e., the state that each should have at each node i if it is on the special path given some initial state. during evaluation, a correction word is applied by an evaluator only when it has t = 1. hence, on the special path, the correction is applied by only one evaluator at each bit. algorithm 1 (keygen: key generation for equality to α). if at step i the evaluator stays on the special path, the correction word compensates the current states of both evaluators by xor-ing them with themselves and re-introduces a pseudorandom value s (either s^r_0 ⊕ s^r_1 or s^l_0 ⊕ s^l_1), which means the xor of their states is now (s, 1) but those states are still indistinguishable from random. on the other hand, if x[i] ≠ α[i], the new state takes the other half of the correction word, so that the xor of the two evaluators' states is (0, 0). from there, they have the same states and both have either t = 0 or t = 1. they will continue to apply the same corrections at each step and their states will remain the same, with t_0 ⊕ t_1 = 0. a final computation is performed to obtain shares [[t]] modulo 2^n of the result bit t = t_0 ⊕ t_1 ∈ {0, 1}, which is shared modulo 2. from the privacy point of view, when the seed s is (truly) random, g(s) also looks like a random bit-string (this is a pseudorandom bit-string). each half is used either in the cw or in the next state, but not both. therefore, the correction words cw^(i) do not contain information about the expected states and, for j = 0, 1, the output key k_j is independently uniformly distributed with respect to α and k_{1−j}, in a computational way. as a consequence, at the end of the evaluation, for j = 0, 1, t_j also follows a distribution independent of α. until the shared values are reconstructed, even a malicious adversary cannot learn anything about α or the inputs of the other player.
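to make the keygen/eval mechanics concrete, here is a toy re-implementation of the equality test (a sketch under our own simplifications: sha-256 stands in for the pseudorandom generator g, seeds are shortened to 16 bytes, and we output the xor-shared bit t directly instead of converting it to shares modulo 2^n):

```python
import hashlib, os

LAMBDA = 16  # toy seed length in bytes

def prg(seed):
    """toy prg: expand a seed into (sL, tL, sR, tR) via sha-256."""
    dL = hashlib.sha256(seed + b"L").digest()
    dR = hashlib.sha256(seed + b"R").digest()
    return dL[:LAMBDA], dL[LAMBDA] & 1, dR[:LAMBDA], dR[LAMBDA] & 1

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def keygen(alpha, n):
    """dealer: produce one key per evaluator, hiding the special path to alpha."""
    s_init = [os.urandom(LAMBDA), os.urandom(LAMBDA)]
    s, t = list(s_init), [0, 1]
    cws = []
    for i in reversed(range(n)):               # walk the bits of alpha, msb first
        a = (alpha >> i) & 1
        exp = [prg(s[0]), prg(s[1])]
        (sL0, tL0, sR0, tR0), (sL1, tL1, sR1, tR1) = exp
        # seed correction cancels the seeds on the side that leaves the path
        cw = (xor(sL0, sL1) if a else xor(sR0, sR1),
              tL0 ^ tL1 ^ a ^ 1,               # t-correction for the left child
              tR0 ^ tR1 ^ a)                   # t-correction for the right child
        cws.append(cw)
        for b in range(2):                     # advance both simulated states
            sL, tL, sR, tR = exp[b]
            if t[b]:                           # only the t=1 party corrects
                sL, tL = xor(sL, cw[0]), tL ^ cw[1]
                sR, tR = xor(sR, cw[0]), tR ^ cw[2]
            s[b], t[b] = (sR, tR) if a else (sL, tL)
    return (s_init[0], 0, cws), (s_init[1], 1, cws)

def evaluate(key, x, n):
    """evaluator: walk the tree along the bits of the public input x."""
    s, t, cws = key
    for i, cw in zip(reversed(range(n)), cws):
        sL, tL, sR, tR = prg(s)
        if t:
            sL, tL = xor(sL, cw[0]), tL ^ cw[1]
            sR, tR = xor(sR, cw[0]), tR ^ cw[2]
        s, t = (sR, tR) if (x >> i) & 1 else (sL, tL)
    return t                                   # xor of both outputs = [x == alpha]

k0, k1 = keygen(42, 16)
assert evaluate(k0, 42, 16) ^ evaluate(k1, 42, 16) == 1
assert evaluate(k0, 43, 16) ^ evaluate(k1, 43, 16) == 0
```

each party walks the tree with its own key; the outputs xor to 1 exactly when x equals the hidden α, and, as described above, once a party leaves the special path the correction word forces both states to coincide so both outputs cancel.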
function keys should be sent to the evaluators in advance, which requires one extra communication of the size of the keys. we use the trick of [10] to reduce the size of each correction word in the keys from 2(1 + λ) to (2 + λ), by reusing the pseudo-random λ-bit string dedicated to the state used when leaving the special path for the state used when staying on it, since for the latter state the only constraint is the pseudo-randomness of the bitstring. our major contribution to the function secret sharing scheme concerns comparison (which makes it possible to handle non-polynomial activation functions for neural networks): we build on the idea of the equality test to provide a compact and efficient protocol whose structure is very close to the previous one. instead of seeing the special path as a simple path, it can be seen as the frontier of the zone in the tree where x ≤ α. to evaluate x ≤ α, we could evaluate all the paths on the left of the special path and then sum up the results, but this is highly inefficient as it requires exponentially many evaluations. our key idea here is to evaluate all these paths at the same time, noting that whenever one leaves the special path, one falls either on the left side (i.e., x < α) or on the right side (i.e., x > α). hence, we only need to add an extra step at each node of the evaluation, where depending on the bit value x[i], we output a leaf label which is 1 only if x[i] < α[i] and all previous bits are identical. only one among the final label (which corresponds to x = α) and the leaf labels can be equal to one, because only a single path can be taken. therefore, the evaluators return the sum of all the labels to get the final output. the full description of the comparison protocol is detailed in appendix a, together with a detailed explanation of how it works. we now apply these primitives to a private deep learning setup in which a model owner interacts with a data owner.
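the 'leaf label' decomposition just described can be checked in plaintext (our own toy version, with no secret sharing involved): walking the bits of x from the most significant one, a label fires when all previous bits matched and x[i] < α[i], and a final label accounts for x = α:

```python
def leq_labels(x, alpha, n):
    """decompose [x <= alpha] into one label per bit plus a final equality label."""
    labels = []
    prefix_equal = True
    for i in reversed(range(n)):            # msb first
        xi, ai = (x >> i) & 1, (alpha >> i) & 1
        # label is 1 only if all previous bits matched and x[i] < alpha[i]
        labels.append(1 if prefix_equal and xi < ai else 0)
        prefix_equal = prefix_equal and xi == ai
    labels.append(1 if prefix_equal else 0) # final label: x == alpha
    return labels

for x in range(16):
    for alpha in range(16):
        labels = leq_labels(x, alpha, 4)
        assert sum(labels) == (1 if x <= alpha else 0)
        assert sum(labels) <= 1             # at most one label ever fires
```

since only a single root-to-leaf path is taken, at most one label is nonzero, which is why the evaluators can simply return the sum of their label shares.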
the data and the model parameters are sensitive and are secret shared to be kept private. the shape of the input and the architecture of the model are however public, which is a standard assumption in secure deep learning [34, 36]. all our operations are modular and follow this additive sharing workflow: inputs are provided secret shared and are masked with random values before being revealed. this disclosed value is then consumed with preprocessed function keys to produce a secret shared output. each operation is independent of all surrounding operations, which is known as circuit-independent preprocessing [11] and implies that key generation can be fully outsourced without having to know the model architecture. this results in a fast runtime execution with a very efficient online communication, with a single round of communication and a message size equal to the input size for comparison. preprocessing is performed by a trusted third party to build the function keys. this is a valid assumption in our use case, as such a third party would typically be an institution concerned about its image, and it is very easy to check that preprocessed material is correct using a cut-and-choose technique [51]. matrix multiplication (matmul). as mentioned by [11], multiplications fit in this additive sharing workflow. we use beaver triples [2]. matrix multiplication is identical but uses matrix beaver triples [14]. the relu activation function is supported as a direct application of our comparison protocol, which we combine with a pointwise multiplication. convolution can be computed as a single matrix multiplication using an unrolling technique, as described in [12] and illustrated in figure 3 in appendix c.2. the argmax operator, used in classification to determine the predicted label, can also be computed in a constant number of rounds using pairwise comparisons, as shown by [21]. the main idea here is, given a vector (x 0 , . . .
, x m−1 ), to compute the matrix m ∈ r^{(m−1)×m} where each row m_i = (x_{i+1 mod m}, . . . , x_{i+m−1 mod m}). then, each element of column j is compared to x_j, which requires m(m − 1) parallel comparisons. a column j where all elements are lower than x_j indicates that j is a valid result for the argmax. maxpool can be implemented by combining these two methods: the matrix is first unrolled like in figure 3 and the maximum of each row is then computed using parallel pairwise comparisons. more details and an optimization when the kernel size k equals 2 are given in appendix c.3. batchnorm is implemented using an approximate division with newton's method as in [48]: given an input x = (x_0, . . . , x_{m−1}) with mean µ and variance σ^2, we return γ · θ · (x − µ) + β. variables γ and β are learnable parameters and θ is the estimated inverse of √(σ^2 + ε) with ε ≪ 1, computed iteratively using: θ_{i+1} = θ_i · (3 − (σ^2 + ε) · θ_i^2)/2. more details can be found in appendix c.4. more generally, for more complex activation functions such as softmax, we can use polynomial approximation methods, which achieve acceptable accuracy despite involving a higher number of rounds [37, 23, 21]. table 1 summarizes the online communication cost of each operation, and shows that basic operations such as comparison have a very efficient online communication. we also report results from [48], which achieve good experimental performance. these operations are sufficient to evaluate real-world models in a fully private way. to also support private training of these models, we need to perform a private backward pass. as we overload operations such as convolutions or activation functions, we cannot use the built-in autograd functionality of pytorch. therefore, we have developed a custom autograd functionality, where we specify how to compute the derivatives of the operations that we have overloaded. backpropagation also uses the same basic blocks as those used in the forward pass.
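the beaver-triple multiplication mentioned above, the basic block of the linear layers, can be sketched as follows (a single-machine simulation with our own helper names, not the paper's implementation): a dealer preprocesses a triple c = a · b, the parties reveal the masked differences e = x − a and f = y − b, and recombine their shares locally:

```python
import random

MOD = 1 << 32                      # work in z_{2^32} as in the paper

def share(v):
    r = random.randrange(MOD)
    return [r, (v - r) % MOD]

def reveal(sh):
    return (sh[0] + sh[1]) % MOD

# preprocessing: the dealer creates a multiplication triple c = a * b
a, b = random.randrange(MOD), random.randrange(MOD)
a_sh, b_sh, c_sh = share(a), share(b), share((a * b) % MOD)

def beaver_mul(x_sh, y_sh):
    """multiply two shared values, consuming the triple (a, b, c)."""
    # each party locally masks its shares, then e and f are revealed
    e = reveal([(x_sh[j] - a_sh[j]) % MOD for j in range(2)])
    f = reveal([(y_sh[j] - b_sh[j]) % MOD for j in range(2)])
    # z_j = c_j + e*b_j + f*a_j, and e*f is added by one party only,
    # since x*y = c + e*b + f*a + e*f
    z = [(c_sh[j] + e * b_sh[j] + f * a_sh[j]) % MOD for j in range(2)]
    z[0] = (z[0] + e * f) % MOD
    return z

x, y = 1234, 5678
z_sh = beaver_mul(share(x), share(y))
assert reveal(z_sh) == (x * y) % MOD
```

matrix multiplication follows the same pattern with matrix triples [14], and a convolution reduces to one such product through unrolling; a real run would consume one fresh triple per multiplication.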
this 2-party protocol between a model owner and a data owner can be extended to an n-party federated learning protocol where several clients contribute their data to a model owned by an orchestrator server. this approach is inspired by secure aggregation [8], but we do not consider the clients to be phones here, which means we are less concerned with parties dropping out before the end of the protocol. in addition, we do not reveal the updated model at each aggregation or at any stage, hence providing better privacy than secure aggregation. at the beginning of the interaction, the server, which owns the model, initializes it and builds n pairs of additive shares of the model parameters. for each pair i, it keeps one of the shares and sends the other one to the corresponding client i. then, the server runs the training procedure in parallel with all the clients until the aggregation phase starts. aggregation for the server shares is straightforward, as the n shares it holds can simply be averaged locally. but the clients have to average their shares together to get a client share of the aggregated model. one possibility is that clients broadcast their shares and compute the average locally. however, to prevent a client colluding with the server from reconstructing the model contributed by a given client, they hide their shares using masking. this can be done using correlated random masks: client i generates a seed, sends it to client i + 1 while receiving one from client i − 1. client i then generates a random mask m_i using its seed and another m_{i−1} using the seed of client i − 1, and publishes its share masked with m_i − m_{i−1}. as the masks cancel each other out, the computation will be correct. we follow a setup very close to [48] and assess inference and training performance of several networks on the datasets mnist [33], cifar-10 [30], 64×64 tiny imagenet and 224×224 tiny imagenet [49, 42], presented in appendix d.1.
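the correlated-mask aggregation described above can be simulated as follows (a sketch with our own names; each client derives one mask from its own seed and one from its left neighbour's seed, so the masks telescope away in the sum over the ring):

```python
import random

MOD = 1 << 32
n_clients = 5
# each client i holds one additive share of the model (a single scalar here)
shares = [random.randrange(MOD) for _ in range(n_clients)]

# client i generates a seed and sends it to client i+1 (ring topology)
seeds = [random.randrange(2**64) for _ in range(n_clients)]

def mask(seed):
    """derive a deterministic mask from a seed."""
    return random.Random(seed).randrange(MOD)

published = []
for i in range(n_clients):
    m_i = mask(seeds[i])              # own mask, also known to client i+1
    m_prev = mask(seeds[i - 1])       # left neighbour's mask (ring: -1 wraps)
    published.append((shares[i] + m_i - m_prev) % MOD)

# each published value looks random, but the masks cancel in the sum
assert sum(published) % MOD == sum(shares) % MOD
```

since every m_i appears once with a plus sign and once with a minus sign around the ring, the aggregate is exact while no single published value reveals a client's share.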
more precisely, we assess 5 networks as in [48]: a fully-connected network (network-1), a small convolutional network with maxpool (network-2), lenet [32], alexnet [31] and vgg16 [46]. furthermore, we also include resnet18 [22], which to the best of our knowledge has never been studied before in private deep learning. the description of these networks is taken verbatim from [48] and is available in appendix d.2. our implementation is written in python. to use our protocols, which only work in finite groups like z_{2^32}, we convert our input values and model parameters to fixed precision. to do so, we rely on the pysyft library [43]. our inference runtimes reported in table 2 compare favourably with existing work including [34-36, 47, 48], in the lan setting and particularly in the wan setting, thanks to our reduced number of communication rounds. for example, our implementation of network-1 is 2× faster than the best previous result by [35] in the lan setting and 18× faster in the wan setting compared to [48]. for bigger networks such as alexnet on cifar-10, we are still 13× faster in the wan setting than [48]. results are given for a batched evaluation, which allows parallelism and hence faster execution, as in [48]. for larger networks, we reduce the batch size so that the preprocessing material (including the function keys) fits into ram. test accuracy. thanks to the flexibility of our framework, we can train each of these networks in plain text and need only one line of code to turn them into private networks, where all parameters are secret shared. we compare these private networks to their plaintext counterparts and observe that the accuracy is well preserved, as shown in table 3. if we degrade the encoding precision, which by default considers values in z_{2^32}, and the fixed precision, which defaults to 3 decimals, performance degrades as shown in appendix b. training.
we can either train those networks from scratch or fine-tune pre-trained models. training is an end-to-end private procedure, which means the loss and the gradients are never accessible in plain text. we use stochastic gradient descent (sgd), which is a simple but popular optimizer, and support both the hinge loss and the mean square error (mse) loss, since other losses, like the cross entropy used in clear text by [48], cannot be computed over secret shared data without approximations. we report the runtime and accuracy obtained by training the smaller networks from scratch in table 4. note that because of the number of epochs, the optimizer and the loss chosen, accuracy does not match best known results. however, the training procedure is not altered and the trained model will be strictly equivalent to its plaintext counterpart. training cannot complete in reasonable time for larger networks, which are anyway available pre-trained. note that training time includes the time spent building the preprocessing material, as it cannot be fully processed in advance and stored in ram. discussion. for larger networks, we could not use batches of size 128. this is mainly due to the size of the comparison function keys, which is currently proportional to the size of the input tensor, with a multiplication factor of nλ where n = 32 and λ = 128. optimizing the function secret sharing protocol to reduce those keys would lead to massive improvements in the protocol's efficiency. our implementation actually has more communication than is theoretically necessary according to table 1, suggesting that the experimental results could be further improved. as we build on top of pytorch, using machines with gpus could also potentially result in a massive speed-up, as a significant fraction of the execution time is dedicated to computation. last, the accuracies presented in table 3 and table 4 do not match state-of-the-art performance for the models and datasets considered.
this is not due to internal shortcomings of our protocol but to the simplified training procedure we had to use. supporting losses such as the logistic loss, more complex optimizers like adam, and dropout layers would be an interesting follow-up. one can observe the great structural similarity of the comparison protocol given in algorithms 3 and 4 with the equality protocol from algorithms 1 and 2: the equality test is performed in parallel with an additional piece of information out_i at each node, which holds a share of either 0, when the evaluator stays on the special path or has already left it at a previous node, or a share of α[i], when it leaves the special path. this means that if α[i] = 1, leaving the special path implies that x[i] = 0 and hence x ≤ α, while if α[i] = 0, leaving implies x[i] = 1, so x > α and the output should be 0. the final share out_{n+1} corresponds to the previous equality test. we have studied the impact of lowering the encoding space of the input to our function secret sharing protocol from z_{2^32} to z_{2^k} with k < 32. finding the lowest k guaranteeing good performance is an interesting challenge, as the function key size is directly proportional to it. this has to be done together with reducing the fixed precision from 3 decimals down to 1 decimal to ensure private values are not too big, since large values would result in a higher failure rate in our private comparison protocol. we have reported in table 5 our findings on network-1, which is pre-trained and then evaluated in a private fashion. table 5: accuracy (in %) of network-1 given different precision and encoding spaces. what we observe is that 3 decimals of precision is the most appropriate setting for optimal accuracy, while still allowing the encoding space to be slightly reduced down to z_{2^24} or z_{2^28}.
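the fixed-precision encoding studied in table 5 can be sketched as follows (our own minimal version: 3 decimal digits, values in z_{2^k}, negatives in two's complement):

```python
PREC = 1000          # 3 decimal digits of fixed precision

def encode(v, k=32):
    """map a real value into z_{2^k}, with two's-complement negatives."""
    return round(v * PREC) % (1 << k)

def decode(u, k=32):
    if u >= 1 << (k - 1):        # upper half of the ring = negative range
        u -= 1 << k
    return u / PREC

for k in (24, 28, 32):           # the encoding spaces compared in table 5
    for v in (0.0, 3.141, -2.5, 123.456):
        assert abs(decode(encode(v, k), k) - v) < 1e-9
```

shrinking k shrinks the function keys proportionally, but leaves less headroom before y + α wraps around, which is the precision/failure-rate trade-off discussed above.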
because this is not a massive gain and in order to keep the failure rate in comparisons very low, we have kept z_{2^32} for all our experiments. c implementation details. our comparison protocol can fail if y + α wraps around and becomes negative. we can't act on α, because it must be completely random to act as a perfect mask and to make sure the revealed x = y + α mod 2^n does not leak any information about y, but the smaller y is, the lower the error probability will be. [11] suggests a method which uses 2 invocations of the protocol to guarantee perfect correctness, but because it incurs a significant runtime overhead, we instead show that the failure rate of our comparison protocol is very small and is reasonable in contexts that tolerate a few mistakes, as in machine learning. more precisely, we quantify it on real-world examples, namely on network-2 and on the 64×64 tiny imagenet version of vgg16, with a fixed precision of 3 decimals, and find respective failure rates of 1 in 4 million comparisons and 1 in 100 million comparisons. such error rates do not affect the model accuracy, as table 3 shows. figure 4 illustrates how maxpool uses ideas from matrix unrolling and argmax computation. the m × m matrix is first unrolled to an m^2 × k^2 matrix. it is then expanded on k^2 layers, each of which is shifted by a step of 1. next, m^2 k^2 (k^2 − 1) pairwise comparisons are applied simultaneously between the first layer and the other ones, and for each x_i we sum the results of its k^2 − 1 comparisons and check if the sum equals k^2 − 1. we multiply this boolean by x_i and sum up along a line (like x_1 to x_4 in the figure). last, we restructure the matrix back to its initial structure. in addition, when the kernel size k is 2, rows are only of length 4 and it can be more efficient to use a binary tree approach instead, i.e.
compute the maximum of columns 0 and 1, 2 and 3, and the max of the result: it requires log_2(k^2) = 2 rounds of communication and only approximately (k^2 − 1)(m/s)^2 comparisons, compared to a fixed 3 rounds and approximately k^4 (m/s)^2. interestingly, average pooling can be computed locally on the shares without interaction, because it only includes mean operations, but we didn't replace maxpool operations with average pooling, to avoid distorting existing neural network architectures. the batchnorm layer is the only one in our implementation that relies on a polynomial approximation. moreover, compared to [48], the approximation is significantly coarser, as we don't make any costly initial approximation and we reduce the number of iterations of the newton method from 4 to only 3. the typical relative error can be up to 20%, but as the primary purpose of batchnorm is to normalise data, having rough approximations here is not an issue and doesn't affect learning capabilities, as our experiments show. however, it is a limitation for using pre-trained networks: we observed on alexnet adapted to cifar-10 that training the model with a standard batchnorm and evaluating it with our approximation resulted in poor results, so we had to train it with the approximated layer. this section is taken almost verbatim from [48]. we select 4 datasets popularly used for training image classification models: mnist [33], cifar-10 [30], 64×64 tiny imagenet and 224×224 tiny imagenet [49]. mnist mnist [33] is a collection of handwritten digits. it consists of 60,000 images in the training set and 10,000 in the test set. each image is a 28×28 pixel image of a handwritten digit along with a label between 0 and 9. we evaluate network-1, network-2, and the lenet network on this dataset. cifar-10 cifar-10 [30] consists of 50,000 images in the training set and 10,000 in the test set. it is composed of 10 different classes (such as airplanes, dogs, horses etc.)
and there are 6,000 images of each class, each consisting of a colored 32×32 image. we perform private training of alexnet and inference of vgg16 on this dataset. tiny imagenet tiny imagenet [49] consists of two datasets of 100,000 training samples and 10,000 test samples with 200 different classes. the first dataset is composed of colored 64×64 images and we use it with alexnet and vgg16. the second is composed of colored 224×224 images and is used with resnet18. we have selected 6 models for our experimentations. network-1 a 3-layered fully-connected network with relu used in secureml [36]. network-2 a 4-layered network selected in minionn [34] with 2 convolutional and 2 fully-connected layers, which uses maxpool in addition to relu activation. lenet this network, first proposed by lecun et al. [32], was used in automated detection of zip codes and digit recognition. the network contains 2 convolutional layers and 2 fully connected layers. alexnet alexnet is the famous winner of the imagenet ilsvrc-2012 competition [31]. it has 5 convolutional layers and 3 fully connected layers, and it can use batch normalization layers for stability and efficient training. vgg16 vgg16 is the runner-up of the ilsvrc-2014 competition [46]. vgg16 has 16 layers and about 138m parameters. resnet18 resnet18 [22] is the runner-up of the ilsvrc-2015 competition. it is a convolutional neural network that is 18 layers deep and has 11.7m parameters. it uses batch normalisation, and ours is the first private deep learning framework to evaluate this network. model architectures of network-1 and network-2, together with lenet, and the adaptations for cifar-10 of alexnet and vgg16, are precisely depicted in appendix d of [48]. note that for the cifar-10 version of alexnet, the authors have used the version with batchnorm layers, and we have kept this choice. for the 64×64 tiny imagenet version of alexnet, we used the standard architecture from pytorch to have a pretrained network.
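going back to the batchnorm approximation of appendix c.4, the newton iteration θ_{i+1} = θ_i(3 − (σ^2 + ε)θ_i^2)/2 can be checked numerically (a plain-float sketch; the starting guess θ_0 and the tolerance are our own choices, not the paper's initialisation):

```python
import math

def inv_sqrt(v, eps=1e-5, iters=3, theta0=0.1):
    """approximate 1/sqrt(v + eps) with a few newton iterations."""
    theta = theta0
    for _ in range(iters):
        theta = theta * (3 - (v + eps) * theta * theta) / 2
    return theta

sigma2 = 4.0
approx = inv_sqrt(sigma2)
exact = 1 / math.sqrt(sigma2 + 1e-5)
# with only 3 iterations and a crude start the estimate is rough,
# consistent with the ~20% relative error reported above
assert abs(approx - exact) / exact < 0.5
```

more iterations or a better starting point tighten the estimate quickly, but each extra iteration costs secure multiplications, hence the choice of only 3 in the paper.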
it doesn't have batchnorm layers, and we have adapted the classifier part as illustrated in figure 5. note also that we permute relu and maxpool where applicable, like in [48], as this is strictly equivalent in terms of output for the network and reduces the number of comparisons. more generally, we do not alter the network behaviour in any way, except for the approximation on batchnorm. this improves the usability of our framework, as it allows one to take a pre-trained neural network from a standard deep learning library like pytorch and to encrypt it generically with a single line of code. privacy-preserving machine learning: threats and solutions efficient multiparty protocols using circuit randomization optimizing semi-honest secure multiparty computation for the internet ngraph-he2: a high-throughput framework for neural network inference on encrypted data ngraph-he: a graph compiler for deep learning on homomorphically encrypted data sharemind: a framework for fast privacy-preserving computations towards federated learning at scale: system design practical secure aggregation for privacy-preserving machine learning function secret sharing function secret sharing: improvements and extensions secure computation with preprocessing via function secret sharing high performance convolutional neural networks for document processing faster fully homomorphic encryption: bootstrapping in less than 0.1 seconds private image analysis with mpc.
accessed 2019-11-01 multiparty computation from somewhat homomorphic encryption aby-a framework for efficient mixed-protocol secure two-party computation a survey of secure multiparty computation protocols for privacy preserving genetic tests model inversion attacks that exploit confidence information and basic countermeasures cryptonets: applying neural networks to encrypted data with high throughput and accuracy foundations of cryptography deep residual learning for image recognition accuracy and stability of numerical algorithms deep models under the gan: information leakage from collaborative deep learning chiron: privacy-preserving machine learning as a service {gazelle}: a low latency framework for secure neural network inference an efficient multi-party scheme for privacy preserving collaborative filtering for healthcare recommender system overdrive: making spdz great again federated learning: strategies for improving communication efficiency the cifar-10 dataset imagenet classification with deep convolutional neural networks gradient-based learning applied to document recognition mnist handwritten digit database oblivious neural network predictions via minionn transformations aby3: a mixed protocol framework for machine learning secureml: a system for scalable privacy-preserving machine learning an improved newton iteration for the generalized inverse of a matrix, with applications information technology-based tracing strategy in response to covid-19 in south korea-privacy controversies privacy-preserving contact tracing of covid-19 patients chameleon: a hybrid secure computation framework for machine learning applications deepsecure: scalable provably-secure deep learning imagenet large scale visual recognition challenge a generic framework for privacy preserving deep learning privacy-preserving deep learning very deep convolutional networks for large-scale image recognition securenn: efficient and private neural network training falcon: honest-majority 
maliciously secure framework for private deep learning tiny imagenet challenge how to generate and exchange secrets the cut-and-choose game and its application to cryptographic protocols we would like to thank geoffroy couteau, chloé hébant and loïc estève for helpful discussions throughout this project. we are also grateful for the long-standing support of the openmined community and in particular its dedicated cryptography team, including yugandhar tripathi, s p sharan, george-cristian muraru, muhammed abogazia, alan aboudib, ayoub benaissa, sukhad joshi and many others.this work was supported in part by the european community's seventh framework programme (fp7/2007-2013 grant agreement no. 339563 -cryptocloud) and by the french project fui anblic. the computing power was graciously provided by the french company arkhn. key: cord-162105-u0w56xrp authors: centeno, raffy s.; marquez, judith p. title: how much did the tourism industry lost? estimating earning loss of tourism in the philippines date: 2020-04-21 journal: nan doi: nan sha: doc_id: 162105 cord_uid: u0w56xrp the study aimed to forecast the total earnings lost of the tourism industry of the philippines during the covid-19 pandemic using seasonal autoregressive integrated moving average. several models were considered based on the autocorrelation and partial autocorrelation graphs. based on the akaike's information criterion (aic) and root mean squared error, arima(1,1,1)$times$(1,0,1)$_{12}$ was identified to be the better model among the others with an aic value of $-414.51$ and rmse of $47884.85$. moreover, it is expected that the industry will have an estimated earning loss of around 170.5 billion pesos if the covid-19 crisis will continue up to july. 
possible recommendations to mitigate the problem include stopping foreign tourism but allowing regions to open for domestic travel if they are confirmed to have no cases of covid-19, assuming that every region will follow the stringent guidelines to eliminate or prevent transmission; or extending this to countries with no covid-19 cases. according to the philippine statistics authority, tourism accounted for 12.7% of the country's gross domestic product in the year 2018 [1] . moreover, the national economic development authority reported that 1.5% of the country's gdp in 2018 is accounted for by international tourism, with korea, china and the usa having the largest numbers of tourists coming in []. in addition, the department of tourism recorded that 7.4% of the total domestic tourists, or an estimated 3.97 million tourists, both foreign and domestic, were in davao region in 2018 [2] . also, employment in the tourism industry was roughly estimated at 5.4 million in 2018, which constitutes 13% of the employment in the country according to the philippine statistics authority [3] . hence, estimating the total earnings of the tourism industry in the philippines will be very helpful in formulating the necessary interventions and strategies to mitigate the effects of the covid-19 pandemic. this paper will serve as a baseline research to describe and estimate the earnings lost by the said industry. the objective of this research is to forecast the monthly earnings loss of the tourism industry during the covid-19 pandemic by forecasting the monthly foreign visitor arrivals using a seasonal autoregressive integrated moving average. specifically, it aims to answer the following questions: 1. what is the order of the seasonal autoregressive integrated moving average for the monthly foreign visitor arrivals in the philippines? 2. how much earnings did the tourism industry lose during the covid-19 pandemic?
the study covers a period of approximately eight years, from january 2012 to december 2019. also, the modeling technique considered in this research is limited to the autoregressive integrated moving average (arima) and seasonal autoregressive integrated moving average (sarima). other modeling techniques were not tested or considered. the research utilized a longitudinal research design wherein the monthly foreign visitor arrivals in the philippines were recorded and analyzed. a longitudinal research design is an observational research method in which data is gathered for the same subject repeatedly over a period of time [4] . a forecasting method, specifically the seasonal autoregressive integrated moving average (sarima), was used to forecast the future monthly foreign visitor arrivals. in selecting the appropriate model to forecast the monthly foreign visitor arrivals in the philippines, the box-jenkins methodology was used. the data set was divided into two sets: the training set, composed of 86 data points from january 2012 to december 2018; and the testing set, composed of 12 data points from january 2019 to december 2019. the training set was used to identify the appropriate sarima order whereas the testing set was used to measure the accuracy of the selected model using the root mean squared error. the best model, in the context of this paper, was characterized as having a low akaike's information criterion and a low root mean squared error. the data were extracted from the department of tourism website. the data were composed of monthly foreign visitor arrivals from january 2012 to december 2019, comprising 98 data points. the box-jenkins methodology refers to a systematic method of identifying, fitting, checking, and using sarima time series models. the method is appropriate for time series of medium to long length, with at least 50 observations.
the box-jenkins approach is divided into three stages: model identification, model estimation, and diagnostic checking. in the first stage, the first step is to check whether the data is stationary or not. if it is not, differencing is applied to the data until it becomes stationary. a stationary series means that the value of the series fluctuates around a constant mean and variance, with no seasonality over time. plotting the sample autocorrelation function (acf) and sample partial autocorrelation function (pacf) can be used to assess whether the series is stationary or not. also, the augmented dickey-fuller (adf) test can be applied to check whether the series is stationary. the next step is to check if the variance of the series is constant or not. if it is not, data transformations such as differencing and/or the box-cox transformation (e.g. logarithm and square root) may be applied. once done, the parameters p and q are identified using the acf and pacf. if there are 2 or more candidate models, akaike's information criterion (aic) can be used to select which among the models is better; the model with the lowest aic is selected. in the model estimation stage, parameters are estimated by finding the values of the model coefficients which provide the best fit to the data. in this research, the combination of conditional sum of squares and maximum likelihood estimation was used by the researcher. conditional sum of squares was utilized to find the starting values, then maximum likelihood was applied after. diagnostic checking performs residual analysis. this stage involves testing the assumptions of the model to identify any areas where the model is inadequate and whether the corresponding residuals are uncorrelated. the box-pierce and ljung-box tests may be used to test the assumptions. once the model is a good fit, it can be used for forecasting. forecast evaluation involves generating forecasted values equal to the time frame of the model validation set and then comparing these values to the latter.
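the differencing transformation described above is simple enough to sketch directly; the following plain-python helper is an illustration only (the paper's own workflow was written in R):

```python
def difference(x, lag=1):
    """difference a series at the given lag: y_t = x_t - x_(t-lag).
    lag=1 removes a trend component; lag=12 removes monthly seasonality."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]

# a first difference turns a linear trend into a constant series
print(difference([3, 5, 7, 9]))         # [2, 2, 2]
# a seasonal difference at lag 12 for monthly data
print(difference(list(range(24)), 12))  # twelve 12s
```

in practice both differences are often applied in sequence (regular then seasonal) until the acf/pacf plots look stationary.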
the root mean squared error was used to check the accuracy of the model. moreover, the acf and pacf plots were used to check if the residuals behave like white noise, while the shapiro-wilk test was used to perform the normality test. the following statistical tools were used in the data analysis of this study. the sample autocorrelation function measures how correlated past data points are to future values, based on how many time steps these points are separated by. given a time series x_t, we define the sample autocorrelation function, r_k, at lag k as [5] r_k = \frac{\sum_{t=1}^{n-k} (x_t - \bar{x})(x_{t+k} - \bar{x})}{\sum_{t=1}^{n} (x_t - \bar{x})^2}, where \bar{x} is the average of the n observations. the sample partial autocorrelation function measures the correlation between two points that are separated by some number of periods but with the effect of the intervening correlations removed from the series. given a time series x_t, the partial autocorrelation at lag k is the autocorrelation between x_t and x_{t+k} with the linear dependence of x_t on x_{t+1} through x_{t+k-1} removed. the sample partial autocorrelation function can be computed recursively (durbin-levinson) as [5] \hat{\phi}_{kk} = \frac{r_k - \sum_{j=1}^{k-1} \hat{\phi}_{k-1,j} r_{k-j}}{1 - \sum_{j=1}^{k-1} \hat{\phi}_{k-1,j} r_j}, \quad \hat{\phi}_{k,j} = \hat{\phi}_{k-1,j} - \hat{\phi}_{kk}\hat{\phi}_{k-1,k-j}, where r_k is the sample autocorrelation at lag k. the rmse is a frequently used measure of the difference between values predicted by a model and the values actually observed from the environment that is being modelled. these individual differences are also called residuals, and the rmse serves to aggregate them into a single measure of predictive power. the rmse of a model prediction is defined as the square root of the mean squared error [6] rmse = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2}, where \hat{y}_i is the predicted value, y_i is the actual value, and n is the number of observations. the aic is a measure of how well a model fits a dataset, penalizing models that are so flexible that they would also fit unrelated datasets just as well. the general form for calculating the aic is [5] aic_{p,q} = -2 \ln(\text{maximized likelihood}) + 2r, where r = p + q + 1 is the number of estimated parameters, including a constant term.
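as an illustration of the acf and rmse definitions above, here is a minimal plain-python sketch (the paper itself worked in R; this re-implementation is only for clarity):

```python
import math

def sample_acf(x, max_lag):
    """sample autocorrelation r_k = sum_{t=1}^{n-k} (x_t - xbar)(x_{t+k} - xbar)
    divided by sum_{t=1}^{n} (x_t - xbar)^2, for k = 0..max_lag."""
    n = len(x)
    xbar = sum(x) / n
    denom = sum((v - xbar) ** 2 for v in x)
    return [
        sum((x[t] - xbar) * (x[t + k] - xbar) for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]

def rmse(actual, predicted):
    """root mean squared error between observed values and model forecasts."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)
```

note that r_0 is always 1 by construction, which is why acf plots start at 1 at lag zero.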
the ljung-box statistic, also called the modified box-pierce statistic, is a function of the accumulated sample autocorrelation, r_j, up to any specified time lag m. this statistic is used to test whether the residuals of a series of observations over time are random and independent. the null hypothesis is that the model does not exhibit lack of fit and the alternative hypothesis is that the model exhibits lack of fit. the test statistic is defined as [5] q = n(n+2) \sum_{k=1}^{m} \frac{\hat{r}_k^2}{n-k}, where \hat{r}_k^2 is the squared estimated autocorrelation of the series at lag k, m is the number of lags being tested, and n is the sample size; the statistic is approximately chi-square distributed with h degrees of freedom, where h = m - p - q. conditional sum of squares was utilized to find the starting values in estimating the parameters of the sarima process. the estimator is given by [7] \hat{\theta}_n = \arg\min_{\theta \in \Theta} s_n(\theta) (5), where s_n(\theta) is the conditional sum of squared one-step prediction errors. according to [7] , once the model order has been identified, maximum likelihood was used to estimate the parameters c, \phi_1, ..., \phi_p, \theta_1, ..., \theta_q. this method finds the values of the parameters which maximize the probability of getting the data that has been observed. for sarima models, the process is very similar to the least squares estimates that would be obtained by minimizing \sum_t \varepsilon_t^2, where \varepsilon_t is the error term. the box-cox transformation is applied to stabilize the variance of a time series. it is a family of transformations that includes logarithms and power transformations which depend on the parameter \lambda and are defined as follows [8]: w_i = \ln(y_i) if \lambda = 0, and w_i = (y_i^{\lambda} - 1)/\lambda otherwise, where y_i is the original time series value, w_i is the transformed time series value, and \lambda is the parameter of the transformation. r is a programming language and free software environment for statistical computing and graphics that is supported by the r foundation for statistical computing [9] . r includes linear and nonlinear modeling, classical statistical tests, time-series analysis, classification modeling, clustering, etc.
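the ljung-box statistic and the box-cox transform defined above can likewise be sketched in plain python (again an illustrative re-implementation, not the paper's R code):

```python
import math

def ljung_box_q(residuals, m):
    """ljung-box statistic q = n(n+2) * sum_{k=1}^{m} r_k^2 / (n - k), where
    r_k is the sample autocorrelation of the residuals at lag k; q is compared
    against a chi-square distribution with m - p - q degrees of freedom."""
    n = len(residuals)
    mean = sum(residuals) / n
    denom = sum((e - mean) ** 2 for e in residuals)

    def r(k):
        return sum((residuals[t] - mean) * (residuals[t + k] - mean)
                   for t in range(n - k)) / denom

    return n * (n + 2) * sum(r(k) ** 2 / (n - k) for k in range(1, m + 1))

def box_cox(y, lam):
    """box-cox transform: log(y_i) when lambda == 0, else (y_i**lambda - 1)/lambda."""
    if lam == 0:
        return [math.log(v) for v in y]
    return [(v ** lam - 1) / lam for v in y]
```

a strongly autocorrelated residual series (e.g. one that alternates sign every step) yields a large q, rejecting the "no lack of fit" null; near-white-noise residuals yield a small q.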
the 'forecast' package [7] was utilized to generate time series plots, autocorrelation function/partial autocorrelation function plots, and forecasts. also, the 'tseries' package [10] was used to perform the augmented dickey-fuller (adf) test of stationarity. moreover, the 'lmtest' package [11] was used to test the parameters of the sarima model. finally, 'ggplot2' [12] , 'tidyr' [13] , and 'dplyr' [14] were used to plot the time series data considered during the conduct of the research. a line plot was used to describe the behavior of the monthly foreign visitor arrivals in the philippines. figure 1 shows that there is an increasing trend and a seasonality pattern in the time series. specifically, there is a seasonal increase in monthly foreign visitor arrivals every december and a seasonal decrease every september. these patterns suggest a seasonal autoregressive integrated moving average (sarima) approach in modeling and forecasting the monthly foreign visitor arrivals in the philippines. the akaike information criterion and root mean squared error were used to identify which model would be used to model and forecast the monthly foreign visitor arrivals in the philippines. table 1 shows the top two sarima models based on aic generated using r. arima (0,1,2)×(1,0,1) 12 has the lowest aic with a value of −414.56, followed by arima (1,1,1)×(1,0,1) 12 with an aic value of −414.51. model estimation was performed on both models and generated significant parameters for both (refer to appendix a.2). moreover, diagnostic checking was performed to assess the models. both models passed the checks using the residual versus time plot, residual versus fitted plot, normal q-q plot, acf graph, pacf graph, ljung-box test, and shapiro-wilk test (refer to appendix a.3). finally, forecast evaluation was performed to measure the accuracy of the models using an out-of-sample data set (refer to appendix a.4).
arima (1,1,1)×(1,0,1) 12 produced the lowest rmse relative to arima (0,1,2)×(1,0,1) 12 . hence, the former was used to forecast the monthly foreign visitor arrivals in the philippines. estimated earnings loss during the covid-19 pandemic crisis: figure 2 shows the estimated earnings loss (in billion pesos) of the tourism industry of the philippines every month from april 2020 to december 2020. according to the department of tourism, the average daily expenditure (ade) for the month in review is p 8,423.98 and the average length of stay (alos) of tourists in the country is recorded at 7.11 nights. the figures were generated by multiplying the forecasted monthly foreign visitor arrivals, ade, and alos (rounded to 7) [2] . moreover, it is forecasted that under community quarantine the recovery time will take around four to five months (up to july) [15] . with this, the estimated earnings loss of the country in terms of tourism will be around 170.5 billion pesos. based on the results presented in the study, the following findings were drawn: 1. the order of the sarima model used to forecast the monthly foreign visitor arrivals is arima (1,1,1)×(1,0,1) 12 , since it produced a relatively low aic of −414.51 and the lowest rmse of 47884.85 on out-of-sample data. this means that the model is relatively better than the other sarima models considered in forecasting the monthly foreign visitor arrivals in the philippines. 2. if the covid-19 pandemic lasts up to five months, the tourism industry of the philippines will have an estimated earnings loss of about p 170.5 billion. assumptions about the average daily expenditure and average length of stay of tourists were based on department of tourism reports. the projected p 170.5 billion loss in the philippines' foreign tourism is a huge amount of money. attempting to regain such a loss in the soonest time, however, would only jeopardize the lives of the filipino people. on the other hand, the government can, perhaps, reopen the philippines' domestic tourism.
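the earnings-loss arithmetic described above (forecasted arrivals × ade × alos, summed over the quarantine months) can be sketched as follows; the ade and alos values are the paper's department of tourism figures, but the monthly arrival forecasts here are hypothetical placeholders, not the paper's actual sarima output:

```python
# ADE and ALOS come from the paper (department of tourism figures); the
# monthly arrival forecasts below are hypothetical placeholders, NOT the
# paper's sarima output.
ADE = 8423.98   # average daily expenditure, pesos
ALOS = 7        # average length of stay, nights (rounded from 7.11)

forecast_arrivals = {  # hypothetical monthly foreign visitor arrivals
    "apr": 700_000, "may": 720_000, "jun": 710_000, "jul": 730_000,
}

# loss per month = forecasted arrivals x ADE x ALOS
loss_per_month = {m: v * ADE * ALOS for m, v in forecast_arrivals.items()}
total_loss = sum(loss_per_month.values())
print(f"estimated loss: {total_loss / 1e9:.1f} billion pesos")
```

with these placeholder arrival numbers the four-month total comes out near the paper's reported order of magnitude (about p 170 billion), which illustrates how sensitive the headline figure is to the arrival forecasts.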
this would somehow help regain the country's loss of revenue from tourism, although not fully. however, the following recommendations, shown in the scenarios/options below, may be helpful in regaining it, both in foreign and domestic tourism, while ensuring safety among filipinos as well. 1. option 1: stop foreign tourism until the availability of the vaccine, but gradually open domestic tourism starting july of 2020. in this scenario/option, the following considerations may be adhered to, viz. (a) not all domestic tourism shall be reopened in the entire country; only those areas with zero covid-19 cases; (b) for areas where domestic tourism is allowed/reopened, appropriate guidelines should be strictly implemented by the concerned departments/agencies to eliminate/prevent covid-19 transmission; and (c) a digital code that would help in tracing the contacts and whereabouts of domestic tourists, as being used in china and singapore, should be installed before the reopening of domestic tourism. 2. option 2: gradual opening of foreign tourism starting july 2020 and full reopening of domestic tourism in the first semester of 2021, or when the covid-19 cases in the philippines are already zero. however, the following considerations should be satisfied, viz. (a) only countries with zero covid-19 cases are allowed to enter the philippines; (b) appropriate guidelines should be strictly implemented by the concerned departments/agencies, both for foreign and domestic tourism, to eliminate/prevent the spread of the said virus; and (c) a digital code that would help in tracing the contacts and whereabouts of foreign tourists, as being used in china and singapore, should be installed before reopening foreign tourism in the philippines. a.1 model identification. fig. 3 : line, acf, and pacf plot of monthly visitor arrivals. line, acf, and pacf graphs were used to identify the model to be used to forecast the monthly visitor arrivals in the philippines.
the line graph in figure 3 shows an increasing trend, which suggests non-stationary behavior. this is supported by the acf and pacf plots, which show a slow decay in all the lags of the former, while the first lag is significant for the latter. moreover, the line plot slightly displays an increasing variance across time. therefore, data transformations such as differencing and the box-cox transform were applied to the time series data. figure 4 shows the line, acf, and pacf graphs of the transformed data. the line graph shows that the data is stationary, which is supported by both the acf and pacf graphs. moreover, the acf graph suggests a seasonal pattern in the data since the 12th, 24th, 36th, 48th, and 60th lags are significant. this is also true in the case of the pacf since the 12th lag is significant. akaike's information criterion (aic) was used to identify the best sarima model from among the models considered. table 2 shows the top 2 models with the least aic, namely: arima (0,1,2)×(1,0,1) 12 (model 1) and arima (1,1,1)×(1,0,1) 12 (model 2), with an aic of −414.56 and −414.51, respectively. the combination of conditional sum of squares and maximum likelihood estimation was used to estimate the parameters of model 1 and test their significance. table 3 shows the estimated coefficients, standard errors, z-values, and p-values of each parameter of arima (0,1,2)×(1,0,1) 12 . since the p-value of each parameter is less than 0.05, there is sufficient evidence to say that the estimates of the moving averages, seasonal autoregressive, and seasonal moving average terms are significantly different from zero. the same combination of conditional sum of squares and maximum likelihood estimation was used to estimate the parameters of model 2 and test their significance. table 4 shows the estimated coefficients, standard errors, z-values, and p-values of each parameter of arima (1,1,1)×(1,0,1) 12 .
since the p-value of each parameter is less than 0.05, there is sufficient evidence to say that the estimates of the moving average, seasonal autoregressive, and seasonal moving average terms are significantly different from zero. residual versus time, residual versus fitted, and normal q-q plots were used to perform diagnostic checking for the models, whereas acf and pacf plots of the residuals were used to check if there are remaining patterns that should be accounted for by models 1 and 2. graphs for model 1 are displayed on the left whereas graphs for model 2 are displayed on the right. figure 5 shows that the residuals and time do not display correlation between the two variables. therefore, this scatter plot suggests that the residuals have no serial correlation, that is, there is no interdependence between time and residuals. this is supported by the ljung-box test, which suggests that the error terms behave randomly for both model 1 (q(20) = 20.109, df = 20, model df = 4, p = 0.4511) and model 2 (q(20) = 20.941, df = 20, model df = 4, p = 0.4006). in addition, the residuals versus fitted values scatter plot displays no visible funneling pattern, which indicates that the variances of the error terms are relatively equal. moreover, the normal q-q plot suggests that the residuals are normally distributed since most of the values lie along a line. this is supported by the shapiro-wilk test, which suggests that the errors are normally distributed for both model 1 (w = 0.98062, p = 0.2372) and model 2 (w = 0.9852, p = 0.4513). finally, as shown in figure 5 (line, acf, and pacf plots of the two models), the acf and pacf graphs display all of the lags within the acceptable limits. therefore, none of the lags are significant, which means that the residuals of the models may be considered white noise. hence, the residuals are assumed to be gaussian white noise. figure 6 shows the acf and pacf graphs of the forecast errors of both models.
all of the autocorrelation and partial autocorrelation values are within the limits, which means that these values are not significant. therefore, the forecast errors are considered white noise. the shapiro-wilk test was used to test whether the forecast errors of both models were normally distributed. the results show that there is no sufficient evidence to say that the forecast error terms are not normally distributed for either model 1 (w = 0.878, p = 0.082) or model 2 (w = 0.875, p = 0.075). this means that it can be assumed that the forecast errors are normally distributed. based on the diagnostics presented, the models satisfied all the assumptions of a seasonal autoregressive integrated moving average model. furthermore, model 2 is relatively accurate compared to model 1 based on the rmse of each model. hence, arima (1,1,1)×(1,0,1) 12 was used to forecast the monthly visitor arrivals in the philippines.
philippine statistic authority: contribution of tourism to the philippine economy is 12.7 percent
department of tourism-philippines: tourism statistics
philippine statistic authority
encyclopedia of research design, 1 ed
time series analysis: forecasting and control
long short-term memory networks with python: develop sequence prediction models with deep learning
automatic time series forecasting: the forecast package for r
box-cox transformation
r: a language and environment for statistical computing. r foundation for statistical computing
tseries: time series analysis and computational finance. r package version 0
diagnostic checking in regression relationships
ggplot2: elegant graphics for data analysis
package 'tidyr
a grammar of data manipulation
what quarantine measures can do? modelling the dynamics of covid-19 transmission in davao region
key: cord-015147-h0o0yqv8 authors: nan title: oral communications and posters date: 2014-09-12 journal: inflamm res doi: 10.1007/bf03353884 sha: doc_id: 15147 cord_uid: h0o0yqv8 nan the drosophila host defense is a multifaceted process which involves reprogramming of gene expression, activation of proteolytic cascades in the blood and uptake of microorganisms by professional phagocytes. most of the recent studies have focused on challenge-induced expression of antimicrobial peptides and have addressed the following questions: (1) which genes are upregulated during various types of bacterial, fungal or viral infections and what are their functions? (2) what is the nature of the intracellular signalling cascades which lead to gene transcription during these infections; (3) how does drosophila recognize infections and does it discriminate between various types of aggressors (e.g. fungal versus bacterial or viral) to mount an appropriate response. over the last ten years we have gained significant insights into these various aspects and the presentation will review our current understanding of the drosophila immune response and put it into a phylogenetic perspective, namely by drawing some stringent parallels with the mammalian innate immune response. there is strong evidence that autoimmunity to myelin antigens plays a major role in the development of multiple sclerosis. several myelin-derived autoantigenic targets have been described and include myelin basic protein (mbp), proteolipid protein and myelin oligodendrocyte glycoprotein. there has been a particular focus on mbp for at least two reasons: mbp-specific cd4+ t-cell receptors (tcrs) have been found in multiple sclerosis brains, and cells presenting an immunodominant mbp(85-99) peptide in complex with hla-dr2b have been shown to be present in multiple sclerosis lesions.
also, humanized mice expressing the hla-dr2b gene and a human t-cell receptor (tcr) that recognises the mbp85-99 peptide in the context of hla-dr2b, either spontaneously or after immunization with mbp85-99, develop experimental autoimmune encephalomyelitis, which has several features in common with multiple sclerosis. this talk will focus on how humanized mice have recently been used to study the interplay between genetic and environmental risk factors in multiple sclerosis. to resolve the pathogenic mechanisms of rheumatoid arthritis (ra) we need to identify the causative genetic and environmental factors. this has however proven to be complex, with many factors and genes interacting. inbred animals are useful for studies of the identification of genes associated with complex traits and diseases such as ra. animal models offer a possibility to better define the traits, and to segregate the genes in a controlled way, enabling linkage analysis. there are several arthritis models, each of which may reflect various variants of the heterogeneity of ra in humans. examples are collagen-induced arthritis (cia) and pristane-induced arthritis (pia), which both fulfill the clinical diagnostic criteria for ra. both diseases are genetically complex and the susceptibility is, as in ra, dependent on many polymorphic genes operating in concert. so far 2 genes in this concert have been identified: the mhc class ii ab gene in the mouse (1) and the ncf1 gene in the rat (2) and in the mouse (3) . the ncf1 protein is a part of the nadph oxidase complex mediating the oxidative burst. the discovery of the ncf1 polymorphism led to a newly proposed pathway in which oxygen radicals modify antigen presentation and the resulting activation of autoreactive t cells. mice with the deficient ncf1 allele are more susceptible to cia, and also developed a chronic form of arthritis.
interestingly, the immune response to cii was enhanced by the ncf1 deficiency, linking the ncf1 pathway to the adaptive immune response. oxidation of t cell membranes seems to be a key event in the pathogenic mechanism, as reduction of t cell membranes induces arthritis in rats (4). on the basis of these findings a new type of therapy has been suggested. myasthenia gravis is a prototypic autoimmune disease, caused in most cases by autoantibodies to the muscle acetylcholine receptor (achr) at the neuromuscular junction. the antibodies reduce the number of achrs, leading to failure of neuromuscular transmission and muscle weakness. the achr antibodies, as measured in conventional immunoprecipitation assays, are igg, high affinity, polyclonal and specific for human achr. they reduce the numbers of achrs by antigenic modulation and by complement-mediated damage to the neuromuscular junction. myasthenia gravis has a very intriguing relationship with the thymus gland. in many younger patients, the thymus is hyperplastic with immune cell infiltrates and germinal centre formation. around the germinal centres, within the medulla, there are rare muscle-like cells called myoid cells that express achrs. there are many b cells and plasma cells, and thymic lymphocyte preparations synthesise achr antibodies. the possibility that the thymic achr induces the germinal centre formation and achr antibody synthesis is supported by much evidence. some patients, however, have thymic tumours, and in these the role of the thymus is less clear. moreover, older patients with typical myasthenia usually have thymic tissue which is normal for their age. there are up to 20% of myasthenia patients that do not have the typical achr antibodies. some of these instead have antibodies to muscle-specific kinase, a receptor tyrosine kinase that is restricted to the neuromuscular junction. the pathogenic mechanisms by which the antibodies cause disease are not yet clearly identified and the evidence will be discussed.
finally, among the patients who have neither achr nor musk antibodies by conventional testing, we have evidence for lower-affinity antibodies to achr which can now be detected using molecular approaches that will be described. arry-886 (azd6244) is an inhibitor of mek1/2 currently in development for cancer. phase 1 determined the mtd (100 mg) and the safety of the compound given continuously. in decreasing frequency, common treatment-related side effects were rash, diarrhea, nausea, peripheral edema, and vomiting. paired pre- and post-dose tumor biopsies showed a reduction in perk (-83%) and proliferative index (-46%). the trough plasma concentration (400 ng/ml) corresponded to ~40% inhibition of perk. about 45% of pts had stable disease after 2 months. these results demonstrate that arry-886 (azd6244) is well-tolerated, hits its target, and produces a high incidence of long-lasting stable disease. there are several on-going phase 2 studies, in melanoma, colorectal, lung and pancreatic cancer. arry-162 is another potent, selective mek1/2 inhibitor, currently in development for inflammatory diseases. when human whole blood was stimulated with tpa, arry-162 inhibited tnfa, il-1b and il-6 (ic50s of 23, 21 and 21 nm, respectively). 50% inhibition of perk required 280 nm. arry-162 was highly efficacious in cia and aia rat models, with ed50s of 3 and ~10 mg/kg, respectively. in normal volunteers arry-162 was well-tolerated and there was a dose-proportional increase in drug exposure. in ex vivo blood samples, there was a dramatic time- and concentration-dependent inhibition of tpa-induced tnfa and il-1b. an on-going multiple ascending dose clinical study is further exploring the pharmacokinetics, pharmacodynamics and tolerability of arry-162 monotherapy. in addition, we have initiated a clinical trial designed to evaluate arry-162 in combination with methotrexate in patients with rheumatoid arthritis.
rho kinases (rock) are involved in many physiological and pathological processes, including smooth muscle contraction, cytoskeletal arrangement, cell adhesion, migration and proliferation. rock's prominent role in cytoskeletal architecture suggests that rock inhibitors should have therapeutic impact in oncology and in fibrotic diseases, which require cytoskeletal rearrangement to progress. we have synthesized small-molecule inhibitors of rock which are specific for the rock-2 isoform. these rock-2 inhibitors, typified by slx-2119 and slx-3060, are potent (ic50 < 100 nm), selective for rock-2 (>100-fold selectivity for rock-2 over rock-1) and exhibit good oral bioavailability. this talk will focus on several areas in oncology and fibrotic disease where the ability to demonstrate an in vitro effect on the cytoskeleton translates into activity in the disease model in vivo. slx-2119 inhibits cell proliferation and migration in several tumor cell lines, including ht-1080, panc-1 and mda-mb-231. moreover, in xenograft studies using nude mice, slx-2119 significantly inhibited tumor growth with these same cell lines. in liver fibrosis, slx-2119 prevents the differentiation of rat primary hepatic stellate cells into myofibroblasts and inhibits the proliferation of myofibroblasts, as well as inhibiting hepatic steatosis in an atherosclerosis model. slx-3060 is an effective antifibrotic agent in the kidney unilateral ureteral obstruction model and inhibits renal fibroblast differentiation and proliferation. these data suggest that rock-2-selective inhibition of cytoskeletal modification in key cell types (e.g. tumor cells, stellate cells and fibroblasts) by compounds such as slx-2119 and slx-3060 will provide effective treatment for oncology and fibrotic disease. 
cyclooxygenases (cox) catalyze the first step in the synthesis of prostaglandins (pg) from arachidonic acid. cox-1 is constitutively expressed. the cox-2 gene is an immediate early-response gene that is induced by a variety of mitogenic and inflammatory stimuli. levels of cox-2 are increased in both inflamed and malignant tissues. in inflamed tissues, there is both pharmacological and genetic evidence that targeting cox-2 can either improve (e.g., osteoarthritis) or exacerbate symptoms (e.g., inflammatory bowel disease). multiple lines of evidence suggest that cox-2 plays a significant role in carcinogenesis. the most specific data supporting a cause-and-effect relationship between cox-2 and tumorigenesis come from genetic studies. overexpression of cox-2 has been observed to drive tumor formation, whereas cox-2 deficiency protects against several tumor types. selective cox-2 inhibitors protect against the formation and growth of experimental tumors. moreover, selective cox-2 inhibitors are active in preventing colorectal adenomas in humans. increased amounts of cox-2-derived pge2 are found in both inflamed and neoplastic tissues. the fact that pge2 can stimulate cell proliferation, inhibit apoptosis and induce angiogenesis fits with evidence that induction of cox-2 contributes to both wound healing and tumor growth. taken together, it seems likely that cox-2 induction contributes to wound healing in response to injury but reduces the threshold for carcinogenesis.

(1), k hagihara (2), t nishikawa (1), j song (1), a matsumura (2); (1) health care center, osaka university, japan; (2) osaka university graduate school of medicine, japan

little is still known about the actual pathogenic role of il-6 in the inflammatory status. 
to clarify the pathogenic role of il-6 and the efficacy of il-6 blockade in inflammation, a humanized anti-il-6 receptor antibody, tocilizumab, was used for treatment in chronic inflammatory diseases such as castleman's disease, rheumatoid arthritis and crohn's disease. since il-6 blocking therapy improved the clinical symptoms and the laboratory findings, the il-6 function in inflammation could be analyzed through the induction of inflammatory molecules such as serum amyloid a (saa). saa mrna induction, saa promoter activity and the assembly of transcription factors on the saa promoter were analyzed by real-time rt-pcr, gel shift assay and dna affinity chromatography in hepatocytes stimulated with the proinflammatory cytokines il-6, il-1 and tnf-alpha. as a result, il-6 was an essential cytokine in the induction of saa mrna through the activation of stat3, which formed a complex with nf-kappab p65 and the cofactor p300. although there was no stat3 consensus region on the saa promoter, stat3 bound at the 3' side of the nf-kappab response element. this research proved that il-6 signalling is essential for the synergistic induction of saa via a newly discovered stat3 transcriptional mechanism, suggesting the presence of this stat3 mechanism in inflammation, and confirming the normalization of serum saa levels by il-6 blocking therapy in inflammatory diseases. this research approach may lead to a subsequent therapy for serious aa amyloidosis through inhibition of saa production, and elucidates the cytokine mechanism in the immunopathogenesis of chronic inflammatory diseases.

takashi wada (1), k matsushima (2), s kaneko (1); (1) kanazawa university, japan; (2) university of tokyo, japan

accumulating evidence indicates that the chemokine/chemokine receptor system plays a key role in the pathogenesis of various renal diseases via leukocyte migration. 
pathophysiological impacts of chemokines have shed light on the molecular mechanisms of leukocyte trafficking and activation in the inflammatory aspects of progressive renal injury. locally expressed chemokines are proven to be capable of inciting leukocyte migration to the kidney, thereby initiating and promoting chronic kidney diseases. a possible positive amplification loop from cxc chemokines to cc chemokines may contribute to progressive renal injury, resulting in sclerosis/fibrosis. it is of note that monocyte chemoattractant protein (mcp)-1 / monocyte chemotactic and activating factor (mcaf) / ccl2, a prototype cc chemokine, promotes and escalates chronic kidney diseases of any etiology via the infiltration and activation of monocytes/macrophages, proteinuria and collagen synthesis. interactions between infiltrated inflammatory cells and resident renal cells eventually lead to the progression of fibrosis. new insights into renal fibrosis have been uncovered through the regulation of fibrocytes by the chemokine system. in addition, recent studies demonstrate that chemokines have been expanding their universe beyond leukocyte migration to the kidney, including homeostasis, development and repair of the kidney. the selective intervention of chemokines might have the therapeutic potential to alter inflammatory responses, thereby halting the progression of renal injury. in this symposium we focus on recent progress on the role of chemokines and their cognate receptors in renal injury. 
(1), p lacamera (1), b shea (1), g campanella (1), b karimi-shah (1), n kim (1), z zhao (2), v polosukhin (3), y xu (2), t blackwell (3)

aberrant wound-healing responses to injury have been implicated in the development of pulmonary fibrosis, but the mediators directing these pathologic responses remain to be fully identified. here we demonstrate that lysophosphatidic acid (lpa) is induced by lung injury in the bleomycin model of pulmonary fibrosis, and that mice deficient for one of its receptors, lpa1, are dramatically protected from pulmonary fibrosis and mortality following bleomycin challenge. the absence of lpa1 markedly reduced fibroblast responses to the chemotactic activity present in the airspaces following bleomycin, and attenuated the subsequent accumulation of fibroblasts in the lung. the increase in vascular permeability caused by lung injury was also markedly reduced in lpa1-deficient mice, whereas bleomycin-induced leukocyte recruitment was preserved. these results demonstrate that lpa1 links pulmonary fibrosis to lung injury by mediating fibroblast recruitment and vascular leak, two of the wound-healing responses that are thought to be inappropriately excessive when injury leads to fibrosis rather than repair. lpa1 therefore represents a new target for lung diseases in which aberrant responses to injury contribute to the development of fibrosis, such as idiopathic pulmonary fibrosis and the acute respiratory distress syndrome.

we have reported that inflammation is detrimental to the survival of new hippocampal neurons early after they have been born. our data now show that microglia activation, as an indicator of inflammation, is not pro- or antineurogenic per se, but the net outcome is probably dependent on the balance between secreted molecules with pro- and antiinflammatory action. we have found that a substantial fraction of the new hippocampal neurons formed after status epilepticus survive despite chronic inflammation. 
we have started to explore the role of tnf-alpha in adult neurogenesis. infusion of an antibody to tnf-alpha was shown to reduce survival of new striatal and hippocampal neurons generated after stroke, probably by interfering with the action of the ligand on the tnf-r2 receptor. we have shown that tnf-r1 is a negative regulator of progenitor proliferation in basal and insult-induced hippocampal neurogenesis. we have also used the patch-clamp technique to explore whether a pathological environment influences the synaptic properties of new granule cells. rats were exposed to either a physiological stimulus, i.e., running, or a brain insult, i.e., status epilepticus, which is associated with inflammation. we found that new granule cells in runners and status epilepticus animals had similar intrinsic membrane properties. in contrast, the new neurons which had developed in the physiological and pathological tissue environments differed with respect to tonic drive and short-term plasticity of both excitatory and inhibitory afferent synapses. the role of inflammation in these differences is currently being explored.

proteinase-activated receptor-2 (par-2) is cleaved within its aminoterminal extracellular domain by serine proteinases such as trypsin, unmasking a new aminoterminus starting with sligkv that binds intramolecularly and activates the receptor. par-2 is implicated in innate defense responses associated with lung inflammation. we showed that par-2 is expressed by human alveolar (a549) and bronchial (16hbe) epithelial cell lines, and is activated by trypsin and by the activating synthetic peptide sligkv-nh2. 
in cystic fibrosis patients, airspaces are invaded by polymorphonuclear neutrophils that release elastase and cathepsin g, two serine proteinases, and by pseudomonas aeruginosa, which secretes an elastolytic metalloproteinase. we demonstrated that these three proteinases do not activate par-2 but rather disarm this receptor, preventing its further activation by trypsin but not by sligkv-nh2. preincubation of a par-2-transfected cell line with any of these proteinases leads to the disappearance of the cleavage/activation epitope. proteolysis by these three proteinases of synthetic peptides representing the aminoterminal extracellular domain encompassing the cleavage/activation sequence of par-2 generates fragments that would not by themselves act as receptor-activating ligands and that would not yield receptor-activating peptides upon further proteolysis by trypsin. our data indicate that neutrophil- and pathogen-derived proteinases can potentially silence the function of par-2 in the respiratory tract, thereby altering host innate defense mechanisms.

caspase-3-dependent killing of host cells and to disrupt intestinal barrier function, which, at least in the case of giardiasis, ultimately causes lymphocyte-dependent intestinal malfunction and the production of diarrheal symptoms. ongoing research is investigating whether par agonists and microbial pathogens may cause epithelial apoptosis, increased permeability, and overall epithelial malfunction in the gastrointestinal tract via common or intersecting pathways. 
the intestinal epithelium is exposed to a variety of proteases in both health and disease, including digestive proteases such as trypsin. given that protease-activated receptor 2 (par2) responds to trypsin and is expressed on intestinal epithelial cells, we investigated the effect of trypsin on intestinal epithelial barrier function. scbn, caco-2 and t84 epithelial cells were grown to confluence on filter supports and mounted in ussing chambers to study short-circuit current (isc) and transepithelial resistance (rte). cell monolayers were incubated with inhibitors of transcellular ion transport in order to isolate the contribution of the paracellular pathway to rte. apical exposure to serine proteases, including trypsin, elastase and chymotrypsin, caused a rapid and sustained increase in rte and decreased the transepithelial flux of a 3000 mw dextran. interestingly, the effect of trypsin could not be replicated by activators of pars 1, 2 and 4, suggesting that the effect on rte was not due to activation of pars. subsequent experiments showed that trypsin activated phosphatidylinositol-dependent phospholipase c. a trypsin-induced increase in intracellular calcium was not involved. inhibition of pkc-zeta, but not of typical pkcs, also blocked the response. our data point to a role for postprandial trypsin that extends beyond that of a digestive enzyme; it is also a participant in cellular pathways that control tight junction permeability. physiologically, the trypsin-induced increase in resistance could augment transcellular transport by reducing passive paracellular back-diffusion of ions. further studies will assess how these pathways might be disrupted in the barrier dysfunction characteristic of intestinal inflammation.

clustering of inflammatory bowel disease in large families and the observation of an increased concordance between monozygotic twins suggest heritable components in these disorders. 
the high concordance in monozygotic twins (>55%), which is not seen in dizygotic twins (<5%), points to a strong contribution of genetic susceptibility to the overall risk for disease. ibd represents a "complex disease" and may involve a large number of interacting disease genes. crohn's disease has become an example of the successful molecular exploration of a polygenic etiology. crohn's disease was not known before 1920. incidence has increased since then, now leading to a lifetime prevalence of up to 0.5% in western industrialized countries. the current hypotheses propose unknown trigger factors in the lifestyle of western industrialized nations that interact with a polygenic susceptibility. it appears that increased expression and production of tnf and an enhanced state of activation of the nf-kb system are the main drivers of the mucosal inflammatory reaction. the exploration of the inflammatory pathophysiology of crohn's disease using full-genome, cdna and oligonucleotide-based arrays, respectively, has generated large sets of genes that are differentially expressed between inflamed mucosa and normal controls. while this may lead to new targets for a pathophysiology-oriented therapy, it appears, however, that the dissection of the inflammatory pathophysiology does not allow identification of the multifactorial etiology of the disease. genome-wide linkage analysis has demonstrated eight confirmed susceptibility regions, with the one on chromosome 16 being the most consistent between different populations. in 2001 three coding variations in the card15 gene were identified that are highly associated with development of the disease. all variants affect a part of the gene that codes for the leucine-rich part of the protein, which appears to be involved in bacteria-induced activation of nf-kb in macrophages and epithelial cells. interestingly, the three disease-associated snps are never found on the same haplotype. 
in compound heterozygotes or homozygotes they result in a relative risk of >35 for developing crohn's disease as an adult. a particular subphenotype, with localization of the disease in the ileocecal region, is highly associated with the variants in the card15 gene. variations in the card15 gene do not fully explain the linkage finding in the pericentromeric region of chromosome 16. after stratification for card15 variants, the broad linkage peak is reduced to two more defined peaks on 16p and 16q, respectively. while the exploration of these regions has led to several association signals that are subject to further fine mapping, progress has been greater in the other linkage regions (i.e. on chromosomes 10 and 5, respectively). dlg-5 is an example of a low-risk susceptibility gene with a modest associated odds ratio (1.2-1.5). interestingly, the association signal appears to be confined to young males. slc22a4/5, which encode the cation transporters octn1 and 2, have been suggested to represent the disease gene in the 200+ kb haplotype block on chromosome 5q31. mdr1 has also been implicated as a disease gene in ibd. although the human association studies have produced highly controversial findings, a knockout mouse with a colitis phenotype makes mdr1 likely as a low-risk susceptibility gene. with the advent of high-density, genome-wide association studies, enormous progress has been made in discovering the remaining disease genes. recently a 330k illumina scan has been published identifying il-23r as a further disease gene. we used a genome-wide candidate gene approach (with approx. 20,000 csnps) to identify atg16l1 as a further disease gene. both genes were confirmed, and a further regulatory snp involving ptger4 was annotated by a belgian genome-wide scan. by the time of presentation, three further genome-wide snp scans in crohn's disease will most likely have entered the public domain. 
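effect sizes such as the odds ratio of 1.2-1.5 quoted for dlg-5 come from case-control 2x2 tables. as a generic reminder of the arithmetic only (the counts below are invented for illustration and are not from any study cited here):

```python
def odds_ratio(a, b, c, d):
    """odds ratio for a case-control 2x2 table:
        a = carriers among cases,    b = non-carriers among cases,
        c = carriers among controls, d = non-carriers among controls.
    odds of carriage in cases (a/b) divided by odds in controls (c/d)."""
    return (a * d) / (b * c)

# invented counts: variant slightly enriched in cases
print(odds_ratio(120, 380, 90, 410))
```

an odds ratio of 1 means no association; values modestly above 1, as for dlg-5, imply many carriers never develop disease, which is why such variants are described as low-risk susceptibility genes.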
the further exploration of crohn's disease (and other inflammatory conditions of barrier organs) will have to annotate the function and pathophysiologies based on genetic risk maps that are being completed with amazing speed. the creation of a medical systems biology of disease will lead to new models and eventually new therapies.

the chemokine receptor ccr9 plays a pivotal role in mediating the migration of t cells to the gastrointestinal mucosa. the ligand for ccr9, teck, is highly expressed in the gi tract. the pathogenicity of intestinal ccr9+/cd4+ t cells has been demonstrated in animal models, and this cell population is substantially increased in the peripheral circulation of crohn's and celiac disease patients. ccx282-b is a highly selective and potent, orally bioavailable, small-molecule antagonist of ccr9. the compound proved to be highly efficacious in the tnf-δare and mdr1a-/- murine models of inflammatory bowel disease (ibd). in phase i trials, ccx282-b was well-tolerated, and no drug-related saes were reported. a 28-day placebo-controlled phase ii study was recently completed in patients with moderate to severe crohn's disease. ccx282-b was shown to be safe and to have encouraging clinical results: 56% of patients on ccx282-b (cdai ≥250, crp >7.5 mg/l) exhibited a 70-point drop in cdai, compared to 29% on placebo. furthermore, a crp decrease of 11 mg/l was seen in the ccx282-b group compared to placebo. colonic biopsy samples were analyzed for expression of several pro-inflammatory cytokines. a mean decrease from baseline in the concentrations of tnf-alpha, il-12 p70, ifn-gamma, and the chemokine rantes was shown in the ccx282-b group, while levels remained stable in the placebo group. ccx282-b is the first chemokine-based inhibitor of leukocyte trafficking to be tested in ibd. the compound shows anti-inflammatory activity and encouraging evidence of clinical benefit in the treatment of crohn's disease. 
the activating receptor nkg2d seems to be implicated in the pathogenesis of several autoimmune diseases in humans and in animal models of type 1 diabetes and multiple sclerosis. the aim of this study was to assess the role of nkg2d in a model of inflammatory bowel disease, in which cd4+cd25- t cells from balb/c mice are adoptively transferred to scid mice, and to evaluate the therapeutic effect of an anti-nkg2d antibody therapy. the expression of nkg2d was evaluated by flow cytometry, immunohistochemistry and pcr. we found a marked up-regulation of nkg2d on the cell surface, as well as increased levels of nkg2d mrna, in cd4+ t cells from colitic scid mice as compared to normal balb/c mice. we next studied the effect of anti-nkg2d antibody (cx5) treatment initiated either before onset of colitis, when the colitis was mild, or when severe colitis was established. cx5 treatment decreased the cell-surface expression of nkg2d, and prophylactic administration of cx5 significantly attenuated the development of colitis. a moderate reduction in the severity of disease was observed after cx5 administration to mildly colitic animals, whereas cx5 did not attenuate severe colitis. thus, nkg2d may be involved in the pathogenesis of colitis in this model, particularly in the early phases, since the expression of nkg2d on cd4+ t cells increased markedly during the development of disease and since administration of cx5 early, but not late, in the course attenuated disease severity.

proteins have been used for more than a century in the treatment of disease. the first generation were proteins derived from animals, such as antisera used to treat infectious diseases like diphtheria and tetanus, and later bovine and porcine insulin for the treatment of diabetes. the second generation were natural proteins from human sources, like the plasma-derived clotting factors and human growth hormone. 
the development of recombinant dna and cell fusion technology in the seventies of the 20th century opened up the possibility of producing human proteins and monoclonal antibodies in unlimited amounts in microbial and mammalian host cells. in 1982 human insulin was introduced as the first recombinant-dna-derived biopharmaceutical, and since then more than 160 have gained approval. the pipeline contains many more potential biopharmaceuticals, and at present 1 in 4 new drug applications concerns a biotechnology-derived product. a major problem of therapeutic proteins is the induction of antibodies. for foreign proteins, such as the murine-derived monoclonal antibodies, this immunogenicity was to be expected. however, the humanization of monoclonal antibodies has reduced but not solved the problem of immunogenicity. and even proteins which are homologues of endogenous factors, such as gm-csf, interferons etc., induce antibodies, sometimes even in the majority of patients. by definition we are immune tolerant to products which are copies of endogenous proteins. the products do not necessarily need to be exact copies of the natural proteins to share this immune tolerance. when human therapeutic proteins induce antibodies, they are breaking b cell tolerance, which starts with the activation of autoreactive b cells. presenting the self-epitopes in an array form is a very potent activator of these b cells. this explains why aggregates of human proteins are the most important factor in the induction of antibodies. these aggregates may not be immediately present in the product but may appear during storage, making stability and formulation an important issue in predicting immunogenicity. there are only a few studies in experimental model systems on the properties of the aggregates which break b cell tolerance, indicating that only higher-order aggregates (>trimers) are involved. 
we study the capacity of a protein product to break b cell tolerance in mice made transgenic for the specific protein. these mice are immune tolerant, and there is a good correlation between an immune response in these mice and in patients. although these models have helped to identify the factors important for breaking b cell tolerance and have also been useful in improving the formulation of products, there is not yet enough experience to use them as absolute predictors of the immunogenicity of human proteins. they also allow study of the involvement of t cells in breaking b cell tolerance. all data obtained until now indicate this process to be t-cell independent. contact information: dr huub schellekens, central laboratory animal institute, utrecht university, utrecht, the netherlands; e-mail: h.schellekens@gdl.uu.nl

biomonitor aps, and institute for inflammation research (iir), rigshospitalet, copenhagen, denmark

using recombinant technology, one can now produce protein drugs which are almost identical to naturally occurring human proteins, including antibodies (abs). many have assumed that these drugs may be administered with little or no risk of triggering specific t- and/or b-lymphocyte reactivities, because patients, according to immunological dogma, are tolerant towards their own proteins. unfortunately, this is not the case, and even so-called 100% human biologicals are potentially immunogenic (1) (2). i shall discuss two examples: 1) recombinant human cytokines (ifn-beta-1a and -1b), and 2) anti-cytokine ab constructs (anti-tumor necrosis factor (tnf)-alpha). ifn-beta has been used for treatment of patients with multiple sclerosis since the early nineties. though initially neglected as a clinical problem, ifn-beta, like many other human proteins, is indeed immunogenic, especially the preparations produced by recombinant gene technologies. 
the reported frequencies and titers of anti-ifn-beta abs vary considerably depending upon the ifn-beta preparations and administration, and the types of assays being used (2-4). it took more than 10 years of clinical use before ab-mediated decrease in bioactivity of ifn-beta, a condition in which the clinical effect of continued injection of rec. ifn-beta is minimized or abrogated, was universally recognized (5, 6). 2) anti-tnf-alpha human ab constructs: tnf-alpha is an inflammatory cytokine of central pathogenic importance in many immunoinflammatory conditions, and measures to diminish the production and/or effects of tnf-alpha have long been a goal in the treatment of these conditions. currently, there are three approved and two other anti-tnf-alpha biopharmaceuticals in clinical use. unfortunately, response failure is frequently encountered. thus, 30-40% of patients are primary non- or low-responders to the anti-tnf constructs, and secondary response failure is commonplace, mostly due to induction of anti-abs. several different methods have been used to assess circulating levels of anti-tnf drugs as well as anti-abs. most of these have been based on elisa technology, with its inherent problems of false positivity, susceptibility to nonspecific interference, etc.

interferon beta (ifnbeta) has been an important step forward in the treatment of multiple sclerosis (ms), an inflammatory disease of the human central nervous system. however, one of the problems of ifnbeta is its immunogenicity; a substantial percentage of ms patients treated with this recombinant protein develop anti-ifnbeta antibodies, primarily of the igg class. the level of these antibodies tends to be low in the first month or two and peaks six to eighteen months after initiation of therapy. most studies of these antibodies have measured their ability to neutralize ifnbeta's effect in vitro, using assays in which sera from ms patients inhibit the protective effect of ifnbeta on viral killing of target cells. 
this antibody population is called neutralizing antibodies (nabs). tests measuring binding of antibodies to ifnbeta in vitro are called binding antibody (bab) assays. anti-ifnbeta antibodies detected by bab assays are present in a high percentage of ms patients and can occur at low levels without any apparent adverse effect on ifnbeta bioactivity. the distinction between babs and nabs is artificial, and all binding antibodies are likely neutralizing if the neutralizing assay system is adequately sensitive; i.e., the development of babs and nabs is a continuum, with the assay systems simply measuring the strength of the antibody response. in many treated patients, the anti-ifnbeta antibody response is strong, despite the resemblance of the injected protein to the human homologue, and high levels of neutralizing antibodies develop. high levels of anti-ifnbeta antibodies with high affinity result in loss of ifnbeta bioactivity, a phenomenon which has been called antibody-mediated decreased bioactivity, or adb. adb can be considered the in vivo correlate of the neutralizing effect of the anti-ifnbeta antibody population, while the nab assay measures the in vitro neutralization by this population of immunoglobulins in the serum. the three ifnbeta preparations have different incidences of nabs and different patterns of appearance and disappearance of nabs over time. because there is no direct correlation between nab levels and bioactivity at moderate levels of nab, in vivo bioactivity assays for ifnbeta have become increasingly utilized. in a large multicenter study in the us, called the insight study, bioactivity, as measured by ifnbeta-induced upregulation of the ifn-response genes mxa, viperin, and ifit1, was shown to be highly correlated with nab levels, confirming a single-center study (pachner, a.r., pak, e., narayan, k., multiplex analysis of expression of three ifnbeta-inducible genes in antibody-positive ms patients, neurology, 66:444-446, 2006). 
multiple studies, including a large multicenter study in denmark and a recent study from our center using high-resolution mri of the brain once a month, have demonstrated that nabs abolish the salutary effects of ifnbeta on clinical aspects of ms, especially inflammation. recent guidelines for european neurologists recommend stopping ifnbeta in nab-positive patients. in order to maintain the bioactivity of this important medication for ms, some neurologists have attempted to use immunosuppressives either to prevent the development of nabs or to treat them once they have developed. however, at this point in time, there is no clearly optimal way to treat nabs. major efforts have been underway to decrease the immunogenicity of ifnbeta, and a new formulation of one of the higher-immunogenicity products has recently been developed and tested.

proteinase-activated receptors (pars). endogenous serine proteinases such as thrombin, mast cell tryptase, trypsins, kallikreins and cathepsin g, for example, as well as exogenous proteases released by mites or bacteria, are involved in cutaneous inflammation, host defense and tumor cell regulation. thus, the expression of pars on keratinocytes, endothelial cells, nerves, and immune cells suggests an important role of pars as part of the communication system in the skin during inflammation and the immune response. for example, par2 activates nf-kb in keratinocytes and endothelial cells, stimulates the release of chemokines and cytokines, and is involved in proliferation and differentiation. on sensory nerves, this receptor controls neurogenic inflammation by modulating edema and extravasation via release of neuropeptides into the inflammatory site. par1 and par2 also modulate leukocyte-endothelial interactions in the skin, thereby regulating inflammatory responses such as leukocyte trafficking through the vessel wall. they also stimulate signal transduction pathways involved in cutaneous inflammation. 
in sum, this novel receptor family requires a paradigm shift in thinking about the role of proteases in cutaneous biology and disease. novel compounds regulating protease and par function may be beneficial for the treatment of several skin diseases such as atopic dermatitis, psoriasis or pruritus.

serine proteinases are upregulated in arthritic joints, where their enzymatic activity participates in the destruction of articular soft tissues. in addition to their degradative functions, serine proteinases can also act as signalling molecules by activating members of the g-protein-coupled receptor family called the proteinase-activated receptors (pars). these receptors are known to regulate tissue inflammation and pain, although their function in joints is unclear. our study examined the effect of par4 activation on joint inflammation and pain. male c57bl/6 mice received an intra-articular injection of either the par4 activating peptide aypgkf-nh2 or the inactive peptide yapgkf-nh2 (100 mg) into the right knee. knee joint blood flow was then measured in these mice by laser doppler perfusion imaging, while joint diameter measurements gave an indication of tissue oedema. mechanical allodynia was also assessed in these animals by application of von frey filaments to the plantar surface of the ipsilateral hindpaw, and a pain score was calculated. intra-articular injection of the par4 activating peptide caused knee joint blood flow to increase gradually by up to 25% over the succeeding 2 hrs. knee joint swelling was also observed, as well as the development of mechanical allodynia. all responses could be blocked by pre-treatment with the selective par4 antagonist pepducin p4pal10 (100 mg i.p.). the control peptide yapgkf-nh2 had no discernible effect on joint inflammation or pain. these experiments show that peripheral activation of par4 receptors in mouse knees causes joint inflammation and pain. 
vincent lagente (1), e boichot (2) (1) air liquide, centre de recherche claude-delorme, jouy en josas, france (2) inserm u620, université de rennes 1. matrix metalloproteinases (mmps) are a major group of proteases known to regulate the turn-over of extracellular matrix, and they are therefore thought to be important in the process of lung disease associated with tissue remodelling. this has led to the concept that modulation of airway remodeling, including excessive proteolytic damage of the tissue, may be of interest for future treatment. among the mmp family, macrophage elastase (mmp-12) is able to degrade extracellular matrix components such as elastin and is involved in tissue remodeling processes in chronic obstructive pulmonary disease (copd). pulmonary fibrosis has an aggressive course and is usually fatal within an average of three to six years after the onset of symptoms. pulmonary fibrosis is associated with deposition of extracellular matrix (ecm) components in the lung interstitium. the excessive airway remodeling that results from an imbalance in the equilibrium of the normal processes of synthesis and degradation of extracellular matrix components argues in favor of anti-protease treatments. indeed, the correlation of the differences in hydroxyproline levels in the lungs of bleomycin-treated mice strongly suggests that a reduced molar pro-mmp-9/timp-1 ratio in bronchoalveolar lavage fluid is associated with collagen deposition, beginning as early as the inflammatory events at day 1 after bleomycin administration. finally, these observations emphasize that effective therapies for these disorders must be given early in the natural history of the disease, prior to the development of extensive lung destruction and fibrosis. 
in addition to their degradative properties, proteases can act as signalling molecules that send specific messages to cells. recent work has demonstrated that proteases are able to signal to peripheral sensory neurons, thereby participating in neurogenic inflammation processes and in the transmission or inhibition of pain messages. serine proteases cleaving specifically at an arginine site are able to activate protease-activated receptors (pars), which then send specific messages to cells. we have demonstrated that 3 members of the par family (par1, par2 and par4) are present on peripheral sensory neurons, where they can be activated by different proteases. the activation of par1 and par2 in isolated sensory neurons provokes calcium mobilization and the release of substance p and cgrp, while the activation of par4 inhibited bradykinin- and capsaicin-induced calcium signals and neuropeptide release. thrombin and pancreatic trypsin caused inflammation through par1- and par2-dependent mechanisms, respectively, involving the release of neuropeptides. the extrapancreatic form of trypsin (mesotrypsin or trypsin iv) also caused neurogenic inflammation through a par2- and par1-dependent mechanism, and causes inflammatory hyperalgesia and allodynia through a par2-dependent mechanism. in contrast, activation of par4 on peripheral sensory neurons inhibited inflammatory hyperalgesia and allodynia. taken together, these results provide evidence that proteases can interfere with inflammatory and pain mechanisms through the activation of pars on peripheral sensory neurons. determining the role of each individual protease and its receptors in sensory neuron signalling, and above all in inflammatory and pain mechanisms, constitutes an important challenge in developing new anti-inflammatory and analgesic drugs. introduction: a scoring system for disseminated intravascular coagulation (dic) in humans has been proposed by the international society on thrombosis and haemostasis (isth). 
it was the objective of this study to develop and validate a similar scoring system for dic in dogs in order to establish the dog as a spontaneous animal model. methods: for the developmental study, 100 consecutive dogs admitted to the intensive care unit (icu) were enrolled prospectively (group a). blood samples were collected daily and a broad panel of coagulation assays performed. diagnosis of dic was based on the expert opinion of one human physician and two veterinarians. a multiple logistic regression model was developed with the coagulation parameters as explanatory variables for the diagnosis of dic. integrity and diagnostic accuracy were subsequently evaluated in a separate prospective study according to the stard criteria. the validation study prospectively enrolled 50 consecutive dogs (group b). results: 37 dogs were excluded from group a; 23/63 of the remaining dogs (37%) had dic. the final multiple logistic regression model was based on aptt, pt, d-dimer and fibrinogen and had a very high diagnostic sensitivity and specificity. diagnostic accuracy of the model was sustained by prospective evaluation in group b. conclusion: based on generally available assays, it was possible to design an objective diagnostic model for canine dic, which has both a high sensitivity and specificity. such a model will provide a basis for treatment optimization and make it possible to conduct multicenter therapy studies with a minimum risk of systematic misclassification of patients. in 1997, a coagulation-independent change in light transmittance (biphasic waveform [bpw]) was reported in automated activated partial thromboplastin time (aptt) assays in patients with disseminated intravascular coagulation (dic). a calcium-dependent precipitate of c-reactive protein and very-low-density lipoprotein was causing the bpw. our group recently identified this phenomenon in dogs also. initially, bpw was introduced as a complementary tool to assist diagnosing dic. 
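the structure of such a model can be sketched as follows; the abstract does not report the fitted coefficients, so the values below are purely hypothetical placeholders that only illustrate the functional form (a logistic regression over aptt, pt, d-dimer and fibrinogen):

```python
import math

def dic_probability(aptt, pt, d_dimer, fibrinogen, coeffs):
    """probability of dic from a logistic model of the form described in
    the abstract: sigmoid(b0 + b1*aptt + b2*pt + b3*d_dimer + b4*fibrinogen).
    the coefficients are hypothetical placeholders, not the published values."""
    b0, b1, b2, b3, b4 = coeffs
    z = b0 + b1 * aptt + b2 * pt + b3 * d_dimer + b4 * fibrinogen
    return 1.0 / (1.0 + math.exp(-z))

# invented coefficients chosen only so that the directions are plausible:
# prolonged aptt/pt and raised d-dimer push the score up, while high
# fibrinogen (consumed in dic) pulls it down
COEFFS = (-8.0, 0.15, 0.3, 0.8, -0.5)
```

under these assumed coefficients, a dog with markedly prolonged clotting times and raised d-dimer scores much higher than one with normal values; a cut-off on this probability would then trade sensitivity against specificity.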
however, recent studies reported that bpw may have a stronger potential as a prognostic marker for survival. the aims of the study were to prospectively investigate (a) the diagnostic significance of bpw regarding dic and (b) the significance of bpw for outcome, in dogs with diseases known to predispose to dic. the study was performed as a prospective, observational study including 50 consecutive dogs with a final diagnosis known to predispose to dic (20% were finally diagnosed with dic). outcome was 28-day survival. bpw was assessed by means of a hirudin-modified aptt assay (kjelgaard-hansen et al., jvim 2006:20; 765-766). relative risk according to bpw (rr [95% confidence interval]) for (a) a dic diagnosis and (b) 28-day mortality was assessed. 28-day mortality in the study population was 44%. 28% were bpw positive. bpw was not a significant diagnostic factor for dic (rr=0.64 [0.16; 2.67]), but strongly so for outcome (rr=2.57 [1.46; 4.52]), with a 79% (11/14) mortality amongst bpw-positive dogs. in conclusion, bpw was observed in dogs predisposed to dic, with a strong potential as a risk factor for outcome, a finding in line with recent findings in humans. (1), b hideo (2) (1) department of molecular pathology, kumamoto university, japan (2) department of gastroenterological surgery, kumamoto university, japan aeromonas species are facultative anaerobic gram-negative rods that are ubiquitous, waterborne bacilli, most commonly implicated as causative agents of gastroenteritis. aeromonas infections often progress to sepsis, and disseminated intravascular coagulation syndrome (dic) is a life-threatening complication in sepsis patients, causing multiple organ failure. however, the mechanism leading to coagulation induction in this bacterial infection has not been known. to study dic induction by aeromonas species infection, we investigated the coagulation activity of a serine protease (asp) from aeromonas sobria, predominantly isolated from patients' blood. 
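the reported figures are internally consistent: 14 of 50 dogs (28%) were bpw positive, 11 of those 14 died, and the overall 28-day mortality of 44% (22/50) leaves 11 deaths among the 36 bpw-negative dogs. a minimal sketch reproducing the outcome relative risk and its 95% confidence interval (standard log-scale, katz-type interval; the abstract does not state which ci method was used):

```python
import math

# 2x2 table reconstructed from the abstract: 50 dogs, 14 bpw positive (28%),
# 11/14 deaths among bpw-positive dogs; overall mortality 44% (22/50),
# so 11/36 deaths among bpw-negative dogs
deaths_pos, n_pos = 11, 14
deaths_neg, n_neg = 11, 36

rr = (deaths_pos / n_pos) / (deaths_neg / n_neg)

# 95% confidence interval computed on the log scale
se = math.sqrt(1 / deaths_pos - 1 / n_pos + 1 / deaths_neg - 1 / n_neg)
lo = math.exp(math.log(rr) - 1.96 * se)
hi = math.exp(math.log(rr) + 1.96 * se)

print(round(rr, 2), round(lo, 2), round(hi, 2))  # -> 2.57 1.46 4.52
```

this recovers exactly the rr=2.57 [1.46; 4.52] reported in the abstract, which supports the reconstruction of the 2x2 table above.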
proteolytically active asp shortened both the activated partial thromboplastin time and the prothrombin time of human plasma in a dose-dependent manner, starting at an enzyme concentration of 30 nm. asp activated human prothrombin, releasing hydrolytic activity for the thrombin-specific substrate boc-val-pro-arg-mca, but no enzymatic activity was produced from coagulation factors ix and x. analysis by sds-page revealed that asp released a prothrombin fragment with a molecular weight identical to that of α-thrombin in an incubation time-dependent manner. western blotting using biotinylated phe-pro-arg-chloromethylketone, a thrombin inhibitor, showed that asp produced an enzymatically active fragment whose molecular weight was the same as that of α-thrombin. prothrombin incubated with asp, but not the protease itself, caused platelet aggregation. these results indicate that asp activates prothrombin, producing α-thrombin that converts fibrinogen to fibrin clot, and suggest that asp coagulation-inducing activity contributes to dic development in sepsis caused by aeromonas sobria infection. the present study shows a link between inflammation and coagulation mediated by a bacterial protease. hemolytic episodes are often associated with high amounts of free heme in circulation (up to 20 µm) and the development of an inflammatory response that may develop into chronic inflammation. our group has shown that free heme is a prototypical proinflammatory molecule, able to induce neutrophil migration, actin cytoskeleton reorganization and nadph oxidase-derived reactive oxygen species (ros) generation, as well as pkc activation and interleukin-8 expression (graça-souza et al., 2002). moreover, free heme inhibits human neutrophil spontaneous apoptosis, a feature that is closely related to the impairment of resolution of inflammation and consequent promotion of a chronic inflammatory status. 
the heme protective effect requires nadph oxidase-derived ros and involves the activation of mapk, pi3k and nf-kb signaling cascades as well as heme oxygenase (ho) activity (arruda et al., 2004). more recently, we have shown that the heme anti-apoptotic effect is closely related to the maintenance of mitochondrial stability, inhibition of bax insertion into mitochondria and a dramatic increase in the bcl-xl/bad protein ratio in a ros-dependent manner, requiring the same signaling pathways that regulate the heme anti-apoptotic effect. these findings attest to a prominent role of free heme in the onset of inflammation associated with hemolytic episodes, as well as the establishment of the chronic inflammation related to these disorders. the recent advance in the study of free heme as a proinflammatory molecule raises hope for the development of new strategies to ameliorate the acute and chronic inflammation found during hemolytic episodes. financial support: faperj, cnpq, capes. (1) university of melbourne, victoria, australia (2) monash university, victoria, australia we have previously demonstrated that mice lacking the anti-oxidative enzyme glutathione peroxidase (gpx1) show significantly larger infarcts after stroke. recent studies have demonstrated that adhesion molecule-mediated leukocyte recruitment is associated with increased tissue damage in stroke, while mice lacking key adhesion molecules conferred neuro-protection. nevertheless, the involvement of oxidative stress in leukocyte recruitment and subsequent regulated cell injury is yet to be elucidated. to explore this, gpx1-/- mice were subjected to transient middle cerebral artery occlusion (mcao) followed by cerebral intravital microscopy, for assessment of leukocyte-endothelium interactions in the intact cerebral microvasculature. after 1 hr mcao, leukocyte-endothelium interaction was significantly reduced in gpx1-/- mice compared to wt counterparts during the second hour of reperfusion. 
laser doppler and direct measurement of blood flow in pial postcapillary venules revealed a reduction of reperfusion in gpx1-/- mice following transient mcao. this suggests that the reduction in nutritive blood flow following stroke in gpx1-/- mice may explain the enhanced injury in these mice as well as the reduced leukocyte-endothelium interaction. furthermore, matrix metalloproteinase-9 (mmp9), which has previously been shown to be implicated in endothelial dysfunction and the pathogenesis of stroke, was found to be up-regulated in gpx1-/- mice to a greater extent than in wt mice after mcao, suggesting a role for oxidative stress in cerebral microvascular injury. the data presented here suggest that oxidative stress may be one of the factors that contribute to reduced post-ischemic perfusion, via disruption of endothelial function, as indicated by the increased level of mmp9. chris bolton (1), c paul (2), s barker (3), r mongru (3) (1) william harvey research institute, london, uk (2) university west of england, bristol (3) queen mary university of london adrenomedullin (am) acts as a vasodilator in many vascular beds including the cerebral circulation, where the peptide is produced in larger amounts than in the periphery. in vitro work has shown that am beneficially regulates blood-brain barrier (bbb) characteristics including transendothelial electrical resistance, permeability and p-glycoprotein pump activation. our preliminary studies in acute experimental autoimmune encephalomyelitis (eae), a model of the human disease multiple sclerosis (ms), have demonstrated significant elevations in am peptide levels corresponding with am mrna changes during late, neurological disease, where am production may be linked to the restoration of bbb function. however, am is not exclusively produced as a result of am gene upregulation. 
furthermore, am peptide levels do not always match am mrna changes during other disease phases of eae. the current study has investigated, more closely, the relationship between am gene expression and subsequent levels of associated peptides. am mrna levels were determined, by rt-pcr, in the cerebellum, medulla-pons and spinal cord of normal and eae-inoculated lewis rats at the height of disease. am and proadrenomedullin peptide (pamp) levels were measured in the tissues by radioimmunoassay. all tissues examined showed an increase in am gene mrna compared to control levels. am and pamp changes were observed in the samples and differences between the peptide profiles were recorded. an understanding of alterations in the generation of am and related peptides during neuroinflammation may provide insight into mechanisms affecting bbb permeability and be of relevance to the changes in neurovascular function seen during ms. platelet-activating factor (paf) contributes to the robust inflammatory responses in the acute phase and the spread of secondary injury. although paf is believed to be a potent edematous but non-painful mediator in peripheral tissues, we recently demonstrated that paf may be a mediator of noxious signaling in the spinal cord in the case of neuronal injury. paf-induced tactile allodynia may be mediated by atp, glutamate and the generation of nitric oxide (no). the present study elucidated the downstream signaling pathway for paf-induced tactile allodynia. paf- and glutamate-induced tactile allodynia was blocked by pretreatment with no scavengers and inhibitors of no synthase, soluble guanylate cyclase or cgmp-dependent protein kinase (pkg). recent evidence attributes the generation of pain to specific dysfunctions of inhibitory glycinergic neurotransmission. to explore the target molecule for induction of tactile allodynia, the effect of knockdown of glycine receptors containing the a3 subunit (glyr a3) by sirna spinally transfected with hvj-e vector was examined. 
in mice spinally transferred with sirna for glyr a3, the reduction of glyr a3 was demonstrated in the superficial layer of the dorsal horn by immunohistochemical analysis. pcpt-cgmp, paf and glutamate failed to induce tactile allodynia in mice spinally transferred with sirna of glyr a3, while these compounds produced tactile allodynia in mice transferred with mutant sirna of glyr a3 as a control. glycine transporter inhibitors ameliorated paf- and pcpt-cgmp-induced allodynia. these results suggest that the glutamate-no-cgmp-pkg pathway plays a key role in paf-induced tactile allodynia in the spinal cord, and glyr a3 may be a target molecule for pkg to induce allodynia. (1), r leite (2), ys bakhle (3) (1) federal university of minas gerais, belo horizonte, brazil (2) medical college of georgia, augusta, usa (3) imperial college, london, uk selective cyclooxygenase 2 inhibitors (coxibs) induce a characteristic increase in mechanical nociceptive threshold, referred to as "hypoalgesia", in inflammatory pain induced by carrageenan in rat paws. we have here assessed the role of the cytoskeleton in this hypoalgesia induced by celecoxib (cx). male holtzman rats (150-200 g; 3-5 animals/group) were injected in the right hind paw (ipl) with a range of cytoskeletal inhibitors (selective inhibitors of microtubules (taxol, nocodazole, colchicine), of actin microfilaments (latrunculin b, cytochalasin b) or of intermediate filaments (acrylamide), at pico- to nanomoles per paw) and 30 min later given cx (12 mg/kg, s.c.). after a further 30 min, rats were injected (ipl) with the inflammatory stimulus, carrageenan (250 mg/paw). mechanical pain threshold was measured hourly over the next 4 h, using the randall-selitto method. 
the cx-induced hypoalgesia was reversed by low doses of latrunculin b or cytochalasin b (latrunculin 100% reversal = 1.26 nanomoles) and by higher doses of microtubule inhibitors (taxol 100% reversal = 23.4 nanomoles), with no effect of acrylamide (7 to 141 nanomoles). we conclude that 1) local changes in the (paw) cytoskeleton occurred during cx-induced hypoalgesia and 2) actin microfilaments were the cytoskeletal components most critically involved in this hypoalgesia. financial support: cnpq, fapemig and capes. there are reports regarding the up-regulation of cyclooxygenase isoenzymes, particularly the inducible isoform cox-2, in the brain during neurodegenerative or neuropsychiatric disorders. in the present study, we examined the effect of nimesulide (a preferential cox-2 inhibitor) in subchronic immobilization stress. mice were subjected to immobilization stress for 6 hrs daily for a period of seven days. nimesulide (2.5 mg/kg, i.p.) was administered daily for 7 days before subjecting them to immobilization stress. behavioral analysis revealed hyperlocomotor activity and an increased anxiety response. subchronic stress decreased % retention of memory and also caused a hyperalgesic response in mice. biochemical analysis revealed that chronic immobilization stress significantly increased lipid peroxidation and nitrite levels and decreased the reduced glutathione and adrenal ascorbic acid levels. chronic treatment with nimesulide significantly attenuated the immobilization stress-induced behavioral and biochemical alterations. these results suggested that the use of nimesulide could be a useful neuroprotective strategy in the treatment of stress. there is accumulated evidence for ngf's role as a peripheral pain mediator. ngf is upregulated in diverse inflammatory conditions and evokes hyperalgesia when injected in humans and rats. an ngf increase was also observed in the temporomandibular joint (tmj) after cfa injection, indicating its possible involvement in local hyperalgesic states. 
therefore, the objective here was to evaluate whether ngf participates in tmj nociception. to test this hypothesis, ngf was injected into the tmj alone or after carrageenan (cg), and the spontaneous nociceptive behavior of head flinches was counted for up to 30 min. further evidence for ngf nociceptive activity was obtained by quantifying the local production of ngf after cg injection, by elisa, and the fos-like immunolabeling in the trigeminal sensory nucleus (including the caudalis, interpolaris and oralis) after ngf injection. injections were performed in a volume of 0.25 µl. ngf (0.2, 1 and 5 µg) injected into the tmj challenged 1 h earlier with cg (100 µg) induced a dose-dependent increase in the number of head flinches. this increase was reduced by k252a (1 and 2 µg), indicating a trka receptor-mediated effect. we detected a significant increase in ngf production 1 and 3 h after the tmj cg (300 µg) injection. the tmj injection of ngf (5 µg) alone did not induce detectable spontaneous nociceptive behavior. however, the ngf (5 µg) injection induced a significant increase in the fos-like immunolabeling (fli) in the sensory trigeminal nucleus compared to the saline injection. these results indicate that ngf participates in nociceptive activity in the tmj, especially in inflammatory conditions. mif was reported as a key cytokine in the pathogenesis of rheumatoid arthritis (ra) several years ago, but it is now clear that mif is also involved in the pathogenesis of systemic lupus erythematosus (sle) and atherosclerosis. mif-deficient lupus-prone mrl/lpr mice exhibit prolonged survival and reduced renal and skin disease compared to mif-expressing mice. similarly, mif-deficient atheroma-prone ldlr-deficient or apoe-deficient mice are significantly protected from disease, and anti-mif mab therapy is beneficial. ra and sle are each characterised both by an increased prevalence of atherosclerotic vascular disease and by overexpression of mif. 
given the effects of mif on atherosclerosis, it can be hypothesised that mif overexpression participates in the risk of atherosclerotic vascular disease in ra and sle. recent data have provided insights into mechanisms of action for mif relevant to all these concepts. firstly, the newly described role of mif in the selective recruitment of monocyte-macrophage lineage cells is of particular relevance to ra, sle, and atherosclerosis, with evidence that mif mediates macrophage recruitment in sle and atherosclerosis. secondly, glucocorticoid (gc) therapy is a possible risk factor for atherosclerosis in patients with ra and sle, and it is now clear that gcs increase the expression and release of mif, potentially implicating mif in gc-related increases in atherosclerosis in ra and sle. specific therapeutic targeting of mif in ra and sle may address not only primary disease pathways but also the increased risk of atherosclerosis in these diseases. to enter inflamed tissues, leukocytes must undergo adhesion molecule-mediated interactions with the endothelial surface of vessels at the site of inflammation. cytokines such as tumour necrosis factor (tnf) are established as important mediators capable of promoting leukocyte-endothelial cell interactions. however, in inflammatory diseases such as atherosclerosis and rheumatoid arthritis, elevated expression of another cytokine, macrophage migration inhibitory factor (mif), occurs, yet the role of this cytokine in leukocyte recruitment is unknown. therefore we explored the ability of mif to regulate leukocyte recruitment. this was achieved using intravital microscopy to examine the intact microvasculature in mice following local mif treatment. 
these experiments showed that mif induced leukocyte adhesion and transmigration in vivo, resulting in accumulation of predominantly cd68+/f4/80-ve/cd11c-ve monocyte/macrophage lineage cells. mif did not induce upregulation of the adhesion molecules p-selectin and vcam-1, although their constitutive expression contributed to recruitment. in contrast, mif-induced recruitment was blocked by antibodies to the monocyte-specific chemokine ccl2/mcp-1 and its receptor ccr2, and in response to anti-cxcr2. this was supported by in vitro experiments showing that mif induced ccl2/mcp-1 release from cultured murine endothelial cells. finally, mice lacking cd74, the putative mif binding molecule, did not respond to mif. these data demonstrate a previously unrecognized function of this pleiotropic cytokine: induction of monocyte migration into tissues, via a complicated chemokine/chemokine-receptor pathway with a contribution from cd74. this function may be critical to the ability of mif to promote diseases in which macrophages are key participants. gm-csf and m-csf (csf-1) can enhance macrophage lineage numbers as well as modulate their differentiation and function. of recent potential significance for the therapy of inflammatory/autoimmune diseases, their blockade in relevant animal models leads to a reduction in disease activity. the critical actions of these csfs on macrophages during inflammatory reactions are unknown. to address this issue, adherent macrophages (gm-bmm and bmm) were first derived from bone marrow precursors by gm-csf and m-csf, respectively, and stimulated in vitro with lps to measure secreted cytokine production, as well as nf-kb and ap-1 activities. gm-bmm preferentially produced tnfa, il-6, il-12 and il-23 while, conversely, bmm generated more il-10 and ccl2; strikingly, the latter population could not produce detectable il-12 and il-23. 
following lps stimulation, gm-bmm displayed rapid ikba degradation, rela nuclear translocation and nf-kb dna binding relative to bmm, as well as a faster and enhanced ap-1 activation. each macrophage population was also pre-treated with the other csf prior to lps stimulation and found to adopt the phenotype of the other population to some extent, as judged by cytokine production and nf-kb activity. thus, at the level of macrophage cytokine production, gm-csf and m-csf demonstrate different and even competing responses, with implications for their respective roles in inflammation, including a possible dampening role for m-csf. granulocyte macrophage-colony stimulating factor (gm-csf), initially discovered for its role in the differentiation of haematopoietic cells into granulocytes and macrophages, can also affect mature cell function and may be considered proinflammatory. gm-csf is able to prime macrophages for increased pro-inflammatory responses, including the increased release of tnfa and il-12 following stimulation with, for example, lps. in addition, gm-csf has been shown in vivo, using murine disease models, to play a key role in a number of inflammatory diseases. gm-csf-/- mice have been shown to be resistant to several diseases, including arthritis, and, most notably, blockade of gm-csf with a neutralizing monoclonal antibody was effective at ameliorating arthritis when given either prophylactically or therapeutically. t cells appear to be the major cell type responsible for the gm-csf production required for arthritis, and gm-csf appears important in the effector phase of disease, subsequent to t cell activation. blockade of gm-csf results in fewer inflammatory cells, particularly macrophages, and cytokines such as tnfa, at the site of inflammation. these findings suggest that blockade of gm-csf may be an effective treatment in a range of inflammatory diseases. 
the autoimmune disease type 1 diabetes mellitus (t1dm) is thought to be mediated by autoreactive t cells recognizing islet autoantigens, including gad65, ia-2 and proinsulin. this disease arises on a distinctive genetic background, mapping most notably to the mhc, and is also open to strong environmental influence. to investigate the pathogenesis of the disease, and in particular the prevailing paradigm that islet autoreactive t cells are important, we have developed an approach to epitope identification that is mhc allele and autoantigen specific, and operates for both cd4 and cd8 t cells. utilizing this, we have uncovered populations of islet antigen-specific t cells that have the immunological credentials to be both pathogenic (eg th1, tc1) and protective (treg) in the disease. we have cloned some of these cell types, enabling us to analyse their function and provide an insight that will be important for an understanding of disease mechanisms, as well as guiding novel therapeutic interventions. 1. tcr transgenics targeting b:9-23 cause diabetes. 2. knockout of the insulin 2 gene (expressed in thymus as well as islets) accelerates diabetes, while knockout of the insulin 1 gene (islet expression) prevents 90% of diabetes. 3. dual insulin knockout with a transgenic insulin with an altered peptide (b16:a) prevents all diabetes. 4. islets with the native b:9-23 sequence, but not the altered sequence, when transplanted into knockouts restore anti-insulin autoimmunity and diabetes transfer by t cells. 5. anti-b:9-23 t cells have conserved valpha and jalpha chain usage but no conservation of the n region or beta chain. 6. the alpha chain as a transgene is sufficient to engender anti-insulin autoantibodies. 7. kay and coworkers demonstrate insulin reactivity "upstream" of igrp, and igrp reactivity is nonessential. future studies in nod are directed at deleting specific conserved alpha chains to test diabetes prevention and develop therapeutics. in man we can now identify at birth a genetic risk as high as 80% of activating anti-islet autoimmunity, with mhc analysis and restricted heterogeneity suggesting a dominant target. insulin autoantibodies in prospective studies such as daisy usually appear initially, and levels are related to progression to diabetes. analysis of cadaveric donors is underway to elucidate primary targets. type 1 diabetes (t1d) is an autoimmune disease in which genes and environment contribute to cell-mediated immune destruction of insulin-producing beta cells in the islets of the pancreas. the holy grail of autoimmune disease prevention is negative vaccination against autoantigens to induce disease-specific immune tolerance. this has been achieved in rodents by administering autoantigen via a tolerogenic route (mucosal), cell type (stem cell or resting dendritic cell), mode (with blockade of t-cell co-stimulation molecules) or form (as an altered peptide ligand). compelling evidence demonstrates that proinsulin is the key autoantigen that drives beta-cell destruction in the non-obese diabetic (nod) mouse model of t1d, and possibly in humans. proinsulin/insulin dna, protein or t-cell epitope peptides administered in a tolerogenic manner to the nod mouse can delay or prevent the development of diabetes, via one or more mechanisms (deletion or anergy of effector t cells, induction of regulatory t cells). administration of autoantigen via the mucosal route, which induces anti-diabetic regulatory t cells in the rodent, is the most immediately translatable approach to humans. initial human trials of vaccination with oral autoantigens lacked evidence of bioeffect, probably due to inadequate dosage in end-stage disease. 
recently, however, the first evidence for a therapeutic effect of mucosal autoantigen has been seen in trials of oral and nasal insulin in islet autoantibody-positive individuals at risk for t1d. autoantigen-specific vaccination also shows promise in combination with non-specific immunotherapy in established t1d. leukocyte extravasation is an integral process both physiologically (immunosurveillance) and pathophysiologically (inflammation). the initial paradigm of a 4-step process comprising tethering/rolling, activation, firm adhesion, and diapedesis, each involving specific adhesion molecules, has repeatedly been modified in the light of more recent findings. additionally, organ-specific differences regarding the role of distinct molecules were established. finally, the skin became a good "model" to study due to its accessibility and the availability of powerful animal models. in-vitro adhesion assays, flow-chamber systems, intravital microscopy, animal models for delayed-type hypersensitivity, and transplantation approaches have successfully been employed to investigate leukocyte extravasation. numerous molecular interactions, such as the cutaneous lymphocyte-associated antigen and sialyl-lewis x, or icam-1 and lfa-1, have been proven sufficiently relevant to make them candidates for potential therapies. with the anti-lfa-1 antibody efalizumab, approved for the treatment of psoriasis, the first therapeutic agent specifically targeting leukocyte extravasation is already on the market; other compounds are under development. moreover, novel data suggest that well-established anti-inflammatory therapies such as fumarates also influence this process, thus contributing to their clinical efficacy. 
ongoing research asks for adopting a more "dynamic" view of leukocyte extravasation, as several molecules obviously perform multiple tasks throughout this process rather than being limited to just one step of this multi-step cascade; this is particularly true for the so-called junctional adhesion molecules, which obviously mediate more than just diapedesis. finally, similarities between leukocyte extravasation and hematogenous metastasis are emerging. consequently, certain anti-inflammatory compounds may turn out to also exhibit striking anti-metastatic efficacy, and vice versa. department of dermatology, heinrich-heine-university, düsseldorf, germany atopic dermatitis, psoriasis vulgaris and cutaneous lupus erythematosus represent chronic inflammatory skin diseases showing distinct clinical phenotypes but sharing one aspect: the recruitment of pathogenic leukocyte subsets into the skin represents a prerequisite for their initiation and maintenance. during recent years, our knowledge of the immunopathogenesis of chronic inflammatory skin diseases has increased significantly. with regard to the recruitment pathways of leukocytes, a superfamily of small cytokine-like proteins, the so-called chemokines, has attracted significant attention. here the complex interactions within the chemokine ligand-receptor network are introduced, the involvement of chemokines in memory t and dendritic cell trafficking is outlined, and current concepts of their role in the immunopathogenesis of atopic dermatitis, psoriasis vulgaris and cutaneous lupus erythematosus are summarized. the skin serves as a unique organ for studying general principles of inflammation because of its easy accessibility for clinical evaluation and tissue sampling. a network of pro-inflammatory cytokines including il-1 and tnf-a is known to play a key role in the pathogenesis of cutaneous inflammatory diseases through activation of specific signalling pathways. 
recently, progress has been made in understanding the underlying mechanisms regulating inflammatory signalling pathways in the immunopathogenesis of skin carcinomas, psoriasis vulgaris and atopic dermatitis. kinases have been identified that play a crucial role in regulating the expression and activation of inflammatory mediators in these inflammatory skin diseases. mitogen-activated protein kinases (mapks) are a family of serine/threonine protein kinases that mediate a wide variety of cellular behaviours in response to external stress signals. increased activity of mapks, in particular p38 mapk, and their involvement in the regulation of synthesis of inflammatory mediators at the transcriptional and translational level has recently been demonstrated. progress in our understanding of inflammatory signalling pathways has identified new targets for treating inflammatory diseases, but the challenge is to place a value on one target relative to another and to evolve strategies to target them. a careful examination of different signalling pathways in various inflammatory conditions is therefore needed. this presentation gathers recent advances in signal transduction in skin inflammation, focusing on interleukin-1, tnf-a, p38 mapk, msk1/2, mk2, nf-kappab and ap-1. histamine is an important inflammatory mediator in humans, and despite their relatively modest efficacy antihistamines are frequently used to treat allergic conditions, as well as other histamine-mediated reactions such as pruritus. in contrast, antihistamines are of very limited use for controlling other conditions where histamine production is abundant, including asthma. the discovery of the histamine h4 receptor (h4r) prompted us to reinvestigate the role of histamine in pulmonary allergic responses, as well as in pruritus. 
h4r-deficient mice and mice treated with h4r antagonists exhibited decreased allergic lung inflammation in several models, with decreases in infiltrating lung eosinophils and lymphocytes and decreased th2 responses. ex vivo restimulation of primed t cells showed decreases in th2 cytokine production, and in vitro experiments suggest that decreased cytokine and chemokine production by dendritic cells after blockade of the h4r was responsible for the t cell effects. the influence of h4r on allergic or histamine-induced pruritus was explored in mice using selective histamine receptor antagonists and h4r-deficient mice. the h4r was found to mediate the majority of histamine-mediated and allergic itching, while the contribution by the h1r was minor. surprisingly, the h4r effect was independent of mast cells or other hematopoietic cells. this work suggests that the h4r can modulate both allergic responses, via its influence on t cell activation, and pruritus, through mechanisms that are independent of hematopoietic cells. the studies show that the h4r mediates previously uncharacterized effects of histamine and highlight the therapeutic potential of h4r antagonists. (1), bsp reddy (2) (1) nizams institute of medical sciences, hyderabad, india (2) genix pharma, india. rupatidine carries the majority of the histamine h1 receptor-blocking activity and has been introduced for the treatment of allergic rhinitis and urticaria. objectives: the aim of this study was to compare the effect of two formulations by measuring inhibition of the histamine-induced wheal and flare response. methodology: 12 male volunteers were enrolled after written informed consent according to an ethics committee-approved protocol. in this randomised, double-blind, single oral dose, cross-over study, they were randomized to receive either the test or the reference 10 mg rupatidine formulation after an overnight fast. washout was 10 days. 
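the crossover comparison just described rests on two standard calculations: a trapezoidal auc(0-24) of the inhibition-time curve for each formulation, and a test/reference ratio checked against an acceptance window. a minimal python sketch of that arithmetic; the time points and % inhibition values below are hypothetical illustrations, not data from the study:

```python
# Sketch: trapezoidal AUC(0-24) of % inhibition and a test/reference
# ratio for a two-period crossover. All numeric data here are
# hypothetical placeholders, not values from the abstract.

def auc_trapezoid(times, values):
    """Area under the curve by the linear trapezoidal rule."""
    return sum((t2 - t1) * (v1 + v2) / 2.0
               for (t1, v1), (t2, v2) in zip(zip(times, values),
                                             zip(times[1:], values[1:])))

times = [0, 1, 2, 4, 8, 12, 24]      # hours post-dose
test  = [0, 35, 60, 70, 65, 55, 30]  # % inhibition, test formulation
ref   = [0, 33, 58, 72, 63, 57, 28]  # % inhibition, reference formulation

ratio = auc_trapezoid(times, test) / auc_trapezoid(times, ref) * 100

# acceptance window used in standard bioequivalence testing
bioequivalent = 80.0 <= ratio <= 125.0
```

in a full analysis the ratio would be a least-squares mean of log-transformed values and the 80-125% window would be applied to its 90% confidence interval, not to the point estimate alone.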
wheal and flare were induced on the forearm of the trial subjects by intradermal histamine injection while the subject was lying comfortably with the arm resting on the bed. ten minutes later, wheal and flare were visualized under a bright lamp. the histamine-induced wheal and flare skin test was performed before and at regular intervals up to 24 hours after drug administration. results: administration of the reference and test formulations of rupatidine significantly inhibited the histamine-induced cutaneous response in all subjects. the least square mean ratios (%) t vs r for peak activity imax % (maximum inhibition of the histamine-induced wheal and flare response) and area under the activity-time curve (auc0-24 mm²/hr and auc0-24 %/hr), both for untransformed and log-transformed data, were found to be within the 80-125% limits of the 90% ci. both formulations were well tolerated. conclusions: it can thus be concluded that the test formulation of the rupatidine tablet is bioequivalent to the reference rupatidine tablet. (1), h yoshimura (1), k ohara (1), y mastui (1), h hara (1), h inoue (1), h kitasato (1), c yokoyama (2), s narumiya (3), m majima (1) (1) kitasato university school of medicine, kanagawa, japan (2) tokyo dental medical college, tokyo, japan (3) kyoto university school of medicine, kyoto, japan. thromboxane (tx) a2 is a potent stimulator of platelet activation and aggregation and of vascular constriction. we have reported that the magnitude of cytokine-mediated release of sdf-1 from platelets and the recruitment of non-endothelial cxcr4+ vegfr1+ hematopoietic progenitors constitute revascularization. we hypothesized that txa2 induces an angiogenic response by stimulating sdf-1 and vegf derived from platelet aggregation (inflamm. res., supplement 3 (2007)). to evaluate this hypothesis, we dissected the role of txa2 in the angiogenic response using a mouse hind limb model. 
recovery from acute hind limb ischemia, assessed by the ratio between the treated ischemic limb and the untreated control right limb, was evaluated in wild type mice (c57bl/6 wt), prostaglandin i2 receptor (ip) knockout (ipko) and thromboxane (tx) a2 receptor (tp) knockout (tpko) mice. blood recovery in tp-/- mice was significantly delayed compared to wt and ipko. immunohistochemical studies revealed that tp-/- mice showed less staining for pecam-positive cells compared to wt and ipko; plasma sdf-1 and vegf concentrations were significantly reduced in tp-/- mice. in an in vivo fluorescence microscopic study we observed that, compared to tpko, ipko and wt mice showed significantly increased platelet attachment to the microvessels around the ligated area. tpko mice transplanted with wt bone marrow cells showed increased blood recovery compared to tpko mice transplanted with tpko bone marrow cells. in addition, mice injected with fibroblasts expressing txa2 synthase cdna showed increased blood flow recovery compared to control mice. these results suggest that tp signaling rescues the ischemic condition by inducing angiogenesis through sdf-1 and vegf secreted upon platelet aggregation. purpose: the s100 calcium-binding proteins a8, a9 and a12 are constitutively expressed in neutrophils and induced in activated macrophages. high levels are found in sera from patients with infection and several chronic inflammatory diseases. the calgranulin complex a8/a9 is anti-microbial; a8 has oxidant-scavenging functions. a12 is chemotactic for monocytes and recruits leukocytes in vivo by activating mast cells (mc). the effects of these mediators on mc and monocyte function were compared. methods: human pbmc or murine mc were activated in vitro with s100 proteins, and mediator release and cytokine induction (assessed by quantitative rt-pcr/elisa) were determined. a cys41-to-ala41 a8 mutant was used to determine whether effects on mc are mediated by redox. immunohistochemistry was used to demonstrate s100s in asthmatic lung. 
the s100s were expressed in asthmatic lung, particularly in eosinophils and alveolar macrophages. strong reactivity occurred with an antibody recognising predominantly the hypochlorite-oxidised form of a8. a8, a9 or the a8/a9 complex had relatively low ability to induce il-1, tnf, il-6 and chemokine mrnas in pbmc compared to a12. only a12 induced significant levels of il-13; none induced il-4 or gm-csf mrna compared to lps. in contrast to a12, which is activating, a8 significantly inhibited mc degranulation provoked by ige cross-linking; suppression was dependent on cys41. conclusions: the cytokine profile generated by a12 in mc and monocytes strongly supports a role in the pathogenesis of asthma. in contrast, the results strongly support a role for a8 in oxidant defence, particularly against hypobromite generated by activated eosinophils. (1), d mankuta (2), g gleich (3), f levi-schaffer (1) (1) hebrew university of jerusalem, israel (2) hadassah university hospital, israel (3) university of utah, usa. the onset, amplitude and termination of allergic responses are regulated at the mast cell/eosinophil interface. eosinophil major basic protein (mbp), which activates mast cells in the late-chronic phase of allergic inflammation, is a central determinant in this interface. although characterized more than two decades ago, the exact nature of this activation has not yet been clarified. here we demonstrate that mbp exerts its activating effect on human mast cells and basophils through cd29 and hematopoietic cell kinase (hck). a genome-wide analysis showed that hck displays shifts in mrna levels specifically upon mbp-induced mast cell activation. hck also shows a unique priming pattern prior to this activation. cd29 is phosphorylated specifically upon activation with mbp and deploys a signaling complex that critically depends on hck. 
extracellular neutralization of cd29 interferes with mbp entry into the cell, and this, as well as rna silencing of hck, results in defective mbp-induced activation. finally, cd29 neutralization abrogates mbp-induced anaphylaxis in vivo. these findings picture for the first time a chronic-phase-specific pathway mediating eosinophil-induced mast cell activation, with critical consequences for the therapy of chronic allergic inflammation. alexander robinson (1), d kashanin (2), f odowd (2), v williams (2), g walsh (1) (1) cellix ltd, institute of molecular medicine, dublin, ireland (2) university of aberdeen, scotland, uk. leukocyte adhesion to endothelial cell-bound proteins, such as icam-1 and vcam-1, is an initial step of the inflammatory response. we have developed an in vitro microfluidic system which mimics conditions found in blood vessels in vivo during an immune response. using this system, we can record leukocyte adhesion levels under physiologically relevant flow conditions (e.g. 0-5 dynes/cm2). the adhesion profiles of resting and pma-stimulated peripheral blood lymphocytes (pbls) were recorded with respect to vcam-1, icam-1 and bsa. images at each shear stress level were captured using a digital camera and analysed using our in-house ducocell software package. distinct morphological changes in pma-stimulated pbls, compared to non-stimulated cells, can be observed. these include a less rounded appearance of the pma-stimulated pbls and evidence of "uropod" formation, which anchors the t cell to the endothelium as part of the migration process. levels of adhesion to vcam-1 are high (80-90%, compared to control), but there appears to be little difference between the adhesion profiles of non-stimulated and pma-stimulated pbls. however, there is a distinct difference between the adhesion levels of non-stimulated and pma-stimulated pbls to icam-1, with pma-stimulated cells showing a higher affinity for icam-1 than non-stimulated cells (approx. 
70% and 40%, respectively). pbl adhesion to bsa is negligible. we present a novel in vitro microfluidic pump system that can simulate leukocyte adhesion to the endothelium under flow conditions. this platform is a more efficient and economical system compared to those currently available, due to reduced material costs and its style of construction. introduction: pulmonary aspiration of gastric contents is a common complication observed in icu patients and a potential trigger of ards. in this study we evaluated the course of lung inflammation induced by intranasal instillation of gastric juice (gj). methods: gj was obtained from donor rats (ph 1.6). male c57bl/6 mice were instilled with 2 ml/kg of gj. after 2 or 24 h, the animals were sacrificed and lung and balf were collected. the control group consisted of non-manipulated mice. (287.3±18.6; gj 24 h: 277.8±63.8 pg/ml). discussion: gj aspiration induced an initial adherence of pmn to lung tissue that correlated with an increased tnf-a/il-10 ratio in balf at the 2nd h. the reduction of mpo activity correlated with the decrease in tnf-a/il-10 ratio. the late increase of pmn in balf might be a consequence of the early production of tnf-a. the results suggest that the treatment of patients exposed to acid aspiration should be focused on the initial period of the insult and on the blockade of tnf-a. objectives: intestinal i/r is implicated as a prime initiating event in the development of acute respiratory distress syndrome (ards) after trauma and hemorrhagic shock. we investigated the effects of lps challenge in mice previously submitted to i-i/r, a two-hit model of acute lung injury. methods: male c57bl/6 mice were subjected to 45 min of intestinal ischemia and challenged with 0.1 mg/kg of intranasal lps at the 4th hour of reperfusion (two-hit). balf collection and culture of lung explants were performed 20 h after lps challenge. mice subjected to i-i/r or lps alone were used as controls. 
results: two-hit mice showed a marked increase in lung evans blue dye leakage compared to i-i/r (375.5±53.9 vs 48.8±3.5 mg/mg). lung mpo was increased (0.30±0.03 vs 0.15±0.03; od 450 nm), whereas neutrophil recruitment to balf was inhibited in the two-hit group compared to the lps group (11.5±2.6 vs 28.5±3.4 x 10^3 cells/mouse). the levels of nox in the two-hit group were significantly increased compared to i-i/r controls in balf (10.6±0.5 vs 6.5±0.8 mm) and in lung explants (3.1±0.3 vs 1.3±0.1 mm/mg of tissue). conclusions: intestinal i/r predisposes the animal to an exacerbated response to a low-dose lps insult. the exacerbated production of nitric oxide observed in the two-hit group may cause endothelial damage, thereby explaining the major increase in vascular permeability in the two-hit group. the results suggest that patients exposed to a systemic inflammatory response might develop ards when in contact with secondary inflammatory stimuli. nitric oxide may play an important role in this process. (1) (1) novartis institutes for biomedical research, horsham, uk (2) university of michigan, usa. obligatory for using oxygen in energy-transfer pathways was the simultaneous co-evolution of enzymes that detoxify the reactive species formed as by-products. thus, we hypothesized that individuals with low aerobic function will have reduced anti-oxidant capacity and, therefore, be more susceptible to smoking-related lung diseases like copd. 
to test this hypothesis, we exposed high capacity runner (hcr) and low capacity runner (lcr) rats to 3 months of whole-body smoke exposures. the animals, bred over successive generations on the same background strain for high or low running capacity, differ by over 500% (p<1x10^-37) in exercise capacity, measured by running on a treadmill. after 3 months of exposures, inflammatory cells in bronchoalveolar lavage fluid were increased in both the hcr- and lcr-smoke-exposed (se) animals compared to air-exposed controls (p<0.001); however, there was a 2-3-fold increase in the number of neutrophils and lymphocytes in the lcr- over the hcr-se group (p<0.001). histopathology revealed greater inflammation and lung damage in the lcr- versus hcr-se group (p<0.05). metabonomic (metabolite profiling) analysis revealed that while peroxidation of lung lipids occurred in both se groups, oxidative damage to the lung surfactant layer was significantly more extensive in the lcr-se group. systemic oxidative damage was also more apparent in the lcr-se group, with metabolic profiling suggesting a reduced capacity to regenerate muscle glutathione. the metabolic data suggest that repair processes may be more effective in the hcrs. in summary, these data support the concept that aerobic capacity may be central to one's susceptibility to developing smoking-related lung disease. (1), ap ligeiro-oliveira (1), jm ferreira-jr (4), sr almeida (1), w tavares de lima (1), shp farsky (1) (1) university of são paulo, brazil (2) regional integrated university of alto uruguai and missões, brazil. methods: male wistar rats were exposed to vehicle or hq (50 mg/kg; i.p.; daily for 22 days, with a two-day interval every five days). on day 10, animals were i.p. sensitized with ovalbumin (oa). assays were performed on day 23. results: hq-exposed rats presented a reduced number of leukocytes in the bronchoalveolar fluid and impaired in vitro oa-induced tracheal contraction. 
the latter effect suggests a reduction in mast cell degranulation, and it was corroborated in vivo by decreased mesenteric mast cell degranulation after topical application of oa. the oa-specificity of the response was confirmed by the normal ability of mast cells to degranulate in both groups of animals after topical application of compound 48/80. in fact, lower levels of circulating oa-anaphylactic ige antibodies were found in hq-exposed rats. this latter effect was not dependent on the number or proliferation of lymphocytes; nevertheless, reduced expression of the costimulatory molecules cd45 and cd6 on oa-activated lymphocytes indicated interference of hq exposure with signaling of the humoral response during allergic inflammation. contact information: ms sandra manoela dias macedo, regional integrated university of alto uruguai and missões / university of são p, department of clinical and toxicological analyses, são paulo, brazil. e-mail: smdmacedo@yahoo.com.br (1) (1) radboud university nijmegen medical centre, nijmegen, the netherlands (2) university hospital, zürich, switzerland. toll-like receptors (tlr) are essential in the recognition of invading microorganisms. however, increasing evidence shows involvement of tlr in autoimmunity as well, such as in rheumatoid arthritis (ra). here we investigated whether synovial expression of tlr3 and tlr7 was associated with the expression of ifna, tnfa, il-1b, il-12, il-17 and il-18, and studied in what way these receptors and cytokines were associated in vitro. using immunohistochemistry, we found that tlr3/tlr7 expression in synovial tissue was associated with the presence of ifna, il-1b and il-18, but not tnfa, il-12 and il-17. to investigate whether ifna, il-1b and il-18 could induce tlr3/tlr7 upregulation in vitro, we incubated separate lymphocyte populations with these cytokines and subsequently determined tlr3/7 mrna expression. 
ifna incubation resulted in significant tlr3/tlr7 upregulation, whereas il-1b and il-18 did not. pre-incubation with ifna and subsequent stimulation of tlr3/tlr7 significantly enhanced il-6, tnfa and ifna/b production, indicating that the ifn-induced tlr upregulation was functional. low amounts of biologically active il-1b were produced upon stimulation with atp, but not upon tlr3/tlr7 stimulation, although mrna levels were high. interestingly, ifna priming significantly increased the atp-induced il-1b production. here we demonstrated a dual role for ifna in vitro, which could explain the association between tlr and il-1b/il-18 in synovial tissue: first, involvement in tlr3/tlr7 regulation, and second, involvement in atp-induced production of biologically active il-1b. these results suggest involvement of anti-viral immune responses in ra, with ifna as a key player in chronic inflammation. the pathogenesis of chronic joint inflammation remains unclear, although the involvement of pattern recognition receptors (prr) has been suggested recently. here we describe the role of two members of the nacht-lrr (nlr) family, nod1 (nucleotide-binding oligomerization domain) and nod2, in a model of acute joint inflammation induced by intraarticular injection of the tlr2 (toll-like receptor 2) agonist streptococcus pyogenes cell wall (scw) fragments. we found that nod2 deficiency resulted in reduced joint inflammation and protection against early cartilage damage. in contrast, nod1 gene-deficient mice developed enhanced joint inflammation with concomitantly elevated levels of proinflammatory cytokines and cartilage damage. to explore whether the different functions of nod1 and nod2 also occur in humans, we exposed pbmcs carrying either a nod1 frameshift or nod2 frameshift mutation to scw fragments in vitro. both tnfa and il-1b production was clearly impaired in pbmcs carrying the nod2fs mutation compared to pbmcs isolated from healthy controls. 
in line with the nod1 gene-deficient mice, human pbmcs bearing the nod1 mutation produced enhanced levels of proinflammatory cytokines after 24 h stimulation with scw fragments. these data indicate that the nlr family members nod1 and nod2 have different functions in controlling tlr2-mediated pathways. we hypothesize that intracellular nod1-nod2 interactions determine the cellular response to tlr2 triggers. whether a lack of control of tlr2-driven pathways by nod1 signalling is involved in the pathogenesis of autoinflammatory or autoimmune disease, such as rheumatoid arthritis (ra), remains to be elucidated. leukocyte immunoglobulin-like receptors (lilrs) are a family of receptors with potential immune-regulatory function. activating and inhibitory receptors play a role in maintaining immunological equilibrium, and an imbalance may lead to the onset of autoimmune diseases such as rheumatoid arthritis (ra). ra is a chronic inflammatory disease of the joints caused by mediators (e.g. tnf-a) produced by activated leukocytes. we recently demonstrated expression of activating lilra2 in synovial tissue macrophages from ra patients. the aim of this study was to determine the expression and function of lilra2 in monocytes and macrophages. peripheral blood mononuclear cells (pbmc) were prepared by standard density gradient separation, and in vitro-derived macrophages were generated by differentiating thp-1 cells with vitamin d3. lilra2 expression was measured by flow cytometry before and after modulation with cytokines. differentiation to macrophages significantly up-regulated lilra2 expression (p=0.0239). treatment of macrophages with lps, tnf-a, il-1b and ifn-g, but not il-10, caused significant down-regulation of lilra2 (p<0.01). the function of lilra2 was assessed by cross-linking with a plate-immobilised lilra2-specific mab. soluble tnf-a was measured by elisa. 
activation of cells elicited tnf-a production in a dose-dependent manner, while time-course analysis showed maximal production at 18 h. the correlation between lilra2 expression and the response to cross-linking indicates that the level of expression may relate directly to the degree of activation. decreased expression in response to acute-phase cytokines suggests controlled regulation during inflammation. in ra, abnormal regulation of lilra2 could potentially exacerbate inflammation by inducing uncontrolled production of proinflammatory cytokines. pharmacological blocking of lilra2 could potentially provide therapeutic benefit. (1) (1) university of valencia, spain (2) northwick park institute for medical research, uk. co-releasing molecules (co-rms) mimic the biological actions of co derived from heme oxygenase activity. in the present work we studied the effects of a water-soluble co-releasing molecule (corm-3) in an animal model of human rheumatoid arthritis. dba-1/j mice were treated with corm-3 (5, 10 or 20 mg/kg/day, i.p.) from day 22 to 31 after collagen-induced arthritis (cia) and sacrificed on day 32. administration of corm-3 resulted in a significant improvement of the clinical profile of this disease, since it markedly reduced joint swelling and redness. histological analysis of the joints in control arthritic mice indicated the presence of granulocytes and mononuclear cells, cartilage erosion, chondrocyte death and proteoglycan depletion. all these parameters were significantly reduced by corm-3 treatment, with the most pronounced protective effect observed at 10 mg/kg. the levels of pro-inflammatory mediators (pge2, il-1beta, tnfalpha, il-2 and il-6) in the hind paw homogenates were significantly inhibited by corm-3. in addition, comp levels in serum, a marker of cartilage degradation, were reduced by the co-releasing agent. our studies show that therapeutic administration of corm-3 alleviates the clinical features of murine cia in the late phase of this response. 
the beneficial action of co liberated from corm-3 appears to be associated with a decrease in inflammatory cytokines and a reduction of cell infiltration into the synovial tissues, ultimately leading to a protective effect on the cartilage. aim: to set up a bovine model for cytokine-induced articular cartilage collagen degradation, and to characterize the model using a variety of compounds targeting different disease mechanisms relevant to arthritis. methods: full-thickness bovine articular cartilage punches were cultured with or without 10 ng/ml il-1a, tnf-a and oncostatin m. after three weeks the cartilage and culture medium were analyzed for weight changes, water content, dna content, glycosaminoglycans (gag), hydroxyproline (hyp), damaged collagen molecules, mmp activity, ctx-ii and comp. diclofenac, dexamethasone, pioglitazone, remicade, risedronate, galardin and a77-1726 were tested for their effect on cartilage degradation. results: exposure of articular cartilage to cytokines resulted in decreased cartilage weight, increased proteoglycan degradation, increased collagen degradation, an increased percentage of denatured collagen, increased water content and increased levels of active mmps (all p<0.01). comp release showed a trend towards upregulation during the first week of culture for all three donors; this was, however, not significant due to the small number of donors. most of the described processes were modulated by one or more of the drugs tested, indicating that this model of articular cartilage destruction is sensitive to treatment. discussion: stimulation of bovine articular cartilage explants with a cocktail of il-1a, tnf-a and osm results in clear and consistent changes in the cartilage, highly reminiscent of cartilage destruction during arthritis. further research needs to establish whether the model is also sensitive to anabolic factors that could potentially repair the damage. 
toll-like receptors (tlrs) may contribute to the progression of rheumatoid arthritis through recognition of host-derived damage-associated ligands that have repeatedly been found in arthritic joints. the involvement of tlr2 and tlr4 activation in the expression of arthritis was studied using interleukin-1 receptor antagonist-deficient (il-1ra-/-) mice, which spontaneously develop an autoimmune t-cell-mediated arthritis. spontaneous onset of arthritis was dependent on tlr activation by microbial flora, as germ-free mice did not develop arthritis. after crossing with tlr knockouts, il-1ra-/-tlr2-/- mice developed more severe arthritis compared to il-1ra-/-tlr2+/+ littermates, whereas il-1ra-/-tlr4-/- mice were protected against arthritis. to clarify the mechanism by which tlr2 and tlr4 differentially regulate disease expression, we studied the role of these tlrs in il-23 production and th17 development, both important in il-1ra-/- arthritis. wild type bone-marrow-derived dendritic cells (bmdcs) produced similar levels of il-23 upon stimulation with tlr2 and tlr4 ligands; however, il-1ra-/- bmdcs produced less il-23 than wild type dcs upon tlr2 stimulation and more il-23 than wild type dcs upon tlr4 stimulation. furthermore, il-1ra-/- t cells produced lower amounts of il-17 when cultured with tlr2-activated apcs and higher amounts of il-17 when cultured with tlr4-stimulated apcs, both in combination with cd3 stimulation. facs analysis of th17 (cd4+/il-17+) cells from both spleen and draining lymph nodes revealed a 70% reduction in il-1ra-/-tlr4-/- mice compared to il-1ra-/-tlr4+/+ littermates. specific cd3/cd28 stimulation of non-adherent splenocytes confirmed lower il-17 production in il-1ra-/-tlr4-/- mice. these findings suggest important roles for tlr2 and tlr4 in the regulation of th17 development and the expression of arthritis. 
prostaglandin e2 (pge2) stimulates the transactivational activity of p53 through p38 map kinase-dependent ser15 phosphorylation (jbc 2006). p53 controls cell-cycle progression, in part, by differential regulation of ap-1 proto-oncogenes (jun/fos). here we studied pge2 control of cyclin d1 promoter activity, with particular attention to the role of ap-1 oncogenes. pge2 induced a 7.6-fold increase in junb mrna expression (northern blot), a 2.1-fold increase in junb promoter activity (luciferase assay), and increased ser259 junb phosphorylation in human synovial fibroblasts (hsf) (western blot). c-jun was strongly inhibited, while jund, c-fos, fra1/2 and fosb expression were upregulated by pge2. in cell-cycle experiments, transformation with a constitutively active ha-ras construct (ras g12v) resulted in a 3.9-fold increase in cyclin d1 promoter activity, cyclin d1 synthesis, thr202/tyr204 phosphorylation of erk1/2 (6.2-fold) and ap-1 (c-jun)-dependent transactivity (3.1-fold); synthesis of the cyclin d1/cdk4-6 inhibitor p16ink4a was suppressed. addition of excess ras s17n dominant negative mutant construct to the plasmid mix abrogated the aforementioned processes. ectopic expression of c-jun, c-fos and especially jund expression constructs stimulated cyclin d1 promoter activity/protein synthesis and blocked p16ink4a synthesis; the latter effects were reversed by the addition of excess junb. pge2 exerted temporal and bi-phasic dose-dependent control of cyclin d1 promoter activity, largely through differential ap-1 activation, and promoted cell cycle arrest and apoptosis in hsf at high physiological concentrations. the results provide further insight into the biology of the cpla2/cox/pges biosynthetic axis and highlight the complexity of pge2 action in terms of cell-cycle progression. 
di-glucopyranosylamine (diga) is an antikeratitic (roberts et al., 1990, acvo conference, scottsdale) immunomodulatory pyranosyl disaccharide with parenteral anti-rheumatic activity (bolton et al., 2005, inflammation res. 54(s2) s121) and an unknown mechanism of action. interestingly, anti-tnf therapy is anti-keratitic. diga hydrolyses to monoglucosylamine (mga) and glucose, which is prevented by n-acetylation (nacdiga). lider (1995, pnas 92: 5037-41) showed that sulphated disaccharides are orally active and inhibit tnf synthesis and the dth reaction. we have investigated the anti-tnf and anti-rheumatic activity of the sulphated and free digas. human whole blood (hwb) was stimulated with pha (5 mg/ml) to synthesise tnf. antigen-induced arthritis (aia) was induced in methylated bovine serum albumin (mbsa)-sensitised c57bl/6 mice challenged i.a. into the stifle joint. collagen arthritis (cia), induced in dba mice by sensitisation to bovine collagen, was boosted i.p. with collagen at day 21. hwb tnf synthesis was inhibited by diga, mga and nacdiga (ic50 <0.1 mm). diga (10 ml, 1 mm) i.a. prevented 24-hour aia (-0.32±0.06 mm). diga at 100 mg/kg reduced aia when administered i.v. (-0.40±0.13 mm, p<0.05) and i.p. (-0.34±0.13 mm, p<0.05), but is hydrolysed p.o. (0.24±0.08 mm, ns). polysulphated diga (diga8s) was unstable, but was stabilised by n-acetylation (nacdiga8s). tnf synthesis was potently inhibited by both nacdiga and nacdiga8s (ic50 <0.1 mm). nacdiga8s (100 mg/kg p.o.) inhibited aia (-0.43±0.13 mm), and nacdiga8s preparations with lower degrees of sulphation (mw 800 and 1200 kda) inhibited the development of mouse collagen-induced arthritis as assessed by clinical score. sulphated diglucosylamines represent a new class of heparinoid which are potent inhibitors of tnf synthesis and possess oral anti-rheumatic activity. excessive no appears to play a key role in the pathogenesis of chronic inflammatory diseases. 
in this study we aimed to evaluate no synthesis in rheumatoid arthritis (ra) before and after therapy. the study was performed on 92 persons, divided into 6 groups: a negative control group of healthy volunteers, a positive control group with ra, a group with ra and physiotherapy (phys), a group with ra and low doses of cimetidine (cim) + doxycycline (dox), a group with ra and combined treatment phys + cim + dox, and a group treated with usual doses of ibuprofen (ibu). serum nitrite/nitrate (griess) was measured in order to evaluate no synthesis. results: compared to the positive control group, no synthesis decreased significantly in all the treated groups. there was no significant difference between the effects of phys and cim + dox alone. the combined treatment, phys + cim + dox, had a much better inhibitory effect on no synthesis. between the phys, cim + dox and phys + cim + dox groups and the group treated with ibuprofen, there was no significant difference in reducing no synthesis. conclusions: 1) in ra, phys + cim + dox treatment was as efficient as ibuprofen in reducing no synthesis. 2) the low doses of cim and dox may allow a longer treatment due to the lower risk of side effects. enhanced socs1 expression following exposure of murine macrophages to lps implicated socs1 in the control of lps-mediated signaling. socs1 regulates nfkb signaling in murine macrophages, blocking at the level of mal or ikba phosphorylation. we investigated the role of socs1 in regulating the production of tnf by lps- and pam3csk4-activated primary human monocytes. blood monocytes were isolated by centrifugal elutriation and either infected with an adenoviral vector expressing socs1 (adv-socs1) or a control vector (adv-gfp), or left untreated. adv-socs1 monocytes were exposed to the tlr4 and tlr2 ligands lps (500 ng/ml) or pam3csk4 (300 ng/ml). facs analysis demonstrated infection efficiencies of 50±4% and 67±4% (n=16, mean ± sem) for monocytes expressing adv-gfp or adv-socs1 at moi 50. 
adv-socs1 blocked lps- and pam3csk4-induced tnf mrna and protein production in a dose-dependent manner. in contrast, il-6 and il-10 production by adv-socs1-infected monocytes was not blocked. adv-socs1 also blocked lps- and pam3csk4-induced tnf production by macrophages isolated from synovial fluid, where infection efficiencies of 67+/-1% or 72+/-1% were obtained. quantitative western blot analysis revealed that the classically defined nf-kb pathway was not altered at the level of ikba or p65 activation. furthermore, the kinetics of lps- and pam3csk4-induced ikba phosphorylation and degradation in adv-socs1 monocytes remained unaffected (n=5 and 2 donors, respectively). further, analysis of parallel mapk pathways demonstrated no block in the p38 or erk mapk pathways. these data suggest that socs1 regulation of lps- and pam3csk4-induced tnf production by human monocytes occurs downstream of tlrs, possibly at the level of transcription. recently, beta-nad+ has emerged as a novel extracellular player in the human urinary bladder. beta-nad+ is the natural substrate of cd38, which catalyzes the conversion of beta-nad+ to cadpr. under normal conditions in vivo, there is no, or only a very small quantity (submicromolar range), of extracellular beta-nad+ compared to intracellular levels (200-1000 mm). during inflammation, cell lysis may cause bursts of high local beta-nad+ levels. however, the effect of beta-nad+ on human detrusor smooth muscle cells (hdsmc) was unknown. the effect of beta-nad+ on cultured (explant technique) hdsmc was determined by: 1) measuring cytosolic free calcium ([ca2+]i) in fura-2-loaded hdsmc using spectrofluorometry and 2) force measurements in 10-20 mg detrusor strips. hdsmc responded to beta-nad+ (40-100 mm) with an immediate and transient increase in [ca2+]i. the ca2+ transient was followed by one or two much slower transient increases in [ca2+]i, indicative of enzymatic conversion of beta-nad+ into cadpr. 
the ca2+ responses persisted in the absence of extracellular calcium. the ca2+ responses to beta-nad+ were not affected by exposure of hdsmc to atp, supporting the notion that the effects of beta-nad+ were not mediated via p2x7 purinoceptors. furthermore, beta-nad+ caused a concentration-dependent detrusor muscle relaxation. this is the first study to report that extracellular beta-nad+ affects intracellular calcium homeostasis and force in hdsmc. these powerful actions suggest a role for the beta-nad+/cadpr system as a novel extracellular player in the human detrusor during inflammation. aids remains a worldwide threat more than two decades after the identification of hiv as the etiological agent. its wide dissemination can be partly attributed to its successful suppression of immunity, resulting in disease progression and concomitant opportunistic infections including mycobacterial and cytomegalovirus infections. the hiv trans-activator (tat) is one of the regulatory proteins that mediates hiv replication and dysregulates cellular functions such as apoptosis and cytokine expression. for example, tat induces tumor necrosis factor (tnf) and enhances gp120-induced neurotoxicity. we recently showed that tat induces the overexpression of il-10 via the cellular kinase pkr and activation of the transcription factor ets-1. in this study, we examined whether tat plays a role in perturbing interferon-gamma (ifn-gamma) signal transduction. we showed that tat impaired ifn-gamma-induced stat1 tyrosine phosphorylation, but had no effect on the serine residue of stat1 or on jak kinases in primary human blood monocytes. furthermore, we found that the nuclear translocation of phospho-stat1 was abrogated by tat. the inhibition of phospho-stat1 impaired the formation of stat1 homodimers and of the subsequent stat1-dna complex. 
to investigate the cellular consequences, we measured the expression of ifn-gamma-stimulated genes including human leukocyte antigen (hla) and 2',5'-oligoadenylate synthetase (2',5'-oas), a key enzyme in the activation of latent ribonuclease l. the results showed that tat inhibited the transcriptional activation of 2',5'-oas and hla. taken together, we identified a new role for tat in which it impairs ifn-gamma signal transduction and suppresses inflammation, thus crippling the immune system and contributing to hiv persistence, opportunistic infections and disease progression. caspase-1 belongs to the group of inflammatory caspases and is the activating enzyme for the pro-inflammatory cytokine interleukin-18 (il-18), a cytokine known to play an important role in the pathogenesis of psoriasis. the purpose of this study was to determine the expression of caspase-1 in psoriatic skin and the signaling mechanisms involved in stress-induced activation of caspase-1 and il-18. interestingly, increased caspase-1 activity was seen in lesional compared with nonlesional psoriatic skin, as determined by western blotting. in vitro experiments in cultured human keratinocytes demonstrated an anisomycin-induced, p38 mapk-dependent increase in the secretion of procaspase-1 and active caspase-1. furthermore, anisomycin increased the mrna expression of il-18 through a p38 mapk-dependent but caspase-1-independent mechanism, reaching a maximum level after 12 hours of stimulation. finally, anisomycin caused a rapid (4 hours) increase in the secretion of proil-18 and active il-18. secretion of active il-18 was mediated through a p38 mapk/caspase-1-dependent mechanism, whereas secretion of proil-18 was mediated by a p38 mapk-dependent but caspase-1-independent mechanism. these data demonstrate that the activity of caspase-1 is increased in psoriatic skin and that il-18 secretion is regulated by a p38 mapk/caspase-1-dependent mechanism, making caspase-1 a potential target in the treatment of psoriasis. 
prostaglandin e2 (pge2) regulates the stability of cyclooxygenase-2 (cox-2) mrna through adenylate/uridylate-rich elements (ares) in the 3'-untranslated region (3'utr) by a positive autocrine/paracrine feed-forward loop. the principal objective of this study was to elucidate the molecular mechanisms involved in the pge2-dependent stabilization of cox-2 in human synovial fibroblasts (hsfs). transfection of well-known are-binding proteins (aubps) demonstrated that tristetraprolin (ttp) potently destabilized a [luciferase-cox-2 3'utr] reporter fusion mrna (71+/-5.7% decrease in luciferase activity vs. control). ttp protein levels in hsfs remained constant despite il-1b-induced changes in ttp mrna levels, thus suggesting translational regulation of its expression. pge2 did not affect the transcription or translation of this gene in hsfs. western blot analysis of hsf ttp demonstrated the existence of a specific, covalent ~55kda heterocomplex containing ttp (ttphcx). although the exact composition and stoichiometry of ttphcx are yet to be defined, pge2 selectively regulated the amount of this heterocomplex in a time-dependent manner. furthermore, protein shuttling studies performed using real-time confocal microscopy revealed that pge2 can induce the export of a small nuclear pool of ttp-gfp. finally, transfection of ttp into hsfs also influenced cox-2 gene transcription, thus enabling ttp to regulate cox-2 gene expression at both the transcriptional and post-transcriptional level. in conclusion, we have demonstrated that ttp is an rna-binding protein capable of influencing cox-2 mrna stability and transcription and whose localization and interaction with other factors are regulated by pge2. these data can provide important insight into deciphering the role of pge2 in fine-tuning physiological and pathophysiological gene regulation. 
(1) chinese academy of sciences, shanghai, pr china (2) ohio state university, usa mitogen-activated protein (map) kinases play a critical role in innate immune responses to microbial infection through eliciting the biosynthesis of proinflammatory cytokines. map kinase phosphatase (mkp)-1 is an archetypical member of the dual-specificity phosphatase family that deactivates map kinases. induction of mkp-1 has been implicated in attenuating the lipopolysaccharide (lps) and peptidoglycan (pgn) responses, but how the expression of mkp-1 is regulated is still not fully understood. here, we show that inhibition of p38 map kinase by the specific inhibitor sb 203580 or by rna interference (rnai) markedly reduced the expression of mkp-1 in lps- or pgn-treated macrophages, which correlated with prolonged activation of p38 and jnk. depletion of mapkap kinase 2 (mk2), a downstream substrate of p38, by rnai also inhibited the expression of mkp-1. the mrna level of mkp-1 was not affected by inhibition of p38, but the expression of mkp-1 was inhibited by treatment with cycloheximide. thus, p38 mapk plays a critical role in mediating the expression of mkp-1 at a post-transcriptional level. furthermore, inhibition of p38 by sb 203580 prevented the expression of mkp-1 in lps-tolerized macrophages and restored the activation of map kinases after lps restimulation. these results indicate a critical role of p38-mk2-dependent induction of mkp-1 in innate immune responses. the ikb kinase (ikk) complex regulates the activation of nf-kb, a key transcription factor in inflammation and immunity. whilst ikka activity is necessary for proinflammatory and anti-apoptotic gene expression, ikka has distinct roles in lymphoid organogenesis and b cell maturation. here we describe a role for ikka in cell-mediated immunity (cmi). paw inflammation in methylated bsa-induced cmi was significantly reduced in transgenic mice expressing a mutant ikka protein that cannot be activated (ikka aa/aa) compared to wild-type (wt). 
antigen-induced il-2 and ifng production by ikka aa/aa splenocytes and ikka aa/aa t cell:dc cocultures was also significantly reduced ex vivo. this could be normalised by using wt t cell:ikka aa/aa dendritic cell (dc) but not ikka aa/aa t cell:wt dc combinations. this suggests that the reduced cmi in ikka aa/aa mice is due to a defect in ikka aa/aa t cells, not dcs. this is not due to a requirement for ikka in tcr-mediated activation of t cells, since anti-cd3/cd28-mediated activation of ikka aa/aa t cells was unaltered. however, lps-induced production of the important th1 cytokine il-12 is impaired in ikka aa/aa dcs. we are currently addressing the hypothesis that ikka activity may be required for the generation and maintenance of antigen-specific t cells in vivo. recently we described a role for ikka in the negative regulation of innate immunity and acute inflammation, which is in contrast to its role shown here in promoting adaptive immunity and antigen-driven inflammation. ikka may represent an alternative target for the treatment of autoimmune disease which would not compromise host defence. as a latent transcription factor, nf-kb translocates from the cytoplasm into the nucleus upon stimulation and mediates the expression of genes important in immunity, inflammation and development. although extensive studies have addressed how nf-kb is triggered into the nucleus, little is known about how it is regulated inside the nucleus. by a two-hybrid approach, we identified a prefoldin-like protein, snip, that is expressed predominantly in the nucleus and interacts specifically with nf-kb there. we show that rnai knockdown of snip leads to impaired nuclear activity of nf-kb and dramatically attenuates the expression of nf-kb-dependent genes. this interference also sensitizes cells to apoptosis by tnf-a. furthermore, snip forms a dynamic complex with nf-kb and is recruited to the nf-kb enhanceosome upon stimulation. 
interestingly, the snip protein level correlates with constitutive nf-kb activity in human prostate cancer cell lines. the presence of nf-kb within the nucleus of stimulated or constitutively active cells is significantly diminished without endogenous snip. our results reveal that snip is an integral component of the nf-kb enhanceosome and essential for its stability in the nucleus, which uncovers a new mechanism of nf-kb regulation. bone remodeling is a tightly regulated process that couples the resorption of old bone by osteoclasts and the deposition of new bone by osteoblasts. an imbalance between bone formation and bone resorption can result in various metabolic bone diseases, such as rheumatoid arthritis and osteoporosis. osteoclasts are terminally differentiated cells that arise from a haematopoietic stem cell lineage, which also gives rise to monocytes and macrophages. osteoclast differentiation, and the regulation of this process to maintain bone homeostasis, are central to the understanding of the pathogenesis and treatment of bone diseases such as osteoporosis. in vitro, osteoclast formation from bone marrow macrophages is induced by rankl (receptor activator of nf-kappa b ligand) in the presence of m-csf (macrophage colony stimulating factor). osteoclastogenesis is markedly enhanced in bone marrow macrophages from ifnar1-/- and ifnar2-/- mice and results in an increased number of multinucleated cells positive for the osteoclast marker trap (tartrate-resistant acid phosphatase). consequently, the mutant mice develop an osteoporotic phenotype, characterised by reduced bone density. these findings suggest that the ifn alpha/beta system is critical for the negative feedback regulation of osteoclastogenesis and that rankl signaling is essential for the induction of osteoclast differentiation. 
atp acting on p2x7 receptors in macrophages is one of the main physiological signals that lead to the processing and release of the pro-inflammatory cytokine interleukin-1beta (il-1b); their activation also leads to the rapid opening of a membrane pore permeable to dyes such as ethidium. here we identify pannexin-1, a recently described mammalian protein that functions as a hemichannel when ectopically expressed, as this dye-uptake pathway and show that signalling through pannexin-1 is required for the processing of caspase-1 and release of mature il-1b induced by p2x7 receptor activation. furthermore, maitotoxin and nigericin, two agents considered to evoke il-1b release via the same mechanism, were studied. maitotoxin evoked dye uptake whose kinetics were similar to a slow pannexin-1-independent phase induced by p2x7 receptor activation, and this was unaltered by pannexin-1 inhibition. nigericin did not induce dye uptake. inhibition of pannexin-1 blocked caspase-1 and il-1b processing and release in response to these two stimuli. thus, while pannexin-1 is required for il-1b release in response to maitotoxin, nigericin and atp, a mechanism distinct from pannexin-1 hemichannel activation must underlie the former two processes. introduction: saa is a classic acute-phase protein upregulated during the inflammatory response. saa is active on leukocytes and modulates inflammation and immunity through the induction of cytokines, including the chemokine il-8. here we verify the effect of saa on the mrna expression and release of mip-3alpha, a chemokine involved in the recruitment of dendritic cells. methods: peripheral blood mononuclear cells (pbmc) isolated from peripheral blood by density gradient were cultured in rpmi medium in the presence of saa. mip-3alpha concentration was determined in the supernatant of cell cultures by elisa. mrna was analyzed by the ribonuclease protection assay (rpa). 
results: pbmc stimulated with saa (20 ug/ml) expressed mip-3alpha mrna at 2, 4 and 12 hours. mip-3alpha protein was found in the supernatant of 24 and 42 hour cultures (p<0.05), and the addition of sb 203580 (p38 inhibitor) and pd 98059 (erk 1/2 inhibitor) completely abolished the release of mip-3alpha. conclusions: saa is an inducer of mip-3alpha expression in pbmc, and p38 and erk1/2 are important signaling pathways for this effect. saa is one of the factors responsible for the recruitment of dendritic cells. the p38 pathway is activated in numerous inflammatory conditions, including ra, ibd, asthma, acute coronary syndrome and copd, and its activation helps drive the production of inflammatory mediators. inhibitors of p38 decrease mediator production and can therefore produce profound anti-inflammatory effects. arry-797 is a potent inhibitor of the p38 enzyme (ic50 < 5nm) with a novel pharmacophore and physiochemical properties distinct from those of other p38 inhibitors, being very water soluble. it is extremely potent in human whole blood, blocking lps-stimulated tnf production with an ic50 < 2nm. in animal models of rheumatoid arthritis (cia and aia) the compound significantly normalized histologic endpoints, such as inflammation, bone resorption and cartilage damage (ed50 ~3 mg/kg). a phase i single ascending dose clinical study was run in healthy volunteers. after an oral dose of 50, 100, 200, 300 or 400 mg, blood was drawn at various times, stimulated ex vivo with lps, and analyzed for cytokines and inflammatory mediators. arry-797 was well tolerated and drug exposure was proportional to dose. in ex vivo samples, there was both a time- and concentration-dependent inhibition of il-1, pge2 and tnf, with >80% inhibition observed at the 50 mg dose level. 
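the ">80% inhibition" figure above for an ex vivo cytokine readout is conventionally computed relative to a vehicle or pre-dose control. as a minimal sketch with entirely hypothetical pg/ml readouts (the abstract does not report raw values):

```python
def percent_inhibition(control, treated):
    """percent reduction of a stimulated cytokine readout relative to control."""
    if control <= 0:
        raise ValueError("control readout must be positive")
    return 100.0 * (control - treated) / control

# hypothetical lps-stimulated tnf readouts (pg/ml), pre-dose vs post-dose
print(round(percent_inhibition(1500.0, 240.0), 1))  # >80%, consistent with the abstract
```

real ex vivo analyses would average replicate wells per subject and time point before computing inhibition; the single-pair calculation here is only illustrative.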
the plasma concentrations of drug peaked at ~20 ng/ml at the 50 mg dose and cytokine inhibition was sustained for >4 hours, showing that low doses of arry-797 produce profound effects on clinical biomarkers. further evaluation of arry-797 in patients with inflammatory diseases is planned. introduction: we demonstrated that chronic in vivo blockade of nos (l-name, 20mg/kg; oral route; 14 days) impairs leukocyte-endothelial interactions and neutrophil migration into the inflammatory focus. these effects may depend, at least in part, on decreased expression of l-selectin on leukocytes and pecam-1 on endothelium. aiming to clarify the mechanisms involved in these inhibitory effects, we have now investigated the effect of l-name treatment on the secretion of tnf and il-1b by circulating leukocytes and migrated peritoneal neutrophils. methods: male wistar rats were treated with l-name (20mg/kg; oral route; 14 days) or sterile saline (control). circulating leukocytes were isolated from blood collected from the abdominal aorta, and migrated neutrophils were obtained 4 hours after i.p. injection of oyster glycogen (1%; 10ml). no (griess reaction) and cytokines (elisa) were quantified in supernatants of 1 x 10^6 cultured cells before and 18 hours after lps stimulation (5 ug/ml). results: levels of no, tnf and il-1b were reduced in circulating leukocytes from l-name-treated rats in both basal and lps-stimulated conditions. on the other hand, only the secretion of il-1b was impaired in migrated neutrophils. conclusions: the results show that in vivo l-name treatment, which partially reduces no production, decreases the secretion of pro-inflammatory cytokines by circulating leukocytes. however, the same pattern of inhibition is not detected if neutrophils are primed in vivo. objectives: to investigate the ability and mechanism of ifn-g to suppress interleukin-1 (il-1)-induced mmp-13 expression in articular chondrocytes. 
methods: human chondrocytes were treated with ifn-g or il-1beta alone or in combination. mmp-13 mrna was analyzed by rt-pcr. mmp-13 protein, phospho-stat1 and p44/42 mapk levels were measured by western blotting. mmp-13 promoter-luciferase and cmv-cbp/p300 plasmids and stat1 sirna were transfected by the calcium phosphate method. ap-1 activity was monitored by elisa. the stat1-cbp/p300 interaction was studied by immunoprecipitation. results: ifn-g potently suppressed il-1-induced mmp-13 expression and promoter activity. blockade with a neutralizing ifn-gr1 antibody revealed that mmp-13 inhibition by ifn-g was mediated by the ifn-g receptor. ifn-g-stimulated activation of stat1 was directly correlated with mmp-13 suppression. knockdown of the stat1 gene by specific sirna, or its inhibition with fludarabine, partially restored the il-1 induction of mmp-13 expression and promoter activity. ifn-g did not alter activator protein (ap-1) binding ability but promoted a physical interaction of stat1 with the cbp/p300 co-activator. p300 overexpression reversed the ifn-g inhibition of endogenous mmp-13 mrna expression and exogenous mmp-13 promoter activity. conclusions: ifn-g through its receptor activates stat1, which binds the cbp/p300 co-activator, sequesters it from the cell system and thus inhibits transcriptional induction of the mmp-13 gene in chondrocytes. ifn-g and its signaling pathways could be targeted therapeutically for (1), p asmawidjaja (1), r hendriks (2), erik lubberts (1) (1) erasmus medical center, department of rheumatology, rotterdam, the netherlands (2) erasmus medical center, department of immunology, rotterdam, the netherlands the objective of this study was to identify the role of il-23 in th17 polarization in the autoimmune-prone dba-1 mouse with and without collagen-induced arthritis and to evaluate th17-specific cytokine and transcription factor expression. il-23 induced th17 cells in vitro from spleen cells of naive and collagen type ii (cii)-immunized dba-1 mice. 
the percentage of th17 cells was markedly higher in cii-immunized versus naive dba-1 mice. adding il-23 to tgf-beta/il-6-stimulated cd4+ t cells did not significantly increase the percentage of th17 cells. tgf-beta/il-6, in contrast to il-23, induced a relatively high percentage of il-17+/ifn-gamma- cells and a low percentage of il-17-/ifn-gamma+ cells. tgf-beta/il-6 did not increase il-23 receptor expression, which may explain why adding il-23 directly, or two days after tgf-beta/il-6, did not result in an increase in the percentage of th17 cells. elevated expression of il-17a and il-17f as well as the th17-specific transcription factor rorgammat was found under il-23 as well as tgf-beta/il-6 conditions. interestingly, il-23 but not tgf-beta/il-6 was critical for expression of the th17 cytokine il-22 in t cells from cii-immunized dba-1 mice. these data show that il-23 was more pronounced in inducing il-17+/ifn-gamma- (th17) cells under cii-immunized conditions. furthermore, il-23 did not markedly increase the percentage of th17 cells induced by tgf-beta/il-6. however, il-23 is critical for the induction of il-22 expression, suggesting a unique role for il-23 in the induction of specific th17 cytokines. ebi3 was initially discovered as a transcriptionally activated gene in epstein-barr virus-infected human b lymphocytes, and is similar to the p40 subunit of il-12. the ebi3 protein has been shown to form heterodimers with p28. p28/ebi3, termed il-27, can influence the function of multiple t cell subsets, including naive, effector, regulatory and memory t cells. however, previous studies showed that the overlapping expression of ebi3 and p28 is very limited. these data led to the hypothesis that ebi3 may play a role independently of its association with p28. thus, to define the function of ebi3, we generated ebi3 transgenic (tg) mice expressing ebi3 in multiple tissues. 
ebi3 tg mice exhibited no histologic abnormalities in various organs and normal numbers of naive and memory cd4+ and cd8+ t cells, b cells, nk cells and nkt cells. cd4+ t cells isolated from spleens of ebi3 tg mice, however, produced less ifn-g than cells from wt (wild type) control mice after in vitro stimulation with anti-cd3 and anti-cd28 antibodies. in in vivo studies, delayed-type hypersensitivity (dth) and contact hypersensitivity (chs) responses were significantly reduced in ebi3 tg compared with wt mice. moreover, the chs responses in ebi3 tg mice were recovered with an anti-ebi3 polyclonal antibody. notably, the chs reaction in wt mice was increased by the anti-ebi3 antibody. in contrast, an anti-p28 antibody suppressed chs responses in wt mice. these data suggest that ebi3 acts differently from il-27 and reduces th1 responses. (1), o thaunat (2), x houard (1), o meilhac (1), g caligiuri (2), a nicoletti (2) (1) inserm u698 and university denis diderot-paris 7, chu xavier bichat, paris, france (2) inserm umr s 681, université pierre et marie curie-paris 6, centre de recherche des cordeliers, paris, france arteries are composed of three concentric tissue layers which exhibit different structures and properties. because arterial injury is generally initiated at the interface with circulating blood, most studies performed to unravel the mechanisms involved in injury-induced arterial responses have focused on the innermost layer (intima). in contrast, the role of the outermost tunica, the adventitia, has attracted relatively little attention and remains elusive. in the present review, we focus on the involvement of the adventitia in the response to various types of arterial injury leading to vascular remodeling. several lines of evidence show that the initial insult and the early intimal response lead to the genesis of (neo-)mediators that are centrifugally conveyed by mass transport towards the adventitia. 
these mediators trigger local adventitial responses including angiogenesis, immuno-inflammation and fibrosis. we propose that these three processes interact sequentially and that their net balance participates in producing each specific pathological condition. hence, an adventitial adaptive immune response predominates in chronic rejection. inflammatory phagocytic cell recruitment and the initiation of a shift from innate to adaptive immunity characterize the adventitial response to proteolysis products in abdominal aortic aneurysm. centripetal adventitial sprouting of neovessels, leading to intraplaque hemorrhages, predominates in atherothrombosis. adventitial fibrosis mediated by low-grade inflammation characterizes the response to mechanical stress and is responsible for constrictive remodeling of arterial segments and for initiating interstitial fibrosis in perivascular tissues. these adventitial events thus impact not only the vessel wall biology but also the surrounding tissue. atherosclerosis has many of the characteristics of an inflammatory disease, and thus would classically involve endothelial cox-derived prostaglandins such as pge2 and prostacyclin acting on ep and ip receptors, respectively. activation of vascular ip receptors is especially important in limiting the atherogenic properties of thromboxane a2 acting on tp receptors. more recently, expression of ghrelin receptors has been shown to be increased in atherosclerotic plaques, and ghrelin itself has anti-inflammatory properties in addition to its classical role as a hunger hormone. as well as the complex crosstalk between g-protein-coupled receptors (gpcrs), recent evidence indicates that many gpcrs exist constitutively as homodimeric complexes, and that the formation of heterodimers not only influences the classical cell signalling pathways used by these receptors but also affects their subcellular distribution. we have found that ep3-i, tp and ghrelin receptors readily form homodimers, but that 
co-transfection of hek293 cells with these receptors results in the formation of heterodimers with unpredictable effects on receptor distribution and cell signalling properties. since inflammatory conditions are thought to change the relative expression levels of gpcrs in the vasculature, and since varying the expression levels of gpcrs will affect their ability to form heterodimers, one might predict that gpcr heterodimerization would indeed influence the reactivity of vascular tissue during inflammation. [this work was fully supported by grants from the research grants council of the hong kong special administrative region (cuhk4267/02m and 2041151).] vascular inflammation leads to the formation of leukotrienes through the 5-lipoxygenase pathway of arachidonic acid metabolism. leukotriene-forming enzymes are expressed within atherosclerotic lesions, and locally produced leukotrienes exert pro-inflammatory actions within the vascular wall by means of cell surface receptors of the blt and cyslt receptor subtypes. recent mechanistic studies have supported the notion of a major role of leukotriene signaling in atherosclerosis. leukotriene b4 (ltb4), for example, is one of the most potent chemotactic mediators formed within the atherosclerotic lesion, inducing migration of a number of different cell types of both hematopoietic and non-hematopoietic origin. initially identified on neutrophils, blt receptor activation is involved in monocyte chemotaxis and adhesion as well as in vascular smooth muscle cell migration and proliferation, providing examples of potential mechanisms in ltb4-induced atherogenesis. targeting ltb4-induced activation of vascular smooth muscle cells has beneficial effects in models of intimal hyperplasia and restenosis after vascular injury. furthermore, blt receptor expression has been demonstrated on t-cells, suggesting ltb4 as a potential link between innate and adaptive immunological reactions. 
taken together, the local formation of leukotrienes within the atherosclerotic lesion and the potent pro-inflammatory effects of leukotriene receptor activation in target cells of atherosclerosis provide a rationale for a role of leukotrienes in this disease. further experimental and clinical studies are however needed to develop therapeutic strategies targeting leukotriene signaling in atherosclerosis. under normal physiological conditions, prostanoid (prostaglandin (pg) and thromboxane (tx)) synthesis is dependent on the constitutive isoform of cyclooxygenase (cox-1). this synthesis and release happen within a few minutes of cell or tissue stimulation. in vascular preparations subjected to pro-inflammatory conditions for some hours, the inducible isoform of cyclooxygenase (cox-2) and other prostanoid synthases can be observed. as an illustration of these experimental results, there is an increased presence of cox-2 and of the inducible enzyme responsible for pge2 synthesis (mpges-1), detected by immunocytochemistry, in carotid atherosclerotic plaques with strong inflammation. in vascular cells in culture, pgi2 is the major biologically active prostanoid produced under normal physiological conditions. however, when cox-2 is induced, pgi2 and pge2 are produced equally. the role of cox-1, cox-2 and mpges-1 activities also depends on the expression of the various prostanoid receptors in the vessel considered. there is increasing evidence for the presence and a role of the ep receptor subtypes (ep1, ep2, ep3 or ep4), preferentially stimulated by pge2, in the vascular wall. for these reasons, we have characterized the receptors activated by pge2 in human mammary arteries. in these vessels, incubated with a pro-inflammatory cytokine (interleukin-1beta) and lipopolysaccharides, a reduced contractility to norepinephrine has been observed. 
this effect is abolished by treatment of the vascular preparations with a selective cox-2 inhibitor, suggesting that prostanoid synthesis and/or prostanoid receptors could be involved. rheumatoid arthritis is a syndrome which probably consists of a number of diseases for which the risk factors differ. two major processes were identified, including the generation of the anti-citrullinated antigen immune response (highly specific for ra). we show that different hla class ii alleles contribute to the development of anti-ccp-positive and anti-ccp-negative ra. the se alleles do not independently contribute to the progression to ra, but rather contribute to the development of anti-ccp antibodies. next we determined the effect of smoking and observed that smoking conferred a risk of contracting ra only in the anti-ccp-positive group and not in the anti-ccp-negative group. for the risk factor ptpn22 (a gene that regulates the threshold of lymphocyte activation), the allele c1858t contributed only to anti-ccp-positive ra. in contrast to hla, two other risk factors were found to be associated with both anti-ccp-positive and anti-ccp-negative ra. the risk factor in the fcrl gene that had been identified in the japanese population was also tested in 931 dutch ra cases and 570 unrelated dutch controls. carrier analysis of the snp (rs7528684) revealed an association of the cc genotype with a higher risk of developing ra as compared to tt and tc carriers (p=0.039, or=1.31). in a meta-analysis of all studies, comparing 9467 individuals, the or for the cc genotype to develop ra was 1.2 (p<0.001). in conclusion, different steps in the pathogenesis of the syndrome ra can be delineated. this talk will focus on recent advances in understanding primary genetic factors predisposing to inflammatory bowel disease (crohn's disease and ulcerative colitis). proven genes containing genetic variants predisposing to crohn's disease include ibd5/5q31, card15/nod2 and il23r. 
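the carrier analysis above compares cc-genotype carriers with tt and tc carriers, and the reported or comes from a 2x2 table of genotype counts in cases and controls. as a minimal sketch with hypothetical per-cell counts (the abstract reports only the group totals of 931 cases and 570 controls, not the cell counts):

```python
def odds_ratio(case_carrier, case_noncarrier, ctrl_carrier, ctrl_noncarrier):
    """odds ratio for carrying a risk genotype in cases relative to controls."""
    return (case_carrier / case_noncarrier) / (ctrl_carrier / ctrl_noncarrier)

# hypothetical split of the 931 cases and 570 controls into cc vs tt/tc carriers
print(round(odds_ratio(300, 631, 155, 415), 2))
```

a full analysis would add a confidence interval (e.g. via the log-or standard error sqrt(1/a + 1/b + 1/c + 1/d)); the cell counts here are placeholders, not the study's data.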
data are suggestive but not yet as convincing for many other genes. a common theme is that of genetic variants influencing early innate immune responses to intestinal bacterial components, and subsequent adaptive immune responses, leading to intestinal inflammation. only for card15/nod2 is there a (partial) understanding of how genetic variation influences biological function to cause chronic disease. some (gene-targeted) mouse models of card15 appear to show effects opposite to those seen in other models and human systems. in humans, card15 mutations impair responses to bacterial components (muramyl dipeptide), mainly at low-dose sensing. it is likely that this receptor system normally maintains intestinal crypt sterility and protection from invasive infection. pathogen-recognition receptors (prrs) are key components of immune systems and are involved in innate effector mechanisms and activation of adaptive immunity. since their discovery in vertebrates, toll-like receptors (tlrs) have become the focus of extensive research that has revealed their significance in the regulation of many facets of our immune system. recently a new family of intracellular prrs, the nod-like receptors (nlrs), which include both nods and nalps, has been described. mutations within the nalp3/cryopyrin/cias1 gene are responsible for three autoinflammatory disorders: muckle-wells syndrome, familial cold autoinflammatory syndrome, and cinca/nomid. the nalp3 protein associates with asc and caspase-1, thereby forming a molecular machine termed the inflammasome that displays high pro-il-1beta-processing activity. macrophages from muckle-wells patients spontaneously secrete active il-1beta. increased inflammasome activity is therefore likely to be the molecular basis of the symptoms associated with nalp3-dependent autoinflammatory disorders. here we will emphasize the ability of this protein complex to promote the development of autoinflammatory syndromes.
allergic inflammation (ai) is a complex phenomenon initiated by allergen binding to ige-sensitized mast cells and consequent mast cell activation. this causes the symptoms of the early phase of ai and the onset of the late phase, characterized by the penetration into the inflamed tissue of inflammatory cells, notably the eosinophils. their subsequent activation is believed to cause tissue damage and to be mainly responsible for tissue remodeling, especially when ai becomes a chronic process. we defined a novel functional unit, which we termed the allergic synapse, formed by mast cell-eosinophil couples. in the synapse these two long-recognized cellular players of ai cross-talk via soluble mediators and receptor-ligand interactions. this results in mast cell-eosinophil functional synergism that consequently amplifies and prolongs the inflammatory response. in addition, mast cells and eosinophils influence, and are influenced by, the surrounding structural cells, i.e. fibroblasts and endothelial cells, in a sort of allergic niche. we propose to view the allergic synapse/niche as a specialized effector unit worth blocking for the treatment/prevention of allergic inflammation. (1) (1) erasmus mc, rotterdam, the netherlands (2) department of immunology, weizmann institute of science, rehovot, israel. allergic asthma is one of the most common chronic diseases in western society, characterized by reversible airway obstruction, mucus hypersecretion and infiltration of the airway wall with th2 cells, eosinophils, and mast cells. if we are to devise new therapies for this disease, it is important to elucidate how th2 cells are activated and respond to intrinsically harmless allergens. dendritic cells (dcs) are the most important antigen-presenting cells in the lung and are mainly recognized for their exceptional potential to generate a primary immune response and sensitization to aeroallergens.
we have shown that intratracheal injection of ovalbumin (ova)-pulsed dcs induces sensitization leading to eosinophilic airway inflammation upon ova aerosol challenge. we investigated the role of dcs in the secondary immune response in a murine asthma model. ova aerosol challenge in ova-dc-sensitized mice induced an almost 100-fold increase in the number of airway dcs as well as an increase in eosinophils and t cells. to investigate the functional importance of dcs for the induction and maintenance of airway inflammation in response to allergen challenge, we conditionally knocked out endogenous dcs in sensitised cd11c-diphtheria toxin (dt) receptor (cd11c-dtr) transgenic mice by airway administration of dt 24 h before ova aerosol (4x) challenge, or during an ongoing inflammation (depletion after 3x ova aerosols, continued with 3 additional ova aerosols). numbers of balf eosinophils, th2 cytokine production by mediastinal lymph nodes, and peribronchial and perivascular inflammatory infiltrates were dramatically decreased, illustrating an essential role for airway dcs during secondary challenge. karolinska institute, stockholm, sweden. nk cells are innate lymphocytes with potent immunoregulatory functions. they are potent producers of several cytokines and chemokines, and also respond to similar molecules in the body and at inflammatory sites. even though traditionally best characterized for their role in anti-viral and anti-tumor immunity, they influence several other types of immune responses. for example, they are involved in, and affect, acute as well as chronic inflammatory responses. in the present talk, a general overview of our current knowledge of nk cell biology will be provided, with special emphasis on the role of these cells in allergic inflammation.
basophils are major effector cells in allergic reactions due to their ability to release substantial quantities of histamine and eicosanoids following activation of high-affinity ige receptors (fcεri) by allergens. although these attributes are shared with their tissue-fixed mast cell compatriots, basophils are unique in their ability to also rapidly elaborate th2-type cytokines (e.g. il-4 and il-13), subsequently supporting ige synthesis and underlying atopy. importantly, these mediators are additionally secreted following primary exposure to certain parasites (e.g. s. mansoni) and immunoglobulin superantigens, suggesting a role for basophils in innate immunity and in assisting developing th2-type adaptive immune responses. while we are beginning to understand the potential physiological functions of these cells in host defence, blocking their activity with respect to treating symptoms of allergic disease has remained an enigma. recent advances, however, have shed light upon the major intracellular signal transduction processes involved in fcεri activation and may lead to novel therapeutic strategies for inhibiting mediator secretion. an important discovery in this regard is the phosphatase ship, which downregulates pi 3-kinase signalling in both basophils and mast cells. recent data show that ship expression in basophils is reduced in donors with active allergic disease, but that these levels may be increased, and the activity of basophils subsequently inhibited, by targeting receptors associated with ship recruitment (cd200r, cd300r). identifying the natural ligands for these inhibitory receptors may therefore pave the way for new therapies for the treatment of allergic inflammation. mitogenesis and proliferation of vsmc play an important role in atherogenesis.
pro-inflammatory secretory phospholipases a2 (spla2) hydrolyse glycerophospholipids of hdl and ldl and release pro-inflammatory agents: lyso-lipids, oxidized and non-oxidized fatty acids, and isoprostanes. spla2 lipolysis products localize in the vascular wall in the vicinity of vsmc. we have tested the impact of spla2, hdl and ldl, and of their hydrolysis products, on mitogenesis and on pge2 and ltb4 release from vsmc. mitogenesis was significantly enhanced by native hdl and ldl, and by group v spla2. spla2 hydrolysis of hdl and ldl enhanced mitogenic activity in the order v > x > iia. the release of pge2 from vsmc was enhanced by group x spla2 but not by iia or v; the greatest effect was seen for hdl hydrolysed by group v and x spla2. native ldl and its spla2 hydrolysis products enhanced the release of pge2 in the order x > v > iia. the release of ltb4 from vsmc was markedly increased by native ldl and hdl, and by the hydrolysis products of group v and x, but not iia, spla2. migration of vsmc was significantly enhanced by spla2 iia and inhibited by hdl. this study demonstrates a complex interaction of hdl and ldl with pro-inflammatory spla2s, which affects mitogenesis, eicosanoid release and migration of vsmc. study of biocompatible spla2 blockers in the therapy of atherosclerosis is indicated. contact information: professor waldemar pruzanski, university of toronto, department of medicine, toronto, ontario, canada; e-mail: drwpruzanski@bellnet.ca. (1) (1) ipmc-cnrs umr6097, valbonne, france (2) university of washington, seattle, usa (3) inserm umrs 525, paris, france (4) university of naples, italy. the superfamily of phospholipase a2 comprises at least 9 intracellular enzymes and up to 12 secreted pla2s (spla2s). elucidating the biological roles of each pla2 member is currently the most challenging issue in the pla2 field.
the different spla2s are not isoforms and are likely to function either as enzymes producing key lipid mediators (eicosanoids and lysophospholipids) or as ligands that bind to specific soluble or membrane-bound proteins (like cytokines). increasing evidence suggests that spla2s iia, iii, v, and x are involved in inflammatory diseases including atherosclerosis. among spla2s, the human group x (hgx) enzyme has the highest enzymatic activity towards phosphatidylcholine, the major phospholipid of cellular membranes and low-density lipoproteins (ldl). on human alveolar macrophages, hgx spla2 can trigger secretion of tnf-alpha, il-1 and il-6 in a non-enzymatic manner. on colorectal cancer cells, hgx spla2 stimulates cell proliferation, produces potent eicosanoids including pge2, and activates the transcription of key genes involved in inflammation and cancer. the enzyme can also hydrolyze pc and platelet-activating factor (paf) of ldl particles very efficiently. finally, hgx spla2 is present in human atherosclerotic lesions and converts ldl into a proinflammatory particle that induces macrophage foam cell formation, as well as map kinase activation, arachidonic acid release, and expression of adhesion molecules in huvec cells. some other key molecular features of spla2s, including hgx, will be presented. we have reported preferential release of polyunsaturated fatty acids during hydrolysis of lipoprotein phosphatidylcholine (ptdcho) by spla2s, but the mechanism of this selectivity is not known. since both sphingomyelin (sm) and lysoptdcho inhibit the activity while increasing the fatty acid specificity of other pla2s, we have examined fatty acid release by spla2s iia, v and x in relation to relative increases in the proportion of endogenous sm and lysoptdcho during lipoprotein digestion.
the analyses were performed by normal-phase liquid chromatography with on-line electrospray mass spectrometry (lc/esi-ms) and lc/collision-induced dissociation (cid)/esi-ms, using conventional preparations of ldl and hdl. the highest preference for arachidonate release from ldl by group x spla2 was observed when the residual sm/ptdcho molar ratio had reached 1.2, compared to a starting ratio of 0.4. group v spla2 showed preferential release of linoleate at a residual sm/ptdcho molar ratio of 1.5, while at intermediate ratios, both arachidonate and linoleate were released at more comparable ratios. the relative increases in lysoptdcho and sm during digestion with spla2 iia were much more limited, and a preferential hydrolysis of polyunsaturated fatty acids was not observed. these results suggest a lipid phase separation as a likely basis for the differential hydrolysis of molecular species of ptdcho. the residual sm/ptdcho ratios reached during group v and x spla2 digestion are similar to those observed for lesional ldl, which promote release of ceramides by smase, leading to ldl aggregation. the above findings support a potential role of sphingomyelins in atherogenesis. although sphingomyelin (sm) is one of the most abundant phospholipids in lipoproteins and cell membranes, its physiological significance is unclear. because of its localization in the outer surface of cells, and its structural similarity to phosphatidylcholine (pc), we proposed that it competitively inhibits phospholipolysis of cell membranes by external phospholipases (pla). we showed that sm inhibits several lipolytic enzymes, including secretory pla2 iia, v, and x, and hepatic and endothelial lipases, all of which hydrolyze pc. treatment of sm in the substrate with smase c not only relieved the inhibition but also activated the pla2 reaction further, suggesting that ceramide, the product of smase c, independently stimulates pla2, possibly by disrupting the bilayer structure.
smase d treatment, which produces ceramide phosphate, did not stimulate the spla2. the fatty acid specificity of pla2 is significantly affected by sm. thus spla2 x exhibited enhanced specificity for the release of arachidonic acid (20:4) in the presence of sm, due to a preferential inhibition of hydrolysis of other pc species. in contrast, sm inhibited the release of 20:4 by spla2 v. ceramide selectively stimulated the release of 20:4 by both enzymes. only the long-chain ceramides (> 10 carbons) were effective, while ceramide phosphate did not stimulate spla2 activity. sm-deficient cells released more 20:4 in response to spla2 treatment than normal cells, and pretreatment of normal cells with smase c increased their susceptibility to spla2 attack. these studies show that sm and ceramide regulate the activity and specificity of pla, and consequently the inflammatory response. secretory phospholipase a2 (spla2) types iia, v, and x have been associated with inflammatory diseases and tissue injury, including atherosclerosis, in humans and mice. given the link between spla2 and atherogenesis, a mouse model of atherosclerosis (apoe-/-) was used to study the effects of a-002, an inhibitor of spla2 enzymes, on atherosclerosis and cholesterol levels over 16 weeks of treatment. mice were fed a high-fat, high-cholesterol diet alone during the study (21% fat; 0.15% cholesterol; 19.5% casein) and were treated with vehicle or a-002 bid at 30 mg/kg or 90 mg/kg by oral gavage. total cholesterol was significantly decreased after one month of treatment and remained lower throughout the study. treatment with a-002 significantly reduced aortic atherosclerotic plaque formation in apoe-/- mice fed a high-fat diet, by approximately 40% compared to the untreated control.
in a different model, which used angiotensin ii in conjunction with a high-fat diet in an apoe-/- background for 4 weeks, oral dosing of a-002 (30 mg/kg bid) significantly reduced aortic atherosclerosis and aneurysm rate when compared to vehicle. these data suggest that a-002 is a potential novel therapeutic agent for the treatment of atherosclerosis. (1), s doty (2), c antonescu (3), c staniloae (1) (1) saint vincents hospital manhattan, new york, usa (2) hospital for special surgery, new york, usa (3) sloan-kettering institute for cancer research, new york, usa. tnf-stimulated gene 6 (tsg-6) is induced by tnf-a during inflammation, and its secreted product, tsg-6 glycoprotein, is involved in immune-mediated inflammatory diseases and fertility. it regulates cox-2 and prostaglandin synthesis, and participates in extracellular matrix remodeling. considering the chronic inflammatory nature of atherosclerosis, we hypothesized that tsg-6 is expressed in atherosclerotic plaques, and investigated tsg-6 protein expression and cellular distribution in 12 superficial femoral artery endarterectomy specimens from 6 diabetic and 6 non-diabetic patients with peripheral vascular disease. six histologically normal radial artery specimens were analyzed as controls. paraffin-embedded samples were studied by immunohistochemistry using a goat polyclonal anti-human-tsg-6 antibody. tsg-6 expression was consistently present in all atherectomy specimens but not in control specimens. a distinct, strong cytoplasmic staining pattern was uniformly detected in the endothelial lining of the intima, as well as in the neo-vessel proliferation of the plaque. cytoplasmic staining was also identified in the smooth muscle cell proliferation of the neo-intima. patchy tsg-6 expression was noted in the extracellular matrix. within the inflammatory plaques from diabetic patients, tsg-6 stained the foamy macrophages.
tsg-6 expression was also confirmed and quantified by qrt-pcr, which showed a significant up-regulation of the tsg-6 gene (more than 100-fold induction compared to housekeeping genes). our study identifies for the first time the preferential expression of tsg-6 in atherosclerotic lesions and characterizes its distribution within the cellular and matrix components of the plaque. tsg-6 is a novel inflammatory mediator of atherosclerosis and a potentially new marker of endothelial/smooth muscle cell activation. (1), r krohn (1), h lue (1), jl gregory (2), a zernecke (1), rr koenen (1), t kooistra (3), p ghezzi (4), r kleemann (3), r bucala (5), mj hickey (2), c weber (1) (1) university hospital of the rwth aachen, germany (2) centre for inflammatory diseases, monash university, melbourne, australia. mediators which cannot be classified into chemokine subfamilies but share functional patterns, e.g. signaling through chemokine receptors, constitute a group termed chemokine-like function (clf) chemokines. the pleiotropic cytokine macrophage migration inhibitory factor (mif) plays a critical role in inflammatory diseases and atherogenesis. the underlying molecular mechanisms are poorly understood but, interestingly, mif displays structural features resembling chemokines. we have identified the chemokine receptor cxcr2 as a functional receptor for mif. mif triggered galphai/integrin-dependent arrest and chemotaxis of monocytes specifically through cxcr2, inducing rapid integrin activation. mif directly bound to cxcr2 with high affinity (kd of 1.4 nm). monocyte arrest mediated by mif in inflamed or atherosclerotic arteries involved cxcr2 as well as cd74, a recently identified membrane receptor moiety for mif. accordingly, cxcr2 and cd74 were found to occur in a receptor complex. in vivo, mif deficiency impaired monocyte adhesion to the aortic/arterial wall in atherosclerosis-prone mice, as evidenced by intravital microscopy.
thus, mif displays chemokine-like functions by acting as a non-cognate ligand of cxcr2, serving as a regulator of inflammatory and atherogenic recruitment. these data harbor an intriguing novel therapeutic prospect of targeting mif in atherosclerosis and add a new dimension to mif and chemokine receptor biology. (1), r toes (3), h van bockel (2), paul quax (1,2) (1) tno biosciences, leiden, the netherlands (2) department of vascular surgery, leiden university medical center, the netherlands (3) department of rheumatology, leiden university medical center, the netherlands. the immune system is thought to play a crucial role in regulating collateral circulation (arteriogenesis), a vital compensatory mechanism in patients with arterial obstructive disease. here, we studied the role of lymphocytes in a murine model of arteriogenesis after acute hind limb ischemia. arteriogenesis was impaired in c57bl/6 mice depleted of natural killer (nk) cells by anti-nk1.1 antibodies and in nk-cell-deficient transgenic mice. arteriogenesis was, however, unaffected in jα281-knockout mice that lack nk1.1+ natural killer t (nkt) cells, indicating that nk cells, rather than nkt cells, are involved in arteriogenesis. furthermore, arteriogenesis was impaired in c57bl/6 mice depleted of cd4+ t-lymphocytes by anti-cd4 antibodies, and in major histocompatibility complex (mhc) class-ii-deficient mice that lack mature peripheral cd4+ t-lymphocytes. this impairment was even more profound in anti-nk1.1-treated mhc class-ii-deficient mice that lack both nk cells and cd4+ t-lymphocytes. finally, collateral growth was severely reduced in balb/c as compared with c57bl/6 mice, two strains with a different bias in immune responsiveness. correspondingly, fewer cd3-positive lymphocytes accumulated around collaterals in balb/c mice. these data show that both nk cells and cd4+ t-cells play an important role in arteriogenesis.
moreover, our data hold promise for the development of novel clinical interventions, as promoting lymphocyte activation might represent a powerful method to treat ischemic disease. post-interventional vascular remodeling in venous bypass grafts, seen as intimal hyperplasia (ih) and accelerated atherosclerosis, often causes graft failure. inflammation is an important trigger for these processes. complement is an important part of the immune system and participates in regulating inflammation. although involved in several other inflammatory diseases, the role of the complement cascade in vein graft remodeling is unknown. the involvement of the complement system in vein graft disease was studied here using a model in which caval veins are grafted into the carotid arteries of hypercholesterolemic apoe3leiden mice. in these veins, ih and accelerated atherosclerotic lesions develop within 28 days, consisting mainly of foam cells and smc. to study the functional role of complement in vein graft remodeling, cobra venom factor (cvf: 20 u daily) was used to deplete complement, starting one day prior to vein graft surgery. cvf treatment reduced vein graft thickening by 64% (p=0.02) when compared to saline-treated controls (n=6). to confirm that the reduction by cvf was due to hampered complement function and not a direct effect of cvf, complement activation was blocked using crry-ig (inhibiting c3 convertases). crry-ig (3 mg every other day) led to a 50% decrease in vein graft thickening (p=0.03) compared to controls receiving non-relevant control igg. these data prove that complement activation plays a major role in intimal hyperplasia and accelerated atherosclerosis in vein grafts.
(1), m-c koutsing tine (1), p borgeat (2), h ong (1), sylvie marleau (1) (1) universite de montreal, quebec, canada (2) centre de recherche en rhumatologie et immunologie, canada. we have previously shown that ep 80317, a growth hormone-releasing peptide (ghrp) analogue binding selectively to the scavenger receptor cd36, elicits a striking reduction in atherosclerosis development in apolipoprotein e-deficient (apoe-/-) mice, a condition associated with increased circulating numbers of primed/activated leukocytes. we investigated the effect of ghrp analogues on i/r-elicited remote lung injury in 18-week-old apoe-/- mice fed a high-fat, high-cholesterol (hfhc) diet from 4 weeks of age. at 18 weeks of age, mice were treated daily with a s.c. injection of ep 80317 (300 mg/kg) for 14 days and were then subjected to unilateral hindlimb ischemia (by rubber band application) for 30 minutes, followed by 180 minutes of reperfusion. our results show that ep 80317 significantly reduced leukocyte accumulation in the lungs by 56%, from 3.6 (± 0.7) in vehicle-treated mice to 1.6 (± 0.6) x 10^7 leukocytes/g lung in ep 80317-treated mice (n = 6-7 per group), as assessed by myeloperoxidase assay. this was associated with a 56% reduction of opsonized zymosan-elicited blood chemiluminescence. in contrast, neither blood chemiluminescence nor leukocyte accumulation in the lungs was significantly modulated in apoe-/-/cd36-/- mice, from 1.9 (± 0.5) in vehicle-treated mice to 1.9 (± 0.8) x 10^7 leukocytes/g lung in ep 80317-treated mice. we conclude that ep 80317 protects against i/r-elicited circulating leukocyte priming/activation and remote lung injury, possibly through a cd36-mediated pathway. glycogen synthase kinase 3beta (gsk-3beta) is a serine/threonine protein kinase that has recently emerged as a key regulatory switch in the modulation of the inflammatory response. dysregulation of gsk-3beta has been implicated in the pathogenesis of several diseases, including sepsis.
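the percent reductions reported in several abstracts here (e.g. leukocyte accumulation falling from 3.6 to 1.6 x 10^7 leukocytes/g lung, reported as 56%) follow from the usual relative-change arithmetic. a minimal python sketch; the helper name is illustrative, not from the source:

```python
def percent_reduction(control: float, treated: float) -> float:
    """Relative reduction of `treated` versus `control`, as a percentage."""
    return 100.0 * (control - treated) / control

# vehicle-treated vs ep 80317-treated mice (x 10^7 leukocytes/g lung)
reduction = percent_reduction(3.6, 1.6)
print(round(reduction))  # 56, matching the reported 56% reduction
```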
here we investigate the effects of two chemically distinct inhibitors of gsk-3beta, tdzd-8 and sb216763, on the circulatory failure and the organ injury and dysfunction associated with hemorrhagic shock. male wistar rats were subjected to hemorrhage (sufficient to lower mean arterial blood pressure to 35 mmhg for 90 min) and subsequently resuscitated with shed blood for 4 h. hemorrhage and resuscitation resulted in an increase in serum levels of (a) creatinine and, hence, renal dysfunction, and (b) alanine aminotransferase and aspartate aminotransferase and, hence, hepatic injury. treatment of rats with either tdzd-8 (1 mg/kg, i.v.) or sb216763 (0.6 mg/kg, i.v.) 5 min before resuscitation abolished the renal dysfunction and liver injury caused by hemorrhagic shock. the protection afforded by these compounds was confirmed by histological observations of lung, kidney and liver samples. in addition, tdzd-8, but not sb216763, attenuated the increase caused by hemorrhage and resuscitation in plasma levels of the proinflammatory cytokine interleukin-6. neither of the gsk-3beta inhibitors, however, affected the delayed fall in blood pressure caused by hemorrhagic shock. thus, we propose that inhibition of gsk-3beta may represent a novel therapeutic approach in the therapy of hemorrhagic shock. (1), y ito (1), h yoshimura (1), h inoue (1), n kurouzu (1), h hara (1), y mastui (1), h kitasato (1), s narumiya (2), c yokoyama (3), m majima (1) (1) kitasato university school of medicine, japan (2) kyoto university school of medicine, japan (3) tokyo medical and dental university, japan. thromboxane (tx) a2 is a potent stimulator of platelet activation and aggregation and of vascular constriction. we have reported that cytokine-mediated release of sdf-1 from platelets and the recruitment of nonendothelial cxcr4+ vegfr1+ hematopoietic progenitors constitute the major determinant of revascularization.
we hypothesized that txa2 induces an angiogenic response by stimulating the release of sdf-1 and vegf derived from platelet aggregation. to evaluate this hypothesis, we dissected the role of txa2 in the angiogenic response using mouse hind limb ischemia. recovery from acute hind limb ischemia was assessed in wild-type mice (c57bl/6 wt), prostaglandin i2 receptor (ip) knockout mice (ipko) and thromboxane (tx) a2 receptor (tp) knockout mice (tpko) by laser doppler. blood flow recovery in tpko was significantly delayed compared to wt and ipko. immunohistochemical studies revealed that the number of cd31-positive cells in the ischemic quadriceps was lower in tpko compared to wt and ipko. plasma sdf-1 and vegf concentrations were significantly reduced in tpko mice. during in vivo fluorescence microscopy we observed that, compared to tpko, ipko and wt showed significantly increased platelet attachment to the microvessels around the ligated area. tpko mice transplanted with wt bone marrow cells showed increased blood flow recovery compared to tpko mice transplanted with tpko bone marrow cells. in addition, mice injected with fibroblasts expressing txa2 synthase cdna showed increased blood flow recovery compared to control mice. these results suggest that tp signaling rescues the ischemic condition by inducing angiogenesis through the secretion of sdf-1 and vegf from platelet aggregation. administration of a selective tp agonist may open a new therapeutic strategy in regenerative cardiovascular medicine. during renal ischemia/reperfusion (i/r) injury, apoptosis has been reported as a very important contributor to the final kidney damage. the determinant role of cytoskeleton derangement in the development of apoptosis has been previously reported, but a clear description of the different mechanisms involved in this process has not yet been provided. the aim of the study is to determine the role of peroxynitrite as an inducer of cytoskeleton derangement and apoptosis during the inflammatory process associated with renal ischemia-reperfusion.
based on a rat kidney i/r model, with experiments in which both the actin cytoskeleton and peroxynitrite generation were pharmacologically manipulated, the results indicate that the peroxynitrite produced during the i/r-derived oxidative stress state is able to provoke cytoskeleton derangement and the development of apoptosis. thus, control of peroxynitrite generation during i/r could be an effective tool for limiting cytoskeleton damage and reducing the incidence of apoptosis in renal i/r injury. metabolomics, the global profiling of metabolites, may inform about the multiple interacting processes involved in inflammatory disease. using nmr spectroscopy, we analysed metabolite fingerprints in serum from early arthritis, and at a site of inflammation, in the posterior segment of the eye. serum from 101 patients with synovitis of less than 3 months' duration, whose outcome was determined at clinical follow-up, was used. vitreous samples were from 31 patients undergoing vitrectomy for vitreoretinal disorders. one-dimensional 1h nmr spectra were acquired. principal components analysis (pca) of the processed data was conducted, along with a supervised classification. with the arthritis serum there was a clear relationship between each sample's score in the pca analysis and the level of crp. supervised classification of the initial samples was able to predict outcome, whether rheumatoid arthritis, other chronic arthritis or self-limiting arthritis, with high specificity and sensitivity. a similar approach using the eye fluids was able to give a clear discrimination between two pathologically similar conditions, lens-induced and chronic uveitis. in this case the differences were not due to a straightforward relationship with inflammatory markers (il-6, ccl2), which did not correlate with the pca in these samples. similarly, certain molecules, such as lactate, were associated with ocular disease, but not rheumatoid arthritis.
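the unsupervised step of the metabolomics analysis above (pca scores per serum spectrum, later related to crp levels) can be sketched on synthetic data. everything below is an illustrative assumption, not the study's actual pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)
# rows = samples (e.g. serum spectra), columns = chemical-shift bins
spectra = rng.normal(size=(20, 100))
spectra[:10] += 1.0  # pretend one patient group has a shifted metabolite profile

# mean-centre, then an svd gives the principal-component scores
centred = spectra - spectra.mean(axis=0)
u, s, vt = np.linalg.svd(centred, full_matrices=False)
scores = u * s  # sample scores; scores[:, 0] is the pc1 score per sample

print(scores.shape)  # (20, 20)
```

with the built-in group shift, the pc1 scores of the two synthetic groups separate, mimicking how pca scores can track a covariate such as crp.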
these results suggest that the underlying inflammatory processes may differ in these conditions, or may reflect predisposing metabolic patterns in individual patients. 1h-nmr-based metabolomics may provide a useful measure of outcome in inflammatory diseases and give novel insights into the pathological processes involved. (1), am artoli (2), a sequeira (2), c saldanha (1) (1) instituto de medicina molecular, faculdade de medicina de lisboa, portugal (2) cemat, instituto superior técnico, universidade técnica de lisboa, portugal. the recruitment of leukocytes from the blood stream and their subsequent adhesion to endothelial walls are essential stages of the immune response during inflammation. the precise dynamic mechanisms by which molecular mediators facilitate leukocyte arrest are still unknown. in this study, combined experimental results and computer simulations are used to investigate the localized hydrodynamics of the individual and collective behaviour of clusters of leukocytes. leukocyte-endothelial cell interactions in post-capillary venules of the wistar rat cremaster muscle were monitored by intravital microscopy. the haemorheological and haemodynamic parameters measured in these experiments were used in time-dependent, three-dimensional computer simulations, using a mesoscopic lattice boltzmann solver for shear-thinning fluids. the dynamics of leukocyte clusters under non-newtonian blood flow with shear-thinning viscosity was computed and discussed. in this paper we present quantified distributions of velocity and shear stress on the surface of leukocytes and near vessel wall attachment points. we have also observed one region of maximum shear stress and two regions of minimum shear stress on the surface of leukocytes close to the endothelial wall. we verified that the collective hydrodynamic behaviour of the cluster of recruited leukocytes provides a strong stimulus for additional leukocyte recruitment.
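the simulations above assume a shear-thinning viscosity for blood; one common constitutive choice for such a law (not stated in the abstract, assumed here for illustration) is the carreau model, with parameter values that are typical literature values for blood rather than values from this study:

```python
def carreau_viscosity(shear_rate, mu0=0.056, mu_inf=0.00345, lam=3.313, n=0.3568):
    """Apparent viscosity (Pa*s) of a Carreau fluid at a given shear rate (1/s).

    mu0/mu_inf are the low-/high-shear plateaus, lam a relaxation time,
    n the power-law index; n < 1 gives shear-thinning behaviour.
    """
    return mu_inf + (mu0 - mu_inf) * (1.0 + (lam * shear_rate) ** 2) ** ((n - 1.0) / 2.0)

mu_low = carreau_viscosity(0.0)      # equals mu0, the low-shear plateau
mu_high = carreau_viscosity(1000.0)  # much closer to mu_inf at high shear
print(mu_low, mu_high)
```

in a lattice boltzmann solver, such a law is typically applied per lattice node: the local shear rate is computed from the strain-rate tensor and the relaxation time is updated from the resulting viscosity.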
it was found that the lattice boltzmann solver used here is fully adaptive to the measured experimental parameters. this study suggests that the influence of rolling leukocytes on the increase in endothelial wall shear stress may support the activation of further signalling mediators during inflammation. macrophages are essential for host defence, but when excessively and persistently activated, these cells contribute to the initiation and progression of inflammatory diseases such as rheumatoid arthritis. investigating the function of inflammatory genes in macrophages may identify novel therapeutic targets for inflammatory diseases. one family of transcripts that are highly expressed in activated macrophages are members of the schlafen (slfn) gene family, a recently identified family whose function is still unknown. this study examined the mrna expression of slfn genes in activated bone marrow-derived macrophages in vitro, and in collagen-induced arthritis (cia) in vivo. real-time pcr expression analyses of bone marrow-derived macrophages stimulated with lipopolysaccharide (lps) over a time course revealed differential expression of individual slfn family genes. in particular, slfn-1, slfn-2, and slfn-4 were maximally induced after 24 hours. the maximal induction of slfn-8 and slfn-9 was observed after 4 hours of lps treatment. individual members of the slfn family were also differentially expressed in cia, a model of rheumatoid arthritis. mrna levels of slfn-1, slfn-2, slfn-4 and slfn-5 were elevated in joints affected by cia. to investigate the role of slfn-4, we have generated a transgenic mouse line which overexpresses slfn-4 specifically in cells of the mononuclear phagocyte system, using a novel binary expression system based on the c-fms promoter and gal4. further characterisation of the slfn-4-overexpressing mouse line will be used to assess the function of slfn-4 in macrophage biology and inflammation, and its potential as a therapeutic target. 
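relative induction in real-time pcr time courses of this kind is commonly quantified with the 2^-ΔΔct method (livak and schmittgen); a minimal sketch, with the ct values chosen purely for illustration:

```python
def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    """Relative expression by the 2^-delta-delta-Ct method.

    Each Ct is a qPCR threshold cycle; the reference gene (e.g. a
    housekeeping gene) normalizes for input cDNA amount.
    """
    d_treated = ct_target_treated - ct_ref_treated   # delta-Ct, treated
    d_control = ct_target_control - ct_ref_control   # delta-Ct, control
    return 2.0 ** -(d_treated - d_control)           # 2^-(delta-delta-Ct)

# illustrative numbers only: a target whose Ct drops by 3 cycles relative
# to the reference corresponds to an 8-fold induction
print(fold_change(22.0, 18.0, 25.0, 18.0))  # -> 8.0
```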
macrophages play an important role in resolving inflammation. it is known that the resolution of inflammation requires alternative activation of macrophages, but the precise events of phenotype switching in macrophages remain poorly understood. we show that lipocalin 2 (lcn-2) is able to provoke a switch in macrophage activation. in an in vitro co-culture model of renal epithelial cells and macrophages, we found using an sirna approach that the presence or absence of lcn-2 determines proliferation processes in damaged renal epithelial cells. the proliferative response was dependent on a pro-inflammatory or anti-inflammatory environment. as lcn-2 is an acute phase protein synthesized during inflammation and up-regulated in a number of pathological conditions, it may play an important role in survival and regeneration. we anticipate that our results could be relevant for further research on the mechanisms of the phenotype switch induced by lcn-2. (1), y cao (1), s adhikari (1), m wallig (2) (1) national university of singapore, department of pharmacology, singapore (2) university of illinois at urbana champaign, usa it has earlier been shown that the extent of apoptotic acinar cell death is inversely related to the severity of acute pancreatitis. our previous work has demonstrated that induction of pancreatic acinar cell apoptosis by crambene protects mice against acute pancreatitis. the current study aims to investigate the role of phagocytic receptors and the anti-inflammatory effect of phagocytosis in the protection of mice against acute pancreatitis by crambene. acute pancreatitis was induced in the mouse by administering hourly injections of caerulein (50 mg/kg) for 3, 6 and 10 hours respectively. neutralizing monoclonal anti-il-10 antibody (2.5 mg/kg) was administered either with or without crambene (70 mg/kg) 12 hours before the first caerulein injection. rt-pcr, western blotting and immunostaining were performed to detect cd36 expression. 
apoptosis in pancreatic sections was visualized by tunel. the severity of acute pancreatitis was evaluated by estimation of serum amylase, pancreatic myeloperoxidase (mpo), water content, and morphological examination. pancreatic levels of inflammatory mediators were examined by elisa. the protective effect of crambene is mediated by reducing the production of pro-inflammatory cytokines such as mcp-1, tnf-α and il-1β and up-regulating anti-inflammatory mediators like il-10. phagocytotic clearance in mouse acute pancreatitis may occur essentially through the macrophage surface receptor cd36. the anti-inflammatory mediator il-10 plays an important role in the crambene-induced protective action against acute pancreatitis, and its release is downstream of phagocytosis. these results show that induction of pancreatic acinar cell apoptosis by crambene treatment protects mice against acute pancreatitis via induction of anti-inflammatory pathways. (1,2) (1) northern ontario school of medicine, thunder bay, ontario, canada (2) lakehead university, canada integrin receptors and their ligands are involved in the adhesion and internalization of several human pathogens, including pseudomonas aeruginosa. we have recently established that beta1 integrins in lung epithelial cells (lec) provide co-stimulatory signals regulating inflammatory responses (ulanova et al, am j physiol, 2005, 288: l497-l507). we hypothesized that lec integrins serve as receptors to recognize pathogen-associated molecules and mediate the innate immune response to p. aeruginosa. to determine the molecular mechanisms of integrin involvement in innate immunity, we used an in vitro model of p. aeruginosa infection of a549 cells. to investigate interactions of bacteria with lec, p. 
aeruginosa strain pak was chromosomally labeled with a green fluorescent protein gene using a mini-tn7 delivery system. using several fluorescence-based detection systems, we established that the natural beta1 integrin ligand, fibronectin, mediates bacterial adhesion to lec. p. aeruginosa infection caused rapid transcriptional upregulation of alphav and beta4 integrin expression, followed by increased cell surface protein expression. the surface expression of integrin beta1 increased shortly after bacterial exposure without alterations in mrna expression, suggesting rapid protein redistribution within the cells. the data indicate that p. aeruginosa is capable of modulating integrin gene/protein expression in lec, potentially using fibronectin to mediate bacterial binding to beta1 integrins. upon their engagement, integrin receptors can initiate intracellular signaling involved in innate immune and inflammatory responses to the pathogen. integrin receptors in lec may represent significant therapeutic targets in pulmonary infection caused by p. aeruginosa. the purine nucleoside adenosine has a major modulatory impact on the inflammatory and immune systems. neutrophils, which are generally the first cells to migrate toward lesions and initiate host defense functions, are particularly responsive to the action of adenosine. through activation of the a2a receptor (a2ar) present on neutrophils, adenosine inhibits phagocytosis, generation of cytotoxic oxygen species, and adhesion. also, recent work has shown that adenosine can transform the profile of lipid mediators generated by neutrophils, inhibiting leukotriene b4 formation while potentiating that of prostaglandin e2 through up-regulation of the cyclooxygenase (cox)-2 pathway. moreover, our laboratory determined that a2ar engagement can dramatically modulate the generation and secretion of neutrophil-derived cytokines/chemokines, including tnf-α and mips. 
in mice lacking the a2ar, migrated neutrophils expressed less cox-2 than their wild-type counterparts while displaying higher mrna levels of tnf-α and mip-1. mononuclear cells from a2ar knock-out mice, which eventually replace neutrophils in the air pouch, also displayed a more pro-inflammatory phenotype than those from wild-type animals. signal transduction experiments, aiming to delineate the intracellular events leading to the modulation of neutrophil functions following a2ar engagement, implicate pivotal metabolic pathways such as intracellular cyclic amp, p38 and pi-3k. together, these results indicate that adenosine may have a profound and multi-pronged influence on the phenotype of neutrophils and present this cell as pivotal in mediating adenosine's anti-inflammatory effects. the newest developments regarding adenosine's effects on neutrophil functions will be presented. this work is supported by the canadian institutes of health research (cihr). human skin serves not only as a physical barrier against infection, but also as a "chemical barrier" by constitutively and inducibly producing antimicrobial proteins (amps). to identify human skin amps, we analysed extracts of healthy persons' stratum corneum by reversed phase-hplc and purified a novel 9 kda amp that showed sequence similarity to mouse hornerin. suspecting that it originates from the human ortholog, we cloned the latter. human hornerin encodes a 2850 amino acid protein that contains an s100 domain, an ef-hand calcium-binding domain, a spacer sequence and two types of tandem repeats, suggesting that it represents a novel member of the fused s100 protein family. the strongest constitutive hornerin mrna expression was seen in differentiated keratinocyte cultures. 
to follow the hypothesis that hornerin fragments represent amps, we recombinantly expressed three hornerin peptides, rhrnr2 (tandem repeat unit b), rhrnr3 (tandem repeat unit a) and rhrnr4 (c-terminus), and subsequently analysed their antimicrobial activity using a microdilution assay system. the rhrnr3 peptide, containing the sequence motif found in the natural hornerin fragment isolated from stratum corneum, exhibited antimicrobial activity at low micromolar concentrations against escherichia coli, pseudomonas aeruginosa and candida albicans. the other peptides were found to be inactive or only weakly active. our results suggest that hornerin may have an as yet unknown protective function in healthy human skin as part of the "chemical barrier" of preformed amps, which are generated from parts of the tandem repeats of a hornerin precursor molecule by an as yet unknown cleavage mechanism. (1), n lu (1), r jonsson (2), d gullberg (1) (1) department of biomedicine, university of bergen, norway (2) the gade institute, university of bergen, norway α11β1 is the latest addition to the integrin family of heterodimeric receptors for the extracellular matrix. previously, it has been shown that this collagen receptor takes part in processes such as cell migration and matrix contraction. in this study we investigated the factors that regulate mouse integrin α11β1 expression. specifically, we analyzed the influence of cell passage, growth factors and the 3-d microenvironment. using sv40-immortalized as well as primary fibroblasts, we show that α11β1 integrin is up-regulated when these cells are cultured within stressed collagen type i lattices. however, α11β1 is down-regulated when the collagen gels are made under relaxed conditions, allowing the cells to contract the lattice. we also show that α11 is up-regulated by tgf-β on planar substrates. 
these findings suggest that mechanical tension and tgf-β are important factors in the regulation of α11β1 that need to be taken into consideration when evaluating the role of α11β1 in wound healing and fibrotic disorders. (1), n vergnolle (1), p andrade-gordon (2) (1) inflammation research network, university of calgary, canada (2) rw johnson pharmaceutical research institute, canada the objective of this study was to investigate the effects of par2 deficiency in various models of colonic inflammation in order to elucidate the role of endogenous par2 in the process of inflammation in the gut. colonic inflammation in c57bl/6 wild-type and par2-/- mice was induced by treatment with 2.5% dss (in drinking water) or tnbs (1 mg or 2 mg in 100 ul of 40% ethanol, single intracolonic injection) or by pre-sensitizing mice with 3% oxazolone (in olive oil) applied to the skin of the abdomen and, 7 days later, a single intracolonic injection of 1% oxazolone (dissolved in 50% ethanol). intravital microscopy was performed on the colonic venules 7 days (tnbs/dss) or 4 days (oxazolone) after induction of colitis to assess changes in leukocyte rolling, adhesion and vessel diameter. lastly, various parameters of inflammation were assessed following the intravital microscopy. par2-/- mice showed significantly lower leukocyte adherence and vessel dilation compared to the wild-type mice after dss and tnbs challenge. in all three challenges, mpo activity, macroscopic damage score and bowel thickness were significantly higher in wild-type mice compared to par2-/- mice. our evidence indicates that deficiency in par2 attenuates inflammatory responses in experimental models of colitis associated with either a th1 (tnbs/dss) or th2 (oxazolone) cytokine profile. therefore, par2 deficiency in the gut exerts anti-inflammatory properties that are independent of the th1 or th2 cytokine profile. the present study further highlights par2 as a potential target for inflammatory bowel diseases. 
(1), n vergnolle (1), p andrade-gordon (2) (1) inflammation research network, university of calgary, canada (2) rw johnson pharmaceutical research institute, canada in a previous study, inflammatory responses induced by three different models of colitis (tnbs/dss/oxazolone) were significantly attenuated in mice deficient for par2 (par2-/-). among the inflammatory parameters observed, infiltration of granulocytes into the colon was consistently reduced by par2 deficiency. the aim of this study was to assess the effects of par2 deficiency (via par2-/- mice) on the recruitment of leukocytes in colonic venules. in anaesthetized animals, leukocyte rolling/adherence and vasodilation were induced by topical administration of fmlp (100 mm) or paf (100 nm) or by intraperitoneal injection of tnf-α (0.5 mg, given 3 hours before the intravital microscopy). using intravital microscopy, we evaluated the ability of various leukocyte stimuli to induce leukocyte trafficking and vasodilation in colonic venules of par2-/- versus par2+/+ mice. fmlp and paf as well as tnf-α induced significant vasodilation and an increase in rolling/adhesion of leukocytes in mouse colonic venules. par2-/- mice showed significantly lower leukocyte rolling compared to the wild-type mice in response to topical fmlp administration. leukocyte adherence induced by fmlp and tnf-α was significantly lower in par2-/- mice compared to wild types as well. no difference was observed between par2-/- and wild-type mice for leukocyte rolling/adherence induced by paf. the lack of functional par2 attenuated leukocyte trafficking in response to fmlp and tnf-α but not to paf. the involvement of par2 activation in mouse colon leukocyte trafficking highlights par2 as an important mediator of inflammatory cell recruitment and thereby a potential target for the treatment of inflammatory bowel diseases. 
(1), kk hansen (2), k chapman (2), n vergnolle (2), ep diamandis (1), md hollenberg (2) (1) advanced center for detection of cancer, mount sinai hospital, university of toronto, toronto, on, canada (2) proteinases and inflammation network, university of calgary, calgary, ab, canada kallikreins (klks) are secreted serine proteinases identified in many cancers and in multiple sclerosis lesions. we have recently shown that klks can activate proteinase-activated receptors (pars), a family of g-protein-coupled receptors associated with inflammation. we hypothesized that, like trypsin, kallikreins can trigger inflammation via the pars. we studied the ability of klks 6 and 14 to activate pars 1, 2 and 4 in vitro and to cause oedema in a mouse model of paw inflammation in vivo. we found that klk14 is able to activate both pars 2 and 4 and to prevent thrombin from activating par1. on the other hand, klk6 was a specific activator of par2. kallikrein administration in vivo resulted in a paw oedema response comparable in magnitude and time course to that generated by trypsin. the oedema was accompanied by a decreased threshold of mechanical and thermal nociception. our data demonstrate that, by activating pars 2 and 4 and by inactivating par1, kallikreins such as klks 6 and 14 may play a role in regulating the inflammatory response and the perception of pain. (1), d park (1), b short (2), n brouard (2), p simmons (2), s graves (1), j hamilton (1) (1) melbourne university, melbourne, victoria (2) peter maccallum cancer institute, melbourne, australia mouse mesenchymal stem cell-enriched populations can be isolated from bone tissue by employing lineage immuno-depletion followed by fluorescence-activated cell sorting based on cell surface expression of the sca-1 antigen. such isolated cells can subsequently be cultured and differentiated towards the osteogenic, adipogenic or chondrogenic lineage in vitro. 
using this model we investigated the influence of the pro-inflammatory cytokines tnf-α and il-1β on early osteogenesis in vitro. under osteogenic conditions, il-1β was found to inhibit cell proliferation in a dose-dependent manner, whereas tnf-α exhibited no effect. histochemical examination revealed that the presence of either tnf-α or il-1β dramatically decreased mineralization in a dose-dependent manner. q-pcr analysis indicated that in the presence of il-1β, despite increased expression of bone-specific alkaline phosphatase (akp2) mrna, the levels of other osteogenesis markers (runx2, col1a and sp7) were decreased. in the presence of tnf-α, levels of akp2, runx2 and sp7 were all decreased. our findings indicate that the contribution of early mesenchymal progenitor cells to bone remodelling may be substantially altered in the presence of pro-inflammatory cytokines. using are-driven and nf-κb-targeted reporter genes, transfection of the nf-κb p65 subunit and nrf2 into hepg2 or other cells, as well as an sirna technique to knock down endogenous p65, we found that the nf-κb p65 subunit represses the anti-inflammatory and anticarcinogenic nrf2-are pathway at the transcriptional level. in p65-overexpressing cells, the are-dependent expression of heme oxygenase-1 was strongly repressed. in cells where nf-κb and nrf2 were simultaneously activated, p65 unidirectionally antagonized nrf2 transcriptional activity. the p65-mediated are inhibition was independent of the transcriptional and dna-binding activities of p65. 
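dose-dependent inhibition of the kind reported above is typically summarized by fitting a sigmoidal dose-response curve; a minimal sketch using a four-parameter logistic model, with all parameter values made up for illustration (the abstract reports no fitted constants):

```python
def four_pl(dose, bottom, top, ic50, hill):
    """Four-parameter logistic (4PL) dose-response model.

    bottom/top: plateau responses at saturating/zero inhibitor,
    ic50: dose giving the half-maximal effect, hill: slope factor.
    All parameter values used below are hypothetical.
    """
    return bottom + (top - bottom) / (1.0 + (dose / ic50) ** hill)

# at dose == ic50 the response is exactly halfway between the plateaus
print(four_pl(10.0, 0.0, 100.0, 10.0, 1.0))  # -> 50.0
```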
co-transfection and rna interference experiments revealed two mechanisms that coordinate the p65-mediated repression of the are: (1) p65 selectively deprives nrf2, but not mafk, of creb-binding protein (cbp) by competitive interaction with the ch1-kix domain of cbp, resulting in inactivation of the nrf2 transactivation domain and concomitant abrogation of the nrf2-stimulated coactivator activity of cbp; (2) p65 promotes recruitment of histone deacetylase 3 (hdac3) to the are by enhancing the interaction of hdac3 with either cbp or mafk, leading to inactivation of cbp and deacetylation of mafk. this study may establish a novel pro-inflammatory and pro-carcinogenic model for the transrepression of are-dependent gene expression by the p65 subunit. since various inflammatory and tumor tissues constitutively overexpress p65 in their nuclei, the findings of this study imply that a strong repression of are-dependent gene expression must take place in those tissues. in this regard, the findings of this study may help to explain why oxidative stresses and toxic insults usually occur in those pathological loci. dendritic cells (dc) play a pivotal role in the induction of immune response and tolerance. it is less well known that dc accumulate in atherosclerotic arteries, where they might activate t-cells and contribute to the progression of disease. the serine protease thrombin is the main effector protease of the coagulation cascade. thrombin is also generated at sites of vascular injury and during inflammation. hence, thrombin generation is observed within atherosclerotic and other inflammatory lesions, including rheumatoid arthritis. thrombin activates various cells via protease-activated receptors (pars). immature dc do not express pars. upon maturation with lps, tnf-α, or cd40l, only lps-matured dc expressed par1 and par3 on their surface. 
stimulation of dc with thrombin or par1- or par3-activating peptides elicited actin polymerization and concentration-dependent chemotactic responses in lps-matured, but not in tnf-α-matured, dc. the thrombin-induced migration was a true chemotaxis as assessed by checkerboard analysis. stimulation of pars with thrombin or the respective receptor-activating peptides led to activation of erk1/2 and rho kinase i (rock-i) as well as subsequent phosphorylation of the regulatory myosin light chain 2 (mlc2). the erk1/2- and rock-i-mediated phosphorylation of mlc2 was indispensable for the par-mediated chemotaxis, as shown by use of pharmacological inhibitors of rock, erk and mlc kinases. in addition, thrombin significantly increased the ability of mature dc to activate proliferation of naive t-lymphocytes in mixed leukocyte reactions. in conclusion, our work demonstrates the expression of functionally active thrombin receptors on lps-matured dc. we identified thrombin as a potent chemoattractant for mature dc, acting via rho/erk signaling pathways. data concerning the role of circulating modified low density lipoproteins (modldl) in atherogenesis and other pathologies are scarce. one reason for this is the lack of suitable radiolabeling methods for direct assessment of the metabolic pathways of modldl in vivo. we report a novel approach for specific labeling of human native ldl (nldl) and modldl (iron-, hypochlorite- and myeloperoxidase-oxidized, nitrated, glycated, and homocysteinylated ldl) with the positron emitter fluorine-18 (18f) by either nh2-reactive n-succinimidyl-4-[18f]fluorobenzoate or sh-reactive n-[6-(4-[18f]fluorobenzylidene)-aminooxyhexyl]maleimide (radiochemical yields, 15-40%; specific radioactivity, 150-400 gbq/mmol). radiolabeling itself caused neither additional oxidative structural modifications of ldl lipids and proteins nor adverse alterations of their biological activity and functionality in vitro. 
the approach was evaluated with respect to binding and uptake of 18f-nldl and 18f-modldl in cells overexpressing various lipoprotein-recognizing receptors. the metabolic fate of 18f-nldl and 18f-modldl in vivo was delineated by dynamic small-animal pet studies in rats and mice. the in vivo distribution and kinetics of nldl and modldl correlated well with the anatomical localization and functional expression of ldl receptors, scavenger receptors, and receptors for advanced glycation end products. the study shows that ldl modification, depending on its type and extent, partly or fully blocks binding to the ldl receptor and reroutes the modldl to tissue-specific, disease-associated pathways. along these lines, 18f-labeling of modldl and the use of small-animal pet provide a valuable tool for imaging and functional characterization of these pathways and of specific sites of pathologic processes, including inflammatory processes, in animal models in vivo. the p38 mapk signaling pathway, which regulates the activity of different transcription factors including nuclear factor-κb (nf-κb), is activated in lesional psoriatic skin. the purpose of the present study was to investigate the effect of fumaric acid esters on p38 mapk and the downstream kinases msk1 and msk2 in cultured human keratinocytes. cell cultures were incubated with dimethylfumarate (dmf), methylhydrogenfumarate (mhf) or fumaric acid (fa) and then stimulated with il-1β before kinase activation was determined by western blotting. a significant inhibition of the activation of both msk1 and msk2 was seen after pre-incubation with dmf and stimulation with il-1β, whereas mhf and fa had no effect. also, dmf decreased phosphorylation of nf-κb/p65 (ser276), which is known to be transactivated by msk1. 
furthermore, incubation with dmf before stimulation with il-1β resulted in a significant decrease in nf-κb binding to the il-8 and il-20 κb binding sites as well as a subsequent decrease in il-8 and il-20 mrna expression. our results suggest that dmf specifically inhibits msk1 and msk2 activation and subsequently inhibits nf-κb-induced gene transcription, which is believed to be important in the pathogenesis of psoriasis. these effects of dmf may explain the anti-psoriatic effect of fumaric acid esters. a humanized model of psoriasis was successfully established by transplanting non-lesional skin biopsies from psoriasis patients onto bg-nu-xid mice lacking b, t and nk cells. in this system, a psoriatic process is triggered by intradermal injection of activated autologous peripheral blood lymphocytes. inflammation is associated with the expression of activation markers and inflammatory mediators such as tnf-alpha, hla-dr and cd1a, and this results in increased proliferation and differentiation of keratinocytes, demonstrated by increased expression of ki-67 and ck-16. epidermal hyperplasia is a typical readout in this model. in a series of studies, this model was found to be sensitive to a wide range of compounds, including inhibitors of tnf-alpha, antibodies directed against growth factors, mmp inhibitors, calcipotriol, methotrexate, betamethasone and cyclosporine a. in addition, we showed that inhibition of fatty acid oxidation had an anti-psoriatic effect in this model (caspary et al., brit j dermatol 2005; 153: 937-944). employing lesional skin, it was demonstrated that inhibition can also be achieved in a therapeutic setting. due to its humanized nature this model represents a powerful tool for the identification or validation of compounds with potential for the treatment of psoriasis. 
kristian otkjaer (1), e hasselager (2), j clausen (2), l iversen (1), k kragballe (1) (1) aarhus university hospital, denmark (2) novo nordisk a/s, denmark interleukin-20 (il-20) is assumed to be a key cytokine in the pathogenesis of psoriasis. increased levels of il-20 are present in lesional psoriatic skin compared with non-lesional skin, where it is barely detectable. whether il-20 is derived from antigen-presenting cells or keratinocytes remains unresolved. the aim of the present study was, therefore, to characterize il-20 expression in non-lesional psoriatic skin ex vivo. 3 mm punch biopsies from non-lesional psoriatic skin were collected. the biopsies were transferred to cacl2-enriched keratinocyte basal medium and cultured with vehicle or il-1beta (10 ng/ml) for 0, 1, 2, 4, 6, 12 and 24 hours, respectively. the samples were analyzed by in situ hybridisation, qrt-pcr, immunofluorescent staining and elisa. incubation with il-1beta rapidly induced il-20 mrna expression in the biopsies. the highest level of il-20 mrna was detected after 4 hours, and in situ hybridisation revealed that basal as well as suprabasal keratinocytes throughout the epidermis were the only cellular source of il-20 mrna. increased levels of il-20 protein were detected in the supernatant of the il-1beta-stimulated biopsies. immunofluorescent staining of the biopsies showed no il-20 protein in the keratinocytes, whereas il-20 protein was present in epidermal cd1a-positive dendritic cells. our data emphasize the keratinocyte as the cellular source of il-20 expression in human skin. interestingly, since staining localized il-20 protein to epidermal dendritic cells rather than keratinocytes, this indicates that epidermal dendritic cells are a target for keratinocyte-derived il-20. one response of epidermal keratinocytes to inflammatory stress is the induction of matrix metalloproteinases (mmps) that participate in tissue remodeling. 
excessive proteolytic activity is associated with chronic wounds and tissue damage during persistent inflammation. calcitriol, the hormonally active form of vitamin d, is known to have beneficial effects during cutaneous inflammation. we hypothesized that one way in which calcitriol exerts its effect on inflamed skin is by attenuating the damage caused by excessive mmp proteolytic activity. our experimental model consists of hacat keratinocytes cultured with tnf to simulate an inflammatory state. pro-mmp-9 was quantified by gelatin zymography and its mrna by real-time pcr. the levels and activation of signaling proteins were determined by immunoblotting. the increase in pro-mmp-9 activity and mrna levels induced by tnf was inhibited by ~50% following 48 h of treatment with calcitriol. using specific inhibitors we established that the induction of mmp-9 was dependent on the erk pathway, while p38-mapk and pkc inhibited it, and jun-kinase, pi-3-kinase and src did not affect it. levels of c-fos, a component of the ap-1 transcription complex known to mediate mmp-9 induction, were elevated by tnf and further increased by calcitriol. the induction of mmp-9 by tnf was abolished by inhibition of the egfr tyrosine kinase, attesting to the requirement for egfr trans-activation. calcitriol also inhibited the induction of mmp-9 by egf. we conclude that calcitriol inhibits the induction of mmp-9 gene expression by tnf in keratinocytes by affecting an event downstream of the convergence of the egfr and tnf signaling pathways. (1), p verzaal (1), t lagerweij (1), c persoon-deen (1), l havekes (1), a oranje (2) (1) tno pharma, department of inflammatory and degenerative diseases, leiden, the netherlands (2) erasmus medical center, department of dermatology and venereology, rotterdam, the netherlands mice with transgenic overexpression of human apolipoprotein c1 in liver and skin display a strongly disturbed lipid metabolism. 
moreover, these mice show a loss of skin-barrier function, evident from increased transepidermal water loss. these mice develop symptoms of atopic dermatitis, i.e. scaling, lichenification, papules, excoriation and pruritus. hyperplasia of both the epidermis and the dermis is observed. histological analysis shows increased numbers of cd4+ t cells, eosinophils, mast cells and ige-positive cells in the dermis. serum levels of ige are increased as well. cytokine profiling of draining lymph nodes is in favor of a th2-mediated disease. development of atopic dermatitis in this model was found to be sensitive to topical treatment with triamcinolone acetonide, fluticasone propionate and tacrolimus. moreover, oral treatment with dexamethasone successfully inhibits the development of disease in this model. impairment of the skin barrier is most likely the underlying cause of the development of atopic dermatitis in this model. this model is useful for identifying new therapeutic strategies and obtaining new insight into the pathogenesis of atopic dermatitis. topical immunosuppressants such as elidel and protopic are highly efficacious therapeutics for the treatment of atopic dermatitis and other dermatological conditions. when delivered topically, these calcineurin inhibitors offer several advantages over topical steroids; however, these marketed drugs have received a controversial "black box warning" because of a potential cancer risk. 
we speculated that systemic exposure to these drugs over long-term use may contribute to the cancer risk. accordingly, we have designed and discovered a series of "soft" cyclosporin a (csa) derivatives as potentially safer alternatives. in general, soft drugs are engineered, via medicinal chemistry, to be effective upon local delivery but to be rapidly inactivated by metabolic pathways upon systemic exposure. in this way, exposure of distal organs to active drug is greatly minimized, resulting in a significant enhancement of the therapeutic index. the results of our drug discovery efforts around soft csa derivatives will be presented. (1), y sawanobori (2), u bang-olsen (3), c vestergaard (4), c grønhøj-larsen (1) background: a strain of japanese fancy mice, nc/nga, serves as a model for atopic dermatitis. under specific pathogen-free conditions, the mice remain healthy, but when kept under non-sterile conditions, they exhibit pruritic lesions like atopic dermatitis. scratching behaviour of the mice precedes the development of dermatitis, and a correlation between registered scratching counts and expression of il-31 mrna has been shown. also, transgenic mice over-expressing il-31 exhibit increased scratching behaviour and develop severe dermatitis. consequently we decided to explore the therapeutic effect of an anti-il-31 antibody on scratching behaviour and dermatitis in nc/nga mice. methods: prior to the clinical manifestation of dermatitis, we commenced treatment of nc/nga mice with a rat anti-mouse il-31 antibody, 10 mg/kg intraperitoneally every fifth day for seven weeks. clinical dermatitis, scratching behaviour and weight gain were assessed throughout the intervention period. serum analysis for ige and il-13 as well as histopathological and immunohistochemistry analysis of skin biopsies were also performed at the end point. 
results: taken over the entire intervention period, treatment with anti-il-31 antibody in nc/nga mice from age seven weeks did not meet the primary end points, which were scratch, dermatitis and body weight. however, post hoc analysis revealed a significant reduction of scratch by the anti-il-31 antibody treatment in the time interval day 22-43. our results suggest an anti-pruritic role for il-31 antibody in an atopic dermatitis-like animal model. anti-il-31 antibody is therefore a new therapeutic opportunity for the treatment of pruritus in atopic dermatitis and perhaps other pruritic diseases. (1), p ferro (1), hm asnagli (1), v ardissone (1), t ruckle (2), f altruda (3), ch ladel (1) (1) rbm merck serono/university of torino, italy (2) merck serono pharmaceutical research institute, geneva, switzerland (3) university of torino, dipartimento di genetica, biologia, biochimica, italy class-i phosphoinositide 3-kinases (pi3ks) play a critical role in modulating innate and adaptive immune responses, as they are important transducers of external stimuli to cells such as granulocytes and lymphocytes. since pi3k-gamma plays a pivotal role in mediating leukocyte chemotaxis and activation, as well as mast cell degranulation, the pharmacological blockade of pi3k-gamma might offer an innovative, rationale-based therapeutic strategy for inflammatory skin disorders. in our study the inhibitory properties of a selective pi3k-gamma inhibitor, as605858, were tested in murine models of inflammatory skin diseases such as psoriasis and dermatitis. two mouse models were used: the first, irritant contact dermatitis (icd), is an innate inflammatory skin condition arising from the release of pro-inflammatory cytokines in response to haptens, usually chemicals. the second, contact hypersensitivity (chs), is a t-cell dependent model, modeling in part t-cell-mediated skin diseases such as psoriasis. 
we demonstrated the therapeutic effect of pi3k-gamma inhibition and subsequent inhibition of chemotaxis in models of skin diseases, and showed that a selective pi3k-gamma inhibitor can exert an important, dose-dependent therapeutic efficacy in models of innate immunity (icd; effective dose 0.3 mg/kg p.o. once) as well as in t-cell mediated skin pathology (chs; effective dose 1 mg/kg p.o. bid). we conclude that the mechanisms of action related to inhibition of pi3k-gamma are demonstrable after oral administration of selective inhibitors like as605858 in models of acute and chronic skin inflammation and are mediated by modulation of innate and acquired immunity. introduction: high mobility group box 1 (hmgb1) has recently been identified as a late mediator of endotoxin lethality. we newly developed an extracorporeal hmgb1 absorber. the purpose of this study was to test the hypothesis that hmgb1 removal could prevent or reduce endotoxin-induced lethality or tissue injury in rats. methods: all experiments were conducted in accordance with the institutional care and use committee. male wistar rats were randomly allocated into three groups: hmgb1 absorber group (group i), hmgb1 non-absorber group (group ii), and vacant column group (group iii). we applied these columns to each group at 6 hours after lps injection. the rats were sacrificed 48 hours after lps injection for pulmonary histology. we statistically analyzed survival rate with kaplan-meier and the levels of hmgb1 with anova. results: survival rate was 67% in group i at 48 hours after lps injection, as compared with 0% in group ii and 11% in group iii. the pulmonary histology in both group ii and group iii showed acute inflammatory injuries, whereas group i showed less inflammatory changes. the level of hmgb1 in group i was significantly lower than those of groups ii and iii. 
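the 48-hour survival comparison above (67% in group i versus 0% and 11% in groups ii and iii) rests on kaplan-meier estimation; the product-limit estimator can be sketched in a few lines of python. the event times and the group size of nine animals below are hypothetical, chosen only so that the curve ends near 67% -- they are not the study's data.

```python
# minimal kaplan-meier (product-limit) estimator, pure python.
# at each distinct death time t: s(t) = s(prev) * (1 - deaths / at_risk).

def kaplan_meier(times, events):
    """return a list of (time, survival) steps.
    times: observation times; events: 1 = death observed, 0 = censored."""
    pairs = sorted(zip(times, events))
    at_risk = len(pairs)
    s = 1.0
    curve = [(0.0, 1.0)]
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = removed = 0
        while i < len(pairs) and pairs[i][0] == t:
            deaths += pairs[i][1]
            removed += 1
            i += 1
        if deaths:                       # survival only drops at death times
            s *= 1.0 - deaths / at_risk
            curve.append((t, s))
        at_risk -= removed               # deaths and censored leave the risk set
    return curve

# hypothetical "group i": 9 animals, deaths at 12, 24 and 36 h,
# six survivors censored at 48 h -> final survival 6/9, about 67%
curve = kaplan_meier([12, 24, 36, 48, 48, 48, 48, 48, 48],
                     [1, 1, 1, 0, 0, 0, 0, 0, 0])
```

with censored animals the estimator differs from a naive survivor fraction, which is why kaplan-meier (rather than a simple percentage at each time point) is the standard choice here.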
discussion: these results demonstrate that specific absorption of endogenous hmgb1 therapeutically reverses lethality of established sepsis, indicating that hmgb1 inhibitors and absorbers can be applied within a clinically relevant therapeutic window that is significantly wider than for other known cytokines. contact information: dr hideo iwasaka, oita university, anesthesiology and intensive care unit, yufu city, japan e-mail: hiwasaka@med.oita-u.ac.jp (1) (1) department of pharmacology, national university of singapore, singapore (2) dso national laboratories, singapore hydrogen sulfide (h2s) is increasingly recognized as a proinflammatory mediator in various inflammatory conditions. in this study, we have investigated the role of h2s in regulating expression of some endothelial adhesion molecules and migration of leukocytes to inflamed sites in sepsis. male swiss mice were subjected to cecal ligation and puncture (clp) induced sepsis and treated with saline, dl-propargylglycine (pag, 50 mg/kg i.p.), an inhibitor of h2s formation, or sodium hydrogen sulfide (nahs, 10 mg/kg i.p.), an h2s donor. pag was administered either 1 hour before or 1 hour after induction of sepsis, while nahs was given at the time of clp. using intravital microscopy, we found that in sepsis, prophylactic and therapeutic administration of pag significantly reduced leukocyte rolling and adherence in mesenteric venules, coupled with decreased mrna and protein levels of adhesion molecules (icam-1, p-selectin and e-selectin) in lung and liver. in contrast, injection of nahs significantly upregulated leukocyte rolling and attachment as well as tissue levels of adhesion molecules in sepsis. in addition, normal mice were given nahs (10 mg/kg i.p.) to induce lung inflammation, with or without pretreatment with the nf-kb inhibitor bay 11-7082. h2s treatment enhanced the pulmonary level of adhesion molecules and neutrophil infiltration in the lung. 
these alterations were reversed by pretreatment with bay 11-7082. moreover, expression of cxcr2 in neutrophils obtained from h2s-treated mice was significantly upregulated, leading to an obvious elevation in mip-2 directed migration of neutrophils. therefore, h2s acts as an important endogenous regulator of leukocyte trafficking during the inflammatory response. transient receptor potential vanilloid 1 (trpv1) is primarily found on sensory nerves. we have demonstrated its pro-inflammatory potential in arthritis and now present evidence that it is protective in an endotoxin-induced model of sepsis. selective trpv1 antagonists are not available for use in the mouse in vivo, thus established trpv1 knockout (-/-) mice were used. c57bl6 wt and trpv1-/- mice were matched for age and sex and injected intraperitoneally (i.p.) with lipopolysaccharide (lps). the response was monitored for 4 h. blood pressure, measured before and at intervals after lps in conscious mice via a tail cuff, was reduced in both wt and trpv1-/- mice, with trpv1-/- mice showing an enhanced drop at 4 h. in a separate group, temperature, a proposed pre-mortality marker, was also reduced by 4 h, again with a significantly increased drop in trpv1-/- mice. furthermore, higher levels of two inflammatory markers, tnf-alpha and nitrite (as an indicator of no), were measured in peritoneal lavage, with higher levels in trpv1-/- as compared with wt samples. finally, aspartate aminotransferase (ast) levels were also enhanced in trpv1-/- versus wt mice, although markers for kidney and pancreatic damage were similar in both genotypes. we conclude that trpv1 plays a protective role in sepsis. trpv1 is known to be present on non-neuronal sites (e.g. vascular components) and their relative involvement in sepsis is unknown. (1), da souza-junior (2), l de paula (1), mc jamur (2), c oliver (2), sg ramos (2), cl silva (2), lucia helena faccioli (1) (h37rv). 
infected balb/c mice developed an acute pulmonary inflammation, and higher levels of tnf-alpha, il-1, kc, mcp-1 and mip-2 were detected in the lungs by day 15. in vivo degranulation of mast cells by c48/80 led to a reduction of the inflammatory reaction, associated with a marked decline in proinflammatory cytokine and chemokine levels in the lungs. the magnitude of the cellular immune response was also partially impaired in infected mice treated with c48/80. histologically, the exacerbated granulomatous inflammation shown in the lung parenchyma of infected mice was attenuated in infected mice treated with c48/80. of interest, the number of mycobacterial bacilli recovered from the lungs was 1 log higher after treatment of infected mice with c48/80. these findings suggest that mast cells participate in host defense against m. tuberculosis infection through the modulation of cytokines and chemokines, which are important for the recruitment and activation of inflammatory cells. (1), ms chadfield (2), db sørensen (1), h offenberg (1), m bisgaard (1), he jensen (1) (1) department of veterinary pathobiology, faculty of life sciences, university of copenhagen, denmark (2) novo nordisk a/s, cell biology, gentofte, denmark introduction: pasteurella multocida is an important cause of pneumonia in several animal species and may spread systemically. the aim of this study was to evaluate initial inflammatory reactions and the inoculum effect due to strains of p. multocida of different origin in an aerogenous murine model. materials and methods: 30 female balb/c-j mice (approx. 20 g, taconic, denmark) were infected intranasally with two clinical isolates of p. multocida of avian (vp161) and porcine (p934) origin, at three different levels of inoculum concentration. after euthanasia, specimens of lung and liver tissue were collected for bacteriological and histopathological evaluation. 
furthermore, lung tissue samples were taken for measurement of expression of metalloproteinase mmp-9 and metalloproteinase inhibitor timp-1. results: all mice infected with the avian strain were euthanized after 24 hours. viable counts recovered from lung and liver tissue were high, and histopathology revealed pronounced acute bronchopneumonia. in the liver, disseminated necrosis with formation of microabscesses was also seen. by contrast, a dose response was observed with the porcine strain with regard to recovery of viable counts, and development of lesions was apparent after 24, 48 and 72 h. furthermore, differences were seen in the nature of the lesions caused by the two strains. there was a difference in expression of mmp-9 and timp-1 between infected and non-infected mice. the model proved suitable for the evaluation of pulmonary inflammatory reactions between the two different host-derived strains, as demonstrated through viable counts, histopathology and expression of mmp-9 and timp-1. (1), r molinaro (1), a frança (1), m bozza (1), f cunha (2), s kunkel (3) (1) universidade do rio de janeiro, brazil (2) universidade de são paulo, brazil (3) university of michigan, usa introduction and objectives: studies reveal that regulatory t (treg) cells control immune responses; these responses must be controlled to enable effective protection against infections and cancer. ccr4 knockout mice (ccr4-/-) are more resistant to lps shock. our aim was therefore to study the mechanisms involved in the resistance of ccr4-/- mice subjected to severe sepsis by cecal ligation and puncture (clp), and how tregs modulate this effect. results: c57bl/6 mice were subjected to the clp model, whereby the cecum was partially ligated and punctured nine times with a 21g needle. sham-operated mice were used as controls. mice subjected to clp and sham surgery were treated with antibiotics from 6 h after surgery until day 3. 
ccr4-/- mice subjected to clp presented an increase in survival rate (78%) compared with wild-type mice (17%), and a marked improvement in the innate response with regard to neutrophil migration to peritoneum and lung, bacterial load and cytokine levels compared to wild-type mice. moreover, tregs from ccr4-/- clp mice did not inhibit proliferation of t effector cells as observed for tregs from wild-type clp mice, at a proportional teffector:treg ratio. interestingly, tregs from ccr4-/- clp mice did not inhibit neutrophil migration to bal when co-injected with fungal challenge as a secondary infection, while tregs from wild-type clp mice did, as expected. conclusions: these results suggest that treg cells from ccr4-/- mice did not present a suppressive response, and this could be an important factor in their survival. inflammation and oxidative stress are known to be among the important causes responsible for many diseases. inflammation has been associated with diseases like cancer, diabetes and many others. the proinflammatory cytokines tnf-alpha and il-1beta, and no, are considered pivotal mediators in inflammatory conditions like rheumatoid arthritis, sepsis and cancer. thus inhibition of pro-inflammatory cytokines and of no production are important targets for treatment of inflammatory disorders. nowadays, due to the emerging side effects of cox inhibitors, these targets have received more attention for the treatment of these conditions. some medicinal plants such as curcuma longa (1), commiphora mukul (2). in humoral memory, antibodies secreted into serum and other body fluids protect an individual against repeated challenges of previously encountered pathogens. antibody-secreting plasma cells are mostly considered to be short-lived, terminally differentiated b lymphocytes, eliminated after a few days or weeks by apoptosis. 
however, in secondary lymphoid organs and in the bone marrow, plasma cells can survive for months and years, without dna synthesis and refractory to signals from antigen or antigen-antibody complexes. the lifetime of these long-lived plasma cells depends on an intrinsic competence to survive in the distinct environment of those organs, which defines a specific survival niche. the niche provides survival signals like il-6, cxcl12 and tnf. within a functional niche, the lifetime of a plasma cell is apparently not limited intrinsically. the number of niches in the body has to be limited, in order to maintain physiological concentrations of serum immunoglobulins. thus recruitment of new plasma cells to the pool of old memory plasma cells has to be competitive. this competition is probably controlled by a simple molecular mechanism, namely the dual functionality of chemokines like cxcl12, which attract newly generated plasmablasts to a survival niche and at the same time are a survival signal for the plasma cell. plasmablasts and plasma cells express cxcr4, the receptor for cxcl12. while plasmablasts migrate in response to cxcl12, plasma cells depend on it for survival in the niche, but are no longer migratory. thus, once dislodged from their niche, they will die. plasmablasts newly generated upon systemic secondary immunization, upon concomitant stimulation with interferon-gamma, can also express cxcr3, the receptor for the interferon-gamma-induced chemokines cxcl9, 10 and 11, which may lure the plasmablasts into inflamed tissue. the switch in the potential to migrate also provides an efficient means to eliminate plasma cells from the peak of an immune response, which as plasmablasts had migrated to the tissue inflamed in that pathogenic challenge. inflamed tissue contains survival niches for plasma cells. in the inflamed tissue, plasma cells provide high local antibody concentrations while the tissue is inflamed. 
upon resolution of the inflammation the plasma cells will be dislodged and die. long-lived plasma cells provide long-lasting antibody titers (protective memory) and leave memory b cells a role in reactive memory, generating memory plasma cells upon secondary challenges when serum titers are not sufficient to protect. in chronic inflammation, this mechanism can contribute to pathogenesis. thus in the nzb/w model of lupus, long-lived autoreactive plasma cells are generated early in pathogenesis, which survive in bone marrow and spleen. later, in established disease, autoreactive plasma cells are short-lived and continuously generated. they do not compete with the long-lived plasma cells, and both populations coexist as prominent populations. interestingly, long-lived plasma cells are resistant to therapeutic immunosuppression, while the generation of short-lived plasma cells is blocked. this may be the reason for the failure to cure antibody-mediated immunopathology, e.g. in autoimmunity and allergy, by conventional immunosuppression. (1), h lee (2) c5a is a potent inflammatory mediator produced during complement activation. unregulated c5a signalling through its receptor (c5ar) on neutrophils and other leukocytes is implicated in the pathogenesis of autoimmune diseases including rheumatoid arthritis and systemic lupus erythematosus. considerable effort has gone into development of c5ar antagonists for human therapy. we took neutrophils from genetically modified human c5ar knock-in mice, in which the mouse c5ar coding region was replaced with human c5ar sequences, and immunized wild-type mice to generate high affinity antagonist monoclonal antibodies (mabs) to human c5ar. these mabs inhibit c5a-induced neutrophil migration and calcium flux, and bind to a region of the 2nd extracellular loop of c5ar that seems to be critical for receptor activity. this study investigated the effectiveness of these mabs in the k/bxn serum-transfer model of inflammatory arthritis. 
human c5ar knock-in mice were given 1-10 mg/kg mab intraperitoneally, before or after inflammatory arthritis developed. mice treated with anti-c5ar mab one day before serum transfer did not develop swelling or clinical signs of arthritis, in contrast to controls. histopathology of the joints in anti-c5ar mab-treated mice revealed a complete block of the massive influx of leukocytes and cartilage erosion seen in controls. furthermore, and most significantly, a single 1 mg/kg dose of anti-c5ar mab given 5 days after initiation of disease completely reversed inflammation. in the collagen-induced arthritis (cia) model, injection of anti-c5ar mab after development of inflammation also reversed inflammation to baseline. these potent new antibodies to human c5ar are in preclinical development. the cytokine macrophage migration inhibitory factor (mif) participates in fundamental events in innate and adaptive immunity. the profile of activities of mif in vivo and in vitro is strongly suggestive of a role for mif in the pathogenesis of many inflammatory diseases, including rheumatoid arthritis (ra), asthma, and sepsis. mif also has a unique relationship with glucocorticoids, in that despite antagonizing their effects, the expression of mif is in fact induced by glucocorticoids. thus, mif functions as a physiological counter-regulator of the anti-inflammatory effects of glucocorticoids. a therapeutic mif antagonist may therefore provide a specific means of steroid sparing. since mif is highly conserved among different species, it is hard to develop high affinity antibodies due to immune tolerance. we developed a proprietary technique to break the immune tolerance and selected high affinity mouse monoclonal antibodies against mif. the antibody can neutralize mif activity in cell-based assays, and is very effective in an lps-induced mouse sepsis model. 
using this antibody as a tool, we are studying the function of mif in comparison with the function of lps. we found that lps-induced inos expression and no secretion are dependent on the secretion of mif. we also found that although both lps and mif induce g1 arrest in the macrophage cell line raw264.7, their functions are independent of each other. structure-based small molecule drug design. an effective agent would be the first orally active cytokine antagonist. methods: collagen-induced arthritis (cia) was induced in dba-1 mice by immunisation with bovine type ii collagen/adjuvant on day 0 and 21. cor100140, synthesized on the basis of computer modeling of mif protein x-ray crystallographic data, was administered by daily oral gavage from day 21. etanercept (8 mg/kg ip q2d) was used as a positive control. the mek-erk pathway is activated in numerous inflammatory conditions, including ra, ibd and copd. arry-162 is a potent (ic50 = 12 nm), selective, atp-uncompetitive mek1/2 inhibitor. arry-162 is highly efficacious in cia and aia rat models, with ed50s of 3 and 10 mg/kg, respectively, equal to or better than standard agents. addition of arry-162 to methotrexate, etanercept, ibuprofen or dexamethasone regimens in these models results in improved efficacy that is at least additive, if not synergistic. in tpa-stimulated human whole blood, this compound inhibited tnf, il-1 and il-6 production (ic50s of 23, 21 and 21 nm, respectively). in contrast, inhibition of perk required 280 nm to achieve a 50% reduction, demonstrating that inhibition of pro-inflammatory processes is very sensitive to even partial perk inhibition. in clinical studies, healthy volunteers were administered a single oral dose of 10, 20, 30, 40 or 80 mg; blood was drawn at various times after dosing and stimulated ex vivo with tpa. arry-162 was well-tolerated and drug exposure was dose-proportional. 
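the potency gap noted above (cytokine ic50s of ~21-23 nm against 280 nm for perk) can be made concrete with a simple hill equation: at a concentration giving half-maximal cytokine inhibition, perk is barely touched. this is an illustrative sketch only; the hill slope of 1 is an assumption, not a value reported here.

```python
# one-site hill (logistic) dose-response sketch; the hill slope is assumed.

def percent_inhibition(conc_nm, ic50_nm, hill=1.0):
    """percent inhibition (0-100) at a given concentration for a given ic50."""
    return 100.0 * conc_nm ** hill / (ic50_nm ** hill + conc_nm ** hill)

# at 23 nm (the tnf ic50), the cytokine readout is inhibited by 50%...
tnf_inhibition = percent_inhibition(23.0, 23.0)
# ...while perk (ic50 = 280 nm) is inhibited by under 10% at that concentration
perk_inhibition = percent_inhibition(23.0, 280.0)
```

the same arithmetic underlies the ex vivo observation that >80% cytokine inhibition is reached at plasma concentrations well below those needed for comparable perk inhibition.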
in ex vivo blood samples, there was a dramatic time- and concentration-dependent inhibition of tpa-induced il-1 and tnf-alpha, with >80% inhibition observed at plasma concentrations of 50 and 150 ng/ml, respectively. similar inhibition of perk required 300 ng/ml. a multiple ascending dose clinical study has confirmed the pharmacokinetics and pharmacodynamics of arry-162 and helped define tolerability. clinical evaluation of arry-162 in combination with methotrexate in patients with rheumatoid arthritis is on-going. p38 (a mitogen-activated protein kinase) has been shown to play a key role in the release of cytokines such as tnf-alpha and il-1beta from monocytes in signaling cascades that are initiated by extracellular stress stimuli. inhibition of p38 activity is expected to regulate the levels of tnf-alpha and il-1beta, thereby alleviating the effects of inflammation in ra. a new class of p38 inhibitors based on the naphthyridinone scaffold has been discovered. x-ray crystallography and site-directed mutagenesis studies were critical tools that aided the evolution of the naphthyridinone lead class starting from a pyrido-pyrimidinone template. this presentation will discuss the derivation of key benchmark pre-clinical candidates in these novel scaffold classes (shown below) as influenced by structural biology studies, mutagenesis data and molecular modelling. efficacy studies in animal models for benchmark compounds will also be presented. (1), h aaes (1), w-h boehncke (2), j pfeffer (2), t skak-nielsen (1), i teige (1), k abell (1), ph kvist (1), e ottosen (1), tk petersen (1), lars svensson (1) (1) discovery, leo pharma, 55 industriparken, ballerup, denmark (2) department of dermatology, johann wolfgang goethe-university, frankfurt am main, germany p38 map kinase plays an important role in mediating an inflammatory response in mammalian cells. as a consequence of activation, several inflammatory mediators are released, including il-1beta and tnf-alpha. 
both cytokines have a central role in the pathogenesis of inflammatory conditions such as psoriasis. approximately 30% of psoriasis patients develop psoriatic arthritis. leo15520 is a member of a newly developed class of selective p38 map kinase inhibitors. the compound was tested orally in in vivo models relevant for psoriasis and psoriatic arthritis. the in vivo models selected include the cia arthritis model, the human psoriasis xenograft scid mouse model, the uvb-induced dermatitis model, the lps-induced tnf-alpha model and a local gvh model. treatment with leo15520 led to an amelioration of the ongoing inflammation in all investigated in vivo models. in the cia model, a clear dose-response effect was observed on the developing arthritis in both rats and mice (65% reduction in mice and 77% in rats at 10 mg/kg p.o.). in the humanised psoriasis model, leo15520 at a dose of 20 mg/kg had an effect on both the hyperplastic epidermis (epidermal thickness reduced by 41%) and the infiltrating inflammatory cells. the anti-inflammatory effect of leo15520 was even close to the effect of systemically delivered steroids in both models. we believe that the new highly selective class of p38 map kinase inhibitors has a strong potential as an orally delivered therapy for systemic inflammatory diseases such as psoriasis and arthritis. slx-2119 is a potent, selective, orally bioavailable inhibitor of the rho-kinase rock-2. its ic50 for rock-2 and rock-1 inhibition is 100 nm and >10 µm, respectively. the ability of slx-2119 to inhibit septic liver injury was investigated in c57bl/6j mice challenged with lipopolysaccharide (lps) and d-galactosamine (d-gal). mice were given lps (10 µg/mouse) and d-gal (18 mg/mouse) i.p. and 5.5 hours later were sacrificed for analysis of liver injury. mice challenged with lps/d-gal had a >10-fold increase in serum alt and ast levels. 
this increase was reduced by >90% in mice pretreated with slx-2119 either orally (100 mg/kg, 2 and 20 hrs pre-challenge) or i.p. (10-100 mg/kg, 15 min pre-challenge). slx-2119 inhibited the increase in hepatic levels of tnf-alpha produced by lps/d-gal by >50%. to assess the kinetics of slx-2119's benefit, slx-2119 (100 mg/kg i.p.) was given 15 min prior to, or 15 or 60 min after, the lps/d-gal. slx-2119 was effective at inhibiting the rise in alt and ast levels at all 3 time points, suggesting that inhibiting rock-2 even after initiation of the lps/d-gal driven cascade protects against septic liver injury. in a survival study, 9 out of 10 mice given lps/d-gal were dead by 8 hrs, whereas in mice given slx-2119 (100 mg/kg orally) 1 animal died at 10 hours and the remaining 9 mice were alive 48 hrs later. these results show that specific inhibitors of rock-2 may have therapeutic utility in the treatment of sepsis and subsequent liver injury. theta through three key hydrogen bond mediated interactions. they potently inhibit protein kinase c activity in vitro, as demonstrated by inhibition of il-2 secretion in human purified t-cells stimulated with anti-cd3 and anti-cd28 or whole blood seb challenge. the pkc-theta inhibitors are orally bioavailable and demonstrate immunosuppressive activity in a mouse model of human delayed-type hypersensitivity responses. (1), j zhang (1), k henley (1), m white (1), d hilton (1), b kile (1) to investigate the pathogenesis of rheumatoid arthritis, we used mice transgenic for the uniquely human fcgammariia in inducible and passively transferred models of arthritis. transgenic mice developed severe ra-like disease in both model systems, indicating that the transgene played a major role in arthritis pathogenesis. disease could be reduced by the administration of either specific monoclonal antibodies to fcgammariia or small chemical entities (sce) designed to bind to the fcgammariia dimer. 
to investigate the cause of this enhanced sensitivity to auto-immune stimuli, the phagocytic capacity of transgenic mice compared to c57bl/6 control mice was examined using phagocytosis of fluorescent beads coated with ova or ova/anti-ova immune complex, or of opsonised sheep red blood cells. in both assays, macrophages from fcgammariia transgenic mice showed significantly increased phagocytosis compared with cells from control mice at 4 hours. this difference diminished over time and was only seen where particles were opsonised. treatment of macrophages with specific fcgammariia blocking monoclonal antibody fragments 8.7 f(ab)2 or iv.3 f(ab) reduced phagocytosis to background levels. macrophages from transgenic mice also showed significantly greater production of the inflammatory cytokines tnf-alpha and il-1beta when stimulated in vitro with heat-aggregated immunoglobulin (hagg). moreover, this response was also blocked by specific fcgammariia monoclonal antibody fragments or sce. thus, expression of the fcgammariia transgene in these mice leads to increased uptake of and reactivity to immune complexes, resulting in enhancement of inflammatory sequelae in the form of increased th1 cytokine secretion and amplification of the pro-inflammatory response leading to arthritic disease. harald burkhardt (1), u hüffmeier (2), i könig (3), j lacorz (2), a reis (2), k reich (4) (1) johann wolfgang goethe university, frankfurt, germany (2) friedrich-alexander-university of erlangen-nuremberg, germany results: whereas the earlier described strong association of allele tnf*-238a with psoriasis could be confirmed, our study revealed that this association was completely dependent on carrying the psors1 risk allele. for psa, but not psoriasis vulgaris without joint involvement, strong association with the allele tnf*-857t was detected (or=1.956, 95% ci 1.33-2.88; pcorr=0.0025), also in patients negative for the psors1 risk allele. 
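for readers unfamiliar with how an odds ratio and its 95% confidence interval (as in or=1.956, ci 1.33-2.88 above) are derived from carrier counts, a wald-interval computation is sketched below. the 2x2 counts are hypothetical placeholders; the abstract does not report the underlying genotype counts.

```python
import math

def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """odds ratio with a 95% wald confidence interval for a 2x2 table:
    a, b = allele carriers / non-carriers among cases;
    c, d = allele carriers / non-carriers among controls.
    the ci is symmetric on the log scale: exp(ln(or) +/- z * se(ln(or)))."""
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper

# purely illustrative counts, not the study's data
or_, lo, hi = odds_ratio_wald_ci(60, 140, 40, 160)
```

an interval whose lower bound stays above 1 (as in the reported 1.33-2.88) is what supports calling the tnf*-857t association significant after correction.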
our results indicate genetic differences between psoriasis vulgaris patients with and without joint manifestation. while the previously reported association between tnf*-238a and psoriasis seems to primarily reflect ld with psors1, tnf*-857t may represent a risk factor for psa independent of psors1. (1), n modi (1), m stanford (1), e kondeatis (1), r vaughan (1), f fortune (2), w madanat (3), c kanawati (4), p murray (5) methods: dna was obtained from 167 patients with bd, 61 from the uk and 106 from the middle east (me), and 159 control individuals, 129 from the uk and 30 from the me. dna was prepared by proteinase k digestion and salt extraction, and the -318 and 49 snps were detected by pcr-ssp. results: there was no significant difference in expression of -318c/t or 49a/g when all bd patients were compared to all control individuals (p=0.3 and 0.9, respectively). as we have previously shown differences in snp expression in different patient groups, we tested the uk and me patients separately. however, there was no significant difference in either snp in uk bd patients (p=0.15, p=0.39) or me bd patients (p=0.7, p=0.6) when compared to the appropriate controls. (1), da brown (1), h johnen (1), mr qiu (1), t kuffner (1), pgm curmi (1), l brown (1), m mazzanti (2), sn breit (1) introduction: cerebral palsy (cp) is a non-progressive motor disorder caused by white matter damage in the developing brain. it is often accompanied by neurocognitive and sensory disabilities. the cause and pathogenesis of cp are multifactorial and continue to be poorly understood. chorioamnionitis, clinically silent or manifest, has been reported to be a risk factor for cp both in term and preterm infants. the il-16 gene is a single-copy gene located on chromosome 15q26.1-3 in humans. interleukin-16 is synthesized by a variety of immune (t cells, eosinophils and dendritic cells) and non-immune (fibroblasts, epithelial and neuronal) cells. 
it is also detected in organ-specific secretions in a number of inflammatory processes. amniotic fluid interleukin-16 concentrations decrease with advancing gestational age, but women with preterm labor and women with chorioamnionitis have higher interleukin-16 amniotic fluid concentrations than those who delivered at term or those with sterile amniotic fluid. the aim of our study was to estimate the allelic frequency of the regulatory il16 -295 snp in children with cp. methods: dna obtained from peripheral blood of 46 cp patients and 182 unrelated healthy volunteers was genotyped for the il16 -295 snp by the pcr-rflp method. results and conclusions: the il16 -295 cc genotype was more common in the population with cerebral palsy in comparison to healthy volunteers. the significance of the association between il16 gene polymorphisms and cerebral palsy has to be investigated in future studies. (1), d delbro (1), e hansson (2) (1) kalmar university, sweden (2) göteborg university, sweden background: acetylcholine (ach) is a major signalling molecule, binding partly at nicotinic receptors (nachrs; a family of ion channels with nicotine as a selective ligand). one subtype, the alpha7 nachr, has anti-inflammatory effects by way of down-regulation of tnf-alpha release from macrophages. the alpha7 nachrs have been demonstrated in neuronal as well as non-neuronal tissues, e.g. astrocytes and microglia. aim: in rat astrocytes in primary culture, and in astrocytes co-cultivated with primary microvessel cultures, to study: 1. the expression of alpha3 and alpha7 nachrs by immunofluorescence and western blot. 2. intracellular ca2+ transients spread within the astroglial networks after stimulation with nicotine. 3. pro-inflammatory cytokines, il-1beta, il-6 and tnf-alpha, released from microglial cells after stimulation with nicotine. 
Results: alpha7 nAChR expression was more evident in astrocytes co-cultivated with endothelial cells, suggesting that endothelial cells release factors which increase the maturity of astrocytes. The Ca2+ transient evoked by nicotine was also more pronounced in the co-cultured astrocytes. These Ca2+ responses were blocked by the alpha7 nAChR antagonist alpha-bungarotoxin. The release of pro-inflammatory cytokines was down-regulated after stimulation by nicotine. Conclusion: alpha7 nAChRs appear to be involved in some of the effects of nicotine administration on rat astrocytes. An anti-inflammatory action of cholinergic nerves on astrocytes via nAChRs seems probable, which may have therapeutic implications. Kalmar University, Department of Natural Sciences, Kalmar, Sweden. E-mail: ann.pettersson@hik.se. Gedeon Richter Plc., Budapest, Hungary. Tramadol, an atypical opioid analgesic, is increasingly used for the treatment of osteoarthritis because it does not produce the typical side effects of NSAIDs. A review of clinical data shows that it produces symptom relief and improves function, but these benefits are small and adverse events often cause participants to stop taking the medication (Cochrane Database 2006). The efficacy of tramadol in animal models of inflammatory pain is well established; however, a complete characterization of the drug regarding analgesic and side effects is missing in rats. Our aim was to assess the oral efficacy of tramadol at side-effect-free doses in rats. The anti-hyperalgesic effect was determined in the complete Freund's adjuvant (CFA)-induced thermal hyperalgesia test (TH) and in a model of CFA-induced knee joint arthritis. The effect on inflammation was characterized in the carrageenan-induced paw edema test (ET). For the characterization of side effects, the accelerating rotarod assay (ARR) was used. Significant impairment in the ARR was noted at the 55 mg/kg dose and above. In the TH test a weak 35% effect was seen at a side-effect-free dose (40 mg/kg). In the arthritis model, b.i.d.
40 mg/kg tramadol given on days 3-7 after CFA caused a maximum 86% effect on weight-bearing incapacitance (day 4). However, its effect seemed to diminish (34% on day 7) upon repeated treatment. In the ET, tramadol had a significant anti-inflammatory effect at 100 but not at 30 mg/kg. These results show that the efficacy of tramadol in rat inflammatory pain models is limited by its side effects, in accordance with clinical data. Chronic relapsing experimental allergic encephalomyelitis (CR EAE) is a disease that bears striking similarities to the human condition multiple sclerosis (MS). In particular, CR EAE and MS have major inflammatory events in the central nervous system (CNS) that culminate in demyelination and disruption of axonal function. One group of mediators involved in the progressive CNS inflammation are the prostaglandins (PGs). PG generation is regulated in experimental non-immune conditions of the CNS by N-methyl-D-aspartate (NMDA) receptor activation. NMDA receptor-mediated events are also evident in EAE and may therefore have the potential to influence PG production. The study was designed to profile PGE2 and PGD2 in CNS tissues during the development of CR EAE and to examine the role of the NMDA receptor in PG production through the use of the specific antagonist MK-801. Biozzi mice were inoculated for CR EAE and CNS tissues were sampled during the course of disease. Enzyme immunoassay of processed samples revealed PGE2 and PGD2 levels within normal limits during the acute and subsequent remission phases of CR EAE. In contrast, dramatic changes in PG concentrations were observed with a relapse of symptoms and a remission of disease. MK-801 was therapeutically administered to CR EAE-diseased mice at the onset of relapse and the changes in CNS PG levels were recorded. The relapse phase of CR EAE, but not the acute stage of disease, is characterised by an increase in CNS PG production that may be influenced by NMDA receptor activation.
Livia L Camargo (1), LM Yshii (1), SK Costa (1). (1) University of São Paulo, Brazil (2) King's College, London, UK (3) Butantan Institute, São Paulo, Brazil. Objectives: the neuropeptide substance P (SP) released by capsaicin-sensitive nerves (CSN) plays a pivotal role in neurogenic inflammation. Despite the prevalence of arthritis, the contribution of SP to the progression of arthritis has not been established. This study investigated the effect of CSN ablation and of SR140333, an SP antagonist, on knee joint inflammation and pain induced by intra-articular (i.a.) injection of kaolin (10%, 5 h time course) in female Wistar rats. The kaolin-injected knee (ipsilateral, ipsi) of vehicle-treated rats exhibited a significant, time-dependent oedema as compared to the contralateral knee. In addition, an increased pain score and high levels of myeloperoxidase (MPO, a marker of neutrophil accumulation) and of both pro-inflammatory (IL-1b and IL-6, but not TNF-a) and anti-inflammatory (IL-10) cytokines were detected in the ipsi synovial fluid of these animals. Both destruction of knee joint CSN fibres by neonatal capsaicin treatment and i.a. injection of SR140333 (1 nmol/cavity) significantly attenuated the kaolin-induced pain score and knee oedema, suggesting that kaolin acts, at least partially, via a neuronal mechanism. In contrast, the same treatment caused increased MPO activity and cytokine concentrations measured 5 h post kaolin injection. Conclusions: peripheral release of SP after kaolin injection acts to increase pain generation, oedema formation and inflammatory cell influx. However, chronic tachykininergic depletion by capsaicin treatment up-regulates the production of pro-inflammatory cytokines that are important in triggering cell influx into the synovial cavity. (1) University of São Paulo, São Paulo, Brazil (2) Butantan Institute, São Paulo, Brazil. Objectives: previously, intra-tracheal (i.tr.)
injection of DEP or 1,2-naphthoquinone (1,2-NQ) evoked plasma extravasation and cell influx in rat airways. We now investigated whether simultaneous injection of these pollutants has a synergistic inflammatory action. We also determined the ability of DEP- and 1,2-NQ-induced airway inflammation to evoke changes in the reactivity of the rat isolated thoracic aorta (RTA) and corpus cavernosum (RCC) using organ bath assays. Results: capsaicin- or vehicle-treated male Wistar rats received an i.tr. injection of DEP (1 mg/kg) and 1,2-NQ (35 nmol/kg). After 15 min, DEP and 1,2-NQ produced a potent (additive) plasma extravasation in the trachea and bronchi, but not the lung, compared with each compound alone. In capsaicin-treated rats, or rats treated with tachykinin antagonists, the response was inhibited, suggesting an important role for C-fibres, primarily tachykinins. Increased MPO and cytokine levels were detected in the bronchi of capsaicin-treated rats following 3 h treatment with the pollutants. This treatment contributed to augmenting the ACh (10^-9 to 10^-4 M)-induced relaxation in the RTA but not in the RCC. In capsaicin-treated rats the RTA response to the pollutants was not affected, but a marked relaxation was evoked in the RCC of animals challenged with the pollutants. Conclusions: 1,2-NQ exacerbates DEP-induced plasma extravasation and MPO activity in the airways, indicating a neurogenic mechanism through tachykinins. Exposure to DEP and 1,2-NQ affects the endothelium-dependent response in the RTA without interfering with the RCC. Neuropeptides are unlikely to affect the pollutant-induced changes in the RTA. Acknowledgements: CAPES, CNPq, FAPESP. We thank MA Alves for technical assistance. Trans-resveratrol (RV) is a naturally occurring polyphenolic compound present in certain foods that has anticancer and anti-inflammatory properties.
The purpose of this study was to determine the effect of RV on the production of pro-inflammatory cytokines and reactive oxygen species stimulated by lipopolysaccharide (LPS) in glial cells. RT-PCR showed that RV (5, 25, 50 µM) dose-dependently inhibited 500 ng/ml LPS-induced TNF-a, IL-1b, IL-6, MCP-1 and inducible nitric oxide synthase mRNA expression. RV also inhibited the LPS-induced production of these cytokines (ELISA), nitric oxide and reactive oxygen species in a dose-dependent manner. Western blot analyses showed that resveratrol inhibited LPS-stimulated phosphorylation of ERK1/2 and JNK but not p38. An NF-kB reporter assay showed that RV could inhibit NF-kB activation by LPS in microglia and astrocytes. These results suggest that RV may inhibit LPS-induced microglial and astrocyte activation through the ERK1/2, JNK and NF-kB signaling pathways. Therefore, RV is a natural product with therapeutic potential against disease conditions of the CNS that involve an overproduction of pro-inflammatory cytokines and reactive oxygen species. (1), CS Patil (2), SV Padi (1), VP Singh (1). (1) University Institute of Pharmaceutical Sciences, Panjab University, Chandigarh, India (2) Pharmacology R&D, Panacea Biotec Ltd., Lalru, Punjab, India. Persistent stimulation of nociceptors and C-fibers by tissue injury causes hyperalgesia and allodynia through sensitization of nociceptors and facilitation of synaptic transmission in the spinal cord. An important participant in the inflammatory response of injured peripheral nerve may be nitric oxide (NO). The aim of the present study was to test the sensitivity to the PDE5 inhibitor sildenafil in the chronic constriction injury (CCI) model, a rat model of neuropathic pain. Sciatic nerve injury is associated with the development of hyperalgesia 14 days after the nerve ligation. Sildenafil (100 and 200 µg/rat, i.t.) produced a significant decrease in pain threshold at the higher dose, while the lower dose did not alter the nociceptive threshold.
The hyperalgesic effect of sildenafil was blocked by L-NAME and methylene blue (MB), which on per se treatment showed an antinociceptive effect in nerve-ligated rats. The results of the present study indicate a major activation of the NO-cGMP pathway in the chronic constriction injury model of neuropathic pain. The aggravation of the hyperalgesic response might be due to increased cGMP levels resulting in PKG-I activation and its upregulation. Glycine transporters (GlyT) 1 and 2 are expressed in glia and neurons, respectively. GlyT1 clears glycine released from glycinergic neurons at the synapse, thus terminating neurotransmission, and also regulates over-stimulation by glycine spilled over to NMDA receptors, while GlyT2 supplies glycine into synaptic vesicles in glycinergic neurons. Therefore, GlyT inhibitors could modulate inhibitory glycinergic or excitatory glutamatergic neurotransmission. The present study examined the effects of GlyT inhibitors on pain in animal models. Inhibitors of GlyT1 (sarcosine and Org25935) and GlyT2 (ALX1393 and Org25543) given by intrathecal (i.t.) injection reduced formalin-induced nociceptive behaviors. GlyT1 inhibitors reduced the allodynia score and reversed the reduction of paw withdrawal threshold in complete Freund's adjuvant (CFA)-induced inflammation in mice, but the antiallodynia effects appeared only after a latent period. On the other hand, GlyT2 inhibitors produced antiallodynia effects immediately after i.t. injection in CFA-treated mice. These inhibitors produced similar antiallodynia effects in the partial sciatic nerve ligation injury and streptozotocin-induced diabetic neuropathic pain models, whether given by i.t. or i.v. injection. Pretreatment with specific antagonists of the glycine site of the NMDA receptor abolished the latent period of the GlyT1 inhibitors and potentiated the antiallodynia effect. The glycine receptor antagonist strychnine, injected i.t., reversed the antiallodynia effect of i.v.
injected GlyT inhibitors. These results suggest that both GlyT1 and GlyT2 inhibitors, by enforcing glycinergic inhibitory neurotransmission in the spinal cord, produce a potent antinociceptive effect and may be novel candidates for pain control medication. The tumour microenvironment, in particular tumour-associated macrophages (TAMs), plays a role in determining tumour outcome. Despite strong causative links between inflammation and human gastric cancer progression, little is known of the role of TAMs in this disease. We have utilized our mouse model of gastric tumourigenesis, the gp130(757FF) mouse, to assess the effect of the gp130(757FF) mutation on macrophage function and to ascertain the role of macrophages in tumor formation. This mouse has a knock-in mutation in the IL-6 family cytokine receptor gp130 preventing SHP2/Ras/ERK signaling and leading to constitutive STAT3 transcriptional activation. Tumour development is inflammation dependent and requires IL-11, and development is inhibited by NSAID treatment or microbial eradication; however, an adaptive immune response is dispensable for tumorigenesis (Inflamm. Res., Supplement 3 (2007), Posters, S410). In the gastric antrum the gp130(757FF) mutation results in decreased ERK1/2 activation and constitutive phosphorylation of STAT3, and this abnormal signaling is replicated in macrophages. Antral STAT3 activation is unaffected by depletion of IL-6. However, in macrophages an absence of IL-6 results in higher STAT3 activation, demonstrating the anti-stimulatory role of IL-6 on macrophages. The gp130(757FF) mutation results in macrophages with decreased IL-6 and increased iNOS mRNA expression, reflecting a more basally activated phenotype. Manipulations of the gp130(757FF) mouse that reduce tumor size (e.g. antibiotic or NSAID treatment, or STAT3 hemizygosity) coincidently result in reduced macrophage infiltration of the antral mucosa.
Macrophages of gp130(757FF) mice display aberrant gp130 signaling, potentially resulting in an exaggerated response to stimuli. The key differences between mutant gastric mucosa and macrophages are changes in the transcription of target genes. (1), Y Riffo-Vasquez (2), S Brain (2), S Costa (1). (1) University of São Paulo, Brazil (2) King's College London, UK. Objectives: we have previously shown that simultaneous intra-tracheal injection of diesel exhaust particles (DEP) and 1,2-naphthoquinone (1,2-NQ) causes a potent inflammation in rat airways that is partially dependent on a neurogenic-mediated mechanism. This study investigates the mechanism of action of these pollutants using inflammatory assays and a histopathological approach in the lung and trachea of wild-type (WT) and TRPV1 knockout (KO) mice exposed to DEP and 1,2-NQ. DEP (1 mg/kg)- and 1,2-NQ (35 nmol/kg)-induced airway inflammation was assessed via 125I-labelled albumin 1 h post pollutant injection. The MPO assay and histopathology were performed 3 h after treatment. Staining of lung and trachea specimens with H&E provided a profile of cell infiltration in both WT and KO animals. Results: injection of DEP and 1,2-NQ evoked a potent plasma extravasation into the trachea and lung of WT, but not TRPV1 KO mice, suggesting these pollutants act via a TRPV1-mediated mechanism. In contrast, MPO activity in the airways of TRPV1 KO mice was exacerbated compared to WT mice. Likewise, the histopathology revealed high numbers of leukocytes and macrophages infiltrating the lungs and trachea of TRPV1 KO mice compared to WT. Conclusions: the inhibition of increased microvascular permeability in the airways of TRPV1 KO mice treated with the pollutants suggests that these receptors are the predominant mechanism involved in the inflammation.
However, neutrophils/macrophages accumulate more in TRPV1 KO mice, indicating that the lack of TRPV1 receptors up-regulates the production of inflammatory cells in response to pollutants, thus supporting a protective role for TRPV1 receptors. Acknowledgements: CAPES, CNPq, FAPESP. (1), N Sato (1,2), Y Endo (1), S Sugawara (1). (1) Division of Oral Immunology, Department of Oral Biology, Tohoku University Graduate School of Dentistry, Sendai, Japan (2) Division of Fixed Prosthodontics, Department of Restorative Dentistry, Tohoku University Graduate School of Dentistry, Sendai, Japan. Biotin is a water-soluble vitamin of the B complex and functions as a cofactor of carboxylases. Biotin deficiency causes alopecia and scaly erythematous dermatitis. Moreover, serum biotin levels are significantly lower in atopic dermatitis patients than in healthy subjects, indicating that biotin deficiency is involved in inflammatory diseases. However, the immunological effects of biotin on allergic inflammation remain unclear. In this study, we investigated the effects of biotin deficiency on metal allergy using a nickel (Ni)-allergy mouse model and the murine macrophage cell line J774.1. Female BALB/c mice (4 weeks old) received a basal diet or a biotin-deficient diet for 8 weeks. Ten days after sensitization by intraperitoneal injection of LPS and NiCl2, the mice were challenged with an intradermal injection of NiCl2 into the pinnae. Allergic inflammation was measured by ear swelling. The ethical board for nonhuman species of the Tohoku University Graduate School of Medicine approved the experimental procedure followed in this study. J774.1 cells were cultured in biotin-sufficient or -deficient medium for 4 weeks.
Ear swelling was significantly higher in biotin-deficient mice than in biotin-sufficient mice. IL-1 beta production by splenocytes was significantly higher in biotin-deficient mice than in biotin-sufficient mice. Moreover, biotin-deficient J774.1 cells produced significantly more IL-1 beta than biotin-sufficient J774.1 cells. To investigate the therapeutic effects of biotin supplementation, biotin-deficient mice received biotin-containing water for 2 weeks. Ear swelling was significantly lower in biotin-supplemented mice than in biotin-deficient mice. These results indicate that biotin deficiency aggravates allergic inflammation, and the augmentation of IL-1 beta production is probably involved in this deterioration. One of the most important targets of cytokine action is the blood vessels, which undergo structural and functional changes that result in activation of the endothelium. Applying the ELISA technique, levels of IL-18, ICAM-1 and E-selectin were studied in 77 patients with acute pancreatitis. Mediator levels were studied in arterial, venous and pancreatic ascites samples. According to the Atlanta criteria, mild pancreatitis was established in 33 patients and severe in 44 patients. The highest levels of IL-18 were noted in ascites and the lowest in arterial samples. The highest concentration of adhesion molecules was in venous samples and the lowest in ascites. There was a clear correlation between the levels of IL-18 and adhesion molecules and the severity of pancreatitis. During the first week the levels of IL-18 gradually increased in patients with severe pancreatitis, while in patients with edematous pancreatitis the levels decreased starting from the third day. ICAM-1 levels gradually increased during the first three days with a subsequent decrease after this term. The highest levels of E-selectin were noted at the time of admission. A clear correlation between IL-18 and adhesion molecules was noted in both groups of patients.
Besides that, a clear, strong correlation was observed between IL-18 and the quantity of circulating granulocytes, and between E-selectin and hematocrit, in patients with necrotizing pancreatitis. Our study confirms the importance of endothelial activation as part of the systemic inflammatory response in patients with acute pancreatitis. Subsequently EOS infiltrate the tissues, are activated, and release mediators inducing the late-phase response. This is characterized by tissue damage and repair and, in chronic reactions, tissue remodeling and fibrosis. MC/EOS cross-talk by physical and non-physical contact is an essential feature of late-phase and chronic allergic reactions. We have previously described an MC/EOS interaction facilitated by soluble mediators and shown to enhance allergic inflammation. Still, the pathways that mediate MC/EOS cross-talk in allergy are not fully characterized. Methods: human cord blood-derived MC were co-cultured with peripheral blood EOS and activated with anti-IgE. MC/EOS couples in co-culture and in human nasal polyp tissue sections were specifically stained and counted using microscopy. Expression of surface molecules was analyzed by FACS. MC activation was measured by chromogenic assays for beta-hexosaminidase. Relevant surface molecules were neutralized using antibodies, to assess interference with couple formation and activation. Results: MC and EOS physically interact, forming well-defined couples in vitro and in vivo. In the presence of EOS, MC are more releasable under baseline or IgE-activating conditions. This effect is partially mediated by CD48 and DNAM-1 on MC, and 2B4 and nectin-2 on EOS. We describe a novel physical interaction between MC and EOS that we name "the allergic synapse". This synapse may upregulate allergic reactions, thus serving as a target for therapeutic intervention in allergic and inflammatory diseases.
Methods: MC were obtained from cord blood mononuclear cells (CBMC; 6-8 weeks with SCF, interleukin-6 and prostaglandin E2). CBMC were cultured for 5 days with myeloma IgE (2 mg/ml) and then activated with rabbit anti-human IgE antibodies (5 mg/ml) for 1, 3 and 6 h at 37°C. Activation was measured by beta-hexosaminidase release, determined by an enzymatic-colorimetric assay. The expression of FLIP, Mcl-1, Bcl-2, Bcl-xL, Bak and Bax was assessed by immunoblot analysis. Results: two anti-apoptotic proteins were found to be upregulated: FLIP, which is involved in the extrinsic apoptotic pathway, and Mcl-1, which is mainly implicated in the intrinsic apoptotic pathway. In contrast, the expression of the two other anti-apoptotic proteins examined (i.e. Bcl-2 and Bcl-xL) was not altered. Likewise, the expression of pro-apoptotic proteins of the Bcl-2 family (i.e. Bak and Bax) was either undetectable or unchanged. Conclusions: our findings reveal that IgE-dependent activation of human MC mainly induces a selective increase in the expression of two pro-survival molecules. This may be one of the mechanisms that underlie MC hyperplasia in chronic allergic inflammation. (1), C Armishaw (2), Z Yang (1), H Cai (1), P Alewood (2), C Geczy (1). (1) University of New South Wales, Faculty of Medicine, Sydney, Australia (2) University of Queensland, Institute for Molecular Bioscience, St Lucia, Australia. Purpose: human S100A8, S100A9 and S100A12 are closely related proteins associated with inflammation. S100A12 is expressed in the human genome, but not in the mouse. S100A12 is a potent monocyte chemoattractant and mast cell (MC) activator. Mouse (m) S100A8 and human (h) S100A8 share 58% structural identity, but the hinge domains are more divergent. The mS100A8 hinge (mS100A8(42-55)) is a potent chemoattractant for leukocytes, whereas this sequence in hS100A8 (hS100A8(43-56)) is inactive.
Methods: the S100A12 hinge domain and its alanine-scan mutants were synthesized and their activities tested using THP-1 cells or murine MC in vitro, and by mouse footpad injection. Results: the S100A12 hinge domain (S100A12(38-53)) was chemotactic for monocytes and MC, and provoked MC degranulation in vitro, and oedema and leukocyte recruitment in vivo. In contrast to S100A12, the hinge domain only weakly induced cytokine production (IL-6, IL-8) by macrophages. Residues essential for oedema were hydrophobic in nature (Leu40, Ile44, Ile47 and Ile53). N42 and I44 were essential for responses provoked by 10^-12 M S100A12(38-53), whereas mutation of K38, L40, N46, I47, D49, K50 and I53 significantly reduced migration with 10^-9 M S100A12(38-53). Conclusions: S100A12 and mS100A8 may be functional chemoattractant equivalents; S100A12 may have arisen by duplication of human S100A8. Isoleucine residues in the S100A12 hinge domain are essential for its pro-inflammatory properties. (1), G Kiriakopoulou (1), E Tsimara (2), A Voultsou (2), K Zarkadis (1). (1) General Hospital of Zakynthos, Greece (2) Medical Centre of Katastari, Zakynthos, Greece. Schistosomiasis is a disease endemic in 74 tropical countries; the endemic areas are South America, the Far and Middle East, and Africa. Our aim is to present a case concerning a parasitic infection not endemic in Greece. Patient: a 30-year-old man who immigrated to Greece from Pakistan (beside the Aparkenar river) a year ago. He presented with anlage, headache and weight loss. He is a swimmer. He also mentioned a fever before migrating to Greece. Clinically: marked pallor; mild and diffuse abdominal tenderness, with slightly increased bowel sounds and a small degree of hepatomegaly on deep palpation. Laboratory examination: leucocytes 9,030, Eo 24.8%, Hb 8.2 g/dl, Ht 27.2%, MCV 65.6 fl, MCH 19.1 pg, thrombocytes 419,000, blood sedimentation 10 mm, Fe 10 mg/ml, ferritin 2.98 ng/ml. Normal biochemical examinations.
U/S: liver slightly increased in size (16.5 mm), hepatoportal vein of normal amplitude. Parasitology of feces: in the direct preparation with Lugol of the second sample, large scattered oval ova (130 x 60 µm) with a thin wall and a lateral spine were observed, as well as three adult worms (>10 mm): Schistosoma mansoni. Treatment: praziquantel was given. Recheck after one month: blood examination leucocytes 11,230, Eo 9.8%, Hb 10.2 g/dl, Ht 33.6%; parasitology of feces negative. In the diagnosis of anemia combined with fever in immigrant patients, schistosomiasis should be taken into account, especially when sideropenic anemia is accompanied by intense eosinophilia, because the disease is mostly a delayed hypersensitivity reaction. Bronchial asthma is a chronic airway inflammatory disease caused by immune cells such as T lymphocytes and eosinophils. Recently, high-sensitivity CRP (hs-CRP) assays have become available for detecting small changes in CRP levels within the normal range, allowing for the assessment of subclinical inflammation. This study was undertaken to investigate the relationship between hs-CRP and bronchial asthma. We collected blood samples from 109 patients with bronchial asthma with or without attacks and measured serum eosinophil cationic protein (ECP) and pulmonary function as well as serum hs-CRP. Serum CRP levels in patients without attacks (average 0.473 mg/l; p < 0.001) and with attacks (average 0.908 mg/l; p < 0.001) were significantly higher than those of normal controls (average 0.262 mg/l). Serum hs-CRP levels were inversely correlated with FEV1.0% in asthmatic patients (r = -0.4915, p < 0.01). In conclusion, these results indicate that serum hs-CRP, as well as ECP, may be related to the state of asthma exacerbation and allergic inflammation.
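The inverse relationship between serum hs-CRP and FEV1.0% reported in the asthma abstract above is a Pearson correlation coefficient. As a minimal sketch of how such an r value is computed (the numbers below are hypothetical illustrations, not the study's data):

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Covariance numerator and the two standard-deviation terms
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical values only: serum hs-CRP (mg/l) vs FEV1.0 (% predicted).
# Higher CRP paired with lower FEV1.0 yields a negative r, i.e. an
# inverse correlation of the kind the abstract reports.
hs_crp = [0.2, 0.4, 0.5, 0.9, 1.1]
fev1 = [95, 88, 85, 72, 70]
r = pearson_r(hs_crp, fev1)
```

In practice such coefficients (and their p-values) would come from a statistics package rather than a hand-rolled function; the sketch only makes explicit what "inversely correlated, r = -0.4915" summarizes.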
Objectives: the PSHR assay can be used to test the biologic activity of allergens since it mimics the effector phase of a type I hypersensitivity reaction. In this study we tested methods for removing IgE antibodies (stripping) from donor basophils. Moreover, a method was developed for improving the antigen specificity profile in PSHR. Proof of concept was provided using absorption with complete allergen extracts. Methods: buffy coats were screened to exclude reactivity against the relevant allergens. Subsequently PBMCs were purified and basophils were stripped with either a lactate or a phosphate buffer. Cells were passively sensitized with patient sera and challenged with allergen extracts at various concentrations. Released histamine was measured spectrofluorometrically (HR test, RefLab). The cutoff was 10% HR. For the absorption experiments, patient sera (from a peanut-allergic and a codfish-allergic patient) were incubated with streptavidin-coated sepharose beads coupled with biotinylated allergens (peanut, and BSA as control). After centrifugation the supernatant was used to passively sensitize stripped basophils. HR was measured as described above. Results: stripping experiments using the two buffers only partially removed surface IgE, but passive sensitization of the stripped basophils was equally effective, enabling determination of sub-nanogram quantities of peanut allergen. Absorption experiments showed that it was possible to specifically remove peanut-specific IgE from patient serum. Removal of specific IgE reactivity to peanut extract was verified by western blotting. Conclusions: using peanut extract as a model, it was demonstrated that antigen specificity can be modified in PSHR. (1), SK BK (1), V Sharma (2), RP Bhandari (3). (1) Nepal Medical College Teaching Hospital, Nepal (2) All India Institute of Medical Sciences, New Delhi. Background: Nepal has one of the highest maternal mortality rates in the world.
This study aimed to evaluate the incidence, disease pattern, and risk factors for thromboembolism in pregnant Nepalese women. Methods: women with thromboembolic diseases were identified and their case records retrieved and reviewed from January 1998 to December 2006. Demographic characteristics were compared between women with and without thromboembolism. The total number of deliveries over the study period was 16,993, giving an incidence of 1.88 per 1000 deliveries. There were two cases of pulmonary embolism, one of which resulted in a maternal death. The others had deep vein thrombosis, of which over 80% were limited to calf veins only. The ultrasound examinations requested for suspected deep venous thrombosis before and after the event of the maternal death were 1.62 and 10.7 per 1000 deliveries (p < .001); the corresponding cases of deep venous thrombosis diagnosed were 0.29 and 2.94 per 1000 deliveries, respectively (p < .001). The majority (75%) of cases were diagnosed in the postpartum period, mainly after cesarean delivery. Women with venous thromboembolism were older, had a higher BMI, and had a higher incidence of preeclampsia. There were approximately twice as many postpartum as antepartum events. Blood group A, multiple pregnancy, caesarean section, cardiac disease, delivery at a gestational age of <36 weeks, a BMI of 25 or more, and a maternal age of 35 or over were all found to increase the incidence of venous thromboembolism. The long-standing belief that thromboembolism is rare among Nepalese is at least partly due to under-diagnosis; most of these events are deep vein thromboses occurring in the postpartum period, and primary prevention is essential in a developing country like Nepal. (1), CH Ladel (1), T Ruckle (2), C Rommel (2), R Cirillo (1). (1) RBM-Merck Serono, Colleretto Giacosa, Italy (2) Serono Pharmacological Research Institute, Geneva, Switzerland. Rheumatoid arthritis (RA) is a severe articular disease.
Massive leukocyte activation and infiltration into joints result in cartilage and bone destruction. Blockade of the PI3K signalling pathway has been demonstrated to be curative in a murine model of RA, collagen-induced arthritis (CIA). In this study we explored the molecular mechanisms by which PI3K signalling inhibition results in clinical amelioration of disease symptoms. AS605858, a novel isoform non-selective yet specific class-I PI3K inhibitor, administered at 15 mg/kg twice a day for 8 days to mice showing signs of arthritis (paw swelling and inflamed digits), induced a significant amelioration of the disease course. At the end of treatment, post-arthritic paws were removed and the phosphorylation levels of Akt (p-Akt), a downstream target of PI3K-mediated signalling, were determined by semi-quantitative immunohistochemistry; immunophenotyping of circulating cells by flow cytometry was also conducted. Akt phosphorylation was significantly enhanced by disease induction, and AS605858 was able to decrease its levels down to values comparable with naïve animals. Control and AS605858-treated mice were bled before treatment, after two days of treatment and at the treatment's end. No changes in cellular composition (morphology and hematology parameters) between the experimental groups were observed. T cell numbers were not affected; however, a significant decrease in natural killer, memory and regulatory T cells was observed after AS605858 administration. Finally, a non-significant, moderate reduction in B cell number was also observed. These data demonstrate that the efficacy of AS605858 in arthritis models is mediated by direct modulation of the target, resulting in a mixed anti-inflammatory (via PI3Kgamma) and immunosuppressive (via PI3Kdelta/alpha/beta) activity. (1) Institute of Biomedical Science, University of São Paulo, Brazil (2) IBILCE, São Paulo State University, Brazil. Epidemiologic data suggest that female sex hormones are involved in the pathophysiology of allergic asthma.
We investigated in rats the immunomodulatory potential of estradiol and progesterone on the expression of allergic asthma. Seven days after being ovariectomized (OVX), groups of rats were sensitized with ovalbumin (OA). Fourteen days after sensitization, the animals were OA-challenged and used 1 day thereafter. Allergic, sham-operated animals were used as controls. Some OVX animals were treated with estradiol (280 µg/kg) or progesterone (200 µg/kg) 24 h before being challenged. Mast cell degranulation was quantified in samples of isolated, OA-challenged bronchi. We analysed the airway reactivity of inner bronchi to methacholine and the functional activity of cells. OVX caused a reduction of the allergic lung inflammation and bronchial hyperresponsiveness compared with intact female rats. Estradiol reverted the reduced cellular recruitment into the lungs, whereas progesterone reduced the pulmonary inflammatory response and reverted the bronchial hyperresponsiveness. Cultured BAL and bone marrow cells from allergic rats increased the release of IL-10 and reduced that of IL-1 and TNF. The release of IL-4 by bone marrow cells was significantly reduced. These effects were reverted by estradiol, and progesterone reduced IL-4 and increased IL-10 production in BAL and increased that of IL-1 and TNF in bone marrow cells. Bronchial mast cell degranulation upon direct contact with OA in OVX rats was less than in controls. It is suggested that female sex hormones can modulate the allergic lung inflammation in rats by acting on cellular migration/activity and airway responsiveness.

Objectives: To study the participation of FSH in the modulation of E- and L-selectin, ICAM-1, and Mac-1 expression in ovariectomized (OVX) rats made allergic. Methods: Female rats were sensitized (OA/alum) after 1 (OVX-1) or 7 days (OVX-7) of OVX or were sham-operated (SH). Fourteen days thereafter, animals were challenged (OA, 1%; aerosol) and sacrificed 24 h after.
Bronchoalveolar lavage (BAL) was collected, and flow cytometry analysis of ICAM-1, Mac-1, and L-selectin expression was performed. In parallel, lungs were frozen and sections were analysed for E-selectin expression by immunohistochemistry. Results: At day 1, E-selectin expression increased (OVX-1 = 1.8±0.08) and at day 7 it decreased (OVX-7 = 1.3±0.1) as compared to the respective controls (SH-1 = 1.45±0.08; SH-7 = 1.9±0.08). Estrogen treatment reverted this profile in both groups (OVX-1+E = 1.37±0.09; OVX-7+E = 1.8±0.1). Mean fluorescence intensity of BAL cells showed an increase of Mac-1 expression (OVX-1 = 19±0.5 vs SH-1 = 15.5±1.0), ICAM-1 (OVX-1 = 18.1±1.1 vs SH-1 = 13.3±0.2), and L-selectin (OVX-1 = 11.2±0.8 vs SH-1 = 9.5±0.12) at day 1, i.e., in the OVX-1 group. On the other hand, a decrease in ICAM-1 (OVX-7 = 14.2±0.75 vs SH-7 = 19±1.4) and Mac-1 expression (OVX-7 = 20.7±1.4 vs SH-7 = 27.5±1.5) was seen in the OVX-7 group. Conclusions: Oscillation of hormone levels during immunization with OA increased (OVX-1) and decreased (OVX-7) the expression of adhesion molecules. Estradiol treatment reverted this effect. These results suggest that FSH modulates the allergic lung inflammation in rats by acting on cell

(1), S Lim (1), Y Lin (1), BP Leung (1), C Thiemermann (2), WSF Wong (1); (1) National University of Singapore; (2) The William Harvey Research Institute, London, UK. Glycogen synthase kinase 3β (GSK-3β) is known to regulate various cellular functions including inflammatory responses. We hypothesized that inhibition of GSK-3β may have anti-inflammatory effects in a mouse asthma model. BALB/c mice were sensitized with ovalbumin (OVA) and challenged with aerosolized OVA. TDZD-8, a non-ATP-competitive GSK-3β inhibitor, was administered by i.v. injection one hour before OVA challenge. TDZD-8 significantly reduced the OVA-induced eosinophilia in a dose-dependent manner and inhibited the levels of IL-5, IL-13, and eotaxin in bronchoalveolar lavage (BAL) fluid.
TDZD-8 also suppressed the mRNA levels of ICAM-1, VCAM-1, and chitinase proteins in the lung. Histological studies revealed that TDZD-8 substantially reduced the inflammatory cell infiltration and mucus secretion in the lung tissue. TDZD-8 also decreased the OVA-specific IgE level in the serum. In addition, the OVA-induced increase in airway resistance and reduction in dynamic compliance were attenuated by TDZD-8. Our findings suggest that inhibition of GSK-3β may have therapeutic potential for the treatment of allergic airway inflammation.

(1), A Yildirim (1), F Ercan (2), N Gedik (3), M Yuksel (4), Inci Alican (1). Materials and methods: Sprague-Dawley rats (200-250 g) were exposed to a 90 °C (burn group) or 25 °C water bath (control group) for 10 s under ether anesthesia. ADM (100 ng/kg; s.c.) was administered 10 min before the burn, and all rats were decapitated 24 h after the burn insult. Trunk blood was collected for the measurement of TNF-α level, and the lung, ileum, and kidney samples were stored for microscopic scoring and for determination of lipid peroxidation (LP), myeloperoxidase (MPO) activity, and formation of reactive oxygen metabolites (ROMs) using a chemiluminescence assay. Results: Burn resulted in severe morphologic damage in the tested tissues. LP increased in lung and kidney (p<0.01-0.001), and MPO activity showed a marked increase in all tested tissues (p<0.001) of the burn group. ADM reversed these parameters effectively (p<0.05-0.001). Luminol chemiluminescence levels showed increases in both ileum and lung (p<0.01-0.001), whereas lucigenin chemiluminescence levels increased in ileum and kidney (p<0.01) of the burn group. ADM treatment was also beneficial in reducing chemiluminescence levels to near-control values (p<0.01). ADM reduced the plasma TNF-α level (p<0.01), which showed a significant increase in burned animals compared to controls (p<0.001).
Conclusions: ADM is beneficial in remote organ damage following burn insult via decreasing neutrophil infiltration, ROM generation, LP, and the release of the pro-inflammatory cytokine TNF-α.

Kalpana Panday (1), SD Joshi (2), KR Reddy (3). We determined the crystal structure of human hematopoietic prostaglandin (PG) D synthase (H-PGDS) as the quaternary complex with glutathione (GSH), Mg2+, and an inhibitor, HQL-79, which has anti-inflammatory activities in vivo, at 1.45 Å resolution. HQL-79 was found to reside within the catalytic cleft between Trp104 and GSH in the quaternary complex. HQL-79 inhibited H-PGDS competitively against the substrate PGH2 as well as non-competitively with respect to GSH. Surface plasmon resonance analysis revealed that HQL-79 bound to H-PGDS with an affinity that was 10-fold higher in the presence of GSH and Mg2+ than in their absence. HQL-79 selectively inhibited PGD2 production by human H-PGDS-expressing megakaryocytes but only marginally affected the production of other prostanoids, suggesting tight functional engagement between H-PGDS and cyclooxygenase. Orally administered HQL-79 inhibited antigen-induced production of PGD2 and airway inflammation in mice without affecting the production of PGE2 and PGF2α. Knowledge about this quaternary structure should accelerate the structure-based development of novel anti-inflammatory drugs that inhibit PGD2 production specifically.

Introduction: It has been shown that high levels of mechanical ventilation produce lung injury as well as local inflammation. This study was designed to evaluate how the generation of inflammatory mediators by an over-stretched lung affects the non-hyperventilated lung. Methods: Male Wistar rats (250-275 g) were anesthetized and paralyzed, and the two lungs were independently intubated. Differential ventilation was applied for 3 h (70 breaths/min).
One lung was subjected to hyper-ventilation (20 ml/kg/lung) and the other was ventilated with a normal volume (5 ml/kg/lung). In a control group, both lungs were ventilated with a normal volume. After sacrifice, samples of lung, plasma, and liver were collected. The expression of the pro-inflammatory chemokine MIP-2 was evaluated by RT-PCR, and oedema was assessed by the ratio between the wet and dry weight of the lung. Systemic inflammation was estimated in the liver by measuring the expression of TNF-α by RT-PCR as well as its levels in plasma. The hyper-ventilated lung showed an increase in the wet-to-dry weight ratio and in MIP-2 expression compared to the normally ventilated lung. No differences were found in oedema, nor in expression of MIP-2, between the normally ventilated lung and control lungs. No significant changes were observed in liver expression and plasma levels of TNF-α as a consequence of unilateral lung hyper-ventilation. The over-straining of the hyper-ventilated lung led to a local inflammatory response without systemic effects. The normally ventilated lung is not affected by the inflammatory process triggered in the over-strained lung.

(1), J-Y Gillon (2), V Lagente (1), E Boichot (1); (1) University of Rennes 1, Rennes, France; (2) Serono International S.A., Geneva, Switzerland. Macrophage elastase (MMP-12) is a metalloproteinase involved not only in emphysema but also in the inflammatory process associated with COPD (chronic obstructive pulmonary disease). The mechanism of action of MMP-12 in the development of the pulmonary inflammatory process is still unknown. In the present study, we investigated the effect of recombinant human MMP-12 (rhMMP-12) on IL-8/CXCL8 release from the alveolar epithelial cell line A549, and we explored the underlying mechanisms. A549 cells were stimulated with rhMMP-12 (1×10⁻³ to 2×10⁻¹ U/ml) for 6 hours, and the IL-8/CXCL8 level in the supernatant was determined by ELISA.
Involvement of MAP (mitogen-activated protein) kinases was studied by western blotting and also using chemical inhibitors. NF-κB activation was examined with the TransAM NF-κB p65 kit. We observed that MMP-12 elicited IL-8/CXCL8 release in a dose-dependent manner. This production could be prevented by pretreatment for 1 hour with a selective MMP-12 inhibitor (AS111793, 1-30 µM) or with the non-selective inhibitor batimastat (1-30 µM). The IL-8/CXCL8 production was also inhibited by actinomycin D (5 µg/ml), ERK1/2 inhibitors (U0126, 5 µM, and PD98059, 10 µM), and NF-κB inhibitors including BAY11-7082 (10 µM) and NF-κB activation inhibitor (10 nM), whereas the p38 kinase inhibitor (SB203580) had no effect. Stimulation with MMP-12 was rapidly followed by phosphorylation of ERK1/2 (5 min) and by NF-κB nuclear translocation and activation (1 h). The NF-κB activation was not inhibited by treatment with U0126. These data suggest that the alveolar epithelium is a target of MMP-12, since it upregulates gene expression and release of IL-8/CXCL8 via ERK1/2 and NF-κB activation, but these two pathways appear to be distinct.
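Several of the abstracts above summarize dose-response experiments as IC50 values or as inhibition achieved at a given dose. As a minimal sketch of the arithmetic behind such summaries, the code below computes percent inhibition of mediator release and a linearly interpolated IC50 from a hypothetical dose-response; all function names, doses, and ELISA readings are invented for illustration and are not taken from any abstract.

```python
# Hypothetical illustration of summarising a dose-dependent inhibition
# curve (e.g. IL-8/CXCL8 release vs. inhibitor concentration).
# All concentrations and readings below are invented.

def percent_inhibition(control, treated):
    """Percent reduction of mediator release relative to the untreated control."""
    return 100.0 * (control - treated) / control

def interpolated_ic50(doses, inhibitions):
    """Linearly interpolate the dose giving 50% inhibition.

    `doses` must be sorted ascending with `inhibitions` (in %) increasing;
    returns None if 50% inhibition is never bracketed.
    """
    for (d_lo, i_lo), (d_hi, i_hi) in zip(zip(doses, inhibitions),
                                          zip(doses[1:], inhibitions[1:])):
        if i_lo <= 50.0 <= i_hi:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            return d_lo + frac * (d_hi - d_lo)
    return None

# Invented ELISA readings (pg/ml) at increasing inhibitor doses:
control = 800.0
doses = [1.0, 3.0, 10.0, 30.0]          # arbitrary concentration units
treated = [720.0, 560.0, 320.0, 120.0]  # mediator release at each dose

inhib = [percent_inhibition(control, t) for t in treated]  # 10, 30, 60, 85 %
ic50 = interpolated_ic50(doses, inhib)                     # between 3 and 10
```

Linear interpolation is only a rough stand-in here; published IC50 values are normally obtained by fitting a sigmoidal (four-parameter logistic) model to the full curve.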
Agents that are associated with lung inflammation, such as cigarette smoke and lipopolysaccharide (LPS), induce the production of pro-inflammatory chemokines in lung epithelial cells in vitro, and the induction of interleukin (IL)-8, in particular, is often used as a measure of relative toxicity. In this study we compared mRNA expression and mediator release in NCI-H292 human lung epithelial cells exposed to lung toxicants, namely: cigarette smoke total particulate matter (TPM), LPS, bleomycin, diesel exhaust particles, residual oil fly ash (ROFA), carbon black, and vanadyl sulphate. Polystyrene, poly(methyl methacrylate), and the TPM vehicle, dimethyl sulphoxide, were used as negative controls. Confluent monolayers of H292 cells were exposed to serial dilutions of the test agents in serum-free medium for 24 hours. The conditioned medium was then removed and assayed for a range of pro-inflammatory cytokines and other selected mediators by Luminex technology. The levels of gene expression of IL-8, matrix metalloprotease-1 (MMP-1), the gel-forming mucin MUC5AC, heparin-binding epidermal growth factor-like growth factor (HB-EGF), and the cytochrome P450s CYP1A1 and CYP1B1 were determined by quantitative polymerase chain reaction. All of the toxicants induced similar responses, whereas the negative controls were largely ineffective. Such a panel of biomarkers may enable an in vitro assessment of the potential to cause lung inflammation. Moreover, the use of several biomarkers could give a more accurate picture of toxicity than the determination of IL-8 alone, particularly in the case of agents such as TPM, where the conventional vehicle is found to have some biological activity.

Respiratory tract infections are a major public health issue. Prevention in high-risk populations relies mainly on vaccination. Vaccination is highly recommended to decrease absenteeism. Immunomodulating drugs are important tools in the treatment of infectious diseases.
Immunomodulatory agents probably contribute to decreasing exacerbation rates. The authors present the different classes of immunomodulators that are currently in use. The vaccine, created from a bacterial protein, re-educates the immune system to stop inflammation. By preventing infections, vaccines prevent the development of a strong T helper (Th1) response. The challenge is now to inform about new possibilities for optimal prevention of respiratory tract infections.

Deoxypodophyllotoxin (DPT) is a medicinal herbal product isolated from Anthriscus sylvestris. It inhibits the cyclooxygenase-2 (COX-2)- and COX-1-dependent phases of prostaglandin D2 (PGD2) generation in bone marrow-derived mast cells (BMMC) in a concentration-dependent manner, with IC50 values of 1.89 µM and 65.3 µM, respectively. This compound inhibited the COX-1- and COX-2-dependent conversion of exogenous arachidonic acid to PGD2 in a dose-dependent manner, with IC50 values of 0.01 µM and 12.1 µM, respectively. However, DPT did not inhibit COX-2 protein expression up to a concentration of 30 µM in the BMMC, indicating that DPT directly inhibits COX-2 activity. Furthermore, this compound consistently inhibited the production of leukotriene C4 (LTC4) in a dose-dependent manner, with an IC50 value of 0.37 µM. These results clearly demonstrate that DPT has a dual COX-2/5-LOX inhibitory activity in vitro. Therefore, this compound might provide the basis for novel anti-inflammatory drugs. In order to determine the anti-allergic and anti-asthmatic activity of DPT in vivo, we used the rat PCA model, activated by anti-dinitrophenyl IgE, and an ovalbumin/alum-induced mouse asthmatic model, respectively. As a result, DPT strongly inhibited the PCA reaction, and it reduced the number of infiltrated eosinophils in bronchoalveolar lavage fluid. Furthermore, DPT decreased the mRNA levels of the Th2 cytokines in a murine asthmatic model.
In addition, northern blot analysis showed that DPT also reduced both the eotaxin and arginase I mRNA levels in a dose-dependent manner. These results suggest that DPT may be beneficial in regulating various inflammatory diseases.

An imbalance of proteases and anti-proteases in the lung has been implicated in the pathogenesis of chronic obstructive pulmonary disease (COPD), a smoking-related disorder associated with accelerated lung function decline. In particular, the activity of matrix metalloproteases (MMPs) has been implicated in driving both the inflammation and the parenchymal destruction observed in COPD patients. Here, we tested whether a broad-spectrum MMP inhibitor, PKF-242484, could block the inflammation induced by an acute exposure to cigarette smoke in two strains of mice, BALB/c and C57/BL6. Animals were administered the compound (1-10 mg/kg) either per os (p.o.) or intranasally (i.n.) 1 hour before and 5 hours after exposure to smoke on three consecutive days. Bronchoalveolar lavage (BAL) was performed, and lungs were flash-frozen for inflammatory marker analysis. PKF-242484 dose-dependently reduced lung neutrophilia in BALB/c mice when dosed either p.o. (~45% at 10 mg/kg; p < 0.01) or i.n. (~60% at 10 mg/kg; p < 0.01). However, the compound had no clear effect on BAL neutrophil infiltration in C57/BL6 mice by either route of administration. Interestingly, in both strains BAL macrophages dose-dependently trended towards an increase when the compound was dosed p.o. and a decrease when dosed locally (p < 0.05). Examination of lung tissue cytokine levels revealed that while smoke exposure increases IL-1β, KC, and MIP-2, PKF-242484 had little effect on these cytokines. These data suggest that the ability of broad-spectrum MMP inhibitors to inhibit smoke-induced acute neutrophil inflammation is strain-dependent, while their ability to limit macrophage infiltration may be route-dependent.
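Percent-reduction figures such as the "~45% at 10 mg/kg" reported above are simple group-mean arithmetic. The sketch below shows one way such numbers could be derived from BAL neutrophil counts, together with a weak monotonicity check for dose dependence; all counts, doses, and helper names are hypothetical and not taken from the abstract.

```python
# Hypothetical sketch of deriving percent reductions from BAL neutrophil
# counts. All group sizes, doses, and counts below are invented.

def mean(xs):
    return sum(xs) / len(xs)

def percent_reduction(vehicle_counts, treated_counts):
    """Mean percent reduction of a treated group vs. the vehicle group."""
    v, t = mean(vehicle_counts), mean(treated_counts)
    return 100.0 * (v - t) / v

def dose_dependent(reductions):
    """True if the reduction grows (weakly) with each increasing dose."""
    return all(a <= b for a, b in zip(reductions, reductions[1:]))

vehicle = [120.0, 100.0, 110.0]        # hypothetical BAL neutrophil counts
by_dose = {1: [100.0, 95.0, 90.0],     # dose (mg/kg) -> treated counts
           3: [80.0, 75.0, 85.0],
           10: [60.0, 55.0, 65.0]}

reductions = [percent_reduction(vehicle, by_dose[d]) for d in sorted(by_dose)]
# ~14%, ~27%, ~45% at 1, 3, and 10 mg/kg; dose_dependent(reductions) is True
```

In practice such group comparisons would also carry a significance test (the abstract reports p < 0.01), which this arithmetic sketch deliberately omits.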
To investigate the role of soluble epoxide hydrolase (sEH) in the regulation of the pulmonary inflammatory response, we used sEH-deficient mice in a locally administered LPS model. Male sEH-deficient mice (KO) and their wild-type (WT) littermates were exposed to inhaled LPS. Four hours later they were sacrificed and BAL was performed. Differential counts and cytokine levels in BAL were evaluated. Results: LPS induced a significant increase in total cell number and neutrophil number in BAL in both the sEH-deficient mice and the WT mice; no significant differences between the groups were seen (Table 1). Cytokine analysis showed significant increases in TNF-α, IL-6, KC, GM-CSF, MCP-1, IL-1β, and RANTES in the LPS-exposed WT mice. No significant differences were seen between LPS-exposed WT and KO mice except for a significant increase in TNF-α in KO mice. Our results show that sEH has no pivotal role in the regulation of the acute inflammatory response to lung-administered LPS.

Fam. Liliaceae. Based on literature data signalling the presence of steroidal saponins in the rhizomes, isolation, identification, and quantitative determination of these compounds were carried out. The anti-inflammatory activity was tested in a non-immune chronic inflammation model, the cotton-pellet granuloma test in rats, and in an immune chronic inflammation model, the arthritis test induced by Freund's adjuvant in rats. In the first test, the anti-inflammatory effect of the steroidal saponins at 600 mg/kg was weak and statistically insignificant. In the arthritis test, the steroidal saponins at 600 mg/kg showed anti-inflammatory activity, influencing especially the primary response, but also the secondary one, in the last part of the experiment (after 16 days). In both tests, the weight of the suprarenal glands was modified.

Objectives: Dendritic cells (DCs) are professional antigen-presenting cells. Many types of DC with subtle differences in phenotype have been reported in several organs.
The existence of DCs has been reported in the synovial tissue of rheumatoid arthritis (RA); however, the details of RA DCs still remain unclear. In this study, we generated a new lineage of DC with GM-CSF (+TNF-α) and investigated their functions. Furthermore, their capacity for osteoclastogenesis was examined. Methods: Monocyte-derived DCs or macrophages were generated in (1) TNF-α + GM-CSF, (2) GM-CSF, (3) IL-4 + GM-CSF, or (4) M-CSF. The phenotypes of these cells were analyzed by morphological examination and flow cytometry (FACSCalibur). Cell proliferation was examined by WST-8 assay. DC functions were assessed by antigen-presenting ability (MLR assay), cytokine production (ELISA), and endocytosis (FITC-dextran uptake). Concerning osteoclastogenesis, monocyte-derived DCs were incubated with RANKL and M-CSF. TRAP staining was performed, and resorption ability was assessed on an Osteologic cell culture system and dentine slices. Results: These cells were dendritic-like, and their surface markers were CD1a(low) CD11b+ CD11c+ CD14+ CD86+ CD83(low) HLA-DR+ DC-SIGN(low), different from conventional DCs or macrophages. They had an antigen-presenting ability to induce naïve CD4+ T cell proliferation, IL-12 production, and endocytosis. In the presence of RANKL and M-CSF, they differentiated into multinucleated TRAP-positive cells with bone resorption ability, which was strengthened by TNF-α. We generated a new lineage of DC with GM-CSF (+TNF-α). These DCs seem to play a pivotal role in inflammatory arthritis under TNF immunity.
The clinical effectiveness of rituximab and other B cell-attenuating rheumatoid arthritis (RA) therapies has increased interest in understanding the role of B cells in RA pathogenesis. The possible mechanisms underlying the effectiveness of rituximab were investigated by performing biosimulation research in the Entelos RA PhysioLab platform, a mathematical model of the joint of an RA patient. The platform dynamically integrates the contributions of immune cells (T cells, B cells, and macrophages), resident cells (fibroblast-like synoviocytes and chondrocytes), and mediators to the joint inflammation and structural damage observed in RA. The B cell lifecycle is represented in the platform, as well as effector functions such as antigen presentation, mediator and autoantibody production, and immune complex formation. The dynamics of these B cell properties were calibrated to reproduce reported experimental behaviors and clinical outputs from RA patients (e.g., ACR score and radiographic progression rates). An assessment of the contribution of individual B cell functions to clinical outcome suggests that plasma cell-derived immune complexes are key modulators of inflammation in RA patients. In contrast, pro-inflammatory cytokine production by B cells contributes minimally to synovial hyperplasia but plays a role in the progression of structural damage. Immune complex formation leads to monocyte activation, increased mediator production by macrophages, and an increase in antigen availability to T cells. Biosimulation research in the RA PhysioLab platform is advancing our understanding of the mechanisms underlying effective B cell-targeting RA therapies and may guide the development of improved second-generation therapeutic approaches.

Methods: The hMSCs were obtained at operation. The mononuclear cells were extracted, and colony-forming assays were performed after 2 weeks.
The hMSCs were cultured, and in passage 1 the cell surface antigens of both groups were analyzed by flow cytometry. In passage 2, in the control group the cells were cultured with β-glycerophosphate (BGP), and in the osteogenic group the cells were cultured with BGP, ascorbic acid (AA), and dexamethasone (Dex). After 2 weeks, ALP staining and activity were measured in each group. After 3 weeks, alizarin red S assays were performed. RNA was extracted from the cells cultured with BGP, AA, and Dex for 2 weeks and 3 weeks, and the gene expression of bone formation markers was examined by real-time PCR. The Mann-Whitney test was used for the statistical analyses. The colony-forming assays showed no significant differences between OA and RA. In flow cytometry, the cell surface antigens in OA and RA were almost the same. In ALP activity, there were no significant differences between OA and RA. In alizarin red S, there were significant differences. In real-time PCR, the gene expression showed no significant differences between OA and RA. Conclusions: The hMSCs of RA patients will be usable for regenerative therapy.

Silje Vermedal Høgh (1), HM Lindegaard (2), GL Sorensen (1), A Høj (3), C Bendixen (3), P Junker (2), U Holmskov (1). Cytosolic phospholipase A2 (cPLA2) plays a crucial role in eicosanoid production by releasing arachidonic acid from membrane phospholipids. In addition, cPLA2 regulates the phagocyte NADPH oxidase that releases superoxide, and it is the only isozyme responsible for eicosanoid production in phagocytic cells. Collagen-induced arthritis (CIA) in mice is an experimental model of autoimmune disease with several clinical and histological features resembling rheumatoid arthritis in humans. Previous studies show that cPLA2-deficient mice are resistant to CIA. Thus, we aimed to study whether cPLA2 is up-regulated during the development of CIA and to detect its exact location in the inflamed joints.
Immunoblot analysis revealed an increase in the level of joint cPLA2 protein during the development of the disease, which correlated with the severity of the inflammation as examined by paw thickness. Immunohistochemistry with specific anti-cPLA2 antibodies revealed low positive cPLA2 protein levels in skeletal muscle, sebaceous gland, and skin (epidermis, dermis) tissues of healthy paws. In the joints of the CIA mice, large amounts of inflammatory infiltrate containing cPLA2 were detected. In addition, robust cPLA2 protein expression in the skeletal muscles surrounding the joints and strong cPLA2-positive staining in sebaceous glands were detected. The high correlation between the severity of inflammation and the elevated cPLA2 protein, due to an inflammatory infiltrate and increased cPLA2 expression, suggests an important role of cPLA2 in the development of arthritis.

Rheumatoid arthritis (RA) is a complex disease depending on environmental as well as genetic factors. In spite of large research efforts, we in fact know only a small number of genes involved in this disease. Animal models are useful tools for a better understanding of the pathogenic mechanisms and genes leading to the disease process. The aim of the current project is to identify genes of functional importance for arthritis. Two loci associated with arthritis were identified using a cross between the B10.Q (intermediate susceptible) and NOD.Q (resistant to arthritis) strains: one on chromosome 2, a disease-protective locus (Cia2), and another on chromosome 1, a disease-promoting locus (Cia9). The NOD.Q allele at Cia9 promotes arthritis, whereas Cia2 has a protective effect in contrast to the B10.Q allele on chromosome 2. A promising candidate gene in the Cia2 locus is complement factor 5 (C5), as the NOD.Q allele produces a defective C5 protein. The Cia9 locus contains several genes of potential importance for disease, such as FcγRIIb, FcγRIII, FcγRIV, Ncf2, and FH.
The results of CIA (collagen-induced arthritis) and CAIA (collagen antibody-induced arthritis) experiments using the subcongenic mice generated that contain the FcγR region showed a significant difference in the incidence and severity of arthritis. The disease is controlled in a recessive pattern. The fragment devoid of the FcγR region seems to be protective. A subcongenic strain for the Cia2 locus has recently been generated and will be tested for CIA and CAIA susceptibility. The results from these experiments will be discussed in detail.

Angela Pizzolla, KA Gelderman, R Holmdahl. Ncf1 is a component of the NADPH oxidase complex, which upon activation produces reactive oxygen species (ROS) into phagosomes and extracellularly. Polymorphisms in the Ncf1 gene that impair the capacity to produce ROS enhance the susceptibility of both mice and rats to arthritis. Activation of autoreactive T cells drives arthritis development, but neither Ncf1 expression nor an oxidative burst has been detected in T cells. We hypothesize that antigen-presenting cells influence T cell activity by producing ROS during antigen presentation. We aimed to clarify the role of ROS produced by dendritic cells (DC) on T cell activation. DC were grown from the bone marrow of Ncf1-mutated and wild-type mice. We could show that Ncf1-mutated DC proliferated and differentiated better and had higher expression levels of costimulatory molecules and MHC II upon stimulation as compared to Ncf1 wild-type DC. In addition, Ncf1-mutated DC induced higher levels of IL-2 production by hybridoma T cells. To analyze the role of Ncf1 in DC in arthritis, mice were developed expressing functional Ncf1 restricted to DC (B10.QDCN). These mice are being characterized for oxidative burst, Ncf1 expression, and the ability to present antigen. We have published that immunization with myelin oligodendrocyte glycoprotein (MOG) protein resulted in higher disease severity than immunization with MOG peptide in Ncf1-deficient mice.
This suggests that Ncf1 plays a role in the uptake and processing of antigens, probably by DC. This will be further investigated with the B10.QDCN mice using in vitro assays as well as in vivo models of arthritis and multiple sclerosis.

Purpose: Macrophage migration inhibitory factor (MIF) is a pro-inflammatory cytokine involved in both innate and adaptive immune responses. It is expressed in human RA synovial tissue, and its suppression inhibits T or B cell-dependent animal models of RA. We investigated the role of MIF in K/BxN serum transfer arthritis. Methods: Arthritis was induced by injection of K/BxN serum on days 0 and 2 in littermate WT and MIF-/- mice. Arthritis was scored clinically, ankle thickness was measured using microcallipers, and joints were collected for histology. Sections were scored for synovitis, synovial fluid exudate, cartilage degradation, and bone damage. Results: WT mice exhibited arthritis as early as day 1, and 100% incidence was observed on day 7. MIF-/- mice exhibited delayed arthritis, with onset on day 3 and 100% incidence on day 9. MIF-/- mice exhibited significantly reduced disease severity as measured by clinical disease score (

Methods: Osteoarthritis was induced by bilateral transection of the medial meniscus in Dunkin-Hartley guinea pigs using minimally invasive surgery to avoid cartilage damage due to inflammation and/or intra-articular bleeding. Results: The first signs of osteoarthritis development were macroscopically observed four weeks after meniscal transection. Twelve weeks after surgery, the lesions were still restricted to the medial side of the joint and did not reach into the subchondral bone. Cartilage destruction due to meniscal transection was also histologically detected. However, biomarkers of cartilage destruction (CTX-II, HP/LP ratio, COMP) were not increased.
Treatments aimed at different processes in osteoarthritis, such as bone destruction (risedronate), inflammation (pioglitazone and anakinra), and cartilage destruction (galardin), were not effective in this model. The early degenerative changes in this transection model are probably too mild to be measured in the systemic circulation using classic biomarkers. Further research into new biomarkers is needed to detect and monitor the early stages of osteoarthritis. The ineffectiveness of the compounds tested in this model underscores the urgent need for new strategies to treat the disease. The meniscal transection model might prove to be a useful tool for identifying new biomarkers and treatments.

Livia L Camargo (1), A Denadai-Souza (1), LM Yshii (1), A Schenka (2), MA Barreto (1), D Boletini-Santos (3), C Lima (3), V Rioli (3), MN Muscar (1), E Ferro (1), SK Costa (1). Methods: AIA was induced via intra-articular (i.a.) injection of methylated bovine serum albumin in immunized male rats. Knee oedema and pain score were assessed daily in controls and in animals treated with hemopressin (10 or 20 µg/day; i.a.). Histopathological changes, cell number, and cytokines in the synovial fluid of AIA rats were determined at day 4. Results: AIA rats developed a severe mono-arthritis characterized by joint oedema and pain. At day 4, there was marked cellular infiltration, hyperplasia, pannus formation, and destruction of bone and cartilage, but pro-inflammatory cytokines were undetectable by ELISA. Both doses of hemopressin significantly reduced the knee oedema, but only 20 µg of hemopressin attenuated the pain score. Acute joint inflammation was significantly reduced by hemopressin, but hemopressin failed to significantly affect chronic histopathological signs (hyperplasia, pannus, etc.).
Conclusions: Hemopressin has potential for treating the acute signs of AIA by reducing synovial plasma protein extravasation, alleviating pain, and reducing acute joint histopathological changes, thus providing an alternative strategy for the treatment of oedema and pain in arthritis.

Calcitriol, the hormonal metabolite of vitamin D, and its synthetic analogs exert an anti-inflammatory action on psoriatic skin lesions while eliciting mild inflammation on healthy skin. The MAP kinase ERK plays an important role in the induction of chemokines, cytokines, and adhesion molecules in keratinocytes that maintain the epidermal inflammatory response. We hypothesized that the dual effect of calcitriol may be partially attributed to differential effects on ERK activation in the presence or absence of inflammatory mediators. Our experimental model was immortalized non-tumorigenic human HaCaT keratinocytes cultured in the absence of exogenous growth factors or active mediators. Inflammation was mimicked by exposure to TNF. The level and activation of signaling molecules were determined by immunoblotting. By using the specific EGF receptor (EGFR) tyrosine kinase inhibitor AG1487, we established that TNF activates ERK in EGFR-dependent and EGFR-independent modes. The EGFR-dependent activation resulted in the induction of the transcription factor c-Fos, while the EGFR-independent activation was of a shorter duration and did not affect c-Fos expression. Treatment with calcitriol alone increased ERK activation and c-Fos induction. Pretreatment with calcitriol enhanced EGFR-dependent ERK activation and tyrosine phosphorylation of the EGFR, but completely abolished EGFR-independent TNF-induced ERK activation. Pretreatment with calcitriol increased the rate of de-phosphorylation of activated ERK, accounting for the inhibition of EGFR-independent ERK activation by TNF.
it is possible that effects on the erk cascade underlie the dual action of calcitriol and its synthetic analogs on cutaneous inflammation. christina barja-fidalgo (1), r saldanha-gama (1), ja moraes (1), r zingali (2), c marcienkewicz (3) (1) universidade do estado do rio de janeiro, rio de janeiro, brazil (2) universidade federal do rio de janeiro, rio de janeiro, brazil (3) temple university, philadelphia, usa neutrophils adhere to the vascular endothelium and migrate directly toward inflamed tissue to exert their primary defense function. integrins are receptors that drive cell adhesion and motility and interfere with cell activation, function and survival. acting as both anchoring molecules and signaling receptors, transducing signals outside-in and inside-out, integrins are potential targets for therapeutic and diagnostic opportunities. disintegrins are a family of cysteine-rich low-molecular-weight peptides that usually contain an rgd sequence, a cell attachment site of ecm and cell surface proteins recognized by integrins. they are considered selective and competitive antagonists of integrins, being potent inhibitors of platelet aggregation and cell-cell/cell-ecm interactions. we reported that rgd-disintegrins selectively interact with integrins (amb2, a5b1 and/or avb3) on human neutrophils, interfering with cell functions through the activation of integrin-coupled intracellular signaling pathways. we recently showed that vlo5, a selective ligand of a9b1/a4b1 integrins, induces neutrophil chemotaxis and cytoskeleton mobilization and potently inhibits spontaneous neutrophil apoptosis. these effects are mediated by the interaction of vlo5 with the a9b1 integrin, activating the focal adhesion cascade. the effects of vlo5 on the delay of neutrophil apoptosis are modulated by the pi3k, erk-2 mapk and nf-kb pathways, which seem to interfere with the balance between anti- and pro-apoptotic bcl-2 family members and with the mitochondrial membrane potential. 
data emphasize mechanistic details of the role of a9b1 integrin interactions on human neutrophils and support the use of disintegrins as prototypes to develop logical combinations of drugs to optimize or minimize the susceptibility of a selected target cell population to apoptosis during therapeutic interventions. (faperj, cnpq, capes, ifs-sweden) (1), h serezani (2), m peters-golden (2), sonia jancar (1) (1) university of são paulo, brazil (2) university of michigan, usa it has been shown that leukotriene (lt) b4 and cysteinyl lts (ltc4, ltd4 and lte4) enhance fcr-mediated phagocytosis in alveolar macrophages (am), in a manner dependent on protein kinase c (pkc). in contrast, the effects of ltb4, but not those of cyslts, depend on syk activation. in the present study we investigated the role of specific pkc isoforms and their upstream and downstream targets involved in lt-enhanced fcr-mediated phagocytosis. to this purpose, ams were pretreated or not with inhibitors of pkc-d (rottlerin, 6 um), pkc-a (ro-32-0432, 9 nm), pi3k (ly294002, 10 um, and wortmannin, 10 nm), erk 1/2 (pd98059, 2 um), cpla2 (aacocf3, 10 um), p38 mapk (sb202190, 10 um) and ca++ (bapta/am, 10 um), before stimulation with ltb4 or ltd4 and addition of igg-opsonized red blood cells for 90 min. activation (phosphorylation) of signaling molecules by lts was analyzed by western blot. our results demonstrate that ltb4-enhanced phagocytosis is dependent on pkc-a, while ltd4 effects are mediated by pkc-d. galectins are involved in the regulation of cell proliferation and differentiation, adhesion, cell activation and apoptosis. while galectin-1 mainly acts as an anti-inflammatory and pro-apoptotic molecule, galectin-3 is known as a strong pro-inflammatory and anti-apoptotic signal. we have recently recognized galectin-3 as a new molecular target of immunomodulatory drugs in monocyte/macrophage-like cells. 
in this study we investigated the effects of immunomodulatory drugs (aspirin, indomethacin, hydrocortisone and dexamethasone), applied in therapeutic ranges, on the expression of galectin-1 at the gene and protein levels in monocytic thp-1 cells. we have also tested the effects of these drugs on both galectins in cells activated by lipopolysaccharide from e. coli (lps). the targeted mrna level was evaluated using a quantitative rt-pcr technique, and the expression of both galectins in cell homogenates was determined by western immunoblot analysis. the results showed that immunomodulatory drugs affected the expression of galectin-1 at both the gene and protein levels, and that the effects were dependent on drug type and applied concentration as well as on the time of exposure. the modulatory effects of the applied drugs on galectin-1 and -3 expression were also observed in cells activated by lps. these findings represent an important step in understanding the effects of immunomodulatory drugs on galectin-1 and -3 expression, as well as the role of these lectins in the physiology of monocytes. introduction: pancreatitis-associated protein (pap) has recently been described as an endogenous mechanism involved in the regulation of inflammation. in the present study, we show some of the molecular mechanisms implicated in the intracellular signaling pathways modulated by pap. the pancreatic human cell line panc1 was incubated with pap (500 ng/ml) and/or tnfa (100 ng/ml). total rna was obtained and the expression of tnfa was examined by rt-pcr. in addition, the effect of pap on nfkb activation was measured by immunofluorescence in cells. western blot analysis was used to determine the expression of nfkb mediators: phosphorylated ikk, ikba and p65. results: we observed that pap administration to cells prevented nfkb translocation to the nucleus as well as the tnfa-induced tnfa gene expression. 
when tnfa-stimulated cells were treated with cycloheximide in order to block protein synthesis, the induction of tnfa gene expression was completely restored. on the other hand, pap had no effect on ikk phosphorylation or ikba degradation. conclusions: in this study we have provided evidence that pap modulates the inflammatory response by blocking nfkb translocation to the nucleus. this pap-induced nfkb inhibition requires jak/stat-dependent de novo protein synthesis. objectives: this study investigated the inhibitory mechanism of hyaluronan (ha) on lipopolysaccharide (lps)-stimulated production of proinflammatory cytokines in u937 macrophages. methods: ha was added to u937 macrophage cultures in the presence of lps, with or without pretreatment with anti-intercellular adhesion molecule-1 (icam-1) antibody. secreted levels of tumor necrosis factor a (tnfa), interleukin (il)-1b, and il-6 were determined by enzyme-linked immunosorbent assay. the phosphorylation of nuclear factor (nf)-kb, ikba, and mitogen-activated protein kinases (mapks) was analyzed by immunoblotting. results: lps stimulated production of tnfa, il-1b, and il-6. in contrast to 800 kda ha, 2700 kda ha at 1 mg/ml inhibited lps-induced cytokine production. anti-icam-1 antibody blocked the effects of ha on the lps actions on u937 cells. lps activated the nf-kb and mapk pathways, whereas ha down-regulated lps-induced p65 nf-kb and ikba phosphorylation without affecting mapks. inhibition studies revealed the requirement of nf-kb for lps-stimulated cytokine production. anti-icam-1 antibody reversed the inhibitory effects of ha on phosphorylation of p65 nf-kb and ikba. conclusions: ha of high molecular weight suppresses lps-stimulated production of proinflammatory cytokines via icam-1 through down-regulation of nf-kb and ikb. exogenous ha injected into arthritic joints could act as an anti-nf-kb agent by the mechanism demonstrated in the present study. 
the principal eicosanoid product of endothelial cox-2 is prostacyclin (pgi2), which is a potent vascular patency factor. induction of endothelial cox-2 under hypoxic conditions is well documented. this response, along with the associated pgi2 release, is likely an important protective homeostatic response. in order to explore the role of candidate signalling agents in cox-2 expression in response to hypoxia, studies were undertaken using luciferase reporter constructs of the cox-2 promoter region in huvec. cox-2 induction under hypoxic conditions was confirmed with the wild-type construct. strategic mutations of transcription factor binding sites showed that sites for hypoxia-inducible factors (hifs) were more important for cox-2 expression than those for nfkb. furthermore, expression of cox-2 was increased under normoxic conditions by transfection of huvec with normoxia-stable hif mutants. emsa showed hif binding in nuclear extracts from untransfected hypoxic huvec. under these hypoxic conditions, increased release of pgi2, but not of vaso-occlusive thromboxane a2, was seen. thus the putative protective induction of cox-2 in endothelial cells in response to hypoxia involves signalling by hifs. crescentic glomerulonephritis is characterized by crescent formation and rapid progression to renal failure, in which a predominant th1 immune response plays a crucial role. however, the therapeutic efficacy of regulating th1-predominant immune responses remains to be investigated. therefore, the effects of the th1-selective inhibitor tak-603 were investigated in a model of crescentic glomerulonephritis in wky rats. methods: tak-603 was administered orally, starting at the time of induction of glomerulonephritis. in group 1, the drug was administered daily for the initial 6 days. tak-603 was administered on day 0 only in group 2, and from day 3 to 5 in group 3. in each group, nephritic rats were killed on days 6 and 56. 
results: in group 1, glomerular damage, including crescent formation, was improved on day 6, with reductions in the numbers of cd4, cd8 and ed-1 positive cells, as well as in urinary protein excretion. protein and transcript levels of th1 cytokines in the diseased kidneys were markedly decreased by tak-603 treatment. renal pathology, including glomerulosclerosis and interstitial fibrosis, was ameliorated and proteinuria was markedly decreased. elevated levels of serum creatinine showed concomitant improvement. in group 3, in which treatment was initiated shortly after the appearance of glomerular abnormalities, glomerular damage was also diminished, resulting in a decrease in urinary protein excretion. treatment only on the first day, in group 2, partially rescued renal dysfunction. conclusions: these results suggest that initial inhibition of the th1 immune response has appealing therapeutic potential for crescentic glomerulonephritis. parasitic nematodes and their hosts are now known to produce a wide range of galectins. whilst host galectins have been shown to modulate the recruitment and effector function of inflammatory cells including mast cells, neutrophils and eosinophils, the role of secreted parasitic galectins is less well defined. studies at moredun have demonstrated that both the endoparasitic helminth, haemonchus contortus, and the ectoparasitic mite, psoroptes ovis, produce galectin-like factors which, in vitro, directly influence the migration and survival of eosinophils from their natural sheep host. excretory-secretory extracts from both parasites contained potent chemokinetic activity and were also able to promote eosinophil survival in the presence of dexamethasone. separation by affinity chromatography, as well as specific sugar inhibition and mass spectrometric profiling, revealed the active components to be galectins. in the case of h. 
contortus, there was homology with known est sequences, which allowed subsequent cloning and expression studies to be undertaken. a functional in vivo role for these parasite-derived galectins awaits confirmation, but the possibility is raised that they could directly influence the host immune response following infection. this may have particular significance in mite infections, in which exudates from the associated eosinophilic lesion appear to provide the primary nutrient source for their survival. moreover, the observation that two very different parasites may have evolved similar mechanisms for manipulating the host inflammatory response to their benefit raises the further possibility that parasite galectins may provide potentially novel therapeutic targets. the aim of the study was to investigate the time course of cytokine gene expression in liver and lungs of mice with lipopolysaccharide (lps)-induced septic shock and to assess the effect of three different immunomodulatory agents on cytokine mrna levels in these tissues. male cd-1 mice were injected intraperitoneally with 50 mg/kg lps alone or concomitantly with an intravenous dose of pentoxifylline (100 mg/kg), lisofylline (100 mg/kg) or prednisolone (10 mg/kg). the tissues were harvested 0, 1, 4, 6, and 24 h following lps administration and stored at -80°c. relative quantification of tumor necrosis factor-alpha (tnf-alpha), interleukin-1beta (il-1beta), interleukin-6 (il-6), and interferon-gamma (ifn-gamma) mrna levels was performed by real-time rt-pcr. the highest levels of cytokine mrna were observed at 4 or 6 h after lps administration, whereas the expression of tnf-alpha and il-1beta in lungs and of il-6 in liver reached peak values at 1 h and then decreased gradually. in addition, the lps effects on cytokine mrna were more pronounced in liver than in lungs. 
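relative quantification of mrna levels by real-time rt-pcr, as used above, is commonly computed with the 2^-ddct (livak) method: the target gene ct is normalized to a reference gene, then to the untreated control. a minimal sketch, in which the ct values and the gapdh reference gene are illustrative assumptions, not data from the abstract:

```python
# hypothetical sketch of the 2^-ddct (livak) method commonly used for
# relative real-time rt-pcr quantification; all ct values are illustrative.

def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    # normalize the target gene to the reference gene in each condition
    dct_treated = ct_target_treated - ct_ref_treated
    dct_control = ct_target_control - ct_ref_control
    # then normalize treated to control; lower ct means more template
    ddct = dct_treated - dct_control
    return 2 ** (-ddct)

# illustrative ct values: tnf-alpha vs. an assumed gapdh reference,
# 4 h after lps vs. time 0
print(fold_change(22.0, 18.0, 27.0, 18.0))  # tnf-alpha induced 32-fold
```

the method assumes roughly 100% amplification efficiency for both genes; efficiency-corrected variants replace the base 2 with the measured efficiency.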
all administered compounds inhibited the lps-induced tnf-alpha mrna expression (by up to approximately 50%), whereas lisofylline significantly increased ifn-gamma mrna levels in both tissues at most of the investigated time points. for the other cytokines, the observed differences did not reach statistical significance. in conclusion, with the exception of ifn-gamma, the time course of cytokine mrnas differed considerably depending on the type of tissue. in the murine model of lps-induced septic shock, only tnf-alpha gene expression was suppressed by all compounds under investigation. maria sanz (1), m losada (1), c company (1), c lope-gines (1), l piqueras (2), j cortijo (3) the migration of leukocytes into inflamed tissues involves a cascade of molecular events finely regulated by cell adhesion molecules and chemokines. fractalkine/cx3cl1 (fr) is a membrane-bound chemokine that functions as a mononuclear leukocyte chemoattractant and as an adhesion molecule. clinical studies and animal disease models have shown that fr is also involved in the pathogenesis of atherosclerosis. we have demonstrated that angiotensin-ii (aii) has proinflammatory actions, inducing the initial attachment of mononuclear cells to the arteriolar endothelium. in the present study we have investigated whether aii can cause the synthesis and expression of fr on human umbilical arterial endothelial cells (huaecs). huaecs were stimulated with aii (1 microm) or with tnfalpha (20 ng/ml) for 1, 4 and 24 h. fr was determined in the culture supernatants by conventional sandwich elisa. fr was only detected after 4 h and 24 h of stimulation with tnfalpha, whereas aii was unable to provoke the cleavage of the chemokine. semiquantitative rt-pcr analysis of huaecs showed increased fr mrna expression in cells stimulated with aii for 1 and 4 h. these effects were caused by the interaction of aii with its at1 receptor, since they were abolished by losartan (an at1 receptor antagonist). tnfalpha also increased fr mrna. 
immunohistochemical analysis of the cultured endothelial cells showed clear expression of fr in huaecs stimulated with aii or tnfalpha for 4 and 24 h. these results suggest that fr could be a key chemokine in the selective adhesion of mononuclear leukocytes to the arterial endothelium elicited by aii. the lipophilic yeast malassezia is an exacerbating factor in atopic dermatitis (ad). among organisms of the malassezia species, m. globosa and m. restricta are particularly dominant on the skin of ad patients. our previous study demonstrated that human keratinocytes respond to the two malassezia species with different th2-type cytokine profiles, i.e. m. globosa induced il-5, il-10, and il-13 secretion from the keratinocytes, whereas m. restricta induced il-4 secretion. these findings suggest that m. globosa and m. restricta play a synergistic role in triggering or exacerbating ad by stimulating the th2 immune response. pattern recognition receptors (prrs) of human keratinocytes play an important role in the induction of inflammatory and innate immune responses. in this study, we assessed the role of prrs in cytokine production by human keratinocytes in response to malassezia species. human keratinocytes were pretreated with various anti-prr monoclonal antibodies (mabs) and stimulated with m. globosa or m. restricta. cytokine secretion from keratinocytes was measured using a fast quant elisa kit. exposure of human keratinocytes to m. globosa and m. restricta resulted in enhanced secretion of il-5 and il-4, respectively. the m. globosa-induced increase in il-5 secretion was inhibited by mabs against cd14 and cd91. in the case of m. restricta, mabs against toll-like receptor 2 (tlr2) and cd14 significantly suppressed il-4 secretion from keratinocytes. 
these findings suggest that the distinct interactions of prrs with fungal pathogen-associated molecular patterns (pamps) are key factors in the differential cytokine secretion from keratinocytes stimulated with malassezia species. atopic dermatitis (ad) is a chronic, relapsing inflammatory skin disease associated with allergy. mdc (macrophage-derived chemokine/ccl22) and tarc (thymus and activation-regulated chemokine/ccl17) are th2-type chemokines, and it has been reported that serum mdc and tarc levels are associated with ad disease. in the present study, we investigated the effect of prunus yedoensis matsum. bark on the inflammatory chemokines (mdc and tarc) and the jak-stat pathway in hacat keratinocytes. as a result, the etoac fraction and the e5 sub-fraction inhibited the mrna expression and protein levels of mdc and tarc in a dose-dependent manner. the e5 sub-fraction also showed inhibitory activity on the stat1 protein level. these results suggest that p. yedoensis may have an anti-atopic activity by suppressing the inflammatory chemokines (mdc & tarc). the il-1 family now consists of 11 members, most of which have assigned functions. there are 10 members of the il-1 receptor family (including the decoy receptor, the type ii il-1r). many of the il-1 family members possess neither a signal peptide nor an apparent prodomain, but nevertheless manage to exit the cell. the il-1 family members il-1f6, f8 and f9 signal through a complex of the il-1r family member rp2 in association with il-1r acp to activate common inflammatory pathways. their specific activity is low, with ec50 values on the order of 1-10 ug/ml. we have found that removal of a few n-terminal amino acids from il-1f5, f6, f8 and f9 can increase their bioactivity approximately 1000-fold. 
the location of the n-terminus leading to increased specific activity is quite specific; removal of one more or one fewer amino acid eliminates the effect. in addition, n-terminally truncated il-1f5 is capable of antagonizing signaling via il-1rrp2, but full-length il-1f5 is inactive. (1) university of ulsan, korea (2) kyoto university, japan inflammation plays a pathogenic role in the development of obesity-related complications such as type ii diabetes and atherosclerosis. tumor necrosis factor alpha (tnfa) is closely associated with the enhanced inflammatory responses in obesity and the obesity-related pathologies. tr2 (hvem/tnfrsf14), a member of the tnf receptor superfamily and the receptor for light (tnfsf14), a lymphotoxin-related inducible ligand that competes with herpesvirus glycoprotein d for binding to the herpesvirus entry mediator on t cells, is a potent mediator of inflammatory responses. the purpose of this study was to examine the hypothesis that obesity-induced inflammatory responses can be attenuated by inhibiting the tr2 pathway. c57bl/6 tr2 knockout mice and their wild-type controls were fed a high-fat diet for 12 weeks, and the obesity phenotypes were determined in the obese tr2 knockout mice and the controls. obese tr2 knockout mice fed a high-fat diet showed attenuated body weight gain and insulin resistance relative to wild-type control mice. expression levels of inflammatory genes were significantly decreased in the adipose tissue of the obese tr2 knockout mice compared with those of the controls. our results demonstrate that obesity-induced inflammatory responses and insulin resistance can be attenuated in obese tr2 knockout mice fed a high-fat diet. objectives: the present study was undertaken to investigate the role of insulin in allergic airway inflammation. methods: diabetic male wistar rats (alloxan, 42 mg/kg, i.v.) and controls were sensitized with ova (100 µg) and al(oh)3 (8 mg, s.c.) 10 days after alloxan or saline injection. 
the animals were challenged 14 days later by the intratracheal instillation of ova (1 mg/0.4 ml). the following analyses were performed 6 h thereafter: (a) total and differential cell counts in bronchoalveolar lavage (bal) fluid; (b) quantification of tnf-alpha and il-1 beta in the bal by elisa; and (c) immunohistochemistry for p- and e-selectins on lung vessels. results: compared to the control animals, diabetic rats exhibited a reduced number of neutrophils (90%) and mononuclear cells (50%), reduced levels of tnf-alpha (45%) and il-1 beta (30%), and reduced p-selectin expression (30%) in response to ova challenge. these abnormalities were corrected after treatment of diabetic rats with a single dose of nph insulin (4 iu, s.c.) 2 h before ova challenge. although we did not find differences in e-selectin expression between diabetic rats and controls, expression of this molecule was amplified by insulin. conclusions: the data presented show that insulin controls neutrophil migration during allergic airway inflammation, possibly by modulating tnf-alpha and il-1 beta production and selectin expression. supported by fapesp and pronex. hormonally active vitamin d derivatives are beneficial in the treatment of cutaneous inflammatory disorders, particularly psoriasis. their anti-inflammatory effect is usually attributed to inhibition of the activity of infiltrating immune cells. we examined whether vitamin d also interferes with the pro-inflammatory action of the keratinocytes themselves. human hacat keratinocytes cultured in the absence of exogenous growth factors or active mediators were exposed to tnf to simulate an inflammatory challenge, and their response was monitored by assessing mrna levels of the cytokine tnf, the chemokine il-8 and the adhesion molecule icam-1 by real-time pcr. 
icam-1 and il-8 were induced rapidly, peaking after 2 h; their mrna levels increased again from 8 h to reach a plateau between 16 h and 24 h after exposure to tnf, whereas tnf mrna levels increased steadily between 2 h and 24 h. 24 h pretreatment with calcitriol, the hormonal form of vitamin d, inhibited the induction of il-8 but did not affect that of icam-1 or tnf 2 h following exposure, while calcitriol markedly inhibited the induction of all 3 pro-inflammatory genes 16 h after the tnf challenge. calcitriol inhibits the activation of jun kinase (jnk) and p38 by tnf. this action was mimicked by the jnk inhibitor sp600125 and the p38 inhibitor sb203580. the combination of the two inhibitors fully reproduced the time- and gene-dependent modulatory effect of calcitriol. we conclude that vitamin d attenuates the active contribution of keratinocytes to cutaneous inflammation and that this modulatory effect is explained by inhibition of the jnk and p38 cascades. (1), cm lotufo (2), p borelli (1), zs ferreira (2), rp markus (2), shp farsky (1) (1) department of clinical and toxicological analyses, school of pharmaceutical sciences, university of são paulo, brazil (2) department of physiology, bioscience institute, university of são paulo, brazil introduction: we showed that endogenous glucocorticoids (eg) control neutrophil mobilization from the bone marrow and peripheral compartment by modulating the expression of l-selectin on segmented cells. aims: we evaluated the role of eg on endothelial cells (ec) and the molecular mechanisms responsible for hormonal actions in neutrophils and ec. methods: neutrophils were collected from blood, segmented leukocytes from femoral bone marrow, and ec were cultured from testis vessels. cells were obtained from adrenalectomized (adx), ru 38486-treated, sham-operated, vehicle-treated and non-manipulated (nm) wistar rats. 
results: circulating neutrophils and segmented cells from ru 38486-treated rats presented, respectively, elevated and decreased l-selectin expression compared with cells from control animals. the effects were not dependent on alterations in l-selectin mrna levels. ec from adx animals presented a greater ability to adhere neutrophils from nm rats and enhanced mrna levels and membrane expression of icam-1, vcam-1 and pecam-1. participation of the glucocorticoid cytosolic receptor (gcr) in these effects was shown by similar results in cells from ru 38486-treated rats. nfkappab translocation in neutrophils was equivalent in all groups of animals, but it was enhanced in ec from adx or ru 38486-treated rats. conclusions: the data show the participation of the gcr in events involved in neutrophil mobilization, but nfkappab transcription is only involved in ec. (1), y naito (2), t okuda (3), k mizushima (3), t okayama (3), i hirata (3), h tsuboi (3), t suzuki (3), o handa (1) background: although the inhalation of co at high concentrations has been considered toxic, the inhalation of co at low concentrations has recently been shown to have cytoprotective and anti-inflammatory effects in various animal models. however, it is unclear whether direct exposure of the inflamed intestinal mucosa to co is effective. in this study, we investigated the therapeutic efficacy of rectal co administration in a rat colitis model. acute colitis was induced with trinitrobenzene sulfonic acid (tnbs) in male wistar rats. co (200 ppm, 10 ml) was administered intrarectally twice a day after the induction of colitis. rats were sacrificed 3 days after the administration of tnbs. the distal colon was removed, and the ulcer lesions were measured. thiobarbituric acid (tba)-reactive substances and tissue-associated myeloperoxidase (mpo) activity were measured in the colonic mucosa as indices of lipid peroxidation and neutrophil infiltration, respectively. 
moreover, we evaluated the expression of cinc-1 mrna/protein and p-p38 mapk protein. the intracolonic administration of co ameliorated tnbs-induced colonic ulceration. the increases in tba-reactive substances and mpo activity after tnbs administration were both significantly inhibited by treatment with co. moreover, the rectal administration of co significantly inhibited the increased expression of cinc-1 mrna/protein and p-p38 mapk protein after the induction of tnbs-induced colitis. the rectal administration of co thus protected against intestinal inflammation in rats. based on these data, the beneficial effects of co on intestinal mucosal injury may be attributed to its anti-inflammatory properties. alessandra gambero (1), m maróstica (1), m saad (2), j pedrazzoli jr (1) (1) são francisco university medical school, brazil (2) state university of campinas, brazil recent studies have shown that adipocytes produce and secrete several bioactive molecules known as adipocytokines. the adipose tissue can also undergo short- and long-term changes during inflammation and infectious pathologies. in this study, the alterations of mesenteric and perinodal mesenteric adipose tissue during experimental colitis induced by repeated intracolonic tnbs instillations were evaluated. adipocyte size was measured after collagenase digestion. basal lipolysis (glycerol release) and adipocytokine production were monitored after short-term culture of adipose tissue. the colitis animals showed a higher mesenteric fat mass (1.02+-0.06 and 0.78+-0.09 % of body weight for colitis and control, respectively; p<0.05) despite their lower body weight. 
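group comparisons reported as mean +- sem with a p-value, like the fat mass figures above, can be sanity-checked from the summary statistics alone with a two-sample welch t test. a minimal sketch, in which the group sizes (n = 8 per group) are an illustrative assumption not stated in the abstract, and the p-value uses a normal approximation:

```python
# hypothetical sketch: a welch t test computed from mean +- sem summaries.
# group sizes are illustrative assumptions; p uses a normal approximation.
import math
from statistics import NormalDist

def welch_t(m1, sem1, n1, m2, sem2, n2):
    # the sem already incorporates sqrt(n), so the standard error of the
    # difference is sqrt(sem1^2 + sem2^2)
    t = (m1 - m2) / math.sqrt(sem1**2 + sem2**2)
    # welch-satterthwaite degrees of freedom
    df = (sem1**2 + sem2**2) ** 2 / (sem1**4 / (n1 - 1) + sem2**4 / (n2 - 1))
    # two-sided p, normal approximation (adequate for moderate df)
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, df, p

# mesenteric fat mass (% body weight): colitis 1.02+-0.06 vs control 0.78+-0.09
t, df, p = welch_t(1.02, 0.06, 8, 0.78, 0.09, 8)
print(round(t, 2), p < 0.05)
```

for small samples the normal approximation understates p slightly; a t-distribution cdf (e.g. scipy.stats.t.sf) would be the stricter choice.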
the mesenteric adipocytes from colitis animals presented reduced diameter (31.67+-0.28 and 35.69+-0.37 um for colitis and control, respectively; p<0.01), higher basal lipolysis (41.67+-6.89 and 18.77+-3.66 ug.mg-1 for colitis and control, respectively; p<0.05) and higher tnf-alpha production (8.64+-0.77 and 5.51+-0.86 ng.mg-1 for colitis and control, respectively; p<0.05). perinodal mesenteric adipocytes presented normal diameters, higher basal lipolysis (72.48+-15.38 and 25.96.77+-3.05 ug.mg-1 for colitis and control, respectively; p<0.05), and increased tnf-alpha (46.34+-5.82 and 15.71+-2.15 ng.ml-1 for colitis and control, respectively; p<0.01), leptin (400.41+-77.10 and 201.01+-52.01 pg.ml-1 for colitis and control, respectively; p<0.05) and adiponectin production (15029+-2340 and 5918+-498 ng.ml-1 for colitis and control, respectively; p<0.01). the mesenteric adipose tissue was modified during the experimental inflammation, but some alterations were site-specific. perinodal adipose tissue retained the ability to produce both anti-inflammatory and pro-inflammatory cytokines, while mesenteric adipose tissue retained only the pro-inflammatory one. this work was financially supported by fapesp. inflammatory bowel disease (ibd) is a group of chronic inflammatory disorders of the intestine. the role of the pro-inflammatory p38 mapk signalling cascade in the pathogenesis of ibd is highly controversial. we therefore aimed to investigate the role of p38 mapk in chronic dextran sodium sulfate (dss)-induced colitis, an experimental model of ibd. chronic intestinal inflammation was induced by oral cyclic administration of 3% dss in sjl mice. clinically, the dss treatment produced episodes of colitis manifested by diarrhoea, gross intestinal bleeding, marked loss of body weight, and shortening of the colon. 
at the molecular level, this was accompanied by an up-regulation of tnfa, il-1beta, il-6, il-17, kc, cox-2, igg heavy chain, and phospho-stat3 in the dss-treated mice. the clinical and molecular features described above recapitulate findings reported in human ibd. in order to assess the role of p38 mapk, the activation of the p38 mapk signalling cascade was analysed by western blot analysis. the expression and phosphorylation levels of p38 mapk and of mk2 and hsp27, two downstream targets, were not increased in dss-treated animals compared to controls. leo 15520, a potent inhibitor of p38 activity in vivo, was dosed as pretreatment and after completion of dss treatment. pretreatment had a deteriorating effect on all measured cytokines, whereas treatment after disease induction had no effect on any measured parameter. collectively, these results strongly suggest that the p38 kinase pathway plays only a minor role, if any, in the dss model. (sp) were gm-csf differentiated, dcs purified through gr1+ cell depletion, and spleen t cells isolated by pan t cell negative selection. spdcs or bmdcs were stimulated +/- 100 ng/ml lps. for the mlr, balb/c t cells were added for 5 days. cells were incubated with sb203580 (4-(4-fluorophenyl)-2-(4-ethylsulfinylphenyl)-5-(4-pyridyl)1h-imidazole, sb) or ml3403 ((rs)-{4-[5-(4-fluorophenyl)-2-methylsulfanyl-3h-imidazol-4-yl]pyridine-2-yl}-(1-phenylethyl)amine, ml) and washed prior to lps stimulation (bmdcs) or cell mixing (t cells). 3h-thymidine incorporation was measured, cell viability by mtt assay, and tnfa and il-12 production by elisa. mlr t cell proliferation induced by spdcs or bmdcs was inhibited by sb (ic50 spdc 0.06 mm, bmdc 1.0 mm) and ml (ic50 spdc 0.03 mm, bmdc 0.03 mm). preincubation with dcs had no effect, despite reduced lps-stimulated il-12 and tnfa synthesis by sb (ic50 il-12 1.2 mm, tnf 0.16 mm) and ml (ic50 il-12 0.9 mm, tnf 0.11 mm). preliminary data show that preincubation of t cells with sb and ml modifies the mlr response. 
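ic50 values like those reported for sb and ml above are estimated from dose-response titrations. a minimal sketch using log-linear interpolation of % inhibition between the two tested doses that bracket 50%, with an entirely illustrative titration that is not taken from the abstract:

```python
# hypothetical sketch: estimating an ic50 by log-linear interpolation of
# % inhibition between tested concentrations; the titration below is
# illustrative, not data from the abstract.
import math

def ic50(doses_um, pct_inhibition):
    """interpolate the dose giving 50% inhibition on a log10 dose scale."""
    points = list(zip(doses_um, pct_inhibition))
    for (d0, i0), (d1, i1) in zip(points, points[1:]):
        if i0 <= 50 <= i1:
            # fraction of the way from i0 to 50, applied on the log scale
            frac = (50 - i0) / (i1 - i0)
            return 10 ** (math.log10(d0) + frac * (math.log10(d1) - math.log10(d0)))
    raise ValueError("50% inhibition not bracketed by the tested doses")

# illustrative inhibitor titration: doses in um, mean % inhibition
doses = [0.001, 0.01, 0.1, 1.0, 10.0]
inhib = [5, 20, 45, 75, 95]
print(round(ic50(doses, inhib), 3))  # falls between 0.1 and 1 um
```

a four-parameter logistic fit of the whole curve is the standard analysis; the interpolation above is the simplest estimate and agrees with it when points near 50% are well spaced on the log axis.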
p38 plays a role in the interaction of dcs and t cells in antigen recognition. however, pre-incubation of the drugs with dcs was ineffective. the role of t cell p38 mapk in the mlr is under investigation. p38 inhibitors may possess disease-modifying properties because of reduced t cell antigen reactivity to dc antigen presentation. (1), s luik (2), s laufer (2), m seed (3), v holan (4), s fiorucci (5) (1) synovo gmbh, tübingen, germany (2) university of tübingen, germany the in vivo anti-inflammatory activity of certain p38 kinase inhibitors is limited by bioavailability. however, they may be useful in the therapy of ibd if it is possible to mediate their uptake in and around bowel lesions. we reasoned that activity could be especially increased if drug physical properties were altered to allow preferential partition into macrophages and neutrophils (wbc) associated with lesions. we prepared prodrugs of p38 inhibitors and screened them using whole human blood, murine splenocytes and peritoneal macrophages. pharmacologically inert macrocycle (azilide) conjugates were assessed for enhanced efficacy in murine collagen-induced arthritis either therapeutically (after onset of signs) or prophylactically (2 d post boost), or in a dss or tnbs model of ibd in the mouse. in both types of models, the prodrugs achieved improved suppression of arthritis and of the inflammatory score in colon sections at tolerated doses, with optimal activity at 15 mmol kg-1 d-1. we propose that the prodrugs increase efficacy via improved pharmacokinetics, partly related to biased disposition of the prodrug toward immune cells. despite the potent anti-inflammatory and immunosuppressive properties of glucocorticoids, their use in the management of severe necrotizing pancreatitis is still controversial. the plasma levels of interleukins (il-8, il-18) and adhesion molecules (e-selectin and icam-1) were measured in 36 patients with necrotizing pancreatitis. 
the measurement was performed immediately after admission and on days 3, 7, and 14. all patients were divided into two groups: the first group comprised 20 patients, in whom dexamethasone (16 mg/day for 7-8 days) was applied in the complex management of acute pancreatitis, and the control group comprised 16 patients who did not receive corticosteroids. all patients received the initial therapy. increased levels of il-6, il-8, il-18, icam-1, and e-selectin were noted in both groups of patients at the time of admission. a gradual increase in the plasma levels of all proinflammatory mediators up to the seventh day was noted in patients of the control group. their levels clearly correlated with the severity of mods and the spread of necrotic processes confirmed by ct. starting from the third day, a gradual decrease of mediator levels was noted in the patients of the first group. the incidence of contamination of necrotic foci did not differ between the two groups of patients. the ability of glucocorticoids to inhibit expression of proinflammatory mediators, due to glucocorticoid-mediated repression of the nf-kappa b pathway, provides the pathogenetic substantiation for the application of glucocorticoids in the complex management of necrotizing pancreatitis. the objective of this study was to examine whether t cell-specific overexpression of the th2 transcription factor gata-3 can inhibit th1/th17 cell-mediated experimental mbsa arthritis. mbsa-immunized wild type mice developed joint inflammation which gradually increased over time with a maximum at day 7. at day 1, t cell-specific gata-3 tg mice did not show any difference in arthritis score compared to wild type mice. however, at day 7, wild type mice had developed severe joint inflammation, reaching the maximum arthritis score. in contrast, gata-3 tg mice showed only mild joint inflammation, suggesting that t cell-specific overexpression of gata-3 protects against development of severe joint inflammation. 
facs analysis revealed low levels of il-17+/ifn-gamma- cells in wild type as well as in gata-3 tg mice at day 1. as expected, il-4-positive cells were higher in gata-3 tg mice compared with wild type mice. interestingly, at day 7, the percentage of il-17+/ifn-gamma- cells was markedly increased in wild type mice but not in gata-3 tg mice, suggesting prevention of th17 expansion under gata-3 overexpression in vivo. these data revealed that t cell-specific overexpression of the th2 transcription factor gata-3 protects against progression of severe joint inflammation during mbsa-induced arthritis. furthermore, il-17+/ifn-gamma- cells play a critical role in the progression of joint inflammation in this model, and gata-3 overexpression in t cells prevents expansion of the il-17+/ifn-gamma- t cell subset. pingping jiang (1), pt sangild (2), t thymann (2), hh-y ngai (1), w-h sit (1), k-l chan (3), jm-f wan (1). necrotizing enterocolitis (nec) is a severe intestinal inflammatory disease for which the disease etiology and progression remain unclear. preterm delivery and enteral milk formula feeding are factors predisposing to nec. to understand the pathophysiology of nec, a two-dimensional gel electrophoresis (2d page) proteomic approach was applied to study changes in the intestinal protein pattern in preterm piglets with spontaneous nec occurring in response to 2 days of parenteral feeding followed by 1 day of enteral formula feeding. the intestinal proteomes of pigs with clinical symptoms of nec (n = 3) were compared with those of corresponding pigs that remained healthy (n = 4). syproruby staining was used, and differentially expressed proteins were identified by maldi-tof ms or maldi-tof/tof ms. 
the proteins with significantly different expression between nec and healthy pigs are involved in energy metabolism (sorbitol dehydrogenase, mitochondrial aldehyde dehydrogenase 2 and chain a, medium-chain acyl-coa dehydrogenase with 3-thiaoctanoyl-coa, etc.), inflammation (peptide-binding protein 74 (pbp74) and snail homolog 3), signal transduction (thyroid hormone binding protein precursor, park7 protein and chain b, structure of the rho family gtp-binding protein cdc42 in complex with the multifunctional regulator rhogdi, etc.) and anti-oxidation (manganese-containing superoxide dismutase (sod)). these data underscore the significant impact of intestinal proteomics in unraveling nec pathophysiology and biomarker discovery. blood are used to monitor the progression of inflammation. the aim of our study was to investigate systemic markers of disease in a rat model of lps-induced pulmonary inflammation, to provide a link between preclinical in vivo research and early clinical research. animals were exposed to bacterial lipopolysaccharide (e. coli 026:b6) by inhalation. the lungs were lavaged 6 hours post provocation and the level of cell influx was determined. relevant mediators of acute pulmonary inflammation were analysed with standard elisa technology in bronchoalveolar lavage fluid and in blood. in addition, measurements of changes in body temperature were performed at different time points post provocation in order to monitor the systemic inflammatory response to the local pulmonary inflammation, manifested as alteration in body temperature. results showing the effects of lps challenge on local and systemic parameters will be presented, and the possible link to lps responses in man discussed. pulmonary inflammation models are widely used in pharmacological research. however, provocations and treatments aimed directly at the lung are often invasive, which limits the possibility to perform repeated administrations of test agents and compounds. 
also, results derived from bronchoalveolar lavage (bal) fluid are subject to variability if the retrieval techniques are not standardized. here we describe a non-invasive standardized method for retrieval of bal fluid to be used in mice. we present the characterisation of these techniques using the inflammatory response to lps, and propose that this non-invasive method can be used to refine lps and other challenge models. the objective was to evaluate the dynamic response after a single intra-tracheal administration of 5 µg (50 µl/animal) of lps (p. aeruginosa) to c57bl/6j mice. control animals received a single dose of 50 µl of sterile 0.9% saline/animal. the mice were terminated 4, 24, 48, 72 and 96 h after instillation, using a non-invasive and operator-independent lavage technique. results showing the effects of single lps challenge on bal parameters, excised lung gas volume and lung weight will be presented, showing reliable dynamic responses. these techniques open the possibility to run repeated treatments and chronic provocations, and are not subject to variability from bal fluid retrieval. the human psoriasis xenograft scid mouse model is one of the most accepted and well characterized models for screening of novel anti-psoriatic compounds. the model has primarily been applied for testing novel treatment principles via systemic or intradermal administration routes. in order to evaluate the model for topical treatment, psoriatic keratome biopsies were grafted onto immune-deficient scid mice. transplanted mice were treated with daivonex/dovobet (calcipotriol) and bms (betamethasone). the results show a strong anti-psoriatic efficacy after treatment with bms (epidermal thickness reduced by 68%). treatment with daivonex/dovobet also showed an anti-psoriatic effect, with a 31% reduction in epidermal thickness. serum did not contain test compounds, indicating that the observed effects were not due to systemic exposure. 
the observed effects are in concordance with clinical results of treatment of psoriasis. it is concluded that the model is useful for testing topical treatments. we have demonstrated that adult rat offspring of dams submitted to protein restriction during early lactation presented impaired acute immune responses, probably related to an imbalance in glucocorticoid and insulin secretion (barja-fidalgo; inflamm res 52 (11): 2003). here, we evaluated the innate immunity mediated by neutrophils and host defense against infection in adult rat offspring of dams fed either a protein-free diet (un group) or a 22% protein diet (c group) during the first 10 days of lactation. un rats showed a lower number of blood pmn, though no difference in bone-marrow neutrophil numbers was observed. blood neutrophils from the un group presented a significantly reduced phagocytic activity against opsonized zymosan, constitutively expressed inos, and spontaneously produced o2-, no and tnf-alpha. in vivo treatment with lps, at non-lethal doses, significantly increased tnf-alpha and superoxide production by neutrophils compared with controls. lps increased no production by neutrophils from both groups, inducing inos expression in control cells, but no further increase in inos expression in un rats. nuclear nf-kb is constitutively augmented in un rats. un animals presented a higher survival rate in a model of clp-induced severe sepsis. these results indicate that metabolic programming induced during early lactation affects the innate immune responses of adult rats, which are unable to properly mount an inflammatory response, and may predispose to chronic diseases in adult life. transgenic mice over-expressing vascular endothelial growth factor (vegf) in the epidermal basal layer under the human keratin 14 (k14) promoter have previously been reported to develop a psoriasis-like inflammatory condition in the skin. 
important hallmarks of psoriasis are epidermal hyperplasia in association with infiltration of t cells in the dermis and epidermis, and also increased dermal angiogenesis. the aim of this study was to describe the epidermal hyperplasia and the infiltration of the skin with t cells in transgenic k14/vegf mice. we induced a cutaneous inflammation in the ear skin by repeated topical treatments with 12-o-tetradecanoylphorbol-13-acetate (tpa), in order to investigate the inflammatory response. the in vivo pharmacological effect of topical treatment with a number of reference compounds, including betamethasone-17-valerate, was also investigated. the ear thickness was significantly increased in transgenic animals compared to wild type animals following tpa induction. the epidermal thickness measured in histological sections of biopsies from the ear skin was also significantly increased in transgenic animals. furthermore, increased dermal vascularisation was observed in the histological sections of the ear skin. a marked infiltration with cd3-positive cells was observed in both dermis and epidermis, and this was highly correlated with the increase in epidermal thickness. finally, topical treatment with betamethasone-17-valerate significantly reduced the ear swelling and epidermal thickness. we conclude that over-expression of vegf in the epidermal basal layer plays an important role in skin inflammation and in the development of important psoriatic hallmarks. the model may furthermore be used as an in vivo screening tool for novel anti-psoriatic compounds. background and aims: the diabetes-prone bb (bbdp) rat spontaneously develops insulin-dependent diabetes resembling type 1 diabetes (t1d) in man. the bbdp rat is t cell lymphopenic, with a profound lack of regulatory t cells. the recent thymic emigrants in bbdp rats rapidly undergo apoptosis unless rescued from apoptosis by tcr stimulation. 
the increase in apoptosis is due to a frameshift mutation in gimap5 which causes a severe truncation of the protein. the mutation is the strongest genetic factor for rat t1d. we aim to determine how gimap5 affects the lifespan of t cells. results: overexpression in c58 cells of both wt gimap5 and gimap5 with the bbdp mutation causes an increase in apoptosis, the latter with a very rapid onset. reduction of human gimap5 by rna interference in jurkat cells did not affect the number of apoptotic cells. overexpression of human gimap5 causes apoptosis in jurkat cells and primary naïve t cells, but not in activated t cells. finally, gimap5 mrna is upregulated in in vitro activated human primary t cells (detected by rt-pcr). conclusions: based on the phenotype of the bbdp rat, gimap5 was expected to be anti-apoptotic. however, we report here that overexpression of both mutated and wt gimap5 causes rapid death of the cells. this suggests that gimap5 is pro-apoptotic. the results with human wt gimap5 support this conclusion. recently, much focus has been on the cellular cd38/cadpr signaling system during inflammatory processes. the cd38/cadpr system has been shown to be regulated by interferon, estrogen and the proinflammatory cytokine il-8. to our knowledge, the expression and function of the cd38/cadpr signaling system in the human detrusor muscle have not been described. cd38 protein expression in cultured (explant technique) human detrusor smooth muscle cells (hdsmc) was demonstrated by western blot (wb) and confocal microscopy (cm). cytosolic free ca2+ concentration ([ca2+]i) in hdsmc and isometric force in human detrusor strips were measured by spectrofluorometry and the myograph technique, respectively. wb and cm showed that hdsmc expressed cell surface cd38, which could be upregulated by il-8 (20 ng/ml). in hdsmc briefly activated with il-8 (20 ng/ml), cadpr induced a rapid, transient, dose-dependent increase in [ca2+]i. 
the cyclic adpr-mediated ca2+ increase was greatly reduced in ca2+-free medium, suggesting ca2+ entry as well as ca2+ release. the cyclic adpr-elicited ca2+ increase was mimicked by 3-deaza-cadpr, and blocked by 8-bromo-cadpr, a cadpr antagonist, but not by nifedipine or verapamil. in the presence of il-8, cadpr caused concentration-dependent relaxations of detrusor muscle. we report for the first time that 1) hdsmc express cell surface cd38, 2) the expression and function of cd38 are augmented by il-8, 3) externally added cadpr elicited a rapid, il-8-dependent, and 8-bromo-cadpr-inhibitable ca2+ mobilization, and 4) cadpr induces relaxation of human detrusor muscle. the study indicates a role of cd38/cadpr in human urinary bladder inflammation. miao lin is a formulation of sen miao san and lingzhi that consists of cortex phellodendri, atractylodis rhizoma, radix achyranthis bidentatae, and ganoderma lucidum. these ingredients are reported to have anti-inflammatory and analgesic effects. in this study, we have investigated the anti-arthritic properties of miao lin in an animal model of arthritis induced by unilateral injection of freund's complete adjuvant (fca) into rat knees. the contents of the miao lin capsules were dissolved in saline and administered to the rats daily by the intraperitoneal or oral route for 7 days before induction of arthritis and 7 days after. the extension angle, size and blood flow of the rat knee joints were measured to give indexes of algesia, oedema, and hyperaemia, respectively. assessments of the extent of cell infiltration, tissue proliferation, and erosion of cartilage and bone provided additional indexes of the arthritic condition. a single unilateral injection of fca into rat knees produced significant oedema, algesia, hyperaemia, immune cell infiltration, synovial tissue proliferation, and erosion of cartilage and bone in the ipsilateral knees compared with the contralateral saline-injected knees. 
intra-peritoneal injection of miao lin (50 mg/kg/day) suppressed oedema, pain and hyperaemia in the inflamed knees, and oral administration (500 mg/kg/day) suppressed oedema and hyperaemia. histological examination showed that both routes of administration of miao lin reduced immune cell infiltration and erosion of cartilage and bone, and intraperitoneally administered miao lin also attenuated synovial tissue proliferation. these findings suggest that treatment with intra-peritoneal or oral miao lin could provide significant anti-arthritic effects. an extract of the anti-arthritic thermalife cream contains 13 trace elements. diffusion studies were undertaken to assess the permeability of human epidermis to the trace elements. non-penetrating trace elements were discarded from the test formula (t2), which was compared with the original formula (t1) for in vitro anti-inflammatory efficacy (tnf-a secretion in lps-challenged human monocytes). methods: human epidermis was mounted in vertical franz-type diffusion cells (stratum corneum facing up). t1 cream (n=4) or no cream (n=4) was applied to the donor compartment of the diffusion cells, with pbs in the receptor compartment (3.0 ml; stirred continuously at 37 °c). 240 min after administration, the receptor fluid was analysed for the presence of metal ions by icp-ms. a replication study used a different skin donor. subsequently, human monocyte cultures (10% fcs, 5% co2) were either stimulated with 500 ng/ml lps (e. coli 0111:b4) or not, in the presence of 10% t1, 10% t2, or no treatment. 24 hours after incubation, culture media were collected, centrifuged, and assayed (cytokine elisa). statistical analyses used a treatment-by-lps anova (p < 0.05). results: zinc was the only trace element to penetrate the human epidermis significantly. both formulations strongly suppressed lps-induced tnf-a secretion. 
t2, with zinc only, was more effective than t1 (treatment: f2,12 = 57.13, p < 0.0001; lps: f1,12 = 245.47, p < 0.0001; treatment by lps: f2,12 = 70.01, p < 0.0001). conclusions: the anti-tnf efficacy of thermalife extracts was retained with zinc chloride as the only trace element. (1) osprey pharmaceuticals limited, canada (2) probetex, inc., texas, usa. the ccl2/ccr2 chemokine/receptor axis, infiltrating monocytes/macrophages (m/m), th1 cells and mast cells play a pathological role in tissue damage and fibrosis in kidney diseases. the eradication of the supernumerary activated leukocytes should curb the production of inflammatory mediators and modulate chemokine communications, thus ameliorating disease. a recombinant fusion protein comprised of the human ccl2 chemokine fused to a truncated form of the enzymatically active a1 domain of shiga toxin has been developed. the ccl2 portion binds specifically to ccr2-bearing leukocytes and enters the cells, where the sa1 portion inhibits protein synthesis. the compound was tested in a model of anti-thymocyte serum (ats)-induced mesangioproliferative glomerulonephritis. male rats were injected with ats on day 0 and treated intravenously with vehicle, 50 or 100 mg/kg of the recombinant protein q2d from day 2 until day 8. urine and blood collections were made prior to ats injection and on days 5 and 9. animals were sacrificed on day 9. no treatment-related effects on body weight or signs of clinical toxicity were observed. urine protein levels were decreased in treated animals. histopathological analyses of kidney sections revealed maximum reductions of 40, 36, 38, and 28% for glomerular lesions, m/m count, fibronectin and alpha-smooth muscle actin, respectively. the latter two proteins are markers for extracellular matrix synthesis and mesangial cell activation, respectively. these results indicate a significant renal-protective effect in this model of nephritis. 
further observations suggest that different chemokine-ligand toxins may be used in the treatment of diseases modulated by other chemokine/receptor axes. inflamm. res., supplement 3 (2007) posters. immuno-depletion followed by fluorescence-activated cell sorting based on the cell surface expression of the sca-1 antigen. such isolated cells can subsequently be cultured and differentiated towards the osteogenic, adipogenic or chondrogenic lineage in vitro. using this model, we investigated the influence of the proinflammatory cytokines tnfa and il-1b on early osteogenesis in vitro. under osteogenic conditions, il-1b was found to inhibit cell proliferation in a dose-dependent manner, whereas tnfa exhibited no effect. histochemical examination revealed that the presence of either tnfa or il-1b dramatically decreased mineralization in a dose-dependent manner. q-pcr analysis indicated that in the presence of il-1b, despite increased expression of bone-specific alkaline phosphatase (akp2) mrna, levels of other osteogenesis markers (runx2, col1a and sp7) were decreased. in the presence of tnfa, levels of akp2, runx2 and sp7 were all decreased. our findings indicate that the influence of early mesenchymal progenitor cells on bone remodelling may be substantially altered in the presence of proinflammatory cytokines. coronary artery disease (cad) is characterized by enrichment of inflammatory cells in the vessel wall. we hypothesized that altered transmigration and activation of monocytes may contribute to plaque build-up. in vivo transmigration was studied by use of a skin blister model. blisters are raised by suction and cells are analysed the following morning (0 h blister) and after an additional ten hours of incubation with pbs or autologous serum, corresponding to intermediate and intense blisters. monocytes were analysed by flow cytometry for the expression of cd11b, before and after in vitro fmlp stimulation. chemokines in serum and blister fluid were analysed in parallel. 
cd11b expression on resting monocytes harvested from the 0 h blister was lower in patients as compared to controls (p=0.001). lower expression of cd11b in patients was also observed in the intermediate and intense blisters after stimulation with fmlp (p=0.04 and p=0.005, respectively). the number of transmigrated cells was similar in both groups and increased with the intensity of inflammation. the serum concentration of mip-1alpha was higher among patients (p=0.01), and similar levels were seen in blister fluids. the concentration of mcp-1 was similar in both serum and blister fluid. we demonstrate that monocytes from patients with cad have a reduced expression of, and ability to up-regulate, the adhesion molecule cd11b at sites of inflammation. these differences may modulate events related to the transmigration process and indicate a changed activation pattern. to what extent this feature might contribute to monocyte entrapment at the atherosclerotic site requires further study. in inflammation, nitric oxide (no) is produced by inducible nitric oxide synthase (inos) induced by bacterial products and cytokines, and no acts as a regulatory and proinflammatory mediator. one of the anti-inflammatory mechanisms of glucocorticoids is the inhibition of no production. the aim of the present study was to investigate the mechanisms by which glucocorticoids inhibit inos expression and no production. dexamethasone and the dissociated glucocorticoid ru24858 inhibited no production and inos protein and mrna expression in murine j774 macrophages exposed to bacterial lipopolysaccharide (lps). in the presence of the glucocorticoid receptor (gr) antagonist mifepristone, the effects of dexamethasone and ru24858 on no production were reduced. the role of histone deacetylation in the glucocorticoid effect was studied by using three inhibitors of histone deacetylases (hdacs): the non-selective trichostatin a and apicidin, and the hdac1-selective mc1293. 
the hdac inhibitors reversed the effects of dexamethasone and ru24858 on inos expression or no production. stably transfected a549/8 cells containing the human inos promoter were used in promoter-activity studies. cytokine-induced inos promoter activity was inhibited by dexamethasone, and the inhibitory effect was reversed by trichostatin a. these results suggest that glucocorticoids inhibit inos expression and no production in a gr-mediated and gre-independent manner, possibly through histone deacetylation and transcriptional silencing. we are investigating mechanisms involved in tnfa-induced hyperalgesia in the mouse paw. previously, we have seen that tnfa causes a trpv1-dependent bilateral hyperalgesia. here we investigate the role of cox in this process. female cd1 mice (25-30 g) were given intraplantar injections (i.pl.) of tnfa (10 pmol/50 µl) and tyrode (as vehicle, contralateral paw; 50 µl). thermal hyperalgesic thresholds were measured using the hargreaves technique before and 1 h after injection. indomethacin (20 mg/kg) was co-injected with tnfa, whilst the contralateral paw was injected with tyrode and corresponding amounts of indomethacin vehicle (5% nahco3). another group of mice was injected with tnfa i.pl. plus 5% nahco3, with the contralateral paw injected with tyrode plus indomethacin (20 mg/kg). results are expressed as mean ± s.e.m. and statistical analysis was performed using student's t-test. tnfa (10 pmol) leads to significantly reduced (p<0.05 compared to baseline values) paw withdrawal latency in both paws 1 h after injection, i.e. bilateral hyperalgesia. however, local injection of indomethacin (20 mg/kg) with tnfa prevented this reduction in paw withdrawal latency in both paws, suggesting that prostaglandins are important in the development of hyperalgesia. interestingly, indomethacin co-injected with tyrode in the contralateral paw did not prevent the reduction in paw withdrawal latency in both paws. 
the same results were seen using the selective cox-2 inhibitor nimesulide. in conclusion, cox-2-derived prostaglandins are important in the development of hyperalgesia. local cox-2 inhibition at the site of tnfa-induced inflammation prevents the bilateral hyperalgesia, suggesting that local prostaglandin production is sufficient to cause hyperalgesia in the contralateral paw. hydrogen sulfide (h2s) is synthesized naturally in the body from cysteine by cystathionine gamma-lyase (cse). h2s has been variously reported to exhibit both pro- and anti-inflammatory activity. in an attempt to obtain further information about the role of h2s in inflammation, we examined the effect of dexamethasone on lipopolysaccharide (lps)-mediated endotoxic shock. male sprague dawley rats (240-280 g) were administered dexamethasone (1 mg/kg, i.p.) either 1 h before or 1 h after lps (4 mg/kg, i.p.) injection. animals were killed 3 h after lps administration, and plasma and tissues harvested. as expected, lps injection significantly increased plasma tnfa and il-1b, as well as liver and lung myeloperoxidase (mpo) activity. lps also increased plasma nitrate/nitrite (nox) and h2s concentrations, and liver and kidney h2s synthesis from exogenous cysteine, indicative of upregulation of cse in these tissues. either pre- or post-treatment of animals with dexamethasone reduced signs of inflammation and also reduced the increase in plasma h2s and tissue h2s-synthesizing activity. in separate in vitro experiments, exposure of rat peripheral leucocytes to lps (100 ng/ml, 3 h, 37 °c) resulted in upregulation of both cse and inos (measured by western blot). dexamethasone (100 nm) significantly (p<0.05) reduced expression of both cse and inos. these data provide further evidence that h2s is synthesised during endotoxic shock and suggest, for the first time, that at least part of the anti-inflammatory effect of dexamethasone may be related to inhibition of h2s production. 
(1), u jalonen (1), h kankaanranta (1), r tuominen (2), e moilanen (1) (1) the immunopharmacology research group, medical school, university of tampere and tampere university hospital, tampere, finland (2) the division of pharmacology and toxicology, faculty of pharmacy, university of helsinki, helsinki, finland. tristetraprolin (ttp), also known as nup475, tis11, g0s24 and zfp36, is a factor that binds to the 3'utr of the mrna of some transiently expressed inflammatory genes and regulates mrna stability. ttp has been implicated in the post-transcriptional regulation of e.g. tumor necrosis factor a and inducible nitric oxide synthase. however, the regulation of the expression of ttp itself is largely unknown. in the present study, we investigated the role of classical protein kinase c (cpkc) isoenzymes in the regulation of ttp expression. in j774 macrophages, ttp expression is induced by lipopolysaccharide (lps), and this can be further enhanced by the addition of 100 nm phorbol myristate acetate (pma). this additive effect of pma on ttp was abolished by a prolonged preincubation with a higher concentration of pma for 24 h, which also downregulated the expression of the pkca, pkcbi and pkcbii isoenzymes. the pkc inhibitors ro318220 (inhibits pkcb, & and e), gö6976 (inhibits pkca, b and &) and cgp53353 (inhibits pkcbii) reduced lps + pma-induced ttp protein and ttp mrna expression. the pkcbii inhibitor cgp53353 did not affect ttp mrna half-life, and therefore we measured the effects of cgp53353 on the activation of transcription factors involved in ttp expression. cgp53353 had no effect on the activation of nf-kb, egr1 or sp1. in contrast, cgp53353 reduced the activation of the transcription factor ap-2, which may explain its inhibitory action on ttp expression. the results suggest that pkcbii is involved in the regulation of ttp expression in activated macrophages, possibly through the activation of the transcription factor ap-2. 
the most widespread gracilaria verrucosa in the sea of korea is the attached form of red algae growing on a rocky substrate. in this study, we isolated fourteen compounds from g. verrucosa and investigated their inhibitory effects on the production of inflammatory markers (tnf-a, il-6, il-1 and no) in raw264.7 cells. among them, 10-oxooctadec-8-enoic acid and 11-oxooctadec-9-enoic acid inhibited the production of tnf-a, il-6, il-1 and no at a concentration of 20 mg. these two compounds also showed inhibitory activity on the mrna expression and protein levels of the inflammatory markers (tnf-a, il-6, il-1 and inos) in a dose-dependent manner. these results suggest that g. verrucosa may have anti-inflammatory activity through the inhibition of inflammatory cytokines and inos. lene jensen (1), p hjarnaa (1), j fensholdt (1), p-p elena (2), k abell (1), tk petersen (1) (1) discovery, leo pharma, ballerup, denmark (2) iris pharma, la gaude, france. angiogenesis is known to play an important role in many inflammatory diseases, including arthritis. additionally, inflammation is known to play a role in the angiogenesis-driven disease age-related macular degeneration (amd). we have synthesized a potent angiogenesis inhibitor, leo-a, targeting kinases related to angiogenesis, e.g. vegfr-2. additionally, leo-a has potent effects on a broad panel of other kinases whose normal functions are related to inflammation and immunity. the compound was tested systemically in inflammatory in vivo models in mice and rats. the in vivo models selected include the cia arthritis model (mice and rats), the local gvh rat model, the lps-induced tnfa model (mice and rats), the anti-cd3-induced il-2 response mouse model, and the rat argon laser-induced choroidal neovascularisation (chnv) model, a model for amd. the following results were obtained after systemic treatment with doses of up to 30 mg/kg i.p. or 50 mg/kg p.o. 
once daily: in the local gvh model, leo-a significantly inhibited the growth of the local lymph node by 50%. in the cia model, leo-a had a significant inhibitory effect on the progress of arthritis, both in mice and in rats, when dosed early (pretreatment). in the lps-induced tnfa model in mice, high doses of leo-a were found to inhibit tnfa release. in the chnv model, a significant effect was obtained following systemic treatment. in conclusion, leo-a has an interesting profile for the treatment of diseases in which inflammation and angiogenesis are involved. mice lacking the pi3kg and pi3kd isoforms display severe impairment of thymocyte development, but the outcome of this developmental defect has not been investigated. we show here that mice harboring a pi3kg gene deletion and a pi3kd kinase-inactive mutation, pik3cgd koi, exhibited thymus atrophy, similar to previously reported pi3kg and d double knockout (p110g/d-/-) mice, as well as profound peripheral lymphoid depletion, markedly reduced lambda chain production and seemingly lymphopenia-provoked effector/memory t cell activity. in particular, the serum igg1/igg2a ratio and ige level were elevated in pik3cgd koi mice, corresponding to a skewed th2 profile in vitro. histological analysis revealed eosinophil- and t cell-dominated inflammation in the stomach and salivary gland, as well as occasionally other organs, of pik3cgd koi mice, but organ-specific auto-antibody was not detected in the circulation. on the contrary, when mature wt t cells were treated with a pi3kd-selective inhibitor, alone or together with a pi3kg-selective inhibitor, th1 cytokines were suppressed while th2 cytokines were not augmented in vitro. thus, t cell development, but not peripheral t cell proliferation or cytokine production, requires cooperativity of pi3kg and pi3kd. 
genetic inactivation of these two isoforms leads to the development of severe lymphopenia, a skewed type 2 ig and t cell response, and increased susceptibility to eosinophilic multiple-organ inflammation; whereas pharmacological inhibition at the adult stage would probably not promote a th2 reaction but would attenuate th1-mediated disorders. platelet activating factor (paf) is an important mediator in several pathophysiological processes. paf receptor activation can cause a series of cellular and tissue modifications and can lead to the production and/or release of diverse molecules, including cytokines, chemokines and receptors, amongst others, which are capable of amplifying inflammation. paf can up-regulate kinin b1 receptor expression by various mechanisms. our aim was to investigate the role of kinases in paf-induced kinin b1 receptor up-regulation. wistar rats were treated with paf, or left untreated as controls, 6 h before i.d. injection of 0.1 ml pbs containing des-arg9-bradykinin (dabk, 100 nmol, right hind paw) and 0.1 ml pbs (control, left paw). various kinase inhibitors were administered to the rats after paf treatment, and oedema was measured with a plethysmometer (ugo basile) 10-120 minutes after dabk injection. oedema was expressed in ml as the difference between right and left paws. additionally, paw samples were taken for western blot analysis of the total and phosphorylated forms of jnk and erk1/2. dabk-induced paw oedema after paf injection was significantly inhibited by the selective jnk inhibitor sp600125 and the erk1/2 pathway inhibitor pd98059. western blot analysis shows that phosphorylation of jnk and erk1/2 is important in the up-regulation of b1 receptors. our results clearly show that the phosphorylation of both erk1/2 and jnk map kinases is an important step in the in vivo up-regulation of b1 receptors by paf.
however, the exact mechanisms (transcriptional and post-transcriptional) by which paf can trigger kinase phosphorylation and then up-regulate the b1 receptor require further investigation. continued interest in the development of small-molecule inhibitors of p38 mitogen-activated protein (map) kinase is based on the central role this enzyme plays in inflammatory cell signaling. activation of p38 leads to increased production of pro-inflammatory cytokines such as tnf-a and il-1b, making it a prominent target for anti-inflammatory drug discovery. a virtual screening approach identified 3-(2-chlorophenyl)-6-((4-methoxyphenoxy)methyl)[1,2,4]triazolo[3,4-b][1,3,4]thiadiazole as a potential hit. this was confirmed by synthesis and testing. to explore further sar, a first set of derivatives was prepared by cyclization of the 5-substituted-4-amino-3-mercapto-4h-1,2,4-triazoles with carboxylic acids in the presence of phosphorus oxychloride. the synthetic strategy used allows variation at both positions 3 and 6. synthesis and sar will be presented. cytokines like il-1b and tnfa play central roles in inflammatory diseases like rheumatoid arthritis. production of cytokines in monocytes, macrophages and other cells is triggered by factors such as lps, uv light, osmotic and cellular stress, or physical and chemical insults. in particular, il-1b and tnfa are key regulators, as they amplify inflammatory stimuli in cells by induction and upregulation of further cytokines. as a pivotal enzyme in this signal pathway, p38mapk is considered a validated drug target and, therefore, p38mapk inhibitors are of therapeutic interest. in this study, we developed and validated an economical in vivo whole-blood assay for the optimization and characterization of small-molecule p38mapk inhibitors with promising in vitro activity.
the assay procedure involves defined blood cell stimulation by lps and isolation of tnfa or il-1b, which are subsequently quantified by a tmb-elisa technique via photometric measurement. the validation of the assay conditions involved the well-characterized p38mapk inhibitor sb203580 and a highly active compound developed in our lab. a data set was generated by measuring 18 whole-blood samples, consisting in each case of three male and female individuals, on three different days. statistical methods were used to analyze specificity, baseline-peak correlations, repeatability and robustness, as well as gender-specific intra- and interindividual differences. p38 mitogen-activated protein (map) kinase is required for the biosynthesis and release of the pro-inflammatory cytokines il-1 and tnf-a. inhibition of p38 map kinase could reduce the expression of these cytokines and is therefore a promising target for the treatment of many inflammatory disorders, like rheumatoid arthritis and inflammatory bowel disease. trisubstituted pyridinylimidazoles are potent inhibitors of the p38 map kinase. the scope of this work was to investigate the 2-thio-ether moiety as a position to link the inhibitors to macrocyclic drug carriers. we synthesised 2-alkylsulfanyl-, 4-(4-fluorophenyl)-, 5-(2-aminopyridin-4-yl)-substituted imidazoles as p38 map kinase inhibitors, as substitution at this pyridinyl moiety allows an increase in both anti-inflammatory activity and selectivity. the synthesis and biological testing of the effective 2-aminopyridin-4-yl imidazoles with low inhibitory concentrations are described. biological data demonstrate that both the imidazole derivatives and the linked imidazoles are highly efficient inhibitors. variation at the 2-thio-ether moiety, which interacts with the phosphate-binding region of the enzyme, with polar groups shows no loss of activity. studies underscored the importance of hydrogen bonding with the backbone nh group of met109 for inhibitory activity.
less clear is the importance of the hydrogen bond between n3 of the imidazole ring and lys53 of p38 map kinase. to investigate the role of lys53 in interacting with the scaffold, we prepared two sets of 4,5-diaryl-substituted isoxazoles. these data suggest a dynamic interaction of the core heterocycle with lys53. contrary to observations on the compound vk-19911 bound to p38 map kinase, a nitrogen atom bearing a lone pair at position 3 of the imidazole ring may be necessary to avoid a repulsive interaction with the positively charged side chain of lys53, rather than to form an attractive interaction with p38 map kinase. to complete our study, we focused on the interdependency of the biological effects exerted by substitution at the pyridine ring for a series of 3-substituted and unsubstituted 4,5-diarylisoxazoles, investigating the interaction with the hydrophobic pocket ii of p38. these data indicate that the isoxazole has better scaffold properties compared with imidazoles, suggesting that heterocycles that are stable as regioisomers, such as isoxazole (in contrast to tautomeric imidazoles), are worthy of further investigation. despite intensive research efforts, sepsis is still the leading cause of death in critically ill patients. it is a consequence of an acute inflammatory response to lipopolysaccharide (lps), a major component of the outer membrane of gram-negative bacteria. natural products are known sources of bioactive components exerting antioxidative and anti-inflammatory effects. in this study, we investigated the effect of ferulaldehyde (fa), a natural constituent of red wine, on lps-induced endotoxic shock in mice and on lps-stimulated murine macrophage-like raw 264.7 cells. treatment of c57bl/6 mice with fa significantly attenuated the lps-induced inflammatory response in the gastrointestinal tract and decreased the levels of the two major pro-inflammatory cytokines, tnf-a and il-1b, in the serum.
the serum level of the anti-inflammatory cytokine il-10 was higher in mice treated with fa and lps compared to lps treatment alone. lps-induced phosphorylation, and thereby activation, of akt and jnk was also strongly inhibited by fa treatment, whereas the phosphorylation levels of erk1/2 and p38 mapks remained unaltered. activation of nuclear factor-kappab (nf-kb) in the liver of fa-treated mice was significantly suppressed. although fa affected neither the production of inflammatory cytokines nor the signal transduction pathways in raw 264.7 cells, it decreased lps-induced ros and nitrite production in a dose-dependent manner. our results suggest that fa has antioxidative and anti-inflammatory activities, enhancing antioxidative defense systems, which in turn decrease the inflammatory cytokine response and suppress nf-kb activity via the down-regulation of akt and jnk. myeloperoxidase (mpo), stored in the azurophilic granules of the neutrophil granulocyte, is a heme enzyme with the unique property of oxidising chloride ion to the powerful reactant hypochlorous acid in the presence of hydrogen peroxide. it therefore plays an important role at inflammatory loci in killing invading micro-organisms. on the other hand, hypochlorous acid reacts with a variety of biomolecules, such as amino acids or membrane lipids, and thereby causes host tissue damage, contributing to widespread diseases such as atherosclerosis or rheumatoid arthritis. the formation of chloramines from taurine or ammonium ions is one possibility for reducing tissue toxicity while maintaining bactericidal properties. membrane charge alterations during apoptosis provide docking sites for the cationic enzyme myeloperoxidase, and this close contact with the membrane lipids opens the possibility of lipid alteration pathways, even though these reactions would normally not take place because they are slow.
we investigated alterations in phospholipids after reaction with hypochlorous acid or the myeloperoxidase-hydrogen peroxide-chloride system by matrix-assisted laser desorption and ionisation time-of-flight mass spectrometry (maldi-tof ms). specific reaction products play an important role in the modulation of the immune response. comparative pathobiology of the disease is also discussed within the context of current human and animal reoviral disease models. objectives: to study the safety and efficacy of infliximab plus leflunomide combination therapy in adult rheumatoid arthritis (ra). methods: twenty patients with active ra received leflunomide 100 mg for 3 days followed by 20 mg daily for 32 weeks. at week 2 all patients started infliximab 3 mg/kg and received a further four infusions at weeks 4, 8, 16 and 24. results: the commonest adverse event was pruritus associated with an eczematous rash. there was no relationship between the serum concentration of a77 1726, the active metabolite of leflunomide, and adverse events. the mean disease activity score (das28) fell from 7.18 at week 0 to 5.18 (p<0.0001) at week 4 and remained between 3.85 and 4.85 up to week 32. in those patients remaining on treatment, more than 80% achieved an acr20 response from week 8 to week 28, and up to 46% achieved an acr70 response. conclusions: infliximab plus leflunomide combination therapy appears to be highly efficacious in the treatment of adult ra. however, widespread use may be limited by adverse events, which were common and in some cases severe. objective: the transcription factor nuclear factor-kb (nfkb) regulates the expression of proinflammatory cytokines such as tnfa and il-1, which play pivotal roles in the pathogenesis of rheumatoid arthritis. parthenolide, a sesquiterpene lactone, was reported to inhibit the dna-binding of nfkb. the objective of this study is to investigate the potential of parthenolide to inhibit the pathogenesis of collagen-induced arthritis.
methods: mice were injected i.p. with a cocktail of 4 anti-collagen type ii mabs on day 0, followed by i.p. injection of lps on day 3 to induce anti-collagen mab-induced arthritis. the mice were orally administered parthenolide (50 mg/kg/day) starting on the day of first immunization (day 0) in the prophylactic treatment group and after the onset of arthritis (day 4) in the therapeutic treatment group. clinical disease scores and radiographic and histological scores were evaluated. mrna expression of il-1b and tnfa in the affected joints was measured by real-time pcr. results: clinical disease scores were significantly reduced both in the prophylactic treatment group (7.4 ± 3.7) and the therapeutic treatment group (7.25 ± 3.1) compared to the untreated group (13.6 ± 2.7; p = 0.0163 and p = 0.0084, respectively). histological scores of joint destruction were significantly reduced in the prophylactic treatment group compared to untreated mice (p<0.05). steady-state mrna levels of il-1b and tnfa in isolated joints were significantly decreased in the prophylactic treatment group compared to untreated mice (p<0.05). the results of this study suggest that nfkb is an important therapeutic molecular target for the treatment of inflammatory arthritis. fibrinogen is a soluble, multifunctional plasma glycoprotein that participates in haemostasis and has adhesive and inflammatory functions through specific interactions with other cells. the concentration of this glycoprotein increases in inflammatory conditions. a fundamental paradigm of the acute inflammatory response is neutrophil migration to the affected tissues to mount an initial innate response to the insult. the objective of this study is to characterize how fibrinogen modulates the pattern of neutrophil activation. neutrophils from healthy donors were isolated from peripheral venous blood and loaded with the fluorescent probe dihydrorhodamine 123 (1 µm) to detect oxygen free radical production.
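the collagen-induced arthritis abstract above compares group scores reported as mean ± sd with p-values. as a purely illustrative sketch of such a comparison (not the authors' analysis; the group sizes below are hypothetical, since the abstract does not state them), welch's t statistic can be computed from the summary statistics alone:

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and approximate degrees of freedom
    computed from per-group summary statistics (mean, SD, n)."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# untreated 13.6 +/- 2.7 vs. prophylactic group 7.4 +/- 3.7 (scores from the
# abstract); n = 10 per group is a hypothetical value for illustration only
t, df = welch_t(13.6, 2.7, 10, 7.4, 3.7, 10)
```

a large t with moderate degrees of freedom is consistent with the small p-values reported; the actual test used by the authors is not stated in the abstract.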
the cells (1.0 × 10^6 cells/ml) were then incubated with a range of concentrations of fibrinogen (0-400 mg/dl) for 15 minutes. our results (posters, inflamm. res., supplement 3 (2007) s449) show that fibrinogen leads to an increase in neutrophil activation as measured by free radical production. this effect becomes evident at borderline-high concentrations (300-400 mg/dl), and in some of the individuals it was possible to differentiate two subpopulations of low- and high-responsive neutrophils to activation by fibrinogen. we hypothesize that, in this regard, the concentrations of fibrinogen identified as a risk factor might promote the setting of an inflammatory microenvironment in the circulation and facilitate cardiovascular disease progression. cyclooxygenases (cox-1 and cox-2) are isoenzymes involved in the first steps of the biosynthesis of prostanoids. the constitutively expressed isoform cox-1 is mainly involved in homeostatic processes, while the inducible isoform (cox-2) is associated with inflammatory reactions. various in vitro assays have been developed in order to define the selectivity of nonsteroidal anti-inflammatory drugs (nsaids) against cox-1 and cox-2. however, these in vitro assays can give discordant results, depending on several parameters. the aim of this study was to optimize and standardize two distinct in vitro methodologies to evaluate new nsaid candidates. first, in an enzymatic cox assay, two factors able to conceal the anti-cox activity of nsaids were evaluated and optimized: the arachidonic acid concentration (aa; the cox substrate) and the species of the cox enzymes tested (ovine vs. human). next, we developed an in vitro cell-based assay using human whole blood depleted of plasma and reconstituted in saline solution.
this cell-based assay allows concomitant measurement of anti-cox-1 and anti-cox-2 effects by prostaglandin e2 (pge2) measurement after a23187 (calcimycin) and bacterial lipopolysaccharide (lps) stimulation, respectively. both assays were calibrated and compared by testing 7 reference nsaids, selective or not for cox-1 or cox-2. the 50% inhibitory concentration (ic50) values against cox-1 and cox-2 and the cox-2:cox-1 ratios obtained were in accordance with the previously described nsaid specificities and consistent between the two assays (r = 0.93). in conclusion, both in vitro assays are optimized to determine the potency and selectivity of new nsaid candidates against human cox-1 and cox-2. to increase the success of preclinical drug candidate molecules, there is a need for translatable animal models. the human serum skin chamber technique and the rodent carrageenan-induced air pouch model are two well-established methods for measuring interstitial inflammation in the respective species. we aimed to study the translational aspects of these models. material and methods: in humans, epidermal skin chambers were stimulated with autologous serum for 10 hours. in rats, a dorsal subcutaneous air pouch was stimulated with autologous serum on day 6. the inflammatory response was measured after 4, 8 and 24 hours. the cellular distribution of in vivo transmigrated cells and the expression of cytokine receptors, adhesion molecules and inflammatory mediators were investigated. results: at 8/10 hours the cellular distribution was similar in the air pouch and skin chambers. the major population consisted of granulocytes, followed by monocytes/macrophages and lymphocytes. in both humans and rats the concentrations of mpo and mcp-1 were increased. furthermore, transmigrated cells displayed a different chemokine receptor pattern. in rats, transmigrated cells expressed cd11b and were cd45lo, ssclo and rp-1+ (a granulocyte marker). in humans, transmigrated granulocytes expressed cd16 and cd11b.
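the nsaid assay abstract above characterizes compounds by ic50 values against cox-1 and cox-2 and by the cox-2:cox-1 selectivity ratio. a minimal sketch of how such numbers can be derived from dose-inhibition data, assuming log-linear interpolation between the two concentrations bracketing 50% inhibition (the data and the interpolation method here are hypothetical, not the authors'):

```python
import math

def ic50(concs, inhibitions):
    """Estimate the IC50 by log-linear interpolation between the two
    concentrations that bracket 50% inhibition (concs in ascending order)."""
    points = list(zip(concs, inhibitions))
    for (c1, i1), (c2, i2) in zip(points, points[1:]):
        if i1 <= 50 <= i2:
            frac = (50 - i1) / (i2 - i1)
            return 10 ** (math.log10(c1) + frac * (math.log10(c2) - math.log10(c1)))
    raise ValueError("50% inhibition is not bracketed by the data")

# hypothetical dose-inhibition data: concentrations in uM, % inhibition
concs = [0.01, 0.1, 1.0, 10.0]
cox1_inh = [5, 20, 60, 95]
cox2_inh = [15, 45, 85, 99]
# a COX-2:COX-1 IC50 ratio below 1 indicates preferential COX-2 inhibition
selectivity = ic50(concs, cox2_inh) / ic50(concs, cox1_inh)
```

in practice a sigmoidal (four-parameter logistic) fit over the whole curve is the more common choice; the interpolation above is only the simplest defensible estimate.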
these cells had a significantly higher cd11b expression compared to corresponding cells in the peripheral circulation. our results indicate that the serum-induced human skin chamber technique and the rodent air pouch model translate well to each other. these models may be useful for bridging preclinical and clinical drug discovery. furthermore, they may work as translatable proof-of-mechanism (pom) models for drug candidates targeting different inflammatory components. objectives: to analyse whether neopterin (a by-product of activated macrophage metabolism) is elevated in patients with a systemic inflammatory insult at the time of ischemic stroke. material and methods: we investigated 86 consecutive patients with mean age 67 ± 7.8 years who were admitted within 24 h after ischemic stroke. a control group of 37 patients with mean age 58 ± 4.9 years without ischemic stroke was also tested. measurements of serum neopterin levels were performed using an enzyme-linked immunosorbent assay. results: patients with acute ischemic stroke had significantly higher serum levels (mean ± sd) of neopterin than those without acute ischemic stroke: 9.6 ± 1.2 and 7 ± 0.8 nmol/l. correlation analysis revealed p<0.01. discussion: immune mechanisms contribute to cerebral ischemic injury. the finding of higher serum levels of neopterin, which is regarded as a humoral component of the immune-mediated inflammatory response, supports the hypothesis that patients with ischemic stroke may show higher levels of inflammatory markers such as neopterin. our results indicate increased macrophage activation after ischemic stroke. in patients with stroke it has been shown that neopterin was a determinant of endothelium-dependent vascular dysfunction. however, these preliminary results need to be confirmed by controlled studies. produced a marked (p<0.01) reduction in the number and duration of ventricular tachycardia (vt) during both the ischemic and reperfusion phases.
the total number of ischemic ventricular ectopic beats (vebs) was reduced from 667 ± 116 in the control to 66 ± 22 at a concentration of 50 µg/ml (p<0.001). in the ischemic phase, cynodon dactylon (50 µg/ml) also decreased the incidence of vt from 100% (control) to 33%. in addition, the incidences of reperfusion-induced vt and total vf, and the duration of reversible vf, were significantly lowered by the same concentration (p<0.01 for all). the results show that cynodon dactylon has a protective effect against i/r-induced cardiac arrhythmias in isolated rat hearts. regarding the presence of flavonoid glycosides confirmed during phytochemical screening of the extract and their potential role in the scavenging of oxygen free radicals, it seems that the cardioprotective effects of cynodon dactylon are probably due to its anti-inflammatory properties. key words: cynodon dactylon; arrhythmias; anti-inflammatory; isolated heart; rat. objectives: intestinal ischemia-reperfusion (iri) is well known to be associated with distant organ dysfunction, but no evidence to date has focused on either the brain or skeletal muscle. we thus decided to investigate the effects of iri on nos and cox isoforms, neutrophil infiltration (mpo), lipoperoxidation (tbars) and protein tyrosine nitration (nt) in different brain areas and the diaphragm muscle of wistar rats. methods: iri comprised occlusion of the superior mesenteric artery for 45 min followed by 2 h of reperfusion. sham animals underwent the surgical procedure without interference with blood flow. results: iri resulted in increased expression of mrna for nnos (cortex) and cox-2 (hypothalamus), associated with a marked reduction of ca2+-dependent nos activity in the cortex, hypothalamus and hippocampus (but not in the cerebellum). tbars contents were also reduced in the cortex and hypothalamus. neither mpo activity nor nt was altered by iri in the brain.
diaphragms from animals with iri exhibited increased mpo and ca2+-dependent nos activities, as well as increased tbars content and nt. in contrast, enos protein expression and both nnos gene and protein expression were reduced. no effects were observed on the cox isoforms or on enos gene expression. conclusions: these findings suggest that, within the first 2 h of reperfusion following intestinal ischemia, an oxidative response is observed in the diaphragm, involving both lipid and protein modifications. in the cns, distinctive susceptibility to iri seems to occur in the different areas, probably as a defensive strategy aimed at counteracting the iri-mediated systemic injury. anne-sofie johansson (1), h qui (2), m wang (2), i vedin (1), jz haeggström (2), j palmblad (1); (1), r carnuccio (2), p romagnoli (3), f rossi (1); (1) second university of naples, italy; (2) university of naples, italy; (3) university of florence, italy. we previously found that several inflammatory markers, e.g. nuclear factor-kb (nf-kb), were increased and a neointima was formed in a model of carotid surgical injury (1). the purpose of the present study was to determine whether chronic treatment with rosiglitazone protects the rat carotid artery from surgical injury induced by an incision of the vascular wall. to this aim we measured cox-2, nf-kb, platelet aggregation and neointima formation in rats administered rosiglitazone (10 mg/kg/day, by gavage) for 7 days before carotid injury and 21 days after injury. control rats received physiological saline. 14 days after injury, cox-2 expression, evaluated by western blot, was significantly lower in treated carotids versus controls (p<0.0001). rosiglitazone also caused a significant decrease in nf-kb/dna binding activity, evaluated by electrophoretic mobility shift assay, in nuclear extracts of treated carotids at all time points considered. platelet aggregation was reduced by 30% in treated versus control carotids (p<0.0005).
the influx of inflammatory cells in response to injury, monitored by electron microscopy and immunohistochemistry, was lower in treated than in control carotids starting 7 days after rosiglitazone treatment. the results indicate that rosiglitazone inhibits molecular and cellular inflammatory events induced by vascular injury. the aim of the present study was to investigate the relevance of peripheral macrophage activity for susceptibility to the induction of experimental allergic encephalomyelitis (eae). rats of the eae-susceptible dark agouti and eae-resistant albino oxford strains were immunized with guinea pig spinal cord homogenate (dagpsc and aogpsc), while non-immunized rats served as controls (danim, aonim). on day 15 after immunization, rat peritoneal macrophages were tested for adherence capacity, zymosan phagocytosis and respiratory burst. macrophages from aonim rats exhibited lower adherence capacity and higher phagocytosis and h2o2 production than macrophages from danim rats. immunization decreased adherence and phagocytosis and increased h2o2 production in macrophages from ao rats, but did not influence these activities in macrophages from da rats. our results suggest that the inflammatory activities of macrophages from ao rats could be considered regulatory mechanisms connected with resistance to eae induction. (1), b sehnert (2), h lanig (2), s päßler (2), r holmdahl (3), h burkhardt (1); (1) johann wolfgang goethe university, frankfurt, germany; (2) friedrich-alexander university of erlangen-nuremberg, germany; (3) lund university, sweden. objectives: the aim of the present study was to characterize the interaction sites between the prototypic arthritogenic murine igg mab ciic1, which is highly somatically mutated, and its epitope on type ii collagen (cii, aa 359-369).
methods: the establishment of a dynamic simulation model of a ciic1 single-chain fragment (scfv) in complex with the triple-helical cii epitope permitted structural insights into immune complex formation. the computer-based data were experimentally tested by mutation of predicted critical residues to alanine in the c1 scfvs and the respective cii epitope, which were produced as recombinant constructs. the binding affinities of the mutated scfvs were determined by elisa and surface plasmon resonance measurements. the mutation experiments confirmed the predicted interaction sites of cii in the cdr2 and cdr3 regions of both the heavy and light chains. surprisingly, the model prediction that conversion of the c1 scfv sequence into the respective germline sequence does not affect cii binding affinity (kd 3 × 10^-8) could also be confirmed experimentally by mutagenesis of 13(!) positions. our data indicate that potentially harmful cartilage-specific humoral autoimmunity is germline-encoded. the molecular modeling further demonstrates that the rigid collagen triple helix considerably restricts the likelihood of molecular interactions with the corresponding cdr regions of the antibody compared to globular antigens. these steric constraints might provide an explanation why somatic mutations have no obvious impact on cii recognition by the arthritogenic autoantibody. moreover, the structural insights into cii-autoantibody interaction might be useful in future developments of collagenomimetic ligands for therapeutic and diagnostic purposes. we observed a significant association between mbp-elicited cd4+ t-cell proliferation and active brain lesions, on the one hand, and il-4, il-5 and ifn-gamma, on the other. when grown in the presence of standard serum from a healthy donor, pbmc from healthy individuals responded to mbp with higher il-10 production than pbmc from ms patients.
thus, normal pbmc respond to mbp with production of tnf-alpha, ifn-gamma and il-10, but ms is associated with enhanced tnf-alpha and ifn-gamma responses and decreased il-10 responses, and disease activity is associated with mbp-induced proliferation of cd4+ t cells. (1), k goula (1), p georgakopoulos (2); (1) renal unit, st. andrew hospital, patras, greece; (2) intensive care unit, st. andrew hospital, patras, greece. background: urethritis is an infection of the urethra. most cases are sexually transmitted. haemodialysed patients seem more prone to all kinds of urinary tract infections than others. patients with underlying diabetes are also a specific population at risk. urethritis may be caused by some sexually transmitted diseases (chlamydia, gonorrhea, and ureaplasma urealyticum infections) and by the same organisms that cause urinary tract infections (e. coli or klebsiella). viral causes of urethritis include herpes simplex virus and cytomegalovirus. neisseria gonorrhoeae and c. trachomatis account for most cases of urethritis in men (70%). the aim of our study was to determine all cases of urethritis among haemodialysed patients at our unit during the last five years. we also assessed diabetes as a coexisting factor in the infected patients. we retrospectively reviewed all cases of urethritis among 72 maintenance haemodialysis patients at our center over the past 5 years. the diagnosis was made according to patients' symptoms and signs, but also using urine specimens for culture. 12 patients (16.6%) from the study group were diabetic. results: 4 cases of urethritis were identified. all infected patients were diabetic. isolated microorganisms were e. coli (2 cases), enterobacter aerogenes (1 case). objectives: to explore the ability to use paquinimod as a steroid-sparing drug in an animal model of sle. methods: mice were initially treated with a high dose of prednisolone (2 mg/kg/day).
thereafter the amount of steroid was reduced to 0.5 mg/kg/day and a low dose of paquinimod (0.2 mg/kg/day) was added. the development of glomerulonephritis was measured as hematuria during the experimental period. serum was collected for analysis of anti-dsdna antibodies. kidneys were collected and histopathological observations were performed. organ weight and lymphocyte sub-populations were assessed in the spleen. results: when treatment with high-dose prednisolone was replaced by low-dose prednisolone and paquinimod, a steroid-sparing effect was seen in a number of variables. a significant reduction in the level of hematuria, in spleen enlargement and in the total numbers of cd4+, cd8+ and cd4-cd8- t cells was observed in mice treated with paquinimod and a low dose of prednisolone compared to mice treated with high-dose prednisolone alone. the development of glomerulonephritis was also significantly reduced in these mice. an almost complete inhibition of anti-dsdna in serum was seen in all treated groups. conclusions: when high-dose prednisolone was replaced by low-dose prednisolone and paquinimod, a steroid-sparing effect was seen across a number of variables, e.g. hematuria, t-cell sub-populations and the development of glomerulonephritis. this setting could be of great importance in the future treatment of human sle, in order to reduce the steroid dose needed. and la(ssb). the purpose of this study was to screen for novel antibodies against cell surface antigens in primary sjs. membrane proteins (mp) were isolated from cell membranes (hela cells) and tested individually with sera from sjs patients or healthy blood donors in western blot (wb) at 1:100. mp were separated on 2-d gels and tested in wb (1:100) to locate the appropriate spots for mass spectrometry (ms) analysis. paraformaldehyde-fixed hela cells were incubated with sera from patients or blood donors and examined by fluorescence microscopy.
antigens were isolated at around 35, 50, 75 and 100 kda (64 positive/79 tested patients). the dominant antigen was at 50 kda. large quantities of endogenous proteins were obtained and the membrane fraction was enriched. one of the main obstacles to further studying possible surface antigens, such as the m3 muscarinic receptor, was overcome. proteins were separated on 2-d gels and tested in wb to locate the relevant spots for ms. the correct localization of the patients' antibodies on the cell surface was confirmed by fluorescence microscopy. in conclusion, membrane or membrane-associated antigens were recognised by sera from sjs patients. one of them might correspond to the m3 muscarinic receptor. this identification might help in developing a diagnostic assay for sjs. osamu handa, s kokura, k mizuahima, s akagiri, t takagi, y naito, n yoshida, t yoshikawa. aim: various additives and preservatives are used in cosmetics, foods and medicines in order to prevent deterioration. however, the precise mechanisms of cytotoxicity of these additives are not known. in this study, we investigated the effects of ultraviolet-b (uvb) exposure on additive-treated human normal skin keratinocytes (hacat). the additives most popularly used in cosmetics, such as methylparaben (mp), octandiol (od) and phenoxyethanol (pe), were used. hacat keratinocytes were cultured in mp-containing medium for 24 h, exposed to uvb and further cultured for another 24 h. subsequent cellular viability was evaluated by fluorescence microscopy and flow cytometry using a double-staining method with hoechst 33342 and propidium iodide or annexin-v. the same experiments were performed using od and pe instead of mp under the same conditions. in addition, gene chip analysis was performed in each group. results: uvb exposure enhances the cytotoxicity of these additives even at low concentrations.
gene chip analysis showed that the expression of apoptosis-related genes, oxidative stress-related genes and transcription-related genes was significantly upregulated in each group. these results indicate that some additives, which have been considered safe preservatives in cosmetics, may have harmful effects on human skin when exposed to sunlight.

these kinases in the pathogenesis of psoriasis. recently, increased focal activation of the downstream target mitogen- and stress-activated protein kinase 1 (msk1) was demonstrated in psoriatic epidermis. the purpose of this study was to investigate msk2 and the transcription factor camp-response-element-binding protein (creb) in psoriatic skin and in cultured normal human keratinocytes. keratome and punch biopsies were taken from patients with plaque-type psoriasis. normal human keratinocytes were cultured and stimulated by interleukin-1β (il-1β) or anisomycin.

some anti-inflammatory plant flavonoids, in the form of whole-plant extracts, have been used topically for skin inflammatory disorders. in human skin inflammation, matrix metalloproteinase-1 (mmp-1) plays a pivotal role in the unbalanced turnover and rapid breakdown of collagen molecules. in the present study, with a view to establishing a therapeutic potential against skin inflammatory disorders, the effects of natural flavonoids on mmp-1 activity and mmp-1 expression were examined. from the results, the flavonols including quercetin and kaempferol were revealed to be strong inhibitors of human recombinant mmp-1, with ic50s of 39.6-43.7 µm, while the flavones such as apigenin and wogonin showed weak inhibition. when the effects of flavonoids on mmp-1 induction were studied, it was found that quercetin, kaempferol, apigenin and wogonin (12.5-25.0 µm) strongly inhibited mmp-1 induction in tpa-treated human dermal fibroblasts, but naringenin (a flavanone) did not.
by gel shift assay, these flavonoids were also found to inhibit the activation of the transcription factor ap-1, whereas naringenin did not. among the mapks, quercetin inhibited extracellular signal-regulated protein kinase (erk) and p38 mapk activation, and kaempferol inhibited p38 mapk and c-jun n-terminal kinase (jnk) activation. in contrast, the flavones and naringenin did not inhibit the activation of these three mapks. all these results indicate that the mmp-1 inhibition and mmp-1 down-regulation exerted by flavonoids may block collagen breakdown in certain pathological conditions, and that certain flavonoids are useful for treating skin inflammation, especially by topical application.

(1) (1) university of valencia, spain (2) istituto di chimica biomolecolare cnr, napoli, italy avarol is a marine sesquiterpenoid hydroquinone with several pharmacological properties including antioxidant, anti-inflammatory, and antipsoriatic effects. recently, its derivative avarol-3-thiosalicylate (ta) also demonstrated interesting prospects as an anti-inflammatory drug in vitro and in vivo. it is interesting to note that avarol and ta inhibited nf-κb activation in hacat keratinocytes. now, the effect of avarol and ta was investigated in the tpa-induced hyperplasia murine skin model, which presents some similarities with psoriatic lesions. topical treatment with ta (20 mg/ml) produced a 97 % inhibition of oedema and a strong reduction of pge2 (100 %), ltb4 (100 %) and mpo activity (65 %) in skin homogenates. the inhibitory effect of avarol at the same dose was 79 % for oedema, 90 % for pge2, and 50 % for ltb4 and mpo activity. histological study of both compounds showed a decrease in epidermal hyperplasia as well as in leukocyte infiltration with respect to tpa treatment. in addition, the reduction of cutaneous tnf-α by avarol and ta was detected by immunohistochemical analysis. these compounds were also capable of suppressing nf-κb nuclear translocation in mouse skin.
in summary, our results suggest that inhibition of proinflammatory metabolites by ta and avarol might be beneficial for the treatment of the inflammatory component of psoriasis. its mechanism of action is related to the inhibition of nf-κb activation and can be mediated by the downregulation of intracellular signal-transduction

(1), ams silva(2), cmm santos(2), dcga pinto(2), jas cavaleiro(2), jlfc lima (1) (1) requimte, departamento de química-física, faculdade de farmácia da universidade do porto, porto, portugal (2) departamento de química, universidade de aveiro, aveiro, portugal 2-styrylchromones are a novel class of chromones, vinylogues of flavones (2-phenylchromones), which have recently been found in nature. several natural and synthetic chromones have been demonstrated to possess biological effects of potential therapeutic application. however, the anti-inflammatory potential of 2-styrylchromones has not been explored so far. thus, the aim of this work was to evaluate the putative anti-inflammatory properties of several synthetic 2-styrylchromones by studying their influence on different systems that are related to the inflammatory process. the putative inhibitory effects of several 2-styrylchromones on the proinflammatory enzymes cyclooxygenase 1 (cox-1), cyclooxygenase 2 (cox-2) and 5-lipoxygenase (5-lox) were evaluated in vitro and compared with structurally related flavonoids. the capacity of the studied 2-styrylchromones to scavenge reactive oxygen (ros) and nitrogen species (rns) was also assessed by different in vitro assays, which allowed the influence of those compounds on each reactive species to be identified separately. of the tested 2-styrylchromones, those bearing a catechol ring were shown to be the most effective scavengers of ros and rns, being, in some cases, more active than the flavonoids. no considerable correlation was found between the scavenging profile of these compounds and their interactions with the pro-inflammatory enzymes.
the results obtained from the present study indicate that some of the tested compounds are promising molecules with potential therapeutic value. the usefulness of 2-styrylchromones in the prevention or control of inflammation can only be clarified with additional studies concerning their influence on other relevant mechanisms of this pathology.

the importance of tumor-associated inflammatory cells, which are able to affect different aspects of neoplastic tissue, is a current matter of debate. primarily, monocytes are recruited from the circulation into solid tumors and metastases, where they differentiate into macrophages with several phenotypes and, e.g., may significantly contribute to the uptake of certain radiotracers. we therefore sought to characterize the uptake of various radiotracers used for positron emission tomography (pet) in a well characterized in vitro model of human monocytes/macrophages in comparison with that in various human tumor cells. uptake of the radiotracers 18f-fluorodeoxyglucose (fdg), 3-o-methyl-6-18f-fluoro-l-dopa (omfd), and 18f-labeled native/oxidized low density lipoproteins (nldl, oxldl) in the single- or cocultivated human myeloid (monocytic) leukemia cell line thp-1 was compared with that by squamous cell carcinoma (fadu), mammary carcinoma (mcf-7) and colorectal adenocarcinoma (ht29) cell lines (without or in the presence of specific inhibitors). several thp-1 phenotypes along the monocytic pathway (monocytes, differentiated macrophages, retrodifferentiated cells) were studied before, during and after incubation with phorbol myristate acetate. differentiated thp-1 cells showed, when compared with tumor cells, a comparable fdg accumulation, a considerably lower omfd uptake, and a significantly higher oxldl uptake. on the other hand, during differentiation and retrodifferentiation thp-1 cells evidently establish a distinct sequence of biological processes, also reflected by considerable alterations in radiotracer uptake.
the observed differences in uptake of several radiotracers in vitro between thp-1 phenotypes, and between thp-1 phenotypes and tumor cells, respectively, stimulate studies on the contribution of macrophage radiotracer uptake to the overall uptake in neoplastic or inflammatory lesions in vivo.

genomic and full-length cdna sequences provide opportunities for understanding human gene expression. determination of the mrna start sites would be the first step in identifying the promoter region, which pivotally regulates transcription of the gene. although the mrna start sites of most genes show heterogeneity, this may reflect physiological, developmental, and pathological states of the particular cells or tissues. recently, we have developed a 5′-end sage (5′sage) method that can be used to globally identify the transcriptional start sites and frequency of individual mrnas. a strong association exists between states of chronic inflammation and cancer, and it is believed that mediators of inflammation may be responsible for this phenomenon. another important factor in tumor development seems to be epigenetic effects on tumor suppressor genes. because of their ability to suppress tumor cell proliferation, angiogenesis, and inflammation, epigenetic drugs such as histone deacetylase (hdac) inhibitors are currently in clinical trials. however, how epigenetic drugs mediate their effects is poorly understood. to assess the effects of epigenetic drugs, gene expression was investigated by 5′sage in colon cancer cell lines treated with the epigenetically active agents 5-aza-2′-deoxycytidine, a potent inhibitor of genomic and promoter-specific dna methylation, and trichostatin a, an hdac inhibitor. epigenetic modification induced not only changes in the expression of several inflammation-associated genes and cell-cycle-progression-associated genes in human colon cancer cells but also gene expression from aberrant start sites.
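the 5′-end sage approach described in the abstract above identifies transcriptional start sites from the frequency with which mapped 5′-end tags pile up at a genomic position. as a minimal illustrative sketch of that counting step (the tag coordinates, threshold and function name below are hypothetical, not taken from the study):

```python
from collections import Counter

def candidate_tss(tag_positions, min_tags=3):
    """Group mapped 5'-end tags by genomic position and keep positions
    supported by at least `min_tags` tags as candidate transcription
    start sites, ranked by tag frequency (a proxy for start-site usage)."""
    counts = Counter(tag_positions)
    return sorted(
        ((pos, n) for pos, n in counts.items() if n >= min_tags),
        key=lambda item: -item[1],
    )

# hypothetical tags mapped to (chromosome, coordinate) pairs
tags = [("chr7", 1000)] * 5 + [("chr7", 1003)] * 2 + [("chr7", 5000)] * 4
print(candidate_tss(tags))  # [(('chr7', 1000), 5), (('chr7', 5000), 4)]
```

lowering `min_tags` would also report the minor site at position 1003, which is how the heterogeneity of start sites mentioned above would surface in such a screen.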
colon cancer is one of the most frequently diagnosed cancers in western societies. interleukin-6 (il-6) is a potent, pleiotropic, inflammatory cytokine that contributes to a multitude of physiological and pathophysiological processes. il-6 is produced by many different cell types; the main sources in vivo are stimulated monocytes, fibroblasts, and endothelial cells. a variety of studies have demonstrated that overexpression of il-6 contributes to the pathogenesis of various inflammatory diseases as well as cancer. it has been reported that human colorectal cancer cells display a wide heterogeneity in their potential to express and produce il-6. serum levels of il-6 are elevated in patients with colorectal cancer; however, serum levels of il-6 were found to be independent of il-6 mrna expression in tumor tissue. in this study we analyzed il-6 mrna expression by real-time pcr in 50 sporadic colon cancer tissues as well as corresponding normal mucosal tissue. il-6 mrna expression in tumor tissue was lower than in the corresponding normal mucosal tissue (p=0.0074). there was no correlation between il-6 mrna expression and tumor grade or stage. thus we can conclude that il-6 produced at the tumor site is not involved in sporadic colon cancer progression.

(1), t aiamsa-ard (1), v chinswangwatanakul (2), ki techatraisak (3), s chotewuttakorn (1), a thaworn (1) material and methods: huvec were cultured by standard techniques and grown to confluence until use. serum was obtained from cholangiocarcinoma patients and normal healthy subjects. huvec were treated with 10% serum and incubated for 24 hours. cells were analyzed using [3h]-thymidine incorporation and immunoblotting assays for cell proliferation and cox-2/nos-2 protein expression, respectively. results: serum from cca patients tended to have a greater effect on endothelial cell proliferation than that from healthy control subjects. regarding protein expression, cca serum significantly increased the expression of cox-2, but not nos-2, in huvec.
however, the proliferative effect of cca sera on endothelial cells did not correlate with the expression of cox-2. conclusions: this result suggests that some factors in the serum of cancer patients can induce cox-2 protein expression in huvec. the increase in cox-2 might be one of several factors involved in the proliferation process.

aim: superoxide is responsible for neutrophil-mediated tumoricidal activity. the aim of our work was to monitor the changes in superoxide production by neutrophils attributable to tumor development from the early phase to the advanced stage, and to investigate the effects of ok-432 on neutrophil-derived superoxide production and tumor growth. methods: ah109a rat hepatocellular carcinoma cells were implanted into the hind leg of male donryu rats. pmns were harvested from the rat peritoneal cavity 6 h after intraperitoneal injection of oyster glycogen. superoxide production was measured by cla-dependent chemiluminescence, which has high sensitivity and specificity for superoxide. the counts of peripheral leukocytes were significantly increased during tumor progression, and there was a significant difference between controls and tumor-bearing rats 18 days after tumor inoculation. both pma- and oz-induced superoxide generation by neutrophils was significantly reduced in the advanced stage of cancer. the suppression of neutrophil-derived superoxide generation was accompanied by tumor progression and an increased number of neutrophils in the peripheral blood. subcutaneous administration of ok-432, a biological response modifier, prevented the suppression of neutrophil-derived superoxide generation during tumor progression, which might underlie the observed tendency toward tumor growth suppression. our results suggest that decreased superoxide generation, as well as a high leukocyte concentration in the peripheral blood, could be considered indicators of an advanced stage of cancer.
furthermore, the effect of ok-432 on neutrophil-derived superoxide production in cancer-bearing rats may provide pharmacological evidence for the therapeutic effects of ok-432.

(1), m jokic (2), v zjacic-rotkvic(1), s kapitanovic (2) (1) university hospital sestre milosrdnice, zagreb, croatia (2) division of molecular medicine, rudjer boskovic institute, zagreb, croatia introduction: il-6 is a pleiotropic cytokine mapped to chromosome 7p21-24. its promoter snp -174 g/c is associated with high serum cytokine production and, according to current investigations, may play a role in the development and progression of different gastrointestinal malignancies. we tested its genotypes in gastrointestinal and pancreatic neuroendocrine tumors (gep-nets). patients and methods: dnas from 80 patients diagnosed with gep-net and 162 age- and sex-matched volunteers were analyzed for the -174 g/c snp of the il-6 gene. to analyze the il-6 -174 c/g polymorphism we used the pcr-nlaiii rflp method. for statistical analysis the χ² test and fisher's exact test were used. the level of significance was 0.05. results: there were no differences observed in the frequencies of the -174 high expression (gc and gg) genotypes between the patients and healthy volunteers (p=0.5518), or between patients with gastrointestinal and pancreatic endocrine tumors (p=0.3762). the -174 g/c genotype was statistically more frequent among patients with non-functional pancreatic endocrine tumors (pets) than in those with functional pets (p=0.0246). conclusions: high expression genotypes of the il-6 -174 snp are more frequent in non-functional pets and may be a marker for the mentioned malignancies.

are important in inflammation, are found around and within a variety of human tumors. their number correlates with tumor vascularity and aggressiveness and is a negative indicator for patient survival. how mast cells influence tumor growth is not well understood.
the neuroendocrine peptide neurotensin (nt) is a potent secretagogue of mc that has tumor-promoting effects in animals and promotes the growth of a variety of human cancer cells, acting via its gpcr nt-type 1 receptor (nts1). here we show that hmc-1 human mc express nt-precursor (pront) mrna and protein, and secrete immunoreactive nt when stimulated. rt-pcr on hmc-1 cell rna yielded a band with 99% sequence identity to pront and a band corresponding to the pront processing enzyme, pc5a. immunocytochemistry on hmc-1 cells showed specific staining for pront. stimulation of hmc-1 cells with a23187 + pma, pge2, c48/80 or mastoparan released immunoreactive nt. rt-pcr on hmc-1 cell rna yielded a band with 99% sequence identity to human nts1. western blotting gave bands corresponding to unglycosylated (49 kda) and glycosylated (55 kda) nts1. immunocytochemistry on hmc-1 cells showed specific staining for nts1. these findings have significance for the role of mast cells in tumor growth.

(1), j buddenkotte (1), mp schön(2), m steinhoff (1) (1) university hospital münster, germany (2) university hospital würzburg, germany the proteinase-activated receptor par-2 has been demonstrated to modulate tumor growth, invasion and metastasis in various tissues. however, the role of par-2 in cutaneous cancerogenesis is still unknown. here we show a protective role of par-2 in the development of epidermal skin tumors: we established a mouse skin tumor model using chemically induced carcinogenesis. to this end, par-2-deficient and wild-type mice were painted once with dmba (7,12-dimethylbenz[a]anthracene) for sensitization, followed by topical application of the phorbol ester pma (phorbol myristate acetate, 12-o-tetradecanoylphorbol-13-acetate) twice per week at the same sites. tumors started to appear after eight weeks.
after 13 weeks, par-2-deficient mice showed a significantly increased number of skin tumors (on average 14 per animal) in contrast to the wild-type (eight tumors per mouse). analysis of possible signal transduction pathways activated upon par-2 stimulation in hacat keratinocytes showed an involvement of extracellular signal-regulated kinase 1/2 (erk1/2) and profound epidermal growth factor receptor (egfr) transactivation, leading to secretion of the tumor-suppressing factor transforming growth factor-beta 1 (tgf-β1). thus, our results provide the first experimental evidence for a tumor-protective role of par-2.

(1), ma arbós (1), a fraga (1), i de torres(2), j reventós (1), j morote (3) the pathogenesis of benign prostatic hyperplasia (bph) and prostate cancer (pca) is still unresolved, although chronic inflammation may play a significant role in disease progression. prostate stromal fibroblasts may contribute to the inflammatory process through the expression and secretion of pro-inflammatory mediators, in particular proteoglycan-bound chemokines and other chemoattractants, and through interaction with inflammatory cells such as monocytes. to better understand the molecular mechanisms underlying functional differences among prostate fibroblast populations, our primary objective was to characterize proteoglycan and chemokine gene expression in human fibroblasts of different histological/pathological origin cultured in normal and monocyte-conditioned media. we analysed 20 primary human fibroblast cultures from normal transition zone (tz), normal peripheral zone (pz), benign prostatic hyperplasia (bph), and pathologically confirmed prostate cancer (ca). cells of different origin displayed distinct mrna expression profiles for the core proteins of proteoglycans and for both the sdf1/cxcr4 and mcp1/ccr2 chemokine axes.
when incubated with monocyte-conditioned medium, all four cell types significantly changed sdf1/cxcr4 and mcp1/ccr2 expression in a fibroblast-population-dependent manner. monocyte-fibroblast cell adhesion and the chemotactic response of fibroblasts to human peripheral blood monocytes were investigated in a coculture system. monocytes adhered rapidly to fibroblasts, and preferentially to bph and pz cells. in addition, chemotaxis was significantly induced in both fibroblast cultures after incubation with monocytes. our results suggest that prostatic fibroblasts have a key inflammatory role associated with distinctive proteoglycan gene expression and chemokine induction, which depends on their histological and pathological source. supported by the spanish urology society (madrid, spain).

we have recently shown that the paf-receptor is involved in phagocytosis of apoptotic and necrotic but not viable cells, possibly through its interaction with paf-like molecules present on the surface of these cells. removal of altered cells by macrophages could modify the microenvironment at an inflammatory site, and thus influence tumor growth. in the present study we investigated the impact of apoptotic cells, or of treatment with a paf-r antagonist, on ehrlich ascitic tumor (eat, ip) and melanoma b16f10 (sc). the paf-r antagonist web2170 (5 mg/kg, ip) was given daily for 6 days. we found that eat growth was significantly reduced by pretreatment with web2170, and that inoculation of apoptotic cells (thymocytes) before tumor implantation stimulated tumor growth, an effect reversed by web2170 pretreatment. eat growth was accompanied by increased production of prostaglandin e2, vegf and no, which was significantly reduced by web2170 treatment. in b16f10 melanoma, web2170, alone or in association with an apoptosis-inducing chemotherapeutic agent, dacarbazine (dcb, 40 µg/kg, ip), significantly reduced tumor mass volume and the number of intratumoral small vessels.
in association with dcb, web-2170 reduced active caspase-3 expression in the tumor and markedly increased the survival of tumor-bearing mice. the data obtained here show that during tumor growth, activation of paf-r by molecules present on the surface of apoptotic/necrotic cells, or by paf produced in the milieu, favors tumor growth, and suggest that paf antagonists could be useful in tumor treatment, particularly in association with chemotherapy. financial support by fapesp and cnpq.

(1), mt quiles (1), a figueras(2), r mangues(2), f vinals(2), jr germa(2), g capella (2) (1) institut de recerca vall de hebron, barcelona, spain (2) translational research laboratory, idibell - institut català d'oncologia, spain the malignant potential of tumor cells may be influenced by the molecular nature of k-ras mutations. we have previously shown that codon 12 mutations are associated with an increased resistance to apoptosis. we hypothesized that their different malignant potential in vivo could also be related to the generation of a distinct angiogenic and inflammatory profile, including vascular structure, macrophage infiltration and the expression of angiogenic modulators, proteolytic mediators and the cxcl12(sdf-1)/cxcr4 chemokine axis. to do so we combined in vitro and in vivo studies using stable cys12 and asp13 nih3t3 transfectants. cys12 tumors showed a higher microvessel density associated with a shorter latency period. prominent vessels with α-smooth muscle actin-positive cells surrounded by f4/80 macrophages were observed only in asp13 tumors, associated with a shorter growth period. asp13 tumors displayed increased vegf expression both at the rna and protein levels, mainly produced by tumor cells. tsp-1 protein levels were similarly diminished in both transfectants. higher mmp9 and mmp7 activities and expression were observed in asp13 tumors, probably produced by macrophages or stromal cells. total and active mmp2 levels were higher in cys12 tumors.
the expression of sdf-1 and cxcr4 remained unchanged, while the sdf-1g isoform was selectively induced in cys12 tumors, suggesting sdf-1a or b are induced in asp13 tumors. these results show that distinct k-ras mutations induce specific angiogenic phenotypes. the differential stimulation of vegf expression, metalloprotease activities and sdf-1 expression observed is the result of the joint action of tumor cells and the local microenvironment. contact information: dr maria a arbos via, institut de recerca vall de hebron, unitat de recerca biomedica, barcelona, spain e-mail: maarbos@ir.vhebron.net

incisional hernias (ihs) represent a common complication of laparotomies, incurring considerable healthcare costs. representative ih animal models are lacking and characterization of human tissue resources is scant. this limited understanding of the fundamental mechanisms regulating the destruction of the abdominal wall currently limits the prevention and treatment of ihs. here, we compared tissue specimens (carefully obtained > 5 cm from the defect) and primary fibroblast cultures from fascia and skeletal muscle of subjects with/without ih. the most prominent morphologic characteristics of ih tissue were: alterations of the microstructure of the connective tissue, loss of extracellular matrix (ecm), and a paucity of fibroblasts. in ih muscles, inflammatory infiltrates were observed. other significant changes were: a decreased collagen type i/iii ratio; differential proteoglycan mrna expression; an enhanced metalloproteinases/endogenous inhibitors ratio (mmps/timps); and upregulation of apoptosis effectors (caspase-3 and substrates; tnf-alpha; il-6). in vitro, hernia fibroblasts (ihfs) exhibited significantly higher (2-fold) cellular proliferation and migration rates and decreased strength of adhesion compared to control fibroblasts, even after several passages.
moreover, ihf ultrastructure analysis revealed accumulation of autophagic vacuoles, autophagolysosome-like structures and multilayered lamellar and fingerprint profiles, as well as mitochondrial swelling. based on these descriptive results in human tissues, a novel hypothesis emerges regarding ih formation. specifically, we propose that inflammation-related mechanisms triggering proteolytic and apoptotic effectors regulate cell turnover and eventually contribute to atrophy and progressive tissue insufficiency. overall, this may be causally involved in the mechanisms of ecm destruction yielding ih (supported by fis pi_ 030290 and gencat_agaur_ xt_0417).

(1), m spinola(2), c pignatiello(2), w cabrera (1), og ribeiro (1), n starobinas(1), t dragani (2) (1) butantan institute, sao paulo, brazil (2) istituto nazionale tumori, milan, italy airmax and airmin mice are phenotypically selected for maximal or minimal subcutaneous acute inflammatory response, respectively, and display high inter-line differences in protein exudates and neutrophil infiltration, as well as in bone marrow granulopoiesis, inflammatory cytokines, and neutrophil apoptosis. in a combined experiment of urethane-induced lung inflammatory response and lung tumorigenesis, airmin mice developed a persistent subacute lung inflammation and a 40-fold higher lung tumor multiplicity than airmax mice, which showed a transient lung inflammatory response. we analyzed gene expression profiles of these outbred lines in comparison with the lung cancer-resistant c57bl/6 and lung cancer-susceptible a/j mouse strains. gene expression profile analysis of urethane-treated and untreated animals was performed using the applied biosystems mouse genome survey microarray containing 32,000 mouse transcripts. mrna expression of candidate differentially expressed genes was validated by quantitative real-time pcr, and the over-represented biological themes were analyzed with the ease software.
urethane treatment modulated the gene expression profile in all four lines. among the confirmed genes, vanin3 (vnn3) and major histocompatibility antigen e alpha (h2-ea) were common to both mouse models. the most represented gene categories in the air model were acute phase response, immune response, electron and lipid transport, complement activation and tissue repair. mhc/antigen processing and presentation and immune response were the major themes in the inbred model. moreover, a gene cluster on chromosome 3 (45.0 cm) was observed. the study suggests that the expression of a subset of genes may show a strain- and line-specific modulation pattern during inflammatory response and lung tumorigenesis.

inhibition of tumour-induced angiogenesis constitutes a very attractive anti-cancer therapeutic approach. it is well established that the vegf signal transduction pathway is one of the key drivers of deregulated angiogenesis and that its selective inhibition can lead to inhibition of tumour growth. however, multiple angiogenic growth factors and pathways are involved, leading to redundancy and the overcoming of an inhibition of vegf signalling alone. we have developed a nanomolar inhibitor (compound a) of the receptor tyrosine kinase vegfr-2 (kdr), which was subsequently shown to be a potent inhibitor of closely related kinases (vegfr-1 and -3, pdgfr, kit, csf-1r) but also of unrelated soluble tyrosine kinases (the src family of kinases and raf). compound a potently inhibits vegf-stimulated endothelial cell proliferation but has no effect on non-ec proliferation, which is suggestive of a selective antiangiogenic potential. the unique kinase inhibitory profile of compound a combined with excellent oral bioavailability (87%) has translated into superior in vivo anti-tumour efficacy [inflamm. res., supplement 3 (2007), posters] when compared to the relatively selective kdr inhibitor ptk787.
thus, treatment of nude mice implanted with either commercial atcc-derived tumour cells (a431 and du-145) or low-passage patient-derived tumors (cxf 1103, colon cancer; rxf 631, renal cancer) with compound a resulted in inhibition of tumour growth that was significantly better than for ptk787-treated mice. compound a is fairly well tolerated by rodents, and extended toxicological studies have been initiated to determine the therapeutic index, which may also allow for exploration of other non-cancer indications.

(1), p bobrowski(2), m shukla (3), t haqqi (3) (1) albany medical college, usa (2) rainforest nutritionals, inc, usa (3) case western reserve university school of medicine, usa background: the amazonian medicinal plant sangre de grado (croton palanostigma) has traditional applications for wound healing and inflammation. we sought to characterize an extract (progrado) in terms of safety, proanthocyanidin profile, antioxidant activity and anabolic/catabolic actions in human cartilage explants. methods: acute oral safety and toxicity was tested in rats according to oecd protocol #420. proanthocyanidin oligomers were quantified by hplc and progrado's antioxidant activity was assessed by the orac, norac and horac assays. human cartilage explants, obtained from surgical specimens, were treated with il-1β (10 ng/ml) to induce matrix degradation and glycosaminoglycan (gag) release. progrado (2-10 mg/ml) was tested for its ability to maintain optimal igf-1 transcription and translation in cartilage explants and cultured chondrocytes. results: progrado displayed no evidence of toxicity (2000 mg/kg po), leading to a ghs safety rating of 5/unclassifiable.
oligomeric proanthocyanidin content was high (158 mg/kg), with the majority of oligomers > 10-mers. progrado was a remarkably potent antioxidant, and in an ex vivo model of inflammation-induced cartilage breakdown, progrado was exceptionally effective in reducing both basal and il-1β-induced glycosaminoglycan release from human cartilage explants. progrado prevented il-1β-induced suppression of igf-1 production from human cartilage explants, as well as stimulating basal igf-1 production (p<0.05). comparable changes in igf-1 gene expression were noted in cultured human chondrocytes. conclusions: progrado has a promising safety profile, significant chondroprotective and antioxidant actions, and promotes the production of the cartilage repair factor igf-1. this suggests that progrado may offer therapeutic benefits in joint health, wound healing and inflammation.

the solvent extracts from korean fermented soybean (chungkukjang) were evaluated for their protective effects against the generation of free radicals and lipid peroxidation. the activities of chungkukjang were compared with several antioxidants and soybean isoflavones including genistein and daidzein. in addition, the protective effects against h2o2-induced cytotoxicity and oxidative dna damage in the nih/3t3 fibroblast line were examined. the extracts from chungkukjang and soy isoflavones inhibited the generation of 1,1-diphenyl-2-picrylhydrazyl (dpph) radicals, and had an inhibitory effect on ldl oxidation. the extracts from chungkukjang and soy isoflavones strongly inhibited h2o2-induced dna damage in the presence or absence of endonuclease iii and fpg. furthermore, they showed cytoprotective effects against h2o2, without cytotoxicity except for the hexane extract at high concentrations (> 450 mg/ml). the ethanol and n-butanol extracts appeared to have the most potent antioxidant activities.
These in vitro results show that the extracts of chungkukjang may be a useful antigenotoxic antioxidant, scavenging free radicals, inhibiting lipid peroxidation and protecting against oxidative DNA damage without cytotoxic effects. Moreover, the extracts of chungkukjang inhibited MDA formation in the liver, DNA damage assessed by comet assay, and micronucleated reticulocyte formation in peripheral blood of KBrO3-treated mice. These in vivo results were similar to those of the in vitro experiments. Therefore, chungkukjang containing soy isoflavones is a promising functional food that can prevent oxidative stress. (Supported by the BK21 project from the Korea Research Foundation.) SIRT1 is a histone deacetylase involved in oxidative stress and aging. Because the role of aging and exercise on sirtuin activity in rats is unknown, we investigated the effects of exercise on age-related changes in SIRT1 activity, comparing heart (H) and adipose (A) tissue of sedentary young (n=10), sedentary old (n=10) and trained old (n=10) rats. The trained old rats performed an 8-week moderate treadmill training. In H and A tissue of all rats, SIRT1 activity was evaluated by assay kit; peroxidative damage was assessed by measuring malondialdehyde (MDA) and protein-aldehyde adducts of 4-hydroxynonenal (4-HNE); MnSOD, catalase and FOXO3a were measured by western blot; and GADD45a, cyclin D2 and FOXO3a mRNA by RT-PCR. Aging reduced SIRT1 activity in H (p<0.0001) without effects in A, producing an increase of MDA (H, p<0.0005; A, p<0.0001) and 4-HNE (H, p<0.005; A, p<0.0005), and a decrease of MnSOD (p<0.02) and catalase (p<0.0001) expression in both H and A. Aging did not affect FOXO3a protein expression in H, or FOXO3a mRNA in A. Exercise produced an increase in H FOXO3a protein expression (p<0.02) and in A FOXO3a mRNA, associated with higher MnSOD (H, p<0.01; A, p<0.005) and catalase (H, p<0.0001; A, p=0.01) levels in both H and A of aged rats.
In heart, exercise-induced higher SIRT1 activity brought about a decrease in cyclin D2 and an increase in GADD45a mRNA expression. In A we found a similar decrease in cyclin D2, without changes in GADD45a mRNA expression. These findings suggest that exercise is able to increase SIRT1 activity in aged rats. (1), T Horiguchi (1), K Abe (2), H Inoue (2), T Noma (1); (1) Institute of Health Biosciences, the University of Tokushima Graduate School, Tokushima, Japan; (2) Minophagen Pharmaceutical Co. Ltd, Japan. Objectives: Glycyrrhizin (GL) is a major component of Glycyrrhizae radix (licorice) that is generally used for the treatment of hepatitis. GL has regulatory activity on arachidonic acid metabolism, immunological function, and anti-viral effects. However, the molecular mechanisms of these effects remain unclear. To analyze the molecular basis of GL signaling, we performed microarray analysis using CCl4-induced mouse hepatitis models. Methods: Eight-week-old ICR male mice were treated intraperitoneally with 100 µl/kg BW of CCl4 with or without 250 mg/kg BW of GL. After 24 hours and 72 hours, livers and serum were collected and analyzed. For microarray analysis, the gene expression patterns of 72-hour-treated livers (CCl4, or CCl4 and GL) and untreated livers were compared. Results: GL treatment dramatically decreased the GPT activity in plasma at 24 hours compared to that in CCl4-treated plasma. However, the mRNA expression levels of inflammatory genes such as phospholipase A2, HSP47 and procollagen were still very high in GL-treated liver. After 72 hours, their mRNA levels were significantly reduced in GL-treated mice compared to those of CCl4-treated liver. We then screened 41,000 genes by microarray and found that 71 genes were up-regulated and 63 genes were down-regulated in CCl4+GL compared to CCl4 treatment. Interestingly, ROS scavenger-related genes were significantly up-regulated in CCl4+GL. Detailed analysis is currently ongoing.
We found a unique relationship between GL activity and ROS regulation. This finding suggests a novel way to treat inflammatory diseases including hepatitis. Objectives: Experimental autoimmune encephalomyelitis (EAE) is a demyelinating autoimmune disease that results from an immunological reaction against different myelin components in the CNS. It is widely employed as an animal model of human multiple sclerosis. Interestingly, the number of studies relating these diseases to peripheral organs is limited. We thus investigated the consequences of EAE on the degree of lipoperoxidation (TBARS) and MPO activity in different rat peripheral organs (e.g. lung, spleen, liver, stomach, duodenum, colon, ileum, kidney and bladder). University of Waikato, Hamilton, New Zealand. Mitochondria play a fundamental role in the life and death of all eukaryotic cells. Cells with dysfunctional mitochondria are known to have higher levels of a molecular stress protein (Cpn60). This protein is increasingly implicated in modulating cellular inflammation. We have developed an in vitro model cell system using THP-1 monocyte cells with compromised mitochondrial bioenergetic functions to investigate the relationships between mitochondrial dysfunction, Cpn60 expression and modulation of proinflammatory cytokine responses. We have found that the ability of Cpn60 to modulate TNF-α expression was strongly correlated with the loss of mitochondrial bioenergetic functions in our THP-1 cells. We also demonstrate that such modulation involves both the ERK1/2 and p38 MAPK pathways. The significance of these results in relation to the role of mitochondria as modulators of inflammation will be discussed.
(1), B Arnold (2), G Opdenakker (3); (1) Jagiellonian University, Department of Evolutionary Immunobiology, Krakow, Poland; (2) German Cancer Research Center, Department of Molecular Immunology, Heidelberg, Germany; (3) Rega Institute for Medical Research, University of Leuven, Laboratory of Immunobiology, Leuven, Belgium. We showed that in mice genetically deprived of metalloproteinase 9 (MMP-9−/−) at least one compensatory mechanism operates, as there are elevated levels of PGE2 of COX-1 origin expressed by peritoneal macrophages during zymosan peritonitis; this leads to the increased early vascular permeability observed in these animals. Infiltration of the peritoneal cavity by inflammatory neutrophils is also changed in MMP-9−/− mice: at 6 hrs of inflammation, when the highest numbers of neutrophils are otherwise detected in the peritoneum, the cell numbers are significantly lower in these mice in comparison to their controls. On the contrary, at 24 hrs of peritonitis, when resolution of peritonitis normally takes place, no decrease in neutrophil counts is observed. Thus the aim of the present study was to evaluate whether impairment of neutrophil apoptosis could account for this latter phenomenon in MMP-9−/− mice. For this, the numbers of apoptotic (annexin V) and necrotic (7-AAD) peritoneal leukocytes were evaluated, and levels of active caspases were tested by application of either caspase-detecting antibodies or fluorochrome-labelled inhibitors; all analyses were performed by flow cytometry. The results revealed that both the numbers of apoptotic cells and the levels of active caspase 3 were significantly lowered in MMP-9−/− mice, while levels of caspases 7, 8 and 9 were significantly elevated in comparison to control animals. We conclude that an impairment of apoptosis is observed in MMP-9−/− mice during zymosan peritonitis and that it is due to the decreased levels of active caspase 3. The increased activity of the other examined caspases is most probably independent of apoptosis.
(1), H James (1). The selective inhibition of nitric oxide generation by inhibiting the activity of nitric oxide synthase (NOS) isoforms represents a novel therapeutic target for the development of anti-inflammatory agents. The aim of this study was to evaluate the activity of NOS inhibitors in experimentally induced inflammation, pain and hyperalgesia. The effect on acute inflammation was studied in carrageenan-induced paw edema in rats. The effects on carrageenan-induced hyperalgesia, tail-flick response to radiant heat and acetic acid-induced writhing were also studied. The NOS inhibitor NG-nitro-L-arginine methyl ester (L-NAME), at 10 and 20 mg/kg, produced a dose-dependent inhibition of paw edema (42% and 57% at 2 h; 45% and 57% at 4 h). A marked reduction in paw edema was observed with NG-monomethyl-L-arginine acetate (L-NMMA), 10 mg/kg (79% at 2 h; 91% at 4 h). The selective inducible NOS (iNOS) inhibitor aminoguanidine hemisulfate inhibited the paw edema at a dose of 20 mg/kg (59% at 2 h; 64% at 4 h) but not at a dose of 10 mg/kg. The effects were comparable to the nonselective COX inhibitor indomethacin at 5 and 10 mg/kg (40% and 77% at 2 h; 29% and 72% at 4 h, respectively) and the selective COX-2 inhibitor rofecoxib, 5 mg/kg (49% and 78%, respectively). NOS and iNOS inhibitors significantly increased the pain threshold latency in the tail-flick test. They inhibited the acetic acid-induced writhes, the effect being comparable to indomethacin. However, carrageenan-induced paw hyperalgesia was not inhibited. The results suggest that nitric oxide plays a role in carrageenan-induced acute inflammation and that both NOS and iNOS inhibitors have potential anti-inflammatory and anti-nociceptive activity.
(1), P Hart (2), J Edwards (1), C Quirk (1); (1) Molecular Pharmacology Limited (USA), Australian Division, Perth, Western Australia; (2) Telethon Institute for Child Health Research, Perth, Western Australia. Thermalife cream, an anti-arthritic biological product, has been successfully used off-label for sunburn recovery. A novel product derived from Thermalife was assessed for its therapeutic potential in Oxsoralen-UVB burns. As a possible mechanism for the sunburn efficacy, suppression of TNF-α and IL-1β production by human monocytes was assessed in vitro. Methods: Sunburn: four sites were marked on the arm of the subject. Three sites were exposed to Oxsoralen (1%) plus UVA/UVB light; one site was exposed to Oxsoralen only. Cream was applied at 5 min or at 4 hrs after injury. A third injury site was not treated. Photographs were taken before, 24 hrs, and 7 weeks after injury. Cytokines: human monocyte cultures (10% FCS, 5% CO2) were either stimulated with 500 ng/ml LPS (E. coli 0111:B4) or not, in the presence of 0% or 10% active ingredient. 24 hrs after incubation, culture media were collected, centrifuged, and assayed (cytokine ELISA). Results: At 24 hrs after Oxsoralen-UV, the 5-min treatment site showed slight erythema, the 4-hr treatment site had pronounced erythema and slight blister formation, whereas the untreated site had pronounced erythema and strong blister formation. Seven weeks after injury, the 5-min site was normal, the 4-hr site was a dark colour, whereas the untreated site had a significant scar. Oxsoralen alone had no effect on the skin. The novel product suppressed LPS-induced TNF-α and IL-1β secretion by 48.7% and 71.1%, respectively. Conclusions: A novel Thermalife-derived product reduced total injury after Oxsoralen-enhanced UVA/UVB burns, which is possibly related to cytokine suppression.
(1), P Hart (2), J Snowden (1), Maud Eijkenboom (1); (1) Molecular Pharmacology Limited (USA), Australian Division, Perth, Western Australia; (2) Telethon Institute for Child Health Research, Perth, Western Australia. A mixture of bovine plasma protein fractions and zinc chloride (Bov-Zn) was assessed for its ability to regulate cytokine production by LPS-stimulated monocytes. Dose-response curves for TNF-α suppression were generated. Further, competition with FCS in the culture medium and the metabolism of monocytes under the influence of Bov-Zn were assessed. In all experiments the culture medium environment was similar. Human monocyte cultures (10% FCS, 5% CO2) were either stimulated with 500 ng/ml LPS (E. coli 0111:B4) or not, in the presence of 0%, 2.5%, 5%, 7.5%, 10%, 20% or 40% Bov-Zn (two pooled experiments). 24 hours after incubation, culture media were collected, centrifuged, and assayed (cytokine ELISA). A competitive inhibition design for the standard TNF-α assay was set up for 0%, 1%, 5%, 10% FCS against 0%, 2.5%, 5%, 10% Bov-Zn. The culture media were treated as above. Metabolism of non-proliferating monocytes was measured via accumulation of bioreduced formazan (Promega CellTiter 96) in treated and untreated cell cultures over 0-45 hrs at intervals. The IC50 for TNF-α suppression was reached at 2.5% Bov-Zn in each of two experiments. FCS did not compete with Bov-Zn in suppressing TNF-α in LPS-stimulated monocytes. At low FCS concentrations Bov-Zn stimulated TNF-α production in the absence of LPS. This TNF-α increase was countered with increasing concentrations of FCS. Metabolism of cells was not affected by 10% Bov-Zn. Conclusions: Bov-Zn could reliably and effectively reduce TNF-α secretion in vitro, without competing with the FCS in the culture medium and without disturbing the metabolism of monocytes. Inflammatory diseases such as rheumatoid arthritis (RA) result from overproduction of cytokines including TNF-α and IL-1β.
These cytokines are known to be regulated by the stress-activated p38 MAP kinase pathway. Because of this, inhibition of p38 MAP kinase has been one of the most compelling targets for the treatment of inflammatory disease. Over the last 10 years, numerous groups have reported on the development of p38 MAP kinase inhibitors. X-ray co-crystallization with the enzyme suggests a propensity to accommodate structurally diverse molecules. Regions of the binding site are known to be unique to p38 versus other kinases, enabling the development of p38-selective molecules. Inflamm. Res., Supplement 3 (2007), Posters: anti-inflammatory activities. A series of 11 labdane-type diterpenoids (1-11) with various patterns of substitution were tested for potential anti-inflammatory activity. Of these compounds, 4 and 11 were selected to evaluate their influence on targets relevant to the regulation of the inflammatory response. These derivatives displayed good in vivo anti-inflammatory activity, and maximum inhibitions of 50 and 90% were noted in the 12-O-tetradecanoylphorbol-13-acetate (TPA)-induced ear oedema in mice. In addition, inhibition of myeloperoxidase activity, an index of cellular infiltration, was also observed. The diterpenoids also reduced the production of nitric oxide, prostaglandin E2, and tumour necrosis factor-alpha in bacterial endotoxin-activated RAW 264.7 macrophage cells, with IC50 values in the range 1-10 µM. Inhibition of these inflammatory mediators was related to reductions in the expression of inducible nitric oxide synthase and cyclooxygenase-2, as determined by western blot analysis and RT-PCR. Since nuclear factor-kappaB (NF-κB) plays a central role in the transcriptional regulation of these proteins, we investigated the effects of these diterpenoids on this signalling pathway. Our results indicate that both compounds interfere with the phosphorylation of IκBα and IκBβ, resulting in inhibition of their degradation.
In summary, the anti-inflammatory effects of these labdane diterpenoids are related to the inhibition of inflammatory mediators through blocking of NF-κB activation, and they offer potentially therapeutic perspectives in inflammatory conditions. The aim of the study was to investigate the mechanisms of anti-inflammatory action of a new drug, Fepolen, at a dosage of 100 mg/kg, prepared from bee products (bee pollen and a phenolic hydrophobic extract of propolis) for prostatitis treatment. To fulfil this task, a model of zymosan-induced oedema was used, whose dynamics make it possible to estimate the influence of a drug on both routes of arachidonic acid metabolism, via cyclooxygenase and via lipoxygenase. A comparison was made with diclofenac at a dosage of 8 mg/kg and with substance FFW-755 (an inhibitor of cyclooxygenase and lipoxygenase with a high anti-inflammatory effect (41%) at a dosage of 16 mg/kg). The results on the influence of the drugs on the dynamics of zymosan inflammation show that the anti-inflammatory action of Fepolen is based on a decrease in the release of biogenic amines and in lipoxygenase activity, and is greater than the effect of diclofenac. Fepolen revealed its highest effect during the first 30 min and 1 hour of inflammation, which was greater than the action of diclofenac. These data prove that Fepolen decreases the lipoxygenase activity that is responsible for the inflammation during this period. During the following hours, the therapeutic effects of Fepolen and diclofenac were at the same level. Fepolen showed the same dynamics of anti-inflammatory action as FFW-755, which demonstrates its property of influencing both routes of arachidonic acid metabolism. Together with previous results under conditions of carrageenan inflammation, we can conclude that the anti-inflammatory action of Fepolen is based mostly on influence on lipoxygenase rather than cyclooxygenase.
The aim of this study was to establish a method by which probiotic bacteria can be selected for their immunomodulatory properties, specifically the ability of certain strains to suppress an inflammatory response. The gastrointestinal inflammatory condition Crohn's disease involves a Th1 response with increased levels of proinflammatory cytokines such as TNF-α and IL-12, and mouse models of Crohn's disease show that the balance of IL-10/IL-12 is crucial for disease progression. We have used mouse bone marrow-derived dendritic cells (BMDC) to model a proinflammatory Crohn's disease-like condition in vitro, with cocktail-induced BMDC secretion of IL-12, IL-6 and TNF-α. The model was validated using anti-inflammatory molecules such as dexamethasone and prostaglandin D2, which were able to suppress the cocktail-induced IL-12 secretion. Further validation of the model is provided by the fact that probiotic strains which are able to suppress TNBS-induced colitis in mice in preventive studies also show potent anti-inflammatory activity in our model. Among clinically relevant as well as novel probiotic strains, we have selected strains with potent anti-inflammatory properties and are currently investigating the possible mechanisms of action of these strains. In summary, our established model is suitable for identification of anti-inflammatory activity of probiotic strains, and potentially of other immune-suppressing components, for rational selection of candidates for further preclinical and clinical evaluation and development. P11-12 Inflammation and human incisional hernia pathophysiology, Maria Antonia Arbos Via (1). EAE was induced by immunization of female Lewis rats with guinea-pig myelin basic protein (MBP) in complete Freund's adjuvant (CFA), and the animals were studied at stage III of the disease (characterized by complete paralysis of the hind limbs). Compared to CFA rats, EAE resulted in increased MPO activity (U/mg tissue) in kidney (280 ± 75 vs. 121 ± 14, 179 ± 35; p<0.001) and higher TBARS contents (nmol MDA/mg tissue) in liver. Acknowledgements: CAPES, CNPq, FAPESP. Contact information: Ms Simone Teixeira, University of São Paulo, Department of Pharmacology, Campinas, Brazil; e-mail: mone@usp.br. Tolerability, investigator and subject global assessments and rescue medication consumption. (Supported by the BK21 project from the Korea Research Foundation.) Contact information: Mr Young Hoon Kim. Here we report on the development of the PGE2-mimetic combination therapy (DP-837953), which inhibits basal and LPS/TLR4-induced TNF-α, IL-1β, and MMP-1, -8, -9 and -13 in human and murine synovial membranes and peripheral macrophages. In a murine model of chronic synovitis (dorsal skin air pouch), DP-837953 dramatically inhibited IL-1β, TNF-α, MIP-2, MCP-2 and IL-8 expression, delayed the profile of leukocyte/neutrophil extravasation and reduced exudate volume. In a model of inflammatory arthritis (collagen-induced arthritis, CIA), DP-837953 markedly reversed the inflammatory pathology by reducing synovial hyperplasia, cartilage erosion and articular inflammation. In addition, TNF-α, IL-1β, MMP-8 and, to a lesser extent, MMP-9 expression/synthesis levels were strongly suppressed as judged by RT-PCR and ELISA measurements. We have developed two RIAs, one for functional blood levels of the above-mentioned anti-TNF-alpha constructs and one for anti-Abs (all isotypes), and we have used these methods to monitor patients treated with infliximab/Remicade and etanercept/Enbrel; I shall present some of these data (7, 8). Anatomy-Physiology, Faculty of Medicine, Laval University, Quebec, Canada. Neutrophils, which are often the first leukocytes to migrate to inflamed sites, can generate LTB4 from the 5-lipoxygenase pathway, PGE2 through the inducible cyclooxygenase (COX-2) pathway, and cytokines/chemokines such as TNF-alpha, IL-1beta, IL-8, MIP-1alpha, MIP-1beta, MIP-2alpha and MIP-3alpha.
Engagement of the adenosine A2A receptor (A2AR) blocks the in vitro synthesis of LTB4, while it potentiates the COX-2 pathway in fMLP-treated neutrophils. In addition, it selectively prevents the expression and release of TNF-alpha, MIP-1alpha, MIP-1beta, MIP-2alpha and MIP-3alpha in Toll-like receptor-4-stimulated human neutrophils. Little effect was observed on IL-1beta and IL-8. Using the murine air pouch model of inflammation with A2AR-knockout mice, we observed that activation of the A2AR positively impacts the expression of COX-2 in vivo, with particular magnitude in inflammatory leukocytes. In mice lacking the A2A receptor, neutrophils that migrated into the air pouch 4 h following LPS injection expressed higher mRNA levels of TNF-alpha, MIP-1alpha and MIP-1beta than neutrophils from wild-type mice. Together, these results indicate that neutrophils are important mediators of adenosine's protective effects. Given the uncontrolled inflammatory phenotype observed in A2AR-knockout mice, and in view of the potent inhibitory actions of PGE2 on inflammatory cells, the increased COX-2 expression and prevented release of TNF-alpha, MIP-1alpha and MIP-1beta caused by A2AR activation, observed particularly in neutrophils, may take part in an early modulatory mechanism promoting the anti-inflammatory activities of adenosine. Sepsis induced by endotoxins, including lipopolysaccharide (LPS), is a major problem in clinical medicine. For better insight into the molecular pathways, and to assess markers of endotoxin-induced sepsis, we applied two-dimensional gel electrophoresis (2D-PAGE) and MALDI-TOF to follow the changes of significant proteins in a murine macrophage cell line, RAW 264.7, after challenge with LPS (Escherichia coli 026:B6) for 12, 18 and 24 hours. We identified 21 proteins, from approximately 500 detected protein spots, with either increased or decreased relative abundance as a result of LPS treatment.
The proteins identified with increased expression are retinoblastoma binding protein 7, CapG protein, poly(rC) binding protein 1, isocitrate dehydrogenase 3 (NAD+) alpha, lactate dehydrogenase 1 (A chain), guanine nucleotide binding protein (G protein) beta polypeptide 2-like 1, triosephosphate isomerase 1 and proteasome alpha 5 subunit; those with decreased expression are acidic ribosomal phosphoprotein P0, malate dehydrogenase (soluble), proliferating cell nuclear antigen, proteasome (prosome, macropain) subunit alpha type 1, and Rho GDP dissociation inhibitor (GDI) beta. Many of these altered proteins have interesting functions in inflammation. With the information obtained by the proteomic approach, it is possible to improve current methods of monitoring endotoxemia and to identify new therapeutic targets. The ubiquitous mitogen-activated protein (MAP) kinases are important enzymes in signal-transduction cascades which regulate diverse cellular events such as cell transformation, proliferation, differentiation, and apoptosis. They are therefore potential drug targets for therapeutic intervention in the treatment of inflammation, cancer, and other immune diseases. Based on a virtual screening approach, we identified 6-amino-1-benzyl-4-(4-bromophenyl)-3-methyl-1,4-dihydropyrano[2,3-c]pyrazole-5-carbonitrile as a potential novel lead structure for p38 MAP kinase inhibitors. A set of compounds was prepared starting from differently substituted pyrazol-5(4H)-ones via a base-catalyzed condensation with aldehydes and CH acids, such as malononitrile, to provide them for biological tests in a p38 enzyme assay. First structure-activity relationships confirm the value of this novel lead. This study was conducted to determine the physiological C-reactive protein (CRP) and alpha-1-acid glycoprotein (AAG) levels for two groups of beagle dogs: healthy dogs of various ages and pregnant dogs.
Serum CRP levels were measured by ELISA, and AAG levels were measured in healthy beagles of various ages by TIA and then separately, in pregnant beagles, by SRID. Serum CRP levels ranged from 1.5 to 16.0 µg/ml in male and from 1.8 to 18.9 µg/ml in female dogs. No significant sex-related differences were observed in the values. Further, there were no significant age-related differences either. Serum CRP levels increased during pregnancy and peaked at 70.2-90.4 µg/ml 30 or 45 days after ovulation, demonstrating two characteristic features of CRP level changes in pregnant dogs. Serum AAG levels ranged from 40 to 960 µg/ml in male and from 47 to 833 µg/ml in female dogs, without any significant sex- or age-related variation. Serum AAG levels increased in all pregnant beagles and peaked in the middle of gestation at 250-1,000 µg/ml. Despite a high value of 1,210-1,360 µg/ml being observed for serum AAG levels in 3 pregnant beagles inoculated with Staphylococcus aureus, the levels in umbilical cord blood were below the detection limit of SRID (40 µg/ml). No significant sex- or age-related differences were observed in either serum CRP or AAG levels, and both levels increased during pregnancy. The finding that AAG levels in umbilical cord blood were below the detection limit suggests that AAG is not transported across the placenta. Polymorphonuclear neutrophils (PMNs) play a key role in the inflammatory response against infectious agents. However, they can elicit significant tissue damage, and in this respect anti-inflammatory drugs are of interest. In this study, we examined the effect of PBI-1393, a low-molecular-weight immunomodulatory molecule, on PMN activation by LPS both in vitro and in vivo.
We measured by ELISA the production of TNF-α by human LPS-activated PMN in the presence or absence of PBI-1393. The ability of PBI-1393 to modulate PMN activation and recruitment in vivo was assessed using a rat air pouch model of inflammation. Exudates from different groups of animals (controls and PBI-1393-treated animals, n=6) were used to assess leukocyte infiltration and to measure TNF-α, MCP-1 and PGE2 production by ELISA. In vitro, PBI-1393 was able to significantly decrease TNF-α production by human LPS-activated PMN, by 28.8% ± 0.08% (p<0.05). In vivo, PBI-1393 significantly decreased the production of TNF-α (41.4% ± 7.2%; p<0.005), MCP-1 (16.3% ± 8.4%; p<0.05) and PGE2 (29.2% ± 13.5%; p<0.05) induced by LPS injection. However, PBI-1393 did not significantly inhibit leukocyte infiltration. These results show that PBI-1393 is able to modulate PMN activation and the inflammatory response, and suggest its potential use as an anti-inflammatory agent. (1), LJ Lowenstine (2), AJ Norris (2), T Spangler (2), LM Woods (2); (1) Zoological Society of San Diego, USA; (2) Department of Pathology, Microbiology, and Immunology, University of California, Davis, USA. This study investigated the role of a novel reovirus in a 2004 outbreak of necrotizing typhlocolitis in American crows in California. Included is a detailed characterization of the necrotizing and inflammatory characteristics of the disease, as well as a discussion of the implications of these findings for proposed mechanisms of pathogenesis. Complete histopathology, including stains for lesion characterization and potential concurrent etiologic agents, was performed on all outbreak crows. Feces and ceca were submitted for culture, parasitology, and negative-contrast electron microscopy. Two control groups (n=10 each) were selected for parasitology and EM (group 1), and gross and histopathology (group 2).
All outbreak cases and group 2 controls tested negative for West Nile virus by PCR. All outbreak crows had marked necrotizing heterophilic typhlocolitis, fibrinonecrotizing splenitis, and variable intestinal lamina proprial necrosis and hemorrhage. Two cases had multifocal hepatic necrosis. Negative-contrast EM revealed reovirus particles in 100% (4/4) of outbreak cases and in 0% (0/10) of controls. Supplemental tests failed to suggest other concurrent or confounding etiologic agents. Overall, the findings suggest an association between the reovirus and the outbreak of typhlocolitis, and the absence of reovirus in controls suggests that it is not ubiquitous in the crow population. There was a notable absence of similar typhlocolitis in 65 archived crows submitted to the VMTH from 1994-2004, suggesting an emerging corvid disease in California, which bears further investigation. Mitogen-activated protein kinase (MAPK) pathways play an important role in the signalling system activated by proinflammatory cytokines. Among the most important cascades, the activation of ERK1/2 by MEK1/2 is reported to be responsible for inflammatory responses and degradation of osteoarthritic cartilage. AS701820, a selective MEK1 inhibitor, demonstrated anti-inflammatory properties in reducing TNF-alpha production induced by LPS injection (IC50 2 mg/kg). Therefore, the primary aim of the present study was to assess the therapeutic strength of AS701820 in a mouse model of collagen-induced rheumatoid arthritis (CIA), assessing the effect of the compound on structural changes related to the cartilage. CIA is characterized by severe polyarthritis affecting peripheral joints, synovial hyperplasia with persistent inflammation, and cartilage erosion. AS701820 treatment was initiated when signs of arthritis were clinically visible (in terms of paw swelling and redness) and was continued for 7 days (twice daily), by oral route, at doses of 10, 30 and 100 mg/kg.
AS701820 at 30 and 100 mg/kg significantly reduced clinical arthritic read-outs such as clinical score and paw swelling. At histology, vehicle-treated animals showed severe inflammation and joint surface erosion. Administration of AS701820 significantly decreased inflammatory infiltrates, and treated cartilage surfaces presented normal levels of proteoglycan content. In conclusion, the results obtained in this study clearly demonstrate that selective blockade of MEK1 could be considered an innovative therapeutic approach to treat rheumatoid arthritis. Experimental evidence has shown that the toxicity of Ni salts may involve inflammatory processes, with a subsequent overproduction of reactive oxygen species (ROS), and carcinogenicity. Neutrophils are the most abundant leukocytes in blood and participate actively in the inflammatory innate host defense response. However, relatively little is known about the potential of nickel salts to activate human neutrophils. Thus, the aim of the present study was to evaluate the putative stimulation of the oxidative burst in isolated human neutrophils by nickel nitrate. The measurement of the neutrophil burst was undertaken in vitro, by chemiluminescence, by monitoring the oxidation of luminol by neutrophil-generated ROS and reactive nitrogen species (RNS). Enzymatic inhibitors and specific reactive species scavengers were used to evaluate which species were involved in neutrophil activation by nickel nitrate. The results obtained showed that nickel nitrate stimulates the human neutrophil burst in a concentration-dependent manner, within levels that may be attained in vivo. Under the present experimental conditions, the reactive species involved in neutrophil activation by nickel nitrate were the superoxide radical (O2•−), hydrogen peroxide (H2O2), the hydroxyl radical (HO•) and hypochlorous acid (HOCl).
the observed activation of the isolated human neutrophil burst by nickel nitrate, and the subsequent tissue damage due to a sustained formation of reactive species, may contribute to the deleterious effects attributed to this transition metal, though this assumption needs to be confirmed in vivo.

(1), h spalteholz (1), u reibetanz (1), p salavei (1), m fischlechner (1), h-j glander (2), j arnhold (1) (1) university of leipzig, medical faculty, institute for medical physics and biophysics, germany (2) university of leipzig, department of dermatology, andrology training centre of the european academy of andrology, germany

unintentional childlessness, often caused by common conditions such as inflammation, affects 15-20% of german couples. inflammations of the male genital tract lead to an infiltration of polymorphonuclear granulocytes (pmn) and induce a restricted spermatozoa quality, associated with early triggered acrosome reaction (ar) and apoptosis as well as changes in the lipid structure and reduced mobility. stimulated pmn release the strongly cationic heme protein myeloperoxidase (mpo), which is able to bind to negatively charged membrane surfaces, e.g. apoptotic cell membranes with externalized phosphatidylserine (ps). a population of freshly prepared spermatozoa shows only a very small amount of cells with mpo binding ability as well as externalization of ps. the number of spermatozoa able to bind mpo rises considerably in samples containing predamaged cells or after introducing the ar, as could be observed with rhodamine b isothiocyanate (ritc)-labelled mpo and antibody techniques by fluorescence microscopy as well as flow cytometry. the activation of mpo with its substrate hydrogen peroxide (h2o2) in the presence of chloride ions generates the powerful oxidizing and chlorinating species hypochlorous acid (hocl) and markedly enhances the number of annexin v positive and non-vital cells.
components of seminal plasma as well as serum albumin can protect spermatozoa from the deleterious effects of mpo. the coincidence of ps externalization and mpo binding to spermatozoa surfaces indicates an up to now unknown role of this enzyme in the recognition and removal of apoptotic cells during inflammation.

recent findings suggest a crucial role of proteinase-activated receptor-2 (par2) in inflammation and innate immunity. par2 is the second member of a novel g protein-coupled receptor subfamily with seven putative trans-membrane domains. this subfamily is characterized by a unique mechanism of receptor activation. accessible serine proteases cleave the receptor to expose a new, previously cryptic, n-terminal sequence ("tethered ligand") which further interacts with the same receptor and activates it. tryptase, trypsin, and bacterial serine proteases are capable of directly activating par2. par2 is expressed by human neutrophils; however, its functions on these cells remained unclear. the data of our present study indicate that par2 agonists enhance interferon gamma (ifn-gamma)-induced up-regulation of cell surface fc-gamma-ri, one of the key receptors involved in neutrophil phagocytic activity. moreover, par2 agonists (serine proteases as well as a synthetic activating peptide) and their receptor represent an additional system which controls neutrophil transendothelial migration and apoptosis in vitro. additionally, there is a significant increase of par2 expression on the neutrophil cell surface in septic patients as compared to cells from healthy volunteers. together, our results indicate that par2 may be involved in the pathophysiology of acute bacteria-induced human diseases (sepsis or septic shock, for example), potentially by regulating neutrophil apoptosis, transendothelial migration and fc-gamma receptor expression.
aim: to ascertain the role of macrophages as direct inducers of regeneration after renal ischemia/reperfusion, and to establish whether inflammatory conditions contribute to the process. we determined whether adoptive transfer of macrophages at different stages of kidney inflammation after mouse renal i/r could restore reparation, and assessed the influence of inflammation in the process. results: i/r provoked an increase in renal regeneration, as evaluated by immunohistochemistry and pcr of stathmin and pcna mrna. the cytokine profile revealed the influence of the inflammatory environment on kidney repair. regeneration was macrophage-dependent, decreasing when depletion was provoked, and increasing with adoptive transfer of macrophages; however, administration of resting macrophages did not induce repair at the time points at which tissue was inflamed, and was only able to promote regeneration in the absence of inflammation (72 hours). pro-inflammatory cytokines increased at the early stages of reperfusion, coinciding with low regeneration, and anti-inflammatory cytokines increased during the longer periods of reperfusion, when regeneration was more evident. conclusions: macrophages directly induce renal regeneration after ischemia/reperfusion in an inflammation-dependent manner.

(1), k bendtzen (1), f sellebjerg (2), ch nielsen (3)

antibodies against myelin basic protein (mbp) are present in sera from patients with multiple sclerosis (ms), but the role of these antibodies is controversial. we collected sera from 22 ms patients and 17 healthy individuals and found that both groups contained igm anti-mbp antibodies, while ms sera contained small amounts of igg anti-mbp. however, the two groups of sera did not differ significantly with respect to the content of either antibody subclass.
addition of mbp to the various sera, and subsequent addition of the mixtures to normal peripheral blood mononuclear cells (pbmc), resulted in a significant deposition of igm on cd14+ monocytes, indicating that formation of mbp/igm complexes had occurred. this deposition was strongly inhibited by addition of 10 mm edta to the sera, indicating that it was complement dependent. the pbmc produced significant amounts of il-10, tnf-alpha and ifn-gamma upon stimulation with mbp, and the extent of the cytokine production did not depend upon whether sera from ms patients or from healthy controls were present. however, disruption of the tertiary structure of mbp by boiling significantly reduced the production of all three cytokines, supporting a role for antibodies in the induction of cytokine responses to mbp. we propose that natural igm autoantibodies may form complexes with mbp, facilitating the uptake of mbp by antigen-presenting cells (apc). since sera from ms patients did not enhance this uptake and the subsequent cytokine production, the mechanism may be part of an appropriate peripheral regulation of self-reactivity. we are currently investigating this possibility.

loredana postiglione (1), g tarantino (2), a spanò (2), p ladogana (1), fl perrone (1), s padula (2), a riccio (2) (1) federico ii university medical school of naples, department of molecular and cellular biology and pathology l. califano, naples, italy (2) federico ii university medical school of naples, department of clinical and experimental medicine, naples, italy

background: hepatitis c virus (hcv) infection can induce immunological disorders with different clinical expression, such as arthritis, sjögren syndrome and various forms of vasculitis. aim: to study the prevalence of anti-cyclic citrullinated peptide antibodies (anti-ccp) in a group of patients affected by hcv-related arthritis and the eventual correlations with rheumatoid factor (rf) and/or antinuclear antibodies (ana), and articular involvement.
study design: 30 patients with arthritis were selected from a population of 380 subjects affected by hcv infection. each patient was evaluated by clinical examination (23 showed polyarticular and 7 mono-oligoarticular involvement), by radiographic assessment of joint involvement (8 patients presented joint erosions), and by ana, rf and anti-ccp positivity. results: 33.3% of patients were positive for anti-ccp, without significant correlation between this parameter and ana, rf or articular involvement. anti-ccp was positive in 4 out of the 8 patients with joint erosions, and only in 6 out of the 22 patients without joint erosions. this frequency, analyzed by chi-square test, showed no significant difference. our patients presented a noteworthy prevalence of anti-ccp positivity. these data call into question the specificity commonly attributed to this parameter in the diagnosis of rheumatoid arthritis.

expression of nkg2d on cd4+ t cells is generally rare in both mice and humans, but has been reported in a number of inflammatory diseases, including rheumatoid arthritis, crohn's disease and an animal model of type 1 diabetes. the monoclonal antibody cx5 recognizes murine nkg2d and has been shown to block ligand binding and mediate internalization of nkg2d. furthermore, cx5 can inhibit and/or ameliorate disease in animal models of type 1 diabetes and inflammatory bowel disease. thus, it is very likely that nkg2d plays an important role in the development of inflammatory and autoimmune diseases. since little is known about the pharmacokinetics and pharmacodynamics of the cx5 antibody, we decided to study this in both regular balb/c mice and immunodeficient cb17.scid mice. different doses of cx5 antibody were injected intraperitoneally, and pk and pd were measured by elisa (anti-cx5 elisa in serum) and flow cytometry (down-regulation of nkg2d on cd49b+ nk cells) for up to two weeks after administration.
we found that cx5 very efficiently down-regulates nkg2d on cd49b+ nk cells and that the effects of the antibody can be seen for more than two weeks after one single injection. finally, we propose a model which may be helpful in predicting the effects of different doses of cx5 antibody in vivo.

(1), k mehta (2), n deo (2), j chaudhary (2), p bobrowski (3) (1) albany medical college, usa (2) vedic lifesciences, usa (3) rainforest nutritionals, inc, usa

background: the efficacy and safety of reparagen in treating osteoarthritis was compared to glucosamine sulfate in a mumbai-based multi-center, randomized, double-blind study. methods: subjects (n=95) were screened and randomized to receive glucosamine sulfate (n=41, 1500 mg/day) or reparagen (n=38, 1800 mg/day), a polyherbal consisting of vincaria (uncaria guianensis) and rni 249 (lepidium meyenii), administered orally, twice daily. primary efficacy variables were womac scores, visual analog score (vas) for pain, and response to treatment, defined as a 20% improvement in womac pain, with assessments at 1, 2, 4, 6 and 8 weeks. secondary variables were also assessed. results: subject randomization was effective and both treatments showed significant benefits in primary outcomes within one week (p<0.05), with a similar, progressive improvement over the course of the 8 week treatment protocol (42-49% reduction in total womac or vas scores). the response rate was substantial for both glucosamine (88%) and reparagen (92%), which exceeded placebo responses (55%, p<0.01), supported by investigator and subject assessments. tolerability was excellent and safety parameters were unchanged. rescue medication use was significantly lower in the reparagen group (p<0.01), and serum igf-1 levels were unaltered. conclusions: both reparagen and glucosamine sulfate produced substantial improvements in osteoarthritis symptoms. response rates were high and the safety profile was excellent, with significantly less rescue medication use with reparagen.
we speculate that the high response rate to glucosamine sulfate may reflect higher baseline pain levels or synergy with dietary curcumin.

inflammation accompanies and aggravates the progression of all modern human chronic pathological conditions. growing evidence indicates the beneficial role of proper nutrition in controlling inflammation. we investigated the effects of selected essential nutrients in experimental inflammation and the molecular mechanisms involved. the tested nutrient mixture (nm) consisted of green tea catechins, the citrus flavonoids hesperidin, naringenin and quercetin, ascorbate, lysine, proline, arginine and cysteine. systemic inflammation in mice challenged with bacterial lipopolysaccharide (lps) was monitored by blood plasma levels of fourteen key inflammatory cytokines. two-week supplementation with 250 mg nm/kg body weight prior to lps challenge provided significantly greater protection than did supplementation with ibuprofen. induction of interleukin-6 (il-6) and monocyte chemoattractant protein-1, two cytokines especially responsive to lps challenge, was reduced in nm-supplemented animals by 58% and 86%, respectively. the corresponding reduction in the ibuprofen group was 34% and 43%. the protective mechanisms involved were assessed in human cultured u937 macrophages stimulated with lps. the cytokines most responsive were tumor necrosis factor alpha (71% and 17% reduction by supplementation with nm and ibuprofen, respectively) and il-12 (66% and 15% corresponding reduction). nm supplementation dramatically reduced prostaglandin e2 secretion by stimulated macrophages, along with cyclooxygenase-2 (cox-2) cellular protein expression. mrna levels for cox-2 and inflammatory cytokines were also dramatically reduced. quercetin was the most effective nutrient when tested individually. however, nm appeared to surpass the combined effect of the individual components.
we conclude that the tested combination of essential nutrients demonstrates strong beneficial effects in experimental inflammation by targeting responsible gene expression.

(1), hp kim (1), kh son (2) (1) college of pharmacy, kangwon national university, south korea (2) department of food and nutrition, andong university, south korea

chalcones belong to the flavonoid family of plant origin, and some of them possess anti-inflammatory activity. recently, several natural and synthetic chalcones were reported to inhibit inducible nitric oxide synthase (inos)-catalyzed no production in cell cultures. in the present study, to find the optimal chemical structures and to elucidate their action mechanisms, 41 synthetic chalcones having substituent(s) on the a- and b-rings were prepared, and their effects on inos-catalyzed no production were evaluated using lps-treated raw 264.7 cells. among the tested compounds, 2-methoxy-3,4-dichlorochalcone (ch15), 2-hydroxy-6-methoxychalcone (ch29), 2-hydroxy-3-bromo-6-methoxychalcone (ch31) and 2-hydroxy-4,6-dimethoxychalcone (ch35) potently inhibited no production (ic50s, 7.0-9.9 µm). the favorable chemical structures were found to be a methoxyl substitution in the a-ring at a position adjacent (2 or 6) to the carbonyl moiety, with/without a 2-(or 6-)hydroxyl group, and a 3-halogen substitution in the b-ring. when the cellular action mechanisms of ch15, ch31 and ch35 were further examined, it was revealed that ch15 and ch31 clearly down-regulated inos expression while ch35 did not. moreover, ch15 and ch31 were proved to suppress nuclear transcription factor-kb activation. from the results, it is suggested that certain chalcone derivatives potently inhibit inos-catalyzed no production by different cellular mechanisms, inos down-regulation or inos inhibition, depending on their chemical structures. these chalcone derivatives may possibly be used as lead compounds for developing new anti-inflammatory agents.
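an ic50 such as those quoted above is read off a concentration-response curve. as a minimal sketch (in python, using made-up example readings, not the study's data), the concentration at which inhibition crosses 50% can be estimated by log-linear interpolation between the two bracketing measurements:

```python
import math

def ic50_log_interp(concs, inhibitions):
    """Estimate an IC50 by log-linear interpolation between the two
    measured concentrations that bracket 50% inhibition.
    concs: rising concentrations; inhibitions: % inhibition at each."""
    pairs = list(zip(concs, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(pairs, pairs[1:]):
        if i_lo <= 50.0 <= i_hi:
            # interpolate on the log-concentration axis, where
            # dose-response curves are approximately linear near the midpoint
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_ic50
    raise ValueError("50% inhibition is not bracketed by the data")

# hypothetical NO-inhibition readings (% inhibition) at rising
# concentrations in uM -- illustrative values only
concs = [1.0, 3.0, 10.0, 30.0]
inhib = [10.0, 30.0, 55.0, 85.0]
ic50 = ic50_log_interp(concs, inhib)  # falls between 3 and 10 uM
```

in practice a four-parameter logistic (hill) fit over all points is preferred, but the interpolation above illustrates why the reported ic50s must lie between tested concentrations.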
an oligomeric stilbene, alpha-viniferin (avf), was isolated from the root of carex humilis (cyperaceae) as an inhibitor of cyclooxygenase (cox)-2 activity by bioassay-guided fractionation. avf was later found to down-regulate lipopolysaccharide (lps)-induced cox-2 expression as well as to inhibit nuclear factor (nf)-kb activation, in addition to its inhibitory effect on cox-2 activity. furthermore, the compound exhibited an antiarthritic effect in vivo. avf is a trimer of resveratrol and contains benzofuran moieties in its central part. starting from benzofuran and its related chemicals, 2-cyclohexylimino-6-methyl-6,7-dihydro-5h-benzo[1,3]oxathiol-4-one (lyr-64) was discovered to inhibit lps-induced nf-kb transcriptional activity in raw 264.7 macrophages. lyr-64 reduced lps-induced dna binding activity and nuclear translocation of nf-kb, as well as inhibiting lps-induced degradation and phosphorylation of inhibitory kb (ikb) protein. these results suggest that lyr-64 could suppress an lps signaling molecule, putatively the ikb kinase (ikk) complex, upstream of ikb degradation in the nf-kb activating pathway. lyr-64 inhibited the in vitro kinase activity (gst-ikb phosphorylation) of wild type ikkbeta or a constitutively active ikkbeta mutant (c/a, cys-179 to ala), but did not affect that of another constitutively active ikkbeta mutant (ss/ee, ser-177 and 181 to glu). therefore, lyr-64 could inhibit the lps-induced nf-kb activating pathway by targeting the ser-177 and/or 181 residues on the activation domain of ikkbeta. as pharmacological actions, lyr-64 prevented nf-kb-dependent expression of inducible nitric oxide synthase, cox-2, and inflammatory cytokines at the transcription level in lps-stimulated raw 264.7 macrophages. furthermore, lyr-64 protected against lps-induced septic shock in vivo.

faculty of medicine, institute of pharmacology, ljubljana, slovenia

part of the anti-inflammatory action of antidepressants can arise from their effect on histamine elimination at the site of inflammation.
in mammals, histamine is mainly degraded by two enzymes: histamine-n-methyltransferase (hnmt) and diamine oxidase (dao). the aim of the present investigation is to establish whether the antidepressants amitriptyline and sertraline can affect histamine metabolism. their effects on enzyme activity and mrna expression were studied in guinea pig tissues. plasma and tissue homogenates were incubated with saline (control) and different antidepressant concentrations. specific enzymatic activities of dao and hnmt were determined by radiometric assay. in addition, guinea pigs were treated with saline or amitriptyline (4 mg/kg, ip), after which dao and hnmt mrnas were detected by pcr in different tissues. results showed that amitriptyline, at 100 nm and at 50, 100 and 500 µm, increased guinea pig plasma dao activity by 5, 7, 17 and 11%, respectively, while sertraline increased it at 30 µm (by 15%). at higher concentrations (100 and 500 µm), sertraline decreased dao activity. in the guinea pig tissues, hnmt activity changes were found only when incubated with amitriptyline; sertraline had no effect. at 10 and 50 nm amitriptyline, the activity of hnmt increased by 20 and 9%, respectively. in animals treated with amitriptyline, an induction of dao and hnmt mrna expression was noticed in several tissues. our results suggest that in guinea pigs, due to higher histamine metabolism, anti-inflammatory effects can be expected at lower concentrations of antidepressants. the effect might be the opposite at higher amitriptyline concentrations.
steven hefeneider (1,2), c macarthur (1), d trune (1), s mccoy (2) (1) oregon health and science university, portland, oregon, usa (2) targeted gene delivery, inc., portland, oregon, usa

engagement of toll-like receptors (tlrs) by bacterial components such as lps and dna initiates inflammation. the current study examines a novel anti-inflammatory peptide, termed p13, for treatment of inflammation induced by either lps or bacteria. peptide p13 was derived from an immunoregulatory protein of vaccinia virus, and interferes with tlr signaling. in this study we examined the efficacy of p13 to limit inflammation in a mouse model of sepsis and a model of middle ear inflammation, termed acute otitis media (aom). we demonstrate in the sepsis model that in vivo treatment of mice with p13 inhibited lps-induced production of serum inflammatory mediators. moreover, p13 treatment, administered after initiation of inflammation, significantly increased survival of mice injected with lps. in the aom model, peptide p13 significantly reduced in vivo middle ear inflammation and fluid accumulation initiated by h. influenzae. assessment of route of administration and delayed treatment studies demonstrated the efficacy of peptide p13. simultaneous injection of bacteria and peptide p13 resulted in a significant reduction in fluid accumulation, infiltrating cells, and tympanic membrane thickness. fluid accumulation within the eustachian tubes was also significantly reduced following p13 treatment. subcutaneous and oral administrations of p13, but not intravenous administration, were also efficacious in reducing inflammation. administration of p13 after initiation of an ongoing inflammatory response was effective at reducing inflammation and fluid development. taken together, these results demonstrate the therapeutic potential of peptide p13 to limit an inflammatory response and suggest a possible new treatment strategy for bacterial-induced inflammation.
(1), c zhou (2), y zhang (2), m sun (2), x wan (1), h yu (2), x yang (2), rd ye (3), j-k shen (1)

formyl peptide receptor-like 1 (fprl1) is a structural homologue of fpr, which binds chemotactic peptides of as few as 3 amino acids (e.g., fmet-leu-phe, fmlf) and activates potent bactericidal functions in neutrophils. in comparison, fprl1 ligands include peptides of 6-104 amino acids, such as trp-lys-tyr-met-val-[d]met (wkymvm) and other synthetic peptides. to determine the core peptide sequence required for fprl1 activation, we prepared various analogues based on wkymvm and evaluated their bioactivities in an fprl1-transfected cell line. although substitution of d-met6 resulted in loss of activity, removal of val5 together with d-met6 produced a peptide that retained most of the bioactivities of the parent peptide. the resulting peptide, wkym, represents a core structure for an fprl1 ligand. further substitution of lys2 with nle slightly improved the potency of the tetrapeptide, which becomes a dual agonist for both fprl1 and fpr. based on these structure-activity studies, we propose a model in which the modified tetrapeptide trp-nle-tyr-met (wnleym) binds to fprl1 through aromatic interactions involving the side chains of trp1 and tyr3, hydrophobic interaction of nle2, and the thio-based hydrogen bonding of met4, with the respective residues in fprl1, which have not been identified. the identification of the core sequence of a potent peptide agonist provides a structural basis for the future design of peptidomimetics as potential therapeutic agents for fprl1-related disorders.

there is a growing awareness of the interaction of food constituents with the immune system. the present study aims to evaluate the immunomodulatory effects of two of these nutritional components, i.e. glycine and lactoferrin. mice orally supplemented with glycine, lactoferrin or a combination were injected intradermally (in the ear) with zymosan.
ear swelling, as a measure of inflammation, as well as il-1, tnf-alpha and il-6 levels in the ear and the number of tnf-alpha producing spleen cells were analyzed. glycine and lactoferrin were able to decrease the zymosan-induced inflammatory response both locally (decreased ear swelling and pro-inflammatory cytokine levels) and systemically (reduced number of tnf-alpha producing spleen cells). glycine effects (20, 50 and 100 mg/mouse/day) were concentration dependent, whereas for lactoferrin only the lowest doses (0.1 and 1 mg/mouse/day) inhibited the inflammatory response significantly. surprisingly, higher doses of lactoferrin (5 and 25 mg/mouse/day) failed to influence the inflammatory reaction. a combination of both nutrients (lactoferrin 0.1 mg/mouse/day in combination with glycine 20 or 50 mg/mouse/day) inhibited the zymosan-induced ear swelling synergistically. additionally, an additive effect of both components was seen on the number of tnf-alpha producing spleen cells. the present data show anti-inflammatory activity of glycine and lactoferrin in the zymosan-induced inflammation model. moreover, a combination of both components demonstrated a synergistic effect on inflammation of the skin and an additive effect on the number of tnf-alpha producing spleen cells.

(1), p sambrook (1), k fukudome (2), m xue (1) (1) university of sydney, st leonards, nsw, australia (2) saga medical school, saga, japan

objectives: to investigate i) the expression of endothelial protein c receptor (epcr) in synovial membrane and peripheral blood monocytes from patients with rheumatoid arthritis (ra) and osteoarthritis (oa), and ii) the role of epcr and its ligand, activated protein c (apc), in the function of monocytes from ra patients. methods: epcr, cd68 and pc/apc in synovial tissues were detected by immunostaining and in situ pcr. monocytes were isolated from peripheral blood of patients with ra and treated with apc, lipopolysaccharide (lps), and/or the epcr blocking antibody rcr252.
cells and supernatants were collected to analyze the expression/activation of epcr, nuclear factor nf-kb and tumour necrosis factor tnf-alpha. results: epcr was expressed by both oa and ra synovial tissues, but was markedly increased in ra synovium. epcr was colocalized with pc/apc, mostly on cd68-positive cells, in synovium. in ra monocytes, apc upregulated epcr expression and reduced monocyte chemoattractant protein-1-induced chemotaxis of monocytes by approximately 50%. apc also completely suppressed lps-stimulated nf-kb activation and attenuated tnf-alpha protein by more than 40% in ra monocytes. the inhibitory effects of apc were reversed by rcr252, indicating that epcr modulates the inhibitory effects of apc. conclusions: our results demonstrate for the first time that epcr is expressed by synovial tissues, particularly in ra, where it co-localizes with pc/apc on monocytes/macrophages. in addition, apc inhibits the migration and activation of ra monocytes via epcr. these inhibitory effects on ra monocytes suggest that the pc pathway may have a beneficial therapeutic effect in ra.